Segmentation of heterogeneous document images : an ... - Tel

Segmentation of heterogeneous document images : an ... - Tel

Segmentation of heterogeneous document images : an ... - Tel

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

proximation of a Mumford-Shah functional. They indicate that boundary-based
level-set methods such as [52, 53] depend on the number of boundary evolution
steps and are also sensitive to touching text lines. To handle these difficulties,
their method seeks to minimize a Mumford-Shah functional using a
piecewise constant approximation [92]. The initial estimates of the text lines
are the same as in [85]; they are then refined by visiting each pixel of the
image in a given order. For each initial text line, a segmentation curve is set
to segment the text line into two regions: inner and outer. In each iteration,
these curves evolve by recalculating their parameters based on the intensity of
each region. The final results may contain line fragments due to large gaps between
words, so morphological operators are used in a post-processing step
to merge some of these fragments.
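
As an illustration only, the two-region refinement can be sketched in Python. This is a minimal sketch, not the method of the reviewed paper: it keeps only the piecewise constant data term (each pixel joins the region whose mean intensity is closer) and omits the curve-length penalty of the full Mumford-Shah functional. The function name `refine_two_regions`, the list-of-rows image format and the `max_iters` parameter are assumptions made for this example.

```python
def refine_two_regions(image, inner, max_iters=20):
    """Refine an inner/outer labelling with a piecewise constant model.

    image: list of rows of gray-level intensities.
    inner: same-shaped list of rows of booleans (True = inner region),
           playing the role of the initial text line estimate.
    """
    inner = [row[:] for row in inner]  # work on a copy
    for _ in range(max_iters):
        in_vals = [v for r, m in zip(image, inner) for v, f in zip(r, m) if f]
        out_vals = [v for r, m in zip(image, inner) for v, f in zip(r, m) if not f]
        if not in_vals or not out_vals:
            break  # one region vanished; nothing left to compare
        # Region parameters: the mean intensity of each region.
        c_in = sum(in_vals) / len(in_vals)
        c_out = sum(out_vals) / len(out_vals)
        changed = False
        # Visit every pixel in raster order and move it to the closer mean.
        for y, row in enumerate(image):
            for x, v in enumerate(row):
                want_inner = (v - c_in) ** 2 < (v - c_out) ** 2
                if want_inner != inner[y][x]:
                    inner[y][x] = want_inner
                    changed = True
        if not changed:
            break  # labelling is stable
    return inner
```

Starting from a mask that covers only part of a dark text band, the loop grows the inner region until it covers the whole band. Without a length term the result can be noisy on real images, which is one reason a post-processing step is still needed.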

tel-00912566, version 1 - 2 Dec 2013

The last method reviewed was published in [88] and detects handwritten
Arabic text lines. Instead of summing the values of adjacent pixels as in
the projection profiles of [87], Shi et al. apply steerable directional filters, each
shaped like an ellipse with a large focal distance. The height of the filter is chosen
to match the average text height, and its width is five times
its height. When the direction of a filter is close to the direction of the text lines,
the pixel at that location gives a greater response than it does with a
filter in any other direction. Filtering produces a map that is later
thresholded adaptively to enhance the location of text lines.
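
The effect of the filter direction can be illustrated with a small Python sketch. It is a simplification under stated assumptions: a binary elliptical mask replaces the true steerable filters of [88], the response is the mean ink under the mask, and adaptive thresholding is not shown. The names `ellipse_offsets`, `response` and `best_direction` are hypothetical.

```python
import math

def ellipse_offsets(height, theta):
    """Pixel offsets (dy, dx) inside an ellipse whose width is five times
    its height, rotated by theta radians."""
    b = height / 2.0        # semi-minor axis (across the text line)
    a = 5 * height / 2.0    # semi-major axis (along the text line)
    r = int(math.ceil(a))
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    offsets = []
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            u = dx * cos_t + dy * sin_t    # coordinate along the major axis
            v = -dx * sin_t + dy * cos_t   # coordinate along the minor axis
            if (u / a) ** 2 + (v / b) ** 2 <= 1.0:
                offsets.append((dy, dx))
    return offsets

def response(ink, y, x, offsets):
    """Mean ink value under the mask centered at (y, x).
    ink is a list of rows with 1 for ink and 0 for background;
    positions outside the image count as background."""
    h, w = len(ink), len(ink[0])
    total = 0
    for dy, dx in offsets:
        yy, xx = y + dy, x + dx
        if 0 <= yy < h and 0 <= xx < w:
            total += ink[yy][xx]
    return total / len(offsets)

def best_direction(ink, y, x, height, angles):
    """Angle (from the candidate list) with the strongest response at (y, x)."""
    return max(angles, key=lambda t: response(ink, y, x, ellipse_offsets(height, t)))
```

On a synthetic image containing one horizontal band of ink, the mask aligned with the band lies entirely on ink, while a rotated mask also covers background, so its response is lower.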

Distance-based methods

We review only one method in this category. It was recently
proposed in [103] for the segmentation of handwritten Chinese text regions into
text lines. The heart of this method is a minimum spanning tree algorithm.
In the first stage, the method extracts all the connected components of the
document. The reasonable assumption is that components belonging to a
single text line are closer to one another than components belonging
to different text lines. Therefore, a minimum spanning tree is built to
connect neighboring components of the same line, so that each line corresponds to
a sub-tree. Because of the variability in the layout of text lines and occasionally
large gaps between words, the results are not perfect. Hence, the method uses a
second-stage clustering procedure that dynamically cuts edges of the tree to form
groups corresponding to correct text lines. A detection accuracy rate of 98.02%
is reported for 803 unconstrained documents.
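
The two-stage idea can be sketched in Python as follows. This is a minimal sketch under assumptions, not the procedure of [103]: components are reduced to centroid points, the distance is plain Euclidean distance, and the dynamic second-stage clustering is replaced by a fixed threshold `max_edge`. The function names are hypothetical.

```python
import math

def _find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def mst_edges(points):
    """Kruskal's algorithm on the complete graph over the points.
    Returns the tree as a list of (distance, i, j) tuples."""
    n = len(points)
    edges = sorted(
        (math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))
    tree = []
    for d, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:          # edge joins two different sub-trees
            parent[ri] = rj
            tree.append((d, i, j))
    return tree

def group_lines(points, max_edge):
    """Cut every tree edge longer than max_edge and return the remaining
    sub-trees as sorted lists of point indices."""
    parent = list(range(len(points)))
    for d, i, j in mst_edges(points):
        if d <= max_edge:     # keep short edges, drop long ones
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(_find(parent, i), []).append(i)
    return sorted(groups.values())

# Two rows of three component centroids each, 30 units apart vertically.
centroids = [(0, 0), (10, 0), (20, 0), (0, 30), (10, 30), (20, 30)]
lines = group_lines(centroids, max_edge=15)
```

A fixed threshold fails when word gaps within a line exceed the spacing between lines, which is the situation the second-stage clustering is meant to handle.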

2.3.3 Conclusion

This section has surveyed many methods for text line detection, and
we must now decide which of them is suitable for detecting text lines
in our corpus. A comparison between methods is valid only
if the results are available for the same dataset and with the same evaluation
metric. We identified three different performance evaluation metrics: Pixel Correspondence,
Match Counting and Overall Pixel-Level HitRate. We have already
described Match Counting in the first chapter. Unfortunately, in our case, it
