14.01.2014 Views

Segmentation of heterogeneous document images : an ... - Tel

Segmentation of heterogeneous document images : an ... - Tel

Segmentation of heterogeneous document images : an ... - Tel

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Figure 2.8: These <strong>images</strong> from [18] illustrate how the snakes grow to engulf a text<br />

line.<br />

tel-00912566, version 1 - 2 Dec 2013<br />

In the review section for text region detection, we already reviewed dist<strong>an</strong>cebased<br />

methods that try to segment <strong>document</strong> image into text regions. Some<br />

methods use the same strategy to detect text lines. In one method [102] Delaunay<br />

tri<strong>an</strong>gulation is used to find a set <strong>of</strong> edges between connected components.<br />

Then the algorithm looks for the principal direction among the shortest edges<br />

to estimate the direction <strong>of</strong> the text lines. This method c<strong>an</strong> be applied to pages<br />

containing sl<strong>an</strong>ted text lines successfully. Nevertheless, because the strategy for<br />

filtering edges based on their lengths is somehow primitive <strong>an</strong>d the method uses<br />

a global threshold to remove edges, it becomes ineffective for detecting skewed<br />

text lines or separating side notes.<br />

To detect skewed text lines that exist on camera-captured <strong>document</strong>s, Bukhari<br />

et al. [17, 18] propose a method based on active contour models. Active contour<br />

model is a framework for finding the outline <strong>of</strong> <strong>an</strong> object from a possibly noisy<br />

image [24]. The method starts with multiple initial closed contours (snakes)<br />

that engulf text blobs. Each snake is associated with <strong>an</strong> energy that is usually<br />

the sum <strong>of</strong> the internal <strong>an</strong>d external forces, computed from the gradient vector<br />

flow (GVF) at the boundaries <strong>of</strong> the contours. Then in each iteration, snakes<br />

grow by a fixed number <strong>of</strong> pixels to minimize the associated energies. All overlapping<br />

pairs <strong>of</strong> snakes are merged together to form a single snake that covers<br />

a curled text line. Results indicate that this method has the potential to detect<br />

highly curled text lines with different spacing <strong>an</strong>d font sizes. The <strong>an</strong>alysis <strong>of</strong><br />

the errors shows that most <strong>of</strong> them are due to over-segmentation <strong>of</strong> a single line<br />

in the ground-truth into multiple detected text lines. Figure 2.8 illustrates how<br />

snakes grow on characters to engulf the skewed text line.<br />

Another method [19] is also proposed for detecting text lines in cameracaptured<br />

<strong>document</strong>s. One adv<strong>an</strong>tage <strong>of</strong> this new method is its ability to locate<br />

curled text lines. The approach is using <strong>an</strong> oriented <strong>an</strong>isotropic Gaussi<strong>an</strong><br />

smoothing function to generate a filter b<strong>an</strong>k. For each pixel, the maximum<br />

value among all scales <strong>an</strong>d orientations is selected for the final smoothed image.<br />

Then Horn-Riley based ridge detection algorithm [41, 80] is used to enh<strong>an</strong>ce<br />

text line structures from the smoothed image. The result from this work is<br />

mentioned in CBDAR2007 <strong>document</strong> image de-warping contest [83]. Although<br />

the method performs well on <strong>document</strong>s with printed text, it would be hard to<br />

be adapted for h<strong>an</strong>dwritten <strong>document</strong>s where alignment <strong>of</strong> the strings <strong>an</strong>d nonuniform<br />

gaps between words are <strong>an</strong> issue <strong>an</strong>d the results m<strong>an</strong>y contain m<strong>an</strong>y<br />

text line fragments.<br />

26

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!