Segmentation of heterogeneous document images : an ... - Tel
Figure 2.11: Block extraction steps in [105]. (a) Divided strips and their projection profiles. (b) Extracted blocks.
tel-00912566, version 1 - 2 Dec 2013
method works on blocks. The difference is that the authors assume two
types of documents, namely tightly spaced documents (TSD) and widely
spaced documents (WSD), to better cope with overlapping and multi-touching
components. Figure 2.11 shows the projection profiles for several strips and
the result of their block extraction. For each block, the method computes two
features based on fractal dimensions obtained from the classical box-counting
algorithm. An unsupervised two-class fuzzy C-means classifier then separates
the blocks into tightly spaced and widely spaced ones. Text lines are then
detected with a different procedure for each block type.
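The box-counting estimate mentioned above can be illustrated with a minimal sketch. It shows only the generic algorithm on a binary block; the two specific features used in [105] are not described here, so they are not reproduced. The function names, the box sizes and the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def box_count(binary, size):
    """Count the boxes of side `size` that contain at least one foreground pixel."""
    h, w = binary.shape
    # Pad with background so both dimensions are multiples of the box size.
    padded = np.pad(binary, ((0, (-h) % size), (0, (-w) % size)), mode="constant")
    ph, pw = padded.shape
    # View the image as a grid of size x size boxes.
    boxes = padded.reshape(ph // size, size, pw // size, size)
    return int(boxes.any(axis=(1, 3)).sum())


def box_counting_dimension(binary, sizes=(1, 2, 4, 8, 16)):
    """Estimate the fractal dimension as the slope of log N(s) against log(1/s)."""
    binary = np.asarray(binary, dtype=bool)
    if not binary.any():
        raise ValueError("block contains no foreground pixels")
    counts = [box_count(binary, s) for s in sizes]
    log_inv_size = np.log(1.0 / np.asarray(sizes, dtype=float))
    slope, _intercept = np.polyfit(log_inv_size, np.log(counts), 1)
    return float(slope)
```

On a completely filled block the estimate approaches 2, and on a single straight stroke it approaches 1; handwriting blocks fall in between, which is what makes such a value usable as a density-like feature.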
Another method, proposed by Arivazhagan et al. in [7], starts by
obtaining candidate lines using piecewise vertical projection profiles similar to
those used in [105]. For any obstructing element, the method then decides
whether it belongs to the line above or the line below, based on the bivariate
Gaussian probability density of a distance metric.
After the piecewise projection profile is applied to the document image, the complete
set of text line separators is drawn by connecting a valley of the profile
associated with a block on the right to a valley from the block on its left, and by
extending the line straight where a valley remains unconnected.
Each connected component that a separator passes through may belong
entirely to either the line above or the line below, or it may need to be broken
into two components. The method uses the distance metric decision to determine
which of these cases applies to each obstructing component.
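The first stage of these methods, the piecewise projection profile, can be sketched as follows. This is a simplified illustration and not the procedure of [7] or [105]: the helper names, the number of strips and the valley threshold are assumptions, the profiles are not smoothed, and the valleys of neighbouring strips are not connected into separators.

```python
import numpy as np


def piecewise_profiles(binary, n_strips):
    """Split the image into vertical strips and count ink pixels per row in each."""
    binary = np.asarray(binary, dtype=bool)
    strips = np.array_split(binary, n_strips, axis=1)
    return [strip.sum(axis=1) for strip in strips]


def valley_rows(profile, threshold=0):
    """Return the middle row of every maximal run where the profile is <= threshold.

    Runs touching the top or bottom margin are included as well.
    """
    valleys = []
    start = None
    for y, value in enumerate(profile):
        if value <= threshold and start is None:
            start = y  # a low run begins
        elif value > threshold and start is not None:
            valleys.append((start + y - 1) // 2)  # run ended at row y - 1
            start = None
    if start is not None:
        valleys.append((start + len(profile) - 1) // 2)
    return valleys
```

For an image with two text bands separated by blank rows, `valley_rows` applied to each strip profile returns one candidate separator row per blank band; linking these candidates from strip to strip is the step the reviewed methods add on top.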
The last reviewed projection-based method is proposed by Papavassiliou et
al. [75]. This method is very similar to the one in [7], with one difference:
it adds another stage based on Hidden Markov Models (HMM) to correct some
misleading peaks and valleys of the projection profiles, and therefore segments
vertical strips into better situated blocks than other methods. Initially,
as in other methods, text and gap areas are extracted by detecting peaks and
valleys in a smoothed projection profile computed for each vertical strip. Then
a Viterbi algorithm locates the optimal succession of text and gap areas based
on statistics drawn from the initial set of blocks. Finally, a text line
separating technique is applied to assign connected components to the appropriate
text lines. Results for a preliminary version of this method are published in
[90] and were submitted to the ICDAR2007 [38] and ICDAR2009 [40] handwritten