Segmentation of heterogeneous document images : an ... - Tel
Segmentation of heterogeneous document images : an ... - Tel
Segmentation of heterogeneous document images : an ... - Tel
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
[21] is <strong>an</strong>other texture-based method that uses a neuro-fuzzy approach. The<br />
accuracy <strong>of</strong> the main algorithm is about 59%. However, after morphological<br />
operations, the authors report 96% as the accuracy <strong>of</strong> the method.<br />
Finally, we mention the method in [69] based on Markov r<strong>an</strong>dom fields<br />
(MRF). The authors consider that <strong>an</strong>y clique is constructed from 4-connected<br />
neighbors, <strong>an</strong>d the interaction terms only consider horizontal <strong>an</strong>d vertical directions.<br />
For each node, nine features are gathered from each scale <strong>an</strong>d just two<br />
scales are considered. These features are density values <strong>of</strong> black pixels (text)<br />
associated to the current clique in both scales. This method has not been evaluated<br />
on popular datasets; however, we mention it because MRFs are one <strong>of</strong> the<br />
predecessors to what we use in our method involving conditional r<strong>an</strong>dom fields.<br />
tel-00912566, version 1 - 2 Dec 2013<br />
Texture-based methods on their own c<strong>an</strong> be used to classify <strong>an</strong>d isolate text<br />
regions, but they are not specifically designed to separate regions <strong>of</strong> text. As a<br />
consequence, refinement <strong>an</strong>d merging operations are usually needed to correct<br />
segmentation results. Because <strong>of</strong> these merging operations, under-segmentation<br />
<strong>of</strong>ten occurs. The second issue is that in m<strong>an</strong>y circumst<strong>an</strong>ces, parts <strong>of</strong> graphical<br />
elements are situated close to text regions, so they are recognized as text<br />
<strong>an</strong>d are merged with the text region. The third <strong>an</strong>d fundamental problem with<br />
texture-based methods is that there is no theoretical pro<strong>of</strong> about which filtering<br />
method is better th<strong>an</strong> the other except to empirically report the results. The<br />
main three parts <strong>of</strong> a texture-based method is the filtering strategy, the classifier<br />
<strong>an</strong>d the training method. So m<strong>an</strong>y authors publish methods by ch<strong>an</strong>ging one<br />
or all parts <strong>an</strong>d report new results.<br />
2.2.4 Conclusion<br />
In conclusion, texture-based methods that are solely based on image filtering, fit<br />
best for finding text in complex scenes other th<strong>an</strong> <strong>document</strong> <strong>images</strong>. Example<br />
<strong>of</strong> such scenes might be text detection from <strong>images</strong> <strong>of</strong> vehicle’s plates registration<br />
or from camera-captured scenes. Methods based on whitespace <strong>an</strong>alysis<br />
<strong>an</strong>d dist<strong>an</strong>ce-based methods have their own drawbacks. Clearly, a method is<br />
desirable that takes all consideration into account.<br />
2.3 Text line detection<br />
Detecting text lines in <strong>document</strong> <strong>images</strong> is a critical step towards optical character<br />
recognition (OCR) or h<strong>an</strong>dwritten character recognition (HCR). This process<br />
refers to the segmentation <strong>of</strong> each text region into distinct entities, namely<br />
text lines. The overall perform<strong>an</strong>ce <strong>of</strong> <strong>an</strong> OCR/HCR system strongly relies<br />
upon the results from text line detection process. Different methods have been<br />
proposed for detecting printed <strong>an</strong>d h<strong>an</strong>dwritten text lines. In both cases there<br />
exist several challenges; however, in printed <strong>document</strong>s, detecting text lines is<br />
a rather straightforward process <strong>an</strong>d is not a fit for detecting h<strong>an</strong>dwritten text<br />
lines. In contrast, most <strong>of</strong> the proposed methods for detecting h<strong>an</strong>dwritten text<br />
21