14.01.2014 Views

Segmentation of heterogeneous document images : an ... - Tel

Segmentation of heterogeneous document images : an ... - Tel

Segmentation of heterogeneous document images : an ... - Tel

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

[21] is <strong>an</strong>other texture-based method that uses a neuro-fuzzy approach. The<br />

accuracy <strong>of</strong> the main algorithm is about 59%. However, after morphological<br />

operations, the authors report 96% as the accuracy <strong>of</strong> the method.<br />

Finally, we mention the method in [69] based on Markov r<strong>an</strong>dom fields<br />

(MRF). The authors consider that <strong>an</strong>y clique is constructed from 4-connected<br />

neighbors, <strong>an</strong>d the interaction terms only consider horizontal <strong>an</strong>d vertical directions.<br />

For each node, nine features are gathered from each scale <strong>an</strong>d just two<br />

scales are considered. These features are density values <strong>of</strong> black pixels (text)<br />

associated to the current clique in both scales. This method has not been evaluated<br />

on popular datasets; however, we mention it because MRFs are one <strong>of</strong> the<br />

predecessors to what we use in our method involving conditional r<strong>an</strong>dom fields.<br />

tel-00912566, version 1 - 2 Dec 2013<br />

Texture-based methods on their own c<strong>an</strong> be used to classify <strong>an</strong>d isolate text<br />

regions, but they are not specifically designed to separate regions <strong>of</strong> text. As a<br />

consequence, refinement <strong>an</strong>d merging operations are usually needed to correct<br />

segmentation results. Because <strong>of</strong> these merging operations, under-segmentation<br />

<strong>of</strong>ten occurs. The second issue is that in m<strong>an</strong>y circumst<strong>an</strong>ces, parts <strong>of</strong> graphical<br />

elements are situated close to text regions, so they are recognized as text<br />

<strong>an</strong>d are merged with the text region. The third <strong>an</strong>d fundamental problem with<br />

texture-based methods is that there is no theoretical pro<strong>of</strong> about which filtering<br />

method is better th<strong>an</strong> the other except to empirically report the results. The<br />

main three parts <strong>of</strong> a texture-based method is the filtering strategy, the classifier<br />

<strong>an</strong>d the training method. So m<strong>an</strong>y authors publish methods by ch<strong>an</strong>ging one<br />

or all parts <strong>an</strong>d report new results.<br />

2.2.4 Conclusion<br />

In conclusion, texture-based methods that are solely based on image filtering, fit<br />

best for finding text in complex scenes other th<strong>an</strong> <strong>document</strong> <strong>images</strong>. Example<br />

<strong>of</strong> such scenes might be text detection from <strong>images</strong> <strong>of</strong> vehicle’s plates registration<br />

or from camera-captured scenes. Methods based on whitespace <strong>an</strong>alysis<br />

<strong>an</strong>d dist<strong>an</strong>ce-based methods have their own drawbacks. Clearly, a method is<br />

desirable that takes all consideration into account.<br />

2.3 Text line detection<br />

Detecting text lines in <strong>document</strong> <strong>images</strong> is a critical step towards optical character<br />

recognition (OCR) or h<strong>an</strong>dwritten character recognition (HCR). This process<br />

refers to the segmentation <strong>of</strong> each text region into distinct entities, namely<br />

text lines. The overall perform<strong>an</strong>ce <strong>of</strong> <strong>an</strong> OCR/HCR system strongly relies<br />

upon the results from text line detection process. Different methods have been<br />

proposed for detecting printed <strong>an</strong>d h<strong>an</strong>dwritten text lines. In both cases there<br />

exist several challenges; however, in printed <strong>document</strong>s, detecting text lines is<br />

a rather straightforward process <strong>an</strong>d is not a fit for detecting h<strong>an</strong>dwritten text<br />

lines. In contrast, most <strong>of</strong> the proposed methods for detecting h<strong>an</strong>dwritten text<br />

21

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!