Segmentation of heterogeneous document images : an ... - Tel
Segmentation of heterogeneous document images : an ... - Tel
Segmentation of heterogeneous document images : an ... - Tel
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
them.<br />
In such circumst<strong>an</strong>ces, it would be logical to take adv<strong>an</strong>tage <strong>of</strong> methods that<br />
work on a block or a region <strong>of</strong> <strong>document</strong> instead <strong>of</strong> the connected components.<br />
The idea is that regions <strong>of</strong> text have a particular texture that c<strong>an</strong> be revealed<br />
with methods such as wavelet <strong>an</strong>alysis, co-occurrence matrices, Radon tr<strong>an</strong>sform,<br />
etc.<br />
Shape <strong>of</strong> the regions c<strong>an</strong> be rect<strong>an</strong>gular or free form. They c<strong>an</strong> be strips <strong>of</strong><br />
overlapping blocks or coming from a multi-resolution structure like quad-tree or<br />
image pyramids. To name a few methods, Journet [44] uses a method based on<br />
orientation <strong>of</strong> auto-correlation <strong>an</strong>d frequency features to segment regions <strong>of</strong> text<br />
in historical <strong>document</strong> <strong>images</strong>. In [46], Kim et al propose <strong>an</strong> approach based on<br />
support vector machine (SVM) <strong>an</strong>d CAMSHIFT algorithm.<br />
tel-00912566, version 1 - 2 Dec 2013<br />
In CC-based methods, the geometry <strong>of</strong> the components is clearly defined.<br />
However, no matter how well the methods work, there is always <strong>an</strong> issue <strong>of</strong><br />
confidence at the boundaries <strong>of</strong> regions. This issue becomes larger as the gap<br />
between text <strong>an</strong>d non-textual components gets smaller. Using large block implies<br />
high confidence on the computed features, but poor confidence on regions’<br />
boundaries. Using small blocks implies just the opposite. One solution is to use<br />
overlapping blocks, but then the problem becomes to assign a label to pixels<br />
when two or more overlapping blocks have a disagreement about the label <strong>of</strong> the<br />
pixels. Confidence based methods for combination <strong>of</strong> evidence (e.g. Dempster-<br />
Shafer ) also may fail. Consider the case where two blocks have overlapping<br />
pixels. One block is located on the mostly text area <strong>of</strong> the <strong>document</strong>, <strong>an</strong>d the<br />
other block is located on the mostly graphical one. The pixels in between get<br />
evidences from two sources that have a high degree <strong>of</strong> conflict. The Dempster-<br />
Shafer theory is criticized for being incompetence in the mentioned situation<br />
[104].<br />
The other problem with region-based methods is their ineffectiveness to detect<br />
frames <strong>an</strong>d lines that belong to a table. Thus, the result for parts <strong>of</strong> a table<br />
that contain text, comes back as text for every pixel <strong>of</strong> that region, whereas it<br />
includes separating rule lines.<br />
2.1.3 Conclusion<br />
Whenever it is feasible to compute the connected components <strong>of</strong> a <strong>document</strong>,<br />
it is preferable to assign the label to the connected component itself <strong>an</strong>d avoid<br />
using a region-based method. However, to overcome some <strong>of</strong> the drawbacks<br />
<strong>of</strong> the component based methods, a hybrid approach should be adapted that<br />
also takes adv<strong>an</strong>tage <strong>of</strong> texture features around the component. Our method is<br />
based on this strategy <strong>an</strong>d is described in chapter 3.<br />
17