Segmentation of heterogeneous document images : an ... - Tel
Figure 1.8: A screenshot that shows part of the contents of an XML file created by
our software
tel-00912566, version 1 - 2 Dec 2013
they can only be read and written using Aletheia [25] from the Prima tools. The other
tool from the Prima group is Prima Layout Evaluation [25], which compares the XML
output of a segmentation process with the ground truth data and provides
detailed evaluations of how they differ from each other. Because of this new tool,
we annotated every document with Aletheia and produced XML ground truth
documents for the purpose of evaluation.
1.6 Organization of this dissertation
Figure 1.9 gives an overview of the organization of this thesis, highlighting the
connections between its chapters. This dissertation is organized
as follows:
Chapter 2 outlines the major areas of work in document page segmentation
and addresses key parts of text segmentation in the document literature.
We describe our system in detail over the span of four chapters.
Chapter 3 deals with various aspects of the text/graphics separation of our method.
Text region detection is explained in chapter 4. Text line and paragraph detection
are covered in chapters 5 and 6, respectively. We report the experiments
and results for each part of the method at the end of its own chapter.
Finally, in chapter 7 we give recommendations for future work.