14.01.2014 Views

Segmentation of heterogeneous document images : an ... - Tel


Figure 1.8: A screenshot showing part of the contents of an XML file created by our software

tel-00912566, version 1 - 2 Dec 2013

they can only be read and written using Aletheia [25] from the Prima tools. The other tool from the Prima group is Prima Layout Evaluation [25], which compares the XML output of a segmentation process with the ground truth data and provides detailed evaluations of how they differ from each other. Because of this new tool, we annotated every document with Aletheia and produced XML ground truth documents for the purpose of evaluation.

1.6 Organization of this dissertation

Figure 1.9 gives an overview of the organization of this thesis, highlighting the connections between its chapters. This dissertation is organized as follows:

Chapter 2 outlines the major areas of work in document page segmentation and addresses key parts of text segmentation in the document literature.

We describe our system in detail over four chapters. Chapter 3 deals with various aspects of the text/graphics separation of our method. Text region detection is explained in chapter 4. Text line detection and paragraph detection are covered in chapters 5 and 6, respectively. We report the experiments and results for each part of the method at the end of the corresponding chapter.

Finally, in chapter 7 we give recommendations for future work.
