Segmentation of heterogeneous document images : an ... - Tel
4.7.1 Post-processing
Three post-processing steps are applied to the output of the CRF:
• Removing regions whose width or height is smaller than the width or
height of the average character.
• Opening each region separately from the other regions. Since side notes are
very close to the main text body, care should be taken not to merge two text
regions together.
• Applying a hole-filling method to fill holes inside each text region separately.
tel-00912566, version 1 - 2 Dec 2013
Four pages are shown in Figure 4.12 after applying these three post-processing
steps.
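The three post-processing steps above can be sketched as follows. This is a minimal, dependency-free illustration and not the thesis implementation: it assumes the CRF output is a binary text mask stored as a list of rows, that regions are 4-connected components, and that the opening uses a 3x3 structuring element. The names `postprocess`, `char_w`, and `char_h` are hypothetical. In practice, an image-processing library such as `scipy.ndimage` offers equivalent labeling, opening, and hole-filling routines.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
# 3x3 structuring element (assumed), centre included
N8 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def connected_regions(mask):
    """Split a binary mask into 4-connected regions, each a set of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen, regions = set(), []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and (r, c) not in seen:
                region, queue = set(), deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    region.add((y, x))
                    for dy, dx in N4:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w and mask[ny][nx]
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                regions.append(region)
    return regions


def erode(pixels):
    """Keep pixels whose whole 3x3 neighbourhood lies inside the region."""
    return {(y, x) for (y, x) in pixels
            if all((y + dy, x + dx) in pixels for dy, dx in N8)}


def dilate(pixels):
    """Grow the region by the 3x3 structuring element."""
    return {(y + dy, x + dx) for (y, x) in pixels for dy, dx in N8}


def fill_holes(pixels):
    """Fill background pixels that cannot be reached from outside the region."""
    if not pixels:
        return set()
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    y0, y1, x0, x1 = min(ys) - 1, max(ys) + 1, min(xs) - 1, max(xs) + 1
    outside, queue = {(y0, x0)}, deque([(y0, x0)])
    while queue:  # flood-fill the background from a corner of the padded box
        y, x = queue.popleft()
        for dy, dx in N4:
            p = (y + dy, x + dx)
            if (y0 <= p[0] <= y1 and x0 <= p[1] <= x1
                    and p not in pixels and p not in outside):
                outside.add(p)
                queue.append(p)
    return {(y, x) for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)
            if (y, x) not in outside}


def postprocess(mask, char_w, char_h):
    """Apply the three steps to each region separately and return a new mask."""
    out = [[0] * len(mask[0]) for _ in mask]
    for region in connected_regions(mask):
        ys = [y for y, _ in region]
        xs = [x for _, x in region]
        # Step 1: drop regions narrower or shorter than the average character.
        if max(xs) - min(xs) + 1 < char_w or max(ys) - min(ys) + 1 < char_h:
            continue
        # Step 2: open this region alone, so neighbouring regions cannot merge.
        opened = dilate(erode(region))
        # Step 3: fill holes inside this region alone.
        for y, x in fill_holes(opened):
            out[y][x] = 1
    return out
```

Processing one region at a time is what keeps a side note from merging with the adjacent main text body: each operation only ever sees the pixels of a single region.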
4.8 Results and discussion
At this point, regions of text are detected and ready for text line detection,
but paragraphs are yet to be found. Paragraphs may or may not be separated,
depending on the distance between them. However, since paragraphs are annotated
separately in most ground-truth data for competitions such as the ICDAR2011
Historical Document Layout Competition, an evaluation of the results based
on region matching with the true ground-truth data is meaningless for the
purpose of comparison.
We report the current success rate for site-wise classification. The statistics
aim to show how far the results are from the closest acceptable region segmentation
for the purpose of text line detection. This means that instead of preparing
a ground-truth that separates all the paragraphs, we generate the ground-truth
data by correcting the segmentation results to make them acceptable for text
line detection.
Table 4.1 gives the percentage of misclassified sites in the output of our CRF
model.
Table 4.1: Number of misclassified sites (%) in the output of our CRF model

Total sites (%)   Textual sites (%)   Non-textual sites (%)   Gap between columns (%)
0.97              1.32                0.88                    3.5
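As an illustration of how site-wise figures like those in Table 4.1 might be computed, the sketch below compares predicted site labels with the corrected reference labels. It assumes that each percentage is the share of sites in a category whose predicted label differs from the reference label; the source does not spell out the exact definition. The function name and the label strings are hypothetical.

```python
def misclassification_rates(predicted, reference):
    """Return (overall %, {reference label: %}) of misclassified sites.

    predicted, reference: equal-length sequences of per-site labels.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one site is required")
    totals, errors = {}, {}
    for p, r in zip(predicted, reference):
        totals[r] = totals.get(r, 0) + 1
        errors[r] = errors.get(r, 0) + (p != r)  # count a mismatch as one error
    overall = 100.0 * sum(errors.values()) / sum(totals.values())
    per_class = {label: 100.0 * errors[label] / totals[label] for label in totals}
    return overall, per_class
```

A rate for a subset of sites, such as those lying in the gap between columns, could be obtained the same way by passing only the sites of that subset.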
Tables 4.2 and 4.3 show region segmentation success rates for
different images. The indicated rates are computed between the segmentation
output and the closest acceptable segmentation for text line detection. The
closest acceptable segmentation is a segmentation that is capable of producing