Segmentation of heterogeneous document images : an ... - Tel
tel-00912566, version 1 - 2 Dec 2013
Figure 6.3: Updated results reported in [4] based on scaled estimates. In this figure,
the results from our method (Demat) and Tesseract-OCR are added based on scaled
estimates, using the results of EPITA as a reference.
Table 6.3: PARAGRAPH DETECTION SUCCESS RATES FOR 100 DOCUMENTS OF OUR CORPUS

| Method | Merge (area weighted error %) | Split (area weighted error %) | Miss (area weighted error %) | Partial Miss (area weighted error %) | False Detection (area weighted error %) | Overall (area weighted %) | Overall (count weighted %) |
|---|---|---|---|---|---|---|---|
| Our Method | 23.88 | 27.08 | 0.39 | 3.03 | 12.23 | 86.97 | 72.71 |
| Tesseract | 16.11 | 44.95 | 0.51 | 2.38 | 23.86 | 81.79 | 62.55 |
| EPITA | 23.08 | 19.03 | 0.85 | 8.82 | 12.82 | 88.05 | 67.84 |
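
For readers who want to compare the three methods column by column, the following is a minimal sketch in Python that transcribes the values of Table 6.3 and reports which method leads on each metric. The column keys and the function name `best_per_column` are illustrative and do not come from the thesis. The sketch also assumes that lower is better for the five error columns and higher is better for the two overall columns.

```python
# Values transcribed from Table 6.3 (all figures are percentages).
# Column keys are illustrative names, not identifiers from the thesis.
COLUMNS = (
    "merge", "split", "miss", "partial_miss", "false_detection",
    "overall_area", "overall_count",
)

ROWS = {
    "Our Method": (23.88, 27.08, 0.39, 3.03, 12.23, 86.97, 72.71),
    "Tesseract": (16.11, 44.95, 0.51, 2.38, 23.86, 81.79, 62.55),
    "EPITA": (23.08, 19.03, 0.85, 8.82, 12.82, 88.05, 67.84),
}

# Assumption: error columns are "lower is better",
# overall success columns are "higher is better".
HIGHER_IS_BETTER = {"overall_area", "overall_count"}


def best_per_column(rows=ROWS):
    """Return a dict mapping each column key to the leading method."""
    best = {}
    for i, col in enumerate(COLUMNS):
        pick = max if col in HIGHER_IS_BETTER else min
        best[col] = pick(rows, key=lambda name: rows[name][i])
    return best


if __name__ == "__main__":
    for col, method in best_per_column().items():
        print(f"{col}: {method}")
```

Under these assumptions no single method leads on every column, so the choice of the overall measure (area weighted or count weighted) affects the ranking.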
of the Tesseract-OCR.