
Segmentation of heterogeneous document images : an ... - Tel


tel-00912566, version 1 - 2 Dec 2013

Figure 6.3: Updated results reported in [4] based on scaled estimates. In this figure, the results from our method (Demat) and Tesseract-OCR are added based on scaled estimates, using the results of EPITA as a reference.

Table 6.3: PARAGRAPH DETECTION SUCCESS RATES FOR 100 DOCUMENTS OF OUR CORPUS

            Area weighted error %                              Area weighted %  Count weighted %
Method      Merge  Split  Miss  Partial Miss  False Detection  Overall          Overall
Our Method  23.88  27.08  0.39  3.03          12.23            86.97            72.71
Tesseract   16.11  44.95  0.51  2.38          23.86            81.79            62.55
EPITA       23.08  19.03  0.85  8.82          12.82            88.05            67.84

of the Tesseract-OCR.
