14.01.2014 Views

Segmentation of heterogeneous document images : an ... - Tel



Table 3.4: COMPARISON OF TEXT/GRAPHICS SEPARATION WITH LOGITBOOST, TESSERACT-OCR AND EPITA ON ICDAR2009 (61 DOCUMENTS)

Method        Text precision  Text recall  Graphics precision  Graphics recall  Text Accuracy
LogitBoost    97.45           98.04        79.21               88.00            97.52
TesseractOCR  93.32           95.44        88.52               85.87            92.96
Epita         94.95           96.25        81.62               92.45            95.78

tel-00912566, version 1 - 2 Dec 2013

Table 3.5: COMPARISON OF TEXT/GRAPHICS SEPARATION WITH LOGITBOOST, TESSERACT-OCR AND EPITA ON ICDAR2011 (100 DOCUMENTS)

Method        Text precision  Text recall  Graphics precision  Graphics recall  Text Accuracy
LogitBoost    98.05           93.42        56.58               73.52            94.22
TesseractOCR  94.76           87.60        84.66               94.08            90.16
Epita         97.85           95.43        62.33               85.29            95.23

Table 3.6: COMPARISON OF TEXT/GRAPHICS SEPARATION WITH LOGITBOOST, TESSERACT-OCR AND EPITA ON OUR CORPUS (97 DOCUMENTS)

Method        Text precision  Text recall  Graphics precision  Graphics recall  Text Accuracy
LogitBoost    93.82           93.62        59.41               74.11            92.40
TesseractOCR  88.58           95.90        76.80               63.15            89.90
Epita         95.75           90.20        61.28               85.23            91.20
