Segmentation of heterogeneous document images : an ... - Tel
4.3.2 Text components . . . . . . . . . . . . . . . . . . . . . . . 59
4.3.3 Graphical components . . . . . . . . . . . . . . . . . . . . 59
4.3.4 Horizontal and vertical run-lengths . . . . . . . . . . . . . 59
4.3.5 Gabor features . . . . . . . . . . . . . . . . . . . . . . . . 60
4.4 Feature functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.5 Label decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.6 Training (parameter/weight estimation) . . . . . . . . . . . . . . 68
4.6.1 Collins' voted perceptron method . . . . . . . . . . . . . . 69
4.6.2 Loopy belief propagation . . . . . . . . . . . . . . . . . . 70
4.6.3 L-BFGS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.7 Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.7.1 Post-processing . . . . . . . . . . . . . . . . . . . . . . . . 76
4.8 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . 76
4.8.1 Discussion on parameters . . . . . . . . . . . . . . . . . . 81
4.9 Final Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
tel-00912566, version 1 - 2 Dec 2013<br />
5 Text line detection 87<br />
5.1 Initial text line separators . . . . . . . . . . . . . . . . . . . . . . 87
5.2 Refinement of initial text line separators . . . . . . . . . . . . . . 88
5.3 Connecting separators across vertical zones . . . . . . . . . . . . 93
5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6 Paragraph detection 100<br />
6.1 Minimum spanning tree (MST) . . . . . . . . . . . . . . . . . . . 101
6.2 Binary partition tree (BPT) . . . . . . . . . . . . . . . . . . . . . 101
6.3 Paragraph features . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.4 State decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.5 Training method . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7 Conclusion and future work 110
7.1 Future direction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
A Performance evaluation methods 112
A.1 Precision and recall . . . . . . . . . . . . . . . . . . . . . . . . . . 112
A.2 Match counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
A.3 Scenario driven region correspondence . . . . . . . . . . . . . . . 115
B Implementation and software 116
Bibliography 118<br />
Index 127<br />