Segmentation of heterogeneous document images : an ... - Tel

tel-00912566, version 1 - 2 Dec 2013

Figure 4.2: Our two-dimensional conditional random field model. Blue lines
represent the boundaries between sites. We divide the document image into
rectangular blocks of equal height and equal width. The label of each site
depends on the labels of the sites in its vicinity and on the observations
defined for that site. Sites depend on observations by means of feature
functions. The ground-truth label for each site is also available for training
and evaluation. Note that the area shown in the image is part of a document
containing text lines from the main text and part of a side note. The width and
height of the sites in this image are not to scale and are shown for
visualization purposes only.

The reason is that documents come in different sizes and resolutions, and the
size of each site must be normalized and the width should be small enough to
pass between side notes and the main body. Each site may take one of two
labels: "text" or "non-text". However, the label of each site depends on the
labels of the sites in its vicinity; this includes the sites on its left, right,
top and bottom. Furthermore, the label of each site may depend on observations
from the document image. Generally, in conditional random fields, there is no
restriction on where these observations come from. However, we restrict the
observations to be the results of several filtering operations on the document
image under the current site (see Figure 4.2).
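The site grid and the four-site vicinity described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names `split_into_sites` and `neighbors`, the list-of-lists image representation, and the choice to crop rows and columns that do not fill a whole site are all assumptions made for illustration, and the site size is left as a free parameter.

```python
# Hypothetical sketch: names, image representation and cropping policy are assumptions.
LABELS = ("text", "non-text")  # the two labels a site may take


def split_into_sites(image, site_h, site_w):
    """Partition a 2-D image (list of rows) into equal-sized rectangular sites.

    Pixels that do not fill a complete site at the right or bottom edge are
    dropped (an assumption). Returns a grid indexed as sites[r][c], where each
    entry is a site_h x site_w block of pixels.
    """
    rows = len(image) // site_h
    cols = len(image[0]) // site_w
    return [
        [
            [row[c * site_w:(c + 1) * site_w]
             for row in image[r * site_h:(r + 1) * site_h]]
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def neighbors(r, c, rows, cols):
    """Return the left, right, top and bottom neighbors of site (r, c)
    that lie inside a grid of rows x cols sites."""
    candidates = [(r, c - 1), (r, c + 1), (r - 1, c), (r + 1, c)]
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]


if __name__ == "__main__":
    # Toy 4 x 6 "image" whose pixel values encode their position.
    image = [[r * 6 + c for c in range(6)] for r in range(4)]
    sites = split_into_sites(image, site_h=2, site_w=3)
    print(len(sites), len(sites[0]))   # grid dimensions in sites
    print(sites[0][1])                 # pixels under the site in row 0, column 1
    print(neighbors(0, 0, len(sites), len(sites[0])))
```

Sites on the border of the grid simply have fewer neighbors in this sketch; how border sites are actually treated is not stated here.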

Let $x_s = \{x_1, \ldots, x_N\}$ be the vector of observations available to site
$s$ and $y_s$ be one of the labels {text, non-text}. For each site $s$, the
conditional probability of having a label $y_s$ given observations $x_s$ is
defined as:

$$
p_s(y_s \mid x_s) \propto \exp\left( \sum_{k=1}^{F_e} \lambda_k f_k^e\big(y_{i,\, i \in N(s)},\, y_s,\, x_{i,\, i \in N(s)},\, x_s\big) + \sum_{k=1}^{F_n} \mu_k f_k^n(y_s, x_s) \right)
$$

where $f^n$ and $f^e$ are the node and edge feature functions, respectively,
which are discussed later in Section 4.2. $F_e$ is the total number of edge
feature functions and $F_n$ is the total number of node feature functions.
$N(s)$ is the set of neighbors of the site $s$ and is often called the Markov
blanket of $s$. $\lambda$ and $\mu$ are
