14.01.2014 Views

Segmentation of heterogeneous document images : an ... - Tel


$$
f_1\bigl(y_s,\, y_{i,\, i \in N(s)}\bigr) =
\begin{cases}
1 & \text{if } y_s = \text{text and } y_{\leftarrow} = \text{non-text} \\
0 & \text{otherwise}
\end{cases}
$$

where $y_{\leftarrow}$ is the label of the site to the left of $s$. During training, the weight associated with feature function $f_1$ is correlated with the number of times, over all sites in the training dataset, that a site $s$ has a "text" label while the site to its left has a "non-text" label. If the training algorithm finds that a "non-text" site appears to the left of a "text" site more often than all other label configurations combined, the weight associated with this feature takes a positive value. In any other case, the weight is zero or negative.
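As an illustration of the indicator feature described here, the following minimal sketch evaluates $f_1$ and counts how often it fires over a labeled example. It assumes, purely for illustration, that sites lie on a regular grid so that the left neighbour is the previous column; the names `f1`, `count_f1`, and `grid` are hypothetical and not taken from the thesis. The count is only the statistic that the weight is correlated with, not the training procedure itself.

```python
TEXT, NON_TEXT = "text", "non-text"

def f1(y_s, y_left):
    """Indicator feature: 1 when site s is text and its left neighbour is non-text."""
    return 1 if (y_s == TEXT and y_left == NON_TEXT) else 0

def count_f1(label_grid):
    """Sum f1 over every site that has a left neighbour.

    Assumption: labels are stored as a 2-D grid (list of rows), so the
    left neighbour of column c is column c - 1 in the same row.
    """
    return sum(
        f1(row[c], row[c - 1])
        for row in label_grid
        for c in range(1, len(row))
    )

# Toy labeled example: two rows of three sites each.
grid = [
    [NON_TEXT, TEXT, TEXT],
    [NON_TEXT, NON_TEXT, TEXT],
]
total = count_f1(grid)  # the feature fires once in each row
```

A large count relative to the other label configurations is what would push the learned weight for this feature towards a positive value.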

tel-00912566, version 1 - 2 Dec 2013

Now we look at a more complex feature that produces real values. In the previous chapter, all text and graphical elements were separated, and we render two separate images, one for each. Suppose we pick the image $G$ containing all graphic components, including rule lines and table lines. Each pixel of this image has value 1 if it belongs to a graphical component and 0 otherwise. In the new node feature function $f_2$, if the current site has the "non-text" label, the value of the feature function is the average intensity of the pixels of the graphical image covered by the current site.

$$
f_2(y_s, G) =
\begin{cases}
G_s & \text{if } y_s = \text{non-text} \\
0 & \text{otherwise}
\end{cases}
$$

where $G_s$ is the average value of the pixels of image $G$ covered by site $s$. The value of feature function $f_2$ is always positive for sites located on graphical drawings and zero elsewhere. Thus, the weight $\lambda_2$ for this function is guaranteed to have a positive value. If we changed the non-text label in $f_2$ to text, $\lambda_2$ would instead be guaranteed to have a negative value after training.

Good feature function engineering can often significantly increase the labeling accuracy of the model. We describe our observations thoroughly later in this chapter.
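The real-valued feature $f_2$ can be sketched in a few lines. This is a minimal illustration, assuming the binary graphics image $G$ is stored as a list of rows and a site is given as a list of `(row, column)` pixel coordinates; both representations, and the names `site_mean`, `f2`, and `site`, are assumptions made for the example rather than details from the thesis.

```python
def site_mean(G, site_pixels):
    """Average value of the pixels of G covered by a site (the quantity G_s)."""
    values = [G[r][c] for r, c in site_pixels]
    return sum(values) / len(values)

def f2(y_s, G, site_pixels):
    """Node feature: G_s when the site is labeled non-text, 0 otherwise."""
    return site_mean(G, site_pixels) if y_s == "non-text" else 0.0

# Toy 4x4 graphics image with one horizontal rule line on row 1.
G = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Hypothetical site covering a 2x2 block in the top-left corner.
site = [(0, 0), (0, 1), (1, 0), (1, 1)]

value_non_text = f2("non-text", G, site)  # half of the covered pixels are graphics
value_text = f2("text", G, site)          # the feature is inactive for a text label
```

Because the feature is nonzero only for the "non-text" label and grows with the share of graphic pixels under the site, it rewards labeling sites on graphics as non-text.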

4.3 Observations

We described feature functions in CRFs as functions that depend on both the observations and the labels at one or two sites. In other words, labels and observations are tied to each other in a feature function. Before we design our feature functions, however, we need to explain where our observations come from. Given a document image, we often perform operations such as filtering or run-length analysis on the image. Each site in our model has access to the results of these operations and, knowing the location of all the pixels it covers, averages the pixel values from the result of each operation to generate a vector of observations. Later, we will bind these observations to appropriate labels for use in our feature functions. In this section, we describe all the operations that lead to our observations.
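The pipeline described above (run operations on the image, then average each result over a site's pixels) can be sketched as follows. The horizontal run-length map used here is only one plausible reading of "run-length analysis" and is an assumption, as are the function names and the list-of-rows image representation; the operations actually used are the ones described in this section.

```python
def horizontal_run_lengths(binary):
    """Give each foreground pixel the length of its horizontal run (0 for background).

    Assumption: this is an illustrative operation, not necessarily the
    run-length analysis used in the thesis.
    """
    out = [[0] * len(row) for row in binary]
    for r, row in enumerate(binary):
        c = 0
        while c < len(row):
            if row[c]:
                start = c
                while c < len(row) and row[c]:
                    c += 1
                for k in range(start, c):
                    out[r][k] = c - start
            else:
                c += 1
    return out

def observation_vector(result_maps, site_pixels):
    """Average each operation's result over the pixels covered by the site."""
    return [
        sum(m[r][c] for r, c in site_pixels) / len(site_pixels)
        for m in result_maps
    ]

# Toy binary image (1 = foreground) and a hypothetical 2x2 site.
image = [
    [1, 1, 1, 0],
    [0, 1, 0, 0],
]
runs = horizontal_run_lengths(image)
site = [(0, 0), (0, 1), (1, 0), (1, 1)]

# One observation per operation: raw intensity, then run length.
obs = observation_vector([image, runs], site)
```

Each additional operation simply contributes one more result map, and therefore one more component of the site's observation vector.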
