Segmentation of heterogeneous document images : an ... - Tel
\[
f_1\bigl(y_s, y_{i,\, i \in N(s)}\bigr) =
\begin{cases}
1 & \text{if } y_s = \text{text and } y_{\leftarrow} = \text{non-text} \\
0 & \text{otherwise}
\end{cases}
\]
where $y_{\leftarrow}$ is the label of the site to the left of $s$. During training,
the weight associated with feature function $f_1$ is correlated with the number of
times, over all sites in the training dataset, that a site $s$ has a "text" label
while the site to its left has a "non-text" label. If the training algorithm finds
that this configuration (a "non-text" site immediately to the left of a "text" site)
occurs more often than all other label configurations combined, the associated
weight for this feature takes a positive value. In any other case, the weight is
zero or negative.
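As an illustration, the indicator feature $f_1$ and the count that drives its weight can be sketched in a few lines of Python. This is a minimal sketch, not the thesis implementation: it assumes that sites lie on a rectangular grid stored as a list of rows, and the names `f1`, `count_f1` and `grid` are hypothetical.

```python
TEXT, NON_TEXT = "text", "non-text"

def f1(y_s, y_left):
    """Indicator feature: 1 when site s is text and its left neighbour is non-text."""
    return 1 if (y_s == TEXT and y_left == NON_TEXT) else 0

def count_f1(label_grid):
    """Sum f1 over every site of a labelled grid that has a left neighbour."""
    total = 0
    for row in label_grid:
        # Pair each site with the site immediately to its left.
        for left, current in zip(row, row[1:]):
            total += f1(current, left)
    return total

# Hypothetical ground-truth labels for a 2 x 3 grid of sites (assumed layout).
grid = [
    [NON_TEXT, TEXT, TEXT],
    [NON_TEXT, NON_TEXT, TEXT],
]

print(count_f1(grid))  # 2
```

The count returned by `count_f1` is the kind of training-set statistic that the weight of $f_1$ is correlated with; the training procedure itself is not shown here.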
tel-00912566, version 1 - 2 Dec 2013<br />
We now look at a more complex feature that produces real values. In the
previous chapter, all text and graphical elements were separated, and we render
two separate images, one for each. Suppose that we pick the image (G) that
contains all graphic components, including rule lines and table lines. A pixel
of this image has a value of 1 if it belongs to a graphical component and 0
otherwise. In the new node feature function $f_2$, if the current site has the
"non-text" label, the value of the feature function is the average intensity of
the pixels of the graphical image covered by the current site.
\[
f_2(y_s, G) =
\begin{cases}
G_s & \text{if } y_s = \text{non-text} \\
0 & \text{otherwise}
\end{cases}
\]
where $G_s$ is the average value of the pixels of image G covered by site $s$.
The value of feature function $f_2$ is always positive for sites located on
graphical drawings and zero elsewhere. Thus, the weight $\lambda_2$ for this function
is guaranteed to take a positive value. If we change the non-text label in $f_2$
to text, then $\lambda_2$ is guaranteed to take a negative value after training.
Good feature function engineering can often significantly increase the labeling
accuracy of the model. We describe our observations thoroughly later in this
chapter.
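The real-valued feature $f_2$ defined above can be sketched as follows. This is a minimal illustration under stated assumptions: the graphics image G is a binary list of rows, and a site is represented by the list of (row, column) pixels it covers. The names `site_average`, `f2`, `site_a` and `site_b` are hypothetical.

```python
def site_average(image, pixels):
    """Mean value of `image` over the (row, col) pixels covered by one site."""
    return sum(image[r][c] for r, c in pixels) / len(pixels)

def f2(y_s, G, site_pixels):
    """Average graphics intensity under the site if it is labelled non-text, else 0."""
    return site_average(G, site_pixels) if y_s == "non-text" else 0.0

# Hypothetical binary graphics image: 1 on graphical components, 0 elsewhere.
G = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

site_a = [(0, 0), (0, 1), (1, 0), (1, 1)]  # lies entirely on background
site_b = [(0, 1), (0, 2), (1, 1), (1, 2)]  # half of its pixels are graphics

print(f2("non-text", G, site_a))  # 0.0
print(f2("non-text", G, site_b))  # 0.5
print(f2("text", G, site_b))      # 0.0
```

Because G is binary, the value for a non-text site is simply the fraction of its pixels that belong to graphical components.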
4.3 Observations<br />
We have described feature functions in CRFs as functions that depend on both
the observations and the labels at one or two sites. In other words, labels and
observations are tied to each other in a feature function. Before we design our
feature functions, however, we need to explain where our observations come from.
Given a document image, we often perform operations such as filtering or
run-length analysis on the image. Each site in our model then has access to the
results of these operations. Knowing the location of all the pixels it covers,
the site averages the pixel values from the result of each operation and
generates a vector of observations. Later, we bind these observations with
appropriate labels for use in our feature functions. In this section, we
describe all the operations that lead to our observations.
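The construction of an observation vector described above can be sketched as follows. This is only a sketch under assumptions: each operation is taken to produce a result map with the same dimensions as the document image, a site is a list of (row, column) pixels, and the two example maps and their values are invented for illustration.

```python
def observation_vector(operation_results, site_pixels):
    """Average each operation's result map over the pixels covered by one site."""
    n = len(site_pixels)
    return [
        sum(result[r][c] for r, c in site_pixels) / n
        for result in operation_results
    ]

# Two hypothetical result maps for a 2 x 2 image (values are illustrative only).
filtered = [
    [0.25, 0.5],
    [0.75, 0.5],
]
run_length = [
    [3, 5],
    [7, 9],
]

site = [(0, 0), (0, 1), (1, 0), (1, 1)]  # a site covering the whole image

print(observation_vector([filtered, run_length], site))  # [0.5, 6.0]
```

Each entry of the resulting vector corresponds to one operation, so adding a new operation adds one component to the observation vector of every site.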