
Segmentation of heterogeneous document images : an ... - Tel


determine the average height of text lines from text connected components, but much harder to determine the spaces between lines. Furthermore, line spacing varies from one document to another. To ensure that our kernel captures only one text line, the value of σ should be tied to λ. The size of the kernel should in turn depend on σ: it should be large enough that the kernel coefficients in the border rows and columns contribute very little to the sum of coefficients. As a rule of thumb, the size (in pixels) of the kernel window should be at least six times the value of σ, so that the power of the coefficients at the borders of the kernel window subsides to 1% or less of the power of the center coefficients. The second issue is that we want the λ parameter to depend on the size of the text line. Our parameters are:

tel-00912566, version 1 - 2 Dec 2013

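$$
\begin{aligned}
\lambda &= 2 \times (\text{average text height or width}) \\
\sigma &= \frac{\lambda}{3.5} \\
\gamma &= 0.7 \\
\psi &= \begin{cases} \pi & \text{to capture text lines} \\ 0 & \text{to capture white space} \end{cases}
\end{aligned}
$$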
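The parameter ties above can be illustrated with a minimal NumPy sketch. It assumes the standard real-valued Gabor form (a Gaussian envelope multiplied by a cosine carrier), which this excerpt does not spell out. The function name `gabor_kernel` and the orientation argument `theta` are hypothetical and not taken from the text; the window is sized to at least 6σ following the rule of thumb.

```python
import numpy as np

def gabor_kernel(avg_text_size, psi, theta=0.0, gamma=0.7):
    """Real-valued Gabor kernel using the parameter ties from the text.

    avg_text_size: average text height (text lines) or width (white space), in pixels.
    psi: phase offset, np.pi for text lines or 0.0 for white space.
    theta: orientation of the carrier wave in radians (an assumption of this sketch).
    Returns the kernel and the sigma that was used.
    """
    lam = 2.0 * avg_text_size         # lambda = 2 x average text height or width
    sigma = lam / 3.5                 # sigma is tied to lambda
    half = int(np.ceil(3.0 * sigma))  # window of 2*half + 1 pixels, at least 6*sigma
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so that the cosine carrier varies along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * xr / lam + psi)
    return envelope * carrier, sigma

# Example: kernel for text lines with an average height of 20 pixels.
# theta = pi/2 makes the carrier vary vertically, across horizontal lines.
kernel, sigma = gabor_kernel(20, psi=np.pi, theta=np.pi / 2)
```

With ψ = π the center coefficient is negative and with ψ = 0 it is positive, so the two phase settings respond with opposite sign to a stroke centered under the kernel.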

Unfortunately, there is no efficient implementation of the Gabor filter in which the λ parameter varies locally. We could manually crop patches from a document image and use, for each patch, a Gabor kernel that matches the local text height, but this would take a huge amount of time. Therefore, Gabor filtering by kernel multiplication in frequency space, with fixed parameters for each kernel per document, is still the only available option. As a consequence, two different Gabor kernels are used to capture text lines, with the λ parameters set to 2 × Average Text Height and 4 × Average Text Height. Moreover, three Gabor kernels are used to capture white space gaps, with the λ parameters set to 1.5 × Average Text Width, 3.5 × Average Text Width and 5.5 × Average Text Width. Figure 4.6 shows the results of applying these Gabor filters to the document in figure 4.5.
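As a rough sketch of the fixed-parameter filter bank described above, the following NumPy code multiplies the zero-padded spectra of the image and of each kernel, then crops the result back to the image size. The helper names (`gabor_kernel`, `filter_fft`, `gabor_bank`) and the kernel orientations (π/2 for text lines, 0 for white space gaps) are assumptions made for illustration, not details given in the text.

```python
import numpy as np

def gabor_kernel(lam, psi, theta, gamma=0.7):
    """Standard real-valued Gabor kernel with sigma = lam / 3.5 (form assumed)."""
    sigma = lam / 3.5
    half = int(np.ceil(3.0 * sigma))  # window of at least 6*sigma pixels
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * xr / lam + psi)

def filter_fft(image, kernel):
    """Linear convolution by multiplication in frequency space."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    shape = (ih + kh - 1, iw + kw - 1)  # zero-padding avoids circular wrap-around
    spectrum = np.fft.rfft2(image, shape) * np.fft.rfft2(kernel, shape)
    full = np.fft.irfft2(spectrum, shape)
    top, left = (kh - 1) // 2, (kw - 1) // 2
    return full[top:top + ih, left:left + iw]  # crop to the input size

def gabor_bank(image, avg_height, avg_width):
    """Two text-line kernels and three white-space kernels, fixed per document."""
    line_lams = [2.0 * avg_height, 4.0 * avg_height]
    gap_lams = [1.5 * avg_width, 3.5 * avg_width, 5.5 * avg_width]
    # Orientation choices below are assumptions of this sketch.
    maps = [filter_fft(image, gabor_kernel(lam, np.pi, np.pi / 2)) for lam in line_lams]
    maps += [filter_fft(image, gabor_kernel(lam, 0.0, 0.0)) for lam in gap_lams]
    return maps
```

Each document yields five filtered maps of the same size as the input. Because the kernels are fixed per document, each kernel spectrum is computed once, which is what makes this cheaper than matching a kernel to every local patch.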

Figure 4.7 displays two additional examples of filtered images, highlighting the effect of using various λ parameters to capture text lines of different font sizes.

4.4 Feature functions<br />

Each feature function has two parts. The first part depends on the label of the site or a combination of the labels from its neighbors. The second part depends on the observations. Theoretically, these observations can be generated from anywhere on the image; however, we restrict them to be computed from and around the site in question.

We have described many features that we extract from document images. In order to generate observations from these features, we compute the mean and variance of each feature map at the same site. These statistics serve as observations at each
