Segmentation of heterogeneous document images : an ... - Tel


tel.archives.ouvertes.fr

tel-00912566, version 1 - 2 Dec 2013

determine the average height of text lines from text connected components, but much harder to determine the spacing between lines. Furthermore, line spacing varies from one document to another. To ensure that our kernel captures only one text line, the value of σ should be tied to λ.

The size of the kernel should in turn depend on the σ parameter. It should be large enough that the kernel coefficients in the border rows and columns contribute very little to the sum of coefficients. As a rule of thumb, the side of the kernel window (in pixels) should be six times the value of σ or larger, so that the power of the coefficients at the borders of the window falls to 1% or less of the power of the center coefficients. The second issue is that we want the λ parameter to depend on the size of the text line. Our parameters are:

\[
\lambda = 2 \times \text{Avg. text height or width}, \qquad
\sigma = \frac{\lambda}{3.5}, \qquad
\gamma = 0.7, \qquad
\psi = \begin{cases} \pi & \text{to capture text lines} \\ 0 & \text{to capture white space} \end{cases}
\]

Unfortunately, there is no efficient implementation of the Gabor filter in which the λ parameter varies locally. We could manually crop patches from a document image and apply to each patch a Gabor kernel that matches the local text height, but this would take a huge amount of time. Therefore, Gabor filtering by kernel multiplication in the frequency domain, with fixed parameters for each kernel per document, is still the only available option. As a consequence, two different Gabor kernels are used to capture text lines. For these kernels, the λ parameters are set to 2 × Average Text Height and 4 × Average Text Height. Moreover, three Gabor kernels are used to capture white-space gaps. Their λ parameters are set to 1.5 × Average Text Width, 3.5 × Average Text Width and 5.5 × Average Text Width. Figure 4.6 shows the results of applying these Gabor filters to the document in Figure 4.5.
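The parameter choices above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names `gabor_kernel`, `filter_fft` and `build_kernel_bank` are hypothetical, the standard real-valued Gabor formula is assumed, and the orientation θ (not specified in the text) is assumed to be π/2 for text-line kernels, so that the sinusoid varies across rows, and 0 for white-space kernels, so that it varies across columns.

```python
import numpy as np

def gabor_kernel(lam, psi, theta, gamma=0.7, sigma_ratio=3.5):
    """Real Gabor kernel with sigma = lam / sigma_ratio (assumed standard form)."""
    sigma = lam / sigma_ratio
    half = int(np.ceil(3.0 * sigma))      # window side 2*half + 1 >= 6*sigma
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the sinusoid oscillates along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * xr / lam + psi)
    return envelope * carrier

def filter_fft(image, kernel):
    """Convolve by multiplying spectra; the result has the same shape as image."""
    h, w = image.shape
    kh, kw = kernel.shape
    shape = (h + kh - 1, w + kw - 1)      # zero-pad to avoid circular wrap-around
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    full = np.fft.irfft2(spectrum, s=shape)
    top, left = kh // 2, kw // 2          # crop the centered, image-sized region
    return full[top:top + h, left:left + w]

def build_kernel_bank(avg_height, avg_width):
    """One fixed set of kernels per document, using the multipliers from the text."""
    text_lines = [gabor_kernel(m * avg_height, np.pi, np.pi / 2) for m in (2.0, 4.0)]
    white_space = [gabor_kernel(m * avg_width, 0.0, 0.0) for m in (1.5, 3.5, 5.5)]
    return {"text_lines": text_lines, "white_space": white_space}
```

Given a grayscale page as a 2-D array `page` and average connected-component sizes measured beforehand, `[filter_fft(page, k) for k in build_kernel_bank(h, w)["text_lines"]]` would yield one response map per text-line kernel. Because every kernel is fixed for the whole document, each response costs a single pass of forward and inverse transforms.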
Figure 4.7 displays two additional examples of filtered images, highlighting the effect of using different λ parameters to capture text lines of different font sizes.

4.4 Feature functions

Each feature function has two parts. The first part depends on the label of the site or on a combination of the labels of its neighbors. The second part depends on the observations. In theory, these observations can be generated from anywhere in the image; however, we restrict them to be computed from and around the site in question. We have described many features that we extract from document images. To generate observations from these features, we compute the mean and variance of each feature map at the same site. These statistics serve as observations at each

(a) ψ = π, λ = Avg. CC height × 2
(b) ψ = π, λ = Avg. CC height × 4
(c) ψ = 0, λ = Avg. CC width × 3.5
(d) ψ = 0, λ = Avg. CC width × 5.5

Figure 4.6: Results of applying two sets of Gabor filters to a document image. (a) and (b) display the results of the first set, with two Gabor filters that try to capture text lines with different heights. (c) and (d) show the results of two additional Gabor filters that capture gaps between text columns.
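The per-site observations described in Section 4.4, the mean and variance of each feature map at a site, can be illustrated as follows. This is only a sketch under an assumption not confirmed by the text: sites are taken to be non-overlapping square blocks of a hypothetical side length `site_size`, and `site_observations` is an illustrative name.

```python
import numpy as np

def site_observations(feature_map, site_size):
    """Mean and variance of a feature map over non-overlapping square sites.

    Assumes (hypothetically) that sites form a regular grid of
    site_size x site_size blocks; partial blocks at the borders are dropped.
    """
    h, w = feature_map.shape
    rows, cols = h // site_size, w // site_size
    # Crop to whole sites, then give every site its own pair of axes.
    blocks = feature_map[:rows * site_size, :cols * site_size].reshape(
        rows, site_size, cols, site_size)
    return blocks.mean(axis=(1, 3)), blocks.var(axis=(1, 3))
```

Applying this function to each feature map in turn, for example to each Gabor response, gives two values per site and per map, which can then be stacked into one observation vector per site.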

