14.01.2014 Views

Segmentation of heterogeneous document images : an ... - Tel


tel-00912566, version 1 - 2 Dec 2013

Figure 4.3: Height and width maps. (a) Original image, (b) height map, (c) width map.

4.3.1 Height and width maps

These two maps represent the local average height and width of connected components
in a document image. We do not use them directly in feature functions;
instead, we use them as part of the computation for a distance-based feature.
Each pixel of a width/height map is the weighted average of the height or width
of all text-labeled connected components in a 300 × 300 block around it.

Each map can be computed as the pixel-by-pixel ratio of two image convolutions.
Suppose that our goal is to generate the height map. We know the location of the
pixels of all text-labeled components of the page. We first prepare two sparse
images. The first image, A, holds the height of the enclosing component at every
pixel that belongs to a text component and zero everywhere else. The second image,
B, holds a value of one at every pixel of a text component and zero everywhere
else. We also have a weighting function K of size 300 × 300 pixels defined as:

$$K(x, y) = \cos\left(\frac{\pi x}{300}\right) \cos\left(\frac{\pi y}{300}\right)$$

where x and y are the coordinates of pixels, with (0, 0) being the center of
the window. Then

$$\mathrm{HeightMap}(i, j) = \frac{(K * A)_{i,j}}{(K * B)_{i,j}}$$

The width map can be computed in the same way, except that the A image holds
the widths of the connected components instead of their heights. Figure 4.3
displays one document image and its height and width maps. We also normalize
our height and width maps by dividing every pixel's value by the height and
width of the document, respectively.
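
The procedure described in this section can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in this work. It assumes that the text-labeled components are given as an integer label image (0 for background, k > 0 for the pixels of component k), that a component's height and width are those of its bounding box, that the image is zero-padded at the borders, and that pixels with no text component inside their window are set to zero. The function names `cosine_window`, `convolve_separable` and `height_width_maps` are hypothetical. Because K is a product of a function of x and a function of y, the sketch applies it as two one-dimensional convolutions.

```python
import numpy as np


def cosine_window(size=300):
    """1-D factor w of the kernel, so that K(x, y) = w(x) * w(y).

    Offsets are measured from the window centre. For an even size the
    centre falls between two pixels, so offsets are half-integers.
    """
    offsets = np.arange(size) - (size - 1) / 2.0
    return np.cos(np.pi * offsets / size)


def convolve_separable(image, w):
    """Zero-padded 2-D convolution with the separable kernel outer(w, w)."""
    size = len(w)
    pad = (size // 2, size - 1 - size // 2)

    def conv1d(v):
        # Padding by size - 1 in total keeps the output as long as the input.
        return np.convolve(np.pad(v, pad), w, mode="valid")

    rows = np.apply_along_axis(conv1d, 1, image)
    return np.apply_along_axis(conv1d, 0, rows)


def height_width_maps(labels, size=300):
    """Return (height_map, width_map), each normalized by the page extent.

    labels: 2-D int array, 0 = background, k > 0 = pixels of text
    component k (an assumed input format).
    """
    a_height = np.zeros(labels.shape)  # image A for the height map
    a_width = np.zeros(labels.shape)   # image A for the width map
    for k in np.unique(labels):
        if k == 0:
            continue
        ys, xs = np.nonzero(labels == k)
        a_height[ys, xs] = ys.max() - ys.min() + 1
        a_width[ys, xs] = xs.max() - xs.min() + 1
    b = (labels > 0).astype(float)     # image B: one on text pixels

    w = cosine_window(size)
    den = convolve_separable(b, w)     # (K * B)
    maps = []
    for a, extent in ((a_height, labels.shape[0]), (a_width, labels.shape[1])):
        num = convolve_separable(a, w)  # (K * A)
        ratio = np.zeros_like(num)
        # Assumption: leave pixels with no nearby text component at zero.
        np.divide(num, den, out=ratio, where=den > 1e-12)
        maps.append(ratio / extent)     # normalize by page height or width
    return maps[0], maps[1]
```

Calling `height_width_maps(labels)` on a label image of a full page uses the 300 × 300 window from the text; a smaller `size` is convenient for experimenting on small arrays. Dividing by (K ∗ B) is what turns the smoothed image into a weighted average, since it equals the sum of the weights that fall on text pixels.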
