03.01.2015 Views

Combining Information from Multiple Internet Sources


could interpret it as the number of basic operations (in the sense defined above) needed to unify two different result sets.

The following example illustrates how the distance between two result sets can be evaluated.

Example:

Let us consider the following result sets:

RS1 = (a, b, c)

RS2 = (b, c, a)

Then the distance between these result sets is equal to 2.

To obtain RS2 from RS1, two operations are required:

1 deletion – remove a from the beginning

1 insertion – add a at the end

This gives the following 2 alignments:

1. (a, b, c)
   (b, c, a)

2. (a, b, c, -)
   (-, b, c, a)

The second alignment corresponds to the lowest-cost path from (-1, -1) to (2, 2) in the following matrix:

          -1    0    1    2
                b    c    a
-1         0    1    2    3
 0    a    1    1    2    2
 1    b    2    1    2    3
 2    c    3    2    1    2

Listing 3.4.3 Example of variation of algorithm for Levenshtein distance
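As a cross-check of the matrix above, the following is a minimal Python sketch of the standard Levenshtein dynamic program. It assumes unit-cost insertion, deletion and substitution; the variation used in this work may define its basic operations differently. The function names `levenshtein_matrix` and `traceback` are illustrative and do not come from the source. Row and column 0 in the code play the role of index -1 in the text (the empty prefix).

```python
def levenshtein_matrix(rs1, rs2):
    """Return the (len(rs1)+1) x (len(rs2)+1) matrix of unit-cost edit distances."""
    rows, cols = len(rs1) + 1, len(rs2) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete the first i items of rs1
    for j in range(cols):
        d[0][j] = j  # insert the first j items of rs2
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if rs1[i - 1] == rs2[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
    return d


def traceback(rs1, rs2, d, gap="-"):
    """Recover one lowest-cost alignment by walking back from the last cell."""
    i, j = len(rs1), len(rs2)
    top, bottom = [], []
    while i > 0 or j > 0:
        both = i > 0 and j > 0
        cost = 0 if both and rs1[i - 1] == rs2[j - 1] else 1
        if both and d[i][j] == d[i - 1][j - 1] + cost:
            # diagonal step: match or substitution
            top.append(rs1[i - 1])
            bottom.append(rs2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            # vertical step: item of rs1 deleted
            top.append(rs1[i - 1])
            bottom.append(gap)
            i -= 1
        else:
            # horizontal step: item of rs2 inserted
            top.append(gap)
            bottom.append(rs2[j - 1])
            j -= 1
    return top[::-1], bottom[::-1]


rs1 = ("a", "b", "c")
rs2 = ("b", "c", "a")
matrix = levenshtein_matrix(rs1, rs2)
distance = matrix[-1][-1]
alignment = traceback(rs1, rs2, matrix)
```

For RS1 = (a, b, c) and RS2 = (b, c, a), `distance` is 2 and `alignment` is the pair (a, b, c, -) and (-, b, c, a), which is alignment 2 from the example.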

This distance is used in the Consensus Method, both in the main part of the algorithm and in the weight calculation that follows it. The following listing presents pseudocode for a dynamic programming version of this variation of the algorithm.
