03.01.2015 Views

Combining Information from Multiple Internet Sources


3.4 Common algorithms

This part of chapter 3 presents the algorithms that are commonly used throughout the application. For each algorithm it gives the purpose, the pseudocode, and a short description.

3.4.1 Ranking algorithm

Listing 3.4.1 presents the pseudocode of the algorithm for the initial URL ranking. This initial ranking is performed before the Game theory and Auction methods (but not the Consensus method) can start their main computational parts. Its purpose is to calculate the confidence value that each Search Agent assigns to a given URL. In general, the confidence value is calculated as follows: size of the agent's result set − position of the URL in that result set.

However, the Game theory and Auction methods require that every result set contain the same URLs, though not necessarily at the same positions. Otherwise the algorithm breaks: an agent may know nothing about a certain URL, so the ranks of that URL cannot be compared. The ranking algorithm therefore also ensures that this assumption holds by adding the missing URLs to each result set.

The algorithm also determines whether the main computational parts of the two aforementioned approaches can be performed at all. The rule is as follows: if A ∩ B = ∅ for all pairs of result sets A and B, then the main part of the Game theory and Auction methods cannot start. A result set that has no URL in common with any other result set is removed from the process at the very beginning, because it is not suitable for algorithms that require every URL to be in every result set.
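The steps described above can be illustrated with a minimal Python sketch. It is not the pseudocode of Listing 3.4.1. The function name `initial_ranking`, the dictionary layout, and the agent names are hypothetical. The text does not say where missing URLs are placed, whether positions start at 0 or 1, or whether the set size is taken before or after padding, so the sketch assumes that missing URLs are appended at the end of a result set, that positions are 1-based, and that the padded size is used.

```python
def initial_ranking(result_sets):
    """Sketch of the initial URL ranking step (assumptions noted inline).

    result_sets maps an agent name to its ordered list of URLs, best first.
    Returns (can_start, padded_result_sets, confidence_values).
    """
    # Remove every result set that shares no URL with any other result set.
    kept = {
        agent: urls
        for agent, urls in result_sets.items()
        if any(set(urls) & set(other)
               for name, other in result_sets.items() if name != agent)
    }

    # If A ∩ B = ∅ for all pairs, every set was removed: the main part cannot start.
    if not kept:
        return False, {}, {}

    # Every URL known to at least one remaining agent.
    all_urls = set().union(*kept.values())

    padded, confidence = {}, {}
    for agent, urls in kept.items():
        # Assumption: missing URLs are appended at the end, in sorted order.
        full = list(urls) + sorted(all_urls - set(urls))
        padded[agent] = full
        # Assumption: 1-based positions and padded size, so the last URL gets 0.
        confidence[agent] = {url: len(full) - pos
                             for pos, url in enumerate(full, start=1)}
    return True, padded, confidence


example = {
    "agent_a": ["u1", "u2", "u3"],
    "agent_b": ["u2", "u4"],
    "agent_c": ["u9"],  # shares nothing with the others, so it is removed
}
can_start, padded, confidence = initial_ranking(example)
```

In this example `agent_c` is dropped, and both remaining result sets end up containing the same four URLs, which is the precondition the Game theory and Auction methods rely on.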
