03.01.2015 Views

Combining Information from Multiple Internet Sources


Consensus       Ask.com   Live   Interia   Yahoo!   Google
Set Coverage    70%       50%    80%       70%      80%
URL to URL      20%       20%    10%       20%      10%

Table 4.1.6 Coverage of results of Consensus method and search engines for simple query

Table 4.1.5 compares the result set of the Consensus method with each of the result sets returned by the search engines, and Table 4.1.6 presents the set coverage. Set coverage is at least 50%, but URL-to-URL coverage is very low, at most 20%. The result set provided by the Consensus method was also not consistent: the average (Levenshtein) distance between the consensus answer and the result set of each search engine was higher than the average distance between the result sets of the search engines themselves. This shows that the result sets are dispersed in the sense of the Levenshtein distance.
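The consistency criterion described above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: it assumes each result set is an ordered list of URLs, that the Levenshtein distance treats every URL as a single symbol, and that a consensus counts as consistent when its mean distance to the engines does not exceed the mean pairwise distance between the engines. The function names `levenshtein` and `is_consistent` are hypothetical.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two sequences; each URL counts as one symbol."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y
        prev = curr
    return prev[-1]


def is_consistent(consensus, result_sets):
    """Compare mean consensus-to-engine distance with mean engine-to-engine distance.

    Assumes at least two result sets. Returns (consistent, mean_to_consensus,
    mean_pairwise).
    """
    to_consensus = [levenshtein(consensus, r) for r in result_sets]
    pairwise = [levenshtein(r, s) for r, s in combinations(result_sets, 2)]
    mean_to_consensus = sum(to_consensus) / len(to_consensus)
    mean_pairwise = sum(pairwise) / len(pairwise)
    return mean_to_consensus <= mean_pairwise, mean_to_consensus, mean_pairwise


# Toy data (invented for illustration, not taken from the experiment).
engines = [["a", "b", "c"], ["a", "b", "d"], ["a", "c", "b"]]
consistent, d_consensus, d_engines = is_consistent(["a", "b", "c"], engines)
```

In this toy case the consensus is closer to the engines on average than the engines are to each other, so it would be marked consistent. The result reported in this section is the opposite case.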

The Consensus method's answer-processing algorithm is based on the average ranks of the URLs from the result sets of the engines, so the final answer contains the URLs that are ranked highest overall. If a URL was near the top of the engines' result sets, it will be in one of the topmost places in the consensus answer. If its overall ranking was low, it will appear low in the final answer or not at all.
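A rough sketch of rank averaging, under stated assumptions, is given here. The text does not say how a URL missing from an engine's result set is scored or how long the final answer is, so both are assumptions in this example: a missing URL receives the rank one past the end of that engine's list, and the answer is cut to the top `k` entries. The name `average_rank_consensus` is hypothetical.

```python
from collections import defaultdict


def average_rank_consensus(result_sets, k=10):
    """Order URLs by their average rank across all result sets, keep the top k.

    Assumption: a URL absent from a result set is given rank len(list) + 1.
    """
    all_urls = {url for ranking in result_sets for url in ranking}
    totals = defaultdict(float)
    for ranking in result_sets:
        position = {}
        for rank, url in enumerate(ranking, start=1):
            position.setdefault(url, rank)  # keep the first occurrence
        missing_rank = len(ranking) + 1
        for url in all_urls:
            totals[url] += position.get(url, missing_rank)
    n = len(result_sets)
    # Ties on average rank are broken alphabetically to keep the output stable.
    ordered = sorted(all_urls, key=lambda url: (totals[url] / n, url))
    return ordered[:k]


# Toy data (invented for illustration).
engines = [["a", "b", "c"], ["b", "a", "d"], ["a", "c", "b"]]
top3 = average_rank_consensus(engines, k=3)
```

Here `a` is near the top everywhere and leads the consensus, while `d`, which appears in only one list, has the worst average rank and drops out of the top three.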

Measuring the distance between result sets with the Levenshtein distance depends heavily on URL positions: where a particular URL is placed in the first set and where it is placed in the second contributes strongly to the final distance value. The URL-to-URL coverage of the consensus answer is very low for each result set, and this is why the consensus answer is classed as inconsistent. Nevertheless, the inconsistency is a subjective result. With some other distance metric, the same result could be marked as consistent.
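The effect of position can be illustrated with a small example. The definitions below are assumptions, since this section does not define the two coverage measures: set coverage is taken as the share of consensus URLs found anywhere in an engine's result set, and URL-to-URL coverage as the share found at the same position. The Jaccard distance stands in for "some other metric"; it is not a metric named in the text.

```python
def set_coverage(consensus, result_set):
    """Share of consensus URLs that occur anywhere in result_set."""
    if not consensus:
        return 0.0
    found = set(result_set)
    return sum(url in found for url in consensus) / len(consensus)


def url_to_url_coverage(consensus, result_set):
    """Share of consensus URLs that sit at the same position in result_set."""
    if not consensus:
        return 0.0
    same = sum(a == b for a, b in zip(consensus, result_set))
    return same / len(consensus)


def jaccard_distance(a, b):
    """Order-insensitive distance: 1 - |intersection| / |union|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1 - len(set_a & set_b) / len(union)


# Toy data (invented): the same four URLs, rotated by one place.
consensus = ["a", "b", "c", "d"]
engine = ["b", "c", "d", "a"]
```

For these two lists the set coverage is 100% and the Jaccard distance is 0, yet the URL-to-URL coverage is 0% because no URL keeps its position. A position-sensitive measure would call the lists far apart, and an order-insensitive one would call them identical.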
