Combining Information from Multiple Internet Sources
Consensus       Ask.com   Live   Interia   Yahoo!   Google
Set coverage    70%       50%    80%       70%      80%
URL to URL      20%       20%    10%       20%      10%
Table 4.1.6 Coverage of results of Consensus method and search engines for simple query
Table 4.1.5 presents the comparison between the result set of the Consensus method and each of
the result sets returned by the search engines. Table 4.1.6 presents the set coverage. Set coverage
is at least 50%, but URL to URL coverage is very low, at most 20%. The result
set provided by the Consensus method was also not consistent: the average Levenshtein distance
between the consensus answer and the result set of each search engine was higher
than the average distance between the result sets of the search engines themselves. This shows that the result sets
are dispersed in the sense of the Levenshtein distance.
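
The two coverage measures reported in Table 4.1.6 can be illustrated with a short sketch. The definitions used here are assumptions, since this section does not spell them out: set coverage is taken as the share of consensus URLs that occur anywhere in an engine's result set, and URL to URL coverage as the share of positions at which both lists hold the same URL. The function names and sample URLs are hypothetical and are not data from the experiment.

```python
def set_coverage(consensus, engine):
    """Share of consensus URLs that appear anywhere in the engine's results."""
    if not consensus:
        return 0.0
    engine_urls = set(engine)
    return sum(url in engine_urls for url in consensus) / len(consensus)


def url_to_url_coverage(consensus, engine):
    """Share of consensus positions that hold the same URL in both lists."""
    if not consensus:
        return 0.0
    matches = sum(a == b for a, b in zip(consensus, engine))
    return matches / len(consensus)


# Hypothetical ranked result lists (assumed for illustration only).
consensus = ["a.example", "b.example", "c.example", "d.example", "e.example"]
engine = ["a.example", "c.example", "b.example", "x.example", "d.example"]

print(set_coverage(consensus, engine))         # 0.8
print(url_to_url_coverage(consensus, engine))  # 0.2
```

In this example four of the five consensus URLs occur somewhere in the engine's list, but only the first position agrees exactly, which mirrors the pattern of high set coverage and low URL to URL coverage seen in the table.
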
The answer processing algorithm of the Consensus method is based on the average ranks of the URLs
from the result sets of the engines, so the final answer contains the URLs with the highest overall rank. If
a URL was near the top of the result sets of the engines, it will be at one of the
topmost places in the consensus answer. If its overall ranking was low, it will appear at a low place in the final answer or not
at all.
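
A minimal sketch of ranking by average rank follows. It is not the thesis implementation: the penalty rank given to a URL that an engine did not return, the alphabetical tie-breaking, and the cut-off `k` are assumptions made for the example, and `consensus_by_average_rank` is a hypothetical name.

```python
from collections import defaultdict


def consensus_by_average_rank(result_sets, k):
    """Order URLs by their mean rank across engines and keep the top k.

    Assumption: a URL absent from an engine's list receives the penalty
    rank len(list) + 1 for that engine.
    """
    all_urls = {url for results in result_sets for url in results}
    rank_sums = defaultdict(float)
    for results in result_sets:
        position = {url: i + 1 for i, url in enumerate(results)}
        penalty = len(results) + 1
        for url in all_urls:
            rank_sums[url] += position.get(url, penalty)
    n = len(result_sets)
    # Sort by average rank; ties are broken alphabetically for determinism.
    ordered = sorted(all_urls, key=lambda url: (rank_sums[url] / n, url))
    return ordered[:k]


# Hypothetical result lists from three engines.
engines = [
    ["a.example", "b.example", "c.example"],
    ["b.example", "a.example", "d.example"],
    ["a.example", "d.example", "b.example"],
]
print(consensus_by_average_rank(engines, 3))
# ['a.example', 'b.example', 'd.example']
```

Here `c.example` is returned by only one engine, at its last place, so its average rank is the worst and it drops out of the final answer, as described above.
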
When distances between result sets are measured with the Levenshtein distance, the outcome is
highly dependent on URL positions: where a particular URL is placed in the first set and where it is
placed in the second set contributes strongly to the final distance value. It can be observed that the URL to URL
coverage of the consensus answer is very low for each result set. This resulted in a consensus
answer which is said to be inconsistent. Nevertheless, the inconsistency is a subjective result. If
some other metric were used to measure distance, the same answer could be marked as
being consistent.
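
The position dependence and the consistency criterion can be sketched as follows. The sketch assumes that each URL is treated as a single symbol of the sequence and that an answer counts as consistent when its average distance to the engines' result sets does not exceed the average pairwise distance between those sets; `levenshtein` and `is_consistent` are hypothetical helper names, and the lists are made up.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two sequences, treating each URL as one symbol."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def is_consistent(consensus, result_sets):
    """Assumed criterion: the consensus is, on average, no farther from the
    engines than the engines are from each other (needs >= 2 result sets)."""
    to_consensus = [levenshtein(consensus, r) for r in result_sets]
    pairwise = [levenshtein(x, y) for x, y in combinations(result_sets, 2)]
    return sum(to_consensus) / len(to_consensus) <= sum(pairwise) / len(pairwise)


# The same four URLs in a different order already give a nonzero distance.
first = ["a.example", "b.example", "c.example", "d.example"]
rotated = ["b.example", "c.example", "d.example", "a.example"]
print(levenshtein(first, first))    # 0
print(levenshtein(first, rotated))  # 2

# Hypothetical engine results and a candidate consensus answer.
engines = [
    ["a.example", "b.example", "c.example"],
    ["a.example", "b.example", "d.example"],
    ["a.example", "c.example", "b.example"],
]
print(is_consistent(["a.example", "b.example", "c.example"], engines))  # True
```

The two lists `first` and `rotated` contain exactly the same URLs, so their set coverage is complete, yet the Levenshtein distance is 2 because the positions differ. A metric that ignores positions would judge the same pair as identical.
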