Combining Information from Multiple Internet Sources
Depending on the outcome of the consistency check, a different entry point is used for the weight calculation algorithm. If the consistency of the consensus was high, the agent whose result set has the smallest distance to the consensus is selected as the agent whose weight will be equal to 1, and the algorithm in listing 3.3.3 does not require the feedback URL as an input: step 1 is omitted. If the consistency was low, the first step of the algorithm must be performed to find the agent.
Input: Map of results $\langle a^{(i)}, r^{(i)} \rangle$ provided by $m$ Search agents, each in the form $r^{(i)} = (U^{(i)}_1, U^{(i)}_2, \ldots, U^{(i)}_n)$ where $U^{(i)}_1, U^{(i)}_2, \ldots, U^{(i)}_n$ are URLs; feedback URL
Output: Set of weights with corresponding agents
BEGIN
1. find the agent whose result set contains URL from feedback and is closest to consensus, set his weight to 1
2. for all other agents:
   find $d(r^{(i)}, C)$
   $W[i] = \frac{|r^{(i)}| - d(r^{(i)}, C)}{|r^{(i)}|}$
   where $d(r^{(i)}, C)$ is the Levenshtein distance
3. return weights
END
Listing 3.3.3 Weights calculation algorithm for Consensus method
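The listing above can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used in the system: the function names `levenshtein` and `consensus_weights`, the dictionary representation of the result map, and the agent names are all assumptions. It also assumes that the distance treats each URL as a single symbol, that the weight is normalized by the length of the agent's own result list, and that a negative value (possible when the consensus is longer than the result list) is clamped to 0.

```python
from typing import Dict, List, Optional, Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two URL lists, treating each URL as one symbol."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def consensus_weights(results: Dict[str, List[str]],
                      consensus: List[str],
                      feedback_url: Optional[str] = None) -> Dict[str, float]:
    """Return a weight in [0, 1] for every agent (hypothetical helper)."""
    # Step 1: pick the anchor agent. With a feedback URL, only agents whose
    # result set contains it are candidates; without one (high consistency),
    # every agent is a candidate.
    candidates = list(results)
    if feedback_url is not None:
        with_feedback = [a for a in results if feedback_url in results[a]]
        # Assumption: fall back to all agents if none returned the feedback URL.
        candidates = with_feedback or candidates
    anchor = min(candidates, key=lambda a: levenshtein(results[a], consensus))
    weights = {anchor: 1.0}

    # Step 2: W[i] = (|r_i| - d(r_i, C)) / |r_i| for all other agents.
    for agent, urls in results.items():
        if agent == anchor:
            continue
        if not urls:
            weights[agent] = 0.0  # assumption: an empty result set gets weight 0
            continue
        d = levenshtein(urls, consensus)
        weights[agent] = max(0.0, (len(urls) - d) / len(urls))

    # Step 3: return weights.
    return weights
```

For instance, an agent whose four-URL list differs from a four-URL consensus in two positions has distance 2 and therefore weight (4 − 2) / 4 = 0.5 under these assumptions.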
These weights are used as ranking modifiers of the results provided by the search engines when the application is issued the same query for this algorithm. Once the weights are calculated, the Manager Agent sends them to the corresponding Search Agents. Depending on the distances between the result sets provided by the Search Agents, a weight may vary from 0 to 1. A weight is equal to 0 when a result set has the maximal distance to the anchor result set. The calculated weights are stored in the database in case the query is issued once more. Then, during the main algorithm, which yields the consensus answer, they are used as URL rank modifiers: the positions of URLs are divided by them. This moves a URL towards the bottom of the list if the weight of the result set from which the URL originates is close to zero. If the weight is equal to 0, the URL position is divided by 0.01. 2
2 As stated before, this way of ranking search engines was not tested and was disabled during the tests described in chapter 4. Weights of all search engines were equal to 1, so URL positions were not altered. However, it was implemented so that rankings of the search engines can be included in answer processing in the future.
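The rank modification described above, in which URL positions are divided by the weight of the originating result set, might look like the following minimal sketch. The function name `adjusted_positions`, the constant name, and the use of 1-based positions are assumptions made for illustration; only the division by the weight and the substitution of 0.01 for a zero weight come from the text.

```python
from typing import List, Tuple

# Divisor used in place of a zero weight, as stated in the text.
ZERO_WEIGHT_DIVISOR = 0.01


def adjusted_positions(urls: List[str], weight: float) -> List[Tuple[str, float]]:
    """Divide each URL's position by the weight of its result set.

    Positions are assumed to start at 1. A smaller weight yields a larger
    adjusted position, which pushes the URL towards the bottom of the list.
    """
    divisor = weight if weight > 0 else ZERO_WEIGHT_DIVISOR
    return [(url, position / divisor)
            for position, url in enumerate(urls, start=1)]
```

With a weight of 1 the positions are unchanged, which matches the configuration in which the weighting is effectively disabled.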