Introduction to Information Retrieval

informatics.buu.ac.th
23.06.2015

Jaccard coefficient: Example

• What is the query-document match score that the Jaccard coefficient computes for:
  • Query: “ides of March”
  • Document: “Caesar died in March”
• JACCARD(q, d) = 1/6 (the two term sets share one term, “March”, out of six distinct terms overall)
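The score in this example can be checked with a minimal Python sketch. It assumes whitespace tokenization and lowercasing, which the slide does not specify; the function name `jaccard` and the handling of two empty inputs are illustrative choices, not part of the lecture.

```python
from fractions import Fraction


def jaccard(query: str, document: str) -> Fraction:
    """Jaccard coefficient |A ∩ B| / |A ∪ B| over sets of terms.

    Assumption: terms are obtained by lowercasing and splitting on whitespace.
    """
    a = set(query.lower().split())
    b = set(document.lower().split())
    if not a and not b:
        # Assumed convention for this sketch: two empty sets count as identical.
        return Fraction(1)
    return Fraction(len(a & b), len(a | b))


score = jaccard("ides of March", "Caesar died in March")
print(score)  # 1/6
```

The only shared term is "march", and the union {ides, of, march, caesar, died, in} has six members, which gives 1/6.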

What’s wrong with Jaccard?

• It doesn’t consider term frequency (how many occurrences a term has).
• Rare terms are more informative than frequent terms. Jaccard does not consider this information.
• We need a more sophisticated way of normalizing for the length of a document.
• Later in this lecture, we’ll use cosine instead of |A ∩ B|/|A ∪ B| (Jaccard) for length normalization.
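The first two points can be illustrated with a small Python sketch. The documents and queries below are made up for illustration, and `jaccard_terms` is a hypothetical helper name; the sketch only shows that a set-based score cannot see repeated terms or distinguish rare terms from frequent ones.

```python
def jaccard_terms(a_terms, b_terms):
    """Jaccard coefficient over two term lists; duplicates vanish in the sets."""
    a, b = set(a_terms), set(b_terms)
    return len(a & b) / len(a | b)


query = "march".split()
doc_once = "caesar died in march".split()
# Hypothetical document that repeats the query term several times.
doc_many = "march march march caesar died in march".split()

# Term frequency is ignored: both documents reduce to the same term set.
same_score = jaccard_terms(query, doc_once) == jaccard_terms(query, doc_many)

# Term rarity is ignored: a common word ("in") and a rarer one ("caesar")
# contribute identically to the score.
score_common = jaccard_terms(["in"], doc_once)
score_rare = jaccard_terms(["caesar"], doc_once)

print(same_score, score_common, score_rare)
```

Both documents score 0.25 against the query, and the queries "in" and "caesar" are scored alike, which is the behavior the bullet points above criticize.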
