Introduction to Information Retrieval
Desired weight for rare terms
• Rare terms are more informative than frequent terms.
• Consider a term in the query that is rare in the collection (e.g., ARACHNOCENTRIC).
• A document containing this term is very likely to be relevant.
• → We want high weights for rare terms like ARACHNOCENTRIC.
Desired weight for frequent terms
• Frequent terms are less informative than rare terms.
• Consider a term in the query that is frequent in the collection (e.g., GOOD, INCREASE, LINE).
• A document containing this term is more likely to be relevant than a document that doesn't, but words like GOOD, INCREASE and LINE are not sure indicators of relevance.
• → For frequent terms like GOOD, INCREASE and LINE, we want positive weights, but lower weights than for rare terms.
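The two requirements above (high weights for rare terms, positive but lower weights for frequent terms) can be illustrated with a small sketch. It assumes one common choice of weight, inverse document frequency computed as log10(N / df), where N is the number of documents in the collection and df is the number of documents containing the term. The collection size and the document frequencies below are made-up values for illustration only, and the function name `idf` is hypothetical.

```python
import math


def idf(doc_freq: int, num_docs: int) -> float:
    """Return log10(num_docs / doc_freq), an assumed rarity-based weight.

    doc_freq is the number of documents that contain the term;
    num_docs is the total number of documents in the collection.
    """
    if not 0 < doc_freq <= num_docs:
        raise ValueError("doc_freq must be between 1 and num_docs")
    return math.log10(num_docs / doc_freq)


# Hypothetical collection size and document frequencies (not real data).
NUM_DOCS = 1_000_000
doc_freqs = {
    "arachnocentric": 10,   # rare term
    "line": 100_000,        # frequent term
    "good": 500_000,        # very frequent term
}

weights = {term: idf(df, NUM_DOCS) for term, df in doc_freqs.items()}

# Print terms from highest to lowest weight.
for term, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term:15s} df={doc_freqs[term]:>7d}  weight={weight:.2f}")
```

Under these assumed counts, the rare term receives the largest weight, while the frequent terms still receive positive weights that shrink as the term appears in more documents. The logarithm dampens the effect, so a term that is 10,000 times rarer does not receive a weight 10,000 times larger.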