Introduction to Information Retrieval
Summary: tf-idf
• Assign a tf-idf weight to each term t in each document d.
• The tf-idf weight . . .
  • . . . increases with the number of occurrences of the term within a document (term frequency).
  • . . . increases with the rarity of the term in the collection (inverse document frequency).
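The two properties in the summary can be checked on a toy collection. The sketch below is illustrative only: the slide's own weighting formula did not survive in this text, so the code assumes one common variant, (1 + log10 tf) × log10(N / df), where N is the number of documents. The function name `tf_idf_weights` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf_weights(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting (a common variant, not necessarily the slide's):
        w(t, d) = (1 + log10 tf(t, d)) * log10(N / df(t))
    """
    n_docs = len(docs)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        tf = Counter(tokens)  # term frequency within this document
        weights.append({
            term: (1 + math.log10(count)) * math.log10(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Hypothetical toy collection, tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat".split(),
    "the cat chased the dog".split(),
]
weights = tf_idf_weights(docs)
```

Under this assumed formula, a term that occurs in every document (here "the") gets weight 0 no matter how often it occurs, while a term confined to a single document (here "mat") gets the largest idf factor.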
Exercise: Term, collection and document frequency

Quantity              Symbol     Definition
term frequency        tf_{t,d}   number of occurrences of t in d
document frequency    df_t       number of documents in the collection that t occurs in
collection frequency  cf_t       total number of occurrences of t in the collection

• Relationship between df and cf?
• Relationship between tf and cf?
• Relationship between tf and df?
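One way to explore the exercise is to compute the three quantities directly from their definitions on a small collection and compare them. This is a minimal sketch; the document IDs and texts are made up for illustration.

```python
from collections import Counter

# Hypothetical toy collection: document ID -> list of tokens.
docs = {
    "d1": "to be or not to be".split(),
    "d2": "to do is to be".split(),
    "d3": "do be do be do".split(),
}

# tf[d][t]: number of occurrences of t in d
tf = {doc_id: Counter(tokens) for doc_id, tokens in docs.items()}

df = Counter()  # df[t]: number of documents that t occurs in
cf = Counter()  # cf[t]: total number of occurrences of t in the collection
for tokens in docs.values():
    df.update(set(tokens))  # count each term at most once per document
    cf.update(tokens)       # count every occurrence

for term in sorted(cf):
    per_doc = [tf[doc_id][term] for doc_id in docs]
    print(f"{term:>3}  tf per doc={per_doc}  df={df[term]}  cf={cf[term]}")
```

Reading the definitions off the code: cf is the sum of tf over all documents, and each document that contains the term contributes exactly 1 to df but at least 1 to cf, so df can never exceed cf.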