Introduction to Information Retrieval

informatics.buu.ac.th
23.06.2015

Summary: tf-idf

• Assign a tf-idf weight for each term t in each document d:
• The tf-idf weight . . .
• . . . increases with the number of occurrences within a document. (term frequency)
• . . . increases with the rarity of the term in the collection. (inverse document frequency)
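The two properties in the summary can be checked on a toy collection. The weighting formula itself did not survive on this slide, so the sketch below assumes one common log-scaled variant, w = (1 + log10 tf) * log10(N / df). The function name tf_idf_weights and the three sample documents are made up for illustration.

```python
import math
from collections import Counter

def tf_idf_weights(docs):
    """docs: list of token lists. Returns one dict per document: term -> weight.

    Assumed weighting (not taken from the slide):
        w = (1 + log10(tf)) * log10(N / df)
    """
    n_docs = len(docs)

    # Document frequency: number of documents each term occurs in.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        tf = Counter(tokens)  # term frequency within this document
        weights.append({
            term: (1 + math.log10(count)) * math.log10(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical toy collection.
docs = [
    ["the", "cat", "sat", "cat"],
    ["the", "dog", "sat"],
    ["the", "bird"],
]
w = tf_idf_weights(docs)
for i, doc_weights in enumerate(w):
    print(i, doc_weights)
```

In this toy collection "the" occurs in every document, so its weight is zero under the assumed formula, while "cat" (two occurrences, one document) outweighs both "sat" (rarer within the document and more common across the collection) and "bird" (equally rare, but occurring once).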

Exercise: Term, collection and document frequency

Quantity               Symbol     Definition
term frequency         tf_{t,d}   number of occurrences of t in d
document frequency     df_t       number of documents in the collection that t occurs in
collection frequency   cf_t       total number of occurrences of t in the collection

• Relationship between df and cf?
• Relationship between tf and cf?
• Relationship between tf and df?
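One way to explore the exercise is to compute all three quantities on a small collection and compare them side by side. This is a minimal sketch following the definitions in the table. The helper name frequencies and the sample documents are hypothetical.

```python
from collections import Counter

def frequencies(docs):
    """docs: list of token lists.

    Returns (tf, df, cf):
      tf[d][t]  number of occurrences of t in document d
      df[t]     number of documents that t occurs in
      cf[t]     total number of occurrences of t in the collection
    """
    tf = [Counter(tokens) for tokens in docs]
    df = Counter()
    cf = Counter()
    for counts in tf:
        df.update(counts.keys())  # each document contributes at most 1 per term
        cf.update(counts)         # every occurrence is added
    return tf, df, cf

# Hypothetical toy collection.
docs = [
    ["the", "cat", "sat", "cat"],
    ["the", "dog", "sat"],
    ["the", "bird"],
]
tf, df, cf = frequencies(docs)
for term in sorted(cf):
    per_doc = [counts[term] for counts in tf]
    print(f"{term:5} tf per doc={per_doc} df={df[term]} cf={cf[term]}")
```

Comparing the tf, df and cf columns for terms such as "cat" and "the" gives concrete cases to test a proposed answer against.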
