Introduction to Information Retrieval


informatics.buu.ac.th
23.06.2015

Summary: tf-idf

• Assign a tf-idf weight for each term t in each document d.
• The tf-idf weight . . .
• . . . increases with the number of occurrences within a document. (term frequency)
• . . . increases with the rarity of the term in the collection. (inverse document frequency)
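The two properties above can be checked on a toy collection. The following is a minimal sketch, not necessarily the exact formula used on the slide: it assumes the common log-scaled weighting (1 + log10 tf) × log10(N / df), where N is the number of documents. The function name `tf_idf_weights` and the sample documents are illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf_weights(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (not confirmed by the source):
    (1 + log10(tf)) * log10(N / df)
    """
    n_docs = len(docs)

    # Document frequency: number of documents each term occurs in.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        tf = Counter(tokens)  # term frequency within this document
        weights.append({
            term: (1 + math.log10(count)) * math.log10(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Hypothetical three-document collection.
docs = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat"],
    ["the", "cat", "cat", "cat"],
]
weights = tf_idf_weights(docs)
```

Under this assumed weighting, "the" occurs in every document and so gets weight 0, "cat" weighs more in the third document (three occurrences) than in the first (one occurrence), and "dog" outweighs "sat" in the second document because it is rarer in the collection.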

Exercise: Term, collection and document frequency

Quantity               Symbol     Definition
term frequency         tf_{t,d}   number of occurrences of t in d
document frequency     df_t       number of documents in the collection that t occurs in
collection frequency   cf_t       total number of occurrences of t in the collection

• Relationship between df and cf?
• Relationship between tf and cf?
• Relationship between tf and df?
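One way to explore the exercise is to compute the three quantities on a small collection. The sketch below uses a hypothetical three-document collection. It checks two relationships that follow directly from the definitions: cf_t is the sum of tf_{t,d} over all documents d, and df_t never exceeds cf_t, since every document containing t contributes at least one occurrence.

```python
from collections import Counter

# Hypothetical tokenized collection, keyed by document id.
docs = {
    "d1": ["to", "be", "or", "not", "to", "be"],
    "d2": ["to", "do", "is", "to", "be"],
    "d3": ["do", "do", "do"],
}

# tf[d][t]: number of occurrences of t in d.
tf = {doc_id: Counter(tokens) for doc_id, tokens in docs.items()}

df = Counter()  # df[t]: number of documents that t occurs in
cf = Counter()  # cf[t]: total number of occurrences of t in the collection
for counts in tf.values():
    df.update(counts.keys())  # each document counts once per term
    cf.update(counts)         # every occurrence counts

for term in cf:
    # cf is the sum of tf over all documents.
    assert cf[term] == sum(counts[term] for counts in tf.values())
    # A term cannot occur in more documents than it has occurrences.
    assert df[term] <= cf[term]
```

Between tf and df there is no fixed ordering: in this collection "do" occurs three times in d3 but in only two documents, while "be" occurs once in d2 and in two documents.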

