23.06.2015 Views

Introduction to Information Retrieval



Collection frequency vs. Document frequency

word         collection frequency   document frequency
INSURANCE    10440                  3997
TRY          10422                  8760

• Collection frequency of t: number of tokens of t in the collection
• Document frequency of t: number of documents t occurs in
• Why these numbers?
• Which word is a better search term (and should get a higher weight)?
• This example suggests that df (and idf) is better for weighting than cf (and “icf”).
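The two definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-document toy collection, the helper names `collection_and_document_frequency` and `idf`, and the choice of idf as log10(N / df) are not taken from the slide. The toy collection is built so that both words have the same collection frequency but different document frequencies, which mirrors the pattern in the table.

```python
import math
from collections import Counter


def collection_and_document_frequency(docs):
    """Return (cf, df) for a list of tokenized documents.

    cf[t] counts every token of t in the collection;
    df[t] counts the documents in which t occurs at least once.
    """
    cf = Counter()
    df = Counter()
    for tokens in docs:
        cf.update(tokens)       # every occurrence counts
        df.update(set(tokens))  # each term counts once per document
    return cf, df


def idf(term, df, n_docs):
    """Inverse document frequency, assumed here to be log10(N / df_t).

    Assumes the term occurs in at least one document (df[term] > 0).
    """
    return math.log10(n_docs / df[term])


# Hypothetical toy collection (not the collection behind the slide's numbers).
docs = [
    "insurance insurance insurance claim".split(),
    "try to file a claim".split(),
    "try again".split(),
    "try the other form".split(),
]

cf, df = collection_and_document_frequency(docs)

for term in ("insurance", "try"):
    print(term, "cf =", cf[term], "df =", df[term],
          "idf =", round(idf(term, df, len(docs)), 3))

# Ratio cf/df for the numbers shown on the slide: average number of
# occurrences per document that contains the word.
slide_counts = {"INSURANCE": (10440, 3997), "TRY": (10422, 8760)}
for word, (word_cf, word_df) in slide_counts.items():
    print(word, "cf/df =", round(word_cf / word_df, 2))
```

In the toy collection both words have cf = 3, yet "insurance" has df = 1 and "try" has df = 3, so only df (and therefore idf) separates them. The slide's numbers show the same effect: cf is nearly identical for the two words, while INSURANCE is concentrated in fewer documents (about 2.6 occurrences per containing document versus about 1.2 for TRY).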

