24.07.2016 Views

www.allitebooks.com

Learning Data Mining with Python


Recommending Movies Using Affinity Analysis

The Apriori algorithm

The Apriori algorithm is part of our affinity analysis and deals specifically with finding frequent itemsets within the data. The basic procedure of Apriori builds up new candidate itemsets from previously discovered frequent itemsets. These candidates are tested to see if they are frequent, and then the algorithm iterates as explained here:

1. Create initial frequent itemsets by placing each item in its own itemset. Only items with at least the minimum support are used in this step.

2. New candidate itemsets are created from the most recently discovered frequent itemsets by finding supersets of the existing frequent itemsets.

3. All candidate itemsets are tested to see if they are frequent. If a candidate is not frequent then it is discarded. If there are no new frequent itemsets from this step, go to the last step.

4. Store the newly discovered frequent itemsets and go to the second step.

5. Return all of the discovered frequent itemsets.
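
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's own implementation: the function name `find_frequent_itemsets`, the toy transactions, and the choice to treat minimum support as an absolute count of transactions are all assumptions made for the example.

```python
from collections import defaultdict


def find_frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Assumption: min_support is an absolute number of transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Step 1: each item starts in its own itemset; keep only the
    # ones that reach the minimum support.
    counts = defaultdict(int)
    for transaction in transactions:
        for item in transaction:
            counts[frozenset((item,))] += 1
    current = {s: n for s, n in counts.items() if n >= min_support}
    all_frequent = dict(current)
    frequent_items = {item for itemset in current for item in itemset}

    while current:
        # Step 2: candidates are supersets, one item larger, of the
        # most recently discovered frequent itemsets.
        candidates = {
            itemset | {item}
            for itemset in current
            for item in frequent_items
            if item not in itemset
        }
        # Step 3: test each candidate and discard the infrequent ones.
        counts = {
            candidate: sum(1 for t in transactions if candidate <= t)
            for candidate in candidates
        }
        current = {s: n for s, n in counts.items() if n >= min_support}
        # Step 4: store the new frequent itemsets, then repeat.
        all_frequent.update(current)

    # Step 5: return everything discovered.
    return all_frequent


# Hypothetical toy data: each set is one transaction.
example = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
result = find_frequent_itemsets(example, min_support=2)
```

With this toy data, the three single items and the pairs {A, B} and {A, C} meet the threshold, while {B, C} and {A, B, C} appear only once and are discarded. The loop ends as soon as a pass produces no new frequent itemsets, which corresponds to the jump from step 3 to step 5.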

This process is outlined in the following workflow:
