24.07.2016 Views


Learning Data Mining with Python


movie_name_data.columns = ["MovieID", "Title", "Release Date",
                           "Video Release", "IMDB", "", "Action", "Adventure",
                           "Animation", "Children's", "Comedy", "Crime", "Documentary",
                           "Drama", "Fantasy", "Film-Noir",
                           "Horror", "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller",
                           "War", "Western"]

Chapter 4

Getting the movie title is important, so we will create a function that returns a movie's title from its MovieID, saving us the trouble of looking it up each time. Let's look at the code:

def get_movie_name(movie_id):

We look up the given MovieID in the movie_name_data DataFrame and keep only the Title column:

    title_object = movie_name_data[movie_name_data["MovieID"] == movie_id]["Title"]

We use the values attribute to get the actual value (and not the pandas Series object that is currently stored in title_object). We are only interested in the first value, since there should only be one title for a given MovieID anyway!

    title = title_object.values[0]

We end the function by returning the title:

    return title
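Assembled in one piece, the function can be tried on a small stand-in table. The following is a minimal sketch: the three-row DataFrame is a hypothetical substitute for the full movie_name_data loaded earlier in the chapter, and it carries only the two columns the function reads.

```python
import pandas as pd

# Hypothetical stand-in for movie_name_data: just the MovieID and Title columns.
movie_name_data = pd.DataFrame({
    "MovieID": [1, 2, 3],
    "Title": ["Toy Story (1995)", "GoldenEye (1995)", "Four Rooms (1995)"],
})


def get_movie_name(movie_id):
    # The boolean mask keeps the matching row(s); ["Title"] yields a Series of titles.
    title_object = movie_name_data[movie_name_data["MovieID"] == movie_id]["Title"]
    # .values exposes the underlying NumPy array; take its first element.
    title = title_object.values[0]
    return title


print(get_movie_name(2))
```

Note that an unknown MovieID leaves the filtered Series empty, so indexing `.values[0]` raises an IndexError rather than returning None.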

In a new IPython Notebook cell, we adjust our previous code for printing out the top rules to also include the titles:

for index in range(5):
    print("Rule #{0}".format(index + 1))
    (premise, conclusion) = sorted_confidence[index][0]
    premise_names = ", ".join(get_movie_name(idx) for idx in premise)
    conclusion_name = get_movie_name(conclusion)
    print("Rule: If a person recommends {0} they will also recommend {1}".format(
        premise_names, conclusion_name))
    print(" - Confidence: {0:.3f}".format(confidence[(premise, conclusion)]))
    print("")
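To see what this loop expects from its inputs, here is a small self-contained sketch. The data shapes are assumptions inferred from how the loop indexes them: confidence is taken to be a dictionary keyed by (premise, conclusion) pairs, with the premise a frozenset of MovieIDs, and sorted_confidence a list of its items sorted by confidence in descending order. The rule values, the movie_titles dictionary standing in for the DataFrame lookup, and the format_rule helper are all made up for illustration; sorting the premise is an addition that keeps the printed order deterministic.

```python
from operator import itemgetter

# Hypothetical titles; a plain dict stands in for the DataFrame lookup.
movie_titles = {1: "Movie A", 2: "Movie B", 3: "Movie C"}


def get_movie_name(movie_id):
    return movie_titles[movie_id]


# Assumed shape: (premise, conclusion) -> confidence, with made-up values.
confidence = {
    (frozenset({1}), 2): 0.9,
    (frozenset({1, 3}), 2): 0.75,
    (frozenset({2}), 3): 0.5,
}
# Each entry is ((premise, conclusion), confidence), highest confidence first.
sorted_confidence = sorted(confidence.items(), key=itemgetter(1), reverse=True)


def format_rule(premise, conclusion):
    # sorted() fixes the order of the premise titles (sets are unordered).
    premise_names = ", ".join(get_movie_name(idx) for idx in sorted(premise))
    conclusion_name = get_movie_name(conclusion)
    return "Rule: If a person recommends {0} they will also recommend {1}".format(
        premise_names, conclusion_name)


# Guard against asking for more rules than exist.
for index in range(min(5, len(sorted_confidence))):
    (premise, conclusion) = sorted_confidence[index][0]
    print("Rule #{0}".format(index + 1))
    print(format_rule(premise, conclusion))
    print(" - Confidence: {0:.3f}".format(confidence[(premise, conclusion)]))
    print("")
```

Bounding the range by the list length avoids an IndexError when fewer than five rules are available.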
