www.allitebooks.com

24.07.2016 Views
Chapter 1 Summary

In this chapter, we introduced data mining using Python. If you were able to run the code in this section (the full code is available in the supplied code package), then your computer is set up for much of the rest of the book. Other Python libraries will be introduced in later chapters to perform more specialized tasks.

We used the IPython Notebook to run our code, which allows us to immediately view the results of a small section of code. This is a useful framework that will be used throughout the book.

We introduced a simple affinity analysis, finding products that are purchased together. This type of exploratory analysis gives insight into a business process, an environment, or a scenario. The information from these types of analysis can assist in business processes, finding the next big medical breakthrough, or creating the next artificial intelligence.

This chapter also included a simple classification example using the OneR algorithm. The algorithm finds the best feature and, for each value of that feature, predicts the class that most frequently had this value in the training dataset.

Over the next few chapters, we will expand on the concepts of classification and affinity analysis. We will also introduce the scikit-learn package and the algorithms it includes.
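As a reminder of how the OneR idea summarized above fits together, here is a minimal sketch in plain Python. It is not the chapter's own code: the function names `train_one_r` and `predict_one_r`, the toy dataset, and the choice to treat the "best" feature as the one with the fewest training errors are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_one_r(X, y):
    """Return (feature_index, {value: predicted_class}, training_errors).

    Hypothetical helper: "best" is assumed to mean fewest training errors.
    """
    n_features = len(X[0])
    best = None
    for feature in range(n_features):
        # Count how often each class occurs for each value of this feature.
        class_counts = defaultdict(Counter)
        for sample, label in zip(X, y):
            class_counts[sample[feature]][label] += 1
        # For each value, predict the class seen most often with that value.
        rule = {value: counts.most_common(1)[0][0]
                for value, counts in class_counts.items()}
        # Errors are the training samples outside the predicted class.
        errors = sum(sum(counts.values()) - counts[rule[value]]
                     for value, counts in class_counts.items())
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    return best


def predict_one_r(model, X, default=None):
    """Predict with the single chosen feature; unseen values get `default`."""
    feature, rule, _ = model
    return [rule.get(sample[feature], default) for sample in X]


# Made-up toy data: two binary features, binary class labels.
X_train = [[0, 1], [0, 0], [1, 1], [1, 0], [1, 1], [0, 0]]
y_train = [0, 0, 1, 1, 1, 0]

model = train_one_r(X_train, y_train)
predictions = predict_one_r(model, [[1, 0], [0, 1]])
print(model, predictions)
```

In this toy data the first feature separates the classes perfectly, so it is the one chosen. With continuous features, the values would need to be discretized first, for example by thresholding each feature.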


