Chapter 1
Summary

In this chapter, we introduced data mining using Python. If you were able to run the
code in this section (note that the full code is available in the supplied code package),
then your computer is set up for much of the rest of the book. Other Python libraries
will be introduced in later chapters to perform more specialized tasks.

We used the IPython Notebook to run our code, which lets us view the results of a
small section of code immediately. This is a useful framework that will
be used throughout the book.

We introduced a simple affinity analysis, finding products that are purchased
together. This type of exploratory analysis gives insight into a business process,
an environment, or a scenario. The information from these types of analysis can
assist in business processes, in finding the next big medical breakthrough, or in
creating the next artificial intelligence.
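
As a reminder of what "products that are purchased together" means in practice, the following is a minimal sketch and not the chapter's exact code. It assumes a tiny made-up list of transactions, where each row records whether a product was bought (1) or not (0). The product names, the data, and the helper name rule_stats are all hypothetical. For a rule such as "if a person buys bread, they also buy milk", it counts how often the rule holds (support) and the fraction of premise cases in which it holds (confidence).

```python
# Hypothetical toy data: each row is one transaction, each column one product.
features = ["bread", "milk", "cheese"]
transactions = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
    [1, 1, 0],
]


def rule_stats(rows, premise, conclusion):
    """Return (support, confidence) for the rule premise -> conclusion.

    premise and conclusion are column indices into each row.
    """
    # Number of transactions in which the premise product was bought.
    n_premise = sum(1 for row in rows if row[premise] == 1)
    # Number of transactions in which both products were bought.
    support = sum(
        1 for row in rows if row[premise] == 1 and row[conclusion] == 1
    )
    # Guard against division by zero when the premise never occurs.
    confidence = support / n_premise if n_premise else 0.0
    return support, confidence


support, confidence = rule_stats(
    transactions, features.index("bread"), features.index("milk")
)
print("bread -> milk: support={}, confidence={:.2f}".format(support, confidence))
```

In this toy data, bread appears in four transactions and milk accompanies it in three of them, so the rule has a support of 3 and a confidence of 0.75.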

Also in this chapter, we worked through a simple classification example using the
OneR algorithm. This algorithm finds the single best feature and, for each value of
that feature, predicts the class that most frequently had this value in the
training dataset.
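
The OneR idea described above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the chapter's implementation: it assumes discrete feature values, uses a small invented dataset, and the function names train_one_r and predict_one_r are hypothetical. For each feature it maps every observed value to its most frequent class, counts the training errors this rule would make, and keeps the feature with the fewest errors.

```python
from collections import Counter, defaultdict


def train_one_r(X, y):
    """Return (feature_index, rule, errors) for the best single feature.

    rule maps each observed feature value to its most frequent class.
    """
    best = None
    for feature in range(len(X[0])):
        # Count how often each class occurs for each value of this feature.
        counts = defaultdict(Counter)
        for row, label in zip(X, y):
            counts[row[feature]][label] += 1
        # Predict the most frequent class for each value.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the samples that do not belong to the predicted class.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    return best


def predict_one_r(model, X, default=None):
    """Predict classes using only the chosen feature; unseen values get default."""
    feature, rule, _ = model
    return [rule.get(row[feature], default) for row in X]


# Hypothetical toy data: two binary features, binary class labels.
X_train = [[0, 1], [0, 0], [1, 1], [1, 0], [1, 1], [0, 0]]
y_train = [0, 0, 1, 1, 1, 0]

model = train_one_r(X_train, y_train)
predictions = predict_one_r(model, [[1, 0], [0, 1]])
print("Best feature:", model[0], "training errors:", model[2])
print("Predictions:", predictions)
```

Here the first feature separates the classes perfectly in the training data, so it is selected with zero training errors. Real datasets rarely behave this well, and continuous features would first need to be discretized.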

Over the next few chapters, we will expand on the concepts of classification
and affinity analysis. We will also introduce the scikit-learn package and the
algorithms it includes.