Chapter 3

If you are having trouble extracting features of these types, check the pandas
documentation at http://pandas.pydata.org/pandas-docs/stable/ for help.
Alternatively, you can try an online forum such as Stack Overflow for assistance.

More extreme examples could use player data to estimate the strength of each
team's side and predict who won. These types of complex features are used every
day by gamblers and sports betting agencies to try to turn a profit by predicting the
outcome of sports matches.
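
As a minimal sketch of the simpler end of this spectrum, the following plain-Python snippet derives a "did this team win its previous game" feature from a chronologically ordered list of results. The team names, scores, and the `add_last_win_features` helper are hypothetical, made up for illustration; they are not taken from the chapter's dataset.

```python
from collections import defaultdict

# Hypothetical results, ordered by date (oldest first).
games = [
    {"home": "Hawks", "visitor": "Bulls", "home_pts": 101, "visitor_pts": 95},
    {"home": "Bulls", "visitor": "Celtics", "home_pts": 88, "visitor_pts": 90},
    {"home": "Celtics", "visitor": "Hawks", "home_pts": 99, "visitor_pts": 104},
    {"home": "Hawks", "visitor": "Bulls", "home_pts": 97, "visitor_pts": 110},
]


def add_last_win_features(games):
    """Return copies of the games with 'last game won' features added."""
    # Teams with no earlier game default to False (an assumption of this sketch).
    won_last = defaultdict(bool)
    rows = []
    for game in games:
        row = dict(game)  # copy, so the input list is left untouched
        # Read the features before updating the state, so that each row
        # only uses information from games played earlier.
        row["home_last_win"] = won_last[row["home"]]
        row["visitor_last_win"] = won_last[row["visitor"]]
        home_win = row["home_pts"] > row["visitor_pts"]
        row["home_win"] = home_win
        # Update the state for the games that follow.
        won_last[row["home"]] = home_win
        won_last[row["visitor"]] = not home_win
        rows.append(row)
    return rows


featured = add_last_win_features(games)
for row in featured:
    print(row["home"], row["visitor"],
          row["home_last_win"], row["visitor_last_win"], row["home_win"])
```

The resulting list of dictionaries can be passed to `pandas.DataFrame` if you prefer to continue the analysis there. Features based on player data would follow the same pattern: keep running state per team, read it before each game, and update it afterwards.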

Summary

In this chapter, we extended our use of scikit-learn's classifiers and introduced
the pandas library to manage our data. We analyzed real-world data on basketball
results from the NBA, saw some of the problems that even well-curated data
introduces, and created new features for our analysis. We saw the effect that good
features have on performance and used an ensemble algorithm, random forests, to
further improve the accuracy.

In the next chapter, we will extend the affinity analysis that we performed in the first
chapter to create a program to find similar books. We will see how to use algorithms
for ranking and also use approximation to improve the scalability of data mining.