Then we iterate over our testing dataset and add each sample to a new SupervisedDataSet instance for testing. The code is as follows:

testing = SupervisedDataSet(X.shape[1], y.shape[1])
for i in range(X_test.shape[0]):
    testing.addSample(X_test[i], y_test[i])

Now we can build a neural network. We will create a basic three-layer network that consists of an input layer, an output layer, and a single hidden layer between them. The number of neurons in the input and output layers is fixed: the 400 features in our dataset dictate that we need 400 neurons in the first layer, and the 26 possible targets dictate that we need 26 output neurons.

Determining the number of neurons in the hidden layer can be quite difficult. Having too many results in a sparse network in which it is difficult to train each neuron to properly represent the data; this usually results in overfitting the training data. Having too few results in neurons that each try to do too much of the classification and again don't train properly; here, underfitting the data is the problem. I have found that creating a funnel shape, where the size of the middle layer is between the size of the inputs and the size of the outputs, is a good starting place. For this chapter, we will use 100 neurons in the hidden layer, but playing with this value may yield better results.

We import the buildNetwork function and tell it to build a network based on our necessary dimensions. The first value, X.shape[1], is the number of neurons in the input layer, and it is set to the number of features (which is the number of columns in X). The second value is our chosen size of 100 neurons in the hidden layer. The third value is the number of outputs, which is based on the shape of the target array y. Finally, we set the network to add a bias neuron to each layer (except for the output layer), effectively a neuron that always activates (but whose connections still have weights that are trained). The code is as follows:

from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(X.shape[1], 100, y.shape[1], bias=True)
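As a quick sanity check (this snippet is not part of the original code; it only assumes the net and X_test variables defined above), we can push a single test sample through the still-untrained network with PyBrain's activate method and confirm that we get 26 output activations back:

# Feed one sample through the untrained network; the result is an
# array of 26 activations, one per candidate letter.
output = net.activate(X_test[0])
print(len(output))       # 26
print(output.argmax())   # index of the most activated output neuron

At this stage the strongest activation is essentially random; training is what makes it meaningful.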
From here, we can now train the network and determine good values for the weights. But how do we train a neural network?

Back propagation

The back propagation (backprop) algorithm is a way of assigning blame to each neuron for incorrect predictions. Starting from the output layer, we compute which neurons were incorrect in their prediction, and adjust the weights into those neurons by a small amount to attempt to fix the incorrect prediction. These neurons made their mistakes because of the neurons giving them input, or more specifically, because of the weights on the connections between each neuron and its inputs. We then adjust these weights by a small amount. The amount of change is based on two aspects: the partial derivative of the error function with respect to the neuron's individual weights, and the learning rate, which is a parameter to the algorithm (usually set to a very low value). We compute the gradient of the error function, multiply it by the learning rate, and subtract the result from our weights; in other words, each weight is updated as new weight = old weight - learning rate * gradient. The gradient will be positive or negative, depending on the error, and the subtraction always nudges the weight towards the correct prediction. In some cases, though, the correction moves towards something called a local optimum, a set of weights that is better than similar sets of weights but not the best possible set.

This process starts at the output layer and goes back one layer at a time until we reach the input layer. At this point, the weights on all connections have been updated.

PyBrain contains an implementation of the backprop algorithm, which is called on the neural network through a trainer class. The code is as follows:

from pybrain.supervised.trainers import BackpropTrainer
trainer = BackpropTrainer(net, training, learningrate=0.01, weightdecay=0.01)

The backprop algorithm is run iteratively over the training dataset, and each time the weights are adjusted a little. We can stop running backprop when the error decreases by only a very small amount, indicating that the algorithm isn't improving much more and it isn't worth continuing the training. In theory, we would run the algorithm until the error doesn't change at all. This is called convergence, but in practice it takes a very long time for little gain. Alternatively, and much more simply, we can just run the algorithm a fixed number of iterations, called epochs. The higher the number of epochs, the longer the algorithm takes and the better the results will be (with a declining improvement for each epoch). We will train for 20 epochs for this code, but trying larger values will increase the performance (if only slightly). The code is as follows:

trainer.trainEpochs(epochs=20)

After running the previous code, which may take a number of minutes depending on the hardware, we can perform predictions on the samples in our testing dataset. PyBrain contains a function for this, and it is called on the trainer instance:

predictions = trainer.testOnClassData(dataset=testing)
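As a minimal sketch of how we might check these predictions (this snippet is not from the original text; it assumes scikit-learn is installed and that y_test is the one-hot encoded target array created earlier), we can recover the true class indices and compare them with the predicted ones:

import numpy as np
from sklearn.metrics import f1_score

# testOnClassData returns one predicted class index per test sample;
# the true indices come from the position of the 1 in each one-hot row.
y_true = y_test.argmax(axis=1)
y_pred = np.array(predictions)
print("Accuracy: {:.3f}".format(np.mean(y_pred == y_true)))
print("F1 score: {:.3f}".format(f1_score(y_true, y_pred, average='weighted')))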