24.07.2016 Views

Learning Data Mining with Python

Chapter 11

Next, we define how the network will train. The nolearn package doesn't have exactly the same training mechanism as we used in Chapter 8, Beating CAPTCHAs with Neural Networks, as it doesn't have a way to decay weights. However, it does have momentum, which we will use, along with a high learning rate and a low momentum value:

    update=updates.momentum,
    update_learning_rate=0.9,
    update_momentum=0.1,
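
To make the two settings above more concrete, the following is a minimal sketch of a momentum update in plain Python, not the library's actual code. It assumes the common formulation in which a velocity term is scaled by the momentum value and reduced by the learning rate times the gradient. The names momentum_step and minimise, and the toy objective, are illustrative assumptions.

```python
def momentum_step(param, velocity, grad, learning_rate=0.9, momentum=0.1):
    """Apply one momentum update to a single scalar parameter."""
    # Keep a fraction of the previous step, then move against the gradient.
    velocity = momentum * velocity - learning_rate * grad
    return param + velocity, velocity


def minimise(grad_fn, start, steps=50, learning_rate=0.9, momentum=0.1):
    """Run a fixed number of momentum updates from a starting value."""
    param, velocity = start, 0.0
    for _ in range(steps):
        param, velocity = momentum_step(
            param, velocity, grad_fn(param), learning_rate, momentum
        )
    return param


# Toy objective f(w) = (w - 3)^2 / 2, whose gradient is w - 3.
w_final = minimise(lambda w: w - 3.0, start=0.0)
print(w_final)
```

With a low momentum value, each step is dominated by the current gradient, and only a small part of the previous step carries over.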

Next, we define the problem as a regression problem. This may seem odd, as we are performing a classification task. However, the outputs are real-valued, and optimizing them as a regression problem appears to train much better than optimizing them as a classification problem:

    regression=True,
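
As a rough illustration of treating classification as regression, the sketch below one-hot encodes class labels, scores real-valued outputs with a mean squared error, and recovers a class with an argmax. It is written in plain Python under the assumption that a squared-error style loss is used; the helper names and the activation values are made up for the example.

```python
def one_hot(label, n_classes):
    """Encode a class index as a vector with a single 1.0 entry."""
    return [1.0 if i == label else 0.0 for i in range(n_classes)]


def mean_squared_error(outputs, targets):
    """Average squared difference over every output unit of every sample."""
    total = sum(
        (o - t) ** 2
        for row_o, row_t in zip(outputs, targets)
        for o, t in zip(row_o, row_t)
    )
    count = sum(len(row) for row in targets)
    return total / count


def argmax(row):
    """Return the index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)


labels = [2, 0]
targets = [one_hot(label, 3) for label in labels]
outputs = [[0.1, 0.2, 0.9], [0.7, 0.4, 0.1]]  # made-up network activations

loss = mean_squared_error(outputs, targets)
predicted = [argmax(row) for row in outputs]
print(loss, predicted)
```

The loss rewards outputs that sit close to the one-hot targets, while the argmax still yields a single class for each sample.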

Finally, we set the maximum number of epochs for training to 1,000, which is a good balance between training the network well and not taking too long to train (for this dataset; other datasets may require more or less training):

    max_epochs=1000,

We can now close the parenthesis of the neural network constructor:

)

Next, we train the network on our training dataset:

net1.fit(X_train, y_train)

Now we can evaluate the trained network. To do this, we get the output of our network and, as with the Iris example, perform an argmax to get the actual classification by choosing the output with the highest activation:

y_pred = net1.predict(X_test)
# Pick the class with the highest activation for each sample
y_pred = y_pred.argmax(axis=1)
assert len(y_pred) == len(X_test)
# Convert one-hot encoded targets to class indices if needed
if len(y_test.shape) > 1:
    y_test = y_test.argmax(axis=1)
print(f1_score(y_test, y_pred))
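
For readers who want to see what a multi-class F1 score measures, here is a minimal plain-Python sketch of a macro-averaged F1 (the unweighted mean of the per-class scores). The choice of macro averaging, the function names, and the sample labels are assumptions for illustration. Note that recent scikit-learn releases expect an explicit average argument to f1_score when the targets have more than two classes.

```python
def f1_per_class(y_true, y_pred, label):
    """F1 score for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, lab) for lab in labels) / len(labels)


y_true = [0, 1, 2, 2]  # hypothetical class indices
y_pred = [0, 1, 2, 1]  # hypothetical predictions after the argmax

score = macro_f1(y_true, y_pred)
print(score)
```

A perfect score of 1.0 requires every class to be predicted with no false positives and no false negatives.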

The results are equally impressive: another perfect score on my machine. However, your results may vary, as the nolearn package has some randomness that can't be directly controlled at this stage.
