24.07.2016 Views


Learning Data Mining with Python

Beating CAPTCHAs with Neural Networks

The output of the neural network is 26 numbers, each reflecting the likelihood that
the letter at the corresponding index of our letters list is the letter shown in the
image. To get the actual prediction, we find the index of the maximum output value
and look up the letter at that index in the letters list from before. For example, if
the value is highest for the fifth output, the predicted letter will be E. The code is
as follows:

        prediction = np.argmax(outputs)

We then append the predicted letter to the predicted word we are building:

        predicted_word += letters[prediction]

After the loop completes, we have gone through each of the letters and formed our
predicted word:

    return predicted_word
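
The decoding step can be tried on its own, without a trained network. The following is a minimal sketch rather than the chapter's code: the output vectors are made up by hand in place of real network activations, the helper names (argmax, one_hot_like, decode) are hypothetical, and a plain-Python argmax stands in for np.argmax so that the snippet runs without NumPy. The letters list is assumed to hold A to Z in order, as the "fifth output is E" example implies.

```python
import string

# Assumed to match the letters list built earlier in the chapter: A-Z in order.
letters = list(string.ascii_uppercase)


def argmax(values):
    """Return the index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def one_hot_like(index, size=26):
    """Build a made-up output vector whose largest value sits at `index`."""
    outputs = [0.01] * size
    outputs[index] = 0.9
    return outputs


def decode(outputs_per_letter):
    """Turn one output vector per segmented letter into a predicted word."""
    predicted_word = ""
    for outputs in outputs_per_letter:
        prediction = argmax(outputs)
        predicted_word += letters[prediction]
    return predicted_word


# Hypothetical activations for a four-letter CAPTCHA: indices 6, 4, 13, 4.
fake_outputs = [one_hot_like(i) for i in (6, 4, 13, 4)]
print(decode(fake_outputs))  # GENE
```

Because indexing starts at zero, the fifth output has index 4, which is why it maps to E.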

We can now test on a word using the following code. Try different words and see
what sorts of errors you get, but keep in mind that our neural network only knows
about capital letters.

word = "GENE"
captcha = create_captcha(word, shear=0.2)
print(predict_captcha(captcha, net))
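
Since the network only knows capital letters, it can help to normalize a test word before generating a CAPTCHA from it. The helper below is a small sketch under that assumption; prepare_word is a hypothetical name and is not part of the chapter's code.

```python
import string

# Assumed to match the chapter's letters list: A-Z in order.
letters = list(string.ascii_uppercase)


def prepare_word(word):
    """Uppercase a word and reject any character the network was not trained on."""
    word = word.upper()
    unknown = [ch for ch in word if ch not in letters]
    if unknown:
        raise ValueError(f"unsupported characters: {unknown}")
    return word


print(prepare_word("gene"))  # GENE
```

The returned word could then be passed to create_captcha in place of a hand-typed uppercase string.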

We can wrap this in a function, allowing us to perform predictions more easily.
We also use our assumption that the words will be exactly four characters long to
make prediction a little easier. Try it without the prediction = prediction[:4]
line and see what types of errors you get. The code is as follows:

def test_prediction(word, net, shear=0.2):
    captcha = create_captcha(word, shear=shear)
    prediction = predict_captcha(captcha, net)
    prediction = prediction[:4]
    return word == prediction, word, prediction

The function returns three values: whether the prediction is correct, the original
word, and the predicted word.
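
Results in this three-value form are easy to tally over many words. The sketch below is illustrative only: fake_test_prediction is a hypothetical stand-in that takes an invented raw prediction directly instead of running a network, and summarize is a hypothetical helper. It also shows what the [:4] truncation does when segmentation yields a spurious extra letter.

```python
def summarize(results):
    """Split (correct, word, prediction) tuples into an accuracy and a list of misses."""
    results = list(results)
    misses = [(word, prediction) for correct, word, prediction in results if not correct]
    accuracy = (len(results) - len(misses)) / len(results) if results else 0.0
    return accuracy, misses


def fake_test_prediction(word, raw_prediction):
    """Stand-in for test_prediction: uses a given raw prediction, no network involved."""
    prediction = raw_prediction[:4]
    return word == prediction, word, prediction


# The raw predictions here are invented for illustration.
results = [
    fake_test_prediction("GENE", "GENE"),
    fake_test_prediction("BARK", "BARKI"),  # a spurious fifth letter is cut off by [:4]
    fake_test_prediction("SEND", "SFND"),   # a genuine misread stays wrong
]
accuracy, misses = summarize(results)
print(accuracy)
print(misses)
```

With the real test_prediction, the results list would instead be built by calling it once per word with the trained net.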
