24.07.2016 Views

www.allitebooks.com

Learning Data Mining with Python


This will unzip only one Coval.otf file (there are lots of files in this zip folder that we don't need).
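If you would rather do the same thing from Python than from the shell, the standard-library zipfile module can also pull a single member out of an archive. The sketch below is illustrative only: the archive is built on the fly so the example runs anywhere, and its name (fonts_example.zip) and the extra files inside it are assumptions, not the contents of the real font download.

```python
import os
import tempfile
import zipfile


def extract_single(zip_path, member, dest_dir):
    """Extract only `member` from the archive at `zip_path` into `dest_dir`."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        return archive.extract(member, path=dest_dir)


# Build a throwaway archive (hypothetical contents) for the demonstration.
work_dir = tempfile.mkdtemp()
zip_path = os.path.join(work_dir, "fonts_example.zip")
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr("Coval.otf", b"placeholder font bytes")
    archive.writestr("readme.txt", b"not needed")
    archive.writestr("license.txt", b"not needed")

# Only the requested file ends up in the destination folder.
dest_dir = os.path.join(work_dir, "out")
extracted_path = extract_single(zip_path, "Coval.otf", dest_dir)
extracted_files = os.listdir(dest_dir)
print(extracted_files)
```

The other members stay inside the archive, which mirrors the intent of unzipping just the one font file.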

While still in the virtual machine, you can run the program with the following command:

python3 chapter11script.py

The program will run through as it would in the IPython Notebook and the results will print to the command line.

The results should be the same as before, but the actual training and testing of the neural network will be much faster. The other parts of the program will not speed up much: we didn't write the CAPTCHA dataset creation to use a GPU, so we will not obtain a speedup there.
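One way to check where the time actually goes is to time each stage of the script separately. The following is a minimal sketch under stated assumptions: create_dataset and train_and_test are hypothetical stand-ins for the real stages of the chapter's script, and the timed helper is introduced here purely for illustration.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


# Hypothetical stand-ins for the two stages of the script.
def create_dataset():
    return [i * i for i in range(10000)]


def train_and_test(data):
    return sum(data) / len(data)


with timed("dataset creation"):
    data = create_dataset()
with timed("training and testing"):
    score = train_and_test(data)

for label, seconds in timings.items():
    print("{}: {:.4f}s".format(label, seconds))
```

Running the same measurement on your main computer and on the virtual machine should show which stages benefit from the GPU and which do not.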

You may wish to shut down the Amazon virtual machine to save some money; we will be using it at the end of this chapter to run our main experiment, but will be developing the code on your main computer first.

Application

Back on your main computer now, open the first IPython Notebook we created in this chapter, the one that we loaded the CIFAR dataset with. In this major experiment, we will take the CIFAR dataset, create a deep convolutional neural network, and then run it on our GPU-based virtual machine.

Getting the data

To start with, we will take our CIFAR images and create a dataset with them. Unlike previously, we are going to preserve the pixel structure, that is, keep the pixels arranged in rows and columns. First, load all the batches into a list:

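import os
import numpy as np

# data_folder and unpickle() are assumed to be defined earlier in the notebook.
batches = []
for i in range(1, 6):
    batch_filename = os.path.join(data_folder, "data_batch_{}".format(i))
    batches.append(unpickle(batch_filename))
    # Stops after the first batch; remove this line to load all five.
    break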
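To make "preserving the pixel structure" concrete: each row of a CIFAR-10 Python batch stores one image as 3,072 values, with the 1,024 red values first, then green, then blue, and each channel laid out row by row. The sketch below uses a synthetic batch so that it runs without the dataset; the fake_batch dictionary and its contents are assumptions for illustration. It shows one way to reshape such rows into channel, row, and column form, which is not necessarily the exact approach the chapter takes next.

```python
import numpy as np

# Synthetic stand-in for one unpickled batch: 4 images, 3 * 32 * 32 values each.
rng = np.random.RandomState(14)
fake_batch = {
    "data": rng.randint(0, 256, size=(4, 3072)).astype(np.uint8),
    "labels": [0, 1, 2, 3],
}

flat = fake_batch["data"]
# Split each flat row into 3 colour channels of 32 rows by 32 columns.
images = flat.reshape(-1, 3, 32, 32)

print(images.shape)
# The first 32 values of a flat row form the top pixel row of the red channel.
print(np.array_equal(images[0, 0, 0], flat[0, :32]))
```

With the real data, the same reshape could be applied to the "data" entry of each batch loaded above before stacking the batches together.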

