Chapter 11

Finally, we set the verbosity equal to 1, which gives us a printout of the results of each epoch. This lets us track the model's progress and confirms that it is still running. It also reports the time each epoch takes to run. This time is fairly consistent, so you can estimate the time left in training by multiplying it by the number of remaining epochs, giving a good idea of how long you need to wait for training to complete:

    verbose=1)

Putting it all together

Now that we have our network, we can train it with our training dataset:

    nnet.fit(X_train, y_train)

This will take quite a while to run, even with the reduced dataset size and the reduced number of epochs. Once the code completes, you can test it as we did before:

    from sklearn.metrics import f1_score
    y_pred = nnet.predict(X_test)
    print(f1_score(y_test.argmax(axis=1), y_pred.argmax(axis=1)))

The results will be terrible, and they should be! We haven't trained the network very much: only for a few iterations and only on one-fifth of the data.

First, go back and remove the break line we put in when creating the dataset (it is in the batches loop). This allows the code to train on all of the samples, not just some of them.

Next, change the number of epochs to 100 in the neural network definition.

Now, we upload the script to our virtual machine. As before, click on File | Download as, Python, and save the script somewhere on your computer. Launch and connect to the virtual machine and upload the script as you did earlier (I called my script chapter11cifar.py; if you named yours differently, just update the following code).

The next thing we need is for the dataset to be on the virtual machine. The easiest way to do this is to go to the virtual machine and type:

    wget http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
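Because the per-epoch time printed with verbose=1 is fairly stable, the remaining-time estimate described above is simple arithmetic. The following is a minimal sketch (the function name and the 16-second figure are illustrative, not part of the book's code):

```python
def estimate_remaining(seconds_per_epoch, epochs_done, total_epochs):
    """Estimate the remaining training time from the per-epoch runtime."""
    remaining_epochs = total_epochs - epochs_done
    return seconds_per_epoch * remaining_epochs

# With 16 seconds per epoch and 5 of 100 epochs complete,
# 95 epochs * 16 s = 1520 s, roughly 25 minutes:
seconds = estimate_remaining(16, 5, 100)
print("about {:.0f} minutes left".format(seconds / 60))
```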
This will download the dataset. Once it has downloaded, you can extract the data to a Data folder by first creating that folder and then untarring the archive there:

    mkdir Data
    tar -zxf cifar-10-python.tar.gz -C Data

Finally, we can run our example with the following:

    python3 chapter11cifar.py

The first thing you'll notice is a drastic speedup. On my home computer, each epoch took over 100 seconds to run. On the GPU-enabled virtual machine, each epoch takes just 16 seconds! If we tried running 100 epochs on my computer, it would take nearly three hours, compared to just 26 minutes on the virtual machine.

This drastic speedup makes trialing different models much faster. Often when trialing machine learning algorithms, the computational complexity of a single algorithm doesn't matter too much. An algorithm might take a few seconds, minutes, or hours to run. If you are only running one model, this training time is unlikely to matter much, especially as prediction with most machine learning algorithms is quite quick, and that is where a machine learning model is mostly used. However, when you have many parameters to tune, you may suddenly need to train thousands of models with slightly different parameters; at that point, these speed increases matter much more.

After 100 epochs of training, taking a whole 26 minutes, you will get a printout of the final result:

    0.8497

Not too bad! We can increase the number of epochs of training to improve this further, or we might try changing the parameters instead: perhaps more hidden nodes, more convolution layers, or an additional dense layer. There are other types of layers in Lasagne that could be tried too, although generally, convolution layers are better for vision.