Learning Data Mining with Python
Chapter 11
Getting your code to run on a GPU can be a frustrating experience. It depends greatly on what type of GPU you have, how it is configured, your operating system, and whether you are prepared to make some low-level changes to your computer. There are three main avenues to take:
• The first is to look at your computer, search for tools and drivers for your GPU and operating system, explore some of the many tutorials out there, and find one that fits your scenario. Whether this works depends on what your system is like. That said, this route is much easier than it was a few years ago, with better tools and drivers available to perform GPU-enabled computation.
• The second avenue is to choose a system, find good documentation on setting it up, and buy a system to match. This will work better, but can be fairly expensive: in most modern computers, the GPU is one of the most expensive parts, and getting great performance out of the system requires a high-end GPU, which costs even more.
• The third avenue is to use a virtual machine that is already configured for such a purpose. For example, Markus Beissinger has created such a system that runs on Amazon Web Services. The system will cost you money to run, but the price is much less than that of a new computer. Depending on your location, the exact system you get, and how much you use it, you are probably looking at less than $1 an hour, and often much less. If you use spot instances in Amazon Web Services, you can run them for just a few cents per hour (although you will need to develop your code to run on spot instances separately).
If you aren't able to afford the running costs of a virtual machine, I recommend that you look into the first avenue, with your current system. You may also be able to pick up a good secondhand GPU from family or a friend who constantly updates their computer (gamer friends are great for this!).
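To make the cost comparison between renting and buying concrete, here is a minimal sketch of the arithmetic. The hourly rates, the GPU price, and the function names are hypothetical placeholders chosen for illustration, not actual AWS or hardware prices; check current pricing before relying on any figure.

```python
def running_cost(hours, hourly_rate):
    """Return the cost in dollars of renting one instance for `hours` hours."""
    if hours < 0 or hourly_rate < 0:
        raise ValueError("hours and hourly_rate must be non-negative")
    return hours * hourly_rate


def break_even_hours(purchase_price, hourly_rate):
    """Return how many rented hours cost as much as buying the hardware."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return purchase_price / hourly_rate


# Hypothetical figures, for illustration only.
ON_DEMAND_RATE = 0.70   # dollars per hour ("less than $1 an hour")
SPOT_RATE = 0.08        # dollars per hour ("a few cents per hour")
GPU_PRICE = 500.0       # dollars, assumed price of a GPU bought outright
HOURS = 40              # assumed total hours of experimentation

for label, rate in (("on-demand", ON_DEMAND_RATE), ("spot", SPOT_RATE)):
    cost = running_cost(HOURS, rate)
    hours_to_match = break_even_hours(GPU_PRICE, rate)
    print(f"{label}: ${cost:.2f} for {HOURS} hours; "
          f"buying breaks even after about {hours_to_match:.0f} hours")
```

With assumptions like these, renting stays cheaper until you have accumulated many hundreds of hours of use, which is why the virtual machine is attractive for occasional experiments.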
Running our code on a GPU
We are going to take the third avenue in this chapter and create a virtual machine based on Markus Beissinger's base system. This will run on Amazon's EC2 service. There are many other web services to use, and the procedure will be slightly different for each. In this section, I'll outline the procedure for Amazon.
Classifying Objects in Images Using Deep Learning
If you want to use your own computer and have it configured to run GPU-enabled computation, feel free to skip this section.
You can get more information on how this machine was set up, which may also help with setting it up on another computer, at http://markus.com/install-theano-on-aws/.
To start with, go to the AWS console at:
https://console.aws.amazon.com/console/home?region=us-east-1
Log in with your Amazon account. If you don't have one, you will be prompted to create one, which you will need to do in order to continue.
Next, go to the EC2 service console at:
https://console.aws.amazon.com/ec2/v2/home?region=us-east-1
Click on Launch Instance and choose N. California as your location in the drop-down menu at the top-right.
Click on Community AMIs and search for ami-b141a2f5, which is the machine created by Markus Beissinger. Then, click on Select. On the next screen, choose g2.2xlarge as the machine type and click on Review and Launch. On the next screen, click on Launch.
At this point, you will be charged, so please remember to shut down your machines when you are done with them. You can go to the EC2 service, select the machine, and stop it. You won't be charged for compute time while a machine is stopped, although any storage attached to it can still incur a small charge.
You'll be prompted with some information on how to connect to your instance. If you haven't used AWS before, you will probably need to create a new key pair to securely connect to your instance. In this case, give your key pair a name, download the pem file, and store it in a safe place. If you lose it, you will not be able to connect to your instance again!
Click on Connect for information on using the pem file to connect to your instance. The most likely scenario is that you will use ssh with the following command:
ssh -i .pem ubuntu@
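The console steps above could also be scripted. The following is a minimal sketch, not the author's procedure: it assumes the third-party boto3 library is installed, that AWS credentials are already configured, and that the image and instance type named in the text are still offered. The key pair name, key file, and host name in the usage lines are hypothetical placeholders; substitute the values AWS gives you.

```python
import shlex

# Values taken from the walkthrough above.
REGION = "us-west-1"          # the N. California region
AMI_ID = "ami-b141a2f5"       # community image named in the text
INSTANCE_TYPE = "g2.2xlarge"  # GPU machine type named in the text


def build_launch_request(key_name):
    """Return keyword arguments for boto3's EC2 run_instances call."""
    return {
        "ImageId": AMI_ID,
        "InstanceType": INSTANCE_TYPE,
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch_instance(key_name):
    """Launch one instance and return its ID. Billing starts here."""
    import boto3  # imported lazily: third-party, needs AWS credentials

    ec2 = boto3.client("ec2", region_name=REGION)
    response = ec2.run_instances(**build_launch_request(key_name))
    return response["Instances"][0]["InstanceId"]


def stop_instance(instance_id):
    """Stop a running instance so that compute charges stop accruing."""
    import boto3

    ec2 = boto3.client("ec2", region_name=REGION)
    ec2.stop_instances(InstanceIds=[instance_id])


def build_ssh_command(key_path, host):
    """Return the ssh invocation as a list of arguments."""
    return ["ssh", "-i", key_path, f"ubuntu@{host}"]


# Hypothetical key file and host name, for illustration only.
command = build_ssh_command("my-key.pem", "ec2-host.example.com")
print(shlex.join(command))
```

Only the two `build_` helpers run without network access. `launch_instance` and `stop_instance` contact AWS and the former starts billing, so call them deliberately and pair every launch with a stop.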