Learning Data Mining with Python
Preface

What you need for this book

It should come as no surprise that you'll need a computer, or access to one, to complete this book. The computer should be reasonably modern, but it doesn't need to be overpowered. Any modern processor (from about 2010 onwards) and 4 GB of RAM will suffice, and you can probably run almost all of the code on a slower system too.

The exception is the final two chapters. In these chapters, I step through using Amazon Web Services (AWS) to run the code. This will probably cost you some money, but the advantage is less system setup than running the code locally. If you don't want to pay for those services, the tools used can all be set up on a local computer, but you will definitely need a modern system to run them: a processor built in 2012 or later and more than 4 GB of RAM.

I recommend the Ubuntu operating system, but the code should work well on Windows, Macs, or any other Linux variant. You may need to consult the documentation for your system to get some things installed, though.

In this book, I install code with pip, a command-line tool for installing Python libraries. Another option is to use Anaconda, which can be found online here: http://continuum.io/downloads.

I have tested all code using Python 3. Most of the code examples work on Python 2 with no changes. If you run into any problems and can't get around them, send an email and we can offer a solution.

Who this book is for

This book is for programmers who want to get started in data mining in an application-focused manner.

If you haven't programmed before, I strongly recommend that you learn at least the basics before you get started. This book doesn't introduce programming, nor does it spend much time explaining how to type out the instructions (the actual implementation in code). That said, once you go through the basics, you should be able to come back to this book fairly quickly; there is no need to be an expert programmer first!

I highly recommend that you have some Python programming experience. If you don't, feel free to jump in, but you might want to take a look at some Python code first, possibly focusing on tutorials using the IPython Notebook. Writing programs in the IPython Notebook works a little differently than other methods, such as writing a Java program in a fully fledged IDE.
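Before starting, it can help to confirm the setup described under "What you need for this book": a Python 3 interpreter plus whatever libraries you have installed with pip or Anaconda. The following is a minimal sketch rather than part of the book's code bundle. The helper name `check_environment` is hypothetical, and the package names passed to it are only example import names, not a list taken from the book.

```python
import importlib.util
import sys


def check_environment(packages, min_version=(3, 0)):
    """Return (python_ok, missing).

    python_ok is True when the running interpreter is at least min_version;
    missing lists the top-level import names that cannot be found.
    """
    python_ok = sys.version_info >= min_version
    missing = [name for name in packages
               if importlib.util.find_spec(name) is None]
    return python_ok, missing


# "numpy" and "sklearn" are assumed example names; replace them with the
# libraries a given chapter asks you to install.
python_ok, missing = check_environment(["numpy", "sklearn"])
print("Python 3 or later:", python_ok)
print("Not installed:", missing or "nothing")
```

Anything reported as not installed can then be added with pip or through Anaconda, depending on which one you chose.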
Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

The most important is code. Code that you need to enter is displayed separately from the text, in a box like this one:

    if True:
        print("Welcome to the book")

Keep a careful eye on indentation. Python cares about how far lines are indented. In this book, I've used four spaces for indentation. You can use a different number (or tabs), but you need to be consistent. If you get a bit lost counting indentation levels, refer to the code bundle that comes with the book.

Where I refer to code in text, I'll use this format. You don't need to type this in your IPython Notebooks, unless the text specifically states otherwise.

Any command-line input or output is written as follows:

    # cp file1.txt file2.txt

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Click on the Export link."

Warnings or important notes appear in a box like this.

Tips and tricks appear like this.
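To make the point about consistent indentation concrete, here is a small sketch that asks Python to compile short snippets without running them. The helper name `indentation_is_valid` is hypothetical and the snippets are illustrations only. The sketch relies on the built-in `compile` function, which raises `IndentationError` when the lines of a block do not line up.

```python
def indentation_is_valid(source):
    """Return True if the source compiles, False if its indentation is rejected."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError:
        # TabError is a subclass of IndentationError, so it is caught here too.
        return False
    return True


# Both lines of the block use four spaces.
consistent = 'if True:\n    print("Welcome to the book")\n    print("Same level")\n'
# The second line of the block drops to two spaces, matching no outer level.
inconsistent = 'if True:\n    print("Welcome to the book")\n  print("Two spaces")\n'

print(indentation_is_valid(consistent))    # True
print(indentation_is_valid(inconsistent))  # False
```

A block indented with two spaces throughout would also compile; what Python rejects is a change of indentation partway through a block.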