24.07.2016 Views

www.allitebooks.com

Learning Data Mining with Python


Classifying with scikit-learn Estimators

The scikit-learn library is a collection of data mining algorithms written in Python that share a common programming interface. This allows users to easily try different algorithms and to use standard tools for effective testing and parameter searching. scikit-learn provides a large number of algorithms and utilities.
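As a rough illustration of the testing and parameter-searching tools mentioned above, the following sketch scores one algorithm with cross-validation and then searches over a single parameter. The choice of the bundled Iris dataset, the nearest-neighbor classifier, and the candidate values for `n_neighbors` are assumptions made for this example, and the `sklearn.model_selection` import path assumes a reasonably recent scikit-learn release.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Small dataset bundled with scikit-learn (chosen only for illustration).
X, y = load_iris(return_X_y=True)

# Testing: estimate accuracy with 5-fold cross-validation.
scores = cross_val_score(KNeighborsClassifier(), X, y, cv=5)
print("Mean cross-validated accuracy:", scores.mean())

# Parameter searching: try several values of n_neighbors and keep the best.
# The candidate values below are arbitrary example choices.
param_grid = {"n_neighbors": [1, 3, 5, 7]}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)
print("Best parameters found:", search.best_params_)
```

Because every algorithm exposes the same interface, swapping `KNeighborsClassifier` for another classifier should only require changing the class name and the parameter grid.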

In this chapter, we focus on setting up a good framework for running data mining procedures. Later chapters, which focus on applications and the techniques suited to them, build on this framework.

The key concepts introduced in this chapter are as follows:

• Estimators: These perform classification, clustering, and regression
• Transformers: These perform preprocessing and data alterations
• Pipelines: These combine the steps of your workflow into a replicable format
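To show how the three concepts relate, here is a minimal sketch that uses a transformer on its own and then chains it with an estimator in a pipeline. The toy data, the step names `"scale"` and `"classify"`, and the choice of `MinMaxScaler` with `KNeighborsClassifier` are all assumptions made for illustration, not a prescribed setup.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Toy data (made up): two features on very different scales, two classes.
X = np.array([[1.0, 1000.0], [2.0, 1100.0], [1.5, 900.0],
              [8.0, 5000.0], [9.0, 5200.0], [8.5, 4800.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Transformer: rescale each feature to the range [0, 1].
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# Pipeline: the same preprocessing followed by an estimator, as one object.
pipeline = Pipeline([
    ("scale", MinMaxScaler()),                          # transformer step
    ("classify", KNeighborsClassifier(n_neighbors=3)),  # estimator step
])
pipeline.fit(X, y)

# New samples pass through the scaler before reaching the classifier.
X_new = np.array([[1.2, 950.0], [8.8, 5100.0]])
predictions = pipeline.predict(X_new)
print(predictions)
```

Keeping the preprocessing inside the pipeline means that the same scaling learned from the training data is applied automatically at prediction time, which is what makes the workflow replicable.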

scikit-learn estimators

Estimators are scikit-learn's abstraction for learning algorithms, allowing a large number of classification algorithms to be implemented behind one standardized interface. In this chapter, we use estimators for classification. Estimators have the following two main methods:

• fit(): This trains the algorithm and sets its internal parameters. It takes two inputs: the training samples and the corresponding classes for those samples.
• predict(): This predicts the classes of the testing samples given as input. It returns an array containing one prediction for each input testing sample.
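The two methods described above can be sketched as follows. The toy arrays and the choice of `KNeighborsClassifier` are assumptions made for this example; any scikit-learn classifier should follow the same `fit()` then `predict()` pattern.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy training data (made up): two well-separated groups of samples.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])  # class of each training sample

# Testing samples whose classes we want to predict.
X_test = np.array([[0.05, 0.05], [1.05, 0.95]])

estimator = KNeighborsClassifier(n_neighbors=3)

# fit(): takes the training samples and their classes.
estimator.fit(X_train, y_train)

# predict(): returns an array with one prediction per testing sample.
y_pred = estimator.predict(X_test)
print(y_pred)
```

Note that `fit()` is called once on the training data, after which `predict()` can be called repeatedly on new samples.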
