24.07.2016 Views

www.allitebooks.com

Learning%20Data%20Mining%20with%20Python

Learning%20Data%20Mining%20with%20Python

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Chapter 11<br />

return [image,]<br />

return subimages<br />

random_state = check_random_state(14)<br />

letters = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ")<br />

shear_values = np.arange(0, 0.5, 0.05)<br />

def generate_sample(random_state=None):<br />

random_state = check_random_state(random_state)<br />

letter = random_state.choice(letters)<br />

shear = random_state.choice(shear_values)<br />

return create_captcha(letter, shear=shear, size=(20, 20)),<br />

letters.index(letter)<br />

dataset, targets = zip(*(generate_sample(random_state) for i in<br />

range(3000)))<br />

dataset = np.array(dataset, dtype='float')<br />

targets = np.array(targets)<br />

onehot = OneHotEncoder()<br />

y = onehot.fit_transform(targets.reshape(targets.shape[0],1))<br />

y = y.todense().astype(np.float32)<br />

dataset = np.array([resize(segment_image(sample)[0], (20, 20)) for<br />

sample in dataset])<br />

X = dataset.reshape((dataset.shape[0], dataset.shape[1] * dataset.<br />

shape[2]))<br />

X = X / X.max()<br />

X = X.astype(np.float32)<br />

X_train, X_test, y_train, y_test = \<br />

train_test_split(X, y, train_size=0.9, random_state=14)<br />

A neural network is a collection of layers. Implementing one in nolearn is a case of<br />

organizing what those layers will look like, much as it was with PyBrain. The neural<br />

network we used in Chapter 8, Beating CAPTCHAs with Neural Networks, used fully<br />

connected dense layers. These are implemented in nolearn, meaning we can replicate<br />

our basic network structure here. First, we create the layers consisting of an input<br />

layer, our dense hidden layer, and our dense output layer:<br />

from lasagne import layers<br />

layers=[<br />

('input', layers.InputLayer),<br />

('hidden', layers.DenseLayer),<br />

('output', layers.DenseLayer),<br />

]<br />

[ 255 ]

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!