13.08.2022 Views

advanced-algorithmic-trading


]

# Generate a list of the 2D cluster points

norm_dists = [
    np.random.multivariate_normal(m, c, samples)
    for m, c in zip(mu, cov)
]

X = np.array(list(itertools.chain(*norm_dists)))
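The flattening step above stacks the per-cluster sample arrays into a single matrix with one row per observation. A minimal, self-contained sketch follows. The values of mu, cov and samples used here are hypothetical placeholders chosen for illustration and are not necessarily those defined earlier in the chapter. It also checks that np.vstack produces the same array, which may be an easier form to read.

```python
import itertools

import numpy as np

np.random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical cluster parameters (assumptions for illustration only)
samples = 100
mu = [(7, 5), (8, 12), (1, 10)]
cov = [
    [[0.5, 0], [0, 1.0]],
    [[2.0, 0], [0, 3.5]],
    [[3.0, 0], [0, 5.0]],
]

# One (samples, 2) array per Gaussian cluster
norm_dists = [
    np.random.multivariate_normal(m, c, samples)
    for m, c in zip(mu, cov)
]

# Flatten into a single (3 * samples, 2) array of observations
X = np.array(list(itertools.chain(*norm_dists)))

# Equivalent stacking using NumPy directly
X_stacked = np.vstack(norm_dists)

print(X.shape)
print(np.array_equal(X, X_stacked))
```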

The Scikit-Learn API for K-Means Clustering is straightforward. It consists of initialising the KMeans class with the n_clusters parameter, which sets the number of clusters to find, and then calling the fit method on the observational data. Once the model is fitted, the cluster assignment of each observation can be read from the labels_ attribute.

In the following snippet this procedure is carried out for both K = 3 and K = 4 on the same

dataset:

# Apply the K-Means Algorithm for k=3, which is

# equal to the number of true Gaussian clusters

km3 = KMeans(n_clusters=3)

km3.fit(X)

km3_labels = km3.labels_

# Apply the K-Means Algorithm for k=4, which is

# larger than the number of true Gaussian clusters

km4 = KMeans(n_clusters=4)

km4.fit(X)

km4_labels = km4.labels_
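Besides labels_, a fitted KMeans instance also exposes cluster_centers_ (the centroid coordinates) and inertia_ (the within-cluster sum of squared distances). The sketch below fits both models on hypothetical data and prints these attributes. The cluster parameters are assumptions made for illustration, and n_init and random_state are set only to make the run reproducible. Inertia generally falls as K grows, so a lower value for K = 4 does not by itself indicate a better model.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)

# Hypothetical 2D data: three Gaussian clusters (illustrative values only)
means = [(7, 5), (8, 12), (1, 10)]
X = np.vstack([rng.normal(loc=m, scale=1.5, size=(100, 2)) for m in means])

# Fit K-Means for k=3 and k=4 on the same data
km3 = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
km4 = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# One integer label per observation
print(km3.labels_.shape)

# One centroid per cluster, each with two coordinates
print(km3.cluster_centers_.shape, km4.cluster_centers_.shape)

# Within-cluster sum of squared distances for each fit
print(km3.inertia_, km4.inertia_)
```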

The final section creates two Matplotlib scatter subplots of the data, one for K = 3 and one for K = 4, using colours to represent the cluster assignments made by the K-Means algorithm:

# Create a subplot comparing k=3 and k=4

# for the K-Means Algorithm

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14,6))

ax1.scatter(X[:, 0], X[:, 1], c=km3_labels.astype(float))

ax1.set_xlabel("$x_1$")

ax1.set_ylabel("$x_2$")

ax1.set_title("K-Means with $k=3$")

ax2.scatter(X[:, 0], X[:, 1], c=km4_labels.astype(float))

ax2.set_xlabel("$x_1$")

ax2.set_ylabel("$x_2$")

ax2.set_title("K-Means with $k=4$")

plt.show()

The output of the code can be seen in Figure 22.2.

Note that the colour differences between the two plots only represent the fact that there are

different numbers of clusters, rather than any other implicit relationship.
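The reason the colours carry no further meaning is that the integer cluster ids returned by K-Means are arbitrary: relabelling them changes the colours in a plot but not the grouping of the points. The sketch below illustrates this on hypothetical, well-separated data (the means, spread and seed are assumptions for illustration) by permuting the labels and comparing the two labelings with scikit-learn's adjusted_rand_score, which scores identical partitions as 1.0 regardless of how the ids are named.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Hypothetical, well-separated 2D clusters (illustrative values only)
means = [(0, 0), (10, 10), (-10, 10)]
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(100, 2)) for m in means])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X).labels_

# Relabel the clusters with an arbitrary permutation: 0->2, 1->0, 2->1
permutation = np.array([2, 0, 1])
relabelled = permutation[labels]

# The integer ids differ element by element...
ids_identical = np.array_equal(labels, relabelled)

# ...but the partition of the points is unchanged
ari = adjusted_rand_score(labels, relabelled)

print(ids_identical, ari)
```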

The full two-dimensional set of data was generated by sampling from three separate Gaussian
