
Unsupervised Learning

• Inspector Panel: Located on the right-hand side, this panel lets you search for

particular points and see a list of their nearest neighbors.

Figure 2: Screenshot of the Embedding Projector tool

K-means clustering

K-means clustering, as the name suggests, is a technique to cluster data, that is, to

partition data into a specified number of clusters. It is an unsupervised learning

technique: it finds groups in the data without using labels. Remember the sorting

hat of Harry Potter fame? What it is doing in the book is clustering—dividing new

(unlabeled) students into four different clusters: Gryffindor, Ravenclaw, Hufflepuff,

and Slytherin.

Humans are very good at grouping objects together; clustering algorithms try

to give a similar capability to computers. There are many clustering techniques

available, such as hierarchical, Bayesian, or partitional. K-means clustering belongs

to partitional clustering; it partitions the data into k clusters. Each cluster has a

center, called the centroid. The number of clusters k has to be specified by the user.

The k-means algorithm works in the following manner:

1. Randomly choose k data points as the initial centroids (cluster centers)

2. Assign each data point to the closest centroid; there can be different measures

to find closeness, the most common being the Euclidean distance
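
The two steps listed so far can be sketched in plain Python. This is a minimal illustration rather than a full k-means implementation: it covers only random initialization and the assignment step. The function names `init_centroids` and `assign_to_centroids` and the toy data are assumptions made for this example.

```python
import math
import random


def init_centroids(points, k, seed=None):
    """Step 1: pick k distinct data points at random as the initial centroids."""
    rng = random.Random(seed)
    return rng.sample(points, k)


def assign_to_centroids(points, centroids):
    """Step 2: label each point with the index of its closest centroid."""
    labels = []
    for p in points:
        # Euclidean distance from this point to every centroid
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centroids = init_centroids(points, k=2, seed=0)
labels = assign_to_centroids(points, centroids)
print("initial centroids:", centroids)
print("cluster labels:", labels)
```

Because the initial centroids are drawn at random, the labels depend on the seed. Passing a different distance function in place of `math.dist` would change how closeness is measured.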
