
In SOMs, neurons are usually placed at the nodes of a (1D or 2D) lattice. Higher dimensions are also possible but are rarely used in practice. Each neuron in the lattice is connected to all the input units via a weight matrix. The following diagram shows a SOM with a 6 × 8 lattice (48 neurons) and 5 inputs. For clarity, only the weight vectors connecting all inputs to one neuron are shown. In this case, each neuron's weight vector will have 5 elements, resulting in a combined weight matrix of size (48 × 5).

A SOM learns via competitive learning. It can be considered a nonlinear generalization of PCA and thus, like PCA, can be employed for dimensionality reduction.

In order to implement a SOM, let's first understand how it works. As a first step, the weights of the network are initialized either to some random values or by taking random samples from the input. Each neuron occupying a space in the lattice is assigned a specific location. Now, as an input is presented, the neuron whose weights have the least distance from the input is declared the winner, or winner-take-all unit (WTU). This is done by measuring the distance between the weight vectors (W) of all neurons and the input vector (X):

$$d_j = \sqrt{\sum_{i=1}^{N} (W_{ji} - X_i)^2}$$

Here, $d_j$ is the distance of the weights of neuron j from the input X. The neuron with the lowest d value is the winner.

Next, the weights of the winning neuron and its neighboring neurons are adjusted to ensure that the same neuron remains the winner if the same input is presented next time.
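For illustration, here is a minimal NumPy sketch of the winner-selection step; the helper name find_winner and the array shapes are assumptions made for this example, not code taken from the book:

```python
import numpy as np

def find_winner(weights, x):
    """Return the lattice coordinates of the winning (closest) neuron.

    weights : array of shape (rows, cols, dim), one weight vector per neuron
    x       : input vector of shape (dim,)
    """
    # Euclidean distance d_j between every neuron's weight vector and the input
    distances = np.linalg.norm(weights - x, axis=-1)
    # Index of the neuron with the smallest distance (the WTU)
    return np.unravel_index(np.argmin(distances), distances.shape)

# Example: a 6 x 8 lattice with 5-dimensional inputs, as in the diagram above
weights = np.random.rand(6, 8, 5)
x = np.random.rand(5)
print("Winning neuron at lattice position:", find_winner(weights, x))
```

Computing the distances for the whole lattice at once, rather than neuron by neuron, keeps the winner search to a single vectorized pass over the weight array.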

To decide which neighboring neurons need to be modified, the network uses a neighborhood function $\Lambda(r)$; normally, the Gaussian "Mexican hat" function is chosen as the neighborhood function. The neighborhood function is mathematically represented as follows:

$$\Lambda(r) = e^{-d^2 / 2\sigma^2}$$

Here, $\sigma$ is a time-dependent radius of influence of a neuron and d is its distance from the winning neuron. Graphically the function looks like a hat (hence its name), as you can see in the following figure:

Figure 6: The "Gaussian Mexican hat" function, visualized in graph form

Another important property of the neighborhood function is that its radius reduces with time. As a result, in the beginning, many neighboring neurons' weights are modified, but as the network learns, eventually only a few neurons' weights (at times, only one or none) are modified. The change in weight is given by the following equation:

$$dW = \eta \, \Lambda \, (X - W)$$

Here, $\eta$ is the learning rate. The process is repeated over all the inputs for a given number of iterations. As the iterations progress, we reduce the learning rate and the radius by a factor dependent on the iteration number.

SOMs are computationally expensive and thus are not really useful for very large datasets. Still, they are easy to understand, and they can very nicely find the similarity between input data. Thus, they have been employed for image segmentation and to determine word similarity maps in NLP [3].
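To make the update rule concrete, the following NumPy sketch applies $dW = \eta \Lambda (X - W)$ over the whole lattice and decays both the learning rate and the radius with the iteration number. The function name som_step, the exponential decay schedule, and the random stand-in data are illustrative assumptions, not the book's implementation.

```python
import numpy as np

def som_step(weights, x, winner, eta, sigma):
    """Apply one SOM update dW = eta * Lambda(r) * (x - W) over the whole lattice."""
    rows, cols, _ = weights.shape
    # Lattice coordinates of every neuron, shape (rows, cols, 2)
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    # Squared lattice distance of each neuron from the winning neuron
    d2 = np.sum((grid - np.array(winner)) ** 2, axis=-1)
    # Gaussian neighborhood function Lambda(r) = exp(-d^2 / (2 * sigma^2))
    neighborhood = np.exp(-d2 / (2 * sigma ** 2))
    # Pull every neuron's weights toward the input, scaled by eta and Lambda
    return weights + eta * neighborhood[..., np.newaxis] * (x - weights)

# Illustrative training loop: both eta and sigma decay with the iteration number
eta0, sigma0, num_iterations = 0.5, 3.0, 1000
weights = np.random.rand(6, 8, 5)        # 6 x 8 lattice, 5-dimensional inputs
data = np.random.rand(100, 5)            # stand-in for the real input data
for t in range(num_iterations):
    decay = np.exp(-t / num_iterations)
    eta, sigma = eta0 * decay, sigma0 * decay
    x = data[t % len(data)]
    # Winner selection uses the same Euclidean-distance rule as the earlier sketch
    distances = np.linalg.norm(weights - x, axis=-1)
    winner = np.unravel_index(np.argmin(distances), distances.shape)
    weights = som_step(weights, x, winner, eta, sigma)
```

Because the neighborhood term multiplies the update for every neuron, shrinking $\sigma$ over time naturally narrows the set of neurons that move appreciably, matching the behavior described above.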
