MACHINE LEARNING TECHNIQUES - LASA
6.6 Hebbian Learning

Hebbian learning is the core of unsupervised learning techniques in neural networks. It takes its name from the original postulate of the neurobiologist Donald Hebb (Hebb, 1949), stating that:

"When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."

The Hebbian learning rule lets the weight between two units grow as a function of the coactivation of the input and output units. If we consider the classical perceptron with no noise:
y_i = \sum_j w_{ij} x_j        (6.20)
Then the weights increase according to:

\Delta w_{ij} = \alpha \, x_j \, y_i        (6.21)

where \alpha is the learning rate, typically chosen in the interval [0, 1]; it determines the speed at which the weights grow. If x and y are binary, the weights increase only when both x_j and y_i are 1. Note that, in the discrete case, the coactivation must be simultaneous. This is often too strong a constraint in a real-time system, where concurrent events may vary widely in their timing.
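As an illustration, the discrete update of Equations (6.20)–(6.21) can be sketched in NumPy. The layer sizes, learning rate, and small random initialization below are arbitrary choices for the example, not part of the original formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for the example: 4 input units, 2 output units.
n_in, n_out = 4, 2
alpha = 0.1                                      # learning rate in [0, 1]
W = 0.01 * rng.standard_normal((n_out, n_in))    # W[i, j]: input j -> output i

x = rng.integers(0, 2, size=n_in).astype(float)  # binary input pattern
y = W @ x                                        # Eq. (6.20): y_i = sum_j w_ij x_j

W_before = W.copy()
W = W + alpha * np.outer(y, x)                   # Eq. (6.21): dw_ij = alpha x_j y_i
```

Inactive inputs (x_j = 0) leave their column of W unchanged, which is the coactivation requirement in discrete form.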
Such a system is best represented by a continuous-time neural network. In the continuous case, we have:

\frac{\partial w_{ij}(t)}{\partial t} = \alpha(t) \, x_j(t) \, y_i(t)

\Delta w_{ij} = \int_{t_1}^{t_2} \alpha(t) \, x_j(t) \, y_i(t) \, dt

which corresponds to the area of superposed coactivation of the two neurons over the time interval [t_1, t_2].
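A minimal numerical sketch of this integral, assuming two hypothetical rectangular activation pulses that overlap only partially in time; only the overlap contributes to the weight change:

```python
import numpy as np

# Hypothetical activation traces on the interval [t1, t2] = [0, 1].
t = np.linspace(0.0, 1.0, 1001)
dt = t[1] - t[0]
x_j = ((t > 0.2) & (t < 0.6)).astype(float)   # presynaptic activity window
y_i = ((t > 0.4) & (t < 0.9)).astype(float)   # postsynaptic activity window
alpha = 0.5                                   # constant learning rate alpha(t)

# Riemann-sum approximation of  dw_ij = int alpha(t) x_j(t) y_i(t) dt.
# The product x_j * y_i is nonzero only on the overlap [0.4, 0.6], so
# delta_w is the "area of superposed coactivation", about 0.5 * 0.2 = 0.1.
delta_w = np.sum(alpha * x_j * y_i) * dt
```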
One can show that the average weight change per pattern satisfies

\frac{\Delta w_{ij}}{\Delta t} = \sum_k w_{ik} x_k x_j

and that, in the limit \Delta t \to 0, averaging x_k x_j over the input patterns, this is equivalent to

\frac{dW(t)}{dt} \propto W(t) \cdot C        (6.22)

where C_{ij} is the correlation coefficient, computed over all input patterns, between the i-th and j-th terms of the inputs, and W(t) is the matrix of weights at time t.
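To see where the correlation matrix comes from, the sketch below averages the Hebbian update of Equation (6.21) over a batch of zero-mean input patterns; by construction the averaged update equals alpha * W @ C, where C is the empirical input correlation matrix. The sizes and the Gaussian input distribution are arbitrary choices for the example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes for the example.
n_in, n_out, n_patterns = 3, 2, 500
alpha = 0.01
W = rng.standard_normal((n_out, n_in))

X = rng.standard_normal((n_patterns, n_in))  # zero-mean inputs, one per row

Y = X @ W.T                                  # outputs for every pattern, Eq. (6.20)
avg_dW = alpha * (Y.T @ X) / n_patterns      # average of  alpha * y_i * x_j

C = (X.T @ X) / n_patterns                   # empirical correlation C_kj = <x_k x_j>

# avg_dW equals alpha * W @ C: the input correlations drive the weight dynamics.
```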
The major drawback of the Hebbian learning rule, as stated in Equation (6.21), is that the weights grow continuously and without bound. If learning is to run continuously, the values taken by the weights can quickly exceed the floating-point range of the system. We will next review two major ways of limiting the growth of the weights.
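The unbounded growth is easy to reproduce: iterating the plain rule of Equation (6.21) on random inputs makes the weight norm increase monotonically (each update adds 2*alpha*y^2 + alpha^2*y^2*||x||^2 >= 0 to ||w||^2) and, in practice, diverge. A single output unit and Gaussian inputs are arbitrary choices for this sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

n_in = 5
alpha = 0.1
w = 0.01 * rng.standard_normal(n_in)   # weight vector of a single output unit

norms = []
for _ in range(200):
    x = rng.standard_normal(n_in)
    y = w @ x                          # Eq. (6.20)
    w = w + alpha * y * x              # plain Hebbian rule, Eq. (6.21)
    norms.append(float(np.linalg.norm(w)))

# The norm never decreases and, on average, grows exponentially with the
# number of updates, which is the instability discussed in the text.
```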
© A.G.Billard 2004 – Last Update March 2011