MACHINE LEARNING TECHNIQUES - LASA



where g is an arbitrary function of the weight w. One can consider two cases:

1) Multiplicative constraints

\frac{d w(t)}{dt} = C\, w(t) - g\big(w(t)\big)    (6.34)

g\big(w(t)\big) = \gamma\big(w(t)\big) \cdot w(t)    (6.35)

where the decay term is multiplied by the weight. In this case, the decay term can be viewed as a feedback term which limits the rate of growth of each weight: the bigger the weight, the bigger the decay. A numerical sketch of this multiplicative case is given below.

2) Subtractive constraint

\frac{d w(t)}{dt} = C\, w(t) - \gamma\big(w(t)\big)\, w(t)    (6.36)

where the weight decay is proportional to the weight and multiplies it.

6.7 Anti-Hebbian learning

If the inputs to a neural network are correlated, then each contains information about the other. In other words, there is redundancy in the information conveyed by the inputs, and $I(x;y) > 0$. Anti-Hebbian learning is designed to decorrelate the inputs. The ultimate goal is to maximize the information that can be processed by the network: the less redundancy, the more information, and the fewer output nodes required to transfer this information.

Anti-Hebbian learning is also known as lateral inhibition, as the anti-learning occurs between members of the same layer. The basic model is defined by:

\Delta w_{ij} = -\alpha\, \langle y_i \cdot y_j \rangle    (6.37)

where the angle brackets denote the ensemble average taken over all training patterns. Note that this is a tremendous limitation of the system, as it forces global and off-line learning.
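As mentioned above, the following is a minimal numerical sketch of the multiplicative constraint (6.34)–(6.35) acting on a single linear Hebbian unit. The choice $\gamma(w(t)) = y(t)^2$, which recovers Oja's rule, as well as the toy data and the learning rate, are illustrative assumptions and not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy input distribution with two correlated components (an assumption for illustration).
X = rng.normal(size=(500, 2)) @ np.array([[1.0, 0.8],
                                          [0.0, 0.6]])

alpha = 0.01                          # learning rate (assumed)
w_free = 0.1 * rng.normal(size=2)     # plain Hebbian weights, no constraint
w_decay = w_free.copy()               # Hebbian weights with multiplicative decay

for x in X:
    # Unconstrained Hebbian growth, dw/dt = C w, estimated sample-wise as y * x.
    y = w_free @ x
    w_free += alpha * y * x

    # Multiplicative constraint (6.34)-(6.35): growth minus gamma(w) * w,
    # with the illustrative choice gamma(w) = y**2 (Oja's rule).
    y = w_decay @ x
    w_decay += alpha * (y * x - y**2 * w_decay)

print("weight norm, no constraint:       ", np.linalg.norm(w_free))
print("weight norm, multiplicative decay:", np.linalg.norm(w_decay))
```

Running this sketch, the unconstrained weight norm keeps growing with the number of presentations, while the multiplicative decay acts as the feedback term described above and keeps the weight vector bounded, close to unit norm for this particular choice of γ.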
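Next, a minimal sketch of the basic anti-Hebbian rule (6.37), using the ensemble average over all training patterns at each update (the off-line aspect noted above). The two-unit network with a single one-directional lateral connection, the toy data and the learning rate are assumptions made purely for illustration; the symmetric lateral model of Section 6.7.1 is discussed next.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two correlated input channels (an assumption for illustration).
x1 = rng.normal(size=1000)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=1000)

alpha = 0.05   # learning rate (assumed)
w21 = 0.0      # single lateral, anti-Hebbian weight from unit 1 to unit 2

for _ in range(200):
    # Outputs of the two units; only unit 2 receives a lateral contribution here.
    y1 = x1
    y2 = x2 + w21 * y1

    # Rule (6.37): Delta w_ij = -alpha * <y_i . y_j>, averaged over all patterns.
    w21 += -alpha * np.mean(y1 * y2)

# Recompute the outputs with the converged weight and check the decorrelation.
y2 = x2 + w21 * x1
print("lateral weight w21:", w21)                         # approx -<x1 x2> / <x1 x1>
print("residual correlation <y1 y2>:", np.mean(x1 * y2))  # close to 0
```

At the fixed point the ensemble average $\langle y_1 \cdot y_2 \rangle$ vanishes, which is exactly the decorrelation condition discussed below.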

If $y_i$ and $y_j$ are highly correlated, then the weights between them will grow to a large negative value and each will tend to turn the other off. Indeed, we have:

(\Delta w_{ij} \to 0) \;\Rightarrow\; (\langle y_i \cdot y_j \rangle \to 0)

The weight change stops when the two outputs are decorrelated; at this stage, the algorithm has converged. Note that there is no need for weight decay or renormalization of anti-Hebbian weights, as they are automatically self-limiting.

6.7.1 Foldiak's models

Foldiak has suggested several models combining anti-Hebbian learning and weight decay. Here, we consider the first two models as examples of purely anti-Hebbian learning. The first model is shown in Figure 6-12 and has anti-Hebbian connections between the output neurons.

Figure 6-12: Foldiak's first model

The equations which define its dynamical behavior are

y_i = x_i + \sum_{j=1}^{n} w_{ij}\, y_j    (6.38)

with learning rule

\Delta w_{ij} = -\alpha \cdot y_i \cdot y_j \quad \text{for } i \neq j    (6.39)

In matrix terms, we have

y = x + W \cdot y

and so

y = (I - W)^{-1} \cdot x    (6.40)

Therefore, we can view the system as a transformation T from the input vector x to the output y, given by:

y = T \cdot x = (I - W)^{-1} \cdot x    (6.41)

Now, the matrix W must be symmetric, and its only non-zero terms are the off-diagonal ones; for a two-input, two-output network as in the diagram,

W = \begin{pmatrix} 0 & w \\ w & 0 \end{pmatrix}
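The following is a minimal sketch of Foldiak's first model for this two-unit case, assuming toy correlated inputs and a small learning rate; it applies the per-pattern rule (6.39) to the single off-diagonal weight w and computes the outputs through the transform (6.40)–(6.41).

```python
import numpy as np

rng = np.random.default_rng(2)

# Correlated two-dimensional inputs (an assumption for illustration).
X = rng.normal(size=(2000, 2)) @ np.array([[1.0, 0.7],
                                           [0.0, 0.5]])

alpha = 0.01   # learning rate (assumed)
w = 0.0        # the single off-diagonal weight; W = [[0, w], [w, 0]] stays symmetric

for x in X:
    W = np.array([[0.0, w],
                  [w, 0.0]])
    # Output of the recurrent net, eqs. (6.40)-(6.41): y = (I - W)^{-1} x
    y = np.linalg.solve(np.eye(2) - W, x)
    # Anti-Hebbian rule (6.39): Delta w_ij = -alpha * y_i * y_j for i != j
    # (y_1 y_2 = y_2 y_1, so both off-diagonal entries receive the same change).
    w += -alpha * y[0] * y[1]

# Apply the learned transformation T = (I - W)^{-1} to the whole data set.
T = np.linalg.inv(np.eye(2) - np.array([[0.0, w], [w, 0.0]]))
Y = X @ T.T
print("learned lateral weight w:", w)
print("input correlation  <x1 x2>:", np.mean(X[:, 0] * X[:, 1]))
print("output correlation <y1 y2>:", np.mean(Y[:, 0] * Y[:, 1]))  # much smaller than the input one
```

Because $y_i y_j = y_j y_i$, the rule (6.39) applies the same change to both off-diagonal entries, so W stays symmetric; with a smaller learning rate, or by averaging the update over all patterns as in (6.37), the residual output correlation can be driven further towards zero.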

