
6.6.2 Weight decay

A more interesting option is to prune weights that seem to have little influence on the network's operation. This operation requires global knowledge of all the weights in the network and is, thus, less plausible biologically. The idea is that no single weight should grow too large, while the total weight of the connections into a particular output neuron is kept fairly constant. One of the simplest rules for weight decay was developed by Grossberg in 1968 and was of the form:

$$\frac{dw_{ij}}{dt} = \alpha\, y_i x_j - w_{ij} \qquad (6.23)$$

It is clear that the weights will be stable, i.e. $\frac{dw_{ij}}{dt} = 0$, at the points where $w_{ij} = \alpha E\!\left(y_i x_j\right)$.

One can show that, at stability, we must have $\alpha C w = w$ and, thus, that $w$ must be an eigenvector of the correlation matrix $C$.
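To make this analysis concrete, here is a minimal numerical sketch (not part of the notes): it assumes a single linear output neuron $y = \mathbf{w}\cdot\mathbf{x}$ and a small-step Euler discretization of Eq. (6.23), with an illustrative input distribution and an arbitrary choice of $\alpha$. With these values only the direction of the weight vector settles, so the comparison with the leading eigenvector of $C$ is done on the normalized weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative input distribution: C = E[x x^T] = diag(9, 1, 0.25),
# so the leading eigenvector of C is e_1 = (1, 0, 0).
scales = np.array([3.0, 1.0, 0.5])

alpha, dt, n_steps = 0.2, 0.01, 5000   # assumed values, not from the notes
w = 0.1 * rng.normal(size=3)           # small random initial weights

for _ in range(n_steps):
    x = scales * rng.normal(size=3)    # draw one input sample
    y = w @ x                          # assumed linear output neuron y = w.x
    w += dt * (alpha * y * x - w)      # Euler step of Eq. (6.23)

# Compare the learned weight direction with the leading eigenvector
# of the input correlation matrix C.
C = np.diag(scales ** 2)
eigvals, eigvecs = np.linalg.eigh(C)
v_max = eigvecs[:, np.argmax(eigvals)]
cosine = abs(w @ v_max) / np.linalg.norm(w)
print(f"|cos(w, leading eigenvector of C)| = {cosine:.3f}")  # ~1.0
```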

We will next consider two more sophisticated learning equations, which use weight decay.

The instar rule, where the decay term is gated by the input term $x_j$:

$$\frac{dw_{ij}}{dt} = \alpha \left\{ y_i - w_{ij} \right\} x_j \qquad (6.24)$$

In the discrete case, we have:

$$\Delta w_{ij} = \alpha \cdot x_j \cdot y_i - \gamma \cdot w_{ij} \cdot y_i \qquad (6.25)$$

$$\alpha = \gamma \;\Rightarrow\; \Delta w_{ij} = \alpha \cdot y_i \cdot \left( x_j - w_{ij} \right) \qquad (6.26)$$

$$y_i = 1 \;\Rightarrow\; \Delta w_{ij} = \alpha \cdot \left( x_j - w_{ij} \right) \;\Rightarrow\; w_{ij}(t) = (1-\alpha) \cdot w_{ij}(t-1) + \alpha \cdot x_j(t) \qquad (6.27)$$

The weight moves in the direction of the input.
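Written this way, Eq. (6.27) is an exponential moving average of the presented input patterns: each step pulls $w_{ij}$ a fraction $\alpha$ of the way toward the current input. The short sketch below (not from the notes; the input distribution and $\alpha$ are arbitrary choices) shows the weight vector of a single unit, with its output clamped to $y_i = 1$, drifting toward the mean input.

```python
import numpy as np

rng = np.random.default_rng(1)

alpha = 0.05                        # learning rate (illustrative choice)
mu = np.array([2.0, -1.0, 0.5])     # mean of the presented inputs (assumption)
w = np.zeros(3)                     # initial weights

for t in range(2000):
    x = mu + 0.1 * rng.normal(size=3)   # noisy input sample
    w = (1.0 - alpha) * w + alpha * x   # Eq. (6.27) with the output clamped to 1

print("w  =", np.round(w, 3))   # close to the mean input pattern
print("mu =", np.round(mu, 3))
```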
