MACHINE LEARNING TECHNIQUES - LASA
Foldiak’s second model allows all neurons to receive their own outputs with weight 1.
\Delta w_{ii} = \alpha \left( 1 - y_i y_i \right) \qquad (6.46)

which can be written in matrix form as

\Delta W = \alpha \left( I - Y Y^T \right) \qquad (6.47)

where I is the identity matrix.
This network converges when the outputs are decorrelated (due to the off-diagonal anti-Hebbian learning) and when the expected variance of each output equals 1; i.e., this learning rule forces each network output to take responsibility for the same amount of information, since the entropy of each output is the same.
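As a concrete illustration, here is a minimal sketch of rule (6.47) in Python, assuming a simple linear map y = Wx for the network (the activation equation is not given above); the learning rate, input covariance, and step count are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)
n = 3
alpha = 0.005                       # learning rate (illustrative)
W = np.eye(n)                       # initial weights

# Draw zero-mean inputs with a correlated covariance C (illustrative).
C = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
Lc = np.linalg.cholesky(C)

for _ in range(50000):
    x = Lc @ rng.standard_normal(n)             # one correlated sample
    y = W @ x                                   # assumed linear output y = W x
    W += alpha * (np.eye(n) - np.outer(y, y))   # Eq. (6.47)

# The outputs should now be decorrelated with unit variance:
Y = W @ Lc @ rng.standard_normal((n, 10000))
print(np.round(Y @ Y.T / 10000, 2))             # approximately the identity

Running this, the printed output covariance approaches the identity matrix, matching the stated convergence condition.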
This is generalizable to

\Delta w_{ij} = \alpha \left( \theta_{ij} - y_i y_j \right) \qquad (6.48)

where \theta_{ij} = 0 for i \neq j. The value of \theta_{ii} for each i determines the variance of that output, and so we can manage the information output of each neuron.
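To see why, take expectations in (6.48): at convergence the mean update vanishes, so

E[\Delta w_{ij}] = \alpha \left( \theta_{ij} - E[y_i y_j] \right) = 0 \quad \Rightarrow \quad E[y_i y_j] = \theta_{ij}

The zero off-diagonal entries therefore enforce decorrelation, while each diagonal entry \theta_{ii} pins the variance E[y_i^2] of output i to a chosen value.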
6.7.2 CCA Revisited<br />
Adapted from Peiling Lai and Colin Fyfe, Kernel and Nonlinear Canonical Correlation Analysis, Computing and Information Systems, 7 (2000), pp. 43-49.
The Canonical Correlation Network

Figure 1: The CCA network. By adjusting the weights w_1 and w_2, we maximize the correlation between y_1 and y_2.
Let us consider CCA in artificial neural network terms. The input data comprises two vectors, x_1 and x_2. Activation is fed forward from each input to the corresponding output through the respective weights, w_1 and w_2 (see Figure 1 and equations (1) and (2)), to give outputs y_1 and y_2.

One can derive an objective function for the maximization of this correlation under the constraint that the variances of y_1 and y_2 should each be 1.
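Using Lagrange multipliers \lambda_1 and \lambda_2 for these variance constraints, one standard form of the objective (the 1/2 scaling of the multiplier terms is a common convention and may differ from the original) is

J = E[y_1 y_2] + \frac{\lambda_1}{2} \left( 1 - E[y_1^2] \right) + \frac{\lambda_2}{2} \left( 1 - E[y_2^2] \right)

A minimal stochastic gradient sketch of this maximization follows; the toy data, learning rates, and initialization are illustrative assumptions, not values from the text.

import numpy as np

rng = np.random.default_rng(1)
eta, eta0 = 0.001, 0.001            # weight / multiplier learning rates

# Toy data: two 2-d "views" x1, x2 sharing one common signal s.
n_samples = 100000
s = rng.standard_normal(n_samples)
x1 = np.vstack([s + 0.2 * rng.standard_normal(n_samples),
                rng.standard_normal(n_samples)])
x2 = np.vstack([s + 0.2 * rng.standard_normal(n_samples),
                rng.standard_normal(n_samples)])

w1 = 0.1 * rng.standard_normal(2)
w2 = 0.1 * rng.standard_normal(2)
lam1 = lam2 = 1.0                   # Lagrange multipliers

for t in range(n_samples):
    y1, y2 = w1 @ x1[:, t], w2 @ x2[:, t]
    w1 += eta * x1[:, t] * (y2 - lam1 * y1)   # ascend dJ/dw1
    w2 += eta * x2[:, t] * (y1 - lam2 * y2)   # ascend dJ/dw2
    lam1 += eta0 * (y1 * y1 - 1.0)            # tighten E[y1^2] = 1
    lam2 += eta0 * (y2 * y2 - 1.0)            # tighten E[y2^2] = 1

print(f"correlation: {np.corrcoef(w1 @ x1, w2 @ x2)[0, 1]:.2f}")

At the optimum each multiplier settles at E[y_1 y_2] itself, and the printed value should approach the true canonical correlation of the two views (about 0.96 for this construction).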