
2.3.5 ICA Ambiguities

We cannot determine the variances of the independent components. The reason is that, both $s$ and $A$ being unknown, any scalar multiplier in one of the sources $s_i$ could always be cancelled by dividing the corresponding column $a_i$ of $A$ by the same scalar, say $\alpha_i$:

$$ x = \sum_i \left(\frac{1}{\alpha_i}\, a_i\right)\left(s_i\, \alpha_i\right) \qquad (2.29) $$

As a consequence, we may as well fix the magnitudes of the independent components; as they are random variables, the most natural way to do this is to assume that each component has unit variance, i.e. $E\{s_i^2\} = 1$. Then the matrix $A$ will be adapted in the ICA solution methods to take this restriction into account. Note that this still leaves the ambiguity of the sign: we could multiply any of the independent components by $-1$ without affecting the model. This ambiguity is, fortunately, insignificant in most applications.
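As a quick numerical check of the scaling ambiguity in (2.29), the short sketch below (in Python/NumPy) verifies that multiplying one source by a scalar and dividing the corresponding column of $A$ by the same scalar leaves the observations unchanged, and then applies the unit-variance convention. The mixing matrix, the Laplacian sources and the scalar are arbitrary illustrative choices, not taken from these notes.

```python
import numpy as np

# Numerical check of the scaling ambiguity in Eq. (2.29): multiplying a source
# by alpha and dividing the corresponding column of A by the same alpha leaves
# the observations x unchanged. A, s and alpha are arbitrary example values.
rng = np.random.RandomState(0)
A = np.array([[1.0, 2.0],
              [0.5, 1.5]])                 # example mixing matrix
s = rng.laplace(size=(2, 1000))            # example non-Gaussian sources
x = A @ s

alpha = 3.7
A_rescaled = A.copy(); A_rescaled[:, 0] /= alpha
s_rescaled = s.copy(); s_rescaled[0, :] *= alpha
assert np.allclose(x, A_rescaled @ s_rescaled)   # identical observations

# The convention E{s_i^2} = 1 removes the scale ambiguity (up to sign):
s_unit = s / np.sqrt((s ** 2).mean(axis=1, keepdims=True))
```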

We cannot determine the order of the independent components. The reason is that, again both $s$ and $A$ being unknown, we can freely change the order of the terms in the sum in (2.23) and call any of the independent components the first one. Formally, a permutation matrix $P$ and its inverse can be substituted in the model to give $x = A P^{-1} P s$. The elements of $Ps$ are the original independent variables $s_j$, but in another order. The matrix $A P^{-1}$ is just a new unknown mixing matrix, to be solved by the ICA algorithms.
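A similar sketch illustrates the ordering ambiguity: inserting a permutation matrix $P$ and its inverse between $A$ and $s$ only reorders the sources without changing $x$. Again, the matrices below are arbitrary examples chosen for illustration.

```python
import numpy as np

# Numerical check of the ordering ambiguity: x = A P^{-1} (P s) for any
# permutation matrix P, so the order of the components cannot be identified.
# A, s and P are arbitrary example values.
rng = np.random.RandomState(1)
A = np.array([[1.0, 2.0],
              [0.5, 1.5]])
s = rng.laplace(size=(2, 1000))
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])                       # swaps the two components

x = A @ s
x_permuted = (A @ np.linalg.inv(P)) @ (P @ s)    # new mixing matrix A P^{-1}, reordered sources P s
assert np.allclose(x, x_permuted)
```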

ICA properties are:

• Redundancy reduction: reducing redundancy in a dataset has numerous advantages. When the data no longer show any redundancy, the correlations across the data are zero. In other words, each data point is significant and encapsulates a relevant characteristic of the dataset. Moreover, such a dataset is noise-free, because noise affects all data similarly. The brain seems to rely particularly on redundancy reduction. As we will see later on in these lecture notes, the capacity of Neural Networks based on Hebbian Learning is maximal with non-redundant datasets.

• Projection pursuit: in a noise-free or non-redundant dataset, there might still be a number of features irrelevant to the task. Projection pursuit is a method for determining the relevant directions along which the data lie.

2.3.6 ICA by maximizing non-gaussianity

A simple and intuitive way of estimating ICA is provided by the FastICA method. FastICA is based on a fixed-point iteration scheme for finding a maximum of a measure of non-gaussianity. We present this solution here. Note that other objective functions have been proposed in the literature to solve ICA, such as minimizing mutual information or maximizing the likelihood. We will not review these here. In Section 6.7.2, we will offer another solution to ICA using Artificial Neural Networks.
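To make the fixed-point idea concrete, here is a minimal sketch of a one-unit FastICA-style iteration on whitened data. The $\tanh$ contrast function, the whitening procedure and the toy mixing example are assumptions made for illustration; they are not prescribed by these notes.

```python
import numpy as np

# Minimal sketch of a one-unit FastICA-style fixed-point iteration on whitened
# data. The tanh contrast function and the toy example are illustrative choices.

def whiten(X):
    """Center the data and whiten it so that its covariance becomes the identity."""
    X = X - X.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X))
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T @ X

def fastica_one_unit(X, n_iter=200, tol=1e-8, seed=0):
    """Estimate one independent direction w by maximizing non-gaussianity."""
    Z = whiten(X)
    rng = np.random.RandomState(seed)
    w = rng.randn(Z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        wz = w @ Z                                          # projections w^T z
        g, g_prime = np.tanh(wz), 1.0 - np.tanh(wz) ** 2
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w   # fixed-point update
        w_new /= np.linalg.norm(w_new)
        if abs(abs(w_new @ w) - 1.0) < tol:                 # converged (up to sign)
            return w_new
        w = w_new
    return w

# Toy usage: mix two non-Gaussian sources and recover one independent direction.
rng = np.random.RandomState(1)
S = rng.laplace(size=(2, 5000))                 # independent, non-Gaussian sources
A = np.array([[1.0, 0.5], [0.3, 1.0]])          # unknown mixing matrix
w = fastica_one_unit(A @ S)
```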

