MACHINE LEARNING TECHNIQUES - LASA
• In our general definition of the ICA model given previously, we assumed that A was a q × N matrix. Here, we focus on a simplified version of ICA in which the unknown mixing matrix A is assumed to be square, i.e. the number of independent components equals the number of observed mixtures, and thus A is q × q. Note that this assumption can sometimes be relaxed (see the extensions proposed in Hyvarinen et al. 2001). The square case can be reached, for instance, by first performing PCA on the original dataset and then applying ICA to the reduced set of q dimensions obtained by PCA. If A is square, then, after estimating the matrix A, one can compute its inverse, $W = A^{-1}$, and obtain the independent components simply by:
$$s = W x = A^{-1} x \qquad (2.24)$$
• The data is white, i.e. the components of each datapoint are uncorrelated and each has unit variance. Whitening is a basic preprocessing step in ICA, as we discuss next.
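The square case of equation (2.24) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the mixing matrix values, the uniform sources and the names `A`, `S`, `X`, `W` and `S_rec` are hypothetical, and A is taken as known here, whereas in practice it must first be estimated by the ICA algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: q = 2 independent, non-Gaussian sources, M samples each.
q, M = 2, 1000
S = rng.uniform(-1.0, 1.0, size=(q, M))

# Assumed square (q x q) mixing matrix, invertible by construction.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

X = A @ S              # observed mixtures, x = A s
W = np.linalg.inv(A)   # unmixing matrix, W = A^{-1}
S_rec = W @ X          # recovered sources, s = W x (equation 2.24)

print(np.allclose(S_rec, S))
```

With a non-square A no such inverse exists, which is why the dimensionality is first reduced to q in the simplified setting above.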
2.3.4 Whitening
A useful preprocessing strategy in ICA is to first whiten the observed variables. This means that before applying the ICA algorithm (and after centering), we transform the observed vector $x$ linearly so that we obtain a new vector $\tilde{x}$ which is white, i.e. its components are uncorrelated and each has unit variance. In other words, the covariance matrix of $\tilde{x}$ equals the identity matrix:
$$E\{\tilde{x}\tilde{x}^T\} = I \qquad (2.25)$$
The whitening transformation is always possible. One popular method for whitening is to use the eigenvalue decomposition of the covariance matrix, $E\{xx^T\} = UDU^T$, where $U$ is the orthogonal matrix of its eigenvectors and $D$ is the diagonal matrix of its eigenvalues, $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Note that $E\{xx^T\}$ is an empirical mean, i.e. it is estimated from the available data samples. Whitening can now be done by computing:
$$\tilde{x} = U D^{-\frac{1}{2}} U^T x \qquad (2.26)$$
The matrix $D^{-\frac{1}{2}}$ is computed by a simple component-wise operation on the eigenvalues, such that:
$$D^{-\frac{1}{2}} = \mathrm{diag}\left(\lambda_1^{-\frac{1}{2}}, \ldots, \lambda_n^{-\frac{1}{2}}\right)$$
It is easy to check that now $E\{\tilde{x}\tilde{x}^T\} = I$.
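The whitening steps above (centering, eigenvalue decomposition of the empirical covariance, component-wise inverse square root) can be sketched as follows. This is a minimal NumPy illustration, not a reference implementation: the function name `whiten`, the (variables × samples) data layout and the test data are assumptions, and the covariance is assumed to be full rank so that all eigenvalues are strictly positive.

```python
import numpy as np

def whiten(X):
    """Whiten X of shape (n, M): n variables, M samples (assumed layout)."""
    Xc = X - X.mean(axis=1, keepdims=True)   # centering
    C = Xc @ Xc.T / Xc.shape[1]              # empirical covariance E{x x^T}
    lam, U = np.linalg.eigh(C)               # C = U diag(lam) U^T
    D_inv_sqrt = np.diag(lam ** -0.5)        # component-wise D^{-1/2}
    V = U @ D_inv_sqrt @ U.T                 # whitening matrix U D^{-1/2} U^T
    return V @ Xc, V

# Hypothetical data: two uniform sources mixed by an assumed matrix A.
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(2, 5000))
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
X = A @ S

X_white, V = whiten(X)
C_white = X_white @ X_white.T / X_white.shape[1]

# The empirical covariance of the whitened data equals I up to rounding.
print(np.allclose(C_white, np.eye(2)))
```

`np.linalg.eigh` is used because the covariance matrix is symmetric; it returns real eigenvalues and orthonormal eigenvectors, which matches the decomposition $UDU^T$ in the text.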
Whitening transforms the mixing matrix into a new one, $\tilde{A}$:
$$\tilde{x} = U D^{-\frac{1}{2}} U^T A s = \tilde{A} s \qquad (2.27)$$
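The effect of equation (2.27) on the mixing matrix can be checked numerically. The sketch below makes an additional assumption not required by the formula itself: the sources are zero-mean with $E\{ss^T\} = I$, so that the covariance of $x$ is exactly $AA^T$ and no sampling is needed. The values of `A` and the name `A_tilde` are illustrative.

```python
import numpy as np

# Assumed square mixing matrix (illustrative values only).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Assumption: sources are zero-mean with E{s s^T} = I, so E{x x^T} = A A^T.
C = A @ A.T
lam, U = np.linalg.eigh(C)            # C = U diag(lam) U^T
V = U @ np.diag(lam ** -0.5) @ U.T    # whitening matrix U D^{-1/2} U^T

A_tilde = V @ A                       # new mixing matrix of equation (2.27)

# Under the unit-covariance assumption this product is the identity
# up to rounding, i.e. A_tilde is orthogonal.
print(np.round(A_tilde @ A_tilde.T, 6))
```

With a covariance estimated from finite samples, the same product is only approximately the identity.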
The utility of whitening resides in the fact that the new mixing matrix $\tilde{A}$ is orthogonal. This can be seen from
© A.G.Billard 2004 – Last Update March 2011