
2.1.2.2 Reconstruction error minimization through constrained optimization

Earlier on, we showed that PCA finds the optimal (in a mean-square sense) projections of the dataset. This, again, can be formalized as a constrained optimization problem that minimizes the following objective function:

$$ J\left(e^1,\ldots,e^q,\lambda\right) \;=\; \frac{1}{M}\sum_{i=1}^{M}\left\| x^i - \mu - \sum_{j=1}^{q}\lambda_{ij}\, e^j \right\|^2 \qquad (2.9) $$

where $\lambda_{ij} = \left(e^j\right)^T x^i$ are the projection coefficients and $\mu$ the mean of the data.

One optimizes $J$ under the constraints that the eigenvectors form an orthonormal basis, i.e.:

$$ \left\|e^j\right\| = 1 \quad \text{and} \quad \left(e^i\right)^T e^j = 0, \qquad \forall\, i \neq j,\; i,j = 1,\ldots,q. $$
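As a sanity check on this formulation, the short numpy sketch below evaluates the objective of Eq. (2.9) for the top-q eigenvectors of the sample covariance, with the projection coefficients taken at their minimizing values (the projections of the centered data onto the eigenvectors). It is an illustrative sketch, not code from these notes; the names `pca_reconstruction_error`, `X`, and `q` are placeholders.

```python
import numpy as np

def pca_reconstruction_error(X, q):
    """Mean-square reconstruction error J of Eq. (2.9) for a rank-q PCA model of X (M x d)."""
    M, _ = X.shape
    mu = X.mean(axis=0)                     # mean of the data
    Xc = X - mu                             # centered data x^i - mu
    C = (Xc.T @ Xc) / M                     # d x d sample covariance
    eigval, eigvec = np.linalg.eigh(C)      # eigenvalues in ascending order
    E = eigvec[:, ::-1][:, :q]              # e^1, ..., e^q as columns (largest eigenvalues first)
    lam = Xc @ E                            # projection coefficients lambda_ij at their minimizing values
    residual = Xc - lam @ E.T               # x^i - mu - sum_j lambda_ij e^j
    return np.mean(np.sum(residual ** 2, axis=1))

# The error decreases as q grows and vanishes when q equals the data dimension.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.2])
print([round(pca_reconstruction_error(X, q), 4) for q in (1, 2, 3)])
```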

2.1.3 PCA limitations

PCA is a simple, straightforward means of determining the major dimensions of a dataset. It suffers, however, from a number of drawbacks. The principal components found by projecting the dataset onto the perpendicular basis vectors (eigenvectors) are uncorrelated, and their directions orthogonal. The assumption that the reference frame is orthogonal is often too constraining; see Figure 2-3 for an illustration.

Figure 2-3: Assume a set of data points whose joint distribution forms a parallelogram. The first PC is the direction with the greatest spread, along the longest axis of the parallelogram. The second PC is orthogonal to the first one, by necessity. The independent component directions are, however, parallel to the sides of the parallelogram.

PCA ensures only uncorrelatedness. This is a less constraining condition than statistical independence, which makes standard PCA ill-suited for dealing with non-Gaussian data. ICA is a method that specifically ensures statistical independence.
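To make the contrast concrete, the sketch below draws data uniformly over a parallelogram, as in Figure 2-3 (two independent uniform sources mixed by a non-orthogonal matrix), and compares the PCA directions with those recovered by ICA. It assumes scikit-learn's FastICA is available; the mixing matrix A and all variable names are illustrative choices, not part of the original notes.

```python
import numpy as np
from sklearn.decomposition import FastICA  # assumed available; not part of the original notes

# Data uniform over a parallelogram: two independent, non-Gaussian (uniform)
# sources mixed by a non-orthogonal matrix A.
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(5000, 2))        # independent sources
A = np.array([[2.0, 1.0],
              [0.0, 1.0]])                        # columns = parallelogram sides (not orthogonal)
X = S @ A.T

def angle_deg(u, v):
    """Unsigned angle between two unit vectors, in degrees."""
    return np.degrees(np.arccos(abs(u @ v)))

# PCA: eigenvectors of the covariance matrix are orthogonal by construction.
Xc = X - X.mean(axis=0)
_, pca_dirs = np.linalg.eigh(np.cov(Xc, rowvar=False))
print("PCA directions:", angle_deg(pca_dirs[:, 0], pca_dirs[:, 1]), "degrees apart")  # ~90

# ICA: the estimated mixing directions line up with the sides of the
# parallelogram (the columns of A), which are about 45 degrees apart.
ica = FastICA(n_components=2, random_state=0).fit(X)
ica_dirs = ica.mixing_ / np.linalg.norm(ica.mixing_, axis=0)
print("ICA directions:", angle_deg(ica_dirs[:, 0], ica_dirs[:, 1]), "degrees apart")  # ~45
```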

