MACHINE LEARNING TECHNIQUES - LASA


The off-diagonal element c_ij of the covariance matrix measures the correlation across the two components X_i and X_j. If the two components are uncorrelated, their covariance is zero: c_ij = c_ji = 0. The covariance matrix is, by definition, always symmetric. It thus has an orthogonal basis, defined by N eigenvectors e_i, i = 1, ..., N, with associated eigenvalues λ_i:

    C e_i = λ_i e_i                                  (2.3)

The eigenvalues λ_i are calculated by solving the characteristic equation:

    |C − λ_i I| = 0                                  (2.4)

where I is the N×N identity matrix and |·| denotes the determinant of the matrix. If the data vector has N components, the characteristic equation is of order N, which is easy to solve only if N is small.

By ordering the eigenvectors in order of descending eigenvalues (largest first), one obtains an ordered orthogonal basis whose first eigenvector points in the direction of largest variance of the data. In other words, the eigenvector corresponding to the largest eigenvalue is the direction along which the variance of the data is maximal. The directions of the eigenvectors are drawn as vectors in Figure 2-1, right: the first eigenvector, with the largest eigenvalue, points in the direction of largest variance (the longest axis of the ellipse), whereas the second eigenvector is orthogonal to the first one (pointing along the second axis of the ellipse).
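
As an illustration (not part of the original notes), a minimal NumPy sketch of this eigendecomposition might look as follows; the synthetic data, the variable names, and the use of numpy.linalg.eigh are assumptions made for the example:

    import numpy as np

    # Hypothetical example data: 500 samples of dimension N = 3
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3)) @ np.diag([3.0, 1.0, 0.2])

    mu = X.mean(axis=0)                  # mean of the data
    C = np.cov(X, rowvar=False)          # N x N covariance matrix (symmetric)

    # Solve C e_i = lambda_i e_i (Eq. 2.3); eigh exploits the symmetry of C
    eigvals, eigvecs = np.linalg.eigh(C)

    # Reorder so that eigenvalues are in descending order (largest first)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]          # column i is the eigenvector e_i

    print(eigvals)                       # variances along the principal directions

Sorting by descending eigenvalue is what makes the first column of eigvecs point along the direction of largest variance, as described above.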

2.1.1 Dimensionality Reduction

By choosing the k eigenvectors with the largest eigenvalues, one can represent the data in a subspace of lower dimension. Let W be the matrix whose rows are the eigenvectors of the covariance matrix and let µ be the mean of the data. A data vector X_i is transformed into the coordinates of this orthogonal basis by:

    X'_i = W (X_i − µ)                               (2.6)

The components of X'_i can be seen as the coordinates of X_i in the orthogonal basis. To reconstruct the original data vector X_i from X'_i, one uses the property of an orthogonal matrix, W^{-1} = W^T, and computes:

    X_i = W^T X'_i + µ

Now, instead of using all the eigenvectors of the covariance matrix, one may represent the data in terms of only a few basis vectors of the orthogonal basis. If we denote by W_k the reduced transfer matrix that contains only the first k eigenvectors as rows, the reduced transformation is:

    X'_i = W_k (X_i − µ)

X'_i now lives in a coordinate system of dimension k. Among all linear projections onto a k-dimensional subspace, this transformation minimizes the mean-square error between the original data points and their projections. If the data is concentrated in a linear subspace, this provides a way to compress the data without losing much information, while simplifying its representation.

By picking the eigenvectors with the largest eigenvalues, we lose as little information as possible in the mean-square sense. One can, for example, choose a fixed number of eigenvectors (and their respective eigenvalues) and obtain a representation, or abstraction, of the data of consistent dimension; the amount of energy of the original data that is preserved then varies. Alternatively, one can fix the fraction of energy to preserve and let the number of retained eigenvectors vary; this yields an approximately consistent amount of information at the expense of representations whose dimension varies.
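
Again as an illustration rather than the notes' own implementation, a minimal NumPy sketch of the reduced transformation, the reconstruction, and the fraction of retained energy might read as follows (the synthetic data and the choice k = 2 are assumptions made for the example):

    import numpy as np

    # Same hypothetical data and eigendecomposition as in the previous sketch
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3)) @ np.diag([3.0, 1.0, 0.2])
    mu = X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    k = 2                                  # illustrative choice of the reduced dimension
    W_k = eigvecs[:, :k].T                 # reduced transfer matrix: first k eigenvectors as rows

    X_prime = (X - mu) @ W_k.T             # projection  X'_i = W_k (X_i - mu)
    X_rec = X_prime @ W_k + mu             # reconstruction  X_i ~ W_k^T X'_i + mu

    mse = np.mean(np.sum((X - X_rec) ** 2, axis=1))   # mean-square reconstruction error
    energy = eigvals[:k].sum() / eigvals.sum()        # fraction of total variance ("energy") retained
    print(k, round(mse, 4), round(energy, 3))

To follow the alternative strategy, one would instead fix the fraction of energy to retain and pick the smallest k whose cumulative sum of eigenvalues reaches that fraction.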
