
in our case. On the other hand, the between-class scatter matrix is the scatter of the expected vectors around the mixture mean:

\[
S_b = \sum_{i=1}^{C} P_i (M_i - M_0)(M_i - M_0)^T, \tag{4.43}
\]

where $M_0$ represents the expected vector of the mixture distribution and is given by:

\[
M_0 = E\{x\} = \sum_{i=1}^{C} P_i M_i. \tag{4.44}
\]
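As a concrete illustration of Eqs. (4.43) and (4.44), the following is a minimal NumPy sketch that computes the mixture mean $M_0$ and the between-class scatter matrix $S_b$. The names `means` (a $C \times n$ array of class means $M_i$), `priors` (the class probabilities $P_i$), and the helper `between_class_scatter` are assumptions made for illustration, not part of the thesis.

```python
import numpy as np

def between_class_scatter(means, priors):
    """Hypothetical helper: Sb and M0 following Eqs. (4.43)-(4.44)."""
    M0 = priors @ means                       # mixture mean, Eq. (4.44)
    diffs = means - M0                        # rows are M_i - M_0
    Sb = (priors[:, None] * diffs).T @ diffs  # sum_i P_i (M_i-M0)(M_i-M0)^T
    return Sb, M0
```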

A linear transformation from an $n$-dimensional $x$ to an $m$-dimensional $y$ ($m < n$) is expressed by:

\[
y = A^T x, \tag{4.45}
\]

where $A$ is an $n \times m$ rectangular matrix whose column vectors are linearly independent. The problem of feature extraction for classification is to find the $A$ that optimizes $J$. It can easily be shown that $J$ is optimized if $A$ consists of the first $C-1$ eigenvectors of $S_w^{-1} S_b$ [39].
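This eigenvector solution can be sketched directly in NumPy; the helper name `lda_projection` is hypothetical, and `Sw` and `Sb` are assumed to be precomputed within-class and between-class scatter matrices.

```python
import numpy as np

def lda_projection(Sw, Sb, num_classes):
    """Columns of A are the first C-1 eigenvectors of Sw^{-1} Sb."""
    evals, evecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1]        # largest eigenvalues first
    A = evecs[:, order[:num_classes - 1]].real  # n x (C-1) matrix
    return A

# Usage: y = lda_projection(Sw, Sb, C).T @ x, as in Eq. (4.45).
```

In practice a generalized symmetric eigensolver is preferable to explicitly inverting $S_w$, which may be ill-conditioned.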

4.4.2.2 Nonparametric Linear Discriminant Analysis

The number of features, $C-1$, selected by LDA [39] is suboptimal in the Bayes sense. Therefore, if the estimate of the Bayes error in the feature space ($y$ space) is much larger than the one in the original space ($x$ space), the feature extraction process should be augmented. If $\mathrm{tr}(S_w^{-1} S_b)$ is used as the criterion, LDA selects the first $(C-1)$-dimensional subspace, which contains the classification information carried by the scatter of the mean vectors, while the second $(n-C+1)$-dimensional subspace, which contains the information due to covariance differences, is neglected, where $n$ is the dimensionality of the $x$ space. Fig. 4.5 shows some of the cases where traditional LDA does not work. Therefore, we need to select additional features from the $(n-C+1)$-dimensional subspace. The basis of nonparametric LDA [39] is the nonparametric formulation of the scatter matrices using k-nearest-neighbor (kNN) techniques, which measure between-class and within-class scatter on a local basis, both matrices being generally of full rank. In addition, the nonparametric nature of the scatter matrix inherently leads to extracted features that preserve the class structure important for accurate classification.
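To make the local formulation concrete, the sketch below computes a simplified two-class version of the nonparametric between-class scatter: each sample is scattered around the local mean of its $k$ nearest neighbors from the other class. The weighting function of the full formulation in [39] is omitted for brevity, and the names `X1`, `X2`, and `nonparametric_Sb` are illustrative assumptions.

```python
import numpy as np

def nonparametric_Sb(X1, X2, k=3):
    """Simplified kNN between-class scatter (two classes, no weights)."""
    n = X1.shape[1]
    Sb = np.zeros((n, n))
    for Xa, Xb in ((X1, X2), (X2, X1)):
        for x in Xa:                            # scatter each sample
            d = np.linalg.norm(Xb - x, axis=1)  # distances to other class
            local_mean = Xb[np.argsort(d)[:k]].mean(axis=0)
            diff = x - local_mean
            Sb += np.outer(diff, diff)          # local between-class term
    return Sb / (len(X1) + len(X2))
```

Because the local means vary from sample to sample, this matrix is generally of full rank, which is what allows features beyond the first $C-1$ to be extracted.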

