Master Thesis - Department of Computer Science
in our case. On the other hand, the between-class scatter matrix is the scatter of the
expected vectors around the mixture mean:

$$ S_b = \sum_{i=1}^{C} P_i (M_i - M_0)(M_i - M_0)^T. \tag{4.43} $$
where $M_0$ represents the expected vector of the mixture distribution and is given by:

$$ M_0 = E\{x\} = \sum_{i=1}^{C} P_i M_i. \tag{4.44} $$
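The two definitions above can be checked numerically. The following is a minimal NumPy sketch; the class means $M_i$ and priors $P_i$ are made-up example values, not data from this thesis:

```python
import numpy as np

# Hypothetical class statistics: C = 3 classes in a 4-dimensional space.
# means[i] plays the role of M_i, priors[i] the role of P_i (example values only).
means = np.array([[0.0, 1.0, 2.0, 0.5],
                  [1.0, 0.0, 1.5, 2.0],
                  [2.0, 2.0, 0.0, 1.0]])
priors = np.array([0.5, 0.3, 0.2])

# Mixture mean M_0 = sum_i P_i M_i                        (Eq. 4.44)
M0 = priors @ means

# Between-class scatter S_b = sum_i P_i (M_i - M0)(M_i - M0)^T   (Eq. 4.43)
diffs = means - M0
Sb = sum(p * np.outer(d, d) for p, d in zip(priors, diffs))
```

Note that the prior-weighted deviations $P_i(M_i - M_0)$ sum to zero, so $S_b$ built from $C$ class means has rank at most $C-1$; this is exactly why parametric LDA yields at most $C-1$ features.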
A linear transformation from an $n$-dimensional $x$ to an $m$-dimensional $y$ ($m < n$) is
expressed by:

$$ y = A^T x, \tag{4.45} $$
where $A$ is an $n \times m$ rectangular matrix whose column vectors are linearly independent. The problem of feature extraction for classification is to find the $A$ that optimizes a class-separability criterion $J$. It can be easily proved that $J$ is optimized when the columns of $A$ are the first $C-1$ eigenvectors of $S_w^{-1} S_b$ [39].
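The eigenvector solution can be sketched as follows. This is an illustrative implementation on made-up Gaussian toy data (the data, class means, and variable names are assumptions, not the thesis's experimental setup); $S_w$ is assumed invertible:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: C = 3 classes, 50 samples each, in n = 4 dimensions.
X = [rng.normal(loc=m, scale=0.5, size=(50, 4))
     for m in ([0, 0, 0, 0], [2, 0, 1, 0], [0, 2, 0, 1])]
C, n = len(X), 4
priors = np.array([len(Xi) for Xi in X], dtype=float)
priors /= priors.sum()

means = np.array([Xi.mean(axis=0) for Xi in X])
M0 = priors @ means                      # mixture mean, Eq. 4.44

# Within-class scatter (prior-weighted class covariances) and
# between-class scatter as in Eq. 4.43.
Sw = sum(p * np.cov(Xi, rowvar=False, bias=True) for p, Xi in zip(priors, X))
Sb = sum(p * np.outer(m - M0, m - M0) for p, m in zip(priors, means))

# A optimizing J = tr(S_w^{-1} S_b): the C-1 leading eigenvectors of S_w^{-1} S_b.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
order = np.argsort(eigvals.real)[::-1]
A = eigvecs[:, order[:C - 1]].real       # n x (C-1) projection matrix

Y = X[0] @ A                             # y = A^T x applied to each sample (Eq. 4.45)
```

Since $S_w^{-1} S_b$ is not symmetric, `np.linalg.eig` may return tiny imaginary parts; taking the real part is standard here, and a generalized symmetric eigensolver applied to the pair $(S_b, S_w)$ is a numerically preferable alternative.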
4.4.2.2 Nonparametric Linear Discriminant Analysis

The number of features, $C-1$, selected by LDA [39] is suboptimal in the Bayes sense. Therefore, if the estimate of the Bayes error in the feature space ($y$ space) is much larger than that in the original space ($x$ space), the feature extraction process should be augmented. If $\mathrm{tr}(S_w^{-1} S_b)$ is used as the criterion, LDA selects the first $(C-1)$-dimensional subspace, which carries the classification information contained in the scatter of the mean vectors, while the remaining $(n-C+1)$-dimensional subspace, which carries the information due to covariance differences, is neglected, where $n$ is the dimensionality of the $x$ space. Fig. 4.5 shows some of the cases where traditional LDA does not work. Therefore, we need to select additional features from the $(n-C+1)$-dimensional subspace. The basis of nonparametric LDA [39] is the nonparametric formulation of the scatter matrices, using k-nearest-neighbor (kNN) techniques, which measure the between-class and within-class scatter on a local basis; both matrices are then generally of full rank. In addition, the nonparametric nature of the scatter matrices inherently leads to extracted features that preserve the class structure important for accurate classification.
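The local, kNN-based scatter can be sketched for the two-class case as follows. This is a simplified, unweighted version (the full nonparametric formulation in [39] also applies a weighting function to de-emphasize samples far from the class boundary, which is omitted here); function names and parameters are illustrative assumptions:

```python
import numpy as np

def knn_mean(x, other, k):
    """Mean of the k nearest neighbors of x among the rows of `other`."""
    d = np.linalg.norm(other - x, axis=1)
    return other[np.argsort(d)[:k]].mean(axis=0)

def nonparametric_Sb(X1, X2, k=3):
    """Two-class nonparametric between-class scatter (unweighted sketch).

    Each sample is scattered around the local mean of its k nearest
    neighbors in the *other* class, rather than around a single global
    class mean, so the resulting matrix is generally of full rank.
    """
    n = X1.shape[1]
    Sb = np.zeros((n, n))
    for X, other in ((X1, X2), (X2, X1)):
        for x in X:
            d = x - knn_mean(x, other, k)
            Sb += np.outer(d, d)
    return Sb / (len(X1) + len(X2))
```

Because every sample contributes its own local difference vector, the sum of outer products is no longer constrained to a $(C-1)$-dimensional span, which is what allows features beyond the first $C-1$ to be extracted.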