quantities are comparable. Since a pdf is a positive but unbounded (above) function, its likelihood may grow arbitrarily large. It usually grows with the number of parameters. A good precaution is then to make sure that the two classifiers have the same number of parameters and are trained on the same number of datapoints. The latter means that one must have at one's disposal as many examples of the positive class as of the negative class. If this cannot be ensured, one may apply a normalization term to the likelihood of each classifier, or use a threshold on the likelihood of both models to determine when inference is warranted. Despite these caveats, Bayes classifiers remain quite popular, in part because of their extreme simplicity, but also because they yield very good performance in practice.
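
As a rough illustration of this decision rule with a likelihood threshold (a sketch added here, not part of the original notes; the two Gaussian class-conditional densities, the equal priors and the threshold value are illustrative assumptions only):

import numpy as np
from scipy.stats import multivariate_normal

# Two class-conditional densities fitted beforehand
# (simple Gaussians stand in for arbitrary density estimates).
p_plus  = multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2))
p_minus = multivariate_normal(mean=[3.0, 3.0], cov=np.eye(2))

prior_plus, prior_minus = 0.5, 0.5      # equal class priors (assumed)
likelihood_threshold = 1e-4             # abstain when both densities are tiny

def classify(x):
    l_plus  = p_plus.pdf(x)
    l_minus = p_minus.pdf(x)
    if max(l_plus, l_minus) < likelihood_threshold:
        return None                     # inference not warranted for this point
    return +1 if prior_plus * l_plus > prior_minus * l_minus else -1

print(classify([0.2, -0.1]))   # -> +1
print(classify([10.0, 10.0]))  # -> None (both likelihoods below threshold)

Points whose likelihood is negligible under both models are thus left unlabeled rather than forced into one of the two classes.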

Note that Bayes classification can easily be extended to multiclass classification. Assume that you wish to classify a set of datapoints $X = \left\{ x^i \right\}_{i=1}^{M}$ into $K$ classes $C^1, \dots, C^K$. Each label $y^i$ associated to each datapoint $x^i$, $i = 1, \dots, M$, can then take any value $C^1, \dots, C^K$. One can then compute the posterior probability of each class, using:

$$
p\left( y^i = C \mid x^i \right) = \frac{p\left( y = C \right) \, p\left( x^i \mid y = C \right)}{\sum_{k=1}^{K} p\left( y = C^k \right) \, p\left( x^i \mid y = C^k \right)}
\qquad (3.33)
$$
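
As a sketch of how (3.33) is evaluated in practice (added for illustration; the choice of $K = 3$ toy Gaussian class-conditional densities and uniform priors is an assumption, not part of the original notes):

import numpy as np
from scipy.stats import multivariate_normal

# K = 3 class-conditional densities p(x | y = C^k) and priors p(y = C^k).
class_densities = [
    multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2)),
    multivariate_normal(mean=[3.0, 0.0], cov=np.eye(2)),
    multivariate_normal(mean=[0.0, 3.0], cov=np.eye(2)),
]
priors = np.array([1/3, 1/3, 1/3])

def posteriors(x):
    # Numerators p(y = C^k) * p(x | y = C^k), then normalise by the sum over all K classes.
    joint = priors * np.array([d.pdf(x) for d in class_densities])
    return joint / joint.sum()

probs = posteriors([0.5, 0.5])
print(probs, probs.argmax())            # posterior of each class, most probable class index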

We see next how this can be done using Gaussian Mixture Models. Note that we will revisit multi-class classification using kernel methods in Section 5.10.

3.4 Linear classification with Gaussian Mixture Models<br />

The Gaussian Mixture Model, introduced in Section 3.1.5, can be used in conjunction with the Bayes classifier to perform binary classification. For two known classes of datapoints, one can train two separate GMMs, one for each class of datapoints, and use the Bayes classifier to determine the labeling of any given datapoint. In other words, each GMM yields a density, $p_+$ and $p_-$ respectively, which can then be used to determine the label of a new datapoint using (3.32).

To ensure that the two models are comparable, one should train the two GMMs with the same number of Gaussians, using the same modeling (with full, diagonal or isotropic covariance matrix in each case), and using a similar number of training points for each GMM. Figure 3-19 shows one example of such classification using two GMMs with 8 Gaussians each.
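
A minimal sketch of this procedure using scikit-learn's GaussianMixture (the toy data, the priors taken proportional to class sizes, and the choice of 8 full-covariance components mirror the setting described above but are otherwise illustrative assumptions):

import numpy as np
from sklearn.mixture import GaussianMixture

# One GMM per class, same number of components and same covariance type,
# trained on a similar number of points per class.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=[0, 0], scale=1.0, size=(500, 2))   # toy positive class
X_neg = rng.normal(loc=[4, 4], scale=1.0, size=(500, 2))   # toy negative class

gmm_pos = GaussianMixture(n_components=8, covariance_type='full').fit(X_pos)
gmm_neg = GaussianMixture(n_components=8, covariance_type='full').fit(X_neg)

prior_pos = len(X_pos) / (len(X_pos) + len(X_neg))
prior_neg = 1.0 - prior_pos

def label(x):
    # score_samples returns log p(x) under each GMM; apply Bayes' rule in log space.
    log_pos = gmm_pos.score_samples(x.reshape(1, -1))[0] + np.log(prior_pos)
    log_neg = gmm_neg.score_samples(x.reshape(1, -1))[0] + np.log(prior_neg)
    return +1 if log_pos > log_neg else -1

print(label(np.array([0.5, 0.5])))   # -> +1
print(label(np.array([4.2, 3.8])))   # -> -1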

