
5.9.3 Curse of dimensionality, choice of hyperparameters

The above showed that, under particular conditions on the form of the kernel matrix of the GP, GPR and GMR become equivalent. In its generic form, however, GP is not equivalent to GMM and offers a more powerful tool for inference.

Unfortunately, this power comes at a prohibitive computational cost. Gaussian Process is a very expensive method, as the cost of inference grows with the number of datapoints M as O(M³). Advances in the field investigate sparsification techniques that reduce the number of training points in a clever manner. Unfortunately, most of these methods rely on heuristics to determine which points are deemed better than others. As a result, the gain of full GP inference with heuristic-driven sparsification over GMM is no longer obvious.

Another drawback lies in the choice of the kernel function and, in particular, of the kernel width. This is illustrated in Figure 5-18 and in the sketch that follows it. Too small a kernel width may lead to poor generalization, as it allows only points very close to one another to influence inference. On the other hand, too large a kernel width may smooth out local irregularities. In this respect, GMM is a more powerful tool, as it can generalize in areas not covered by the datapoints while still encapsulating the local non-linearities.

Figure 5-18: Effect of the width of a Gaussian kernel on a classification task (SVM, top) and a regression task (GPR, bottom). When the kernel width (gamma) is very small, generalization becomes very poor (i.e. the system overfits the data and is unable to estimate correctly on samples that have never been seen). Choosing appropriate parameters for the kernel depends on the data and is one of the main challenges in kernel methods.

[DEMOS\CLASSIFICATION\SVM-KERNEL-WIDTH.ML] [DEMOS\REGRESSION\GPR-KERNEL-WIDTH.ML]
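The two failure modes in the figure can be reproduced by running the GPR sketch above with a very narrow and a very wide kernel on toy data (the dataset and the width values below are made up for illustration):

```python
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(30, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(30)
X_grid = np.linspace(-4.0, 4.0, 200)[:, None]

# Too narrow: each training point only influences its immediate
# neighbourhood, so the predictive mean collapses to zero between
# samples (overfitting, poor generalization).
y_narrow = gpr_predict(X, y, X_grid, width=0.05)

# Too wide: distant points dominate and local irregularities
# of the underlying function are smoothed out.
y_wide = gpr_predict(X, y, X_grid, width=5.0)
```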

