5.9.3 Curse of dimensionality, choice of hyperparameters
The above showed that, under particular conditions on the form of the kernel matrix used in GP, GPR and GMR become equivalent. In its generic form, however, GP is not equivalent to GMM and offers a more powerful tool for inference.
Unfortunately, this comes at the expense of prohibitive computational costs. Gaussian Process inference is very expensive, as its complexity grows with the number of datapoints M as O(M³). Advances in the field investigate sparsification techniques to reduce the number of training points in a clever manner. Unfortunately, most of these methods rely on heuristics to determine which points are deemed more informative than others. As a result, the gain of performing full GP inference with heuristic-driven sparsification over GMM is no longer obvious.
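To make this cost concrete, the sketch below implements exact GPR with a squared-exponential kernel in plain NumPy; the Cholesky factorization of the M x M kernel matrix is the O(M³) step that dominates inference. The helper names, kernel width and noise level are illustrative assumptions, not taken from the text.

    import numpy as np

    def rbf_kernel(A, B, width):
        # Squared-exponential kernel k(x, x') = exp(-||x - x'||^2 / (2 width^2))
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-d2 / (2.0 * width**2))

    def gpr_predict(X, y, X_star, width=1.0, noise=1e-2):
        # Exact GP regression: factorizing the M x M kernel matrix is the
        # step whose cost grows as O(M^3) with the number of datapoints.
        K = rbf_kernel(X, X, width) + noise * np.eye(len(X))
        L = np.linalg.cholesky(K)                              # O(M^3)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
        K_star = rbf_kernel(X_star, X, width)
        mean = K_star @ alpha                                  # predictive mean
        v = np.linalg.solve(L, K_star.T)
        var = 1.0 - np.sum(v**2, axis=0)                       # predictive variance
        return mean, var

Since the factorization is cubic in M, doubling the number of training points multiplies its cost roughly by eight, which is what makes sparsification attractive in the first place.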
Another drawback lies in the choice of the kernel function and, in particular, of the kernel width. This is illustrated in Figure 5-18. Too small a kernel width may lead to poor generalization, as it allows only points very close to one another to influence inference. On the other hand, too large a kernel width may smooth out local irregularities. In this respect, GMM is a more powerful tool to provide generalization in areas not covered by the datapoints, while still encapsulating the local non-linearities.
Figure 5-18: Effect of the width of a Gaussian kernel on a classification task (SVM, top) and a regression task (GPR, bottom). When the kernel width (gamma) is very small, generalization becomes very poor (i.e. the system overfits the data and is unable to estimate correctly on samples that have never been seen). Choosing appropriate parameters for the kernel depends on the data and is one of the main challenges in kernel methods.
[DEMOS\CLASSIFICATION\SVM-KERNEL-WIDTH.ML] [DEMOS\REGRESSION\GPR-KERNEL-WIDTH.ML]
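The overfitting and over-smoothing regimes shown in the figure can also be reproduced numerically. The following sketch (plain NumPy, toy data drawn from sin(2x); the widths, noise level and sample sizes are illustrative assumptions) evaluates the GPR predictive mean for three kernel widths:

    import numpy as np

    def rbf(A, B, width):
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-d2 / (2.0 * width**2))

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, (30, 1))                 # 30 noisy 1-D training samples
    y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(30)
    X_star = np.linspace(-4, 4, 9)[:, None]         # test points, some outside the data

    for width in (0.05, 0.5, 5.0):
        K = rbf(X, X, width) + 1e-2 * np.eye(len(X))
        mean = rbf(X_star, X, width) @ np.linalg.solve(K, y)
        print(f"width={width}: {np.round(mean, 2)}")

    # width = 0.05: the predictive mean collapses to ~0 away from the training
    #               samples, since only near-identical points influence inference.
    # width = 5.0 : the mean is flattened and the local structure of sin(2x) is
    #               smoothed out; width = 0.5 gives a reasonable compromise.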