Non-linear case
The non-linear case of classification using Gaussian Processes proceeds similarly to Gaussian Process Regression. Instead of putting a prior on the weights as in the linear case, we put a prior on the latent function f(x), such that we have

p(y = +1 \mid x) = \frac{1}{1 + e^{-f(x)}}        (5.94)

This ensures that the output is bounded between 0 and +1 and can hence be interpreted as a probability (as in the linear case); see Figure 5-20.
Figure 5-20: Example of an arbitrary prior function f(x) (here composed of the superposition of two Gaussians). Applying the sigmoid function on f(x) flattens the function, while normalizing it between 0 and +1.
One can now build an estimate of the class label y* for a query point x* by computing the posterior distribution of the function f(x) applied to our query point. If we make this a distribution that is a function of the training datapoints, we have p(f(x*) | x*, X, Y), and the posterior distribution we want to compute is given by:

p(y^* \mid x^*, X, Y) = \int \mathrm{sigmoid}\!\left(f(x^*)\right) \, p\!\left(f(x^*) \mid x^*, X, Y\right) \, df^*        (5.95)
The integral on the right-hand side averages over all values of our prior on f(x). While in GPR there was an analytical solution, in the classification case the integral is usually analytically intractable. To solve this, one must use either analytic approximations of the integral, or solutions based on Monte Carlo sampling, as sketched below.
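As a minimal sketch of the Monte Carlo route, assuming the posterior p(f(x*) | x*, X, Y) has already been approximated by a Gaussian with mean mu_star and variance var_star (for instance via a Laplace approximation), one can estimate the integral in Eq. (5.95) by averaging the sigmoid over samples of the latent function; the function name and parameters below are hypothetical.

import numpy as np

def sigmoid(f):
    return 1.0 / (1.0 + np.exp(-f))

def predict_class_probability(mu_star, var_star, n_samples=10000, seed=0):
    """Monte Carlo estimate of the predictive integral in Eq. (5.95).

    mu_star and var_star are the mean and variance of an assumed Gaussian
    approximation to p(f(x*) | x*, X, Y); obtaining them (e.g. via a Laplace
    or Expectation Propagation approximation) is not shown here.
    """
    rng = np.random.default_rng(seed)
    f_star = rng.normal(mu_star, np.sqrt(var_star), size=n_samples)
    # Averaging sigmoid(f*) over samples of f* approximates the integral.
    return sigmoid(f_star).mean()

# Example: a query point whose approximate latent posterior is N(0.8, 0.5**2)
p_plus = predict_class_probability(0.8, 0.25)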