MACHINE LEARNING TECHNIQUES - LASA
In this case the covariance of the prior becomes:

$$\operatorname{cov}(y) = \operatorname{cov}\big(f(X) + \epsilon\big) = K(X, X) + \sigma^2 I,$$

where $y$ is a matrix whose columns $y^i$, $i = 1 \ldots M$, correspond to the projection of the associated training point $x^i$ through (5.84).
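As a minimal sketch of this noisy prior covariance, the snippet below builds $K(X, X) + \sigma^2 I$ with NumPy. The squared-exponential (RBF) kernel, the helper names `rbf_kernel` and `noisy_prior_covariance`, the length scale and the toy inputs are assumptions made for illustration only; the text leaves the kernel generic. Training points are stored as rows here, a NumPy convenience rather than the column convention of the text.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel matrix between rows of A (n, d) and B (m, d).

    Illustrative choice: the notes do not fix a particular kernel k(., .).
    """
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)

def noisy_prior_covariance(X, sigma, length_scale=1.0):
    """cov(y) = K(X, X) + sigma^2 I for M training points stored as rows of X."""
    M = X.shape[0]
    return rbf_kernel(X, X, length_scale) + sigma**2 * np.eye(M)

# Toy data (hypothetical): M = 5 one-dimensional training inputs.
X_train = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
cov_y = noisy_prior_covariance(X_train, sigma=0.1)
```

With this kernel the diagonal of `cov_y` is $1 + \sigma^2$, so the noise term also keeps the matrix well conditioned when it is inverted later.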
As done previously, we can express the joint distribution of the prior $y$ (which now includes the noise on the training datapoints) and of the function values $f^*$ at the testing points:
$$\begin{bmatrix} y \\ f^* \end{bmatrix} \sim N\left(0,\; \begin{bmatrix} K(X, X) + \sigma^2 I & K(X, X^*) \\ K(X^*, X) & K(X^*, X^*) \end{bmatrix}\right) \qquad (5.85)$$
Again, one can compute the conditional distribution of $f^*$ given the training datapoints $X$, the testing datapoints $X^*$ and the noisy prior $y$:
$$\begin{aligned}
f^* \mid X^*, X, y &\sim N\big(\bar{f}^*, \operatorname{cov}(f^*)\big) \\
\bar{f}^* = E\{f^* \mid X^*, X, y\} &= K(X^*, X)\,\big[K(X, X) + \sigma^2 I\big]^{-1} y \\
\operatorname{cov}(f^*) &= K(X^*, X^*) - K(X^*, X)\,\big[K(X, X) + \sigma^2 I\big]^{-1} K(X, X^*)
\end{aligned} \qquad (5.86)$$
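Equation (5.86) can be sketched numerically as follows. This is a minimal NumPy illustration under stated assumptions, not the implementation used in these notes: the RBF kernel, the function names `rbf_kernel` and `gp_posterior`, the hyperparameter values and the toy sine data are all hypothetical. Points are stored as rows, and the linear systems are solved with `np.linalg.solve` instead of forming the inverse explicitly.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel matrix between rows of A and rows of B (assumed kernel)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)

def gp_posterior(X, y, X_star, sigma, length_scale=1.0):
    """Mean and covariance of f* | X*, X, y, following Equation (5.86)."""
    K = rbf_kernel(X, X, length_scale)                # K(X, X)
    K_s = rbf_kernel(X_star, X, length_scale)         # K(X*, X)
    K_ss = rbf_kernel(X_star, X_star, length_scale)   # K(X*, X*)
    A = K + sigma**2 * np.eye(X.shape[0])             # K(X, X) + sigma^2 I
    mean = K_s @ np.linalg.solve(A, y)                # K(X*, X) A^{-1} y
    cov = K_ss - K_s @ np.linalg.solve(A, K_s.T)      # K(X*, X*) - K(X*, X) A^{-1} K(X, X*)
    return mean, cov

# Toy data (hypothetical): noisy samples of a sine function.
rng = np.random.default_rng(0)
X_train = np.linspace(-3.0, 3.0, 8).reshape(-1, 1)
y_train = np.sin(X_train[:, 0]) + 0.1 * rng.standard_normal(8)
X_test = np.array([[-1.5], [0.0], [2.5]])

f_mean, f_cov = gp_posterior(X_train, y_train, X_test, sigma=0.1)
```

The diagonal of `f_cov` gives the predictive variance at each testing point; it can never exceed the prior variance on the diagonal of $K(X^*, X^*)$, since the subtracted term is positive semi-definite.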
We are usually interested only in the response of the model to a single query point $x^*$.
In this case, the estimate of the associated output $y^*$ is given by:
$$y^* \simeq \bar{f}^* = E\{f^* \mid x^*, X, y\} = k(x^*, X)^T \big[K(X, X) + \sigma^2 I\big]^{-1} y \qquad (5.87)$$
$k(x^*, X)$ is the vector of covariances $k(x^*, x^i)$ between the query point and the $M$ training data points $x^i$, $i = 1 \ldots M$.
Since all the training pairs $(x^i, y^i)$, $i = 1 \ldots M$, are given, these can be treated as parameters to the system and hence the prediction on $y^*$ from Equation (5.87) can be expressed as a linear combination of kernel functions $k(x^*, x^i)$:
$$y^* \simeq \bar{f}^* = E\{f^* \mid x^*, X, y\} = \sum_{i=1}^{M} \alpha_i \, k(x^*, x^i), \quad \text{with } \alpha = \big[K(X, X) + \sigma^2 I\big]^{-1} y \qquad (5.88)$$
We have $M$ kernel functions, one for each of the $M$ training points $x^i$.
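The kernel expansion of Equation (5.88) can be illustrated with a short sketch. The RBF kernel, the function names `rbf`, `fit_alpha` and `predict`, and the toy training pairs below are assumptions chosen for illustration; any valid kernel could be substituted.

```python
import numpy as np

def rbf(x, x_prime, length_scale=1.0):
    """Kernel k(x, x') between two single points (illustrative RBF choice)."""
    diff = np.asarray(x, dtype=float) - np.asarray(x_prime, dtype=float)
    return float(np.exp(-0.5 * np.dot(diff, diff) / length_scale**2))

def fit_alpha(X, y, sigma, length_scale=1.0):
    """alpha = [K(X, X) + sigma^2 I]^{-1} y, computed once from the training pairs."""
    M = len(X)
    K = np.array([[rbf(X[i], X[j], length_scale) for j in range(M)]
                  for i in range(M)])
    return np.linalg.solve(K + sigma**2 * np.eye(M), y)

def predict(x_star, X, alpha, length_scale=1.0):
    """y* as the weighted sum of the M kernel functions k(x*, x^i)."""
    return sum(a * rbf(x_star, x_i, length_scale) for a, x_i in zip(alpha, X))

# Toy training pairs (hypothetical values).
X_train = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0]])
y_train = np.array([0.1, -0.8, 0.0, 0.9, 0.2])

alpha = fit_alpha(X_train, y_train, sigma=0.1)
y_star = predict(np.array([0.5]), X_train, alpha)
```

Because $\alpha$ depends only on the training pairs, it is computed once; each new query then costs only $M$ kernel evaluations and a weighted sum.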
© A.G.Billard 2004 – Last Update March 2011