
4.4.2 Multi-Gaussian Case

It is easy to extend the previous result to a mixture of Gaussians, where $p(x, y)$ is given by Equation (4.18). Recall that each joint density $p_k(y, x)$, $k = 1, \dots, K$, can be decomposed as $p_k(y, x) = p_k(y \mid x) \cdot p_k(x)$ (Bayes' Theorem). The joint density of the complete mixture is then given by:

$$p(y, x) = \sum_{k=1}^{K} \alpha_k \cdot p_k\big(y \mid x;\ \tilde{\mu}_k(x), \tilde{\Sigma}_k\big) \cdot p_k\big(x \mid \mu_k^{X}, \Sigma_k^{XX}\big) \qquad (4.25)$$

with

$$\tilde{\mu}_k(x) = \mu_k^{Y} + \Sigma_k^{YX}\big(\Sigma_k^{XX}\big)^{-1}\big(x - \mu_k^{X}\big) \quad\text{and}\quad \tilde{\Sigma}_k = \Sigma_k^{Y} - \Sigma_k^{YX}\big(\Sigma_k^{XX}\big)^{-1}\Sigma_k^{XY}.$$
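To make the block notation concrete, below is a minimal numerical sketch of this per-component conditioning (NumPy is assumed; the parameter names `mu_x`, `mu_y`, `S_xx`, `S_yx`, `S_yy` are illustrative, not from the text):

```python
# Minimal sketch of the per-component conditioning used in (4.25).
# Parameter names are illustrative placeholders for the block means/covariances.
import numpy as np

def conditional_params(x, mu_x, mu_y, S_xx, S_yx, S_yy):
    """Return (mu_tilde(x), Sigma_tilde) for one Gaussian component."""
    S_xx_inv = np.linalg.inv(S_xx)
    mu_tilde = mu_y + S_yx @ S_xx_inv @ (x - mu_x)      # conditional mean, linear in x
    Sigma_tilde = S_yy - S_yx @ S_xx_inv @ S_yx.T       # conditional covariance, independent of x
    return mu_tilde, Sigma_tilde
```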

The marginal density of $x$ is given by:

$$p\big(x \mid \mu^{X}, \Sigma^{XX}\big) = \sum_{k=1}^{K} \alpha_k\, p_k\big(x \mid \mu_k^{X}, \Sigma_k^{XX}\big) \qquad (4.26)$$

Replacing (4.26) in (4.25), we can express the conditional probability of $y$ given $x$:

$$p(y \mid x) = \sum_{k=1}^{K} w_k(x) \cdot p_k(y \mid x) \qquad (4.27)$$

where

$$w_k(x) = \frac{\alpha_k\, p_k\big(x \mid \mu_k^{X}, \Sigma_k^{XX}\big)}{\sum_{j=1}^{K} \alpha_j\, p_j\big(x \mid \mu_j^{X}, \Sigma_j^{XX}\big)}.$$

Equation (4.27) forms the core of the GMR model. One can then compute explicitly the expectation and variance of the above conditional, similarly to what we did for the probabilistic regression.
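For illustration, a minimal sketch of how the weights $w_k(x)$ of (4.27) could be evaluated (SciPy's `multivariate_normal` is assumed for the Gaussian density; the container names `priors`, `mus_x`, `Ss_xx` are illustrative):

```python
# Sketch of the mixing weights w_k(x) of (4.27); the normalizing sum is the marginal (4.26).
import numpy as np
from scipy.stats import multivariate_normal

def gmr_weights(x, priors, mus_x, Ss_xx):
    """priors: array of alpha_k; mus_x, Ss_xx: per-component x-marginal parameters."""
    lik = np.array([multivariate_normal.pdf(x, mean=m, cov=S)
                    for m, S in zip(mus_x, Ss_xx)])
    w = priors * lik
    return w / w.sum()   # normalize so the weights sum to one
```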

$$E\{p(y \mid x)\} = \sum_{k=1}^{K} w_k(x) \cdot \tilde{\mu}_k(x) = \sum_{k=1}^{K} w_k(x) \cdot \Big(\mu_k^{Y} + \Sigma_k^{YX}\big(\Sigma_k^{XX}\big)^{-1}\big(x - \mu_k^{X}\big)\Big) \qquad (4.28)$$

The expectation is thus a non-linear combination of the expectation of each local component. In effect, the regression signal from GMR is the result of a non-linear weighting of local linear regressions.
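A matching sketch of the expectation (4.28), reusing the illustrative names above and solving the linear system instead of forming the inverse explicitly:

```python
# Sketch of the GMR expectation (4.28): a weighted sum of local linear regressors.
import numpy as np

def gmr_mean(x, w, mus_x, mus_y, Ss_xx, Ss_yx):
    """w: weights w_k(x); remaining arguments: per-component block parameters."""
    terms = [mus_y[k] + Ss_yx[k] @ np.linalg.solve(Ss_xx[k], x - mus_x[k])
             for k in range(len(w))]                      # local linear predictions mu_tilde_k(x)
    return sum(w[k] * terms[k] for k in range(len(w)))    # weight each prediction by w_k(x)
```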

$$\mathrm{var}\{p(y \mid x)\} = \sum_{k=1}^{K} w_k(x) \cdot \Big(\big(\tilde{\mu}_k(x)\big)^{2} + \tilde{\Sigma}_k\Big) - \Big(\sum_{k=1}^{K} w_k(x) \cdot \tilde{\mu}_k(x)\Big)^{2} \qquad (4.29)$$

The variance of GMR is no longer a simple function increasing with the amplitude of the input $x$ (as in probabilistic regression). Rather, it is modulated by the variance of each component locally and hence carries across a notion of local variance.

A schematic of these variables is shown in Figure 4-3.
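Finally, a sketch of the variance (4.29), written for a one-dimensional output $y$ so that it follows the scalar form of the equation; `mu_tildes` and `var_tildes` are assumed to hold the per-component $\tilde{\mu}_k(x)$ and $\tilde{\Sigma}_k$ computed as above:

```python
# Sketch of the GMR variance (4.29) for scalar y:
# second moment of the mixture minus the squared mixture mean.
def gmr_variance(w, mu_tildes, var_tildes):
    """w: weights w_k(x); mu_tildes: conditional means; var_tildes: conditional variances."""
    second_moment = sum(wk * (mk ** 2 + vk) for wk, mk, vk in zip(w, mu_tildes, var_tildes))
    mean = sum(wk * mk for wk, mk in zip(w, mu_tildes))
    return second_moment - mean ** 2
```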

