4.4.2 Multi-Gaussian Case
It is easy to extend the previous result to a mixture of Gaussians, where $p(x, y)$ is given by Equation (4.18). Recall that each joint density $p_k(y, x)$, $k = 1, \dots, K$, can be decomposed as $p_k(y, x) = p_k(y \mid x) \cdot p_k(x)$ (Bayes' Theorem). The joint density of the complete mixture is then given by:
$$
p(y, x) = \sum_{k=1}^{K} \alpha_k \, p_k\big(y \mid x;\, \tilde{\mu}_k(x), \tilde{\Sigma}_k\big) \cdot p_k\big(x \mid \mu_k^{X}, \Sigma_k^{XX}\big) \qquad (4.25)
$$

with $\tilde{\mu}_k(x) = \mu_k^{Y} + \Sigma_k^{YX}\big(\Sigma_k^{XX}\big)^{-1}\big(x - \mu_k^{X}\big)$ and $\tilde{\Sigma}_k = \Sigma_k^{YY} - \Sigma_k^{YX}\big(\Sigma_k^{XX}\big)^{-1}\Sigma_k^{XY}$.
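As a concrete illustration (the code below is our own minimal sketch, not part of the original notes; the function and variable names are arbitrary conventions), the per-component conditional parameters $\tilde{\mu}_k(x)$ and $\tilde{\Sigma}_k$ map directly onto a few lines of NumPy:

```python
import numpy as np

def conditional_params(x, mu_x, mu_y, sigma_xx, sigma_yx, sigma_yy):
    """For each Gaussian component k, condition on the input x:
    mu_tilde_k(x)  = mu_k^Y + Sigma_k^YX (Sigma_k^XX)^-1 (x - mu_k^X)
    Sigma_tilde_k  = Sigma_k^YY - Sigma_k^YX (Sigma_k^XX)^-1 Sigma_k^XY
    All arguments are sequences indexed by the component k."""
    mu_t, sig_t = [], []
    for k in range(len(mu_x)):
        gain = sigma_yx[k] @ np.linalg.inv(sigma_xx[k])   # Sigma^YX (Sigma^XX)^-1
        mu_t.append(mu_y[k] + gain @ (x - mu_x[k]))
        sig_t.append(sigma_yy[k] - gain @ sigma_yx[k].T)  # Sigma^XY = (Sigma^YX)^T
    return np.array(mu_t), np.array(sig_t)
```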
The marginal density of $x$ is given by:

$$
p(x) = \sum_{k=1}^{K} \alpha_k \, p_k\big(x \mid \mu_k^{X}, \Sigma_k^{XX}\big) \qquad (4.26)
$$
Replacing (4.26) in (4.25), we can express the conditional probability of $y$ given $x$:

$$
p(y \mid x) = \sum_{k=1}^{K} w_k(x) \cdot p_k(y \mid x) \qquad (4.27)
$$

where

$$
w_k(x) = \frac{\alpha_k \, p_k\big(x \mid \mu_k^{X}, \Sigma_k^{XX}\big)}{\sum_{j=1}^{K} \alpha_j \, p_j\big(x \mid \mu_j^{X}, \Sigma_j^{XX}\big)}.
$$
Equation (4.27) forms the core of the GMR model. One can then compute explicitly the expectation and variance of the above conditional, similarly to what we did for the probabilistic regression.
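Before doing so, a minimal numerical sketch of the weights $w_k(x)$ may help; it assumes SciPy's `multivariate_normal` for the component densities and reuses the conventions of the sketch above (again, illustrative code rather than part of the original notes):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmr_weights(x, priors, mu_x, sigma_xx):
    """Eq. (4.27): w_k(x) = alpha_k p_k(x | mu_k^X, Sigma_k^XX)
                          / sum_j alpha_j p_j(x | mu_j^X, Sigma_j^XX)."""
    lik = np.array([priors[k] * multivariate_normal.pdf(x, mean=mu_x[k], cov=sigma_xx[k])
                    for k in range(len(priors))])
    return lik / lik.sum()  # weights sum to one over the K components
```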
$$
E\{p(y \mid x)\} = \sum_{k=1}^{K} w_k(x) \cdot \tilde{\mu}_k(x) = \sum_{k=1}^{K} w_k(x) \cdot \Big(\mu_k^{Y} + \Sigma_k^{YX}\big(\Sigma_k^{XX}\big)^{-1}\big(x - \mu_k^{X}\big)\Big) \qquad (4.28)
$$
The expectation is thus a non-linear combination of the expectation of each local component. In effect, the regression signal from GMR is the result of a non-linear weighting of local linear regressions.
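Reusing `gmr_weights` and `conditional_params` from the sketches above, Equation (4.28) reduces to a single weighted sum (again an illustrative sketch, not the notes' own code):

```python
import numpy as np

def gmr_expectation(x, priors, mu_x, mu_y, sigma_xx, sigma_yx, sigma_yy):
    """Eq. (4.28): E{p(y|x)} = sum_k w_k(x) mu_tilde_k(x),
    i.e. a nonlinear blending of K local linear regressors."""
    w = gmr_weights(x, priors, mu_x, sigma_xx)                       # Eq. (4.27)
    mu_t, _ = conditional_params(x, mu_x, mu_y, sigma_xx, sigma_yx, sigma_yy)
    return w @ mu_t                                                  # (K,) @ (K, dy) -> (dy,)
```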
$$
\mathrm{var}\{p(y \mid x)\} = \sum_{k=1}^{K} w_k(x)^2 \cdot \Big(\big(\tilde{\mu}_k(x)\big)^2 + \tilde{\Sigma}_k\Big) - \Big(\sum_{k=1}^{K} w_k(x) \cdot \tilde{\mu}_k(x)\Big)^2 \qquad (4.29)
$$
The variance of GMR is no longer a simple function increasing with the amplitude of the input $x$ (as in probabilistic regression). Rather, it is modulated by the variance of each component locally and hence carries across a notion of local variance.
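For completeness, here is a sketch of Equation (4.29), restricted for simplicity to a one-dimensional output $y$ (the scalar restriction is ours; the helpers are reused from the sketches above):

```python
import numpy as np

def gmr_variance(x, priors, mu_x, mu_y, sigma_xx, sigma_yx, sigma_yy):
    """Eq. (4.29), for scalar y:
    var{p(y|x)} = sum_k w_k(x)^2 ((mu_tilde_k(x))^2 + Sigma_tilde_k)
                  - (sum_k w_k(x) mu_tilde_k(x))^2."""
    w = gmr_weights(x, priors, mu_x, sigma_xx)
    mu_t, sig_t = conditional_params(x, mu_x, mu_y, sigma_xx, sigma_yx, sigma_yy)
    mu_t, sig_t = mu_t.ravel(), sig_t.ravel()  # scalar output: flatten (K,1) arrays
    return float(np.sum(w**2 * (mu_t**2 + sig_t)) - (w @ mu_t)**2)
```

Evaluating this along a grid of inputs makes the locally modulated variance profile described above easy to visualize.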
A schematic of these variables is shown in Figure 4-3.