9.2.7 Statistical Independence
If $y_1$ and $y_2$ are two variables generated by zero-mean random distributions $p_i(y_i)$, then $y_1$ and $y_2$ are said to be mutually independent if their joint density $p(y_1, y_2)$ is equal to the product of their density functions:

$$p(y_1, y_2) = p(y_1)\,p(y_2) \qquad (8.25)$$
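As a quick numerical illustration of Eq. (8.25), the following minimal sketch (assuming NumPy; the chosen distributions, sample size, and binning are arbitrary) shows that the empirical joint histogram of two independently drawn variables approximately factorizes into the product of their marginal histograms:

```python
import numpy as np

rng = np.random.default_rng(0)
y1 = rng.normal(0.0, 1.0, size=1_000_000)   # zero-mean Gaussian
y2 = rng.uniform(-1.0, 1.0, size=1_000_000) # zero-mean uniform, drawn independently

bins = np.linspace(-3, 3, 31)
p_joint, _, _ = np.histogram2d(y1, y2, bins=[bins, bins], density=True)
p1, _ = np.histogram(y1, bins=bins, density=True)
p2, _ = np.histogram(y2, bins=bins, density=True)

# Largest deviation between the estimated p(y1, y2) and p(y1) * p(y2)
# over all bins; it shrinks toward zero as the sample size grows.
print(np.max(np.abs(p_joint - np.outer(p1, p2))))
```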
9.2.8 Uncorrelatedness
$y_1$ and $y_2$ are said to be uncorrelated if their covariance is zero, $\operatorname{cov}(y_1, y_2) = 0$. If the variables are generated by distributions with zero means, then uncorrelatedness is equivalent to the expectation of their product equaling the product of their separate expectations:

$$\operatorname{cov}(y_1, y_2) = E(y_1 y_2) - E(y_1)\,E(y_2) = 0 \qquad (8.26)$$

$$E(y_1 y_2) = E(y_1)\,E(y_2) \qquad (8.27)$$
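A minimal numerical check of Eqs. (8.26) and (8.27) (a sketch assuming NumPy; the seed and sample size are arbitrary) estimates the covariance of two independently drawn zero-mean variables:

```python
import numpy as np

rng = np.random.default_rng(1)
y1 = rng.normal(size=100_000)
y2 = rng.normal(size=100_000)

# Empirical version of Eq. (8.26): E(y1*y2) - E(y1)E(y2).
cov = np.mean(y1 * y2) - np.mean(y1) * np.mean(y2)
print(cov)  # ~0 up to sampling noise, so Eq. (8.27) holds empirically
```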
Independence is in general a much stronger requirement than uncorrelatedness. Indeed, if the $y_i$ are independent, one has

$$E\big(f_1(y_1)\,f_2(y_2)\big) = E\big(f_1(y_1)\big)\,E\big(f_2(y_2)\big) \qquad (8.28)$$

for any functions $f_1$ and $f_2$.
Independence and uncorrelatedness are equivalent when $y_1$ and $y_2$ have a joint Gaussian distribution. Whether uncorrelatedness implies independence thus depends on the probability distribution of the variables in question; a classic counterexample for non-Gaussian variables is sketched below.
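The following sketch (assuming NumPy; the construction $y_2 = y_1^2$ and the test functions $f_1(y) = f_2(y) = y^2$ are illustrative choices) builds a pair that is uncorrelated yet dependent, so the factorization of Eq. (8.28) fails:

```python
import numpy as np

rng = np.random.default_rng(2)
y1 = rng.normal(size=1_000_000)  # standard normal, zero mean
y2 = y1 ** 2                     # deterministic function of y1 -> dependent

# Uncorrelated: E(y1*y2) = E(y1**3) = 0, so the covariance (8.26) vanishes.
print(np.mean(y1 * y2) - np.mean(y1) * np.mean(y2))  # ~0 up to sampling noise

# Not independent: with f1(y) = f2(y) = y**2, Eq. (8.28) is violated, since
# E(y1**2 * y2**2) = E(y1**6) = 15 while E(y1**2) * E(y2**2) = 1 * 3 = 3.
lhs = np.mean((y1 ** 2) * (y2 ** 2))
rhs = np.mean(y1 ** 2) * np.mean(y2 ** 2)
print(lhs, rhs)  # clearly unequal
```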
9.3 Information Theory
The mathematical formalism for measuring the information content $I$ of an arbitrary measurement is due largely to a seminal 1948 paper by Claude E. Shannon. Within the context of sending and receiving messages in a communication system, Shannon was interested in finding a measure of the information content of a received message. Shannon's measure of information is built on the intuition that an unlikely event carries more information than a likely event. The information content of an outcome $x$ is defined to be

$$I(x) = \log_2 \frac{1}{P(x)} = -\log_2 P(x) \qquad (8.29)$$

where $P(x)$ is the probability of $x$.
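A minimal sketch of Eq. (8.29) (the function name and example probabilities are illustrative), computing the information content of an outcome in bits:

```python
import math

def information_content(p: float) -> float:
    """I(x) = -log2 P(x), in bits, for an outcome with probability p."""
    return -math.log2(p)

print(information_content(0.5))    # fair coin flip: 1 bit
print(information_content(1 / 6))  # one face of a fair die: ~2.585 bits
print(information_content(0.01))   # rare event: ~6.64 bits (more informative)
```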