
9.2.7 Statistical Independence

If y_1 and y_2 are two variables, each generated by a zero-mean random distribution p_i(y_i), then y_1 and y_2 are said to be mutually independent if their joint density p(y_1, y_2) is equal to the product of their density functions:

    p(y_1, y_2) = p(y_1) p(y_2)                                   (8.25)
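As a rough numerical illustration of Eq. 8.25 (added here, not part of the original text), the sketch below draws samples from two independent zero-mean variables and checks that their empirical joint histogram approximately factorizes into the product of the marginals; the distributions, sample size and bin count are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    # Two independent zero-mean variables (arbitrary illustrative choices).
    y1 = rng.normal(0.0, 1.0, n)
    y2 = rng.uniform(-1.0, 1.0, n)

    # Empirical joint probability table on a coarse grid.
    joint, _, _ = np.histogram2d(y1, y2, bins=10)
    joint /= joint.sum()               # P(y1 in bin i, y2 in bin j)
    p1 = joint.sum(axis=1)             # marginal of y1 over its bins
    p2 = joint.sum(axis=0)             # marginal of y2 over its bins

    # For independent variables the joint is close to the outer product (Eq. 8.25).
    print(np.abs(joint - np.outer(p1, p2)).max())   # small, up to sampling noise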

9.2.8 Uncorrelatedness

y_1 and y_2 are said to be uncorrelated if their covariance is zero, cov(y_1, y_2) = 0. If the variables are generated by distributions with zero means, then uncorrelatedness means that the expectation of their product equals the product of their separate expectations:

    cov(y_1, y_2) = E(y_1 y_2) - E(y_1) E(y_2) = 0                (8.26)

    E(y_1 y_2) = E(y_1) E(y_2)                                    (8.27)
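A minimal numerical check of Eqs. 8.26 and 8.27 (an added sketch; the distributions and sample size are arbitrary choices, not from the text): for independent zero-mean samples, the sample covariance is close to zero and the expectation of the product matches the product of the expectations.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200_000

    # Independent, zero-mean samples (arbitrary illustrative choices).
    y1 = rng.normal(0.0, 1.0, n)
    y2 = rng.normal(0.0, 2.0, n)

    cov = np.mean(y1 * y2) - np.mean(y1) * np.mean(y2)    # Eq. 8.26
    print(cov)                                            # ~0 up to sampling noise
    print(np.mean(y1 * y2), np.mean(y1) * np.mean(y2))    # Eq. 8.27: both sides ~0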

Independence is in general a much stronger requirement than uncorrelatedness. Indeed, if the y_i are independent, one has

    E(f_1(y_1) f_2(y_2)) = E(f_1(y_1)) E(f_2(y_2))                (8.28)

for any functions f_1 and f_2.

Independence and uncorrelatedness are equivalent when y_1 and y_2 have a joint Gaussian distribution. The notion of independence thus depends on the probability distribution of the variables considered.
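To see why independence is stronger than uncorrelatedness, the following sketch (an added example; the construction y_2 = y_1^2 - 1 is not from the text) builds two zero-mean variables that are uncorrelated yet clearly dependent: Eq. 8.28 fails for f_1(y) = y^2 and f_2(y) = y.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200_000

    # y1 is symmetric around zero; y2 = y1**2 - 1 is zero-mean but a function of y1.
    y1 = rng.normal(0.0, 1.0, n)
    y2 = y1**2 - 1.0

    # Uncorrelated: E(y1*y2) - E(y1)E(y2) ~ 0, since E(y1**3) = 0 for a symmetric density.
    print(np.mean(y1 * y2) - np.mean(y1) * np.mean(y2))

    # Not independent: Eq. 8.28 fails for f1(y) = y**2, f2(y) = y.
    print(np.mean(y1**2 * y2), np.mean(y1**2) * np.mean(y2))   # ~2 vs. ~0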

9.3 Information Theory

The mathematical formalism for measuring the information content, I, of an arbitrary measurement is due largely to a seminal 1948 paper by Claude E. Shannon. Within the context of sending and receiving messages in a communication system, Shannon was interested in finding a measure of the information content of a received message. Shannon's measure of information captures the intuition that an unlikely event carries more information than a likely one. The information content of an outcome x is defined to be:

    I(x) = log_2 (1 / P(x)) = -log_2 P(x)                         (8.29)

where P(x) is the probability of x.
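As a small illustration of Eq. 8.29 (added here; the probability values are arbitrary), the snippet below computes the information content, in bits, of a few outcomes and shows that rarer outcomes carry more information.

    import math

    def information_content(p: float) -> float:
        """Self-information I(x) = -log2 P(x), in bits (Eq. 8.29)."""
        return -math.log2(p)

    # Arbitrary example probabilities: the less likely the outcome, the more bits.
    for p in (0.5, 0.25, 0.01):
        print(f"P(x) = {p:<5}  ->  I(x) = {information_content(p):.2f} bits")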

