
Shannon's information measure is expressed in binary units (bits), i.e. it is computed with the base-2 logarithm. To get an intuitive feeling for this concept, consider the case where the probability of x is uniform over the integers in the interval [1,2]. Then, the information conveyed in the observed event x=1 is equal to:

$$ I(x=1) = -\log_2 \frac{1}{2} = \log_2 2 = 1 $$

If, on the other hand, x is uniformly distributed over the integers in the interval [1,8], each of the occurrences of x becomes four times less likely, and the information conveyed in observing x=1 is three times larger:

$$ I(x=1) = -\log_2 \frac{1}{8} = \log_2 8 = \log_2 2^3 = 3 $$
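As a quick numerical check of these two values, here is a minimal Python sketch (an added illustration; the helper name is assumed, not part of the notes) that evaluates $I(x) = -\log_2 P(x)$:

```python
import math

def self_information(p: float) -> float:
    """Shannon self-information in bits, I(x) = -log2 P(x).
    Helper name chosen for illustration only."""
    return -math.log2(p)

# Uniform over the two integers {1, 2}: P(x=1) = 1/2 -> 1 bit
print(self_information(1 / 2))   # 1.0

# Uniform over the eight integers {1, ..., 8}: P(x=1) = 1/8 -> 3 bits
print(self_information(1 / 8))   # 3.0
```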

9.3.1 Entropy<br />

The notion of information is tightly linked to the physics notion of entropy. The entropy of the information on x is a measure of the uncertainty, or of the information content, of a set of N observations of x:

$$ H(x) = -\sum_{i=1}^{N} P(x_i)\,\log P(x_i) \qquad (8.30) $$

If we take again our earlier example, the entropy of x, when x is uniform over the interval [1,2], is equal to

$$ H(x) = -\sum_{i=1}^{2} \frac{1}{2}\,\log_2 \frac{1}{2} = 1 $$

If now we choose a non-uniform distribution of probability for x, where P(x=1)=1/4 and P(x=2)=3/4, then

$$ H(x) = -\left( \frac{1}{4}\,\log_2 \frac{1}{4} + \frac{3}{4}\,\log_2 \frac{3}{4} \right) \approx 0.8 $$

Thus, there is more uncertainty in a situation where every outcome is equally probable.
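The same two entropy values can be reproduced with a short Python sketch of equation (8.30) (again an added illustration; the function name is an assumption):

```python
import math

def entropy_bits(probs):
    """Discrete Shannon entropy in bits: H(x) = -sum_i P(x_i) log2 P(x_i).
    Function name is illustrative, not from the notes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy_bits([1 / 2, 1 / 2]))   # 1.0 bit   (uniform case)
print(entropy_bits([1 / 4, 3 / 4]))   # ~0.81 bits (non-uniform case above)
```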

In the continuous case, one defines the differential entropy of a distribution f of a random variable x as:

$$ h(x) = -\int_{S} f(x)\,\log f(x)\, dx \qquad (8.31) $$

where S is the support set of x. For instance, consider the case where x is uniformly distributed over the interval [0,a], so that its density function is

$$ f(x) = \begin{cases} \dfrac{1}{a} & \text{if } 0 \le x \le a \\[4pt] \dfrac{1}{b} & \text{if } a < x \le a+b \end{cases} $$

Then, its differential entropy is:
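A side illustration, not the notes' own derivation: assuming the natural logarithm in (8.31) and considering only the first, uniform branch 1/a on [0,a] as the density, a simple Riemann-sum approximation of the integral recovers the value log a.

```python
import math

def differential_entropy(f, lo, hi, n=100_000):
    """Riemann-sum approximation of h(x) = -integral f(x) log f(x) dx.
    Helper name and discretisation are assumptions for illustration."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        fx = f(x)
        if fx > 0:
            total -= fx * math.log(fx) * dx
    return total

a = 3.0

def uniform_density(x):
    # Uniform density 1/a on [0, a], zero elsewhere
    return 1.0 / a if 0.0 <= x <= a else 0.0

print(differential_entropy(uniform_density, 0.0, a))  # ~1.0986
print(math.log(a))                                    # exact: log(a)
```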
