Shannon’s information measure is expressed in bits, i.e., it is based on the binary (base-2) logarithm. To get an intuitive feeling for this concept, consider the case where the probability of x is uniform over the set {1, 2}. Then, the information conveyed by the observed event x=1 is equal to:
$$I(x=1) = -\log_2 \frac{1}{2} = \log_2 2 = 1$$
If, on the other hand, x is uniformly distributed over the set {1, ..., 8}, each occurrence of x becomes four times less likely, and the information conveyed by observing x=1 is three times greater:
$$I(x=1) = \log_2 8 = \log_2 2^3 = 3$$
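To see these quantities in code, here is a minimal Python sketch (not part of the original notes; the function name `self_information` is ours) that evaluates I(x) = -log2 P(x) for the two cases above:

```python
import math

def self_information(p: float) -> float:
    """Self-information -log2(p), in bits, of an event with probability p."""
    return -math.log2(p)

# x uniform over {1, 2}: P(x=1) = 1/2, so observing x=1 conveys 1 bit
print(self_information(1 / 2))  # 1.0

# x uniform over {1, ..., 8}: P(x=1) = 1/8, so observing x=1 conveys 3 bits
print(self_information(1 / 8))  # 3.0
```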
9.3.1 Entropy
The notion of information is tightly linked to the notion of entropy in physics. The entropy of x is a measure of the uncertainty, or of the information content, of a set of N possible observations of x:
$$H(x) = -\sum_{i=1}^{N} P(x_i) \log P(x_i) \qquad (8.30)$$
If we take again our earlier example, the entropy of x, when x is uniform over {1, 2}, is equal to:

$$H(x) = -\sum_{i=1}^{2} \frac{1}{2} \log_2 \frac{1}{2} = 1$$
If we now choose a non-uniform probability distribution for x, where P(x=1) = 1/4 and P(x=2) = 3/4, then:

$$H(x) = -\left( \frac{1}{4} \log_2 \frac{1}{4} + \frac{3}{4} \log_2 \frac{3}{4} \right) \approx 0.81$$

Thus, there is more uncertainty in a situation where every outcome is equally probable.
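The two entropy values above can be checked with a short Python sketch (again ours, not from the notes), which evaluates equation (8.30) in base 2:

```python
import math

def entropy(probs) -> float:
    """Discrete Shannon entropy H(x) = -sum_i P(x_i) log2 P(x_i), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # 1.0 bit: x uniform over {1, 2}
print(entropy([0.25, 0.75]))  # ~0.811 bits: P(x=1)=1/4, P(x=2)=3/4
```

The uniform distribution attains the larger value, consistent with the remark above.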
In the continuous case, one defines the differential entropy of a distribution f of a random variable x as:
$$h(x) = -\int_S f(x) \log f(x) \, dx \qquad (8.31)$$
where S is the support set of x. For instance, consider the case where x is uniformly distributed over [0, a], so that its density function is:

$$f(x) = \begin{cases} \dfrac{1}{a} & \text{if } 0 \le x \le a \\ 0 & \text{otherwise} \end{cases}$$

Then, its differential entropy is:

$$h(x) = -\int_0^a \frac{1}{a} \log \frac{1}{a} \, dx = \log a$$
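As a numerical check of (8.31), the sketch below (ours, assuming the uniform density on [0, a] as above) approximates the integral with a Riemann sum and compares it to the closed-form value log(a); note that, unlike the discrete entropy, the differential entropy is negative whenever a < 1.

```python
import math

def diff_entropy_uniform(a: float, n: int = 10_000) -> float:
    """Riemann-sum approximation of h(x) = -integral_0^a (1/a) log(1/a) dx (natural log, nats)."""
    dx = a / n
    f = 1.0 / a  # constant uniform density on [0, a]
    return -sum(f * math.log(f) * dx for _ in range(n))

for a in (0.5, 1.0, 2.0):
    print(f"a={a}: numeric={diff_entropy_uniform(a):.4f}, log(a)={math.log(a):.4f}")
```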