09.12.2012 Views

Concrete mathematics : a foundation for computer science


376 DISCRETE PROBABILITY

hence $VS = \frac{45}{6} = 7.5$ when both dice are loaded. Notice that the loaded dice give $S$ a larger variance, although $S$ actually assumes its average value 7 more often than it would with fair dice. If our goal is to shoot lots of lucky 7’s, the variance is not our best indicator of success.
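The comparison between fair and loaded dice can be checked with exact arithmetic. The sketch below assumes a particular loading that is not stated in this passage: faces 1 and 6 each have probability 1/4 and the other four faces have probability 1/8. That choice reproduces the variance 7.5 quoted above. The helper names (`sum_distribution`, `mean`, `variance`) are illustrative only.

```python
from fractions import Fraction
from itertools import product

def sum_distribution(die):
    """Distribution of S, the sum of two independent dice with face probabilities `die`."""
    dist = {}
    for (a, pa), (b, pb) in product(die.items(), repeat=2):
        dist[a + b] = dist.get(a + b, Fraction(0)) + pa * pb
    return dist

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())

fair = {face: Fraction(1, 6) for face in range(1, 7)}
# Assumed loading (hypothetical, chosen to match VS = 7.5): 1 and 6 are twice as likely.
loaded = {face: Fraction(1, 4) if face in (1, 6) else Fraction(1, 8)
          for face in range(1, 7)}

fair_sum = sum_distribution(fair)
loaded_sum = sum_distribution(loaded)

# Columns: mean of S, variance of S, probability that S = 7.
print("fair  :", mean(fair_sum), variance(fair_sum), fair_sum[7])
print("loaded:", mean(loaded_sum), variance(loaded_sum), loaded_sum[7])
```

Under this assumed loading the variance rises from 35/6 to 15/2, while the probability of rolling a 7 also rises, from 1/6 to 3/16. A larger variance and a more frequent mean value can therefore coexist.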

OK, we have learned how to compute variances. But we haven’t really seen a good reason why the variance is a natural thing to compute. Everybody does it, but why? The main reason is Chebyshev’s inequality ([24’] and [50’]), which states that the variance has a significant property:

$$\Pr\bigl((X - EX)^2 \ge \alpha\bigr) \le VX/\alpha, \qquad \text{for all } \alpha > 0. \tag{8.17}$$

(Margin note: If he proved it in 1867, it’s a classic ’67 Chebyshev.)

(This is different from the summation inequalities of Chebyshev that we encountered in Chapter 2.) Very roughly, (8.17) tells us that a random variable $X$ will rarely be far from its mean $EX$ if its variance $VX$ is small. The proof is amazingly simple. We have

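$$
\begin{aligned}
VX &= \sum_{\omega\in\Omega} \bigl(X(\omega) - EX\bigr)^2 \Pr(\omega) \\
&\ge \sum_{\substack{\omega\in\Omega \\ (X(\omega)-EX)^2 \ge \alpha}} \bigl(X(\omega) - EX\bigr)^2 \Pr(\omega) \\
&\ge \sum_{\substack{\omega\in\Omega \\ (X(\omega)-EX)^2 \ge \alpha}} \alpha \Pr(\omega) \;=\; \alpha\cdot\Pr\bigl((X - EX)^2 \ge \alpha\bigr);
\end{aligned}
$$

dividing by $\alpha$ finishes the proof.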
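The three quantities in the proof can be computed directly on a small probability space. The following is a minimal sketch that uses the sum of two fair dice as the random variable $X$, an example chosen here for illustration; `chebyshev_chain` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

# Probability space: ordered pairs of fair dice, each with probability 1/36.
omega = list(product(range(1, 7), repeat=2))
pr = {w: Fraction(1, 36) for w in omega}

def X(w):
    """Random variable: the sum of the two faces."""
    return w[0] + w[1]

EX = sum(X(w) * pr[w] for w in omega)
VX = sum((X(w) - EX) ** 2 * pr[w] for w in omega)

def chebyshev_chain(alpha):
    """Return (VX, sum restricted to the tail, alpha * Pr(tail)) for a given alpha."""
    tail = [w for w in omega if (X(w) - EX) ** 2 >= alpha]
    restricted = sum((X(w) - EX) ** 2 * pr[w] for w in tail)
    tail_prob = sum(pr[w] for w in tail)
    return VX, restricted, alpha * tail_prob

for alpha in (1, 4, 9, 16, 25):
    full, restricted, lower = chebyshev_chain(alpha)
    # Step 1 drops nonnegative terms; step 2 replaces each kept term by alpha.
    assert full >= restricted >= lower
    print(alpha, full, restricted, lower)
```

Each printed row shows the chain $VX \ge$ (restricted sum) $\ge \alpha\cdot\Pr\bigl((X-EX)^2\ge\alpha\bigr)$ for one value of $\alpha$.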

If we write $\mu$ for the mean and $\sigma$ for the standard deviation, and if we replace $\alpha$ by $c^2 VX$ in (8.17), the condition $(X - EX)^2 \ge c^2 VX$ is the same as $(X - \mu)^2 \ge (c\sigma)^2$; hence (8.17) says that

$$\Pr\bigl(|X - \mu| \ge c\sigma\bigr) \le 1/c^2. \tag{8.18}$$

Thus, $X$ will lie within $c$ standard deviations of its mean value except with probability at most $1/c^2$. A random variable will lie within $2\sigma$ of $\mu$ at least 75% of the time; it will lie between $\mu - 10\sigma$ and $\mu + 10\sigma$ at least 99% of the time. These are the cases $\alpha = 4VX$ and $\alpha = 100VX$ of Chebyshev’s inequality.
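To see how the bound $1/c^2$ compares with an actual tail probability, here is a small sketch that again takes $X$ to be the sum of two fair dice (an illustrative choice, not one made at this point in the text). The condition $|X-\mu| \ge c\sigma$ is tested in the squared form $(X-\mu)^2 \ge c^2 VX$ so that everything stays in exact rational arithmetic; `tail_probability` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

# Distribution of X, the sum of two fair dice.
dist = {}
for a, b in product(range(1, 7), repeat=2):
    dist[a + b] = dist.get(a + b, Fraction(0)) + Fraction(1, 36)

mu = sum(x * p for x, p in dist.items())
var = sum((x - mu) ** 2 * p for x, p in dist.items())

def tail_probability(c):
    """Pr(|X - mu| >= c*sigma), evaluated as Pr((X - mu)^2 >= c^2 * var)."""
    return sum(p for x, p in dist.items() if (x - mu) ** 2 >= c * c * var)

for c in (1, Fraction(3, 2), 2, 10):
    actual = tail_probability(c)
    bound = 1 / Fraction(c) ** 2
    assert actual <= bound  # inequality (8.18)
    print(f"c = {c}: Pr = {actual}, bound 1/c^2 = {bound}")
```

For this distribution the bound is far from tight: at $c = 2$ the actual tail probability is 1/18, well below the guaranteed 1/4. Chebyshev's inequality trades sharpness for generality, since it assumes nothing about $X$ beyond its variance.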

If we roll a pair of fair dice $n$ times, the total value of the $n$ rolls will almost always be near $7n$, for large $n$. Here’s why: The variance of $n$ independent rolls is $\frac{35}{6}n$. A variance of $\frac{35}{6}n$ means a standard deviation of only
