
Chapter 3 – Special Discrete Random Variables.

Section 3.4 Binomial random variable

An experiment that has only two possible outcomes is called a Bernoulli trial; for example, a single coin toss. For the sake of argument, we will call one of the possible outcomes “success”, and the other one “failure”. The probability of a success is p, and the probability of failure is 1 − p. We are interested in studying a sequence of identical and independent Bernoulli trials, and looking at the total number of successes that occur.

Definition. A binomial random variable is the number of successes in n independent and identical Bernoulli trials.

Examples.

A fair coin is tossed 100 times and Y, the number of heads, is recorded. Then Y is a binomial random variable with n = 100 and p = 1/2.

Two evenly matched teams play a series of 6 games. The number of wins Y is a binomial random variable with n = 6 and p = 1/2.

An inspector looks at five computers, where the chance that each computer is defective is 1/6. The number Y of defective computers that he sees is a binomial random variable with n = 5 and p = 1/6.

If Y is a binomial random variable, then the possible outcomes for Y are obviously 0, 1, . . . , n. In other words, the number of observed successes could be any number between 0 and n. The sample space consists of all strings of length n that consist of S's and F's; for example,

    SSFSFSSSF · · · SF        (a string of n trials).

Now let us choose a value of 0 ≤ y ≤ n, and look at a couple of typical sample points belonging to the event (Y = y):

    SSS · · · S FFF · · · F          (y S's followed by n − y F's),
    SSS · · · S FFF · · · F S        (y − 1 S's, then n − y F's, then an S),
    SSS · · · S FFF · · · F SS       (y − 2 S's, then n − y F's, then SS).

Every sample point in the event (Y = y) is an arrangement of y S's and n − y F's, and so therefore has probability p^y (1 − p)^{n−y}.

How many such sample points are there? The number of sample points in (Y = y) is the number of distinct arrangements of y S's and n − y F's, that is, \binom{n}{y}. Putting it together gives the formula for binomial probabilities.

Binomial probabilities.

If Y is a binomial random variable with parameters n and p, then

    P(Y = y) = \binom{n}{y} p^y (1 − p)^{n−y},    y = 0, 1, . . . , n.
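These probabilities are easy to evaluate directly. Here is a minimal Python sketch (the helper name binom_pmf is ours, not part of the text) that reproduces the numbers used in the examples that follow.

    from math import comb

    def binom_pmf(y, n, p):
        # P(Y = y) for a binomial random variable with parameters n and p
        return comb(n, y) * p**y * (1 - p)**(n - y)

    print(binom_pmf(3, 6, 1/2))    # seven-game series below: 0.3125
    print(binom_pmf(5, 10, 1/2))   # even split in ten tosses: 0.2461
    print(binom_pmf(1, 5, 1/6))    # one defective among five computers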

Example. Best-of-seven series

In Section 1.6 we figured out that the probability of a best-of-seven series between two evenly matched teams going the full seven games was 20/64. This can also be calculated using binomial probabilities. If you play six games against an equally skilled opponent, and Y is the number of wins, then Y has a binomial distribution with n = 6 and p = 1/2. The series goes seven games if Y = 3, and the chance of that happening is

    P(Y = 3) = \binom{6}{3} (1/2)^3 (1/2)^3 = 20/64 = .3125.

So best-of-seven series ought to be seven games long about 30% of the time. But, in fact, if you look at the Stanley Cup final series for the fifty years 1946–1995, there were seven-game series only 8 times (1950, 1954, 1955, 1964, 1965, 1971, 1987, 1994). This seems to show that a lot of these match-ups were not even, which tends to make the series end sooner.

If you are twice as good as your opponent, what is the chance of a full seven games? This time p = 2/3, and so

    P(Y = 3) = \binom{6}{3} (2/3)^3 (1/3)^3 = .2195.

This agrees more closely with the actual results, although it is still a bit high.

Example. An even split

If I toss a fair coin ten times, what is the chance that I get exactly 5 heads and 5 tails? The answer is

    P(Y = 5) = \binom{10}{5} (1/2)^5 (1/2)^5 = .2461.

If I toss a fair coin 100 times, what is the chance of exactly fifty heads? This time the answer is

    P(Y = 50) = \binom{100}{50} (1/2)^{50} (1/2)^{50} = .0796.

You may be a bit surprised that this is such an uncommon event. If you flip a coin 100 times the odds are pretty good that you will get about an equal number of heads and tails, but getting exactly one half heads and one half tails gets harder and harder as the sample size increases. Just for fun, here is an approximate formula for the chance of getting exactly n heads in 2n coin tosses: P(an even split) ≈ (πn)^{−1/2}.
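As a quick check of the approximation, the sketch below (helper name ours) compares the exact binomial probability with (πn)^{−1/2} for a few values of n.

    from math import comb, pi, sqrt

    def even_split_exact(n):
        # exact P(exactly n heads in 2n tosses of a fair coin)
        return comb(2 * n, n) * 0.5 ** (2 * n)

    for n in (5, 50, 500):
        print(n, even_split_exact(n), 1 / sqrt(pi * n))
    # n = 50 gives 0.0796 exactly, as above, and 0.0798 from the approximation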

Example. Testing for ESP

In order to test for ESP you draw a card from an ordinary deck and ask the subject what color it is. You repeat this 20 times and the subject is correct 15 times. How likely is it that this is due to chance?

If the subject is guessing, then Y, the number of correct readings, follows a binomial distribution with n = 20 and p = 1/2. We want to know the probability that someone can do this well (or better) by guessing. Thus

    P(Y ≥ 15) = P(Y = 15) + P(Y = 16) + · · · + P(Y = 20)
              = \binom{20}{15} (1/2)^{15} (1/2)^5 + \binom{20}{16} (1/2)^{16} (1/2)^4 + · · · + \binom{20}{20} (1/2)^{20} (1/2)^0
              = 21700 (1/2)^{20}
              = 0.0207.

This is a pretty unlikely event but certainly not impossible. What conclusion can we draw?
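The tail probability can be checked with a few lines of Python (a sketch, using nothing beyond the binomial formula above):

    from math import comb

    # P(Y >= 15) for Y ~ Binomial(20, 1/2)
    tail = sum(comb(20, y) for y in range(15, 21)) * 0.5 ** 20
    print(tail)   # 0.0207, i.e. 21700 / 2**20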

Example. Quality control

In mass production manufacturing there is a certain percentage of acceptable loss due to defective units. To check the level of defectives, you take a sample from the day's production. If the number of defectives is small you continue, but if there are too many defectives you shut down the production line for repairs.

Suppose that 5% defectives is considered acceptable, but 10% defectives is unacceptable. Our strategy is to take a sample of n = 40 units and shut down production if we find 4 or more defectives. Our inspection strategy has two conflicting goals: it is supposed to shut down when p ≥ .10, but continue if p ≤ .05. There are two possible wrong decisions: to continue when p ≥ .10, and to shut down even though p ≤ .05.

How often will we unnecessarily shut down? Suppose that there are acceptably many defectives, and to take the worst case, say there are 5% defectives, so that p = .05. Let Y be the number of observed defective units in the sample. The probability of shutting down production is

    P(shut down) = P(Y ≥ 4)
                 = 1 − P(Y ≤ 3)
                 = 1 − P(Y = 0) − P(Y = 1) − P(Y = 2) − P(Y = 3)
                 = 1 − \binom{40}{0} (.05)^0 (.95)^{40} − \binom{40}{1} (.05)^1 (.95)^{39} − \binom{40}{2} (.05)^2 (.95)^{38} − \binom{40}{3} (.05)^3 (.95)^{37}
                 = 1 − .1285 − .2705 − .2777 − .1851
                 = .1382.

On the other hand, how often will we fail to spot an unacceptably high level of defectives? Let us now suppose that there are unacceptably many defectives, and again to take the worst case, let's say there are 10% defectives, so that p = .10. The chance that the day's production passes inspection anyway is

    P(passes inspection) = P(Y ≤ 3)
                         = P(Y = 0) + P(Y = 1) + P(Y = 2) + P(Y = 3)
                         = \binom{40}{0} (.10)^0 (.90)^{40} + \binom{40}{1} (.10)^1 (.90)^{39} + \binom{40}{2} (.10)^2 (.90)^{38} + \binom{40}{3} (.10)^3 (.90)^{37}
                         = .0148 + .0657 + .1423 + .2003
                         = .4231.

We see that this scheme is fairly likely to make errors. If we wanted to be more certain about our decision, we would need to take a larger sample size.
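Both error probabilities of the inspection plan come from a binomial cumulative sum. The sketch below (helper names are ours) reproduces them; rerunning it with a larger n and a proportionally larger cutoff shows how the errors shrink.

    from math import comb

    def binom_cdf(k, n, p):
        # P(Y <= k) for Y ~ Binomial(n, p)
        return sum(comb(n, y) * p**y * (1 - p)**(n - y) for y in range(k + 1))

    n, cutoff = 40, 4   # the plan in the text: shut down on 4 or more defectives
    print(1 - binom_cdf(cutoff - 1, n, 0.05))  # unnecessary shutdown: about .138
    print(binom_cdf(cutoff - 1, n, 0.10))      # bad day passes inspection: about .423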

Example. Multiple choice exams

If a multiple choice exam has 30 questions, each with 5 responses, what is the probability of passing the exam by guessing? If you guess on every question, then Y, the number of correct answers, will be a binomial random variable with n = 30 and p = 1/5. To pass you need 15 or more correct answers, so P(pass the exam) = P(Y ≥ 15) = 0.000231.

Binomial moments.

If Y is a binomial random variable with parameters n and p, then

    E(Y) = np    and    VAR(Y) = np(1 − p).

Example. The accuracy of empirical probabilities

If we simulate n random events, where the chance of a success is p, then the number of observed successes Y has a binomial distribution with parameters n and p. The empirical probability is p̂ = Y/n. Now the binomial moments given above show that E(p̂) = (np)/n = p, and VAR(p̂) = (np(1 − p))/n^2 = p(1 − p)/n. By computing the two standard deviation interval, we get some idea about how close p̂ is to p. Since the quantity p(1 − p) is maximized when p = 1/2, we find that, regardless of the value of p,

    2 STD(p̂) = 2 √(p(1 − p)/n) ≤ 1/√n.

In most of our examples, the empirical probabilities have been based on n = 1000 repetitions. Thus, our empirical probabilities are typically within ±.03 of the true probabilities.

For example, suppose we simulate 1000 throws of five dice, and find that on 71 occasions we get a sum of 14. Then we are fairly certain that the true probability of getting 14 lies somewhere between .041 and .101.
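The dice example can be simulated directly. This is only a sketch of the kind of experiment the text has in mind; the seed is arbitrary.

    import random

    random.seed(1)
    n = 1000
    hits = sum(1 for _ in range(n)
               if sum(random.randint(1, 6) for _ in range(5)) == 14)
    p_hat = hits / n
    # report the estimate together with the +/- 1/sqrt(n) band from above
    print(p_hat, p_hat - n**-0.5, p_hat + n**-0.5)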


Section 3.5 Geometric and negative binomial random variables

Like the binomial, the geometric and negative binomial random variables are based on a sequence of independent and identical Bernoulli trials. Instead of fixing the number of trials n and counting up how many successes there are, we fix the number of successes k and count up how many trials it takes to get them. The geometric random variable is the number of trials until the first success. Given an integer k ≥ 1, the negative binomial random variable is the number of trials until the k-th success. You see that a geometric random variable is a negative binomial random variable where k = 1. On the other hand, note that a negative binomial random variable Y is the sum of k independent geometric random variables. That is, Y = X_1 + X_2 + · · · + X_k, where X_1 is the number of trials until the first success, X_2 is the number of trials after the first success until the second success, etc. All of these X's have geometric distributions with parameter p.

If Y is negative binomial, then a typical sample point belonging to (Y = y) looks like FFS · · · FSS, where the first y − 1 symbols in the string contain exactly k − 1 successes and y − k failures, and then the y-th symbol is an S. Since there are \binom{y − 1}{k − 1} such strings, and they all have probability p^k (1 − p)^{y−k}, we get the following formula.

Negative binomial probabilities.

If Y is a negative binomial random variable with parameters k and p, then

    P(Y = y) = \binom{y − 1}{k − 1} p^k (1 − p)^{y−k},    y = k, k + 1, . . . .

It follows that the geometric distribution is given by p(y) = p(1 − p)^{y−1}, y = 1, 2, . . . .

Example. The chance of a packet arrival to a distribution hub is 1/10 during each time interval. Let Y be the arrival time of the first packet; it has a geometric distribution with p = .10. The probability that the first packet arrives during the third time interval is

    P(Y = 3) = (1/10)^1 (9/10)^2 = .081.

The probability that the first packet arrives on or after the third time interval is

    P(Y ≥ 3) = 1 − P(Y = 1) − P(Y = 2) = 1 − .10 − (.90)(.10) = .81.

If X is the arrival time of the tenth packet, the chance that it arrives on the 99th time interval is

    P(X = 99) = \binom{98}{9} (1/10)^{10} (9/10)^{89} = 0.01332.
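The same formula, coded directly (the function name nbinom_pmf is ours; setting k = 1 gives the geometric case):

    from math import comb

    def nbinom_pmf(y, k, p):
        # P(Y = y): y trials are needed to collect the k-th success
        return comb(y - 1, k - 1) * p**k * (1 - p)**(y - k)

    print(nbinom_pmf(3, 1, 0.10))    # first packet in interval 3: 0.081
    print(nbinom_pmf(99, 10, 0.10))  # tenth packet in interval 99: about 0.0133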


Example. The 500 goal club

With only 30 games remaining in the NHL season, veteran winger Flash LaRue is starting to get worried. With a career total of 488 goals, it is not at all certain that he will be able to score his 500th career goal before the end of the season. He will get a big bonus from his team if he manages this feat, but unfortunately Flash only scores at a rate of about once every three games. Is there any hope that he will get his 500th goal before the end of the season?

Let's try to calculate the moments of a negative binomial random variable. Start with the geometric case, and write the probabilities p(1 − p)^{y−1} in rows, dropping the first term in each successive row; each row is a geometric series, and its sum appears on the right:

    p + p(1 − p) + p(1 − p)^2 + p(1 − p)^3 + · · · = 1
        p(1 − p) + p(1 − p)^2 + p(1 − p)^3 + · · · = 1 − p
                   p(1 − p)^2 + p(1 − p)^3 + · · · = (1 − p)^2
                                p(1 − p)^3 + · · · = (1 − p)^3
                                                     . . .

Adding these rows column by column on the left, and summing the right-hand sides, gives

    p + 2p(1 − p) + 3p(1 − p)^2 + 4p(1 − p)^3 + · · · = 1 + (1 − p) + (1 − p)^2 + · · · = 1/p,

and the left-hand side is exactly the sum defining E(Y) in the geometric case. This sum ought to convince you that the mean of a geometric random variable is 1/p, and the result for the negative binomial follows from the equation Y = X_1 + X_2 + · · · + X_k. Confirming the variance formula is left as an exercise.

Negative binomial moments.

If Y is a negative binomial random variable with parameters k and p, then

    E(Y) = k/p    and    VAR(Y) = k(1 − p)/p^2.

We note that, as you would expect, the rarer an event is, the longer you will have to wait for it. Taking the geometric case (k = 1), we see that we will wait on average µ = 2 trials to see the first “heads” in a coin tossing experiment, we will wait on average µ = 36 trials to see the first pair of sixes in tossing a pair of dice, and we will buy on average µ = 13,983,816 tickets before we win Lotto 6-49.

We also note that σ decreases from infinity to zero as p ranges from 0 to 1. This says that predicting the first occurrence of an event is difficult for rare events, and easy for common events.
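Returning to Flash LaRue: he needs k = 12 more goals at a rate of roughly p = 1/3, so on average he needs k/p = 36 games, more than the 30 remaining. A short sketch (treating each remaining game as a single Bernoulli trial, which is how the example describes his scoring rate) puts a number on his chances.

    from math import comb

    p, games, k = 1/3, 30, 12
    # He reaches goal 500 exactly when he scores 12 or more times in 30 games.
    prob = sum(comb(games, y) * p**y * (1 - p)**(games - y)
               for y in range(k, games + 1))
    print(prob)   # roughly 0.28, so the bonus is very much in doubt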

Section 3.7 Hypergeometric random variable

The hypergeometric random variable is the number of successes that arise in sampling without replacement. We suppose that there is a population of size N, of which r are “successes” and the rest “failures”, and a sample of size n is drawn.


The probability formula below is simply the ratio of the number of samples containing y successes and n − y failures, to the total number of possible samples of size n. The weird looking conditions on y just ensure that you don't try to find the probability of some impossible event.

Hypergeometric probabilities.

If Y is a hypergeometric random variable with parameters n, r, and N, then

    P(Y = y) = \binom{r}{y} \binom{N − r}{n − y} / \binom{N}{n},    y = max(0, n − (N − r)), . . . , min(n, r).

Example. A box contains 12 poker chips of which 7 are green and 5 are blue. Eight chips are selected at random without replacement from this box. Let X denote the number of green chips selected. The probability mass function is

    p(x) = \binom{7}{x} \binom{5}{8 − x} / \binom{12}{8},    x = 3, 4, . . . , 7.

Note that the range of possible x values is restricted by the make-up of the population.

Example. Lotto 6-49

In Lotto 6-49 you buy a ticket with six numbers chosen from the set {1, 2, . . . , 49}. The draw consists of a random sample drawn without replacement from the same set, and your prize depends on how many “successes” were drawn. Here a “success” is any number that was on your ticket. So Y, the number of matches, follows a hypergeometric distribution with r = 6, n = 6, and N = 49. The probabilities for the different numbers of matches are obtained using the formula

    P(Y = y) = \binom{6}{y} \binom{43}{6 − y} / \binom{49}{6},    y = 0, . . . , 6.

To four decimal places, we have

    y      0      1      2      3      4      5      6
    p(y)   .4360  .4130  .1324  .0176  .0010  .0000  .0000
p(y) .4360 .4130 .1324 .0176 .0010 .0000 .0000


Hypergeometric moments.

If Y is a hypergeometric random variable with parameters n, r, and N, then

    E(Y) = n (r/N)    and    VAR(Y) = n (r/N) ((N − r)/N) ((N − n)/(N − 1)).

For example, the average number of green chips drawn in the first problem is µ = (8)(7)/12 = 4.66666. Also, the average number of matches on your Lotto 6-49 ticket is µ = (6)(6)/49 = .73469.

Example. Capture-tag-recapture

A scientific expedition has captured, tagged, and released eight sea turtles in a particular region. The expedition assumes that the population size in this region is 35, which means that 8 are tagged and 27 not tagged. The expedition will now capture 10 turtles and note how many of them are tagged. If the assumption about the population size is correct, what is the probability that the new sample will have 3 or fewer tagged turtles in it?

    P(Y ≤ 3) = P(Y = 0) + P(Y = 1) + P(Y = 2) + P(Y = 3)
             = \binom{8}{0}\binom{27}{10}/\binom{35}{10} + \binom{8}{1}\binom{27}{9}/\binom{35}{10} + \binom{8}{2}\binom{27}{8}/\binom{35}{10} + \binom{8}{3}\binom{27}{7}/\binom{35}{10}
             = .04595 + .20424 + .33861 + .27089
             = .85969.

We would certainly expect to get three or fewer tagged turtles in the new sample. If the expedition found five tagged turtles, is that evidence that they have over-estimated the population size?

Example. A political poll

The population of Alberta is around 2,545,000, and let's suppose that about 70% of these are eligible to vote in the next provincial election. Then the population of eligible voters has N = 1,781,500 people in it. Suppose that n = 100 people are randomly selected from the eligible voters (without replacement) and asked whether or not they support Ralph Klein. Also suppose, for the sake of argument, that exactly 60%, or 1,068,900 eligible voters, do support Ralph Klein. How accurately will the poll reflect that?

Let Y stand for the number of Klein supporters included in the random sample. Then Y has a hypergeometric distribution with n = 100, r = 1,068,900, and N = 1,781,500. The mean and variance of Y are given by

    µ = 100 (1068900/1781500) = 60

and

    σ^2 = 100 (1068900/1781500) (712600/1781500) (1781400/1781499) = 23.998666.

A two standard deviation interval says that probably between 50 and 70 people in the poll will be Klein supporters.

Note that if the sampling were done with replacement, then Y would follow a binomial distribution with n = 100 and p = .6. In this case, we would have

    µ = 100(.6) = 60    and    σ^2 = 100(.6)(.4) = 24.

Since n is small relative to N, the ratio

    (N − n)/(N − 1) = 1781400/1781499 ≈ 1,

and the mean and variance of the hypergeometric distribution coincide with the mean and variance of the binomial distribution. The distributions of these two random variables are also essentially the same whenever n is small relative to N.
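The near-agreement of the two sets of moments is easy to verify numerically (a sketch using only the formulas above):

    N, r, n = 1_781_500, 1_068_900, 100
    p = r / N                                  # exactly 0.6 here

    hyper_var = n * p * (1 - p) * (N - n) / (N - 1)
    binom_var = n * p * (1 - p)
    print(n * p, hyper_var)   # 60.0 and about 23.9987 (hypergeometric)
    print(n * p, binom_var)   # 60.0 and 24.0 (binomial, with replacement)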

Section 3.8 Poisson random variable

This probability distribution is named after the French mathematician Poisson, according to whom . . .

    Life is good for only two things, discovering mathematics and teaching mathematics. – Siméon Poisson

The Poisson distribution first appeared in Recherches sur la probabilité des jugements en matière criminelle et en matière civile, an important work on probability published in 1837. It describes the probability that a random event will occur in a time or space interval under the conditions that the probability of the event occurring is very small, but the number of trials is very large, so that the event actually occurs a few times.

To illustrate this idea, suppose you are interested in the number of arrivals to a queue in a one day period. You could divide the time interval up into little subintervals, so that for all practical purposes, only one arrival can occur per subinterval. Therefore, for each subinterval of time, we have

    P(no arrival) = 1 − p,    P(one arrival) = p,    P(more than one arrival) = 0.

The total number of arrivals X is the number of subintervals that contain an arrival. This has a binomial distribution, where n is the number of subintervals. The probability of seeing x arrivals during the day is

    P(X = x) = \binom{n}{x} p^x (1 − p)^{n−x}.


Now let's suppose that you keep on dividing the time interval into smaller and smaller subintervals, increasing n but decreasing p so that the product µ = np remains constant. What happens to P(X = x)?

    \binom{n}{x} p^x (1 − p)^{n−x} = \binom{n}{x} (µ/n)^x (1 − µ/n)^{n−x}
                                   = [n(n − 1) · · · (n − x + 1)/x!] (µ/n)^x (1 − µ/n)^n (1 − µ/n)^{−x}
                                   = (µ^x/x!) (1 − µ/n)^n [(n/n)((n − 1)/n) · · · ((n − x + 1)/n)] (1 − µ/n)^{−x}.

Now you take the limit as n → ∞, and obtain

    (1 − µ/n)^n → e^{−µ}    and    (n/n)((n − 1)/n) · · · ((n − x + 1)/n)(1 − µ/n)^{−x} → 1.

This leads to the following formula.

Poisson probabilities.

If X is a Poisson random variable with parameter µ, then

    P(X = x) = e^{−µ} µ^x / x!,    x = 0, 1, . . . .
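The limit can be watched numerically: fix µ = np and let n grow (a sketch; the particular values of µ and x are arbitrary).

    from math import comb, exp, factorial

    mu, x = 2.0, 3
    for n in (10, 100, 1000, 10000):
        p = mu / n
        print(n, comb(n, x) * p**x * (1 - p)**(n - x))
    print("limit:", exp(-mu) * mu**x / factorial(x))   # about 0.1804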

The derivation of the Poisson distribution explains why it is sometimes called the law of rare events. Let's look at an example involving the rarest event I can think of.

Example. More Lotto 6-49

The odds of winning the jackpot in Lotto 6-49 are one in 13,983,816, or p = 7.1511 × 10^{−8}. Suppose you play twice a week, every week, for 10,000 years. The total number of plays is then n = 2 × 52 × 10000 = 1,040,000. Setting µ = np = .07437 and using the Poisson formula, we see that the chance of hitting zero jackpots during this time is

    P(X = 0) = e^{−.07437} (.07437)^0 / 0! = .928327.

After all that time, we still have only about a 7% chance of getting a Lotto 6-49 jackpot. The probability of getting exactly two jackpots during this time is

    P(X = 2) = e^{−.07437} (.07437)^2 / 2! = .002567.

Example. Hashing

Hashing is a tool for organizing files, where a hashing function transforms a key into an address, which is then the basis for searching for and storing records. Hashing has two important features:

1. With hashing, the addresses generated appear to be random; there is no immediate connection between the key and the location of the record.

2. With hashing, two different keys may be transformed into the same address, in which case we say that a collision has occurred.

Given that it is nearly impossible to achieve a uniform distribution of records among the available addresses in a file, it is important to be able to predict how records are likely to be distributed. Suppose that there are N addresses available, and that the hashing function assigns them in a completely random fashion. This means that for any fixed address, the probability that it is selected is 1/N. If r keys are hashed, we can use the Poisson approximation to the binomial to obtain the probability that exactly x records are assigned to a given address. This is

    p(x) = e^{−(r/N)} (r/N)^x / x!,    x = 0, 1, . . . .

For instance, if we are trying to fit r = 10000 records in N = 10000 addresses, the proportion of addresses that will remain empty is p(0) = 1^0 e^{−1}/0! = .3679. We would expect a total of about 3679 empty addresses. Since p(1) = 1^1 e^{−1}/1! = .3679, we would also expect a total of about 3679 addresses with 1 record assigned, and about 10000 − 2(3679) = 2642 addresses with more than 1 record assigned. Because we have a packing density r/N of 1, we must expect a large number of collisions. In order to reduce the number of collisions we should increase the number N of available addresses.

For more about hashing, the reader is referred to Chapter 11 of the book File Structures: A conceptual toolkit by Michael J. Folk and Bill Zoellick.
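A sketch of the calculation for a packing density of 1 (the helper name is ours):

    from math import exp, factorial

    def poisson_pmf(x, mu):
        # e**(-mu) * mu**x / x!
        return exp(-mu) * mu**x / factorial(x)

    r, N = 10_000, 10_000
    mu = r / N
    for x in range(3):
        print(x, round(N * poisson_pmf(x, mu)))          # about 3679, 3679, 1839
    crowded = N * (1 - poisson_pmf(0, mu) - poisson_pmf(1, mu))
    print("more than 1 record:", round(crowded))         # about 2642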

Poisson moments.

If X is a Poisson random variable, then

    E(X) = µ    and    VAR(X) = µ.

Example. Particle emissions

In 1910, Hans Geiger and Ernest Rutherford conducted a famous experiment in which they counted the number of α-particle emissions during 2608 time intervals of equal length. Their data is as follows.

    x          0    1    2    3    4    5    6    7    8   9   10  > 10
    intervals  57   203  383  525  532  408  273  139  45  27  10  6

A total of 10097 particles were observed, giving a rate of µ = 10097/2608 = 3.8715 particles per time period. If these particles were following a Poisson distribution, then the number of intervals with no particles should be about

    2608 × e^{−3.8715} (3.8715)^0 / 0! = 54.31,

the number of intervals with exactly one particle should be about

    2608 × e^{−3.8715} (3.8715)^1 / 1! = 210.27,

and so on. In fact, the frequencies that we would expect to observe are

    x         0      1       2       3       4       5       6       7       8      9      10     > 10
    expected  54.31  210.27  407.06  525.31  508.44  393.69  254.03  140.50  67.99  29.25  11.32  5.83

By comparing these two tables, you can see that the Poisson distribution seems to describe this phenomenon quite well.
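The expected frequencies in the second table come from multiplying the Poisson probabilities by 2608; a sketch for the α-particle data:

    from math import exp, factorial

    mu, intervals = 10097 / 2608, 2608
    expected = [intervals * exp(-mu) * mu**x / factorial(x) for x in range(11)]
    expected.append(intervals - sum(expected))      # the "> 10" catch-all cell
    print([round(e, 2) for e in expected])          # about 54.3, 210.3, 407.1, ...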
