17.11.2014 Views

Probability Distributions - Viplav Kambli


<strong>Probability</strong> <strong>Distributions</strong><br />

The laws of probability allow us to obtain the chances of a given set of outcomes if we know the<br />

probabilities of various possibilities in a certain situation. Sometimes the given situation may be generalised<br />

so that standard rules can be used for calculating the required probabilities. If a generalised probability rule<br />

may be obtained which covers all possible outcomes in a situation, we get a probability distribution<br />

function, so called because it spells out the probability of occurrence of each outcome, that is, the proportion of time<br />

an outcome is likely to occur or the proportion of cases likely to lie in a given class.<br />

We now discuss some standard distributions that arise very widely and are relevant to<br />

managers. They are based on theoretical considerations, and in many cases frequency patterns observed in<br />

real life conform to one of them. Using these distributions, predictions can be made on theoretical<br />

grounds.<br />

BINOMIAL DISTRIBUTION<br />

A binomial distribution is applicable when<br />

(i) an experiment involves n trials,<br />

(ii) each of the trials can result in a set of dichotomous alternatives, one of which is arbitrarily termed<br />

as success and the other as failure,<br />

(iii) the trials are independent and the probability of success is the same in each trial, and<br />

(iv) the focus is on the occurrence of a certain number of successes.<br />

In such a situation, the probabilities for the occurrence of 0, 1, 2, ..., n successes are given by the<br />

successive terms of the binomial expansion $(q + p)^{n}$, where q is the probability of failure and p is the<br />

probability of success in a trial, and there are n trials. In general,<br />

$P(x) = {}^{n}C_{x}\, q^{n-x} p^{x}$<br />

where x = the number of successes<br />

Example 1 Output of a production process is known to be thirty per cent defective. What is the<br />

probability that a sample of 5 items would contain 0, 1, 2, 3, 4, and 5 defectives?<br />

If the appearance of a defective item is considered a success, then the probability of success in a trial,<br />

p = 0.3. Thus, q = 1 – p = 1 – 0.3 = 0.7. With n = 5, and $P(x) = {}^{n}C_{x}\, q^{n-x} p^{x}$, we have<br />

No. of successes (x)    <strong>Probability</strong> P(x)<br />

0    0.16807<br />

1    0.36015<br />

2    0.30870<br />

3    0.13230<br />

4    0.02835<br />

5    0.00243<br />

Total    1.00000<br />

Appendix A4: <strong>Probability</strong> <strong>Distributions</strong> 21<br />
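The table in Example 1 can be reproduced with a few lines of code. The following is a minimal sketch in Python; the function name `binomial_pmf` is illustrative and not part of the original text.<br />

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    q = 1.0 - p  # probability of failure in a single trial
    return comb(n, x) * q ** (n - x) * p ** x


# Values from Example 1: 30% defective output, sample of 5 items.
n, p = 5, 0.3
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(probabilities):
    print(f"{x}  {prob:.5f}")
print(f"Total = {sum(probabilities):.5f}")
```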

A binomial distribution has two parameters: n and p. It means that a binomial distribution can be<br />

specified completely by its n and p.<br />

The mean of a binomial distribution is np and its standard deviation equals $\sqrt{npq}$. For the example<br />

stated, it can be shown that mean $= 5 \times 0.3 = 1.5$ and standard deviation $= \sqrt{5 \times 0.3 \times 0.7} = 1.025$.<br />
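As a check on these formulas, the mean and standard deviation can also be computed directly from the individual probabilities. This is a sketch only, using the values n = 5 and p = 0.3 from the example; the variable names are illustrative.<br />

```python
from math import comb, sqrt

n, p = 5, 0.3
q = 1 - p

# Probability of each possible number of successes, 0 through n.
pmf = [comb(n, x) * q ** (n - x) * p ** x for x in range(n + 1)]

# Mean and variance computed from first principles.
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
std_dev = sqrt(variance)

print(f"from the probabilities: mean = {mean:.4f}, sd = {std_dev:.4f}")
print(f"from the formulas:      mean = {n * p:.4f}, sd = {sqrt(n * p * q):.4f}")
```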

If, for a binomial distribution, p = q = 0.5, it would be a symmetrical one. If p > q, the distribution would<br />

be negatively skewed, whereas if p < q, the distribution would have a positive skewness. Further, the greater<br />

the divergence between p and q, for a given n, the more pronounced the skewness would be.<br />

POISSON DISTRIBUTION<br />

The Poisson distribution is the distribution of rare events since it deals with situations where the chances of<br />

occurrence of an event are very low. It is used where the events happen at random, i.e., one cannot predict<br />

precisely when each would occur, nor how many will occur altogether. Examples of these include road<br />

accidents, fires, arrival of customers at a shop and so on.<br />

The distribution is used either as an approximation to binomial distribution when n is large and p is very<br />

small, or in its own right.<br />

According to this model,<br />

$P(x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}$<br />

where e is the base of natural logarithms (approximately 2.7183), $\lambda$ is the mean value (equal to np when used as an approximation to<br />

the binomial) and x is the number of occurrences, which may be any non-negative integer.<br />

Example 2 An insurance company receives, on an average, 2 telephone calls every 15 minutes.<br />

Find the chance that (a) no calls, and (b) 3 calls be received in a 30-minute interval.<br />

According to the given information, the average number of calls during a 30-minute period is $\lambda = 4$.<br />

<strong>Probability</strong> of no calls, $P(0) = \dfrac{e^{-\lambda}\lambda^{0}}{0!} = \dfrac{2.7183^{-4} \times 4^{0}}{0!} = 0.0183$<br />

<strong>Probability</strong> of 3 calls, $P(3) = \dfrac{2.7183^{-4} \times 4^{3}}{3!} = 0.1954$<br />
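The two probabilities in Example 2 can be verified numerically. The sketch below assumes the Poisson model stated above; `poisson_pmf` is an illustrative name, not something defined in the text.<br />

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences when the mean is lam."""
    return exp(-lam) * lam ** x / factorial(x)


# 2 calls per 15 minutes, scaled to a 30-minute interval.
lam = 2 * (30 / 15)

p_no_calls = poisson_pmf(0, lam)
p_three_calls = poisson_pmf(3, lam)

print(f"P(0) = {p_no_calls:.4f}")
print(f"P(3) = {p_three_calls:.4f}")
```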

Important features of the Poisson distribution are:<br />

(a) There is no theoretical maximum number of events that can occur. Whether the average value is<br />

small or large, we can, for example, theoretically conceive of an infinite number of fires during a year. However,<br />

the probabilities build up sharply around the mean and then fall at a brisk rate, so that<br />

the probabilities of higher numbers of successes become extremely small and are negligible.<br />

Further, whatever the mean value of a Poisson variable (x) may be, the probabilities add up to 1.<br />

Total <strong>Probability</strong> $= P(0) + P(1) + P(2) + P(3) + \cdots$<br />

$= e^{-\lambda} + \dfrac{\lambda e^{-\lambda}}{1!} + \dfrac{\lambda^{2} e^{-\lambda}}{2!} + \dfrac{\lambda^{3} e^{-\lambda}}{3!} + \cdots$<br />

$= e^{-\lambda}\left(1 + \lambda + \dfrac{\lambda^{2}}{2!} + \dfrac{\lambda^{3}}{3!} + \cdots\right)$<br />

The series in the bracket, being the exponential series, adds up to $e^{\lambda}$. Thus, total probability<br />

$= e^{-\lambda} \times e^{\lambda} = 1$.<br />

22 Quantitative Techniques in Management<br />

(b) A Poisson probability distribution is positively skewed. However, the skewness becomes less<br />

pronounced as the mean value increases.<br />

(c) The Poisson distribution has only one parameter, the mean $\lambda$.<br />

(d) For a Poisson distribution, mean and variance are each equal to $\lambda$.<br />

(e) We have, for this distribution, the following recursive relationship: $P(x) = P(x - 1) \times \lambda / x$.<br />
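Features (a), (d) and (e) can be illustrated together. The sketch below builds the probabilities with the recursive relationship, starting from $P(0) = e^{-\lambda}$, and then checks the total, the mean and the variance. The value 4 for the mean and the cut-off at x = 30 are assumptions chosen only for illustration; `poisson_table` is a hypothetical helper name.<br />

```python
from math import exp


def poisson_table(lam: float, max_x: int) -> list[float]:
    """P(0), P(1), ..., P(max_x) built with P(x) = P(x - 1) * lam / x."""
    probs = [exp(-lam)]  # P(0) = e^(-lam)
    for x in range(1, max_x + 1):
        probs.append(probs[-1] * lam / x)
    return probs


lam = 4.0  # illustrative mean
table = poisson_table(lam, max_x=30)  # terms beyond x = 30 are negligible here

total = sum(table)
mean = sum(x * px for x, px in enumerate(table))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(table))

print(f"total = {total:.6f}, mean = {mean:.6f}, variance = {variance:.6f}")
```

Truncating the infinite series is what makes the total only approximately 1; a larger mean would need a larger cut-off.<br />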

Negative exponential distribution Using Poisson's rule, we can predict the chances of a given number of<br />

occurrences, but not when the individual events will happen, because they occur at random. The time between events is variable and has<br />

a distribution known as the negative exponential distribution or simply the exponential distribution. Thus, there<br />

is a relation between the Poisson and exponential distributions: if the number of events occurring in a<br />

specified time follows a Poisson distribution with mean $\lambda$, then the waiting time until the first occurrence,<br />

T, will follow an exponential distribution with a mean equal to $1/\lambda$ and variance equal to $1/\lambda^{2}$.<br />

The problem of finding probabilities for events defined in terms of T can be solved by considering the<br />

relationship between Poisson and exponential distributions. Since T is the waiting time until the first<br />

occurrence of an event in which the total number of occurrences in a given time interval follows the<br />

Poisson distribution, it should not be surprising that a relationship does exist. In particular, if we consider<br />

a fixed amount of time t, then the event T > t is simply the event that the waiting time until the first<br />

occurrence is longer than t units of time. If we know that the waiting time for the first occurrence is longer<br />

than t units, then there must be no occurrences during this interval. Otherwise the waiting time until<br />

the first occurrence would be less than t. Consequently, the event T > t is equivalent to the event X = 0,<br />

where X is the number of occurrences in t units of time. Since X has a Poisson distribution, it follows that<br />

$P(T > t) = P(X = 0) = \dfrac{e^{-\lambda t}(\lambda t)^{0}}{0!} = e^{-\lambda t}$<br />

Example 3 On an average, 2 calls are received in 15 minutes. Find the average time between<br />

successive calls. What is the probability that the first call of the day would be received not before 10<br />

minutes? Within 5 minutes? Within 7½ minutes?<br />

Average rate of calls, $\lambda = 2/15$ calls/minute<br />

$\therefore$ Expected time between successive calls $= 1/\lambda = 15/2 = 7.5$ minutes. With $t = 10$ minutes,<br />

$P(T > t) = e^{-\lambda t} = 2.7183^{-(2/15)(10)} = 2.7183^{-4/3} = 0.2636$<br />

When $t = 5$ minutes,<br />

$P(T \le t) = 1 - e^{-\lambda t} = 1 - 2.7183^{-(2/15)(5)} = 0.4866$<br />

When $t = 7.5$ minutes,<br />

$P(T \le t) = 1 - 2.7183^{-(2/15)(15/2)} = 1 - 0.3679 = 0.6321$.<br />
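The waiting-time calculations of Example 3 can be sketched in code as follows. The function names are illustrative; the only inputs taken from the text are the rate of 2 calls per 15 minutes and the three time limits.<br />

```python
from math import exp


def prob_wait_longer_than(t: float, rate: float) -> float:
    """P(T > t): no occurrence during the first t units of time."""
    return exp(-rate * t)


def prob_wait_within(t: float, rate: float) -> float:
    """P(T <= t): the first occurrence falls within t units of time."""
    return 1.0 - exp(-rate * t)


rate = 2 / 15  # calls per minute
mean_gap = 1 / rate  # expected minutes between successive calls

p_not_before_10 = prob_wait_longer_than(10, rate)
p_within_5 = prob_wait_within(5, rate)
p_within_7_5 = prob_wait_within(7.5, rate)

print(f"mean gap = {mean_gap:.1f} minutes")
print(f"P(T > 10)   = {p_not_before_10:.4f}")
print(f"P(T <= 5)   = {p_within_5:.4f}")
print(f"P(T <= 7.5) = {p_within_7_5:.4f}")
```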



NORMAL DISTRIBUTION<br />

Normal distribution is a very important and useful distribution for a manager because many phenomena<br />

follow such a distribution or are close to it. In contrast to the binomial and Poisson distributions in which<br />

our concern is with determining probabilities of some number of successes, which can assume only the discrete<br />

values 0, 1, 2 and so forth, in the normal distribution we deal with characteristics which can assume any<br />

value between two given limits. The height of an individual, to illustrate, need not be exactly 65 or 66<br />

inches; it could be any value between these two.<br />

A characteristic, or variable, is said to be distributed normally if its curve appears as shown in Fig. 1<br />

and is represented by the following expression.<br />

$y(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left(-\dfrac{(x - \mu)^{2}}{2\sigma^{2}}\right)$<br />

Here $y(x)$ depicts the height of the y-ordinate at a specific value of x, and $\pi$ and $e$ are, respectively, approximately equal<br />

to 3.1416 and 2.7183. Thus, if $\mu$ and $\sigma$, the mean and the standard deviation, are known, we can get the<br />

height of the curve at any specific value of x.<br />
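To make the expression concrete, the height of the curve can be evaluated directly. This is a minimal sketch; `normal_height` is an illustrative name, and the standard library's `statistics.NormalDist` is used only as an independent cross-check. A mean of 0 and a standard deviation of 1 are assumed for the demonstration.<br />

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_height(x: float, mu: float, sigma: float) -> float:
    """Height y(x) of the normal curve with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))


mu, sigma = 0.0, 1.0  # assumed values for illustration
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    direct = normal_height(x, mu, sigma)
    library = NormalDist(mu, sigma).pdf(x)
    print(f"x = {x:+.1f}  y(x) = {direct:.5f}  (library: {library:.5f})")
```

The heights at x = -1 and x = +1 agree, as do those at -2 and +2, which reflects the symmetry of the curve about its mean.<br />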

It may be noted that:<br />

(a) A normal curve is unimodal, bell-shaped, and symmetrical about its mean. As seen in the figure,<br />

the curve on either side of $\mu$ is a mirror image of the other side. The mean, the median, and the<br />

mode all coincide.<br />

(b) The total area under the curve is divided evenly because of symmetry: 50% of area is to the right of<br />

a perpendicular line drawn at the mean and 50% is to its left.<br />

(c) It is assumed that the variable can take any value between $-\infty$ and $+\infty$. As such, a normal curve<br />

approaches closely, but never touches, the horizontal axis.<br />

(d) If we construct vertical lines at a distance of one standard deviation from mean in both the<br />

directions, the area under the curve enclosed by these lines is equal to 68.27% of the total area. If we<br />

draw these lateral boundaries at two standard deviations from the mean in both the directions, they<br />

would enclose 95.45% of the total area. Similarly, $\mu \pm 3\sigma$ covers 99.73% of the area under the curve.<br />

(e) A normal curve is defined completely by the mean, $\mu$, and standard deviation, $\sigma$ (> 0). That is, each<br />

different pair of values of $\mu$ and $\sigma$ specifies a distinct normal distribution and curve. Thus, the normal<br />

distribution is a family of distributions in which a member is distinguished from the others on the<br />

basis of the twin values of $\mu$ and $\sigma$. A distinguished member of this family is the distribution which<br />

has a zero mean and a standard deviation of 1. It is called the standard normal distribution.<br />
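The percentages quoted in point (d) can be checked numerically. The sketch below uses the standard normal distribution from Python's `statistics` module; because the areas depend only on the number of standard deviations from the mean, the same figures hold for any member of the family. The name `area_within` is illustrative.<br />

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k: float) -> float:
    """Area under the curve between k standard deviations below and above the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s) of the mean: {area_within(k):.2%}")
```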

Fig. 1 Normal Curve (horizontal axis: variable x; vertical axis: y; centred at $\mu$; curve $y(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^{2}/2\sigma^{2}}$)<br />



Calculation of probabilities For a normal distribution, the probability that the given variable would<br />

take a value in a certain range, say between $X_1$ and $X_2$, is calculated as the proportion of the area under the<br />

normal curve between $X_1$ and $X_2$ to the total area under it. For the purpose of calculating the probabilities,<br />

the given distribution is expressed in terms of the standard normal distribution. This is done by stating the<br />

variable X as the variable z, where<br />

$z = \dfrac{X - \mu}{\sigma}$<br />

At $X = \mu$, z would equal zero; z would be positive for values of $X > \mu$, and negative for values of $X < \mu$.<br />

The proportions of the area under the normal curve between the mean and particular values of z are<br />

tabulated and shown in Table B1 at the end of the book or in the tables given on the web. To illustrate, some of<br />

the areas are given here.<br />

Range    Area<br />

(i) Between $\mu$ and z = 1.00    0.3413<br />

(ii) Between $\mu$ and z = 1.45    0.4265<br />

(iii) Between z = 1.2 and z = 2.8    0.4974 – 0.3849 = 0.1125<br />

(iv) Between z = – 1.2 and z = 2.8    0.3849 + 0.4974 = 0.8823<br />

(v) Beyond z = 1.2    0.5000 – 0.3849 = 0.1151<br />

It may be noted that since the curve is symmetrical, the area between the mean and a particular value of<br />

z is the same whether z is positive or negative.<br />
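Where a printed table is not at hand, the same areas can be obtained from the cumulative distribution function. The sketch below is one way to do this with Python's `statistics.NormalDist`; the helper names are illustrative. Results computed this way may differ from sums of rounded table entries in the fourth decimal place.<br />

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_mean_to(z: float) -> float:
    """Area between the mean and z (the same for +z and -z by symmetry)."""
    return abs(Z.cdf(z) - 0.5)


def area_between(z1: float, z2: float) -> float:
    """Area between two z values."""
    return abs(Z.cdf(z2) - Z.cdf(z1))


def area_beyond(z: float) -> float:
    """Area to the right of z."""
    return 1.0 - Z.cdf(z)


print(f"mean to z = 1.00     : {area_mean_to(1.00):.4f}")
print(f"mean to z = 1.45     : {area_mean_to(1.45):.4f}")
print(f"z = 1.2 to z = 2.8   : {area_between(1.2, 2.8):.4f}")
print(f"z = -1.2 to z = 2.8  : {area_between(-1.2, 2.8):.4f}")
print(f"beyond z = 1.2       : {area_beyond(1.2):.4f}")
```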

Example 4 A machine is set to fill in coffee powder in tins, with an average of 200 gms, and a<br />

standard deviation of 4 gms. Find the probability that a coffee tin selected at random shall contain<br />

(a) at least 200 gms, (b) between 200 and 206 gms, (c) between 195 and 205 gms, and (d) less than<br />

196 gms.<br />

The given distribution has $\mu = 200$ gms and $\sigma = 4$ gms.<br />

(a) Area to the right of $\mu$ being 0.50, this is the probability that a tin would contain at least 200 gms of<br />

coffee.<br />

(b) For X = 206, z = (206 – 200)/4 = 1.5.<br />

From the normal area table, area between $\mu$ and z = 1.5 is 0.4332. Thus $P(200 \le X \le 206)$<br />

= 0.4332.<br />

(c) Area between X = 195 and X = 205 equals area between $\mu$ and X = 195, plus area between $\mu$ and<br />

X = 205. For X = 205, z = (205 – 200)/4 = 1.25 while for X = 195, z = (195 – 200)/4 = – 1.25. Area<br />

between $\mu$ and z = 1.25 is equal to 0.3944.<br />

$\therefore P(195 \le X \le 205) = 2 \times 0.3944 = 0.7888$.<br />

(d) For X = 196, z = (196 – 200)/4 = – 1. Area between $\mu$ and z = – 1 equals 0.3413.<br />

$\therefore P(X < 196) = 0.5000 - 0.3413 = 0.1587$.<br />
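The four parts of Example 4 can be cross-checked without a table. This sketch models the fill weight with `statistics.NormalDist`, using the mean of 200 gms and standard deviation of 4 gms given in the example; the variable names are illustrative.<br />

```python
from statistics import NormalDist

fill = NormalDist(mu=200, sigma=4)  # fill weight of a tin, in gms

p_at_least_200 = 1 - fill.cdf(200)               # part (a)
p_200_to_206 = fill.cdf(206) - fill.cdf(200)     # part (b)
p_195_to_205 = fill.cdf(205) - fill.cdf(195)     # part (c)
p_below_196 = fill.cdf(196)                      # part (d)

print(f"(a) at least 200 gms        : {p_at_least_200:.4f}")
print(f"(b) between 200 and 206 gms : {p_200_to_206:.4f}")
print(f"(c) between 195 and 205 gms : {p_195_to_205:.4f}")
print(f"(d) less than 196 gms       : {p_below_196:.4f}")
```

Part (c) comes out a shade below the table-based figure because the table entry 0.3944 is itself rounded before being doubled.<br />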

Example 5 A manufacturer of batteries wishes to give a guarantee for free replacement of the<br />

batteries whose life is less than a certain time period. If he desires to replace no more than 5% of the<br />

batteries, what should be the guarantee period, if the lives of batteries are known to be normally<br />

distributed with mean of 1200 hours and a standard deviation of 100 hours?<br />

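The given information is depicted in Fig. 2. Here the value of X is to be determined. Since no more than 5% of the<br />

batteries may have a life below X, the area between $\mu$ and X is 0.50 – 0.05 = 0.45, and we observe from the normal area table that the z corresponding to this area is<br />

– 1.645 (negative because X lies below the mean).<br />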


Fig. 2 Determination of X (horizontal axis: Hours; $\mu$ = 1200 hrs.; tail area 0.05)<br />

Thus, $-1.645 = \dfrac{X - 1200}{100}$ or $X = 1200 - 100 \times 1.645 = 1035.5$<br />

A guarantee of about 1035 hours may, therefore, be given by the manufacturer; rounding 1035.5 down keeps the expected replacements within 5%.
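Example 5 works backwards from an area to a value of X, which corresponds to the inverse of the cumulative distribution function. The following sketch assumes Python's `statistics.NormalDist` and its `inv_cdf` method; the variable names are illustrative.<br />

```python
from statistics import NormalDist

life = NormalDist(mu=1200, sigma=100)  # battery life in hours
max_replaced = 0.05                    # at most 5% of batteries replaced

# z value that leaves an area of 0.05 in the lower tail.
z = NormalDist().inv_cdf(max_replaced)

# Battery life below which 5% of batteries fall.
guarantee_hours = life.inv_cdf(max_replaced)

print(f"z = {z:.3f}")
print(f"X = {guarantee_hours:.1f} hours")
print(f"check: P(life < X) = {life.cdf(guarantee_hours):.4f}")
```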
