Probability Distributions - Viplav Kambli
Probability Distributions - Viplav Kambli
Probability Distributions - Viplav Kambli
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>Probability</strong> <strong>Distributions</strong><br />
The laws of probability allow us to obtain the chances of a given set of outcomes if we know the<br />
probabilities of various possibilities in a certain situation. Sometimes the given situation may be generalised<br />
so that standard rules can be used for calculating the required probabilities. If a generalised probability rule<br />
may be obtained which covers all possible outcomes in a situation, we get a probability distribution<br />
function, so called because it spels the probability of occurrence of each outcome—the proportion of time<br />
an outcome is likely to occur or the proportion of cases likely to lie in a given class.<br />
We now discuss some generalised distributions that are very widely obtained and are relevant to the<br />
managers. They are based on theoretical considerations and in many cases frequency patterns observed in<br />
real life conform to either of these. Using these distributions, predictions can be made on theoretical<br />
grounds.<br />
BINOMIAL DISTRIBUTION<br />
A binomial distribution is applicable when<br />
(i) an experiment involves n trials,<br />
(ii) each of the trials can result in a set of dichotomous alternatives, one of which is arbitrarily termed<br />
as success and the other as failure,<br />
(iii) the trials are independent so that the probability of success is the same in each trial, and<br />
(iv) the focus is on the occurrence of a certain number of successes.<br />
In such a situation, the probabilities for the occurrence of 0, 1, 2, .. ., n successes are given by the<br />
successive terms of the binomial expansion (q + p) n , where q is the probability of failure and p is the<br />
probability of success in a trial, and there are n trials. In general,<br />
P(x) = n C x q n – x p x<br />
where x = the number of successes<br />
Example 1 Output of a production process is known to be thirty per cent defective. What is the<br />
probability that a sample of 5 items would contain 0, 1, 2, 3, 4, and 5 defectives?<br />
If the appearance of a defective item is considered a success, then the probability of success in a trial,<br />
p = 0.3. Thus, q = 1 – p = 1 – 0.3 = 0.7. With n = 5, and P(x) = n C x q n – x p x , we have<br />
No. of successes (x) <strong>Probability</strong> p(x)<br />
0 0.16807<br />
1 0.36015<br />
2 0.30870
Appendix A4: <strong>Probability</strong> <strong>Distributions</strong> 21<br />
3 0.13230<br />
4 0.02835<br />
5 0.00243<br />
Total = 1.00000<br />
A binomial distribution has two parameters: n and p. It means that a binomial distribution can be<br />
specified completely by its n and p.<br />
The mean of a binomial distribution is np and its standard deviation equals npq . For the example<br />
stated, it can be shown that mean = 5 ¥ 0.3 = 1.5 and standard deviation = 5¥ 03 . ¥ 07 . = 1.025.<br />
If, for a binomial distribution, p = q = 0.5, it would be a symmetrical one. If p > q, the distribution would<br />
be negatively skewed, whereas if p < q, the distribution would have a positive skewness. Further, greater<br />
the divergence between p and q, for a given n, more pronounced would the skewness be.<br />
POISSON DISTRIBUTION<br />
The Poisson distribution is the distribution of rare events since it deals with situations where the chances of<br />
occurrence of an event are very low. It is used where the events happen at random i.e one cannot predict<br />
precisely when each would occur, nor how many will occur altogether. Examples of these include road<br />
accidents, fire, arrival of customers at a shop and so on.<br />
The distribution is used either as an approximation to binomial distribution when n is large and p is very<br />
small, or in its own right.<br />
According to this model,<br />
P(x) = e –l l x<br />
x!<br />
where e is the exponential 2.7183, l is the mean value (equal to np when used as approximation to<br />
binomial) and x is the number of occurrences, which may be any integer from 0 to any value.<br />
Example 2 An insurance company receives, on an average, 2 telephone calls every 15 minutes.<br />
Find the chance that (a) no calls, and (b) 3 calls be received in a 30-minute interval.<br />
According to the given information, the average number of calls during 30 minute period, l = 4.<br />
<strong>Probability</strong> of no calls, P(0) = e –l l<br />
x! = 2.7183–4 ¥ 4 0! = 0.0183<br />
<strong>Probability</strong> of 3 calls, P(3) = 2.7183 –4 ¥ 4 = 0.1954<br />
3!<br />
Important features of the Poisson distribution are:<br />
(a) There is no theoretical maximum number of events that can occur. Whether the average value is<br />
small or large, for example, we can theoretically conceive an infinite fires during a year. However,<br />
the probabilities of the successes build up sharply around mean and then fall at a brisk rate so that<br />
the probabilities of higher number of successes become extremely small and are negligible.<br />
Further, whatever the mean value of a Poisson variable (x) be, the probabilities add upto 1.<br />
Total <strong>Probability</strong> = P(0) + P(1) + P(2) + P(3) + . . .<br />
3<br />
0
22 Quantitative Techniques in Management<br />
-l 2 -l 3 -l<br />
-l<br />
le l e l e<br />
= e + + + + L<br />
1! 2! 3!<br />
= e –l 2 3<br />
F l l I<br />
1 + l + + + L<br />
HG<br />
2 3 K J<br />
! !<br />
The series in the bracket, being the exponential series, adds upto e l . Thus, total probability<br />
= e –l ¥ e l = 1.<br />
(b) A Poisson probability distribution is positively skewed. However, the skewness becomes less<br />
pronounced as the mean value increases.<br />
(c) The Poisson distribution has only one parameter—the mean l.<br />
(d) For a Poisson distribution, mean and variance are each equal to l.<br />
(e) We have, for this distribution, the following recursive relationship, P(x) = P(x – 1) ¥ l /x.<br />
Negative exponential distribution Using Poisson’s rule, although we can predict chances of<br />
occurrences but not the events because they occur at random. The time between events is variable and has<br />
a distribution known as negative exponential distribution or simply exponential distribution. Thus, there<br />
is a relation between Poisson and exponential distributions so that if the number of events occurring in a<br />
specified time follows a Poisson distribution with mean l, then the waiting time until the first occurrence,<br />
T, will follow an exponential distribution with a mean equal to 1/l and variance equal to 1/l 2 .<br />
The problem of finding probabilities for events defined in terms of T can be solved by considering the<br />
relationship between Poisson and exponential distributions. Since T is the waiting time until the first<br />
occurrence of an event in which the total number of occurrences in a given time interval follows the<br />
Poisson distribution, it should not be surprising that a relationship does exist. In particular, if we consider<br />
a fixed amount of time t, then the event T > t is simply the event that the waiting time until the first<br />
occurrence is longer than t units of time. If we know that the waiting time for the first occurrence is longer<br />
time than t units, then there must be no occurrences during this interval. Otherwise the waiting time until<br />
the first occurrence would be less than t. Consequently, the event T > t is equivalent to the event X = 0,<br />
where X is the number of occurrences in t units of time. Since X has a Poisson distribution, it follows that<br />
P(T > t) = P(X = 0) = e -ltal<br />
t f 0<br />
= e –lt<br />
0!<br />
Example 3 On an average, 2 calls are received in 15 minutes. Find the average time between<br />
successive calls. What is the probability that the first call of the day would be received not before 10<br />
minutes? Within 5 minutes? 7<br />
2 1 minutes?<br />
Average rate of calls, l = 2/15 calls/minute<br />
\ Expected time between successive calls = 15/2 = 7.5 minutes. With t = 10 minutes,<br />
P(T > t) = e –l t = 2.7183 –(2/15) (10) = 2.7183 –4/3 = 0.2636<br />
When t = 5 minutes<br />
P(T £ t) = 1 – e –l t = 1 – 2.7183 –(2/15) (5) = 0.4866<br />
When t = 7.5 minutes,<br />
P(T £ t) = 1 – 2.7183 –(2/15) (15/2) = 1 – 0.3679 = 0.6321.
Appendix A4: <strong>Probability</strong> <strong>Distributions</strong> 23<br />
NORMAL DISTRIBUTION<br />
Normal distribution is a very important and useful distribution for a manager because many phenomena<br />
follow such a distribution or are close to it. In contrast to the binomial and Poisson distributions in which<br />
our concern is with determining probabilities of some number of successes, which can assume only discrete<br />
value of 0, 1, 2 and so forth, we deal in the normal distribution with characteristics which can assume any<br />
value between two given limits. Height of an individual, to illustrate, shall not be only, say, 65 or 66<br />
inches—it could be any value between these two.<br />
A characteristics, or variable, is said to be distributed normally if its curve appears as shown in Fig. 1<br />
and is represented by the following expression.<br />
1 exp (– (x – m)<br />
y(x) =<br />
2 /2s 2 )<br />
s 2p<br />
Here y(x) depicts the height of the y-ordinate at a specific value of x, and p and e are, respectively, equal<br />
to 3.1416 and 2.7183. Thus, if m and s, the mean and the standard deviation, are known, we can get the<br />
height of the curve at any specific value of x.<br />
It may be noted that:<br />
(a) A normal curve is a unimodal, bell-shaped, and symmetrical about its mean. As seen in the figure,<br />
the curve on either side of m is a mirror image of the other side. The mean, the median, and the<br />
mode all coincide.<br />
(b) The total area under the curve is divided evenly because of symmetry: 50% of area is to the right of<br />
a perpendicular line drawn at the mean and 50% is to its left.<br />
(c) It is assumed that the variable can take any value between – • and + •. As such, a normal curve<br />
approaches closely, but never touches, the horizontal axis.<br />
(d) If we construct vertical lines at a distance of one standard deviation from mean in both the<br />
directions, the area under the curve enclosed by these lines is equal to 68.27% of the total area. If we<br />
draw these lateral boundaries at two standard deviations from the mean in both the directions, they<br />
would enclose 95.45% of the total area. Similarly, m ± 3s covers 99.73% area under the curve.<br />
(e) A normal curve is defined completely by the mean, m, and standard deviation, s (> 0). That is, each<br />
different value of m and s specifies a distinct normal distribution and curve. Thus, the normal<br />
distribution is a family of distributions in which a member is distinguished from the others on the<br />
basis of the twin values of m and s. A distinguished member of this family is the distribution which<br />
has a zero mean and a standard deviation of 1. It is called the standard normal distribution.<br />
y<br />
yx ()= 1 =e<br />
s ÷ 2p<br />
-( x 2<br />
- m) /2s 2<br />
Fig. 1<br />
m<br />
Normal Curve<br />
Variable x
24 Quantitative Techniques in Management<br />
Calculation of probabilities For a normal distribution, the probability that the given variable would<br />
take a value in a certain range, say between X 1 and X 2 , is calculated as the proportion of the area under the<br />
normal curve between X 1 and X 2 , to the total area under it. For the purpose of calculating the probabilities,<br />
the given distribution is expressed in terms of the standard normal distribution. This is done by stating the<br />
variable X as the variable z, where<br />
z = X -m<br />
s<br />
At X = m, z would equal zero, z would be positive for values of X > m, and negative for values X < m.<br />
The proportion of areas under the normal curve between the mean and particular values of z are<br />
tabulated and shown in Table B1 at the end of the book or in the tables given on web. To illustrate, some of<br />
the areas are given here.<br />
Area<br />
(i) Between m and z = 1.00 0.3413<br />
(ii) Between m and z = 1.45 0.4265<br />
(iii) Between z = 1.2 and z = 2.8 0.4974 – 0.3849 = 0.1125<br />
(iv) Between z = – 1.2 and z = 2.8 0.3849 + 0.4974 = 0.8823<br />
(v) Beyond z = 1.2 0.5000 – 0.3849 = 0.1125<br />
It may be noted that since the curve is symmetrical, the area between the mean and a particular value of<br />
z is the same whether z is positive or negative.<br />
Example 4 A machine is set to fill in coffee powder in tins, with an average of 200 gms, and a<br />
standard deviation of 4 gms. Find the probability that a coffee tin selected at random shall contain<br />
(a) at least 200 gms, (b) between 200 and 206 gms, (c) between 195 and 205 gms, and (d) less than<br />
196 gms.<br />
The given distribution has m = 200 gms and s = 4 gms.<br />
(a) Area to the right of m being 0.50, this is the probability that a tin would contain at least 200 gms of<br />
coffee.<br />
(b) For X = 206, z = (206 – 200)/4 = 1.5.<br />
From the normal area table, area between m and z = 1.5 is 0.4332. Thus P(200 £ X £ 206)<br />
= 0.4332.<br />
(c) Area between X = 195 and X = 205 equals area between m and X = 195, plus area between m and<br />
X = 205. For X = 205, z = (205 – 200)/4 = 1.25 while for X = 195, z = (195 – 200)/4 = – 1.25. Area<br />
between m and z = 1.25 is equal to 0.3944.<br />
\ P(195 £ X £ 205) = 2 ¥ 0.3944 = 0.7888.<br />
(d) For X = 196, z = (196 – 200)/4 = – 1. Area between m and z = – 1 equals 0.3413.<br />
\ P(X < 196) = 0.5000 – 0.3413 = 0.1687.<br />
Example 5 A manufacturer of batteries wishes to give a guarantee for free replacement of the<br />
batteries whose life is less than a certain time period. If he desires to replace no more than 5% of the<br />
batteries, what should be the guarantee period, if the lives of batteries are known to be normally<br />
distributed with mean of 1200 hours and a standard deviation of 100 hours?<br />
The given information is depicted in Fig. 2. Here the value of X is to be determined. Since the area<br />
between m and X is 0.45, we observe from the normal area table that z corresponding to this area is<br />
– 1.645.
Appendix A4: <strong>Probability</strong> <strong>Distributions</strong> 25<br />
0.05<br />
Fig. 2<br />
m = 1200 hrs.<br />
Determination of X<br />
Hours<br />
Thus, – 1.645 = X - 1200 or X = 1200 – 100 ¥ 1.645 = 1035.5<br />
100<br />
A guarantee of 1036 hours may, therefore, be given by the manufacturer.