Normal approximation to the hypergeometric distribution in ...

S.N. Lahiri et al. / Journal of Statistical Planning and Inference 137 (2007) 3570–3590

Fig. 3. A plot of the cdfs of normalized Hypergeometric and Binomial random variables against the standard Normal cdf for the parameter values N = 60, p = 0.9 and f = 0.7. [Axes: x (Standardised) against y (CDF); curves: Hypergeometric, Binomial, Normal.]

Fig. 4. A plot of the cdfs of normalized Hypergeometric and Binomial random variables against the standard Normal cdf for the parameter values N = 60, p = 0.9 and f = 0.8. [Axes: x (Standardised) against y (CDF); curves: Hypergeometric, Binomial, Normal.]

Fig. 5. A plot of the cdfs of normalized Hypergeometric and Binomial random variables against the standard Normal cdf for the parameter values N = 60, p = 0.9 and f = 0.9. [Axes: x (Standardised) against y (CDF); curves: Hypergeometric, Binomial, Normal.]

Table 2
Values of the maximal error of Normal approximation to Hypergeometric distribution (viz., (N, p, f) of (3.1)) at N = 60 and the corresponding values of the maximal error for the Binomial (n, p) distribution, where n = Nf and p, f ∈ {0.5, 0.6, 0.7, 0.8, 0.9}

          Hypergeometric                            Binomial
n =       30      36      42      48      54        30      36      42      48      54
p = 0.5   0.1017  0.1038  0.1106  0.1260  0.1646    0.0722  0.0660  0.0612  0.0573  0.0540
p = 0.6   0.1038  0.1066  0.1171  0.1284  0.1817    0.0785  0.0722  0.0661  0.0626  0.0588
p = 0.7   0.1106  0.1171  0.1283  0.1476  0.1844    0.0888  0.0808  0.0757  0.0711  0.0670
p = 0.8   0.2268  0.2528  0.2896  0.3480  0.4633    0.1070  0.0992  0.0922  0.0859  0.0803
p = 0.9   0.3128  0.3550  0.4046  0.4633  0.7169    0.1474  0.1391  0.1289  0.1184  0.1145

Table 3
Values of the maximal error of Normal approximation to Hypergeometric distribution (viz., (N, p, f) of (3.1)) at N = 200 and the corresponding values of the maximal error for the Binomial (n, p) distribution, where n = Nf and p, f ∈ {0.5, 0.6, 0.7, 0.8, 0.9}

          Hypergeometric                            Binomial
n =       100     120     140     160     180       100     120     140     160     180
p = 0.5   0.0562  0.0574  0.0613  0.0701  0.0929    0.0398  0.0363  0.0337  0.0315  0.0297
p = 0.6   0.0574  0.0593  0.0641  0.0743  0.0995    0.0433  0.0395  0.0366  0.0343  0.0323
p = 0.7   0.0613  0.0641  0.0702  0.0822  0.1112    0.0491  0.0449  0.0416  0.0389  0.0367
p = 0.8   0.1261  0.1373  0.1559  0.1887  0.2634    0.0595  0.0543  0.0503  0.0471  0.0444
p = 0.9   0.1764  0.1922  0.2181  0.2634  0.3652    0.0832  0.0761  0.0705  0.0661  0.0623

approximation; the maximal error of approximation to the Binomial (54, 0.8) distribution is 0.0803. However, with N = 60, n = 54 and p = 0.9, the maximal error of Normal approximation to the Hypergeometric distribution is as high as 0.4633, making the approximation practically useless. With about a 9-fold increase in the sample size, at n = 450, the accuracy of the approximation in the Hypergeometric case only improves to 0.1683 for the same values of f and p. The corresponding maximal error for the Normal approximation to the Binomial distribution with parameters n = 450 and p = 0.8 is only 0.0282. Thus, the loss in accuracy in this case is an astounding 600% compared to the Binomial case. Indeed, similar high levels of loss in accuracy occur for values of f and p near 1 even when the population size N is increased to 2000 and beyond.

As a consequence, the commonly used guidelines for the accuracy in the Binomial case can be misleading for assessing accuracy of the Normal approximation to the Hypergeometric distribution in the extreme cases.
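
The comparison above can be explored numerically. The following is a minimal sketch, not the authors' code. It assumes that the maximal error is the supremum over x of the absolute difference between the cdf of the standardised variable and the standard Normal cdf. Equation (3.1) is not reproduced in this excerpt, so that definition is an assumption, as is the standardisation by the finite-population variance n p (1 − p)(N − n)/(N − 1). The function names are hypothetical, and M = Np denotes the number of "success" units in the population.

```python
from math import comb, erf, sqrt


def normal_cdf(z):
    """Standard Normal cdf, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def max_normal_error(pmf, mean, sd):
    """Sup-distance between a standardised lattice cdf and the Normal cdf.

    pmf is a list of (k, P(X = k)) pairs sorted by k. The cdf is a step
    function, so the supremum is attained at a jump; both the left limit
    and the value at each jump are compared with the Normal cdf.
    """
    worst, cdf_left = 0.0, 0.0
    for k, prob in pmf:
        phi = normal_cdf((k - mean) / sd)
        cdf_right = cdf_left + prob
        worst = max(worst, abs(cdf_left - phi), abs(cdf_right - phi))
        cdf_left = cdf_right
    return worst


def hypergeometric_error(N, M, n):
    """Maximal error for a sample of size n (n < N) from N units, M of them successes."""
    p = M / N
    lo, hi = max(0, n - (N - M)), min(n, M)  # support of the distribution
    total = comb(N, n)
    pmf = [(k, comb(M, k) * comb(N - M, n - k) / total) for k in range(lo, hi + 1)]
    # Assumed standardisation: variance with finite-population correction.
    sd = sqrt(n * p * (1 - p) * (N - n) / (N - 1))
    return max_normal_error(pmf, n * p, sd)


def binomial_error(n, p):
    """Maximal error for the Binomial (n, p) distribution."""
    pmf = [(k, comb(n, k) * p**k * (1 - p) ** (n - k)) for k in range(n + 1)]
    return max_normal_error(pmf, n * p, sqrt(n * p * (1 - p)))


if __name__ == "__main__":
    # Symmetric case N = 60, p = f = 0.5 (first entries of Table 2).
    print(hypergeometric_error(60, 30, 30))
    print(binomial_error(30, 0.5))
    # Ratio of the two maximal errors quoted in the text for n = 450.
    print(0.1683 / 0.0282)
```

For the symmetric case p = f = 0.5 at N = 60, this sketch gives roughly 0.1017 for the Hypergeometric and 0.0722 for the Binomial distribution, in line with the first entries of Table 2. The last line prints a ratio of about 6, which is one way to read the 600% figure quoted in the text. Values for other parameter settings depend on the assumed standardisation and may differ from the tabulated ones.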