10.07.2015 Views

Using R for Introductory Statistics : John Verzani


Describing populations 151

> res = rnorm(1000, mean = mu, sd = sigma)
> k = 1; sum(res > mu - k*sigma & res < mu + k*sigma)
[1] 694
> k = 2; sum(res > mu - k*sigma & res < mu + k*sigma)
[1] 958
> k = 3; sum(res > mu - k*sigma & res < mu + k*sigma)
[1] 998

Our simulation has 69.4%, 95.8%, and 99.8% of the data within 1, 2, and 3 standard deviations of the mean. If we repeat this simulation, the answers will likely differ, as the 1,000 random numbers will vary each time.

5.2.3 Popular distributions to describe populations

Many populations are well described by the normal distribution; others are not. For example, a population may be multimodal, not symmetric, or have longer tails than the normal distribution. Many other families of distributions have been defined to describe different populations. We highlight a few.

Uniform distribution

The uniform distribution on [a,b] is useful to describe populations that have no preferred values over their range. For a finite range of values, the sample() function can choose one with equal probabilities. The uniform distribution would be used when there is a range of values that is continuous.

The density is a constant on [a,b]. As the total area is 1, the height is 1/(b − a). The mean is in the middle of the interval, µ = (a + b)/2. The variance is (b − a)²/12. The distribution has short tails.

As mentioned, the family name in R is unif, and the parameters are min= and max= with defaults 0 and 1. We use Uniform(a, b) to denote this distribution. The left graphic in Figure 5.6 shows a histogram and boxplot of 50 random samples from Uniform(0, 10). On the histogram are superimposed the empirical density and the population density. The random sample is shown using the rug() function.

> res = runif(50, min=0, max=10)
## fig= setting uses bottom 35% of diagram
> par(fig=c(0,1,0,.35))
> boxplot(res, horizontal=TRUE, bty="n", xlab="uniform sample")
## fig= setting uses top 75% of figure
> par(fig=c(0,1,.25,1), new=TRUE)
> hist(res, prob=TRUE, main="", col=gray(.9))
> lines(density(res), lty=2)
> curve(dunif(x, min=0, max=10), lwd=2, add=TRUE)
> rug(res)

(We overlaid two graphics by using the fig= argument to par(). This parameter sets the portion of the graphic device to draw on. You may manually specify the range on the x-
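
The facts stated earlier for the uniform density (height, total area, mean, and variance) can be checked numerically. The following is only a sketch and not part of the original text: the endpoints a = 0 and b = 10 are chosen to match the Uniform(0, 10) example, and the seed, the sample size, and the variable names height, area, mu_unif, and var_unif are arbitrary choices made for illustration.

```r
# Illustrative endpoints (an assumption, matching the Uniform(0, 10) example)
a <- 0
b <- 10

# The density is constant on [a, b], so its value at any interior
# point should equal 1/(b - a)
height <- dunif(5, min = a, max = b)

# Numerically integrate the density over [a, b]; the area should be 1
area <- integrate(dunif, lower = a, upper = b, min = a, max = b)$value

# Theoretical mean and variance from the formulas in the text
mu_unif  <- (a + b) / 2
var_unif <- (b - a)^2 / 12

# Compare with a large random sample; the sample values change
# from run to run unless a seed is fixed
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- runif(100000, min = a, max = b)

c(height = height, area = area)
c(theory_mean = mu_unif, sample_mean = mean(x))
c(theory_var = var_unif, sample_var = var(x))
```

The sample mean and variance should land close to the theoretical values but will not match them exactly, for the same reason the counts in the normal simulation vary between runs.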
