Using R for Introductory Statistics : John Verzani

> hist(res, ylim=c(0,y.max), prob=TRUE, main="", col=gray(.9))
> lines(density(res), lty=2)
> curve(dexp(x, rate=1/5), lwd=2, add=TRUE)
> rug(res)

Plotting the histogram and then adding the empirical and population densities as shown may lead to truncated graphs, as the y-limits of the histogram may not be large enough. In the above, we look first at the maximum y-values of the histogram and the two densities. Then we set the ylim= argument in the call to hist(). Finding the maximum value differs in each case. For the hist() function, more is returned than just a graphic. We store the result and access the density part with tmp.hist$density. For the empirical density, two named parts of the return value are x and y. We want the maximum of the y value. Finally, the population density is maximal at 0, so we simply evaluate the dexp() function at 0 to give this. For other densities, we may need to find the maximum by other means.

Lognormal distribution

The lognormal distribution is a heavily skewed continuous distribution on the positive numbers. A lognormal random variable, X, takes its name from the fact that log(X) is normally distributed. Lognormal distributions are used to describe populations such as incomes.

In R the family name is lnorm. The two parameters are labeled meanlog= and sdlog=. These are the mean and standard deviation of log(X), not of X.

Figure 5.7 shows a sample of size 50 from the lognormal distribution, with parameters meanlog=0 and sdlog=1.

Figure 5.7 Histogram and boxplot of 50 samples from lognormal distribution with meanlog=0 and sdlog=1

5.2.4 Sampling distributions
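Returning briefly to the lognormal example of Figure 5.7, the following is a minimal sketch of how such a sample might be drawn and plotted using the same y-limit technique described for the exponential case. The seed value and the name mode.x are illustrative assumptions, not taken from the text; the lognormal density is not maximal at 0, so the sketch uses the known location of its mode, exp(meanlog - sdlog^2), as one of the "other means" of finding the maximum.

```r
set.seed(1)                                  # arbitrary seed (assumption), for reproducibility only
res <- rlnorm(50, meanlog = 0, sdlog = 1)    # sample of size 50 from the lognormal family

# meanlog and sdlog describe log(X), so these should be roughly 0 and 1
mean(log(res))
sd(log(res))

# Find the largest y-value among the three things to be drawn
tmp.hist <- hist(res, plot = FALSE)          # compute the histogram without drawing it
tmp.dens <- density(res)                     # empirical density; has components x and y
mode.x   <- exp(0 - 1^2)                     # lognormal mode: exp(meanlog - sdlog^2)
y.max    <- max(tmp.hist$density, tmp.dens$y,
                dlnorm(mode.x, meanlog = 0, sdlog = 1))

# Draw the histogram with room for both density curves
hist(res, ylim = c(0, y.max), prob = TRUE, main = "", col = gray(.9))
lines(tmp.dens, lty = 2)                                        # empirical density
curve(dlnorm(x, meanlog = 0, sdlog = 1), lwd = 2, add = TRUE)   # population density
rug(res)
```

Because y.max is taken over the histogram, the empirical density, and the population density together, none of the three should be cut off at the top of the plot.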
