10.07.2015

Using R for Introductory Statistics : John Verzani


Confidence intervals

correlates with the failure time exceeding the 1-million-hour mark. In stress tests of 15 hard drives they found an average of 9.5 days, with a standard deviation of 1 day. Does a 90% confidence interval include 10 days?

7.14 The stud.recs (UsingR) data set contains math SAT scores in the variable sat.m. Find a 90% confidence interval for the mean math SAT score for this data.

7.15 For the homedata (UsingR) data set find 90% confidence intervals for both variables y1970 and y2000. Use t.test(), but first discuss whether it is appropriate.

7.16 The variable weight in the kid.weights (UsingR) data set contains the weights of a random sample of children. Find a 90% confidence interval for the weight of 5-year-olds. You'll need to isolate just the 5-year-olds' data first. Here's one way:

> attach(kid.weights)
> ind = age < (5+1)*12 & age >= 5*12
> weight[ind] # just five-year-olds
> detach(kid.weights)

7.17 The brightness (UsingR) data set contains information on the brightness of stars in a sector of the sky. Find a 90% confidence interval for the mean.

7.18 The data set normtemp (UsingR) contains measurements of 130 healthy, randomly selected individuals. The variable temperature contains normal body temperature. Does the data appear to come from a normal distribution? If so, find a 90% confidence interval for the mean normal body temperature. Does it include 98.6 °F?

7.19 The t-distribution is also called the Student t-distribution. (A Guinness Brewery employee, William Gosset, derived the distribution of T to handle small samples. As Guinness did not allow publication of research results at the time, Gosset chose to publish under the pseudonym Student.)

Gosset applied his research to a data set containing height and left-middle-finger measurements of 3,000 criminals. These values were written on cards and randomly sorted into 750 samples, each containing four criminals. (This is how simulations were done previously.)

Suppose the first sample of four had an average height of 67.5 inches, with a standard deviation of 2.54. From this sample, find a 95% confidence interval for the mean height of the 3,000 data points.

7.20 We can investigate how robust the T statistic is to departures of the underlying parent population from normality. In particular, we can verify that if the parent population is not too skewed, or is symmetric without too heavy a tail, then the T statistic will still have, approximately, the t-distribution for its sampling distribution.

A simulation of the T statistic when the X_i are Normal(0, 1) may be done as follows:

> n = 10; m = 250; df = n-1
> res = c()
> for(i in 1:m) {
+ x = rnorm(n) # change this line only
+ res[i] = (mean(x) - 0)/(sd(x)/sqrt(n))
+ }
> qqplot(res, rt(m,df=df))
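
The simulation in 7.20 can also be written as a function, so that the parent population is swapped by passing a different random-number generator instead of editing a line by hand. The following is only a sketch under stated assumptions: sim_t and its arguments rgen and mu are hypothetical names introduced here, not part of the text or of UsingR, and the seed is arbitrary.

```r
# Sketch: the T-statistic simulation wrapped in a function.
# sim_t, rgen, and mu are hypothetical names chosen for illustration.
#   m    number of simulated samples
#   n    size of each sample
#   rgen function taking a sample size and returning that many random values
#   mu   true mean of the parent population (0 for Normal(0, 1))
sim_t <- function(m, n, rgen = rnorm, mu = 0) {
  res <- numeric(m)                  # preallocate rather than growing c()
  for (i in 1:m) {
    x <- rgen(n)                     # draw one sample of size n
    res[i] <- (mean(x) - mu) / (sd(x) / sqrt(n))
  }
  res
}

set.seed(1)                          # arbitrary seed, only for reproducibility
n <- 10; m <- 250; df <- n - 1
res <- sim_t(m, n)                   # Normal(0, 1) parent, as in the text
qqplot(res, rt(m, df = df))          # compare against random t draws with df = n-1
```

To try another parent population, pass a different generator together with its true mean, for example sim_t(m, n, rgen = rexp, mu = 1) for an exponential parent with rate 1. If the points in the plot fall near a straight line, the simulated T values are consistent with the t-distribution on n-1 degrees of freedom.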
