10.07.2015 Views

Using R for Introductory Statistics : John Verzani


1. The model of illiteracy rate (Illiteracy) modeled by high school graduation rate (HS.Grad).
2. The model of life expectancy (Life.Exp) modeled by the murder rate (Murder).
3. The model of income (Income) modeled by the illiteracy rate (Illiteracy).

Write a sentence or two describing any relationship. In particular, do you find it as expected or is it surprising?

3.27 The data set batting (UsingR) contains baseball statistics for the year 2002. Fit a linear model to runs batted in (RBI) modeled by number of home runs (HR). Make a scatterplot and add a regression line. In 2002, Mike Piazza had 33 home runs and 98 runs batted in. What is his predicted number of RBIs based on his number of home runs? What is his residual?

3.28 In the American culture, it is not considered unusual or inappropriate for a man to date a younger woman. But it is viewed as inappropriate for a man to date a much younger woman. Just what is too young? Some say anything less than half the man’s age plus seven. This is tested with a survey of ten people, each indicating what the cutoff is for various ages. The results are in the data set too.young (UsingR). Fit the regression model and compare it with the rule of thumb by also plotting the line y=7+(1/2)x. How do they compare?

3.29 The data set diamond (UsingR) contains data about the price of 48 diamond rings. The variable price records the price in Singapore dollars and the variable carat records the size of the diamond. Make a scatterplot of carat versus price. Use pch=5 to plot with diamonds. Add the regression line and predict the amount a one-third carat diamond ring would cost.

3.30 The data set Animals (MASS) contains the body weight and brain weight of several different animals. A simple scatterplot will not suggest the true relationship, but a log-transform of both variables will. Do this transform and then find the slope of the regression line. Compare this slope to that found from a robust regression model using lqs(). Comment on any differences.

3.31 To gain an understanding of the variability present in a measurement, a researcher may repeat or replicate a measurement several times. The data set breakdown (UsingR) includes measurements in minutes of the time it takes an insulating fluid to break down as a function of an applied voltage. The relationship calls for a log-transform. Plot the voltage against the logarithm of time. Find the coefficients for simple linear regression and discuss the amount of variance for each level of the voltage.

3.32 The motors (MASS) data set contains measurements on how long, in hours, it takes a motor to fail. For a range of temperatures, in degrees Celsius, a number of motors were run in an accelerated manner until they failed, or until time was cut off. (When time is cut off the data is said to have been censored.) The data shows a relationship between increased temperature and shortened life span.

The commands

  > data(motors, package="MASS")
  > plot(time ~ temp, pch=cens, data=motors)
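
Several of the exercises above share one workflow: draw a scatterplot, fit a line with lm(), add it with abline(), then read off a predicted value and a residual. The following is a minimal sketch of that workflow, not a worked solution. It assumes only the MASS package (which ships with R) and uses the Animals data from exercise 3.30 for illustration. The choice of the "Human" row as the example observation and the seed value are assumptions made for this sketch.

```r
library(MASS)  # provides the Animals data set and lqs()

data(Animals, package = "MASS")

# Ordinary least-squares fit on the log-log scale
fit_lm <- lm(log(brain) ~ log(body), data = Animals)

# Resistant fit for comparison; lqs() relies on random subsampling,
# so the seed is fixed (an arbitrary choice) to make the run repeatable
set.seed(1)
fit_lqs <- lqs(log(brain) ~ log(body), data = Animals)

# Scatterplot of the transformed data with both fitted lines
plot(log(brain) ~ log(body), data = Animals)
abline(fit_lm)            # least-squares line (solid)
abline(fit_lqs, lty = 2)  # resistant line (dashed)

# Intercept and slope of each fit
coef(fit_lm)
coef(fit_lqs)

# Predicted value and residual for a single observation,
# here the row named "Human" (illustrative choice)
human    <- Animals["Human", ]
pred     <- predict(fit_lm, newdata = human)  # predicted log(brain)
residual <- log(human$brain) - pred           # observed minus predicted
```

The same pattern applies to a data set such as batting: build a one-row data frame with the predictor value of interest, pass it to predict() as newdata, and subtract the prediction from the observed response to get the residual.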
