10.07.2015 Views

Using R for Introductory Statistics : John Verzani


10.1.4 Using lm() to find the estimates

In Chapter 3 we learned how to fit the simple linear regression model using lm(). The basic usage is of the form

lm(formula, data=…, subset=…)

As is usual with functions using model formulas, the data= argument allows the variable names to reference those in the specified data frame, and the subset= argument can be used to restrict the indices of the variables used by the modeling function.

By default, the lm() function will print out the estimates for the coefficients. Much more is returned, but needs to be explicitly asked for. Usually, we store the results of the model in a variable, so that it can subsequently be queried for more information.

Figure 10.1 Simulation of model $Y_i = 1 + 2x_i + \varepsilon_i$. The regression line based on the data is drawn with dashes. The big square marks the value […]

■ Example 10.1: Maximum heart rate

Many people use heart-rate monitors when exercising in order to achieve target heart rates for optimal training. The maximum safe heart rate is generally thought to be 220 minus one's age in years. A 25-year-old would have a maximum heart rate of 195 beats per minute.

This formula is said to have been devised in the 1970s by Sam Fox and William Haskell while en route to a conference. Fox and Haskell had plotted a graph of some data, and had drawn by hand a line through the data, guessing the slope and intercept.* Their formula is easy to compute and comprehend and has found widespread acceptance.

It may be wrong, though. In 2001, Tanaka, Monahan, and Seals found that 209 − 0.7 times one's age is a better fit.

The following data is simulated to illustrate:

> age = rep(seq(20,60,by=5), 3)
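As a minimal sketch of the lm() usage described above, the ages can be paired with a simulated response and the fit stored for later queries. The excerpt ends before the response is defined, so everything past the `age` line is an assumption made for illustration: the variable names `mhr`, `res`, `res_under40`, `d`, and `res_df`, the seed, and the choice to simulate around the 209 − 0.7 × age line with normal noise of standard deviation 4.

```r
# The sequence 20, 25, ..., 60 repeated three times (from the text): 27 ages
age <- rep(seq(20, 60, by = 5), 3)

# ASSUMPTION: response simulated around the 209 - 0.7 * age line quoted in
# the text; the seed and the noise sd of 4 are illustrative choices only.
set.seed(1)
mhr <- 209 - 0.7 * age + rnorm(length(age), sd = 4)

# Store the fit in a variable so it can be queried later
res <- lm(mhr ~ age)

res            # default print: the call and the coefficient estimates
coef(res)      # the estimates as a named numeric vector
summary(res)   # standard errors, t-tests, R-squared, and more

# subset= restricts which observations enter the fit
res_under40 <- lm(mhr ~ age, subset = age < 40)

# data= lets the formula refer to columns of a data frame
d <- data.frame(age = age, mhr = mhr)
res_df <- lm(mhr ~ age, data = d)
```

With this setup the estimated slope should land near −0.7 and the intercept near 209, though the exact values depend on the simulated noise. The fits `res` and `res_df` use the same data and therefore give identical coefficients; only the way the variables are looked up differs.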
