
Using R for Introductory Statistics : John Verzani


Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The values in the column headed Sum Sq are SSReg and RSS. The total sum of squares, SST, is the sum of the two. Although the ratio of the mean sums of squares as printed, 2596/14, is not exactly 183, 183 is the correct value of the F statistic: the printed numbers have been rounded to integers.

Predicting the response with predict()

The function predict() is used to make different types of predictions. A template for our usage is

predict(res, newdata=…, interval=…, level=…)

The value of res is the output of a modeling function, such as lm(). We call this res below, but we can use any valid name. New values of the predictor are given to the argument newdata= in the form of a data frame with names that match those used in the model formula. The arguments interval= and level= are set when prediction or confidence intervals are desired.

The simplest usage, predict(res), returns the predicted values (the ŷ values) for the data. Predictions for other values of the predictor are specified using a data frame, as this example illustrates:

> predict(res, newdata=data.frame(age=42))
[1] 176.5

This finds the predicted maximum heart rate for a 42-year-old. The age= part of the data frame call is important. Variable names in the data frame supplied to the newdata= argument must exactly match the variable names used when the model object was produced.

Prediction intervals

The value of ŷ can be used to predict two different things: the value of a single observation of y for a given x, or the average value of many observations of y for a given x. If we think of a model with replication (repeated y values for a given x, such as in Figure 10.6), then the difference is clear: one is a prediction for a given point, the other a prediction for the average of the points.

Statistical inference about the predicted value of y based on the sample is done with a prediction interval. As y is not a parameter, we don’t call this a confidence interval. The form of the prediction interval is similar to that of a confidence interval:

$$\hat{y} \pm t^* \cdot SE$$

For the prediction interval, the standard error is

$$SE = s\sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}} \qquad (10.9)$$
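To make the distinction between the two kinds of interval concrete, here is a minimal sketch in R. The data are simulated purely for illustration (they are not the heart-rate data used in the text), and the names age, mhr, new, conf_int, pred_int, and manual are hypothetical. The manual check assumes the usual textbook standard error for predicting a single observation in simple linear regression, s·sqrt(1 + 1/n + (x − x̄)² / Σ(xᵢ − x̄)²), with n − 2 degrees of freedom.

```r
# Hypothetical data, simulated for illustration only (not the book's data set)
set.seed(1)
age <- round(runif(15, min = 18, max = 72))
mhr <- 210 - 0.8 * age + rnorm(15, sd = 4)   # assumed linear trend plus noise
res <- lm(mhr ~ age)

# The column name must match the predictor name used in the model formula
new <- data.frame(age = 42)

# Point prediction only
predict(res, newdata = new)

# Interval for the mean response at age 42
conf_int <- predict(res, newdata = new, interval = "confidence", level = 0.95)

# Interval for a single new observation at age 42
pred_int <- predict(res, newdata = new, interval = "prediction", level = 0.95)

# Manual check of the prediction interval, assuming the standard formula
n     <- length(age)
s     <- summary(res)$sigma                  # residual standard error
se    <- s * sqrt(1 + 1/n + (42 - mean(age))^2 / sum((age - mean(age))^2))
tstar <- qt(0.975, df = n - 2)               # two-sided 95% critical value
yhat  <- unname(predict(res, newdata = new))
manual <- c(fit = yhat, lwr = yhat - tstar * se, upr = yhat + tstar * se)

print(conf_int)
print(pred_int)
print(manual)
```

Under these assumptions the hand-computed limits should agree with the row returned by interval="prediction", and the prediction interval should be wider than the confidence interval, since it also accounts for the variability of a single observation around the mean response.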
