Multiple Linear Regression
and

$$\mathrm{SSR} = \mathbf{Y}^{\mathrm{T}}\left(\mathbf{H} - \tfrac{1}{n}\mathbf{J}\right)\mathbf{Y}. \tag{33}$$

Immediately from Equations 31, 32, and 33 we get the ANOVA equality

$$\mathrm{SSTO} = \mathrm{SSE} + \mathrm{SSR}. \tag{34}$$

The multiple coefficient of determination is defined by the formula

$$R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SSTO}}. \tag{35}$$

$R^2$ is the proportion of total variation that is explained by the multiple regression model. In MLR we must be careful, however, because the value of $R^2$ can be artificially inflated by the addition of explanatory variables to the model, regardless of whether or not the added variables are useful with respect to prediction of the response variable.

We address the problem by penalizing $R^2$ when parameters are added to the model. The result is an adjusted $R^2$, which we denote by $\overline{R}^2$:

$$\overline{R}^2 = \left(R^2 - \frac{p}{n-1}\right)\left(\frac{n-1}{n-p-1}\right). \tag{36}$$

It is good practice for the statistician to weigh both $R^2$ and $\overline{R}^2$ during assessment of model utility. In many cases their values will be very close to each other. If their values differ substantially, or if one changes dramatically when an explanatory variable is added, then he or she should take a closer look at the explanatory variables in the model.

3.2 How to do it with R

For the trees data, we can get $R^2$ and $\overline{R}^2$ from the summary output or access the values directly by name as shown (recall that we stored the summary object in treesumry).

> treesumry$r.squared
[1] 0.94795
> treesumry$adj.r.squared
[1] 0.9442322

High values of $R^2$ and $\overline{R}^2$ such as these indicate that the model fits very well, which agrees with what we saw in Figure 2.11.
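As a check on Equations 35 and 36, we can compute both quantities by hand from the residuals and compare them to the values stored in the summary object. This is a minimal sketch that assumes the fitted model for the trees data is Volume regressed on Girth and Height (so p = 2 explanatory variables); adjust the formula and p if your model differs.

```r
# Fit the assumed model for the trees data (built into R).
fit <- lm(Volume ~ Girth + Height, data = trees)

n <- nrow(trees)   # number of observations
p <- 2             # number of explanatory variables (Girth, Height)

# Error and total sums of squares.
SSE  <- sum(resid(fit)^2)
SSTO <- sum((trees$Volume - mean(trees$Volume))^2)

# Equation (35): multiple coefficient of determination.
R2 <- 1 - SSE / SSTO

# Equation (36): adjusted R^2, penalized for the number of parameters.
R2adj <- (R2 - p / (n - 1)) * (n - 1) / (n - p - 1)
```

The hand-computed values should agree with `summary(fit)$r.squared` and `summary(fit)$adj.r.squared`; Equation 36 is algebraically the same penalty as the usual form 1 - (1 - R^2)(n - 1)/(n - p - 1).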
