10.07.2015 Views

Using R for Introductory Statistics : John Verzani



■ Example 3.6: Kids' weights: Is weight related to height squared?

In Figure 3.8, the relationship between height and weight is given for the kid.weights (UsingR) data set. In Example 3.4, we mentioned that the BMI suggests a relationship between height squared and weight. We model this as follows:

> height.sq = kid.weights$height^2
> plot(weight ~ height.sq, data=kid.weights)
> res = lm(weight ~ height.sq, data=kid.weights)
> abline(res)
> res

Call:
lm(formula = weight ~ height.sq, data=kid.weights)

Coefficients:
(Intercept)    height.sq
     3.1089       0.0244

Figure 3.12 Height squared versus weight

Figure 3.12 shows a better fit with a linear model than before. However, the BMI is not constant during a person's growth years, so this is not exactly the expected relationship.

Using a model formula with transformations. If we had tried the above example using this model formula, we'd be in for a surprise:

> plot(weight ~ height^2, data=kid.weights) # not as expected
> res = lm(weight ~ height^2, data=kid.weights)
> abline(res)

The resulting graph would look identical to the graph of height versus weight in Figure 3.8 and not the graph of height squared versus weight in Figure 3.12.

The reason for this is that the model formula syntax uses the familiar math notations *, /, ^ differently. To use them in their ordinary sense, we need to insulate them in the formulas with the I() function, as in:
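A minimal sketch of the difference, assuming a synthetic stand-in for kid.weights so that it runs without the UsingR package. The data frame kids and its simulated values are hypothetical and are not the book's data; only lm(), plot(), abline(), coef(), and I() from base R are used.

```r
# Hypothetical data standing in for kid.weights (not the UsingR data set).
set.seed(1)
height <- seq(25, 60, length.out = 50)
weight <- 3 + 0.025 * height^2 + rnorm(50, sd = 2)
kids <- data.frame(height = height, weight = weight)

# Without I(), ^ is a formula operator, so height^2 reduces to the term height.
res.plain <- lm(weight ~ height^2, data = kids)

# With I(), ^ keeps its arithmetic meaning and the predictor is height squared.
res.sq <- lm(weight ~ I(height^2), data = kids)

names(coef(res.plain))  # "(Intercept)" "height"
names(coef(res.sq))     # "(Intercept)" "I(height^2)"

# Scatterplot of weight against height squared, with the fitted line added.
plot(weight ~ I(height^2), data = kids)
abline(res.sq)
```

Comparing the coefficient names is a quick way to check which predictor a formula actually used: the first fit is the same as lm(weight ~ height), while the second regresses on the squared heights without creating a separate height.sq variable.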
