10.07.2015 Views

Using R for Introductory Statistics : John Verzani



If Table 12.1 contains data on the number of e-mails opened for each possible combination, what can we say about the importance of including a name or an offer in the subject heading?

For simplicity, assume that we have two variables, $X$ and $Y$, where $Y$ is a binary variable coded as 0 or 1. For example, 1 could mean a spam message was opened. If we try to model the response with $Y_i = \beta_0 + \varepsilon_i$ or $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, then, as $Y_i$ is either 0 or 1, the $\varepsilon_i$ can't be an i.i.d. sample from a normal population. Consequently, the linear model won't apply. Because having only two possible answers puts a severe restriction on the error term, the probability of success is modeled instead.

Let $\pi_i = P(Y_i = 1)$. Then $\pi_i$ is in the range 0 to 1. We might try to fit the model $\pi_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, but again the range on the left side is limited, whereas that on the right isn't. Even if we restrict our values of the $x_i$, the variation of the $\varepsilon_i$ can lead to probabilities outside of $[0,1]$.

Let's change tack. For a binary random variable, the probability is also an expected value. That is, after conditioning on the value of $x_i$, we have $E(Y_i \mid x_i) = \pi_i$. In the simple linear model we called this $\mu_{y|x}$, and we had the model $Y_i = \mu_{y|x} + \varepsilon_i$. Interpreting this differently will let us continue. We mentioned that the assumption on the error can be viewed in two ways: either the error terms, the $\varepsilon_i$ values, are a random sample from a mean-0 normally distributed population, or, equivalently, each data point $Y_i$ is randomly selected from a Normal($\mu_{y|x}$, $\sigma$) distribution independently of the others.
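The difficulty with fitting a straight line to a 0/1 response can be seen in a small simulation. This is a minimal sketch under stated assumptions: the data are simulated rather than taken from Table 12.1, and the sample size, range of x, and "true" probability curve are arbitrary choices made only for illustration.

```r
# Simulated binary data (an assumption for illustration, not the e-mail data)
set.seed(1)
x <- seq(-3, 3, length.out = 200)
p <- exp(2 * x) / (1 + exp(2 * x))          # assumed true P(Y = 1 | x)
y <- rbinom(length(x), size = 1, prob = p)  # responses coded 0 or 1

# Least-squares fit of the 0/1 response, as if the linear model applied
fit <- lm(y ~ x)

# The fitted "probabilities" are not confined to [0, 1]
fitted.range <- range(fitted(fit))
fitted.range

# At any given x the residual can take only two values
# (0 - fitted or 1 - fitted), so the errors cannot be normal
head(cbind(y = y, fitted = fitted(fit), residual = resid(fit)))
```

With these settings the fitted line dips below 0 at the left end of the x range and rises above 1 at the right end, which is the range problem described above.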
Thus, we have the following ingredients in simple linear regression:

■ The predictors enter in a linear manner through $\beta_0 + \beta_1 x_1$.
■ The distribution of each $Y_i$ is determined by the mean, $\mu_{y|x}$, and some scale parameter $\sigma$.
■ There is a relationship between the mean and the linear predictors ($\mu_{y|x} = \beta_0 + \beta_1 x_1$).

The last point needs to be changed to continue with the binary regression model. Let $\eta = \beta_0 + \beta_1 x_1$. Then the change is to assume that $\eta$ can be transformed to give the mean by some function $m()$ via $\mu_{y|x} = m(\eta)$, which can be inverted to yield back $\eta = m^{-1}(\mu_{y|x})$. The function $m()$ is called a link function, as it links the predictor with the mean.

The logistic function $m(x) = e^x/(1+e^x)$ is often used (see Figure 12.1), and the corresponding model is called logistic regression. For this, we have

$$\pi_i = m(\beta_0 + \beta_1 x_i) = \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}}.$$

The logistic function turns values between $-\infty$ and $\infty$ into values between 0 and 1, so the numbers specifying the probabilities will be between 0 and 1. When $m()$ is inverted we have

$$\log\left(\frac{\pi_i}{1 - \pi_i}\right) = \beta_0 + \beta_1 x_i. \tag{12.1}$$
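The logistic function and its inverse can be checked numerically. The following is a sketch, not the book's own code: the names m, m.inv, eta, and p.hat are chosen here for illustration, and the data and coefficients (0.5 and 1.5) are simulated assumptions. The functions plogis(), qlogis(), and glm() are from base R's stats package.

```r
# The logistic function m() and its inverse, written out from the formulas
m <- function(x) exp(x) / (1 + exp(x))
m.inv <- function(p) log(p / (1 - p))

eta <- c(-5, -1, 0, 1, 5)
m(eta)                           # every value lies between 0 and 1
all.equal(m.inv(m(eta)), eta)    # inverting m() recovers eta
all.equal(m(eta), plogis(eta))   # plogis() computes the same function
all.equal(m.inv(0.25), qlogis(0.25))  # qlogis() is its inverse

# Simulated binary data (assumed coefficients, for illustration only)
set.seed(1)
x <- seq(-3, 3, length.out = 200)
y <- rbinom(length(x), size = 1, prob = m(0.5 + 1.5 * x))

# Logistic regression: the binomial family uses the logit link by default
fit <- glm(y ~ x, family = binomial)
coef(fit)                        # estimates of beta_0 and beta_1

# Fitted probabilities m(b0 + b1 * x) stay inside [0, 1]
p.hat <- predict(fit, type = "response")
range(p.hat)
```

Unlike a least-squares line, the fitted values here are probabilities by construction, because the linear predictor is passed through m() before being interpreted.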
