12.07.2015 Views

Notes on Poisson Regression and Some Extensions



such that $E(v) = \alpha/\beta$ and $\mathrm{var}(v) = \alpha/\beta^2$. In Bayesian terms, the gamma distribution is the conjugate prior distribution for the Poisson (and other distributions in the exponential family). When the prior distribution is combined with the Poisson distribution for y, conditional on v, the resulting unconditional (marginal) distribution of y is negative binomial. This was discovered not long after the Poisson distribution, perhaps around 1909.

For convenience, we normalize the distribution of v so that it has a mean of 1.0 as follows:

$$E(v) = 1.0,$$

and this implies $\beta = \alpha$ and hence

$$\mathrm{var}(v) = 1/\alpha,$$

so the resulting distribution for v is

$$g(v) = \frac{\alpha^{\alpha} v^{\alpha-1}}{\Gamma(\alpha)} \exp(-\alpha v), \qquad \alpha > 0.$$

The likelihood of y for the ith woman conditional on her random effect $v_i$ is $L(y_i \mid v_i)$. To obtain the marginal likelihood of y for the whole sample, we need to integrate over the distribution of v for each woman in our sample in order to “average” out the random effect:

$$L_m = \prod_i \int_v L(y_i \mid v)\, g(v)\, dv.$$

The resulting distribution can be evaluated in closed form. In this expression and in what follows, $\alpha$ denotes the variance of v, that is, the reciprocal of the shape parameter in $g(v)$ above:

$$L_m = \prod_i \frac{\Gamma(y_i + \frac{1}{\alpha})}{\Gamma(y_i + 1)\,\Gamma(\frac{1}{\alpha})} \left[\frac{\alpha\mu_i}{1 + \alpha\mu_i}\right]^{y_i} \left[\frac{1}{1 + \alpha\mu_i}\right]^{1/\alpha}.$$

Exercise 5. Derive the negative binomial distribution as a mixture of a Poisson and gamma distribution.

A negative binomial variable has mean $E(Y) = \mu$ and, in this parameterization, variance $\mathrm{var}(Y) = \mu + \alpha\mu^2$.
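The mixture result can be checked numerically before attempting the derivation. The sketch below is only an illustration, not part of the original notes: it assumes that the conditional Poisson mean is $\mu v$ and that $\alpha$ is the variance of the mean-one gamma effect (so the gamma shape and rate are both $1/\alpha$). The function names and the integration settings (`upper`, `steps`) are hypothetical choices. It integrates the Poisson probability against the gamma density with a simple midpoint rule and compares the result with the closed-form negative binomial probability; it also sums the closed form to recover the mean and variance, which under these assumptions should come out as $\mu$ and $\mu + \alpha\mu^2$.

```python
import math


def gamma_pdf_mean_one(v, alpha):
    """Gamma density with mean 1 and variance alpha (shape = rate = 1/alpha)."""
    k = 1.0 / alpha
    return math.exp(k * math.log(k) + (k - 1.0) * math.log(v) - k * v - math.lgamma(k))


def poisson_pmf(y, rate):
    """Poisson probability of count y for a given positive rate."""
    return math.exp(y * math.log(rate) - rate - math.lgamma(y + 1.0))


def negbin_pmf(y, mu, alpha):
    """Closed-form negative binomial probability (mean mu, overdispersion alpha)."""
    k = 1.0 / alpha
    log_p = (math.lgamma(y + k) - math.lgamma(y + 1.0) - math.lgamma(k)
             + y * math.log(alpha * mu / (1.0 + alpha * mu))
             - k * math.log(1.0 + alpha * mu))
    return math.exp(log_p)


def mixture_pmf(y, mu, alpha, upper=40.0, steps=20000):
    """Midpoint-rule approximation of the integral of Poisson(y | mu*v) * g(v) dv.

    The upper limit and step count are illustrative assumptions that work
    for moderate mu and alpha < 1; they are not tuned for general use.
    """
    h = upper / steps
    total = 0.0
    for j in range(steps):
        v = (j + 0.5) * h  # midpoint of the j-th subinterval, always > 0
        total += poisson_pmf(y, mu * v) * gamma_pdf_mean_one(v, alpha)
    return total * h


def negbin_moments(mu, alpha, y_max=400):
    """Mean and variance obtained by summing the closed-form pmf up to y_max."""
    probs = [negbin_pmf(y, mu, alpha) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    mu, alpha = 3.0, 0.5  # arbitrary illustrative values
    for y in range(6):
        print(y, mixture_pmf(y, mu, alpha), negbin_pmf(y, mu, alpha))
    print(negbin_moments(mu, alpha))
```

The two columns printed for each count should agree to several decimal places. The agreement is a numerical sanity check rather than a proof, so Exercise 5 still has to be done analytically.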
The log-likelihood function for this model, obtained by taking the log of the closed-form marginal likelihood, is

$$\log L = \sum_{i=1}^{n} \left\{ \log\Gamma\!\left(y_i + \tfrac{1}{\alpha}\right) - \log\Gamma\!\left(\tfrac{1}{\alpha}\right) - \log\Gamma(y_i + 1) + y_i \log(\alpha\mu_i) - \left(y_i + \tfrac{1}{\alpha}\right)\log(1 + \alpha\mu_i) \right\}.$$

This model is somewhat more difficult to optimize than the Poisson, but the task is routine. All major statistical packages (SAS, R, Stata) have routines for estimating it.

Estimation. We maximize this likelihood with respect to the regression parameters β and the overdispersion parameter α. The latter is the variance of the gamma-distributed random effect mentioned earlier (note that we have assumed this effect to have a mean of 1 and have estimated its variance using the current sample of women). In fact, the negative binomial is identical to an individual-level random-effects Poisson regression. We get identical results using either nbreg or xtpois; for xtpois, however, we must supply a respondent ID number in order to identify the clusters (each of size 1) to be used in the random-effects specification.
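To make the estimation step concrete, here is a minimal sketch of the log-likelihood, written as the log of the closed-form marginal likelihood with $\alpha$ taken to be the variance of the gamma effect. It is an illustration under stated assumptions, not the routine used by nbreg or any other package. The model has no covariates, so every woman shares one mean $\mu$, whose maximum likelihood estimate is then the sample mean. Only $\alpha$ is searched, using a golden-section search that assumes the profile log-likelihood has a single peak inside the bracket `[lo, hi]`. The function names, the bracket, and the toy counts are all hypothetical.

```python
import math


def negbin_loglik(y, mu, alpha):
    """Negative binomial log-likelihood for counts y with means mu, alpha > 0."""
    k = 1.0 / alpha
    total = 0.0
    for yi, mi in zip(y, mu):
        total += (math.lgamma(yi + k) - math.lgamma(k) - math.lgamma(yi + 1.0)
                  + yi * math.log(alpha * mi)
                  - (yi + k) * math.log(1.0 + alpha * mi))
    return total


def fit_intercept_only(y, lo=1e-4, hi=10.0, tol=1e-6):
    """Estimate (mu, alpha) for a model with no covariates.

    mu is set to the sample mean; alpha is found by golden-section search,
    assuming the log-likelihood is unimodal in alpha on [lo, hi].
    """
    mu_hat = sum(y) / len(y)
    mu = [mu_hat] * len(y)

    def f(a):
        return negbin_loglik(y, mu, a)

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
    return mu_hat, (a + b) / 2.0


if __name__ == "__main__":
    # Hypothetical overdispersed counts, made up purely for illustration.
    toy_counts = [0, 0, 1, 1, 2, 2, 3, 4, 6, 9]
    mu_hat, alpha_hat = fit_intercept_only(toy_counts)
    print("mu_hat =", mu_hat, "alpha_hat =", alpha_hat)
    print("log L =", negbin_loglik(toy_counts, [mu_hat] * len(toy_counts), alpha_hat))
```

With covariates, the means $\mu_i$ would depend on β and both sets of parameters would have to be maximized jointly, which is what the packaged routines do. The one-dimensional search here is only meant to show the shape of the problem.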
