12.07.2015 Views

Notes on Poisson Regression and Some Extensions




The probability function for a zero-inflated count variable $Y_z$ can be specified as

$$
\Pr(Y_z = y \mid p, \mu) =
\begin{cases}
p + (1 - p)\,e^{-\mu} & \text{if } y = 0 \\[4pt]
(1 - p)\,\dfrac{e^{-\mu}\mu^{y}}{y!} & \text{if } y > 0
\end{cases}
\tag{2}
$$

Unlike the Poisson distribution, which is determined by a single parameter, $\mu$, the ZIP distribution is determined by two parameters: $\mu$ (the parameter of the standard Poisson distribution) and $p$ (the probability of being an inflated zero). The mean and variance of a zero-inflated count variable are, respectively,

$$
E(Y_z) = \mu(1 - p)
$$

and

$$
\operatorname{var}(Y_z) = \mu(1 - p)(1 + \mu p),
$$

which shows the nature of the overdispersion via the variance-to-mean ratio when count data are characterized by an excess of 0's. The variance is $(1 + \mu p)$ times the mean.

Estimation. We can model the inflated-zero probability as a function of covariates,

$$
p_i = \frac{\exp(x_i'\alpha)}{1 + \exp(x_i'\alpha)},
$$

and the Poisson rates as

$$
\mu_i = \exp(x_i'\beta).
$$

The log-likelihood function can then be written as

$$
\log L = \sum_{i=1}^{n} \Big[ I(y_i = 0)\,\log\{p_i + (1 - p_i)\exp(-\mu_i)\}
+ I(y_i > 0)\,\big( \log\{(1 - p_i)\exp(-\mu_i)\} + y_i \log \mu_i - \log\Gamma(y_i + 1) \big) \Big],
\tag{3}
$$

where $I(\cdot)$ is the indicator function, so that each observation contributes the log of the matching branch of (2).

Interpretation. Interpretation of the results from this model is straightforward. We can interpret the parameters in the inflated portion as effects on the log odds of being an inflated zero.² The other parameters have the usual Poisson regression interpretation. Using results from the null model and applying the formulas above for the mean and variance, we get $\hat\mu(1 - \hat p) = e^{0.829}(0.846)$, which is 1.94 for the mean, where $\hat\alpha = -1.7069$, so $\hat p = 0.153$. Using the formula above, the variance is 2.621.
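As a rough numerical check on the formulas above, the following Python sketch (standard library only) evaluates the probability function (2), the mean and variance expressions, and the log-likelihood (3), then plugs in the null-model estimates quoted in the text. The function names are illustrative and are not part of the original notes. Treating 0.829 as the intercept of the log-linear Poisson part and −1.7069 as the intercept of the logit part is an assumption based on the surrounding text.

```python
import math


def zip_pmf(y, p, mu):
    """Pr(Y_z = y | p, mu) for a zero-inflated Poisson variable, as in (2)."""
    if y == 0:
        # A zero is either an inflated zero or an ordinary Poisson zero.
        return p + (1.0 - p) * math.exp(-mu)
    # Positive counts can only come from the Poisson component.
    log_poisson = -mu + y * math.log(mu) - math.lgamma(y + 1)
    return (1.0 - p) * math.exp(log_poisson)


def zip_mean_var(p, mu):
    """Mean mu(1 - p) and variance mu(1 - p)(1 + mu p)."""
    mean = mu * (1.0 - p)
    var = mean * (1.0 + mu * p)
    return mean, var


def zip_loglik(ys, ps, mus):
    """Log-likelihood (3): sum of the log of the matching branch of (2)."""
    total = 0.0
    for y, p, mu in zip(ys, ps, mus):
        if y == 0:
            total += math.log(p + (1.0 - p) * math.exp(-mu))
        else:
            total += (math.log(1.0 - p) - mu
                      + y * math.log(mu) - math.lgamma(y + 1))
    return total


# Null-model estimates quoted in the text (interpretation as intercepts
# is an assumption, see the lead-in).
alpha_hat = -1.7069   # logit of the inflated-zero probability
beta_hat = 0.829      # log of the Poisson rate

p_hat = math.exp(alpha_hat) / (1.0 + math.exp(alpha_hat))
mu_hat = math.exp(beta_hat)
mean_hat, var_hat = zip_mean_var(p_hat, mu_hat)

print("p_hat  =", round(p_hat, 3))
print("mu_hat =", round(mu_hat, 3))
print("mean   =", round(mean_hat, 2))
print("var    =", round(var_hat, 2))
```

With these inputs the computed mean and variance agree with the values 1.94 and 2.621 reported above, up to rounding in the quoted estimates. In a model with covariates, `ps` and `mus` passed to `zip_loglik` would be the per-observation values $p_i$ and $\mu_i$ built from $x_i'\alpha$ and $x_i'\beta$.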
The mean is exactly reproduced by the model, and the variance is closer to the estimate of 2.79 from the raw data than was the estimate from the Poisson regression model.

Example.

    Zero-inflated poisson regression        Number of obs   =       1496
                                            Nonzero obs     =       1140
                                            Zero obs        =        356

    Inflation model = logit                 LR chi2(6)      =     180.92
    Log likelihood  = -2468.552             Prob > chi2     =     0.0000

² Other probability distributions for zero-inflation are possible.
