12.07.2015 Views

Notes on Poisson Regression and Some Extensions



packages. They are very useful for particular applications. When this model is fit to these data, the overdispersion parameter is no longer statistically significant.

Exercise 6: Fit a common model specification (i.e., the same set of variables) for all these models and compare them on the basis of likelihood ratio tests (when possible) and BIC criteria (for non-nested models). Which model emerges as best for these data?

Models for Rates in Time. Let $t_1, t_2, \ldots, t_n$ denote the waiting times until an event occurs for a sample of $n$ individuals, with distribution function $F(t) = \Pr(T < t)$ and probability density function $f(t)$, where we assume that $T$ is a continuous random variable ($T > 0$). The hazard rate, intensity function, or failure rate is denoted by $\mu(t)$. The hazard rate is the instantaneous rate (probability per unit time) of an event in the interval $[t, t + \Delta t]$, given that the event has not already occurred before the beginning of the interval. More formally, the hazard rate is the limit of a conditional probability (or transition probability)

\[
\mu(t) = \lim_{\substack{\Delta t \to 0 \\ \Delta t > 0}} \frac{1}{\Delta t} \Pr[t \le T < t + \Delta t \mid T \ge t]. \tag{4}
\]

The probability of surviving beyond time $t$ is given by the survival function,

\[
S(t) = \Pr[T > t] = 1 - F(t) = \int_t^{\infty} f(u)\,du. \tag{5}
\]

If we assume that the waiting time $T$ follows an exponential distribution with density function $f(t) = \mu \exp(-\mu t)$, the survival function in Eq. 5 is $S(t) = \exp(-\mu t)$. The hazard rate of Eq. 4 can be written as the ratio $\mu(t) = f(t)/S(t) = \mu$.
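The ratio form of the hazard can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the value $\mu = 0.5$ are illustrative assumptions. It evaluates $f(t)/S(t)$ for the exponential distribution at several time points, and also approximates the limit in Eq. 4 with a small finite $\Delta t$, using $\Pr[t \le T < t + \Delta t \mid T \ge t] = (S(t) - S(t + \Delta t))/S(t)$.

```python
import math


def exp_density(t, mu):
    """Exponential density f(t) = mu * exp(-mu * t)."""
    return mu * math.exp(-mu * t)


def exp_survival(t, mu):
    """Exponential survival function S(t) = exp(-mu * t)."""
    return math.exp(-mu * t)


def hazard_ratio(t, mu):
    """Hazard computed as the ratio f(t) / S(t)."""
    return exp_density(t, mu) / exp_survival(t, mu)


def hazard_limit(t, mu, dt=1e-6):
    """Finite-difference approximation to Eq. 4.

    Uses Pr[t <= T < t + dt | T >= t] = (S(t) - S(t + dt)) / S(t),
    then divides by dt. The step dt is an illustrative choice.
    """
    s_t = exp_survival(t, mu)
    s_next = exp_survival(t + dt, mu)
    return (s_t - s_next) / (s_t * dt)


mu = 0.5  # assumed rate, chosen only for illustration
times = (0.1, 1.0, 5.0, 20.0)

hazards = [hazard_ratio(t, mu) for t in times]
limits = [hazard_limit(t, mu) for t in times]

for t, h, l in zip(times, hazards, limits):
    print(f"t = {t:5.1f}  f/S = {h:.6f}  finite-difference = {l:.6f}")
```

Both columns should stay close to the assumed rate at every time point, which is what a hazard that does not depend on $t$ looks like in practice.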
For the exponential distribution, this implies a constant hazard over time (i.e., one that does not depend on $t$).

Consider a Poisson variate $d$ that denotes the number of events occurring in an observation window (exposure period) of length $t$. If events can occur repeatedly over time, and if the times between events are independent exponential variables with mean time to event occurrence $E(T) = 1/\mu$ (i.e., a time-homogeneous Poisson process), then the number of events $d$ in a time interval of length $t$ follows a Poisson distribution,

\[
\Pr(d \mid \mu, t) = \frac{(t\mu)^{d} \exp(-t\mu)}{d!}. \tag{6}
\]

The mean number of events in a time interval of length $t$ is $\lambda = t\mu$. For a sample of size $n$, we can model the conditional mean count as a function of independent variables, so that for the $i$th individual, the expected number of events in a time interval of length $t_i$ is

\[
\mu_i = t_i \lambda_i = t_i \exp(x_i'\beta).
\]

(Note that the notation switches here: $\lambda_i$ now denotes the event rate for individual $i$ and $\mu_i$ the expected count.) The likelihood is a product of the individual Poisson probabilities in Eq. 6, and is proportional to

\[
L = \prod_{i=1}^{n} (t_i \lambda_i)^{d_i} \exp(-t_i \lambda_i), \tag{7}
\]

which is the kernel of a Poisson likelihood.

It will often be the case that we do not observe the event times for some individuals in the sample. In this case, the event times are said to be right censored, and the Poisson variate is
