Estimation, Evaluation, and Selection of Actuarial Models
CHAPTER 2. MODEL ESTIMATION

likelihood estimator. The general form of this estimator is presented in this introduction. The following subsections present useful special cases.

To define the maximum likelihood estimator, let the data set consist of $n$ events $A_1, \ldots, A_n$ where $A_j$ is whatever was observed for the $j$th observation. For example, $A_j$ may consist of a single point or an interval. The latter arises with grouped data or when there is censoring. For example, when there is censoring at $u$ and a censored observation is observed, the event is the interval from $u$ to infinity. Further assume that the event $A_j$ results from observing the random variable $X_j$. The random variables $X_1, \ldots, X_n$ need not have the same probability distribution, but their distributions must depend on the same parameter vector, $\theta$. In addition, the random variables are assumed to be independent.

Definition 2.30 The likelihood function is
$$L(\theta) = \prod_{j=1}^{n} \Pr(X_j \in A_j \mid \theta)$$
and the maximum likelihood estimate of $\theta$ is the vector that maximizes the likelihood function.[8]

Note that there is no guarantee that the function has a maximum at eligible parameter values. It is possible that as various parameters become zero or infinite the likelihood function will continue to increase. Care must be taken when maximizing this function because there may be local maxima in addition to the global maximum. Often, it is not possible to analytically maximize the likelihood function (by setting partial derivatives equal to zero). Numerical approaches, such as those outlined in Appendix B, will usually be needed.

Because the observations are assumed to be independent, the product in the definition represents the joint probability $\Pr(X_1 \in A_1, \ldots, X_n \in A_n \mid \theta)$; that is, the likelihood function is the probability of obtaining the sample results that were obtained, given a particular parameter value.
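As a rough illustration of Definition 2.30, the sketch below evaluates the loglikelihood of a set of events under an exponential model with mean theta and maximizes it numerically. Everything in it is an assumption made for illustration: the event encoding, the function names, the four data values (which are hypothetical and not taken from any data set in this Note), and the use of golden-section search, which is simply a convenient one-parameter method and is not claimed to be one of the approaches of Appendix B.

```python
import math

def exp_log_likelihood(theta, events):
    """Loglikelihood for an exponential model with mean theta.

    Hypothetical event encoding (an assumption of this sketch):
      ("point", x)        value observed exactly
      ("interval", a, b)  value known only to lie in (a, b];
                          b = math.inf means right-censored at a
    """
    total = 0.0
    for event in events:
        if event[0] == "point":
            x = event[1]
            # For a continuous model a single point contributes the density f(x).
            total += -math.log(theta) - x / theta
        else:
            _, a, b = event
            if math.isinf(b):
                # Pr(X > a) = exp(-a / theta)
                total += -a / theta
            else:
                # Pr(a < X <= b) = exp(-a / theta) - exp(-b / theta)
                total += math.log(math.exp(-a / theta) - math.exp(-b / theta))
    return total

def golden_section_max(f, lo, hi, tol=1e-6):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    ratio = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if f(c) > f(d):
            hi = d   # the maximum lies in [lo, d]
        else:
            lo = c   # the maximum lies in [c, hi]
    return (lo + hi) / 2

# Hypothetical events: two exact values, one grouped value, one censored value.
events = [
    ("point", 100.0),
    ("point", 300.0),
    ("interval", 0.0, 200.0),
    ("interval", 500.0, math.inf),
]
theta_hat = golden_section_max(lambda t: exp_log_likelihood(t, events), 1.0, 10000.0)
print(theta_hat, exp_log_likelihood(theta_hat, events))
```

The search assumes a single maximum on the bracketing interval. As the text warns, a likelihood can have local maxima or no interior maximum at all, so in practice the bracket and the shape of the function deserve a check before the result is trusted.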
The estimate is then the parameter value that produces the model under which the actual observations are most likely to be observed. One of the major attractions of this estimator is that it is almost always available. That is, if you can write an expression for the desired probabilities, you can execute this method. If you cannot write and evaluate an expression for probabilities using your model, there is no point in postulating that model in the first place because you will not be able to use it to solve your problem.

Example 2.31 Suppose the data in Data Set B was censored at 250. Determine the maximum likelihood estimate of $\theta$ for an exponential distribution.

The first 7 data points are uncensored. For them, the set $A_j$ contains the single point equal to the observation $x_j$. When calculating the likelihood function for a single point for a continuous model it is necessary to interpret $\Pr(X_j = x_j) = f(x_j)$. That is, the density function should be used. Thus the first seven terms of the product are
$$f(27) f(82) \cdots f(243) = \theta^{-1} e^{-27/\theta}\, \theta^{-1} e^{-82/\theta} \cdots \theta^{-1} e^{-243/\theta} = \theta^{-7} e^{-909/\theta}.$$

[8] Some authors write the likelihood function as $L(\theta \mid x)$ where the vector $x$ represents the observed data. Because observed data can take many forms, in this Note the dependence of the likelihood function on the data is suppressed in the notation.
2.3. ESTIMATION FOR PARAMETRIC MODELS

For the final 13 terms, the set $A_j$ is the interval from 250 to infinity and therefore $\Pr(X_j \in A_j) = \Pr(X_j > 250) = e^{-250/\theta}$. There are thirteen such factors, making the likelihood function
$$L(\theta) = \theta^{-7} e^{-909/\theta} \left(e^{-250/\theta}\right)^{13} = \theta^{-7} e^{-4159/\theta}.$$
It is easier to maximize the logarithm of the likelihood function. Because it occurs so often, we denote the loglikelihood function as $l(\theta) = \ln L(\theta)$. Then
$$\begin{aligned}
l(\theta) &= -7 \ln\theta - 4159\,\theta^{-1} \\
l'(\theta) &= -7\,\theta^{-1} + 4159\,\theta^{-2} = 0 \\
\hat\theta &= 4159/7 = 594.14.
\end{aligned}$$
In this case, the calculus technique of setting the first derivative equal to zero is easy to do. Also, evaluating the second derivative at this solution produces a negative number, verifying that this solution is a maximum. □

2.3.3.2 Complete, individual data

When there is no truncation, no censoring, and the value of each observation is recorded, it is easy to write the loglikelihood function.
$$L(\theta) = \prod_{j=1}^{n} f_{X_j}(x_j \mid \theta), \qquad l(\theta) = \sum_{j=1}^{n} \ln f_{X_j}(x_j \mid \theta).$$
The notation indicates that it is not necessary for each observation to come from the same distribution.

Example 2.32 Using Data Set B determine the maximum likelihood estimates for an exponential distribution, for a gamma distribution where $\alpha$ is known to equal 2, and for a gamma distribution where both parameters are unknown.

For the exponential distribution, the general solution is
$$\begin{aligned}
l(\theta) &= \sum_{j=1}^{n} \left(-\ln\theta - x_j\,\theta^{-1}\right) = -n \ln\theta - n\bar{x}\,\theta^{-1} \\
l'(\theta) &= -n\,\theta^{-1} + n\bar{x}\,\theta^{-2} = 0 \\
n\theta &= n\bar{x}, \qquad \hat\theta = \bar{x}.
\end{aligned}$$
For Data Set B, $\hat\theta = \bar{x} = 1{,}424.4$. The value of the loglikelihood function is $-165.23$. For this situation the method of moments and maximum likelihood estimates are identical.

For the gamma distribution with $\alpha = 2$,
$$\begin{aligned}
f(x \mid \theta) &= \frac{x^{2-1} e^{-x/\theta}}{\Gamma(2)\,\theta^{2}} = x\,\theta^{-2} e^{-x/\theta}, \qquad \ln f(x \mid \theta) = \ln x - 2\ln\theta - x\,\theta^{-1} \\
l(\theta) &= \sum_{j=1}^{n} \ln x_j - 2n \ln\theta - n\bar{x}\,\theta^{-1} \\
l'(\theta) &= -2n\,\theta^{-1} + n\bar{x}\,\theta^{-2} = 0 \\
\hat\theta &= \bar{x}/2.
\end{aligned}$$
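The closed-form results in Examples 2.31 and 2.32 can be checked with a few lines of arithmetic. The sketch below is only a numerical check, not part of the original derivation. It uses the summary figures quoted in the text (7 uncensored values summing to 909, 13 values censored at 250, and a sample mean of 1,424.4) rather than the individual values of Data Set B, and it assumes a sample size of 20, inferred from 7 + 13. All variable and function names are illustrative.

```python
import math

# Example 2.31: exponential model, Data Set B censored at 250.
n_uncensored = 7         # observations below the censoring point
sum_uncensored = 909.0   # 27 + 82 + ... + 243, as quoted in the text
n_censored = 13
censor_point = 250.0
total = sum_uncensored + n_censored * censor_point   # 4159

def censored_loglik(theta):
    """l(theta) = -7 ln(theta) - 4159 / theta."""
    return -n_uncensored * math.log(theta) - total / theta

def censored_loglik_second_derivative(theta):
    """l''(theta) = 7 / theta^2 - 2 * 4159 / theta^3."""
    return n_uncensored / theta**2 - 2 * total / theta**3

theta_censored = total / n_uncensored
print(round(theta_censored, 2))                             # 594.14
print(censored_loglik_second_derivative(theta_censored) < 0)  # True, so a maximum

# Example 2.32: complete data (n = 20 is inferred from 7 + 13).
n = 20
xbar = 1424.4

def exp_loglik(theta):
    """l(theta) = -n ln(theta) - n * xbar / theta."""
    return -n * math.log(theta) - n * xbar / theta

theta_exp = xbar            # exponential: theta-hat equals the sample mean
theta_gamma2 = xbar / 2     # gamma with alpha = 2: theta-hat equals xbar / 2
print(round(exp_loglik(theta_exp), 2))   # -165.23
print(round(theta_gamma2, 1))            # 712.2
```

The gamma loglikelihood itself is not evaluated here because it needs the sum of ln x_j, which the text does not quote.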