Estimation, Evaluation, and Selection of Actuarial Models
CHAPTER 3. SAMPLING PROPERTIES OF ESTIMATORS

Constructing confidence intervals is usually very difficult. However, there is a method for constructing approximate confidence intervals that is often accessible. Suppose we have an estimator $\hat\theta$ of the parameter $\theta$ such that $E(\hat\theta) \doteq \theta$, $\mathrm{Var}(\hat\theta) \doteq v(\theta)$, and $\hat\theta$ has approximately a normal distribution. With all these approximations, we have

$$1 - \alpha \doteq \Pr\left(-z_{\alpha/2} \le \frac{\hat\theta - \theta}{\sqrt{v(\theta)}} \le z_{\alpha/2}\right) \qquad (3.2)$$

and solving for $\theta$ produces the desired interval. Sometimes this is difficult to do (due to the appearance of $\theta$ in the denominator) and so we may replace $v(\theta)$ in (3.2) with $v(\hat\theta)$ to obtain a further approximation

$$1 - \alpha \doteq \Pr\left(\hat\theta - z_{\alpha/2}\sqrt{v(\hat\theta)} \le \theta \le \hat\theta + z_{\alpha/2}\sqrt{v(\hat\theta)}\right) \qquad (3.3)$$

where $z_\alpha$ is the $100(1-\alpha)$th percentile of the standard normal distribution.

Example 3.24 Use (3.2) and (3.3) to construct approximate 95% confidence intervals for $p(2)$ using Data Set A.

From (3.2),

$$0.95 = \Pr\left(-1.96 \le \frac{p_n(2) - p(2)}{\sqrt{p(2)[1 - p(2)]/n}} \le 1.96\right).$$

Solve this by making the inequality an equality and then squaring both sides to obtain (dropping the argument of (2) for simplicity),

$$\begin{aligned}
\frac{(p_n - p)^2 n}{p(1 - p)} &= 1.96^2 \\
n p_n^2 - 2 n p p_n + n p^2 &= 1.96^2 p - 1.96^2 p^2 \\
0 &= (n + 1.96^2) p^2 - (2 n p_n + 1.96^2) p + n p_n^2 \\
p &= \frac{2 n p_n + 1.96^2 \pm \sqrt{(2 n p_n + 1.96^2)^2 - 4 (n + 1.96^2) n p_n^2}}{2 (n + 1.96^2)}
\end{aligned}$$

which provides the two endpoints of the confidence interval. Inserting the numbers from Data Set A ($p_n = 0.017043$, $n = 94{,}935$) produces a confidence interval of (0.016239, 0.017886). Equation (3.3) provides the confidence interval directly as

$$p_n \pm 1.96 \sqrt{\frac{p_n (1 - p_n)}{n}}.$$

Inserting the numbers from Data Set A gives 0.017043 ± 0.000823 for an interval of (0.016220, 0.017866). The answers for the two methods are very similar, which is the case when the sample size is large. The results are reasonable, because it is well known that the normal distribution is a reasonable approximation to the binomial. □
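The two intervals of Example 3.24 can be reproduced with a few lines of Python. This is only a sketch: the function names `interval_from_3_2` and `interval_from_3_3` are illustrative and not from the text, and the inputs are the Data Set A values quoted in the example.

```python
import math

def interval_from_3_2(p_hat, n, z=1.96):
    """Endpoints obtained by solving the quadratic in p that comes from (3.2)."""
    a = n + z**2                 # coefficient of p^2
    b = 2 * n * p_hat + z**2     # minus the coefficient of p
    c = n * p_hat**2             # constant term
    root = math.sqrt(b**2 - 4 * a * c)
    return (b - root) / (2 * a), (b + root) / (2 * a)

def interval_from_3_3(p_hat, n, z=1.96):
    """Endpoints from (3.3): the variance is evaluated at the estimate."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Values quoted in Example 3.24 for Data Set A.
p_n, n = 0.017043, 94935

lower_32, upper_32 = interval_from_3_2(p_n, n)
lower_33, upper_33 = interval_from_3_3(p_n, n)

print(f"(3.2): ({lower_32:.6f}, {upper_32:.6f})")
print(f"(3.3): ({lower_33:.6f}, {upper_33:.6f})")
```

The interval from (3.2) is not centered on $p_n$ because the variance changes with $p$, whereas the interval from (3.3) is symmetric about $p_n$ by construction.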
3.3. VARIANCE AND CONFIDENCE INTERVALS

When data are censored or truncated the matter becomes more complex. Counts no longer have the binomial distribution and therefore the distribution of the estimator is harder to obtain. While there are proofs available to back up the results presented here, they will not be provided. Instead, an attempt will be made to indicate why the results are reasonable.

Consider the Kaplan-Meier product-limit estimator of $S(t)$. It is the product of a number of terms of the form $(r_j - s_j)/r_j$ where $r_j$ was viewed as the number available to die at age $y_j$ and $s_j$ is the number who actually did so. Assume that the death ages and the number available to die are fixed, so that the value of $s_j$ is the only random quantity. As a random variable, $S_j$ has a binomial distribution based on a sample of $r_j$ lives and success probability $[S(y_{j-1}) - S(y_j)]/S(y_{j-1})$. The probability arises from the fact that those available to die were known to be alive at the previous death age. For one of these terms,

$$E\left(\frac{r_j - S_j}{r_j}\right) = \frac{r_j - r_j [S(y_{j-1}) - S(y_j)]/S(y_{j-1})}{r_j} = \frac{S(y_j)}{S(y_{j-1})}.$$

That is, this ratio is an unbiased estimator of the probability of surviving from one death age to the next one. Furthermore,

$$\mathrm{Var}\left(\frac{r_j - S_j}{r_j}\right) = \frac{r_j \, \dfrac{S(y_{j-1}) - S(y_j)}{S(y_{j-1})} \left[1 - \dfrac{S(y_{j-1}) - S(y_j)}{S(y_{j-1})}\right]}{r_j^2} = \frac{[S(y_{j-1}) - S(y_j)]\, S(y_j)}{r_j \, S(y_{j-1})^2}.$$

Now consider the estimated survival probability at one of the death ages. Its expected value is

$$E[\hat S(y_j)] = E\left[\prod_{i=1}^{j} \frac{r_i - S_i}{r_i}\right] = \prod_{i=1}^{j} E\left(\frac{r_i - S_i}{r_i}\right) = \prod_{i=1}^{j} \frac{S(y_i)}{S(y_{i-1})} = \frac{S(y_j)}{S(y_0)}$$

where $y_0$ is the smallest observed age in the sample. In order to bring the expectation inside the product, it was assumed that the S-values are independent. The result demonstrates that at the death ages, the estimator is unbiased.

With regard to the variance, we first need a general result concerning the variance of a product of independent random variables.
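As a brief aside before that general result, the mean and variance of a single Kaplan-Meier term can be checked by summing over the binomial distribution directly. The sketch below assumes hypothetical values for $S(y_{j-1})$, $S(y_j)$, and $r_j$ (they are not taken from any data set in the text), and the helper name `survival_ratio_moments` is illustrative.

```python
from math import comb

def survival_ratio_moments(r, q):
    """Exact mean and variance of (r - S)/r when S ~ Binomial(r, q)."""
    mean = 0.0
    second_moment = 0.0
    for s in range(r + 1):
        prob = comb(r, s) * q**s * (1 - q)**(r - s)  # Pr(S = s)
        ratio = (r - s) / r
        mean += prob * ratio
        second_moment += prob * ratio**2
    return mean, second_moment - mean**2

# Hypothetical inputs chosen only for illustration.
S_prev, S_curr, r = 0.90, 0.72, 25     # S(y_{j-1}), S(y_j), r_j
q = (S_prev - S_curr) / S_prev         # conditional probability of death

mean, variance = survival_ratio_moments(r, q)

# Closed forms from the derivation above.
formula_mean = S_curr / S_prev
formula_variance = (S_prev - S_curr) * S_curr / (r * S_prev**2)

print(mean, formula_mean)
print(variance, formula_variance)
```

Summing over all outcomes avoids simulation noise, so any disagreement beyond floating-point rounding would indicate an error in the closed forms.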
Let $X_1, \ldots, X_n$ be independent random variables where $E(X_j) = \mu_j$ and $\mathrm{Var}(X_j) = \sigma_j^2$. Then,

$$\begin{aligned}
\mathrm{Var}(X_1 \cdots X_n) &= E(X_1^2 \cdots X_n^2) - E(X_1 \cdots X_n)^2 \\
&= E(X_1^2) \cdots E(X_n^2) - E(X_1)^2 \cdots E(X_n)^2 \\
&= (\mu_1^2 + \sigma_1^2) \cdots (\mu_n^2 + \sigma_n^2) - \mu_1^2 \cdots \mu_n^2.
\end{aligned}$$
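The product-variance identity can be confirmed exactly for small discrete distributions. The following sketch uses made-up distributions, each given as a list of (value, probability) pairs, and compares the closed form with a brute-force enumeration over all joint outcomes. All names here are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def moments(dist):
    """Mean and variance of a discrete distribution of (value, probability) pairs."""
    mean = sum(p * x for x, p in dist)
    variance = sum(p * x * x for x, p in dist) - mean**2
    return mean, variance

def product_variance_formula(dists):
    """Closed form: prod(mu_j^2 + sigma_j^2) - prod(mu_j^2)."""
    with_variance = Fraction(1)
    means_only = Fraction(1)
    for dist in dists:
        mu, sigma2 = moments(dist)
        with_variance *= mu**2 + sigma2
        means_only *= mu**2
    return with_variance - means_only

def product_variance_enumerated(dists):
    """Variance of X_1 * ... * X_n by enumerating every joint outcome."""
    first = Fraction(0)
    second = Fraction(0)
    for outcome in product(*dists):
        value, prob = Fraction(1), Fraction(1)
        for x, p in outcome:
            value *= x
            prob *= p          # independence: joint probability is the product
        first += prob * value
        second += prob * value**2
    return second - first**2

# Hypothetical independent distributions, for illustration only.
dists = [
    [(1, Fraction(1, 2)), (2, Fraction(1, 2))],
    [(0, Fraction(1, 4)), (3, Fraction(3, 4))],
    [(1, Fraction(1, 5)), (4, Fraction(2, 5)), (5, Fraction(2, 5))],
]

print(product_variance_formula(dists) == product_variance_enumerated(dists))
```

Using `Fraction` keeps the arithmetic exact, so the two results can be compared with `==` rather than a tolerance.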