Estimation, Evaluation, and Selection of Actuarial Models
CHAPTER 3. SAMPLING PROPERTIES OF ESTIMATORS

For the product-limit estimator,

$$
\begin{aligned}
\operatorname{Var}[S_n(y_j)] &= \operatorname{Var}\left[\prod_{i=1}^{j} \frac{r_i - S_i}{r_i}\right] \\
&= \prod_{i=1}^{j}\left[\frac{S(y_i)^2}{S(y_{i-1})^2} + \frac{[S(y_{i-1}) - S(y_i)]\,S(y_i)}{r_i\,S(y_{i-1})^2}\right] - \frac{S(y_j)^2}{S(y_0)^2} \\
&= \prod_{i=1}^{j}\left[\frac{r_i\,S(y_i)^2 + [S(y_{i-1}) - S(y_i)]\,S(y_i)}{r_i\,S(y_{i-1})^2}\right] - \frac{S(y_j)^2}{S(y_0)^2} \\
&= \prod_{i=1}^{j}\left[\frac{S(y_i)^2}{S(y_{i-1})^2}\cdot\frac{r_i\,S(y_i) + [S(y_{i-1}) - S(y_i)]}{r_i\,S(y_i)}\right] - \frac{S(y_j)^2}{S(y_0)^2} \\
&= \frac{S(y_j)^2}{S(y_0)^2}\left\{\prod_{i=1}^{j}\left[1 + \frac{S(y_{i-1}) - S(y_i)}{r_i\,S(y_i)}\right] - 1\right\}.
\end{aligned}
$$

This formula is unpleasant to work with, so the following approximation is often used. It is based on the fact that for any set of small numbers $a_1, \ldots, a_n$, the product $(1+a_1)\cdots(1+a_n)$ is approximately $1 + a_1 + \cdots + a_n$. This follows because the missing terms are all products of two or more of the $a_i$s. If they are all small to begin with, the products will be even smaller, and so can be ignored. Applying this produces the approximation

$$
\operatorname{Var}[S_n(y_j)] \doteq \left[\frac{S(y_j)}{S(y_0)}\right]^2 \sum_{i=1}^{j} \frac{S(y_{i-1}) - S(y_i)}{r_i\,S(y_i)}.
$$

Because it is unlikely that the survival function is known, an estimated value needs to be inserted. Recall that the estimated value of $S(y_j)$ is actually conditional on being alive at age $y_0$. Also, $(r_i - s_i)/r_i$ is an estimate of $S(y_i)/S(y_{i-1})$. Then,

$$
\widehat{\operatorname{Var}}[S_n(y_j)] \doteq S_n(y_j)^2 \sum_{i=1}^{j} \frac{s_i}{r_i(r_i - s_i)}.
$$

This formula is known as Greenwood's approximation. It is the only version that will be used in this Note.

Example 3.25 Using Data Set D1, estimate the variance of $S_{30}(3)$ both directly and using Greenwood's formula. Do the same for ${}_2\hat{q}_3$.

Because there is no censoring or truncation, the empirical formula can be used to directly estimate this variance. There were three deaths out of 30 individuals, and therefore

$$
\widehat{\operatorname{Var}}[S_{30}(3)] = \frac{(3/30)(27/30)}{30} = \frac{81}{30^3}.
$$

For Greenwood's approximation, $r_1 = 30$, $s_1 = 1$, $r_2 = 29$, and $s_2 = 2$. The approximation is

$$
\left(\frac{27}{30}\right)^2\left(\frac{1}{30(29)} + \frac{2}{29(27)}\right) = \frac{81}{30^3}.
$$
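The agreement between the two estimates in Example 3.25 can be checked with exact rational arithmetic. The following is a minimal Python sketch, not part of the original Note; the function name `greenwood_variance` and its argument names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def greenwood_variance(surv, risk_sets, deaths):
    """Greenwood's approximation: surv^2 * sum of s_i / (r_i * (r_i - s_i)).

    `surv` is the estimated survival probability, `risk_sets` holds the r_i,
    and `deaths` holds the s_i (hypothetical argument names).
    """
    total = sum(Fraction(s, r * (r - s)) for r, s in zip(risk_sets, deaths))
    return surv ** 2 * total


# Example 3.25 inputs for S_30(3): r = (30, 29), s = (1, 2), estimate 27/30.
greenwood = greenwood_variance(Fraction(27, 30), [30, 29], [1, 2])

# Direct empirical estimate with n = 30 and 3 deaths: (3/30)(27/30)/30.
empirical = Fraction(3, 30) * Fraction(27, 30) / 30

print(greenwood, empirical)  # both reduce to 3/1000, which equals 81/30^3
```

Using `Fraction` rather than floating point makes the equality exact, which is the point of the comparison in the example.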
3.3. VARIANCE AND CONFIDENCE INTERVALS

It can be demonstrated that when there is no censoring or truncation, the two formulas will always produce the same answer. Recall that the development of Greenwood's formula produced the variance only at death ages. The convention for non-death ages is to take the sum up to the last death age that is less than or equal to the age under consideration.

With regard to ${}_2\hat{q}_3$, arguing as in Example 3.17 produces an estimated (conditional) variance of

$$
\widehat{\operatorname{Var}}({}_2\hat{q}_3) = \frac{(4/27)(23/27)}{27} = \frac{92}{27^3}.
$$

For Greenwood's formula, we first must note that we are estimating

$$
{}_2q_3 = \frac{S(3) - S(5)}{S(3)} = 1 - \frac{S(5)}{S(3)}.
$$

As with the empirical estimate, all calculations must be done given the 27 people alive at duration 3. Furthermore, the variance of ${}_2\hat{q}_3$ is the same as the variance of $\hat{S}(5)$ using only information from duration 3 and beyond. Starting from duration 3 there are three death times, 3.1, 4.0, and 4.8, with $r_1 = 27$, $r_2 = 26$, $r_3 = 25$, $s_1 = 1$, $s_2 = 1$, and $s_3 = 2$. Greenwood's approximation is

$$
\left(\frac{23}{27}\right)^2\left(\frac{1}{27(26)} + \frac{1}{26(25)} + \frac{2}{25(23)}\right) = \frac{92}{27^3}. \qquad \square
$$

Exercise 57 Repeat the previous example, using time to surrender as the variable.

Example 3.26 Repeat the previous example, this time using all 40 observations in Data Set D2 and the incomplete information due to censoring and truncation.

For this example, the direct empirical approach is not available. That is because it is unclear what the sample size is (it varies over time as subjects enter and leave due to truncation and censoring). From Example 2.16, the relevant values within the first 3 years are $r_1 = 30$, $r_2 = 26$, $s_1 = 1$, and $s_2 = 2$. From Example 2.17, $S_{40}(3) = 0.8923$. Then, Greenwood's estimate is

$$
(0.8923)^2\left(\frac{1}{30(29)} + \frac{2}{26(24)}\right) = 0.0034671.
$$

An approximate 95% confidence interval can be constructed using the normal approximation. It is

$$
0.8923 \pm 1.96\sqrt{0.0034671} = 0.8923 \pm 0.1154,
$$

which corresponds to the interval (0.7769, 1.0077).
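The Greenwood estimate and the normal-approximation interval for $S_{40}(3)$ can be reproduced in a few lines. This is only an illustrative Python sketch using the values quoted in the example; the variable names are assumptions of this sketch, and 1.96 is the usual standard normal quantile for a 95% interval.

```python
import math

# Values quoted in Example 3.26 (variable names are illustrative only).
s_hat = 0.8923            # S_40(3), from Example 2.17
risk_sets = [30, 26]      # r_1, r_2
deaths = [1, 2]           # s_1, s_2

# Greenwood's approximation: S^2 * sum of s_i / (r_i * (r_i - s_i)).
var_hat = s_hat ** 2 * sum(s / (r * (r - s)) for r, s in zip(risk_sets, deaths))

# Normal-approximation 95% confidence interval.
half_width = 1.96 * math.sqrt(var_hat)
lower, upper = s_hat - half_width, s_hat + half_width

print(f"variance = {var_hat:.7f}")
print(f"interval = ({lower:.4f}, {upper:.4f})")
```

The sketch reports the interval exactly as the normal approximation gives it, without truncating it to the range from 0 to 1.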
For small sample sizes, it is possible to create confidence intervals that admit values less than 0 or greater than 1.

With regard to ${}_2\hat{q}_3$, the relevant quantities are (starting at duration 3, but using the subscripts from the earlier examples for these data) $r_3 = 26$, $r_4 = 26$, $r_5 = 23$, $r_6 = 21$, $s_3 = 1$, $s_4 = 2$, $s_5 = 1$, and $s_6 = 1$. This gives an estimated variance of

$$
\left(\frac{0.7215}{0.8923}\right)^2\left(\frac{1}{26(25)} + \frac{2}{26(24)} + \frac{1}{23(22)} + \frac{1}{21(20)}\right) = 0.0059502. \qquad \square
$$
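The conditional variance above follows the same pattern as before, with the survival estimate replaced by a ratio. In the Python sketch below, reading 0.7215 as the estimate of $S(5)$ is an assumption based on the identity ${}_2q_3 = 1 - S(5)/S(3)$; the variable names are illustrative.

```python
# Values quoted in the example (variable names are illustrative only).
s3_hat = 0.8923                # S_40(3)
s5_hat = 0.7215                # assumed here to be the estimate of S(5)
risk_sets = [26, 26, 23, 21]   # r_3 .. r_6
deaths = [1, 2, 1, 1]          # s_3 .. s_6

# Estimated probability of surviving to duration 5 given survival to duration 3.
cond_surv = s5_hat / s3_hat

# Greenwood's approximation, using only the death times after duration 3.
var_hat = cond_surv ** 2 * sum(
    s / (r * (r - s)) for r, s in zip(risk_sets, deaths)
)

print(f"{var_hat:.7f}")
```

Because ${}_2\hat{q}_3$ is one minus the conditional survival estimate, the two share the same variance, so no further adjustment is needed.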