Estimation, Evaluation, and Selection of Actuarial Models


01.08.2014 Views

50 CHAPTER 3. SAMPLING PROPERTIES OF ESTIMATORS

For the product-limit estimator,

$$
\begin{aligned}
\mathrm{Var}[S_n(y_j)] &= \mathrm{Var}\left[\prod_{i=1}^{j}\frac{r_i - S_i}{r_i}\right] \\
&= \prod_{i=1}^{j}\left[\frac{S(y_i)^2}{S(y_{i-1})^2} + \frac{[S(y_{i-1}) - S(y_i)]\,S(y_i)}{r_i\,S(y_{i-1})^2}\right] - \frac{S(y_j)^2}{S(y_0)^2} \\
&= \prod_{i=1}^{j}\left[\frac{r_i\,S(y_i)^2 + [S(y_{i-1}) - S(y_i)]\,S(y_i)}{r_i\,S(y_{i-1})^2}\right] - \frac{S(y_j)^2}{S(y_0)^2} \\
&= \prod_{i=1}^{j}\left[\frac{S(y_i)^2}{S(y_{i-1})^2}\cdot\frac{r_i\,S(y_i) + [S(y_{i-1}) - S(y_i)]}{r_i\,S(y_i)}\right] - \frac{S(y_j)^2}{S(y_0)^2} \\
&= \frac{S(y_j)^2}{S(y_0)^2}\left\{\prod_{i=1}^{j}\left[1 + \frac{S(y_{i-1}) - S(y_i)}{r_i\,S(y_i)}\right] - 1\right\}.
\end{aligned}
$$

This formula is unpleasant to work with, so the following approximation is often used. It is based on the fact that for any set of small numbers $a_1, \ldots, a_n$, the product $(1 + a_1)\cdots(1 + a_n)$ is approximately $1 + a_1 + \cdots + a_n$. This follows because the missing terms are all products of two or more of the $a_i$'s. If they are all small to begin with, the products will be even smaller, and so can be ignored. Applying this produces the approximation

$$
\mathrm{Var}[S_n(y_j)] \doteq \left[\frac{S(y_j)}{S(y_0)}\right]^2 \sum_{i=1}^{j}\frac{S(y_{i-1}) - S(y_i)}{r_i\,S(y_i)}.
$$

Because it is unlikely that the survival function is known, an estimated value needs to be inserted. Recall that the estimated value of $S(y_j)$ is actually conditional on being alive at age $y_0$. Also, $(r_i - s_i)/r_i$ is an estimate of $S(y_i)/S(y_{i-1})$. Then,

$$
\widehat{\mathrm{Var}}[S_n(y_j)] \doteq S_n(y_j)^2 \sum_{i=1}^{j}\frac{s_i}{r_i(r_i - s_i)}.
$$

This formula is known as Greenwood's approximation. It is the only version that will be used in this Note.

Example 3.25 Using Data Set D1, estimate the variance of $S_{30}(3)$ both directly and using Greenwood's formula. Do the same for ${}_2\hat{q}_3$.

Because there is no censoring or truncation, the empirical formula can be used to directly estimate this variance. There were three deaths out of 30 individuals, and therefore

$$
\widehat{\mathrm{Var}}[S_{30}(3)] = \frac{(3/30)(27/30)}{30} = \frac{81}{30^3}.
$$

For Greenwood's approximation, $r_1 = 30$, $s_1 = 1$, $r_2 = 29$, and $s_2 = 2$. The approximation is

$$
\left(\frac{27}{30}\right)^2\left(\frac{1}{30(29)} + \frac{2}{29(27)}\right) = \frac{81}{30^3}.
$$
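As a numerical check on the $S_{30}(3)$ calculation in Example 3.25, the following is a minimal Python sketch of Greenwood's approximation. The function name `greenwood_variance` and its argument layout are illustrative assumptions, not part of the Note; the risk sets and death counts are the Data Set D1 values quoted above. Exact fractions are used so that the Greenwood value and the direct empirical value can be compared without rounding.

```python
from fractions import Fraction

def greenwood_variance(r, s):
    """Hypothetical helper: return (S_n, Greenwood variance) from risk sets r and deaths s.

    Assumes r_i > s_i at every death time, so no term divides by zero.
    """
    surv = Fraction(1)
    total = Fraction(0)
    for r_i, s_i in zip(r, s):
        surv *= Fraction(r_i - s_i, r_i)            # product-limit factor (r_i - s_i)/r_i
        total += Fraction(s_i, r_i * (r_i - s_i))   # Greenwood term s_i / [r_i (r_i - s_i)]
    return surv, surv**2 * total

# Data Set D1 values quoted in Example 3.25
surv, var_greenwood = greenwood_variance([30, 29], [1, 2])

# Direct empirical estimate: 3 deaths among 30 individuals
var_direct = Fraction(3, 30) * Fraction(27, 30) / 30

print(surv, var_greenwood, var_direct)
```

With no censoring or truncation, the two variance values printed should coincide, which is the agreement the example reports.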

3.3. VARIANCE AND CONFIDENCE INTERVALS 51

It can be demonstrated that when there is no censoring or truncation, the two formulas will always produce the same answer. Recall that the development of Greenwood's formula produced the variance only at death ages. The convention for non-death ages is to take the sum up to the last death age that is less than or equal to the age under consideration.

With regard to ${}_2\hat{q}_3$, arguing as in Example 3.17 produces an estimated (conditional) variance of

$$
\widehat{\mathrm{Var}}({}_2\hat{q}_3) = \frac{(4/27)(23/27)}{27} = \frac{92}{27^3}.
$$

For Greenwood's formula, we first must note that we are estimating

$$
{}_2q_3 = \frac{S(3) - S(5)}{S(3)} = 1 - \frac{S(5)}{S(3)}.
$$

As with the empirical estimate, all calculations must be done given the 27 people alive at duration 3. Furthermore, the variance of ${}_2\hat{q}_3$ is the same as the variance of $\hat{S}(5)$ using only information from duration 3 and beyond. Starting from duration 3 there are three death times, 3.1, 4.0, and 4.8, with $r_1 = 27$, $r_2 = 26$, $r_3 = 25$, $s_1 = 1$, $s_2 = 1$, and $s_3 = 2$. Greenwood's approximation is

$$
\left(\frac{23}{27}\right)^2\left(\frac{1}{27(26)} + \frac{1}{26(25)} + \frac{2}{25(23)}\right) = \frac{92}{27^3}. \qquad \Box
$$

Exercise 57 Repeat the previous example, using time to surrender as the variable.

Example 3.26 Repeat the previous example, this time using all 40 observations in Data Set D2 and the incomplete information due to censoring and truncation.

For this example, the direct empirical approach is not available, because it is unclear what the sample size is (it varies over time as subjects enter and leave due to truncation and censoring). From Example 2.16, the relevant values within the first 3 years are $r_1 = 30$, $r_2 = 26$, $s_1 = 1$, and $s_2 = 2$. From Example 2.17, $S_{40}(3) = 0.8923$. Then, Greenwood's estimate is

$$
(0.8923)^2\left(\frac{1}{30(29)} + \frac{2}{26(24)}\right) = 0.0034671.
$$

An approximate 95% confidence interval can be constructed using the normal approximation. It is

$$
0.8923 \pm 1.96\sqrt{0.0034671} = 0.8923 \pm 0.1154,
$$

which corresponds to the interval (0.7769, 1.0077).
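The interval in Example 3.26 can be reproduced with a short Python sketch. The variable names are illustrative assumptions; the inputs are the risk sets and death counts quoted in the example, and 1.96 is the standard normal multiplier used there for a 95% interval. Because the sketch carries the unrounded survival estimate rather than 0.8923, its results may differ from the quoted figures in the last digit.

```python
import math

# Values quoted in Example 3.26 (Data Set D2, first three years)
r = [30, 26]   # risk sets at the two death times
s = [1, 2]     # deaths at those times

# Product-limit estimate of S(3)
surv = math.prod((r_i - s_i) / r_i for r_i, s_i in zip(r, s))

# Greenwood's approximation to its variance
var = surv**2 * sum(s_i / (r_i * (r_i - s_i)) for r_i, s_i in zip(r, s))

# Normal-approximation 95% confidence interval
half_width = 1.96 * math.sqrt(var)
lower, upper = surv - half_width, surv + half_width

print(round(surv, 4), round(var, 7), round(lower, 4), round(upper, 4))
```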
For small sample sizes, it is possible to create confidence intervals that admit values less than 0 or greater than 1, as the upper limit of 1.0077 illustrates.

With regard to ${}_2\hat{q}_3$, the relevant quantities are (starting at duration 3, but using the subscripts from the earlier examples for these data) $r_3 = 26$, $r_4 = 26$, $r_5 = 23$, $r_6 = 21$, $s_3 = 1$, $s_4 = 2$, $s_5 = 1$, and $s_6 = 1$. This gives an estimated variance of

$$
\left(\frac{0.7215}{0.8923}\right)^2\left(\frac{1}{26(25)} + \frac{2}{26(24)} + \frac{1}{23(22)} + \frac{1}{21(20)}\right) = 0.0059502. \qquad \Box
$$
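The conditional calculation for ${}_2\hat{q}_3$ can be sketched the same way, by restricting the product and Greenwood's sum to the death times after duration 3. This is an illustrative Python sketch, and the helper name `conditional_greenwood` is an assumption. It works from the unrounded product of the factors $(r_i - s_i)/r_i$ rather than the rounded ratio 0.7215/0.8923, so its result may differ from 0.0059502 in the last digits.

```python
def conditional_greenwood(r, s):
    """Hypothetical helper: survival ratio and Greenwood variance from the given death times only."""
    ratio = 1.0
    total = 0.0
    for r_i, s_i in zip(r, s):
        ratio *= (r_i - s_i) / r_i            # estimates S(y_i)/S(y_{i-1})
        total += s_i / (r_i * (r_i - s_i))    # Greenwood term
    return ratio, ratio**2 * total

# Data Set D2 values after duration 3, as quoted above (subscripts 3 through 6)
ratio, var_q = conditional_greenwood([26, 26, 23, 21], [1, 2, 1, 1])

# 2q3 = 1 - S(5)/S(3); its variance equals that of the estimated ratio
q_hat = 1.0 - ratio

print(round(q_hat, 4), round(var_q, 6))
```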

