Estimation, Evaluation, and Selection of Actuarial Models
CHAPTER 2. MODEL ESTIMATION
2.2 Estimation using data-dependent distributions
2.2.1 Introduction
When observations are collected from a probability distribution, the ideal situation is to have the
(essentially) exact¹ value of each observation. This case is referred to as “complete, individual
data.” This is the case in Data Sets B and D1. There are two reasons why exact data may not
be available. One is grouping, in which all that is recorded is the range of values in which the
observation belongs. This is the case for Data Set C and for Data Set A for those with five or more
accidents.
A second reason that exact values may not be available is the presence of censoring or truncation.
When data are censored from below, observations below a given value are known to be below that
value, but the exact value is unknown. When data are censored from above, observations above a
given value are known to be above that value, but the exact value is unknown. Note that censoring
effectively creates grouped data. When the data are grouped in the first place, censoring has no
effect. For example, the data in Data Set C may have been censored from above at 300,000, but
we cannot know for sure from the data set, and that knowledge has no effect on how we treat the
data. On the other hand, were Data Set B to be censored from above at 1,000, we would have fifteen
individual observations and then five grouped observations in the interval from 1,000 to infinity.
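The way censoring from above turns individual data into a mix of exact values and one open-ended group can be sketched in a few lines of Python. This is only an illustration: the function name `censor_from_above` and the loss amounts are hypothetical and are not taken from Data Set B.

```python
import math


def censor_from_above(values, limit):
    """Split observations into exact values and one censored group.

    Observations at or below `limit` keep their recorded value.
    Observations above `limit` are only known to exceed it, so they are
    reported as a count in the interval (limit, infinity).
    """
    exact = sorted(v for v in values if v <= limit)
    n_censored = sum(1 for v in values if v > limit)
    return exact, (limit, math.inf, n_censored)


# Hypothetical losses, chosen only for illustration.
losses = [120, 450, 800, 1500, 3200]
exact, group = censor_from_above(losses, 1000)
print(exact)  # [120, 450, 800]
print(group)  # (1000, inf, 2)
```

The censored observations remain in the sample as a count, which is what distinguishes censoring from truncation.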
In insurance settings, censoring from above is fairly common. For example, if a policy pays no
more than 100,000 for an accident, any time the loss is above 100,000 the actual amount will be
unknown, but we will know that such a loss occurred. In Data Set D2 we have random censoring. Consider
the fifth policy in the table. When the “other information” is not available, all that is known about
the time of death is that it will be after 1.8 years. All of the policies are censored at 5 years by the
nature of the policy itself. Also, note that Data Set A has been censored from above at 5. This is
more common language than to say that Data Set A has some individual data and some grouped
data.
When data are truncated from below, observations below a given value are not recorded. Truncation
from above implies that observations above a given value are not recorded. In insurance
settings, truncation from below is fairly common. If an automobile physical damage policy has a
per claim deductible of 250, any losses below 250 will not come to the attention of the insurance
company and so will not appear in any data sets. Data Set D2 has observations 31-40 truncated
from below at varying values. The other data sets may have truncation forced on them. For example,
if Data Set B were to be truncated from below at 250, the first seven observations would
disappear and the remaining thirteen would be unchanged.
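Truncation from below can be sketched in the same way. In this minimal Python illustration, the function name `truncate_from_below` and the loss amounts are hypothetical; the deductible of 250 matches the example in the text.

```python
def truncate_from_below(values, truncation_point):
    """Return only the observations that would be recorded.

    Observations below `truncation_point` never reach the data set, so
    they are dropped entirely; the rest are returned unchanged.
    """
    return [v for v in values if v >= truncation_point]


# Hypothetical losses, chosen only for illustration.
losses = [100, 249, 250, 600, 1200]
recorded = truncate_from_below(losses, 250)
print(recorded)  # [250, 600, 1200]
```

Unlike the censored case, the result carries no count of the dropped observations: someone who sees only `recorded` cannot tell how many losses fell below the deductible.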
2.2.2 The empirical distribution for complete, individual data
As noted in Definition 2.3, the empirical distribution assigns probability 1/n to each data point.
That works well when the value of each data point is recorded. An alternative definition is
Definition 2.5 The empirical distribution function is
$$F_n(x) = \frac{\text{number of observations} \le x}{n}.$$
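For complete, individual data, Definition 2.5 can be evaluated directly by counting. The following is a minimal Python sketch; the helper name `empirical_cdf` and the sample values are assumptions made for illustration and do not come from any of the data sets in this Note.

```python
from bisect import bisect_right


def empirical_cdf(data):
    """Return F_n, where F_n(x) = (number of observations <= x) / n."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one observation is required")

    def F_n(x):
        # bisect_right gives the number of sorted values that are <= x.
        return bisect_right(ordered, x) / n

    return F_n


# Hypothetical sample, chosen only for illustration.
F = empirical_cdf([3, 1, 4, 1, 5])
print(F(0))    # 0.0
print(F(1))    # 0.4
print(F(3.5))  # 0.6
print(F(5))    # 1.0
```

Because each data point carries probability 1/n, a value that appears twice (here, 1) produces a jump of 2/n at that point.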
¹ Some measurements are never exact. Ages may be rounded to the nearest whole number, monetary amounts
to the nearest dollar, car mileage to the nearest tenth of a mile, and so on. This Note is not concerned with such
rounding errors. Rounded values will be treated as if they are exact.