01.08.2014 Views

Estimation, Evaluation, and Selection of Actuarial Models

CHAPTER 2. MODEL ESTIMATION

2.2 Estimation using data-dependent distributions

2.2.1 Introduction

When observations are collected from a probability distribution, the ideal situation is to have the (essentially) exact¹ value of each observation. This case is referred to as “complete, individual data.” This is the case in Data Sets B and D1. There are two reasons why exact data may not be available. One is grouping, in which all that is recorded is the range of values in which the observation belongs. This is the case for Data Set C and for Data Set A for those with five or more accidents.

A second reason that exact values may not be available is the presence of censoring or truncation. When data are censored from below, observations below a given value are known to be below that value, but the exact value is unknown. When data are censored from above, observations above a given value are known to be above that value, but the exact value is unknown. Note that censoring effectively creates grouped data. When the data are grouped in the first place, censoring has no effect. For example, the data in Data Set C may have been censored from above at 300,000, but we cannot know for sure from the data set, and that knowledge has no effect on how we treat the data. On the other hand, were Data Set B to be censored from above at 1,000, we would have fifteen individual observations and then five grouped observations in the interval from 1,000 to infinity.
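The remark that censoring effectively creates grouped data can be illustrated with a minimal Python sketch. The function name `censor_from_above` and the loss amounts are hypothetical and are not taken from any data set in this Note; the sketch assumes that a value exactly equal to the censoring point is still recorded exactly.

```python
import math

def censor_from_above(observations, limit):
    """Split data into exact values and one grouped interval (limit, infinity).

    Assumption: a value equal to `limit` is still observed exactly.
    """
    exact = [x for x in observations if x <= limit]
    n_censored = sum(1 for x in observations if x > limit)
    # All censored observations fall into a single group above the limit.
    grouped = {(limit, math.inf): n_censored}
    return exact, grouped

# Hypothetical loss amounts, for illustration only.
losses = [120, 450, 800, 1500, 3200]
exact, grouped = censor_from_above(losses, 1000)
print(exact)    # individual observations that remain exact
print(grouped)  # count of observations known only to exceed the limit
```

The number of observations is preserved: the censored values are still counted, only their exact sizes are lost.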

In insurance settings, censoring from above is fairly common. For example, if a policy pays no more than 100,000 for an accident, any time the loss is above 100,000 the actual amount will be unknown, but we will know that it happened. In Data Set D2 we have random censoring. Consider the fifth policy in the table. When the “other information” is not available, all that is known about the time of death is that it will be after 1.8 years. All of the policies are censored at 5 years by the nature of the policy itself. Also, note that Data Set A has been censored from above at 5. This is more common language than to say that Data Set A has some individual data and some grouped data.
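One common way to store data censored by a policy limit is to keep the recorded amount together with a flag saying whether it is exact. The following is a sketch under that assumption; the helper `record_with_limit` and the loss amounts are hypothetical, and only the 100,000 limit comes from the example above.

```python
def record_with_limit(loss, limit):
    """Return (recorded_amount, is_censored) for one loss under a policy limit.

    A loss above the limit is recorded at the limit and flagged as censored:
    we know it happened and that it exceeded the limit, but not its size.
    """
    if loss > limit:
        return limit, True
    return loss, False

# Hypothetical losses, for illustration only.
hypothetical_losses = [25_000, 100_000, 180_000]
records = [record_with_limit(x, 100_000) for x in hypothetical_losses]
for amount, censored in records:
    print(amount, "censored" if censored else "exact")
```

Keeping the flag matters because a recorded 100,000 could be either an exact loss of 100,000 or a larger loss capped by the limit.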

When data are truncated from below, observations below a given value are not recorded. Truncation from above implies that observations above a given value are not recorded. In insurance settings, truncation from below is fairly common. If an automobile physical damage policy has a per claim deductible of 250, any losses below 250 will not come to the attention of the insurance company and so will not appear in any data sets. Data Set D2 has observations 31–40 truncated from below at varying values. The other data sets may have truncation forced on them. For example, if Data Set B were to be truncated from below at 250, the first seven observations would disappear and the remaining thirteen would be unchanged.
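Truncation differs from censoring in that truncated observations leave no trace at all, not even a count. A minimal sketch, using hypothetical loss amounts and a hypothetical helper `truncate_from_below`, and assuming that a loss exactly equal to the truncation point is still recorded:

```python
def truncate_from_below(observations, point):
    """Keep only observations at or above the truncation point.

    Assumption: a value equal to `point` is recorded; smaller values vanish.
    """
    return [x for x in observations if x >= point]

# Hypothetical ground-up losses, for illustration only.
losses = [90, 180, 250, 400, 1200]
observed = truncate_from_below(losses, 250)
print(observed)       # the values that reach the data set, unchanged
print(len(observed))  # the analyst sees this count, not len(losses)
```

The surviving values are unchanged, but the sample size itself shrinks, so the analyst cannot tell from the data how many losses fell below the deductible.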

2.2.2 The empirical distribution for complete, individual data

As noted in Definition 2.3, the empirical distribution assigns probability 1/n to each data point. That works well when the value of each data point is recorded. An alternative definition is

Definition 2.5 The empirical distribution function is

$$F_n(x) = \frac{\text{number of observations} \le x}{n}.$$
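Definition 2.5 translates directly into code. The following Python sketch uses a hypothetical sample, not one of the Note's data sets, and the function name `empirical_cdf` is chosen only for illustration.

```python
def empirical_cdf(data, x):
    """Empirical distribution function F_n(x): share of observations <= x."""
    n = len(data)
    if n == 0:
        raise ValueError("the sample must contain at least one observation")
    return sum(1 for value in data if value <= x) / n

# Hypothetical sample with a tied value, for illustration only.
sample = [3, 5, 5, 8]
print(empirical_cdf(sample, 2))  # below the smallest observation
print(empirical_cdf(sample, 5))  # three of the four observations are <= 5
print(empirical_cdf(sample, 8))  # at or above the largest observation
```

Each distinct data point produces a jump of (number of ties)/n, which matches the 1/n probability per observation in Definition 2.3.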

¹ Some measurements are never exact. Ages may be rounded to the nearest whole number, monetary amounts to the nearest dollar, car mileage to the nearest tenth of a mile, and so on. This Note is not concerned with such rounding errors. Rounded values will be treated as if they are exact.
