
tel-00703797, version 2 - 7 Jun 2012

1.2. GLMs, a brief introduction

Model adequacy

The deviance, which is one way to measure the model adequacy with the data and generalizes the R² measure of linear models, is defined by

D(y, ˆπ) = 2(ln(L(y1, . . . , yn, y1, . . . , yn)) − ln(L(ˆπ1, . . . , ˆπn, y1, . . . , yn))),

where ˆπ is the vector of fitted probabilities, obtained from the estimated coefficient vector ˆβ. The "best" model is the one having the lowest deviance. Moreover, if all responses are binary data, the first term (the saturated log-likelihood) is zero, since each factor yᵢ^yᵢ (1 − yᵢ)^(1−yᵢ) equals 1. So in practice, we consider the deviance simply as

D(y, ˆπ) = −2 ln(L(ˆπ1, . . . , ˆπn, y1, . . . , yn)).
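As a minimal sketch, the binary deviance −2 ln L can be computed directly from the observed responses and the fitted probabilities. The data and fitted values below are invented for illustration:

```python
import math

def binary_deviance(y, pi_hat):
    """Deviance -2 ln L for a binary (Bernoulli) response model."""
    log_lik = sum(yi * math.log(p) + (1 - yi) * math.log(1 - p)
                  for yi, p in zip(y, pi_hat))
    return -2 * log_lik

# Toy data: observed binary responses and hypothetical fitted probabilities.
y = [1, 0, 1, 1, 0]
pi_hat = [0.8, 0.3, 0.6, 0.9, 0.2]
print(round(binary_deviance(y, pi_hat), 4))
```

A model whose fitted probabilities track the observed responses closely produces a log-likelihood near zero, hence a small deviance.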

Furthermore, the deviance is used as a relative measure to compare two models. In most software, in particular in R, the GLM fitting function provides two deviances: the null deviance and the (residual) deviance. The null deviance is the deviance for the model with only an intercept (or only an offset, if there is one), i.e. when p = 1 and X is a single column of 1's ∗. The (second) deviance is the deviance D(y, ˆπ) for the model with the p explanatory variables. Note that if there are as many parameters as there are observations, then the deviance is the best possible, but the model does not explain anything.
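The null deviance and the model deviance can be compared in a small sketch. For a Bernoulli model with only an intercept, the fitted probability is the sample mean of the responses for every observation; the fitted values of the larger model below are hypothetical, not produced by an actual regression:

```python
import math

def deviance(y, pi_hat):
    """Binary deviance -2 ln L."""
    ll = sum(yi * math.log(p) + (1 - yi) * math.log(1 - p)
             for yi, p in zip(y, pi_hat))
    return -2 * ll

y = [1, 0, 1, 1, 0, 1, 0, 1]

# Null model: intercept only, so every fitted probability is the sample mean.
p_null = sum(y) / len(y)
null_deviance = deviance(y, [p_null] * len(y))

# A model with p explanatory variables (fitted values invented here).
pi_hat = [0.9, 0.2, 0.7, 0.8, 0.3, 0.8, 0.1, 0.9]
model_deviance = deviance(y, pi_hat)

print(null_deviance, model_deviance)  # the model deviance should be lower
```

The gap between the two deviances is exactly what R reports after `glm(..., family = binomial)`: the null deviance measures the heterogeneity left when only the intercept is used.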

Another criterion, introduced by Akaike in the 1970s, is the Akaike Information Criterion (AIC), which is also an adequacy measure for statistical models. Unlike the deviance, the AIC aims to penalize overfitted models, i.e. models with too many parameters (compared to the size of the dataset). The AIC is defined by

AIC(y, ˆπ) = 2k − 2 ln(L(ˆπ1, . . . , ˆπn, y1, . . . , yn)),

where k is the number of parameters, i.e. the length of β. This criterion is a trade-off between a further improvement in log-likelihood from additional variables and the cost of including those new variables in the model. To compare two models with different numbers of parameters, we look for the one having the lowest AIC.
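The trade-off can be sketched on two hypothetical nested models: the larger one gains a little log-likelihood but pays 2 per extra parameter. All numbers below are invented for illustration:

```python
import math

def aic(log_lik, k):
    """AIC = 2k - 2 ln L, with k the number of parameters (length of beta)."""
    return 2 * k - 2 * log_lik

def bernoulli_log_lik(y, pi_hat):
    return sum(yi * math.log(p) + (1 - yi) * math.log(1 - p)
               for yi, p in zip(y, pi_hat))

y = [1, 0, 1, 1, 0, 1]

# Hypothetical fitted probabilities from two nested models.
small = [0.7, 0.4, 0.7, 0.7, 0.4, 0.7]        # k = 2 parameters
big = [0.75, 0.35, 0.72, 0.74, 0.38, 0.71]    # k = 5 parameters

aic_small = aic(bernoulli_log_lik(y, small), k=2)
aic_big = aic(bernoulli_log_lik(y, big), k=5)

# Here the small model wins: the tiny likelihood gain of the big model
# does not pay for its 3 extra parameters.
print(aic_small, aic_big)
```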

In a linear model, the analysis of residuals (which are assumed to be independent and identically distributed Gaussian variables) may reveal that the model is inappropriate. Typically, we can plot the fitted values against the residuals. For GLMs, the analysis of residuals is much more complex, because we lose the normality assumption. Furthermore, for binary data (as opposed to binomial data), the plot of residuals exhibits straight lines, which are hard to interpret, see Appendix 1.8.2. We believe that residual analysis is not appropriate for binary regressions.
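The line pattern can be seen without any plotting library: at a given fitted probability, the raw residual yᵢ − ˆπᵢ can only take two values (one per response value), so the points fall on two parallel curves. A small sketch with invented fitted probabilities:

```python
# Raw residuals y - pi_hat for binary data separate into two bands,
# entirely determined by whether y = 1 or y = 0.
pi_hat = [i / 100 for i in range(5, 100, 5)]

residuals_if_y1 = [1 - p for p in pi_hat]  # observations with y = 1
residuals_if_y0 = [0 - p for p in pi_hat]  # observations with y = 0

for p, r1, r0 in zip(pi_hat, residuals_if_y1, residuals_if_y0):
    print(f"pi_hat={p:.2f}  residual(y=1)={r1:+.2f}  residual(y=0)={r0:+.2f}")
```

Plotted against ˆπ, the two columns above trace the two straight lines mentioned in the text, which is why the usual residual-versus-fitted diagnostic carries little information here.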

Variable selection<br />

From the asymptotic normal distribution of the maximum likelihood estimator, we can derive confidence intervals as well as hypothesis tests for the coefficients. Therefore, a p-value is available for each coefficient of the regression, which helps us to keep only the most significant variables. However, as removing one variable impacts the significance of the other variables, it can be hard to find the optimal set of explanatory variables.

There are two approaches: either forward selection, i.e. starting from the null model, we add the most significant variable at each step; or backward elimination, i.e. starting from the full model, we remove the least significant variable at each step.
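A minimal sketch of the forward-selection loop, using AIC as the stepwise criterion instead of p-values and an ordinary Gaussian linear model so that each fit has a closed form (for a GLM one would substitute the GLM fit and its AIC). The data, variable names, and helper functions are all invented for illustration:

```python
import math

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols_aic(X_cols, y):
    """Fit y on [intercept, X_cols] by least squares; return the Gaussian AIC."""
    n = len(y)
    X = [[1.0] + [col[i] for col in X_cols] for i in range(n)]
    p = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    rss = sum((y[i] - sum(X[i][a] * beta[a] for a in range(p))) ** 2 for i in range(n))
    k = p + 1  # regression coefficients plus the error variance
    return n * math.log(rss / n) + 2 * k

# Toy data: y depends on x1 only; x2 is pure noise.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
x2 = [1.3, -0.2, 0.7, 0.1, -1.1, 0.5, -0.4, 0.9]
y = [0.1, 2.2, 3.9, 6.1, 8.0, 10.2, 11.9, 14.1]  # roughly 2 * x1

all_cols = {"x1": x1, "x2": x2}
selected = []                      # start from the null model
best_aic = ols_aic([], y)
while True:
    remaining = [name for name in all_cols if name not in selected]
    if not remaining:
        break
    # Try adding each remaining variable; keep the one improving AIC most.
    scores = {name: ols_aic([all_cols[s] for s in selected] + [all_cols[name]], y)
              for name in remaining}
    best_name = min(scores, key=scores.get)
    if scores[best_name] >= best_aic:
        break                      # no candidate improves the model: stop
    selected.append(best_name)
    best_aic = scores[best_name]

print(selected)  # x1 should enter; x2 should be rejected
```

Backward elimination is the mirror image: start from `all_cols`, and at each step drop the variable whose removal lowers the AIC most, stopping when every removal makes it worse.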

∗. This means that all the heterogeneity of the data comes from the random component.
