28.05.2013 Views

Etude des marchés d'assurance non-vie à l'aide d'équilibres de ...


tel-00703797, version 2 - 7 Jun 2012

Chapter 1. On the need for a market model

where $\Theta$ denotes the (unknown) parameter vector and $E$ the (random) noise vector. The linear model assumptions are: (i) white noise: $E(E_i) = 0$, (ii) homoskedasticity: $Var(E_i) = \sigma^2$, (iii) normality: $E_i \sim \mathcal{N}(0, \sigma^2)$, (iv) independence: $E_i$ is independent of $E_j$ for $i \neq j$, (v) parameter identification: $\mathrm{rank}(X) = p < n$. Then, the Gauss-Markov theorem gives us the following results: (i) the least squares estimator $\hat\Theta$ of $\Theta$ is $\hat\Theta = (X^T X)^{-1} X^T Y$, and $\hat\sigma^2 = \|Y - X\hat\Theta\|^2/(n-p)$ is the estimator of $\sigma^2$, (ii) $\hat\Theta$ is a Gaussian vector independent of the random variable $\hat\sigma^2$, which satisfies $(n-p)\hat\sigma^2/\sigma^2 \sim \chi^2_{n-p}$, (iii) $\hat\Theta$ is an unbiased estimator of $\Theta$ with minimum variance, such that $Var(\hat\Theta) = \sigma^2 (X^T X)^{-1}$, and $\hat\sigma^2$ is an unbiased estimator of $\sigma^2$.
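As a reminder of where the estimator in (i) and the variance in (iii) come from, the least squares criterion can be minimized explicitly. The lines below are a standard derivation sketch, not taken from the source; they assume that the model has the form $Y = X\Theta + E$ (as the surrounding text suggests), that $Var(Y) = \sigma^2 I_n$, and that the rank condition (v) holds so that $X^T X$ is invertible. The symbol $S$ for the criterion is introduced here for illustration only.

```latex
\begin{align*}
S(\Theta) &= \|Y - X\Theta\|^2
           = Y^T Y - 2\,\Theta^T X^T Y + \Theta^T X^T X\,\Theta, \\
\nabla_\Theta S(\Theta) &= -2\,X^T Y + 2\,X^T X\,\Theta = 0
  \;\Longrightarrow\; X^T X\,\hat\Theta = X^T Y, \\
\hat\Theta &= (X^T X)^{-1} X^T Y, \\
Var(\hat\Theta) &= (X^T X)^{-1} X^T \, Var(Y) \, X (X^T X)^{-1}
                 = \sigma^2 (X^T X)^{-1}.
\end{align*}
```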

Let us note that the first four assumptions can be combined into the single assumption $E \sim \mathcal{N}(0, \sigma^2 I_n)$. Splitting the normality assumption, however, will help us identify the main differences between linear models and GLMs. The term $X\Theta$ is generally referred to as the linear predictor of $Y$.

Linear models include a wide range of statistical models, e.g. the simple linear regression $y_i = a + b x_i + \epsilon_i$ is obtained with a 2-column matrix $X$ having 1 in the first column and $(x_i)_i$ in the second column. Many properties can be derived for linear models, notably hypothesis tests, confidence intervals for parameter estimates, and estimator convergence; see, e.g., Chapter 6 of Venables and Ripley (2002).
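The simple linear regression case is small enough to work through by hand. The sketch below is an illustration only, under the assumption that the design matrix is the 2-column matrix described above: it writes out the entries of $X^T X$ and $X^T Y$, inverts the 2x2 matrix explicitly, and returns the least squares estimates together with the variance estimate. The function name `fit_simple_regression` and the sample data are hypothetical and do not come from the source.

```python
def fit_simple_regression(x, y):
    """Least squares fit of y_i = a + b * x_i + eps_i via the normal equations.

    The design matrix is X = [1, x] (a column of ones and the column x),
    so X^T X and X^T Y have the closed forms used below.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")

    # Entries of X^T X = [[n, sx], [sx, sxx]] and X^T Y = [sy, sxy].
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(u * v for u, v in zip(x, y))

    # Determinant of X^T X; it is zero when all x_i are equal (rank(X) < 2).
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is singular: all x values are identical")

    # Theta_hat = (X^T X)^{-1} X^T Y, written out for the 2x2 case.
    a = (sxx * sy - sx * sxy) / det
    b = (n * sxy - sx * sy) / det

    # sigma2_hat = ||Y - X Theta_hat||^2 / (n - p) with p = 2 parameters.
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)

    # Estimated Var(Theta_hat) = sigma2_hat * (X^T X)^{-1}.
    cov = [
        [sigma2 * sxx / det, -sigma2 * sx / det],
        [-sigma2 * sx / det, sigma2 * n / det],
    ]
    return a, b, sigma2, cov


# Hypothetical data lying roughly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
a_hat, b_hat, s2_hat, cov_hat = fit_simple_regression(xs, ys)
print("intercept:", a_hat, "slope:", b_hat, "sigma2:", s2_hat)
```

For more than two columns, the explicit inverse would normally be replaced by a linear solver from a numerical library.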

We now focus on the limitations of linear models resulting from the above assumptions. The following problems have been identified. When $X$ contains near-collinear variables, the computation of the estimator $\hat\Theta$ will be numerically unstable. This would lead to an increase in the variance estimator ∗. Working with a constrained linear model is not an appropriate answer. In practice, a solution is to test models omitting one explanatory variable after another to check for near collinearity. Another, stronger limitation lies in the fact that the response variance is assumed to be the same ($\sigma^2$) for all individuals. One way to deal with this issue is to transform the response variable by the nonlinear Box-Cox transformation. However, this response transformation can still be unsatisfactory in certain cases. Finally, the strongest limitation is the assumed support of the response variable. By the normality assumption, $Y$ must lie in the whole set $\mathbb{R}$, which excludes count variables (e.g. Poisson distribution) or positive variables (e.g. exponential distribution). To address this problem, we have to use a more general model than linear models.
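To give a feel for the near-collinearity problem mentioned above, the sketch below computes a variance inflation factor. This is a different diagnostic from the variable-omission check suggested in the text; it is swapped in here only because it is short. It relies on a standard result, assumed here: in a linear model with an intercept and two explanatory variables, the variance of each slope estimate is multiplied by $1/(1 - r^2)$, where $r$ is the empirical correlation between the two variables. The function names and the data are hypothetical.

```python
import math


def correlation(u, v):
    """Empirical (Pearson) correlation between two equally long sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def variance_inflation(u, v):
    """Factor 1 / (1 - r^2) by which collinearity inflates slope variances."""
    r = correlation(u, v)
    denom = 1.0 - r * r
    # Exact collinearity (or rounding past it) means unbounded variance.
    if denom <= 0.0:
        return math.inf
    return 1.0 / denom


# Hypothetical explanatory variables.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_moderate = [2.0, 1.0, 4.0, 3.0, 5.0]    # correlated with x1, but not strongly
x2_near = [1.01, 2.0, 2.99, 4.02, 5.0]     # almost a copy of x1

vif_moderate = variance_inflation(x1, x2_moderate)
vif_near = variance_inflation(x1, x2_near)
print("moderate correlation:", vif_moderate)
print("near collinearity:   ", vif_near)
```

The second factor is several orders of magnitude larger than the first, which is the instability described in the text: tiny changes in the data move $\hat\Theta$ a lot.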

In this paper, $Y$ represents the lapse indicator of customers, i.e. $Y$ is a Bernoulli variable with 1 indicating a lapse. For Bernoulli variables, there are two main pitfalls. Since the value of $E(Y)$ is contained within the interval $[0, 1]$, it seems natural that the predicted values $\hat Y$ should also lie in $[0, 1]$. However, predicted values $X\hat\Theta$ may fall out of this range for sufficiently large or small values of $X$. Furthermore, the normality hypothesis of the residuals is clearly not met: $Y - E(Y)$ will only take two different values, $-E(Y)$ and $1 - E(Y)$. Therefore, the modelling of $E(Y)$ as a function of $X$ needs to be changed, as well as the error distribution. This motivates the use of an extended model that can deal with discrete-valued variables.
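Both pitfalls can be seen on a toy example. The sketch below fits an ordinary least squares line to a 0/1 response and inspects the fitted values and the possible error values. The covariate, the lapse indicators and the helper name `least_squares_line` are all hypothetical and chosen only for illustration; they are not the data or the code used in this work.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line through (x_i, y_i)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical covariate and lapse indicator (1 = the customer lapsed).
covariate = [1, 2, 3, 4, 5, 6]
lapsed = [0, 0, 0, 1, 1, 1]

intercept, slope = least_squares_line(covariate, lapsed)
fitted = [intercept + slope * xi for xi in covariate]

# Pitfall 1: some fitted "probabilities" leave the interval [0, 1].
out_of_range = [p for p in fitted if p < 0.0 or p > 1.0]
print("fitted values:", [round(p, 3) for p in fitted])
print("outside [0, 1]:", [round(p, 3) for p in out_of_range])

# Pitfall 2: for a given mean p, Y - p can only equal -p or 1 - p,
# so the error cannot be normally distributed.
p = 0.3  # hypothetical value of E(Y)
possible_errors = sorted({y - p for y in (0, 1)})
print("possible values of Y - E(Y):", possible_errors)
```

Here the fitted values already leave $[0, 1]$ inside the observed range of the covariate, without any extrapolation.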

Toward generalized linear mo<strong>de</strong>ls<br />

A Generalized Linear Model is characterized by three components:

1. a random component: $Y_i$ follows a specific distribution of the exponential family $F_{\exp}(\theta_i, \phi_i, a, b, c)$ †,

∗. This would be one way to detect such an issue.

†. See Appendix 1.8.1.
