Optimality

IMS Lecture Notes–Monograph Series

2nd Lehmann Symposium – Optimality

Vol. 49 (2006) 98–119

© Institute of Mathematical Statistics, 2006

DOI: 10.1214/074921706000000419

Where do statistical models come from?

Revisiting the problem of specification

Aris Spanos ∗1

Virginia Polytechnic Institute and State University

Abstract: R. A. Fisher founded modern statistical inference in 1922 and identified its fundamental problems as specification, estimation and distribution. Since then the problem of statistical model specification has received scant attention in the statistics literature. The paper traces the history of statistical model specification, focusing primarily on pioneers like Fisher, Neyman, and more recently Lehmann and Cox, and attempts a synthesis of their views in the context of the Probabilistic Reduction (PR) approach. As argued by Lehmann [11], a major stumbling block for a general approach to statistical model specification has been the delineation of the appropriate role for substantive subject matter information. The PR approach demarcates the interrelated but complementary roles of substantive and statistical information, summarized ab initio in the form of a structural and a statistical model, respectively. In an attempt to preserve the integrity of both sources of information, as well as to ensure the reliability of their fusing, a purely probabilistic construal of statistical models is advocated. This probabilistic construal is then used to shed light on a number of issues relating to specification, including the role of preliminary data analysis, structural vs. statistical models, model specification vs. model selection, statistical vs. substantive adequacy and model validation.

1. Introduction

The current approach to statistics, interpreted broadly as ‘probability-based data modeling and inference’, has its roots in the early 19th century, but it was given its current formulation by R. A. Fisher [5]. He identified the fundamental problems of statistics to be specification, estimation and distribution. Despite its importance, the question of specification, ‘where do statistical models come from?’, has received only scant attention in the statistics literature; see Lehmann [11].

The cornerstone of modern statistics is the notion of a statistical model whose meaning and role have changed and evolved along with that of statistical modeling itself over the last two centuries. Adopting a retrospective view, a statistical model is defined to be an internally consistent set of probabilistic assumptions aiming to provide an ‘idealized’ probabilistic description of the stochastic mechanism that gave rise to the observed data $x := (x_1, x_2, \ldots, x_n)$. The quintessential statistical model is the simple Normal model, comprising a statistical Generating Mechanism (GM):

$$X_k = \mu + u_k, \quad k \in \mathbb{N} := \{1, 2, \ldots, n, \ldots\} \tag{1.1}$$
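As a rough illustration of what a generating mechanism of the form (1.1) produces, the following minimal sketch simulates data from it. It assumes the errors u_k are Normal, independent and identically distributed with mean 0 and standard deviation sigma, which is the standard reading of the simple Normal model but is not spelled out in the text up to this point. The function name `simulate_simple_normal` and the parameter values are hypothetical and chosen only for the example.

```python
import random
import statistics


def simulate_simple_normal(mu, sigma, n, seed=0):
    """Draw x_1, ..., x_n from X_k = mu + u_k.

    Assumption (not stated in the text above): the u_k are independent
    Normal draws with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)  # private generator so the draws are reproducible
    return [mu + rng.gauss(0.0, sigma) for _ in range(n)]


# Hypothetical parameter values, for illustration only.
x = simulate_simple_normal(mu=2.0, sigma=1.5, n=10_000)

# The sample mean and sample standard deviation estimate mu and sigma.
mu_hat = statistics.fmean(x)
sigma_hat = statistics.stdev(x)
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```

With a sample this large, the two estimates should fall close to the values used to generate the data. This holds only because the simulated data satisfy the assumed probabilistic structure by construction.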

∗ I’m most grateful to Erich Lehmann, Deborah G. Mayo, Javier Rojo and an anonymous referee for valuable suggestions and comments on an earlier draft of the paper.

1 Department of Economics, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, e-mail: aris@vt.edu

AMS 2000 subject classifications: 62N-03, 62A01, 62J20, 60J65.

Keywords and phrases: specification, statistical induction, misspecification testing, respecification, statistical adequacy, model validation, substantive vs. statistical information, structural vs. statistical models.
