SAE Manual Sections 1 to 4_1 (May 06).pdf - National Statistical ...



A Guide to Small Area Estimation - Version 1.1 05/05/2006

4.2 The Modelling Framework

Figure 4.1 presents a schematic representation of the small area modelling framework followed in this manual. Figure 4.2 complements Figure 4.1 with a list of key questions intended to make the decision process in small area modelling reasonably systematic. These questions should help the modeller/analyst better understand the modelling framework (Figure 4.1) and hence choose the most appropriate technique for a given set of data. They are not, however, the only questions that need to be raised in this kind of exercise.

The left-hand side of Figure 4.1 shows the simplest small area methods, the Direct and Broad Area Ratio estimators, which are frequently used in the absence of good quality auxiliary data. The answer to question 1 of Figure 4.2 is important because good quality auxiliary data is a key requisite for proceeding to the regression-based small area estimators. We take good auxiliary data to mean area-level and/or unit-level data that are potentially correlated (both theoretically and empirically) with the variable of interest. Section 3.5 discusses some of the ways the quality of auxiliary data can be determined.

The quality of the auxiliary data therefore has a large bearing on the reliability of model predictions for the variable of interest. In other words, when good quality auxiliary data is available, one can choose among a number of regression-based estimators that "borrow strength" from the relationship between the variable of interest and the auxiliary data, thereby improving the quality of small area estimates/predictions.

Australian Bureau of Statistics 28
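The two simple methods on the left-hand side of Figure 4.1 can be illustrated with a minimal Python sketch. This section does not give their formulas, so the forms used here are assumptions: the direct estimate is taken to be a weighted total of the sample units in the small area, and the broad area ratio estimate is taken to be the broad area's direct estimate scaled by the small area's share of the broad area population. The function names and all figures are hypothetical.

```python
from typing import Sequence


def direct_estimate(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted total over the sample units that fall in one small area.

    Assumed form (not stated in this section): sum of weight * value.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * y for y, w in zip(values, weights))


def broad_area_ratio_estimate(
    broad_area_total: float,
    broad_area_population: float,
    small_area_population: float,
) -> float:
    """Scale a broad area estimate by the small area's population share.

    Assumed form (not stated in this section):
    broad_area_total * small_area_population / broad_area_population.
    """
    if broad_area_population <= 0:
        raise ValueError("broad_area_population must be positive")
    return broad_area_total * small_area_population / broad_area_population


# Hypothetical figures, for illustration only.
# Three sampled units in the small area: indicator values and survey weights.
small_area_direct = direct_estimate([1.0, 0.0, 1.0], [50.0, 40.0, 60.0])

# Broad area direct estimate of 12,000 persons with the characteristic,
# broad area population 100,000, small area population 1,500.
small_area_bare = broad_area_ratio_estimate(12000.0, 100000.0, 1500.0)

print(small_area_direct, small_area_bare)
```

With only three sampled units, the direct estimate rests on very little information, whereas the broad area ratio estimate relies on the assumption that the small area behaves like the broad area as a whole. Neither uses auxiliary data beyond population counts.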

Figure 4.1: Small Area Modelling Framework
[Diagram not reproduced. Box labels: Small Area Methods; Simple Small Area Models (Direct Estimator, Broad Area Ratio Estimator); Regression based Models (Synthetic Regression Models, Random Effects Models); With No Auxiliary data; With Auxiliary data; Less complex; More complex; Area Level Analysis; Unit Level Analysis; Linear Models for continuous data; Generalised Linear Models for count data (Poisson model) and binary data (logistic model); Univariate Analysis; Multivariate Analysis.]

The classes of regression-based estimators are shown on the right-hand side of Figure 4.1. These estimators fall into two major categories: synthetic regression models and random effects models, the latter being relatively more complex than their synthetic counterparts. For the moment, let us focus on the synthetic models. Once this choice is made, the next choice is between a linear and a generalised linear model. The linear model, the simplest of all, is suitable if the variable of interest is continuous (e.g. income or age). If the variable of interest is not continuous (binary or count data), one can select from a wide range of generalised linear models. The most common examples are the logistic and Poisson models, which are used to model binary and count data, respectively.
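The choices described above can be summarised in a short Python sketch. It is only an illustration of the decision path in the text (auxiliary data first, then the type of the variable of interest), not a substitute for the questions in Figure 4.2. The function name `suggest_method` and the labels it returns are hypothetical.

```python
def suggest_method(has_good_auxiliary_data: bool, variable_type: str) -> str:
    """Suggest a class of small area method, following the text's decision path.

    variable_type is one of "continuous", "binary" or "count".
    """
    # Without good quality auxiliary data, only the simple methods apply.
    if not has_good_auxiliary_data:
        return "direct or broad area ratio estimator"

    # With good auxiliary data, the model family follows the data type.
    families = {
        "continuous": "linear model",
        "binary": "generalised linear model (logistic)",
        "count": "generalised linear model (Poisson)",
    }
    if variable_type not in families:
        raise ValueError(f"unknown variable type: {variable_type!r}")
    return families[variable_type]


# Example: income is continuous, so a linear model is suggested.
print(suggest_method(True, "continuous"))
# Example: a count of events per area points to a Poisson model.
print(suggest_method(True, "count"))
```

The sketch deliberately stops at the model family. The further choices shown in Figure 4.1 (synthetic versus random effects, area level versus unit level, univariate versus multivariate) are not encoded here.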

