SAE Manual Sections 1 to 4_1 (May 06).pdf - National Statistical ...

A Guide to Small Area Estimation - Version 1.1 05/05/2006

Clearly, as indicated in questions 2 to 3 of Figure 4.2, the choice of any of these or other models depends on the following important interrelated factors:

i. the level at which the small area estimates are required. Are small area estimates required at area-level or for some other sub-population, such as age by sex groups?
ii. the nature of the auxiliary data available related to the variable of interest. Again, this may include whether the data are at the unit-level (person-level), area-level or both.
iii. the nature of the variable of interest, i.e., whether it is continuous, binary or count data.
iv. users' quality requirements for small area estimates.
v. access to statistical expertise.

Small area models can be fitted either at area-level or person-level. Area-level models are fitted when the variable of interest and the associated covariates in the auxiliary data are observed at the level of the specific geographic area, which is referred to in Figure 4.1 as area-level analysis. A unit/person-level analysis, on the other hand, refers to a unit/person-level model that makes use of individual/unit-level data. When a model is fitted using unit/person-level data, the predictions based on this model must be aggregated to produce area-level estimates. It is also possible to fit a unit/person-level model involving both individual and area-level covariates.

Choosing the right model for the right type of data is crucial in the modelling process. For example, if the auxiliary information consists of data observed at area or unit level and the variable of interest is continuous, then it is appropriate to use a linear model to estimate the variable of interest.
Alternatively, if we have unit-level data where the variable of interest is binary (e.g., 1 = person has a disability and 0 = person has no disability), which is often the case in small area models, then we would choose a model that captures the binary nature of the observations, such as the logistic regression model. Similarly, if our data provide, say, area-level counts of people with a disability, then a suitable choice would be the Poisson model, which is appropriate for count data. It is also possible to use two or more models (e.g., unit-level and area-level models) provided that the dataset is amenable to such analyses. For instance, as we will see in the examples of Section 5, the logistic and Poisson models are used to predict person-level and area-level disability proportions, respectively.

Australian Bureau of Statistics 30
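The link between the type of variable and the model family, and the aggregation of person-level predictions to area level, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients, covariate names (age_65_plus, female), area labels and person records are hypothetical and are not taken from this guide, and the aggregation is a simple unweighted mean over the records supplied.

```python
import math
from collections import defaultdict

# Inverse link for each model family named in the text: it maps the linear
# predictor (intercept + sum of coefficient * covariate) to a predicted mean.
INVERSE_LINK = {
    "continuous": lambda eta: eta,                        # linear model
    "binary": lambda eta: 1.0 / (1.0 + math.exp(-eta)),   # logistic model
    "count": lambda eta: math.exp(eta),                   # Poisson model
}


def predict(variable_type, intercept, coefs, covariates):
    """Predicted mean for one unit (or one area) under the chosen family."""
    eta = intercept + sum(coefs[name] * value for name, value in covariates.items())
    return INVERSE_LINK[variable_type](eta)


def area_proportions(people, intercept, coefs):
    """Average the person-level predicted probabilities within each area."""
    by_area = defaultdict(list)
    for person in people:
        by_area[person["area"]].append(
            predict("binary", intercept, coefs, person["x"])
        )
    return {area: sum(ps) / len(ps) for area, ps in by_area.items()}


# Hypothetical fitted coefficients and person records (illustration only).
intercept = -2.0
coefs = {"age_65_plus": 1.5, "female": 0.1}
people = [
    {"area": "A", "x": {"age_65_plus": 1, "female": 0}},
    {"area": "A", "x": {"age_65_plus": 0, "female": 1}},
    {"area": "B", "x": {"age_65_plus": 0, "female": 0}},
]

estimates = area_proportions(people, intercept, coefs)
for area, proportion in sorted(estimates.items()):
    print(area, round(proportion, 3))
```

The same predict function covers the area-level case: calling it with "count" and area-level covariates gives a Poisson-type predicted count for an area, with no aggregation step needed.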
Figure 4.2: Key Questions for Small Area Modelling

Q1. Do you have good quality auxiliary data?
    If No: Simple Direct or Broad Area Ratio Estimators are the likely candidates.
    If Yes: Use linear or generalised linear models, depending on your data.
Q2. Is the variable of interest continuous, binary or count data?
    If continuous data: Linear model
    If binary data: Logistic model
    If count data: Poisson model
Q3. Shall I use an area-level or unit-level model or both?
Q4. At what level is my auxiliary data available and of good quality?
    Good area-level or unit-level continuous data
    Good unit-level binary data
    Good area-level count data
Q5. Are there likely to be major differences between small areas that are not taken into account by the auxiliary data?
    If Yes: Use random effects model
Consult methodology staff for technical advice.

The next key question (question 5 of Figure 4.2) is when and why we use random effects models rather than synthetic models. To start with, the preceding discussion on the choice of models (linear versus generalised linear) also applies to random effects models. However, random effects models differ in that they include an additional error component to account for differences between units that are not explained by the auxiliary variables. Synthetic models, by contrast, assume that the variable of interest can be determined from the same functional relationship with the auxiliary variables, and that this relationship applies across all small areas. This assumption, however, could be restrictive for a number of reasons. For example, in the disability data some small areas are located in remote regions with limited support facilities and services, while others are in big cities with better infrastructure and services, to which people with a disability may move to take advantage of the improved services.
Some areas may have larger populations of Indigenous people relative to others, which again may affect disability rates in different areas. Still others are located in coastal areas that attract people of retirement age and the elderly. These factors are not fully accounted for in the auxiliary data. Thus, unless these and other factors are taken into account in the model, they could limit the predictive ability of synthetic models.
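The contrast between the two approaches can be illustrated with a small Python sketch. It uses the standard area-level composite form (often called a Fay-Herriot type predictor), which this passage of the guide does not spell out, so treat it as one possible illustration rather than the guide's method. All numbers, area names and the covariate prop_aged_65_plus are hypothetical, and the variance components are treated as known instead of being estimated.

```python
def synthetic_estimate(intercept, coefs, covariates):
    """Synthetic prediction: the same regression relationship for every area."""
    return intercept + sum(coefs[name] * value for name, value in covariates.items())


def composite_estimate(direct, direct_var, synthetic, area_var):
    """Random-effects style prediction for one area.

    The predicted area effect is a shrunken version of the gap between the
    direct survey estimate and the synthetic prediction. The shrinkage weight
    lies between 0 and 1 and is larger when the direct estimate is precise.
    """
    weight = area_var / (area_var + direct_var)
    area_effect = weight * (direct - synthetic)
    return synthetic + area_effect


# Hypothetical regression coefficients and variance of the area effect,
# i.e. between-area variation not explained by the covariates (assumed known).
intercept = 0.10
coefs = {"prop_aged_65_plus": 0.50}
area_var = 0.0004

# Hypothetical areas: covariates, direct survey estimate and its sampling variance.
areas = {
    "remote": {"x": {"prop_aged_65_plus": 0.10}, "direct": 0.21, "direct_var": 0.0004},
    "coastal": {"x": {"prop_aged_65_plus": 0.30}, "direct": 0.24, "direct_var": 0.0036},
}

results = {}
for name, area in areas.items():
    synthetic = synthetic_estimate(intercept, coefs, area["x"])
    composite = composite_estimate(area["direct"], area["direct_var"], synthetic, area_var)
    results[name] = (synthetic, composite)
    print(name, round(synthetic, 3), round(composite, 3))
```

In this sketch the synthetic prediction ignores anything the covariates miss, while the composite prediction moves part of the way towards the direct estimate, and moves further for areas whose direct estimate has a smaller sampling variance.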