SAE Manual Sections 1 to 4_1 (May 06).pdf - National Statistical ...


12.11.2014 Views

A Guide to Small Area Estimation - Version 1.1 05/05/2006

Spatial Relationships

Spatial relationships in the data can be harnessed in much the same way as time series relationships. If we hypothesize that units are related to each other in a way that depends on the distance and direction between them, units can be pooled together to give a greater effective sample size for each small area estimate. This approach also reduces the impact of an odd unit value that is discordant with its neighbouring values. Spatial methods are commonly used with health, disease, agricultural or environmental data, but may also be applicable to other topics.

As with time series relationships, borrowing strength through spatial relationships adds complexity to small area estimation and should only be contemplated where statistical expertise is available.

Multivariate Relationships

In a univariate model the response or target variable is a single variable. The models referred to in this manual are univariate. Using the example of disability type (physical, sensory, intellectual, psychological/psychiatric, head injury/acquired brain damage), a separate univariate model is fitted to each disability type. In a multivariate model, the target variable is a vector of these variables and the model is fitted to all of them simultaneously.

A multivariate approach may produce more accurate predictions if there are strong correlations between the constituent variables. For example, physical impairment may be strongly correlated with sensory impairment. A multivariate approach that takes advantage of this additional information should be more robust and give more accurate estimates. However, multivariate models add complexity to small area estimation and should only be contemplated where statistical expertise is available.
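The pooling idea described under Spatial Relationships can be illustrated with a very simple distance-weighted average. This is only a sketch under stated assumptions, not the method used in the manual: the function name, the exponential weighting, the `bandwidth` parameter and the data are all hypothetical, and the sketch uses distance only (it ignores direction). A real spatial small area model would be considerably more involved.

```python
import math

def spatially_smoothed(values, coords, bandwidth):
    """Replace each unit's value with a distance-weighted mean of all units.

    Weights decay exponentially with distance. `bandwidth` is an assumed
    tuning parameter controlling how quickly neighbours stop contributing.
    """
    smoothed = []
    for (xi, yi) in coords:
        # Weight of every unit (including the unit itself, at distance 0).
        weights = [math.exp(-math.hypot(xi - xj, yi - yj) / bandwidth)
                   for (xj, yj) in coords]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed

# Hypothetical units along a line; the third value is discordant
# with its neighbours.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
values = [10.0, 11.0, 30.0, 12.0, 11.0]
smoothed = spatially_smoothed(values, coords, bandwidth=1.0)
```

In this sketch the discordant third value is pulled towards its neighbours, which is the effect the text describes. A smaller `bandwidth` would pool less and leave each unit closer to its own value.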
3.2 Basic Conditions for Success

The first step in undertaking a small area exercise is to determine the quality of the direct estimates and the auxiliary data at the small area level. The variable of interest is often drawn from a sample survey, which cannot provide estimates at a fine level because the small sample size in each small area leads to correspondingly high Relative Standard Errors (RSEs). Auxiliary data can be obtained from many sources, including administrative datasets, survey variables and census counts. Table 3.1 outlines some issues that will help in determining whether the basic conditions for producing quality small area estimates are being met.

Australian Bureau of Statistics 18
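The Relative Standard Error mentioned in Section 3.2 is the standard error of an estimate expressed as a percentage of the estimate itself. A minimal sketch of that calculation follows. The area names and figures are hypothetical, and the helper name is an assumption made for illustration.

```python
def relative_standard_error(estimate, standard_error):
    """Return the RSE as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("RSE is undefined for a zero estimate")
    return 100.0 * standard_error / abs(estimate)

# Hypothetical direct estimates: area -> (estimate, standard error).
direct = {
    "Area A": (5200.0, 310.0),  # large sample, small standard error
    "Area B": (140.0, 75.0),    # small sample, large standard error
}

rse = {area: relative_standard_error(est, se)
       for area, (est, se) in direct.items()}
```

With these made-up figures the small-sample area has a far higher RSE than the large-sample area, which is the situation that motivates small area methods in the first place.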

Table 3.1: Recipe for Success

Small Area Size

Ingredient: Each small area should have a reasonable sample. Few small areas should have no sample.
Reason: The smaller the sample, the harder it is to reliably discern the characteristics of individual small areas. More reliance is then placed on the assumption that the small area is similar to others. It also becomes more difficult to identify relationships either in the data or with auxiliary data. This will lead to lower quality small area estimates.

Variable of Interest

Ingredient: Reasonably common population characteristic.
Reason: Similar reason to small area size. In the context of household surveys, the rarer the characteristic, the smaller the likely sample.

Ingredient: Consistent estimates across small areas.
Reason: Key assumption with simple synthetic models.

Model Specification

Ingredient: Model is well-specified, meaning that:
o all main determinants or explanators (auxiliary variables) for the target variable are included in the model; and
o the model reflects the correct form of the relationship between the target variable and the auxiliary variables (e.g. linear, quadratic, logistic) and variance structures are accounted for correctly.
Reason: Mis-specification may result in incorrect predictions and incorrect measures of the statistical reliability of those predictions.

Auxiliary Data

Ingredient: Strong theoretical relationship between auxiliary variable and population of interest.
Reason: Allows easy identification of potential auxiliary variables and aids in explanation of the method to users.

Ingredient: Statistically significant relationships between auxiliary data and small area estimates.
Reason: Allows a reasonable small area model to be estimated.

Ingredient: The auxiliary data has been accurately collected and maintained and uses similar scope and definitions to the survey data.
Reason: Eliminates a further source of error that would otherwise impact upon the quality of the final small area output.

Ingredient: No missing values.
Reason: Missing values can bias estimates or cause model failure. Where possible, ensure these have been accounted for before modelling.

Ingredient: Compatibility of auxiliary data with census data in terms of consistency of definitions of variables, measurement, timing and other issues.
Reason: Reduces further sources of error caused by inconsistency of definitions, measurement and other changes over time.

Confidentiality

Ingredient: Maintain confidentiality standards.
Reason: The ABS mission statement provides an assurance concerning the confidentiality of the data it collects.
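Some of the conditions in Table 3.1 lend themselves to a quick automated screen before any modelling is attempted. The sketch below checks three of them: areas with no sample, areas with a thin sample, and areas with missing auxiliary data. The function name, the input layout and the `min_sample` cut-off are assumptions made for illustration; the manual does not prescribe a numeric threshold for a "reasonable" sample.

```python
def screen_small_areas(sample_sizes, auxiliary, min_sample=10):
    """Summarise three checks from Table 3.1 for hypothetical inputs.

    sample_sizes: dict of area -> number of sampled units
    auxiliary:    dict of area -> auxiliary value (None marks a missing value)
    min_sample:   assumed cut-off for a "reasonable" sample (not from the manual)
    """
    no_sample = sorted(a for a, n in sample_sizes.items() if n == 0)
    thin_sample = sorted(a for a, n in sample_sizes.items()
                         if 0 < n < min_sample)
    # An area counts as missing if it is absent from `auxiliary`
    # or explicitly recorded as None.
    missing_aux = sorted(a for a in sample_sizes if auxiliary.get(a) is None)
    return {
        "no_sample": no_sample,
        "thin_sample": thin_sample,
        "missing_auxiliary": missing_aux,
    }

# Hypothetical inputs.
sample_sizes = {"Area A": 42, "Area B": 0, "Area C": 6, "Area D": 18}
auxiliary = {"Area A": 0.12, "Area B": 0.30, "Area C": None}

report = screen_small_areas(sample_sizes, auxiliary)
```

A screen like this only flags potential problems. Whether a given number of empty or thinly sampled areas is acceptable remains a judgement for the analyst, as do the model specification and confidentiality conditions, which cannot be checked this mechanically.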

