Current Population Survey Design and Methodology - Census Bureau

Hurwitz, and Bershad, 1961) and correlated response variance, one form of which is interviewer variance (a measure of the variability among responses obtained by different interviewers over repeated administrations). Similarly, when a particular design-estimator fails over repeated sampling to include a particular set of population units in the sampling frame or to ensure that all units provide the required data, bias can be viewed as having components such as coverage bias, unit nonresponse bias, or item nonresponse bias (Groves, 1989). For example, a survey administered solely by telephone could result in coverage bias for estimates relating to the total population if the nontelephone households were different from the telephone households with respect to the characteristic being measured (which almost always occurs).
One common theme of these types of models is the decomposition of total mean squared error into two sets of components, one resulting from the fact that estimates are based on a sample of units rather than the entire population (sampling error) and the other due to alternative specifications of procedures for conducting the sample survey (nonsampling error). (Since nonsampling error is defined negatively, it ends up being a catch-all term for all errors other than sampling error, and can include issues such as individual behavior.)
Conceptually, nonsampling error in the context of statistical science has both variance and bias components. However, when total mean squared error is decomposed mathematically to include a sampling error term and one or more other nonsampling error terms, it is often difficult to categorize such terms as either variance or bias. The term nonsampling error is used rather loosely in the survey literature to denote mean squared error, variance, or bias in the precise mathematical sense and to imply error in the more general sense of process mistakes (see next section).
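The decomposition described above can be illustrated with a small sketch. It is not taken from the source: the function names, the five repeated estimates, and the telephone and nontelephone figures are all hypothetical, chosen only to show that mean squared error equals variance plus squared bias and how a coverage bias arises when the uncovered group differs from the covered one.

```python
from statistics import fmean, pvariance


def mse_decomposition(estimates, true_value):
    """Split the mean squared error of repeated estimates into variance and bias."""
    mean_estimate = fmean(estimates)
    variance = pvariance(estimates, mu=mean_estimate)
    bias = mean_estimate - true_value
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return {"variance": variance, "bias": bias, "mse": mse}


def coverage_bias(covered_share, covered_mean, uncovered_mean):
    """Bias of the covered-population mean relative to the total-population mean."""
    total_mean = covered_share * covered_mean + (1 - covered_share) * uncovered_mean
    # Algebraically equal to (1 - covered_share) * (covered_mean - uncovered_mean).
    return covered_mean - total_mean


# Hypothetical estimates from repeated implementations of one design-estimator.
estimates = [4.8, 5.4, 5.1, 5.7, 5.0]
parts = mse_decomposition(estimates, true_value=5.0)
print(parts["variance"] + parts["bias"] ** 2, parts["mse"])

# Hypothetical telephone-only survey: 95 percent of households are covered,
# and the rate of interest is 6.0 in covered and 10.0 in uncovered households.
print(coverage_bias(0.95, 6.0, 10.0))
```

The coverage bias vanishes only when the two groups have the same mean or when coverage is complete, which is why the telephone example in the text depends on nontelephone households being different.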
Some nonsampling error components that are conceptually known to exist have yet to be expressed in practical mathematical models. Two examples are the bias associated with the use of a particular set of interviewers and the variance associated with the selection of one of the numerous possible sets of questions. In addition, the estimation of many nonsampling errors, and of sampling bias, is extremely expensive and difficult or even impossible in practice. The estimation of bias, for example, requires knowledge of the truth, which may sometimes be verifiable from records (e.g., number of hours paid for by employer) but often is not verifiable (e.g., number of hours actually worked). As a consequence, survey organizations typically concentrate on estimating the one component of total mean squared error for which practical methods have been developed: variance.
It is frequently possible to construct an unbiased estimator of variance. In the case of complex surveys like the CPS, estimators have been developed that typically rely on the proposition, usually well-grounded, that the variability among estimates based on various subsamples of the one actual sample is a good proxy for the variability among all the possible samples like the one at hand. In the case of the CPS, 160 subsamples or replicates are used in variance estimation for the 2000 design. (For more specifics, see Chapter 14.) It is important to note that the estimates of variance resulting from the use of this and similar methods are not merely estimates of sampling variance. The variance estimates include the effects of some nonsampling errors, such as response variance and intra-interviewer correlation. On the other hand, users should be aware that for some statistics these estimates of standard error might be statistically significant underestimates of total error, an important consideration when making inferences based on survey data.
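The idea of using variability among subsample estimates as a proxy for variability among possible samples can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the CPS procedure (which Chapter 14 describes): it swaps in the simple random-groups method, treats the data as a simple random sample, and uses simulated, hypothetical weekly-hours values. The helper names and the `scale` argument are assumptions; the correct multiplier depends on the replication method actually used. The choice of 160 groups only mirrors the replicate count mentioned in the text.

```python
import math
import random
from statistics import fmean


def replicate_variance(replicate_estimates, full_estimate, scale):
    """Generic replication form: scaled sum of squared deviations of the
    replicate estimates from the full-sample estimate."""
    return scale * sum((r - full_estimate) ** 2 for r in replicate_estimates)


def random_groups_standard_error(values, k, seed=0):
    """Random-groups estimate of a mean and its standard error.

    Assumes a simple random sample split into k equal-sized random groups;
    this ignores the stratification and clustering of a complex design.
    """
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    groups = [shuffled[i::k] for i in range(k)]
    group_means = [fmean(g) for g in groups]
    overall = fmean(group_means)
    # For k independent group estimates the multiplier is 1 / (k * (k - 1)).
    variance = replicate_variance(group_means, overall, 1.0 / (k * (k - 1)))
    return overall, math.sqrt(variance)


# Simulated, hypothetical data: 1,600 reported weekly hours.
data_rng = random.Random(42)
sample = [data_rng.gauss(40.0, 8.0) for _ in range(1600)]

estimate, std_error = random_groups_standard_error(sample, k=160)
print(f"estimate = {estimate:.2f}, standard error = {std_error:.3f}")
```

Because the group estimates are computed from the responses as collected, any response variability that differs across groups flows into the result, which is one way to see why such variance estimates are not purely estimates of sampling variance.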
To draw conclusions from survey data, samplers rely on the theory of finite population sampling from a repeated sampling perspective: if the specified sample design-estimator methodology were implemented repeatedly and the sample size were sufficiently large, the probability distribution of the estimates would be very close to a normal distribution. Thus, one could safely expect 90 percent of the estimates to be within two standard errors of the mean of all possible sample estimates (standard error is the square root of the estimate of variance) (Gonzalez et al., 1975; Moore, 1997). Under an exactly normal distribution, about 90 percent of the estimates fall within 1.645 standard errors of that mean and about 95 percent within two, so the two-standard-error statement is conservative. However, one cannot claim that the probability is 0.90 that the true population value falls in a particular interval. In the case of a biased estimator due to nonresponse, undercoverage, or other types of nonsampling error, confidence intervals may not cover the population parameter at the desired 90-percent rate. In such cases, a standard error estimator may indirectly account for some elements of nonsampling error in addition to sampling error and lead to confidence intervals having greater than the nominal 90-percent coverage. On the other hand, if the bias is substantial, confidence intervals can have less than the desired coverage.
QUALITY MEASURES IN STATISTICAL PROCESS MONITORING
The process of conducting a survey includes numerous steps or components, such as defining concepts, translating concepts into questions, selecting a sample of units from what may be an imperfect list of population units, hiring and training interviewers to ask people in the sample unit the questions, coding responses into predefined categories, and creating estimates that take into account the fact that not everyone in the population of interest had a chance to be in the sample and not all of those in the sample elected to provide responses.
It is a process in which, at each step, it is possible to make a mistake in process specification or to deviate during implementation from the predefined specifications.
13–2 Overview of Data Quality Concepts Current Population Survey TP66 U.S. Bureau of Labor Statistics and U.S. Census Bureau
For example, we now recognize that the initial labor force question used in the CPS for many years (“What were you doing most of last week . . .”) was problematic to many respondents (see Chapter 6). Moreover, many interviewers tailored their presentation of the question to particular respondents, for example, saying “What were you doing most of last week—working, going to school, etc.?” if the respondent was of school age. Having a problematic question is a mistake in process specification; varying question wording in a way not prespecified is a mistake in process implementation. Errors or mistakes in process contribute to nonsampling error in that they would contaminate results even if the whole population were surveyed. Parts of the overall survey process that are known to be prone to deviations from the prescribed process specifications and thus could be potential sources of nonsampling error in the CPS are discussed in Chapter 15, along with the procedures put in place to limit their occurrence.
A variety of quality measures have been developed to describe what happens during the survey process. These measures are vital to help managers and staff working on a survey understand the quality of the process. They can also aid users of the various products of the survey process (both individual responses and their aggregations into statistics) in determining a particular product’s potential limitations and whether it is appropriate for the task at hand. Chapter 16 contains a discussion of quality indicators and, in a few cases, their potential relationship to nonsampling errors.
SUMMARY
The quality of estimates made from any survey, including the CPS, is a function of decisions made by designers and implementers. As a general rule of thumb, designers make decisions aimed at minimizing mean squared error within given cost constraints. Practically speaking, statisticians are often compelled to make decisions on sample designs and estimators based on variance alone. In the case of the CPS, the availability of external population estimates and data on rotation group bias makes it possible to do more than that. Designers of questions and data collection procedures tend to focus on limiting bias, assuming that the specification of exact question wording and ordering will naturally limit the introduction of variance. Whatever the theoretical focus of the designers, the accomplishment of the goal is heavily dependent upon those responsible for implementing the design. Implementers of specified survey procedures, like interviewers and respondents, are concentrating on doing the best job possible. Process monitoring through quality indicators, such as coverage and response rates, can determine when additional training or revisions in process specification are needed. Continuing process improvement is a vital component for achieving the survey’s quality goals.
REFERENCES
Gonzalez, M. E., J. L. Ogus, G. Shapiro, and B. J. Tepping (1975), “Standards for Discussion and Presentation of Errors in Survey and Census Data,” Journal of the American Statistical Association, 70, No. 351, Part II, pp. 5–23.
Groves, R. M. (1989), Survey Errors and Survey Costs, New York: John Wiley & Sons.
Hansen, M. H., W. N. Hurwitz, and M. A. Bershad (1961), “Measurement Errors in Censuses and Surveys,” Bulletin of the International Statistical Institute, 38(2), pp. 359–374.
Moore, D. S. (1997), Statistics Concepts and Controversies, 4th Edition, New York: W. H. Freeman.
Särndal, C., B. Swensson, and J. Wretman (1992), Model Assisted Survey Sampling, New York: Springer-Verlag.
Tukey, J. W. (1949), “Memorandum on Statistics in the Federal Government,” American Statistician, 3, No. 1, pp. 6–17; No. 2, pp. 12–16.