26.12.2012 Views

Current Population Survey Design and Methodology - Census Bureau


the other panel in the SR SECU or NSR pseudostratum (Fay, Dippo, and Morganstein, 1984). Thus the full sample was included in each replicate, but the matrix determined differing weights for the half samples. These 48 replicates were processed through all stages of the CPS weighting through compositing. The estimated variance for the characteristic of interest was computed by summing a squared difference between each replicate estimate ($\hat{Y}_r$) and the full sample estimate ($\hat{Y}_0$). The complete formula¹ is

$$\operatorname{Var}(\hat{Y}_0) = \frac{4}{48} \sum_{r=1}^{48} (\hat{Y}_r - \hat{Y}_0)^2 .$$
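The estimator above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `replicate_variance`, the `scale` parameter, and the toy numbers (a single pair of half-sample totals with just two replicates) are hypothetical and are not CPS data or CPS production code. The sketch also checks the point made in footnote 1: replicate factors of 1.5 and 0.5 with a multiplier of 4 give the same result as factors of 2 and 0 with a multiplier of 1.

```python
def replicate_variance(full_estimate, replicate_estimates, scale=4.0):
    """Return (scale / k) * sum over replicates of (Y_r - Y_0)^2.

    scale=4.0 corresponds to replicate factors of 1.5 and 0.5;
    scale=1.0 corresponds to replicate factors of 2 and 0.
    """
    k = len(replicate_estimates)
    squared_diffs = ((y_r - full_estimate) ** 2 for y_r in replicate_estimates)
    return scale / k * sum(squared_diffs)


# Hypothetical toy example: one pair of half samples with totals a and b.
a, b = 60.0, 40.0
full = a + b

# Two replicates, swapping which half sample is weighted up.
factors_15_05 = [1.5 * a + 0.5 * b, 0.5 * a + 1.5 * b]
factors_2_0 = [2.0 * a, 2.0 * b]

var_15_05 = replicate_variance(full, factors_15_05, scale=4.0)
var_2_0 = replicate_variance(full, factors_2_0, scale=1.0)

# Both equal (a - b) ** 2 = 400.0
print(var_15_05, var_2_0)
```

With factors of 1.5 and 0.5 each replicate estimate moves only half as far from the full sample estimate as it would with factors of 2 and 0, so the squared differences are one quarter as large and the factor of 4 restores the scale.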

Due to costs and computer limitations, variance estimates were calculated for only 13 months (January 1987 through January 1988) and for about 600 estimates at the national level. Replication estimates of variances at the subnational level were not reliable because of the small number of SECUs available (Lent, 1991). Based on the 13 months of variance estimates, generalized sampling errors (explained below) were calculated. (See Wolter 1985; or Fay 1984, 1989 for more details on half-sample replication for variance estimation.)

METHOD FOR ESTIMATING VARIANCE FOR 1990 AND 2000 DESIGNS

The general goal of the current variance estimation methodology, the method in use since July 1995, is to produce consistent variances and covariances for each month over the entire life of the design. Periodic maintenance reductions in the sample size and the continuous addition of new construction to the sample complicated the strategy needed to achieve this goal. However, research has shown that variance estimates are not adversely affected as long as the cumulative effect of the reductions is less than 20 percent of the original sample size (Kostanich, 1996). Assigning all future new construction sample to replicates when the variance subsamples are originally defined provides the basis for consistency over time in the variance estimates.

¹ Usually balanced half-sample replication uses replicate factors of 2 and 0 with the formula

$$\operatorname{Var}(\hat{Y}_0) = \frac{1}{k} \sum_{r=1}^{k} (\hat{Y}_r - \hat{Y}_0)^2$$

where $k$ is the number of replicates. The factor of 4 in our variance estimator is the result of using replicate factors of 1.5 and 0.5.

The current approach to estimating the 1990 and 2000 design variances is called successive difference replication. The theoretical basis for the successive difference method was discussed by Wolter (1984) and extended by Fay and Train (1995) to produce the successive difference replication method used for the CPS. The following is a description of the application of this method. Successive USUs² (ultimate sampling units) formed from adjacent hit strings (see Chapter 3) are paired in the order of their selection to take advantage of the systematic nature of the CPS within-PSU sampling scheme. Each USU usually occurs in two consecutive pairs: for example, (USU1, USU2), (USU2, USU3), (USU3, USU4), etc. A pair then is similar to a SECU in the 1980 design variance methodology. For each USU within a PSU, two pairs (or SECUs) of neighboring USUs are defined based on the order of selection: one with the USU selected before and one with the USU selected after it. This procedure allows USUs adjacent in the sort order to be assigned to the same SECU, thus better reflecting the systematic sampling in the variance estimator. Also, the large increase in the number of SECUs and in the number of replicates (160 vs. 48) over the 1980 design increases the precision of the variance estimator.
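The pairing of successive USUs described above can be illustrated with a short Python sketch. The function name `successive_pairs` and the USU labels are hypothetical, and the sketch covers only the pairing step within a single PSU, not the assignment of replicate factors.

```python
from collections import Counter


def successive_pairs(usus):
    """Pair each USU with the next one in order of selection."""
    return list(zip(usus, usus[1:]))


# Hypothetical USUs within one PSU, listed in their order of selection.
usus = ["USU1", "USU2", "USU3", "USU4"]
pairs = successive_pairs(usus)

# Count how many pairs each USU belongs to.
membership = Counter(usu for pair in pairs for usu in pair)

print(pairs)
print(membership)
```

In this toy list the interior USUs (USU2 and USU3) each fall in two pairs, while the first and last USU fall in only one, which is consistent with the statement that each USU usually, but not always, occurs in two consecutive pairs.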

Replicate Factors for Total Variance<br />

Total variance is composed of two types of variance, the variance due to sampling of housing units within PSUs (within-PSU variance) and the variance due to the selection of a subset of all NSR PSUs (between-PSU variance). Replicate factors are calculated using a 160-by-160³ Hadamard orthogonal matrix. To produce estimates of total variance, replicates are formed differently for SR and NSR samples. Between-PSU variance cannot be estimated directly using this methodology; it is the difference between the estimates of total variance and within-PSU variance.

NSR strata are combined into pseudostrata within each state, and one NSR PSU from the pseudostratum is randomly assigned to each panel of the replicate as in the 1980 design variance methodology. Replicate factors of 1.5 or 0.5 adjust the weights for the NSR panels. These factors are assigned based on a single row from the Hadamard matrix and are further adjusted to account for the unequal sizes of the original strata within the pseudostratum (Wolter, 1985). In most cases these pseudostrata consist of a pair of strata except where an odd number of strata within a state requires that a triplet be formed. In this case, for the 1990 design, two rows from the Hadamard matrix are assigned to the pseudostratum resulting in replicate factors of about 0.5, 1.7, and 0.8; or 1.5, 0.3, and 1.2 for the three PSUs assuming roughly equal sizes of the original strata. However, for the 2000 design, these factors were further adjusted to account for the unequal sizes of the original strata within the pseudostratum. All USUs in a pseudostratum are assigned the same row number(s).
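As a rough illustration of how one Hadamard row can yield factors of 1.5 and 0.5 for the two panels of an NSR pair, consider the Python sketch below. Several things here are assumptions rather than the CPS procedure: the mapping from a matrix entry of +1 or -1 to the factors (1 + 0.5h for one panel, 1 - 0.5h for the other), the function names, and the use of an 8-by-8 matrix built by the Sylvester construction. That construction only produces orders that are powers of two, so it cannot generate the 160-by-160 matrix used for the CPS. The adjustment for unequal stratum sizes and the triplet case are not covered.

```python
def sylvester_hadamard(order):
    """Build a Hadamard matrix (entries +1/-1) whose order is a power of two."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < order:
        # Block form [[H, H], [H, -H]] doubles the order at each step.
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def nsr_pair_factors(hadamard_row):
    """Assumed mapping: entry h gives 1 + 0.5h for one panel, 1 - 0.5h for the other."""
    panel_a = [1.0 + 0.5 * h for h in hadamard_row]
    panel_b = [1.0 - 0.5 * h for h in hadamard_row]
    return panel_a, panel_b


H = sylvester_hadamard(8)

# Hypothetical assignment: this pseudostratum is given row index 1.
panel_a, panel_b = nsr_pair_factors(H[1])

for r, (fa, fb) in enumerate(zip(panel_a, panel_b), start=1):
    print(f"replicate {r}: panel A factor {fa}, panel B factor {fb}")
```

Under this assumed mapping the two panel factors always sum to 2, so every replicate keeps the full sample and only shifts weight between the panels.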

For an SR sample, two rows of the Hadamard matrix are assigned to each pair of USUs creating replicate factors, $f_r$ for $r = 1, \ldots, 160$

² An ultimate sampling unit is usually a group of four neighboring housing units.

³ Rows 1 and 81 have been dropped from the matrix.

14–2 Estimation of Variance Current Population Survey TP66

U.S. Bureau of Labor Statistics and U.S. Census Bureau
