26.12.2012 Views

Current Population Survey Design and Methodology - Census Bureau


either does not know the answer to a question or refuses to provide the answer. Item nonresponse in the CPS is modest (see Chapter 16, Table 16−4).

One of three imputation methods is used to compensate for item nonresponse in the CPS. Before the edits are applied, the daily data files are merged and the combined file is sorted by state and by PSU within state. This sort ensures that allocated values come from geographically related records; that is, records in Maryland with missing values will not receive values from records in California. This is an important distinction because many labor force and industry and occupation characteristics are geographically clustered.
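The merge-and-sort step can be illustrated with a minimal sketch. The record layout (dictionaries with `state`, `psu`, and `id` keys) and the function name `merge_and_sort` are assumptions made for illustration; they are not taken from the CPS processing system.

```python
from operator import itemgetter


def merge_and_sort(daily_files):
    """Merge daily record lists, then sort by state and by PSU within state."""
    combined = [record for daily in daily_files for record in daily]
    # Sorting on (state, psu) keeps geographically related records adjacent,
    # so a later donor search draws on nearby records first.
    return sorted(combined, key=itemgetter("state", "psu"))


# Hypothetical daily files; the field values are made up.
day1 = [{"state": "MD", "psu": 2, "id": "a"}, {"state": "CA", "psu": 1, "id": "b"}]
day2 = [{"state": "MD", "psu": 1, "id": "c"}, {"state": "CA", "psu": 3, "id": "d"}]

merged = merge_and_sort([day1, day2])
order = [record["id"] for record in merged]
```

After sorting, all California records precede all Maryland records, and within each state the records are ordered by PSU.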

The edits effectively blank all entries in inappropriate questions (e.g., entries that followed an incorrect path of questions) and ensure that all appropriate questions have valid entries. For the most part, illogical or out-of-range entries have been eliminated with the use of electronic instruments; however, the edits still address these possibilities, which may arise from data transmission problems and occasional instrument malfunctions. The main purpose of the edits, however, is to assign values to questions where the response was “Don’t know” or “Refused.” This is accomplished by using one of the three imputation techniques described below.

The edits are run in a deliberate and logical sequence. Demographic variables are edited first because several of those variables are used to allocate missing values in the other modules. The labor force module is edited next because labor force status and related items are used to impute missing values for industry and occupation codes, and so forth.

The three imputation methods used by the CPS edits are described below:

1. Relational imputation infers the missing value from other characteristics on the person’s record or within the household. For instance, if race is missing, it is assigned based on the race of another household member or, failing that, taken from the previous record on the file. Similarly, if relationship data is missing, it is assigned by looking at the age and sex of the person in conjunction with the known relationship of other household members. Missing occupation codes are sometimes assigned by analyzing the industry codes, and vice versa. This technique is used as appropriate across all edits. If missing values cannot be assigned using this technique, they are assigned using one of the two following methods.

2. Longitudinal edits are used in most of the labor force edits, as appropriate. If a question is blank and the individual is in the second or later month’s interview, the edit procedure looks at last month’s data to determine whether there was an entry for that item. If so, last month’s entry is assigned; otherwise, the item is assigned a value using the appropriate hot deck, as described next.

3. The third imputation method is commonly referred to as “hot deck” allocation. This method assigns a missing value from a record with similar characteristics; the hot deck holds those donor values. Hot decks are defined by variables such as age, race, and sex. Other characteristics used in hot decks vary depending on the nature of the unanswered question. For instance, most labor force questions use age, race, sex, and occasionally another correlated labor force item such as full- or part-time status. This means the number of cells in labor force hot decks is relatively small, perhaps fewer than 100. On the other hand, the weekly earnings hot deck is defined by age, race, sex, usual hours, occupation, and educational attainment. This hot deck has several thousand cells.
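The fallback order among the three methods can be sketched as follows. This is a simplified illustration, not the CPS edit code: the function `impute`, the `relational_rule` callback, and the record layout are all hypothetical, and the sketch assumes a single item is imputed at a time.

```python
def impute(record, item, relational_rule=None, last_month=None, hot_deck_value=None):
    """Return (value, method) for one item, trying each method in turn."""
    if record.get(item) is not None:
        return record[item], "reported"
    # 1. Relational: infer from the person's record or the household, if a rule applies.
    if relational_rule is not None:
        value = relational_rule(record)
        if value is not None:
            return value, "relational"
    # 2. Longitudinal: reuse last month's entry when one exists.
    if last_month is not None and last_month.get(item) is not None:
        return last_month[item], "longitudinal"
    # 3. Hot deck: fall back to the donor value held for the matching cell.
    return hot_deck_value, "hot deck"


# Hypothetical household in which the second member's race is missing.
household = [{"race": "White"}, {"race": None}]
person = household[1]


def race_from_household(rec):
    """Borrow race from another household member, if any reported it."""
    for member in household:
        if member is not rec and member.get("race") is not None:
            return member["race"]
    return None


race_result = impute(person, "race", relational_rule=race_from_household)

# Hypothetical labor force item with no relational rule.
worker = {"full_time": None}
later_month = impute(worker, "full_time", last_month={"full_time": True}, hot_deck_value=False)
first_month = impute(worker, "full_time", last_month=None, hot_deck_value=False)
```

Here the missing race is filled relationally, the labor force item is filled from last month's entry when one is available, and a first-month interview falls through to the hot deck value.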

All CPS items that require imputation for missing values have an associated hot deck. The initial values for the hot decks are the ending values from the preceding month. As a record passes through the editing procedures, it will either donate a value to each hot deck in its path or receive a value from the hot deck. For instance, in a hypothetical case, the hot deck for question X is defined by the characteristics Black/non-Black, male/female, and age 16−25/25+. Further assume a record has the values White, male, and age 64. When this record reaches question X, the edits determine whether it has a valid entry. If so, that record’s value for question X replaces the value in the hot deck cell reserved for non-Black, male, and age 25+. Conversely, if the record were missing a value for item X, it would be assigned the value in the hot deck cell designated for non-Black, male, and age 25+.
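The donate-or-receive behavior in the hypothetical question X example can be sketched in a few lines. The names `cell_key` and `run_hot_deck`, the record layout, and the `x_allocated` flag are assumptions for illustration. The sketch also assumes that a person aged exactly 25 falls in the 25+ cell, which the text leaves open, and that every cell has been seeded with a starting value.

```python
def cell_key(record):
    """Map a record to its hot deck cell for hypothetical question X."""
    race = "Black" if record["race"] == "Black" else "non-Black"
    # Assumption: age 25 itself is placed in the older group.
    age = "25+" if record["age"] >= 25 else "under 25"
    return (race, record["sex"], age)


def run_hot_deck(records, deck, item="x"):
    """Pass records through in file order, updating the deck as they go."""
    for record in records:
        key = cell_key(record)
        if record.get(item) is not None:
            deck[key] = record[item]            # valid entry: donate to the cell
        else:
            record[item] = deck[key]            # missing entry: receive from the cell
            record[item + "_allocated"] = True  # flag the value as imputed
    return records


# The deck starts from the preceding month's ending values (made up here).
deck = {("non-Black", "male", "25+"): "start"}
records = [
    {"race": "White", "sex": "male", "age": 64, "x": "A"},   # donor
    {"race": "White", "sex": "male", "age": 40, "x": None},  # recipient
]
run_hot_deck(records, deck)
```

The first record replaces the seeded value in the non-Black, male, 25+ cell with its own answer, and the second record, which falls in the same cell, receives that answer. Because records are processed in file order, the sort by state and PSU determines which donor is most recent for each cell.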

As stated above, the various edits are logically sequenced, in accordance with the needs of subsequent edits. The edits and codes, in order of sequence, are:

1. Household edits and codes. This processing step performs edits and creates recodes for items pertaining to the household. It classifies households as interviews or noninterviews and edits items appropriately. Hot deck allocations defined by geography and other related variables are used in this edit.

2. Demographic edits and codes. This processing step ensures consistency among all demographic variables for all individuals within a household. It ensures that every interviewed household has one and only one reference person and that entries stating marital status, spouse, and parents are all consistent. It also creates families based upon these characteristics. It uses longitudinal editing, hot deck allocation defined by related demographic characteristics, and relational imputation.

9–2 Data Preparation Current Population Survey TP66

U.S. Bureau of Labor Statistics and U.S. Census Bureau
