Sampling and Data Collection Methodology
PALS Sampling
To obtain a probability sample, yet achieve the goal of racially diverse
oversamples, a four-stage sampling procedure was used. RTI International, the
second largest independent nonprofit research organization in the United States,
designed the sample and conducted the interviews. The PALS covers the
civilian, non-institutionalized household population in the continental U.S. who
were 18 years of age or older at the time the survey was conducted and who spoke
English or Spanish. The sampling frame was based on residential
mailing lists supplemented with a frame-linking procedure to add households not
included on the lists to the frame. In a recently completed national household
survey, RTI estimated that this combined sampling frame accounted for over 98%
of the occupied housing units in the U.S.
Stage 1
As noted above, RTI selected the sample in four stages. At the first stage, RTI used Census data to construct a nationally representative sampling frame of Primary Sampling Units (PSUs), defined as three-digit Zip Code Tabulation Areas. After the frame was constructed, RTI selected a first-stage sample of 60 PSUs with probabilities proportional to a composite size measure that gives PSUs with concentrations of minorities more weight than other PSUs with the same number of addresses. The sample of 60 PSUs yielded a variety of local areas from across the country and provided an adequate number of degrees of freedom for variance estimation. Although the composite size measure reduced screening costs by focusing the sample on PSUs with concentrations of minorities, the coverage of the sample was not adversely affected: mostly "non-minority" PSUs still had a chance of being selected, as did non-minority households within mostly "minority" PSUs.
Stage 2
At the second stage, RTI selected two five-digit Zip Codes from each selected PSU (120 Zips in all), again using a composite size measure that gives secondary sampling units (SSUs) with concentrations of minorities more weight than other SSUs with the same number of addresses.
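The composite size measure used at the first two stages can be illustrated with a minimal sketch. The doubling factor for minority addresses, the function names, and the example counts are assumptions made for illustration; the formula RTI actually used is not given here.

```python
def composite_size(minority_addresses, other_addresses, minority_factor=2.0):
    """Size measure that counts minority addresses more heavily.

    The factor of 2.0 is hypothetical, chosen only for illustration.
    """
    return minority_factor * minority_addresses + other_addresses


def selection_probabilities(sizes, n_select):
    """Selection probabilities proportional to size for n_select units.

    Assumes no unit is large enough to push its probability above 1.
    """
    total = sum(sizes)
    return [n_select * size / total for size in sizes]


# Three hypothetical units with 1,000 addresses each: (minority, other)
units = [(800, 200), (200, 800), (0, 1000)]
sizes = [composite_size(m, o) for m, o in units]
probs = selection_probabilities(sizes, n_select=1)
```

In this sketch every unit keeps a nonzero selection probability, so mostly non-minority areas remain covered, while units with more minority addresses are more likely to be drawn than units with the same total number of addresses.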
Stage 3
At the third stage, RTI selected an average of about 100 addresses from each selected Zip Code. Some of these were found ineligible because they were not occupied, had no English or Spanish speakers (rarely), or had residents who were physically or mentally unable to participate. After the addresses were selected, RTI produced digital maps for a sub-sample of selected addresses to facilitate the use of the half-open interval (HOI) frame-linking procedure, which identified and included housing units that were not on the mailing lists. Housing units may be missing because they were built in the time between frame development and data collection, or because of errors at the frame development stage. Field Interviewers reported to the home office any housing units that were missing from the field enumeration. When the home office confirmed that a unit had been excluded from the field enumeration, the missed unit was added to the sample to improve coverage (McMichael et al., 2008).
Stage 4
At the fourth stage, RTI selected one person per selected housing unit for interview. RTI generated a sample selection table for use by the Field Interviewers at each address to randomly determine which eligible person at the address should be asked to participate in the study.
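One way such a selection table could work is sketched below. The layout of RTI's actual table is not described, so this is only an assumption: for each possible number of eligible persons at an address, a roster position is drawn in advance, and the interviewer looks up the row that matches the household. The function names and the example roster are hypothetical.

```python
import random


def make_selection_table(max_eligible, rng):
    """Pre-draw one roster position for each possible count of eligible persons."""
    return {k: rng.randint(1, k) for k in range(1, max_eligible + 1)}


def select_person(roster, table):
    """Look up the pre-drawn position for this household's roster size."""
    position = table[len(roster)]  # positions are 1-based
    return roster[position - 1]


# A seeded generator makes the hypothetical table reproducible.
table = make_selection_table(max_eligible=6, rng=random.Random(0))
chosen = select_person(["Ana", "Ben", "Cho"], table)
```

Under this scheme each eligible person in a household of k eligible persons is selected with probability 1/k, which is the within-household probability that would enter the sampling weight.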
After data collection was completed, RTI assigned a sampling weight to each respondent that reflected his/her probability of selection at each stage. The weight was calculated as the inverse of the overall selection probability and can be thought of as the number of persons in the population that the sample member represents. Moreover, and importantly, RTI used Census projections to post-stratify the weights of respondents to compensate for differential non-response and non-coverage. Because of the clustered design, analyses should also correct for clustering in order to obtain correct standard errors. Programs such as Stata or SPSS Complex Samples are designed for calculating corrected standard errors and significance tests.
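The weighting logic can be sketched in a few lines. The stage probabilities, cell labels, and control totals below are invented for illustration, and the post-stratification shown is the simplest form (scaling weights within each cell to a known population total); the cells and procedure RTI used may differ.

```python
from collections import defaultdict


def base_weight(stage_probabilities):
    """Inverse of the overall selection probability across all stages."""
    overall = 1.0
    for p in stage_probabilities:
        overall *= p
    return 1.0 / overall


def post_stratify(weights, cells, control_totals):
    """Scale weights so that each cell sums to its population control total."""
    cell_sums = defaultdict(float)
    for w, c in zip(weights, cells):
        cell_sums[c] += w
    return [w * control_totals[c] / cell_sums[c] for w, c in zip(weights, cells)]


# Hypothetical respondent: PSU, Zip Code, address, and person probabilities.
w = base_weight([0.5, 0.1, 0.2, 0.5])  # overall probability 0.005, weight about 200

# Hypothetical base weights in two post-stratification cells.
adjusted = post_stratify(
    weights=[100.0, 300.0, 200.0],
    cells=["a", "a", "b"],
    control_totals={"a": 800.0, "b": 300.0},
)
```

After adjustment, the weights in each cell sum to that cell's control total, which is how post-stratification compensates for groups that are under-represented among respondents.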
Collecting the Data
Apart from the sample design itself, the actual interviews had to be conducted and the data collected. To do so, advance letters were mailed to all selected households four to five days before interviewers' initial visits to the sample households. Interviewers then visited sample households and completed a screening interview. The screening was conducted using a paper-and-pencil instrument (PAPI). If a respondent was selected from the household and agreed to participate, a questionnaire was administered using a laptop computer. Respondents were paid an incentive of $50 to complete the interview, which took an average of 80 minutes.
A portion of the questionnaire covered sensitive topics such as relationship behaviors and quality, deviance, attitudes about race and ethnicity, moral attitudes, and religious beliefs and authority. For this portion, the respondent was given a device for audio computer-assisted self-interviewing (ACASI) to complete about 70 questions. During this portion of the survey, the respondent wore earphones to hear the prerecorded questions and entered their responses directly into the computer, without the knowledge or aid of the interviewer.
In addition to the primary questionnaire, other PAPI instruments were left behind or mailed to spouses or partners at a later date to complete and return on their own. A $15 incentive check was mailed to all spouses or partners who returned a completed questionnaire.
Two response rates are appropriate to report. Of the homes in which interviewers attempted to reach a respondent, 71% were successfully contacted. Of those contacted, 71% agreed to participate. Thus, in the strictest sense, the response rate is 50% (.71 contact rate x .71 cooperation rate).
However, for 700 households, the sample was opened and work was begun on contacting them (preliminary letters were sent out and, in some cases, an initial contact was made), but the effort was not followed through because project time and money ran out. If we exclude that portion of the potential sample and include only the portion for which full contact attempts were made, the contact rate is 82% and the response rate is 58% (.82 x .71).
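Both response rates follow from the same product of contact rate and cooperation rate, as this short check shows. The function name is illustrative only.

```python
def response_rate(contact_rate, cooperation_rate):
    """Response rate as the product of contact and cooperation rates."""
    return contact_rate * cooperation_rate


# All sampled households, including the 700 not fully worked.
strict = response_rate(0.71, 0.71)

# Only households for which full contact attempts were made.
adjusted_rate = response_rate(0.82, 0.71)

strict_pct = round(strict * 100)           # 50
adjusted_pct = round(adjusted_rate * 100)  # 58
```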