3. Sampling Weights and Adjustments
This section describes how the survey weights were developed. Two types of weights were used in the present survey: pre-population adjustment weights (to correct for unequal selection probabilities) and post-stratification weights (to correct for known discrepancies between the sample and the population). The final analysis weight reflects both types of adjustment, i.e., adjustments for non-response, multiple telephone lines, and persons per household, as well as post-stratification adjustments. It is the weight that should be used for analyzing the survey data.
The final analysis weight was developed using the following steps:
- Calculation of the base sampling weights;
- Adjustment for unit non-response;
- Adjustment for households with multiple voice telephone numbers;
- Adjustment for selecting an adult within a sampled household; and
- Post-stratification adjustments to the target population.
The final analysis weight is the product of the base sampling weight and the adjustment factors from the steps above. If needed, extreme values of the final analysis weight can be reduced (or trimmed) using standard weight trimming procedures.
3.1 Base Sampling Weights
The first step in weighting the sample is to calculate the sampling weight for each telephone number in the sample. The sampling weight is the inverse of the telephone number's probability of selection:
W_{S} = N / n
Where N is the total number of telephone numbers in the sampling frame and n is the total number of telephone numbers in the sample. For this survey, the total number of telephone numbers in the sampling frame, N, is 282,271,600 for the national survey and 69,120,100 for the survey of targeted MSAs. The total number of telephone numbers in the sample (numbers dialed) is 5,350 for the national survey and 4,229 for the survey of targeted MSAs. The MSA total comprises 2,830 cases from the original sample of targeted MSAs and 1,399 cases that were sampled for the national survey and fell within the nine targeted MSAs.
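As a rough illustration, the base weight for each sample can be computed directly from the frame and sample counts quoted above. The function name `base_weight` is hypothetical and used only for this sketch.

```python
def base_weight(frame_size: int, sample_size: int) -> float:
    """Inverse of the selection probability n / N for one telephone number."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return frame_size / sample_size


# Frame and sample counts quoted in the text.
w_s_national = base_weight(282_271_600, 5_350)
w_s_msa = base_weight(69_120_100, 4_229)

print(f"national W_S = {w_s_national:.2f}")
print(f"MSA W_S      = {w_s_msa:.2f}")
```

Each dialed number in the national sample therefore stands in for tens of thousands of frame numbers before any adjustment is applied.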
3.2 Adjustment for Unit Non-Response
For the national survey, sampled telephone numbers are classified as responding or non-responding households according to Census division and metropolitan status (inside or outside a Metropolitan Statistical Area). The non-response adjustment factor for all telephone numbers in each Census division (c) by metropolitan status (s) is calculated as follows:
ADJ_{NR}(c, s) = 1 / (CASRO response rate for Census division c and metropolitan status s)
Where the denominator is the CASRO response rate for Census division c and metropolitan status s. The non-response adjustment factor for a specific cell (defined by metropolitan status and Census division) is therefore the inverse of the cell's response rate, that is, the ratio of the estimated number of telephone households to the number of completed surveys. For the survey of targeted MSAs, the cell for calculating the non-response adjustment factor is each of the nine targeted MSAs.
The non-response adjusted weight (W_{NR}) is the product of the sampling weight (W_{S}) and the non-response adjustment factor (ADJ_{NR}) within each stratum.
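A minimal sketch of this step follows, assuming the adjustment factor is the reciprocal of the cell's response rate, as the description of the denominator suggests. The cell labels, response rates, and base weight below are made-up illustrative values, not figures from the survey.

```python
def nonresponse_factor(response_rate: float) -> float:
    """ADJ_NR for a cell: the reciprocal of that cell's response rate."""
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response rate must be in (0, 1]")
    return 1.0 / response_rate


# Hypothetical response rates keyed by (Census division, metropolitan status).
response_rates = {
    ("Pacific", "MSA"): 0.20,
    ("Pacific", "non-MSA"): 0.25,
}

w_s = 52_761.05  # illustrative base weight, roughly N / n for the national sample

# W_NR = W_S * ADJ_NR within each cell.
w_nr = {cell: w_s * nonresponse_factor(rr) for cell, rr in response_rates.items()}

for cell, weight in w_nr.items():
    print(cell, round(weight, 2))
```

Cells with lower response rates receive larger factors, so each responding household in those cells carries more weight.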
3.3 Adjustment for Households with Multiple Telephone Numbers
Some households have multiple telephone lines for voice communication. These households therefore have multiple chances of being selected into the sample, and their survey weights must be adjusted accordingly. The adjustment for multiple telephone lines follows:
ADJ_{MT} = 1 / min(Number of Telephone Lines, 3)
The number of lines used in the adjustment is capped at three. In other words, the adjustment factor ADJ_{MT} will be one over two (0.50) if the household has two telephone lines, and one over three (0.33) if it has three or more.
Table 3 provides the summary statistics for the number of telephone lines in the sampled households.
Table 3: Number of Telephone Lines per Household
Statistic | National | MSA |
---|---|---|
Mean | 1.04 | 1.063 |
Standard error of mean | 0.007 | 0.01 |
Minimum | 1 | 1 |
25th percentile | 1 | 1 |
Median | 1 | 1 |
75th percentile | 1 | 1 |
Maximum | 4 | 4 |
For respondents who did not provide this information, it is assumed that the household contained only one telephone line. The non-response adjusted weight (W_{NR}) is multiplied by the adjustment factor for multiple telephone lines (multiple selection probability) (ADJ_{MT}) to create a weight that is adjusted for non-response and for multiple selection probability (W_{NRMT}).
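The rule for multiple telephone lines, including the cap at three lines and the one-line default for missing answers, can be sketched as below. The function name and the use of `None` for a missing answer are assumptions made for this illustration.

```python
from typing import Optional


def multi_line_factor(n_lines: Optional[int], cap: int = 3) -> float:
    """ADJ_MT = 1 / min(lines, cap); a missing count is treated as one line."""
    if n_lines is None:
        return 1.0
    if n_lines < 1:
        raise ValueError("a household must have at least one line")
    return 1.0 / min(n_lines, cap)


w_nr = 1000.0  # illustrative non-response adjusted weight
for lines in (1, 2, 3, 4, None):
    w_nrmt = w_nr * multi_line_factor(lines)
    print(lines, round(w_nrmt, 2))
```

A household reporting four lines gets the same factor as one reporting three, which keeps a handful of households with many lines from receiving very small weights.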
3.4 Adjustment for Number of Eligible Household Members
The probability of selecting an individual respondent depends on the number of eligible respondents in the household. Therefore, it is important to account for the total number of eligible household members when constructing the sampling weights. The adjustment for selecting a random adult household member follows:
ADJ_{RA} = Number of Eligible Household Members
Table 4 provides the summary statistics for the number of eligible members in the sampled households.
Table 4: Number of Eligible Household Members
Statistic | National | MSA |
---|---|---|
Mean | 2.325 | 2.36 |
Standard error of mean | 0.056 | 0.067 |
Minimum | 1 | 1 |
25th percentile | 2 | 2 |
Median | 2 | 2 |
75th percentile | 3 | 3 |
Maximum | 9 | 7 |
For respondents who did not provide this information, a value for ADJ_{RA} is imputed according to the distribution of the number of eligible persons in a household (from responding households) within the age, gender, and race/ethnicity cross-classification cell matching that of the respondent for which the value is being imputed.
The weight adjusted for non-response and for multiple selection probability (W_{NRMT}) is then multiplied by ADJ_{RA}, resulting in W_{NRMTRA}, a weight adjusted for non-response, multiple selection probability, and the selection of a random household member.
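One way to read the imputation rule is as a random draw from the household sizes reported by respondents in the same demographic cell. The sketch below takes that reading as an assumption; the record layout, field names, and cell labels are hypothetical.

```python
import random
from collections import defaultdict


def household_size_factors(records, rng):
    """Return ADJ_RA per record, imputing missing counts from same-cell donors."""
    donors = defaultdict(list)
    for rec in records:
        if rec["eligible"] is not None:
            donors[rec["cell"]].append(rec["eligible"])

    factors = []
    for rec in records:
        n = rec["eligible"]
        if n is None:
            # Draw from the empirical distribution of responding households
            # in the same cell; rng.choice raises IndexError if there are no donors.
            n = rng.choice(donors[rec["cell"]])
        factors.append(n)
    return factors


# Hypothetical records: the third respondent did not report household size.
records = [
    {"cell": "F/35-44/White", "w_nrmt": 900.0, "eligible": 2},
    {"cell": "F/35-44/White", "w_nrmt": 450.0, "eligible": 2},
    {"cell": "F/35-44/White", "w_nrmt": 900.0, "eligible": None},
]

adj_ra = household_size_factors(records, random.Random(0))
w_nrmtra = [rec["w_nrmt"] * f for rec, f in zip(records, adj_ra)]
print(w_nrmtra)
```

Passing in a seeded `random.Random` instance keeps the imputed values reproducible between runs.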
3.5 Post-Stratification Adjustments
Adjusting weighted survey counts so that they agree with population counts provided by the Census Bureau can compensate for different response rates by demographic subgroups, increase the precision of survey estimates, and reduce the bias in the estimates due to the exclusion of households without telephones from sampling. The final adjustment to the survey weight is a post-stratification adjustment that allows the weights to sum to the target population (i.e., U.S. non-institutionalized persons 18 years of age or older) by age, gender, and race/ethnicity.
The outcome of post-stratification is a factor or multiplier (M) that scales W_{NRMTRA} within each age/gender/race cell, so that the weighted marginal sums for age, gender, and race/ethnicity agree with the corresponding Census Bureau distribution for these characteristics. The method used in the post-stratification adjustment is a simple ratio adjustment applied to the sampling weight using the appropriate national population total for a given cell defined by the intersection of age, gender, and race/ethnicity.^{2} The general method for ratio adjusting follows:
- A table of the sum of the weights for each cell denoted by each age, gender, and race/ethnicity combination is created. Each cell is denoted by S(i,j,k), where i is the indicator for age, j is the indicator for gender, and k is the indicator for race/ethnicity.
- A similar table of national population controls is created, where each cell is denoted by P(i,j,k).
- The ratio R(i,j,k) = P(i,j,k) / S(i,j,k) is calculated; the cell ratio R(i,j,k) is denoted as the multiplier M.
- Each weight, at the record level, is multiplied by the appropriate cell ratio of R(i,j,k) to form the post-stratification adjustment.
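The four steps above can be sketched as a short function. The cell labels, weights, and population controls are made up for the example, and records without a cell keep a multiplier of one, which is how this section treats respondents with missing demographic information.

```python
from collections import defaultdict


def poststratify(weights, cells, population):
    """Scale weights so each cell's weighted sum matches its population control."""
    # Step 1: S(cell), the sum of the input weights in each cell.
    sums = defaultdict(float)
    for w, cell in zip(weights, cells):
        if cell is not None:
            sums[cell] += w

    # Steps 2 and 3: R(cell) = P(cell) / S(cell), the multiplier M.
    ratios = {cell: population[cell] / s for cell, s in sums.items()}

    # Step 4: apply the cell ratio at the record level (M = 1 when cell is missing).
    return [w * ratios[c] if c is not None else w for w, c in zip(weights, cells)]


# Hypothetical input: three respondents in two cells.
weights = [1.0, 3.0, 2.0]
cells = ["a", "a", "b"]
population = {"a": 40.0, "b": 10.0}

adjusted = poststratify(weights, cells, population)
print(adjusted)
```

Relative weights within a cell are preserved; only the cell totals change.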
For the national sample, cells used in the post-stratification are defined by the combination of age, gender, and race/ethnicity.^{3} Some race/ethnicity or, preferably, age categories may be merged if the number of completed interviews within the corresponding cells falls below 30. For this survey, many of the cells have fewer than 30 observations. After grouping, and to remain consistent with previous surveys, a total of 16 cells are used for the national sample and 10 for the sample of targeted MSAs. For the sample of targeted MSAs, cells for post-stratification are defined only by the combination of gender and age due to the lack of information on race/ethnicity. The details are in the following two tables.
Table 5: Post-Stratification Cells - National
CELL | DESCRIPTION | SAMPLE SIZE | POPULATION |
---|---|---|---|
1 | Male - Hispanic (age 18 and over) | 37 | 16,025,259 |
2 | Male - Black, non-Hispanic (age 18 and over) | 24 | 12,295,956 |
3 | Male - White, non-Hispanic (age 18-34) | 26 | 21,569,336 |
4 | Male - White, non-Hispanic (age 35-44) | 35 | 13,569,404 |
5 | Male - White, non-Hispanic (age 45-54) | 75 | 15,668,930 |
6 | Male - White, non-Hispanic (age 55-64) | 73 | 12,513,255 |
7 | Male - White, non-Hispanic (age 65 and over) | 123 | 13,329,864 |
8 | Male - Other race, non-Hispanic (age 18 and over) | 54 | 6,918,128 |
9 | Female - Hispanic (age 18 and over) | 46 | 14,825,817 |
10 | Female - Black, non-Hispanic (age 18 and over) | 52 | 14,196,535 |
11 | Female - White, non-Hispanic (age 18-34) | 35 | 20,862,430 |
12 | Female - White, non-Hispanic (age 35-44) | 60 | 13,496,575 |
13 | Female - White, non-Hispanic (age 45-54) | 86 | 15,909,704 |
14 | Female - White, non-Hispanic (age 55-64) | 91 | 13,100,051 |
15 | Female - White, non-Hispanic (age 65 and over) | 169 | 17,908,073 |
16 | Female - Other race, non-Hispanic (age 18 and over) | 69 | 7,494,516 |
N/A | Missing demographic information | 27 | |
TOTAL | | 1,082 | 229,683,833 |
Table 6: Post-Stratification Cells - MSA
CELL | DESCRIPTION | SAMPLE SIZE | POPULATION |
---|---|---|---|
1 | Male - age 18-34 | 30 | 8,289,508 |
2 | Male - age 35-44 | 50 | 5,471,778 |
3 | Male - age 45-54 | 66 | 5,305,946 |
4 | Male - age 55-64 | 53 | 3,742,602 |
5 | Male - age 65 and over | 91 | 3,625,639 |
6 | Female - age 18-34 | 55 | 8,072,874 |
7 | Female - age 35-44 | 63 | 5,526,391 |
8 | Female - age 45-54 | 100 | 5,512,983 |
9 | Female - age 55-64 | 79 | 4,137,051 |
10 | Female - age 65 and over | 120 | 5,083,986 |
N/A | Missing demographic information | 13 | |
TOTAL | | 720 | 54,768,758 |
Those respondents who did not supply the demographic information necessary to categorize their age, gender, and/or race/ethnicity are excluded from the post-stratification process and assigned a value of one for M.
The multiplier M is then applied to W_{NRMTRA} to create W_{NRMTRAPS}. However, W_{NRMTRAPS} is overstated because a portion of the sample is not included in the calculation of the post-stratification adjustment. Therefore, a deflation factor is applied to the value of W_{NRMTRAPS}. The deflation factor DEF for the national sample is calculated as follows:
Where:
P(i, j, k) is the national population count for cell (i, j, k); and
TW_{NRMTRA_NA} is the sum of the W_{NRMTRA} weights for respondents with missing demographic information.
The deflation factor DEF for the sample of targeted MSAs is calculated as follows:
Where:
P(i, j) is the MSA population count for cell (i, j); and
TW_{NRMTRA_MSA} is the sum of the W_{NRMTRA} weights for respondents with missing demographic information.
This deflation factor denotes the proportion of the target population represented by respondents with non-missing demographic information. The final analysis weight, W_{FINAL}, is the scaled value of W_{NRMTRAPS}, calculated as follows:
W_{FINAL} = DEF x W_{NRMTRAPS}
W_{FINAL} can be viewed as the number of population members that each respondent represents.
3.6 Trimming of Final Analysis Weights
Extreme values of W_{FINAL} are trimmed to avoid over-inflation of the sampling variance. In short, the trimming process limits the relative contribution of the variance associated with the k^{th} unit to the overall variance of the weighted estimate by comparing the square of each weight to a threshold value determined as a multiple of the sum of the squared weights. Letting w_{1}, w_{2}, ..., w_{n} denote the final analysis weights for the n completed interviews, the threshold value is calculated using the following formula:
Each household having a final analysis weight that exceeds the determined threshold value is assigned a trimmed weight equal to the threshold. Next, the age/gender/race cell used in the post-stratification is identified for each household with a trimmed weight. To maintain the overall weighted sum within the cell, the trimmed portions of the original weights are reassigned to the cases whose weights are unchanged in the trimming process.
For cases having trimmed weights but missing age, gender, and/or race/ethnicity information, the trimmed portions of the original weights are assigned to all remaining cases whose weights are unchanged in the trimming process.
The entire trimming procedure is repeated on the new set of weights - a new threshold value is recalculated and the new extreme values are re-adjusted. The process is repeated until no new extreme values are found.
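The iterative procedure can be sketched as follows. Several details are assumptions made for the sketch rather than the survey's documented method: the threshold is taken to be the square root of c times the mean squared weight, the multiplier `c` is a hypothetical parameter, trimmed excess is redistributed in proportion to the recipients' current weights, and a small tolerance plus an iteration cap are added so that the loop always terminates.

```python
import math


def trim_threshold(weights, c):
    """Assumed threshold: sqrt(c * sum of squared weights / n)."""
    return math.sqrt(c * sum(w * w for w in weights) / len(weights))


def trim_weights(weights, cells, c=10.0, max_iter=100, tol=1e-9):
    """Repeatedly cap extreme weights and reassign the excess within each cell."""
    w = list(weights)
    for _ in range(max_iter):
        threshold = trim_threshold(w, c)
        over = [i for i, x in enumerate(w) if x > threshold * (1.0 + tol)]
        if not over:
            break  # no new extreme values were found
        over_set = set(over)

        # Cap each extreme weight and record the trimmed portion by cell.
        excess = {}
        for i in over:
            excess[cells[i]] = excess.get(cells[i], 0.0) + (w[i] - threshold)
            w[i] = threshold

        # Give each cell's excess to its untrimmed cases; if the cell is
        # unknown (None), spread it over all untrimmed cases instead.
        for cell, extra in excess.items():
            recipients = [
                i for i in range(len(w))
                if i not in over_set and (cell is None or cells[i] == cell)
            ]
            base = sum(w[i] for i in recipients)
            if base > 0:  # excess is dropped if the cell has no untrimmed case
                for i in recipients:
                    w[i] += extra * w[i] / base
    return w


# Hypothetical example: one extreme weight among twenty cases in a single cell.
weights = [1.0] * 19 + [100.0]
cells = ["a"] * 20
trimmed = trim_weights(weights, cells)
print(round(max(trimmed), 3), round(sum(trimmed), 3))
```

Because the excess stays within the cell, the cell's weighted total is unchanged while the largest weight is pulled down toward the rest.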
^{2} The Census Bureau provides a detailed breakdown of population count by age, gender, and race/ethnicity.
^{3} The four race/ethnicity categories used for post-stratification purposes are: Hispanic (any race); Black, non-Hispanic; White, non-Hispanic; and Other, non-Hispanic.