Assessment of 2001 New York State NHTS Add-On Data Using Empirical and Auditable Data Sources
NATHAN ERLBAUM *
This study assesses how well the 2001 National Household Travel Survey (NHTS) estimates compare with other sources of comparable data from the census or administrative records. Comparisons of a number of NHTS measures are made with benchmark data sources and brief findings presented. Traffic count-based vehicle-miles of travel (VMT) estimates of residential travel show that monthly patterns for survey VMT are inconsistent with observed statewide ground-count estimates; residential-based VMT is not comparable to total VMT, but by using additional data to specify nonresidential and commercial VMT, it is possible to reach the traffic count estimate for total VMT. For transit ridership, NHTS person trips by subway correspond well to Metropolitan Transportation Authority reports of subway trips. There is general agreement for estimates of workers, but some geographies within New York State show statistical differences between the two surveys. For status of drivers with DMV licenses, agreement exists, but some strata and gender groupings show statistical differences between the two measures. For the number of registered household vehicles, for most strata, the census estimates are within the NHTS 95% lower bound and the estimate. The study presents findings and recommendations based on these comparisons, as well as observations as to how important a role standard error and confidence interval play in the analysis of survey results.
KEYWORDS: National Household Travel Survey (NHTS), confidence interval, census, quality assessment.
Conducting and analyzing a household travel survey is a fairly common mechanism for obtaining information about travel behavior and characteristics of the household, including its members, their trip making activity, and vehicle usage. The 2001 National Household Travel Survey (NHTS) is the latest in a series of roughly quinquennial residential-based household travel surveys undertaken by the Federal Highway Administration (FHWA).
This study assesses the quality and usefulness of the data for the New York State Department of Transportation's (NYSDOT's) purposes. Comparisons to other data sources used by the NYSDOT are made and differences noted. The paper focuses on a discussion and assessment of selected survey measures from the 2001 NHTS NYS add-on and how well they compare with data drawn from the 2000 Decennial Census, the 1997 Vehicle Inventory and Usage Survey (VIUS), transit operator annual reports, summary data from continuous traffic counting sites, and other sources available to NYSDOT.
The paper examines highway travel from two perspectives: 1) temporal trends in the NHTS are compared with NYSDOT ground-count-based estimates for the same period; and 2) estimates of NHTS residential travel and VIUS commercial travel are combined and compared with NYSDOT ground-count-based estimates of statewide travel.
The NHTS is a list-assisted random digit dialing survey designed to yield an equal probability sample of households with telephones. The survey is effectively a metropolitan/nonmetropolitan area survey. In a national probabilistic sample of over 25,000 U.S. households, New York State would be represented by its share of the national population or about 1,600 samples. The 1,600 national samples would normally be drawn primarily from New York City and its suburban counties and the Buffalo metropolitan area, because that is where the largest share of the state's population resides.
In 1995 and 2001, NYSDOT participated with FHWA as an add-on, purchasing over 11,000 additional household samples. The additional households enable the state to examine travel behavior in the state with greater statistical reliability. The add-on sample also enables NYSDOT to look at the primary urban counties within each metropolitan area, treat the 12 counties in the lower Hudson Valley as if they were each separate areas, and assess the counties not included as urban as part of a nonurban aggregate, thereby providing greater understanding of similarities and differences in travel behavior in different areas of the state.
In 1995, the Research Triangle Institute conducted the survey; in 2001, Westat conducted the survey. While the survey format and national sample size are essentially similar, differences exist between the two surveys with respect to nonresponse adjustment, access and egress modes for public transportation, the method for annualizing the odometer readings, and, perhaps most importantly, how the events of September 11, 2001 (9/11) impacted the survey and/or reflect some type of permanent travel behavior alteration in NYS.
The primary objectives of this report are to:
- Validate that the survey temporal distribution of residential household personal travel over the year was consistent with the distribution obtained from ground-count data-collection efforts and that it also exhibited similar patterns in the post 9/11 period.
- Understand how residential household personal vehicle-miles of travel (VMT) might have remained constant between 1995 and 2001 in light of the 13% increase in statewide VMT over the same period.
- Validate that the NHTS as a total travel survey adequately reflected public transit ridership and the significant growth in unlinked transit trips reported by the Metropolitan Transportation Authority (MTA) in the New York metropolitan region between 1995 and 2001.
- Assess how well the NHTS estimates for workers, drivers, and vehicles available in households compared with the 2000 Census and motor vehicle records.
TEMPORAL TRENDS IN TRAVEL SURVEY VS. GROUND-COUNT ESTIMATES OF VMT
In this section, the monthly distribution of vehicle travel is compared using two sources: the continuous monitoring ground-count data supplied to NYS, and the 1995 Nationwide Personal Transportation Survey (NPTS) and 2001 NHTS (USDOT 1995 and 2001). Some differences in these two sources are worth noting.
Temporal ground counts illustrate trends in highway travel for residents and nonresidents: the use of private and commercial vehicles as well as public transit vehicles. The counts reflect intrastate travel as well as cross-state and interstate travel and include both work-related and discretionary travel.
The vehicle trip data from the NPTS/NHTS reflect residential household personal travel. A year-long summary of all vehicle trips for all purposes was sampled, with each household reporting for a single "travel day." In reality, these estimates reflect primarily local vehicle travel: travel that probably does not change much from day to day as people go about their lives and daily business and errands. The survey does not include commercial travel, travel by visitors to NYS residents, or information about nonresidents who work or travel into or through NYS on a daily basis.
A plot of monthly estimates for statewide VMT based on data from the continuous monitoring sites within NYS is shown in figure 1. Vertical reference lines mark January 1995, January 2001, and September 2001.
The slope of travel shows an increasing trend from 1995 through 2002. The arc-like pattern within the years may reflect the impact of weather on the travel season. However, the overall trend nonetheless shows upward change from a 1995 summer peak of 10.5 billion VMT to a summer peak of almost 12 billion VMT in 2001, reflecting an annualized increase in vehicle travel of 2.25% per year over the 6-year period (NYSDOT OPP 2003d).
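The annualized figure can be checked with a short calculation. The sketch below assumes the rate is a compound annual growth rate and rounds the 2001 peak ("almost 12 billion") to 12.0; the function name is illustrative and not part of any NYSDOT tool.

```python
def annualized_growth(start, end, years):
    """Compound annual growth rate between two values."""
    return (end / start) ** (1.0 / years) - 1.0

# Summer-peak statewide VMT in billions, as quoted in the text.
# 12.0 is an assumed rounding of "almost 12 billion".
rate = annualized_growth(10.5, 12.0, 6)
print(f"{rate:.2%}")
```

Under these assumptions the result is consistent with the 2.25% per year cited above.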
Examination of the ground-count data on vehicle travel for the survey years and those between the 1995 NPTS and the 2001 NHTS (figure 2) provides the following temporal presentation. In all five years, there is a significant decline in vehicle travel in September relative to August as vacations end and school starts each fall. In 1995 and 1996, there were increases in October, which may be more a reflection of unseasonably warm weather causing travel to remain higher. Figure 3 shows a comparison of the VMT estimates for the United States (for 2001/2002) and NYS (1995/1996 and 2001/2002) and illustrates a similar pattern (USDOT 2003).
Perhaps more interesting is what happens when the NPTS/NHTS survey is used as a surrogate for vehicle travel. In the NPTS/NHTS series, the completed samples associated with each calendar month were weighted so that each month represents 1/12 of the total number of samples. If the survey period included the same calendar month twice (for example, two Aprils), the samples from both were combined and together made 1/12 of the total. As noted in the ground-count data, vehicle travel is not evenly distributed across the months. The weighting method should nonetheless correctly estimate the pattern of monthly travel across the year.
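A minimal sketch of this kind of monthly equalization follows. It is an illustration only, not the survey contractor's actual weighting procedure: the function name and the sample counts are hypothetical, and it assumes every calendar month has at least one completed sample.

```python
from collections import defaultdict

def monthly_weights(samples_by_period):
    """samples_by_period maps (year, month) -> completed sample count.

    Returns (weights, pooled): a per-sample weight for each calendar
    month, chosen so that every calendar month carries 1/12 of the
    total sample, and the pooled count per calendar month.
    Assumes all 12 calendar months are present.
    """
    pooled = defaultdict(int)
    for (_year, month), n in samples_by_period.items():
        pooled[month] += n  # two Aprils are combined into one April
    target = sum(pooled.values()) / 12.0
    weights = {month: target / n for month, n in pooled.items()}
    return weights, dict(pooled)

# Hypothetical counts: the survey period covers April twice.
counts = {(2001, 4): 400, (2002, 4): 500}
counts.update({(2001, m): 1000 for m in range(5, 13)})
counts.update({(2002, m): 800 for m in range(1, 4)})

weights, pooled = monthly_weights(counts)
total = sum(pooled.values())
for month in sorted(pooled):
    share = weights[month] * pooled[month] / total
    print(month, pooled[month], round(weights[month], 3), round(share, 4))
```

Each calendar month ends up with the same weighted share of the sample, regardless of how many interviews were completed in it.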
Figure 4 is based on the statewide summary for NYS of vehicle trips from the NPTS/NHTS add-on data by month. On the whole, the curves tend to reflect a seasonal pattern, but are not quite in agreement with observed statewide travel from ground counts (NYSDOT OPP 2003e). It is possible that the trough in vehicle travel from February to April in the 2001 NHTS is consistent with the decline noted in seasonal travel due to weather. Equally worthy of note is the more significant decline in September between the two surveys when compared with the actual ground-count data, which may reflect the events of 9/11. More important is the depth of the decline in 1995/1996, which is more likely related to response rate and survey temporal and other adjustments rather than catastrophic events.
Assuming that the propensity to travel is directly related to how temperate the weather is, figure 5 shows heating degree days.1 In the figure, December, January, and February 1995/1996 were colder than the winter of 2001/2002, reflecting a pattern consistent with the ground-count-based travel data (NYSERDA 2004).
Temperature severity alone does not preclude travel. The amount of precipitation and snowfall during the winter may provide more of an explanation for a variation in travel patterns across years.
In figure 6, the population weighted statewide snowfall by NYSDOT residency was adjusted to reflect its potential impact on the propensity to travel. The graph would suggest that the snowfall impact on travel during the winter months of 1995/1996 was more severe than in 2001/2002. However, both this observation and the ground-count data appear to contradict the survey's estimate of monthly seasonal travel (NYSDOT OOM 2003). Perhaps what the survey is showing is that people still go about their activities even when it snows. It could also be that the amount, duration, and timing of the snow event is more likely to impact travel than just the total snowfall amount. At the present time, access to specific data to test this hypothesis is unavailable within NYSDOT.
Effects of September 11, 2001
Clearly, 9/11 had an impact on the 2001 NHTS; however, the impact is unclear. Anecdotally, it has been suggested that the public hunkered down and stayed home after 9/11 and the anthrax scares; this is not reinforced by ground-count observations. In an examination of the impacts of 9/11, NYSDOT found that the most severe vehicular traffic impacts were short in duration, with daily travel returning to almost normal by late October and highway travel in general returning to annual seasonal patterns by the end of the year.
If travel in 2000 is considered a normal year, then the effect of 9/11 on travel was a 6% reduction statewide in that month. October travel showed a 2% reduction. However, by November and December 2001, highway travel resumed at a greater value than in 2000 (NYSDOT OPP 2002). This is perhaps, in part, due to the avoidance of air travel and the shift to highway for longer distance trips over the Thanksgiving and Christmas holidays and more temperate weather.
Another observation of this study was that the farther the travel was from "ground zero," the less the impact. The most significant impacts were observed where specific traffic restrictions were in place or where facilities were closed. Air travel showed significant long-term declines, with observational and anecdotal evidence showing that long-distance vehicle traffic was increasing in the short-range air travel corridor.
In the first quarter of 2002, the decline in survey-based travel for NYS residents can perhaps be explained by the economic impacts of much less business travel and the loss of jobs due to the deepening recession. Another possibility is that the national-level survey adjustment for monthly variation may have in and of itself masked the actual travel for NYS.
ASSESSING TOTAL HIGHWAY TRAVEL: SURVEY VS. GROUND COUNT
It may not be possible to expect a resident household personal travel survey such as the NPTS/NHTS to form the basis for assessing temporal variation in vehicular travel. Limitations in sample size, weighting, and nonresponse adjustments may work effectively to address issues associated with the sampling unit; however, it is not clear how to correctly adjust for design effect variables. It is equally unclear how to account for all of the other vehicular travel that is not residential household personal transportation traveling to, from, and across NYS that is measured by ground-count means.
The following discussion attempts to assess how the survey estimate of total residential-based travel may exist within the context of a ground-count-based estimate of total highway travel in NYS. The approach relies on related surveys, studies, pseudo and empirical data, and a number of enabling assumptions based on observation, anecdotal data, and local conditions.
When examining survey data, it is important to take into consideration the impact of sampling error. Sample size has a considerable impact on sampling error. While a large number of samples were taken in NYS as part of the NHTS add-on, it is possible that sampling error is sufficiently large to preclude detection of small changes between 1995 and 2001. In order to examine this, consideration must be given to the confidence interval associated with an estimated value of a measure (e.g., VMT) and view that measure as being within a certain range of values above or below the true value. In the case of a survey such as the NPTS/NHTS, the estimated value is represented by the weighted sample data.
Assuming that the 2001 NHTS produces an unbiased estimate of residential household VMT (because it is from a random sample), we can be confident with 95% certainty that the survey estimate does not differ from the true residential-based VMT by more than twice the standard error in either direction. In 2001, the estimated statewide residential-based VMT value was 95.2 billion. There is a 95% certainty that the true value of residential-based VMT lies in the interval from 91.6 billion to 98.8 billion VMT, a relative error of 1.9%. The estimated VMT for 1995 was 95.6 billion. There is a 95% certainty that the true value of statewide residential-based VMT lies in the interval from 91.2 billion to 100.0 billion VMT.
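The interval arithmetic in this paragraph can be sketched as follows. The code assumes, as the text does, that the 95% interval is the estimate plus or minus twice the standard error; the standard errors are backed out of the quoted intervals rather than taken from the survey files, and the function names are illustrative.

```python
def confidence_interval(estimate, std_error, z=2.0):
    """Interval of +/- z standard errors around an estimate."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

def implied_std_error(lower, upper, z=2.0):
    """Standard error implied by a symmetric interval."""
    return (upper - lower) / (2.0 * z)

# Statewide residential-based VMT in billions, as quoted in the text.
se_2001 = implied_std_error(91.6, 98.8)
se_1995 = implied_std_error(91.2, 100.0)

print(f"2001 SE: {se_2001:.1f}, relative error: {se_2001 / 95.2:.1%}")
print(f"1995 SE: {se_1995:.1f}, relative error: {se_1995 / 95.6:.1%}")
print(confidence_interval(95.2, se_2001))
```

The 2001 relative error computed this way is the standard error divided by the estimate, which rounds to the 1.9% given above.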
Looking at the sampling error on the estimate of the VMT from both the 2001 and 1995 surveys, there is a 95% certainty that the true value of residential-based VMT lies between 89.2 billion and 100.9 billion VMT (USDOT 1995 and 2001). Therefore, no statistically significant difference exists in the two estimates.
However, given the inability to discern any difference, we could interpret this as follows:
- residential household-based VMT may have grown by as much as 5 billion, or 5% relative to 1995;
- the 1995 estimate could be an overstatement of 6 billion;
- true growth could be as large as 11 billion.
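One conventional way to examine the change between the two surveys is an interval on the difference of the estimates. The sketch below is not necessarily the calculation used in this study: it assumes the 1995 and 2001 samples are independent, uses standard errors of 2.2 and 1.8 billion VMT (implied by the intervals quoted earlier), and keeps the same two-standard-error convention.

```python
import math

def difference_interval(est_a, se_a, est_b, se_b, z=2.0):
    """Interval for (est_b - est_a), assuming independent samples."""
    diff = est_b - est_a
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff - z * se_diff, diff + z * se_diff

# Billions of VMT; standard errors are implied values, not published ones.
low, high = difference_interval(95.6, 2.2, 95.2, 1.8)
significant = not (low <= 0.0 <= high)
print(round(low, 1), round(high, 1), significant)
```

Because the interval spans zero, no significant change is detected, and its upper end of roughly 5 billion VMT is in line with the first interpretation listed above.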
For the 2001 NHTS in NYS, a much more detailed analysis for the different urban strata shows that the relative errors are much larger, ranging from ±2.6% to ±19.7%, compared with ±2.0% for the state as a whole.
Given the level of uncertainty with the survey's estimate of residential household VMT, can it be resolved with the ground-count estimate of travel? Consider the following example using the 2001 NHTS and the 1997 VIUS. With careful examination of the VIUS for estimates of personal and nonpersonal transportation VMT and with adjustments based on the assumptions noted, it is possible to construct an estimate using the upper bounds of the confidence limit almost equal to the 130 billion statewide ground-count VMT estimate in 2001 (NYSDOT OPP 2003a).
Table 1 shows a substitution of the 2001 NHTS trucks that are available for personal use within a household for those in the VIUS. Trucks that may be considered available for commercial use at an establishment in the VIUS are then added to the NHTS. This allows the NHTS to specify residential-based VMT and the VIUS to specify resident-based commercial VMT. Using a series of assumptions and other data sources, adjustments are made to compensate for the following: survey error; difference in time period; the flow of both personal and commercial vehicles in, out, and across the state; NYS public transit and school buses; and the recognition that NYS is a net importer of goods and services for vehicles not registered in the state.
The Difficulty with Small Trucks
Pickups, vans, and other truck vehicles that can be used for transporting both people and goods are difficult to quantify from any source. Small home-based businesses can use a vehicle for both personal and commercial travel in the same day, sometimes in the same trip. The NHTS asks about commercial vehicle use and obtains the occupation code of the vehicle user, but if more than 10 commercial trips are made on the travel day (e.g., taxis or police cars) the survey asks the respondent to report only the trips for personal use of the vehicle.
In New York State as in other states, pickups, small vans, and sport utility vehicles (SUVs) may be registered as either cars or trucks depending on usage (e.g., pickup trucks with a permanently attached cap are registered as standard passenger series vehicles). In the 1997 VIUS, passenger car files were searched and any such vehicles were included in the VIUS sampling frame along with truck registrations. Therefore, the 1997 VIUS contains both personal use and commercial use trucks. This is also the case in the 1995 NPTS and 2001 NHTS. The survey does not ask for the type of registration for household-based vehicles.
In this comparison, the non-NYS pickup/panel/van estimate is assumed to be at least equal to the NYS value. In analyzing the 1995 NPTS NYS add-on, the NYS metropolitan area strata were compared against the nation as a whole. Based on this analysis, we found that, with the exception of New York County (Manhattan) and the remainder of New York County, the rest of the state had travel characteristics similar to those of the nation. Additionally, given the substitution effect of pickups, panels, vans, and SUVs for autos and their usage similarities, this assumption is necessary to address in-migration of nonresident vehicles in border areas.
Travel from Outside NYS
For nonresident travel, other assumptions are necessary given the expanse of the multistate New York City (NYC) labor market area, where residents and businesses in northern New Jersey and western Connecticut regularly engage in travel and business activities in NYC and its suburban counties. Equally important is the considerable daily passenger and truck traffic in western New York between Canada and NYS, which is clearly evident in Canadian-New York border crossing counts. Neither of these flows is measured by the VIUS or the NHTS.
Discussions with staff at the NYS Thruway (TWY) indicate that consultant studies done in the mid-1990s showed a nonresident presence on the TWY in excess of 30% (Maynus 2004). The reader should note that the TWY is mostly a rural road skirting many of the major urban areas it traverses and carries less than 10% of the state's VMT. More recent studies in 2004 in the highly urbanized NYC metro area related to the TWY Tappan Zee Bridge/I-287 corridor indicate a significantly higher proportion of nonresident usage. For three primary facilities that cross the Hudson River (where tolls are collected in the eastbound direction only), namely the Tappan Zee Bridge, the George Washington Bridge, and the Lincoln Tunnel, the daily nonresident share of the vehicular flow was 67.5%.
To address nonpersonal vehicle usage, a number of assumptions were also made. Doubling the assignment of Reebie TransSearch fully laden trucks conservatively accounts for empty backhauls and less-than-truckload movements by large long-distance trucks to, from, and across NYS (Reebie 2001). The VIUS estimate of nonresident truck movements (given the number of vehicles crossing the Hudson River) is, at a minimum, at least equivalent to that of NYS and, absent other data, is distributed equally across all vehicle categories.
Adjustments for the VIUS relative error using national values will understate the error associated with the smaller NYS sample, hence twice the national relative error is assumed for NYS.2 It should also be noted that because NYS is a net importer of goods and services as demonstrated by the Commodity Flow Survey (CFS) data,3 the VIUS will not adequately represent the vehicles entering NYS. Equally important are adjustments for things that cannot be measured, are addressed based on anecdotal data, or that will likely overstate the longer distance movements of trucks.
Taking all of the above issues and assumptions into consideration and the reality that the ground-count-based estimate of 130 billion statewide VMT may in and of itself have perhaps a ±5% error or be ±6.5 billion VMT off, we can reach the 130 billion statewide VMT estimate. By iteratively back solving and/or adjusting assumptions, the estimates also lie within the range of 123 billion to 136 billion VMT.
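The error band on the ground-count estimate, and the test of whether a survey-built total falls inside it, can be sketched as below. The function names are illustrative. The only values used are ones quoted in the text: the 130 billion ground-count estimate, the assumed 5% error, and the 98.8 billion upper bound on NHTS residential VMT.

```python
def error_band(reference, rel_error):
    """Symmetric band of +/- rel_error around a reference value."""
    half_width = reference * rel_error
    return reference - half_width, reference + half_width

def within_band(estimate, band):
    low, high = band
    return low <= estimate <= high

ground_count = 130.0                     # billion VMT, statewide, 2001
band = error_band(ground_count, 0.05)    # roughly 123.5 to 136.5
print(band)

# Residential travel alone, even at its upper confidence bound, falls
# well short of the band; the nonresident and commercial components
# discussed above are needed to close the gap.
print(within_band(98.8, band))
```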
It may be possible, therefore, to make estimates with a variety of survey resources that come close to ground-count-based estimates of total statewide VMT as reported in the Highway Performance Monitoring System (HPMS).4 This approach is tenuous at best, as there are equally problematic issues associated with ground-count expansion, vehicle classification, other issues that affect the HPMS, and estimation of travel not adequately covered by existing surveys that are crucial in resolving the regional travel impacts for a bridge state. It is at least possible to accept that residential household personal travel may indeed be at or near the upper confidence level shown in the NHTS and that the nonresident movement may be accounting for the growth observed through ground counts.
ASSESSING PUBLIC TRANSPORTATION RIDERSHIP
The NHTS collects data on trips by all modes of travel, and New York transit trips are well represented in the survey. This section focuses on a comparison of transit ridership between the 1995 NPTS and 2001 NHTS NYS add-ons in relation to reported transit ridership. In order to do so, some discussion of the difference in survey collection for transit trips is necessary. There are some definitional differences in how transit operator ridership may be reported.
- If every time a rider changed modes a fare was required, these individual trips would look like revenue trips collected.
- If free transfers between modes are allowed, then the number of modal trips may differ from revenue-based trips.
Public transportation operators providing regularly scheduled transit services typically report passenger revenue separately and passenger ridership in the form of unlinked trips, thereby accounting for each time a person boards a vehicle. Consider the example where a transit fare card is used and the traveler desires to go from point A to point B, as illustrated in figure 7. The traveler or rider begins at the origin with an auto trip to a bus station, takes a bus trip, and makes a free transfer to another bus. The rider then takes a commuter rail trip, walks to another bus, and arrives at his or her destination. This could be reported on the fare card as four separate unlinked transit trips but in reality represents the collection of three separate fares. In household travel surveys, it becomes very important to pay close attention to what is reported as a transit trip, especially if one desires to compare survey results with auditable transit operator statistics.
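The distinction between unlinked trips and fares in this example can be made concrete with a short sketch. The data layout and names are hypothetical; each segment simply records its mode and whether it was boarded on a free transfer.

```python
TRANSIT_MODES = {"bus", "commuter rail", "subway"}

def count_unlinked_and_fares(segments):
    """segments: list of (mode, free_transfer) tuples in travel order.

    Every transit boarding is one unlinked trip; a fare is collected
    for each boarding that is not a free transfer.
    """
    unlinked = 0
    fares = 0
    for mode, free_transfer in segments:
        if mode not in TRANSIT_MODES:
            continue  # auto and walk segments are not transit boardings
        unlinked += 1
        if not free_transfer:
            fares += 1
    return unlinked, fares

# The trip from point A to point B described in the text.
trip = [
    ("auto", False),
    ("bus", False),
    ("bus", True),            # free bus-to-bus transfer
    ("commuter rail", False),
    ("walk", False),
    ("bus", False),
]
print(count_unlinked_and_fares(trip))  # (4, 3)
```

The same journey is therefore four unlinked trips to the operator, three fares to the revenue office, and, as discussed below, a single transit trip to the household survey.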
The NPTS/NHTS uses the following question sequence to determine the origin and destination and mode used:
- Where are you?
- Where did you go next?
- What mode did you use?
When the mode reported is public transportation, a single main mode is then determined based on the longest distance in the link (although some respondents may have reported the longest time segment) or what the respondent identifies as the main public transportation mode for the trip. Mode changes are recorded as access and egress modes to the main mode (2001 NHTS), or as segmented transit trips (1995 NPTS).
The approach in the 1995 NPTS was to determine if public transportation was used at any time on a trip, identify the main mode of public transportation, and then classify up to four segments for the use of other modes of transportation. For example, in a trip where an individual drives to the train station, takes a bus, transfers to a bus, takes commuter rail, walks, takes a bus, and then walks to a destination, commuter rail is the longest segment and is coded as the main mode for the transit trip. This public transit trip has more than one segment and would show up in the segmented transit trip file.
Since all transit trips have walk access and egress, these would typically not be coded unless the walk trip was of significant length or between transit modes. A commuter rail trip would show up in the travel day file, and in the segmented file one would find for segments 1 to 4: auto, bus, bus, and commuter rail. The second bus trip might be lost or the bus-to-bus transfer may get coded as one bus trip because the mode did not change. Although each transit trip would be assumed to have at least three segments (walk access, vehicle trip, and walk egress) a large majority of trips in the segmented file obtained two or fewer segments.
In the 2001 NHTS, a different approach to recording public transit trips was employed. There was no segmented transit trip file for recording the multiple modes associated with transit use. Instead, when public transit (PUBTYPE) was used, the respondent was asked to identify the main transit mode. However, up to five access (TRACC1-5) and five egress (TREGR1-5) modes for this main mode were also available. If successive modes were the same then they were coded as one, because the issue is mode capture and a bus-to-bus change is still bus. Figure 7 illustrates the 2001 method.
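A minimal sketch of the 2001-style coding is given below. It is an illustration of the logic described here, not the survey contractor's processing code: the function name, the tuple layout, and the segment distances are all hypothetical, and the main mode is chosen as the longest-distance transit segment rather than by asking the respondent.

```python
from itertools import groupby

TRANSIT = {"bus", "commuter rail", "subway"}
MAX_LEGS = 5  # up to five access and five egress modes are recorded

def code_transit_trip(segments):
    """segments: list of (mode, miles) in travel order.

    Returns (main_mode, access_modes, egress_modes).
    Raises ValueError if the trip has no transit segment.
    """
    # Successive segments on the same mode are coded as one.
    merged = [(mode, sum(miles for _, miles in group))
              for mode, group in groupby(segments, key=lambda s: s[0])]
    # Main mode: the transit segment covering the longest distance.
    transit_idx = [i for i, (mode, _) in enumerate(merged) if mode in TRANSIT]
    main_idx = max(transit_idx, key=lambda i: merged[i][1])
    access = [mode for mode, _ in merged[:main_idx]][:MAX_LEGS]
    egress = [mode for mode, _ in merged[main_idx + 1:]][:MAX_LEGS]
    return merged[main_idx][0], access, egress

# The example trip from the text, with made-up distances in miles.
trip = [("auto", 3.0), ("bus", 2.0), ("bus", 1.5), ("commuter rail", 20.0),
        ("walk", 0.2), ("bus", 2.5), ("walk", 0.1)]
print(code_transit_trip(trip))
```

For this trip the sketch returns commuter rail as the main mode, auto and bus as access modes (the bus-to-bus transfer collapses to a single bus), and walk, bus, walk as egress modes.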
Based on the above discussion, it becomes clear that the two surveys are not directly comparable in their estimates of public transit ridership. On the whole, most public transit trips had fewer than three segments in 1995, indicating that even walk access and egress were poorly reported. Good detail on multiple transit mode trips is not available from the 1995 NPTS.
Table 2 presents a comparison of the relative growth in the number of personal trips on passenger transit when reported as the main mode or the main mode with segmented or access/egress transit trips in the 1995 NPTS and 2001 NHTS for NYS. It also shows the average 1995/1996 (to correspond with the survey period) and 2001/2002 Metropolitan Transportation Authority (MTA) unlinked passenger trips (NYSDOT PTD 2003b). A significant difference can be seen in what was reported in the surveys and what was measured by MTA. It should be noted that, between the 1995 NPTS and the 2001 NHTS, MTA introduced a fare card and a variety of different fare policies to encourage free transfers or off-peak discounts. These policies may have contributed to the large increase in unlinked trips (or modal boardings) reported by MTA.
MTA carries the bulk of all transit passenger trips in NYS. In calendar year 2001, the NYSDOT Passenger Transportation Division reported that MTA provided service for approximately 92% of the statewide transit passengers (NYSDOT PTD 2003a). Adding private operators downstate to MTA's share shows that service was provided to approximately 98.6% of all transit passengers in NYS.
A comparison with MTA operational data is very important (NYSDOT PTD 2003c). With the introduction of the MTA fare card, many bus trips provide free transfers, in part accounting for the increase in ridership due to greater system flexibility. However, from a fare card perspective there is no change in the way subway riders are reported. Subway ridership reporting in 1995 and 2001 allows free transfers between trains without being recorded as a boarding, very much the same way that bus transfers are currently counted. Therefore, a more focused analysis of subway ridership statewide was undertaken.
Table 3 presents 1995 NPTS and 2001 NHTS subway ridership for NYS. Subway ridership is unique to the five New York City boroughs. Port Authority Trans-Hudson (PATH) service between Manhattan and New Jersey is essentially subway-like, but NYC residents are astute enough to recognize PATH as a separate and different mode and would likely indicate it as such.
The way the surveys are coded, it is possible to identify whether the trip used public transportation on any portion. It is assumed in 2001 that the occurrence of subway access/egress to subway as a main mode may indicate that PATH was used. In 1995 two consecutive segments of subway would indicate the same or a subway transfer.
Data from a 2001 PATH transit survey show origins and destinations based on stops (Eng-Wong Taub 2003). While the PATH survey does not indicate transfers to the subway, an examination of origin and destination stations supports a modest assumption that 60,000 trips involved a transfer from PATH to the subway. Since the actual number of PATH to subway riders is not precisely known, nor is it possible to estimate the number of nonresidents who may arrive in NYC by other means for business and tourism, these are not unreasonable assumptions for this analysis. The 1995 value for this number was taken as a reduction in the 2001 value by half of the change in decennial census county workflow from New Jersey to New York, which was 9.8% between 1990 and 2000.
Given these assumptions for comparability in subway ridership between the two surveys, we may conclude that the actual public transit ridership represented by the survey and that from operator records are relatively close in terms of percentage change (26.4% vs. 24.2%). Undertaking this analysis using the NHTS confidence intervals for these same data would most likely indicate that the survey estimates easily accommodate the operator statistics for subway ridership. It is then possible that the survey may provide a representative estimate of transit ridership when the problem of unlinked trips is controlled.
ENUMERATING THE WORKFORCE
A U.S. Census Bureau report (Clark et al. 2003) makes the following observations in the executive summary with respect to employment:
- Lower counts of employed people (and the civilian labor force) in censuses than in the Current Population Survey (CPS) extend back to 1950, but in 2000 the differences between the census and the CPS were larger than in the past. The 2000 employment data may be influenced by anomalous data for individuals in group quarters. (For a discussion of employment data for group quarter populations, see USDOC 2000, pp. 960-961.)
- The 2000 census estimate of the number of employed people was about 5% lower than the CPS estimate. But the 2000 census estimate of the number of unemployed people was over 50% higher than the CPS estimate.
- The 2000 census estimate of the labor force participation rate was 2.1% lower than the CPS estimate. The Census unemployment rate was 2.1% higher than the CPS.
It is possible that during the collection of the 2000 census the temporary field interviewers concentrated more on getting "complete count" data and, therefore, were less likely to get all long-form questions completed, resulting in a lot of missing data that was later filled in by imputation. Examination of SF3 Table P132: "Imputation of Work Status for Persons Age 16 and Over for New York State," shows that the numbers generally hover around 12% (NYSDOT OPP 2003b). Table 4 indicates that the percent imputation can be higher for the aggregated county data associated with each of the NHTS add-on strata in NYS. In fact, within the five boroughs of New York City, which represent a population of over 8 million persons, the level of worker imputation is 10% or higher.
An internal analysis conducted by FHWA of specific census tracts in Washington, DC, found tracts where the percent imputed varied from 30% to 88%, especially in poorer neighborhoods and among specific racial groups (Murakami 2003).
Taking these observations into consideration, along with the fact that the five boroughs of New York City comprise a very racially and economically diverse area of the state, the accuracy of the census estimate for the number of workers is very important. This is especially so when surveys rely on the decennial census for controls (NYSDOT OPP 2003b).
In a sample survey like the NHTS, the number of workers is an effect variable resulting from questions asked of members of the household during the interview. The sample estimate of the number of workers from the NHTS must be examined within the context of the confidence interval. Similarly, census long form measures, such as workers, are also obtained through sampling, and it is equally important to estimate the confidence interval for the 2000 Census SF3 (USDOC 2000).
Table 5 compares the 2000 census estimate and the confidence interval for the universe of workers ages 16 and over with the NHTS estimate and confidence interval for the variable "worker." In 17 of the 23 strata shown, the census and NHTS 95% confidence intervals are mutually exclusive, suggesting truly different numbers.
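The "mutually exclusive" test used here can be illustrated with a short sketch. It assumes each source supplies an estimate and a standard error and that a normal-approximation 95% interval (estimate ± 1.96 × standard error) is appropriate; the function names and all numbers below are hypothetical and are not taken from table 5.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    lower: float
    upper: float


def ci_from_se(estimate: float, std_error: float, z: float = 1.96) -> Interval:
    """Normal-approximation confidence interval around a survey estimate."""
    return Interval(estimate - z * std_error, estimate + z * std_error)


def mutually_exclusive(a: Interval, b: Interval) -> bool:
    """True when the two intervals share no common value."""
    return a.upper < b.lower or b.upper < a.lower


# Hypothetical stratum: (estimate, standard error) for each source.
census_ci = ci_from_se(80_000, 1_000)
nhts_ci = ci_from_se(90_000, 3_000)
print(mutually_exclusive(census_ci, nhts_ci))

# A second hypothetical stratum where the intervals overlap.
nhts_ci_close = ci_from_se(85_000, 3_000)
print(mutually_exclusive(census_ci, nhts_ci_close))
```

Non-overlap of two 95% intervals is a conservative criterion: intervals that do not overlap indicate a difference, but intervals that overlap slightly do not by themselves establish that the two estimates are the same.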
The nature of the survey instruments, question wording, and the timeframe of the census (2000) and the NHTS (2001/2002) clearly offer the potential for differences. The events of September 11, 2001, and the severity of the economic collapse that led to job loss in the NYC metro area and state as a whole would suggest lower NHTS values. However, such is not the case; in every county, the NHTS estimate of workers is higher.
In the 2000 decennial census, question 21 asks: Last week, did this person do any work for either pay or profit? Question 21b asks: Last week, was this person temporarily absent from a job or business? The responses to questions 21 and 21b, along with related response logic, form the basis for identifying whether or not someone is a worker.
In the 2001 NHTS, the questioning was slightly different. There were several questions and responses that led to the determination of the worker status of the respondent:
- During the household interview this question was asked: Does this household member have a job?
- Later on in the personal interview, the primary activity was determined: Was the person working? Temporarily absent from a job or business? Looking for work? A homemaker? Going to school? Retired? or Doing something else?
- If the activity was something other than working or being temporarily absent from a job, the NHTS used the same wording as the census; that is, the person was asked "Last week, did you do any work for pay or profit?"
The results of this line of questioning are shown in table 6.
The questioning in the NHTS attempts to avoid the worker underreporting problem that was felt to exist in the 1995 NPTS. However, exactly what "having a job" means to the respondent is self-determined. If the census question by design underestimates the number of workers relative to the CPS, then the basic definition of "what is a worker" is the real question. By allowing the respondent to determine the definition of "having a job" and by asking the question in multiple places, the NHTS may identify a set of part-time, occasional, or otherwise uncounted workers that the census may not.
In table 6 from the NHTS for New York State as a whole, those who work or were temporarily absent from a job or business (close to the census definition) represent an estimated population of 8,352,459, which is very similar to the census estimate of 8,211,916. However, as stated before, these estimates must be looked at in the context of error and confidence limits (table 7). This table presents nine strata where the 95% confidence intervals are mutually exclusive (i.e., the estimates are statistically different). In the remaining 14 strata, there is no significant difference between the two numbers (i.e., statistically they are the same estimate).
Clearly the concept of worker in the NHTS shows that many people do not have full- or part-time jobs and do some other type of activity during the week, yet they reported that they worked and were compensated. This type of questioning may explain some of the problems between the decennial census and the CPS, as well as illustrate the effects of differences in question wording, survey instrument design and administration, the difference between job and worker for the respondent, the impact of effect variables that are not controlled for, and the need to clearly evaluate survey estimates within the context of confidence limits.
DRIVERS AND DRIVER LICENSES
In this section, the relationship between "driver licenses in force" from the New York State Department of Motor Vehicles (DMV) and the effect variable, "driver," in the NHTS is examined. Given that the NHTS is weighted by age, race, and sex, the logical assumption is that the count of drivers would correspond well with that of DMV. The DMV licenses in-force summary by gender for 2001 reflects all persons who held a valid driver license at the end of the year (approximately the midpoint of the survey period).
The NHTS does not specifically ask if each driver holds a current and valid license; the respondent is simply asked whether the person is a driver. Therefore, the NHTS may count people whose licenses may have been suspended, people who have licenses but are no longer driving, or people who are licensed out of state but may be residing temporarily in NYS.
In table 8, it is possible to see that sample size is critical to how well the number of drivers estimated by the NHTS corresponds with the number of in-force driver licenses in each stratum. At the statewide level, only the total number of NHTS female drivers is statistically different from DMV in-force licenses. On a strata basis, however, the correspondence is very different. Of the 23 strata shown, 13 are statistically different for male drivers in the NHTS vs. DMV, 8 are different for female drivers (7 of the 8 are the same strata as for males), and 12 are statistically different for all drivers.
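Because DMV licenses in force are an administrative count rather than a sample estimate, one plausible way to flag a stratum as "statistically different" is to check whether the DMV count falls outside the NHTS 95% confidence interval. The sketch below assumes that criterion and a normal-approximation interval; the stratum names and figures are hypothetical and do not come from table 8.

```python
Z_95 = 1.96  # normal critical value for a 95% interval


def differs_from_count(estimate: float, std_error: float,
                       admin_count: float, z: float = Z_95) -> bool:
    """True when the administrative count lies outside the survey interval."""
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    return not (lower <= admin_count <= upper)


# Hypothetical strata: (NHTS driver estimate, standard error, DMV licenses).
strata = {
    "Stratum A": (100_000, 4_000, 95_000),
    "Stratum B": (50_000, 1_000, 55_000),
    "Stratum C": (20_000, 2_500, 21_000),
}

different = [name for name, (est, se, dmv) in strata.items()
             if differs_from_count(est, se, dmv)]
print(different)
```

The hypothetical Stratum C shows why sample size matters: its large standard error relative to the estimate produces a wide interval that accommodates the administrative count, while the tighter interval for Stratum B does not.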
Interestingly, there is no apparent pattern for the correspondence between the two sources or the driver categories (male, female, total). Even in the Albany, Rensselaer, Saratoga, Schenectady stratum, which has a considerable oversample relative to other strata (at the request of the metropolitan planning organization (MPO)), there is mixed correspondence.
The correspondence seems to be better in small to moderate strata rather than very urban strata in the NYC area. Clearly, the very low number of persons with DMV driver licenses in New York County (Manhattan) is understandable given the availability of mass transit services and the high cost of housing, operating, and garaging an automobile. Many of the residents are age-eligible to drive yet do not have a license, and this cuts across all age cohorts for this county. What is not so understandable is why the survey reported such a high number of drivers in this county. It is possible that the population is more transient and residents are licensed in other states, or that they really do not have legal licenses. Also worthy of note is the trend for the survey to reflect drivers more in line with population. DMV licenses may reflect the inherent residential density and spatial context that may contribute to more auto trips taking place.
The survey, however, estimates drivers at the state level quite well; the overall estimate for total drivers is almost an exact match with DMV licenses in force. The obvious conclusion then is that this survey effect variable when broken out by gender and geography is highly sensitive to sample size and sampling error.
It is also possible that this sensitivity extends to trip production as well, which would likely impact survey estimates of respondent VMT. Clearly, geographic-based age, sex, and race weighting may not adequately reflect the spatial living arrangements that density introduces into the dynamic associated with owning and driving a car.
Previous NYSDOT Analysis
As part of NYSDOT's analysis of the 1995 NPTS, a series of analytical reports were prepared for each of the survey strata by the Center for Transportation Analysis at Oak Ridge National Laboratory (Hu and Young 1999). One report, "1995 New York NPTS: A Comparison Study," focused on comparing and contrasting the individual survey strata that corresponded to the primary urban counties in each metropolitan area. The intent of this report was to assess comparability in travel measures to make it easier for MPOs to benefit and draw from travel model updates and improvements done in areas with similar characteristics. One of the findings of this comparison study was that comparability in travel behavior measures was best for areas of similar tract-level population density.
Neither the census nor the NHTS asked if the vehicles available within the household were owned and/or registered to someone in the household. Nor did they ask if the vehicles were leased by someone else or how the vehicles were used (primarily for personal or for commercial use). Both surveys simply asked about the vehicles available for use, which makes comparison with registered vehicles difficult.
The total number of vehicles available in households is not directly available from the census, which gives the number of households categorized by the number of vehicles (zero, one, two, etc., up to six or more). By multiplying the number of households by the corresponding number of vehicles available, it is possible to estimate the number of vehicles available (NYSDOT OPP 2003c).
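The multiplication described above can be sketched as follows. The household counts are hypothetical, and the open-ended top category is treated as exactly six vehicles, which is an assumption that makes the result a lower-bound estimate.

```python
def estimate_vehicles(households_by_vehicles: dict[int, int]) -> int:
    """Sum of (vehicles per household) x (households in that category)."""
    return sum(n_vehicles * n_households
               for n_vehicles, n_households in households_by_vehicles.items())


# Hypothetical county distribution: vehicles available -> households.
# The key 6 stands for "six or more" and is counted as exactly 6 (assumption).
households_by_vehicles = {
    0: 1_000,
    1: 2_500,
    2: 2_000,
    3: 600,
    4: 150,
    5: 40,
    6: 10,
}

total_vehicles = estimate_vehicles(households_by_vehicles)
print(total_vehicles)
```

Any households holding more vehicles than the top category's assumed value are undercounted, so the estimate understates the true total wherever large fleets are common.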
The NHTS enumerates all the vehicles in every household and provides a vehicle file with the make, model, and primary driver of each vehicle noted. In the 2001 NHTS, the Energy Information Administration (EIA) added the fuel type, fuel efficiency, and gas cost at the residential location to the vehicle file, expanding its usefulness.
The number of vehicles reported by the census and the NHTS should compare reasonably well against the number registered within the county. Unfortunately, the NYSDMV registers vehicles as "standard series" (mostly passenger cars), commercial (trucks, vans, pickups), and other categories that define specific vehicles and/or their use (trailers, taxi, rental, farm, etc.). Examination of the VIUS indicates that the bulk of the commercial vehicles that fell into the pickup, van, SUV, and other truck categories were being used for personal transportation, and, therefore, the NYSDMV standard series vehicles are comparable and can be used for this comparison (NYSDMV 2001).
Some difficulty still exists in figuring out how many cars may be in commercial use and how many commercial vehicles may be in personal use. In table 9, the NHTS data have been recoded to correspond to the census distribution (zero, one, two, three, four, and five or more vehicles). The NHTS survey estimate and its upper and lower confidence limits are shown. Since the census values are mostly within the NHTS confidence interval, the census confidence interval was not computed. In addition to these data, the NYSDMV registration data for standard series, commercial vehicles, and their sum are included for comparison purposes.
Unlike the previous discussion for drivers and driver licenses, vehicles available may indeed be much more closely related to the basic sampling unit, the household, in both the census and the NHTS. As part of the 2001 NHTS survey contract with NYSDOT, the survey vendor was asked to evaluate if disaggregated registration data by registration type provided better weighting than households for the vehicle file in the 1995 NPTS. As a result of their analysis, it was determined that households and registrations on the whole were both equal to the task for weighting the vehicle file. This would suggest then that the census, NHTS, and DMV data would be proportionally similar.
In table 9, the reader should note that within the census and NHTS estimates it is possible that there are non-NYS registered vehicles being counted. Examination of the table shows that for the most part the census estimate is essentially found between the 95% lower bound and the estimate for the NHTS.
There are two exceptions: the statewide total and the Albany, Rensselaer, Saratoga, Schenectady stratum. In the case of the Albany stratum, it is possible that the census upper confidence limit overlaps the lower bound of the NHTS. However, no explanation can be offered for the differences in the Albany stratum and the statewide total. Equally interesting is that, for the most part, the standard series registrations alone correspond well except for four strata, and when these values are taken with the "Commercial" vehicles, the "Sum" falls within the NHTS confidence interval except for Suffolk and Rockland counties. It is possible that in these counties there was a greater proportion of commercial business vehicles. Most important to note is that the entire statewide NHTS confidence interval is significantly above the census 2000 estimate, perhaps due in part to the difference in the survey instrument and sample size.
Does the survey estimate of vehicle travel over time adequately match observed monthly VMT? No.
- In a comparison of survey and ground-count-based estimates of monthly VMT, the effects of seasonal variation alone did not explain the differences. Temperature trends and seasonal snowfall did not provide any additional explanatory data. Perhaps the duration and timing of weather events had more to do with impacting day travel than just the amount of snow that fell.
- Equally important is the fact that the NHTS is a residential household survey; it is not possible to assess the effects of nonresident and commercial travel to or through the state. The specific impacts of September 11, 2001, and the deepening recession on personal, business, and commercial travel are intricately woven into the fabric of daily travel reflected in ground counts.
- Also, the monthly ground-count data were not disaggregated by vehicle classification, state of origin, purpose (personal, business, or commercial travel) for either resident or nonresident vehicles, which would be necessary for a rigorous comparison of the survey results.
Can the apparent lack of change in survey estimates of residential household personal VMT for 1995 and 2001 be explained given the increase in the ground-count-based estimate of VMT? Yes.
- Survey estimates of effect variables require careful examination of the standard error and the 95% confidence limit. Sample size has a considerable impact on sampling error.
- Total VMT was composed of residential and nonresidential personal and commercial travel. The NPTS/NHTS addressed residential personal household travel. The VIUS addressed residential vehicular (truck) travel. The two surveys occurred at different time intervals and had significantly different sample sizes and sampling universes. Both surveys lacked consistency in definitions for mode relative to how vehicles were registered and used. Neither survey addressed interstate personal or vehicular travel.
- Ground-count estimates of VMT included residential, nonresidential, personal, commercial, and interstate movements.
- With appropriate assumptions, it may be possible to illustrate that growth in count-based VMT is perhaps being driven by what can loosely be described as commercial vehicle travel. Further research into the potential impact of growing commercial vehicle travel, especially linked to shopping and home-based businesses, is warranted.
- Lastly and perhaps most important, survey sampling in the NHTS, VIUS, and CFS is typically administered for the resident population, domiciled registered vehicles, or the shipper state of origin. In multistate labor market areas where a regional context is required, the inability to adequately assess in some manner the net migration within, into, and across the labor market by state severely hinders the ability to understand the complete travel picture, especially as measured by what is on the road.
Will a survey adequately reflect public transportation ridership? Yes, in certain cases.
- Outside of New York City and the surrounding region, defining transit may be a simple undertaking; that is, in some cases only one mode is available (e.g., a bus). NYC offers a wide variety of transit, both publicly and privately operated, and riders may take one or more modes and/or transfer within the same mode. Also, nonresidents (from New Jersey or Connecticut) enter the city to work every day and are not included in any residential survey. Survey estimates of transit person trips tend to underestimate unlinked trips.
- Additionally there are significant definitional problems in analyzing public transit trips derived from a survey with respect to those trips reported by a transit operator. Transit operators collect revenue and monitor data on unlinked trips, which do not have a one-to-one relationship, especially when a sliding fare and transfers are readily available.
- Within the context of a narrowly defined segment of public transportation (MTA subway), it is possible that the survey estimate may correspond well with the operator report of unlinked trips.
Is there comparability between the census and the NHTS on "Who is a worker?" Yes, with consistent definitions and at the state level.
- The concept of worker in the NHTS indicates that many people do not have full- or part-time jobs, yet they report that they engage in some other type of activity for which they are compensated. The difference between the census and the NHTS estimates for NYS is about 1 million jobs.
- The decennial census and the NHTS are both surveys and, as such, are subject to concerns of sample size, standard error, and the need to evaluate survey estimates within the 95% confidence interval. Worker status is a household effect variable in both surveys. Equally important is question wording and the survey instrument and its administration.
- Careful separation of the NHTS response to best match that of the decennial census concept of worker shows that, for the state as a whole, there is agreement. However, for nine of the strata in NYS, the two estimates of worker are statistically different.
- It is clear that for the respondent, the census and the NHTS have very different concepts of worker compared with that of transportation analysts. By asking "Do you have a job?" which is self-defined in the NHTS, and then probing first for traditional work status and then for any activities with pay or profit (the census question), it is possible that the NHTS reveals nontraditional or illegal employment (e.g., under-the-table employment or the underground workforce).
Does the survey adequately reflect drivers and driver licenses? Yes and no.
- On a statewide basis, the NHTS survey estimate for drivers matches NYSDMV total "driver licenses in force" quite well. When categorized by strata and gender, the results are mixed.
- The NHTS concept of driver may not be equivalent to licensed driver as reported by NYSDMV. The NHTS asks who a driver is without qualifying whether that person has a legal license.
- We can conclude that the survey estimate of drivers is an effect variable that is highly sensitive to sample size and sampling error. The sensitivity may also result from the spatial impact for travel opportunities due to settlement density, the availability of mass transit options, transient population, residents licensed in other states, or respondents without legal licenses.
Are the census and the NHTS estimates of the number of vehicles available within households in NYS accurate? Maybe.
- For most strata, the census estimate is essentially found between the NHTS 95% lower bound and the survey estimate.
- One of the problems in making this comparison is that the census does not adequately delineate vehicles that may be available within the household for use by type, registration, and usage (personal and nonpersonal or commercial). The census simply collects the number of households with zero to five or more vehicles that are available.
- One improvement to the NHTS would be a mechanism to match vehicle type with registration category. As part of the 2001 NHTS data collection and analysis, NYSDOT requested an assessment of whether DMV vehicle registration data by county was a better measure for weighting the vehicle file than the household weight used in 1995. The conclusion was that they were nearly equivalent. However, when the census and the NHTS were compared with NYSDMV registration categories, the inconsistencies were problematic.
- Lastly, both the census and the NHTS may be counting non-NYS registered vehicles.
REFERENCES
Clark, S.L., J. Iceland, T. Palumbo, K. Posey, and M. Weismantle. 2003. Comparing Employment, Income, and Poverty: Census 2000 and the Current Population Survey. Suitland, MD: U.S. Department of Commerce, U.S. Census Bureau.
Eng-Wong Taub & Associates. 2003. Origin and Destination Data by Station, prepared for the Port Authority Trans-Hudson Corporation (PATH), 2001 PATH Passenger Travel Study. Obtained by special request of NYSDOT, Passenger Transportation Division.
Hu, P.S. and J.R. Young. 1999. 1995 New York NPTS: A Comparison Study, prepared for NYSDOT, Planning and Strategy Group. November 23. Available at http://www.dot.state.ny.us/ttss/1995npts/nynpts95_comparison_study.pdf.
Maynus, L., New York State Thruway Authority. 2004. Personal communications.
Murakami, E., Federal Highway Administration, U.S. Department of Transportation. 2003. Personal communication. October 29.
New York State Department of Motor Vehicles (NYSDMV). 2001. Tabulations of driver licenses in force.
New York State Department of Transportation (NYSDOT), Passenger Transportation Division (PTD). 2003a. Metropolitan Transportation Authority fare structure data.
______. 2003b. Metropolitan Transportation Authority historical revenue and ridership data.
______. 2003c. Port Authority Trans-Hudson Corporation (PATH) operating statistics, from the 2001 National Transit Database.
New York State Department of Transportation (NYSDOT), Office of Policy and Performance (OPP). 2002. Trends and Travel after September 11, 2001. New York, NY.
______. 2003a. Customized tabulations using U.S. Department of Commerce, U.S. Census Bureau. 1998. 1997 Vehicle Inventory and Use Survey, digital micro data CD.
______. 2003b. Data extraction using U.S. Department of Commerce, U.S. Census Bureau, 2000 Census SF3 DVD, data: NYSDOT. Tables P1: Total Population; P4: Percent of Population in the Sample; P8: Sex by Age; P30: Means of Transportation to Work for Workers 16 Years and Over; P43: Sex by Employment Status for the Population 16 Years and Over; P132: Imputation of Work Status in 1999 for the Population 16 Years and Over.
______. 2003c. Data extraction from U.S. Department of Commerce, U.S. Census Bureau. 2000 Census SF3 DVD. Tables H6: Occupancy Status; H44: Tenure by Vehicles Available.
______. 2003d. Estimates of historical monthly vehicle-miles of travel.
______. 2003e. Temporal distributions of vehicle-miles of travel, special tabulations from the 1995 NPTS and 2001 NHTS.
New York State Department of Transportation (NYSDOT), Office of Operations Management (OOM). 2003. Statewide snowfall reported by NYSDOT residencies. Internal reports and tabulations.
New York State Energy Research and Development Authority (NYSERDA). 2004. Patterns and Trends, New York State Energy Profiles: 1989–2003. Available at http://www.nyserda.org, as of October 2005.
Reebie & Associates. 2001. TranSearch Data. Custom tabulations. April 2003.
U.S. Department of Commerce (USDOC), U.S. Census Bureau. 2000. Chapter 8: Accuracy of the Data, Using Tables to Compute Standard Errors and Confidence Intervals. Summary File 3, 2000 Census of Population and Housing, Technical Documentation. Available at http://www.census.gov/prod/cen2000/doc/sf3.pdf.
U.S. Department of Transportation (USDOT), Federal Highway Administration. 1995. Nationwide Personal Transportation Survey, New York State add-on. Digital data files and custom tabulations.
______. 2001. National Household Travel Survey, New York State add-on. Digital data files and custom tabulations.
______. 2003. Highway Statistics. Washington, DC. Estimates of U.S. monthly vehicle-miles of travel for 2001/2002, special tabulations.