Precision of Geocoded Locations and Network Distance Estimates

V.S. CHALASANI
J.M. DENSTADLI
Ø. ENGEBRETSEN
K.W. AXHAUSEN*

ABSTRACT

This paper addresses the accuracy of the geocoding of travel diaries, the relationships between different network-based distance estimates, and how exact estimates are when distances are self-reported. Three large-scale surveys in Norway and Switzerland demonstrate that very high precision is possible when survey protocol emphasizes the capture of addresses. The study uses the relevant and available databases and networks. Crow-fly, shortest distance path, shortest time path, and mean user equilibrium path distances are systematically related to each other, the pattern of their relationships is matched to theoretical expectations, and the impact of network resolution is reported. In the examples studied, medians of self-reported distances by distance band provide reasonable estimates of crow-fly and shortest distance path distances.

KEYWORDS: Geocoding, travel diary, precision, network distances, detour factors.

HOW MUCH PRECISION IS POSSIBLE?

Measuring distances traveled is a central task of transport statistics, as these data are not only key descriptors of travel behavior, but also essential for the calculation of derived statistics, such as exposure to risks (accidents, pollution), volume of externalities (emissions, congestion), speeds, incidence of taxation, and so forth. It is also central, directly or indirectly, to all choice models estimated from travel behavior data. Thus, it is not surprising that recent technological innovations, such as geographic information systems and the vast expansion of spatially referenced databases and networks have been adopted quickly by transport statisticians and modelers. This adoption process is ongoing, and professional standards for appropriate use must be formulated. This paper contributes to the current discussion: first, by highlighting various questions about the availability of these new resources and second, by reporting results from our work with these systems in Norway and Switzerland.

The gold standard of distance measurement is an uninterrupted trace of Global Positioning System (GPS) points matched to a complete and geometrically correct network model. The currently available GPS datasets are neither uninterrupted nor matched to complete and geometrically correct network models (see, for a recent example, Hackney et al. 2004; Marchal et al. 2004), but they are much closer to this standard than the alternatives discussed below. Some studies come quite close (see, e.g., Wolf et al. 2003). Lacking data of this quality, the researcher has various second-best alternatives to locate (geocode) the origins and destinations of observed stages or trips (Axhausen 2003) and to estimate distances between them. The data sources assumed to be available in the following discussion are travel diary surveys (Richardson et al. 1995; Axhausen et al. 2003; Resource Systems Group 1999), address databases, and network models suitable for shortest path calculations.

The quality of geocoding will depend on the details reported by travelers, as well as the details of the address databases to which these reports are matched. Travelers' difficulties with reporting addresses are well known: full street addresses may not be known for shops and other locations; correct postal codes are forgotten, even when the street address is known; or no unique names exist for common meeting points in parks or other public spaces. Address databases have similar problems: no entries for points in public spaces; arbitrary allocation of reference points for large complexes, such as train stations, airports, or shopping centers; and some missing street addresses.

Using zones for modeling convenience or privacy protection increases both complexity and the possibility for error. The definition of a reference point for a zone is an additional problem in its own right. Should one use the geographical mean of the zone, the built-up area, the center of gravity of the population, the city hall, or the post office for zones defined by a postal code?

Currently available detailed network models for vehicle navigation will be almost perfect from a topological perspective, as they include (nearly) all street addresses and all nodes. However, minor delays in the updating of such databases can cause minor errors. The larger issue is the coding of link types and associated mean speeds for link types. The same problems (with larger impacts on accuracy) occur with planning networks, that is, networks used in planning applications for assignment or other transport flow algorithms (Ortuzar and Willumsen 2001; Sheffi 1985). These contain far fewer links and nodes, causing inconsistencies between shortest paths calculated using them in comparison with using navigation networks. An added complication is their use of zones to represent space with all the related definition problems discussed above. Further, network models employ special types of links to connect zones with networks. One such connector is required to produce a complete description of the area, but many users employ two or more, which again will impact shortest path calculations.

Road geometry in network models only approximates the true geometry of real road alignments. As long as the true length of links is known, locating a street address along a link will add only minor errors.

Network models can be used to calculate path distances between origins and destinations for different criteria that might or might not have the same values—for example:

  • shortest distance path,
  • shortest time path,
  • paths included in the set of paths traveled at user equilibrium,
  • paths included in the set of paths traveled at stochastic user equilibrium, and
  • paths included in the set of paths traveled at system optimum.

For the last three criteria, one would need to define summaries of returned path distances, for example, mean, median, or minimum. The complexities involved in estimating origin-destination matrices required for these calculations are not included here (see Ortuzar and Willumsen 2001 for details).
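The difference between the first two criteria in the list above can be illustrated with a small sketch. It runs Dijkstra's algorithm on a hypothetical three-link network and reports the distance of the path that minimizes either length or travel time. The network, the link speeds, and all function names are assumptions made for illustration; they are not taken from the models used in this paper.

```python
import heapq

# Hypothetical toy network: (from, to) -> (length_km, speed_kmh).
LINKS = {
    ("A", "B"): (10.0, 50.0),   # direct local road
    ("A", "C"): (8.0, 120.0),   # motorway section
    ("C", "B"): (8.0, 120.0),   # motorway section
}


def by_distance(length_km, speed_kmh):
    """Link cost for the shortest distance path."""
    return length_km


def by_time(length_km, speed_kmh):
    """Link cost (hours) for the shortest time path."""
    return length_km / speed_kmh


def shortest_path(links, origin, dest, cost):
    """Return (total cost, path distance in km) of the least-cost path."""
    graph = {}
    for (u, v), attrs in links.items():
        graph.setdefault(u, []).append((v, attrs))
        graph.setdefault(v, []).append((u, attrs))  # links treated as two-way
    best = {origin: 0.0}
    queue = [(0.0, 0.0, origin)]  # (cost so far, km so far, node)
    while queue:
        c, km, node = heapq.heappop(queue)
        if node == dest:
            return c, km
        if c > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, (length, speed) in graph.get(node, []):
            nc = c + cost(length, speed)
            if nc < best.get(nxt, float("inf")):
                best[nxt] = nc
                heapq.heappush(queue, (nc, km + length, nxt))
    return None  # destination not reachable
```

In this toy case the shortest distance path uses the direct 10 km road, while the shortest time path takes the 16 km motorway route, so the two criteria return different distances for the same origin and destination.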

Calculating the shortest distance path distance is unambiguous. The shortest time path distance is not, because it requires the modeler to make assumptions about traveling speeds on the various links. One obvious assumption is the free-flow speed, normally the posted speed limit, available in all assignment networks. Most networks set up for navigation purposes assume a mean speed for each link type; these speeds are substantially lower than free-flow speeds. Other a priori choices are possible. We can also calculate the straight line (crow-fly) distance between two points, either as Euclidean distance or as Great Circle distance (Hubert 2003), which takes the Earth's spherical shape into account.
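The two crow-fly measures mentioned above can be sketched as follows. The Euclidean version assumes coordinates in a projected system with units of kilometers; the Great Circle version uses the haversine formula on latitude and longitude in degrees. The function names and the mean Earth radius of 6,371 km are assumptions of this sketch, not values stated in the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def euclidean_km(x1, y1, x2, y2):
    """Straight-line distance between two points in projected coordinates (km)."""
    return math.hypot(x2 - x1, y2 - y1)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

For trips within a country the size of Switzerland the two measures differ little; the spherical correction matters mainly for long-distance journeys.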

When we consider the number of possible combinations and choices in network distance calculation, traveler-reported distances are at least unambiguous. Travelers choose a path based on specific preferences and situations. We can expect their self-reported distances will deviate from any modeled distance because of their tendency to estimate distances imprecisely (Bovy and Stern 1990; Rietveld et al. 1999; Raghubir and Krishna 1996). In many cases, though, this is the only information available. Thus, patterns of deviations between reported and modeled distances are of interest.

Although not yet undertaken, a study of the interactions between all these elements would be helpful. This paper focuses on many of the relevant issues that provide some missing background and allow other results to be assessed:

  • What degree of accuracy is possible in the geocoding of addresses obtained from travel diaries? The results of three studies, the Swiss national travel diary survey (Mikrozensus 2000), the 2003 Thurgau six-week diary (Thurgau 2003), and the 2001 Norwegian national passenger travel survey (NPTS 2001) are compared.
  • How large are the differences between various distance estimates? Using a current national assignment model for Switzerland (Vrtic et al. 2003; Vrtic and Axhausen 2004), shortest distance path distances, shortest time path distances, and mean user equilibrium path distances will be calculated and compared.
  • What are the differences between reported distances and calculated distances? The three datasets will be used to answer this question.

DATASETS

2001 Norwegian National Passenger Travel Survey

The 2001 NPTS is the latest in a series of Norwegian travel surveys, which are undertaken on a four-year cycle (Denstadli et al. 2003). The respondents, all of whom are at least 13 years old, reported both their trips for one day and all trips over 100 kilometers made during the last month in a computer-aided telephone interview (CATI). They were asked to fill in a "memory jogger" before the interviews. Respondents were drawn from the national person register, which allows pre-geocoding of home and work place addresses.

The published dataset gives addresses at the level of the approximately 14,000 statistical wards, which is how the census office divides Norway. These vary in population from 0 to 3,500, with a mean of 320. The geocoding of the 64,240 daily trips and 27,507 long-distance journeys involved two automatic matches and two manual correction phases against a set of address databases, including one with the names of firms and organizations (Denstadli and Hjorthol 2003).

Swiss National Travel Survey

The Swiss Federal Office of Statistics (BFS) and the Federal Office of Spatial Planning (ARE) conducted the Mikrozensus 2000, the sixth in a series dating back to 1974 (BFS 2001 and 2002). A number of cantons provided further support by financing additional respondents at marginal costs. The CATI-interview covered the stages of one entire day and long-distance and air travel for longer periods. The feasibility of geocoding the stage data was still uncertain during the survey's design phase, so exact street addresses or their equivalents were obtained only for trips to, within, and from the 10 largest cities in Switzerland (40,000 to 340,000 inhabitants). The names of stations and public transport stops were carefully recorded as part of the stage-based interview, as well as home addresses. However, the quality of the address information was not a prime concern for the survey.

The geocoding (Jermann 2003) of the 144,000 stages (about 100,000 trips) was performed some time after the field phase of the survey, as part of a different project. Using geocoded address databases of the BFS, canton Zürich, and the Swiss Federal Railways stations and stops, we implemented a semi-automatic matching process after normalizing and correcting street addresses in the Mikrozensus 2000 records (spelling, punctuation, removal of diacritical marks, etc.). The remaining addresses were matched by hand, as far as possible, using maps, telephone books, and information on the internet, especially for place names and leisure facilities. (The address-matching tools in ArcInfo and MapInfo were unsuitable, because they embed too many assumptions valid only in the context of the United States.)
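A normalization step of the kind described above (punctuation, removal of diacritical marks) might look like the following sketch. It is not the procedure documented in Jermann (2003); the function name and the specific rules (lowercasing, replacing punctuation with spaces) are assumptions chosen for illustration.

```python
import re
import unicodedata


def normalize_address(raw):
    """Return a simplified address string suitable as a matching key."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", raw)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    lowered = stripped.lower()
    # Replace punctuation with spaces, then collapse runs of whitespace.
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    return re.sub(r"\s+", " ", no_punct).strip()
```

Applying the same function to both the survey records and the address database before comparison means that differences in spelling conventions, such as "Zürich" versus "Zurich", no longer prevent an automatic match. Spelling corrections proper would need an additional step that is not shown here.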

2003 Thurgau Six-Week Diary

This survey replicates and improves on the six-week Mobidrive survey (Axhausen et al. 2002). A total of 99 households with 230 members were recruited in the rural and small town canton of Thurgau; they reported their travel for a continuous six-week period, using six one-week trip diaries (about 36,000 trips). The data were then coded and the field worker called respondents to clarify any omissions, particularly omitted or unclear addresses. (Address information quality was a priority for everyone involved in the survey.)

The geocoding was undertaken (Machguth and Löchl 2004) after the completion of the field work using the same type of databases employed for the geocoding of the Mikrozensus 2000 and adopting the same process. In contrast to the Mikrozensus, destinations abroad were coded to street block level in Germany and to municipality level elsewhere.

QUALITY OF GEOCODED LOCATIONS

In the preceding section, we asked what level of quality could be achieved for such large-scale exercises when they rely primarily on automatic matching steps. The quality of geocodes can be evaluated by how precisely addresses can be pinpointed. In the Norwegian study, quality was rated by quantifying the number of wards to which an address could belong. Table 1 gives details of the criteria for quality rankings. In nearly 90% of the cases, it was possible to locate the address within one ward. However, address locations for both ends of the trip were possible in only about 80% of the cases, raising problems later with distance calculations (table 2). Trip purpose, mode, and area were investigated for impacts on accuracy. The first two were not significant, but the type of area, predictably, had an impact. Better databases for larger urban areas substantially improved quality, particularly when the wards considered are smaller in these areas.

The matching quality of data on location in Mikrozensus 2000 needed to be examined individually for each stage, as these were the basic units of the data collection. Varying quality of underlying databases produces differences. Because some addresses were available only with street names, and in most cases only as municipalities, the collection of addresses differed for various areas during the survey. Table 3 details the quality ratings and table 4 shows the qualities available at the origins and destinations of the stages.

Matching was very precise for stages with stations on either end, relatively good for both bus and tram stops. When street addresses were available, coding was simple. However, in one-third of the cases, respondents could only recall the street, or only a street could be identified for the location. The municipalities were matched precisely. Note that cases rated C2, which refers to locations for available street addresses, were so incomplete that matching could only be achieved at the municipal level. Slightly more than 70% of the stages could be matched at both ends to level 1 (including 14% municipality to municipality stages) and 85% to level 1 or 2, which is roughly comparable to the Norwegian results. Considering that the average Swiss municipality has only about 2,500 inhabitants, and given that the Mikrozensus was mostly conducted without considering geocoding of locations, this result is quite good.

The geocoding quality rating for the 2003 Thurgau survey followed the Mikrozensus example, but was supplemented by a new type of coding that translated the previous codes into a more comprehensible metric (table 5). The code "<100m" understates the accuracy, because it covers mainly exactly coded street addresses. The quality of the geocoding is very high, reflecting the attention given to it during the survey process. With 60% of trips captured within 100 m of their true origins and destinations, this brings us very close to ideal conditions for the distance estimation.

DIFFERENCES BETWEEN DISTANCE ESTIMATES

Swiss and Norwegian data allow comparison of network estimates against reported distances, as well as against each other. This section focuses on the comparison between the various network estimates discussed above.

In a first step for Mikrozensus 2000, the stage-based information was used to geocode the trips. The best available geocode was attached to the start of the first stage and the destination of the last stage (table 6). The main mode of the trip was determined, as is usual in this situation, by an a priori ranking of the modes involved, in which the various public transport modes have priority before private motorized vehicles and slow modes. Further analysis in this section is restricted to car driver and passenger trips, as no detailed walking and cycling network information was available.
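The aggregation of stages into trips described above can be sketched as follows. The mode labels and the exact order within the ranking are hypothetical; the paper states only that public transport modes rank before private motorized vehicles and slow modes.

```python
# Hypothetical a priori ranking, highest priority first.
MODE_PRIORITY = [
    "rail", "tram", "bus",            # public transport
    "car_driver", "car_passenger",    # private motorized vehicles
    "bicycle", "walk",                # slow modes
]
_RANK = {mode: i for i, mode in enumerate(MODE_PRIORITY)}


def main_mode(stage_modes):
    """Return the highest-ranked mode among the stages of one trip."""
    return min(stage_modes, key=lambda mode: _RANK[mode])


def trip_ends(stages):
    """Return (origin of first stage, destination of last stage).

    Each stage is assumed to be a dict with 'origin' and 'dest' geocodes.
    """
    return stages[0]["origin"], stages[-1]["dest"]
```

A trip made up of a walk, a bus ride, and another walk would be classified as a bus trip under this ranking.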

Network distance calculations were performed using a national assignment model available at the Institute for Transport Planning and Systems (Vrtic et al. 2003; Vrtic and Axhausen 2004), which represents Switzerland with 3,066 zones, 14,798 nodes, and 19,664 links. The associated origin-destination matrix of average annual weekday flows was calibrated for the year 2001. The geocode for a postal code is the geocode of the associated post office's address. As a municipality is normally the same as a postal code area and a zone in the national network model, this address was also used to describe the center of gravity of the zones. The distance between the network and the center of gravity, that is, the length of the centroid connector, was set to the Euclidean distance between the relevant node and the centroid.

Crow-fly distances were calculated as Euclidean distances between the origin and destination of the trip, at the precision available. For network-based calculations, each trip end was associated with the relevant zone and, therefore, its zonal centroid. Distances were calculated using VISUM 8.0 (PTV AG 2002) for about 3,000 zones with an average of 2,500 residents. Shortest distance path distances included lengths of centroid connectors at either end of trips. Shortest time path distances were calculated assuming free-flow speeds for links. User-equilibrium (UE) assignment distances were calculated as weighted average distances of paths used at equilibrium between any two locations. The matrix of average weekday traffic flows was assigned with the assumption that daily link capacities are 12 times hourly link capacities. We excluded all trips inside a zone from further analysis, as they have, by definition, a distance of zero in network models, better interpreted as a missing value.
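The mean UE path distance for one origin-destination pair can be sketched as a flow-weighted average over the paths used at equilibrium. The text above does not name the weights; path flows are assumed here as the natural choice, and the function names are hypothetical.

```python
def mean_ue_distance(paths):
    """Flow-weighted mean distance of the paths used between one OD pair.

    paths: iterable of (distance_km, flow) tuples. Using path flows as
    weights is an assumption of this sketch.
    """
    paths = list(paths)
    total_flow = sum(flow for _, flow in paths)
    if total_flow == 0:
        return None  # no traffic assigned to this OD pair
    return sum(dist * flow for dist, flow in paths) / total_flow


def daily_capacity(hourly_capacity):
    """Daily link capacity, taken as 12 times the hourly capacity."""
    return 12 * hourly_capacity
```

For example, two equilibrium paths of 10 km and 12 km carrying 300 and 100 vehicles give a mean UE distance of 10.5 km.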

A comparison of distance distributions (table 7 and figure 1) highlights the differences between the three sources of information. The largest share of crow-fly distance trips lies in the one to five kilometer band. The mean crow-fly distance in this band is substantially smaller than the mean distances in all other bands. Network distance distributions are similar, but, as one would expect, shortest time path and mean UE assignment path distances are slightly longer. This effect is pronounced for longer distances, where routings via roads with higher speeds start to pay off. Alpine topography, including the many large lakes in the foothills of the Alps, explains the large differences in the shares of trips over 100 kilometers distance vs. crow-fly distances. Mean reported distance lies between the shortest distance path and shortest time path estimate. Given that neither of the two network-based estimates reflects actual behavior fully, this mean value is a credible estimate for all trips.

In many cases, it is useful to convert one distance estimate to another. Such conversions, using the mean ratios of the relevant estimates, often called detour factors, have been reported previously but only for certain pairs of distance estimates (e.g., by Qureshi et al. 2002). Table 8 provides six comparisons for Mikrozensus 2000 based on the estimates described above. A clear difference can be observed in detour factor change patterns. Calculations are based on all observations in the sample, even if crow-fly distances were longer than model-based estimates. This can happen, especially for shorter trips, when the distance between zonal centroids is smaller than the actual distance traveled (see above). Detour factors fall as crow-fly distances become longer. While they are well above the square root of two (a factor of the Manhattan metric for short distances), they are also much smaller for longer distances. Factors for the three network distances are, for practical purposes, identical for the shortest distance band, but diverge after this, reflecting different objective functions behind their calculation.

The pattern is reversed for shortest distance paths detour factors, where the factors grow as shortest path distances increase. This is predictable, as the chance to use a faster, but longer route via the less-crowded high-capacity network increases with trip length.
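A detour factor table of the kind discussed above can be computed as the mean ratio of two distance estimates within each distance band. The sketch below bands trips by crow-fly distance; the band edges and function names are hypothetical and do not reproduce the bands of table 8.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean

BAND_EDGES_KM = [1, 5, 10, 25, 50, 100]  # hypothetical band limits


def band_label(distance_km):
    """Return the label of the distance band containing distance_km."""
    i = bisect_right(BAND_EDGES_KM, distance_km)
    if i == 0:
        return f"<{BAND_EDGES_KM[0]}"
    if i == len(BAND_EDGES_KM):
        return f">={BAND_EDGES_KM[-1]}"
    return f"{BAND_EDGES_KM[i - 1]}-{BAND_EDGES_KM[i]}"


def detour_factors(trips):
    """Mean ratio of network to crow-fly distance by crow-fly distance band.

    trips: iterable of (crow_fly_km, network_km) tuples.
    """
    ratios = defaultdict(list)
    for crow_fly_km, network_km in trips:
        if crow_fly_km > 0:  # a ratio is undefined for zero crow-fly distance
            ratios[band_label(crow_fly_km)].append(network_km / crow_fly_km)
    return {band: mean(values) for band, values in ratios.items()}
```

Banding by a different estimate, such as the shortest distance path distance, only requires passing that estimate as the first element of each tuple.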

In the 2003 Thurgau survey, the distances (shortest distance path and shortest time path) were calculated using the high-resolution Vektor 25 network of the Swiss ordnance survey, employing the geocodes described above. This allowed the inclusion of all trips, except for cases where respondents returned to the same address after a walk or drive. The patterns revealed in table 9 are similar to those discussed for the Mikrozensus 2000, but their levels are markedly lower for crow-fly distance ratios, reflecting the finer network employed and the absence of centroid connectors.

Distance estimate comparisons for the Norwegian data are possible only for shortest time path distance at this time. However, results confirm the pattern revealed by the Mikrozensus data; the detour factor is significantly larger in the shortest distance band (table 10). The national-level planning network data were provided by the Norwegian highway authority and the path calculation included travel times, distances, and various bridge and ferry tolls.

Figure 2 illustrates the results for the shortest time path distances. The ratio level seems to depend on resolution of the networks used. The national-level planning networks used for the Mikrozensus 2000 and 2001 NPTS produced larger ratios than the finer network used for the 2003 Thurgau survey. This is especially obvious for the shorter distance bands, while differences start to disappear over long distances.

REPORTED AND ESTIMATED DISTANCES

Unknown errors in the differences between the true length of a trip and the reported length have led modelers to avoid the use of travelers' reported distance estimates whenever possible. In particular, when estimating choice models, the consistent errors of network models are preferable to travelers' unknown, idiosyncratic errors. But, in many cases, neither full traces nor geocodes nor network models are available. Thus, the quality of reported distances is important, especially if the differences were to cancel out for averages or other sample summaries.

One partial way to assess reported distance quality is to compare it with the shortest distance path distance derived from a network model. Such a comparison must be partial, as one cannot know if the traveler deviated from the predicted path. If the distance estimates for the model are zone-based, their measurement uncertainties due to the differences between interzonal distances can be assessed and compared with the distances between addresses.

In the 2001 NPTS, geocodes refer to statistical wards of differing size. To determine measurement uncertainty, mean distances between all ward addresses and their respective centroid were calculated for each ward (for details, see Denstadli and Engebretsen 2004). To avoid large measuring uncertainties, in the later calculations we eliminated trips to and from wards with a mean distance of more than 1.0 kilometers between addresses and the centroid. In addition, trips with obvious geocoding errors and trips where the measurement uncertainty for either statistical ward was larger than one-quarter of the network distance estimate were removed. Finally, we omitted trips that started and ended in the same ward.
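The exclusion rules listed above can be written as a single filter. This is a sketch under one reading of the text: the measurement uncertainty of a ward is taken to be its mean address-to-centroid distance, and obvious geocoding errors are assumed to have been flagged beforehand. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    origin_ward: str
    dest_ward: str
    network_km: float             # network distance estimate
    origin_uncertainty_km: float  # mean address-to-centroid distance, origin ward
    dest_uncertainty_km: float    # mean address-to-centroid distance, destination ward
    geocoding_error: bool = False # flagged in an earlier, manual check


def keep_trip(trip, max_ward_km=1.0, max_share=0.25):
    """Return True if the trip passes all exclusion rules."""
    if trip.geocoding_error:
        return False
    if trip.origin_ward == trip.dest_ward:
        return False  # trip starts and ends in the same ward
    worst = max(trip.origin_uncertainty_km, trip.dest_uncertainty_km)
    if worst > max_ward_km:
        return False  # ward too large for a meaningful centroid
    if worst > max_share * trip.network_km:
        return False  # uncertainty exceeds one-quarter of the estimate
    return True
```

The two thresholds are exposed as parameters so that the sensitivity of later results to these cut-offs can be checked.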

Table 11 shows the resulting relative deviations by distance band for all car driver and passenger trips below 100 kilometers, which applied to the vast majority of all such trips. The measurement uncertainty is nearly independent of trip distance and fairly small, with a mean of about 0.6 kilometers. The overall deviation decreased with distance, and the shares of trips in the various deviation bands shifted accordingly. The share of distance estimates within the measuring uncertainty was greatest for the lowest distance band. This share went down as distance rose, with a nearly matching increase in the below 5% deviation band. About 45% of trips were estimated within 10% of the shortest time path distance. Additional analysis revealed minor differences between various trip purposes, young and middle-aged people, sexes, and urban and rural areas.

Deviations in reported distances are due not only to respondent errors, but may also be caused by interviewer misinterpretation, recording errors, or routes with freely chosen detours. We expect deviations of this kind to be more random. Note that a consistent share of deviations are in excess of 50%. Plots of reported distances against distances from the network model show that, except for some outliers, distance estimates correlate highly. Omitting the outliers, we can conclude that deviations seem randomly and asymptotically normally distributed (for details, see Denstadli and Engebretsen 2004), with the result that the mean detour factor is close to 1.0 across all distance bands (table 12).

Repeating this analysis for the 2000 Mikrozensus and 2003 Thurgau data (table 13 and table 14) also reveals a similar pattern for public transport trips. Mean detour factors are dominated by outliers over short distances. Over longer distances, the median converges quickly to 1.0 for car trips and to 1.1 for longer public transport trips. The factor drops below 1.0 for longer car trips and to about 1.2 for public transport trips. To obtain a credible estimate of distance traveled, this pattern requires adjustment of reported distances by distance band. The poorer estimates for public transport reflect the longer routing of public transport services, a lack of active navigation by the traveler, and slow access and egress to the station or stop.

The pattern is also visible in Thurgau 2003, but not as clearly. It is obvious that the very large detour factors for short distances in Mikrozensus 2000 data are a product of omitted intrazonal trips. The very low reported distances in the longer distance band are due to the omission of hiking and cycling paths in the network model used; these can be crucial in hilly terrain. It should be noted that the speed assumptions chosen for shortest time paths were overly optimistic resulting in reported travel time underestimates of about one-third. This is far too much, even allowing for biases inherent in reported travel times. One would assume that this would lead to longer-than-realistic distances for longer trips.
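The contrast between mean and median detour factors described for the Mikrozensus 2000 and Thurgau 2003 data can be illustrated with a short sketch. The function name and the sample values in the usage note are invented for illustration and are not survey data.

```python
from statistics import mean, median


def summarize_ratios(pairs):
    """Return (mean, median) of reported-to-estimated distance ratios.

    pairs: iterable of (reported_km, estimated_km) tuples; pairs with a
    zero estimated distance are skipped because the ratio is undefined.
    """
    ratios = [reported / estimated
              for reported, estimated in pairs if estimated > 0]
    if not ratios:
        return None
    return mean(ratios), median(ratios)
```

With three reports close to the estimate and one report ten times too long, for instance ratios of 1.0, 1.1, 0.9, and 10.0, the mean is pulled above 3 while the median stays near 1. This is why the median by distance band is the more robust summary for short trips.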

The pattern of change suggests a relationship with trip speed and mode. Based on the distance bands used above, this pattern is visible in figure 3. The same pattern, but without the outlier for the short interzonal distances, can be seen in the 2003 Thurgau data.

For the Mikrozensus 2000 data, which represent a more typical situation, the dependence of the detour factor on the reported speed was modeled using aggregate values for distance bands of 2 kilometers up to 50 kilometers and of 5 kilometers beyond that. Table 15 presents the best fitting model. (For an alternative approach, see Zmud and Wolf 2003.)

CONCLUSIONS AND FURTHER RESEARCH

The three questions raised at the beginning of this paper were:

  • What level of accuracy of geocoding of addresses can be obtained from travel diaries?
  • How big are the differences between various distance estimates?
  • What are the differences between reported distances and calculated distances?

The experiences reported here show that, in urban areas, it is possible to geocode almost all locations to within 100 meters of their true geocode, if the survey process emphasizes this aspect of the work. With even lower accuracy requirements, higher rates are possible. This carries forward to the joint accuracy of the trip length estimate, as the probability increases that both trip ends are well coded. It should be noted, though, that these rates require very good address databases, especially for firms, commercial outlets, common locations without street addresses, and public transport stations and stops. The last two categories require particular attention, as these addresses are often not available from either the relevant Census office or commercial providers. (In the case of Norway and Switzerland, it was possible to obtain relevant databases from public transport operators or the national government.) National public transport timetables include some geocoding information, but their station and stop names sometimes differ from local nomenclature.

A lower location rate for trips undertaken outside urban areas (noticeable in the 2001 NPTS, as well as other surveys) raises some concern. The low location rate is due to a lack of street names and identifiable landmarks like shops, churches, etc. It is important that the interviewer keep this in mind. If the respondent cannot provide an address or a landmark close by, the interviewer must make him or her describe the place in alternative ways, that is, by asking for distance and direction to the nearest lake or urban settlement, or any other marker that can help locate the trip.

We found large and systematic differences in network distance estimates, as expected. It is crucial that the modeler report the assumptions behind the estimates used. The 2003 Thurgau data show that speed assumptions behind the shortest time path distances can be crucial; detour factors provided here give a first impression of their size and pattern. However, they cannot be corroborated until the literature provides further estimates of their value. Still, the impact of network resolution is already visible in the results reported here.

Differences between reported and estimated distances can be very large for an individual trip. These errors do not cancel out for large samples. A systematic difference remains, but its pattern is predictable and depends on the trip distance. For longer trips, the medians of reported distances match the shortest distance path distances. Correcting for reported speed, there are no differences in detour factors between modes. The strong dependence on reported speed suggests a reasonable way to correct estimates.

Although we do not recommend using self-reported information as the only data for travel distances, self-reported distances are useful when assessing the quality of geocoding. Large deviations between two distance measures may indicate that the error lies in an incorrectly located start or end point and not the respondent's stated travel distance. There may also be errors in digital road data or logical defects in models determining the route (and consequently the distance). In addition, as long as objective measurements relate only to distances between zones (e.g., statistical wards), self-reported distances represent valuable additional information on short trips and intra-zone trips.

Three surveys do not allow wide generalizations. Replication of this work is required to establish the robustness of the results presented here. Discrepancies due to different formulations of network models are especially important, as substantial variance in professional practice exists, which should be reduced to improve accuracy and consistency of the model results. This zeros in on the most important element missing for further research: a high-quality GPS dataset matched to an equally high-quality network model as the basis for detailed studies.

ACKNOWLEDGMENTS

The authors are grateful for the support of H. Machguth and J. Jermann during the geocoding of the Swiss data and for the support of Mr. M. Vrtic and Mr. T. Hamre, who provided the network distance estimates for the 2000 Mikrozensus and the 2001 NPTS, respectively. The results are our own and do not reflect the assessment of the owners of the datasets used.

REFERENCES

Axhausen, K.W. 2003. Definitions and Measurement Problems. Capturing Long Distance Travel. Edited by K.W. Axhausen, J.L. Madre, J.W. Polak, and P. Toint. Baldock, Hertfordshire, England: Research Science Press.

Axhausen, K.W., A. Zimmermann, S. Schönfelder, G. Rindsfüser, and T. Haupt. 2002. Observing the Rhythms of Daily Life: A Six-Week Travel Diary. Transportation 29(2):95–124.

Axhausen, K.W., J.L. Madre, J.W. Polak, and P. Toint, eds. 2003. Capturing Long Distance Travel. Baldock, Hertfordshire, England: Research Science Press.

Bovy, P.H.L. and E. Stern. 1990. Route Choice: Wayfinding in Transport Networks. Dordrecht, Netherlands: Kluwer.

Bundesamt für Raumentwicklung, Bundesamt für Statistik (BFS). 2001. Mobilität in der Schweiz, Ergebnisse des Mikrozensus 2000 zum Verkehrsverhalten, Bern und Neuenburg.

____. 2002. Mikrozensus Verkehrsverhalten 2000, Hintergrundbericht zu "Mobilität in der Schweiz", Bern und Neuenburg.

Denstadli, J.M. and Ø. Engebretsen. 2004. Testing the Accuracy of Self-Reported Geoinformation Travel Surveys, paper submitted to the Conference on Progress in Activity-Based Analysis, Maastricht, Netherlands, 28–31 May.

Denstadli, J.M. and R.J. Hjorthol. 2003. Testing the Accuracy of Collected Geoinformation in the Norwegian Personal Travel Survey: Experiences from a Pilot Study. Journal of Transport Geography 11(1):47–54.

Denstadli, J.M., R. Hjorthol, A. Rideng, and J.I. Lian. 2003. Travel Behaviour in Norway, TØI report, 637/2003. Oslo, Norway: Institute of Transport Economics.

Hackney, J., F. Marchal, and K.W. Axhausen. 2004. Monitoring a Road System's Level of Service: The Canton Zürich Floating Car Study 2003, paper presented at the 84th Annual Meetings of the Transportation Research Board, Washington, DC, January 2005.

Hubert, J.P. 2003. GIS-Based Enrichment. Capturing Long Distance Travel. Edited by K.W. Axhausen, J.L. Madre, J.W. Polak, and P. Toint. Baldock, Hertfordshire, England: Research Science Press.

Jermann, J. 2003. Geokodierung Mikrozensus 2000. Arbeitsbericht Verkehrs- und Raumplanung, 177. Zürich, Switzerland: Institute for Transport Planning.

Machguth, H. and M. Löchl. 2004. Geokodierung 6-Wochenbefragung Thurgau 2003. Arbeitsbericht Verkehrs- und Raumplanung, 219. Zürich, Switzerland: Institute for Transport Planning.

Marchal, F., J.K. Hackney, and K.W. Axhausen. Forthcoming. Efficient Map-Matching of Large GPS Data Sets: Tests on a Speed Monitoring Experiment in Zurich. Transportation Research Record.

Ortuzar, J. de D. and L.G. Willumsen. 2001. Modelling Transport. Chichester, England: Wiley.

Planung Transport Verkehr AG (PTV AG). 2002. User Manual VISUM 8.0. Karlsruhe, Germany.

Qureshi, M.A., H. Hwang, and S. Chin. 2002. Comparison of Distance Estimates for the Commodity Flow Survey Based on the Great Circle Distance Versus Network Based Distances. Transportation Research Record 1804:212–216.

Raghubir, P. and A. Krishna. 1996. As the Crow Flies: Bias in Consumers' Map-Based Distance Judgments. Journal of Consumer Research 23(1):26–39.

Resource Systems Group. 1999. Computer-Based Intelligent Travel Survey System: CASI/Internet Travel Diaries with Interactive Geo-Coding, report to the U.S. Department of Transportation.

Richardson, A.J., E.S. Ampt, and A.H. Meyburg. 1995. Survey Methods for Transport Planning. Melbourne, Australia: Eucalyptus Press.

Rietveld, P., B. Zwart, B. van Wee, and T. van den Hoorn. 1999. On the Relationship Between Travel Time and Travel Distance of Commuters: Reported Versus Network Travel Data in the Netherlands. The Annals of Regional Science 33(3):269–287.

Sheffi, Y. 1985. Urban Transportation Networks: Equilibrium Analysis with Mathematical Programming Methods. Englewood Cliffs, NJ: Prentice-Hall.

Vrtic, M. and K.W. Axhausen. 2004. Forecast Based on Different Data Types: A Before and After Study, paper presented at the 10th World Conference on Transport Research, Istanbul, Turkey, July.

Vrtic, M., P. Fröhlich, and K.W. Axhausen. 2003. Schweizerische Netzmodelle für Strassen- und Schienenverkehr. Jahrbuch 2002/2003 Schweizerische Verkehrswirtschaft. Edited by T. Bieger, C. Laesser, and R. Maggi. St. Gallen, Switzerland.

Wolf, J., M. Oliveira, and M. Thompson. 2003. The Impact of Trip Underreporting on VMT and Travel Time Estimates: Preliminary Findings from the California Statewide Household Travel Survey GPS Study, paper presented at the 83rd Annual Meetings of the Transportation Research Board, Washington, DC, January.

Zmud, J. and J. Wolf. 2003. Identifying the Correlates of Trip Misreporting: Results from the California Statewide Household Travel Survey GPS Study, paper presented at the 10th International Conference on Travel Behaviour Research, Lucerne, Switzerland, August.

END NOTES

1. A stage is the movement with one mode; a trip is the sequence of stages between two activities; a journey is a sequence of trips starting and ending at the current residence of the traveler, generally the home (Axhausen 2003).

2. The Mikrozensus deliberately omitted many stages, in particular those under 100 meters; these omissions were exacerbated by interviewer error.

ADDRESSES FOR CORRESPONDENCE

1 V.S. Chalasani, Institute for Transport Planning (IVT), ETH Zürich, 8093 Zürich, Switzerland. E-mail: chalasani@ivt.baug.ethz.ch

2 J.M. Denstadli, Institute of Transport Economics (TØI), P.O. Box 6110, Etterstad, 0480 Oslo, Norway. E-mail: jmd@toi.no

3 Ø. Engebretsen, Institute of Transport Economics (TØI), P.O. Box 6110, Etterstad, 0480 Oslo, Norway. E-mail: oen@toi.no

4 Corresponding author: K.W. Axhausen, Institute for Transport Planning (IVT), ETH Zürich, 8093 Zürich, Switzerland. E-mail: axhausen@ivt.baug.ethz.ch