Development of a Random Sampling Procedure for Local Road Traffic Count Locations

Sarah T. Bowling*
JWA/HMB Indiana, LLC
Lisa Aultman-Hall*
University of Connecticut

ABSTRACT

Traffic counting programs traditionally designed for traffic management require some re-thinking in order to provide accurate estimates of daily vehicle-miles traveled (DVMT) by road class for air quality planning. Predicting DVMT usually involves traffic counting at random points along highways, but local/minor roads, despite their extensive mileage, are not routinely counted.

We present a procedure to determine random count locations on functionally local roads. We used a geographic information system (GIS)-generated grid to cut roads into point-like sections from which we drew a random sample. The advantages of this procedure are that it overcomes GIS local road database limitations, uses standard GIS functions, and generates output that can be directly mapped for field crews. Cutting roads into various sizes and shapes introduced some bias during this process. A weighting procedure based on 750 local road counts in Kentucky measured the effect of the bias, which was deemed minimal; the weighting is therefore not needed in application. Our experience using the sampling procedures allows us to recommend grid sizes that take into account computer processing time and file size limitations while limiting bias and ensuring acceptable randomness.

INTRODUCTION

Traditionally, transportation agencies conduct routine traffic volume counts on higher volume highway corridors. However, local roads are also important and unique, because they account for a considerable amount of total roadway mileage. For example, local roads make up 67% of the total roadway mileage in Kentucky, the study area for this project (CKTC 1997). Because traffic counts are typically conducted on local roads only for events such as road improvement projects and specific developments, the counts are not random and thus cannot provide accurate estimates of total travel on this class of road.

In September 1998, the need for estimates of the overall travel on local roads was bolstered by a U.S. Environmental Protection Agency (EPA) mandate requiring 22 states and the District of Columbia to submit state implementation plans relating to the transport of ozone across state lines (USEPA 1998). Oxides of nitrogen (NOx) form ozone, or smog, which can negatively affect the environment and human health (e.g., damaged vegetation, water quality deterioration, acid rain, and respiratory and heart disease). Sources of NOx emissions include motor vehicles and electric utilities. EPA requires state agencies to provide daily vehicle-miles traveled (DVMT) by land-use classification, road type, and vehicle type in order to estimate the amount of vehicle emissions produced on the county level.

DVMT is most commonly estimated from average 24-hour traffic counts at points along roads or a subset of roads. To obtain the DVMT, the traffic count is adjusted for daily and seasonal factors and then multiplied by the length of the road section. For example, if 1,000 vehicles a day travel a 2-mile section of road, the DVMT is estimated to be 2,000 vehicle-miles. Likewise, if there are a total of 100 miles of a particular road class in a county and the mean of a number of random traffic counts is 40,000 vehicles per day, then the countywide DVMT estimate is 4 million vehicle-miles for that class of roads. DVMT estimated from the existing nonrandom local road counts and total mileage would overestimate DVMT given that the more heavily traveled local roads are the ones more often counted.
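
The arithmetic in the two examples above can be sketched in a few lines of Python. The function names are hypothetical and chosen only for illustration, the counts are assumed to be already adjusted for daily and seasonal factors, and the three counts in the second call are made-up values whose mean is 40,000 vehicles per day.

```python
def section_dvmt(adjusted_count, length_miles):
    """DVMT for one road section: adjusted 24-hour count times section length."""
    return adjusted_count * length_miles


def class_dvmt(random_counts, total_miles):
    """DVMT for a road class: mean of the random counts times total mileage."""
    mean_count = sum(random_counts) / len(random_counts)
    return mean_count * total_miles


# 1,000 vehicles per day on a 2-mile section
print(section_dvmt(1000, 2))

# Made-up counts averaging 40,000 vehicles per day, on 100 miles of road
print(class_dvmt([35000, 40000, 45000], 100))
```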

When air quality, as opposed to traffic management, is the focus of DVMT and traffic count efforts, random locations must be chosen. One common source of random traffic counts is the Highway Performance Monitoring System (HPMS) established in 1978 by the Federal Highway Administration (FHWA). Data in the HPMS provide current statistics on the condition, use, operating characteristics, and performance of the nation's major highways. This travel information is routinely available for major highway systems, statewide and nationally, and is useful for the estimation of DVMT.

In Kentucky and elsewhere, HPMS data are used for estimating the total DVMT for the entire arterial and collector road systems, even though the sample is not completely random. In order to get the HPMS sample, each state had to break the arterial and collector routes into logical roadway sections. Rural section lengths were to range from 3 to 10 miles while attempting to ensure homogeneous traffic sections. Similarly, urban access-controlled facility sections were not to exceed five miles. All other urban sections were to be between one and three miles. A random sample was then taken from this total set of road sections, but did not include the National Highway System and major arterial roads that in theory have complete coverage (USDOT 2000). Two things made the sample nonrandom: the section lengths varied, and there were no instructions for selecting the point on the section at which to take the traffic count. Some agencies may have counted at the busiest point or at the center. Although some states count local roads as part of the HPMS, most do not.

It might seem easy to produce a spatially random sample by dividing the local roads into segments of a particular length (one-tenth of a mile is common for a number of purposes) and selecting a random sample from this database. However, local road geographic information systems (GIS) databases from which sample locations would be drawn are less developed than those for more major roadways. Given a one-tenth of a mile section, it would be necessary to attribute starting points, ending points, and mile point locations to every road segment in the database in order to produce maps of the count locations for field workers. Additionally, many local roads, especially in urban areas, are shorter than the segment length into which roads are normally divided. This makes discretizing the routes complicated. Roads shorter than the segment length would always be a single segment and would have a higher chance per unit length of being selected. If shorter roads have systematically higher or lower traffic volumes than longer roads, their greater likelihood of selection would bias the DVMT estimate.

It would be useful to have a method for selecting random points on the roads directly or graphically using simple random procedures rather than depending on weighted or proportional sampling. In this case, the spatial procedure is analogous to throwing a dart at a map blindfolded and counting at the road location that the dart hit. Moving from counting traffic on homogeneous traffic road sections to counting traffic at random points represents a fundamental change in philosophy and is consistent with the idea that traffic volume changes from point to point at driveways and intersections. Because of the variety of land uses on local roads, the nonhomogeneity of traffic is particularly problematic.

The objective of this study is to develop a GIS-based random sampling procedure to determine count locations as random points on functionally local roads. A total of 750 24-hour local road counts were taken during this study in order to evaluate the sample properties resulting from the procedure. The large sample allowed us to analyze the bias issues resulting from 1) the grid-based nature of the procedure, 2) the shorter length of some local roads, and 3) the various directions or curves of individual roads. In application, much smaller sample sizes are likely to be used.

The following section describes other efforts to estimate DVMT on local roads. The remainder of the paper describes the GIS grid-based procedure and the evaluation of the bias it creates. The results of the bias analysis are presented along with a description of a procedure to correct for the sampling bias. However, the sampling bias was considered small enough to recommend use of the straightforward sampling procedure without the more complicated bias correction procedure.

OTHER EFFORTS TO ESTIMATE LOCAL ROAD DVMT

Programs in several states estimate overall travel on local roads through random samples. For example, Tennessee takes counts on local roads for specific highway projects, railroad crossing studies, and intersection analysis, although the count locations are not typically selected randomly. Because of this, the Tennessee Department of Transportation (TDOT) sought other methods to get a random sample of count locations (Crouch et al. 2001). Their study analyzed a program that collects traffic count information for all bridges in the state with a span length of 24 feet or greater.

Crouch et al. (2001) proposed a method to measure the randomness of these bridge counts for DVMT estimation on rural local roads. The traffic counts at bridge locations were compared with a random sample of traffic counts at nonbridge locations on local roads in eight counties. The researchers developed the procedure used to collect the random sample for nonbridge locations. Each of the eight counties was divided into four-square-mile grids (the width and length were two miles), and a process of repeated systematic sampling was used.

First, the grids throughout the county were sampled. Then, within each grid, the location of the actual count was chosen by randomly selecting x and y coordinates. Each grid cell consisted of a 10 by 10 matrix. From the randomly selected coordinates, the closest local road location was selected, and at this location, a traffic count was collected by TDOT. This is indeed a random procedure with one possible bias: shorter roads may be less likely to be closest to the 0.2 mile by 0.2 mile grid selected. When working with a large number of counties, the process could be labor intensive and time consuming. Using the random counts generated in this manner, the researchers found the bridge counts were not a representative sample of all rural local roads in each county.

In a California study (Niemeier et al. 1999), vehicle-miles traveled on dead-end unpaved roads were estimated from a random sample. Traffic counts were collected at random unpaved local road access points to paved roads. Because counting was conducted at the access points to prevent trespassing on the private roads, researchers did not have to deal with the issue of selecting the point along a road and, thus, a random sample of whole roads was taken. The count locations were mapped using GIS so the sites could be easily found. The count provided an estimate of the number of trips generated on the unpaved road, which was converted into DVMT by assuming there was a single destination on the road and each vehicle entering or exiting the road traveled half the length of the segment. However, the assumption that the vehicle is traveling to or from the midpoint of the road may cause the DVMT to be incorrectly estimated. For example, dead-end, unpaved local roads could have one origin/destination point at the end of the road. This method is random, but it is only suitable for local roads that dead end and have very few origin/destination points.

As part of this research study, an email survey of 45 states was conducted using contact names provided by the FHWA division office. The 29 replies indicated various methods for obtaining local road volume counts and sample locations. In Oregon, locations are picked from a select group of local roads that a computer program indicates are undersampled. The most recent counts from the local roads that are frequently sampled are then added to the counts of the sampled roads. The total sample may be nonrandom because the frequently sampled local roads are usually selected based on where road improvement projects will be located, developments built, or traffic problems exist. These are historically the more highly traveled areas. The random sample of the undersampled road segments is built by aggregating the full dataset as if it were one continuous road. Microsoft Excel then randomly picks a mile point along the road segments, and each pick becomes a location for a traffic count. The urban sample segments are 0.1 miles in length, while the rural sample segments are 1 mile. The count is taken at the center of the segment.

Other states provided less detailed input in the email survey. Vermont, for instance, selects the most "important" local roads for the counts. This, of course, is not random. West Virginia does not sample roads that have an average daily traffic value of less than 50 vehicles per day. This nonrandom method would certainly cause the DVMT to be inflated if total road length were used for the estimate. In Wisconsin, local roads are counted for special reasons, such as a traffic problem or new development. Again, this is not a random sample and, therefore, the DVMT estimate for EPA purposes could be incorrect. Wisconsin proposed developing a random sample of locations on local roads, but costs were prohibitive.

Until recently, DVMT estimates were mainly used to determine if a road needed improvements or expansion. Now that EPA requires DVMT to predict total vehicle emissions for each county, accurate estimates are much more important. The formerly sufficient nonrandom sampling methods used by many states are no longer adequate. Clearly, the need exists for a random sampling procedure, one that is not extremely labor-intensive, for selecting count locations to be used in estimating the DVMT on all functionally local roads.

GIS GRID-BASED SAMPLING METHODOLOGY

Challenges of Finding a Methodology

Location and alignment information for roads in most jurisdictions is usually stored in GIS databases, and sampling from these databases is desirable. In addition, because maps are useful to direct field workers to count locations, it is logical to proceed with a GIS-based method. Roadways stored in a GIS are usually divided into segments (and, therefore, individual GIS features) at all intersections and many other points, some unsystematic.

In the road databases for the three Kentucky counties in this study, local road segments ranged in length from a few feet to 10 miles. ArcView, a Windows-based GIS produced by the Environmental Systems Research Institute (ESRI), has a built-in function that can select a random set of such features or, in this case, segments. However, a random sample taken from this form of road database would not be appropriate for several reasons. First, the exact location on the road must be chosen and more than one location on the same road segment must have the opportunity to be chosen. The reasoning for this is based on the nonuniform variation in traffic volume along a road segment, especially for longer local roads where different intersecting roads and land uses affect traffic levels. Another reason the sample could not be taken from this database is that short and long segments would have been weighted equally. If the sample were taken from the existing GIS line theme, the precise location on the selected segment would then have to be subsequently chosen, and thus an individual point on a short segment would have a greater opportunity of being selected than a point on a longer segment. Therefore, equal weighting is not desirable.

There are other reasons why weighting is not a good method in our process. First, it adds two extra steps to the sampling procedure, which is intended to be straightforward. The length of each section would have to be determined to be used as weights. This might require GIS spatial analysis with poorer quality GIS databases. After segments were selected, another sampling procedure would be required to choose the random point along the given road segment. Second, weighted random sampling cannot be undertaken with built-in functions in most GIS programs, so data would have to be transferred between programs.

As discussed in the introduction, another logical approach to developing the random sample would involve picking a random mile point or distance measure along these roads and then mapping it for the people conducting the counts. Knowing the length of every local road in a particular county, a line or row in a spreadsheet program could represent each one-tenth of a mile section. Most spreadsheet programs are capable of taking a random sample from the whole set. However, once the sample is taken it is difficult to direct the people making the traffic counts to the count location. On local roads, there are typically no mile markers to indicate location as there are with more major or higher volume roads. Maps of count locations made in ArcView could solve this problem. However, limitations in the coding of local road databases present a further difficulty for this mapping.

Mapping a specific point on a road is very easy with GIS road databases with a feature called "dynamic segmentation." Using dynamic segmentation, every road segment has two "special" attributes. One indicates the beginning linear reference marker at the start of the segment and the second indicates the end reference. The GIS can then locate any mile point on the road segment based on this information. This allows the mile-point reference system to span across adjacent segments. For example, the system could span across an intersection. However, the available GIS databases for local roads rarely contain dynamic segmentation. Therefore, use of a sampling procedure that required start and end mile points to allow mapping would become very labor-intensive.

As an alternative to creating dynamic segmentation attributes in the database, each individual road segment (as opposed to the whole road) could be coded automatically with a starting mile point of zero and an ending mile point of its length. However, using discrete mile-point demarcations, such as one-tenth in the spreadsheet listing, and random sampling presents another problem for very short local roads, especially in urban areas. Selection of a random continuous number between zero and each segment's length would be necessary in a two-stage process like that used in Tennessee. In the first stage, a weighted random sample with replacement, with probability proportional to road segment length, would be taken. In the second stage, a point or points along the segment would be selected by random number generation. This procedure would require separate programming outside the GIS, and the results would necessitate subsequent transfer back into the GIS for mapping because mile points are not meaningful on a segment-by-segment basis or on local roads without field mile-point markers.
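
For comparison, the two-stage weighted alternative described above can be sketched as follows. This is a minimal illustration of the approach the paper sets aside, not code from the study: the function name, the segment identifiers, and the lengths are all hypothetical, and the sketch assumes segment lengths are already known.

```python
import random


def two_stage_sample(segment_lengths, n, seed=None):
    """Draw n (segment_id, mile_point) pairs.

    Stage 1: pick a segment with replacement, with probability
             proportional to its length.
    Stage 2: pick a uniform random mile point between zero and the
             length of the chosen segment.
    """
    rng = random.Random(seed)
    ids = list(segment_lengths)
    weights = [segment_lengths[i] for i in ids]
    chosen = rng.choices(ids, weights=weights, k=n)
    return [(seg, rng.uniform(0.0, segment_lengths[seg])) for seg in chosen]


# Hypothetical segment lengths in miles
lengths = {"A": 0.05, "B": 1.2, "C": 4.0}
for seg, mile_point in two_stage_sample(lengths, 5, seed=1):
    print(seg, round(mile_point, 3))
```

The output is a list of segment identifiers and mile points, which would still have to be carried back into the GIS and located on each segment before field maps could be produced.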

The new methodology proposed here is also two stage but uses standard built-in functions of the typical GIS: grid generation, database intersection, and random sampling from a feature table. The product is already a line feature in the database and is immediately mapped. In the first stage, a GIS grid is generated and used to cut road segments into sections. As the grid size becomes smaller, the sections become more point-like, enabling a new theme from which the random sample can be drawn using the direct built-in random sample command. This avoids the use of any weighting or resampling. The procedure ensures that the sample locations are spread randomly throughout the study area and that each point-like section along all roads has an equal chance of being in the sample regardless of the total length of the road.

Creating the Point-Like Sections for Three Study Areas

In this case, the primary GIS used was ArcView. We developed a procedure that cut the roads into small sections using a grid; thus, the shape and density of the local roads were considered potentially influential and guided the selection of study areas. Because it was not feasible to include all 120 Kentucky counties, we chose three counties for this study: Henderson, Pike, and Fayette. In total, the Kentucky Transportation Cabinet agreed to count up to 750 locations in these 3 counties for analysis of the sampling strategy. The counts were performed by a state contractor using "tube style" Peek Automatic Data Recorders (ADR-1000) between fall 1999 and spring 2000. Counts were taken for 24 to 48 hours and adjusted for season and day of the week using factors developed with historic counts by the Kentucky Transportation Cabinet. No axle counts or adjustments for heavy vehicles were undertaken. This large number of counts was not expected to be routine but was undertaken to address the issue of variability in local road volumes in order to design future counting programs.

The three counties chosen for the sampling procedure are very different from one another. Henderson County (440 square miles or 1,140 km²) is in the western part of the state where the flat plain topography results in gridlike roads (total of 601 miles or 968 km of local roads). It includes the small city of Henderson, which has a population of approximately 27,000. Pike County (788 square miles or 2,041 km²) is in the eastern, mountainous part of the state, has winding and curvy roads, and is considered a relatively rural county (total of 829 miles or 1,335 km of local roads). Fayette County (284 square miles or 736 km²), with a population of approximately 250,000, represents an urban county with a dense road network (total of 734 miles or 1,182 km of local roads). The separate GIS themes for state-maintained, county-maintained, and city-maintained local roads were combined for the three test counties to obtain three local road GIS databases. All GIS local road databases were developed and maintained by the Kentucky Transportation Cabinet.

Unfortunately, ArcView does not have the capability to create a grid (a set of adjacent polygon squares covering a certain area or extent), so grids were created in ArcInfo (a compatible ESRI GIS) by specifying the extent of the area and the grid size. These grids can be used directly in ArcView. Using the intersection function in ArcView, the grid acts as a "cookie cutter" on the roads; for example, the roads in the designated square in figure 1 are now in four separate pieces, or features. Each separate, tiny line feature in the output database has a record in the attribute table from which ArcView's sampling script draws the random sample. Note that the random point-like road segments are selected, not the squares. Therefore, there is no need to select the road segment within a given selected cell as was done in some past procedures.
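
The cut-then-sample idea can be sketched outside a GIS in plain Python. This is only a stand-in for the ArcInfo grid and the ArcView intersection and sampling functions, under simplifying assumptions: each road is a single straight segment, coordinates are in miles, and the function name, road names, and coordinates are hypothetical.

```python
import math
import random


def cut_by_grid(p0, p1, cell):
    """Cut the straight segment p0 -> p1 wherever it crosses a grid line.

    Returns a list of (start_point, end_point) pairs, one per piece.
    """
    (x0, y0), (x1, y1) = p0, p1
    ts = [0.0, 1.0]  # cut positions as fractions of the segment
    for a0, a1 in ((x0, x1), (y0, y1)):
        if a0 == a1:
            continue  # parallel to this family of grid lines
        lo, hi = min(a0, a1), max(a0, a1)
        k = math.floor(lo / cell) + 1
        while k * cell < hi:
            ts.append((k * cell - a0) / (a1 - a0))
            k += 1
    ts.sort()
    # Drop near-duplicate cuts (for example, a crossing at a grid corner)
    cuts = [ts[0]]
    for t in ts[1:]:
        if t - cuts[-1] > 1e-12:
            cuts.append(t)

    def point(t):
        return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))

    return [(point(a), point(b)) for a, b in zip(cuts, cuts[1:])]


# Hypothetical roads, cut with a 0.2 mile grid
roads = {
    "road_1": ((0.05, 0.05), (0.75, 0.05)),  # runs east-west
    "road_2": ((0.05, 0.15), (0.55, 0.65)),  # runs diagonally
}
pieces = [(name, piece)
          for name, (p0, p1) in roads.items()
          for piece in cut_by_grid(p0, p1, 0.2)]

# Simple random sample of pieces (not of grid squares)
rng = random.Random(0)
for name, (start, end) in rng.sample(pieces, 3):
    print(name, start, end)
```

Every piece is one record in a single list, so the draw is a simple random sample with no weighting step, and each sampled piece already carries the coordinates needed to map it.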

One obstacle of the grid approach is that some bias can be introduced by virtue of the point-like segments not being of equal length, as illustrated in figure 1. The grid used to cut the roads into small sections was orthogonal, so the roads were cut at different angles. As a result, some of the sections were considerably longer than others. If you have two roads of equal length, one cut into several short pieces and the other cut into a few long pieces, then the road cut into several short pieces will have a greater chance of being selected in the random sample. Given that the local road traffic volume was found to correlate with the original road segment length and also with differences in rural and urban areas, in order to avoid bias, the number of segments into which a particular road was divided would have to be directly proportional to the length of that road. This means that a road with twice the length of another road should be divided into twice the number of sections.
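
As a rough illustration of why orientation matters, consider a perfectly straight road of length $L$ lying at angle $\theta$ to the grid axes, with grid square size $g$ and a random position relative to the grid. These symbols and assumptions are introduced here for illustration and are not part of the study. Such a road crosses on average $L|\cos\theta|/g$ grid lines of one family and $L|\sin\theta|/g$ of the other, so the expected number of pieces is

```latex
E[N(\theta)] = 1 + \frac{L}{g}\left(|\cos\theta| + |\sin\theta|\right)
```

For a road parallel to the grid ($\theta = 0$) this gives $1 + L/g$; for a diagonal road ($\theta = 45^\circ$) it gives $1 + \sqrt{2}\,L/g$, or roughly 41% more pieces for the same length when $L$ is much larger than $g$. Real roads curve, so this is only an idealized bound on the effect.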

Our objective then is to determine the size of the largest grid square that brings an acceptably low bias to the sample. As the grid size approaches zero, the point-like sections approach true points of zero length, which present absolutely no bias. The smaller the grid square size, the more computer space and time are needed for the spatial analysis that cuts the road segments. The three counties were analyzed with 0.2 mile, 0.15 mile, 0.1 mile, and 0.05 mile grid square sizes. Although space issues needed to be considered (the grid for one county at the 0.05 mile size was 148 MB) in choosing the final grid square size, the computing time and ability of a personal computer to do the intersection (cutting) without crashing were certainly critical issues.

CONSIDERATION OF BIAS IN THE POINT-LIKE SECTIONS

In order to compare grid sizes and determine if the straightforward sampling procedure could be used without a more complicated weighting procedure to correct for the bias, it was necessary to develop a method to measure the bias that would be present in an average traffic count from a sample drawn using this process. Once the road segments were cut by the grid, the length of the original road section and the number of point-like segments into which it was divided were available for use in measuring bias. Figure 2 illustrates these data for one 0.2 mile grid in Pike County (lines and equations on this figure are described below).

The first of several indicators of bias considered was the coefficient on the x^2 variable in the equation for the best-fit quadratic curve. This curve is not represented on the figure but has the form

y = a + bx + cx^2

where a, b and c are parameter coefficients,

x is the original road segment length,

y is the number of segments into which the road is cut.

The value of the coefficient on the x^2 variable is an indication of the curvature of the line, and larger magnitudes of the coefficient would indicate greater bias. A negative value would indicate that the line curved downward, specifying that the longer roads were being cut into relatively fewer pieces and were therefore underrepresented in the sample. A positive value would denote the opposite: longer roads were overrepresented in the sample. The magnitude of the coefficient for the x^2 term also provides an indication of whether it is appropriate to proceed using a linear regression-based representation of the relationship between road length and number of point-like segments.
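
The quadratic fit can be reproduced without a spreadsheet. The sketch below is an ordinary least-squares fit solved through the normal equations with Cramer's rule; it is an illustration rather than the procedure used in the study, and the road lengths and piece counts are made-up values, not study data.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    s = [sum(x ** k for x in xs) for k in range(5)]  # sums of x^0 .. x^4
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Normal equations: m * [a, b, c] = t
    m = [[s[0], s[1], s[2]],
         [s[1], s[2], s[3]],
         [s[2], s[3], s[4]]]

    def det3(q):
        return (q[0][0] * (q[1][1] * q[2][2] - q[1][2] * q[2][1])
                - q[0][1] * (q[1][0] * q[2][2] - q[1][2] * q[2][0])
                + q[0][2] * (q[1][0] * q[2][1] - q[1][1] * q[2][0]))

    d = det3(m)
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = t[r]  # Cramer's rule: swap in the right-hand side
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)


# Made-up data: original road length (x) and number of pieces (y)
lengths = [0.1, 0.5, 1.0, 2.0, 4.0, 8.0]
n_pieces = [1, 3, 6, 11, 20, 37]
a, b, c = fit_quadratic(lengths, n_pieces)
print("x^2 coefficient:", c)
if c < 0:
    print("curve bends downward: longer roads cut into relatively fewer pieces")
```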

A bias analysis graph and equation such as that in figure 2 was generated for each county and grid size analyzed. The coefficients on the x^2 variable in the equation for the best-fit quadratic curve as generated by Microsoft Excel are shown in table 1. Within an individual county, the value of the coefficient varies. This alone is not insightful; it is the comparison between counties that provides useful information. The magnitude of the coefficient is substantially greater for Fayette County than it is for Henderson and Pike Counties, showing that the grid process works better for rural roads than for urban roads because rural roads are longer and less dense. We considered the low magnitude of these coefficients to be the justification to proceed with representing the relationship with a linear equation.

However, it is important to note that bias could still exist even in a linear relationship (x^2 coefficient = zero). Therefore, we undertook further consideration of the linear regression equation. One factor considered in measuring this bias was the y-intercept of the best-fit line. On one hand, this value would ideally seem to be zero, because a road of zero length should be divided into zero sections. However, a y-intercept of one would indicate that a road of very small length was divided into one section, meaning that very short roads will be automatically overrepresented in the sample. As evident in figure 2, some very short roads were divided into up to three or four segments. Table 1 shows that the y-intercept value did not vary significantly as the grid size changed. For all counties and grid sizes, the y-intercept hovered just above one, which is expected because very short segments would most often be cut into one piece or, at most, two pieces. This result illustrates that some bias will be present with all grid sizes given that short segments are overrepresented.

The line indicating no sampling bias due to road length would be expected to have a certain slope, referred to here as the "target slope." The target slope is obtained by dividing the total number of segments in a county by the total length of local roadway in that county. For example, if there are 5 million distance units of local road in a particular county, and a specific grid size cuts these roads into 7,000 segments, the segments should be on average 714.29 distance units (i.e., 5 million distance units/7,000 segments) long. The target slope is the inverse of this number (divided by 1,000 for the graph scale shown), and the line on figure 2 was derived by using this slope with a y-intercept of 1.
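
The target-slope example above works out as follows in Python. The function names are hypothetical, and the sketch leaves out the rescaling applied for the graph in figure 2.

```python
def target_slope(total_segments, total_length):
    """Segments per distance unit if road length introduced no bias."""
    return total_segments / total_length


def target_line(road_length, slope, intercept=1.0):
    """Expected number of segments for a road of the given length."""
    return intercept + slope * road_length


total_length = 5_000_000  # distance units of local road in the county
total_segments = 7_000    # point-like segments after cutting

print(round(total_length / total_segments, 2))  # mean segment length
slope = target_slope(total_segments, total_length)
print(slope)
print(target_line(2000, slope))  # expected pieces for a 2,000-unit road
```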

Comparison of the target slope to the actual slope first required consideration of the R^2 value. The R^2 values shown in table 1 indicate that both the sampling procedure and the weighting procedure described below, which is based on the linear slope, are better suited to non-urban areas. The variation in the number of segments decreases with the smaller grid square sizes, as expected. However, the relatively high overall R^2 values indicate that the best-fit line does indeed represent the data well, adding legitimacy to the comparison of the actual and target slopes described below.

Table 2 includes the target slope, the actual slope of the best-fit line, and the percentage difference between its slope and the target slope. The range included with the slope is the 95% confidence interval. The confidence interval was inspected for the inclusion of the target slope. None of the target slopes were included in the 95% confidence interval, indicating bias was present.

In each county the percent error between the target slope and the actual slope decreased as the grid square size approached zero, as expected. The target slopes are greater than the actual slopes, indicating that as road length increases the road becomes underrepresented in the sample. Fayette County had percent errors greater than those for the other two counties, again indicating that less dense roads are better suited to the grid process. Henderson County's grid-like roads had smaller errors than Pike County where roads are curvier. Therefore, it can be inferred that the grid procedure works best for grid-like roads and rural roads. The grid size is more crucial in urban areas.

In order to consider the impact of the bias due to road length and the grid procedure, weights were developed based on slope comparison; these weights were then applied to the traffic counts for these three counties. Counts were performed during calendar year 2000 at points selected using the 0.2 mile grid procedure (a worst-case scenario). The numbers of 24-hour counts performed in Henderson, Pike, and Fayette counties were 164, 243, and 337, respectively. These totals were designed so that the number of counts in each county was proportional to the length of local roads but also ensured a minimum number of rural and urban counts in each county (this constraint was imposed by the Transportation Cabinet). Counts were corrected for seasonal and weekly factors using constants developed in Kentucky based on counts on all functionally classed roads over many years.

The best-fit line and the target line were known for each county for the 0.2 mile grid size. In other words, for a road of a particular length, the number of segments into which it was divided and the number of segments into which it should have been divided were known. The weight was calculated as the ratio of the number of segments into which the road of a given length should have been divided if no bias by road length existed to the actual average number of segments into which the road was divided. This weight varied by road length as illustrated in figure 3 for Pike County for all grid sizes. We calculated a weighted average for the 24-hour traffic count, or average daily traffic (ADT), using the weights for the 0.2 mile grid size.
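
A minimal sketch of this weighting follows. It assumes the weighted ADT is the usual weighted mean (sum of weight times count, divided by the sum of weights), which the text does not spell out. The function names, slopes, intercepts, counts, and road lengths are all hypothetical.

```python
def length_weight(road_length, target_slope, fit_slope, fit_intercept,
                  target_intercept=1.0):
    """Ratio of target-line segments to best-fit-line segments."""
    expected = target_intercept + target_slope * road_length
    actual = fit_intercept + fit_slope * road_length
    return expected / actual


def weighted_adt(counts, road_lengths, **line_params):
    """Weighted mean of 24-hour counts (assumed form: sum(w*c) / sum(w))."""
    weights = [length_weight(length, **line_params) for length in road_lengths]
    return sum(w * c for w, c in zip(weights, counts)) / sum(weights)


# Hypothetical counts (vehicles per day) and original road lengths
counts = [100, 300]
road_lengths = [0, 1000]
params = dict(target_slope=0.002, fit_slope=0.001, fit_intercept=1.0)

print(sum(counts) / len(counts))                    # unweighted ADT
print(weighted_adt(counts, road_lengths, **params))  # weighted ADT
```

When the target slope exceeds the fitted slope, as the study reports, the weight rises with road length, so counts on longer roads are given more influence to offset their underrepresentation.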

Table 3 presents the sampled and weighted ADT and the resulting sampled and weighted DVMT estimates for local roads in each county based on the 0.2 mile grid process. The table demonstrates that without the weighted ADT, the DVMT for each county would be slightly overestimated, with the greatest difference in Fayette County. This is further evidence that the weighting procedure is more important for urban areas, although the difference is also a function of the greater number of shorter roads in those areas. However, the percentage difference due to the sampling bias is small and deemed acceptably low for modeling purposes for either the planning or air quality considerations described at the beginning of this paper. Based on the slope comparison, the bias would be even smaller with the smaller grid sizes. Undertaking the multistage weighting calculations is therefore not warranted when applying the procedure.
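The comparison in table 3 can be illustrated with a short sketch. It assumes that a county's DVMT estimate is the average ADT multiplied by the county's total local road mileage; the function names and all numeric values are hypothetical and are not taken from table 3.

```python
def dvmt(mean_adt: float, local_road_miles: float) -> float:
    """Daily vehicle-miles traveled, assumed here to be mean ADT times road mileage."""
    return mean_adt * local_road_miles


def percent_overestimate(sampled_adt: float, weighted_adt: float,
                         local_road_miles: float) -> float:
    """Percent by which the unweighted (sampled) DVMT exceeds the weighted DVMT."""
    sampled = dvmt(sampled_adt, local_road_miles)
    weighted = dvmt(weighted_adt, local_road_miles)
    return 100.0 * (sampled - weighted) / weighted


# Hypothetical county: 1,000 miles of local roads, sampled ADT 450, weighted ADT 428.
difference_pct = percent_overestimate(450.0, 428.0, 1000.0)
```

Under this assumption the road mileage cancels out of the percentage, so the relative DVMT difference equals the relative difference between the sampled and weighted ADT.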

CONCLUSIONS

In summary, we developed and validated a straightforward sampling procedure that will allow random sampling of traffic count locations on extensive local road systems. Because built-in GIS commands can be used, sampling does not require time-intensive processes and the results can be directly mapped for field use. The procedure offers a means to determine not only a random road but also the point along the road where counting should occur. Furthermore, the procedure can handle very short local roads without greatly biasing the sample.

The analysis presented here provides guidance for determining a recommended grid size for use in sampling that takes into account computer capabilities in terms of file size and processing time while ensuring acceptable randomness of sampling. Attempts to use grid sizes below 0.05 miles were not successful in ArcView for the study areas used. Although individuals should select a grid square size based on their computer processing capabilities and the characteristics of the roads in their study, these results indicate that a larger grid size can be used for rural roads and grid-like roads. The grid square size needs to be smaller for urban counties because of their dense, short roads. Because it is very difficult to work with the 0.05 mile grid square size, the 0.1 mile size is recommended for urban counties. The recommendation for rural counties is to use the smallest grid square size feasible, but a 0.2 mile size would be sufficient, especially if roads are in a grid-like pattern.

Although not directly related to the main topic of this paper, several observations can be made regarding traffic counts on local roads and the estimation of accurate countywide DVMT. The state of Kentucky undertook a significant number of 24- to 48-hour local road traffic counts for this project, which is a very unusual and expensive undertaking, particularly for local roads. A total of 3,800 counts were obtained (including the 750 used in this sample procedure research). The counts had extraordinarily high standard deviations (386 for 2,702 counts in rural areas and 1,323 for 1,099 counts in urban areas), suggesting that sample sizes beyond those realistically possible would be necessary to obtain average counts with reasonable confidence intervals. Further disaggregation of roads beyond simple use of the functional classification system will be necessary before any reasonable traffic data-collection plan can be undertaken by states for EPA travel estimations. For this reason, we recommend that the next stage of research be to apply the sampling procedure to higher functional class roads where it might decrease the total number of counts required. If tests were conducted on the National Highway System where the HPMS provides near universal coverage, valuable sample size recommendations might be possible.
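The sample size concern raised above can be made concrete with a rough calculation. The sketch below assumes a simple random sample and a normal approximation for the mean at the 95% confidence level; the standard deviations and count totals are those reported above, while the function names and the target margin of 50 vehicles per day are hypothetical choices for illustration.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% confidence level (assumed)


def ci_half_width(std_dev: float, n: int, z: float = Z_95) -> float:
    """Half-width of the confidence interval for a mean from n counts."""
    return z * std_dev / math.sqrt(n)


def required_sample_size(std_dev: float, margin: float, z: float = Z_95) -> int:
    """Smallest n whose confidence interval half-width does not exceed margin."""
    return math.ceil((z * std_dev / margin) ** 2)


# Statewide standard deviations and numbers of counts reported in the text.
rural_half_width = ci_half_width(386.0, 2702)
urban_half_width = ci_half_width(1323.0, 1099)

# Hypothetical target: mean ADT known to within 50 vehicles per day.
rural_n = required_sample_size(386.0, 50.0)
urban_n = required_sample_size(1323.0, 50.0)
```

Because the required number of counts grows with the square of the standard deviation, the urban figure is more than ten times the rural one for the same margin, and this requirement would apply to each area for which a separate estimate is wanted.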

ACKNOWLEDGMENTS

The authors would like to thank the Kentucky Transportation Cabinet for funding this research as well as members of the University of Kentucky Transportation Center and Wilbur Smith Associates for their contributions.

REFERENCES

Commonwealth of Kentucky Transportation Cabinet (CKTC), Division of Transportation Planning. 1997. Traffic Characteristics of Kentucky Highways. Frankfort, KY.

Crouch, J.A., W.L. Seaver, and A. Chatterjee. 2001. Estimation of Traffic Volumes on Rural Local Roads in Tennessee, presented at the Transportation Research Board 80th Annual Meeting, Washington, DC.

Niemeier, D., J. Morey, J. Franklin, T. Limanond, and K. Lakshminarayanan. 1999. An Exploratory Study: A New Methodology for Estimating Unpaved Road Miles and Vehicle Activity on Unpaved Roads, RR-99-2. Davis, CA: Institute of Transportation Studies.

U.S. Department of Transportation (USDOT), Federal Highway Administration. 2000. Highway Performance Monitoring System Field Manual, Chapter 7: Sample Selection and Maintenance. Available at http://www.fhwa.dot.gov/ohim/hpmsmanl/hpms.htm, as of May 28, 2003.

U.S. Environmental Protection Agency (USEPA), Office of Air Quality Planning and Standards. 1998. The Regional Transport of Ozone: New EPA Rulemaking on Nitrogen Oxide Emissions. Research Triangle Park, NC.

Address for Correspondence and End Notes

Authors' addresses: Sarah T. Bowling, JWA/HMB Indiana, LLC, 624 W. Main St., Suite 300, Louisville, KY 40202.

Corresponding author: Lisa Aultman-Hall, Associate Professor, Department of Civil and Environmental Engineering, University of Connecticut, Unit 2037, 261 Glenbrook Road, Storrs, CT 06269-2037. Email: aultman@engr.uconn.edu.

KEYWORDS: geographic information systems, sampling, traffic counting, vehicle-miles traveled.

1. In this paper, local roads are all public roads in the state of Kentucky classified as "functionally local" by the Kentucky Transportation Cabinet. These roads may be paved or unpaved but nearly all in this study area are paved. All local roads, regardless of the responsible jurisdiction, were included in this study (i.e., city- and county-maintained roads are included).

2. Unless otherwise noted, "random sample" refers to a simple random sample as opposed to any sampling technique involving weights or resampling.