## Introduction

The Commodity Flow Survey (CFS) is the largest effort to identify where and how goods are shipped in the United States. It measures the value and weight of commodities shipped primarily by manufacturing, mining, and wholesale trade.^{1}

The 2002 CFS was undertaken through a partnership between the Bureau of Transportation Statistics (BTS) of the Research and Innovative Technology Administration (RITA) in the U.S. Department of Transportation and the Census Bureau in the U.S. Department of Commerce. BTS provided funding and technical guidance. The Census Bureau collected quarterly data, as part of its Economic Census, from approximately 50,000 business establishments in 2002. From this sample of establishments, commodity flows were estimated for a universe of about 800,000 business establishments. The next CFS is scheduled for 2007.

### CFS Coverage and Limitations

The CFS covers employer establishments that are located in the 50 states and the District of Columbia. Surveyed establishments were selected by geographic location and industry. Each surveyed business reported on a sample of individual shipments made during a one-week period in each quarter of 2002. (See appendix C for a description of the survey methodology and sample design.) CFS data on individual shipments include total value and weight, commodity type, modes of transport, and domestic origin and destination. The CFS also reports on whether the commodity is a hazardous material.

The 2002 CFS did not cover shipments of crude petroleum, an omission that primarily affects data for pipeline and water transportation. The survey also excludes establishments classified in the North American Industry Classification System as farms, forestry, fisheries, oil and gas extraction, governments, construction, transportation, households, foreign establishments, and some retail and service businesses. Furthermore, the CFS does not cover shipments originating in Puerto Rico and other U.S. territories and possessions. Commodities that are shipped from a foreign location to another foreign destination through the United States (e.g., from Canada to Mexico) are also excluded from the survey.

### Reliability of the Estimates and Interpreting Confidence Intervals

Because the CFS data are estimates based on a sample survey, they are subject to sampling error. This section of the report provides 90-percent confidence intervals for the estimates in tables 1, 2, 3, and 4. A confidence interval is a range around a given estimate. It has a specified probability of containing the average of the estimates from all possible samples drawn from the same sampling frame under the same survey conditions.

The coefficients of variation (CVs) of the estimates in tables 5a to 10a are provided in tables 5b to 10b. The CV of an estimate is the standard error of the estimate divided by the estimate, and it measures relative sampling variability. The CV and standard error associated with an estimate can be used to construct a confidence interval.
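As a rough illustration of how a CV can be turned into a confidence interval, the sketch below assumes that the CV is expressed as a percentage and that a normal approximation applies, so that the 90-percent critical value is about 1.645. The function name and the input figures are hypothetical and are not taken from the CFS tables.

```python
# Critical value for a two-sided 90% interval under a normal approximation
# (an assumption of this sketch, not a statement of the CFS methodology).
Z_90 = 1.645


def confidence_interval(estimate, cv_percent, z=Z_90):
    """Return (low, high) bounds for an estimate given its CV in percent."""
    # CV = standard error / estimate, so standard error = estimate * CV.
    standard_error = estimate * cv_percent / 100.0
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical inputs: an estimate of 100 (any unit) with a CV of 5 percent.
low, high = confidence_interval(100.0, 5.0)
print(f"90% confidence interval: {low:.3f} to {high:.3f}")
```

A larger CV produces a proportionally wider interval around the same estimate.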

### How should confidence intervals be interpreted?

Confidence intervals can help in assessing the reliability of the estimates and in making comparisons between different geographic areas, commodities, and modes of transportation. For example, in table 1, look at the value of shipments originating in Alabama in 2002 ($128 billion) along with the 90-percent confidence interval around that estimate ($115 billion to $140 billion). This means we can be 90 percent sure that the true 2002 value of freight shipments originating in Alabama is in the range of $115 billion to $140 billion. More precisely, if we generate many confidence intervals from similarly designed surveys, we can expect that the true value will be contained in the intervals 90 percent of the time.
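The repeated-survey reading of a confidence interval can be checked with a small simulation. This is only a sketch: the true value, the standard error, and the assumption that estimates are normally distributed around the true value are all hypothetical choices made for illustration, not CFS results.

```python
import random

Z_90 = 1.645           # two-sided 90% critical value (normal approximation)
TRUE_VALUE = 128.0     # hypothetical true value, in billions of dollars
STANDARD_ERROR = 7.6   # hypothetical standard error, in billions of dollars
N_SURVEYS = 10_000     # number of simulated repeated surveys


def simulate_coverage(seed=0):
    """Return the share of simulated 90% intervals that contain TRUE_VALUE."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(N_SURVEYS):
        # Each simulated survey yields one estimate scattered around the truth.
        estimate = rng.gauss(TRUE_VALUE, STANDARD_ERROR)
        low = estimate - Z_90 * STANDARD_ERROR
        high = estimate + Z_90 * STANDARD_ERROR
        if low <= TRUE_VALUE <= high:
            hits += 1
    return hits / N_SURVEYS


coverage = simulate_coverage()
print(f"Share of intervals containing the true value: {coverage:.3f}")
```

Under these assumptions the printed share should come out close to 0.90, which is what "90-percent confidence" refers to.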

In other words, the likelihood of obtaining a result of $128 billion for Alabama, assuming the true value is not within the confidence interval of $115 billion to $140 billion, is 10 percent or less. That is, one will not get the $128 billion estimate very often if the true value is outside the interval. Another simple way to think of this is to say: the true value of shipments originating in Alabama is **most likely** between $115 billion and $140 billion. It is **less likely** to be outside of this interval.

We can also determine whether the value of freight shipments from Alabama ($128 billion) is significantly different from, for example, the value of shipments from Alaska ($8 billion) or Arizona ($111 billion). Comparing the confidence interval for Alaska ($6 billion to $10 billion) with the interval for Alabama ($115 billion to $140 billion), we do not see an overlap and can say that the estimates for the two states are significantly different. The same comparison can be made for Arizona using its confidence interval in table 1.
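The overlap comparison described above can be sketched in a few lines. The interval bounds for Alabama and Alaska are the ones quoted in the text (in billions of dollars); the function name is hypothetical.

```python
def intervals_overlap(a, b):
    """Return True if closed intervals a=(low, high) and b=(low, high) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


# 90% confidence intervals quoted in the text, in billions of dollars.
alabama = (115.0, 140.0)
alaska = (6.0, 10.0)

print(intervals_overlap(alabama, alaska))  # False: the intervals do not overlap
```

Non-overlap is a conservative criterion: two intervals that do overlap slightly may still correspond to a statistically significant difference, so an overlap alone should not be read as proof that two estimates are the same.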

Note that the wider a confidence interval, the less precise the estimate. Precision depends on sample size and sample variability. All else being equal, the larger the sample size of a survey, the narrower the confidence interval and the more precise the estimate. In the Alabama example, a similar survey with a much larger sample size may yield a narrower confidence interval for the $128 billion estimate.
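To give a sense of how sample size affects interval width, the sketch below assumes simple random sampling, where the standard error of a mean is the standard deviation divided by the square root of the sample size. The CFS uses a more complex sample design (see appendix C), so this shows only the general tendency; the function name and all numbers are hypothetical.

```python
import math

Z_90 = 1.645  # two-sided 90% critical value (normal approximation)


def interval_width(std_dev, sample_size, z=Z_90):
    """Width of a 90% interval for a mean under simple random sampling."""
    standard_error = std_dev / math.sqrt(sample_size)
    return 2 * z * standard_error


# Hypothetical population spread and sample sizes.
wide = interval_width(std_dev=50.0, sample_size=2_500)
narrow = interval_width(std_dev=50.0, sample_size=4 * 2_500)
print(f"Width ratio after quadrupling the sample: {wide / narrow:.2f}")
```

Under this simplified model, quadrupling the sample size halves the width of the interval.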

^{1} The 2002 survey was preceded by the 1993 and 1997 surveys. As detailed in appendix A, the industry coverage of the 2002 survey differs from that of the 1993 and 1997 surveys because of a change from the 1987 Standard Industrial Classification to the 1997 North American Industry Classification System and other survey improvements.