Assessing the Impact of Speed-Limit Increases on Fatal Interstate Crashes - Discussion 3

Andrew Harvey*
Cambridge University

The paper by Balkin and Ord uses stochastic rather than deterministic trends to model the series on crashes. This is important since deterministic trends are rarely appropriate for economic and social time series, and their use can result in misleading inferences on the effects of interventions. One of the attractions of the structural time series modeling approach is that a deterministic trend emerges as a special case of a stochastic trend; this happens in equation (2) in Balkin and Ord when $\sigma_{\eta}^{2}$ is zero. The hypothesis that $\sigma_{\eta}^{2}$ is zero can be tested formally using the procedure of Kwiatkowski, Phillips, Schmidt, and Shin (1992). The amendments needed to allow for the effects of intervention variables are discussed in Busetti and Harvey (2001). This test has not yet been implemented in the STAMP package of Koopman et al. (2000), which Balkin and Ord use to carry out calculations. However, evidence for the suitability of a random walk is provided by the Box-Ljung Q-statistic obtained when $\sigma_{\eta}^{2}$ is set to zero; for rural Arizona this results in Q(15,13) jumping from a statistically insignificant 12.85 to a highly significant 40.66. If the random walk is replaced by a first-order autoregressive process, the coefficient is estimated to be 0.98. Thus, for Arizona at least, the random walk level seems to be a reasonable model.
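The behaviour of the Box-Ljung diagnostic can be illustrated with a minimal sketch. It is not the STAMP calculation, which is based on the innovations of the fitted model; here the statistic $Q = n(n+2)\sum_{k=1}^{P} r_k^2/(n-k)$ is simply computed for simulated data. The function name `ljung_box_q`, the seed, the sample size, and the unit variances are all assumptions made for illustration. Deviations of a random-walk-plus-noise series from a fixed mean (the deterministic-level case) are strongly autocorrelated, so their Q is far larger than that of white noise.

```python
import random


def ljung_box_q(resid, lags):
    """Box-Ljung statistic based on the first `lags` sample autocorrelations."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [r - mean for r in resid]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, lags + 1):
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        r_k = ck / c0  # lag-k sample autocorrelation
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


rng = random.Random(0)  # arbitrary seed, for reproducibility only
n = 200

# White noise: what well-behaved residuals should look like.
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Random walk level plus noise; ljung_box_q subtracts a fixed mean,
# which mimics imposing a constant (deterministic) level.
level = 0.0
walk = []
for _ in range(n):
    level += rng.gauss(0.0, 1.0)
    walk.append(level + rng.gauss(0.0, 1.0))

q_noise = ljung_box_q(noise, 15)
q_walk = ljung_box_q(walk, 15)
print(f"Q(15) white noise: {q_noise:.2f}")
print(f"Q(15) constant level fitted to a random walk: {q_walk:.2f}")
```

The degrees-of-freedom adjustment implied by the second argument of Q(15,13) is not reproduced here.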

There are two ways in which the analysis could be improved. The first is by taking account of the fact that the data are in the form of counts, some of which are quite small. Rather than using the log(y+1) transformation, a count data structural time series model could be used. Harvey and Fernandes (1989) give a procedure that can be used when only the level is stochastic, while Durbin and Koopman (2000) show how simulation methods enable a general count data model to be estimated.
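As a rough sketch of what a count data filter with a stochastic level looks like, the code below implements Poisson-gamma recursions in the spirit of Harvey and Fernandes (1989): the level has a gamma distribution with parameters (a, b), both of which are discounted by a factor omega in the prediction step and then updated by the new count. The function name `poisson_gamma_filter`, the value of omega, and the counts are assumptions for illustration; in practice omega would be estimated by maximum likelihood, and explanatory or intervention variables are not handled here.

```python
def poisson_gamma_filter(counts, omega, a0=0.0, b0=0.0):
    """One-step-ahead mean forecasts for Poisson counts with a gamma level.

    a0 = b0 = 0 is a diffuse start; no forecast is produced until the
    filter has seen at least one observation.
    """
    a, b = a0, b0
    forecasts = []
    for y in counts:
        # Prediction step: discount both gamma parameters by omega.
        a_pred, b_pred = omega * a, omega * b
        forecasts.append(a_pred / b_pred if b_pred > 0 else None)
        # Updating step: add the observed count and one unit of exposure.
        a, b = a_pred + y, b_pred + 1.0
    return forecasts, (a, b)


# Hypothetical monthly crash counts, including zeros.
counts = [2, 0, 1, 3, 0, 1]
forecasts, (a, b) = poisson_gamma_filter(counts, omega=0.8)
print(forecasts)
```

Each forecast a/b is an exponentially weighted average of past counts, so zeros are handled directly and no log(y+1) transformation is required.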

The second suggestion is to make use of control groups. For example, the urban Arizona series can serve as a control for rural Arizona. If the series are correlated, one can go some way toward resolving the issue raised by Balkin and Ord when they say "...Arizona... had an increase in the number of crashes the year of the speed-limit change but a decrease from that level in subsequent years. This suggests that drivers in Arizona may have learned how to drive safely at the new limit. Such patterns are not consistent across states, and this issue requires further investigation." The structural time series framework for using control groups is discussed in some detail in Harvey (1996). In the present context, it simply involves setting up a bivariate time series model consisting of equations (1) and (4) of the Balkin/Ord paper and estimating them jointly with allowance made for correlations across the level, seasonal, and irregular disturbances. Thus

$$y_{1t} = \mu_{1t} + \varepsilon_{1t}, \qquad y_{2t} = \mu_{2t} + \lambda z_{t} + \varepsilon_{2t}, \qquad t = 1, \ldots, T,$$

where the intervention variable, $z_{t}$, is defined as in (5), and

$$\mu_{it} = \mu_{i,t-1} + \eta_{it}, \qquad i = 1, 2.$$

Such a model can be estimated in STAMP. Using data up to November 1995 to exclude the later change, the correlation between the level disturbances, $\eta_{1t}$ and $\eta_{2t}$, is 0.81, while the correlation between the irregulars, $\varepsilon_{1t}$ and $\varepsilon_{2t}$, is -0.07. This translates into a reduction in the root mean squared error (RMSE) of the level intervention in the rural series located at April 1987. The t-statistic correspondingly increases from 2.21 to 2.41. The gain is not dramatic, possibly because the number of crashes in the urban series is so small. However, figure 1 here clearly shows the connection between the rural series and the urban series and also shows the slight decrease noted by Balkin and Ord after 1987.
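The structure of the bivariate model can be made concrete with a small simulation. This is only a sketch of the data-generating process, not of the STAMP estimation: it draws two random-walk levels whose disturbances have correlation 0.81 (the estimate reported above), takes the irregulars to be independent (the reported -0.07 is treated as zero), and adds a step intervention to the second series. The step form of $z_t$, the value lam=0.2, the standard deviations, the sample size, and all function names are assumptions for illustration.

```python
import math
import random


def simulate_bivariate_local_level(n, rho_eta, lam, break_at,
                                   sd_eta=0.1, sd_eps=0.3, seed=1):
    """Simulate y1 (control) and y2 (series with the intervention) from two
    random-walk-plus-noise models with correlated level disturbances."""
    rng = random.Random(seed)
    mu1 = mu2 = 0.0
    y1, y2, eta1, eta2 = [], [], [], []
    for t in range(n):
        # Correlated standard normals via a 2x2 Cholesky factor.
        u1 = rng.gauss(0.0, 1.0)
        u2 = rho_eta * u1 + math.sqrt(1.0 - rho_eta ** 2) * rng.gauss(0.0, 1.0)
        e1, e2 = sd_eta * u1, sd_eta * u2
        mu1 += e1
        mu2 += e2
        z = 1.0 if t >= break_at else 0.0  # assumed step dummy
        y1.append(mu1 + rng.gauss(0.0, sd_eps))
        y2.append(mu2 + lam * z + rng.gauss(0.0, sd_eps))
        eta1.append(e1)
        eta2.append(e2)
    return y1, y2, eta1, eta2


def correlation(x, y):
    """Sample (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


y1, y2, eta1, eta2 = simulate_bivariate_local_level(
    n=5000, rho_eta=0.81, lam=0.2, break_at=2500)
print(round(correlation(eta1, eta2), 2))
```

Because the two levels move together, the control series carries information about the level of the treated series around the break, which is the source of the reduction in the RMSE of the intervention estimate.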

The ideal model for control group analysis would be a bivariate, count data model as in Fernandes, Ord, and Harvey (1993). A simpler option would be to aggregate the data to a quarterly level, thereby removing zeroes in nearly all the series and yielding a better Gaussian approximation in logarithms.
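The quarterly aggregation suggested here is straightforward; a minimal sketch follows. The function name `to_quarterly` and the counts are hypothetical, and the series is assumed to start in the first month of a quarter and to contain only complete quarters.

```python
import math


def to_quarterly(monthly):
    """Sum consecutive blocks of three monthly counts into quarterly totals."""
    if len(monthly) % 3 != 0:
        raise ValueError("series must contain complete quarters")
    return [sum(monthly[i:i + 3]) for i in range(0, len(monthly), 3)]


# Hypothetical monthly counts with several zeros.
monthly = [0, 1, 0, 2, 0, 0, 0, 0, 3, 1, 1, 0]
quarterly = to_quarterly(monthly)

# With no zero quarters, plain logarithms can replace log(y + 1).
log_quarterly = [math.log(q) for q in quarterly]
print(quarterly)
```

A zero quarter remains possible in a very sparse series, in which case the logarithm is still undefined and a count data model is the safer route.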

Balkin and Ord suggest the use of a "Super t-Test" to determine the significance of interventions for all states together. There may be a problem here insofar as the individual t-statistics are not independent of each other. An alternative approach, which also solves the small counts problem, is to aggregate all the crashes in states where there was a change in speed limit and then test the significance of the intervention variable. Taking the logarithm of the total number of crashes in the states where the speed limit was raised in April, May, or June of 1987 gives a t-statistic of 3.21 for a level intervention in May of 1987. Again, only observations up to November 1995 were used. The t-statistic increases when a control group series is formed from the urban series and the rural series where the limit was not raised in 1987. The bivariate model shows a correlation of 0.90 between the level disturbances, and the intervention effect, $\lambda$, is estimated as 0.167 with a t-statistic of 4.35. The increase is clearly significant and translates into an 18% increase in crashes on roads where the speed limit was raised.
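The percentage figure follows from the logarithmic specification: a level shift of $\lambda$ in the log of crashes multiplies the number of crashes by $e^{\lambda}$. The short calculation below reproduces it from the reported estimate and also backs out the standard error implied by the reported t-statistic; the variable names are illustrative only.

```python
import math

lam_hat = 0.167  # reported estimate of the intervention effect (log scale)
t_stat = 4.35    # reported t-statistic

# A shift of lam_hat in logs is a proportional change of exp(lam_hat) - 1.
pct_increase = 100.0 * (math.exp(lam_hat) - 1.0)

# Standard error implied by t = estimate / standard error.
implied_se = lam_hat / t_stat

print(f"{pct_increase:.1f}% increase, implied s.e. {implied_se:.4f}")
```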

References

Busetti, F. and A.C. Harvey. 2001. Testing for the Presence of a Random Walk in Series with Structural Breaks. Journal of Time Series Analysis 22:127-50.

Durbin, J. and S.J. Koopman. 2000. Time Series Analysis of Non-Gaussian Observations Based on State-Space Models from Both Classical and Bayesian Perspectives (with discussion). Journal of the Royal Statistical Society, Series B 62:3-56.

Fernandes, C., K. Ord, and A.C. Harvey. 1993. Time Series Models for Multivariate Series of Count Data. Developments in Time Series Analysis. Boca Raton, FL: Chapman and Hall.

Harvey, A.C. 1996. Intervention Analysis with Control Groups. International Statistical Review 64:313-28.

Harvey, A.C. and C. Fernandes. 1989. Time Series Models for Count Data or Qualitative Observations. Journal of Business and Economic Statistics 7:409-22.

Koopman, S.J., A.C. Harvey, J.A. Doornik, and N. Shephard. 2000. STAMP: Structural Time Series Analyser, Modeller and Predictor. London: Timberlake Consultants Ltd.

Kwiatkowski, D., P.C.B. Phillips, P. Schmidt, and Y. Shin. 1992. Testing the Null Hypothesis of Stationarity Against the Alternative of a Unit Root: How Sure Are We that Economic Time Series Have a Unit Root? Journal of Econometrics 44:159-78.

Address for Correspondence

Andrew Harvey is a professor of Econometrics at the University of Cambridge on the Faculty of Economics and Politics. Address: Sidgwick Avenue, Cambridge, CB3 9DD England. Email: Andrew.Harvey@econ.cam.ac.uk.