Accident Analysis and Prevention 39 (2007) 546–555

Empirical Bayes before–after safety studies: Lessons learned from two decades of experience and future directions
Bhagwant Persaud ∗, Craig Lyon
Ryerson University, Department of Civil Engineering, 350 Victoria Street, Toronto, Ontario M5B2K3, Canada
Received 7 May 2006; received in revised form 28 August 2006; accepted 21 September 2006

Abstract
The empirical Bayes (EB) methodology has been applied for over 20 years now in conducting statistically defendable before–after studies of the
safety effect of treatments applied to roadway sites. The appeal of the methodology is that it corrects for regression to the mean and traffic volume
and other changes not due to the measure. There is, therefore, a natural tendency to put a stamp of approval on any study that uses this methodology,
and to assume that the results can then be used in specifying crash modification factors for use in developing treatments for hazardous locations, or
in designing new roads using tools such as the interactive highway safety design model (IHSDM). At the other extreme are skeptics who suggest
that the increased sophistication and data needs of the EB methodology are not worth the effort since alternative, less complex methods can produce
equally valid results. The primary objective of this paper is to capitalize on experience gained from two decades of conducting EB studies around
the world to illustrate that the EB methodology, if properly undertaken, produces results that could be substantially different and less biased than
those from more conventional types of studies. A secondary objective is to emphasize that caution is needed in assessing the validity of studies
undertaken with the EB methodology and in using these results for providing crash modification factors. To this end, a number of issues that
are critical to the proper conduct and interpretation of EB evaluations are raised and illustrated based on lessons learned from recent experience
with these studies. These include: amalgamating the effects on different crash types; the specification of the reference/comparison groups; and
accounting for traffic volume changes. Current and future directions, including the improvements offered by a full Bayes approach, are discussed.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Safety countermeasures; Road safety; Empirical Bayes; Bayes; Safety evaluation; Before–after studies

∗ Corresponding author. Tel.: +1 416 979 5345; fax: +1 416 979 5122.
E-mail address: bpersaud@ryerson.ca (B. Persaud).

0001-4575/$ – see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.aap.2006.09.009

1. Introduction

There is an undisputed need to evaluate the safety effect of roadway improvements that may impact accident frequency. What seems to be still in dispute is whether or not it is worth the effort of using sophisticated methodology such as the empirical Bayes (EB) procedure (Hauer, 1997) for conducting observational before–after studies. This is because (a) the relative complexity of the methodology requires analysts with considerable training and experience, (b) the data needs can be quite extensive, and (c) the result of (a) and (b) is that the personnel and financial resource needs can be prohibitive.

The EB methodology has been developed to account for regression-to-the-mean effects that arise when sites with randomly high short-term accident counts are selected for treatment and experience a reduction in accidents subsequently when these counts regress towards their true long-term mean. While compelling evidence of the existence of the regression-to-the-mean phenomenon has been presented (Hauer, 1997; Hauer and Persaud, 1983) there is a certain amount of skepticism about the need for the EB methodology, doubts that are fuelled by a belief that road safety improvements are implemented for a variety of reasons and that a randomly high accident count is not among the key selection criteria. Numerical data from Norway, presented by Elvik (2004), appear to support this belief in part in that both treated and untreated sites over several years were equally likely to have above or below normal accident rates.

Further skepticism about the need for the EB methodology arises from the belief that when many years of pre-treatment data are used to select entities for treatment or in an evaluation, and these entities have high accident counts, there will be little or no regression-to-the-mean. While there is some validity to this belief, it is difficult to establish how many years of pre-treatment data are required or how high accident counts need to be for regression-to-the-mean to be virtually non-existent. Evidence in Hauer and Persaud (1983) suggests that there is fairly large
regression to the mean for two-lane rural road segments even for 5-year “before” periods, illustrating that a long before period will not eliminate regression-to-the-mean, especially when the average annual accident count is relatively small. This tends to be the case for many types of treated entities such as short two-lane rural road segments, low volume stop controlled intersections and rail-highway crossings.

The more conventional alternatives to the EB method, involving a simple before–after comparison of accident counts or rates, with or without a comparison or control group, are appealing in that they are relatively easy to apply. These alternative methods, however, are fraught with difficulties, which are well documented. The “best of the rest” involves a process in which sites are selected for possible treatment on the basis of their safety record and then randomly allocated to either a treatment or a control group, which is a classical experimental design. This would create similar accident frequency distributions in the two groups, allowing for regression-to-the-mean effects to be controlled for. In practice, this method of project selection is problematic since there may be moral and liability issues if some sites that end up in the control group are more worthy of treatment than some in the treatment group. In addition, this method will not control for changes in safety resulting from changes in traffic volume at the treatment sites that might result from the treatment itself. Measures such as left turn treatments at intersections are known to have such effects.

To avoid these issues in using a control group, a quasi-experimental design is commonly used in which an untreated “comparison” group of sites similar to the treated ones is selected separately from the treatment site selection process. A comparison group can account for unrelated effects such as time and travel trends but will not account for regression-to-the-mean unless sites are precisely matched on the basis of accident occurrence in addition to all the factors that affect accident occurrence. There are immense practical difficulties of achieving this ideal as illustrated in Pendleton (1996). In addition, the necessary assumption that the comparison group is unaffected by the treatment is difficult to test and can be an unreasonable one in some situations. And this method, like the classical experimental design, will not control for changes in safety resulting from changes in traffic volume at the treatment sites that might result from the treatment itself. Most fundamentally, the comparison group needs to be similar to the treatment group in all of the possible factors that could influence safety. A paper by Scopatz (1998) points to the difficulties of fulfilling this need by examining the result from Hingson et al. (1996) that lowering legal BAC limits to 0.08% resulted in a 16% reduction in the probability that a fatally injured driver would have a BAC above that level. The treatment group consisted of States that passed a lower legal BAC law while the comparison states retained a 0.10% BAC legal limit. Scopatz showed that if logically valid but different comparison states are chosen the results change dramatically, and in most cases are in fact consistent with a conclusion of “no effect”.

The empirical Bayes (EB) method (Hauer, 1997) can overcome the limitations of conventional methods by accounting not only for regression-to-the-mean effects, but also for traffic volume changes and for time trends in accident occurrence due to changes over time in factors such as weather, accident reporting practices and driving habits. However, there are a number of difficulties which, if not properly resolved, will render this methodology just as invalid as the conventional methods, resulting in a misuse of precious resources and a general lack of faith in the method. It is important to recognize and address these issues since it is natural for those involved in safety management to give a stamp of approval to results from an EB study just because they claim to have been produced by such a statistically sound methodology.

Given the two extremes in beliefs on the EB methodology – blind faith and skepticism – it seems worthwhile and timely to address the concerns in both camps by consolidating the lessons learned in conducting EB evaluations over the past 20 years or so since the first applications of this methodology. This need is the motivation for this paper. First, the basics of EB evaluation are reviewed. This is followed by three substantive sections, one that presents evidence supporting the need for and validity of the EB approach, one that uses the results from several published before–after studies to compare estimates of safety effect obtained by the EB and the naïve methods, and one that addresses issues in EB evaluations that need to be considered in assessing the validity of EB studies.

2. Basics of empirical Bayes evaluation

In the empirical Bayes evaluation of the effect of a treatment, the change in safety for a given crash type at a treated intersection is given by

$$B - A, \tag{1}$$

where B is the expected number of crashes that would have occurred in the “after” period without the treatment and A is the number of reported crashes in the after period. Because of changes in safety that may result from changes in traffic volume, from regression-to-the-mean, and from trends in crash reporting and other factors, the count of crashes before a treatment by itself is not a good estimate of B (Hauer, 1997), a reality that has now gained common acceptance. Instead, B is estimated from an empirical Bayes (EB) procedure (Hauer, 1997) in which a safety performance function (SPF) is used to first estimate the number of crashes that would be expected in each year of the “before” period at locations with traffic volumes and other characteristics similar to a treatment site being analyzed. The sum of these annual SPF estimates (P) is then combined with the count of crashes (x) in the before period at the treatment site to obtain an estimate of the expected number of crashes (m) before the treatment. This estimate of m is

$$m = w_1 x + w_2 P. \tag{2}$$

The weights $w_1$ and $w_2$ are estimated as

$$w_1 = \frac{P}{P + 1/k}, \tag{3}$$
$$w_2 = \frac{1}{k(P + 1/k)}, \tag{4}$$

where k is the dispersion parameter of the negative binomial distribution that is assumed for the crash counts used in estimating the SPF. The value of k is estimated from the SPF calibration process with the use of a maximum likelihood procedure.

A factor is then applied to m from Eq. (2) to account for the length of the after period as well as differences in traffic volumes and general trends in crash risk due to factors such as weather, reporting practices and other safety countermeasures between the before and after periods. This factor is the sum of the annual SPF predictions for the after period divided by P, the sum of these predictions for the before period. The result, after applying this factor, is an estimate of B. The procedure also produces an estimate of the variance of B, the expected number of crashes that would have occurred in the after period without the treatment.

The estimate of B is then summed over all road sections in a treatment group of interest (to obtain $B_{\mathrm{sum}}$) and compared with the count of crashes during the after period in that group ($A_{\mathrm{sum}}$). The variance of B is also summed over all sections in the group of interest.

The index of safety effectiveness (θ) is estimated as

$$\theta = \frac{A_{\mathrm{sum}}/B_{\mathrm{sum}}}{1 + \mathrm{Var}(B_{\mathrm{sum}})/B_{\mathrm{sum}}^2}. \tag{5}$$

The standard deviation of θ is given by

$$\mathrm{Stddev}(\theta) = \left[\frac{\theta^2\left\{[\mathrm{Var}(A_{\mathrm{sum}})/A_{\mathrm{sum}}^2] + [\mathrm{Var}(B_{\mathrm{sum}})/B_{\mathrm{sum}}^2]\right\}}{[1 + \mathrm{Var}(B_{\mathrm{sum}})/B_{\mathrm{sum}}^2]^2}\right]^{0.5}. \tag{6}$$

The percent change in crashes is in fact 100(1 − θ); thus a value of θ = 0.7 with a standard deviation of 0.12 indicates a 30% reduction in crashes with a standard deviation of 12%.

3. Evidence supporting the need for and validity of the empirical Bayes approach

Evidence presented in Hauer (1997), Hauer and Persaud (1983) and Persaud and Hauer (1984) based on a large number of diverse datasets has demonstrated that regression-to-the-mean effects can be quite substantial and that the EB method is a very effective tool for accounting for this effect. In keeping with the objectives of this paper it is still useful to present some additional evidence and discussion.

Table 1
Using data for 1669 California rural stop controlled intersections to illustrate regression to the mean and the validity of the EB method

Column 1: sites with (x) accidents in 1994–1996
Column 2: accidents/site in 1994–1996 (x)
Column 3: accidents/site per year in 1994–1996
Column 4: accidents/site per year in 1997–1999
Column 5: observed percent change
Column 6: CG estimate of accidents/site per year in 1997–1999
Column 7: EB estimate of accidents/site per year in 1997–1999
Column 8: percent difference, CG
Column 9: percent difference, EB

Col 1   Col 2   Col 3   Col 4   Col 5      Col 6   Col 7   Col 8      Col 9
584     0       0       0.21    Increase   0.00    0.20    Increase   5.0
348     1       0.33    0.40    21.2       0.34    0.42    17.6       −4.8
203     2       0.67    0.72    7.4        0.69    0.67    4.3        7.4
144     3       1.00    0.93    −7.0       1.03    0.92    −9.7       1.1
103     4       1.33    1.17    −12.0      1.37    1.20    −14.6      −2.5
63      5       1.67    1.48    −11.4      1.72    1.43    −14.0      3.5
56      6       2.00    1.60    −20.0      2.06    1.75    −22.3      −8.6
31      7       2.33    1.99    −14.6      2.40    2.00    −17.1      −0.5
28      8       2.67    2.12    −20.6      2.75    2.37    −22.9      −10.5
31      9       3.00    3.13    4.3        3.09    2.64    1.3        18.6
21      10      3.33    2.62    −21.3      3.43    3.02    −23.6      −13.2
11      11      3.67    3.03    −17.4      3.78    3.33    −19.8      −9.0
46      ≥ 12    5.30    4.99    −5.9       5.46    4.73    −8.6       5.5

Data in Table 1 pertain to accident counts at 1669 rural, 4-legged stop controlled intersections in California. These intersections averaged 0.81 accidents per year during 1994–1996 and 0.83 accidents per year in 1997–1999. Intersections are grouped into rows based on the count of accidents in 1994–1996, shown in column 2. As column 5 shows, those intersections in groups which in 1994–1996 had more than the average number of accidents in this period experienced a reduction in accidents in 1997–1999 in all except one case. And intersections with fewer
accidents than the average experienced considerable increases. These changes are due to regression-to-the-mean since these intersections were largely unaltered during the 6-year period from 1994 to 1999, according to information in the Highway Safety Information System (HSIS) (Griffith and Council, 2000) from which these data were extracted. For the numbers in column 6, a comparison ratio of (0.83/0.81), which is based on all 1669 sites in the dataset, was applied to the numbers in column 3 as might be done in a “simple before–after with comparison group” (CG) method. It is seen that the regression-to-the-mean effects still remain substantial because the comparison ratio (about 1.02) is so close to 1 in this case.

The 1994–1996 accident count data were used, along with safety performance functions estimated from the entire database, to develop EB estimates for each group of intersections, using the methodology detailed earlier. These estimates are shown in column 7. As mentioned, these are estimates of what safety would be in a subsequent after period at a treated site had treatment not been applied. Since treatment was not in fact applied to these intersections, the actual “after” period (1997–1999) count represents in this case “what safety would be in a subsequent after period at a treated site had treatment not been applied”. Thus the validity of the EB estimates of the 1997–1999 frequencies in the absence of treatment (column 7) can be assessed by comparing them to the actual accident counts in 1997–1999 (column 4) and contrasting them to estimates that would have been obtained by the CG method.

The comparison of the estimates in columns 6 and 7 with the actual accidents per year in column 4 is reflected in the percentage differences shown in columns 8 and 9. This comparison reveals that the EB estimates are closer to the 1997–1999 counts than are the estimates from the CG method. Most importantly, it is seen from the numbers in column 8 that the CG method systematically overestimates these counts for sites with above average accident frequencies. By contrast, as seen in column 9, the EB method appears to be unbiased in that it sometimes overestimates and sometimes underestimates the frequencies for these sites.

The superiority of the EB method is illustrated vividly in Fig. 1, which was constructed by first subtracting the numbers in columns 6 and 7 from those in column 4, for each row in Table 1, to find the mean residuals for the CG and EB predictions when compared to the observed accident frequency in 1997–1999. The cumulative residuals for each were then plotted against the 1994–1996 accident count (column 2).

Fig. 1. Cumulative mean residuals based on 1994–1996 accident count groupings in Table 1.

Fig. 1 graphically confirms what is shown in Table 1, in that the EB predictions not only estimate efficiently, as evidenced by the closeness of the plotted line to 0.0 on the Y-axis, but also produce unbiased estimates since the plot oscillates about the X-axis. By contrast, the CG predictions show substantial bias in consistently under-predicting when the 1994–1996 count is less than about 2.0 (when the line is rising) and overpredicting when the count is higher than 2.0 (i.e., when the line is falling). The upshot is that regression-to-the-mean does exist and cannot usually be accounted for by the CG method.

4. Comparing before–after study results obtained by the EB and traditional methods

Having provided further evidence and discussion to support the validity of, and need for the EB methodology, it remains to see if the EB results from actual before–after studies are materially different from those that would have been obtained with more traditional methods. To do so, we have summarized in Table 2 the results of a number of EB studies that one or both of the authors have been involved in over the years. These are compared to results that would have been obtained with a naïve before–after analysis.

The two sets of analysis essentially differ in the values of “expected after without treatment” shown in the second and third columns. For the conventional (naïve) method this value (in the second column) was obtained by multiplying the “before” period count for each site by the ratio of the “after” period length to the “before” period length. Where AADT information was available for both the before and after periods, as was the case for roundabouts, centre-line rumble strips and red light cameras, the ratio of the after period to before period traffic volume was applied as an additional factor. This ratio assumes a linear relationship between crashes and traffic volumes, an assumption generally found to be invalid. In no case were the naïve estimates corrected for time trend in accident frequencies using a comparison group, although this might have been possible if this was the primary study method. For red light cameras, centre-line rumble strips, and conversion from signal to all-way stop, this trend was known, based on available reference group data, to be an increasing one, so the naïve estimates shown are likely to be on the low side for these three treatments.

For the EB analysis, the estimates of “expected after without treatment” in the third column were obtained as detailed in Eqs. (1)–(6) presented earlier. These estimates, as noted, correct for differences in length and underlying accident experience between the before and after periods. They also correct for traffic volume changes between the before and after period where such changes are known. (For rail crossing protection upgrades, conversions from signal to all-way stop, and for some roundabout conversions, it had to be assumed that the before and after period traffic volumes were the same, because of the unavailability of data for both periods.)

Most importantly, the EB estimates correct for regression-to-the-mean, which is perhaps the main reason for the often
Table 2
Comparison of EB and naïve study results

Column 1: accident type
Column 2: expected after without treatment, naïve
Column 3: expected after without treatment, EB
Column 4: expected after without treatment, % diff.
Column 5: accident count after
Column 6: naïve accident reduction
Column 7: EB accident reduction
Column 8: naïve/EB reduction, number
Column 9: naïve/EB reduction, %

Col 1                 Col 2   Col 3   Col 4    Col 5   Col 6          Col 7          Col 8   Col 9

Converting 222 intersections from two- to all-way stop (Persaud et al., 1984)
All                   1329    1079    18.8%    616     713 (54%)      463 (43%)      1.54    1.26
Injury                313     226     27.8%    60      253 (81%)      166 (73%)      1.52    1.11
Right-angle           726     558     23.1%    126     600 (83%)      432 (77%)      1.39    1.08
Rear-end              151     123     18.5%    101     50 (33%)       22 (18%)       2.27    1.83
Pedestrian            139     123     11.5%    75      64 (46%)       48 (39%)       1.33    1.18

Installing gates at 934 rail crossings with flashers (Hauer and Persaud, 1987)
All                   286     208     37.9%    114     172 (60%)      94 (45%)       1.83    1.33

Installing gates at 1037 rail crossings with crossbucks (Hauer and Persaud, 1987)
All                   239     162     32.2%    50      189 (79%)      112 (69%)      1.69    1.14

Installing flashers at 891 rail crossings with crossbucks (Hauer and Persaud, 1987)
All                   165     101     38.8%    49      116 (70%)      52 (51%)       2.23    1.37

Change 189 intersections from signal to all-way stop (Persaud et al., 1997)
Angle/turn            636     625     1.7%     476     160 (25%)      149 (24%)      1.07    1.04
Rear-end              128     144     −12%     102     26 (20%)       42 (29%)       0.62    0.69
All                   1056    1063    −0.7%    809     247 (23%)      254 (24%)      0.97    0.96

Centre-line rumble strips on 211 miles of two-lane roads (Persaud et al., 2004)
All                   1961    2030    −3.5%    1777    184 (9%)       253 (12%)      0.73    0.75
Injury                769     749     2.6%     647     122 (16%)      102 (14%)      1.20    1.14

Roundabouts at 23 US intersections (Persaud et al., 2001)
All                   553     455     17.7%    275     278 (50%)      180 (40%)      1.54    1.25
Injury                84      58      31.0%    12      72 (86%)       46 (79%)       1.57    1.09

Red light cameras at 132 intersections in 7 jurisdictions (Persaud et al., 2005)
Angle all             1580    1541    2.4%     1163    417 (26%)      378 (25%)      1.10    1.04
Angle injury          912     896     1.8%     634     278 (30%)      262 (29%)      1.06    1.03
Rear-end all          2399    2531    −5.5%    2896    −497 (−21%)    −365 (−14%)    1.36    1.50
Rear-end inj.         944     984     −4.2%    1008    −64 (−7%)      −24 (−2.4%)    2.67    1.92

Left turn priority treatment at 35 traffic signals (Lyon et al., 2005)
Inj. left turn (LT)   215     180     16.3%    152     63 (29%)       28 (16%)       2.25    1.81
Inj. LT side impact   157     165     −5.1%    135     22 (14%)       30 (18%)       0.73    0.78
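As a reading aid for Table 2, the derived columns can be recomputed from columns 2, 3 and 5. The sketch below is a minimal illustration, not the authors' code. The function name `table2_row` and its field names are hypothetical, and the assumption that column 4 is expressed relative to the naïve value is inferred from the first rows of the table rather than stated in the text.

```python
def table2_row(naive_expected, eb_expected, after_count):
    """Recompute the derived Table 2 columns for one row (illustrative only)."""
    naive_reduction = naive_expected - after_count   # column 6: column 2 minus column 5
    eb_reduction = eb_expected - after_count         # column 7: column 3 minus column 5
    return {
        # Column 4, assumed here to be relative to the naive expected value.
        "pct_diff_expected": 100.0 * (naive_expected - eb_expected) / naive_expected,
        "naive_reduction": naive_reduction,
        "naive_pct": 100.0 * naive_reduction / naive_expected,
        "eb_reduction": eb_reduction,
        "eb_pct": 100.0 * eb_reduction / eb_expected,
        # Column 8: ratio of the naive to the EB reduction in accident counts.
        "reduction_ratio": naive_reduction / eb_reduction,
    }


# First row of Table 2: all accidents, two-way to all-way stop conversion.
row = table2_row(naive_expected=1329, eb_expected=1079, after_count=616)
print(row)
```

For this row the naïve method credits the treatment with 713 fewer accidents and the EB method with 463, which is the gap that column 8 summarizes.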

substantial difference between them and the naïve estimates. This difference, which is shown as percentages in the fourth column of Table 2, is of interest by itself in that it is largest for two sets of measures: conversion from two-way to all-way stop control and upgrading protection at rail-highway crossings. In both cases, there was a known tendency to quickly apply the treatment in response to the occurrence of one or more recent accidents, the classical situation for regression-to-the-mean. At the other extreme are the three cases for which this difference between the EB and naïve estimates is not substantial: red light cameras, centre-line rumble strips, and conversion from signal to all-way stop. This is not to say that there was little or no regression-to-the-mean. In fact, these are the same three cases noted above for which the naïve estimates of “expected after without treatment” are likely to be on the low side because they did not correct for a significant increasing trend in accident frequency.

The most significant conclusion from the range of values in column 4 is that it seems almost impossible to estimate the amount of regression-to-the-mean that was present in reported studies that used the naïve method.

Another key revelation from Table 2 is in the difference between the naïve reduction in column 6 (the difference between columns 2 and 5) and the EB reductions in column 7 (the difference between columns 3 and 5). In terms of % reduction, it is seen from the numbers in column 9 that there is a relatively small difference in many cases between the naïve and EB estimates of safety effect, which may support a belief that the EB analysis may not be worth the effort. However, if we estimate safety effect in terms of the actual reduction in accident frequency rather than the % reduction, which is what really matters, we see substantial differences between the EB and naïve estimates, as evidenced by the ratios in column 8. On this basis, it must be concluded that EB analysis does in fact make a substantial enough difference.

5. Issues that may affect the validity of EB results

As suggested earlier, one should not use blind faith in assessing the validity of EB studies. This is because there are a number of tricky issues in conducting these studies and, if these are not properly addressed, the results of an EB study can be just as invalid as those from conventional studies. Below, three such issues are focused on. It should be noted in passing that there are in fact other issues that are not addressed and that aspects of these issues apply also to the more conventional evaluation methods. These other issues relate to the assessment of potential crash migration or spillover effects to non-treatment sites, the assessment of how site characteristics may impact on the effect of a treatment, the difficulties caused when treatments are applied in combination, as is typically the case, and the complications that arise when treatment effects vary over time. The three issues that are addressed are the differential effects for different crash types, the specification of the reference groups, and the consideration of the effects of traffic volume changes.

5.1. Issue 1: differential effects for different crash types

Most treatments affect various accident impact and severity types differently. Therefore, in assessing the overall impact of a treatment it is necessary to somehow amalgamate these differential impacts. This is especially critical when a measure has positive impacts on some accident types and negative impacts on others. Examples of such measures are conversion to traffic signal control and installation of red light cameras (RLC), both of which are known to increase rear-end crashes but to reduce the more severe right-angle crashes. Some recent results from Persaud et al. (2005) and Council et al. (2005) for an FHWA study emphasize the importance of properly weighting these effects in arriving at an overall impact. Until then other researchers would merely report these effects separately without attempting to derive a net safety benefit of RLC programs, information that is crucial to the continuation of such programs. The FHWA research involved an EB study that confirmed the conventional belief that RLCs increase rear-end and decrease right-angle crashes. This was followed by an examination of the economic costs of these changes, based on a consideration of specially derived rear-end and right-angle unit crash costs for various severity levels for urban signalized intersections (Zaloshnja et al., 2004), in order to establish the aggregate effects of the RLC programs evaluated. These results, shown in Table 3, suggest that costs of the right-angle crashes saved clearly outweigh the cost of the increased rear-end crashes, even though the net savings in crashes was not substantial.

5.2. Issue 2: specification of reference groups

In the EB methodology, safety performance functions need to be calibrated for each of the before and after periods and desirably for each year of these periods. A reference group of “similar” entities to the treated ones is used for this purpose.

Table 3
Economic evaluation of the safety effect of red light cameras (from Council et al., 2005)

                                                                    Right-angle    Rear-end
EB estimate of crashes expected in the after period without RLC    1542           2521
Count of crashes observed in the after period (370 site years)     1163           2896
% change in crashes (standard error) [negative is decrease]        −24.6 (2.9)    14.9 (3.0)
Estimate of the change in crashes [negative is decrease]           −379           375
Crash cost change [negative is decrease] ($)                       −18,497,977    5,875,156
% change in crash cost [negative is decrease]                      −27.7          8.5
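The crash-count rows of Table 3 can be approximately reproduced from Eqs. (5) and (6), and the two columns can be netted against each other. The sketch below is illustrative only and makes two assumptions that are not in the text: Var(Bsum) is set to zero because it is not reported here, and Var(Asum) is taken equal to Asum (Poisson counts). The resulting standard deviation is therefore smaller than the standard errors published in the table. The function names are hypothetical.

```python
import math

# Counts and cost changes copied from Table 3.
TABLE3 = {
    "right-angle": {"expected": 1542.0, "observed": 1163.0, "cost_change": -18_497_977},
    "rear-end": {"expected": 2521.0, "observed": 2896.0, "cost_change": 5_875_156},
}


def theta(a_sum, b_sum, var_b_sum=0.0):
    """Index of safety effectiveness, Eq. (5)."""
    return (a_sum / b_sum) / (1.0 + var_b_sum / b_sum**2)


def theta_stddev(a_sum, b_sum, var_b_sum=0.0, var_a_sum=None):
    """Standard deviation of theta, Eq. (6).

    Assumption: Var(A_sum) defaults to A_sum, i.e. Poisson-distributed counts.
    """
    if var_a_sum is None:
        var_a_sum = a_sum
    t = theta(a_sum, b_sum, var_b_sum)
    rel_var = var_a_sum / a_sum**2 + var_b_sum / b_sum**2
    return math.sqrt(t**2 * rel_var / (1.0 + var_b_sum / b_sum**2) ** 2)


def summarize(table):
    """Per-type crash changes plus the net change in crashes and in crash cost."""
    rows = {}
    for name, r in table.items():
        t = theta(r["observed"], r["expected"])  # Var(B_sum) unknown, taken as 0
        rows[name] = {
            "crash_change": r["observed"] - r["expected"],
            "percent_change": 100.0 * (t - 1.0),  # negative is a decrease
        }
    net_crashes = sum(v["crash_change"] for v in rows.values())
    net_cost = sum(r["cost_change"] for r in table.values())
    return rows, net_crashes, net_cost


rows, net_crashes, net_cost = summarize(TABLE3)
print(rows)
print("net change in crashes:", net_crashes)
print("net change in crash cost ($):", net_cost)
```

Under these assumptions the net change is only about four crashes while the net crash cost falls by roughly $12.6 million, which is the contrast that the discussion of Table 3 draws between crash counts and crash costs.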
Typically, multipliers are estimated for each period or each year of each period. Three primary considerations arise in specifying this reference group.

First, it almost goes without saying that the reference group must be representative of the treated entities. That is, the reference group must be similar to the treated group in terms of geometric design, traffic volumes, vehicle fleet, and so on. Where the reference group is also used to account for time trends, a test of comparability should be applied to evaluate its suitability in this respect. In essence, this test of comparability compares a time series of target accident counts for a treatment group and a candidate reference group. If a candidate reference group is a good one, then the yearly trends in accident frequencies track each other well over time. Hauer (1997) proposes calculating a sequence of sample odds ratios using 1 year of “before” data and the following year as the “after” data, starting with years 1 and 2 and incrementally increasing by 1 year. From this sequence of ratios, the sample mean and standard error are determined. If this sample mean is not sufficiently close to 1.0 then the candidate reference group is unsuitable. As an example, consider the data from one of the cities used to study the effects of red light cameras that was referenced earlier (Persaud et al., 2005). The mean and standard errors of the sequence of sample odds ratios for total accidents were estimated to be 1.045 and 0.150, respectively. If we selected a 95% confidence interval the odds ratio is estimated to be between 0.751 and 1.339, suggesting that there is not sufficient evidence to conclude that the odds ratio is in fact not 1.0 and, therefore, that the candidate reference group cannot be rejected on this basis.

The second consideration arises when all or most of the entities that would form a potential reference group are treated. In this case, there is no natural reference group. However, regression-to-the-mean is unlikely in evaluating the composite treatment effect over all sites, since all or almost all sites are selected for treatment and not just those with a high accident frequency. Nevertheless, an EB evaluation is still preferred because it can be informative to look separately at the disaggregate treatment effects for sites with high and low accident counts. In doing an EB study in this case, the “before” period

for these later years, a comparison group of sites that consisted of as yet untreated locations, or locations on which RPMs had been installed prior to the beginning of the study period, was identified where possible, to account for time trends between the SPF calibration period and the rest of the analysis period. For example, for four-lane freeways in Missouri and Pennsylvania, the comparison group consisted of a sample of multilane (non-freeway) roadways.

The third issue is that in some cases the treatment may affect the logical reference group. Red light camera programs are a classical example, but there is evidence of this effect for other measures, such as traffic calming, all-way stop installation, and raised pavement markers. A good example of how this issue can affect study results is an evaluation of raised pavement markers by Orth-Rodgers Associates Inc. (1998) who estimated the effects on nighttime crashes at 91 interstate highway locations in Pennsylvania for “before” and “after” periods of 1–3 years. Daytime crashes at the same sites were used as a comparison group. The authors found an insignificant 1.2% decrease in all nighttime crashes, but suspected that the lack of an “expected” positive effect might have been due to the fact that there was a reduction in the daytime crashes (due to the rumbling effect of RPMs) that was used for the comparison group.

In the case of red light cameras (RLC), the actual hope is that there would be a general deterrent or spillover effect at all signalized intersections, not just those with cameras, especially if the public does not know where the cameras are. Ignoring spillover effects to intersections without RLCs will lead to an underestimation of RLC benefits, more so if sites with these effects are used as a comparison group. To resolve this issue in the recent EB evaluation of RLCs by Persaud et al. (2005), the effects of regression-to-the-mean and changes in traffic volume were explicitly accounted for using safety performance functions (SPFs) relating crashes of different types and severities to traffic flow and other relevant factors for each jurisdiction based on signalized intersections without RLCs. Annual SPF multipliers were calibrated to account for the temporal effects on safety of variation in weather, demography, crash reporting and so on. This is common practice in applying the EB methodology out-
data for the treatment group can be used to develop the SPF for lined in Eqs. (1)–(6). However, because of the possibility of
use in the EB methodology. Factors for the “after” period at the spillover effects to neighboring signalized intersections, it was
treatment sites could not be obtained this way since the after decided to estimate annual multipliers for the period after the
period SPF is required to estimate what would happen without first RLC installation from the trend in annual multipliers of
the treatment. In this case, another entity set for the jurisdiction SPFs calibrated for a comparison group of unsignalized inter-
is used to derive a trend between the after and before period, in sections in the jurisdiction of interest. The assumption here, of
effect a ratio of the “after” period SPF multiplier to the “before” course, is that this comparison group is unaffected by the treat-
period multiplier. This trend factor is then applied to the SPF ment.
based on the “before” period data. This procedure was followed To illustrate, consider the information in Table 4 for one of the
for a recent study of raised pavement markers (RPMs), which cities that supplied data for the study. Using the SPFs calibrated
were installed non-selectively for four-lane freeways in Mis- with the reference data collected for unsignalized and signalized
souri and Pennsylvania and for two-lane roadways in Illinois intersections, annual multipliers were estimated for each year.
and New Jersey. For these, the reference group information used The year of the first camera installation was 1996. In case there
for calibrating SPFs comprised the before period data at all the was a spillover effect at the non-treated signalized intersections,
identified locations with RPMs. This meant that data available the annual multipliers estimated from the signalized reference
for calibrating the SPFs would be non-existent for the period group were applied only for the years 1992–1995.
after non-selective installation was complete, and could be lean Yearly multipliers for the years 1996–2002 were then esti-
toward the end of the installation period. To calibrate the SPFs mated as follows:

Table 4
Development of yearly trend factors when a treatment may affect logical reference sites

                  Reference group
Year              Unsignalized   Adjusted unsignalized   Signalized   Adjusted signalized
1992              1.34                                   1.61
1993              1.20                                   1.52
1994              1.17                                   1.44
1995              0.99                                   1.15
Avg. 1992–1995    1.18                                   1.43
1996              0.84           0.71                    0.89         1.02
1997              0.78           0.66                    0.88         0.95
1998              0.93           0.79                    0.94         1.13
1999              0.78           0.66                    1.01         0.95
2000              1.23           1.05                    1.09         1.50
2001              0.84           0.71                    1.21         1.02
2002              1.96           0.82                    1.09         1.17
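
The arithmetic behind the adjusted columns of Table 4 can be sketched in a few lines. This is a minimal illustration rather than the authors' own code: the function name `adjust_multipliers` and the dictionary layout are assumptions made for this example, and only the 1996–2001 rows of Table 4 are used as inputs. It averages the pre-installation multipliers for each reference group, scales the unsignalized multipliers by their own average, and applies the resulting trend to the signalized average.

```python
from statistics import mean

def adjust_multipliers(unsig_before, sig_before, unsig_after):
    """Return (adjusted unsignalized, adjusted signalized) multipliers by year.

    unsig_before, sig_before: yearly multipliers before the first installation.
    unsig_after: unsignalized yearly multipliers after the first installation.
    """
    # Average multiplier of each reference group over the before years.
    avg_unsig = mean(unsig_before.values())
    avg_sig = mean(sig_before.values())
    # Unsignalized trend relative to its own before-period average.
    adj_unsig = {yr: m / avg_unsig for yr, m in unsig_after.items()}
    # Apply that trend to the signalized before-period average.
    adj_sig = {yr: avg_sig * f for yr, f in adj_unsig.items()}
    return adj_unsig, adj_sig

# Values copied from Table 4.
unsig_before = {1992: 1.34, 1993: 1.20, 1994: 1.17, 1995: 0.99}
sig_before = {1992: 1.61, 1993: 1.52, 1994: 1.44, 1995: 1.15}
unsig_after = {1996: 0.84, 1997: 0.78, 1998: 0.93,
               1999: 0.78, 2000: 1.23, 2001: 0.84}

adj_unsig, adj_sig = adjust_multipliers(unsig_before, sig_before, unsig_after)
for year in sorted(adj_sig):
    print(year, round(adj_unsig[year], 2), round(adj_sig[year], 2))
```

With these inputs the rounded results agree with the adjusted columns of Table 4 for the years shown, for example 0.71 and 1.02 for 1996.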

• Step 1. For both the unsignalized and signalized reference groups the average yearly multiplier between 1992 and 1995 was determined.
• Step 2. The unsignalized reference group yearly multipliers for the years 1996–2002 were divided by the average from 1992 to 1995. These results are shown in the column labeled “Adjusted Unsignalized”.
• Step 3. Yearly multipliers for the signalized reference group for 1996–2002 were then estimated by multiplying the average from 1992 to 1995 by the adjusted unsignalized factor for each year.

The data in Table 4 show that, with the exception of 1999 and 2001, the yearly multipliers are being increased. The means of the unadjusted and adjusted factors from 1996 to 2002 are 1.02 and 1.11, respectively, indicating a possible spillover effect, and justifying the approach used.

5.3. Issue 3: properly accounting for traffic volume changes

Volume changes on roadways will occur and, typically, AADTs increase over time, of the order of 2–4% per annum. These changes, by themselves, will cause accident frequencies to increase, usually by less than the increase in traffic because of the non-linear relationship between accidents and traffic volume, which typically has a decreasing slope. Therefore, in evaluating treatments, benefits would be underestimated if these changes are not properly accounted for. In a naïve evaluation, the proportional effect on the expected accident frequency after treatment is assumed to be the same as the proportional increase in traffic volume. This assumption is incorrect, because of the non-linear, decreasing slope relationship between accidents and traffic volume. This would serve to further exaggerate the benefit of the treatment, if regression-to-the-mean is already present. The EB method properly accounts for the effect of the increase in traffic volume by using safety performance functions to represent the actual relationship, linear or otherwise, between accidents and traffic volumes.

It can be argued that, since the 2–4% increase in traffic is so small and within the realm of an insignificant change, and since this results in an even smaller increase in accidents, using SPFs and the EB methodology to account for this change is, in effect, overkill. However, changes in traffic volume at treatment sites can in fact be much larger, and can go in either direction, because many treatments by themselves cause such changes. Intersection treatments such as left turn accommodation, traffic signal installation, red light cameras and conversion to roundabouts are well known to affect traffic volumes. The data in Table 5 illustrate this phenomenon for roundabout conversions evaluated by Persaud et al. (2001). It appears that many roundabouts, by alleviating a congestion problem, actually re-attracted traffic that had gone elsewhere to avoid the congestion.

Table 5
Illustrating traffic volume changes for intersections converted to roundabout

AADT before   AADT after   % change
7185          9840         37.0
7650          8500         11.1
7654          9293         21.4
11934         12205        2.3
12627         15990        26.6
12627         11010        −12.8
13272         26691        101.1
13300         16900        27.1
13972         15500        10.9
15153         17825        17.6
15300         17000        11.1
15300         17000        11.1
15345         17220        12.2
15600         18450        18.3
18000         20000        11.1
18475         27525        49.0
18795         31476        67.5
18942         30418        60.6
22030         31525        43.1
27000         30000        11.1

If the objective of the evaluation is to assess a treatment programme in a specific jurisdiction, then these traffic volume changes, and the resulting safety changes at the treated sites and elsewhere, are part of the effect that is being evaluated. However, a more common objective has been to estimate the effect of a treatment to derive an accident modification factor for widespread application; in that case, it is of interest to estimate

Table 6
Roundabout evaluations with and without accounting for AADT changes

                                Expected after without treatment
Accident type                   Naïve        EB        Recorded after   Apparent reduction (naïve)   Actual reduction (EB)
All (with AADT change)          553          455       275              278 (50%)                    180 (40%)
All (without AADT change)       436          354       275              161 (37%)                    79 (22%)
Injury (with AADT change)       84           58        12               72 (86%)                     46 (79%)
Injury (without AADT change)    68           48        12               56 (82%)                     36 (75%)
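
The reductions in Table 6 follow directly from the expected and recorded counts. As a quick check, the sketch below recomputes them; the function name `reduction` and the tuple layout are illustrative assumptions, and the inputs are simply the counts printed in the table.

```python
def reduction(expected_without_treatment, recorded_after):
    """Return (accidents avoided, percent reduction) for one row."""
    avoided = expected_without_treatment - recorded_after
    percent = 100.0 * avoided / expected_without_treatment
    return avoided, percent

# (label, naive expected, EB expected, recorded after), copied from Table 6.
rows = [
    ("All (with AADT change)", 553, 455, 275),
    ("All (without AADT change)", 436, 354, 275),
    ("Injury (with AADT change)", 84, 58, 12),
    ("Injury (without AADT change)", 68, 48, 12),
]

for label, naive_expected, eb_expected, recorded in rows:
    naive_avoided, naive_pct = reduction(naive_expected, recorded)
    eb_avoided, eb_pct = reduction(eb_expected, recorded)
    print(f"{label}: naive {naive_avoided} ({naive_pct:.0f}%), "
          f"EB {eb_avoided} ({eb_pct:.0f}%)")
```

For the first row this gives 278 (50%) for the naïve comparison and 180 (40%) for the EB estimate, matching the table.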

the effect of the treatment had there been no change in traffic volume. This second objective is the focus of this discussion.

To further emphasize the point in the previous paragraph, the roundabout conversion data were re-analyzed with the assumption that, for all 23 conversions in the study, traffic volumes were the same for the before and after periods. The results of this reanalysis are compared in Table 6 with the original results in Table 2 for which actual changes in traffic volumes were available for most of the intersections and were accounted for. Quite clearly, not accounting for AADT changes makes a substantial difference.

6. Lessons learned and current and future directions

There are two principal messages in the paper. The first is that, based on evidence from actual studies and empirical data, the EB methodology, if properly undertaken, does produce results that are substantially different, and more valid, than those produced by more traditional methods. It is therefore worth the investment in data collection and analysis, and in training analysts, to undertake such evaluations. On the other hand, quick and dirty conventional evaluations, often done as a compromise of convenience, will produce questionable results, and should generally be avoided. The second message is a caution against blind faith in assessing the validity of studies undertaken with the EB methodology and in using these results for deriving crash modification factors. To this end, a number of issues that are critical to the proper conduct of EB evaluations were raised and illustrated based on recent experience.

Current and future research seems to be in the areas of improving SPFs and, more fundamentally, exploring whether or not the increased sophistication in these is worth the considerable effort in collecting the required data to develop the best possible SPFs. Related to this is the estimation of the negative binomial dispersion parameter which, as shown in Eqs. (3) and (4), is crucial to the EB methodology. There are current efforts (Miaou and Lord, 2003) that recognize that this dispersion parameter is not constant for a given SPF, as was assumed in all EB studies done to date; these efforts are aimed at modelling this parameter as a function of the SPF variables. A useful complement to this research should be an examination of whether this increased accuracy matters materially.

Finally, there are several researchers who are currently exploring full Bayes (FB) modeling (Miaou and Lord, 2003; Pawlovich et al., 2006) for evaluating safety treatments. There are a number of attractive characteristics of the full Bayes approach. The modeling framework allows for the specification of quite complex model forms not easily handled in conventional generalized linear modeling approaches. For example, models including both multiplicative and additive terms (to represent point hazards such as driveways) can be estimated with relative ease. The small sample properties of FB models should also allow the estimation of valid models with smaller sample sizes. This may be particularly valuable for evaluating treatments based on relatively rare accident types such as those involving pedestrians. Full Bayes modeling also provides the ability to include prior knowledge on the values of the coefficients in the modeling along with the data collected. Perhaps most advantageous is the ability to consider spatial correlation between sites in the model formulation. Spatial correlation considers the effect of one location’s proximity to other locations on the expected accident frequency. For before–after studies, spatial correlation will likely be an issue where the treated and comparison sites are close together. Considering this spatial correlation allows the inclusion of sites geographically close to each other. A recent study of county-level injury and fatal crashes in Pennsylvania published in Accident Analysis and Prevention (Agüero-Valverde and Jovanis, 2006) found spatial correlation to be significant. If exposure over time is not known then the comparison group selected should in fact be as close in proximity as possible to the treated site since the exposure is more likely to be similar than if the comparison sites were farther away. The disadvantage of the full Bayes approach, at least for the time being, is that the methodology is quite complex, and may require a very high level of statistical training, especially since it does not lend itself readily to implementation in a black box. Once the FB research gets far enough along, it will be of interest to do an extensive comparison of the EB and FB results.

Acknowledgements

This paper is based on research conducted by one or both authors over the years, often in collaboration with other researchers. The contributions of the other researchers, especially Dr. Ezra Hauer, who literally wrote the book on the empirical Bayes methodology, are gratefully acknowledged, as are those by various persons in a number of agencies that provided data and guidance. Sponsors of these research projects deserve special credit for their past and on-going support that has likely saved lives. These include the Natural Sciences and Engineering Research Council of Canada, the Federal Highway Administration, the National Cooperative Highway Research
Program, the Insurance Institute for Highway Safety, Transport Canada, and the Network Centres of Excellence on the Automobile and the 21st Century (Auto21). Versions of this paper were presented at the Road Safety on Four Continents Conference held in Warsaw, Poland in October 2005 and at the Transportation Research Board 2006 Annual Meeting.

References

Agüero-Valverde, J., Jovanis, P., 2006. Spatial analysis of fatal and injury crashes in Pennsylvania. Accident Anal. Prev. 38 (3), 615–618.
Council, F., Persaud, B., Lyon, C., Eccles, K., Griffith, M., Zaloshnja, E., Miller, T., 2005. Guidance for implementing red light camera programs based on an economic analysis of safety benefits. Transport. Res. Rec. 1922, 38–43.
Elvik, R., 2004. To what extent is there bias by selection? Selection for road safety treatment in Norway. Transport. Res. Rec. 1897, 200–205.
Griffith, M.S., Council, F.M., 1999. The Highway Safety Information System: United States Department of Transportation multi-state safety analysis database. In: Proceedings of the Traffic Safety on Two Continents Conference. Malmo, Sweden.
Hauer, E., Persaud, B.N., 1987. How to estimate the safety of rail highway grade crossings and the safety effect of warning devices. Transport. Res. Rec. 1114, 131–140.
Hauer, E., Persaud, B.N., 1983. A common bias in before and after comparisons and its elimination. Transport. Res. Rec. 905, 164–174.
Hauer, E., 1997. Observational Before–after Studies in Road Safety: Estimating the Effect of Highway and Traffic Engineering Measures on Road Safety. Pergamon Press/Elsevier Science Ltd., Oxford/UK.
Hingson, R., Hereen, T., Winter, M., 1996. Lowering State legal blood alcohol limits to 0.08%: the effect on fatal motor vehicle crashes. Am. J. Public Health 86 (9), 1297–1299.
Lyon, C., Persaud, B., Haq, A., Kodama, S., 2005. Development of safety performance functions for signalized intersections in a large urban area and application to evaluation of left turn priority treatment. Transport. Res. Rec. 1908, 165–171.
Miaou, S.-P., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transport. Res. Rec. 1840, 31–40. TRB, National Research Council, Washington, DC.
Orth-Rodgers Associates Inc., 1998. Safety and congestion management research and advanced technology applications: final report. Technical Assistance to the RPM Task Force, Research Work Order Number 1, Orth-Rodgers Associates Inc., Philadelphia, PA.
Pendleton, O., 1996. Evaluation of accident analysis methodology. Federal Highway Administration Report FHWA-RD-96-039.
Persaud, B., Council, F., Lyon, C., Eccles, K., Griffith, M., 2005. Multi-jurisdictional safety evaluation of red light cameras. Transport. Res. Rec. 1922, 29–37.
Persaud, B., Retting, R., Lyon, C., 2004. Crash reductions following installation of centre-line rumble strips. Accident Anal. Prev. 36, 1073–1079.
Persaud, B., Retting, R., Garder, P., Lord, D., 2001. Safety effect of roundabout conversions in the U.S.: empirical Bayes observational before–after study. Transport. Res. Rec. 1757, 1–8.
Persaud, B.N., Hauer, E., Retting, R., Vallurapalli, R., Mucsi, K., 1997. Crash reductions following traffic signal removal in Philadelphia. Accident Anal. Prev. 29, 803–810.
Persaud, B.N., Hauer, E., Lovell, J., 1984. The Safety Effect of Conversion to All Way Stop Control in Philadelphia. University of Toronto, Department of Civil Engineering Publication No. 84, p. 14.
Persaud, B.N., Hauer, E., 1984. A comparison of two methods for de-biasing before and after accident studies. Transport. Res. Rec. 975.
Scopatz, R., 1998. Methodological study of between-states comparisons, with particular application to .08% BAC law evaluation. In: Presented at the Transportation Research Board 77th Annual Meeting, Washington, DC.
Pawlovich, M., Li, W., Carriquiry, A., Welch, T., 2006. Iowa’s experience with road diet measures: use of Bayesian approach to assess impacts on crash frequencies and crash rates. Transport. Res. Rec. 1953, 163–171.
Zaloshnja, E., Miller, T., Council, F., Persaud, B., 2004. Comprehensive and human capital crash costs by maximum police-reported injury severity within selected crash types. In: Proceedings of the Annual Meeting. American Association for Automotive Medicine, Key Biscayne, FL.
