
A data mining approach towards highway crash data analytics

Road traffic accidents are the leading cause of death by injury and the tenth-leading
cause of all deaths globally. By analyzing crash data, road traffic injuries can be
predicted and preventive measures adopted. Crash data often contains correlated
features, and high correlation and a high number of missing values are its two main
drawbacks. The resulting plethora of dimensions in the input features is the challenge of
crash data analytics, one that traditional statistical methods cannot cope with alone. This
is where a combinatorial metaheuristic optimization algorithm comes in handy.
In this research effort, 200 fatal and injury accidents are sampled from the files of the Tehran
Traffic Police for analysis. The processed data is put through a supervised classification
algorithm, namely logistic regression, in order to identify crash contributors, which are the
independent variables of the mathematical models, and to uncover accident-prone scenarios.
Using a purpose-built software package coded in Visual Basic, the regression objective
functions are optimized by the Particle Swarm Optimization algorithm. This algorithm is
iterative in nature and is able to find near-optimal solutions for nonlinear objective
functions.
Among causes contributing to fatal crashes, pedestrians attempting to cross urban highways
and interfering with the traffic while doing so, drivers using reverse gear to turn onto a
missed ramp, and road segments that do not conform to standard design are by far the most
fatal.

Key words
Logistic Regression, Particle Swarm Optimization, fatal accident, crash contributors

----------------------------------------------------------------------------------------------------

1 INTRODUCTION
Numerous driver, vehicle, roadway and environmental factors contribute to crashes. In
addition to main effects, interactions between factors are very likely to be significant. The
large number of potentially important factors, combined with the complex nature of crash
etiology and injury outcome, presents significant challenges to the safety analyst. In-depth
studies of behavioral factors in road accidents using conventional methods are often
inconclusive and costly. Freeway fatal accidents represent a complex phenomenon that is
difficult to predict precisely [24].
In the past decade, the concept of agent-based modeling has been developed and applied to
problems that exhibit a complex behavioral pattern. Agent-based modeling is an approach
based on the idea that a system is composed of decentralized individual "agents" and that
each agent interacts with other agents according to localized knowledge. Through the
aggregation of the individual interactions, the overall image of the system emerges.
A special kind of artificial agent is created by analogy with social insects (wasps, ants
and termites). Swarm intelligence (Beni and Wang, 1989) is the branch of Artificial
Intelligence based on the study of the behavior of individuals in various decentralized
systems. In this paper we present a class of swarm intelligence called particle swarm
optimization (PSO). PSO is used to build accident prediction models. The likelihood function
is maximized as the algorithm's objective, or measure of fitness. This paper is organized in
the following way.
A literature review of accident prediction models and the application of PSO in transportation
problems is presented in the next section. Then follows the methodology of this research: data
description, definition of the mathematical models, testing for the significance of the
coefficients, interpretation of the estimated values, and PSO algorithm details. Finally,
conclusions, discussion, and suggestions are presented.

2 LITERATURE REVIEW
There has been considerable interest in the area of traffic accident likelihood modeling. This
modeling mainly falls into two categories: offline modeling and online modeling.
Offline models focus on the impacts of influencing factors, whereas online models
primarily use short-term traffic measures and environmental conditions measured in real time.
For accident frequencies, Poisson models and negative binomial models have been used.
Shankar et al. (1995) estimated and analyzed the monthly accident frequencies on
rural freeway segments. Lee and Mannering (2002) used zero-inflated Poisson (ZIP) and
zero-inflated negative binomial (ZINB) regression models to identify the factors that influence
the frequency of relatively rare run-off-roadway accidents. Lord et al. (2004) argued that ZIP
and ZINB models may not be appropriate for modeling accident frequency, because a
preponderance of zeros may not be caused by a dual-state process as assumed in those models.
Noland (2003) used fixed-effects negative binomial models to investigate the impacts of roadway
infrastructure improvements on fatal and injury traffic crashes. Taking accident duration into
account, Chung (2010) developed an accident duration prediction model for the Korean
freeway system. For the severity of accidents, the effects of roadway characteristics
on vehicle-occupant injuries have been quantified using a wide variety of models, including
Logit models, dual-state multinomial Logit models, nested Logit models, mixed Logit models,
and ordered Probit models: Carson and Mannering (2001), Khattak (2001), Khattak et al. (2002),
Kockelman and Kweon (2002), Lee and Mannering (2002), Abdel-Aty (2003), Kweon and
Kockelman (2003), Ulfarsson and Mannering (2004), Yamamoto and Shankar (2004),
Khorashadi et al. (2005), Lee and Abdel-Aty (2005), Islam and Mannering (2006), Eluru and
Bhat (2007), Milton et al. (2008), Eluru et al. (2008), Savolainen and
Mannering (2007), Malyshkina and Mannering (2009). The impact of freeway design exceptions
on the frequency and severity of vehicle accidents was assessed by Malyshkina and
Mannering (2010). On the frequency of accidents and their severity in terms of
resulting injuries, Lord et al. (2008) presented a Conway-Maxwell-Poisson generalized linear
model, and Anastasopoulos and Mannering (2009) introduced a negative binomial model with
random parameters.
As for the state of the art in applying the PSO algorithm to transportation and traffic
engineering, Gu et al. (2018) [27] presented a prediction model of traffic fatalities based on
a support vector machine optimized by particle swarm with mutation. Xu et al. (2019) [28]
presented a deep learning model using a recurrent neural network (RNN) combined with
particle swarm optimization (PSO) to predict the crash density at different severity levels,
such as property damage only (PDO) and fatal-injury crashes. Gao et al. (2019) [29] applied
an improved PSO algorithm to vehicle crash research and carried out multi-objective
optimization based on an approximate model. The improved PSO algorithm performed remarkably
on the collision index, which includes vehicle acceleration, critical position intrusion, and
vehicle mass. In summary, the improved PSO algorithm had excellent optimization effects on
vehicle collision.
Srinivasan et al. (2003) used PSO to solve the automatic incident detection problem on
freeways. Zhu et al. (2006) developed a PSO search strategy for the vehicle routing problem
with time windows (VRPTW).
Zhao et al. (2006) proposed an urban traffic flow forecasting model for the case of adjacent
intersections. A novel particle swarm optimization algorithm was introduced by Hao et al.
(2006) to solve the transportation problem. Mohammed et al. (2008) solved the shortest path
problem by using PSO.

This study uses the PSO algorithm to build accident prediction models based on freeway
accident records.

3 Proposed Accident Modeling Approach

In the following section the method used for accident prediction modeling is introduced and
the step-by-step approach is defined.
One approach that can be used to better understand the factors that affect traffic safety is
the use of Accident Prediction Models (APMs). The coefficients of APMs cannot
be estimated by the traditional ordinary least-squares or weighted least-squares regression
methods, because the assumptions of these methods are violated by the discrete,
non-negative nature of accident count data. It is now common to estimate the coefficients of
APMs by maximum-likelihood methods, calibrating what are referred to as Generalized Linear
Models (GLMs).
In this paper, models are built following a "step forward" procedure. The first model built is
a "constant only" model. Crash factors are then added step by step. A variable is justified to
enter the accident prediction scenario only if adding it improves the log likelihood of the
model. The estimator of the "constant only" model, β0, is calculated in Table 2.
Continuing the forward procedure, Table 3 summarizes the estimators resulting from the
one-variable models. Based on the confidence level attained (α), the causal factors listed in
Table 4 are justified to enter the second phase of modeling (two-variable mathematical
models).

3.1 Data preparation


Developing a database that contains the variables used for modeling is the first practical
step. Identifying preliminary variables based on the purpose of model development and on
theoretical considerations is the second step. This study is based on a Middle Eastern
metropolitan Traffic Police database of fatal and injury accidents on the city's freeways. The
term "freeway" in this study covers both urban freeways and freeways on the suburban
network. 200 police case files reporting fatal and injury accidents were sampled.
Data was extracted from the handwritten reports of police experts, then categorized and coded
for computer analysis as shown in Table 1.
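
The coding step can be illustrated with a short sketch. It is written in Python rather than the Excel macro used in this study, and the function name `dummy_code` and the choice of an all-zero reference level for categories without their own indicator (such as "other factors") are assumptions made for illustration only.

```python
def dummy_code(prefix: str, category: int, n_dummies: int) -> dict:
    """Indicator (0/1) variables for one categorical value.

    Categories 1..n_dummies each get their own indicator. Any other
    category is treated as the reference level and yields all zeros;
    that choice is an assumption of this sketch.
    """
    return {f"{prefix}{j}": int(category == j) for j in range(1, n_dummies + 1)}


# Accident cause coded 2 ("vehicle parked") in Table 1 sets X32 to 1.
print(dummy_code("X3", 2, 6))
```

Each categorical variable of Table 1 would be expanded this way before the indicators enter a logit function.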

3.2 Interpretation of the Fitted Models

In the logistic regression model, the value of each coefficient, $\hat{\beta}_j$, indicates the
marginal change of the logit function. The precise interpretation of each estimated
coefficient is the difference in the logit function for a unit increase in $x_j$, as given by
equation 3.
$$\hat{\beta}_j (x_j + 1) - \hat{\beta}_j x_j = \hat{\beta}_j \tag{3}$$
This difference is used to calculate a ratio called the "odds ratio".
The odds ratio is the odds of the response variable when $x_j = 1$ divided by the odds of the
response variable when $x_j = 0$, as defined by equations 4 and 5.
( βˆ0 + βˆ1x 1 +...+ βˆ j +...+ βˆ p )
e 1
( βˆ0 + βˆ1x 1 +...+ βˆ j +...+ βˆ p )
/ ( βˆ0 + βˆ1x 1 +...+ βˆ j +...+ βˆ p )
= 1 + e( βˆ + βˆ x +...+ 0+...+ βˆ ) 1+ e (4)
e 0 11 p
1
( βˆ0 + βˆ1x 1 +...+ 0 +...+ βˆ p )
/ ( βˆ0 + βˆ1x 1 +...+ 0 +...+ βˆ p )
1+ e 1+ e
( βˆ + βˆ x +...+ βˆ +...+ βˆ )
e 0 11 j p
βˆ j
= = ˆ ˆ ˆ
( β + β x +...+ 0 +...+ β p )
e (5)
e 0 11
The odds ratio expresses the effect of a specific independent variable with coefficient
$\hat{\beta}_j$ on the response variable: changing $x_j$ from 0 to 1 multiplies the odds of the
response by $e^{\hat{\beta}_j}$.
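
The relation between a coefficient and its odds ratio can be checked numerically. The following Python sketch is illustrative only: the coefficient values are hypothetical and are not taken from the fitted models, and the helper names `odds`, `logistic` and `odds_ratio` are introduced here for the example.

```python
import math


def logistic(g: float) -> float:
    """Probability corresponding to a value g of the logit function."""
    return 1.0 / (1.0 + math.exp(-g))


def odds(p: float) -> float:
    """Odds corresponding to a probability p with 0 < p < 1."""
    return p / (1.0 - p)


def odds_ratio(beta_j: float) -> float:
    """Odds ratio of a 0/1 variable with coefficient beta_j."""
    return math.exp(beta_j)


# Hypothetical coefficients, for illustration only.
beta_0, beta_j = -1.2, 0.45

p_with = logistic(beta_0 + beta_j)   # x_j = 1
p_without = logistic(beta_0)         # x_j = 0

# The ratio of the two odds agrees with exp(beta_j).
ratio = odds(p_with) / odds(p_without)
print(ratio, odds_ratio(beta_j))
```

Note that the coefficient scales the odds, not the probability itself; the change in probability depends on the baseline value.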
It is inferred from Table 3 that the prime contributor to fatal accidents on the studied
metropolitan freeways is low-standard road design, as implied by its odds ratio of 1.78. The
second most important contributor is pedestrians attempting to cross the freeway and
interfering with the traffic (OR=1.59). Last but not least, drivers using reverse gear
to drive in the opposite direction increase the probability of fatal accidents by 150 percent.
Among freeway sections, off ramps (OR=1.35) and in ramps (OR=1.07) are more prone to fatal
crashes. Drivers younger than 25 tend to be more involved in fatal crashes (OR=1.41), and
among the days of the week, the odds of a fatal accident happening on a Thursday (a half
working day according to the Persian calendar) are 1.33 times greater than on other days.

3.3 Two Variable Models

Two scenarios are presented. The first logit function, g1(x), is defined by equation 6:

$$g_1(x) = -1.0385 + (-1.0219)\,x_{32} + (-8.4161)\,x_{53} \tag{6}$$

$$p(\text{occurrence of fatal crash}) = \frac{e^{g_1(x)}}{1 + e^{g_1(x)}} \tag{7}$$

$$p(\text{occurrence of injury crash}) = 1 - p(\text{occurrence of fatal crash}) = 1 - \frac{e^{g_1(x)}}{1 + e^{g_1(x)}} \tag{8}$$

Where
$x_{32}$: Accident cause: vehicle stopped by the side of the freeway, partially blocking the low speed lane
$x_{53}$: Guilty vehicle involved in the accident: motorcycle
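
As an illustration of equations 6 to 8, the Python sketch below evaluates the fatal and injury probabilities for the four combinations of the two indicator variables, using the coefficients reported in equation 6. The function names are introduced here for the example and are not part of the software package described later.

```python
import math

# Coefficients of g1(x) as reported in equation 6.
B0, B32, B53 = -1.0385, -1.0219, -8.4161


def g1(x32: int, x53: int) -> float:
    """Logit function of equation 6 for 0/1 indicators x32 and x53."""
    return B0 + B32 * x32 + B53 * x53


def p_fatal(g: float) -> float:
    """Equation 7: probability of a fatal crash for logit value g."""
    return math.exp(g) / (1.0 + math.exp(g))


def p_injury(g: float) -> float:
    """Equation 8: probability of an injury crash."""
    return 1.0 - p_fatal(g)


for x32 in (0, 1):
    for x53 in (0, 1):
        g = g1(x32, x53)
        print(x32, x53, round(p_fatal(g), 4), round(p_injury(g), 4))
```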
For this model the confidence level follows from:
$$p\left(\chi^2(1) > 5.3509\right) < 0.02$$
As a confidence level of 98 percent is acceptable (error level = 0.02), the model is approved.
Thus, in studying combinations of factors leading to fatal freeway crashes, as described by
the logit transform g1(x), the combination of a vehicle stopped by the side of the low speed
lane of a freeway and a motorcycle involved in the accident produces a log likelihood of
-40.7911 in the prediction model.
The second scenario is defined by the logit function g2(x):
$$g_2(x) = -1.6748 + (1.0324)\,x_{51} + (-5.4335)\,x_{53} \tag{9}$$
Where
$x_{51}$: Guilty vehicle involved in the accident: Pride
$x_{53}$: Guilty vehicle involved in the accident: motorcycle
For this model, equation 10 applies:
$$p\left(\chi^2(1) > 0.4229\right) < 0.5154 \tag{10}$$

95% is also an acceptable confidence level. g2(x) is approved.
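
The upper-tail probability of a chi-square statistic with one degree of freedom, the quantity used in the tests above, can be recomputed with the Python standard library. This is a minimal sketch: the function name `chi2_sf_1df` is introduced here, and it relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def chi2_sf_1df(statistic: float) -> float:
    """Upper-tail probability P(chi2(1) > statistic).

    For one degree of freedom, chi2 is the square of a standard normal Z,
    so P(chi2 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


# Test statistics quoted for the two-variable models g1(x) and g2(x).
for label, statistic in (("g1", 5.3509), ("g2", 0.4229)):
    print(label, statistic, chi2_sf_1df(statistic))
```

A smaller tail probability corresponds to stronger evidence that the added variable improves the model.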


So the combination of the passenger car involved being a Pride and a motorcycle involved
in the accident produces a log likelihood of -39.6879; the latter combination therefore better
predicts a two-variable scenario leading to fatal freeway crashes.
Among the three-variable scenarios tested for validity, g3(x) has the lowest error level:

$$g_3(x) = -1.6879 + (1.1924)\,x_{51} + (-4.1372)\,x_{53} + (-1.1428)\,x_{33} \tag{11}$$

Where
$x_{51}$: Guilty vehicle involved in the accident: Pride
$x_{53}$: Guilty vehicle involved in the accident: motorcycle
$x_{33}$: Accident cause: sudden weaving between lanes
Among the three-variable combinations of accident-causing variables, the combination of a
Pride and a motorcycle involved in the accident, together with a critical weaving maneuver
between lanes prior to the accident, produces a log likelihood of -39.4751.
Another three-variable scenario is described by the logit transform g4(x):
$$g_4(x) = -1.6545 + (1.189)\,x_{51} + (-12.6857)\,x_{53} + (-0.4058)\,x_{32} \tag{12}$$
Where
$x_{51}$: Guilty vehicle involved in the accident: Pride
$x_{53}$: Guilty vehicle involved in the accident: motorcycle
$x_{32}$: Accident cause: vehicle stopped by the side of the freeway, partially blocking the low speed lane
This combination is the involvement of both a Pride and a motorcycle in the accident
while a vehicle is stopped by the side of the freeway, partially blocking the low speed
lane; its log likelihood is -39.5876, as given by equation 12. The first three-variable
combination produces a higher log likelihood, so it predicts a more probable scenario
leading to a fatal freeway crash.
The process of adding variables to the model is halted after g5(x) is produced. As the new
models lose their adequacy afterwards, the most complicated scenario leading to fatal crashes
is described by the logit function g5. This scenario involves at least a motorcycle and a
Pride passenger car, with at least one critical weaving maneuver between lanes
prior to the crash, and a crash location within the limits of a freeway bridge, as given by
equation 13.

$$g_5(x) = (-1.4270) + (0.9431)\,x_{51} + (-8.8231)\,x_{53} + (-52.46)\,x_{33} + (-0.1084)\,x_{63} \tag{13}$$

Where
$x_{51}$: Guilty vehicle involved in the accident: Pride
$x_{53}$: Guilty vehicle involved in the accident: motorcycle
$x_{33}$: Accident cause: sudden weaving between lanes
$x_{63}$: Accident location: within the limits of a bridge

3.4 Assessing the fit of the model


An interesting consequence of logistic regression is equation 14.
$$\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \hat{\pi}(x_i) \tag{14}$$
As can be observed from the numerical values in Table 5, the sum of the predicted values in
the built models approximately reproduces the sum of the observed values, $\sum y_i = 41$,
obtained from traffic records. As depicted by Figure 1, the log likelihood shows an
increasing trend as the accident-leading scenarios get more complicated.

• guilty vehicle: "Pride": -41.6
• guilty vehicle: "Pride" + motorcycle: -39.6879
• guilty vehicle: "Pride" + motorcycle + accident cause: weaving between lanes: -39.4751
• guilty vehicle: "Pride" + motorcycle + accident cause: weaving between lanes + accident location: within the limits of a highway bridge: -39.1503

Figure 1. Increasing trend of log likelihood predicts a more fatal situation
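
Equation 14 can be checked numerically on a small example. The Python sketch below fits a one-variable logistic model with an intercept by Newton's method (a standard maximum-likelihood technique, used here in place of the PSO applied in this study) and compares the two sums. The data are hypothetical toy counts, not the police records, and the function name `fit_logistic` is introduced for illustration.

```python
import math

# Hypothetical toy data: x is a 0/1 crash factor, y is 1 for a fatal
# accident and 0 for an injury accident.
xs = [0] * 30 + [1] * 20
ys = [1] * 6 + [0] * 24 + [1] * 9 + [0] * 11


def fit_logistic(xs, ys, n_iter=25):
    """Maximum-likelihood fit of logit = b0 + b1 * x by Newton's method."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        s0 = s1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            s0 += y - p            # score for the intercept
            s1 += (y - p) * x      # score for the slope
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system information * step = score.
        b0 += (h11 * s0 - h01 * s1) / det
        b1 += (h00 * s1 - h01 * s0) / det
    return b0, b1


b0, b1 = fit_logistic(xs, ys)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
print(sum(ys), sum(fitted))
```

The equality holds at the maximum-likelihood estimate because the score equation for the intercept sets the sum of the residuals to zero; an optimizer that stops short of the maximum reproduces it only approximately.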

4 Particle Swarm Optimization algorithms


The objective functions of the accident prediction models need to be optimized, but
they are nonlinear and require special solution methods. One approach is to use metaheuristic
algorithms. The following section introduces one such algorithm, as implemented for
optimization in this study.

4.1 Details of implementing PSO

The particle swarm optimization (PSO) metaheuristic was proposed by Kennedy and
Eberhart (1995) [25], who were inspired by the flocking behavior of birds. The basic idea
of PSO can be illustrated by a group of birds searching for food within some area. The
birds do not know where the food is located, but assume that each bird knows how far away
it is. Following the bird that is closest to the food is then the best strategy for the
group. Kennedy and Eberhart treated each single solution of the optimization problem as a
"bird" that flies through the search space, and called each such solution a "particle". Each
particle is characterized by its fitness value, its current position in the space and its
current velocity. When flying through the solution space, all particles try to follow the
current optimal particles. A particle's velocity directs its flight, and its fitness is
calculated by the fitness function that is to be optimized. In the first step, a population
of randomly generated solutions is created. In every subsequent step the search for the
optimal solution is performed by updating (improving) the generated solutions. Each particle
memorizes the best fitness value it has achieved so far, called pbest, and also the best
fitness value obtained so far by any particle, called gbest. The velocity and the position
are changed in each step. Each particle adjusts its flight by taking into account its own
experience as well as the experience of the other particles. In this way, each particle is
drawn towards the pbest and gbest positions.
The position $x_i = (x_{i1}, x_{i2}, \dots, x_{iD})$ and the velocity
$v_i = (v_{i1}, v_{i2}, \dots, v_{iD})$ of the $i$-th particle are vectors. The position
$x_i^{k+1}$ of the $i$-th particle in the $(k+1)$-st iteration is calculated in the
following way:
$$x_i^{k+1} = x_i^{k} + v_i^{k+1}\,\Delta t \tag{1}$$
where $v_i^{k+1}$ is the velocity of the $i$-th particle in the $(k+1)$-st iteration and
$\Delta t$ is the unit time interval. The velocity $v_i^{k+1}$ equals:
$$v_i^{k+1} = \omega v_i^{k} + c_1 r_1 \frac{p_i^{b} - x_i^{k}}{\Delta t} + c_2 r_2 \frac{p^{g} - x_i^{k}}{\Delta t} \tag{2}$$
where $\omega$ is the inertia weight, $r_1$ and $r_2$ are mutually independent random numbers
in the range [0,1], $c_1$ and $c_2$ are positive constants, $p_i^{b}$ is the best position
achieved so far by the $i$-th particle and $p^{g}$ is the best position achieved so far by any
particle.
The particle's new velocity is based on its previous velocity and on the distances of its
current position from its own best position and from the group's best position. After
updating its velocity, the particle flies toward a new position (defined by relation (1)).
The parameter ω, which represents the particle's inertia, was proposed by Shi and
Eberhart (1998). The parameters c1 and c2 represent the particle's confidence in its own
experience and in the experience of the other particles, respectively.
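
A single application of relations (2) and (1) to one particle can be sketched as follows. This Python fragment is illustrative rather than the Visual Basic code used in this study; the function name `update_particle` and the choice to draw fresh random numbers for every dimension are assumptions, and the numerical values in the usage lines are arbitrary.

```python
import random


def update_particle(x, v, p_best, g_best, omega, c1, c2, dt=1.0, rng=random):
    """Return (new_x, new_v) for one particle, following relations (2) and (1)."""
    new_x, new_v = [], []
    for x_d, v_d, pb_d, gb_d in zip(x, v, p_best, g_best):
        # Assumption of this sketch: r1 and r2 are redrawn for each dimension.
        r1, r2 = rng.random(), rng.random()
        # Relation (2): inertia, pull towards own best, pull towards group best.
        vel = omega * v_d + c1 * r1 * (pb_d - x_d) / dt + c2 * r2 * (gb_d - x_d) / dt
        new_v.append(vel)
        # Relation (1): move with the updated velocity.
        new_x.append(x_d + vel * dt)
    return new_x, new_v


# Arbitrary illustrative values for a particle in two dimensions.
rng = random.Random(1)
x, v = update_particle([0.0, 0.0], [0.1, -0.1], [1.0, 1.0], [2.0, -1.0],
                       omega=0.7, c1=1.5, c2=1.5, rng=rng)
print(x, v)
```

A particle that already sits at both best positions with zero velocity does not move, since every term of relation (2) vanishes.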
PSO is a search process that contains stochastic components (the random
numbers r1 and r2). It is also characterized by a small number of parameters that need to be
initialized, which makes it relatively easy to perform a large number of numerical
experiments. Adopted from the literature (Shi and Eberhart, 1998), the following settings
have been used to implement PSO for this research study:

• The number of records = 200
• The number of particles = 20
• The algorithm starts with ω = 1 and follows a gradual decrease until ω = 0.1
• c1 starts at 2.5 and reaches 0.5 as the iterations proceed
• c2 starts at 0.5 and reaches 2.5 as the iterations proceed
• The main loop iterates 35 to 40 times
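
The settings listed above can be put together in a compact sketch that maximizes the log likelihood of a one-variable logistic model. It is written in Python rather than the Visual Basic used in this study and is not the study's implementation. The toy data are hypothetical, and the linear form of the parameter schedules, the initial range of [-3, 3], the zero initial velocities and the fixed random seed are all assumptions of this sketch.

```python
import math
import random

# Hypothetical toy data: (x, y) pairs, x a 0/1 crash factor, y = 1 if fatal.
DATA = [(0, 1)] * 6 + [(0, 0)] * 24 + [(1, 1)] * 9 + [(1, 0)] * 11


def log_likelihood(beta):
    """Log likelihood of a logistic model with logit g = beta[0] + beta[1] * x."""
    total = 0.0
    for x, y in DATA:
        g = beta[0] + beta[1] * x
        # log(1 + e^g), written so that it cannot overflow for large |g|.
        softplus = max(g, 0.0) + math.log1p(math.exp(-abs(g)))
        total += y * g - softplus
    return total


def schedule(start, end, k, n_iter):
    """Linear change from start (k = 0) to end (k = n_iter - 1); an assumption."""
    return start + (end - start) * k / (n_iter - 1)


def pso_maximize(fitness, dim=2, n_particles=20, n_iter=40, seed=0):
    rng = random.Random(seed)
    x = [[rng.uniform(-3, 3) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in x]
    p_fit = [fitness(p) for p in x]
    best = max(range(n_particles), key=p_fit.__getitem__)
    g_best, g_fit = p_best[best][:], p_fit[best]
    history = [g_fit]
    for k in range(n_iter):
        omega = schedule(1.0, 0.1, k, n_iter)
        c1 = schedule(2.5, 0.5, k, n_iter)
        c2 = schedule(0.5, 2.5, k, n_iter)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity and position updates with a unit time interval.
                v[i][d] = (omega * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                x[i][d] += v[i][d]
            fit = fitness(x[i])
            if fit > p_fit[i]:
                p_best[i], p_fit[i] = x[i][:], fit
                if fit > g_fit:
                    g_best, g_fit = x[i][:], fit
        history.append(g_fit)
    return g_best, g_fit, history


beta_hat, ll_hat, history = pso_maximize(log_likelihood)
print(beta_hat, ll_hat)
```

Because the search is stochastic, the result is a near-optimal estimate and changes with the seed; the best log likelihood found can only stay the same or improve from one iteration to the next.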
4.2 Software Package [26]
To implement the PSO algorithm, which maximizes the likelihood function of the prediction
models, a software package was designed, coded in Visual Basic, and embedded in Excel
worksheets. Each model was saved as an Excel workbook: 1variable.xlsm, 2variable.xlsm,
and so forth. The input data was extracted from Traffic Police case files, categorized,
coded, and finally saved as Dataentry.xlsm in the project. The generation of dummy variables
was programmed as a macro embedded in Dataentry.xlsm. Clicking the "run PSO" button runs the
algorithm once and saves the estimators β0 through βp.

5 Conclusions

A variety of factors can come into play when vehicles crash on the road. For the urban
freeways studied, this work suggests that, among the causes contributing to accidents,
pedestrians trying to cross the freeway and interfering with the traffic while doing so,
vehicle malfunction while driving, drivers using reverse gear on freeways to turn onto a
missed off ramp, and low-standard geometric design of some freeway sections play a major
role in the occurrence of fatal crashes, as evidenced by odds ratios greater than 1 in the
models built. As for freeway sections, the models in this study indicate that in ramps and
off ramps are the most probable locations for fatal crashes. Also, drivers aged less than 40
are more prone to be found guilty in fatal freeway accidents, while fatal accidents tend to
happen more on Thursdays (a half working day in the Eastern sun-based calendar) than on other
days of the week. According to police records, one particular passenger car (the Pride)
outnumbers other types of vehicles in being found guilty in fatal accidents (an odds ratio
of 3.22 in the one-variable model predicts a more than 3 times greater chance than for
other types of passenger cars). Among causes contributing to crashes, the mathematical models
predict that low-standard geometric design of freeway sections, drivers using reverse
gear on freeways and pedestrians attempting to cross the freeways each increase the chance
of an accident being fatal by more than 150 percent.
Among three-variable combinations of accident-causing variables, the combination of a Pride
and a motorcycle involved in the accident, together with critical weaving between lanes
prior to the accident, is the more probable scenario leading to fatal crashes.
Among four-variable combinations of crash causes, the combination of a Pride and a motorcycle
involved, plus critical weaving prior to the accident, plus an accident location within the
limits of a freeway bridge, is the most complicated scenario validated by post hoc tests.

5.1 Discussion

Pedestrians attempting to cross freeways and interfering with the traffic while doing so
account for 23.3% of urban freeway crashes. This cause dominates even though in 36.4% of such
cases a pedestrian crossing bridge is available less than 500 meters from the crossing point.
Another prominent contributing factor is the prohibited stopping of vehicles by the side of
the freeway, partially obstructing the low speed lane, for reasons such as lack of gasoline
or vehicle malfunction. In the case of an impact with the stopped vehicle, police experts
have blamed the approaching driver in their reports, with terms such as "not paying
enough attention to the front", instead of the driver who created the obstacle.
10% of fatal and injury crashes are found to be caused by drivers using reverse gear, and 20%
of such crashes are due to weaving between lanes. The frequent occurrence of such accidents,
and the fact that 33.3% of all fatal and injury accidents happen in off ramp and in ramp
sections of freeways, suggest that drivers' unfamiliarity with the route has forced them to
use reverse gear to turn onto a missed ramp or to weave suddenly between lanes. According to
municipality reports, 17% of freeway crashes are due to the lack, deficiency, or inefficient
location planning of traffic signs and messages. There is no hint of this deficiency in police
reports.
Also, the problem of glare and its impact on drivers' recognition and view has not been taken
into consideration, neither in the morning light nor at night.
If a vehicle has overturned, swerved or collided with the median, terms such as "over
speeding" or "unable to control the vehicle" are frequently used in police reports, and flaws
in geometric road design have seldom been mentioned. Only 3.3% of the police reports make any
reference to the geometric design of the freeway section.
Braking distance is another item studied by experts prior to drawing crash diagrams. A
statistical review shows that skid marks left by one particularly prevalent passenger car are
mostly between 20 and 50 meters long, while the distance recorded for other types of
passenger cars is usually below 10 meters in the studied freeway accidents.
Considering accident occurrence time, the interval 20:01-7:00 (night) holds 43.3% of the total
accidents counted.
Also, Thursday (a half working day in the Eastern sun-based calendar) is the day of the week
on which 30% of all weekday fatal and injury accidents have taken place.
80% of the drivers found guilty by the Traffic Police have more than 2 years of driving
experience, and nearly 40% have more than 5 years of experience. Seemingly, increasing
driving experience has not affected the accident risk factor.
Drivers aged between 25 and 40, who make up 40% of all drivers, tend to be involved in fatal
and injury accidents on the studied freeways significantly more often than other age groups.

5.2 Suggestions

In order to reduce injuries sustained by pedestrians attempting to cross the freeways, the
following is suggested:
For those segments of the freeway that pass through densely populated areas, with easy access
to the road and high crossing demand, installing impassable medians is advisable. Also,
pedestrian crossing bridges in such segments of the freeway can be equipped with escalators
to facilitate pedestrian crossing, or underground passages can be built instead.
Implementing antiglare facilities on roadway medians to reduce glare at night, especially on
horizontal curves, and encouraging drivers to use standard sunglasses to reduce glare in
daylight is another preventive measure.
Effective layout planning of traffic signs and messages, and intelligent routing
services that meet drivers' needs on unfamiliar roads, can have a profound effect on reducing
accidents in ramp sections of the freeways.
REFERENCES

[1] Alizadeh Elizee, Z.: Persian translation of "Brown Gibson Facility Layout Planning and its
application in transportation". Proceedings of the Fifth National Conference of Iran Traffic
and Transportation Engineering, Tehran, Iran, March 2000.
[2] Alizadeh Elizee, Z., Babazadeh, A.: The Use of Particle Swarm Optimization in
Predicting Fatal Accidents on Tehran Urban Freeways. Proceedings of the 10th
International Conference of Traffic and Transportation Engineering, Tehran, Iran,
November 2010.
[3] Anwaar Ahmed, Beenish Akbar Khan, Muhammad Bilal Khurshid, Muhammad Babar
Khan & Abdul Waheed (2015): Estimating national road crash fatalities using aggregate
data. International Journal of Injury Control and Safety Promotion, DOI:
10.1080/17457300.2014.992352
[4] Ayati, E. and Abbasi, E. (2014): Modeling Accidents on Mashhad Urban Highways. Open
Journal of Safety Science and Technology, 4, 22-35.
http://dx.doi.org/10.4236/ojsst.2014.41004
[5] Beni, G., Wang, J.: Swarm intelligence. In: Proceedings of the Seventh Annual Meeting
of the Robotics Society of Japan. RSJ Press, Tokyo, 1989, pp. 425-428.
[6] Bester, C.J. (2001): Explaining national road fatalities. Accident Analysis & Prevention,
33(5), 663-672.
[7] Carson, J., Mannering, F.: The effect of ice warning signs on accident frequencies and
severities. Accident Analysis and Prevention 33(1), 2001, 99-109.
[8] Evans, A.W.: Estimating transport fatality risk from past accident data. Accident Analysis
and Prevention 35, 2003, 459-472.
[9] Gaudry, M. and Lassarre, S.: Structural Road Accident Models: The International Drag
Family. Elsevier Science Ltd, Oxford, UK, 2000.
[10] Kennedy, J., Eberhart, R.C.: The particle swarm: social adaptation in information
processing systems. In: Corne, D., Dorigo, M., Glover, F. (Eds.), New Ideas in Optimization.
McGraw-Hill, London, 1999, pp. 379-388.
[11] Kennedy, J., Eberhart, R.C., Shi, Y.: Swarm Intelligence. Morgan Kaufmann
Publishers, San Francisco, 2001.
[12] Kockelman, K.M., Kweon, Y.: Driver injury severity: an application of ordered probit
models. Accident Analysis and Prevention 34, 2002, 313-321.
[13] Kopits, E., Cropper, M.: Why have traffic fatalities declined in industrialized
countries? Implications for pedestrians and vehicle occupants. Journal of Transport
Economics and Policy 42, 2008, 129-154.
[14] Lee, J., and Mannering, F.: Impact of roadside features on the frequency and
severity of run-off-roadway accidents: An empirical analysis.
Accid. Anal. Prev., 34, 2002, 149-161.
[15] Li, L., Kim, K.: Estimating driver crash risks based on the extended Bradley-Terry
model: an induced exposure method. Journal of the Royal Statistical Society A 163(2),
2000, 227-240.
[16] Lord, D., Washington, S.P., and Ivan, J.N.: Statistical challenges with modeling motor
vehicle crashes: understanding the implications of alternative approaches. 83rd Annual
Meeting of the Transportation Research Board, Washington, D.C., 2004.
[17] Myers, R.H., Montgomery, D.C. and Vining, G.G.: Generalized Linear
Models: with applications in engineering and science. John Wiley & Sons Inc., New York,
2002.
[18] Noland, R.B.: Traffic fatalities and injuries: The effect of changes in infrastructure and
other trends. Accid. Anal. Prev., 35, 2003, 599-611.
[19] Shankar, V., Mannering, F., and Barfield, W.: Effect of roadway geometrics and
environmental factors on rural freeway accident frequencies. Accid. Anal. Prev.,
27(3), 1995, 371-389.
[20] Srinivasan, D., Loo, W.H., Cheu, R.L: Traffic incident detection using particle
swarm optimization. In: Proceedings of the IEEE Swarm intelligence Symposium
2003 (SIS 2003), Indianapolis, IN, USA, 2003, pp. 44- 151.

[21] Shi.Y and Eberhart RC, "Parameter Selection in Particle Swarm Optimization",
In:Porto VW,Saravanan N, Waagen D and AE EvolutionatryProgramming VII, pp611-
616.Springer,1998.
[22] Teodorovic,D. : Swarm Intelligence Systems for Transportation
engineering:Principles and Applications.Transportation Research Part C 16, 2008,
pp.651-667
[23] Zhao, J., Jia, L., Chen, Z., Wang, X. : Urban traffic flow forecasting model of double
RBF neural network based on PSO. In: Proceedings of the Sixth International
Conference on Intelligent Systems Design and Applications (ISDA'06). IEEE
Computer Society, Washington, DC, 2006, pp. 892-896.

[24] World Health Organization. World report on road traffic injury prevention
Accessed 14 Sept 2015.http://whqlibdoc.who.int/publications/2004/9241562609.pdf

[25] Chun-Feng Wang and Kui Liu, “A Novel Particle Swarm Optimization Algorithm for
Global Optimization,” Computational Intelligence and Neuroscience, vol. 2016, Article
ID 9482073, 9 pages, 2016.

[26] Nabab Alam M., “Particle Swarm Optimization: Algorithm and its Codes in MATLAB”
Research Gate March 2016 DOI:10.13140/RG.2.1.4985.3206
https://www.researchgate.net/publication/297245624

[27] Gu, X., Li, T., Wang, Y., Zhang, L., Wang, Y. and Yao, J., 2018. Traffic fatalities
prediction using support vector machine with hybrid particle swarm optimization. Journal of
Algorithms & Computational Technology, 12(1), pp.20-29.

[28] Xu, X., Zeng, Z., Wang, Y. and Ash, J., 2019, August. Crash Density and Severity
Prediction Using Recurrent Neural Networks Combined with Particle Swarm Optimization.
In International Conference on Management Science and Engineering Management (pp. 566-
580). Springer, Cham.

[29] Gao, D., Li, X. and Chen, H., 2019. Application of improved particle swarm
optimization in vehicle crashworthiness. Mathematical Problems in Engineering, 2019.

Table 1. Variables defined for modeling

Response variable: 1 = Fatal accidents, 0 = Injury accidents

Variable  Variable name                         Category  Variable definition
number                                          number
1         Time (X1)                                       Accident time
            AM peak (X11)                       1         7:01-9:00
            Midday (X12)                        2         9:01-16:30
            PM peak (X13)                       3         16:31-20:00
            Night (X14)                         4         20:01-7:00
2         Day (X2)                                        Accident day
            Sat (X21)                           1         Saturday
            Sun (X22)                           2         Sunday
            Mon (X23)                           3         Monday
            Tue (X24)                           4         Tuesday
            Wed (X25)                           5         Wednesday
            Thu (X26)                           6         Thursday
            Fri (X27)                           7         Friday
3         Cause (X3)                                      Accident cause
            Pedestrian crossing (X31)           1         Pedestrian attempting to cross the freeway
            Vehicle parked (X32)                2         Vehicle stopped by the side of freeway blocking part of low speed lane
            Weaving (X33)                       3         Weaving between lanes
            Malfunction (X34)                   4         Malfunction of the vehicle while in motion
            Reverse gear (X35)                  5         Drivers using reverse gear in freeway
            Road design (X36)                   6         Non standard geometrical road design of the section of freeway
                                                7         Other factors
4         Guilty vehicle (X5)                             Guilty vehicle type involved in the accident
            Pride (X51)                         1         Passenger car: "pride"
            Miscellaneous passenger cars (X52)  2         Passenger car other than "pride"
            Motor cycle (X53)                   3         Motorcycle
            Light truck (X54)                   4         Light truck
            Heavy vehicle (X55)                 5         Heavy vehicle
5         Freeway section (X6)                            Accident location
            Off ramp (X61)                      1         Off ramp
            In ramp (X62)                       2         In ramp
            Bridge (X63)                        3         Bridge
            Tunnel (X64)                        4         Tunnel
            Toll gate (X65)                     5         Pay toll
            Main line (X66)                     6         Main line
6         Age (X7)                                        Guilty driver's age group
            <=25 (X71)                          1         Less than 25
            25< <=40 (X72)                      2         Between 25 and 40
            40< <=60 (X73)                      3         Between 40 and 60
            >60 (X74)                           4         More than 60
            missing                                       Not noted
7         Experience (X8)                                 Guilty driver's experience
            <=1 year (X81)                      1         Less than 1 year
            1< <=2 years (X82)                  2         Between 1 and 2 years
            2< <=5 years (X83)                  3         Between 2 and 5 years
            >5 years (X84)                      4         More than 5 years
            missing                                       Not noted
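
The category codes of Table 1 can be turned into 0/1 indicator variables (X11, X14, X51 and so on) before fitting a logistic model. The following is a minimal sketch of that coding step, not the authors' Visual Basic implementation. The `CATEGORIES` dictionary and the `encode` function are hypothetical names, and mapping a missing value or the uncoded cause category 7 ("other factors") to all-zero indicators is an assumption.

```python
# Number of coded categories per variable, read from Table 1.
CATEGORIES = {
    "X1": 4,  # Time: AM peak, Midday, PM peak, Night
    "X2": 7,  # Day: Saturday to Friday
    "X3": 6,  # Cause: X31..X36 (code 7, "other factors", has no indicator name)
    "X5": 5,  # Guilty vehicle
    "X6": 6,  # Freeway section
    "X7": 4,  # Age group
    "X8": 4,  # Experience
}


def encode(record):
    """Map {variable: category code} to 0/1 indicators named like X14 or X51.

    Assumption: a missing code (None or absent) or a code outside 1..k
    leaves every indicator of that variable at 0.
    """
    indicators = {}
    for var, k in CATEGORIES.items():
        code = record.get(var)
        for j in range(1, k + 1):
            indicators[f"{var}{j}"] = 1 if code == j else 0
    return indicators


# Example: a night-time accident where the guilty vehicle is a "pride".
row = encode({"X1": 4, "X5": 1})
print(row["X14"], row["X51"], row["X11"])
```

Each accident record then contributes one row of indicators, and a one-variable model uses a single column of that row as its regressor.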

Table 2. The constant only model

Estimator         Constant only
β̂0                -0.9783
Log likelihood    -45.1397
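
For reference, a constant only logistic model has a closed-form maximum-likelihood solution: the intercept is the log odds of the observed fatal share, and the log likelihood follows from the two class counts. The sketch below shows that calculation under the usual maximum-likelihood formulation. The function name `constant_only_model` and the counts in the example are illustrative assumptions and are not taken from the study data; a PSO run would be expected to approach these values rather than compute them directly.

```python
import math


def constant_only_model(n_fatal, n_injury):
    """Closed-form fit of a logistic model with an intercept only.

    Returns (beta0, log_likelihood). Both counts must be positive.
    """
    n = n_fatal + n_injury
    p = n_fatal / n                      # observed share of fatal accidents
    beta0 = math.log(p / (1.0 - p))      # intercept = log odds of a fatal outcome
    loglik = n_fatal * math.log(p) + n_injury * math.log(1.0 - p)
    return beta0, loglik


# Illustrative counts only (not the study sample).
beta0, loglik = constant_only_model(3, 7)
print(round(beta0, 4), round(loglik, 4))
```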

Table 3. Comparison of one variable models built

Odds ratio  Confidence  Likelihood ratio  Log         Coefficient  Constant  Crash contributor
exp(β̂1)     level α     statistic (G)     likelihood  β̂1           β̂0
1.59        0.139       2.1876            -44.046     0.4647       -1.4727   Pedestrian attempting to cross the freeway
0.37        0.0673      3.3464            -43.4666    -1.0014      -1.2533   Vehicle stopped by the side of freeway blocking part of low speed lane
0.66        0.1372      2.2090            -44.0352    -0.4153      -1.4664   Weaving between lanes
1.03        0.1422      2.1535            -44.063     0.0340       -1.3266   Malfunction of the vehicle while in motion
1.50        0.1089      2.5703            -43.855     0.4029       -1.3503   Drivers using reverse gear in freeway
1.78        0.0675      3.3433            -43.468     0.5759       -1.4960   Non standard geometrical road design of the section of freeway
1.35        0.1164      2.4647            -43.907     0.2988       -1.4567   Off ramp
1.07        0.1449      2.1256            -44.077     0.0637       -1.3795   In ramp
0.66        0.1241      2.3647            -43.9574    -0.4124      -1.2302   Bridge
0.86        0.1313      2.2768            -44.1295    -0.1537      -1.2371   Main line
1.41        0.1313      2.2768            -44.0014    0.3417       -1.3932   Guilty driver's age: less than 25
0.57        0.1039      2.6449            -43.8173    -0.5697      -1.2004   Guilty driver's age: between 25 and 40
1.33        0.1065      2.6039            -43.8378    0.2839       -1.5402   Thursday (half weekend)
0.46        0.0525      3.7602            -43.2596    -0.7700      -1.0131   Night
3.22        0.007       7.0796            -41.6       1.1709       -1.9556   Guilty vehicle: "pride"
0.97        0.1441      2.1337            -44.0729    -0.0291      -1.3846   Guilty vehicle: other passenger cars
0.04        0.006       7.4354            -41.4221    -3.2209      -1.2482   Motorcycle
0.27        0.0517      3.7848            -43.2474    -5.9943      -1.2927   Light truck

Note: the confidence level α is based on the one-tailed chi-squared distribution.
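
The derived columns of Table 3 follow from the fitted values through standard relations: G = -2(LL0 - LL1), where LL0 is the constant only log likelihood of Table 2 and LL1 the log likelihood of the one variable model; α is the upper-tail probability of G under a chi-squared distribution; and the odds ratio is exp(β̂1). The sketch below recomputes them for the pedestrian row. One degree of freedom is assumed (one added variable), in which case the tail probability reduces to erfc(sqrt(G/2)); the function name `lr_summary` is hypothetical.

```python
import math


def lr_summary(ll_null, ll_model, beta1):
    """Return (G, alpha, odds_ratio) for a one-variable logistic model."""
    g = max(0.0, -2.0 * (ll_null - ll_model))   # likelihood ratio statistic
    # Upper tail of chi-squared with 1 degree of freedom (assumed here).
    alpha = math.erfc(math.sqrt(g / 2.0))
    odds_ratio = math.exp(beta1)
    return g, alpha, odds_ratio


# Pedestrian row of Table 3, with the null log likelihood from Table 2.
g, alpha, odds = lr_summary(-45.1397, -44.046, 0.4647)
print(round(g, 4), round(alpha, 3), round(odds, 2))
```

Under these assumptions the result agrees with the tabulated pedestrian row (G of about 2.19, α of about 0.139, odds ratio of about 1.59) to rounding.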

Table 4. Causal factors justified to enter into the second phase of modeling

Variable  Error level
X32       6%
X36       5%
X14       5%
X51       0.7%
X53       0.6%
X54       5%
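
The screening step behind Table 4 can be illustrated by filtering the α values of Table 3 against a cutoff. This is a sketch only: the tables do not state the cutoff, so the value 0.07 below is a hypothetical choice, picked because it happens to return the six variables listed in Table 4. The keys are the indicator names of Table 1, and `ALPHA` and `select` are names introduced here for illustration.

```python
# Error levels (alpha) copied from Table 3, keyed by the indicator names of Table 1.
ALPHA = {
    "X31": 0.139, "X32": 0.0673, "X33": 0.1372, "X34": 0.1422,
    "X35": 0.1089, "X36": 0.0675, "X61": 0.1164, "X62": 0.1449,
    "X63": 0.1241, "X66": 0.1313, "X71": 0.1313, "X72": 0.1039,
    "X26": 0.1065, "X14": 0.0525, "X51": 0.007, "X52": 0.1441,
    "X53": 0.006, "X54": 0.0517,
}


def select(alpha_by_var, cutoff=0.07):
    """Return variables whose error level is below the cutoff, most significant first.

    The default cutoff is an assumption, not a value stated in the tables.
    """
    chosen = [v for v, a in alpha_by_var.items() if a < cutoff]
    return sorted(chosen, key=alpha_by_var.get)


print(select(ALPHA))  # ['X53', 'X51', 'X54', 'X14', 'X32', 'X36']
```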
Table 5. Comparison of Σŷ of the models built with Σy = 41 obtained from traffic records

Σŷ      Model definition
49.23   One variable model: fatal accidents occurred at night
37.29   One variable model: guilty vehicle: passenger car (pride)
39.10   One variable model: guilty vehicle: motorcycle
40.78   One variable model: causal factor: pedestrian attempting to cross the freeway
40.48   One variable model: causal factor: vehicle stopped along low speed lane
38.74   One variable model: causal factor: non-standard road geometrical design
42.08   Two variable model: causal factors: vehicle stopped along low speed lane + guilty vehicle: motorcycle
39.81   Two variable model: causal factors: guilty vehicle: passenger car (pride) + guilty vehicle: motorcycle
35.86   Three variable model: causal factors: guilty vehicle: passenger car (pride) + guilty vehicle: motorcycle + sudden weaving between lanes
31.30   Four variable model: causal factors: guilty vehicle: passenger car (pride) + guilty vehicle: motorcycle + sudden weaving between lanes + approaching a bridge
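
Assuming Σŷ denotes the sum of the fitted fatality probabilities over all sampled accidents, the comparison in Table 5 amounts to checking how close each model's expected number of fatal accidents comes to the 41 recorded. The sketch below computes that sum for a one variable model. The coefficients are those of the night model in Table 3, while the indicator vector `x_night` is invented for illustration, so the printed total is not expected to match the table.

```python
import math


def expected_fatal(beta0, beta1, x):
    """Sum of fitted probabilities of a one-variable logistic model over all records."""
    return sum(1.0 / (1.0 + math.exp(-(beta0 + beta1 * xi))) for xi in x)


# Night model coefficients from Table 3; the 0/1 indicator values are made up.
x_night = [1] * 3 + [0] * 7
total = expected_fatal(-1.0131, -0.7700, x_night)
print(round(total, 3))
```

The gap between this total and the observed count, evaluated on the real indicator column, is the quantity that Table 5 compares across models.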
