
Safety Science 60 (2013) 13–20

Contents lists available at SciVerse ScienceDirect

Safety Science
journal homepage: www.elsevier.com/locate/ssci

Development of a safety performance function for motorcycle accident fatalities on Malaysian primary roads
Muhammad Marizwan Abdul Manan ⇑, Thomas Jonsson, András Várhelyi
Traffic and Roads Unit (Taffik och väg), Department of Technology and Society (Teknik och Samhälle), Faculty of Engineering (Lunds Tekniska Högskola), Lund University,
P.O. Box 118, John Ericssons väg 1, 22100 Lund, Sweden

Article info

Article history:
Received 21 November 2012
Received in revised form 17 April 2013
Accepted 12 June 2013

Keywords:
Motorcycle accident fatalities
Safety performance function
Negative binomial regression

Abstract

This study uses a generalized linear model approach, i.e. negative binomial regression, to develop a predictive model for motorcycle fatal accidents on Malaysian primary roads. For the modeling process, a huge data inventory has been carried out, integrating the road geometry features, fatal accident records and traffic censuses from 3 selected states for the past 3-year period. The results show that motorcycle fatalities per kilometer on primary roads are statistically significantly affected by the average daily number of motorcycles and the number of access points per kilometer. The model established for this study can also be regarded as the first motorcycle safety performance function in Malaysia and probably in Asia. Also noted in this study is the need to establish a proper and systematic road geometry and traffic census inventory in order to develop better accident prediction models for Malaysia in the future.

© 2013 Elsevier Ltd. All rights reserved.

⇑ Corresponding author. Tel.: +46 046 222 9125 (O), +46 076 233 5775 (HP); fax: +46 046 222 9100.
E-mail addresses: mmarizwan@gmail.com, marizwan.manan@tft.lth.se (M.M. Abdul Manan), thomas.jonsson@tft.lth.se (T. Jonsson), andras.varhelyi@tft.lth.se (A. Várhelyi).

1. Introduction

Any research related to motorcycle fatalities in Malaysia would best represent those countries where motorcycles account for more than 25% of registered vehicles and reported accident fatalities. According to Table 1, these countries are mostly from Asia (e.g. China, India, Malaysia, Thailand, etc.), South America (e.g. Colombia and Suriname) and Africa (e.g. Mauritius). Of these, Malaysian data is close to the average, i.e. 47% of registered vehicles are motorcycles and 58% of the victims of reported accident fatalities are motorcyclists (WHO, 2009). Despite Malaysia's low ranking position in the category of road fatalities per 100,000 registered vehicles (12th) and motorcycle fatalities per 100,000 registered motorcycles (10th), it still has the highest road fatalities per 100,000 population among these countries (WHO, 2009). Moreover, for the 8-year period (2002–2009), road fatalities in Malaysia showed a steady increase of 4% per year and rose to 6745 in 2009 (Abdul Manan and Várhelyi, 2012). In 2009 alone, motorcycle fatalities reached 4070, which is the highest in the 10-year period (2000–2009) (Abdul Manan and Várhelyi, 2012). Thus, alleviating Malaysia's problems regarding motorcycle accident fatalities would set a prime example for other countries that face similar problems.

Malaysian primary roads are hazardous for motorcyclists. Studies have revealed that motorcyclists are the victims in more than 50% of the road accident fatalities in Malaysia; 62% of the total motorcycle accident fatalities (MCAF) occurred on primary roads (Abdul Manan and Várhelyi, 2012; Radin Umar, 1994). Sixty percent of the roads in Malaysia are primary roads, which are partially access-controlled, mostly non-segregated or single carriageway, with intrinsically dangerous features, e.g. trees, open culverts and access roads to rural houses and plantations (Hsu et al., 2003; PWD, 2009; Tung et al., 2008). As a result, there are more MCAF per 100 km of primary roads than on secondary roads, local streets and minor roads combined (Abdul Manan and Várhelyi, 2012). In addition to this glaring problem, MCAF on Malaysian primary roads are overrepresented in that motorcycles only constitute 20–25% (HPU, 2009) of the total traffic composition on such roads.

Safety Performance Functions (SPFs) are not only good tools for use in road safety analysis; they also help in understanding the importance of contributing factors to motorcycle fatalities. An SPF is a mathematical function that relates the expected crash frequency of a roadway element, such as a road segment, to the traffic volume and other characteristics of that element (Jonsson et al., 2009). This function has been adopted successfully for the past 10 years (Garber et al., 2010; Hauer et al., 2002) in identifying sites with the largest potential for safety improvement in order to achieve the greatest possible safety benefit on roads and highways (Tegge et al., 2006). SPFs are calibrated from data by statistical techniques, with the assumption that the accident data counts are from a negative binomial distribution (Hauer et al., 2002). Despite the fact that the application of SPF is very successful, it is

0925-7535/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.ssci.2013.06.005
14 M.M. Abdul Manan et al. / Safety Science 60 (2013) 13–20

generally applied to all vehicles and there have been no attempts to implement SPF specifically for motorcycles.

The aim of this paper is to develop a predictive model, i.e. SPF, solely for fatal motorcycle accidents on primary roads in Malaysia, which in turn can serve as a model for other countries with motorcycle fatality problems similar to Malaysia's.

2. Literature review

Earlier accident models are regression-based models, initially developed by using multiple or normal linear regression. These models assume a normal error structure for the response variable, a constant variance for the residuals, and a linear relationship between the response and explanatory variables (Ceder and Livneh, 1982). However, many studies have indicated that road accidents on a highway section are discrete, non-negative, and rare events, and as a result multiple linear regressions are not suitable for such cases (Karlaftis and Golias, 2002). Moreover, due to the discrete and non-negative characteristics of accidents, it is well known that the use of linear regressions could lead to biased estimates for accident prediction models (Maher and Summersgill, 1996). To overcome these limitations, several researchers have suggested Poisson regression models as the first choice for modeling count data (Lord et al., 2005; Miaou, 1994; TARC, 2009).

Using the Poisson regression model requires that the mean and variance of the accident frequency (the response variable) are equal (Maher and Summersgill, 1996). On the other hand, in most accident data, the variance of the accident frequency exceeds the mean and causes the data to be over-dispersed (Elvik, 2003; Hauer, 2010). This restriction (which, when violated, leads to invalid t-tests of the parameter estimates) can be overcome with the use of Negative Binomial regression, which allows the variance of the dependent variable to differ from the mean (Karlaftis and Golias, 2002). As a result, several recent authors (Ackaah and Salifu, 2011; Harnen et al., 2006; Maher and Summersgill, 1996; Ramírez et al., 2009; Shankar et al., 1996) have found the negative binomial distribution preferable for certain types of accident models, especially when relating accidents to road traffic properties.

There are several studies where negative binomial models have been utilized as a means to investigate the relation between accidents and road and traffic properties (road geometry, road type, traffic volume, etc.). For example, Shankar et al. (1995) used the negative binomial model to seek interactions between weather and road geometric variables with road accidents (Shankar et al., 1995). Ramírez et al. (2009) utilized negative binomial models to analyze the influence of traffic conditions, i.e. volume and composition, on accidents on different types of interurban roads in Spain (Ramírez et al., 2009). Chang (2005), through the use of the negative binomial model to evaluate freeway accident frequencies, found that a number of highway geometric variables may have significantly influenced the freeway's accident occurrence (Chang, 2005). Abdel-Aty and Radwan (2000) illustrated the significance of the Annual Average Daily Traffic (AADT), degree of horizontal curvature, lane, shoulder and median widths, urban/rural characteristics, and the section's length on the frequency of accident occurrence by using the negative binomial model (Abdel-Aty and Radwan, 2000). In another example of the use of negative binomial models, the authors found that traffic flow, highway segment length, junction density, terrain type and presence of a village settlement within road segments were statistically significant explanatory variables for crash involvement (Ackaah and Salifu, 2011).

Research on accident prediction models specifically for motorcycles is still limited worldwide, and especially in Malaysia. There are only two prominent accident-modeling studies specifically targeting motorcycles. In the first, Harnen et al. (2006) develop an accident prediction model, via a generalized linear model, for motorcycle accidents at junctions on urban roads in Malaysia. Their model reveals that motorcycle accidents are proportional to the power of traffic flow, and the estimates indicate that an increase in non-motorcycles and motorcycles entering the junction is associated with an increase in motorcycle accidents (Harnen et al., 2006). The other study, by Radin Umar et al. (2000), uses multivariate analysis of the impact of the exclusive motorcycle lane on motorcycle accidents along a major primary road in Malaysia (Federal Highway Route 2). They find that motorcycle accidents are directly proportional to the cubic power of traffic flow, and are reduced by approximately thirty-nine percent (39%) with the existence of motorcycle lanes (Radin Umar et al., 2000). Hence, with the limited research done on predicting motorcycle accidents, especially on Malaysian primary roads, and most of it more than 10 years old, the need for this kind of research is obvious.

3. Method

In order to analyze the fatal accident frequency, λ, the distribution model of the number of fatalities, Y, is first discussed. Let Y_i denote the number of fatalities occurring on a specific road site during a given time period. If we assume that the number of

Table 1
Worldwide road traffic fatality data from countries with a share of registered motorcycles more than 25% (Jacobs et al., 2000; Li et al., 2008; WHO, 2009).

No. | Country | Pop. (million, 2007) | Registered vehicles (million, 2007) | Registered MC (%, 2007) | Reported road traffic fatalities (2007) | Reported MC fatalities (%, 2007) | Road fatalities per 100,000 population (health risk) | Road fatalities per 100,000 registered vehicles (traffic risk) | MC fatalities per 100,000 registered MC (traffic risk for MC)
1 | Vietnam | 87.4 | 23.00 | 95 | 12,800 | 80.0 | 14.6 | 55.8 | 47.0
2 | Cambodia | 14.4 | 0.15 | 84 | 1545 | 63.0 | 10.7 | 1000.7 (1) | 750.5
3 | Lao P.D.R. | 5.9 | 0.64 | 79 | 608 | 80.0 | 10.4 | 94.8 | 96.0
4 | Indonesia | 231.6 | 63.30 | 73 | 16,548 | 61.0 | 7.1 | 26.1 | 21.8
5 | India | 1169.0 | 72.70 | 71 | 105,725 | 27.0 | 9.0 | 145.4 (3) | 55.3
6 | Nepal | 28.2 | 0.62 | 69 | 962 | 38.0 | 3.4 | 155.8 (2) | 85.8
7 | Taiwan | 22.7 | 19.80 | 67 | 2894 | 54.0 | 12.7 | 14.6 | 11.8
8 | Thailand | 63.9 | 25.60 | 63 | 12,492 | 70.0 | 19.6 (3) | 48.8 | 54.2
9 | China | 1336.3 | 145.20 | 51 | 5565 | 45.0 | 3.4 | 105.3 | 33.8
10 | Pakistan | 163.9 | 5.29 | 51 | 89,455 | 28.0 | 6.7 | 61.6 | 92.9
11 | Philippines | 88.0 | 5.52 | 48 | 1185 | 37.0 | 1.3 | 21.5 | 16.6
12 | Malaysia | 26.6 | 16.82 | 47 | 6282 | 58.0 | 23.6 (1) | 37.3 (12) | 46.1 (10)
13 | Mauritius | 1.3 | 0.33 | 43 | 140 | 36.0 | 11.1 | 41.9 | 35.1
14 | Colombia | 46.2 | 4.95 | 39 | 5409 | 36.0 | 11.7 | 109.2 | 100.8
15 | Suriname | 0.5 | 0.15 | 27 | 90 | 31.0 | 19.7 (2) | 59.4 | 68.2

Data from 2006, ( ) Ranking within category, MC: Motorcycle.
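
As a quick arithmetic check, the three risk columns of Table 1 can be recomputed from the population, registration and fatality columns. The sketch below does this for the Malaysian row only. The function and variable names are illustrative and do not come from the paper, and the rounding to one decimal place is assumed to match the table.

```python
def per_100k(count, base):
    """Rate of `count` events per 100,000 units of `base`."""
    return count / base * 100_000

# Malaysian row of Table 1 (2007 figures)
population = 26.6e6          # persons
vehicles = 16.82e6           # registered vehicles
mc_share_vehicles = 0.47     # share of registered vehicles that are motorcycles
fatalities = 6282            # reported road traffic fatalities
mc_share_fatalities = 0.58   # share of fatalities that are motorcyclists

# Road fatalities per 100,000 population (health risk)
health_risk = per_100k(fatalities, population)
# Road fatalities per 100,000 registered vehicles (traffic risk)
traffic_risk = per_100k(fatalities, vehicles)
# Motorcycle fatalities per 100,000 registered motorcycles
mc_traffic_risk = per_100k(fatalities * mc_share_fatalities,
                           vehicles * mc_share_vehicles)

print(round(health_risk, 1), round(traffic_risk, 1), round(mc_traffic_risk, 1))
```

Under these inputs the three values round to 23.6, 37.3 and 46.1, which agrees with the Malaysian entries in the table.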

Table 2
Type of data considered.

No. | Type of data | Description of content | Source (see reference)
1 | Motorcycle accident fatalities | (1) Data from year 2007 to 2009; (2) Location of fatal accident by state, district, route number, type of land use, type of road hierarchy and type of road geometry | ADSA, 2011; PDRM, 2007; PDRM, 2008; PDRM, 2009
2 | Road traffic volume on each section of Malaysian primary roads | (1) Average daily traffic volume of cars, heavy vehicles and motorcycles | HPU, 2007, 2008, 2009
3 | Road statistics and inventory of all primary roads | (1) Route number; (2) Road length for each route; (3) Location of route by state and district; (4) Road properties: road width and existence of paved shoulders | PWD, 2009
4 | Road geometry features | (1) Number of lanes; (2) Number of curves and straight sections per km; (3) Number of access roads, minor roads, junctions and intersections per km; (4) Existence of road median; (5) Land use: residential, commercial or industrial area | HPU, 2007, 2008, 2009; PWD, 1986; Google Earth and Google Maps from year 2002 until 2010

fatalities follows a Poisson distribution with expected value, and thus variance, equal to $\lambda_i$, the probability that the number of fatalities during this period will be equal to $y_i$ may be written as

$$P(Y_i = y_i) = \frac{\lambda_i^{y_i} \exp(-\lambda_i)}{y_i!}, \quad \text{where } y_i = 0, 1, 2, 3, \ldots \tag{1}$$

Accident data may not always have equal mean and variance; it could be either under-dispersed, i.e. mean greater than variance, or over-dispersed, i.e. mean less than variance (Hiselius, 2004; Maher and Summersgill, 1996). In other words, $\lambda_i$ is not the same at all sites and there will usually be a variation of the $\lambda_i$ around a mean, $\lambda$. To overcome this, the negative binomial distribution, which includes a gamma-distributed ($\kappa$) error term, is appropriate to denote the probability distribution of accidents (Shankar et al., 1995). The Negative Binomial distribution probability function and variance ($\sigma^2$) for Y are written as

$$P(Y = y) = \frac{\Gamma(\kappa + y)}{\Gamma(\kappa)\, y!} \left(\frac{\kappa}{\kappa + \lambda}\right)^{\kappa} \left(\frac{\lambda}{\kappa + \lambda}\right)^{y}, \quad \text{where } y = 0, 1, 2, 3, \ldots \tag{2}$$

$$\sigma^2 = \lambda + \frac{\lambda^2}{\kappa}$$

To analyze accident frequency, there is a need for a regression model that can describe the variation properly (Elvik, 2003; Hiselius, 2004). An exponential function is a common formulation, since this function ensures that the expected number of accidents is a positive number (Hiselius, 2004). Thus, accident frequency may be written as a multiplicative function for which the value of the exponents can be estimated directly by measuring the explanatory variables on the logarithmic scale, i.e. Eq. (3).

$$E(Y) = \exp\left[\sum \beta_i \ln(X_i)\right] \tag{3}$$

where $\beta_i$ are the estimated coefficients and $X_i$ are the independent variables, e.g. road traffic properties such as ADT and number of curves or junctions.

3.1. Empirical setting

The literature on the development of road accident prediction models, reviewed in the previous section, suggests a number of variables that could explain variations in accident occurrence and casualties. The lack of road geometry and land use data, for example number of curves, access points, commercial and residential areas, etc., meant that we had to manually collect this information by means of reference to satellite photos and various maps from the Malaysian road authorities. Hence, we have chosen three states (Perak, Selangor and Johor), which have the best available data, as samples in order to minimize time and resources. From these three states, we have managed to collect a large amount of data pertaining to road geometry and land use from 124 road sections. Next, we have matched and integrated it with the 3-year fatal accident data, i.e. fatalities reported within 30 days after a crash, and the traffic census for every road section.

We have noted that the motorcycle accident fatality data collected straight from the police was not homogeneous in terms of collision type. This is because identifying motorcycle collision types, i.e. single-motorcycle and multi-vehicle accidents, from the police database is difficult due to inconsistent data entry and illegible descriptions found in the accident recording forms. Fatal accident data is used simply because the injury data is critically underreported (Abdul Manan and Várhelyi, 2012). For example, comparing Malaysian accident statistics to those of a highly motorized developed country like Sweden, it can be concluded that there are 9 severe injuries for each fatality in the Swedish statistics, while there are only 1.4 severe injuries per fatality according to the Malaysian statistics (Abdul Manan and Várhelyi, 2012).

Table 2 shows the types of data obtained for this study for the study period of 2007–2009. For the purpose of constructing the model, we have chosen our dependent variable to be Motorcycle accident fatalities per kilometer to represent the motorcycle fatality rate on Malaysian primary roads. The rest of the data is categorized as independent variables and divided into continuous and categorical data as seen in Table 3. Statistical software, i.e. SPSS ver. 19, has been used for developing the model.

As seen in Table 3, the size of the sample for this study is 372, i.e. 124 road sections times 3 years. These road sections are primary roads located within the boundaries of the selected states; they vary from 1.61 km to 86.10 km in length and have non-homogeneous road features, i.e. different lane configurations and availability of median and paved shoulders. We note that there is at least 1 motorcycle fatal accident on each road section and a maximum of 29 fatal accidents on one of the sections. We also note that these road sections traverse many different land uses, but the majority of these are in rural areas with many small access roads. For the purpose of this study, we have defined an access road as a three-legged priority control junction that serves to connect the main road, i.e. primary road, to a minor road leading into plantations, factories or villages.

Next, a Pearson correlation matrix is used to investigate whether some independent variables are strongly correlated with

Table 3
Descriptive statistics of variables.

Continuous variables N Minimum Maximum Mean Std. deviation Unit


Statistic
Motorcycle fatalities 372 1 29 4.712 4.058 Number
Motorcycle fatalities per km 372 0.010 3.7 0.366 0.448 Number/km
Section length 372 1.61 86.10 19.19 13.273 km
No. of access points 372 3 901 164.073 132.438 Number
No. of curves 372 0 365 25.798 41.371 Number
No. of minor intersections 372 0 36 6.307 5.363 Number
No. of towns 372 0 6 1.605 1.171 Number
No. of residential areas 372 2 31 12.177 6.844 Number
No. of industrial areas 372 0 35 8.194 5.851 Number
No. of commercial areas 372 1 36 10.105 6.627 Number
Average Daily Traffic (ADT) 372 755 193,190 23,597 28,887 Vehicles/day
Percent of cars 372 38.2 83 64.622 8.337 %
Percent of heavy vehicles 372 2.7 39.8 15.143 6.834 %
Percent of motorcycles 372 7.5 47.9 20.234 7.394 %
Categorical variables Description N Percent
Lane configuration 2 lanes 240 64.5
4 lanes 42 11.3
6 lanes 12 3.2
2 and 4 lanes 63 16.9
4 and 6 lanes 15 4.0
Total 372 100.0
Median Not divided 252 67.7
Divided 60 16.1
Divided on certain stretches 60 16.1
Total 372 100.0
Paved shoulder Unpaved shoulder 213 57.3
Paved shoulder 117 31.5
Paved on certain stretches 42 11.3
Total 372 100.0
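
The descriptive statistics in Table 3 already hint at over-dispersion: the variance of the fatality counts is well above their mean. The sketch below makes that comparison and evaluates the Poisson and negative binomial probability functions described in Section 3. The moment-based value of κ is only an illustrative assumption (the paper fits its models by maximum likelihood in SPSS), and all function names are hypothetical.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson distribution with mean lam."""
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))

def negbin_pmf(y, lam, kappa):
    """P(Y = y) for a negative binomial with mean lam and
    variance lam + lam**2 / kappa."""
    log_p = (math.lgamma(kappa + y) - math.lgamma(kappa) - math.lgamma(y + 1)
             + kappa * math.log(kappa / (kappa + lam))
             + y * math.log(lam / (kappa + lam)))
    return math.exp(log_p)

# Motorcycle fatalities per section and year, taken from Table 3
mean = 4.712
variance = 4.058 ** 2            # standard deviation squared

# A ratio above 1 signals over-dispersion relative to the Poisson model
dispersion_ratio = variance / mean

# Crude method-of-moments value solved from variance = mean + mean**2 / kappa.
# Illustration only; this is not an estimate reported in the paper.
kappa_mom = mean ** 2 / (variance - mean)

print(f"variance/mean = {dispersion_ratio:.2f}, kappa (moments) = {kappa_mom:.2f}")
```

With the Table 3 values the variance is roughly 3.5 times the mean, which is the situation in which the negative binomial form is preferred over the Poisson form.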

each other. This is because strong correlation between independent variables in regressions could lead to difficulties in the interpretation of parameter estimates, as it might strongly affect the other model parameters (Abdel-Aty and Radwan, 2000; Maher and Summersgill, 1996). In this study, we have set our correlation value (Pearson correlation) acceptance to be less than 0.5. As can be seen in Table 4, there is a strong correlation between independent variables that affects the models, i.e. number of towns with residential, commercial and minor junctions, number of residential with commercial, number of industrials with commercial, number of commercial with minor intersection, and percent cars with heavy vehicles, and thus they are excluded from the modeling formulation process. The correlated independent variables, such as ADT, ADT of cars and ADT of motorcycles, were incorporated during the modeling computation, but were not included together in any of the individual trials. This is because the three ADT-variables are almost perfectly correlated with each other. Therefore adding more than one of them would not add to the quality of the models (Washington et al., 2003). In fact, having two of them at the same time makes the whole model non-significant.

From Eq. (3), we expect our prediction model to have a final form as follows:

$$\text{MCFatal/km} = \exp(\beta_0) \times \underbrace{\left[\exp(\ln X_i)\right]^{\beta_i} \times \left[\exp(\ln X_{i+1})\right]^{\beta_{i+1}} \times \cdots}_{\text{continuous}} \times \underbrace{\left[\exp(X_i)\right]^{\beta_i} \times \left[\exp(X_{i+1})\right]^{\beta_{i+1}} \times \cdots}_{\text{categorical}} \tag{4}$$

or in other terms:

$$\text{MCFatal/km} = \exp(\beta_0) \times \underbrace{X_i^{\beta_i} \times X_{i+1}^{\beta_{i+1}} \times \cdots}_{\text{continuous}} \times \underbrace{\exp(X_i \beta_i) \times \exp(X_{i+1} \beta_{i+1}) \times \cdots}_{\text{categorical}} \tag{5}$$

Table 5 shows the selected variables to be included into the regression process. The authors note that the dependent variable proposed (Motorcycle fatalities per kilometer) is put back into Motorcycle fatalities in order to be computed into SPSS. However, introducing an offset variable (Length, km) in the fitting process into SPSS means that the final model yields the number of Motorcycle fatalities per kilometer. This method is similar to the one used by Harnen et al. (2006). Moreover, as seen in Table 5, some of the variables have undergone transformation, i.e. logarithmic computation, Ln, in order to suit the formulation of Eq. (4). On the other hand, we also introduce several interaction variables, e.g. LnCurve_per_km × LnADTMC or LnAccess_per_km × LnADT, in order to have additional insights into the contributory factors.

Using SPSS software ver. 19, the generalized linear model, i.e. negative binomial with log link analysis, is performed for this study. The response variable for the model is set as MCFatal and the predictors are set for Lane_config, Median and Paved_shoulder as 'factors' while LnAccess_per_km, LnCurve_per_km, LnADTMC, LnADT and/or interaction variables are set as 'covariates'. Categorical variables are included in the model using so-called 'dummy variables' where x takes the value of 1 if it belongs to the specific category, or 0 if it does not. One category level is used as the reference level and the others are represented by using a 'dummy', yielding as many dummy variables as number of category levels minus one. As mentioned previously, LnLength is set as an 'offset variable' and one of the predictors. As for the model estimation, we have chosen the 'Hybrid' method with the scale parameter method of 'Pearson chi-square'. We have also selected 'Type I and III' as the analysis type, while the Chi-square Statistics are set as 'Wald'.

4. Model estimation

Various models have been generated for this study in order to find a suitable prediction model that could represent motorcycle

fatalities on Malaysian primary roads. Several measures of goodness of fit were used (as shown in Table 6) and the variable that provided the overall best fit of the model was chosen. Out of the various models generated, we have narrowed down six (6) models (see Table 6) that are overall statistically significant (p < 0.05), based on the Omnibus test of the model as a whole (SPSS, 2007). Models 1, 2 and 3 have 1 common variable, which is LnADTMC, whereas models 4, 5 and 6 have LnADT.

Table 4
Pearson correlation of independent variables.
Variables (rows and columns): No. of towns; No. of residential; No. of industrial; No. of commercial; No. of minor intersections; No. of curves; % Cars; % Heavy vehicles; % MC; No. of access; No. of access per km; ADT; ADT of Cars; ADT of MC.
MC – Motorcycle, ADT – average daily traffic.
Highlighted and Bold – Variables that are highly correlated with each other and not included at the same time (Pearson correlation > 0.5).
Bold – One of the independent variables is included but not in combination with the other one.

The coefficients in each model are estimated by means of maximum likelihood, which is performed by the statistical package SPSS. The significance of each variable is examined by the Wald chi-square test at the 95% confidence level (Olsson, 2002). Models 1, 2, 4 and 5 are not suitable representations of this study. For models 1 and 4, none of the categorical variables are significant, i.e. p > 0.05, as seen in Table 6. As for models 2 and 5, despite removing some variables, there are still some that are not statistically significant. Moreover, we have also tried many combinations of interaction variables in all the models, but, unfortunately, all of the models with interaction variables also come out as not statistically significant, i.e. p > 0.05.

In models 3 and 6 the number of variables has been reduced until only statistically significant ones are left. All estimated coefficients of variables for models 3 and 6 are statistically significantly different from zero at the 5% level. The goodness-of-fit statistics (as seen in Table 6) for these two models show that the models fit the data very well. The Pearson Chi-square statistics divided by degrees of freedom are estimated to be 0.872 and 0.821, respectively, for models 3 and 6, as shown in Table 6. In other words, the estimated values of the Pearson Chi-square divided by the degrees of freedom are within the permissible range (i.e. between 0.8 and 1.2), indicating that the Negative Binomial distribution assumption is acceptable (Bauer and Harwood, 2000). Models 3 and 6 can be written as:

$$\text{MCFatal/km} = \exp(-4.891) \times \text{ADTMC}^{0.404} \times \text{Access\_per\_km}^{0.262} \qquad \text{(Model 3)}$$

$$\text{MCFatal/km} = \exp(-6.381) \times \text{ADT}^{0.477} \times \text{Access\_per\_km}^{0.316} \qquad \text{(Model 6)}$$

The Average Daily Traffic (LnADT) and Average Daily Traffic of motorcycles (LnADTMC) are statistically significant (p < 0.05) with positive estimated model parameters for both models. Both models 3 and 6 have one common variable, i.e. LnAccess_per_km, with a positive parameter value and a strong statistical significance. This indicates that the fatal crash frequency increases with an increase in the traffic flow or number of access points per kilometer. Moreover, both LnADT and LnADTMC are not strongly correlated (Pearson correlation < 0.07) with LnAccess_per_km, as seen in Table 4.

Both models correspond to the 'Safety performance function'. As stated earlier, SPF is a mathematical function that relates the expected number of accidents (in this case, motorcycle fatalities per kilometer) to a set of explanatory variables (Elvik, 2003). A safety performance function, which is widely applied, can be written as Eq. (6) (Elvik, 2003):

$$E(\lambda) = \alpha Q^{\beta} \times \exp\left[\sum \kappa x\right] \tag{6}$$

where Q measures exposure, i.e. traffic volume, raised to an exponent $\beta$. Exp is the exponential function, that is the base of natural logarithms raised to the sum of parameter estimates multiplied by the relevant values of the explanatory variables, representing risk factors ($\sum \kappa x$) (Elvik and Vaa, 2004). Hence, for models 3 and 6, the measure of exposure (Q) is ADTMC and ADT, respectively, while 'Access per kilometer' is the risk factor and alpha ($\alpha$) would be the scaling constant from the models.
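
To make the two fitted functions concrete, the sketch below evaluates Models 3 and 6 for one hypothetical road section. The coefficient magnitudes are those reported for the two models. The intercepts are entered as negative values (−4.891 and −6.381); this sign is an assumption of the sketch, chosen because it gives predictions of the same order as the observed mean of 0.366 fatalities per km in Table 3, whereas positive intercepts would predict hundreds of fatalities per km. The function names and the example inputs are illustrative only.

```python
import math

def spf_per_km(intercept, exposure, beta_exposure, access_per_km, beta_access):
    """Expected fatalities per km:
    exp(intercept) * exposure**beta_exposure * access_per_km**beta_access."""
    return (math.exp(intercept)
            * exposure ** beta_exposure
            * access_per_km ** beta_access)

def model3(adtmc, access_per_km):
    # Exposure is the average daily traffic of motorcycles (ADTMC).
    # The negative intercept is an assumption of this sketch.
    return spf_per_km(-4.891, adtmc, 0.404, access_per_km, 0.262)

def model6(adt, access_per_km):
    # Exposure is the total average daily traffic (ADT).
    # The negative intercept is an assumption of this sketch.
    return spf_per_km(-6.381, adt, 0.477, access_per_km, 0.316)

# Hypothetical section: sample-mean ADT and motorcycle share from Table 3,
# with an assumed density of 8.5 access points per km.
adt, pct_mc, access = 23_597, 20.234, 8.5
adtmc = adt * pct_mc / 100

print(f"Model 3: {model3(adtmc, access):.3f} fatalities per km")
print(f"Model 6: {model6(adt, access):.3f} fatalities per km")

# Elasticity reading of the exponents: doubling the access density
# multiplies the Model 3 prediction by 2**0.262 (about 1.2).
print(f"Doubling access, Model 3 factor: {model3(adtmc, 2 * access) / model3(adtmc, access):.3f}")
```

Because both models are power functions, each exponent can be read as an elasticity: a 1% rise in exposure or access density changes the predicted fatalities per km by roughly the exponent, in percent.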

Table 5
Variables computation and SPSS coding.

Category Variables SPSS labeling Computation/designation SPSS coding


Dependent Motorcycle fatalities Y MCFatal – –
Off-set Length per section – LnLength LnLength
Independent, continuous variables (covariates) No. of access X1 LnAccess_per_km Ln (X1/Length)
No. of curves X2 LnCurve_per_km LnX2
Percent of Motorcycle X4 LnADTMC Ln (ADT × X4/100)
ADT X5 LnADT LnX5
Independent, categorical variables (factors) Lane configuration X6 Lane_config 2 lanes 1
4 lanes 2
6 lanes 3
2 and 4 lanes 4
4 and 6 lanes 5
Median X7 Median Not divided 1
Divided 2
Divided on certain stretches 3
Paved shoulder X8 Paved_shoulder Unpaved shoulder 1
Paved shoulder 2
Paved on certain stretches 3
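
The transformations listed in Table 5 can be sketched as a small helper that turns the raw attributes of one road section into model inputs. This is an illustration under stated assumptions rather than the SPSS procedure used in the study: the function names are hypothetical, the curve variable is left out because a section with zero curves has no logarithm, and level 1 is used as the reference category, following the rows of Table 6 in which level 1.00 is set to zero.

```python
import math

def build_covariates(n_access, length_km, adt, pct_mc):
    """Log-transformed covariates and offset for one road section."""
    return {
        "LnAccess_per_km": math.log(n_access / length_km),  # Ln(X1/Length)
        "LnADTMC": math.log(adt * pct_mc / 100.0),          # Ln(ADT x X4/100)
        "LnADT": math.log(adt),                             # Ln(X5)
        "LnLength": math.log(length_km),                    # offset variable
    }

def dummy_code(name, level, n_levels, reference=1):
    """0/1 indicators for every level except the reference level."""
    return {f"{name}={k}": int(level == k)
            for k in range(1, n_levels + 1) if k != reference}

# Example section using the sample means of Table 3 and a
# '2 and 4 lanes' configuration (SPSS code 4 in Table 5).
section = build_covariates(n_access=164.073, length_km=19.19,
                           adt=23_597, pct_mc=20.234)
section.update(dummy_code("Lane_config", level=4, n_levels=5))

for key, value in section.items():
    print(key, round(value, 3) if isinstance(value, float) else value)
```

A factor with five levels yields four indicator columns here, which matches the "number of category levels minus one" rule described for the dummy variables.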

Table 6
Estimation results for various alternative models.

Parameter estimates Model 1 Model 2 Model 3 Model 4 Model 5 Model 6


b p-Value b p-Value b p-Value b p-Value b p-Value b p-Value
(Intercept) −4.721 .000 −4.665 .000 −4.891 .000 −6.798 .000 −6.809 .000 −6.381 .000
LnAccess_per_km .299 .000 .281 .001 .262 .001 .341 .000 .323 .000 .316 .000
LnCurve_per_km .039 .600 .040 .573
LnADTMC .379 .000 .369 .000 .404 .000
LnADT .528 .000 .533 .000 .477 .000
[Lane_config = 5.00] .375 .474 .319 .516 .795 .123 .992 .001
[Lane_config = 4.00] .082 .807 .119 .688 .107 .744 .313 .109
[Lane_config = 3.00] .454 .451 .905 .094 .009 .988 .235 .518
[Lane_config = 2.00] .481 .283 .456 .271 .251 .564 .047 .804
[Lane_config = 1.00] 0a . 0a . 0a . 0a .
[Median = 3.00] .364 .329 .254 .431 .288 .423
[Median = 2.00] .226 .623 .239 .575 .164 .710
[Median = 1.00] 0a . 0a . 0a .
[Paved_shoulder = 3.00] .136 .579 .155 .510 .114 .597
[Paved_shoulder = 2.00] .195 .116 .226 .056 .220 .058
[Paved_shoulder = 1.00] 0a . 0a . 0a .
Overdispersion Parameter (Scale) .801b .847b .872a .729b .734b .821a
Omnibus test df = 11 Sig. df = 8 Sig. df = 2 Sig. df = 11 Sig. df = 8 Sig. df = 2 Sig.
Likelihood ratio Chi-square 80.011 .000 79.296 .000 66.741 .000 108.275 .000 113.348 .000 86.028 .000
Goodness of fit df = 354 Value/df df = 363 Value/df df = 369 Value/df df = 354 Value/df df = 363 Value/df df = 369 Value/df
Value Value Value Value Value Value
Deviance 201.712 .570 207.846 .573 216.864 .588 186.910 .528 191.814 .528 204.440 .554
Pearson Chi-square 283.531 .801 307.634 .847 321.684 .872 257.914 .729 266.555 .734 302.857 .821
Akaike’s Info. Criterion (AIC) 1980.597 2009.561 2006.580 1965.794 1993.530 1994.155
Bayesian Info. Criterion (BIC) 2027.428 2044.831 2018.336 2012.626 2028.800 2005.911

Bold – indicates the lower value between models 3 and 6.


a Set to zero because this parameter is redundant.
b Computed based on the Pearson Chi-square.
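
The goodness-of-fit comparison between the two final models can be reproduced directly from the figures in Table 6. The sketch below recomputes the Value/df ratios, checks the Pearson ratio against the 0.8 to 1.2 range cited from Bauer and Harwood (2000), and picks the model with the lower AIC and BIC. The dictionary layout and variable names are illustrative, not part of the study.

```python
# Goodness-of-fit figures for the two final models, copied from Table 6
table6 = {
    "Model 3": {"df": 369, "deviance": 216.864, "pearson_chi2": 321.684,
                "aic": 2006.580, "bic": 2018.336},
    "Model 6": {"df": 369, "deviance": 204.440, "pearson_chi2": 302.857,
                "aic": 1994.155, "bic": 2005.911},
}

summary = {}
for name, fit in table6.items():
    pearson_ratio = fit["pearson_chi2"] / fit["df"]
    deviance_ratio = fit["deviance"] / fit["df"]
    summary[name] = {
        "pearson_ratio": round(pearson_ratio, 3),
        "deviance_ratio": round(deviance_ratio, 3),
        # Range taken from the text; a ratio inside it supports the
        # negative binomial assumption.
        "within_range": 0.8 <= pearson_ratio <= 1.2,
    }
    print(name, summary[name])

# Lower information criteria indicate the preferred model
best_by_aic = min(table6, key=lambda m: table6[m]["aic"])
best_by_bic = min(table6, key=lambda m: table6[m]["bic"])
print("Lowest AIC:", best_by_aic, "| Lowest BIC:", best_by_bic)
```

Both ratios fall inside the permissible range, and Model 6 has the lower AIC and BIC, in line with the statement that it has the better statistical fit.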

Model 6 is statistically better than model 3 but both models can be adequately accepted. The model's goodness-of-fit for this study is measured in terms of Deviance, Pearson Chi-Square, Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC). The smaller the value, the better and more preferred the model would be, i.e. best fitted model (Abdel-Aty and Radwan, 2000; Olsson, 2002; Shankar et al., 1996). Furthermore, AIC is often used as a way of comparing several competing models, without necessarily making any formal inference (Olsson, 2002). As seen in Table 6, model 6 has a slightly lower value of Deviance, AIC and BIC than model 3, thus showing that model 6 is considered the model with the best statistical fit. However, based on the CURE plot in Fig. 1, the data fits both models along the entire range of values assumed by a variable. These particular CURE plots are based on the Access per km variable due to the fact that both models share this variable. A good CURE plot is one which moves up and down and oscillates around 0 (Hauer, 2004), as shown in Fig. 1. Comparing both models, model 3 appears to be slightly closer, oscillating around 0 at 10–18 access points per km, which indicates that it might be a better model than model 6 within a certain range.

5. Discussion

The motorcycle accident fatality rate, i.e. fatalities per kilometer, on Malaysian primary roads can be predicted via a negative binomial regression model. Two of the six models tested have a good statistical fit with highly significant predictors. Both share one matching variable, i.e. access per kilometer (Access_per_km), and each contains another variable, which is the Average Daily Traffic (ADT) for Model 6 and the Average Daily Traffic of

Fig. 1. Cumulative residual plot (CURE) for Models 3 and 6.
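As a rough illustration of how a CURE plot such as Fig. 1 is built, the sketch below sorts road sections by one variable (here, access points per km) and accumulates the residuals, taken as observed minus predicted fatalities. The function name `cure_points` and the five data points are hypothetical and are not drawn from the study's data set; the sketch only shows the mechanics described by Hauer (2004), not the authors' own computation.

```python
from itertools import accumulate


def cure_points(variable, observed, predicted):
    """Return (variable value, cumulative residual) pairs sorted by variable.

    The residual is observed minus predicted; a well-fitting model gives
    a curve that moves up and down around 0 over the whole range.
    """
    rows = sorted(zip(variable, observed, predicted), key=lambda r: r[0])
    residuals = [obs - pred for _, obs, pred in rows]
    cumulative = accumulate(residuals)
    return [(x, c) for (x, _, _), c in zip(rows, cumulative)]


# Hypothetical road sections (NOT data from the study).
access_per_km = [12, 3, 8, 5, 15]
observed_fatalities = [2, 0, 1, 1, 3]
predicted_fatalities = [1.5, 0.5, 1.0, 1.5, 2.5]

for x, cum_res in cure_points(access_per_km, observed_fatalities,
                              predicted_fatalities):
    print(x, cum_res)
```

Plotting the second element of each pair against the first gives the CURE curve; a long drift away from 0 over some range of the variable would suggest that the model fits poorly there.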

motorcycle (ADTMC) for Model 3. The estimates for the models indicate that an increase in access points per kilometer and in the average traffic volume, i.e. ADTMC and ADT, is highly associated with an increase in motorcycle fatal accidents per kilometer. Although Model 6 is statistically better, both models prove to fit well in all ranges of the variables.

Traffic volume is the single most important factor that influences the number of road accidents (Elvik and Vaa, 2004; Hauer, 1985). Considering this fact, both models would be viable. Furthermore, traffic flow variables affect the exposure of road users to accident-prone situations (Jonsson, 2005). In relation to exposure, if we take the ADT alone, both cars and motorcycles interact; hence, which party affects the accident-prone situation is unknown. On the other hand, if we take into account the ADT of motorcycles (ADTMC), we look at the level of exposure of motorcycles alone, i.e. direct exposure to risk. Another way of looking at this (see Table 7) is that, in a scenario where a 10 km road section has 5 access points per kilometer, the predicted motorcycle fatalities at a mean ADT would be 3.4 if Model 6 is used. Comparing this to Model 3, and assuming motorcycle composition is 20% of the ADT, the predicted motorcycle fatalities would be 3.5. Thus, the difference in the fatality prediction between the two models would not be much. However, if the motorcycle composition grew within the same total ADT, the motorcycle fatalities would increase, as seen in Table 7, because the relative proportions of different road users in traffic affect the number of accidents (Elvik and Vaa, 2004). This shows that Model 3 is sensitive to changes in traffic flow composition and thus suitable to be used when a primary road section experiences a modal shift, i.e. cars to motorcycles. Moreover, if the ADT grew larger (two times), the difference between the Model 3 and Model 6 predictions would be even bigger. In simple terms, Model 3 has the benefit of using the actual number of motorcyclists instead of the total ADT, as the ADT does not necessarily reflect the number of motorcycles. Thus, Model 3 is more logical.

Table 7
Predictability comparison between Model 3 and Model 6 where access per kilometer and length are fixed.

ADT       ADTMC     %MC         MCFatal (Model 3)   MCFatal (Model 6)   |D|
23,000    4600      20 (mean)   3.5                 3.4                 0.07
          5750      25          3.8                                     0.39
          6900      30          4.1                                     0.68
          8050      35          4.3                                     0.94
          9200      40          4.6                                     1.18
          10,350    45 (max)    4.8                                     1.41
46,000    9200      20 (mean)   4.6                 4.7                 0.14
          11,500    25          5.0                                     0.29
          13,800    30          5.4                                     0.67
          16,100    35          5.7                                     1.02
          18,400    40          6.1                                     1.33
          20,700    45 (max)    6.3                                     1.63

The significance of having ‘access per kilometer’ in the models is that it is vital in complementing the real scenario of motorcycle fatalities on Malaysian primary roads. According to Elvik and Vaa (2004), the number of access points has a major impact on accident rates. Studies in Malaysia have shown that the most typical type of fatal crash for Malaysian motorcyclists is the crossing course type collision (27.5%), i.e. side or perpendicular angle crashes, with 28% of those occurring with a car (Abdul Manan and Várhelyi, 2012) coming out from an access road and thus infringing upon the right-of-way (ROW) of the motorcyclist (Pai, 2011; Radin Umar, 2005). This violation of the motorcyclist’s ROW could lead to crashes into the side of the car (Pai, 2009; Radin Umar, 1999, 2005); this type of accident, occurring mostly at access points, i.e. stop-/yield-controlled junctions, is known to be the most hazardous crash pattern for motorcyclists (Pai, 2009). Another study by Ackaah and Salifu (2011), which uses a similar approach to this study, also acknowledges the importance of having ‘number of access points’ as one of the variables in predicting the number of crashes. Therefore, the fact that this variable is regarded as one of the major causes of motorcycle fatalities is certainly justified, and an increase in the number of access points per kilometer will increase the number of motorcycle fatalities. In other words, having more access roads joining the primary road will increase the risk of motorcycles facing cars entering from the access road.

It is worth mentioning that a few limitations in this study have thwarted our attempts to work efficiently and accurately on some occasions. First, the data collected from the police are not 100% accurate in terms of pinpointing the exact location of the fatal accidents. Most of the police database has identified the correct route number per district and state; however, some records fail to give the correct route number or assign it to a different district. This

data, which is up to 5% of the total, has had to be excluded. Moreover, we may assume that some unobserved heterogeneities exist in our motorcycle accident fatality data, i.e. different fatal collision types, such as single-motorcycle and multi-vehicle accidents. We have tried to separate these, but there are too many discrepancies in the police recording system for us to successfully identify and separate these fatal collision types. For example, according to the police, a single-motorcycle accident is sometimes recorded as ‘not at fault’, ‘crashed out from the road’ or ‘victim of hit and run’, and most of the time the collision type section is left empty. Hence, in order to determine the exact motorcycle collision type, we have had to look at the free text in the accident record form, most of which is illegible. Second, the maps provided by the government agency, i.e. the Malaysian Highway Planning Unit, Public Works Department, are not wholly accurate and complete in terms of providing the geometric properties of the roads. Therefore, we have had to opt for an alternative solution, such as relying on Google Earth and local street maps (private source). Consequently, we have not been able to obtain several vital variables such as speed limits and road vertical profiles. This study reflects our realization that Malaysia is in grave need of a proper system that stores and categorizes all information pertaining to traffic censuses and current road geometry of all primary roads in Malaysia, i.e. Malaysian Federal government owned roads.

This study will benefit academic practitioners, engineers and decision makers on the subject of improving motorcycle safety not only in Malaysia but also in other countries that face similar situations. Model 3 enables us to predict motorcycle fatalities on Malaysian primary roads by just identifying the ADT of motorcycles and the number of access or minor junction points per kilometer. Hence, engineers or decision makers can plan to reduce fatalities by closing down some access points and building more service roads that can combine several access points into one. The next step for us is to conduct observational studies, i.e. to focus on the behavior of motorcyclists and other drivers at access points, in order to analyze and hopefully better understand the road safety situation on Malaysian primary roads.

References

Abdel-Aty, M.A., Radwan, A.E., 2000. Modeling traffic accident occurrence and involvement. Accident Analysis and Prevention 32, 633–642.
Abdul Manan, M.M., Várhelyi, A., 2012. Motorcycle fatalities in Malaysia. IATSS Research 36, 30–39.
Ackaah, W., Salifu, M., 2011. Crash Prediction Model for Two-Lane Rural Highways in the Ashanti Region of Ghana. IATSS Research.
ADSA, 2011. Statistics and Accident Characteristics Involving Motorcycles in Malaysia: ADSA (Accident Database and Analysis Unit): Fact Sheet. In: Road Safety Engineering and Environment Research Center (Ed.). Malaysian Institute of Road Safety Research, MIROS, Kajang, Selangor, Malaysia.
Bauer, K.M., Harwood, D.W., 2000. Statistical Models of At-Grade Intersection Accidents – Addendum. Federal Highway Administration, Virginia, USA.
Ceder, A., Livneh, M., 1982. Relationships between road accidents and hourly traffic flow – I: analyses and interpretation. Accident Analysis and Prevention 14, 19–34.
Chang, L.-Y., 2005. Analysis of freeway accident frequencies: negative binomial regression versus artificial neural network. Safety Science 43, 541–557.
Elvik, R., 2003. Traffic Safety, Transportation Engineers’ Handbook. McGraw Hill.
Elvik, R., Vaa, T., 2004. Handbook Road Safety Measures. Elsevier.
Garber, N.J., Haas, P.R., Gosse, C., 2010. Development of Safety Performance Functions for Two-Lane Roads Maintained by the Virginia Department of Transportation. Virginia Transportation Research Council.
Harnen, S., Radin Umar, R.S., Wong, S.V., Wan Hashim, W.I., 2006. Motorcycle accident prediction model for junctions on urban roads in Malaysia. Advances in Transportation Studies an international Journal 8, 31–39.
Hauer, E., 1985. On the estimation of the expected number of accidents. Accident Analysis and Prevention 18, 1–12.
Hauer, E., 2004. Statistical road safety modeling. Transportation Research Record: Journal of the Transportation Research Board 1897, 81–87.
Hauer, E., 2010. On prediction in road safety. Safety Science 48, 1111–1122.
Hauer, E., Harwood, D.W., Council, F.M., Griffith, M.S., 2002. Estimating safety by the empirical Bayes method. A tutorial. Transportation Research Record 1784, 126–131.
Hiselius, L.W., 2004. Estimating the relationship between accident frequency and homogeneous and inhomogeneous traffic flows. Accident Analysis and Prevention 36, 985–992.
HPU, 2007. Road Traffic Volume Malaysia 2007. In: Highway Planning Unit (Ed.). Ministry of Works Malaysia, Kuala Lumpur, Malaysia.
HPU, 2008. Road Traffic Volume Malaysia 2008. In: Highway Planning Unit (Ed.). Ministry of Works Malaysia, Kuala Lumpur, Malaysia.
HPU, 2009. Road Traffic Volume Malaysia 2009. In: Highway Planning Unit (Ed.). Ministry of Works Malaysia, Kuala Lumpur, Malaysia.
Hsu, T.-P., Ahmad Farhan, M.S., Nguyen, X.D., 2003. A comparison study on motorcycle traffic development in some Asian countries – case of Taiwan, Malaysia and Vietnam. The Eastern Asia Society for Transportation Studies (EASTS), International Cooperative Research Activity.
Jacobs, G., Aeron-Thomas, A., Astrop, A., 2000. Estimating Global Road Fatalities. TRL Report Transport Research Laboratory, London, United Kingdom.
Jonsson, T., 2005. Predictive Models for Accidents on Urban Links: A Focus on Vulnerable Road Users. Department of Technology and Society, Lund Institute of Technology. Lund University, pp. 1–142.
Jonsson, T., Lyon, C., Ivan, J.N., Washington, S.P., Schalkwyk, I.v., Lord, D., 2009. Differences in the performance of safety performance functions estimated for total crash count and for crash count by crash type. Transportation Research Record: Journal of the Transportation Research Board 2102, 115–123.
Karlaftis, M.G., Golias, I., 2002. Effects of road geometry and traffic volumes on rural roadway accident rates. Accident Analysis and Prevention 34, 357–365.
Li, Y., Qiu, J., Liu, G., Zhou, J., Zhang, L., Wang, Z., Zhao, X., Jiang, Z., 2008. Motorcycle accidents in China. Chinese Journal of Traumatology (English Edition) 11, 243–246.
Lord, D., Washington, S.P., Ivan, J.N., 2005. Poisson, Poisson-gamma and zero-inflated regression models of motor vehicle crashes: balancing statistical fit and theory. Accident Analysis and Prevention 37, 35–46.
Maher, M.J., Summersgill, I., 1996. A comprehensive methodology for the fitting of predictive accident models. Accident Analysis and Prevention 28, 281–296.
Miaou, S.-P., 1994. The relationship between truck accidents and geometric design of road sections: Poisson versus negative binomial regressions. Accident Analysis and Prevention 26, 471–482.
Olsson, U., 2002. Generalized Linear Models – An Applied Approach. Studentlitteratur, Lund, Sweden.
Pai, C.-W., 2009. Motorcyclist injury severity in angle crashes at T-junctions: identifying significant factors and analysing what made motorists fail to yield to motorcycles. Safety Science 47, 1097–1106.
Pai, C.-W., 2011. Motorcycle right-of-way accidents – a literature review. Accident Analysis and Prevention 43, 971–982.
PDRM, 2007. Laporan Tahunan PDRM 2007 (Royal Malaysia Police Annual Report, 2007). In: Royal Malaysia Police (Ed.), Kuala Lumpur, Malaysia.
PDRM, 2008. Laporan Tahunan PDRM 2008 (Royal Malaysia Police Annual Report, 2008). In: Royal Malaysia Police (Ed.), Kuala Lumpur, Malaysia.
PDRM, 2009. Laporan Tahunan PDRM 2009 (Royal Malaysia Police Annual Report, 2009). In: Royal Malaysia Police (Ed.), Kuala Lumpur, Malaysia.
PWD, 1986. A Guide on Geometric Design of Roads, Technical Guidelines (Arahan Teknik: Jalan) 8/86. Road Branch: Public Works Department Malaysia, Kuala Lumpur, Malaysia.
PWD, 2009. Laporan Statistik Jalan Malaysia 2009 (Malaysian Annual Road Statistics, 2009). In: Public Works Department of Malaysia (Ed.). Public Works Department of Malaysia, PWD, Kuala Lumpur, Malaysia.
Radin Umar, R.S., 1994. Road accidents in Malaysia. IATSS Research 18, 38–41.
Radin Umar, R.S., 1999. The Value of Frontal Conspicuity on Motorcycle Accidents in Malaysia, World Engineering Congress’99 – Towards the Engineering Vision: Global Challenges and Issues, Kuala Lumpur, Malaysia, pp. 91–96.
Radin Umar, R.S., 2005. The Value of Daytime Running Headlights Initiatives on Motorcycles Crashes in Malaysia. Transport and Communication Bulletin for Asia and the Pacific, pp. 17–31.
Radin Umar, R.S., Mackay, M., Hills, B., 2000. Multivariate analysis of motorcycle accidents and the effects of exclusive motorcycle lanes in Malaysia. Journal of Crash Prevention and Injury Control 2, 11–17.
Ramírez, B.A., Izquierdo, F.A., Fernández, C.G., Méndez, A.G.m., 2009. The influence of heavy goods vehicle traffic on accidents on different types of Spanish interurban roads. Accident Analysis and Prevention 41, 15–24.
Shankar, V., Mannering, F., Barfield, W., 1995. Effect of roadway geometrics and environmental factors on rural freeway accident frequencies. Accident Analysis and Prevention 27, 371–389.
Shankar, V., Mannering, F., Barfield, W., 1996. Statistical analysis of accident severity on rural freeways. Accident Analysis and Prevention 28, 391–401.
SPSS, 2007. SPSS Advanced Statistics 17.0, 17.0 Ed., Chicago, Illinois.
TARC, 2009. Development of Accident Prediction Model, Road Safety Knowledge Development and Dissemination. Thailand Accident Research Center (TARC), Bangkok.
Tegge, R.A., Jo, J.-H., Ouyang, Y., 2006. Development of Safety Performance Functions for Illinois. Illinois Center for Transportation, Illinois.
Tung, S.H., Wong, S.V., Law, T.H., Radin Umar, R.S., 2008. Crashes with roadside objects along motorcycle lanes in Malaysia. International Journal of Crashworthiness 13, 205–210.
Washington, S.P., Karlaftis, M.G., Mannering, F.L., 2003. Statistical and Econometric Methods for Transportation Data Analysis. CRC Press LLC, Washington, DC.
WHO, 2009. Global Status Report on Road Safety: Time for Action. World Health Organization, Geneva, Switzerland.
