
VEHICLE ACCIDENT AND TRAFFIC CLASSIFICATION USING DEEP CONVOLUTIONAL NEURAL NETWORKS

BULBULA KUMEDA1, ZHANG FENGLI1, ARIYO OLUWASANMI1, FORSTER OWUSU1, MAREGU ASSEFA1, TEMESGEN AMENU2

1 School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
2 School of Management, University of Electronic Science and Technology of China, Chengdu 610054, China

E-MAIL: bulbula2017@std.uestc.edu.cn, fzhang@uestc.edu.cn, ariyo@std.uestc.edu.cn, forster362@gmail.com, maregu2006@gmail.com, konsotempreture@yahoo.com

Abstract:
A road traffic accident is a horrifying phenomenon that causes the death of more than 1.35 million people and injuries to over 50 million people every year. The global economic loss caused by road traffic accidents is estimated at 518 billion US dollars annually. In this work, we investigate the categorization of traffic and accident images according to their content, features, and semantics using deep learning techniques. The availability of an appropriate data representation enhances the performance of deep neural networks in feature extraction, using supervised domain knowledge of the data to create valuable pattern understanding. We classify images with CNN techniques into four predefined classes, i.e. Accident, Dense Traffic, Fire, and Sparse Traffic. Our deep Convolutional Neural Network (CNN) model learns the mapping from input images to their labeled classes and shows good generalization to the test dataset. Experimental results on real-world datasets show that the proposed CNN method is effective for accident and traffic classification and can perform the task with an accuracy of 94.4% on the four target traffic and accident classes.

Keywords:
Convolutional neural network; Image classification; Traffic accident; Deep learning; Neural network; Pattern recognition

1. Introduction

Machine learning is the scientific study concerned with designing and developing statistical models and algorithms that allow programs to perform a particular task without explicit instructions, by adapting their behaviour to newly acquired data. Machine learning algorithms are used to analyze data, extract hidden patterns, and make predictions based on the summarized information in a suitable format. Machine learning plays a vital role in a variety of data-driven decision-making processes and is a rapidly growing approach with a tremendous influence on our day-to-day life, with examples such as rainfall forecasting, bankruptcy prediction, weather forecasting, optical character recognition, and self-driving systems [1-2].

Nowadays the gap between the abilities of humans and machines is narrowing, driven by the immense growth of Artificial Intelligence. Researchers in the field of AI are working on many parts of the field to achieve remarkable results in areas such as video and image identification, image analysis and classification, media recreation, natural language processing, and recommendation systems [3]. Most of the progress in deep learning for computer vision has been made and accomplished over time with one specific algorithm: the Convolutional Neural Network (CNN) [4].

Artificial neural networks (ANN) are inspired by the biological neural system and designed to recognize patterns. They are machine learning models used as data modeling tools for classification and prediction. Neural networks are composed of individual units called neurons, which are organized in groups called layers. Each node performs a simple mathematical computation on its input data and transmits the result to the nodes connected to it.

Currently, the best and most powerful algorithms we have for automated image processing are CNNs, and they are used by many different companies for tasks such as identifying objects in an image, image localization, and other pattern recognition tasks. CNN has been a popular direction of research in machine learning and image processing, its biggest strengths being the automatic extraction of visual characteristics and reduced computation time [5].

The task of image classification relies heavily on a successful computer vision system. Hence, we conduct an in-depth examination of an image classification system for our computer vision task using deep learning. We review current work on the detection and classification of images using CNNs. Finally, we analyze the use of convolutional neural networks in deep architectures and their application to traffic accident image classification.

In this work, we use a traffic accident image dataset called Traffic-Net, available at [6]. This dataset was gathered to ensure that machine learning methods can be trained to identify traffic situations and provide real-time observation, analytics, and warnings. The dataset has 4,400 images in four classes, with 1,100 images per class.
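A dataset organized as one folder per class, as described above, maps directly onto standard image-classification loaders. The following is a minimal sketch (not code from the paper); the directory names, image size, and split layout are illustrative assumptions only:

    # Illustrative only: paths, folder names and image size are assumptions.
    import tensorflow as tf

    IMG_SIZE = (224, 224)   # assumed input resolution
    BATCH = 32

    train_ds = tf.keras.utils.image_dataset_from_directory(
        "trafficnet/train",          # hypothetical path: one sub-folder per class
        image_size=IMG_SIZE,
        batch_size=BATCH,
        label_mode="categorical")    # one-hot labels for the four classes

    test_ds = tf.keras.utils.image_dataset_from_directory(
        "trafficnet/test",
        image_size=IMG_SIZE,
        batch_size=BATCH,
        label_mode="categorical")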
2. Related Works

Currently, CNN is the best and most powerful image processing technique, with great learning ability largely due to its several feature extraction stages (hidden layers) that can automatically learn representations from the data. Improvements in high-throughput processing hardware and the availability of massive datasets have accelerated research on CNNs, and in recent times very inspiring deep CNN architectures have been reported. Traffic accidents are a widespread phenomenon across the whole world, resulting in the death of precious human beings and huge losses of economic assets. Recently, researchers and scholars have been focusing seriously on minimizing and preventing life-threatening traffic accidents through the application of data mining and machine learning techniques.

To detect the factors that most often lead to fatal accidents, the work in [7] used convolutional neural networks by efficiently clustering records and considering suitable features. To identify the incidence of an accident, various features such as speed limit, injury severity, time of the accident, drunk driving, month, weather during the accident, human factors, and light conditions were considered. The work in [8] used a deep learning technique, convolutional neural networks, to automatically detect traffic incidents in an urban network using traffic flow data. The experimental results show that the CNN approach used in that work is effective and superior to traditional neural networks in detecting incidents. The authors also state that accident detection using deep learning methods can enhance the accuracy of incident detection in urban networks.

A novel road traffic accident prediction model called Traffic Accident Prediction using Convolutional Neural Network (TAP-CNN) is applied in [9], considering factors such as light, weather, and traffic flow to form a state matrix that depicts the traffic state for the CNN model. The TAP-CNN model was established to deliver a reference for predicting traffic accidents. The work in [10] used Traffic Accident Severity Prediction based on CNN (TASP-CNN) to address accident severity through analysis of the detailed combination relationships among the characteristics of traffic accidents that strongly affect severity. The authors proposed a Feature Matrix to Gray Image algorithm based on the weights of traffic accident features. The experimental outcomes of that work show that the proposed model has good prediction performance for accident severity.

In [11] the authors used a Fast Convolutional Neural Network (F-CNN) to predict future traffic flow with uncertain accident data. In that work, a fuzzy inference system (FIS) is applied for the first time to represent traffic accident characteristics when presenting unspecified traffic accident information to the CNN, which relieves the restrictions of traffic data. The paper [12] used a deep convolutional neural network to detect traffic accidents through feature extraction combined with a mixture of experts for classification. There are two tasks in that work: in the first task, the authors used the output of the last max-pooling layer of the CNN to extract hidden features automatically; for the second task, they used a combination of advanced variations of the extreme learning machine (ELM). The proposed system is very useful for online processing of accident images taken by surveillance systems such as Unmanned Aerial Vehicles (UAV). The works in [13-14] used a deep CNN and fast-learning CNN models for the classification of images, while the work in [15] used an enhanced fast-learning shallow CNN model designed to allow fast training and low implementation complexity. The method is characterized by convolutional filters, the use of least squares regression, randomly-valued classifier-stage input weights, and linear classifier-stage output units.

The recent advancements in convolutional neural networks are surveyed in [16], which discusses the strong performance of CNNs in several fields such as image classification, pattern recognition, and multimedia compression. Similarly, the work in [17] reviewed the enhancements of CNN in various aspects such as layer design, activation function, loss function, regularization, optimization, and fast computation, and introduced the different applications of CNN in computer vision, speech, and natural language processing. Nowadays CNN is also used to analyze medical images to improve the medical process by achieving accurate results. Some of this research used CNNs for mitosis detection in breast cancer, breast cancer image classification using a dataset of breast histopathological images, and a two-phase deep CNN for reducing class imbalance in histopathological image-based breast cancer recognition [18-21].

3. Vehicle Accident Classifier

Neural networks based on deep learning techniques are potentially powerful for classification and pattern recognition, which rely heavily on a successful computer vision system. Neural networks comprise three basic kinds of layers:

Input layers: carry the initial input data into the model for further processing by succeeding layers of artificial neurons.

Hidden layers: transform the given input into something the output layer can use. A hidden layer sits between the input layer and the output layer; each of its neurons receives a weighted input and yields an output through an activation function.

Output layers: the final layer of artificial neurons, which produces the output of the model from the hidden layers.

Fig.1 A neural network with two hidden layers, adapted from [22]
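To make this layer structure concrete, the following is a small sketch (not the model used later in this paper) of a fully connected network with an input layer, two hidden layers as in Fig.1, and an output layer; the layer sizes are arbitrary illustrative choices, assuming a Keras implementation:

    import tensorflow as tf
    from tensorflow.keras import layers

    # Toy network mirroring Fig.1: input -> two hidden layers -> output.
    mlp = tf.keras.Sequential([
        layers.Input(shape=(16,)),             # input layer: 16 illustrative features
        layers.Dense(8, activation="relu"),    # first hidden layer: weighted sum + activation
        layers.Dense(8, activation="relu"),    # second hidden layer
        layers.Dense(4, activation="softmax")  # output layer: 4 classes
    ])
    mlp.summary()

Each Dense neuron computes a weighted sum of the outputs of the previous layer and passes it through its activation function, exactly as described above.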
3.1. Convolutional Neural Network (ConvNet/CNN)

CNN is a type of deep neural network. As described above, neural networks contain many artificial neurons, which are the foundation of the algorithm's intelligence. The development of CNNs has shown great improvements over the last five years. Companies like Google, Amazon, and Facebook use CNNs for searching user photos, generating product recommendations, and automatic photo tagging, respectively. A CNN is capable of distinguishing one input image from another by assigning learnable weights and biases to the various objects in an image. The ConvNet is inspired by the organization of the visual cortex for image classification and has an architecture analogous to the connectivity pattern of neurons in the human brain. Its main purpose is to reduce the images into a form that is easier to process without losing the characteristics that are vital for obtaining a good prediction, while also accelerating training [23].

CNNs are feed-forward neural networks that learn to recognize patterns across space; they learn to identify components or patterns in an image or video frame such as curves, edges, and lines. The hidden layers of a CNN perform different tasks using four basic mechanisms: the convolutional layer, the ReLU layer, the pooling layer, and the fully connected layer.

3.1.1. Convolution Layer

This layer is always the first step in a CNN and can be considered its core building block. In this layer, the neurons look for particular characteristics, which later cause the neurons to produce high activations. By feeding the input image through a set of convolutional filters, the convolution layer activates certain characteristics of the image, with each filter responding to different characteristics. The layer's parameters consist of a set of learnable filters, or kernels, which have a small receptive field but extend through the full depth of the input. In signal processing terms, convolution can be viewed as a weighted sum of two signals. The output of one layer becomes the input of the next layer after passing through the convolution operation; that is, the layer passes information on to the next layer by applying a convolution operation to its input. The sum of the element-wise multiplication of the kernel (filter) and the original image is called the convolution [24].

One issue with convolutional layers is that they shrink the output map; a larger stride produces a smaller output. Equation (1) gives the connection between the output size O and the input size I of an image after convolution with stride S and kernel size K. The size of the feature map is also inversely related to the number of convolutional layers: as more convolutional layers are stacked, the feature map gets smaller and smaller. The row output size O_x and column output size O_y of a convolutional layer are given by:

    O_x = (I_x − K_x) / S + 1,    O_y = (I_y − K_y) / S + 1        (1)
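As a check of Equation (1), a 3 × 3 kernel sliding with stride S = 2 over a 224 × 224 input gives (224 − 3)/2 + 1 = 111 rows and columns (taking the integer part). The sketch below is illustrative only and assumes unpadded ("valid") single-channel convolution; it computes the output size and performs the element-wise multiply-and-sum described above:

    import numpy as np

    def conv_output_size(i, k, s):
        # Equation (1): O = (I - K) / S + 1 for unpadded convolution.
        return (i - k) // s + 1

    def conv2d_valid(image, kernel, stride=1):
        # Naive 2-D convolution: sum of element-wise products of the kernel
        # and each image patch (no padding, single channel).
        ih, iw = image.shape
        kh, kw = kernel.shape
        oh, ow = conv_output_size(ih, kh, stride), conv_output_size(iw, kw, stride)
        out = np.zeros((oh, ow))
        for r in range(oh):
            for c in range(ow):
                patch = image[r*stride:r*stride+kh, c*stride:c*stride+kw]
                out[r, c] = np.sum(patch * kernel)
        return out

    print(conv_output_size(224, 3, 2))                       # -> 111
    fmap = conv2d_valid(np.random.rand(224, 224), np.random.rand(3, 3), stride=2)
    print(fmap.shape)                                        # -> (111, 111)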
3.1.2. ReLU Layer

The rectified linear unit (ReLU) is the step that follows the convolution layer and applies the non-saturating activation function f(x) = max(0, x). It effectively removes negative values from an activation map by setting them to zero, which makes the network learn faster and work more efficiently.

Without affecting the receptive fields of the convolution layer, the ReLU layer increases the nonlinear properties of the overall network and of the decision layers. This is desirable because images themselves are highly non-linear.
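As a small illustration of f(x) = max(0, x) (a sketch, not code from the paper), applying it element-wise to an activation map simply zeroes the negative entries:

    import numpy as np

    def relu(x):
        # f(x) = max(0, x), applied element-wise: negatives become zero.
        return np.maximum(0, x)

    activation_map = np.array([[-1.5, 2.0],
                               [ 0.3, -0.7]])
    print(relu(activation_map))
    # [[0.   2. ]
    #  [0.3  0. ]]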
3.1.3. Pooling Layer

The pooling layer, also known as a downsampling layer, operates on the width and height of the image. It is widely used immediately after a convolutional layer to reduce the width and height of the picture. Pooling reduces the number of parameters needed and the amount of computation required, and using fewer parameters also helps to control and avoid overfitting. So that an object can be detected no matter where it is located in an image, pooling gradually reduces the size of the input picture. Max pooling is the most common and widely used form of pooling: it takes a filter of size F × F and applies the maximum operation over each F × F sized part of the image.

Fig.2 The architecture of the CNN for accident image classification
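A minimal sketch of the F × F max-pooling operation described above, assuming F = 2 and a stride of 2 (matching the pooling configuration used later in Section 3.1.5):

    import numpy as np

    def max_pool(x, f=2, stride=2):
        # Slide an f x f window over x and keep the maximum of each window,
        # halving width and height when f = stride = 2.
        h, w = x.shape
        oh, ow = (h - f) // stride + 1, (w - f) // stride + 1
        out = np.zeros((oh, ow))
        for r in range(oh):
            for c in range(ow):
                out[r, c] = x[r*stride:r*stride+f, c*stride:c*stride+f].max()
        return out

    fmap = np.array([[1, 3, 2, 4],
                     [5, 6, 1, 2],
                     [7, 2, 9, 0],
                     [3, 1, 4, 8]], dtype=float)
    print(max_pool(fmap))
    # [[6. 4.]
    #  [7. 9.]]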
3.1.4. Fully Connected Layer

A fully connected block is a series of fully connected layers. A fully connected layer can be represented as a function from ℝ^m to ℝ^n in which each output dimension depends on every input dimension. To group feature maps into classes, a CNN needs such classification layers, especially when dealing with massive data [25].

The fully connected layers in a CNN are basically a multilayer perceptron (MLP) that maps the m1^(l−1) × m2^(l−1) × m3^(l−1) activation volume produced by the combination of the preceding layers into a class likelihood distribution. The output layer of the MLP therefore has m1^(l−i) outputs, where i denotes the number of layers. If layer l − 1 is fully connected, then

    y_i^(l) = f( Σ_{j=1}^{m1^(l−1)} w_{i,j}^(l) y_j^(l−1) )        (2)

where m1^(l) indicates the number of neurons in the lth hidden layer, f denotes the activation function, and w_{i,j}^(l) is the weight from the jth neuron in the (l − 1)th layer to the ith neuron in the lth layer.
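Equation (2) is simply a matrix-vector product followed by the activation. A small numeric sketch, where the weights and inputs are arbitrary values chosen only for illustration:

    import numpy as np

    def dense_layer(y_prev, W, f):
        # Equation (2): y_i^(l) = f( sum_j w_ij^(l) * y_j^(l-1) )
        return f(W @ y_prev)

    y_prev = np.array([0.5, -1.0, 2.0])            # activations from layer l-1
    W = np.array([[ 0.2, 0.4, -0.1],               # w_ij: weight from neuron j to neuron i
                  [-0.3, 0.1,  0.5]])
    relu = lambda x: np.maximum(0, x)
    print(dense_layer(y_prev, W, relu))            # -> [0.   0.75]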
3.1.5. Model Design

In our work, the model consists of four convolutional layers and three fully connected layers. The first two convolutional layers are each followed by a max-pooling operation. All the convolutional filters are of size 3 × 3 with a stride of 2, and the pooling filters are of size 2 × 2 with a stride of 2 in both width and height. Each layer has a ReLU activation followed by batch normalization. The first two convolutional layers have 32 filters each, while the later layers have 64 filters. The dense layers have 128, 64 and 1 nodes respectively, with a softmax activation function on the output layer.
4. Result and Discussion

Using the Adam optimizer with a learning rate of 0.001 for 100 epochs, the model's cost function was computed using cross entropy with a batch size of 32. The model produced an efficient result on the Traffic-Net dataset, with a training accuracy of 94.4% and a test accuracy of 91.64%. Fig.3 and Fig.4 below show the learning curves of our model as the cost function of the model prediction against the ground truth is optimized using backpropagation.

From the figures, the training loss can be seen to depict the convergence of the cost function. We believe the model achieves high accuracy because of the invariance property of the CNN architecture, which is able to extract spatial features from images by learning curves and edges in the initial layers before combining them into meaningful shapes and patterns in subsequent layers.
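A sketch of the training configuration reported above, continuing the earlier illustrative snippets (it reuses the hypothetical build_model, train_ds and test_ds defined there and assumes a Keras implementation with categorical cross-entropy as the cross-entropy loss over the four one-hot classes):

    import tensorflow as tf

    model = build_model()

    # Adam optimizer, learning rate 0.001, cross-entropy loss;
    # the batch size of 32 was fixed when the datasets were built, 100 epochs.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss="categorical_crossentropy",
        metrics=["accuracy"])

    history = model.fit(train_ds, validation_data=test_ds, epochs=100)

    # history.history["accuracy"] and ["loss"] give the curves of Fig.3 and Fig.4.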
Currently, there is no published work using the Traffic-Net dataset, but a network based on a pretrained ResNet50 model was trained to perform prediction on the dataset and is provided in the GitHub repository [6]. The author reports the result of the trained ResNet model on the four classes over the 800 test images, achieving a test accuracy of 91.64%. Compared with this baseline, our model performs well on both the training and the test results, with a test accuracy of 91.64% and a training accuracy of 94.4%.

Table 1 Test of the CNN model
Batch size   Learning rate   Number of epochs   Model   Training accuracy   Test accuracy
32           0.001           100                CNN     94.4%               91.64%

The results in Table 1 show that, for Traffic-Net traffic and accident image classification, our neural network model performs well, achieving an accuracy rate of 94.4%.

Fig.3 Model accuracy of the CNN

Fig.4 Model loss of the CNN

5. Conclusion

In this work, we have proposed a deep convolutional neural network (CNN) that takes advantage of recent deep learning models for vehicle detection and recognition to detect traffic accidents from image datasets. The model is trained for image classification on the Traffic-Net dataset, which contains 4,400 images, 1,100 for each category, with 900 images per category used for training and 200 for testing. This dataset was gathered to ensure that machine learning systems can be trained to recognize traffic situations and provide real-time monitoring, analytics, and alerts. We categorize images into four predefined classes, i.e. Accident, Dense Traffic, Fire, and Sparse Traffic. Our deep Convolutional Neural Network (CNN) model learns the mapping of the input images to their labeled classes and shows good generalization to the test dataset. The proposed model consists of four convolutional layers and three fully connected layers. The model was trained for only 100 epochs with a learning rate of 0.001, yet it achieved a training accuracy of 94.4% and was evaluated on 800 test images. From the experimental results obtained by applying the proposed model to real-world data, we conclude that our CNN model is effective for image classification and can perform the task with an accuracy of 94.4% on the four target accident and traffic classes.

References

[1] Barboza, Flavio, Herbert Kimura, and Edward Altman, "Machine learning models and bankruptcy prediction," Expert Systems with Applications 83 (2017): 405-417.
[2] Cramer, Sam, et al., "An extensive evaluation of seven machine learning methods for rainfall prediction in weather derivatives," Expert Systems with Applications 85 (2017): 169-181.
[3] Saha, Sumit, "A comprehensive guide to convolutional neural networks - the ELI5 way," 2018.
[4] Koushik, J., "Understanding Convolutional Neural Networks," arXiv preprint arXiv:1605.09081, 2016.
[5] Lin, Yiqi, "Evaluation Methods and Applications in Image Recognition based on Convolutional Neural Networks," Journal of Physics: Conference Series, Vol. 1087, No. 6, IOP Publishing, 2018.
[6] Olafenwa, Moses, Traffic-Net Dataset, https://github.com/OlafenwaMoses/Traffic-Net
[7] L. Yasaswini, G. Mahesh, R. Siva Shankar, and L. V. Srinivas, "Identifying Road Accidents Severity using Convolutional Neural Networks," International Journal of Computer Sciences and Engineering, Vol. 6, Issue 7, July 2018.

[8] Lin Zhu, Fangce Guo, Rajesh Krishnan, and John W. Polak, "A Deep Learning Approach for Traffic Incident Detection in Urban Networks," 21st IEEE International Conference on Intelligent Transportation Systems (ITSC), Maui, Hawaii, USA, November 4-7, 2018.
[9] Lu Wenqi, Luo Dongyu, and Yan Menghua, "A Model of Traffic Accident Prediction Based on Convolutional Neural Network," 2017 2nd IEEE International Conference on Intelligent Transportation Engineering, Singapore, September 1-3, 2017.
[10] Ming Zheng, Tong Li, and Rui Zhu, "Traffic accident's severity prediction: a deep learning approach based CNN network," IEEE Access, Vol. 7, pp. 39897-39910, March 2019.
[11] Jiyao An, Li Fu, Weihong Chen, and Jiawei Zhan, "A Novel Fuzzy-Based Convolutional Neural Network Method to Traffic Flow Prediction with Uncertain Traffic Accident Information," IEEE Access, Vol. 7, pp. 20708-20722, February 2019.
[12] Ali Pashaei, Mehdi Ghatee, and Hedieh Sajedi, "Convolution Neural Network Joint with Mixture of Extreme Learning Machines for Feature Extraction and Classification of Accident Images," Journal of Real-Time Image Processing, 2019.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Adv. Neural Inf. Process. Syst., pp. 1-9, 2012.
[14] K. Lee and D. C. Park, "Image classification using fast learning convolutional neural networks," Adv. Sci. Technol. Lett. 113, pp. 50-55, 2015.
[15] McDonnell, Mark D., and Tony Vladusich, "Enhanced image classification with a fast-learning shallow convolutional neural network," 2015 International Joint Conference on Neural Networks (IJCNN), IEEE, 2015.
[16] Gu, Jiuxiang, et al., "Recent advances in convolutional neural networks," Pattern Recognition 77 (2018): 354-377.
[17] Q. Zhang, M. Zhang, T. Chen, Z. Sun, Y. Ma, and B. Yu, "Recent advances in convolutional neural network acceleration," Neurocomputing, vol. 323, pp. 37-51, 2019.
[18] D. C. Cireşan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, "Mitosis Detection in Breast Cancer Histology Images with Deep Neural Networks," in Medical Image Computing and Computer-Assisted Intervention (MICCAI 2013), Proceedings MICCAI, 2013, pp. 411-418.
[19] F. A. Spanhol, L. S. Oliveira, C. Petitjean, and L. Heutte, "Breast cancer histopathological image classification using Convolutional Neural Networks," in 2016 International Joint Conference on Neural Networks (IJCNN), 2016, vol. 29, no. 1, pp. 2560-2567.
[20] F. A. Spanhol, L. S. Oliveira, C. Petitjean, and L. Heutte, "A dataset for breast cancer histopathological image classification," IEEE Trans. Biomed. Eng., vol. 63, no. 7, pp. 1455-1462, 2016.
[21] N. Wahab, A. Khan, and Y. S. Lee, "Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection," Comput. Biol. Med., vol. 85, pp. 86-97, Jun. 2017.
[22] Aegeus, Zerium, "Artificial Neural Networks Explained," July 25, 2018, https://blog.goodaudience.com/artificial-neural-networks-explained-436fcf36e75 (accessed June 25, 2018).
[23] Ding, X., "Research on the applications of BP Neural Networks and Convolutional Neural Networks in character recognition," Master's thesis, Huazhong University of Science and Technology, 2014.
[24] Li, D., "Research on the license plate recognition based on Convolutional Neural Networks," Master's thesis, Xiangtan University, 2016.
[25] Zhao, J., "Research of Substation Monitoring Image Recognition Approach Based on Convolutional Neural Networks," Doctoral dissertation, North China Electric Power University, 2016.
