Abstract
Vehicle counting is an important process in the estimation of road traffic density to evaluate the traffic
conditions in intelligent transportation systems. With the increased use of cameras in urban centers and
transportation systems, surveillance videos have become central sources of data. Vehicle detection is one of the
essential uses of object detection in intelligent transport systems. Object detection aims at extracting certain
vehicle-related information from videos and pictures containing vehicles. This form of information collection in
intelligent systems faces low detection accuracy, inaccuracy in vehicle type detection, and slow processing
speeds. In this research, we propose a vehicle detection system for infrared images using the YOLO (You Only
Look Once) computational mechanism. The YOLO mechanism can apply different machine or deep learning
algorithms for accurate vehicle type detection. In this study we propose an infrared-based technique to combine
with YOLO for vehicle detection in traffic. This method will be compared with a machine learning technique,
the K-means++ clustering algorithm; a deep learning mechanism for multi-target detection; and infrared imagery
using a convolutional neural network.
1. Introduction
Infrared (IR) target tracking and detection is critical in video surveillance, especially in transportation systems [1]. This
infrared system has been utilized in military applications, especially in IR imaging and guidance technology. The technology
has attracted considerable attention due to its anti-interference ability, observability in all weather, high guidance precision,
and long detection distances [1]. Nonetheless, in contrast to conventional visual images, IR images have low spatial
resolution, a lack of textural information, and a poor signal-to-noise ratio (SNR). Additionally, tracking fast-moving vehicles
in traffic using IR images may raise problems with target resolution and background motion [1]. In this study,
we propose an observational framework for vehicle detection in traffic that utilizes infrared imaging and YOLO.
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd
2nd International Scientific Conference of Al-Ayen University (ISCAU-2020) IOP Publishing
IOP Conf. Series: Materials Science and Engineering 928 (2020) 022027 doi:10.1088/1757-899X/928/2/022027
In order to overcome the problems associated with current vehicle detection and tracking systems, we will compare our
proposed mechanism with, first, the K-means++ clustering algorithm, which uses bounding boxes of varied sizes on the
training dataset for detection of vehicles in traffic [2]; secondly, a deep learning multi-target detection approach that involves
the YOLO mechanism under the Darknet framework [3]; and finally, an infrared imaging approach that uses a convolutional
neural network [4].
The primary contribution of this research is to evaluate the effectiveness of the YOLO-based algorithm for vehicle detection
in IR target tracking. This research will provide important findings for the development of IR-based vehicle detection and
tracking systems that combine deep learning and the YOLO computational technique. The second contribution of this study
is to compare the various tracking algorithms to determine the most effective for vehicle detection in IR images.
2. RELATED WORK
In this section we provide a review of the important algorithms used for object detection in intelligent transport systems (ITS).
2.2 YOLOv2
The YOLO technique guarantees real-time image processing with high accuracy, but the method has a higher localization
error and lower recall. YOLOv2 is an updated version of the YOLO technique that increases the accuracy and the recall by
incorporating the new features listed below:
The fully connected layers responsible for prediction of the boundary boxes are removed.
Class prediction is accomplished at the boundary-box level rather than the cell level; the resulting elements have four
parameters of the boundary box.
A pooling layer is removed to increase the spatial output of the network to 13x13 from the initial 7x7.
The input image size is changed from 448x448 to 416x416. This results in odd-numbered spatial dimensions, which is
important in case a large object occupies the center of the picture.
The last convolutional layer is replaced with three 3x3 convolutional layers that generate 1024 output channels.
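The role of the 416x416 input size can be made concrete with a short sketch. YOLOv2 downsamples the input by a total stride of 32, so the output grid is simply the input size divided by 32; the function below (an illustrative helper, not part of any of the cited implementations) shows why 416 yields an odd 13x13 grid with a unique centre cell while 448 would not.

```python
def output_grid(input_size, stride=32):
    """YOLOv2 downsamples the input by a total stride of 32,
    so the spatial size of the output grid is input_size / stride."""
    assert input_size % stride == 0, "input must be a multiple of the stride"
    return input_size // stride

# A 416x416 input yields an odd 13x13 grid, so a single centre cell exists;
# a 448x448 input would yield an even 14x14 grid with no unique centre.
grid_416 = output_grid(416)
grid_448 = output_grid(448)
```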
2.3 YOLOv3
This is an updated version of YOLO that includes multi-label classification. YOLOv3 produces non-exclusive outputs whose
scores may sum to more than one. YOLOv3 does not use a softmax but rather independent logistic classifiers to compute the
likelihood of each object class in the image. Furthermore, YOLOv3 employs a binary cross-entropy loss for each label rather
than the mean square error in the computation of the classification loss. Figure 2 demonstrates the neural architecture of
YOLOv3.
Figure 3: The YOLO mechanism that will be used to verify the results [12]
The detector in this technique uses the deep learning technique of YOLOv3 and is used to verify the tracking. The results in
this verification system indicate that the system can detect targets contained in complex backgrounds. The detector verifies
the tracking results at a fixed frequency, which reduces the need for heavy and complex computation. In this method the
tracker operates independently. The CTAD technique can be summarized in the following pseudocode:
initialize the tracking thread for the tracker;
initialize the detecting thread for the detector;
run the tracker;
run the detector;
run the tracker;
end
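The two-thread structure of CTAD can be sketched as follows. This is a hedged illustration only: the detector below is a stand-in for YOLOv3 (the real detector would run a neural network), and the period of five frames is an assumed parameter, not a value from the cited work.

```python
import threading
import queue

def run_ctad(frames, detect_every=5):
    """Sketch of combined tracking and detection (CTAD): a tracker thread
    handles every frame, while a detector thread (a stand-in for YOLOv3)
    runs only on every detect_every-th frame to correct the tracker."""
    corrections = queue.Queue()  # detector -> tracker channel
    results = []

    def detector(frame_id, frame):
        # Stand-in for the YOLOv3 detector: pretend we localized the target.
        corrections.put((frame_id, "bbox-for-" + frame))

    def tracker():
        for i, frame in enumerate(frames):
            if i % detect_every == 0:
                # Hand the frame to the (slower) detector thread.
                d = threading.Thread(target=detector, args=(i, frame))
                d.start()
                d.join()  # in this sketch, wait so the correction is ready
            # Apply the latest pending detector correction, then track.
            correction = None
            while not corrections.empty():
                correction = corrections.get()
            results.append((frame, correction))

    t = threading.Thread(target=tracker)
    t.start()
    t.join()
    return results

out = run_ctad(["f%d" % i for i in range(10)], detect_every=5)
```

Running the detector only periodically is what keeps the per-frame cost close to that of the lightweight tracker.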
x̂_i, ŷ_i, ω̂_i, ĥ_i, Ĉ_i and p̂_i are the predictions corresponding to x_i, y_i, ω_i, h_i, C_i and p_i
λ_coord is the weight of the coordinate loss
λ_noobj is the weight of the confidence loss for bounding boxes that contain no object
B is the number of bounding boxes per grid cell
S² is the number of cells in the S × S grid
𝟙_i^obj indicates whether an object is located in cell i
𝟙_ij^obj indicates that the j-th box in cell i is responsible for the prediction
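For reference, the loss function to which these symbols belong can be written, following the standard YOLO formulation and using the symbols defined above, as:

```latex
L = \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right]
  + \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}
      \left[ \left(\sqrt{\omega_i} - \sqrt{\hat{\omega}_i}\right)^2
           + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right]
  + \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} (C_i - \hat{C}_i)^2
  + \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} (C_i - \hat{C}_i)^2
  + \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c \in classes} (p_i(c) - \hat{p}_i(c))^2
```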
The key element in this framework is the design of the network for vehicle detection. A multi-layer feature fusion is used
because vehicles vary in color, contour, tire shape, and lamp shape [10]. The multi-layer feature fusion strategy was used for
reorganizing the local information.
2.5.1 Design of Network. In this KCA method, the process involves two significant steps. The multi-layer feature fusion is
the first step in identifying vehicles in traffic images. In this step the difference between the vehicles is identified using
contour, tire shape, and lamp shape. The multi-layer feature fusion takes the general YOLOv2_vehicle model as illustrated in
figure 4 below.
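A common way to realize multi-layer feature fusion in YOLOv2-style networks is a passthrough (space-to-depth) layer, which reorganizes a fine-grained feature map so it can be concatenated with coarser, deeper features; whether KCA uses exactly this layer is an assumption here, so the snippet below is only an illustrative sketch on nested-list feature maps.

```python
def space_to_depth(fmap, block=2):
    """Reorganize an H x W feature map (nested lists of per-cell channel
    lists) into an (H/block) x (W/block) map where each output cell
    concatenates the channels of its block x block input neighborhood."""
    H, W = len(fmap), len(fmap[0])
    assert H % block == 0 and W % block == 0, "dims must divide evenly"
    out = []
    for i in range(0, H, block):
        row = []
        for j in range(0, W, block):
            cell = []
            for di in range(block):
                for dj in range(block):
                    cell.extend(fmap[i + di][j + dj])
            row.append(cell)
        out.append(row)
    return out

# A 4x4 map with one channel per cell becomes a 2x2 map with four channels,
# preserving the local information in a shape that can be concatenated
# with a deeper (coarser) feature map.
fmap = [[[r * 10 + c] for c in range(4)] for r in range(4)]
fused = space_to_depth(fmap)
```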
location in the feature map. The box regression is utilized in the fine-tuning of the window, and clustering statistics are
performed using the K-means algorithm.
In this technique, the dataset is fine-tuned based on classification. This fine-tuning is utilized in the training of the vehicle
dataset that will be used in the convolutional neural network. Furthermore, the data is enhanced during the training phase
using random scaling, exposure, and saturation. Once diversity is established in the images, the neural network divides the
image into various regions, which makes it easy to predict the probabilities and borders and to assign the bounding boxes
based on the probabilities. The DLMTD model is illustrated below.
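The training-time enhancement described above can be sketched as follows. The jitter ranges and the multi-scale step are assumed parameters chosen for illustration (YOLO-style trainers commonly use similar values), not the exact settings of the DLMTD work.

```python
import random

def random_input_size(base=416, stride=32, steps=3):
    """Random multi-scale training: jitter the input resolution in
    multiples of the network stride (the +/- 3 step range is an assumption)."""
    return base + stride * random.randint(-steps, steps)

def jitter_hsv(pixels, exposure=1.5, saturation=1.5):
    """Randomly scale the saturation (s) and exposure/value (v) of HSV
    pixels; `pixels` is a list of (h, s, v) tuples with s, v in [0, 1]."""
    s_mul = random.uniform(1.0 / saturation, saturation)
    v_mul = random.uniform(1.0 / exposure, exposure)
    return [(h, min(s * s_mul, 1.0), min(v * v_mul, 1.0))
            for h, s, v in pixels]

size = random_input_size()           # e.g. 320, 352, ..., 512
augmented = jitter_hsv([(0.1, 0.5, 0.5)])
```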
3. COMPARISON
In this section, we compare the four mechanisms for detecting vehicles in images taken from cameras in the intelligent
transport system (ITS).
Table 1: Comparison of the detection algorithms

                   CTAD                KCA                          DLMTD                      CNN
Tracking           LCT                 K-means++                    CNN                        CNN
Detection          YOLOv3              YOLOv2                       YOLOv2                     Labeling Toolbox
Error correction   Regression models   Multi-layer feature fusion   Intersection over union    Suppression
The techniques were compared based on the accuracy of detection (precision) and the speed of evaluation (frames per
second, FPS). Based on the literature review, the DLMTD had the highest detection accuracy and the highest speed of
evaluation [7]. Our technique had a desirable speed of 18.1 FPS and the second-highest accuracy percentage. The combined
tracking and detection (CTAD) performance can be improved by subjecting all the techniques to the infrared images that
were used [9]. The IR images have low spatial resolution, a lack of textural information, and a poor signal-to-noise ratio
(SNR). These elements reduced the accuracy of the CTAD technique, while the other mechanisms used normal images [8].
Achieving the second-best speed illustrates the potential of this technique.
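For clarity, the two comparison metrics can be stated explicitly; the helpers below are illustrative definitions, and the counts used in the example are hypothetical.

```python
def precision(true_positives, false_positives):
    """Detection precision: the fraction of predicted boxes that are correct."""
    return true_positives / (true_positives + false_positives)

def fps(frames_processed, elapsed_seconds):
    """Evaluation speed in frames per second."""
    return frames_processed / elapsed_seconds

# Hypothetical counts: 90 correct detections out of 100 predictions,
# and 181 frames processed in 10 seconds (matching the 18.1 FPS above).
p = precision(90, 10)
speed = fps(181, 10)
```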
4. CONCLUSION
In this study, the survey compared some notable object detection techniques that can be applied to IR images to detect and
track vehicles in ITS. The CTAD is a new technique that, when coupled with YOLO, produces notable results. The technique
was evaluated against classical methods. The other techniques have been used to detect vehicles in normal images. The
findings of this survey show that there is potential in the future to develop various techniques based on YOLO to detect
vehicles in infrared images.
References
[1] Y. Hu, M. Xiao, K. Zhang, and X. Wang, “Aerial Infrared Target Tracking in Complex Background Based on Combined Tracking and
Detecting,” Math. Probl. Eng., vol. 2019, 2019, doi: 10.1155/2019/2419579.
[2] J. Sang et al., “An improved YOLOv2 for vehicle detection,” Sensors (Switzerland), vol. 18, no. 12, Dec. 2018, doi:
10.3390/s18124272.
[3] X. Li, Y. Liu, Z. Zhao, Y. Zhang, and L. He, “A deep learning approach of vehicle multitarget detection from traffic video,” J. Adv.
Transp., vol. 2018, 2018, doi: 10.1155/2018/7075814.
[4] X. Liu, T. Yang, and J. Li, “Real-time ground vehicle detection in aerial infrared imagery based on convolutional neural network,”
Electron., vol. 7, no. 6, Jun. 2018, doi: 10.3390/electronics7060078.
[5] J. Leitloff, D. Rosenbaum, F. Kurz, O. Meynberg, and P. Reinartz, “An Operational System for Estimating Road Traffic Information
from Aerial Images,” Remote Sens., vol. 6, no. 11, pp. 11315–11341, Nov. 2014, doi: 10.3390/rs61111315.
[13] W. Liu et al., “G-RMI Object Detection,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes
Bioinformatics), vol. 9905 LNCS, pp. 21–37, 2016, doi: 10.1007/978-3-319-46448-0_2.
[14] A. F. Zohra, S. Kamilia, A. Fayçal, and S. Souad, “Detection And Classification Of Vehicles Using Deep Learning,” Int. J. Comput.
Sci. Trends Technol., vol. 6, 2013, Accessed: Apr. 25, 2020. [Online]. Available: www.ijcstjournal.org.
[15] D. Gour and A. Kanskar, “Automated AI Based Road Traffic Accident Alert System: YOLO Algorithm,” Int. J. Sci. Technol. Res.,
vol. 8, p. 8, 2019, Accessed: Apr. 25, 2020. [Online]. Available: www.ijstr.org.
[16] Y. Jamtsho, P. Riyamongkol, and R. Waranusast, “Real-time Bhutanese license plate localization using YOLO,” ICT Express, Nov.
2019, doi: 10.1016/j.icte.2019.11.001.
[17] A. R. Caballo and C. J. Aliac, “YOLO-based Tricycle Detection from Traffic Video,” in Proceedings of the 2020 3rd International
Conference on Image and Graphics Processing, 2020, pp. 12–16, doi: 10.1145/3383812.3383828.