Improvement of Object Detection Based On Faster R - 220904 150051
2021 36th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC) | 978-1-6654-3553-6/21/$31.00 ©2021 IEEE | DOI: 10.1109/ITC-CSCC52171.2021.9501480
ed licensed use limited to: SVKM¿s NMIMS Mukesh Patel School of Technology Management & Engineering. Downloaded on September 01,2022 at 12:38:10 UTC from IEEE Xplore. Restriction
[Figure: detection network block diagram. Surviving labels: Input Image; Conv. and Maxpool layers; Feature Map; RPN; ROI Pooling; FC layers; Classification; Bounding Box; outputs: Bounding Boxes, Classes]
The feature extraction network is typically a pretrained CNN. The first subnetwork, the RPN, is used to generate a proposal of the object, and the second ...

... box, and conditional class probabilities. The confidence score is defined as $\Pr(\text{Object}) \ast \text{IOU}^{\text{truth}}_{\text{pred}}$. Intersection Over Union (IOU) is the most popular ...
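As a brief illustration of the IOU measure mentioned above, the following minimal sketch computes it for two axis-aligned boxes. The `(x, y, w, h)` box layout with a top-left corner and the function name `iou` are assumptions made for this example; they are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x, y, w, h): top-left corner plus width and height.
    This layout is an assumption made for this sketch.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the doubly counted overlap.
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# Usage: two 2x2 boxes offset by one unit in each direction.
print(iou((0, 0, 2, 2), (1, 1, 2, 2)))
```

An IOU of 1 means the boxes coincide and 0 means they do not overlap.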
[Figure: Kalman filter block diagram. Surviving labels: Kalman Filter, H]

$$x_{k+1} = A x_k + B u_k + w_k$$

The estimate can then be updated by ...

where

$$A = \begin{bmatrix} 1 & 0 & \Delta t & 0 \\ 0 & 1 & 0 & \Delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad (2)$$

$$B = \begin{bmatrix} \frac{\Delta t^2}{2} & 0 \\ 0 & \frac{\Delta t^2}{2} \\ \Delta t & 0 \\ 0 & \Delta t \end{bmatrix} \qquad (3)$$

$$H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \qquad (4)$$

$$x = \begin{bmatrix} x & y & v_x & v_y \end{bmatrix}^T \qquad (5)$$

$$u = \begin{bmatrix} a_x & a_y \end{bmatrix}^T \qquad (6)$$

4. Experimental Results and Discussion

To verify the effectiveness of using the Kalman filter to fuse the YOLO v2 detector and the Faster R-CNN detector, an experiment is carried out in the MATLAB/Simulink environment. An annotated driving dataset is used, consisting of frames collected from cameras while driving in cities during daylight conditions.

The 1920x1200 resolution images are first resized to 359x224 to fit the input size of YOLO v2 and Faster R-CNN. A pretrained ResNet-50 network is used for feature extraction to build the YOLO v2 and Faster R-CNN detection networks. After the coordinates and sizes of the bounding boxes are obtained from each algorithm, the results are fed to the Kalman filter.

The coordinates predicted from the Kalman filter are ...
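The fusion step described in this section can be sketched as one predict/update cycle of a linear Kalman filter. The sketch assumes that Eqs. (2)-(6) describe a constant-velocity model whose state holds position and velocity and whose input holds acceleration, and that only position is observed. The time step, the noise covariances `Q` and `R`, the initial covariance, all function names, and all numbers are hypothetical. The way the YOLO v2 output enters the filter is not recoverable from this excerpt; here it is simply assumed to seed the state, while the Faster R-CNN box centre serves as the observation.

```python
# One predict/update cycle of a linear Kalman filter in pure Python.
# Vectors are column matrices (lists of one-element rows).
# dt, Q, R, the initial covariance and all numbers below are assumptions.

def matmul(a, b):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def scaled_identity(n, s=1.0):
    return [[s if i == j else 0.0 for j in range(n)] for i in range(n)]

def inv2(m):
    """Inverse of a 2x2 matrix (the innovation covariance is 2x2 here)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def make_model(dt):
    """State transition, input and observation matrices for time step dt."""
    A = [[1.0, 0.0, dt, 0.0],
         [0.0, 1.0, 0.0, dt],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    B = [[dt * dt / 2.0, 0.0],
         [0.0, dt * dt / 2.0],
         [dt, 0.0],
         [0.0, dt]]
    H = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0]]
    return A, B, H

def predict(x, P, u, A, B, Q):
    """Propagate the state estimate and its covariance one step ahead."""
    x_pred = add(matmul(A, x), matmul(B, u))
    P_pred = add(matmul(matmul(A, P), transpose(A)), Q)
    return x_pred, P_pred

def update(x_pred, P_pred, z, H, R):
    """Correct the prediction with the observation z."""
    innovation = sub(z, matmul(H, x_pred))
    S = add(matmul(matmul(H, P_pred), transpose(H)), R)
    K = matmul(matmul(P_pred, transpose(H)), inv2(S))  # Kalman gain
    x_new = add(x_pred, matmul(K, innovation))
    P_new = matmul(sub(scaled_identity(len(P_pred)), matmul(K, H)), P_pred)
    return x_new, P_new

# Usage with hypothetical numbers.
A, B, H = make_model(dt=1.0)
x0 = [[100.0], [50.0], [0.0], [0.0]]  # assumed: YOLO v2 box centre, zero velocity
P0 = scaled_identity(4, 10.0)         # assumed initial uncertainty
u = [[0.0], [0.0]]                    # assumed: no acceleration input
Q = scaled_identity(4, 0.01)          # assumed process noise covariance
R = scaled_identity(2, 1.0)           # assumed observation noise covariance
z = [[104.0], [52.0]]                 # assumed: Faster R-CNN box centre

x_pred, P_pred = predict(x0, P0, u, A, B, Q)
x_new, P_new = update(x_pred, P_pred, z, H, R)
print(x_new[0][0], x_new[1][0])
```

The corrected position lies between the predicted position and the observation, and the ratio of `R` to the predicted covariance decides which of the two it sits closer to.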
[Figure panels: (a) YOLO v2 (b) Faster R-CNN]

In this paper, vehicle object detection that combines the results of YOLO v2 and Faster R-CNN is proposed. YOLO v2 is fast and has a lower computational cost; however, it somewhat sacrifices detection accuracy. To overcome this problem, the Kalman filter is used to fuse the two popular object detection algorithms. Owing to the one-stage structure of YOLO v2 and the two-stage structure of Faster R-CNN, the former is faster while the latter is more accurate. Therefore, in the Kalman filter, the results from Faster R-CNN are used as the observation. An experiment is carried out for vehicle detection, and the results show that the fusion of the two algorithms in the Kalman filter has better detection accuracy.

Acknowledgment

This work was supported by Seoul National University of Science and Technology, Seoul, South Korea.