
Traffic Sign Detection and Recognition with Deep CNN Using Raspberry Pi 4 in Real-time

Matha Vijaya Phanindra Kumar
Department of Electronics and Communication Engineering
Amrita School of Engineering, Coimbatore,
Amrita Vishwa Vidyapeetham, India.
cb.en.p2ael21014@cb.students.amrita.edu

Corresponding Author: Karthika R.
Department of Electronics and Communication Engineering
Amrita School of Engineering, Coimbatore,
Amrita Vishwa Vidyapeetham, India.
r_karthika@cb.amrita.edu

2023 IEEE 11th Region 10 Humanitarian Technology Conference (R10-HTC) | 979-8-3503-2614-7/23/$31.00 ©2023 IEEE | DOI: 10.1109/R10-HTC57504.2023.10461824

Abstract— Automatic traffic sign detection and recognition is crucial and has the potential to be used for driver assistance and to reduce collisions in driverless cars. Many applications in the automotive sector are built around the computer vision challenge of traffic sign detection. The proposed method uses a deep Convolutional Neural Network (CNN) to detect and recognize traffic sign images in real time. Using images of the German Traffic Sign Recognition Benchmark (GTSRB) as training data, a sequential CNN model has been built to recognize and classify unlabeled traffic signs. There are 43 classes in the image dataset. The model is implemented in hardware using a Raspberry Pi 4 and a web camera.

Keywords— Traffic sign recognition and detection, German Traffic Sign Recognition Benchmark (GTSRB), Convolutional Neural Networks (CNN), Raspberry Pi 4, Web camera

I. INTRODUCTION
Advanced Driver Assistance Systems (ADAS) are a vital area of computer vision research. Traffic sign detection and recognition [1] are two related technologies. Traffic signs display the most recent traffic conditions, establish road rights, prohibit and authorize specific driving behavior on different routes, and cue hazard warnings along with other essential information concerning vehicle safety. They can also assist drivers in assessing the state of the road to choose the best driving routes [2].

Color and shape are important cues that help drivers obtain road information. Traffic signs have several constant properties that can be used for detection and recognition. Most countries use nearly the same colors for traffic signs, which typically have standardized shapes (circles, triangles, and rectangles) and basic colors (red, blue, and yellow). External factors, such as the weather, frequently affect how traffic signs appear. Because of this, traffic sign recognition is a key subject of study in traffic engineering for the safety of drivers as well as pedestrians. Multiple traffic sign methods have been developed in [3,4]. A CNN based on a transfer learning approach is proposed in [5]: effective Regional Convolutional Neural Network (RCNN) detection is achieved with a small number of typical traffic training instances after a deep CNN is trained on a large amount of traffic sign image data. A multi-resolution feature combination network is developed in [6], allowing numerous useful features to be learned from small-scale objects.

Additionally, the framework used for traffic sign detection has been divided into regression and classification tasks to gather more data and enhance recognition and detection ability. To increase accuracy and speed, a novel approach is used in the proposed work to detect traffic sign images, and pre-processing is performed before recognition. Fig. 1 shows a flow diagram of all the steps carried out to obtain the desired implementation.

Fig. 1. Block diagram of Traffic sign detection and recognition

II. RELATED WORK
Several approaches to traffic sign recognition have been documented. An integrated system for detecting, tracking, and recognizing traffic signs, including speed limits, based on AdaBoost, color-sensitive Haar wavelet features, and temporal information propagation is presented in [7]. Around 4,000 samples from 23 classes, ranging from 30 to 600 samples per class, were used to train the classifier, and a test set of 1,700 traffic sign images was used to assess performance, giving a classification accuracy of 94%.

A neural network approach based on single-digit recognition is presented in [8] for the recognition of European and American speed limit signs using the MAPS software architecture. However, the authors do not report results for each individual classification. The complete system, which includes detection and tracking, is evaluated on 281 traffic signs.

Color-based region segmentation and shape-based verification of the segmented area are used in [9] to classify various traffic signs. During detection, an appropriate neural network is chosen using the shape and color (RGB) of the data. In [10], a multi-layer perceptron neural network is trained on 2,880 images as a classifier for speed limits based on their numbers; tested on 1,233 images, it achieves an accuracy of 92.4%. It is unclear whether different instances of the same traffic sign are shared across the image groups in that work.

Authorized licensed use limited to: BMS College of Engineering. Downloaded on April 05,2024 at 03:48:00 UTC from IEEE Xplore. Restrictions apply.
A dataset with 1,300 pre-processed instances from 6 classes (five speed limits and one noise class) is used in [11] to test various methods based on shape-dependent segmentation of binary images. The work in [12] uses a collection of 36,000 images of Spanish traffic signs from 193 sign classes and is based on pictogram identification of road signs. In [13], the authors recognized traffic signs using histogram features and ROI extraction, but the reported performance fell short of expectations.

The results presented above are not directly comparable because all systems were assessed on proprietary data, most of which is not accessible to other researchers for verifying the claims. The authors in [14] used the GTSRB dataset with a novel CNN-based architecture for object detection and recognition. Faster R-CNN is used in [15] for multi-object detection with multiple datasets involving traffic signs, traffic lights, and cars, and in [16] the authors used a CNN-based approach for Indian traffic sign detection and recognition. In [17], the authors used YOLOv5 for video detection and VGG for audio detection. Real-time videos were used to test the suggested model, which performed better than the alternatives.

III. DATASET
Images for the GTSRB collection were chosen from recordings made over numerous visits in the spring and fall of 2010 in the vicinity of Bochum, Germany. They capture varied scenes (urban, rural, and highway) in daylight and twilight under a range of weather conditions. Various instances can be seen in Fig. 2. The recorded traffic signs follow standards established by the Vienna Convention on Road Signs and Signals.

A. Data Collection and Format
The images were captured as Bayer patterns with a 1380CH camera with automatic exposure control. Edge-adaptive, constant-hue demosaicking [18, 19] was applied to transform them into RGB color space, and they were all saved in raw Portable Pixel Map (PPM) file format. Each relevant traffic sign visible in the images has been manually labeled. The ground truth information is kept in a CSV file.

B. Data Organization
The GTSRB dataset [20] is used in the proposed work. It is a single-image classification dataset with more than 40 classes and more than 50,000 images in total. In terms of color, shape, rotation, occlusion, and weather conditions, the benchmark is a substantial database with significant differences between classes.

The GTSRB dataset contains 51,839 images (39,209 in the training split and 12,630 in the testing split) representing 43 different types (classes) of traffic signs, ranging in size from 15 × 15 to 250 × 250 pixels. The traffic signs are not necessarily centered in the image. Images that show the same real-world traffic sign are assigned to the same set. However, the majority of traffic sign instances in the dataset samples appear only once. Some of the images in the training set are also used for validation.

Each traffic sign image is labeled with its sign class, for example road work, speed limit 50 kilometers per hour, or speed limit 60 kilometers per hour. Although the tasks of detection and recognition are clearly distinct, the traffic signs were divided into three main categories that suit the characteristics of a number of well-known traffic sign detection algorithms: danger signs, mandatory signs, and prohibitory signs, with the remaining signs grouped as other signs. Fig. 2 shows the different categories of traffic sign images used in the proposed work.

Fig. 2. Different Categories of Traffic Signs in GTSRB

IV. HARDWARE AND OPERATING SYSTEM
A. Raspberry Pi 4 B
Raspberry Pi is one of the most widely available and advanced hardware platforms of its kind. Its operating system is available in several versions, and it gives researchers ample opportunity to develop different software and applications. The Raspberry Pi Foundation has revised the Raspberry Pi in recent years as technology has advanced. The most recent is the Raspberry Pi 4, which is available with different Random Access Memory (RAM) capacities. A Raspberry Pi 4 with 8 GB of RAM and a 17 Frames Per Second (FPS) web camera have been used for the implementation.

B. Raspbian Operating System (OS)
Raspbian is a Debian-based OS for the Raspberry Pi. The latest Debian Bullseye Raspberry Pi Desktop 64-bit version has been used in this work. There are other versions for 32-bit operating system applications as well. Fig. 3 shows the Raspberry Pi 4 Model B and its hardware layout. The OS has to be written to the Secure Digital (SD) card from a personal computer, after which the board is booted and the environment is set up for the hardware implementation of the model. Fig. 4 shows the Raspberry Pi Imager application; the environment used on the Raspberry Pi is Linux.

Fig. 3. Raspberry Pi 4 Model B

The Raspberry Pi Imager application is used to write the Raspbian OS to the SD card. The SD card should be chosen so that its capacity suits the application. An HP 32 GB memory card has been used to hold the Raspbian 64-bit Bullseye version. Fig. 5 shows the different versions of Raspbian OS available.

Fig. 4. Raspberry Pi Imager

Fig. 5. Raspbian OS selection from Raspberry Pi Imager

The next step after writing the OS is to insert the SD card into the Raspberry Pi and make the necessary connections for the application. A wireless configuration mode was chosen while writing the OS, and the required configuration to access the OS was performed by setting a username and password. The PuTTY and VNC Viewer applications are used to connect to the Raspberry Pi OS in wireless mode. Fig. 6 shows the basic description of the Raspberry Pi 4 B and Fig. 7 shows the starting window of VNC Viewer. VNC Viewer can be used for remote access to a computer or any other operating system, given a configuration appropriate to the application and its usage.

Fig. 6. Raspberry Pi 4 Model B basic description

Fig. 7. VNC viewer

As discussed above, a Raspberry Pi 4 B with 8 GB of RAM and the Debian Bullseye Raspberry Pi Desktop 64-bit version have been used for the hardware implementation. The processor of the Raspberry Pi 4 B used in the proposed work is clocked at 1.8 GHz. Since the total current consumption of downstream USB peripherals is less than 500 mA in the proposed work, a good quality 2.5 A power supply is used. The cost of the model is around $80 with all required peripherals, and a 17 FPS web camera costing around $8 has been used for the implementation.

V. METHODOLOGY
This section discusses the steps and process flow carried out for accurate detection and recognition of traffic sign images using the proposed CNN model.

A. Data Visualization
Data visualization gives a clear idea of the distribution of the data among the 43 classes present in the GTSRB dataset, which can be seen in Fig. 8.

Fig. 8. Distribution of training dataset in GTSRB

B. Data Pre-processing
To achieve the best results, pre-processing is carried out on the input images. Gray-scaling, shuffling, normalization, and equalization are among the techniques used. The main purpose of converting an image from one color space to another is to help the CNN extract features efficiently. In digital images, gray-scaling means that each pixel value reflects only the brightness of the light, so the image is composed entirely of shades of grey ranging from deepest black to lightest white. It also reduces the dimension of the image and the model's complexity. For example, a 4x4x3 image has 48 input nodes, whereas its grayscale version requires only 16.

Shuffling improves the randomness and variety of the training data. One of the dataset's major flaws is that many images have low contrast, making it difficult even for the human eye to recognize a given sign. Image contrast can be improved automatically using an algorithm known as Contrast Limited Adaptive Histogram Equalization (CLAHE). Normalization is used to alter the intensity range of pixels. It is also used to remove noise from images. Pixel values are normalized to the range 0 to 1 instead of the 0 to 255 levels. Min-max scaling is used as the normalization technique in the proposed work.

The images in the GTSRB dataset are not equally distributed: some classes have few images and others have many. To improve the performance and capacity of the trained model, image augmentation is used, which increases the number of images in the dataset used for training. Deep neural network performance frequently improves as the amount of data variability increases. Image data augmentation is typically applied only to the training dataset and not to the validation or test datasets. Image augmentation has been carried out in the proposed work to make the training data more general and avoid overfitting.

C. CNN Model
A CNN is an image processing architecture that uses convolutional layers to recognize features such as edges and patterns. The input images, represented as pixel matrices, are passed through non-linear transformations via activation functions such as sigmoid to learn detailed relationships among them. Pooling layers reduce complexity and highlight important information by downsampling the feature maps. The learned patterns are then used by the fully connected layers to produce predictions or classifications. CNNs are prominent tools for numerous computer vision tasks and applications due to their outstanding performance in object detection and image recognition. The complete CNN architecture is detailed in Table I; the total number of trainable parameters is 378,203.

TABLE I. NETWORK STRUCTURE OF CNN USED IN THIS WORK

The model is trained with different parameters based on the distribution of the data. After tuning the parameters a number of times with multiple combinations, the final model parameters are: test ratio = 0.2, validation ratio = 0.2, batch size = 100, and learning rate = 0.001, with the Adam optimizer used for optimization. Fig. 9 shows the block diagram of the CNN workflow with the given input image dataset for training.

Fig. 9. Block diagram of CNN Model and flow process

A CNN has been chosen over other machine learning models for image-related tasks in the proposed work because of its capacity to capture spatial hierarchies and local connectivity, automatically learn features, and exhibit translation invariance. Parameter sharing reduces model complexity and improves performance. Data augmentation improves generalization, while parallel processing speeds up training and inference, giving the best possible results when implemented in hardware.
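
The gray-scaling and min-max normalization steps described in the pre-processing subsection can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the ITU-R BT.601 luma weights are an assumed choice of gray-scale conversion, and the scaling uses each image's own minimum and maximum, which is one common reading of min-max scaling. CLAHE and augmentation are omitted.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]          # (R, G, B), each in 0..255
RgbImage = List[List[Pixel]]
GrayImage = List[List[float]]


def to_grayscale(image: RgbImage) -> GrayImage:
    """Collapse three color channels into one brightness channel.

    The BT.601 luma weights are an assumption; the paper does not
    state which conversion it uses.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]


def min_max_scale(image: GrayImage) -> GrayImage:
    """Rescale pixel values linearly so that they lie in [0, 1]."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                      # constant image: avoid division by zero
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / (hi - lo) for v in row] for row in image]


def input_nodes(height: int, width: int, channels: int) -> int:
    """Number of input values the network receives for one image."""
    return height * width * channels
```

With these helpers, `input_nodes(4, 4, 3)` and `input_nodes(4, 4, 1)` reproduce the 48 versus 16 input nodes quoted in the text, which is the reduction in input size that gray-scaling provides.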

VI. RESULTS AND ANALYSIS
Real-time detection and recognition of traffic signs is accomplished by interfacing the Raspberry Pi 4 and web camera described above. The model detects traffic signs on the road by continuously processing the camera images. During real-time detection, if no sign is detected, the output display of the Raspberry Pi 4 shows the message "No Sign Detected", as shown in Fig. 10. Table II shows the evaluation of the trained model at different stages of training. Results were recorded after every 50 epochs, and the best testing and validation results were achieved after 200 epochs.

Fig. 10. Output image showing No sign Detected

TABLE II. ACCURACY FOR TESTING AND VALIDATION AT DIFFERENT EPOCHS

Fig. 11 shows the training and validation accuracy, and Fig. 12 shows the training and validation losses.

Fig. 11. Graph showing training, validation accuracy

Fig. 12. Graph showing training and validation loss

A. Interfacing with Raspberry Pi 4
The model created in the previous sections is then deployed by interfacing the Raspberry Pi 4 with the web camera, as shown in Fig. 13. Fig. 14, Fig. 15, and Fig. 16 are output images after deployment, showing class-15 (No entry), class-34 (Turn left ahead), and class-16 (Vehicles over 3.5 metric tons prohibited), respectively.

Fig. 13. Interfacing Raspberry Pi 4 with a Web camera

Fig. 14. Output image of class-15 showing No Entry

Fig. 15. Output image of class-34 showing Turn left ahead
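
One plausible way to produce the "No Sign Detected" message described in this section is a confidence threshold on the classifier's output probabilities. The sketch below is an assumption rather than the authors' code: the function name, the threshold of 0.75, and the partial label table (which uses the class numbers as reported in this section) are all hypothetical.

```python
from typing import Dict, Sequence

# Labels as reported in this section; a full table would have 43 entries.
CLASS_NAMES: Dict[int, str] = {
    15: "No entry",
    16: "Vehicles over 3.5 metric tons prohibited",
    34: "Turn left ahead",
}


def describe_prediction(probabilities: Sequence[float],
                        class_names: Dict[int, str] = CLASS_NAMES,
                        threshold: float = 0.75) -> str:
    """Return the text to overlay on a frame for one model output.

    `probabilities` is the per-class output for a single frame. The
    threshold is an assumed value, not one taken from the paper.
    """
    if not probabilities:
        return "No Sign Detected"
    # Index of the most probable class.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best] < threshold:
        return "No Sign Detected"
    return f"class-{best}: {class_names.get(best, 'unknown')}"
```

A frame whose highest class probability stays below the threshold is reported as containing no sign, which keeps low-confidence guesses off the display while the camera is pointed at scenes without traffic signs.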

Fig. 16. Output image of class-16 showing Vehicles over 3.5 metric tons prohibited

VII. CONCLUSION AND FUTURE SCOPE
Traffic sign recognition systems are important in road scenarios for autonomous cars, driver assistance systems, and highway maintenance. Such a system helps the vehicle take the appropriate route and direction accurately, as artificial intelligence is expected to become a widely adopted technology in the coming days. Hardware implementation of the proposed model on systems such as the Raspberry Pi 4 supports accurate real-time detection owing to its computing speed. The overall testing accuracy of the proposed model is 99.80%, giving accurate recognition of traffic sign images in real time, as shown in the results and analysis section.

The work is carried out in the steps explained above: data visualization to understand the distribution of data among all 43 classes, data pre-processing, CNN model creation, and finally the hardware implementation using a Raspberry Pi 4 with a web camera, with the result displayed on the screen. This prototype can be integrated with a camera at the center of the vehicle, and a voice alert system can be added along with more evaluation metrics on the display screen. The system may also be enhanced to alert the user to upcoming traffic signals along the route so that the driver can plan travel time accordingly.

REFERENCES
[1] Canyong Wang, "Research and Application of Traffic Sign Detection and Recognition Based on Deep Learning," 2018 International Conference on Robots & Intelligent Systems (ICRIS), Changsha, China, 2018, pp. 150-152.
[2] Md. Abdul Alim Sheikh, Alok Kole and Tanmoy Maity, "Traffic sign detection and classification using color feature and neural network," 2016 International Conference on Intelligent Control Power and Instrumentation (ICICPI), Kolkata, India, 2016, pp. 307-311.
[3] Tao Chen and Shijian Lu, "Accurate and Efficient Traffic Sign Detection Using Discriminative AdaBoost and Support Vector Regression," in IEEE Transactions on Vehicular Technology, June 2016, vol. 65, no. 6, pp. 4006-4015.
[4] Ma Xing, Mu Chunyang, Wang Yan, Wang Xiaolong and Chen Xuetao, "Traffic sign detection and recognition using color standardization and Zernike moments," 2016 Chinese Control and Decision Conference (CCDC), Yinchuan, China, 2016, pp. 5195-5198.
[5] Lu Wei, Lu Runge, Liu Xiaolei, "Traffic sign detection and recognition via transfer learning," 2018 Chinese Control and Decision Conference (CCDC), Shenyang, China, 2018, pp. 5884-5887.
[6] Yuan Yuan, Zhitong Xiong, Qi Wang, "VSSA-NET: Vertical Spatial Sequence Attention Network for Traffic Sign Detection," in IEEE Transactions on Image Processing, July 2019, vol. 28, no. 7, pp. 3423-3434.
[7] C. Bahlmann, Y. Zhu, Visvanathan Ramesh, M. Pellkofer and T. Koehler, "A system for traffic sign detection, tracking, and recognition using color, shape, and motion information," IEEE Proceedings. Intelligent Vehicles Symposium, 2005, Las Vegas, NV, USA, 2005, pp. 255-260.
[8] Fabien Moutarde, Alexandre Bargeton, Anne Herbin and Lowik Chanussot, "Robust on-vehicle real-time visual detection of American and European speed limit signs, with a modular Traffic Signs Recognition system," 2007 IEEE Intelligent Vehicles Symposium, Istanbul, Turkey, 2007, pp. 1122-1126.
[9] Alberto Broggi, Pietro Cerri, Paolo Medici, Pier Paolo Porta, and Guido Ghisio, "Real Time Road Signs Recognition," 2007 IEEE Intelligent Vehicles Symposium, Istanbul, Turkey, 2007, pp. 981-986.
[10] Christoph Gustav Keller, Christoph Sprunk, Claus Bahlmann, Jan Giebel and Gregory Baratoff, "Real-time recognition of U.S. speed signs," 2008 IEEE Intelligent Vehicles Symposium, Eindhoven, Netherlands, 2008, pp. 518-523.
[11] Azam Sheikh Muhammad, Niklas Lavesson, Paul Davidsson, Mikael Nilsson, "Analysis of speed sign classification algorithms using shape-based segmentation of binary images," International Conference on Computer Analysis of Images and Patterns (CAIP), Lecture Notes in Computer Science (LNCS), Volume 5702, pp. 1220-1227, 2009.
[12] S. Maldonado Bascon, J. Acevedo Rodriguez, S. Lafuente Arroyo, A. Fernandez Caballero, F. Lopez-Ferreras, "An optimization on pictogram identification for the road-sign recognition task using SVMs," Elsevier Computer Vision and Image Understanding, March 2010, Volume 114, Issue 3, pp. 373-383.
[13] Ming Liang, Mingyi Yuan, Xiaolin Hu, Jianmin Li and Huaping Liu, "Traffic sign detection by ROI extraction and histogram features-based recognition," The 2013 International Joint Conference on Neural Networks (IJCNN), Dallas, TX, USA, 2013, pp. 1-8.
[14] R. Karthika, Latha Parameswaran, "A novel convolutional neural network-based architecture for object detection and recognition with an application to traffic sign recognition from road scenes," Pattern Recognition and Image Analysis, July 2022, Volume 32, pp. 351-362.
[15] Kaushek Kumar T R, S Thiruvikkraman, Gokul R, Nirmal A and Karthika R, "Evaluating the Scalability of a Multi-Object Detector Trained with Multiple Datasets," 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 2021, pp. 1359-1366.
[16] Rajesh Kannan Megalingam, Kondareddy Thanigundala, Sreevatsava Reddy Musani, Hemanth Nidamanuru and Lokesh Gadde, "Indian traffic sign detection and recognition using deep learning," International Journal of Transportation Science and Technology, Volume 12, Issue 3, 2023, pp. 683-699.
[17] V Sowbaranic Raj, Jalakam Venu Madhava Sai, N A Lakkshmi Yogesh, S B Kavya Preetha and Lavanya R, "Smart Traffic Control for Emergency Vehicles Prioritization using Video and Audio Processing," 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 2022, pp. 1588-1593.
[18] B. K. Gunturk, J. Glotzbach, Y. Altunbasak, R. W. Schafer and R. M. Mersereau, "Demosaicking: color filter array interpolation," in IEEE Signal Processing Magazine, Jan. 2005, vol. 22, no. 1, pp. 44-54.
[19] R. Ramanath, W. Snyder, G. L. Bilbro, and W. A. Sander III, "Demosaicking methods for Bayer color arrays," Journal of Electronic Imaging, vol. 11, no. 3, pp. 306-315, Jul. 2002.
[20] K. I. Kiy, "A new method of global image analysis and its application in understanding road scenes," Pattern Recognition and Image Analysis, September 2018, Volume 28, pp. 483-495.
