
2021 International Conference on Emerging Smart Computing and Informatics (ESCI)
AISSMS Institute of Information Technology, Pune, India, Mar 5-7, 2021
DOI: 10.1109/ESCI50559.2021.9396941

Simulation of Self Driving Car Using Deep Learning

Sangita Lade, Parth Shrivastav, Saurabh Waghmare, Sudarshan Hon, Sushil Waghmode, Shubham Teli
Department of Computer Engineering, Vishwakarma Institute of Technology, Pune, India
sangita.lade@vit.edu, parth.shrivastav19@vit.edu, saurabh.waghmare18@vit.edu,
sudarshan.hon18@vit.edu, sushil.waghmode18@vit.edu, shubham.teli18@vit.edu

Abstract—According to the WHO, around 1.35 million people die every year in road traffic crashes. Year by year there are advancements in technology, which has paved the way for the inclusion of artificial intelligence even in automobiles. In this paper, we provide an extensive study on the implementation of self-driving cars based on deep learning. For ease and safety, we simulate the car in the simulator provided by Udacity. The data for training the model is recorded in the simulator and imported into the project for training. Finally, we have implemented and compared two existing deep learning models and showcased the results in Section IV. We have obtained an accuracy of 96.83% for model A and 76.67% for model B.

Keywords - simulation, Nvidia-model, self-driving, image augmentation, CNN

I. INTRODUCTION
In this modern era, there have been many advancements in automation, and Artificial Intelligence has helped us in advancing to newer technologies. Artificial Intelligence has proved to solve many real-world problems, and driving a car autonomously is one of the trending problems. Autonomous cars can help to reduce accidents and save users' precious time.
Many new solutions for autonomous driving are emerging but, in this paper, we will be focusing on our solution, which is based on machine learning. To solve this problem, we use the power of convolutional neural networks [1] to extract features from the images that are generated from the simulator.

Udacity, an e-learning platform, has provided a simulator to collect data and test deep learning models on different tracks. The simulator provides data collection in the form of images. The data is specifically collected from 3 different angles, namely left, centre and right. The model is trained using these images, and the angle is predicted in the range of -1 to 1. A negative angle represents a left turn, a zero angle represents no steering, and a positive value is considered a right turn.

Figure 1 below shows the main screen of the simulator. As the figure shows, Udacity provides two different modes – training mode and autonomous mode. Training mode is used to record data and autonomous mode is used to test the model.

The simulator is connected via a flask server. In autonomous mode, the simulator sends the images along with other information to the flask server. We then pre-process the received images and pass them to the model for prediction. The model predicts the steering angle and sends it to the simulator; according to the received value, the car steers in the specified direction (a sketch of this server loop is given at the end of this section).

Fig. 1. Main screen of Udacity's self-driving car simulator

In this paper, we have implemented and showcased the results of two distinct CNN architectures. The first CNN architecture was introduced in paper [2]. We have compared this model with another distinct CNN architecture. Section IV provides the results of both CNN models: we have compared the accuracy and plotted the loss per epoch while training the models.
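The paper does not list the server code, so the following is only a minimal sketch of the simulator-server loop described above. The 'telemetry'/'steer' event names, port 4567, fixed throttle and model path follow the conventional Udacity drive script and are assumptions; preprocess refers to the pre-processing pipeline of Section III-B.

```python
import base64
from io import BytesIO

import eventlet
import numpy as np
import socketio
from flask import Flask
from PIL import Image
from tensorflow.keras.models import load_model

sio = socketio.Server()
app = Flask(__name__)
model = load_model('model.h5')  # hypothetical path to the trained model

@sio.on('telemetry')
def telemetry(sid, data):
    # The simulator streams the current camera frame as a base64 string.
    image = np.asarray(Image.open(BytesIO(base64.b64decode(data['image']))))
    image = preprocess(image)  # pre-processing pipeline of Section III-B
    steering = float(model.predict(image[None, ...], verbose=0)[0][0])
    # Send the predicted steering angle (and a fixed throttle) back.
    sio.emit('steer', data={'steering_angle': str(steering), 'throttle': '0.2'})

if __name__ == '__main__':
    # Listen on port 4567, the port the simulator connects to by default.
    eventlet.wsgi.server(eventlet.listen(('', 4567)), socketio.WSGIApp(sio, app))
```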

II. LITERATURE REVIEW

In paper [3], author Dr Narapareddy Ramarao and his team proposed research on various algorithms for detecting traffic lanes. Some of the methods required expensive sensors like LiDAR. They compared other methods like the Feature Selection Algorithm and Canny edge detection with Hough transformation for detecting traffic lanes. They concluded that a lane detection system based on OpenCV is feasible because the region of interest can easily be extracted and only that portion needs to be processed to get the output. Special sensors like LiDAR were slower, as they need more processing time to collect data from the surroundings. The real-time performance of the OpenCV-based system is fast and robust.

Author D. Dong and his team in [4] proposed an ENet algorithm for detecting objects. They also used a Canny edge algorithm to detect traffic lanes. For traffic lane detection they used 32x32 pixel images which were provided by Udacity; for vehicle detection, 64x64 pixel images of different cars were used. The system achieved 95% accuracy in detecting lanes [4].

Author T. Okuyama and his team in [5] proposed a simulation study of an autonomous agent based on a Deep Q Network. They used Deep Q learning to predict the Q values used to perform a specific set of actions. The simulation environment was created in Unity. The obstacles were divided into three different types and were generated randomly. The input was an image of a road scene, and the agent had three kinds of actions: steer left, steer right and keep straight. According to the predicted Q value, it took the specified action. They concluded that the agent can drive in a simplified environment provided with lane markings [5].

Author A. Mogaveera and his team in [6] designed and trained an autonomous robot with the help of a neural network. Tensorflow was used to train the model. The model predicted a steering value with three possible outputs, '0', '1' and '2', for the forward, right and left directions respectively. An ultrasonic sensor was primarily used. The performance of the robot was satisfactory [6].

Authors J. Barrozo and V. Lazcano in [7] presented a dynamic model of a physical vehicle. The autonomous vehicle was built for agronomy; thus, they developed a control system based on visual features. Developing the control system involved three main stages. The first stage was to create a prototype vehicle which can easily move on any terrain. The second stage was to construct a dynamic model. The final stage was to simulate the model and evaluate its performance. The dataset used was synthetic images from the ENPEDA database. They compared their model with other models and concluded that their proposed model performed better than the others on that database [7].

Author V. Swaminathan and his team in [8] presented a prototype of an autonomous car. They conducted different experiments to evaluate the performance of the algorithm. The Belgium traffic sign dataset was used for all the experiments; the dataset consisted of 385 images in total. There were four main phases. The first phase was to detect road signs, for which they used a CNN architecture, more specifically Google's MobileNet architecture. The second phase was to design a distance calculating module; for detecting the distance, they used triangle similarity. The third phase consisted of a lane following controller, for which they used a multi-layer perceptron. The last phase was autonomous driving on the assigned track. The proposed approach for detecting traffic signs achieved 83.7% accuracy. The proposed system in [8] was successful in classifying traffic signs as well as calculating distances, and the autonomous robot was able to stay within the lane using the proposed approach [8].

III. METHODOLOGY

We have explained the comprehensive stages of our method in this section.

A. Data Collection

For data collection, we are using Udacity's self-driving car simulator. By using the training mode provided in the simulator, we can record ourselves driving the car on the predefined tracks. This method is called behavioural cloning, as we are recording the actual behaviour of the user while driving the car. We are using Track 1 to get the training data. The data is in the form of images from three different angles, namely centre, left and right. Fig. 2 shows sample images from all three angles.

Fig. 2. Images captured from the left, centre and right side of the car

The images are saved in RGB format with width and height of 320 pixels and 160 pixels respectively. A csv file is automatically generated which contains the paths to the images, along with the steering value recorded at the point of capturing.

We drive the car in both directions to get more versatile data from the same track. In the end, we successfully collected 3997 images with different steering angles.
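The paper does not show its loading code; as a minimal sketch, the generated csv file can be read with pandas. The column names below follow the conventional layout of the simulator's driving_log.csv and are an assumption.

```python
import pandas as pd

# Assumed column layout of the simulator's driving_log.csv: three camera
# image paths followed by steering, throttle, brake and speed values.
cols = ['center', 'left', 'right', 'steering', 'throttle', 'brake', 'speed']
log = pd.read_csv('driving_log.csv', names=cols)

print(len(log))                                      # 3997 records in our run
print(log['steering'].min(), log['steering'].max())  # steering lies in [-1, 1]
```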

B. Data Pre-processing
In this step we pre-process the images so that unnecessary details are eliminated before training the model. Fig. 3 visualizes the count of steering angles ranging from -1 to 1. The graph shows that our data is skewed: a large portion of our dataset has a steering value of 0, as most of the time the car was driving straight on the track.

Fig. 3. Count of records according to steering angles

To balance the dataset, we drop 2771 records containing steering angle 0; finally, only 175 records with a steering value of 0 are kept in the dataset. Fig. 4 visualizes the distribution of steering angles after dropping the records.

Fig. 4. Count of records according to steering angles after balancing the dataset

To avoid under-fitting the data, we apply augmentation techniques to the images. For augmentation, we are using the imgaug library.

Using the CV2 library, we convert the RGB images to YUV. The advantage of this conversion is that YUV images take less bandwidth compared to RGB. After converting the images, we smoothen them using the GaussianBlur method provided in the CV2 library, and we crop each image to our region of interest. The width and height of the image after pre-processing are 200 pixels and 66 pixels respectively, which helps to make processing faster. Fig. 5 shows a set of two images: the image at the top is the original image, and the image next to it shows the output after performing the pre-processing stage. A sketch of this pipeline is given below.

Fig. 5. Image before and after pre-processing
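As a minimal sketch of this pre-processing stage, assuming the `log` DataFrame from the loading sketch in Section III-A: the crop rows, the normalization and the choice of imgaug operators are not stated in the text and are assumptions here.

```python
import cv2
import numpy as np
import pandas as pd
from imgaug import augmenters as iaa

# Balance the dataset: keep only 175 of the zero-steering rows,
# dropping the other 2771.
zeros = log[log['steering'] == 0].sample(n=175, random_state=0)
log_balanced = pd.concat([log[log['steering'] != 0], zeros])

# Example imgaug augmenter; the exact operators are not specified in the paper.
augmenter = iaa.Sequential([iaa.Multiply((0.8, 1.2)),
                            iaa.GaussianBlur(sigma=(0, 1.0))])

def preprocess(img):
    img = img[60:135, :, :]                     # crop to region of interest (rows assumed)
    img = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)  # convert RGB to YUV
    img = cv2.GaussianBlur(img, (3, 3), 0)      # smoothen the image
    img = cv2.resize(img, (200, 66))            # 200x66 pixels, as described above
    return img / 255.0                          # normalization (an assumption)
```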

C. Model Architecture

We introduce two distinct CNN architectures in this section, named ‘A’ and ‘B’. To create our models, we have used the Keras library. For our model construction we were concerned with a stack of layers, each layer having exactly one input and one output, hence we chose the Sequential category of model from the Keras library.

Our CNN architecture ‘A’ is shown in fig. 6. Model ‘A’ consists of a total of 11 layers including the input layer: 5 convolutional layers followed by 1 Flatten layer and 4 Dense layers. As our model must also predict negative values, we use the Exponential Linear Unit (ELU) [9] as the activation function for all layers. The kernel size for the first three convolutional layers is set to 5 with a stride of 2. The first layer is the input layer; it receives images of the track from three different angles: centre, left and right. The second layer is a Convolution2D layer with 24 filters. The third layer is a Convolution2D layer with 36 filters. The fourth layer is a Convolution2D layer with 48 filters. The next two layers, the fifth and the sixth, are both Convolution2D layers with 64 filters and a kernel size of 3. The seventh layer is a Flatten layer; it flattens the whole input. The next four layers are Dense layers, of which the first three are fully connected hidden layers and the last gives us the steering angle: the eighth layer has 100 units, the ninth and tenth layers contain 50 and 10 units respectively, and finally the eleventh layer contains only 1 unit, which is our predicted steering angle. The input and output shape after every layer is shown in fig. 6.

Fig. 6. CNN Architecture ‘A’ based on the Nvidia model

CNN architecture ‘B’ is relatively less complex than architecture ‘A’ and is shown in fig. 7. Here also we use the Sequential category of model from the Keras library. This architecture consists of a total of 7 layers: the input layer followed by three Convolution2D layers, a Flatten layer and 2 Dense layers. The first layer is the input layer and is the same as described for architecture ‘A’. The second layer is a Convolution2D layer with a kernel size of 8x8 and 16 filters. The third layer has a kernel size of 5x5 and 32 filters. The kernel size and number of filters for the fourth layer are 5x5 and 64 respectively. The fifth layer is the Flatten layer. The sixth layer is a fully connected Dense layer with 512 units. Finally, the seventh layer is a Dense layer with only 1 unit, i.e. the predicted steering angle. The shape of input and output for all layers is shown in fig. 7.

Fig. 7. CNN Architecture ‘B’

For both models the loss function is set to mean squared error, and to minimize it we use the Adam [10] optimizer. A Keras sketch of both architectures is given below.
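The sketch below follows the layer descriptions above. The 5x5-kernel/stride-2 reading of "5 by 2", the 66x200x3 input shape (Section III-B) and the strides for model ‘B’ are not fully specified in the text and are assumptions here.

```python
from tensorflow.keras.layers import Conv2D, Dense, Flatten
from tensorflow.keras.models import Sequential

def build_model_a():
    # Model 'A' (Nvidia-style): 5 conv layers, a Flatten layer and 4 Dense
    # layers, with ELU activations throughout.
    return Sequential([
        Conv2D(24, (5, 5), strides=(2, 2), activation='elu',
               input_shape=(66, 200, 3)),
        Conv2D(36, (5, 5), strides=(2, 2), activation='elu'),
        Conv2D(48, (5, 5), strides=(2, 2), activation='elu'),
        Conv2D(64, (3, 3), activation='elu'),
        Conv2D(64, (3, 3), activation='elu'),
        Flatten(),
        Dense(100, activation='elu'),
        Dense(50, activation='elu'),
        Dense(10, activation='elu'),
        Dense(1),  # predicted steering angle
    ])

def build_model_b():
    # Model 'B': 3 conv layers, a Flatten layer and 2 Dense layers. Strides
    # and activations for 'B' are not stated in the paper; the values here
    # follow common implementations of this architecture and are assumptions.
    return Sequential([
        Conv2D(16, (8, 8), strides=(4, 4), activation='elu',
               input_shape=(66, 200, 3)),
        Conv2D(32, (5, 5), strides=(2, 2), activation='elu'),
        Conv2D(64, (5, 5), strides=(2, 2), activation='elu'),
        Flatten(),
        Dense(512, activation='elu'),
        Dense(1),  # predicted steering angle
    ])
```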

D. Model Training & Testing

We have used 1000 images for training and 251 images for testing our models; the split is sketched below. Fig. 8 visualizes the count of steering angles in the training dataset.

Fig. 8. Count of steering angles in the training dataset
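A minimal split sketch, assuming the `log_balanced` DataFrame from the pre-processing sketch; the 80/20 ratio here is an assumption that approximates the reported 1000/251 counts.

```python
from sklearn.model_selection import train_test_split

# Split the balanced records into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    log_balanced['center'].values,    # image paths (centre camera)
    log_balanced['steering'].values,  # steering labels
    test_size=0.2, random_state=0)
```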

With the use of the matplotlib library available in Python, we have plotted the chart shown in Fig. 9, which visualizes the count of steering angles in the testing dataset.

Fig. 9. Count of steering angles in the testing dataset

We have trained our models with the parameters epochs, steps_per_epoch and validation_steps set to 10, 300 and 200 respectively; a minimal sketch of this training loop is given below. Fig. 10 shows the training and validation loss per epoch for model A. With each additional epoch, our training and validation loss is found to decay exponentially.

Fig. 10. Loss per epoch for model ‘A’
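A sketch of the training and plotting steps behind these curves, using the fit parameters reported above; the learning rate is an assumption, and train_gen and valid_gen are hypothetical generators yielding batches of (pre-processed image, steering angle).

```python
import matplotlib.pyplot as plt
from tensorflow.keras.optimizers import Adam

# Compile with mean squared error and the Adam optimizer (Section III-C).
model = build_model_a()
model.compile(loss='mse', optimizer=Adam(learning_rate=1e-3))
history = model.fit(train_gen, steps_per_epoch=300, epochs=10,
                    validation_data=valid_gen, validation_steps=200)

# Plot the training/validation loss curves of Fig. 10 and Fig. 11.
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('MSE loss')
plt.legend()
plt.show()
```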

Fig. 11 shows the training and validation loss per epoch for model B. As we can observe from the figure, the model's training and validation loss stays almost constant as the epochs increase, i.e. the loss is not decaying.

Fig. 11. Loss per epoch for model ‘B’

IV. RESULTS

The training results obtained from model A and model B are recorded in Table I. We can observe that model A has higher accuracy and consistency compared with model B. Although implementing the CNN architecture for model A is more complex, its results are found to be quite promising.

TABLE I. COMPARISON OF BOTH MODELS

Model   | Min Error | Max Error | Accuracy (%)
Model A | 0.00014   | 0.61      | 96.83
Model B | 0.0       | 1.0       | 76.67

Fig. 12 shows the simulation and the console with the previously predicted steering angle. Model A was able to predict mostly correct steering angles, and the car completed the track autonomously.

Fig. 12. Simulation running the car autonomously

Fig. 13 shows an incoming sharp turn in the simulation with the previously predicted steering angle.

Fig. 13. Simulation running the car autonomously with an incoming left turn

Model B has not performed as well as model A. On some occasions it went off-track in the simulation.

V. CONCLUSIONS

In this paper, we have built two different CNN architecture models and tested them on the simulator provided by Udacity. Both architectures behave quite differently despite being from the same family of CNNs. With an accuracy of 96.83%, model ‘A’ performed well in the simulation, while model ‘B’ obtained an accuracy of 76.67%. From the above test results, we can conclude that a CNN architecture is helpful for predicting the steering angle according to the track.

Comparing both models, model ‘A’ completed the track autonomously, whereas model ‘B’ wobbled along the track and eventually went off-track in the simulator multiple times.

As user safety is the priority for self-driving cars, this model may not yet withstand some safety requirements. But by optimizing and modifying model ‘A’ and by hybridizing it with cutting-edge technologies, this architecture can contribute to real-world applications.

REFERENCES

[1] S. Albawi, T. A. Mohammed and S. Al-Zawi, "Understanding of a convolutional neural network," 2017 International Conference on Engineering and Technology (ICET), Antalya, 2017, pp. 1-6, doi: 10.1109/ICEngTechnol.2017.8308186.
[2] M. Bojarski, D. Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao and K. Zieba, "End to End Learning for Self-Driving Cars," 2016.
[3] N. Ramarao, B. Vivek Bhat, K. Kulkarni, Ashley and R. Akbary, "Lane Detection for Autonomous Vehicle," International Journal of Engineering Research and Applications (IJERA), vol. 9, no. 3, March 2019, ISSN: 2248-9622.
[4] D. Dong, X. Li and X. Sun, "A Vision-Based Method for Improving the Safety of Self-Driving," 2018 12th International Conference on Reliability, Maintainability, and Safety (ICRMS), Shanghai, China, 2018, pp. 167-171, doi: 10.1109/ICRMS.2018.00040.
[5] T. Okuyama, T. Gonsalves and J. Upadhay, "Autonomous Driving System based on Deep Q Learning," 2018 International Conference on Intelligent Autonomous Systems (ICoIAS), Singapore, 2018, pp. 201-205, doi: 10.1109/ICoIAS.2018.8494053.
[6] A. Mogaveera, R. Giri, M. Mahadik and A. Patil, "Self Driving Robot using Neural Network," 2018 International Conference on Information, Communication, Engineering and Technology (ICICET), Pune, 2018, pp. 1-6, doi: 10.1109/ICICET.2018.8533870.
[7] J. Barrozo and V. Lazcano, "Simulation of an Autonomous Vehicle Control System Based on Image Processing," 2019 5th International Conference on Frontiers of Signal Processing (ICFSP), Marseille, France, 2019, pp. 88-94, doi: 10.1109/ICFSP48124.2019.8938094.
[8] V. Swaminathan, S. Arora, R. Bansal and R. Rajalakshmi, "Autonomous Driving System with Road Sign Recognition using Convolutional Neural Networks," 2019 International Conference on Computational Intelligence in Data Science (ICCIDS), Chennai, India, 2019, pp. 1-4, doi: 10.1109/ICCIDS.2019.8862152.
[9] Z. Qiumei, T. Dan and W. Fenghua, "Improved Convolutional Neural Network Based on Fast Exponentially Linear Unit Activation Function," IEEE Access, vol. 7, pp. 151359-151367, 2019, doi: 10.1109/ACCESS.2019.2948112.
[10] D. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization," International Conference on Learning Representations, 2014.