CASE STUDY
Author(s): Joel Murataj a, Prof. Dr. Eng. Abdulsalam Alkholidi b,c, Prof. Dr. Habib Hamam d,
PhD Afrim Alimeti e
a Software Engineering Department, Canadian Institute of Technology, Albania
b Software Engineering Department, Canadian Institute of Technology, Albania
c Electrical Engineering Department, Faculty of Engineering, Sana’a University, Yemen
d Département de Génie Électrique, Faculté D'Ingénierie, Université de Moncton, Canada
e Industrial Engineering Department, Canadian Institute of Technology, Albania
Abstract
Deep Learning (DL) is a subfield of Machine Learning (ML) that deals with algorithms inspired by the
structure and function of the brain. DL uses deep neural networks to train a model: artificial neural
networks, inspired by the human brain, learn from large amounts of data. As a part of machine
learning, it allows machines to learn from experience and acquire skills without human intervention.
The importance of deep learning lies in its ability to process a large number of features, which allows
deep and powerful learning when dealing with ambiguous data. This paper aims to study and analyze
recent papers related to the deep learning field and to introduce our contribution. An additional aim
of this review paper is to concentrate on the self-driving cars case study and to introduce a new
approach with high performance.
Keywords: Neural Network, Architectures, Big Data, Machine Learning, Autonomous Driving
A REVIEW OF DEEP LEARNING FOR SELF-DRIVING CARS: CASE STUDY
DL is based on the functioning of the human brain. Let us first look at a Biological Neural Network
(BNN), which consists of dendrites, a cell nucleus, an axon, and synapses, whereas an artificial neuron
contains inputs, nodes, weights, and an output. The perceptron learning model is processed in the
following steps, as demonstrated in Fig. 1:
• Step 1: calculate the weighted sum of the inputs.
• Step 2: pass the calculated weighted sum as the input to the activation function to generate the
output [4-5].

… Function, Threshold Function, ReLU Function, and Hyperbolic Tangent Function, and the output
differs from one activation function to another. Moreover, the activation function must be nonlinear.
A commonly used activation function is the Sigmoid Function because its output lies between 0 and 1,
as shown in Fig. 2. It is used for models where we have to predict a probability as the output.
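The two perceptron steps and the activation functions named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the bias term, and the sample inputs and weights are assumptions introduced here.

```python
import math


def sigmoid(z):
    """Sigmoid activation: maps any real z into the open interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form that avoids overflow for large negative z.
    e = math.exp(z)
    return e / (1.0 + e)


def threshold(z):
    """Threshold (step) activation: 1 if z is non-negative, otherwise 0."""
    return 1.0 if z >= 0 else 0.0


def relu(z):
    """ReLU activation: passes positive values, clips negatives to 0."""
    return max(0.0, z)


def perceptron(inputs, weights, bias=0.0, activation=sigmoid):
    """Single artificial neuron (the bias term is an assumption of this sketch)."""
    # Step 1: weighted sum of the inputs.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Step 2: pass the weighted sum through the activation function.
    return activation(z)


if __name__ == "__main__":
    x = [0.5, -1.0, 2.0]   # hypothetical inputs
    w = [0.4, 0.3, 0.1]    # hypothetical weights
    for name, fn in [("sigmoid", sigmoid), ("threshold", threshold),
                     ("relu", relu), ("tanh", math.tanh)]:
        print(name, perceptron(x, w, activation=fn))
```

The same weighted sum produces a different output for each activation function, which is the point made in the text: the sigmoid result can be read as a probability, while the threshold function only reports whether the neuron fires.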
… amount of data and numbers of layers in the networks.

We need deep learning to:
• Process huge amounts of data: ML algorithms work with a huge amount of structured data, but DL
algorithms can work with an enormous amount of structured and unstructured data.
• Perform complex algorithms: classical ML algorithms struggle with highly complex operations; for
those we need DL algorithms.
• Achieve the best performance with a large amount of data: as the amount of data increases, the
performance of classical ML algorithms tends to level off, so DL is needed to keep the performance of
the model good.
• Feature extraction: ML algorithms extract patterns based on labeled sample data, while DL
algorithms take large volumes of data as input, analyze the input to extract the features of an object,
and identify similar objects.

Deep learning approaches are:
• Supervised learning, where the input variables X are mapped to the output values Y by a function
Y = f(X) [8].
• Unsupervised Learning (UL) analyzes and clusters unlabeled data sets using machine learning
methods. These algorithms find hidden patterns in data without the assistance of humans (thus the
term "unsupervised").
• Reinforcement Learning (RL) is a branch of machine learning that studies how intelligent agents
should act in a given environment to maximize the cumulative reward.
• Hybrid Learning (HL) applies to systems that use both generative (unsupervised) and discriminative
(supervised) elements.

An artificial neural network, which mimics the behavior of the human brain to solve complex
data-driven problems, is the functional unit of deep learning. Deep learning itself is a part of machine
learning, which falls under the larger umbrella of artificial intelligence. Artificial intelligence, machine
learning, and deep learning are interconnected fields where machine learning and deep learning aid
artificial intelligence by providing a set of algorithms and neural networks to solve data-driven
problems. Deep learning uses artificial neural networks that behave similarly to the neural networks in
the human brain. A neural network functions when some input data is fed to it. This data is then
processed via layers of perceptrons to produce the desired output.

2. Literature Review

In recent times, many studies have been published on developing deep learning algorithms. The most
reported are shown in [5, 9-20]. The IEEE Transactions article by Saptarshi Sengupta et al. [5]
mentioned that several problems were solved in the last decade by using deep learning algorithms,
notably the automated identification of data patterns, which deep learning can do with a precision
that far exceeds that of human beings. Beyond conventional, handcrafted learning algorithms, it has
helped practitioners attempting to make sense of the data flow that is now inundating our society.

In the paper by Katleho L Masita et al., the authors discussed some of the most relevant and recent
advances and contributions made to the study of the use of deep learning in object detection. In
addition, the results of multiple studies indicate that the application of deep learning in object
detection greatly exceeds traditional methods based on handcrafted and learned characteristics [9].
Another paper, published by Aman Bhalla et al. [10], proposed a computer vision model that learns
from video data. It includes image processing, image augmentation, behavioral cloning, and a
convolutional neural network model. The neural network architecture is used to detect paths in a
video segment, road markings, and obstacle positions, and behavioral cloning is used so that the
model learns from the human behavior in the video.

We summarize some recently published studies on deep learning in Table I.
During the last decade, self-driving vehicle technology has developed remarkably, despite some
obstacles that will inevitably be overcome. The second part of this study focuses on self-driving
vehicle algorithms: we first introduce recently published studies as related work, and then introduce
our contribution through the series of simulations we performed. The article published by Sorin
Grigorescu et al. introduced an overview of deep learning techniques used in autonomous driving.
The performance survey and computational requirements serve as a system-level design reference
for AI-based self-driving cars [19]. The authors started by introducing AI-based self-driving
architectures, convolutional and recurrent neural networks, and a deep reinforcement learning
model. The survey covers driving scene perception, path planning, behavior arbitration, and motion
control algorithms. The researchers investigated both the standard perception and planning pipeline,
where each method is built using deep learning methods, and End2End systems, which map sensory
information directly to steering commands.

In the article by Jelena Kocić et al. [20], a single solution for an end-to-end autonomous driving deep
neural network is presented. The main goal of that work was to achieve autonomous driving using a
light deep neural network suitable for deployment on compact car platforms. Many end-to-end deep
neural networks are used for autonomous driving, where the input to the machine learning algorithm
is camera images and the output is a steering angle prediction, but these convolutional neural
networks are significantly more complex than the network architecture that the authors proposed.

… low and high thresholds, it is accepted only if it is connected to a strong edge (a rapid change in
brightness). The ratio of the low to the high threshold is 1/2 or 1/3.
• Step 4: Create a triangle covering the view in front of the car, because we know that the lane lines
lie in that region. Computing the bitwise AND of the two images, pixel by corresponding pixel, makes
the result show only the region of interest traced by the polygon of the mask.
• Step 5: Detect the straight lines and the main line. This step finds the line that best describes the
points.

The image presented in Fig. 3 is converted to grayscale and smoothed.
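The region-of-interest masking in Step 4 can be illustrated with plain Python lists standing in for images. This is only a sketch under stated assumptions: the helper names `triangle_mask` and `apply_mask`, the tiny 5x5 image, and the triangle vertices are hypothetical, and a real pipeline would more likely use an image library for the same bitwise AND.

```python
def triangle_mask(height, width, a, b, c):
    """Build a height x width grid holding 255 inside triangle abc, 0 outside.

    Vertices are (x, y) pixel coordinates; points on an edge count as inside.
    """
    def side(p, q, r):
        # Sign tells on which side of the directed line p -> q the point r lies.
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            d = (side(a, b, (x, y)), side(b, c, (x, y)), side(c, a, (x, y)))
            has_neg = any(v < 0 for v in d)
            has_pos = any(v > 0 for v in d)
            # Inside when the point is on the same side of all three edges.
            row.append(0 if (has_neg and has_pos) else 255)
        mask.append(row)
    return mask


def apply_mask(image, mask):
    """Bitwise AND of corresponding pixels: keeps the image only where mask is 255."""
    return [[p & m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


if __name__ == "__main__":
    edges = [[200] * 5 for _ in range(5)]                # hypothetical edge image
    roi = triangle_mask(5, 5, (0, 4), (4, 4), (2, 0))     # triangle in front of the car
    for row in apply_mask(edges, roi):
        print(row)
```

Because every mask pixel is either 0 (all bits clear) or 255 (all bits set for an 8-bit value), the AND leaves pixels inside the triangle unchanged and zeroes everything else.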
The bitwise function compares the image and the triangle image bit by bit and keeps only the lane
lines, as shown in Fig. 5.

The Hough Space function detects the lane lines continuously while the car is on the road. To find
the lane lines, the process only needs to work with pixels, as demonstrated in Fig. 7.

y = mx + b    (1)
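To make the role of the line equation y = mx + b concrete, the sketch below votes in a discretized (m, b) Hough space: every edge pixel (x, y) supports all lines with b = y - mx, and the cell with the most votes is the line that best describes the points. The function name `hough_best_line`, the candidate slopes, and the sample points are assumptions for illustration; practical implementations usually use the polar (rho, theta) form instead, since a vertical line has no finite slope.

```python
from collections import Counter


def hough_best_line(points, slopes, b_step=1.0):
    """Return (m, b, votes) for the most voted line y = m*x + b.

    points: iterable of (x, y) pixel coordinates.
    slopes: candidate slope values m to try (a hypothetical discretization).
    b_step: bin width used to group nearby intercepts.
    """
    votes = Counter()
    for x, y in points:
        for m in slopes:
            b = y - m * x                  # intercept of the line through (x, y)
            b_bin = round(b / b_step)      # quantize so nearby intercepts share a cell
            votes[(m, b_bin)] += 1
    (m, b_bin), count = votes.most_common(1)[0]
    return m, b_bin * b_step, count


if __name__ == "__main__":
    # Four pixels on y = 2x + 1 plus one outlier (hypothetical data).
    pixels = [(0, 1), (1, 3), (2, 5), (3, 7), (1, 10)]
    print(hough_best_line(pixels, slopes=[-2, -1, 0, 1, 2, 3]))
    # → (2, 1.0, 4)
```

The outlier contributes votes to other cells only, so it does not disturb the winning line, which is why the voting scheme is robust to stray edge pixels.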
After changing the dimension of the image, we need to equalize the photo to obtain a clearer
difference between the edges, as demonstrated in Fig. 10. This helps the model find the features, as
the model should focus on the edges of the image and the differences between pixels.

Fig. 10. Equalized traffic sign image

The entire dataset is preprocessed in the same way before the training starts. A very important part
is data augmentation, a widely used technique that helps expand and diversify the training data,
providing an added variety of important features that the network needs to extract. It gives the
model more varied data to work with, so that it generalizes to new data it may encounter. Some of
the training images are shifted in width and height, zoomed, and rotated. This lets the model learn
more and train on different data. We have used Keras for the augmentation of the data. The result is
shown in Fig. 11.

… layers, and an output layer using the Softmax activation function, as we have a multi-class
classification. Categorical Cross-Entropy is used as the loss function, as we need to classify multiple
classes and minimize the error. After the data is processed by the model, it is classified into one
category represented by a number; each number denotes a type of traffic sign. Figure 12 shows the
progress of the training process. An epoch is one iteration over the training data. In the last epoch,
the training accuracy is 0.98 and the training loss is 0.05, while the validation accuracy is 0.99 and the
validation loss is 0.02. These are very good results, which suggests that the chosen model suits the
data. It is difficult to find the right number of nodes and hidden layers; we tried different numbers of
layers until we obtained the best result.

Fig. 12. Training process. Screenshot by author.
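The Softmax output layer and the Categorical Cross-Entropy loss mentioned above can be written out directly. The following is a minimal sketch using only the standard library, not the Keras code used for the reported results; the function names, the three-class example, and the raw scores are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Loss for one sample: minus the log-probability given to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


def predicted_class(probs):
    """Index of the most probable class, i.e. the number denoting the traffic sign."""
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.1]                 # hypothetical outputs for 3 sign classes
    probs = softmax(scores)
    print("probabilities:", probs)
    print("predicted class:", predicted_class(probs))
    print("loss if class 0 is correct:", categorical_cross_entropy([1, 0, 0], probs))
    print("loss if class 2 is correct:", categorical_cross_entropy([0, 0, 1], probs))
```

The loss is small when the network assigns a high probability to the correct class and grows as that probability shrinks, which is why a falling loss and a rising accuracy move together during training.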
REFERENCES