Extensive Literature Review on Deep Learning


ABSTRACT

Over the past few years, deep learning (DL) computing models have come to be considered the gold standard in the machine learning (ML) community. DL has gradually become the most widely used computational approach in the field of ML, achieving exceptional results on several complex cognitive tasks, matching or even exceeding human performance.
One of the benefits of DL is its ability to learn from huge amounts of data.
The field of DL has grown rapidly in recent years and has been used to successfully solve a wide range of traditional applications. More importantly, DL has outperformed well-known ML techniques in many domains, such as cybersecurity, natural language processing, bioinformatics, robotics and control, and medical information processing, among many others.
Specifically, this review attempts to provide a more comprehensive survey of the most important aspects of DL, including recent improvements in the field. In particular, it emphasizes the importance of DL and presents the types of DL techniques and networks. It then introduces convolutional neural networks (CNNs), the most widely used type of DL network, and describes the evolution of CNN architectures. This is followed by a list of major DL applications. Finally, the computing engines used for DL, including FPGAs, GPUs, and CPUs, are summarized along with a description of their influence on DL.

INTRODUCTION
Recently, machine learning (ML) has become popular in research and has been integrated into a variety of applications, including text mining, spam detection, video recommendation, image classification, and multimedia retrieval. The continuous emergence of new research in the field of deep and distributed learning is due to the unpredictable growth in data-collection capabilities and the remarkable advances made in hardware technology, for example high-performance computing (HPC).
In contrast to traditional ML techniques, feature extraction in DL is performed automatically by the algorithm. This enables researchers to extract discriminative features with minimal human effort and domain knowledge. DL algorithms have a multi-layer data-representation architecture, in which the early layers extract low-level features while the later layers extract high-level features. This mirrors the way the human brain automatically extracts data representations.
The main advantage of the CNN over its predecessors is that it automatically detects important features without any human supervision, which has made it the most widely used DL network.

BACKGROUND
DL, a subset of ML, is inspired by
information processing patterns
found in the human brain.
DL does not require any human-
designed rules to operate;
instead, it uses a large amount of
data to map the given input to
specific labels.
DL is built from multiple layers of algorithms (artificial neural networks, or ANNs), each of which provides a different interpretation of the data fed to it. Unlike conventional ML methods, DL automates the learning of feature sets for many tasks, allowing feature extraction and classification to be learned in one go.
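
To make this concrete, here is a minimal sketch of such a layered model (assuming only NumPy; the two-layer network, toy data, and dimensions are illustrative and not taken from the review), in which feature extraction and classification are trained jointly:

```python
import numpy as np

# Toy data: 4 samples, 3 features, binary labels (all values illustrative).
X = np.array([[0.1, 0.8, 0.3],
              [0.9, 0.2, 0.5],
              [0.2, 0.7, 0.9],
              [0.8, 0.1, 0.4]])
y = np.array([[0.0], [1.0], [0.0], [1.0]])

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 8)) * 0.5, np.zeros(8)  # layer 1: low-level features
W2, b2 = rng.normal(size=(8, 1)) * 0.5, np.zeros(1)  # layer 2: features -> label

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(2000):
    # Forward pass: each layer re-interprets the previous layer's output.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backward pass (binary cross-entropy gradient): feature extraction (W1)
    # and classification (W2) are updated together, end to end.
    dp = (p - y) / len(X)
    dW2, db2 = h.T @ dp, dp.sum(0)
    dh = dp @ W2.T * (1 - h ** 2)
    dW1, db1 = X.T @ dh, dh.sum(0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.5 * grad

print(np.round(p.ravel(), 2))  # predictions move toward the labels 0, 1, 0, 1
```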

WHEN TO APPLY DEEP LEARNING
Artificial intelligence is useful in many situations, and in some cases it matches or exceeds human experts, which means DL can be the solution to the following problems:
• Cases where human experts are not available.
• Cases where humans cannot
explain
decisions made using their
expertise (language
understanding, medical
decisions, and speech
recognition).
• Cases where the solution to the problem changes over time (price forecasting, stock market prediction, weather forecasting and monitoring).
• Cases where the solution needs to be tailored to each specific case (personalization, biometrics).
• Cases where the scale of the problem is extremely large and exceeds our limited reasoning abilities (sentiment analysis, matching ads on Facebook, calculating website rankings).

Why deep learning?

1. Universal learning approach: Because DL is capable of working in nearly all application areas, it is sometimes called universal learning.
2. Robustness: In general, precisely hand-designed features are not required in DL. Instead, optimized features are learned automatically for the task under consideration, which yields high robustness to normal variations in the input data.
3. Generalization: Different data types or different applications can use the same DL technique, an approach often referred to as transfer learning (TL). Additionally, TL is a useful approach for problems with insufficient data (a minimal sketch follows this list).
4. Scalability: DL is highly
scalable.
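
As a concrete illustration of TL, the following minimal sketch (assuming PyTorch and torchvision are available; the ResNet-18 backbone and ten-class head are illustrative choices, not prescribed by this review) reuses a network pretrained on one dataset for a new task with little data:

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a backbone pretrained on ImageNet (assumed available via torchvision).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature-extraction layers.
for param in model.parameters():
    param.requires_grad = False

# Replace the final FC layer with a new head for the target task
# (10 classes is an arbitrary example).
model.fc = nn.Linear(model.fc.in_features, 10)

# Only the new head is trained, which helps when data are insufficient.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```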

Types of DL networks
The most well-known types of deep learning networks are discussed in this section: recursive neural networks (RvNNs), recurrent neural networks (RNNs), and convolutional neural networks (CNNs). RvNNs and RNNs are explained briefly, while CNNs are covered in depth owing to their importance and because they are the most widely used of these networks across many applications.
Recursive neural networks
An RvNN can make predictions over hierarchical structures and classify outputs using compositional vectors. Recursive auto-associative memory (RAAM) was the main inspiration for the development of the RvNN.
The RvNN architecture was created to process structured objects of arbitrary shape, such as graphs or trees. This approach produces a fixed-width distributed representation from variable-sized recursive data structures.
The network is trained using the backpropagation through structure (BTS) learning scheme. BTS follows the same technique as the general backpropagation algorithm and is able to support tree structures. Auto-association trains the network to reproduce the input-layer pattern at the output layer.
The RvNN is very effective in the NLP context. Socher et al. introduced an RvNN architecture designed to handle inputs from multiple modalities. They demonstrated two applications: classifying natural-language sentences, where each sentence is divided into words, and classifying natural images, where each image is separated into segments of interest.
For each pair of adjacent units, the RvNN computes a score expressing the plausibility of merging them, and it uses these scores to build the syntax tree. The pair with the highest score is merged into a compositional vector. After each merge, the RvNN generates (a) a larger region covering multiple units, (b) a compositional vector representing that region, and (c) a class label (e.g., a noun phrase becomes the class label for the new region if the two merged units are nouns). The compositional vector for the entire region is the root of the RvNN tree structure.
An example of an RvNN tree is
shown in Figure 5.
RvNN has been used in a
number of applications.
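
To illustrate the merge procedure described above, here is a minimal NumPy sketch (the composition and scoring functions are simplified, randomly initialized stand-ins for the learned ones in the actual model):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4                                    # size of each component vector (illustrative)
W = rng.normal(size=(d, 2 * d)) * 0.1    # composition weights (would be learned)
w_score = rng.normal(size=d) * 0.1       # scoring weights (would be learned)

def compose(a, b):
    """Merge two child vectors into one parent (compositional) vector."""
    return np.tanh(W @ np.concatenate([a, b]))

def score(v):
    """Plausibility score of a merged region."""
    return float(w_score @ v)

# Leaf vectors, e.g. word embeddings of a sentence (random here).
units = [rng.normal(size=d) for _ in range(5)]

# Greedily merge the highest-scoring adjacent pair until one root remains.
while len(units) > 1:
    parents = [compose(units[i], units[i + 1]) for i in range(len(units) - 1)]
    i = int(np.argmax([score(p) for p in parents]))
    units[i:i + 2] = [parents[i]]        # replace the pair with its parent

root = units[0]
print(root.shape)  # (4,) -- fixed-width representation of a variable-sized input
```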
Recurrent neural networks
RNNs are a widely used and familiar algorithm in the field of DL. They are mainly applied in speech processing and NLP contexts. Unlike conventional networks, RNNs process data sequentially within the network. Because the structure embedded in a data sequence conveys valuable information, this capability is fundamental to a range of applications.
For example, it is important to
understand the context of a
sentence to determine the
meaning of a particular word in
it.
The RNN can therefore be viewed as a short-term memory unit, where x represents the input layer, y the output layer, and s the hidden (state) layer. For a given input sequence, a typical unrolled RNN scheme is shown in Figure 6.
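
A minimal NumPy sketch of one unrolled RNN step (the tanh update and the layer sizes are common conventions assumed here, not specified by the review) makes the roles of x, s, and y explicit:

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_state, d_out = 3, 5, 2                   # illustrative sizes
Wx = rng.normal(size=(d_state, d_in)) * 0.1      # input  -> hidden
Ws = rng.normal(size=(d_state, d_state)) * 0.1   # hidden -> hidden (the recurrence)
Wy = rng.normal(size=(d_out, d_state)) * 0.1     # hidden -> output

def rnn_step(x_t, s_prev):
    """One unrolled step: the state s acts as short-term memory."""
    s_t = np.tanh(Wx @ x_t + Ws @ s_prev)
    y_t = Wy @ s_t
    return s_t, y_t

s = np.zeros(d_state)
for x_t in rng.normal(size=(7, d_in)):           # a sequence of 7 inputs
    s, y = rnn_step(x_t, s)                      # each output depends on all earlier inputs
```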
Pascanu et al.
introduced three different types
of deep RNN techniques, which
are “Hidden to Hidden”, “Hidden
to Output” and “Input to Hidden”.
Convolutional Neural Networks
In the field of DL, CNN is the
best known and most commonly
used algorithm.
The main advantage of CNN
over its predecessors is that it
automatically identifies relevant
features without any human
supervision.
CNN has been widely applied in
different fields, including
computer vision, speech
processing, facial recognition,
and more.
The structure of the CNN is inspired by neurons in human and animal brains, like a conventional neural network. Specifically, it simulates the complex sequence of cells that forms the visual cortex in a cat's brain.
Goodfellow et al. identified three main advantages of the CNN: equivariant representations, sparse interactions, and parameter sharing.
Unlike conventional fully connected (FC) networks, the CNN uses shared weights and local connections to fully exploit 2D input structures such as image signals. This operation uses an extremely small number of parameters, which both simplifies the training process and speeds up the network.
This mirrors the cells of the visual cortex, which detect only small regions of a scene rather than the whole scene (i.e., these cells spatially extract the local correlation available in the input, much like local filters over the input).
A commonly used form of CNN, similar to a multilayer perceptron (MLP), consists of several convolutional layers followed by downsampling (pooling) layers, with FC layers at the end. An example of a CNN architecture for image classification is shown in Figure 7.
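
A minimal PyTorch sketch of this conv-pool-FC pattern (the layer sizes, input resolution, and class count are illustrative assumptions, not taken from Figure 7):

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):                 # class count is illustrative
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), # local, shared-weight filters
            nn.ReLU(),
            nn.MaxPool2d(2),                            # downsampling (pooling)
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # final FC layer

    def forward(self, x):                               # x: (batch, 3, 32, 32)
        x = self.features(x)
        return self.classifier(x.flatten(1))

logits = SmallCNN()(torch.randn(4, 3, 32, 32))          # -> shape (4, 10)
```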
Benefits of employing CNNs
The benefits of using CNNs over
other traditional neural networks
in the computer
vision environment are listed as
follows:
1. The main reason to consider the CNN is its weight-sharing feature, which reduces the number of trainable network parameters, in turn helping the network generalize better and avoid overfitting (a worked parameter count follows this list).
2. Concurrently learning the
feature extraction layers and the
classification layer causes
the model output to be both
highly organized and highly
reliant on the extracted features.
3. Large-scale network
implementation is much easier
with CNN than with other neural
networks.
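
To make benefit 1 concrete, here is a back-of-the-envelope parameter count (the image size and layer widths are illustrative assumptions) comparing an FC layer with a shared-weight convolutional layer:

```python
# Parameters of a fully connected layer vs. a convolutional layer on a
# 224x224 RGB image (sizes are illustrative assumptions).
fc_params = (224 * 224 * 3) * 100 + 100   # FC to just 100 hidden units, plus biases
conv_params = (3 * 3 * 3) * 64 + 64       # 64 shared 3x3 filters over 3 channels

print(f"fully connected: {fc_params:,}")  # 15,052,900
print(f"convolutional:   {conv_params:,}")  # 1,792
```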
CNN Architecture
Over the past 12 years, several CNN architectures have been introduced. Model architecture is an essential element in improving the performance of various applications.
Various modifications have been made to the CNN architecture from 1989 to the present, including structural reformulation, regularization, and parameter optimization. Notably, the most significant performance improvements of the CNN have come largely from reorganizing the processing units and developing new blocks.
In particular, the most innovative
developments in CNN
architecture have
focused on the use of network
depth.
In this section, we review the most popular CNN architectures, beginning with the AlexNet model in 2012 and ending with the High-Resolution (HR) model in 2020. Table 2 provides a brief overview of these CNN architectures.
Applications of deep learning
Presently, various DL applications are widespread around the world. These applications include healthcare, social network analysis, audio and speech processing (such as recognition and enhancement), visual data processing (such as multimedia data analysis and computer vision), and NLP (translation and sentence classification), among others.
Computational approaches

CPU-based approach
CPU nodes usually offer robust network connectivity, storage capacity, and large memory. Although CPU nodes are more general-purpose than FPGA or GPU nodes, they lack the ability to match them in raw computational power; thus, they require increased network capacity and a larger memory capacity to compensate.
GPU-based approach
The GPU is extremely
efficient for some basic DL
primitives, including many
parallel computing
operations such as
activation functions, matrix
multiplication, and
convolution.
Incorporating HBM stacked memory into recent GPU models greatly improves bandwidth. This improvement allows multiple primitives to efficiently use all available GPU computational resources.
The performance improvement of GPUs over CPUs is typically 10-20:1 for dense linear algebra operations.
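
As a simple illustration, a dense matrix multiplication can be run as a GPU primitive in PyTorch (a hedged sketch; the matrix sizes are arbitrary and actual speedups vary with hardware):

```python
import torch

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

c_cpu = a @ b                        # dense linear algebra on the CPU

if torch.cuda.is_available():        # run the same primitive on the GPU
    c_gpu = (a.cuda() @ b.cuda()).cpu()
    print(torch.allclose(c_cpu, c_gpu, atol=1e-2))
```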
FPGA-based approach
FPGAs are widely used in various tasks, including deep learning. Inference accelerators, in particular, are typically implemented on FPGAs. An FPGA can be effectively configured to eliminate the unnecessary or costly functions found in GPU systems.
Compared to the GPU, the FPGA is limited in both floating-point performance and integer inference throughput.
A key aspect of the FPGA is its ability to reconfigure the array at runtime, as well as its ability to be configured through an efficient design with little or no overhead.
As mentioned earlier, FPGAs deliver better performance-per-watt and lower latency than GPUs and CPUs in DL inference operations.
Summary and conclusion
Finally, we close with a brief discussion that gathers the key points made throughout this extensive literature review.

• DL requires sizeable datasets (labeled data are preferred) to train models and to predict unseen data.

• Impressive and robust hardware resources like GPUs are required for effective CNN training.

• With recent developments in computational tools, including dedicated neural-network chips and mobile GPUs, we will see more DL applications on mobile devices, making it easier for users to use DL.

• Last, this overview provides a starting point for the community interested in the field of DL and helps researchers determine the most suitable directions of work to pursue in order to provide more accurate alternatives to the field.
