
GITAM (Deemed to be University)

19EID331
Artificial Neural Networks

By
Venkata Kranthi B
Department of EECE
GITAM School of Technology (GST)
Bengaluru-561203
Email: kbudigi@gitam.edu


Contents

• Introduction to Neural Networks
• Architecture-based classification of Neural Networks
• Classification of Neural Networks based on learning methods
• Activation functions and Loss functions
• Factors to be considered for choice of type of Neural Network
• Introduction to hardware requirements for implementation of Neural Networks

Definition

• The brain is a highly complex, nonlinear, and parallel computer (information-processing system).
• It has the capability to organize its structural constituents, known as neurons, to perform certain computations (e.g., pattern recognition, perception, and motor control) many times faster than the fastest digital computer in existence today.
• "An artificial neural network is a system based on the operation of biological neural networks. It is a simulation of a biological neural system."
• A characteristic of artificial neural networks is that there are multiple architectures, which in turn call for different learning algorithms; yet despite forming a complex system, each individual unit of a neural network is relatively simple.

Introduction

• Consider, for example, human vision, which is an information-processing task.
• It is the function of the visual system to provide a representation of the environment around us and, more important, to supply the information we need to interact with the environment.
• To be specific, the brain routinely accomplishes perceptual recognition tasks (e.g., recognizing a familiar face embedded in an unfamiliar scene) in approximately 100–200 ms, whereas tasks of much lesser complexity take a great deal longer on a powerful computer.

Introduction

• A neural network is a massively parallel distributed processor made up of simple processing units that has a natural propensity for storing experiential knowledge and making it available for use.
• It resembles the brain in two respects:
1. Knowledge is acquired by the network from its environment through a learning process.
2. Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.
• An Artificial Neural Network is a flexible, most often non-linear system that learns to implement a function (an input/output map) from data.
• Adaptive means that the system parameters are changed during operation, in what is generally known as the training phase.

Advantages of Artificial Neural Network

• Artificial neural networks can process data in parallel, which means they can handle more than one task at the same time.
• Artificial neural networks are fault tolerant: the loss of one or more cells does not prevent the network from generating output.
• Artificial neural networks store information across the whole network, so the loss of a few pieces of data in one place does not stop the network from producing results.
• Artificial neural networks degrade gradually: they slow down over time rather than suddenly stopping work.
• We are able to train ANNs so that these networks learn from past events and make decisions.

Advantages of Artificial Neural Network

• A neural network can implement tasks that a linear program cannot.
• When an element of the neural network fails, it can continue without problems because of its parallel nature.
• A neural network learns and does not need to be reprogrammed.
• It can be applied in any application.

Disadvantages of Artificial Neural Network

• As mentioned before, ANNs rely on parallel processing, so they need processors that support parallel execution; ANNs are therefore hardware dependent.
• Since it mimics the functionality of the human brain, we may not be able to determine the proper network structure for an Artificial Neural Network.
• Artificial neural networks, like statistical models, can be trained only with numeric data, so the problem statement must be translated into numerical values before it is presented to the ANN.
• When an artificial neural network provides a solution, we do not know on what basis it arrived at that solution; in this sense, an ANN is not fully reliable.

Disadvantages of Artificial Neural Network

• A neural network requires training to operate.
• The structure of a neural network is different from the structure of microprocessors, and therefore needs to be emulated.
• Large neural networks require long processing times.

The ANN Applications

• Classification: the aim is to predict the class of an input vector
• Pattern matching: the aim is to produce a pattern best associated with a given input vector
• Pattern completion: the aim is to complete the missing parts of a given input vector
• Optimization: the aim is to find the optimal values of parameters in an optimization problem
• Control: an appropriate action is suggested based on a given input vector
• Function approximation/time-series modeling: the aim is to learn the functional relationships between input and desired output vectors
• Data mining: the aim is to discover hidden patterns in data (knowledge discovery)

Benefits of Neural Networks

• Nonlinearity
• Input–Output Mapping
• Adaptivity
• Evidential Response
• Contextual Information
• Fault Tolerance
• VLSI Implementability
• Uniformity of Analysis and Design
• Neurobiological Analogy

Nonlinearity

• An artificial neuron can be linear or nonlinear. A neural network, made up of an interconnection of nonlinear neurons, is itself nonlinear.
• Moreover, the nonlinearity is of a special kind in the sense that it is distributed throughout the network.
• Nonlinearity is a highly important property, particularly if the underlying physical mechanism responsible for generation of the input signal (e.g., a speech signal) is inherently nonlinear.

Input–Output Mapping

• The network learns from a set of labeled training examples to construct an input–output mapping for the problem at hand.

Adaptivity

• Neural networks have a built-in capability to adapt their synaptic weights to changes in the surrounding environment.
• In particular, a neural network trained to operate in a specific environment can be easily retrained to deal with minor changes in the operating environmental conditions.
• Moreover, when it is operating in a nonstationary environment (i.e., one where statistics change with time), a neural network may be designed to change its synaptic weights in real time.

Evidential Response

• In the context of pattern classification, a neural network can be designed to provide information not only about which particular pattern to select, but also about the confidence in the decision made.
• This latter information may be used to reject ambiguous patterns, should they arise, and thereby improve the classification performance of the network.

Contextual Information

• Knowledge is represented by the very structure and activation state of a neural network.
• Every neuron in the network is potentially affected by the global activity of all other neurons in the network.
• Consequently, contextual information is dealt with naturally by a neural network.

Fault Tolerance

• A neural network, implemented in hardware form, has the potential to be inherently fault tolerant, or capable of robust computation, in the sense that its performance degrades gracefully under adverse operating conditions.
• For example, if a neuron or its connecting links are damaged, recall of a stored pattern is impaired in quality.

VLSI Implementability

• The massively parallel nature of a neural network makes it potentially fast for the computation of certain tasks.
• This same feature makes a neural network well suited for implementation using very-large-scale-integrated (VLSI) technology.
• One particular beneficial virtue of VLSI is that it provides a means of capturing truly complex behavior in a highly hierarchical fashion (Mead, 1989).

Uniformity of Analysis and Design

• Basically, neural networks enjoy universality as information processors. This feature manifests itself in different ways:
• Neurons, in one form or another, represent an ingredient common to all neural networks.
• This commonality makes it possible to share theories and learning algorithms in different applications of neural networks.
• Modular networks can be built through a seamless integration of modules.

Neurobiological Analogy

• The design of a neural network is motivated by analogy with the brain, which is living proof that fault-tolerant parallel processing is not only physically possible, but also fast and powerful.
• Neurobiologists look to (artificial) neural networks as a research tool for the interpretation of neurobiological phenomena.
• On the other hand, engineers look to neurobiology for new ideas to solve problems more complex than those based on conventional hardwired design.

Block Diagram Representation of the Nervous System

[Figure: block diagram of the nervous system — stimulus → receptors → neural net (brain) → effectors → response, with forward and feedback arrows]

• Central to the system is the brain, represented by the neural (nerve) net, which continually receives information, perceives it, and makes appropriate decisions.
• Two sets of arrows are shown in the figure.
• Those pointing from left to right indicate the forward transmission of information-bearing signals through the system.
• The arrows pointing from right to left (shown in red) signify the presence of feedback in the system.
• The receptors convert stimuli from the human body or the external environment into electrical impulses that convey information to the neural net (brain).
• The effectors convert electrical impulses generated by the neural net into discernible responses as system outputs.

History of ANN

• The history of neural networking arguably began in the late 1800s with scientific endeavors to study the activity of the human brain.
• In 1890, William James published the first work about brain activity patterns.
• In 1943, McCulloch and Pitts created a model of the neuron that is still used today in artificial neural networks.
• In 1949, Donald Hebb published "The Organization of Behavior," which illustrated a law for synaptic neuron learning.

History of ANN

• This law, later known as Hebbian Learning in honor of Donald Hebb, is one of the most straightforward and simple learning rules for artificial neural networks.
• In 1951, Marvin Minsky made the first Artificial Neural Network (ANN) while working at Princeton.
• In 1958, "The Computer and the Brain" was published, a year after John von Neumann's death. In that book, von Neumann proposed numerous extreme changes to how analysts had been modeling the brain.

Structure

[Figure: structure of a neural network — input layer, hidden layer(s), and output layer]

Elements of a Neural Network

Input Layer :- This layer accepts input features. It provides information from the outside world to the network; no computation is performed at this layer, and its nodes just pass the information (features) on to the hidden layer.
Hidden Layer :- Nodes of this layer are not exposed to the outer world; they are part of the abstraction provided by any neural network. The hidden layer performs all sorts of computations on the features entered through the input layer and transfers the result to the output layer.
Output Layer :- This layer brings the information learned by the network to the outer world.
The neural network is made up of many perceptrons.
A perceptron is a single-layer neural network. It is a binary classifier and part of supervised learning. A simple model of the biological neuron in an artificial neural network is known as the perceptron.
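
To make this concrete, here is a minimal perceptron sketch in Python; the weights, bias, and the AND-gate example are illustrative, not part of the slides:

    import itertools

    # Perceptron: weighted sum of inputs plus bias, passed through a
    # step (threshold) activation to give a binary output.
    def perceptron(inputs, weights, bias):
        v = sum(x * w for x, w in zip(inputs, weights)) + bias  # linear combiner
        return 1 if v >= 0 else 0                               # all-or-none output

    # Illustrative weights that make the perceptron act as a logical AND gate.
    weights, bias = [1.0, 1.0], -1.5
    for x in itertools.product([0, 1], repeat=2):
        print(x, perceptron(x, weights, bias))   # fires only for (1, 1)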

Types of ANN

A neural network works much as the human nervous system does. There are several types of neural networks; their implementations differ in the set of parameters and the mathematical operations required for determining the output.

Feedforward Neural Network (Artificial Neuron)

• The FNN is the purest form of ANN, in which data travels in only one direction.
• Data flows in the forward direction only; that is why it is known as the Feedforward Neural Network.
• The data passes through the input nodes and exits from the output nodes.
• The nodes are not connected cyclically. An FNN does not need to have a hidden layer.

Feedforward Neural Network (Artificial Neuron)

• An FNN does not need to have multiple layers; it may have a single layer.
• It has a forward-propagating wave only, achieved by using a classifying activation function.
• Unlike the other types of neural network discussed here, a plain FNN does not use backpropagation.
• In an FNN, the sum of the products of the inputs and weights is calculated and then fed to the output.
• FNNs are used in technologies such as face recognition and computer vision.

Radial Basis Function Neural Network

• An RBFNN considers the distance of a point from the centre and applies a function that varies smoothly with that distance.
• There are two layers in the RBF Neural Network.
• In the inner layer, the features are combined with the radial basis function.
• The outputs of these features are used in computing the final output.
• Distance measures other than Euclidean can also be used.

Radial Basis Function
We define a receptor t, and contour maps are drawn around the receptor.
For RBF, Gaussian functions are generally used, so we can define the radial distance r = ||x − t||.

Radial Basis Function Neural Network

• Radial function: Φ(r) = exp(−r²/(2σ²)), where σ > 0.
• This neural network is used in power restoration systems.
• In the present era, power systems have increased in size and complexity.
• Both factors increase the risk of major power outages.
• Power needs to be restored as quickly and reliably as possible after a blackout.
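
As a small illustration, the Gaussian RBF above can be written directly in Python; the receptor t and width σ below are illustrative values:

    import numpy as np

    # Gaussian radial basis function: phi(r) = exp(-r^2 / (2*sigma^2)),
    # where r = ||x - t|| is the distance from input x to receptor t.
    def gaussian_rbf(x, t, sigma=1.0):
        r = np.linalg.norm(x - t)
        return np.exp(-r**2 / (2 * sigma**2))

    t = np.array([0.0, 0.0])                       # illustrative receptor (centre)
    print(gaussian_rbf(np.array([0.0, 0.0]), t))   # 1.0 at the centre
    print(gaussian_rbf(np.array([3.0, 4.0]), t))   # ~0 far from the centre (r = 5)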

Multilayer Perceptron

• A Multilayer Perceptron has three or more layers.
• Data that cannot be separated linearly is classified with the help of this network.
• This network is fully connected, which means every single node is connected to all the nodes in the next layer.
• A nonlinear activation function is used in the Multilayer Perceptron.
• Its input and output layer nodes are connected as a directed graph. It is a deep learning method, so the network is trained using backpropagation.
• It is extensively applied in speech recognition and machine translation technologies.
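
A sketch of a single forward pass through a small fully connected MLP with a nonlinear activation; the 3-4-2 layer sizes and random weights are illustrative:

    import numpy as np

    rng = np.random.default_rng(0)

    def dense_tanh(x, W, b):
        # Fully connected layer: nonlinear activation of (weights @ inputs + bias).
        return np.tanh(W @ x + b)

    # Illustrative 3-4-2 network: 3 inputs, one hidden layer of 4 nodes, 2 outputs.
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
    W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

    x = np.array([0.5, -1.0, 2.0])      # illustrative input features
    hidden = dense_tanh(x, W1, b1)
    output = dense_tanh(hidden, W2, b2)
    print(output)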

Convolutional Neural Network

• In image classification and image recognition, the Convolutional Neural Network plays a vital role; we can say it is the main category for those tasks.
• Face recognition, object detection, etc., are some areas where CNNs are widely used.
• It is similar to the FNN: learnable weights and biases are available in the neurons.
• A CNN takes an image as input, which is processed and classified under a certain category such as dog, cat, lion, tiger, etc.
• As we know, the computer sees an image as an array of pixels, depending on the resolution of the picture.
• Based on the image resolution, it will see h × w × d, where h = height, w = width, and d = depth (number of channels).
• For example, an RGB image might be a 6 × 6 × 3 array, while a grayscale image might be a 4 × 4 × 1 array.
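
To illustrate the core CNN operation, here is a minimal 2-D convolution over a grayscale (h × w × 1) image in plain Python/NumPy; the image and kernel values are illustrative:

    import numpy as np

    def conv2d(image, kernel):
        # Valid convolution (no padding): slide the kernel over the image
        # and take the sum of elementwise products at each position.
        kh, kw = kernel.shape
        h, w = image.shape
        out = np.zeros((h - kh + 1, w - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    image = np.arange(16, dtype=float).reshape(4, 4)   # a 4 x 4 grayscale image
    kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # illustrative 2 x 2 kernel
    print(conv2d(image, kernel))                       # 3 x 3 feature map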

Recurrent Neural Network

• The Recurrent Neural Network is based on prediction.
• In this neural network, the output of a particular layer is saved and fed back to the input.
• This helps to predict the outcome of the layer.
• In a Recurrent Neural Network, the first layer is formed in the same way as an FNN layer; the recurrent process begins in the subsequent layers.

Recurrent Neural Network

• Normally, inputs and outputs are independent of each other; but in some cases, such as predicting the next word of a sentence, the prediction depends on the previous words.
• The RNN is famous for its primary and most important feature, the Hidden State, which remembers information about a sequence.
• An RNN has a memory that stores the result after each calculation.
• An RNN uses the same parameters on each input, performing the same task on all the hidden layers or data to produce the output.
• Unlike other neural networks, an RNN therefore has lower parameter complexity.
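
A sketch of the hidden-state recurrence described above; the same parameters (the illustrative W, U, and b below) are reused at every time step:

    import numpy as np

    rng = np.random.default_rng(1)
    W = rng.normal(scale=0.5, size=(3, 2))   # input-to-hidden weights (shared)
    U = rng.normal(scale=0.5, size=(3, 3))   # hidden-to-hidden weights (shared)
    b = np.zeros(3)

    def rnn_step(x_t, h_prev):
        # The hidden state h carries information about the sequence so far.
        return np.tanh(W @ x_t + U @ h_prev + b)

    h = np.zeros(3)                          # initial hidden state
    sequence = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
    for x_t in sequence:
        h = rnn_step(x_t, h)                 # same W, U, b reused at each step
    print(h)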

Modular Neural Network

• In a Modular Neural Network, several different networks are functionally independent.
• In an MNN, the task is divided into sub-tasks, each performed by a separate network.
• During the computational process, the networks do not communicate directly with each other.
• All the sub-networks work independently towards achieving the output.
• The combined networks are more powerful than a flat, unrestricted network.
• An intermediary takes the output of each network and processes it to produce the final output.

Sequence to Sequence Network

• It consists of two recurrent neural networks.
• Here, an encoder processes the input and a decoder produces the output.
• The encoder and decoder can use either the same or different parameters.
• Sequence-to-sequence models are applied in chatbots, machine translation, and question answering systems.

MODELS OF A NEURON

A neuron is an information-processing unit that is fundamental to the operation of a neural network. There are three basic elements of the neural model:

1. A set of synapses, or connecting links, each of which is characterized by a weight or strength of its own.
2. An adder for summing the input signals, weighted by the respective synaptic strengths of the neuron; the operations described here constitute a linear combiner.
3. An activation function for limiting the amplitude of the output of a neuron. The activation function is also referred to as a squashing function, in that it squashes (limits) the permissible amplitude range of the output signal to some finite value.

Explanation

• Specifically, a signal xj at the input of synapse j connected to neuron k is multiplied by the synaptic weight wkj.
• It is important to make a note of the manner in which the subscripts of the synaptic weight wkj are written.
• The first subscript in wkj refers to the neuron in question, and the second subscript refers to the input end of the synapse to which the weight refers.
• Unlike the weight of a synapse in the brain, the synaptic weight of an artificial neuron may lie in a range that includes negative as well as positive values.

Explanation

• The neural model of Fig. also includes an externally applied bias, denoted by bk.
• The bias bk has the effect of increasing or lowering the net input of the activation function, depending on whether it is positive or negative, respectively.
• In mathematical terms, we may describe the neuron k depicted in Fig. by writing the pair of equations:

uk = wk1·x1 + wk2·x2 + ... + wkm·xm
yk = φ(uk + bk)

where x1, x2, ..., xm are the input signals; wk1, wk2, ..., wkm are the respective synaptic weights of neuron k; bk is the bias; φ(·) is the activation function; and yk is the output signal of the neuron.

• The use of the bias bk has the effect of applying an affine transformation to the output uk of the linear combiner in the model of Fig., as shown by

vk = uk + bk
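
These two equations translate directly into code; a minimal sketch, with illustrative weights, bias, and inputs:

    import math

    def neuron(x, w, b, phi):
        # u: linear combiner output; v = u + b: induced local field.
        u = sum(w_j * x_j for w_j, x_j in zip(w, x))
        v = u + b
        return phi(v)                    # y = phi(v)

    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    print(neuron([0.5, -1.0], [2.0, 1.0], b=0.5, phi=sigmoid))   # ~0.62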

Learning methods

• Artificial neural networks work through optimized weight values.
• The method by which the optimized weight values are attained is called learning.
• In the learning process, we try to teach the network how to produce the output when the corresponding input is presented.
• When learning is complete, the trained neural network, with the updated optimal weights, should be able to produce the output within the desired accuracy for a given input pattern.
• Learning methods:
• Supervised learning
• Unsupervised learning
• Reinforced learning

Classification of Neural Networks

[Figure: classification chart of neural networks]

Supervised learning

Supervised learning means guided learning by a "teacher"; it requires a training set consisting of input vectors and a target vector associated with each input vector.

Supervised learning

• Supervised learning systems include: feedforward, functional link, product unit, recurrent, and time-delay networks.
• "Learning experience in our childhood":
• As a child, we learn about various things (input) when we see them and simultaneously are told (supervised) their names and respective functionalities (desired response).

Unsupervised learning

• The objective of unsupervised learning is to discover patterns or features in the input data with no help from a teacher, basically performing a clustering of the input space.
• The system learns about the pattern from the data itself, without a priori knowledge.
• This is similar to our learning experience in adulthood: "For example, often in our working environment we are thrown into a project or situation which we know very little about. However, we try to familiarize ourselves with the situation as quickly as possible using our previous experiences, education, willingness and similar other factors."

Unsupervised learning

• Hebb's rule: it helps the neural network or neuron assemblies to remember specific patterns, much like a memory.
• From that stored knowledge, similar sorts of incomplete or partial patterns can be recognized.
• This is even faster than the delta rule or the backpropagation algorithm, because there is no repetitive presentation and training of input–output pairs.

Reinforced learning

• A "teacher", though available, does not present the expected answer, but only indicates whether the computed output is correct or incorrect.
• The information provided helps the network in its learning process.
• A reward is given for a correct answer computed, and a penalty for a wrong answer.

Types of Activation Function

• The activation function, denoted by φ(v), defines the output of a neuron in terms of the induced local field v.
• An activation function in a neural network defines how the weighted sum of the input is transformed into an output from a node or nodes in a layer of the network.
• The activation function decides whether a neuron should be activated or not, by calculating the weighted sum and further adding the bias to it.
• The purpose of the activation function is to introduce non-linearity into the output of a neuron.
• As we know, a neural network has neurons that work in correspondence with their weights, bias, and respective activation function.
• In a neural network, we update the weights and biases of the neurons on the basis of the error at the output. This process is known as back-propagation.
• Activation functions make back-propagation possible, since the gradients are supplied along with the error to update the weights and biases.

Threshold Function

Threshold Function: For this type of activation function, described in Fig., we have

φ(v) = 1 if v ≥ 0, and φ(v) = 0 if v < 0

In engineering, this form of a threshold function is commonly referred to as a Heaviside function. Correspondingly, the output of neuron k employing such a threshold function is expressed as

yk = 1 if vk ≥ 0, and yk = 0 if vk < 0

where vk is the induced local field of the neuron; that is,

vk = wk1·x1 + wk2·x2 + ... + wkm·xm + bk

Threshold Function

• In neural computation, such a neuron is referred to as the McCulloch–Pitts model, in recognition of the pioneering work done by McCulloch and Pitts (1943).
• In this model, the output of a neuron takes on the value of 1 if the induced local field of that neuron is nonnegative, and 0 otherwise.
• This statement describes the all-or-none property of the McCulloch–Pitts model.

Sigmoid Function

• The sigmoid function, whose graph is "S"-shaped, is by far the most common form of activation function used in the construction of neural networks.
• It is defined as a strictly increasing function that exhibits a graceful balance between linear and nonlinear behavior.
• An example of the sigmoid function is the logistic function, defined by

φ(v) = 1 / (1 + exp(−a·v))

where a is the slope parameter of the sigmoid function. By varying the parameter a, we obtain sigmoid functions of different slopes, as illustrated in Fig. In fact, the slope at the origin equals a/4.
In the limit, as the slope parameter approaches infinity, the sigmoid function becomes simply a threshold function.
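
A small numerical sketch of the logistic function, checking that the slope at the origin is a/4 (the value a = 2 is illustrative):

    import math

    def logistic(v, a=1.0):
        # Logistic sigmoid with slope parameter a.
        return 1.0 / (1.0 + math.exp(-a * v))

    a, eps = 2.0, 1e-6
    slope_at_origin = (logistic(eps, a) - logistic(-eps, a)) / (2 * eps)
    print(slope_at_origin, a / 4)   # both approximately 0.5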

Signum Function

• The activation functions defined in the above equations range from 0 to 1. It is sometimes desirable to have the activation function range from −1 to 1, in which case the activation function is an odd function of the induced local field.
• Specifically, the threshold function is now defined as below, which is commonly referred to as the signum function:

φ(v) = 1 if v > 0;  φ(v) = 0 if v = 0;  φ(v) = −1 if v < 0

Linear Function

• Equation: a linear function has the equation of a straight line, i.e., y = ax.
• No matter how many layers we have, if all of them are linear in nature, the final activation of the last layer is nothing but a linear function of the input of the first layer.
• Range: −inf to +inf.
• Uses: the linear activation function is used in just one place, i.e., the output layer.
• Issues: if we differentiate a linear function, the result no longer depends on the input x and the gradient becomes constant, so it will not introduce any useful non-linear behavior into our algorithm.
• For example: calculating the price of a house is a regression problem. A house price may have any large or small value, so we can apply linear activation at the output layer. Even in this case, the neural network must have a non-linear function at the hidden layers.

Sigmoid Function

• It is a function which is plotted as an "S"-shaped graph.
• Equation: A = 1 / (1 + exp(−x)).
• Nature: non-linear. Notice that for X values between −2 and 2, the Y values are very steep. This means that small changes in x bring about large changes in the value of Y.
• Value range: 0 to 1.
• Uses: usually used in the output layer of a binary classifier, where the result is either 0 or 1. Since the value of the sigmoid function lies between 0 and 1 only, the result can easily be predicted to be 1 if the value is greater than 0.5 and 0 otherwise.

Tanh Function

• The activation that works almost always better than the sigmoid function is the Tanh function, also known as the Tangent Hyperbolic function. It is actually a mathematically shifted version of the sigmoid function. Both are similar and can be derived from each other.
• Equation: f(x) = tanh(x) = 2 / (1 + exp(−2x)) − 1, or equivalently tanh(x) = 2 · sigmoid(2x) − 1.
• Value range: −1 to +1.
• Nature: non-linear.
• Uses: usually used in hidden layers of a neural network, as its values lie between −1 and 1; the mean of the hidden layer activations therefore comes out to be 0 or very close to it, which helps in centering the data by bringing the mean close to 0. This makes learning for the next layer much easier.
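
A quick numerical check of the identity tanh(x) = 2·sigmoid(2x) − 1 stated above:

    import math

    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    for x in [-2.0, -0.5, 0.0, 0.5, 2.0]:
        assert abs(math.tanh(x) - (2 * sigmoid(2 * x) - 1)) < 1e-12
    print("identity holds")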

ReLU

• ReLU stands for Rectified Linear Unit. It is the most widely used activation function, chiefly implemented in the hidden layers of a neural network.
• Equation: A(x) = max(0, x). It gives an output of x if x is positive, and 0 otherwise.
• Value range: [0, inf).
• Nature: non-linear, which means we can easily backpropagate the errors and have multiple layers of neurons being activated by the ReLU function.
• Uses: ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. At a time, only a few neurons are activated, making the network sparse and therefore efficient and easy for computation.
• In simple words, ReLU learns much faster than the sigmoid and Tanh functions.

Softmax Function

• The softmax function is also a type of sigmoid function, but it is handy when we are trying to handle classification problems.
• Nature: non-linear.
• Uses: usually used when trying to handle multiple classes. The softmax function squeezes the output for each class to between 0 and 1, and also divides by the sum of the outputs.
• Output: the softmax function is ideally used in the output layer of the classifier, where we are actually trying to attain the probabilities that define the class of each input.
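
A minimal softmax sketch; note that the outputs lie in (0, 1) and sum to 1, so they can be read as class probabilities (the logits below are illustrative):

    import numpy as np

    def softmax(z):
        z = z - np.max(z)        # subtract the max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    logits = np.array([2.0, 1.0, 0.1])   # illustrative raw outputs for 3 classes
    p = softmax(logits)
    print(p, p.sum())                    # probabilities summing to 1.0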

CHOOSING THE RIGHT ACTIVATION FUNCTION
• The basic rule of thumb is that if you really don't know what activation function to use, then simply use ReLU, as it is a general activation function and is used in most cases these days.
• If your output is for binary classification, then the sigmoid function is a very natural choice for the output layer.
• The activation function performs the non-linear transformation of the input, making the network capable of learning and performing more complex tasks.

Loss Function

• The Loss Function is one of the important components of Neural Networks.
• Loss is nothing but the prediction error of the Neural Net, and the method used to calculate the loss is called the Loss Function.
• In simple words, the Loss is used to calculate the gradients, and the gradients are used to update the weights of the Neural Net.

Different loss functions are:
• Mean Squared Error (MSE)
• Binary Crossentropy (BCE)
• Categorical Crossentropy (CC)
• Sparse Categorical Crossentropy (SCC)

Mean Squared Error

• MSE loss is used for regression tasks. As the name suggests, this loss is calculated by taking the mean of the squared differences between the actual (target) and predicted values.
• Example
• For example, we have a neural network which takes house data and predicts the house price. In this case, you can use the MSE loss. Basically, whenever the output is a real number, you should use this loss function.
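
A minimal MSE computation for the house-price example; the target and predicted values below are illustrative:

    import numpy as np

    def mse(y_true, y_pred):
        # Mean of squared differences between actual (target) and predicted values.
        return np.mean((y_true - y_pred) ** 2)

    y_true = np.array([250.0, 300.0, 180.0])   # illustrative house prices
    y_pred = np.array([245.0, 310.0, 190.0])   # network predictions
    print(mse(y_true, y_pred))                 # (25 + 100 + 100) / 3 = 75.0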

Binary Crossentropy

• BCE loss is used for binary classification tasks. If you are using the BCE loss function, you just need one output node to classify the data into two classes. The output value should be passed through a sigmoid activation function, and the range of the output is (0 – 1).
Example
• For example, we have a neural network that takes atmosphere data and predicts whether it will rain or not. If the output is greater than 0.5, the network classifies it as rain, and if the output is less than 0.5, the network classifies it as not rain (it could be the opposite, depending upon how you train the network). The higher the probability score, the greater the chance of rain. While training the network, the target value fed to the network should be 1 if it is raining, otherwise 0.
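
A sketch of the BCE computation for the rain example, with a single sigmoid output node; the values are illustrative:

    import numpy as np

    def bce(y_true, p_pred, eps=1e-12):
        # Binary cross-entropy; p_pred is the sigmoid output in (0, 1).
        p = np.clip(p_pred, eps, 1 - eps)    # avoid log(0)
        return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

    y_true = np.array([1.0, 0.0, 1.0])       # 1 = rain, 0 = no rain
    p_pred = np.array([0.9, 0.2, 0.6])       # network outputs after sigmoid
    print(bce(y_true, p_pred))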

Categorical Crossentropy

• When we have a multi-class classification task, one loss function you can go ahead with is this one. If you are using the CCE loss function, there must be the same number of output nodes as classes, and the final layer output should be passed through a softmax activation so that each node outputs a probability value between 0 and 1.
• For example, we have a neural network that takes an image and classifies it into a cat or dog. If the cat node has a high probability score, then the image is classified as a cat, otherwise as a dog. Basically, whichever class node has the highest probability score, the image is classified into that class.

Sparse Categorical Crossentropy

• This loss function is almost similar to CCE except for one change.
• When we are using the SCCE loss function, you do not need to one-hot encode the target vector. If the target image is of a cat, you simply pass 0, otherwise 1. Basically, whichever the class is, you just pass the index of that class.
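
A sketch contrasting the two: CCE takes one-hot targets, SCCE takes plain class indices, and both give the same loss value (the probabilities below are illustrative softmax outputs):

    import numpy as np

    probs = np.array([[0.7, 0.2, 0.1],       # illustrative softmax outputs
                      [0.1, 0.8, 0.1]])      # for two samples, three classes

    # Categorical crossentropy: targets are one-hot encoded.
    one_hot = np.array([[1, 0, 0],
                        [0, 1, 0]])
    cce = -np.mean(np.sum(one_hot * np.log(probs), axis=1))

    # Sparse categorical crossentropy: targets are plain class indices.
    labels = np.array([0, 1])
    scce = -np.mean(np.log(probs[np.arange(len(labels)), labels]))

    print(cce, scce)   # identical values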

Hardware requirements for implementation of Neural Networks

• There are two fundamentally different alternatives for the implementation of neural networks:
1. A software simulation on conventional computers.
2. A special hardware solution capable of dramatically decreasing execution time.
• A software simulation can be useful to develop and debug new algorithms, as well as to benchmark them using small networks.
• However, if large networks are to be used, a software simulation is not enough. The problem is the time required for the learning process, which can increase exponentially with the size of the network.

Hardware requirements for implementation of Neural Networks

• Neural networks without learning, however, are rather uninteresting.
• If the weights of a network were fixed from the beginning and were not to change, neural networks could be implemented using any programming language on conventional computers.
• But the main objective of building special hardware is to provide a platform for efficient adaptive systems, capable of updating their parameters in the course of time.
• New hardware solutions are therefore necessary.
