
GITAM (Deemed to be University)

19EID331
Artificial Neural Networks

By
Venkata Kranthi B
Department of EECE
GITAM School of Technology (GST)
Bengaluru-561203
Email: kbudigi@gitam.edu


Contents

• Introduction to Neural Networks
• Architecture-based classification of Neural Networks
• Classification of Neural Networks based on learning methods
• Activation functions and Loss functions
• Factors to be considered for choice of type of Neural Network
• Introduction to hardware requirements for implementation of Neural Networks

Definition

• The brain is a highly complex, nonlinear, and parallel computer (information-processing system).
• It has the capability to organize its structural constituents, known as neurons, to perform certain computations (e.g., pattern recognition, perception, and motor control) many times faster than the fastest digital computer in existence today.
• "An artificial neural network is a system based on the operation of biological neural networks. It is a simulation of a biological neural system."
• A characteristic of artificial neural networks is that there are multiple architectures, which in turn call for different learning algorithms; yet despite forming a complex system, each individual unit of a neural network is relatively simple.

Introduction

• Consider, for example, human vision, which is an information-processing task.
• It is the function of the visual system to provide a representation of the environment around us and, more important, to supply the information we need to interact with the environment.
• To be specific, the brain routinely accomplishes perceptual recognition tasks (e.g., recognizing a familiar face embedded in an unfamiliar scene) in approximately 100–200 ms, whereas tasks of much lesser complexity take a great deal longer on a powerful computer.

Introduction

• A neural network is a massively parallel distributed processor made up of simple processing units that has a natural propensity for storing experiential knowledge and making it available for use.
• It resembles the brain in two respects:
1. Knowledge is acquired by the network from its environment through a learning process.
2. Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.
• An Artificial Neural Network is a flexible, most often non-linear system that learns to implement a function (an input/output map) from data.
• Adaptive means that the system parameters are changed during operation, in what is generally known as the training phase.

Advantages of Artificial Neural Network

• Artificial neural networks can process data in parallel, which means they can handle more than one task at the same time.
• Artificial neural networks are fault tolerant: the loss of one or more cells does not prevent the network from generating output.
• Artificial neural networks store information across the whole network, so the loss of a few pieces of data in one place does not stop the network from producing results.
• Artificial neural networks degrade gradually: they slow down over time rather than suddenly stopping work.
• We are able to train ANNs so that these networks learn from past events and make decisions.

Advantages of Artificial Neural Network

• A neural network can implement tasks that a linear program cannot.
• When an element of the neural network fails, it can continue without problems because of its parallel nature.
• A neural network learns and does not need to be reprogrammed.
• It can be applied in any application.

Disadvantages of Artificial Neural Network

• As mentioned before, ANNs rely on parallel processing, so they need processors that support parallel execution; ANNs are therefore hardware dependent.
• Since it mimics the functionality of the human brain, we may not be able to determine the proper network structure for an Artificial Neural Network.
• Artificial neural networks, like statistical models, can be trained only with numeric data, so the problem statement must be translated into numerical values before it is presented to the ANN.
• When an artificial neural network provides a solution, we do not know on what basis it arrived at that solution; in this sense, an ANN is not fully reliable.

Disadvantages of Artificial Neural Network

• A neural network requires training to operate.
• The structure of a neural network is different from the structure of microprocessors, and therefore needs to be emulated.
• Large neural networks require long processing times.

The ANN Applications

• Classification: the aim is to predict the class of an input vector
• Pattern matching: the aim is to produce a pattern best associated with a given input vector
• Pattern completion: the aim is to complete the missing parts of a given input vector
• Optimization: the aim is to find the optimal values of parameters in an optimization problem
• Control: an appropriate action is suggested based on a given input vector
• Function approximation/time-series modeling: the aim is to learn the functional relationships between input and desired output vectors
• Data mining: the aim is to discover hidden patterns in data (knowledge discovery)

Benefits of Neural Networks

• Nonlinearity
• Input–Output Mapping
• Adaptivity
• Evidential Response
• Contextual Information
• Fault Tolerance
• VLSI Implementability
• Uniformity of Analysis and Design
• Neurobiological Analogy

Nonlinearity

• An artificial neuron can be linear or nonlinear. A neural network, made up of an interconnection of nonlinear neurons, is itself nonlinear.
• Moreover, the nonlinearity is of a special kind in the sense that it is distributed throughout the network.
• Nonlinearity is a highly important property, particularly if the underlying physical mechanism responsible for generation of the input signal (e.g., a speech signal) is inherently nonlinear.

Input–Output Mapping

• The network learns from a set of labeled training examples to construct an input–output mapping for the problem at hand.

Adaptivity

• Neural networks have a built-in capability to adapt their synaptic weights to changes in the surrounding environment.
• In particular, a neural network trained to operate in a specific environment can be easily retrained to deal with minor changes in the operating environmental conditions.
• Moreover, when it is operating in a nonstationary environment (i.e., one where statistics change with time), a neural network may be designed to change its synaptic weights in real time.

Evidential Response

• In the context of pattern classification, a neural network can be designed to provide information not only about which particular pattern to select, but also about the confidence in the decision made.
• This latter information may be used to reject ambiguous patterns, should they arise, and thereby improve the classification performance of the network.

Contextual Information

• Knowledge is represented by the very structure and activation state of a neural network.
• Every neuron in the network is potentially affected by the global activity of all other neurons in the network.
• Consequently, contextual information is dealt with naturally by a neural network.

Fault Tolerance

• A neural network, implemented in hardware form, has the potential to be inherently fault tolerant, or capable of robust computation, in the sense that its performance degrades gracefully under adverse operating conditions.
• For example, if a neuron or its connecting links are damaged, recall of a stored pattern is impaired in quality.

VLSI Implementability

• The massively parallel nature of a neural network makes it potentially fast for the computation of certain tasks.
• This same feature makes a neural network well suited for implementation using very-large-scale-integrated (VLSI) technology.
• One particular beneficial virtue of VLSI is that it provides a means of capturing truly complex behavior in a highly hierarchical fashion (Mead, 1989).

Uniformity of Analysis and Design

• Basically, neural networks enjoy universality as information processors. This feature manifests itself in different ways:
• Neurons, in one form or another, represent an ingredient common to all neural networks.
• This commonality makes it possible to share theories and learning algorithms in different applications of neural networks.
• Modular networks can be built through a seamless integration of modules.

Neurobiological Analogy

• The design of a neural network is motivated by analogy with the brain, which is living proof that fault-tolerant parallel processing is not only physically possible, but also fast and powerful.
• Neurobiologists look to (artificial) neural networks as a research tool for the interpretation of neurobiological phenomena.
• On the other hand, engineers look to neurobiology for new ideas to solve problems more complex than those based on conventional hardwired design.

Block Diagram Representation of the Nervous System

[Figure: block diagram of the nervous system — stimulus → receptors → neural net (brain) → effectors → response, with forward and feedback arrows]

• Central to the system is the brain, represented by the neural (nerve) net, which continually receives information, perceives it, and makes appropriate decisions.
• Two sets of arrows are shown in the figure.
• Those pointing from left to right indicate the forward transmission of information-bearing signals through the system.
• The arrows pointing from right to left (shown in red) signify the presence of feedback in the system.
• The receptors convert stimuli from the human body or the external environment into electrical impulses that convey information to the neural net (brain).
• The effectors convert electrical impulses generated by the neural net into discernible responses as system outputs.

History of ANN

• The history of neural networking arguably began in the late 1800s with scientific endeavors to study the activity of the human brain.
• In 1890, William James published the first work about brain activity patterns.
• In 1943, McCulloch and Pitts created a model of the neuron that is still used today in artificial neural networks.
• In 1949, Donald Hebb published "The Organization of Behavior," which illustrated a law for synaptic neuron learning.

History of ANN

• This law, later known as Hebbian Learning in honor of Donald Hebb, is one of the most straightforward and simple learning rules for artificial neural networks.
• In 1951, Marvin Minsky made the first Artificial Neural Network (ANN) while working at Princeton.
• In 1958, "The Computer and the Brain" was published, a year after John von Neumann's death. In that book, von Neumann proposed numerous extreme changes to how analysts had been modeling the brain.

Structure

[Figure: structure of a neural network — input layer, hidden layer(s), and output layer]

Elements of a Neural Network

Input Layer :- This layer accepts input features. It provides information from the outside world to the network; no computation is performed at this layer, and its nodes just pass the information (features) on to the hidden layer.
Hidden Layer :- Nodes of this layer are not exposed to the outer world; they are part of the abstraction provided by any neural network. The hidden layer performs all sorts of computations on the features entered through the input layer and transfers the result to the output layer.
Output Layer :- This layer brings the information learned by the network to the outer world.
The neural network is made up of many perceptrons.
A perceptron is a single-layer neural network. It is a binary classifier and part of supervised learning. A simple model of the biological neuron in an artificial neural network is known as the perceptron.
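
To make this concrete, here is a minimal perceptron sketch in Python; the weights, bias, and the AND-gate example are illustrative, not part of the slides:

    import itertools

    # Perceptron: weighted sum of inputs plus bias, passed through a
    # step (threshold) activation to give a binary output.
    def perceptron(inputs, weights, bias):
        v = sum(x * w for x, w in zip(inputs, weights)) + bias  # linear combiner
        return 1 if v >= 0 else 0                               # all-or-none output

    # Illustrative weights that make the perceptron act as a logical AND gate.
    weights, bias = [1.0, 1.0], -1.5
    for x in itertools.product([0, 1], repeat=2):
        print(x, perceptron(x, weights, bias))   # fires only for (1, 1)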

Types of ANN

A neural network works much as the human nervous system does. There are several types of neural networks; their implementations differ in the set of parameters and the mathematical operations required for determining the output.

Feedforward Neural Network (Artificial Neuron)

• The FNN is the purest form of ANN, in which data travels in only one direction.
• Data flows in the forward direction only; that is why it is known as the Feedforward Neural Network.
• The data passes through the input nodes and exits from the output nodes.
• The nodes are not connected cyclically. An FNN does not need to have a hidden layer.

Feedforward Neural Network (Artificial Neuron)

• An FNN does not need to have multiple layers; it may have a single layer.
• It has a forward-propagating wave only, achieved by using a classifying activation function.
• Unlike the other types of neural network discussed here, a plain FNN does not use backpropagation.
• In an FNN, the sum of the products of the inputs and weights is calculated and then fed to the output.
• FNNs are used in technologies such as face recognition and computer vision.

Radial Basis Function Neural Network

• An RBFNN considers the distance of a point from the centre and applies a function that varies smoothly with that distance.
• There are two layers in the RBF Neural Network.
• In the inner layer, the features are combined with the radial basis function.
• The outputs of these features are used in computing the final output.
• Distance measures other than Euclidean can also be used.

Radial Basis Function
We define a receptor t, and contour maps are drawn around the receptor.
For RBF, Gaussian functions are generally used, so we can define the radial distance r = ||x − t||.

Radial Basis Function Neural Network

• Radial function: Φ(r) = exp(−r²/(2σ²)), where σ > 0.
• This neural network is used in power restoration systems.
• In the present era, power systems have increased in size and complexity.
• Both factors increase the risk of major power outages.
• Power needs to be restored as quickly and reliably as possible after a blackout.
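
As a small illustration, the Gaussian RBF above can be written directly in Python; the receptor t and width σ below are illustrative values:

    import numpy as np

    # Gaussian radial basis function: phi(r) = exp(-r^2 / (2*sigma^2)),
    # where r = ||x - t|| is the distance from input x to receptor t.
    def gaussian_rbf(x, t, sigma=1.0):
        r = np.linalg.norm(x - t)
        return np.exp(-r**2 / (2 * sigma**2))

    t = np.array([0.0, 0.0])                       # illustrative receptor (centre)
    print(gaussian_rbf(np.array([0.0, 0.0]), t))   # 1.0 at the centre
    print(gaussian_rbf(np.array([3.0, 4.0]), t))   # ~0 far from the centre (r = 5)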

Multilayer Perceptron

• A Multilayer Perceptron has three or more layers.
• Data that cannot be separated linearly is classified with the help of this network.
• This network is fully connected, which means every single node is connected to all the nodes in the next layer.
• A nonlinear activation function is used in the Multilayer Perceptron.
• Its input and output layer nodes are connected as a directed graph. It is a deep learning method, so the network is trained using backpropagation.
• It is extensively applied in speech recognition and machine translation technologies.
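
A sketch of a single forward pass through a small fully connected MLP with a nonlinear activation; the 3-4-2 layer sizes and random weights are illustrative:

    import numpy as np

    rng = np.random.default_rng(0)

    def dense_tanh(x, W, b):
        # Fully connected layer: nonlinear activation of (weights @ inputs + bias).
        return np.tanh(W @ x + b)

    # Illustrative 3-4-2 network: 3 inputs, one hidden layer of 4 nodes, 2 outputs.
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
    W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

    x = np.array([0.5, -1.0, 2.0])      # illustrative input features
    hidden = dense_tanh(x, W1, b1)
    output = dense_tanh(hidden, W2, b2)
    print(output)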

Convolutional Neural Network

• In image classification and image recognition, the Convolutional Neural Network plays a vital role; we can say it is the main category for those tasks.
• Face recognition, object detection, etc., are some areas where CNNs are widely used.
• It is similar to the FNN: learnable weights and biases are available in the neurons.
• A CNN takes an image as input, which is processed and classified under a certain category such as dog, cat, lion, tiger, etc.
• As we know, the computer sees an image as an array of pixels, depending on the resolution of the picture.
• Based on the image resolution, it will see h × w × d, where h = height, w = width, and d = depth (number of channels).
• For example, an RGB image might be a 6 × 6 × 3 array, while a grayscale image might be a 4 × 4 × 1 array.
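
To illustrate the core CNN operation, here is a minimal 2-D convolution over a grayscale (h × w × 1) image in plain Python/NumPy; the image and kernel values are illustrative:

    import numpy as np

    def conv2d(image, kernel):
        # Valid convolution (no padding): slide the kernel over the image
        # and take the sum of elementwise products at each position.
        kh, kw = kernel.shape
        h, w = image.shape
        out = np.zeros((h - kh + 1, w - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    image = np.arange(16, dtype=float).reshape(4, 4)   # a 4 x 4 grayscale image
    kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # illustrative 2 x 2 kernel
    print(conv2d(image, kernel))                       # 3 x 3 feature map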

Recurrent Neural Network

• The Recurrent Neural Network is based on prediction.
• In this neural network, the output of a particular layer is saved and fed back to the input.
• This helps to predict the outcome of the layer.
• In a Recurrent Neural Network, the first layer is formed in the same way as an FNN layer; the recurrent process begins in the subsequent layers.

Recurrent Neural Network

• Normally, inputs and outputs are independent of each other; but in some cases, such as predicting the next word of a sentence, the prediction depends on the previous words.
• The RNN is famous for its primary and most important feature, the Hidden State, which remembers information about a sequence.
• An RNN has a memory that stores the result after each calculation.
• An RNN uses the same parameters on each input, performing the same task on all the hidden layers or data to produce the output.
• Unlike other neural networks, an RNN therefore has lower parameter complexity.
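
A sketch of the hidden-state recurrence described above; the same parameters (the illustrative W, U, and b below) are reused at every time step:

    import numpy as np

    rng = np.random.default_rng(1)
    W = rng.normal(scale=0.5, size=(3, 2))   # input-to-hidden weights (shared)
    U = rng.normal(scale=0.5, size=(3, 3))   # hidden-to-hidden weights (shared)
    b = np.zeros(3)

    def rnn_step(x_t, h_prev):
        # The hidden state h carries information about the sequence so far.
        return np.tanh(W @ x_t + U @ h_prev + b)

    h = np.zeros(3)                          # initial hidden state
    sequence = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
    for x_t in sequence:
        h = rnn_step(x_t, h)                 # same W, U, b reused at each step
    print(h)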

Modular Neural Network

• In a Modular Neural Network, several different networks are functionally independent.
• In an MNN, the task is divided into sub-tasks, each performed by a separate network.
• During the computational process, the networks do not communicate directly with each other.
• All the sub-networks work independently towards achieving the output.
• The combined networks are more powerful than a flat, unrestricted network.
• An intermediary takes the output of each network and processes it to produce the final output.

Sequence to Sequence Network

• It consists of two recurrent neural networks.
• Here, an encoder processes the input and a decoder produces the output.
• The encoder and decoder can use either the same or different parameters.
• Sequence-to-sequence models are applied in chatbots, machine translation, and question answering systems.

MODELS OF A NEURON

A neuron is an information-processing unit that is fundamental to the operation of a neural network. There are three basic elements of the neural model:

1. A set of synapses, or connecting links, each of which is characterized by a weight or strength of its own.
2. An adder for summing the input signals, weighted by the respective synaptic strengths of the neuron; the operations described here constitute a linear combiner.
3. An activation function for limiting the amplitude of the output of a neuron. The activation function is also referred to as a squashing function, in that it squashes (limits) the permissible amplitude range of the output signal to some finite value.

Explanation

• Specifically, a signal xj at the input of synapse j connected to neuron k is multiplied by the synaptic weight wkj.
• It is important to make a note of the manner in which the subscripts of the synaptic weight wkj are written.
• The first subscript in wkj refers to the neuron in question, and the second subscript refers to the input end of the synapse to which the weight refers.
• Unlike the weight of a synapse in the brain, the synaptic weight of an artificial neuron may lie in a range that includes negative as well as positive values.

Explanation

• The neural model of Fig. also includes an externally applied bias, denoted by bk.
• The bias bk has the effect of increasing or lowering the net input of the activation function, depending on whether it is positive or negative, respectively.
• In mathematical terms, we may describe the neuron k depicted in Fig. by writing the pair of equations:

uk = wk1·x1 + wk2·x2 + ... + wkm·xm
yk = φ(uk + bk)

where x1, x2, ..., xm are the input signals; wk1, wk2, ..., wkm are the respective synaptic weights of neuron k; bk is the bias; φ(·) is the activation function; and yk is the output signal of the neuron.

• The use of the bias bk has the effect of applying an affine transformation to the output uk of the linear combiner in the model of Fig., as shown by

vk = uk + bk
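
These two equations translate directly into code; a minimal sketch, with illustrative weights, bias, and inputs:

    import math

    def neuron(x, w, b, phi):
        # u: linear combiner output; v = u + b: induced local field.
        u = sum(w_j * x_j for w_j, x_j in zip(w, x))
        v = u + b
        return phi(v)                    # y = phi(v)

    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    print(neuron([0.5, -1.0], [2.0, 1.0], b=0.5, phi=sigmoid))   # ~0.62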

Learning methods

• Artificial neural networks work through optimized weight values.
• The method by which the optimized weight values are attained is called learning.
• In the learning process, we try to teach the network how to produce the output when the corresponding input is presented.
• When learning is complete, the trained neural network, with the updated optimal weights, should be able to produce the output within the desired accuracy for a given input pattern.
• Learning methods:
• Supervised learning
• Unsupervised learning
• Reinforced learning

Classification of Neural Networks

[Figure: classification chart of neural networks]

Supervised learning

Supervised learning means guided learning by a "teacher"; it requires a training set consisting of input vectors and a target vector associated with each input vector.

Supervised learning

• Supervised learning systems include: feedforward, functional link, product unit, recurrent, and time-delay networks.
• "Learning experience in our childhood":
• As a child, we learn about various things (input) when we see them and simultaneously are told (supervised) their names and respective functionalities (desired response).

Unsupervised learning

• The objective of unsupervised learning is to discover patterns or features in the input data with no help from a teacher, basically performing a clustering of the input space.
• The system learns about the pattern from the data itself, without a priori knowledge.
• This is similar to our learning experience in adulthood: "For example, often in our working environment we are thrown into a project or situation which we know very little about. However, we try to familiarize ourselves with the situation as quickly as possible using our previous experiences, education, willingness and similar other factors."

Unsupervised learning

• Hebb's rule: it helps the neural network or neuron assemblies to remember specific patterns, much like a memory.
• From that stored knowledge, similar sorts of incomplete or partial patterns can be recognized.
• This is even faster than the delta rule or the backpropagation algorithm, because there is no repetitive presentation and training of input–output pairs.

Reinforced learning

• A "teacher", though available, does not present the expected answer, but only indicates whether the computed output is correct or incorrect.
• The information provided helps the network in its learning process.
• A reward is given for a correct answer computed, and a penalty for a wrong answer.

Types of Activation Function

• The activation function, denoted by φ(v), defines the output of a neuron in terms of the induced local field v.
• An activation function in a neural network defines how the weighted sum of the input is transformed into an output from a node or nodes in a layer of the network.
• The activation function decides whether a neuron should be activated or not, by calculating the weighted sum and further adding the bias to it.
• The purpose of the activation function is to introduce non-linearity into the output of a neuron.
• As we know, a neural network has neurons that work in correspondence with their weights, bias, and respective activation function.
• In a neural network, we update the weights and biases of the neurons on the basis of the error at the output. This process is known as back-propagation.
• Activation functions make back-propagation possible, since the gradients are supplied along with the error to update the weights and biases.

Threshold Function

Threshold Function: For this type of activation function, described in Fig., we have

φ(v) = 1 if v ≥ 0, and φ(v) = 0 if v < 0

In engineering, this form of a threshold function is commonly referred to as a Heaviside function. Correspondingly, the output of neuron k employing such a threshold function is expressed as

yk = 1 if vk ≥ 0, and yk = 0 if vk < 0

where vk is the induced local field of the neuron; that is,

vk = wk1·x1 + wk2·x2 + ... + wkm·xm + bk

Threshold Function

• In neural computation, such a neuron is referred to as the McCulloch–Pitts model, in recognition of the pioneering work done by McCulloch and Pitts (1943).
• In this model, the output of a neuron takes on the value of 1 if the induced local field of that neuron is nonnegative, and 0 otherwise.
• This statement describes the all-or-none property of the McCulloch–Pitts model.

Sigmoid Function

• The sigmoid function, whose graph is "S"-shaped, is by far the most common form of activation function used in the construction of neural networks.
• It is defined as a strictly increasing function that exhibits a graceful balance between linear and nonlinear behavior.
• An example of the sigmoid function is the logistic function, defined by

φ(v) = 1 / (1 + exp(−a·v))

where a is the slope parameter of the sigmoid function. By varying the parameter a, we obtain sigmoid functions of different slopes, as illustrated in Fig. In fact, the slope at the origin equals a/4.
In the limit, as the slope parameter approaches infinity, the sigmoid function becomes simply a threshold function.
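
A small numerical sketch of the logistic function, checking that the slope at the origin is a/4 (the value a = 2 is illustrative):

    import math

    def logistic(v, a=1.0):
        # Logistic sigmoid with slope parameter a.
        return 1.0 / (1.0 + math.exp(-a * v))

    a, eps = 2.0, 1e-6
    slope_at_origin = (logistic(eps, a) - logistic(-eps, a)) / (2 * eps)
    print(slope_at_origin, a / 4)   # both approximately 0.5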

Signum Function

• The activation functions defined in the above equations range from 0 to 1. It is sometimes desirable to have the activation function range from −1 to 1, in which case the activation function is an odd function of the induced local field.
• Specifically, the threshold function is now defined as below, which is commonly referred to as the signum function:

φ(v) = 1 if v > 0;  φ(v) = 0 if v = 0;  φ(v) = −1 if v < 0

Linear Function

• Equation: a linear function has the equation of a straight line, i.e., y = ax.
• No matter how many layers we have, if all of them are linear in nature, the final activation of the last layer is nothing but a linear function of the input of the first layer.
• Range: −inf to +inf.
• Uses: the linear activation function is used in just one place, i.e., the output layer.
• Issues: if we differentiate a linear function, the result no longer depends on the input x and the gradient becomes constant, so it will not introduce any useful non-linear behavior into our algorithm.
• For example: calculating the price of a house is a regression problem. A house price may have any large or small value, so we can apply linear activation at the output layer. Even in this case, the neural network must have a non-linear function at the hidden layers.

Sigmoid Function

• It is a function which is plotted as an "S"-shaped graph.
• Equation: A = 1 / (1 + exp(−x)).
• Nature: non-linear. Notice that for X values between −2 and 2, the Y values are very steep. This means that small changes in x bring about large changes in the value of Y.
• Value range: 0 to 1.
• Uses: usually used in the output layer of a binary classifier, where the result is either 0 or 1. Since the value of the sigmoid function lies between 0 and 1 only, the result can easily be predicted to be 1 if the value is greater than 0.5 and 0 otherwise.

Tanh Function

• The activation that works almost always better than the sigmoid function is the Tanh function, also known as the Tangent Hyperbolic function. It is actually a mathematically shifted version of the sigmoid function. Both are similar and can be derived from each other.
• Equation: f(x) = tanh(x) = 2 / (1 + exp(−2x)) − 1, or equivalently tanh(x) = 2 · sigmoid(2x) − 1.
• Value range: −1 to +1.
• Nature: non-linear.
• Uses: usually used in hidden layers of a neural network, as its values lie between −1 and 1; the mean of the hidden layer activations therefore comes out to be 0 or very close to it, which helps in centering the data by bringing the mean close to 0. This makes learning for the next layer much easier.
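
A quick numerical check of the identity tanh(x) = 2·sigmoid(2x) − 1 stated above:

    import math

    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    for x in [-2.0, -0.5, 0.0, 0.5, 2.0]:
        assert abs(math.tanh(x) - (2 * sigmoid(2 * x) - 1)) < 1e-12
    print("identity holds")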

ReLU

• ReLU stands for Rectified Linear Unit. It is the most widely used activation function, chiefly implemented in the hidden layers of a neural network.
• Equation: A(x) = max(0, x). It gives an output of x if x is positive, and 0 otherwise.
• Value range: [0, inf).
• Nature: non-linear, which means we can easily backpropagate the errors and have multiple layers of neurons being activated by the ReLU function.
• Uses: ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. At a time, only a few neurons are activated, making the network sparse and therefore efficient and easy for computation.
• In simple words, ReLU learns much faster than the sigmoid and Tanh functions.

Softmax Function

• The softmax function is also a type of sigmoid function, but it is handy when we are trying to handle classification problems.
• Nature: non-linear.
• Uses: usually used when trying to handle multiple classes. The softmax function squeezes the output for each class to between 0 and 1, and also divides by the sum of the outputs.
• Output: the softmax function is ideally used in the output layer of the classifier, where we are actually trying to attain the probabilities that define the class of each input.
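
A minimal softmax sketch; note that the outputs lie in (0, 1) and sum to 1, so they can be read as class probabilities (the logits below are illustrative):

    import numpy as np

    def softmax(z):
        z = z - np.max(z)        # subtract the max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    logits = np.array([2.0, 1.0, 0.1])   # illustrative raw outputs for 3 classes
    p = softmax(logits)
    print(p, p.sum())                    # probabilities summing to 1.0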

CHOOSING THE RIGHT ACTIVATION FUNCTION
• The basic rule of thumb is that if you really don't know what activation function to use, then simply use ReLU, as it is a general activation function and is used in most cases these days.
• If your output is for binary classification, then the sigmoid function is a very natural choice for the output layer.
• The activation function performs the non-linear transformation of the input, making the network capable of learning and performing more complex tasks.

Loss Function

• The Loss Function is one of the important components of Neural Networks.
• Loss is nothing but the prediction error of the Neural Net, and the method used to calculate the loss is called the Loss Function.
• In simple words, the Loss is used to calculate the gradients, and the gradients are used to update the weights of the Neural Net.

Different loss functions are:
• Mean Squared Error (MSE)
• Binary Crossentropy (BCE)
• Categorical Crossentropy (CC)
• Sparse Categorical Crossentropy (SCC)

Mean Squared Error

• MSE loss is used for regression tasks. As the name suggests, this loss is calculated by taking the mean of the squared differences between the actual (target) and predicted values.
• Example
• For example, we have a neural network which takes house data and predicts the house price. In this case, you can use the MSE loss. Basically, whenever the output is a real number, you should use this loss function.
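
A minimal MSE computation for the house-price example; the target and predicted values below are illustrative:

    import numpy as np

    def mse(y_true, y_pred):
        # Mean of squared differences between actual (target) and predicted values.
        return np.mean((y_true - y_pred) ** 2)

    y_true = np.array([250.0, 300.0, 180.0])   # illustrative house prices
    y_pred = np.array([245.0, 310.0, 190.0])   # network predictions
    print(mse(y_true, y_pred))                 # (25 + 100 + 100) / 3 = 75.0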

Binary Crossentropy

• BCE loss is used for binary classification tasks. If you are using the BCE loss function, you just need one output node to classify the data into two classes. The output value should be passed through a sigmoid activation function, and the range of the output is (0 – 1).
Example
• For example, we have a neural network that takes atmosphere data and predicts whether it will rain or not. If the output is greater than 0.5, the network classifies it as rain, and if the output is less than 0.5, the network classifies it as not rain (it could be the opposite, depending upon how you train the network). The higher the probability score, the greater the chance of rain. While training the network, the target value fed to the network should be 1 if it is raining, otherwise 0.
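
A sketch of the BCE computation for the rain example, with a single sigmoid output node; the values are illustrative:

    import numpy as np

    def bce(y_true, p_pred, eps=1e-12):
        # Binary cross-entropy; p_pred is the sigmoid output in (0, 1).
        p = np.clip(p_pred, eps, 1 - eps)    # avoid log(0)
        return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

    y_true = np.array([1.0, 0.0, 1.0])       # 1 = rain, 0 = no rain
    p_pred = np.array([0.9, 0.2, 0.6])       # network outputs after sigmoid
    print(bce(y_true, p_pred))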

Categorical Crossentropy

• When we have a multi-class classification task, one loss function you can go ahead with is this one. If you are using the CCE loss function, there must be the same number of output nodes as classes, and the final layer output should be passed through a softmax activation so that each node outputs a probability value between 0 and 1.
• For example, we have a neural network that takes an image and classifies it into a cat or dog. If the cat node has a high probability score, then the image is classified as a cat, otherwise as a dog. Basically, whichever class node has the highest probability score, the image is classified into that class.

Sparse Categorical Crossentropy

• This loss function is almost similar to CCE except for one change.
• When we are using the SCCE loss function, you do not need to one-hot encode the target vector. If the target image is of a cat, you simply pass 0, otherwise 1. Basically, whichever the class is, you just pass the index of that class.
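
A sketch contrasting the two: CCE takes one-hot targets, SCCE takes plain class indices, and both give the same loss value (the probabilities below are illustrative softmax outputs):

    import numpy as np

    probs = np.array([[0.7, 0.2, 0.1],       # illustrative softmax outputs
                      [0.1, 0.8, 0.1]])      # for two samples, three classes

    # Categorical crossentropy: targets are one-hot encoded.
    one_hot = np.array([[1, 0, 0],
                        [0, 1, 0]])
    cce = -np.mean(np.sum(one_hot * np.log(probs), axis=1))

    # Sparse categorical crossentropy: targets are plain class indices.
    labels = np.array([0, 1])
    scce = -np.mean(np.log(probs[np.arange(len(labels)), labels]))

    print(cce, scce)   # identical values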

Hardware requirements for implementation of Neural Networks

• There are two fundamentally different alternatives for the implementation of neural networks:
1. A software simulation on conventional computers.
2. A special hardware solution capable of dramatically decreasing execution time.
• A software simulation can be useful to develop and debug new algorithms, as well as to benchmark them using small networks.
• However, if large networks are to be used, a software simulation is not enough. The problem is the time required for the learning process, which can increase exponentially with the size of the network.

Hardware requirements for implementation of Neural Networks

• Neural networks without learning, however, are rather uninteresting.
• If the weights of a network were fixed from the beginning and were not to change, neural networks could be implemented using any programming language on conventional computers.
• But the main objective of building special hardware is to provide a platform for efficient adaptive systems, capable of updating their parameters in the course of time.
• New hardware solutions are therefore necessary.
