
CHP5.

ARTIFICIAL NEURAL
NETWORK
Introduction – Fundamental concept– Basic Models of Artificial Neural
Networks – Important Terminologies of ANNs – McCulloch-Pitts Neuron
5.2 Neural Network Architecture: Perceptron, Single layer Feed Forward ANN,
Multilayer Feed Forward ANN, Activation functions, Supervised Learning:
Delta learning rule, Back Propagation algorithm.



WHY?
• The speed of modern digital computers is truly astounding.
– No human can ever hope to compute a million operations a second (e.g. to search for a document in a computer).
• Yet the most powerful computers cannot compete with the human brain (e.g. a computer can't tell a story).
• Imagine combining both.
• Then all humans can live happily ever after (or will they?).
• This is the aim of artificial intelligence.
INTRODUCTION TO NN
INTRODUCTION TO NN

• An ANN attempts to reproduce the human brain by artificial means.
• It mimics how our nervous system processes information.
• An ANN is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems.
• ANNs, like people, learn by example.
• Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well.
FUNDAMENTAL CONCEPT

• NNs are constructed and implemented to model the human brain.
• The main objective of NN research is to develop a computational device that models the brain and performs various computational tasks at a faster rate than traditional systems.
• NNs perform various tasks such as pattern matching, classification, optimization and data clustering.
• These tasks are difficult for traditional computers.



ANN
• ANNs possess a large number of processing elements (PEs), called nodes/units/neurons, which operate in parallel.
• Neurons are connected to one another by connection links.
• Each link is associated with a weight which contains information about the input signal. This information is used by the NN to solve a particular problem.
• ANNs have the capability to model networks of original neurons as found in the brain. Thus, ANN PEs are called neurons or artificial neurons.
• Each neuron has an internal state of its own (activation or activity level), which is a function of the inputs that neuron receives.
• The activation signal of a neuron is transmitted to other neurons.
• A neuron can send only one signal at a time.



Artificial Neural Networks
• Information is transmitted as a series of electric impulses, so-called spikes.
• The frequency and phase of these spikes encode the information.
• In biological systems, one neuron can be connected to as many as 10,000 other neurons.
• Usually, a neuron receives its information from other neurons in a confined area, its so-called receptive field.


EVOLUTION OF NN
• 1943: McCulloch and Pitts model neural networks
based on their understanding of neurology.
– Neurons embed simple logic functions:
• a or b
• a and b
• 1949 Hebb’s law
• Perceptron (Rosenblatt 1958)
– Three-layer system:
• Input nodes
• Output node
• Association layer
– Can learn to connect or associate a given input to a random
output unit
• 1960 Adaline, better learning rule (Widrow, Hoff)



• 1969 Limitations (Minsky, Papert)
– Showed that a single-layer perceptron cannot learn the XOR of two binary inputs
– Led to a loss of interest (and funding) in the field
• 1972 Kohonen nets, associative memory
• 1974 Werbos introduced a so-called backpropagation algorithm for the three-layered perceptron network.
• 1982 A totally unique kind of network model is the Self-Organizing Map (SOM), introduced by Kohonen. The SOM is a kind of topological map which organizes itself based on the input patterns that it is trained with.
• 1986 The application area of MLP networks remained rather limited until the breakthrough when a general backpropagation algorithm for the multi-layered perceptron was introduced by Rumelhart and McClelland.
• 1988 Neocognitron, character recognition (Fukushima)
BIOLOGICAL NEURON

• Has 3 parts:
– Dendrites (hair-like structures): collect stimuli from the neighboring neurons and pass them on to the soma.
– Soma or cell body (processing element of the neuron): accumulates the stimuli received through the dendrites.
– Axon (long cylindrical fibre): carries the electric impulses (stimuli) of the neuron to neighboring neurons.
• The small gap between an axon terminal and the adjacent dendrite of the neighboring neuron is called a synapse.
• An electric impulse is transmitted across the synaptic gap by means of an electrochemical process.
QUIZ TIME

Go to www.menti.com and use the code 84 65 77 4

https://www.menti.com/75pm97s2y7



BIOLOGICAL NEURON
• The synaptic gap scales the input information by a weight. If the input signal is x and the synaptic weight is w, then the stimulus that finally reaches the soma is (x * w).

• This weight w, together with the other synaptic weights, embodies the knowledge stored in the network of neurons.

• Synapses are of two types:
✓ A positive weight corresponds to an excitatory synapse (causes the firing of the receiving cell).
✓ A negative weight corresponds to an inhibitory synapse (hinders the firing of the receiving cell).

• A neuron fires when the weighted total of the impulses it receives exceeds the threshold value.
• After firing, the cell has to wait for a period of time called the refractory period before it can fire again.
How do our brains work?
▪ The brain is a massively parallel information processing system.
▪ Our brains are a huge network of processing elements. A typical brain contains a network of 10 billion neurons.


How do our brains work?
▪ A processing element

A neuron is connected to other neurons through about 10,000 synapses.


How do our brains work?
▪ A processing element

A neuron receives input from other neurons. Inputs are combined. Once the input exceeds a critical level, the neuron discharges a spike, an electrical pulse that travels from the body, down the axon, to the next neuron(s).
How do our brains work?
▪ A processing element

The axon endings almost touch the dendrites or cell body of the next neuron. Transmission of an electrical signal from one neuron to the next is effected by neurotransmitters.
How do our brains work?
▪ A processing element

Neurotransmitters are chemicals which are released from the first neuron and which bind to the second.


How do our brains work?
▪ A processing element

This link is called a synapse. The strength of the signal that reaches the next neuron depends on factors such as the amount of neurotransmitter available.
How do ANNs work?

An artificial neuron is an imitation of a human neuron


FROM HUMAN NEURONS TO ARTIFICIAL
NEURONS

ARTIFICIAL NEURON



QUIZ TIME

• Who proposed the first perceptron model in 1958?


A. McCulloch-Pitts
B. Marvin Minsky
C. Hopfield
D. Rosenblatt

Correct Answer: D



How do ANNs work?
• Now, let us have a look at the model of an artificial neuron.



ARCHITECTURE OF SIMPLE ANN
• X1 and X2 are input neurons which transmit signals, and Y is the output neuron which receives signals.
• The input neurons are connected to the output neuron over weighted interconnection links (w1 and w2).

[Figure: inputs X1 and X2 connected to the output neuron Y through weights w1 and w2]

• Net input: yin = x1w1 + x2w2, where x1 and x2 are the activations of the input neurons.
• The output y of the output neuron is obtained by applying activation over the net input: y = f(yin).
• The function to be applied over the net input is called the activation function.
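A minimal Python sketch of this computation; the example input and weight values, and the choice of a binary step activation with threshold 0, are assumptions for illustration:

```python
def net_input(x, w):
    """Net input y_in = x1*w1 + x2*w2 + ... (weighted sum of inputs)."""
    return sum(xi * wi for xi, wi in zip(x, w))

def binary_step(y_in, theta=0.0):
    """Example activation function: output 1 if y_in >= theta, else 0."""
    return 1 if y_in >= theta else 0

x = [0.3, 0.5]   # activations of input neurons X1, X2 (assumed values)
w = [0.2, 0.1]   # weights w1, w2 (assumed values)
y_in = net_input(x, w)
y = binary_step(y_in)
print(y_in, y)   # ~0.11, then 1
```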
QUIZ

• What is the shape of dendrites?


a) oval
b) round
c) tree
d) rectangular

• Correct answer: c



FOR THE FOLLOWING NETWORK CALCULATE THE NET INPUT
GIVEN TO THE OUTPUT NEURON.

[x1, x2, x3] = [0.3, 0.5, 0.6]
[w1, w2, w3] = [0.2, 0.1, −0.3]
The net input can be calculated as:
yin = x1w1 + x2w2 + x3w3
yin = 0.3 × 0.2 + 0.5 × 0.1 + 0.6 × (−0.3) = 0.06 + 0.05 − 0.18 = −0.07


NEURAL NET OF PURE LINEAR EQN.

[Figure: input X connected to output Y through weight m; output = mx]

The calculation of the net input is similar to the calculation of the output of a pure linear straight-line equation (y = mx).
COMPARISON BETWEEN BRAIN AND COMPUTER

Criterion           | Brain                                                                     | ANN
Speed               | A few ms                                                                  | A few ns; massive parallel processing
Size and complexity | 10^11 neurons & 10^15 interconnections                                    | Depends on the designer
Storage capacity    | Stores information in its interconnections (synapses); no loss of memory | Contiguous memory locations; loss of memory may happen sometimes
Tolerance           | Has fault tolerance                                                       | No fault tolerance; information gets disrupted when interconnections are disconnected
Control mechanism   | Complicated; involves chemicals in the biological neuron                  | Simpler in ANN
BASIC MODELS OF ANN
• The models of ANN are specified by the 3 basic entities, namely:
– Interconnections
– Learning rules
– Activation function


CLASSIFICATION BASED ON
INTERCONNECTIONS
• The arrangement of neurons to form layers and the connection pattern formed within and between layers is called the network architecture. There exist 5 basic types of neuron connection architecture:
– Single-layer feed-forward network
– Multilayer feed-forward network
– Single node with its own feedback
– Single-layer recurrent network
– Multilayer recurrent network


FEEDFORWARD, FEEDBACK AND
RECURRENT NETWORKS
• A network is said to be a feedforward network if no neuron in the output layer is an input to a node in the same layer or in the preceding layer.

• When outputs can be directed back as inputs to same-layer or preceding-layer nodes, the result is a feedback network.

• Recurrent networks are feedback networks with closed loops.


QUIZ

• Where do the chemical reactions take place in a neuron?


a) dendrites
b) axon
c) synapses
d) nucleus

Correct answer: c

SINGLE LAYER FEEDFORWARD
NETWORK

• When a layer is formed, inputs are connected to output nodes with various weights.
• This results in a single-layer feed-forward network.
• The input layer receives and buffers the input signal; the output layer generates the output of the network.
QUIZ

• What is the estimated number of neurons in the human cortex?

a) 10^8
b) 10^5
c) 10^11
d) 10^20

• Correct answer: c


MULTILAYER FEED FORWARD NETWORK
• It is formed by the interconnection of several layers.
• Any layer between the input and output layers is a hidden layer.
• A hidden layer is internal to the network and has no direct contact with the external environment.
• Number of hidden layers: zero or more.
• More hidden layers ➔ more complexity, but they may provide a more efficient output response.
• In a fully connected network, every output from one layer is connected to each and every node in the next layer.
[Figure: multilayer feed-forward network with input, hidden and output layers]

FEEDBACK NETWORK
• When outputs are connected back as inputs to the same or a preceding layer, it is a feedback network.
• If the feedback of the output is directed back to the same layer, it is called lateral feedback.
SINGLE NODE WITH OWN FEEDBACK

[Figure: a single node whose output is fed back to its own input]


RECURRENT NETWORKS

• Recurrent networks are feedback networks with closed loops.
• They can be single layer or multilayer.
• Single-layer network: the output of a processing element can be directed back to itself, to other processing elements, or to both.
SINGLE LAYER RECURRENT
NETWORKS
A single-layer network with a feedback connection in which a PE's output can be directed back to the PE itself, to other PEs, or to both.


MULTI LAYER RECURRENT NETWORKS
• Multilayer recurrent network: the output of a processing element is directed back to nodes in a preceding layer or in the same layer.
LATERAL FEEDBACK
• If the feedback of the outputs of PEs is directed back as input to PEs in the same layer, it is called lateral feedback.
• In this structure, each processing neuron receives two different classes of inputs: "excitatory inputs" (connections with hollow circles) from nearby PEs and "inhibitory inputs" (connections with solid circles) from more distantly located PEs.
LEARNING



TRAINING
• The process of modifying the weights in the connections between
network layers with the objective of achieving the expected output is
called training a network.
• This is achieved through
– Supervised learning
– Unsupervised learning
– Reinforcement learning



SUPERVISED LEARNING (learning with the help of a teacher)

• E.g. the learning process of a small child: the child doesn't know how to read or write, and each of its actions is supervised by a teacher.
• Each input vector requires a corresponding target vector.
• Training pair = [input vector, target vector]

[Figure: input X is applied to the neural network (weights W), producing the actual output Y; an error signal generator compares Y with the desired output D and feeds the error signals (D − Y) back to the network]


SUPERVISED LEARNING (learning with the help of a teacher)

• During training, the input vector is presented to the network, which results in an output vector.
• The actual output vector is then compared with the desired (target) output vector. If there is a difference between the two output vectors, an error signal is generated by the network.
• This error signal is used to adjust the weights until the actual output matches the desired (target) output.
• It is assumed that the correct target output values are known for each input pattern.
UNSUPERVISED LEARNING (learning without the help of a teacher)

• E.g. how a fish or tadpole learns to swim.
• All similar input patterns are grouped together as clusters.
• If a matching input pattern is not found, a new cluster is formed.


UNSUPERVISED LEARNING

• There is no feedback from the environment to inform the network what the output should be or whether the outputs are correct.
• The network must itself discover patterns, regularities, features or categories from the input data.
• While doing so, the network undergoes changes in its parameters. This process is called self-organizing, in which exact clusters are formed by discovering similarities and dissimilarities among the objects.
REINFORCEMENT LEARNING (learning with a critic)

[Figure: input X is applied to the NN (weights W), producing the actual output Y; an error signal generator receives the reinforcement signal R and sends error signals back to the network]


WHEN REINFORCEMENT LEARNING
IS USED?
• Reinforcement learning is used when less information is available about the target output values (only critic information, not exact information).
• Learning based on this critic information is called reinforcement learning, and the feedback sent is called the reinforcement signal.
• Reinforcement learning is a form of supervised learning, but the feedback here is only evaluative, not instructive: the right answer is not provided, only an indication of whether the output is 'right' or 'wrong'.


QUIZ

A Neuron can send ________ signal at a time.


a. Multiple
b. Two
c. One
d. Any number of

Correct answer: c



QUIZ

The cell body of a neuron is analogous to what mathematical operation?


a. summing
b. differentiator
c. integrator
d. Difference

Correct answer: a



WHAT IS AN ACTIVATION
FUNCTION?
• An activation function is a function that is added into an
artificial neural network in order to help the network
learn complex patterns in the data.
• Compared with the neuron-based model in our brains, the activation function is what ultimately decides what is to be fired to the next neuron. That is exactly what an activation function does in an ANN as well.
• It takes in the output signal from the previous cell and converts it into some form that can be taken as input to the next cell.
ACTIVATION FUNCTIONS
Activation functions:

[Figure: plots of the activation functions (A) to (F)]

(A) Identity (linear)
(B) Binary step (hardlimit or unipolar binary)
(C) Bipolar step (symmetrical hardlimit or bipolar binary)
(D) Binary sigmoid (unipolar continuous, unipolar sigmoid or logistic sigmoid)
(E) Bipolar sigmoid (bipolar continuous)
(F) Ramp (saturating linear)
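A minimal Python sketch of these six functions; the default threshold 0 and steepness lam = 1 are assumptions for illustration:

```python
import math

def identity(x):                 # (A) linear
    return x

def binary_step(x, theta=0.0):   # (B) 1 if x >= theta, else 0
    return 1 if x >= theta else 0

def bipolar_step(x, theta=0.0):  # (C) +1 if x >= theta, else -1
    return 1 if x >= theta else -1

def binary_sigmoid(x, lam=1.0):  # (D) logistic sigmoid, range (0, 1)
    return 1.0 / (1.0 + math.exp(-lam * x))

def bipolar_sigmoid(x, lam=1.0): # (E) bipolar continuous, range (-1, 1)
    return (1.0 - math.exp(-lam * x)) / (1.0 + math.exp(-lam * x))

def ramp(x):                     # (F) saturating linear, clipped to [0, 1]
    return 0.0 if x < 0 else (1.0 if x > 1 else x)
```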


QUIZ

The internal state of a neuron is called __________; it is a function of the inputs the neuron receives.
a. Weight
b. Activation or activity level of neuron
c. Bias
d. Node

Correct answer: b



IMPORTANT TERMINOLOGIES OF
ANNS

• Weights
• Bias
• Threshold
• Learning rate
• Momentum factor
• Vigilance parameter
• Notations used in ANN



WEIGHTS
Each neuron is connected to every other neuron by means of directed links. Links are associated with weights. Weights contain information about the input signal; this information is used by the net to solve a problem. The weights are represented as a matrix. The weight matrix is also called the connection matrix:

$$W = \begin{bmatrix} w_1^T \\ w_2^T \\ \vdots \\ w_n^T \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1m} \\ w_{21} & w_{22} & \cdots & w_{2m} \\ \vdots & \vdots & & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nm} \end{bmatrix}$$


BIAS
• Bias included in the network has its impact in calculating the net input.
• Bias is included by adding a component x0=1 to the input vector X.
Thus the input vector becomes
X=(1,X1,X2…Xi,…Xn)
• The bias is considered like another weight, i.e. w0j = bj.


BIAS
• wij is the weight from processing element "i" (source node) to processing element "j" (destination node).

[Figure: simple net with bias: inputs X1, ..., Xi, ..., Xn connected to output Yj through weights w1j, wij, wnj, plus bias bj = w0j on the fixed input x0 = 1]

The net input to the output neuron is calculated as

$$y_{in_j} = \sum_{i=0}^{n} x_i w_{ij} = x_0 w_{0j} + x_1 w_{1j} + x_2 w_{2j} + \dots + x_n w_{nj} = w_{0j} + \sum_{i=1}^{n} x_i w_{ij} = b_j + \sum_{i=1}^{n} x_i w_{ij}$$

The activation function is applied over this net input to calculate the output.
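A minimal sketch of the net input with bias in Python; the input, weight and bias values are assumptions for illustration:

```python
def net_input_with_bias(x, w, b):
    """y_in_j = b_j + sum_i x_i * w_ij (bias treated as weight w_0j on x_0 = 1)."""
    return b + sum(xi * wi for xi, wi in zip(x, w))

# Example values (assumed):
x = [0.3, 0.5, 0.6]
w = [0.2, 0.1, -0.3]
b = 0.45
print(net_input_with_bias(x, w, b))  # 0.45 + (-0.07) = ~0.38
```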
BIAS
• The bias can also be explained as follows: in a straight-line equation y = mx + c,
– y → output
– x → input
– m → weight
– c → bias

[Figure: input x passes through weight m to produce output y = mx + c, with bias c]

• Bias plays a major role in determining the output of the network.
• Two types:
– Positive bias: increases the net input
– Negative bias: decreases the net input
QUIZ

Which function is a continuous function that varies gradually between the values 0 and 1 or −1 and +1?

A) Linear function
B) Sigmoidal function
C) Thresholding function
D) Activation function

Correct answer: b



THRESHOLD

• The threshold θ is a set value which the net input must exceed for the neuron to fire; the calculated net input is compared against it, e.g. y = 1 if yin ≥ θ, and y = 0 (or −1 for a bipolar neuron) otherwise.


LEARNING RATE

• The learning rate (denoted c, α or η) controls the amount of weight adjustment at each step of training; it determines how fast the weights of the NN change and usually lies between 0 and 1.


QUIZ

The ------------ determines how fast the weights of the NN change.

A) Learning rate
B) Bias
C) Activation function
D) Momentum

Correct answer: a



MOMENTUM FACTOR AND VIGILANCE PARAMETER

• Momentum factor: a fraction of the previous weight change is added to the current weight change so that convergence is faster; it is used mainly in the backpropagation network.
• Vigilance parameter: used in adaptive resonance theory (ART) networks to control the degree of similarity required for an input pattern to be placed in an existing cluster.


MCCULLOCH-PITTS NEURON
(LINEAR THRESHOLD GATE MODEL)
(By Warren McCulloch and Walter Pitts)

• The MP neuron receives binary inputs over excitatory (positive-weight) or inhibitory (negative-weight) connections.
• It fires (output 1) when the weighted sum of its inputs reaches or exceeds the threshold θ; otherwise its output is 0.


FEATURES OF MCCULLOCH-PITTS
MODEL

• Allows binary (0, 1) states only
• Operates under a discrete-time assumption
• Weights and the neurons' thresholds are fixed in the model, and there is no interaction among network neurons
• Just a primitive model
QUIZ

What does a negative sign of the weight indicate?


a. excitatory input
b. inhibitory output
c. excitatory output
d. inhibitory input

Correct answer: d



LOGIC GATES WITH MP
NEURONS
• We can use McCulloch-Pitts neurons to implement the basic logic gates.

• All we need to do is find the appropriate connection weights and neuron thresholds to produce the right outputs for each set of inputs.

• Logical AND: weights w1 = 1, w2 = 1, threshold θ = 2

x1 x2 | y
0  0  | 0
0  1  | 0
1  0  | 0
1  1  | 1
LOGIC GATES WITH MP NEURONS

• Logical OR: weights w1 = 2, w2 = 2 with threshold θ = 2 (or w1 = 1, w2 = 1 with θ = 1)

x1 x2 | y
0  0  | 0
0  1  | 1
1  0  | 1
1  1  | 1
LOGIC GATES WITH MP NEURONS

• Logical AND NOT (y = x1 AND (NOT x2)): weights w1 = 1, w2 = −1 with threshold θ = 1 (or w1 = 2, w2 = −1 with θ = 2)

x1 x2 | y
0  0  | 0
0  1  | 0
1  0  | 1
1  1  | 0
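A minimal Python sketch of a McCulloch-Pitts neuron, verifying the weights and thresholds chosen above for the three gates:

```python
def mp_neuron(x1, x2, w1, w2, theta):
    """McCulloch-Pitts neuron: fires (1) iff the weighted sum reaches threshold theta."""
    return 1 if x1 * w1 + x2 * w2 >= theta else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        and_y = mp_neuron(x1, x2, 1, 1, 2)       # logical AND
        or_y = mp_neuron(x1, x2, 2, 2, 2)        # logical OR
        andnot_y = mp_neuron(x1, x2, 1, -1, 1)   # logical AND NOT
        print(x1, x2, "|", and_y, or_y, andnot_y)
```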
QUIZ

What was the name of the first model which could perform a weighted sum of inputs?
a. McCulloch-Pitts neuron model
b. Marvin Minsky neuron model
c. Hopfield model of neuron
d. Perceptron

Correct answer: a




LINEAR SEPARABILITY

The decision line is drawn separating the positive response region from the negative response region.
LINEAR SEPARABILITY
• Separation of the input space into regions is based on whether the network response is positive or negative.
• The line of separation is called the linear-separable line.
• Examples:
– The AND and OR functions are linearly separable.
– The XOR function is linearly inseparable.

• Problems with input patterns which can be classified using a single hyperplane are said to be linearly separable.
• Problems (such as XOR) which cannot be classified in this way are said to be non-linearly separable.
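A small Python sketch showing why this matters; the training loop, threshold at 0 and learning constant are assumptions for illustration. A single-layer perceptron converges on the linearly separable AND function but never finds a separating line for XOR:

```python
def train(data, epochs=100, c=0.1):
    """Perceptron with bias: returns weights if a separating line is found, else None."""
    w = [0.0, 0.0, 0.0]                      # [w1, w2, bias]
    for _ in range(epochs):
        errors = 0
        for (x1, x2), d in data:
            o = 1 if w[0]*x1 + w[1]*x2 + w[2] >= 0 else 0
            if o != d:
                errors += 1
                w = [w[0] + c*(d-o)*x1, w[1] + c*(d-o)*x2, w[2] + c*(d-o)]
        if errors == 0:
            return w                         # converged: linearly separable
    return None                              # no separating line found

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(train(AND))   # a valid separating line, e.g. [0.2, 0.1, -0.3]
print(train(XOR))   # None: XOR is not linearly separable
```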


QUIZ

When two classes can be separated by a straight line, they are known as?
a. linearly separable
b. linearly inseparable classes
c. may be separable or inseparable, it depends on system
d. inseparable

Correct answer: a



PERCEPTRON NETWORKS (ROSENBLATT)
• It is a single-layer feedforward network.
• It consists of 3 units:
1) Sensory unit (input unit)
2) Associator unit (hidden unit)
3) Response unit (output unit)
PERCEPTRON NETWORK
• Sensory unit:
– It is the input unit (input layer).
– It is connected to the associator unit with fixed weight values (1, −1, 0).
• Associator unit:
– This unit acts as an intermediary between the sensory and response units.
– It represents the hidden layer, where intermediate operations are performed.
• Response unit:
– It has an activation of 1, 0 or −1.
– It is used to perform the updating of the synaptic weights.
– The output is a binary signal.
QUIZ

A simple perceptron is

A) auto-associative neural network


B) Competitive network
C) Multilayer feed-back network
D) a single layer feed-forward neural network

Correct answer: d



PERCEPTRON LEARNING RULE (SUPERVISED)
• The learning signal is the difference between the desired and actual neuron response, r = d − o.
• The weight change is Δw_i = c (d − o) x_i, where c is the learning constant.


UNIVERSITY QUESTION
• For a bipolar binary neuron, show what the weight change formula for the perceptron training rule reduces to.

• For a unipolar binary neuron, show what the weight change formula for the perceptron training rule reduces to.



UNIVERSITY QUESTION
Use the perceptron learning rule to compute the weights after one iteration for the data given below:
X1 = [1 −2 0 −1]^T; X2 = [0 1.5 −0.5 −1]^T; X3 = [−1 1 0.5 −1]^T.
Initial weight W1 = [1 −1 0 0.5]. The learning constant is given by c = 0.1.
The teacher's desired responses for X1, X2, X3 are [−1, −1, 1] respectively.
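A minimal Python sketch of one pass of the rule over this data, assuming the bipolar sign activation o = sign(w · x) with sign(0) taken as +1:

```python
def sign(net):
    """Bipolar binary activation (sign(0) assumed to be +1)."""
    return 1 if net >= 0 else -1

def train_one_pass(w, samples, c=0.1):
    """One iteration of the perceptron rule: w <- w + c * (d - o) * x."""
    for x, d in samples:
        net = sum(wi * xi for wi, xi in zip(w, x))
        o = sign(net)
        w = [wi + c * (d - o) * xi for wi, xi in zip(w, x)]
    return w

samples = [([1, -2, 0, -1], -1),
           ([0, 1.5, -0.5, -1], -1),
           ([-1, 1, 0.5, -1], 1)]
print(train_one_pass([1, -1, 0, 0.5], samples))
# ~[0.6, -0.4, 0.1, 0.5]: updates occur on X1 and X3; X2 is already classified correctly
```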


EXAMPLE WITH BIAS



DELTA LEARNING RULE (CONTINUOUS
PERCEPTRON LEARNING RULE)
Widrow and Hoff introduced the Delta Learning Rule, which uses a non-linear differentiable activation function for training MLP networks.


DELTA LEARNING RULE
• The delta learning rule is valid only for continuous activation functions, since we can easily take the derivative of such activation functions.
• The learning signal is called delta and is defined as r = (d − o) f′(net), which gives the weight change Δw_i = c (d − o) f′(net) x_i.
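A minimal sketch of a single delta-rule update in Python, assuming a bipolar sigmoid activation; the input, weight, desired output and learning constant values are assumptions for illustration:

```python
import math

def f(net):                        # bipolar sigmoid activation
    return (1 - math.exp(-net)) / (1 + math.exp(-net))

def f_prime(o):                    # its derivative, expressed via the output o = f(net)
    return 0.5 * (1 - o * o)

x = [1.0, -2.0, 0.0, -1.0]         # input vector (assumed)
w = [1.0, -1.0, 0.0, 0.5]          # current weights (assumed)
d, c = -1.0, 0.1                   # desired output and learning constant (assumed)

net = sum(wi * xi for wi, xi in zip(w, x))
o = f(net)
delta = (d - o) * f_prime(o)       # learning signal r = (d - o) f'(net)
w = [wi + c * delta * xi for wi, xi in zip(w, x)]
print(w)                           # ~[0.974, -0.948, 0.0, 0.526]
```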


DELTA LEARNING RULE: PROBLEM

Solved on OneNote


WHY DO WE NEED BACKPROPAGATION?
• While designing a neural network, in the beginning we initialize the weights with some random values.
• Obviously, it is not necessary that whatever weight values we have selected will be correct, or that they fit our model best.
• With the initially selected weights, the model output may be way different from the actual output, i.e. the error value is huge.
• Now, how do we reduce the error?
• Basically, we need to get the model to change its parameters (weights) in such a way that the error becomes minimal.
BACKPROPAGATION
ALGORITHM



BACKPROPAGATION

• Backpropagation, short for "backward propagation of errors", is a mechanism used to update the weights using gradient descent. It calculates the gradient of the error function with respect to the neural network's weights. The calculation proceeds backwards through the network.
• Gradient descent is an iterative optimization algorithm for finding the minimum of a function; in our case we want to minimize the error function. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient of the function at the current point.

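A minimal backpropagation sketch in Python for a 2-2-1 network with sigmoid units; the XOR training data, learning rate, epoch count and random seed are assumptions for illustration (a different seed may need more epochs to converge):

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # hidden weights (+bias)
w_o = [random.uniform(-1, 1) for _ in range(3)]                      # output weights (+bias)
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]          # XOR (assumed example)
eta = 0.5                                                            # learning rate (assumed)

for epoch in range(10000):
    for x, d in data:
        # Forward pass
        h = [sigmoid(w[0]*x[0] + w[1]*x[1] + w[2]) for w in w_h]
        y = sigmoid(w_o[0]*h[0] + w_o[1]*h[1] + w_o[2])
        # Backward pass: deltas = error * derivative of the sigmoid
        delta_o = (d - y) * y * (1 - y)
        delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j]) for j in range(2)]
        # Gradient-descent weight updates
        w_o = [w_o[0] + eta*delta_o*h[0], w_o[1] + eta*delta_o*h[1], w_o[2] + eta*delta_o]
        for j in range(2):
            w_h[j] = [w_h[j][0] + eta*delta_h[j]*x[0],
                      w_h[j][1] + eta*delta_h[j]*x[1],
                      w_h[j][2] + eta*delta_h[j]]

for x, d in data:
    h = [sigmoid(w[0]*x[0] + w[1]*x[1] + w[2]) for w in w_h]
    y = sigmoid(w_o[0]*h[0] + w_o[1]*h[1] + w_o[2])
    print(x, d, round(y, 2))
```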


ARCHITECTURE OF BACKPROPAGATION NETWORK


Apply backpropagation to find the weights of the network. Desired output d = 0.5 and learning rate η = 1.


QUIZ

Back propagation algorithm is based on

A) Evolutionary algorithms
B) Particle swarm optimization
C) Genetic algorithms
D) Gradient descent method

Correct answer: d


UNSUPERVISED LEARNING

• We can include additional structure in the network so that the net is forced to make a decision as to which one unit will respond.

• The mechanism by which this is achieved is called competition.

• It can be used in unsupervised learning.

• A common use of unsupervised learning is clustering-based neural networks.
UNSUPERVISED LEARNING

• In a clustering net, there are as many input units as the input vector has components.

• Every output unit represents a cluster, and the number of output units limits the number of clusters.

• During training, the network finds the best matching output unit for the input vector.

• The weight vector of the winner is then updated according to the learning algorithm.
INTRODUCTION TO SOM
• Introduced by Prof. Teuvo Kohonen in 1982
• Also known as the Kohonen feature map
• An unsupervised neural network
• A clustering tool for high-dimensional and complex data
KOHONEN SOM (SELF ORGANIZING MAPS)

• Since it operates in an unsupervised environment, it is named the Self-Organizing Map.

• Self-organizing NNs are also called Topology Preserving Maps, which leads to the idea of the neighborhood of the clustering unit.

• During the self-organizing process, the weight vectors of the winning unit and its neighbors are updated.
• Normally, the Euclidean distance measure is used to find the cluster unit whose weight vector matches the input vector most closely.
KOHONEN SOM (SELF ORGANIZING MAPS)

• Structure of neighborhoods:
[Figure: the neighborhoods of the unit designated by "*" of radii r1 = 2, r2 = 1 and r3 = 0 (r1 > r2 > r3)]
KOHONEN SOM (SELF ORGANIZING MAPS)

• Structure of neighborhoods:
– For a rectangular grid, each unit has 8 nearest neighbors.
– For a hexagonal grid, there are 6 neighbors for each unit.
KOHONEN SOM (SELF ORGANIZING MAPS)

• Architecture of SOM: a SOM consists of 2 layers, an input layer and an output (cluster) layer.
• There are n units in the input layer and m units in the output layer.
• The winning unit is identified by using either the dot product or the Euclidean distance method, and the weight update using the Kohonen learning rule is performed on the winning cluster unit.
KOHONEN SELF ORGANIZING MAPS

[Figure: SOM architecture with input vector X connected to the Kohonen layer; neuron i has weight vector wi; the winning neuron is highlighted]
KOHONEN SOM (SELF ORGANIZING MAPS)

• Neighborhoods do not wrap around from one side of the grid to the other side; missing units are simply ignored.
• Algorithm:
– The radius and learning rate may be decreased after each epoch.
– The learning rate decrease may be either linear or geometric.
PROBLEM: KSOFM

• Construct a KSOFM to cluster the 4 given vectors [0 0 1 1], [1 0 0 0], [0 1 1 0] and [0 0 0 1]. The number of clusters to be formed is 2. Assume an initial learning rate of 0.5.
• Number of inputs n = 4 and number of clusters m = 2.
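A minimal Python sketch of the Kohonen clustering for this problem; the random initial weights, the epoch count and the epoch-wise halving of the learning rate are assumptions for illustration (with radius 0, only the winning unit is updated):

```python
import random

random.seed(1)
vectors = [[0, 0, 1, 1], [1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
m = 2                                     # number of cluster units
weights = [[random.random() for _ in range(4)] for _ in range(m)]
alpha = 0.5                               # initial learning rate

for epoch in range(10):
    for x in vectors:
        # Winner: cluster unit with the smallest squared Euclidean distance
        dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
        j = dists.index(min(dists))
        # Kohonen learning rule: move the winner's weights towards the input
        weights[j] = [wi + alpha * (xi - wi) for wi, xi in zip(weights[j], x)]
    alpha *= 0.5                          # decrease the learning rate each epoch

print(weights)                            # the two cluster weight vectors
```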


ANN Applications

• Medical applications
• Information searching & retrieval
• Chemistry
• Education
• Business & management
Applications of ANNs

• Signal processing
• Pattern recognition, e.g. handwritten characters or face
identification.
• Diagnosis or mapping symptoms to a medical case.
• Speech recognition
• Human Emotion Detection
• Educational Loan Forecasting

