
UNIT II

Supervised Learning II

Course: Soft Computing


By: Dr P Indira Priyadarsini



MULTILAYER FEEDFORWARD NEURAL NETWORK ARCHITECTURES
Multilayer networks solve the classification problem for nonlinear sets by employing hidden layers, whose neurons are not directly connected to the output. The additional hidden layers can be interpreted geometrically as additional hyperplanes, which enhance the separation capacity of the network. Figure 2.2 shows typical multilayer network architectures.
This new architecture raises a new question: how to train the hidden units, for which the desired output is not known. The backpropagation algorithm offers a solution to this problem.



Input Nodes – The Input nodes provide information from the outside world to the
network and are together referred to as the “Input Layer”. No computation is
performed in any of the Input nodes – they just pass on the information to the hidden
nodes.
Hidden Nodes – The Hidden nodes have no direct connection with the outside world
(hence the name “hidden”). They perform computations and transfer information
from the input nodes to the output nodes. A collection of hidden nodes forms
a “Hidden Layer”.
While a feedforward network will only have a single input layer and a single output
layer, it can have zero or multiple Hidden Layers.
Output Nodes – The Output nodes are collectively referred to as the “Output Layer”
and are responsible for computations and transferring information from the network
to the outside world.
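To make the three roles concrete, here is a minimal forward-pass sketch in Python/NumPy; the layer sizes, weight values, and the sigmoid activation are illustrative assumptions, not values from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: 3 input nodes, 4 hidden nodes, 2 output nodes.
rng = np.random.default_rng(0)
V = rng.uniform(-0.5, 0.5, size=(4, 3))   # hidden-layer weights (assumed range)
W = rng.uniform(-0.5, 0.5, size=(2, 4))   # output-layer weights (assumed range)

x = np.array([0.2, 0.7, 0.1])  # input layer: no computation, just passes values on
h = sigmoid(V @ x)             # hidden layer: weighted sums followed by activation
y = sigmoid(W @ h)             # output layer: the network's answer to the outside world
print(y)
```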



The training occurs in a supervised style. The basic idea is to present the input vector to the network and calculate, in the forward direction, the output of each layer and the final output of the network.
For the output layer the desired values are known, and therefore the weights can be adjusted as for a single-layer network; in the case of the BP algorithm, according to the gradient descent rule.
To calculate the weight changes in the hidden layer, the error in the output layer is back-propagated to these layers according to the connecting weights.
This process is repeated for each sample in the training set. One cycle through the training set is called an epoch.
The number of epochs needed to train the network depends on various parameters, especially on the error calculated in the output layer.
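This per-sample loop and the epoch loop can be sketched as follows. This is a schematic only: the `network` object with `forward` and `backward` methods is an assumed stand-in for the computations detailed in the algorithm steps later, and the error threshold is an assumed stopping criterion:

```python
import numpy as np

def train(network, samples, targets, error_threshold=0.001, max_epochs=10_000):
    """Schematic supervised training loop: one pass over all samples = one epoch."""
    for epoch in range(max_epochs):
        sum_squared_error = 0.0
        for x, d in zip(samples, targets):
            y = network.forward(x)                  # forward direction, layer by layer
            sum_squared_error += float(np.sum((d - y) ** 2))
            network.backward(x, d)                  # back-propagate error, adjust weights
        if sum_squared_error < error_threshold:     # stop once the output error is small
            return epoch + 1                        # number of epochs actually needed
    return max_epochs
```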

The following description of the backpropagation algorithm is based on the descriptions in [rume86], [faus94], and [patt96].

The assumed architecture is depicted in Figure 2.3. The input vector has n dimensions, the output vector has m dimensions, the bias (the constant input used) is -1, and there is one hidden layer with g neurons. The matrix V holds the weights of the neurons in the hidden layer. The matrix W defines the weights of the neurons in the output layer. The learning parameter is η, and the momentum is α.
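A short sketch of the corresponding dimension bookkeeping may help; the concrete sizes, seed, and initialization range below are assumptions, while the names V, W, η, α, and the bias input -1 follow the text:

```python
import numpy as np

n, g, m = 2, 3, 1            # assumed sizes: n inputs, g hidden neurons, m outputs
eta, alpha = 0.1, 0.9        # assumed values for learning parameter and momentum
rng = np.random.default_rng(1)

# One extra column per matrix carries the weight on the constant bias input -1.
V = rng.uniform(-0.5, 0.5, size=(g, n + 1))   # hidden-layer weight matrix
W = rng.uniform(-0.5, 0.5, size=(m, g + 1))   # output-layer weight matrix

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.0, 1.0])              # an n-dimensional input vector
h = sigmoid(V @ np.append(x, -1.0))   # append the constant bias input -1
y = sigmoid(W @ np.append(h, -1.0))   # m-dimensional output vector
print(y.shape)                        # (1,)
```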
BACKPROPAGATION
• In machine learning, gradient descent and backpropagation often appear at the same time, and the two terms are sometimes used interchangeably.

• Backpropagation can be considered a part of gradient descent training: it is the procedure that implements gradient descent in multi-layer neural networks.

• Backpropagation, also named the generalized delta rule, is an algorithm used in the training of ANNs for supervised learning (generalizations exist for other artificial neural networks). It efficiently computes the gradient of the error function with respect to the weights of the network for a single input-output example.

• This makes it feasible to use gradient methods for training multi-layer networks, updating the weights to minimize loss.

• Since the same training rule is applied recursively for each layer of the neural network, we can calculate the contribution of each weight to the total error, working backward from the output layer to the input layer.
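Since backpropagation is simply an efficient way of computing the gradient of the error function, its result can be checked against a finite-difference approximation. A minimal single-neuron sketch of this idea, with all numbers assumed for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, d = np.array([0.5, -0.3]), 1.0    # one input-output example (assumed values)
w = np.array([0.2, 0.8])             # weights of a single sigmoid neuron (assumed)

def error(w):
    return 0.5 * (d - sigmoid(w @ x)) ** 2   # squared error for this example

# Backpropagation-style analytic gradient: dE/dw_i = -(d - y) * y * (1 - y) * x_i
y = sigmoid(w @ x)
grad_bp = -(d - y) * y * (1 - y) * x

# Central finite-difference approximation of the same gradient.
eps = 1e-6
grad_fd = np.array([(error(w + eps * np.eye(2)[i]) - error(w - eps * np.eye(2)[i]))
                    / (2 * eps) for i in range(2)])
print(np.allclose(grad_bp, grad_fd))   # True: both compute the same dE/dw
```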



Back Propagation Neural Networks (continued…)
• A back propagation neural network (BPN) is a multilayer neural network consisting of an input layer, at least one hidden layer, and an output layer.
• As its name suggests, back-propagation of the error takes place in this network.
• The error, which is calculated at the output layer by comparing the target output and the actual output, is propagated back towards the input layer.

Architecture
• As shown in the diagram, the architecture of a BPN has three interconnected layers with weights on the connections between them.
• The hidden layer and the output layer also carry a bias term, fed by a constant input; its weight is adjusted during training like any other weight.
• As is clear from the diagram, the BPN works in two phases.
• One phase sends the signal from the input layer to the output layer, and the other phase back-propagates the error from the output layer to the input layer.



BACKPROPAGATION (CONTINUED…)
• Simply put, we propagate the total error backward through the connections in the network layer by layer, calculate the contribution (gradient) of each weight and bias to the total error in every layer, and then use the gradient descent algorithm to optimize the weights and biases, eventually minimizing the total error of the neural network.

• The back propagation algorithm has two phases:

• Forward pass phase: feed-forward propagation of the input pattern signals through the network, from the inputs towards the network outputs.

• Backward pass phase: computes the ‘error signal’ – propagation of the error (the difference between the actual and desired output values) backwards through the network, starting from the output units towards the input units.

• Visualizing this can help in understanding how the backpropagation algorithm works step by step.



The back-propagation training algorithm
Step 1: Initialization
Set all the weights and threshold levels of the network to random numbers uniformly distributed inside a small range:

$\left( -\dfrac{2.4}{F_i},\; +\dfrac{2.4}{F_i} \right)$

where $F_i$ is the total number of inputs of neuron $i$ in the network. The weight initialization is done on a neuron-by-neuron basis.
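A direct rendering of this initialization rule in Python/NumPy; the layer sizes and the random seed are assumed:

```python
import numpy as np

def init_weights(fan_in, n_neurons, rng):
    """Uniform weights in (-2.4/Fi, +2.4/Fi), where Fi = number of inputs of the neuron."""
    bound = 2.4 / fan_in
    # One row per neuron; the extra column holds the threshold (bias) weight.
    return rng.uniform(-bound, bound, size=(n_neurons, fan_in + 1))

rng = np.random.default_rng(42)
V = init_weights(fan_in=2, n_neurons=2, rng=rng)   # hidden layer: Fi = 2 inputs
W = init_weights(fan_in=2, n_neurons=1, rng=rng)   # output layer: Fi = 2 hidden outputs
print(V.round(3), W.round(3), sep="\n")
```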
Step 2: Activation
Activate the back-propagation neural network by applying inputs $x_1(p), x_2(p), \ldots, x_n(p)$ and desired outputs $y_{d,1}(p), y_{d,2}(p), \ldots, y_{d,n}(p)$.
(a) Calculate the actual outputs of the neurons in the hidden layer:

$y_j(p) = \mathrm{sigmoid}\left( \sum_{i=1}^{n} x_i(p)\, w_{ij}(p) - \theta_j \right)$

where $n$ is the number of inputs of neuron $j$ in the hidden layer, and sigmoid is the sigmoid activation function.
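The formula for step (a) translates directly into code; the input, weight, and threshold values below are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_outputs(x, w, theta):
    """y_j(p) = sigmoid(sum_i x_i(p) * w_ij(p) - theta_j), for all hidden j at once."""
    return sigmoid(w @ x - theta)

x = np.array([1.0, 1.0])              # assumed input vector
w_hidden = np.array([[0.5, 0.4],      # assumed weights, one row per hidden neuron
                     [0.9, 1.0]])
theta_hidden = np.array([0.8, -0.1])  # assumed thresholds
print(hidden_outputs(x, w_hidden, theta_hidden))   # approx. [0.525 0.881]
```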
Step 2: Activation (continued)
(b) Calculate the actual outputs of the neurons in the output layer:

$y_k(p) = \mathrm{sigmoid}\left( \sum_{j=1}^{m} x_{jk}(p)\, w_{jk}(p) - \theta_k \right)$

where $m$ is the number of inputs of neuron $k$ in the output layer.
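Step (b) is the same computation applied to the hidden-layer outputs; the values continue the assumed example above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def output_layer(y_hidden, w, theta):
    """y_k(p) = sigmoid(sum_j x_jk(p) * w_jk(p) - theta_k); here x_jk = hidden outputs."""
    return sigmoid(w @ y_hidden - theta)

y_hidden = np.array([0.525, 0.881])   # hidden outputs from step (a), assumed example
w_out = np.array([[-1.2, 1.1]])       # assumed output-layer weights
theta_out = np.array([0.3])           # assumed threshold
print(output_layer(y_hidden, w_out, theta_out))   # approx. [0.51]
```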

Step 3: Weight training
Update the weights in the back-propagation network, propagating backward the errors associated with the output neurons.
(a) Calculate the error gradient for the neurons in the output layer:

$\delta_k(p) = y_k(p)\,[1 - y_k(p)]\, e_k(p)$

where $e_k(p) = y_{d,k}(p) - y_k(p)$

Calculate the weight corrections:

$\Delta w_{jk}(p) = \eta\, y_j(p)\, \delta_k(p)$

Update the weights at the output neurons:

$w_{jk}(p+1) = w_{jk}(p) + \Delta w_{jk}(p)$
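The three formulas of step (a) in code form; the learning rate η and all numeric values are assumed, continuing the example:

```python
import numpy as np

eta = 0.1   # assumed value for the learning parameter η

def output_weight_update(y_hidden, y_out, d_out, w_out):
    """Compute delta_k, the corrections Δw_jk = eta * y_j * delta_k, and the update."""
    e = d_out - y_out                        # error e_k(p)
    delta = y_out * (1.0 - y_out) * e        # error gradient delta_k(p)
    dw = eta * np.outer(delta, y_hidden)     # weight corrections Δw_jk(p)
    return w_out + dw, delta                 # w_jk(p+1); delta is reused in step (b)

y_hidden = np.array([0.525, 0.881])          # assumed hidden outputs
w_out = np.array([[-1.2, 1.1]])              # assumed current weights
y_out, d_out = np.array([0.510]), np.array([0.0])   # actual vs. desired output
w_new, delta_out = output_weight_update(y_hidden, y_out, d_out, w_out)
print(w_new, delta_out)
```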
Step 3: Weight training (continued)
(b) Calculate the error gradient for the neurons in the hidden layer:

$\delta_j(p) = y_j(p)\,[1 - y_j(p)] \sum_{k=1}^{l} \delta_k(p)\, w_{jk}(p)$

where $l$ is the number of neurons in the output layer.

Calculate the weight corrections:

$\Delta w_{ij}(p) = \eta\, x_i(p)\, \delta_j(p)$

Update the weights at the hidden neurons:

$w_{ij}(p+1) = w_{ij}(p) + \Delta w_{ij}(p)$
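Step (b) in the same style; delta_out and the other values continue the assumed example, and η is again the assumed learning rate:

```python
import numpy as np

eta = 0.1   # assumed learning rate

def hidden_weight_update(x, y_hidden, delta_out, w_out, w_hidden):
    """delta_j = y_j * (1 - y_j) * sum_k delta_k * w_jk; then Δw_ij = eta * x_i * delta_j."""
    delta_h = y_hidden * (1.0 - y_hidden) * (w_out.T @ delta_out)
    dw = eta * np.outer(delta_h, x)          # weight corrections Δw_ij(p)
    return w_hidden + dw                     # w_ij(p+1) = w_ij(p) + Δw_ij(p)

x = np.array([1.0, 1.0])                     # assumed input vector
y_hidden = np.array([0.525, 0.881])          # assumed hidden outputs
delta_out = np.array([-0.127])               # assumed output gradient from step (a)
w_out = np.array([[-1.2, 1.1]])
w_hidden = np.array([[0.5, 0.4],
                     [0.9, 1.0]])
print(hidden_weight_update(x, y_hidden, delta_out, w_out, w_hidden))
```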

Step 4: Iteration
Increase iteration $p$ by one, go back to Step 2, and repeat the process until the selected error criterion is satisfied.
As an example, we may consider the three-layer back-propagation network. Suppose that the network is required to perform the logical operation Exclusive-OR. Recall that a single-layer perceptron could not do this operation. Now we will apply the three-layer net.

Working Example

Backpropagation (my notes)
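As a stand-in for the worked example, here is a compact end-to-end sketch that assembles Steps 1–4 for the XOR task. The learning rate, seed, network size (2-2-1), and stopping threshold are assumptions; thresholds are treated as weights on a constant input, and momentum is omitted for brevity. Note that plain gradient descent can occasionally stall in a local minimum on XOR, in which case a different seed or learning rate helps.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # XOR inputs
D = np.array([[0.], [1.], [1.], [0.]])                  # desired outputs

# Step 1: initialize weights and thresholds uniformly in (-2.4/Fi, +2.4/Fi).
rng = np.random.default_rng(7)
n, g, m = 2, 2, 1
V = rng.uniform(-2.4 / n, 2.4 / n, (g, n)); th_h = rng.uniform(-2.4 / n, 2.4 / n, g)
W = rng.uniform(-2.4 / g, 2.4 / g, (m, g)); th_o = rng.uniform(-2.4 / g, 2.4 / g, m)
eta = 0.5                                   # assumed learning rate

for epoch in range(20_000):
    sse = 0.0
    for x, d in zip(X, D):
        # Step 2: activation (forward pass).
        y_h = sigmoid(V @ x - th_h)
        y_o = sigmoid(W @ y_h - th_o)
        e = d - y_o
        sse += float(e @ e)
        # Step 3: weight training (backward pass).
        delta_o = y_o * (1 - y_o) * e
        delta_h = y_h * (1 - y_h) * (W.T @ delta_o)
        W += eta * np.outer(delta_o, y_h); th_o -= eta * delta_o
        V += eta * np.outer(delta_h, x);   th_h -= eta * delta_h
    # Step 4: iterate until the selected error criterion is satisfied.
    if sse < 0.001:
        break

for x in X:
    y = sigmoid(W @ sigmoid(V @ x - th_h) - th_o)
    print(x, y.round(3))
```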



APPLICATIONS OF FEEDFORWARD NEURAL
NETWORKS
There is a wide variety of applications of neural networks to real-world problems, for example:

1. Gene Expression Profiling for predicting Clinical Outcomes in cancer patients.

2. Steering an Autonomous Vehicle.

3. Call admission control in ATM Networks.

4. Robot Arm Control and Navigation.

