SJNanda Neural Network
Approach
• What is a Neural Network?
➢ A method of computing, based on the interaction of multiple connected processing elements.
Single Input Neuron
Resemblance between the biological neuron and the neuron model:
• Strength of synapse → Weight
• Cell body → Transfer function and summation
• Signal on axon → Output
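As a concrete illustration (a minimal sketch, not the slide's own code), a single-input neuron computes a = f(w*p + b), where w is the weight, b the bias, and f the transfer function:

    import math

    def logsig(n):
        """Log-sigmoid transfer function."""
        return 1.0 / (1.0 + math.exp(-n))

    # Single-input neuron: scalar input p, weight w, bias b
    w, b = 2.0, -1.0       # illustrative weight and bias values
    p = 0.5                # scalar input
    n = w * p + b          # net input: weighted input plus bias
    a = logsig(n)          # neuron output after the transfer function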
Nonlinear functions
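Common nonlinear transfer functions include the hard limit, log-sigmoid, and hyperbolic tangent (the linear function is shown for comparison). A brief sketch of typical definitions, not necessarily the slide's exact list:

    import numpy as np

    def hardlim(n):    # hard limit: outputs 0 or 1
        return np.where(n >= 0.0, 1.0, 0.0)

    def logsig(n):     # log-sigmoid: squashes to (0, 1)
        return 1.0 / (1.0 + np.exp(-n))

    def tansig(n):     # hyperbolic tangent: squashes to (-1, 1)
        return np.tanh(n)

    def purelin(n):    # linear (identity)
        return n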
Multiple-Input Neuron
A layer of Neurons
Multiple Layer of Neurons
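For multiple inputs, a layer of neurons, and multiple layers, the same computation generalizes to matrix form, a = f(Wp + b), applied layer by layer. A minimal sketch with illustrative layer sizes (the slide's own dimensions are not reproduced here):

    import numpy as np

    def logsig(n):
        return 1.0 / (1.0 + np.exp(-n))

    rng = np.random.default_rng(0)
    p = rng.standard_normal((3, 1))     # R = 3 inputs

    # Layer 1: 4 neurons (W1 is 4x3); Layer 2: 2 neurons (W2 is 2x4)
    W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal((4, 1))
    W2, b2 = rng.standard_normal((2, 4)), rng.standard_normal((2, 1))

    a1 = logsig(W1 @ p + b1)            # output of the first layer
    a2 = logsig(W2 @ a1 + b2)           # network output from the second layer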
Perceptron Learning Rule
Decision Boundary
Example: AND Gate
Testing
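A hedged sketch of how the perceptron learning rule can be applied to the AND-gate example and then tested (the zero initialization and epoch count below are illustrative assumptions, not the slide's values):

    import numpy as np

    def hardlim(n):
        return 1.0 if n >= 0.0 else 0.0

    # AND-gate training set
    P = [np.array([0., 0.]), np.array([0., 1.]), np.array([1., 0.]), np.array([1., 1.])]
    T = [0., 0., 0., 1.]

    w, b = np.zeros(2), 0.0          # illustrative zero initialization

    # Perceptron rule: e = t - a, w_new = w_old + e*p, b_new = b_old + e
    for epoch in range(10):
        for p, t in zip(P, T):
            a = hardlim(w @ p + b)
            e = t - a
            w, b = w + e * p, b + e

    # Testing: the trained perceptron should reproduce the AND truth table
    for p, t in zip(P, T):
        print(p, hardlim(w @ p + b), t)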
• Supervised learning
• Reinforcement learning
• Unsupervised learning
Constructing Learning Rules
Let
Unified Learning Rule
Neural Network
ADALINE
Wiener-Hopf equation
• Considering the weight and bias in one vector (with the input augmented by a constant 1), the output of the network is the inner product of that vector with the augmented input.
• The mean square error can then be written in the following convenient quadratic form, where the coefficients are the second-order statistics of the target and the input.
• Taking the gradient of this form and setting it to zero yields the Wiener-Hopf equation for the optimal weight vector, as sketched below.
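A minimal reconstruction of this standard ADALINE development, assuming the usual notation x = [w; b] and z = [p; 1]:

    a = \mathbf{x}^T \mathbf{z}
    F(\mathbf{x}) = E[e^2] = E[(t - \mathbf{x}^T \mathbf{z})^2] = c - 2\,\mathbf{x}^T \mathbf{h} + \mathbf{x}^T R\, \mathbf{x},
        \text{where } c = E[t^2], \quad \mathbf{h} = E[t\,\mathbf{z}], \quad R = E[\mathbf{z}\mathbf{z}^T]
    \nabla F(\mathbf{x}) = -2\mathbf{h} + 2R\mathbf{x} = \mathbf{0} \;\Rightarrow\; \mathbf{x}^* = R^{-1}\mathbf{h} \quad \text{(Wiener-Hopf equation)}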
Least mean square algorithm
LMS (contd.)
• Taking the derivative of the approximate mean square error with respect to the weights and bias, we obtain the update equations given below.
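In the standard Widrow-Hoff form (a reconstruction, assuming error e(k) = t(k) - a(k) and learning rate \alpha), the resulting LMS updates are:

    W(k+1) = W(k) + 2\alpha\, e(k)\, \mathbf{p}^T(k)
    \mathbf{b}(k+1) = \mathbf{b}(k) + 2\alpha\, e(k)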
Backpropagation
• The perceptron learning rule of Frank Rosenblatt and the LMS algorithm of
Bernard Widrow and Marcian Hoff were designed to train single-layer
perceptron-like networks.
• Single-layer networks suffer from the disadvantage that they are only able to
solve linearly separable classification problems.
• The difference between the LMS algorithm and backpropagation is only in the
way in which the derivatives are calculated.
• For a single-layer linear network the error is an explicit linear function of the network weights, and its derivatives with respect to the weights can be easily computed (illustrated below).
• In multilayer networks with nonlinear transfer functions, the relationship
between the network weights and the error is more complex.
• In order to calculate the derivatives, we need to use the chain rule of calculus.
In fact, this chapter is in large part a demonstration of how to use the chain
rule.
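To illustrate the single-layer linear case mentioned above: for one linear neuron with output a = \mathbf{w}^T\mathbf{p} + b and error e = t - a, the squared error depends on the weights explicitly, so

    \frac{\partial e^2}{\partial w_j} = -2\,e\,p_j, \qquad \frac{\partial e^2}{\partial b} = -2\,e,

with no chain of intermediate layers to differentiate through.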
Three layer network
Back Propagation Training
[Block diagram: inputs x1 and x2 feed the network, which produces the output yl(n); the error el(n) = d(n) - yl(n) is formed at a summing junction and drives the back-propagation algorithm.]
Here M denotes the number of layers in the network.
• The steepest descent algorithm for the approximate mean square error (stochastic gradient descent) is written out below.
• So far, this development is identical to that for the LMS algorithm. Now we
come to the difficult part – the computation of the partial derivatives.
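In the usual notation, the approximate steepest descent (stochastic gradient) updates referred to above are:

    w_{i,j}^m(k+1) = w_{i,j}^m(k) - \alpha \frac{\partial \hat F}{\partial w_{i,j}^m},
    \qquad
    b_i^m(k+1) = b_i^m(k) - \alpha \frac{\partial \hat F}{\partial b_i^m},

where \hat F = (\mathbf{t}-\mathbf{a})^T(\mathbf{t}-\mathbf{a}) is the squared error on the current training pair and \alpha is the learning rate.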
Chain rule
To review the chain rule, suppose that we have a function f that is an explicit function only of the variable n, and that n is in turn a function of a third variable w. To take the derivative of f with respect to w, the chain rule gives

    \frac{d f(n(w))}{d w} = \frac{d f(n)}{d n} \times \frac{d n(w)}{d w}.
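For example (an illustrative choice, since the slide's own example is not reproduced here), take f(n) = e^n and n(w) = 2w, so that f(n(w)) = e^{2w}. Then

    \frac{d f(n(w))}{d w} = \frac{d f(n)}{d n}\,\frac{d n(w)}{d w} = e^{n} \cdot 2 = 2 e^{2w}.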
• The second term in each of these equations can be easily computed, since the net input to layer m is an explicit function of the weights and bias in that layer.
• Therefore the derivative of the net input with respect to a weight is simply the corresponding output of the previous layer, and the derivative with respect to the bias is 1.
• If we now define the sensitivity of the approximate error to changes in the net input of layer m, these derivatives take a compact form (see the sketch below).
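A reconstruction of these relations in the standard notation:

    n_i^m = \sum_{j=1}^{S^{m-1}} w_{i,j}^m\, a_j^{m-1} + b_i^m,
    \qquad
    \frac{\partial n_i^m}{\partial w_{i,j}^m} = a_j^{m-1}, \quad \frac{\partial n_i^m}{\partial b_i^m} = 1,
    \qquad
    s_i^m \equiv \frac{\partial \hat F}{\partial n_i^m}.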
• We can now express the approximate steepest descent algorithm compactly in terms of the sensitivities, as written below.
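In matrix form (standard notation, with \mathbf{s}^m the vector of sensitivities of layer m):

    W^m(k+1) = W^m(k) - \alpha\, \mathbf{s}^m (\mathbf{a}^{m-1})^T,
    \qquad
    \mathbf{b}^m(k+1) = \mathbf{b}^m(k) - \alpha\, \mathbf{s}^m.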
To compute the sensitivities s^m we need one more application of the chain rule, which uses the Jacobian matrix of the net input of layer m+1 with respect to the net input of layer m.
• Therefore the Jacobian matrix can be written in terms of the next layer's weights and the derivatives of the current layer's transfer function (see below).
• Now we can see where the backpropagation algorithm derives its name: the sensitivities are propagated backward through the network, from the last layer to the first layer.
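A reconstruction of the standard form of the Jacobian and the resulting backward recurrence:

    \frac{\partial \mathbf{n}^{m+1}}{\partial \mathbf{n}^m} = W^{m+1} \dot F^m(\mathbf{n}^m),
    \qquad
    \mathbf{s}^m = \dot F^m(\mathbf{n}^m)\, (W^{m+1})^T \mathbf{s}^{m+1}, \quad m = M-1, \dots, 2, 1,

where \dot F^m(\mathbf{n}^m) is the diagonal matrix of transfer-function derivatives at layer m.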
To start the recurrence we need the sensitivity at the final layer, s^M. Since the approximate error depends on the net input of the output layer only through the network output, we can write this sensitivity in terms of the output error and the derivative of the final transfer function; in matrix form it takes the expression sketched below.
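A reconstruction of the standard final-layer sensitivity (the starting point of the backward recurrence):

    s_i^M = \frac{\partial \hat F}{\partial n_i^M} = -2\,(t_i - a_i)\,\frac{\partial a_i^M}{\partial n_i^M},
    \qquad
    \mathbf{s}^M = -2\, \dot F^M(\mathbf{n}^M)\, (\mathbf{t} - \mathbf{a}).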
Example on Backpropagation
Before we begin the backpropagation algorithm we need to choose some initial
values for the network weights and biases. Generally these are chosen to be small
random values.
• Next, we need to select a training set.
• In this case, we will sample the function at 21 points in the range [-2,2] at equally spaced intervals of 0.2. The training points are indicated by the circles in the previous figure.
• For our initial input we will choose p=1, which is the 16th training point:
• The output of the first layer is then
• The second layer output is
We can now perform the backpropagation. The starting point is found at the second
layer,
The first layer sensitivity is then computed by backpropagating the sensitivity from
the second layer,
• The final stage of the algorithm is to update the weights. For simplicity, we will use a learning rate α = 0.1.
This completes the first iteration of the backpropagation algorithm. We next proceed to
randomly choose another input from the training set and perform another iteration of
the algorithm. We continue to iterate until the difference between the network response
and the target function reaches some acceptable level.
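A minimal Python sketch of one such iteration, assuming a 1-2-1 network with a log-sigmoid hidden layer and a linear output layer; the target function, initial weights, and input value below are illustrative placeholders rather than the slide's actual numbers:

    import numpy as np

    def logsig(n):
        return 1.0 / (1.0 + np.exp(-n))

    def g(p):
        return 1.0 + np.sin(np.pi * p / 4.0)      # placeholder target function

    rng = np.random.default_rng(0)
    W1, b1 = rng.uniform(-0.5, 0.5, (2, 1)), rng.uniform(-0.5, 0.5, (2, 1))
    W2, b2 = rng.uniform(-0.5, 0.5, (1, 2)), rng.uniform(-0.5, 0.5, (1, 1))
    alpha = 0.1                                   # learning rate from the slides

    p = np.array([[1.0]])                         # current training input
    t = g(p)                                      # its target

    # Forward pass
    a1 = logsig(W1 @ p + b1)                      # hidden-layer output
    a2 = W2 @ a1 + b2                             # linear output layer

    # Backward pass: output-layer and hidden-layer sensitivities
    s2 = -2.0 * (t - a2)                          # linear layer: f' = 1
    s1 = (a1 * (1.0 - a1)) * (W2.T @ s2)          # logsig derivative: a(1 - a)

    # Steepest descent weight and bias updates
    W2 -= alpha * s2 @ a1.T
    b2 -= alpha * s2
    W1 -= alpha * s1 @ p.T
    b1 -= alpha * s1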
FLANN
• The MLP has high computational complexity because it uses multiple layers.
• The FLANN is basically a single-layer structure in which nonlinearity is introduced by enhancing the input pattern with a nonlinear functional expansion, as sketched below.
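A minimal sketch of the idea, using a trigonometric expansion (one common choice) and a single linear output neuron trained with an LMS-style update; the expansion order, sample values, and learning rate are illustrative assumptions:

    import numpy as np

    def trig_expand(x, order=2):
        """Expand a scalar input with trigonometric basis functions."""
        feats = [x]
        for k in range(1, order + 1):
            feats += [np.sin(k * np.pi * x), np.cos(k * np.pi * x)]
        return np.array(feats)

    # Single-layer FLANN: a linear combiner over the expanded input pattern
    w = np.zeros(5)              # weights for [x, sin(pi x), cos(pi x), sin(2pi x), cos(2pi x)]
    b, alpha = 0.0, 0.05

    x, t = 0.3, 0.7              # hypothetical training sample
    z = trig_expand(x)           # functionally expanded pattern
    y = w @ z + b                # FLANN output
    e = t - y                    # error
    w += 2 * alpha * e * z       # LMS-style weight update
    b += 2 * alpha * e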
Trigonometric FLANN
Chebyshev FLANN
These higher-order Chebyshev polynomials for -1 < x < 1 may be generated using the recursive formula given below.
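The standard Chebyshev recursion (first kind) is:

    T_0(x) = 1, \qquad T_1(x) = x, \qquad T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x).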
Legendre-FLANN
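Analogously, the Legendre-FLANN expands the input pattern with Legendre polynomials, which satisfy the standard recursion:

    P_0(x) = 1, \qquad P_1(x) = x, \qquad (n+1)\,P_{n+1}(x) = (2n+1)\,x\,P_n(x) - n\,P_{n-1}(x).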