Additional Topics
Adaline
Delta Rule or LMS or Widrow-Hoff
Generalised delta rule or error backpropagation
Effect of Momentum term on Backpropagation
Radial Basis Function Networks
Kohonen SOM
LVQ
Simulated Annealing
Covers Theorem
BPTT
Hard limit activation function
Module 4
Previous Solutions
Important Questions
ADALINE

The net input of the ADALINE is given as

I = Σ_{i=0}^{n} w_i x_i

where x_i represents the input from the ith neuron and w_i represents the
corresponding weight. The bias signal, i.e. x_0, is always +1.
The neuron is linear and hence uses the identity function to find its output,
i.e.

y = f(I) = I = Σ_{i=0}^{n} w_i x_i

The learning rule minimises the mean squared error between the output y
and the target output t, and is given as

Δw_i = η (t − y) x_i

where i represents the ith input signal, Δw_i represents the change that
needs to be made in the ith weight, and η is the learning rate.
So, this can also be written as

w_i(new) = w_i(old) ± η x_i

where the + sign is taken when t = 1, y = 0 and the − sign is taken when t = 0, y = 1.
An ADALINE can be used to solve linearly separable problems but it
fails in case of problems that are not linearly separable, like the XOR pattern problem.
The ADALINE training algorithm is given as:
Step 1) Initialize weights randomly (they may be set to 0 for simplicity) and set the learning rate η between 0 and 1.
Step 2) While the stopping condition is false, do steps (3) to (7).
Step 3) For each training pattern, do steps (4) to (6).
Step 4) Compute the net input:  I = Σ_{i=0}^{n} w_i x_i
Step 5) Compute the output:  y = f(I) = I
Step 6) Update the weights:  w_i(new) = w_i(old) + η (t − y) x_i
Step 7) Test the stopping condition, e.g. stop when the largest weight change in an epoch falls below a preset threshold.
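The steps above can be sketched in Python. The AND dataset, the learning rate of 0.1, the 0.5 decision threshold, and the epoch limit are illustrative assumptions, not values from the notes; AND is used because, unlike XOR, it is linearly separable.

```python
# A minimal sketch of the ADALINE training algorithm on the AND problem.
patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [0.0, 0.0, 0.0]   # [w0 (bias weight), w1, w2], set to 0 for simplicity
eta = 0.1             # learning rate between 0 and 1

for epoch in range(100):
    max_change = 0.0
    for (x1, x2), t in patterns:
        x = (1, x1, x2)                            # bias signal x0 = +1
        y = sum(wi * xi for wi, xi in zip(w, x))   # linear output: y = f(I) = I
        for i in range(3):
            dw = eta * (t - y) * x[i]              # delta rule: dw_i = eta*(t - y)*x_i
            w[i] += dw
            max_change = max(max_change, abs(dw))
    if max_change < 1e-4:                          # stopping condition (step 7)
        break

# The learned linear output, thresholded at 0.5, classifies AND correctly.
preds = [1 if w[0] + w[1]*a + w[2]*b >= 0.5 else 0 for (a, b), _ in patterns]
print(preds)
```

Because the weight updates never become exactly zero under a fixed learning rate, the loop typically runs until the epoch limit, with the weights settling near the least-squares solution.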
Figure 1:
DELTA RULE
The delta rule is a learning rule used by neural networks which works on
the principle of minimising the mean squared error. It is also called the
LMS (Least Mean Square) rule or the Widrow-Hoff rule.
The net input of any output neuron is given as

I = Σ_{i=0}^{n} w_i x_i

where x_i represents the input from the ith neuron and w_i represents the
corresponding weight. The bias signal, i.e. x_0, is always +1.
The neuron uses an activation function to find its output, i.e.

y = f(I) = f( Σ_{i=0}^{n} w_i x_i )

The delta learning rule minimises the mean squared error between the
output y and the target output t, and is given as

Δw_i = η (t − y) x_i

where i represents the ith input signal, Δw_i represents the change that
needs to be made in the ith weight, and η is the learning rate.
So, this can also be written as

w_i(new) = w_i(old) ± η x_i

where the + sign is taken when t = 1, y = 0 and the − sign is taken when t = 0, y = 1.
The delta rule is used to modify weights in networks like the Perceptron and
the Adaline.
The features of the delta rule are as follows :
It is one of the simplest learning rules
Learning is said to be distributed because weight modifications
can be performed locally at each neuron
Learning is said to be online because it takes place in a pattern-by-pattern manner, where weights get modified after the
presentation of each training pattern
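As a tiny worked instance of the update above, with an illustrative learning rate η = 0.2 and input x_i = 1 (the bias signal, for example):

```python
# One delta-rule weight change for the two binary error cases.
eta, xi = 0.2, 1

def delta(t, y):
    return eta * (t - y) * xi     # dw_i = eta * (t - y) * x_i

# t = 1, y = 0 gives a positive change (+eta*x_i);
# t = 0, y = 1 gives a negative change (-eta*x_i).
print(delta(1, 0), delta(0, 1))   # 0.2 -0.2
```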
GENERALISED DELTA RULE (ERROR BACKPROPAGATION)

The error energy of the jth output neuron at time step n is

E_j(n) = (1/2) e_j²(n)

where e_j(n) = t_j(n) − y_j(n) is the error between the target and the actual output.
The total error energy is obtained by adding up the error energies of all
the m output neurons and it is given as

E(n) = (1/2) Σ_{j=1}^{m} e_j²(n)

The average error energy over all N training patterns is

E_av = (1/N) Σ_{n=1}^{N} E(n) = (1/2N) Σ_{n=1}^{N} Σ_{j=1}^{m} e_j²(n)

According to the gradient-descent search rule,

Δw_ij(n) = −η ∂E(n)/∂w_ij(n)

where the −ve sign indicates that the direction of weight change and the change
in E(n) are reversed, according to gradient descent in the weight space.
Now, the gradient can be expressed using the chain rule as

∂E(n)/∂w_ij(n) = [∂E(n)/∂e_j(n)] · [∂e_j(n)/∂y_j(n)] · [∂y_j(n)/∂I_j(n)] · [∂I_j(n)/∂w_ij(n)]
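The chain-rule factorisation above can be checked numerically. The sketch below assumes a single sigmoid output neuron with two inputs plus a bias; the weights, inputs, and target are illustrative values, not from the notes.

```python
import math

x = [1.0, 0.5, -0.3]          # x0 = +1 bias, then two inputs
w = [0.2, -0.4, 0.7]          # weights into the output neuron
t = 1.0                        # target output

def forward(w):
    I = sum(wi * xi for wi, xi in zip(w, x))      # net input I_j
    y = 1.0 / (1.0 + math.exp(-I))                # y_j = f(I_j), sigmoid
    return I, y

I, y = forward(w)
e = t - y                                         # e_j = t_j - y_j
# Chain rule: dE/dw_ij = (dE/de_j)(de_j/dy_j)(dy_j/dI_j)(dI_j/dw_ij)
#           =    e_j   *  (-1)   *  y(1-y)   *    x_i
grad = [e * (-1.0) * y * (1.0 - y) * xi for xi in x]

# Verify against a central-difference derivative of E = (1/2) e^2.
eps = 1e-6
for i in range(3):
    wp = list(w); wp[i] += eps
    wm = list(w); wm[i] -= eps
    Ep = 0.5 * (t - forward(wp)[1]) ** 2
    Em = 0.5 * (t - forward(wm)[1]) ** 2
    num = (Ep - Em) / (2 * eps)
    assert abs(num - grad[i]) < 1e-8
print("analytic gradient matches numerical gradient")
```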
The generalised delta rule is used for training in networks like the
backpropagation network.
Its features are :
It has a wider scope than traditional learning rules like the LMS
It is global because the error for the entire network is to be
minimised
Its execution is complex, but it can be used for solving a greater
range of problems, like the separation of XOR-type patterns that
are not linearly separable
RADIAL BASIS FUNCTION NETWORKS

Each receptive field (kernel) node i computes a Gaussian response

R_i(x) = exp( −‖x − u_i‖² / (2σ_i²) )

where u_i is a pre-defined centre vector with the same dimension as x, and
σ_i is the width of the kernel.
In this case, each receptive field node produces an identical output for all
inputs at a fixed radial distance from the centre of the kernel, i.e. the
kernels are radially symmetric.
The output for an output node can be calculated in two ways :
(1) Weighted sum :

y(x) = Σ_{i=1}^{n} c_i R_i(x)

(2) Weighted average :

y(x) = [ Σ_{i=1}^{n} c_i R_i(x) ] / [ Σ_{i=1}^{n} R_i(x) ]
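Both output computations can be sketched as follows; the 1-D centres u_i, widths σ_i, and weights c_i below are illustrative assumptions, not values from the notes.

```python
import math

centres = [0.0, 1.0, 2.0]
sigmas  = [0.5, 0.5, 0.5]
c       = [1.0, -2.0, 3.0]

def kernel(x, u, s):
    # Gaussian receptive field: R_i(x) = exp(-||x - u_i||^2 / (2 sigma_i^2))
    return math.exp(-((x - u) ** 2) / (2 * s ** 2))

def weighted_sum(x):
    return sum(ci * kernel(x, u, s) for ci, u, s in zip(c, centres, sigmas))

def weighted_average(x):
    R = [kernel(x, u, s) for u, s in zip(centres, sigmas)]
    return sum(ci * ri for ci, ri in zip(c, R)) / sum(R)

# At a centre, that kernel dominates, so both outputs are pulled towards its c_i.
print(round(weighted_sum(0.0), 3), round(weighted_average(0.0), 3))
```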
Figure 2: An RBF network with input nodes x1, x2, …, xm, kernel nodes, and weights c1, c2, …, cn on the connections to the output node.
KOHONEN SOM

The neighbourhood function between the winning neuron w and a neuron k is
commonly taken as the Gaussian

h_kw(t) = h_0 exp( −‖c_k − c_w‖² / (2σ²(t)) )

where h_0 is a positive constant, c_k and c_w are the map positions of
neurons k and w, and σ(t) is a decreasing function of t,
which is often taken as the exponential decay function given as

σ(t) = σ_0 e^(−t/τ)

where σ_0 and τ are constants.
The algorithm is said to converge when h_kw does not vary with time,
i.e. it becomes constant.
Applications :
Clustering
Vector Quantisation
Data visualisation
Feature extraction
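The shrinking neighbourhood described above can be sketched as follows; the constants h0, σ0, and τ, and the 1-D grid positions, are illustrative assumptions.

```python
import math

h0, sigma0, tau = 1.0, 2.0, 10.0

def sigma(t):
    return sigma0 * math.exp(-t / tau)        # width sigma(t) decays with time

def h(k, w, t):
    # k and w are 1-D grid positions c_k, c_w of neuron k and the winner w
    d2 = (k - w) ** 2
    return h0 * math.exp(-d2 / (2 * sigma(t) ** 2))

# The neighbourhood tightens as training proceeds: a neuron 2 units from the
# winner is updated strongly at t = 0 but hardly at all at t = 30.
print(round(h(2, 0, 0), 3), round(h(2, 0, 30), 6))
```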
Demerits :
SOM is not based on the minimisation of any objective function
Termination is not based on optimising any model of the process or
its data
Convergence is not guaranteed, and termination is often forced
Drastic Product or Drastic Intersection :

T(x, y) = x if y = 1 ; y if x = 1 ; 0 otherwise
t-conorm or s-norm :
A mapping S : [0,1] × [0,1] → [0,1] is said to be a t-conorm if it
satisfies the following four properties :
Commutativity : S(x, y) = S(y, x)
Monotonicity : S(x, y) ≤ S(x, z) if y ≤ z
Associativity : S(x, S(y, z)) = S(S(x, y), z)
Boundary condition : S(x, 0) = x
The most commonly used t-conorm operators are :
Maximum or Standard Union :
S_max(x, y) = max(x, y)
Algebraic Sum :
S_as(x, y) = x + y − xy
Bounded Sum :
S_bs(x, y) = min(1, x + y)
Drastic Sum or Drastic Union :

S(x, y) = x if y = 0 ; y if x = 0 ; 1 otherwise
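The four t-conorm operators above can be sketched and checked against the boundary and commutativity properties:

```python
def s_max(x, y):
    return max(x, y)                    # standard union

def s_as(x, y):
    return x + y - x * y                # algebraic sum

def s_bs(x, y):
    return min(1.0, x + y)              # bounded sum

def s_drastic(x, y):
    if y == 0: return x                 # drastic sum
    if x == 0: return y
    return 1.0

for S in (s_max, s_as, s_bs, s_drastic):
    for x in (0.0, 0.3, 0.8, 1.0):
        assert S(x, 0.0) == x           # boundary condition: S(x, 0) = x
        assert S(x, 0.3) == S(0.3, x)   # commutativity

print(s_max(0.3, 0.6), s_as(0.3, 0.6), s_bs(0.3, 0.6), s_drastic(0.3, 0.6))
```

Note how the operators are ordered: for the same inputs, max gives the smallest union and the drastic sum the largest.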
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Contrast intensification of a fuzzy set A :

INT(A)(x) = 2[μ_A(x)]² ,            for 0 ≤ μ_A(x) ≤ 0.5
          = 1 − 2[1 − μ_A(x)]² ,    for 0.5 ≤ μ_A(x) ≤ 1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Block diagram of an FIS: Crisp Input → (i) Fuzzification → (ii) Fuzzy Inference Engine, which draws on the Fuzzy Rule Base and the Data Base → (iii) Defuzzification → Crisp Output.
FIS MODELS :
FISs are capable of performing non-linear mappings between
inputs and outputs based on a set of fuzzy rules.
The interpretation of any rule in the rule base depends on the FIS
model.
The two most popular FIS models are :
(i) the Mamdani model
(ii) the TSK (Takagi-Sugeno-Kang) model
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
MAMDANI MODEL
The Mamdani model is one of the most popular models used in fuzzy
inference systems. It is a non-additive model that uses the
maximum operator to combine the outputs of fuzzy rules.
For the Mamdani model with N rules, the ith rule is given as

R_i : IF x is A_i THEN y is B_i ;   for i = 1, 2, …, N

where the A_i s and B_i s are fuzzy sets defined on the input and output spaces
respectively.
An input of the form
x is A'
produces an output of the form
y is B'
by combining the rules using

α_i = max_x min( μ_A'(x), μ_Ai(x) )   --------------- (1)

μ_B'(y) = max_i min( α_i, μ_Bi(y) )   ---------------- (2)

where α_i is the firing strength of the ith rule. In place of the max and min,
other operators like sum and product can also be used.
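The max-min composition in (1) and (2) can be sketched on small discretised universes; the membership values below are illustrative assumptions, not values from the notes.

```python
# Fuzzy sets are lists of membership values over a 3-point universe.
A  = {1: [0.2, 1.0, 0.2], 2: [0.0, 0.5, 1.0]}   # antecedents A1, A2 (input space)
B  = {1: [1.0, 0.5, 0.0], 2: [0.0, 0.5, 1.0]}   # consequents B1, B2 (output space)
Ap = [0.0, 1.0, 0.3]                             # input fuzzy set A'

# (1) firing strength: alpha_i = max over x of min(mu_A'(x), mu_Ai(x))
alpha = {i: max(min(a, ai) for a, ai in zip(Ap, A[i])) for i in (1, 2)}

# (2) output fuzzy set: mu_B'(y) = max over i of min(alpha_i, mu_Bi(y))
Bp = [max(min(alpha[i], B[i][j]) for i in (1, 2)) for j in range(3)]

print(alpha, Bp)
```

Each consequent is clipped at its rule's firing strength, and the clipped sets are combined with the maximum operator.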
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TSK MODEL
The TSK (Takagi-Sugeno-Kang) model uses rules whose consequents are
functions of the inputs rather than fuzzy sets. For a TSK model with N
rules, the ith rule is of the form

R_i : IF x is A_i THEN y is f_i(x) ;   for i = 1, 2, …, N

The overall output is obtained as the weighted average

y' = [ Σ_{i=1}^{N} μ_Ai'(x) f_i(x) ] / [ Σ_{i=1}^{N} μ_Ai'(x) ]   --------------- (3)

where the firing strengths μ_Ai'(x) are obtained as in (2).
The TSK model has been found to work better than the Mamdani
model with fewer rules.
It can extract rules from the data automatically, and has been
found to be stronger and more flexible than the Mamdani model.
Its disadvantage is that the rules generated may not be meaningful.
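The weighted-average output (3) can be sketched for two rules; the Gaussian firing strengths and the linear consequents below are illustrative assumptions, not values from the notes.

```python
import math

def mu(x, centre, width):
    # firing strength of a rule whose antecedent is centred at `centre`
    return math.exp(-((x - centre) ** 2) / (2 * width ** 2))

rules = [
    # (centre, width, consequent f_i(x))
    (0.0, 1.0, lambda x: 1.0),            # rule 1: y stays near 1 around x = 0
    (2.0, 1.0, lambda x: 3.0 + 0.5 * x),  # rule 2: y grows around x = 2
]

def tsk(x):
    w = [mu(x, c, s) for c, s, _ in rules]                 # firing strengths
    num = sum(wi * f(x) for wi, (_, _, f) in zip(w, rules))
    return num / sum(w)                                    # weighted average (3)

# The output blends the two rule consequents according to the firing strengths.
print(round(tsk(0.0), 3), round(tsk(2.0), 3))
```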
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ANFIS
The ANFIS ( Adaptive Neuro Fuzzy Inference System or Adaptive
Network-based Fuzzy Inference System) is a hybrid neuro-fuzzy
system that may be defined as an adaptive neural network that
functions like a fuzzy inference system.
Architecture of ANFIS :
An ANFIS usually has an architecture consisting of six layers, of the
form n-nK-K-K-K-1, where n represents the number of inputs
in layer 0 and K represents the number of nodes in the next layer to which
each input node is connected. In the example considered below, n = 2 and K = 2.
Functioning of ANFIS :
Layer 0 :
It is the input layer containing n input nodes.
The example considered here has a network with 2 inputs, x and y.
Layer 1 :
Every node i in this layer is an adaptive node with a node function

O_1i = μ_Ai(x) for i = 1, 2 ;   O_1i = μ_B(i−2)(y) for i = 3, 4

Each node is associated with a linguistic label A_i or B_(i−2)
The node function value O_1i is just the membership value
associated with these fuzzy sets
The typically used membership function is the generalised bell function

μ_A(x) = 1 / ( 1 + |(x − c_i)/a_i|^(2b_i) )

Here a_i, b_i, and c_i are parameters which are called the premise
parameters
Layer 2 :
Every node in this layer represents a fuzzy neuron with the
algebraic product t-norm as the operator

O_2i = w_i = μ_Ai(x) · μ_Bi(y) ,   for i = 1, 2
Layer 3 :
Every node in this layer performs normalization, and the outputs
are called normalized firing strengths
The outputs are given as :

O_3i = O_2i / Σ_{j=1}^{2} O_2j
Layer 4 :
Every node i in this layer is an adaptive node with a node function

O_4i = O_3i · f_i(x, y)

where f_i is the linear consequent function of the ith TSK rule; its
coefficients are called the consequent parameters
Layer 5 :
The single node in this layer finds the overall output by summing
up all the incoming signals
The output is given as

O_5 = Σ_{j=1}^{2} O_4j
Learning of ANFIS :
In the ANFIS model, the functions used at all the nodes are
differentiable, and hence the backpropagation algorithm can be
used to train the network.
The ANFIS model uses the TSK fuzzy rules

R_i : IF x is A_i THEN y is f_i(x) ;   for i = 1, 2, …, N

where the functions f_i are of the form

f_i(x) = a_i0 + a_i1 x_1 + a_i2 x_2 + … + a_in x_n

The network finds the output y, computes the error, and then
adjusts the parameters of the membership functions in the ANFIS
network.
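One forward pass through the 2-input, 2-rule network described above can be sketched as follows; all premise and consequent parameter values are illustrative assumptions, not values from the notes.

```python
import math

def bell(v, a, b, c):
    # Generalised bell membership: 1 / (1 + |(v - c)/a|^(2b))
    return 1.0 / (1.0 + abs((v - c) / a) ** (2 * b))

# Layer 1: premise parameters (a, b, c) for A1, A2 (on x) and B1, B2 (on y)
A = [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)]
B = [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)]

# Layer 4: consequent functions f_i(x, y) = p*x + q*y + r
f = [lambda x, y: 0.5 * x + 0.5 * y + 0.0,
     lambda x, y: 1.0 * x + 1.0 * y + 1.0]

def anfis(x, y):
    muA = [bell(x, *p) for p in A]                  # Layer 1 (x side)
    muB = [bell(y, *p) for p in B]                  # Layer 1 (y side)
    w = [muA[i] * muB[i] for i in range(2)]         # Layer 2: product t-norm
    wbar = [wi / sum(w) for wi in w]                # Layer 3: normalization
    o4 = [wbar[i] * f[i](x, y) for i in range(2)]   # Layer 4
    return sum(o4)                                  # Layer 5: overall output

print(round(anfis(0.0, 0.0), 3), round(anfis(2.0, 2.0), 3))
```

Near (0, 0) rule 1 dominates and near (2, 2) rule 2 dominates, so the output follows the corresponding consequent function.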
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
COVER'S THEOREM
Cover's Theorem is a result that gives us a method for dealing with the
separability of patterns which are not linearly separable.
It states that :
A complex pattern-classification problem which is not linearly
separable in a given n-dimensional input space, when transformed to a
higher-dimensional feature space, is more likely to be linearly separable,
provided that the space is not densely populated.
The theorem is often used for converting patterns which are not linearly
separable into linearly separable patterns.
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------