Unit 4
III, CSE
SRM Institute of Science & Technology
DELHI-NCR
Introduction
• The hidden units compute a Gaussian function of the input.
• The weights of the hidden layer are the cluster centres.
Cover's Theorem
• An input x is assigned to class X+ if w^T φ(x) ≥ 0, and to class X- if w^T φ(x) < 0.
• The hyperplane defined by w^T φ(x) = 0 is the separating surface between the two classes.
RBF Networks for classification
[Figure: decision boundaries produced by an RBF network and by an MLP]
RBF Networks for classification (contd.)
[Figure: a quadratically separable pattern; the two classes are separated by a quadric surface]
What happens in the hidden layer?
• Each hidden unit applies a function φ centred on its cluster centre: the activation is large for inputs close to the centre and small for inputs far from it.
• The distance measure of the input x from the centre of hidden unit j is

$$r_j = \sqrt{\sum_{i=1}^{n} (x_i - w_{ij})^2}$$
Width of hidden unit
The activation of hidden unit j is

$$\varphi_j = \exp\!\left(-\frac{\sum_{i=1}^{n} (x_i - \mu_j)^2}{2\sigma^2}\right) \qquad (1)$$

where the common width is

$$\sigma = \frac{d_{\max}}{\sqrt{2M}} \qquad (2)$$

with $d_{\max}$ the maximum distance between the $M$ chosen centres. Substituting (2) into (1) gives

$$\varphi_j = \exp\!\left(-\frac{M \sum_{i=1}^{n} (x_i - \mu_j)^2}{d_{\max}^2}\right) \qquad (3)$$
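A minimal numerical sketch of equations (1), (2) and (3): the helper below and its two example centers are assumptions for illustration, not part of the lecture.

```python
import numpy as np

def rbf_activations(x, centers, d_max):
    """Gaussian RBF activations with shared width sigma = d_max / sqrt(2M),
    where M is the number of centers and d_max the largest inter-center distance."""
    M = len(centers)
    sigma = d_max / np.sqrt(2 * M)
    sq_dist = np.sum((centers - x) ** 2, axis=1)  # ||x - mu_j||^2 for each center
    # exp(-||x - mu_j||^2 / (2 sigma^2)) == exp(-M ||x - mu_j||^2 / d_max^2)
    return np.exp(-sq_dist / (2 * sigma ** 2))

centers = np.array([[0.0, 0.0], [1.0, 1.0]])
d_max = np.sqrt(2.0)  # distance between the two centers
print(rbf_activations(np.array([0.0, 0.0]), centers, d_max))  # [1.0, exp(-2)]
```

At its own center a unit responds with 1; at the other center the response drops to exp(-2), consistent with substituting (2) into (1).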
Training of the hidden layer
• The hidden-layer units of an RBF network have weight vectors that represent the centres of the clusters in the input data.
• These weights are found either by the k-means clustering algorithm or by Kohonen's algorithm.
• Training is unsupervised, but the number of clusters is set in advance; the algorithm finds the best fit to these clusters.
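As a sketch, a minimal k-means loop (an illustration under the description above, not the lecture's implementation) that produces the hidden-layer centres:

```python
import numpy as np

def kmeans_centers(X, k, iters=20, seed=0):
    """Minimal k-means sketch: the final centroids serve as the
    RBF hidden-unit weight vectors (cluster centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign every sample to its nearest center
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1), axis=1)
        # move each center to the mean of its assigned samples
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

# two well-separated blobs of made-up data; the centers converge to (0,0) and (5,5)
X = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
print(kmeans_centers(X, k=2))
```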
Training of the output layer
• The output layer is trained with the least-mean-square (LMS) algorithm, a gradient-descent technique.
• Given the input signal vector x(n) and desired response d(n):
• Set the initial weights w(0) = 0.
• For n = 1, 2, ... compute
  e(n) = d(n) − w(n)ᵀ x(n)
  w(n+1) = w(n) + c · x(n) · e(n)
where c is the learning-rate constant.
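The update rule above can be sketched as follows; the learning rate and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def lms_train(X, d, c=0.2, epochs=200):
    """LMS training of the output layer: w(0) = 0, then for each sample
    e(n) = d(n) - w(n)^T x(n) and w(n+1) = w(n) + c * x(n) * e(n)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, dn in zip(X, d):
            e = dn - w @ x        # error for this sample
            w = w + c * x * e     # gradient-descent weight update
    return w

# toy data with an exact linear solution w = (1, 2); the values are made up
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
d = np.array([1.0, 2.0, 3.0])
print(lms_train(X, d))  # converges to approximately [1. 2.]
```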
Comparison of RBF and MLP

MLP | RBF
Can have any number of hidden layers | Has only one hidden layer
Processing nodes in different layers share a common neural model | Hidden nodes operate very differently and serve a different purpose
The argument of each hidden-unit activation function is the inner product of the inputs and the weights | The argument of each hidden-unit activation function is the distance between the input and the weights
Trained with a single global supervised algorithm | Usually trained one layer at a time
Faster to evaluate after training | Slower to evaluate after training
Example: the XOR problem
[Figure: the XOR input space, with points (0,0), (0,1), (1,0), (1,1) in the (x1, x2) plane and outputs 0/1; their images in the feature space (φ1, φ2), where (0,1) and (1,0) map to the same point and a linear decision boundary separates the two classes]
• When mapped into the feature space ⟨φ1, φ2⟩ (the hidden layer), C1 and C2 become linearly separable, so a linear classifier with φ1(x) and φ2(x) as inputs can solve the XOR problem.
RBF NN for the XOR problem

$$\varphi_1(x) = e^{-\|x - t_1\|^2} \quad\text{with } t_1 = (0,0)$$
$$\varphi_2(x) = e^{-\|x - t_2\|^2} \quad\text{with } t_2 = (1,1)$$

[Network diagram: inputs x1 and x2 feed the two RBF units with centres t1 and t2; both hidden-to-output weights are −1 and the bias is +1, giving y = −φ1(x) − φ2(x) + 1]
Pattern | x1 | x2
1 | 0 | 0
2 | 0 | 1
3 | 1 | 0
4 | 1 | 1
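A quick check that this network computes XOR on the four patterns above; the output weights −1, −1 and the bias +1 are read off the network sketch, and a positive y is taken as class 1.

```python
import numpy as np

t1, t2 = np.array([0.0, 0.0]), np.array([1.0, 1.0])  # centers from the slide

def xor_rbf(x):
    """y = -phi1(x) - phi2(x) + 1 with Gaussian units centered at t1 and t2."""
    phi1 = np.exp(-np.sum((x - t1) ** 2))
    phi2 = np.exp(-np.sum((x - t2) ** 2))
    return -phi1 - phi2 + 1.0

for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y = xor_rbf(np.array(p, dtype=float))
    print(p, "->", 1 if y > 0 else 0)  # prints 0, 1, 1, 0 respectively
```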
RBF network parameters

$$\varphi_i(\mathbf{x}) = \exp\!\left(-\frac{m_1}{d_{\max}^2}\,\|\mathbf{x} - \mathbf{t}_i\|^2\right), \qquad i = 1, \ldots, m_1$$

where $m_1$ is the number of centres $\mathbf{t}_i$ and $d_{\max}$ the maximum distance between them.
Learning Algorithm 1
Choose the output weights so that the network interpolates the training data exactly:

$$y(x_i) = d_i, \qquad i = 1, \ldots, N$$

$$[w_1 \ldots w_{m_1}]^T = \Phi^{+} [d_1 \ldots d_N]^T$$

where $\Phi^{+}$ is the pseudo-inverse of the interpolation matrix $\Phi$ with entries $\Phi_{ji} = \varphi_i(x_j)$.
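A minimal sketch of this pseudo-inverse solution on the XOR data, using NumPy's `np.linalg.pinv`; the constant bias column is an assumption added so the interpolation conditions can be met exactly.

```python
import numpy as np

# XOR training data and the two centers from the earlier slide
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0.0, 1.0, 1.0, 0.0])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])

# Phi[j, i] = phi_i(x_j); a bias column of ones is appended (an assumption)
G = np.exp(-((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1))
Phi = np.hstack([G, np.ones((len(X), 1))])

w = np.linalg.pinv(Phi) @ d  # [w1 ... w_m1, b]^T = Phi^+ [d1 ... d_N]^T
print(Phi @ w)               # reproduces the targets [0, 1, 1, 0]
```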
Case study: face recognition with an RBF network
• The problem:
• Face recognition of persons of a known group in an indoor environment.
• The approach:
• Learn face classes over a wide range of poses using an RBF network.
Dataset
• A face database of 100 images of 10 people (8-bit grayscale, resolution 384 × 287).
• For each individual, 10 images of the head in different poses, from face-on to profile.
• Designed to assess the performance of face-recognition techniques when pose variations occur.
[Network diagram: input units feed the RBF units (non-linear, trained unsupervised), which feed the output units (linear, trained supervised)]
Hidden Layer
• Hidden nodes can be:
  • Pro neurons: evidence for that person.
  • Anti neurons: negative evidence.
• The number of pro neurons equals the number of positive examples in the training set; for each pro neuron there is either one or two anti neurons.
• Hidden-neuron model: the Gaussian RBF function.
Training and Testing
Centres:
• of a pro neuron: the corresponding positive example;
• of an anti neuron: the negative example most similar to the corresponding pro neuron, with respect to the Euclidean distance.
Spread: the average distance of the centre from all other centres, so the spread of a hidden neuron n is

$$\sigma_n = \frac{1}{H}\sum_{h \neq n} \|t_n - t_h\|^2$$

where the sum runs over the $H$ other hidden-neuron centres $t_h$.
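A sketch of the spread computation, assuming the spread of neuron n is the average squared distance from its centre to the other centres (one reading of the slide's formula); the three centers below are made up.

```python
import numpy as np

def spreads(centers):
    """Spread of each hidden neuron: average of ||t_n - t_h||^2 over the
    other centers (an assumed reading of the formula)."""
    H = len(centers)
    diff = centers[:, None, :] - centers[None, :, :]
    sq = (diff ** 2).sum(axis=-1)    # pairwise squared distances
    return sq.sum(axis=1) / (H - 1)  # self-distance is 0; average over the others

centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # made-up centers
print(spreads(centers))  # [1.  1.5 1.5]
```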