Radial Basis Function (RBF) Neural Networks
In this part, we will investigate a network structure related to the multi-layer feed-forward network, known as the radial basis function (RBF) neural network. The RBF neural network has a feed-forward structure with a modified hidden layer and training algorithms.
[Figure: structure of an RBF neural network. Inputs $x_1, x_2, \ldots, x_p$ feed the hidden neurons; hidden neuron $i$ computes the distance $d_i = \|x - c_i\|$ and outputs $o_i = g(d_i)$; the output layer combines the $o_i$ to produce $f(x)$.]
In mathematical terms, we can describe the operation of the hidden layer neuron as:

$$d_j = \|c_j - x\|, \qquad o_j(x) = g(d_j) = g(\|c_j - x\|)$$

(1) The Gaussian function:

$$g(d) = \exp\left(-\frac{d^2}{2\sigma^2}\right)$$

where $\sigma$ is the width of the basis function. The suitable value of $\sigma$ is application dependent.
[Figure: the Gaussian basis function $g(d)$ plotted for widths $\sigma = 0.5$, $\sigma = 1$, and $\sigma = 2$.]
(2) The multi-quadratic function:

$$g(d) = \sqrt{d^2 + \sigma^2}$$

(3) The inverse multi-quadratic function:

$$g(d) = 1/\sqrt{d^2 + \sigma^2}$$

(4) The thin-plate spline function:

$$g(d) = d^2 \log(d)$$

[Figure: plots of the multi-quadratic, inverse multi-quadratic, and thin-plate spline basis functions.]
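As a quick sketch, the basis functions above can be implemented directly; this assumes NumPy, and the function names are illustrative:

```python
import numpy as np

def gaussian(d, sigma):
    # g(d) = exp(-d^2 / (2*sigma^2))
    return np.exp(-d**2 / (2 * sigma**2))

def multiquadratic(d, sigma):
    # g(d) = sqrt(d^2 + sigma^2)
    return np.sqrt(d**2 + sigma**2)

def inverse_multiquadratic(d, sigma):
    # g(d) = 1 / sqrt(d^2 + sigma^2)
    return 1.0 / np.sqrt(d**2 + sigma**2)

def thin_plate_spline(d):
    # g(d) = d^2 * log(d), defined for d > 0
    return d**2 * np.log(d)
```

Note that only the Gaussian and the two multi-quadratic functions depend on the width $\sigma$.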
If the Gaussian basis function is used, we have:

$$f(x) = \sum_{j=1}^{m} w_j o_j(x) = \sum_{j=1}^{m} w_j \exp\left(-\frac{\|x - c_j\|^2}{2\sigma^2}\right)$$

The parameters that determine the RBF network are: the neuron centers $c_j$, the width $\sigma$, and the weights $w_j$.
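A minimal sketch of this forward pass (NumPy assumed; `rbf_forward` is an illustrative name, not from the text):

```python
import numpy as np

def rbf_forward(x, centers, weights, sigma):
    """f(x) = sum_j w_j * exp(-||x - c_j||^2 / (2*sigma^2))."""
    d2 = np.sum((centers - x)**2, axis=1)   # squared distances ||x - c_j||^2
    o = np.exp(-d2 / (2 * sigma**2))        # hidden layer outputs o_j(x)
    return weights @ o                      # weighted sum at the output node

# Example: two centers; evaluating at a center with weight 1 and the
# other weight 0 returns exp(0) = 1
centers = np.array([[1.0, 1.0], [0.0, 0.0]])
y = rbf_forward(np.array([1.0, 1.0]), centers, np.array([1.0, 0.0]), 1.0)
```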
Weight estimation

It is noted that the output of the RBF neural network has a linear relationship with the outputs of the hidden layer neurons; thus the estimation of the weights in the second step can be done using a linear least square estimation algorithm.
Assume there are N training samples. Then:

$$f[x(k)] = \sum_{j=1}^{m} w_j \exp\left(-\frac{\|x(k) - c_j\|^2}{2\sigma^2}\right) = \sum_{j=1}^{m} w_j o_j(k)$$

Define:

$$\mathbf{f} = \begin{bmatrix} f[x(1)] \\ f[x(2)] \\ \vdots \\ f[x(N)] \end{bmatrix}, \quad \mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{bmatrix}, \quad \Phi = \begin{bmatrix} o_1(1) & o_2(1) & \cdots & o_m(1) \\ \vdots & \vdots & & \vdots \\ o_1(N) & o_2(N) & \cdots & o_m(N) \end{bmatrix}$$

We obtain:

$$\mathbf{f} = \Phi \mathbf{w}$$
We need to find the weights that minimize the following cost function:

$$J = \sum_{k=1}^{N} \{f[x(k)] - d(k)\}^2 = [\mathbf{f} - \mathbf{d}]^T[\mathbf{f} - \mathbf{d}] = [\Phi\mathbf{w} - \mathbf{d}]^T[\Phi\mathbf{w} - \mathbf{d}]$$

where

$$\mathbf{d} = [d(1)\ \ d(2)\ \cdots\ d(N)]^T$$

Solving

$$\frac{\partial J}{\partial \mathbf{w}} = 0$$

we obtain:

$$\mathbf{w} = (\Phi^T\Phi)^{-1}\Phi^T\mathbf{d}$$

The weights thus obtained are called linear least square estimates.
Example

Consider four training samples:

$$x(1) = [1\ 1]^T,\ d(1) = 0; \quad x(2) = [0\ 1]^T,\ d(2) = 1; \quad x(3) = [0\ 0]^T,\ d(3) = 0; \quad x(4) = [1\ 0]^T,\ d(4) = 1$$

Assume two hidden layer neurons are used; the centers of the two neurons are set to $c_1 = [1\ 1]^T$, $c_2 = [0\ 0]^T$, and the width of the Gaussian basis function is set to $\sigma = 0.707$.
[Figure: the example network. Inputs $x_1, x_2$ feed two Gaussian hidden neurons with centers $c_1, c_2$; their outputs are weighted by $w_1, w_2$ and summed with a bias $w_0$ to give $f(x)$.]

$$f(x) = \sum_{i=1}^{2} w_i \exp\left(-\frac{\|x - c_i\|^2}{2\sigma^2}\right) + w_0$$
$$\begin{bmatrix} 1 & 0.1353 & 1 \\ 0.3678 & 0.3678 & 1 \\ 0.1353 & 1 & 1 \\ 0.3678 & 0.3678 & 1 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}$$

Solving by linear least squares gives:

$$w_1 = -2.5018, \quad w_2 = -2.5018, \quad w_0 = 2.8404$$
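The example can be checked numerically; the sketch below rebuilds the design matrix and solves the least-square problem (tiny differences from the rounded entries 0.3678 and 0.1353 are expected):

```python
import numpy as np

sigma = 0.707                                             # width, 2*sigma^2 ~ 1
X = np.array([[1., 1.], [0., 1.], [0., 0.], [1., 0.]])    # training inputs x(k)
d = np.array([0., 1., 0., 1.])                            # targets d(k)
C = np.array([[1., 1.], [0., 0.]])                        # fixed centers c1, c2

# Design matrix: one Gaussian column per center plus a bias column for w0
Phi = np.exp(-((X[:, None, :] - C[None, :, :])**2).sum(-1) / (2 * sigma**2))
Phi = np.hstack([Phi, np.ones((len(X), 1))])

# Linear least-square estimate, equivalent to w = (Phi^T Phi)^(-1) Phi^T d
w, *_ = np.linalg.lstsq(Phi, d, rcond=None)
# w is approximately [-2.50, -2.50, 2.84], i.e. w1, w2, w0 above
```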
Strategies for neuron center selection

Random selection of fixed centers: the centers are picked from the training samples, and the width is then set to $\sigma = d_{\max}/\sqrt{2m}$, where $d_{\max}$ is the maximum distance between the selected centers and $m$ is the number of centers.
Self-organized selection of centers

(a) Initialization of cluster centers
(b) Clustering
(c) Finding new cluster centers
(d) Go to step (b) until the cluster center positions and the sample partitioning are no longer changed.
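A minimal sketch of steps (a) through (d) in the style of k-means clustering (NumPy assumed; the function name and the closing width rule $\sigma = d_{\max}/\sqrt{2m}$ are assumptions drawn from the center-selection strategies above):

```python
import numpy as np

def select_centers(X, m, n_iter=100, seed=0):
    """(a) initialize centers, (b) assign each sample to its nearest
    center, (c) move each center to its cluster mean, (d) repeat
    until the centers no longer change."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=m, replace=False)]       # (a)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :])**2).sum(-1)
        labels = d2.argmin(axis=1)                               # (b)
        new_centers = np.array([X[labels == j].mean(axis=0)      # (c)
                                for j in range(m)])
        if np.allclose(new_centers, centers):                    # (d)
            break
        centers = new_centers
    # Width heuristic: sigma = d_max / sqrt(2m)
    d_max = max(np.linalg.norm(a - b) for a in centers for b in centers)
    return centers, d_max / np.sqrt(2 * m)
```

With two well-separated clusters, the returned centers settle on the cluster means.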
Center selection as a model term selection problem

At Step 1:
Consider each training sample as a candidate center and construct a 1-neuron RBF neural network.
[Figure: a 1-neuron RBF network with input $x$, center $c_1$, and output $f(x)$.]
(1) Calculate the response of the neuron to all training samples;
(2) Estimate the weight and calculate the approximation error. Select the candidate having the minimum error, say $c_i$, as the first center.
At Step 2:
Consider each remaining candidate center in the pool as the one to be selected next, and construct 2-neuron RBF neural networks.
[Figure: 2-neuron RBF networks pairing the selected center $c_1$ with each remaining candidate from $\{c_1, \ldots, c_{i-1}, c_{i+1}, \ldots, c_N\}$; the two outputs are summed to give $f(x)$.]
(1) Calculate the response of the neurons to all training samples;
(2) Estimate the weights and calculate the approximation error. Select the candidate that leads to the minimum approximation error, say $c_k$, as the second center.
At Step 3 and beyond:
The procedure continues in the same way, adding one center at a time.
[Figure: a multi-neuron RBF network with selected centers $c_1, c_2, \ldots, c_j$ whose outputs are summed to give $f(x)$.]
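The step-by-step selection above can be sketched as a greedy loop; this is a simplified illustration (NumPy assumed, names hypothetical) in which, at every step, each remaining sample is tried as the next center, the weights are fitted by linear least squares, and the candidate with the smallest error is kept:

```python
import numpy as np

def forward_select(X, d, n_centers, sigma):
    """Bottom-up center selection for a Gaussian RBF network."""
    candidates = list(range(len(X)))   # every training sample is a candidate
    selected = []
    for _ in range(n_centers):
        best_err, best_i = np.inf, None
        for i in candidates:
            trial = selected + [i]
            d2 = ((X[:, None, :] - X[trial][None, :, :])**2).sum(-1)
            Phi = np.exp(-d2 / (2 * sigma**2))
            w, *_ = np.linalg.lstsq(Phi, d, rcond=None)
            err = np.sum((Phi @ w - d)**2)       # approximation error
            if err < best_err:
                best_err, best_i = err, i
        selected.append(best_i)                  # keep the best candidate
        candidates.remove(best_i)
    return selected
```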
The stopping criterion:
In this approach, the centers of the RBF and all other parameters of the network undergo a supervised learning process. In other words, the RBF network takes on its most generalized form. A natural candidate for such a process is error-correction learning, which is most conveniently implemented using a gradient-descent procedure.

The first step in such a procedure is to define the criterion to be optimized:

$$E = \frac{1}{2}\sum_{j=1}^{N} e_j^2, \qquad e_j = f[x(j)] - d(j)$$
By minimizing the cost function E using gradient descent, we can obtain all parameters of the RBF neural network:

$$\{c_j, w_j, \sigma\} = \arg\min(E)$$

The gradient-descent based optimization can be summarized as follows:

$$\frac{\partial E(n)}{\partial w_i(n)} = \sum_{j=1}^{N} e_j(n)\exp\left(-\frac{\|x(j) - c_i(n)\|^2}{2\sigma^2(n)}\right)$$

$$w_i(n+1) = w_i(n) - \eta_1 \frac{\partial E(n)}{\partial w_i(n)}$$

The centers are updated in the same way with step-size $\eta_2$, and for the width:

$$\frac{\partial E(n)}{\partial \sigma(n)} = \sum_{j=1}^{N}\sum_{i=1}^{m} w_i(n)\, e_j(n) \exp\left(-\frac{\|x(j) - c_i(n)\|^2}{2\sigma^2(n)}\right)\frac{\|x(j) - c_i(n)\|^2}{\sigma^3(n)}$$

$$\sigma(n+1) = \sigma(n) - \eta_3 \frac{\partial E(n)}{\partial \sigma(n)}$$

where $\eta_1$, $\eta_2$, and $\eta_3$ are step-sizes. The iteration is repeated until the parameters show no noticeable changes between two iterations.
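The update rules can be sketched as a training loop; this is a simplified illustration (NumPy assumed, names hypothetical) with the three step-sizes for the weights, centers, and width:

```python
import numpy as np

def train_rbf(X, d, centers, sigma, eta1=0.05, eta2=0.05, eta3=0.01,
              n_epochs=2000):
    """Gradient-descent training of w, c, and sigma, minimizing
    E = 1/2 * sum_j e_j^2 with e_j = f[x(j)] - d(j)."""
    c = centers.astype(float).copy()
    w = np.zeros(len(c))
    for _ in range(n_epochs):
        diff = X[:, None, :] - c[None, :, :]          # x(j) - c_i
        d2 = (diff**2).sum(-1)                        # ||x(j) - c_i||^2
        O = np.exp(-d2 / (2 * sigma**2))              # hidden outputs o_i
        e = O @ w - d                                 # errors e_j
        grad_w = e @ O                                # dE/dw_i
        grad_c = ((w * O * e[:, None])[:, :, None] * diff).sum(0) / sigma**2
        grad_s = (w * O * e[:, None] * d2).sum() / sigma**3
        w -= eta1 * grad_w                            # weight update
        c -= eta2 * grad_c                            # center update
        sigma -= eta3 * grad_s                        # width update
    return w, c, sigma
```

In practice the iteration would stop when the parameter changes fall below a tolerance, rather than after a fixed epoch count.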
Application example of RBF neural networks in data mining

Direct mailing to potential customers can be a very effective way to market a product or a service. However, as we all know, much of this junk mail is of no interest to the people who receive it, and most of it is thrown away, which wastes money.

(3) Predict the interest of other customers and select those that have a high probability of being interested.
Among the 5822 customers, only 348 customers have a caravan policy.

Another problem with the training data is that, among the 85
To select the neuron center locations, we first consider each of the 848 (348 + 500) training samples as a candidate, and then use the bottom-up method to select:

$$J = \sum_{k=1}^{848} \{f[x(k)] - d(k)\}^2$$

where d(k) is the class label of the kth training sample, having a value of 1 or -1, and f[x(k)] is the network output for the kth training sample.
Classification error rate of the 848 training samples