Professional Documents
Culture Documents
Fuzzy Extreme Learning Machine For Classification: W.B. Zhang and H.B. Ji
Fuzzy Extreme Learning Machine For Classification: W.B. Zhang and H.B. Ji
tive implementations of SLFNs. To solve the classification problems We can have the KKT corresponding optimality conditions as follows:
using ELM, [2] proposes that ELM can be applied to support vector
machines (SVMs) by simply replacing SVM kernels with ELM ∂LFELM N
kernels. The study of ELM for classification with the standard optimis- = 0 bj = ai,j h(xi )T b = HT a (6a)
∂bj i=1
ation method in [3] verifies that ELM can solve any multiclass classifi-
cation problems directly [4]. However, in many real applications, the ∂LFELM
= 0 ai = Csi ji , i = 1, ..., N (6b)
different input points may not be exactly assigned to one of the ∂ ji
classes such as the imbalance problems and the weighted classification
problems. The traditional ELM lacks this kind of ability. This Letter pro- ∂LFELM
= 0 h(xi )b − TTi + jTi = 0, i = 1, ..., N (6c)
poses a novel method called fuzzy ELM (FELM) where a fuzzy mem- ∂ ai
bership is applied to each input of ELM such that different inputs can
make different contributions to the learning of output weights. The pro- By substituting (6a) and (6b) into (6c), the equations can be equivalently
posed method is suitable for real classification applications because of written as
its ability to solve the problems with different weights.
S
+ HHT a = T (7)
C
Extreme learning machine: ELM [1] studies the generalised SLFNs
⎡1 ⎤
s1 0 0 · · · 0
whose hidden layer need not be tuned and may not be neuron like. In ⎡ T⎤
ELM, all the hidden node parameters are randomly generated, and the t1 ⎢0 1 0 · · · 0⎥
⎢ ⎥ ⎢ s2 ⎥
output weights are analytically determined. Different from traditional where T = ⎣ ... ⎦ and the fuzzy matrix S = ⎢ ⎢ .. ..
⎥
.. ⎥
learning method ELM not only tends to reach the smallest training ⎣ . . .⎦
error but also the smallest norm of output weights: tTN
0 0 0 · · · s1N N ×N
From (6a) and (7),
N
minimise : bh(xi ) − ti −1
i=1 S
b = HT + HHT T (8)
C
and minimise : b (1)
Thus, the inputs with different fuzzy matrix can make different contri-
butions to the learning of the output weights b. Then, the output func-
where bi is the output weight form the ith hidden node to the output tion of FELM classifier is
node, h(xi ) is the output vector of the hidden layer with respect to the −1
input xi and ti is the label corresponding to xi . According to [5], for feed- S
f(x) = h(x)b = h(x)HT + HHT T (9)
forward neural networks reaching a smaller training error, the smaller C
the norm of weights is, the better generalisation performance the net-
works have. Then, the minimal norm least square method is used in For binary classification problems, FELM needs only one output node,
the implementation of ELM: and the decision function is
−1
T S
b = H† T (2) f (x) = sign h(x)H + HH T
T (10)
C
where T = [t1 , t 2 , ..., tN ]T , and H† is the Moore-Penrose generalised For m-class cases, the predicted class label of a testing point is the index
inverse of matrix H. When HT H is nonsingular, H† = ( HT H)−1 hT , number of the output node which has the highest output value.
or when HHT is nonsingular, H† = HT ( H HT )−1 . Compared to
SVM, ELM achieves similar performance for classification and runs label(x) = arg max fj (x) (11)
j[{1,...,m}
at a much faster learning speed. However, a majority of real applications
are weighted classification problems, whereas the traditional ELM lacks Furthermore, according to different applications, the fuzzy matrix S can
this kind of ability. Therefore, this Letter proposes fuzzy ELM to solve be set flexibly to solve the different problems.
this problem.
Simulation results: The performance of different algorithms (SVM,
Fuzzy ELM: In real world classification problems, according to the ELM and FELM) is compared in real world benchmark multiclass
different weights, the effects of the training points should be different. classification data sets which are taken from the UCI Machine
A set S of labelled training points with associated fuzzy membership Learning Repository [7]. The feature mapping used in ELM and