Improved Group Search Optimization Based On Opposite Populations For Feedforward Networks Training With Weight Decay
V_i^k = X_n1^k + F (X_n2^k - X_n3^k)    (1)

U_i,j^k = { V_i,j^k, if rand_j(0, 1) <= CR or j = rand(i)
          { X_i,j^k, otherwise    (2)

X_i^(k+1) = { U_i^k, if f(U_i^k) is better than f(X_i^k)
            { X_i^k, otherwise    (3)
where V_i^k is a mutant vector, i = {1, 2, ..., N}, n1, n2, and n3 are mutually different random integer indices selected from {1, 2, ..., N} (N >= 4 is required), j = {1, 2, ..., D} is the individual dimension, and F ∈ [0, 2] is a real constant which determines the amplification of the added differential variation (X_n2^k - X_n3^k).
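The DE operators in equations (1)-(3) can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name `de_step` and the default values of F and CR are illustrative, and minimization is assumed:

```python
import random

def de_step(pop, fitness, F=0.8, CR=0.9):
    """One DE/rand/1/bin generation: mutation (1), crossover (2), selection (3)."""
    N, D = len(pop), len(pop[0])
    new_pop = []
    for i in range(N):
        # three mutually different random indices, all different from i (N >= 4)
        n1, n2, n3 = random.sample([k for k in range(N) if k != i], 3)
        # mutation, eq. (1): V = X_n1 + F * (X_n2 - X_n3)
        v = [pop[n1][j] + F * (pop[n2][j] - pop[n3][j]) for j in range(D)]
        # binomial crossover, eq. (2): at least one dimension is taken from V
        jrand = random.randrange(D)
        u = [v[j] if (random.random() <= CR or j == jrand) else pop[i][j]
             for j in range(D)]
        # greedy selection, eq. (3): keep the better of U and X (minimization)
        new_pop.append(u if fitness(u) <= fitness(pop[i]) else pop[i])
    return new_pop
```

Because selection in eq. (3) is greedy per member, each member of the new population is never worse than its predecessor.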
V_i^k = X_p^k + r3 (X_n2^k - X_n3^k)    (9)

V_i^k = X_n1^k + r3 (X_n2^k - X_n3^k)    (10)

where r3 ∈ R^D is a uniform random sequence in (0, 1), X_p is the producer, and n1, n2, and n3 are mutually different random integer indices selected from {1, 2, ..., N}.
In these approaches, each member X_i in the group is composed of a set of weights between the input layer and the hidden nodes (W_1), the hidden biases (θ_1), a set of weights between the hidden layer and the output layer (W_2), and the output biases (θ_2) [4]:

X_i = [W_1, θ_1, W_2, θ_2]
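The encoding above can be sketched in a few lines of Python. This is a minimal illustration under the stated scheme (single hidden layer, values drawn from [-1, 1]); the function names `init_member` and `unpack` are illustrative:

```python
import random

def init_member(n_in, n_hidden, n_out):
    """Flat member X_i = [W1, theta1, W2, theta2], each value in [-1, 1]."""
    size = n_in * n_hidden + n_hidden + n_hidden * n_out + n_out
    return [random.uniform(-1.0, 1.0) for _ in range(size)]

def unpack(x, n_in, n_hidden, n_out):
    """Split the flat vector back into W1, theta1, W2, theta2."""
    a = n_in * n_hidden            # end of W1
    b = a + n_hidden               # end of theta1
    c = b + n_hidden * n_out       # end of W2
    W1 = [x[r:r + n_in] for r in range(0, a, n_in)]          # n_hidden rows
    theta1 = x[a:b]
    W2 = [x[b + r * n_hidden:b + (r + 1) * n_hidden] for r in range(n_out)]
    theta2 = x[c:]
    return W1, theta1, W2, theta2
```

Flattening the network this way lets any population-based optimizer treat a whole ANN as a single real-valued vector.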
For each member, all variables are randomly initialized within the range [-1, 1]. The fitness function adopted is the mean squared error (MSE) on the validation set:

E = (1/N) Σ_{i=1}^{N} Σ_{k=1}^{C} (t_k^i - o_k^i)^2    (11)

where t_k^i is the target output related to the ith individual concerning the kth class, o_k^i is the output obtained by the ANN for the ith individual concerning the kth class, and C is the number of output nodes.
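Equation (11) amounts to a short sum over patterns and output nodes. A minimal sketch (the function name is illustrative; targets and outputs are assumed to be lists of per-pattern output vectors):

```python
def mse_fitness(targets, outputs):
    """Eq. (11): mean squared error over N validation patterns, C output nodes."""
    N = len(targets)
    return sum(sum((t - o) ** 2 for t, o in zip(t_i, o_i))
               for t_i, o_i in zip(targets, outputs)) / N
```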
1.  k = 0;
2.  Randomly initialize positions and head angles of all members;
3.  Randomly initialize initial population P;
4.  Find the corresponding opposite population OP to P;
5.  Select the fittest N members from P ∪ OP to form the initial population;
6.  Calculate the fitness values of initial members;
7.  WHILE (the termination conditions are not met)
8.    FOR (each member i in the group)
9.      Choose producer: find the producer X_p of the group;
10.     Perform producing: perform producer behaviour;
11.     Perform scrounging: randomly select a percentage of the remaining members to perform scrounging;
12.     Perform dispersion: the remaining members (rangers) perform ranging;
13.     Fitness evaluation: calculate the fitness value of the current member;
14.     Regularization coefficient update: using (8);
15.   END FOR
16.   IF rand(0, 1) < ir
17.     Find the corresponding opposite population OP to the current population P;
18.     Select the fittest N members from P ∪ OP to form the new population;
19.   END IF
20.   k = k + 1;
21. END WHILE

Figure 2 - Pseudocode for the OGSO-WD algorithm
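The opposition step in Figure 2 (lines 16-18) can be sketched as below. This assumes the standard opposition-based learning rule, where the opposite of a value x_j in [L_j, U_j] is L_j + U_j - x_j; the function name `opposition_step` and the default bounds are illustrative, and minimization is assumed:

```python
import random

def opposition_step(pop, fitness, lower=-1.0, upper=1.0, jump_rate=0.3):
    """With probability jump_rate, build the opposite population OP and
    keep the fittest N members of P union OP (minimization)."""
    if random.random() >= jump_rate:
        return pop
    N = len(pop)
    # opposite member: each coordinate reflected about the bound midpoint
    opp = [[lower + upper - xj for xj in x] for x in pop]
    # rank P union OP by fitness and keep the best N
    return sorted(pop + opp, key=fitness)[:N]
```

Because the selected population is the best N of 2N candidates, the best fitness in the group can only improve or stay the same after a jump.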
IV. RESULTS AND DISCUSSION

In this section, we compare the two new GSO approaches with the LM, ODE [15], traditional GSO [2, 3] and GSO-WD presented in our early work [7]. All programs were run in a MATLAB 6.0 environment. A validation set is used in all evaluated methodologies to prevent overfitting.

For evaluating all of these algorithms, five benchmark classification datasets, obtained from the UCI Machine Learning Repository [27], have been used. These datasets present different degrees of difficulty and different numbers of classes. The evaluation metrics used are an empirical analysis and a paired hypothesis test of type t-test (α = 5%) [28] over the test accuracies obtained by each method on each dataset. In the experiments, some parameters have been fixed for all datasets (Table 1), according to [2, 3, 7, 15, 19]. The maximum pursuit distance l_max was given as follows:
l_max = ||U - L|| = √( Σ_{i=1}^{D} (U_i - L_i)^2 )    (12)
where Li and Ui are the lower and upper bounds for the ith
dimension.
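The maximum pursuit distance is simply the Euclidean distance between the bound vectors, as a one-line sketch shows (the function name is illustrative):

```python
import math

def max_pursuit_distance(lower, upper):
    """l_max = ||U - L||: Euclidean norm of the difference of the bound vectors."""
    return math.sqrt(sum((u - l) ** 2 for l, u in zip(lower, upper)))
```

For the [-1, 1] initialization range used here, l_max = 2√D, where D is the member dimension.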
1.  k = 0;
2.  Randomly initialize positions and head angles of all members;
3.  Randomly initialize initial population P;
4.  Find the corresponding opposite population OP to P;
5.  Select the fittest N members from P ∪ OP to form the initial population;
6.  Calculate the fitness values of initial members;
7.  WHILE (the termination conditions are not met)
8.    FOR (each member i in the group)
9.      Choose producer: find the producer X_p of the group;
10.     Perform producing: perform producer behaviour;
11.     IF (GSO group)
12.       Perform scrounging: randomly select a percentage of the remaining members to perform scrounging;
13.       Perform dispersion: the remaining members (rangers) perform ranging;
14.     ELSE (modified DE group)
15.       S-Mutation: randomly select a percentage of the remaining members to mutate following eq. (9);
16.       R-Mutation: the remaining members mutate following eq. (10);
17.       Crossover: each member of the modified DE group executes crossover according to eq. (2);
18.     Fitness evaluation: calculate the fitness value of the current member;
19.     Regularization coefficient update: using (8);
20.   END FOR
21.   IF rand(0, 1) < ir
22.     Find the corresponding opposite population OP to the current population P;
23.     Select the fittest N members from P ∪ OP to form the new population;
24.   END IF
25.   k = k + 1;
26. END WHILE

Figure 3 - Pseudocode for the OGSO-MDE-WD algorithm
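The split of the modified DE group in Figure 3 (lines 15-16) randomly assigns a percentage of the non-producer members to S-mutation and the remainder to R-mutation. A minimal sketch of that dispatch, with illustrative names (the paper's own member update is not reproduced here):

```python
import random

def split_rest(indices, s_percentage=0.8):
    """Randomly split the non-producer members: s_percentage of them perform
    S-mutation (eq. 9), the rest perform R-mutation (eq. 10)."""
    rest = list(indices)
    random.shuffle(rest)
    cut = int(round(s_percentage * len(rest)))
    return rest[:cut], rest[cut:]
```

With the 80% S-mutation percentage of Table 1, ten non-producer members would split into eight S-mutation members and two R-mutation members.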
Table 1 - Fixed parameters for all algorithms

Algorithm      Parameter                      Value
GSO            PN                             50
               Scroungers percentage          80%
               θ_max                          π/a^2
               α_max                          θ_max/2
               Maximum number of iterations   50
OGSO-MDE-WD    S-Mutation percentage          80%
               Cr                             0.8
ODE            F                              1.0
               ir                             0.3
Levenberg-     Number of hidden nodes         6
Marquardt      Maximum number of epochs       200
Weight Decay   λ                              5 × 10
               INC                            1 × 10
We tested, for each dataset, different maximum numbers of GSO iterations, using the global best value found (Table 1). Each dataset was divided into training, validation and testing sets, as specified in Table 2. For all algorithms, 50 independent executions were done with each dataset. The training, validation and testing sets were randomly generated at each trial of simulations. The best results achieved by LM, GSO, GSO-WD and the new GSO approaches are shown in bold.
Table 2 - Dataset specifications
Dataset Cancer Diabetes Heart Iris Wine
Training 350 384 130 70 78
Validation 175 192 70 40 50
Testing 174 192 70 40 50
From Table 3 through Table 7, the results obtained for each method are shown. Table 3 shows the results obtained for the cancer dataset. The best result was achieved by the hybrid OGSO-MDE-WD (96.41% of test accuracy) in an empirical evaluation, followed by OGSO-WD and GSO-WD [7], respectively, and these approaches outperformed the ODE, traditional GSO and LM approaches. The t-test showed that OGSO-MDE-WD achieved better results than the ODE, traditional GSO and LM algorithms.
Table 3 - Results for Cancer

Classification Algorithm   Training Time (s)   Test Accuracy (%)
LM                         0.28376             87.32 ± 15.22
ODE                        2021.67             95.84 ± 1.30
GSO                        1041.38             95.68 ± 1.44
GSO-WD                     1043.84             96.08 ± 1.41
OGSO-WD                    2326.39             96.23 ± 1.43
OGSO-MDE-WD                1272.48             96.41 ± 1.45
For the diabetes case (Table 4), in an empirical analysis OGSO-WD (76.61% of test accuracy) outperformed all other algorithms, followed by OGSO-MDE-WD. The t-test showed that OGSO-WD outperformed the ODE, GSO and LM algorithms, and OGSO-MDE-WD outperformed the ODE and LM algorithms.
Table 4 - Results for Diabetes

Classification Algorithm   Training Time (s)   Test Accuracy (%)
LM                         0.31474             70.61 ± 7.99
ODE                        3921.26             75.30 ± 2.48
GSO                        1205.83             75.59 ± 2.39
GSO-WD                     1246.65             75.61 ± 2.80
OGSO-WD                    2306.60             76.61 ± 2.66
OGSO-MDE-WD                1733.45             76.34 ± 2.51
Table 5 - Results for Heart

Classification Algorithm   Training Time (s)   Test Accuracy (%)
LM                         0.22700             71.57 ± 12.21
ODE                        2373.30             79.46 ± 5.73
GSO                        2067.71             79.62 ± 5.25
GSO-WD                     2148.87             80.43 ± 5.47
OGSO-WD                    1340.83             80.51 ± 5.92
OGSO-MDE-WD                1510.72             80.34 ± 5.19
Table 5 shows that OGSO-WD (80.51% of test accuracy) achieved the highest test accuracy empirically for the heart dataset. All GSO-based algorithms outperformed the ODE, traditional GSO and LM techniques empirically, but according to the t-test their results are similar to each other, except for LM, which was worse than all other approaches, presenting a high degree of instability.
In relation to the iris plant dataset (Table 6), the best empirical results were achieved by OGSO-MDE-WD (96.05% of test accuracy). All GSO-based algorithms outperformed the ODE, traditional GSO and LM techniques empirically, but according to the t-test their results are similar to each other, except, again, for the LM algorithm (the worst approach).
Table 6 - Results for Iris

Classification Algorithm   Training Time (s)   Test Accuracy (%)
LM                         0.2033              76.60 ± 24.53
ODE                        3083.65             95.20 ± 3.60
GSO                        797.23              95.35 ± 3.67
GSO-WD                     781.72              95.70 ± 3.35
OGSO-WD                    2692.64             95.85 ± 3.30
OGSO-MDE-WD                1337.63             96.05 ± 3.43
For the wine dataset (Table 7), the hybrid OGSO-MDE-WD again outperformed all approaches in an empirical analysis, followed by the new OGSO-WD and GSO-WD, respectively. Once more, the t-test showed that all methods but LM achieved similar performances.
Table 7 - Results for Wine

Classification Algorithm   Training Time (s)   Test Accuracy (%)
LM                         0.322               79.72 ± 22.28
ODE                        6939.87             96.04 ± 2.84
GSO                        2983.68             96.00 ± 2.71
GSO-WD                     4409.86             96.32 ± 3.46
OGSO-WD                    1788.65             96.88 ± 2.56
OGSO-MDE-WD                1311.14             97.00 ± 2.72

V. CONCLUSION
In this paper, we introduced two new hybrid learning approaches based on opposite populations and GSO for training ANNs, namely OGSO-WD and OGSO-MDE-WD. Both approaches use the concept of opposition-based learning (OBL) [14] to accelerate the convergence of the GSO method. The performance of the tested algorithms was evaluated with well-known benchmark classification datasets [27]. Experimental results show that the hybrid approaches achieved better generalization performance than ODE, original GSO and LM for all tested datasets. For two cases (cancer and diabetes), the new GSO-based techniques outperformed ODE, GSO and LM based on the t-test evaluation (confidence coefficient of 95%).
ACKNOWLEDGMENT

The authors would like to thank FACEPE, CNPq and CAPES (Brazilian research agencies) for their financial support.

REFERENCES
[1] C. J. Barnard and R. M. Sibly, "Producers and Scroungers: A General Model and Its Application to Captive Flocks of House Sparrows", Animal Behaviour, vol. 29, pp. 543-550, 1981.
[2] S. He, Q. H. Wu, and J. R. Saunders, "A Novel Group Search Optimizer Inspired by Animal Behavioural Ecology", in: 2006 IEEE Congress on Evolutionary Computation (CEC 2006), Vancouver, pp. 1272-1278, 2006.
[3] S. He, Q. H. Wu and J. R. Saunders, "Group Search Optimizer: An Optimization Algorithm Inspired by Animal Searching Behaviour", IEEE Transactions on Evolutionary Computation, vol. 13, no. 5, pp. 973-990, 2009.
[4] S. He, Q. H. Wu and J. R. Saunders, "Breast Cancer Diagnosis Using an Artificial Neural Network Trained by Group Search Optimizer", Transactions of the Institute of Measurement and Control, vol. 31, no. 6, pp. 517-531, 2009.
[5] Q. H. Wu, Z. Lu, M. S. Li and T. Y. Ji, "Optimal Placement of FACTS Devices by a Group Search Optimizer with Multiple Producers", in: 2008 IEEE Congress on Evolutionary Computation (CEC 2008), Hong Kong, pp. 1033-1039, 2008.
[6] S. He and X. Li, "Application of a Group Search Optimization based Artificial Neural Network to Machine Condition Monitoring", in: 13th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA 2008), pp. 1260-1266, 2008.
[7] D. N. G. Silva, L. D. S. Pacífico and T. B. Ludermir, "Improved Group Search Optimizer Based on Cooperation Among Groups for Feedforward Networks Training with Weight Decay", in: 2011 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2011), Anchorage, pp. 2133-2138, 2011.
[8] W. J. O'Brien, B. I. Evans and G. L. Howick, "A New View of the Predation Cycle of a Planktivorous Fish, White Crappie (Pomoxis Annularis)", Canadian Journal of Fisheries and Aquatic Sciences, vol. 43, pp. 1894-1899, 1986.
[9] I. D. Couzin, J. Krause, N. R. Franks and S. A. Levin, "Effective Leadership and Decision-Making in Animal Groups on the Move", Nature, vol. 434, pp. 513-516, 2005.
[10] C. L. Higgins and R. E. Strauss, "Discrimination and Classification of Foraging Paths Produced by Search-Tactic Models", Behavioral Ecology, vol. 15, pp. 248-254, 2003.
[11] A. F. Dixon, "An Experimental Study of the Searching Behaviour of the Predatory Coccinellid Beetle Adalia Decempunctata", Journal of Animal Ecology, vol. 28, pp. 259-281, 1959.
[12] R. Storn and K. Price, "Differential Evolution - A Simple and Efficient Adaptive Scheme for Global Optimization Over Continuous Spaces", Berkeley, CA, Tech. Rep. TR-95-012, 1995.
[13] R. Storn and K. Price, "Differential Evolution - A Simple and Efficient Heuristic for Global Optimization Over Continuous Spaces", Journal of Global Optimization, vol. 11, pp. 341-359, 1997.
[14] H. R. Tizhoosh, "Opposition-Based Learning: A New Scheme for Machine Intelligence", in: Proc. Int. Conf. on Computational Intelligence for Modelling, Control and Automation, Vienna, vol. 1, pp. 695-701, 2005.
[15] S. Rahnamayan, H. R. Tizhoosh and M. M. A. Salama, "Opposition-Based Differential Evolution", IEEE Transactions on Evolutionary Computation, vol. 12, no. 1, pp. 64-79, 2008.
[16] J. Kennedy and R. Eberhart, "Particle Swarm Optimization", in: Proc. IEEE International Conference on Neural Networks, pp. 1942-1948, 1995.
[17] J. Kennedy and R. Eberhart, "Swarm Intelligence", Morgan Kaufmann Publishers, San Francisco, CA, 2001.
[18] F. van den Bergh, "An Analysis of Particle Swarm Optimizers", PhD dissertation, Faculty of Natural and Agricultural Sciences, University of Pretoria, Pretoria, South Africa, 2002.
[19] M. Carvalho and T. B. Ludermir, "Particle Swarm Optimization of Feed-Forward Neural Networks with Weight Decay", in: Proceedings of the Sixth International Conference on Hybrid Intelligent Systems (HIS'06), pp. 5-10, 2006.
[20] S. Haykin, "Neural Networks: A Comprehensive Foundation", 2nd Edition, Prentice Hall, 1998.
[21] A. S. Weigend, D. E. Rumelhart and B. A. Huberman, "Generalization by Weight-Elimination with Application to Forecasting", in: Advances in Neural Information Processing Systems 3 (NIPS 3), pp. 875-882, 1990.
[22] C. Zanchettin and T. B. Ludermir, "Global Optimization Methods for Designing and Training Feedforward Artificial Neural Networks", Dynamics of Continuous, Discrete & Impulsive Systems (DCDIS) Series A, Supplement on Advances in Neural Networks, vol. 14 (S1), pp. 328-337, 2007.
[23] A. E. Eiben and J. E. Smith, "Introduction to Evolutionary Computing", Natural Computing Series, Springer, Berlin, 2003.
[24] S. Kirkpatrick, C. D. Gelatt Jr. and M. P. Vecchi, "Optimization by Simulated Annealing", Science, vol. 220, pp. 671-680, 1983.
[25] F. Glover, "Future Paths for Integer Programming and Links to Artificial Intelligence", Computers and Operations Research, vol. 13, pp. 533-549, 1986.
[26] M. Dorigo, V. Maniezzo and A. Colorni, "Ant System: Optimization by a Colony of Cooperating Agents", IEEE Transactions on Systems, Man and Cybernetics - Part B, vol. 26, no. 1, pp. 29-41, 1996.
[27] A. Frank and A. Asuncion, UCI Machine Learning Repository, University of California, School of Information and Computer Sciences, Irvine, CA, 2011. [Online]. Available: http://archive.ics.uci.edu/ml
[28] M. H. DeGroot, "Probability and Statistics", 2nd edition, 1989.