EngOpt 2008 - International Conference on Engineering Optimization

Rio de Janeiro, Brazil, 01 - 05 June 2008.

Efficient evolutionary method to approximate the Pareto optimal set in multiobjective optimization
Timo Aittokoski, Kaisa Miettinen

Dept. of Mathematical Inf. Technology, P.O. Box 35 (Agora), FI-40014 University of Jyväskylä, Finland
timo.aittokoski@jyu.fi, kaisa.miettinen@jyu.fi

1. Abstract
Solving real-life engineering problems often requires multiobjective, global and efficient (in terms of objective function evaluations) treatment. In this study, we consider problems of this type by discussing some drawbacks of the current methods and then introduce a new population-based multiobjective optimization algorithm which produces a dense (not limited to the population size) approximation of the Pareto optimal set in a computationally effective manner.

2. Keywords: efficient Pareto front approximation, multicriteria optimization, population-based approaches

3. Introduction
Many real-life industrial optimization problems are demanding in that they contain multiple conflicting objectives, they cannot be solved by local methods because of several local optima, and the solution process should be computationally efficient because objective function values may be provided by time-consuming "black box" simulations. For these reasons, in this study we focus on treating multiobjective, global and computationally expensive problems (in box-constrained domains) with a limited budget for objective function evaluations.
With multiobjective optimization problems, the concepts of non-dominated and Pareto optimal solu-
tions are relevant. Solution A is said to dominate solution B if all components of A are at least as good
as those of B (with at least one strictly better component) and A is non-dominated if it is not dominated
by any solution. Correspondingly, solution A belongs to the Pareto optimal set if none of the objective
function values can be improved without degrading the value of at least one objective, that is, if it is
not dominated by any other feasible solution. We need a decision maker (DM) and his or her preference
information to judge which Pareto optimal solution is the most satisfactory one as a final solution.
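For concreteness, the dominance test described above can be sketched as follows (an illustrative Python snippet, assuming minimization; not part of the original paper):

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization): a is at
    least as good in every component and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```

For example, (1, 2) dominates (2, 2), while (1, 3) and (2, 2) are mutually non-dominated.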
Among widely used approaches in solving demanding engineering problems with multiple objectives
are interactive scalarization-based methods where preference information is iteratively extracted from the
user, and a single (or a small set of) Pareto optimal solution(s) is produced at a time (see, e.g., [20]) and
evolutionary multiobjective optimization (EMO) approaches (see, e.g., [5]) producing an approximative
representation of the whole Pareto front. Both approaches have some drawbacks. Even though interactive methods generate only those Pareto optimal solutions that the DM is interested in, computational complexity may blur the interactive nature of the iterative solution process if the DM must wait long for solutions to be generated. Moreover, at the beginning of such a process, specifying preference information may be difficult before the DM has gained some understanding of the problem (see, e.g., [14]).
On the other hand, EMO algorithms are often computationally expensive, or the resulting approximation (final population) may have too few solutions to properly represent the whole Pareto front. Further, if the approximation has a high number of solutions, it may become cognitively very challenging for the DM to select the final solution, unless some proper tool or visualization of the set is available. Unfortunately, intuitive visualization of the approximation set is practically impossible when there are more than three objectives.
Although EMO approaches have some drawbacks, the basic concept of approximating algorithms is
appealing for computationally costly problems: the approximation of the Pareto set can be produced
off-line, and the DM is involved in the solution process only after all the heavy computation is finished
(although it is also possible to employ preference information in an EMO [32]).
Motivated by the limited budget for objective function evaluations, we tackle the aforementioned problems by introducing a new population-based multiobjective optimization algorithm which produces a dense (not limited to the population size in a traditional sense) approximation of the Pareto optimal set in a computationally effective manner, thus reducing the number of objective function calls needed.
The remainder of this study is organized as follows. In Section 4 we discuss some drawbacks of the
current EMO methods and then in Section 5 we propose some possible improvements, and formulate
an algorithm implementing them. In Section 6 we show some experimental results comparing the new
algorithm and the well-known NSGA-II algorithm [6] using test problems from the literature. Finally,
we conclude in Section 7.

4. Some drawbacks of the current EMO approaches
When employing EMO for a particular problem, the approximated front should be as close as possible to the real front and, at the same time, cover it as well as possible. Many current EMO implementations have drawbacks which either hinder performance or leave the end user unaware of whether the chosen algorithm parameters are valid. In the following, we discuss some of these drawbacks.

4.1. Convergence
It does not seem to be widely recognized that often-cited EMO algorithms, such as the Pareto Archived Evolution Strategy (PAES), the Strength Pareto Evolutionary Algorithm (SPEA) or the Elitist Non-Dominated Sorting Genetic Algorithm (NSGA-II), are not guaranteed to converge [18]. Rather, at some point of the solution process the populations start to oscillate, as shown below in Figure 1. Though this behavior has been recognized during the last decade, for example, by [5, S. 6.2.5], [9], [18], [25], [26] and [27], attempts towards truly convergent EMOs have been made only in recent years, for example, in the form of hybrids (e.g., [12]) or an epsilon-dominance approach (e.g., [18]).
Besides, elitism (which is closely related to convergence) seems to be a difficult concept in the context of EMOs. In single objective problems, elitism means that some number of best solutions is always copied to the next generation, and for some single objective problems Rudolph proved convergence to the global optimum with elitism in [24]. With multiobjective problems, the meaning of elitism is no longer straightforward, at least not when the population is already full of non-dominated solutions, which are mathematically equivalent. For example, in NSGA-II [6], the elitist scheme means that the parent and child populations are combined, and only non-dominated solutions of that combined population survive. It is worth mentioning that although Rudolph [26] proved the convergence of certain types of EMO algorithms (accepting a solution only if it dominates at least one of the current solutions), elitism in the sense described above has nothing to do with Rudolph's proof. It seems that most of the current EMO approaches do not satisfy the convergence conditions presented by Rudolph.
Generally, the convergence problem arises from the concept of non-dominance, in conjunction with the diversity preservation mechanisms used. In short, for example, in NSGA-II, the parent and child populations are combined and non-domination sorting is executed. If there exist more non-dominated solutions than the population size allows, some solutions are pruned using a diversity measure. As all non-dominated solutions in the population are mathematically equivalent, they cannot be ordered, but in reality some of them may be located closer to the real Pareto optimal front than others. However, while selecting solutions for the next generation with a diversity measure, a solution located very near the Pareto optimal front may be replaced by another non-dominated solution which improves diversity but is at the same time located much farther from the Pareto front, thus leading to oscillation.
In several EMO algorithms, due to oscillation, the average distance between points in the final population and the real Pareto front is dictated by the population size rather than the number of generations, which may seem contradictory. We illustrate this behavior in Figure 1, where the generational distance GD (see Section 6.1.) is plotted against the number of objective function evaluations with population sizes 12, 24 and 48 using the test problem ZDT1 [37] and NSGA-II [6]. Naturally, similar behavior occurs also with other problems.
With the smallest population, GD stagnates before 5000 evaluations, and with larger populations before 7000 evaluations. After that, GD starts to oscillate, which is clearly seen in the magnification of GD values between 12000 and 16000 evaluations, because non-dominated solutions near the Pareto front are occasionally switched to ones located farther away. It is also visible in the smaller picture that the final oscillation level depends on the population size.

Figure 1: Development of generational distance (GD) against objective function evaluations.

This phenomenon has probably often remained unnoticed, as most test problems are bi-objective, and in these cases, the populations used are typically
sufficiently dense to give an illusion of true convergence. If convergence near the Pareto optimal front is
of high importance, it seems that the population size should be as large as possible.

4.2. Deterioration of population
As seen in the previous subsection, in current EMO approaches some of the non-dominated solutions may be replaced by ones which result in better diversity but which are at the same time farther from the real Pareto front. This inevitably leads to deterioration of the population, by which we mean that the history of all evaluated solutions contains solutions which dominate solutions in the current population. We refer to these as deteriorated solutions, and if the population contains such solutions, we say it is deteriorated. For background, see [9] and [18].
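The deterioration check described above can be sketched as follows (an illustrative snippet with hypothetical helper names, assuming minimization and that all evaluated objective vectors have been archived):

```python
def dominates(a, b):
    """Standard Pareto dominance for minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def deteriorated_fraction(population, history):
    """Fraction of current population members (objective vectors) that are
    dominated by at least one previously evaluated solution in `history`."""
    bad = sum(1 for p in population
              if any(dominates(h, p) for h in history))
    return bad / len(population)
```

For instance, if the population is [(2, 2), (0, 5)] and the history contains (1, 1), the member (2, 2) is deteriorated and the fraction is 0.5.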
In our tests with NSGA-II [6], for example, with the bi-objective test problems ZDT1, ZDT3 [37] and the tri-objective problems DTLZ2, DTLZ5 and DTLZ7 [7], 12, 10, 23, 32 and 29% of the population members, respectively, were deteriorated after 25000 objective function evaluations. This readily suggests that the algorithm has wasted some objective function evaluations and could actually have performed better. Moreover, if a population gets heavily deteriorated, it seems intuitively plausible that the children in future generations will also be worse than with a non-deteriorated population.

4.3. Arbitrary size of population
In the field of population based single objective optimization, the population size is often selected with
regard to the number of decision variables k. As a common rule of thumb, the population size is set to
around 10 times k.
With EMO methods, the population size is very often set to 100 or 200 (at least with the common ZDT and DTLZ test problems), while the number of decision variables ranges from 7 to 30 with these problems. Thus, not even a rule of thumb is given (although such a rule may be very arbitrary also in the single objective case). With a real-life engineering problem, where only one optimization run may be possible, the lack of any rule makes it very difficult to choose a proper population size.
With a fixed population size, there are typically only a few non-dominated solutions in each population during the first generations. For example, with the problem ZDT1 and NSGA-II as a solver, it typically takes more than 60 generations (equaling 6000 objective function evaluations) before a population of size 100 is fully occupied with non-dominated solutions. Let us point out that in the case of NSGA-II, the inclusion of dominated solutions is not very detrimental, as the population members that are allowed to breed are picked using binary tournament selection, thus emphasizing non-dominated solutions.
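The binary tournament selection mentioned above can be sketched as follows (an illustrative snippet, assuming each solution carries a precomputed dominance rank and crowding distance; the data layout is hypothetical):

```python
import random

def binary_tournament(population, rank, crowding):
    """Pick one breeding parent, NSGA-II style: of two random candidates,
    prefer the lower dominance rank; break ties by larger crowding distance."""
    a, b = random.sample(population, 2)
    if rank[a] != rank[b]:
        return a if rank[a] < rank[b] else b
    return a if crowding[a] >= crowding[b] else b
```

Because candidates with lower rank always win, dominated solutions breed less often even though they remain in the population.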
On the contrary, with, for example, differential evolution (DE) [30] based EMOs, such as generalized differential evolution (GDE3) [17], where each population member produces one child at a time, dominated solutions are also fully allowed to breed. In some sense, this seems a very counter-intuitive and inefficient approach, as the algorithm should place more emphasis on high quality non-dominated solutions. We propose that this fact causes DE based EMOs to converge more slowly at the beginning of the optimization process, when compared, for example, to NSGA-II. This hypothesis is supported by the results of the CEC competition [31], where GDE3 did not perform very well with a small number of generations, but eventually came up with good results.
As the issues discussed above suggest, a desirable population size could be relatively small at the beginning of an optimization process, while towards the end a method could benefit from an essentially larger population. With a larger population, oscillation would be diminished and convergence closer to the real Pareto front would be possible. Indeed, it has been proven in [25] that EMO algorithms where the population at step n + 1 is the union of the non-dominated parents and children of step n converge to the real Pareto front. Obviously, in this case, the population size will typically grow from generation to generation.

4.4. Diversity maintenance
Diversity maintenance is one of the key issues with EMO algorithms. As mentioned earlier, a desirable
set of solutions should closely approximate the real Pareto front, and at the same time it should cover
the entire range of the Pareto front. To this end, the NSGA-II algorithm [6] uses a so-called crowding
distance, which measures how crowded the neighborhood of each of the solutions in the population is.
In the selection phase, most crowded individuals are left out of the next population (pruned), and this
should lead to good diversity. Besides NSGA-II, a crowding distance approach is also employed by
several other algorithms (e.g. [19], [22], [23] and [29]).
Although the idea of a crowding distance is very useful, if not essential, the implementation given in [6] is not properly applicable to more than two objectives. Moreover, even in the case of only two objectives, the crowding estimation can fail severely. This misbehavior is well illustrated in [15] in Figures 2, 16 and 17.
A viable way to implement pruning also with a higher number of objectives is presented in [16]. In
this approach, for each individual, distances to some number of its nearest neighbors are calculated, and
a product of these distances is used as a crowding estimator for the given individual. The population is
pruned individual by individual, removing the most crowded individual at a time. The drawback of this
approach is its high computational cost, although a sophisticated way to compute the crowding estimate
is given in [16].
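The pruning idea of [16] can be sketched as follows (an illustrative snippet only; the naive full re-computation shown here is exactly the high computational cost mentioned above, which [16] alleviates with a more sophisticated scheme):

```python
import math

def crowding_estimate(i, points, k=2):
    """Product of Euclidean distances from points[i] to its k nearest
    neighbors; a smaller product means a more crowded neighborhood."""
    dists = sorted(math.dist(points[i], p)
                   for j, p in enumerate(points) if j != i)
    prod = 1.0
    for d in dists[:k]:
        prod *= d
    return prod

def prune(points, target_size, k=2):
    """Remove the most crowded point, one at a time, until target_size
    remains (crowding is naively recomputed after every removal)."""
    pts = list(points)
    while len(pts) > target_size:
        worst = min(range(len(pts)),
                    key=lambda i: crowding_estimate(i, pts, k))
        pts.pop(worst)
    return pts
```

For example, pruning the collinear set {0, 0.1, 1, 2} to three points removes the point at 0.1, whose two nearest-neighbor distances give the smallest product.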
As discussed earlier, diversity preservation seems to conflict somewhat with convergence, as the population starts to oscillate. As stated, this is due to the concept of domination and is not easily avoidable unless special measures are taken.

5. Proposed algorithm
In the previous section, we highlighted some drawbacks of the current EMO algorithms. Here we propose a simple, yet efficient algorithm in response to those drawbacks. We hope that this work refreshes some already known facts and also inspires other researchers to develop algorithms with better performance.
The basic feature of our algorithm is the use of an unrestricted population, which has no artificial size limit. The population has only a minimum size minsize, which is dictated by the minimum number of points needed by the point generation mechanism to work. If the number of non-dominated solutions is less than the minimum size, we take dominated solutions based on their dominance rank, similarly to NSGA-II. Otherwise, all non-dominated solutions are accepted into the population. In this way, we can avoid several of the drawbacks discussed above. With this approach, there is no longer a need for the end user to select the population size, and at each phase of the optimization process, the population contains all the non-dominated solutions found so far. At the beginning of the process this leads to a very small population, and each non-dominated solution is bred with a high frequency. This should obviously improve the convergence speed, as there are no bad solutions producing offspring. When the optimization process proceeds, the population size grows, and this should allow convergence closer to the real Pareto front [25]. Deterioration of the population is no longer a problem, since the population contains only non-dominated solutions generated during the process, and indeed all of them.
Furthermore, no explicit diversity preservation mechanism is needed, and yet proper diversity is attained. At first glance, this statement may seem far-fetched, but it is actually a natural consequence of the fact that no non-dominated point is expelled from the population during the process. As long as the point generation mechanism works properly (and we are free to incorporate whichever point generation mechanism we choose), we retain all the non-dominated points. Here it is essential to realize that, for example, neither NSGA-II nor DE-based EMOs purposely strive to generate solutions that spread widely all along the Pareto front. Rather, they just select points that are located in less crowded regions. If all the solutions are kept, no diversity preservation mechanism is needed (assuming the point generation works reasonably).
There is also an additional benefit in storing all the non-dominated points: in this way, we retain a maximal amount of information about the Pareto optimal front gained during the optimization run. However, as the number of solutions in the final population may grow, it may become cognitively very challenging for the DM to select the final solution. This is especially the case if the number of objectives is more than two or three, when no intuitive visualization of the final population is straightforward. For these high dimensional cases, to select the final solution, we need special approaches like those presented in [4], [8], [10], [21] and [33].
As the point generation mechanism is an important part of our algorithm (when more high quality solutions are produced, convergence gets faster), it should work sensibly. As DE has been successfully applied to a wide range of optimization problems (ranging, e.g., from optimization of water pumping systems [2] to internal combustion engine design [1]), we considered its point generation mechanism sufficiently developed, although we are aware that it has some possible defects [28].
Now we can present the steps of our algorithm as follows:

1. Initialize the population using minsize random points within the given search space.
2. Evaluate the new points.
3. Combine the current population with the evaluated points. Identify non-dominated solutions, and
take all these to the next population. If the minimum size of population is not reached, take
again non-dominated solutions from the remaining points, and continue until the minimum size is
reached.
4. Select randomly burstsize points from the current population to be used as parents in the point generation mechanism. Generate one new child point for every parent point using the point generation mechanism of DE. In the creation of the trial point, all points in the current population may participate. Points which are not inside the given search space are truncated to the border, as in NSGA-II.
5. Evaluate the child population, and if the budget for objective function evaluations is not exhausted,
go back to Step 3.
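The steps above can be sketched in Python as follows. This is one possible reading of the algorithm, not a reference implementation: the helper names are hypothetical, the dominance-rank padding of Step 3 is approximated by sorting dominated points by how many solutions dominate them, and a simplified DE/rand/1 binomial crossover stands in for the full DE point generation mechanism.

```python
import random

def dominates(a, b):
    """Pareto dominance for minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def optimize(f, bounds, minsize=10, burstsize=25, F=0.8, CR=0.5, budget=2000):
    """Unrestricted-population loop: solutions are (x, f(x)) pairs and the
    population keeps every non-dominated solution found so far."""
    lo = [b[0] for b in bounds]
    hi = [b[1] for b in bounds]
    # Steps 1-2: evaluate minsize random points in the search box
    pop = [[random.uniform(l, h) for l, h in bounds] for _ in range(minsize)]
    pop = [(x, f(x)) for x in pop]
    evals = len(pop)
    while evals < budget:
        # Step 3: keep every non-dominated solution, pad up to minsize
        nd = [s for s in pop if not any(dominates(t[1], s[1]) for t in pop)]
        if len(nd) < minsize:
            rest = sorted((s for s in pop if s not in nd),
                          key=lambda s: sum(dominates(t[1], s[1]) for t in pop))
            nd += rest[:minsize - len(nd)]
        pop = nd
        # Step 4: breed burstsize random parents (simplified DE/rand/1/bin)
        children = []
        for parent in random.sample(pop, min(burstsize, len(pop))):
            r1, r2, r3 = random.sample(pop, 3)
            child = [r1[0][j] + F * (r2[0][j] - r3[0][j])
                     if random.random() < CR else parent[0][j]
                     for j in range(len(bounds))]
            # truncate to the box border, as in NSGA-II
            child = [min(max(c, l), h) for c, l, h in zip(child, lo, hi)]
            children.append((child, f(child)))
            evals += 1
        # Step 5: add the evaluated children and loop back to the filtering
        pop = pop + children
    return pop
```

On a simple bi-objective problem such as f(x) = (x^2, (x - 2)^2), the returned population grows beyond minsize as the non-dominated set densifies.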

In addition to an unrestricted population size, another novel feature of our algorithm is how parent points are selected. In classic DE, every point serves in turn as a parent, while all the other points in the population may randomly participate in the creation of the trial point. With an unrestricted population size, it is obvious that not every point can generate a child. For this reason, we randomly select burstsize points which serve as a virtual parent population. As a randomly selected subset of the points retains the features of the original set, the integrity of the point generation process is not endangered by this selection. The value of burstsize should be chosen with regard to the computational expense of an objective function evaluation, as burstsize directly defines how often dominated solutions are filtered out of the population. As the filtering is not very expensive, with expensive problems it may be reasonable to filter after every evaluation, i.e., use burstsize = 1, thus having the most recent information available at all times.

6. Numerical experiments
In this section, we present some results of numerical experiments with our new algorithm, compared to
those of NSGA-II using test problems from the literature. First we introduce the performance metrics
used, and then the actual results.

6.1. Performance metrics
With EMO algorithms that produce a set of solutions approximating the Pareto set, evaluating the performance of a given algorithm is far from trivial. As mentioned earlier, to characterize the goodness of this solution set, all solutions should be as close as possible to the real Pareto front and the solutions should cover the whole Pareto front as well as possible, meaning that the distribution of solutions along the front should be even (no gaps) and the extent of solutions should be as large as possible. It is obvious that the second part is easier to achieve with a higher number of non-dominated solutions. We refer to these two properties as closeness and diversity, respectively.
Measures have been proposed for both closeness (e.g., generational distance) and diversity (e.g., spacing, spread and maximum spread) [5]. If the results of several algorithms are compared using two different measures, this may easily lead to a situation where one algorithm is better by one measure and worse by the other. In this case, there is no way of judging which algorithm is better.
Recently, a hypervolume indicator [38] has gained popularity. It defines the volume (inside some
predefined hypercube) of the objective space dominated by the given solution set, and as such it can
give information about both closeness and diversity at the same time. Furthermore, it possesses a
desirable property for a performance metric stated in [35]: ”whenever one approximation completely
dominates another approximation, the hypervolume of the former will be greater than the hypervolume
of the latter.”
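A Monte Carlo estimate of the hypervolume, in the spirit of [3], can be sketched as follows (an illustrative snippet, assuming minimization with the reference box [0, ref]; the sampling scheme and names are our own, not taken from [3]):

```python
import random

def hypervolume_mc(front, ref, n_samples=20000, seed=0):
    """Estimate the volume of the box [0, ref] weakly dominated by the
    solution set `front` by uniform sampling (minimization)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        s = [rng.uniform(0.0, r) for r in ref]
        # a sample counts if some front member is <= it in every objective
        if any(all(p_i <= s_i for p_i, s_i in zip(p, s)) for p in front):
            hits += 1
    volume = 1.0
    for r in ref:
        volume *= r
    return volume * hits / n_samples
```

For a single point (0.5, 0.5) with reference point (1, 1), the dominated region is the square [0.5, 1] x [0.5, 1], so the estimate should be close to 0.25.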
Another metric giving information about closeness and diversity at the same time is inverted genera-
tional distance, IGD [36]. A normal generational distance GD [34] measures how far the given solutions
of population P are on the average from the real Pareto optimal solutions P ∗. To calculate GD, for
every solution in P , the closest point in P ∗ is located. As to IGD, for every solution in P ∗, the closest
point in P is located. As a result, if a part of the Pareto front is missing or very poorly approximated
by P , the value of IGD grows. Obviously, to reliably compute either GD or IGD, a sufficiently dense
set of real Pareto optimal points is needed. Thus, these metrics are usable only with problems where
the exact Pareto front is known.
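The two distance metrics can be sketched as follows (an illustrative snippet using the common arithmetic-mean variant of GD; [34] and [36] define the exact formulas):

```python
import math

def gd(P, P_star):
    """Generational distance: mean distance from each solution in P to
    its nearest point in the reference Pareto set P_star."""
    return sum(min(math.dist(p, q) for q in P_star) for p in P) / len(P)

def igd(P, P_star):
    """Inverted generational distance: mean distance from each reference
    point in P_star to its nearest solution in P."""
    return sum(min(math.dist(q, p) for p in P) for q in P_star) / len(P_star)
```

Note the asymmetry: a population concentrated on one small patch of the front can score a perfect GD of zero while its IGD stays large, which is exactly why IGD penalizes missing parts of the front.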
In this study, we have chosen to use the generational distance GD, the inverted generational distance
IGD and the Monte Carlo based hypervolume HV [3] as performance measures.

6.2. Results
To test our algorithm, we use a few well-known test problems from the literature: bi-objective ZDT1,
ZDT3, ZDT4 [37] and tri-objective DTLZ2, DTLZ5, DTLZ7 [7] problems. Although these problems have
some known limitations [11] and it is not clear how closely related they are to characteristics of real-life
problems, we use them as they are widely cited, and regardless of their limitations, they test algorithms’
ability to overcome some difficulties, such as disconnectedness of the front and multimodality.
Results of our algorithm are compared to those of NSGA-II [13] as it is a well-known and widely used
EMO algorithm. For NSGA-II, the population size was set to 100 for all the problems, while crossover
probability was 0.9, mutation probability 1/k, distribution index for SBX crossover 15 and distribution
index for polynomial mutation 20. For our algorithm, minsize was set to 10, burstsize to 25, scaling
factor F was 0.8, and crossover probability CR for ZDT problems 0.5, and for DTLZ problems 0.6. In
this study, we made no systematic attempt to tweak the parameters to achieve optimal performance.
In Table 1, we show averaged results of 30 separate test runs for both algorithms with each of the above-mentioned test problems. We have run the algorithms to a maximum of 8000 evaluations, which is low by the standards of traditional EMO comparisons, but on the other hand, with regard to real engineering problems it is not obvious that even such a high number can be used. In this light, we are more interested in how the algorithms behave with lower numbers of evaluations.
For each test case, values of GD, IGD and HV are shown in steps of 1000 objective function evaluations in Table 1. Obviously, GD and IGD are better the smaller they are, while HV is better when it is bigger. An asterisk in a cell of the new algorithm indicates that it performed better than NSGA-II.
From the results it is clearly seen that the performance of the new algorithm is superior to that of NSGA-II in almost every problem and by every metric. Convergence at the beginning of the optimization process is especially fast, as the algorithm was designed to achieve; in most cases HV has reached after 2000 evaluations nearly the same values as after 8000 evaluations, suggesting that essential features of the

Table 1: Comparison of proposed algorithm and NSGA-II algorithm
ZDT1 NEW NSGA-II
Evals GD IGD HV GD IGD HV
1000 0.028007* 0.0036236* 0.4856* 0.1917 0.037427 3.33E-06
2000 0.0023086* 0.00060567* 0.63528* 0.090102 0.017946 0.06432
3000 0.00077457* 0.00028369* 0.65194* 0.044704 0.010353 0.2411
4000 0.00041187* 0.00018378* 0.65694* 0.026062 0.0067677 0.37209
5000 0.00026781* 0.00013287* 0.65945* 0.016284 0.0043185 0.46914
6000 0.00019332* 0.00010396* 0.66085* 0.010176 0.0026949 0.53915
7000 0.00014819* 0.000084726* 0.66195* 0.0062079 0.0017368 0.5832
8000 0.00011899* 0.000071562* 0.66275* 0.003964 0.0011088 0.61218
ZDT3 NEW NSGA-II
Evals GD IGD HV GD IGD HV
1000 0.029928* 0.0045269* 0.60359* 0.18046 0.030601 0.00269
2000 0.0020621* 0.00091507* 0.73482* 0.086854 0.014624 0.1916
3000 0.00069041* 0.00044331* 0.75036* 0.044249 0.0084281 0.41057
4000 0.00034838* 0.00029388* 0.75568* 0.024159 0.005897 0.52563
5000 0.00022314* 0.0002132* 0.75797* 0.015103 0.0043955 0.59432
6000 0.00016045* 0.00016681* 0.75909* 0.0096113 0.0031868 0.65161
7000 0.00012671* 0.00013653* 0.75975* 0.0056242 0.0021137 0.69213
8000 0.00010168* 0.00011708* 0.76027* 0.0034806 0.0013823 0.71501
ZDT4 NEW NSGA-II
Evals GD IGD HV GD IGD HV
1000 0.091588* 0.029035* 0.13046* 3.0161 0.24179 0
2000 0.045828* 0.02788* 0.14064* 1.0852 0.088378 0
3000 0.033946* 0.027654* 0.14245* 0.48194 0.044688 0.00083333
4000 0.02816* 0.027611* 0.14302* 0.24985 0.030833 0.018583
5000 0.024617* 0.027601 0.1432* 0.15498 0.024152 0.061187
6000 0.022068* 0.0276 0.14327* 0.071305 0.020758 0.10076
7000 0.020221* 0.027594 0.14335* 0.065949 0.019166 0.12197
8000 0.01878* 0.027582 0.14338* 0.063463 0.018382 0.13094
DTLZ2 NEW NSGA-II
Evals GD IGD HV GD IGD HV
1000 0.01583* 0.0022119* 0.23717* 0.032783 0.0031238 0.12748
2000 0.0052983* 0.0012758* 0.34057* 0.0095183 0.0016349 0.28643
3000 0.0030549* 0.00097728* 0.37377* 0.0047195 0.0012484 0.33138
4000 0.0021537* 0.000816* 0.39094* 0.0030685 0.0011434 0.34948
5000 0.0016678* 0.00072382* 0.40221* 0.0024037 0.0011342 0.35793
6000 0.0013663* 0.00065651* 0.40996* 0.001997 0.0011004 0.36421
7000 0.001159* 0.00060378* 0.41576* 0.0018585 0.0010914 0.36782
8000 0.0010057* 0.00056322* 0.42024* 0.0016773 0.001077 0.37073
DTLZ5 NEW NSGA-II
Evals GD IGD HV GD IGD HV
1000 0.029206 0.0011105* 0.045753* 0.027096 0.0019419 0.0223
2000 0.010967 0.00039947* 0.073557* 0.0062537 0.00065061 0.06453
3000 0.0063443 0.00022132* 0.083027* 0.0020312 0.00028733 0.08154
4000 0.0044117 0.00015923* 0.086893 0.00095979 0.00016326 0.087533
5000 0.0033121 0.00012638* 0.089167 0.00057633 0.00012865 0.08977
6000 0.0026657 0.00010173* 0.09067 0.00039728 0.00011719 0.090907
7000 0.0022216 0.000084042* 0.09161* 0.00032692 0.00010588 0.09144
8000 0.0019281 0.000074296* 0.09231* 0.00027421 0.00010402 0.09189
DTLZ7 NEW NSGA-II
Evals GD IGD HV GD IGD HV
1000 0.026496* 0.0055494* 0.12855* 0.75452 0.062738 0
2000 0.0041413* 0.0030781* 0.18971* 0.38348 0.025436 0.0010833
3000 0.001985* 0.0026951* 0.20199* 0.18625 0.013603 0.016103
4000 0.0012945* 0.0025382* 0.20753* 0.087215 0.0089129 0.049977
5000 0.00096419* 0.0024284* 0.21063* 0.047724 0.006339 0.08542
6000 0.00077066* 0.0023728* 0.21265* 0.02935 0.0045198 0.11657
7000 0.0006445* 0.0023341* 0.21412* 0.018467 0.0032785 0.14162
8000 0.0005564* 0.0022981* 0.21522* 0.013174 0.0024404 0.16031

Pareto front have been captured and quite good proximity to the front has been achieved very efficiently. The percentage differences compared to NSGA-II are drastic.
From the IGD and HV values (as well as the examples in Figure 2) we can deduce that the diversity of the solutions produced by the new algorithm is good, as assumed, although there exists no explicit diversity preservation mechanism. An interesting observation can be made with DTLZ5, where GD is weaker for the new algorithm, but IGD suggests a better spread of the population. This may partially be due to the unrestricted population size of the new algorithm. The HV numbers in this case are questionable, since the Pareto front is a line in a three-dimensional space, and the volume dominated by it should theoretically be zero or close to it.
In problem ZDT4, some interesting behavior is seen with the new algorithm: GD and HV values are better, while IGD is worse. This is probably due to a sub-optimal distribution of points at the extreme ends of the front, where the trade-off between objectives is minimal, and thus so is the effect on hypervolume. However, IGD is sensitive to these gaps, and NSGA-II performs better by this metric.
One thing worth mentioning is the effect of the population size on the metrics employed. Although one may argue that a comparison of populations with different sizes is not fair, we find it justified. The measures GD and IGD are average distances between the final population and the real Pareto set, and if the real Pareto set is sufficiently dense (5000 points used here), there should be no bias. Obviously, HV benefits from a higher number of solutions, but as the unrestricted population size is an essential part of our algorithm, the comparison is justified. Further, although at the beginning of the process the new algorithm has an essentially smaller population, it still produces better HV values.

Figure 2: ZDT1 and ZDT3 problems at 1000 and 4000 evaluations.

In Figure 2, we show as examples the populations of the new algorithm and NSGA-II together with the real Pareto front for test problems ZDT1 and ZDT3 after 1000 and 4000 function evaluations. It is clearly seen that the new algorithm is essentially closer to the real Pareto front after 1000 evaluations and has already captured the essential features of the front. After 4000 evaluations, the new algorithm has practically speaking converged to the real front, covering it fully at the same time.

7. Discussion & Conclusions


In this work, we have discussed some drawbacks of current EMO approaches. As our emphasis
is on solving real engineering problems, where the number of allowed objective function evaluations
is severely restricted, we have proposed a new algorithm which appears to be computationally efficient
and overcomes some of the drawbacks discussed. Further, the algorithm is very straightforward to
implement.
The apparent computational efficiency of the new algorithm is explained by three separate facts:
(i) in the beginning of the process, the population is not forced into some arbitrary size, and thus it
does not contain dominated solutions, which could hinder the performance; and because the population
contains all the non-dominated solutions, (ii) oscillation is not possible, and thus (iii) the population
cannot deteriorate later during the process. This means that, in contrast to e.g. NSGA-II, the new
algorithm continues to converge indefinitely if further evaluations are allowed, albeit at a slowing rate.
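The mechanism behind facts (i)–(iii) can be illustrated with a small sketch of an unbounded non-dominated archive for a minimization problem (the function names are ours, chosen for illustration, not taken from the actual implementation): a candidate changes the archive only if it is non-dominated, so the approximation can never move away from the Pareto front.

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def archive_insert(archive, candidate):
    """Insert a candidate into an unbounded non-dominated archive.
    A dominated candidate is rejected outright, and an accepted one
    removes every member it dominates, so the archive is always a
    non-dominated set and can only improve over time."""
    if any(dominates(a, candidate) for a in archive):
        return archive  # candidate rejected; archive unchanged
    # keep only members not dominated by the candidate
    return [a for a in archive if not dominates(candidate, a)] + [candidate]
```

Because no member is ever replaced by a worse one, the sequence of archives is monotone in the dominance sense, which is the property that rules out oscillation and deterioration.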
The preliminary comparisons presented in this study encourage us to believe that our approach has
real potential. The convergence rate of the proposed algorithm was excellent on all tested problems,
though the number of problems could have been larger. Also, the test problems employed here are known to
have some undesired features [11], such as extremal values (optimal values next to the border
of the search space). In this case, the constraint handling mechanism of the algorithm (whether points
located outside the given domain are rejected, truncated, or reflected) may have a profound effect on
performance, as these values can get optimized ”accidentally”. To avoid the effect of the constraint
handling mechanism, our algorithm used the same mechanism as NSGA-II in this study.
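The truncation and reflection mechanisms mentioned above can be sketched as follows for a box-constrained trial point (rejection simply discards the point and generates a new one); this is a generic illustration of the two repair rules, not the exact code of either algorithm. Note how truncation places every violating coordinate exactly on the box boundary, which is how extremal optima on the border of the search space can be hit ”accidentally”.

```python
def truncate(x, lo, hi):
    """Clip each coordinate of trial point x into the box [lo, hi];
    violating coordinates land exactly on the boundary."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]

def reflect(x, lo, hi):
    """Mirror each violating coordinate back into the box [lo, hi];
    repeated reflection handles points far outside the box."""
    out = []
    for xi, l, h in zip(x, lo, hi):
        while xi < l or xi > h:
            xi = 2 * l - xi if xi < l else 2 * h - xi
        out.append(xi)
    return out
```

Reflection keeps boundary points no more likely than interior ones, whereas truncation concentrates out-of-box trial points onto the border, so the choice of mechanism can bias the search on problems with extremal optima.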
To overcome these difficulties, we shall use a more diverse set of test problems in the
future. Also, as the point generation scheme of DE is known to have some potential defects, we shall
experiment with different point generation schemes.

8. Acknowledgements
We wish to thank Professor Kalyanmoy Deb and his research group, especially Mr. Karthik Sindhya,
for providing assistance with the use of NSGA-II and some test problems. We are also grateful to Dr.
Yi Cao for his great assistance with Monte Carlo based hypervolume estimation measure, as well as to
Mr. Saku Kukkonen for providing an implementation for the GD metric. Further, we want to thank
Mr. Sauli Ruuska for interesting discussions, and Mr. Vesa Ojalehto for helping in some technical issues.

9. References
[1] T. Aittokoski, K. Miettinen, Cost Effective Simulation-Based Multiobjective Optimization in Per-
formance of Internal Combustion Engine, to appear in Engineering Optimization, 2008.
[2] B.V. Babu and R. Angira, Optimization of Water Pumping System Using Differential Evolution
Strategies, in Proceedings of The Second International Conference on Computational Intelligence,
Robotics, and Autonomous Systems (CIRAS-2003), Singapore, 2003.
[3] Y. Cao, Matlab Central File Exchange: Hypervolume Indicator,
http://www.mathworks.fr/matlabcentral/fileexchange/, 2008.
[4] D. Craft, Matlab Central File Exchange: Pareto surface navigator,
http://www.mathworks.fr/matlabcentral/fileexchange/, 2008.
[5] K. Deb, Multi-Objective Optimization using Evolutionary Algorithms. John Wiley & Sons, Ltd.
Chichester, England, 2001.
[6] K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A Fast and Elitist Multiobjective Genetic Algorithm:
NSGA-II, IEEE Trans. Evolutionary Computation, 6(2):182–197, 2002.
[7] K. Deb, L. Thiele, M. Laumanns, E. Zitzler, Scalable multi-objective optimization test problems, in
Proceedings of the 2002 Congress on Evolutionary Computation, 2002.
[8] P. Eskelinen, K. Miettinen, K. Klamroth, J. Hakanen, Interactive Learning-Oriented Decision Sup-
port Tool for Nonlinear Multiobjective Optimization: Pareto Navigator, Helsinki School of Eco-
nomics, Working Paper W-439, 2007.
[9] T. Hanne, On the convergence of multiobjective evolutionary algorithms, European Journal
of Operational Research, 117, 1999.
[10] T. Hanne, Interactive decision support based on multiobjective evolutionary algorithms, OR Pro-
ceedings 2005 (2006) 761–766, Springer Berlin.
[11] S. Huband, P. Hingston, L. Barone, L. While, A Review of Multiobjective Test Problems and a
Scalable Test Problem Toolkit, IEEE Transactions on Evolutionary Computation, 10(5), 2006.
[12] H. Ishibuchi, T. Yoshida, Hybrid Evolutionary Multi-Objective Optimization Algorithms, In Pro-
ceedings of Soft Computing Systems - Design, Management and Applications, December 1-4, 2002,
Santiago, Chile.
[13] Kanpur Genetic Algorithms Laboratory, NSGA-II source code,
http://www.iitk.ac.in/kangal/codes/nsga2/nsga2-v1.1.tar, 2008.
[14] K. Klamroth, K. Miettinen, Integrating Approximation and Interactive Decision Making in Multi-
criteria Optimization, Operations Research, 56(1), 222–234, 2008.
[15] S. Kukkonen, K. Deb, Improved Pruning of Non-Dominated Solutions Based on Crowding Distance
for Bi-Objective Problems. 2006 IEEE Congress on Evolutionary Computation, Vancouver, BC,
Canada July 16-21, 2006.
[16] S. Kukkonen, K. Deb, A Fast and Effective Method for Pruning of Non-dominated Solutions in
Many-Objective Problems, in Parallel Problem Solving from Nature - PPSN IX, 2006.
[17] S. Kukkonen, J. Lampinen, GDE3: the third evolution step of generalized differential evolution,
IEEE Congress on Evolutionary Computation, pp. 443–450, Edinburgh, Scotland, 2005.
[18] M. Laumanns, L. Thiele, K. Deb, E. Zitzler, Combining Convergence and Diversity in Evolutionary
Multi-Objective Optimization, Evolutionary Computation 10(3), 263–282, 2002.
[19] N. K. Madavan, Multiobjective optimization using a Pareto Differential Evolution approach, in
Proceedings of the 2002 Congress on Evolutionary Computation (CEC 2002), Honolulu, Hawaii,
pp. 1145–1150, 2002.
[20] K. Miettinen, Nonlinear Multiobjective Optimization, Kluwer Academic Publishers, Boston, 1999.
[21] M. Monz, K. H. Küfer, T. R. Bortfeld, C. Thieke, Pareto navigation - algorithmic formulation of
interactive multi-criteria IMRT planning, Physics in Medicine and Biology 53, 985–998, 2008.

[22] C. R. Raquel, P. C. Naval Jr., An effective use of crowding distance in multiobjective particle
swarm optimization, in Proceedings of the Genetic and Evolutionary Computation (GECCO 2005),
Washington DC, USA, 2005, pp. 257–264.
[23] T. Robic, B. Filipic, DEMO: Differential Evolution for multiobjective optimization, in Proceedings
of the 3rd International Conference on Evolutionary Multi-Criterion Optimization (EMO 2005),
Guanajuato, Mexico, 2005, pp. 520–533.
[24] G. Rudolph, Convergence of evolutionary algorithms in general search spaces, in Proceedings of
the Third IEEE Conference on Evolutionary Computation, 1996, pp. 50–54.
[25] G. Rudolph, Evolutionary search for minimal elements in partially ordered finite sets, in Proceedings
of the 7th Annual Conference on Evolutionary Programming, pp. 345–353, Springer, Berlin, 1998.
[26] G. Rudolph, Evolutionary search under partially ordered sets, in Proceedings of the International
NAISO Congress on Information Science Innovations, 1999.
[27] G. Rudolph, A. Agapie, Convergence properties of some multi-objective evolutionary algorithms,
in Proceedings of the IEEE Congress on Evolutionary Computation, 2000, pp. 1010–1016.
[28] S. Ruuska, T. Aittokoski, The Effect of Trial Point Generation Schemes on the Efficiency of
Population-Based Global Optimization Algorithms, in Proceedings of International Conference
on Engineering Optimization, Rio de Janeiro, Brazil, 2008.
[29] M. R. Sierra, C. A. Coello Coello, Improving pso-based multiobjective optimization using crowding,
mutation and e-dominance, in Proceedings of the 3rd International Conference on Evolutionary
Multi-Criterion Optimization (EMO 2005), Guanajuato, Mexico, 2005.
[30] R. Storn, K. Price, Differential evolution – a simple and efficient heuristic for global optimization
over continuous spaces, Journal of Global Optimization 11, 341–359, 1997.
[31] P. N. Suganthan, Performance Assessment on Multi-objective Optimization Algorithms, CEC-07,
Singapore, 2007.
[32] L. Thiele, K. Miettinen, P. Korhonen, J. Molina, A Preference-Based Interactive Evolutionary
Algorithm for Multiobjective Optimization, Working Papers W-412, Helsinki School of Economics,
Helsinki, 2007.
[33] H. L. Trinkaus, T. Hanne, knowCube: A visual and interactive support for multicriteria decision
making, Computers & Operations Research 32, 1289–1309, 2005.
[34] D. V. Veldhuizen, Multiobjective Evolutionary Algorithms: Classifications, Analyses, and
New Innovations, Ph.D. Thesis, Dayton, OH: Air Force Institute of Technology, Technical Report
No. AFIT/DS/ENG/99-01, 1999.
[35] E. Zitzler, D. Brockhoff, L. Thiele, The Hypervolume Indicator Revisited: On the Design of Pareto-
compliant Indicators Via Weighted Integration, EMO 2007, LNCS 4403, pp. 862–876, 2007.
[36] M. A. Villalobos-Arias, G. T. Pulido, C. A. Coello Coello, A Proposal to use stripes to main-
tain diversity in a multi-objective particle swarm optimizer, in Proceedings of Swarm Intelligence
Symposium, 2005.
[37] E. Zitzler, K. Deb, L. Thiele, Comparison of multiobjective evolutionary algorithms: Empirical
results, Evolutionary Computation 8(2), 173–195, 2000.
[38] E. Zitzler, L. Thiele, Multiobjective Optimization Using Evolutionary Algorithms - A Comparative
Case Study, in Conference on Parallel Problem Solving from Nature (PPSN V), pages 292–301,
Amsterdam, 1998.
