Interactive Multi-Objective EA Framework

IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 28, NO. 1, FEBRUARY 2024

An Interactive Knowledge-Based Multiobjective Evolutionary Algorithm Framework for Practical Optimization Problems

Abhiroop Ghosh, Member, IEEE, Kalyanmoy Deb, Fellow, IEEE, Erik Goodman, and Ronald Averill
Abstract: Experienced users often have useful knowledge and intuition in solving real-world optimization problems. User knowledge can be formulated as intervariable relationships to assist an optimization algorithm in finding good solutions faster. Such intervariable interactions can also be automatically learned from high-performing solutions discovered at intermediate iterations in an optimization run, a process called innovization. These relations, if vetted by the users, can be enforced among newly generated solutions to steer the optimization algorithm toward practically promising regions in the search space. Challenges arise for large-scale problems where the number of such variable relationships may be high. This article proposes an interactive knowledge-based evolutionary multiobjective optimization (IK-EMO) framework that extracts hidden variable-wise relationships as knowledge from evolving high-performing solutions, shares them with users to receive feedback, and applies them back to the optimization process to improve its effectiveness. The knowledge extraction process uses a systematic and elegant graph analysis method which scales well with the number of variables. The working of the proposed IK-EMO is demonstrated on three large-scale real-world engineering design problems. The simplicity and elegance of the proposed knowledge extraction process and the achievement of high-performing solutions quickly indicate the power of the proposed framework. The results presented should motivate further such interaction-based optimization studies for their routine use in practice.

Index Terms: “Innovization,” interactive optimization, knowledge extraction, multiobjective optimization, repair.

Manuscript received 27 July 2022; revised 2 December 2022 and 12 February 2023; accepted 17 February 2023. Date of publication 20 March 2023; date of current version 31 January 2024. The work of Abhiroop Ghosh was supported by the Koenig Endowed Chair. The work of Kalyanmoy Deb was supported by the Michigan State University (MSU) under Grant RT083557. (Corresponding author: Abhiroop Ghosh.)
Abhiroop Ghosh is with Aspen Technology, Medina, MN 55340 USA (e-mail: ghoshab1@msu.edu).
Kalyanmoy Deb and Erik Goodman are with the Electrical and Computer Engineering Department, Michigan State University, East Lansing, MI 48824 USA (e-mail: kdeb@msu.edu; goodman@msu.edu).
Ronald Averill is with the Mechanical Engineering Department, Michigan State University, East Lansing, MI 48824 USA (e-mail: averillr@msu.edu).
This article has supplementary material provided by the authors and color versions of one or more figures available at https://doi.org/10.1109/TEVC.2023.3259339.
Digital Object Identifier 10.1109/TEVC.2023.3259339
1089-778X © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

For practical multiobjective optimization problems (MOPs), additional knowledge may often be available from the users who have years of knowledge and experience in solving such problems. However, such information is often ignored by researchers while developing an algorithm due to concerns regarding loss of generality. But computational resources for design problems may be limited in time, cost, or availability. Thus, in many cases, it may be important to use any available information that may help an optimization algorithm in finding good solutions.

For complex single-objective practical problems, evolutionary algorithms (EAs) with generic recombination and mutation operators [1], [2] may be too slow to lead to high-performing regions of the search space. Good performance of an algorithm in solving benchmark problems, such as ZDT [3], DTLZ [4], and WFG [5] does not always translate to good performance on practical problems. For such cases, creating customized algorithms leveraging additional problem information is necessary. Deb and Myburgh [6] proposed a customized EA that exploited the linearity of constraint structures to solve a billion-variable resource allocation problem. A microgenetic algorithm [7] combining range-adaptation and knowledge-based reinitialization was applied to an airfoil optimization problem. Semi-independent variables [8] can be used to handle user-specified monotonic relationships among variables in the form of $x_i \le x_{i+1} \le x_{i+2} \le \dots \le x_j$. Some techniques for combining EAs with problem knowledge are given in [9]. Domain knowledge can also be semantically annotated and injected into an optimization process [10].

Alternatives to prespecifying problem information exist, such as cultural algorithms which encode domain knowledge inside a belief space [11]. Self-organizing maps (SOMs) can provide information about important design variable clusters [12]. Recent innovization studies [13], [14], [15] aim to extract additional problem information from high-performing solutions during the optimization process in the form of functional relationships between variables, objectives, and constraints.

Interactive optimization is when the user, referred to as the decision maker (DM), provides guidance during the optimization [16]. Multiple ways to interactively specify information exist, such as aspiration levels [17], the importance of individual objectives [18], etc.

Using additional problem information comes with a set of challenges. An effective knowledge representation method needs to be designed which can be used effectively by an optimization algorithm. At the same time, it should also be comprehensible to the user. Validating any user-provided knowledge is necessary since the quality of supplied information may vary. This reduces the possibility of a premature or false convergence. The user may wish to periodically monitor and review optimization progress as well as any
learned information. If necessary, the user can also supply information in a collaborative fashion [19]. However, care needs to be taken to ensure that any user feedback does not lead the search process toward suboptimal solutions. For large-scale problems resulting in a potentially huge rule set, how do we efficiently encode the rule information? How do we ensure that enforcing one rule does not violate one or more of the other rules? How can maximum rule compliance among new solutions be achieved?

This article aims to address the issues mentioned above by proposing a generic knowledge-based evolutionary multiobjective optimization (EMO) framework with user interactivity (IK-EMO) for solving practical constrained MOPs. Users can provide a preference among the learned relationships. The possibility of learned and user-provided knowledge being imperfect is also taken into account and the algorithm can adjust the extent of their influence accordingly. IK-EMO performance is demonstrated on three practical constrained MOPs and is also compared to three other EMO algorithms.

II. VARIABLE RELATIONSHIPS AS KNOWLEDGE IN OPTIMIZATION TASK

Knowledge is a generic term and can be interpreted in many different ways depending on the context. For an optimization task, here, we restrict the definition of knowledge to be additional information provided or extracted about the optimization problem itself. In this article, we are interested in variable-variable relationships that commonly exist in high-performing solutions of the problem. However, the definition of knowledge can be extended to include the objective and constraint functions too. A practical optimization task minimizes a number of objectives and satisfies a number of constraints, all stated as functions of one or more variables. Thus, understanding the variable-to-variable relationships which are common to feasible solutions (each represented by a variable vector) with small objective values is critically important. A supply of such knowledge a priori by the users, in addition to the optimization problem description, or a discovery process of such knowledge from the evolving high-performing optimization solutions, can be directly utilized by the optimization algorithm to speed up its search process. Moreover, if such knowledge is discovered during the optimization process, users will benefit from having this knowledge in addition to the optimal solutions of the problem.

A. Past Studies

Innovization is the process of extracting commonalities among Pareto-optimal solutions, first proposed by Deb and Srinivasan [13]. The basic principle of innovization is to generate rules representing intervariable relationships in simple forms such as power laws ($x_i x_j^b = c$). Bandaru et al. [20] have proposed a method which is able to express relationships involving operators like summation (+), difference (−), product (×), etc.

Bandaru and Deb [14] introduced the concept of higher- and lower-level innovization. A genetic programming-based innovization framework was proposed in [20] and was applied on an inventory management problem. An MOEA combined with a local search procedure was employed in [21] to ensure faster convergence. A combination of innovization and data mining approaches were used in [22] to achieve faster convergence. Gaur and Deb [15] proposed an adaptive innovization method that treats the innovization process as a machine learning problem and repairs the solutions directly, based on the learned model. A combination of user guidance and inequality relation-based online innovization [23], [24] was used to solve three practical problems.

B. Structure of Rules Considered in This Study

For an interactive knowledge-based optimization algorithm to work, a standard form of knowledge representation is necessary which is simple enough for users to understand but has enough complexity to capture problem knowledge accurately. Using algebraic expressions or “rules” is one way of representing knowledge and has been extensively used in the “innovization” literature [13], [15]. A rule can take the form of an equality or an inequality, as shown in

$$\phi(\mathbf{x}) = 0, \tag{1}$$
$$\psi(\mathbf{x}) \le 0. \tag{2}$$

Any arbitrary form of rules involving many variables from a decision variable vector ($\mathbf{x}$) and complicated mathematical structures of functions $\phi$ or $\psi$ may be considered, but such rules would not only be difficult to learn, they would also be difficult to interpret by the user. In this study, we restrict the rules to have simple structures involving a maximum of two variables, as discussed below.

1) Constant Rule: This type of rule involves only one variable taking a constant value ($x_i = \kappa_i$). In terms of (1), for the $i$th variable, the structure of the rule becomes $\phi_i(\mathbf{x}) = x_i - \kappa_i$. This type of rule can occur if multiple high-performing solutions are expected to have in common a fixed value of a specific variable [25].

2) Power Law Rule: Power law rules [13] for two variables $x_i$ and $x_j$ can be represented by (1) as $\phi_{ij}(\mathbf{x}) = x_i x_j^b - c$, where $b$ and $c$ are constants. This form makes power laws versatile enough to encode a wide variety of rules, such as proportionate or inversely proportionate relationships among two variables. Interestingly, an inequality power law using a $\psi$ function can also be implemented, but such a rule may represent a relationship loosely and we do not consider it here.

3) Equality Rule: This type of rule can express the equality principle of two variables $x_i$ and $x_j$ observed in high-performing solutions. In terms of (1), $\phi_{ij}(\mathbf{x}) = x_i - x_j$ is the rule’s structure.

4) Inequality Rule: This type of rule can represent relational properties of two variables $x_i$ and $x_j$ as $x_i \le x_j$ or $x_i \ge x_j$. In terms of (2), $\psi_{ij}(\mathbf{x}) = x_i - x_j$ or $\psi_{ij}(\mathbf{x}) = x_j - x_i$ are the respective rules. For example, the radius of two beams in a truss [23] might be related via this type of rule.

After describing the chosen rule structures, we are now ready to discuss the procedures of extracting such rules from high-performing variable vectors and applying the extracted rules to the optimization algorithm. A summary of their representations and use in our analysis are provided in Table I.
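As a minimal sketch of the four rule structures in Section II-B, each rule can be written as a residual function following (1) and (2). The function names, the 0-based indexing, the numerical tolerance `tol`, and the sample vector are assumptions made for illustration and are not taken from the article.

```python
def constant_rule(x, i, kappa):
    """phi_i(x) = x_i - kappa_i, the constant rule written in the form of (1)."""
    return x[i] - kappa


def power_law_rule(x, i, j, b, c):
    """phi_ij(x) = x_i * x_j**b - c, the power law rule in the form of (1)."""
    return x[i] * x[j] ** b - c


def equality_rule(x, i, j):
    """phi_ij(x) = x_i - x_j, the equality rule in the form of (1)."""
    return x[i] - x[j]


def inequality_rule(x, i, j):
    """psi_ij(x) = x_i - x_j; the rule x_i <= x_j holds when this is <= 0."""
    return x[i] - x[j]


def satisfies(residual, is_inequality=False, tol=1e-9):
    """Check phi(x) = 0 within tol, or psi(x) <= 0 (tol is an assumed tolerance)."""
    return residual <= tol if is_inequality else abs(residual) <= tol


x = [2.0, 4.0, 2.0]  # hypothetical decision vector, indexed from 0
checks = {
    "constant: x[1] = 4": satisfies(constant_rule(x, 1, 4.0)),
    "power law: x[0] * x[1]**0.5 = 4": satisfies(power_law_rule(x, 0, 1, 0.5, 4.0)),
    "equality: x[0] = x[2]": satisfies(equality_rule(x, 0, 2)),
    "inequality: x[0] <= x[1]": satisfies(inequality_rule(x, 0, 1), is_inequality=True),
}
for name, ok in checks.items():
    print(name, ok)
```

Writing every rule as a residual keeps the equality and inequality cases uniform: only the final comparison differs.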
TABLE I
Rule types and the corresponding mathematical representation. $X$ represents the set of ND solutions. $x_i$ and $x_j$ refer to the $i$th and $j$th variables, respectively, of an ND solution $\mathbf{x} \in X$. The corresponding variables in a new solution $\mathbf{x}_r$ to be repaired are labeled as $x_{ir}$ and $x_{jr}$, respectively. Normalized variables are represented by a hat ($\hat{x}_i$, $\hat{x}_j$). Higher ranked rules are preferred while performing repair. The score ($s$) is a measure of how well $X$ follows the rule in the representation column. Satisfaction condition dictates whether $\mathbf{x}_r$ follows the respective rule.

Fig. 1. IK-EMO framework showing user interaction, learning, and repair agents. Blue blocks represent a normal EMO. Green blocks represent the components responsible for knowledge extraction and application, as well as user interaction.
III. PROPOSED INTERACTIVE KNOWLEDGE-BASED IK-EMO FRAMEWORK

In this study, we restrict our discussions to MOPs, so high-performing solutions refer to the entire nondominated (ND) solution set discovered by the optimization algorithm from the start of the run to the current iteration. Fig. 1 shows the proposed IK-EMO framework.

The framework starts with a description of the MOP, as shown in the top-left box in the figure. In addition, if any additional problem information is available, that is also specified. The penultimate step before starting the optimization is to select a suitable EMO and methods to algorithmically extract and apply any problem knowledge. The subsequent sections describe the various components in more detail.

A. User Knowledge

Before the start of the optimization, the user may provide some initial information which will affect how the framework operates, details of which are given below.

1) Variable Grouping: For a problem with $n$ variables, there can be $n(n-1)/2$ pairwise variable interactions. For any reasonable-sized problem, such a huge number of meaningful relationships may not exist. In practice, the user may be interested in only a handful of relationships that relate some critical decision variables. In order to reduce the complexity, variables can be divided into different groups $G_k$ for $k = 1, 2, \dots, n_g$. Each group consists of variables that the user thinks are likely to be related. Group information is specified prior to the optimization. If the user does not know how to construct the groups, he/she can put all $n$ variables into a single group. It is then up to the algorithm to figure out which variables are related. Intergroup relationships are not discoverable under this scheme. Variables that are not part of any group are assumed not to be related to other variables. For example, assume there are two variable groups $G_1 = \{2, 3, 5\}$ and $G_2 = \{1, 4, 7\}$ for an 8-variable problem. For $G_1$ all pairwise combinations $(x_2, x_3)$, $(x_2, x_5)$, and $(x_3, x_5)$ will be checked for the existence of any possible relationships. A similar process is repeated for $G_2$. Since intergroup relationships are not explored, combinations like $(x_3, x_7)$ will not be considered. Variables $x_6$ and $x_8$ are not part of any group, hence they are assumed not to be related to the other variables in any meaningful way.
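The grouping example above can be reproduced with a short sketch that enumerates only within-group pairs. The function name `candidate_pairs` and the dictionary layout are illustrative assumptions; the group contents and the 8-variable setting come from the example in the text.

```python
from itertools import combinations


def candidate_pairs(groups):
    """Return, per group, every within-group variable pair (i, j) with i < j."""
    return {name: list(combinations(sorted(members), 2))
            for name, members in groups.items()}


n = 8
groups = {"G1": [2, 3, 5], "G2": [1, 4, 7]}  # groups from the example in the text
pairs = candidate_pairs(groups)

all_pairs = n * (n - 1) // 2                  # pairs without any grouping
checked = sum(len(p) for p in pairs.values())
grouped = {v for members in groups.values() for v in members}
ungrouped = sorted(set(range(1, n + 1)) - grouped)

print(pairs["G1"])               # [(2, 3), (2, 5), (3, 5)]
print(checked, "of", all_pairs)  # 6 of 28
print(ungrouped)                 # [6, 8]
```

In this setting grouping reduces the candidate relations from 28 to 6, and the intergroup pair (3, 7) never appears.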
2) Rule Hierarchy: In the proposed framework, we consider four types of rules as presented in Section II. The user may be interested in more than one type of rule and have some preferences among them. In that case, the user can define a hierarchy of multiple rule types ranked according to user preference. The existence of a particular rule type for one or more variables is checked rank-wise. For example, if constant rules are ranked 1, followed by power laws (rank 2) and inequalities (rank 3), then the variables in every group will be checked for constant rules first. The variables which do not exhibit constant rules will then be checked for power laws, and so on. For relations having equal ranking, a scoring criterion needs to be used to determine which rule better represents the ND front and will be used by the algorithm.

B. Learning Agent

A learning agent is a procedure used to identify different innovization rules present in the ND solutions in a population. The rules involve a single variable or a pair of variables, as required by a rule’s description. Each type of rule (inequality, power law, etc.) requires a different rule satisfaction condition. A score (within [0,1], as presented in Table I) is assigned to each rule to quantify how well the rule represents the ND set. Different learning agents applicable to the rule types covered in Section II are presented below. A summary of the various rules, their scoring procedures, and satisfaction conditions are provided in Table I.

1) Constant Rule: In order to learn constant rules, we have to analyze the values of the variable under consideration for every ND solution and check if one or more of them converge to specific values. Since variables can be of different scales and units, we need a generalized criterion to determine if a variable is taking on a constant value. First, the median ($\tilde{x}_i$) is calculated. The proportion of ND solutions which satisfy $|x_i - \tilde{x}_i| \le \rho_i$ is said to be the score ($s_{\phi_i}$) of the constant rule $x_i = \kappa_i = \tilde{x}_i$. $\rho_i$ is a small tolerance used for determining whether variable $x_i$’s value is in the neighborhood of $\tilde{x}_i$. It must be defined separately for each variable. An alternative option is to normalize the variables and define a singular $\rho$ for the normalized variable space.

To check whether a new solution ($\mathbf{x}_r$) follows $x_{ir} = \kappa_i$, we check whether $x_{ir}$ lies in the neighborhood of $\kappa_i$ using the condition: $|x_{ir} - \kappa_i| \le \rho_i$.

2) Power Law Rule: In order to learn power laws ($x_i x_j^b = c$) we use the method proposed in [15] with a modification. Each variable is initially normalized to [1, 2]. A training dataset is created from the ND solution set with the logarithms of normalized variables $\hat{x}_i$ and $\hat{x}_j$ as features, leading to (4)

$$\hat{x}_i \hat{x}_j^b = c \tag{3}$$
$$\Rightarrow \log \hat{x}_i = \beta \log \hat{x}_j + \epsilon \tag{4}$$

where $\beta = -b$ is the weight and $\epsilon = \log c$ is the intercept. Normalization prevents 0 or negative values from appearing in the logarithm terms. Then, we apply ordinary least-squares linear regression to the logarithm of $\hat{x}_i$ and $\hat{x}_j$. Linear regression finds the best-fit line for the training data defined by the parameters $\beta$ and $\epsilon$. In order to evaluate the quality of the fit, we use the coefficient of determination ($R^2$) metric. A new solution ($\mathbf{x}_r$) follows the power law given in (3) if the difference between the actual value ($x_{ir}$ or $x_{jr}$) and the predicted value ($\hat{x}_{ir}$ or $\hat{x}_{jr}$) is lower than a predefined threshold error ($e_{\min}^{ij}$). Table I shows the formulation for the satisfaction condition.

3) Equality Rule: Two variables can be considered equal if $|x_i - x_j| \le \varepsilon_{ij}$ with $\varepsilon_{ij}$ being a tolerance parameter for variable pair $x_i$ and $x_j$. The proportion of ND solutions following this condition is the score ($s_{\phi_{ij}}$) of the equality rule. The need to define $\varepsilon_{ij}$ for every variable pair can be avoided if normalized variables are used.

4) Inequality Rule: Inequality rules can be of the form $x_i \le x_j$ or $x_i \ge x_j$. The proportion of ND solutions satisfying either condition is the score of the respective rules.

After the learning agent identifies specific rules from a set of ND solutions, the rules can be used to repair offspring solutions of the next generation. The repair mechanism for each rule is described next.

C. Repair Agent

Once the rules are learned from the current ND solutions by the learning agent, the next task is to use these rules to repair the offspring solutions for the next few generations. There are two questions to ponder. First, how many rules should we use in the repair process? Second, how closely should we adhere to each rule while repairing? A small fraction of learned rules may not embed requisite properties present in the ND solutions in offspring solutions. But the usage of too many rules may reduce the effect of each rule. Similarly, a tight adherence to observed rules may encourage premature convergence to a nonoptimal solution, while a loose adherence may not pass on properties of ND solutions to the offspring. We propose four different rule usage schemes [10% (RU1) to 100% (RU4)] and three rule adherence schemes [RA1 (tight) to RA3 (loose)] for power law and inequality rules.

1) Constant Rule: To apply a constant rule $x_i = \kappa_i$ to a particular offspring solution $\mathbf{x}^{(k)}$, the variable $x_i^{(k)}$ is simply set to $\kappa_i$, thereby implementing the learned rule from previous ND solutions to the current offspring solutions. Constant rules are always included in the rule set and used with tight adherence.

2) Power Law Rule: For a power law rule $\hat{x}_i \hat{x}_j^b = c$, one variable is selected as the base (independent) variable and the other variable is set according to the rule. For example, for a particular offspring solution $\mathbf{x}^{(k)}$, if $\hat{x}_i^{(k)}$ is selected as the base variable, $\hat{x}_j^{(k)}$ is set as follows: $\hat{x}_j^{(k)} = (c/\hat{x}_i^{(k)})^{1/b}$. Despite theoretically being able to represent constant relationships by having $b = 0$, in practice, extremely low values of $b$ can cause the repaired variable $\hat{x}_j^{(k)}$ to have a large value outside the variable range. Hence, in this study, we first check whether a variable follows constant rules, and if it does, then that variable’s involvement in a power law rule is ignored.

A repair of a power law rule is followed with three different confidence levels by adjusting to an updated c-value: $\hat{x}_i \hat{x}_j^b = c_r$. PL-RA1 uses $c_r = c$ (tight adherence); PL-RA2 uses $c_r \in \mathcal{N}(c, \sigma_c)$ (medium adherence); and PL-RA3 uses $c_r \in \mathcal{N}(c, 2\sigma_c)$ (loose adherence), where $\sigma_c$ is the standard deviation of c-values for the power law observed among the ND solutions during the learning process. PL-RA1 puts the greatest trust into the learned power law rule, whereas PL-RA3 has the least amount of trust and provides the most flexibility in the repair process.

TABLE II
RULE HIERARCHY BY RANK FOR EACH REPAIR AGENT
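The power law learning and repair steps can be sketched as below, under stated assumptions. The log-log least-squares fit follows (3) and (4), and the repair uses the base-variable formula from the text. The function names, the synthetic data, the value `sigma_c=0.05`, and the reading of $\mathcal{N}(c, \sigma_c)$ as a normal distribution with standard deviation $\sigma_c$ are all illustrative assumptions rather than the authors' implementation.

```python
import math
import random


def normalize(values, lower, upper):
    """Map values from [lower, upper] to [1, 2] so that logarithms are defined."""
    return [1.0 + (v - lower) / (upper - lower) for v in values]


def learn_power_law(xi_hat, xj_hat):
    """Fit log(xi_hat) = beta * log(xj_hat) + eps by ordinary least squares.

    Returns (b, c, r2) with b = -beta and c = exp(eps), following (3) and (4).
    """
    u = [math.log(v) for v in xj_hat]
    w = [math.log(v) for v in xi_hat]
    n = len(u)
    u_mean, w_mean = sum(u) / n, sum(w) / n
    s_uu = sum((a - u_mean) ** 2 for a in u)
    s_uw = sum((a - u_mean) * (d - w_mean) for a, d in zip(u, w))
    beta = s_uw / s_uu
    eps = w_mean - beta * u_mean
    ss_res = sum((d - (beta * a + eps)) ** 2 for a, d in zip(u, w))
    ss_tot = sum((d - w_mean) ** 2 for d in w)
    return -beta, math.exp(eps), 1.0 - ss_res / ss_tot


def sample_c(c, sigma_c, scheme, rng):
    """Pick c_r for PL-RA1/2/3; sigma_c is treated as a standard deviation (assumption)."""
    width = {"PL-RA1": 0.0, "PL-RA2": 1.0, "PL-RA3": 2.0}[scheme]
    return c if width == 0.0 else rng.gauss(c, width * sigma_c)


def repair_power_law(xi_hat, b, c_r):
    """Set the dependent variable: xj_hat = (c_r / xi_hat) ** (1 / b)."""
    return (c_r / xi_hat) ** (1.0 / b)


# Synthetic ND set that follows xi_hat * xj_hat = 2 exactly (illustrative data only).
xj_hat = [1.0 + k / 19 for k in range(20)]
xi_hat = [2.0 / v for v in xj_hat]
b, c, r2 = learn_power_law(xi_hat, xj_hat)

rng = random.Random(0)
c_tight = sample_c(c, sigma_c=0.05, scheme="PL-RA1", rng=rng)   # equals c
c_loose = sample_c(c, sigma_c=0.05, scheme="PL-RA3", rng=rng)   # random draw around c
xj_repaired = repair_power_law(1.25, b, c_tight)
print(round(b, 6), round(c, 6), round(xj_repaired, 6))
```

Because the repair divides by $b$ in the exponent, a fitted $b$ close to zero would blow up the repaired value, which is the reason the text checks constant rules first.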
3) Inequality and Equality Rules: In order to repair an offspring solution $\mathbf{x}^{(k)}$, we have to select one variable ($x_i^{(k)}$) as the base variable and the other ($x_j^{(k)}$) as the dependent variable to be repaired. The generalized inequality repair operation is shown in

$$x_j^{(k)} = x_i^{(k)} + \nu_{r1}\left(x_i^U - x_i^{(k)}\right), \quad \text{for } x_i^{(k)} \le x_j^{(k)} \tag{5}$$
$$x_j^{(k)} = \frac{x_i^{(k)} - \nu_{r2}\, x_i^U}{1 - \nu_{r2}}, \quad \text{for } x_i^{(k)} \ge x_j^{(k)}. \tag{6}$$

Three different rule adherence (RA) schemes are considered. IQ-RA1 uses $\nu_{r1} = \mu_{\nu_1}$ and $\nu_{r2} = \mu_{\nu_2}$ (tight adherence with no standard deviation), which are computed as the means of $\nu_1$ and $\nu_2$ from ND solutions during the learning process, as follows:

$$\nu_1 = \frac{x_j - x_i}{x_i^U - x_i}, \qquad \nu_2 = \frac{x_i - x_j}{x_i^U - x_j}.$$

For IQ-RA2, $\nu_{r1} \in \mathcal{N}(\mu_{\nu_1}, \sigma_{\nu_1})$ and $\nu_{r2} \in \mathcal{N}(\mu_{\nu_2}, \sigma_{\nu_2})$ (medium adherence with one standard deviation) are used, where $\sigma_{\nu_1}$ and $\sigma_{\nu_2}$ are standard deviations of $\nu_1$ and $\nu_2$, respectively. Both $\nu_{r1}$ and $\nu_{r2}$ are set to zero, if they come out to be negative. For IQ-RA3, $\nu_{r1}, \nu_{r2} \in \mathcal{U}(0, 1)$ (loose adherence with a uniform distribution) are used.

D. Ensemble Repair Agent

Both power law and inequality/equality rules have three RA options for repair. For a new problem, it is not clear which option will work the best, so we also propose an ensemble approach (PL-RA-E and IQ-RA-E) in which all three options are allowed, but based on the success of each option, more probability is assigned to each. The ensemble method also considers a fourth option in which no repair to an offspring is made. The survival rate ($r_s^i$) of offspring generated by the $i$th repair operator is a measure of its quality. The greater the survival rate of the offspring created by an operator is, the higher is the probability of its being used in the subsequent offspring generation. The probability ($p_r^i$) update operation for the $i$th operator is presented in

$$p_r^i(t+1) = \max\left(p_{\min},\ \alpha \frac{r_s^i}{\sum_i r_s^i} + (1 - \alpha)\, p_r^i(t)\right) \tag{7}$$
$$p_r^i(t+1) = \frac{p_r^i(t+1)}{\sum_i p_r^i(t+1)} \tag{8}$$

where $\alpha$ is the learning rate, $r_s^i = n_s^i / n_{\text{off}}$, where $n_s^i$ and $n_{\text{off}}$ are the number of offspring created by the $i$th operator that survive in generation $t$ and the total number of offspring that survive in generation $t$, respectively. It is possible that at any point during the optimization, no solution generated by one of the repair operators survives. This might cause the corresponding selection probability to go down to zero without any possibility of recovery. To prevent this, in (7), the probability update step ensures that a minimum selection probability ($p_{\min}$) is always assigned to each repair operator present in the ensemble. Equation (8) normalizes the probability values for each operator so that their total sum is one.

The learning rate ($\alpha$) determines the rate of change of the repair probabilities. A high $\alpha$ would increase the sensitivity and can result in large changes in repair probabilities over a short period of time. A low $\alpha$ exerts a damping effect which causes the probability values to update slowly. Through trial and error, $\alpha = 0.5$ and $p_{\min} = 0.1$ are found to be suitable for the problems of this study.

E. Mixed Rule Repair Agent

A mixed rule repair agent is designed to work on two or more different types of rules. Since multiple rules (e.g., an inequality rule and a power law rule) can show up for the same variable pair, a rule hierarchy needs to exist as defined in Section III-A2. Table II shows the rule hierarchical rank used for all the repair agents in this study.

F. User’s Ranking of Rules

The user forms the basis of the interactivity of the IK-EMO framework. At any point during the optimization, the user has the option to review the optimization results and provide feedback to the optimization algorithm in one or more of the following ways.

1) Rule Ranking: The user may provide a ranking of rules (rank 1 is most preferred) provided by the algorithm. The algorithm will then try to implement the rules in the rank order provided by the user.
2) Rule Exclusion: The user may select to remove certain rules provided by the algorithm, based on their knowledge of the problem.
3) Rule Specificity: The user may specify details for considering a rule further. For example, the user may specify that only variables having a correlation above a specified value should be considered. Another criterion could be to select all rules having a score greater than a threshold as rank 1 and exclude the others.

In this article, the proposed rule usage schemes (RU1-RU4) can also be considered as artificial users [26] who select a certain percentage of the learned rules every few generations.
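The ensemble probability update of Section III-D, (7) followed by (8), can be sketched as follows. The function name, the starting probabilities, and the survivor counts are hypothetical, and the handling of a generation with no survivors at all is an assumption that the text does not cover. The defaults $\alpha = 0.5$ and $p_{\min} = 0.1$ are the values reported in the text.

```python
def update_repair_probabilities(p, n_survived, n_off, alpha=0.5, p_min=0.1):
    """One update of the repair-operator selection probabilities, per (7) and (8).

    p          : current probabilities p_r^i(t), one per operator
    n_survived : n_s^i, surviving offspring created by each operator
    n_off      : total number of surviving offspring in the generation
    """
    # Survival rates r_s^i = n_s^i / n_off (all zero if nothing survived: assumption).
    r = [n / n_off if n_off > 0 else 0.0 for n in n_survived]
    total = sum(r)
    share = [ri / total if total > 0 else 0.0 for ri in r]
    # Equation (7): blend the survival share with the old probability, floored at p_min.
    raw = [max(p_min, alpha * s + (1.0 - alpha) * pi) for s, pi in zip(share, p)]
    # Equation (8): renormalize so that the probabilities sum to one.
    norm = sum(raw)
    return [v / norm for v in raw]


# Hypothetical state: three RA options plus the "no repair" option.
p = [0.4, 0.3, 0.2, 0.1]
n_survived = [6, 4, 0, 0]
new_p = update_repair_probabilities(p, n_survived, n_off=10)
print([round(v, 4) for v in new_p])
```

In this example the two operators with no surviving offspring are held at the floor in (7) instead of decaying toward zero, so they can still be selected in later generations.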
This systematically illustrates the interactive ability of IK-EMO while showing the effect of different numbers of rules used for repair on the performance.

In the subsequent experiments, it is assumed that the user instantaneously provides their feedback. However, in the real world, this may not be the case. The user may require some finite time to adequately process the results and gather their feedback. Pausing the optimization algorithm during the user’s analysis process can result in losing out on useful function evaluations (FEs) that could have been completed during this overall allocated time period. An experimental analysis of this issue is provided in the supplementary document.

G. Variable Relation Graph

The possible number of pair-wise relations among $n$ variables is $n(n-1)/2$ or $O(n^2)$. Thus, for a large number of variables, the amount of bookkeeping required to track individual pairwise relations is large. Moreover, the observed relationships should not contradict each other. For example, for inequality rules $x_i \le x_j$ and $x_j \le x_k$, the transitive property can be maintained by choosing to repair $x_j$ based on $x_i$, followed by repairing $x_k$ based on $x_j$ using (5). But repairing both $x_j$ and $x_k$ separately based on $x_i$ can potentially contradict the rule $x_j \le x_k$. To solve these two challenges, we propose using a graph-based data structure, called variable relation graph (VRG), to encode and track relationships observed between multiple variable pairs. A customized graph-traversal algorithm ensures that all repairs are performed with minimal or no contradictions. In the following sections, steps 1–5 show the process of using learning agents to construct a VRG (learning phase). A learning interval ($T_L$) is defined as the number of generations or FEs after which a new learning phase begins. Step 6 shows the process of applying the VRG to repair an offspring solution using one or more repair agents (repair phase). A repair interval ($T_R$) is defined as the number of generations or FEs between any two repair phases.

1) Create Complete VRG: A vertex (or node) of a VRG represents a variable and an edge connecting two nodes indicates the existence of a relationship between the corresponding variables. For every group $G_k$ of variables, all pairwise variable combinations are connected by an edge. This will result in a complete graph where every pair of vertices is connected by a unique undirected edge. An example with two variable groups ($G_1 = \{1, 2, 3, 6, 8\}$ and $G_2 = \{4, 5, 7, 9, 10\}$) having five variables each is illustrated in Fig. 2.

Fig. 2. Ten variables in two noninteracting groups are represented in complete graphs. Each node represents a variable. An edge i–j represents the existence of a relationship between decision variables $x_i$ and $x_j$. (a) Group $G_1$. (b) Group $G_2$.

2) Rule Selection: In this step, learned rules are used to modify the VRGs according to two selection criteria. First, all rules having a score (defined in Table I) above a certain threshold ($s_{\min}$) are considered. Second, they are applied in the order of the user’s preference ranking. A connection may be removed if it does not satisfy the selection criteria. If a single-variable (constant) rule satisfies the selection criterion, then the corresponding node is removed from the VRG and that rule will be implemented separately. If no two-variable rule involving $x_i$ and $x_j$ satisfies the minimum score criterion, the corresponding VRG edge (i-j) is removed. An example is shown in Fig. 3, which uses the rule hierarchy for mixed rule repair operators (third row) shown in Table II, except that inequalities are ranked 3 for illustration here. A blue or brown edge represents a power law rule or an inequality rule, respectively. An edge ranking is also assigned based on the rule hierarchy. In this case, edges representing power laws and inequalities will be ranked 1 and 2 by default, unless overruled by the user. Both graphs have a reduced number of edges after the rule selection process is complete. Node 8 in Fig. 3(a) (marked in red) is found to have a constant rule associated with it and hence removed. In Fig. 3(b), variables $(x_5, x_9)$ and $(x_9, x_{10})$ are not related by power laws having a score greater than $s_{\min}$. However, they are found to follow inequality relationships with a score greater than $s_{\min}$. Hence, they are connected by brown edges. The rest of the edges represent power law rules and are marked by blue. The approach to set the direction of the edges is discussed next.

Fig. 3. Blue edge represents that a corresponding power law relation has a score greater than $s_{\min}$. A brown edge represents a corresponding inequality relation having a high score. If the scores for both types of relations are low, the corresponding edge is removed. If this results in a node having no edges, then it is removed, such as node 8 in (a). (a) Group $G_1$. (b) Group $G_2$.

3) Create Directed Acyclic VRG: In order to apply a repair agent to the VRG, it needs to be converted to a directed acyclic graph (DAG). This step ensures graph traversal is possible without getting stuck in loops. The members of every group $G_k$ are randomly permuted to create a sequence $D_k$. If $i$ appears before $j$ in $D_k$, an undirected edge between nodes $i$ and $j$ is converted to a directed edge from $i$ to $j$. In the example shown in Fig. 4, two random sequences $D_1 = (2, 1, 3, 6)$ and
Fig. 4. Random sequence of the nodes is created for each group. An edge Fig. 6. User can provide feedback by adding, removing, or ranking the rules.
i → j is created from an undirected edge if i appears before j in the sequence. Ranking divides an existing VRG into multiple subgraphs. Edges with a red
This creates a directed acyclic VRG. (a) Group G1 . (b) Group G2 . border represent rank 1 relations. Edges with a dark yellow border represent
rank 2 relations. Gray edges represent relations that the user wants to remove.
(a) Group G1 . (b) Group G2 .
Algorithm 1 VRG Traversal and Repair Pseudocode

Require: New solution set (Xr ), variable groups (G), VRGs for
every solution and group, rule hierarchy.
Ensure: Repaired solution set Xr .
1: function T RAVERSE G RAPH(x, Graph, CurrentNode,
PreviousNode, NodesVisited, CurrentRank)
2: if CurrentNode in NodesVisited then return
3: end if
4: CurrentEdges ← Graph.Edges[CurrentNode];
5: for each outgoing edge (e) in CurrentEdges do
6: NextNode ← e.EndVertex;
Fig. 5. Transitive reduction is performed to remove redundant directed edges. 7: if NextNode not in NodesVisited then
In (a), since edges 2 → 3 → 6 exist, edge 2 → 6 is considered redundant 8: EdgeType ← e.EdgeType;
and is removed. (a) Group G1 . (b) Group G2 .
9: EdgeRank ← e.EdgeRank;
10: if EdgeRank = CurrentRank then
11: Repair(x, CurrentNode, NextNode, EdgeType,
D2 = (10, 4, 5, 9, 7) are created for groups G1 and G2 , respec- EdgeRank);
12: end if
tively. Since node 2 appears before node 1 in D1 , a blue arrow 13: TraverseGraph(x, Graph, NextNode, CurrentNode,
goes from node 2 to node 1, as shown in the figure. This pro- NodesVisited);
cess is repeated for every population member so as to create 14: end if
diverse VRGs. 15: end for
4) Transitive Reduction: Next, a transitive reduction [27] is 16: Add CurrentNode to NodesVisited;
17: end function
performed on the VRG corresponding to each variable group. 18: for each group Gk in G do Repair procedure begins
For VRGs having both power law and inequality edges, the 19: for each solution x in Xr do
transitive reduction is performed on subgraphs consisting only 20: CurrentGraph ← VRG assigned to x for Gk ;
of the edges of the same type. This step eliminates redundant 21: for CurrentRank = 1, 2,..., nranks do
directed edges between two different rule types. An example 22: StartNode ← Select the first node having atleast one
edge of rank CurrentRank from a random sequence;
of eliminating an arrow from node 2 to node 6 is shown in 23: TraverseGraph(x, CurrentGraph, StartNode, NULL,
Fig. 5(a). [], CurrentRank);
5) Modify VRG According to User’s Feedback: A user can 24: end for
provide feedback in the form of a ranking, or select only a 25: end for
26: end for
subset of the available rules. In the former case, the VRG
edge rankings are updated to reflect the user’s choice. Edges
corresponding to the rules discarded by the user are removed.
Fig. 6(a) shows an example where the rule involving x1 and the pseudocode of the repair process. In the pseudocode, the
x6 are ranked 1 (marked by arrows with a red border) and x2 VRG data structure has the attributes Nodes and Edges. The
and x3 are ranked 2 (marked by arrows with a dark yellow Edges attribute representing an edge (i, j) has multiple sub-
border). The gray edges represent the rules discarded by the attributes: StartVertex (i in this case), EndVertex (j in this
user. Fig. 6(b) shows a similar ranking process. case), EdgeType (rule type and corresponding repair agent),
6) Repair New Offspring Solutions: For every new solu- and EdgeRank (rank of an edge). A function TraverseGraph
tion, the corresponding VRGs are traversed. From a starting is used which recursively traverses the VRG from a random
node, the algorithm moves forward via the outgoing edges start node for a particular rule rank. The function Repair
and repairs the connected nodes recursively in a depth-first called by TraverseGraph calls the correct repair agent based
fashion. This is repeated for all ranks. Algorithm 1 presents on EdgeType.
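Steps 3), 4), and 6) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the undirected edge set for G1 is hypothetical (the actual edges of Fig. 3 are not listed in the text), all edges are assumed to be rank-1 power law edges, and `record_repair` is a stand-in that only logs which edge a repair agent would act on. The function names `orient_edges`, `transitive_reduction`, and `traverse` are likewise illustrative.

```python
def orient_edges(undirected_edges, order):
    """Direct each edge i-j from whichever node appears first in `order`."""
    pos = {node: k for k, node in enumerate(order)}
    return {(i, j) if pos[i] < pos[j] else (j, i) for i, j in undirected_edges}


def transitive_reduction(edges):
    """Keep edge (u, v) only if v is not reachable from u by another path.

    Valid for a DAG, where the transitive reduction is unique.
    """
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)

    def reachable_without(src, dst, skipped):
        stack, seen = [src], {src}
        while stack:
            n = stack.pop()
            for m in succ.get(n, ()):
                if (n, m) == skipped or m in seen:
                    continue
                if m == dst:
                    return True
                seen.add(m)
                stack.append(m)
        return False

    return {e for e in edges if not reachable_without(e[0], e[1], e)}


def traverse(x, graph, node, visited, rank, repair):
    """Depth-first walk mirroring TraverseGraph in Algorithm 1."""
    if node in visited:
        return
    for nxt, edge_type, edge_rank in graph.get(node, ()):
        if nxt not in visited:
            if edge_rank == rank:
                repair(x, node, nxt, edge_type)
            traverse(x, graph, nxt, visited, rank, repair)
    visited.add(node)


# Hypothetical undirected edges left in group G1 after rule selection.
edges_g1 = {(1, 2), (2, 3), (3, 6), (2, 6), (1, 6)}
d1 = (2, 1, 3, 6)  # sequence D1 from the text

directed = orient_edges(edges_g1, d1)
reduced = transitive_reduction(directed)

# Adjacency list: node -> [(end vertex, edge type, edge rank)], all rank 1 here.
graph = {}
for u, v in sorted(reduced):
    graph.setdefault(u, []).append((v, "power_law", 1))

log = []  # stand-in repair agent: record which edge would be repaired


def record_repair(x, i, j, edge_type):
    log.append((i, j))


traverse(x=None, graph=graph, node=d1[0], visited=set(), rank=1, repair=record_repair)
print(sorted(reduced))  # [(1, 6), (2, 1), (2, 3), (3, 6)]
print(log)              # [(2, 1), (1, 6), (2, 3)]
```

In this toy graph the edge 2 → 6 is dropped because 6 remains reachable from 2 through other edges. During traversal, edge 3 → 6 triggers no repair because node 6 has already been visited, so each variable is repaired at most once per rank.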
TABLE III
NUMBER OF DECISION VARIABLES AND CONSTRAINTS FOR 39- AND 59-SEGMENT STEPPED BEAMS

TABLE IV
PARAMETER SETTINGS OF IK-EMO

Fig. 7. Simply-supported stepped beam with five segments.
IV. SIMPLY-SUPPORTED STEPPED BEAM DESIGN

Beam design problems are common in the literature [8], [28] and can be used to benchmark an optimization algorithm. In this article, we consider a simply-supported stepped beam design with multiple segments having a rectangular cross-section. An example with five segments is shown in Fig. 7.

A vertical load of 2 kN is applied at the middle of the beam. All nseg segments are of equal length. The area of the rectangular cross-section is determined by its width (bi) and height (hi) for the ith segment, where i ∈ [1, nseg]. The volume (V) and maximum deflection (Δ) are to be minimized by finding an optimal width bi and height hi of each segment, totaling 2nseg variables. The maximum stress σi(x) of the ith member and deflection δj(x) at the jth node need to be kept below the strength of the material σmax and a specified limit δmax, respectively. The aspect ratio (ratio of height to width) of each segment is also restricted within a particular range (in [aL, aU]), as constraints. The MOP formulation is shown in (9)–(13):

\begin{align}
\text{Minimize} \quad & V(\mathbf{x}) = \sum_{i=1}^{n_{seg}} b_i h_i l_i \tag{9} \\
\text{Minimize} \quad & \Delta(\mathbf{x}) = \max_{i=1}^{n_{seg}} \delta_i(\mathbf{x}) \tag{10} \\
\text{Subject to} \quad & \max_{i=1}^{n_{seg}} \sigma_i(\mathbf{x}) \le \sigma_{\max} \tag{11} \\
& \max_{j=1}^{n_{seg}} \delta_j(\mathbf{x}) \le \delta_{\max} \tag{12} \\
& a_L \le a_i \le a_U, \quad \text{for } i = 1, \ldots, n_{seg}. \tag{13}
\end{align}

Here, two cases with 39 and 59 segments are considered. Problem parameters are described in Table III.

A. Experimental Settings

NSGA-II [29], a state-of-the-art MOEA, is applied with the proposed IK-EMO procedure to solve both cases. This problem is intended to demonstrate the performance of our proposed algorithm with minimal initial user knowledge. Thus, all variables are put into a single group. IK-EMO is combined separately with each repair agent described in Section III-C. In addition, there are two cases where mixed relationships are used: the first case with PL-RA2 and IQ-RA2, and the second case with PL-RA-E and IQ-RA-E. The rule hierarchy is described in Table II.

Four rule usage schemes RU1, RU2, RU3, and RU4 select the top 10%, 20%, 50%, and 100% of the learned rules sorted according to their scores. They also act as artificial users with a consistent behavior. Each rule usage scheme is paired with one or more repair agents. Eight cases with a single repair agent are considered: PL-RA1, PL-RA2, PL-RA3, PL-RA-E, IQ-RA1, IQ-RA2, IQ-RA3, and IQ-RA-E. Two cases with a combination of repair agents are considered: one with PL-RA2 and IQ-RA2, and the other with PL-RA-E and IQ-RA-E. From Table I, ρi is set to 0.1, εij is set to 0.1, and e^min_ij is set to 0.01. Table IV shows the parameter settings for this problem.

For each combination of a repair agent and user, 20 runs are performed and the hypervolume (HV) [30] values are recorded at the end of each generation. The Wilcoxon rank-sum test [31] is used to compare the statistical performance of the algorithms tested here with respect to the best-performing algorithm for each scenario. As an example, let p1 and p2 represent the performance metric values for two algorithms A1 and A2. For each simulation run, p1 and p2 exist as paired observations. Here, the null hypothesis states that there is no statistically significant difference between p1 and p2. The hypothesis is tested at a 95% confidence level and the p-values are recorded. A p-value less than 0.05 means that there is a statistically significant performance difference between A1 and A2. The median number of FEs taken to achieve a target hypervolume (HVT) is used as a performance metric for the Wilcoxon test. HVT is set to be the final median HV achieved by base NSGA-II when the run is terminated. In order to make the problem challenging for the proposed approach, a small population size of 40 is used and a maximum number of generations of 500 is set, thereby allowing a maximum computational budget of 20 000 FEs for each run.

B. Experimental Results and Discussion

Tables V and VI show the optimization results for the 39- and 59-segment stepped beam problems, respectively. Base NSGA-II results without any rule extraction and repair are shown in the first row. The best performance case in each row is marked in bold. For every column, the best-performing algorithm is marked with a shaded gray box. The Wilcoxon p-values show the relative performance of each algorithm with respect to the column-wise best performance. Algorithms with a statistically similar performance to the column-wise best are shown in italics. The ND front obtained in a particular run using the power law repair operators for RU2 is shown in Fig. 8(a) for the 59-segment case. The corresponding median HV plot over the course of the optimization run is shown in Fig. 8(b). Similar behaviors are observed for the 39-segment case (see the supplementary materials).

TABLE V
FEs REQUIRED TO ACHIEVE HVT = 0.95 FOR 39-SEGMENT BEAMS. BEST-PERFORMING ALGORITHM FOR ROW IS MARKED IN BOLD. BEST-PERFORMING ALGORITHM IN EACH COLUMN IS MARKED BY A SHADED GRAY BOX. ALGORITHMS WITH STATISTICALLY SIMILAR PERFORMANCE TO THE BEST ALGORITHM COLUMN-WISE ARE MARKED IN ITALICS. THE CORRESPONDING WILCOXON p-VALUES ARE GIVEN IN BRACES

TABLE VI
FEs REQUIRED TO ACHIEVE HVT = 0.76 FOR 59-SEGMENT BEAMS

Fig. 8. ND fronts and HV plots obtained by IK-EMO with RU2 and power law repair agents for 59-segment stepped beam problem. (a) ND front for one run. (b) HV plot over 20 runs.

The results show many interesting observations as stated below.

1) General Observations: Statistically, base NSGA-II does not perform well compared to knowledge-based NSGA-II methods for both 39- and 59-segment problems. A positive aspect of the proposed algorithm is that it is still able to achieve a good performance with a very small population size. For problems with expensive evaluation functions, this may be beneficial for saving computational time.
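The performance metric used in this section, the median number of FEs needed to reach HVT, can be sketched as follows. This is only an illustration under stated assumptions: each generation is assumed to cost one population's worth of FEs (consistent with 40 × 500 = 20 000), runs that never reach the target are counted at the full budget (a choice made here, not stated in the text), and the HV histories are made-up toy values.

```python
from statistics import median


def fes_to_target(hv_history, hv_target, pop_size):
    """Return the FEs consumed when HV first reaches hv_target, else None.

    hv_history[g] is the HV recorded at the end of generation g + 1.
    """
    for gen, hv in enumerate(hv_history, start=1):
        if hv >= hv_target:
            return gen * pop_size
    return None


def median_fes(runs, hv_target, pop_size, budget):
    """Median FEs-to-target over runs; failed runs count as `budget` (assumption)."""
    values = []
    for history in runs:
        fes = fes_to_target(history, hv_target, pop_size)
        values.append(budget if fes is None else fes)
    return median(values)


# Toy HV histories for three runs (hypothetical numbers, three generations each).
runs = [
    [0.50, 0.80, 0.96],  # reaches 0.95 in generation 3
    [0.60, 0.95, 0.97],  # reaches 0.95 in generation 2
    [0.40, 0.50, 0.60],  # never reaches 0.95
]
result = median_fes(runs, hv_target=0.95, pop_size=40, budget=20000)
print(result)  # 120
```

The per-run values produced by `fes_to_target` are the kind of samples that a rank-based test such as the Wilcoxon test would compare between two algorithms.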
2) Power Law Versus Inequality Rules: As can be seen from the tables, for both 39- and 59-segment problems, the power law repair operators generally perform better than inequality-based repair operators, except for PL-RA3. One possible reason could be the greater versatility of power laws in modeling complex relationships compared to simple inequality rules.

3) Best-Performing Algorithm for Each Rule Usage Scheme: In the 39-segment case, PL-RA-E is the best performer for RU1 to RU4, with PL-RA2 and the mixed repair operators offering statistically similar performance for RU1 to RU3. This shows that the ensemble method can be used to get good performance without the need for selecting a proper repair process. For the 59-segment case, PL-RA2 and PL-RA1 are the best performers for RU1 and RU2, respectively. PL-RA-E offers a statistically similar performance in both cases. In the cases of RU3 and RU4, PL-RA-E gives the best performance, with PL-RA2 having a comparable performance. PL-RA3 performs significantly worse than the other PL operators for all the cases. For PL-RA1, adhering closely to the learned power law rules constrains NSGA-II in finding good solutions. PL-RA3 introduces a large amount of variance, which is detrimental to the optimization process. A compromise between these two extremes, provided by PL-RA2 or PL-RA-E, is the logical step.

Fig. 8 illustrates the results when RU2 is combined with the power law repair operators for both problem cases. The difference in the quality of solutions obtained after 20 000 FEs is prominent in the 59-segment ND front. In the median HV plots, it is seen that the number of FEs required to reach HVT for PL-RA-E is close to the number needed by the best-performing repair agent. Base NSGA-II and the repair operators all give good quality solutions at the end of the run for the 39-segment case. However, for the 59-segment case, base NSGA-II performs significantly worse.

4) Relative Performance of Each Rule Usage Scheme: It can be seen from Tables V and VI that RU2 produces the best performance in 7 out of 10 cases for the 39-segment case, and 6 out of 10 cases for the 59-segment case. This shows that in terms of rule usage, using too few or too many of the learned rules is not effective in improving the algorithm’s performance.

5) Mixed Relation Repair Agents: The mixed relation repair agents (PL-RA2+IQ-RA2) and (PL-RA-E+IQ-RA-E) have statistically similar performance to the best algorithm for each user and both problem cases. Even though inequality repair operators perform worse than power law repair operators individually, their presence in the mixed repair agents does not hinder the performance, since only the high-performing rules are added to the VRG during creation. The proposed framework is robust enough to give good performance irrespective of the number of repair agents and type of rules.

V. OPTIMAL POWER FLOW PROBLEM

Optimal power flow (OPF) is a common problem in power system engineering with MOEAs being used to solve the problem [33], [34]. The following objective functions are minimized: fuel cost, emissions, voltage deviation, and real power loss. In many cases, one or more of these objectives are considered in the literature, with the rest being kept as constraints. In this study, we consider two objectives: 1) minimizing fuel cost and 2) reducing fossil fuel emissions. Voltage deviation and power loss are kept as constraints. This version of the OPF problem is also known as the environmental economic dispatch (EED) problem [34]

\begin{align}
\text{Minimize} \quad & C_F(P_G, V_G) = \sum_{i=1}^{N_G} \left( a_i + b_i P_{Gi} + c_i P_{Gi}^2 \right) \tag{14} \\
\text{Minimize} \quad & C_E(P_G, V_G) = \sum_{i=1}^{N_G} \left( \alpha_i + \beta_i P_{Gi} + \gamma_i P_{Gi}^2 + \zeta_i e^{(\lambda_i P_{Gi})} \right) \tag{15} \\
\text{Subject to} \quad & \sum_{i=1}^{N_{bus}} (P_i - P_D - P_L) = 0 \tag{16} \\
& V_D^{\min} \le V_D \le V_D^{\max}, \quad P_L^{\min} \le P_L \le P_L^{\max} \notag \\
& Q_{Gi}^{\min} \le Q_{Gi} \le Q_{Gi}^{\max}, \quad P_s^{\min} \le P_s \le P_s^{\max} \notag \\
& V_s^{\min} \le V_s \le V_s^{\max}, \quad V_{PQi}^{\min} \le V_{PQi} \le V_{PQi}^{\max} \notag
\end{align}

where CF is the fuel cost, CE is the emission cost, NG is the number of generators, PGi is the real power output and VGi is the voltage output of the ith generator, (ai, bi, ci) are the fuel cost coefficients, and (αi, βi, γi, ζi, λi) are the emission cost coefficients. VD is the total voltage deviation of all the load buses, PL is the total real power loss, QGi is the reactive power output of the ith generator, Ps is the real power output and Vs is the voltage output of the slack bus, and VPQi is the voltage at the ith load/P-Q bus. PD is the power demand and Nbus is the total number of buses. A load flow analysis must be performed to satisfy the equality constraint. We use MATPOWER [35] as the load flow solver. We consider IEEE 118-bus and 300-bus systems in this study.

TABLE VII
IEEE BUS SYSTEM SPECIFICATIONS

TABLE VIII
OPF DECISION VARIABLE TYPES AND RANGES

A. Experimental Settings

The bus details, along with the numbers of decision variables and constraints, are given in Table VII. The types of decision variables and their corresponding ranges are given in Table VIII.
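As a small worked example of the two objectives in (14) and (15), the sketch below evaluates the fuel cost and emission cost for a given vector of generator outputs. The coefficient values are invented purely for illustration and do not come from the IEEE bus data; the equality constraint (16) is not checked here because it requires a load flow solver.

```python
import math


def fuel_cost(p_g, a, b, c):
    """Eq. (14): sum over generators of a_i + b_i * P_Gi + c_i * P_Gi^2."""
    return sum(ai + bi * p + ci * p * p for p, ai, bi, ci in zip(p_g, a, b, c))


def emission_cost(p_g, alpha, beta, gamma, zeta, lam):
    """Eq. (15): quadratic term plus zeta_i * exp(lambda_i * P_Gi) per generator."""
    return sum(
        al + be * p + ga * p * p + ze * math.exp(la * p)
        for p, al, be, ga, ze, la in zip(p_g, alpha, beta, gamma, zeta, lam)
    )


# Two hypothetical generators; every number below is illustrative only.
p_g = [1.0, 2.0]
cf = fuel_cost(p_g, a=[10.0, 20.0], b=[2.0, 3.0], c=[0.5, 0.25])
ce = emission_cost(
    p_g,
    alpha=[1.0, 2.0],
    beta=[0.1, 0.2],
    gamma=[0.01, 0.02],
    zeta=[1.0, 1.0],
    lam=[0.0, 0.0],  # zero exponent so the exponential term equals zeta_i
)
print(cf)  # 39.5
print(round(ce, 2))  # 5.59
```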
Experimental settings are the same as in the stepped beam problem except that the population size is set to 50 and the maximum number of generations is set to 400 for both IEEE 118- and 300-bus systems. From Table I, ρi and εij are set to 1, and e^min_ij is set to 0.01. Two variable groups are defined and shown in Table IX.

TABLE IX
OPF VARIABLE GROUPS

B. Experimental Results and Discussion

Tables X and XI show the optimization results for the IEEE 118- and 300-bus systems, respectively. The ND fronts obtained in a single run using four power law repair methods for RU2 are shown in Fig. 9(a) for the IEEE 300-bus system. The corresponding median HV plots over the course of the optimization are shown in Fig. 9(b). Similar behavior is observed for the 118-bus system (see the supplementary document).

TABLE X
FEs REQUIRED TO ACHIEVE HVT = 0.90 FOR IEEE 118-BUS SYSTEM

TABLE XI
FEs REQUIRED TO ACHIEVE HVT = 0.79 FOR IEEE 300-BUS SYSTEM

Fig. 9. ND front and HV plots obtained with RU2 and power law repair agents for IEEE 300-bus OPF problem. (a) ND front for one run. (b) HV plot over 20 runs.

1) General Observations: Base NSGA-II is outperformed by both the power law and inequality-based repair agents. Good performance with a low population size is obtainable by IK-EMO, making it the better choice for this problem.

2) Power Law Versus Inequality Rules: For both problem cases, a power law repair operator is the best performer for each user, as in the stepped beam problem. Inequality-rule-based repair operators in general result in worse performance compared to power-law-based repair operators. This, as in the stepped beam problem, is a result of the power laws being able to more accurately model the intervariable relationships.

3) Best-Performing Algorithm for Each User: For both the 118- and 300-bus problems, PL-RA-E performance is the best or statistically similar to the best for all the cases. Thus, an ensemble repair agent is a good choice for automatically selecting the best repair agent according to the situation.

Fig. 9 illustrates the results with RU2 combined with the power law repair operators for the IEEE 300-bus case. The difference in the quality of solutions obtained after 20 000 FEs is prominent in the IEEE 300-bus case. In the median HV plots, it is seen that the number of FEs required to reach HVT for PL-RA-E is the least, followed by PL-RA2.

4) Relative Performance of Each Rule Usage Scheme: It can be seen from Tables X and XI that RU2 produces the best performance in 9 out of 10 cases for the IEEE 118-bus case, and 8 out of 10 cases for the IEEE 300-bus case. This shows that in terms of rule usage, using too few or too many of the learned rules is detrimental to the optimization performance in general, which is similar to the conclusions made in the stepped beam design problems.

5) Mixed Relation Repair Agents: The mixed relation repair agent PL-RA-E+IQ-RA-E has statistically similar performance to the best algorithm for each user and both problem cases. As in the stepped beam problems, the worse performance of the inequality repair operators does not hinder the performance of the mixed relation operators.

VI. TRUSS DESIGN PROBLEM

Finally, we consider a commonly used truss design problem involving two objectives and 1416 highly nonlinear constraints. The truss has 1100 members and 316 nodes, giving a total of 1179 variables and making it a large-scale problem. The details of the problem description are provided in the supplementary document.

Experimental settings are similar to those of the previous problems. The population size is set to 100 and the maximum number of generations is set to 10 000. Thus, the total computational budget comes out to be 1 million FEs. From Table I, ρi and εij are set to 0.1, and e^min_ij is set to 0.01. Multiple variable groups are defined for this problem based on the relative location and alignment of the beams, as shown in Table XII.

TABLE XII
VARIABLE GROUPS FOR THE 1100-MEMBER TRUSS CASES. EACH GROUP HAS COMPARABLE VARIABLES HAVING IDENTICAL UNITS AND SCALES

A. Experimental Results and Discussion

Results are presented in Table XIII and Fig. 10. Base NSGA-II is outperformed by most repair operators. As in the previous two problems, power-law-based approaches perform better than inequality-based approaches, but the ensemble-based approach performs the best overall with an intermediate use of repair (RU2). More information is provided in the supplementary document.

VII. SUMMARY OF RESULTS

For every problem, we have a total of 11 different algorithms, including base NSGA-II and 10 repair schemes. For each problem, the ranking based on FEs to achieve the target HV for four users is summarized in Table XIV. An algorithm with statistically similar performance to the best-performing algorithm is assigned a rank of 1. More details are presented in the supplementary document. It is seen that the top-ranked algorithm is PL-RA-E, followed by the mixed PL-RA-E+IQ-RA-E, highlighting the superiority of the ensemble approach. PL-RA2 comes in third place, showing that a moderate level of RA provides the optimal performance.

VIII. COMPARISON WITH OTHER MOEAS

In the previous sections, the performance enhancement of NSGA-II with IK-EMO is investigated. A more complete analysis requires comparison with other learning-based EMO algorithms which have attempted to learn and use the features of the search space or the fitness landscape properties [36]. Any such algorithm used for a performance comparison with IK-EMO-based NSGA-II should also be able to handle constraints and multiple objectives. Based on these requirements, we have chosen three recently proposed algorithms: RVEA [37], AGE-MOEA [38], and BiCo [39]. These algorithms use some form of adaptation methods to execute a more efficient search. PlatEMO [40], which provides a MATLAB implementation of these algorithms, is used here.

For IK-EMO, the parameters are kept the same as in the previous experiments. PL-RA-E with RU2 is chosen as the repair scheme for IK-EMO, with NSGA-II as the core optimization algorithm. HVT is set as the final median HV obtained by base NSGA-II.

Table XV shows the FEs required to achieve HVT by all the algorithms for each problem. It is seen that IK-EMO with PL-RA-E statistically outperforms most of the other algorithms by reaching the HVT faster. AGE-MOEA and BiCo are able to statistically match IK-EMO’s performance only for the 78-variable beam and 115-variable OPF problems. The performance difference between IK-EMO and other methods is more significant for the large-scale 1179-variable truss problem. It should be noted that the other algorithms have a better performance than the base NSGA-II on almost all problems. Thus, the superior performance of IK-EMO is not a product of the underlying NSGA-II alone, but of the effective VRG-based learning and repair methods as well. In addition, the user also has knowledge of functional relationships among design variables which can be used for different variants of
Fig. 10. ND fronts and HV plots obtained by IK-EMO with RU2 and power law repair agents for 1100-member truss design problem. (a) ND front for one run. (b) HV plot over 20 runs.

TABLE XIII
FEs REQUIRED TO ACHIEVE HVT = 0.80 FOR 1100-MEMBER TRUSS

TABLE XIV
RANKING OF DIFFERENT REPAIR AGENTS ON MULTIPLE PROBLEMS. A DETAILED BREAKDOWN IS PROVIDED IN THE SUPPLEMENTARY MATERIAL

TABLE XV
FEs REQUIRED TO ACHIEVE THE HVT FOR ALL THE PRACTICAL PROBLEMS OVER 20 RUNS. HVT IS SET AS THE FINAL MEDIAN HV OF BASE NSGA-II. PROBLEM-WISE BEST-PERFORMING ALGORITHM IS MARKED IN BOLD. ALGORITHMS WITH STATISTICALLY SIMILAR PERFORMANCE TO THE BEST ALGORITHM FOR EACH PROBLEM ARE MARKED IN ITALICS. THE CORRESPONDING WILCOXON p-VALUES ARE GIVEN IN BRACES
the same problem. Thus, IK-EMO is a better choice over the other algorithms if knowledge interpretability and reusability are desired, and it remains a modular concept that can be embedded easily into other EMO/EMaO methods.

IX. CONCLUSION AND FUTURE WORK

In this article, we have proposed the IK-EMO framework, which interleaves interactive optimization with knowledge augmentation to obtain better quality solutions faster. Power law, inequality, and mixed rules are extracted, together with their degrees of statistical adherence, from the ND solutions at a regular interval of generations. A computationally efficient graph data structure-based (VRG) knowledge processing method has been proposed to store and process multiple pairwise variable interactions. A user is then expected to provide a ranking of the learned rules based on his/her perception of the validity of the rules. A repair agent has been proposed to utilize the VRG with user-supplied ranking to repair offspring solutions. The study has created six repair schemes with three different degrees (tight, medium, and loose) of RA. A mixed power law and inequality-based repair has also been used. Finally, three ensemble-based repair schemes which adaptively use power law, inequality, or both have been proposed. These 10 repair schemes have been implemented with four different rule usage schemes RU1-RU4, using 10% (conservative) to 100% (liberal) of the learned rules.

The proposed framework has been applied to three constrained large-scale two-objective practical problems: 78- and 118-variable stepped beam design problems, 115- and 243-variable OPF problems, and a 1179-variable truss design problem.

Experimental results on all problems have consistently shown that 1) usage of a moderate number of rules (20%) combined with a moderate degree of RA produces a better performance compared to other individual repair operators and 2) power law rules, individually, produce better performance than inequality rules. Moreover, ensemble-based repair operators provide the best performance overall. Use of ensembles eliminates the need to experiment to find the right RA for a new problem. IK-EMO is also able to work with very low population sizes, even for a large-scale problem. IK-EMO has also shown superior performance over other EMOs which adapt to the nature of the decision space or fitness landscape. IK-EMO can also help users learn about unknown relations between the decision variables.

This study opens up a number of avenues for future work. The scope of knowledge can be increased to include the objective and constraint functions as well, in line with existing innovization literature. Studies can be performed using rule types other than the ones considered here. For EMO algorithms using some form of adaptation mechanism, repair methods can interfere with the natural evolution of optimal solutions. Studies need to be performed to understand how the repair agents affect the operation of adaptation-based algorithms. In this work, the learning and repair intervals are kept fixed. The effect of these parameters needs to be studied more closely. In many problems, a rule may not stay valid across the entire Pareto-optimal front. Locally present rules may exist in certain parts of the Pareto-optimal front [14]. Ways to extract local rules and repair an MOEA’s offspring population members accordingly will introduce additional challenges but may result in faster convergence. The interactive and knowledge-based approach should also be implemented in other EMO/EMaO algorithms to observe its effect in improving convergence speed. A learning agent can also use other information sources. For example, an archive of all discovered ND solutions may be used in place of the current ND solution set for rule discovery. Traditional user preference information including, but not limited to, the relative importance of objective functions and preferred regions of the Pareto-optimal front, can also potentially be integrated into this type of framework. Nevertheless, this study has clearly demonstrated a viable way to extract variable interaction knowledge from intermediate optimization iterations and to use relevant and vetted knowledge back in the optimization algorithm for updating offspring solutions to constitute a computationally fast search process. More such practice-oriented studies must now accompany evolutionary optimization applications to make them more worthy for practical problem-solving tasks.

REFERENCES

[1] K. Deb and R. B. Agrawal, “Simulated binary crossover for continuous search space,” Complex Syst., vol. 9, no. 2, pp. 115–148, 1995.
[2] K. V. Price, R. M. Storn, and J. A. Lampinen, Differential Evolution: A Practical Approach to Global Optimization. Berlin, Germany: Springer-Verlag, 2005.
[3] E. Zitzler, K. Deb, and L. Thiele, “Comparison of multiobjective evolutionary algorithms: Empirical results,” Evol. Comput., vol. 8, no. 2, pp. 173–195, Mar. 2000.
[4] K. Deb, L. Thiele, M. Laumanns, and E. Zitzler, “Scalable multi-objective optimization test problems,” in Proc. Congr. Evol. Comput. (CEC), vol. 1, 2002, pp. 825–830.
[5] S. Huband, L. Barone, L. While, and P. Hingston, “A scalable multi-objective test problem toolkit,” in Evolutionary Multi-Criterion Optimization (Lecture Notes in Computer Science, 3410). Berlin, Germany: Springer-Verlag, 2005, pp. 280–295.
[6] K. Deb and C. Myburgh, “Breaking the billion-variable barrier in real-world optimization using a customized evolutionary algorithm,” in Proc. Genet. Evol. Comput. Conf. (GECCO), New York, NY, USA, Jul. 2016, pp. 653–660.
[7] A. Szőllős, M. Šmíd, and J. Hájek, “Aerodynamic optimization via multi-objective micro-genetic algorithm with range adaptation, knowledge-based reinitialization, crowding and ε-dominance,” Adv. Eng. Softw., vol. 40, no. 6, pp. 419–430, Jun. 2009.
[8] A. H. Gandomi, K. Deb, R. C. Averill, S. Rahnamayan, and M. N. Omidvar, “Using semi-independent variables to enhance optimization search,” Expert Syst. Appl., vol. 120, pp. 279–297, Apr. 2019.
[9] R. Landa-Becerra, L. V. Santana-Quintero, and C. A. Coello Coello, “Knowledge incorporation in multi-objective evolutionary algorithms,” in Multi-Objective Evolutionary Algorithms for Knowledge Discovery from Databases (Studies in Computational Intelligence), vol. 98. Berlin, Germany: Springer, 2008, pp. 23–46.
[10] C. Barba-González, A. J. Nebro, J. García-Nieto, M. del Mar Roldán-García, I. Navas-Delgado, and J. F. Aldana-Montes, “Injecting domain knowledge in multi-objective optimization problems: A semantic approach,” Comput. Stand. Interfaces, vol. 78, Oct. 2021, Art. no. 103546.
[11] C. A. Coello Coello and M. G. C. Tapia, “Cultural algorithms for optimization,” in Handbook of AI-based Metaheuristics. Boca Raton, FL, USA: CRC Press, Jul. 2021, pp. 219–238.
[12] S. Obayashi and D. Sasaki, “Visualization and data mining of Pareto solutions using self-organizing map,” in Evolutionary Multi-Criterion Optimization (Lecture Notes in Computer Science, 2632). Berlin, Germany: Springer, 2003, pp. 796–809.
[13] K. Deb and A. Srinivasan, “Innovization: Innovating design principles through optimization,” in Proc. Genet. Evol. Comput. Conf. (GECCO), vol. 2, 2006, pp. 1629–1636.
[14] S. Bandaru and K. Deb, “Higher and lower-level knowledge discovery from Pareto-optimal sets,” J. Global Optim., vol. 57, no. 2, pp. 281–298, Oct. 2013.
[15] A. Gaur and K. Deb, “Adaptive use of innovization principles for a faster convergence of evolutionary multi-objective optimization algorithms,” in Proc. Genet. Evol. Comput. Conf., New York, NY, USA, Jul. 2016, pp. 75–76.
[16] B. Xin, L. Chen, J. Chen, H. Ishibuchi, K. Hirota, and B. Liu, “Interactive multiobjective optimization: A review of the state-of-the-art,” IEEE Access, vol. 6, pp. 41256–41279, 2018.
[17] K. Deb and J. Sundar, “Reference point based multi-objective optimization using evolutionary algorithms,” in Proc. Genet. Evol. Comput. Conf. (GECCO), vol. 1, 2006, pp. 635–642.
[18] K. Miettinen, F. Ruiz, and A. P. Wierzbicki, “Introduction to multiobjective optimization: Interactive approaches,” in Multiobjective Optimization (Lecture Notes in Computer Science, 5252). Berlin, Germany: Springer-Verlag, 2008, pp. 27–57.
[19] M. Gombolay et al., “Human–machine collaborative optimization via apprenticeship scheduling,” J. Artif. Intell. Res., vol. 63, pp. 1–49, May 2018.
[20] S. Bandaru, T. Aslam, A. H. Ng, and K. Deb, “Generalized higher-level
automated innovization with application to inventory management,” Eur.
J. Oper. Res., vol. 243, no. 2, pp. 480–496, Jun. 2015.
[21] K. Deb and R. Datta, “Hybrid evolutionary multi-objective optimization
and analysis of machining operations,” Eng. Optim., vol. 44, no. 6,
pp. 685–706, Jun. 2012.
[22] A. H. C. Ng, C. Dudas, H. Boström, and K. Deb, “Interleaving
innovization with evolutionary multi-objective optimization in production
system simulation for faster convergence,” in Learning and
Intelligent Optimization (Lecture Notes in Computer Science, 7997).
Berlin, Germany: Springer, 2013, pp. 1–18.
[23] A. Ghosh, K. Deb, R. Averill, and E. Goodman, “Combining user knowledge
and online innovization for faster solution to multi-objective design
optimization problems,” in Evolutionary Multi-Criterion Optimization
(Lecture Notes in Computer Science, 12654 (Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics)). Cham, Switzerland:
Springer, Mar. 2021, pp. 102–114.
[24] A. Ghosh, K. Deb, E. Goodman, and R. Averill, “A user-guided
innovization-based evolutionary algorithm framework for practical
multi-objective optimization problems,” Eng. Optim., to be published.
[Online]. Available: https://doi.org/10.1080/0305215X.2022.2144275
[25] A. Ghosh, E. Goodman, K. Deb, R. Averill, and A. Diaz, “A large-scale
bi-objective optimization of solid rocket motors using innovization,” in
Proc. IEEE Congr. Evol. Comput. (CEC), Jul. 2020, pp. 1–8.
[26] C. Barba-González, V. Ojalehto, J. García-Nieto, A. J. Nebro,
K. Miettinen, and J. F. Aldana-Montes, “Artificial decision maker driven
by PSO: An approach for testing reference point based interactive methods,”
in Parallel Problem Solving from Nature (PPSN XV) (Lecture
Notes in Computer Science, 11101). Cham, Switzerland: Springer-Verlag,
2018, pp. 274–285.
[27] A. V. Aho, M. R. Garey, and J. D. Ullman, “The transitive reduction of a
directed graph,” SIAM J. Comput., vol. 1, no. 2, pp. 131–137, Jul. 1972.
[28] A. Rothwell, “Optimization of beams,” in Optimization Methods in
Structural Design (Solid Mechanics and its Applications), vol. 242.
Cham, Switzerland: Springer-Verlag, Apr. 2017, pp. 147–181.
[29] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist
multiobjective genetic algorithm: NSGA-II,” IEEE Trans. Evol. Comput.,
vol. 6, no. 2, pp. 182–197, Apr. 2002.
[30] E. Zitzler and L. Thiele, “Multiobjective optimization using evolutionary
algorithms—A comparative case study,” in Parallel Problem Solving
from Nature (PPSN V) (Lecture Notes in Computer Science, 1498).
Berlin, Germany: Springer-Verlag, 1998, pp. 292–301.
[31] M. Hollander, D. A. Wolfe, and E. Chicken, Nonparametric Statistical
Methods (Wiley Series in Probability and Statistics). Hoboken, NJ, USA:
Wiley, Jul. 2015.
[32] K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms.
New York, NY, USA: Wiley, 2001.
[33] S. Datta, A. Ghosh, K. Sanyal, and S. Das, “A radial boundary
intersection aided interior point method for multi-objective
optimization,” Inf. Sci., vol. 377, pp. 1–16, Jan. 2017.
[34] M. Basu, “Economic environmental dispatch using multi-objective
differential evolution,” Appl. Soft Comput. J., vol. 11, no. 2, pp. 2845–2853,
2011.
[35] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas,
“MATPOWER: Steady-state operations, planning, and analysis tools
for power systems research and education,” IEEE Trans. Power Syst.,
vol. 26, no. 1, pp. 12–19, Feb. 2011.
[36] Y. E. Tian et al., “Evolutionary large-scale multi-objective optimization:
A survey,” ACM Comput. Surv., vol. 54, pp. 1–34, Oct. 2021.
[37] R. Cheng, Y. Jin, M. Olhofer, and B. Sendhoff, “A reference vector
guided evolutionary algorithm for many-objective optimization,” IEEE
Trans. Evol. Comput., vol. 20, no. 5, pp. 773–791, Oct. 2016.
[38] A. Panichella, “An adaptive evolutionary algorithm based on
non-Euclidean geometry for many-objective optimization,” in Proc. Genet.
Evol. Comput. Conf. (GECCO), Jul. 2019, pp. 595–603. [Online].
Available: https://doi.org/10.1145/3321707.3321839
[39] Z.-Z. Liu, B.-C. Wang, and K. Tang, “Handling constrained
multiobjective optimization problems via bidirectional coevolution,”
IEEE Trans. Cybern., vol. 52, no. 10, pp. 10163–10176, Oct. 2022.
[40] Y. Tian, R. Cheng, X. Zhang, and Y. Jin, “PlatEMO: A
MATLAB platform for evolutionary multi-objective optimization,” 2017,
arXiv:1701.00879.

Abhiroop Ghosh (Member, IEEE) received the bachelor’s degree in electrical
engineering from Jadavpur University, Kolkata, India, in 2016, and the
Ph.D. degree in electrical and computer engineering from Michigan State
University, East Lansing, MI, USA, in 2022.
He is currently a Senior Software Engineer with Aspen Technology, Medina,
MN, USA, working on optimization applications in power generation
management systems. His primary research interests include multiobjective
knowledge-driven optimization and evolutionary algorithms.

Kalyanmoy Deb (Fellow, IEEE) received the bachelor’s degree in mechanical
engineering from the Indian Institute of Technology Kharagpur, Kharagpur,
India, in 1985, and the master’s and Ph.D. degrees from the University of
Alabama, Tuscaloosa, AL, USA, in 1989 and 1991, respectively.
He is a University Distinguished Professor and a Koenig Endowed Chair
Professor with the Department of Electrical and Computer Engineering,
Michigan State University, East Lansing, MI, USA. He is largely known for
his seminal research in evolutionary multicriterion optimization. He has
published over 600 international journal and conference research papers to
date.
Dr. Deb received the IEEE EC Pioneer Award and the Infosys Prize. He is a
Fellow of ACM and ASME.

Erik Goodman received the bachelor’s and master’s degrees from Michigan
State University, East Lansing, MI, USA, in 1966 and 1968, respectively,
and the Ph.D. from the University of Michigan, Ann Arbor, MI, USA, in 1972.
He was the PI and the Director of the BEACON Center for the Study of
Evolution in Action, an NSF Center headquartered at Michigan State
University, from 2010 to 2018, where he was a Professor Emeritus of
Electrical and Computer Engineering and Mechanical Engineering and Computer
Science and Engineering until 2022. He co-founded Red Cedar Technology
(1999, currently part of Siemens), East Lansing, and developed the HEEDS
SHERPA commercial design optimization software now widely used in industry.
Dr. Goodman’s honors include the Michigan Distinguished Professor of the
Year 2009 and the MSU Distinguished Faculty Award in 2011. He was a Senior
Fellow of the International Society for Genetic and Evolutionary Computation
in 2004, the Founding Chair of the ACM SIG on Genetic and Evolutionary
Computation (SIGEVO) from 2005 to 2007, and continuing service on its
Executive Committee and Advisory Committee.

Ronald Averill received the M.S. and Ph.D. degrees in engineering mechanics
from Virginia Polytechnic Institute and State University, Blacksburg, VA,
USA, in 1989 and 1992, respectively.
He is a Professor Emeritus with the Department of Mechanical Engineering,
Michigan State University, East Lansing, MI, USA. His research focus is on
design optimization of large and complex systems, analysis of composite
materials and structures, and design for sustainable agriculture. He
co-founded Red Cedar Technology, East Lansing, in 1999, and served as its
President and CEO through 2011. Red Cedar Technology, currently a Siemens
company, is a leading provider of multidisciplinary design optimization
software, solutions and technology.

