Swarm and Evolutionary Computation: Mauro Castelli, Gianpiero Cattaneo, Luca Manzoni, Leonardo Vanneschi


Swarm and Evolutionary Computation 44 (2019) 636–645

Contents lists available at ScienceDirect

Swarm and Evolutionary Computation


journal homepage: www.elsevier.com/locate/swevo

A distance between populations for n-points crossover in genetic algorithms


Mauro Castelli a , Gianpiero Cattaneo b , Luca Manzoni b, ∗ , Leonardo Vanneschi a
a NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Campus de Campolide, 1070-312 Lisboa, Portugal
b Dipartimento di Informatica, Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Milano, Italy

A B S T R A C T

The theoretical study of Genetic Algorithms and the dynamics induced by their genetic operators is a research field with a long history and many different approaches.
In this paper we complete a recently presented approach to model one-point crossover using pretopologies (or Čech topologies) in two ways. First, we extend it to
the case of n-points crossover. We extend the definition of crossover distance between populations to work for n-points crossover, proving that computing it can be
performed in polynomial time for any fixed number of crossover points. Secondly, we experimentally study how the distance distribution changes when the number
of crossover points increases. In particular, the average distance between a population and the optimum decreases with the increase in the number of crossover
points, showing that increasing the latter can reduce the number of crossover operations needed to evolve an optimal solution.

1. Introduction

The usual approach to the study of genetic algorithms (GAs) is to model their dynamics either using some simple kind of crossover, like the one-point crossover, or without focusing on the difference given by using different kinds of crossover (see, for instance, [1] for a comprehensive overview). Several models have appeared so far that study the dynamics under one-point crossover or in which the crossover does not play a significant role. The most prominent example is the work of Vose and his coworkers [2,3], which is based on modeling a GA (with selection, mutation, and crossover) as a deterministic system, under the hypothesis of an infinite population. In a work by Kim and Moon [4] a distance is defined where the interactions between genes of a chromosome of a GA are considered; furthermore, in the same study, the idea of using a distance between populations is developed, even if mutation, instead of crossover, plays the main role.

Topology has often been considered an important concept in the study of GAs. In the work of Moraglio [5–7], the crossover and mutation operators are defined and studied in terms of topology, generally induced by a metric space, making the model applicable to a wide range of evolutionary computation techniques. In this model, some general results on crossover are reached, for example on the points that can be generated by a geometric crossover (a condition respected by many crossovers used in practice). Among the possible approaches, the definition of an operator-based distance for evolutionary computation techniques is an important step in analyzing some aspects of the dynamics of the evolutionary algorithm [8]. It is useful, for example, for computing indicators of problem difficulty (e.g., fitness distance correlation [9–11]). Pretopologies, instead of topologies, have also been used to study the dynamics of GAs. The first use of pretopologies to model crossover is due to Stadler, Wagner and coworkers [12,13], where a connection with hypergraphs is also made. The study of topologies is also linked to the search for better genetic operators. This has been done in a recent work by Yoon, Kim, Moraglio, and Moon [14], where the relation between phenotype and genotype is studied and used for the definition of new genetic operators.

The most frequently studied type of crossover is the one-point crossover. Modeling only one crossover point is, of course, unsatisfactory, since GAs are inspired by real biological processes in which more than one crossover point exists. Furthermore, since GAs are commonly used in optimization, it is appropriate to study operators that generalize one-point crossover. The study of these operators results in a better understanding of the impact of the type of crossover used on the performance of GAs. Of course, the generalization of one-point crossover to n-points crossover is not trivial, both computationally and from the perspective of the complexity of the mathematical formalization.

As previously noted, an important mathematical tool, also used in the case of one-point crossover, is the notion of metric space, obviously induced by an appropriate distance. This observation led to the study and definition of a population-based crossover distance for one-point

∗ Corresponding author.
E-mail addresses: mcastelli@novaims.unl.pt (M. Castelli), cattang@live.it (G. Cattaneo), luca.manzoni@disco.unimib.it (L. Manzoni), lvanneschi@novaims.unl.pt (L. Vanneschi).

https://doi.org/10.1016/j.swevo.2018.08.007
Received 15 December 2017; Received in revised form 16 July 2018; Accepted 11 August 2018
Available online 21 August 2018
2210-6502/© 2018 Elsevier B.V. All rights reserved.

crossover [15]. This work is a natural generalization of [15] to n-points crossover. While a distance "consistent" with the traditional GA mutation is Hamming's distance, defining a distance "consistent" with crossover is clearly more difficult. In fact, differently from mutation, when an individual is fixed, the result of a crossover operation depends on the entire population.

Having a distance or, more in general, a function related to a distance between populations, can be useful in understanding the behaviour of GA with respect to the particular problem at hand, with the aim of improving GA performance. In particular, the following applications can be identified:

• A population with a low average distance to any other population has fewer obstacles in producing any other individual quickly since fewer generations are needed to produce it. This can help in generating the initial population, thus limiting the risk of getting stuck in local minima.
• A population with a large average distance might have a low diversity in terms of genetic material (even if the individuals inside the population are all different), since recombination needs more steps to generate new individuals. That is, a distance between populations can be used as a diversity measure for GA and this measure can be subsequently used to promote diversity.
• More in general, traditional crossovers can only generate individuals in the convex hull of the population [5]. Without the intervention of mutation, each generation reduces the hyper-volume available for the production of new individuals. A crossover-based distance can help in understanding how the shrinkage of this convex hull happens and the implications with respect to the ability of generating new solutions.

When used as an optimization algorithm, GAs are inherently stochastic. However, there is a lower bound to the number of generations needed to transform a given population into another one. This lower bound can be found in a deterministic way, selecting an "optimal" sequence of populations. The length of this sequence allows us to determine the minimum number of generations needed for passing from one population to another. This lower bound also holds for the commonly used stochastic version of GA. In order to find this lower bound, we use sets of populations, while, as said above, standard GAs have a single population that is modified by the genetic operators in a stochastic manner. That is, standard GAs only follow one possible dynamics in a stochastic way, while here we focus on studying all the possible dynamics. In other words, in the stochastic context, one obtains, after one step, different populations with different probabilities. The two dynamics are related in the sense that the deterministic process that we study keeps track of all reachable populations (i.e., the ones that have a non-zero probability of being obtained) with the aim of determining when a population becomes reachable. Here we are not proposing a new version of GA for optimization, but a method to single out the dynamics of existing GAs when all the stochastic choices are, in some sense, optimal. We have experimentally verified that a different number of crossover points can change, on average, this bound. When the number of crossover points increases there is a measurable reduction in the average distance between two populations. This result is a stimulus to continue the investigation of the effect that different types of crossovers can have on the dynamics of the population.

This work is structured as follows: in Section 2 the basic model definitions and the mathematical notions needed in the subsequent parts of the paper are recalled. In Section 3 we give a generalization to the n-points crossover of the model introduced in Ref. [15]. The main results are presented in Section 4. In Section 5 we experimentally study how the distance distribution varies for different numbers of crossover points. The paper ends with the final remarks of Section 6 and with some proposals for possible future work.

2. Basic notions

In this section, some basic notations and notions about lattices and closure operators are introduced.

We denote by $[i, j]$ (resp. $(i, j)$) with $i, j \in \mathbb{N}$ the set $\{i, i+1, \ldots, j-1, j\} \subseteq \mathbb{N}$ (resp. $\{i+1, \ldots, j-1\}$). The meaning of $[i, j)$ and $(i, j]$ is the immediate extension of the previous notation. For a fixed $\ell \in \mathbb{N}$, we denote by $SC_{\ell,1}$ the set $\{[i, j] \mid 1 \le i \le j \le \ell\}$. In a previous paper [15] it has been proved that $SC_{\ell,1}$ is a lattice w.r.t. set inclusion. Also, for every $n \in \mathbb{N}$ with $n > 1$, we denote by $SC_{\ell,n}$ the set of all subsets of $[1, \ell]$ that can be written as the union of a set of $SC_{\ell,n-1}$ and a set of $SC_{\ell,1}$.

A finite alphabet will be denoted by $\Sigma$. The set of all the strings of a given length $\ell$ composed of symbols from $\Sigma$ is denoted by $\Sigma^\ell$. An element $x \in \Sigma^\ell$ is denoted by $x_1, x_2, \ldots, x_\ell$. The notation $x_{[i,j]}$ is a shortcut for $x_i, x_{i+1}, \ldots, x_{j-1}, x_j$.

A function $f$ between two partially ordered sets $A$ and $B$ is said to be isotone or order-preserving iff $\forall a, b \in A$, $a \le b \Rightarrow f(a) \le f(b)$.

Recall that a lattice $\mathcal{L}$ is a non-empty set $L$ endowed with a partial ordering $\le_L$ such that for any two elements $a, b \in L$ the join $a \vee b$ (i.e., the least upper bound of $a$ and $b$) and the meet $a \wedge b$ (i.e., the greatest lower bound of $a$ and $b$) operators are uniquely defined in $L$ [16]. A lattice is bounded if $\bigvee L$ (i.e., the maximal element for $L$) and $\bigwedge L$ (i.e., the minimal element for $L$) exist. A lattice is complete if for every subset $S$ of $L$ both $\bigvee S$ and $\bigwedge S$ exist. Note that every finite lattice is complete.

Given a partially ordered set (poset) $\mathcal{L} = (L, \le_L)$, a subset $O$ of $L$ is a lower set if for all $x \in L$ the condition "there exists $y \in O$ with $x \le_L y$" implies "$x \in O$." The set of all lower sets of a poset $\mathcal{L}$ is denoted by $\mathcal{O}(\mathcal{L})$ and it is a complete lattice with respect to set inclusion. See Ref. [16] for a reference on lattices.

A Čech closure [17] on a set $X$ is a function on the power set of $X$, $\overline{\,\cdot\,} : \mathcal{P}(X) \to \mathcal{P}(X)$, such that:

1. $\overline{\emptyset} = \emptyset$
2. $\forall A \subseteq X$, $A \subseteq \overline{A}$ (monotonicity)
3. $\forall A, B \subseteq X$, $\overline{A \cup B} = \overline{A} \cup \overline{B}$ (additivity)

Recall that a Kuratowski closure is a Čech closure with the following additional condition:

4. $\forall A \subseteq X$, $\overline{\overline{A}} = \overline{A}$ (idempotency)

The Kuratowski closure is one of the ways to define a topology on $X$ [18]. A Čech closure can be iterated, defining $\overline{A}^i$ with $i \in \mathbb{N}$ as follows:

$$\overline{A}^i = \begin{cases} \overline{\overline{A}^{i-1}} & \text{if } i \neq 0 \\ A & \text{otherwise} \end{cases} \qquad (1)$$

When $X$ is finite the function $[\![\cdot]\!] : \mathcal{P}(X) \to \mathcal{P}(X)$ defined as $[\![A]\!] = \bigcup_{i \in \mathbb{N}} \overline{A}^i$ is a Kuratowski closure.

3. An extension of the model to n-points crossover

In this section, we extend to n-points crossover the one-point model presented in Ref. [15]. To keep this work self-contained, we present the adapted definitions even when they remain similar to the ones already formalized for one-point crossover. The proofs of the propositions of this section can be obtained by a generalization of the proofs of [15].

The main objective of this section is to show that the dynamics given by crossover can be modeled via topological operations, that is, using Čech closures. This will make it possible to "extract" a definition of distance related to the number of times a Čech closure must be applied before reaching a fixed point.

The model that we are going to define is based on the idea that, given two populations $P_1$ and $P_2$ of a GA, with $P_2$ reachable from $P_1$, it is possible to count the minimum number of generations needed to transform $P_1$ into $P_2$ using only crossover operations. Hence, we decided
not to consider, for now, the fact that GA has an essential stochastic component. The semantics of the two populations $P_1$ and $P_2$ is the following: the first population, $P_1$, is the initial population of the GA and the second one, $P_2$, is the target (or final) population, containing one optimal solution. Hence, the minimum number of generations needed to go from $P_1$ to $P_2$ represents a lower bound on the number of generations needed by a GA (even when it has a stochastic component) to reach an optimal solution.

3.1. Crossover relation

A first step in the introduction of our simplified model of GA with n-points crossover is the definition of a crossover relation. The simplified aspect of this model is that a population is any possible subset of strings of a fixed length $\ell$ over an alphabet $\Sigma$, in which both the fixed population size and the presence of duplicate elements are ignored.

Definition 3.1. An n-points crossover relation (for $n \in [1, \ell)$) is a binary relation $\bowtie_{I,n}$ over $\Sigma^\ell \times \Sigma^\ell$ such that: $\forall x, y, x', y' \in \Sigma^\ell$, $(x, y) \bowtie_{I,n} (x', y')$ iff $\exists k_0 = 0, k_1, \ldots, k_n, k_{n+1} = \ell \in \mathbb{N}$ (not necessarily all distinct) such that $\forall i \in [0, n]$:

$$x'_{(k_i, k_{i+1}]} = \begin{cases} x_{(k_i, k_{i+1}]} & \text{if } i \text{ is even} \\ y_{(k_i, k_{i+1}]} & \text{otherwise} \end{cases}$$

and

$$y'_{(k_i, k_{i+1}]} = \begin{cases} y_{(k_i, k_{i+1}]} & \text{if } i \text{ is even} \\ x_{(k_i, k_{i+1}]} & \text{otherwise} \end{cases}$$

In the relation $\bowtie_{I,n}$ the symbol $I$ refers to "individuals."

For the case of one-point crossover (i.e., $n = 1$) this definition is exactly the one presented in Ref. [15]. Intuitively, we have that two pairs of elements of $\Sigma^\ell$ are in relation w.r.t. this definition if the second pair can be obtained from the first one using one n-point crossover operation. The relation $\bowtie_{I,n}$ is reflexive and symmetric but not transitive (i.e., following [19,20], it is a similarity relation).

Example 3.1. Let us consider $\Sigma = \{0, 1\}$ and $\ell = 6$. An example of two pairs of strings in relation w.r.t. 4-points crossover is the following:

$$\begin{pmatrix} 0\,1\,0\,0\,0\,1 \\ 1\,0\,1\,1\,0\,0 \end{pmatrix} \bowtie_{I,4} \begin{pmatrix} 0 \mid 0 \mid 0 \mid 1\,0 \mid 1 \\ 1 \mid 1 \mid 1 \mid 0\,0 \mid 0 \end{pmatrix} \qquad (2)$$

Notice that the relation $\bowtie_{I,4}$ is not transitive. For example the pair

$$\begin{pmatrix} 0\,0\,0\,0\,0\,0 \\ 1\,1\,1\,1\,1\,1 \end{pmatrix} \bowtie_{I,4} \begin{pmatrix} 0 \mid 1 \mid 0 \mid 1 \mid 0\,0 \\ 1 \mid 0 \mid 1 \mid 0 \mid 1\,1 \end{pmatrix}$$

and the pair

$$\begin{pmatrix} 0\,1\,0\,1\,0\,0 \\ 1\,0\,1\,0\,1\,1 \end{pmatrix} \bowtie_{I,4} \begin{pmatrix} 0\,1\,0\,1 \,\|\, 0 \mid 1 \mid \\ 1\,0\,1\,0 \,\|\, 1 \mid 0 \mid \end{pmatrix}$$

are both in the relation $\bowtie_{I,4}$ (notice that the first two crossover points, denoted by a double line, coincide, i.e., $k_1 = k_2$), but the pair

$$\begin{pmatrix} 0\,0\,0\,0\,0\,0 \\ 1\,1\,1\,1\,1\,1 \end{pmatrix} \text{ and } \begin{pmatrix} 0\,1\,0\,1\,0\,1 \\ 1\,0\,1\,0\,1\,0 \end{pmatrix}$$

is not in the relation $\bowtie_{I,4}$.

Notice that the choice of starting with a swap on the first interval or on the second one (i.e., at odd or even crossover points) is actually immaterial. That is, in the first case of the previous example we would have:

$$\begin{pmatrix} 0\,1\,0\,0\,0\,1 \\ 1\,0\,1\,1\,0\,0 \end{pmatrix} \bowtie_{I,4} \begin{pmatrix} 1 \mid 1 \mid 1 \mid 0\,0 \mid 0 \\ 0 \mid 0 \mid 0 \mid 1\,0 \mid 1 \end{pmatrix}$$

That is, the two individuals obtained are still the same.

This relation has been extended to the power set of $\Sigma^\ell$ as follows:

Definition 3.2. An n-point crossover relation $\bowtie_{P,n}$ over $\mathbf{P} = \mathcal{P}(\Sigma^\ell)$ is a relation such that $\forall P_1, P_2 \in \mathbf{P}$:

$$P_1 \bowtie_{P,n} P_2 \iff \forall x' \in P_2 \ \exists y' \in \Sigma^\ell \ \exists x, y \in P_1 \ \text{s.t.} \ (x, y) \bowtie_{I,n} (x', y')$$

In the relation $\bowtie_{P,n}$ the symbol $P$ refers to "populations."

For $n = 1$ this definition is the same as the one given in Ref. [15]. Intuitively, two sets are in the relation $\bowtie_{P,n}$ if every element of the second set can be obtained by using n-point crossover operations starting from elements of the first set. It is immediate that $\bowtie_{P,n}$ is reflexive, but neither symmetric nor transitive. However, the following property holds:

$\forall P_1, P_2 \in \mathbf{P}$, $P_1 \bowtie_{P,n} P_2$ implies that $\forall P'_1 \supseteq P_1$ and $\forall P'_2 \subseteq P_2$, $P'_1 \bowtie_{P,n} P'_2$.

In order to clarify this property, let us discuss the following example:

Example 3.2. Let $P_1, P_2 \subseteq \{0, 1\}^3$ be:

$P_1 = \{(0, 1, 0), (1, 0, 1)\}$, $P_2 = \{(1, 1, 1), (0, 0, 0)\}$

and $P'_1, P'_2 \subseteq \{0, 1\}^3$ be:

$P'_1 = \{(0, 1, 0), (1, 0, 1), (1, 1, 0)\}$, $P'_2 = \{(0, 0, 0)\}$

When considering only two-points crossover one obtains $P_1 \bowtie_{P,2} P_2$. That is, by only applying two-points crossover it is possible to transform the population $P_1$ into $P_2$. We also have that $P_1 \bowtie_{P,2} P'_2$, since $P'_2$ is a subset of $P_2$. Even if $P'_1$ contains more individuals than $P_1$, this does not impede the generation of the individuals that were already possible to generate from $P_1$, therefore we have that $P'_1 \bowtie_{P,2} P'_2$, as desired. We want to remark that the relation $P_1 \bowtie_{P,2} P_2$ is completely determined by the genetic material present in the two populations: since crossover cannot generate new genetic material, it is necessary for $P_1$ to contain all the genetic material needed to reach $P_2$.

The main idea is to define a Čech closure according to Eq. (1) over $\mathbf{P}$ such that $\forall P \in \mathbf{P}$ and $\forall i \in \mathbb{N}$, $\overline{\{P\}}^i$ is the set of populations that can be obtained from $P$ after $i$ generations using only the crossover as a genetic operator. To satisfy those requirements we defined a closure $[\![\cdot]\!]$ such that $\forall P_1, P_2 \in \mathbf{P}$:

1. $P_2 \in [\![\{P_1\}]\!]$ iff $P_2$ can be obtained by using only crossover operations from $P_1$.
2. If $P_2 \in [\![\{P_1\}]\!]$, the minimal $i \in \mathbb{N}$ such that $P_2 \in \overline{\{P_1\}}^i$ is also the minimal number of generations needed to obtain $P_2$ from $P_1$.

Such a closure, defined in Ref. [15] for one-point crossover, is here generalized as follows:

Definition 3.3. The crossover closure for n-point crossover is a function $\overline{\,\cdot\,} : \mathcal{P}(\mathbf{P}) \to \mathcal{P}(\mathbf{P})$ defined, for every $A \subseteq \mathbf{P}$, as:

1. When $A = \emptyset$, $\overline{A} = \emptyset$.
2. When $A = \{P\}$, $\overline{\{P\}} = \{P' \in \mathbf{P} \mid P \bowtie_{P,n} P'\}$.
3. Otherwise, $\overline{A} = \bigcup_{P \in A} \overline{\{P\}}$.

The following two propositions, as a generalization of the corresponding results for the case $n = 1$ [15], hold for a crossover closure.

Proposition 3.1. The crossover closure is a Čech closure.

Proposition 3.2. For all $P_1, P_2 \in \mathbf{P}$ and for all $k \in \mathbb{N}$, $P_2 \in \overline{\{P_1\}}^k$ iff there exist $Q_0 = P_1, Q_1, \ldots, Q_{k-1}, Q_k = P_2 \in \mathbf{P}$ such that $\forall i \in [0, k)$, $Q_i \bowtie_{P,n} Q_{i+1}$.
1 0 1 1 0 0 0 || 0 || 0 || 1 0 || 1

Intuitively, the previous proposition states that verifying that a population $P_2$ is inside the $k$th iteration of the closure of a population $P_1$ is the same as verifying if it is possible to obtain $P_2$ starting from $P_1$ in $k$ generations using only crossover operations.

We are now going to show that from the closure of a population $P_1$ it is always possible to find a particular population $P'$ such that $\overline{\{P_1\}}^2 = \overline{\{P'\}}$. That is, we can always focus on considering closures of singletons. We define, $\forall P \in \mathbf{P}$ and $\forall i \in \mathbb{N}$, the set $S_i(P) \in \mathbf{P}$ as follows:

$$S_i(P) = \begin{cases} P & \text{if } i = 0 \\ \bigcup \overline{\{S_{i-1}(P)\}} & \text{otherwise} \end{cases}$$

The following proposition links the iteration of the closure with the sequence $S_0(P), S_1(P), \ldots$.

Proposition 3.3. For all $P_1, P_2 \in \mathbf{P}$ such that $\exists i \in \mathbb{N}$ with $P_2 \in \overline{\{P_1\}}^i$ the following holds:

$$\min\{i \in \mathbb{N} \mid P_2 \in \overline{\{P_1\}}^i\} = \begin{cases} 1 & \text{if } P_2 \subset P_1 \\ \min\{i \in \mathbb{N} \mid P_2 \subseteq S_i(P_1)\} & \text{otherwise} \end{cases}$$

The previous proposition intuitively states that it is possible to know the minimum number of generations needed to obtain a population from another by only considering a Čech closure of a particular singleton set.

3.2. Distance definition

From the previous definitions, we have the elements to define a metric between populations. The definitions of [15] can be easily adapted to the n-points crossover case.

Let $k^* = \min\{k \in \mathbb{N} \mid \forall U \subseteq \mathbf{P} : \overline{U}^k = \overline{U}^{k+1}\}$ (i.e., the minimum number of iterations of the Čech closure needed to reach a fixed point independently of the starting set). Then a quasi-metric (i.e., a distance without the symmetry property, simply called directional distance in this paper) $f_P : \mathbf{P} \times \mathbf{P} \to \mathbb{R}^+$ can be defined as:

$$f_P(P_1, P_2) = \begin{cases} \min\{k \in \mathbb{N} \mid P_2 \in \overline{\{P_1\}}^k\} & \text{if } P_2 \in \overline{\{P_1\}}^{k^*} \\ k^* & \text{otherwise} \end{cases}$$

In this case the function $f_P$ respects all the properties of a metric except symmetry. In fact, $f_P(P_1, P_2) = 0$ if and only if $P_1 = P_2$, since $P_1 \neq P_2$ implies that either at least one closure operation must be performed to reach $P_2$ from $P_1$ (i.e., $f_P(P_1, P_2) \ge 1$) or $P_2$ is not reachable from $P_1$ (i.e., $f_P(P_1, P_2) = k^* > 0$). The property that for each $P_1$, $P_2$, and $P_3$, $f_P(P_1, P_3) \le f_P(P_1, P_2) + f_P(P_2, P_3)$ is also verified. In fact, there are two possible cases:

• Both $f_P(P_1, P_2)$ and $f_P(P_2, P_3)$ are less than $k^*$. In this case $f_P(P_1, P_3)$ must be less than or equal to their sum, since it is the minimum number of generations needed to go from $P_1$ to $P_3$ and, if this requires passing through $P_2$, at most equality can be obtained.
• If one of $f_P(P_1, P_2)$ and $f_P(P_2, P_3)$ is equal to $k^*$ then the inequality holds by definition, since the maximum value that $f_P$ can have is $k^*$.

Starting from $f_P$, it is possible to obtain a distance between populations. The function $d_P$ defined as $(P_1, P_2) \mapsto \frac{1}{2}\left(f_P(P_1, P_2) + f_P(P_2, P_1)\right)$ suffices for this. In fact, since $f_P$ already respects all the properties of metrics except symmetry, taking the average of $f_P(P_1, P_2)$ and $f_P(P_2, P_1)$ suffices to obtain a value that does not depend on the order of $P_1$ and $P_2$.

For any fixed $P \in \mathbf{P}$ it is possible to define a distance $\delta_P$ between elements of $\Sigma^\ell$ as:

$$\delta_P(x, y) = d_P\left((P \setminus \{x\}) \cup \{y\},\ (P \setminus \{y\}) \cup \{x\}\right).$$

In the experimental part of the paper, however, we will use the directional distance $f_P$. While a metric has nicer theoretical properties, we want to remark that in the dynamics of GA there is no symmetry between transforming a population $P_1$ into a population $P_2$ via crossover (which might be easy if, for example, $P_2$ is a subset of $P_1$) and transforming $P_2$ into $P_1$, also via crossover (which might be impossible if $P_2$ does not contain all the genetic material necessary to generate all the individuals in $P_1$). Therefore, a directional distance better captures this asymmetry in the dynamics of GA, and we consider it better suited to be used during the experimental part of the paper.

4. A representation for populations as lower sets

In this section, a way to represent populations as lower sets is introduced. This representation allows us to compute the previously defined distance in an efficient way (i.e., in a time that is polynomial w.r.t. the size of the populations and the length of the individuals). The main idea behind this section is that, when we know the target population that we want to reach, we do not need to represent all the possible populations differently. This allows for a compact representation of populations, where only the "parts" that are actually useful to the crossover are memorized.

While $SC_{\ell,1}$ is a lattice, $\forall n > 1$, $\forall \ell > 2\lfloor n/2 \rfloor + 3$, the $SC_{\ell,n}$ poset is not a lattice (w.r.t. set inclusion).

Example 4.1. Consider, for a fixed $n$ and $\ell = 2\lfloor n/2 \rfloor + 3$, $A, B \in SC_{\ell,n}$ where $A = \{1\} \cup \{5\} \cup \{7\} \cup \{9\} \cup \ldots \cup \{2\lfloor n/2 \rfloor + 3\}$ and $B = \{1\} \cup \{3\} \cup \{7\} \cup \{9\} \cup \ldots \cup \{2\lfloor n/2 \rfloor + 3\}$. Recall that $SC_{\ell,n}$ cannot have elements that are the union of more than $n$ disjoint sets in the form $[i, j]$ and both $A$ and $B$ are unions of $n$ disjoint sets. It is immediate that the atomic upper bound of $A$ and $B$ is not unique, since both $\{1\} \cup [3, 5] \cup \{7\} \cup \{9\} \cup \ldots \cup \{2\lfloor n/2 \rfloor + 3\}$ and $[1, 3] \cup \{5\} \cup \{7\} \cup \{9\} \cup \ldots \cup \{2\lfloor n/2 \rfloor + 3\}$ are atomic upper bounds. Hence they are not the least upper bound, which, by definition, must be unique.

From now on, we fix $n, \ell \in \mathbb{N}$. We now recall some definitions from Ref. [15], adapting them to the n-points crossover case.

Definition 4.1. Let $x \in \Sigma^\ell$, $A \in SC_{\ell,n}$ and $P \in \mathbf{P}$. We say that $A$ is represented in $P$ if $\exists y \in P$ such that $\forall a \in A$, $y_a = x_a$.

The concept of representation has been extended to populations:

Definition 4.2. Fix $x \in \Sigma^\ell$. We define $r_x : \mathbf{P} \to \mathcal{O}(SC_{\ell,n})$ as $r_x(P) = \{A \in SC_{\ell,n} \mid A \text{ is represented in } P\}$.

Proposition 4.1. For all $P \in \mathbf{P}$ and for all $x \in \Sigma^\ell$, $r_x(P)$ is a lower set of $SC_{\ell,n}$.

Proof. Let $A \in r_x(P)$. Hence there exists $y \in P$ such that $A$ is represented in $\{y\}$. It follows from Definition 4.1 that any $B \subseteq A$ is also represented in $\{y\}$ and, as a consequence, it is represented in $P$. Hence, $r_x(P)$ is a lower set in $SC_{\ell,n}$. □

We now define the notion of the alternating number of two elements of $SC_{\ell,n}$. The idea is that given $A, B \in SC_{\ell,n}$ we want to find an algorithm that, given any two strings $y, z \in \Sigma^\ell$ such that $A \in r_x(\{y\})$ and $B \in r_x(\{z\})$, generates a string $w \in \Sigma^\ell$ such that $A, B \in r_x(\{w\})$ by scanning the string left to right and choosing at every position $i \in [1, \ell]$ either $y_i$ or $z_i$. The alternating number is the minimum number of "switches" from copying one string to copying the other that such an algorithm must perform when $A$ and $B$ are fixed. Intuitively, if such a number does not exceed the number of available crossover points, then the string $w$ can be generated by one crossover operation starting from two strings, the first one having $A$ in its representation and the second one having $B$.

Definition 4.3. Let $x \in \Sigma^\ell$, and let $A, B \in SC_{\ell,n}$. The crossover language of $A$ and $B$, denoted by $\mathcal{L}_{A,B} \subseteq \Gamma^\ell$ for the alphabet $\Gamma = \{a, b\}$, is defined as:
(⋃ )
⋀ y ∈ rx {P} obtained from the n-points crossover of z, v such that
∀w ∈ {a, b}𝓁 w ∈ A,B ⟺ ∀i ∈ [1, 𝓁 ] (i ∈ A i ∉ B ⟹ wi = a)
A ∈ rx ({y}). Let w = ak1 bk2 … bkh ∈ B1 ,B2 be a word with B1 ,B2 with
⋀ ⋀
(i ∉ A i ∈ B ⟹ w i = b ) h ≤ n (this word exists by hypothesis). It is possible to see that by
( ) choosing as crossover points between z and v the positions k1 , k2 … , kh ,
The alternating number of A,B (denoted by A,B ) is the smallest m ∈ ℕ the obtained element y is such that A ∈ rx ({y}). □
such that there exists w ∈ A,B with where the symbols a and b alter-
nates m times. With the observation that ∀P ∈ P and ∀x ∈ Σ𝓁 , [1, 𝓁 ] ∈

We are now going to define a function remapping lower sets of SC𝓁,n } P ∈ P and
rx (P) iff x ∈ P we { can conclude that for all
for all x ∈ Σ𝓁 , min k ∈ ℕ ∣ [1, 𝓁 ] ∈ 𝜇nk,𝓁 (rx (P2 )) is equal to
to lower sets of SC𝓁,n . Intuitively, this function will transform the repre-
sentation of a population P in the representation of another population min {k ∈ ℕ ∣ {x} ⊆ Sk (P)}. Notice that the previous proposition also
that is the union of all the populations in the closure of P. implies that 𝜇 n,𝓁 also remaps lower sets to lower sets, a condition that
( ) ( ) was not proved when the function was defined.
Definition 4.4. We define 𝜇n,𝓁 ∶  SC𝓁 ,n →  SC𝓁 ,n as follows: for
( ) Remark 4.1. Note that since the function 𝜇 n,𝓁 is such that ∀A ∈
all U ∈  SC𝓁 ,n ( )
{ ( ) }  SC𝓁 ,n , 𝜇n,𝓁 (A) ⊇ A and the poset SC𝓁,n is finite, the dynamics
𝜇n,𝓁 (U ) = A ∈ SC𝓁 ,n ∣ ∃B1 , B2 ∈ U s.t. A = B1 ∪ B2 and B1 ,B2 ≤ n induced by the iteration of 𝜇 n,𝓁 always reaches a fixed point (i.e., an
equilibrium point). Trivially, there are no cyclic points.
The following is main result, since it allows us to compute the min-
imum number of iterations of the closure necessary to obtain a certain
4.1. The computational complexity of computing the distance between two
element of Σ𝓁 by iterating the function 𝜇n,𝓁 .
populations
Proposition 4.2. For all x ∈ Σ𝓁 the following diagram commutes:
The computational complexity of determining the distance between
two populations, using the proposed representation, is polynomial in
the size of the individuals and in the size of the populations, as we are
going to show. The presented bounds are not tight, but this is not neces-
sary for showing that the computation can be performed in polynomial
time.
Let P1 and P2 be two populations. To compute fP (P1 , P2 ) we obtain
the following time complexity bounds:
Proof. Fix x ∈ Σ𝓁 , P ∈ P, and A ∈ SC𝓁,n . The proof is divided into 1. For each element x in P2 , it is necessary to build the poset SC𝓁,n ,
two parts:
(⋃ ) which has size O(𝓁 2n ) (i.e., polynomial in the length of the indi-
1) A ∈ rx {P} ⇒ A ∈ 𝜇n,𝓁 (rx (P)). viduals but exponential in the number of crossover points - that
(⋃ ) we have assumed to be fixed). Hence, the time required for this
⋃ step is linear with respect to |P2 | and polynomial with respect
Let A be in rx {P} . Then there exists y ∈ {P} obtained by the
n-points crossover of two elements z, v ∈ P and such that A ∈ rx ({y}). to 𝓁.
Consider the word w ∈ {a, b}𝓁 defined as: 2. For each partial order SC𝓁,n with the associated element x ∈ P2 , it
{ is necessary to compute rx (P1 ), which can be performed by check-
a if yi was taken from z ing every individual in P2 with every element of SC𝓁,n . Hence, the
∀i ∈ [1, 𝓁 ] wi =
b if yi was taken from y number of steps necessary will be, for each x ∈ P2 , polynomial with
respect to the size of SC𝓁,n (and, hence, with respect to 𝓁), and |P1 |.
Since y has been obtained by n-points crossover, the word has at most 3. Finally, computing 𝜇 n,𝓁 is polynomial with respect to the size of
n alternations. Furthermore, there exists B ∈ rx ({z}) and C ∈ rx ({v}) SC𝓁,n , since it can be performed by checking all the pairs of ele-
such that A = B ∪ C and such that w ∈ B,C (this implies that A is in the ments in SC𝓁,n . Since SC𝓁,n is monotone, it cannot be iterated more
representation of {y} and that it has been obtained from the crossover than |SC𝓁,n | times before reaching a fixed point, thus still giving a
of z and v). Hence, A ∈ 𝜇n,𝓁 (rx (P)). polynomial time bound. In fact, by adapting a result in Ref. [15], it
(⋃ )
is possible to show that the number of iterations is at most logarith-
2) A ∈ 𝜇n,𝓁 (rx (P)) ⇒ A ∈ rx {P} .
mic with respect to 𝓁.
Let A ∈ 𝜇n,𝓁 (rx (P)). Then there exists B1 , B2 ∈ rx (P) such that B1 ∪
( ) This means that fp (P1 , P2 ) can be computed in polynomial time
B2 = A and B1 ,B2 ≤ n. By the definition of rx there exists z, v ∈ P with respect to the size 𝓁 of the individuals and the size of P1
such that B1 ∈ rx ({z}) and B2 ∈ rx ({v}). We claim that there exists and P2 . More precise bounds can be obtained by exactly specifying

Table 1
The average and the variance for a different number of crossover points on string of length varying from 6 to 9
bits. The average distance decreases and appears to stabilize to fixed values as the number of crossover points
increases. The different genome length are presented in decreasing number of bits, from 9 bits at the tops to
6 at the bottom of the table.
n = 1 n = 2 n = 3 n = 4 n = 5 n = 6 n = 7 n = 8
Ave 2.812 2.665 2.452 2.351 2.315 2.312 2.301 2.318
Var 0.141 0.127 0.147 0.156 0.155 0.153 0.148 0.152
Ave 2.377 2.228 2.021 1.954 1.939 1.936 1.936
Var 0.103 0.092 0.102 0.100 0.098 0.098 0.098
Ave 2.171 2.017 1.795 1.799 1.761 1.756
Var 0.122 0.109 0.106 0.115 0.096 0.101
Ave 1.898 1.749 1.582 1.614 1.588
Var 0.141 0.115 0.111 0.111 0.105

the data structures and representations used while implementing the algorithm.

Fig. 1. How the average and the variance change when the number of crossover points increases. While the former decreases, the latter remains stable.

Fig. 2. A comparison of the distance distribution (each fitted to a Gaussian using bins of size 0.05) from 1 to 5 crossover points for individuals of 6 bits.

Fig. 3. A comparison of the distance distribution (each fitted to a Gaussian using bins of size 0.05) from 1 to 6 crossover points for individuals of 7 bits.

Fig. 4. A comparison of the distance distribution (each fitted to a Gaussian using bins of size 0.05) from 1 to 7 crossover points for individuals of 8 bits.

Fig. 5. A comparison of the distance distribution (each fitted to a Gaussian using bins of size 0.05) from 1 to 8 crossover points for individuals of 9 bits.

5. Experimental results on the distance distribution

In this section, we perform a comparison of the distance distributions obtained for different numbers of crossover points on individuals with size ranging from 6 to 9 bits. This experimental exploration is necessary to check whether the proposed distance is significantly different for an increasing number of crossover points. Intuitively, a higher number of crossover points should increase the ability to produce new individuals, thus decreasing the average distance.

This experimental part is a complement to the theoretical part we have introduced so far. The definitions of a metric or, more specifically, of a directional distance and of a polynomial-time algorithm to compute it provide no direct information about the distribution of the distance values. Since there are no theoretical results yet that allow us to infer additional properties of this distance, we decided to perform an experimental exploration with the aim of (1) verifying that incrementing the number of crossover points results in lower distance values, and (2) studying the distribution of the distance values and how the distribution varies with a different number of crossover points.
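The intuition that more crossover points make more individuals reachable can be checked by brute force on a small case. The following sketch assumes that n-points crossover may use at most n cut points and that either parent can supply the first segment; the function name `n_point_offspring` and the two 8-bit parents are illustrative choices and are not taken from the formal definitions given earlier in the paper.

```python
from itertools import combinations


def n_point_offspring(a, b, n):
    """Return every string obtainable from parents a and b with at most n cuts.

    Assumption: cut positions are distinct, lie strictly inside the string,
    and the child alternates between the two parents at each cut.
    """
    length = len(a)
    children = {a, b}  # zero cuts: the parents themselves
    for k in range(1, min(n, length - 1) + 1):
        for cuts in combinations(range(1, length), k):
            bounds = (0,) + cuts + (length,)
            for parents in ((a, b), (b, a)):
                child = "".join(
                    parents[i % 2][bounds[i]:bounds[i + 1]]
                    for i in range(len(bounds) - 1)
                )
                children.add(child)
    return children


if __name__ == "__main__":
    a, b = "11100111", "00011000"
    optimum = "1" * len(a)
    for n in range(1, len(a)):
        kids = n_point_offspring(a, b, n)
        # number of crossover points, offspring count, optimum reachable?
        print(n, len(kids), optimum in kids)
```

With these two parents the all-ones string is absent from the offspring for n = 1 and present from n = 2 on, and under the at-most-n assumption the offspring set can only grow as n increases.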


Fig. 6. A comparison of the distance distribution between the 2-points and 1-point crossovers. From left to right and from top to bottom the lengths of the genomes are 6, 7, 8, and 9 bits.

Fig. 7. A comparison of the distance distribution between the 3-points and 1-point crossovers. From left to right and from top to bottom the lengths of the genomes are 6, 7, 8, and 9 bits.

Fig. 8. A comparison of the distance distribution between the 4-points and 1-point crossovers. From left to right and from top to bottom the lengths of the genomes are 6, 7, 8, and 9 bits.

5.1. Experimental design

One first obstacle in the experimental design is to determine how to compute a distance between two individuals when no population is given. Therefore, in order to calculate the (directional) crossover distance, for each of the (up to) 2^9 possible individuals we have used 100 small populations of 4 individuals (randomly generated). To these populations we have added the individual for which we want to estimate the distance to the optimum, i.e., to the population consisting only of the individual 1^𝓁, where 𝓁 ∈ {6, 7, 8, 9}. The distance we reported for each individual corresponds to the average of the 100 distance values calculated over these populations.


Fig. 9. A comparison of the distance distribution between the 5-points and 1-point crossovers. From left to right and from top to bottom the lengths of the genomes are 6, 7, 8, and 9 bits.

Fig. 10. A comparison of the distance distribution between the 6-points and 1-point crossovers. From left to right and from top to bottom the lengths of the genomes are 7, 8, and 9 bits.

The experimental procedure is the following one:

1. For each 𝓁 ∈ {6, 7, 8, 9}:
2. Let y = 1^𝓁;
3. For each possible number of crossover points n ∈ {1, 2, … , 𝓁 − 1}:
4. For each individual x ∈ {0, 1}^𝓁:
   (a) For each population Pi with i = 1, … , 100 consisting of 4 individuals z1 , z2 , z3 , z4 that are randomly generated with a uniform distribution, the directional distance fP between {x, z1 , z2 , z3 , z4 } and {y} is computed, producing the value dP (Pi ∪ {x}, {y});
   (b) the average of the dP (Pi ∪ {x}, {y}) values is taken as the distance of a population containing x.

The resulting distances are the ones used to compute the statistics discussed in the remainder of this section. To limit the bias associated with the use of random individuals, 100 computations of the distance were performed. Furthermore, to get complete knowledge of the possible distances, the whole solution space was explored. While this procedure produces accurate results (no individual was overlooked), it is impossible to apply it to larger chromosomes. New sampling techniques will be necessary to accurately explore the distance distribution in that case.
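The procedure can be sketched in a few lines for very small genomes. This is only an illustration under stated assumptions: `steps_to_reach` is a brute-force stand-in for the directional distance (it counts the rounds in which all pairs of the current population are recombined until y appears), whereas the formal definition and the polynomial-time algorithm are the ones given earlier in the paper; crossover is assumed to use at most n cut points; populations from which y cannot be generated are skipped when averaging, a choice that the text does not specify; and all function names and default parameters are hypothetical.

```python
import random
from itertools import combinations, product


def n_point_offspring(a, b, n):
    """Strings obtainable from a and b with at most n cut points (assumption)."""
    length = len(a)
    children = {a, b}
    for k in range(1, min(n, length - 1) + 1):
        for cuts in combinations(range(1, length), k):
            bounds = (0,) + cuts + (length,)
            for parents in ((a, b), (b, a)):
                children.add("".join(
                    parents[i % 2][bounds[i]:bounds[i + 1]]
                    for i in range(len(bounds) - 1)))
    return children


def steps_to_reach(population, target, n):
    """Rounds of all-pairs crossover needed before target appears.

    Returns None when a fixed point is reached without generating target.
    """
    current = set(population)
    steps = 0
    while target not in current:
        expanded = set(current)
        for a, b in combinations(current, 2):
            expanded |= n_point_offspring(a, b, n)
        if expanded == current:
            return None
        current = expanded
        steps += 1
    return steps


def average_distance(x, n, populations=100, size=4, rng=random):
    """Steps 4(a)-(b): average over random populations that also contain x."""
    length = len(x)
    y = "1" * length
    values = []
    for _ in range(populations):
        pop = {"".join(rng.choice("01") for _ in range(length))
               for _ in range(size)}
        d = steps_to_reach(pop | {x}, y, n)
        if d is not None:  # assumption: skip populations that cannot reach y
            values.append(d)
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    rng = random.Random(0)
    length = 4  # the study uses 6 to 9 bits; 4 keeps this sketch fast
    for n in range(1, length):
        per_individual = [
            average_distance("".join(bits), n, populations=20, rng=rng)
            for bits in product("01", repeat=length)
        ]
        finite = [d for d in per_individual if d is not None]
        print(n, round(sum(finite) / len(finite), 3))
```

Exhaustive enumeration of {0, 1}^𝓁 is what makes the approach infeasible for larger chromosomes: the outer loop alone grows as 2^𝓁, independently of how efficiently each distance is computed.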

Fig. 11. A comparison of the distance distribution between the 7-points and 1-point crossovers on the top (8 bits on the left and 9 bits on the right) and a comparison between the 8-points and 1-point crossovers for 9-bit genomes on the bottom.


5.2. Experimental results

The results are presented in Figs. 6–11. Each figure shows a comparison of the distance distribution for 1-point and a multiple-points crossover (from 2 crossover points in Fig. 6 to a maximum of 8 crossover points in Fig. 11). The average and variance of each distance distribution are summarized in Table 1 and Fig. 1.

As can be observed, the shape of the distribution is similar to a Gaussian distribution in all cases, with the obvious difference that the optimum always has distance 0 from itself. This can be observed in more detail in Figs. 2–5, where the fitting of the obtained results to a Gaussian distribution is reported.

The average decreases almost monotonically with the increase in the number of crossover points used, up to 𝓁 − 1 for a genome of length 𝓁 (which is the maximum number of crossover points possible for 𝓁-bit individuals); the only exceptions in Table 1 are small increases for some settings (e.g., from n = 3 to n = 4 for 6-bit genomes). In particular, the average appears to converge in all cases to a fixed value, where there is no possible improvement. While the improvements from 1 to 2 and from 2 to 3 crossover points are quite large for all genome sizes, successive increases in the number of crossover points do not produce improvements of a similar magnitude.

Therefore, we can observe that there are diminishing returns when increasing the number of crossover points. This is intuitively explainable in the following way: moving from 1 to 2 crossover points increases the possibilities to generate new individuals in less time for many cases (e.g., 11100111 and 00011000 can be used to generate the optimum in one step with 2-points crossover, while at least two steps are required for 1-point crossover). When the number of crossover points is sufficiently large, the use of additional crossover points provides a limited benefit when considering the number of generations necessary to reach the optimum. Thus, the impact on the average is negligible.

It is interesting to remark that n-points crossover can be considered as a “parallel version” of one-point crossover, in which n one-point crossover operations take place in parallel (as can be seen in Example 3.1). Hence, the study of the relations between different distances can be interesting to better understand the effects of this parallelization.

6. Further remarks and contributions

In this paper, a recent model for one-point crossover in GAs has been generalized to n-points crossover. We have shown that when the kind of crossover is fixed, the distance can be computed in polynomial time w.r.t. both population size and individual length; notice, however, that the degree of the polynomial depends on the number of crossover points. This result indicates that the structures used for modeling one-point crossover can be generalized to deal with n-points crossover. Hence, the proposed model is not limited to a specific case and the results on the polynomial complexity in time can be extended to more general kinds of crossover. In order to experimentally study the proposed distance, we have shown how the distance distribution changes with different numbers of crossover points.

Future work will involve a more in-depth study of this model and, in general, an investigation of the properties that a certain structure must have for modeling crossover in GAs. It will also be the focus of future studies to determine what is a good trade-off between minimizing the number of crossover points and minimizing the average of the distance value; it would be interesting to observe whether there is a correlation between these variations on the average of the distance and the performance of a GA on synthetic or real-world problems. Finally, a general way of extending this model to other evolutionary algorithms should be devised. Other algorithms do not necessarily employ a fixed-size string representation; therefore, the extension to evolutionary algorithms like tree-based genetic programming, Cartesian genetic programming, or simply GAs with a different kind of genotype will necessitate a generalization of the ideas proposed here.

References

[1] C. Reeves, J. Rowe, Genetic Algorithms: Principles and Perspectives: a Guide to GA Theory, Springer, 2002.
[2] M. Vose, The Simple Genetic Algorithm: Foundations and Theory, MIT Press, Cambridge, MA, USA, 1998.
[3] M. Vose, Course notes: genetic algorithm theory, in: Genetic and Evolutionary Computation Conference - GECCO 2010 (Companion), ACM, 2010, pp. 2647–2660.
[4] Y.-H. Kim, B.-R. Moon, New topologies for genetic search space, in: Conference on Genetic and Evolutionary Computation, GECCO 2005, ACM, New York, NY, USA, 2005, pp. 1393–1399, https://doi.org/10.1145/1068009.1068232.
[5] A. Moraglio, R. Poli, Topological interpretation of crossover, in: Proceedings of the Genetic and Evolutionary Computation Conference, Vol. 3102 of Lecture Notes in Computer Science, Springer, 2004, pp. 1377–1388.
[6] A. Moraglio, One-point geometric crossover, in: Parallel Problem Solving from Nature - PPSN XI, 11th International Conference, Kraków, Poland, September 11-15, 2010, Proceedings, Part I, Vol. 6238 of Lecture Notes in Computer Science, Springer, 2010, pp. 83–93.
[7] A. Moraglio, Geometry of evolutionary algorithms, in: Conference on Genetic and Evolutionary Computation - GECCO 2011 (Companion), ACM, 2011, pp. 1439–1468.
[8] J. McDermott, U.-M. O’Reilly, L. Vanneschi, K. Veeramachaneni, How far is it from here to there? A distance that is coherent with GP operators, in: European Conference on Genetic Programming - EuroGP, Vol. 6621 of Lecture Notes in Computer Science, Springer, 2011, pp. 190–202.
[9] T. Jones, S. Forrest, Fitness distance correlation as a measure of problem difficulty for genetic algorithms, in: International Conference on Genetic Algorithms - ICGA, Morgan Kaufmann, 1995, pp. 184–192.
[10] L. Vanneschi, Theory and Practice for Efficient Genetic Programming, Ph.D. Thesis, Faculty of Sciences, University of Lausanne, Switzerland, 2004.
[11] M. Tomassini, L. Vanneschi, P. Collard, M. Clergue, A study of fitness distance correlation as a difficulty measure in genetic programming, Evol. Comput. 13 (2) (2005) 213–239.
[12] P. Stadler, G. Wagner, Algebraic theory of recombination spaces, Evol. Comput. 5 (3) (1997) 241–275.
[13] B. Stadler, P. Stadler, M. Shpak, G. Wagner, Recombination spaces, metrics, and pretopologies, Z. Phys. Chem. 216 (2002) 217–234.
[14] Y. Yoon, Y.-H. Kim, A. Moraglio, B.-R. Moon, Quotient geometric crossovers and redundant encodings, Theor. Comput. Sci. 425 (2015) 4–16.
[15] L. Manzoni, L. Vanneschi, G. Mauri, A distance between populations for one-point crossover in genetic algorithms, Theor. Comput. Sci. 429 (2012) 213–222.
[16] G. Birkhoff, Lattice Theory, American Mathematical Society, 1967.
[17] E. Čech, Topological Spaces, Wiley Interscience Publisher, London, 1966.
[18] J. Munkres, Topology, second ed., Prentice Hall, 1999.
[19] H. Poincaré, Le Continu Mathématique, Revue de Métaphysique et de Morale I, 1893, pp. 26–34 (Reprinted in [20] as Chapter II).
[20] H. Poincaré, La Science et l’hypothèse, Flammarion, Paris, 1903 (English translation as Science and Hypothesis, Dover, New York, 1952).
