Intro to Evolutionary Computation Using DEAP
Mohammed E.Amer
Nov 19 · 9 min read

Introduction
Evolutionary computation is a powerful, general-purpose optimization technique that
draws its main inspiration from the theory of evolution by natural selection.
Evolution by natural selection is an elegant theory that explains the biodiversity
in nature through two main components:
Random mutations
Selection pressure
Different ecological habitats pose different challenges and requirements for
survival. According to evolutionary theory, in any ecological niche the organisms’
traits will vary due to random mutations in DNA and copying mistakes during its
duplication. Because of this variation in traits, organisms whose traits are better
suited to the environment gain a survival advantage, i.e. nature implicitly applies
a pressure that selects for fit individuals. Since the fittest organisms are more
likely to survive, they pass their ‘fit’ genes on to their offspring, which in turn
are more likely to survive.
Evolution can be thought of as an algorithm optimizing for fitness. This is the
core idea of evolutionary optimization. In other words, if we have a problem for
which we can generate different solutions, then we can use the performance of each
solution as a measure of fitness that drives an evolutionary algorithm to find
better and better solutions. Evolutionary algorithms come in different flavors that
share most of their components but differ in the details and characteristics of
each component. The main components of an evolutionary algorithm are:
Representation scheme (e.g. genotype, phenotype, etc.)
Mating operators (e.g. crossover)
Mutating operators (e.g. bit flipping)
Fitness metric
Selection strategy (e.g. tournament selection)
Evolution strategy (e.g. (mu, lambda))
You can find a more detailed introduction to the different variations of
evolutionary algorithms in my previous article. For the purpose of this tutorial, I
will focus on a variation called evolutionary strategy (ES), which I will briefly
introduce next. The full Jupyter notebook is available here.
Evolutionary Strategy (ES)
As I mentioned, all evolutionary algorithms share most of the aforementioned
components while differing in their details. For ES, the representation scheme is
mainly a phenotype, i.e. the individuals (or solutions) are represented explicitly
as vectors of numbers. Each individual has an accompanying vector, called a
strategy, which controls its mutation. Different mating operators are used in ES,
but the one we will use is blending, which is essentially a linear combination of
the mated parents. The mutation operator we will use is log-normal, which, like all
mutation operators in ES, depends on the strategy vector mentioned above for
mutating the different values in the individual’s representation vector. The
selection strategy will be tournament selection, in which multiple random subsets
of individuals are drawn and the best individual of each subset is selected. The
fitness function is the only part of any evolutionary algorithm that must be
delegated to the user to define, i.e. the user provides a function that assigns a
fitness to each individual in the population based on a measure suitable for the
problem at hand. The evolutionary strategy controls the population size, and here I
will use (mu, lambda), pronounced ‘mu comma lambda’, where mu and lambda are
positive integers: mu refers to the size of the parents’ population while lambda
refers to the number of generated offspring. In this strategy, the selection
strategy (i.e. tournament selection in this example) is applied to the offspring
only.
ES, like all evolutionary algorithms, proceeds in iterations called generations. In
each generation, offspring are generated from the current parents in the population
by mating and then mutating. The fitness of the new members is then evaluated, and
the selection strategy is applied to select the individuals that will survive into
the next generation.
DEAP
DEAP is a Python framework for implementing evolutionary algorithms. It provides an
organized, simplified way of coordinating the different components necessary for
any evolutionary algorithm. For each component, DEAP provides most of the common
variations as predefined components, while offering enough flexibility to define
your own variations in case the usual components are not enough for your problem.
Let’s first define a dummy problem that we can optimize using ES. In many
situations you will have some natural process, which we will call the data-
generating process, that you want to model so that you can make predictions later.
The data-generating process is like a black box: you can never look inside, but you
can give it an input and it will respond with an output,
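A minimal sketch of such a hidden process is shown below; the cubic coefficients are made up for illustration and stand in for the real, unknown ones.

def data_generating_process(x):
    # Hypothetical hidden cubic polynomial; its coefficients are never
    # visible to the evolutionary algorithm.
    return 1.5 * x ** 3 - 0.7 * x ** 2 + 2.0 * x + 4.0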

As you can see above, this data-generating process takes an input and produces an
output based on some cubic polynomial. Notice that we never have access to this
formula; it is shown here only for illustration purposes. Our mission is to use ES
to model this unknown formula. We only have access to ‘querying’ this black box and
obtaining responses, i.e. we can sample it,
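A sketch of such a sampling routine, assuming the hypothetical data_generating_process above, a made-up input range and Gaussian observation noise:

import numpy as np

def sample(n_samples=100):
    # Query the black box at random inputs and add Gaussian noise to the responses.
    xs = np.random.uniform(-5.0, 5.0, n_samples)
    ys = np.array([data_generating_process(x) for x in xs]) \
         + np.random.normal(0.0, 1.0, n_samples)
    return list(zip(xs, ys))

train_set = sample()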

Note that an additive random term is added. This term represents noise, which is an
unavoidable component of any observations and arises from many sources.
To model a process using an evolutionary algorithm, we need a candidate model,
basically an assumption or ‘inductive bias’ about what kind of model we are
searching for. This may come from other analytical sources or previous data
analyses. For our dummy case, we will assume that we know the model we are
searching for is a polynomial of at most the fourth degree, so we decide to use a
fourth degree polynomial,
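One way to write the assumed model, with a1 through a5 denoting the unknown coefficients (the ordering matches the ‘pred’ function discussed later):

model(x) = a1*x + a2*x^2 + a3*x^3 + a4*x^4 + a5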

The set of ‘a’ coefficients is unknown, and we need the evolutionary algorithm to
optimize them so that the model output fits our data. This will be reflected in our
definition of the evaluation function discussed below. But first, we need to
prepare our ES algorithm.
Let’s import the basic subpackages we will need for our implementation,
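One possible set of imports for an ES built with DEAP; array and numpy are included here because the sketches that follow use them for the individual representation and the statistics:

import array
import random
import numpy as np
from deap import base, creator, tools, algorithms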

To implement ES using DEAP, we first need to define our individual, strategy and
fitness as datatypes. This can be done in DEAP without the need to define your own
classes explicitly,
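A sketch of these three definitions, following DEAP’s standard ES setup; the array.array base class with typecode ‘d’ is one common choice, not the only one:

creator.create("FitnessMin", base.Fitness, weights=(-1.0,))
creator.create("Individual", array.array, typecode="d",
               fitness=creator.FitnessMin, strategy=None)
creator.create("Strategy", array.array, typecode="d")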

The first line defines our fitness datatype. The first argument gives the datatype
name, the second gives the base class (the base class for fitness measures provided
by DEAP), and the third argument sets the fitness weight to a negative value, which
means that the evolutionary optimization will try to minimize this value. It may
seem odd that we are trying to minimize the fitness, but as you will see, the
fitness function we will use is actually an error function, i.e. it measures how
much our solution deviates from the observations, and hence we need to minimize it.
Notice that after you run this line, a datatype FitnessMin is added dynamically to
the creator subpackage and you can access it directly as ‘creator.FitnessMin’.
The second line is similar, but it defines the individual. Starting from the fourth
argument, any arguments you provide will be added as fields or attributes to the
defined datatype. For the individual, we define one attribute required by DEAP,
which is fitness. DEAP will look for this attribute when it wants to update or read
the fitness value. You can see that we pass our defined FitnessMin to it. The
strategy attribute is needed by the ES algorithm, since it depends on a strategy
vector for doing the mutations. We initialize it to ‘None’, as we will be
populating it ourselves later. You can pass any other attributes you need to the
create function and they will be added to your datatype. The third line defines the
strategy datatype.
Now we need to register some functions that make it easy to use our defined
datatypes to generate individuals and aggregate them into a population. DEAP
provides a special utility, the ‘toolbox’, for this purpose,
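A sketch of this step; the name generateES, the IND_SIZE constant and the initialization ranges are illustrative choices, not fixed by DEAP:

IND_SIZE = 5  # one slot per polynomial coefficient

def generateES(icls, scls, size):
    # Randomly initialize an individual and attach a randomly
    # initialized strategy vector of the same length.
    ind = icls(random.uniform(-1.0, 1.0) for _ in range(size))
    ind.strategy = scls(random.uniform(0.5, 3.0) for _ in range(size))
    return ind

toolbox = base.Toolbox()
toolbox.register("individual", generateES,
                 creator.Individual, creator.Strategy, IND_SIZE)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)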

First, we define a function that takes our individual and strategy classes and the
size of an individual, i.e. how many parameters are contained in the individual
vector, and generates a randomly initialized individual populated with a randomly
initialized strategy vector. We then initialize our toolbox and use its register
function twice. The first usage registers a function with the name ‘individual’.
You can see that the second argument is our actual function implementation, and the
rest of the arguments are the arguments that will be passed to it by default when
it is called.
The second usage defines another important function called ‘population’, which we
will use for population generation. We pass a DEAP predefined function, initRepeat.
The arguments after initRepeat are, again, the arguments which will be passed to it
by default when it is called. When we call the function ‘population’ from our
toolbox, initRepeat will be called; it will initialize a set of individuals using
its second argument, which is just our predefined individual-generating function,
put them into a container of the type given by its first argument (‘list’), and
return it.
We now need to register the evaluation function that DEAP will use to assign
fitness to the different individuals. Here is where our assumed polynomial model
comes into play,
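A sketch of the two functions described below, assuming the train_set sampled earlier; note that DEAP expects the evaluation function to return a tuple:

def pred(ind, x):
    # Parameters 1-4 are the coefficients of x, x^2, x^3, x^4;
    # the fifth parameter is the bias/intercept.
    y_ = 0.0
    for i in range(1, 5):
        y_ += ind[i - 1] * x ** i
    y_ += ind[4]
    return y_

def fitness(ind, data):
    # Mean squared error between the observations and the model's predictions.
    mse = 0.0
    for x, y in data:
        mse += (pred(ind, x) - y) ** 2
    return (mse / len(data),)  # DEAP fitness values must be tuples

toolbox.register("evaluate", fitness, data=train_set)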

The ‘pred’ function takes an individual and a data point and returns the model
output by calculating a 4th degree polynomial. You can see that the individual’s
parameters 1–4 are used as the coefficients of x with different exponents, and the
fifth is used as the bias/intercept, i.e. the term without x.
The ‘fitness’ function calculates the mean squared error (MSE) between the actual
output and our model’s output (which is calculated using the previous function
‘pred’). More fit individuals will have a smaller MSE, and this is the reason our
fitness defined at the beginning was assigned a negative weight: we are interested
in making the MSE smaller.
Finally, we register our fitness function in our toolbox using the special keyword
‘evaluate’, so that DEAP can find it when calculating fitness.
We need to register a couple more functions necessary for the operation, which are
the mutation, crossover and selection operators. We can do so by registering a set
of functions to our toolbox with specific names that DEAP will look for when
performing these operations,
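A sketch using DEAP’s ES-specific blend crossover and log-normal mutation, with illustrative parameter values:

toolbox.register("mate", tools.cxESBlend, alpha=0.1)
toolbox.register("mutate", tools.mutESLogNormal, c=1.0, indpb=0.1)
toolbox.register("select", tools.selTournament, tournsize=3)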

The mate, mutate and select functions are the crossover, mutation and selection
operators, respectively. For all of them, we use predefined DEAP components and
pass their required arguments.
So far, these are most of the necessary components for the operation of ES. A very
convenient optional tool provided by DEAP is the ‘Statistics’ tool, which we can
configure to obtain some stats about the evolutionary algorithm after each
generation,
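A sketch of the Statistics setup; the lambda extracts the fitness values from each individual and the registered names are arbitrary labels:

stats = tools.Statistics(lambda ind: ind.fitness.values)
stats.register("avg", np.mean)
stats.register("std", np.std)
stats.register("min", np.min)
stats.register("max", np.max)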

When we initialize our statistics, we provide a function definition in the
constructor, which will receive an individual when called by DEAP and must return
the attribute of the individual to which we wish to apply the statistical
operators. The registered function names will be used as the labels for the
corresponding stats, e.g. {‘avg’: <the np.mean function output>}.
Another convenient tool is the ‘Hall of Fame’, which we can configure so that DEAP
populates it with the best k individuals in each generation,
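A sketch, keeping only the single best individual:

hof = tools.HallOfFame(1)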

The ‘1’ means it will be populated by only the best individual.


Now we are ready to fire the ES and let it do the magic for us,
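A sketch of this step, assuming mu and lambda values of 10 and 100 and illustrative crossover/mutation probabilities (their sum must not exceed 1):

MU, LAMBDA = 10, 100

pop = toolbox.population(n=MU)
pop, logbook = algorithms.eaMuCommaLambda(
    pop, toolbox, mu=MU, lambda_=LAMBDA,
    cxpb=0.6, mutpb=0.3, ngen=1,
    stats=stats, halloffame=hof, verbose=True)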

The first line just initializes a new population, which is given our mu value as
its size since we are going to use the (mu, lambda), or MuCommaLambda, algorithm
described earlier. The second line runs ES for a single generation (ngen=1) using
the mentioned algorithm, passing the second needed parameter, lambda. You can see
that we also passed our defined toolbox, which contains all our configured
functions and operators, along with our stats and hall of fame. Below is a
visualization of a run of 100 generations.

References
https://deap.readthedocs.io/en/master/
Bäck, T., Fogel, D. B., & Michalewicz, Z. (Eds.). (2018). Evolutionary computation
1: Basic algorithms and operators. CRC Press.