
Genetic algorithms

Pablo Vilallave, Arian Zaghi-Haghighi


Jaume I University - Intelligent Systems (EI1028)
al386121@uji.es
al386160@uji.es

Abstract— When large amounts of data are stored within some structure, the biggest problem we have to deal with is not only how to retrieve all this information but also how to find high-quality solutions to optimization and search problems. This document clarifies how genetic algorithms help with this type of problem.

Keywords—Genetic algorithms, Evolutionary Algorithms, Mutation, Survival of the fittest

I. INTRODUCTION
In computer science, Evolutionary Computation (EC) is a family of algorithms for global optimization inspired by biological evolution. These algorithms are often referred to as Evolutionary Algorithms (EA). They work by generating a population of possible solutions and introducing small changes, allowing only the fittest individuals to generate offspring, therefore simulating the natural process of evolution. These kinds of algorithms are frequently used to find optimal or near-optimal solutions to difficult problems which otherwise would take a lifetime to solve. [1]

Fig. 1 Diagram of a basic evolutionary algorithm.

II. GENETIC ALGORITHMS
Genetic Algorithms (GA) are the most popular type of Evolutionary Algorithms and, as we stated earlier, they work by emulating natural selection and the process of evolution.

In a GA, an initial set of candidate solutions, also called the pool or population, is generated and iteratively updated. In each iteration, or generation, every individual in the population is assigned a fitness value based on its objective function value, and the fitter individuals are given a higher chance to mate and generate fitter offspring through recombination and mutation, just as occurs in natural genetics. This process is repeated over and over. As a result, in each new generation the less desired solutions are removed, small random changes are introduced, and the population gradually evolves to increase in fitness until a stopping criterion is reached.

Genetic Algorithms have various advantages as well as limitations:

Disadvantages:
● They are not suited for all kinds of problems, especially the ones which are simple.
● The fitness value is calculated repeatedly, which might be computationally expensive for some problems.
● There are no guarantees on the optimality or the quality of the solution.
● If it is not implemented properly, the genetic algorithm may not converge to the optimal solution.

Advantages:
● Does not require any derivative information
● Can be faster and more efficient than traditional methods on large search spaces
● Provides a list of good solutions and not just a single one
● Always gets an answer to the problem, which gets better over time.
● Very useful when there are a large number of parameters and options involved

III. GENETIC ALGORITHMS TERMINOLOGY
Before we start explaining how GAs work, we are going to introduce some basic terminology which will be useful to understand the process:

● Population: A subset of the possible solutions to the given problem.
● Chromosome: One of these possible solutions.
● Gene: One element position of a chromosome.
● Allele: The value a gene contains.

Fig. 2 Population components

● Genotype space: The population in the computation space. In this space, solutions are represented in a way which can be easily understood by a computing system.
● Phenotype space: The population represented in the actual real world.
● Decoding and encoding: Decoding is the process of transforming a solution from the genotype to the phenotype, while encoding is the opposite process.
● Fitness function: Simply defined, the fitness function takes a candidate solution to the problem as input and produces as output how “fit” or how “good” the solution is with respect to the problem in consideration.
● Genetic operators: These alter the genetic composition of the offspring. [2]

Based on EA, the basic structure of a GA is as follows: first, the population is initialized (generated at random); then the population is evaluated using the fitness function. Next comes a loop that runs until a termination criterion is reached, consisting of parent selection, recombination and mutation, and finally evaluation and survivor selection. We will now explain each step separately, in order to understand the different ways we can implement a GA and which one is better in each situation:

Fig. 3 GA structure

A. Solution representation
Deciding the type of representation we will use for our solution is one of the most important decisions, since a poor choice can lead to poor performance. We will now explain some of the main types:
● Binary representation: It is one of the simplest and most used representations. It consists of bit strings. However, we have to consider that in this kind of encoding different bits have different significance, which can have undesired consequences when performing crossover and mutation operations.
● Real numbers: If we want to define genes using continuous variables instead of discrete variables, a real-valued representation is the most natural choice.
● Permutation representation: In some cases the solution needs to be represented by an order of elements, for example in the Travelling Salesman Problem (TSP), which we
will discuss later. In such cases, it is convenient to use this type of representation.

B. Initialization
Besides choosing a representation, we also need to think about how the population is going to be initialized. There are two primary methods:
● Random initialization: The initial population starts with completely random solutions.
● Heuristic initialization: The initial population starts using a known heuristic for the problem.

Even when a heuristic is available, it has been observed that the random solutions are the ones which drive the population to optimality.

C. Fitness function
As we said earlier, the fitness function takes a candidate solution and produces an output that measures how good the solution is. It is evaluated repeatedly in a GA, so it should be sufficiently fast. To work well, a fitness function should possess the following characteristics:

● The fitness function should be sufficiently fast to compute.
● It must quantitatively measure how fit a given solution is or how fit individuals can be produced from the given solution.

D. Parent selection
Parent selection consists of selecting the parents which mate and recombine in order to create offspring for the next generation. This process is crucial to the convergence rate of the GA. However, we should watch for premature convergence, which happens when one extremely fit solution takes over the entire population. That is why maintaining good diversity in the population is extremely important for the success of a GA. There are different ways of selecting parents, some of them better than others; let’s take a look at them:

● Fitness proportionate selection: It is one of the most popular ways of parent selection. The probability of each individual becoming a parent is proportional to its fitness. Therefore, fitter individuals have higher chances of mating and passing their features to the next generation, evolving better individuals over time. We can do this by a process called roulette wheel selection, where a wheel is divided into n slices. Each slice represents one individual, and it gets a portion of the circle proportional to its fitness.
● Tournament Selection: In this parent selection technique, X individuals are selected from the population at random and the best of them is selected to become a parent. This process is repeated to select the next parents.
● Rank Selection: This is the most used method when the individuals in the population have very close fitness values, which usually happens at the end of the GA run. With fitness proportionate selection, close fitness values lead to each individual having an almost equal chance of being selected as a parent, as we can see in Fig. 4. Therefore, we remove the concept of a fitness value while selecting a parent; instead, each individual is ranked according to its fitness. The selection of the parents depends on the rank of each individual and not on its fitness value. Higher ranked individuals are preferred over lower ranked ones.

Fig. 4 Rank selection

● Random Selection: This is the last way of parent selection and it is done completely at
random. This strategy is usually avoided as it has no preference for fitter individuals.

E. Crossover
In Genetic Algorithms, crossover, also known as recombination, is a genetic operator used to combine the genetic information of two individuals, or possible solutions, to generate new individuals. It is analogous to what happens during reproduction in biology, and it is applied with a high probability. There are different crossover operators that can be used, and a specific one can also be implemented to fit a particular problem. Some of the most used operators are the following:

● Single-Point Crossover: In this technique, a crossover point is picked at random and the bits to the right of that point are swapped between the two parent chromosomes.

Fig. 5 Single-point crossover

● Multi-Point Crossover: In this case, two crossover points are picked randomly from the parent chromosomes and the bits between the two points are swapped between the parent organisms.

Fig. 6 K-point crossover

● Uniform Crossover: Consists of choosing each bit from either parent with equal probability. In this case, two offspring are generated and in some cases one of the children can have more genetic material from one parent than the other. [3]

F. Mutation
Mutations are used in Genetic Algorithms to maintain genetic diversity from one generation of a population to the next. A mutation consists of small random changes in the chromosome in order to get a new solution. The probability of mutation has to be low to avoid high randomness in the Genetic Algorithm.

There are different types of mutations that can be applied in a GA, but the most commonly used are the following:

● Bit Flip Mutation: One or more random bits are selected and flipped.
● Random Resetting: A random value is assigned to a randomly chosen gene.
● Swap Mutation: Two positions on the chromosome are selected at random and they swap their values.
● Scramble Mutation: In this case, a random subset of genes is chosen and their values are shuffled at random.
● Inversion Mutation: A subset of genes is chosen and, instead of being shuffled, its order is inverted. [4]

G. Survivor selection
Survivor selection determines which individuals are to be kicked out and which are to be kept in the next generation. This step is crucial, as it should make sure that the fittest individuals are not kicked out of the population while keeping diversity at the same time. Some GAs use a technique called Elitism, which makes sure that the fittest individual is always propagated to the next generation. In addition, to ensure the process of
Survivor Selection is done properly, the following strategies are often used:

● Age Based Selection: It has no notion of fitness. It is based on the premise that each individual is allowed in the population for a finite number of generations, during which it is allowed to reproduce; after that it is kicked out no matter how good its fitness is. This ensures that the oldest members of the population are always kicked out, while the age of the remaining members is increased by 1.
● Fitness Based Selection: In fitness based selection, the children tend to replace the least fit individuals in the population. The selection of the least fit individuals may be done with one of the different parent selection techniques.

H. Termination condition
The termination condition of a Genetic Algorithm determines when the GA run ends. At the start of a run the GA progresses very fast and there are better solutions every iteration, but in the later stages of the run the improvements in every iteration are very small, so we need a termination condition such that our solution is close to the optimal at the end of the run. These are some of the most used termination conditions:

● When there has been little to no improvement in the population for X iterations
● When an absolute number of generations has been reached
● When the objective function value has reached a certain predefined value.

The termination condition is determined by the kind of problem that we want to solve, so the GA designer should try different combinations to see which one suits the problem best. [5]

IV. GENETIC ALGORITHMS PRACTICAL CASES
Once we have established the structure and functionality of genetic algorithms, we are going to take a look at some practical cases, in order to see how to build an entire GA and how it works. The software we used is the following:

● Jupyter Notebook as Development Environment
● Python 2.7 as Programming Language.
● Pyevolve, a complete genetic algorithm framework written in Python

Pyevolve, as we said before, is a framework written in Python that implements the most common features used in genetic algorithms, like roulette wheel and tournament selectors, crossover methods, a variety of initializators... It is fast and easy to use, and comes with default operators and settings. First, we will solve the travelling salesperson problem (TSP), and then we will see how genetic algorithms can be applied to robots.

V. THE TRAVELLING SALESPERSON PROBLEM
The TSP, based on a set of cities and the distance between every pair of them, consists of finding the shortest possible route that visits every city exactly once and returns to the starting point. It is classified as an NP-hard problem, which means that an algorithm that solves it can be translated, in polynomial time, into one for solving any problem in NP.

First we will start with a file which contains the 48 US state capitals and their coordinates. We read this file and create a node list, which contains the ‘x’ and ‘y’ coordinates of each capital. Then we need to instantiate a genome; we can do this easily by using the G1DList class implemented in Pyevolve, giving it the length of the node list as an argument. Once we have the genome, we set the evaluation function, the Edge Recombination crossover (which is well suited to this type of problem), and the List TSP initialization of the genome. Finally, we must create our Genetic Algorithm engine. The one that we will be using is the GA Engine class, giving the genome created before as argument.
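As a rough illustration of the setup just described, the following standalone Python 3 sketch shows what a permutation genome, a tour-length evaluation function and a random permutation initializer might look like. It does not use Pyevolve; the names `tour_length` and `random_tour` and the four-node list are hypothetical, chosen only for this example.

```python
import math
import random


def tour_length(tour, nodes):
    """Raw score: total Euclidean length of the closed tour (lower is better)."""
    total = 0.0
    for i in range(len(tour)):
        x1, y1 = nodes[tour[i]]
        x2, y2 = nodes[tour[(i + 1) % len(tour)]]  # wrap around to the start
        total += math.hypot(x2 - x1, y2 - y1)
    return total


def random_tour(n, rng=random):
    """Permutation initialization: every city index appears exactly once."""
    tour = list(range(n))
    rng.shuffle(tour)
    return tour


# Hypothetical node list (not the 48 capitals): corners of a unit square.
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# A small initial population of genomes and their raw scores.
population = [random_tour(len(nodes)) for _ in range(6)]
scores = [tour_length(t, nodes) for t in population]
```

Because each genome is a permutation of city indices, any crossover or mutation applied later has to preserve that property, which is why permutation-aware operators are used for this kind of problem.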
To run it, we execute all the generations until the termination criterion is met by using the ‘evolve’ function; once it finishes, we save the best individual of the population by using the ‘bestIndividual’ function.
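The generational loop behind such a run can be sketched in plain Python 3 as follows. This is not Pyevolve's implementation: order crossover is swapped in for Edge Recombination because it is shorter to write, and tournament selection, swap mutation, single-individual elitism and every name and parameter value below are assumptions made for the example.

```python
import math
import random


def order_crossover(p1, p2, rng):
    """Order crossover (OX): copy a slice from p1, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(child[a:b + 1])
    fill = [g for g in p2 if g not in kept]
    for i in range(n):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child


def swap_mutation(tour, rate, rng):
    """With probability `rate`, swap two randomly chosen positions."""
    tour = tour[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour


def tournament(population, cost, k, rng):
    """Pick k individuals at random and return the cheapest one."""
    return min(rng.sample(population, k), key=cost)


def evolve(cost, n_genes, pop_size=40, generations=100,
           mutation_rate=0.2, k=3, seed=0):
    """Minimize `cost` over permutations of range(n_genes)."""
    rng = random.Random(seed)
    population = [rng.sample(range(n_genes), n_genes) for _ in range(pop_size)]
    best = min(population, key=cost)
    for _ in range(generations):  # termination: fixed number of generations
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1 = tournament(population, cost, k, rng)
            p2 = tournament(population, cost, k, rng)
            child = order_crossover(p1, p2, rng)
            children.append(swap_mutation(child, mutation_rate, rng))
        population = children
        best = min(population, key=cost)
    return best


# Hypothetical data: six cities on a regular hexagon.
cities = [(math.cos(math.pi * i / 3), math.sin(math.pi * i / 3)) for i in range(6)]


def cost(tour):
    """Length of the closed tour through `cities`."""
    return sum(
        math.hypot(cities[tour[i]][0] - cities[tour[(i + 1) % len(tour)]][0],
                   cities[tour[i]][1] - cities[tour[(i + 1) % len(tour)]][1])
        for i in range(len(tour))
    )


best_tour = evolve(cost, len(cities), generations=50)
```

With elitism, the cost of the best individual can never get worse from one generation to the next, which matches the behaviour expected from the best raw score over a run.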

Now we can see the results and plot them in a few different graphs. It is important to understand the difference between raw score and fitness score, as we will represent them in different graphs:

● Raw score: The score returned by the evaluation function; it is not scaled.
● Fitness score: The scaled raw score.
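The text does not state which scaling turns a raw score into a fitness score. As one common possibility, the sketch below applies linear scaling for a maximization problem: it keeps the population average unchanged and maps the best raw score to `c` times that average. The function name, the default `c` and the sample scores are assumptions for illustration only.

```python
def linear_scaling(raw_scores, c=1.2):
    """Scale raw scores as f' = a * f + b (maximization).

    The average is preserved and the maximum becomes c * average.
    Negative results are clamped to zero.
    """
    avg = sum(raw_scores) / len(raw_scores)
    best = max(raw_scores)
    if best == avg:  # all scores equal: nothing to scale
        return list(raw_scores)
    a = (c - 1.0) * avg / (best - avg)
    b = avg * (best - c * avg) / (best - avg)
    return [max(0.0, a * r + b) for r in raw_scores]


# Hypothetical raw scores; the best one (6) is pulled down to 1.5 * average.
fitness = linear_scaling([1.0, 2.0, 3.0, 6.0], c=1.5)
```

Scaling of this kind limits how much a single very good individual can dominate fitness proportionate selection, which helps against premature convergence.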

Thanks to Pyevolve we can plot the error bars and the maximum/minimum scores using some of its plot functions, as we can see in the next figures.

Fig. 7 Top, raw score error bars. Left, fitness score error bars

Fig. 8 Top, raw score maxmin. Left, fitness score maxmin

As for the results, the first image shows a tangle of connections between different states, because it is not an optimized solution but the first path generated. In the second one, a shorter path has been drawn.
VI. GENETIC ALGORITHMS APPLIED IN ROBOTICS
Before we start with this experiment, we need some software to generate a 3D environment in which to see how the robot reacts according to our program. The software we use is the Unreal Development Kit. To generate the map and spawn the robot in it, we execute two scripts, ‘launchmap’ and ‘robotspawn’.

Now we can start to test our robot and see how it starts to crawl. While it learns, we store all the results in a database to analyze them later, as we did in the previous case.
As we could not reach 50 generations, we only performed the test with 10 and 20 of them. However, that was enough to see the improvement, as the next graph shows.

Fig. 9 Top, raw score maxmin. Left, fitness score maxmin

VII. CONCLUSIONS
In conclusion, we have managed to understand how genetic algorithms work and how they can be tuned to achieve more accurate results. They can be one of the best tools for developing smart artificial intelligence and for analyzing the evolution of some problems.
Depending on the population size, the mutation rate and other variables, we can simulate different types of evolution. Using Pyevolve has helped us a lot to achieve our goal.
However, applying genetic algorithms is not always successful, and they are not always a feasible solution. Also, sometimes they are not very intuitive if you do not have enough experience.

VIII. REFERENCES
[1] https://en.wikipedia.org/wiki/Evolutionary_computation
[2] https://towardsdatascience.com/introduction-to-genetic-algorithms-including-example-code-e396e98d8bf3
[3] https://en.wikipedia.org/wiki/Crossover_(genetic_algorithm)
[4] https://en.wikipedia.org/wiki/Mutation_(genetic_algorithm)
[5] https://en.wikipedia.org/wiki/Genetic_algorithm
