Optimizing With Genetic Algorithms: by Benjamin J. Lynch
Feb 23, 2006
Outline
• What are genetic algorithms?
– Biological origins
– Shortcomings of Newton-type optimizers
• How do we apply genetic algorithms?
– Options to include
• Encoding
• Selection
• Recombination
• Mutation
– Strategies
• What programs can we use?
– How do we parallelize?
• MPI, fork/wait
2
What are genetic algorithms?
(GAs)
• A class of stochastic search strategies
modeled after evolutionary mechanisms
3
What are genetic algorithms?
(GAs)
• A major difference between natural evolution
and our GAs is that we do not need to
follow the same laws observed in nature.
– Although modeled after natural processes, we
can design our own encoding of information,
our own mutations, and our own selection
criteria.
4
Definitions for today
5
Why would we use genetic algorithms?
Isn’t there a simple solution we learned in
Calculus?
• Perfect for a parabola!
$H x = -g$
– H is the Hessian (2nd derivative with respect to all parameters optimized)
– g is the gradient
– x is the step that drives the gradient to zero
7
Where Newton-Raphson Fails
• A local method will only find local
extrema.
If we start our search here
We’ll end up here
$H x = -g$
8
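A minimal sketch of the local-minimum problem, assuming a made-up one-dimensional function with two minima (the function and starting points are illustrative, not from the slides). Each iteration solves $H x = -g$ for the step and applies it:

```python
def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=100):
    """1-D Newton-Raphson: solve H * step = -g, take the step, repeat."""
    x = x0
    for _ in range(max_iter):
        step = -grad(x) / hess(x)
        x += step
        if abs(step) < tol:
            break
    return x

# Hypothetical test function with two minima (an assumption for this sketch).
f = lambda x: x**4 - 3 * x**2 + x
grad = lambda x: 4 * x**3 - 6 * x + 1
hess = lambda x: 12 * x**2 - 6

x_right = newton_minimize(grad, hess, 1.5)    # start on the right
x_left = newton_minimize(grad, hess, -1.5)    # start on the left
print(x_right, f(x_right))
print(x_left, f(x_left))
```

Both runs converge, but to different minima: the start on the right stops in the shallower one and never sees the deeper minimum on the left.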
How do we use GAs to optimize the
parameters we’re interested in?
• Choose parameters to optimize
• Determine chromosomal representation of
parameters
• Generate initial population of individuals
(chromosomes)
• Evaluate fitness of each individual to
reproduce
• Allow selection rules and random behavior
to select next population
9
[Flowchart: Enter Genetic Algorithm → Create Initial Population]
10
Setting up the problem
Setup 11
Determine the Parameters
Airfoil Example:
• Constant wing area
• Variable camber
• Variable chord at root
• Variable chord at tip
• Span (function of chords
and wing area)
Setup 12
Density Functional
Example:
• Choose parameters to
be all the variables in
the gradient-corrected
exchange terms.
[Equation: the gradient-corrected exchange energy $E_X^{GGA}$, an integral over $\rho(r)^{4/3}\,dr$ whose integrand contains the adjustable parameters $a_x$, $b_x$, $c_x$, $d_x$ and a $\sinh^{-1} x$ term]
Setup 13
Evaluate Your Fitness
Setup 14
Fitness
Setup 15
Fitness
Setup 16
• For an airfoil, this might be a function of drag and lift
Setup 17
• For an empirical density functional, the fitness
might be a weighted RMS deviation from
experimental values.
$$\text{Fitness} = -\sum_{i}^{\text{dataset}} \left( E_i^{\text{Calc}}(P) - E_i^{\text{Exp}} \right)^2$$
Setup 18
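A sketch of such a fitness function, written as the negative sum of squared deviations so that a better fit gives a larger fitness. The straight-line `linear_model` and the data values are made-up stand-ins for an expensive calculation and an experimental dataset; per-point weights are left out for brevity.

```python
def fitness(params, dataset, model):
    """Negative sum of squared deviations; closer to zero means fitter."""
    return -sum((model(params, x) - e_exp) ** 2 for x, e_exp in dataset)

# Hypothetical stand-in for the real calculation: a straight line.
def linear_model(params, x):
    slope, intercept = params
    return slope * x + intercept

# Made-up (input, "experimental" value) pairs, for illustration only.
dataset = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

print(fitness((2.0, 1.0), dataset, linear_model))  # parameters that fit exactly
print(fitness((1.0, 1.0), dataset, linear_model))  # parameters that fit worse
```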
• Check that your problem is well-suited for
optimization with a GA.
Setup 19
• Determine chromosomal representation of
parameters
• Parameters can be encoded in binary, base 4, base 10, etc.
encoding 20
• After you decide how to encode the parameters,
you must decide on the domain of your
parameters. This is entirely dependent on your
problem. You will want to allow your parameters
to be anything physically reasonable (if you’re
solving a physical problem)
encoding 21
Create Initial Population
• Population size is chosen (1-10 individuals/parameter
optimized for most applications)
• Parameters to be optimized are encoded.
– Binary, Base 10
– Let’s say we have 2 parameters with initial values of 32
and 13. In binary and base 10 they would look like:
Chromosome of the individual:
Binary: 100000001101
Base 10: 3213
encoding 22
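The encoding of 32 and 13 can be sketched as follows. The fixed widths (6 bits or 2 digits per parameter) are inferred from the chromosomes shown on the slide, and the function names are illustrative.

```python
def encode_binary(params, bits_per_gene):
    """Concatenate fixed-width binary genes into one chromosome string."""
    return "".join(format(p, f"0{bits_per_gene}b") for p in params)

def decode_binary(chromosome, bits_per_gene):
    """Split the chromosome into genes and convert each back to an int."""
    return [int(chromosome[i:i + bits_per_gene], 2)
            for i in range(0, len(chromosome), bits_per_gene)]

def encode_base10(params, digits_per_gene):
    """Concatenate fixed-width decimal genes into one chromosome string."""
    return "".join(f"{p:0{digits_per_gene}d}" for p in params)

print(encode_binary([32, 13], 6))
print(encode_base10([32, 13], 2))
print(decode_binary("100000001101", 6))
```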
How binary genes translate into
parameters
101011101001010111010000100110
encoding 23
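One common way to turn binary genes into real-valued parameters is a linear map onto each parameter's domain. The sketch below assumes the 30-bit chromosome holds three 10-bit genes and uses made-up domains; neither detail is stated on the slide.

```python
def genes_to_parameters(chromosome, bits_per_gene, domains):
    """Map each binary gene linearly onto its (low, high) domain."""
    max_int = 2 ** bits_per_gene - 1
    params = []
    for k, (low, high) in enumerate(domains):
        gene = chromosome[k * bits_per_gene:(k + 1) * bits_per_gene]
        params.append(low + (high - low) * int(gene, 2) / max_int)
    return params

chromosome = "101011101001010111010000100110"
domains = [(0.0, 1.0), (-5.0, 5.0), (10.0, 20.0)]   # assumed domains
params = genes_to_parameters(chromosome, 10, domains)
print(params)
```

With 10 bits per gene, each parameter can take 1024 evenly spaced values inside its domain; more bits give a finer grid.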
Create Initial Population
encoding 24
• Generate initial population of individuals
(chromosomes)
encoding 25
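A sketch of generating a random initial population of binary chromosomes. The population-size rule follows the "individuals per parameter" guideline; the specific numbers below are arbitrary choices for the demonstration.

```python
import random

def random_chromosome(n_bits, rng):
    """One individual: a string of random bits."""
    return "".join(rng.choice("01") for _ in range(n_bits))

def initial_population(n_params, bits_per_gene, individuals_per_param, rng):
    """Population size scales with the number of parameters optimized."""
    size = n_params * individuals_per_param
    return [random_chromosome(n_params * bits_per_gene, rng)
            for _ in range(size)]

rng = random.Random(0)   # fixed seed so the run is repeatable
population = initial_population(n_params=2, bits_per_gene=6,
                                individuals_per_param=5, rng=rng)
for chromosome in population:
    print(chromosome)
```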
Review Steps in a GA
1. Initialization of population
2. Evaluation of fitness
3. Selection
4. Recombination
5. Repeat 2-4
26
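The five steps can be sketched as one short loop. The selection rule (two-way tournaments), the single crossover point, and the toy fitness (number of 1-bits) are assumptions chosen to keep the example small; mutation and elitism are left out.

```python
import random

def run_ga(fitness, n_bits, pop_size, generations, rng):
    # 1. Initialization of population
    population = ["".join(rng.choice("01") for _ in range(n_bits))
                  for _ in range(pop_size)]
    for _ in range(generations):                 # 5. Repeat 2-4
        # 2. Evaluation of fitness
        scores = {c: fitness(c) for c in population}
        next_population = []
        while len(next_population) < pop_size:
            # 3. Selection: each parent wins a two-way tournament
            parents = [max(rng.sample(population, 2), key=scores.get)
                       for _ in range(2)]
            # 4. Recombination: one crossover point
            cut = rng.randrange(1, n_bits)
            next_population.append(parents[0][:cut] + parents[1][cut:])
        population = next_population
    return population

count_ones = lambda c: c.count("1")              # toy fitness function
final_population = run_ga(count_ones, n_bits=16, pop_size=20,
                          generations=30, rng=random.Random(1))
print(max(final_population, key=count_ones))
```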
Evaluation
Evaluate Fitness 27
• The fitness function is somehow based on
the genes of the individual and should
reflect how good a given set of parameters
is.
– Lift-to-drag ratio of a low-drag airfoil
– Ability of a density functional to better predict
chemical phenomena
– Swimming speed of a robotic fish
– Power output of a chemical laser
Evaluate Fitness 28
• Evaluation of the fitness is the computationally-
intensive portion of a GA optimization
• Each chromosome holds the information that
uniquely describes an individual.
• Each chromosome (parameter set, individual) can
be evaluated separately from the other individuals.
– GA optimizations are typically described as
embarrassingly parallelizable
• The evaluation of the chromosomes reduces down
to a fitness value for each individual which will
be used in the next step
Evaluate Fitness 29
Selection
• Allow selection rules and random behavior
to select next population
Selection 30
• The parents must be selected based on their
fitness.
Selection 31
Roulette Wheel Selection
• Probability of parenthood is proportional to fitness.
• The wheel is spun until two parents are selected.
• The two parents create one offspring.
• The process is repeated to create a new population for the next generation.
[Figure: roulette wheel with slices Fit(#1) through Fit(#5)]
Selection 32
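A sketch of one spin of the wheel, assuming all fitness values are non-negative and at least one is positive. The five fitness values are made up for the example.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness.
    Assumes non-negative fitness values that are not all zero."""
    spin = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return individual
    return population[-1]   # guard against floating-point round-off

rng = random.Random(0)
population = ["#1", "#2", "#3", "#4", "#5"]
fitnesses = [5.0, 4.0, 3.0, 2.0, 1.0]           # made-up values
parents = (roulette_select(population, fitnesses, rng),
           roulette_select(population, fitnesses, rng))
print(parents)
```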
• Roulette wheel selection has problems if the
fitness changes by orders of magnitude.
• If two individuals have a much higher
fitness, they could be the parents for
every child in the next generation.
[Figure: roulette wheel with slices Fit(#1) through Fit(#5)]
Selection 33
Another Reason Not to Use the Roulette Wheel
• If the fitness values of all individuals are very close, the parents will be chosen with nearly equal probability, and the GA will cease to optimize.
• Roulette selection is very sensitive to the form of the fitness function and generally requires modifications to work at all.
[Figure: roulette wheel with slices Fit(#1) through Fit(#5)]
Selection 34
Rank Selection
• All individuals in the population are ranked according to fitness.
• Each individual is assigned a weight inversely proportional to its rank (1/rank).
[Figure: slices Fit(#1) through Fit(#5)]
Selection 35
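A sketch of rank selection with a weight of 1/rank, where rank 1 is the fittest individual. The fitness values are made up; the point is that only the ordering matters, so fitness values spread over orders of magnitude and fitness values that are nearly equal produce the same weights.

```python
import random

def rank_weights(fitnesses):
    """Weight 1/rank for each individual; rank 1 is the fittest."""
    order = sorted(range(len(fitnesses)),
                   key=lambda i: fitnesses[i], reverse=True)
    weights = [0.0] * len(fitnesses)
    for rank, i in enumerate(order, start=1):
        weights[i] = 1.0 / rank
    return weights

def rank_select(population, fitnesses, rng):
    """Pick one parent with probability proportional to 1/rank."""
    return rng.choices(population, weights=rank_weights(fitnesses), k=1)[0]

print(rank_weights([1000.0, 10.0, 0.1]))    # fitness spread over orders of magnitude
print(rank_weights([1.02, 1.01, 1.00]))     # fitness values nearly equal
print(rank_select(["#1", "#2", "#3"], [1.02, 1.01, 1.00], random.Random(0)))
```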
Tournament Selection
• 4 individuals (A,B,C,D) are randomly selected from
the population. Two are eliminated and two
become the parents of a child in the next generation
[Figure: A and B compete and A wins because Fitness(A) > Fitness(B); C and D compete and D wins because Fitness(D) > Fitness(C)]
Selection 36
Tournament Selection
• Selection of parents continues until a new
population is completed.
• Individuals might be the parent to several children,
or no children.
Selection 37
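A sketch of one tournament: four distinct individuals are drawn, and the winners of the two pairings become the parents. The integer population with fitness equal to the value itself is a toy assumption.

```python
import random

def tournament_parents(population, fitness, rng):
    """Draw 4 distinct individuals; the winners of A-vs-B and C-vs-D
    become the two parents of one child."""
    a, b, c, d = rng.sample(population, 4)
    return max(a, b, key=fitness), max(c, d, key=fitness)

rng = random.Random(0)
population = list(range(10))          # toy individuals
fitness = lambda individual: individual
print(tournament_parents(population, fitness, rng))
```

Only comparisons between fitness values are used, so the result does not depend on how the fitness function is scaled.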
Similarities Between Tournament and Rank Selection
• Tournament selection is very similar to rank selection in the limit of a large population when we assign a weight of 1/rank.
Fraction of population, by parent fitness:
9/16: Both parents were above the median
6/16: One parent was above the median
1/16: Neither parent was above the median
Selection 38
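The fractions 9/16, 6/16 and 1/16 follow from a short calculation, sketched here with exact fractions. It assumes a population large enough that each contestant is above the median with probability 1/2, independently of the others.

```python
from fractions import Fraction

# A tournament winner is below the median only if both contestants are,
# which happens with probability 1/2 * 1/2.
p_above = 1 - Fraction(1, 2) ** 2

both = p_above ** 2                    # both parents above the median
one = 2 * p_above * (1 - p_above)      # exactly one parent above the median
neither = (1 - p_above) ** 2           # neither parent above the median
print(both, one, neither)
```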
Recombination
Recombination 39
Recombination with crossover points
• We can choose a number of crossover points
[Figure: chromosomes of Parent 1, Parent 2, and child]
Recombination 40
crossover point occurs within a parameter
            parent1  parent2  child
Param. 1    A        a        A
(eyes)      B        b        B
            C        c        C
            D        d        D
Param. 2    E        e        E
(nose)      F        f        F
            G        g        g
            H        h        h
In this case the child will have a new nose that is not the same as parent1 or parent2.
Recombination 41
representation of parameters becomes important.
            Binary encoding             Base-10 value
            parent1  parent2  child     parent1  parent2  child
Param. 1    1111     1010     1111      15       10       15
Param. 2    0000     1111     0011      00       15       03
The child's value of 3 for param. 2 is not possible if we used base 10 encoding.
Recombination 42
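A sketch of a single crossover point that falls inside parameter 2, using the parent values from the slide (15 and 0 for parent1, 10 and 15 for parent2). The function name and the chromosome layout (4 bits or 2 digits per parameter) are assumptions.

```python
def one_point_crossover(parent1, parent2, cut):
    """Child takes parent1's elements before `cut` and parent2's after it."""
    return parent1[:cut] + parent2[cut:]

# Binary encoding, 4 bits per parameter; the cut is inside parameter 2.
child = one_point_crossover("11110000", "10101111", cut=6)
print(child, int(child[:4], 2), int(child[4:], 2))

# Base-10 encoding, 2 digits per parameter; the only cut inside parameter 2.
child10 = one_point_crossover("1500", "1015", cut=3)
print(child10)
```

In binary the child's second parameter becomes 3, a value neither parent had; with base-10 digits the same parents can only produce 05 (or 10 with the parents swapped).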
Recombination – Uniform Crossover
• Uniform crossover
– No limit to crossover points
parent1  parent2  child
A        a        a
B        b        B
C        c        c
D        d        D
E        e        e
F        f        f
G        g        g
H        h        H
Allows more variation in offspring and decreases need for random mutations
Recombination 43
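A sketch of uniform crossover, assuming each position is inherited from either parent with equal probability (the slide does not state the probability).

```python
import random

def uniform_crossover(parent1, parent2, rng):
    """Each position is copied from either parent with equal probability."""
    return "".join(rng.choice(pair) for pair in zip(parent1, parent2))

rng = random.Random(0)
child = uniform_crossover("ABCDEFGH", "abcdefgh", rng)
print(child)
```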
• Mutations can be applied after recombination
            parent1  parent2  child  mutated child
Param. 1    A        a        A      A
            B        b        B      B
            C        c        C      C
            D        d        D      D
Param. 2    E        e        e      e
            F        f        f      f
            G        g        g      g
            H        h        h      h
A random mutation has been applied to the child
Random mutations 44
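For a binary chromosome, a random mutation can be sketched as flipping each bit independently with a small probability. The per-bit `rate` parameter is an assumption; the slides do not fix a mutation scheme.

```python
import random

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[bit] if rng.random() < rate else bit
                   for bit in chromosome)

rng = random.Random(0)
print(mutate("111000111000", rate=0.1, rng=rng))
```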
• Creep mutations are a special type of random mutation.
• Creep mutations cause a parameter to change by a small amount, rather than randomizing any one element.
            parent1  parent2  child  possible creep mutations for param. 1
Param. 1    1110     1111     1110   1101  OR  1111
Param. 2    0000     1000     1000   1000      1000
Random mutations 45
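A sketch of the possible creep mutations of one binary gene, assuming "a small amount" means one step up or down in the decoded value and that the result must stay within the range the bit width can represent.

```python
def creep_mutations(gene):
    """Return the genes whose value is one less and one more than `gene`,
    dropping any value outside the range of the bit width."""
    bits = len(gene)
    value = int(gene, 2)
    neighbours = [v for v in (value - 1, value + 1) if 0 <= v < 2 ** bits]
    return [format(v, f"0{bits}b") for v in neighbours]

print(creep_mutations("1110"))   # gene with value 14
```

Note that a one-step change in value can alter several bits at once (0111 to 1000, for example), which a single random bit flip cannot do.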
• The desirable frequency of mutations depends greatly on the other GA options chosen.
            parent1  parent2  child  mutated child
Param. 1    A        a        A      A
            B        b        B      B
            C        c        C      C
            D        d        D      D
Param. 2    E        e        e      e
            F        f        f      f
            G        g        g      g
            H        h        h      H
Random mutations 46
Other Operators for Recombination
• Other rearrangements of information are possible
• Swap entire genes
• Swap locus
[Figure: columns of base-10 digits illustrating these rearrangements]
Random mutations 47
Elitism
• Elitism refers to the safeguarding of the
chromosome of the most fit individual in a given
generation.
• If elitism is used, only N-1 individuals are produced
by recombining the information from parents. The
last individual is a copy of the most fit individual
from the previous generation.
• This ensures that the best chromosome is never lost
in the optimization process due to random events.
48
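A sketch of elitism: N-1 children are produced and the fittest individual of the current generation is copied over unchanged. The helper names, the toy fitness (number of 1-bits) and the way children are made are assumptions for the demonstration.

```python
import random

def next_generation(population, fitness, make_child, rng):
    """Build N-1 children, then copy the fittest individual unchanged."""
    elite = max(population, key=fitness)
    children = [make_child(population, rng)
                for _ in range(len(population) - 1)]
    return children + [elite]

# Hypothetical helpers: count of 1-bits as fitness, and a uniform
# crossover of two randomly chosen parents.
def count_ones(chromosome):
    return chromosome.count("1")

def random_child(population, rng):
    p1, p2 = rng.sample(population, 2)
    return "".join(rng.choice(pair) for pair in zip(p1, p2))

rng = random.Random(0)
old = ["1100", "0011", "1110", "0000"]
new = next_generation(old, count_ones, random_child, rng)
print(max(map(count_ones, old)), max(map(count_ones, new)))
```

With elitism the best fitness in the population can never decrease from one generation to the next.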
More GA Options
• Separate “islands” with populations that interact
infrequently.
• Use “male” & “female” populations
• “Alpha male” selection
• Use 3 parents for each offspring
• Use 3 sexes
• Recessive genes
• … and many more, most of which are only useful
for very specific types of fitness functions.
49
Micro-GA
• Small population size of 1 individual per
parameter optimized.
• No random mutations.
• When the genetic variance is below a
certain threshold (~5%), the most fit
individual goes on, while the chromosomes
of all other individuals are randomized
• Cycle this process
50
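A sketch of the micro-GA restart step. The slides do not say how genetic variance is measured, so this assumes one simple measure: the fraction of bits that differ from the fittest chromosome, averaged over the other individuals.

```python
import random

def micro_ga_restart(population, fitness, rng, threshold=0.05):
    """Keep the fittest individual and randomize the rest once the
    population has converged; otherwise return it unchanged."""
    best = max(population, key=fitness)
    n_bits = len(best)
    # Assumed variance measure: fraction of bits that differ from the
    # best chromosome, averaged over the other individuals.
    differing = sum(a != b for c in population for a, b in zip(best, c))
    variance = differing / (n_bits * (len(population) - 1))
    if variance >= threshold:
        return population
    fresh = ["".join(rng.choice("01") for _ in range(n_bits))
             for _ in range(len(population) - 1)]
    return [best] + fresh

rng = random.Random(0)
count_ones = lambda c: c.count("1")             # toy fitness function
converged = ["1111", "1111", "1111", "1111"]
restarted = micro_ga_restart(converged, count_ones, rng)
print(restarted)
```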
Overview of a micro-GA
Create initial population
Recombine parents to
create new population
52
Review
Evaluation of the fitness is
the time-consuming
portion
Create Initial Population
54
Which GA options are good picks for
my system?
• Start with robust algorithms
– Micro-GA
– Binary encoding
– Tournament selection
– Uniform crossover at gene boundaries
56
How might one parallelize a GA?
• Evaluating the fitness of an individual is
independent of the rest of the population.
– You can easily run your GA on N processors
where N is your population size.
57
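A sketch of evaluating a population in parallel with Python's standard library. It uses a thread pool, which is sufficient when each evaluation launches an external program and waits for it; for a CPU-bound fitness function written in Python, `ProcessPoolExecutor` offers the same `map` interface. The toy fitness function is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_population(population, fitness, max_workers=4):
    """Evaluate every chromosome independently; results keep the
    order of the population."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fitness, population))

scores = evaluate_population(["1100", "1111", "0000"],
                             lambda c: c.count("1"))
print(scores)
```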
Method to Parallelize a GA
• MPI is always a good choice if you’re
already familiar with it. This option
would also enable GAs with
“islands” on heterogeneous clusters.
58
Fork method to parallelize with Perl on a shared-memory machine

#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

my $max_process = 4;
my $pm = Parallel::ForkManager->new($max_process);
my @all_chromosomes = @ARGV;

# enter main loop, no more than $max_process
# children will run at a time
foreach my $chromosome (@all_chromosomes) {
    # start() forks; the parent gets the child's PID and moves on
    my $pid = $pm->start and next;
    # prepare an appropriate input file with
    # this set of parameters (writeinput is user-supplied)
    writeinput($chromosome);
    # run the program to evaluate the fitness (evalfitness is user-supplied)
    evalfitness($chromosome);
    $pm->finish;
}
# wait until every child has finished
$pm->wait_all_children;
59
GA Programs Available
60
Other GA Resources Available
• Me
– blynch@msi.umn.edu
– 612-624-4122
61
Questions?
blynch@msi.umn.edu 4-4122
help@msi.umn.edu 6-0802
62