
Nontraditional Optimization: GA

• To find a global minimum, users normally try a heuristic approach where several local
minima are found by repeated trials with different starting values or by using different
techniques

• The smallest of all known local minima is then assumed to be the global minimum. This is
certainly not very reliable.

• Traditional techniques are mostly derivative based


• Features of Nontraditional Techniques

• Do not require derivative information (derivative free methods)

• Search wide domains of the objective/cost surface

• Can efficiently deal with a large number of variables

• Suitable for extremely complex cost surfaces; multiobjective problems

• Suggests other possible sub-optimal solutions which may be more suitable for implementation

• Highly flexible and versatile


• Computationally expensive (more suitable for off-line applications)

• Intuitive (no strong mathematical grounding; black-box nature) and inspired by some
natural/biological process

• Stochastic in nature (relying heavily on random number generation)

• Large number of hyperparameters (problem-specific tuning)


• Some Nontraditional Optimization Techniques

• Genetic Algorithm (GA)

• Differential Evolution (DE)

• Simulated Annealing (SA)

• Particle Swarm Optimization (PSO)

• Ant Colony Optimization (ACO)

• Artificial Bee Colony (ABC)


• Artificial Immune System (AIS)

• Grey Wolf Optimizer (GWO)

• Firefly Algorithm (FA)

• Bacterial Foraging Optimization (BFO)

• Cuckoo Search Algorithm (CSA)

• Harmony Search (HS)

and the list goes on …


Genetic Algorithm (GA)

• Introduced by John Holland in 1975 (University of Michigan) and
popularized by his student David Goldberg

• Philosophically, GA is based on Darwin’s theory of survival of the fittest

• Basic idea is to start from a randomly chosen set of probable solutions, then allow
intermixing of these candidate solutions over many generations and gradually evolve
towards the optimal solution

Steps for Binary Coded GA (BCGA)

• In principle, suitable for maximization problems

• Let us first consider the unconstrained optimization problem

• find x = [x1, x2, …, xn]^T to maximize f(x)
  where x_i^min ≤ x_i ≤ x_i^max, i = 1, 2, …, n
  i.e. x ∈ Ω ⊂ R^n

• Flow of one run (Fitness Evaluation through Decoding iterated until termination):

  Initialization → Encoding → Fitness Evaluation → Selection → Crossover → Mutation →
  Decoding → (back to Fitness Evaluation)
i) Initialization

• A population size N is chosen and then N random points are picked from
the phenotype space (search space) Ω

• N is even and usually a few tens to a few hundreds (not too high!)

ii) Encoding

• The N points from the phenotype space are mapped to N corresponding points in the
genotype space

• Each variable 𝑥𝑖 is represented by a binary substring of length 𝑙𝑖

• Such binary substrings are concatenated for all i=1,2,…n

• Performed for all N points


• l_i depends on the desired accuracy level ε_i of x_i:

  ε_i = (x_i^max − x_i^min) / (2^l_i − 1)

  l_i = ceiling[ log2( 1 + (x_i^max − x_i^min) / ε_i ) ]

• Mapping from the range [x_i^min, x_i^max] to [0, 2^l_i − 1] is done as follows:

  x_i^d = round( (x_i − x_i^min) / (x_i^max − x_i^min) × (2^l_i − 1) )   or   x_i = x_i^min + ε_i′ · x_i^d

  (ε_i′ is the actual resolution)

• x_i^d is then converted to the binary substring D_i
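The encoding and decoding maps above can be sketched in Python (a minimal illustration; the function names are mine, not from the slides):

```python
import math

def substring_length(x_min, x_max, eps):
    # l_i = ceiling[ log2(1 + (x_max - x_min)/eps) ]
    return math.ceil(math.log2(1 + (x_max - x_min) / eps))

def encode(x, x_min, x_max, l):
    # map [x_min, x_max] -> {0, ..., 2^l - 1}, then to an l-bit substring
    xd = round((x - x_min) / (x_max - x_min) * (2**l - 1))
    return format(xd, '0{}b'.format(l))

def decode(bits, x_min, x_max):
    # x = x_min + eps' * x_d, with eps' the actual resolution
    l = len(bits)
    eps_actual = (x_max - x_min) / (2**l - 1)
    return x_min + int(bits, 2) * eps_actual
```

For the worked example that follows, `substring_length(-2, 2, 0.2)` gives 5 and `encode(-0.85, -2, 2, 5)` gives `'01001'`.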


Numerical Example:

• Two design variables x1 ∈ [−2, 2] and x2 ∈ [1, 12]

• Represent x = [−0.85, 4.21]^T as a binary string or chromosome

• Desired accuracy level of 0.2 for both x1 and x2

Solution:

resolution for x1: ε1 = 0.2, so

l1 = ceiling[ log2( 1 + (2 − (−2)) / 0.2 ) ] = ceiling[ log2(21) ] = 5

similarly, for x2:

l2 = ceiling[ log2( 1 + (12 − 1) / 0.2 ) ] = ceiling[ log2(56) ] = 6

Now, x1 = x1^min + x1^d · ε1′, where ε1′ is the actual resolution of x1:

ε1′ = (x1^max − x1^min) / (2^l1 − 1) = 4/31 ≈ 0.13

∴ −0.85 = −2 + 0.13 × x1^d

⟹ x1^d = 8.9 ≈ 9

Hence, the binary representation of −0.85 is '01001'

similarly,

x2 = x2^min + x2^d · ε2′, where ε2′ is the actual resolution of x2:

ε2′ = (x2^max − x2^min) / (2^l2 − 1) = 11/63 ≈ 0.1746

∴ 4.21 = 1 + 0.1746 × x2^d

⟹ x2^d = 18.38 ≈ 18

Hence, the binary representation of 4.21 is '010010'

Hence, the chromosome '01001010010' maps the point x = [−0.85, 4.21]^T
4.21
iii) Decoding and Fitness Evaluation

• Next the fitness values at all the N points in the genotype space are
computed

• Extract the decimal integer x_i^d from the GA string and then map back to the
phenotype space as

  x_i = x_i^min + ε_i′ · x_i^d

• Compute the objective function value once all x_i (i = 1, 2, …, n) are obtained for a
particular chromosome

• This is usually taken as the fitness value of that chromosome


Exercise:

Let an optimization problem have two design variables x1 and x2, both varying in the
range [−5, 5]. If each is represented by a 6-bit binary substring, what is the point in
the phenotype space corresponding to the chromosome 100110 011100?

Answer: x ≈ (1.03, −0.56)
iv) Selection

A mating pool (size N) is selected from the population

(a) Proportionate selection/Roulette-Wheel selection

• Probability of getting selected into the mating pool ∝ fitness:

  P_i = f_i / Σ_{i=1}^{N} f_i

• Implemented with the help of a Roulette Wheel whose i-th slot is proportional to f_i

• Balancing Population Diversity and Selection Pressure (Exploration/Exploitation balance)
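Roulette-wheel selection can be sketched as follows (a minimal version; the `rng` argument is mine, added only so the spins can be made reproducible):

```python
import random

def roulette_wheel_select(population, fitness, rng=random):
    # P_i = f_i / sum(f); spin the wheel once per mating-pool slot
    total = sum(fitness)
    cum, wheel = 0.0, []
    for f in fitness:
        cum += f / total
        wheel.append(cum)
    pool = []
    for _ in range(len(population)):
        r = rng.random()
        for member, c in zip(population, wheel):
            if r <= c:
                pool.append(member)
                break
    return pool
```

With the fitness values and random numbers of the worked example later in these notes (fitness 16, 25, 71.25, 6, 30, 28 and spins 0.27, 0.72, 0.11, 0.68, 0.91, 0.50), this reproduces the mating pool members 3, 5, 2, 5, 6, 3.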


(b) Ranking Selection

• Least fit string is given a rank of 1, next one rank 2 etc.

• Thus the fittest string gets the highest rank i.e. N

• Then Roulette-Wheel selection is performed based on the ranks (rather than on the fitness)

(Figures: Roulette wheels for proportionate vs. ranking selection; proportionate slots are
proportional to fitness, ranking slots to rank)
(c) Tournament Selection

• A small number (2-3) of chromosomes is randomly picked from the population and the
fittest of them is put in the mating pool

• All the candidates are then returned to the original population

• The process is repeated N times

• The least fit candidate of the population never wins a tournament and is thus eliminated
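A sketch of tournament selection under these rules (function name and `rng` plumbing are my own):

```python
import random

def tournament_select(population, fitness, k=2, rng=random):
    # N tournaments; each picks k random candidates and copies the
    # fittest into the mating pool, then returns all candidates
    N = len(population)
    pool = []
    for _ in range(N):
        candidates = rng.sample(range(N), k)
        winner = max(candidates, key=lambda i: fitness[i])
        pool.append(population[winner])
    return pool
```

Note that with k = 2 the least fit member can never win a tournament, which is how it gets eliminated from the mating pool.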


v) Crossover

• Crossover operation forms children chromosomes for the next generation by using parent
chromosomes from the mating pool of the current generation

• N/2 mating pairs are selected at random

• With some crossover probability (p_c close to 1.0), the parent chromosomes swap some
portions of themselves to produce two children chromosomes

• In single-point crossover a crossover site is chosen at random and the portions on the
right side of the crossover site are interchanged
• Single-point Crossover
0101101011 1001101001 Parents
0011010010 1110100101

0101101011 1110100101
Children
0011010010 1001101001

• Two-point Crossover
101011 10011 010010100 Parents
010010 11101 000110011

101011 11101 010010100
Children
010010 10011 000110011
• Uniform Crossover

 At each bit position a coin is tossed to decide whether the bits will be interchanged

 If ‘head’ appears then there will be a swapping, else not

10110000111001101110
Parents
01110110001100010100

Let us assume that the 2nd, 4th, 5th, 8th, 9th, 12th, 15th, 18th and 20th
bit positions are selected for swapping

11110000011101001110
00110110101000110100 Children

Found to perform better for large search spaces
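The single-point and uniform operators above can be sketched as follows (names are mine; the `rng` argument only makes the coin tosses reproducible):

```python
import random

def single_point_crossover(p1, p2, site):
    # interchange the portions to the right of the crossover site
    return p1[:site] + p2[site:], p2[:site] + p1[site:]

def uniform_crossover(p1, p2, rng=random):
    # toss a coin at every bit position; on 'heads' the bits are swapped
    c1, c2 = [], []
    for b1, b2 in zip(p1, p2):
        if rng.random() < 0.5:
            b1, b2 = b2, b1
        c1.append(b1)
        c2.append(b2)
    return ''.join(c1), ''.join(c2)
```

`single_point_crossover('01011010111001101001', '00110100101110100101', 10)` reproduces the single-point children shown above.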


• Crossover operation introduces randomness into the current population to help avoid
getting trapped in local minima

• Crossover operation may result in better or worse strings. If a few worse offspring are
formed, they will not survive for long, since upcoming generations are likely to
eliminate them

• What if all the new offspring are worse? To avoid such situations, go for an ELITIST
selection
vi) Mutation

• After crossover, a small fraction (decided by a small mutation probability p_m) of the
N members are made to undergo Mutation

• Single-point mutation: a mutation site is chosen at random and the corresponding bit
is flipped

• Bit-wise mutation: each bit of all the N chromosomes is flipped with the mutation
probability p_m


• Helps a great deal to avoid getting stuck in local minima

• Too low p_m → local-minima problem; too high p_m → too much randomization
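Bit-wise mutation is nearly a one-liner (a sketch; p_m = 0 and p_m = 1 are shown below only as sanity checks of the two extremes):

```python
import random

def bitwise_mutation(chromosome, pm, rng=random):
    # every bit of the string flips independently with probability pm
    return ''.join('10'[int(b)] if rng.random() < pm else b for b in chromosome)
```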


Termination criteria

- Maximum number of Generations, say G = 100

- At least n2 generations are complete and, over at least the last n1 generations, the
best solution has not changed by more than a small ε

- Best solution of each generation is saved


Elitism

– Objective is to speed up convergence (best solutions are not lost from the population)

– A few best chromosomes of the present generation (Elite Count (EC) = 2, say) are
directly sent to the next generation; others go through the normal process of selection,
crossover and mutation
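Putting steps (i) to (vi) together with the generation limit, one possible minimal loop looks like the sketch below (my own composition, assuming positive fitness values so the roulette-wheel weights are valid; elitism is simplified to tracking the best-so-far string):

```python
import random

def binary_ga(fitness, n_bits, N=20, G=100, pc=0.9, pm=0.01, seed=None):
    # Minimal BCGA: maximizes `fitness` over bit-strings of length n_bits.
    # Assumes fitness(chromosome) > 0 so the roulette-wheel weights are valid.
    rng = random.Random(seed)
    pop = [''.join(rng.choice('01') for _ in range(n_bits)) for _ in range(N)]
    best = max(pop, key=fitness)                     # best-so-far is saved
    for _ in range(G):
        f = [fitness(c) for c in pop]
        pool = rng.choices(pop, weights=f, k=N)      # (iv) roulette wheel
        children = []
        for i in range(0, N, 2):                     # (v) single-point crossover
            p1, p2 = pool[i], pool[i + 1]
            if rng.random() < pc:
                site = rng.randrange(1, n_bits)
                p1, p2 = p1[:site] + p2[site:], p2[:site] + p1[site:]
            children += [p1, p2]
        pop = [''.join('10'[int(b)] if rng.random() < pm else b for b in c)
               for c in children]                    # (vi) bit-wise mutation
        best = max(pop + [best], key=fitness)
    return best
```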
Example:
maximize f(x1, x2) = x1x2 where 1.0 ≤ x1, x2 ≤ 10.0

Initial population is randomly chosen as,

(2, 8); (5, 5); (9.5, 7.5); (3, 2); (10, 3); (4, 7)

Random numbers generated for Roulette wheel selection are,

0.27, 0.72, 0.11, 0.68, 0.91, 0.50.

Compute the next generation with the fitness values. Assume the following:

string length of each variable is 4

single point crossover with site 3 and 5 alternately

mating pairs as 1st and 4th members; 2nd and 5th members etc.

P_c = 1.0; P_m = 0.0 and EC = 0.


Resolution for each variable:

ε′ = (10 − 1) / (2^4 − 1) = 0.6

Now, x1 = x1^min + x1^d · ε′

2 = 1 + 0.6 × x1^d
x1^d = 1.66 ≈ 2 = '0010'

Similarly, 8 = 1 + 0.6 × x2^d
x2^d = 11.66 ≈ 12 = '1100'


GEN #1          GEN #1        Fitness       Select.   Mating
(Phenotype      (Genotype     Value         Prob.     Pool
Space)          Space)        (f = x1·x2)

1. (2, 8)       0010 1100     16.00         0.09      1110 1011 (3)
2. (5, 5)       0111 0111     25.00         0.14      1111 0011 (5)
3. (9.5, 7.5)   1110 1011     71.25         0.40      0111 0111 (2)
4. (3, 2)       0011 0010     06.00         0.03      1111 0011 (5)
5. (10, 3)      1111 0011     30.00         0.17      0101 1010 (6)
6. (4, 7)       0101 1010     28.00         0.16      1110 1011 (3)
                              ------
                              176.25
Decoding a child, e.g. the first one ('1111 0011'):

Now, x1 = x1^min + x1^d · ε′ = 1 + 15 × 0.6 = 10.0

and x2 = x2^min + x2^d · ε′ = 1 + 3 × 0.6 = 2.8

Mating       GEN #2        GEN #2         Fitness
Pool         (Genotype     (Phenotype     Value
             Space)        Space)         (f = x1·x2)

1110 1011    1111 0011     (10, 2.8)      28.00
1111 0011    1110 1011     (9.4, 7.6)     71.44
0111 0111    1111 0010     (10, 2.2)      22.00
1111 0011    0101 1011     (4, 7.6)       30.40
0101 1010    0110 1011     (4.6, 7.6)     34.96
1110 1011    1111 0111     (10, 5.2)      52.00
                                          ------
                                          238.80
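The decoded second generation can be checked quickly in code (a verification sketch of the table above; `decode4` is my own helper for the 4-bit, [1, 10] encoding):

```python
def decode4(bits, x_min=1.0, eps=0.6):
    # x = x_min + x_d * eps'  with  eps' = (10 - 1)/(2**4 - 1) = 0.6
    return x_min + int(bits, 2) * eps

gen2 = ['11110011', '11101011', '11110010', '01011011', '01101011', '11110111']
points = [(decode4(c[:4]), decode4(c[4:])) for c in gen2]
fitness = [round(x1 * x2, 2) for x1, x2 in points]
# fitness values 28.0, 71.44, 22.0, 30.4, 34.96, 52.0; total 238.8
```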
Example:

f = −10 cos(x1) cos(x2) · exp( −[ (x1 − 1)²/4 + (x2 − 2)² ] )

−5.0 ≤ x1 ≤ 5.0 ; −5.0 ≤ x2 ≤ 5.0

x* = (2.51, 2.48)  [(2.49, 2.43)]      f* = −2.8613 (−2.8733)

• N = 20 (40)

• Pc = 0.9

• Pm = 0.09

• EC = 0

• Chromosome length = 10 + 10

• Uniform crossover

(Plots: x1 (blue) and x2 (red) vs. generation, converging within about 30 generations;
contour plot of f over [−5, 5] × [−5, 5])
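The test function can be checked against the reported optimum (a numerical sanity check; the formula layout is garbled in this extraction, so the exponent grouping is my reading of the slide):

```python
import math

def f(x1, x2):
    # f = -10 cos(x1) cos(x2) exp(-[(x1 - 1)^2/4 + (x2 - 2)^2])
    return (-10 * math.cos(x1) * math.cos(x2)
            * math.exp(-((x1 - 1)**2 / 4 + (x2 - 2)**2)))

# at the reported optimum x* = (2.51, 2.48) this gives roughly -2.86
```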
Schema Theorem for Binary Coded GA

• Proposed by Prof. John Holland

• An attempt to give GA a mathematical foundation

Let us consider a population of binary-strings created at random

110011
010100

110111
110000

Let us assume the following two schemata (templates):


H1: * 1 0 * * *
H2: * 1 0 * 0 0 ( * could be either 1 or 0)
H1: * 1 0 * * *
H2: * 1 0 * 0 0

• Order of a schema, O(H): number of fixed positions (bits) present in the schema

  For example: O(H1) = 2; O(H2) = 4

• Defining length of a schema, δ(H): distance between the first and the last fixed
positions in the string

  For example: δ(H1) = 3 − 2 = 1; δ(H2) = 6 − 2 = 4
• Effect of Selection:

Let m(H, t) = number of strings belonging to schema H at the t-th Gen.

m(H, t+1) = number of strings belonging to schema H at the (t+1)-th Gen.

E[ m(H, t+1) ] = m(H, t) · f(H) / f̄

f(H) = schema fitness, i.e. average fitness of the strings represented by schema H

f̄ = average fitness of the entire population at the t-th Gen.

• Effect of Crossover (Single-point):

Let p_c = probability of crossover and L = string length

A schema is destroyed if the crossover site falls within its defining length:

Probability of destruction = p_c · δ(H) / (L − 1)

Probability of survival = 1 − p_c · δ(H) / (L − 1)
• Effect of Mutation (Bit-wise Mutation):

To protect a schema, mutation should not occur at any of its fixed bits

Let p_m = probability of mutation

Probability of destruction of a single bit = p_m
Probability of survival of a single bit = 1 − p_m

Probability of survival of the whole schema:

p_s = (1 − p_m)(1 − p_m) ··· (O(H) factors)
    = (1 − p_m)^O(H)
    ≈ 1 − O(H)·p_m    since p_m ≪ 1
Considering the contributions of all three operators (neglecting the 2nd-order term):

E[ m(H, t+1) ] ≥ m(H, t) · ( f(H) / f̄ ) · [ 1 − p_c · δ(H)/(L − 1) − O(H)·p_m ]

Building-Block Hypothesis:

The schemata having low order, short defining length and fitness considerably more than
the average fitness of the population will have more and more representations in future
generations
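The schema-growth bound can be evaluated numerically (a sketch; the example numbers in the note below are mine, not from the slides):

```python
def expected_schema_count(m_H, f_H, f_avg, pc, delta_H, L, O_H, pm):
    # E[m(H, t+1)] >= m(H,t) * f(H)/f_avg * [1 - pc*delta(H)/(L-1) - O(H)*pm]
    return m_H * (f_H / f_avg) * (1 - pc * delta_H / (L - 1) - O_H * pm)
```

For H1 above (δ = 1, O = 2) with, say, f(H1) = 30, f̄ = 20, L = 6, pc = 1.0, pm = 0.01 and 2 current members, the bound is 2 × 1.5 × (1 − 0.2 − 0.02) = 2.34, so the schema is expected to grow.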
Limitations of Binary Coded GA

• Unable to yield arbitrary precision in the solution → Real Coded GA

• Hamming Cliff problem → creates an artificial hindrance to the gradual search of GA →
Gray Coded GA

  14 : 01110 → 15 : 01111  (1 bit change)
  15 : 01111 → 16 : 10000  (5 bit changes)
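Gray coding removes the Hamming cliff because consecutive integers always differ in exactly one bit; the standard conversion is a sketch like:

```python
def binary_to_gray(n):
    # reflected binary (Gray) code of a non-negative integer
    return n ^ (n >> 1)
```

`binary_to_gray(15)` is 0b01000 and `binary_to_gray(16)` is 0b11000, which differ in a single bit, unlike the 5-bit cliff between 01111 and 10000.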
Real Coded GA:

Chromosome: [ x1  x2  x3  x4  x5  x6 ]  (real-valued genes)

 Selection: Same as Binary Coded GA

 Crossover: Single point

Linear Crossover:
Ch1 = 0.5·Pr1 + 0.5·Pr2
Ch2 = 1.5·Pr1 − 0.5·Pr2
Ch3 = −0.5·Pr1 + 1.5·Pr2
(the best two of the three children are kept)

Equivalently, C2, C3 = (P1 + P2)/2 ± (P1 − P2)

(Figure: parents p1, p2 and children c1, c2, c3 in the (x1, x2) plane; c1 lies midway
between the parents, while c2 and c3 lie on the same line outside them)
Blend Crossover

Ch1 = (1 − γ)·Pr1 + γ·Pr2

Ch2 = γ·Pr1 + (1 − γ)·Pr2

γ = 2r − 0.5, where r is a uniform random number in [0, 1]

i.e. γ is a uniform random number in [−0.5, 1.5]
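Linear and blend crossover in code (a sketch; for linear crossover the choice of the best two of the three children is left to the caller):

```python
import random

def linear_crossover(p1, p2):
    # returns the three candidate children; keep the best two by fitness
    pairs = list(zip(p1, p2))
    return ([0.5 * a + 0.5 * b for a, b in pairs],
            [1.5 * a - 0.5 * b for a, b in pairs],
            [-0.5 * a + 1.5 * b for a, b in pairs])

def blend_crossover(p1, p2, rng=random):
    g = 2 * rng.random() - 0.5          # gamma uniform in [-0.5, 1.5]
    return ([(1 - g) * a + g * b for a, b in zip(p1, p2)],
            [g * a + (1 - g) * b for a, b in zip(p1, p2)])
```

For the mating pair (4, 0) and (3, 1) from the exercise below, linear crossover gives (3.5, 0.5), (4.5, −0.5) and (2.5, 1.5).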

Simulated Binary Crossover

 Mutation: Replace a value with a random value in the entire range or in a random
neighbourhood (the neighbourhood may shrink as the generation count increases)

• Mutation probability in RCGA is more than that in BCGA:

Let m be the substring length for a variable x1 in BCGA.

Probability that this variable survives mutation is (1 − P_m)^m ≈ 1 − m·P_m

Hence, 1 − P_m^R = 1 − m·P_m^B

⟹ P_m^R = m·P_m^B
Numerical Example:

maximize f(x1, x2) = −x1² − 2x2² − x1x2 ;  −5 ≤ x1, x2 ≤ 5.0

Assume the mating pool to be (4, 0); (−2, 2); (3, 1); (3, 1)

Consider mating pairs as 1-3 and 2-4.
Obtain the next generation by linear crossover.
Assume Pc = 1.0 and Pm = 0.

Answer:

(3.5, 0.5); (2.5, 1.5); (0.5, 1.5); (-4.5, 2.5)


Constraint Handling in GA (also applicable to PSO)

optimize f(x)
subject to
  g_j(x) ≤ 0 , j = 1, 2, …, m
  h_k(x) = 0 , k = 1, 2, …, p
  x = [x1 x2 … xn]^T
  x_min ≤ x ≤ x_max

Let m + p = q functional constraints:
  Φ_k(x) , k = 1, 2, …, q

Penalty Function Approach

Fitness function of the i-th solution:

  F_i(X) = f_i(X) ± P_i   (+ for minimization problems)

where P_i indicates the penalty, given by

  P_i = C · Σ_{k=1}^{q} Φ_k²(X_i)

C indicates the penalty coefficient


Examples:

Static Penalty

Fitness of the i-th solution:

  F_i(X) = f_i(X) + Σ_{k=1}^{q} C_{k,r} · Φ_k²(X_i)

where C_{k,r} : coefficient for the r-th level of violation of the k-th constraint

(the amount of violation is divided into various pre-defined levels)

Dynamic Penalty

Fitness  F_i(X) = f_i(X) + (C·t)^α · Σ_{k=1}^{q} Φ_k^β(X_i)

where C, α, β are user-defined constants and t = number of generations

→ Penalty increasing with generation number (pressurizing GA)


Adaptive Penalty

Fitness  F_i(X) = f_i(X) + λ(t) · Σ_{k=1}^{q} Φ_k²(X_i)

where t : number of generations and

  λ(t+1) = (1/β1)·λ(t) , if the best solutions of the last N_f generations were all feasible
  λ(t+1) = β2·λ(t)     , if they were all infeasible
  λ(t+1) = λ(t)        , if neither    (where β1 ≠ β2 and β1, β2 > 1)
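The basic and dynamic penalty variants can be sketched as follows (all constants are illustrative defaults of mine, not values from the slides; `violations` holds the constraint violation amounts Φ_k for one solution):

```python
def penalized_fitness(f_val, violations, C=100.0):
    # basic penalty approach: F = f + C * sum(phi_k^2)  ('+' for minimization)
    return f_val + C * sum(v**2 for v in violations)

def dynamic_penalized_fitness(f_val, violations, t, C=0.5, alpha=2.0, beta=2.0):
    # penalty grows with generation number t, pressurizing GA toward feasibility
    return f_val + (C * t)**alpha * sum(abs(v)**beta for v in violations)
```

A feasible solution (all violations zero) keeps its raw objective value, while infeasible ones are pushed up (for minimization) by an amount that grows with the violation and, in the dynamic variant, with the generation count.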
