
Nontraditional Optimization: GA

• To find a global minimum, users normally try a heuristic approach where several local
minima are found by repeated trials with different starting values or by using different
techniques

• The smallest of all known local minima is then assumed to be the global minimum. This is
certainly not very reliable.

• Traditional techniques are mostly derivative based


• Features of Nontraditional Techniques

• Do not require derivative information (derivative free methods)

• Search wide domains of the objective/cost surface

• Can efficiently deal with a large number of variables

• Suitable for extremely complex cost surfaces; multiobjective problems

• Suggests other possible sub-optimal solutions which may be more suitable for implementation

• Highly flexible and versatile


• Computationally expensive (more suitable for off-line applications)

• Intuitive (no strong mathematical grounding; black-box nature) and inspired by some
natural/biological process

• Stochastic in nature (relying heavily on random number generation)

• Large number of hyperparameters (problem-specific tuning)


• Some Nontraditional Optimization Techniques

• Genetic Algorithm (GA)

• Differential Evolution (DE)

• Simulated Annealing (SA)

• Particle Swarm Optimization (PSO)

• Ant Colony Optimization (ACO)

• Artificial Bee Colony (ABC)


• Artificial Immune System (AIS)

• Grey Wolf Optimizer (GWO)

• Firefly Algorithm (FA)

• Bacterial Foraging Optimization (BFO)

• Cuckoo Search Algorithm (CSA)

• Harmony Search (HS)

and the list goes on …


Genetic Algorithm (GA)

• Introduced by John Holland in 1975 (University of Michigan) and
popularized by his student David Goldberg

• Philosophically, GA is based on Darwin’s theory of survival of the fittest

• Basic idea is to start from a randomly chosen set of probable solutions, then allow
intermixing of these candidate solutions over many generations and gradually evolve
towards the optimal solution

Steps for Binary Coded GA (BCGA)

• In principle, suitable for maximization problems

• Let us first consider the unconstrained optimization problem

• find x = [x1, x2, …, xn]^T to maximize f(x)
  where x_i^min ≤ x_i ≤ x_i^max, i = 1, 2, …, n
  i.e. x ∈ Ω ⊂ R^n

• Flow of one run (Fitness Evaluation through Decoding iterated until termination):

  Initialization → Encoding → Fitness Evaluation → Selection → Crossover → Mutation →
  Decoding → (back to Fitness Evaluation)
i) Initialization

• A population size N is chosen and then N random points are picked from
the phenotype space (search space) Ω

• N is even and usually a few tens to a few hundreds (not too high!)

ii) Encoding

• The N points from the phenotype space are mapped to N corresponding points in the
genotype space

• Each variable 𝑥𝑖 is represented by a binary substring of length 𝑙𝑖

• Such binary substrings are concatenated for all i=1,2,…n

• Performed for all N points


• l_i depends on the desired accuracy level ε_i of x_i:

  ε_i = (x_i^max − x_i^min) / (2^l_i − 1)

  l_i = ceiling[ log2( 1 + (x_i^max − x_i^min) / ε_i ) ]

• Mapping from the range [x_i^min, x_i^max] to [0, 2^l_i − 1] is done as follows:

  x_i^d = round( (x_i − x_i^min) / (x_i^max − x_i^min) × (2^l_i − 1) )   or   x_i = x_i^min + ε_i′ · x_i^d

  (ε_i′ is the actual resolution)

• x_i^d is then converted to the binary substring D_i
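The encoding and decoding maps above can be sketched in Python (a minimal illustration; the function names are mine, not from the slides):

```python
import math

def substring_length(x_min, x_max, eps):
    # l_i = ceiling[ log2(1 + (x_max - x_min)/eps) ]
    return math.ceil(math.log2(1 + (x_max - x_min) / eps))

def encode(x, x_min, x_max, l):
    # map [x_min, x_max] -> {0, ..., 2^l - 1}, then to an l-bit substring
    xd = round((x - x_min) / (x_max - x_min) * (2**l - 1))
    return format(xd, '0{}b'.format(l))

def decode(bits, x_min, x_max):
    # x = x_min + eps' * x_d, with eps' the actual resolution
    l = len(bits)
    eps_actual = (x_max - x_min) / (2**l - 1)
    return x_min + int(bits, 2) * eps_actual
```

For the worked example that follows, `substring_length(-2, 2, 0.2)` gives 5 and `encode(-0.85, -2, 2, 5)` gives `'01001'`.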


Numerical Example:

• Two design variables x1 ∈ [−2, 2] and x2 ∈ [1, 12]

• Represent x = [−0.85, 4.21]^T as a binary string or chromosome

• Desired accuracy level of 0.2 for both x1 and x2

Solution:

resolution for x1: ε1 = 0.2, so

l1 = ceiling[ log2( 1 + (2 − (−2)) / 0.2 ) ] = ceiling[ log2(21) ] = 5

similarly, for x2:

l2 = ceiling[ log2( 1 + (12 − 1) / 0.2 ) ] = ceiling[ log2(56) ] = 6

Now, x1 = x1^min + x1^d · ε1′, where ε1′ is the actual resolution of x1:

ε1′ = (x1^max − x1^min) / (2^l1 − 1) = 4/31 ≈ 0.13

∴ −0.85 = −2 + 0.13 × x1^d

⟹ x1^d = 8.9 ≈ 9

Hence, the binary representation of −0.85 is '01001'

similarly,

x2 = x2^min + x2^d · ε2′, where ε2′ is the actual resolution of x2:

ε2′ = (x2^max − x2^min) / (2^l2 − 1) = 11/63 ≈ 0.1746

∴ 4.21 = 1 + 0.1746 × x2^d

⟹ x2^d = 18.38 ≈ 18

Hence, the binary representation of 4.21 is '010010'

Hence, the chromosome '01001010010' maps the point x = [−0.85, 4.21]^T
4.21
iii) Decoding and Fitness Evaluation

• Next the fitness values at all the N points in the genotype space are
computed

• Extract the decimal integer x_i^d from the GA string and then map back to the
phenotype space as

  x_i = x_i^min + ε_i′ · x_i^d

• Compute the objective function value once all x_i (i = 1, 2, …, n) are obtained for a
particular chromosome

• This is usually taken as the fitness value of that chromosome


Exercise:

Let an optimization problem have two design variables x1 and x2, both varying in the
range [−5, 5]. If each is represented by a 6-bit binary substring, what is the point in
the phenotype space corresponding to the chromosome 100110 011100?

Answer: x ≈ (1.03, −0.56)
iv) Selection

A mating pool (size N) is selected from the population

(a) Proportionate selection/Roulette-Wheel selection

• Probability of getting selected into the mating pool ∝ fitness:

  P_i = f_i / Σ_{i=1}^{N} f_i

• Implemented with the help of a Roulette Wheel whose i-th slot is proportional to f_i

• Balancing Population Diversity and Selection Pressure (Exploration/Exploitation balance)
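Roulette-wheel selection can be sketched as follows (a minimal version; the `rng` argument is mine, added only so the spins can be made reproducible):

```python
import random

def roulette_wheel_select(population, fitness, rng=random):
    # P_i = f_i / sum(f); spin the wheel once per mating-pool slot
    total = sum(fitness)
    cum, wheel = 0.0, []
    for f in fitness:
        cum += f / total
        wheel.append(cum)
    pool = []
    for _ in range(len(population)):
        r = rng.random()
        for member, c in zip(population, wheel):
            if r <= c:
                pool.append(member)
                break
    return pool
```

With the fitness values and random numbers of the worked example later in these notes (fitness 16, 25, 71.25, 6, 30, 28 and spins 0.27, 0.72, 0.11, 0.68, 0.91, 0.50), this reproduces the mating pool members 3, 5, 2, 5, 6, 3.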


(b) Ranking Selection

• Least fit string is given a rank of 1, next one rank 2 etc.

• Thus the fittest string gets the highest rank i.e. N

• Then Roulette-Wheel selection is performed based on the ranks (rather than on the fitness)

(Figures: Roulette wheels for proportionate vs. ranking selection; proportionate slots are
proportional to fitness, ranking slots to rank)
(c) Tournament Selection

• A small number (2-3) of chromosomes is randomly picked from the population and the
fittest of them is put in the mating pool

• All the candidates are then returned to the original population

• The process is repeated N times

• The least fit candidate of the population never wins a tournament and is thus eliminated
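A sketch of tournament selection under these rules (function name and `rng` plumbing are my own):

```python
import random

def tournament_select(population, fitness, k=2, rng=random):
    # N tournaments; each picks k random candidates and copies the
    # fittest into the mating pool, then returns all candidates
    N = len(population)
    pool = []
    for _ in range(N):
        candidates = rng.sample(range(N), k)
        winner = max(candidates, key=lambda i: fitness[i])
        pool.append(population[winner])
    return pool
```

Note that with k = 2 the least fit member can never win a tournament, which is how it gets eliminated from the mating pool.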


v) Crossover

• Crossover operation forms children chromosomes for the next generation by using parent
chromosomes from the mating pool of the current generation

• N/2 mating pairs are selected at random

• With some crossover probability (p_c close to 1.0), the parent chromosomes swap some
portions of themselves to produce two children chromosomes

• In single-point crossover a crossover site is chosen at random and the portions on the
right side of the crossover site are interchanged
• Single-point Crossover
0101101011 1001101001 Parents
0011010010 1110100101

0101101011 1110100101
Children
0011010010 1001101001

• Two-point Crossover
101011 10011 010010100 Parents
010010 11101 000110011

101011 11101 010010100
Children
010010 10011 000110011
• Uniform Crossover

 At each bit position a coin is tossed to decide whether the bits will be interchanged

 If ‘head’ appears then there will be a swapping, else not

10110000111001101110
Parents
01110110001100010100

Let us assume that the 2nd, 4th, 5th, 8th, 9th, 12th, 15th, 18th and 20th
bit positions are selected for swapping

11110000011101001110
00110110101000110100 Children

Found to perform better for large search spaces
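The single-point and uniform operators above can be sketched as follows (names are mine; the `rng` argument only makes the coin tosses reproducible):

```python
import random

def single_point_crossover(p1, p2, site):
    # interchange the portions to the right of the crossover site
    return p1[:site] + p2[site:], p2[:site] + p1[site:]

def uniform_crossover(p1, p2, rng=random):
    # toss a coin at every bit position; on 'heads' the bits are swapped
    c1, c2 = [], []
    for b1, b2 in zip(p1, p2):
        if rng.random() < 0.5:
            b1, b2 = b2, b1
        c1.append(b1)
        c2.append(b2)
    return ''.join(c1), ''.join(c2)
```

`single_point_crossover('01011010111001101001', '00110100101110100101', 10)` reproduces the single-point children shown above.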


• Crossover operation introduces randomness into the current population to help avoid
getting trapped in local minima

• Crossover operation may result in better or worse strings. If a few worse offspring are
formed, they will not survive for long, since upcoming generations are likely to
eliminate them

• What if all the new offspring are worse? To avoid such situations, go for an ELITIST
selection
vi) Mutation

• After crossover, a small fraction (decided by a small mutation probability p_m) of the
N members are made to undergo Mutation

• Single-point mutation: a mutation site is chosen at random and the corresponding bit
is flipped

• Bit-wise mutation: each bit of all the N chromosomes is flipped with the mutation
probability p_m


• Helps a great deal to avoid getting stuck in local minima

• Too low p_m → local-minima problem; too high p_m → too much randomization
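Bit-wise mutation is nearly a one-liner (a sketch; p_m = 0 and p_m = 1 are shown below only as sanity checks of the two extremes):

```python
import random

def bitwise_mutation(chromosome, pm, rng=random):
    # every bit of the string flips independently with probability pm
    return ''.join('10'[int(b)] if rng.random() < pm else b for b in chromosome)
```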


Termination criteria

- Maximum number of Generations, say G = 100

- At least n2 generations are complete and, over at least the last n1 generations, the
best solution has not changed by more than a small ε

- Best solution of each generation is saved


Elitism

– Objective is to speed up convergence (best solutions are not lost from the population)

– A few best chromosomes of the present generation (Elite Count (EC) = 2, say) are
directly sent to the next generation; others go through the normal process of selection,
crossover and mutation
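Putting steps (i) to (vi) together with the generation limit, one possible minimal loop looks like the sketch below (my own composition, assuming positive fitness values so the roulette-wheel weights are valid; elitism is simplified to tracking the best-so-far string):

```python
import random

def binary_ga(fitness, n_bits, N=20, G=100, pc=0.9, pm=0.01, seed=None):
    # Minimal BCGA: maximizes `fitness` over bit-strings of length n_bits.
    # Assumes fitness(chromosome) > 0 so the roulette-wheel weights are valid.
    rng = random.Random(seed)
    pop = [''.join(rng.choice('01') for _ in range(n_bits)) for _ in range(N)]
    best = max(pop, key=fitness)                     # best-so-far is saved
    for _ in range(G):
        f = [fitness(c) for c in pop]
        pool = rng.choices(pop, weights=f, k=N)      # (iv) roulette wheel
        children = []
        for i in range(0, N, 2):                     # (v) single-point crossover
            p1, p2 = pool[i], pool[i + 1]
            if rng.random() < pc:
                site = rng.randrange(1, n_bits)
                p1, p2 = p1[:site] + p2[site:], p2[:site] + p1[site:]
            children += [p1, p2]
        pop = [''.join('10'[int(b)] if rng.random() < pm else b for b in c)
               for c in children]                    # (vi) bit-wise mutation
        best = max(pop + [best], key=fitness)
    return best
```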
Example:
maximize f(x1, x2) = x1x2 where 1.0 ≤ x1, x2 ≤ 10.0

Initial population is randomly chosen as,

(2, 8); (5, 5); (9.5, 7.5); (3, 2); (10, 3); (4, 7)

Random numbers generated for Roulette wheel selection are,

0.27, 0.72, 0.11, 0.68, 0.91, 0.50.

Compute the next generation with the fitness values. Assume the following:

string length of each variable is 4

single point crossover with site 3 and 5 alternately

mating pairs as 1st and 4th members; 2nd and 5th members etc.

P_c = 1.0; P_m = 0.0 and EC = 0.


Resolution for each variable:

ε′ = (10 − 1) / (2^4 − 1) = 0.6

Now, x1 = x1^min + x1^d · ε′

2 = 1 + 0.6 × x1^d
x1^d = 1.66 ≈ 2 = '0010'

Similarly, 8 = 1 + 0.6 × x2^d
x2^d = 11.66 ≈ 12 = '1100'


GEN #1          GEN #1        Fitness       Select.   Mating
(Phenotype      (Genotype     Value         Prob.     Pool
Space)          Space)        (f = x1·x2)

1. (2, 8)       0010 1100     16.00         0.09      1110 1011 (3)
2. (5, 5)       0111 0111     25.00         0.14      1111 0011 (5)
3. (9.5, 7.5)   1110 1011     71.25         0.40      0111 0111 (2)
4. (3, 2)       0011 0010     06.00         0.03      1111 0011 (5)
5. (10, 3)      1111 0011     30.00         0.17      0101 1010 (6)
6. (4, 7)       0101 1010     28.00         0.16      1110 1011 (3)
                              ------
                              176.25
Decoding a child, e.g. the first one ('1111 0011'):

Now, x1 = x1^min + x1^d · ε′ = 1 + 15 × 0.6 = 10.0

and x2 = x2^min + x2^d · ε′ = 1 + 3 × 0.6 = 2.8

Mating       GEN #2        GEN #2         Fitness
Pool         (Genotype     (Phenotype     Value
             Space)        Space)         (f = x1·x2)

1110 1011    1111 0011     (10, 2.8)      28.00
1111 0011    1110 1011     (9.4, 7.6)     71.44
0111 0111    1111 0010     (10, 2.2)      22.00
1111 0011    0101 1011     (4, 7.6)       30.40
0101 1010    0110 1011     (4.6, 7.6)     34.96
1110 1011    1111 0111     (10, 5.2)      52.00
                                          ------
                                          238.80
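The decoded second generation can be checked quickly in code (a verification sketch of the table above; `decode4` is my own helper for the 4-bit, [1, 10] encoding):

```python
def decode4(bits, x_min=1.0, eps=0.6):
    # x = x_min + x_d * eps'  with  eps' = (10 - 1)/(2**4 - 1) = 0.6
    return x_min + int(bits, 2) * eps

gen2 = ['11110011', '11101011', '11110010', '01011011', '01101011', '11110111']
points = [(decode4(c[:4]), decode4(c[4:])) for c in gen2]
fitness = [round(x1 * x2, 2) for x1, x2 in points]
# fitness values 28.0, 71.44, 22.0, 30.4, 34.96, 52.0; total 238.8
```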
Example:

f = −10 cos(x1) cos(x2) · exp( −[ (x1 − 1)²/4 + (x2 − 2)² ] )

−5.0 ≤ x1 ≤ 5.0 ; −5.0 ≤ x2 ≤ 5.0

x* = (2.51, 2.48)  [(2.49, 2.43)]      f* = −2.8613 (−2.8733)

• N = 20 (40)

• Pc = 0.9

• Pm = 0.09

• EC = 0

• Chromosome length = 10 + 10

• Uniform crossover

(Plots: x1 (blue) and x2 (red) vs. generation, converging within about 30 generations;
contour plot of f over [−5, 5] × [−5, 5])
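The test function can be checked against the reported optimum (a numerical sanity check; the formula layout is garbled in this extraction, so the exponent grouping is my reading of the slide):

```python
import math

def f(x1, x2):
    # f = -10 cos(x1) cos(x2) exp(-[(x1 - 1)^2/4 + (x2 - 2)^2])
    return (-10 * math.cos(x1) * math.cos(x2)
            * math.exp(-((x1 - 1)**2 / 4 + (x2 - 2)**2)))

# at the reported optimum x* = (2.51, 2.48) this gives roughly -2.86
```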
Schema Theorem for Binary Coded GA

• Proposed by Prof. John Holland

• An attempt to give GA a mathematical foundation

Let us consider a population of binary-strings created at random

110011
010100

110111
110000

Let us assume the following two schemata (templates):


H1: * 1 0 * * *
H2: * 1 0 * 0 0 ( * could be either 1 or 0)
H1: * 1 0 * * *
H2: * 1 0 * 0 0

• Order of a schema, O(H): number of fixed positions (bits) present in the schema

  For example: O(H1) = 2; O(H2) = 4

• Defining length of a schema, δ(H): distance between the first and the last fixed
positions in the string

  For example: δ(H1) = 3 − 2 = 1; δ(H2) = 6 − 2 = 4
• Effect of Selection:

Let m(H, t) = number of strings belonging to schema H at the t-th Gen.

m(H, t+1) = number of strings belonging to schema H at the (t+1)-th Gen.

E[ m(H, t+1) ] = m(H, t) · f(H) / f̄

f(H) = schema fitness, i.e. average fitness of the strings represented by schema H

f̄ = average fitness of the entire population at the t-th Gen.

• Effect of Crossover (Single-point):

Let p_c = probability of crossover and L = string length

A schema is destroyed if the crossover site falls within its defining length:

Probability of destruction = p_c · δ(H) / (L − 1)

Probability of survival = 1 − p_c · δ(H) / (L − 1)
• Effect of Mutation (Bit-wise Mutation):

To protect a schema, mutation should not occur at any of its fixed bits

Let p_m = probability of mutation

Probability of destruction of a single bit = p_m
Probability of survival of a single bit = 1 − p_m

Probability of survival of the whole schema:

p_s = (1 − p_m)(1 − p_m) ··· (O(H) factors)
    = (1 − p_m)^O(H)
    ≈ 1 − O(H)·p_m    since p_m ≪ 1
Considering the contributions of all three operators (neglecting the 2nd-order term):

E[ m(H, t+1) ] ≥ m(H, t) · ( f(H) / f̄ ) · [ 1 − p_c · δ(H)/(L − 1) − O(H)·p_m ]

Building-Block Hypothesis:

The schemata having low order, short defining length and fitness considerably more than
the average fitness of the population will have more and more representations in future
generations
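The schema-growth bound can be evaluated numerically (a sketch; the example numbers in the note below are mine, not from the slides):

```python
def expected_schema_count(m_H, f_H, f_avg, pc, delta_H, L, O_H, pm):
    # E[m(H, t+1)] >= m(H,t) * f(H)/f_avg * [1 - pc*delta(H)/(L-1) - O(H)*pm]
    return m_H * (f_H / f_avg) * (1 - pc * delta_H / (L - 1) - O_H * pm)
```

For H1 above (δ = 1, O = 2) with, say, f(H1) = 30, f̄ = 20, L = 6, pc = 1.0, pm = 0.01 and 2 current members, the bound is 2 × 1.5 × (1 − 0.2 − 0.02) = 2.34, so the schema is expected to grow.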
Limitations of Binary Coded GA

• Unable to yield arbitrary precision in the solution → Real Coded GA

• Hamming Cliff problem → creates an artificial hindrance to the gradual search of GA →
Gray Coded GA

  14 : 01110 → 15 : 01111  (1 bit change)
  15 : 01111 → 16 : 10000  (5 bit changes)
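Gray coding removes the Hamming cliff because consecutive integers always differ in exactly one bit; the standard conversion is a sketch like:

```python
def binary_to_gray(n):
    # reflected binary (Gray) code of a non-negative integer
    return n ^ (n >> 1)
```

`binary_to_gray(15)` is 0b01000 and `binary_to_gray(16)` is 0b11000, which differ in a single bit, unlike the 5-bit cliff between 01111 and 10000.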
Real Coded GA:

Chromosome: [ x1  x2  x3  x4  x5  x6 ]  (real-valued genes)

 Selection: Same as Binary Coded GA

 Crossover: Single point

Linear Crossover:
Ch1 = 0.5·Pr1 + 0.5·Pr2
Ch2 = 1.5·Pr1 − 0.5·Pr2
Ch3 = −0.5·Pr1 + 1.5·Pr2
(the best two of the three children are kept)

Equivalently, C2, C3 = (P1 + P2)/2 ± (P1 − P2)

(Figure: parents p1, p2 and children c1, c2, c3 in the (x1, x2) plane; c1 lies midway
between the parents, while c2 and c3 lie on the same line outside them)
Blend Crossover

Ch1 = (1 − γ)·Pr1 + γ·Pr2

Ch2 = γ·Pr1 + (1 − γ)·Pr2

γ = 2r − 0.5, where r is a uniform random number in [0, 1]

i.e. γ is a uniform random number in [−0.5, 1.5]
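Linear and blend crossover in code (a sketch; for linear crossover the choice of the best two of the three children is left to the caller):

```python
import random

def linear_crossover(p1, p2):
    # returns the three candidate children; keep the best two by fitness
    pairs = list(zip(p1, p2))
    return ([0.5 * a + 0.5 * b for a, b in pairs],
            [1.5 * a - 0.5 * b for a, b in pairs],
            [-0.5 * a + 1.5 * b for a, b in pairs])

def blend_crossover(p1, p2, rng=random):
    g = 2 * rng.random() - 0.5          # gamma uniform in [-0.5, 1.5]
    return ([(1 - g) * a + g * b for a, b in zip(p1, p2)],
            [g * a + (1 - g) * b for a, b in zip(p1, p2)])
```

For the mating pair (4, 0) and (3, 1) from the exercise below, linear crossover gives (3.5, 0.5), (4.5, −0.5) and (2.5, 1.5).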

Simulated Binary Crossover

 Mutation: Replace a value with a random value in the entire range or in a random
neighbourhood (the neighbourhood may shrink as the generation count increases)

• Mutation probability in RCGA is more than that in BCGA:

Let m be the substring length for a variable x1 in BCGA.

Probability that this variable survives mutation is (1 − P_m)^m ≈ 1 − m·P_m

Hence, 1 − P_m^R = 1 − m·P_m^B

⟹ P_m^R = m·P_m^B
Numerical Example:

maximize f(x1, x2) = −x1² − 2x2² − x1x2 ;  −5 ≤ x1, x2 ≤ 5.0

Assume the mating pool to be (4, 0); (−2, 2); (3, 1); (3, 1)

Consider mating pairs as 1-3 and 2-4.
Obtain the next generation by linear crossover.
Assume Pc = 1.0 and Pm = 0.

Answer:

(3.5, 0.5); (2.5, 1.5); (0.5, 1.5); (-4.5, 2.5)


Constraint Handling in GA (also applicable to PSO)

optimize f(x)
subject to
  g_j(x) ≤ 0 , j = 1, 2, …, m
  h_k(x) = 0 , k = 1, 2, …, p
  x = [x1 x2 … xn]^T
  x_min ≤ x ≤ x_max

Let m + p = q functional constraints:
  Φ_k(x) , k = 1, 2, …, q

Penalty Function Approach

Fitness function of the i-th solution:

  F_i(X) = f_i(X) ± P_i   (+ for minimization problems)

where P_i indicates the penalty, given by

  P_i = C · Σ_{k=1}^{q} Φ_k²(X_i)

C indicates the penalty coefficient


Examples:

Static Penalty

Fitness of the i-th solution:

  F_i(X) = f_i(X) + Σ_{k=1}^{q} C_{k,r} · Φ_k²(X_i)

where C_{k,r} : coefficient for the r-th level of violation of the k-th constraint

(the amount of violation is divided into various pre-defined levels)

Dynamic Penalty

Fitness  F_i(X) = f_i(X) + (C·t)^α · Σ_{k=1}^{q} Φ_k^β(X_i)

where C, α, β are user-defined constants and t = number of generations

→ Penalty increasing with generation number (pressurizing GA)


Adaptive Penalty

Fitness  F_i(X) = f_i(X) + λ(t) · Σ_{k=1}^{q} Φ_k²(X_i)

where t : number of generations and

  λ(t+1) = (1/β1)·λ(t) , if the best solutions of the last N_f generations were all feasible
  λ(t+1) = β2·λ(t)     , if they were all infeasible
  λ(t+1) = λ(t)        , if neither    (where β1 ≠ β2 and β1, β2 > 1)
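The basic and dynamic penalty variants can be sketched as follows (all constants are illustrative defaults of mine, not values from the slides; `violations` holds the constraint violation amounts Φ_k for one solution):

```python
def penalized_fitness(f_val, violations, C=100.0):
    # basic penalty approach: F = f + C * sum(phi_k^2)  ('+' for minimization)
    return f_val + C * sum(v**2 for v in violations)

def dynamic_penalized_fitness(f_val, violations, t, C=0.5, alpha=2.0, beta=2.0):
    # penalty grows with generation number t, pressurizing GA toward feasibility
    return f_val + (C * t)**alpha * sum(abs(v)**beta for v in violations)
```

A feasible solution (all violations zero) keeps its raw objective value, while infeasible ones are pushed up (for minimization) by an amount that grows with the violation and, in the dynamic variant, with the generation count.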
