Unit 5
DEPARTMENT OF MCA
GENETIC ALGORITHMS
A genetic algorithm is an adaptive heuristic search algorithm inspired by Darwin's theory of evolution in nature. It is used to solve optimization problems in machine learning and is valuable because it can tackle complex problems that would otherwise take a very long time to solve. Genetic algorithms are widely used in real-world applications such as designing electronic circuits, code-breaking, image processing, and artificial creativity.
3. Selection: The selection phase involves choosing individuals for the reproduction of offspring. The selected individuals are then arranged in pairs, and each pair transfers its genes to the next generation. Three common selection methods are roulette wheel selection, tournament selection, and rank-based selection.
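As an illustrative sketch, roulette wheel selection can be implemented as below; the function name, population, and fitness values are hypothetical examples, not part of any particular library:

```python
import random

def roulette_wheel_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = random.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]  # fallback for floating-point edge cases

# Example: the fittest individual ("C") should be chosen most often
random.seed(0)
pop = ["A", "B", "C"]
fits = [1.0, 1.0, 8.0]
picks = [roulette_wheel_select(pop, fits) for _ in range(1000)]
```

Because "C" holds 80% of the total fitness, it wins roughly 80% of the spins, which is exactly the fitness-proportionate behaviour described above.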
➢ Crossover: Crossover plays the most significant role in the reproduction phase of the genetic algorithm. In this process, a crossover point is selected at random within the genes. The crossover operator then swaps genetic information of two parents from the current generation to produce a new individual representing the offspring. The genes of the parents are exchanged among themselves until the crossover point is reached, and the newly generated offspring are added to the population. This process is also called recombination.
Types of crossover available:
1. One-point crossover
2. Two-point crossover
3. Uniform crossover
4. Multi-point crossover
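A minimal sketch of one-point crossover on binary chromosomes follows; the function name and the example parents are hypothetical, chosen only to make the gene swap visible:

```python
import random

def one_point_crossover(parent1, parent2):
    """Swap the gene segments of two parents after a random crossover point."""
    point = random.randint(1, len(parent1) - 1)  # cut strictly inside the string
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

random.seed(1)
p1 = [0, 0, 0, 0, 0, 0]
p2 = [1, 1, 1, 1, 1, 1]
c1, c2 = one_point_crossover(p1, p2)
```

Each child keeps one parent's prefix and the other parent's suffix, so the genetic material exchanged by the two children is complementary.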
➢ Mutation: The mutation operator inserts random genes into the offspring (new child) to maintain diversity in the population. It can be done by flipping some bits in the chromosome. Mutation helps solve the problem of premature convergence and enhances diversification. Types of mutation available:
1. Flip-bit mutation
2. Gaussian mutation
3. Exchange/Swap mutation
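Flip-bit mutation can be sketched in a few lines; the function name and the mutation probability `pm` used in the example are hypothetical choices for illustration:

```python
import random

def flip_bit_mutation(chromosome, pm=0.1):
    """Flip each bit independently with mutation probability pm."""
    return [1 - bit if random.random() < pm else bit for bit in chromosome]

random.seed(2)
original = [0] * 20
mutant = flip_bit_mutation(original, pm=0.5)
```

With `pm=1.0` every bit flips and with `pm=0.0` none do; a small `pm` (e.g. 0.01 to 0.1) is typical so that diversity is introduced without degenerating into random search.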
5. Termination: After the reproduction phase, a stopping criterion is applied. The algorithm terminates once a threshold fitness is reached, and the best solution in the population is reported as the final solution.
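The whole loop of selection, crossover, mutation, and threshold-based termination can be sketched on the classic OneMax problem (maximize the number of 1-bits). All names and parameter values here are hypothetical illustrations, not a standard implementation:

```python
import random

def run_onemax_ga(n_bits=20, pop_size=30, threshold=1.0, max_gens=200):
    """Minimal GA for OneMax; stops when the threshold fitness is reached."""
    random.seed(42)
    fitness = lambda ind: sum(ind) / len(ind)
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for gen in range(max_gens):
        best = max(pop, key=fitness)
        if fitness(best) >= threshold:           # termination criterion
            return best, gen
        next_pop = []
        while len(next_pop) < pop_size:
            # tournament selection of two parents (tournament size 3)
            p1 = max(random.sample(pop, 3), key=fitness)
            p2 = max(random.sample(pop, 3), key=fitness)
            point = random.randint(1, n_bits - 1)  # one-point crossover
            child = p1[:point] + p2[point:]
            # low-probability flip-bit mutation
            child = [1 - b if random.random() < 0.02 else b for b in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness), max_gens

best, gens = run_onemax_ga()
```

The loop mirrors the phases described above: evaluate, check the stopping criterion, select, recombine, mutate, and replace the population.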
GENETIC OPERATORS
The generation of successors in a GA is determined by a set of operators that recombine and mutate
selected members of the current population. These operators correspond to idealized versions of
the genetic operations found in biological evolution. The two most common operators are
crossover and mutation.
2. Two-Point Crossover: This is a specific case of an N-point crossover technique. Two random points are chosen on the individual chromosomes (strings), and the genetic material between these points is exchanged.
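A sketch of two-point crossover, with hypothetical parents chosen so the exchanged middle segment is easy to see:

```python
import random

def two_point_crossover(parent1, parent2):
    """Exchange the gene segment between two random cut points."""
    a, b = sorted(random.sample(range(1, len(parent1)), 2))
    child1 = parent1[:a] + parent2[a:b] + parent1[b:]
    child2 = parent2[:a] + parent1[a:b] + parent2[b:]
    return child1, child2

random.seed(3)
p1 = list("AAAAAAAA")
p2 = list("BBBBBBBB")
c1, c2 = two_point_crossover(p1, p2)
```

Each child keeps its own parent's ends and inherits the other parent's middle segment, which is exactly the exchange "between the two points" described above.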
The crossover of two good solutions may not always yield a better or equally good solution. However, since the parents are good, the probability of the child being good is high. If an offspring is a poor solution, it will be removed in the next iteration during selection.
II. Mutation: In simple terms, mutation may be defined as a small random tweak in the
chromosome, to get a new solution. It is used to maintain and introduce diversity in the genetic
population and is usually applied with a low probability – pm. If the probability is very high, the GA
gets reduced to a random search.
Mutation is the part of the GA which is related to the “exploration” of the search space. It has been
observed that mutation is essential to the convergence of the GA while crossover is not.
Mutation Operators:
1. In bit flip mutation, we select one or more random bits and flip them. This is used for
binary-encoded GAs.
2. Random Resetting is an extension of the bit flip for the integer representation. In this, a random
value from the set of permissible values is assigned to a randomly chosen gene.
3. In swap mutation, we select two positions on the chromosome at random, and interchange the
values. This is common in permutation-based encodings.
4. Scramble mutation is also popular with permutation representations. In this, from the entire
chromosome, a subset of genes is chosen and their values are scrambled or shuffled randomly.
5. In inversion mutation, we select a subset of genes like in scramble mutation, but instead of
shuffling the subset, we merely invert the entire string in the subset.
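Two of these permutation-friendly operators, swap and inversion mutation, can be sketched as follows; the function names and the example tour are hypothetical:

```python
import random

def swap_mutation(perm):
    """Interchange the values at two random positions."""
    i, j = random.sample(range(len(perm)), 2)
    perm = perm[:]                     # copy so the input is not modified
    perm[i], perm[j] = perm[j], perm[i]
    return perm

def inversion_mutation(perm):
    """Reverse a randomly chosen contiguous subset of genes."""
    i, j = sorted(random.sample(range(len(perm)), 2))
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]

random.seed(4)
tour = [0, 1, 2, 3, 4, 5, 6, 7]
mutated = inversion_mutation(swap_mutation(tour))
```

Both operators rearrange genes without adding or removing any, so the result is still a valid permutation, which is why they suit permutation-based encodings such as travelling-salesman tours.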
The probability that a hypothesis will be selected is given by the ratio of its fitness to the fitness of
other members of the current population. This method is sometimes called fitness proportionate
selection, or roulette wheel selection. Other methods for using fitness to select hypotheses have
also been proposed. In tournament selection, two hypotheses are first chosen at random from the
current population. With some predefined probability ‘p’ the more fit of these two is then selected,
and with probability (1 - p) the less fit hypothesis is selected. Tournament selection often yields a
more diverse population than fitness proportionate selection. In another method called rank
selection, the hypotheses in the current population are first sorted by fitness. The probability that
a hypothesis will be selected is then proportional to its rank in this sorted list, rather than its fitness.
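Tournament selection with the predefined probability p can be sketched as below; the function name and example fitness (the value itself) are hypothetical:

```python
import random

def tournament_select(population, fitness, p=0.8):
    """Pick two at random; return the fitter with probability p, else the weaker."""
    a, b = random.sample(population, 2)
    better, worse = (a, b) if fitness(a) >= fitness(b) else (b, a)
    return better if random.random() < p else worse

random.seed(5)
pop = list(range(10))            # fitness of each individual is its own value
wins = [tournament_select(pop, fitness=lambda x: x) for _ in range(1000)]
avg = sum(wins) / len(wins)
```

The average winner exceeds the population mean of 4.5 because the fitter contestant wins 80% of the time, yet weak individuals still win occasionally, which keeps the population diverse.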
To increase the efficiency of such compositions, it has been proposed to combine the trees designed by the self-configuring genetic programming algorithm into an ensemble.
Decision trees
The decision tree is a method based on applying various separating functions, in particular simple threshold rules, to the initial data set.
Decision trees are an effective method, especially in demand for tasks where the results must be interpretable to a layman in the field of data analysis. However, decision tree learning algorithms have a significant drawback: they cannot choose the optimal tree structure. To combat this drawback, it has been proposed to use the genetic programming algorithm.
4. Selection:
➢ Select parent solutions from the population based on their fitness scores.
Techniques like roulette wheel selection, tournament selection, or rank-based
selection can be used.
5. Crossover (Recombination):
➢ Combine parts of two parent solutions to create new offspring. For clustering,
crossover might involve swapping subsets of cluster assignments between two
parent chromosomes.
6. Mutation:
➢ Introduce random changes to offspring solutions to explore the search space. This
could involve randomly reassigning some data points to different clusters.
7. Evolution Process:
➢ Replace less fit solutions in the population with new offspring.
Applications
Genetic algorithm-based clustering is used in various fields, including:
➢ Market segmentation
➢ Image segmentation
➢ Bioinformatics (e.g., gene expression data analysis)
➢ Document clustering
➢ Anomaly detection
Example Workflow
1. Initialize Population: Randomly generate cluster assignments for an initial set of
solutions.
2. Evaluate Fitness: Calculate the fitness of each solution using a clustering metric.
3. Select Parents: Choose solutions based on their fitness scores for reproduction.
4. Crossover and Mutation: Generate new solutions by combining parts of selected parents
and introducing mutations.
5. Update Population: Replace the less fit solutions with new offspring.
6. Iterate: Repeat the evaluation, selection, crossover, mutation, and updating steps for a
predefined number of generations or until convergence.
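Two pieces of this workflow, the fitness evaluation and the mutation step, can be sketched for 2-D points; the function names, the negative within-cluster sum of squares used as fitness, and the toy data are all hypothetical illustrations:

```python
import random

def wcss_fitness(assignment, points, k):
    """Fitness = negative within-cluster sum of squared distances (higher is better)."""
    total = 0.0
    for c in range(k):
        members = [p for p, a in zip(points, assignment) if a == c]
        if not members:
            continue
        cx = sum(m[0] for m in members) / len(members)
        cy = sum(m[1] for m in members) / len(members)
        total += sum((m[0] - cx) ** 2 + (m[1] - cy) ** 2 for m in members)
    return -total

def mutate_assignment(assignment, k, pm=0.1):
    """Randomly reassign some points to a different cluster."""
    return [random.randrange(k) if random.random() < pm else a for a in assignment]

random.seed(6)
points = [(0, 0), (0, 1), (9, 9), (9, 8)]
good = [0, 0, 1, 1]      # each cluster groups neighbouring points
bad = [0, 1, 0, 1]       # each cluster straddles both groups
```

A chromosome here is simply a list of cluster labels, one per data point, so crossover can swap label subsets between parents and mutation reassigns individual points, exactly as steps 5 and 6 describe.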
Genetic algorithm-based clustering can effectively find high-quality clustering solutions, leveraging the global search capabilities and flexibility of genetic algorithms.
Single-objective optimization using GAs focuses on optimizing one goal, such as accuracy or cost, through iterative evolution of solutions. Multi-objective optimization (including the bi-objective case) handles multiple conflicting goals simultaneously, seeking a set of optimal trade-offs. In both cases, GAs provide a robust, flexible framework for exploring complex solution spaces and finding high-quality solutions.
Workflow
1. Chromosome Encoding:
➢ Each chromosome represents a potential solution, typically a set of model
parameters.
➢ For instance, in neural network optimization, each chromosome might encode the
weights and biases of the network.
2. Initial Population:
➢ Generate an initial population of random solutions (parameter sets).
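Steps 1 and 2 can be sketched for the neural-network example: each chromosome is a flat list of all weights and biases, and the initial population is a set of such random lists. The function name, layer sizes, and the uniform initialization range are hypothetical choices:

```python
import random

def random_chromosome(layer_sizes):
    """Flat list of random weights and biases for a feed-forward net, e.g. [2, 3, 1]."""
    n = sum(layer_sizes[i] * layer_sizes[i + 1] + layer_sizes[i + 1]
            for i in range(len(layer_sizes) - 1))
    return [random.uniform(-1, 1) for _ in range(n)]

random.seed(7)
population = [random_chromosome([2, 3, 1]) for _ in range(10)]
```

For a 2-3-1 network this gives 13 genes per chromosome (6 + 3 for the hidden layer, 3 + 1 for the output layer); crossover and mutation then operate on these flat vectors like on any other chromosome.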
While genetic algorithms (GAs) and gradient descent (GD) are fundamentally different, GAs can be adapted to emulate the iterative, improvement-focused nature of GD. By carefully tuning the selection, crossover, and mutation processes, GAs can provide a robust alternative for optimization in scenarios where traditional gradient-based methods are not applicable.
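The idea can be sketched by minimizing a simple function without any gradients, using truncation selection, arithmetic crossover, and Gaussian mutation; all names and parameter values are hypothetical:

```python
import random

def minimize_with_ga(f, lo=-10.0, hi=10.0, pop_size=20, gens=100):
    """Evolve real-valued candidates toward the minimum of f (no gradients needed)."""
    random.seed(8)
    pop = [random.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=f)
        survivors = pop[: pop_size // 2]        # truncation selection: keep the best half
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = random.sample(survivors, 2)
            child = (a + b) / 2                 # arithmetic crossover
            child += random.gauss(0, 0.1)       # Gaussian mutation
            children.append(child)
        pop = survivors + children
    return min(pop, key=f)

best = minimize_with_ga(lambda x: (x - 3) ** 2)
```

The population steadily concentrates around the minimum at x = 3, much as GD iterates toward it, but only function evaluations are required, so the same loop works for non-differentiable or noisy objectives.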