Presented By


Under the Guidance of
Dr. S. Kanmani
Professor & Head
Department of Information Technology
Pondicherry Engineering College

Optimal Distributive Genetic Algorithm for
Mining Association Rules.
To propose and implement a self-adaptive
distributive genetic algorithm for association
rule mining.
Data Mining
Extraction of interesting information or
patterns from data in large databases is known
as data mining.
Association analysis is the discovery of what are
commonly called association rules.
It studies how frequently items occur together in
transactional databases.
Association rule mining provides valuable
information in assessing significant correlations.

Association Rules
Find all the rules X → Y with
minimum support and confidence
Support, s: probability that a
transaction contains X ∪ Y
Confidence, c: conditional
probability that a transaction
having X also contains Y
Let minsup = 50%, minconf = 50%
Freq. Pat.: Milk:3, Nuts:3, Sugar:4, Eggs:3,
{Milk, Sugar}:3
[Figure: Venn diagram of customers who buy milk, buy sugar, or buy both]
ID   Items bought
10   Milk, Nuts, Sugar
20   Milk, Coffee, Sugar
30   Milk, Sugar, Eggs
40   Nuts, Eggs, Bread
     Nuts, Coffee, Sugar, Eggs, Bread
Association rules (support, confidence):
Milk → Sugar (60%, 100%)
Sugar → Milk (60%, 75%)
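The figures in this example can be checked with a few lines of Python. The sketch below is illustrative only: the function names (`count`, `frequent_itemsets`, `rule_metrics`) are hypothetical, and the brute-force enumeration is chosen for readability, not efficiency. It uses the five transactions listed above with minsup = 50%.

```python
from itertools import combinations

# Transactions from the example table (items bought per transaction).
TRANSACTIONS = [
    {"Milk", "Nuts", "Sugar"},
    {"Milk", "Coffee", "Sugar"},
    {"Milk", "Sugar", "Eggs"},
    {"Nuts", "Eggs", "Bread"},
    {"Nuts", "Coffee", "Sugar", "Eggs", "Bread"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, minsup):
    """Brute-force search for itemsets with support >= minsup."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            c = count(set(combo), transactions)
            if c / n >= minsup:
                result[frozenset(combo)] = c
                found = True
        if not found:
            # No frequent itemset of this size, so no larger one can exist.
            break
    return result


def rule_metrics(x, y, transactions):
    """Return (support, confidence) of the rule X -> Y."""
    both = count(x | y, transactions)
    x_only = count(x, transactions)
    if x_only == 0:
        return 0.0, 0.0
    return both / len(transactions), both / x_only


for itemset, c in frequent_itemsets(TRANSACTIONS, 0.5).items():
    print(sorted(itemset), c)
print(rule_metrics({"Milk"}, {"Sugar"}, TRANSACTIONS))  # (0.6, 1.0)
print(rule_metrics({"Sugar"}, {"Milk"}, TRANSACTIONS))  # (0.6, 0.75)
```

The frequent itemsets found are Milk:3, Nuts:3, Sugar:4, Eggs:3 and {Milk, Sugar}:3, matching the frequent patterns listed in the example.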
A Genetic Algorithm (GA) is a procedure used to
find approximate solutions to search problems
through the application of the principles of
evolutionary biology.
Genetic algorithms use biologically inspired
techniques such as genetic inheritance, natural
selection, mutation, and sexual reproduction
(recombination, or crossover).
Conceptual Algorithm
1. [Start] Generate a random population of n chromosomes (suitable
solutions for the problem)
2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population
3. [New population] Create a new population by repeating the following
steps until the new population is complete
A. [Selection] Select two parent chromosomes from the population
according to their fitness (the better the fitness, the bigger the chance to be
selected)
B. [Crossover] With a crossover probability, cross over the parents to
form new offspring (children). If no crossover is performed, the
offspring are exact copies of the parents.
C. [Mutation] With a mutation probability, mutate the new offspring at each
locus (position in the chromosome).
D. [Accepting] Place the new offspring in the new population
4. [Replace] Use the newly generated population for a further run of the algorithm
5. [Test] If the end condition is satisfied, stop and return the best solution
in the current population
6. [Loop] Go to step 2
Steps in Genetic Algorithm
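The conceptual algorithm above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work. The sketch makes several assumptions: bit-string chromosomes, roulette-wheel selection, single-point crossover, a fixed number of generations as the end condition, a non-negative fitness function, and the hypothetical function name `run_ga` with its default parameter values.

```python
import random


def run_ga(fitness, length, pop_size=20, p_cross=0.7, p_mut=0.01,
           generations=50, seed=0):
    """Basic GA over bit strings; `fitness` must return a non-negative number.

    Assumes length >= 2 so that a crossover point exists.
    """
    rng = random.Random(seed)
    # [Start] random population of bit-string chromosomes
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    # [Test]/[Loop] assumed end condition: a fixed number of generations
    for _ in range(generations):
        # [Fitness] evaluate every chromosome
        scores = [fitness(c) for c in population]
        # small offset keeps the weights valid if every score is zero
        weights = [s + 1e-9 for s in scores]
        new_population = []
        # [New population] repeat until the new population is complete
        while len(new_population) < pop_size:
            # [Selection] roulette wheel: chance proportional to fitness
            p1, p2 = rng.choices(population, weights=weights, k=2)
            # [Crossover] single point, applied with probability p_cross
            if rng.random() < p_cross:
                point = rng.randint(1, length - 1)
                c1 = p1[:point] + p2[point:]
                c2 = p2[:point] + p1[point:]
            else:
                c1, c2 = p1[:], p2[:]  # offspring are copies of the parents
            for child in (c1, c2):
                # [Mutation] flip each locus with probability p_mut
                for i in range(length):
                    if rng.random() < p_mut:
                        child[i] = 1 - child[i]
                # [Accepting] place the offspring in the new population
                new_population.append(child)
        # [Replace] the new population is used for the next run
        population = new_population[:pop_size]
    # return the best solution in the current population
    return max(population, key=fitness)


# Toy fitness (an assumption for illustration): count of 1-bits.
best = run_ga(sum, length=16)
print(best, sum(best))
```

The fixed seed makes runs repeatable; the toy fitness can be replaced by any non-negative scoring function, such as one based on the support and confidence of an encoded rule.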
Existing Work
The literature survey yielded the following observations:

I. Genetic Operations
I.I Encoding
Chromosomes are of either fixed length or varying length
Fuzzy rules are encoded as chromosomes.
Natural numbers used for encoding
Gene expressions also encoded
I.II Initial Population
Seeded by random selection (roulette wheel)
Seeded by users
Single rule sets generated
Fuzzy rules represented
I.III Crossover
Locus point of crossover
On same attributes or random attributes
Set crossover rate dynamically
Symbiotic combination

I.IV Mutation
Locus point of mutation
Weight factor taken into consideration for deciding locus point
Dynamic mutation point
Mutation 1 and Mutation 2 generated

I.V Fitness Threshold
Dynamically set
TP, TN, FP, FN criteria considered
Strength of implication taken into consideration
Sustainability index, creditable index and inclusive index considered
Real values of Confidence and Support derived and applied
Predictability and Comprehensibility factors considered.
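One common reading of the TP, TN, FP, FN criteria for a rule X → Y is sketched below: TP counts transactions containing both X and Y, FP those with X but not Y, FN those with Y but not X, and TN those with neither. Support and confidence then follow directly from the counts. The function name `rule_quality`, the weighted-sum fitness, and the default weights are illustrative assumptions, not formulas taken from the surveyed papers.

```python
def rule_quality(tp, fp, fn, tn, w_support=0.5, w_confidence=0.5):
    """Support, confidence and a weighted fitness for a rule X -> Y.

    tp: transactions with X and Y      fp: X but not Y
    fn: Y but not X                    tn: neither X nor Y
    The weights are hypothetical and only show how a fitness could be built.
    """
    total = tp + fp + fn + tn
    support = tp / total if total else 0.0
    confidence = tp / (tp + fp) if (tp + fp) else 0.0
    fitness = w_support * support + w_confidence * confidence
    return support, confidence, fitness


def passes_threshold(fitness, threshold):
    """A rule is kept when its fitness reaches the (possibly dynamic) threshold."""
    return fitness >= threshold


# Counts for Milk -> Sugar in the five-transaction example: TP=3, FP=0, FN=1, TN=1.
support, confidence, fitness = rule_quality(3, 0, 1, 1)
print(support, confidence, round(fitness, 3))
print(passes_threshold(fitness, 0.7))
```

With these counts the support is 60% and the confidence 100%, in agreement with the Milk → Sugar rule in the association rules example.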

II. Methodology
Crossover replaced by symbiotic combination
Rule selection performed by the user, thereby seeding the population
of the next generation
Searching for rules in k-itemsets instead of the whole database
Distributed GA performed
Dynamic immune evolution and biometric mechanism

III. Application Areas

IV. Evaluation Parameters
Population Size
Chromosome Length
Mutation Probability
Crossover Probability
Fitness threshold
Support and Confidence Factor

Open Issues
Mining rules with non-fixed consequents.
Combining with other methods for multi-relational data.
Elimination of redundant rules.
Fixing optimum values for parameters.
Enhancing self-adaptiveness.
Making rule selection dependent on other classes.
Improving the algorithm to generate
simpler rules.
Testing on different domains.
Complexity prediction using distributed computing.
Unsupervised learning.


Proposed Work
To implement a self-adaptive genetic algorithm for
association rule mining with optimal accuracy.

To use an iterative approach that increases the number of
rules extracted in each iteration, as a way to decrease
the learning time.

To propose the Self Adaptive GA in Distributive

Self Adaptive GA
Work Done So Far
Literature survey on genetic algorithms performed, and a
comparative study against other methods completed.
Analysis of an existing rule mining method, Apriori, completed.
Basic genetic algorithm for function optimization coded
in Java.
Comparison framework for genetic algorithms in
association rule mining proposed.
Work to be done
Implement association rule mining with the self-adaptive
genetic algorithm on a medical dataset.
Test the same algorithm on other datasets and
compare with existing methods.
Optimize the results by tuning GA parameters.
Survey distributed algorithms.

Execution Plan for Next Six Months
July-August - Implement the method from an existing paper
August - Test the code with a medical dataset and
perform a comparative study
September - Alter the code for other datasets and compare
the results obtained
October - Alter the GA factors in the code and evaluate
the results
November-December - Feasibility study on the generated code to obtain
the optimum result
Papers Published
Paper titled "Framework for Comparison of Association Rule
Mining Using Genetic Algorithm" has been selected for the
International Conference on Computers, Communication &
Intelligence at VCET, Madurai.

