DC Meet Second

Presented By

K.Indira

Under the Guidance of
Dr. S. Kanmani
Professor & Head
Department of Information Technology
Pondicherry Engineering College


Optimal Distributive Genetic Algorithm for
Mining Association Rules

Objective
To propose and implement a self-adaptive
distributive Genetic Algorithm for association
rule mining.

Data Mining
Extraction of interesting information or
patterns from data in large databases is known
as data mining.

ASSOCIATION ANALYSIS
Association analysis is the discovery of what are
commonly called association rules.
It studies the frequency of items occurring together in
transactional databases.
Association rule mining provides valuable
information for assessing significant correlations.

Association Rules
Find all the rules X → Y with
minimum support and
confidence.
Support, s: probability that a
transaction contains X ∪ Y.
Confidence, c: conditional
probability that a transaction
having X also contains Y.

Tid   Items bought
10    Milk, Nuts, Sugar
20    Milk, Coffee, Sugar
30    Milk, Sugar, Eggs
40    Nuts, Eggs, Bread
50    Nuts, Coffee, Sugar, Eggs, Bread

[Figure: Venn diagram of customers who buy milk, customers who buy sugar, and customers who buy both]

Let minsup = 50%, minconf = 50%
Freq. Pat.: Milk:3, Nuts:3, Sugar:4, Eggs:3,
{Milk, Sugar}:3
Association rules:
Milk → Sugar (60%, 100%)
Sugar → Milk (60%, 75%)
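
The support and confidence figures above can be checked with a short script. This is a minimal sketch in Python; the names `count`, `support` and `confidence` are illustrative choices, not part of the presentation, and the data are the five transactions listed on this slide.

```python
# Transactions from the slide, keyed by Tid.
TRANSACTIONS = {
    10: {"Milk", "Nuts", "Sugar"},
    20: {"Milk", "Coffee", "Sugar"},
    30: {"Milk", "Sugar", "Eggs"},
    40: {"Nuts", "Eggs", "Bread"},
    50: {"Nuts", "Coffee", "Sugar", "Eggs", "Bread"},
}


def count(itemset, transactions=TRANSACTIONS):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for items in transactions.values() if itemset <= items)


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions=TRANSACTIONS):
    """Share of transactions containing X that also contain Y (rule X -> Y)."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)


if __name__ == "__main__":
    print(support({"Milk", "Sugar"}))       # 0.6
    print(confidence({"Milk"}, {"Sugar"}))  # 1.0
    print(confidence({"Sugar"}, {"Milk"}))  # 0.75
```

Both rules clear the 50% thresholds, which is why they appear in the result list.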
GENETIC ALGORITHM
A Genetic Algorithm (GA) is a procedure used to
find approximate solutions to search problems
through the application of the principles of
evolutionary biology.
Genetic algorithms use biologically inspired
techniques such as genetic inheritance, natural
selection, mutation, and sexual reproduction
(recombination, or crossover).
[Figure: flowchart of a genetic algorithm. Boxes: Start; Generate Initial Population; Evaluation; Terminal Condition (Yes: Stop; No: continue); Selection; Genetic Operators (Crossover, Mutation)]
Steps in Genetic Algorithm
Conceptual Algorithm
1. [Start] Generate a random population of n chromosomes (suitable
solutions for the problem).
2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population.
3. [New population] Create a new population by repeating the following
steps until the new population is complete:
A. [Selection] Select two parent chromosomes from the population
according to their fitness (the better the fitness, the greater the chance of being
selected).
B. [Crossover] With a crossover probability, cross over the parents to
form new offspring (children). If no crossover is performed, the
offspring are exact copies of the parents.
C. [Mutation] With a mutation probability, mutate the new offspring at each
locus (position in the chromosome).
D. [Accepting] Place the new offspring in the new population.
4. [Replace] Use the newly generated population for a further run of the algorithm.
5. [Test] If the end condition is satisfied, stop and return the best solution
in the current population.
6. [Loop] Go to step 2.
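
The steps listed above can be sketched in a few lines of Python. This is an illustrative sketch, not the presenter's Java implementation: the toy fitness function `one_max`, the parameter defaults, the fixed generation budget used as the end condition, and the choice to remember the best chromosome seen so far are all assumptions made for the example.

```python
import random


def one_max(chromosome):
    """Toy fitness (assumption): the number of 1-bits in the chromosome."""
    return sum(chromosome)


def run_ga(fitness, length=20, pop_size=30, p_cross=0.7, p_mut=0.01,
           generations=50, seed=0):
    rng = random.Random(seed)
    # [Start] random population of pop_size bit-string chromosomes
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    # [Test]/[Loop] end condition here is a fixed number of generations
    for _ in range(generations):
        # [Fitness] a tiny offset keeps the weights valid if all fitnesses are 0
        weights = [fitness(c) + 1e-9 for c in population]
        new_population = []
        while len(new_population) < pop_size:
            # [Selection] roulette wheel: chance proportional to fitness
            p1, p2 = rng.choices(population, weights=weights, k=2)
            # [Crossover] single-point, otherwise copy the parents
            if rng.random() < p_cross:
                point = rng.randint(1, length - 1)
                c1 = p1[:point] + p2[point:]
                c2 = p2[:point] + p1[point:]
            else:
                c1, c2 = p1[:], p2[:]
            for child in (c1, c2):
                # [Mutation] flip each locus with probability p_mut
                for i in range(length):
                    if rng.random() < p_mut:
                        child[i] = 1 - child[i]
                # [Accepting]
                if len(new_population) < pop_size:
                    new_population.append(child)
        # [Replace]
        population = new_population
        # Assumption: keep the best chromosome seen in any generation
        best = max(population + [best], key=fitness)
    return best


if __name__ == "__main__":
    solution = run_ga(one_max)
    print(solution, one_max(solution))
```

Passing a `seed` makes runs repeatable, which helps when comparing parameter settings such as the crossover and mutation probabilities.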
Existing Work
The literature survey yielded the following observations.

1. Genetic Operations
1.1 Encoding
Chromosomes are of either fixed or varying length
Fuzzy rules are encoded as chromosomes
Natural numbers used for encoding
Gene expressions also encoded
1.2 Initial Population
Seeded by random selection (roulette wheel)
Seeded by users
Single rule sets generated
Fuzzy rules represented
1.3 Crossover
Locus point of crossover
On same attributes or random attributes
Crossover rate set dynamically
Symbiotic combination

1.4 Mutation
Locus point of mutation
Weight factor taken into consideration for deciding locus point
Dynamic mutation point
Mutation 1 and Mutation 2 generated

1.5 Fitness Threshold
Dynamically set
TP, TN, FP, FN criteria considered
Strength of implication taken into consideration
Sustainability index, creditable index and inclusive index considered
Real values of confidence and support derived and applied
Predictability and comprehensibility factors considered
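
One common way to read the TP, TN, FP, FN criteria for a rule X → Y is as counts over transactions. The sketch below assumes that reading; the surveyed papers may define the terms differently. The function name `confusion_counts` and the sample transactions are hypothetical.

```python
def confusion_counts(antecedent, consequent, transactions):
    """Return (tp, fp, fn, tn) for the rule antecedent -> consequent.

    Assumed reading: tp = X and Y both present, fp = X present without Y,
    fn = Y present without X, tn = neither present.
    """
    x, y = set(antecedent), set(consequent)
    tp = fp = fn = tn = 0
    for items in transactions:
        has_x, has_y = x <= items, y <= items
        if has_x and has_y:
            tp += 1
        elif has_x:
            fp += 1
        elif has_y:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


if __name__ == "__main__":
    # Hypothetical sample data for illustration.
    sample = [
        {"Milk", "Nuts", "Sugar"},
        {"Milk", "Coffee", "Sugar"},
        {"Milk", "Sugar", "Eggs"},
        {"Nuts", "Eggs", "Bread"},
        {"Nuts", "Coffee", "Sugar", "Eggs", "Bread"},
    ]
    tp, fp, fn, tn = confusion_counts({"Milk"}, {"Sugar"}, sample)
    print(tp, fp, fn, tn)            # 3 0 1 1
    print(tp / len(sample))          # support of the rule: 0.6
    print(tp / (tp + fp))            # confidence of the rule: 1.0
```

Under this reading, support is tp divided by the number of transactions and confidence is tp / (tp + fp), so a fitness function built on the four counts can recover both measures.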


2. Methodology
Crossover replaced by symbiotic combination
Rule selection performed by the user, thereby seeding the population
for the next generation
Searching for rules in k-itemsets instead of the whole database
Distributed GA performed
Dynamic immune evolution and biometric mechanism
introduced

3. Application Areas

4. Evaluation Parameters
Population size
Chromosome length
Mutation probability
Crossover probability
Fitness threshold
Support and confidence factor


Open Issues
Mining rules with non-fixed consequents.
Combining with other methods for multi-relation data.
Elimination of redundant rules.
Fixing optimum values for parameters.
Enhancing self-adaptiveness.
Making rule selection dependent on other classes.
Improving the algorithm to generate
simpler rules.
Testing on different domains.
Complexity prediction by using distributed computing.
Scalability.
Unsupervised learning.


Proposed Work
To implement a self-adaptive Genetic Algorithm for
association rule mining with optimal accuracy.

To use an iterative approach that increases the number of rules
extracted in each iteration, as a way to decrease the
learning time.

To propose the self-adaptive GA in a distributive
environment.



Self Adaptive GA
[Figure: self-adaptive GA diagram]

Work Done So Far
Literature survey on genetic algorithms performed, and a
comparative study against other methods done.
Analysis of an existing rule mining method, Apriori, done.
Basic Genetic Algorithm for function optimization coded
in Java.
Proposed a comparison framework for Genetic
Algorithms in Association Rule Mining.

Work to be done
Implement association rule mining with a self-
adaptive Genetic Algorithm on a medical dataset.
Test the same algorithm on other datasets and
compare with existing methods.
Optimize the results by tuning GA parameters.
Survey distributed algorithms.

Execution Plan for Next Six Months
July - August - Implementing an existing paper
August - Testing the code with a medical dataset and
performing a comparative study
September - Altering the code for other datasets and comparing
the results obtained
October - Altering GA factors in the code and evaluating
the results
November - December - Feasibility study on the generated code to obtain
the optimum result

Papers Published
Paper titled "Framework for Comparison of Association Rule
Mining Using Genetic Algorithm" has been selected for the
International Conference on Computers, Communication &
Intelligence at VCET, Madurai.

