
University Institute of Engineering

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
Bachelor of Engineering (Computer Science & Engineering)
Artificial Intelligence and Machine Learning (21CSH-316)
Prepared by:
Sitaram Patel (E13285)
Topic: Association Rule

DISCOVER . LEARN . EMPOWER


Course Outcomes
CO-1: Understand the fundamental concepts and techniques of artificial intelligence and machine learning.

CO-2: Apply the basics of Python programming to various problems related to AI & ML.

CO-3:

CO-4:

CO-5: Apply various unsupervised machine learning models and evaluate their performance.

2
Course Objectives

• To make familiar with the domain and fundamentals of Artificial Intelligence.
• To provide a comprehensive foundation to Machine Learning and Optimization methodology with applications.
• To study learning processes: supervised and unsupervised, deterministic and statistical learners, and ensemble learning.
• To understand modern techniques and practical knowledge of Machine Learning and its trends.

3
Road map
• Basic concepts
• Apriori algorithm
• Different data formats for mining
• Mining with multiple minimum supports
• Mining class association rules
• Summary

4
Association Rule Mining
• Proposed by Agrawal et al. in 1993.
• It is an important data mining model studied extensively by the
database and data mining community.
• Assume all data are categorical.
• No good algorithm for numeric data.
• Initially used for Market Basket Analysis to find how items purchased by
customers are related.

5
Association Rules
Which of my products tend to be purchased together?
What do other people like this person tend to like/buy/watch?
• Discover "interesting" relationships among variables in a large
database
• Rules of the form "When X observed, Y also observed"
• The definition of "interesting" varies with the algorithm used for discovery
• Not a predictive method; finds similarities, relationships

6
The model: data
• I = {i1, i2, …, im}: a set of items.
• Transaction t:
• t is a set of items, and t ⊆ I.
• Transaction Database T: a set of transactions T = {t1, t2, …, tn}.

7
Transaction data: supermarket data
• Market basket transactions:
t1: {bread, cheese, milk}
t2: {apple, eggs, salt, yogurt}
… …
tn: {biscuit, eggs, milk}
• Concepts:
• An item: an item/article in a basket
• I: the set of all items sold in the store
• A transaction: items purchased in a basket; it may have TID (transaction ID)
• A transactional dataset: A set of transactions

8
Transaction data: a set of documents
• A text document data set. Each document is treated as a “bag” of
keywords
doc1: Student, Teach, School
doc2: Student, School
doc3: Teach, School, City, Game
doc4: Baseball, Basketball
doc5: Basketball, Player, Spectator
doc6: Baseball, Coach, Game, Team
doc7: Basketball, Team, City, Game

9
The model: rules
• A transaction t contains X, a set of items (itemset) in I, if X ⊆ t.
• An association rule is an implication of the form:
X → Y, where X, Y ⊂ I, and X ∩ Y = ∅

• An itemset is a set of items.


• E.g., X = {milk, bread, cereal} is an itemset.
• A k-itemset is an itemset with k items.
• E.g., {milk, bread, cereal} is a 3-itemset

10
Rule strength measures
• Support: The rule holds with support sup in T (the transaction data set) if
sup% of transactions contain X ∪ Y.
• sup = Pr(X ∪ Y).
• Confidence: The rule holds in T with confidence conf if conf% of
transactions that contain X also contain Y.
• conf = Pr(Y | X)
• An association rule is a pattern that states when X occurs, Y occurs with
certain probability.

11
Support and Confidence
• Support count: The support count of an itemset X, denoted by X.count,
in a data set T is the number of transactions in T that contain X. Assume
T has n transactions.
• Then,

support = (X ∪ Y).count / n

confidence = (X ∪ Y).count / X.count
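As a small illustration (not part of the original slides), both measures can be computed directly from a list of transactions in Python; the four baskets below are the first four from the worked example a few slides ahead, and the rule {Eggs} → {Chicken} is just a made-up choice:

# Each transaction is a set of items
transactions = [
    {"Eggs", "Chicken", "Milk"},
    {"Eggs", "Cheese"},
    {"Cheese", "Boots"},
    {"Eggs", "Chicken", "Cheese"},
]
n = len(transactions)
X, Y = {"Eggs"}, {"Chicken"}                               # rule X -> Y

xy_count = sum(1 for t in transactions if (X | Y) <= t)    # (X U Y).count
x_count = sum(1 for t in transactions if X <= t)           # X.count

support = xy_count / n            # 2/4 = 0.50
confidence = xy_count / x_count   # 2/3 ≈ 0.67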
12
Goal and key features
• Goal: Find all rules that satisfy the user-specified minimum support
(minsup) and minimum confidence (minconf).
• Key Features
• Completeness: find all rules.
• No target item(s) on the right-hand-side
• Mining with data on hard disk (not in memory)

13
An example
• Transaction data:
t1: Eggs, Chicken, Milk
t2: Eggs, Cheese
t3: Cheese, Boots
t4: Eggs, Chicken, Cheese
t5: Eggs, Chicken, Clothes, Cheese, Milk
t6: Chicken, Clothes, Milk
t7: Chicken, Milk, Clothes
• Assume: minsup = 30%, minconf = 80%
• An example frequent itemset:
{Chicken, Clothes, Milk} [sup = 3/7]
• Association rules from the itemset:
Clothes → Milk, Chicken [sup = 3/7, conf = 3/3]
… …
Clothes, Chicken → Milk [sup = 3/7, conf = 3/3]

14
Transaction data representation
• A simplistic view of shopping baskets.
• Some important information is not considered, e.g.:
• the quantity of each item purchased and
• the price paid.

15
Many mining algorithms
• There are a large number of them!!
• They use different strategies and data structures.
• Their resulting sets of rules are all the same.
• Given a transaction data set T, a minimum support and a minimum confidence, the set
of association rules existing in T is uniquely determined.
• Any algorithm should find the same set of rules although their computational
efficiencies and memory requirements may be different.
• We study only one: the Apriori Algorithm

16
Association Rules - Apriori
• Specifically designed for mining over transactions in databases
• Used over itemsets: sets of discrete variables that are linked:
• Retail items that are purchased together
• A set of tasks done in one day
• A set of links clicked on by one user in a single session
• Our Example: Apriori

17
Apriori Algorithm - What is it?
Support
• Earliest of the association rule algorithms
• Frequent itemset: a set of items L that appears together "often enough":
• Formally: meets a minimum support criterion
• Support: the % of transactions that contain L
• Apriori Property: Any subset of a frequent itemset is also frequent
• It has at least the support of its superset

18
Apriori Algorithm (Continued)
Confidence
• Iteratively grow the frequent itemsets from size 1 to size K (or until we
run out of support).
• Apriori property tells us how to prune the search space
• Frequent itemsets are used to find rules X->Y with a minimum
confidence:
• Confidence: among the transactions that contain X, the % that also contain Y
• Output: The set of all rules X -> Y with minimum support and
confidence

19
The Apriori algorithm
• Probably the best known algorithm
• Two steps:
• Find all itemsets that have minimum support (frequent itemsets, also called large
itemsets).
• Use frequent itemsets to generate rules.

• E.g., a frequent itemset


{Chicken, Clothes, Milk} [sup = 3/7]
and one rule from the frequent itemset
Clothes → Milk, Chicken [sup = 3/7, conf = 3/3]

20
Step 1: Mining all frequent itemsets
• A frequent itemset is an itemset whose support is ≥ minsup.
• Key idea: The apriori property (downward closure property): any subsets of
a frequent itemset are also frequent itemsets

Itemset lattice over items A, B, C, D (every subset of a frequent itemset is also frequent):
ABC  ABD  ACD  BCD
AB  AC  AD  BC  BD  CD
A  B  C  D

21
The Algorithm
• Iterative algo. (also called level-wise search): Find all 1-item frequent
itemsets; then all 2-item frequent itemsets, and so on.
• In each iteration k, only consider itemsets that contain some frequent (k-1)-itemset.

• Find frequent itemsets of size 1: F1

• From k = 2:
• Ck = candidates of size k: those itemsets of size k that could be frequent, given Fk-1
• Fk = those itemsets that are actually frequent, Fk ⊆ Ck (need to scan the database once).

22
Example
Finding frequent itemsets (minsup = 0.5)

Dataset T:
TID    Items
T100   1, 3, 4
T200   2, 3, 5
T300   1, 2, 3, 5
T400   2, 5

itemset:count
1. scan T → C1: {1}:2, {2}:3, {3}:3, {4}:1, {5}:3
   → F1: {1}:2, {2}:3, {3}:3, {5}:3
   → C2: {1,2}, {1,3}, {1,5}, {2,3}, {2,5}, {3,5}
2. scan T → C2: {1,2}:1, {1,3}:2, {1,5}:1, {2,3}:2, {2,5}:3, {3,5}:2
   → F2: {1,3}:2, {2,3}:2, {2,5}:3, {3,5}:2
   → C3: {2,3,5}
3. scan T → C3: {2,3,5}:2 → F3: {2,3,5}

23
Details: ordering of items
• The items in I are sorted in lexicographic order (which is a total order).
• The order is used throughout the algorithm in each itemset.
• {w[1], w[2], …, w[k]} represents a k-itemset w consisting of items
w[1], w[2], …, w[k], where w[1] < w[2] < … < w[k] according to the
total order.

24
Details: the algorithm

Algorithm Apriori(T)
  C1 ← init-pass(T);
  F1 ← {f | f ∈ C1, f.count/n ≥ minsup};   // n: no. of transactions in T
  for (k = 2; Fk-1 ≠ ∅; k++) do
    Ck ← candidate-gen(Fk-1);
    for each transaction t ∈ T do
      for each candidate c ∈ Ck do
        if c is contained in t then
          c.count++;
      end
    end
    Fk ← {c ∈ Ck | c.count/n ≥ minsup}
  end
  return F ← ∪k Fk;
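A compact Python sketch of this level-wise loop (an illustration only, not the original implementation; it assumes transactions are Python sets and folds candidate generation into a simple join-and-prune over set unions):

from itertools import combinations

def apriori(transactions, minsup):
    n = len(transactions)
    freq = {}                                  # frequent itemset -> support count
    current = []                               # F1: frequent 1-itemsets (the init pass)
    for item in sorted({i for t in transactions for i in t}):
        c = frozenset([item])
        count = sum(1 for t in transactions if c <= t)
        if count / n >= minsup:
            freq[c] = count
            current.append(c)
    k = 2
    while current:
        # join step: unions of frequent (k-1)-itemsets that have exactly k items
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # prune step: every (k-1)-subset must itself be frequent (downward closure)
        candidates = {c for c in candidates
                      if all(frozenset(s) in freq for s in combinations(c, k - 1))}
        current = []
        for c in candidates:                   # one scan of the data per level
            count = sum(1 for t in transactions if c <= t)
            if count / n >= minsup:
                freq[c] = count
                current.append(c)
        k += 1
    return freq

T = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
print(apriori(T, minsup=0.5))   # reproduces F1, F2 and F3 = {2, 3, 5} from the earlier worked example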

25
Apriori candidate generation
• The candidate-gen function takes Fk-1 and returns a superset (called
the candidates) of the set of all frequent k-itemsets. It has two steps
• join step: Generate all possible candidate itemsets Ck of length k
• prune step: Remove those candidates in Ck that cannot be frequent.

26
Candidate-gen function

Function candidate-gen(Fk-1)
  Ck ← ∅;
  forall f1, f2 ∈ Fk-1
    with f1 = {i1, … , ik-2, ik-1}
    and f2 = {i1, … , ik-2, i'k-1}
    and ik-1 < i'k-1 do
      c ← {i1, …, ik-1, i'k-1};      // join f1 and f2
      Ck ← Ck ∪ {c};
      for each (k-1)-subset s of c do
        if (s ∉ Fk-1) then
          delete c from Ck;          // prune
      end
  end
  return Ck;
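A small Python sketch of candidate-gen alone (illustrative; it assumes each frequent (k-1)-itemset is stored as a sorted tuple so the prefix join above applies directly):

from itertools import combinations

def candidate_gen(F_prev):
    F_prev = set(F_prev)                       # frequent (k-1)-itemsets as sorted tuples
    Ck = set()
    for f1 in F_prev:
        for f2 in F_prev:
            # join step: same first k-2 items, last items in increasing order
            if f1[:-1] == f2[:-1] and f1[-1] < f2[-1]:
                c = f1 + (f2[-1],)
                # prune step: keep c only if every (k-1)-subset is frequent
                if all(s in F_prev for s in combinations(c, len(c) - 1)):
                    Ck.add(c)
    return Ck

F3 = {(1, 2, 3), (1, 2, 4), (1, 3, 4), (1, 3, 5), (2, 3, 4)}
print(candidate_gen(F3))   # {(1, 2, 3, 4)}; (1, 3, 4, 5) is pruned, as in the example that follows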
27
An example
• F3 = {{1, 2, 3}, {1, 2, 4}, {1, 3, 4},
{1, 3, 5}, {2, 3, 4}}

• After join
• C4 = {{1, 2, 3, 4}, {1, 3, 4, 5}}
• After pruning:
• C4 = {{1, 2, 3, 4}}
because {1, 4, 5} is not in F3 ({1, 3, 4, 5} is removed)

28
Step 2: Generating rules from frequent
itemsets
• Frequent itemsets → association rules
• One more step is needed to generate association rules
• For each frequent itemset X,
For each proper nonempty subset A of X,
• Let B = X - A
• A → B is an association rule if
• confidence(A → B) ≥ minconf,
support(A → B) = support(A ∪ B) = support(X)
confidence(A → B) = support(A ∪ B) / support(A)
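A minimal sketch of this rule-generation step in Python (illustrative; freq is assumed to map each frequent itemset, as a frozenset, to its support count from Step 1, and n is the number of transactions):

from itertools import combinations

def generate_rules(freq, n, minconf):
    rules = []
    for X, x_count in freq.items():
        if len(X) < 2:
            continue
        for r in range(1, len(X)):
            for A in map(frozenset, combinations(X, r)):   # proper nonempty subsets of X
                B = X - A
                conf = x_count / freq[A]   # support(X) / support(A); A is frequent by downward closure
                if conf >= minconf:
                    rules.append((set(A), set(B), x_count / n, conf))
    return rules   # each entry: (antecedent, consequent, support, confidence)

# e.g. generate_rules(apriori(T, 0.5), len(T), minconf=0.8) using the earlier sketch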

29
Generating rules: an example
• Suppose {2,3,4} is frequent, with sup=50%
• Proper nonempty subsets: {2,3}, {2,4}, {3,4}, {2}, {3}, {4}, with sup=50%, 50%, 75%, 75%, 75%,
75% respectively
• These generate these association rules:
• 2,3 → 4, confidence=100%
• 2,4 → 3, confidence=100%
• 3,4 → 2, confidence=67%
• 2 → 3,4, confidence=67%
• 3 → 2,4, confidence=67%
• 4 → 2,3, confidence=67%
• All rules have support = 50%

30
Generating rules: summary
• To recap, in order to obtain A → B, we need to have support(A ∪ B) and
support(A)
• All the required information for confidence computation has already
been recorded in itemset generation. No need to see the data T any
more.
• This step is not as time-consuming as frequent itemsets generation.

31
On Apriori Algorithm
Seems to be very expensive
• Level-wise search
• K = the size of the largest itemset
• It makes at most K passes over data
• In practice, K is bounded (often around 10).
• The algorithm is very fast. Under some conditions, all rules can be found in
linear time.
• Scale up to large data sets

32
Lift and Leverage
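• Two standard measures for judging a rule X → Y beyond support and confidence:
• Lift(X → Y) = support(X ∪ Y) / (support(X) × support(Y))
• Lift > 1: X and Y occur together more often than expected if they were independent
• Leverage(X → Y) = support(X ∪ Y) − support(X) × support(Y)
• Leverage > 0: the same signal expressed as a difference rather than a ratio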

33
Association Rules Implementations
• Market Basket Analysis
• People who buy milk also buy cookies 60% of the time.
• Recommender Systems
• "People who bought what you bought also purchased….“.
• Discovering web usage patterns
• People who land on page X click on link Y 76% of the time.

34
Use Case Example: Credit Records
Credit ID   Attributes
1           credit_good, female_married, job_skilled, home_owner, …
2           credit_bad, male_single, job_unskilled, renter, …

Minimum Support: 50%

Frequent Itemset            Support
credit_good                 70%
male_single                 55%
job_skilled                 63%
home_owner                  71%
home_owner, credit_good     53%

The itemset {home_owner, credit_good} has minimum support.
The possible rules are
credit_good -> home_owner
and
home_owner -> credit_good

35
Computing Confidence and Lift
Suppose we have 1000 credit records:
              free_housing   home_owner   renter   total
credit_bad         44            186         70      300
credit_good        64            527        109      700
total             108            713        179     1000

Of the 713 home_owners, 527 have good credit.


home_owner -> credit_good has confidence 527/713 = 74%

Of the 700 with good credit, 527 are home_owners.


credit_good -> home_owner has confidence 527/700 = 75%

The lift of these two rules is

0.527 / (0.700*0.713) = 1.055
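The same numbers can be checked in a few lines of Python (variable names here are just for illustration):

n = 1000
home_owner, credit_good, both = 713, 700, 527     # totals from the table above
print(both / home_owner)                          # 0.739 -> home_owner -> credit_good, ~74%
print(both / credit_good)                         # 0.753 -> credit_good -> home_owner, ~75%
print((both / n) / ((credit_good / n) * (home_owner / n)))   # ~1.056 -> lift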

36
A Sketch of the Algorithm
• If Lk is the set of frequent k-itemsets:
• Generate the candidate set Ck+1 by joining Lk to itself
• Prune out the (k+1)-itemsets that don't have minimum support. Now we have
Lk+1
• We know this catches all the frequent (k+1)-itemsets by the apriori
property
• a (k+1)-itemset can't be frequent if any of its subsets aren't frequent
• Continue until we reach kmax, or run out of support
• From the union of all the Lk, find all the rules with minimum
confidence
37
Step 1: 1-itemsets (L1)
• Let min_support = 0.5
• 1000 credit records
• Scan the database
• Prune

Frequent Itemset    Count
credit_good         700
credit_bad          300
male_single         550
male_mar_or_wid     92
female              310
job_skilled         631
job_unskilled       200
home_owner          710
renter              179

38
Step 2: 2-itemsets (L2)
• Join L1 to itself
• Scan the database to get the counts
• Prune

Frequent Itemset              Count
credit_good, male_single      402
credit_good, job_skilled      544
credit_good, home_owner       527
male_single, job_skilled      340
male_single, home_owner       408
job_skilled, home_owner       452

39
Step 3: 3-itemsets
Frequent Itemset                        Count
credit_good, job_skilled, home_owner    428

• We have run out of support.


• Candidate rules come from L2:
• credit_good -> job_skilled
• job_skilled -> credit_good
• credit_good -> home_owner
• home_owner -> credit_good

40
Finally: Find Confidence Rules
Rule                              Set           Cnt   Set                           Cnt   Confidence
IF credit_good THEN job_skilled   credit_good   700   credit_good AND job_skilled   544   544/700 = 77%
IF credit_good THEN home_owner    credit_good   700   credit_good AND home_owner    527   527/700 = 75%
IF job_skilled THEN credit_good   job_skilled   631   job_skilled AND credit_good   544   544/631 = 86%
IF home_owner THEN credit_good    home_owner    710   home_owner AND credit_good    527   527/710 = 74%

If we want confidence > 80%:


IF job_skilled THEN credit_good

41
Diagnostics
• Do the rules make sense?
• What does the domain expert say?
• Make a "test set" from hold-out data:
• Enter some market baskets with a few items missing (selected at random).
Can the rules predict the missing items?
• Remember, some of the test data may not cause a rule to fire.
• Evaluate the rules by lift or leverage.
• Some associations may be coincidental (or obvious).

42
Apriori - Reasons to Choose (+) and Cautions (-)

Reasons to Choose (+):
• Easy to implement
• Uses a clever observation (the Apriori property) to prune the search space
• Easy to parallelize

Cautions (-):
• Requires many database scans
• Exponential time complexity
• Can mistakenly find spurious (or coincidental) relationships
• Addressed with Lift and Leverage measures

43
Check Your Knowledge
1. What is the Apriori property and how is it used in the Apriori algorithm?
2. List three popular use cases of the Association Rules mining algorithms.
3. What is the difference between Lift and Leverage? How is Lift used in
evaluating the quality of rules discovered?
4. Define Support and Confidence
5. How do you use a “hold-out” dataset to evaluate the effectiveness of the
rules generated?

44
Summary
• Association rule mining has been extensively studied in the data mining
community.
• There are many efficient algorithms and model variations.
• Other related work includes
• Multi-level or generalized rule mining
• Constrained rule mining
• Incremental rule mining
• Maximal frequent itemset mining
• Numeric association rule mining
• Rule interestingness and visualization
• Parallel algorithms
• …

45
References
• Books and Journals
• Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai
Ben-David, Cambridge University Press, 2014
• Introduction to Machine Learning – the Wikipedia Guide by Osman Omer

• Video Link-
• https://www.youtube.com/watch?v=guVvtZ7ZClw
• https://www.youtube.com/watch?v=RHkvnRemaLE
• Web Link-
• https://www.geeksforgeeks.org/association-rule/
• https://www.javatpoint.com/association-rule-learning
• https://www.ibm.com/docs/en/SSEPGG_9.7.0/com.ibm.im.model.doc/c_support_in_an_association_rule.html

46
