Data Mining – Association Rules

 Review Questions
◦ Question 1: Data Mining and Metrics
 Algorithm Questions
◦ Question 2: Applying Apriori Algorithm
◦ Question 3: Finding Association Rules

 What is an Association Rule?

An association rule is a dependency rule found in a
set of records, each of which contains some number
of items from a given collection. The rule predicts
the occurrence of an item based on the occurrences
of other items in the same transaction. For example,
if every transaction that contains coke also
contains milk, then there will be a rule that states
{coke} -> {milk} (however, not necessarily the other
way around, since not every transaction with milk
contains coke).

 What are the metrics for evaluating
association rules?

The association rule evaluation metrics are
“Support” (s) and “Confidence” (c). For a rule
X -> Y, support is the fraction of the transactions
that contain both X and Y. Confidence measures how
often items in Y appear in transactions that contain
X.

For example, given the following table, these are the
support and confidence values:

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Example Association Rule: {Milk, Diaper} => {Beer}

s = count(Milk, Diaper, Beer) / Total Transactions
  = 2/5
  = 0.4
c = count(Milk, Diaper, Beer) / count(Milk, Diaper)
  = 2/3
  = 0.67
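
The same calculation can be sketched in Python. This is only an illustration: the function names support and confidence are chosen here and do not come from the slides.

```python
# Transactions from the example table (TID 1 to 5).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x, y, transactions):
    """Among transactions containing X, the fraction that also contain Y."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


s = support({"Milk", "Diaper", "Beer"}, transactions)
c = confidence({"Milk", "Diaper"}, {"Beer"}, transactions)
print(f"s = {s:.2f}, c = {c:.2f}")  # prints: s = 0.40, c = 0.67
```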

 Apply the Apriori algorithm to find all itemsets with
support >= 0.2 from the following data:

Transaction  Items in Transaction
1            Milk, Bread, Eggs
2            Milk, Juice
3            Juice, Butter
4            Milk, Bread, Eggs
5            Coffee, Eggs
6            Coffee
7            Coffee, Juice
8            Milk, Bread, Cookies, Eggs
9            Cookies, Butter
10           Milk, Bread

 Apriori Principle Step 1: Count the occurrences
of each single item:
Itemset   Count
Milk      5
Bread     4
Eggs      4
Juice     3
Butter    2
Coffee    3
Cookies   2

*Note: since there are 10 transactions, a support of
0.2 means an itemset must appear in at least 2 of
them. Every single item meets this threshold.
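
A minimal Python sketch of this counting step, assuming the ten transactions of the question are stored as sets. The variable names (item_counts, frequent_items) are illustrative only.

```python
from collections import Counter

# The ten transactions from the question.
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]
MIN_SUPPORT = 0.2

# Count how many transactions contain each single item.
item_counts = Counter(item for t in transactions for item in t)

# Keep the items whose support reaches the threshold.
frequent_items = {
    item: n
    for item, n in item_counts.items()
    if n / len(transactions) >= MIN_SUPPORT
}

for item, n in sorted(item_counts.items()):
    print(f"{item:8s} {n}")
```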

 Apriori Principle Step 2: Count the occurrences
of 2-item itemsets; those with a count of at least 2
are frequent:
Itemset           Count   Frequent?
Milk, Bread       4       Yes
Milk, Eggs        3       Yes
Milk, Juice       1       No
Milk, Cookies     1       No
Bread, Eggs       3       Yes
Bread, Cookies    1       No
Eggs, Coffee      1       No
Eggs, Cookies     1       No
Juice, Butter     1       No
Juice, Coffee     1       No
Butter, Cookies   1       No

(Pairs that never occur together have a count of 0
and are not listed.)
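
The pair counts can be reproduced with a short Python sketch. It assumes the frequent single items from Step 1 are already known. The names pair_counts and frequent_pairs are made up for this illustration.

```python
from itertools import combinations

# The ten transactions from the question.
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]
MIN_COUNT = 2  # support >= 0.2 over 10 transactions

# Frequent single items from Step 1 (all seven items qualify).
frequent_items = ["Bread", "Butter", "Coffee", "Cookies", "Eggs", "Juice", "Milk"]

# Candidate pairs are built only from frequent single items.
pair_counts = {}
for pair in combinations(sorted(frequent_items), 2):
    n = sum(1 for t in transactions if set(pair) <= t)
    if n > 0:  # pairs that never co-occur are skipped
        pair_counts[pair] = n

frequent_pairs = {p: n for p, n in pair_counts.items() if n >= MIN_COUNT}

for pair, n in sorted(pair_counts.items(), key=lambda kv: -kv[1]):
    print(pair, n)
```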

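 Apriori Principle Step 3: Count the occurrences
of 3-item itemsets built from the frequent 2-item
itemsets:

Itemset             Count
Milk, Bread, Eggs   3

Therefore, the largest frequent itemset is
{Milk, Bread, Eggs}, with support 3/10 = 0.3. The
itemsets with support >= 0.2 are the seven single
items, the pairs {Milk, Bread}, {Milk, Eggs} and
{Bread, Eggs}, and {Milk, Bread, Eggs}.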
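
The three steps can be combined into one level-wise loop. The following is a minimal sketch of the Apriori idea under simple assumptions (small in-memory data, a hypothetical function name apriori). It is not an optimized implementation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset meeting min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate, then keep those that are frequent.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]

frequent = apriori(transactions, 0.2)
for itemset, cnt in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), cnt)
```

The prune step is where the Apriori principle is used: an itemset can only be frequent if all of its subsets are frequent.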

 Using the data set in Question 2 and its frequent
itemsets (such as {Milk, Bread, Eggs}), find all the
association rules with support >= 0.2 and
confidence >= 0.8.

 Take the rule “{Milk, Bread} -> {Eggs}”, where
{Milk, Bread} is X and {Eggs} is Y.
 Support = (transactions containing X and Y) /
(total transactions)
 Confidence = (transactions containing X and Y) /
(transactions containing X)
 To do this, we check each way of splitting a
frequent itemset into X and Y.
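
As an illustration, the splits of {Milk, Bread, Eggs} can be enumerated in Python. This sketch also lists the rules with a single item on the left-hand side, and the helper name count_containing is hypothetical.

```python
from itertools import combinations

# The ten transactions from Question 2.
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]


def count_containing(items):
    """Number of transactions that contain every item in `items`."""
    return sum(1 for t in transactions if set(items) <= t)


itemset = ("Milk", "Bread", "Eggs")
n = len(transactions)

rules = []
# X ranges over every non-empty proper subset; Y is the rest.
for size in range(1, len(itemset)):
    for x in combinations(itemset, size):
        y = tuple(i for i in itemset if i not in x)
        s = count_containing(itemset) / n
        c = count_containing(itemset) / count_containing(x)
        rules.append((x, y, s, c))
        print(f"{set(x)} -> {set(y)}: support={s:.2f}, confidence={c:.2f}")
```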

Association Rules for {Milk, Bread, Eggs}:

{Milk, Bread} -> {Eggs}
Support = 3/10 = 0.3
Confidence = 3/4 = 0.75

{Milk, Eggs} -> {Bread}
Support = 3/10 = 0.3
Confidence = 3/3 = 1

{Eggs, Bread} -> {Milk}
Support = 3/10 = 0.3
Confidence = 3/3 = 1

Association Rules for {Milk, Bread}:

{Milk} -> {Bread}
Support = 4/10 = 0.4
Confidence = 4/5 = 0.8

{Bread} -> {Milk}
Support = 4/10 = 0.4
Confidence = 4/4 = 1

Association Rules for {Milk, Eggs}:

{Milk} -> {Eggs}
Support = 3/10 = 0.3
Confidence = 3/5 = 0.6

{Eggs} -> {Milk}
Support = 3/10 = 0.3
Confidence = 3/4 = 0.75

Association Rules for {Bread, Eggs}:

{Bread} -> {Eggs}
Support = 3/10 = 0.3
Confidence = 3/4 = 0.75

{Eggs} -> {Bread}
Support = 3/10 = 0.3
Confidence = 3/4 = 0.75

Therefore, the only Association Rules that satisfy the
restriction of having support >= 0.2 and confidence
>= 0.8 are:

 {Milk, Eggs} -> {Bread} (s=0.3, c=1)
 {Eggs, Bread} -> {Milk} (s=0.3, c=1)
 {Milk} -> {Bread} (s=0.4, c=0.8)
 {Bread} -> {Milk} (s=0.4, c=1)
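
This result can be double-checked with a short Python sketch that filters every rule from the frequent itemsets of two or more items by both thresholds. The variable names (frequent_itemsets, strong) are illustrative, and the frequent itemsets are entered by hand from Question 2.

```python
from itertools import combinations

# The ten transactions from Question 2.
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]
MIN_SUPPORT = 0.2
MIN_CONFIDENCE = 0.8

# Frequent itemsets with at least two items, taken from Question 2.
frequent_itemsets = [
    ("Milk", "Bread"),
    ("Milk", "Eggs"),
    ("Bread", "Eggs"),
    ("Milk", "Bread", "Eggs"),
]


def count_containing(items):
    """Number of transactions that contain every item in `items`."""
    return sum(1 for t in transactions if set(items) <= t)


strong = []
for itemset in frequent_itemsets:
    s = count_containing(itemset) / len(transactions)
    for size in range(1, len(itemset)):
        for x in combinations(itemset, size):
            y = tuple(i for i in itemset if i not in x)
            c = count_containing(itemset) / count_containing(x)
            if s >= MIN_SUPPORT and c >= MIN_CONFIDENCE:
                strong.append((x, y, s, c))

for x, y, s, c in strong:
    print(f"{set(x)} -> {set(y)} (s={s:.1f}, c={c:.2f})")
```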
