Presented To: Prof. Sweta Agarawa
• Given a set of database transactions, where each transaction is a set
of items, an association rule is an expression
X ⇒ Y
where X and Y are sets of items (literals). The intuitive meaning
of the rule: transactions in the database which contain the items in X
tend to also contain the items in Y.
• Example: 98% of customers who purchase tires and auto accessories
also buy some automotive services; here 98% is called the
confidence of the rule. The support of the rule X ⇒ Y
is the percentage of transactions that contain both X and Y.
• The problem of mining association rules is to find all rules that
satisfy a user-specified minimum support and minimum confidence.
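The two measures above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the five toy transactions and the helper names `support` and `confidence` are assumptions chosen only to make the arithmetic visible.

```python
# Toy transactions (hypothetical data, chosen only for illustration).
transactions = [
    {"tires", "accessories", "service"},
    {"tires", "accessories", "service"},
    {"tires", "accessories"},
    {"tires", "service"},
    {"accessories"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    """Among transactions containing X, the fraction that also contain Y.

    Assumes X occurs in at least one transaction (otherwise this divides by zero).
    """
    x, y = set(x), set(y)
    return support(x | y, transactions) / support(x, transactions)

x = {"tires", "accessories"}
y = {"service"}
rule_support = support(x | y, transactions)       # 2 of the 5 transactions hold X and Y
rule_confidence = confidence(x, y, transactions)  # 2 of the 3 transactions holding X
```

A rule X ⇒ Y would be reported only if both values reach the user-specified minimums.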
Consider a shopping cart filled with several items.
Market basket analysis tries to answer the following questions:
• Who makes purchases?
• What do customers buy together?
• In what order do customers purchase items?
The answers prompt other decisions:
• Where to place items in the store (e.g., together or apart)?
• Which items should we put on sale, and which should we not?
Let’s look at a concrete example of Apriori, based on the
AllElectronics transaction database D, shown below. There are
nine transactions in this database, i.e., |D| = 9. We use the next
figure to illustrate the finding of frequent itemsets in D.

TID    List of item_IDs
T100   I1, I2, I5
T200   I2, I4
T300   I2, I3
T400   I1, I2, I4
T500   I1, I3
T600   I2, I3
T700   I1, I3
T800   I1, I2, I3, I5
T900   I1, I2, I3
Scan D for the count of each candidate:

C1
Itemset   Sup. count
{I1}      6
{I2}      7
{I3}      6
{I4}      2
{I5}      2

Compare each candidate support count with the minimum support count:

L1
Itemset   Sup. count
{I1}      6
{I2}      7
{I3}      6
{I4}      2
{I5}      2
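The first pass can be sketched as follows. The transaction data is copied from the AllElectronics database D; the minimum support count of 2 is an assumption consistent with the tables (the slides do not state it explicitly), and the variable names are illustrative.

```python
from collections import Counter

# AllElectronics transaction database D (TID -> set of item IDs).
D = {
    "T100": {"I1", "I2", "I5"},
    "T200": {"I2", "I4"},
    "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"},
    "T500": {"I1", "I3"},
    "T600": {"I2", "I3"},
    "T700": {"I1", "I3"},
    "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}

MIN_SUP_COUNT = 2  # assumed threshold, inferred from the tables

# Scan: count how many transactions contain each single item (C1).
c1 = Counter(item for items in D.values() for item in items)

# Compare: keep only the items whose count reaches the threshold (L1).
l1 = {item: n for item, n in c1.items() if n >= MIN_SUP_COUNT}
```

With this threshold every single item survives, so L1 has the same five rows as C1.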
Generate C2 candidates from L1, then scan D for the count of each candidate:

C2
Itemset   Sup. count
{I1,I2}   4
{I1,I3}   4
{I1,I4}   1
{I1,I5}   2
{I2,I3}   4
{I2,I4}   2
{I2,I5}   2
{I3,I4}   0
{I3,I5}   1
{I4,I5}   0

Compare each candidate support count with the minimum support count:

L2
Itemset   Sup. count
{I1,I2}   4
{I1,I3}   4
{I1,I5}   2
{I2,I3}   4
{I2,I4}   2
{I2,I5}   2
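The second pass can be sketched in the same way. Because every single item is frequent here, the C2 candidates are simply all pairs of items from L1. The minimum support count of 2 is again an assumption inferred from the tables, and the variable names are illustrative.

```python
from itertools import combinations

# The nine transactions of D, as sets of item IDs.
D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
MIN_SUP_COUNT = 2  # assumed threshold, inferred from the tables

l1_items = ["I1", "I2", "I3", "I4", "I5"]  # the frequent 1-itemsets (L1)

# Generate: every 2-item combination of frequent items is a candidate (C2).
c2 = [frozenset(pair) for pair in combinations(l1_items, 2)]

# Scan: count the transactions that contain each candidate pair.
c2_counts = {c: sum(c <= t for t in D) for c in c2}

# Compare: keep the pairs whose count reaches the threshold (L2).
l2 = {c: n for c, n in c2_counts.items() if n >= MIN_SUP_COUNT}
```

Ten candidate pairs are counted and six of them remain in L2.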
Generate C3 candidates from L2, then scan D for the count of each candidate:

C3
Itemset      Sup. count
{I1,I2,I3}   2
{I1,I2,I5}   2

Compare each candidate support count with the minimum support count:

L3
Itemset      Sup. count
{I1,I2,I3}   2
{I1,I2,I5}   2
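The whole level-wise procedure can be put together as one short function. This is a sketch under stated assumptions, not a reference implementation: the function name `apriori` is illustrative, the minimum support count of 2 is inferred from the tables, and the join step uses a simple "union of two frequent k-itemsets" rule instead of the classic prefix-based join. After pruning, both approaches keep exactly the candidates whose k-item subsets are all frequent.

```python
from itertools import combinations

def apriori(transactions, min_sup_count):
    """Return {k: {itemset: support_count}} for every frequent k-itemset."""
    transactions = [frozenset(t) for t in transactions]

    def count_and_filter(candidates):
        # Scan the transactions, then keep candidates that reach the threshold.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_sup_count}

    items = {item for t in transactions for item in t}
    frequent = {1: count_and_filter({frozenset([item]) for item in items})}
    k = 1
    while frequent[k]:
        prev = frequent[k]
        # Join: unions of two frequent k-itemsets that contain exactly k+1 items.
        joined = {a | b for a in prev for b in prev if len(a | b) == k + 1}
        # Prune: every k-item subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in prev for s in combinations(c, k))
        }
        k += 1
        frequent[k] = count_and_filter(candidates)
    del frequent[k]  # the last level is empty, so drop it
    return frequent

# The nine transactions of the AllElectronics database D.
D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
result = apriori(D, min_sup_count=2)  # threshold of 2 is an assumption
```

For this database the prune step is what removes candidates such as {I1,I3,I5}, because its subset {I3,I5} is not in L2. The only 4-item candidate, {I1,I2,I3,I5}, is pruned for the same reason, so the search stops at L3.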
Established software for fast, effective discovery of real
associations.
• Magnum Opus is the only association discovery software that finds the
core associations in data and discards the rest.
• Some other special features are:
Robust variable selection and data modification tools improve the quality
of data, which leads to better modeling and more reliable results.
In this graphical summary, the strongest support value was found for
Swimming=Sometimes, which was associated with Gymnastic=Sometimes,
Baseball=Sometimes, and Basketball=Sometimes.
As in the 2D Association Network, the support values for the Body and Head
portions of each association rule are indicated by the sizes and colors of
each circle in the 2D plane. The thickness of each line indicates the confidence
value (conditional probability) for the respective association rule; the sizes and
colors of the "floating" circles plotted against the (vertical) z-axis indicate
the joint support (for the co-occurrences) of the respective Body and Head
components of the association rules. The plot position of each circle along
the vertical z-axis indicates the respective confidence value. Hence, this
particular graphical summary clearly shows two simple rules: Respondents
who name Pizza as a preferred fast food also mention Hamburger, and vice
versa.
This is an example of how association rules can be applied to text mining
tasks. This analysis was performed on the paragraphs (dialog spoken by
the characters in the play) in the first scene of Shakespeare's "All's Well
That Ends Well," after removing a few very frequent words like is, of, etc.
Of course, the specific words and phrases removed during the data
preparation phase of text (or data) mining projects will depend on the
purpose of the research.
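The data preparation step described above can be sketched as follows: each paragraph becomes one transaction, and its distinct words become the items. The stop-word list and the two sample paragraphs are hypothetical placeholders (the text names only "is" and "of" as removed words, and the sample sentences are not quotations from the play).

```python
import re

# Assumed stop-word list; only "is" and "of" are named in the text.
STOP_WORDS = {"is", "of", "the", "a", "and", "to"}

def to_transaction(paragraph, stop_words=STOP_WORDS):
    """Turn one paragraph into the set of distinct words it contains,
    lowercased, with the stop words removed."""
    words = re.findall(r"[a-z']+", paragraph.lower())
    return {w for w in words if w not in stop_words}

# Hypothetical sample paragraphs, used only to show the transformation.
paragraphs = [
    "The king is ill and the court is sad.",
    "The court speaks of the king.",
]
transactions = [to_transaction(p) for p in paragraphs]
```

The resulting list of word sets can then be mined like any other transaction database.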
Thank you
2-11-2010