
Market Basket Analysis

Rengarajan R (19049)
Market Basket Analysis
• Key technique used by large retailers to uncover
associations between items. 
• Also known as Product Affinity Analysis
• Looks for combinations of items that occur
together frequently in transactions.
• Association rules are widely used to analyze retail
basket or transaction data. They aim to identify
strong rules in that data using measures of
interestingness such as support, confidence, and lift.
Real-world applications

Commonly seen in retail outlets such as Reliance Fresh,
D-Mart, and other department stores.
Association Rule Mining
• An association rule has two parts:
– an antecedent (if) and
– a consequent (then)
• An antecedent is an item or item-set found in the data; a consequent
is an item found in combination with the antecedent.
• Association rules are created by analyzing the data for frequent
if/then patterns.
• The important relationships are then identified using the following
two measures:
– Support: how frequently the items in the if/then relationship appear
together in the database, as a fraction of all transactions.
– Confidence: how often the rule holds, i.e. the fraction of transactions
containing the antecedent that also contain the consequent.
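As a minimal sketch of the two measures above, the following Python computes support and confidence on a handful of toy transactions. The transactions, item names, and function names are invented for illustration; they are not taken from the Groceries dataset.

```python
from typing import FrozenSet, Iterable, List

# Toy transactions, invented for illustration only.
TOY_TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk", "eggs"}),
    frozenset({"bread", "milk", "eggs"}),
]


def count(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item in `itemset`."""
    wanted = frozenset(itemset)
    return sum(1 for t in transactions if wanted <= t)


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of all transactions that contain the item-set."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions with the antecedent that also hold the consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return count(both, transactions) / count(antecedent, transactions)


if __name__ == "__main__":
    # Rule: if bread, then milk.
    print(support({"bread", "milk"}, TOY_TRANSACTIONS))       # 3 of 5 transactions
    print(confidence({"bread"}, {"milk"}, TOY_TRANSACTIONS))  # 3 of the 4 bread baskets
```

Here bread and milk appear together in 3 of 5 baskets, and 3 of the 4 baskets containing bread also contain milk.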
Example Case/Scenario
• Consider the sales manager of a D-Mart located
near a residential area who wants to offer
discounts to customers to attract more sales.
• He has a dataset, ‘Groceries’, consisting of
more than 9,000 transactions.
• How can he find products that are bought
together and identify the baskets of products
to discount?
Snapshot of Groceries.csv
Apriori Algorithm
• The Apriori algorithm uses a simple a priori belief as a
guideline for reducing the association rule search
space: all subsets of a frequent item-set must also be
frequent. ("A priori" means "from the former".)
• The support of an item-set or rule measures how
frequently it occurs in the data
• A rule's confidence is a measurement of its predictive
power or accuracy. It is defined as the support of the
item-set containing both X and Y divided by the
support of the item-set containing only X
• Lift is a measure of how much more likely one item is
to be purchased relative to its typical purchase rate,
given that you know another item has been
purchased
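The pruning idea and the lift measure above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used to produce the results in this deck: the toy transactions, the `min_support` value, and the function names are all invented for the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List

Itemset = FrozenSet[str]

# Toy transactions, invented for illustration only.
TOY_TRANSACTIONS: List[Itemset] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk", "eggs"}),
    frozenset({"bread", "milk", "eggs"}),
]


def apriori(transactions: List[Itemset], min_support: float) -> Dict[Itemset, float]:
    """Return every frequent item-set mapped to its support."""
    n = len(transactions)

    def sup(itemset: Itemset) -> float:
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent: Dict[Itemset, float] = {}
    k = 1
    while candidates:
        scored = {c: sup(c) for c in candidates}
        level = {c: s for c, s in scored.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-item-sets into k-item-set candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


def lift(x: Itemset, y: Itemset, frequent: Dict[Itemset, float]) -> float:
    """support(X and Y) / (support(X) * support(Y)); above 1 means positive association."""
    return frequent[x | y] / (frequent[x] * frequent[y])


if __name__ == "__main__":
    freq = apriori(TOY_TRANSACTIONS, min_support=0.4)
    for itemset, s in sorted(freq.items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), s)
    print(lift(frozenset({"bread"}), frozenset({"butter"}), freq))
```

On this toy data no three-item candidate survives pruning, because each one contains an infrequent pair, so its support is never counted. That is the search-space reduction the a priori belief provides.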
Result: Top 10 rules by ‘support’
Top 20 most frequent items
Visual plot of highly associated items
Thank you
