Data Insights - Module 2 (Sanskar)


Sprocket Central Pty Ltd


Data analytics approach
Sanskar Sharma, Analyst at KPMG
Agenda

1. Introduction
2. Data Exploration
3. Model Development
4. Interpretation
Problem and Task

Identify and Recommend Top 1000 Customers from Company Database

Outline of the Problem

Sprocket Central, a premium cycle and accessories brand, wants to improve its sales. It wants KPMG to analyze the company's customer database and recommend the top 1000 customers to target.

Content of Data Analysis

• 'New' and 'Old' Customer Age Distribution
• Bike Related Purchases for Last 3 Years by Gender
• Job Industry Distribution
• Wealth Segment and Age Category
• Car Owners Distribution Among States
• RFM Analysis and Customer Classification
Data Quality Assessment

Issues in Data Quality Highlighted and Corrected

Quality dimensions: Accuracy, Completeness, Consistency, Currency, Relevancy, Validity, Uniqueness

Customer Demographic
• DOB: incorrect values
• Age: missing
• Job Title: blanks
• Customer Id: incomplete
• Gender: inconsistent, not standardized
• Deceased Customers: filter the data
• Default: delete column

Customer Address
• Customer Id: incomplete
• State: inconsistent, not standardized

New Customer List
• Age: missing
• Job Title: blanks
• DOB: blanks
• Rank: delete duplicate of Rank column

Transactions
• Profit: missing
• Customer Id: incomplete
• Online Order: blanks
• Brand: blanks
• Order Status: filter the data
• List Price: format
• Product First Sold Date: format
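
The "inconsistent, not standardized" findings for Gender and State could be handled with a simple mapping step. The sketch below is illustrative only: the raw spellings in GENDER_MAP and STATE_MAP are assumed examples rather than values confirmed by the deck, and standardize is a hypothetical helper name.

```python
# Hypothetical raw spellings mapped to one standard label each.
# These variants are assumptions for illustration, not taken from the dataset.
GENDER_MAP = {"f": "Female", "female": "Female", "m": "Male", "male": "Male", "u": "U"}
STATE_MAP = {
    "new south wales": "NSW", "nsw": "NSW",
    "victoria": "VIC", "vic": "VIC",
    "queensland": "QLD", "qld": "QLD",
}


def standardize(value, mapping):
    """Return the standard label for a raw value, or None if blank or unknown."""
    if value is None:
        return None
    return mapping.get(value.strip().lower())


raw_genders = ["F", "Female", " male", "U", ""]
clean_genders = [standardize(g, GENDER_MAP) for g in raw_genders]
print(clean_genders)
```

Returning None for blanks and unknown spellings keeps unresolved records visible, so they can be reviewed instead of being silently assigned a label.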
Data Exploration

Age Distribution of New and Old Customers

• The highest number of existing customers fall under the age category 40-49
• Customers below the age of 20 and above 70 are almost negligible
• New customer data shows age categories 20-29 and 40-69 are densely populated
• Sharp decline in the share of customers in age category 30-39

[Chart: Age Category of Old Customers. Scale: 20 (below 20), 30 (20-29) and so on]
[Chart: Age Category of New Customers. Same scale]
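
The chart scale (20 for below 20, 30 for 20-29, and so on) labels each age band by its upper bound. A minimal sketch of that bucketing follows, assuming ages are whole years; the function name age_category and the sample ages are hypothetical.

```python
from collections import Counter


def age_category(age):
    """Map an age in years to the chart's category label.

    The label is the upper bound of the age decade: 20 means below 20,
    30 means 20-29, 40 means 30-39, and so on.
    """
    return max(20, (age // 10 + 1) * 10)


# Hypothetical ages, for illustration only.
ages = [17, 23, 29, 35, 41, 48, 49, 52, 67, 71]
counts = Counter(age_category(a) for a in ages)
for label in sorted(counts):
    print(label, counts[label])
```

In this sample, the ages 41, 48 and 49 all land in category 50, which is the 40-49 band discussed in the first bullet.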
Data Exploration

Gender Wise Cycle Related Purchases in Last 3 Years

• Female customers account for more than 50% of the past 3 years' purchases
• Male and undisclosed gender categories contribute approximately 47% and 2% of bike related purchases, respectively

[Chart: Gender Wise Purchases of Old Customers. Female 50.98%, Male 46.83%, U 2.20%]
Data Exploration

Job Industry Distribution

• Manufacturing and Financial Services sector employees form the largest chunk of customers among both new and old customers
• The smallest fraction of customers comes from the 'Telecommunications' and 'Agriculture' industries
• Job industry wise customer distribution has not changed much; industries contribute new customers in a similar ratio as old customers

[Charts: New Customer Job Industry; Old Customer Job Industry. Categories: Agriculture, Entertainment, Financial Services, Health, IT, Manufacturing, Property, Retail, Telecommunications]
Data Exploration
Age Category Wise Wealth Segment Distribution

• The highest number of customers belong to the Mass Customer segment in all age categories
• High Net Worth and Affluent customers alternate in second place across the different age categories
• Total customers in the High Net Worth segment outnumber the total of the Affluent segment

[Charts: Age Group Wise Wealth Segment of New Customers; Age Group Wise Wealth Segment of Old Customers. X-axis: age category (20 to 90)]
Data Exploration
Customers who own Cars among different States

• Among old customers in NSW, car owners outnumber non-owners; in QLD and VIC the difference is almost negligible
• New customer data shows fewer car owners than non-owners in NSW, more car owners in QLD, and again an almost negligible difference in VIC

[Charts: State Wise Car Owner Count for Old Customer; State Wise Car Owner Count for New Customer. States: NSW, QLD, VIC. Series: Yes, No]
Model Development

RFM Analysis and Classifications

RFM Analysis is a way to prioritize customers and enhance marketing strategy based on Recency (R), Frequency (F) and Monetary (M) scores allotted to customers.

Classifications made for this Model
1. Platinum Customer
2. Very Loyal
3. Becoming Loyal
4. Recent Customer
5. Potential Customer
6. Late Bloomer
7. Losing Customer
8. Evasive Customer
9. Lost Customer

[Chart: Minimum RFM Score for Classifications. Series: Minimum of R_Score, Minimum of F_Score, Minimum of M_Score, by classification]
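
The deck does not state how the R, F and M scores were assigned. The sketch below shows one common approach under stated assumptions: each measure is scored 1 to 4 by rank quartile (4 is best, so the most recent buyers get R = 4), and the three digits are combined into a value such as 434. The function names and the four sample customers are hypothetical.

```python
def quartile_scores(values, higher_is_better=True):
    """Assign each value a score from 1 to 4 by rank quartile (4 is best).

    Ties are broken by input order, which is a simplification.
    """
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for position, index in enumerate(order):
        scores[index] = position * 4 // len(values) + 1
    return scores


def rfm_value(r, f, m):
    """Combine three 1-4 scores into a three-digit RFM value such as 434."""
    return r * 100 + f * 10 + m


# Hypothetical customers: (recency in days, purchase count, total spend in $).
customers = {
    "A": (10, 12, 9000.0),
    "B": (60, 8, 5000.0),
    "C": (150, 5, 3000.0),
    "D": (320, 2, 800.0),
}
names = list(customers)
# A lower recency period is better, so the scoring direction is reversed for R.
r = quartile_scores([customers[n][0] for n in names], higher_is_better=False)
f = quartile_scores([customers[n][1] for n in names])
m = quartile_scores([customers[n][2] for n in names])
values = {n: rfm_value(r[i], f[i], m[i]) for i, n in enumerate(names)}
print(values)
```

With these sample inputs, customer A (most recent, most frequent, highest spend) receives 444 and customer D receives 111, the two ends of the RFM range used in this model.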
Model Development

Scatter Chart for Recency Period against Monetary Value

• We can observe that the lower the recency period, the higher the revenue or profit generated
• As we move to a moderate recency period (100-200 days), monetary value also reduces from high to moderate
• Further, monetary value decreases significantly when the recency period is highest (200+ days)

[Chart: Recency Against Monetary. X-axis: Recency Value (0 to 400); Y-axis: Monetary Value ($) (0 to 14,000)]
Model Development

Scatter Chart for Recency Period against Frequency of Purchase

• The frequency of purchase is highest among customers who have bought recently (i.e. less than 150 days ago)
• As we move to customers who made their last purchase 200-300 days ago, the frequency of purchase also falls
• The trend continues, and the frequency of purchase falls to its lowest for a recency period of more than 300 days

[Chart: Recency Against Frequency. X-axis: Recency Value (0 to 400); Y-axis: Frequency Value (0 to 16)]
Model Development

Scatter Chart for Frequency of Purchase against Monetary Value

• Monetary gain is directly proportional to frequency of purchase

[Chart: Frequency Against Monetary. X-axis: Frequency Value (0 to 16); Y-axis: Monetary Value ($) (0 to 14,000)]
Model Development

Customer Rank, Title and Description based on RFM Value

Rank | Customer Title | Description | RFM Value
1 | Platinum Customer | Most Recent Buy, Buys Often, Most Spent | 434-444
2 | Very Loyal | Most Recent, Buys Often, High Amount Spent | 422-433
3 | Becoming Loyal | Most Recent or Buys Often, And/or High Amount Spent | 345-421
4 | Recent Customer | Very Recent, Buys Often or More than Once, And/or High Amount Spent | 324-344
5 | Potential Customer | Very Recent, Buys Less Frequently, And/or High Amount Spent | 312-323
6 | Late Bloomer | Not Recent + Buys Often or Very Recent + Buys Less Frequently, And/or High Amount Spent | 225-311
7 | Losing Customer | Very Old Buy + Buys Often or Not Recent + Buys Less Frequently, And High Amount Spent | 143-224
8 | Evasive Customer | Very Old Buy, Buys Less Frequently, Less Amount Spent | 125-142
9 | Lost Customer | Very Old Buy, Bought Least, Least Spent | 111-124
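
The RFM value ranges in the table above translate directly into a lookup. The following is a minimal sketch of such a lookup; the names RFM_BANDS and classify are hypothetical, and the ranges are copied from the table.

```python
# (lowest RFM value, highest RFM value, rank, customer title)
RFM_BANDS = [
    (434, 444, 1, "Platinum Customer"),
    (422, 433, 2, "Very Loyal"),
    (345, 421, 3, "Becoming Loyal"),
    (324, 344, 4, "Recent Customer"),
    (312, 323, 5, "Potential Customer"),
    (225, 311, 6, "Late Bloomer"),
    (143, 224, 7, "Losing Customer"),
    (125, 142, 8, "Evasive Customer"),
    (111, 124, 9, "Lost Customer"),
]


def classify(rfm_value):
    """Return (rank, title) for an RFM value, or None if it is out of range."""
    for low, high, rank, title in RFM_BANDS:
        if low <= rfm_value <= high:
            return rank, title
    return None


print(classify(444))  # (1, 'Platinum Customer')
print(classify(333))  # (4, 'Recent Customer')
print(classify(111))  # (9, 'Lost Customer')
```

Because the bands are contiguous from 111 to 444, every valid RFM value maps to exactly one rank.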
Model Development

Distribution of Customers

Customer Title | Count | Share (%)
Platinum Customer | 309 | 9%
Very Loyal | 336 | 10%
Becoming Loyal | 239 | 7%
Recent Customer | 444 | 13%
Potential Customer | 344 | 10%
Late Bloomer | 410 | 12%
Losing Customer | 627 | 18%
Evasive Customer | 71 | 2%
Lost Customer | 711 | 20%



Interpretation

Summary of Customer Table

Rank | Customer Title | Description | RFM Value | Count | Cumulative
1 | Platinum Customer | Most Recent Buy, Buys Often, Most Spent | 434-444 | 309 | 309
2 | Very Loyal | Most Recent, Buys Often, High Amount Spent | 422-433 | 336 | 645
3 | Becoming Loyal | Most Recent or Buys Often, And/or High Amount Spent | 345-421 | 239 | 884
4 | Recent Customer | Very Recent, Buys Often or More than Once, And/or High Amount Spent | 324-344 | 444 | 1328
5 | Potential Customer | Very Recent, Buys Less Frequently, And/or High Amount Spent | 312-323 | 344 | 1672
6 | Late Bloomer | Not Recent + Buys Often or Very Recent + Buys Less Frequently, And/or High Amount Spent | 225-311 | 410 | 2082
7 | Losing Customer | Very Old Buy + Buys Often or Not Recent + Buys Less Frequently, And High Amount Spent | 143-224 | 627 | 2709
8 | Evasive Customer | Very Old Buy, Buys Less Frequently, Less Amount Spent | 125-142 | 71 | 2780
9 | Lost Customer | Very Old Buy, Bought Least, Least Spent | 111-124 | 711 | 3491
Interpretation

Customers to Target

Rank | Customer Title | Description | RFM Value | Count | Cumulative
1 | Platinum Customer | Most Recent Buy, Buys Often, Most Spent | 434-444 | 309 | 309
2 | Very Loyal | Most Recent, Buys Often, High Amount Spent | 422-433 | 336 | 645
3 | Becoming Loyal | Most Recent or Buys Often, And/or High Amount Spent | 345-421 | 239 | 884
4 | Recent Customer | Very Recent, Buys Often or More than Once, And/or High Amount Spent | 324-344 | 444 | 1328


• Choose the top 1000 customers with the highest RFM Value
• Customers chosen would have purchased Recently and Frequently; and spent the most
• Assign the conditions mentioned in description and select the top 1000 customers
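
Ranks 1 to 3 together supply 884 customers, so the remaining 116 must come from rank 4. The sketch below works through that arithmetic using the counts from the "Customers to Target" table; taking rank 4 customers in descending order of RFM value is an assumption that follows the first bullet, and the variable names are hypothetical.

```python
from itertools import accumulate

# Counts per rank, taken from the "Customers to Target" table.
segments = [
    ("Platinum Customer", 309),
    ("Very Loyal", 336),
    ("Becoming Loyal", 239),
    ("Recent Customer", 444),
]
TARGET = 1000

# Running total of customers as each rank is added.
cumulative = list(accumulate(count for _, count in segments))

# Fill the target rank by rank, taking only what is still needed.
selection = []
remaining = TARGET
for title, count in segments:
    take = min(count, remaining)
    if take > 0:
        selection.append((title, take))
    remaining -= take

print(cumulative)
print(selection)
```

The selection takes every customer in ranks 1 to 3 and the 116 Recent Customers with the highest RFM values, for a total of exactly 1000.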