MRA Project Milestone 1 - Maminulislam
Project -
Milestone 1
MAMINUL ISLAM
ISLAMMAMINUL44@GMAIL.COM
Summary
➢ Agenda & Executive Summary of the
data.
An automobile parts manufacturing company has collected three years of transaction data.
They do not have an in-house data science team, so they have hired you as their consultant.
Your job is to use your magical data science skills to provide them with suitable insights about
their data and their customers.
[Figures: univariate distributions of Sales and Monetary]
The numeric variables Sales and Monetary follow nearly the same slightly right-skewed bell
curve, yet their values are not identical. The rest of the analysis therefore uses the
calculated price per order, which is Monetary.
EDA – Univariate Analysis
PRICEEACH mostly matches Sales, while QUANTITYORDERED takes either very high or very low
values.
EDA – Univariate Analysis
DAYS_SINCE_LASTORDER indicates how frequently a customer places the next order. As can be
seen, most customers fall in a range of 800-2800 days since their last order.
EDA – Bivariate Analysis
Day Trends
Daily trends of Sales, PRICEEACH and QUANTITYORDERED by customer, shown with the
average MSRP.
EDA – Weekly Trend of the Sales & MSRP
Weekly Sales
Weekly Sales and MSRP, shown with their volume peaks and lows.
EDA – Sales across different
Productlines
The USA has the most shipped orders and the highest sales, with 3 orders on hold and 1
cancelled. France follows, and the remaining countries come in order of decreasing sales.
EDA – Sales & Order status across
customers
Euro Shopping Channel is the customer with the highest sales and the most shipped orders,
followed by Mini Gifts Distributors Ltd. Euro Shopping Channel also has orders in Cancelled
and Disputed status, which is another point for the company to check.
RFM Segmentation
➢ R-Score: Recency is how recently a customer ordered, calculated as the difference in
days between the order date and the current date.
➢ F-Score: Frequency is how often a customer places orders, taken from the
DAYS_SINCE_LASTORDER variable in the Excel sheet.
➢ M-Score: Monetary. Sales could be used as Monetary, but this project uses the
product of price and quantity (PRICEEACH × QUANTITYORDERED).
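The Recency and Monetary definitions above can be sketched in a few lines of Python. This is only an illustration, not the Tableau workflow used in the project: the function name `rfm_inputs` and the reference date are assumptions. The reference date 2021-10-17 was picked because it reproduces the Recency value shown for order 10107 in the sample output; the actual "current date" used in the project is not stated. Frequency is left out because the project reads it from an existing column.

```python
from datetime import date, datetime


def rfm_inputs(order_date_str, price_each, quantity_ordered, reference_date):
    """Return (recency_days, monetary) for one order line.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Order dates in the sample output use the DD/MM/YYYY format.
    order_date = datetime.strptime(order_date_str, "%d/%m/%Y").date()
    # Recency: days between the order date and the reference ("current") date.
    recency_days = (reference_date - order_date).days
    # Monetary: price per unit times quantity ordered, rounded to cents.
    monetary = round(price_each * quantity_ordered, 2)
    return recency_days, monetary


# Assumed reference date (not stated in the source).
REFERENCE_DATE = date(2021, 10, 17)

# Order 10107 from the sample output: 30 units at 95.70, ordered 24/02/2018.
recency, monetary = rfm_inputs("24/02/2018", 95.70, 30, REFERENCE_DATE)
print(recency, monetary)
```

For this order line the Monetary value equals the SALES column, but the source notes that the two are not identical across the whole dataset.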
The RFM segmentation is done in Tableau, with the workflow published on Tableau
Public:
The output Excel file from Tableau is then used in Excel to get the actual RFM scores on sales:
ORDERNUMBER | QUANTITYORDERED | PRICEEACH | ORDERLINENUMBER | SALES | ORDERDATE | DAYS_SINCE_LASTORDER | STATUS | PRODUCTLINE | MSRP | PRODUCTCODE | CUSTOMERNAME | PHONE | ADDRESSLINE1 | CITY | POSTALCODE | COUNTRY | CONTACTLASTNAME | CONTACTFIRSTNAME | DEALSIZE | Recency | Frequency | Monetary
10107 | 30 | 95.70 | 2 | 2871 | 24/02/2018 | 828 | Shipped | Motorcycles | 95 | S10_1678 | Land of Toys Inc. | 2125557818 | 897 Long Airport Avenue | NYC | 10022 | USA | Yu | Kwai | Small | 1331 | 1337 | 2871
10121 | 34 | 81.35 | 5 | 2765.9 | 07/05/2018 | 757 | Shipped | Motorcycles | 95 | S10_1678 | Reims Collectables | 26.47.1555 | 59 rue de l'Abbaye | Reims | 51100 | France | Henriot | Paul | Small | 1259 | 1260 | 2765.9
10134 | 41 | 94.74 | 2 | 3884.34 | 01/07/2018 | 703 | Shipped | Motorcycles | 95 | S10_1678 | Lyon Souveniers | +33 1 46 62 7555 | 27 rue du Colonel Pierre Avia | Paris | 75508 | France | Da Cunha | Daniel | Medium | 1204 | 1204 | 3884.34
10145 | 45 | 83.26 | 6 | 3746.7 | 25/08/2018 | 649 | Shipped | Motorcycles | 95 | S10_1678 | Toys4GrownUps.com | 6265557265 | 78934 Hillside Dr. | Pasadena | 90003 | USA | Young | Julie | Medium | 1149 | 1155 | 3746.7
10168 | 36 | 96.66 | 1 | 3479.76 | 28/10/2018 | 586 | Shipped | Motorcycles | 95 | S10_1678 | Technics Stores Inc. | 6505556809 | 9408 Furth Circle | Burlingame | 94217 | USA | Hirano | Juri | Medium | 1085 | 1085 | 3479.76
RFM Analysis
Bin 1 consists of the top 25% of customers, who are termed active customers. Among them,
the customers who are loyal and bring in monetary benefits are listed below:
RFM Inferences
Bin 1 also consists of the top 25% most loyal customers, who are termed active customers.
Among them, the customers who are loyal and bring in monetary benefits are listed below:
RFM Inferences
Bin 2 is the set of at-risk customers (recency between 25% and 75%), which means they are
at risk of churning.
RFM Inferences
Bin 3 is the set of lost customers (recency above 75%), which means they are
already lost and may not return.
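The three recency bins described above can be sketched as a percentile-rank split. This is a minimal illustration under stated assumptions, not the Tableau/Excel procedure used in the project: the function name `assign_recency_bins`, the customer labels and the recency values are all hypothetical, and it assumes a lower recency (a more recent order) places a customer nearer the top.

```python
def assign_recency_bins(recency_by_customer):
    """Map each customer to bin 1, 2 or 3 by recency percentile rank.

    Hypothetical helper: bin 1 = most recent 25% (active),
    bin 2 = 25%-75% (at risk), bin 3 = above 75% (lost).
    """
    # Sort customers from most recent (smallest recency) to least recent.
    ranked = sorted(recency_by_customer.items(), key=lambda item: item[1])
    total = len(ranked)
    bins = {}
    for position, (customer, _recency) in enumerate(ranked, start=1):
        fraction = position / total  # percentile rank in (0, 1]
        if fraction <= 0.25:
            bins[customer] = 1
        elif fraction <= 0.75:
            bins[customer] = 2
        else:
            bins[customer] = 3
    return bins


# Made-up recency values (in days) for eight hypothetical customers.
example = {"A": 10, "B": 20, "C": 30, "D": 40, "E": 50, "F": 60, "G": 70, "H": 80}
bins = assign_recency_bins(example)
print(bins)
```

With eight customers, the two most recent land in bin 1, the next four in bin 2 and the last two in bin 3. Ties in recency are broken by sort order here, which a real analysis may want to handle explicitly.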
RFM Inferences
➢ Most of the customers belong to Bin 2 and are in a very critical situation: there is a
strong chance that this segment may switch to another supplier.
➢ In Bin 1, the 'active' and 'loyal' customers bring in decent revenue.
➢ In Bin 2, the company can offer more services to build loyalty, as these customers
account for a good share of revenue.
➢ The at-risk customers bring in the most monetary benefits but haven't purchased
recently.
➢ Most of the loyal customers belong to Japan and France.
➢ The USA has the most at-risk customers, followed by France, Germany and
Finland. Handling shipping issues better for these countries would be
good for business.
Thanks