GR6 Epgp16sece

Quantitative Techniques Project Report

Semester – 1

Strategic Sales Prediction:


Analyzing Rossmann Store Performance through Comprehensive
Evaluation of Store Dynamics, Promotional Strategies, and Competitive
Landscape

Section E
Group No - 6
Mukul Shrivastava (459)
Hyndavii Chiramdasu (450)
Anmol (_)
Akash Dayal (_)
Kapil Chaurasia (_)
Table of Contents
Understanding Rossmann data (Descriptive Analytics)
  Analysis on Stores & Assortment offered by them
  Analysis on Sales trends
  Seasonality in sales data
  Analysis on Promo
  School Holiday impact on Promo
  Effect of competition on sales
Inferential statistics
  Hypothesis testing
Predicting for sales using the various variables available (Regression Model)
Appendix 1

Background
The dataset under consideration pertains to Rossmann, a prominent retail chain that boasts a
widespread presence with over 3,000 drug stores strategically located across seven European
countries. At present, Rossmann store managers are tasked with predicting their daily sales for
up to six weeks in advance. This information is used to create staff schedules.
These sales are influenced by many factors, including promotions, competition, school and state
holidays, seasonality, and locality. With thousands of individual managers predicting sales
based on their unique circumstances, the accuracy of results is varied.

Objectives of this Analysis:


The aim of this analysis is to predict daily sales for 1,115 stores (out of 3,000) across
Germany to enhance store managers' ability to create effective staff schedules.

Motivation for this Analysis:


The motivation for this analysis lies in the need for accurate sales forecasts. Reliable forecasts are
crucial for effective store management as having the right sales staff on the floor will enable
upselling, cross-selling and a much better customer experience. This would also enable
Rossmann to have a standardized and robust prediction model rather than each store having its
own model.

Business Outcome:
We are aiming to solve the following business problems for Rossmann via our analysis:
1. Improved Operational Efficiency: By accurately predicting sales, store managers can
create efficient staff schedules, leading to improved operational efficiency.
2. Cost Savings: Optimized staff scheduling will result in cost savings by aligning labor
resources with actual demand, reducing unnecessary labor expenses.
3. Enhanced Customer Satisfaction: Efficient staffing ensures that customers receive the
attention and service they expect, contributing to a positive shopping experience.
4. Consistency Across Stores: Standardized sales prediction models ensure consistency
in forecasting across all 1,115 stores, reducing disparities in accuracy.
5. Strategic Decision-Making: Insights into key drivers of sales allow for informed
decision-making, helping Rossmann adapt its strategies to maximize sales opportunities.

Data
We utilized the Rossmann Store Sales dataset, sourced from Kaggle. The dataset can be
accessed at the following URL: https://www.kaggle.com/c/rossmann-store-sales/data.
The data contains daily sales for 1,115 stores along with other metrics such as
promotion, school holiday, nearby competitor store, type of store and type of assortment. The
detailed data structure along with definitions is presented in Appendix 1.

Understanding Rossmann data (Descriptive Analytics)
Rossmann Germany is a $2.1 billion enterprise (2014 data) with 1,115 stores in the dataset.
On average, Rossmann sells goods worth about $2 million per day.

Analysis on Stores & Assortment offered by them


Rossmann operates 4 types of stores. Most of the stores are of Type a (602), followed by Type
d (348), Type c (148) and Type b (17). Each store can have any one of the 3 types of assortment
(A, B or C).

• Store type a has the highest sales across the different store types, possibly because
it has the highest number of stores
• Store type b has nearly double the average sales (Sales/#Stores) of any other store type
• 54% of sales across all years are contributed by store type a, followed by store type d,
which contributes ~30%
• This is because 602 of the 1,115 stores are store type a and 348 are store type d
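The link between store count and sales share can be checked with a quick calculation. The sketch below is in Python rather than the R used for this report, and uses only the store counts quoted above. The variable names are illustrative.

```python
# Store counts by type, as quoted in this section.
store_counts = {"a": 602, "d": 348, "c": 148, "b": 17}

total_stores = sum(store_counts.values())

# Share of the store base held by each store type, in percent.
store_share_pct = {
    store_type: round(100 * count / total_stores, 1)
    for store_type, count in store_counts.items()
}

print(total_stores)
print(store_share_pct)
```

Type a holds about 54% of the stores and type d about 31%, which is close to their reported sales shares (54% and ~30%).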

1. There are 3 types of assortment (A, B, C) offered across the 4 store types (a, b, c, d).
2. A & C type assortments contribute ~49% each to sales, with B type assortment being the
smallest contributor at ~1.2%

This is because assortment type A is in 593 stores, followed by C in 513 and B in a meagre 9 stores

Analysis on Sales trends


The available data covers all of 2013 and 2014, and 2015 up to July.
Insights-
• Sales were flat in 2013 vs 2014, but we see an increase in average sales in 2015
• December seems to be the holiday season, when we see the heaviest sales of the year

Seasonality in sales data

• Sunday consistently had the lowest average sales; on digging into it, we found this was
because only a few stores were open on Sundays
• Monday contributes ~19% of the weekly sales, the highest among all days of the
week, followed by Tuesday and Friday

Analysis on Promo

Promotions are associated with an uptick in sales: average sales are ~$3,500 higher with Promo than with No Promo

School Holiday impact on Promo


                 No Promo   Promo    Sales increase
School Holiday   $5,389     $7,865   46%
School Working   $4,241     $8,173   93%
Overall          $4,430     $8,105   83%

1. While Promo always increases sales, the relative impact is higher if the promo falls on a
school working day.
2. On a school working day, promo increases sales by 93%, while a promo on a
school holiday increases sales by only 46%
3. Non-promo sales are already higher on school holidays, which might be because parents
have the time and mindspace to shop better.
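The percentage increases in the table can be reproduced from the average sales figures. The sketch below is in Python rather than the R used for this report. The third, unlabelled row of the table is treated here as the overall average, which is an assumption, and the names are illustrative.

```python
# Average sales from the table above: (no promo, promo).
avg_sales = {
    "School Holiday": (5389, 7865),
    "School Working": (4241, 8173),
    "Overall": (4430, 8105),  # assumed label for the unlabelled third row
}


def uplift_pct(no_promo, promo):
    """Relative increase of promo sales over no-promo sales, in percent."""
    return 100 * (promo - no_promo) / no_promo


# Rounded uplift per row, comparable to the "Sales increase" column.
uplift = {row: round(uplift_pct(*values)) for row, values in avg_sales.items()}
print(uplift)
```

The rounded results match the "Sales increase" column: 46%, 93% and 83%.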

Effect of competition on sales

Sales do not appear to depend on the distance to the closest competitor. It looks like our stores
are not affected by competitors.
We also calculated the correlation between Sales and CompetitionDistance: -0.01836759. This does
not indicate any strong correlation.

Inferential statistics
• Define population- All customers who have shopped at Rossmann from the opening of
the stores till date
• Parameter to be estimated – Sales
• Define sample- Sales to customers shopping in the period 2013, 2014 and 2015
up to July
• Appropriate statistic to estimate the parameter – Mean & Std dev
• Mean Sales = $5774
• Std dev = $3850

> kurtosis(df_v1$Sales)
[1] 1.778351
> skewness(df_v1$Sales)
[1] 0.6414577

Outlier analysis
Sales-

Min. 1st Qu. Median Mean 3rd Qu. Max.
0 3727 5744 5774 7856 41551
Upper outlier threshold = Q3 + 1.5*IQR = 7856 + 1.5*(7856-3727) ≈ 14,049

n() sum(Promo)
19731 13942
Of the 19,731 records above this threshold, 13,942 (about 71%) were on promo days. It looks like
guests buy more during promos, which is why these records show up as outliers.
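The threshold and the promo share among the flagged records can be worked through as follows. The sketch is in Python rather than the R used for this report. It assumes the n() and sum(Promo) counts refer to records above the threshold, which the report does not state explicitly.

```python
# Quartiles of Sales from the summary above.
q1, q3 = 3727, 7856

# Tukey's rule: values above Q3 + 1.5 * IQR are flagged as high outliers.
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr

# Counts from the n() / sum(Promo) output; assumed to describe the flagged records.
n_flagged, n_flagged_promo = 19731, 13942
promo_share_pct = 100 * n_flagged_promo / n_flagged

print(upper_fence)
print(round(promo_share_pct, 1))
```

Under that assumption, roughly seven in ten of the flagged records are on promo days.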

Hypothesis testing
In this section our aim was to test some of the popular myths around Rossmann. Here is
one such myth-

In Germany, there is a prevailing sentiment that Rossmann has excessive promotions. Social
media platforms often feature posts advising against making non-promotional purchases at
Rossmann, with claims suggesting that promotional sales occur at least every other day, if not
more (i.e. more than 15 days in a month are promo days for at least 1 store).
While this claim may be entertaining for influencers, its potential impact on our sales is a
matter of concern for our leadership team. Consequently, there is an effort to test whether this
claim holds.
Based on our sample data spanning 31 months, the average number of promotional days in a
month is 12.6 days, with a standard deviation of 1.58 days. We used a 95% confidence level
for our findings.
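The report does not state which test was run. One plausible reading is a one-sample t-test of the monthly mean against the claimed 15 days, sketched below in Python rather than the R used for this report. The choice of test, the variable names and the tabulated critical value are assumptions.

```python
import math

# Sample figures quoted above.
n_months = 31
mean_promo_days = 12.6
sd_promo_days = 1.58
claimed_days = 15  # "more than 15 promo days in a month"

# Standard error of the mean and the one-sample t statistic.
std_error = sd_promo_days / math.sqrt(n_months)
t_stat = (mean_promo_days - claimed_days) / std_error

# Two-sided 95% critical value for 30 degrees of freedom, taken from a t table.
T_CRIT_95_DF30 = 2.042
ci_low = mean_promo_days - T_CRIT_95_DF30 * std_error
ci_high = mean_promo_days + T_CRIT_95_DF30 * std_error

print(round(t_stat, 2))
print(round(ci_low, 2), round(ci_high, 2))
```

Under these assumptions, the 95% confidence interval for the mean number of promo days lies entirely below 15, which would not support the claim.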

Probability Analysis
Evaluating the type of distribution of sales
Normal distribution: test for normality

Predicting for sales using the various variables available (Regression Model)

The response (dependent) variable in this analysis was Sales. Our objective in this section is to
explain how we used our understanding of the data to arrive at a regression equation that can
explain sales.

1. We had about 1,100 stores; predicting sales for each store separately would
have been a mammoth task (and could also overfit the data), so we grouped the
stores into classes based on Store Type, Store Assortment, Promo and Date
2. Converted categorical variables into dummy variables
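The dummy-variable step can be illustrated with a small sketch. It is written in Python rather than the R used for this report. The one_hot helper and the two sample rows are hypothetical, and only the column naming (for example StoreType_a) follows the model output.

```python
def one_hot(rows, column, keep_levels):
    """Replace `column` with one 0/1 dummy column per level in keep_levels.

    Levels not listed in keep_levels act as the reference category
    (all of their dummies are 0).
    """
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for level in keep_levels:
            new_row[f"{column}_{level}"] = int(row[column] == level)
        encoded.append(new_row)
    return encoded


# Hypothetical rows, for illustration only (not taken from the dataset).
rows = [
    {"StoreType": "a", "DayOfWeek": 1},
    {"StoreType": "d", "DayOfWeek": 3},
]

encoded = one_hot(rows, "StoreType", ["a", "b"])
print(encoded)
```

Leaving some levels out of keep_levels avoids perfectly collinear dummy columns in the regression.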

Since we were operating on 16+ columns, this exercise was not possible in Excel, so we turned
to R.

Results
We ran multiple models and, after multiple iterations, settled on the model below. This
model gave good results not only on the training data but also on the testing data.

Model Equation

Sales = - 28624.27*SchoolHoliday + 5396.04*cnt_str
        + 44383.26*StoreType_a - 118154.38*StoreType_b
        - 48587.4*Assortment_a - 30392.83*Assortment_b
        + 184434.62*StateHoliday_a + 108537.37*StateHoliday_b
        + 84794.78*DayOfWeek_1 - 33899.52*DayOfWeek_3 - 29509.96*DayOfWeek_4
        + 861932.52*perc_stores_open + 174577.9*perc_stores_promo
        - 753132.44
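To show how the equation is applied, the sketch below evaluates it for one set of inputs. It is in Python rather than the R used for this report. The coefficients are copied from the equation, while predict_sales and the example input values are hypothetical. The report does not define cnt_str, perc_stores_open or perc_stores_promo, so their scale here is an assumption.

```python
# Coefficients copied from the model equation above.
INTERCEPT = -753132.44
COEFFICIENTS = {
    "SchoolHoliday": -28624.27,
    "cnt_str": 5396.04,
    "StoreType_a": 44383.26,
    "StoreType_b": -118154.38,
    "Assortment_a": -48587.40,
    "Assortment_b": -30392.83,
    "StateHoliday_a": 184434.62,
    "StateHoliday_b": 108537.37,
    "DayOfWeek_1": 84794.78,
    "DayOfWeek_3": -33899.52,
    "DayOfWeek_4": -29509.96,
    "perc_stores_open": 861932.52,
    "perc_stores_promo": 174577.90,
}


def predict_sales(features):
    """Evaluate the linear model; omitted features count as 0.

    Raises KeyError for a feature name that is not in the model.
    """
    return INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in features.items()
    )


# Hypothetical input: a Monday, store type a, assortment a, with assumed
# values for cnt_str and the two perc_* variables.
example = {
    "cnt_str": 100,
    "StoreType_a": 1,
    "Assortment_a": 1,
    "DayOfWeek_1": 1,
    "perc_stores_open": 1.0,
    "perc_stores_promo": 1.0,
}
predicted = predict_sales(example)
print(round(predicted, 2))
```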

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -753132.44 10422.32 -72.261 < 2e-16 ***
SchoolHoliday -28624.27 6261.45 -4.572 4.89e-06 ***
cnt_str 5396.04 38.75 139.239 < 2e-16 ***
StoreType_a 44383.26 7978.88 5.563 2.72e-08 ***
StoreType_b -118154.38 8117.35 -14.556 < 2e-16 ***
Assortment_a -48587.40 5574.06 -8.717 < 2e-16 ***
Assortment_b -30392.83 10785.89 -2.818 0.004843 **
StateHoliday_a 184434.62 18444.14 10.000 < 2e-16 ***
StateHoliday_b 108537.37 30084.90 3.608 0.000310 ***
DayOfWeek_1 84794.78 7956.19 10.658 < 2e-16 ***
DayOfWeek_3 -33899.52 7867.63 -4.309 1.66e-05 ***
DayOfWeek_4 -29509.96 7848.39 -3.760 0.000171 ***
perc_stores_open 861932.52 9972.20 86.434 < 2e-16 ***
perc_stores_promo 174577.90 5645.78 30.922 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 284200 on 11674 degrees of freedom
Multiple R-squared: 0.8261, Adjusted R-squared: 0.8259
F-statistic: 4265 on 13 and 11674 DF, p-value: < 2.2e-16
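The summary statistics above are consistent with one another, which can be checked from the reported R-squared and degrees of freedom. The sketch below is in Python rather than the R used for this report, and uses the standard formulas for adjusted R-squared and the overall F-statistic. The variable names are illustrative.

```python
# Values reported in the model summary above.
r_squared = 0.8261
n_predictors = 13
residual_df = 11674

# Residual df = n_obs - n_predictors - 1, so the number of observations is:
n_obs = residual_df + n_predictors + 1

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / residual_df

# Overall F-statistic: explained variance per predictor over
# unexplained variance per residual degree of freedom.
f_stat = (r_squared / n_predictors) / ((1 - r_squared) / residual_df)

print(n_obs)
print(round(adj_r_squared, 4))
print(round(f_stat))
```

The recomputed adjusted R-squared rounds to the reported 0.8259, and the F-statistic comes out close to the reported 4265. The small difference is expected because R-squared is only given to four decimals.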

Appendix 1: Data explanation


The data for this analysis is sourced from Kaggle. URL- https://www.kaggle.com/c/rossmann-store-sales/overview

Data fields
• Id - an Id that represents a (Store, Date) duple within the test set
• Store - a unique Id for each store
• Sales - the turnover for any given day [Y variable]
• Customers - the number of customers on a given day
• Open - an indicator for whether the store was open: 0 = closed, 1 = open
• StateHoliday - indicates a state holiday. Normally all stores, with few exceptions, are
closed on state holidays. Note that all schools are closed on public holidays and
weekends. a = public holiday, b = Easter holiday, c = Christmas, 0 = None
• SchoolHoliday - indicates if the (Store, Date) was affected by the closure of public
schools
• StoreType - differentiates between 4 different store models: a, b, c, d
• Assortment - describes an assortment level: a = basic, b = extra, c = extended
• CompetitionDistance - distance in meters to the nearest competitor store
• CompetitionOpenSince[Month/Year] - gives the approximate year and month of the
time the nearest competitor was opened
• Promo - indicates whether a store is running a promo on that day
