
ML

Assignment Presentation
Competition: https://www.kaggle.com/c/m5-forecasting-accuracy

In this competition, contestants are challenged to forecast future sales at Walmart based on hierarchical sales data from stores in the states of California, Texas, and
Wisconsin. Forecasting sales, revenue, and stock prices is a classic application of machine learning in economics, and it is important because it allows
investors to make guided decisions based on forecasts made by algorithms.

In this python3 notebook, I will briefly explain the structure of the dataset. Then, I will visualize the dataset using Matplotlib and Plotly. Finally, I will
demonstrate how this problem can be approached with a variety of forecasting algorithms.

Contents
Objective

Import Libraries

Load Data

Data Exploration and Preparation

Data Viewing

Denoising - removing noise

Wavelet Denoising

Average Smoothing

Exploratory Data Analytics (EDA)

Train Test Split

Modeling

Naive Approach

Moving Average

Exponential Smoothing

Prophet - by Facebook

Loss comparisons for models

Future Work

Objective
To analyze the hierarchical sales data from Walmart, the world’s largest company by revenue, to forecast daily sales. The data covers stores in three
US States (California, Texas, and Wisconsin) and includes item level, department, product categories, and store details. In addition, it has explanatory
variables such as price, promotions, day of the week, and special events.

We aim to explore the data and use different models to predict the sales.

This is a time series forecasting problem and, as no labelled test data is given, we shall split the dataset into training and testing time frames.

Import Libraries
In [1]:

import os
import gc
import time
import math
import datetime
from math import log, floor
from sklearn.neighbors import KDTree

import numpy as np
import pandas as pd
from pathlib import Path
from sklearn.utils import shuffle
from tqdm.notebook import tqdm as tqdm

import seaborn as sns


from matplotlib import colors
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

import plotly.express as px
import plotly.graph_objects as go
import plotly.figure_factory as ff
from plotly.subplots import make_subplots

import pywt
from statsmodels.robust import mad

import scipy
import statsmodels
from scipy import signal
import statsmodels.api as sm
from fbprophet import Prophet
from scipy.signal import butter, deconvolve
from statsmodels.tsa.arima_model import ARIMA
from statsmodels.tsa.api import ExponentialSmoothing, SimpleExpSmoothing, Holt

import warnings
warnings.filterwarnings("ignore")

%matplotlib inline

Load Data
In [2]:

calendar = pd.read_csv('calendar.csv')
selling_prices = pd.read_csv('sell_prices.csv')
sample_submission = pd.read_csv('sample_submission.csv')
sales_train_val = pd.read_csv('sales_train_validation.csv')

Data Exploration and Preparation

Data Viewing
In [3]:

print(calendar.head())

date wm_yr_wk weekday wday month year d event_name_1 \


0 2011-01-29 11101 Saturday 1 1 2011 d_1 NaN
1 2011-01-30 11101 Sunday 2 1 2011 d_2 NaN
2 2011-01-31 11101 Monday 3 1 2011 d_3 NaN
3 2011-02-01 11101 Tuesday 4 2 2011 d_4 NaN
4 2011-02-02 11101 Wednesday 5 2 2011 d_5 NaN

event_type_1 event_name_2 event_type_2 snap_CA snap_TX snap_WI


0 NaN NaN NaN 0 0 0
1 NaN NaN NaN 0 0 0
2 NaN NaN NaN 0 0 0
3 NaN NaN NaN 1 1 0
4 NaN NaN NaN 1 0 1

Calendar contains information about the dates on which the products are sold
In [4]:

print(selling_prices.head())

store_id item_id wm_yr_wk sell_price


0 CA_1 HOBBIES_1_001 11325 9.58
1 CA_1 HOBBIES_1_001 11326 9.58
2 CA_1 HOBBIES_1_001 11327 8.26
3 CA_1 HOBBIES_1_001 11328 8.26
4 CA_1 HOBBIES_1_001 11329 8.26

selling_prices contains the price of each product per store and week (wm_yr_wk)
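As a quick illustrative sketch (an addition for clarity, not part of the original notebook), the wm_yr_wk column is what links selling_prices to calendar, so the weekly prices can be expanded to daily prices. The item and store ids used here are just the first ones shown above.

# Expand the weekly sell prices of one item-store pair to daily prices via the calendar
one_item = selling_prices[(selling_prices['item_id'] == 'HOBBIES_1_001') &
                          (selling_prices['store_id'] == 'CA_1')]
daily_price = calendar[['date', 'd', 'wm_yr_wk']].merge(one_item, on='wm_yr_wk', how='inner')
print(daily_price[['date', 'd', 'sell_price']].head())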

In [5]:

print(sales_train_val.head())

id item_id dept_id cat_id store_id \


0 HOBBIES_1_001_CA_1_validation HOBBIES_1_001 HOBBIES_1 HOBBIES CA_1
1 HOBBIES_1_002_CA_1_validation HOBBIES_1_002 HOBBIES_1 HOBBIES CA_1
2 HOBBIES_1_003_CA_1_validation HOBBIES_1_003 HOBBIES_1 HOBBIES CA_1
3 HOBBIES_1_004_CA_1_validation HOBBIES_1_004 HOBBIES_1 HOBBIES CA_1
4 HOBBIES_1_005_CA_1_validation HOBBIES_1_005 HOBBIES_1 HOBBIES CA_1

state_id d_1 d_2 d_3 d_4 ... d_1904 d_1905 d_1906 d_1907 d_1908 \
0 CA 0 0 0 0 ... 1 3 0 1 1
1 CA 0 0 0 0 ... 0 0 0 0 0
2 CA 0 0 0 0 ... 2 1 2 1 1
3 CA 0 0 0 0 ... 1 0 5 4 1
4 CA 0 0 0 0 ... 2 1 1 0 1

d_1909 d_1910 d_1911 d_1912 d_1913


0 1 3 0 1 1
1 1 0 0 0 0
2 1 0 1 1 1
3 0 1 3 7 2
4 1 2 2 2 4

[5 rows x 1919 columns]

sales_train_val contains the historical daily unit sales data per product and store [d_1 - d_1913]
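As a small illustrative sketch (an addition, assuming the dataframes loaded above), the wide d_1 ... d_1913 columns of a single series can be aligned with actual calendar dates as follows; the variable names here are only for illustration.

# Align the daily sales of the first product-store series with calendar dates
series = sales_train_val.iloc[0]
day_cols = [c for c in sales_train_val.columns if c.startswith('d_')]
daily = pd.DataFrame({'d': day_cols, 'sales': series[day_cols].values})
daily = daily.merge(calendar[['d', 'date']], on='d', how='left')
print(daily.head())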

To see what we need to submit, let's look at the sample submission.

In [6]:

print(sample_submission.head())

id F1 F2 F3 F4 F5 F6 F7 F8 F9 ... \
0 HOBBIES_1_001_CA_1_validation 0 0 0 0 0 0 0 0 0 ...
1 HOBBIES_1_002_CA_1_validation 0 0 0 0 0 0 0 0 0 ...
2 HOBBIES_1_003_CA_1_validation 0 0 0 0 0 0 0 0 0 ...
3 HOBBIES_1_004_CA_1_validation 0 0 0 0 0 0 0 0 0 ...
4 HOBBIES_1_005_CA_1_validation 0 0 0 0 0 0 0 0 0 ...

F19 F20 F21 F22 F23 F24 F25 F26 F27 F28
0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0

[5 rows x 29 columns]

Below are sales data from six randomly selected product-store series in the dataset.
In [7]:

ids = sorted(list(set(sales_train_val['id'])))
d_cols = [c for c in sales_train_val.columns if 'd_' in c]
x_1 = sales_train_val.loc[sales_train_val['id'] == ids[0]].set_index('id')[d_cols].values[0]
x_2 = sales_train_val.loc[sales_train_val['id'] == ids[412]].set_index('id')[d_cols].values[0]
x_3 = sales_train_val.loc[sales_train_val['id'] == ids[656]].set_index('id')[d_cols].values[0]
x_4 = sales_train_val.loc[sales_train_val['id'] == ids[941]].set_index('id')[d_cols].values[0]
x_5 = sales_train_val.loc[sales_train_val['id'] == ids[1398]].set_index('id')[d_cols].values[0]
x_6 = sales_train_val.loc[sales_train_val['id'] == ids[1917]].set_index('id')[d_cols].values[0]

fig = make_subplots(rows=2, cols=3)

fig.add_trace(go.Scatter(x=np.arange(len(x_1)), y=x_1,mode='lines',
name="First sample",marker=dict(color="green")),row=1, col=1)

fig.add_trace(go.Scatter(x=np.arange(len(x_2)), y=x_2,mode='lines',
name="Second sample",marker=dict(color="violet")),row=1, col=2)

fig.add_trace(go.Scatter(x=np.arange(len(x_3)), y=x_3,mode='lines',
name="Third sample",marker=dict(color="blue")),row=1, col=3)

fig.add_trace(go.Scatter(x=np.arange(len(x_4)), y=x_4,mode='lines',
name="Fourth sample",marker=dict(color="red")),row=2, col=1)

fig.add_trace(go.Scatter(x=np.arange(len(x_5)), y=x_5,mode='lines',
name="Fifth sample",marker=dict(color="yellow")),row=2, col=2)

fig.add_trace(go.Scatter(x=np.arange(len(x_6)), y=x_6,mode='lines',
name="Sixth sample",marker=dict(color="gray")),row=2, col=3)

fig.update_layout(height=500, width=800, title_text="Sample sales")


fig.show()

[Figure: Sample sales. Six line subplots of daily unit sales for the six selected series.]

The sales data is very erratic, owing to the fact that so many factors affect the sales on a given day.

On certain days, the sales quantity is zero, which indicates that a certain product may not be available on that day

Denoising
This method may lose some information from the original time series, but it can be useful for extracting general trends. Denoising is done to remove rare occurrences so that the overall trend stands out.

We shall look at two denoising methods:

Wavelet Denoising
Average Smoothing
Wavelet Denoising
Wavelet denoising (commonly used with electrical signals) is a way to remove unnecessary noise from a time series. The method computes so-called "wavelet coefficients", which decide which pieces of information to keep (signal) and which to discard (noise). We use the MAD (mean absolute deviation) of the finest-level coefficients to estimate the randomness in the sales and, from that, a minimum threshold for the wavelet coefficients. We then filter out the coefficients below this threshold and reconstruct the sales data from the remaining ones.

In [8]:

def maddest(d, axis=None):
    # Mean absolute deviation of the wavelet coefficients
    return np.mean(np.absolute(d - np.mean(d, axis)), axis)

def denoise_signal(x, wavelet='db4', level=1):
    # Multilevel wavelet decomposition of the series
    coeff = pywt.wavedec(x, wavelet, mode="per")
    # Estimate the noise level from the finest detail coefficients
    sigma = (1/0.6745) * maddest(coeff[-level])
    # Universal threshold
    uthresh = sigma * np.sqrt(2 * np.log(len(x)))
    # Hard-threshold the detail coefficients and reconstruct the signal
    coeff[1:] = (pywt.threshold(i, value=uthresh, mode='hard') for i in coeff[1:])
    return pywt.waverec(coeff, wavelet, mode='per')

From here on, we shall experiment on the first three samples alone.

In [9]:

# Wavelet Denoising

y_w1 = denoise_signal(x_1)
y_w2 = denoise_signal(x_2)
y_w3 = denoise_signal(x_3)

fig = make_subplots(rows=3, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(len(x_1)), mode='lines+markers', y=x_1,
marker=dict(color="lightgreen"), showlegend=False,
name="Original signal"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_1)), y=y_w1, mode='lines',
marker=dict(color="darkgreen"), showlegend=False,
name="Denoised signal"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_2)), mode='lines+markers', y=x_2,
marker=dict(color="yellow"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_2)), y=y_w2, mode='lines',
marker=dict(color="red"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_3)), mode='lines+markers', y=x_3,
marker=dict(color="lightblue"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_3)), y=y_w3, mode='lines',
marker=dict(color="darkblue"), showlegend=False),
row=3, col=1
)

fig.update_layout(height=1200, width=800, title_text="Original (pale) vs. Denoised (dark) sales")


fig.show()
[Figure: Original (pale) vs. Denoised (dark) sales. Three stacked line plots comparing each original sample with its wavelet-denoised version.]

Looking at a small part (the first 90 days) of sample 1 alone:


In [10]:

tx_1 = sales_train_val.loc[sales_train_val['id'] == ids[0]].set_index('id')[d_cols].values[0][:90]


ty_w1 = denoise_signal(tx_1)
fig = make_subplots(rows=1, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(len(tx_1)), mode='lines+markers', y=tx_1,
marker=dict(color="lightgreen"), showlegend=False,
name="Original signal"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(tx_1)), y=ty_w1, mode='lines',
marker=dict(color="darkgreen"), showlegend=False,
name="Denoised signal"),
row=1, col=1
)

fig.show()

[Figure: original vs. wavelet-denoised sales for the first 90 days of sample 1.]

The below diagram illustrates these graphs side by side: the light traces represent the original sales and the dark traces represent the denoised sales.
In [11]:

fig = make_subplots(rows=3, cols=2)

fig.add_trace(
go.Scatter(x=np.arange(len(x_1)), mode='lines+markers', y=x_1,
marker=dict(color="lightgreen"),name="Original signal 1"),row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_1)), y=y_w1, mode='lines',
marker=dict(color="darkgreen"),name="Denoised signal 1"),row=1, col=2
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_2)), mode='lines+markers', y=x_2,
marker=dict(color="yellow"),name="Original signal 2"),row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_2)), y=y_w2, mode='lines',
marker=dict(color="red"),name="Denoised signal 2"),row=2, col=2
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_3)), mode='lines+markers', y=x_3,
marker=dict(color="lightblue"),name="Original signal 3"), row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_3)), y=y_w3, mode='lines',
marker=dict(color="darkblue"),name="Denoised signal 3"), row=3, col=2
)

fig.update_layout(height=600, width=800, title_text="Original (light) vs. Wavelet Denoised (dark) sales")


fig.show()

[Figure: Original (light) vs. Wavelet Denoised (dark) sales. Original series in the left column, denoised versions in the right column.]

Average smoothing
Average smoothing is a relatively simple way to denoise time series data. In this method, we take a "window" with a fixed size (say 10). We first place the
window at the beginning of the time series (the first ten elements) and calculate the mean of that section. We then move the window forward across the time series
by a fixed "stride", calculate the mean of the new window, and repeat the process until we reach the end of the time series.
All the mean values we calculated are then concatenated into a new time series, which forms the denoised sales data.
In [12]:

# Functions for average smoothing

def average_smoothing(signal, kernel_size=3, stride=1):
    # Slide a window of length `kernel_size` over the signal; each step contributes the
    # window mean, repeated `stride` times so the output length roughly matches the input
    sample = []
    start = 0
    end = kernel_size
    while end <= len(signal):
        sample.extend(np.ones(stride) * np.mean(signal[start:end]))
        start += stride
        end += stride
    return np.array(sample)

In [13]:

# Perform average smoothing on the first 3 samples alone

y_w1 = average_smoothing(x_1)
y_w2 = average_smoothing(x_2)
y_w3 = average_smoothing(x_3)

fig = make_subplots(rows=3, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(len(x_1)), mode='lines+markers', y=x_1,
marker=dict(color="lightgreen"),name="Original signal 1"),row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_1)), y=y_w1, mode='lines',
marker=dict(color="darkgreen"),name="Denoised signal 1"),row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_2)), mode='lines+markers', y=x_2,
marker=dict(color="yellow"),name="Original signal 2"),row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_2)), y=y_w2, mode='lines',
marker=dict(color="orange"),name="Denoised signal 2"),row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_3)), mode='lines+markers', y=x_3,
marker=dict(color="lightblue"),name="Original signal 3"), row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(x_3)), y=y_w3, mode='lines',
marker=dict(color="darkblue"),name="Denoised signal 3"), row=3, col=1
)

fig.update_layout(height=900, width=800, title_text="Original (light) vs. Average Smoothing Denoised (dark) sales")
fig.show()
[Figure: Original (light) vs. Average Smoothing Denoised (dark) sales. Three stacked line plots comparing each original sample with its smoothed version.]

Looking at a small part (the first 90 days) of sample 1 alone:


In [14]:

tx_1 = sales_train_val.loc[sales_train_val['id'] == ids[0]].set_index('id')[d_cols].values[0][:90]


ty_w1 = average_smoothing(tx_1)
fig = make_subplots(rows=1, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(len(tx_1)), mode='lines+markers', y=tx_1,
marker=dict(color="lightgreen"), showlegend=False,
name="Original signal"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(len(tx_1)), y=ty_w1, mode='lines',
marker=dict(color="darkgreen"), showlegend=False,
name="Denoised signal"),
row=1, col=1
)

fig.show()

[Figure: original vs. average-smoothed sales for the first 90 days of sample 1.]

In the above graphs, the dark line plots represent the denoised sales and the light line plots represent the original sales. We can see that average
smoothing is not as effective as wavelet denoising at finding macroscopic trends and patterns in the data: a lot of the noise in the original sales
persists even after denoising. Therefore, wavelet denoising is clearly more effective at finding trends in the sales data. Nonetheless, average
smoothing, or the "rolling mean", can also be used to calculate useful features for modeling.

EDA

Rolling Average Sales vs. Time for each store


In [15]:

past_sales = sales_train_val.set_index('id')[d_cols].T.merge(calendar.set_index('d')['date'],
                                                             left_index=True, right_index=True,
                                                             validate='1:1').set_index('date')

store_list = list(selling_prices['store_id'].unique())
means = []
fig = go.Figure()
for s in store_list:
    # Columns of past_sales belonging to this store
    store_items = [c for c in past_sales.columns if s in c]
    # 90-day rolling average of the store's total daily sales
    data = past_sales[store_items].sum(axis=1).rolling(90).mean()
    means.append(np.mean(past_sales[store_items].sum(axis=1)))
    fig.add_trace(go.Scatter(x=np.arange(len(data)), y=data, name=s))

fig.update_layout(yaxis_title="Sales", xaxis_title="Time", title="Rolling Average Sales vs. Time (per store)")

[Figure: Rolling Average Sales vs. Time (per store). 90-day rolling average of total daily sales for each of the ten stores.]

The above graph corresponds to how each retail store performs over time. This insight was inspired by the M5 starter data exploration notebook by Rob Mulla.
In [16]:

# Box plot of each store's rolling average sales

fig = go.Figure()

for i, s in enumerate(store_list):
    store_items = [c for c in past_sales.columns if s in c]
    data = past_sales[store_items].sum(axis=1).rolling(90).mean()
    fig.add_trace(go.Box(x=[s]*len(data), y=data, name=s))

fig.update_layout(yaxis_title="Sales", xaxis_title="Store name", title="Rolling Average Sales vs. Store name")

[Figure: Rolling Average Sales vs. Store name. Box plots of the 90-day rolling average sales for each store.]
In [17]:

# Avg sales by store

df = pd.DataFrame(np.transpose([means, store_list]))
df.columns = ["Mean sales", "Store name"]
px.bar(df, y="Mean sales", x="Store name", color="Store name", title="Avg sales vs. Store name")

[Figure: Avg sales vs. Store name. Bar chart of mean daily sales per store.]

Results from the EDA

CA_3 is the best performing store
CA_4 is the worst performing store
Wisconsin stores are all about the same in performance
WI_1 has exceeded usual improvement rates in the given time frame
TX_3 took a bad fall but recovered

Train Test Split


In [18]:
train_dataset = sales_train_val[d_cols[:1500]]
val_dataset = sales_train_val[d_cols[1500:]]
In [19]:

fig = make_subplots(rows=3, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(1500), mode='lines', y=train_dataset.loc[0].values,
marker=dict(color="blue"), showlegend=False,
name="Original signal"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1919), y=val_dataset.loc[0].values, mode='lines',
marker=dict(color="orange"), showlegend=False,
name="Denoised signal"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500), mode='lines', y=train_dataset.loc[412].values,
marker=dict(color="blue"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1919), y=val_dataset.loc[412].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500), mode='lines', y=train_dataset.loc[656].values,
marker=dict(color="blue"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1919), y=val_dataset.loc[656].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=3, col=1
)

fig.update_layout(height=1200, width=800, title_text="Train (blue) vs. Validation (orange) sales")


fig.show()
[Figure: Train (blue) vs. Validation (orange) sales. Three sample series with the training and validation segments in different colors.]

The above graph shows three sample series split into train and validation regions, represented by blue and orange respectively. The first 1500 days are
used for training and the remaining 413 days for validation.

Modeling
We shall use four different methods to forecast future sales and compare them:

Naive Approach
Moving Average
Exponential Smoothing
Prophet - by Facebook

Naive Approach
The naive approach simply forecasts the next day's sales as the current day's sales. The model can be summarized as follows:

$\hat{y}_{t+1} = y_t$

In the above equation, $\hat{y}_{t+1}$ is the predicted value for the next day's sales and $y_t$ is today's sales. The model predicts tomorrow's sales as today's sales.
Now let us see how this simple model performs on our dataset.

In [20]:

predictions = []
for i in range(len(val_dataset.columns)):
    if i == 0:
        # First forecast day: use the last observed training day
        predictions.append(train_dataset[train_dataset.columns[-1]].values)
    else:
        # Subsequent days: use the previous (actual) validation day
        predictions.append(val_dataset[val_dataset.columns[i-1]].values)

predictions = np.transpose(np.array([row.tolist() for row in predictions]))
# Error is computed on the first three series only
error_naive = np.linalg.norm(predictions[:3] - val_dataset.values[:3])/len(predictions[0])

Let's take samples 0, 69, and 99 to compare across all methods.


In [21]:

pred_0 = predictions[0]
pred_69 = predictions[69]
pred_99 = predictions[99]

fig = make_subplots(rows=3, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(1500), mode='lines', y=train_dataset.loc[0].values,
marker=dict(color="blue"),
name="Train"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1919), y=val_dataset.loc[0].values, mode='lines',
marker=dict(color="orange"),
name="Val"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1919), y=pred_0, mode='lines',
marker=dict(color="green"),
name="Pred"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500), mode='lines', y=train_dataset.loc[69].values,
marker=dict(color="blue"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1919), y=val_dataset.loc[69].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1919), y=pred_69, mode='lines',
marker=dict(color="green"), showlegend=False,
name="Denoised signal"),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500), mode='lines', y=train_dataset.loc[99].values,
marker=dict(color="blue"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1919), y=val_dataset.loc[99].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1919), y=pred_99, mode='lines',
marker=dict(color="green"), showlegend=False,
name="Denoised signal"),
row=3, col=1
)

fig.update_layout(height=1200, width=800, title_text="Naive approach")


fig.show()
[Figure: Naive approach. Train, validation, and predicted sales for samples 0, 69, and 99.]

In the above graphs, the predictions and validation values are too close together to make out properly, so below we give a better view of them alone by showing only
part of the series, days 1500-1600.
In [22]:

fig = make_subplots(rows=3, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1600), y=val_dataset.loc[0].values, mode='lines',
marker=dict(color="darkorange"),
name="Val"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1600), y=pred_0, mode='lines',
marker=dict(color="seagreen"),
name="Pred"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1600), y=val_dataset.loc[69].values, mode='lines',
marker=dict(color="darkorange"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1600), y=pred_69, mode='lines',
marker=dict(color="seagreen"), showlegend=False,
name="Denoised signal"),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1600), y=val_dataset.loc[99].values, mode='lines',
marker=dict(color="darkorange"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(1500, 1600), y=pred_99, mode='lines',
marker=dict(color="seagreen"), showlegend=False,
name="Denoised signal"),
row=3, col=1
)

fig.update_layout(height=1200, width=800, title_text="Naive approach Close up")


fig.show()
[Figure: Naive approach Close up. Validation vs. predicted sales for days 1500-1600 for samples 0, 69, and 99.]

We create a smaller dataset because the following methods are computationally expensive. We take only the last 100 days and use the first 70 for
training and the last 30 for validation.
In [23]:

train_dataset = sales_train_val[d_cols[-100:-30]]
val_dataset = sales_train_val[d_cols[-30:]]

fig = make_subplots(rows=3, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[0].values,
marker=dict(color="blue"), showlegend=False,
name="Original signal"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[0].values, mode='lines',
marker=dict(color="orange"), showlegend=False,
name="Denoised signal"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[1].values,
marker=dict(color="blue"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[1].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[2].values,
marker=dict(color="blue"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[2].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=3, col=1
)

fig.update_layout(height=1200, width=800, title_text="Train (blue) vs. Validation (orange) sales")


fig.show()
[Figure: Train (blue) vs. Validation (orange) sales. The 70/30 split for the first three series of the reduced dataset.]
Moving Average
The moving average method forecasts the next day's sales as the mean of the previous 30 (or any other number of) days. Because it takes the previous 30
time steps into consideration, it is less prone to short-term fluctuations than the naive approach. The model can be summarized as follows:

$\hat{y}_{t+1} = \frac{1}{30} \sum_{i=t-29}^{t} y_i$
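As a minimal sketch of the formula above (an illustrative addition; the notebook's actual implementation in the next cell uses a recursive blend of past means), a plain 30-day moving-average forecast could look like this. moving_average_forecast is a hypothetical helper, not part of the original notebook.

# Plain moving-average forecast: each step predicts the mean of the last `window` values,
# then feeds that prediction back in for the next step
def moving_average_forecast(history, horizon=30, window=30):
    history = list(history)
    forecasts = []
    for _ in range(horizon):
        forecast = np.mean(history[-window:])
        forecasts.append(forecast)
        history.append(forecast)
    return np.array(forecasts)

# Example usage: ma_pred_0 = moving_average_forecast(train_dataset.loc[0].values)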

In [24]:

predictions = []
for i in range(len(val_dataset.columns)):
    if i == 0:
        # First forecast day: mean of the last 30 training days
        predictions.append(np.mean(train_dataset[train_dataset.columns[-30:]].values, axis=1))
    if i < 31 and i > 0:
        # Later days: blend the mean of the remaining training days with the mean of earlier forecasts
        predictions.append(0.5 * (np.mean(train_dataset[train_dataset.columns[-30+i:]].values, axis=1) + \
                                  np.mean(predictions[:i], axis=0)))
    if i > 31:
        predictions.append(np.mean([predictions[:i]], axis=1))

predictions = np.transpose(np.array([row.tolist() for row in predictions]))
# Error is computed on the first three series only
error_avg = np.linalg.norm(predictions[:3] - val_dataset.values[:3])/len(predictions[0])
In [25]:

pred_0 = predictions[0]
pred_69 = predictions[69]
pred_99 = predictions[99]

fig = make_subplots(rows=3, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[0].values,
marker=dict(color="blue"),
name="Train"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[0].values, mode='lines',
marker=dict(color="orange"),
name="Val"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=pred_0, mode='lines',
marker=dict(color="green"),
name="Pred"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[69].values,
marker=dict(color="blue"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[69].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=pred_69, mode='lines',
marker=dict(color="green"), showlegend=False,
name="Denoised signal"),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[99].values,
marker=dict(color="blue"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[99].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=pred_99, mode='lines',
marker=dict(color="green"), showlegend=False,
name="Denoised signal"),
row=3, col=1
)

fig.update_layout(height=1200, width=800, title_text="Moving Average Approach")


fig.show()
[Figure: Moving Average Approach. Train, validation, and predicted sales for samples 0, 69, and 99.]

We can see that this model performs better than the naive approach. It is less susceptible to the volatility in day-to-day sales data and manages to pick
up trends with slightly higher accuracy. However, it is still unable to find high-level trends in the sales.
Exponential Smoothing
Exponential smoothing uses a different type of weighting than average smoothing: the previous time steps are exponentially weighted and summed to generate
the forecast, with the weights decaying as we move further back in time. The model can be summarized as follows:

$\hat{y}_{t+1} = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \dots$

In the above equation, $\alpha$ is the smoothing parameter. The forecast $\hat{y}_{t+1}$ is a weighted average of all the observations in the series $y_1, \dots, y_t$, and the rate at
which the weights decay is controlled by $\alpha$. This method gives different weights to different time steps, instead of giving the same weight to all time
steps (as the moving average method does). This ensures that recent sales data is given more importance than old sales data when making the forecast.
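The same forecast can be written recursively as $\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\hat{y}_t$. Below is a minimal sketch of this recursion (an illustrative addition; the next cell uses the statsmodels implementation instead). simple_exp_smoothing_forecast is a hypothetical helper, not part of the original notebook.

# Simple exponential smoothing by hand: the level is nudged towards each new observation
def simple_exp_smoothing_forecast(history, alpha=0.3, horizon=30):
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    # Multi-step SES forecasts are flat: every future day receives the final level
    return np.full(horizon, level)

# Example usage: ses_pred_0 = simple_exp_smoothing_forecast(train_dataset.loc[0].values)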

In [26]:

predictions = []
for row in tqdm(train_dataset[train_dataset.columns[-30:]].values[:100]):
    # Fit exponential smoothing on the last 30 training days and forecast 30 days ahead
    fit = ExponentialSmoothing(row, seasonal_periods=3).fit()
    predictions.append(fit.forecast(30))
predictions = np.array(predictions).reshape((-1, 30))
# Error is computed on the first three series only
error_exponential = np.linalg.norm(predictions[:3] - val_dataset.values[:3])/len(predictions[0])
In [27]:

pred_0 = predictions[0]
pred_69 = predictions[69]
pred_99 = predictions[99]

fig = make_subplots(rows=3, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[0].values,
marker=dict(color="blue"),
name="Train"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[0].values, mode='lines',
marker=dict(color="orange"),
name="Val"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=pred_0, mode='lines',
marker=dict(color="green"),
name="Pred"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[69].values,
marker=dict(color="blue"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[69].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=pred_69, mode='lines',
marker=dict(color="green"), showlegend=False,
name="Denoised signal"),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[99].values,
marker=dict(color="blue"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[99].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=pred_99, mode='lines',
marker=dict(color="green"), showlegend=False,
name="Denoised signal"),
row=3, col=1
)

fig.update_layout(height=1200, width=800, title_text="Exponential Smoothing Approach")


fig.show()
[Figure: Exponential Smoothing Approach. Train, validation, and predicted sales for samples 0, 69, and 99.]

We can see that exponential smoothing generates a horizontal line every time. This is because simple exponential smoothing produces a flat multi-step
forecast: every future day receives the same smoothed level. However, it is able to predict the mean level of sales with good accuracy.

Facebook Prophet
Prophet is an open-source time series forecasting library by Facebook. It is based on an additive model in which non-linear trends are fit together with yearly,
weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of
historical data, and it is designed to be more robust to missing data and shifts in trend than many other models.
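Following the Prophet paper, the additive model can be written as (this equation is an added clarification, not from the original notebook):

$y(t) = g(t) + s(t) + h(t) + \epsilon_t$

where $g(t)$ is the trend, $s(t)$ the periodic seasonality, $h(t)$ the holiday effects, and $\epsilon_t$ the error term.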
In [28]:

dates = ["2007-12-" + str(i) for i in range(1, 31)]


predictions = []
for row in tqdm(train_dataset[train_dataset.columns[-30:]].values[:100]):
df = pd.DataFrame(np.transpose([dates, row]))
df.columns = ["ds", "y"]
model = Prophet(daily_seasonality=True)
model.fit(df)
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)["yhat"].loc[30:].values
predictions.append(forecast)
predictions = np.array(predictions).reshape((-1, 30))
error_prophet = np.linalg.norm(predictions[:3] - val_dataset.values[:3])/len(predictions[0])

INFO:fbprophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:fbprophet:n_changepoints greater than number of observations. Using 23.
(the same two messages repeat for each of the 100 fitted series)
In [29]:

pred_0 = predictions[0]
pred_69 = predictions[69]
pred_99 = predictions[99]

fig = make_subplots(rows=3, cols=1)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[0].values,
marker=dict(color="blue"),
name="Train"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[0].values, mode='lines',
marker=dict(color="orange"),
name="Val"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=pred_0, mode='lines',
marker=dict(color="green"),
name="Pred"),
row=1, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[69].values,
marker=dict(color="blue"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[69].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=pred_69, mode='lines',
marker=dict(color="green"), showlegend=False,
name="Denoised signal"),
row=2, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70), mode='lines', y=train_dataset.loc[99].values,
marker=dict(color="blue"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=val_dataset.loc[99].values, mode='lines',
marker=dict(color="orange"), showlegend=False),
row=3, col=1
)

fig.add_trace(
go.Scatter(x=np.arange(70, 100), y=pred_99, mode='lines',
marker=dict(color="green"), showlegend=False,
name="Denoised signal"),
row=3, col=1
)

fig.update_layout(height=1200, width=800, title_text="Prophet Approach")


fig.show()
[Figure: Prophet Approach. Train, validation, and predicted sales for samples 0, 69, and 99.]

Prophet is able to find low-level and high-level trends simultaneously, unlike most of the other models, which can only find one of these. It predicts a
periodic function for each sample, and these functions seem to be fairly accurate.

On closer observation, we can see that there is a macroscopic upward trend in samples 1 and 2 and a downward one in sample 3, showing the
improvement or decline over time.

This is a major use of FB Prophet, i.e., to determine where a business is going in the future.
Submission CSV
For the submission, the code below forecasts each series as the mean of its last 28 days of sales, repeated across the 28-day horizon, and compiles the results into submission.csv.

In [30]:

days = range(1, 1913 + 1)
time_series_columns = [f'd_{i}' for i in days]
time_series_data = sales_train_val[time_series_columns]

# Forecast = mean of the last 28 observed days, repeated for the 28-day horizon
forecast = pd.DataFrame(time_series_data.iloc[:, -28:].mean(axis=1))
forecast = pd.concat([forecast] * 28, axis=1)
forecast.columns = [f'F{i}' for i in range(1, forecast.shape[1] + 1)]

# The submission needs both the validation and evaluation ids
validation_ids = sales_train_val['id'].values
evaluation_ids = [i.replace('validation', 'evaluation') for i in validation_ids]
ids = np.concatenate([validation_ids, evaluation_ids])

predictions = pd.DataFrame(ids, columns=['id'])
forecast = pd.concat([forecast] * 2).reset_index(drop=True)
predictions = pd.concat([predictions, forecast], axis=1)
predictions.to_csv('submission.csv', index=False)

In [31]:

print(predictions.head())

id F1 F2 F3 F4 \
0 HOBBIES_1_001_CA_1_validation 0.964286 0.964286 0.964286 0.964286
1 HOBBIES_1_002_CA_1_validation 0.071429 0.071429 0.071429 0.071429
2 HOBBIES_1_003_CA_1_validation 0.571429 0.571429 0.571429 0.571429
3 HOBBIES_1_004_CA_1_validation 1.821429 1.821429 1.821429 1.821429
4 HOBBIES_1_005_CA_1_validation 1.357143 1.357143 1.357143 1.357143

F5 F6 F7 F8 F9 ... F19 F20 \


0 0.964286 0.964286 0.964286 0.964286 0.964286 ... 0.964286 0.964286
1 0.071429 0.071429 0.071429 0.071429 0.071429 ... 0.071429 0.071429
2 0.571429 0.571429 0.571429 0.571429 0.571429 ... 0.571429 0.571429
3 1.821429 1.821429 1.821429 1.821429 1.821429 ... 1.821429 1.821429
4 1.357143 1.357143 1.357143 1.357143 1.357143 ... 1.357143 1.357143

F21 F22 F23 F24 F25 F26 F27 \


0 0.964286 0.964286 0.964286 0.964286 0.964286 0.964286 0.964286
1 0.071429 0.071429 0.071429 0.071429 0.071429 0.071429 0.071429
2 0.571429 0.571429 0.571429 0.571429 0.571429 0.571429 0.571429
3 1.821429 1.821429 1.821429 1.821429 1.821429 1.821429 1.821429
4 1.357143 1.357143 1.357143 1.357143 1.357143 1.357143 1.357143

F28
0 0.964286
1 0.071429
2 0.571429
3 1.821429
4 1.357143

[5 rows x 29 columns]

Loss for each model


In [32]:

error = [error_naive, error_avg, error_exponential, error_prophet]


names = ["Naive approach", "Moving average", "Exponential smoothing", "Prophet"]
df = pd.DataFrame(np.transpose([error, names]))
df.columns = ["RMSE Loss", "Model"]
px.bar(df, y="RMSE Loss", x="Model", color="Model", title="RMSE Loss vs. Model")

[Figure: RMSE Loss vs. Model. Bar chart of the loss for the naive approach, moving average, exponential smoothing, and Prophet.]

From the above graph, we can see that the naive approach is the best-scoring model and Prophet is the worst-scoring one. Note that the naive error was
computed on the full 413-day validation window while the other three models were evaluated on the 30-day window, so the comparison is only indicative.
I believe that Prophet can be boosted significantly by tuning its hyperparameters.

The naive approach may not work out for other samples, as it is a very basic method.

The moving average and exponential smoothing approaches score fairly similarly.

The Prophet approach seems to score worst, but I believe that if we trained it over all 1913 days instead of just 100, it could very well perform better.
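As an aside, the "RMSE Loss" computed above is the Frobenius norm of the error over the first three series divided by the horizon length rather than a textbook RMSE. A minimal sketch of a conventional RMSE (an illustrative addition, assuming predictions and val_dataset as defined for each model) would be:

# Conventional root-mean-squared error over a forecast matrix
def rmse(pred, actual):
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.sqrt(np.mean((pred - actual) ** 2))

# Example usage: rmse(predictions[:3], val_dataset.values[:3])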

Conclusions
Different states have different means and variances of sales, indicating differences in how stores develop across these states.
Most sales series have a linearly trended sine-wave shape, reminiscent of the macroeconomic business cycle.
Several non-ML models can be used to forecast time series data; moving average and exponential smoothing are very good models.
Prophet's performance can be boosted with more hyperparameter tuning.
