6/29/2021 7 libraries that help in time-series problems | by Pratik Gandhi | Jun, 2021 | Towards Data Science

7 libraries that help in time-series problems


Automate Time-Series problems!

Pratik Gandhi 18 hours ago · 6 min read

Time series problems are among the toughest problems to solve in data science.
Traditional time-aware methods such as ARIMA and SARIMA work well, but lately they
have largely been accompanied by robust, non-time-aware machine learning
algorithms like XGBoost, LightGBM, and so forth, out of need and because of their proven
track records. However, using these methods requires extensive data preparation, such as
removing periodicity and trend from the target and engineering features like
rolling window features, lag features, etc., to prepare the final dataset.
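To make that preparation concrete, here is a minimal pandas sketch of lag features, a rolling window feature, and a differenced (detrended) target. The helper name `make_supervised`, the column names, and the toy numbers are assumptions made for illustration only; they do not come from any of the libraries below.

```python
import pandas as pd


def make_supervised(series: pd.Series, lags=(1, 2, 3), window=3) -> pd.DataFrame:
    """Turn a univariate series into a table a non-time-aware model can use."""
    frame = pd.DataFrame({"y": series})
    # Differenced target: subtracting the previous value removes a linear trend.
    frame["y_diff"] = series.diff()
    # Lag features: the value observed `lag` steps earlier.
    for lag in lags:
        frame[f"lag_{lag}"] = series.shift(lag)
    # Rolling mean over past values only (shift first to avoid target leakage).
    frame[f"roll_mean_{window}"] = series.shift(1).rolling(window).mean()
    # The first rows have no complete history, so drop them.
    return frame.dropna()


idx = pd.date_range("2021-01-01", periods=8, freq="D")
toy = pd.Series([10, 12, 13, 15, 18, 21, 25, 30], index=idx, dtype=float)
features = make_supervised(toy)
print(features)
```

Each remaining row holds only information that was available before the value being predicted, which is what lets a model such as XGBoost or LightGBM be trained on it.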

https://towardsdatascience.com/7-libraries-that-help-in-time-series-problems-d59473e48ddd 1/20

Image created by author

To gain better accuracy we need to develop complex models, and the work can be quite
extensive. Therefore, it is better to leverage some of the automation that has already
been developed by the machine learning community. Below are some of the
packages that are really helpful in solving time series problems.

1. tsfresh:
tsfresh is a fantastic Python package that can automatically calculate a large number
of time series features.

Let's see how tsfresh can be used, taking the standard Airline Passengers dataset
as an example:

# Importing libraries
import pandas as pd
from tsfresh import extract_features, extract_relevant_features, select_features
from tsfresh.utilities.dataframe_functions import impute, make_forecasting_frame
from tsfresh.feature_extraction import ComprehensiveFCParameters, settings

Code for basic implementation

The data needs to be formatted something like below:

Output of dataframe
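Since the screenshot does not survive here, the shape of that table can be sketched by hand. The helper below, `to_long_windows`, is a hypothetical stand-in written for this illustration (it is not tsfresh's `make_forecasting_frame` and may differ from its exact output). It only shows the idea: one block of rows per prediction point, with `id`, `time` and `value` columns matching the `column_id`, `column_sort` and `column_value` arguments used later.

```python
import pandas as pd


def to_long_windows(series: pd.Series, max_timeshift: int = 3):
    """Build a long-format frame of past windows plus the value to predict."""
    values = series.tolist()
    index = series.index.tolist()
    rows, targets = [], {}
    for end in range(1, len(values)):
        start = max(0, end - max_timeshift)
        # Every past observation in the window becomes one row tagged with
        # the id of the point it helps to predict.
        for pos in range(start, end):
            rows.append({"id": index[end], "time": index[pos], "value": values[pos]})
        targets[index[end]] = values[end]
    long_df = pd.DataFrame(rows, columns=["id", "time", "value"])
    y = pd.Series(targets, name="target")
    return long_df, y


# Toy values for illustration only.
months = ["1949-01", "1949-02", "1949-03", "1949-04", "1949-05"]
passengers = pd.Series([112, 118, 132, 129, 121], index=months)
long_df, y = to_long_windows(passengers, max_timeshift=3)
print(long_df)
print(y)
```

Features are then computed per `id`, so each window of history becomes one row of features aligned with one target value.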

# Getting Comprehensive Features
extraction_settings = ComprehensiveFCParameters()
X = extract_features(df_shift,
                     column_id="id", column_sort="time",
                     column_value="value",
                     impute_function=impute,
                     show_warnings=False,
                     default_fc_parameters=extraction_settings)

Extracting features using tsfresh

Output of features

From the output above we see that close to 800 features are created. tsfresh also
helps with feature selection based on p-values. Check out the documentation for more
details.

An exhaustive list of all the features that tsfresh calculates can be found in the
documentation.

Github: https://github.com/blue-yonder/tsfresh

Documentation: https://tsfresh.readthedocs.io/en/latest/index.html

2. autots:
AutoTS is an automated time series forecasting library that can train multiple time series
models using straightforward code. AutoTS means Automatic Time Series.

Library Logo

Some of the best features of this library are:

It uses genetic programming optimization to find the optimal time series forecasting
model.

Provides lower and upper confidence interval forecast values.

It trains diverse models like naive, statistical, machine learning as well as deep
learning models

It can also perform automatic ensembling of best models

It also has the ability to handle messy data by learning optimal NaN imputation and
outlier removal

It can run both univariate and multivariate time series
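To give a feel for what "training diverse models and ensembling the best ones" means, here is a toy sketch in plain Python. It is not how AutoTS works internally (AutoTS searches a far larger model space with genetic optimization); the three forecasters, the `simple_ensemble` helper and the numbers are assumptions made up for this illustration.

```python
from statistics import mean


def naive_last(history, horizon):
    # Repeat the last observed value.
    return [history[-1]] * horizon


def historical_mean(history, horizon):
    # Repeat the mean of everything seen so far.
    return [mean(history)] * horizon


def drift(history, horizon):
    # Extend the straight line between the first and last observation.
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def mae(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def simple_ensemble(series, horizon, top_k=2):
    """Score candidates on a holdout, then average the top_k forecasts."""
    train, holdout = series[:-horizon], series[-horizon:]
    candidates = {"naive": naive_last, "mean": historical_mean, "drift": drift}
    scores = {name: mae(holdout, fn(train, horizon)) for name, fn in candidates.items()}
    best = sorted(scores, key=scores.get)[:top_k]
    # Refit the winners on the full series and average them step by step.
    forecasts = [candidates[name](series, horizon) for name in best]
    return best, [mean(step) for step in zip(*forecasts)]


series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
best, ensemble = simple_ensemble(series, horizon=2)
print(best, ensemble)
```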

Let us take an Apple stock price dataset and look at this in more detail:

# Loading the package


from autots import AutoTS
import matplotlib.pyplot as plt
import pandas as pd

# Reading the data
df = pd.read_csv('../input/apple-aapl-historical-stock-data/HistoricalQuotes.csv')

Code to load, preprocess and plot the data

Simple plot to see the data

model = AutoTS(forecast_length=40,
               frequency='infer',
               ensemble='simple',
               drop_data_older_than_periods=100)
model = model.fit(df, date_col='Date',
                  value_col=' Close/Last', id_col=None)

Setting up AutoTS to run

This will run hundreds of models. You will see in the output pane the variety of models
that run. Let's see how the model predicts:

prediction = model.predict()
forecast = prediction.forecast
print("Stock Price Prediction of Apple")
print(forecast)

Code to generate and print predictions

Prediction for 40 days

temp_df[' Close/Last'].plot(figsize=(15,8), title='AAPL Stock Price',
                            fontsize=18, label='Train')
forecast[' Close/Last'].plot(figsize=(15,8), title='AAPL Stock Price',
                             fontsize=18, label='Test')
plt.legend()

Code for plotting training and test (predictions) data

Github: https://github.com/winedarksea/AutoTS

Documentation:
https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html

3. Prophet:

Library logo

Prophet is a well-known time series package developed by the research team at
Facebook, with its first release in 2017. It works well with data that has strong seasonal
effects and several seasons of historical data. It is highly user-friendly and customizable,
and takes minimal effort to set up. It can handle, among other things:

daily seasonality,

holiday effects

input regressors

Let's take a look at a simple example:

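[Code listing lost in extraction. The surviving fragments show imports of
matplotlib.pyplot (as plt) and Prophet, and a dataset read from the Prophet GitHub repo.]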

Code to use Prophet library
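As a rough sketch of what such a script typically looks like (not the author's original listing), the snippet below prepares a dataframe in the shape Prophet expects: a `ds` column with dates and a `y` column with the values. The helper names `to_prophet_frame` and `fit_and_forecast` and the toy numbers are assumptions for illustration. The package is assumed to be installed as `prophet` (older installs use `fbprophet`), and the fitting function is only defined here, not called.

```python
import pandas as pd


def to_prophet_frame(data: pd.DataFrame, date_col: str, value_col: str) -> pd.DataFrame:
    """Rename the date and value columns to the `ds` / `y` names Prophet expects."""
    frame = data[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    frame["ds"] = pd.to_datetime(frame["ds"])
    return frame


def fit_and_forecast(frame: pd.DataFrame, periods: int, freq: str = "D") -> pd.DataFrame:
    """Fit a default Prophet model and forecast `periods` steps ahead."""
    # Imported lazily so the data preparation above works without Prophet installed.
    from prophet import Prophet

    model = Prophet()
    model.fit(frame)
    future = model.make_future_dataframe(periods=periods, freq=freq)
    # The result includes yhat plus the yhat_lower / yhat_upper interval columns.
    return model.predict(future)


# Toy rows for illustration only.
raw = pd.DataFrame({"Month": ["1949-01", "1949-02", "1949-03"],
                    "#Passengers": [112, 118, 132]})
frame = to_prophet_frame(raw, "Month", "#Passengers")
print(frame)
```

With a full monthly dataset, a call such as `fit_and_forecast(frame, periods=24, freq="MS")` would produce roughly two years of forecasts.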

Predictions for ~2 years

We can also plot the trend and the seasonality components, as below:


Trend and Seasonality Plots

And finally, we can also see the predictions along with their confidence intervals:

Output for predictions

Github: https://github.com/facebook/prophet

Documentation: https://facebook.github.io/prophet/

4. darts:

Darts is another Python package that helps in the manipulation and forecasting of time
series. The syntax is “sklearn-friendly”, using fit and predict functions to achieve your
goals. In addition, it contains a variety of models, from ARIMA to neural networks.

The best part of the package is that it supports not only univariate but also
multivariate time series and models. The library also makes it easy to backtest models
and to combine the predictions of several models and external regressors. Let's take a
simple example to understand how it works:

# Loading the package
import pandas as pd
from darts import TimeSeries
from darts.models import ExponentialSmoothing
import matplotlib.pyplot as plt

# Reading the data
data = pd.read_csv('../input/air-passengers/AirPassengers.csv')
series = TimeSeries.from_dataframe(data, 'Month', '#Passengers')
print(series)

Looking at the data after loading as series

# Splitting the series in train and validation set
train, val = series.split_before(pd.Timestamp('19580101'))

# Applying a simple Exponential Smoothing model
model = ExponentialSmoothing()
model.fit(train)

# Getting and plotting the predictions
prediction = model.predict(len(val))
series.plot(label='actual')
prediction.plot(label='forecast', lw=3)
plt.legend()

Splitting the series and getting the predictions
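For intuition about what an exponential smoothing model does, here is a hand-rolled sketch of the simplest variant, with a single smoothed level and no trend or seasonality. The model in darts is richer than this, so treat the code as an illustration of the idea rather than a reimplementation. The function names and the `alpha` value are assumptions chosen for the example.

```python
def simple_exponential_smoothing(values, alpha):
    """Return the final smoothed level of the series."""
    level = values[0]
    for value in values[1:]:
        # The new level is a weighted mix of the latest value and the old level.
        level = alpha * value + (1 - alpha) * level
    return level


def forecast_flat(values, alpha, horizon):
    """Simple exponential smoothing forecasts the last level for every step."""
    return [simple_exponential_smoothing(values, alpha)] * horizon


# Toy values for illustration only.
train_values = [112.0, 118.0, 132.0, 129.0]
forecast_values = forecast_flat(train_values, alpha=0.5, horizon=3)
print(forecast_values)
```

A larger `alpha` makes the level react faster to recent values, while a smaller one smooths more heavily.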

Plotting the predictions
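Backtesting, mentioned above, means repeatedly forecasting from an earlier cutoff and comparing against what actually happened. The plain-Python sketch below shows that loop with an expanding window and a naive forecaster. The helper names `backtest` and `naive_forecast` and the toy numbers are assumptions for illustration and do not reflect the darts API.

```python
def naive_forecast(history, horizon):
    # Repeat the last observed value for every future step.
    return [history[-1]] * horizon


def backtest(series, forecaster, start, horizon=1):
    """Mean absolute error over expanding-window forecasts from `start` onward."""
    errors = []
    for cutoff in range(start, len(series) - horizon + 1):
        history = series[:cutoff]                  # data known at the cutoff
        actual = series[cutoff:cutoff + horizon]   # what happened next
        predicted = forecaster(history, horizon)
        errors.extend(abs(a - p) for a, p in zip(actual, predicted))
    return sum(errors) / len(errors)


# Toy values for illustration only.
history_values = [112, 118, 132, 129, 121, 135]
score = backtest(history_values, naive_forecast, start=3)
print(score)
```

Any function with the same `(history, horizon)` signature can be swapped in, which makes it easy to compare several models on identical cutoffs.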

Github: https://github.com/unit8co/darts

Documentation: https://unit8co.github.io/darts/README.html

5. AtsPy:

AtsPy stands for Automated Time Series Models in Python. The goal of the library is to
forecast univariate time series. You can load the data and specify which models you
would like to run as shown in the example below:

# Importing packages
import pandas as pd
from atspy import AutomatedModel

# Reading the data:
data = pd.read_csv('../input/air-passengers/AirPassengers.csv')

# Preprocessing data
data.columns = ['month','Passengers']
# Assumes the months are stored as 'YYYY-MM' strings (e.g. '1949-01')
data['month'] = pd.to_datetime(data['month'], format='%Y-%m')
data.index = data.month
df_air = data.drop(['month'], axis = 1)

Code to use AtsPy library

The package provides a diverse set of fully automated models. Below is a screenshot
of the models available:

Github: https://github.com/firmai/atspy

6. kats:

Kats is another recent library developed by the research team at Facebook, dedicated
to handling time-series data. The goal of the framework is to provide a complete
solution for solving time series problems. Using this library we can do the following:

time-series analysis

detection of patterns including seasonalities, outliers, and trend changes

feature engineering module that produces 65 features

building forecasting models on time series data including Prophet, ARIMA,
Holt-Winters, etc.
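As a small illustration of the outlier-detection idea in the list above, here is a generic sketch based on the median absolute deviation (MAD). This is a common textbook approach and is not claimed to be the method Kats uses. The function name, the threshold of 3.5 and the toy values are assumptions for the example.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the positions of points that sit far from the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so no robust score can be computed.
        return []
    # 0.6745 rescales the MAD so scores are comparable to z-scores for normal data.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


# Toy values for illustration only: the 50 is the injected anomaly.
readings = [10, 11, 10, 12, 11, 50, 10, 11]
print(mad_outliers(readings))
```

Using the median instead of the mean keeps the score stable even when the outlier itself is extreme.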

The library looks promising and has just released its first version. Some tutorials
can be found in the GitHub repository.

Github: https://github.com/facebookresearch/Kats

7. sktime:
sktime, as the name suggests, is a unified Python library for time series
data that is scikit-learn compatible. It has models for time series forecasting, regression,
and classification. Its main design goal is interoperability with scikit-learn.

It can do several things; to mention a few of them:

State of the art models

Ability to use sklearn’s Pipeline

Model tuning

Ensembling of models
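To illustrate the scikit-learn style that these points rely on, here is a toy forecaster with `fit` and `predict` methods, plus a tiny tuning loop over one hyperparameter. The class `SeasonalNaive` and the helper `tune_season_length` are written from scratch for this illustration and are not sktime classes. sktime's own forecasters and tuning tools are considerably more capable.

```python
class SeasonalNaive:
    """Forecast by repeating the last observed season."""

    def __init__(self, season_length=1):
        self.season_length = season_length

    def fit(self, y):
        # Learned state gets a trailing underscore, as in scikit-learn.
        self.last_season_ = list(y)[-self.season_length:]
        return self

    def predict(self, fh):
        # fh is a list of steps ahead, e.g. [1, 2, 3].
        return [self.last_season_[(step - 1) % self.season_length] for step in fh]


def tune_season_length(y, candidates, holdout):
    """Pick the season length with the lowest holdout mean absolute error."""
    train, test = y[:-holdout], y[-holdout:]
    fh = list(range(1, holdout + 1))

    def score(season_length):
        pred = SeasonalNaive(season_length).fit(train).predict(fh)
        return sum(abs(a - p) for a, p in zip(test, pred)) / holdout

    return min(candidates, key=score)


# Toy series with an obvious period of 3.
y = [1, 5, 9, 1, 5, 9, 1, 5, 9, 1, 5, 9]
best_length = tune_season_length(y, candidates=[1, 2, 3, 4], holdout=3)
forecast = SeasonalNaive(best_length).fit(y).predict([1, 2, 3])
print(best_length, forecast)
```

Because every model exposes the same two methods, tuning, pipelines and ensembles can be written once and reused across models.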

The roadmap of sktime looks very promising and has a lot of developments coming
down the pipeline:

1. Multivariate/panel forecasting,

2. Time series clustering,

3. Time series annotation (segmentation and anomaly detection),

4. Probabilistic time series modeling, including survival and point processes.

If there is a specific library/package you would like me to cover in a detailed tutorial,
please comment and let me know. Also, if there are any other wonderful time series packages
that can be added to this list, please do not hesitate to comment. Thank you for taking the
time to read!

My other articles related to Time-Series:

1. https://towardsdatascience.com/7-statistical-tests-to-validate-and-help-to-fit-arima-model-33c5853e2e93

2. https://towardsdatascience.com/20-simple-yet-powerful-features-for-time-series-using-date-and-time-af9da649e5dc

Follow me on Twitter or LinkedIn. You may also reach out to me via
pratikkgandhi@gmail.com
