Forecasting Time Series With Scikit-Learn Models
Joaquín Amat Rodrigo
Javier Escobar Ortiz
A time series is a sequence of data arranged chronologically and spaced at equal or irregular
intervals.
To create a forecasting model, historical data is used to obtain a mathematical representation
capable of predicting future values. This idea rests on the assumption that the future behavior of a
phenomenon can be explained by its past behavior.
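One common way to turn "the past explains the future" into a regression problem is to use lagged values of the series as predictors. The sketch below is illustrative only: `make_lag_matrix` and `n_lags` are hypothetical names chosen for this example, not taken from the slides or from any library.

```python
def make_lag_matrix(series, n_lags):
    """Return (X, y): each row of X holds the n_lags values that precede y[i]."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(list(series[t - n_lags:t]))  # oldest lag first
        y.append(series[t])                   # value to be predicted
    return X, y


series = [10, 12, 14, 16, 18, 20]
X, y = make_lag_matrix(series, n_lags=3)
for row, target in zip(X, y):
    print(row, "->", target)
```

Any regressor that accepts a feature matrix and a target vector could then be trained on `X` and `y`.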
Single-step models
Recursive multi-step models
Direct multi-step models
Multi-step
The goal is to predict the next n values in the series.
[Figure: recursive multi-step forecasting. A single model for step +1 takes the window T-n … T-1 as predictors and is applied repeatedly (prediction steps 1 to 3) to produce T+1, T+2 and T+3. Legend: observed value, predicted value, predictors.]
forecaster.fit(y=data_train['y'], exog=None)
# Prediction
# ==============================================================================
steps = 36 # steps to be predicted
predictions = forecaster.predict(steps=steps, last_window=None)
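The recursive strategy can be sketched without any library: a one-step model is called repeatedly, and each prediction is appended to the window used for the next step. This is a minimal illustration, not the forecaster's actual implementation; `recursive_forecast` and the stand-in model `linear_extrapolation` are hypothetical names.

```python
def recursive_forecast(predict_one, last_window, steps):
    """Apply a one-step model repeatedly, feeding predictions back as predictors."""
    window = list(last_window)
    predictions = []
    for _ in range(steps):
        next_value = predict_one(window)
        predictions.append(next_value)
        window = window[1:] + [next_value]  # drop the oldest lag, append the prediction
    return predictions


def linear_extrapolation(window):
    # Hypothetical stand-in for a trained regressor: continue the last observed slope.
    return window[-1] + (window[-1] - window[-2])


predictions = recursive_forecast(linear_extrapolation, last_window=[1.0, 2.0, 3.0], steps=3)
print(predictions)
```

Because only a one-step model is needed, the number of steps can be chosen at prediction time.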
[Figure: direct multi-step forecasting. A separate model is trained for each step; the model for step 1 takes the window T-n … T-1 as predictors to produce T+1. Legend: observed value, predicted value, predictors.]
# Prediction
# ==============================================================================
steps = 36 # steps to be predicted
predictions = forecaster.predict(steps=steps, last_window=None)
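For contrast, the direct strategy trains one model per step of the horizon. The sketch below simplifies each model to a least-squares line that maps the value at time t to the value at time t + h; a real forecaster would use several lags and an arbitrary regressor. All function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def fit_direct_models(series, steps):
    """Train one model per horizon h = 1..steps (assumption: a single lag as predictor)."""
    models = []
    for h in range(1, steps + 1):
        x = series[:-h]  # value at time t
        y = series[h:]   # value at time t + h
        models.append(fit_line(x, y))
    return models


def predict_direct(models, last_value):
    """Each model predicts its own step directly from the last observed value."""
    return [slope * last_value + intercept for slope, intercept in models]


series = [float(v) for v in range(20)]
models = fit_direct_models(series, steps=3)
predictions = predict_direct(models, last_value=series[-1])
print(predictions)
```

The horizon (`steps=3` here) has to be fixed before training, since it determines how many models are fitted.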
● No prediction method outperforms another in all scenarios. It depends on the use case.
● Direct multi-step requires training a model for each step, so it has higher computational
requirements.
● With direct multi-step, the prediction horizon must be defined in advance, which is
unnecessary with the recursive multi-step approach.
# Backtesting forecaster
# ==============================================================================
from skforecast.model_selection import backtesting_forecaster

metric, preds_backtest = backtesting_forecaster(
    forecaster         = forecaster,            # forecaster to evaluate
    y                  = data['y'],             # full time series
    exog               = None,                  # exogenous variables
    steps              = 12,                    # steps predicted in each fold
    metric             = 'mean_absolute_error', # evaluation metric
    initial_train_size = len(data_train),       # number of observations in the initial training set
    fixed_train_size   = True,                  # keep the training set size fixed across folds
    refit              = True,                  # retrain the forecaster in each fold
    verbose            = True                   # print fold information
)
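The fold layout behind a backtest like this can be sketched in plain Python. This is an illustration of the idea under stated assumptions, not the library's code: `backtest_folds` is a hypothetical helper, and it assumes that a fixed training size means a rolling window while a non-fixed size means an expanding window that always starts at the first observation.

```python
def backtest_folds(n_obs, initial_train_size, steps, fixed_train_size=True):
    """Return (train_range, test_range) index pairs for walk-forward validation."""
    folds = []
    test_start = initial_train_size
    while test_start < n_obs:
        test_end = min(test_start + steps, n_obs)  # the last fold may be shorter
        train_start = test_start - initial_train_size if fixed_train_size else 0
        folds.append((range(train_start, test_start), range(test_start, test_end)))
        test_start = test_end
    return folds


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


folds = backtest_folds(n_obs=10, initial_train_size=4, steps=3)
for train, test in folds:
    print("train:", list(train), "test:", list(test))
```

In each fold the model would be trained on the train indices, asked for `steps` predictions, and scored on the test indices with the chosen metric.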
# Regressor hyperparameters
param_grid = {
    'n_estimators': [100, 200],
    'max_depth': [3, 5, 10]
}
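A grid like this expands into every combination of the listed values, and each combination is a candidate configuration to evaluate. The sketch below only enumerates the combinations; `expand_grid` is a hypothetical helper, and how each candidate is scored (for example, by backtesting) is left out.

```python
from itertools import product

param_grid = {'n_estimators': [100, 200],
              'max_depth': [3, 5, 10]}


def expand_grid(grid):
    """Return one dict per combination of hyperparameter values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


combinations = expand_grid(param_grid)
for params in combinations:
    print(params)
```

With two values for `n_estimators` and three for `max_depth`, six candidates are produced, so the search cost grows multiplicatively with each added hyperparameter.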
A prediction interval is the range within which the true value of y is expected to fall
with a given probability. For example, a 98% prediction interval is expected to
contain the true value of y with a probability of 98%.
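One way to check this definition in practice is to measure empirical coverage: the fraction of observed values that fall inside their intervals, which should be close to the nominal probability. The helper `empirical_coverage` and the numbers below are made up for illustration.

```python
def empirical_coverage(y_true, lower, upper):
    """Fraction of observed values that lie inside [lower, upper]."""
    inside = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return inside / len(y_true)


# Illustrative values only, not real data.
y_true = [10, 12, 11, 15, 14]
lower = [8, 11, 12, 13, 10]
upper = [12, 14, 13, 16, 15]

coverage = empirical_coverage(y_true, lower, upper)
print(coverage)
```

Here four of the five observations fall inside their intervals, which would be consistent with an 80% interval on this tiny sample.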
[Figure: prediction intervals in recursive multi-step forecasting. At each step an error term (𝝐1, 𝝐2, 𝝐3) is added to the output of the model for step +1, giving T+1+𝝐1, T+2+𝝐2 and T+3+𝝐3, which join the window T-n … T-1 as predictors for the following steps. Legend: observed value, predicted value, predicted value + 𝝐, predictors.]
forecaster.fit(y=data_train['y'], exog=None)
# Predict interval
# ==============================================================================
predictions = forecaster.predict_interval(
    steps    = 36,
    interval = [10, 90]  # 80% interval between 10th and 90th percentiles
)
predictions.head(5)
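Intervals of this kind can be estimated by bootstrapping: many recursive forecasts are simulated, a randomly sampled residual is added at every step, and percentiles are taken per step. The sketch below is one possible reading of that idea, not the library's implementation; `bootstrap_intervals`, `percentile`, the stand-in model `last_value` and the residual values are all hypothetical.

```python
import random


def percentile(sorted_values, q):
    """Linear-interpolation percentile of pre-sorted values, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100
    low = int(pos)
    high = min(low + 1, len(sorted_values) - 1)
    return sorted_values[low] + (sorted_values[high] - sorted_values[low]) * (pos - low)


def bootstrap_intervals(predict_one, last_window, residuals, steps,
                        interval=(10, 90), n_boot=500, seed=123):
    """Simulate recursive forecasts with resampled residuals; return (lower, upper) per step."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_boot):
        window = list(last_window)
        path = []
        for _ in range(steps):
            value = predict_one(window) + rng.choice(residuals)  # prediction + sampled error
            path.append(value)
            window = window[1:] + [value]  # the perturbed value becomes a predictor
        paths.append(path)
    bounds = []
    for step in range(steps):
        values = sorted(path[step] for path in paths)
        bounds.append((percentile(values, interval[0]), percentile(values, interval[1])))
    return bounds


def last_value(window):
    # Hypothetical stand-in for a trained one-step model: repeat the last value.
    return window[-1]


bounds = bootstrap_intervals(last_value, [1.0, 2.0, 3.0],
                             residuals=[-0.5, -0.1, 0.0, 0.1, 0.5], steps=3)
for lower, upper in bounds:
    print(round(lower, 2), round(upper, 2))
```

Because each simulated error propagates into later steps, the intervals typically widen as the horizon grows.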
forecaster.fit(series=data, exog=None)
forecaster
# Prediction
# ==============================================================================
steps = 24
forecaster.fit(series=data, exog=None)
forecaster
# Prediction
# ==============================================================================
predictions = forecaster.predict(steps=None) # steps to predict
print(predictions)
● Links
○ GitHub: Skforecast
○ Documentation and user guides
● sktime
● StatsForecast (Nixtla)
● Prophet
● NeuralProphet
● GitHub stars ⭐
● Improvement suggestions
● Recommend to your friends