Dynamic Time

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

320 IEEE LATIN AMERICA TRANSACTIONS, VOL. 21, NO.

2, FEBRUARY 2023

Dynamic Time Scan Forecasting: A Benchmark


With M4 Competition Data
Rodrigo Barbosa de Santis , Tiago Silveira Gontijo , Reviewer, IEEE and Marcelo Azevedo Costa ,
Reviewer, IEEE

Abstract—Univariate forecasting methods are fundamental for [21]. The 5th edition took place in 2020, and focused on a re-
many different application areas. M-competitions provide im- tail sales application with 42,850 unit sales hierarchical series,
portant benchmarks for scientists, researchers, statisticians, and with the objective to produce the most accurate point forecast
engineers in the field, for evaluating and guiding the development
of new forecasting techniques. In this paper, the Dynamic Time as well as the most accurate estimation of the uncertainty of
Scan Forecasting (DTSF), a new univariate forecasting method these forecasts [22]. The 6th competition will take place this
based on scan statistics, is presented. DTSF scans an entire year and it will focus on predicting the overall market returns
time series, identifies past patterns which are similar to the of individual stocks [23].
last available observations and forecasts based on the median Whereas most well-known forecasting methods are based
of the subsequent observations of the most similar windows in
past. In order to evaluate the performance of this method, a on identifying intrinsic components of the time series, such
comparison with other statistical forecasting methods, applied in as level, trend, or seasonality, a particular group of methods
the M4 competition, is provided. In the hourly time domain, an based on similarity searches have been arousing interest in
average sMAPE of 12.9% was achieved using the method with the the areas of meteorology and renewable energy [24], [25].
default parameters, while the baseline competition – the simple These methods consist of identifying past weather patterns
average of the forecasts of Holt, Damped, and Theta methods –
was 22.1%. The method proved to be competitive in longer time ("analogs") that closely resemble the current state. These
series, with high repeatability. methods are capable of handling lengthy historical time series
in order to produce accurate and interpretive forecasts.
Index Terms—Univariate methods, M4 competition, bench-
marking, dynamic time scan forecasting. Among these methods, Dynamic Time Scan Forecasting
(DTSF) consists of a new and simple analog-based forecasting
technique [26]. It generates forecasts based on similar patterns,
I. I NTRODUCTION those with the highest R2 scores, calculated from the last
available window.
he development of predictive models is widely debated
T in the literature [1]–[4], since it assists the control of
associated uncertainty intrinsic to random variables. Given the
The accuracy of analog-based methods is scarcely reported
in areas other than energy prediction and is mostly limited
to wind and solar energy forecasting applications [27], [28],
above, there are several categories of predictive models based
which begs the question: "are analog-search-based models
on this physical knowledge (such as spectral analysis [5])
competitive compared to classical statistical prediction meth-
of intensive machine learning and statistical approaches [6].
ods?". Additionally, no research was found that compared
Forecasting models associated with a single random variable
analog search methods and statistical methods.
as a function of time support univariate forecasting, which is
To fill this gap, the current paper describes the DTSF
a very important area given its application in various sectors
forecasting method and discloses its performance on the
such as [7]–[9], business [10]–[12], energy [13]–[15], among
M4 competition time series. We compare DTSF with eight
others. In this context, it is fundamentally valuable to develop
classical statistical methods (Naive, Seasonal Naive, Simple
meticulous criteria for selecting the models [16].
Exponential Smoothing, Holt, Damped, Theta, AutoRegres-
The M-competition [17]–[20] is the most important fore-
sive Integrated Moving Average (ARIMA), and ExponenTial
casting competition in academia, in which researchers from all
Smoothing state space model (ETS)) and a combination of the
around the world test their methods on real-life, anonymous
outcomes of 3 individual methods (Holt, Damped, and Theta),
time series from distinct areas of industry. The 4th edition took
which compose the baseline of the M4 competition. The M4
place in 2018 [17], and 17 methods based on combinations
benchmark dataset was selected for this research because: (1) it
of statistical- and machine-learning or hybrids were tested on
consists of a reliable and curated benchmark base, adopted by
100,000-time series. Outputs from these events are registered
other researchers and practitioners for developing and testing
in review articles, pointing out the directions of development
forecasting methods; (2) it has a significant number of series:
and refinement of the most promising forecasting techniques
100,000 time series, with different frequencies (hourly, daily,
Rodrigo Barbosa de Santis is with LADEC Lab, Federal University of monthly, weekly, quarterly, yearly); (3) it has been mostly
Minas Gerais (UFMG) e-mail:rsantis@ufmg.br predominated by statistical methods of forecasting; (4) and
Tiago Silveira Gontijo is with LADEC Lab, Federal University of Minas it is composed of univariate and independent series.
Gerais (UFMG) e-mail:economista@ufmg.br
Marcelo Azevedo Costa is with LADEC Lab, Federal University of Minas The major contributions of the present paper can be sum-
Gerais (UFMG) e-mail:azevedo@est.ufmg.br marized as follows:
DE SANTIS et al.: DYNAMIC TIME SCAN FORECASTING: A BENCHMARK WITH M4 COMPETITION DATA 321

TABLE I
• the study applies a new method to M4 competition for
S UMMARY OF M4 COMPETITION DATASET, INCLUDING
benchmark purposes;
TIME - FREQUENCY, MINIMUM LENGTH OF TIME SERIES ,
• the method is compared with nine classical statistical
AND FORECAST HORIZON OF EACH TIME SERIES .
methods and a combination of the outcomes of three
individual methods, which compose the baseline of the Domain Number of series Min. length Horizon Seasonality
competition; Yearly 23,000 13 6 1
Quarterly 24,000 16 8 4
• in addition to applying the method, along with its default Monthly 48,000 42 18 12
parameters, an exhaustive search with hold-out validation Weekly 359 80 13 52
is adopted for model selection. Daily 4,227 93 14 7
Hourly 414 700 48 24
The major conclusions are:
• in the hourly time domain, an average error of 12.9% was
obtained using the method with the default parameters, B. Dynamic Time Scan Forecasting
while the competition baseline was 22.1%; DTSF is a forecasting method based on scan statistics [31]
• through the automatic selection of parameters, we boosted and was originally developed to address the problem of wind
the accuracy of the method by 12.31% compared to the forecasting for Brazilian power generation plants. It consists
method application without parameters selection; of scanning a time series and identifying past patterns (called
• the method proved to be competitive, both in terms of "analogs") similar to the last observations available of the time
accuracy and computational cost, over long time series series (called "query") [26].
and with high repeatability. Let yt be a time series of length N , t = 1, ..., N . Firstly, let
vector y[w] be defined as the last w observations of the series:
The present paper is organized into 5 sections. Following
this Introduction, Section 2 provides a review of the proposed y[w] = [yN −w+1 , ..., yN ]. (1)
forecasting method. Section 3 provides a background of the
The goal of DTSF is to identify analogs in the time series
datasets and methods applied in this study. Section 4 presents
which are greatly correlated with vector y[w] . Hence, the set
the results and discussions obtained from the application of
of candidate vectors can be defined by:
the methods. Finally, Section 5 concludes the present paper
and includes some recommendations for future studies. xt
[w]
= [yt−w+1 , ..., yt−w ] (2)
where t = 1, ..., N −2·w. The upper limit of the time sequence
[w]
II. M ATERIALS AND M ETHODS (N − 2 · w) guarantees that vector xt does not overlap with
vector y . Fig. 1 presents the DTSF procedure. Given the
[w]

A. M4 Competition Dataset last w observed values, which comprises vector y[w] , a rolling
window with the same size (xw t ) is used for scanning previous
The data used in the current study comes from the M4 com- values of the series.
petition dataset [17]. It is composed of 100,000 time series, Lastly, DTSF provides a k steps ahead forecast of the time
taken from different domains such as Economics, Finance, series, yN +1 , ..., yN +k . To produce this outcome, the DTSF
Demographics, and Industry, among others. The time series [w]
scans the series to find the closest analogs xt . The subsequent
show different periods: yearly, quarterly, monthly, weekly, values of the time series are used as the forecast values:
daily, or hourly.
yN +i = fx[w] (yt−w+i ) (3)
Table I summarizes the information about the competition’s t

dataset. Domain refers to the time period from which the where fx[w] is a function which correlates the elements of
data have been extracted, ranging from hourly to yearly. The [w]
t

vector xt and the elements of vector y[w] .


number of Series shows how many time series are available,
According to that, a first constraint can be set on k : 1 ≤
in total. The dataset is mostly composed of a collection of
k ≤ w. This constraint guarantees that if the most correlated
time series from yearly, quarterly or monthly domains - 95,000
time series window comprises the most recent values, prior to
time series. The minimum length is the shorter time series
vector y[w] , then the forecast values are a function of vector
in the given domain: the more aggregated the domain, like
y[w] ,
yearly, the more difficult it is to retrieve data. For example,
hourly time series are longer, having at least 700 available yN +i = fx[w] (yN −w+i ). (4)
N −2w
observation points. Horizon refers to how many steps are
being predicted in the future and are being used for metric As stated in Equations (3) and (4), forecast values depend
computation. Seasonality represents the expected recurrence on the window length w and the function fx[w] (.). A intuitive
t

of an event in a given time domain. proposal for function fx[w] (.) is a linear scaling of the elements
t
[w]
The dataset provides a public and reliable source for com- of vector xt , i.e., a linear model. This occurs due to the fact
paring statistical, machine learning, or hybrid methods on that previous values are likely similar to the last observations,
univariate time series forecasting [29]. It is internationally except for a scale and/or offset shift. So, the method searches
recognized by researchers and data scientists as the most for values that may be similar to the last values, after applying
important competition in this area [30]. a similarity function [26].
322 IEEE LATIN AMERICA TRANSACTIONS, VOL. 21, NO. 2, FEBRUARY 2023

Query Forec.

Target variable (y)


Analog 1
Subsequent observations to analog 1
Adj. forecast from analog 1
Analog 2
Subsequent observations to analog 2
Adj. forecast from analog 2
Analog 3
Subsequent observations to analog 3
Adj. forecast from analog 3

0 20 40 60 80 100
Time (t)

Fig. 2. Example of DTSF application to forecasting a time series.


Fig. 1. Illustration of the DTSF time series scan procedure. The three colored lines represent the top three analogs correlated to
the queried period. The dashed lines are the subsequent observations
of the analogs. The forecast is given by the median of the adjusted
forecast from the subsequent observations of the top analogs.
By taking a linear function as the similarity function, the
parameters of the model can be estimated to minimize the
sum of squares between the elements of vector y[w] and the As a data-driven method, DTSF usually performs better
[t] [t] [w]
linear equation: β0 + β1 × xt . Moreover, the similarity on time series with large numbers of observations and it
statistic can be assumed as the linear regression coefficient of can also be extended to search the patterns of secondary
determination R2 [26], [32]: series related to the prediction. The main disadvantage of
2 the method is the computational cost of scanning the entire
P 
j
[w]
yj − ŷj
[w] time series and calculating the similarity profile. However,
R2 = 1 − P  2 (5) more efficient methods, such as the Maureen’s Algorithm of
[w] [w]
j yj − ȳj Similarity Search (MASS) which applies convolution, have
been applied for speeding up this task [27]. To keep it feasible,
[w] [w] the linear similarity functions commonly adopted are from the
where yj is the j-th value of vector y[w] and ŷj is the j-
first to the third-degree polynomials.
th predicted value using the estimated linear function. Finally,
the method calculates a similarity profile based on the R2
score resulting from the comparison of the query with previous C. Statistical Forecasting Methods
windows. The analogs with higher R2 scores are considered A univariate forecasting method is a procedure for estimat-
closer analogs. Predictions of future steps are calculated from ing a point. The forecast is based on past and present values
a predefined number of analogs using aggregation functions, of a given time series [33]. This method is generally applied
such as median [26]. when there is a large number of series to forecast, or when
The DTSF model requires three parameters to be selected multivariate methods require forecasts for each explanatory
by the user: the length of the query window, the similarity variable. Given the advantage of simplicity and high usage,
function specification, and the number of analogs to be con- univariate forecasting methods are employed in most of the
sidered. The original implementation of DTSF is available on forecast applications in areas such as business, energy, and
the R package, DTScanF. In the present study, the original finance. The following methods are selected from the latest
implementation is the extent to which the aggregation function M4 competition benchmark [17], and a simple explanation is
applied to analogs can be either the median or the mean, given for each one, as follows:
according to the user or the model selection procedure. 1) Naive: the simplest, yet still powerful forecasting
Fig. 2 illustrates the forecasting procedure, using time method; assumes that the next steps to be predicted are
scanning in a given hourly time series, adopting a window equal to the last available observation [20].
with a length equal to 48 hours, a linear similarity function 2) Seasonal Naive (sNaive): the same concept as Naive,
(degree equal to 1), and the three analogs. Windows 1, 2, and 3 with the adaptation that the time series is deseasonalized;
are the ones most similar to the last window of available data. method adjusted and forecast later, re-adjusted with the
The forecast is given by the median (but other statistics can seasonal component [20].
be used such as the mean) of the subsequential observations 3) Naive2: each time series uses the forecast of either Naive
of the analogs. or sNaive, based on their score on the validation set.
DE SANTIS et al.: DYNAMIC TIME SCAN FORECASTING: A BENCHMARK WITH M4 COMPETITION DATA 323

TABLE II sM AP Ek /sM AP Ebase + M ASEk /M ASEbase


PARAMETERS RANGE ADOPTED FOR DTSF. OW A =
2
(8)
Parameters Range where Yt is the post sample value of the time series at point
Polynomial degree 1
Analogs 10 t, Ŷt is the estimated forecast, h is the forecasting horizon, m
Window size 48 is the frequency of the data, k is a given regressor, and base
Aggregation function Median is the sNaive estimator.
A hold-out cross-validation scheme is adopted to evaluate
and select the best parameters for the methods, in which the
4) Simple Exponential Smoothing (SES): classic statistical last k observations are kept as the validation set, k being equal
method which applies an exponentially weighted aver- to the forecast horizon. All possible parameter combinations
age [34]. are enumerated within the defined ranges, and the methods are
5) Holt: exponential smoothing with level and linear trend tuned using an exhaustive grid search procedure with sMAPE
components [34]. as the scorer.
6) Damped: exponential smoothing with dampened param-
eters for flattening trends, after a given period [35].
7) Theta: method based on a coefficient of curvature of the E. Software and Hardware
time-series, applied to the second difference of the data Routines were implemented using the R 3.6.0 program-
[36]. ming language with the official benchmarks and evaluation
8) Combined (Comb): the simple average of the forecasts script of M4 Competition, available at the GitHub repository
of the previous three models: Holt, Damped and Theta. (https://github.com/M4Competition/M4-methods). The Fore-
9) ARIMA: general forecast method estimated from the cast 8.7 package is used for the SES, Holt, Damped, ARIMA,
autoregressive, moving average and integration compo- and ETS methods. DTSF comes from the official implemen-
nents from the time series analysis [37]. tation of the method in R and C++, available from the public
10) ETS: automatic forecasting based on an extended range repository (https://rdrr.io/github/leandromineti/DTScanF/). All
of exponential smoothing methods [38]. data and scripts are available from the authors upon request.
11) DTSF: the proposed method, adopting the defined de- Computer specifications used to execute the algorithms and
fault parameters, which are: (i) polynomial function calculate the forecasts are as follows: CPU 8-core Intel Core i9
degree equal to 1, (ii) analogs equal to 10, (iii) window 2.3 GHz, 16 GB of RAM, and macOS 12.5 operating system.
size equal to length of forecast horizon, and (iv) median Once the predictions are calculated, the error arrays are next
as aggregation function [26]. calculated and saved as RDS files, allowing analysis of the
Table II presents the range adopted for the parameters of results. Fitting time is computed from the time delta of the
the proposed method. The polynomial degree is the degree of system, before and after each execution of the methods.
the function used for approximation, analogs are the number
of analogs to be used to estimate the forecast, window size III. R ESULTS AND DISCUSSION
defines the length of the scan window, and aggregation func-
Table III presents the average sMAPE achieved by each of
tion is the one that transforms the projection of the analogs
the statistical methods and by the proposed method, computed
into the final forecast.
for each of the time domains. The Theta method achieved the
best scores for the yearly and monthly frequencies (14.603 and
D. Model Selection Procedure 13.003), which composed more than 70% of the total of the
series, thus contributing to this particular method outperform-
The split of the data into training sets and test sets split is ing the other methods in the overall average (12.312). In the
predefined and given by the competition organizers. The data individual domains, Comb achieved the lowest error for both
come from different files for each of the time series domains. the daily (10.197) and the quarterly (10.197) domains, while
The test set has a fixed horizon for all the time series, and the ARIMA method scored the lowest error on the weekly
it is used only for computing the final scores. The evaluation frequency (8.593).
metrics adopted are the same ones that are applied in the M4 The average error of all methods is the lowest for daily
Competition, and are those most used in literature [39], [40]: frequency (close to 3.00), and there seems to exist a trend
the Symmetric Mean Absolute Percentage Error (sMAPE), toward increasing as the time domain becomes broader: the
Mean Absolute Scaled Error (MASE) and Overall Weighted weekly average error is around 9, the monthly is around 13,
Average (OWA). The formula for calculating the metrics is and so on. The exception is for the hourly frequency, in which
given: most of the statistical methods scored errors from 13.912 to
h 43.003.
1 X 2|Yt − Ŷt | DTSF exhibited errors considerably fewer errors methods
sM AP E = (6)
h t=1 |Yt | + |Ŷt | considered for benchmarking in this particular kind of time
Ph series (12.927). This makes the DTSF method interesting
1 (n − m) t=1 |Yt − Ŷt | for studying applications in which competitive estimators are
M ASE = Pn (7) sought.
h t=m+1 |Yt − Yt−m |
324 IEEE LATIN AMERICA TRANSACTIONS, VOL. 21, NO. 2, FEBRUARY 2023

Table IV presents the evaluation of the methods using OWA. the great gain in accuracy that explains the best performance
This metric is understood as showing how one method is more of this method, on average. Analyzing the sets between the
accurate when compared to Naive2. If OWA is lower than 1 170th and 300th time series with the smallest error, there is
the method is more adequate than Naive2. Otherwise, Naive2 less distinction between all the methods which, in general,
provides better forecasting performance. The DTSF scores for presented errors very close to each other. Other methods have
the hourly series imply a meaningful increase in accuracy shown a lower errors than DTSF along all time series, specially
over the Naive method (0.552). Moreover, when applying fine- the methods ARIMA and ETS. In the set between 300th
tuning, the gain increases to nearly 50%. For all other domains, and 414th, DTSF again marginally outperformed the other
the only ones in which the method performed worse than benchmark methods in most of the series.
Naive2 were the yearly and the daily, both of which have Table V presents the average sMAPE detailed by the
in common the longer term forecast period and the lowest forecast horizon, grouped by 6-hour periods. DTSF obtained
seasonality traits in common. lower errors, for all horizons than the other compared methods.
The outcome of the experiment can be explained by the Furthermore, the average error is 12.9%, and the highest errors
intrinsic design of the DTSF method, which was originally were obtained during the periods between the hours from 19
conceived to deal with very long time series with recurrent to 30 and the hours from 43 to 48.
patterns, such as its original application to 30-min frequency To provide better visualization of error evolution over time,
wind speed forecasting. Comparing results to Table I, which Fig. 4 presents the mean errors per step of each method
presents the seasonality, length, and forecast horizon of each (excluding the three from the previous figure), for all hourly
time domain, it is shown that the DTSF accuracy is greater time series. An increase in error over time, according to the
when the number of available data points is also greater. phenomenon of error propagation, is expected. This is better
Fig. 3 displays the average sMAPE for each one of the observed in the Holt method, in which error varied from 10%
414 hourly time series available in the competition database, at the first step to 40% at the last step. Moreover, in such
listed in ascending order according to the calculated error a visual representation, the Theta model is perceived to have
of the DTSF method. The methods Naive, sNaive and SES been more accurate, on average, than the DTSF model for the
methods were holdouts of the graphical representation. The y- 1st and 24th hours.
axis is presented using the base-10 logarithmic scale in order
to facilitate visual analysis.
Naive2
40

Holt
Damped
Theta
Comb
DTSF
ARIMA
35

2 2 2 2 2 5 2
2 52 2 2 1 8
2
5
3 3 2 ETS
5 2 8 8 2 2 4 1
2
5
4
3 282
3
5
412
14 4
1
5 3 5 85 3
2.0

5 1 8 3 75 25 76
2 23 2 2
8 2
5
4
2 18
3
42 752
1
3
442 22
15
3
5 2 5 75 1
3
4
1
2 224
5
4
3
2
5
3
1
4
18 4 3
18 4
3
5
21 75
5
3
14 3
4 27
166
5
6
5 5 2 2 3 2 2 2 8 2
715 8
2
5
5
3
1
4 75 185
2
3
4
7 1812
4
3
4
3
5
25
15
3
4
2
3
4
1
3
4
176665
4
17
3
5 8 3 5 8 3
1
4
2 3
4
1 2 8 2 7
5 8 5 23 8 2 1 1 58
17 5
4
3 1
2
3 755
4 1 7 88668 8 6666 878
6 5
2
3
4
30

3 3
4
5 3
17
4
2 668 1
82 3 8 8
5 53 2 1 5 8 25
5
3
4 2 28 5
4
3 3
4
22 1
5
4
38
2
5
3
428 285
15
4
3
17
1 78
3
4
2 3
5
1
2 775
8
8514
3
4
2 3
178 75
3
4
5
22
253
4
1 6
6 66662 878785
7 662
2
4
3
1
3 2 15 82
5
3
4 1 1 1881785 2178 74
7 7 2
1
3
4
725 8
158 6 78 83
167
4 6
68
7
66668775 1
5
3
4 2
72
178
5 2
2
5 2 2 322
5
11 218
5
4
3 4
3 7 2
5
3
4
15
7782
18
3 25 6666666 2
75172
3
14 3
4
17787885
5
1
3
4
2
3 1 4 88 8
1
4
3 2
31177 8
5
4
8 2
5 8 728
4
3
725
5
3
4 2
4
3
1
778 78
75287
3
4 2
5
3
4 777 85 3
4 21768
1
4
3
285 3
5
48
7
4 18
5
6663
4
6
3
4
682
3
64
117
3
4
66 66668
668 187875
5
2
3
4 3
4 3
4 3
1
4
1.5

4
1 3
8 8 3
4
1 7 5 7 82 1 15
2
3
4 7 4
3
5
2
62
7
66
6
5
2
3
42
1
66
6
2
2
5
4
8
36 7 7 87 7 2
5
3
4 7 8 8 7
8 54 1 8 22 2 2 5 2 878 2 7 811 4 15
3 2 875
4
3 288
5
4
3 778 1667
3 2
5
3
4
1666666 38
685
4115
17 778 878
3
4 88
25 1
4 1
4 3 2 2 2 8 25
5
3
4
83
4 5
3
4
1 8 8
1
3
5
2
4 5
3
2
4 7 1
4
7
31 1 1
8 7 7 12
5
4
6
34
5
2
6 6 14
7
3
5
2
1 8
3
1
4
5 18 2
5
8
4
3
1
1
4 1
4
15
2
1 57 5 18 8 88775
4
2
3 2378 2 1788
17878 5
7
3
4
81775
28 15278
3
4 1668
2
5
3 67878881
8
666666665 2
125
7
2 1 78 7
2 2 22
4
1 5 4
1
3 25
7 4
3 4
3 2528
3 5
4
3
7 68
3
4
5
2 75
2
3
4 2 4 3
4
2 5
4
3 18
3
4 3
5
4
sMAPE

4 3 2
28 3 2212 22 22 2 2 2 2 2225 5 2 23
3 2 4
1 34122 8 2 2 55 22 8
54
1 2 7 85 8 5 28
5
4
3 2
35
4
3
17
8 5
18
2
3
4 27
5
3 25
5
3
4 3
4
28 88
7 2
15
17166668
2
5
3
4 14
74 668
2666615
3
6
5
6666666668
1666
66617 7 8 5
8
2
3
4
1
3
4
2
17 4 18 5
3
4 28
5
18
3
4
25

8 5 4
3 17 1 153
2
15
411153
4
2 3
4 76 7
2241222 4 22 2 322 2 28 77 2
5
3
4 3
4
2 66 6
885
2
3
4 27 7
3
4 7
22 22 2222 222 2 22 2 22 2 22 4
3
1 7
5 2 4 12 5 4
3
1 3
4
87766666665 666 27
5
3
4
166668
6 67
67667 18
3
4
5 2
5
3
4
15 3 5 4
3
2
22 25
1
4 2 2 2 1
4 22 2 2 2 2 2 22 2 2 25 2 5 6
67
6
64
1
7
6
6 7
8
2
666
8
2
5
3
1
427288
76
8
3
5
4 4 18
7
3
5
4
2
2 1 77
2
3
4
7 4
1
2
1
1
5 22 422 2 22 2 2 222228 12
4
3
2 5 16
4 563665
6 7 17 1 1 8
5
3
4
2
1 7 75
2 2 2 8
5 2 3 7 33 3 1
8
3
4
5
2 66
65 4
2
3
1 7 7
2 5 7522 2 4 2 238 225 5 5238 6668
1.0

5 5 55 2 3
1 8 3
1 28
5
4
2 1 78
3
5
4
2 7
3
1
5
4
2 718
2
4
5
5 52 5 5 5 5 25 5 5 2 5 5 2 5 75 2 2 4 8 82 22 4
666 5
2
3
1
4 7 8
3
2 5 5 55 5 5 255 14
1 75 2
18
3
441874
18
7 8 78 17
6
166668
5 17 78 75
68 2
4
14 3
55 5 58 5 8 38 5
34 4
3
1666666666 4
3
5
2 7 7 25 1 8
2 7
4
1
log10(sMAPE)

2 77 4
3 5 5 555 5
7 5 218 737812 8884
2
5
3
4 27 77 5 66 567
4668
6 77 2374
1417728
3 55 55 55 555 525 5 5 55 55 55575554
4
1 355 77 755 1 13 1 6666667666646
18
217 175778
75 5
3
4
17838 13 8
2 5
4
20

5 1
5
8
2 4 4 56666676666664 75 3 4
3
2 5
3
4 1
8
2 3 5555533 5375 355 5 55 5 55553555 5 73
4 8
548
2
1
4477 1
8
11
22 2 74
873
1 1564
4
5
2
8
12
12
5
4
3
6
676667 8 4
17158
3
4
5 17 7735 2
38
4
8 278
5 1 37
4
3
5
7 3 3
37 47 333 5 7 3 5 55 5
3 65
8
6
466
76
7
13866667178566
3
4
5
3
1
2777 8 8 75 2 5 7 4
3
1 5
3 8
3 1 37 3 53 3 3557 33 7333 7 333758 5
18
43
18
5
4
2 2 2
3 7
8
1715
17 118
4
2
3 3 5
7 134
4
3
34141 5 3 33 4 1 3315325 33534 1 471444 33 4 33 354 7 4
31 18 23
3 35 33 5
3
4 1
666668
5
4
356
26
6 1
4
75
332 148
3
5 3
4 535 738
787 75773 17348 18 3
18
13 3333343333 133335544 13 1334 11 4 11 8488
3
5
2 167 8
4 8
16667
4
156
6 7 4 43 2
4 1 84
4 4
5 7 58 48
17
0.5

4 3333 14
4 14 131314 1 34 134 2
5
4
3
8 66668 6 77778 8 1
144134
4 134
144 1334 3374344 4 1324 723368
18
4 67
17
2 6 5
6 8
14
41447441 48 7 1 24
1414 44 314 47144
1 145
1434
2
1134
14
4
11
4331 4 11 4 3
14 112
14 3 43
34
1 4 334
144 1433
4 4 4 4
5 1 3335 22
4
3
1 4
144
14 4
3 35
1
3
1
27
4
25 14
815
5 3
1
4
2
2
47
88
8
5
5
3
1
4
2 7667
72
4
5
65
3
4
6
7
81
7
4
1 84
47
1
2
25
3
4 3
1
58
7
2 3
4
14 22 741 85 5
8
185
3 3
3
2814
18 31813 4 5 4
3
1
2 4
5
2
3
1
1
8
4 1 1 11 1 1 4 4
1 14 3 3 4
31 5
4 1 1 1 1 1 3
1 22
3 4
513 3 5
1 1 34766668373
18
3
16 5
7 5
3 4 5 1 8 2
1 1 4 1 41 1 1 444 41 45 14 1 1 1 4 43 44
5
43
15 1 44 3 58
4 18
4 855 4 728 15 5183 5
15

3 1 3
11 4 4 2
4
3 121
5 1 1 3
17 4 166 1 71
2
4
5
3
7 8
1
3 3
1
5
4 2
4
5
3
1
4
1 4 2 2 111 51 4 3 3 84 4 3
132 4 12 1 41 41 5 4 1366 2
7 73 5 5
7 25
5
3
1 1
3
4 3
1 83
4
5 4 1 4 112 1 43
4
5
3 2
14 22 24
4
5
3 1 83 67 338388
3
7 8 8 884 15
145 2
1 4 667
135
3 55
3
1 1 66
3
2
4
8 3 85 1 2
3
2
4 18 8 88 5
3
4
5 4
3
2 17886666668
2
5
4
3
0.0

888 8 8 8 8 5 8 666667
8 88 88
8 888 82 878 8 68 3
68
848 8 375 3
5 86 8 5
1666666666
1
4
2
8 88 88 881 8 8 8 4 88
7
1664
1
3
2 2
4
6 877 77 7
10

7
5 4 866667
6
8 87888 7 87
8 73
152
3
1 6667
4
7 67
6 666666666678
666668 7 8788 77
8
2
5
4 7 7
8 88 77 6666666666666688 7 78
8
7 66
6
66
666
6
6666 8 7
7 877 8
3
18 7 87 7 8 1
5
3
2
4 7 8
8
6
6
666667
6
66
6
66
8 7 7
8 8 8 8 8 8 Naive2
7 8 78 75 18786666
3
2
4 6668
766666666666667
66 6 7 8 7877 7 7 8 8887877
7 78
38
−0.5

17 877 6666666666
6667 17 8 787 778 8 7 7 77878 7747 2 4
577 77 Holt
2 6666666687 7774
666667
2
5
3 8 77877 78 7 877 2877 1 8 87
3 66667
5
4 668
67 7 7 7 7 7 8
7 8 5
3
1 8 7 7 Damped 0 10 20 30 40
777 78 78 8 7 87 8 87
66667778 7 7 8 7777 777 7 77 7 7 Theta
66 7 7 Comb
8
7 8 Step
67 DTSF
7
−1.0

8 ARIMA
7
8 ETS
Fig. 4. Average sMAPE (obtained in the 414 hourly time series by all
0 100 200 300 400 the methods for each step of the prediction, up to 48 hours – forecast
Time Series horizon).

Fig. 3. Forecasting methods average sMAPE for each of the 414


hourly time series, ordered by the accuracy of the DTSF method. Most statistical methods presented a pattern of very similar
The proposed method obtained fewer errors for most of the time curves, with the exception of the DTSF method. In DTSF,
series in this particular domain of application. the errors presented a different pattern, alternating peaks, and
valleys with the patterns of the other statistical methods. In
In the first 170 time series with the lowest sMAPE – one- general, DTSF appeared to remain more stable throughout the
third of the total available – the method proposed in the period, experiencing less of the error propagation effect and
present article achieved errors close to 10−2 , while most of the not exceeding the limit of 20%. These are more examples that
others obtained errors between 100.5 and 102 . This shows the explain the better performance of the DTSF method, compared
enormous predictive power in this specific type of series, and to the benchmark, in the hourly domain.
DE SANTIS et al.: DYNAMIC TIME SCAN FORECASTING: A BENCHMARK WITH M4 COMPETITION DATA 325

TABLE III
T HE PERFORMANCE OF DTSF COMPARED TO M4 BENCHMARK STATISTICAL METHODS – S MAPE METRIC .
sMAPE
Yearly Quarterly Monthly Weekly Daily Hourly Average
Method (23k) (24k) (48k) (359) (4,227) (414) (100k)
Naive 16.342 11.610 15.255 9.161 3.405 43.003 14.207
sNaive 16.342 12.521 15.994 9.161 3.405 13.912 14.660
Naive2 16.342 11.012 14.429 9.161 3.405 18.383 13.565
SES 16.398 10.600 13.620 9.012 3.405 18.094 13.089
Holt 16.535 10.955 14.833 9.706 3.070 29.474 13.839
Damped 15.162 10.243 13.475 8.867 3.063 19.277 12.655
Theta 14.603 10.312 13.003 9.094 3.053 18.138 12.312
Comb 14.874 10.197 13.436 8.947 2.985 22.114 12.567
ARIMA 15.150 10.408 13.486 8.593 3.185 14.081 12.679
ETS 15.356 10.291 13.525 8.727 3.046 17.307 12.725
DTSF 16.816 11.006 13.823 8.983 3.313 12.927 13.370

TABLE IV
T HE PERFORMANCE OF DTSF COMPARED TO M4 BENCHMARK STATISTICAL METHODS – OWA METRIC .
OWA
Yearly Quarterly Monthly Weekly Daily Hourly Average
Method (23k) (24k) (48k) (359) (4,227) (414) (100k)
Naive 1.000 1.066 1.095 1.000 1.000 3.593 1.072
sNaive 1.000 1.153 1.147 1.000 1.000 0.628 1.106
Naive2 1.000 1.000 1.000 1.000 1.000 1.000 1.000
SES 1.003 0.970 0.951 0.975 1.000 0.990 0.970
Holt 0.956 0.935 0.989 0.964 0.997 2.760 0.976
Damped 0.888 0.893 0.924 0.916 0.996 1.140 0.912
Theta 0.872 0.917 0.907 0.971 0.999 1.006 0.906
Comb 0.868 0.891 0.920 0.926 0.979 1.559 0.906
ARIMA 0.891 0.898 0.904 0.927 1.041 0.950 0.906
ETS 0.903 0.890 0.914 0.931 0.996 1.824 0.913
DTSF 1.002 0.961 0.950 0.914 1.092 0.552 0.969

TABLE V
AVERAGE S MAPE OBTAINED IN THE 414 HOURLY TIME SERIES BY THE PREDICTED STEPS , GROUPED IN 6- HOUR
PERIODS .

Steps
Methods 1-6 7-12 13-18 19-24 25-30 33-36 37-42 43-48 1-48
Naive2 16.3 20.1 18.8 15.7 18.2 20.7 19.3 18.0 18.4
Naive2 16.3 20.1 18.8 15.7 18.2 20.7 19.3 18.0 18.1
Holt 15.7 23.0 27.1 27.5 29.9 34.9 37.9 39.8 29.5
Damped 15.5 20.3 20.5 17.5 18.1 21.2 21.2 19.9 19.3
Theta 16.1 19.9 18.5 15.3 17.8 20.5 19.2 17.8 18.1
Comb 15.6 20.6 21.8 19.7 20.8 24.9 26.7 26.8 22.1
ARIMA 14.2 11.4 11.2 15.8 15.4 13.9 13.4 17.0 14.1
ETS 13.6 16.5 16.4 16.6 16.5 19.0 18.9 17.4 17.3
DTSF 12.6 10.7 10.2 15.0 14.8 11.6 11.6 11.6 12.9

TABLE VI
Table VI shows the time necessary to fit the methods for
T OTAL AND AVERAGE TIMES NECESSARY FOR FITTING THE
all of the 100,000 time series. The methods Naive2 and Comb
METHODS .
have been omitted as these two are a combination/selection
of individual methods. Total fitting time is given in seconds, Methods Total fitting time Average time Ratio to naive
while the average time per series is given in microseconds. The (s) per series (ms)
Naive 0.458 1.106 1.00
Ratio Naive column compares the average time of a particular sNaive 0.656 1.584 1.43
method compared to the execution time of the Naive method. SES 2.219 5.360 4.85
Holt 5.947 14.365 12.99
Damped 12.789 30.892 27.94
Theta 2.964 7.159 6.47
ARIMA 18437.598 44535.261 40278.22
DTSF was the method that consumed the most compu- ETS 1838.638 4441.155 4016.63
DTSF 6.241 15.074 13.63
tational time, almost 9 times more than Naive. It is worth
mentioning that the default parameters for DTSF adopt 10
analogs to estimate the forecast. Also, part of the method is
executed in the C compiled language, and part of it is executed
in R.
326 IEEE LATIN AMERICA TRANSACTIONS, VOL. 21, NO. 2, FEBRUARY 2023

IV. C ONCLUSIONS [11] Y. Zhang, M. Zhong, N. Geng, and Y. Jiang, “Forecasting electric
vehicles sales with univariate and multivariate time series models: The
The current paper presents the results of applying the dy- case of china,” PloS one, vol. 12, no. 5, p. e0176729, 2017.
namic time scan forecasting method with the M4 competition [12] G. A. Tularam and T. Saeed, “Oil-price forecasting based on various uni-
data and compares it with statistical methods used as baselines variate time-series models,” American Journal of Operations Research,
vol. 6, no. 03, p. 226, 2016.
in the same competition. The results point to a significant gain [13] G. Girish, A. K. Tiwari, et al., “A comparison of different univariate
in accuracy in hourly time domain problems, compared to the forecasting models forspot electricity price in india,” Economics Bul-
reference, which justifies adopting this method for problems letin, vol. 36, no. 2, pp. 1039–1057, 2016.
[14] M. Rana, I. Koprinska, and V. G. Agelidis, “Univariate and multivariate
of this particular nature. methods for very short-term solar photovoltaic power forecasting,”
Since the method was developed for problems with long Energy Conversion and Management, vol. 121, pp. 380–390, 2016.
time series and high repeatability, DTSF has been proved [15] E. Raviv, K. E. Bouwman, and D. Van Dijk, “Forecasting day-ahead
electricity prices: Utilizing hourly prices,” Energy Economics, vol. 50,
competitive. In the present experiment, the DTSF method pp. 227–239, 2015.
reduced the sMAPE by 12.13%. [16] B. Billah, R. J. Hyndman, and A. B. Koehler, “Empirical information
Furthermore, the dissemination of this method may be criteria for time series forecasting model selection,” Journal of Statistical
Computation and Simulation, vol. 75, no. 10, pp. 831–840, 2005.
interesting for other researchers who wish to extend it to [17] S. Makridakis, E. Spiliotis, and V. Assimakopoulos, “The m4 com-
existing methods, either by combining it with other techniques petition: Results, findings, conclusion and way forward,” International
or by adapting its operation to other applications. Journal of Forecasting, vol. 34, no. 4, pp. 802–808, 2018.
[18] S. Makridakis and M. Hibon, “The m3-competition: results, conclusions
Future research should extend the method to multivariate and implications,” International journal of forecasting, vol. 16, no. 4,
forecasting problems and hierarchical time series and should pp. 451–476, 2000.
assess its performance in other applications with this character- [19] S. Makridakis, C. Chatfield, M. Hibon, M. Lawrence, T. Mills, K. Ord,
istic (the M5 competition, for instance). Also, some extensions and L. F. Simmons, “The m2-competition: A real-time judgmentally
based forecasting study,” International Journal of Forecasting, vol. 9,
of the method itself are foreseen, in order to improve its no. 1, pp. 5–22, 1993.
accuracy on time series for which its performance was less sat- [20] S. Makridakis and M. Hibon, “Accuracy of forecasting: An empirical in-
isfactory than the performance of other statistical methods, for vestigation,” Journal of the Royal Statistical Society: Series A (General),
vol. 142, no. 2, pp. 97–125, 1979.
example, adopting k-fold instead of hold-out cross-validation [21] M. P. P. Flores, J. F. Solís, G. C. Valdez, J. J. G. Barbosa, J. P. Ortega, and
for model selection [41]. J. D. T. Villanueva, “Hurst exponent with arima and simple exponential
smoothing for measuring persistency of m3-competition series,” IEEE
Latin America Transactions, vol. 17, no. 05, pp. 815–822, 2019.
ACKNOWLEDGMENTS [22] S. Makridakis, E. Spiliotis, and V. Assimakopoulos, “The m5 com-
petition: Background, organization, and implementation,” International
The authors would like to thank the National Council for Journal of Forecasting, 2021.
Scientific and Technological Development (CNPq), Compan- [23] “The m6 financial forecasting competition.”
hia Energética Integrada (CEI), and Brasil Energia Inteligente [24] D. Yang and S. Alessandrini, “An ultra-fast way of searching weather
analogs for renewable energy forecasting,” Solar Energy, vol. 185,
(BEI) for its financial support of the project (Grant No. pp. 255–261, 2019.
141740/2019-1 and Grant No. 141177/2019-2, respectively). [25] L. E. B. Hoeltgebaum, N. L. Dias, and M. A. Costa, “An analog period
method for gap-filling of latent heat flux measurements,” Hydrological
Processes, vol. 35, no. 4, p. e14105, 2021.
R EFERENCES [26] M. A. Costa, R. Ruiz-Cárdenas, L. B. Mineti, and M. O. Prates,
[1] T. Hill, L. Marquez, M. O’Connor, and W. Remus, “Artificial neural “Dynamic time scan forecasting for multi-step wind speed prediction,”
network models for forecasting and decision making,” International Renewable Energy, vol. 177, pp. 584–595, 2021.
journal of forecasting, vol. 10, no. 1, pp. 5–15, 1994. [27] T. S. Gontijo, M. A. Costa, and R. B. de Santis, “Similarity search in
[2] P.-F. Pai and C.-S. Lin, “A hybrid arima and support vector machines electricity prices: An ultra-fast method for finding analogs,” Journal of
model in stock price forecasting,” Omega, vol. 33, no. 6, pp. 497–505, Renewable and Sustainable Energy, vol. 12, no. 5, p. 056103, 2020.
2005. [28] T. S. Gontijo, M. A. Costa, and R. B. de Santis, “Electricity price
[3] G. Dudek, “Pattern-based local linear regression models for short-term forecasting on electricity spot market: a case study based on the brazilian
load forecasting,” Electric Power Systems Research, vol. 130, pp. 139– difference settlement price,” in E3S Web of Conferences, vol. 239, EDP
147, 2016. Sciences, 2021.
[4] R. Shanmugam, “Book review,” Journal of Statistical Computation and [29] G. Bontempi, “Comments on m4 competition,” International Journal of
Simulation, vol. 76, no. 10, pp. 935–940, 2006. Forecasting, vol. 36, no. 1, pp. 201–202, 2020.
[5] T. T. Tchrakian, B. Basu, and M. O’Mahony, “Real-time traffic flow [30] R. Fildes and S. Makridakis, “The impact of empirical accuracy studies
forecasting using spectral analysis,” IEEE Transactions on Intelligent on time series analysis and forecasting,” International Statistical Re-
Transportation Systems, vol. 13, no. 2, pp. 519–526, 2011. view/Revue Internationale de Statistique, pp. 289–308, 1995.
[6] C. Voyant, G. Notton, S. Kalogirou, M.-L. Nivet, C. Paoli, F. Motte, and [31] J. Glaz and N. Balakrishnan, Scan statistics and applications. Springer
A. Fouilloy, “Machine learning methods for solar radiation forecasting: Science & Business Media, 2012.
A review,” Renewable Energy, vol. 105, pp. 569–582, 2017. [32] D. C. Montgomery, E. A. Peck, and G. G. Vining, Introduction to linear
[7] H. Hassani and E. S. Silva, “Forecasting uk consumer price inflation regression analysis. John Wiley & Sons, 2021.
using inflation forecasts,” Research in Economics, vol. 72, no. 3, [33] C. Chatfield, Time-series forecasting. Chapman and Hall/CRC, 2000.
pp. 367–378, 2018. [34] R. Hyndman, A. B. Koehler, J. K. Ord, and R. D. Snyder, Forecasting
[8] W. Cai, J. Chen, J. Hong, and F. Jiang, “Forecasting chinese stock with exponential smoothing: the state space approach. Springer Science
market volatility with economic variables,” Emerging Markets Finance & Business Media, 2008.
and Trade, vol. 53, no. 3, pp. 521–533, 2017. [35] E. S. Gardner and E. McKenzie, “Why the damped trend works,” Journal
[9] E. Bernardini and G. Cubadda, “Macroeconomic forecasting and struc- of the Operational Research Society, vol. 62, no. 6, pp. 1177–1180,
tural analysis through regularized reduced-rank regression,” Interna- 2011.
tional Journal of Forecasting, vol. 31, no. 3, pp. 682–691, 2015. [36] V. Assimakopoulos and K. Nikolopoulos, “The theta model: a decom-
[10] Z. R. Khan Jaffur, N.-U.-H. Sookia, P. Nunkoo Gonpot, and B. See- position approach to forecasting,” International journal of forecasting,
tanah, “Out-of-sample forecasting of the canadian unemployment rates vol. 16, no. 4, pp. 521–530, 2000.
using univariate models,” Applied Economics Letters, vol. 24, no. 15, [37] G. E. Box and D. A. Pierce, “Distribution of residual autocorrelations in
pp. 1097–1101, 2017. autoregressive-integrated moving average time series models,” Journal
DE SANTIS et al.: DYNAMIC TIME SCAN FORECASTING: A BENCHMARK WITH M4 COMPETITION DATA 327

of the American statistical Association, vol. 65, no. 332, pp. 1509–1526,
1970.
[38] R. J. Hyndman, A. B. Koehler, R. D. Snyder, and S. Grose, “A state
space framework for automatic forecasting using exponential smoothing
methods,” International Journal of forecasting, vol. 18, no. 3, pp. 439–
454, 2002.
[39] S. M. Al-Alawi and S. M. Islam, “Principles of electricity demand
forecasting. i. methodologies,” Power Engineering Journal, vol. 10,
no. 3, pp. 139–143, 1996.
[40] A. Azadeh, S. Ghaderi, and S. Sohrabkhani, “Annual electricity con-
sumption forecasting by neural network in high energy consuming
industrial sectors,” Energy Conversion and management, vol. 49, no. 8,
pp. 2272–2278, 2008.
[41] C. Bergmeir, R. J. Hyndman, and B. Koo, “A note on the validity
of cross-validation for evaluating autoregressive time series prediction,”
Computational Statistics & Data Analysis, vol. 120, pp. 70–83, 2018.

Rodrigo Barbosa de Santis graduated in Production


Engineering from the Federal University of Juiz de
Fora (2015) and Master in Computer Modeling from
the Federal University of Juiz de Fora (2018). He
has experience in operations management, having
worked as a consultant at Natura, BASF, Johnson
& Johnson, and B2W. His skills include developing
and implementing predictive and decision support
systems. His main interests are Machine Learning,
Artificial Intelligence, Operations Management, and
Supply Chain.

Tiago Silveira Gontijo is PhD student and Master in


Production Engineering. Bachelor of Business Ad-
ministration and Economics. He has published books
and papers and works as a reviewer for international
journals. He has experience in the administration of
production systems and in the sanitation production
chain. Worked at the Regulatory Agency for Water
Supply and Sanitary Sewage Services of the State
of MG. Has experience in applied statistics and
quantitative methods, with emphasis on renewable
energies and time series. He works as an independent
consultant in economic regulation.

Marcelo Azevedo Costa is a full professor of


Applied Statistical Methods in the Department of
Production Engineering at the Federal University
of Minas Gerais (UFMG) and faculty member of
the Graduate Program in Production Engineering
(Stochastic Modeling and Simulation) at the same
institution. He holds a degree in Electrical Engi-
neering from the Federal University of Minas Gerais
(1999), a Ph.D. in Electrical Engineering from the
Federal University of Minas Gerais (2002) in Com-
putational Intelligence. He has done post-doctorate
from Harvard Medical School & Harvard Pilgrim Health Care (2007) in
Spatial Statistics and Epidemiological Surveillance, and a post-doctorate from
Linköping University (2018/Sweden) in Statistical Analysis, Diagnosis and
Fault Detection in Industrial Environments. He is currently a collaborating
professor in the Laboratory of Intelligence and Computational Technology
(LITC/UFMG), developing research projects in artificial neural networks. He
is also a collaborating professor at the Center of Research in Efficiency, Sus-
tainability and Productivity (NESP/UFMG), developing research projects in
economic regulation in the electricity sector. He has published several papers
in international journals such as Socio-Economic Planning Sciences, IEEE
Transactions on Power Delivery, Statistical Methods in Medical Research,
PLOS One, Measurement, among others. He is a reviewer of international
and national journals, as well as the author of book chapters published in
English. He is a CEMIG/FAPEMIG R&D project coordinator and researcher
in R&D projects. He supervises undergraduate, specialization, master and
doctoral students in the following areas: statistical models applied to the
electrical sector, applied statistics, network analysis, spatial statistics, time
series analysis, artificial neural network theory, and applications. He is
enthusiastic about using R language programming, which he has applied to
solve many practical statistical problems including Big Data analysis.

You might also like