Professional Documents
Culture Documents
Dynamic Time
Dynamic Time
Dynamic Time
2, FEBRUARY 2023
Abstract—Univariate forecasting methods are fundamental for [21]. The 5th edition took place in 2020, and focused on a re-
many different application areas. M-competitions provide im- tail sales application with 42,850 unit sales hierarchical series,
portant benchmarks for scientists, researchers, statisticians, and with the objective to produce the most accurate point forecast
engineers in the field, for evaluating and guiding the development
of new forecasting techniques. In this paper, the Dynamic Time as well as the most accurate estimation of the uncertainty of
Scan Forecasting (DTSF), a new univariate forecasting method these forecasts [22]. The 6th competition will take place this
based on scan statistics, is presented. DTSF scans an entire year and it will focus on predicting the overall market returns
time series, identifies past patterns which are similar to the of individual stocks [23].
last available observations and forecasts based on the median Whereas most well-known forecasting methods are based
of the subsequent observations of the most similar windows in
past. In order to evaluate the performance of this method, a on identifying intrinsic components of the time series, such
comparison with other statistical forecasting methods, applied in as level, trend, or seasonality, a particular group of methods
the M4 competition, is provided. In the hourly time domain, an based on similarity searches have been arousing interest in
average sMAPE of 12.9% was achieved using the method with the the areas of meteorology and renewable energy [24], [25].
default parameters, while the baseline competition – the simple These methods consist of identifying past weather patterns
average of the forecasts of Holt, Damped, and Theta methods –
was 22.1%. The method proved to be competitive in longer time ("analogs") that closely resemble the current state. These
series, with high repeatability. methods are capable of handling lengthy historical time series
in order to produce accurate and interpretive forecasts.
Index Terms—Univariate methods, M4 competition, bench-
marking, dynamic time scan forecasting. Among these methods, Dynamic Time Scan Forecasting
(DTSF) consists of a new and simple analog-based forecasting
technique [26]. It generates forecasts based on similar patterns,
I. I NTRODUCTION those with the highest R2 scores, calculated from the last
available window.
he development of predictive models is widely debated
T in the literature [1]–[4], since it assists the control of
associated uncertainty intrinsic to random variables. Given the
The accuracy of analog-based methods is scarcely reported
in areas other than energy prediction and is mostly limited
to wind and solar energy forecasting applications [27], [28],
above, there are several categories of predictive models based
which begs the question: "are analog-search-based models
on this physical knowledge (such as spectral analysis [5])
competitive compared to classical statistical prediction meth-
of intensive machine learning and statistical approaches [6].
ods?". Additionally, no research was found that compared
Forecasting models associated with a single random variable
analog search methods and statistical methods.
as a function of time support univariate forecasting, which is
To fill this gap, the current paper describes the DTSF
a very important area given its application in various sectors
forecasting method and discloses its performance on the
such as [7]–[9], business [10]–[12], energy [13]–[15], among
M4 competition time series. We compare DTSF with eight
others. In this context, it is fundamentally valuable to develop
classical statistical methods (Naive, Seasonal Naive, Simple
meticulous criteria for selecting the models [16].
Exponential Smoothing, Holt, Damped, Theta, AutoRegres-
The M-competition [17]–[20] is the most important fore-
sive Integrated Moving Average (ARIMA), and ExponenTial
casting competition in academia, in which researchers from all
Smoothing state space model (ETS)) and a combination of the
around the world test their methods on real-life, anonymous
outcomes of 3 individual methods (Holt, Damped, and Theta),
time series from distinct areas of industry. The 4th edition took
which compose the baseline of the M4 competition. The M4
place in 2018 [17], and 17 methods based on combinations
benchmark dataset was selected for this research because: (1) it
of statistical- and machine-learning or hybrids were tested on
consists of a reliable and curated benchmark base, adopted by
100,000-time series. Outputs from these events are registered
other researchers and practitioners for developing and testing
in review articles, pointing out the directions of development
forecasting methods; (2) it has a significant number of series:
and refinement of the most promising forecasting techniques
100,000 time series, with different frequencies (hourly, daily,
Rodrigo Barbosa de Santis is with LADEC Lab, Federal University of monthly, weekly, quarterly, yearly); (3) it has been mostly
Minas Gerais (UFMG) e-mail:rsantis@ufmg.br predominated by statistical methods of forecasting; (4) and
Tiago Silveira Gontijo is with LADEC Lab, Federal University of Minas it is composed of univariate and independent series.
Gerais (UFMG) e-mail:economista@ufmg.br
Marcelo Azevedo Costa is with LADEC Lab, Federal University of Minas The major contributions of the present paper can be sum-
Gerais (UFMG) e-mail:azevedo@est.ufmg.br marized as follows:
DE SANTIS et al.: DYNAMIC TIME SCAN FORECASTING: A BENCHMARK WITH M4 COMPETITION DATA 321
TABLE I
• the study applies a new method to M4 competition for
S UMMARY OF M4 COMPETITION DATASET, INCLUDING
benchmark purposes;
TIME - FREQUENCY, MINIMUM LENGTH OF TIME SERIES ,
• the method is compared with nine classical statistical
AND FORECAST HORIZON OF EACH TIME SERIES .
methods and a combination of the outcomes of three
individual methods, which compose the baseline of the Domain Number of series Min. length Horizon Seasonality
competition; Yearly 23,000 13 6 1
Quarterly 24,000 16 8 4
• in addition to applying the method, along with its default Monthly 48,000 42 18 12
parameters, an exhaustive search with hold-out validation Weekly 359 80 13 52
is adopted for model selection. Daily 4,227 93 14 7
Hourly 414 700 48 24
The major conclusions are:
• in the hourly time domain, an average error of 12.9% was
obtained using the method with the default parameters, B. Dynamic Time Scan Forecasting
while the competition baseline was 22.1%; DTSF is a forecasting method based on scan statistics [31]
• through the automatic selection of parameters, we boosted and was originally developed to address the problem of wind
the accuracy of the method by 12.31% compared to the forecasting for Brazilian power generation plants. It consists
method application without parameters selection; of scanning a time series and identifying past patterns (called
• the method proved to be competitive, both in terms of "analogs") similar to the last observations available of the time
accuracy and computational cost, over long time series series (called "query") [26].
and with high repeatability. Let yt be a time series of length N , t = 1, ..., N . Firstly, let
vector y[w] be defined as the last w observations of the series:
The present paper is organized into 5 sections. Following
this Introduction, Section 2 provides a review of the proposed y[w] = [yN −w+1 , ..., yN ]. (1)
forecasting method. Section 3 provides a background of the
The goal of DTSF is to identify analogs in the time series
datasets and methods applied in this study. Section 4 presents
which are greatly correlated with vector y[w] . Hence, the set
the results and discussions obtained from the application of
of candidate vectors can be defined by:
the methods. Finally, Section 5 concludes the present paper
and includes some recommendations for future studies. xt
[w]
= [yt−w+1 , ..., yt−w ] (2)
where t = 1, ..., N −2·w. The upper limit of the time sequence
[w]
II. M ATERIALS AND M ETHODS (N − 2 · w) guarantees that vector xt does not overlap with
vector y . Fig. 1 presents the DTSF procedure. Given the
[w]
A. M4 Competition Dataset last w observed values, which comprises vector y[w] , a rolling
window with the same size (xw t ) is used for scanning previous
The data used in the current study comes from the M4 com- values of the series.
petition dataset [17]. It is composed of 100,000 time series, Lastly, DTSF provides a k steps ahead forecast of the time
taken from different domains such as Economics, Finance, series, yN +1 , ..., yN +k . To produce this outcome, the DTSF
Demographics, and Industry, among others. The time series [w]
scans the series to find the closest analogs xt . The subsequent
show different periods: yearly, quarterly, monthly, weekly, values of the time series are used as the forecast values:
daily, or hourly.
yN +i = fx[w] (yt−w+i ) (3)
Table I summarizes the information about the competition’s t
dataset. Domain refers to the time period from which the where fx[w] is a function which correlates the elements of
data have been extracted, ranging from hourly to yearly. The [w]
t
of an event in a given time domain. proposal for function fx[w] (.) is a linear scaling of the elements
t
[w]
The dataset provides a public and reliable source for com- of vector xt , i.e., a linear model. This occurs due to the fact
paring statistical, machine learning, or hybrid methods on that previous values are likely similar to the last observations,
univariate time series forecasting [29]. It is internationally except for a scale and/or offset shift. So, the method searches
recognized by researchers and data scientists as the most for values that may be similar to the last values, after applying
important competition in this area [30]. a similarity function [26].
322 IEEE LATIN AMERICA TRANSACTIONS, VOL. 21, NO. 2, FEBRUARY 2023
Query Forec.
0 20 40 60 80 100
Time (t)
Table IV presents the evaluation of the methods using OWA. the great gain in accuracy that explains the best performance
This metric is understood as showing how one method is more of this method, on average. Analyzing the sets between the
accurate when compared to Naive2. If OWA is lower than 1 170th and 300th time series with the smallest error, there is
the method is more adequate than Naive2. Otherwise, Naive2 less distinction between all the methods which, in general,
provides better forecasting performance. The DTSF scores for presented errors very close to each other. Other methods have
the hourly series imply a meaningful increase in accuracy shown a lower errors than DTSF along all time series, specially
over the Naive method (0.552). Moreover, when applying fine- the methods ARIMA and ETS. In the set between 300th
tuning, the gain increases to nearly 50%. For all other domains, and 414th, DTSF again marginally outperformed the other
the only ones in which the method performed worse than benchmark methods in most of the series.
Naive2 were the yearly and the daily, both of which have Table V presents the average sMAPE detailed by the
in common the longer term forecast period and the lowest forecast horizon, grouped by 6-hour periods. DTSF obtained
seasonality traits in common. lower errors, for all horizons than the other compared methods.
The outcome of the experiment can be explained by the Furthermore, the average error is 12.9%, and the highest errors
intrinsic design of the DTSF method, which was originally were obtained during the periods between the hours from 19
conceived to deal with very long time series with recurrent to 30 and the hours from 43 to 48.
patterns, such as its original application to 30-min frequency To provide better visualization of error evolution over time,
wind speed forecasting. Comparing results to Table I, which Fig. 4 presents the mean errors per step of each method
presents the seasonality, length, and forecast horizon of each (excluding the three from the previous figure), for all hourly
time domain, it is shown that the DTSF accuracy is greater time series. An increase in error over time, according to the
when the number of available data points is also greater. phenomenon of error propagation, is expected. This is better
Fig. 3 displays the average sMAPE for each one of the observed in the Holt method, in which error varied from 10%
414 hourly time series available in the competition database, at the first step to 40% at the last step. Moreover, in such
listed in ascending order according to the calculated error a visual representation, the Theta model is perceived to have
of the DTSF method. The methods Naive, sNaive and SES been more accurate, on average, than the DTSF model for the
methods were holdouts of the graphical representation. The y- 1st and 24th hours.
axis is presented using the base-10 logarithmic scale in order
to facilitate visual analysis.
Naive2
40
Holt
Damped
Theta
Comb
DTSF
ARIMA
35
2 2 2 2 2 5 2
2 52 2 2 1 8
2
5
3 3 2 ETS
5 2 8 8 2 2 4 1
2
5
4
3 282
3
5
412
14 4
1
5 3 5 85 3
2.0
5 1 8 3 75 25 76
2 23 2 2
8 2
5
4
2 18
3
42 752
1
3
442 22
15
3
5 2 5 75 1
3
4
1
2 224
5
4
3
2
5
3
1
4
18 4 3
18 4
3
5
21 75
5
3
14 3
4 27
166
5
6
5 5 2 2 3 2 2 2 8 2
715 8
2
5
5
3
1
4 75 185
2
3
4
7 1812
4
3
4
3
5
25
15
3
4
2
3
4
1
3
4
176665
4
17
3
5 8 3 5 8 3
1
4
2 3
4
1 2 8 2 7
5 8 5 23 8 2 1 1 58
17 5
4
3 1
2
3 755
4 1 7 88668 8 6666 878
6 5
2
3
4
30
3 3
4
5 3
17
4
2 668 1
82 3 8 8
5 53 2 1 5 8 25
5
3
4 2 28 5
4
3 3
4
22 1
5
4
38
2
5
3
428 285
15
4
3
17
1 78
3
4
2 3
5
1
2 775
8
8514
3
4
2 3
178 75
3
4
5
22
253
4
1 6
6 66662 878785
7 662
2
4
3
1
3 2 15 82
5
3
4 1 1 1881785 2178 74
7 7 2
1
3
4
725 8
158 6 78 83
167
4 6
68
7
66668775 1
5
3
4 2
72
178
5 2
2
5 2 2 322
5
11 218
5
4
3 4
3 7 2
5
3
4
15
7782
18
3 25 6666666 2
75172
3
14 3
4
17787885
5
1
3
4
2
3 1 4 88 8
1
4
3 2
31177 8
5
4
8 2
5 8 728
4
3
725
5
3
4 2
4
3
1
778 78
75287
3
4 2
5
3
4 777 85 3
4 21768
1
4
3
285 3
5
48
7
4 18
5
6663
4
6
3
4
682
3
64
117
3
4
66 66668
668 187875
5
2
3
4 3
4 3
4 3
1
4
1.5
4
1 3
8 8 3
4
1 7 5 7 82 1 15
2
3
4 7 4
3
5
2
62
7
66
6
5
2
3
42
1
66
6
2
2
5
4
8
36 7 7 87 7 2
5
3
4 7 8 8 7
8 54 1 8 22 2 2 5 2 878 2 7 811 4 15
3 2 875
4
3 288
5
4
3 778 1667
3 2
5
3
4
1666666 38
685
4115
17 778 878
3
4 88
25 1
4 1
4 3 2 2 2 8 25
5
3
4
83
4 5
3
4
1 8 8
1
3
5
2
4 5
3
2
4 7 1
4
7
31 1 1
8 7 7 12
5
4
6
34
5
2
6 6 14
7
3
5
2
1 8
3
1
4
5 18 2
5
8
4
3
1
1
4 1
4
15
2
1 57 5 18 8 88775
4
2
3 2378 2 1788
17878 5
7
3
4
81775
28 15278
3
4 1668
2
5
3 67878881
8
666666665 2
125
7
2 1 78 7
2 2 22
4
1 5 4
1
3 25
7 4
3 4
3 2528
3 5
4
3
7 68
3
4
5
2 75
2
3
4 2 4 3
4
2 5
4
3 18
3
4 3
5
4
sMAPE
4 3 2
28 3 2212 22 22 2 2 2 2 2225 5 2 23
3 2 4
1 34122 8 2 2 55 22 8
54
1 2 7 85 8 5 28
5
4
3 2
35
4
3
17
8 5
18
2
3
4 27
5
3 25
5
3
4 3
4
28 88
7 2
15
17166668
2
5
3
4 14
74 668
2666615
3
6
5
6666666668
1666
66617 7 8 5
8
2
3
4
1
3
4
2
17 4 18 5
3
4 28
5
18
3
4
25
8 5 4
3 17 1 153
2
15
411153
4
2 3
4 76 7
2241222 4 22 2 322 2 28 77 2
5
3
4 3
4
2 66 6
885
2
3
4 27 7
3
4 7
22 22 2222 222 2 22 2 22 2 22 4
3
1 7
5 2 4 12 5 4
3
1 3
4
87766666665 666 27
5
3
4
166668
6 67
67667 18
3
4
5 2
5
3
4
15 3 5 4
3
2
22 25
1
4 2 2 2 1
4 22 2 2 2 2 2 22 2 2 25 2 5 6
67
6
64
1
7
6
6 7
8
2
666
8
2
5
3
1
427288
76
8
3
5
4 4 18
7
3
5
4
2
2 1 77
2
3
4
7 4
1
2
1
1
5 22 422 2 22 2 2 222228 12
4
3
2 5 16
4 563665
6 7 17 1 1 8
5
3
4
2
1 7 75
2 2 2 8
5 2 3 7 33 3 1
8
3
4
5
2 66
65 4
2
3
1 7 7
2 5 7522 2 4 2 238 225 5 5238 6668
1.0
5 5 55 2 3
1 8 3
1 28
5
4
2 1 78
3
5
4
2 7
3
1
5
4
2 718
2
4
5
5 52 5 5 5 5 25 5 5 2 5 5 2 5 75 2 2 4 8 82 22 4
666 5
2
3
1
4 7 8
3
2 5 5 55 5 5 255 14
1 75 2
18
3
441874
18
7 8 78 17
6
166668
5 17 78 75
68 2
4
14 3
55 5 58 5 8 38 5
34 4
3
1666666666 4
3
5
2 7 7 25 1 8
2 7
4
1
log10(sMAPE)
2 77 4
3 5 5 555 5
7 5 218 737812 8884
2
5
3
4 27 77 5 66 567
4668
6 77 2374
1417728
3 55 55 55 555 525 5 5 55 55 55575554
4
1 355 77 755 1 13 1 6666667666646
18
217 175778
75 5
3
4
17838 13 8
2 5
4
20
5 1
5
8
2 4 4 56666676666664 75 3 4
3
2 5
3
4 1
8
2 3 5555533 5375 355 5 55 5 55553555 5 73
4 8
548
2
1
4477 1
8
11
22 2 74
873
1 1564
4
5
2
8
12
12
5
4
3
6
676667 8 4
17158
3
4
5 17 7735 2
38
4
8 278
5 1 37
4
3
5
7 3 3
37 47 333 5 7 3 5 55 5
3 65
8
6
466
76
7
13866667178566
3
4
5
3
1
2777 8 8 75 2 5 7 4
3
1 5
3 8
3 1 37 3 53 3 3557 33 7333 7 333758 5
18
43
18
5
4
2 2 2
3 7
8
1715
17 118
4
2
3 3 5
7 134
4
3
34141 5 3 33 4 1 3315325 33534 1 471444 33 4 33 354 7 4
31 18 23
3 35 33 5
3
4 1
666668
5
4
356
26
6 1
4
75
332 148
3
5 3
4 535 738
787 75773 17348 18 3
18
13 3333343333 133335544 13 1334 11 4 11 8488
3
5
2 167 8
4 8
16667
4
156
6 7 4 43 2
4 1 84
4 4
5 7 58 48
17
0.5
4 3333 14
4 14 131314 1 34 134 2
5
4
3
8 66668 6 77778 8 1
144134
4 134
144 1334 3374344 4 1324 723368
18
4 67
17
2 6 5
6 8
14
41447441 48 7 1 24
1414 44 314 47144
1 145
1434
2
1134
14
4
11
4331 4 11 4 3
14 112
14 3 43
34
1 4 334
144 1433
4 4 4 4
5 1 3335 22
4
3
1 4
144
14 4
3 35
1
3
1
27
4
25 14
815
5 3
1
4
2
2
47
88
8
5
5
3
1
4
2 7667
72
4
5
65
3
4
6
7
81
7
4
1 84
47
1
2
25
3
4 3
1
58
7
2 3
4
14 22 741 85 5
8
185
3 3
3
2814
18 31813 4 5 4
3
1
2 4
5
2
3
1
1
8
4 1 1 11 1 1 4 4
1 14 3 3 4
31 5
4 1 1 1 1 1 3
1 22
3 4
513 3 5
1 1 34766668373
18
3
16 5
7 5
3 4 5 1 8 2
1 1 4 1 41 1 1 444 41 45 14 1 1 1 4 43 44
5
43
15 1 44 3 58
4 18
4 855 4 728 15 5183 5
15
3 1 3
11 4 4 2
4
3 121
5 1 1 3
17 4 166 1 71
2
4
5
3
7 8
1
3 3
1
5
4 2
4
5
3
1
4
1 4 2 2 111 51 4 3 3 84 4 3
132 4 12 1 41 41 5 4 1366 2
7 73 5 5
7 25
5
3
1 1
3
4 3
1 83
4
5 4 1 4 112 1 43
4
5
3 2
14 22 24
4
5
3 1 83 67 338388
3
7 8 8 884 15
145 2
1 4 667
135
3 55
3
1 1 66
3
2
4
8 3 85 1 2
3
2
4 18 8 88 5
3
4
5 4
3
2 17886666668
2
5
4
3
0.0
888 8 8 8 8 5 8 666667
8 88 88
8 888 82 878 8 68 3
68
848 8 375 3
5 86 8 5
1666666666
1
4
2
8 88 88 881 8 8 8 4 88
7
1664
1
3
2 2
4
6 877 77 7
10
7
5 4 866667
6
8 87888 7 87
8 73
152
3
1 6667
4
7 67
6 666666666678
666668 7 8788 77
8
2
5
4 7 7
8 88 77 6666666666666688 7 78
8
7 66
6
66
666
6
6666 8 7
7 877 8
3
18 7 87 7 8 1
5
3
2
4 7 8
8
6
6
666667
6
66
6
66
8 7 7
8 8 8 8 8 8 Naive2
7 8 78 75 18786666
3
2
4 6668
766666666666667
66 6 7 8 7877 7 7 8 8887877
7 78
38
−0.5
17 877 6666666666
6667 17 8 787 778 8 7 7 77878 7747 2 4
577 77 Holt
2 6666666687 7774
666667
2
5
3 8 77877 78 7 877 2877 1 8 87
3 66667
5
4 668
67 7 7 7 7 7 8
7 8 5
3
1 8 7 7 Damped 0 10 20 30 40
777 78 78 8 7 87 8 87
66667778 7 7 8 7777 777 7 77 7 7 Theta
66 7 7 Comb
8
7 8 Step
67 DTSF
7
−1.0
8 ARIMA
7
8 ETS
Fig. 4. Average sMAPE (obtained in the 414 hourly time series by all
0 100 200 300 400 the methods for each step of the prediction, up to 48 hours – forecast
Time Series horizon).
TABLE III
T HE PERFORMANCE OF DTSF COMPARED TO M4 BENCHMARK STATISTICAL METHODS – S MAPE METRIC .
sMAPE
Yearly Quarterly Monthly Weekly Daily Hourly Average
Method (23k) (24k) (48k) (359) (4,227) (414) (100k)
Naive 16.342 11.610 15.255 9.161 3.405 43.003 14.207
sNaive 16.342 12.521 15.994 9.161 3.405 13.912 14.660
Naive2 16.342 11.012 14.429 9.161 3.405 18.383 13.565
SES 16.398 10.600 13.620 9.012 3.405 18.094 13.089
Holt 16.535 10.955 14.833 9.706 3.070 29.474 13.839
Damped 15.162 10.243 13.475 8.867 3.063 19.277 12.655
Theta 14.603 10.312 13.003 9.094 3.053 18.138 12.312
Comb 14.874 10.197 13.436 8.947 2.985 22.114 12.567
ARIMA 15.150 10.408 13.486 8.593 3.185 14.081 12.679
ETS 15.356 10.291 13.525 8.727 3.046 17.307 12.725
DTSF 16.816 11.006 13.823 8.983 3.313 12.927 13.370
TABLE IV
T HE PERFORMANCE OF DTSF COMPARED TO M4 BENCHMARK STATISTICAL METHODS – OWA METRIC .
OWA
Yearly Quarterly Monthly Weekly Daily Hourly Average
Method (23k) (24k) (48k) (359) (4,227) (414) (100k)
Naive 1.000 1.066 1.095 1.000 1.000 3.593 1.072
sNaive 1.000 1.153 1.147 1.000 1.000 0.628 1.106
Naive2 1.000 1.000 1.000 1.000 1.000 1.000 1.000
SES 1.003 0.970 0.951 0.975 1.000 0.990 0.970
Holt 0.956 0.935 0.989 0.964 0.997 2.760 0.976
Damped 0.888 0.893 0.924 0.916 0.996 1.140 0.912
Theta 0.872 0.917 0.907 0.971 0.999 1.006 0.906
Comb 0.868 0.891 0.920 0.926 0.979 1.559 0.906
ARIMA 0.891 0.898 0.904 0.927 1.041 0.950 0.906
ETS 0.903 0.890 0.914 0.931 0.996 1.824 0.913
DTSF 1.002 0.961 0.950 0.914 1.092 0.552 0.969
TABLE V
AVERAGE S MAPE OBTAINED IN THE 414 HOURLY TIME SERIES BY THE PREDICTED STEPS , GROUPED IN 6- HOUR
PERIODS .
Steps
Methods 1-6 7-12 13-18 19-24 25-30 33-36 37-42 43-48 1-48
Naive2 16.3 20.1 18.8 15.7 18.2 20.7 19.3 18.0 18.4
Naive2 16.3 20.1 18.8 15.7 18.2 20.7 19.3 18.0 18.1
Holt 15.7 23.0 27.1 27.5 29.9 34.9 37.9 39.8 29.5
Damped 15.5 20.3 20.5 17.5 18.1 21.2 21.2 19.9 19.3
Theta 16.1 19.9 18.5 15.3 17.8 20.5 19.2 17.8 18.1
Comb 15.6 20.6 21.8 19.7 20.8 24.9 26.7 26.8 22.1
ARIMA 14.2 11.4 11.2 15.8 15.4 13.9 13.4 17.0 14.1
ETS 13.6 16.5 16.4 16.6 16.5 19.0 18.9 17.4 17.3
DTSF 12.6 10.7 10.2 15.0 14.8 11.6 11.6 11.6 12.9
TABLE VI
Table VI shows the time necessary to fit the methods for
T OTAL AND AVERAGE TIMES NECESSARY FOR FITTING THE
all of the 100,000 time series. The methods Naive2 and Comb
METHODS .
have been omitted as these two are a combination/selection
of individual methods. Total fitting time is given in seconds, Methods Total fitting time Average time Ratio to naive
while the average time per series is given in microseconds. The (s) per series (ms)
Naive 0.458 1.106 1.00
Ratio Naive column compares the average time of a particular sNaive 0.656 1.584 1.43
method compared to the execution time of the Naive method. SES 2.219 5.360 4.85
Holt 5.947 14.365 12.99
Damped 12.789 30.892 27.94
Theta 2.964 7.159 6.47
ARIMA 18437.598 44535.261 40278.22
DTSF was the method that consumed the most compu- ETS 1838.638 4441.155 4016.63
DTSF 6.241 15.074 13.63
tational time, almost 9 times more than Naive. It is worth
mentioning that the default parameters for DTSF adopt 10
analogs to estimate the forecast. Also, part of the method is
executed in the C compiled language, and part of it is executed
in R.
326 IEEE LATIN AMERICA TRANSACTIONS, VOL. 21, NO. 2, FEBRUARY 2023
IV. C ONCLUSIONS [11] Y. Zhang, M. Zhong, N. Geng, and Y. Jiang, “Forecasting electric
vehicles sales with univariate and multivariate time series models: The
The current paper presents the results of applying the dy- case of china,” PloS one, vol. 12, no. 5, p. e0176729, 2017.
namic time scan forecasting method with the M4 competition [12] G. A. Tularam and T. Saeed, “Oil-price forecasting based on various uni-
data and compares it with statistical methods used as baselines variate time-series models,” American Journal of Operations Research,
vol. 6, no. 03, p. 226, 2016.
in the same competition. The results point to a significant gain [13] G. Girish, A. K. Tiwari, et al., “A comparison of different univariate
in accuracy in hourly time domain problems, compared to the forecasting models forspot electricity price in india,” Economics Bul-
reference, which justifies adopting this method for problems letin, vol. 36, no. 2, pp. 1039–1057, 2016.
[14] M. Rana, I. Koprinska, and V. G. Agelidis, “Univariate and multivariate
of this particular nature. methods for very short-term solar photovoltaic power forecasting,”
Since the method was developed for problems with long Energy Conversion and Management, vol. 121, pp. 380–390, 2016.
time series and high repeatability, DTSF has been proved [15] E. Raviv, K. E. Bouwman, and D. Van Dijk, “Forecasting day-ahead
electricity prices: Utilizing hourly prices,” Energy Economics, vol. 50,
competitive. In the present experiment, the DTSF method pp. 227–239, 2015.
reduced the sMAPE by 12.13%. [16] B. Billah, R. J. Hyndman, and A. B. Koehler, “Empirical information
Furthermore, the dissemination of this method may be criteria for time series forecasting model selection,” Journal of Statistical
Computation and Simulation, vol. 75, no. 10, pp. 831–840, 2005.
interesting for other researchers who wish to extend it to [17] S. Makridakis, E. Spiliotis, and V. Assimakopoulos, “The m4 com-
existing methods, either by combining it with other techniques petition: Results, findings, conclusion and way forward,” International
or by adapting its operation to other applications. Journal of Forecasting, vol. 34, no. 4, pp. 802–808, 2018.
[18] S. Makridakis and M. Hibon, “The m3-competition: results, conclusions
Future research should extend the method to multivariate and implications,” International journal of forecasting, vol. 16, no. 4,
forecasting problems and hierarchical time series and should pp. 451–476, 2000.
assess its performance in other applications with this character- [19] S. Makridakis, C. Chatfield, M. Hibon, M. Lawrence, T. Mills, K. Ord,
istic (the M5 competition, for instance). Also, some extensions and L. F. Simmons, “The m2-competition: A real-time judgmentally
based forecasting study,” International Journal of Forecasting, vol. 9,
of the method itself are foreseen, in order to improve its no. 1, pp. 5–22, 1993.
accuracy on time series for which its performance was less sat- [20] S. Makridakis and M. Hibon, “Accuracy of forecasting: An empirical in-
isfactory than the performance of other statistical methods, for vestigation,” Journal of the Royal Statistical Society: Series A (General),
vol. 142, no. 2, pp. 97–125, 1979.
example, adopting k-fold instead of hold-out cross-validation [21] M. P. P. Flores, J. F. Solís, G. C. Valdez, J. J. G. Barbosa, J. P. Ortega, and
for model selection [41]. J. D. T. Villanueva, “Hurst exponent with arima and simple exponential
smoothing for measuring persistency of m3-competition series,” IEEE
Latin America Transactions, vol. 17, no. 05, pp. 815–822, 2019.
ACKNOWLEDGMENTS [22] S. Makridakis, E. Spiliotis, and V. Assimakopoulos, “The m5 com-
petition: Background, organization, and implementation,” International
The authors would like to thank the National Council for Journal of Forecasting, 2021.
Scientific and Technological Development (CNPq), Compan- [23] “The m6 financial forecasting competition.”
hia Energética Integrada (CEI), and Brasil Energia Inteligente [24] D. Yang and S. Alessandrini, “An ultra-fast way of searching weather
analogs for renewable energy forecasting,” Solar Energy, vol. 185,
(BEI) for its financial support of the project (Grant No. pp. 255–261, 2019.
141740/2019-1 and Grant No. 141177/2019-2, respectively). [25] L. E. B. Hoeltgebaum, N. L. Dias, and M. A. Costa, “An analog period
method for gap-filling of latent heat flux measurements,” Hydrological
Processes, vol. 35, no. 4, p. e14105, 2021.
R EFERENCES [26] M. A. Costa, R. Ruiz-Cárdenas, L. B. Mineti, and M. O. Prates,
[1] T. Hill, L. Marquez, M. O’Connor, and W. Remus, “Artificial neural “Dynamic time scan forecasting for multi-step wind speed prediction,”
network models for forecasting and decision making,” International Renewable Energy, vol. 177, pp. 584–595, 2021.
journal of forecasting, vol. 10, no. 1, pp. 5–15, 1994. [27] T. S. Gontijo, M. A. Costa, and R. B. de Santis, “Similarity search in
[2] P.-F. Pai and C.-S. Lin, “A hybrid arima and support vector machines electricity prices: An ultra-fast method for finding analogs,” Journal of
model in stock price forecasting,” Omega, vol. 33, no. 6, pp. 497–505, Renewable and Sustainable Energy, vol. 12, no. 5, p. 056103, 2020.
2005. [28] T. S. Gontijo, M. A. Costa, and R. B. de Santis, “Electricity price
[3] G. Dudek, “Pattern-based local linear regression models for short-term forecasting on electricity spot market: a case study based on the brazilian
load forecasting,” Electric Power Systems Research, vol. 130, pp. 139– difference settlement price,” in E3S Web of Conferences, vol. 239, EDP
147, 2016. Sciences, 2021.
[4] R. Shanmugam, “Book review,” Journal of Statistical Computation and [29] G. Bontempi, “Comments on m4 competition,” International Journal of
Simulation, vol. 76, no. 10, pp. 935–940, 2006. Forecasting, vol. 36, no. 1, pp. 201–202, 2020.
[5] T. T. Tchrakian, B. Basu, and M. O’Mahony, “Real-time traffic flow [30] R. Fildes and S. Makridakis, “The impact of empirical accuracy studies
forecasting using spectral analysis,” IEEE Transactions on Intelligent on time series analysis and forecasting,” International Statistical Re-
Transportation Systems, vol. 13, no. 2, pp. 519–526, 2011. view/Revue Internationale de Statistique, pp. 289–308, 1995.
[6] C. Voyant, G. Notton, S. Kalogirou, M.-L. Nivet, C. Paoli, F. Motte, and [31] J. Glaz and N. Balakrishnan, Scan statistics and applications. Springer
A. Fouilloy, “Machine learning methods for solar radiation forecasting: Science & Business Media, 2012.
A review,” Renewable Energy, vol. 105, pp. 569–582, 2017. [32] D. C. Montgomery, E. A. Peck, and G. G. Vining, Introduction to linear
[7] H. Hassani and E. S. Silva, “Forecasting uk consumer price inflation regression analysis. John Wiley & Sons, 2021.
using inflation forecasts,” Research in Economics, vol. 72, no. 3, [33] C. Chatfield, Time-series forecasting. Chapman and Hall/CRC, 2000.
pp. 367–378, 2018. [34] R. Hyndman, A. B. Koehler, J. K. Ord, and R. D. Snyder, Forecasting
[8] W. Cai, J. Chen, J. Hong, and F. Jiang, “Forecasting chinese stock with exponential smoothing: the state space approach. Springer Science
market volatility with economic variables,” Emerging Markets Finance & Business Media, 2008.
and Trade, vol. 53, no. 3, pp. 521–533, 2017. [35] E. S. Gardner and E. McKenzie, “Why the damped trend works,” Journal
[9] E. Bernardini and G. Cubadda, “Macroeconomic forecasting and struc- of the Operational Research Society, vol. 62, no. 6, pp. 1177–1180,
tural analysis through regularized reduced-rank regression,” Interna- 2011.
tional Journal of Forecasting, vol. 31, no. 3, pp. 682–691, 2015. [36] V. Assimakopoulos and K. Nikolopoulos, “The theta model: a decom-
[10] Z. R. Khan Jaffur, N.-U.-H. Sookia, P. Nunkoo Gonpot, and B. See- position approach to forecasting,” International journal of forecasting,
tanah, “Out-of-sample forecasting of the canadian unemployment rates vol. 16, no. 4, pp. 521–530, 2000.
using univariate models,” Applied Economics Letters, vol. 24, no. 15, [37] G. E. Box and D. A. Pierce, “Distribution of residual autocorrelations in
pp. 1097–1101, 2017. autoregressive-integrated moving average time series models,” Journal
DE SANTIS et al.: DYNAMIC TIME SCAN FORECASTING: A BENCHMARK WITH M4 COMPETITION DATA 327
of the American statistical Association, vol. 65, no. 332, pp. 1509–1526,
1970.
[38] R. J. Hyndman, A. B. Koehler, R. D. Snyder, and S. Grose, “A state
space framework for automatic forecasting using exponential smoothing
methods,” International Journal of forecasting, vol. 18, no. 3, pp. 439–
454, 2002.
[39] S. M. Al-Alawi and S. M. Islam, “Principles of electricity demand
forecasting. i. methodologies,” Power Engineering Journal, vol. 10,
no. 3, pp. 139–143, 1996.
[40] A. Azadeh, S. Ghaderi, and S. Sohrabkhani, “Annual electricity con-
sumption forecasting by neural network in high energy consuming
industrial sectors,” Energy Conversion and management, vol. 49, no. 8,
pp. 2272–2278, 2008.
[41] C. Bergmeir, R. J. Hyndman, and B. Koo, “A note on the validity
of cross-validation for evaluating autoregressive time series prediction,”
Computational Statistics & Data Analysis, vol. 120, pp. 70–83, 2018.