A Decomposition-Clustering-Ensemble Learning Approach For Solar Radiation Forecasting
Solar Energy
journal homepage: www.elsevier.com/locate/solener
ARTICLE INFO

Keywords: Solar radiation forecasting; Decomposition-clustering-ensemble learning approach; Ensemble empirical mode decomposition; Least square support vector regression

ABSTRACT

A decomposition-clustering-ensemble (DCE) learning approach is proposed for solar radiation forecasting in this paper. In the proposed DCE learning approach, (1) ensemble empirical mode decomposition (EEMD) is used to decompose the original solar radiation data into several intrinsic mode functions (IMFs) and a residual component; (2) least square support vector regression (LSSVR) is performed to forecast the IMFs and the residual component respectively, with parameters optimized by the gravitational search algorithm (GSA); (3) the K-means method is adopted to cluster all component forecasting results; (4) another GSA-LSSVR model is applied to ensemble the component forecasts of each cluster, and the final forecasting results are obtained using the corresponding cluster's ensemble weights. To verify the performance of the proposed DCE learning approach, solar radiation data from Beijing are used for empirical analysis. The out-of-sample results show that the DCE learning approach produces smaller NRMSE and MAPE values and better directional forecasts than all benchmark models, reaching 2.96%, 2.83% and 88.24% respectively in one-day-ahead forecasting. This indicates that the proposed DCE learning approach is a promising framework for solar radiation forecasting in terms of level accuracy, directional accuracy and robustness.
1. Introduction

With the continuous consumption of fossil fuels, energy and environmental problems are becoming increasingly severe, and it is urgent to find a solution to these problems and achieve sustainable development. Renewable energy has therefore attracted much attention all over the world and has developed rapidly in recent years.

Solar energy is considered one of the cleanest and most promising renewable energy sources, which makes solar power an important direction in exploring renewable energy. Solar power generation is now widely used in developed countries and in many developing countries, and has partially replaced traditional power generation. In recent years, China has achieved an annual growth rate of more than 25% in the development and utilization of renewable energy. Solar radiation plays an important role in solar photovoltaic power generation, and with the development of solar technologies, demand for highly accurate solar radiation data is increasing. Consequently, solar radiation forecasting has become one of the core tasks of solar photovoltaic power generation.

Scholars and researchers have conducted in-depth research on solar radiation forecasting theory and put forward a number of feasible and efficient methods to predict actual solar radiation intensity, achieving satisfactory results. These methods can be divided into three categories: traditional mathematical statistics, numerical weather forecasting and machine learning. Traditional mathematical statistics includes regression analysis (Trapero et al., 2015), time series analysis (Huang et al., 2013; Voyant et al., 2013), gray theory (Fidan et al., 2014), fuzzy theory (Chen et al., 2013), wavelet analysis (Mellit et al., 2006; Monjoly et al., 2017) and Kalman filtering (Akarslan et al., 2014). Numerical weather forecasting solves the thermal fluid dynamics equations of weather evolution on high-performance computers to predict the solar radiation intensity of the next period, mainly based on the actual state of the atmosphere (Chow et al., 2011; Mathiesen and Kleissl, 2011; Mathiesen et al., 2013); it is complicated and time-consuming. With the rise of big data mining, machine learning techniques have attracted much attention; for example, artificial neural networks (ANN) (Amrouche and Le Pivert, 2014; Benmouiza and Cheknane, 2013; Niu et al., 2015; Paoli et al., 2010), support vector machines (SVM) (Gala
⁎ Corresponding author at: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China.
E-mail address: sywang@amss.ac.cn (S. Wang).
https://doi.org/10.1016/j.solener.2018.02.006
Received 28 July 2017; Received in revised form 26 December 2017; Accepted 1 February 2018
0038-092X/ © 2018 Elsevier Ltd. All rights reserved.
S. Sun et al. Solar Energy 163 (2018) 189–199
et al., 2016; Lauret et al., 2015) and heuristic intelligent optimization algorithms (Jiang et al., 2015; Niu et al., 2017; Niu et al., 2016a; Wang et al., 2015) have been widely used in solar radiation forecasting.

Among the literature above, Trapero et al. (2015) applied the dynamic harmonic regression model (DHR) to predict short-term direct and scattered solar radiation in Spain for the first time. Huang et al. (2013) used an autoregressive model to forecast solar radiation within a meteorological factors-dynamic adjustment framework, increasing accuracy by 30% over general neural network or random models. Fidan et al. (2014) predicted hourly solar radiation in Izmir, Turkey by integrating the Fourier transform and a neural network. Mellit et al. (2006) applied an infinite impulse response function to filter the time series of total solar radiation in Algeria from 1981 to 2000, and then fed the filtered data into an adaptive wavelet neural network model to forecast the total solar radiation in 2001; the error percentage was less than 6%, performing better than conventional neural network models and classical statistical methods. Akarslan et al. (2014) first used a multidimensional linear predictive filtering model to forecast solar radiation, which in their empirical analysis surpasses the two-dimensional linear predictive filtering model and traditional statistical forecasting methods. Amrouche and Le Pivert (2014) adopted spatial modeling and artificial neural networks (ANNs) to forecast daily total solar radiation at four sites in the United States; the empirical results show that the forecasts satisfy the expected accuracy requirements. Benmouiza and Cheknane (2013) classified the input data by K-means, modeled each class with a nonlinear autoregressive neural network, and predicted the solar radiation of the test data with the corresponding model.

In recent years, integrated models have grown rapidly. Paoli et al. (2010) used an integrated model to predict total solar radiation at three sites in France: they first pretreated the original total solar radiation sequence with a seasonal index adjustment method, and then used a multi-layer perceptron neural network (MLPNN) for daily solar radiation prediction. The results show that the MAPE of the MLPNN model is about 6%, superior to the ARIMA, Bayesian, Markov chain and K-nearest neighbor models. Lauret et al. (2015) used three different methods, artificial neural networks (ANN), Gaussian processes (GP) and support vector machines (SVM), to forecast global horizontal solar irradiance (GHI); on actual data from three different places in France, the three machine learning algorithms were found to be better than the linear autoregressive (AR) and persistence models. Gala et al. (2016) used a combination of support vector regression (SVR), gradient boosted regression (GBR) and random forest regression (RFR) to predict three-hour accumulated solar radiation in 11 regions of Spain. Wang et al. (2015) proposed a new integrated model to predict hourly solar radiation: they combined multi-response sparse regression (MRSR), leave-one-out cross validation and the extreme learning machine (ELM), used cuckoo search (CS) to optimize the weights and thresholds, and analyzed six sites in the US. The prediction results show that this combined model is stronger than ARIMA, BPNN and the optimally pruned extreme learning machine (OP-ELM).

The main innovation of this paper is to propose a novel decomposition-clustering-ensemble (DCE) learning approach integrating EEMD, K-means and LSSVR to improve solar radiation forecasting in terms of accuracy and robustness, and to compare its forecasting power with some popular existing forecasting models. The rest of the paper is organized as follows. Section 2 describes the formulation of the proposed DCE learning approach in detail. Related methodologies are illustrated in Section 3. The empirical results and effectiveness of the proposed approach are discussed in Section 4. Finally, Section 5 provides some conclusions and indicates the direction of further research.

2. Decomposition-Clustering-Ensemble (DCE) learning approach

Following the TEI@I methodology (Wang et al., 2005; Wang, 2004), which is based on the integration (@I) of text mining (T), econometrics (E) and intelligence techniques (I), Yu et al. (2008) proposed a decomposition ensemble learning approach for crude oil price forecasting. Recently, this learning approach has been applied in many fields, including financial time series forecasting (Yu et al., 2009), nuclear energy consumption forecasting (Tang et al., 2012; Wang et al., 2011), hydropower consumption forecasting (Wang et al., 2011), crude oil price forecasting (He et al., 2012; Yu et al., 2014; Yu et al., 2015; Yu et al., 2016), etc.

Analyzing the literature above in detail, three main steps are involved in the decomposition and ensemble learning approach: decomposition, single forecasting and ensemble forecasting. Firstly, a decomposition algorithm is used to decompose the original time series into a number of meaningful components. Secondly, forecasting methods are applied to forecast each component respectively. Finally, the forecasting results of the components are combined into an aggregated output as the final forecasting result using an ensemble method. It follows that ensemble learning is critical to the final forecasting results. Sub-component forecasts have different attributes at different times; if the ensemble weights are fixed over time, these changing attributes cannot be captured. Therefore, the DCE learning approach is proposed in this paper, which employs a clustering scheme to cluster the sub-component forecasting results. By using different ensemble weights at different forecasting times, better performance can be obtained than with fixed ensemble weights.

The framework of the DCE learning approach is shown in Fig. 1. As can be seen from Fig. 1, the DCE learning approach contains four steps:

(1) Decomposition. A decomposition method is introduced to decompose the original time series into relatively simple and meaningful component series.
(2) Individual forecasting. Forecasting models are employed to forecast each component series.
(3) Clustering. A clustering method is used to cluster the individual forecasting results.
(4) Ensemble forecasting. An ensemble method is applied to calculate the ensemble weights of each cluster. The corresponding cluster's ensemble weights are then applied to the component forecasts to obtain the ensemble forecasting results.

3. Related methodologies

3.1. Ensemble empirical mode decomposition

Empirical mode decomposition (EMD) is an adaptive signal decomposition method proposed by Huang et al. (1998). It decomposes a signal into several intrinsic mode functions (IMFs) and a residual series.

Each IMF has a zero local mean, and its numbers of extreme values and zero crossings are equal or differ at most by one. Different IMFs have different frequency ranges and represent different kinds of natural oscillation modes embedded in the original signal; thus, different IMFs may highlight different details of the signal. Compared with other methods, the key advantage of EMD is that the basis functions are derived directly from the original signal rather than fixed a priori. It is especially suitable for analyzing nonlinear and non-stationary signals.

Ensemble empirical mode decomposition (EEMD) is an improved version of EMD which overcomes the mode mixing problem of EMD. The main idea of EEMD is to add white noise to the original signal over many trials; the EMD is then applied to the noisy signal with the added white noise, rather than to the original signal, to obtain the IMFs. For further details on these methods, please refer to Huang et al. (1998), Wu and Huang (2009) and Zhang et al. (2008).

3.2. K-means clustering

K-means is an unsupervised learning algorithm for clustering, which was proposed by MacQueen (1967). The basic idea of the K-means algorithm is to split the historical dataset into several subsets, where each subset has its own unique characteristics. Assuming that the training set is X = {x_1, x_2, …, x_n}, the detailed steps of K-means clustering are as follows:

(1) Initialize the cluster centroids (c_1, c_2, …, c_k) randomly.
(2) If ‖x_i − c_k‖ ⩽ ‖x_i − c_p‖ for all p = 1, 2, …, K, p ≠ k, assign x_i to cluster k.
(3) Update each cluster centroid by c_i = (1/N_i) Σ_{j=1}^{N_i} x_ij, i = 1, 2, …, k, where N_i is the number of members of cluster i.
(4) Compute the least square clustering error J = Σ_{i=1}^{k} Σ_{j=1}^{N_i} ‖x_ij − c_i‖².
(5) Repeat steps 2–4 until J converges.

From the above process, it can be seen that the number of clusters K and the selection of the initial cluster centers are critical to the final clustering results. For further details on this method, please refer to MacQueen (1967).

3.3. Least square support vector machine

The support vector machine (SVM) was first proposed by Vapnik based on statistical learning theory and the principle of structural risk minimization, and possesses good performance even for small samples (Vapnik, 2013). However, it is time-consuming and leads to a high computational cost when dealing with large-scale problems. Therefore, least square support vector regression (LSSVR) was proposed by Suykens and Vandewalle (1999).

The basic idea of support vector regression is to map the original data into a high-dimensional feature space and perform a linear regression in that space. It can be formulated as

f(x) = wᵀφ(x) + b, (1)

where φ(x) is a nonlinear mapping function, f(x) is the estimated value, and w and b are the weights.

This can be transformed into an optimization problem:

min (1/2) wᵀw + C Σ_{t=1}^{T} (ξ_t + ξ_t*)
s.t. wᵀφ(x_t) + b − y_t ⩽ ε + ξ_t, t = 1, 2, …, T
     y_t − (wᵀφ(x_t) + b) ⩽ ε + ξ_t*, t = 1, 2, …, T
     ξ_t, ξ_t* ⩾ 0, (2)

where C is the penalty parameter, ξ_t and ξ_t* are the nonnegative slack
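The K-means steps (1)–(5) of Section 3.2 translate almost line-for-line into code. The sketch below is a generic illustration, not the authors' implementation; for reproducibility it initializes the centroids from the first k points instead of randomly:

```python
def kmeans(X, k, max_iter=100, tol=1e-9):
    # Step (1): initialize centroids (here: the first k points, for reproducibility).
    c = [list(x) for x in X[:k]]
    labels = [0] * len(X)
    J_prev = float("inf")
    for _ in range(max_iter):
        # Step (2): assign each x_i to its nearest centroid.
        labels = [min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(x, c[j])))
                  for x in X]
        # Step (3): move each centroid to the mean of its members.
        for j in range(k):
            members = [x for x, l in zip(X, labels) if l == j]
            if members:
                c[j] = [sum(col) / len(members) for col in zip(*members)]
        # Step (4): least-square clustering error J.
        J = sum(sum((a - b) ** 2 for a, b in zip(x, c[l])) for x, l in zip(X, labels))
        # Step (5): stop once J has converged.
        if J_prev - J < tol:
            break
        J_prev = J
    return labels, c, J
```

As noted above, the result depends on k and on the initial centroids, which is why the deterministic initialization is spelled out here.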
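The remainder of the LSSVR derivation is cut off here, but for reference, fitting an LSSVR model with an RBF kernel reduces to solving a single linear system in the dual variables (Suykens and Vandewalle, 1999). The following is our illustrative sketch of that idea, not the authors' code; in the paper the parameters C and σ² are additionally tuned by GSA:

```python
import math

def rbf(u, v, sigma2=1.0):
    # Gaussian RBF kernel, the kernel function used in the paper.
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2 * sigma2))

def solve(A, b):
    # Gaussian elimination with partial pivoting (small dense systems only).
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def lssvr_fit(X, y, C=10.0, sigma2=1.0):
    # LSSVR dual system: [[0, 1^T], [1, K + I/C]] [b; alpha] = [0; y].
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    rhs = [0.0] * (n + 1)
    for i in range(n):
        A[0][i + 1] = 1.0
        A[i + 1][0] = 1.0
        rhs[i + 1] = y[i]
        for j in range(n):
            A[i + 1][j + 1] = rbf(X[i], X[j], sigma2) + (1.0 / C if i == j else 0.0)
    sol = solve(A, rhs)
    b, alpha = sol[0], sol[1:]
    # Predictor: f(x) = sum_i alpha_i * K(x, x_i) + b.
    return lambda x: sum(a * rbf(x, xi, sigma2) for a, xi in zip(alpha, X)) + b
```

Unlike the ε-SVR problem of Eq. (2), the least-squares variant replaces the inequality constraints with equalities on squared residuals, which is what makes this one linear solve sufficient.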
are not listed here, but can be obtained from the authors.

The m-step-ahead forecasting horizons are employed to evaluate the performance of the proposed DCE learning approach. Given a time series x_t (t = 1, 2, …, T), the m-step-ahead forecast x̂_{t+m} is made as

x̂_{t+m} = f(x_t, x_{t−1}, …, x_{t−(l−1)}), t = 1, 2, …, T, (4)

where x̂_{t+m} is the m-step-ahead forecast value at time t, x_t is the actual value at time t, and l denotes the lag order, chosen by autocorrelation and partial correlation analysis (Lewis, 1982).

In addition, to assess the level forecasting accuracy and the directional forecasting accuracy of the proposed DCE learning approach against the benchmark models, three main evaluation criteria are employed to compare the in-sample and out-of-sample forecasting performance (Coimbra et al., 2013). For level forecasting accuracy, the normalized root mean squared error (NRMSE) and the mean absolute percentage error (MAPE) are selected:

NRMSE = (100 / x̄) · √[(1/T) Σ_{t=1}^{T} (x_t − x̂_t)²], MAPE = (1/T) Σ_{t=1}^{T} |(x_t − x̂_t) / x_t| × 100%, (5)

where T is the number of observations in the out-of-sample subset and x̄ is the mean of the actual values.

The performance in forecasting the direction of movement can be measured by directional symmetry (DS):

DS = (1/T) Σ_{t=2}^{T} d_t × 100%, where d_t = 1 if (x_t − x_{t−1})(x̂_t − x_{t−1}) > 0, and 0 otherwise. (6)

Furthermore, to evaluate the level and directional forecasting performance from a statistical view, the Diebold-Mariano (DM) statistic and the Pesaran-Timmermann (PT) statistic are adopted to test the statistical significance of all models. The main processes of the DM test and the PT test can be found in (Diebold and Mariano, 2002; Pesaran and Timmermann, 1992; Sun et al., 2017).

For comparison purposes, some other traditional forecasting methods are regarded as benchmark models to be compared with the proposed DCE learning approach in terms of forecasting performance. According to the literature review above, the most popular univariate forecasting models are introduced for the solar radiation series, i.e., the typical ARIMA model and the most popular AI techniques, ANN and SVR. These three tools might be the most popular time series forecasting models and have been widely used both as single forecasting models and as component forecasting models in decomposition ensemble learning approaches. Therefore, these three typical univariate forecasting tools are introduced as single benchmark models in this study. Furthermore, LSSVR is also introduced as a single benchmark model, since LSSVR is selected as the component forecasting tool for the extracted components in the proposed DCE learning approach. In the decomposition stage, EMD, the original form of EEMD, is regarded as the benchmark decomposition method.

Therefore, in order to verify the forecasting performance of the proposed DCE learning approach, four single forecasting models, i.e., single ARIMA, ANN, SVR and LSSVR models, and five hybrid ensemble learning approaches, i.e., EMD-LSSVR-ADD, EMD-LSSVR-LSSVR, EMD-LSSVR-K-LSSVR, EEMD-LSSVR-ADD and EEMD-LSSVR-LSSVR, are considered as benchmark models.

For consistency, all LSSVR models, for both component forecasting and nonlinear ensemble forecasting, follow the same determination method as the single LSSVR benchmark model mentioned above. Generally speaking, the parameter specification is crucial for model performance. For the single SVR and LSSVR models, the Gaussian RBF kernel function is selected, and the GSA algorithm is used to determine the optimal values of the parameters γ, σ² and ε in terms of the smallest error on the in-sample subset (Niu et al., 2016b; Rashedi et al., 2009). For the single ANN forecasting technique, the number of input neurons is set to 6, chosen by the autocorrelation function (ACF) and partial correlation function (PCF); the number of hidden neurons is set to 7 and the number of output neurons to 1 (Niu et al., 2016b). The ANN models are iteratively run 10,000 times to train the model (Yu et al., 2009). For the ARIMA (p, d, q) model, the optimal form is estimated by Schwarz Criterion (SC) minimization (Yu et al., 2009).

Six decomposition ensemble learning approaches are performed. For the EEMD decomposition, the number of ensemble members is set to 100 and the standard deviation of the added white noise is set to 0.2. Fig. 4 shows the decomposition results of the daily solar radiation data in Beijing. All IMF components are listed from the highest frequency to the lowest frequency, and the last one is the residual term.

Therefore, four steps are involved in the empirical results analysis: (1) four single models are performed to forecast solar radiation in Beijing to select the best single model; (2) the forecasting performance of the proposed EEMD-LSSVR-K-LSSVR approach is compared with the other ensemble benchmark approaches to ensure that the proposed DCE approach is statistically superior to all other benchmark approaches; (3) the robustness of the DCE approach is analyzed; (4) some conclusions are summarized from the empirical analysis.

4.2. Empirical results

4.2.1. Performance comparison of single models

The performances of the four single forecasting models are analyzed by means of the NRMSE, MAPE and DS evaluation criteria, as shown in Figs. 5–7. Tables 1–4 demonstrate the results of the DM test and PT test with respect to different forecasting horizons.

From Figs. 5–7, it can be concluded that: (1) LSSVR is the best single benchmark model in terms of forecasting accuracy and statistical tests, followed by the other AI techniques, with ARIMA ranking last; (2) it may be difficult for ARIMA, as a traditional linear model, to capture the complexity and nonlinearity of the solar radiation data, while AI techniques are more appropriate for exploring the nonlinear patterns; (3) the forecasting performances of SVR and ANN are quite similar across forecasting horizons.

The DM test and PT test are employed to test the differences between the four single benchmark models in terms of level accuracy and directional accuracy, and the results are shown in Tables 1–4. According to the above empirical results, it can be concluded that (1) all the single models are almost ineffective in solar radiation forecasting; (2) the level
Fig. 4. The IMFs and a residual for Beijing solar radiation data decomposed by EEMD.
Fig. 5. Performance comparison of different single models in terms of NRMSE criteria.
Fig. 6. Performance comparison of different single models in terms of MAPE criteria.
Fig. 7. Performance comparison of different single models in terms of DS criteria.
Fig. 8. Performance comparison of different hybrid ensemble approaches in terms of NRMSE criteria.
Table 1
DM test results for single models in one-step-ahead forecasting.

Tested model   Benchmark model
               ANN                 ARIMA
SVR            −0.1182 (0.4530)    −3.5628 (0.0001)
ANN                                −3.7143 (0.0001)

Table 2
DM test results for single models in three-step-ahead forecasting.

Tested model   Benchmark model
               SVR    ANN    ARIMA

Fig. 9. Performance comparison of different hybrid ensemble approaches in terms of MAPE criteria.
Table 5
Cluster centers and member numbers of the EEMD-LSSVR-K-LSSVR ensemble approach.

Horizon                       Cluster   Cluster center                                                                           Members
One-step-ahead forecasting    1         (0.228, −0.180, −0.178, −0.194, −0.173, −0.001, −0.815, −0.122, 0.037, −0.010, 8.445)    715
                              2         (0.010, 0.137, 0.028, −0.001, 0.052, −0.394, −0.348, −0.016, 0.061, 0.010, 8.609)        989
                              3         (−0.087, −0.260, 0.206, −0.074, −0.290, −0.369, 0.478, 0.086, −0.008, −0.010, 8.701)     662
Three-step-ahead forecasting  1         (−0.147, −0.088, 0.028, 0.146, −0.204, 0.144, −0.465, −0.093, 0.034, 0.008, 8.446)       726
                              2         (−0.449, 0.209, 0.217, 0.001, −0.006, 0.217, −1.050, 0.006, 0.041, 0.009, 8.560)         982
                              3         (0.535, −0.208, 0.317, −0.004, −0.317, −0.350, 0.494, 0.086, −0.008, −0.010, 8.700)      656
Six-step-ahead forecasting    1         (0.096, 0.002, −0.078, −0.045, −0.167, −0.366, 0.272, −0.026, 0.029, −0.008, 8.450)      695
                              2         (0.385, 0.038, −0.226, −0.091, 0.064, −0.050, 0.097, −0.017, 0.067, 0.010, 8.612)        987
                              3         (−0.118, −0.058, 0.306, 0.048, −0.402, −0.218, 0.596, 0.087, −0.011, −0.009, 8.703)      679
Table 6
DM test results for hybrid ensemble approaches in one-step-ahead forecasting.

Tested model        Benchmark model
                    EMD-LSSVR-K-LSSVR   EEMD-LSSVR-LSSVR    EMD-LSSVR-LSSVR     EEMD-LSSVR-ADD      EMD-LSSVR-ADD
EEMD-LSSVR-K-LSSVR  −1.9124 (0.0279)    −2.0125 (0.0221)    −2.0964 (0.0180)    −2.9974 (0.0014)    −3.8961 (0.0000)
EMD-LSSVR-K-LSSVR                       −1.7029 (0.0443)    −1.2385 (0.1078)    −1.6951 (0.0450)    −3.5264 (0.0002)
EEMD-LSSVR-LSSVR                                            −1.3346 (0.0910)    −1.0263 (0.1524)    −1.5788 (0.0572)
EMD-LSSVR-LSSVR                                                                 −1.5968 (0.0552)    −2.9816 (0.0014)
EEMD-LSSVR-ADD                                                                                      0.8463 (0.1987)

Table 7
DM test results for hybrid ensemble approaches in three-step-ahead forecasting.

Tested model        Benchmark model
                    EMD-LSSVR-K-LSSVR   EEMD-LSSVR-LSSVR    EMD-LSSVR-LSSVR     EEMD-LSSVR-ADD      EMD-LSSVR-ADD
EEMD-LSSVR-K-LSSVR  −2.2354 (0.0127)    −2.9025 (0.0019)    −3.4458 (0.0002)    −4.5543 (0.0000)    −4.0056 (0.0000)
EMD-LSSVR-K-LSSVR                       −1.8974 (0.0289)    −2.5961 (0.0047)    −3.2594 (0.0005)    −4.2674 (0.0000)
EEMD-LSSVR-LSSVR                                            0.7842 (0.2165)     −1.0268 (0.1523)    −1.3642 (0.0863)
EMD-LSSVR-LSSVR                                                                 −1.1269 (0.1299)    −1.6874 (0.0458)
EEMD-LSSVR-ADD                                                                                      0.4419 (0.3293)

Table 8
DM test results for hybrid ensemble approaches in six-step-ahead forecasting.

Tested model        Benchmark model
                    EMD-LSSVR-K-LSSVR   EEMD-LSSVR-LSSVR    EMD-LSSVR-LSSVR     EEMD-LSSVR-ADD      EMD-LSSVR-ADD
EEMD-LSSVR-K-LSSVR  −2.9156 (0.0018)    −2.6671 (0.0038)    −3.1146 (0.0009)    −6.1573 (0.0000)    −6.2105 (0.0000)
EMD-LSSVR-K-LSSVR                       −1.7849 (0.0371)    −2.5963 (0.0047)    −4.5267 (0.0000)    −4.5249 (0.0000)
EEMD-LSSVR-LSSVR                                            1.0284 (0.1519)     −1.6157 (0.0531)    −6.4892 (0.0000)
EMD-LSSVR-LSSVR                                                                 −1.8816 (0.0299)    −3.1257 (0.0008)
EEMD-LSSVR-ADD                                                                                      −0.3674 (0.3567)

Table 9
PT test results for hybrid ensemble approaches.

EMD-LSSVR   EEMD-LSSVR
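The DM statistics in Tables 1–2 and 6–8 test whether two models' squared forecast errors differ significantly (Diebold and Mariano, 2002). A minimal one-step version (ours; it omits the autocovariance correction used for multi-step horizons) looks like this:

```python
import math

def dm_statistic(errors_a, errors_b):
    # Loss differential of squared errors; negative values favor model A,
    # matching the sign convention of the tables (tested model vs. benchmark).
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    T = len(d)
    mean = sum(d) / T
    var = sum((di - mean) ** 2 for di in d) / T
    return mean / math.sqrt(var / T)
```

Under the null of equal accuracy the statistic is asymptotically standard normal, which is where the p-values in parentheses come from.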
Table 11
Robustness analysis for hybrid ensemble approaches.

                                             EMD-LSSVR                     EEMD-LSSVR
                                             ADD     LSSVR   K-LSSVR       ADD     LSSVR   K-LSSVR
One-step-ahead forecasting   Std. of NRMSE   0.2049  0.0509  0.0512        0.3671  0.1327  0.0256
                             Std. of MAPE    0.0106  0.0076  0.0029        0.0087  0.0095  0.0032
                             Std. of DS      0.1956  0.0791  0.0113        0.1652  0.0592  0.0098
Three-step-ahead forecasting Std. of NRMSE   0.2149  0.0583  0.0664        0.3792  0.1428  0.0329
                             Std. of MAPE    0.0153  0.0107  0.0059        0.0114  0.0137  0.0065
                             Std. of DS      0.2011  0.0831  0.0139        0.1798  0.0625  0.0119
Six-step-ahead forecasting   Std. of NRMSE   0.2218  0.0623  0.0704        0.3884  0.1568  0.0395
                             Std. of MAPE    0.0163  0.0125  0.0071        0.0182  0.0152  0.0085
                             Std. of DS      0.2217  0.0892  0.0149        0.1859  0.0782  0.0128
Table 12
Forecasting performance using the hybrid model proposed in Monjoly et al. (2017).
express their sincere appreciation to Jan Kleissl (the subject editor), Hugo Pedro (the associate editor), and two anonymous referees for their valuable comments and suggestions on this paper. Their comments and suggestions have improved the quality of the paper immensely.

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Appendix A. List of abbreviations

Here, all terms mentioned in this paper and their definitions are listed in alphabetical order:

ACF – autocorrelation function
AI – artificial intelligence
ANN – artificial neural network
AR – autoregressive model
ARMA – autoregressive moving average
ARIMA – autoregressive integrated moving average
BPNN – back propagation neural network
CS – cuckoo search algorithm
DCE – decomposition-clustering-ensemble learning approach
DHR – dynamic harmonic regression model
DM – Diebold-Mariano test
DS – directional symmetry
EEMD – ensemble empirical mode decomposition
EEMD-LSSVR-ADD – the model using EEMD as decomposition method, LSSVR as component forecasting method and ADD as ensemble method
EEMD-LSSVR-LSSVR – the model using EEMD as decomposition method, LSSVR as component forecasting method and LSSVR as ensemble method
EEMD-LSSVR-K-LSSVR – the model using EEMD as decomposition method, LSSVR as component forecasting method, K-means as clustering method and LSSVR as ensemble method
ELM – extreme learning machine
EMD – empirical mode decomposition
EMD-LSSVR-ADD – the model using EMD as decomposition method, LSSVR as component forecasting method and ADD as ensemble method
EMD-LSSVR-LSSVR – the model using EMD as decomposition method, LSSVR as component forecasting method and LSSVR as ensemble method
EMD-LSSVR-K-LSSVR – the model using EMD as decomposition method, LSSVR as component forecasting method, K-means as clustering method and LSSVR as ensemble method
GBR – gradient boosted regression
GHI – global horizontal solar irradiance
GP – Gaussian process
GSA – gravitational search algorithm
IMFs – intrinsic mode functions
LSSVR – least square support vector regression
MAPE – mean absolute percentage error
MLPNN – multi-layer perceptron neural network
MRSR – multi-response sparse regression
NRMSE – normalized root mean squared error
OP-ELM – optimally pruned extreme learning machine
PCF – partial correlation function
PT – Pesaran-Timmermann test
RFR – random forest regression
SC – Schwarz criterion
SVM – support vector machines
SVR – support vector regression
References

Akarslan, E., Hocaoğlu, F.O., Edizkan, R., 2014. A novel MD (multi-dimensional) linear prediction filter approach for hourly solar radiation forecasting. Energy 73, 978–986.
Amrouche, B., Le Pivert, X., 2014. Artificial neural network based daily local forecasting for global solar radiation. Appl. Energ. 130, 333–341.
Benmouiza, K., Cheknane, A., 2013. Forecasting hourly global solar radiation using hybrid k-means and nonlinear autoregressive neural network models. Energ. Convers. Manage. 75, 561–569.
Chen, S.X., Gooi, H.B., Wang, M.Q., 2013. Solar radiation forecast based on fuzzy logic and neural networks. Renew. Energ. 60, 195–201.
Chow, C.W., Urquhart, B., Lave, M., Dominguez, A., Kleissl, J., Shields, J., Washom, B., 2011. Intra-hour forecasting with a total sky imager at the UC San Diego solar energy testbed. Sol. Energy 85 (11), 2881–2893.
Coimbra, C.F., Kleissl, J., Marquez, R., 2013. Overview of solar-forecasting methods and a metric for accuracy evaluation. Solar Resource Assess. Forecast.
Diebold, F.X., Mariano, R.S., 2002. Comparing predictive accuracy. J. Bus. Econ. Stat. 20 (1), 134–144.
Fidan, M., Hocaoğlu, F.O., Gerek, Ö.N., 2014. Harmonic analysis based hourly solar radiation forecasting model. IET Renew. Power Gen. 9 (3), 218–227.
Gala, Y., Fernández, Á., Díaz, J., Dorronsoro, J.R., 2016. Hybrid machine learning forecasting of solar radiation values. Neurocomputing 176, 48–59.
He, K., Yu, L., Lai, K.K., 2012. Crude oil price analysis and forecasting using wavelet decomposed ensemble model. Energy 46 (1), 564–574.
Huang, J., Korolkiewicz, M., Agrawal, M., Boland, J., 2013. Forecasting solar radiation on an hourly time scale using a coupled autoregressive and dynamical system (CARDS) model. Sol. Energy 87, 136–149.
Huang, N.E., Shen, Z., Long, S.R., Wu, M.C., Shih, H.H., Zheng, Q., Yen, N., Tung, C.C., Liu, H.H., 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Roy. Soc. Lond. A: Math., Phys. Eng. Sci. 454, 903–995.
Jiang, H., Dong, Y., Wang, J., Li, Y., 2015. Intelligent optimization models based on hard-ridge penalty and RBF for forecasting global solar radiation. Energ. Convers. Manage. 95, 42–58.
Lauret, P., Voyant, C., Soubdhan, T., David, M., Poggi, P., 2015. A benchmarking of machine learning techniques for solar radiation forecasting in an insular context. Sol. Energy.
using the singular spectrum analysis and nonlinear multi-layer perceptron network optimized by hybrid intelligent algorithm for short-term load forecasting. Appl. Math. Model. 40 (5), 4079–4093.
Niu, M.F., Sun, S.L., Wu, J., Zhang, Y.L., 2015. Short-term wind speed hybrid forecasting model based on bias correcting study and its application. Math. Probl. Eng. 2015. http://dx.doi.org/10.1155/2015/351354.
Niu, M.F., Wang, Y.F., Sun, S.L., Li, Y.W., 2016a. A novel hybrid decomposition and ensemble model based on CEEMD and GWO for short-term PM2.5 concentration forecasting. Atmos. Environ. 134, 168–180.
Paoli, C., Voyant, C., Muselli, M., Nivet, M., 2010. Forecasting of preprocessed daily solar radiation time series using neural networks. Sol. Energ. 84 (12), 2146–2160.
Pesaran, M.H., Timmermann, A., 1992. A simple nonparametric test of predictive performance. J. Bus. Econ. Stat. 10 (4), 461–465.
Rashedi, E., Nezamabadi-Pour, H., Saryazdi, S., 2009. GSA: a gravitational search algorithm. Inform. Sci. 179 (13), 2232–2248.
Sun, S.L., Qiao, H., Wei, Y., Wang, S.Y., 2017. A new dynamic integrated approach for wind speed forecasting. Appl. Energ. 197, 151–162.
Suykens, J.A., Vandewalle, J., 1999. Least squares support vector machine classifiers. Neural Process. Lett. 9 (3), 293–300.
Tang, L., Yu, L., Wang, S., Li, J., Wang, S.Y., 2012. A novel hybrid ensemble learning paradigm for nuclear energy consumption forecasting. Appl. Energ. 93, 432–443.
Trapero, J.R., Kourentzes, N., Martin, A., 2015. Short-term solar irradiation forecasting based on dynamic harmonic regression. Energy 84, 289–295.
Vapnik, V., 2013. The Nature of Statistical Learning Theory. Springer Science & Business Media.
Voyant, C., Paoli, C., Muselli, M., Nivet, M., 2013. Multi-horizon solar radiation forecasting for Mediterranean locations using time series models. Renew. Sust. Energ. Rev. 28, 44–52.
Wang, J., Jiang, H., Wu, Y., Dong, Y., 2015. Forecasting solar radiation using an optimized hybrid model by Cuckoo Search algorithm. Energy 81, 627–644.
Wang, S., Yu, L., Tang, L., Wang, S.Y., 2011. A novel seasonal decomposition based least squares support vector regression ensemble learning approach for hydropower consumption forecasting in China. Energy 36 (11), 6542–6554.
Wang, S.Y., Yu, L., Lai, K.K., 2005. Crude oil price forecasting with TEI@I methodology. J. Syst. Sci. Complex. 18 (2), 145–166.
Wang, S.Y., 2004. TEI@I: A New Methodology for Studying Complex Systems. The International Workshop on Complexity Science, Tsukuba, Japan.
Energy 112, 446–457. Wu, Z., Huang, N.E., 2009. Ensemble empirical mode decomposition: a noise-assisted
Lewis, C.D., 1982. International and Business Forecasting Methods. Butter-Worths, data analysis method. Adv. Adapt. Data Anal. 1 (01), 1–41.
London. Yu, L., Dai, W., Tang, L., 2016. A novel decomposition ensemble model with extended
MacQueen, J., 1967. Some methods for classification and analysis of multivariate ob- extreme learning machine for crude oil price forecasting. Eng. Appl. Artif. Intel. 47,
servations. Proc. Fifth Berkeley Sympos. Math. Stat. Probab. 1 (14), 281–297. 110–121.
Mathiesen, P., Collier, C., Kleissl, J., 2013. A high-resolution, cloud-assimilating numer- Yu, L., Wang, S.Y., Lai, K.K., 2008. Forecasting crude oil price with an EMD-based neural
ical weather prediction model for solar irradiance forecasting. Sol Energy 92, 47–61. network ensemble learning paradigm. Energ. Econ. 30 (5), 2623–2635.
Mathiesen, P., Kleissl, J., 2011. Evaluation of numerical weather prediction for intra-day Yu, L., Wang, S.Y., Lai, K.K., 2009. A neural-network-based nonlinear metamodeling
solar forecasting in the continental United States. Sol Energy 85 (5), 967–977. approach to financial time series forecasting. Appl. Soft Comput. 9 (2), 563–574.
Mellit, A., Benghanem, M., Kalogirou, S.A., 2006. An adaptive wavelet-network model for Yu, L., Wang, Z., Tang, L., 2015. A decomposition–ensemble model with data-char-
forecasting daily total solar-radiation. Appl. Energ. 83 (7), 705–722. acteristic-driven reconstruction for crude oil price forecasting. Appl. Energ. 156,
Monjoly, S., André, M., Calif, R., Soubdhan, T., 2017. Hourly forecasting of global solar 251–267.
radiation based on multiscale decomposition methods: a hybrid approach. Energy Yu, L., Zhao, Y., Tang, L., 2014. A compressed sensing based AI learning paradigm for
119, 288–298. crude oil price forecasting. Energ. Econ. 46, 236–245.
Niu, M.F., Gan, K., Sun, S.L., Li, F.Y., 2017. Application of decomposition-ensemble Zhang, X., Lai, K.K., Wang, S.Y., 2008. A new approach for crude oil price analysis based
learning paradigm with phase space reconstruction for day-ahead PM2.5 concentra- on empirical mode decomposition. Energ. Econ. 30 (3), 905–918.
tion forecasting. J. Environ. Manage. 196, 110–118.
Niu, M.F., Sun, S.L., Wu, J., Yu, L., Wang, J.Z., 2016b. An innovative integrated model