
Journal of Experimental & Theoretical Artificial Intelligence

ISSN: 0952-813X (Print) 1362-3079 (Online) Journal homepage: https://www.tandfonline.com/loi/teta20

A hybrid method based on wavelet, ANN and ARIMA model for short-term load forecasting

Abdollah Kavousi Fard & Mohammad-Reza Akbari-Zadeh

To cite this article: Abdollah Kavousi Fard & Mohammad-Reza Akbari-Zadeh (2014)
A hybrid method based on wavelet, ANN and ARIMA model for short-term load
forecasting, Journal of Experimental & Theoretical Artificial Intelligence, 26:2, 167-182, DOI:
10.1080/0952813X.2013.813976

To link to this article: https://doi.org/10.1080/0952813X.2013.813976

Published online: 23 Jul 2013.

Journal of Experimental & Theoretical Artificial Intelligence, 2014
Vol. 26, No. 2, 167–182, http://dx.doi.org/10.1080/0952813X.2013.813976

A hybrid method based on wavelet, ANN and ARIMA model for short-term
load forecasting
Abdollah Kavousi Fard (a,*) and Mohammad-Reza Akbari-Zadeh (b)
(a) Young Researchers Club, Sarvestan Branch, Islamic Azad University, Sarvestan, Iran; (b) Department of
Electrical Engineering, Technical College, Shahid Bahonar University of Shiraz, Shiraz, Iran
(Received 5 December 2012; final version received 21 April 2013)

In the new competitive electricity markets, accurate scheduling depends on appropriate load
forecasting tools, and the dependability of a forecast is determined by the model used to produce
it. In this regard, this paper proposes a new hybrid forecasting method
based on the wavelet transform, autoregressive integrated moving average (ARIMA)
and artificial neural network (ANN) for short-term load forecasting. In the proposed
model, the autocorrelation function and the partial autocorrelation function are used
to determine whether the load time series is stationary. Then, the appropriate order of the
ARIMA model is found using the Akaike information criterion. The ARIMA model captures the
linear component of the load time series, so that the residuals contain only the nonlinear
components. The nonlinear part is decomposed by the discrete wavelet transform into its
sub-frequencies. Several ANNs are applied to the detail and approximation components of the
residual signal to predict the future load sample. Finally, the outputs of the ARIMA model and
the ANNs are summed. The empirical results show that the proposed hybrid method improves
the load forecasting accuracy.
Keywords: artificial neural network; discrete wavelet transform; autoregressive
integrated moving average; short-term load forecasting

1. Introduction
Short-term load forecasting (STLF) is a key issue in the optimal operation management of power
networks in the new competitive power markets. Overestimation of the load demand can cause
overconservative operation (excessive energy purchase), whereas underestimation may result in
overly risky operation (operating in a vulnerable area; Fan, Chen, & Lee, 2009). By definition, STLF
covers a time horizon from 1 h to 1 week (Amjady, 2007). In the new deregulated
power market, an improvement of a few per cent in the load forecasting accuracy of an
average-sized electric utility can save about million British pounds in a year while also enhancing
the total security (Bunn & Farmer, 1985). Therefore, in recent years, several
approaches have been proposed to improve the forecasting accuracy and to cope with the ongoing
market reforms.
In a general classification, all STLF methods can be divided into two broad categories
(Khosravi, Nahavandi, & Creighton, 2010): (1) statistical (parametric) models and (2) artificial
intelligence-based techniques. In the class of statistical models, regression models (linear or
piecewise-linear; Papalexopoulos & Hesterberg, 1990), the Kalman filter (Al-Hamadi & Soliman,
2004), time series [autoregressive integrated moving average (ARIMA)] models (Huang & Shih,
2003), data mining approaches (Wu & Lu, 2003) and the state space method (Senjyu, Sakihara,
Tamaki, & Uezato, 2000) are well known. One of the most significant and popular time-series
models is the ARIMA model. The key to its popularity is its inherent statistical properties
together with the well-established Box-Jenkins methodology (Box & Jenkins, 1970). However,
because the ARIMA model assumes a linear correlation structure among the time-series values,
it cannot capture nonlinear patterns. Given the intrinsic complexity of electrical loads and the
increasing nonlinearity of power systems, applying these statistical approaches to load
forecasting is therefore challenging. In fact, statistical load forecasters are often prone to
bias (Ferreira & Silva, 2007).

*Corresponding author. Email: abdollah.kavousifard@gmail.com

© 2013 Taylor & Francis

In contrast, modelling techniques based on artificial intelligence can learn nonlinear
relationships among different variables with good accuracy. Among the most
popular techniques of this group are expert systems (Kim, Park, Hwang, & Kim, 1995), fuzzy
systems (Niknam, Kavousifard, & Aghaei, 2012), artificial neural networks (ANNs;
Kavousi-Fard & Akbari-Zadeh, 2013) and neuro-fuzzy systems (Kavousifard & Samet, 2011). In recent
years, ANNs have been applied successfully to forecasting. A review of ANNs
used in the load modelling area can be found in Zhang, Patuwo, and Hu (1998). ANNs have a
good capacity for self-learning and nonlinear approximation when faced with complex nonlinear
data-sets. Moreover, with ANNs, there is no need to choose a specific model structure. However,
there is some scepticism about the use of ANNs for STLF: (1) ANNs are often applied
without any specific knowledge of the properties of the load data and (2) the ANNs can
become unnecessarily large. The first problem can be addressed by analysing
the load data from different aspects, such as stationarity and linearity.
The second problem can be addressed by developing ANNs
with a constructive approach (Kwok & Yeung, 1997). In addition, dividing the load data into its
sub-frequencies reduces the required size of the ANN notably while improving the dependability
of the final load forecast. In this regard, the wavelet transform is a suitable tool for simplifying
the input data of the ANN. In recent years, several studies have exploited the benefits of the
wavelet transform in load forecasting (Chen, Luh, & Rourke, 2010; Yao,
Song, Zhang, & Cheng, 2000). In these studies, the wavelet transform is used to decompose the
load time series into its sub-frequencies, which makes it possible to
analyse the load data at its different frequencies.
Nevertheless, in spite of all the above research, a notable prediction error remains because of the
nonlinear, sophisticated and chaotic behaviour of the load data. In fact, without any knowledge
of the characteristics of the input data, powerful forecasting models such as ANNs cannot
perform as well as they should, and it is not wise to apply them to every type of input data.
Unfortunately, it is hard and time consuming to analyse
the characteristics of the load data in a real power system completely. In this situation,
hybrid methodologies that include both linear and nonlinear load modelling
techniques can be useful, because they capture both the linear and the nonlinear components of
the investigated system.
According to the above discussion, this paper proposes a new hybrid correction method
comprising the ARIMA model, the wavelet transform and ANNs to obtain a reliable and accurate load
forecasting model. In the first step, the autocorrelation function (ACF) and the partial
autocorrelation function (PACF) are applied to the load data to determine whether the time series
is stationary. Then, the appropriate order of the ARIMA model is chosen using the Akaike
information criterion (AIC). The ARIMA model, as the linear model of the proposed hybrid
method, captures the linear components of the load data. Therefore, the residuals of the
linear model contain the nonlinear components of the data. The residuals are decomposed
by the wavelet transform into several sub-frequency time series, each of which is
fed into an ANN for forecasting. Finally, the forecasts of the linear and nonlinear
models are summed to give the future load. The historical load data of the Fars Electricity
Power Company, Iran, in 2009 are used as the case study. The forecasting process is
implemented to find the next-day load value.
The rest of this paper is organised as follows: In Section 2, a brief theory and background of
the forecasting models which are utilised in this study are explained. In Section 3, the proposed
hybrid method is described completely. The feasibility and satisfying performance of the
proposed method are demonstrated in Section 4. Finally, the remarks and findings are discussed
in Section 5.

2. Time-series forecasting methods


In this section, the basic principles and modelling process of the ARIMA, ANN and wavelet
transform are discussed briefly.

2.1. ARIMA model


The ARIMA model belongs to the class of stochastic processes for the analysis of the non-
stationary time series. An ARIMA model consists of three main parts: (1) the autoregressive (or
AR), (2) the integrated (or I) and (3) the moving average (or MA). In an ARIMA model, the
future variable is supposed to be a linear function of the past observations and some random
errors as follows (Hirotugu, 1974):

$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \dots - \theta_q a_{t-q}, \tag{1}$$

where $y_t$ and $a_t$ are the actual value and the random error at time $t$, respectively; $\phi_i$
($i = 1, 2, \dots, p$) and $\theta_j$ ($j = 1, 2, \dots, q$) are the parameters of the model; and $p$ and
$q$ are integers giving the order of the model. The random errors $a_t$ are assumed to be
independent and identically distributed with zero mean and variance $\sigma^2$. The constant
$\delta$ is given by

$$\delta = \mu (1 - \phi_1 - \phi_2 - \dots - \phi_p), \tag{2}$$

where $\mu$ is the mean value of the investigated time series. In Equation (1), when $q = 0$, the
model reduces to an AR model of order $p$. Similarly, when $p = 0$, it reduces to an MA model of
order $q$. By definition, the time series $\{y_t\}$ is an ARIMA($p$, $d$, $q$) process if it must first be
differenced $d$ times ($w_t = \Delta^d y_t$) to become stationary, after which it can be modelled by
an ARMA($p$, $q$) model. Here, $\Delta$ is the differencing operator and $d$ is the order of differencing.
Note that a non-stationary time series can often be made stationary by differencing it once or more.
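The differencing step in the ARIMA definition can be illustrated with a minimal Python sketch. The function name `difference` and the toy series are hypothetical and not from the paper, which reports a MATLAB implementation; the sketch only shows how applying the first-difference operator $d$ times shortens the series and removes a polynomial trend.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times (w_t = y_t - y_{t-1} per pass)."""
    if d < 0:
        raise ValueError("d must be non-negative")
    out = list(series)
    for _ in range(d):
        # Each pass shortens the series by one sample.
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


# Toy series with a quadratic trend (illustrative values, not load data).
quadratic = [t * t for t in range(1, 7)]   # 1, 4, 9, 16, 25, 36
once = difference(quadratic, d=1)          # a linear trend remains
twice = difference(quadratic, d=2)         # the trend is removed
```

In this toy case one pass leaves a linear trend and a second pass leaves a constant series, which is why the order of differencing is chosen as the smallest $d$ that makes the series look stationary.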
Box and Jenkins (1970) developed an iterative method to build ARIMA models for
forecasting. Their method consists of three main steps: (1) model identification, (2)
parameter estimation and (3) diagnostic checking. In the ‘model identification’ step, the ACF
and the PACF of the time series are used to determine whether the series is stationary as
well as the appropriate order of the ARIMA model. If the data are non-stationary, the
differencing operator is applied to the time series once or more to make it stationary. Once a
suitable ARIMA model is chosen, the second step, ‘parameter estimation’, is straightforward:
using a method such as least squares estimation, the model estimation error
is minimised and the optimal parameters are calculated. In the last step, ‘diagnostic checking’,
the assumptions on the model error $a_t$ are checked. This step can be
implemented by plotting the model residuals or by other statistical diagnostic approaches. These
three steps are repeated until a proper model for forecasting is found.
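As a rough illustration of the identification step, the sample ACF can be computed directly from its definition. This is a minimal sketch in plain Python, not the authors' MATLAB code; `sample_acf` and the two toy series are hypothetical. A slowly decaying ACF, as in the trending series below, is the usual visual hint that a series is non-stationary and should be differenced.

```python
def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_0, ..., r_max_lag of a series."""
    n = len(series)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(series)")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(v * v for v in dev)
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # r_k = sum_{t=k}^{n-1} dev_t * dev_{t-k} / sum_t dev_t^2
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


# A pure trend: the ACF stays close to 1 for small lags.
trend = [float(t) for t in range(50)]
acf_trend = sample_acf(trend, 5)

# An alternating series: the ACF flips sign from lag to lag.
alternating = [1.0, -1.0] * 5
acf_alt = sample_acf(alternating, 2)
```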

2.2. Akaike’s information criterion


As mentioned in the previous section, the Box-Jenkins methodology uses the ACF and PACF to
find the appropriate order of the ARIMA model as well as to determine whether the time series is
stationary. In this study, we use the ACF and PACF only to assess the stationarity
of the time series, and Akaike’s information criterion (AIC) is used to
measure the goodness of the ARIMA model order. The most significant aspect of the AIC is that it
keeps a good trade-off between the bias and the variance of the model structure: it
searches for the best accuracy while penalising increases in model complexity. The AIC was
first introduced by Akaike in 1974. Under this criterion, the ARIMA model that minimises the
following expression is chosen:

$$\mathrm{AIC}(m) = \ln \sigma_a^2 + \frac{2m}{n}, \tag{3}$$

where $n$ is the length of the time series, $m$ is the number of model parameters and $\sigma_a^2$ is
the variance of the residuals.
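Equation (3) can be evaluated for a set of candidate models as in the sketch below. The helper names `aic` and `select_order`, the toy residuals, and the choice $m = p + q$ are assumptions made for illustration; the paper does not state how $m$ is counted, and the residual variance is estimated here as the mean squared residual, which assumes zero-mean residuals.

```python
import math


def aic(residuals, num_params):
    """Equation (3): ln(residual variance) + 2m/n."""
    n = len(residuals)
    if n == 0:
        raise ValueError("residuals must not be empty")
    variance = sum(e * e for e in residuals) / n   # assumes zero-mean residuals
    if variance <= 0:
        raise ValueError("residual variance must be positive")
    return math.log(variance) + 2.0 * num_params / n


def select_order(candidates):
    """Pick the (p, q) pair with the smallest AIC.

    `candidates` maps (p, q) to the residuals of the fitted model;
    m = p + q is an assumption of this sketch.
    """
    return min(candidates, key=lambda order: aic(candidates[order], sum(order)))


# Toy residuals (hypothetical): the larger model fits slightly better
# but pays a larger complexity penalty.
toy = {
    (1, 0): [1.0, -1.0, 1.0, -1.0],
    (2, 1): [0.9, -0.9, 0.9, -0.9],
}
best = select_order(toy)
```

In this toy case the smaller model is selected: the reduction in residual variance achieved by the larger model does not outweigh its $2m/n$ penalty.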

2.3. ANN methodology to time series modelling


As mentioned earlier, in this study the nonlinear component of the load time
series is modelled by ANNs, which make it possible to model a nonlinear relationship
between the input and the output data-sets. One of the most important advantages
of ANNs over other nonlinear modelling techniques is that they are
universal approximators that can learn a large class of functions with appropriate accuracy. This
ability stems from their parallel structure, which can detect complex and nonlinear
mappings. The single hidden layer ANN shown in Figure 1 is used in this study. As can be
seen, it is a feedforward multilayer perceptron (MLP) with three layers: (1) input, (2)
hidden and (3) output.

Figure 1. The structure of a three-layer feedforward MLP.

The single hidden layer neural network is the most popular ANN used for forecasting and
modelling (Zhang et al., 1998). The relationship between the inputs ($y_{t-1}, y_{t-2}, \dots, y_{t-p}$)
and the output $y_t$ can be described mathematically as follows:
$$y_t = \beta_0 + \sum_{j=1}^{s} \beta_j \, G\!\left(\alpha_{0j} + \sum_{i=1}^{p} \alpha_{ij} y_{t-i}\right) + a_t, \tag{4}$$

where $p$ and $s$ are the numbers of nodes in the input layer and the hidden layer, respectively, and
$\beta_j$ ($j = 0, 1, \dots, s$) and $\alpha_{ij}$ ($i = 0, 1, \dots, p$; $j = 1, 2, \dots, s$) are the biasing and
weighting factors of the ANN. Also, $G(\cdot)$ is the transfer function of the ANN. In this study, the
transfer function of the hidden layer is the logistic function:

$$G(x) = \frac{1}{1 + \exp(-x)}, \tag{5}$$

where $x$ is the input of the node and $\exp(\cdot)$ is the exponential function. According to
Equation (4), the ANN performs a nonlinear mapping between the input and the output data as follows:

$$y_t = f(y_{t-1}, y_{t-2}, \dots, y_{t-p}; \alpha, \beta) + a_t, \tag{6}$$

where $\alpha$ and $\beta$ are the vectors of the biasing and weighting factors of the ANN. In effect,
the ANN works as an AR model with the ability of nonlinear modelling. The simple ANN in
Equation (4) can approximate nonlinear functions with increasing accuracy as the number of nodes
$s$ in the hidden layer grows (Hornik, Stinchcombe, & White, 1990). However, for out-of-sample
forecasting, a neural network with few nodes in the hidden layer often performs better. As the
number of hidden nodes increases, the ability of the ANN to fit the nonlinear time series
increases, but at the cost of a reduced ability to generalise beyond the training data. This
phenomenon is known as overfitting. An overfitted
model fits the training data well but generalises poorly to samples outside the training data.
In addition, since in this study the ANN input
data are the nonlinear components of the load time series, the problem becomes even more
severe. Therefore, to overcome this problem, the wavelet transform is used to
reduce the complexity of the ANN input data by breaking it into its sub-frequencies.
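The forward pass of the single hidden layer network in Equations (4) and (5) can be sketched as follows, without the error term $a_t$. The function and argument names are hypothetical, the weights are placeholders rather than trained values, and training is not shown.

```python
import math


def logistic(x):
    """Logistic transfer function of Equation (5)."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forecast(lags, alpha0, alpha, beta0, beta):
    """Deterministic part of Equation (4).

    lags   : [y_{t-1}, ..., y_{t-p}]
    alpha0 : hidden-node biases, length s
    alpha  : alpha[i][j] is the weight from input i to hidden node j (p x s)
    beta0  : output bias
    beta   : hidden-to-output weights, length s
    """
    hidden = [
        logistic(alpha0[j] + sum(alpha[i][j] * lags[i] for i in range(len(lags))))
        for j in range(len(beta))
    ]
    return beta0 + sum(b * h for b, h in zip(beta, hidden))


# Placeholder network with p = 2 inputs and s = 3 hidden nodes.
# With all input weights at zero every hidden node outputs G(0) = 0.5.
zero_weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
y_next = mlp_forecast([1.0, 2.0], [0.0, 0.0, 0.0], zero_weights, 1.0, [2.0, 2.0, 2.0])
```

The number of hidden nodes $s$ is simply the length of `beta`, which makes the fit-versus-generalisation trade-off discussed above a matter of choosing that length.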

2.4. Wavelet decomposition and reconstruction


In recent years, the wavelet transform has been used as a powerful tool for signal processing. It
can be used to extract information from many kinds of data, such as audio
signals and images. The basic concept in wavelet analysis is to select a proper wavelet
(called the mother wavelet) and then perform an analysis using its translated and dilated versions
(Yao et al., 2000). There are different types of wavelet transforms, including the continuous wavelet
transform, the discrete wavelet transform (DWT) and the fast discrete wavelet transform. The DWT is
capable of extracting the coefficients of fine scales to capture high-frequency components as
well as the coefficients of coarse scales to capture low-frequency components. Mathematically,
the DWT can be formulated as follows (Pandey, Singh, & Sinha, 2010):

$$f(t) = \sum_{k} c_{j_0,k} \, \phi_{j_0,k}(t) + \sum_{j > j_0} \sum_{k} \omega_{j,k} \, 2^{j/2} \psi(2^{j} t - k), \tag{7}$$

where $\psi$ is the mother wavelet function, $j$ is the dilation or level index, $k$ is the translation
index, $\phi_{j_0,k}$ is the scaling function, $c_{j_0,k}$ are the coarse-scale (approximation)
coefficients and $\omega_{j,k}$ are the detail (fine-scale) coefficients. Here, the functions
$\psi(2^{j} t - k)$ are orthogonal.
By the use of the DWT, the input data are decomposed into high-frequency components (called
details) and low-frequency components (called approximations). The mother wavelet
$\psi$ is determined by the high-pass filters, which produce the detail components in the
decomposition process. Similarly, the scaling function $\phi_{j_0,k}$ is determined by the
low-pass filters, which produce the approximations. The processes of wavelet decomposition and
reconstruction are shown in Figure 2.
The wavelet decomposition tree can give important information about the signal investigated.
As shown in Figure 2, in the decomposition process the original signal $c_0[n]$ is passed through a
low-pass filter $h[n]$ and a high-pass filter $g[n]$ to be decomposed into two components, $c_1[n]$ and
$d_1[n]$. The approximation $c_1[n]$ contains the low-frequency components of the
original signal $c_0[n]$, and the detail $d_1[n]$ contains its high-frequency components.
Then, the approximation $c_1[n]$ is decomposed again at a larger scale into new
approximation and detail parts, $c_2[n]$ and $d_2[n]$, respectively. The decomposition
process can continue until the details consist of a single sample or pixel. The number of
decomposition levels can be chosen according to which gives the best solution.
The signal can be reconstructed by reassembling the decomposed components
without loss of information. The reconstruction process uses the
wavelet coefficients, as shown in Figure 2. Mathematically, the reconstructed signal consists of
the detail and approximation parts as follows:

$$c_0[n] = c_1[n] + d_1[n] = c_2[n] + d_2[n] + d_1[n] = c_3[n] + d_3[n] + d_2[n] + d_1[n]. \tag{8}$$
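The additive reconstruction in Equation (8) can be checked numerically with a small sketch. For brevity it uses the Haar wavelet, the simplest member of the Daubechies family (db1), which is an assumption of this example: the paper states only that Daubechies wavelets were used in MATLAB. All function names and the toy signal are hypothetical. Each component is synthesised back to full length on its own, so that the components sum to the original signal.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(signal):
    """One analysis step: split a signal into approximation and detail coefficients."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_inverse_step(approx, detail):
    """One synthesis step: rebuild the finer-scale signal from both coefficient sets."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


def additive_components(signal, levels):
    """Return full-length components [c_J, d_J, ..., d_1] that sum to `signal`."""
    n = len(signal)
    if levels < 1 or n == 0 or n % (2 ** levels) != 0:
        raise ValueError("signal length must be a positive multiple of 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)                      # details[k] holds level k + 1

    def synthesise(coeffs, steps):
        # Inverse steps with all detail coefficients set to zero.
        for _ in range(steps):
            coeffs = haar_inverse_step(coeffs, [0.0] * len(coeffs))
        return coeffs

    components = [synthesise(approx, levels)]       # approximation c_J[n]
    for level in range(levels, 0, -1):
        d = details[level - 1]
        first = haar_inverse_step([0.0] * len(d), d)    # detail-only synthesis
        components.append(synthesise(first, level - 1))  # detail d_level[n]
    return components


# Toy signal (illustrative values, not load data), three levels as in Equation (8).
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
parts = additive_components(signal, 3)
rebuilt = [sum(values) for values in zip(*parts)]
```

Because the transform is linear, forecasting each component separately and summing the forecasts, as the proposed method does, is consistent with this additive structure.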

3. The proposed hybrid method


In the last section, the theory and background of the ARIMA model, ANNs and the wavelet transform
were discussed. Both ARIMA and ANN models have achieved notable success in the linear and nonlinear
domains, respectively (Niknam & Kavousifard, 2012; Niknam, Kavousi-Fard, & Baziar, 2012; Niknam,
Kavousi-Fard, & Seifi, 2012; Niknam, Kavousifard, Tabatabaei, & Aghae, 2011). However, neither of
them is a suitable tool for all circumstances. The
ARIMA model is a powerful predictor for linear time series, but its accuracy on
complicated nonlinear problems is not satisfactory. Conversely, although ANNs
perform well on nonlinear problems, they can give mixed results on
linear ones. It has also been shown that the performance of ANNs on linear regression
problems depends on the sample size as well as the noise level (Markham & Rakes, 1998). In
addition, for non-stationary samples with rapid variations, it is usually very hard for ANNs
to estimate the time series, so that large deviations may be seen in the estimation (Amjady, 2007).
In fact, different forecasters should be used according to the characteristics of the time series
investigated, and it is not sensible to use ANNs blindly in all circumstances. However,
investigating the different aspects of a time series may require complex and time-consuming
techniques. In this situation, hybrid methods based on both linear
and nonlinear models can be useful for capturing both the linear and the nonlinear aspects of the data.
Also, as explained in the last section, the wavelet transform helps to avoid overfitting,
so that suitable accuracy can be achieved.

Figure 2. The process of wavelet decomposition and reconstruction.

Statistically, a time series can be considered to consist of two main parts, (1) a linear
autocorrelation structure and (2) a nonlinear component, as follows:

$$y_t = L_t + N_t, \tag{9}$$

where $L_t$ and $N_t$ are the linear and nonlinear parts, respectively. The ARIMA model is used to
model the linear component $L_t$. Therefore, the residuals of the ARIMA model contain only the
nonlinear component $N_t$. The residual of the ARIMA model at time $t$ is calculated as follows:

$$e_t = y_t - \hat{L}_t, \tag{10}$$

where $e_t$ is the residual and $\hat{L}_t$ is the load value forecast by the ARIMA model. As mentioned in
Section 2.1, the residuals play a significant role in the ‘diagnostic checking’ step, which determines the
sufficiency of a linear model: a linear model whose residuals still contain linear correlation
structure is not adequate. Next, the DWT is used to decompose the residuals into their detail and
approximation parts. As shown in Equation (8), all detail components as well as the last
approximation are fed into ANNs for forecasting the next sample. Then, by the
reconstruction process, the forecast value of the nonlinear part ($\hat{N}_t$) is obtained. Finally, the
combined forecast value is calculated as follows:

$$\hat{y}_t = \hat{L}_t + \hat{N}_t, \tag{11}$$

where $\hat{y}_t$ is the forecast value of the load. In summary, the proposed hybrid method consists of three
main steps. In the first step, the linear part of the time series is extracted by the ARIMA model.
The process of building a suitable ARIMA model is as follows:
(1) Model identification: By the use of ACF and PACF, the stationary and non-stationary
behaviours of the time series are investigated. In the case of non-stationary behaviour, the
time series is differenced once or more to become stationary. The order of the ARIMA model
is also determined by the use of AIC approach.
(2) Parameter estimation: The model coefficients are calculated by the use of methods such as
least square error, maximum likelihood approach and so on.
(3) Diagnostic checking: Check the model error. If the model is valid then use it, else discard it
and repeat the above steps to find the appropriate model. This step can be implemented by
plotting residuals or by the use of other statistical diagnostic approaches.
(4) Forecasting: The ARIMA model is utilised to forecast the linear part of the time series.
In the second step, the DWT is used to decompose the residuals of the ARIMA model into their
details and approximations. The decomposition of the residuals is continued until the details
include a single signal.
In the third step, for each of the elements of Equation (8), a proper ANN is developed. The
summation of the output values of all ANNs would model the residuals of the ARIMA model. In
fact, the error terms of the ARIMA model are forecasted by the use of DWT and ANN. The
proposed hybrid method is shown in Figure 3.
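The three steps can be summarised in a short structural sketch. It is not the authors' implementation: `hybrid_forecast` only wires together three pluggable pieces, and the stand-ins used in the demonstration (a persistence model in place of ARIMA, an identity decomposition in place of the DWT and a last-value predictor in place of the ANNs) are hypothetical placeholders chosen so that the example runs without external libraries.

```python
def hybrid_forecast(history, linear_model, decompose, component_model):
    """One-step-ahead hybrid forecast following Equations (9)-(11).

    linear_model(history)   -> (fitted values, next linear forecast)
    decompose(residuals)    -> list of additive component series, as in Equation (8)
    component_model(series) -> next value of one component
    """
    fitted, linear_next = linear_model(history)
    residuals = [y - f for y, f in zip(history, fitted)]            # Equation (10)
    components = decompose(residuals)
    nonlinear_next = sum(component_model(c) for c in components)    # forecast of N_t
    return linear_next + nonlinear_next                             # Equation (11)


# Hypothetical stand-ins, for illustration only.
def persistence_model(history):
    """Stand-in for ARIMA: the fitted value at t is the observation at t - 1."""
    fitted = [history[0]] + list(history[:-1])
    return fitted, history[-1]


def identity_decompose(residuals):
    """Stand-in for the DWT: a single component equal to the residuals."""
    return [list(residuals)]


def last_value(series):
    """Stand-in for an ANN: predict the most recent value."""
    return series[-1]


forecast = hybrid_forecast(
    [10.0, 12.0, 11.0, 13.0], persistence_model, identity_decompose, last_value
)
```

Keeping the three pieces as separate callables mirrors the structure of the method: any of them can be replaced (for example, by a fitted ARIMA model or a wavelet decomposition) without changing how the forecasts are combined.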
Figure 3. The process of proposed hybrid method.

To compare the performance of the proposed method with that of other well-known
methods in the area, the criteria below are used (Al-Hamadi & Soliman, 2004; Huang & Shih,
2003). Here, $N_{es}$ is the number of data points to be predicted, $y_i$ is the actual value and
$\hat{y}_i$ is the forecast value.
• Relative percentage error:

$$\sigma_i\% = \frac{|\hat{y}_i - y_i|}{y_i} \times 100, \quad i = 1, 2, \dots, N_{es}. \tag{12}$$

• Mean absolute percentage error (MAPE):

$$\mathrm{MAPE}\% = \frac{1}{N_{es}} \sum_{i=1}^{N_{es}} \sigma_i. \tag{13}$$

• Mean absolute error (MAE):

$$\mathrm{MAE} = \frac{1}{N_{es}} \sum_{i=1}^{N_{es}} |\hat{y}_i - y_i|. \tag{14}$$

• Root mean square error (RMSE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{es}} \sum_{i=1}^{N_{es}} \sigma_i^2}. \tag{15}$$

• Maximum absolute relative percentage error (MARPE):

$$\mathrm{MARPE} = \max\left(100 \times \frac{|\hat{y}_i - y_i|}{y_i}\right), \quad i = 1, 2, \dots, N_{es}. \tag{16}$$
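The criteria in Equations (12)-(16) translate directly into code. The sketch below is a plain-Python illustration with hypothetical function names and toy numbers; note that, following Equation (15) as written, the RMSE here is taken over the relative percentage errors $\sigma_i$ rather than over the absolute errors.

```python
import math


def relative_percentage_errors(actual, forecast):
    """Equation (12): sigma_i in per cent for each forecast point."""
    return [100.0 * abs(f - y) / y for y, f in zip(actual, forecast)]


def mape(actual, forecast):
    """Equation (13): mean of the relative percentage errors."""
    sigma = relative_percentage_errors(actual, forecast)
    return sum(sigma) / len(sigma)


def mae(actual, forecast):
    """Equation (14): mean absolute error in the units of the data."""
    return sum(abs(f - y) for y, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Equation (15): root mean square of the relative percentage errors."""
    sigma = relative_percentage_errors(actual, forecast)
    return math.sqrt(sum(v * v for v in sigma) / len(sigma))


def marpe(actual, forecast):
    """Equation (16): largest relative percentage error."""
    return max(relative_percentage_errors(actual, forecast))


# Toy values (hypothetical, not taken from the case study).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
sigma = relative_percentage_errors(actual, forecast)
```

All four percentage-based criteria divide by the actual value $y_i$, so they assume strictly positive loads, which holds for peak-load data.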

4. Empirical simulation results


The proposed hybrid method is applied to the real load data of the Fars Electric Power Company,
Iran, in 2009 (Fars Electric Power Company: http://www.frec.co.ir). The data-set is divided into
training data and test data. The training data-set is used to develop the forecasting
models, whereas the test data-set is used to assess the forecasting accuracy. To compare the
performance of the proposed method with other well-known methods in the
area, the forecasting results of ARIMA, AR, ANN and support vector
regression (SVR) models are also reported. In the proposed model, the input training
data comprise the daily peak load of the entire year preceding the prediction day. However, the
input features are selected from the peak loads of the previous 8 weeks. Finally,
the forecasting process is implemented for the last 30 days of the load data.
In the first step, the ACF and PACF (correlogram testing) showed that the
empirical load time series is non-stationary. Therefore, first-order differencing was applied to
the time series to make it stationary. Then, as described in Section 3, the proper ARIMA model
is chosen (AIC test) and its parameters are estimated (least squares approach). Note
that the model should forecast the daily peak load of the last month (30 days). The
simulations are implemented in MATLAB 7.10.0 on a Pentium-IV 2.67-GHz
personal computer. First, the ARIMA model is applied to the load time series to forecast the
linear component. Then, the residuals of the ARIMA (10,8) model are extracted and the wavelet
transform is applied to them. In this study, the Daubechies wavelet family is used. Since Daubechies
wavelets are orthogonal and do not cause information loss in the frequency domain, they are
suitable choices for forecasting applications (Chen et al., 2010). The maximum level for
decomposing the residual signal is determined by the ‘wmaxlev’ function in
MATLAB. The signal of the ARIMA residuals is shown in Figure 4. The detail and
approximation components of the residuals for wavelet decomposition level 5 are shown in
Figure 5.

Figure 4. The signal of the ARIMA residuals (nonlinear part).

For modelling the output signals of the DWT, an MLP neural network with seven neurons in
the hidden layer is chosen experimentally. The actual values are compared with the
forecast values of the approximation (A5) and the details (D1, ..., D5) in Figures 6–11, and with
the forecast of the complete residual signal in Figure 12.
As shown in Figures 6–11, although the signal of the ARIMA residuals is highly nonlinear,
the DWT simplifies it by breaking it into its sub-frequencies. Therefore, the
ANN can model the detail and approximation components adequately, so that the accuracy
of the forecasting process improves notably. Note that omitting the wavelet step from the
proposed method results in low accuracy when the residuals are modelled by the ANN. As shown
in Figure 12, the sum of the forecast approximation (A5) and the forecast details (D1, ...,
D5) gives a suitable forecast of the residuals.
Finally, the actual load values are compared with the load values forecast by ARIMA
and by the proposed hybrid method in Figure 13. According to Figure 13, the proposed
hybrid method reduces the modelling error of ARIMA. The results show that
the residual signal of the ARIMA model carries valuable information about the load characteristics,
namely the nonlinear component. Neglecting the residuals in the forecasting model therefore
discards information and lowers the forecasting accuracy.
The results of the criteria defined in the last section are given in Tables 1 and 2. For a more
precise investigation, the forecasting results of some other well-known methods in the area are
also given in these tables. Table 1 gives the relative forecasting error ($\sigma$ %) of the
different methods. As Table 1 shows, in comparison with
the other forecasting methods, the proposed hybrid method reduces the relative percentage
error on most days of the forecasting month, although not on every day (for example, on day 2
the ARIMA error of 0.089% is below the hybrid error of 0.7741%). The error of the hybrid method
stays below 1% on all 30 days.

Figure 5. The details and approximations of the residuals for ARIMA model.

Figure 6. Comparison of the actual and forecasting values of A5 component of the ARIMA residuals with MLP.

According to Table 2, the value of MAPE criterion has reduced from 2.3001 by ARIMA to
0.4004 by the proposed hybrid model which is a worthy value. This shows that the nonlinear part
of the load time series consists of significant factors with meaningful frequencies. In fact, when
it is hard to detect the nonlinear relationship between the input and output of the residual signal,
the proposed hybrid method has detected this sophisticated relationship sufficiently. It is worth
noting that the authors have tried to model the original signal of the ARIMA residuals just by
ANN (without wavelet decomposition), which was not in any way useful. In fact, as the result of
the high nonlinearity and complexity of the residual signals, the ANN alone could not model the
residuals signal suitably, so the forecasting results would just be deteriorated. From the second
column of Table 2 (MARPE criterion), it can be seen that the proposed hybrid method has
reduced the maximum relative error to the proper value of 0.9446. Similar results can be
deduced from RMSE and MAE criteria. The simulation results show that combining linear and

Figure 7. Comparison of the actual and forecasting values of D1 component of the ARIMA residuals with
MLP.
178 A.K. Fard and M.-R. Akbari-Zadeh

Figure 8. Comparison of the actual and forecasting values of D2 component of the ARIMA residuals with
MLP.

Figure 9. Comparison of the actual and forecasting values of D3 component of the ARIMA residuals with
MLP.

Figure 10. Comparison of the actual and forecasting values of D4 component of the ARIMA residuals with
MLP.
Journal of Experimental & Theoretical Artificial Intelligence 179

Figure 11. Comparison of the actual and forecast values of the D5 component of the ARIMA residuals with
MLP.

Figure 12. Comparison of the actual and forecast values of the ARIMA residuals with wavelet–ANN.

Figure 13. Comparison of the actual load values with the values forecast by ARIMA and the proposed
hybrid method.

Table 1. Comparison of relative percentage error for different methods.

Day   ARIMA model   AR model   ANN      SVR      The proposed
      si (%)        si (%)     si (%)   si (%)   hybrid method
1     2.233         3.231      0.879    1.992    0.3324
2     0.089         0.816      1.362    1.200    0.7741
3     1.850         1.444      2.425    0.805    0.5846
4     4.220         1.471      2.100    1.427    0.3474
5     2.430         6.855      1.702    0.922    0.3779
6     2.110         2.511      2.614    0.898    0.5702
7     3.580         0.624      2.726    0.858    0.3901
8     0.743         1.150      7.825    0.125    0.9446
9     6.854         2.209      3.044    8.336    0.0565
10    2.189         4.743      0.079    6.079    0.6506
11    1.998         1.495      1.242    1.369    0.0303
12    1.053         1.599      6.892    1.472    0.3628
13    6.410         3.414      0.162    1.732    0.9369
14    0.493         3.044      0.541    1.266    0.4529
15    2.397         0.820      2.849    3.036    0.1419
16    3.918         3.736      3.726    2.165    0.3166
17    5.146         3.615      1.003    0.379    0.0162
18    0.900         1.307      0.183    0.507    0.7715
19    0.355         1.268      0.237    3.170    0.1693
20    1.016         0.303      1.264    1.183    0.0644
21    0.755         1.557      0.854    2.803    0.2136
22    3.322         3.681      3.283    1.156    0.1345
23    3.774         3.3302     3.068    3.685    0.1804
24    3.943         4.6541     1.947    0.368    0.6462
25    0.989         1.8677     1.230    0.518    0.9006
26    1.112         2.4892     1.862    0.879    0.0520
27    1.485         1.5049     0.289    0.117    0.1647
28    1.123         4.7663     0.600    0.818    0.6944
29    2.789         4.0667     1.881    0.229    0.1785
30    1.789         1.4394     0.828    4.644    0.4041

nonlinear forecasting models (ARIMA and ANN) together with the DWT can improve the
overall accuracy. Consequently, the proposed hybrid method gives more dependable results
than the other methods, such as the ARIMA and ANN models.
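
The four criteria reported in Table 2 can be sketched in a few lines of Python. The definitions of the criteria are not restated in this section, so the formulas below are the commonly used ones and should be read as an assumption; the function names and the three-point toy series are hypothetical and are not taken from the paper's data.

```python
import math


def forecast_errors(actual, forecast):
    """Return MAPE (%), MARPE (%), RMSE and MAE for two paired series.

    Assumed definitions (not restated in this section of the paper):
    MAPE  = mean of the absolute relative percentage errors,
    MARPE = maximum of the absolute relative percentage errors,
    RMSE  = root of the mean squared error,
    MAE   = mean of the absolute errors.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    abs_err = [abs(a - f) for a, f in zip(actual, forecast)]
    rel_pct = [100.0 * e / abs(a) for e, a in zip(abs_err, actual)]
    n = len(actual)
    return {
        "MAPE": sum(rel_pct) / n,
        "MARPE": max(rel_pct),
        "RMSE": math.sqrt(sum(e * e for e in abs_err) / n),
        "MAE": sum(abs_err) / n,
    }


def relative_reduction(before, after):
    """Percentage by which an error criterion shrinks from `before` to `after`."""
    return 100.0 * (before - after) / before


# Made-up load values, used only to exercise the functions.
toy_actual = [100.0, 200.0, 400.0]
toy_forecast = [110.0, 190.0, 400.0]
print(forecast_errors(toy_actual, toy_forecast))

# MAPE values quoted from Table 2: ARIMA versus the proposed hybrid method.
print(relative_reduction(2.3001, 0.4004))
```

With the MAPE values of Table 2, the second call gives a reduction of roughly 82.6% relative to ARIMA.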

5. Conclusion
This paper proposed a new hybrid method for load forecasting based on the combination of the
ARIMA model, DWT and ANN. The choice of forecasting model determines how accurate and
dependable the forecast results are. In the proposed method, the ARIMA and

Table 2. Forecasting results for different methods.

Method                        MAPE (%)   MARPE    RMSE     MAE
ARIMA model                   2.3001     6.8549   2.8731   34.0608
AR model                      2.5008     6.8558   2.9247   37.7258
ANN                           1.9569     7.8251   2.6396   28.8032
SVR                           1.8051     8.3365   2.5667   26.1718
The proposed hybrid method    0.4004     0.9446   0.4889   6.0361

ANN models are utilised together to capture the linear and nonlinear components of the load
time series, respectively. In fact, the proposed hybrid method is a correction method in which the
historical load data are corrected to generate new load data suitable for training the ANN.
To this end, the DWT decomposes the residual signal of the ARIMA model into its detail and
approximation sub-parts so that each of them can be modelled by an appropriate ANN. To train
the ANNs, the DWT, the back-propagation algorithm and the on-line learning approach are
used together. The proposed hybrid method was applied to the 2009 empirical load data of
Fars Electrical Power Company, Iran. According to the simulation results, the proposed hybrid
method outperformed other well-known forecasting models such as AR, ANN and SVR. The
proposed hybrid method reduced the load forecast error of both the ARIMA and ANN models,
and the values of all forecasting criteria improved.
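
The decomposition step summarised above relies on the sub-series (A5 and D1 to D5 in Figures 6 to 11) adding back up to the residual signal, so that separately forecast components can be summed and then added to the ARIMA forecast. The following is a minimal sketch of that additive property only. It assumes a Haar multiresolution (block means) purely for simplicity; the mother wavelet used by the authors is not stated in this section, no ANN is trained here, and all function names and the toy residual series are hypothetical.

```python
import math


def block_mean(signal, level):
    """Haar approximation at `level`: replace each sample by the mean of
    its dyadic block of 2**level consecutive samples."""
    size = 2 ** level
    out = []
    for start in range(0, len(signal), size):
        block = signal[start:start + size]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def haar_components(signal, levels=5):
    """Split `signal` into full-length sub-series D1..D<levels> and A<levels>.

    D_j is the difference between the approximations at levels j-1 and j
    (level 0 being the signal itself), so the parts sum back to the signal.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2**levels")
    components = {}
    previous = list(signal)
    for j in range(1, levels + 1):
        approx = block_mean(signal, j)
        components[f"D{j}"] = [p - a for p, a in zip(previous, approx)]
        previous = approx
    components[f"A{levels}"] = previous
    return components


def recombine(components):
    """Sum the sub-series sample by sample (used on component forecasts too)."""
    return [sum(values) for values in zip(*components.values())]


# Made-up residual series standing in for the ARIMA residuals.
toy_residuals = [math.sin(0.3 * k) + 0.05 * k for k in range(64)]
parts = haar_components(toy_residuals, levels=5)
rebuilt = recombine(parts)
max_gap = max(abs(r - x) for r, x in zip(rebuilt, toy_residuals))
print(sorted(parts), max_gap)
```

In a full pipeline under these assumptions, each entry of `parts` would be forecast by its own network, `recombine` would sum those forecasts into a residual forecast, and the result would be added to the ARIMA forecast of the load.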

References
Al-Hamadi, H. M., & Soliman, S. A. (2004). Short-term electric load forecasting based on Kalman filtering
algorithm with moving window weather and load model. Journal of Electric Power Systems
Research, 68, 47–59.
Amjady, N. (2007). Short-term bus load forecasting of power systems by a new hybrid method. IEEE
Transactions on Power Systems, 22, 331–341.
Box, G. E. P., & Jenkins, G. (1970). Time series analysis, forecasting and control. San Francisco, CA:
Holden-Day.
Bunn, D. W., & Farmer, E. D. (1985). Comparative models for electrical load forecasting. New York:
Wiley.
Chen, Y., Luh, P. B., & Rourke, S. J. (2010). Short term load forecasting: Similar day based wavelet neural
network. IEEE Transactions on Power Systems, 25, 322–330.
Fan, S., Chen, L., & Lee, W. J. (2009). Short-term load forecasting using comprehensive combination
based on multimeteorological information. IEEE Transactions on Industry Applications, 45,
1460–1466.
Ferreira, V., & Alves da Silva, A. (2007). Toward estimating autonomous neural network-based electric
load forecasters. IEEE Transactions on Power Systems, 22, 1554–1562.
Hirotugu, A. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic
Control, 19, 716–723.
Hornik, K., Stinchcombe, M., & White, H. (1990). Using multi-layer feedforward networks for universal
approximation. Neural Networks, 3, 551–560.
Huang, S.-J., & Shih, K.-R. (2003). Short-term load forecasting via ARIMA model identification including
non-Gaussian process considerations. IEEE Transactions on Power Systems, 18, 673–679.
Kavousi-Fard, A., & Akbari-Zadeh, M. R. (2013). Reliability enhancement using optimal distribution
feeder reconfiguration. Journal of Neurocomputing, 106, 1–11.
Kavousifard, A., & Samet, H. (2011). Consideration effect of uncertainty in power system reliability
indices using radial basis function network and fuzzy logic theory. Journal of Neurocomputing, 74,
3420–3427.
Khosravi, A., Nahavandi, S., & Creighton, D. (2010). Construction of optimal prediction intervals for load
forecasting problems. IEEE Transactions on Power Systems, 25, 1493–1503.
Kim, K.-H., Park, J.-K., Hwang, K.-J., & Kim, S.-H. (1995). Implementation of hybrid short-term load
forecasting system using artificial neural networks and fuzzy expert systems. IEEE Transactions on
Power Systems, 10, 1534–1539.
Kwok, T.-Y., & Yeung, D.-Y. (1997). Constructive algorithms for structure learning in feedforward neural
networks for regression problems. IEEE Transactions on Neural Networks, 8, 630–645.
Markham, I. S., & Rakes, T. R. (1998). The effect of sample size and variability of data on the comparative
performance of artificial neural networks and regression. Computers & Operations Research, 25,
251–263.
Niknam, T., & Kavousifard, A. (2012). Impact of thermal recovery and hydrogen production of fuel cell
power plants on distribution feeder reconfiguration. IET Generation Transmission and Distribution,
6, 831–843.
Niknam, T., Kavousifard, A., & Aghaei, J. (2012). Scenario-based multiobjective distribution feeder
reconfiguration considering wind power using adaptive modified particle swarm optimization. IET
Renewable Power Generation, 6, 236–247.
Niknam, T., Kavousi-Fard, A., & Baziar, A. (2012). Multi-objective stochastic distribution feeder
reconfiguration problem considering hydrogen and thermal energy production by fuel cell power
plants. Energy, 42, 563–573.
Niknam, T., Kavousi-Fard, A., & Seifi, A. (2012). Distribution feeder reconfiguration considering fuel cell/
wind/photovoltaic power plants. Journal of Renewable Energy, 37, 213–225.
Niknam, T., Kavousifard, A., Tabatabaei, S., & Aghaei, J. (2011). Optimal operation management of fuel
cell/wind/photovoltaic power sources connected to distribution networks. Journal of Power Sources,
196, 8881–8896.
Pandey, A. S., Singh, D., & Sinha, S. K. (2010). Intelligent hybrid wavelet models for short-term load
forecasting. IEEE Transactions on Power Systems, 25, 1266–1273.
Papalexopoulos, A., & Hesterberg, T. (1990). A regression-based approach to short-term system load
forecasting. IEEE Transactions on Power Systems, 5, 1535–1547.
Senjyu, T., Sakihara, H., Tamaki, Y., & Uezato, K. (2000). Next day peak load forecasting using neural
network with adaptive learning algorithm based on similarity. Journal of Electric Machines and
Power Systems, 28, 613–624.
Wu, H., & Lu, C. (2003). A data mining approach for spatial modeling in small area load forecast. IEEE
Transactions on Power Systems, 17, 516–521.
Yao, S. J., Song, Y. H., Zhang, L. Z., & Cheng, X. Y. (2000). Wavelet transform and neural networks for
short-term electrical load forecasting. Energy Conversion and Management, 41, 1975–1988.
Zhang, G., Patuwo, E. B., & Hu, M. Y. (1998). Forecasting with artificial neural networks: The state of the
art. International Journal of Forecasting, 14, 35–62.
