Deep-Learning-Based Wind Speed Forecasting Considering Spatial-Temporal Correlations

Journal of Coastal Research SI 93 623–632 Coconut Creek, Florida 2019
Deep-learning-based Wind Speed Forecasting Considering Spatial–temporal Correlations with Adjacent Wind Turbines
Xiaoyu Shi, Shengzhi Huang*, Qiang Huang, Xuewen Lei, Jiangfeng Li, Pei Li, and Mingyang Yang
State Key Laboratory of Eco-hydraulics in Northwest Arid Region
Xi’an University of Technology
Xi’an 710048, China
ABSTRACT
Shi, X.; Huang, S.; Huang, Q.; Lei, X.; Li, J.; Li, P., and Yang, M., 2019. Deep-learning-based wind
speed forecasting considering spatial–temporal correlations with the adjacent wind turbines. In: Guido-
Aldana, P.A. and Mulahasan, S. (eds.), Advances in Water Resources and Exploration. Journal of Coastal
Research, Special Issue No. 93, pp. 623-632. Coconut Creek (Florida), ISSN 0749-0208.
The accurate prediction of wind speed, which greatly influences the secure and efficient application of wind energy, remains an important issue and a major challenge. Previous research has largely focused on advanced algorithms, often ignoring the contribution that expanding the set of predictors can make to wind speed prediction. To improve forecasting accuracy, this study proposes a provisional wind speed forecasting model based on spatial–temporal correlation (SC) theory, in which the target and adjacent wind turbines, as well as the related time-lag characteristics, are examined through wavelet coherence transformation (WCT) analysis. Prior to that, continuous wavelet transforms (CWT) are used to detect the spatial–temporal correlations with adjacent wind turbines. Based on the CWT results, the adjacent wind turbines that are strongly correlated with the target wind turbine are adopted as important input factors of the forecasting model. Moreover, the study focuses on long short-term memory (LSTM), a typical deep learning model from the family of deep neural networks, and compares its forecast accuracy with that of traditional methods with a proven track record in wind speed forecasting. The wind speed series used to test these models are taken from a wind farm at Buckley City, Washington State, USA. The results on the testing set reveal that (1) the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) of the proposed model (SC-LSTM) are 0.49 m/s, 0.28 m/s, and 2.57%, respectively, which are much lower than those of the conventional Back Propagation (BP) model, Extreme Learning Machines (ELM) model, and Support Vector Machine (SVM) model; (2) the proposed model, which considers spatial–temporal correlations with adjacent wind turbines based on the WCT, can obtain reliable and excellent prediction results, providing an excellent hybrid model for wind speed forecasts.
ADDITIONAL INDEX WORDS: Wind speed prediction, spatial-temporal correlation, wavelet coherence
transformation analysis, long short term memory.
DOI: 10.2112/SI93-084.1 received 27 September 2018; accepted in revision 6 April 2019.
*Corresponding author: huangshengzhi7788@126.com
© Coastal Education and Research Foundation, Inc. 2019

INTRODUCTION
With the evolution of global energy patterns and the adjustment of national energy strategies, renewable energy sources have been undergoing key development in countries all over the world (Ak et al., 2018; Jiang and Huang, 2017). However, because wind speed is random, volatile, and intermittent, and its data series are irregular and changeable, it is regarded as the hardest weather factor to predict (Boehme, Wallace, and Harrison, 2007; Guo, Gao, and Wu, 2017). Enriching the input factors with predictors that are highly correlated in space and time, and combining them with excellent algorithms, can make accurate predictions possible and thus help guarantee the regular operation of power networks (Meng, Huang, and Huang, 2018).
Despite the difficulty, many researchers have devoted their efforts to developing new models. Wind speed forecasting (WSF) can be divided by forecast horizon into shorter-term, short-term, medium-term, and long-term forecasting, and by approach into four types: the statistical method, the machine learning method, the physical method, and the bootstrap method (Ma et al., 2015; Qu et al., 2016; Tascikaraoglu and Uzunoglu, 2014). The first two are suited to shorter-term and short-term forecasting (Huang et al., 2014a). Stochastic methods have also been proven to perform well, especially in long-term predictions (Fang et al., 2018; Fang et al., 2019a; Koutsoyiannis et al., 2018). In such approaches, the observation at the next time point is estimated to be close to the current value. Additionally, numerical weather prediction (NWP) models play a highly important role in physical approaches to wind speed forecasting. The bootstrap methods (also called analogue, deterministic, or chaotic) depend entirely on the available data, without using any model parameters (Dimitriadis, Koutsoyiannis, and Tzouka, 2016).
The statistical method mainly depends on a large amount of previous data and constructs nonlinear relations among the different explanatory variables. These methods typically include the moving-average model (MA), auto-regressive integrated moving-average model (ARIMA), quantile-regression model (QR), Kalman-filter model, and stochastic models (Fang et al., 2018). Besides the ARIMA-type statistical methods, stochastic models have recently been proposed for the surface wind speed (Koutsoyiannis et al., 2018). Unlike statistical methods, physical methods mainly focus on depicting the physical process of wind speed. Based on physical models, wind speed can be predicted through numerical weather prediction (NWP), which in turn relies on multiple related parameters. However, the majority of wind farms are located in sparsely populated regions, so the required variety of data cannot be easily obtained (Wang et al., 2017).
Statistical and machine learning methods, which extract patterns from historical data, can provide more precise results for short-term wind speed forecasting than the physical method. Owing to their ability to fit high-dimensional nonlinear relations, machine learning methods now have extensive applications. However, when dealing with high-dimensional, complex, and nonlinear problems, shallow neural networks (SNN) exhibit a series of disadvantages, such as slow convergence, over-fitting, and a tendency to fall into local optima (Wang et al., 2014; Wang et al., 2016). Deep neural networks (DNN) mainly include the convolutional neural network (CNN), deep belief network (DBN), and recurrent neural network (RNN) (Wu et al., 2016; Wu, Yin, and Liu, 2017). Owing to its internal structure, a DNN can depict complex nonlinear relations. Nevertheless, deep learning methods are not yet widely used in wind forecasting. RNN was at first introduced for dynamic object recognition. Unlike traditional shallow neural network models, deep learning models can depict the inherent characteristics of the data (Deepak, Zhang, and Huang, 2017). According to these studies, deep learning methods perform better. A long short-term memory (LSTM) network, derived from RNN, can overcome the vanishing and exploding gradients that arise during model training (Byung-Hwa, Se-Young, and Ig-Jae, 2017). Therefore, it can make efficient use of the characteristic information of the training data during forecasting.
Apart from the prediction algorithm, forecasting accuracy also depends on the input variables (Kinsela et al., 2016; Mi et al., 2015; Yu et al., 2019). To approach the actual situation, the amount of data that must be searched is of unparalleled scale, and in WSF information is collected and decisions are made on the basis of data mining. Generally, data correlation analysis, which mainly includes Pearson’s correlation analysis and regression analysis, has been widely used in WSF (James and John, 1999; Jiang, 1999). In this study, wavelet coherence transformation analysis (WCT) is introduced to depict the changing characteristics and coupled oscillations between the target and adjacent wind turbines (Peng et al., 2018). By taking the spatial correlation of wind speed and the related time-lag properties into account, the wind speed of the target wind turbine was accurately predicted using the deep-learning-based method.
The structure of this paper is as follows. Study Area and Data describes the study region, which includes five wind turbines; the methods and materials are shown in Method and Materials; the structure of the proposed model is given in The Hybrid SC-LSTM Model. Results provides the experimental results; and finally, Discussions and Conclusions concludes this paper.

STUDY AREA AND DATA
Wind speed data provided by the U.S. National Renewable Energy Laboratory were used in this study. As can be seen in Figure 1, the wind farm is located at Buckley City, Washington State, USA. The wind farm includes five wind turbines, of which the No. 3 wind turbine is set as the target wind turbine and the rest (No. 1, No. 2, No. 4, and No. 5) are set as the adjacent wind turbines in this study. The data were collected from 6th November 2010 to 13th December 2010 at a sampling interval of 5 minutes, giving 10656 observations in total. Specifically, the 1st-8640th (30 days), 8641st-9216th (2 days), and 9217th-10656th (5 days) observations were adopted as the training, validation, and testing series, respectively.

Figure 1. Overview of the wind farm and the location of target and adjacent wind turbines.

METHOD AND MATERIALS
Wavelet Coherence Transformation Analysis (WCT)
Wavelet coherence transformation analysis is a new method which combines the wavelet power spectrum and cross-spectrum (Guo et al., 2019a, b; Han et al., 2019). The WCT can effectively analyze the degree of correlation between two non-stationary time series and reveal the phase relationship between them in a specific time scale and frequency domain (Huang et al., 2014a; Liu, Huang, and Xie, 2019; Liu et al., 2019). Hence, it can reveal the cross-correlation between time series more fully and deeply than traditional Pearson’s correlation analysis and regression analysis (Huang et al., 2017; Huang, Wang, and Huang, 2019). In this study, the comprehensive correlations between the wind speed series of the target and adjacent wind turbines in the temporal and spatial dimensions were fully examined based on the WCT analysis. WCT combines wavelet transform theory with cross-spectrum analysis, which may effectively depict the typical characteristics of wind speed series in both the frequency and time domains (Fang et al., 2019b, c; Wen, Fang, and Zhang, 2013). Hudgins, Friehe, and Mayer (1993) proposed the wavelet coherence transformation (WCT) analysis for searching the relations between two associated time series; here it also serves as a novel method for the analysis of spatial–temporal correlations between the target and adjacent wind turbines.
For two time series $x_n$ and $y_n$, the cross-wavelet transform is expressed as $W^{xy} = W^x W^{y*}$, where $*$ denotes complex conjugation, and $|W^{xy}|$ represents the cross-wavelet power spectrum. The phase angle of $W^{xy}$ describes the local phase relation between $x_n$ and $y_n$ within the cone of influence. The theoretical distribution of the cross-wavelet power spectrum is expressed as follows (Huang et al., 2016; Wen, Fang, and Zhang, 2013):

$$D\left(\frac{W_n^x(s)\,W_n^{y*}(s)}{\sigma_x \sigma_y}\right) = \frac{Z_v(P)}{v}\sqrt{P_k^x P_k^y} \qquad (1)$$
where $P_k^x$ and $P_k^y$ denote the power spectra of the wind speed series; $W_n^{y*}(s)$ denotes the complex conjugate of $W_n^y(s)$; $s$ denotes the time lag; $\sigma_x$ and $\sigma_y$ are the standard deviations of $x_n$ and $y_n$, respectively; and $Z_v(P)$ is the confidence level associated with the probability $P$ (Wen, Fang, and Zhang, 2013).

Long Short-Term Memory Network (LSTM)
The wind speed of the target wind turbine can be effectively predicted by extracting the characteristic information of adjacent wind turbines, based on mapping the linkages between the target and adjacent wind turbines. Expanding the set of predictors is useful for improving the precision and stability of forecast results. In recent years, LSTM has gradually been applied in forecasting research because of its powerful computing capability and its ability to extract information from data. In this study, LSTM is introduced to forecast the wind speed of the target wind turbine on the basis of the adjacent wind turbines.
LSTM is a type of DNN algorithm, which includes an input layer, hidden layer, and output layer (Sepp and Jürgen, 1997). The special structure of its neurons successfully solves the gradient problem (vanishing and exploding gradients) that typically affects the traditional recurrent neural network during model training. Moreover, its special structure forms three gates and a memory cell to remember information in the hidden neuron (Shi et al., 2018).
The bi-directional recurrent neural network (BRNN) derives from the directional neural network structure (forward neural networks and backward neural networks). Hence, the output that is achieved depends on the hidden layers. Figure 2 shows the architecture of LSTM and presents the special hidden unit of LSTM.
As described above, the special structure of LSTM neurons addresses this gradient problem. The forget gate resets the memory cell, the input gate controls how the input modifies the memory cell state, and the output gate permits or blocks the memory cell state from influencing other neurons. The input, forget, and output gates are denoted by $i$, $f$, and $o$, respectively; the memory cell state and the new (candidate) memory cell state are denoted by $c$ and $\tilde{c}$, respectively. The equations are as follows:

$$i_t = \sigma_g\left(W_i * x_t + U_i * s_{t-1} + V_i \circ c_{t-1} + b_i\right) \qquad (2)$$

$$f_t = \sigma_g\left(W_f * x_t + U_f * s_{t-1} + V_f \circ c_{t-1} + b_f\right) \qquad (3)$$

$$o_t = \sigma_g\left(W_o * x_t + U_o * s_{t-1} + V_o \circ c_{t-1} + b_o\right) \qquad (4)$$

$$c_t = f_t \circ c_{t-1} + i_t \circ \tilde{c}_t \qquad (5)$$

$$\tilde{c}_t = \sigma_c\left(W_c * x_t + U_c * s_{t-1} + b_c\right) \qquad (6)$$

$$s_t = o_t \circ \sigma_h(c_t) \qquad (7)$$

where $x_t$ is the input vector; $s_t$ is the output of the hidden unit; $W$, $U$, $V$, and $b$ are the weights and biases; $\circ$ denotes the element-wise product of vectors; $\sigma_g$ is the sigmoid function; and $\sigma_h$ and $\sigma_c$ are hyperbolic tangent functions.
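As a minimal sketch of Equations (2)-(7), a single LSTM time step can be written in plain Python as below. The function names (`lstm_step`, `zero_params`), the parameter dictionary layout, and the treatment of the peephole weights V as vectors applied element-wise are assumptions made for illustration; the study itself does not give an implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lstm_step(x_t, s_prev, c_prev, p):
    """One LSTM time step following Equations (2)-(7).

    p uses a hypothetical layout: p["W"][g], p["U"][g], p["b"][g] for g in "ifoc",
    and peephole vectors p["V"][g] for g in "ifo".
    """
    def pre(g):
        # W_g * x_t + U_g * s_{t-1} + b_g
        wx, us = matvec(p["W"][g], x_t), matvec(p["U"][g], s_prev)
        return [a + b + c for a, b, c in zip(wx, us, p["b"][g])]

    def gate(g):
        # Sigmoid gate with the peephole term V_g o c_{t-1}
        return [sigmoid(z + v * c) for z, v, c in zip(pre(g), p["V"][g], c_prev)]

    i_t, f_t, o_t = gate("i"), gate("f"), gate("o")            # Eqs. (2)-(4)
    c_tilde = [math.tanh(z) for z in pre("c")]                 # Eq. (6)
    c_t = [f * c + i * ct
           for f, c, i, ct in zip(f_t, c_prev, i_t, c_tilde)]  # Eq. (5)
    s_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]         # Eq. (7)
    return s_t, c_t

def zero_params(n_in, n_hidden):
    """All-zero parameters, convenient for checking the arithmetic by hand."""
    def zeros(rows, cols):
        return [[0.0] * cols for _ in range(rows)]
    return {
        "W": {g: zeros(n_hidden, n_in) for g in "ifoc"},
        "U": {g: zeros(n_hidden, n_hidden) for g in "ifoc"},
        "b": {g: [0.0] * n_hidden for g in "ifoc"},
        "V": {g: [0.0] * n_hidden for g in "ifo"},
    }

# Four inputs (e.g. one value from each of four turbines) and two hidden units.
params = zero_params(n_in=4, n_hidden=2)
s_t, c_t = lstm_step([1.0, 2.0, 3.0, 4.0], [0.0, 0.0], [1.0, 1.0], params)
```

With all parameters set to zero, every gate equals 0.5 and the candidate state is 0, so the cell state simply halves; this makes the gating arithmetic easy to verify before real weights are trained.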
Evaluation Metrics
In order to evaluate the SC-LSTM model, the correlation coefficient R, mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) are selected as the evaluation metrics, as follows (Huang et al., 2014b; Ren et al., 2019):

$$R\left(V^{obs}, V^{pre}\right) = \frac{\mathrm{Cov}\left(V^{obs}, V^{pre}\right)}{\sqrt{\mathrm{Var}\left[V^{obs}\right]\mathrm{Var}\left[V^{pre}\right]}} \qquad (8)$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N}\left|V_t^{obs} - V_t^{pre}\right| \qquad (9)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(V_t^{obs} - V_t^{pre}\right)^2} \qquad (10)$$

$$\mathrm{MAPE} = \frac{1}{N}\sum_{t=1}^{N}\left|\frac{V_t^{obs} - V_t^{pre}}{V_t^{obs}}\right| \qquad (11)$$

where $V_t^{obs}$ is the observed value, $V_t^{pre}$ is the predicted value, and $N$ denotes the number of data points.
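The four metrics in Equations (8)-(11) can be computed as in the following sketch. The function name `evaluate` and the toy series are hypothetical. MAPE is multiplied by 100 here on the assumption that it is reported as a percentage, as in the result tables, and the observed values are assumed to be non-zero.

```python
import math

def evaluate(obs, pre):
    """Return R, MAE, RMSE and MAPE (in percent) for paired observed and predicted series."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pre) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pre)) / n
    var_o = sum((o - mean_o) ** 2 for o in obs) / n
    var_p = sum((p - mean_p) ** 2 for p in pre) / n
    r = cov / math.sqrt(var_o * var_p)                                      # Eq. (8)
    mae = sum(abs(o - p) for o, p in zip(obs, pre)) / n                     # Eq. (9)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pre)) / n)       # Eq. (10)
    mape = 100.0 * sum(abs(o - p) / abs(o) for o, p in zip(obs, pre)) / n   # Eq. (11), in percent
    return {"R": r, "MAE": mae, "RMSE": rmse, "MAPE": mape}

# Toy wind speeds in m/s (illustrative values, not data from the study).
metrics = evaluate([5.0, 10.0, 8.0, 4.0], [6.0, 9.0, 8.0, 5.0])
```

Because MAPE divides by the observed value, calm periods with observed wind speeds at or near zero would need separate handling before this function is applied to real data.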
Figure 2. Illustration of the architectures of LSTM and its special neurons (panels: LSTM neural network architecture; neuron unit structure of LSTM).

THE HYBRID SC-LSTM MODEL
Deep learning and parallel computing technology have been widely used in recent years. Because of its special network structure, the LSTM neural network was introduced in wind speed prediction. The LSTM neural network model for short-term wind speed prediction has been applied in this study. Moreover, the spatial–temporal correlation with the adjacent wind turbines appears relatively weak under the traditional correlation analysis used previously, mainly because such analysis can only reflect the instantaneous correlation of sequences.
In this study, the deep-learning-based model considering the spatial–temporal correlation can be written as:

$$v_{obj}^{t} = f\left(V_{obj}^{t-k}, V_1^{t-k}, V_2^{t-k}, \ldots, V_i^{t-k}\right) \qquad (12)$$

where $v_{obj}^{t}$ denotes the predicted data of the target wind turbine at moment $t$; $f(\cdot)$ represents the complex mapping relation between the historical data of the target and adjacent wind turbines and the prediction, which determines the prediction precision of the validation periods; $V_i^{t-k}$ represents the historical data at moment $t-k$ of the $i$-th adjacent wind turbine ($i \le m$, where $m$ is the number of wind turbines in this study); and $k$ denotes the prediction horizon ($k=1$). Selection of the historical data of the $i$-th adjacent wind turbine as the hybrid model input is based on WCT.
In view of the strong uncertainty and noise in the historical wind speed data, it is difficult to directly determine $f$ in Equation (12). Hence, according to the mechanism of the mathematical model mentioned above, the LSTM hybrid forecasting model is proposed in this study, which combines WCT and deep learning. The flowchart of the SC-LSTM model is depicted in Figure 3. The hybrid structure of the SC-LSTM model contains four steps:
Step 1: Analyze the spatial correlation between the target wind turbine and the adjacent wind turbines using wavelet coherence transformation analysis (WCT), and select the adjacent turbines with a significant correlation and time-lag characteristics as the samples;
Step 2: Collect the sample set and divide it into training, validation, and testing sets;
Step 3: Optimize the parameters in the LSTM model through training, following the detailed process described in Long Short-Term Memory Network (LSTM);
Step 4: Substitute the samples of the test set into the well-trained LSTM model and output the prediction results.

Figure 3. The flowchart of the proposed model. [Flowchart: the candidate series X1(t), X2(t), ..., Xn(t) are screened by CWT into X1'(t), X2'(t), ..., Xm'(t) and passed to the LSTM. The samples in the training, validation and testing sets are input; the model parameters, number of hidden layers and iterations are determined; the network is trained and the model parameters are updated until the number of iterations is reached or the condition is satisfied; the samples in the testing sets are then input, and the prediction results are output and evaluated.]

RESULTS
The Spatial-temporal Correlations between the Target and Adjacent Wind Turbines
The wavelet coherence transformation analysis is employed for identifying the spatial–temporal correlations between the target and adjacent wind turbines on the wind farm, and the cross wavelet transforms between the No. 3 series and the No. 1, No. 2, No. 4, and No. 5 wind turbine series are displayed in Figure 4(a), Figure 4(b), Figure 4(c), and Figure 4(d), respectively. The 5% significance level against red noise is shown as a thick contour, and the arrows represent the relative phase relationship.
From Figure 4(a), it can be seen that the No. 1 wind turbine shows a strong spatial correlation with the No. 3 target wind turbine, indicating that the historical data of the No. 1 wind turbine can be used as input for forecasting the wind speed of the target wind turbine.

Figure 4. The spatial-temporal correlations between target and adjacent wind turbines: (a) No. 3 and No. 1; (b) No. 3 and No. 2; (c) No. 3 and No. 4; (d) No. 3 and No. 5. [Four panels; horizontal axis: Time; vertical axis: Period / 5 min.]
Specifically, No. 3 and No. 1 were significantly spatially correlated on a time scale of 64-128 (5 min) and exhibited identical phase changes, and the spectra passed the significance test, suggesting that these two wind turbines had stable periods of approximately 6-12 hours. Figure 4(c) shows that the spectra of No. 3 and No. 4 were significantly correlated on a time scale of 32-80 (5 min), passed the significance test, and had stable periods of approximately 3-7 hours. The results of No. 5 are similar to those of No. 4. Moreover, the high-energy regions of the spectra of No. 2 and No. 3 did not pass the significance test and exhibited a weak correlation. Hence, the historical data of No. 1, No. 4, and No. 5 were input to train the prediction model established in The Hybrid SC-LSTM Model.

Input and Output
The data were obtained from 6th November 2010 to 13th December 2010 at a sampling interval of 5 minutes, giving 10656 observations in total. Specifically, the 1st-8640th (30 days), 8641st-9216th (2 days), and 9217th-10656th (5 days) observations were classified as the training, validation, and testing series, respectively.
Historical data at a time lag of more than four hours had only a slight effect on the prediction of wind speed and were very likely to degrade the prediction precision. Therefore, the wind data of the adjacent and target wind turbines collected in the preceding four hours (with a time step of 5 minutes, 48 sets of data in total) were input to the established model, while the wind speed of the No. 3 wind turbine at the next moment formed the output. All forecasting models used for the comparison have the same input-output framework. Detailed information on the entire dataset of the target wind turbine used in this study is shown in Figure 5.

Figure 5. Wind series of the target wind turbine. [Horizontal axis: Data (observation index); vertical axis: Wind speed (m/s); the training, validation, and testing segments are marked.]

Parameter Settings
In this study, the back propagation (BP), extreme learning machines (ELM), and support vector machine (SVM) models are used for comparison with the SC-LSTM model. Finding better parameters for the LSTM can effectively improve the accuracy of the hybrid model. The parameters mainly include the initial weights, learning rate, activation function, network topology, batches, and the number of iterations. Specifically, the initial weights were determined based on past experience, and the sigmoid, tanh, and softmax functions were used as the activation functions. The number of batches was set to 10, while the number of iterations ranged from 100 to 300.

Analysis of the Hybrid Model
To evaluate the performance of the SC-LSTM model, the BP model, the ELM model, and the SVM model were employed for comparison. The evaluation indexes of these models are described in Table 1. Additionally, the forecasting results in training, validation, and testing based on the BP, ELM, SVM, and SC-LSTM models are shown in Figure 6.

Table 1. Forecast performance of forecasting models.
Model      RMSE (m/s)            MAE (m/s)             MAPE (%)
           Train  Valid  Test    Train  Valid  Test    Train  Valid  Test
BP         2.28   3.77   3.46    1.61   2.94   2.63    76.18  59.29  29.97
ELMAN      0.81   2.03   1.87    0.51   1.36   1.25    18.02  27.24  11.73
SVM        0.18   0.84   2.21    0.13   0.58   1.05     7.36  10.9    6.51
SC-LSTM    0.19   0.49   0.49    0.10   0.26   0.28     3.42   6.07   2.57

Figure 6. Prediction results of forecasting models by BP, Elman, SVM and SC-LSTM model. [Twelve panels: rows for the training, validation, and testing sample distributions; columns for BP, Elman, SVM, and SC-LSTM; each panel plots actual and predicted wind speed (m/s) against time (5 min).]

On the basis of the forecasting error results presented in Table 1 and Figure 6, it can be seen that: (a) the SC-LSTM model is superior to the other models in validation and testing, and in training on every metric except RMSE, where the SVM model is marginally lower (0.18 m/s versus 0.19 m/s). The reason for this is that the LSTM adopts a special memory module, and the effective information (i.e., the back-propagated error) no longer relies on the current neuron but is directly transmitted to the next layer via the gate in the memory module; (b) the BP model is the poorest in terms of prediction and yields scattered prediction results with large deviations, suggesting that it is not flexible and does not possess the necessary information extraction ability for complex data; (c) since the Elman model has a feedback structure, it shows a certain memory function for time-series data. However, for a large amount of data, too many feedback structures easily cause exploding and vanishing gradients, so that training easily stagnates for a long time; (d) the SVM model follows the criterion of structural risk minimization and can maintain a certain generalization ability while minimizing the errors of samples. However, whilst the SVM model can achieve quite high prediction precision in training, the model still suffered from over-fitting, and therefore its prediction performance was degraded.

Comparisons Results
To further investigate the proposed model, which considers the spatial–temporal correlation with the adjacent wind turbines, the LSTM model was also adopted to make short-term predictions for the target wind turbine. The evaluation indexes of the SC-LSTM and LSTM models are described in Table 2. Furthermore, the forecasting results in training, validation, and testing based on the LSTM model and the SC-LSTM model are depicted in Figure 7. Figure 8 presents the correlation coefficient R for the training, validation, and testing sets obtained by the LSTM and SC-LSTM models.
From Table 2 and Figures 7-8, it can be concluded that the SC-LSTM model is preferable to the LSTM model in training, validation, and testing, the only exception being the validation MAPE (6.07% versus 5.15%). For instance, the test results show that the MAPE values of the LSTM and SC-LSTM models are 3.0% and 2.57%, respectively. Specifically, the proposed model was trained on the training set and achieved a correlation coefficient of R=0.9988. The results provided by the SC-LSTM model are very close to the real values, from which we can infer that the model learned the structure of the peaks and the trend of the time series. Notably, using the WCT to analyze the correlation between the target and adjacent wind turbines helps extract most of the hidden information in the wind turbine data and noticeably improves model performance, especially in forecasting the peak values. Hence, this improvement is expected, because a high linear correlation is normal and the advantage of applying WCT is well utilized.
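The input-output framing described under Input and Output (48 lagged five-minute values per turbine as input, the next value of the target turbine as output, and a chronological split of the 10656 observations) can be sketched as follows. The function names `make_samples` and `chronological_split`, the turbine keys, the ordering of features within a window, and the toy series are all hypothetical; the study does not specify these details.

```python
def make_samples(series_by_turbine, target, n_lags=48):
    """Build (input, output) pairs: the previous n_lags values of every turbine
    form the input, and the next value of the target turbine forms the output."""
    names = sorted(series_by_turbine)  # fixed feature order (an assumption)
    length = len(series_by_turbine[target])
    X, y = [], []
    for t in range(n_lags, length):
        window = []
        for name in names:
            window.extend(series_by_turbine[name][t - n_lags:t])
        X.append(window)
        y.append(series_by_turbine[target][t])
    return X, y

def chronological_split(n_total=10656, n_train=8640, n_valid=576):
    """Return 0-based observation index ranges for training, validation and testing."""
    return (range(0, n_train),
            range(n_train, n_train + n_valid),
            range(n_train + n_valid, n_total))

# Toy example with two short synthetic series and 3 lags instead of 48.
toy = {
    "No.1": [float(v) for v in range(10)],
    "No.3": [float(v) * 10 for v in range(10)],
}
X, y = make_samples(toy, target="No.3", n_lags=3)
train, valid, test = chronological_split()
```

Keeping the split chronological, rather than shuffling, means the testing samples always lie after the training samples in time, which matches the 30-day, 2-day, and 5-day partition described in the text.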
Figure 7. Prediction results of forecasting models by the LSTM and SC-LSTM model. [Six panels: rows for LSTM and SC-LSTM; columns for the training, validation, and testing sample distributions; each panel plots predicted wind speed (m/s) against time (5 min).]
Table 2. Performance evaluations of forecasting models for wind speed.
Model      RMSE (m/s)            MAE (m/s)             MAPE (%)
           Train  Valid  Test    Train  Valid  Test    Train  Valid  Test
LSTM       0.25   0.51   0.57    0.13   0.27   0.33    5.44   5.15   3.0
SC-LSTM    0.19   0.49   0.49    0.10   0.26   0.28    3.42   6.07   2.57
Figure 8. Correlation coefficient for training, validation and testing data set by the LSTM and SC-LSTM model. [Panels: training period, validation period, testing period.]
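To put the test-set columns of Table 2 in relative terms, the short calculation below computes the percentage reduction in each error metric when moving from LSTM to SC-LSTM. The helper name `relative_reduction` is hypothetical and the calculation is an illustration derived from the tabulated values, not a result reported by the authors.

```python
# Test-set values copied from Table 2.
table2_test = {
    "LSTM":    {"RMSE": 0.57, "MAE": 0.33, "MAPE": 3.0},
    "SC-LSTM": {"RMSE": 0.49, "MAE": 0.28, "MAPE": 2.57},
}

def relative_reduction(baseline, proposed):
    """Percentage by which the proposed model lowers an error metric."""
    return 100.0 * (baseline - proposed) / baseline

reductions = {
    metric: round(relative_reduction(table2_test["LSTM"][metric],
                                     table2_test["SC-LSTM"][metric]), 1)
    for metric in ("RMSE", "MAE", "MAPE")
}
```

On these figures the reductions come to roughly 14% for RMSE, 15% for MAE, and 14% for MAPE on the testing set.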
DISCUSSIONS AND CONCLUSIONS
This study introduced a novel spatial correlation approach, wavelet coherence transformation (WCT) analysis, which can be used to detect the spatial–temporal correlations with adjacent wind turbines. WCT can effectively capture the nonlinear functional relationship between the temporal and spatial series. At the same time, it can reflect time-lag characteristics between time series. The problem of predicting wind speed at new wind turbines that lack historical data can be assessed using WCT. Very short-term wind speed prediction shows that the model could improve accuracy for the target wind turbine.
The study demonstrated that LSTM can provide accurate wind speed predictions for wind power plants. In addition, LSTM can be used as a powerful tool in decision making, which can take advantage of the forecasting. Intelligent wind farms will become the backbone of the smart grid. Closely related spatial-temporal data and algorithms are integrated to promote grid connection. Additionally, the wind power curtailment of wind plants can be controlled with DNN.

ACKNOWLEDGMENTS
The data is supported by the National Renewable Energy Laboratory (https://www.nrel.gov/research/data-tools.html). This research was jointly funded by the National Key Research and Development Program of China (grant number 2017YFC0405900), the National Natural Science Foundation of China (grant number 51709221), the Planning Project of Science and Technology of Water Resources of Shaanxi (grant numbers 2015slkj-27 and 2017slkj-19), the China Scholarship Council (grant number 201608610170), the Open Research Fund of State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin (China Institute of Water Resources and Hydropower Research, grant number IWHR-SKL-KF201803), the Belt and Road Special Foundation of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering (2018490711), and the Doctorate Innovation Funding of Xi’an University of Technology (grant number 310-252071712).

LITERATURE CITED
Ak, R.; Li, Y.F.; Vitelli, V., and Zio, E., 2018. Adequacy assessment of a wind-integrated system using neural network-based interval predictions of wind power generation and load. International Journal of Electrical Power & Energy Systems, 95, 213-226.
Boehme, T.; Wallace, A.R., and Harrison, G.P., 2007. Applying time series to power flow analysis in networks with high wind penetration. IEEE Transactions on Power Systems, 22(3), 951-957.

Byung-Hwa, P.; Se-Young, O., and Ig-Jae, K., 2017. Face drought and its potential influence factors. Journal of
alignment using a deep neural network with local feature Hydrology, 547, 184-195.
learning and recurrent regression. Expert Systems with Huang, S.Z.; Wang, L., and Huang, Q., 2019, Spatio-temporal
Applications, 89, 66-80. characteristics of drought structure across China using an
Deepak, K.J.; Zhang, Z., and Huang, K.Q., 2017. Multi angle integrated drought index. Agricultural Water Management,
optimal pattern-based deep learning for automatic facial 218, 182-192.
expression recognition. Pattern Recognition Letters, 1-9. Hudgins, L.; Friehe, C.A., and Mayer, M.E., 1993. Wavelet
Dimitriadis, P.; Koutsoyiannis, D., and Tzouka, K., 2016, transforms and atmospheric turbulence. Physical Review
Predictability in dice motion: How does it differ from Letters, 71, 3279-3282.
hydrometeorological processes. Hydrological Sciences James, R.S. and John, L.W., 1999. A two-site correlation model
Journal, 61(9), 1611-1622. for wind speed, direction and energy estimates. Journal of
Fang, W.; Huang, S.Z.; Huang, G.H., and Huang Q., 2019b. Wind Engineering and Industrial Aerodynamics, 79(3), 233-
Copulas-based risk analysis for inter-seasonal combinations 268.
of wet and dry conditions under a changing climate. Jiang, S.F., 1999. Application of correlation coefficient in
International Journal of Climatology, 39(4), 2005-2021. regression analysis. Journal of Shanghai Institute of Electric
Fang, W.; Huang, S.Z.; Huang, Q.; Huang, G.H.; Meng, E.H., Power, 15 (1), 34-39.
and Luan, J.K., 2018. Reference evapotranspiration Jiang, Y. and Huang, G.Q., 2017. Short-term wind speed prediction:
forecasting based on local meteorological and global climate Hybrid of ensemble empirical mode decomposition, feature
information screened by partial mutual information. Journal selection and error correction. Energy Conversion and
of Hydrology, 561, 764-779 Management, 144, 340-350.
Fang, W.; Huang, S.Z.; Huang, Q.; Huang, G.H.; Wang, H.; Leng, Kinsela, M.A.; Monis, B.D.; Daley, M., and Hanslow, D.J., 2016.
G.Y.; Wang, L., and Guo, Y., 2019c. Probabilistic assessment A flexible approach to forecasting coastline change on wave
of remote sensing-based terrestrial vegetation vulnerability dominated beaches. Journal of Coastal Research, (752),
to drought stress of the Loess Plateau in China. Remote 952-956.
Sensing of Environment, in press. Koutsoyiannis, D.; Dimitriadis, P.; Lombardo, F., and Stevens,
Fang, W.; Huang, S.Z.; Ren, K.; Huang, Q.; Huang, G.H.; Cheng, S., 2018. From fractals to stochastics: Seeking theoretical
G.H., and Li, K.L., 2019a. Examining the applicability consistency in analysis of geophysical data. Advances in
of different sampling techniques in the development of Nonlinear Geosciences, 237-278.
decomposition-based streamflow forecasting models. Liu, S.Y.; Huang, S.Z., and Xie, Y.Y., 2019. Spatial-temporal
Journal of Hydrology, 568, 534-550. changes in vegetation cover in a typical semi-humid and
Guo, Y.; Huang, S.Z.; Huang, Q.; Wang, H.; Wang, L., and Fang, semi-arid region in China: Changing patterns, causes and
W., 2019b. Copulas-based bivariate socioeconomic drought implications. Ecological Indicator, 98, 462-475.
dynamic risk assessment in a changing environment. Journal Liu, S.Y.; Huang, S.Z.; Xie, Y.Y., and Huang, Q., 2019.
of Hydrology, 575, 1052-1064. Identification of the non-stationarity of floods: Changing
Guo, Y.; Huang, S.Z.; Huang, Q.; Wang, H.; Fang, W.; Yang, Y.Y., patterns, causes, and implications. Water Resources
and Wang, L., 2019a. Assessing socioeconomic drought Management, 33(3), 939-953.
based on an improved Multivariate Standardized Reliability Ma, X.L.; Tao, Z.M.; Wang, Y.H.; Yu, H.Y., and Wang, Y.P.,
and Resilience Index. Journal of Hydrology, 568, 904-918. 2015. Long short-term memory neural network for traffic
Guo, Y.F.; Gao, H.L., and Wu, Q.W., 2017. A meteorological speed prediction using remote microwave sensor data.
information mining-based wind speed model for adequacy Transportation Research Part C: Emerging Technologies,
assessment of power systems with wind power. International 54, 187-197.
Journal of Electrical Power & Energy Systems, 93, 406-413. Meng, E.H.; Huang, S.Z., and Huang, Q., 2018. A robust method
Han, Z.M.; Huang, S.Z.; Huang, Q.; Leng, G.Y.; Wang, H.; Li, H.; for non-stationary streamflow prediction based on improved
Li, H., and Li, P., 2019. Assessing GRACE-based terrestrial EMD-SVM model. Journal of Hydrology, 568, 462-478.
water storage anomalies dynamics at multi-timescales and Mi, C.; Shen, Y.; Mi, W.J., and Huang, Y.F., 2015. Ship
their correlations with teleconnection factors in Yunnan identification algorithm based on 3d point cloud for
Province, China. Journal of Hydrology, 574, 836-850. automated ship loaders. In: Mi, W.J.; Lee, L.H.; Hirasawa,
Huang, S.Z.; Chang, J.X.; Huang, Q., and Chen, Y.T., 2014a. K., and Li, W.L. (eds.), Recent Developments of Port and
Monthly streamflow prediction using modified EMD-based Ocean Engineering, Journal of Coastal Research, Special
support vector machine. Journal of Hydrology, 511, 764-775. Issue No.73, pp. 28-34.
Huang, S.Z.; Chang, J.X.; Huang, Q., and Chen, Y.T., 2014b. Peng, W.; Maleki, A.; Rosen, M.A., and Azarikhah, P., 2018.
Spatio-temporal changes and frequency analysis of drought Optimization of a hybrid system for solar-wind-based water
in the Wei River Basin, China. Water Resources Management desalination by reverse osmosis: Comparison of approaches.
28(10), 3095-3110. Desalination, 442, 16-31.
Huang, S.Z.; Huang, Q.; Chang, J.X., and Leng, G.Y., 2016. Qu, X.Y.; Kang, X.N; Zhang, C.; Jiang, S., and Ma, X.D., 2016.
Linkages between hydrological drought, climate indices and Short-term prediction of wind power based on deep long
human activities: A case study in the Columbia River basin. short-term memory. 2016 IEEE PES Asia-Pacific Power and
International Journal of climatology, 36(1), 280-290. Energy Conference, 1148-1152.
Huang, S.Z.; Li, P.; Huang, Q.; Leng, G.Y.; Hou, B.B., and Ma, L., Ren, K.; Huang, S.Z.; Huang, Q.; Wang, H.; Leng, G.Y.; Cheng,
2017. The propagation from meteorological to hydrological L.Y.; Fang, W., and Li, P., 2019. A nature-based reservoir

632 Shi et al.
optimization model for resolving the conflict in human Wang, S.X.; Zhang, N.; Wu, L., and Wang, Y.M., 2016. Wind
water demand and riverine ecosystem protection. Journal of speed forecasting based on the hybrid ensemble empirical
Cleaner Production, 231, 406-418. mode decomposition and GA-BP neural network method.
Sepp, H. and Jürgen, S., 1997. Long short-term memory. Neural Renewable Energy, 94, 629-636.
Computation, 9(8), 1735-1780. Wen, L.B.; Fang, H.X., and Zhang, Y., 2013. Correlation research
Shi, X.Y.; Lei, X.W.; Huang, Q.; Huang, S.Z.; Ren, K., and Hu, between the sunspot numbers and the cosmic rays based
Y.Y., 2018. Hourly day-ahead wind power prediction using on wavelet and cross wavelet analysis. Chinese Journal of
the hybrid model of variational model decomposition and Space Science, 33(1), 13-19.
long short-term memory. Energies, 11(11), 3227. Wu, W.Z.; Chen, K.J.; Qiao, Y., and Lu, Z.X., 2016. Probabilistic
Tascikaraoglu, A. and Uzunoglu, M., 2014. A review of combined short-term wind power forecasting based on deep neural
approaches for prediction of short-term wind speed and networks. 2016 International Conference on Probabilistic
power. Renewable and Sustainable Energy Reviews, 34, 243- Methods Applied to Power Systems (PMAPS).
254. Wu, Y.C.; Yin, F., and Liu, C.L., 2017. Improving handwritten
Wang, D.Y.; Luo, H.Y.; Olivier, G., and Lin, Y.B., 2017. Multi- Chinese text recognition using neural network language
step ahead wind speed forecasting using an improved wavelet models and convolutional neural network shape models.
neural network combining variational mode decomposition Pattern Recognition, 65, 251-264.
and phase space reconstruction. Renewable Energy, 113, Yu, D.; Zhu, H.; Han, W., and Holburn, D., 2019. Dynamic multi
1345-1358. agent-based management and load frequency control of
Wang, J.J.; Zhang, W.Y.; Li, Y.N.; Wang, J.Z., and Dang, Z.L., PV/Fuel cell/wind turbine/CHP in autonomous microgrid
2014. Forecasting wind speed using empirical mode system. Energy, 173, 554-568.
decomposition and Elman neural network. Applied Soft
Computing, 23, 452-459.
