2021-Bayesian_Deep_Learning_for_Dynamic_Power_System_State_Prediction_Considering_Renewable_Energy_Uncertainty

This article has been accepted for publication in a future issue of this journal, but has not been
fully edited. Content may change prior to final publication.
JOURNAL OF MODERN POWER SYSTEMS AND CLEAN ENERGY, VOL. XX, NO. XX, XX XXXX
Bayesian Deep Learning for Dynamic Power

System State Prediction Considering
Renewable Energy Uncertainty
Shiyao Zhang and James J. Q. Yu
AbstractÿModern power systems are incorporated with dis‐ tackle such issues, the accurate state prediction for power
tributed energy sources to be environmental-friendly and cost- system is crucial to capture the massive system uncertainties
effective. However, due to the uncertainties of the system inte‐ from RESs to support the system operations and services.
grated with renewable energy sources, effective strategies need
to be adopted to stabilize the entire power systems. Hence, the For instance, a suitable prediction technique can be deployed
system operators need accurate prediction tools to forecast the for system data analytics and contributes to the stable opera‐
dynamic system states effectively. In this paper, we propose a tion and planning in the power system.
Bayesian deep learning approach to predict the dynamic system Existing solutions to predict the dynamic system state in
state in a general power system. First, the input system dataset the power system mainly involve in deploying the traditional
with multiple system features requires the data pre-processing
prediction tools [1], [2]. For instance, in [1], a robust ap‐
stage. Second, we obtain the dynamic state matrix of a general
power system through the Newton-Raphson power flow model. proach, namely extended Kalman filter, is developed to mon‐
Third, by incorporating the state matrix with the system fea‐ itor system state dynamics in a faster and more reliable man‐
tures, we propose a Bayesian long short-term memory ner. Besides, a new robust generalized maximum-likelihood-
(BLSTM) network to predict the dynamic system state vari‐ type unscented Kalman filter in [2] is proposed to predict
ables accurately. Simulation results show that the accurate pre‐ the state innovation vectors in the system. Although the pre‐
diction can be achieved at different scales of power systems
through the proposed Bayesian deep learning approach.
vious studies propose the statistical techniques on system
state prediction, they do not consider the penetration of
Index TermsÿBayesian deep learning, data analytics, Newton- RESs in the power system. By considering the RESs de‐
Raphson power flow, renewable energy source, system state.
ployed in the system, some existing researches deploy suffi‐
cient statistical tools on predicting system state [3] - [5]. For
example, a Markov model with the Viterbi algorithm is pro‐
I. INTRODUCTION
posed in [3] for state prediction in different power systems
With the development of technology, the power system with the penetration of RESs. Furthermore, in [4], a statisti‐
has become more diversified than the conventional power cal Gaussian mixture model is developed to estimate the real-
system, such as the increase in the utilization of renewable time system measurements. Nonetheless, these approaches
energy sources (RESs). Although RESs contribute to improv‐ are not practical since such models already assume that both
ing the environmental impacts to prevent global warming, the system and models are linear. Lastly, a novel interval
the system uncertainties occur due to the uncertain and inter‐ state estimation algorithm is devised in [5] to consider differ‐
mittent renewable power generations. In practice, the integra‐ ent uncertainties of distributed generation outputs, as well as
tion of RESs in the system results in sudden instability phe‐ line parameters in unbalanced distribution systems. In prac‐
nomena of the entire power system. Meanwhile, the power tice, complex power systems have typical non-linear fea‐
system state variables may also undergo drastic changes. To tures, such as time-varying loads and multiple power genera‐
tions. Hence, to handle these non-linear features, several
Manuscript received: December 31, 2020; accepted: May 17, 2021. Date of studies are involved in deploying non-parametric methods
CrossCheck: May 17, 2021. Date of online publication: XX XX, XXXX. for prediction in the system. In [6], a non-parametric kernel
This work was supported by the General Program of Guangdong Basic and estimation method is applied to determine the state condition
Applied Basic Research Foundation (No. 2019A1515011032) and the Guang‐
dong Provincial Key Laboratory of Brain-inspired Intelligent Computation (No. to characterize the stability issues of the system.
2020B121201001). However, there exists a research gap in the state predic‐
This article is distributed under the terms of the Creative Commons Attribu‐ tion problem of traditional dynamic system. The above re‐
tion 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
S. Zhang is with the Academy for Advanced Interdisciplinary Studies, South‐ searches unilaterally consider linear or non-linear features in
ern University of Science and Technology, Shenzhen 518055, China (e-mail: sy‐ the power system. In practice, a dynamic power system con‐
zhang@ieee.org). sists of various types of linear and non-linear features. For
J. Yu (corresponding author) is with the Department of Computer Science and
Engineering, Southern University of Science and Technology, Shenzhen 518055, example, the electrical loads in the system are either linear
China (e-mail: yujq3@sustech.edu.cn). or non-linear loads according to the network structure. As an
DOI: 10.35833/MPCE.2020.000939
emerging technique to handle the problems in complex sys‐
tems, machine learning approaches can be a good candidate could be achieved through Bayesian deep learning with con‐
to fully respect both linear and non-linear factors in a gener‐ sideration of the bad-data circumstance. However, there is
al power system. Some studies utilize machine learning tools no recent study using Bayesian deep learning model to con‐
for system data prediction problems [7] - [9]. For instance, a sider the dynamic system state of the entire power system as
novel machine learning tool, support vector machine, can be far as we are concerned.
applied for time series prediction in the power system [7]. To bridge the research gaps, we propose a Bayesian deep
Besides, a low-rank tensor learning model can be utilized to learning approach to perform state prediction in the general
predict system measurements such as solar power generation power system considering RES and model uncertainties. The
[8]. Last but not least, the use of decision trees can handle a major efforts of this paper are shown as follows.
large amount of wide-area information to keep the stability 1) A Newton-Raphson power flow model is deployed to
of the power system [9]. estimate the dynamic system state, and we devise a new
Beyond the aforementioned studies, as a subset of ma‐ state matrix to aggregate with all historical information.
chine learning, deep learning has been widely utilized in the 2) By adopting Bayesian LSTM (BLSTM) network in the
related research fields [10]. In the power system, the imple‐ system, both the model and data uncertainties can be remark‐
mentation of deep learning techniques can use multiple neu‐ ably captured.
ral network layers to extract latent system features, which 3) Accurate predictions can be achieved by means of the
can further assist power system operations in practical sce‐ proposed BLSTM network for different scales of systems.
narios. Some existing research works have solved power sys‐ The remainder of this paper is organized as follows. Sec‐
tem problems through deep learning tools [11]-[16]. In [11], tion II presents and illustrates the proposed methodology. In
a method based on artificial neural network (ANN) is pro‐ Section III, we formulate a Newton-Raphson power flow
posed to predict the long-term voltage status to ensure the model to generate the state matrix of a general power sys‐
margin of the voltage stability. Additionally, an intelligent tem. The deep learning approach, BLSTM network, is then
system strategy is proposed in [12] to predict the dynamic proposed in Section IV. In Section V, we conduct the perfor‐
voltage deviation to observe the short-term instability of the mance evaluation of the proposed model with a general pow‐
system based on the ensemble learning of neural networks. er system. Finally, we conclude this paper in Section VI.
However, these studies lack multiple degrees of data inter‐
pretability in the power system. For example, the one-layer II. PROPOSED METHODOLOGY
ANN cannot deeply learn the representation of large-scale In this section, we present the proposed methodology,
power system data due to its simple structure and parameter which is summarized in Fig. 1.
settings. To tackle such issues, in [13], a model-specific
deep neural network (DNN) based power system state esti‐ Power system data
mation scheme is proposed to estimate real-time power sys‐ Data pre-processing stage
tem state. Furthermore, the long short-term memory (LSTM) Training Testing
model could be deployed to accurately capture the time-vary‐ dataset dataset
ing dynamic behaviors of active distribution networks [14]
and to forecast the solar energy output [15]. Finally, by im‐
plementing a surrogate model, a novel LSTM model is ap‐ Newton-Raphson power Power flow
flow model analysis stage
plied to capture the time-varying consecutive states [16].
Although the aforementioned studies have demonstrated
that deep learning is superior to perform prediction tasks, State matrix State matrix
these studies are indeed based on the deterministic models (training set) (testing set)
State prediction stage
so that they lack the ability of capturing uncertainty. In the
power system, the uncertainties from various sources may BLSTM network
expose potential safety issues to cause huge economic losses
[17]. Even though several non-deep-learning approaches can Predicted state variables Aggregation stage
deal with system uncertainties, e. g., [3] and [18], their
schemes become complex and time-consuming with the in‐ Fig.1. Workflow of proposed methodology.
creasing size of power networks. In this paper, a probabilis‐
tic deep learning model, i.e., Bayesian deep learning, is ad‐ The proposed methodology consists of four main stages:
opted to incorporate the power system uncertainties into the data-preprocessing, power flow analysis, state prediction,
state estimation framework aiming at a more interpretable and aggregation. The first stage involves the dataset prepara‐
model with reliable prediction by means of probability theo‐ tion process, where we collect various types of power sys‐
ry in an efficient manner. There are several research works tem data, such as various renewable power generation, time-
that justify the feasibility of the Bayesian deep learning varying loads, and dynamic power state information. Then,
methods in power system studies [19], [20]. For instance, a the input dataset is segregated into a training dataset and a
probabilistic Bayesian deep learning model is proposed for testing dataset. The size of the training dataset is determined
day-ahead load forecasting to capture the model and load un‐ by multiple potential features of the entire dynamic power
certainties [19]. In addition, in [20], the state estimation system. After that, both the training and testing datasets are
ZHANG et al.: BAYESIAN DEEP LEARNING FOR DYNAMIC POWER SYSTEM STATE PREDICTION CONSIDERING RENEWABLE ENERGY UNCERTAINTY
used for the power flow analysis. Once the state matrix is listic prediction techniques are primarily derived from deter‐
obtained and separated into training and testing sets, we in‐ ministic models, their capability in capturing the stochastic
corporate them into the complete training and testing datas‐ uncertainty is limited. Hence, most of the existing models
ets to fit in the proposed BLSTM network. By properly train‐ cannot represent the data uncertainty due to the stochastic
ing the learning system, the dynamic system states of gener‐ time-varying loads and RESs. It is important to develop a ge‐
al power systems can be obtained via online inference. Final‐ neric probabilistic deep learning model to handle a large
ly, the aggregation stage clusters all the individual probabilis‐ number of such uncertainties to provide the confidence
tic prediction through convolution with the previously saved bounds for decision-making.
weights to gain the final probabilistic dynamic state vari‐ Besides the data uncertainty, we need to investigate the
ables. uncertainty of the model related to the output results on deal‐
ing with dynamic state prediction, which is defined as model
A. Dynamic State Assessment uncertainty. Besides the stochastic uncertainty, the model un‐
The dynamic state assessment is gained by using the New‐ certainty also plays an important role in the probabilistic pre‐
ton-Raphson power flow model. Note that we consider the diction task, which is used to identify the amount of output
alternating-current (AC) power system in the proposed meth‐ uncertainty. Intuitively, the model uncertainty refers to the
odology. Hence, we perform AC power flow analysis so as uncertainty of the model parameters and network structure.
to estimate the AC power state information. The power sys‐ The challenge of such uncertainty is to know how much the
tem network can be modeled as a graph that is defined as chosen combinations affect the results of dynamic state pre‐
(&¨), where & {1¨2¨...¨N} is the bus set, N is the total diction in different circumstances. Thus, the aim is to investi‐
number of buses; and & u & is the transmission line set. gate the degree of uncertainty of this model corresponding
In the power network, we can model a branch (i¨j) with to the outputs. Since we consider a large amount of model
two common buses i¨j & as one equivalent π circuit. parameters and numerous variations of structures evaluated
Hence, the line admittance of this circuit can be expressed for different combinations, it is important to observe how
as y ij g ij jb ij, where g ij t 0 and b ij d 0 are the branch con‐ the accurate dynamic system state prediction varies under
ductance and branch susceptance, respectively. In addition, different circumstances, e. g., days, weeks, seasons, and so‐
the shunt capacitance of branch (i¨j) can be denoted as cial factors. In addition, to tackle such system conditions,
the proposed Bayesian deep learning model is developed by
c ij c ji, which is used for voltage and reactive power control.
implementing the related parameters inside. The remainder
These aforementioned parameters are the key components to
of this paper will introduce how our proposed Bayesian deep
develop the AC power flow model for a general power sys‐
learning model can effectively handle both the data and mod‐
tem.
el uncertainties in a detailed manner.
Through the power flow analysis in the system, the dy‐
namic state variables can be estimated. The collection of
III. NEWTON-RAPHSON POWER FLOW MODEL
these variables can be sorted as the state matrix of the sys‐
tem. Since we need to fit this information into the neural net‐ This section presents the problem formulation of Newton-
work model, the large amount of historical dynamic state es‐ Raphson power flow model. As shown in Fig. 1, the AC
timations are covered and divided into training and testing power flow model can be utilized to obtain the correspond‐
sets. ing state matrix for the power system after the data pre-pro‐
cessing stage, which is for the model training and testing in
B. Uncertainty Investigation the proposed BLSTM network. The procedure of AC power
Data uncertainty in general power systems is typically re‐ flow analysis is shown below in a detailed manner [21].
lated to the stochastic uncertainty of the power generation
A. AC Power Flow Model
and loads injected as power sources to the system, such as
load variability and intermittent power generation. It is more Based on the dynamic state assessment mentioned in Sec‐
challenging to handle both load and intermittent power gen‐ tion II, we further develop the power flow model of a gener‐
eration prediction than either of these individual tasks. Spe‐ al power system. Considering the nodal equations at each
cifically, the stochastic uncertainty of load and intermittent bus, the nodal admittance matrix in the system can be denot‐
power generation reflects the uncertainty characteristics from ed as Y. Through an one-line diagram of the whole circuit,
different injected sources resulted from the variability of for each bus, the current and voltage matrices are denoted as
weather and unpredictable human activities. Besides, most of I and V, respectively. By Ohm’s law, the nodal admittance
the existing approaches only predict the upper and lower matrix Y of the entire system follows (1).
bounds of the forecasting power without the detailed infor‐ ª I 1 º ª« y 11 y 12 * y 1N º» ª V 1 º
mation about the power distribution at every time step. Addi‐ « I » « y 21 y 22 * y 2N » « V »
tionally, most probabilistic prediction methods are indeed de‐ I « 2 » «« « 2»
»»» « , » YV (1)
«,» « , , ,
terministic approaches, which cannot fully capture the sto‐ « » « « »
¬I N ¼ ¬y N1 y N2 * y NN »¼ ¬V N ¼
chastic uncertainty. In this paper, as mentioned above, we
consider the integration with time-varying loads and RESs in For the representations of power injections in the system,
AC power systems. Hence, the data uncertainty is taken into we denote P i and Q i as the active and reactive power injec‐
account in our proposed system model. As existing probabi‐ tion at bus i, respectively, which are modeled as:
P i V i ϕV j (g ij cos θ ij b ij sin θ ij ) (2)

power imbalance, respectively.
j &i According to [21], the utilization of the Newton-Raphson
power flow model is capable of solving the above non-linear
Q i V i ϕV j (g ij sin θ ij b ij cos θ ij ) (3)
j &i equation through iterations. After the initial guess of state
matrix, the solution can be determined through an iterative
where & i is the set of nearby buses connected with bus i;
manner by means of the Newton-Raphson power flow mod‐
and θ ij is the phase difference of branch (i¨j).
el. Specifically, given the non-singular Jacobian matrix, the
In addition, the active and reactive branch power flows of
Newton-Raphson power flow model can be converged qua‐
branch (i¨j) are modeled as:
dratically from a sufficient initial guess [22]. The actual mea‐
P ij V i2 (g si g ij ) V iV j (g ij cos θ ij b ij sin θ ij ) (4) surements of the general power system can be estimated
Q ij V i2 (b si b ij ) V iV j (g ij sin θ ij b ij cos θ ij ) (5) through the power flow analysis. The complete process of
Newton-Raphson state estimation method is presented in Al‐
where g si and b si are the shunt conductance and shunt suscep‐ gorithm 1.
tance at bus i, respectively.
For the specific implementation at bus i of the system, it Algorithm 1: Newton-Raphson state estimation method
involves in both the renewable power generation and the 1: Initial setup for all buses in the system with reference to the slack bus
load demand of bus i, which can be denoted as P G and P L , 2: for k 1:M k
i i 3: Obtain 'P k and 'Q k through power flow equations in [22]
respectively. The active and reactive power injection at bus i 4: Develop the Jacobian matrix of the system
can also be expressed as: 5: Compute '|V k| and 'θ k and form the state matrix X k
6: if the stopping criterion is met then
P i P G P L i &
i i (6) 7: Update the final solution as X X k and then close the for loop
8: else
Q i Q Gi Q Li i & (7) 9: Return to Step 3 in the for loop for another iteration and set k = k + 1.
10: end if
In order to ensure the stable system operation, the active 11: end for
and reactive power generation at time t of bus i & should
follow their operation limits. For every iteration k, the active and reactive power imbal‐
P G ¨min d P G d P G ¨max
i i i (8) ance can be obtained, denoted as 'P k and 'Q k, respectively.
Q Gi ¨min d Q Gi d Q Gi ¨max In addition, by means of Jacobian matrix, the state matrix
(9)
X k can be computed. To satisfy the stopping criterion, the it‐
B. State Matrix eration number is set as k M k, where M k is the maximum it‐
eration number. For the tolerance of the convergence, Algo‐
The actual measurements for the state estimation in the rithm 1 follows the rule as:
power system are obtained by means of the Newton-Raph‐
|'P ik|d ϵ i & (13)
son power flow model. As mentioned before, a new state ma‐
trix is devised to aggregate with all the historical informa‐ |'Q ki|d ϵ i & (14)
tion. Through the power flow analysis, the crucial compo‐ 5
where ϵ = 10 is a small positive number.
nents in the system can be obtained, such as the voltage
magnitude, phase angle, active power, and reactive power at
IV. BLSTM
each bus. Based on the previous definitions, we denote the
voltage magnitude and phase angle at bus i as |V i| and θ i, re‐ This section introduces the proposed BLSTM network. By
spectively. There are totally 2N 1 state variables related to handling the segregated datasets introduced in Section II, we
the common bus voltage in the system, which explicitly in‐ can extract and analyze the input data features. The first step
cludes N voltage magnitudes, N 1 phase angles, where the is associated with the data pre-processing. Then, through the
phase angle for each bus is defined related to the reference power flow analysis stage, we aggregate all historical infor‐
bus. By considering the active and reactive power injections mation to fit in the proposed BLSTM network. The steps for
at bus i, we define the state vector in the system as: network training are covered below in detail.
x iT [ |V i|¨θ i ¨P i ¨Q i ] i & (10) A. Data Pre-processing
Then, the state matrix of a general power system can be Based on the proposed methodology mentioned in Section
expressed as: II, we consider a predetermined period for both renewable
X [x 1 ¨x 2 ¨*¨x N ] (11) power generation and load demands, which can be denoted
The satisfaction of AC power balance conditions can be as , {1¨2¨...¨N T} with N T time slots. Given the time period
modeled as the non-linear equation, which can be expressed ,, we use R t to denote the collection of renewable power
as: generation and load demand features in a general power sys‐
tem, where t ,. As mentioned in Section II, the complex
ª 'P º AC power system consists of various entries such as RESs
g(X) « » 0 (12)
¬'Q ¼ and time-varying loads. To tackle the complex system, the
where 'P and 'Q are the matrices of the active and reactive implementation of deep learning approach can be a suitable
candidate for accurate state prediction. Specifically, we col‐ Let z t be the input of the neural network, which is formed
lect the input data from December 2015 to December 2016. by:
For solar generation, wind generation, and residential load, z t Concatenate(R t ¨X t ) (15)
the profiles during this period are captured with a time dura‐
where Concatenate(·) is the function to concatenate a list of
tion of 15 minutes in a cumulative manner. Furthermore, the
inputs.
time-varying loads fluctuate due to the practical electricity
For the time sequence number in the LSTM model, we
usage patterns. In this paper, we collect the input power data
consider the model input as the observations over the past L
associated with 37 European countries. Specifically, using
time slots. The vector of observations can be denoted as
the topology of the city power grid, the related power gener‐
z t L¨t [z t L ¨z t L 1 ¨...¨z t ]. The LSTM network topology con‐
ation and loads are implemented. By sorting out the input da‐
ta, we can extract multiple different features of the power tains the fully self-connected hidden layers with the memory
grid. In addition, since the training and testing datasets are cells and related gate units installed. LSTM units implement
constructed by capturing the periodical changes of the pro‐ the hidden layers, and all units have direct connections with
files, a cross-validation step is also required for the data pre- each other. In Fig. 2, each LSTM cell is formed as the chain
processing stage to assess the exact separation of training structure and covers four interacting layers with individual
and testing sets. Besides, in order to improve the robustness communication links, which are different from the traditional
of the proposed deep learning model, the sufficient amount recurrent neural network (RNN) with a single neural net‐
of training dataset is considered. Furthermore, by consider‐ work structure. The key function for each LSTM cell con‐
ing the state matrix, the extracted features can rapidly in‐ sists of an input gate, forget gate, and output gate. First of
crease with the number of buses in the power system. Thus, all, the input gate in the LSTM network at time t i t can be
the total number of the features from the input data depends expressed as:
on general power network information. Besides, before we i t σ(z tU i h t 1W i ) (16)
fit the input data to the BLSTM network, we first normalize where U i and W i are the coefficient and weighted matrices
the input dataset to [0¨1] with min-max normalization. For of the input gate, respectively; and h t 1 is the hidden state
the missing values in the input dataset, we utilize linear in‐ for the previous time slot. By using the sigmoid function
terpolation for recovering the values. σ(), the non-linearity of the hidden layers is shown. The
B. LSTM backup cell state of LSTM network at time t C͂ t can be de‐
noted as:
The pre-processing input data with different items are fed
in the Bayesian deep learning model for the purpose of mod‐ C͂ t tanh(z tU g h t 1W g ) (17)
el training. In this paper, we focus on the BLSTM network, where tanh() is the hyperbolic tangent activation function;
which is inherently a probabilistic model for handling time- U g and W g are the coefficient and weighted matrices of the
series data. The network parameters in the BLSTM network backup cell state, respectively. Then, the forget gate of
are expressed by conditional probabilities, which are differ‐ LSTM network at time t f t can be denoted as:
ent from the typical LSTM network with fixed parameters. f t σ(z tU f h t 1W f ) (18)
Since the input data cover the features of renewable power
generation, it is crucial to capture the characteristics of where U f and W f are the coefficient and weighted matrices
RESs. Hence, our proposed BLSTM network can capture of the forget gate, respectively. The function of the memory
both the model uncertainty and stochastic uncertainty [10]. cell is to activate the forget gate to decide whether to delete
To illustrate the basic architecture of the proposed the useless information from the previous time slot. The out‐
BLSTM network, we first introduce the structure of one put gate of LSTM network at time t o t can be expressed as:
LSTM cell shown in Fig. 2. o t σ(z tU o h t 1W o ) (19)
yt where U o and W o are the coefficient and weighted matrices
of the output gate, respectively. The function of the output
ReLU(·) gate is to prevent from storing long time lag memories. Be‐
sides, the hidden state of LSTM network at time t h t can be
Hadamard
Ct1
product
+ Ct expressed as:
it Hadamard
tanh(·) h t o t Ctanh(C t ) (20)
ft product ot Hadamard where C denotes the Hadamard product operation. The cell
~
Ct product state of LSTM network at time t C t can be denoted as:
σ(·) σ(·) σ(·)
tanh(·)
C t f t CC t 1 i t CC͂ t (21)
wi + + wg + + wf + + wo + + where C t 1 denotes the cell state from the previous time slot,
+ + + + ht
which can also be defined as the memory cell state.
ui ug uf uo
zt C. BLSTM Network
ht1
The utilization of LSTM network has demonstrated that
Fig. 2. Structure of LSTM cell. the point prediction can capture the long-term dependencies
of the input dataset [23]. However, the standard LSTM net‐ δ* arg min KL(q δ (W )||p(W|Z train Ÿ train ))
δ
work cannot capture the uncertainty. Hence, we propose the q δ (W )
BLSTM network for probabilistic prediction tasks. Specifi‐ δ
ϯ
arg min q δ (W )lg
p(W ) p(W|Z train Ÿ train )
dW
cally, the BLSTM network involves in structure and parame‐
arg min [(KL(q δ (W )||p(W )) E qδ (W ) (lg p(Z train Ÿ train|W ))]
ter uncertainties. On one hand, the structure uncertainty re‐ δ
fers to the way of choosing model structure to interpolate or (25)

extrapolate the selected data. On the other hand, the parame‐ Hence, given the optimal distribution, the predictive distri‐
ter uncertainty is associated with the set of weight parame‐ bution is calculated approximately by:
ters in the network to indicate the uncertain observations.
For Bayesian neural networks, the prior distributions should
p(y|z¨Z train Ÿ train ) ϯp(y|z¨W )q (W )dW *
δ q *δ (y|z) (26)
be relevant to the distribution of the network parameters, e. where q *δ (W ) is the optimal distribution. Meanwhile, consid‐
g., weights and bias. The reason is that they are hard to be ering the dimension of the distribution of the parameters, we
identified due to the unclear meanings of these parameters. follow the assumption in [26].
According to the demonstrations in [24], deploying standard By repeating the calculation with the sample weight T s,
parametric distributions is effective when the prior is hard to the predictive mean of the samples can be calculated to pres‐
be identified. ent the model parameter uncertainty, which is shown as:
As mentioned above, in order to capture the model uncer‐ 1 s Wt
T
Tsϕ
tainty, a prior distribution that each parameter is set as a exp(y) f (z) (27)
t 1
standard normal distribution with zero mean can be placed Wt
over weight parameters. Note that the posterior distribution where f (z) is the stochastic forward pass with its weight in
W
is used to produce the samples of forecasts after training the the network. Besides, given that W t ~q *δ (W ) and p(y|f (z)) t
W
BLSTM network. We define p(W|Z train Ÿ train ) as the posterior N(y; f (z)¨σ)¨σ ! 0, we have the following estimator:
t
distribution of the network weights with training data, where 1 s Wt

T
W
( f (z))T f t (z) σ
Tsϕ
W is the weighted matrix for the network; and Z train and Y train exp(y T y) (28)
t 1
are the training sets. According to Bayes’s theorem, the pos‐
terior distribution can be expressed as: where σ is the data uncertainty and refers to the inherent
noise in the data. The measurement of this value can be de‐
p(Y train|Z train ¨W ) p(W )
p(W|Z train Ÿ train ) (22) termined by the residual sum of squares on the independent
p(Y train|Z train ) validation sets of data. Then, the predictive variance can be
Then, we can have: obtained as:
T T T
1 s Wt 1 s Wt
ϯp(y|z¨W ) p(W|Z
s
p(y|z¨Z train Ÿ train ) Ÿ train )dW (23) Wt

W
f t (z) σ 2
Tsϕ ϕ ϕ
train Var(y) ( f (z)) T
f (z) ( f (z)) T
t 1 T s2 t 1 t 1
where z and y are the given input and output points, respec‐
(29)
tively.
For Bayesian deep learning, the network parameters are Besides, the loss function in the BLSTM network can be
defined as:
assumed to follow the posterior probability distributions.
T train
Hence, after the training process of the Bayesian neural net‐ 1 1 1
$(δ) ϕ2σ ||y i f (z i )||2 lg σ 2 (30)
work, the queries for the unseen data can be predicted by: T 2
train t 1
2
2
p(ŷ |ẑ ) E p(W|Z Ÿ ) ( p(ŷ |ẑ ¨Z train Ÿ train ))
train train To combine with the data and model uncertainties, we de‐
fine a new expression for the output of this model by consid‐
ϯ p(ŷ |ẑ ¨Z train Ÿ train ) p(W|Z train Ÿ train )dW (24)
ering the predictive mean and model precision as [ẑ ¨σ̂ 2 ]
where E() is the expectation over the posterior probability W W
f BLSTM (z), where f BLSTM (z) is the proposed BLSTM network
t t
distribution p(W|Z train Ÿ train ); and ̂ represent the prediction that is parameterized by W t ~q *δ (W ). Thus, the final loss func‐
valnes. In this case, the Bayesian neural network is equiva‐ tion can be expressed as:
lent to taking the average predictions from an ensemble of T train
1 1 1
neural networks weighted by the posterior probabilities of $ BLSTM (δ) ϕ2σ ||y i ŷ i||2 lg σ̂ i2 (31)
T train
2 2
2
the parameters W. t 1
Note that the true posterior is usually intractable for the Last but not least, the confidence interval (CI) can be cal‐
LSTM network. By considering the complexity of the poste‐ culated as:
rior distribution of the network parameters, (24) is hard to CI [μ̂ t β/2 σ̂ ¨μ̂ t β/2 σ̂ ] (32)
be performed due to its intractable computation. Therefore, where μ is the expection value; t β/2 refers to the t-score in
we introduce q δ (W ) with parameter δ as an approximating
the table of t-distribution.
variational distribution to ensure the optimal distribution by
minimizing the Kullback-Leibler (KL) divergence according D. Training Algorithm
to [25]. To solve this issue, the variational inference can be In the proposed BLSTM network, we train the network by
utilized to gain the latent parameters δ on q δ (W ) as: minimizing the loss function in (30) to adjust the predicted
results. Also, the BLSTM network is capable of capturing slots) to be 4. The epoch number is 150 and the batch size
the uncertainty of the input data. Considering the process of is set to be 64. The numbers of the three hidden units for
training BLSTM network, we incorporate the Adam optimiz‐ the LSTM network are set to be 64, 128, and 256, respec‐
er [27] into the training algorithm for the LSTM network. In tively. The dropout rate is 0.5. The value of β is set to be
addition, the LSTM cell in the network is designed so that 5% to investigate the 95% confidence interval for the pro‐
the updating complexity per time interval and weight does posed BLSTM network. The number of samples T s is set to
not depend on the size of the neural network, and the stor‐ be 100. All the tested algorithms are implemented with Py‐
age does not rely on the sequence length of the input data thon and PyTorch [31].
[28]. Besides, the historical data, known as the input data,
B. Scenarios for Comparison
can be regarded as the prior information for input training.
When we apply the BLSTM network for the practical scenar‐ The evaluation of our proposed BLSTM network is based
io, we should pre-train the network under different power on the comparison with other typical prediction techniques.
system operations. Thus, the state of the general power sys‐ In this paper, we introduce six widely-adopted techniques
tem can be predicted through our proposed BLSTM net‐ for baseline comparisons, including multiple linear regres‐
work. Consequently, the training of BLSTM network can be sion (MLR) [32], ANN [11], LSTM [16], unscented Kalman
summarized as follows. filter (UKF) [2], quantile regression (QR) [33], and quantile
1) Collect the power system data, and separate the data in‐ random forest (QRF) [34]. Indeed, these tools have already
to training and testing datasets. been widely used for the application in the power system.
2) Normalize the input data and prepare the sample data ζ The former three techniques belong to point prediction tech‐
with normal distribution. niques while the latter three refer to probabilistic models.
3) Initialize network parameters: δ μ σζ. Our proposed BLSTM and LSTM network can capture both
4) Perform forward and back propagation on the batch. the data and model uncertainties simultaneously. The base‐
5) Update μ and σ according to the gradients in the net‐ line techniques are fine-tuned for optimal parameter configu‐
work respect to δ. rations to produce the best prediction results.
6) Fine tune the whole network with multiple trials and C. Performance Metrics
output the predicted result. To assess the prediction accuracy of the proposed BLSTM
network, four performance metrics are employed, which are
V. PERFORMANCE EVALUATION shown as:
This section shows the performance evaluation of the pro‐ 1
(y(t) ŷ (t))2
NT ϕ
posed methodology. First of all, we introduce the simulation MSE (33)
t,
setup. Then, we present the scenarios for comparison and
performance metrics for the simulation. Finally, we present 1
(y(t) ŷ (t))2
NT ϕ
RMSE (34)
and discuss the simulation results in a detailed manner. t,
A. Simulation Setup 1
In the simulation, we evaluate the performance of the pro‐
MAE
NT ϕ
_ y(t) ŷ (t) _ (35)
t,
posed model with the historical power system data. We em‐
1 _ y(t) ŷ (t) _
ploy the IEEE 57-bus system as the testing power system __ __u 100%
[29]. Real power system data from [30] are adopted in the
MAPE ϕ
N T t , _ y(t) _
(36)
subsequent case studies. In particular, 13-month data from where y(t) and ŷ (t) are the actual and predicted values at
December 2015 to December 2016 are obtained, which are time t, respectively; and N T is the number of the sampling
divided into a training dataset (the first twelve months) and period. These metrics are used to compare the predicted val‐
a testing dataset (the remaining month) for cross-validation. ue with the actual value for point prediction, namely, mean
The sampling period of all profiles is aggregated into 15 square error (MSE), root mean square error (RMSE), mean
minutes. The historical data cover multiple entries, including absolute error (MAE), and mean absolute percentage error
solar generation, wind generation, and time-varying house‐ (MAPE). For these metrics, a smaller value indicates a bet‐
hold load. We scale the time-varying household load to fit ter performance in the prediction.
the particular IEEE 57-bus test system. In addition, the solar
and wind generators are installed at bus 13 and 37, respec‐ D. Simulation Result
tively. According to the settings of the test system [29], the 1) Comparison of Different Prediction Methods
installed capacities for these two renewable power genera‐ As previously introduced in Section V-B, we evaluate the
tors are set to be 150 MW and 70 MW, respectively. proposed BLSTM network compared with the other six base‐
For the purpose of simulation, the measurements of the line methods. Table I presents the four performance metrics
system data are used to obtain the actual values of voltages, in evaluating these prediction methods. It is apparent that
phase angles, and power by the Newton-Raphson power our proposed BLSTM model outperforms the rest of the pre‐
flow model. Then, we design the BLSTM network as fol‐ diction methods with the smallest values in MSE, RMSE,
lows. We empirically set the sequence length L (in time MAE, and MAPE. Even if both UKF and QRF can have rel‐
atively low MSE and RMSE, MAE and MAPE are both while, we capture both the phase angle and the voltage mag‐
much higher than BLSTM, which indicate larger errors. In nitude of this bus. The normalized results are presented in
addition, we can observe that since BLSTM network consid‐ Figs. 4 and 5. In Fig. 4, it is obvious that the 95% lower
ers both the model and data uncertainties, the predicted re‐ and upper confidence bounds can better fit the trend of the
sult can be better even if for the point estimation. Owing to real values of the phase angle at bus 13. Besides, in Fig. 5,
the characteristics of capturing long-term dependencies of most of the predicted values are close to the real ones of the
time-series input data on LSTM network, the proposed voltage magnitude at bus 13. In addition, the similar results
BLSTM network can help further enhance the prediction ac‐ can also be demonstrated at bus 37, which is integrated with
curacy of the output results. The time complexity of the pro‐ wind power generation. Furthermore, by observing Figs. 4
posed BLSTM thus only depends on the number of sam‐ and 5, although the predicted values cannot fit well with the
pling period N T and the number of features N F. Hence, the sudden fluctuations of the real values, the occurrence of the
time complexity of BLSTM is defined as '(N T N F ). It is ap‐ confidence bounds of the BLSTM network show the effec‐
parent that a lower time complexity reflects a more effective tiveness of capturing such features.
approach.
0.56 95% upper confidence bound
TABLE I 95% lower confidence bound
0.55 Real values
COMPARISON OF DIFFERENT PREDICTION TECHNIQUES IN Predicted values
IEEE 57-BUS SYSTEM
Phase angle (rad)

0.54
Method MSE RMSE MAE MAPE (%) 0.53

MLR 2.100×103 0.0460 0.0320 467.1
0.52
ANN 2.820×104 0.0170 0.0110 163.2
LSTM 1.870×105 0.0043 0.0076 87.4 0.51
UKF 1.490×105 0.0039 0.0034 35.2
0.50
QR 6.060×105 0.0078 0.0125 28.0 0 500 1000 1500 2000 2500 3223
5
QRF 1.194×10 0.0035 0.0029 39.2 Time slot
BLSTM 5.660×106 0.0023 0.0017 24.9 Fig. 4. Probabilistic prediction of phase angle at bus 13.
1.06 95% upper confidence bound

2) Probabilistic Prediction of Net Active Power Imbalance 95% lower confidence bound
In this part, we investigate the net active power imbalance 1.05
Real values
Predicted values
Voltage magnitude (p.u.)
prediction for the entire month using the BLSTM network.

By considering both the model and data uncertainties, the 1.04
normalized prediction results can be obtained as shown in
1.03
Fig. 3. By comparing with the actual values, it is apparent
that the predicted results can better fit the trend of the actual 1.02
profile. According to Table I, we can demonstrate that our
proposed BLSTM network can outperform baseline tech‐ 1.01
niques, which also yields the lowest errors through the four
evaluating metrics. 1.00
0 500 1000 1500 2000 2500 3223
95% upper confidence bound Time slot
0.28
95% lower confidence bound
Real values Fig. 5. Probabilistic prediction of voltage magnitude at bus 13.
0.27 Predicted values
Active power (p.u.)
0.26 4) Tractability for Large Power System

In this part, we investigate the numerical performance of
0.25
the proposed BLSTM network in IEEE 118-bus system. Ta‐
0.24 ble II summarizes the prediction results by adopting differ‐
ent techniques in IEEE 118-bus system. Considering the
0.23
change of test case, all tools are fine-tuned to obtain the best
0.22 parameters to give the best prediction results. By comparing
0 500 1000 1500 2000 2500 3223 with these techniques, it is apparent that our proposed
Time slot
BLSTM network still obtain the lowest MSE, RMSE, MAE,
Fig. 3. Probabilistic prediction of net active power imbalance. and MAPE values, indicating the highest prediction accura‐
cy. Besides, with the increase of the system complexity, the
3) Probabilistic Prediction of Bus Voltage model and data uncertainties may rise for the six baseline
In this part, we further focus on the monthly dynamic techniques. However, according to the results in Table II, the
state variables of each common bus. The relatively large- accurate prediction results can still be obtained by means of
scale solar power generation is integrated at bus 13. Mean‐ BLSTM network, which shows the best performance.
TABLE II the application of other related research fields in the power

COMPARISON OF DIFFERENT PREDICTION TECHNIQUES IN system.
IEEE 118-BUS SYSTEM
Method MSE RMSE MAE MAPE (%)

REFERENCES
MLR 9.20×103 0.0960 0.05800 484.3 [1] J. Zhao, M. Netto, and L. Mili, “A robust iterated extended kalman fil‐
5
ter for power system dynamic state estimation,” IEEE Transactions on
ANN 2.65×10 0.0051 0.00440 36.1 Power Systems, vol. 32, no. 4, pp. 3205-3216, Jul. 2017.
LSTM 2.82×105 0.0054 0.00210 22.8 [2] J. Zhao and L. Mili, “A robust generalized-maximum likelihood un‐
scented Kalman filter for power system dynamic state estimation,”
UKF 2.90×105 0.0054 0.00190 18.7
IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 4,
QR 7.32×105 0.0086 0.00320 48.1 pp. 578-592, Aug. 2018.
QRF 1.60×105 0.0040 0.00150 11.6 [3] H. Livani, S. Jafarzadeh, C. Y. EvrenosoŤglu et al., “A unified ap‐
proach for power system predictive operations using Viterbi algo‐
BLSTM 7.45×106 0.0027 0.00085 7.3 rithm,” IEEE Transactions on Sustainable Energy, vol. 5, no. 3, pp.
757-766, Jul. 2014.
[4] X. Cao, B. Stephen, I. F. Abdulhadi et al., “Switching Markov Gauss‐
5) Robustness of Proposed BLSTM Network ian models for dynamic power system inertia estimation,” IEEE Trans‐
We further evaluate the robustness of the proposed actions on Power Systems, vol. 31, no. 5, pp. 3394-3403, Sept. 2016.
[5] Y. Zhang, J. Wang, and Z. Li, “Interval state estimation with uncertain‐
BLSTM network. In this part, we utilize two different sys‐ ty of distributed generation and line parameters in unbalanced distribu‐
tems, IEEE 57-bus and 118-bus systems, to investigate the tion systems,” IEEE Transactions on Power Systems, vol. 35, no. 1,
robustness of the proposed BLSTM network. Note that, the pp. 762-772, Jan. 2020.
[6] J. Lv, M. Pawlak, and U. D. Annakkage, “Prediction of the transient
input datasets are different between the two systems because stability boundary based on nonparametric additive modeling,” IEEE
the historical dynamic system states are generated based on Transactions on Power Systems, vol. 32, no. 6, pp. 4362-4369, Nov.
the structure information of the test systems. The results are 2017.
[7] N. I. Sapankevych and R. Sankar, “Time series prediction using sup‐
presented in Fig. 6. It is obvious that the loss curve of port vector machines: a survey,” IEEE Computational Intelligence
BLSTM in IEEE 57-bus system converges around 18 ep‐ Magazine, vol. 4, no. 2, pp. 24-38, May 2009.
ochs. Meanwhile, the loss curve of IEEE 118-bus system [8] M. Heidari Kapourchali, M. Sepehry, and V. Aravinthan, “Multivariate
converges around 27 epochs. The results indicate the effec‐ spatio-temporal solar generation forecasting: a unified approach to
deal with communication failure and invisible sites,” IEEE Systems
tiveness of proposed BLSTM network for different systems. Journal, vol. 13, no. 2, pp. 1804-1812, Jun. 2019.
[9] T. Wang, T. Bi, H. Wang et al., “Decision tree based online stability
0.0010 IEEE 57-bus system assessment scheme for power systems with renewable generations,”
IEEE 118-bus system CSEE Journal of Power and Energy Systems, vol. 1, no. 2, pp. 53-61,
0.0008 Jun. 2015.
[10] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol.
Training loss
0.0006 521, pp. 436-444, May 2015.

[11] D. Q. Zhou, U. D. Annakkage, and A. D. Rajapakse, “Online monitor‐
0.0004 ing of voltage stability margin using an artificial neural network,”
IEEE Transactions on Power Systems, vol. 25, no. 3, pp. 1566-1574,
Aug. 2010.
0.0002 [12] Y. Xu, R. Zhang, J. Zhao et al., “Assessing short-term voltage stabili‐
ty of electric power systems by a hierarchical intelligent system,”
IEEE Transactions on Neural Networks and Learning Systems, vol.
0 20 40 60 80 100 120 150
27, no. 8, pp. 1686-1696, Aug. 2016.
Epoch [13] L. Zhang, G. Wang, and G. B. Giannakis, “Real-time power system
Fig. 6. Training loss of BLSTM for different systems. state estimation and forecasting via deep unrolled neural networks,”
IEEE Transactions on Signal Processing, vol. 67, no. 15, pp. 4069-
4077, Aug. 2019.
VI. CONCLUSION [14] C. Zheng, S. Wang, Y. Liu et al., “A novel equivalent model of active
distribution networks based on LSTM,” IEEE Transactions on Neural
In this paper, we propose a Bayesian deep learning ap‐ Networks and Learning Systems, vol. 30, no. 9, pp. 2611-2624, Sept.
proach to predict the dynamic system states in general pow‐ 2019.
[15] A. Gensler, J. Henze, B. Sick et al., “Deep learning for solar power
er systems. First of all, the model respects the implementa‐ forecasting-an approach using autoencoder and LSTM neural net‐
tions of renewable generation and household loads in gener‐ works,” in Proceedings of 2016 IEEE International Conference on Sys‐
al systems through data pre-processing. Then, by the New‐ tems, Man, and Cybernetics (SMC), Budapest, Hungary, Oct. 2016,
pp. 2858-2865.
ton-Raphson power flow model, the system state matrix is [16] Z. Cao, Y. Wang, C. Chu et al., “Scalable distribution systems state es‐
developed. After combining multiple system features as a fi‐ timation using long short-term memory networks as surrogates,” IEEE
nal complete input dataset, we train and validate the Access, vol. 8, pp. 23359-23368, Jan. 2020.
[17] C. Peng, S. Lei, Y. Hou et al., “Uncertainty management in power sys‐
BLSTM network to generate accurate prediction results. In tem operation,” CSEE Journal of Power and Energy Systems, vol. 1,
addition, we capture both the data and model uncertainties in no. 1, pp. 23359-23368, Mar. 2015.
the proposed model. The simulation results indicate that the [18] M. Ghosal and V. Rao, “Fusion of multirate measurements for nonlin‐
accurate prediction can be obtained through our proposed ear dynamic state estimation of the power systems,” IEEE Transac‐
tions on Smart Grid, vol. 10, no. 1, pp. 216-226, Jan. 2019.
BLSTM network for different scales of power systems. Fu‐ [19] M. Sun, T. Zhang, Y. Wang et al., “Using Bayesian deep learning to
ture work will consider exploring a more complex deep capture uncertainty for residential net load forecasting,” IEEE Transac‐
learning model for a practical power system to further cap‐ tions on Power Systems, vol. 35, no. 1, pp. 188-201, Jan. 2020.
[20] K. R. Mestav, J. Luengo-Rozas, and L. Tong, “Bayesian state estima‐
ture the potential features of such system. In addition, the tion for unobservable distribution systems via deep learning,” IEEE
proposed Bayesian deep learning model can be extended to Transactions on Power Systems, vol. 34, no. 6, pp. 4910-4920, Nov.
2019. [33] S. M. Mazhari, N. Safari, C. Y. Chung et al., “A quantile regression-

[21] H. L. Nguyen, “Newton-Raphson method in complex form [power sys‐ based approach for online probabilistic prediction of unstable groups
tem load flow analysis,” IEEE Transactions on Power Systems, vol. of coherent generators in power systems,” IEEE Transactions on Pow‐
12, no. 3, pp. 1355-1359, Aug. 1997. er Systems, vol. 34, no. 3, pp. 2240-2250, May 2019.
[22] A. Wood and B. Wollenberg, Power Generation, Operation, and Con‐ [34] L. Alfieri and P. De Falco, “Wavelet-based decompositions in probabi‐
trol, New Jersey: Wiley-Interscience, 2014. listic load forecasting,” IEEE Transactions on Smart Grid, vol. 11, no.
[23] W. Kong, Z. Y. Dong, Y. Jia et al., “Short-term residential load fore‐ 2, pp. 1367-1376, Mar. 2020.
casting based on LSTM recurrent neural network,” IEEE Transactions
on Smart Grid, vol. 10, no. 1, pp. 841-851, Jan. 2019.
[24] Y. Gal, “Uncertainty in deep learning,” Ph. D. dissertation, University Shiyao Zhang received the B.S. degree in electrical and computer engineer‐
of Cambridge, Cambridge, 2016. ing from the Purdue University, West Lafayette, USA, in 2014, the M.S. de‐
[25] S. Kullback and R. A. Leibler, “On information and sufficiency,” An‐ gree in electrical engineering from University of Southern California, Los
nals of Mathematical Statistics, vol. 22, no. 1, pp. 79-86, Mar. 1951. Angeles, USA, in 2016, and the Ph.D. degree from the University of Hong
[26] M. Gabrié, “Mean-field inference methods for neural networks,” Jour‐ Kong, Hong Kong, China. He is currently a Post-Doctoral Research Fellow
nal of Physics A: Mathematical and Theoretical, vol. 53, no. 22, pp. 1- with the Academy for Advanced Interdisciplinary Studies, Southern Univer‐
74, May 2020. sity of Science and Technology, Shenzhen, China. His research interests in‐
[27] D. Kingma and J. Ba. (2014, Dec.). Adam: a method for stochastic op‐ clude smart energy systems, intelligent transportation systems, optimization
timization. [Online]. Available: https://arxiv.org/abs/1412.6980v8 theory and algorithms, and deep learning applications.
[28] J. Schmidhuber, “A local learning algorithm for dynamic feedforward
and recurrent networks,” Connection Science, vol. 1, pp. 403-412,
James J. Q. Yu received the B. Eng. and Ph. D. degrees in electrical and
Sept. 1995.
electronic engineering from the University of Hong Kong, Hong Kong, Chi‐
[29] Power Systems Test Case Archive. (2020, Dec.). [Online]. Available:
na, in 2011 and 2015, respectively. He was a Post-Doctoral Fellow at the
http://www.ee.washington.edu/research/pstca/
University of Hong Kong from 2015 to 2018. He is an Assistant Professor
[30] Data Platform: Time Series, Open Power System Data. (2020, Dec.)
at the Department of Computer Science and Engineering, Southern Universi‐
[Online]. Available: http://data.open-power-system-data.org/
ty of Science and Technology, Shenzhen, China, and an Honorary Assistant
[31] A. Paszke, S. Gross, S. Chintala et al., “Automatic differentiation in
Professor at the Department of Electrical and Electronic Engineering, the
pytorch,” in Proceedings of the NIPS 2017 Workshop, Co-located with
University of Hong Kong. He currently also serves as the Chief Research
the 31st Annual Conference on Neural Information Processing Systems
Consultant of GWGrid Inc., Zhuhai, Chian, and Fano Labs, Hong Kong,
(NIPS 2017), Long Beach, United States, Dec. 2017, pp. 1-4.
China. His research interests include smart city and urban computing, deep
[32] A. C. S. Hettiarachchige-Don and V. Aravinthan, “Estimation of miss‐
learning, intelligent transportation systems, smart energy systems, forecast‐
ing transmission line reactance data using multiple linear regression,”
ing and decision-making of future transportation systems, and basic artificial
in Proceedings of 2017 North American Power Symposium (NAPS),
intelligence techniques for industrial applications.
Morgantown, USA, Sept. 2017, pp. 1-6.

2021-Bayesian_Deep_Learning_for_Dynamic_Power_System_State_Prediction_Considering_Renewable_Energy_Uncertainty

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

2021-Bayesian_Deep_Learning_for_Dynamic_Power_System_State_Prediction_Considering_Renewable_Energy_Uncertainty

Uploaded by

Copyright:

Available Formats

This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication.

Bayesian Deep Learning for Dynamic Power

P i V i ϕV j (g ij cos θ ij b ij sin θ ij ) (2)

fers to the way of choosing model structure to interpolate or (25)

distribution of the network weights with training data, where 1 s Wt

Phase angle (rad)

Method MSE RMSE MAE MAPE (%) 0.53

1.06 95% upper confidence bound

prediction for the entire month using the BLSTM network.

0.26 4) Tractability for Large Power System

TABLE II the application of other related research fields in the power

Method MSE RMSE MAE MAPE (%)

0.0006 521, pp. 436-444, May 2015.

2019. [33] S. M. Mazhari, N. Safari, C. Y. Chung et al., “A quantile regression-

You might also like

2021-Bayesian_Deep_Learning_for_Dynamic_Power_System_State_Prediction_Considering_Renewable_Energy_Uncertainty

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

2021-Bayesian_Deep_Learning_for_Dynamic_Power_System_State_Prediction_Considering_Renewable_Energy_Uncertainty

Uploaded by

Copyright:

Available Formats

This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication.

Bayesian Deep Learning for Dynamic Power

P i V i ϕV j (g ij cos θ ij  b ij sin θ ij ) (2)

fers to the way of choosing model structure to interpolate or (25)

distribution of the network weights with training data, where 1 s Wt

Phase angle (rad)

Method MSE RMSE MAE MAPE (%) 0.53

1.06 95% upper confidence bound

prediction for the entire month using the BLSTM network.

0.26 4) Tractability for Large Power System

TABLE II the application of other related research fields in the power

Method MSE RMSE MAE MAPE (%)

0.0006 521, pp. 436-444, May 2015.

2019. [33] S. M. Mazhari, N. Safari, C. Y. Chung et al., “A quantile regression-

You might also like

P i V i ϕV j (g ij cos θ ij b ij sin θ ij ) (2)