
Renewable Energy 222 (2024) 119943
Contents lists available at ScienceDirect
Renewable Energy
journal homepage: www.elsevier.com/locate/renene
A cohesive structure of Bi-directional long-short-term memory (BiLSTM)-GRU for predicting hourly solar radiation
Neethu Elizabeth Michael a,*, Ramesh C. Bansal a,b, Ali Ahmed Adam Ismail a, A. Elnady a, Shazia Hasan c
a Department of Electrical Engineering, University of Sharjah, Sharjah, United Arab Emirates
b Department of Electrical, Electronic and Computer Engineering, University of Pretoria, Pretoria, South Africa
c Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai, United Arab Emirates
A R T I C L E I N F O

Keywords:
Bi-directional long short-term memory
Deep learning
Gated recurrent unit
Solar irradiance
Solar forecasting
Stacked LSTM

A B S T R A C T

Uncertain weather scenarios have an impact on the output of solar farms and therefore affect the security of the grid. It is advantageous for power system operators to forecast solar energy to balance the load generation and for optimal power scheduling. The most promising deep-learning techniques to combine weather variables with precise measurements of solar irradiance are not widely discussed. To close this research gap and produce better prediction results, this article aims to formulate and compare two distinctive deep learning algorithms for using time series forecasting approaches to predict solar irradiance. For multivariate data, the forecasting technique Bi-Directional Long Short-Term Memory (BiLSTM), and BiLSTM-GRU (Gated Recurrent Unit) Dropout, are examined in this study. The output results from the proposed model are compared with other benchmark models based on performance error measurements, including Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Coefficient of Determination (R2) and Mean Absolute Percentage Error (MAPE). It was found that the proposed hybrid method, BiLSTM-GRU with dropout outperformed the other methods in terms of solar irradiance predicting accuracy. The analysis presented the best RMSE of 1.55 and MAE of 1.13 for BiLSTM and RMSE of 1.40 and MAE of 0.91 for BiLSTM-GRU architecture using hyperparameter tuning. The comparison results show that the prediction accuracy is improved by tuning the hyperparameters.
1. Introduction

1.1. Background

Photovoltaic energy generation is becoming more significant as the world's overall energy usage rises. The preservation of the environment is an essential factor when migrating from conventional energy sources to renewable energy sources. Consequently, researchers as well as engineers are becoming more interested in clean, non-polluting renewable energy sources, such as solar, wind, and geothermal energy. Predicting renewable energy has been regarded as a critically difficult field for power system planning and sustainable electricity sector growth [1]. As a result of significant capacity additions in 2020 and 2021, China accounted for 38% of the rise in solar generation in 2021. The United States witnessed the second-largest generation growth (17%), while the European Union experienced the third-largest (10%) [2]. Accurate solar energy forecasts provide essential guidance to power vendors and customers for better power management, system safety, and load generation balance. Since solar irradiance is influenced by several environmental factors, such as temperature, wind direction, the sun's position, and the movement of the clouds, forecasting can be challenging.

As a result, many researchers across a wide range of disciplines have concentrated on machine learning and deep learning methods for time series forecasting. In solar power forecasting, a variety of techniques including physical, heuristic, statistical, and machine learning are frequently used [3]. There are two primary strategies for forecasting PV irradiance and power. The first one is the physical approach, which needs prior information on PV material properties in addition to weather information. The second approach is data-driven, which uses real-world data to train and validate coefficients before making forecasts on test data. This suggests that a data-driven approach should only be used once a particular PV module or system is known, along with enough data to train the models. On the other hand, a physical approach can be applied even without the PV system in
* Corresponding author. Department of Electrical Engineering, University of Sharjah, Sharjah, United Arab Emirates.
E-mail address: NMichael@sharjah.ac.ae (N.E. Michael).
https://doi.org/10.1016/j.renene.2024.119943
Received 21 June 2023; Received in revised form 25 November 2023; Accepted 2 January 2024
Available online 2 January 2024
0960-1481/© 2024 Elsevier Ltd. All rights reserved.
operation. Equivalent networks for a PV module and a PV system can be developed from the equivalent circuit model for a single cell. The physical models have a significant flaw in that they demand different input variables that are typically not readily accessible. Furthermore, physical methods with faster prediction speeds cannot depend on large historical data, hence it is not possible to ensure the precision of the models.

1.2. Related solar forecasting approaches and motivation

This subsection details the different forecasting methods for electricity parameters, specifically PV forecasting. The discussion is focused on the following major algorithms.

1.2.1. Statistical and machine learning methods

Heuristic models are suggested in this context to lessen the quantity of data needed. There are three types of data-driven methods: machine learning, statistical, and data-driven heuristics. Since they were not developed using physical presumptions or ideas, they are heuristic models. If they originate from the correlation between weather and power, they are categorized as data-driven models. Several heuristic models are presented and compared in the literature [4–6]. To produce PV electricity forecasts, statistical and machine learning (ML) techniques also rely on historical data. Markov chain (MC), exponential filtering, Naive technique, ARIMA (Autoregressive Integrated Moving Average), SARIMA (seasonal ARIMA), etc. are examples of statistical techniques. However, traditional statistical methods such as fuzzy theory and regression analysis cannot fit the complex nonlinear relationship. A predictive algorithm for ML techniques must therefore be selected based on its empirical capabilities, which ensures that the algorithm can fit complex nonlinear relationships. The more historical data, the better the PV system can be known in terms of how it will operate under various weather conditions, and the more accurate the forecasting will be. Better outcomes may be obtained using more sophisticated techniques, the so-called machine learning (ML) techniques. Decision Tree (DT), Support Vector Machine (SVM), and Artificial Neural Network (ANN) are a few examples of ML methods. For instance, the prediction ability of GA-SVM is superior to that of other models. The support vector machine (SVM) forecast model was developed by William et al. [7–9] for the short-term prediction of a PV system. However, it can be challenging for shallow machine learning techniques to accurately characterize such intricate nonlinear relationships due to their low robustness. ML methods can be further classified into supervised and unsupervised algorithms. The different procedures used to operate an ML algorithm include data collection, feature selection, data augmentation, dataset splitting, and accuracy improvement. Deep learning methods have recently been found to be crucial in the forecasting of renewable energy. Fig. 1 shows the SVM architecture developed by William et al., where xi is the eigenvector, K is the kernel function, Y is the output vector, and b is the bias term [9].

1.2.2. Deep learning methods and advanced hybrid deep learning methods

In problems with higher dimensions, Artificial Neural Networks (ANN) [10–12] can depict complex non-linear behaviors. Some of the deep learning algorithms that compensate for the challenges of machine learning models and are widely discussed in the literature are convolutional neural networks (CNN) and recurrent neural networks (RNN). RNNs are a subclass of ANNs that employ guided feedback connections to record the dynamics of sequences. RNNs display dynamic temporal behavior by processing arbitrary input sequences using their internal memory. This property can be used to forecast the solar irradiance for the succeeding time step considering the input from many previous time steps. RNNs have successfully learned on a large scale due to recent developments in network structures, optimization methods, and graphics processing units (GPUs), which have helped them overcome their conventional restriction of being challenging to train due to having millions of parameters [13]. RNNs can predict multiple time horizons for both short- and long-term solar forecasting. This was achieved by using an end-to-end pipeline to execute the architecture and to evaluate and validate the prediction model's effectiveness. The RNN approach proposed in Ref. [14] allows for multi-horizon forecasts with real-time inputs, for use in the developing smart grid. Therefore, enhanced recurrent neural networks (RNNs) are proposed in recent research that could be successfully applied to prediction problems. A multi-horizon GHI forecasting model using RNN was suggested by Mishra and Palanisamy (2018), who demonstrated an RMSE of 18.57 W/m2 over various forecasting horizons [14].

Due to the serious vanishing/exploding gradient problem, the RNN is unable to sustain long temporal dependence. To reduce these problems, hybrid approaches of RNN are used to develop an independent day-ahead PV power and solar irradiance forecasting model [15,16]. In Ref. [17] the author proposed modified approaches based on temporal correlation to update the forecasting outcomes of the hybrid RNN model. The work addresses the overfitting problem by incorporating weather information into the data [17]. The LSTM networks' recurrent design and memory units enable them to simulate the temporal variations in PV output power [18]. The recommended technique was assessed over the course of a year using hourly datasets from various sites. When compared to other methods, the use of LSTM provides a further reduction in forecasting error. The LSTM-RNN technique was also used for forecasting wind speed along with solar radiation. It was observed that wind speed and solar irradiance forecasting errors (RMSE) are well within permissible bounds [13]. Advanced LSTM with hyperparameter tuning is a recent research topic [19,20] where reduced forecast errors are obtained by eliminating problems of overfitting. Fig. 2 depicts the basic architecture of the CNN utilized by the author in creating hybrid models [11]. The most significant hyperparameters, like the number of layers, learning method, dropout layer, etc., are evaluated and then optimum values are acquired before training the data. By optimizing the hyperparameters the model achieves better outcomes in terms of performance metrics. A deep convolutional long short-term memory was proposed to extract optimal features for accurate prediction of the global horizontal irradiance [21]. Through
Fig. 1. The architecture of developed SVM [9].
Fig. 2. The basic architecture of CNN [19].
experimental tests on case studies in Columbus, Detroit, and San Antonio, the effectiveness of the proposed modified sine cosine algorithm was demonstrated. It is crucial to recognize the various factors that influence the production of solar power, such as humidity, snowfall, temperature, albedo, etc., and their effects on the processes that focus attention on the predicted solar radiation and PV power. Hence multivariate analysis combining two stages of the attention mechanism with the encoder-decoder LSTM model was researched for time-series forecasting problems [22]. In comparison to other models, the proposed model needs to be trained with more layers and parameters. As a result, it is slower than the other models.

Quantile regression averaging (QRA) with an integrated model of the LSTM was used by Mei et al. [23] to predict PV production. Fig. 3 [23] illustrates the precise steps of the LSTM-QRA-based nonparametric probabilistic forecasting model. According to the experimental findings, LSTM-QRA has a better prediction ability. However, LSTM has a complicated structure, numerous constraints, and a lengthy training period. The gated recurrent unit network (GRU), another RNN gating architecture after LSTM, was suggested by K. Cho et al. It has two gates and fewer training factors than an LSTM. High prediction precision is guaranteed by GRU, which also resolves the issue of LSTM over-fitting [16–19]. A hybrid model based on GRU to forecast short-term solar irradiance is proposed in the literature, where a sparrow search algorithm was used for better performance accuracy. It shows a better fit than the LSTM regardless of how stable or erratic the weather was [24–26].

Fig. 3. The basic architecture of LSTM-QRA [23].

Later, a sequential learning-based forecasting model was presented in Ref. [27] that combines GRU and CNN into a single integrated framework for precise energy usage forecasting. Due to the representative feature extraction of CNNs and the efficient gated structure of GRU, the suggested model is an efficient alternative to the previous hybrid models in terms of computational difficulties and forecasting precision. However, improved methods are needed with less computational time and can handle multivariate data. In another study, Xue et al. [28] proposed the sparrow search algorithm (SSA) for optimization. The test results show that the optimization precision, convergence speed, and robustness of this algorithm surpass those of particle swarm optimization (PSO), gray wolf optimizer (GWO), and gravitational search algorithm (GSA). A deep learning technique for forecasting solar radiation that uses Bi-directional long short-term memory (BiLSTM), the sine cosine algorithm (SCA), and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) was proposed by Tian et al. However, more investigation was recommended into the hybrid forecasting model's integration of environmental and meteorological factors associated with solar radiation forecasting. Details of some of the prediction methods employed in the literature are given in Table 1.

Table 1
Brief overview of various algorithms for solar radiation forecasting.
Ref | Year | Algorithm | Error metrics | Prediction time
[29] | 2018 | FoBa, Leap forward, spikeslab, Cubist and bagEarthGC | correlation coefficient (r), coefficient of determination (R2), RMSE, and accuracy | 1 h, 24 h and 48 h
[30] | 2019 | PS, ML, CM | nRMSE, nMBE | Intra-hour, intra-day and day-ahead
[31] | 2019 | CGAE | CRPS (comprehensive performance evaluation) | 24 h
[32] | 2020 | CNN, LSTM | MAE, RMSE, R2 | 15 min
[33] | 2021 | VMD, CNN, RF and LSTM | RMSE, NRMSE, MAE, R2 | 15 min, 1 h
[34] | 2021 | LSTM, BiLSTM, GRU, Bi-GRU, CNN | correlation coefficient (r), cumulative distribution function (CDF), MAPE, MAE, and standard deviation (σ) | Day ahead
[35] | 2022 | WT and BiLSTM | RMSE, MAPE, R2, FS | 24 h

1.2.3. The proposed forecasting model

The context addressed in this work is part of the development of microgrids with optimized renewable energy resources; the general background is shown in Fig. 4. Additionally, it attempts to optimize the electrical infrastructure and increase the energy efficiency of public infrastructure. The energy management system and the smart meters are interlinked through the point of common coupling (PCC). The optimization project will deploy a collection of algorithms for predicting and modeling the levels of power production and power consumption on the various distribution networks. It helps to maintain an optimized balance between the energy available from various sources. It is preferred to use artificial intelligence networks like machine and deep learning to determine the consumption priorities between renewable and conventional energy.

To improve the accuracy of short-term PV irradiance forecasts, this work suggests a new hybrid model that blends BiLSTM and GRU, following previous studies. The study uses clear sky irradiation data from the Solar Radiation Data (SoDa) website. A collection of paid and unpaid solar radiation and solar-related data is available through the SoDa service [36]. With a time step ranging from 1 min to 1 month, the HelioClim-3 Archives service provides time series of irradiation very quickly, and it has been used for this study. Post-processing layers are used to adjust these values when the user launches a request, as shown in Fig. 5. The data consist of multivariate input parameters that are global horizontal irradiance (Wh/m2), Top of Atmosphere (Wh/m2) irradiation over the period at the top of the atmosphere (extraterrestrial), Temperature (K) at 2 m above ground, Relative humidity (%) at 2 m above ground, Pressure (hPa) at ground level, Wind speed (m/s) at 10 m above ground, Wind direction (deg) at 10 m above ground (0 means from North), Rainfall (kg/m2) (=
rain depth in mm), Snowfall (kg/m2), and Snow depth (m). The target variable taken for this analysis is Clear-Sky (Wh/m2), the irradiation over the period if the sky were clear. First, the solar irradiation data is collected from SoDa and the different parameters are evaluated to determine the primary influencing variables of PV irradiance, to improve the calculation ability of the proposed model. Secondly, data preprocessing is performed to improve the efficiency of the proposed methods. Thirdly, the model is trained and tested on the given data. Finally, the forecasted data and the observed data are compared using different error metrics.

Fig. 4. Background of the work.

Fig. 5. Helio Clim processing layers [36].

1.3. Novelty

To the best of the authors' knowledge, the literature has not addressed advanced hybrid deep learning algorithms with multivariate data that produce high prediction accuracy. As a result, this paper goes into detail about the significance of the GRU algorithm, BiLSTM layer, and dropout layer. Additionally, the selected hyperparameters are modified prior to training the deep learning approach to confirm the prediction accuracy. The originality of this work is thereby twofold. Primarily, a novel deep learning technique is developed by integrating the benefits of the GRU and BiLSTM architectures. A particular site in Brazil
is used for the analysis. With a temporal resolution of 15 min, the spatial coverage is restricted to −66° to +66° in both latitude and longitude.

Before training the data, hyperparameters are modified within the designated search space to find their optimal values.

1.4. Contributions

Considering all the aforementioned factors, a novel approach is proposed to predict the solar irradiance fluctuation. This algorithm uses a hybrid approach to preprocess the solar data utilizing the weather variables for multivariate analysis. The forecasted solar irradiance is well suited for the utilities to estimate uncertainties in the solar farm outputs. This is the first study to utilize BiLSTM and GRU for solar irradiance forecasting considering multiple inputs and tuning of parameters. The main contributions of this paper are summarized below:

(i) This research work investigates a novel hybrid time series prediction approach, namely the BiLSTM-GRU with dropout architecture.
(ii) In this research, a hybrid architecture is developed to forecast multivariate data. Multivariate analysis statistically correlates the effects of various external factors that influence solar irradiance.
(iii) The outcome of the proposed solar prediction model has been justified through a relative analysis with benchmark models: ANN, CNN, LSTM, and SVM.
(iv) Considering the significance of hyperparameter tuning, the proposed algorithms select the number of units in each layer, the dropout layer, the initial learning rate, and L2 regularization as the hyperparameters and apply a Bayesian technique to optimize them within the search space. It is observed that the incorporation of hyperparameter tuning has resulted in better prediction accuracy.
(v) This work also performs an accuracy validation with respect to Root Mean Square Error (RMSE), Normalized Root Mean Squared Error (NRMSE), Mean absolute error (MAE), Coefficient of determination (R2), Mean absolute percentage error (MAPE), and mean squared error (MSE).

2. Materials and methods

The flowchart in Fig. 6 provides a framework to explain the methodology of this work. It details the significance of data analysis and preprocessing for time series data before the prediction. Preprocessing data into the proper form is one of the primary obstacles to forecasting. Data normalization and data standardization techniques are mostly used for scaling the data. The data is then divided into training and test data, which are utilized for prediction. The forecasted output is compared with the observed output with respect to performance error metrics.

Fig. 6. The proposed Approach.

2.1. Data evaluation

The solar irradiation identified from Brazil is represented by the processed data used in this study. The data from the Brazilian site is selected for analysis because of data availability and its significance [37]. Brazil's vast territory is heavily exposed to sun radiation due to its tropical location. The Northeastern part of Brazil experiences a higher annual average of daily total sun irradiation than regions like Germany and the Iberian Peninsula, where the solar energy sector is significantly more developed. In addition, the region's characteristic climate and tropical location contribute to the lower inter-annual variability. The summertime peak load in the distribution lines coincides with the period when the incident surface solar irradiation reaches higher levels, which is another significant characteristic of Brazil's solar energy resources. In Brazil, this is a common feature of most large cities and metropolitan areas. Several energy corporations are currently intending to run solar power facilities (PV, concentrated PV, and CSP) in the Northeastern and Mid-West areas of Brazil, as well as assessing the economic feasibility. Reliable scientific data on solar energy assessment and spatial and temporal variations are being sought after by INPE (Brazilian National Institute for Space Research) and numerous institutions to support this effort. Despite the Northeastern Brazilian region's unique temperature and environmental features, it is evident that the global radiation levels are generally consistent.

The National Aeronautics and Space Administration (NASA) team's measurement equipment from the SoDa is used to record these data, with a sample rate of 15 min. Fig. 7 displays (a) Temperature (K), (b) Pressure (hPa), (c) Relative humidity (%) and (d) Wind speed (m/s) from this multivariate data set.

2.2. Data preprocessing

Any peaks and non-stationary elements in the inputs to the forecasting algorithms can cause the solar power production model to be improperly trained, which will result in a large forecasting error. Since most models rely on meteorological data as inputs, which might vary and be unpredictable depending on the weather, these problems are always present. The preliminary processing of the input data can therefore decrease the problem of unsuitable training and the associated expense of computation, improving the accuracy of the model.
Fig. 7. (a) Temperature (K), (b) Pressure (hPa), (c) Relative humidity (%), (d) Wind speed (m/s) and (e) Irradiation (Wh/m2).
Incomplete data is typically the result of a data-gathering process failure, which may be caused by a defective sensor. In such cases [34], the scaled data is first interpolated to fill in missing values. After the issue of missing data is solved, the dataset still contains variables with various scales. These situations, where distinct variables may have different scales, can result in an incorrect prioritization of some of the variables in the model. As a result, feature scaling is done on the dataset to help speed up computation in the algorithm and to enhance convergence rates. Testing also takes less time once the model has been trained on the scaled dataset. According to the state of the art [38], normalization and standardization are the most used methods to prepare forecast inputs. Both methods help the researcher change the values of numerical columns in the dataset to a standard scale to eliminate the undesired characteristics from the dataset. In normalization, the numbers are scaled and shifted to lie between 0 and 1 to reduce redundancy. However, in Z-score scaling, called standardization, the data features are altered by subtracting the mean and dividing by the standard deviation. In this case, the distribution's shape is unaffected because only the mean and standard deviation are changed to those of a standard normal distribution. Data normalization significantly affects any model's output because its primary goal is to assure the accuracy of the data before it is fed into any model. The data is normalized by the expression below,

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

where x′ is the normalized input value, x is the observed input value, and x_min and x_max are the minimum and maximum observed values. Without assuming a particular model relationship, deep learning has the capacity to independently acquire knowledge and produce precise predictions. In this analysis, a total of 33,432 observations for the month of February 2004 are included in this research data. As a part of data analysis, the data were divided into two groups: a training and a testing group. This confirms how well our model functions on the new dataset. The data was divided into training and testing data sets in the proportion 50:50 % (16716 observations each). The testing set was used to evaluate the model, while the training set was used to train it.
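The preprocessing steps described above (gap filling, the min-max normalization of Eq. (1), Z-score standardization, and the 50:50 split) can be sketched in plain Python. This is only an illustrative sketch and not the authors' implementation: the function names, the linear interpolation rule, the use of the population standard deviation, the unshuffled (chronological) split, and the sample temperatures are all assumptions made here.

```python
from statistics import mean, pstdev

def interpolate_gaps(values):
    """Fill None entries linearly between the nearest known neighbours (assumed rule)."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:                      # gap at the start: copy first known value
            filled[i] = values[right]
        elif right is None:                   # gap at the end: copy last known value
            filled[i] = values[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled

def min_max_scale(values):
    """Eq. (1): map the values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                              # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardization: subtract the mean, divide by the (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def chronological_split(rows, train_fraction=0.5):
    """Split without shuffling, so the test block follows the training block in time."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

temperature_k = [295.0, None, 300.0, 302.5, 305.0, 300.0]   # made-up sample values
filled = interpolate_gaps(temperature_k)
scaled = min_max_scale(filled)
standardized = z_score(filled)
train, test = chronological_split(scaled)
```

With 33,432 rows and `train_fraction=0.5`, `chronological_split` yields 16,716 rows per set, matching the counts quoted above. A common precaution, which the paper does not state, is to compute the minimum and maximum on the training portion only and reuse them for the test portion.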
However, the testing set used the same algorithm as the training data set two times, the training can acquire some more characteristics using
even though both have completely different values. Finally, the model data. The hidden layer output, ht at the step time t, of the BiLSTM model
was evaluated based on error metrics. The input data details, and is shown as:
location are given in Table .2. The data is taken from HelioClim-3
ht = [Fht + Bht ] (7)
Archive Database of Solar Irradiance v5 (derived from satellite data)
and meteorological data (MERRA-2/NASA and GFS/NCEP) [36]. The where Fht and Bht are forward and backward hidden sequences. Fig. 8
analysis is performed on a selected location in Brazil. The special depicts the design of bidirectional LSTM for forecasting solar irradiance.
coverage is mentioned and is limited to (− 66◦ to +66◦ both in latitude The input is characterized by x(T) and the output is represented by y(T).
and longitude) with a temporal resolution of 15 mnt. The different layers and the parameters for the BiLSTM model are given
in Table 3.
2.3. BiLSTM A sequence input layer gives the time series sequence data and the
number of features of the input data is equal to the number of input
In many prediction problems, recurrent neural networks are used to variables which is 10 in this study. This architecture used five numbers
analyze sequential data. Nevertheless, RNNs fail to master long-term of BiLSTM layers that are stacked with hidden layers 500,250,200,150
dependencies because of issues with gradient vanishing problem. and 100. Furthermore, drop out architecture is used to reduce the
LSTM architecture is proposed and constructed based on RNNs to overfitting of data with a probability of 0.1. The multivariate data is
address these shortcomings. The core components of an LSTM are three trained by BiLSTM architecture, and the outputs are evaluated by the
gates and cell memory states and are given by the relations from (2) till following metrics of performance. The proposed BiLSTM with dropout
(6). layer forecasting network for analyzing the data set is given in Fig. 9.
( )
fgt = σ wfg [ht− 1 , xt ] + bfg (2)
2.4. Hybrid BiLSTM-GRU network
int = σ(win [ht− 1 , xt ] + bin ) (3)
The GRU modified RNN uses a special gated recurrent neural
outt = σ(wout [ht− 1 , xt ] + bout ) (4) network constructed on an improved LSTM, making it one of the most
popular modified RNN versions. The GRU’s internal structure is similar
cellt = fgt ∗ cellt− 1 + int ∗ tanh (wcell [ht− 1 , xt ] + bcell ) (5) to that of the LSTM, with the difference that it combines the LSTM’s
forget gate and input gate into a single update gate. GRU has update and
ht = outt ∗ tanh(cellt ) (6) reset gate as compared to LSTM [39,40]. The update gate keeps past data
updated [41,42]. Fig. 10 depicts a GRU unit’s fundamental layout.
Where, wfg , win and wout are the weighted matrices, bfg , bin and bout are Hence, LSTM differs from GRU in a few ways. For example, LSTM has
the cell biases. The input is represented by xt , and the hidden state vector three gates in contrast to GRU’s two. Second, the update gate in the GRU
by ht . Since the inputs are evaluated in a precise specific order in LSTM, is created by merging the input and forget gates in LSTM, and the reset
only the previous inputs have an impact. The bidirectional LSTM gate in GRU is used for the hidden state in LSTM.
framework was researched [20] so that future values might also affect the algorithm. Forward and reverse time sequences of the inputs can both be handled by the replicated LSTM architecture. In this way, by training it

Table 2
Outline of input data.

| Item | Description |
|---|---|
| Type of resource/access | Online solar radiation satellite-derived database, via the SoDa website |
| Provider | MINES ParisTech/ARMINES/TRANSVALOR S.A. |
| Outputs | All radiation components over a horizontal, fix-tilted, and normal plane |
| Spatial coverage | Meteosat satellite (−66° to +66° both in latitude and longitude) |
| Temporal coverage | Feb. 2004 onwards (updated in real time, every 15 min) |
| Spatial resolution | 3 km at Nadir, approx. 5 km in Europe (see illustration above) |
| Temporal resolution | 15 min |
| Number of observations | 33432 |
| Input variables considered | Global horizontal irradiance (Wh/m2), Top of Atmosphere (Wh/m2) irradiation over the period at the top of the atmosphere (extraterrestrial), Temperature (K) at 2 m above ground, Relative humidity (%) at 2 m above ground, Pressure (hPa) at ground level, Wind speed (m/s) at 10 m above ground, Wind direction (deg) at 10 m above ground (0 means from North), Rainfall (kg/m2) (= rain depth in mm), Snowfall (kg/m2), and Snow depth (m). The target variable taken for this analysis is Clear-Sky (Wh/m2) |
| Database | HelioClim-3 Archive Database of Solar Irradiance v5 (derived from satellite data) and meteorological data (MERRA-2/NASA and GFS/NCEP) |

Here, $X_t$ is the input data from the training set at time $t$, and $h_t$ is the output of the layer at time $t$. $u_t$ and $re_t$ are the update and reset gates, and $ac_t$ is the activation vector. Equations (8)–(11) give the general equations of the GRU cell. $w_u$ and $w_{re}$ are the weights of the update and reset gates, and $b_u$ and $b_{re}$ are the corresponding biases; $w_{ac}$ and $b_{ac}$ play the same roles for the activation vector. Because GRUs train more quickly owing to their fewer parameters, stacked GRU layers are used in this model. The framework is described by equations (8)–(11).

$$ u_t = \sigma\left(w_u\,[h_{t-1}, X_t] + b_u\right) \qquad (8) $$

$$ re_t = \sigma\left(w_{re}\,[h_{t-1}, X_t] + b_{re}\right) \qquad (9) $$

$$ ac_t = \tanh\left(re_t * w_{ac}\,[h_{t-1}, X_t] + b_{ac}\right) \qquad (10) $$

$$ h_t = (1 - u_t) * ac_t + u_t * h_{t-1} \qquad (11) $$

In this model, the properties of GRU and BiLSTM are combined to form a hybrid BiLSTM-GRU network, as shown in Fig. 11. It uses two layers of BiLSTM and two layers of GRU. The parameters considered for this study are given in Table 4. The model complexity is lower than that of the previous stacked BiLSTM model, and this algorithm trains on the input data more efficiently in terms of performance error. Dropout is one of the regularization methods used in deep neural networks to counter overfitting. It randomly sets the output of hidden units to zero, which reduces the effective complexity of the network; that is, the dropout layer makes certain neurons dormant during training. Hence, the BiLSTM-GRU model uses a dropout layer between the BiLSTM and
GRU layers.
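As an illustration of Eqs. (8)–(11), a single GRU time step can be sketched in plain Python. This is a minimal sketch and not the authors' MATLAB implementation: the function names (`gru_step`, `affine`) and the parameter dictionary keys are hypothetical, and Eq. (10) is read literally, with the reset gate scaling $w_{ac}[h_{t-1}, X_t]$ element-wise. Many GRU formulations instead apply the reset gate to $h_{t-1}$ before the matrix product.

```python
import math


def sigmoid(z):
    """Logistic function used by the update and reset gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vector, bias):
    """Return weights @ vector + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def gru_step(x_t, h_prev, p):
    """One GRU time step following Eqs. (8)-(11).

    p holds illustrative parameters (names are assumptions):
    w_u, w_re, w_ac are hidden x (hidden + input) matrices,
    b_u, b_re, b_ac are bias vectors of length hidden.
    """
    concat = h_prev + x_t  # the concatenation [h_{t-1}, X_t]
    u_t = [sigmoid(z) for z in affine(p["w_u"], concat, p["b_u"])]     # Eq. (8)
    re_t = [sigmoid(z) for z in affine(p["w_re"], concat, p["b_re"])]  # Eq. (9)
    # Eq. (10), read literally: the reset gate scales w_ac [h_{t-1}, X_t]
    pre = affine(p["w_ac"], concat, [0.0] * len(h_prev))
    ac_t = [math.tanh(r * z + b) for r, z, b in zip(re_t, pre, p["b_ac"])]
    # Eq. (11): blend the new activation with the previous state
    return [(1.0 - u) * a + u * h for u, a, h in zip(u_t, ac_t, h_prev)]


# Usage with 2 hidden units and 3 input features (all-zero parameters)
hidden, inputs = 2, 3
zero_matrix = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {"w_u": zero_matrix, "w_re": zero_matrix, "w_ac": zero_matrix,
          "b_u": [0.0] * hidden, "b_re": [0.0] * hidden, "b_ac": [0.0] * hidden}
h_t = gru_step([0.5, -1.0, 2.0], [0.4, -0.2], params)
```

With all-zero parameters both gates equal 0.5 and the activation vector is zero, so Eq. (11) simply halves the previous state, which gives a quick sanity check of the update rule.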
Table 2 (continued). Site details.

| Item | Description |
|---|---|
| Site latitude | −11.411 (positive means North) |
| Site longitude | −51.705 (positive means East) |
| Elevation (m) | 235 |
| Summarization (period of integration) | 15 min |

2.5. Hyperparameter tuning

Parameters are tuned by the trial-and-error method in the benchmark methods. However, the trial-and-error method is time-consuming and may introduce human error. Hence, this study extends the research by using the Bayesian technique for hyperparameter tuning. In the field of artificial intelligence, the Bayesian technique has been extensively employed for hyperparameter tuning. Bayesian optimization differs from Random Search and Grid Search [20,43] in that it accelerates the search by using past results, whereas the other two approaches do not depend on past evaluations. To choose which hyperparameters to evaluate against the objective function, the Bayesian technique builds a probability model of the objective function. This subsection deals with forecasting that incorporates hyperparameter tuning with the proposed BiLSTM and BiLSTM-GRU architectures. In addition, the weather variables are used for multivariate analysis of the proposed models. This research used the number of units in each layer in the range [50, 200], the dropout probability in the range [1e–1, 9e–1], the initial learning rate in the range [1e–3, 1], and the L2 regularization in the range [1e–15, 1e–2] as the hyperparameters. These parameters are tuned within the search space mentioned above. The following steps are performed to optimize these hyperparameters:

i. The goal is to maximize an unknown function f(x) on the given data.
ii. Update the model.
iii. Find the value of new x that maximizes the acquisition function.
iv. Selected hyperparameters are used to develop the proposed model.
v. Train the network.
vi. Evaluate the results of the network by verifying the errors and validating the accuracy.

An analysis of the actual and predicted data for this case study shows how the robustness of the proposed model was evaluated using different error measures. The estimated hyperparameters for the proposed BiLSTM-GRU and BiLSTM models are given in Table 5.

In both the proposed architectures, the layers used are the sequence input layer, BiLSTM layer, dropout layer, GRU layer, dropout layer, GRU layer, dropout layer, fully connected layer, and regression output layer. In the case of BiLSTM, the total number of function evaluations is 30, and the total objective function evaluation time is 7186.68 s. In the case of BiLSTM-GRU, the total number of function evaluations is also 30, and the total objective function evaluation time is 13143.3908 s. The solver used is Adam, the maximum number of epochs is 200, and the mini-batch size is 32.

Fig. 8. Internal structure of BiLSTM model.

Table 3
BiLSTM architecture parameters.

| Layers | Layer parameters | Value | Activation functions |
|---|---|---|---|
| Sequence input layer | Number of features | 10 | |
| BiLSTM 1 | Number of hidden layers | 500 | State activation function: sigmoid |
| BiLSTM 2 | Number of hidden layers | 250 | Input weight initializer: Glorot |
| BiLSTM 3 | Number of hidden layers | 200 | Recurrent weight initializer: orthogonal |
| BiLSTM 4 | Number of hidden layers | 150 | Bias initializer: unit forget gate |
| BiLSTM 5 | Number of hidden layers | 100 | |
| Dropout layer | Probability | 0.1 | |
| Fully connected layer | Number of features | 11 | |
| Regression layer | Loss function | Mean squared error | |

3. Results and discussion

In this work, two BiLSTM architectures are evaluated and compared with benchmark models for solar irradiation forecasting. First, experiments with deep learning and machine learning methods, including ANN, CNN, LSTM, and SVM, are conducted. Brief descriptions of these experiments are provided in the subsections. Then the predicted results with the proposed models with and without
hyperparameter tuning are given. The accuracy is measured in terms of different error metrics: MAE, MAPE, MSE, NRMSE, R2, and RMSE [20].

Fig. 9. BiLSTM Solar irradiance forecasting network.

Fig. 10. Gated recurrent unit architecture.

3.1. ANN

In this case, the neural network is trained on a collection of inputs to produce an associated set of target outputs; this process is known as "function fitting." The fitnet function returns a neural network with the specified hidden layer sizes. After loading the training data, the program adjusts the sizes of the inputs and outputs during training in accordance with the training data. Using the net function, the network is trained. Different training algorithms, namely Scaled conjugate gradient backpropagation (trainscg), Levenberg-Marquardt backpropagation (trainlm), Bayesian regularization backpropagation (trainbr), One step secant backpropagation (trainoss), and BFGS quasi-Newton backpropagation (trainbfg), are used in this study and their results are evaluated. Here, trial-and-error calibration is performed using validation sets. Given stable data and sufficient hidden layer neurons, a two-layer feed-forward network (fitnet) with sigmoid hidden neurons and linear output neurons can fit multi-dimensional mapping tasks arbitrarily well. The Bayesian regularization training procedure outperformed all the other training algorithms applied. However, it takes significant time to train the network within the specified epochs (1000). The ANN architecture is displayed in Fig. 12, and Table 6 shows the training algorithm details.

The algorithms used the following parameters to train on the data, and Table 6 compares the performance of the training algorithms.

i. Training algorithms: Scaled conjugate gradient backpropagation (trainscg), Levenberg-Marquardt backpropagation (trainlm), Bayesian regularization backpropagation (trainbr), One step secant backpropagation (trainoss), and BFGS quasi-Newton backpropagation (trainbfg).
ii. Fitting network: Hidden layer size = 500; Fitting network function: fitnet
iii. Training, validation, and testing division: Train ratio: 65/100, Validation ratio: 5/100, Test ratio: 30/100

According to the values of the error metrics, Bayesian regularization
backpropagation obtained an RMSE of 5.72 and an R2 of 0.99. It performed better than the other training algorithms for solar irradiance for a fixed number of epochs (1000). However, Levenberg-Marquardt backpropagation outperforms Bayesian regularization backpropagation in terms of time taken (1.24 s versus 1:22:45) and number of epochs required (17 versus 1000). It is observed that Bayesian regularization backpropagation and BFGS quasi-Newton backpropagation both run for the full 1000 training epochs, with total training times of 1:22:45 and 4.46 s, respectively.

Fig. 12. ANN architecture [11].

Fig. 11. BiLSTM GRU Solar irradiance forecasting network.

Table 4
BiLSTM GRU architecture parameters.

| Layers | Layer parameters | Value | Activation functions | Initializers |
|---|---|---|---|---|
| Sequence input layer | Number of features | 10 | | |
| BiLSTM 1 | Number of hidden layers | 500 | State activation function: sigmoid | Input weight initializer: Glorot |
| BiLSTM 2 | Number of hidden layers | 300 | | Recurrent weight initializer: orthogonal; bias initializer: unit forget gate |
| GRU 1 | Number of hidden layers | 200 | State activation function: tanh | Input weight initializer: Glorot |
| GRU 2 | Number of hidden layers | 200 | Gate activation function: sigmoid | Recurrent weight initializer: orthogonal; bias initializer: zeros |
| Dropout layer | Probability | 0.1 | | |
| Fully connected layer | Number of features | 11 | | |
| Regression layer | Loss function | Mean squared error | | |

Table 5
Tuned hyperparameters with Bayesian technique.

| Estimated hyperparameters | BiLSTM | BiLSTM-GRU |
|---|---|---|
| Dropout | 0.100555894977223 | 0.26028 |
| L2 regularization | 9.36331861826709e-11 | 1.7404e-12 |
| No. of units | 174 | 119 |
| No. of BiLSTM layers | 1 | 1 |
| Initial learn rate | 0.00278110566558507 | 0.0010025 |

3.2. CNN

The convolutional network used [19] comprises two convolution layers followed by three LSTM layers. In this architecture, the data sequences are converted to data batches using a sequence folding layer. Adding the LSTM layers to the CNN layers improved the prediction accuracy. The output is obtained through the regression layer at the end. The details of the CNN network, its layers, and its training are given in Table 7.

From Table 9, it is observed that the CNN architecture performed better than the ANN algorithms in terms of MAE (8.70 versus 12.78), R2 (0.99 versus 0.97), and RMSE (10.08 versus 30.47).

3.3. LSTM

The LSTM network has gained popularity in recent studies as it deploys
the forget gate, which enables the prediction of continuity. This paper also uses the LSTM model as one of the benchmark models to address the issues related to time series data and generalization ability. LSTM addresses the issue that the classic neural network needs a large number of samples for training and tends to converge to local optimal solutions. As in the previous cases, the input is first standardized to provide regular and orderly data. Then the algorithm uses the sequence input layer, LSTM layer, fully connected layer, and regression layer for predicting the solar irradiance data. The final prediction model was developed by trial-and-error evaluation of the parameters. The details of the LSTM network, its layers, and its training are given in Table 8.

LSTM used the Adam solver, as did the CNN architecture, and obtained error results similar to those of the CNN. LSTM outperformed ANN and gave an R2 of 0.99, the same as CNN.

Table 6
Training algorithm details and error values.

| Performance metrics | Scaled conjugate gradient backpropagation | Levenberg-Marquardt backpropagation | Bayesian regularization backpropagation | One step secant backpropagation | BFGS quasi-Newton backpropagation |
|---|---|---|---|---|---|
| MAE | 33.22 | 12.78 | 3.41 | 107.45 | 85.04 |
| MAPE | 11.94 | 5.44 | 1.05 | 34.97 | 27.34 |
| MSE | 2611.00 | 1635.20 | 32.75 | 2307.40 | 1559.00 |
| NRMSE | 0.39 | 0.23 | 0.02 | 1.16 | 0.95 |
| R2 | 0.92 | 0.97 | 0.99 | 0.29 | 0.52 |
| RMSE | 51.09 | 30.47 | 5.72 | 151.90 | 124.86 |
| No. of epochs | 204 | 17 | 1000 | 73 | 1000 |
| Time | 0.08 s | 1.24 s | 1:22:45 | 0.25 s | 4.46 s |

Table 7
CNN Training algorithm parameters.

| Parameter | Value |
|---|---|
| Training algorithm | Bayesian Regulation backpropagation (trainbr) |
| Fitting network | Mini-batch size: 64; maximum epochs: 30; learning rate: 0.005 |
| Training and testing division | 70:30 |
| Solver | Adam |
| Layers | Sequence folding layer, 2 convolution layers [filter size 5 × 5] with ReLU layer, sequence unfolding layer, 3 LSTM layers with dropout, fully connected layer, and regression layer. The LSTM layers use hidden units in the order 50:50:10. The layers use 'tanh' as the state activation function and 'sigmoid' as the gate activation function. |

Table 8
LSTM Training algorithm parameters.

| Parameter | Value |
|---|---|
| Layers | Sequence input layer, LSTM layer, fully connected layer, and regression layer |
| Maximum epochs | 40 |
| Mini-batch size | 15 |
| Gradient threshold | 1 |
| Initial learn rate | 0.005 |
| Learn rate drop period | 125 |
| Learn rate drop factor | 0.2 |
| Solver | Adam |
| No. of hidden units | 10 |

3.4. SVM

The literature has demonstrated that the support vector machine (SVM) technique, a supervised machine learning forecasting model, offers superior accuracy and speed for tackling nonlinear problems. The support vector regression (SVR) model has gained popularity in recent years for building prediction algorithms in solar and wind forecasting. In this case, the data are read from the CSV file, and the training and target data are selected. The basic architecture is explained in Fig. 1, and the analysis is performed with a support vector machine (SVM) regression model. The model is trained on the given data as a RegressionSVM model using the fitrsvm function in MATLAB. The kernel function used is a Gaussian function. A detailed description of SVR can be found in Ref. [44].

The forecasted and observed data for the BiLSTM architecture are given in Fig. 13(a), Fig. 13(b), and Fig. 13(c), and those for the BiLSTM-GRU architecture are given in Fig. 14(a), Fig. 14(b), and Fig. 14(c). The forecasted values follow the observed values with minimum error for the proposed models with tuned hyperparameters. Table 9 details the performance of the different models on the data considered in this study.

From the comparison between observed and predicted data and the error analysis, it is observed that the proposed hybrid deep learning techniques perform better. Without hyperparameter tuning, BiLSTM-GRU attained an MAE of 4.55, MAPE of 2.37, MSE of 24.88, NRMSE of 0.06, RMSE of 4.99, and R2 of 0.99, while BiLSTM achieved 9.27, 6.22, 133.80, 0.14, 11.57, and 0.98 for MAE, MAPE, MSE, NRMSE, RMSE, and R2, respectively. Hence, it can be concluded that the hybrid model can produce better results for multivariate data. Table 9 shows that the efficient gated structure of GRU in the hybrid BiLSTM-GRU combination achieved the lowest error compared with the benchmark models and a lower error than BiLSTM.

Fig. 13(a). Predicted and Observed values of BiLSTM Architecture.

Table 9
Performance analysis of different benchmark models.

| Method | MAE | MAPE | MSE | NRMSE | R2 | RMSE |
|---|---|---|---|---|---|---|
| ANN | 12.78 | 5.44 | 1635.20 | 0.23 | 0.97 | 30.47 |
| CNN | 8.70 | 5.80 | 101.54 | 0.11 | 0.99 | 10.08 |
| LSTM | 7.25 | 5.37 | 107.64 | 0.13 | 0.99 | 10.37 |
| SVM | 6.45 | 5.02 | 52.53 | 0.65 | 0.99 | 7.25 |
| BiLSTM | 9.27 | 6.22 | 133.80 | 0.14 | 0.98 | 11.57 |
| BiLSTM-GRU | 4.55 | 2.37 | 24.88 | 0.06 | 0.99 | 4.99 |
| BiLSTM (Hyperparameters Tuned) | 1.14 | 0.80 | 2.42 | 0.02 | 0.99 | 1.56 |
| BiLSTM-GRU (Hyperparameters Tuned) | 0.91 | 0.71 | 1.98 | 0.02 | 0.99 | 1.41 |
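The error measures used to compare the models in Table 9 (MAE, MAPE, MSE, NRMSE, R2, and RMSE) can be computed from observed and predicted series as in the following minimal sketch. The function name `error_metrics` is illustrative and the sample values are made up for demonstration. Two choices are assumptions rather than statements of the authors' method: NRMSE is normalized here by the mean of the observed values, and MAPE skips observations equal to zero (for example night-time irradiance) to avoid division by zero.

```python
import math


def error_metrics(observed, predicted):
    """Return MAE, MAPE (%), MSE, RMSE, NRMSE and R2 for two equal-length series."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # Assumption: zero observations are excluded from MAPE
    nonzero = [(o, e) for o, e in zip(observed, errors) if o != 0]
    mape = (100.0 * sum(abs(e / o) for o, e in nonzero) / len(nonzero)
            if nonzero else float("nan"))

    mean_obs = sum(observed) / n
    # Assumption: RMSE normalized by the mean observed value
    nrmse = rmse / mean_obs if mean_obs != 0 else float("nan")

    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot != 0 else float("nan")

    return {"MAE": mae, "MAPE": mape, "MSE": mse,
            "RMSE": rmse, "NRMSE": nrmse, "R2": r2}


# Usage with illustrative values (not data from this study)
metrics = error_metrics([100.0, 200.0, 300.0, 400.0],
                        [110.0, 190.0, 310.0, 390.0])
```

Because RMSE is the square root of MSE, the two columns of a results table can be cross-checked against each other once the metrics are computed this way.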
Fig. 13(b). Predicted and observed values of BiLSTM architecture (Zoomed Plot).

Fig. 13(c). Predicted and Observed Values of BiLSTM Architecture with tuned hyperparameters (Zoomed Plot).

Fig. 14(a). Predicted and Observed values of BiLSTM-GRU Architecture.

Fig. 14(b). Predicted and Observed values of BiLSTM-GRU Architecture (Zoomed Plot).

Fig. 14(c). Predicted and Observed values of BiLSTM-GRU with tuned hyperparameters (Zoomed Plot).

Fig. 15 displays the comparison of MAE, MAPE, and RMSE for the different models. ANN has the maximum MAE of 12.78, and BiLSTM-GRU-Tuned has the lowest MAE of 0.91. According to Table 9, MAPE is highest for BiLSTM (6.22) and lowest for BiLSTM-GRU-Tuned (0.71). RMSE is likewise largest for ANN (30.47) and smallest for BiLSTM-GRU-Tuned (1.41). Hence, we can conclude that BiLSTM-GRU-Tuned performed better than the other methodologies in terms of performance errors. Performance was enhanced by choosing the optimal hyperparameters through tuning for the proposed methods. The proposed architecture predicts the necessary parameter set appropriately for the optimal outcome, and it has high flexibility and robustness. Furthermore, ANN, CNN, SVM, and BiLSTM have been evaluated against the optimal stacked BiLSTM-GRU model. Stacking the layers and using the combined GRU architecture allowed the model with the proposed hyperparameter tuning to perform better than the other methodologies.

Fig. 15. Comparison of different benchmark models.

4. Conclusion

This paper proposes deep learning methods, specifically the BiLSTM architecture and the hybrid BiLSTM-GRU model, to address the forecasting challenges of solar irradiation due to its stochastic nature. This research work is conducted as part of power system optimization using energy storage systems and uses publicly accessible datasets from the SoDa website. To reduce overfitting and the vanishing gradient problem, the algorithm uses the BiLSTM architecture to initially train on the data and then feeds the result into the proposed multilayered stacked GRU with dropout architecture. With hyperparameters tuned using Bayesian optimization, the proposed models give an RMSE of 1.56 and MAE of 1.14 for BiLSTM and an RMSE of 1.41 and MAE of 0.91 for the BiLSTM-GRU architecture. Without hyperparameter tuning, the models give an RMSE of 11.57 and MAE of 9.27 for BiLSTM and an RMSE of 4.99 and MAE of 4.55 for BiLSTM-GRU. The superiority of the proposed method is demonstrated through case studies compared with benchmarks and widely used models for solar forecasting. These studies show lower error, higher accuracy, and lower complexity than other deep learning and machine learning models.

In future work, more hyperparameters will be considered along with
multivariate data analysis to improve the forecasting operation. Future work will also focus on the experimental validation and testing of the proposed methods at various locations to improve the prediction accuracy. Moreover, multiple forecast horizons with finer temporal resolution will be investigated for integration into real-world applications. Future analysis could also use principal component analysis (PCA) with large datasets, since it reduces the training duration and improves the generalization ability of the model.

CRediT authorship contribution statement

Neethu Elizabeth Michael: Conceptualization, Software, Visualization, Writing – original draft, Writing – review & editing. Ramesh C. Bansal: Methodology, Investigation, Supervision, Writing – review & editing, Validation. Ali Ahmed Adam Ismail: Supervision, Writing – review & editing. A. Elnady: Supervision, Writing – review & editing. Shazia Hasan: Supervision, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] A. Sujil, R. Kumar, R.C. Bansal, FCM Clustering-ANFIS-based PV and wind generation forecasting agent for energy management in a smart microgrid, J. Eng. 18 (2019) 4852–4857.
[2] IEA, Solar PV, IEA, Paris, 2022. https://www.iea.org/reports/solar-pv. License: CC BY 4.0.
[3] L. Zhang, R. Wilson, M. Sumner, Y. Wu, Advanced multimodal fusion method for very short-term solar irradiance forecasting using sky images and meteorological data: a gate and transformer mechanism approach, Renew. Energy 216 (2023) 118952.
[4] K. Ding, Z. Ye, T. Reindl, Comparison of parameterization Models for the Estimation of the maximum power output of PV modules, Energy Proc. 25 (2012) 101–107.
[5] W. Zhang, A. Maleki, M.A. Rosen, A heuristic-based approach for optimizing a small independent solar and wind hybrid power scheme incorporating load forecasting, J. Clean. Prod. 241 (2019) 117920.
[6] N.P. Sebi, Intelligent solar irradiance forecasting using hybrid deep learning model: a meta-heuristic-based prediction, Neural Process. Lett. (2022) 1–34.
[7] W. VanDeventer, E. Jamei, G.S. Thirunavukkarasu, M. Seyedmahmoudian, T.K. Soon, B. Horan, S. Mekhilef, A. Stojcevski, Short-term PV power forecasting using hybrid GASVM technique, Renew. Energy 140 (2019) 367–379.
[8] M.N. Akhter, S. Mekhilef, H. Mokhlis, N. Mohamed Shah, Review on forecasting of photovoltaic power generation based on machine learning and metaheuristic techniques, IET Renew. Power Gener. 13 (7) (2019) 1009–1023.
[9] M. Pan, C. Li, R. Gao, Y. Huang, H. You, T. Gu, F. Qin, Photovoltaic power forecasting based on a support vector machine with improved ant colony optimization, J. Clean. Prod. 277 (2020) 123948.
[10] A.K. Yadav, V. Sharma, H. Malik, S.S. Chandel, Daily array yield prediction of grid-interactive photovoltaic plant using relief attribute evaluator based radial basis function neural network, Renew. Sust. Energ. Rev. 81 (2018) 2115–2127.
[11] A.K. Yadav, H. Malik, S.S. Chandel, Application of rapid miner in ANN based prediction of solar radiation for assessment of solar energy resource potential of 76 sites in Northwestern India, Renew. Sust. Energ. Rev. 52 (2015) 1093–1106.
[12] A.K. Yadav, V. Sharma, H. Malik, S.S. Chandel, Daily array yield prediction of grid-interactive photovoltaic plant using relief attribute evaluator based radial basis function neural network, Renew. Sust. Energ. Rev. 81 (2018) 2115–2127.
[13] D. Kumar, H.D. Mathur, S. Bhanot, R.C. Bansal, Forecasting of solar and wind power using LSTM RNN for load frequency control in isolated microgrid, Int. J. Model. Simulat. 41 (4) (2021) 311–323.
[14] S. Mishra, P. Palanisamy, Multi-time-horizon solar forecasting using recurrent neural network, in: 2018 IEEE Energy Conversion Congress and Exposition (ECCE), IEEE, 2018, September, pp. 18–24.
[15] M.N. Akhter, S. Mekhilef, H. Mokhlis, R. Ali, M. Usama, M.A. Muhammad, A.S.M. Khairuddin, A hybrid deep learning method for an hour ahead power output forecasting of three different photovoltaic systems, Appl. Energy 307 (2022) 118185.
[16] P. Kumari, D. Toshniwal, Deep learning models for solar irradiance forecasting: a comprehensive review, J. Clean. Prod. 318 (2021) 128566.
[17] F. Wang, Z. Xuan, Z. Zhen, K. Li, T. Wang, M. Shi, A day-ahead PV power forecasting method based on LSTM-RNN model and time correlation modification under partial daily pattern prediction framework, Energy Convers. Manag. 212 (2020) 112766.
[18] M. Abdel-Nasser, K. Mahmoud, Accurate photovoltaic power forecasting models using deep LSTM-RNN, Neural Comput. Appl. 31 (2019) 2727–2740.
[19] N.E. Michael, M. Mishra, S. Hasan, A. Al-Durra, Short-term solar power predicting model based on multi-step CNN stacked LSTM technique, Energies 15 (6) (2022) 2150.
[20] N.E. Michael, S. Hasan, A. Al-Durra, M. Mishra, Short-term solar irradiance forecasting based on a novel Bayesian optimized deep Long Short-Term Memory neural network, Appl. Energy 324 (2022) 119727.
[21] S.M.J. Jalali, S. Ahmadian, A. Kavousi-Fard, A. Khosravi, S. Nahavandi, Automated deep CNN-LSTM architecture design for solar irradiance forecasting, IEEE Trans. Syst. Man Cybern. Syst. 52 (1) (2021) 54–65.
[22] M. Aslam, S.J. Lee, S.H. Khang, S. Hong, Two-stage attention over LSTM with Bayesian optimization for day-ahead solar power forecasting, IEEE Access 9 (2021) 107387–107398.
[23] F. Mei, J. Gu, J. Lu, J. Lu, J. Zhang, Y. Jiang, T. Shi, J. Zheng, Day-ahead nonparametric probabilistic forecasting of photovoltaic power generation based on the LSTM-QRA ensemble model, IEEE Access 8 (2020) 166138–166149.
[24] P. Jia, H. Zhang, X. Liu, X. Gong, Short-term photovoltaic power forecasting based on VMD and ISSA-GRU, IEEE Access 9 (2021) 105939–105950.
[25] Y. Dai, Y. Wang, M. Leng, X. Yang, Q. Zhou, LOWESS smoothing and Random Forest based GRU model: a short-term photovoltaic power generation forecasting method, Energy 256 (2022) 124661.
[26] M. Aslam, K.H. Seung, S.J. Lee, M.J. Lee, S. Hong, E.H. Lee, Long-term solar radiation forecasting using a deep learning approach-GRUs, in: 2019 IEEE 8th International Conference on Advanced Power System Automation and Protection (APAP), IEEE, 2019, pp. 917–920. October.
[27] M. Sajjad, Z.A. Khan, A. Ullah, T. Hussain, W. Ullah, M.Y. Lee, S.W. Baik, A novel CNN-GRU-based hybrid approach for short-term residential load forecasting, IEEE Access 8 (2020) 143759–143768.
[28] J. Xue, B. Shen, A novel swarm intelligence optimization approach: sparrow search algorithm, Syst. Sci. Control. 8 (1) (2020) 22–34.
[29] A. Sharma, A. Kakkar, Forecasting daily global solar irradiance generation using machine learning, Renew. Sust. Energ. Rev. 82 (2018) 2254–2269.
[30] R. Blaga, A. Sabadus, N. Stefu, C. Dughir, M. Paulescu, V. Badescu, A current perspective on the accuracy of incoming solar energy forecasting, Prog. Energy Combust. Sci. 70 (2019) 119–144.
[31] M. Khodayar, S. Mohammadi, M.E. Khodayar, J. Wang, G. Liu, Convolutional graph autoencoder: a generative deep neural network for probabilistic spatio-temporal solar irradiance forecasting, IEEE Trans. Sustain. Energy 11 (2) (2019) 571–583.
[32] G. Li, S. Xie, B. Wang, J. Xin, Y. Li, S. Du, Photovoltaic power forecasting with a hybrid deep learning approach, IEEE Access 8 (2020) 175871–175880.
[33] D. Cannizzaro, A. Aliberti, L. Bottaccioli, E. Macii, A. Acquaviva, E. Patti, Solar radiation forecasting based on convolutional neural network and ensemble learning, Expert Syst. Appl. 181 (2021) 115167.
[34] S. Boubaker, M. Benghanem, A. Mellit, A. Lefza, O. Kahouli, L. Kolsi, Deep neural networks for predicting solar radiation at Hail Region, Saudi Arabia, IEEE Access 9 (2021) 36719–36729.
[35] P. Singla, M. Duhan, S. Saroha, An ensemble method to forecast 24-h ahead solar irradiance using wavelet decomposition and BiLSTM deep learning network, Earth Sci. Inform. 15 (1) (2022) 291–306.
[36] SoDa Pro [Online]. Available: https://www.soda-pro.com/.
[37] F.J. Lima, F.R. Martins, E.B. Pereira, E. Lorenz, D. Heinemann, Forecast for surface solar irradiance at the Brazilian Northeastern region using NWP model and artificial neural networks, Renew. Energy 87 (2016) 807–818.
[38] W. Lee, K. Kim, J. Park, J. Kim, Y. Kim, Forecasting solar power using long-short term memory and convolutional neural networks, IEEE Access 6 (2018) 73068–73080.
[39] K. Mahmud, S. Azam, A. Karim, S. Zobaed, B. Shanmugam, D. Mathur, Machine learning based PV power generation forecasting in Alice springs, IEEE Access 9 (2021) 46117–46128.
[40] S. Mahjoub, L. Chrifi-Alaoui, B. Marhic, L. Delahoche, Predicting energy consumption using LSTM, multi-layer GRU and drop-GRU neural networks, Sensors 22 (11) (2022) 4062.
[41] M. Sajjad, Z.A. Khan, A. Ullah, T. Hussain, W. Ullah, M.Y. Lee, S.W. Baik, A novel CNN-GRU-based hybrid approach for short-term residential load forecasting, IEEE Access 8 (2020) 143759–143768.
[42] R. Dey, F.M. Salem, Gate-variants of gated recurrent unit (GRU) neural networks, in: IEEE 60th International Midwest Symposium on Circuits and Systems, MWSCAS, 2017, pp. 1597–1600. August.
[43] M. Munem, T.R. Bashar, M.H. Roni, M. Shahriar, T.B. Shawkat, H. Rahaman, Electric Power Load Forecasting Based on Multivariate LSTM Neural Network Using Bayesian Optimization, IEEE Electr. Power Energy Conf EPEC, 2020, pp. 1–6.
[44] Najeebullah, A. Zameer, A. Khan, S.G. Javed, Machine Learning based short term wind power prediction using a hybrid learning model, Comput. Electr. Eng. 45 (2015) 122–133.