

Expert Systems With Applications 232 (2023) 120844

Contents lists available at ScienceDirect

Expert Systems With Applications


journal homepage: www.elsevier.com/locate/eswa

Mode decomposition based large margin distribution machines for sediment load prediction

Barenya Bikash Hazarika a, Deepak Gupta b,*

a Department of Computer Science & Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh 522502, India
b Department of Computer Science & Engineering, Motilal Nehru National Institute of Technology Allahabad, Prayagraj, Uttar Pradesh 211004, India

ARTICLE INFO

Keywords:
Sediment load prediction
Mode decomposition
Large margin distribution machine
Regression
Noise

ABSTRACT

Precise suspended sediment load (SSL) prediction is essential for irrigation, hydropower and river management practices. However, owing to external factors such as high altitude, heavy monsoon and tropical climate conditions, the collected data may contain noisy samples, which makes it challenging to accurately predict the SSL in rivers. To diminish the influence of noise, empirical mode decomposition (EMD)-based techniques can be adopted. Moreover, the large margin distribution machine-based regression (LDMR) can deal efficiently with noisy datasets, as it simultaneously minimizes the insensitive loss and the quadratic loss. The significance of this work lies in its novel integration of EMD-based techniques and LDMR-based models, which addresses the challenges posed by noisy and non-stationary sediment load data and improves the accuracy of SSL prediction in rivers, with practical implications for irrigation, hydropower and river management. The main advantage of the least squares LDMR (LSLDMR) approach is that it solves a system of linear equations rather than a large quadratic programming problem (QPP), unlike SVR, TSVR and LDMR, which makes it computationally efficient. It is well known that sediment load data are complex and non-stationary in nature. Therefore, for daily SSL prediction, the LDMR and LSLDMR models are embedded with two different decomposition techniques, EMD and ensemble EMD (EEMD), to handle the nonlinear and non-stationary characteristics of sediment load data and to increase prediction ability. The results of the proposed EMD-LDMR, EEMD-LDMR, EMD-LSLDMR and EEMD-LSLDMR are compared with the conventional support vector regression (SVR), twin SVR (TSVR), LDMR and LSLDMR. The performance of the models is evaluated using the mean absolute error (MAE), root mean square error (RMSE), symmetric mean absolute percentage error (SMAPE), Willmott's index (WI), correlation coefficient (CC) and R2. Better or comparable results indicate the efficiency of the suggested models. For graphical visualization, scatterplots, prediction error plots and violin plots are shown. Numerical results show that the hybrid models deliver excellent prediction performance, and the overall analysis suggests using EEMD-LSLDMR for SSL estimation.

1. Introduction

Estimating the river suspended sediment load (SSL) is one of the major problems in river engineering practices as well as hydrology. Suspended sediment in rivers can be defined as sediment that is transported by fluid and is fine enough that turbulent whirlpools can outweigh the subsiding of the sediment particles within the river, triggering them to become suspended. Sediment deposition in rivers is an eminent problem that has a negative effect on water quality and causes pollution in rivers. Usually, the SSL data forms a nonlinear and complex structure due to different external factors such as rainfall, streamflow and temperature (Moeeni & Bonakdari, 2018). The importance of accurate estimation of SSL is increasing day by day, especially in flood-affected zones. Therefore, many researchers have been working on predicting SSL accurately using different techniques. In the past decades, researchers used a few mathematical models for SSL estimation.

* Corresponding author.
E-mail address: deepakg@mnnit.ac.in (D. Gupta).

https://doi.org/10.1016/j.eswa.2023.120844
Received 18 May 2022; Received in revised form 9 June 2023; Accepted 12 June 2023
Available online 21 June 2023
0957-4174/© 2023 Elsevier Ltd. All rights reserved.

However, these models had a long response time and are complex. Because of this, a few classic models like the multiple linear regression (MLR) and sediment rating curve (SRC) were applied for SSL prediction (Kisi, 2005). An SRC is a mathematical formula that relates the water discharge of a river to the sediment load that is being transported. By measuring the water discharge at a particular point in the river and using the rating curve, scientists can estimate the sediment load that is being transported at that point. However, these models had limited capability to deal with the non-linearity and non-stationarity in SSL datasets. Therefore, researchers shifted their attention towards artificial intelligence (AI)/machine learning (ML) based models for SSL prediction. A detailed discussion can be found in Gupta et al. (2021). Among the various AI/ML-based models, the artificial neural network (ANN) (Hazarika et al., 2020a, Rezaei et al., 2021), support vector machine (SVM) and its variants (Essam et al., 2022, Doroudi et al. 2021), wavelet analysis (Hazarika et al., 2020b, Shiri et al. 2022), neuro-fuzzy (Mohanta et al., 2021, Hamaamin et al., 2019), adaptive neuro-fuzzy inference system (ANFIS) (Babanezhad et al., 2021, Darabi et al., 2021) and extreme learning machine (ELM) (Hazarika et al., 2020b, Gupta et al., 2020) based regressors have been widely explored for SSL prediction. Banadkooki et al. (2020) predicted the river SSL in Gooranrood Basin, Iran using ANN and the ant lion optimization algorithm. Salih et al. (2020) predicted the river SSL for the Delaware River, USA using a few novel machine learning models. Nhu et al. (2020) suggested a new model called random space for monthly SSL estimation. Hazarika et al. (2020b) predicted the SSL for the Tawang Chu River, India using two wavelet-based hybrid models. Ehteram et al. (2021) suggested a hybrid multi-objective whale algorithm-based ANN model for SSL prediction in Gooranrood Basin, Iran. Meshram et al. (2021) suggested novel iterative classifier optimizer-based random forest hybrid models for SSL estimation in the Seonath river basin located in India. Panahi et al. (2021) hybridized the Black widow optimizer with soft computing models for SSL forecast; the dataset was collected from the Telar river, Iran. Rezaei et al. (2021) predicted the SSL using a few artificial intelligence-based methods. Hazarika et al. (2021) suggested two coiflet wavelet-based hybrid models for river SSL prediction in the Tawang Chu river, India. Very recently, Esmaeili-Gisavandani et al. (2022) applied three different types of discrete wavelet transform for predicting the daily SSL in the Navrood watershed, Iran. Essam et al. (2022) estimated the SSL in Peninsular Malaysia using the powerful SVM model and deep learning (DL) models. Latif et al. (2023) explored the ability of a few DL as well as ML models for sediment load estimation. Cheng et al. (2023) tested how vegetation and climate affect monthly sediment load using partial least squares-structural equation modelling. An extensive survey of AI/ML models for river SSL prediction can be found in Gupta et al. (2021) and Tao et al. (2021).

There are numerous factors that may affect sediment load, including the size and shape of the river channel, the slope of the river bed, the volume and velocity of water flow, and the type and amount of sediment present in the river. A monthly streamflow dataset is thus made up of trend, episodic, and noise components (Zhang et al., 2015). Previously, streamflow series were predicted directly without data pre-processing in some studies (Bittelli et al., 2010), which may have resulted in the loss of significant information that is present in the original time series (T-S). According to Chou and Wang (2004), it is hard to imitate the alteration mechanisms of streamflow data using a prediction model with only a single mixed-frequency component. To improve prediction accuracy, monthly streamflow time series must be preprocessed. For that, Wiener (1949) created the Fourier analysis with the associated spectrum analysis for decomposing stationary T-S. However, the major flaw of this method is that it analyses the time-series data in the frequency domain, which may lead to loss of a few major pieces of information from the original data. Therefore, the concept of the wavelet was suggested by Morlet et al. (1982), which analyses both low frequency and noise. However, the empirical mode decomposition (EMD), which was introduced by Huang et al. (1998), unlike wavelets does not need prearranged basis functions. It converts the data into a series of intrinsic mode functions (IMFs). Hence, it can be considered an ideal method compared to wavelets for analyzing non-stationary as well as non-linear information. Very recent studies on EMD can be found in Yang and Chen (2019), Gupta and Singh (2023), Xie et al. (2023), Yang et al. (2023) and others. However, the conventional EMD has an issue of mode-mixing (MM). Hence, to overcome this, an upgraded version of EMD called ensemble EMD (EEMD) was suggested by Wu and Huang (2009). EEMD highly reduces the mode-mixing problem as it is a multiple-trial process. EEMD has also been successfully applied with several ML-based models for T-S prediction (Yang & Yang 2020; Ali et al., 2020; Chen et al., 2021). Ren et al. (2014) hybridized the EMD and its variants with SVR for T-S forecasting. Yaslan and Bican (2017) hybridized the EMD with support vector regression (SVR) for electrical load forecasting. Fan et al. (2013) hybridized EMD with SVR for electric load prediction. Khan et al. (2021) used the ELM embedded with EEMD for electricity price forecasting. Díez-García et al. (2022) explored the potential of EMD to mitigate radio frequency interference in microwave radiometry. Recently, a novel decomposition technique popularly known as feature mode decomposition (FMD) was suggested by Yonghao, Zhang, Li, Lin, & Zhang (2022).

Recently, a novel regressor termed large margin distribution machine-based regression (LDMR) was proposed by Rastogi et al. (2020). It is well known that SVR minimizes the ε-insensitive loss and ignores the samples that fall inside the ε-insensitive tube. Moreover, the least squares SVR (LSSVR) minimizes the quadratic loss (QL) but fails to show good generalization performance on noisy datasets. The LDMR, however, concurrently minimizes the ε-insensitive loss and the QL function. It takes the advantage of SVR and LSSVR by using the full information of the training sample while avoiding overfitting at the same time. However, it is computationally less efficient, as it has to solve quadratic programming problems (QPPs) to find the optimum results. To address this issue, Gupta and Gupta (2021) suggested a computationally efficient least squares LDMR (LSLDMR) model. LSLDMR solves a system of linear equations rather than solving the QPP, which makes it more time efficient than LDMR. These LDMR-based models exploit both the margin mean and the margin variance, and therefore they can efficiently handle noisy datasets. Since SSL datasets may contain noise, these LDMR-based models can show excellent generalization performance for SSL datasets. Reisenbüchler et al. (2021) developed an ANN model for river sediment management. Sharafati et al. (2020) predicted the river SSL using a few ensemble-based ML models. AlDahoul et al. (2022) explored a few ML models for sediment load prediction. Karami et al. (2022) suggested a new approach using ANFIS and ant colony optimization for predicting SSL in reservoir dams. Zhao et al. (2021) predicted the SSL using decomposition and a multi-objective evolutionary optimization technique.

The previous literature shows that the EMD and EEMD-based pre-processed models can be highly efficient for SSL estimation. Moreover, the LDMR can efficiently handle feature noise in the dataset. Additionally, the recently proposed LSLDMR can show high prediction performance with low computational cost compared to LDMR. Moreover, to the best of our knowledge, LSLDMR has never been tested for time-series analysis. The nature of the data on sediment load is complex and non-stationary. Therefore, to address the nonlinear and non-stationary characteristics of sediment load data on the prediction results, and to get the maximum benefit from these decomposition techniques and LDMR-based models, the LDMR and least squares LDMR (LSLDMR) models are embedded with two alternative decomposition approaches, EMD and ensemble EMD (EEMD). We have proposed 4 different hybrid models for SSL prediction: EMD-LDMR, EEMD-LDMR, EMD-LSLDMR and EEMD-LSLDMR.

The originality and importance of this work lie in its novel integration of empirical mode decomposition (EMD)-based techniques and LDMR models to address the challenges posed by noisy and non-stationary suspended sediment load (SSL) data in river systems. The

significance of accurate SSL prediction is emphasized, as it is crucial for various applications such as irrigation, hydropower, and river management. Furthermore, we employ large margin distribution machine-based regression (LDMR) models to efficiently deal with noisy datasets. LDMR simultaneously minimizes the insensitive loss and the quadratic loss, making it robust to noise. Additionally, the least squares LDMR (LSLDMR) approach is highlighted for its computational efficiency, as it solves a system of linear equations rather than a large quadratic programming problem (QPP), unlike support vector regression (SVR), twin SVR (TSVR) and LDMR. The prediction results of the proposed EMD-LDMR, EEMD-LDMR, EMD-LSLDMR, and EEMD-LSLDMR models are compared with conventional regression models, including SVR, TSVR, LDMR, and LSLDMR. Evaluation metrics such as the mean absolute error (MAE), root mean square error (RMSE), symmetric mean absolute percentage error (SMAPE), Willmott's index (WI), correlation coefficient (CC), and R2 are used to assess the prediction accuracy. The MAE provides a simple and robust measure of prediction accuracy in regression models. RMSE measures the average magnitude of errors while considering the squared differences between predicted and actual values, making it suitable for penalizing larger errors. SMAPE provides a balanced and interpretable measure of prediction accuracy, accounting for both the magnitude and percentage of errors in a symmetric manner. WI quantifies the agreement between observed and predicted values, providing a comprehensive assessment of model performance that considers both bias and variability. The CC measures the strength and direction of the linear relationship between two variables, which is crucial for understanding patterns, making predictions, and identifying potential cause-and-effect relationships. R2 quantifies the proportion of variance in the dependent variable that is explained by the independent variables, indicating the goodness of fit of the regression model. Graphical visualization through scatterplots, prediction error plots, and violin plots is also provided. Overall, the originality of this work lies in the integration of EMD-based techniques and LDMR models to address the challenges posed by noisy and non-stationary SSL data. Its importance stems from its practical implications for various applications, including irrigation, hydropower, and river management, where accurate SSL prediction is crucial for effective resource management.

The prime contributions of this work are:

a) The EMD based LDMR (EMD-LDMR) and EMD based LSLDMR (EMD-LSLDMR) are proposed.
b) The EEMD based LDMR (EEMD-LDMR) and EEMD based LSLDMR (EEMD-LSLDMR) are proposed.
c) The LSLDMR model is freshly explored for SSL prediction.
d) The SSL is predicted for the Tawang Chu river of Arunachal Pradesh, India.

The remainder of this paper is organized as follows: Section 2 demonstrates the related works. In Section 3, the proposed models are discussed. The experimental setup and dataset description are presented in Section 4, together with the numerical outcomes. Lastly, we conclude the paper in Section 5.

2. Related works

In this section, we briefly show the mathematical expressions of the base models: LDMR, LSLDMR, EMD and EEMD. Let the training data points be {xi}, i = 1, ..., m, where m stands for the total number of data points, xi ∈ R^n indicates an input training data point and yi ∈ R its output. Also, assume A ∈ R^{m×n} is the training data matrix whose i-th row vector is xi^t, and y = (y1, ..., ym) indicates the original values. Moreover, consider e to be the vector of ones.

2.1. The LDMR

The LDMR is a fresh approach suggested by Rastogi et al. (2020) for regression problems. The key advantage of LDMR is that it shows high efficiency while dealing with datasets that contain noise and outliers. This is because it simultaneously minimizes the quadratic loss as well as the ε-insensitive loss function. The primal statement of LDMR may be expressed as:

min (c/2)‖w‖^2 + (η/2)‖y − (K(U, U^t)w + eb)‖^2 + C e^t(γ + δ),
s.t. y − (K(U, U^t)w + eb) ≤ eε + γ,
(K(U, U^t)w + eb) − y ≤ eε + δ,
and γ, δ ≥ 0. (1)

where K(U, U^t) of order m is the kernel matrix whose (i, j)-th element is K(U, U^t)ij = k(xi, xj) ∈ R, and K(x, U^t) = (k(x, x1), ..., k(x, xm)) ∈ R^m for a vector x ∈ R^n. ε and η are user-specified parameters, C is the model parameter, w indicates the unknown weight vector and b represents the bias. γ and δ are slack variables. Assuming z = [w; b], ‖w‖^2 can be rewritten as ‖w‖^2 = z^t I0 z, where I0 is the identity matrix of order m + 1 with its last diagonal entry set to zero.

Finally, the unknowns z = [w; b] can be calculated as:

z = [w; b] = (cI0 + ηM^tM)^−1 M^t(ℓ1 − ℓ2 + y). (2)

where ℓ1 and ℓ2 are the Lagrangian multipliers and M = [K(U, U^t) e].
For any new example x ∈ R^n, the regressor for LDMR can be expressed as:

f(x) = K(x^t, U^t)w + b. (3)

Despite showing high generalization ability, LDMR takes high computation time to perform its operations.

2.2. The LSLDMR

To reduce the computational cost of LDMR, very recently a novel LSLDMR was proposed by Gupta and Gupta (2021). LSLDMR deals with a system of linear equations rather than solving a QPP, which makes it time effectual compared to LDMR. The primal expression of LSLDMR may be presented as:

min (c/2)‖w‖^2 + (η/2)‖y − (K(U, U^t)w + eb)‖^2 + C e^t(δ)^2,
s.t. (K(U, U^t)w + eb) − y = eε + δ.

where C represents the model parameter, w is the unknown weight vector and b is the bias. δ represents the slack variable; ε and η are user-defined parameters.
Finally, the unknowns z = [w; b] can be determined as:

z = [w; b] = (cI0 + (η + 2C)M^tM)^−1 M^t((η + 2C)y + 2eε). (4)

where M = [K(U, U^t) e]. For any unknown input x ∈ R^n, the regressor for LSLDMR may be obtained as:

f(x) = K(x^t, U^t)w + b. (5)
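The closed-form solution in Eq. (4) amounts to one linear solve. Below is a minimal NumPy sketch of that solve under a Gaussian kernel; the function names, parameter defaults and variable names are ours, so this is an illustration of the structure of Eqs. (4)-(5), not the authors' reference implementation:

```python
import numpy as np

def gaussian_kernel(A, B, mu):
    # k(a_i, b_j) = exp(-mu * ||a_i - b_j||^2)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-mu * d2)

def lsldmr_fit(X, y, c=1e-2, eta=1e-4, C=1.0, eps=0.01, mu=1.0):
    """Eq. (4): z = (c*I0 + (eta+2C) M^t M)^(-1) M^t ((eta+2C) y + 2 e eps)."""
    m = X.shape[0]
    K = gaussian_kernel(X, X, mu)
    M = np.hstack([K, np.ones((m, 1))])   # M = [K(U, U^t)  e]
    I0 = np.eye(m + 1)
    I0[-1, -1] = 0.0                      # I0 penalizes w but not the bias b
    e = np.ones(m)
    lhs = c * I0 + (eta + 2.0 * C) * M.T @ M
    rhs = M.T @ ((eta + 2.0 * C) * y + 2.0 * e * eps)
    z = np.linalg.solve(lhs, rhs)         # one linear system, no QPP
    return z[:-1], z[-1]                  # w, b

def lsldmr_predict(X_train, X_new, w, b, mu=1.0):
    # Eq. (5): f(x) = K(x^t, U^t) w + b
    return gaussian_kernel(X_new, X_train, mu) @ w + b
```

Because fitting reduces to a single solve of an (m + 1) x (m + 1) system, the cost advantage over the QPP-based SVR, TSVR and LDMR follows directly.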

2.3. The EMD

EMD (Huang et al., 1998), sometimes known as the Hilbert–Huang transform (HHT) (Huang and Wu, 2008), is a technique for decomposing a signal into numerous IMFs and a trend residue. EMD is a method for obtaining instantaneous frequency data from nonstationary and nonlinear data sets using an empirical methodology. HHT works similarly to wavelet analysis, but it is a posteriori and has an empirical theoretical underpinning. IMFs are an assemblage of a few adaptive basis functions with an additional residual series that extract intrinsic oscillatory modes from a non-stationary signal. EMD is used to sift the initial signal 'x(s)' to extract the true IMFs and the residue r(s) (Naik et al., 2018; Kisi et al., 2014). The benefits of EMD, when compared to wavelet decomposition and singular value decomposition, are as follows: 1) Flexibility. The basis function of EMD can be automatically created, unlike the wavelet transform, which needs the wavelet basis function to be pre-selected. As a result, it is more suited for evaluating complicated signals. 2) Resilience (Shao et al., 2021).

The decomposition of the original signal x(s) can be seen as the summation of the IMFs and the residual:

x(s) = Σ_{i=1}^{n} Mi(s) + rn(s), (6)

where s is the signal, Mi(s) is the i-th IMF and rn(s) denotes the residual of the signal.

2.4. The EEMD

The self-adaptive decomposition method known as EEMD was created expressly for the analysis of nonlinear and nonstationary data. By breaking down the original time series into features that reflect distinct spectral components, which are simpler to forecast, it is used to improve prediction performance (Nguyen et al., 2021; Mhamdi, Poggi, & Jaidane, 2011). The conventional EMD has a problem of mode-mixing (MM). To overcome this, the ensemble EMD (EEMD) was suggested by Wu and Huang (2009). EEMD is a completely noise-assisted data analysis method that breaks down embedded oscillations at different scales into IMFs and a residual variable. To effectively prevent EMD's MM problem, EEMD makes extensive use of the statistical properties of Gaussian white noise (GWN) (Santhosh et al., 2018, Prasad et al., 2018). EEMD employs the following steps:

Step 1: The GWN-added signal is generated using:

xj(s) = x(s) + ψj, (7)

where ψj indicates the GWN.
Step 2: The GWN-added signal is decomposed into a few IMFs and one residue, which can be expressed as:

xj(s) = Σ_{i=1}^{n} Mij(s) + rnj(s). (8)

Equations (7) and (8) are reiterated and, finally, the mean of the corresponding IMFs is taken.

3. Proposed models

A divide-and-conquer technique breaks down a problem into a few sub-problems of the same or a related type until they are simple enough to tackle on their own. After that, the sub-problems' solutions are merged to solve the main problem. This strategy is followed by the EMD and EEMD techniques.

Fig. 1. Schematic diagram of the proposed models.

The procedure of the proposed EMD based LDMR models is as follows:

i. The SSL time-series dataset is decomposed using EMD into 2 IMFs and one residue.
ii. Integrate all the results by summation to formulate an output for the SSL dataset.
iii. The output is provided as an input to the LDMR and LSLDMR models to generate the EMD-LDMR and EMD-LSLDMR models respectively.

Additionally, the procedure of the proposed EEMD based LDMR models is as follows:

i. The SSL time-series dataset is decomposed using EEMD into a few IMFs and a residue by considering the number of ensemble members as 100, the maximum iterations as 500 and the signal-to-noise ratio as 0.2 (Wang et al., 2013, Ouyang et al., 2016).
ii. Aggregate all the outputs by summation to formulate an ensemble output for the SSL dataset.
iii. The ensemble output is provided as an input to the LDMR and LSLDMR models.

The overall schematic diagram of the proposed models is shown in Fig. 1. Firstly, the dataset is decomposed using EMD or EEMD. The decomposed data is normalized using x̄lm = (xlm − xl^min)/(xl^max − xl^min), where x̄lm is the normalized value of xlm, and xl^max and xl^min are the maximum and minimum values respectively. The normalized data is separated into 70% training and 30% testing splits. The LDMR and LSLDMR models are trained on the normalized SSL dataset. Based on the optimal parameters, which are determined using the common k-fold cross-validation procedure (Anguita et al., 2012), the model performance is calculated. The performance of all addressed models is evaluated using the root mean square error (RMSE), mean absolute error (MAE), symmetric mean absolute percentage error (SMAPE), Willmott's index (WI), correlation coefficient (CC) and coefficient of determination (R2).

4. Experimental setup and dataset description

We have performed the simulations on a computer with 32 GB RAM and a 3.20 GHz Intel i7 processor, running Windows 7 OS and MATLAB 2019a. The solutions to the QPPs of SVR, TSVR, LDMR and the proposed EMD-LDMR, as well as the EEMD-LDMR, are determined by the "quadprog" function in MATLAB.
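The noise-assisted averaging of Eqs. (7)-(8) can be sketched in a few lines. For brevity, a crude two-component split based on a centered moving average stands in for the full EMD sifting, so this is only a schematic of the ensemble logic; all names, the window length, and the noise scaling by the signal's standard deviation are our assumptions:

```python
import numpy as np

def crude_decompose(x, win=11):
    # Stand-in for EMD sifting: a centered moving average plays the slow
    # residue r(s); the remainder is a single fast "IMF-like" component,
    # so detail + residue reconstructs the input exactly, as in Eq. (6).
    residue = np.convolve(x, np.ones(win) / win, mode="same")
    return np.stack([x - residue, residue])

def eemd_like(x, n_trials=100, noise_std=0.2, seed=0):
    # Eq. (7): x_j(s) = x(s) + psi_j (Gaussian white noise per trial).
    # Eq. (8): decompose each noisy copy, then average the corresponding
    # components over all trials; the added noise cancels as trials grow.
    rng = np.random.default_rng(seed)
    comps = np.zeros((2, x.size))
    for _ in range(n_trials):
        xj = x + noise_std * x.std() * rng.standard_normal(x.size)
        comps += crude_decompose(xj)
    return comps / n_trials
```

In practice the stand-in decomposition would be replaced by a proper EMD sifting routine; the trial count and noise level correspond to the ensemble members and signal-to-noise ratio quoted in the EEMD procedure above.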

Fig. 2. Study area.
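The min-max normalization and the chronological 70/30 train/test split described in Section 3 can be sketched as follows (the helper names are ours):

```python
import numpy as np

def minmax_normalize(x):
    # x_norm = (x - x_min) / (x_max - x_min), mapping the series into [0, 1]
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def train_test_split_ts(x, train_frac=0.7):
    # Chronological split: the first 70% of the series is used for training,
    # the final 30% for testing, so no future samples leak into training.
    n_train = int(len(x) * train_frac)
    return x[:n_train], x[n_train:]
```

Note that for a genuinely leakage-free pipeline, the minimum and maximum would be taken from the training portion only; the sketch above mirrors the simpler whole-series normalization stated in the text.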

4.1. Parameters and kernel selection

The optimum parameters for these models are chosen using 10-fold cross-validation. The C parameter of the SVR and TSVR models is selected from {10^−5, 10^−3, ..., 10^3, 10^5}. However, in the case of LDMR, LSLDMR and the proposed EMD-LDMR, EEMD-LDMR, EMD-LSLDMR and EEMD-LSLDMR models, the optimal parameters C, C1 = C2 are opted from {10^−5, 10^−3, ..., 10^3, 10^5}. For the reported models, the parameter ε is fixed to 0.01. The k parameter is considered as 1 for the LDMR and LSLDMR based models. For the kernel, we have selected the Gaussian kernel. The kernel parameter μ of SVR, TSVR, LDMR, LSLDMR and the proposed LDMR and LSLDMR based models is chosen from the range {2^−5, 2^−3, ..., 2^3, 2^5}. The non-linear Gaussian kernel can be presented as:

G(ki, lj) = exp(−μ‖ki − lj‖^2)

where G(ki, lj) indicates the Gaussian kernel and i, j = 1, 2, ..., n, with n the maximum index of i and j.

For validating the performance of our models, we used 6 different performance indicators. They are:

a) RMSE = sqrt((1/N) Σ_{i=1}^{N} [(So)i − (Sp)i]^2)
b) MAE = (1/N) Σ_{i=1}^{N} |(So)i − (Sp)i|
c) SMAPE = (1/N) Σ_{i=1}^{N} |(So)i − (Sp)i| / ((So)i + (Sp)i)
d) WI = 1 − Σ_{i=1}^{N} ((So)i − (Sp)i)^2 / Σ_{i=1}^{N} (|(Sp)i − S̄o| + |(So)i − S̄o|)^2
e) CC = Σ_{i=1}^{N} ((So)i − S̄o)((Sp)i − S̄p) / (sqrt(Σ_{i=1}^{N} ((So)i − S̄o)^2) · sqrt(Σ_{i=1}^{N} ((Sp)i − S̄p)^2))

where S̄o and S̄p denote the means of the observed and predicted series respectively.
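The indicators above translate directly into code. A NumPy sketch follows (function name is ours, and R2 is taken as the squared correlation coefficient, consistent with how it is defined among the indicators):

```python
import numpy as np

def metrics(so, sp):
    # so: observed series, sp: predicted series (1-D arrays of equal length)
    so, sp = np.asarray(so, dtype=float), np.asarray(sp, dtype=float)
    err = so - sp
    rmse = np.sqrt(np.mean(err ** 2))                            # indicator a)
    mae = np.mean(np.abs(err))                                   # indicator b)
    smape = np.mean(np.abs(err) / (so + sp))                     # indicator c)
    wi = 1.0 - np.sum(err ** 2) / np.sum(
        (np.abs(sp - so.mean()) + np.abs(so - so.mean())) ** 2)  # indicator d)
    cc = np.sum((so - so.mean()) * (sp - sp.mean())) / (
        np.sqrt(np.sum((so - so.mean()) ** 2))
        * np.sqrt(np.sum((sp - sp.mean()) ** 2)))                # indicator e)
    r2 = cc ** 2                                                 # squared CC
    return {"RMSE": rmse, "MAE": mae, "SMAPE": smape,
            "WI": wi, "CC": cc, "R2": r2}
```

A perfect forecast yields RMSE = MAE = SMAPE = 0 and WI = CC = R2 = 1, which is a convenient sanity check on any implementation.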

Table 1
Prediction performance of the single models for SSL prediction.
Models Parameters Input Combinations Performance Indicators Time (in secs)

RMSE MAE SMAPE WI RSQ CC

SVR (103,2-5) 5-days lag 0.0541 0.0235 0.2952 0.865 0.6902 0.8308 4.8822
(103,2-4) 4-days lag 0.0534 0.0032 0.0824 0.8992 0.6886 0.8298 4.7855
(103,2-5) 3-days lag 0.0543 0.0239 0.2985 0.863 0.6901 0.8307 5.1771
(102,2-1) 2-days lag 0.0533 0.0232 0.2826 0.8702 0.7019 0.8378 5.2858
(102,22) 1-day lag 0.0537 0.0234 0.2903 0.868 0.6958 0.8341 5.3062
Average 0.0537 0.0194 0.2498 0.8731 0.6933 0.8326 5.0874
TSVR (10-1,2-1) 5-days lag 0.0512 0.0245 0.2863 0.8964 0.7021 0.8379 0.3712
(10-2,2-1) 4-days lag 0.0512 0.0246 0.2875 0.8927 0.6933 0.8327 0.3357
(10-1,2-3) 3-days lag 0.0513 0.0247 0.288 0.892 0.6928 0.8324 0.3515
(10-3,2-3) 2-days lag 0.0507 0.0243 0.2875 0.8952 0.6998 0.8366 0.3507
(10-1,25) 1-day lag 0.0499 0.0252 0.3064 0.8921 0.6883 0.8297 0.4006
Average 0.0509 0.0247 0.2911 0.8937 0.6953 0.8339 0.3619
LDMR (10-3,100,20) 5-days lag 0.0523 0.0254 0.2858 0.8887 0.6788 0.8239 0.7093
(10-2,100,20) 4-days lag 0.0519 0.0251 0.2862 0.8899 0.6825 0.8261 0.6533
(10-2,10-5,2-2) 3-days lag 0.0505 0.0248 0.2909 0.8979 0.7001 0.8367 0.3031
(10-3,10-3,2-5) 2-days lag 0.0508 0.0247 0.2937 0.8956 0.6962 0.8344 0.3015
(10-2,10-5,24) 1-day lag 0.0498 0.0249 0.2929 0.9044 0.7063 0.8404 0.3046
Average 0.0511 0.0249 0.2899 0.8953 0.6928 0.8323 0.4544
LSLDMR (10-5,10-2,25) 5-days lag 0.0248 0.0177 0.2604 0.9658 0.8825 0.9394 0.1024
(10-4,10-2,25) 4-days lag 0.0276 0.0203 0.2759 0.9552 0.849 0.9214 0.1046
(10-4,10-2,25) 3-days lag 0.0333 0.0227 0.2775 0.9216 0.754 0.8683 0.1014
(10-5,10-2,24) 2-days lag 0.0305 0.0243 0.2973 0.9065 0.7148 0.8454 0.0989
(10-4,10-1,24) 1-day lag 0.0344 0.0252 0.3415 0.9015 0.7001 0.8367 0.1519
Average 0.0301 0.022 0.2905 0.9301 0.7801 0.8822 0.1118


Table 2
Prediction performance of the decomposition-based hybrid models.
Models Parameters Input Combinations Performance Indicators Time (in secs)

RMSE MAE SMAPE WI RSQ CC

EMD-LDMR (10-4,10-4,20) 5-days lag 0.0253 0.014 0.2182 0.9798 0.9239 0.9612 0.4931
(10-2,10-5,20) 4-days lag 0.0244 0.0133 0.2047 0.9813 0.9293 0.964 0.5221
(104,10-4,2-5) 3-days lag 0.0337 0.0166 0.2478 0.9607 0.868 0.9317 0.6956
(10-2,10-3,25) 2-days lag 0.0189 0.0099 0.158 0.9889 0.9573 0.9784 0.5177
(10-5,10-2,21) 1-day lag 0.046 0.0219 0.2587 0.924 0.7486 0.8652 0.6295
Average 0.0297 0.01514 0.2175 0.9669 0.8854 0.9401 0.5716
EEMD-LDMR (100,10-4,2-2) 5-days lag 0.0259 0.0125 0.1805 0.9781 0.9225 0.9605 0.6394
(100,10-5,2-3) 4-days lag 0.0289 0.0129 0.17 0.9724 0.9028 0.9502 0.6874
(102,10-5,2-4) 3-days lag 0.0291 0.0138 0.2044 0.9723 0.8999 0.9486 0.7371
(10-1,10-4,2-1) 2-days lag 0.0333 0.0147 0.1807 0.9635 0.8681 0.9317 0.7192
(10-2,10-3,20) 1-day lag 0.0451 0.0207 0.02105 0.927 0.7575 0.8703 0.6568
Average 0.0325 0.0149 0.1513 0.9627 0.8702 0.9323 0.6879
EMD-LSLDMR (10-3,10-4,25) 5-days lag 0.0082 0.0041 0.1169 0.9984 0.9939 0.9969 0.1116
(10-3,10-4,25) 4-days lag 0.0089 0.0058 0.13 0.9969 0.988 0.994 0.1213
(10-2,10-5,25) 3-days lag 0.0126 0.0096 0.1605 0.991 0.9653 0.9825 0.1094
(10-2,10-5,24) 2-days lag 0.014 0.0121 0.1831 0.9838 0.9393 0.9692 0.1068
(10-2,10-5,24) 1-day lag 0.0242 0.0184 0.2229 0.9515 0.8339 0.9132 0.1023
Average 0.0136 0.01 0.1627 0.9843 0.9441 0.971 0.1103
EEMD-LSLDMR (10-5,10-4,25) 5-days lag 0.0087 0.0032 0.0824 0.9992 0.9969 0.9984 0.1137
(10-5,10-4,25) 4-days lag 0.0096 0.0037 0.0929 0.9989 0.9955 0.9977 0.1077
(10-2,10-5,25) 3-days lag 0.0113 0.0083 0.1425 0.9934 0.9745 0.9872 0.1083
(10-2,10-5,24) 2-days lag 0.0122 0.01 0.1468 0.9879 0.9549 0.9772 0.1098
(10-2,10-5,24) 1-day lag 0.024 0.0183 0.2039 0.9522 0.8367 0.9147 0.1033
Average 0.0132 0.0087 0.1337 0.9863 0.9517 0.975 0.1086

f)
$$R^2=\frac{\left[\sum_{i=1}^{N}\left((S_o)_i-\overline{S}_o\right)\left((S_p)_i-\overline{S}_p\right)\right]^2}{\sum_{i=1}^{N}\left((S_p)_i-\overline{S}_p\right)^2\,\sum_{i=1}^{N}\left((S_o)_i-\overline{S}_o\right)^2}$$

where $S_o$ is the original value and $S_p$ indicates the predicted value.

4.2. Dataset description

To test the field applicability of EMD-LDMR, EEMD-LDMR, EMD-LSLDMR and EEMD-LSLDMR, we have tested their prediction performance on an SSL dataset. The dataset is accumulated from NHPC Limited, Tawang Basin Project, and the data were collected from the Tawang Chu River, Arunachal Pradesh, India. The study area is shown in Fig. 2 (Panda et al., 2014). The gauge station was located at Jang, Arunachal Pradesh, India. The dataset is a collection of daily SSL data for 3 consecutive years, from 2013 to 2015. The catchment area of the river is 2737 sq. km, lying between latitudes 27°30′00″ and 28°24′00″ and longitudes 91°47′00″ and 92°28′00″. The dataset has a minimum value of 0.004 (gm/litre), a maximum value of 0.0647 (gm/litre), a variation of 0.0076 (gm/litre), a standard deviation of 0.0873 (gm/litre), and skewness and kurtosis of 2.3284 and 7.3785 respectively. The mean, mode and median values are 0.0649 (gm/litre), 0.0078 (gm/litre) and 0.0197 (gm/litre) respectively.

We have prepared 5 different variants of the dataset:

a) Qt-5, Qt-4, Qt-3, Qt-2, Qt-1, Qt: 5-days lag
b) Qt-4, Qt-3, Qt-2, Qt-1, Qt: 4-days lag
c) Qt-3, Qt-2, Qt-1, Qt: 3-days lag
d) Qt-2, Qt-1, Qt: 2-days lag
e) Qt-1, Qt: 1-day lag

5. Results and analysis

The experimental outcomes for the single conventional models are shown in Table 1, along with the optimum parameters for each model. Five different input combinations are used in this work. It can be observed that the solitary machine learning algorithms, i.e., SVR, TSVR, LDMR and LSLDMR, show low prediction performance, with a maximum R² value of 0.7801 for LSLDMR involving the full input data (5-days lag). On further observation of the single models, it is noticeable that LSLDMR shows the lowest average RMSE (0.0301). Further, among the single models, SVR shows the lowest average MAE (0.0194) and SMAPE (0.2498) values, while the LSLDMR model has the highest WI (0.9301) and CC (0.8822) values. The computational times of these models reveal that the LSLDMR model is more computationally efficient than SVR, TSVR and LDMR.

Table 2 shows the performance indicator values and computational time (in seconds) for the EMD and EEMD based hybrid LDMR models. One can notice from the table that the proposed EMD-LDMR, EEMD-LDMR, EMD-LSLDMR and EEMD-LSLDMR show excellent prediction performance, with best mean R² values of 0.8854, 0.8702, 0.9441 and 0.9517 respectively. The maximum R² value of 0.9969 for the EEMD-LSLDMR model is obtained for the full input data (5-days lag). The best average R² value, for EEMD-LSLDMR (0.9517), is followed by the EMD-LSLDMR model (0.9441). Overall, based on mean R² values, increases of 36.11% and 35.906% can be observed for EMD-LDMR and EEMD-LDMR respectively compared to LDMR for Qt-5, Qt-4, Qt-3, Qt-2, Qt-1, Qt. Moreover, increases of 12.62% and 12.96% in R² values can be noticed for EMD-LSLDMR and EEMD-LSLDMR respectively compared to LSLDMR for Qt-5, Qt-4, Qt-3, Qt-2, Qt-1, Qt. Further, it is observed that the proposed EEMD-LSLDMR shows the lowest average RMSE (0.0132), which indicates the applicability of the proposed model. Among the decomposition-based hybrid models, EEMD-LSLDMR has the lowest mean MAE (0.0087) and SMAPE (0.1337) values as well as the highest mean WI (0.9863) and CC (0.975) values.

Additionally, the following points can be observed:

a) The lowest average RMSE of EEMD-LSLDMR is reduced by 75.419%, 74.0668%, 74.168% and 41.096% compared to SVR, TSVR, LDMR and LSLDMR respectively.
b) Moreover, the lowest average MAE of EEMD-LSLDMR is reduced by 55.154%, 64.773%, 65.06% and 65.455% compared to SVR, TSVR, LDMR and LSLDMR respectively.
c) The lowest average SMAPE of EEMD-LSLDMR is reduced by 46.477%, 54.071%, 53.881% and 53.976% compared to SVR, TSVR, LDMR and LSLDMR respectively.
d) The lowest average WI of EEMD-LSLDMR is increased by 12.96%, 10.361%, 10.164% and 6.042% compared to SVR, TSVR, LDMR and LSLDMR respectively.
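The five input variants listed in Section 4.2 are plain autoregressive lag matrices built from the daily SSL series. A small sketch of how such variants can be constructed (variable names here are illustrative, not from the paper):

```python
import numpy as np

def make_lagged(series, n_lags):
    """Return (X, y) where row t of X holds (Q_{t-n_lags}, ..., Q_{t-1})
    and y[t] is Q_t, mirroring the lag-based dataset variants."""
    q = np.asarray(series, dtype=float)
    # Each column i is the series shifted by (n_lags - i) days
    X = np.column_stack([q[i:len(q) - n_lags + i] for i in range(n_lags)])
    y = q[n_lags:]
    return X, y

# e.g. the 5-days-lag variant would be built as:
# X5, y5 = make_lagged(daily_ssl, 5)
```

Each smaller variant (4-days lag down to 1-day lag) is simply the same construction with fewer lag columns.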
Table 3
Average ranks based on different performance indicators for different input combinations (Best average rank is in boldface).
Input Combinations Indicators SVR TSVR LDMR LSLDMR EMD-LDMR EEMD-LDMR EMD-LSLDMR EEMD-LSLDMR

5-days lag RMSE 8 6 7 3 4 5 1 2


MAE 6 7 8 5 4 3 2 1
SMAPE 8 7 6 5 4 3 2 1
WI 8 6 7 5 3 4 2 1
RSQ 7 6 8 5 3 4 2 1
CC 7 6 8 5 3 4 2 1
4-days lag RMSE 8 6 7 4 3 5 1 2
MAE 1 7 8 6 5 4 3 2
SMAPE 1 8 7 6 5 4 3 2
WI 6 7 8 5 3 4 2 1
RSQ 7 6 8 5 3 4 2 1
CC 7 6 8 5 3 4 2 1
3-days lag RMSE 8 7 6 4 5 3 2 1
MAE 6 7 8 5 4 3 2 1
SMAPE 8 6 7 5 4 3 2 1
WI 8 7 6 5 4 3 2 1
RSQ 8 7 6 5 4 3 2 1
CC 8 7 6 5 4 3 2 1
2-days lag RMSE 8 6 7 4 3 5 2 1
MAE 5 6.5 8 6.5 1 4 3 2
SMAPE 5 6 7 8 2 3 4 1
WI 8 7 6 5 1 4 3 2
RSQ 6 7 8 5 1 4 3 2
CC 6 7 8 5 1 4 3 2
1-day lag RMSE 8 7 6 3 5 4 2 1
MAE 5 7.5 6 7.5 4 3 2 1
SMAPE 5 7 6 8 4 1 3 2
WI 8 7 5 6 4 3 2 1
RSQ 7 8 5 6 4 3 2 1
CC 7 8 5 6 4 3 2 1
Average rank 6.6 6.7667 6.8667 5.2667 3.4 3.5667 2.2333 1.3
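The average ranks in the last row of Table 3 are what drive the Friedman and Nemenyi statistics discussed below. A quick sketch of that computation (k = 8 models over N = 30 rank cases; the Nemenyi critical value q₀.₀₅ = 3.031 for eight models is taken from the table in Demšar, 2006):

```python
import math

# Average ranks from the last row of Table 3 (k = 8 models, N = 30 cases)
ranks = [6.6, 6.7667, 6.8667, 5.2667, 3.4, 3.5667, 2.2333, 1.3]
k, N = len(ranks), 30

# Friedman chi-square and its F-distributed refinement (Iman-Davenport)
chi2_f = (12 * N / (k * (k + 1))) * (sum(r * r for r in ranks)
                                     - k * (k + 1) ** 2 / 4)
f_f = (N - 1) * chi2_f / (N * (k - 1) - chi2_f)

# Nemenyi critical distance with q_0.05 = 3.031 for k = 8
cd = 3.031 * math.sqrt(k * (k + 1) / (6 * N))
```

With the printed (rounded) ranks this sketch gives χ²F ≈ 165.98 and FF ≈ 109.36, within rounding of the reported 165.978 and 109.339, and CD ≈ 1.917 as reported.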

e) The lowest average CC of EEMD-LSLDMR is increased by 17.103%, 16.921%, 17.145% and 10.519% compared to SVR, TSVR, LDMR and LSLDMR respectively.
f) Additionally, the computational times of these models reveal that the proposed EEMD-LSLDMR model takes the overall lowest mean computation time compared to SVR, TSVR, LDMR, EMD-LDMR, EEMD-LDMR and EMD-LSLDMR.

Therefore, it is evident from the outcomes that the proposed EEMD-LSLDMR is the best proposed model based on prediction performance as well as computational cost.

Table 3 illustrates the ranks based on the different performance indicators for each input combination. It can be noted that the proposed EEMD-LSLDMR model shows the best results in 21 out of 30 cases. EMD-LSLDMR, EEMD-LDMR and EMD-LDMR reveal the best results in 2, 1 and 4 cases respectively, while SVR shows the best results in 2 cases. The average rank is shown in the last row of Table 3. It can be noticed that the proposed EEMD-LSLDMR reveals the lowest average rank (1.3), which clearly shows the dominance of the model. However, measuring the performance of the models based on their mean rank alone is not adequate. Therefore, we perform the Friedman test with Nemenyi statistics (Demšar, 2006).

Based on the average ranks portrayed in Table 3, we further performed the non-parametric Friedman test (Demšar, 2006). Under the null

Fig. 3. Graphical representation of the Nemenyi test. The CD is 1.917.


Fig. 4. Prediction performance of the reported models on the SSL datasets.


Fig. 5. Violin plot representation of the reported models on the SSL datasets.


Fig. 5. (continued).


Fig. 6. Prediction over the original values for the best single models.


Fig. 7. Parameter insensitivity of proposed (a) EMD-LDMR, (b) EEMD-LDMR, (c) EMD-LSLDMR and (d) EEMD-LSLDMR on 5 days lag dataset.


hypothesis that all models perform equivalently, we formulate the Friedman statistic as:

$$\chi_F^2=\frac{12\times 30}{8\times(8+1)}\left[\left(6.6^2+6.7667^2+6.8667^2+5.2667^2+3.4^2+3.5667^2+2.2333^2+1.3^2\right)-\frac{8\times(8+1)^2}{4}\right]=165.978,$$

$$F_F=\frac{(30-1)\times 165.978}{30\times(8-1)-165.978}=109.339.$$

$F_F$ is distributed with $((8-1),\,(8-1)\times(30-1))=(7,\,203)$ degrees of freedom. The critical value of $F(7,\,203)$ is 2.055 for α = 0.05, which is lower than the computed $F_F$. Therefore, the null hypothesis is rejected. Further, the critical distance (CD) is calculated for qα = 0.05 as:

$$CD=3.031\sqrt{\frac{8\times(8+1)}{6\times 30}}=1.917.$$

These are the conclusions which are established from the Nemenyi test for qα = 0.05:

a) The difference between the average ranks of SVR and EEMD-LSLDMR is more than the CD. Therefore, EEMD-LSLDMR performs better than the SVR model.
b) The difference between the average ranks of TSVR and EEMD-LSLDMR is higher than the CD. Therefore, EEMD-LSLDMR performs significantly better than the TSVR model.
c) Similarly, the difference between the average ranks of LDMR and EEMD-LSLDMR is more than the CD. Therefore, EEMD-LSLDMR achieves significantly better generalization performance compared to the LDMR model.
d) Further, the difference between the average ranks of LSLDMR and EEMD-LSLDMR is more than the CD. Therefore, EEMD-LSLDMR shows better generalization performance compared to the LSLDMR model.

Fig. 3 shows the relative results of the Nemenyi test among all learning methods based on average ranks. The methods with higher rankings are on the right, while those with lower rankings are on the left. The approaches connected by a horizontal line of length less than or equal to the critical distance perform statistically identically. It is noticeable that EEMD-LSLDMR, EMD-LSLDMR, EEMD-LDMR and EMD-LDMR are at the right side of the graph, which shows the efficiency of these models. It can be further noted that EEMD-LSLDMR is significantly different from the SVR, TSVR, LDMR, LSLDMR, EMD-LDMR and EEMD-LDMR models, but no significant difference can be observed between the EEMD-LSLDMR and EMD-LSLDMR models.

Fig. 4 indicates the prediction performance of the regressors for the various input combinations. It is evident that the decomposition-based hybrid LDMR models are in closer agreement with the original data compared to the single models. Fig. 5 depicts the best decomposition-based hybrid models on violin diagrams (Hoffmann, 2022). The figure shows that EEMD-LSLDMR has a closer distribution to the original values when compared to the other models for all the input combinations. In addition, the scatter plots are shown for each model based on the best input combinations in Fig. 6. It is evident that the proposed EMD-LDMR, EEMD-LDMR, EEMD-LSLDMR and EMD-LSLDMR models are highly correlated to the original values.

5.1. Parameter insensitivity

Fig. 7 shows the parameter insensitivity plots for (a) EMD-LDMR, (b) EEMD-LDMR, (c) EMD-LSLDMR and (d) EEMD-LSLDMR on the 5-days lag dataset for each C1 and μ combination based on RMSE. Here, C2 is fixed to its optimum value. The x-axis and y-axis show the different parameter values of C1 and μ respectively. The colour bar on the right side indicates the intensity of the RMSE values; for example, yellow indicates the highest RMSE values and deep blue indicates the lowest RMSE values. It can be observed from the figure that the suggested regressors are not highly sensitive to the user-defined parameters.

6. Conclusion

In this paper, we have combined the EMD and EEMD based techniques with LDMR models for suspended sediment load prediction. The proposed models, viz., EMD-LDMR, EEMD-LDMR, EMD-LSLDMR and EEMD-LSLDMR, have been tested on SSL datasets gathered from the Tawang Chu River, India. The performance of these suggested models is compared with the traditional SVR, TSVR, LDMR and LSLDMR. The performance of the regressors is assessed using 6 different performance indicators. Furthermore, statistical analysis has been undertaken to show the efficacy of our best-proposed technique. The following implications can be derived:

1. EMD based hybrid models outperform the single models for SSL prediction.
2. EEMD based hybrid models outperform the EMD based hybrid models and the single models for SSL prediction.
3. The proposed EEMD-LSLDMR shows the best prediction performance compared to the other models.

In future, it would be fascinating to combine the LDMR and LSLDMR with the empirical wavelet transform as well as with the recently proposed feature mode decomposition. The major limitation of this study is that we have not explored the proposed algorithms on SSL datasets collected from other rivers. Therefore, future work can explore the applicability of the models on SSL datasets collected from other rivers.

CRediT authorship contribution statement

Barenya Bikash Hazarika: Formal analysis, Validation, Visualization, Conceptualization, Methodology. Deepak Gupta: Conceptualization, Investigation, Writing – original draft.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The dataset is accumulated from NHPC Limited, Tawang Basin Project, and the data is collected from the Tawang Chu River, Arunachal Pradesh, India.

Acknowledgements

We would like to thank the National Hydroelectric Power Corporation (NHPC) Limited, Tawang basin project, India for providing us with the dataset.
References

AlDahoul, N., Ahmed, A. N., Allawi, M. F., Sherif, M., Sefelnasr, A., Chau, K. W., & El-Shafie, A. (2022). A comparison of machine learning models for suspended sediment load classification. Engineering Applications of Computational Fluid Mechanics, 16(1), 1211–1232. https://doi.org/10.1080/19942060.2022.2073565
Ali, M., Prasad, R., Xiang, Y., & Yaseen, Z. M. (2020). Complete ensemble empirical mode decomposition hybridized with random forest and kernel ridge regression model for monthly rainfall forecasts. Journal of Hydrology, 584, Article 124647. https://doi.org/10.1016/j.jhydrol.2020.124647
Anguita, D., Ghelardoni, L., Ghio, A., Oneto, L., & Ridella, S. (2012). The 'K' in K-fold cross validation. In 20th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN) (pp. 441–446). i6doc.com publ.
Babanezhad, M., Behroyan, I., Marjani, A., & Shirazian, S. (2021). Artificial intelligence simulation of suspended sediment load with different membership functions of ANFIS. Neural Computing and Applications, 33(12), 6819–6833. https://doi.org/10.1007/s00521-020-05458-6
Banadkooki, F. B., Ehteram, M., Ahmed, A. N., Teo, F. Y., Ebrahimi, M., Fai, C. M., … El-Shafie, A. (2020). Suspended sediment load prediction using artificial neural network and ant lion optimization algorithm. Environmental Science and Pollution Research, 27(30), 38094–38116. https://doi.org/10.1007/s11356-020-09876-w
Bittelli, M., Tomei, F., Pistocchi, A., Flury, M., Boll, J., Brooks, E. S., & Antolini, G. (2010). Development and testing of a physically based, three-dimensional model of surface and subsurface hydrology. Advances in Water Resources, 33(1), 106–122. https://doi.org/10.1016/j.advwatres.2009.10.013
Chen, Y., Dong, Z., Wang, Y., Su, J., Han, Z., Zhou, D., … Bao, Y. (2021). Short-term wind speed predicting framework based on EEMD-GA-LSTM method under large scaled wind history. Energy Conversion and Management, 227, Article 113559. https://doi.org/10.1016/j.enconman.2020.113559
Cheng, S., Yu, X., Li, Z., Xu, X., Gao, H., & Ye, Z. (2023). The effect of climate and vegetation variation on monthly sediment load in a karst watershed. Journal of Cleaner Production, 382, Article 135290. https://doi.org/10.1016/j.jclepro.2022.135290
Chou, C. M., & Wang, R. Y. (2004). Application of wavelet-based multi-model Kalman filters to real-time flood forecasting. Hydrological Processes, 18(5), 987–1008. https://doi.org/10.1002/hyp.1451
Darabi, H., Mohamadi, S., Karimidastenaei, Z., Kisi, O., Ehteram, M., ELShafie, A., & Torabi Haghighi, A. (2021). Prediction of daily suspended sediment load (SSL) using new optimization algorithms and soft computing models. Soft Computing, 25(11), 7609–7626. https://doi.org/10.1007/s00500-021-05721-5
Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. The Journal of Machine Learning Research, 7, 1–30.
Díez-García, R., Camps, A., & Park, H. (2022). On the potential of empirical mode decomposition for RFI mitigation in microwave radiometry. IEEE Transactions on Geoscience and Remote Sensing, 60, 1–10. https://doi.org/10.1109/TGRS.2022.3188171
Doroudi, S., Sharafati, A., & Mohajeri, S. H. (2021). Estimation of daily suspended sediment load using a novel hybrid support vector regression model incorporated with observer-teacher-learner-based optimization method. Complexity, 2021. https://doi.org/10.1155/2021/5540284
Ehteram, M., Ahmed, A. N., Latif, S. D., Huang, Y. F., Alizamir, M., Kisi, O., … El-Shafie, A. (2021). Design of a hybrid ANN multi-objective whale algorithm for suspended sediment load prediction. Environmental Science and Pollution Research, 28(2), 1596–1611. https://doi.org/10.1007/s11356-020-10421-y
Esmaeili-Gisavandani, H., Farajpanah, H., Adib, A., Kisi, O., Riyahi, M. M., Lotfirad, M., & Salehpoor, J. (2022). Evaluating ability of three types of discrete wavelet transforms for improving performance of different ML models in estimation of daily-suspended sediment load. Arabian Journal of Geosciences, 15(1), 1–13. https://doi.org/10.1007/s12517-021-09282-7
Essam, Y., Huang, Y. F., Birima, A. H., Ahmed, A. N., & El-Shafie, A. (2022). Predicting suspended sediment load in Peninsular Malaysia using support vector machine and deep learning algorithms. Scientific Reports, 12(1), 1–29. https://doi.org/10.1038/s41598-021-04419-w
Fan, G. F., Qing, S., Wang, H., Hong, W. C., & Li, H. J. (2013). Support vector regression model based on empirical mode decomposition and auto regression for electric load forecasting. Energies, 6(4), 1887–1901. https://doi.org/10.3390/en6041887
Gupta, D., Hazarika, B. B., & Berlin, M. (2020). Robust regularized extreme learning machine with asymmetric Huber loss function. Neural Computing and Applications, 32(16), 12971–12998. https://doi.org/10.1007/s00521-020-04741-w
Gupta, D., Hazarika, B. B., Berlin, M., Sharma, U. M., & Mishra, K. (2021). Artificial intelligence for suspended sediment load prediction: A review. Environmental Earth Sciences, 80(9), 1–39. https://doi.org/10.1007/s12665-021-09625-3
Gupta, P., & Singh, R. (2023). Combining simple and less time complex ML models with multivariate empirical mode decomposition to obtain accurate GHI forecast. Energy, 263, Article 125844. https://doi.org/10.1016/j.energy.2022.125844
Gupta, U., & Gupta, D. (2021). Least squares large margin distribution machine for regression. Applied Intelligence, 51(10), 7058–7093. https://doi.org/10.1007/s10489-020-02166-5
Hamaamin, Y. A., Nejadhashemi, A. P., Zhang, Z., Giri, S., Adhikari, U., & Herman, M. R. (2019). Evaluation of neuro-fuzzy and Bayesian techniques in estimating suspended sediment loads. Sustainable Water Resources Management, 5(2), 639–654. https://doi.org/10.1007/s40899-018-0225-9
Hazarika, B. B., Gupta, D., & Berlin, M. (2020a). A comparative analysis of artificial neural network and support vector regression for river suspended sediment load prediction. In First international conference on sustainable technologies for computational intelligence (pp. 339–349). Springer, Singapore. https://doi.org/10.1007/978-981-15-0029-9_27
Hazarika, B. B., Gupta, D., & Berlin, M. (2020b). Modeling suspended sediment load in a river using extreme learning machine and twin support vector regression with wavelet conjunction. Environmental Earth Sciences, 79(10), 1–15. https://doi.org/10.1007/s12665-020-08949-w
Hazarika, B. B., Gupta, D., & Berlin, M. (2021). A coiflet LDMR and coiflet OB-ELM for river suspended sediment load prediction. International Journal of Environmental Science and Technology, 18(9), 2675–2692. https://doi.org/10.1007/s13762-020-02967-8
Hoffmann, H. (2022). Violin Plot (https://www.mathworks.com/matlabcentral/fileexchange/45134-violin-plot), MATLAB Central File Exchange. Retrieved February 16, 2022.
Huang, N. E., & Wu, Z. (2008). A review on Hilbert-Huang transform: Method and its applications to geophysical studies. Reviews of Geophysics, 46(2). https://doi.org/10.1029/2007rg000228
Huang, N. E., Shen, Z., Long, S. R., Wu, M. C., Shih, H. H., Zheng, Q., … Liu, H. H. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, 454(1971), 903–995. https://doi.org/10.1098/rspa.1998.0193
Karami, H., DadrasAjirlou, Y., Jun, C., Bateni, S. M., Band, S. S., Mosavi, A., … Chau, K. W. (2022). A novel approach for estimation of sediment load in Dam reservoir with hybrid intelligent algorithms. Frontiers in Environmental Science, 165. https://doi.org/10.3389/fenvs.2022.821079
Khan, S., Aslam, S., Mustafa, I., & Aslam, S. (2021). Short-term electricity price forecasting by employing ensemble empirical mode decomposition and extreme learning machine. Forecasting, 3(3), 460–477. https://doi.org/10.3390/forecast3030028
Kisi, O. (2005). Suspended sediment estimation using neuro-fuzzy and neural network approaches/Estimation des matières en suspension par des approches neurofloues et à base de réseau de neurones. Hydrological Sciences Journal, 50(4). https://doi.org/10.1623/hysj.2005.50.4.683
Kisi, O., Latifoğlu, L., & Latifoğlu, F. (2014). Investigation of empirical mode decomposition in forecasting of hydrological time series. Water Resources Management, 28(12), 4045–4057. https://doi.org/10.1007/s11269-014-0726-8
Latif, S. D., Chong, K. L., Ahmed, A. N., Huang, Y. F., Sherif, M., & El-Shafie, A. (2023). Sediment load prediction in Johor river: Deep learning versus machine learning models. Applied Water Science, 13(3), 79. https://doi.org/10.1007/s13201-023-01874-w
Meshram, S. G., Safari, M. J. S., Khosravi, K., & Meshram, C. (2021). Iterative classifier optimizer-based pace regression and random forest hybrid models for suspended sediment load prediction. Environmental Science and Pollution Research, 28(9), 11637–11649. https://doi.org/10.1007/s11356-020-11335-5
Mhamdi, F., Poggi, J. M., & Jaidane, M. (2011). Trend extraction for seasonal time series using ensemble empirical mode decomposition. Advances in Adaptive Data Analysis, 3(03), 363–383. https://doi.org/10.1142/s1793536911000696
Moeeni, H., & Bonakdari, H. (2018). Impact of normalization and input on ARMAX-ANN model performance in suspended sediment load prediction. Water Resources Management, 32(3), 845–863. https://doi.org/10.1007/s11269-017-1842-z
Mohanta, N. R., Biswal, P., Kumari, S. S., Samantaray, S., & Sahoo, A. (2021). Estimation of sediment load using adaptive neuro-fuzzy inference system at Indus River Basin, India. In Intelligent Data Engineering and Analytics (pp. 427–434). Singapore: Springer. https://doi.org/10.1007/978-981-15-5679-1_40
Morlet, J., Arens, G., Fourgeau, E., & Giard, D. (1982). Wave propagation and sampling theory; Part II. Sampling theory and complex waves. Geophysics, 47(2), 222–236. https://doi.org/10.1190/1.1441329
Naik, J., Satapathy, P., & Dash, P. K. (2018). Short-term wind speed and wind power prediction using hybrid empirical mode decomposition and kernel ridge regression. Applied Soft Computing, 70, 1167–1188. https://doi.org/10.1016/j.asoc.2017.12.010
Nguyen, H. P., Baraldi, P., & Zio, E. (2021). Ensemble empirical mode decomposition and long short-term memory neural network for multi-step predictions of time series signals in nuclear power plants. Applied Energy, 283, Article 116346. https://doi.org/10.1016/j.apenergy.2020.116346
Nhu, V. H., Khosravi, K., Cooper, J. R., Karimi, M., Kisi, O., Pham, B. T., & Lyu, Z. (2020). Monthly suspended sediment load prediction using artificial intelligence: Testing of a new random subspace method. Hydrological Sciences Journal, 65(12), 2116–2127. https://doi.org/10.1080/02626667.2020.1754419
Ouyang, Q., Lu, W., Xin, X., Zhang, Y., Cheng, W., & Yu, T. (2016). Monthly rainfall forecasting using EEMD-SVR based on phase-space reconstruction. Water Resources Management, 30(7), 2311–2325. https://doi.org/10.1007/s11269-016-1288-8
Panahi, F., Ehteram, M., & Emami, M. (2021). Suspended sediment load prediction based on soft computing models and Black Widow Optimization Algorithm using an enhanced gamma test. Environmental Science and Pollution Research, 28(35), 48253–48273. https://doi.org/10.1007/s11356-021-14065-4
Panda, R., Padhee, S. K., & Dutta, S. (2014). Glof study in Tawang River Basin, Arunachal Pradesh, India. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 40(8), 101. https://doi.org/10.5194/isprsarchives-xl-8-101-2014
Prasad, R., Deo, R. C., Li, Y., & Maraseni, T. (2018). Soil moisture forecasting by a hybrid machine learning technique: ELM integrated with ensemble empirical mode decomposition. Geoderma, 330, 136–161. https://doi.org/10.1016/j.geoderma.2018.05.035
Rastogi, R., Anand, P., & Chandra, S. (2020). Large-margin distribution machine-based regression. Neural Computing and Applications, 32(8), 3633–3648. https://doi.org/10.1007/s00521-018-3921-3
Reisenbüchler, M., Bui, M. D., & Rutschmann, P. (2021). Reservoir sediment management using artificial neural networks: A case study of the lower section of the Alpine Saalach River. Water, 13(6), 818. https://doi.org/10.3390/w13060818
Ren, Y., Suganthan, P. N., & Srikanth, N. (2014). A comparative study of empirical mode decomposition-based short-term wind speed forecasting methods. IEEE Transactions on Sustainable Energy, 6(1), 236–244. https://doi.org/10.1109/tste.2014.2365580
Rezaei, K., Pradhan, B., Vadiati, M., & Nadiri, A. A. (2021). Suspended sediment load prediction using artificial intelligence techniques: Comparison between four state-of-the-art artificial neural network techniques. Arabian Journal of Geosciences, 14(3), 1–13. https://doi.org/10.1007/s12517-020-06408-1
Salih, S. Q., Sharafati, A., Khosravi, K., Faris, H., Kisi, O., Tao, H., … Yaseen, Z. M. (2020). River suspended sediment load prediction based on river discharge information: Application of newly developed data mining models. Hydrological Sciences Journal, 65(4), 624–637. https://doi.org/10.1080/02626667.2019.1703186
Santhosh, M., Venkaiah, C., & Kumar, D. V. (2018). Ensemble empirical mode decomposition based adaptive wavelet neural network method for wind speed prediction. Energy Conversion and Management, 168, 482–493. https://doi.org/10.1016/j.enconman.2018.04.099
Shao, X., Sun, S., Li, J., Kong, W., Zhu, J., Li, X., & Hu, B. (2021). Analysis of functional brain network in MDD based on improved empirical mode decomposition with resting state EEG data. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 29, 1546–1556. https://doi.org/10.1109/tnsre.2021.3092140
Sharafati, A., Haji Seyed Asadollah, S. B., Motta, D., & Yaseen, Z. M. (2020). Application of newly developed ensemble machine learning models for daily suspended sediment load prediction and related uncertainty analysis. Hydrological Sciences Journal, 65(12), 2022–2042. https://doi.org/10.1080/02626667.2020.1786571
Shiri, N., Shiri, J., Nourani, V., & Karimi, S. (2022). Coupling wavelet transform with multivariate adaptive regression spline for simulating suspended sediment load: Independent testing approach. ISH Journal of Hydraulic Engineering, 28(sup1), 356–365. https://doi.org/10.1080/09715010.2020.1801528
Tao, H., Al-Khafaji, Z. S., Qi, C., Zounemat-Kermani, M., Kisi, O., Tiyasha, T., … Yaseen, Z. M. (2021). Artificial intelligence models for suspended river sediment prediction: State-of-the art, modeling framework appraisal, and proposed future research directions. Engineering Applications of Computational Fluid Mechanics, 15(1), 1585–1612. https://doi.org/10.1080/19942060.2021.1984992
Wang, W. C., Xu, D. M., Chau, K. W., & Chen, S. (2013). Improved annual rainfall-runoff forecasting using PSO–SVM model based on EEMD. Journal of Hydroinformatics, 15(4), 1377–1390. https://doi.org/10.2166/hydro.2013.134
Wu, Z., & Huang, N. E. (2009). Ensemble empirical mode decomposition: A noise-assisted data analysis method. Advances in Adaptive Data Analysis, 1(01), 1–41. https://doi.org/10.1142/s1793536909000047
Xie, Q., Hu, J., Wang, X., Du, Y., & Qin, H. (2023). Novel optimization-based bidimensional empirical mode decomposition. Digital Signal Processing, 133, Article 103891. https://doi.org/10.1016/j.dsp.2022.103891
Yang, H. F., & Chen, Y. P. P. (2019). Hybrid deep learning and empirical mode decomposition model for time series applications. Expert Systems with Applications, 120, 128–138. https://doi.org/10.1016/j.eswa.2018.11.019
Yang, J., Fu, Z., Zou, Y., He, X., Wei, X., & Wang, T. (2023). A response reconstruction method based on empirical mode decomposition and modal synthesis method. Mechanical Systems and Signal Processing, 184, Article 109716. https://doi.org/10.1016/j.ymssp.2022.109716
Yang, Y., & Yang, Y. (2020). Hybrid prediction method for wind speed combining ensemble empirical mode decomposition and Bayesian ridge regression. IEEE Access, 8, 71206–71218. https://doi.org/10.1109/access.2020.2984020
Yaslan, Y., & Bican, B. (2017). Empirical mode decomposition based denoising method with support vector regression for time series prediction: A case study for electricity load forecasting. Measurement, 103, 52–61. https://doi.org/10.1016/j.measurement.2017.02.007
Yonghao, M., Zhang, B., Li, C., Lin, J., & Zhang, D. (2022). Feature mode decomposition: New decomposition theory for rotating machinery fault diagnosis. IEEE Transactions on Industrial Electronics. https://doi.org/10.1109/tie.2022.3156156
Zhang, H., Wang, B., Lan, T., & Chen, K. (2015). A modified method for non-stationary hydrological time series forecasting based on empirical mode decomposition. Shuili Fadian Xuebao, 34(12), 42–53. https://doi.org/10.11660/slfdxb.20151205
Zhao, N., Ghaemi, A., Wu, C., Band, S. S., Chau, K. W., Zaguia, A., … Mosavi, A. H. (2021). A decomposition and multi-objective evolutionary optimization model for suspended sediment load prediction in rivers. Engineering Applications of Computational Fluid Mechanics, 15(1), 1811–1829. https://doi.org/10.1080/19942060.2021.1990133
