
6.2: The 16 algorithms were tested (Figs. 2 and 3).

According to the observations, all of the models performed well. Of these,
RT-ANN, BA-RT, RF, BA-RF, and BA-M5P had the highest predictive power.

PREI, which evaluates models by their tendency to over- or underestimate the
water quality index (WQI), was used to analyse the results further (Fig. 4).
The RF model had the smallest error among the standalone algorithms. M5P also
had a small error overall, but a single large error (about 40) impaired its
performance. Although the error ranges for RT and REPT were within ±10, these
algorithms failed to estimate every value accurately.

The hybrid models, particularly those using the bagging method, improved on the
predictive performance of the standalone algorithms (compare Fig. 4a with e, c
with g, and d with h).
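
The bagging idea can be illustrated with a minimal sketch. It assumes a
one-split regression stump as a stand-in base learner (the study's base
learners are RT, REPT, RF, and M5P, which are not reproduced here), and the
data, function names, and parameters are hypothetical. Each stump is fitted to
a bootstrap resample of the training data, and the ensemble prediction is the
average of the stump predictions.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression stump; return (threshold, left_mean, right_mean)."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    if best is None:  # all x values identical: predict the mean everywhere
        m = sum(ys) / len(ys)
        return (float("inf"), m, m)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_bagging(xs, ys, n_models=25, seed=0):
    """Fit n_models stumps, each on a bootstrap resample of (xs, ys)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagging(models, x):
    """Average the predictions of all bagged stumps."""
    return sum(predict_stump(m, x) for m in models) / len(models)

# Hypothetical data: one input (e.g. a pollutant concentration) and a WQI target.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [90, 85, 80, 70, 55, 50, 40, 35]
models = fit_bagging(xs, ys)
low_x_prediction = predict_bagging(models, 1)
high_x_prediction = predict_bagging(models, 8)
```

Averaging over resamples smooths the single hard step of an individual stump,
which is the variance-reduction effect that bagging is generally used for.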

The box plots of measured and estimated WQI values show that the RFC-RT,
RFC-M5P, CVPS-M5P, M5P, and BA-M5P models predict the maximum WQI values
accurately. Only RFC-RF correctly estimated the lower values (Fig. 5).

Comparing the models by inspecting their predictions directly makes the
stronger models easy to spot, but it does not readily identify the best method
or produce a full ranking.

Quantitative measures that give stronger evidence of each algorithm's
performance are therefore required (see Table 7).

The hybrid RT-ANN (R² = 0.951) had the highest prediction success (R² > 0.75),
while BPNN (R² = 0.752) had the lowest. The RT-ANN model also had the lowest
MAE (2.284) and the lowest RMSE (2.319). A model's predictive ability is rated
very good when the NSE is between 0.75 and 1 (Moriasi et al., 2007) [1]. By
that criterion all of the models performed well, but RT-ANN performed best
(NSE = 0.945). According to the PBIAS metric, all algorithms except BA-RF, RF,
BA-REPT, REPT, RFC-M5P, RFC-REPT, and ANN-ANFIS overestimated WQI. Ranked from
best to worst by these results, the algorithms are: RT-ANN, BA-RF, RF, BA-RF,
BA-M5P, M5P, RFC-M5P, ANN-ANFIS, RT, RFC-RF, REPT, ANN, BA-REPT, RFC-RT,
RFC-REPT, BPNN.
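
As a rough illustration of how the metrics named above are commonly computed,
the sketch below evaluates a pair of measured and predicted series. It assumes
that R² is the squared Pearson correlation and that PBIAS follows the sign
convention of Moriasi et al. (2007), in which negative values indicate
overestimation. The function name and the sample values are hypothetical and
are not taken from Table 7.

```python
import math

def evaluate(observed, predicted):
    """Return R2, MAE, RMSE, NSE and PBIAS for two paired series."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [o - p for o, p in zip(observed, predicted)]
    sq_err = sum(e * e for e in errors)

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sq_err / n)

    ss_obs = sum((o - mean_o) ** 2 for o in observed)
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    r2 = cov ** 2 / (ss_obs * ss_pred)  # assumption: squared Pearson correlation

    # Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the mean.
    nse = 1.0 - sq_err / ss_obs
    # Percent bias: positive = underestimation, negative = overestimation.
    pbias = 100.0 * sum(errors) / sum(observed)
    return {"R2": r2, "MAE": mae, "RMSE": rmse, "NSE": nse, "PBIAS": pbias}

# Hypothetical measured and predicted WQI values.
measured = [40, 50, 60, 70]
estimated = [42, 51, 63, 70]
scores = evaluate(measured, estimated)
```

With these made-up values the NSE falls in the 0.75 to 1 band and the PBIAS is
negative, which under the stated convention means the predictions overestimate
the measurements on average.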

Conclusion: This study investigated the performance of four standalone
algorithms (RT, REPT, RF, and M5P) and 12 hybrid data-mining algorithms
(hybrids of the standalones with CVPS, RFC, and bagging) for forecasting the
monthly water quality index in a humid environment in northern Iran. Our goal
was to develop and propose new algorithms for WQI prediction, and for other
areas of water science, in regions where water quality gauging stations are
sparsely distributed.

According to the modelling procedure, the most important factor for the water
quality index was faecal coliform concentration, followed in order of relevance
by BOD, NO3, DO, EC, COD, PO4, turbidity, TS, and pH. We also found that
different variable combinations led to varying degrees of model performance,
and that changing the model inputs affected performance in our study region
differently from what has been reported for other catchments, even when the
same variable combinations were used. Predictive power is highest when the
variables with the highest correlation coefficients (CCs) are used in the
models; low-CC variables have a detrimental effect on predictive power.
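
Ranking candidate inputs by their correlation with WQI can be sketched as
follows. This is an illustration only: it assumes the CC is the Pearson
correlation coefficient, and the variable names and values are invented for
the example rather than taken from the study's dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_inputs(inputs, target):
    """Return (name, CC) pairs sorted by absolute CC, strongest first."""
    ccs = {name: pearson(values, target) for name, values in inputs.items()}
    return sorted(ccs.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical monthly samples; not data from the study.
wqi = [80, 70, 60, 50, 40]
candidates = {
    "pH": [7.1, 7.3, 7.0, 7.2, 7.1],
    "BOD": [2, 3, 3, 5, 6],
    "FC": [10, 20, 30, 40, 50],
}
ranked = rank_inputs(candidates, wqi)
ranked_names = [name for name, _ in ranked]
```

Input combinations could then be built by adding variables in this order, so
that the weakly correlated ones are the last to enter a model.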

Compared with the standalone algorithms, the hybrids generally showed higher
prediction accuracy, but they were not more effective in all circumstances. The
RT-ANN model was the most accurate of all the models. After RT-ANN, in order of
decreasing performance, came RF, bagging-RF, bagging-RT, bagging-REPT, RFC-RF,
RT, M5P = CVPS-M5P, RFC-M5P, bagging-M5P, REPT, CVPS-REPT, CVPS-RT, RFC-REPT,
and RFC-RT. Despite having the best performance, the RT-ANN hybrid was unable
to predict extreme WQI values well. Nearly all algorithms overestimated WQI
values; the exceptions were BA-RF, RF, BA-REPT, REPT, RFC-M5P, RFC-REPT, and
ANN-ANFIS.

It is important to note that, because these algorithms give stable outputs with
a short-term dataset, they should be even more stable with longer-term
datasets. As a result, they may be highly efficient in developing areas with
minimal measuring networks, or where gauging networks have only recently been
established. According to our results, the recommended RT-ANN algorithm appears
to be a practical and cost-effective approach for improving groundwater
treatment in humid parts of northern Iran.

This model is likely to be most beneficial in developing nations, where the
costs of testing the various water quality parameters are high and may be
unaffordable. However, these observations cannot be assumed to extend to other
study regions or to other hydrological data. RT-ANN is likely to be an
efficient algorithm, but it may not be the best (i.e., most accurate) in every
situation.
[1] Moriasi, D.N., Arnold, J.G., Van Liew, M.W., Bingner, R.L., Harmel, R.D.,
Veith, T.L., 2007. Model evaluation guidelines for systematic quantification of
accuracy in watershed simulations. Trans. ASABE 50, 885–900.
https://doi.org/10.13031/2013.23153.
