A Novel Framework of Multivariate Modeling of Water Distribution Network Through 3³ Factorial Design and Artificial Neural Network

Journal of Environmental Science and Health, Part A

Toxic/Hazardous Substances and Environmental Engineering

ISSN: 1093-4529 (Print) 1532-4117 (Online) Journal homepage: https://www.tandfonline.com/loi/lesa20

A novel framework of multivariate modeling of water distribution network through 3³ factorial design and artificial neural network

Partha S. Ghosal, Ashwini Javaregowda, Ashok K. Gupta & Dineshwar P. Singh

To cite this article: Partha S. Ghosal, Ashwini Javaregowda, Ashok K. Gupta & Dineshwar P. Singh (2019) A novel framework of multivariate modeling of water distribution network through 3³ factorial design and artificial neural network, Journal of Environmental Science and Health, Part A, 54:6, 551-562, DOI: 10.1080/10934529.2019.1571308

To link to this article: https://doi.org/10.1080/10934529.2019.1571308

Published online: 22 Feb 2019.

JOURNAL OF ENVIRONMENTAL SCIENCE AND HEALTH, PART A
2019, VOL. 54, NO. 6, 551–562
https://doi.org/10.1080/10934529.2019.1571308

A novel framework of multivariate modeling of water distribution network through 3³ factorial design and artificial neural network

Partha S. Ghosalᵃ, Ashwini Javaregowdaᵃ, Ashok K. Guptaᵃ, and Dineshwar P. Singhᵇ

ᵃ Environmental Engineering Division, Department of Civil Engineering, Indian Institute of Technology Kharagpur, Kharagpur, India; ᵇ Dwarka, New Delhi, India

ABSTRACT

The water distribution network is strongly affected by changes in the influencing factors, such as input pressure, demand and supply duration. A change in any one parameter requires an extensive redesign of the network, and the interactive effect of the influencing parameters is rarely explored. The main hurdle for water providers is the absence of a prediction model that can be used as a decision tool to assess the effect of a change in a parameter and to estimate the cost of the changed scenario. The present study developed a novel framework based on the artificial neural network for multivariate prediction modeling, taking the cost of the pipe network as the response. A 3³ factorial design was applied for the selection of the influencing parameters, and its outcome was taken as the input to the neural network model. The adequacy of the model was tested through error functions and analysis of variance. The low values of the error functions (0.0004–0.228), the high F value (162,442) and R² (0.999) established the significance of the model. The model can be used for predicting the cost of changed scenarios and for assessing the optimal solution for the system variables.

ARTICLE HISTORY
Received 1 August 2018
Accepted 8 January 2019

KEYWORDS
Water distribution networks; 3³ factorial design; artificial neural network; hydraulic design

Introduction

Water distribution network (WDN) plays an important role in the water supply system. Designing a WDN for continuous water supply (CWS) and operating it as an intermittent water supply (IWS) creates substantial pressure variation and hampers equitable water supply.[1,2] Nevertheless, the intermittency in a continuous water supply system causes the pressure and flow distributions to be inadequate and not homogeneous.[3] Short-duration supply in IWS leads to large peak flow and increased pressure loss, and the tail-end consumers suffer low pressure and insufficient demand.[4] A supply duration of less than 12 h per day significantly increases the chance of hydraulic losses and energy loss in the water distribution network.[5] The life of the distribution system is reduced by frequent leakage caused by the transient pressure in the intermittent system. In the IWS system, the network experiences regular flow restarting, loss of pressure due to high peak flow, draining as supply stops and ingress of contaminated water as the pressure falls below a positive value. These processes influence the mechanisms affecting water quality throughout the distribution system.[6] However, the WDN in most developing countries follows the IWS with a wide variation of the supply hour.

Nodal pressures, delivered flows, line velocities, nodal elevation differences, size of supplied areas and duration of water supply are the main factors significantly influencing the equity of water supply in a WDN.[7] The difference between the hydraulic gradient and the nodal elevation defines the pressure at the junction point; pressure tends to reduce as the elevation rises. The pressure head at the source and the head loss in the network define the hydraulic gradient line (HGL) throughout the network, which directly affects the nodal pressure and hence influences demand. Flow rate influences the velocity and pressure of the network. An increase in flow rate due to increasing demand in an existing network leads to malfunction of the system.

The major issue with the water supply system in developing countries is the increasing demand due to growth/migration of population, pilferage of water and unmetered consumption. The unaccounted-for flow increases with time, and additional capacity for supply is often required. Although the treatment capacity of the system can be increased by providing additional treatment units, increasing the capacity of the distribution line is often difficult. Moreover, difficulties in decision making for augmenting the capacity or checking the adequacy of the system are often observed due to the nonavailability of a predictive model that considers the several variables influencing the system. The details of a distribution system have to be assessed through a comprehensive hydraulic design, which requires

CONTACT Ashok K. Gupta agupta@civil.iitkgp.ac.in Environmental Engineering Division, Department of Civil Engineering, Indian Institute of Technology,
Kharagpur 721 302, India.
Color versions of one or more figures in the article can be found online at www.tandfonline.com/lesa.
Supplemental data for this article is available online at https://doi.org/10.1080/10934529.2019.1571308.
*Retired
© 2019 Taylor & Francis Group, LLC

effort and time. The associated cost of modification is also difficult to arrive at without performing a detailed hydraulic analysis of the network. An optimized solution for the modification of a distribution network can only be achieved by developing a multivariate prediction model for the WDN.

The modeling of water supply system parameters associated with the WDN can be conducted by the univariate approach, performing the hydraulic design of the WDN with the variation of a single parameter. However, a univariate study rarely captures the interactive effect of the major system parameters. In several areas of environmental modeling, multivariate modeling has become a popular choice to overcome the limitations of the univariate study. Among the several tools of multivariate modeling, the artificial neural network (ANN) has become a preferred choice. The wide application of ANN to nonlinear multivariate modeling of complex problems in environmental engineering has appeared promising.[8–15] Specifically, some inherent advantages, such as universal approximation and mapping capacity, adaptability to changes in the dataset, robustness and fault tolerance, and the ability to process data in a parallel (fast processing) and distributed manner, are associated with the adequate performance of the ANN model.[16,17] In particular, ANN can be considered a suitable tool to model processes that are not described by an appropriate mathematical relationship. Although the efficiency of an ANN model is largely dependent on a large number of input data, it was observed that systematic input data obtained through the statistical design of experiments might drastically reduce the requirement of data as well as enhance the performance of the ANN.[18] An ANN model can even perform better than a regression model if the training of the network is conducted with suitable experimental data.[19] Although various techniques of optimization and modeling have been attempted for designing, solving, operational optimization, quality assessment and so forth of the distribution network,[20–30] multivariate modeling for appraising the effect of the major influencing parameters on the distribution network has rarely been conducted, which is a significant gap in the field of planning and management of water supply systems. Furthermore, ANN has rarely been applied to WDN, although it can be an efficient tool in this context.

In this article, the WDN was analyzed under various conditions of the essential factors of the system, and a multivariate prediction model was developed through ANN. The systematic dataset with a 3³ factorial design was framed considering pressure head at the source, network demand and duration of supply as independent variables and the cost of the system as the response. The number of neurons in the hidden layer, training algorithms, transfer functions and so forth were optimized to develop the neural network. The prediction capacity of the model was tested with a validation study conducted with a 3² factorial model for 8 h supply. The efficiency of the model was tested by simulating the network concerning the actual values of response.

Materials and methods

Selection of major influencing parameters

A water distribution network is characterized by several physical and hydraulic parameters. Each parameter influences the network capacity in different ways. The hydraulic parameters of water distribution systems are interdependent and have a nonlinear interactive effect on one another. Hence, each parameter has to be carefully studied before being considered for optimization of the network. The three major influencing parameters of water distribution network design were selected and investigated through several simulations to model the optimized distribution network design. Input pressure (pump head) plays a vital role in determining the nodal pressure; the pressure head at the source defines the hydraulic gradient, the pressure at all nodes and the corresponding flow at the respective nodes. The head at the source also defines the pipe diameter needed to maintain the required pressure throughout the network. Nodal pressure defines the water supply at the nodes.

Network demand is always an uncertain parameter. Predicting the future demand of a water distribution system is the biggest hurdle.[31] A sudden increase in demand within the network, or due to the nontechnical expansion of the network, decreases network capacity. The demand of the water distribution network regulates the flow, velocity and head loss in the system. The duration of supply affects the other hydraulic parameters of the network, such as flow, velocity, head loss and pressure. The demand increases with the decrease in supply duration. Thus, the duration of supply, the pressure head at the source and the network demand are considered the significant factors for multivariate modeling of the network.

Multivariate modeling

The variation of the parameters mentioned above has a significant impact on the water distribution network. The specific inter-relationships among the parameters are not available. Moreover, the parameters affect the distribution system nonlinearly. The variation of the distribution network is reflected by the variation of the pipe diameter, which affects the overall cost of the network. Hence, the overall effect of those parameters on the distribution system may be evaluated through the cost of the pipe network.

In the present study, the artificial neural network (ANN) was considered as the modeling tool, as no specific mathematical relationship is available. The input of the ANN should comprise the dataset of the variation of the cost of the pipeline (response) concerning the variation of input pressure, hours of supply and network demand (factors). Eventually, the multivariate modeling through ANN requires the generation of different sets of data showing the variation of the response concerning the factors. The dataset was prepared by simulation of a hypothetical network for different sets of factors and respective outcomes. In this context, the generation of the input data was conducted through

Figure 1. Algorithm for multivariate modeling with 3³ factorial design and ANN.

statistical design to produce the systematic data as input to ANN instead of providing a large set of informal data as input.

3³ Factorial design

Design of Experiments (DoE) has pivotal importance in generating the dataset in a systematic way to achieve the optimum model performance with a minimum number of inputs. Among the several techniques, such as Box–Behnken design (BBD) and central composite design (CCD), the 3³ factorial design exhibits higher accuracy for prediction.[32] In a $3^k$ factorial design, all possible combinations of the k factors on three levels are investigated, and the total number of experimental runs required is $N_{3^k} = 3^k + C_r$, where k is the total number of factors and $C_r$ is the number of central run replicates. A total of 27 hydraulic simulations were conducted by varying the factors according to the 3³ factorial model. In this case, the central run is not required, as the output is generated from hydraulic simulation software, which is free from pure error. The design run of the 3³ factorial design is used as input to the ANN model.

ANN model

Among the various techniques of multivariate modeling, ANN is an efficient tool that can be applied to any informal dataset to model both linear and nonlinear relationships between the independent and dependent variables.[33] The major advantage of ANNs is considered to be their ability to learn from the rule without depending on any mathematical relationship.[19] The most frequently used ANN architecture consists of the input layer, the output layer and one or more hidden layers, forming a multilayer feed-forward back-propagation training network.[34] Several factors, such as the complexity of the problem, the characteristics of the input data, the error goal and so forth, influence the proper choice of the training algorithm to obtain the optimum performance of the ANN. In the present study, the number of perceptrons in the hidden layer, the training algorithms and the transfer functions were varied to develop the optimized ANN architecture. Levenberg–Marquardt, resilient BP, BFGS quasi-Newton, scaled conjugate gradient, Fletcher–Powell conjugate gradient, variable learning rate backpropagation, Polak–Ribiere conjugate gradient and one step secant were chosen as the

different training algorithms. The input to the hidden layer U is computed from the input variables as follows:

$$\{U\} = \langle W \rangle \{I\} \tag{1}$$

where W is the weight and I is the input. Each term of the U matrix can be presented as follows:

$$u_j = \sum_{i=1}^{n} w_i\, i_i - \theta \tag{2}$$

where $\theta$ is the associated bias.

The linear (purelin) and the nonlinear functions, viz., log-sigmoid (logsig) and hyperbolic tangent sigmoid (tansig), transfer functions were used in the hidden layer and are presented in Equations (3)–(5), respectively:

$$f(U) = u \tag{3}$$

$$f(U) = \frac{1}{1 + e^{-u}} \tag{4}$$

$$f(U) = \frac{2}{1 + e^{-2u}} - 1 \tag{5}$$

The mean square error (MSE) and correlation coefficient (R) were evaluated to appraise the performance of the ANN model. The MSE is represented as follows:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( \left| Y_{pred} - Y_{exp} \right| \right)^2 \tag{6}$$

where $Y_{pred}$ and $Y_{exp}$ are the predicted and experimental values of the dependent variable, respectively. The ANN analysis was performed in the MATLAB 2010a environment (The MathWorks, Inc.). The number of epochs, minimum gradient and Mu were taken as 1000, 110 and 0.01, respectively. The ANN architecture with the arrangement of the input data, the hidden layer and the output layer is presented in Figure 1. The data of the 3³ factorial design were randomly split into three portions, that is, 70% for training, 15% for validation and 15% for testing, for framing the ANN network.

Analysis of WDN

The Hardy–Cross method, Newton–Raphson method, linear theory method and linear graph theory method are a few of the methods for solving a WDN. However, it is quite cumbersome to use these methods for solving large real WDNs manually. Hence, based on these analytical methods, several simulation software packages have been developed to design and analyze the complex functions of distribution network facilities using advanced computing. In the present work, the hydraulic simulator WaterGEMS V8i (Bentley, USA) was used for the design and analysis of the water distribution network. In this study, a hypothetical network was framed and used to assess and compare the cost of the network by varying the influencing factors. The WDN is presented in Figure S1 in Supporting Information and was used for hydraulic analysis.

A water distribution network with 55 pipelines and 31 junctions with a source reservoir was built in the WaterGEMS simulation software. The total length of the distribution network pipeline was 9,688 m and the minimum pipe diameter considered was 100 mm. The design domain of the different independent variables as per the 3³ factorial design is presented in Table 1. The network was optimized for every simulation using the Darwin designer tool in WaterGEMS. Pressure constraints and velocity constraints were provided for every simulation. The minimum and maximum pressure heads considered for the design were 7 m and 22 m, respectively.[35] The maximum velocity was restricted to 1.5 m/s.

Table 1. Experimental range and levels of factors used in 3³ factorial design.

Factors                 Level –1  Level 0  Level +1
Input pressure (m H₂O)  10        15       20
Network demand (L/s)    13.02     19.53    26.04
Supply hour (h)         4         14       24

The demand was derived for residential populations of 7,500, 11,250 and 15,000 with a 150 L per capita demand. Accordingly, the discharge required at the source was calculated to be 13.02, 19.53 and 26.04 L/s. The system was analyzed for different pressure heads at the source. The elevation of the source was taken as 55.5 m, and pressure heads of 10 m, 15 m and 20 m at the reservoir were considered for the analysis. Hence, the hydraulic gradient line was initiated at 65.5 m, 70.5 m and 75.5 m. Under each pressure head, the network was optimized and analyzed for three different supply durations, viz. 4 h, 14 h and 24 h. A fixed demand pattern was considered for all the supply hours. Each experimental run for the different values of the simulation parameters is listed in Table 2 concerning the 3³ factorial design. The network was optimized using Darwin designer by minimization of cost subject to pressure and velocity constraints. The cost of the pipe for different diameters is given in Table S1 in Supporting Information. The Darwin designer uses a genetic algorithm (GA) for the selected method of cost minimization. In the options, the GA parameters were chosen as per the recommended values of WaterGEMS to obtain better results.

Results and discussion

Hydraulic design of different networks concerning 3³ factorial design

A high requirement of resources is associated with the planning, designing, maintenance, operation and management of the WDN. Any change in the factors affecting the WDN involves a thorough hydraulic analysis of the network. Consequently, the major hindrance for the decision makers is the unavailability of data for predicting the system behavior and the financial involvement, which is closely related to the physical parameters, such as the diameter of the pipes of the network. The prediction of the network adequacy involves many parameters and their complex relationships. Suitable multivariate modeling with an adequate prediction capacity may be the solution for this concern.
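The source discharges and hydraulic gradient line elevations quoted in the Analysis of WDN section can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the 150 L per capita demand is a daily figure averaged over 24 h (the text does not state the averaging period), and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # assumption: per-capita demand is averaged over a full day


def source_discharge_lps(population, per_capita_litres_per_day=150.0):
    """Average discharge required at the source, in L/s (hypothetical helper)."""
    return population * per_capita_litres_per_day / SECONDS_PER_DAY


def hgl_at_source_m(source_elevation_m, pressure_head_m):
    """Hydraulic gradient line elevation at the source: elevation plus pressure head."""
    return source_elevation_m + pressure_head_m


# Populations, source elevation and pressure heads as stated in the text.
populations = [7500, 11250, 15000]
demands = [round(source_discharge_lps(p), 2) for p in populations]
hgls = [hgl_at_source_m(55.5, head) for head in (10.0, 15.0, 20.0)]

print(demands)  # discharges in L/s
print(hgls)     # HGL elevations in m
```

Under that averaging assumption, the three populations give the three demand levels of Table 1, and the three pressure heads give the starting HGL elevations stated in the text.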

Table 2. 3³ factorial design cost as ANN input and the predicted cost from ANN model.

Std. order  Input pressure (m)  Demand (L/s)  Duration (h)  Actual cost of pipe (Million INRᵃ)  Predicted cost of pipe (Million INRᵃ)
1           10                  13.02         4             10.78582                            10.78054799
2           15                  13.02         4             9.8481                              9.811654087
3           20                  13.02         4             9.97938                             9.980842496
4           10                  19.53         4             11.7447                             11.74732765
5           15                  19.53         4             10.64111                            10.63621111
6           20                  19.53         4             11.21668                            11.20864866
7           10                  26.04         4             12.52188                            12.47920058
8           15                  26.04         4             11.22488                            11.20886973
9           20                  26.04         4             12.07116                            12.07011329
10          10                  13.02         14            9.37148                             9.351650369
11          15                  13.02         14            9.33968                             9.331180613
12          20                  13.02         14            9.33968                             9.346317004
13          10                  19.53         14            9.47262                             9.476737023
14          15                  19.53         14            9.41212                             9.41344767
15          20                  19.53         14            9.41212                             9.414209771
16          10                  26.04         14            9.57978                             9.582185618
17          15                  26.04         14            9.47792                             9.480885333
18          20                  26.04         14            9.47792                             9.496223862
19          10                  13.02         24            9.32888                             9.331297426
20          15                  13.02         24            9.32888                             9.330934847
21          20                  13.02         24            9.32888                             9.33184668
22          10                  19.53         24            9.33968                             9.345947309
23          15                  19.53         24            9.32888                             9.333738007
24          20                  19.53         24            9.32888                             9.332367233
25          10                  26.04         24            9.36128                             9.357466229
26          15                  26.04         24            9.37148                             9.358441212
27          20                  26.04         24            9.33968                             9.335677044

ᵃ 1 Million INR = 14381.00 US $.
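The run list of Table 2 is a full factorial over three levels of each factor, so it can be generated rather than typed. The following is a minimal sketch, not the authors' procedure: the function name and the `centre_replicates` argument are hypothetical, and the nesting order (pressure varying fastest, then demand, then duration) is chosen to match the standard order shown in Table 2.

```python
from itertools import product

# Factor levels taken from Table 1.
PRESSURES_M = (10, 15, 20)
DEMANDS_LPS = (13.02, 19.53, 26.04)
DURATIONS_H = (4, 14, 24)


def full_factorial_runs(pressures, demands, durations, centre_replicates=0):
    """Return (pressure, demand, duration) tuples for a 3^k design with k = 3.

    Total runs = 3**3 + centre_replicates. The study uses no centre
    replicates because simulation output carries no pure error.
    """
    # product() varies its last argument fastest, so pressure changes first,
    # then demand, then duration, as in the standard order of Table 2.
    runs = [(p, q, t) for t, q, p in product(durations, demands, pressures)]
    centre = (pressures[1], demands[1], durations[1])
    runs.extend([centre] * centre_replicates)
    return runs


runs = full_factorial_runs(PRESSURES_M, DEMANDS_LPS, DURATIONS_H)
for std_order, (p, q, t) in enumerate(runs, start=1):
    print(std_order, p, q, t)
```

Each tuple would then be simulated and cost-optimized in the hydraulic software; that step is outside the scope of this sketch.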

Figure 2. Variation of nodal pressure and pipe diameter at (a) input pressure 10 m, supply hour 4 h; (b) input pressure 10 m, supply hour 14 h; (c) input pressure 10 m, supply hour 24 h; (d) input pressure 15 m, supply hour 4 h; (e) input pressure 15 m, supply hour 14 h; (f) input pressure 15 m, supply hour 24 h; (g) input pressure 20 m, supply hour 4 h; (h) input pressure 20 m, supply hour 14 h; (i) input pressure 20 m, supply hour 24 h, for the flow rate of (A) 13.02, (B) 19.53 and (C) 26.04 L/s.

To model the optimum distribution network with the selected hydraulic parameters, 27 scenarios of the water distribution network were simulated with different combinations of demand, input pressure and supply duration. The scenarios created were optimized using the Darwin designer tool in WaterGEMS. The new optimized network in each new scenario was run, and the results were noted. In a few optimized networks, the chronology of the pipes was missing; these were manually corrected, and the results were checked for satisfying the constraints. After confirming the chronology of the pipes, the pipe table was copied from the flex table, and the cost of the network was calculated and is presented in Table 2. The pipe networks showing the variation of pipe diameter and nodal pressure are presented for the 13.02, 19.53 and 26.04 L/s flow rates at different supply hours and input pressures as per the 3³ factorial design run in Figure 2A–C.

Variation of hydraulic parameters and pipe cost

The hydraulic design of the network was conducted by restricting the design parameters, that is, velocity and pressure head, within the specified values mentioned in the earlier section. Each network was optimized for a minimum cost by the genetic algorithm in WaterGEMS. Hence, the major distinguishing parameters showing the hydraulic properties of the network may be considered to be pipe diameter and nodal pressure. The increasing trend of nodal pressure with increasing input pressure and supply hours was established from Figure 2A–C. However, the flow rate has an antagonistic effect on the nodal pressure. The diameter of the pipes in the network decreased with increasing input pressure and supply duration. On the other hand, the flow rate has a synergistic effect on the diameter.

The cost of the water supply distribution network mainly depends on the total cost of the pipes used in the network. It is evident from Table 2 that an increase in supply hour decreases the design cost of the water distribution system. Supply hour also affects the flow rate of the pipes. An increase in supply duration decreases the flow rate of the network. An increase in demand increases the flow rates in the network, and meanwhile increases the diameter of the pipes needed to maintain the velocity of flow, which increases the cost of the system (Fig. 2A–C).

The pressure in the network at any point is the difference between the hydraulic grade line and the elevation of that point. The hydraulic grade is the difference between the inlet pressure and the head loss of the pipe. As the input pressure increases, the inlet pressure and the hydraulic grade increase, and accordingly the pressure in the system increases. Hence, head loss in the system has to be decreased to obtain the required pressure. Increasing the size of the network pipe decreases head loss

in the system, which eventually increases the design cost of the distribution network. The outcome of the factorial design was used as input to the ANN.

ANN study

The multivariate modeling with the neural network was carried out with the 27 data points of the 3³ factorial design as the input to the network. The proposed formulation is presented in Figure 1, and the design run is presented in Table 2. The number of neurons in the hidden layer was varied first to optimize the hidden layer. Subsequently, the different training functions and transfer functions were studied for the optimization of the ANN model. In this study, eight training functions and three transfer functions pertinent to this modeling were attempted to optimize the network. The model outcome for these 24 cases of network variation is presented in Table 3. The optimized ANN architecture was chosen using MSE and R as the performance indicators.

Validation of model

The modeling of the water distribution has to be validated. To make comprehensive use of the validation experiment of the model, a 3² factorial design with nine scenarios combining demand, input head and supply duration was created in WaterGEMS and optimized using the Darwin designer tool. The simulation order of the ANN model for the validation experiment is presented in Table 4. Typically, the intermittent supply duration in different cities of India was found to target 8 h. Hence, the same was considered for validation of the model.

Optimization of number of neurons in hidden layer

The performance of the ANN is reliant on the computation carried out in the hidden layer, which in turn is dependent on the number of perceptrons adopted. An inappropriate choice of the number of neurons creates either a complex or an under-performing network. The optimization of the number of neurons for the hidden layer configuration is an integral part of an efficient ANN model. The number of neurons was varied from 4 to 20, and the corresponding mean square error (MSE) and R are represented in Figure 3. The network with 10 neurons in the hidden layer exhibited the optimum performance with an MSE of 1.744 × 10⁻⁴ and a corresponding R above 0.999 (Fig. 3). The correlation statistics of the optimum number of neurons for training, testing, validation and the overall ANN model are 0.999, 0.997, 0.999 and 0.999, respectively (Fig. 3).

Table 3. Performance study of training algorithm and transfer function of ANN model.

Sl. No.  Training algorithm                      Transfer function  MSEᵃ          RMSEᵇ   Rᶜ
1        Levenberg–Marquardt                     tansig             1.744 × 10⁻⁴  0.0132  0.999
1        Levenberg–Marquardt                     logsig             0.0106        0.1031  0.996
1        Levenberg–Marquardt                     purelin            0.0801        0.2830  0.967
2        BFGS quasi-Newton                       tansig             0.0149        0.1219  0.992
2        BFGS quasi-Newton                       logsig             0.0118        0.1085  0.994
2        BFGS quasi-Newton                       purelin            0.0592        0.2434  0.967
3        Scaled conjugate gradient               tansig             0.0022        0.0467  0.999
3        Scaled conjugate gradient               logsig             0.0120        0.1094  0.994
3        Scaled conjugate gradient               purelin            0.0602        0.2454  0.964
4        Resilient BP                            tansig             0.0133        0.1061  0.994
4        Resilient BP                            logsig             0.0183        0.1353  0.990
4        Resilient BP                            purelin            0.0885        0.2974  0.961
5        Polak–Ribiere conjugate gradient        tansig             0.0026        0.0506  0.999
5        Polak–Ribiere conjugate gradient        logsig             0.0071        0.0842  0.996
5        Polak–Ribiere conjugate gradient        purelin            0.0707        0.2659  0.963
6        Fletcher–Powell conjugate gradient      tansig             0.0116        0.1076  0.994
6        Fletcher–Powell conjugate gradient      logsig             0.0040        0.0632  0.998
6        Fletcher–Powell conjugate gradient      purelin            0.0618        0.2487  0.968
7        Variable learning rate backpropagation  tansig             0.0104        0.1022  0.995
7        Variable learning rate backpropagation  logsig             0.0059        0.0769  0.997
7        Variable learning rate backpropagation  purelin            0.0685        0.2618  0.965
8        One step secant                         tansig             0.0094        0.0969  0.996
8        One step secant                         logsig             0.0169        0.1301  0.992
8        One step secant                         purelin            0.0627        0.2503  0.963

ᵃ MSE is mean square error.
ᵇ RMSE is root mean square error.
ᶜ R is correlation coefficient.
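The three transfer functions compared in Table 3 are defined in Equations (3)–(5). As an illustration only, they can be written in plain Python as below; the names mirror the MATLAB labels used in the text, but the code is a standalone sketch and not the toolbox implementation. The last lines also show how the RMSE column relates to the MSE column for the best-performing row.

```python
import math


def purelin(u):
    """Linear transfer function, Equation (3)."""
    return u


def logsig(u):
    """Log-sigmoid transfer function, Equation (4). Not guarded against overflow for very negative u."""
    return 1.0 / (1.0 + math.exp(-u))


def tansig(u):
    """Hyperbolic tangent sigmoid, Equation (5); algebraically equal to tanh(u)."""
    return 2.0 / (1.0 + math.exp(-2.0 * u)) - 1.0


# Evaluate the three functions on a few sample net inputs.
for u in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(u, purelin(u), logsig(u), tansig(u))

# RMSE is the square root of MSE; checked here for the Levenberg-Marquardt / tansig row.
best_mse = 1.744e-4
best_rmse = math.sqrt(best_mse)
print(best_rmse)
```

The bounded outputs of logsig (between 0 and 1) and tansig (between -1 and 1) are what let the hidden layer represent nonlinear cost behavior, which is consistent with purelin giving the weakest R values in Table 3.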

Table 4. Validation set of 3² factorial design results.

Std. order  Input pressure (m)  Demand (L/s)  Duration (h)  Actual cost of pipe (Million INRᵃ)  Predicted cost of pipe (Million INRᵃ)
1           10                  13.02         8             9.57178                             9.794936
2           15                  13.02         8             9.42652                             9.397797
3           20                  13.02         8             9.46712                             9.446128
4           10                  19.53         8             9.7178                              10.35198
5           15                  19.53         8             10.07302                            9.774478
6           20                  19.53         8             9.5701                              9.859356
7           10                  26.04         8             10.72252                            11.69009
8           15                  26.04         8             9.8216                              9.701655
9           20                  26.04         8             9.97938                             10.10908

ᵃ 1 Million INR = 14381.00 US $.
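The actual and predicted costs in Table 4 can be used to work through the error functions of Equations (7)–(12). The sketch below is illustrative: the function name is hypothetical, and the number of parameters p = 1 is an assumption (the text does not state p), chosen because it appears to reproduce the validation-set column of Table 5.

```python
import math

# Actual and predicted pipe costs (Million INR) from Table 4.
actual = [9.57178, 9.42652, 9.46712, 9.7178, 10.07302,
          9.5701, 10.72252, 9.8216, 9.97938]
predicted = [9.794936, 9.397797, 9.446128, 10.35198, 9.774478,
             9.859356, 11.69009, 9.701655, 10.10908]


def error_functions(y_exp, y_pred, p=1):
    """Error functions of Equations (7)-(12); p = 1 is an assumed parameter count."""
    n = len(y_exp)
    diffs = [yp - ye for ye, yp in zip(y_exp, y_pred)]
    return {
        "SSE": sum(d * d for d in diffs),
        "SAE": sum(abs(d) for d in diffs),
        "ARE": 100.0 / n * sum(abs(d) / ye for d, ye in zip(diffs, y_exp)),
        "HYBRID": 100.0 / (n - p) * sum(d * d / ye for d, ye in zip(diffs, y_exp)),
        "MPSD": 100.0 * math.sqrt(sum((d / ye) ** 2 for d, ye in zip(diffs, y_exp)) / (n - p)),
        # The chi-square statistic divides by the predicted value, per Equation (12).
        "CHI2": sum(d * d / yp for d, yp in zip(diffs, y_pred)),
    }


stats = error_functions(actual, predicted)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

The same function could be applied to the 27 rows of Table 2 to examine the 3³ factorial column of Table 5, again subject to the assumed value of p.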

Selection of training algorithm and transfer function

The performance of the ANN is largely dependent on the training algorithm and the transfer function chosen for the network analysis. In this study, a set of 24 network runs was conducted with eight training algorithms and three corresponding transfer functions. The performance indicators MSE and R are presented for each training algorithm, namely Levenberg–Marquardt, resilient BP, BFGS quasi-Newton, scaled conjugate gradient, Fletcher–Powell conjugate gradient, variable learning rate backpropagation, Polak–Ribiere conjugate gradient and one step secant, and the corresponding log-sigmoid and hyperbolic tangent sigmoid transfer functions and linear transformation. The R² and error functions were computed for the models generated from the 3³ factorial data (Table 3). The Levenberg–Marquardt training algorithm and the tansig transfer function exhibited the optimum results.

Performance of ANN model

The predicted values of the ANN model for the 3³ model were computed. The actual values and the predicted values are shown in Figure 4A. The predicted values are very close to the actual values, showing very good predictability of the ANN model. The validation experiment for the 3² design was also simulated by the ANN model, and the predicted values and actual values are shown in Figure 4B. The predicted points come into close proximity to the actual values, which shows the good predictability of the ANN model. Nevertheless, the predicted values for the original ANN model is obviously higher than that of the validation set. The performance of the ANN model was further assessed with some error functions, analysis of variance and regression statistics. The error functions used in this study are as follows:

The sum of the square of the error[36]

$$\mathrm{SSE} = \sum_{i=1}^{n} \left( Y_{pred} - Y_{exp} \right)_i^2 \tag{7}$$

The sum of absolute error[36]

$$\mathrm{SAE} = \sum_{i=1}^{n} \left| Y_{pred} - Y_{exp} \right|_i \tag{8}$$

Figure 3. (a) Variation of MSE and R with the number of neurons in the hidden layer and (b) R of optimum number of neurons for training, testing, validation and
overall ANN model.

The average relative error[36]

$$\mathrm{ARE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{Y_{pred} - Y_{exp}}{Y_{exp}} \right|_i \tag{9}$$

The hybrid fractional error function[36]

$$\mathrm{HYBRID} = \frac{100}{n - p} \sum_{i=1}^{n} \left[ \frac{\left( Y_{pred} - Y_{exp} \right)^2}{Y_{exp}} \right]_i \tag{10}$$

The Marquardt's percent standard deviation[36]

$$\mathrm{MPSD} = 100 \sqrt{ \frac{1}{n - p} \sum_{i=1}^{n} \left[ \frac{Y_{pred} - Y_{exp}}{Y_{exp}} \right]_i^2 } \tag{11}$$

The chi-square test statistic[37]

$$\chi^2 = \sum_{i=1}^{n} \left[ \frac{\left( Y_{pred} - Y_{exp} \right)^2}{Y_{pred}} \right] \tag{12}$$

where n – p is the degree of freedom, n is the number of data points and p is the number of parameters. The performance of the ANN model for both cases was further analyzed by computing the error functions and conducting an analysis of variance (ANOVA); the results are presented in Table 5 and Table S2 in Supporting Information, respectively. The SSE, SAE, ARE, HYBRID and MPSD errors and the χ² values also appeared significantly lower for the model from the 3³ factorial set data than for the validation set. The χ² value for both the cases appeared the lowest as

Figure 4. Performance of ANN model for (a) 3³ factorial design, (b) validation experiments (The cost is in Million INR; 1 Million INR = 14381.00 US $).

Table 5. Comparison of experimental data with predicted values of ANN models generated from 3³ factorial data and validation data with 3² factorial design.

| Test statistic | ANN from 3³ factorial data | ANN for validation set |
|---|---|---|
| The sum of the square of the error (SSE) | 0.005 | 1.593 |
| The sum of absolute error (SAE) | 0.228 | 2.712 |
| The average relative error (ARE) | 0.083 | 2.991 |
| The hybrid fractional error function (HYBRID) | 0.002 | 1.935 |
| The Marquardt's percent standard deviation (MPSD) | 0.036 | 4.340 |
| The χ² test statistic | 0.0004 | 0.145 |
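The six statistics listed in Table 5 follow directly from Eqs. (7) to (12). The sketch below is a minimal illustration of those formulas, not the authors' implementation. The function name `error_functions` and the sample values are hypothetical, and the raw predicted and experimental costs behind Table 5 are not given in this section, so the table entries are not reproduced here.

```python
import math


def error_functions(y_pred, y_exp, p):
    """Compute the error statistics of Eqs. (7)-(12).

    y_pred, y_exp: equal-length sequences of predicted and experimental values.
    p: number of model parameters, so that n - p is the degrees of freedom.
    """
    n = len(y_exp)
    if n != len(y_pred) or n <= p:
        raise ValueError("need equal-length inputs and n > p")

    diffs = [yp - ye for yp, ye in zip(y_pred, y_exp)]

    sse = sum(d ** 2 for d in diffs)                                  # Eq. (7)
    sae = sum(abs(d) for d in diffs)                                  # Eq. (8)
    are = 100.0 / n * sum(abs(d / ye) for d, ye in zip(diffs, y_exp))  # Eq. (9)
    hybrid = 100.0 / (n - p) * sum(
        d ** 2 / ye for d, ye in zip(diffs, y_exp)
    )                                                                 # Eq. (10)
    mpsd = 100.0 * math.sqrt(
        sum((d / ye) ** 2 for d, ye in zip(diffs, y_exp)) / (n - p)
    )                                                                 # Eq. (11)
    chi2 = sum(d ** 2 / yp for d, yp in zip(diffs, y_pred))           # Eq. (12)

    return {"SSE": sse, "SAE": sae, "ARE": are,
            "HYBRID": hybrid, "MPSD": mpsd, "CHI2": chi2}


# Hypothetical data for illustration only (not taken from the study).
y_exp = [10.0, 20.0, 40.0]
y_pred = [11.0, 18.0, 40.0]
stats = error_functions(y_pred, y_exp, p=1)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Note that HYBRID and MPSD divide by n − p rather than n, so they penalize models with more parameters, while χ² normalizes each squared residual by the predicted value rather than the experimental one.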

0.0004 and 0.145, respectively. In contrast, the sum of absolute error was the highest error value for the 3³ factorial model (0.228), and the Marquardt's percent standard deviation was the highest for the validation set (4.340). Furthermore, the F value of the model from the 3³ factorial data was significantly high, at 162,442, and the P value was less than 0.001, which in turn established the significance of the ANN model. However, the same test statistics were 19.480 and 0.004 for the model for the validation set. The higher R², adjusted R² and F value and the lower P value and error functions of the ANN model from the 3³ factorial data set indicate a better performance level compared to the validation set of the 3² factorial design.

In the present study, the method of multivariate modeling through ANN can be used as a predictive and estimating tool for WDN. In most cases, the WDN suffers from a change in demand, supply duration and so forth, for which a thorough hydraulic simulation and redesign of the distribution network is required. Eventually, the authorities face the hurdle of predicting the cost requirement of the system as well as finding a suitable solution to handle the changed situation, such as imposing additional head and changing supply duration. In this context, the multivariate model can perform as a predictive tool for the decision makers as well as for the assessment of the interactive behavior of the system variables.

Conclusions

An established prediction model for the design of a distribution system has rarely been developed, and the interactive effect of system parameters such as pressure and flow has also seldom been explored. Accordingly, the multivariate approach in the present study was conducted for the first time to model the influential system parameters concerning the cost of the pipe network. Generally, a detailed network analysis is the solution to assess the cost of the system in different scenarios. However, the water supply authorities/agencies often need to find the cost of the system for various combinations of the process variables, for example, changed demand, extension or reduction of supply hours and so forth.

Apart from that, the individual influence of each parameter can be assessed by conducting a univariate study. However, the actual behavior of the system with a variation of all the important parameters has rarely been evaluated, although the required process dynamics lies in the assessment of the effect of all parameters at a time. The approach of the present study explored this gap area. The performance analysis of the multivariate model through the ANN and 3³ factorial design showed that it can serve as a prediction model. The model can assess the combined effect of the different parameters, and it can also serve as a decision tool for the authorities to assess the effect of a changed scenario without performing a rigorous hydraulic analysis of the system.

Disclosure statement

No potential conflict of interest was reported by the authors.

References

[1] Vairavamoorthy, K.; Gorantiwar, S. D.; Pathirana, A. Managing Urban Water Supplies in Developing Countries – Climate Change and Water Scarcity Scenarios. Phys. Chem. Earth 2008, 33, 330–339. doi:10.1016/j.pce.2008.02.008.
[2] Ilaya-Ayza, A. E.; Campbell, E.; Perez-Garcia, R.; Izquierdo, J. Network Capacity Assessment and Increase in Systems with Intermittent Water Supply. Water (Switzerland) 2016, 8, 126. doi:10.3390/w8040126.
[3] Fontanazza, C. M.; Freni, G.; Loggia, G. L. Analysis of Intermittent Supply Systems in Water Scarcity Conditions and Evaluation of the Resource Distribution Equity Indices. WIT Trans. Ecol. Environ. 2007, 103, 635–644.
[4] Totsuka, N.; Trifunovic, N.; Vairavamoorthy, K. Intermittent Urban Water Supply Under Water Starving Situations. Presented at the 30th WEDC International Conference, Vientiane, Lao PDR, 2004; pp. 505–512.
[5] Abu-Madi, M.; Trifunovic, N. Impacts of Supply Duration on the Design and Performance of Intermittent Water Distribution Systems in the West Bank. Water Int. 2013, 38, 263–282. doi:10.1080/02508060.2013.794404.
[6] Kumpel, E.; Nelson, K. L. Intermittent Water Supply: Prevalence, Practice, and Microbial Water Quality. Environ. Sci. Technol. 2016, 50, 542–553. doi:10.1021/acs.est.5b03973.
[7] Ilaya-Ayza, A. E.; Martins, C.; Campbell, E.; Izquierdo, J. Gradual Transition from Intermittent to Continuous Water Supply Based on Multi-Criteria Optimization for Network Sector Selection. J. Comput. Appl. Math. 2018, 330, 1016–1029. doi:10.1016/j.cam.2017.04.025.
[8] Yabalak, E. Degradation of Ticarcillin by Subcritical Water Oxidation Method: Application of Response Surface Methodology and Artificial Neural Network Modeling. J. Environ. Sci. Health A 2018, 53, 1–11.
[9] Ma, Y.; Huang, M.; Wan, J.; Hu, K.; Wang, Y.; Zhang, H. Hybrid Artificial Neural Network Genetic Algorithm Technique for Modeling Chemical Oxygen Demand Removal in Anoxic/Oxic Process. J. Environ. Sci. Health A 2011, 46, 574–580. doi:10.1080/10934529.2011.562821.
[10] Ghosal, P. S.; Gupta, A. K. Sorptive Equilibrium Profile of Fluoride onto Aluminum Olivine [(FexMg1-x)2SiO4] Composite (AOC): Physicochemical Insights and Isotherm Modeling by Non-Linear Least Squares Regression and a Novel Neural-Network-Based Method. J. Environ. Sci. Health A 2018, 53, 1–13. doi:10.1080/10934529.2018.1474590.
[11] Chronopoulos, K. I.; Tsiros, I. X.; Dimopoulos, I. F.; Alvertos, N. An Application of Artificial Neural Network Models to Estimate Air Temperature Data in Areas with Sparse Network of Meteorological Stations. J. Environ. Sci. Health A 2008, 43, 1752–1757. doi:10.1080/10934520802507621.
[12] Flores-Asis, R.; Mendez-Contreras, J. M.; Juarez-Martinez, U.; Alvarado-Lassman, A.; Villanueva-Vasquez, D.; Aguilar-Lasserre, A. A. Use of Artificial Neuronal Networks for Prediction of the Control Parameters in the Process of Anaerobic Digestion with Thermal Pretreatment. J. Environ. Sci. Health A 2018, 53, 883–890. doi:10.1080/10934529.2018.1459070.
[13] Hu, K.; Wan, J. Q.; Ma, Y. W.; Wang, Y.; Huang, M. Z. A Fuzzy Neural Network Model for Monitoring A2/O Process Using On-Line Monitoring Parameters. J. Environ. Sci. Health A 2012, 47, 744–754. doi:10.1080/10934529.2012.660102.
[14] Kim, M.-Y.; Kim, M.-K. Dynamics of Surface Runoff and Its Influence on the Water Quality Using Competitive Algorithms in Artificial Neural Networks. J. Environ. Sci. Health A 2007, 42, 1057–1064. doi:10.1080/10934520701418490.
[15] Yabalak, E.; Yilmaz, Ö. Eco-friendly Approach to Mineralise 2-Nitroaniline Using Subcritical Water Oxidation Method: Use of ANN and RSM in the Optimisation and Modeling of the Process. J. Iran. Chem. Soc. 2018, 6, 1–10.
[16] Baş, D.; Boyacı, İ. H. Modeling and Optimization II: Comparison of Estimation Capabilities of Response Surface Methodology with Artificial Neural Networks in a Biochemical Reaction. J. Food Eng. 2007, 78, 846–854. doi:10.1016/j.jfoodeng.2005.11.025.
[17] Desai, K. M.; Survase, S. A.; Saudagar, P. S.; Lele, S. S.; Singhal, R. S. Comparison of Artificial Neural Network (ANN) and Response Surface Methodology (RSM) in Fermentation Media Optimization: Case Study of Fermentative Production of Scleroglucan. Biochem. Eng. J. 2008, 41, 266–273. doi:10.1016/j.bej.2008.05.009.
[18] Gupta, A. K.; Ghosal, P. S.; Srivastava, S. K. Modeling and Optimization of Defluoridation by Calcined Ca-Al-(NO3)-LDH Using Response Surface Methodology and Artificial Neural Network Combined with Experimental Design. J. Hazard. Toxic. Radioact. Waste 2017, 21, 04016024. doi:10.1061/(ASCE)HZ.2153-5515.0000343.

[19] Abbasi, B.; Mahlooji, H. Improving Response Surface Methodology by Using Artificial Neural Network and Simulated Annealing. Expert Syst. Appl. 2012, 39, 3461–3468. doi:10.1016/j.eswa.2011.09.036.
[20] Ostfeld, A.; Oliker, N.; Salomons, E. Multiobjective Optimization for Least Cost Design and Resiliency of Water Distribution Systems. J. Water Resour. Plann. Manage. 2014, 140, 04014037. doi:10.1061/(ASCE)WR.1943-5452.0000407.
[21] Yazdani, A.; Jeffrey, P. Applying Network Theory to Quantify the Redundancy and Structural Robustness of Water Distribution Systems. J. Water Resour. Plann. Manage. 2012, 138, 153–161. doi:10.1061/(ASCE)WR.1943-5452.0000159.
[22] Aksela, K.; Aksela, M.; Vahala, R. Leakage Detection in a Real Distribution Network Using a SOM. Urban Water J. 2009, 6, 279–289. doi:10.1080/15730620802673079.
[23] Rathi, S.; Gupta, R. Optimal Sensor Locations for Contamination Detection in Pressure-Deficient Water Distribution Networks Using Genetic Algorithm. Urban Water J. 2017, 14, 160–172. doi:10.1080/1573062X.2015.1080736.
[24] Tabesh, M.; Shirzad, A.; Arefkhani, V.; Mani, A. A Comparative Study Between the Modified and Available Demand Driven Based Models for Head Driven Analysis of Water Distribution Networks. Urban Water J. 2014, 11, 221–230. doi:10.1080/1573062X.2013.783084.
[25] Song, I.; Romero-Gomez, P.; Andrade, M. A.; Mondaca, M.; Choi, C. Y. Mixing at Junctions in Water Distribution Systems: An Experimental Study. Urban Water J. 2018, 15, 32–38. doi:10.1080/1573062X.2017.1364395.
[26] Wang, M.; Barkdoll, B. D. A Sensitivity Analysis Method for Water Distribution System Tank Siting for Energy Savings. Urban Water J. 2017, 14, 713–719. doi:10.1080/1573062X.2016.1241285.
[27] Sadiq, M.; Zaidi, T. H.; Muhanna, H.; Al.; Mian, A. A. Effect of Distribution Network Pipe Material on Drinking Water Quality. J. Environ. Sci. Health A Environ. Sci. Eng. Toxicol. 1997, 32, 445–454. doi:10.1080/10934529709376553.
[28] Clark, R. M. Water Quality Modeling in Distribution Systems. J. Environ. Sci. Health A Environ. Sci. Eng. Toxicol. 1992, 27, 1329–1366.
[29] Wu, Z. Y. Optimal Calibration Method for Water Distribution Water Quality Model. J. Environ. Sci. Health A 2006, 41, 1363–1378. doi:10.1080/10934520600657115.
[30] Abbaszadegan, M.; Yi, M.; Alum, A. Stimulation of 2-Methylisoborneol (MIB) Production by Actinomycetes After Cyclic Chlorination in Drinking Water Distribution Systems. J. Environ. Sci. Health A 2015, 50, 365–371. doi:10.1080/10934529.2015.987526.
[31] Vasan, A.; Simonovic, S. P. Optimization of Water Distribution Network Design Using Differential Evolution. J. Water Resour. Plann. Manage. 2010, 136, 279–287. doi:10.1061/(ASCE)0733-9496(2010)136:2(279).
[32] Ghosal, P. S.; Gupta, A. K.; Sulaiman, A. Multivariate Optimization of Process Parameters in the Synthesis of Calcined CaAl (NO3) LDH for Defluoridation Using 3³ Factorial, Central Composite and Box-Behnken Design. J. Environ. Sci. Health. A Tox. Hazard. Subst. Environ. Eng. 2016, 51, 86–96. doi:10.1080/10934529.2015.1086212.
[33] Agami, N.; Atiya, A.; Saleh, M.; El-Shishiny, H. A Neural Network Based Dynamic Forecasting Model for Trend Impact Analysis. Technol. Forecast. Soc. Change 2009, 76, 952–962. doi:10.1016/j.techfore.2008.12.004.
[34] Uddameri, V. Using Statistical and Artificial Neural Network Models to Forecast Potentiometric Levels at a Deep Well in South Texas. Environ. Geol. 2007, 51, 885–895. doi:10.1007/s00254-006-0452-5.
[35] CPHEEO. Manual on Water Supply and Treatment; Ministry of Urban Development: New Delhi, India, 1999.
[36] Kundu, S.; Gupta, A. K. Arsenic Adsorption Onto Iron Oxide-Coated Cement (IOCC): Regression Analysis of Equilibrium Data with Several Isotherm Models and Their Optimization. Chem. Eng. J. 2006, 122, 93–106. doi:10.1016/j.cej.2006.06.002.
[37] Ayoob, S.; Gupta, A. K. Insights into Isotherm Making in the Sorptive Removal of Fluoride from Drinking Water. J. Hazard. Mater. 2008, 152, 976–985. doi:10.1016/j.jhazmat.2007.07.072.
