
Journal of Environmental Sciences 32 (2015) 90–101

Available online at www.sciencedirect.com

ScienceDirect
www.journals.elsevier.com/journal-of-environmental-sciences

Prediction of effluent concentration in a wastewater treatment plant using machine learning models

Hong Guo1, Kwanho Jeong1, Jiyeon Lim2, Jeongwon Jo2, Young Mo Kim1, Jong-pyo Park3,
Joon Ha Kim1, Kyung Hwa Cho2,⁎
1. School of Environmental Science and Engineering, Gwangju Institute of Science and Technology (GIST), 261 Cheomdan-gwagiro, Buk-gu,
Gwangju 500-712, Republic of Korea
2. School of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology, Ulsan 689-798, Republic of Korea
3. HECOREA. INC, 405, Woori Venture Town II, 70, Seonyu-ro, Yeongdeungpo-gu, Seoul, Republic of Korea

ARTICLE INFO

Article history:
Received 30 June 2014
Revised 11 December 2014
Accepted 22 January 2015
Available online 20 April 2015

Keywords:
Artificial neural network
Support vector machine
Effluent concentration
Prediction accuracy
Sensitivity analysis

ABSTRACT

With the growing amount of food waste, integrated food waste and waste water treatment has been regarded as one of the efficient treatment methods. However, loading food waste into the conventional waste treatment process might lead to a high concentration of total nitrogen (T-N), with an impact on the effluent water quality. The objective of this study is to establish two machine learning models, artificial neural networks (ANNs) and support vector machines (SVMs), in order to predict the 1-day interval T-N concentration of effluent from a wastewater treatment plant in Ulsan, Korea. Daily water quality data and meteorological data were used, and the performance of both models was evaluated in terms of the coefficient of determination (R2), Nash–Sutcliffe efficiency (NSE), and relative efficiency criteria (drel). Additionally, Latin-Hypercube one-factor-at-a-time (LH-OAT) and a pattern search algorithm were applied to sensitivity analysis and model parameter optimization, respectively. Results showed that both models could be effectively applied to the 1-day interval prediction of the T-N concentration of effluent. The SVM model showed a higher prediction accuracy in the training stage and a similar result in the validation stage. However, the sensitivity analysis demonstrated that the ANN model was the superior model for 1-day interval T-N concentration prediction in terms of the cause-and-effect relationship between the T-N concentration and the modeling input values for integrated food waste and waste water treatment. This study suggests an efficient and robust nonlinear time-series modeling method for early prediction of the water quality of the integrated food waste and waste water treatment process.
© 2015 The Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences.
Published by Elsevier B.V.

Introduction

Following the restrictive landfill legislation passed by the European Union (EU) in 1999, many developed countries have implemented various policies and technical developments for reducing the quantity of biodegradable waste sent to landfill (Burnley et al., 2011; García et al., 2005). The South Korean government also prohibited the landfill of municipal solid sludge (MSS) and food waste (FW) in the early 21st century (S. Cheon et al., 2013). However, this strict regulation led to the dumping of both the sludge and FW water (i.e., leachate) at sea, consequently leading to the prohibition of its disposal in the ocean

⁎ Corresponding author. E-mail: khcho@unist.ac.kr (Kyung Hwa Cho).

http://dx.doi.org/10.1016/j.jes.2015.01.007
1001-0742/© 2015 The Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V.

by the Marine Environment Management Act (Behera et al., 2010; S. Cheon et al., 2013). At this time, basic environmental treatment facilities such as wastewater treatment plants (WWTPs) appear to be an alternative inland treatment to resolve the problem. In inland treatment at a wastewater treatment plant, over 80% of FW, which is the recyclable organic fraction of municipal solid waste (MSW), is dehydrated, and the remaining waste goes through recycling processes such as composting, feed production, and anaerobic digestion to generate biomass energy (Chelliapan et al., 2012; Li et al., 2012). In particular, methane gas, one of the biogases, can be utilized as a biomass energy source (Lee et al., 2009). However, a large amount of FW leachate inevitably occurs in all recycling processes because of the high moisture content of the FW (Han et al., 2012), resulting in a significant burden on the wastewater treatment systems.

According to several research reports (Kim et al., 2008; Sosnowski et al., 2003), the water treatment process can be made more effective by using FW leachate in the WWTP. This is because the FW leachate contains a large amount of acid fermentation liquid (AFL), which can be utilized as an organic carbon source for removing nitrogen and phosphorus in advanced wastewater treatment (AWT) processes (Han et al., 2012; Lee et al., 2003). The digestion process with only sewage sludge could be less effective due to the low carbon/nitrogen (C/N) ratio and low level of biodegradable organic compounds. FW leachate contains a high amount of solids as well as a high C/N ratio, while containing a low amount of nutrient-type elements (Mata-Alvarez, 2003). Therefore, the combined treatment of sewage sludge and FW improves the removal efficiency of nitrogen and phosphorus in AWT, enhancing the stability of the digestion process. Furthermore, higher production of methane gas is an additional benefit of co-digestion with FW leachate (Cecchi et al., 1988; Hamzawi et al., 1998; Mata-Alvarez et al., 1990; Poggi-Varaldo and Oleszkiewicz, 1992; Schmit and Ellis, 2001). Owing to these advantages, the anaerobic digestion of sewage sludge with FW has become more common in WWTPs in Korea. However, this process also faces critical issues associated with the side effects of co-digestion. One of the issues is that the influent water quality is degraded by mixing with the FW leachate returned from the anaerobic co-digestion process, which tends to increase the mixed liquor suspended solids (MLSS) and causes a large amount of scum in the activated sludge reactor (Kim and Shin, 2009; Mahmoud et al., 2003). In addition, a sudden increase of the FW leachate could cause an unstable digestion process and lower the effluent water quality from WWTPs (S. Cheon et al., 2013).

Generally, the water quality of a WWTP is sensitive to parameters such as pH, temperature, and the concentrations of specific substrates and contaminants. This is because wastewater is treated by the metabolic processes of microorganisms. However, biological treatment still exhibits time-varying and highly nonlinear characteristics affected by various known and unknown parameters (Hamed et al., 2004; Hong et al., 2003; Mjalli et al., 2007). Due to these complicated features, many previous studies evaluated and diagnosed the performance of WWTPs by using a mathematical model for process simulation and control (Gernaey et al., 2004; Hamed et al., 2004; Hong et al., 2003; Iacopozzi et al., 2007; Mjalli et al., 2007; Rivas et al., 2008; Wintgens et al., 2003). Among these, a machine learning model has proved to be a useful tool because it has a relatively high accuracy for dealing with complicated systems. Furthermore, a key advantage of these models for the evaluation of WWTP performance is that they can directly predict output values from input values after only a training and validation step. Artificial neural networks (ANNs) and support vector machines (SVMs) are representative machine-learning techniques (Dreyfus, 2005; Shon and Moon, 2007). The performance of these two machine learning models has been widely discussed before (Hamed et al., 2004; Palani et al., 2008; Singh et al., 2009; Yoon et al., 2011). However, purely black-box modeling is of limited use for process control, and the cause-and-effect relationship between input and output values has yet to be elucidated for process control.

In this study, two machine learning models were developed for predicting the effluent T-N concentration of the integrated food waste and waste water treatment plant in Ulsan Metropolitan City, Korea. Moreover, through sensitivity analysis between input values and output values, the cause-and-effect relationship was elucidated for future process control and for selection of the preferred machine learning model for integrated food waste and waste water treatment. The objectives of this study are: a) to develop a reliable 1-day interval early T-N concentration prediction model by a parameter optimization method; b) to evaluate the built model by sensitivity analysis in order to find a reasonable, cause–effect based model as a future decision-making tool; c) to propose an early warning prediction tool to avoid the impact of FW leachate loading on the integrated food waste and waste water treatment.

1. Method and materials

1.1. Field sampling

We collected water samples in an attempt to investigate the effect of FW leachate on the Yong-yeon (YY) WWTP in Ulsan. The samples were collected from 6 different spots: influent, flow-distribution tank, aeration tank, effluent, FW leachate, and pre-treated FW leachate (Fig. 1). The collected samples were delivered to a laboratory at the Ulsan National Institute of Science and Technology (UNIST) and were analyzed in terms of total suspended solids (TSS), chemical oxygen demand (COD), total nitrogen (T-N), and total phosphorus (T-P); water temperature and pH were measured in situ at the sampling stations.

1.2. Sample analysis

The TSS of a water sample was measured by filtering a 20 mL sample through pre-weighed 47 mm glass-fiber paper (with 1.2 μm pore size), then weighing the filter again after drying to remove all water in the sample. COD, T-P, and T-N were measured through absorptiometric analysis. COD and T-P were measured for 4 sampling locations: influent, flow-distribution tank, aeration tank, and effluent. T-N was measured for 6 sampling locations including 2 additional stations (i.e., pre- and post-aerobic transamination of FW leachate). The absorbance of the samples, which were mixed with the proper reagents, was quantified over the 200–900 nm wavelength range and the target

Fig. 1 – Schematic diagram of the wastewater treatment plant (WWTP); the dashed box indicates the system boundary for the machine learning model development in this study. Sampling points are numbered (1)–(6). TN conc: total nitrogen concentration.

components were quantified. Distilled water was used for the reference solution. For COD quantification, 0.5 mL of each water sample was put into the sulfuric acid solution, and then 0.6 mL of standard potassium permanganate (KMnO4) solution (0.005 mol/L) was added. The mixed solution was heated for 15 min at 100°C. After the reaction, the oxygen demand was measured by the amount of potassium permanganate consumed. For T-P measurement, the water sample was pre-treated by adding potassium persulfate to 5 mL of water sample and heating for 30 min at 120°C. After heating the pre-treated water sample, a mixture of 2 mL of ammonium molybdate with ascorbic acid was added to the sample. The reference solution was observed at the 880 nm wavelength and the T-P of the water sample was measured by quantifying the amount of reduced phosphate. For quantifying T-N, the samples from pre- and post-aerobic treatment of FW leachate had to be diluted to a 1/25 ratio due to their high concentration levels. The water samples were pre-treated by adding alkaline potassium persulfate to 0.5 mL of water sample, which was then heated for 30 min at 120°C. After adding hydrochloric acid to bring the pH to 2 to 3, T-N was finally measured by the absorbance at 220 nm. In summary, the T-N of the water sample was measured by oxidizing nitrogenous compounds to nitrate ions and calculated from the difference of light intensity between the reference and the sample.

1.3. Modeling approaches

1.3.1. Artificial neural networks (ANNs)

As the name implies, an artificial neural network is a data-based, flexible mathematical structure modeled on a neural network. It is a very powerful computational technique for modeling complex non-linear relationships and for analyzing the explicit form of the relations between variables. It was first introduced in the early 1940s and was developed further with the back-propagation (BP) algorithm in 1988 (Gallant, 1993; McCulloch and Pitts, 1943; Rumelhart et al., 1988; Smith, 1993).

Artificial neural networks have been seen as standard data-based nonlinear estimators, and they are widely applied for prediction and forecasting in environment-related fields, including water treatment (Liong et al., 2001; Muttil and Chau, 2006), oceanography (Lee, 2004), and ecological science (Trichakis et al., 2011). Data-based modeling of water quality (Ahmed and Sarma, 2007; Cho et al., 2011; Karul et al., 2000; Lee et al., 2010; Lek and Guégan, 1999; Rogers and Dowla, 1994; Yan et al., 2010) has also been applied successfully over the past 20 years.

A common ANN structure, called a multilayer perceptron network, consists of three distinct layers (input, hidden, and output) with linked nodes and functions. After data are introduced into the ANN model, the network utilizes neurons, which are non-linear algebraic functions (i.e., transfer functions) (Dreyfus et al., 2002). The signal passes from one neuron to another through the weights and the transfer function (Govindaraju, 2000), and the back-propagation algorithm can effectively train the network for nonlinear problems by adjusting the weights in an attempt to minimize the objective function (Rumelhart and McClelland, 1986). The mathematical expression of the ANN model in this study is as follows (Khalil et al., 2005):

$$ y_i = f\left( \sum_{j=1}^{N} W_{ij} X_j + b_i \right) \qquad (1) $$

where X_j is the jth nodal value in the previous layer and y_i is the ith nodal value in the current layer.

By multiplying each previous nodal value by its weighting factor (W_ij), adding the bias of the ith node (b_i), and applying the activation function f, we can calculate the current nodal value according to Eq. (1). A three-layer (input, hidden, and output) feed-forward artificial neural network was built for predicting effluent concentrations in the YY WWTP from 8 input variables, Xi (i = 1, …, 8) (month, volumetric flow rate of

inflow, pH, temperature, chemical oxygen demand, suspended solid, T-N of inflow, T-N of pre-treated FW leachate) (Fig. 2). Fewer hidden nodes are usually preferable because they generalize better and help avoid over-fitting. However, too few nodes also impair the performance of network training and validation (Ahmed and Sarma, 2007; Palani et al., 2008). The optimal values of the number of hidden nodes, learning rate, and momentum for the model were determined by a pattern search algorithm. In addition, we tested the logistic sigmoid function, the tangent sigmoid function, and the linear function as candidate transfer functions; the best-fitting model was obtained with the tangent sigmoid transfer function.

1.3.2. Support vector machines (SVM)

Support vector machines (SVMs) are a data-based machine learning model based on structural risk minimization (SRM) (Vapnik, 1995, 1999). SRM minimizes the empirical error and the model complexity simultaneously, which can improve the generalization ability in classification or regression problems (Yoon et al., 2011). SVMs have been widely verified in numerous environmental research areas. Dibike et al. (2001) applied various kernel functions of SVM to predict rainfall, and Khalil et al. (2005) used SVM to characterize an agriculture-dominated watershed by analyzing the spatial distribution features of groundwater. SVMs have also been widely applied in other fields, including the prediction of stream flow, lake water level, and soil moisture (Gill et al., 2006; Khalil et al., 2006; Khan and Coulibaly, 2006; Liong and Sivapragasam, 2002).

SVM models can be classified into two types: linear support vector regression and nonlinear support vector regression. The nonlinear support vector regression model was used for model development in this study. Mathematically, it can be described as follows:

$$ f(X_i) = \sum_{i=1}^{N} W_i \, \varphi(X_i) + b \qquad (2) $$

where W_i and b are the parameters of the linear support vector regression function and φ(X_i) is the nonlinear mapping function. To simplify the calculation of the nonlinear mapping function, the kernel function K(x_i, x_j) = ⟨φ(x_i) · φ(x_j)⟩ is applied to form the inner products and to evaluate the feature-separating space directly as a mathematical function (Yu et al., 2006). We tested several kernels (linear, polynomial, sigmoid, and radial basis function) and found that the radial basis function gave the best-fitting model for effluent water quality prediction in this study. In addition, the optimal values of the key model parameters, namely the cost constant (C), the radius of the insensitive tube (ε), and the scale parameter for stable model performance (σ), were determined by the optimization algorithm.

1.4. Model construction

1.4.1. Input data preparation

MATLAB was used for building the ANN and SVM models to predict effluent T-N in the WWTP. As the architectures of ANN and

Fig. 2 – Illustration of general conceptual model structure for artificial neural networks (ANNs) and support vector machines
(SVMs).
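The forward pass of Eq. (1) through the three-layer structure illustrated in Fig. 2 can be sketched in Python as follows. This is a minimal illustration, not the MATLAB implementation used in the study: the function names, the random placeholder weights, and the example input vector are assumptions made here, while the 8 inputs, the 11 hidden nodes (Table 2), and the tangent sigmoid (tanh) transfer function in both layers follow the text.

```python
import math
import random


def dense(inputs, weights, biases, activation):
    """Eq. (1): y_i = f(sum_j W_ij * X_j + b_i) for every node i of a layer."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, params):
    """Input -> hidden (tanh) -> output (tanh), i.e. a tansig/tansig network."""
    hidden = dense(x, params["W_hidden"], params["b_hidden"], math.tanh)
    return dense(hidden, params["W_out"], params["b_out"], math.tanh)[0]


def random_params(n_in=8, n_hidden=11, seed=0):
    """Untrained placeholder weights (assumption); training would adjust these."""
    rng = random.Random(seed)

    def u():
        return rng.uniform(-0.5, 0.5)

    return {
        "W_hidden": [[u() for _ in range(n_in)] for _ in range(n_hidden)],
        "b_hidden": [u() for _ in range(n_hidden)],
        "W_out": [[u() for _ in range(n_hidden)]],
        "b_out": [u()],
    }


params = random_params()
# Eight hypothetical inputs, assumed to be already scaled to [-1, 1].
x = [0.2, -0.1, 0.0, 0.5, -0.7, 0.3, 0.1, -0.4]
y = forward(x, params)
print(y)  # normalized effluent T-N estimate, always inside (-1, 1)
```

Because the output node also uses tanh, the prediction stays in the normalized range and has to be mapped back to mg/L before it is compared with measurements.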

SVM models are shown in Fig. 3, the total dataset was divided into two groups: a training data set and a validation data set. For the model input data, we chose the data from January to August for training the model and the data from September to October for validation, after optimization work on the training and validation periods. All data were normalized to the range from −1 to 1. The normalized data were then used as input and output data for the ANN and SVM models. The optimal model parameters of each of the two models were determined by applying a global optimization algorithm. After determining the model parameters, the T-N concentration of the effluent was predicted by the ANN model and the SVM model, and the predicted values were compared with the measured values to evaluate the prediction accuracy. Other pollutants such as COD, BOD (biochemical oxygen demand), and T-P were not considered as output values for model development because only the concentration of T-N in the effluent was regarded as an indicator of the effect of the FW leachate on the waste water treatment in this study.

1.4.2. Model parameter optimization

For both ANN and SVM, the model parameters greatly influence the learning and prediction accuracy of the output values. Palani et al. (2008) found that an insufficient number of nodes would lead to impaired performance of the network. Normally, the optimum value of the parameters is determined by trial and error, or is based on previous research. In this study, we used the pattern search algorithm (Lewis and Torczon, 2002) to determine the optimum values of the parameters of the ANN and SVM models, as shown in Table 2. The initial ranges of each parameter were selected based on previous research (Cho et al., 2009; Wang et al., 2003).

1.4.3. Assessment of model performance

To judge the performance of each machine learning model (ANN and SVM), selecting suitable criteria is critical. Krause et al. (2005) found that no single efficiency criterion can fully explain model performance, since each has its pros and cons. We therefore applied three criteria for training and validation: the coefficient of determination (R2), Nash–Sutcliffe efficiency (NSE), and relative efficiency criteria (drel), which are the most frequently applied in the water-science field (Krause et al., 2005).

The coefficient of determination is defined and calculated as follows:

$$ R^2 = \left( \frac{\sum_{i} \left( Q_m(i) - \overline{Q_m} \right) \left( Q_o(i) - \overline{Q_o} \right)}{\sqrt{\sum_{i} \left( Q_m(i) - \overline{Q_m} \right)^2} \, \sqrt{\sum_{i} \left( Q_o(i) - \overline{Q_o} \right)^2}} \right)^2 \qquad (3) $$

The value of the coefficient ranges from 0 to 1 (no correlation to a perfect fit), and it would tell us how the

Fig. 3 – Logical flow of the two machine learning modeling studies (artificial neural networks (ANNs) and support vector machines (SVMs)). SS: suspended solids; T-N: total nitrogen; tansig/tansig: the transfer function of ANNs; RBF: the kernel function of SVMs.
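The data preparation step of this workflow (scaling every variable to the range −1 to 1 and splitting the record into a January to August training set and a September to October validation set) can be sketched as follows. This is an illustrative Python version under stated assumptions: the function names and the `month`/`tn` record fields are hypothetical, the sample values are made up rather than measured, and taking the scaling bounds from the full record is an assumption, since the text does not say which subset defined them.

```python
def scale(value, lo, hi):
    """Map a value from [lo, hi] linearly onto [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def unscale(scaled, lo, hi):
    """Inverse of scale(): map [-1, 1] back to the original units."""
    return (scaled + 1.0) * (hi - lo) / 2.0 + lo


def split_by_month(records, train_months=range(1, 9), valid_months=(9, 10)):
    """January-August for training, September-October for validation."""
    train = [r for r in records if r["month"] in train_months]
    valid = [r for r in records if r["month"] in valid_months]
    return train, valid


# Made-up effluent T-N values (mg/L) for illustration only.
records = [
    {"month": 1, "tn": 10.0},
    {"month": 5, "tn": 18.0},
    {"month": 8, "tn": 14.0},
    {"month": 9, "tn": 12.0},
    {"month": 10, "tn": 16.0},
]

lo = min(r["tn"] for r in records)  # assumption: bounds from the full record
hi = max(r["tn"] for r in records)
for r in records:
    r["tn_scaled"] = scale(r["tn"], lo, hi)

train, valid = split_by_month(records)
print(len(train), len(valid))
```

Keeping `unscale` alongside `scale` matters in practice, because model outputs in the normalized range must be converted back to mg/L before the efficiency criteria are evaluated against measured concentrations.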

Table 1 – Water quality variables measured over 10 months for each process in the Ulsan waste water treatment plant.

Variable     Statistic  Influent  Flow-distribution  Aeration-sedimentation  Effluent  Supernatant  Pretreatment
T-N (mg/L)   Average      39.074             38.884                  47.569    14.639     2578.980      1662.938
             Std.         13.842             20.508                  20.933     9.522     1138.601      1055.886
             Median       37.390             35.180                  42.185    13.618     2221.800      1323.800
T-P (mg/L)   Average       5.544              2.701                   6.559     1.083            –             –
             Std.          1.925              0.570                   1.893     0.852            –             –
             Median        5.515              2.800                   6.333     0.978            –             –
TSS (mg/L)   Average     266.532            180.328                 288.306    97.581            –             –
             Std.        108.449             88.755                  97.375    62.020            –             –
             Median      250.000            175.000                 287.500   100.000            –             –

T-N: total nitrogen; T-P: total phosphorus; TSS: total suspended solids.

dispersion of the measured value could be explained by the modeling prediction.

Nash–Sutcliffe efficiency was developed in the 1970s and has been widely applied to assess hydrological models. It is also very sensitive to extreme values and might give suboptimal results for datasets that contain extreme data. It can be calculated as:

$$ \mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N} \left( Q_o(i) - Q_m(i) \right)^2}{\sum_{i=1}^{N} \left( Q_o(i) - \overline{Q_o} \right)^2} \qquad (4) $$

where Q_o(i), Q_m(i), and \overline{Q_o} are the measured value at the ith observation, the modeled value at the ith observation, and the average measured value, respectively, and N is the total number of samples.

Both the coefficient of determination and the Nash–Sutcliffe efficiency describe the difference between the measured and modeled values in absolute terms. However, there might be an over- or under-prediction due to higher or lower values. To counteract these problems, we additionally applied the relative efficiency criteria (drel), which significantly reduce the influence of the absolute differences between the measured and modeled values during high values.

$$ d_{rel} = 1 - \frac{\sum_{i} \left( \frac{Q_o(i) - Q_m(i)}{Q_o(i)} \right)^2}{\sum_{i} \left( \frac{\left| Q_m(i) - \overline{Q_o} \right| + \left| Q_o(i) - \overline{Q_o} \right|}{\overline{Q_o}} \right)^2} \qquad (5) $$

The relative efficiency criteria also range from 0 to 1.

Besides applying the aforementioned criteria, the fitness of the constructed models was checked through residual analysis (Krause et al., 2005).

1.5. Sensitivity analysis

Latin Hypercube One factor At a Time (LH-OAT) sensitivity analysis was applied to the input parameters that may have a potential influence on the prediction of the T-N concentration of the effluent. As a sensitivity analysis method that ranks parameter sensitivity, LH-OAT combines the One-factor At a Time (OAT) and Latin Hypercube (LH) sampling methods. Under LH-OAT sensitivity analysis, all parameters are sampled with the precision of the OAT method so that any change of the output value can be clearly attributed to the changed input. Additionally, LH-OAT is a very efficient method; for m intervals in the LH method and p parameters, a total of m × (p + 1) steps are required (van Griensven et al., 2006). For each input parameter, the boundary was set to the minimum and maximum values.

2. Results and discussions

2.1. Water quality monitoring

The daily data for the three water quality parameters (T-N, T-P, and TSS) measured over 10 months at all 6 monitoring stations (influent, flow-distribution tank, aeration tank, effluent, FW leachate, and pre-treated FW leachate) are presented in Table 1. T-N and TSS are two important parameters for assessing water quality. The measured T-N and TSS data, together with month, volumetric flow rate of the inflow, pH, temperature, and COD, were used as the input parameters for machine learning model construction. All of these input parameters were examined through the sensitivity analysis to demonstrate their relationship with the T-N concentration of the effluent and were finally selected for model development.

The box plots in Fig. 4 show the statistical analysis of the measured water quality variables from Table 1. The T-N concentrations in the flow distribution tank and the aeration tank are 38.884 ± 20.508 mg/L and 47.569 ± 20.933 mg/L, respectively. After the food sludge treatment, the pre-treated FW leachate is recycled to the flow distribution tank. We measured the water quality parameters (T-N, T-P, and TSS) of the influent to each process and found that the water quality of the aeration tank is related to the effect of the pre-treated FW leachate recycling. From Table 1, the T-N concentration increased by 8.685 mg/L on average between the flow distribution tank and the aeration tank. T-P and TSS also increased, by 3.858 mg/L and 107.978 mg/L, respectively. Since the T-N concentrations of the pre-treated FW leachate and the flow distribution tank differ by at least one order of magnitude, the aforementioned results are plausible, and we could observe that the integrated treatment of food waste could greatly affect the water quality of the WWTP in Ulsan.

Fig. 4 – Basic statistical analysis of the measured water quality data. T-N: total nitrogen (log scale); T-P: total phosphorus; TSS: total suspended solids; a: influent; b: flow-distribution; c: aeration-sedimentation; d: effluent; e: supernatant; f: pretreatment.

2.2. Training and validation of ANN and SVM models

Different ANN and SVM models were built and tested in order to determine the optimum model for the prediction of effluent T-N concentration in this study. For the ANN model, the hyperbolic tangent transfer function (a nonlinear transfer function) was determined to be the optimal function for both the hidden and output layers. For SVM, the RBF kernel function was used in the transformation layer.

Additionally, selecting the appropriate number of nodes for both the ANN and SVM hidden layers is critical, because using an excessive number of nodes can cause over-fitting. In this study, we applied the pattern search algorithm to find the optimum number of hidden-layer nodes for both the ANN and SVM models. Table 2 shows the optimum parameters for ANN and SVM, which were obtained from the pattern search algorithm.

2.3. Parameter sensitivity analysis

Table 3 summarizes the sensitivity ranking of the input parameters with respect to the T-N concentration of the effluent. It shows the importance of the spatial and temporal variables to the model predictions. In the ANN case, temperature was the most important parameter, followed by the T-N of inflow water and pH. On the other hand, month, COD, and SS were the three most important parameters for the SVM model.

For a biological water treatment plant, temperature, pH, and organic carbon are the three most important operational conditions for bacterial growth, which can affect the T-N removal efficiency of biological water treatment. Hence, it is reasonable that temperature and pH were determined to be among the most significant parameters for predicting the T-N concentration of effluent with the machine learning models in this study. Additionally, the T-N concentration of the inflow was also an important input parameter, since it directly affects the amount of T-N entering the waste water treatment. Hence, the ANN model could be considered a more reasonable model than SVM in view of the characteristics of the biological treatment process. For process control, the ANN model, being based on more reasonable physical relations, could be more reliable than SVM for managing and avoiding the impact of high T-N concentrations on the system by adjusting the most physically related parameters.

Machine learning models do not have to represent all the physical meaning through the input and output variables, but we can still observe from the result of the sensitivity analysis that the highest-ranking parameter of ANN was temperature. However, for SVM, the highest-ranking parameter was the month, which does not seem to be related to any physical meaning. Considering the relationship between temperature and month, and an additional effect of the ionic strength and flocs depending on the season (or month), the result of ANN was more acceptable (Zita and Hermansson, 1994). Additionally, the final-effect values in Table 3 showed little difference among the variables in SVM. In terms of process control, ANN showed more reliable and reasonable results than the SVM model.

Table 2 – Comparison of the optimized artificial neural network (ANN) and support vector machine (SVM) performances for prediction of the total nitrogen (T-N) concentration of the effluent from the wastewater treatment plant in Ulsan.

Site                              Model                 Model parameters   R2           NSE          drel
                                                                           Tr     Vl    Tr     Vl    Tr     Vl
Ulsan wastewater treatment plant  ANNs (Tansig/Tansig)  lr: 0.50           0.55   0.47  0.56   0.46  0.80   0.76
                                                        mo: 0.742
                                                        # N: 11
                                  SVMs (RBF)            C: 50.005          1.00   0.46  1.00   0.45  0.99   0.77
                                                        ε: 0.001
                                                        σ: 4.693

lr: the learning rate; mo, momentum; # N: number of hidden neurons; C: the cost constant; ε: the radius of insensitive tube; σ: the parameter of
the kernel function; R2: the coefficient of determination; NSE: Nash–Sutcliffe model efficiency; drel: relative efficiency criteria; Tr: the training
step; Vl: the validation step; Tansig/Tansig: the transfer function of ANNs selected in this study; RBF: the kernel function of SVMs selected in
this study.
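The parameter values in Table 2 were obtained with a pattern search algorithm. As a rough illustration of how such a derivative-free search proceeds, the sketch below implements compass search, the simplest member of the pattern search family, in Python. It is not the authors' MATLAB routine: the function names, step settings, bounds, and the quadratic `toy_error` objective (a stand-in for a model's validation error as a function of two hyperparameters) are all assumptions made for this example.

```python
def pattern_search(objective, start, lower, upper, step=1.0, tol=1e-4, max_iter=1000):
    """Minimize objective(point) inside box bounds by compass search."""
    best = list(start)
    best_val = objective(best)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for k in range(len(best)):
            for direction in (1.0, -1.0):
                trial = list(best)
                # Poll one coordinate at a time and clip to the allowed range.
                trial[k] = min(max(trial[k] + direction * step, lower[k]), upper[k])
                val = objective(trial)
                if val < best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step *= 0.5  # contract the pattern when no poll point improves
    return best, best_val


def toy_error(p):
    """Hypothetical error surface with its minimum at (3.0, 1.5)."""
    return (p[0] - 3.0) ** 2 + (p[1] - 1.5) ** 2


best, best_val = pattern_search(
    toy_error, start=[0.0, 0.0], lower=[0.0, 0.0], upper=[10.0, 10.0]
)
print(best, best_val)
```

In a real application the objective would train the ANN or SVM with the candidate parameters and return an error measure, so each poll point costs one full model fit; this is why the initial parameter ranges matter.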

Table 3 – Sensitivity rank of input variables in artificial neural networks (ANNs) and support vector machines (SVMs) using
the Latin Hypercube One factor At a Time (LH-OAT) sensitivity analysis for the Ulsan wastewater treatment plant.
Rank  ANN                                                  SVM
      Variable                               Final effect  Variable                               Final effect
1     Temperature                            38.59         Month                                  1.45
2     Total nitrogen of inflow               33.37         Chemical oxygen demand                 1.34
3     pH                                     32.60         Suspended solid                        1.33
4     Volumetric flow rate of inflow         30.58         pH                                     1.29
5     Suspended solid                        26.89         Temperature                            1.28
6     Total nitrogen of food waste leachate  22.31         Total nitrogen of inflow               1.24
7     Month                                  23.58         Volumetric flow rate of inflow         1.22
8     Chemical oxygen demand                 17.64         Total nitrogen of food waste leachate  1.17
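A ranking of the kind shown in Table 3 can be produced with an LH-OAT loop such as the Python sketch below. It is a simplified variant written for illustration: each parameter is perturbed from the Latin Hypercube base point by a fixed fraction of its range, the partial effect is the percentage change in output relative to the mean output, divided by that fraction, and the final effect is the average over all base points. The function names, the 5% perturbation, and the linear `toy_model` are assumptions; the study applied the analysis to the trained ANN and SVM models.

```python
import random


def latin_hypercube(m, bounds, rng):
    """Draw m points so that each of the m strata of every parameter is used once."""
    columns = []
    for lo, hi in bounds:
        strata = list(range(m))
        rng.shuffle(strata)
        width = (hi - lo) / m
        columns.append([lo + (s + rng.random()) * width for s in strata])
    return [list(point) for point in zip(*columns)]


def lh_oat(model, bounds, m=10, fraction=0.05, seed=0):
    """Return (mean partial effect per parameter, number of model runs)."""
    rng = random.Random(seed)
    totals = [0.0] * len(bounds)
    runs = 0
    for base in latin_hypercube(m, bounds, rng):
        y_base = model(base)
        runs += 1
        for k, (lo, hi) in enumerate(bounds):
            delta = fraction * (hi - lo)
            if base[k] + delta > hi:
                delta = -delta  # step inward when the upper bound would be exceeded
            trial = list(base)
            trial[k] = base[k] + delta
            y_trial = model(trial)
            runs += 1
            mean_output = (y_trial + y_base) / 2.0  # assumed non-zero here
            totals[k] += abs(100.0 * (y_trial - y_base) / mean_output / fraction)
    return [t / m for t in totals], runs


def toy_model(x):
    """Hypothetical model: strongly driven by x[0], weakly by x[1]."""
    return 10.0 + 3.0 * x[0] + 0.1 * x[1]


effects, runs = lh_oat(toy_model, bounds=[(0.0, 1.0), (0.0, 1.0)], m=10)
ranking = sorted(range(len(effects)), key=lambda k: effects[k], reverse=True)
print(ranking, runs)
```

With m = 10 base points and p = 2 parameters the loop evaluates the model m × (p + 1) = 30 times, matching the run count given for LH-OAT in Section 1.5.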

2.4. Model test

Measured values of the effluent T-N concentration from the WWTP in Ulsan were compared with the values modeled by the machine learning models (ANN and SVM), using both regression plots and residual values, to check the models' performance. Fig. 5 shows the regression plots for the training and validation steps of both the ANN and SVM. Inspection of these plots indicated that both the ANN and SVM models gave a good fit for the modeled data. As Table 2 shows, for the training and validation steps of the ANN model, the coefficient of determination (R2) values were 0.55 and 0.47; the Nash–Sutcliffe efficiency (NSE) values were 0.56 and 0.46; and the relative efficiency criteria (drel) values were 0.80 and 0.76. For the SVM model, on the other hand, the R2 values for training and validation were 1.00 and 0.46; the NSE values were 1.00 and 0.45; and the drel values were 0.99 and 0.77 (Table 2).
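
The three criteria reported in Table 2 can be computed from paired measured and modeled values as in the following Python sketch. It assumes that R2 is the squared Pearson correlation and that drel follows the relative index of agreement given by Krause et al. (2005); the function names and the sample numbers are illustrative and are not data from this study.

```python
def mean(values):
    return sum(values) / len(values)


def r_squared(observed, predicted):
    """Coefficient of determination, taken here as the squared Pearson correlation."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance of the observations."""
    mo = mean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mo) ** 2 for o in observed)
    return 1.0 - sse / sst


def d_rel(observed, predicted):
    """Relative index of agreement (assumed form, after Krause et al., 2005)."""
    mo = mean(observed)
    num = sum(((o - p) / o) ** 2 for o, p in zip(observed, predicted))
    den = sum(((abs(p - mo) + abs(o - mo)) / mo) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 - num / den


# Hypothetical effluent T-N values (mg/L), for illustration only.
measured = [10.0, 12.0, 14.0, 16.0]
modeled = [11.0, 11.5, 14.5, 15.0]
print(r_squared(measured, modeled), nse(measured, modeled), d_rel(measured, modeled))
```

All three criteria equal 1 for a perfect prediction, and NSE drops to 0 when the model does no better than the mean of the measurements, which is why training and validation values are reported side by side.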

Fig. 5 – Comparison of the modeled and measured total nitrogen (T-N) concentration of effluent from the Ulsan waste water
treatment plant training and validation tests using artificial neural network (ANN) and support vector machine (SVM) model.

Considering the low values of the model performance criteria, we additionally checked the fit of the machine learning models through an analysis of residuals (Fig. 6). The residual plots for the training and validation steps of both the ANN and SVM show that the residuals are independent of the modeled T-N concentrations and randomly distributed. This result is also supported by further correlation analysis (R2 for ANN: training = 5.509e−7, validation = 3.306e−6; R2 for SVM: training = 1.1e−6, validation = 2.01e−5) in Fig. 7.
In this study, the low values of the model performance criteria were likely caused by data noise and the short period covered by the input data. However, the ANN and SVM models could still give acceptable accuracy for future prediction of the effluent T-N concentration.

2.5. Comparison of models

Fig. 4 shows the measured and predicted T-N concentrations by the ANN and SVM models with application of the optimum parameters. The prediction accuracy of the SVM was slightly higher than that of the ANN during the training steps, whereas the accuracy of the two models was almost identical during the validation steps. Consequently, we observed a higher prediction performance of the SVM than the ANN.
Recently, ANN has been applied to predict water quality variables. Soyupak et al. (2003) studied the prediction of dissolved oxygen concentration in three separate reservoirs. The reported correlation coefficient was greater than 0.95 for predicting dissolved oxygen concentration. SVM has also been used to predict water quality. Singh et al. (2009) computed the DO (dissolved oxygen) and BOD concentrations in a polluted river flowing through the northern alluvial Gangetic plains in India. Root-mean-square error (RMSE) values for the predicted and observed values of DO were 0.7 and 0.74 for the training and validation steps, respectively, while the corresponding values for BOD were 0.85 and 0.85 (Singh et al., 2009). For wastewater treatment plants, Oliveira-Esquerre et al. (2002) applied ANN to predict the biochemical oxygen demand of biological wastewater treatment effluent, with an average R2 of 0.76. Additionally, Mjalli et al. (2007) used ANN to study wastewater treatment plant operation characteristics and to predict the BOD, COD, and TSS for the Doha West WWTP.
Relatively low values of the model performance criteria (0.4–1.0 for R2, 0.4–1.0 for NSE, and 0.76–0.99 for relative efficiency criteria) for output variables for the T-N concentration of pre-treated FW leachate could be observed in this study. The reason might be the interpretation and prediction

Fig. 6 – Plot of the modeled versus measured total nitrogen (T-N) concentration of effluent from the Ulsan waste water treatment
plant training and validation tests.

Fig. 7 – Residuals plots versus modeled (predicted) total nitrogen (T-N) concentration of effluent from the Ulsan waste water
treatment plant training and validation tests.
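
The residual check summarized in Fig. 7 can be sketched in Python as follows: the residuals (measured minus modeled) are correlated with the modeled values, and an R2 close to zero indicates no systematic trend. The function name and the sample values are hypothetical, and the use of the squared Pearson correlation is an assumption about how the reported R2 values were obtained.

```python
def residual_trend_r2(observed, predicted):
    """Squared Pearson correlation between residuals and modeled values."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(residuals)
    mean_r = sum(residuals) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(residuals, predicted))
    var_r = sum((r - mean_r) ** 2 for r in residuals)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_r * var_p)


# Hypothetical values: the first case has residuals unrelated to the prediction,
# the second has residuals that grow with the prediction (a systematic bias).
modeled = [1.0, 2.0, 3.0, 4.0]
print(residual_trend_r2([2.0, 1.0, 2.0, 5.0], modeled))  # no trend in residuals
print(residual_trend_r2([1.5, 3.0, 4.5, 6.0], modeled))  # strong trend in residuals
```

A value near zero, as reported for both models, is consistent with randomly distributed residuals; it says nothing about the size of the residuals, which is what R2, NSE, and drel in Table 2 measure.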

ability for the higher non-linear relationship. Balabin and Lomakina found that higher nonlinear interferences could lead to low accuracy for both the ANN and SVM models (Balabin and Lomakina, 2011). On the other hand, the limited number of input variables and data noise could also be the reason for the lower values of the coefficient of determination. Hamed et al. (2004) applied ANN to the modeling of wastewater treatment. Their data set was collected over about 10 months, and low values of the coefficient of determination (< 0.5 on average) were also observed, similar to our study. Future studies are therefore necessary to develop higher-performance models for more effective water quality prediction than the current results.
Machine learning models (ANN and SVM) were developed for the prediction of T-N concentration in effluent from the Ulsan wastewater treatment plant using water quality and meteorological data. The results showed that the machine learning models could be applied to model the complex wastewater treatment process, which also treated in parallel the high T-N concentration of FW leachate from food waste. The values of the T-N concentration of the effluent were successfully predicted by the machine learning models, and these two models (ANN and SVM) could also be applied to 1) estimate the T-N concentration of effluent when real-time monitoring or sampling is not possible, and 2) estimate the range of the output parameters to avoid exceeding the water quality regulations.
It should be noted that the measured data were checked daily. Therefore, the current machine learning models (ANN and SVM) were only applied to the prediction of daily water quality change over a very short period (10 months). Model recalibration and revalidation based on a larger data set would be required in future studies for a more accurate prediction model. Additionally, other input parameters may also be considered in future modeling work. Nevertheless, the models constructed in this study could still be effectively used for the prediction of the effluent T-N concentration.

3. Conclusions

The aim of this study was to develop two reliable machine learning models (ANN and SVM) to predict the 1-day interval T-N concentration of effluent early, in order to avoid the impact of high FW leachate T-N concentration loading on the wastewater treatment. Both daily water quality data and meteorological data were used as input parameters, and a pattern search algorithm was used for model parameter optimization of the machine learning models. In addition, sensitivity analysis was also conducted to determine the effectiveness of each

input parameter by using the LH-OAT method. The present study shows that: 1) the optimum ANN and SVM models were reliable for predicting the trends of water quality at the wastewater treatment plant of Ulsan; 2) based only on the model performance assessment from prediction accuracy, the SVM model performed better than the ANN model; and 3) however, from the sensitivity analysis, more physically related cause-and-effect relationships between the T-N concentration of effluent and other input parameters could be elucidated from the ANN than from the SVM. Thus, the ANN model could be a more reasonable and reliable model than the SVM for the purpose of decision-making model building and process control for the integrated food waste and wastewater treatment. This study showed that machine learning models could be a reliable method for water quality prediction as an early warning tool for water quality control in wastewater treatment. For future work, longer-term sampling of the input values is suggested to improve the accuracy of the ANN and SVM models.

Acknowledgments

This research was supported by a grant (12-TI-C04) from the Advanced Water Management Research Program funded by the Ministry of Land, Infrastructure and Transport of the Korean government.

REFERENCES

Ahmed, J.A., Sarma, A.K., 2007. Artificial neural network model for synthetic streamflow generation. Water Resour. Manag. 21 (6), 1015–1029.
Balabin, R.M., Lomakina, E.I., 2011. Support vector machine regression (SVR/LS-SVM)—an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data. Analyst 136 (8), 1703–1712.
Behera, S.K., Park, J.M., Kim, K.H., Park, H.-S., 2010. Methane production from food waste leachate in laboratory-scale simulated landfill. Waste Manag. 30 (8-9), 1502–1508.
Burnley, S., Phillips, R., Coleman, T., Rampling, T., 2011. Energy implications of the thermal recovery of biodegradable municipal waste materials in the United Kingdom. Waste Manag. 31 (9-10), 1949–1959.
Cecchi, F., Traverso, P.G., Perin, G., Vallini, G., 1988. Comparison of co‐digestion performance of two differently collected organic fractions of municipal solid waste with sewage sludges. Environ. Technol. 9 (5), 391–400.
Chelliapan, S., Mahat, S.B., Din, M.F.M., Yuzir, A., Othman, N., 2012. Anaerobic digestion of paper mill wastewater. Iranica J. Energy Environ. 3, 85–90.
Cho, K.H., Kang, J.-H., Ki, S.J., Park, Y., Cha, S.M., Kim, J.H., 2009. Determination of the optimal parameters in regression models for the prediction of chlorophyll-a: a case study of the Yeongsan Reservoir, Korea. Sci. Total Environ. 407 (8), 2536–2545.
Cho, K.H., Sthiannopkao, S., Pachepsky, Y.A., Kim, K.-W., Kim, J.H., 2011. Prediction of contamination potential of groundwater arsenic in Cambodia, Laos, and Thailand using artificial neural network. Water Res. 45 (17), 5535–5544.
Dibike, Y.B., Velickov, S., Solomatine, D., Abbott, M.B., 2001. Model induction with support vector machines: introduction and applications. J. Comput. Civ. Eng. 15 (3), 208–216.
Dreyfus, G., 2005. Neural Networks: Methodology and Applications. Springer, Heidelberg.
Dreyfus, G., Martinez, J.-M., Samuelides, A., Gordon, M.B., Badran, F., Thiria, S., et al., 2002. Réseaux de Neurones de Méthodologie et Applications. Eyrolles.
Gallant, S.I., 1993. Neural Network Learning and Expert Systems. MIT Press, London.
García, A.J., Esteban, M.B., Márquez, M.C., Ramos, P., 2005. Biodegradable municipal solid waste: characterization and potential use as animal feedstuffs. Waste Manag. 25 (8), 780–787.
Gernaey, K.V., van Loosdrecht, M.C.M., Henze, M., Lind, M., Jørgensen, S.B., 2004. Activated sludge wastewater treatment plant modelling and simulation: state of the art. Environ. Model. Software 19 (9), 763–783.
Gill, M.K., Asefa, T., Kemblowski, M.W., McKee, M., 2006. Soil moisture prediction using support vector machines. J. Am. Water Resour. Assoc. 42 (4), 1033–1046.
Govindaraju, R.S., 2000. Artificial neural networks in hydrology. II: hydrologic applications. J. Hydrol. Eng. 5 (2), 124–137.
Hamed, M.M., Khalafallah, M.G., Hassanien, E.A., 2004. Prediction of wastewater treatment plant performance using artificial neural networks. Environ. Model. Software 19 (10), 919–928.
Hamzawi, N., Kennedy, K.J., McLean, D.D., 1998. Technical feasibility of anaerobic co-digestion of sewage sludge and municipal solid waste. Environ. Technol. 19 (10), 993–1003.
Han, M.J., Behera, S.K., Park, H.S., 2012. Anaerobic co‐digestion of food waste leachate and piggery wastewater for methane production: statistical optimization of key process parameters. J. Chem. Technol. Biotechnol. 87 (11), 1541–1550.
Hong, Y.-S.T., Rosen, M.R., Bhamidimarri, R., 2003. Analysis of a municipal wastewater treatment plant using a neural network-based pattern analysis. Water Res. 37 (7), 1608–1618.
Iacopozzi, I., Innocenti, V., Marsili-Libelli, S., Giusti, E., 2007. A modified activated sludge model no. 3 (ASM3) with two-step nitrification–denitrification. Environ. Model. Software 22 (6), 847–861.
Karul, C., Soyupak, S., Çilesiz, A.F., Akbay, N., Germen, E., 2000. Case studies on the use of neural networks in eutrophication modeling. Ecol. Model. 134 (2-3), 145–152.
Khalil, A., Almasri, M.N., McKee, M., Kaluarachchi, J.J., 2005. Applicability of statistical learning algorithms in groundwater quality modeling. Water Resour. Res. 41, W05010. http://dx.doi.org/10.1029/2004WR003608.
Khalil, A.F., McKee, M., Kemblowski, M., Asefa, T., Bastidas, L., 2006. Multiobjective analysis of chaotic dynamic systems with sparse learning machines. Adv. Water Resour. 29 (1), 72–88.
Khan, M.S., Coulibaly, P., 2006. Application of support vector machine in lake water level prediction. J. Hydrol. Eng. 11 (3), 199–205.
Kim, S.H., Shin, H.S., 2009. Acidogenesis of lipids-containing wastewater in anaerobic sequencing batch reactor. J. Korean Soc. Environ. Eng. 31 (12), 1075–1080.
Kim, J.K., Han, G.H., Oh, B.R., Chun, Y.N., Eom, C.-Y., Kim, S.W., 2008. Volumetric scale-up of a three stage fermentation system for food waste treatment. Bioresour. Technol. 99 (10), 4394–4399.
Krause, P., Boyle, D.P., Bäse, F., 2005. Comparison of different efficiency criteria for hydrological model assessment. Adv. Geosci. 5, 89–97.
Lee, T.L., 2004. Back-propagation neural network for long-term tidal predictions. Ocean Eng. 31 (2), 225–238.
Lee, C.Y., Shin, H.S., Chae, S.R., Nam, S.Y., Paik, B.C., 2003. Nutrient removal using anaerobically fermented leachate of food waste in the BNR process. Water Sci. Technol. 47 (1), 159–165.

Lee, D.H., Behera, S.K., Kim, J.W., Park, H.-S., 2009. Methane production potential of leachate generated from Korean food waste recycling facilities: a lab-scale study. Waste Manag. 29 (2), 876–882.
Lee, E., Seong, C., Kim, H., Park, S., Kang, M., 2010. Predicting the impacts of climate change on nonpoint source pollutant loads from agricultural small watershed using artificial neural network. J. Environ. Sci. 22 (6), 840–845.
Lek, S., Guégan, J.-F., 1999. Artificial neural networks as a tool in ecological modelling, an introduction. Ecol. Model. 120 (2-3), 65–73.
Lewis, R.M., Torczon, V., 2002. A globally convergent augmented Lagrangian pattern search algorithm for optimization with general constraints and simple bounds. SIAM J. Optim. 12 (4), 1075–1089.
Li, X.M., Cheng, K.Y., Selvam, A., Wong, J.W., 2012. Bioelectricity production from acidic food waste leachate using microbial fuel cells: effect of microbial inocula. Process Biochem. 48 (2), 283–288.
Liong, S.Y., Sivapragasam, C., 2002. Flood stage forecasting with support vector machines. J. Am. Water Resour. Assoc. 38 (1), 173–186.
Liong, S.Y., Khu, S.T., Chan, W.T., 2001. Derivation of Pareto front with genetic algorithm and neural network. J. Hydrol. Eng. 6 (1), 52–61.
Mahmoud, N., Zeeman, G., Gijzen, H., Lettinga, G., 2003. Solids removal in upflow anaerobic reactors, a review. Bioresour. Technol. 90 (1), 1–9.
Mata-Alvarez, J., 2003. Biomethanization of the Organic Fraction of Municipal Solid wastes. IWA Publishing, London.
Mata-Alvarez, J., Cecchi, F., Pavan, P., Llabres, P., 1990. The performances of digesters treating the organic fraction of municipal solid wastes differently sorted. Biol. Wastes 33 (3), 181–199.
McCulloch, W.S., Pitts, W., 1943. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5 (4), 115–133.
Mjalli, F.S., Al-Asheh, S., Alfadala, H.E., 2007. Use of artificial neural network black-box modeling for the prediction of wastewater treatment plants performance. J. Environ. Manage. 83 (3), 329–338.
Muttil, N., Chau, K.W., 2006. Neural network and genetic programming for modelling coastal algal blooms. Int. J. Environ. Pollut. 28, 223–238.
Oliveira-Esquerre, K.P., Mori, M., Bruns, R.E., 2002. Simulation of an industrial wastewater treatment plant using artificial neural networks and principal components analysis. Braz. J. Chem. Eng. 19 (4), 365–370.
Palani, S., Liong, S.-Y., Tkalich, P., 2008. An ANN application for water quality forecasting. Mar. Pollut. Bull. 56 (9), 1586–1597.
Poggi-Varaldo, H.M., Oleszkiewicz, J.A., 1992. Anaerobic co-composting of municipal solid waste and waste sludge at high total solids levels. Environ. Technol. 13 (5), 409–421.
Rivas, A., Irizar, I., Ayesa, E., 2008. Model-based optimisation of wastewater treatment plants design. Environ. Model. Software 23 (4), 435–450.
Rogers, L.L., Dowla, F.U., 1994. Optimization of groundwater remediation using artificial neural networks with parallel solute transport modeling. Water Resour. Res. 30 (2), 457–481.
Rumelhart, D.E., Mcclelland, J.L., 1986. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, Mass.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1988. Learning internal representations by error propagation. In: Collins, A., Smith, E.E. (Eds.), Readings in Cognitive Science. Morgan Kaufmann, pp. 399–421.
S. Cheon, J.H., Bae, Y., Park, S., Lim, J., Ha, C., Choi, Y., Lim, H., 2013. Examination of Inlet Conditions for Effective Anaerobic Digestion of Food Waste Leachate in Bio-reactor. SUDOKWON Landfill Site Management Corp., Incheon, South Korea (Available at: http://webbook.me.go.kr/DLi-File/094/003/002/5561166.PDF).
Schmit, K.H., Ellis, T.G., 2001. Comparison of temperature-phased and two-phase anaerobic co-digestion of primary sludge and municipal solid waste. Water Environ. Res. 73 (3), 314–321.
Shon, T., Moon, J., 2007. A hybrid machine learning approach to network anomaly detection. Inform. Sci. 177 (18), 3799–3821.
Singh, K.P., Basant, A., Malik, A., Jain, G., 2009. Artificial neural network modeling of the river water quality—a case study. Ecol. Model. 220 (6), 888–895.
Smith, M., 1993. Neural Networks for Statistical Modeling. International Thomson Computer Press.
Sosnowski, P., Wieczorek, A., Ledakowicz, S., 2003. Anaerobic co-digestion of sewage sludge and organic fraction of municipal solid wastes. Adv. Environ. Res. 7 (3), 609–616.
Soyupak, S., Karaer, F., Gürbüz, H., Kivrak, E., Sentürk, E., Yazici, A., 2003. A neural network-based approach for calculating dissolved oxygen profiles in reservoirs. Neural Comput. Appl. 12 (3-4), 166–172.
Trichakis, I.C., Nikolos, I.K., Karatzas, G.P., 2011. Artificial Neural Network (ANN) based modeling for Karstic groundwater level simulation. Water Resour. Manag. 25 (4), 1143–1152.
van Griensven, A., Meixner, T., Grunwald, S., Bishop, T., Diluzio, M., Srinivasan, R., 2006. A global sensitivity analysis tool for the parameters of multi-variable catchment models. J. Hydrol. 324 (1-4), 10–23.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer, New York, USA.
Vapnik, V.N., 1999. An overview of statistical learning theory. IEEE Trans. Neural Netw. 10 (5), 988–999.
Wang, W.J., Xu, Z.B., Lu, W.Z., Zhang, X.Y., 2003. Determination of the spread parameter in the Gaussian kernel for classification and regression. Neurocomputing 55 (3-4), 643–663.
Wintgens, T., Rosen, J., Melin, T., Brepols, C., Drensla, K., Engelhardt, N., 2003. Modelling of a membrane bioreactor system for municipal wastewater treatment. J. Membr. Sci. 216 (1-2), 55–65.
Yan, H., Zou, Z.H., Wang, H.W., 2010. Adaptive neuro fuzzy inference system for classification of water quality status. J. Environ. Sci. 22 (12), 1891–1896.
Yoon, H., Jun, S.-C., Hyun, Y., Bae, G.-O., Lee, K.-K., 2011. A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer. J. Hydrol. 396 (1-2), 128–138.
Yu, P.-S., Chen, S.-T., Chang, I.F., 2006. Support vector regression for real-time flood stage forecasting. J. Hydrol. 328 (3-4), 704–716.
Zita, A., Hermansson, M., 1994. Effects of ionic strength on bacterial adhesion and stability of flocs in a wastewater activated sludge system. Appl. Environ. Microbiol. 60 (9), 3041–3048.
