
28 © IWA Publishing 2020 Water Supply | 20.1 | 2020

Water quality monitoring: from conventional to emerging technologies

Umair Ahmed, Rafia Mumtaz, Hirra Anwar, Sadaf Mumtaz and
Ali Mustafa Qamar


Umair Ahmed
Rafia Mumtaz (corresponding author)
Hirra Anwar
Ali Mustafa Qamar
School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Pakistan

Sadaf Mumtaz
HITEC-Institute of Medical Sciences, National University of Medical Sciences, Pakistan

Rapid urbanization and industrial development have resulted in water contamination and water quality deterioration at an alarming rate, making their quick, inexpensive and accurate detection imperative. Conventional methods to measure water quality, including the manual analysis process carried out in a laboratory, are lengthy, expensive and inefficient. The research work in this paper approaches the problem from various perspectives, including the traditional methods of determining water quality, to gain insight into the problem, and the analysis of state-of-the-art technologies, including Internet of Things (IoT) and machine learning techniques, to address water quality. After analyzing the currently available solutions, this paper proposes an IoT-based low-cost system employing machine learning techniques to monitor water quality in real time, analyze water quality trends and detect anomalous events such as intentional contamination of water.

Key words | artificial intelligence (AI) techniques, internet of things, machine learning, real-time
monitoring, smart city, water quality


Water plays an important role in all aspects of our lives, and its quality is deteriorating with ever-increasing pollution due to urbanization, industrialization and population growth. To sustain quality of life, it is imperative to detect the pollutants that contaminate water. Typically, the assessment of water quality is a time-consuming and cumbersome task requiring manual laboratory analysis and statistical inferences (Gazzaz et al. ). There are several systems developed around the world that monitor and detect water pollution in real time. However, such research in Pakistan is limited. Although Pakistan has limited laboratory facilities through which more than 200 water quality parameters can be analyzed, laboratory analysis itself is time consuming and does not offer real-time detection of deteriorating water quality. In addition to this, Pakistan has no web portal for data visualization and public viewing, nor has there been any extensive local research in this direction. These factors are the primary motivation behind this research, which aims to increase the national impact.

The quality of water is affected by several parameters that are biological, chemical and physical in nature. No single parameter defines water quality completely, which is why the Water Quality Index (WQI) was developed to measure water quality. The computation of WQI is a lengthy process, which is why there is a need for an alternative method to simplify the WQI calculation. Additionally, certain water quality parameters are more expensive to obtain than others. As of today, Internet of Things (IoT) and machine learning are two promising technologies that can be employed as an alternative to solve the aforementioned water quality problems (Gazzaz et al. ).
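To make the idea of a single index concrete, the following is a minimal sketch of one widely used formulation, the weighted arithmetic WQI. It is an illustrative assumption and not necessarily the method adopted in the studies cited here: the function name, the three parameters, and their permissible limits and ideal values are hypothetical and chosen only for the example.

```python
def weighted_arithmetic_wqi(readings, standards, ideals):
    """Weighted arithmetic WQI: sum(w_i * q_i) / sum(w_i).

    readings, standards, ideals: dicts keyed by parameter name.
    """
    # Unit weights are inversely proportional to the permissible limit S_i.
    k = 1.0 / sum(1.0 / standards[p] for p in readings)
    weighted_sum = 0.0
    weight_total = 0.0
    for p, value in readings.items():
        s, ideal = standards[p], ideals[p]
        # Quality rating: 0 at the ideal value, 100 at the permissible limit.
        # The absolute deviation is an assumption so that readings below the
        # ideal value (for example an acidic pH) are penalized as well.
        q = 100.0 * abs(value - ideal) / abs(s - ideal)
        w = k / s
        weighted_sum += w * q
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical limits, ideal values and readings, for illustration only.
standards = {"pH": 8.5, "turbidity_NTU": 5.0, "TDS_mg_l": 500.0}
ideals = {"pH": 7.0, "turbidity_NTU": 0.0, "TDS_mg_l": 0.0}
readings = {"pH": 7.6, "turbidity_NTU": 3.0, "TDS_mg_l": 320.0}

wqi = weighted_arithmetic_wqi(readings, standards, ideals)
print(f"WQI = {wqi:.1f}")
```

Even this small version needs a permissible limit, an ideal value and a weight for every parameter, which suggests why index calculation becomes laborious as the number of parameters grows.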
doi: 10.2166/ws.2019.144


In view of the above, our research is directed towards analyzing different methods to estimate and monitor water quality using IoT and machine learning. As indicated before, the quality of water is determined by several parameters, but the single measure that summarizes water quality is the WQI. Different countries have different methods for calculating WQI, but all of them are computationally expensive (Gazzaz et al. ). To this end, we propose an IoT-based system that can monitor water quality parameters in real time, identify quality trends and predict water quality using machine learning methodologies.

This paper is organized into five sections as follows: the first section defines common water quality parameters and their role in determining the status of water quality. The second section discusses existing systems across the world with a comparative analysis. The third section highlights research conducted in the domain of water quality using manual laboratory analysis to gain more insight into the problem, machine learning algorithms employed in the domain, and IoT systems employed for water quality monitoring. The fourth section outlines our proposed system based on IoT and machine learning to provide real-time water quality monitoring. The last section concludes the paper and gives some future directions.

Water quality parameters

WQI is measured through different water quality parameters. The commonly used parameters (EPA 2001; Verma & Singh 2012) are briefly discussed below and their relations are defined in Table 1:

• pH: The pH of water specifies how acidic or alkaline the water is. The acidic range lies between 0 and 6 while the alkaline range lies between 8 and 14. The most acceptable pH range is 6.5–8.5. It is measured through electrometry and pH electrodes. It is significantly correlated with electrical conductivity, total hardness, sulphates and total suspended solids (EPA 2001; Bhandari & Nayal 2008; Verma & Singh 2012; Ali & Qamar 2013; Patel & Vaghani 2015).
• Turbidity: Turbidity of water is the measurement of non-filterable, divided solids in the water. This may also interfere with the treatment of water. It is mostly measured in Nephelometric Turbidity Units (NTUs). It is measured through a nephelometer or turbidimeter. It is significantly correlated with total hardness, electrical conductivity, sulphates, total dissolved solids and chemical oxygen demand (EPA 2001; Bhandari & Nayal 2008; Verma & Singh 2012; Patel & Vaghani 2015).
• Temperature: Temperature is one of the most important parameters and has a considerable effect on aquatic life. It also affects gas transfer rates and the amount of dissolved oxygen. It may alter the form of some of the elements or their concentration. It is mostly measured in degrees Celsius. Its measurement is carried out through a thermistor or thermometry in the field. It is highly correlated with electrical conductivity and loosely correlated with pH (EPA 2001; Verma & Singh 2012; Ali & Qamar 2013; Khatoon et al. 2013).
• Chloride (Cl): Chloride is naturally present in water. While an excess is not normally harmful to humans, the water tastes salty if the concentration exceeds 250 mg/l, and it may be harmful to agricultural activities. It is mostly measured through titration and expressed in mg/l. It is highly correlated with total hardness, electrical conductivity, total dissolved solids, biological oxygen demand and chemical oxygen demand (EPA 2001; Bhandari & Nayal 2008; Khatoon et al. 2013; Patel & Vaghani 2015).
• Electrical conductivity (EC): This indicates the water's potential to conduct electric current. It is not directly useful as a measure of water quality. Nevertheless, it reflects the water's ionic content, which in turn determines the hardness, alkalinity and some of the dissolved solids. The conductivity varies with the water source and is also correlated with pH, temperature, turbidity, chlorides, sulphates, dissolved oxygen, total dissolved solids and chemical oxygen demand. It is measured through the electrometric method (EPA 2001; Bhandari & Nayal 2008; Ali & Qamar 2013; Khatoon et al. 2013; Patel & Vaghani 2015).
• Dissolved oxygen (DO): This indicates oxygen's solubility in water. Water mostly absorbs oxygen from the atmosphere or produces it through photosynthesis. It is quite important for aquatic life. It is mostly measured through an electrometric meter or Winkler titration. It is highly correlated with electrical conductivity,


Table 1 | Water quality parameter correlation matrix (EPA 2001; Bhandari & Nayal 2008; Verma & Singh 2012; Ali & Qamar 2013; Khatoon et al. 2013; Patel & Vaghani 2015)


PH ✓ ✓ ✓
Turb ✓ ✓ ✓ ✓
Temp ✓
Cl ✓ ✓ ✓ ✓ ✓
EC ✓ ✓ ✓ ✓ ✓ ✓ ✓
DO ✓ ✓
TH ✓ ✓ ✓ ✓ ✓ ✓
TSS ✓ ✓
TDS ✓ ✓ ✓ ✓ ✓ ✓
BOD ✓ ✓ ✓
COD ✓ ✓ ✓ ✓ ✓
FC ✓
TC ✓
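How the sources behind Table 1 established each relation is not stated here. One common approach, shown below as a minimal sketch, is to compute pairwise Pearson correlation coefficients between parameters and mark the pairs whose absolute coefficient exceeds a cut-off. The threshold of 0.5, the helper names and the readings are assumptions made up for illustration and are not taken from any cited study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_marks(samples, threshold=0.5):
    """Return {(p, q): r} for parameter pairs with |r| >= threshold."""
    names = list(samples)
    marks = {}
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            r = pearson(samples[p], samples[q])
            if abs(r) >= threshold:
                marks[(p, q)] = r
    return marks


# Made-up readings for illustration only (not data from any cited study).
samples = {
    "EC": [410.0, 520.0, 615.0, 700.0, 830.0],
    "TDS": [262.0, 335.0, 390.0, 450.0, 530.0],
    "Temp": [21.0, 19.5, 24.0, 20.0, 22.5],
}

for (p, q), r in correlation_marks(samples).items():
    print(f"{p} ~ {q}: r = {r:+.2f}")
```

Each pair returned by `correlation_marks` would correspond to one check mark in a matrix laid out like Table 1.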

biological oxygen demand and sulphates (EPA 2001; Patel & Vaghani 2015).
• Total hardness (TH): This is an important parameter to determine water's suitability for domestic and industrial use. It is mostly the concentration of calcium and magnesium present in the water. Their concentrations in rocks lead to significant hardness levels in water. It is significantly correlated with pH, turbidity, chloride, total dissolved solids, biological oxygen demand and chemical oxygen demand. It is measured through titration with EDTA and expressed in mg/l CaCO3 (EPA 2001; Bhandari & Nayal 2008; Ali & Qamar 2013; Patel & Vaghani 2015).
• Total solids (TS): This is the amount of suspended and dissolved solids in the water. It indicates the remains in the water such as sulfur, phosphorus, calcium, etc. It is measured through the gravimetric (dried at stated temperature) method and expressed in mg/l (EPA 2001; Verma & Singh 2012).
• Total suspended solids (TSS): This is the amount of remains of inorganic and organic solid material suspended in the water. An increase in TSS makes the water absorb more light, which increases the water temperature and in turn decreases the water's capability to hold oxygen. It highly affects aquatic life. It is measured through the gravimetric (filtration, with drying at stated temperature) method and expressed in mg/l. It is significantly correlated with pH and total dissolved solids (EPA 2001; Verma & Singh 2012; Patel & Vaghani 2015).
• Total dissolved solids (TDS): This is the amount of remains of inorganic and organic soluble solids in the water. It is highly correlated with salinity and its increase makes the water saline. It is highly correlated with turbidity, chlorides, electrical conductivity, total hardness, total suspended solids and chemical oxygen demand. It is measured through the gravimetric (dried at stated temperature after filtration) method and expressed in mg/l (EPA 2001; Bhandari & Nayal 2008; Ali & Qamar 2013; Khatoon et al. 2013; Patel & Vaghani 2015).
• Biological oxygen demand (BOD): This is the amount of oxygen consumed by biological activities in the water, particularly those of protozoa and bacteria. If the BOD level is quite high and surpasses DO, then other organisms die due to a shortage of oxygen. It is quite an important factor indicating water quality and is significantly correlated with chloride, dissolved oxygen and total hardness. It is measured through the incubation technique with oxygen determinations by oxygen meter or the Winkler method and expressed in mg/l (EPA 2001; Bhandari & Nayal 2008; Verma & Singh 2012; Ali & Qamar 2013; Khatoon et al. 2013; Patel & Vaghani 2015).
• Chemical oxygen demand (COD): This is the amount of oxygen consumed during breaking down of organic


material and during oxidation of present inorganic material. Just like BOD, it is also an important factor representing the status of water quality and is highly correlated with turbidity, chlorides, electrical conductivity, total hardness and total dissolved solids. It is measured through microdigestion and colorimetry, or reflux distillation with acid potassium dichromate followed by titrimetry, and is expressed in mg/l (EPA 2001; Verma & Singh 2012; Patel & Vaghani 2015).
• Fecal coliform (FC): Fecal coliforms are bacteria that are found in human and animal waste and mostly originate in the intestines of warm-blooded species. They indicate possible fecal contamination of water. They are measured through the membrane filtration method or the most probable number (multiple tube) method and expressed in number of organisms/100 ml. They are loosely correlated with ammonia and significantly correlated with total coliform (EPA 2001; Cabral & Marques ).
• Total coliform (TC): Total coliforms consist of fecal coliforms and other types of similar non-fecal bacteria mostly found in soil. The total coliforms reflect the possible presence of pathogenic micro-organisms and are correlated with fecal coliforms. They are measured through the membrane filtration method or the most probable number (multiple tube) method and expressed in number of organisms/100 ml (EPA 2001; Cabral & Marques ).

Water quality detection systems

According to experts, the world has become more cautious after 9/11 about resources, particularly water, that could be intentionally polluted to stir up chaos amongst the masses. This brought up the need to have real-time monitoring and contamination detection systems in place. Consequently, several systems were developed, most of which focused on contamination event detection. One of the first such systems, Canary, was built by Sandia National Laboratories and was funded by the Environmental Protection Agency (EPA), National Homeland Security Research Center. It is currently deployed at Greater Cincinnati Water Works (GCWW). It provides several open source components, most of them being online water quality monitoring and contamination event detection. It employs multiple direct and surrogate sensors to transmit continuous data to SCADA. It has an API that allows the user to update its default algorithms. It is also Representational State Transfer (REST) web service friendly and allows for XML input and output. It supersedes other systems in certain major aspects, including the algorithms' transparency, the capability to directly integrate operational data into its event detection component, centralized processing on a single computing system and support for multiple sensors. Another such system is OptiEDS by Elad Salomons, which helps to detect anomalous water quality conditions in real time. It is also capable of water quality monitoring and water contamination detection in real time.

Bluebox is another system that can identify the water quality parameters that cause abnormal behavior. It produces a reliable output even if some of the parameters are missing. It first performs normalization, calculates the distances amongst the parameters in each data point and plots the frequency curves of the distances for visualization. However, it is quite expensive, costing up to $92,500, and does not have the capability to directly integrate operational data into event detection. Another system, Event monitor, was created by the Hach Company; it has a heuristic ability to learn events, automatically tune itself and define what constitutes an abnormality in the system. However, it is also quite expensive. Ana::tool is another EDS system, which falls under the umbrella of the larger moni::tool system introduced by s::can in 2010; it also includes a dashboard-style user interface that displays real-time water quality parameters (Canary ; EPA ).

The comparison of a few water quality and contamination detection systems based on the important parameters is given in Table 2.

LITERATURE SURVEY

The literature highlights the problem of water contamination and, due to its gravity, a lot of research work has been done to find a suitable solution encompassing state-of-the-art technology. A survey of various local and international research papers related to this problem area is summarized in Table 2. It has been observed that water


Table 2 | Comparison of water contamination detection systems currently available (Canary 2010)

Water contamination detection systems

Comparison parameter | HACH, Guardian Blue | S::can con::stat | Canary
Algorithm transparency | Proprietary | Proprietary | Fully transparent
Direct integration of operational data into event detection | No | No | Yes
Centralized processing on a single computing platform | No | No | Yes
Ability to work with sensors from multiple vendors | Custom request | Yes | Yes
Cost: event detection software (10 stations) | $92,500 | $60,000 | $0.00
Cost: required computing hardware (10 stations) | $0.00 (included) | $0.00 (included) | $3,500
Total cost | $92,500 | $60,000 | $3,500
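The normalize-then-measure-distance idea mentioned for Bluebox in the systems discussion can be illustrated with a short sketch. This is not Bluebox's actual algorithm, which the text does not specify. The z-score normalization, the Euclidean distance from a baseline, the fixed threshold, the function names and the readings are all assumptions made for illustration.

```python
import math
import statistics


def fit_baseline(history):
    """Per-parameter mean and standard deviation from normal-operation data."""
    columns = list(zip(*history))
    means = [statistics.fmean(c) for c in columns]
    stdevs = [statistics.pstdev(c) for c in columns]
    return means, stdevs


def distance_score(sample, means, stdevs):
    """Euclidean distance of a sample from the baseline, in z-score units."""
    return math.sqrt(sum(((v - m) / s) ** 2
                         for v, m, s in zip(sample, means, stdevs)))


# Hypothetical readings: (pH, turbidity in NTU, conductivity in uS/cm).
history = [
    (7.1, 1.2, 410.0), (7.2, 1.0, 405.0), (7.0, 1.3, 415.0),
    (7.1, 1.1, 400.0), (7.3, 1.2, 420.0), (7.2, 1.4, 412.0),
]
means, stdevs = fit_baseline(history)

THRESHOLD = 4.0  # assumed cut-off; a deployed system would tune this value
for sample in [(7.2, 1.2, 408.0), (6.1, 4.8, 640.0)]:
    score = distance_score(sample, means, stdevs)
    label = "anomalous" if score > THRESHOLD else "normal"
    print(sample, f"score={score:.1f}", label)
```

Normalizing first keeps a parameter with a large numeric range, such as conductivity, from dominating the distance.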

quality measurement, analysis, prediction, and anomaly detection have been addressed by researchers through manual calculation and laboratory analysis, machine learning methodologies employed to learn the trends of water quality parameters, and Internet of Things-based solutions. In this regard, the following sections discuss in detail the specialized work that has been carried out in the respective areas.

Research concerning statistical inference

The research work concerning manual calculation and laboratory analysis of various water samples is highlighted in this section. Daud et al. (), in a research study, collected various water samples across all the provinces of Pakistan. The samples were tested for different parameters and compared against the National Environmental Quality Standards (NEQS) and World Health Organization (WHO) standards. Most of the samples had a presence of total coliform, fecal coliform, and Escherichia coli, primarily due to the mixing of sewerage water and secondly due to industrial waste. The authors recommended installation and maintenance of treatment plants for the contaminated water and taking measures to ensure enforcement of NEQS. Alamgir et al. () have collected 46 piped water samples across different places of Orangi town, Karachi, Pakistan and tested them for bacteriological and physicochemical analyses using standard methods for the examination of water and wastewater. The standards against which the samples were compared included the WHO and National Standard for Drinking Water Quality (NSDWQ). The authors have calculated mean, median, minimum, maximum, standard deviation, quartile range and standard error for each of the parameters and have found physicochemical parameters to be well within limits except sulphates. Bacteriological parameters such as total fecal coliform and total coliform counts for these samples were critically high, reflecting poor hygienic and sanitation conditions. The authors recommended continuous monitoring of water quality and revamping of sewerage systems.

Ejaz et al. (), in their study, have conducted research on the dataset of the river Ravi by sampling its data for 3 years, from January 2005 to March 2007, from 14 sampling stations. The samples have been tested for 12 different parameters including biological oxygen demand (BOD), dissolved oxygen, chemical oxygen demand (COD), suspended solids, phosphorus, chloride, sodium, total nitrogen, nitrate, nitrite, oil and grease, and total coliforms. The standard methods for the examination of water and wastewater (Andrew et al. ) have been utilized for testing the aforementioned parameters. The parameter readings have been compared against the National Environmental Quality Standards of Pakistan (NEQS). For this purpose, expensive laboratory analysis has been carried out, which is a limitation of their work as well. In this research, it has been recommended to install more treatment plants and ensure enforcement of NEQS to improve the quality of water.

Batabyal & Chakraborty () conducted their research in the Kanksa-Panagarh area of West Bengal. They collected samples from 98 tube wells from November to December 2011 for the post-monsoon period and from May to June 2012 for the pre-monsoon period.


They tested the samples for 13 parameters including pH, total dissolved solids (TDS), total hardness, HCO3, Cl, SO4, NO3, F, Ca, Mg, Fe, Mn, and Zn against WHO () and Indian (BIS ) standards. In the analysis, the authors have performed correlation analysis to investigate the relations amongst the parameters. In addition to this, they have also calculated the WQI using the attained parameters, demonstrating in detail the Indian method to manually calculate the WQI.

Research concerning machine learning

This section explains the application of machine learning techniques in water quality prediction and trend analysis. Najafzadeh & Ghaemi () in their study have employed several such machine learning techniques to predict two important water quality parameters, namely five-day biological oxygen demand (BOD) and chemical oxygen demand (COD). They have used least square-support vector machine (LS-SVM) and multivariate adaptive regression spline (MARS) techniques and found them significantly more accurate than other conventional machine learning methods such as artificial neural networks. LS-SVM with polynomial kernel yielded the most accurate estimations in predicting BOD5, with a root mean square error (RMSE) of 5.463, while LS-SVM with RBF kernel predicted COD better, with an RMSE of 4.461. They have used nine water quality parameters as input, which are quite expensive to employ in an IoT system. Najafzadeh et al. () have conducted their study on the dataset of Karoun river, Iran. They have used nine independent water quality parameters to estimate three significant parameters, namely dissolved oxygen (DO), biological oxygen demand (BOD) and chemical oxygen demand (COD). They have employed three machine learning techniques: Model Tree (MT), Evolutionary Polynomial Regression (EPR) and Gene Expression Programming (GEP). The performance of GEP was slightly better in predicting BOD with RMSE of 5.388, while EPR's performance proved to be better in predicting COD and DO with RMSE of 4.997 and 4.728 respectively. The research concludes that an in-depth understanding of the input and output parameters is necessary when dealing with empirically derived equations for estimation, while the alternative of artificial intelligence (AI) techniques is much more convenient to employ. The only hurdle is the availability of readings of nine input parameters for prediction, which might turn out to be expensive if it is to be used in an IoT system. Shafi et al. () have used Internet of Things (IoT) for real-time monitoring of water quality and applied different machine learning methodologies to predict water quality. The authors have devised a low-cost real-time water quality monitoring kit using the ATMega328 microcontroller, pH sensor, turbidity sensor, and temperature sensors. The data analytics part of the research work is carried out on a dataset collected from 11 different water sources in Pakistan. To predict water quality, machine learning algorithms, including Support Vector Machine (SVM), k Nearest Neighbor (kNN), Artificial Neural Network (ANN) and deep neural networks, have been used. In their findings, deep neural networks yield the highest accuracy of 93%, while the second-best prediction algorithm is SVM with an accuracy of 91%. In this research work, accuracy, precision, and recall have been used for performance evaluation.

Sakizadeh () has conducted his research on a dataset of 47 wells and springs acquired from a ministry in Iran covering 2006–2013. His study considers 16 water quality parameters. He has used the method proposed by Horton () to calculate the WQI. Three methodologies have been employed in this research work: ANN with early stopping, ANN with ensemble averaging, and ANN with Bayesian Regularization. The correlation coefficients between the predicted and observed values of WQI were calculated to be 0.94 and 0.77, and the author concluded that ANN with Bayesian Regularization generalizes better than the rest of the techniques. However, this model is prone to over-fitting due to the small number of samples, so the study must focus on efficient generalization.

Abyaneh () has predicted two prominent and not easily acquired water quality parameters, biochemical oxygen demand (BOD) and chemical oxygen demand (COD), using multivariate linear regression (MLR) and ANN. BOD and COD are predicted using easily attainable parameters like pH, temperature (T), total solids (TS) and total suspended solids (TSS). This study has been conducted on the data acquired from the Ekbatan wastewater treatment plant, Iran. In order to validate the model, two


prominent evaluation criteria were used including MSE and coefficient of correlation (r). As evident in the results, ANN performed better than MLR in predicting BOD and COD. Using ANN with minimal input parameters, the evaluation metrics for BOD were RMSE = 25.1 mg/L and r = 0.83, and for COD were RMSE = 49.4 mg/L and r = 0.81. It was established that both the models predicted BOD better than COD and that pH had the most effect on the predictions.

Zhang et al. () have proposed a system to monitor water quality online and employed machine learning algorithms to help users make educated decisions. Continuous data from various websites have been gathered in a data repository for monitoring and analysis. The machine learning algorithms, including pixel-based adaptive segmenter and bag of words approach, are used on this data to help a user make informed decisions. The authors have conducted their study on Dublin Bay and have used YSI 6600EDS for continuous monitoring of turbidity, optical dissolved oxygen, temperature, conductivity, and depth. In this research, the authors have modified and used pixel-based adaptive segmenter from the image processing domain to detect anomalous events from a continuous data stream. Once anomalous events are detected, their features are extracted and clustered to perform decision making.

Ali & Qamar (), in their research, have mapped the water quality detection problem to the machine learning domain. They have conducted their research on Rawal watershed, situated in Islamabad, Pakistan. They collected 663 water samples from 13 different stations and tested them for appearance, temperature, turbidity, pH, alkalinity, hardness as CaCO3, conductance, calcium, total dissolved solids, chlorides, nitrates and fecal coliforms against WHO standards. The data was initially preprocessed; the missing values were filled in with the attribute mean and the outliers replaced by the attribute median, followed by correlation analysis to draw out the correlation amongst the parameters. In this paper, regression models are employed to check seasonal water quality trends (monthly and quarterly) and, since there was no WQI in the data, the authors have employed unsupervised learning: the Average Linkage (within groups) method of Hierarchical Clustering using Euclidean distance to categorize water quality. In the results, the higher values of fecal coliforms were found in the months of March, June, July, and October. However, their model had a clear limitation since no parameters other than fecal coliforms and turbidity were out of the standard limits in the data set. Hence, the accuracy was mainly dominated by turbidity and fecal coliforms.

Gazzaz et al. () have conducted their research on 255 samples from the Kinta river, Malaysia, obtained by their Department of Environment. This dataset comprises 9,180 data points derived from measurements on those samples. Thirty parameters from these samples have been acquired and reduced to 23 through Principal Factor Analysis (PFA). Initially, using the WQI method, the authors have calculated the WQI manually and then trained their model using ANN with a setting of 23-34-1. The dataset was partitioned into three parts; that is, 80% for training, 10% for validation and 10% for testing. The aforementioned setting explained 99.5% of the data accurately. However, the accuracy is dependent on the size and variation of the dataset.

Verma & Singh (), in their study, have acquired 73 samples from Jharia coalfield situated in Jharkhand, India. They used 58 of those samples for training and the rest for testing. A three-layer feed-forward backpropagation neural network was used and was trained for 1,000 epochs. Their model takes in six input parameters (temperature, pH, TS, TSS, DO, and oil and grease) and produces two outputs: BOD and COD. The results show RMSE values for BOD and COD of 0.114 and 9.83 respectively, and corresponding coefficients of correlation of 0.976 and 0.981. They concluded that ANN with Bayesian Regularization generalizes in the best possible way. One of the major limitations of this work is that the prediction of WQI is not carried out, but only estimation of BOD and COD is done, which might add to the error if it is used to predict WQI.

Mahapatra et al. () have proposed to use a fuzzy system to predict the WQI. Generally, conventional fuzzy systems are efficient; however, complexity affects efficiency. For this reason, the authors have proposed a cascaded fuzzy system that works better with complex problems. The proposed fuzzy system takes multiple inputs and gives out multiple outputs by using multiple fuzzy sub-systems. The


authors have validated their system on data collected from the Central Pollution Control Board (CPCB), India. They have used the data of six Indian rivers and estimated WQI using three water quality calculation methods i.e. Indian, Malaysian, and USA. Six parameters have been used for their case study, including pH, biological oxygen demand, dissolved oxygen, fecal coliform, electric conductivity, ammoniacal nitrogen, and temperature. Three fuzzy subsystems have been used, each for a different water quality criterion. Evidently, predictions of the system are quite close to the actual WQIs of each criterion, making the proposed system a better fit for the problem at hand than conventional fuzzy systems.

Bucak & Karlik () have emphasized the importance of real-time detection of water contamination and their research work is mostly focused on intentional contamination of water. The Cerebellar Model Articulation Controller Artificial Neural Network (CMACANN) has been utilized for contamination detection due to its fast learning capabilities. Five parameters have been monitored including pH, conductivity, chlorine residual, turbidity, and total organic carbon (TOC). To validate their model, they have intentionally introduced certain contaminants in the water such as sodium cyanide, sodium arsenate, sodium fluoroacetate, parathion, Cryptosporidium parvum oocysts, and a surrogate of Bacillus anthracis spores. Their model then detects the effects of contaminants and classifies them as an anomaly. The proposed model works far better than the conventional Multi-Layer Perceptron with Backpropagation (MLP with BP). Whereas the MLP achieves an accuracy of 98% after 1,000 iterations, the proposed model achieves 100% accuracy with far fewer iterations.

Yan et al. () have used an Adaptive Neuro-Fuzzy Inference System (ANFIS) to predict water quality status instead of the conventional Artificial Neural Networks and found ANFIS to be more efficient than the other. In this work, the dataset of major river basins of China was obtained from CNEMC, consisting of 845 observation samples. Three parameters have been selected for the classification model, including dissolved oxygen (DO), chemical oxygen demand (COD) and ammoniacal-nitrogen (NH3-N). The employed model combines the two algorithms, ANN and fuzzy logic, to map the water quality problem in an efficient manner. Fuzzy logic works in terms of IF-THEN rules which make it easier to interpret and map; however, generation of those rules and their outcomes require expert knowledge, which makes fuzzy logic unsuitable to our problem. However, ANN comes with certain adaptability, which enables ANFIS to combine the power of both algorithms. ANN allows ANFIS to learn and construct rules of fuzzy logic, which turns out to be more efficient than either of the models and classified 89.59% of the data correctly.

Rankovic et al. () conducted their study on Gruza reservoir, Serbia. They acquired 180 data samples by a monthly sampling for 3 years (2000–2003) through monitoring. They used 152 of those data samples for training and 28 for testing. The input parameters considered in this work include pH, temperature, chloride, total phosphate, nitrites, nitrates, ammonia, iron, manganese, and electrical conductivity and the predicted parameter is dissolved oxygen (DO). The Feed-forward Neural Network (FNN) model has been used to predict the dissolved oxygen. The Levenberg–Marquardt algorithm is used to train the FNN and the researchers have established that 15 hidden neurons give optimal results. The results of FNN models have been compared to the measured data based on correlation coefficient (r), Mean Absolute Error (MAE), and Mean Square Error (MSE). The limitation of this work is that they are predicting DO instead of WQI, which from our research topic’s perspective, might add to the error if we are to predict WQI using the predicted DO. Moreover, the model needs to be updated every now and then with real values to reflect the environmental changes.

Najah et al. () have used Artificial Neural Networks to predict three water quality parameters including total dissolved solids (TDS), electrical conductivity, and turbidity. This study has been conducted on two monitoring stations, Johor River and Sayong River, situated in Malaysia. A different methodology has been employed for each parameter as well as for each monitoring station. For TDS, backpropagation with two hidden layers and Bayesian regularization, but with distinct transfer functions for each monitoring station, has been used. TDS using EC has been predicted, since both are highly correlated as is evident in their results. The same methodology has been employed for EC and they predicted it using TDS given their correlation. For turbidity, FNN using backpropagation has been


employed with a single hidden layer and backpropagation. A distinct function for each monitoring station was used. Turbidity has been predicted using total suspended solids (TSS) since they are highly correlated. The selected models imitated each water quality parameter quite efficiently with minimal prediction error.

Rene & Saidutta () have used regression analysis and ANNs to predict biochemical oxygen demand (BOD) and chemical oxygen demand (COD) using other water quality parameters, including TOC, total suspended solids (TSS), total dissolved solids (TDS), phenol concentration, ammoniacal nitrogen (AMN) and Kjeldahl’s nitrogen (KJN). The regression analysis has been employed to find the correlation of TOC with BOD and COD. After regression analysis, 12 different models of ANN have been run using different combinations of the aforementioned water quality parameters to predict BOD and COD. The Average Relative Error (ARE) is used to find the accuracy of the model. The model has seven hidden neurons in the hidden layer and a training count of 5,000 with TOC, phenol, TSS and AMN predicting the BOD most effectively, having an ARE of 11.66%. Similarly, the model having eight hidden neurons and a training count of 1,500 with TOC, phenol and TDS as input predicted COD in a better manner, having an ARE of just 6.97%. The model with six hidden neurons and a training count of 5,000 with TOC, phenol, TSS and TDS as input performed effectively for both BOD and COD with ARE for BOD as 8.20% and ARE for COD as 11.08%. The empirical relations formed amongst various parameters are quite reliable and bring versatility in the domain.

Research concerning internet of things (IoT)

The IoT domain proposes smart and low-cost solutions to the problem of water contaminant detection and water quality analysis. Geetha & Gouthami () have proposed a generic IoT system for real-time water quality monitoring. It comprises sensors that take parameter readings, then those parameter readings are transmitted to a controller through wireless communication devices attached to the sensors. Later the controllers, through wireless communication technology, store those sensor readings to data storage, which are reflected in a customized application. This is followed by the implementation of an instance of the generalized IoT system. They used four parameter sensors for this, namely conductivity, turbidity, water level, and pH. For connectivity, they used TI CC3200, which is a single-chip microcontroller with built-in Wi-Fi module and ARM Cortex M4 core, which can be connected to the nearest Wi-Fi hotspot for internet connectivity and in turn move the data to the cloud or storage, and then an application could use that data storage to reflect the readings. If sensors were not connected directly to the controller, they could be connected using LoRa sensors.

Encinas et al. () have presented a prototype for water quality monitoring of ponds. They have used temperature, pH and dissolved oxygen sensors, an Arduino module, and ZigBee transmitters and receivers. For the software end, the MySQL database along with SOAP web services and applications developed in C# and Android are part of this solution. The C# application allows sending a request for sensor readings through Arduino and a multiplexer. Once requested, a sensor takes readings and sends them back to the computer through a ZigBee transmitter and the readings are then received by a ZigBee receiver attached to the computer. The received readings are then saved in the local database and are sent to the cloud through a web service and are eventually visualized in the Android application. AI is not used in the system, but it does set the base for AI to be used in the future for effective real-time decision making.

Raju & Varma () have proposed a real-time monitoring system for aqua farmers which allows the farmers to be apprised of the anomalous events if the water body is contaminated. They have used Raspberry Pi 3 with built-in Wi-Fi module, a solar panel and a sensor node comprising various sensors including those of dissolved oxygen, ammonia, pH, temperature, salt, nitrate, and carbonates. The system continuously monitors and stores the sensor data and generates an alert for the farmer if any of the data deviates from the allowed range. There is a mobile application for the farmer through which he can monitor the sensor data in real-time and access historical data. Vijai & Sivakumar () have proposed an IoT framework for real-time water quality monitoring, demand forecasting, and anomaly detection. In the proposed system, the parameters of turbidity, chlorine, oxidation-reduction potential (ORP), nitrates,


pH, conductivity, and temperature have been considered and respective sensors have been used to take the various measurements. The connectivity is achieved using one of several options, i.e. 3G, Bluetooth, ZigBee, etc. All these components, when connected, make a centralized system requiring a steady power supply to keep the system online. The authors have also proposed two other components of Demand forecasting and Anomaly detection. For anomaly detection, the technique of ANN and fuzzy system has been utilized. However, the proposed system has not been tested with any data set.

Birje et al. (), in their paper, have proposed a system to monitor two of the most descriptive water quality parameters, which particularly determine whether water is safe for aquatic life or not. The pH sensor, along with the pH meter and turbidity sensor, has been used to sense these readings from the water body. These readings are sent to an analog to digital converter (ADC) which in turn sends the digital readings to 16F887A PIC microcontroller to be shown on an LCD. Their work is extendable to employ GSM technology for communication. Cloete et al. (), in their research, have designed a sensor node which consists of temperature, conductivity, pH, ORP and flow sensors. Since the commercially available sensors are expensive, they have implemented the sensor designs themselves, thus making the system cost-effective. The signals generated from the sensors then go through conditioning in order to be able to interface with the microcontroller. Apart from the sensor node, their proposed system makes use of a ZigBee module to receive and transmit the measurements and a microcontroller to notify the measurements. All the measurements are then shown on an LCD in front of their respective labels and a buzzer goes off if any of the measurements goes out of its allowable limits.

Wong & Kerkez () have emphasized the importance of using real-time data along with historical data for water quality detection. The authors have also discussed the flexibility that comes with using web services along with the IoT platforms. Since some of the water quality constituents are difficult to measure or their sensors are too expensive for a cost-effective solution, this solution utilizes adaptive sampling of water along with easily available sensors. Adaptive sampling, instead of sampling after predefined intervals, adapts and samples only when an event of interest occurs; for example, flood, and minimizes the number of samples to be taken. The NeoMote wireless sensing platform has been used, which consists of an ARM-Cortex M3 microprocessor for computing, and the Xively IoT platform, which is used as an interface for the services. The sensor node was connected to an automated sampler (ISCO 3700) which had a 24-bottle capacity. To emphasize the flexibility of web services, the authors have used three web services, each of which is developed using a different programming language. One of them was used to receive commands for sampling and transmitting data; the second service used the adaptive sampling algorithm and sent the sampling commands to the first service. The third web service was used to interface with the IoT platform in order to access the historical data and communicate with the sensors. The data transfer among these services was in the form of JSON, due to convention, but it also supports the XML format.

Perumal et al. () have presented a prototype for measuring the water level in real-time, and consequently generate alarms for authorities and on social networks, in case of alarming events like floods etc. The ultrasonic sensors have been utilized along with a wireless gateway, ATmega328P controller and a cloud server. After frequent intervals, ultrasonic sensors determine the distance between the water level and the sensor by sending a sound wave and estimating the water level by its reflection. Once determined, the water level reading is transmitted to the cloud server through a wireless gateway, where it is stored in a database. If the water level crosses a certain predefined threshold, an alarm is generated to alert the authorities or to broadcast it on social networks like Twitter. In addition to that, the data regarding water level stored on a cloud server is visualized through a web application to learn the trends and perform decision making. Vijayakumar & Ramya () have proposed an online water quality monitoring system, which employs five sensors including sensors of temperature, pH, turbidity, conductivity, and dissolved oxygen. The Raspberry Pi B+ and IoT module USR-WIFI232-X-V4.4 have been used to transmit the data collected from sensors to the cloud through a gateway. The proposed system provides water quality monitoring and is suggested to be installed in different locations in a pond to collect real-time water quality data.
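The threshold-based alerting described for the water level prototype of Perumal et al. can be illustrated with a short sketch. This is not the authors' implementation: the sensor mounting height, the alarm threshold and the function names are assumptions chosen for the example, and the speed of sound is the usual approximation for air at about 20 °C.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at ~20 °C


def distance_from_echo(echo_time_s: float) -> float:
    """Distance from sensor to water surface, given the round-trip echo time."""
    # The pulse travels to the surface and back, so halve the path length.
    return SPEED_OF_SOUND_M_S * echo_time_s / 2.0


def water_level(sensor_height_m: float, distance_m: float) -> float:
    """Water level above the bed for a sensor mounted at sensor_height_m (assumed)."""
    return max(sensor_height_m - distance_m, 0.0)


def should_alarm(level_m: float, threshold_m: float) -> bool:
    """True when the level crosses the predefined threshold (hypothetical value)."""
    return level_m > threshold_m


if __name__ == "__main__":
    distance = distance_from_echo(0.01)   # 10 ms round trip
    level = water_level(3.0, distance)    # sensor assumed 3 m above the bed
    print(round(level, 3), should_alarm(level, threshold_m=1.0))
```

In a deployed system the boolean result would trigger the notification step (for example, a message to the authorities); that part is omitted here because the source does not describe its interface.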


Cao et al. () have proposed an inexpensive, easy to is modeled to consume low amounts of battery, as it goes
set up wireless network to monitor water quality using into sleep mode when there is no request for data to be read.
ISFET micro-sensors and mobile communication. The Several methods employing statistical inference,
micro-sensors are deployed on the site to measure important machine learning and IoT were reviewed in the preceding
water quality parameters and send the measurements to sen- sections. Most of the research concerning statistical infer-
sing end device (ED) nodes attached to the sensors. ED ence used laboratory analysis such as electrometry for pH
nodes then transmit the measurements to the sensing readings, titration for hardness and membrane filtration
access point (AD) node, which is connected to a database method for coliforms etc. followed by WHO range examin-
server, where the sensor data are stored for future use and ation and WQI calculation through different empirical
visualization. A mobile network has been used for com- methods such as DoE-WQI and IWQI (EPA ; Gazzaz
munication between ED and AD nodes. The system was et al. ; Batabyal & Chakraborty ). Research concern-
programmed to collect sensor data automatically every ing machine learning explored different supervised learning
two hours. To experiment with their proposed system, they techniques such as ANN, SVM, LS-SVM, MARS, EPR,
used two micro-sensors for pH and temperature. GEP, KNN, MLR, ANFIS, FNN, CMACANN and fuzzy sys-
Rasin & Abdullah () have proposed a cost-effective tems to estimate water quality and other water quality
online water quality monitoring system using wireless sensor parameters (Yan et al. ; Bucak & Karlik ; Mahapatra
networks (WSNs). Their system consists of two modules: a et al. ; Gazzaz et al. ; Abyaneh ; Najafzadeh
wireless node and a base monitoring station. The wireless et al. ; Shafi et al. ; Najafzadeh & Ghaemi ).
node consists of a sensor unit and a microprocessor and is Research concerning IoT explored several low cost IoT sys-
powered by a 9 V battery. They use the ZMN2405HP tems designed for water quality monitoring. Most of the
ZigBee module, which consists of a CC2430 transceiver employed hardware included parameter sensors, the TI
IC. The inexpensive pH, temperature and turbidity sensors CC3200 microcontroller, Arduino module, Raspberry Pi 3,
have been used and readings from these sensors go through 16F887A PIC, ARM-Cortex M3 and ATmega328P, while
signal conditioning to determine their validity. Once con- most systems established communication through LoRa,
ditioned, the wireless sensor node sends the readings to Zigbee, a built-in Wifi module and GSM (Perumal et al.
the base monitoring station through the transceiver. The ; Birje et al. ; Wong & Kerkez ; Encinas et al.
other ZigBee module, consisting of the transceiver at the ; Geetha & Gouthami ; Raju & Varma ). The
base monitoring station, receives the readings and sends to reviewed methodologies built a foundation for a real-time
the computer using the RS 232 protocol. The received water quality system using IoT and machine learning
data are then visualized on a custom GUI developed in methodologies. It reflected that most substantial studies
Cþþ. Wang et al. () have proposed a low cost, low employing machine learning techniques to estimate water
power, and long-distance supervisory system based on the quality through WQI used at least six to nine parameters
WSN for aquaculture. Their proposed system consists as input, which is expensive in terms of IoT hardware
of two modules, a coordinator and a sensor node. The given the cost and availability of the sensors. Hence, we
coordinator is composed of a ZigBee based wireless found a gap in machine learning techniques for estimating
communication module, which uses a CC2430 chip with water quality with a lesser number of parameters in order
an RF transceiver and an ADC, and a GPRS module to to build low cost water quality systems.
transmit the data to the monitoring computer, which Table 3 provides a detailed description and analysis in
stores the data and helps in visualization. The sensor node the form of a comparison of various research papers encom-
contains the sensors, which read the water quality par- passing the domains of IoT, machine learning and manual
ameters and apply signal conditioning on the readings to laboratory analysis. The comparison of the research papers
prepare them to be digitized. After signal conditioning, the has been carried out based on the parameters of the method-
readings are sent to the coordinator, where these are digi- ology employed by researchers for the problem, limitations
tized and processed further. In addition to this, the system of the research paper, the dataset used, employed water


Table 3 | The comparison of various research papers related to water quality detection

1. Najafzadeh & Ghaemi ()
   Methodology: Estimating BOD5 and COD using LS-SVM
   Limitations: Uses 9 WQPs which is impractical for an IoT system
   Dataset: 200 datasets from Karoun River
   Parameters: Ca²⁺, Na⁺, Mg²⁺, NO2, NO3, PO₄³⁻, EC, pH and turbidity
   Results: RMSE(BOD5) = 5.463 and RMSE(COD) = 4.461
   Hardware: N/A

2. Najafzadeh et al. ()
   Methodology: Estimating BOD, COD and DO using MT, EPR and GEP
   Limitations: While it is certainly a better alternative than empirical calculations it still uses 9 rendering its use for IoT
   Dataset: Dataset of Karoun River, Iran
   Parameters: Ca²⁺, Na⁺, Mg²⁺, NO2, NO3, PO₄³⁻, EC, pH and turbidity
   Results: RMSE(BOD) = 5.388, RMSE(COD) = 4.997 and RMSE(DO) = 4.728
   Hardware: N/A

3. Shafi et al. ()
   Methodology: Monitoring using sensors and classifying water quality using DNN, NN, SVM & KNN
   Limitations: Classifies water quality only into two categories, i.e. good or poor. The standard WQI has not been used
   Dataset: Dataset of 667 samples collected from PCRWR
   Parameters: pH, turbidity, temperature
   Results: Accuracy: DNN 93%, SVM 91%, NN 86%, kNN 76%
   Hardware: ATMega328, LCD and parameter sensors

4. Geetha & Gouthami ()
   Methodology: Monitoring using sensors and cloud infrastructure
   Limitations: Only monitoring, no prediction
   Dataset: N/A
   Parameters: Conductivity, turbidity, water level, pH
   Results: N/A
   Hardware: TI CC3200 controller and parameter sensors

5. Daud et al. ()
   Methodology: General review of water quality across all provinces of
   Limitations: Only manual laboratory analysis
   Dataset: Manual samples gathered across Pakistan
   Parameters: Total coliform, fecal coliform, E. coli
   Results: Excessive total coliform due to sewerage
   Hardware: N/A

6. Encinas et al. ()
   Methodology: Water quality monitoring using sensors and SOAP web services
   Limitations: Only monitoring, no prediction
   Dataset: N/A
   Parameters: Temperature, pH, dissolved oxygen
   Results: N/A
   Hardware: Parameter sensors, Arduino module and ZigBee

7. Raju & Varma ()
   Methodology: Real-time monitoring system and mobile application for aqua farmers to be apprised of
   Limitations: Only provides monitoring, does not process data for trends
   Dataset: N/A
   Parameters: DO, ammonia, pH, temperature, salt, nitrate and carbonates
   Results: N/A
   Hardware: Raspberry Pi3 with built-in Wi-Fi module, a solar panel and a sensor node



8. Wong & Kerkez ()
   Methodology: Adaptive sampling of water using adaptive sampling algorithm, Xively IoT platform and web services to monitor water quality
   Limitations: This work does not monitor water quality in real-time but through sampling, it just provides monitoring. No predictive analysis
   Dataset: N/A
   Parameters: N/A
   Results: N/A
   Hardware: NeoMote wireless sensing platform: ARM-Cortex M3 microprocessor and Xively IoT platform. Automated sampler (ISCO 3700 with 24 bottle capacity)

9. Vijai & Sivakumar ()
   Methodology: Artificial neural network (ANN) and fuzzy systems
   Limitations: Proposes a generic IoT system without any dataset and results
   Dataset: N/A
   Parameters: Turbidity, chlorine, ORP, nitrates, pH, conductivity & temperature
   Results: N/A
   Hardware: Sensors, connectivity: 3G, Bluetooth, ZigBee

10. Sakizadeh ()
   Methodology: ANN with early stopping, ANN with ensemble averaging and ANN with Bayesian
   Limitations: Prone to overfitting with fewer samples
   Dataset: 47 wells and springs (2006–2013) from Ministry of Iran
   Parameters: 16 groundwater quality variables. To calculate mentioned WQI
   Results: Bayesian regularization. WQI cor: 0.94 and 0.77
   Hardware: N/A

11. Alamgir et al. ()
   Methodology: Bacteriological and physico-chemical analyses
   Limitations: Only manual laboratory analysis
   Dataset: Forty-six samples of piped water in Orangi town, Karachi, Pakistan 2014
   Parameters: pH, TSS, TDS, turbidity, TCC, TFC, TFS
   Results: Well within limits except for sulphates and total fecal coliform
   Hardware: N/A

12. Batabyal & Chakraborty ()
   Methodology: Calculates WQI using manual Indian method
   Limitations: Manual calculations
   Dataset: 98 tube wells
   Parameters: pH, TDS, total hardness, HCO3, Cl, SO4, NO3, F, Ca, Mg, Fe, Mn, and Zn
   Results: Poor water quality was attributed to high contents of TDS, NO3, and Cl
   Hardware: N/A

13. Vijayakumar & Ramya ()
   Methodology: Monitoring employing IoT through sensors and cloud
   Limitations: Only provides monitoring
   Dataset: N/A
   Parameters: Temperature, pH, turbidity, conductivity, DO
   Results: N/A
   Hardware: Sensors, Raspberry Pi B+, IoT module USR-

14. Abyaneh ()
   Methodology: Multivariate linear regression (MLR), artificial neural networks (ANN), RMSE
   Limitations: Only predicts BOD, which does not completely reflect water quality
   Dataset: Data acquired from Ekbatan wastewater treatment plant, Iran
   Parameters: pH, temperature, total suspended, total suspended solids
   Results: Both models predicted BOD better than COD and pH had the most effect on the prediction
   Hardware: N/A



15. Zhang et al. ()
   Methodology: Continuous monitoring, pixel-based adaptive segmenter, and bag of words
   Limitations: Does not predict water quality, just clusters possible anomalous events
   Dataset: Dublin Bay
   Parameters: Turbidity, optical dissolved oxygen, temperature, conductivity, depth
   Results: N/A
   Hardware: YSI 6600EDS

16. Ali & Qamar ()
   Methodology: Preprocessing: attribute mean, regression models, hierarchical clustering
   Limitations: Biased dataset: No other parameters except fecal coliforms and turbidity were out of standard limits
   Dataset: 13 different stations, 2009 to 2012, 663 water samples
   Parameters: Appearance, temperature, turbidity, pH, alkalinity, hardness as CaCO3, conductance, calcium, TDS, nitrates, fecal
   Results: High fecal coliforms were found in the months of March, June, July, and October
   Hardware: N/A

17. Verma & Singh ()
   Methodology: ANN with Bayesian regularization: 1,000 epochs
   Limitations: Does not calculate WQI but predicts BOD and COD
   Dataset: 73 samples (58 for training and 15 for testing)
   Parameters: Six inputs (temp, pH, TS, TSS, DO and oil and grease) and two outputs (BOD and COD)
   Results: (RMSE) values for BOD and COD are 0.114 and 9.83% and correlation is 0.976 and 0.981
   Hardware: N/A

18. Gazzaz et al. ()
   Methodology: Artificial neural network, 23-34-1
   Limitations: Requires larger dataset
   Dataset: 9,180 data points, 255 samples
   Parameters: 30 parameters reduced to 23 through PFA
   Results: Predictions explain almost 99.5% of the
   Hardware: N/A

19. Rankovic et al. ()
   Methodology: FNN. Levenberg–Marquardt algorithm is used to train the FNN. 15 hidden neurons
   Limitations: WQI not calculated but predicts DO, which might result in error ahead if WQI is calculated using it
   Dataset: 180 data samples, 152 train, 28 test
   Parameters: pH, temperature, chloride, total phosphate, nitrites, nitrates, ammonia, iron, manganese, and electrical conductivity
   Results: Correlation coefficient (r), mean absolute error (MAE) and mean square error (MSE) indicate accurate results
   Hardware: N/A
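Several Results entries in Table 3 are reported as RMSE, MAE, MSE or a correlation coefficient (r). As a reminder of what these figures measure, the following minimal sketch computes them in plain Python. The function names and the sample values are illustrative assumptions and are not taken from any of the reviewed papers.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean square error between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error, in the same units as the measured quantity."""
    return math.sqrt(mse(y_true, y_pred))


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    measured = [1.0, 2.0, 3.0, 4.0]    # made-up example values
    predicted = [1.5, 2.0, 2.5, 4.0]
    print(mae(measured, predicted), mse(measured, predicted),
          rmse(measured, predicted), pearson_r(measured, predicted))
```

Lower MAE, MSE and RMSE indicate closer agreement with measurements, while r close to 1 indicates that predictions track the measured trend.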

quality parameters, results, and hardware used in the proposed solution.

WATER QUALITY MONITORING AND DETECTION SYSTEM (WQMDS)

This section highlights in detail the water quality monitoring and detection system that we have proposed, which not only monitors the water quality in real time but also predicts the trends of water quality and recognizes anomalous events. The high-level architecture of the proposed system comprising various modules including the sensing module, the coordinator module, the data processing and analysis module and the storage and core analytics module is shown in Figure 1.

The detailed description of each of the modules involved in the proposed system is given next.

Sensing module

The sensing module contains several sensors to measure four of the most important water quality parameters that are used to detect water quality. This module is responsible


Figure 1 | The architecture of the proposed water quality detection and monitoring system (WQDMS).

for measuring the parameter readings and transmitting these readings to the coordinator module. The following four sensors have been utilized: the pH, turbidity, temperature, and total suspended solids sensors, which sense the respective parameters from the water body.

Coordinator module

The coordinator module is responsible for coordinating between the sensing module and the data processing and analysis module. The coordinator module uses the Arduino microcontroller, which signals the sensors to take the real-time readings and in return receives all the parameter measurements from the various sensors that are connected to it. Once sensor measurements are received, the coordinator module transmits them through a ZigBee transceiver to the on-site computer.

Data processing and analysis module

The data processing and analysis module, which is connected to the coordinator module through a transceiver, comprises various web services, including the data pre-processing service, storage service, and the data analysis service. Once the real-time measurements are received from the coordinator module, they are stored locally in a MySQL database using the storage service. In addition to this, the data pre-processing service processes the data that is received in real time, including the filtering of useful data.

Since a large amount of data is available at the local server, analysis can be carried out to find hidden trends and to explore other useful information. This process is also carried out at this layer.

Storage and core analytics module

The storage and core analytics module fulfills two major responsibilities; that is, firstly to ensure the long-term storage of water parameter readings and secondly to predict water quality trends using machine learning techniques and detect anomalous events.

Once the data have passed the pre-processing stage, they are transferred to the cloud using the REST web service.
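The filtering performed by the data pre-processing service before readings are forwarded to the cloud could look like the following sketch. The source does not specify how the filtering is done, so everything here is an assumption made for illustration: the field names, the plausibility ranges and the `filter_reading` helper are hypothetical. Only the four monitored parameters (pH, turbidity, temperature and total suspended solids) come from the system description.

```python
import math
from typing import Optional

# Hypothetical plausibility ranges per parameter (assumed values, not from the paper).
PLAUSIBLE_RANGES = {
    "ph": (0.0, 14.0),
    "turbidity_ntu": (0.0, 1000.0),
    "temperature_c": (0.0, 50.0),
    "tss_mg_l": (0.0, 5000.0),
}


def filter_reading(reading: dict) -> Optional[dict]:
    """Return a cleaned reading, or None if any parameter is missing or implausible."""
    cleaned = {}
    for name, (low, high) in PLAUSIBLE_RANGES.items():
        value = reading.get(name)
        if value is None:
            return None  # incomplete record, drop it
        try:
            value = float(value)
        except (TypeError, ValueError):
            return None  # non-numeric value, drop the record
        if math.isnan(value) or not (low <= value <= high):
            return None  # outside the assumed physical range
        cleaned[name] = value
    return cleaned


if __name__ == "__main__":
    raw = {"ph": 7.2, "turbidity_ntu": 3.5, "temperature_c": 24, "tss_mg_l": 12}
    print(filter_reading(raw))               # kept, values converted to float
    print(filter_reading({**raw, "ph": 15})) # dropped: pH outside 0 to 14
```

Records that survive this step would then be stored locally and passed on through the web service; dropping a whole record on a single bad value is a deliberately simple policy chosen for the sketch.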


Various machine learning algorithms are applied to predict CONCLUSION AND FUTURE WORK
water quality. The detection of anomalous events, including
out of range parameter readings or any other malfunction- Water is one of the most essential resources for survival and
ing, is also detected through this module. The machine its quality is determined by the WQI, which is measured
learning module implements two machine learning through various water quality parameters depending upon
models deployed on the cloud along with the data; one the type of standard used. Conventionally, to measure
of them is trained using ANN, which detects the anomalies water quality parameters, expensive and time-consuming
and informs the dashboard about them. The other is laboratory analysis is performed, which makes timely con-
trained using an unsupervised learning technique; that is, taminant recognition and its ramifications difficult.
K-Means clustering, which classifies water quality into Alternatively, an IoT-based system can be employed to
three clusters: Class 0, Class 1 and Class 2. The normal monitor water quality in real time, which is an efficient
data points would be assigned to cluster 1 and would rep- and low-cost solution to the problem. Several such systems,
resent water fit for drinking. Similarly, another cluster such as CANARY, are deployed at various places using IoT
would represent water fit for uses other than drinking effectively and they have proved to be an effective alterna-
and the last cluster would represent water unfit for any tive to expensive manual laboratory analysis. While IoT
use. For any new arriving point, if it passes the anomaly systems are employed for real-time water quality monitoring,
detection, the system would query the deployed K-Means machine learning techniques such as artificial neural net-
model to be sure what cluster it should belong to. This works (ANN), support vector machines (SVM), regression,
helps to get a rough measure of its deviation from normal, if there is any. The aforementioned machine learning models can be queried through the web service written in Java.

Application dashboard

The proposed system has an application dashboard, which is used for visualization of the water quality data on desktop, web and mobile platforms. Once the real-time data is received and stored, this dashboard, which has been developed in Java, visualizes that information in the form of graphs and heat maps. The safety ranges will also be displayed on the interface as well as alerts notifying the admin regarding any anomalous events. The web dashboard has two instances, one of which is deployed on the local desktop computer on the site, which is connected to the local backup database and is accessible only on site. The other instance is deployed on the internet and uses the REST web service to access the cloud for the data, and uses that for visualizations on the dashboard. The Android application also accesses the data from the cloud through the same web service and plots visualizations. Both applications generate an on-screen alert if anything seems to be gravely out of limit and suggest an informed measure in case of an anomaly.

correlation analysis, hierarchical clustering etc. aid in learning the trends of the water quality parameters, predicting WQI, and detecting anomalous events like intentional contamination to enable real-time contamination detection and action. The proposed system makes use of IoT for real-time monitoring and uses machine learning algorithms to learn various trends in the data to incorporate its learning in the system and aid in decision making. Additionally, the review found a gap in the machine learning methodologies to estimate water quality with a lesser number of parameters, which can be easily employed in a low cost IoT system using minimal number of parameter sensors.
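The safety-range check behind the dashboard alerts can be illustrated with a minimal Java sketch. The paper does not give the implementation, so everything here is an assumption made for illustration: the class name SafeRangeCheck, the parameter keys, and the numeric limits, which stand in for whatever drinking water standard a deployment adopts.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical range check; names and limits are illustrative assumptions. */
class SafeRangeCheck {

    /** Inclusive safe interval for one water quality parameter. */
    static final class Range {
        final double min;
        final double max;

        Range(double min, double max) {
            this.min = min;
            this.max = max;
        }

        boolean contains(double value) {
            return value >= min && value <= max;
        }
    }

    // Assumed limits for illustration only; a real deployment would load
    // the ranges defined by the drinking water standard it follows.
    static final Map<String, Range> SAFE_RANGES = new LinkedHashMap<>();
    static {
        SAFE_RANGES.put("pH", new Range(6.5, 8.5));
        SAFE_RANGES.put("turbidity_NTU", new Range(0.0, 5.0));
    }

    /** Returns an alert message when a reading falls outside its safe range. */
    static Optional<String> check(String parameter, double value) {
        Range range = SAFE_RANGES.get(parameter);
        if (range == null || range.contains(value)) {
            return Optional.empty(); // unknown parameter, or reading within limits
        }
        return Optional.of(String.format(Locale.ROOT,
                "ALERT: %s = %.2f is outside the safe range [%.2f, %.2f]",
                parameter, value, range.min, range.max));
    }

    public static void main(String[] args) {
        // Only the out-of-range pH reading produces an alert line.
        check("pH", 9.2).ifPresent(System.out::println);
        check("turbidity_NTU", 1.3).ifPresent(System.out::println);
    }
}
```

Returning an Optional keeps the check separate from how the alert is shown, so the same method could serve both the web dashboard and the Android application.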


First received 1 May 2019; accepted in revised form 23 September 2019. Available online 11 October 2019
