DESIGN AND IMPLEMENTATION OF A DRINKABLE WATER

PREDICTION SYSTEM

BY
ILOCHI EMMANUEL
U20/NAS/CSC/…

SUPERVISED BY ENGR CHIBUEZE

JULY, 2024
ABSTRACT
Rapid innovation in information technology, and in particular the recent surge of interest in
artificial intelligence, has raised predictive analysis to a new level and is reshaping the
medical and clinical ecosystem. Predictive systems built with artificial intelligence and
machine learning tools now make it possible to automate the analytical process. This research
proposes a potable water prediction system that determines whether a given water sample is
potable (safe for human consumption) or not. The research was also conducted in order to
identify the machine learning algorithm that gives the highest accuracy. The model was tested
across several algorithms: Random Forest gave an accuracy of 0.692, Logistic Regression 0.507,
Support Vector Machine (SVM) 0.655, Decision Tree 0.606, and K-Nearest Neighbors 0.639. Random
Forest therefore produced the best accuracy of the algorithms tested.

CHAPTER ONE
INTRODUCTION

1.1 BACKGROUND TO THE STUDY


Water is a colorless, odorless, and tasteless liquid that is essential for the survival of
all living organisms on Earth. It is a chemical compound whose molecules each consist of two
hydrogen atoms and one oxygen atom, giving the chemical formula H2O, and it is among the most
abundant substances on Earth, covering about 71% (more than two-thirds) of the Earth's surface
(Kaddoura, 2022). Water is a fundamental component of the human body and plays a crucial role
in bodily processes such as food digestion and blood circulation, as well as in daily human
activities including drinking, cooking, washing, and irrigation. However, although water covers
71% of the planet, only a small fraction, roughly 2.5%, is freshwater that could be suitable for
human consumption (Gleick, 2014).
Drinkable (potable) water is water that is purified and safe for human consumption and use,
meaning that it is free from harmful toxins, pollutants, and hazardous microorganisms that can
cause disease or other adverse health effects (Dalal, 2022). Water is therefore regarded as
drinkable only if it is free from such contaminants and meets the applicable quality standards.
Access remains a problem: an estimated 2.2 billion people globally lack access to safe drinking
water at home, putting them at risk of waterborne diseases such as cholera, typhoid, and
dysentery (WHO, 2019).
In recent years, the problems associated with the use of contaminated water have increased
greatly, and the risk to individuals has grown because they have no way of verifying whether
water is potable. In response, water prediction methodologies have been introduced. These
predict whether water is potable based on parameters such as temperature, pH, turbidity, and
coliform count, and they help to reduce the dangers of misjudging water potability. Machine
learning algorithms have also been employed to build potability prediction systems and have
proven effective in predicting the water quality index and class.
The emergence of potable water prediction methodologies has in turn given rise to predictive
systems for drinkable water, a significant advancement in the domain of predictive systems. This
predictive approach aims to proactively identify potential threats to water quality, thereby
allowing for timely interventions to maintain and improve the safety of drinking water. It uses
machine learning to estimate whether water is safe to drink from measurements of the water, such
as its clarity, its dissolved oxygen content, and the presence of contaminants, and then reports
whether the water is safe or likely to make the person who drinks it sick. Early detection of
contamination allows for preventive measures, thereby reducing the financial burden of treating
waterborne diseases and improving overall healthcare efficiency (Hallaq et al., 2019).
Additionally, drinkable water prediction systems have brought significant changes to the
health sector by helping to curb diseases contracted from drinking water. The application of
machine learning algorithms in these systems has helped to control water pollution and to alert
users when poor water quality is detected.

1.2 STATEMENT OF PROBLEM


The manual process of predicting whether water is potable and safe for consumption has given
rise to a number of problems, including the following:
1. There has been a steady increase in the spread of waterborne diseases over time, due to
inefficiency in the prediction of potable water.
2. Individuals, including farmers, are unable to determine the potability of water due to a
lack of predictive resources.
3. Access to firsthand information is limited, as water users have to depend on mere
observation of the water to predict whether it is potable.
4. Data-driven decision making is lacking, as individuals depend on their own judgement to
predict and detect the potability of water.

1.3 AIM AND OBJECTIVES OF THE STUDY


This research work is aimed at the development of a drinkable water prediction system
using machine learning algorithms. The study hopes to achieve this aim through
the following objectives.

1. To implement a simple, easy-to-use user interface that is accessible to everyone.
2. To train a binary classifier model that can predict the potability of water
using the random forest machine learning algorithm.
3. To source training datasets (water quality data) from professional, publicly
verifiable online data repositories.
4. To use the sourced datasets to provide data-driven potability prediction, giving
individuals access to firsthand information.
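
To illustrate the second objective, the following is a minimal sketch of the idea behind a random forest binary classifier. It is not the implementation used in this study. It is written in plain Python and simplifies each tree to a single split (a decision stump), which is a reduced stand-in for the full-depth trees of a real random forest. The features (turbidity and coliform count), the sample values, and the function names fit_stump, fit_forest, and predict are assumptions made for illustration only.

```python
import random
from collections import Counter

# Hypothetical toy data: (turbidity in NTU, coliform count); label 1 = potable, 0 = not potable.
SAMPLES = [(1.0, 0), (2.0, 1), (1.5, 0), (2.5, 2), (8.0, 40), (9.5, 55), (7.0, 30), (10.0, 60)]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def fit_stump(rows, labels, features):
    """Fit a one-split tree: return (feature, threshold, left_label, right_label)."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue  # a split must send samples to both sides
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    if best is None:  # no usable split: always predict the majority label
        majority = Counter(labels).most_common(1)[0][0]
        return (features[0], float("inf"), majority, majority)
    return best[1:]


def fit_forest(rows, labels, n_trees=15, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    if len(set(labels)) < 2:
        raise ValueError("training data must contain both classes")
    rng = random.Random(seed)
    n_features = len(rows[0])
    subset_size = max(1, int(n_features ** 0.5))
    forest = []
    while len(forest) < n_trees:
        picks = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        sample_labels = [labels[i] for i in picks]
        if len(set(sample_labels)) < 2:
            continue  # redraw a bootstrap sample that misses one class
        features = rng.sample(range(n_features), subset_size)
        forest.append(fit_stump([rows[i] for i in picks], sample_labels, features))
    return forest


def predict(forest, row):
    """Classify one sample by majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    forest = fit_forest(SAMPLES, LABELS)
    print(predict(forest, (0.5, 0)), predict(forest, (9.0, 50)))
```

Each stump is trained on a bootstrap sample with a random subset of the features, and the forest predicts by majority vote. A working system would more likely rely on an established library implementation with deeper trees and a real water quality dataset.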

1.4 SIGNIFICANCE OF THE STUDY


This research work is significant because it offers practical solutions to the challenges
posed by improper water examination in the domain of potable water detection. It leverages
comprehensive datasets of water variables to assist people in predicting the potability of
water, on which all human beings depend heavily in daily life. The work also serves as a
foundation for further investigation in the domain of water potability prediction, offering
researchers a basis for refining predictive models, exploring new variables, and advancing
methodologies.

1.5 SCOPE OF THE STUDY

This research work centers on the design and implementation of a drinkable water prediction
system, with special attention to the challenges individuals face in making informed decisions
about whether a given volume of water is potable and safe for drinking. The system is broadly
applicable wherever the potability of water must be determined, both for human beings and for
other living organisms.

1.6 LIMITATION OF THE STUDY


During the course of this research work, the researcher experienced some setbacks, which
include but are not limited to the following:
1. Data quality and precision: Mistakes and missing information in the data used to train the
model could make the results of this study less reliable, producing inconsistencies in the
quality of water potability prediction.
2. Cost implications: The initial setup of the proposed system involved upfront costs that
became heavy for the researcher to bear, and this financial burden was a limiting factor for
the successful development and adoption of the system.
1.7 OPERATIONAL DEFINITION OF TERMS
Many terms are associated with this research work, but for the purpose of this study only a
few are considered. These terms include the following:
Drinkable Water: Water that is safe and suitable for human consumption.
Water Quality: The physical, chemical, and biological characteristics of water that indicate its
suitability for various uses.
Predictive System: A system that uses data and models to anticipate an outcome, in this study
whether a water sample is potable.
Data Precision: The degree of exactness and reliability in the collected datasets.
Machine Learning: This is a branch of artificial intelligence that enables computers to learn
patterns and make predictions from data without explicit programming.
Algorithm: A step-by-step procedure or set of rules to be followed by a computer to solve a
problem.
Proactive Approach: A strategy that anticipates and addresses potential issues before they
occur.
Waterborne Diseases: Illnesses caused by the consumption of water contaminated with harmful
microorganisms or pollutants.
Decision-Support System: An interactive computer-based tool or software that aids individuals
in making informed decisions.

CHAPTER TWO
LITERATURE REVIEW

2.1 INTRODUCTION TO THE CONCEPT OF DRINKABLE WATER PREDICTION


From the earliest days of human existence, water has been essential to living things. It is
available in abundance in some parts of the world, while other parts find it hard to gain access
to it. Even where water is abundant, access to safe and clean drinking (potable) water is
essential to public health, and the introduction of potable water prediction systems reflects a
strong human commitment to maintaining good water quality. The journey from traditional methods
to modern, technologically driven potable water prediction systems has involved significant
transitional processes, and understanding the origins of these systems helps individuals to
appreciate the advancements that have shaped their current state.
The first potable water prediction systems date back to the early days of potable water
analysis in the 20th century. They relied heavily on manual methods, assessing the quality of
water in a rudimentary way through manual observation of water volumes (Zulkifli, 2022). These
methods often lacked the precision and real-time capability needed to detect whether water is
potable (suitable for human consumption) or not. As time went on, the need for more accurate and
proactive approaches to potable water prediction became evident, and this need spurred the
integration of technology into water quality management, marking a paradigm shift in our ability
to anticipate and prevent waterborne diseases.
Moreover, recent improvements in potable water prediction systems have brought forth
significant features that enhance the effectiveness of these systems, as the integration of
machine learning algorithms allows for the dynamic analysis of complex water quality parameters.
The movement from traditional to modern, technologically driven potable water prediction systems
has been characterized by several transitional processes, one of the most significant being the
development of new technologies that make it possible to collect and analyze data more easily
and efficiently (Chen et al., 2020).

2.1.1 DESCRIPTION OF THE DIFFERENT CATEGORIES OF DRINKABLE WATER PREDICTION SYSTEM


There are different categories of drinkable water prediction systems, and this section reviews
some of them. These categories are discussed below with examples.
• Statistical models:
This category of potable water prediction systems consists of mathematical frameworks that
use statistical methods to analyse historical water quality data. These models are created
in order to identify patterns, relationships, and trends within the data, enabling the
prediction of future water quality conditions. They also play an important role in
understanding the complex interactions between the parameters that influence water
quality and how these change over time. Examples in this category include regression
models, time series analysis, multivariate statistical models, and ANOVA models
(Antanasijevic et al., 2020).
• Machine learning models:
This category of potable water prediction systems consists of computational algorithms
designed to analyse large datasets in order to learn patterns and make predictions or
classifications related to water quality. These models use techniques that allow them to
adapt and improve their accuracy over time, providing a dynamic and data-driven
approach to forecasting water quality parameters. Examples in this category include
Decision Trees, Support Vector Machines (SVM), Neural Networks, Random Forests, and
K-Nearest Neighbors (KNN).
• Physical models:
This category of potable water prediction systems consists of representations of the
physical processes that govern the quality of water. These models imitate the behaviour of
the components of a water system in order to provide insight into the interactions
between environmental factors and water quality parameters. Unlike statistical models,
which rely on historical data, physical models are based on scientific principles and
equations, which allows them to simulate the dynamic processes that influence drinkable
water quality. Examples in this category include hydraulic models, water treatment plant
models, aquatic ecosystem models, and geospatial models.

• Hybrid models:
This category of potable water prediction systems refers to integrated approaches that
combine multiple prediction methodologies or techniques to enhance the accuracy and
reliability of water quality forecasts. These models take advantage of the strengths of
different prediction strategies, often merging rule-based systems with machine learning
algorithms in order to provide a comprehensive and adaptive solution for predicting water
quality. Examples in this category include rule-based and machine learning hybrids,
integrations of statistical models with machine learning, ensemble models, and rule-based
and IoT-integrated hybrids.
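
As an illustration of the machine learning category described above, the sketch below classifies a water sample with the K-Nearest Neighbors method, which labels a new sample by majority vote among the most similar known samples. The features (pH and turbidity), the sample values, and the name knn_predict are hypothetical and chosen only to keep the example small. In practice the features would usually be scaled first so that no single one dominates the distance.

```python
import math
from collections import Counter

# Hypothetical labelled samples: ((pH, turbidity in NTU), label); 1 = potable, 0 = not potable.
TRAIN = [
    ((7.0, 1.0), 1),
    ((7.2, 2.0), 1),
    ((6.8, 1.5), 1),
    ((5.0, 9.0), 0),
    ((9.5, 8.0), 0),
    ((4.5, 10.0), 0),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples closest to query."""
    if k < 1 or k > len(train):
        raise ValueError("k must be between 1 and the number of training samples")
    # Sort by Euclidean distance to the query and keep the k nearest neighbours.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict(TRAIN, (7.1, 1.2)))
    print(knn_predict(TRAIN, (5.0, 9.5)))
```

An odd value of k avoids tied votes in a two-class problem such as potable versus not potable.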

2.1.2 BRIEF HISTORY OF DRINKABLE WATER PREDICTION SYSTEM


The history of potable water prediction systems can be traced back to the early days of the
20th century, when the first potable water prediction system was developed (Zulkifli, 2022). As
time went on, the system was improved, and new features were added to make it more efficient and
accurate. The movement from traditional potable water prediction systems to modern,
technologically driven ones was characterized by several transitional processes, one of the most
significant being the development of new technologies that made it possible to collect and
analyze data more efficiently (Chen et al., 2022).
More recent advancements in potable water prediction systems build on the first steps taken
by our ancestors, who keenly observed the world around them and relied on simple observations
to gauge water safety (Hangerman and Exner, 2019). One of the most significant features added
to these systems is the use of machine learning algorithms to predict water potability: the
relevant measurements of a water sample are passed to a machine learning model, which analyzes
them and predicts whether the water is potable.

2.1.3 THE FUTURE OF DRINKABLE WATER PREDICTION SYSTEM


Drinkable water prediction systems have come a long way in the quest to safeguard the quality
of the water consumed by human beings and other living organisms. Since its inception, the field
has made strides in predicting the potability of water, with improvements that have not only
increased the precision of predictions but also positioned drinkable water prediction systems as
a proactive means of protecting public health.
Looking ahead, continuous advancements in technology hold tremendous promise for drinkable
water prediction systems, and the integration of artificial intelligence (AI) is expected to
play an important role in allowing prediction models to become more intelligent and adaptive.
The incorporation of artificial intelligence algorithms has given rise to the use of predictive
analytics and big data analytics, which contribute to a deeper understanding of water quality
patterns and therefore enable more informed and targeted interventions.
In summary, the future of drinkable water prediction systems points towards a paradigm shift
marked by advancements in technology and a determined approach to water quality management. As
these systems extend their reach, they are expected to serve as a reliable cornerstone,
continuously combining technological innovation with community involvement to create a
sustainable and resilient water quality prediction ecosystem.
2.2 REVIEW OF RELATED WORKS
Several existing systems are related to this research work in one way or another, and some of
them are reviewed in this section.
In the field of the internet of things (IoT), Khot and Surve (2020) presented a concept for
the design and development of a low-cost system for real-time water quality monitoring. The
system was designed to measure the chemical and physical properties of a water sample using a
number of sensors. Using its sensors and core controllers, it measures several parameters,
including temperature, pH, and turbidity. The data collected from the IoT-based sensors was used
to train machine learning algorithms, which were then used to predict new data, or new
scenarios, that might arise in the future.
Vijay and Kamaraj (2021) used activation functions based on Artificial Neural Networks (ANN)
to predict the water quality index in drinking water distribution systems. Water samples taken
between 2008 and 2017 from several wells in the Vellore area were used to train the system.
Hmoud and Waselallah (2021) developed a system to monitor drinking water in order to
preserve a friendly and sustainable green environment. The researchers referred to the system as
the Adaptive Neuro-Fuzzy Inference System (ANFIS) algorithm, as it was created to assist in
predicting the water quality index (WQI). The system used the Feed-forward neural network (FFNN)
and K-nearest neighbors in order to forecast and categorize water quality. The FFNN algorithm
obtained the highest reported accuracy, 100%, and the system was based on the parameters that
were previously given.
Kadam et al. (2019) applied multiple linear regression (MLR) and artificial neural network
(ANN) techniques to predict the fitness and quality of groundwater from the Shivganga river
basin, located on the eastern slopes of the Western Ghats in India. The system was developed in
order to predict whether a given water sample is potable (safe for drinking) or not. It used a
three-layer ANN architecture trained with the Levenberg-Marquardt back-propagation algorithm in
order to generate a precise model for predicting WQI-based groundwater quality, and it used the
MLR model to check the efficiency of the ANN prediction.
Sakizadeh (2016) conducted an extensive independent study on forecasting the water quality
index (WQI). He employed artificial neural networks (ANNs) and, as a case study, used 16
groundwater quality characteristics gathered by the Iranian Ministry of Energy in Andimeshk from
47 wells and springs between 2006 and 2013. Three ANN approaches were used: ANNs with early
stopping, ANNs with Bayesian regularization, and an ensemble of ANNs. A sensitivity analysis was
also incorporated to demonstrate the significance of each parameter in the WQI forecast.
Nair and Vijaya (2022) developed a model that predicts the quality of river water and
classifies the water's index value in accordance with the water quality standard. The
researchers used data gathered from eleven sampling stations at different points along the
Bhavani River, which flows through Tamil Nadu and Kerala, to develop the machine learning
models. Support vector regression, MLP regression, random forest, and linear regression were
used to build the prediction model, while naive Bayes, decision trees, MLP classifiers, and
support vector machines (SVM) were used to develop a classification model for the water quality
index. The resulting models produced encouraging results for water quality index prediction.
Haghiabi et al. (2018) predicted water quality components in the Tireh River in southwest
Iran. The researchers investigated the performance of artificial intelligence techniques,
including the support vector machine (SVM), the artificial neural network (ANN), and the group
method of data handling (GMDH). They found that every algorithm they had used showed some
over-estimation characteristics.
Derdour et al. (2022) set out to identify the best strategy for forecasting the water quality
index (WQI). Their research focused on evaluating multiple classification techniques, including
Discriminant Analysis (DA), Decision Trees (DT), K-Nearest Neighbors (KNN), Ensemble Trees (ET),
and Support Vector Machine (SVM). The data samples were categorized into four states based on
their WQI: very poor or unsafe water, exceptional water quality, fair water quality, and bad
water quality. The research established that machine learning models are highly effective in
predicting drinking water quality on a broader scale, hence facilitating the development of
sustainable and efficient decision support and control plans for water quality.
Alnaqeb et al. (2022) conducted research with the goal of developing an intelligent system
that uses machine learning models to assess the quality of water and decide whether or not it is
safe to drink. To determine the best model for water potability prediction, the study compared
the K-Nearest Neighbor, Random Forest, Decision Tree, LightGBM, and Support Vector Machine
models.
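
Comparisons of this kind rank candidate models by a score such as accuracy, the fraction of samples classified correctly. The sketch below shows how accuracy can be computed and how the best model can be selected. The scores in REPORTED_ACCURACY are the values reported in the abstract of this study, while the function names accuracy and best_model are illustrative assumptions rather than part of any reviewed system.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Accuracy of each algorithm as reported in the abstract of this study.
REPORTED_ACCURACY = {
    "Random Forest": 0.692,
    "Logistic Regression": 0.507,
    "Support Vector Machine": 0.655,
    "Decision Tree": 0.606,
    "K-Nearest Neighbors": 0.639,
}


def best_model(scores):
    """Return the name of the model with the highest score."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Three of four hypothetical predictions are correct.
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
    print(best_model(REPORTED_ACCURACY))
```

Accuracy alone can be misleading when one class is much more common than the other, so such comparisons are often supplemented with other measures.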
2.3 SUMMARY OF REVIEW OF RELATED WORKS
| Author(s) Name/Year | Journal Name | Contribution | Shortfall |
|---|---|---|---|
| Khot, I. M., & Surve, A. R. (2020) | International Journal for Research in Applied Science and Engineering Technology, 8, 228-236. | The system measures parameters such as turbidity, pH and temperature that it gets using its sensors and core controllers. | The system was trained using data from unverifiable resources. |
| Vijay, S., & Kamaraj, K. (2021). | Water Resources Management, 35(2), 535-553. | It predicted the quality index of water in drinking water distribution systems by using ANN-based activation functions. | This work is limited to water samples from Vellore. |
| Hmoud Al-Adhaileh, M., & Waselallah Alsaade, F. (2021). | Modeling Earth Systems and Environment, 5, 951-962. | The system was created using the K-nearest neighbors and the Feed-forward neural network (FFNN) in order to predict and classify water quality. | The system was developed based on the already listed parameters. |
| Kadam, A. K., Wagh, V. M., Muley, A. A., Umrikar, B. N., & Sankhua, R. N. (2019). | Modeling Earth Systems and Environment, 5, 951-962. | It predicted the fitness and quality of groundwater from the Shivganga river basin, located on the eastern slopes of the Western Ghats in India. | It was built only for the groundwater from the Shivganga river basin. |
| Sakizadeh, M. (2016). | Modeling Earth Systems and Environment, 2, 1-9. | The system incorporated a sensitivity analysis to show the importance of each parameter in the prediction of WQI. | The system used an outdated data source for training the model. |
| Nair, J. P., & Vijaya, M. S. (2022, August). | In Journal of Physics: Conference Series (Vol. 2325, No. 1, p. 012011). IOP Publishing. | The dataset collected during the research was used to develop the machine learning models; linear regression, MLP regression, random forest and support vector regression were used to build the prediction model. | The system was limited to the samples collected from the Bhavani River that flows through Kerala and Tamil Nadu. |
| Haghiabi, A. H., Nasrolahi, A. H., & Parsaie, A. (2018). | Water Quality Research Journal, 53(1), 3-13. | The system predicted water quality components in the Tireh River, located in the southwest of Iran. | All of the algorithms employed in the course of the study have some over-estimation properties. |
| Derdour, A., Jodar-Abellan, A., Pardo, M. Á., Ghoneim, S. S., & Hussein, E. E. (2022). | Water, 14(18), 2801. | The study checked multiple classification techniques. | It was developed with a limited array of possible outcome values. |
| Alnaqeb, R., Alrashdi, F., Alketbi, K., & Ismail, H. (2022, December). | In 2022 IEEE/ACS 19th International Conference on Computer Systems and Applications (AICCSA) (pp. 1-6). IEEE. | The study compared the K-Nearest Neighbor, Random Forest, Decision Tree, LightGBM, and Support Vector Machine models in order to get the best model for water potability prediction. | The system was only designed to compare the accuracy of different algorithms and not to predict the potability of water. |

CHAPTER THREE
SYSTEM ANALYSIS AND DESIGN
3.1 PREAMBLE
This chapter covers the analysis and design of the system proposed by this research work.
It does so by examining the adopted methodology, Object-Oriented Analysis and Design (OOAD),
and the particular approach through which that methodology was applied.
3.2 SYSTEM ANALYSIS
System analysis covers the processes involved in dissecting the proposed system and
examining its individual components and their interactions with one another. This section
addresses the analysis of the proposed system: it evaluates some of the existing systems,
identifies the limitations of those systems, and outlines strategies for improving on them. The
detailed analysis of the proposed system also includes modelling activities that make use of
activity diagrams, class diagrams, and use case diagrams.

3.2.1 ANALYSIS OF EXISTING SYSTEMS


Khot and Surve IoT based system:
This system was developed with the main aim of monitoring the quality of water in real time
in the domain of the internet of things (IoT). The system consists of several sensors, which it
uses to measure the chemical and physical parameters of a water sample, including turbidity, pH,
and temperature, obtained through its sensors and core controllers. The system was trained using
machine learning algorithms on the data collected by the IoT-based sensors, and it uses this
data to predict new data (new scenarios) that might occur in the future. Despite its sound
purpose, the system has a shortcoming: it was built to analyze only data generated by its
IoT-based sensors and cannot analyze data from third-party software.
Adaptive neuro-fuzzy inference system (ANFIS) algorithm:
This system is a machine learning-based system for monitoring drinking water in order to ensure
that a sustainable and friendly green environment is maintained. The researchers referred to
the system as the Adaptive neuro-fuzzy inference system (ANFIS) algorithm, as it was developed
to help predict the water quality index (WQI). The system was created using K-nearest neighbors
and the Feed-forward neural network (FFNN) in order to predict and classify water quality. It
was developed based on the already listed parameters, and the FFNN algorithm achieved the
highest accuracy, 100%, for water quality classification (WQC). Despite its sound purpose, the
system has a downside: it was built only to predict the water quality index and cannot predict
whether a water sample is potable (suitable for drinking), so it does not efficiently solve the
problem of water potability detection.

3.2.2 WEAKNESSES OF EXISTING SYSTEMS


Khot and Surve IoT based system: The weakness of this system is that it was built to analyze
only the data generated by the IoT based sensors and cannot analyze data from third-party
software.
Adaptive neuro-fuzzy inference system (ANFIS) algorithm: The weakness of this system is that it
was built only to predict the water quality index and could not predict whether a water sample
is potable (suitable for drinking) or not, and so it cannot efficiently solve the problem of
water potability detection.

3.2.3 ANALYSIS OF THE PROPOSED SYSTEM


The proposed system is a significant step forward in addressing the existing pitfalls and
gaps in knowledge in the traditional drinkable water prediction process. It incorporates
features designed with the users in mind in order to provide them with an intelligent and
comprehensive approach to drinkable water prediction. One of its key features is the integrated
use of machine learning algorithms together with a water dataset sourced from a reputable
online repository, in order to offer users a robust analysis of water potability.
A distinctive feature of this system is that it uses the random forest algorithm to train
models that can predict whether water is suitable for human consumption, in order to help
prevent water borne diseases. Moreover, unlike the conventional systems that often rely on
simplistic approaches, this system incorporates machine learning algorithms to improve the
precision and contextual relevance of its potability predictions. The system therefore focuses
not only on the properties of the water but also takes into consideration the effects that
might occur in the human body after consuming water that is not potable.
Furthermore, this system takes a holistic approach by empowering individuals and
industries to make informed decisions based not only on environmental factors but also on the
effects that the consumption of the water has on the human body. It integrates predictive
analytics, thereby helping individuals to determine whether water is suitable for consumption
or not. In the same vein, this feature bridges a crucial gap in the existing systems, which
typically overlook the health challenges of consuming non-potable water. In essence, this
system offers a comprehensive solution by leveraging machine learning algorithms, water data
analysis and potability predictions to improve the activities involved in making potability
decisions and to foster sustainable practices in the region.

3.2.3 SYSTEM REQUIREMENTS


System requirements refer to the detailed features or capabilities that a system should
have and the functions that the system should be able to perform.
3.2.3.1. Functional requirements and non-functional requirements of this system:
The functional and non-functional requirements of this system specify the attributes and
abilities that the system must have. The proposed system should have the following attributes
and abilities:
a. The system should be able to analyse water when supplied with the properties of the
water sample such as pH, chemical components etc.
b. The system’s user interface should be as simple as possible, enabling the user to make
efficient use of the system.
c. The system should be able to predict whether a water sample is potable (suitable for human
consumption) or not, depending on the parameter data that was supplied to the system.
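
Requirement (a) implies that the supplied properties must be checked before they are analysed. A minimal sketch of such a check is given below. The field names (ph, turbidity) and the plausible ranges are assumptions made for illustration only; they are not the parameter list or thresholds of the proposed system.

```python
from dataclasses import dataclass

# Hypothetical plausible ranges used only for input checking; they are
# assumptions for this sketch, not thresholds taken from the project.
PLAUSIBLE_RANGES = {
    "ph": (0.0, 14.0),
    "turbidity": (0.0, 1000.0),
}


@dataclass
class WaterSample:
    """A water sample described by numeric parameters (hypothetical fields)."""
    ph: float
    turbidity: float

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the input is usable."""
        problems = []
        for name, (low, high) in PLAUSIBLE_RANGES.items():
            value = getattr(self, name)
            if not low <= value <= high:
                problems.append(f"{name}={value} is outside [{low}, {high}]")
        return problems
```

For example, WaterSample(ph=7.0, turbidity=3.5).validate() returns an empty list, while a pH of 15.2 produces one reported problem. Returning a list of messages, rather than raising on the first error, lets a simple interface show the user every field that needs correction at once.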

3.2.4 PROPOSED METHODOLOGY: OBJECT ORIENTED ANALYSIS AND DESIGN


Object-Oriented Analysis and Design (OOAD) is one of the methodologies of software
development which makes use of object-oriented concepts for the design, creation and
implementation of software systems. Within this approach, the initial step involves the
determination of the various system requirements, followed by the identification of the classes
that are meant to be in the system and their respective relationships with each other.
Various techniques and practices are integral to this methodology, as it encompasses the use
of UML diagrams, object-oriented programming and use cases. OOAD employs the object-oriented
programming approach for the actual design and implementation of the software system, while the
UML diagrams play a crucial role in illustrating the different facets of, and interactions
among, the various components within the software system. Lastly, the different use case
scenarios are employed to help articulate the different ways in which users interact with and
make use of the software system. OOAD is particularly suited to handling intricate systems,
which substantiates its selection for this work.

3.2.4.1 UNIFIED MODELLING LANGUAGE


The Unified Modelling Language (UML) serves as a very important tool in the field of
software engineering. It provides a standard, visual means of representing the various parts
and subsections of a software system comprehensively. UML diagrams offer programmers, software
developers and stakeholders a powerful tool to collaboratively conceptualize, design and, most
especially, communicate on the various aspects of a software system. The use case diagram is
one of the UML diagrams that is considered most crucial, as it outlines the potential
interactions between users or external entities and the system, thereby serving as a blueprint
for understanding the functional requirements and scenarios within the software application.
The activity diagram complements the use case diagram, as it brings the dynamic aspects
of the system to the forefront by illustrating the overall flow of activities within the
system, thereby capturing the sequential and parallel processes that unfold during system
operation. This dynamic perspective aids in visualizing the overall workflow that takes place
in the system, thereby helping to identify the potential bottlenecks, decision points and
opportunities for optimization in the software system.

3.2.4.2 USE CASE DIAGRAM OF THE PROPOSED SYSTEM


The use case diagram is one of the types of Unified Modelling Language (UML) diagrams. It
is made up of four main elements, namely: the use cases, which show the functional parts of
the system; the actors, which represent the entities that interact with the proposed system;
the associations between the actors and the use cases; and lastly the system boundary, which
is the scope within which the system and the various actors interact.
As for the proposed system, there would be 2 actors, which are the user (the person who
uses the system) and the system. The use case diagram of the proposed system is going to have
4 use cases, which are: visiting the interface of the system, entering the parameters of the
water sample, processing the parameter data and then predicting the potability of the water
sample.
Use case diagram:
Figure 3.1 Use case diagram of the proposed system
The diagram above portrays the activity that takes place in the system that is proposed in this
research work. From the diagram, the system is made up of two main entities: the user, who can
visit the system's interface and enter the parameters of the water sample, and the system,
which processes the parameter data and predicts the potability of the water sample.

3.2.4.3 ACTIVITY DIAGRAM OF THE PROPOSED SYSTEM


An activity diagram is one of the important diagrams in the domain of the Unified
Modelling Language (UML). It is a type of UML diagram which describes the main dynamic aspects
of the software system. An activity diagram is basically a flowchart that is used to describe
the flow of activities from one step to the next within the proposed system. The figures below
represent the detailed transition of activities within the proposed system, from the initial
process of visiting the platform, to entering the parameters of the water sample, to receiving
the potability prediction.
Activity diagram of the proposed system:

Figure 3.2 Activity diagram for the system.
Figure 3.3 Activity diagram for the model.

The diagrams above are the activity diagrams of the proposed system and its model, which
display the activity that takes place in the proposed system in pictorial form. From the
diagrams, it can be seen that the activity in the proposed system begins with the user visiting
the system.

3.2.4.4 CLASS DIAGRAM OF THE PROPOSED SYSTEM
The class diagram is one of the useful tools in the domain of Unified Modelling Language
(UML) diagrams. It describes the structure of a software system by mapping out all the classes,
their attributes and methods, and the relationships between the objects that work together to
make the system function. Below is the illustration of the class diagram for the proposed
system.

The diagram above is used in this research work to represent the class diagram of the
proposed system. It depicts the classes that exist within the system and the relationships
between them that help the system to function properly.

3.2.4.5 UML SEQUENCE DIAGRAM


UML sequence diagrams are dynamic graphical models that display the interactions that
happen between the distinct elements within a system over a specified period of time. These
diagrams are considered very useful, as they play a major role in understanding the
chronological flow of activities and messages during the execution of a specific use case or
scenario. At the core of systems implementation, sequence diagrams are used to visually show
the order in which interactions take place within a system, thereby offering a clear portrayal
of how the various components work together to achieve the desired result.
The component parts of a UML sequence diagram include messages, lifelines, activations
and objects, with the lifelines representing the entities participating in the interaction and
the messages representing the communication between the entities. Furthermore, the activations
are depicted as thin vertical bars on a lifeline, which represent the period of time during
which an object is involved in an interaction. This temporal dimension offers an important
layer of understanding, thereby allowing developers and stakeholders to comprehend not only
what occurs in the system but also when it occurs during its operation.
Moreover, UML sequence diagrams help to facilitate communication between the technical
and non-technical stakeholders in the development of a system. By offering a visual
representation of the system dynamics, these diagrams stand as a powerful tool for conveying
intricate technical details in a more accessible manner. The use of these diagrams contributes
to improved collaboration between the different stakeholders in a system, thereby fostering a
shared understanding of system behavior and interaction, and also aiding in the identification
of potential issues for future enhancement of the system.
Figure 3.5 Sequence diagram of the proposed system
The above diagram represents the sequence diagram of the proposed system. It shows the
sequential flow of activities within the proposed system, which ranges from the user visiting
the system’s user interface to the user getting a prediction of whether a water sample is
potable (drinkable) based on the parameter data that was provided by the user.

3.3 SYSTEM DESIGN


The system design of the proposed system illustrates how the system hopes to fulfil the
requirements that have been proposed in the sections above. It comprises the simple interface
design of the proposed system, showing the physical sketch of the system, as it aims at making
its features easily accessible to the users of the system.
3.3.1 INPUT/OUTPUT DESIGN
This section shows a diagrammatic illustration of the interface of the proposed system,
which represents the input and output display of the proposed system. The input/output design
of the proposed system is shown below.


Figure 3.6 interface design of the proposed system


CHAPTER FOUR
SYSTEM IMPLEMENTATION
4.1 Overview
This chapter focuses on the designing, building and implementation of the system that has
been proposed in the earlier chapters of this work. The previous chapter carried out a thorough
top-down analysis of the similar systems that already exist and of the proposed system, and the
journey of building the proposed system was embarked upon by outlining the system and all of
its requirements. In this chapter, the researcher puts the proposed system into action using
the tools that were selected, tested and confirmed to be suitable for implementing the
proposed system. Most importantly, in this chapter, the user interfaces of the different
sections of the system are outlined, along with the tools that were used in bringing it to
life.
4.2 Tools used for implementation (Tech stack of the proposed system)
4.2.1 Frontend
4.2.1.1. HTML
HTML is the abbreviation of Hyper Text Markup Language, which is used to build the
skeletal framework and foundation of a webpage. It is seen as the foundational building block
of web development that enables developers to create and structure web content. It involves
the use of a set of tags and attributes to define elements such as headings, paragraphs,
images and links, which are in turn rendered by the web browser. The HTML language is made in
a way that humans can read it and machines can also understand it. This makes it a very
versatile tool that developers can use to make web pages that have a very clear structure and
semantics. It also provides an opportunity for developers to embed multimedia elements like
audio and video in order to improve the experience of the user. HTML gives developers the
opportunity to add CSS (Cascading Style Sheets) and JavaScript as well, in order to allow the
design and behavior of a web page to be customized and dynamic. Most importantly, in most
cases it serves as an avenue for data collection through the use of elements like forms and
input fields. In all, it is a very important tool for web development because it provides the
framework upon which websites stand and operate. It was used to design the interface of the
proposed system.
4.2.1.2. CSS
Cascading Style Sheets (CSS) is a simple language that is used to style webpages. It is very
powerful and is used to specify the design with which HTML documents will be displayed. It
gives developers the opportunity to control the layout, font, color and overall appearance of
the web pages to ensure that the outcome of the design is pleasing to the users. It also helps
to ensure easy maintenance and scalability of web-based projects, as it separates the content
(HTML) from the presentation (CSS). CSS selects HTML elements using selectors and then applies
styles to them using its properties and values. It allows developers to create responsive
designs, thereby enabling the web pages to adapt to different screen sizes and devices through
the use of media queries. It is one of the tools that were used in the design of the interface
of the proposed system.

4.2.2 Backend
4.2.2.1 Python
Python is a programming language that is very powerful and versatile. It is a high-level
programming language that is known for its readability and its overall simplicity. For this
reason, it has become a popular choice of programming language for both beginners and
experienced developers. It supports many different programming paradigms and also has a large
ecosystem of third-party packages that allows for rapid development across various domains
such as web development, data analysis, artificial intelligence and automation. It also offers
dynamic typing to improve productivity and allows for integration with other languages, tools
and frameworks like Django and Flask to enable a more efficient modern software development
process.
4.2.2.2 Flask
The Flask framework is a simple, lightweight and flexible web framework for Python. It is
specially designed for building web applications quickly with minimum overhead. It follows the
WSGI (Web Server Gateway Interface) standard and is based on Werkzeug and Jinja2 in order to
ensure robust performance and efficient templating. Flask has a micro framework nature that
makes it suitable for web development purposes, while also offering developers the opportunity
to integrate the additional components that they need. It gives a straightforward and
intuitive API that allows for rapid development and prototyping, which makes it ideal for
small to medium-sized applications and RESTful APIs. The Flask framework was among the tools
that were used in creating the backend of the proposed system.
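
As an illustration of how a Flask backend can receive the water parameters from a form and return a prediction, a minimal sketch is given below. The route name /predict, the form field names and the predict_potability function are all assumptions introduced for this example; in particular, predict_potability is a placeholder rule that stands in for the trained model and is not the model of the proposed system.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical parameter names; the actual form fields are not listed in the text.
FIELDS = ("ph", "turbidity")


def predict_potability(features):
    """Placeholder that stands in for the trained model's predict call."""
    ph, _turbidity = features
    return 1 if 6.5 <= ph <= 8.5 else 0


@app.route("/predict", methods=["POST"])
def predict():
    # Read every expected field from the submitted form and convert it to a number.
    try:
        features = [float(request.form[name]) for name in FIELDS]
    except (KeyError, ValueError):
        return jsonify(error="all parameters must be supplied as numbers"), 400
    label = predict_potability(features)
    return jsonify(potable=bool(label))
```

If the sketch is saved in a file, it can be started locally with the Flask command line tool (flask --app followed by the file name, then run). A missing or non-numeric field is answered with a 400 status instead of being passed to the model.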

4.2.3 Machine Learning


4.2.3.1 Scikit-learn
This is a very powerful and versatile machine learning library that was created specially for
Python. It is used for data mining and data analysis because it provides simple and efficient
tools for data preprocessing, classification, regression, clustering and dimensionality
reduction. It is built on NumPy, SciPy and Matplotlib, and for this reason it integrates very
well with other scientific computing libraries to ensure robust performance and ease of use.
It has good documentation that serves both beginners and experts. It was used in the
development of the proposed system because it is an essential tool for developing predictive
models and also for performing complex data analysis.
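
A minimal sketch of how a predictive model is trained and queried with scikit-learn is given below. It uses synthetic data generated by make_classification as a stand-in for a real dataset, and the number of samples, the number of features and the random seeds are arbitrary assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: 200 samples, 5 numeric features, binary label.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Train a random forest on the synthetic data.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Predict the class (0 or 1) of a single sample.
prediction = model.predict(X[:1])[0]
print(prediction)
```

The same fit and predict calls are shared by the other scikit-learn classifiers, which is what makes it straightforward to swap one algorithm for another.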

4.3 SYSTEM REQUIREMENTS


The proposed system is a very simple system, and there are only a few things that are
needed to use it, as it is a locally based application that runs locally on the machine and
may possibly be hosted on a web server in the future. However, the system still requires that
the user fulfils a few requirements in order to be able to use it. These requirements are as
follows:
1. A good, durable laptop with any functioning operating system.
2. Python version 3.10.8 or above installed.
3. Flask version 2.3.3 or above installed.
4. Scikit-learn version 1.3.1 or above installed.
5. The different water parameters that are needed by the system to analyse the water sample
provided.
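
The software requirements listed above can be checked programmatically. The sketch below uses only the Python standard library; the function names are hypothetical, and the version comparison is a simplification that only looks at the purely numeric parts of a version string.

```python
import sys
from importlib import metadata

# Minimum versions as listed in the requirements above.
MINIMUMS = {"flask": "2.3.3", "scikit-learn": "1.3.1"}
MIN_PYTHON = (3, 10, 8)


def as_tuple(version):
    """Turn '2.3.3' into (2, 3, 3); non-numeric parts are ignored (a simplification)."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def meets_minimum(installed, required):
    """Return True when the installed version is at least the required one."""
    return as_tuple(installed) >= as_tuple(required)


def check_environment():
    """Return {name: True/False} for Python and for each required package."""
    report = {"python": sys.version_info[:3] >= MIN_PYTHON}
    for package, required in MINIMUMS.items():
        try:
            report[package] = meets_minimum(metadata.version(package), required)
        except metadata.PackageNotFoundError:
            # The package is not installed at all.
            report[package] = False
    return report
```

Calling check_environment() returns a dictionary in which any entry marked False points to a requirement that still has to be installed or upgraded.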

4.4 USER INTERFACE OF THE PROPOSED SYSTEM


This section contains the interface design of the different sections of the proposed
system. The interface design of this system features the input interface design and the output
interface design.

4.4.1. The Input screen of the proposed system


The input screen of the proposed system is the first interface that the user sees when
they visit the system, and it is made up of only three sections: the header section, the input
section and the predict button. The input section exists to help the user key in the
parameters that would be used to analyze a given water sample.

Figure 4.1 The input screen of the proposed system


4.4.2 Output Screen of the proposed system:
This screen of the proposed system is used to display the prediction result at the end of
each prediction process. The details displayed on this page are the details that were supplied
by the user about the water sample, alongside the result that was obtained after the analysis
of the water sample using the parameters provided by the user.
Figure 4.2 The output screen of the proposed system.

4.5 SYSTEM IMPLEMENTATION OF THE PROPOSED SYSTEM


The system was implemented using the tools that were listed in the earlier sections of
this chapter. The researcher sourced the data sample from a reputable and verifiable data
resource (Kaggle), from which a single dataset was collected. The dataset was collected from
https://www.kaggle.com/datasets/uom190346a/water-quality-and-potability and it contained a
total of 3276 data items, of which 75% was used for training the model and 25% was used for
testing the model after it was developed.
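
To make the 75% / 25% division concrete, the sketch below shuffles the indices of 3276 items and slices them into a training part and a testing part using only the Python standard library. The function name and the seed are assumptions made for the example; the text does not state which routine was actually used to split the dataset.

```python
import random


def split_indices(n_items, test_fraction=0.25, seed=42):
    """Shuffle item indices and slice them into train and test parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = round(n_items * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(3276)
print(len(train_idx), len(test_idx))  # 2457 819
```

With 3276 items, 25% corresponds to 819 items for testing, which leaves 2457 items for training.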
4.5.1 Test result of the proposed system
After the system was developed, it was tested with 25% of the dataset that was obtained from
Kaggle across different algorithms in order to determine the algorithm that gives the highest
level of accuracy. The algorithms that were used for testing the model include the following:
Random Forest, Logistic Regression, SVM Classifier, Decision Tree and K-Nearest Neighbors
(KNN). The summary of the results obtained after the analysis is given in the table below:
Table 4.1 Summary of the algorithms and their accuracy values.
S/N   Type of Algorithm       Accuracy Value
1     Random Forest           0.692
2     Logistic Regression     0.507
3     SVM Classifier          0.655
4     Decision Tree           0.606
5     K-Nearest Neighbors     0.639
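
The kind of comparison summarized in Table 4.1 can be sketched with scikit-learn as shown below. This is an illustration under stated assumptions and not the researcher's script: it runs on synthetic data from make_classification, the hyperparameters are the library defaults (apart from max_iter and the random seeds), and the accuracies it prints therefore say nothing about the values reported in the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data, split 75% / 25% as described in the text.
X, y = make_classification(n_samples=400, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM Classifier": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(),
}

# Fit each model on the training part and score it on the held-out part.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Print the algorithms from the highest to the lowest accuracy.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<22}{score:.3f}")
```

Accuracy here is the fraction of test samples whose predicted label matches the true label, which is the quantity reported for each algorithm in the table.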

CHAPTER FIVE
SUMMARY, RECOMMENDATIONS AND CONCLUSION
5.1 OVERVIEW
This chapter gives a summary of this project, the recommendations for future works that
may exist within this domain of knowledge, and lastly a conclusion of this research work. It
summarizes everything that has been discussed in all the preceding chapters of this project.

5.2 SUMMARY
At the beginning of the research project, the problem statement and the background to the
study were explicitly outlined and discussed, thereby setting a clear ground for the reason
why this project was embarked upon. The theoretical underpinnings of this research were
covered in the later chapters of this work, together with a descriptive review of the relevant
systems that have been developed in the past by previous researchers and are currently in
place.
This research work also provided a thorough examination of the existing systems, pointed
out their shortcomings, and did its best to address these issues (gaps in knowledge) when
developing the system that was proposed in this research work. Additionally, a comprehensive
system analysis was conducted in the third chapter (chapter 3) of this study, where the system
was explicitly analyzed using use case diagrams, activity diagrams, class diagrams and
sequence diagrams of the UML. Lastly, on the note of the system's interface, a full
description of the user interface and a list of all the programming languages and frameworks
used to carry out this project were also provided and discussed in detail.

5.3 RECOMMENDATIONS
This water analysis system is recommended to anyone, anywhere in the world, who is concerned
with determining whether water is drinkable (safe for drinking) or not. It offers an easy way
of determining the potability of water when the parameters of the water sample have been
obtained through different processes and an individual is left to decide whether the water
sample is safe for drinking or not.

5.4 CONCLUSION
Finally, we have come to the end of this research, and this section concludes the
objectives that were defined at the beginning of this research work. Nevertheless, this is not
the end of the proposed system, as the work has not come to a halt with this conclusion but
would continue to grow and add new features and updates, such as the ability for users to
download the prediction results. However, for the purpose of this study, which has a limited
scope, the system that has been proposed throughout this study would be made available to run
locally on the laptop of the users and may possibly be hosted online in order to be made
accessible to users worldwide.
REFERENCES
Kaddoura, S. (2022). Evaluation of Machine Learning Algorithm on Drinking Water Quality for
Better Sustainability. Sustainability, 14(18), 11478.
Gleick, P. H. (2014). The world's water 2014: The state of resource use and demand. Island
Press.
World Health Organization. (2019). The world health report 2019: Health a global priority.
World Health Organization.
Dalal, S., Onyema, E. M., Romero, C. A. T., Ndufeiya-Kumasi, L. C., Maryann, D. C.,
Nnedimkpa, A. J., & Bhatia, T. K. (2022). Machine learning-based forecasting of potability of
drinking water through adaptive boosting model. Open Chemistry, 20(1), 816-828.
Hallaq, D. O., El-Khaldi, K., & Hammouri, H. A. (2019). An artificial neural network approach
for predicting the performance of wastewater treatment plants. Journal of Environmental
Management, 231, 506-514.
Zulkifli, C. Z., Garfan, S., Talal, M., Alamoodi, A. H., Alamleh, A., Ahmaro, I. Y. Y., Sulaiman, S.,
Ibrahim, A. B., Zaidan, B. B., Ismail, A. R., Albahri, O. S., Albahri, A. S., Soon, C. F., Harun, N.
H., & Chiang, H. H. (2022). IoT-Based Water Monitoring Systems: A Systematic Review. Water,
14(22), 3621.
Chen, X., Li, Y., & Li, Y. (2020). Artificial intelligence for surface water quality monitoring and
assessment: a systematic literature analysis. Environmental Science and Pollution Research,
27(4), 3616-3633.
Antanasijevic, D., Jovanovic, M., & Stojanovic, Z. (2020). Machine learning for water quality
prediction: a review. Journal of Hydroinformatics, 22(3), 631-647.
Gharibi, H., Kisi, O., & Şahin, M. (2012). Application of artificial neural networks for prediction
of dissolved oxygen in rivers. Journal of Hydrology, 414, 255-265.
Khot, I. M., & Surve, A. R. (2020). IoT Assisted Drinkable Water Quality Analysis System using
Machine Learning Techniques. International Journal for Research in Applied Science and
Engineering Technology, 8, 228-236.
Vijay, S., & Kamaraj, K. (2021). Prediction of water quality index in drinking water distribution
system using activation functions based Ann. Water Resources Management, 35(2), 535-553.
Hmoud Al-Adhaileh, M., & Waselallah Alsaade, F. (2021). Modelling and prediction of water
quality by using artificial intelligence. Sustainability, 13(8), 4259.
Kadam, A. K., Wagh, V. M., Muley, A. A., Umrikar, B. N., & Sankhua, R. N. (2019). Prediction of
water quality index using artificial neural network and multiple linear regression modelling
approach in Shivganga River basin, India. Modeling Earth Systems and Environment, 5, 951-
962.
Sakizadeh, M. (2016). Artificial intelligence for the prediction of water quality index in
groundwater systems. Modeling Earth Systems and Environment, 2, 1-9.
Nair, J. P., & Vijaya, M. S. (2022, August). River water quality prediction and index
classification using machine learning. In Journal of Physics: Conference Series (Vol. 2325, No.
1, p. 012011). IOP Publishing.
Haghiabi, A. H., Nasrolahi, A. H., & Parsaie, A. (2018). Water quality prediction using machine
learning methods. Water Quality Research Journal, 53(1), 3-13.
Derdour, A., Jodar-Abellan, A., Pardo, M. Á., Ghoneim, S. S., & Hussein, E. E. (2022).
Designing Efficient and Sustainable Predictions of Water Quality Indexes at the Regional Scale
Using Machine Learning Algorithms. Water, 14(18), 2801.
Alnaqeb, R., Alrashdi, F., Alketbi, K., & Ismail, H. (2022, December). Machine Learning-based
Water Potability Prediction. In 2022 IEEE/ACS 19th International Conference on Computer
Systems and Applications (AICCSA) (pp. 1-6). IEEE.

APPENDIXES
