
The Report is Generated by DrillBit Plagiarism Detection Software

Submission Information

Author Name: KAMINIKONDA SIVA PRATHAP (20091A05E5)
Title: REFINEMENT OF WATER POLLUTANTS THROUGH PREDICTIVE WATER QUALITY ESTIMATION USING ARTIFICIAL NEURAL NETWORKS
Paper/Submission ID: 1700330
Submitted by: principal.9@jntua.ac.in
Submission Date: 2024-04-26 12:04:44
Total Pages: 43
Document type: Project Work

Result Information

Similarity: 15 %

Sources Type:
Student Paper: 0.35%
Internet: 5.85%
Journal/Publication: 8.79%

Report Content:
Quotes: 0.56%
Words < 14: 6.05%

Exclude Information:
Quotes: Not Excluded
References/Bibliography: Not Excluded
Sources: Less than 14 Words %: Not Excluded
Excluded Source: 0%
Excluded Phrases: Not Excluded

Database Selection:
Language: English
Student Papers: Yes
Journals & publishers: Yes
Internet or Web: Yes
Institution Repository: Yes


DrillBit Similarity Report

SIMILARITY %: 15
MATCHED SOURCES: 77
GRADE: B

Grade scale:
A - Satisfactory (0-10%)
B - Upgrade (11-40%)
C - Poor (41-60%)
D - Unacceptable (61-100%)

LOCATION | MATCHED DOMAIN | % | SOURCE TYPE
1 | springeropen.com | 1 | Internet Data
2 | etd.cput.ac.za | 1 | Publication
3 | www.dx.doi.org | <1 | Publication
4 | sjcit.ac.in | <1 | Publication
5 | www.dgvaishnavcollege.edu.in | <1 | Publication
6 | arxiv.org | <1 | Publication
7 | riptutorial.com | <1 | Publication
8 | www.dx.doi.org | <1 | Publication
9 | moam.info | <1 | Internet Data
10 | www.redswitches.com | <1 | Internet Data
11 | sobiad.org | <1 | Publication
12 | www.dx.doi.org | <1 | Publication
13 | www.geeksforgeeks.org | <1 | Internet Data
14 | www.squash.io | <1 | Internet Data
15 | ijircce.com | <1 | Publication
16 | jpinfotech.org | <1 | Internet Data
17 | ADITHRI F2 FARMERS FRIEND BY 188R1D5803 YR 2020, JNTUH | <1 | Student Paper
18 | Drought Management Planning Policy From Europe to Spain by Hervs-Gmez-2019 | <1 | Publication
19 | eprints.hrwallingford.com | <1 | Publication
20 | Thesis Submitted to Shodhganga Repository | <1 | Publication
21 | epdf.pub | <1 | Internet Data
22 | trepo.tuni.fi | <1 | Publication
23 | moam.info | <1 | Internet Data
24 | apps.dtic.mil | <1 | Publication
25 | Cartesian Genetic Programming for Diagnosis of Parkinson Disease through Handwri by Parziale-2020 | <1 | Publication
26 | www.canada.ca | <1 | Internet Data
27 | Predicting intersection queue with neural network models by Gang-Le-1995 | <1 | Publication
28 | A Novel Methodology for Converting a Conventional Building to a Nearly Zero Ener by Jabbour-2020 | <1 | Publication
29 | How are UML Class Diagrams built in practice A usability study of two UML tools by Planas-2019 | <1 | Publication
30 | docobook.com | <1 | Internet Data
31 | Drugtarget interaction prediction via chemogenomic space learning-based method by Mousavian-2014 | <1 | Publication
32 | moam.info | <1 | Internet Data
33 | www.fao.org | <1 | Publication
34 | bioresources.cnr.ncsu.edu | <1 | Internet Data
35 | www.mdpi.com | <1 | Internet Data
36 | builtin.com | <1 | Internet Data
37 | egyankosh.ac.in | <1 | Publication
38 | journals.uran.ua | <1 | Publication
39 | Thesis Submitted to Shodhganga Repository | <1 | Publication
40 | Thesis Submitted to Shodhganga Repository | <1 | Publication
41 | www.jiaci.org | <1 | Publication
42 | thehousingbubbleblog.com | <1 | Internet Data
43 | Thesis Submitted to Shodhganga Repository | <1 | Publication
44 | vision.soic.indiana.edu | <1 | Internet Data
45 | docplayer.net | <1 | Internet Data
46 | www.eoportal.org | <1 | Internet Data
47 | bmcmedicine.biomedcentral.com | <1 | Publication
48 | doku.pub | <1 | Internet Data
49 | easychair.org | <1 | Publication
50 | ebin.pub | <1 | Internet Data
51 | Performance evaluation of linear and nonlinear models for the estimati by Goodarzi-2018 | <1 | Publication
52 | Student paper published in Open Access Journal-www.openaccessjournal.com | <1 | Publication
53 | The quest for space capabilities and military security in Africa by Oyewole-2020 | <1 | Publication
54 | Towards green buildings Glass as a building elementthe use and misuse in the g by Mohse-2006 | <1 | Publication
55 | unej.ac.id | <1 | Internet Data
56 | Architectures and Protocols for Capacity Efficient, Highly Dynamic and by Chiu-2012 | <1 | Publication
57 | docplayer.info | <1 | Internet Data
58 | docplayer.net | <1 | Internet Data
59 | eprints.ums.ac.id | <1 | Publication
60 | Experimental personalized array translator system by Hellerman-1964 | <1 | Publication
61 | IEEE 2013 17th IEEE Workshop on Signal and Power Integrity (SPI) - P by | <1 | Publication
62 | INFORMATICSnsuworks.nova.edu | <1 | Publication
63 | Knowledge management to support fate and transport modeling efforts in risk-base by V-2002 | <1 | Publication
64 | Local description of a polyenic radical cation by P-1995 | <1 | Publication
65 | mdpi.com | <1 | Internet Data
66 | moam.info | <1 | Internet Data
67 | moam.info | <1 | Internet Data
68 | slideshare.net | <1 | Internet Data
69 | SN 2008inBRIDGING THE GAP BETWEEN NORMAL AND FAINT SUPERNOVAE OF TYPE IIP by Roy-2011 | <1 | Publication
70 | sportdocbox.com | <1 | Internet Data
71 | Submitted to Visvesvaraya Technological University, Belagavi | <1 | Student Paper
72 | tc.copernicus.org | <1 | Internet Data
73 | Which cognitive dual-task walking causes most interference on the Time by Zirek-2018 | <1 | Publication
74 | www.biorxiv.org | <1 | Internet Data
75 | www.jssm.org | <1 | Internet Data
76 | www.jstage.jst.go.jp | <1 | Publication
77 | www.researchgate.net | <1 | Internet Data
ABSTRACT

The purpose of this project is to study water quality for drinking and irrigation. Water is probably the most precious resource after air. From an ecological and economic point of view, water quality is of great importance. Deteriorating water quality in industry can pose risks and cause significant economic losses. Therefore, water quality must be analysed before the water is used for any purpose. The need for water quality monitoring is well known. However, traditional methods of testing water quality require physically collecting water samples and analysing them in a laboratory, which is costly and time-consuming. Another traditional strategy is the use of sensors, but using sensors to test all aspects of water quality is expensive and often gives low accuracy. The existing process takes a lot of effort, time, and paperwork, requires manual calculation, and gives higher officials no direct role. To overcome these limitations and improve the precision of operation, a more effective computerization of the system is required. The proposed system eliminates or reduces the difficulties, workload, and mental conflict to some extent. The key attributes are pH, Hardness, Solids, Chloramines, Sulphate, Conductivity, Organic Carbon, Trihalomethanes, and Turbidity. Among these, the main attributes for drinking purposes are pH, Hardness, Conductivity, and Turbidity, and for irrigation purposes pH, Hardness, and Sulphate. A deep learning model, an Artificial Neural Network, is used to predict whether water is pure or impure based on historical data.

Keywords: Artificial Neural Networks, pH, Hardness, Solids, Chloramines, Sulphate, Conductivity, Organic Carbon, Trihalomethanes, Turbidity.
Water Quality Prediction Using ANN Dept. of CSE, RGMCET Page 1

CHAPTER 1 INTRODUCTION

1.1 Overview

Water quality has a direct impact on both public health and the environment. In addition to being used for drinking, water is also used for industry and agriculture. Human societies have depended more on rivers than on other water sources for their expansion, since rivers are the most accessible. Non-conventional water sources such as seawater and groundwater were only rarely used to remedy shortages. Over time, concerns about water quality and conservation have spread throughout the world. Only about 2.5 percent of the Earth's water is freshwater, and only about 0.3% of that freshwater is readily accessible.

Fig 1.1: Water Quality Prediction

The state of water, encompassing its chemical, physical, and biological properties, is referred to as "water quality." Modeling the parameters of water quality is an essential component of any examination of aquatic systems. Compared to traditional modeling methods, the artificial neural network is a novel approach with a flexible mathematical framework that can recognize intricate non-linear relationships between input and output data. The main inputs are parameters that can be gathered at low cost. They are:

pH value (potential of hydrogen): The pH is a crucial parameter in assessing the acid-base balance of water. It also indicates how acidic or alkaline the water is. WHO has recommended a permissible pH range of 6.5 to 8.5.
The values in the current investigation ranged from 6.52 to 6.83, which is within the WHO recommended range.

Hardness: Hardness is mainly caused by calcium and magnesium salts. Water dissolves these salts as it passes through geologic strata that contain them, and the amount of hardness in raw water is determined by how long the water is in contact with hardness-producing material. Hardness was originally defined as the capacity of water to precipitate soap, which is caused by calcium and magnesium. Water is commonly classified by hardness as follows:
• Below 75 mg/L - soft
• 76 to 150 mg/L - moderately hard
• 151 to 300 mg/L - hard
• More than 300 mg/L - very hard

Solids (Total Dissolved Solids - TDS): Water can dissolve many inorganic and some organic minerals and salts, including potassium, calcium, sodium, bicarbonates, chlorides, magnesium, and sulphates. These minerals give the water an unwanted taste and a diluted colour. TDS is therefore an important parameter for the use of water. High TDS values indicate highly mineralized water. For drinking purposes, the desirable limit for TDS is 500 mg/L and the highest allowable limit is 1000 mg/L.

Chloramines: Chlorine and chloramine are the two primary disinfectants used in public water systems. Chloramines are most commonly formed when ammonia is added to chlorine to treat drinking water. Chlorine levels of up to 4 mg/L, or 4 parts per million (ppm), are considered safe in drinking water.

Sulphate: Sulphates are naturally occurring substances found in rocks, soil, and minerals. They are also present in ambient air, groundwater, plants, and food.
In most freshwater supplies, sulphate concentrations range from 3 to 30 mg/L, although much higher concentrations (1000 mg/L) are found in certain places. The concentration of sulphate in seawater is around 2,700 mg/L.

Conductivity: Pure water is a good insulator rather than a good conductor of electric current. The electrical conductivity of water increases as the concentration of ions rises. In general, the amount of dissolved solids in water determines its electrical conductivity. Electrical conductivity (EC) measures the ability of a solution to carry current through its ions. The WHO recommends that the EC value not exceed 400 µS/cm.

Organic carbon: Total organic carbon (TOC) in source waters comes from decaying natural organic matter (NOM) as well as from synthetic sources. TOC is a measure of the total amount of carbon in organic compounds in water. According to the US EPA, TOC should be less than 2 mg/L in treated or drinking water and less than 4 mg/L in source water used for treatment.

Trihalomethanes (THMs): THMs are chemicals that may be found in water treated with chlorine. Their concentration in drinking water depends on the amount of organic matter in the water, the amount of chlorine needed to treat the water, and the temperature of the water being treated. THM concentrations of up to 80 µg/L (80 ppb) in drinking water are regarded as safe.

Turbidity: The turbidity of water depends on the quantity of solid matter present in the suspended state. It is measured by how much light the suspended particles scatter, and it is used to indicate the quality of waste discharge with respect to colloidal matter. The mean turbidity value obtained for Wondo Genet Campus (0.98 NTU) is lower than the WHO-recommended threshold of 5.00 NTU.
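The guideline values quoted above can be collected into a small lookup so that a single sample is screened in one pass. The following is only a minimal sketch: the dictionary layout, the key names, the helper functions, and the treatment of boundary values are illustrative assumptions, not part of the project code.

```python
# Guideline limits taken from the parameter descriptions above.
# Key names and structure are hypothetical, chosen for illustration only.
GUIDELINE_LIMITS = {
    "ph": (6.5, 8.5),               # WHO recommended range
    "solids": (None, 1000.0),       # TDS in mg/L, highest allowable limit
    "chloramines": (None, 4.0),     # chlorine/chloramine level in mg/L (ppm)
    "organic_carbon": (None, 2.0),  # TOC in mg/L for treated/drinking water
    "turbidity": (None, 5.0),       # NTU, WHO-recommended threshold
}


def hardness_class(mg_per_l):
    """Map a hardness reading (mg/L) to the classes listed in the text.

    Assumption: readings between the listed bands (e.g. 75.5) fall into
    the next higher band.
    """
    if mg_per_l <= 75:
        return "soft"
    if mg_per_l <= 150:
        return "moderately hard"
    if mg_per_l <= 300:
        return "hard"
    return "very hard"


def out_of_range(sample):
    """Return the names of measured parameters that violate their limit."""
    flagged = []
    for name, (low, high) in GUIDELINE_LIMITS.items():
        value = sample.get(name)
        if value is None:
            continue  # parameter was not measured for this sample
        if (low is not None and value < low) or (high is not None and value > high):
            flagged.append(name)
    return flagged


sample = {"ph": 6.7, "solids": 1200.0, "turbidity": 0.98, "hardness": 180.0}
print(out_of_range(sample))                # ['solids']
print(hardness_class(sample["hardness"]))  # hard
```

A rule-based screen like this only checks each parameter in isolation; the ANN described in this report is intended to learn how the parameters interact.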
Potability: A value of 1 indicates that the water is suitable for human consumption, while a value of 0 indicates that it is not.

Artificial Neural Networks (ANNs) offer a number of benefits for water quality prediction. First, unlike typical linear models, ANNs are good at capturing complex nonlinear interactions among many contributing factors. Since the relationships between variables in the water quality domain are frequently intricate and dynamic, this nonlinear modeling capacity is very helpful.

1.2 Objectives of Project

The goal of this research is to use an artificial neural network to find the best model fit for predicting water quality indicators. It has been demonstrated that this artificial neural network structure is the most effective neural network structure for hydrological and aquatic simulation. The purpose of the ANN model is to quickly analyze and forecast certain water quality characteristics at any point within the domain of interest. The input parameters are the respective variables measured at different sites. Salinity, temperature, dissolved oxygen, and chlorophyll-a are the variables that matter.

Water quality prediction is important worldwide for a variety of reasons. First and foremost, access to clean, safe water is a basic human need, and water quality forecasting helps to ensure that drinkable water is available to communities everywhere. Public health and water quality are linked because contaminated water can lead to waterborne illnesses that potentially affect millions of people worldwide.

The contribution of the proposed work is outlined in the following points:
• The proposed work provides a thorough explanation and analysis of the water quality classification problem.
• The dataset is thoroughly pre-processed by the framework to make sure it is suitable for feeding into the ANN model.
• The proposed work identifies the most significant features, together with feature importance, feature dependencies, and feature weights, which enables optimized classification of the water quality dataset.
CHAPTER 2 LITERATURE SURVEY

2.1 Introduction

A literature review identifies the features of the current system. To design or establish a new system, the previous system is studied and the challenging issues it encountered are analyzed. The drawbacks of the current system are examined to support the proposed one. The advantages of the proposed system are then outlined along with its definition of the problem.

2.2 Related work

2.2.1 Related work 1: The paper "Water Quality Analysis and Prediction using Machine Learning" (2023) by Abirami K, Priyadarshini, Changail Radhakrishna, and Monisha A Venkatesan investigates Naïve Bayes, K-Nearest Neighbour (K-NN), Decision Trees, Random Forest classifiers, and Support Vector Machines (SVM). They found that Random Forest-based prediction performs best, with 91.9% accuracy and less training time. They also suggest combining wireless technologies and sensors to create a real-time Internet of Things water quality monitoring system, which would extend the practical uses of the study.

2.2.2 Related work 2: The study "Toward Design of Internet of Things and Machine Learning-Enabled Frameworks for Analysis and Prediction of Water" (2023) by Mushtaque Ahmed Rahu, Abdul Fattah Chandio, Khursheed Aurangzeb, Sarang Karim, Musaed Alhussein, and Muhammad Shahid Anwar makes use of Support Vector Machines (SVM), Long Short-Term Memory (LSTM), XGBoost, Random Forest, and K-Nearest Neighbour (K-NN). Their study demonstrates that real-time data collection can be achieved inexpensively with IoT technology, eliminating the need for manual data collection and regular on-site monitoring. They do point out that installing IoT and machine learning frameworks takes a lot of time and labor.

2.2.3 Related work 3: The study "Water Quality Classification Using SVM and XGBoost Method" (2022) was conducted by Hasriq Izzuan Hasnol Yusri, Afhzan Ab Rahim, Siti Lailatul Mohd Hassan, Ili Shairah Abdul Halim, and Noor Ezan Abdullah. Their investigation covers the XGBoost and Support Vector Machine (SVM) algorithms; XGBoost outperforms SVM with an accuracy of 94%, which still leaves a 6% misclassification error. This study emphasizes the superiority of XGBoost over conventional SVM techniques and highlights its efficacy in classifying water quality.

2.2.4 Related work 4: The paper "Testing of Water Quality Using SVM" (2021) by Hansen F. Charlie and Gloreine Dela Cruz focuses on Total Organic Carbon (TOC) and the Support Vector Machine (SVM). They found that SVM handled non-linear data well and attained an accuracy of 99.25%. They did point out that SVM can be computationally demanding, especially when dealing with big datasets. This study provides important insights for future applications and optimizations by highlighting the high accuracy and suitability of SVM for water quality testing while also noting its computational limitations.

2.2.5 Related work 5: The study "Water Quality Prediction Method Based on AE-LSTM" (2020) by Huiqing Zhang and Kemei Jin presents a method that combines an autoencoder (AE) with Long Short-Term Memory (LSTM) neural networks. Their AE-LSTM model effectively reduces feature dimensionality and shows higher prediction accuracy, with R² values above 0.9. They do, however, recognize the difficulty of gathering large datasets as well as the model's complexity and interoperability problems. While outlining areas for future optimization and modification, this work offers a promising approach for reliable water quality prediction.
CHAPTER 3 FEASIBILITY STUDY

The feasibility of the project is analyzed in this phase, and a business proposal is put forth with a very general plan for the project and some cost estimates. This is to make sure that the proposed system will not be a burden to the business. A feasibility study requires a basic understanding of the system's primary requirements. The feasibility analysis takes three main factors into account:
1. Technical feasibility
2. Economic feasibility
3. Social feasibility

3.1 TECHNICAL FEASIBILITY: The purpose of this study is to confirm the technical feasibility, that is, the technical requirements, of the system. The development of any system must not significantly strain the existing technical resources. Otherwise, the available technical resources will be placed under high demand, which in turn places high demands on the client. The designed system must have modest requirements, so that implementing it necessitates minimal or no changes.

3.2 ECONOMICAL FEASIBILITY: The purpose of this study is to determine how the system will affect the organization's finances. The company can only invest a certain amount of money in the research and development of the system, so the costs have to be justified. Because the majority of the technologies used are freely available, the developed system could be implemented within the allocated budget. Only the customized products had to be purchased.

3.3 SOCIAL FEASIBILITY: One of the study's objectives is to determine how well users accept the system. This entails training the user to use the technology properly. Instead of viewing the system as a threat, users must accept it as a necessity. The extent of acceptance depends on the techniques used to familiarize and educate the user about the system. Since the user is the final user of the system, the user's confidence must be raised so that the user can also offer constructive criticism, which is welcomed.
CHAPTER 4 DATA WRANGLING & EXPLORATORY DATA ANALYSIS

A collection of data is called a data set, or dataset. For tabular data, a data set corresponds to one or more database tables, where each row of the table corresponds to a specific record from the data set and each column denotes a different variable. The data set used in our model includes the following attributes:
• pH value of water
• Hardness of water
• Solids
• Chloramines
• Sulphate
• Conductivity
• Organic carbon
• Trihalomethanes
• Turbidity
• Label - Potability

4.1 Getting Insights About The Data Set

There are different ways to read a dataset in Python; here we have read the data set using the pandas module. Pandas is a very popular and widely used data manipulation library. One of its most important and mature functions is read_csv(), which reads a .csv file into a data frame that can then be manipulated. We have used the following commands to read the data set.

    import pandas as pd
    df = pd.read_csv('water_potability.csv')
    df.head()

Figure 4.1: Overview of Dataset
We get insights about the data using different functions and attributes:
• df.head(): Returns the first 5 rows of the data frame.
• df.tail(): Returns the last 5 rows of the data frame.
• df.shape: An attribute (not a method) holding a tuple that gives the number of rows and columns of the data frame.
• df.describe(): Computes summary statistics of the numerical values of a Series or data frame, such as the percentiles, mean, and standard deviation.
• df.info(): Prints a concise summary of the data frame.

4.2 Data Wrangling

Data wrangling is the process of cleaning and organizing messy and complicated data sets so that they can be accessed and analyzed easily. As data sources and the amount of available data grow quickly, it is becoming more and more important to organize large volumes of data for analysis.

4.2.1 Handling Missing Values: Missing data occurs when no information is provided for one or more items or for a whole unit. Missing data is also referred to as NA (Not Available) values in pandas. Pandas provides several useful functions for detecting, removing, and replacing null values in a data frame:
• isnull(): To check whether any missing values are present
• fillna(): To fill the null values
• replace(): To replace values
• dropna(): To drop null values

4.2.2 Handling Outliers: A data item or object that differs noticeably from the other (so-called normal) objects is called an outlier. Outliers may be caused by errors in measurement or execution. The analysis used for outlier detection is called outlier mining. Removing an outlier from a data frame is the same as removing any other data item from a pandas data frame, and there are several methods for doing so.
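The missing-value functions and the outlier removal described above can be sketched on a small synthetic data frame. Everything in the example is an assumption made for illustration: the column names and values are made up, median imputation is just one common choice, and the 1.5 × IQR rule is one of several possible outlier criteria; the report does not state which method the project used.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the water-quality data (values are invented).
df = pd.DataFrame({
    "ph": [7.0, 6.8, np.nan, 7.2, 7.1, 6.9],
    "Sulphate": [330.0, np.nan, 340.0, 335.0, 900.0, 332.0],
})

# Detect: count the missing values in each column.
missing_counts = df.isnull().sum()

# Replace: fill each gap with the median of its column.
filled = df.fillna(df.median())

# Remove (alternative): drop every row that contains a missing value.
dropped = df.dropna()

# Outliers: keep rows whose Sulphate value lies within 1.5 * IQR
# of the first and third quartiles.
q1 = filled["Sulphate"].quantile(0.25)
q3 = filled["Sulphate"].quantile(0.75)
iqr = q3 - q1
in_range = filled["Sulphate"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = filled[in_range]

print(missing_counts.to_dict())
print("rows after dropna:", len(dropped))
print("rows after outlier removal:", len(cleaned))
```

Filling with the median rather than the mean keeps the imputed value from being pulled towards an extreme reading such as the 900 in the Sulphate column.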
4.3 Exploratory Data Analysis

Data scientists use exploratory data analysis (EDA), which frequently makes use of data visualization techniques, to analyze, study, and summarize the key features of data sets. It helps determine how best to manipulate data sources to obtain the required answers, making it easier for data scientists to find patterns, identify anomalies, test hypotheses, and verify assumptions.

4.4 Data Visualization

Data visualization is the representation of data through common graphics, such as charts, plots, infographics, and even animations. These visual displays make complex data relationships and data-driven insights understandable. There are 3 types of analysis behind these visualizations:
• Univariate Analysis
• Bivariate Analysis
• Multivariate Analysis

4.4.1 Univariate Analysis: Univariate analysis examines one variable on its own, concentrating on a single variable at a time. Say you are researching the growth of plants. Using univariate analysis, you can learn about the distribution of heights, the average height of the plants, and the number of plants that fall within a given height range. Measures of central tendency (mean, median, and mode) and measures of dispersion (variance, standard deviation) are used here.

4.4.2 Bivariate Analysis: Bivariate analysis examines how two variables are related to one another. Referring back to the plant example, bivariate analysis might be performed to determine whether plant height and fertilizer use are correlated. This could involve visual aids such as scatter plots, or correlation analysis.

4.4.3 Multivariate Analysis: Data reflects the complexity of the world, which is rarely black and white. Multivariate analysis deals with more than two variables in a dataset. Assume you are studying the variables that influence agricultural productivity. You might consider soil type, fertilizer, temperature, and sunlight. Multivariate analysis gives a more comprehensive understanding of how these variables interact and affect the yield. Regression analysis, factor analysis, and machine learning models are a few possible techniques in this context.
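The three levels of analysis can be illustrated numerically with the plant example used above. The sketch below assumes a made-up data frame (the column names and values are hypothetical) and uses Pearson correlation and an ordinary least-squares fit as representative bivariate and multivariate techniques; other choices would be equally valid.

```python
import numpy as np
import pandas as pd

# Invented measurements for six plants.
plants = pd.DataFrame({
    "fertilizer_g": [0, 10, 20, 30, 40, 50],
    "sunlight_h": [4, 6, 5, 7, 6, 8],
    "height_cm": [12.0, 15.5, 18.0, 22.5, 24.0, 29.0],
})

# Univariate: central tendency and dispersion of a single variable.
height = plants["height_cm"]
univariate = {
    "mean": height.mean(),
    "median": height.median(),
    "variance": height.var(),
    "std": height.std(),
}

# Bivariate: Pearson correlation between height and fertilizer use.
r = plants["height_cm"].corr(plants["fertilizer_g"])

# Multivariate: pairwise correlation matrix of all variables, and a
# least-squares regression of height on fertilizer and sunlight together.
corr_matrix = plants.corr()
X = np.column_stack([
    np.ones(len(plants)),                 # intercept term
    plants["fertilizer_g"].to_numpy(),
    plants["sunlight_h"].to_numpy(),
])
coef, *_ = np.linalg.lstsq(X, plants["height_cm"].to_numpy(), rcond=None)

print(univariate)
print("correlation height vs fertilizer:", round(r, 3))
print(corr_matrix)
print("intercept, fertilizer and sunlight coefficients:", coef)
```

In practice the same quantities are usually paired with plots: a histogram for the univariate view, a scatter plot for the bivariate view, and a correlation heatmap for the multivariate view.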
CHAPTER 5 SYSTEM DESIGN

5.1 Problem Definition: Traditional water quality monitoring methods, while essential, have limitations. Just as businesses need to predict customer churn, accurately predicting changes in water quality is key to effective resource management. Deep learning offers a well-suited solution. The challenge is to develop a model that predicts future water quality. Instead of relying on time-consuming lab tests or potentially inaccurate sensors, we can use deep learning algorithms such as Artificial Neural Networks (ANNs). The goal is to train an ANN on historical water quality data, along with factors such as weather or flow rates. This data-driven approach allows the model to learn complex relationships and predict future water quality values with high accuracy. This translates into significant benefits, since early warnings of potential water quality issues enable preventative action to safeguard public health. In addition, accurate predictions inform better decisions regarding water treatment, resource allocation, and regulatory compliance. However, implementing ANNs requires expertise in data science and access to large, high-quality datasets for training. Significant computational resources might also be needed. In conclusion, deep learning is a powerful tool for water quality prediction, but addressing the challenges of data dependency, model development expertise, and computational resources is crucial for successful implementation.

5.2 Existing System

In the existing system, water quality is tested using traditional techniques such as collecting water specimens manually and then analyzing them in a laboratory. This is time-consuming and expensive. The biggest challenge is to improve water quality and to supply people with water of the best possible quality.
5.2.1 Disadvantages of Existing System
• Needs more manpower.
• Time consuming.
• Consumes a large volume of paperwork.
• Needs manual calculation.
• No direct role for the higher officials.

To avoid all these limitations and make the work more accurate, the system needs to be computerized in a better way.

5.3 Proposed Methodology

Using deep learning, the proposed system provides a novel method for assessing the quality of water. It seeks to solve the shortcomings of the current approaches while streamlining the procedure. Like conventional testing techniques that examine elements such as pH or mineral content, the system accepts a variety of water properties as input. At its foundation, however, is a deep learning model, an Artificial Neural Network (ANN), rather than a manual examination. Because it has been trained on a large body of historical water quality data, this model can recognize trends and classify the water as "pure" or "impure" depending on the input variables. The advantages are evident: less effort in comparison to manual testing, clarity in the interpretation of complicated results, and the possibility of overcoming some of the drawbacks of conventional approaches.

5.3.1 Advantages of Proposed System
• ANNs capture non-linear water quality relationships for accurate predictions.
• Data-driven learning allows ANNs to identify subtle patterns in water quality data.
• Retraining ANNs enables adaptation to changing environmental conditions.
• Deep learning models have the potential for higher accuracy in water quality prediction.
• ANNs can handle complex data sets incorporating various water quality factors.
• Trained ANN models offer efficient water quality prediction for real-time applications.
5.4 Introduction to Deep Learning Models

Deep learning is a subfield of machine learning, which in turn belongs to the broader science of artificial intelligence. Deep neural networks, which have many layers, are used to model and handle complicated tasks. The term "deep" refers to the many layers in the network. Supervised, semi-supervised, or unsupervised techniques may be applied. In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound. Deep learning models can attain state-of-the-art accuracy, occasionally surpassing human performance. Models are trained using neural network architectures with multiple layers and a large amount of labeled data. Because the majority of deep learning techniques employ neural network architectures, deep learning models are sometimes referred to as deep neural networks.

5.4.1 Working of Deep Learning Models: The operation of deep learning models can be broken down into the following key steps:

Data Gathering and Preprocessing: A lot of data is needed to train deep learning models. This data is gathered from multiple sources and must be preprocessed to guarantee consistency and quality. Depending on the task and the kind of data (images, text, audio, etc.), preprocessing may involve data cleaning, normalization, scaling, and augmentation.

Model Architecture Design: The architecture of a deep learning model specifies its composition, including the number and type of layers, the connections between them, and the activation functions. Recurrent neural networks (RNNs) handle sequential data, transformer architectures handle natural language processing tasks, and convolutional neural networks (CNNs) handle images.

Initialization: The model's parameters (weights and biases) are set before training starts. This step can greatly affect both the learning process and the final performance of the model.

Training: To train a deep learning model, training examples are repeatedly presented to the model, and its parameters are adjusted to reduce the discrepancy between the model's predictions and the actual targets.
The following steps are usually included in the training process:
• Forward Pass: The input data is passed through the model, producing predictions.
• Calculation of Loss: A loss function is used to calculate the difference between the actual targets and the predicted outputs.
• Backward Pass: Methods such as backpropagation are used to compute the gradients of the loss function with respect to the parameters of the model.
• Update of Parameters: To minimize the loss, the model's parameters are updated using optimization techniques such as Adam, RMSprop, or stochastic gradient descent (SGD).

Validation: To track how well the model generalizes and to detect overfitting, its performance is assessed during training on a separate validation dataset. Based on validation performance, hyperparameters such as learning rate, batch size, and regularization strength can be adjusted.

Testing and Evaluation: After training is finished, a separate test dataset is used to evaluate the trained model's performance on unseen data. Depending on the task, evaluation metrics such as accuracy, precision, recall, F1 score, or mean squared error may be used.

Deployment: After extensive testing and evaluation, the trained model can be used in production environments to carry out real-world tasks. To ensure scalability, dependability, and security, deployment strategies may include integrating the model into software applications and deploying it on cloud servers or edge devices.
Monitoring and maintenance: In order to guarantee optimal performance over time, deep learning models require ongoing
51
monitoring and upkeep. Tracking metrics, spotting shifts in data distributions, and routinely retraining the model with new
35
data are all part of monitoring, which keeps the model accurate and relevant. 5.4.2 Introduction to Artificial Neural Networks
Artificial neurons were first conceptualized in 1943 by McCulloch and Pitts. The backpropagation (BP) training technique for feedforward ANNs marked the beginning of ANN applications in research domains. An artificial neural network (ANN) is a type of information processing system that simulates the functions and connections of biological neurons to approximate the behaviour of a human brain.
5.4.3 Architecture of ANN
Figure 5.1: ANN Architecture
Input layer: The input layer is where the network receives and
processes raw data. For instance, if the network is intended to recognize handwritten digits, the input layer receives a 28x28 pixel image of the digit.
Hidden Layer: Most of the computation takes place in the hidden layers. Their interconnected nodes, often referred to as artificial neurons, are loosely modeled after biological neurons. Each hidden-layer neuron receives weighted inputs from every neuron in the preceding layer and applies an activation function to calculate its output. By adding nonlinearity, activation functions enable the network to recognize intricate patterns in the data. A neural network may have one or more hidden layers.
Output layer: The output layer generates the network's final output. In the digit recognition network, the output layer has ten neurons, one for each possible digit (0 to 9). The neuron with the highest output corresponds to the digit that the network considers most likely to be shown in the input image.
Learning is made possible by the connections between neurons in different layers. These connections are initially assigned random weights. During training, the network modifies the weights to improve its performance on the training dataset. The term "backpropagation" refers to the procedure used to compute these weight adjustments.
5.5 System Architecture
Figure 5.2: System Architecture
The following describes how the
data pre-processing stages relate to employing an ANN for water quality prediction, even though the figure does not explicitly depict the ANN architecture:
Data Collection: In this step, water quality data is gathered from multiple sources, including monitoring stations and sensors in lakes and rivers. The data may include measurements of pH, temperature, dissolved oxygen levels, and the presence of pollutants.
Exploratory Data Analysis (EDA): This phase builds an understanding of the data. Outliers, trends, or patterns that could affect water quality can be spotted here. For example, EDA may show a relationship between higher levels of bacteria and increased rainfall.
Data Pre-processing: Real-world water quality data may contain missing values or inconsistencies. Data cleaning addresses these problems so that the ANN model receives accurate and consistent input.
Data Imbalance: At this point, the dataset is checked for class imbalance. In water quality prediction, this typically means that many data points represent normal readings and only a few represent contaminated water. Machine learning models may struggle with imbalanced data.
Over Sampling: If the data is imbalanced, oversampling may be applied. This adds more data points to the minority class (such as readings of contaminated water) to produce a more balanced dataset.
Splitting the Data: The pre-processed data is divided into training and testing sets. The training set is used to train the network, whereas the testing set is used to assess the ANN model's performance on unseen data.
Model Development & Training: Here, the ANN model is trained on the training set. The model learns patterns in the data and uses them to predict water quality from new input data.
Evaluation of the Model: After training, the testing data is used to evaluate the model's performance. This helps assess the model's ability to generalize to new data, which is essential for practical use.
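The stages above, together with the forward pass, loss gradient, backward pass, and parameter update described earlier, can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the project's actual implementation: the data is synthetic, and the function names, network size, learning rate, and epoch count are all assumptions. One deliberate variation is that the sketch oversamples only the training portion after splitting, so that duplicated rows cannot leak into the test set.

```python
import numpy as np

def train_test_split(X, y, test_fraction, rng):
    """Shuffle the rows and hold out a fraction of them for testing."""
    order = rng.permutation(len(y))
    n_test = int(len(y) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

def oversample_minority(X, y, rng):
    """Randomly duplicate minority-class rows until both classes have equal size."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    minority_rows = np.flatnonzero(y == minority)
    extra = rng.choice(minority_rows, size=counts.max() - counts.min(), replace=True)
    return np.vstack([X, X[extra]]), np.concatenate([y, y[extra]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ann(X, y, rng, hidden=8, learning_rate=0.5, epochs=500):
    """Train a one-hidden-layer network with full-batch gradient descent."""
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))  # random initial weights
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    targets = y.reshape(-1, 1)
    for _ in range(epochs):
        # Forward pass: hidden activations, then predicted probabilities.
        hidden_out = np.tanh(X @ W1 + b1)
        prob = sigmoid(hidden_out @ W2 + b2)
        # Gradient of the mean binary cross-entropy loss at the output.
        grad_out = (prob - targets) / len(targets)
        # Backward pass: propagate the gradient through the tanh layer.
        grad_hidden = (grad_out @ W2.T) * (1.0 - hidden_out ** 2)
        # Parameter update (plain gradient descent).
        W2 -= learning_rate * (hidden_out.T @ grad_out)
        b2 -= learning_rate * grad_out.sum(axis=0)
        W1 -= learning_rate * (X.T @ grad_hidden)
        b1 -= learning_rate * grad_hidden.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    prob = sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2)
    return (prob.ravel() >= 0.5).astype(int)

# Synthetic stand-in data (an assumption): 180 "normal" and 20 "contaminated" rows.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (180, 3)), rng.normal(1.5, 1.0, (20, 3))])
y = np.concatenate([np.zeros(180), np.ones(20)])

X_train, X_test, y_train, y_test = train_test_split(X, y, 0.2, rng)
X_bal, y_bal = oversample_minority(X_train, y_train, rng)
params = train_ann(X_bal, y_bal, rng)
accuracy = float(np.mean(predict(params, X_test) == y_test))
print(f"test accuracy: {accuracy:.2f}")
```

In practice a library such as scikit-learn would normally replace the hand-written split and training loop; the version above is spelled out only to make each stage visible.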
5.6 System Requirements
5.6.1 Hardware Requirements
• System : Intel i5
• Hard Disk : 500 GB
• RAM : 8 GB
5.6.2 Software Requirements
• Web Framework : Flask
• Technology : Python, HTML, CSS
• Libraries Used : Pandas, NumPy, Seaborn, Matplotlib, Scikit-Learn
• Operating System : Windows 10
5.7 Introduction to UML
Unified Modeling Language (UML) is a general-purpose modeling language. Its primary goal is to define a standard way of visualizing how a system has been designed. It closely resembles the blueprints used in other engineering disciplines. UML is a visual language rather than a programming language. UML diagrams are used to show a system's structure and behavior. UML facilitates modeling, design, and analysis for system architects, software engineers, and businesspeople. UML was standardized by the Object Management Group (OMG) in 1997, and OMG has managed it ever since. In 2005, UML was published as an approved standard by the International Organization for Standardization (ISO). UML has been revised over the years and is reviewed periodically.
Features of UML
The features of UML are as follows:
It is a generalized modeling language.
It is distinct from programming languages such as Python and C++.
It is related to object-oriented analysis and design.
It is used to visualize the workflow of the system.
It is a visual language that produces powerful modeling artifacts.
5.7.1 Class Diagram
The class diagram is probably the most widely used UML diagram. Class diagrams are a fundamental building block of object-oriented design. A class diagram shows the classes in the system, together with their attributes and their relationships with one another. In modeling tools, a class is usually drawn with three sections: the name at the top, the attributes in the middle, and the operations or methods at the bottom.
Figure 5.3: Class Diagram
5.7.2 Object Diagram
Object diagrams, sometimes referred to as instance diagrams, are very similar to class diagrams. Like class diagrams, they show the relationships between objects, but they use real-world examples. They are also used to show what a system looks like at a specific point in time.
Figure 5.4: Object Diagram
5.7.3 Sequence Diagram
UML sequence diagrams show the interactions between objects and the order in which those interactions occur. Note that they illustrate the interactions for a particular scenario. Processes are displayed vertically, and interactions are represented by arrows.
Figure 5.5: Sequence Diagram
5.7.4 Use Case Diagram
Use case diagrams, the most well-known kind of UML behavioral diagram, give a visual summary of the actors in a system, the functions they need, and the ways in which these functions interact. This makes it simple to identify the main actors and system processes, which makes the use case diagram an excellent starting point for any project discussion.
Figure 5.6: Use Case Diagram
5.7.5 Activity Diagram
Activity diagrams provide a graphical representation of workflows. They can be used to describe either a business workflow or the operational workflow of an individual system component. Activity diagrams are sometimes used as an alternative to state machine diagrams.
Figure 5.7: Activity Diagram
5.7.6 Deployment Diagram
A deployment diagram shows the deployment view of a system. It is related to the component diagram, because deployment diagrams show how the components are deployed. A deployment diagram consists of nodes, where a node is simply the physical hardware on which the application is installed.
Figure 5.8: Deployment Diagram
CHAPTER 6
SYSTEM IMPLEMENTATION
Modules
1. Features Selection
2. Hardware Procedure
3. Model Implementation
4. Result Analysis
Module Description
6.1 Features Selection
Feature selection for water quality prediction is the process of identifying the factors that most affect water quality and contribute substantially to predictive models. Relevant characteristics are first selected based on domain expertise and data availability, such as physical factors (e.g., pH, turbidity) and chemical parameters (e.g., pollutant concentrations). The significance of each parameter in relation to the target variable (water quality) is then evaluated using statistical tests, correlations, and exploratory data analysis. The feature set is further refined by methods such as dimensionality reduction and feature selection algorithms, which prioritize the most informative attributes while reducing noise and redundancy. Domain specialists are essential for validating the chosen features and confirming that they match parameters known to affect water quality.
Table 6.1: List of features required for water quality
PH Value
Hardness
Solids
Sulfate
Chloramines
Conductivity
Trihalomethanes
Organic Carbon
Turbidity
Potability
Iterative refinement and model evaluation are crucial to maximizing interpretability and predictive performance throughout the process. By methodically choosing pertinent features, predictive models can more accurately represent the intricate links between environmental variables and water quality.
This allows for more precise monitoring and management techniques.
6.2 Hardware Procedure
To ensure accuracy and consistency, standard operating procedures are usually followed when gathering laboratory data for characteristics such as pH value, hardness, electrical conductivity, sulfates, and chloramines. The procedure is as follows:
Measurement equipment, including pH meters, conductivity meters, and spectrophotometers, should be calibrated using standardized calibration solutions. To avoid interference with measurements, make sure that all glassware and equipment are clean and free of contaminants.
pH Measurement: Rinse the electrode in distilled water and carefully blot the excess water off with a lint-free tissue. Submerge the electrode in the water sample and allow it to stabilize. Once the reading is stable, note the pH value shown on the pH meter.
Hardness Measurement: First filter the water sample to remove any suspended particles. Titrate the sample with a standardized EDTA solution, which complexes the calcium and magnesium ions that give the sample its hardness. Use a color indicator (such as Eriochrome Black T) to determine the titration's endpoint, which represents the total hardness of the sampled water.
Measurement of Electrical Conductivity: Make sure the conductivity cell is clean and free of deposits. After submerging the cell in the water sample, let it come to equilibrium. Using a conductivity meter, determine the sample's electrical conductivity and note the result.
Measurement of Sulfates: First prepare the water sample by filtering it to remove any suspended materials. After reacting the sample with a reagent such as barium chloride, measure the absorbance of the sample at a specific wavelength (420 nm) using a spectrophotometer. To find the sulfate concentration, compare the sample's absorbance to a standard curve or known concentrations.
Measurement of Chloramines: Make sure there is no residual chlorine in the water sample before proceeding. Add a reagent specific to chloramines (N,N-diethyl-p-phenylenediamine, for example) to create a colored complex in the sample. Measure the sample's absorbance at a particular wavelength (such as 550 nm) with a spectrophotometer, and calculate the chloramine content by comparing the result to a standard curve or known concentrations.
Quality Assurance: Take measurements of each parameter in duplicate or triplicate to guarantee accuracy and repeatability. Take part in proficiency testing programs or compare the results to reference standards to verify the accuracy of the data.
Data Recording and Analysis: Keep track of all measurement values as well as pertinent metadata, such as sample number, time, and date. Use statistical techniques or tools to analyze the data in order to find patterns, abnormalities, or connections between various factors.
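Comparing an absorbance reading to a standard curve, as in the sulfate and chloramine steps, amounts to fitting a line through the readings of known standards and inverting it for the sample. The following is a minimal sketch of that calculation, assuming a linear relationship between absorbance and concentration; the standard concentrations, absorbance values, and function names are hypothetical and are not measurements from this project.

```python
from statistics import mean

def fit_standard_curve(concentrations, absorbances):
    """Least-squares line: absorbance = slope * concentration + intercept."""
    mean_c, mean_a = mean(concentrations), mean(absorbances)
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (a - mean_a)
              for c, a in zip(concentrations, absorbances))
    slope = sxy / sxx
    return slope, mean_a - slope * mean_c

def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate the sample concentration."""
    return (absorbance - intercept) / slope

# Hypothetical standards (mg/L) and their absorbance readings, for illustration only.
standards_mg_per_l = [0.0, 10.0, 20.0, 30.0, 40.0]
standard_absorbance = [0.00, 0.10, 0.20, 0.30, 0.40]
slope, intercept = fit_standard_curve(standards_mg_per_l, standard_absorbance)

# Triplicate readings of one sample are averaged before conversion,
# following the quality assurance step above.
triplicate_absorbance = [0.24, 0.25, 0.26]
sample_mg_per_l = concentration_from_absorbance(
    mean(triplicate_absorbance), slope, intercept)
print(f"estimated concentration: {sample_mg_per_l:.1f} mg/L")
```

A real standard curve should be checked for linearity over the working range before it is used this way.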
Figure 6.1: Students analyzing water samples to ensure its quality

Samples                    PH Values   Electric Conductivity (S /cm)   Solids(TDS) (ppm or mg/L)   Hardness (mg/L)
Nandyal Drinking Water     7.6         2.57                            83                          45
Panyam Tap Water           6.8         652.2                           441                         86
Kurnool Tap Water          7.9         508.7                           900                         148
RGM Boys Hostel            8.3         3.40                            29                          19
Nerawada Drinking Water    7           1.32                            34                          32
Industrial Waste water     5.5         1231.8                          1204                        334
RGM Sewage Water           4.5         1125.3                          948                         254

Table 6.1: Recorded the values of various samples of water
6.3 Model Implementation
6.3.1 Overview of tools used for the project:
Systems implementation is the process of specifying how an information system should be built (i.e., its physical design), ensuring that the system is operational and usable, and ensuring that it meets quality standards (i.e., quality assurance).
6.3.2 Python
Python is a high-level, interpreted, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation. Python is dynamically typed and garbage collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented, and functional programming. It is often described as a "batteries included" language due to its comprehensive standard library.
History
Python was conceived in the late 1980s, and in December 1989 Guido van Rossum at CWI in the Netherlands began implementing it as a successor to ABC that could handle exceptions and interface with the Amoeba operating system. Van Rossum is Python's principal author, and the Python community has recognized his continuing central role in deciding the direction of Python. Python 2.0, released on October 16, 2000, added support for Unicode and a cycle-detecting garbage collector (in addition to reference counting) for memory management. The most significant change, however, was to the development process itself, which moved toward a more open and community-supported approach. After extensive testing, Python 3.0, a major, backwards-incompatible release, was made available on December 3, 2008. Many of its major features were also backported to the backwards-compatible Python 2.6 and 2.7, both of which are no longer supported.
Features of Python
Easy to Learn, Read, and Write: Python is a high-level programming language with syntax similar to that of English. This makes the code easier to read and understand. Because it is easy to learn, many people recommend Python to beginners.
Improved Productivity: Python is a very productive language. Its simplicity lets developers concentrate on solving the problem, without spending much time on understanding the syntax or behavior of the language.
Interpreted Language: Python is an interpreted language, which means the code is executed line by line. If an error occurs, Python stops further execution and reports the error, which makes debugging easier.
Dynamically Typed: Python does not know the type of a variable until the code is run. The data type is assigned automatically during execution, so the programmer does not need to declare variables and their data types.
Open Source: Python is available under an open-source license approved by the OSI. It is therefore free to use and distribute. You can download the source code, modify it, and even distribute your own version of Python. This is useful for organizations that want to change a specific behavior and use their own version for development.
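A few lines are enough to illustrate the dynamic typing and error reporting described above. This is only a small sketch; the variable and function names are chosen for illustration.

```python
reading = 7.6                          # the name is bound to a float, no declaration needed
type_before = type(reading).__name__
reading = "7.6"                        # the same name is rebound to a string at run time
type_after = type(reading).__name__

def parse_reading(text):
    """Convert text to a number, returning the interpreter's error message on failure."""
    try:
        return float(text)
    except ValueError as error:
        return f"invalid reading: {error}"

parsed = parse_reading("7.6")          # valid numeric text is converted
rejected = parse_reading("seven")      # invalid text yields the reported error
print(type_before, type_after, parsed, rejected)
```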
6.3.3 Python Libraries
NumPy:
NumPy, short for Numerical Python, is a core library for numerical computing in Python. At its heart is `numpy.ndarray`, an efficient multidimensional array data structure that holds homogeneous elements. In contrast to Python lists, NumPy arrays allow fast and efficient numerical operations because of their homogeneous structure. These arrays can represent vectors, matrices, and higher-dimensional data structures, which makes them very useful for managing datasets in science and engineering. NumPy offers a uniform data format for numerical operations, making it easier to write clear, efficient code that can handle large amounts of data.
Key Features and Functionalities of NumPy
Vectorized Operations: Enables element-wise mathematical operations over whole arrays, which improves readability and speed.
Broadcasting: Automatically aligns the shapes of arrays in arithmetic operations between arrays of different shapes, which results in short and efficient code.
Universal Functions: Offers a large set of mathematical functions, including arithmetic, trigonometric, statistical, and logical operations, that operate element-wise on arrays.
Array Indexing and Slicing: Allows efficient data extraction and manipulation within arrays through powerful indexing and slicing capabilities.
Array Iteration: Provides efficient ways to iterate over the elements of an array while keeping performance high even for sizable datasets.
Array Manipulation: Offers functions for reshaping, concatenating, splitting, and transposing arrays, enabling versatile manipulation of array structures and shapes.
Random Number Generation: Contains a robust random number generation system.
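The features listed above can be illustrated with a short sketch. The pH and hardness values are the ones recorded for the seven water samples in this report; the variable names, the neutral-pH reference of 7.0, the 6.5 threshold, and the noise parameters are illustrative assumptions.

```python
import numpy as np

ph = np.array([7.6, 6.8, 7.9, 8.3, 7.0, 5.5, 4.5])
hardness = np.array([45, 86, 148, 19, 32, 334, 254], dtype=float)

# Vectorized operation: subtract a scalar from every element at once.
deviation = ph - 7.0

# Universal function: element-wise absolute value.
abs_deviation = np.abs(deviation)

# Indexing and slicing: boolean mask and a plain slice.
acidic = ph[ph < 6.5]
first_three = ph[:3]

# Array manipulation: stack two 1-D arrays into a (7, 2) matrix, then transpose.
data = np.column_stack([ph, hardness])
transposed = data.T

# Broadcasting: the (2,) vector of column means is stretched across all 7 rows.
centered = data - data.mean(axis=0)

# Random number generation with a seeded generator.
rng = np.random.default_rng(42)
noise = rng.normal(loc=0.0, scale=0.05, size=ph.shape)

print(acidic, data.shape, transposed.shape)
```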
Integration of NumPy with the Python Environment
Interoperability: Integrates easily with well-known Python tools and libraries for scientific computing, data analysis, and visualization, such as scikit-learn, SciPy, Pandas, and Matplotlib.
Efficient Memory Handling: Because NumPy arrays are implemented in C, they benefit from efficient memory management. As a result, they can handle big datasets without consuming a lot of RAM.
Optimized Performance: Makes use of highly optimized C and Fortran routines so that numerical methods can be computed at high speed. This optimization makes operations and computations run faster.
Open Source and Community Support: NumPy benefits from being an open-source project with a thriving and active community of users and developers. This community-driven ecosystem uses forums, documentation, and collaborative development to promote ongoing innovation, support, and improvement.
Pandas
Pandas is a popular and powerful Python package for analyzing and manipulating data. It is a vital tool for data scientists, analysts, and developers because it offers simple data structures and functions for working with tabular or structured data.
Functionalities of Pandas
Data Manipulation: Pandas provides a rich set of functions and techniques for handling missing data (NaN values), reshaping data, selecting and filtering rows and columns, and merging and joining datasets.
Data Input/Output: Pandas has functions to read and write data in a variety of file formats, including CSV, Excel, SQL databases, JSON, and HTML.
Indexing and Selection: Pandas supports several indexing and selection techniques, including boolean indexing, integer-based indexing with iloc, label-based indexing with loc, and hierarchical indexing (MultiIndex).
Grouping and Aggregation: With the groupby feature, Pandas makes it easy to group data by one or more keys and apply aggregation functions (such as sum, mean, and count) to the grouped data.
Time Series Analysis: Pandas has dedicated data structures and techniques for handling time series data, including date/time indexing, resampling, time zone management, and moving window operations.
Visualization: Although it is not the main focus, Pandas integrates with Matplotlib to provide basic visualization features, enabling users to generate simple plots from DataFrame and Series objects.
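A brief sketch of several of these functionalities follows. The sample names, pH values, and hardness values come from the recorded water samples in this report; the `kind` column is a hypothetical grouping label, and one hardness value is deliberately blanked so that missing-data handling can be shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "sample": ["Nandyal Drinking Water", "Panyam Tap Water", "Kurnool Tap Water",
               "Industrial Waste water", "RGM Sewage Water"],
    "kind": ["drinking", "tap", "tap", "waste", "waste"],  # hypothetical label
    "ph": [7.6, 6.8, 7.9, 5.5, 4.5],
    # The Panyam hardness value is blanked on purpose to illustrate NaN handling.
    "hardness": [45.0, np.nan, 148.0, 334.0, 254.0],
})

# Missing data: fill the gap with the column median.
df["hardness"] = df["hardness"].fillna(df["hardness"].median())

# Label-based selection with a boolean mask (loc) and position-based selection (iloc).
acidic = df.loc[df["ph"] < 6.5, "sample"].tolist()
first_row = df.iloc[0]

# Grouping and aggregation: mean pH per kind of sample.
mean_ph = df.groupby("kind")["ph"].mean()

print(acidic)
print(mean_ph)
```

Filling with the median is only one possible strategy; dropping incomplete rows or using the mean are equally common choices.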
Integrating with the Python Ecosystem:
Interoperability: Pandas interacts easily with well-known Python libraries and tools for data analysis, scientific computing, and visualization, including NumPy, Matplotlib, SciPy, and scikit-learn.
Effective Memory Management: Pandas data structures are designed to handle large datasets well and use NumPy for numerical calculations, even though they are not as memory-efficient as NumPy arrays.
Support from the Community: Pandas has a sizable and active community of users and contributors who offer assistance, contribute to development, and produce tutorials and instructional materials for both novice and expert users.
Matplotlib
Matplotlib is a comprehensive Python visualization library for static, animated, and interactive graphics. It is especially popular among data scientists, researchers, and engineers and is frequently used for data visualization tasks.
Key Features of Matplotlib
Many Plot Types: Matplotlib supports a wide range of plot types, such as line plots, scatter plots, bar plots, histograms, and pie charts. This versatility lets users produce a wide variety of visualizations for different kinds of data.
Customization and Styling: Matplotlib's comprehensive customization features let users adjust a plot's colors, markers, line styles, labels, titles, axes, and annotations, among other properties. Plots can be tailored to the user's own needs and style preferences.
Multiple Interfaces: Matplotlib provides several interfaces for plotting, including an object-oriented interface and a procedural interface (pyplot) that resembles MATLAB. The pyplot interface makes interactive plot creation simple, while the object-oriented approach offers more control and flexibility for complex plotting tasks.
Publication-Quality Output: Matplotlib generates high-quality plots that are suitable for reports, presentations, research articles, and other publications. Plots can be saved at adjustable resolutions in a number of file formats, such as PNG, PDF, SVG, and EPS.
Interactive Features: Matplotlib supports interactive features such as panning, zooming, and cursor tracking, which let users explore and engage with their data dynamically. Its seamless integration with Jupyter Notebooks makes it possible to generate, view, and interact with plots directly inside the notebook environment. Its integration with NumPy makes it possible to prepare data for visualization and create plots directly from NumPy arrays.
6.3.4 Components
Figure: The top-level container that holds every plot element.
Axes: The individual plots or subplots within the figure where data is plotted. Each Axes object usually contains the X and Y axes, labels, grid lines, and the plot itself.
Axis: The X and Y axes of an Axes object manage the data limits, tick locations, tick labels, and axis labels.
Artist: The objects in the figure that represent graphical elements, including text, images, lines, markers, and patches.
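The four components can be seen together in a short sketch that uses Matplotlib's object-oriented interface. The data values and labels are placeholders; the non-interactive Agg backend and the in-memory buffer are assumptions made so that the sketch runs without a display and without writing a file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.artist import Artist

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Figure: the top-level container. Axes: the single plot inside it.
fig, ax = plt.subplots()

# ax.plot returns a list of Line2D objects; each one is an Artist.
(line,) = ax.plot(x, y, marker="o")
line_is_artist = isinstance(line, Artist)

# Labels and title belong to the Axes.
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
ax.set_title("Line Plot")

# Axis: the x-axis object manages tick locations.
ax.xaxis.set_ticks(x)

# Render the figure to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```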
Usage
    import matplotlib.pyplot as plt

    # Create data
    x = [1, 2, 3, 4, 5]
    y = [2, 3, 5, 7, 11]

    # Create a line plot
    plt.plot(x, y)
    plt.xlabel('X-axis')
    plt.ylabel('Y-axis')
    plt.title('Line Plot')

    # Display the plot
    plt.show()
Seaborn
Seaborn is an easy-to-use Python library, built on top of Matplotlib, for making attractive and informative statistical visualizations. It provides a range of plot styles, works with Pandas, and can be customized for effective data exploration and presentation.
sklearn, or scikit-learn
Scikit-learn is a well-known Python package for machine learning. In addition to offering a large variety of supervised and unsupervised methods, it interfaces with scientific computing libraries and offers tools for model selection and data preprocessing. Scikit-learn lets users explore machine learning effectively.
6.4 Results Analysis
For most of the drinking water parameters, the recorded values appear to fall within the typical range. The table below lists each parameter together with the common benchmark range established by agencies such as the World Health Organization (WHO), against which the recorded values can be compared.

Parameter          Range
pH Value           6.5-8.5
Hardness           60-120 mg/L
Solids             500-1000 mg/L
Sulfate            3-30 mg/L
Chloramines        Up to 4 mg/L
Conductivity       <400 µS/cm
Trihalomethanes    Up to 80 ppm
Organic Carbon     <2 mg/L
Turbidity          1-5 NTU
Potability         1 = safe, 0 = unsafe water

Table 6.4: Values of water quality metrics for safe water
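Checking a sample against these benchmark ranges can be sketched as a simple lookup. The ranges below are taken from Table 6.4 and the sample values from the recorded Kurnool Tap Water readings; the dictionary keys and function name are hypothetical, range boundaries are treated as inclusive for simplicity, and the recorded conductivity is assumed to be in the same unit as the benchmark.

```python
# Benchmark ranges from Table 6.4 (only the parameters that were also recorded).
SAFE_RANGES = {
    "ph": (6.5, 8.5),
    "hardness_mg_l": (60.0, 120.0),
    "solids_mg_l": (500.0, 1000.0),
    "conductivity": (0.0, 400.0),  # table gives "<400"; treated here as 0 to 400
}

def out_of_range(sample, ranges=SAFE_RANGES):
    """Return the parameters of a sample that fall outside their benchmark range."""
    flagged = {}
    for name, (low, high) in ranges.items():
        if name in sample and not low <= sample[name] <= high:
            flagged[name] = sample[name]
    return flagged

# Recorded values for Kurnool Tap Water.
kurnool_tap_water = {
    "ph": 7.9,
    "conductivity": 508.7,
    "solids_mg_l": 900.0,
    "hardness_mg_l": 148.0,
}

flagged = out_of_range(kurnool_tap_water)
print(sorted(flagged))
```

A rule-based check like this only flags individual parameters; the ANN model, by contrast, predicts potability from all features together.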
6.5 Output Screens
Figure 6.2: Form page for giving input values.
6.6 Results Page
After the user enters the values, the model's results are displayed as shown below.
Figure 6.3: Data is entered
Figure 6.4: Predicted Results
CHAPTER 7
SYSTEM TESTING
7.1 Testcase Description
Testing is the process of detecting errors. It plays a critical role in quality assurance and in ensuring software reliability. The results of testing are also used later during maintenance.
Psychology of Testing: Testing is often assumed to aim at showing that a program works and is error-free. The primary objective of the testing phase, however, is to uncover the faults that the program may contain. Testing should therefore begin with the intent of demonstrating that a program does not work rather than that it does. Testing is the act of running a program with the goal of finding errors.
Testing Objectives
The primary goal of testing is to find as many errors as possible, systematically, and with the least amount of time and effort. Stated formally:
Testing is the process of running a program in order to find errors.
A successful test is one that uncovers an error that has not yet been discovered.
A good test case is one that has a high probability of detecting an error, if one exists.
Tests alone cannot guarantee that every existing error is detected; they indicate how closely the software conforms to quality and reliability requirements.
Levels of Testing
The notion of levels of testing is used to find the errors that arise in different phases.
System Testing: The goal of testing is to identify errors, and test cases are created for this purpose. Code testing is one method used for system testing.
Code Testing: This strategy examines the logic of the program. To apply it, we developed test data that caused every instruction in the program and its modules to be executed, so that every path was tested. Systems are neither developed as monolithic wholes nor tested as single units.
7.2 Types of Testing
Unit Testing
Water Quality Prediction Using ANN Dept. of CSE, RGMCET Page 38 Unit testing concentrates the verification effort on the
smallest software unit, i.e., the smallest unit of software. H.Module. Using the detailed design and process specifications, tests
are performed to detect errors within the module boundaries. All modules must successfully pass unit tests before integration
testing begins. Unit tests are initially run independently for each module to identify errors. Link Testing Link Test does not
test the software, but rather the integration of each module in the system. The main concern is the compatibility of each
module. Programmers test where modules are designed using different parameters, lengths, types, etc. Integration Testing
33
Integration testing follows unit testing. Its goal is to verify that the modules are combined correctly, with emphasis on the interfaces between them. Because it exercises module interactions, this activity can be regarded as testing the design. In this project, the integration of all modules forms the main system. When all modules are integrated, different combinations of inputs are supplied to verify that integration has not affected the functionality of any service that ran correctly on its own before integration.
System Testing
Here the entire software system is tested. The reference documents for this process are the requirements document and the project goals, and the aim is to determine whether the software meets its requirements.
Acceptance Testing
Acceptance testing is performed with realistic data from the customer to demonstrate that the software works satisfactorily. The tests focus on the external behavior of the system; the internal logic of the program is not emphasized. Test cases are selected so that each one exercises as many equivalence classes as possible.
The testing phase is an important part of software development. It is the process of finding errors and missing operations, together with a thorough review of whether the objectives are being achieved and the user needs are being met.
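The selection of test cases by equivalence class can be sketched with Python's built-in unittest module. This is only an illustration: the function classify_ph and the 6.5 to 8.5 "acceptable" band are hypothetical stand-ins and are not taken from the project's modules or thresholds. One representative input is chosen for each class of valid readings, the class boundaries are checked, and the invalid class is expected to be rejected.

```python
import unittest


def classify_ph(ph: float) -> str:
    """Hypothetical unit under test: label a pH reading.

    The 6.5-8.5 'acceptable' band is an illustrative assumption,
    not a threshold taken from the project.
    """
    if not 0.0 <= ph <= 14.0:
        raise ValueError(f"pH out of physical range: {ph}")
    if ph < 6.5:
        return "acidic"
    if ph > 8.5:
        return "alkaline"
    return "acceptable"


class ClassifyPhTest(unittest.TestCase):
    def test_one_representative_per_valid_class(self):
        # One input stands in for every value in its equivalence class.
        cases = {3.0: "acidic", 7.0: "acceptable", 11.0: "alkaline"}
        for ph, expected in cases.items():
            with self.subTest(ph=ph):
                self.assertEqual(classify_ph(ph), expected)

    def test_class_boundaries(self):
        # Boundary values are where off-by-one mistakes usually hide.
        self.assertEqual(classify_ph(6.5), "acceptable")
        self.assertEqual(classify_ph(8.5), "acceptable")

    def test_invalid_class_is_rejected(self):
        for ph in (-1.0, 15.0):
            with self.subTest(ph=ph):
                with self.assertRaises(ValueError):
                    classify_ph(ph)


# Run the unit tests for this module in isolation, as unit testing requires.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClassifyPhTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because each module gets its own test case class, a module can be shown to pass its unit tests before it is combined with others for integration testing.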
White Box Testing
This is a unit testing method in which one unit is taken at a time and tested thoroughly at the statement level to find as many errors as possible. We tested each section of the code step by step to ensure that every statement was executed at least once. White box testing is also called glass box testing.
Black Box Testing
This method treats a module as a single unit and checks its interface and its communication with other modules rather than examining it at the statement level. The module is treated as a black box that takes some inputs and produces an output. The output for a given combination of inputs is passed on to other modules.
Figure 7.1: Testing Cycle
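The difference between the two methods can be illustrated with a short Python sketch. The function turbidity_flag and its 5 NTU limit are hypothetical and serve only as an example unit; the helper executed_lines is likewise an assumption of this sketch, built on the standard sys.settrace hook. The black box checks look only at inputs and outputs, while the white box check records which statements actually ran, so that inputs can be added until every statement has executed at least once.

```python
import sys


def turbidity_flag(ntu: float) -> str:
    """Hypothetical unit; the 5 NTU limit is an illustrative assumption."""
    if ntu < 0:
        return "invalid"
    if ntu <= 5:
        return "clear"
    return "turbid"


def executed_lines(func, inputs):
    """Return the line numbers of `func` that run for the given inputs."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function under test.
        if frame.f_code is code and event == "line":
            seen.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        for value in inputs:
            func(value)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return seen


# Black box: only the interface is exercised; the body is never inspected.
assert turbidity_flag(1.0) == "clear"
assert turbidity_flag(10.0) == "turbid"
assert turbidity_flag(-1.0) == "invalid"

# White box: one input leaves some statements unexecuted, so more inputs
# are added until every branch of the unit has run at least once.
partial = executed_lines(turbidity_flag, [1.0])
full = executed_lines(turbidity_flag, [-1.0, 1.0, 10.0])
print("lines reached only by the extra inputs:", sorted(full - partial))
```

In practice a coverage tool such as coverage.py automates the statement tracking shown here; the manual tracer is used only to keep the sketch self-contained.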
CHAPTER 8 CONCLUSION
Environmental sustainability and public health depend on water supplies being suitable for a range of uses. To assess the suitability of water for a given application, a number of parameters are compared against predetermined standards. In addition to lowering the risk of water-borne illnesses, high water quality promotes general well-being and sustainable growth. Artificial neural networks (ANNs) have transformed the assessment of water quality by greatly increasing model accuracy. By learning from large datasets, ANNs can predict with high accuracy whether water meets the desired quality standards. Furthermore, these models help clarify the underlying causes of water contamination, supporting well-informed decisions and targeted treatment. Beyond categorizing water as pure or contaminated, more sophisticated models offer insight into why polluted water may still be suitable for specific uses. Incorporating such methods into water quality assessment projects improves prediction efficiency and accuracy and helps protect water resources for current and future generations. By combining technology, science, and well-informed decision-making, we can work toward ensuring that everyone has access to clean, safe water, which benefits health, well-being, and environmental sustainability.
CHAPTER 9 FUTURE SCOPE
Water is one of the most important resources for survival, and water quality prediction determines whether it is fit for use. Deep learning techniques for water quality prediction are expected to bring improvements in accuracy, adaptability, real-time monitoring, data integration, and user accessibility. Deep learning algorithms increase prediction accuracy as they refine their models on large datasets. Real-time monitoring systems that use deep learning can identify irregularities quickly, which makes prompt action possible. Integrating multimodal data increases robustness. Adaptive models adjust to changing conditions and so remain reliable over time. User-friendly decision support systems give stakeholders useful insights and encourage effective water quality management. Deep learning therefore holds considerable promise for improving water resource management and solving complex problems.