A Mini Project
On
WATER QUALITY ANALYSIS
Submitted in partial fulfilment of the requirements
for the award of the degree of Bachelor of Technology
In
Computer Science and Engineering
By
Prachi Singh
2104920109002
Himani Sharma
2004920100025
Shubhi Tomar
2004921530005
Saif Ur Rehman
2104920109003
Under the guidance of
Prof. Dr. Sanjay Kumar
Head of the Dept. of Computer Science
Department of Computer Science and Engineering
KCC INSTITUTE OF TECHNOLOGY AND MANAGEMENT

(Affiliated to Dr. A.P.J. Abdul Kalam Technical University, Approved by AICTE)
Knowledge Park-III, Greater Noida, Uttar Pradesh 201306
This is to certify that the project entitled “WATER QUALITY ANALYSIS”, submitted by
Himani Sharma (Hall Ticket number 2004920100025), Shubhi Tomar (Hall Ticket number
2004921530005), Saif Ur Rehman (Hall Ticket number 2104920109003) and Prachi Singh
(Hall Ticket number 2104920109002) in partial fulfilment of the requirements for the award
of the degree of Bachelor of Technology in Computer Science and Engineering to KCC Group
of Institutions, is a record of bonafide work carried out by them under my guidance and
supervision from September 2022 to December 2022.
The results presented in this project have been verified and found to be
satisfactory. The results embodied in this project report have not
been submitted to any other University for the award of any other
degree or diploma.
Internal Guide External Examiner

Prof. Dr. Sanjay Kumar
(Professor, Dept of CSE)
ACKNOWLEDGEMENT
It is our privilege and pleasure to express a profound sense of respect, gratitude and
indebtedness to our guide, Prof. Dr. Sanjay Kumar, Assistant Professor, Dept. of Computer
Science and Engineering, KCC Group of Institutions, for his indefatigable inspiration,
guidance, cogent discussion, constructive criticism, and encouragement throughout this
dissertation work.
We express our sincere gratitude to Dr Sanjay, Associate Professor & Head, Department of
Computer Science and Engineering, KCC Group of Institutions, for his suggestions,
motivations, and co-operation for the successful completion of the work.
We extend our sincere thanks to Mr. Deepak Gupta, Chairman, KCC Group of
Institutions for his encouragement.
HIMANI SHARMA – 2004920100025
PRACHI SINGH – 2104920109002
SAIF UR REHMAN – 2004920109003
SHUBHI TOMAR - 2004921530005
DECLARATION
We hereby declare that the project work entitled “WATER QUALITY ANALYSIS”, submitted to
the KCC Group of Institutions in partial fulfilment of the requirements for the award of the
degree of Bachelor of Technology (B.Tech) in Computer Science and Engineering, is a record
of original work done by us under the guidance of Prof. Dr. Sanjay Kumar, Assistant
Professor, and that this project work has not been submitted to any other university for the
award of any other degree or diploma.
HIMANI SHARMA – 2004920100025
PRACHI SINGH – 2104920109002
SAIF UR REHMAN – 2004920109003
SHUBHI TOMAR - 2004921530005
ABSTRACT
The major goal of this project is to use machine learning techniques to assess water quality.
Potability is a numerical indicator used to assess the quality of a body of water. The
following water quality parameters were used to assess overall water quality in terms of
potability: pH, Hardness, Solids, Chloramines, Sulfate, Conductivity, Organic Carbon,
Trihalomethanes and Turbidity. These parameters form the feature vector that describes the
water quality. To estimate the water quality class, the project used two classification
algorithms: Decision Tree (DT) and K-Nearest Neighbour (KNN). Experiments were carried out
using a real dataset containing information from various locations around Andhra Pradesh, as
well as a synthetic dataset generated at random from the parameters. Comparing the results
of the two classifiers showed that the KNN classifier outperforms the Decision Tree
classifier. According to the findings, machine learning approaches are capable of accurately
predicting potability.
Index terms: Potability, Water Quality Parameters, Data Mining, Classification.
Keywords: Machine Learning, Supervised Learning, K-Nearest Neighbour (KNN), Decision Tree,
Hyperparameter Tuning, Python Programming.
CONTENT
1. Introduction
1.1. Motivation
1.2. Problem Definition
1.3. Objective of the Project
2. Literature Survey
3. Analysis
3.1. Existing System
3.2. Proposed System
3.3. Software Requirement Specification
4. Design
4.1. UML Diagrams
5. Implementation
5.1. Modules
5.2. Sample Code
6. Results
7. Screenshots
8. Conclusion
9. Future Enhancement
10. Bibliography
1. Introduction
Water quality analysis is a complex topic because many different factors influence it. The
concept is inextricably linked to the purposes for which water is used, and different uses
call for different standards. Water quality prediction is an active area of research.
Water quality is normally determined by a set of physical and chemical parameters that are
closely related to the water's intended use. Acceptable and unacceptable values must then be
established for each variable. Water that meets the predetermined limits for a specific
application is considered suitable for that application; water that does not must be treated
before it can be used. Because water quality depends on so many physical and chemical
properties, studying each variable independently is not a practical way to describe water
quality accurately on a spatial or temporal basis. The alternative, which is more
challenging, is to combine the values of a group of physical and chemical variables into a
single value, an index. For each variable, such an index includes a quality value function
(usually linear) that expresses the equivalence between the variable and its quality level.
These functions are created using direct measurements of a substance's concentration or the
value of a physical variable obtained from water sample studies. The major goal of this
research is to examine how machine learning algorithms may be used to predict water quality.
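The idea of combining several variables into one value can be sketched as follows. This is a
minimal illustration and not the index used in this project: the ranges, the weights, and the
function names linear_quality and quality_index are hypothetical, chosen only to show a
linear quality value function per variable followed by a weighted average.

```python
# Hypothetical ranges and weights, for illustration only (not from this report).
RANGES = {  # variable: (ideal value, worst acceptable value)
    "ph_deviation": (0.0, 1.5),   # absolute distance of pH from 7
    "turbidity": (0.0, 5.0),
    "hardness": (0.0, 300.0),
}
WEIGHTS = {"ph_deviation": 0.4, "turbidity": 0.4, "hardness": 0.2}


def linear_quality(value, ideal, worst):
    """Map a measurement to a 0-100 quality score with a linear function."""
    fraction = (value - ideal) / (worst - ideal)
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the range [0, 1]
    return 100.0 * (1.0 - fraction)


def quality_index(sample):
    """Weighted average of the per-variable quality scores."""
    total_weight = sum(WEIGHTS.values())
    score = sum(
        WEIGHTS[name] * linear_quality(sample[name], *RANGES[name])
        for name in WEIGHTS
    )
    return score / total_weight


sample = {"ph_deviation": 0.0, "turbidity": 2.5, "hardness": 150.0}
print(round(quality_index(sample), 1))
```

A single index of this kind is easy to report, but it hides which variable caused a low
score, which is one reason a classifier over the full feature vector is explored instead.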
1.1 Motivation
Machine learning algorithms have proven themselves to be a versatile tool for many types of
tasks, offering advanced possibilities for working with data, including data imputation,
unsupervised clustering, classification and regression. They are commonly used in many
research areas but are still less common among environmental engineers, even though such
tools may provide an extremely efficient alternative to traditional analytical approaches
(Wilcox, Woon and Aung 2013).
1.2 Problem Definition

Water pollution is becoming one of the most severe human concerns affecting water quality.
Various human activities render water unsafe for drinking and domestic use. The primary
causes of water pollution are chemical fertilizers and pesticides, untreated wastewater, and
industrial effluents that enter the rivers and streams running near cities and lowlands.
Polluted water spreads waterborne illnesses, some of which are severe.
The issues that this study intends to address are outlined below:
• misconceptions about the WHO guidelines on drinking water parameters.
• the lengthy clinical process of predicting whether water is drinkable.
• the limited use of machine learning for water quality prediction.
• the level of awareness of key water quality factors among rural people.
1.3 Objective of the Project

The purpose of the research behind this thesis was to present examples of how such advanced
tools may be used on a particular data set meant for improving water quality in the European
region. The following chapters present machine learning, its origins and its possibilities in
general; explain the data and models used during the research; report the results of applying
the algorithms; discuss the obstacles one can face while working with this kind of model; and
conclude by summarising the presented material, giving advice to engineers and scientists
who would like to use these models for their environmental tasks, and commenting on the
possible future development of these tools in the environmental field.
2. Literature Survey
This paper reviews the role of uncertainty in the identification of mathematical models of
water quality and in the application of these models to problems of prediction. More
specifically, four problem areas are examined in detail: uncertainty about model structure,
uncertainty in the estimated model parameter values, the propagation of prediction errors, and
the design of experiments to reduce the critical uncertainties associated with a model. The
review is rather lengthy, and it has therefore been prepared in effect as two papers. There is a
shorter, largely nontechnical version, which gives a quick impression of the current and
future issues in the analysis of uncertainty in water quality modeling. Enclosed by this shorter
discussion is the main body of the review dealing in turn with (1) identifiability and
experimental design, (2) the generation of preliminary model hypotheses under conditions of
sparse, grossly uncertain field data, (3) the selection and evaluation of model structure, (4)
parameter estimation (model calibration), (5) checks and balances on the identified model,
i.e., model “verification” and model discrimination, and (6) prediction error propagation.
Much time is spent in discussing the algorithms of system identification, in particular the methods of
recursive estimation, and in relating these algorithms and the subject of identification to the
problems of prediction uncertainty and first-order error analysis. There are two obvious
omissions from the review. It is not concerned primarily with either the development and
solution of stochastic differential equations or the issue of decision making under uncertainty,
although clearly some reference must be made to these topics. In brief, the review concludes
(not surprisingly) that much work has been done on the analysis of uncertainty in the
development of mathematical models of water quality, and much remains to be done. A lack
of model identifiability has been an outstanding difficulty in the interpretation and
explanation of past observed system behavior, and there is ample evidence to show that the
“larger,” more “comprehensive” models are easily capable of generating highly uncertain
predictions of future behavior. For the future of the subject, it is speculated that there is the
possibility of progress in the development of novel algorithms for model structure
identification, a need for new questions to be posed in the problem of prediction, and a
distinct challenge to the conventional views of this review in the new forms of knowledge
representation and manipulation now emerging from the field of artificial intelligence.
3. Analysis
3.1 Existing System

A hybrid decision tree-based machine learning model was proposed to predict water quality
from 1875 data samples. Six water quality parameters were used in the evaluation. Extreme
gradient boosting (XGBoost) and random forest (RF) algorithms were combined with complete
ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and applied alongside
six other algorithms. First, the raw data was collected. After CEEMDAN decomposition, the
XGBoost and RF algorithms were applied to the decomposed data. Once training was complete,
the model reported the water quality along with the prediction error.
A machine learning model was proposed with RF, Decision Tree (DT) and Deep Cascade Forest
(DCF). The first step of the prediction model was data processing, in which the data samples
were divided into suitable and unsuitable classes. The system then calculated the water
quality parameters for irrigation, and water quality was predicted on six levels. Data was
collected from the Bouregreg watershed (9000 km²) in the middle of Morocco and divided into
75 percent for training and 25 percent for testing. In the data normalization and model
building unit, the system predicted the water quality from the split data.
Another author presented a data intelligence model for water quality index prediction.
Support vector regression (SVR), adaptive neuro-fuzzy inference system (ANFIS), back
propagation neural network (BPNN) and multilinear regression (MLR) algorithms were applied
for prediction. The author collected the data from the Jumna, the major tributary of the
Ganga River. The length of the river is 1400 km.
A hybrid machine learning approach was suggested for water quality prediction. RF, reduced
error pruning tree (REPT), and twelve other algorithms were applied to analyse the water
quality. The author divided the methodology into two sections: data collection and data
preparation. Eleven water quality indicators were used to identify the water quality. For
model evaluation, the author used the coefficient of determination (R²), mean absolute error
(MAE), root-mean-square deviation, percentage of bias (PBIAS), percent of relative error
index (PREI), and Nash-Sutcliffe efficiency (NSE) to measure the performance of the
different algorithms.
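Several of the evaluation measures named above can be computed as in the following sketch. It
is an illustration under stated assumptions, not the cited author's code: the function name
regression_metrics and the sample values are hypothetical, R² is taken as the squared Pearson
correlation, and PBIAS uses the convention 100 × Σ(observed − predicted) / Σ(observed), whose
sign differs between publications. PREI is omitted because its definition is not given here.

```python
import numpy as np


def regression_metrics(observed, predicted):
    """Return MAE, RMSE, PBIAS, NSE and R2 for two equal-length sequences."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    residual = o - p
    ss_res = np.sum(residual ** 2)          # sum of squared errors
    ss_tot = np.sum((o - o.mean()) ** 2)    # spread of the observations
    return {
        "MAE": np.mean(np.abs(residual)),
        "RMSE": np.sqrt(np.mean(residual ** 2)),
        "PBIAS": 100.0 * np.sum(residual) / np.sum(o),
        "NSE": 1.0 - ss_res / ss_tot,
        "R2": np.corrcoef(o, p)[0, 1] ** 2,
    }


# Hypothetical values, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

An NSE of 1 indicates a perfect match, while values at or below 0 mean the model is no better
than predicting the mean of the observations.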
3.2 Proposed System

The proposed system is intended to determine potability. It is divided into two phases, one
for training and one for testing, and the same procedures are carried out in both. Both the
training data and the testing data contain the following attributes: pH, hardness, solids,
chloramines, sulphate, conductivity, organic carbon, trihalomethanes, turbidity, and
potability. Selecting the water quality data set is a prerequisite to model construction. It
involves collecting the essential parameters that affect water quality, identifying the
number of data samples, and defining the class label for each data sample. Ten parameters
make up the data set used in this study: the nine indicators listed above and potability,
which serves as the class label.
The proposed approach, however, is not constrained by the number or the selection of
parameters. A k-fold cross-validation technique is employed to set up the learning and
testing framework in this study. With this technique, the data set is separated into k
disjoint sets of equal size, each with roughly the same class distribution. Each subset is
used as the test set in turn, with the remaining subsets serving as the training set. Two
classification methods are used: Decision Tree (DT) and K-Nearest Neighbour (KNN). Each
method takes a different approach to the underlying relational structure between the
indicator parameters and the class label, so their performance on the same data set is
likely to differ.
Data mining provides several metrics for validating the performance of different classifiers
on an unknown data set. A repeated cross-validation procedure in the caret package for R was
used to create the learning and testing environment. The following procedure was used to
apply the classification algorithms:
1. The data set was split into two parts: training (80%) and testing (20%).
2. The training set was subjected to repeated cross-validation with a fixed number of
iterations, and the classifiers were trained in this manner.
3. The model's optimal parameter configuration, the one resulting in the maximum accuracy,
was selected.
4. The model was evaluated.
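The four steps above can be sketched in Python with scikit-learn. This is a hedged
illustration rather than the procedure actually run for this report: the data is synthetic
(make_classification stands in for the water data set), and the fold count, the number of
repeats and the parameter grids are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     train_test_split)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the water data: 9 features and a binary label.
X, y = make_classification(n_samples=400, n_features=9, n_informative=5,
                           random_state=0)

# Step 1: 80% training, 20% testing, keeping the class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Step 2: repeated stratified k-fold cross-validation on the training set only.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)

# Assumed parameter grids; KNN is distance based, so its inputs are scaled.
candidates = {
    "DT": (DecisionTreeClassifier(random_state=0),
           {"max_depth": [3, 5, None]}),
    "KNN": (make_pipeline(StandardScaler(), KNeighborsClassifier()),
            {"kneighborsclassifier__n_neighbors": [3, 5, 9]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Step 3: keep the configuration with the highest cross-validated accuracy.
    search = GridSearchCV(estimator, grid, cv=cv, scoring="accuracy")
    search.fit(X_train, y_train)
    # Step 4: evaluate the refitted best model on the held-out test set.
    results[name] = (search.best_params_, search.score(X_test, y_test))
    print(name, results[name])
```

Keeping the test set out of the cross-validation loop means the reported test accuracy is not
influenced by the parameter search.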
3.3 Software Requirement Specification

Machine learning algorithms for classification and regression improve steadily and produce
better results over time. The most often used classification algorithms are ANN, CNN, DNN,
DT and RF [5]. Using factors such as pH, conductivity and hardness, the proposed model
predicts whether the water is safe to drink. Several methods that use activation functions
are applied in data processing and learning. RF, SVM, ANN, DNN and Gaussian Naïve Bayes are
the suggested prediction algorithms in this proposed work.
4. Design
4.1 UML Diagrams

5. Implementation
5.1 Modules
To estimate the river water quality class, two data mining methods were used: Decision Tree
(DT) and K-Nearest Neighbour (KNN). Classifiers can be parametric or nonparametric; in both
cases the goal is to learn, from a training data set, a function that maps input variables
to output variables. Because the form of this function is unknown, different algorithms make
different assumptions about it and about how the training data is used to produce the
output. A parametric classifier makes stronger assumptions about the data. If the
assumptions hold for a data set, the classifier makes correct judgments; if they do not, the
same classifier performs poorly. Such classifiers do not rely on the size of the sample data
set to learn a classification task; they rely on their assumptions. As a consequence, they
are susceptible to prediction errors such as bias. When the model makes many assumptions,
the Decision Tree yields substantial bias.
Nonparametric classifiers, unlike parametric classifiers, do not make any assumptions about
the form of the mapping function, which tends to make them more accurate. These classifiers
can fit any function from the training data set. The DT and KNN classifiers belong to this
category. DT relies on learning techniques, whereas KNN relies on the similarity principle.
Small data sets with complete domain expertise, on the other hand, are equally advantageous
for these classifiers. Instead of learning from data, the KNN classifier finds the group of
k items in the training set that are most similar to the test object. Unlike other
classifiers, KNN does not rely on domain expertise: to make classification decisions, it
simply calculates the distance between the attributes of two samples. Because each algorithm
works differently, the two must be compared to determine which one better approximates the
underlying function for the same training and testing water quality data sets.
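The similarity principle behind KNN can be shown with a short from-scratch sketch. It is an
illustration only, not the classifier used for the reported results: the function name
knn_predict, the two features and all sample values are hypothetical, and Euclidean distance
is an assumed choice of similarity measure.

```python
import numpy as np


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    train_X = np.asarray(train_X, dtype=float)
    train_y = np.asarray(train_y)
    # Euclidean distance from the query to every training sample.
    distances = np.linalg.norm(train_X - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]          # indices of the k closest
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]             # most frequent label


# Hypothetical samples: [pH, turbidity]; label 1 = potable, 0 = not potable.
train_X = [[7.0, 2.0], [7.2, 2.5], [6.8, 3.0],
           [4.0, 6.0], [9.5, 6.5], [3.5, 5.5]]
train_y = [1, 1, 1, 0, 0, 0]

print(knn_predict(train_X, train_y, [7.1, 2.4], k=3))
print(knn_predict(train_X, train_y, [4.0, 5.8], k=3))
```

Because the vote depends on raw distances, features measured on very different scales would
normally be standardised before KNN is applied.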
5.2 Sample Code
import os
from collections import Counter
from warnings import filterwarnings

import matplotlib.pyplot as plt
import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import plotly
import plotly.express as px
import seaborn as sns

# Input data files are available in the read-only "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will
# list all files under the input directory.
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Load the data set and inspect its columns, types and number of rows.
water_df = pd.read_csv('/content/water_potability.csv')
water_df.info()
len(water_df.axes[0])
# Count the samples in each class, ordered by label (0 first, then 1).
pot = water_df['Potability'].value_counts().sort_index()
fig = px.pie(values=pot.values, names=['Not Potable', 'Potable'], opacity=0.6,
             color_discrete_sequence=px.colors.sequential.RdBu)
fig.update_layout(
    font_family='monospace',
    title=dict(text='Samples Of Potable & Non-Potable Water', x=0.47, y=0.98,
               font=dict(color='royalblue', size=20)),
    legend=dict(x=0.37, y=-0.05, orientation='h', traceorder='reversed'),
    hoverlabel=dict(bgcolor='black'))
fig.show()
# Histogram of pH split by potability, with a marginal box plot.
# No y argument is passed: with histfunc='count' each bar is the number of
# samples in the bin.
fig = px.histogram(water_df, x='ph', color='Potability', template='plotly_white',
                   marginal='box', opacity=0.7, nbins=100,
                   barmode='group', histfunc='count',
                   width=1000, height=700)
fig.update_layout(
    font_family='Gravitas One',
    title=dict(text='pH Level Distribution Plot', x=0.5, y=0.95,
               font=dict(color='darkblue', size=20)),
    xaxis_title_text='pH Level',
    yaxis_title_text='Count',
    legend=dict(x=1, y=0.98, borderwidth=0, tracegroupgap=5),
    bargap=0.4,
)
fig.show()
# Histogram of sulfate concentration split by potability (bars show sample counts).
fig = px.histogram(water_df, x='Sulfate', color='Potability', template='plotly_white',
                   marginal='box', opacity=0.7, nbins=100,
                   color_discrete_sequence=['#51C4D3', '#4F7942'],
                   width=1000, height=700)
fig.update_layout(
    title=dict(text='Distribution Of Sulphates Plot', x=0.53, y=0.95,
               font=dict(color='#17869E', size=20)),
    xaxis_title_text='Sulfate (mg/L)',
    legend=dict(x=1, y=0.96, borderwidth=0, tracegroupgap=5),
    bargap=0.3,
)
fig.show()
# Histogram of hardness split by potability, annotated with hardness classes.
fig = px.histogram(water_df, x='Hardness', color='Potability', template='plotly_white',
                   marginal='box', opacity=0.7, nbins=100,
                   color_discrete_sequence=['#17869E', '#74C365'],
                   )
fig.add_annotation(text='<76 mg/L is considered soft',
                   x=40, y=130, showarrow=False, font_size=12)
fig.add_annotation(text='Between 76 and 150 (mg/L) is moderately hard',
                   x=113, y=130, showarrow=False, font_size=12)
fig.add_annotation(text='Between 151 and 300 (mg/L) is considered hard',
                   x=250, y=130, showarrow=False, font_size=12)
fig.add_annotation(text='>300 mg/L is considered very hard',
                   x=340, y=130, showarrow=False, font_size=12)
fig.update_layout(
    title=dict(text='Distribution of Hardness Plot', x=0.55, y=0.98,
               font=dict(color='#636363', size=24)),
    xaxis_title_text='Hardness (mg/L)',
    legend=dict(x=1, y=0.96, bordercolor='royalblue', borderwidth=0, tracegroupgap=5),
    bargap=0.3
)
fig.show()
cor=water_df.drop('Potability',axis=1).corr()
cor
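from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# NOTE: this listing does not show how the train/validation/test sets were
# built. The median imputation and the 60/20/20 split below are assumptions.
inputs = water_df.drop('Potability', axis=1)
inputs = inputs.fillna(inputs.median())
target = water_df['Potability']
train_inputs, rest_inputs, train_target, rest_target = train_test_split(
    inputs, target, test_size=0.4, random_state=0, stratify=target)
val_inputs, test_inputs, val_target, test_target = train_test_split(
    rest_inputs, rest_target, test_size=0.5, random_state=0, stratify=rest_target)

# Fit a decision tree; ccp_alpha applies cost-complexity pruning.
model = DecisionTreeClassifier(random_state=0, ccp_alpha=0.0025).fit(train_inputs, train_target)
train_pred = model.predict(train_inputs)
val_pred = model.predict(val_inputs)
test_pred = model.predict(test_inputs)
print('Training Set Accuracy : {:.4f} %'.format(accuracy_score(train_target, train_pred) * 100))
print('Validation Set Accuracy : {:.4f} %'.format(accuracy_score(val_target, val_pred) * 100))
print('Testing Set Accuracy : {:.4f} %'.format(accuracy_score(test_target, test_pred) * 100))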
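import matplotlib.pyplot as plt
from sklearn.tree import plot_tree, export_text

# Draw the top three levels of the fitted tree.
plt.figure(figsize=(80, 100))
plot_tree(model, feature_names=list(train_inputs.columns), max_depth=3, filled=True)
plt.show()

# Print the decision rules as text, down to depth 5.
tree_text = export_text(model, feature_names=list(train_inputs.columns), max_depth=5)
print(tree_text[:5000])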
6. Results
Performance measures: True Positives (TP) are cases in which the model correctly predicts the
positive class. True Negatives (TN) are cases in which the model correctly predicts the
negative class. False Positives (FP) are cases in which the model predicts the positive
class but the true class is negative. False Negatives (FN) are cases in which the model
predicts the negative class but the true class is positive. Together these four counts form
the confusion matrix, which shows how a classification algorithm performs. Accuracy is the
most basic and intuitive performance metric: the ratio of correctly predicted observations
to total observations. Accuracy = (TP + TN) / (TP + FP + FN + TN).
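The four counts and the accuracy formula can be checked with a small sketch. The labels below
are hypothetical and are not results from this project; the function names confusion_counts
and accuracy are introduced only for this illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for two equal-length label sequences."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1      # predicted positive, truly positive
        elif predicted != positive and actual != positive:
            tn += 1      # predicted negative, truly negative
        elif predicted == positive:
            fp += 1      # predicted positive, truly negative
        else:
            fn += 1      # predicted negative, truly positive
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + FP + FN + TN)."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical labels: 1 = potable, 0 = not potable.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(*counts))
```

Accuracy alone can be misleading when one class is much more common than the other, so the
individual counts are worth reporting alongside it.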
7. Screenshots
8. Conclusion
Potability determines the quality of water, which is one of the most important resources for
existence. Traditionally, testing water quality required expensive and time-consuming
laboratory analysis. This study investigated an alternative, machine learning based method
for predicting water quality using only a few simple water quality criteria. A set of
representative supervised machine learning algorithms was used for the estimation. Such a
system could detect water of bad quality before it is released for consumption and notify
the appropriate authorities. It will hopefully reduce the number of individuals who drink
low-quality water, lowering the risk of diseases like typhoid and diarrhoea. In this case, a
prescriptive analysis based on projected values would provide future capabilities to assist
decision and policy makers.
Overall, the goals defined for this research were reached, and examples of the application
of machine learning models were presented, covering most aspects of typical research in the
field of artificial intelligence for environmental science tasks. This work also reveals the
importance of consulting data scientists before monitoring begins, since data sets
unsuitable for the requested tasks are a common problem.
9. Future Enhancement
Future development of this research may proceed in several ways. Firstly, the consistent
misclassification of season values between winter and spring may be studied further with
this data set by extracting and analysing the samples that tend to be misclassified.
Secondly, the models generated during this research may be used by IT students to produce
software that helps environmental specialists analyse collected water quality data.
All in all, following technological progress and taking the best of what it offers ensures
the continuous development of the research field. The same goes for environmental sciences:
machine learning algorithms are one of the tools that can contribute a great deal to this
field and may be used to keep progress going.
10. Bibliography
1. The Environmental and Protection Agency, “Parameters of water quality,” Environ. Prot.,
p. 133, 2001.
2. Batista, G. E. A. P. A., and M. C. Monard. “A Study of K-Nearest Neighbour as an
Imputation Method.” Soft Computing Systems: Design, Management and Applications, 2002:
pp. 251–260.
3. Environmental Engineering. Technical Report, Abu Dhabi: Masdar Institute of Science and
Technology, 2013.
4. American Public Health Association. 1998. "Standard Methods for the Examination of Water
and Wastewater." 20th edition.
5. Jacobson C. 1991. "Water, Water Everywhere." 2nd Ed. Produced by Hach Company.
6. Sources and Causes of Water Pollution That Affect Our Environment. Retrieved May 7, 2022,
from https://www.conserve-energy-future.com/sources-and-causes-of-water-pollution.php
7. Srivastava, G.; Kumar, P. Water quality index with missing parameters. Int. J. Res. Eng.
Technol. 2013, 2, 609–614.
