Chronic Kidney Disease Prediction using Data Mining

Chapter 1
Data mining is one of the most promising areas of research, with the purpose of
finding useful information in voluminous data sets. It has been used in many domains,
such as image mining, opinion mining, web mining, text mining and graph mining. Its
applications include anomaly detection, financial data analysis, medical data analysis,
social network analysis and market analysis. It has become popular in health organizations
because they need analytical methods for predicting and finding unknown patterns and
information in health data, and it plays a vital role in discovering new trends in the
healthcare industry. Data mining is particularly useful in the medical field when no
evidence is available that favors a particular treatment option. The healthcare industry
generates large amounts of complex data about patients, diseases, hospitals, medical
equipment, claims, treatment costs and so on, all of which requires processing and
analysis for knowledge extraction.

Data mining offers a set of tools and techniques which, when applied to this
processed data, provide knowledge that helps healthcare professionals make appropriate
decisions and improve patient management. Patients with similar health issues can be
grouped together, and effective treatment plans can be suggested based on a patient’s
history, physical examination, diagnosis and previous treatment patterns. Chronic Kidney
Disease (CKD) has become a global health issue and is an area of concern. It is a
condition in which the kidneys become damaged and cannot filter toxic wastes from the
body. Our work focuses on detecting life-threatening diseases such as CKD using
classification algorithms, namely Naive Bayes and Artificial Neural Network (ANN).

1.1 Overview

Chronic Kidney Disease (CKD) is the progressive and irreversible destruction
of the kidneys. The kidneys are essential organs of the human body. They have several
functions, including:

• Helping maintain the balance of minerals and electrolytes in your body, such as
calcium, sodium, and potassium


Chronic Kidney Disease using Data Mining

• Playing a vital role in the production of red blood cells
• Maintaining the delicate acid-base balance of your blood
• Excreting water-soluble wastes from your body

Chronic Kidney Disease (CKD), also called chronic renal disease, is a progressive loss
of kidney function over a period of months or years. The symptoms of worsening kidney
function are not specific, and might include feeling generally unwell and experiencing a
reduced appetite. Often, chronic kidney disease is diagnosed as a result of screening
people known to be at risk of kidney problems, such as those with high blood pressure or
diabetes and those with a blood relative who has CKD. The disease may also be identified
when it leads to one of its recognized complications, such as cardiovascular disease,
anemia, pericarditis or renal osteodystrophy. CKD is a long-term form of kidney disease;
it is therefore differentiated from acute kidney disease (acute kidney injury) in that
the reduction in kidney function must be present for more than 3 months. CKD is an
internationally recognized public health problem affecting 5-10% of the world
population.

Chronic Kidney Disease is identified by a blood test for creatinine, which is a
breakdown product of muscle metabolism. Higher levels of creatinine indicate a
lower glomerular filtration rate and therefore a decreased capability of the kidneys to
excrete waste products. Creatinine levels may be normal in the early stages of CKD,
and the condition is then discovered if urinalysis (testing of a urine sample) shows
that the kidney is allowing the loss of protein or red blood cells into the urine.

1.2 Motivation
Present-day lifestyles, working environments and diets have given rise to
many diseases, one of which is Chronic Kidney Disease. CKD is widespread today and has
become a global health issue that must be detected and diagnosed in time.

The kidneys are important organs of the human body that remove toxic and unwanted
waste from the blood, which keeps the other organs functioning smoothly. CKD is a
condition that describes the loss of kidney function over time, making it difficult for
the kidneys to filter poisonous wastes from the body.

1.3 Scope of Project

Disease detection is one of the significant areas of research in the medical field. The
aim of the project is to detect Chronic Kidney Disease in a patient efficiently. The
project aims at developing an automated software package for Chronic Kidney Disease
prediction that is useful in medical settings such as hospitals and clinical laboratories.

Data mining techniques provide a set of tools for analyzing large volumes of data.
Besides helping to detect chronic disease in the human body, the approach can also
identify other symptoms that can be treated once they are diagnosed.

1.4 Proposed System

Chronic Kidney Disease (CKD) has become a global health issue and is an area of
concern. It is a condition in which the kidneys become damaged and cannot filter toxic
wastes from the body. Our work focuses on detecting life-threatening diseases such as
CKD using classification algorithms. The proposed system automates Chronic Kidney
Disease prediction using two classification techniques: Naive Bayes and an Artificial
Neural Network (ANN).

Chronic Kidney Disease has been predicted and diagnosed using two data mining
classifiers: ANN and Naive Bayes. The performance of these algorithms is compared using
the RapidMiner tool. The results obtained showed that Naive Bayes achieved higher
accuracy than ANN. In this system, some of the factors considered were age, diabetes,
blood pressure and RBC count. The work can be extended by considering other parameters
for kidney disease detection, such as food type, working environment, living conditions,
availability of clean water and environmental factors.

Initially, datasets are collected from various sources and stored in the database. The
dataset consists of 24 attributes plus a class label, and each attribute has its own
range. In the next step, preprocessing takes place, in which noisy and irrelevant data
are removed. The preprocessed data are then sent to the classification step, where they
are classified using the Naive Bayes and ANN algorithms, as shown in Fig 1.1.



[Figure: block diagram with the components Classification Algorithm (Naïve Bayes &
Neural Network), Test Data, and New Data Analysis]

Fig 1.1: Proposed System of CKD

The classified dataset of earlier patients and the test results of a new patient are
given to the model, where they are arranged in a structured manner. In the analysis
phase both are analyzed, and the result is then predicted.
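
The flow described above (preprocess, train a classifier on earlier patients, then
predict for a new patient) can be sketched in a few lines. This is a minimal
illustration and not the project's RapidMiner implementation: it assumes a Gaussian
Naive Bayes variant, two numeric features only, and made-up values; the function names
and the labels "ckd" and "notckd" are hypothetical.

```python
import math
from collections import defaultdict


def drop_incomplete(records):
    """Preprocessing: discard rows that contain a missing value (None)."""
    return [(x, y) for x, y in records if all(v is not None for v in x)]


def fit_gaussian_nb(records):
    """Estimate, per class, the prior and the mean/variance of every feature."""
    by_class = defaultdict(list)
    for features, label in records:
        by_class[label].append(features)
    total = len(records)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A tiny floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / total, means, variances)
    return model


def predict(model, features):
    """Return the class with the highest log posterior for one patient."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up illustrative rows: ((serum creatinine, hemoglobin), class label)
toy = [
    ((1.0, 15.0), "notckd"),
    ((0.9, 14.5), "notckd"),
    ((1.1, 16.0), "notckd"),
    ((4.2, 9.5), "ckd"),
    ((3.8, 10.0), "ckd"),
    ((None, 8.9), "ckd"),  # incomplete row, removed during preprocessing
    ((5.1, 9.0), "ckd"),
]
model = fit_gaussian_nb(drop_incomplete(toy))
print(predict(model, (4.5, 9.2)))
```

Nominal attributes such as hypertension would need per-category frequency estimates
instead of a mean and variance; that part is left out to keep the sketch short.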

For example, clinical data of 200 records taken from the UCI Machine Learning
Repository has been considered for analysis. After cleaning and removing missing
values, the data obtained comprise 220 records. The analysis has been implemented using
the RapidMiner tool. There are 25 attributes in the dataset. The numerical attributes
are age, blood pressure, blood glucose random, blood urea, serum creatinine, sodium,
potassium, hemoglobin, packed cell volume, WBC count and RBC count. The nominal
attributes are specific gravity, albumin, sugar, RBC, pus cell, pus cell clumps,
bacteria, hypertension, diabetes mellitus, coronary artery disease, appetite, pedal
edema, anemia and class.

The validation process helps to examine the accuracy of the fitted models and their
performance on new data. Model construction builds a model from the training data, and
the testing dataset is then used to measure its performance.
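
One common way to carry out this kind of validation is k-fold cross-validation, shown
below as a hedged sketch. The report does not state which validation scheme was used,
so k-fold is an assumption here, and the majority-class baseline is only a stand-in
classifier that makes the example runnable. All function names are hypothetical.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs that split range(n) into k folds (k <= n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(rows, labels, fit, predict, k=5):
    """Average accuracy of a classifier over k held-out folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(rows), k):
        model = fit([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(predict(model, rows[i]) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Stand-in classifier: always predicts the most frequent training label.
def fit_majority(rows, labels):
    return Counter(labels).most_common(1)[0][0]


def predict_majority(model, row):
    return model
```

Any pair of `fit` and `predict` functions with the same shape, for example a Naive
Bayes or neural network implementation, can be passed in place of the baseline so that
the classifiers are compared on identical folds.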

Chapter 2
2.1 Survey Papers

[1] Performance analysis of classification data mining techniques over heart disease
data base.
The healthcare industry collects huge amounts of healthcare data which,
unfortunately, are not “mined” to discover hidden information for effective decision
making. Hidden patterns and relationships often go unexploited. Advanced data mining
techniques can help remedy this situation.

The system is web based and user friendly, and can be used in hospitals that have a
data warehouse. The paper analyzes the performance of two classification data mining
techniques using various performance measures.

The effectiveness of the models was tested using two methods: Classification Matrix
and Lift Chart. The system can serve as a training tool to train nurses and medical
students to diagnose patients with heart disease. It can also provide decision support
to assist doctors in making better clinical decisions, or at least provide a “second
opinion.”

The system is developed using two data mining classification modeling techniques. It
extracts hidden knowledge from a historical heart disease database. The DMX query
language and functions are used to build and access the models. The models are trained
and validated against a test dataset. Classification Matrix methods are used to evaluate
the effectiveness of the models. The two models are able to extract patterns in response
to the predictable state.
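
A classification matrix (often called a confusion matrix) simply counts how the
predicted labels line up with the actual ones. The sketch below is an illustration for
a two-class problem; the function names and the toy label lists are hypothetical and do
not come from the surveyed paper.

```python
def classification_matrix(actual, predicted, positive):
    """Count true/false positives and negatives for a two-class problem."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Share of all cases that were classified correctly."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


# Made-up labels: 1 = disease present, 0 = disease absent
actual = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
matrix = classification_matrix(actual, predicted, positive=1)
print(matrix, accuracy(matrix))
```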

[2] Mining Medical Data to Identify Frequent Diseases using Apriori Algorithm.
Data mining is the process of analyzing large amounts of data from different
perspectives and summarizing it into useful information. This information can be
converted into knowledge about historical patterns and future trends. Patients from
different locations approach different hospitals; they do not converge on a single
place. Their records are maintained by the hospitals where they are treated, so
collecting information about frequently occurring diseases is not an easy job. Data
about such diseases can be gathered through association rule mining, and the Apriori
algorithm is adopted for this purpose. Details regarding the occurrence of these
diseases in a particular time period can also be mined using the Apriori algorithm.

According to the Apriori principle, any subset of a frequent itemset must also be
frequent. If {X, Y} is a frequent itemset, both {X} and {Y} must be frequent itemsets.
The key idea of the Apriori algorithm is to make multiple passes over the database. It
employs an iterative approach known as breadth-first search (level-wise search) through
the search space, where k-itemsets are used to explore (k+1)-itemsets. In the beginning,
the set of frequent 1-itemsets is found. This set, which contains the single items that
satisfy the support threshold, is denoted by L1. Each subsequent pass begins with a seed
set of itemsets found to be large in the previous pass. This seed set is used to generate
new potentially large itemsets, called candidate itemsets, and the actual support for
these candidates is counted during the pass over the data. At the end of the pass, the
candidate itemsets that are actually large (frequent) are determined, and they become
the seed for the next pass.
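
The level-wise search described above can be sketched as follows. This is a compact
illustration rather than the surveyed paper's implementation: support is taken as an
absolute count (an assumption), and the patient records are made up.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    # First pass: count single items and keep the frequent ones (L1).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 1
    while current:
        # Join: merge pairs of frequent k-itemsets into (k+1)-candidates.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k + 1:
                # Prune: every k-subset of a candidate must itself be frequent.
                if all(frozenset(s) in current for s in combinations(union, k)):
                    candidates.add(union)
        # One pass over the data to count the support of each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Made-up records: the set of diagnoses noted for each patient
records = [
    {"diabetes", "hypertension", "ckd"},
    {"diabetes", "hypertension"},
    {"hypertension", "ckd"},
    {"diabetes", "ckd", "hypertension"},
    {"anemia"},
]
result = apriori(records, min_support=3)
for itemset, support in result.items():
    print(sorted(itemset), support)
```

In this toy data the three-item candidate is pruned without being counted, because one
of its two-item subsets is not frequent; this is the Apriori principle at work.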

The proposed method is useful for identifying the frequent diseases in a large medical
dataset. The outcome of this research will help practitioners in making medical
decisions for frequently occurring diseases. The analysis is made on data from various
geographical locations during different time periods.

[3] Discovery of Significant Parameters in Kidney Dialysis Data Sets by K-Means

A few parameters are considered essential for decision making in kidney
dialysis. Parameters such as creatinine, sodium and urea play an important role. This
paper identifies the possible survival period of kidney failure patients before they
need a kidney transplantation. The survival period of a patient is identified from the
influence of the kidney dialysis parameters with the help of the K-means clustering
algorithm. The K-means algorithm works as follows. It represents each cluster by the
mean value of the objects in the cluster. It takes the input parameter k and partitions
a set of n objects into k clusters so that the resulting intra-cluster similarity is
high but the inter-cluster similarity is low. Cluster similarity is measured with regard
to the mean value of the objects in a cluster. First, it randomly selects k of the
objects, each of which initially represents a cluster mean or centre. Each of the
remaining objects is assigned to the cluster to which it is most similar, based on the
distance between the object and the cluster mean. It then computes the new mean for each
cluster. This process iterates until the criterion function converges. This clustering
procedure is applied to the parameters, and the survival period of the patient is
identified. The clustering is applied based on age and gender. Patients whose parameters
have normal values have a better survival rate than those with low or high values.
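
The K-means steps listed above can be sketched as follows. This is an illustrative
version under stated assumptions: squared Euclidean distance is used as the similarity
measure, the stopping rule is "centres no longer change", and the optional `initial`
argument is an addition of this sketch so that a run can be reproduced exactly. The
data points are made up.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(points, k, initial=None, max_iter=100, seed=0):
    """Partition points into k clusters; returns (centres, clusters)."""
    rng = random.Random(seed)
    # Pick k objects as the initial centres unless the caller supplies them.
    centres = list(initial) if initial is not None else rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: attach each object to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centres[i]))
            clusters[nearest].append(p)
        # Update step: recompute each centre as the mean of its cluster.
        new_centres = [
            mean(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
        if new_centres == centres:  # converged
            break
        centres = new_centres
    return centres, clusters


# Made-up two-dimensional measurements forming two obvious groups
points = [(1.0, 2.0), (2.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
centres, clusters = kmeans(points, k=2, initial=[(1.0, 2.0), (9.0, 8.0)])
print(centres)
```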

[4] An empirical study on prediction of heart disease using classification data mining
The paper proposes the use of pattern recognition and data mining techniques in risk
prediction models in the clinical domain of cardiovascular medicine. The data is
modeled and classified using classification data mining techniques. A limitation of
conventional medical scoring systems is that they rely on intrinsic linear combinations
of the input variables, and hence are not adept at modeling nonlinear complex
interactions in medical domains. This limitation is handled in this research by the use
of classification models, which can implicitly detect complex nonlinear relationships
between dependent and independent variables and can detect all possible interactions
between predictor variables.

The paper investigates the performance of different classification algorithms, namely
DT, NB, K-NN and NN, on a heart disease dataset. Heart disease prediction based on
patient records is useful for cardiovascular clinicians. The patient records are
classified to predict which patients have heart disease. After evaluation, it is found
that NB gives better accuracy than the other classifiers.

[5] Interactive knowledge discovery for temporal lobe epilepsy.

The aim is to optimize the rule-discovery process by giving clinicians the flexibility
to incorporate domain knowledge, in the form of desired rule formats, into the rule
search. There are many reasons why a physician might experience difficulty in
formulating an appropriate differential diagnosis. It may be that the case involves a
rare disease or an unusual presentation. Often, such difficulties arise in clinical
problems where two or more disease processes are at work, generating a complex sequence
of abnormal findings that can be interpreted in a variety of ways.

The paper also proposes a data discovery algorithm for a small data set with high
dimensionality. A support vector machine (SVM) is applied to classify the feature
vectors. Finally, particle swarm agents are used to discover the SVM classification
rules. It has been shown that this algorithm can manage the rule extraction task
efficiently. The authors develop and evaluate a new approach for interactive data
mining based on swarm intelligence. The proposed method processes external rules along
with the raw data to do reasoning. It is designed to work in low-sample,
high-dimensional feature space conditions, where the statistical power of the raw data
is not sufficient for a reliable decision.

[6] Prediction of Kidney Disease by using Data Mining Techniques.

Chronic Kidney Disease is a large and growing problem among aging populations.
Detection and prediction of kidney disease is important for providing the right
treatment to the patient. The conventional systems used for detecting kidney diseases
took patient data sets and generated results using if-then rules combined with and-or
mechanisms. The new technique uses both fuzzy systems and neural networks, a combination
called a neuro-fuzzy system, to generate results from the input data set. This combined
system generates results by mathematical computation and not on the basis of
probabilistic theory. The ANFIS system is used in the proposed work. The initial step is
to load the data set to be processed. The data set consists of various information about
the particular subject, which is to be normalized. The next step after loading the data
set is to initialize the ANFIS system, which is a combination of an artificial neural
network and fuzzy logic. The paper regards this system as the most efficient for
computing the results.
After the system is initialized, the next step is to initialize its parameters, such as
the number of inputs required and the outputs. After this, the membership functions are
defined; the calculation is done with the help of these membership functions. Once the
parameters and the membership functions are defined, the ANFIS system is trained. After
training, the performance parameters are calculated; they show the efficiency of the
designed system. The next step is to optimize the parameters of the ANFIS system in
order to achieve better and more efficient results. Finally, the optimized results are
calculated and compared with the earlier results, and the comparison shows the best
results.

[7] A Survey on Mining Techniques for Early Lung Cancer Diagnoses

Lung cancer, a disease highly dependent on historical data for early diagnosis, has
led researchers to pursue data mining techniques for the pre-diagnosis process.
The five-year survival rate increases to 70% with early detection at stage 1, when the
tumor has not yet spread. Existing medical techniques such as X-ray, Computed
Tomography (CT) scan, sputum cytology analysis and other imaging techniques not only
require complex equipment and high cost but are also proven to be effective only in
stage 4, when the tumor has metastasized to other parts of the body. The proposed system
involves the development of a data mining tool that helps classify patients into the
category that could potentially test positive for lung cancer in stage 1. Based on the
pre-diagnosis results from the tool, the doctor can perform the diagnosis to confirm the
tumor in the patient and initiate treatment at an early stage, thereby increasing the
survival rate.
Applying data mining techniques to identify an effective pre-diagnosis of the disease
can improve practitioner performance. Because lung cancer is highly dependent on
historical data, data mining can be used for its early detection. Researchers have been
investigating the application of various data mining techniques to lung cancer datasets
for early diagnosis. This paper proposes a model for measuring whether applying data
mining techniques to a lung cancer dataset can provide reliable performance in the
detection of lung cancer at stage 1. The proposed system uses the most effective method
to extract knowledge and information from the existing lung cancer profile data. Data
cleaning is a challenging step here, as the data collected from heterogeneous sources
does not contain all the required attributes. Normally, performance can be increased by
increasing the amount of training data.

[8] Performance Comparison of Three Data Mining Techniques for Predicting
Kidney Dialysis Survivability

The main objective of this manuscript is to report on research that took
advantage of available technological advancements to develop prediction models
for kidney dialysis survivability. More broadly, the main goal of medical data mining
techniques is to find the best algorithms for describing given data from multiple
aspects. The number of patients on hemodialysis due to end-stage kidney disease is
increasing. The median survival for these patients is only about 3 years, and the cost
of providing care is high. Finding ways to improve patient outcomes and reduce the cost
of dialysis is a challenging task. Dialysis care is complex, and multiple factors may
influence patient survival. More than 50 parameters may be monitored while providing a
kidney dialysis treatment. Understanding the collective role of these parameters in
determining outcomes for an individual patient and administering individualized
treatments is important. Individual patient survival may depend on a complex
interrelationship between multiple demographic and clinical variables, medications, and
medical interventions. In this research, three data mining techniques (Artificial Neural
Networks, Decision Tree and Logistic Regression) are used to elicit knowledge about the
interaction between these variables and patient survival, and their performance is
compared. Knowledge is extracted in the form of classification rules. The concepts
introduced in this research have been applied and tested using data collected at
different dialysis sites, and the computational results are reported. Finally, ANN is
suggested for kidney dialysis as it gives better accuracy and performance.

[9] Performance Evaluation of Different Techniques in the Context of Data Mining-
A Case of an Eye Disease.
Optimization in data mining plays a fundamental role in the extraction of
knowledge in minimum time. The author describes and implements three optimization
techniques: a fuzzy logic based approach and two neural network based approaches
(Perceptron based and Back-propagation based). For the experiments, data from an eye
clinic are collected to understand the appropriate disease and recommend the type of
lenses for a patient. The work also covers the analysis, observations and discussions
based on the results obtained. The dataset consists of four input attributes and one
output class attribute per patient. The first attribute is the age factor describing the
age of the patient, the next is the spectacle prescription, which describes the type of
spectacle the patient is using, and the last is astigmatism, which is a type of eye
defect. Based on these, the output class gives the type of lens recommended by the
doctor.

Fuzzy logic is an approximation method for solving problems. It works with
fuzzy sets rather than crisp sets. The input parameters of fuzzy logic take only
approximate values, called partial truth. Perceptron networks come under single-layer
feed-forward networks and are also called simple perceptrons. The perceptron network
consists of three units, namely the input unit, hidden unit and output unit. The net
input between the input layer and the hidden layer is calculated first, then the net
input between the hidden layer and the output layer is calculated. The input is then
mapped to the output using activation functions such as the binary sigmoidal and bipolar
sigmoidal functions. The computation for all these layers is built into the neural
network tool used here for the optimization results. During the experiments it is found
that Back-Propagation shows an average result. It would be interesting to see the impact
of these approaches on more complex
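
The net input and the two activation functions named in the paragraph above can be
written out directly. The sketch below uses the usual textbook definitions; the
`steepness` parameter and the function names are assumptions of this illustration and
are not taken from the surveyed paper.

```python
import math


def net_input(inputs, weights, bias=0.0):
    """Weighted sum that a unit receives from the previous layer."""
    return bias + sum(x * w for x, w in zip(inputs, weights))


def binary_sigmoid(x, steepness=1.0):
    """Maps any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def bipolar_sigmoid(x, steepness=1.0):
    """Maps any real input into the range (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-steepness * x)) - 1.0


# Made-up inputs and weights for a single unit
y_in = net_input([1.0, 2.0], [0.5, -0.25], bias=0.1)
print(binary_sigmoid(y_in), bipolar_sigmoid(y_in))
```

The bipolar form is the binary form rescaled to be symmetric around zero, which is why
both reach their midpoint at a net input of 0.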

[10] Diagnosis of Kidney Disease Using Fuzzy Expert System.

The paper develops and presents a diagnosis system based on fuzzy logic to report
the healthiness of a patient’s kidney. The data is obtained from various diagnosis
results of kidney patients of Birdem Hospital, Dhaka. For the fuzzy system, a total of
seven input variables are used: nephron functionality, blood sugar, systolic and
diastolic blood pressure, age, weight, and alcohol intake. The result, that is the
healthiness of the kidney, is measured in the range of 0 to 10. The expert system is
implemented in the Fuzzy Logic Toolbox of the MATLAB software. The proposed system
provides a more direct way to tell whether a kidney is in good or bad condition. It is
based on a fuzzy rule-based inference method that applies the Mamdani approach for
fuzzification and de-fuzzification.

Fuzzy logic is best applied in fields where a great amount of uncertainty or
fuzziness exists. In this case, building an expert system by applying fuzzy inference
rules is a very suitable choice. In a fuzzy inference system (FIS), fuzzy set theory is
applied to map inputs (or attributes) to outputs. The fuzzification process involves
transforming crisp values into various grades of membership for linguistic terms of
fuzzy sets. Membership functions are used to associate a grade to each linguistic term.
De-fuzzification is the process of getting a quantifiable result in fuzzy logic, given
the fuzzy sets and corresponding membership degrees (obtained from fuzzification). Given
its input variables, the designed fuzzy expert system is reported to be precise and
accurate. Compared with the traditional approaches used by hospitals, the system can
predict the healthiness of the kidney.
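
The fuzzification, rule evaluation and de-fuzzification steps can be illustrated with a
deliberately small Mamdani-style sketch. It uses a single input (systolic blood
pressure) and two rules, whereas the surveyed system uses seven inputs. All breakpoints,
rule choices and function names below are hypothetical and chosen only so that the
example runs; they are not clinical thresholds.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def centroid(membership, xs):
    """Centre of gravity of a fuzzy set sampled at the points xs."""
    weights = [membership(x) for x in xs]
    total = sum(weights)
    if total == 0:
        return None  # no rule fired
    return sum(x * w for x, w in zip(xs, weights)) / total


def kidney_health(systolic_bp):
    """Map a crisp blood pressure reading to a health score between 0 and 10."""
    # Fuzzification: grade the crisp input against two linguistic terms.
    normal = triangular(systolic_bp, 90, 115, 140)
    high = triangular(systolic_bp, 120, 160, 200)

    # Rule 1: IF pressure is normal THEN health is good (peak at 10).
    # Rule 2: IF pressure is high   THEN health is poor (peak at 0).
    def aggregated(score):
        good = min(normal, triangular(score, 5, 10, 15))  # clip by rule strength
        poor = min(high, triangular(score, -5, 0, 5))
        return max(good, poor)  # combine the clipped output sets

    # De-fuzzification: centroid over the output range 0..10.
    xs = [i / 10 for i in range(101)]
    return centroid(aggregated, xs)


print(kidney_health(115), kidney_health(160))
```

Clipping each output set with `min` and combining them with `max` is the usual Mamdani
choice; other operators or a different de-fuzzification method could be substituted.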

2.2 Survey Findings

Healthcare functionalities are derived using different technologies to enhance
the detection of heart disease in human beings. They also help to retrieve patient
billing and other information, making the records easier to access. Research on
detection is carried out using different data mining methods and algorithms, and future
work is directed accordingly. Additional data mining techniques can be incorporated to
provide better diagnosis. The size of the dataset used in this research is still quite
small; a larger dataset would be expected to give better results. It is also necessary
to test the system extensively with input from doctors, especially cardiologists, before
it can be deployed in hospitals. [1]

Data mining tools have been developed for effective analysis of medical
information to help clinicians make better diagnoses. In this research work, the
researcher collects data from a Hospital Information System (HIS), which holds
sufficient patient details, including the patient’s name, age, disease, location,
district and date, gathered from laboratories and growing year after year. Having
collected the data from the hospital information system, this research finds the
frequent diseases with the help of association techniques. The work helps to mine data
about frequent diseases with the help of tools applied over the training data set. [2]

The paper shows the importance of the clustering technique for identifying the
influence of kidney dialysis parameters. A K-means algorithm is used to identify the
measured parameters. The important parameters for kidney dialysis are creatinine, sodium
and urea. Clustering these parameters helps in predicting the survival of an individual
patient beyond the median survival time. [3]

The dataset holds a large volume of data, which makes classification time-consuming.
The dimensionality of the data therefore needs to be reduced using attribute selection
methods. Keeping track of patient records is difficult to manage and fails to give
accurate results. [4]

Both the injection and the rejection of rules are supported, to allow interactive and
effective contributions from an expert user. The well-known support vector machine (SVM)
classifier and a swarm data miner will be integrated to handle joint processing of the
raw data and the rules. [5]

In this work the ANFIS system is proposed, which is considered to be
better for obtaining useful data. In addition, the system is optimized to
improve the efficiency of the results obtained. From the results it is concluded
that this method is better and more efficient than the traditional systems. The designed
system helps in extracting useful data from the data set and is an effective
approach for the data mining process. In future, the method can be advanced by using
various optimization algorithms that can increase the efficiency of the system. Hybrid
approaches can also be used for obtaining more precise results. [6]

Artificial neural networks (ANN) provide a powerful tool to help doctors
analyze, model and make sense of complex clinical data across a wide range of
medical applications. An ANN is a mathematical model developed on the basis of
biological neural networks. Each neuron node in the input layer represents one attribute
of the patient dataset. The values from the input layer are then sent, along with the
weight values, to the nodes in the hidden layer, where the learning actually takes
place. After the learning process, the classification is done in the output layer [7].
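
The path from input layer to hidden layer to output layer can be sketched as a single
forward pass. This is an illustration only: it assumes one hidden layer, sigmoid
activations and a single output node, and the weights are made-up constants, whereas in
practice they would be learned from the patient data (for example by back-propagation,
which is not shown). The function names are hypothetical.

```python
import math


def sigmoid(x):
    """Squash a real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """Propagate one patient's attribute values through the network."""
    # Hidden layer: each node combines all input attributes with its own weights.
    hidden = [
        sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # Output layer: a single node whose value can be thresholded into a class.
    return sigmoid(output_bias + sum(w * h for w, h in zip(output_weights, hidden)))


# Made-up network: 2 input attributes, 2 hidden nodes, 1 output node
score = forward(
    features=[0.8, 0.3],
    hidden_weights=[[1.5, -2.0], [-1.0, 0.5]],
    hidden_biases=[0.1, -0.1],
    output_weights=[2.0, -1.5],
    output_bias=0.0,
)
label = "ckd" if score >= 0.5 else "notckd"  # hypothetical class names
print(score, label)
```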

The effectiveness of the models was tested using different data mining methods. The
purpose is to determine which model gives the highest percentage of correct predictions
for diagnosing patients with major life-threatening diseases, and to investigate the use
of different classifiers as tools for data mining, predictive modeling and data
processing in the prognosis of diseases. The goal of any modeling exercise is to extract
as much information as possible from the available data and provide an accurate
representation of both the knowledge and the uncertainty about the epidemic. Predicting
the survivability of life-threatening diseases has been a challenging research problem
for many researchers. Since the early days of the related research, much advancement has
been recorded in several related fields [8].

Data mining is about solving problems by analyzing data already present in
databases. Data mining is also described as an essential process in which intelligent
methods are applied in order to extract data patterns. Rule extraction is the basic
process of data mining. If-then rules are the most common form used for rule extraction
when extracting knowledge from a large database. The aim is to obtain the best possible
solution in the extraction [9].

The delivery of a precise and elaborate diagnosis of disease is important and crucial
for the well-being of patients. Conventional diagnosis systems for renal diseases today
involve taking several tests, including tests on blood sugar, BUN (Blood Urea
Nitrogen) and creatinine. The aim is to develop a system which can be used by doctors
for a more precise analysis of kidney condition. Diagnosing kidney condition today is
vital in almost all medical fields [10].

Chapter 3
3.1 Introduction
Software Requirement Specification (SRS) is a fundamental document, which
forms the foundation of the software development process. SRS not only lists the
requirements of a system but also has a description of its major features. These
recommendations extend the IEEE standards. The recommendations would form the basis
for providing clear visibility of the product to be developed serving as baseline for
execution of a contract between client and the developer.

System requirement specification is one of the main steps involved in the development
process. It follows a resource analysis phase, whose task is to determine what a
particular software product does. The focus at this stage is on the users of the system
and not on the system solutions. The resulting requirement specification document states
the intention of the software and the properties and constraints of the desired system.

SRS constitutes the agreement between clients and developers regarding the
contents of the software product that is going to be developed. SRS should accurately and
completely represent the system requirements as it makes a huge contribution to the
overall project plan.

The software being developed may be a part of the overall larger system or may
be a complete standalone system in its own right. If the software is a system component,
the SRS should state the interfaces between the system and software portion.

3.2 Stakeholders
The Stakeholders of the project are:
 Team members
 Project guide
 Project reviewers
 Department Faculties
 College management
 Organization’s officials
 Admin
 Doctor
 Patient
 Receptionist

3.3 Functional Requirements

A functional requirement defines a function of a software system or its component.
A function is described as a set of inputs, the behavior, and outputs.
The different modules present in the system are:
 Module 1: Admin
 Module 2: Doctor
 Module 3: Patient
 Module 4: Receptionist

 Module 1: Admin
In this module, Admin will keep track of day-to-day process done in the
 Staff Creation: Responsible for creating staff account and managing their
 Add Stage Data: Admin will add stages into the database.
 Constraints: Admin will add patient constraints into the database.
 Ranges: Admin will add set of predefined ranges into constraints.
 Password: Admin can reset password of his own or he can also create staff
 Update: Admin can upload any additional data on staff, stages, ranges and
constraints into the database.
 Delete: Admin can delete staff, stages, ranges and constraints from the database.

 Module 2: Doctor
The doctor can detect whether a patient has CKD or not.
 Upload Patient Details: The doctor can upload patient clinical data and can later
modify the record set. This helps to monitor the patient’s clinical record.
 View Treatment Details: After uploading treatment details, the doctor can also view them.

Department of IS&E MIT MYSORE Page 17

Chronic Kidney Disease using Data Mining 2016-2017

 Constraints: Doctor can suggest constraint depending on the treatment of patient.

 Module 3: Patient
In this module, the patient can give feedback and view his/her treatment details.
 Feedback: Patients can provide feedback on the care shown by the clinic
and share their experience in this module.
 View Record Set: After logging in to their account, patients can view their own record
and treatment details.

 Module 4: Receptionist
The receptionist can register patients, upload and view their data, and generate their
billing details.
 Register: The receptionist can register patient information by creating a new account.
 Upload of data: The receptionist can upload a patient’s clinical data.
 Billing Details: The receptionist can generate billing details for each patient.
 View Patient Details: The receptionist can view each patient’s information and
alter it whenever needed.
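
The division of functions among the four modules can be sketched as a simple role-to-action table. This is only an illustrative sketch in Python (the project itself targets C# and ASP.NET); the action names and the `is_allowed` helper are hypothetical labels derived from the module descriptions above, not identifiers from the project.

```python
# Hypothetical permission table derived from the module descriptions above.
# The action names are illustrative labels, not identifiers from the project.
PERMISSIONS = {
    "admin": {"create_staff", "add_stage", "add_constraint", "add_range",
              "reset_password", "update", "delete"},
    "doctor": {"upload_patient_details", "view_treatment_details",
               "suggest_constraint"},
    "patient": {"give_feedback", "view_record_set"},
    "receptionist": {"register_patient", "upload_patient_data",
                     "generate_bill", "view_patient_details"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the given role may perform the given action."""
    # Unknown roles get an empty permission set, so they are always refused.
    return action in PERMISSIONS.get(role.lower(), set())


if __name__ == "__main__":
    print(is_allowed("Doctor", "upload_patient_details"))
    print(is_allowed("Patient", "generate_bill"))
```

Keeping the mapping in one table means a new function can be granted to a module by editing data rather than control flow.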

3.4 Non-Functional Requirements

Non-Functional Requirements are also known as quality requirements which
impose constraints on the design or implementation such as usability, reliability,
performance and scalability.
 Usability: The system is a medical application that automates kidney disease
prediction. It is mainly used by doctors and receptionists of
hospitals, and as it is a browser-based application it can be accessed from anywhere.
 Reliability: The application provides its services according to the users’ needs
and interests and is designed to be user friendly, which is intended to make it
more reliable than comparable medical applications.
 Maintainability: The application is designed in such a way that future modifications
and enhancements can be done easily, and regular updates keep it easy to maintain.
 Efficiency: The application is intended to give efficient results because it uses a
data mining or machine learning technique for disease prediction. A large amount of
data is mined to improve the results.
 Re-usability: The system is a web-based application; once users create an
account, they can access the system multiple times.

3.5 System Requirements
3.5.1 Software Requirements

 Framework: .NET
 IDE: Visual Studio 2010
 Front End: ASP.NET 4.0
 Programming language: C#.NET

3.5.2 Hardware Requirements

 RAM: 1GB+
 Processor: Pentium 4+
 Processor Speed: 2 GHz+
 Hard disk: 20GB+

Chapter 4
4.1 System Analysis
System Analysis is a detailed analysis of the various operations performed by a system
and their relationships within and outside the system. It is a systematic technique that
defines goals and objectives. One of the main aspects of analysis is defining the
boundaries of the system.
System analysis study has been conducted with the following objectives.
 Identify user’s need.
 Evaluate the system concept for feasibility.
 Perform economical and technical analysis.
 Allocate function to hardware, software, people, databases etc.

Tools used in analysis.

The various tools of structural analysis are:
 Use case Diagram
 Sequence Diagram
 Activity Diagram
 Dataflow Diagram
 ER Diagram
 Database Structure
 Graphical User Interface (GUI)

The structured analysis has the following attributes:

 The data flow diagram (DFD) presents a picture of what is being
specified and is conceptually easy to understand.
 The sequence diagram shows the participants in an interaction and the sequence of
messages among them. The system contains different modules which interact
with each other, and the messages passed between the modules are shown by the
sequence diagram.
 The use-case diagram shows how the modules interact with one or
more users.
 The diagrams are specified in a precise, concise and highly readable manner. They
show the working system and how its parts interact.

4.2 High-Level Design

Three tier architecture:

Three-tier architecture is a client-server architecture in which the functional process
logic, data access, computer data storage and user interface are developed and maintained
as independent modules on separate platforms. Three-tier architecture is a software design
pattern and a well-established software architecture.

Fig 4.2 Three-tier Architecture of System

Three-tier architecture allows any one of the three tiers to be upgraded or replaced
independently. The user interface is implemented on a desktop PC and uses a standard
graphical user interface, with different modules running on the application server. The
relational database management system on the database server contains the computer data
storage logic. The middle tier is usually multi-tiered.
The three tiers in three-tier architecture are:
 Presentation tier: Occupies the top-level and displays information related to
services available on a website. This tier communicates with other tiers by
sending results to the browser and other tiers in the network.
 Application tier: Also called the middle tier, logic tier or business logic tier, this
tier is pulled from the presentation tier. It controls application functionality by
performing detailed processing.
 Data tier: Houses database servers where information is stored and retrieved.
Data in this tier is kept independent of application servers or business logic.
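
The separation of the three tiers can be illustrated with a minimal sketch. It is written in Python with an in-memory SQLite database purely for brevity, whereas the project uses ASP.NET, C# and SQL Server; the class names, method names and the `patients` table are assumptions made for this example only.

```python
import sqlite3


class DataTier:
    """Data tier: owns storage and knows nothing about business rules."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def insert_patient(self, name: str) -> int:
        cur = self.conn.execute("INSERT INTO patients (name) VALUES (?)", (name,))
        self.conn.commit()
        return cur.lastrowid

    def fetch_patient(self, patient_id: int):
        return self.conn.execute(
            "SELECT id, name FROM patients WHERE id = ?", (patient_id,)
        ).fetchone()


class ApplicationTier:
    """Application tier: validation and processing, no SQL and no display code."""

    def __init__(self, data: DataTier) -> None:
        self.data = data

    def register_patient(self, name: str) -> int:
        name = name.strip()
        if not name:
            raise ValueError("patient name must not be empty")
        return self.data.insert_patient(name)

    def patient_summary(self, patient_id: int) -> dict:
        row = self.data.fetch_patient(patient_id)
        if row is None:
            raise LookupError(f"no patient with id {patient_id}")
        return {"id": row[0], "name": row[1]}


class PresentationTier:
    """Presentation tier: formats results for the user; calls only the middle tier."""

    def __init__(self, app: ApplicationTier) -> None:
        self.app = app

    def show_patient(self, patient_id: int) -> str:
        summary = self.app.patient_summary(patient_id)
        return f"Patient #{summary['id']}: {summary['name']}"


if __name__ == "__main__":
    app = ApplicationTier(DataTier())
    ui = PresentationTier(app)
    print(ui.show_patient(app.register_patient("  Asha ")))
```

Because the presentation tier never touches SQL and the data tier never formats output, any one class can be replaced without changing the other two, which is the property the section describes.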

4.2.1 System Architecture:

Fig 4.2.1 System Architecture of CKD

A data mining technique is used in building our architecture, in which data is a
key factor and is used throughout the system. All the related data and
information are stored in the database (DB). A database is maintained for each module,
where it stores the data values on which all functionalities work. The algorithm
is applied at each step of a module to predict the result
within the ranges and constraints that are given at the time of inserting values into the record.

The main goal of the system is to detect CKD in a patient by taking risk factors and
different attribute values as input to the Naive Bayes algorithm. Four modules are used in
the system, and each module has different functionality and operations. All functions are
defined within the system and work by taking input values and fetching the result from the
database.
4.3 Use Cases

A use case is a list of steps, defining interactions between system and actors. The
actor can be a human, an external system or time.

Purpose of Use Case Diagram:

 The list of goal names provides the shortest summary of what the system will provide. It
also provides a project planning skeleton, to be used to build initial priorities,
estimates, team allocation and timings.
 The main success scenario of each use case provides everyone involved with an
agreement as to what the system will basically do and what it will not do. It provides the
context for each specific requirement, a context that is very hard to get anywhere else.

Purpose of Use Case in our project:

 In our project, the system administrator has the highest authorization among all
stakeholders. The administrator creates staff accounts, specifies stages, constraints and
ranges of CKD, and sets IDs and passwords for staff.
 The doctor is an actor who participates in multiple use cases such as uploading new
patient details, viewing and uploading treatment details, and viewing the result indicating
presence or absence of CKD.
 The receptionist is an actor who participates in multiple use cases such as registering
patients, uploading old patient details, billing, and viewing treatment details and patient
history.

[Diagram: Admin use cases; recovered label: Set ID/Password]

Fig 4.3.1: Use Case diagram of Admin

[Diagram: Doctor use cases; recovered labels: View patient data, Update/delete]

Fig 4.3.2: Use Case diagram of Doctor

[Diagram: Receptionist use cases; recovered labels: View patient details, Change Password]

Fig 4.3.3: Use Case diagram of Receptionist

Fig 4.3.4: Use Case diagram of Patient

4.4 Sequence Diagram

A sequence diagram is an interaction diagram that shows how processes operate with
one another and in what order. It shows object interactions
arranged in time sequence. It depicts the objects and classes involved in the scenario and
the sequence of messages exchanged between the objects needed to carry out the
functionality of the scenario.
A sequence diagram shows, as parallel vertical lines, the different processes or objects
that live simultaneously and, as horizontal arrows, the messages exchanged between them, in
the order in which they occur.

Purpose of sequence diagram in the project

In this module, the user logs in to the system if already registered; otherwise
the user first has to register with the system. During login the user is validated against
the data in the database. A valid user can input the file and view the
converted data, then log out from the system.
In this module, the admin logs in to the system and is validated. A valid
admin can manage the translations of the users and can add, delete and update
user data by fetching it from the database; otherwise the admin is returned
to the login form. Finally, the admin logs out from the application.

[Diagram: Admin sequence; recovered messages: ID and Password, Success, Create Staff,
Set ID and Password for Staff, View Staff, Input Stages, View Stages,
I/P Ranges and Constraints, View Ranges, Constraints, Upload Password]

Fig 4.4.1: ADMIN Sequence Diagram

[Diagram: Doctor sequence; recovered messages: Upload new patient constraint,
ID and Password, View new/old patient details, Upload Treatment Details, View Result]

Fig 4.4.2: DOCTOR Sequence Diagram

[Diagram: Patient sequence; recovered messages: Upload Feedback, ID and Password,
View Treatment Details]

Fig 4.4.3: Patient Sequence Diagram



[Diagram: Receptionist sequence; recovered messages: Upload old patient detail,
Register patient details, View Patient History]


Fig 4.4.4: RECEPTIONIST Sequence Diagram

4.5 Activity Diagram

An activity diagram is a graphical representation, or flow chart, which describes the
operations of the system. It is basically a flow chart that represents the flow from
one activity to another.

Symbols and Notations:

An activity diagram uses a set of defined symbols; each symbol has its own
meaning and is used wherever it is appropriate.

 Start symbol: Represents the beginning of a process or workflow in an
activity diagram. It can be used by itself or with a note symbol
that explains the starting point.
 Activity symbol: The main component of an activity
diagram. These shapes indicate the activities that make up a
modeled process.
 Connector symbol: Represented by arrowed lines that show the directional flow,
or control flow, of the activity. An incoming arrow starts a step of an activity; once the
step is completed, the flow continues with the outgoing arrow.
 Join symbol: The join symbol, or synchronization bar, is a thick vertical or horizontal
line. It combines two concurrent activities and re-introduces them to a flow where
only one activity occurs at a time.
 Fork symbol: Symbolized with multiple arrowed lines from a
join. It splits a single activity flow into two concurrent activities.
 Decision symbol: A diamond shape; it represents the
branching or merging of various flows with the symbol acting as
a frame or container.
 Note symbol: Allows the diagram creators or collaborators to communicate
additional messages that don't fit within the diagram itself.
 End symbol: Represents the completion of a process or workflow.

Explanation of Each Activity Diagram:

In our system, the workflow of each module is explained and described in detail
using activity diagrams with the relevant symbols and notations. Each
module consists of one input and one output, and individual functions are defined within
the module.
Admin: Each module has a login; using the credentials given, users can see the data
and view their related information. If the login does not match the password, an
invalid-login message is shown. Fig 4.5.1 shows the activity diagram of the admin module.
When the admin logs in successfully, they can enter data related to staff members, stages,
constraints and ranges of the patient test record. They have permission to change their own
password.

Receptionist: As shown in Fig 4.5.2, when the receptionist logs in to his/her page in the
system, the available functions are registration, uploading of patient information,
generating billing details and viewing the treatment details given by the doctor. The
receptionist is responsible for managing the account of each patient. An error message is
displayed if wrong login details are entered.

Patient: In this module, when patients log in to their page they can view treatment
details, which consist of the stage of disease the patient is suffering from, symptoms and
the medical prescription given by the doctor. As shown in Fig 4.5.3, they are provided with
a feedback field where they can give their thoughts and discuss their doubts and issues.

Doctor: Using their login details, doctors are given the privilege to upload patient test
records and to give constraints and ranges depending on the initial stage of CKD detection.
Once a patient’s treatment begins, they can keep track of improvement, adding ranges of
attributes to their own database. A range can be selected by the doctor for each patient,
which helps them to detect CKD.

[Diagram: Admin activity; recovered labels: Is Admin (decision); Staff, Stage,
Constraints, Ranges, Password]

Fig 4.5.1: Activity Diagram of Admin Module

[Diagram: Receptionist activity; recovered labels: Invalid, Is Receptionist (decision);
Register, Upload Details, Billing, View Details]

Fig 4.5.2: Activity Diagram of Receptionist Module

[Diagram: Patient activity; recovered labels: Feedback, View Details]

Fig 4.5.3: Activity Diagram of Patient Module

[Diagram: Doctor activity; recovered labels: Is Doctor (decision); Upload Details,
View Result, Constraints]

Fig 4.5.4: Activity Diagram of Doctor Module

4.6 Data Flow Diagram

A Data Flow Diagram (DFD) is a visual representation of the flow of data through an
information system, modeling its process aspects. In general, a DFD is used as a first step
to give an overview of the system, which can later be elaborated.

A DFD does not show the timing of operations or the internal details of the processes the
data undergoes in the system. It shows what information will be input to and output from
the system, where the data comes from and goes to, and where the data will be stored.

A DFD uses a set of symbols such as rectangles, circles and arrows, plus short text labels,
to show data inputs, outputs, storage points and the routes between each destination. With
the help of a data flow diagram, users are able to visualize how the system will operate,
what the system will accomplish, and how the system will be implemented.

In our project we have four individual modules. The data flow diagrams give a visual
display of the system, and we explain the functionality of each module and how it
relates to the other modules according to the flow.

The Data Flow Diagram of our project consists of four modules, namely:

 Admin
 Receptionist
 Patient
 Doctor

All the above modules are related to each other through a set of processes defined
in the structured system analysis using the defined symbols and notations. Each module has
at least one input for each output. The data used in the system is stored in the
database separately. As a result, by giving a relevant set of inputs we can fetch the
output in the form of data.

Admin Module:
In general, an admin performs various roles by supporting a person or group of people in
a business enterprise, and manages routine administration tasks within an
organization or department.

In our project, the admin plays an important role by handling various tasks within the
system. After getting access to the page, the admin monitors different tasks and keeps
track of patients’ details by inputting different stages and constraints.

Initially, this module supports operations such as entering patient records,
updating or deleting records in the database, and monitoring and viewing patient test
records. The admin can set ranges for the constraints and alter a range when
necessary. The admin can also modify the system by adding or removing constraints.
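
How predefined ranges might be attached to constraints can be sketched as follows. This is a hypothetical illustration in Python: the attribute names echo labels recovered from the E-R diagram (Age, Sugar), but the cut points and range names are placeholders chosen for the example and are not clinical reference values or values taken from the project.

```python
from bisect import bisect_right

# Hypothetical ranges an admin might enter. The cut points are placeholders for
# illustration only and are NOT clinical reference values.
RANGES = {
    "age": ([30, 60], ["young", "middle", "old"]),
    "sugar": ([100, 150], ["low", "medium", "high"]),
}


def to_range(attribute: str, value: float) -> str:
    """Map a raw value to the name of the predefined range that contains it."""
    cut_points, names = RANGES[attribute]
    # bisect_right counts the cut points that are <= value, so a value equal
    # to a cut point falls into the higher range.
    return names[bisect_right(cut_points, value)]


def encode_record(record: dict) -> dict:
    """Replace every raw attribute value in a record with its range name."""
    return {attribute: to_range(attribute, value) for attribute, value in record.items()}


if __name__ == "__main__":
    print(encode_record({"age": 45, "sugar": 160}))
```

Storing the cut points as data, rather than in code, matches the description above in which the admin can alter a range or add and remove constraints without changing the system itself.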

[Diagram: Admin data flow; after Admin Login, recovered operations: Input, Update/delete,
Create, Sets, View, Change; recovered data items: Stages, Constraints, Range]

Fig 4.6.1: Admin Dataflow Diagram

The admin is given permission to add staff to the database by creating a User
ID and password for them, so that staff activity can be monitored. Adding staff information
helps to keep track of which staff member is responsible for a patient, so that time
is not wasted at the time of the patient’s evaluation.

If any update occurs in a patient’s record, the admin has the authority to change
the patient’s dataset in the DB. Datasets can be deleted by the admin when
duplicate entries are found in the database. The admin is authorized to monitor
patients’ data in finding the symptoms for the cause of Chronic Kidney Disease (CKD).

Once data are uploaded to the database, the admin is able to view the patient record.
The admin also has access to the patient’s past record sets, which makes communication
with the doctor easier, and prescriptions are then given accordingly.

Receptionist Module:

[Diagram: Receptionist data flow; after Receptionist Login, recovered operations: Input,
Update/delete, View, Change; recovered data items: Previous patient clinical data,
Billing Details, Patient history, Patient registration, Password]

Fig 4.6.2: Receptionist Dataflow Diagram

This module contains the patient’s previous clinical data, patient registration, patient
history and billing details.

It operates on these attributes of the patient details in a specific
manner. The receptionist is responsible for creating patient registrations, updating
previous data, monitoring history, managing accounts and giving bill details to the patient.

When a patient visits the clinic, it is the job of the receptionist to register his/her
account in the clinic database. If the patient is already registered, there is no need to
create a new account in the DB. This module has access to view the previous history of the
patient’s record set.

Patient Module:

[Diagram: Patient data flow; after Patient Login: Input Feedback, View Treatment Details,
Change Password]


Fig 4.6.3: Patient Dataflow Diagram

For a patient, three functions are provided: giving feedback, viewing treatment details
and changing the password. Initially, the patient is registered with a User ID and
password. Once patients log in to their account, they can view their record details, which
contain information about their treatment.

In this module patients are authorized to change their password so that data can be
secured. Only they can log in to their account and provide feedback on the clinic’s
conduct, on how they feel about the treatment given in the clinic, and on the care taken by
staff members. Feedback helps the clinic and staff members to improve in the future
so that they can take better care of patients.

Doctor Module:

[Diagram: Doctor data flow; after Doctor Login, recovered operations: Input, View, Sets,
Change; recovered data items: New Patient Data, Patient Data, Treatment, Stages, Password]

Fig 4.6.4: Doctor Dataflow Diagram

In this module, the doctor uploads a new patient’s clinical data and treatment
details, and based on the test result can decide whether the patient has CKD or not.

The doctor can add a new patient’s constraints to the dataset. The doctor can also see the
patient record and treatment details, and depending on the test result the system can
predict the stage as well as whether the patient has CKD or not.

4.7 Entity – Relationship Diagram

[Diagram: E-R diagram; recovered entities and attributes: Login (id, Password, User Type),
Stages (Stage 1, Stage 2, Stage 3), Treatment (tid), Constraints (Serum, Age, Sugar),
Patients (pid, pname, Email); recovered relationship labels: Register, Upload, Modify;
cardinalities 1 and n]

Fig 4.7 E-R Diagram of CKD

An Entity-Relationship Model (ER Model) is a data model for describing the data or
information aspects of a business domain or its process requirements, in an abstract way
that lends itself to ultimately being implemented in a database such as a relational
database. The main components of ER models are entities and the relationships that can
exist among them.
An Entity-Relationship Model is a systematic way of describing and defining a
business process. The process is modeled as components (entities) that are linked with
each other by relationships that express the dependencies and requirements between
them.

4.8 Database Structure:

Chronic Kidney Disease Prediction Using Data Mining

Table 1 – Users

Column        Data type             Constraints
UserID        String [varchar(20)]  Not null, Primary Key
Password      String [varchar(20)]
User Type     String [varchar(20)]
EmailId       String [varchar(50)]

Table 2 – Kidney Disease

Column        Data type             Constraints
Disease ID    Int, Auto Generate    Not null, Primary Key
Disease Type  String [varchar(20)]

Table 3 – Stages

Column        Data type             Constraints
Stage ID      Int, Auto Generate    Not null, Primary Key
Disease ID    Int                   Not null, Foreign Key
Stage         String [varchar(20)]

Table 4 – Attribute (constraints)

Column          Data type              Constraints
Attribute ID    Int, Auto Generate     Not null, Primary Key
Attribute       String [varchar(20)]

Table 5 – Values (constraint range)

Column          Data type              Constraints
Value ID        Int, Auto Generate     Not null, Primary Key
Attribute ID    Int                    Not null, Foreign Key
Value           String [varchar(20)]

Table 6 – Patients

Column          Data type              Constraints
Patient ID      Int, Auto Generate     Not null, Primary Key
Patient Name    String [varchar(20)]   Not null
Patient Age     Int                    Not null
Gender          String [varchar(20)]
Marital Status  String [varchar(20)]
Occupation      String [varchar(20)]
Contact No      String [varchar(20)]
Email ID        String [varchar(50)]
Address         String [varchar(500)]
Photo           String [varchar(500)]
AdmitedDate     Date Time
Stage ID        Int                    Not null, Foreign Key

Table 7 – Patient Attributes

Column        Data type              Constraints
PAttributeId  Int, Auto Generate     Not null, Primary Key
Patient ID    Int                    Not null, Foreign Key
Value ID      Int                    Not null, Foreign Key

Table 8 – Treatment

Column        Data type              Constraints
Treatment ID  Int, Auto Generate     Not null, Primary Key
Stage ID      Int                    Not null, Foreign Key
Treatment     String [varchar(500)]
Last Updated  Date Time

Fig 4.8 Data Structure Table
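
The table structure above can be expressed as a schema sketch. The example below uses Python's built-in `sqlite3` module so that it is self-contained; the project itself targets SQL Server, where the auto-generated keys would normally be identity columns. The table named `AttributeValues` stands in for "Values" (renamed here to avoid confusion with the SQL keyword), the column names are the table headings with spaces removed, and `Attribute ID` is treated as the primary key of the Attribute table; all of these are assumptions of this sketch.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Users (
    UserID   VARCHAR(20) NOT NULL PRIMARY KEY,
    Password VARCHAR(20),
    UserType VARCHAR(20),
    EmailId  VARCHAR(50)
);
CREATE TABLE KidneyDisease (
    DiseaseID   INTEGER PRIMARY KEY AUTOINCREMENT,
    DiseaseType VARCHAR(20)
);
CREATE TABLE Stages (
    StageID   INTEGER PRIMARY KEY AUTOINCREMENT,
    DiseaseID INTEGER NOT NULL REFERENCES KidneyDisease(DiseaseID),
    Stage     VARCHAR(20)
);
CREATE TABLE Attribute (
    AttributeID INTEGER PRIMARY KEY AUTOINCREMENT,
    Attribute   VARCHAR(20)
);
CREATE TABLE AttributeValues (
    ValueID     INTEGER PRIMARY KEY AUTOINCREMENT,
    AttributeID INTEGER NOT NULL REFERENCES Attribute(AttributeID),
    Value       VARCHAR(20)
);
CREATE TABLE Patients (
    PatientID     INTEGER PRIMARY KEY AUTOINCREMENT,
    PatientName   VARCHAR(20) NOT NULL,
    PatientAge    INTEGER NOT NULL,
    Gender        VARCHAR(20),
    MaritalStatus VARCHAR(20),
    Occupation    VARCHAR(20),
    ContactNo     VARCHAR(20),
    EmailID       VARCHAR(50),
    Address       VARCHAR(500),
    Photo         VARCHAR(500),
    AdmitedDate   DATETIME,
    StageID       INTEGER NOT NULL REFERENCES Stages(StageID)
);
CREATE TABLE PatientAttributes (
    PAttributeId INTEGER PRIMARY KEY AUTOINCREMENT,
    PatientID    INTEGER NOT NULL REFERENCES Patients(PatientID),
    ValueID      INTEGER NOT NULL REFERENCES AttributeValues(ValueID)
);
CREATE TABLE Treatment (
    TreatmentID INTEGER PRIMARY KEY AUTOINCREMENT,
    StageID     INTEGER NOT NULL REFERENCES Stages(StageID),
    Treatment   VARCHAR(500),
    LastUpdated DATETIME
);
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    """Create the eight tables and return an open connection."""
    conn = sqlite3.connect(path)
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    connection = create_database()
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    print([name for (name,) in tables])
```

With foreign keys enforced, a patient row cannot refer to a stage that does not exist, which mirrors the Foreign Key entries in Tables 3 to 8.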

Chapter 5

Implementation can be described as the realization of an application, or the execution of
a plan, idea, model, design, specification, standard, algorithm, or policy. In computer
science, an implementation is the realization of a technical specification or
algorithm as a program, a software component, or any other computer system through
computer programming and deployment. Many implementations may exist for a given
specification or standard.

5.1 Overview of .NET:

.NET Framework: The .NET Framework is a computing platform that
simplifies application development in the highly distributed environment of the Internet.
The .NET Framework is designed to fulfill the following objectives:
 To provide a consistent object-oriented programming environment whether object code is
stored and executed locally, executed locally but Internet-distributed, or executed
remotely.
 To provide a code-execution environment that minimizes software deployment
and versioning conflicts.
 To provide a code-execution environment that guarantees safe execution of code,
including code created by an unknown or semi-trusted third party.
 To provide a code-execution environment that eliminates the performance
problems of scripted or interpreted environments.
 To make the developer experience consistent across widely varying types of
applications, such as Windows-based applications and Web-based applications.
 To build all communication on industry standards to ensure that code based on the
.NET Framework can integrate with any other code.

The .NET Framework has two main components:

 The common language runtime
 The .NET Framework class library

The common language runtime is the foundation of the .NET Framework. You can
think of the runtime as an agent that manages code at execution time, providing core
services such as memory management, thread management, and remoting, while also
enforcing strict type safety and other forms of code accuracy that ensure security and
robustness. In fact, the concept of code management is a fundamental principle of the
runtime. Code that targets the runtime is known as managed code, while code that does
not target the runtime is known as unmanaged code.

The class library, the other main component of the .NET Framework, is a
comprehensive, object-oriented collection of reusable types that you can use to develop
applications ranging from traditional command-line or graphical user interface (GUI)
applications to applications based on the latest innovations provided by ASP.NET,
such as Web Forms and XML Web services.

5.2 Introduction to ASP.NET

ASP.NET is a unified web development platform that provides the services
necessary for you to build enterprise-class web applications. While ASP.NET is largely
syntax compatible with Active Server Pages (ASP), it provides a new programming model
and infrastructure that allow you to create a powerful new class of applications. ASP.NET
is part of the .NET Framework and allows you to take full advantage of the features of the
common language runtime, such as type safety, inheritance, language interoperability and
versioning.

ASP.NET is supported on Windows 2000 (Professional, Server and Advanced
Server), Windows XP Professional and the Windows Server 2003 family for both client
and server applications.

In addition, to develop ASP.NET server applications, the following software is also
required:

 Windows 2000 Server or Advanced Server with Service Pack 2, Windows
XP Professional or 64-Bit Edition, or one of the Windows Server 2003
family products.
 MDAC 2.7 for Data.
 Internet Information Services.

5.3 Introduction to C#:

C# (pronounced ‘C Sharp’) is a computer programming language
developed by Microsoft Corporation, USA. C# is a fully object-oriented language
like Java and is described as a component-oriented language. It has been designed to
support the key features of the .NET Framework, Microsoft’s development platform for
building component-based software solutions. It is a simple, efficient, productive and
type-safe language derived from the popular C and C++ languages. Although it belongs
to the C/C++ family, it is an object-oriented, modern language suitable for
developing Web-based applications.

C# is designed for building robust, reliable and durable components to handle
real-world applications. Major highlights of C# are:

 It is a language derived from the C/C++ family.
 It simplifies and modernizes C++.
 It is a component-oriented language.
 It was designed specifically for the .NET Framework.
 It is a concise, lean and modern language.
 It combines the best features of many commonly used languages: the
productivity of Visual Basic, the power of C++ and the elegance of Java.
 It is intrinsically object-oriented and web-enabled.
 It has a lean and consistent syntax.
 It embodies today’s concern for simplicity, productivity and robustness.
 It is widely used as a language of choice for .NET programming.
 Major parts of the .NET Framework are actually coded in C#.

ADO.Net - Database Connectivity:

Most applications need data access at some point, making it a crucial
component when working with applications. Data access means making the application
interact with a database, where all the data is stored. Different applications have
different requirements for database access. ASP.NET uses ADO.NET (ActiveX Data Objects
for .NET) as its data access and manipulation technology, which also enables us to work
with data on the Internet. Data access in ADO.NET relies on two components: the Data Set
and the Data Provider.

1. DataSet
The DataSet is a disconnected, in-memory representation of data. It can be
considered a local copy of the relevant portions of the database. The DataSet is
held in memory, and the data in it can be manipulated and updated independently of the
database. When work with the DataSet is finished, the changes can be written back to the
central database. The data in a DataSet can be loaded from any valid data
source, such as a Microsoft SQL Server database, an Oracle database or a Microsoft
Access database.
2. Data Provider
The Data Provider is responsible for providing and maintaining the connection to
the database. A Data Provider is a set of related components that work together to
provide data in an efficient, performance-driven manner. The .NET Framework
originally shipped with two Data Providers: the SQL Data Provider, which is designed only
to work with Microsoft SQL Server 7.0 or later, and the OleDb Data Provider, which
allows us to connect to other types of databases such as Access and Oracle. Each Data
Provider consists of the following component classes:

• The Connection object provides a connection to the database.
• The Command object is used to execute a command.
• The DataReader object provides a forward-only, read-only, connected record set.
• The DataAdapter object populates a disconnected DataSet with data and performs updates.

5.4 Introduction to SQL Server:

Microsoft SQL Server is a full-featured relational database management system
(RDBMS) that offers a variety of administrative tools to ease the burdens of database
development, maintenance and administration. This section covers five of the more
frequently used tools: Enterprise Manager, Query Analyzer, SQL Profiler, Service
Manager and Data Transformation Services. The product documentation is provided in
Books Online.

Enterprise Manager is the main administrative console for SQL Server
installations. It provides you with a graphical "bird's-eye" view of all of the SQL Server
installations on your network. You can perform high-level administrative functions that
affect one or more servers, schedule common maintenance tasks or create and modify the
structure of individual databases.

Query Analyzer offers a quick way to run ad hoc queries against
any of your SQL Server databases. It can be used to pull information out of a
database in response to a user request, test queries before implementing them in other
applications, create or modify stored procedures and execute administrative tasks.

SQL Profiler provides a window into the inner workings of your database.  You
can monitor many different event types and observe database performance in real time. 
SQL Profiler allows you to capture and replay system "traces" that log various activities. 
It's a great tool for optimizing databases with performance issues or troubleshooting
particular problems.

Service Manager is used to control the MSSQLServer (the main SQL Server
process), MSDTC (Microsoft Distributed Transaction Coordinator) and SQLServerAgent
processes. An icon for Service Manager normally resides in the system tray of machines
running SQL Server. You can use Service Manager to start, stop or pause any one of
these services.

Data Transformation Services (DTS) provide an extremely flexible method for
importing and exporting data between a Microsoft SQL Server installation and a large
variety of other formats. The most commonly used DTS application is the "Import and
Export Data" wizard found in the SQL Server program group.

5.5 Reason for choosing .NET:

5.5.1 Limitations of C:
• C developers are forced to contend with manual memory management.
• Pointer arithmetic is error-prone.
• C is a structured (procedural) programming language without built-in object-oriented features.
• Programmers require thorough knowledge of good programming techniques.
5.5.2 Limitations of C++:
• C++ can be thought of as an object-oriented layer on top of C.
• It involves manual memory management.
• Pointer arithmetic is error-prone.
• Its syntactic constructs are complex.

5.5.3 Limitations of JAVA/J2EE:

• Java programmers must use Java front to back during the development cycle.
• It is not appropriate for many graphics-intensive or numerically intensive applications.

.NET provides solutions to the above-mentioned problems.


5.6.1 Classification Rules:
Classification is a process of finding a model (or function) that describes and
distinguishes data classes or concepts. The model is derived based on the analysis of a set
of training data (i.e., data objects for which the class labels are known).

5.6.2 Naive Bayes Algorithm

Naive Bayes is a probabilistic classifier based on Bayes' theorem with strong
independence assumptions between the features. Bayes' theorem provides a way of
calculating the posterior probability, P(c|x), from P(c), P(x) and P(x|c):
P(c|x) = P(x|c) * P(c) / P(x). The Naive Bayes
classifier assumes that the effect of the value of a predictor (x) on a given class (c) is
independent of the values of the other predictors. This assumption is called class conditional
independence.

Naive Bayes Algorithm Steps:
Step 1: Scan the dataset (storage servers).
Retrieve the data required for mining from sources such as a database, the cloud or an
Excel sheet.
Step 2: Calculate the probability of each constraint value (n, n_c, m, p).
For each attribute, calculate the probability of occurrence using the
formula given in the next step. The formula is applied for each class (disease).
Step 3: Apply the formula
P (constraint value (a_i) / subject value (v)) = (n_c + m*p) / (n + m)
Where: n = the number of training examples that have the constraint value a_i
 n_c = the number of those examples that belong to the subject (class) v
 p = a prior estimate for P (in the sample example, 1/2 = 0.5 for two classes)
 m = the equivalent sample size
Step 4: Multiply the probabilities by p.
For each class, multiply the results of all attributes together and then by p; the final
products are used for classification.
Step 5: Compare the values and assign the new record to the predefined class with the
highest value.
Sample Example
Attributes (Constraints) – S1, S2, S3 [m=3]
Subject (Disease) – CKD, NOT CKD [p=1/2=0.5]

Training Dataset

Patient Name   S1 (X,Y,Z)   S2 (A,B,C)   S3 (P,Q,R)   Disease (subject)
Anil           X            A            P            CKD
Ajay           X            B            Q            CKD
Kumar          Z            A            R            CKD
Naveen         Z            C            R            NOT CKD

New patient data – Akash, Constraints (S1-X, S2-A, S3-R), Disease – CKD / NOT CKD?

P = [n_c + (m*p)] / (n+m), with m=3 and p=0.5 throughout

Constraint   Class     n   n_c   Calculation           P
S1 = X       CKD       2   2     [2+(3*0.5)]/(2+3)     0.7
S1 = X       NOT CKD   2   0     [0+(3*0.5)]/(2+3)     0.3
S2 = A       CKD       2   2     [2+(3*0.5)]/(2+3)     0.7
S2 = A       NOT CKD   2   0     [0+(3*0.5)]/(2+3)     0.3
S3 = R       CKD       2   1     [1+(3*0.5)]/(2+3)     0.5
S3 = R       NOT CKD   2   1     [1+(3*0.5)]/(2+3)     0.5

CKD     – 0.7 * 0.7 * 0.5 * 0.5 (p) = 0.1225
NOT CKD – 0.3 * 0.3 * 0.5 * 0.5 (p) = 0.0225

Since 0.1225 is greater than 0.0225, the new patient is classified as CKD.
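
The sample calculation can be reproduced with a short Python sketch. It is an illustration only, not the project's C# implementation: the names `TRAINING`, `m_estimate`, `score` and `classify` are hypothetical, and n and n_c are counted the way the sample example counts them (n = rows with the constraint value, n_c = those rows that also belong to the class).

```python
# Training rows from the sample example: (S1, S2, S3, disease).
TRAINING = [
    ("X", "A", "P", "CKD"),      # Anil
    ("X", "B", "Q", "CKD"),      # Ajay
    ("Z", "A", "R", "CKD"),      # Kumar
    ("Z", "C", "R", "NOT CKD"),  # Naveen
]


def m_estimate(n_c, n, m, p):
    """P = (n_c + m*p) / (n + m), the formula from Step 3."""
    return (n_c + m * p) / (n + m)


def score(rows, new_values, label, m=3, p=0.5):
    """Multiply the per-constraint estimates together and then by p (Step 4)."""
    result = p
    for index, value in enumerate(new_values):
        matching = [row for row in rows if row[index] == value]
        n = len(matching)                                      # rows with this value
        n_c = sum(1 for row in matching if row[-1] == label)   # ...that are in the class
        result *= m_estimate(n_c, n, m, p)
    return result


def classify(rows, new_values, labels=("CKD", "NOT CKD")):
    """Return the class with the highest score plus all scores (Step 5)."""
    scores = {label: score(rows, new_values, label) for label in labels}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    label, scores = classify(TRAINING, ("X", "A", "R"))  # new patient Akash
    print(label, scores)
```

Keeping `m` and `p` as parameters makes it easy to try a different equivalent sample size or prior without touching the counting logic.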

5.6.3 C4.5 Algorithm

C4.5 is one of the best-known algorithms in data mining. It was developed by
Ross Quinlan. In this project, the C4.5 algorithm has been implemented to predict the
CKD stage of a patient based on clinical test constraints.

C4.5 Algorithm Steps:

Step 1: Scan the dataset (storage servers).

Step 2: For each attribute a, calculate the gain [number of occurrences].

Step 3: Let a_best be the constraint with the highest gain [highest count].

Step 4: Create a decision node based on a_best: retrieve the nodes [patients] whose
attribute value matches a_best.

Step 5: Recur on the sub-lists [lists of patients] and calculate the count of outcomes [stages],
termed sub-nodes. The new node is classified based on the highest count.
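
For reference, Quinlan's C4.5 ranks attributes with an entropy-based measure (the gain ratio, which is information gain normalized by split information), whereas the steps above use occurrence counts as the gain. The following Python sketch shows plain information gain only, as background for the term "gain". It is not the project's implementation; the names `ROWS`, `entropy`, `information_gain` and `best_attribute` are hypothetical, and the rows are the five patients of this section's sample training dataset.

```python
import math
from collections import Counter

# Sample training rows from this section: (F1, F2, F3, stage).
ROWS = [
    ("X", "A", "P", "S1"),
    ("X", "B", "Q", "S1"),
    ("Y", "B", "P", "S2"),
    ("Z", "A", "R", "S1"),
    ("Z", "A", "Q", "S2"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(rows, attr_index):
    """Entropy of the whole set minus the weighted entropy after the split."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in {row[attr_index] for row in rows}:
        subset = [row[-1] for row in rows if row[attr_index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


def best_attribute(rows):
    """Index of the attribute with the highest information gain."""
    attribute_count = len(rows[0]) - 1
    return max(range(attribute_count), key=lambda i: information_gain(rows, i))


if __name__ == "__main__":
    for i in range(3):
        print(f"F{i + 1}: gain = {information_gain(ROWS, i):.3f}")
    print("best attribute index:", best_attribute(ROWS))
```

On this small dataset the entropy-based ranking and the count-based ranking need not pick the same attribute, which is worth keeping in mind when comparing the project's results with a standard C4.5 tool.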

Sample Example

Attributes (Features) – F1, F2, F3 [m=3]

Subject (stages) – S1, S2 [p=1/2=0.5]

Training Dataset

Name     F1 (X,Y,Z)   F2 (A,B,C)   F3 (P,Q,R)   Stage (subject)
Anil     X            A            P            S1
Kumar    X            B            Q            S1
Ajay     Y            B            P            S2
Naveen   Z            A            R            S1
Akash    Z            A            Q            S2

New Patient Features – Akul: F1-X, F2-A, F3-R. Which stage?

Feature Count (X) in the dataset = 2
Feature Count (A) in the dataset = 3
Feature Count (R) in the dataset = 1

Sort ();

Feature   Count
A         3
X         2
R         1

A has the highest count. Patients with A – S1 (2) & S2 (1);

Stage   Priority
S1      2
S2      1

S1 has the highest count among the patients that match A, so the new patient Akul is
diagnosed as being in stage S1.
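
The count-based procedure used in this example can be sketched in Python as follows. This is an illustration of the five steps as described in this section, not a full C4.5 implementation and not the project's C# code; the names `TRAINING` and `predict_stage` are hypothetical.

```python
from collections import Counter

# Sample training rows: (F1, F2, F3, stage).
TRAINING = [
    ("X", "A", "P", "S1"),  # Anil
    ("X", "B", "Q", "S1"),  # Kumar
    ("Y", "B", "P", "S2"),  # Ajay
    ("Z", "A", "R", "S1"),  # Naveen
    ("Z", "A", "Q", "S2"),  # Akash
]


def predict_stage(rows, new_values):
    """Pick the most frequent feature value, then the most frequent stage
    among the training rows that share that value."""
    # Step 2: count how often each of the new patient's values occurs.
    counts = {
        index: sum(1 for row in rows if row[index] == value)
        for index, value in enumerate(new_values)
    }
    # Step 3: the feature whose value has the highest count is a_best.
    best = max(counts, key=counts.get)
    # Step 4: keep only the patients that match a_best.
    subset = [row for row in rows if row[best] == new_values[best]]
    # Step 5: count the stages in the subset and take the most frequent one.
    stage_counts = Counter(row[-1] for row in subset)
    stage = stage_counts.most_common(1)[0][0]
    return stage, dict(stage_counts)


if __name__ == "__main__":
    print(predict_stage(TRAINING, ("X", "A", "R")))  # new patient Akul
```

Ties between equally frequent values or stages are resolved here by first occurrence; the report does not say how the project handles ties.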

5.7 Pseudo code

5.7.1 Pseudo code for Login

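main ()
    Admin ();
    Receptionist ();
    Doctor ();
    Patient ();
    GET User_type;
    GET User_ID/Email_Id;
    GET Password;
    If (User_ID == entered User_ID and Password == entered Password)
        User_type is fetched from Database
        If (User_type == 1)
            Navigate to the home page of user type 1
        Else If (User_type == 2)
            Navigate to the home page of user type 2
        Else If (User_type == 3)
            Navigate to the home page of user type 3
        Else If (User_type == 4)
            Navigate to the home page of user type 4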
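
A minimal Python sketch of this login flow is given below, assuming an in-memory dictionary in place of the database table. The user records, the salt, the page names in `HOME_PAGES` and the function names are all hypothetical; the report does not state which user-type number corresponds to which role, so the mapping is left generic.

```python
import hashlib
import hmac

SALT = b"demo-salt"  # hypothetical fixed salt, for illustration only


def hash_password(password):
    """Derive a password hash with PBKDF2 so plain text is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), SALT, 10_000)


# Hypothetical user store standing in for the database: email -> (hash, user type).
USERS = {
    "user1@example.com": (hash_password("secret1"), 1),
    "user3@example.com": (hash_password("secret3"), 3),
}

# Hypothetical home pages, one per user type.
HOME_PAGES = {1: "home_type_1", 2: "home_type_2", 3: "home_type_3", 4: "home_type_4"}


def login(email, password):
    """Return the home page for the user's type, or None if the login fails."""
    record = USERS.get(email)
    if record is None:
        return None  # user not registered
    stored_hash, user_type = record
    if not hmac.compare_digest(stored_hash, hash_password(password)):
        return None  # wrong password
    return HOME_PAGES.get(user_type)


if __name__ == "__main__":
    print(login("user1@example.com", "secret1"))
    print(login("user1@example.com", "wrong"))
```

A dictionary lookup replaces the If / Else If chain, so adding a new user type only requires a new entry in the mapping.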

5.7.2 Pseudo code for Admin

Admin ()
    constraints ();
    values ();
    account ();

account ()
    GET User_type;
    GET User_Id;
    GET password;
    GET Email_Id;
    if (User_Id == entered User_Id)
        User_Id already exists
    else
        staff is added

Stages ()
    GET Stage;
    if (Stage == entered stage)
        Stage exists
    else
        Stage is added

Constraint ()
    GET Constraint;
    if (Constraint == entered constraint)
        constraint already exists
    else
        New constraint is added

Values ()
    GET values;
    if (value <= constraint (max value) and value >= constraint (min value))
        Value is accepted

GET old_password;
GET new_password;
GET confirm_password;
if (old_password == existing password && new_password == confirm_password)
    password changed successfully
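
The three checks in this pseudo code (duplicate detection, range validation and change of password) can be sketched in Python as below. The function names are hypothetical, and the sketch works on plain values rather than on the project's database tables.

```python
def add_unique(existing, entry):
    """Add entry to the set unless it already exists; report whether it was added."""
    if entry in existing:
        return False  # "already exists"
    existing.add(entry)
    return True       # "is added"


def value_in_range(value, min_value, max_value):
    """Accept a value only if it lies within the constraint's min and max."""
    return min_value <= value <= max_value


def change_password(existing, old, new, confirm):
    """Return the password to store after a change request.

    The password changes only when the old password matches the stored one
    and the new password is confirmed; otherwise the stored one is kept.
    """
    if old == existing and new == confirm:
        return new
    return existing


if __name__ == "__main__":
    stages = {"stage1"}
    print(add_unique(stages, "stage1"), add_unique(stages, "stage2"))
    print(value_in_range(5, 1, 10))
    print(change_password("old", "old", "new", "new"))
```

The same `change_password` check applies unchanged to the doctor, receptionist and patient accounts, since their pseudo code repeats the identical condition.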

5.7.3 Pseudo code for Doctor:

Doctor ()
    Get patient_details ();
    Get patient_constraints ();
    If (result == CKD)
        Patient has CKD;
        If (result == stage1)
            Patient has stage1;
        Else If (result == stage2)
            Patient has stage2;
        Else If (result == stage3)
            Patient has stage3;
        Else If (result == stage4)
            Patient has stage4;
        Else
            Patient has stage5;
    Else
        Patient does not have CKD;

Treatment_details ()
    If (stage == stage1)
        Display treatment_details_of_stage1;
    Else If (stage == stage2)
        Display treatment_details_of_stage2;
    Else If (stage == stage3)
        Display treatment_details_of_stage3;
    Else If (stage == stage4)
        Display treatment_details_of_stage4;
    Else
        Display treatment_details_of_stage5;

GET old_password;
GET new_password;
GET confirm_password;
if (old_password == existing password && new_password == confirm_password)
    password changed successfully

5.7.4 Pseudo code for Receptionist:

Receptionist ()
    Get patient_details;
    Get patient_constraints;

GET old_password;
GET new_password;
GET confirm_password;
if (old_password == existing password && new_password == confirm_password)
    password changed successfully

5.7.5 Pseudo code for patient:

Patient ()
    View_treatmentdetails ();
        Get treatment_details;

GET old_password;
GET new_password;
GET confirm_password;
if (old_password == existing password && new_password == confirm_password)
    password changed successfully

5.8 Advantages

• The proposed system is a medical-sector application that automates the prediction of
Chronic Kidney Disease.
• Reduces the time required to analyze test results and predict CKD.
• Aims to reduce kidney failure due to diabetes.
• Aims to reduce the cardiovascular death rate for persons on dialysis.
• Aims to reduce the total number of deaths for persons with a functioning kidney

• We have developed and internally validated five models for CKD
progression from stage I to stage V. Our models leverage different types of
variables (demographic, laboratory and/or clinical documentation data that are
collected routinely during the course of clinical care as part of the electronic health
record) as well as the longitudinal aspect of the records.

• In the absence of laboratory and documentation information, the simplest model
(eGFR model) identifies that low eGFR at the time of CKD stage III diagnosis is
associated with a higher risk of progression. Furthermore, younger patients with
impaired kidney function (stage III CKD) progress more rapidly toward stage V
CKD. Consistent with current knowledge of CKD, male gender was found to be
associated with more rapid loss of eGFR, and the laboratory test models (RLT)
identified laboratory data known to be associated with CKD progression.

• We found that text is a valuable predictor for CKD progression and that the use of
time series models to characterize patient state can substantially improve
predictive accuracy for progression. In particular, the model which incorporated
demographic, laboratory and clinical documentation data had the highest
concordance of the models considered.

• Risk prediction in CKD has been studied extensively, with dozens of available
risk models with acceptable performance (discrimination 0.56–0.94). Most
developed classifiers use readily obtainable information, including age,
demographics and laboratory data. Hence, laboratory data, comorbidities and
occasional vital signs are the sole dimensions of contemporary CKD classifiers.
Age, sex and eGFR are included in almost all models, but fewer than half use
proteinuria (qualitative assessment or quantitative proteinuria or albuminuria),
serum creatinine, serum albumin or blood pressure.

5.9 Limitation

• Because our dataset consists of a non-curated, real-world set of patient characteristics,
as recorded through clinical care, there is some potential noise in the collected
variables. For instance, given the lack of high quality information about ethnicity, we
cannot assess which ethnic groups are well represented in our dataset. This fact may
introduce noise in the GFR calculations.

• The models we designed and validated are based on data from a single institution.
While there is value in focusing on a single institution at a time (the risk predictions
are relevant to the characteristics of the institution's patient population, for instance),
the model validity and its generalizability would be better demonstrated over data
from several institutions.

• In particular, because of the potential variations in clinical vocabulary and overall
language in the documentation across different institutions, there would likely be a
benefit to generalizing the risk model to patient records from other institutions. Our
study requires longitudinal documentation (both inpatient and outpatient notes over
many years, for a large set of patient records). Since there are no publicly available
datasets (even de-identified) with these properties, extending this study to other
datasets is outside the scope of this study and an important limitation of the work.

• Short of training a model for data from different institutions, the models presented in
this study are in theory portable to different institutions. In particular, the
unsupervised NLP techniques described here (topic modeling) are actually conducive
to such an approach, as they identify patterns in the language of any given corpus
without any prior knowledge of the topics or vocabulary to expect. To address the
potential differences in language from one institution to another, the topic models
would have to be learned on documentation from the new institutions.

Chapter 6
Testing is the process of evaluating a system or its component(s) to determine
whether it satisfies the specified requirements. This activity compares the actual
results with the expected results and records the differences between them. In simple
words, testing is executing a system in order to identify any gaps, errors or missing
requirements relative to the actual requirements.
Testing is the practice of making objective judgments regarding the extent to
which the system (device) meets, exceeds or fails to meet stated objectives.

6.1 Purpose of testing

Testing is used to provide customers with reliable software that is as free of bugs
as possible. The software should not cause problems while in use, so software testing is
conducted to ensure that it can be used efficiently. Software is costly to develop, and
if the customer faces problems while using it, the customer may incur huge losses.
Testing is conducted to avoid such losses.
Testing is done to analyze whether the application developed is according to the
requirements. The main purpose of testing is to check for the existence of defects or errors
in a program, project or product, based on some predefined instructions.
Following are some of the important factors for which testing of an application is required:
• To reduce the number of bugs in the code.
• To provide a quality product.
• To verify whether all the requirements are met.
• To satisfy the customer's needs.
• To provide software that is as free of bugs as possible.
• To establish the reliability of the software.
• To prevent users from encountering problems.
• To verify that it behaves "as specified".
• To validate that what has been specified is what the user actually wanted.

6.2 Black Box Testing

Black box testing is the technique of testing without having any knowledge of the interior
workings of the application. The tester is oblivious to the system architecture and
does not have access to the source code. Typically, when performing a black box test, a
tester will interact with the system's user interface by providing inputs and examining
outputs without knowing how and where the inputs are worked upon.

6.3 White Box Testing

White box testing is the detailed investigation of the internal logic and structure of the
code. White box testing is also called glass box testing or open box testing. In order to
perform white box testing on an application, the tester needs to possess knowledge of the
internal working of the code. The tester needs to look inside the source code and
find out which unit/chunk of the code is behaving inappropriately.

6.4 Different Levels of Testing

Fig 6.4: Different Levels of Testing

6.4.1 Unit Testing

Unit Testing is a level of the software testing process where individual
units/components of a software/system are tested. The purpose is to validate that each unit
of the software performs as designed.
This type of testing is performed by the developers (White Box Testing) before
the setup is handed over to the testing team to formally execute the test cases. Unit testing
is performed by the respective developers on their assigned units of source code.
The developers use test data that is separate from the test data of the quality
assurance team.
The goal of unit testing is to isolate each part of the program and show that
individual parts are correct in terms of requirements and functionality.
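
As an illustration of a unit test, the Python sketch below tests a single validation rule in isolation, similar in spirit to the required field validator checks on the login page. The function `required_field_error` and the test class are hypothetical stand-ins and are not taken from the project's code.

```python
import unittest


def required_field_error(value):
    """Return True when the required-field marker (*) should be shown."""
    return value.strip() == ""


class RequiredFieldTests(unittest.TestCase):
    """Each test exercises one unit (the validator) with one input."""

    def test_empty_email_shows_validator(self):
        self.assertTrue(required_field_error(""))

    def test_blank_password_shows_validator(self):
        self.assertTrue(required_field_error("   "))

    def test_filled_email_hides_validator(self):
        self.assertFalse(required_field_error("abc@example.com"))
```

Saved as a module, these tests can be run with `python -m unittest`. Each test method checks one behaviour, so a failure points directly at the unit and input that misbehaved.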

6.4.2 Integration Testing

Integration Testing is a level of the software testing process where individual units
are combined and tested as a group. The purpose of this level of testing is to expose faults
in the interaction between integrated units. The testing of combined parts of an
application to determine if they function correctly together is Integration testing.

6.4.3 System Testing

System Testing is a level of the software testing process where a complete, integrated
system/software is tested. It is the next level of testing after integration testing: once
all the components are integrated, the application as a whole is tested rigorously to see
that it meets quality standards. The purpose of this test is to evaluate the system's
compliance with the specified requirements. This type of testing is performed by a
specialized testing team.

6.4.4 Acceptance Testing

Acceptance testing or User Acceptance Testing is a level of the software testing
process where a system is tested for acceptability. The purpose of this test is to evaluate
the system’s compliance with the business requirements and assess whether it is
acceptable for delivery.

| Test Case ID | Test Case | Input | Expected Result | Actual Result | Status | Remarks |
|---|---|---|---|---|---|---|
| TC001 | Home page | Execute and run the application | The application should run without interrupts | Application is running successfully | Pass | |
| TC002 | Home page display | http://localhost:5219/Login.aspx | Home page has to be displayed | Home page displayed | Pass | |
| TC003 | View User login | Click on User Login link | User login page should be displayed | User login page displayed successfully | Pass | |
| TC004 | Validation of Email Textbox | Email="", Login button click | Required field validator has to be displayed (*) | Required field validator has been displayed | Pass | |
| TC005 | Validation of Email Textbox | Email="m", Login button click | Required field validator shouldn't be displayed (*) | Required field validator has not been displayed | Pass | |
| TC006 | Validation of Password Textbox | Password="", Login button click | Required field validator has to be displayed (*) | Required field validator has been displayed | Pass | |
| TC007 | Validation of Password Textbox | Password=", Login button click | Required field validator shouldn't be displayed (*) | Required field validator has not been displayed (*) | Pass | |
| TC008 | If the user is not a registered one | Email="abc@Gmail.Com", Password=**** | Has to log in as User and navigate to login.aspx page | Error message displayed | Pass | User not found |


Chronic Kidney Disease using Data Mining 2016-2017



| TC009 | Registered user logs in to system by entering email and password (if user is admin) | Email=abc@, Password=**** | Has logged in as Admin and navigates to admin homepage | Successful login | Pass | |
| TC010 | Registered user logs in to system by entering email and password (if user is doctor) | Email=abc@, Password=**** | Has logged in as Doctor and navigates to doctor homepage | Successful login | Pass | |
| TC011 | Registered user logs in to system by entering email and password (if user is receptionist) | Email=abc@, Password=**** | Has logged in as Receptionist and navigates to receptionist homepage | Successful login | Pass | |
| TC012 | Registered user logs in to system by entering email and password (if user is patient) | Email="abc@", Password=**** | Has logged in as Patient and navigates to patient homepage | Successful login | Pass | |
| TC013 | Login button working | Login button click | Has to log in to respective home pages | Didn't log in | Fail | Login button not working |
| TC014 | Login button working | Login button click | Has to log in to respective home pages | Logged in to respective home pages based on user type | Pass | |
| TC015 | Validation of submit button | Submit button click | Data will be stored in the database and message is displayed | Database was not found | Fail | Back end database is not found |
| TC016 | Validation of submit button | Submit button click | Data will be stored in the database and message is displayed | Data stored in the database and message is displayed | Pass | |
| TC017 | View add staff | Click on add staff link from admin home page | Add staff page has to be displayed | Add staff page was not displayed | Fail | Error message displayed |
| TC018 | View add staff | Click on add staff link from admin home page | Add staff page has to be displayed | Add staff page displayed | Pass | |
| TC019 | View disease type | Click on disease type link from the admin home page | Add disease type page has to be displayed | Add disease type page displayed successfully | Pass | |
| TC020 | Setting the type of disease | Admin enters the type of diseases | It should accept the disease type and store it in the database | Disease type is accepted and is saved successfully | Pass | |

| TC021 | View add constraints | Click on add constraints link from admin home page | Add constraints page has to be displayed | Add constraints page was not displayed | Fail | Error message displayed |
| TC022 | View add constraints | Click on add constraints link from admin home page | Add constraints page has to be displayed | Add constraints page displayed | Pass | |
| TC023 | View add ranges | Click on add ranges link from admin | Add ranges page has to be displayed | Add ranges page was not displayed | Fail | Error message displayed |
| TC024 | View Stages | Click on Stages link in the admin home page | Add stages page has to be displayed | Add stages page is displayed successfully | Pass | |
| TC025 | Setting the different type of stages | Admin enters the different stages | It should accept the different stages and store them in the database | Different stages are accepted and saved successfully | Pass | |
| TC026 | View add ranges | Click on add ranges link from admin | Add ranges page has to be displayed | Add ranges page displayed | Pass | |
| TC027 | View add stages | Click on add stages link from admin | Add stages page has to be displayed | Add stages page was not displayed | Fail | Error message displayed |
| TC028 | View add stages | Click on add stages link from admin | Add stages page has to be displayed | Add stages page displayed | Pass | |
| TC029 | View change password | Click on change password link from admin homepage | Change password page has to be displayed | Change password page was not displayed | Fail | Error message displayed |

| TC030 | View change password | Click on add stages link from admin | Change password page has to be displayed | Change password page displayed | Pass | |
| TC031 | View treatment details | Click on treatment details link from patient homepage | Treatment details page has to be displayed | Treatment details page was not displayed | Fail | Error message displayed |
| TC032 | View treatment details | Click on treatment details link from patient homepage | Treatment details page has to be displayed | Treatment details page displayed | Pass | |
| TC033 | View feedback | Click on feedback link from patient homepage | Feedback page has to be displayed | Feedback page was not displayed | Fail | Error message displayed |
| TC034 | View feedback | Click on feedback link from patient homepage | Feedback page has to be displayed | Feedback page displayed | Pass | |
| TC035 | View change password | Click on change password link from patient homepage | Change password page has to be displayed | Change password page was not displayed | Fail | Error message displayed |
| TC036 | View change password | Click on change password link from patient homepage | Change password page has to be displayed | Change password page displayed | Pass | |
| TC037 | View patient details | Click on patient details link from doctor | Patient details page has to be displayed | Patient details page was not displayed | Fail | Error message displayed |
| TC038 | View patient details | Click on patient details link from doctor | Patient details page has to be displayed | Patient details page displayed | Pass | |

| TC039 | View upload treatment | Click on upload treatment link from doc | Upload treatment page has to be displayed | Upload treatment page was not displayed | Fail | Error message displayed |
| TC040 | View upload treatment | Click on upload treatment link from doctor homepage | Upload treatment page has to be displayed | Upload treatment page displayed | Pass | |
| TC041 | View reporting | Click on reporting link from the doctor home | Generate report page has to be displayed | Generate Report page is displayed successfully | Pass | |
| TC042 | Generating Report | Doctor sets the particular disease type and stage to generate report | The report should be generated with the particular disease type and stage | Report is generated successfully | Pass | |
| TC043 | View result | Click on view result link from doctor | Result page has to be displayed | Result page was not displayed | Fail | Error message displayed |
| TC044 | View result | Click on view result link from doctor | Result page has to be displayed | Result page displayed | Pass | |
| TC045 | View add constraints | Click on add constraints link from doctor homepage | Add constraints page has to be displayed | Add constraints page was not displayed | Fail | Error message displayed |
| TC046 | View add constraints | Click on add constraints link from doctor home page | Add constraints page has to be displayed | Add constraints page displayed | Pass | |

| TC047 | View change password | Click on change password link from doctor homepage | Change password page has to be displayed | Change password page was not displayed | Fail | Error message displayed |
| TC048 | View change password | Click on change password link from doctor homepage | Change password page has to be displayed | Change password page displayed | Pass | |
| TC049 | View patient registration | Click on patient registration link from receptionist homepage | Patient registration has to be displayed | Patient registration page was not displayed | Fail | Error message displayed |
| TC050 | View patient registration | Click on patient registration link from receptionist homepage | Patient registration has to be displayed | Patient registration page displayed | Pass | |
| TC051 | View billing details | Click on the billing details link from receptionist homepage | Billing details has to be displayed | Billing details page was not displayed | Fail | Error message displayed |
| TC052 | View billing details | Click on the billing details link from receptionist homepage | Billing details has to be displayed | Billing details page displayed | Pass | |
| TC053 | View patient details | Click on patient details link from receptionist | Patient details page has to be displayed | Patient details page was not displayed | Fail | Error message displayed |
| TC054 | View patient details | Click on patient details link from receptionist | Patient details page has to be displayed | Patient details page displayed | Pass | |

| TC055 | View reporting | Click on reporting link from the receptionist home page | Generate report page has to be displayed | Generate Report page is displayed successfully | Pass | TC048 |
| TC056 | View change password | Click on change password link from receptionist homepage | Change password page has to be displayed | Change password page was not displayed | Fail | Error message displayed |
| TC057 | View change password | Click on change password link from receptionist homepage | Change password page has to be displayed | Change password page displayed | Pass | |
| TC058 | Admin sign out | Click on sign out link in admin homepage | Admin should be redirected to the website | Website homepage was not displayed | Fail | Error message displayed |
| TC059 | Doctor change his password | Click on the change password link from doctor homepage | The system should allow doctor to change his password by confirming old password and new password. If the old password is correct, message should be displayed | An error message is displayed for incorrect old password | Fail | |

| TC060 | Doctor change his password | Click on the change password link from doctor homepage | The system should allow doctor to change his password by confirming old password and new password. If the old password is correct, message should be displayed | The doctor can change his password by confirming his old password and new password. A message is displayed for change password | Pass | |
| TC061 | Receptionist change password | Click on the change password link from doctor homepage | The system should allow receptionist to change his password by confirming his old password and new password. If the old password is correct, message should be displayed | An error message is displayed for incorrect old password | Fail | |
| TC062 | Receptionist change password | Click on the change password link from receptionist homepage | The system should allow receptionist to change his password by confirming his old password and new password. If the old password is correct, message should be displayed | The doctor can change his password by confirming his old password and new password. A message is displayed for change password | Pass | |




TC063 | Creation of staff using add staff | Admin enters id and password of staff from admin page | The new staff should be created with the given id and password | Staff is created with id and password successfully | Pass
TC064 | Deletion of staff | Click on delete staff option from admin page | The staff should be deleted | Staff is deleted | Pass
TC065 | Setting of constraint | Click on set constraint option from admin page | It should accept the constraint name and store it in the database | Constraint is accepted and is saved successfully | Pass
TC066 | Update constraint | Click update constraint button from admin page | It should accept the change made to the constraint and update it in the database | Constraint is updated successfully | Pass
TC067 | Delete constraint | Click delete constraint button from admin page | The constraint should be deleted | Constraint is deleted successfully | Pass
TC068 | Setting the range of respective constraint | Click on set constraint range option from admin page | It should accept the constraint range and store it in the database | Constraint is accepted and is saved successfully | Pass
TC069 | Constraint range | Click update constraint range button from admin page | It should accept the change made to the constraint range and update it in the database | Constraint range is updated successfully | Pass
TC070 | Setting the range of respective constraint | Click on set constraint range option from admin page | It should accept the constraint range and store it in the database | Constraint is accepted and is saved successfully | Pass




TC071 | Constraint range | Click update constraint range button from admin page | It should accept the change made to the constraint range and update it in the database | Constraint range is updated successfully | Pass
TC072 | Delete constraint range | Click delete constraint range button from admin page | The constraint range should be deleted | Constraint range is deleted successfully | Pass
TC073 | Admin sign out | Click on sign out link in admin homepage | Admin should be redirected to the website | Website homepage is displayed | Pass
TC074 | Update treatment details | Click on update treatment details link from patient homepage | Treatment details have to be updated | Treatment details page updated successfully | Pass
TC075 | Delete treatment details | Click on delete treatment details link from patient homepage | Treatment details have to be deleted | Treatment details page deleted | Pass
TC076 | Verification of prediction of disease | The constraint and constraint range are given as input and the result button is clicked | The disease should be predicted by considering the constraint, constraint range and old patient record. If ckd is present display "disease type=ckd" else "disease type=not ckd" | Disease prediction is successful | Pass



TC077 | Doctor sign out | Click on sign out link in doctor page | Doctor should be redirected to the website | Website homepage is displayed | Pass
TC078 | Receptionist sign out | Click on sign out link in reception page | Receptionist should be redirected to the website | Website homepage is displayed | Pass
TC079 | Patient sign out | Click on sign out link in patient page | Patient should be redirected to the website homepage | Website homepage is displayed | Pass
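The change password cases (TC059 to TC062) could be automated along the following lines. This is only a minimal sketch in Python, not the project's actual implementation: the `Account` class, its method names and the two message strings are hypothetical, and salted PBKDF2 hashing is an assumed storage choice that the report does not describe.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted PBKDF2-HMAC-SHA256 hash so plain text is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class Account:
    """Hypothetical stand-in for a doctor or receptionist record."""

    def __init__(self, user_id: str, password: str) -> None:
        self.user_id = user_id
        self.salt = os.urandom(16)
        self.password_hash = hash_password(password, self.salt)

    def check_password(self, password: str) -> bool:
        """Compare in constant time against the stored hash."""
        candidate = hash_password(password, self.salt)
        return hmac.compare_digest(candidate, self.password_hash)

    def change_password(self, old: str, new: str) -> str:
        """Return the message the change password page would display."""
        if not self.check_password(old):
            # Corresponds to the incorrect old password case (TC059, TC061).
            return "Incorrect old password"
        # Corresponds to the successful change case (TC060, TC062).
        self.salt = os.urandom(16)
        self.password_hash = hash_password(new, self.salt)
        return "Password changed successfully"


doctor = Account("doc01", "old-secret")
print(doctor.change_password("wrong-guess", "new-secret"))  # Incorrect old password
print(doctor.change_password("old-secret", "new-secret"))   # Password changed successfully
```

Returning the message from `change_password` keeps the check independent of any page rendering, so the same function can back both the doctor and the receptionist test cases.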


Chapter 7
Login Page

Fig 7.1: Login page

Login Types

Fig 7.2: Login Types


Invalid Condition

Fig 7.3: Invalid Condition

Doctor Login

Fig 7.4: Doctor Login


Admin Login

Fig 7.5: Admin Login

Doctor Home Page

Fig 7.6: Doctor Home Page


Reporting Page

Fig 7.7: Reporting Page

To Upload Treatment Detail

Fig 7.8: Upload Treatment Detail


Change Password

Fig 7.9: Change Password

Receptionist Home Page

Fig 7.10: Receptionist Home Page


Admin Home Page

Fig 7.11: Admin Home Page

To Select Disease Type

Fig 7.12: Disease Type


To Upload Disease and Stages

Fig 7.13: Upload Disease and Stages

To Add Constraints

Fig 7.14: Addition of Constraints


To Set Constraints Ranges

Fig 7.15: Set Constraint Ranges



Chronic Kidney Disease has been predicted and diagnosed using two data mining classifiers, ANN
and Naive Bayes. The factors considered in this work include age, diabetes, blood pressure and
RBC count. The work can be extended by considering other parameters such as food type, working
environment, living conditions, availability of clean water and other environmental factors for
kidney disease detection. This project is a medical sector application that helps medical
practitioners predict disease types based on symptoms. Patients can also obtain a prediction by
entering symptoms in the form of sentences. The application automates disease prediction: it
identifies the disease, its types and its complications from the clinical database efficiently
and economically. This is accomplished by applying the Naive Bayes algorithm, a data mining
classification technique. The proposed work takes symptoms as input and predicts the disease
based on the records of previous patients.
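The Naive Bayes step summarised above can be illustrated with a small sketch. It is not the project's code: the class name, the feature names and the six training records below are invented purely for illustration and carry no clinical meaning, and Laplace (add-one) smoothing is an assumed choice. Each record is taken to hold values that have already been mapped to categories, for example by the constraint ranges set by the admin.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over categorical features with Laplace (add-one) smoothing."""

    def fit(self, records, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[label][feature][value] -> number of training records
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        self.feature_values = defaultdict(set)
        for record, label in zip(records, labels):
            for feature, value in record.items():
                self.value_counts[label][feature][value] += 1
                self.feature_values[feature].add(value)
        return self

    def log_posteriors(self, record):
        """Return log P(class) + sum of log P(value | class) for each class."""
        scores = {}
        for label in self.classes:
            score = math.log(self.class_counts[label] / self.total)
            for feature, value in record.items():
                count = self.value_counts[label][feature][value]
                n_values = len(self.feature_values[feature])
                score += math.log((count + 1) / (self.class_counts[label] + n_values))
            scores[label] = score
        return scores

    def predict(self, record):
        scores = self.log_posteriors(record)
        return max(scores, key=scores.get)


# Invented toy records standing in for old patient data (illustration only).
TRAIN_RECORDS = [
    {"age": "old", "diabetes": "yes", "blood_pressure": "high"},
    {"age": "old", "diabetes": "yes", "blood_pressure": "normal"},
    {"age": "middle", "diabetes": "yes", "blood_pressure": "high"},
    {"age": "young", "diabetes": "no", "blood_pressure": "normal"},
    {"age": "middle", "diabetes": "no", "blood_pressure": "normal"},
    {"age": "young", "diabetes": "no", "blood_pressure": "high"},
]
TRAIN_LABELS = ["ckd", "ckd", "ckd", "not ckd", "not ckd", "not ckd"]


def format_result(label: str) -> str:
    """Format the prediction the way test case TC076 expects it."""
    return f"disease type={label}"


model = CategoricalNaiveBayes().fit(TRAIN_RECORDS, TRAIN_LABELS)
new_patient = {"age": "old", "diabetes": "yes", "blood_pressure": "high"}
print(format_result(model.predict(new_patient)))  # disease type=ckd
```

Working in log space avoids numerical underflow when many features are multiplied together, and the smoothing keeps a value that never occurred with a class in the training records from forcing that class's probability to zero.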



Monitoring of patients using IoT devices

If a patient is unable to visit the hospital for every check-up, he/she can be monitored
remotely from any place using IoT devices.

Query Module
A query module can be added to the application as a future enhancement, through which the
doctor, receptionist and admin can interact with each other.


[1] Aditya Sunda N., Pushpa Latha P., Rama Chandra M.(2012, June). Performance Analysis of
Classification Data Mining Techniques over Heart Disease Data Base. International Journal of
Engineering Science and Advanced Technology(IJESAT). (pp. 470-478),2012.

[2] Ilayaraja M., Meyyappan T. (2013, February).Mining Medical Data to Identify Frequent
Diseases using Apriori Algorithm. In Pattern Recognition, Informatics and Mobile Engineering
(PRIME), 2013 International Conference on (pp. 194-199).IEEE.

[3]Ravindra B. V., Sriraam N., Geetha M. (2014, November).Discovery of Significant

Parameters in Kidney Dialysis Data Sets by K means Algorithm. In Circuits, Communication,
Control and Computing (I4C), 2014 International Conference on (pp. 452-454). IEEE.

[4]Peter T. J., Somasundaram K. (2012, March).An Empirical Study on Prediction of Heart

Disease using Classification Data Mining Techniques. InAdvances in Engineering, Science and
Management (ICAESM), 2012 International Conference on (pp. 514-518). IEEE.

[5] Mostafa Ghannnad Rezaie, Hamid Soltanian Zadeh. Interactive Knowledge Discovery for
Lobe Epilepsy. International Journal of Advanced Science and Technology,(pp. 45-48),2013.

[6] Neha Sharma, Er. Rohit Kumar Verma (2016,September).Prediction of Kidney Disease by
using Data Mining Techniques. International Journal of Advance Research in Computer Science
and Engineering (IJARCSSE), vol 6, issue 9, 2016.

[7] Rajan J. R., Chelvan C. C. (2013, December). A Survey on Mining Techniques for Early
Lung Cancer Diagnoses. In Green Computing, Communication and Conservation of Energy
(ICGCE), 2013 International Conference on (pp. 918-922).IEEE.

Chronic Kidney Disease using Data Mining

[8] Lakshmi K. R., Nagesh Y., VeeraKrishna M. (2014). Performance Comparison of Three Data
Mining Techniques for Predicting Kidney Dialysis Survivability. International Journal of
Advances in Engineering & Technology (IJAET), 7(1), 242-254, 2014.
[9] Agarwal Y., Pandey H. M. (2014, September). Performance Evaluation of Different
Techniques in The Context of Data Mining-A case of an eye disease. In Confluence the Next
Generation Information Technology Summit (Confluence), 2014 5th International Conference-
(pp. 72-76). IEEE.

[10]Ahmed S., Tanzir Kabir M., Tanzeem Mahmood N., Rahman R.M. (2014, December).
Diagnosis of Kidney Disease using Fuzzy Expert System. In Software, Knowledge, Information
Management and Applications (SKIMA), 2014 8th International Conference on (pp. 1-8).IEEE.
