Final Report

Chronic Kidney Disease Prediction using Data Mining 2016-2017
Chapter 1
INTRODUCTION
Data Mining is one of the most promising areas of research, with the purpose of
finding useful information in voluminous data sets. It has been used in many domains
such as image mining, opinion mining, web mining, text mining and graph mining. Its
applications include anomaly detection, financial data analysis, medical data analysis,
social network analysis and market analysis. It has become popular in healthcare
organizations, which need analytical methods for predicting and finding unknown
patterns and information in health data, and it plays a vital role in discovering new
trends in the healthcare industry. Data Mining is particularly useful in the medical field
when no evidence favoring a particular treatment option is available. The healthcare
industry generates large amounts of complex data about patients, diseases, hospitals,
medical equipment, claims, treatment costs and so on, which require processing and
analysis for knowledge extraction.
Data mining provides a set of tools and techniques which, when applied to this
processed data, give healthcare professionals knowledge for making appropriate
decisions and improving patient management. Patients with
similar health issues can be grouped together, and effective treatment plans can be
suggested based on a patient’s history, physical examination, diagnosis and previous
treatment patterns. Chronic Kidney Disease (CKD) has become a global health issue and
is an area of concern. It is a condition in which the kidneys become damaged and cannot
filter toxic wastes from the body. Our work focuses on detecting life-threatening
diseases such as Chronic Kidney Disease (CKD) using classification algorithms, namely
Naive Bayes and Artificial Neural Network (ANN).
1.1 Overview
Chronic Kidney Disease (CKD) is the progressive and irreversible destruction
of the kidneys. The kidneys are essential organs of the human body. They have several
functions, including:
 Helping maintain the balance of minerals and electrolytes in your body, such as
calcium, sodium, and potassium
DEPARTMENT OF IS&E MIT MYSORE Page 1

 Playing a vital role in the production of red blood cells
 Maintaining the delicate acid-base balance of your blood
 Excreting water-soluble wastes from your body
Chronic Kidney Disease (CKD), also called chronic renal disease, is a progressive loss of
kidney function over a period of months or years. The symptoms of worsening kidney
function are not specific, and might include feeling generally unwell and
experiencing a reduced appetite. Often, chronic kidney disease is diagnosed as a
result of screening people known to be at risk of kidney problems, such as those
with high blood pressure or diabetes and those with a blood relative
who has CKD. The disease may also be identified when it leads to one of
its recognized complications, such as cardiovascular disease, anemia, pericarditis or renal
osteodystrophy. CKD is a long-term form of kidney disease; as a result, it is
differentiated from acute kidney disease (acute kidney injury) in that the
reduction in kidney function must be present for more than 3
months. CKD is an internationally recognized public health problem
affecting 5-10% of the world population.
Chronic Kidney Disease is identified by a blood test for creatinine, which is a
breakdown product of muscle metabolism. Higher levels of creatinine indicate a
lower glomerular filtration rate and therefore a decreased capability of the kidneys to
excrete waste products. Creatinine levels may be normal in the early stages of CKD,
and the condition is then discovered if urinalysis (testing of a urine sample) shows that the
kidney is allowing the loss of protein or red blood cells into the urine.
1.2 Motivation
Present-day lifestyles, working environments and diets have given rise to
many diseases, one of which is Chronic Kidney Disease. Chronic Kidney Disease
(CKD) is now prevalent and has become a global health issue which must be
detected and diagnosed in time.
The kidneys are important organs of the human body that remove toxic and unwanted
waste from the blood, allowing the other organs to function smoothly. CKD is a condition
in which kidney function is lost over time, making it difficult for the kidneys to filter
poisonous wastes from the body.

1.3 Scope of Project

Disease detection is one of the significant areas of research in the medical field. The aim
of the project is to detect Chronic Kidney Disease in a patient efficiently.
The project aims at developing an automated software package for Chronic
Kidney Disease prediction that is useful in medical settings such as hospitals and clinical
laboratories. Data mining techniques provide a range of tools for processing
large volumes of data. Besides detecting chronic disease, the approach can
also help to find other symptoms that can be treated following diagnosis.
1.4 Proposed System
Chronic Kidney Disease (CKD) has become a global health issue and is an area of
concern. It is a condition in which the kidneys become damaged and cannot filter toxic
wastes from the body. Our work focuses on detecting life-threatening diseases such as
Chronic Kidney Disease (CKD) using classification algorithms. The proposed system
automates Chronic Kidney Disease prediction using two classification techniques:
Naive Bayes and Artificial Neural Network (ANN).
Chronic Kidney Disease has been predicted and diagnosed using two data mining
classifiers: ANN and Naive Bayes. The performance of these algorithms was compared using
the RapidMiner tool. The results showed that Naive Bayes was the more accurate
classifier, with higher accuracy than ANN. In this system,
the factors considered include age, diabetes, blood pressure and RBC count. The
work can be extended by considering other parameters such as food type, working
environment, living conditions, availability of clean water and environmental factors for
kidney disease detection.
Initially, the dataset is collected from various sources and stored in the database. The
dataset consists of 24 attributes plus the class label, and each attribute has its own range.
In the next step, preprocessing removes noisy and irrelevant data. The preprocessed
data are then passed to the classification algorithms, Naive Bayes
and ANN, as shown in Fig 1.1.

[Figure: flowchart with the blocks Data collection, Preprocessing, Classification Algorithm
(Naïve Bayes & Neural Network), Test Data, Model, New Data Analysis and Prediction]
Fig 1.1: Proposed System of CKD
The classified data of previous patients and the test results of a new patient are given to
the model, where they are arranged in a structured manner. In the analysis phase both are
analyzed, and the result is then predicted.
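The classification step can be illustrated with a minimal Naive Bayes sketch. It is not the RapidMiner model used in this work: it handles only nominal attributes, uses Laplace smoothing (an assumption), and the toy records and the function names `train_naive_bayes` and `predict` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(records, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute index, class) -> value counts
    values_seen = defaultdict(set)        # attribute index -> distinct values
    for record, label in zip(records, labels):
        for i, value in enumerate(record):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def predict(model, record):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)           # log prior
        for i, value in enumerate(record):
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(values_seen[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy records: (hypertension, diabetes mellitus, appetite)
train_x = [("yes", "yes", "poor"), ("yes", "no", "poor"), ("no", "no", "good"),
           ("no", "no", "good"), ("yes", "yes", "good"), ("no", "yes", "good")]
train_y = ["ckd", "ckd", "notckd", "notckd", "ckd", "notckd"]

model = train_naive_bayes(train_x, train_y)
prediction = predict(model, ("yes", "yes", "poor"))
print(prediction)
```

Numerical attributes such as serum creatinine would need either discretization or a Gaussian likelihood, which this sketch leaves out.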
For example, the clinical data of 400 records considered for analysis has been taken
from the UCI Machine Learning Repository. After cleaning and removing missing
values, 220 records remain. The analysis has been implemented using the RapidMiner tool. There are 25
attributes in the dataset, including the class label. The numerical attributes are age,
blood pressure, blood glucose random, blood urea, serum creatinine, sodium, potassium,
hemoglobin, packed cell volume, WBC count and RBC count. The nominal attributes are
specific gravity, albumin, sugar, RBC, pus cell, pus cell clumps, bacteria, hypertension,
diabetes mellitus, coronary artery disease, appetite, pedal edema, anemia and class.
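The cleaning step that removes records with missing values can be sketched as follows. This is a minimal illustration, not the RapidMiner process used here: the `?` missing-value marker, the field names and the sample records are all assumptions.

```python
def drop_incomplete(records, missing_marker="?"):
    """Keep only records in which no attribute is empty or equal to the marker."""
    return [r for r in records
            if all(v not in ("", missing_marker) for v in r.values())]

# Hypothetical records using a few of the attributes listed above.
raw = [
    {"age": "48", "blood_pressure": "80", "hemoglobin": "15.4", "class": "ckd"},
    {"age": "7", "blood_pressure": "50", "hemoglobin": "?", "class": "ckd"},
    {"age": "40", "blood_pressure": "80", "hemoglobin": "15.0", "class": "notckd"},
]
clean = drop_incomplete(raw)

# Convert numerical attributes from text to numbers; nominal ones stay as strings.
NUMERIC = ("age", "blood_pressure", "hemoglobin")
for record in clean:
    for name in NUMERIC:
        record[name] = float(record[name])

print(len(raw), "records before cleaning,", len(clean), "after")
```

Dropping incomplete records is the simplest policy; imputing missing values would keep more records at the cost of extra assumptions.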
The validation process helps to examine the accuracy of the fitted models and their
performance on new data. Model construction builds a model from the training data, and
the testing dataset is then used to measure its performance.
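A hold-out validation of this kind can be sketched in a few lines. The 70/30 split, the fixed seed and the helper names `holdout_split` and `accuracy` are illustrative assumptions, not the settings used in this work.

```python
import random

def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and split it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

rows = list(range(10))              # stand-ins for patient records
train, test = holdout_split(rows)
print(len(train), "training rows,", len(test), "test rows")

# Hypothetical predictions compared against hypothetical true labels.
score = accuracy(["ckd", "ckd", "notckd", "ckd"],
                 ["ckd", "notckd", "notckd", "ckd"])
print(score)
```

The model is fitted on `train` only, so the accuracy measured on `test` estimates performance on unseen patients.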

Chapter 2
LITERATURE SURVEY
2.1 Survey Papers
[1] Performance analysis of classification data mining techniques over heart disease
data base.
The healthcare industry collects huge amounts of healthcare data which,
unfortunately, are not “mined” to discover hidden information for effective decision
making. Hidden patterns and relationships often go unexploited. Advanced
data mining techniques can help remedy this situation.
The system described is web based and user friendly, and can be used in
hospitals that have a data warehouse. The paper analyzes the
performance of two classification data mining techniques using various
performance measures.
The effectiveness of the models was tested using two methods: Classification Matrix
and Lift Chart. The system can serve as a training tool to train nurses and medical students
to diagnose patients with heart disease. It can also provide decision support to assist
doctors in making better clinical decisions, or at least provide a “second opinion.”
This system is developed using two data mining classification modeling
techniques. It extracts hidden knowledge from a historical heart disease
database. The DMX query language and functions are used to build and access the models.
The models are trained and validated against a test dataset, and Classification Matrix
methods are used to evaluate their effectiveness. The two models are able to extract
patterns in response to the predictable state.
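A classification matrix simply counts how often each actual class was predicted as each class. The sketch below is a generic illustration with hypothetical labels; it does not reproduce the DMX-based evaluation of the surveyed paper.

```python
from collections import Counter

def classification_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(actual, predicted))

# Hypothetical true labels and model predictions.
actual    = ["sick", "sick", "sick", "healthy", "healthy", "healthy"]
predicted = ["sick", "sick", "healthy", "healthy", "healthy", "sick"]
matrix = classification_matrix(actual, predicted)

labels = sorted(set(actual) | set(predicted))
for a in labels:                       # one row per actual class
    print(a, [matrix[(a, p)] for p in labels])

# Accuracy is the sum of the diagonal divided by the number of cases.
correct = sum(matrix[(label, label)] for label in labels)
accuracy = correct / len(actual)
print(accuracy)
```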
[2] Mining Medical Data to Identify Frequent Diseases using Apriori Algorithm.
Data mining is the process of analyzing large amounts of data from different perspectives
and summarizing it into useful information. This information can be converted into
knowledge about historical patterns and future trends. Patients from different locations
approach different hospitals; they do not converge in one place. Their records are
maintained by the hospitals where they are treated, so collecting information about
frequently occurring diseases is not an easy job. Data about such diseases can be
mined through association rules, and the Apriori algorithm is adopted for this purpose.
Details regarding the occurrence of these diseases in a particular time period can
also be mined using the Apriori algorithm.
By the Apriori principle, any subset of a frequent itemset must also be
frequent. If {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets.
The key idea of the Apriori algorithm is to make multiple passes over the database. It
employs an iterative approach known as breadth-first search (level-wise search) through
the search space, where k-itemsets are used to explore (k+1)-itemsets. In the beginning,
the set of frequent 1-itemsets is found. This set, whose members each contain one item and
satisfy the support threshold, is denoted by L1. Each subsequent pass begins with a seed set
of itemsets found to be large in the previous pass. This seed set is used for generating new
potentially large itemsets, called candidate itemsets, and the actual support for
these candidate itemsets is counted during the pass over the data. At the end of the pass, we
determine which of the candidate itemsets are actually large (frequent), and they become
the seed for the next pass.
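The level-wise search described above can be sketched as follows. This is a minimal, unoptimized illustration rather than the surveyed paper's implementation; the patient records and the support threshold are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # One pass over the data: keep candidates that meet the support threshold.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    # Pass 1: frequent 1-itemsets (L1).
    items = {item for t in transactions for item in t}
    current = count({frozenset([item]) for item in items})
    frequent = dict(current)
    k = 1
    while current:
        # Join frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        # Prune candidates with an infrequent k-item subset (Apriori principle).
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k))}
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical patient records, each a set of diagnosed diseases.
records = [
    {"diabetes", "hypertension", "ckd"},
    {"diabetes", "hypertension"},
    {"hypertension", "ckd"},
    {"diabetes", "hypertension", "ckd"},
    {"asthma"},
]
result = apriori(records, min_support=3)
for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

In this toy data the candidate {ckd, diabetes, hypertension} is pruned without being counted, because its subset {ckd, diabetes} is not frequent.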
The proposed method is useful for identifying the frequent diseases in a large medical
dataset. The outcome of this research will help practitioners in making medical
decisions for frequently occurring diseases. The analysis is made on data from various
geographical locations during different time periods.
[3] Discovery of Significant Parameters in Kidney Dialysis Data Sets by K-Means
Algorithm.
A few parameters are considered essential for decision making
in kidney dialysis; parameters such as creatinine, sodium and urea play an important
role. This paper identifies the possible survival period of kidney failure patients before
they need kidney transplantation. The survival period of a patient is
identified from the influence of the kidney dialysis parameters with the help of the K-means
clustering algorithm. The K-means algorithm works as follows. It represents each cluster by
the mean value of the objects in the cluster. It takes the input parameter k and partitions a
set of n objects into k clusters so that the resulting intra-cluster similarity is high but the
inter-cluster similarity is low. Cluster similarity is measured with respect to the mean value
of the objects in a cluster. First, the algorithm randomly selects k of the objects, each of which
initially represents a cluster mean or centre. Each of the remaining objects is
assigned to the cluster to which it is most similar, based on the distance between the
object and the cluster mean. The algorithm then computes the new mean for each cluster. This
process iterates until the criterion function converges. This clustering procedure is applied to
the parameters, and the survival period of the patient is identified. The clustering is applied
based on age and gender. Patients whose parameters have normal values have a better
survival rate than those with low or high values.
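The K-means steps above can be sketched in plain Python. This is an illustration under stated assumptions: squared Euclidean distance is the similarity measure, iteration stops when the cluster assignments no longer change (a common stand-in for the criterion function), and the (creatinine, urea) readings are hypothetical.

```python
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Cluster points (tuples of numbers) into k groups; returns (means, assignments)."""
    means = random.Random(seed).sample(points, k)   # k random objects as initial means
    assignments = None
    for _ in range(max_iter):
        # Assign each object to the cluster with the nearest mean.
        new_assignments = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, means[c])))
            for p in points
        ]
        if new_assignments == assignments:          # no change: converged
            break
        assignments = new_assignments
        # Recompute each cluster mean from its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                means[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return means, assignments

# Hypothetical (creatinine, urea) readings for six patients.
readings = [(1.0, 30.0), (1.2, 35.0), (0.9, 28.0),
            (6.5, 120.0), (7.0, 130.0), (6.8, 125.0)]
means, assignments = kmeans(readings, k=2)
print(assignments)
```

The cluster numbers themselves are arbitrary; only the grouping of patients matters.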
[4] An empirical study on prediction of heart disease using classification data mining
techniques.
The paper proposes the use of pattern recognition and data mining techniques in risk
prediction models in the clinical domain of cardiovascular medicine. The data is
modeled and classified using classification data mining techniques. One
limitation of conventional medical scoring systems is that they rely on
intrinsically linear combinations of the input variables and hence are not adept at
modeling nonlinear complex interactions in medical domains. This limitation is handled
in this research by the use of classification models, which can implicitly detect complex
nonlinear relationships between dependent and independent variables and can
detect possible interactions between predictor variables.
The paper investigates the performance of different classification algorithms, namely DT,
NB, K-NN and NN, on a heart disease dataset. Heart disease prediction from patient
records is useful for cardiovascular clinicians. The patient records are
classified to predict which patients have heart disease. After evaluation it is found
that NB gives better accuracy than the other classifiers.
[5] Interactive knowledge discovery for temporal lobe epilepsy.

The aim is to optimize the rule-discovery process by giving clinicians the flexibility of
incorporating domain knowledge, in the form of desired rule formats, into the rule search.
There are many reasons why a physician might experience difficulty in formulating an
appropriate differential diagnosis. It may be that the case involves a rare disease or an
unusual presentation. Often, such difficulties arise in clinical problems where two or more
disease processes are at work, generating a complex sequence of abnormal findings that
can be interpreted in a variety of ways.
The paper also proposes a data discovery algorithm for a small data set with high
dimensionality. A support vector machine (SVM) is applied to classify the feature vectors.
Finally, particle swarm agents are used to discover the SVM classification rules. It has been
shown that this algorithm can manage the rule extraction task efficiently. The authors
develop and evaluate a new approach for interactive data mining based on swarm
intelligence. The proposed method processes external rules along with the raw data to do
reasoning. It is designed to work in low-sample, high-dimensional feature space
conditions where the statistical power of the raw data is not sufficient for a reliable decision.
[6] Prediction of Kidney Disease by using Data Mining Techniques.

Chronic Kidney Disease is a large and growing problem among aging populations.
Detection and prediction of kidney disease is important for providing the right
treatment to the patient. The conventional systems used for detection of
kidney diseases took patient data sets and generated results using if-then rules
along with and-or mechanisms. The new technique combines fuzzy systems and neural
networks into a neuro-fuzzy system that generates results on the basis of the
input data set. This combined system generates results by mathematical computation
and not on the basis of probabilistic theory. The ANFIS system is used in this work. The initial
step is to load the data set that is to be processed. The data set consists of various
information about the particular subject, which is to be normalized. The next step after
loading the data set is to initialize the ANFIS system, which is a combination of an
artificial neural network and fuzzy logic. The paper regards this system as the most
efficient one for computing the results.
After the system is initialized, the next step is to initialize the parameters of the system.
These parameters can be the number of inputs required and the outputs. After this, the
membership functions are defined. With the help of these membership functions the
calculation is done. Once the parameters and the membership functions are defined,
the ANFIS system is trained. After training, the
performance parameters are calculated; they show the efficiency of
the designed system. The next step is to optimize the parameters of the ANFIS system
in order to achieve better and more efficient results. Finally, the
optimized results are calculated and compared. The comparison shows
the best results.
[7] A Survey on Mining Techniques for Early Lung Cancer Diagnoses
Lung cancer, a disease highly dependent on historical data for early diagnosis, has
led researchers to pursue data mining techniques for the pre-diagnosis process.
The five-year survival rate increases to 70% with early detection at stage 1, when the
tumor has not yet spread. Existing medical techniques such as X-ray, Computed
Tomography (CT) scan, sputum cytology analysis and other imaging techniques not only
require complex equipment and high cost but are also proven to be efficient only in stage 4,
when the tumor has metastasized to other parts of the body. The proposed system involves
the development of a data mining tool that helps to classify patients into
the category that could potentially test positive for lung cancer in stage 1. Based on the
pre-diagnosis results from the tool, the doctor can perform the diagnosis to
confirm a tumor in the patient and initiate treatment at an early stage, thereby
increasing the survival rate.
Applying data mining techniques to identify an effective pre-diagnosis
of the disease can improve practitioner performance. Because lung cancer is
highly dependent on historical data, data mining can be used for its early detection.
Researchers have been investigating the application of various data mining techniques to
lung cancer datasets for early diagnosis. This paper proposes a model for
measuring whether applying data mining techniques to a lung cancer dataset can provide
reliable performance in the detection of lung cancer at stage 1. The proposed system uses the
most effective method to extract knowledge and information from the existing lung cancer
profile data. Data cleaning is a challenging step here, as the data collected from
heterogeneous sources does not contain all the required attributes. Normally, performance
improves as the amount of training data increases.
[8] Performance Comparison of Three Data Mining Techniques for Predicting
Kidney Dialysis Survivability

The main objective of the paper is to report on research that took
advantage of available technological advancements to develop prediction models
for kidney dialysis survivability. The main goal of medical data mining
techniques is to find the best algorithms that describe given data from multiple aspects. The
number of patients on hemodialysis due to end-stage kidney disease is increasing. The
median survival for these patients is only about 3 years, and the cost of providing care is
high. Finding ways to improve patient outcomes and reduce the cost of dialysis is a
challenging task. Dialysis care is complex and multiple factors may influence patient
survival. More than 50 parameters may be monitored while providing a kidney dialysis
treatment. Understanding the collective role of these parameters in determining outcomes
for an individual patient and administering individualized treatments is of importance.
Individual patient survival may depend on a complex interrelationship between multiple
demographic and clinical variables, medications, and medical interventions. In this
research, three data mining techniques (Artificial Neural Networks, Decision Tree and
Logistic Regression) are used to elicit knowledge about the interaction between these
variables and patient survival. The performance of the three data mining techniques
is compared for extracting knowledge in the form of classification rules. The concepts
introduced in this research have been applied and tested using data collected at different
dialysis sites, and the computational results are reported. Finally, ANN is suggested for
kidney dialysis to get better results in terms of accuracy and performance.
[9] Performance Evaluation of Different Techniques in the Context of Data Mining-
A Case of an Eye Disease.
Optimization in data mining plays a fundamental role in the extraction of
patterns or knowledge in minimum time. The author describes and implements three
optimization techniques: a fuzzy logic based approach and two neural network based
approaches (Perceptron based and Back-propagation based). For the experiments,
data from an eye clinic are collected to understand the appropriate disease and
recommend the type of lenses for a patient. The work also covers the analysis,
observations and discussions based on the obtained results. The dataset consists of four
input attributes and one output class attribute per patient. The input attributes include
the age of the patient, the spectacle prescription, which describes the type of spectacles the
patient is using, and astigmatism, which is a type of eye defect. Based on
these, the output class gives the type of lens recommended by the doctor.
Fuzzy logic is an approximation method for solving problems. It uses
fuzzy sets rather than crisp sets. The input parameters of fuzzy logic take only
approximate values, called partial truth. Perceptron networks come under single layer
feed forward networks and are also called simple perceptrons. The perceptron network
consists of three units, namely the input unit, the hidden unit and the output unit. The net
input between the input layer and the hidden layer is calculated first, and then the net input
between the hidden layer and the output layer is
calculated. The input is then mapped to the output using different activation functions such as
the binary sigmoidal and bipolar sigmoidal activation functions. The computation for all these
layers is built into the neural network tool which is used here for the optimization
results. During the experiments it is found that Back-Propagation shows average
results. It would be interesting to see the impact of these approaches on more complex
cases.
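The two activation functions named above, and the layer-by-layer net input calculation, can be sketched as follows. The network size, weights and biases are hypothetical values chosen only for illustration; they are not taken from the surveyed paper or its tool.

```python
import math

def binary_sigmoid(x):
    """Binary sigmoid: output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def bipolar_sigmoid(x):
    """Bipolar sigmoid: output lies in (-1, 1); equal to tanh(x / 2)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

def layer(inputs, weights, biases, activation):
    """Net input of each unit is bias + weighted sum of inputs, then the activation."""
    return [activation(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]

# Hypothetical network: 3 inputs -> 2 hidden units -> 1 output unit.
x = [1.0, 0.0, 1.0]
hidden = layer(x, [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1], bipolar_sigmoid)
output = layer(hidden, [[1.0, -1.0]], [0.0], binary_sigmoid)
print(hidden, output)
```

Training (for example by back-propagation) would adjust the weights; this sketch shows only the forward mapping from input to output.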
[10] Diagnosis of Kidney Disease Using Fuzzy Expert System.

The paper develops and presents a diagnosis system based on fuzzy logic to report
the healthiness of a patient’s kidney. The data is obtained from various diagnosis
results of kidney patients of Birdem Hospital, Dhaka. For the fuzzy system, a total of
seven input variables are used: nephron functionality, blood sugar, systolic and
diastolic blood pressure, age, weight, and alcohol intake. The result, the healthiness
of the kidney, is measured in the range of 0 to 10. The expert system is implemented in the
fuzzy logic toolbox built into the MATLAB software. The proposed system provides
a more direct way to tell whether a kidney is in good or bad condition. The system is based
on a fuzzy rule-based inference
method, where the Mamdani approach is applied for fuzzification and de-fuzzification.
Fuzzy logic is best applied in fields where a great amount of uncertainty or
fuzziness exists. In this case, building an expert system by applying fuzzy inference rules
is a very suitable choice. In a fuzzy inference system (FIS), fuzzy set theory is applied to map
inputs (or attributes) to outputs. The fuzzification process involves transforming crisp
values into various grades of membership for linguistic terms of fuzzy sets. Membership
functions are used to associate a grade with each linguistic term. De-fuzzification is the
process of getting a quantifiable result in fuzzy logic, given the fuzzy sets and the
corresponding membership degrees (obtained from fuzzification). With the designed
input variables, the fuzzy expert system can be precise and accurate. Compared with the
traditional approaches used by hospitals, the system can predict the healthiness of a kidney.
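Fuzzification and Mamdani-style de-fuzzification can be sketched with one input variable. Everything specific here is an assumption made for illustration: the triangular membership functions, their break points, the two rules and the centroid method are hypothetical and are not the rule base of the surveyed system, which uses seven inputs.

```python
def triangular(x, a, b, c):
    """Membership grade of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(systolic):
    """Map a crisp systolic pressure to grades for two linguistic terms."""
    return {"normal": triangular(systolic, 90, 115, 140),
            "high": triangular(systolic, 120, 160, 200)}

def defuzzify(grades, steps=100):
    """Clip each output set by its rule strength, combine by max, return the centroid."""
    numerator = denominator = 0.0
    for i in range(steps + 1):
        y = 10.0 * i / steps                                      # health score, 0 to 10
        good = min(grades["normal"], triangular(y, 5, 10, 15))    # IF normal THEN good
        bad = min(grades["high"], triangular(y, -5, 0, 5))        # IF high THEN bad
        mu = max(good, bad)
        numerator += y * mu
        denominator += mu
    return numerator / denominator if denominator else None

grades = fuzzify(130)
score = defuzzify(grades)
print(grades, score)
```

A reading of 130 belongs partly to both terms, so the resulting score lies between the "bad" and "good" ends of the 0 to 10 scale.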
2.2 Survey Findings

Healthcare functionalities are derived using different technologies to enhance
the detection of heart disease in human beings. They also help to obtain patient billing
and other information, easing the availability of records. Detection research
is carried out using different data mining methods and algorithms, and future
work is directed accordingly. Additional data mining techniques can be
incorporated to provide better diagnosis. The size of the dataset used in this research is
still quite small; a larger dataset would be expected to give better results. It is also necessary to
test the system extensively with input from doctors, especially cardiologists, before it can
be deployed in hospitals. [1]
Data mining tools have been developed for effective analysis of medical
information to help the clinician in making better diagnoses. In this research work, the
researcher can collect data from a Hospital Information System (HIS), which holds
sufficient details of each patient, including name, age, disease, location, district and date,
from laboratories, and which keeps growing year after year. Having collected the data from
the hospital information system, the research can find the frequent diseases with the help of
association techniques. This research work helps to mine data about frequent
diseases with the help of tools applied over a training data set. [2]
The paper shows the importance of the clustering technique for identifying the influence
of kidney dialysis parameters. A K-means algorithm is used to identify the measured
parameters. The important parameters for kidney dialysis are creatinine, sodium and urea.
Clustering on these parameters helps in predicting the survival of an individual patient
beyond the median survival time. [3]
The dataset contains a large volume of data, which makes classification time consuming.
The dimensionality of the data therefore needs to be reduced using attribute selection
methods. Keeping track of patient records is difficult to manage
and can lead to inaccurate results. [4]
Both the injection and the rejection of rules are supported, to allow interactive and effective
contributions from an expert user. The well-known support vector machine (SVM)
classifier and a swarm data miner will be integrated to handle joint processing of the raw
data and the rules. [5]
In this work the ANFIS system is proposed, which is considered to be
better for obtaining useful data. Along with this, the system is optimized to
improve the efficiency of the results obtained. From the results it is concluded
that this method is better and more efficient than the traditional systems. The designed
system will help in extracting useful data from the data set. The proposed system is an
effective approach for the data mining process. In future the method can be advanced by
using various optimization algorithms that can increase the efficiency of the system. Hybrid
approaches can also be used for obtaining more precise results. [6]
Artificial neural networks (ANN) provide a powerful tool to help doctors
analyze, model and make sense of complex clinical data across a wide range of
medical applications. An ANN is a mathematical model developed on the basis of biological
neural networks. Each neuron node in the input layer represents one attribute of the
patient dataset. The values from the input layer are sent, along with the weight values, to
the nodes in the hidden layer, where the learning actually takes place. After the
learning process, the classification is done in the output layer [7].
The effectiveness of the models was tested using different data mining methods. The
purpose is to determine which model gave the highest percentage of correct predictions
for diagnosing patients with a major life-threatening disease. The purpose of this study is
to investigate the use of different classifiers as tools for data mining, predictive modeling
and data processing in the prognosis of diseases. The goal of any modeling exercise is
to extract as much information as possible from the available data and
provide an accurate representation of both the knowledge and the uncertainty about the
epidemic. Predicting the survivability of life-threatening diseases has been a challenging
research problem for many researchers. Since the early days of the related research,
much advancement has been recorded in several related fields [8].
Data Mining is about solving problems by analyzing data already present in
databases. Data mining is also described as an essential process in which intelligent methods
are applied in order to extract data patterns. Rule extraction is the basic process of
data mining. If-then rules are the most common form of rule extracted when
obtaining knowledge from a large database. The aim is to obtain the best possible solution
in the extraction [9].
The delivery of a precise and elaborate diagnosis of disease is important and crucial
for the well-being of patients. Conventional diagnosis systems for renal diseases today
involve taking several tests, including tests on blood sugar, BUN (Blood Urea
Nitrogen) and creatinine. The paper develops a system which can be used by doctors for a
more precise analysis of kidney condition. Diagnosing kidney condition is today vital
in almost all medical fields [10].

Chapter 3
SOFTWARE REQUIREMENT SPECIFICATIONS
3.1 Introduction
Software Requirement Specification (SRS) is a fundamental document, which
forms the foundation of the software development process. SRS not only lists the
requirements of a system but also has a description of its major features. These
recommendations extend the IEEE standards. The recommendations would form the basis
for providing clear visibility of the product to be developed serving as baseline for
execution of a contract between client and the developer.
System requirement specification is one of the main steps involved in the development
process. It follows a resource analysis phase, whose task is to determine what a
particular software product does. The focus at this stage is on the users of the system
and not on the system solutions. The resulting requirement specification document states
the intention of the software and the properties and constraints of the desired system.
SRS constitutes the agreement between clients and developers regarding the
contents of the software product that is going to be developed. SRS should accurately and
completely represent the system requirements as it makes a huge contribution to the
overall project plan.
The software being developed may be a part of the overall larger system or may
be a complete standalone system in its own right. If the software is a system component,
the SRS should state the interfaces between the system and software portion.
3.2 Stakeholders
The stakeholders of the project are:
• Team members
• Project guide
• Project reviewers
• Department faculty
• College management
• Organization’s officials
• Admin
• Doctor
• Patient
• Receptionist
3.3 Functional Requirements

A functional requirement defines a function of a software system or its component.
A function is described as a set of inputs, the behavior, and outputs.
The modules present in the system are:
• Module 1: Admin
• Module 2: Doctor
• Module 3: Patient
• Module 4: Receptionist
• Module 1: Admin
In this module, the Admin keeps track of the day-to-day processing done in the
application.
  • Staff Creation: The Admin is responsible for creating staff accounts and managing
    their activity.
  • Add Stage Data: The Admin adds stages to the database.
  • Constraints: The Admin adds patient constraints to the database.
  • Ranges: The Admin adds a set of predefined ranges to the constraints.
  • Password: The Admin can reset his or her own password and can also create staff
    passwords.
  • Update: The Admin can upload additional data on staff, stages, ranges and
    constraints to the database.
  • Delete: The Admin can delete staff, stages, ranges and constraints from the
    database.
• Module 2: Doctor
The Doctor can detect whether or not a patient has CKD.
  • Upload Patient Details: The Doctor can upload a patient’s clinical data and can
    later modify the record set. This helps to monitor the patient’s clinical record.
  • View Treatment Details: After uploading treatment details, the Doctor can also
    view them.
  • Constraints: The Doctor can suggest constraints depending on the patient’s treatment.
• Module 3: Patient
In this module, the Patient can give feedback and view his/her treatment details.
  • Feedback: The Patient can provide feedback on the care shown by the clinic
    and share his/her experience through this module.
  • View Record Set: After logging in to the account, the Patient can view his/her own
    record and treatment details.
• Module 4: Receptionist
The Receptionist can register patients, upload and view patient details, and generate
billing details.
  • Register: The Receptionist can register patient information by creating a new account.
  • Upload of Data: The Receptionist can upload a patient’s clinical data.
  • Billing Details: The Receptionist can generate billing details for each patient.
  • View Patient Details: The Receptionist can view each patient’s information and
    alter it whenever needed.
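The module responsibilities listed above can be read as a role-to-action permission map. The following is a minimal sketch in Python, used here only for illustration (the project itself is built with C# and ASP.NET). The action names and the `is_allowed` helper are hypothetical labels derived from the list above and are not taken from the project's code.

```python
# Hypothetical role-to-action map derived from the module descriptions.
# The action names are illustrative labels, not identifiers from the project.
PERMISSIONS = {
    "admin": {
        "create_staff", "add_stage", "add_constraint", "add_range",
        "reset_password", "update_data", "delete_data",
    },
    "doctor": {
        "upload_patient_details", "view_treatment_details", "suggest_constraint",
    },
    "patient": {"give_feedback", "view_record_set"},
    "receptionist": {
        "register_patient", "upload_patient_data",
        "generate_bill", "view_patient_details",
    },
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the given role may perform the given action."""
    # Unknown roles get an empty permission set, so every action is refused.
    return action in PERMISSIONS.get(role.lower(), set())


if __name__ == "__main__":
    print(is_allowed("Doctor", "upload_patient_details"))
    print(is_allowed("Patient", "generate_bill"))
```

Keeping the map in one place means a page handler only needs a single `is_allowed` check before performing an action, and a new module can be added by extending the dictionary.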
3.4 Non-Functional Requirements

Non-functional requirements, also known as quality requirements,
impose constraints on the design or implementation, such as usability, reliability,
performance and scalability.
• Usability: The system is a medical application that automates kidney disease
  prediction. It is mainly used by the doctors and receptionists of hospitals, and
  as it is a browser-based application it can be accessed worldwide.
• Reliability: The application provides its services according to users’ satisfaction
  and interest, is designed as per user requirements and is user friendly, so it is
  intended to be more reliable than other medical-sector applications.
• Maintainability: As the software is updated regularly, it is easy to maintain.
  The application is designed in such a way that future modifications and
  enhancements can be made easily.
• Efficiency: The application provides efficient results as it uses a data mining
  or machine learning technique for disease prediction. A large amount of
  data is mined to obtain better results.
• Re-usability: The system is a web-based application; once a user creates an
  account, the user can access the system multiple times.
3.5 System Requirements
3.5.1 Software Requirements
• Framework: .NET
• IDE: Visual Studio 2010
• Front End: ASP.NET 4.0
• Programming language: C#.NET
3.5.2 Hardware Requirements

• RAM: 1 GB+
• Processor: Pentium 4+
• Processor Speed: 2 GHz+
• Hard disk: 20 GB+

Chapter 4
SYSTEM ANALYSIS AND DESIGN
4.1 System Analysis
System analysis is a detailed analysis of the various operations performed by a system
and their relationships within and outside the system. It is a systematic technique that
defines goals and objectives. One of the main aspects of analysis is defining the
boundaries of the system.
The system analysis study has been conducted with the following objectives:
• Identify the user’s needs.
• Evaluate the system concept for feasibility.
• Perform economic and technical analysis.
• Allocate functions to hardware, software, people, databases etc.
Tools used in analysis:

The various tools of structured analysis are:
• Use Case Diagram
• Sequence Diagram
• Activity Diagram
• Data Flow Diagram
• ER Diagram
• Database Structure
• Graphical User Interface (GUI)
The structured analysis has the following attributes:

• The data flow diagram (DFD) presents a picture of what is being
  specified and is a conceptually easy-to-understand presentation of the
  application.
• The sequence diagram shows the participants in an interaction and the sequence of
  messages among them. The system contains different modules which interact
  with each other, and the messages passed between the modules are shown by the
  sequence diagram.
• The use-case diagram shows how the modules interact with one another and with
  their users.
• The diagrams are specified in a precise, concise and highly readable manner. They
  show the working system and how its parts interact.
4.2 High-Level Design

Three tier architecture:
Three-tier architecture is a client-server architecture in which the functional process
logic, data access, computer data storage and user interface are developed and maintained
as independent modules on separate platforms. Three-tier architecture is a software design
pattern and a well-established software architecture.
Fig 4.2 Three-tier Architecture of System
Three-tier architecture allows any one of the three tiers to be upgraded or replaced
independently. The user interface is implemented on a desktop PC and uses a standard
graphical user interface, with different modules running on the application server. The
relational database management system on the database server contains the computer data
storage logic. The middle tier is usually itself multi-tiered.
The three tiers in three-tier architecture are:
• Presentation tier: Occupies the top level and displays information related to the
  services available on a website. This tier communicates with the other tiers by
  sending results to the browser and to the other tiers in the network.
• Application tier: Also called the middle tier, logic tier or business logic tier, this
  tier is separated out from the presentation tier. It controls application functionality
  by performing detailed processing.
• Data tier: Houses the database servers where information is stored and retrieved.
  Data in this tier is kept independent of the application servers and business logic.
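The separation of the three tiers can be sketched in a few lines. The example below is an illustrative Python sketch, not the project's ASP.NET code: an in-memory SQLite database stands in for the database server, and the class names, function names and the `patients` table are hypothetical.

```python
import sqlite3


class DataTier:
    """Data tier: stores and retrieves records (in-memory SQLite stand-in)."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add_patient(self, name: str) -> int:
        cur = self.conn.execute("INSERT INTO patients (name) VALUES (?)", (name,))
        self.conn.commit()
        return cur.lastrowid

    def get_patient(self, patient_id: int):
        return self.conn.execute(
            "SELECT id, name FROM patients WHERE id = ?", (patient_id,)
        ).fetchone()


class ApplicationTier:
    """Application (business logic) tier: validates input, then calls the data tier."""

    def __init__(self, data: DataTier) -> None:
        self.data = data

    def register(self, name: str) -> int:
        name = name.strip()
        if not name:
            raise ValueError("patient name must not be empty")
        return self.data.add_patient(name)

    def lookup(self, patient_id: int):
        return self.data.get_patient(patient_id)


def render_patient(app: ApplicationTier, patient_id: int) -> str:
    """Presentation tier: formats a result for display; it contains no SQL or rules."""
    row = app.lookup(patient_id)
    if row is None:
        return "Patient not found"
    return f"Patient #{row[0]}: {row[1]}"


if __name__ == "__main__":
    app = ApplicationTier(DataTier())
    new_id = app.register("Example Patient")
    print(render_patient(app, new_id))
```

Because the presentation function only talks to `ApplicationTier`, the storage engine behind `DataTier` could be replaced without touching the display code, which is the independence property described above.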
4.2.1 System Architecture:
Fig 4.2.1 System Architecture of CKD

A data mining technique is used in building our architecture, where data is a
key factor and is used at all times in the system. All the related data and
information are stored in the database (DB). A database is maintained for each module,
where it stores data values, and all functionalities work accordingly. The algorithm
is applied at each step of a module to predict the result
within the ranges and constraints that are given at the time of inserting values into the record.
The main goal of the system is to detect CKD in a patient from risk factors and
different attribute values using the Naive Bayes algorithm. The system uses four
modules, each with different functionality and operations. All functions are
defined within the system and work by taking an input value and fetching the result from
the database.
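To make the prediction step concrete, the following is a minimal sketch of a categorical Naive Bayes classifier with add-one (Laplace) smoothing, written in Python for illustration. It is not the project's implementation: the attribute names (`blood_pressure`, `sugar`), the class labels and the four toy records are invented for the example and are not clinical data.

```python
import math
from collections import Counter, defaultdict

# Invented toy records for illustration only; these are not clinical data.
TOY_RECORDS = [
    {"blood_pressure": "high", "sugar": "high"},
    {"blood_pressure": "high", "sugar": "normal"},
    {"blood_pressure": "normal", "sugar": "normal"},
    {"blood_pressure": "normal", "sugar": "normal"},
]
TOY_LABELS = ["ckd", "ckd", "notckd", "notckd"]


def train(records, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    values_seen = defaultdict(set)       # attribute -> distinct values seen
    for record, label in zip(records, labels):
        for attr, value in record.items():
            value_counts[(attr, label)][value] += 1
            values_seen[attr].add(value)
    return class_counts, value_counts, values_seen


def predict(model, record):
    """Return the class with the highest log-posterior score for the record."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior P(class)
        for attr, value in record.items():
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = value_counts[(attr, label)][value] + 1
            denominator = count + len(values_seen[attr])
            score += math.log(numerator / denominator)  # log P(value | class)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TOY_RECORDS, TOY_LABELS)
    print(predict(model, {"blood_pressure": "high", "sugar": "high"}))
```

Working in log space avoids numerical underflow when many attributes are multiplied together, and the smoothing term is a common default rather than a choice documented in this report.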
4.3 Use Cases

A use case is a list of steps defining the interactions between a system and its actors.
An actor can be a human, an external system or time.
Purpose of Use Case Diagram:

• The list of goal names provides the shortest summary of what the system will provide.
  It also provides a project planning skeleton, to be used to build initial priorities,
  estimates, team allocation and timings.
• The main success scenario of each use case provides everyone involved with an
  agreement as to what the system will basically do and what it will not do. It provides
  the context for each specific requirement, a context that is very hard to get anywhere else.
Purpose of Use Case in our project:

• In our project, the system administrator has the highest authorization among all
  stakeholders. He creates staff accounts, specifies the stages, constraints and ranges
  of CKD, and sets IDs and passwords for staff.
• The doctor is an actor who participates in multiple use cases such as uploading new
  patient details, viewing and uploading treatment details, and viewing the result
  indicating the presence or absence of CKD.
• The receptionist is an actor who participates in multiple use cases such as registering
  patients, uploading old patient details, billing, and viewing treatment details and
  patient history.
Admin:
Fig 4.3.1: Use Case diagram of Admin
(Actor: Admin. Use cases: Login, Staff, Stages, Constraint, Range, Set ID/Password,
Change Password.)

Doctor:
Fig 4.3.2: Use Case diagram of Doctor
(Actor: Doctor. Use cases: Upload, View patient data, Result, Update/delete,
Change Password.)

Receptionist:
Fig 4.3.3: Use Case diagram of Receptionist
(Actor: Receptionist. Use cases: Register, View patient details, Billing, Upload,
Change Password.)

Patient:
Fig 4.3.4: Use Case diagram of Patient
(Actor: Patient. Use cases: Feedback, View treatment details.)
4.4 Sequence Diagram

A sequence diagram is an interaction diagram that shows how processes operate with
one another and in what order. It shows object interactions
arranged in time sequence. It depicts the objects and classes involved in the scenario and
the sequence of messages exchanged between the objects needed to carry out the
functionality of the scenario.
A sequence diagram shows, as parallel vertical lines, the different processes or objects
that live simultaneously and, as horizontal arrows, the messages exchanged between them,
in the order in which they occur.
Purpose of sequence diagram in the project

In the user modules, the user logs in to the system if already registered; otherwise
the user first has to register in the system. During login the user is validated against
the data in the database. A valid user can input the file, view the
converted data and then log out from the system.
In the admin module, the admin logs in to the system and is validated. A valid
admin can manage the transactions of the users and can add, delete and update
user data by fetching it from the database; otherwise the system returns
to the login form. Finally, the admin logs out from the application.

Admin:
Fig 4.4.1: ADMIN Sequence Diagram
(Lifelines: Admin, Login, Home, Server. Messages: ID and Password, Success/Failed,
Create Staff, Set ID and Password for Staff, View Staff, Input Stages, View Stages,
Input Ranges and Constraints, View Ranges and Constraints, Upload Password, Logout.)

Doctor:
Fig 4.4.2: DOCTOR Sequence Diagram
(Lifelines: Doctor, Login, Home, Server. Messages: ID and Password, Success/Failed,
Upload new patient constraint, View new/old patient details, Upload Treatment Details,
View Result, Logout.)

Patient:
Fig 4.4.3: Patient Sequence Diagram
(Lifelines: Patient, Login, Home, Server. Messages: ID and Password, Success/Failed,
Upload Feedback, View Treatment Details, Logout.)

Receptionist:
Fig 4.4.4: RECEPTIONIST Sequence Diagram
(Lifelines: Receptionist, Login, Home, Server. Messages: ID/Password, Success/Failed,
Register patient details, Upload old patient detail, View Patient History, Logout.)

4.5 Activity Diagram

An activity diagram is a graphical representation, or flow chart, that describes the
operations of the system. It represents the flow from
one activity to another.
Symbols and Notations:

An activity diagram uses a set of defined symbols; each symbol has its own
meaning and is used wherever appropriate.
• Start symbol: Represents the beginning of a process or workflow in an
  activity diagram. It can be used by itself or with a note symbol
  that explains the starting point.
• Activity symbol: The main component of an activity diagram. These shapes indicate
  the activities that make up a modeled process.
• Connector symbol: Represented by arrowed lines that show the directional flow,
  or control flow, of the activity. An incoming arrow starts a step of an activity;
  once the step is completed, the flow continues with the outgoing arrow.
• Join symbol (synchronization bar): A thick vertical or horizontal line.
  It combines two concurrent activities and re-introduces them to a flow where
  only one activity occurs at a time.
• Fork symbol: Symbolized with multiple arrowed lines from a join. It splits a single
  activity flow into two concurrent activities.
• Decision symbol: A diamond shape; it represents the branching or merging of various
  flows, with the symbol acting as a frame or container.
• Note symbol: Allows the diagram creators or collaborators to communicate
  additional messages that don’t fit within the diagram itself.
• End symbol: Represents the completion of a process or workflow.

Explanation of Each Activity Diagram:
In our system, the workflow of each module is explained in detail using an activity
diagram with the relevant symbols and notations. Each module has one input and one
output, and its individual functions are defined within the system.
Admin: Each module has a login; using the credentials given, users can see the data
and view their related information. If the login ID does not match the password, an
invalid-login message is shown. Fig 4.5.1 shows the activity diagram of the admin module.
After logging in successfully, the admin can enter data related to staff members, stages,
constraints and ranges of the patient test record. The admin also has permission to
change his/her own password.
Receptionist: As shown in Fig 4.5.2, after logging in to the system, the receptionist
handles registration, uploading of patient information, generation of billing details and
viewing of the treatment details given by the doctor. The receptionist is responsible for
managing the account of each patient. An error message is displayed if wrong login
details are entered.
Patient: In this module, after logging in, the patient can view treatment
details, which consist of the stage of the disease the patient is suffering from, the
symptoms and the medical prescription given by the doctor. As shown in Fig 4.5.3, patients
are provided with a feedback field where they can give their thoughts and discuss their
doubts and issues.
Doctor: Using login details, the doctor is given the privilege to upload patient test
records and to give constraints and ranges depending on the initial stage of CKD detection.
Once a patient’s treatment begins, the doctor can keep track of improvement by adding
ranges of attributes to the database. The doctor can select a range for each patient,
which helps to detect the disease.

Admin:
Fig 4.5.1: Activity Diagram of Admin Module
(Flow: Login, then a Valid/Invalid decision; if valid and the user is an Admin: Staff,
Stage, Constraints, Ranges, Password.)

Receptionist:
Fig 4.5.2: Activity Diagram of Receptionist Module
(Flow: Login, then a Valid/Invalid decision; if valid and the user is a Receptionist:
Register, Upload Details, Billing, View Details.)

Patient:
Fig 4.5.3: Activity Diagram of Patient Module
(Flow: Login, then a Valid/Invalid decision; if valid and the user is a Patient:
Feedback, View Details.)

Doctor:
Fig 4.5.4: Activity Diagram of Doctor Module
(Flow: Login, then a Valid/Invalid decision; if valid and the user is a Doctor:
Upload Details, View Result, Constraints.)
4.6 Data Flow Diagram

A Data Flow Diagram (DFD) is a visual display of the flow of data through an information
system, modeling its process aspects. In general, a DFD is used as an initial step to
obtain an overview of the system, which can later be elaborated.
A DFD does not show the timing of operations or the details of how a process
runs within the system. It shows what information will be input to and output from the
system, where the data comes from and goes to, and where the data will be stored.
A DFD uses a set of symbols such as rectangles, circles and arrows, plus short text labels,
to show data inputs, outputs, storage points and the routes between each destination. With
the help of a data flow diagram, users are able to visualize how the system will operate,
what the system will accomplish, and how the system will be implemented.
In our project we have four individual modules. We explain the functionality of each
module and how it relates to the other modules according to the flow.
The Data Flow Diagram of our project consists of four modules, namely:

• Admin
• Receptionist
• Patient
• Doctor
All the above modules are interrelated through a set of processes defined
in the structured system analysis using defined symbols and notations. Each module has at
least one input for each output. The data used in the system is stored separately in the
database. As a result, by giving a relevant set of inputs we can fetch the output in
the form of data.
Admin Module:
An admin performs various roles in support of a person or group of people in a
business enterprise and manages the more routine administration tasks within an
organization or department.
In our project the admin plays an important role by handling various tasks within the
system. After getting access to the page, the admin monitors different tasks and keeps
track of patient details by inputting different stages and constraints.
This module handles operations such as inputting patient records,
updating or deleting records in the database, and monitoring and viewing the patient test
records. The admin can set ranges for the constraints and can alter a range when
necessary. He can also modify the system by adding or removing constraints.

Fig 4.6.1: Admin Dataflow Diagram
(The Admin logs in with ID/Password. Processes: Input, Update/delete, Create, Sets, View,
Change. Data flows: Stages, Staff, ID/password, Constraints, Range. Data store: Database.)
The admin has permission to add staff to the database by creating a user
ID and password for each of them, and can monitor staff behavior. Adding staff information
helps to keep track of which staff member is responsible for a patient, so that delays
can be avoided at the time of the patient’s evaluation.
If a patient’s record changes, the admin has the authority to change
the patient’s dataset in the DB. The admin can delete a dataset when
duplicate entries are found in the database. The admin is also authorized to monitor
the patient’s data to find the symptoms that indicate Chronic Kidney Disease (CKD).
Once data is uploaded to the database, the admin can view the patient record.
The admin also has access to the patient’s past record set, which makes communication
with the doctor easier, so that a prescription can be given according to the
health check-ups.
Receptionist Module:
Fig 4.6.2: Receptionist Dataflow Diagram
(The Receptionist logs in with ID/Password. Processes: Input, Update/delete, View, Change.
Data flows: Previous patient clinical data, Billing details, Patient history, Patient
registration, Password. Data store: Database.)

This module covers the patient’s previous clinical data, patient registration, patient
history and billing details.
The receptionist is responsible for creating patient registrations, updating previous
data, monitoring history, managing accounts and giving billing details to the patient.
When a patient visits the clinic, it is the job of the receptionist to register his/her
account in the clinic database. If the patient is already registered, there is no need
to create a new account in the DB. This module has access to view the previous history
of the patient’s record set.
Patient Module:
Fig 4.6.3: Patient Dataflow Diagram
(The Patient logs in with ID/Password. Processes: Input Feedback, View Treatment Details,
Change Password. Data store: Database.)

A patient has three functions: giving feedback, viewing treatment details
and changing the password. Initially, the patient is registered with a user ID and
password. Once logged in to the account, the patient can view the record details, which
contain information about his/her treatment.
In this module patients are authorized to change their password so that their data is
secured. Only they can log in to their account and provide feedback on the
clinic, on how they feel about the treatment given there and on the care taken by staff
members. Feedback helps the clinic and staff members to improve in the future
so that they can take better care of patients.
Doctor Module:
Fig 4.6.4: Doctor Dataflow Diagram
(The Doctor logs in with ID/Password. Processes: Input, View, Sets, Change. Data flows:
New patient data, Treatment data, Patient data, Stages, Result, Password.
Data store: Database.)

In this module, the doctor uploads a new patient’s clinical data and the patient’s
treatment details, and based on the test result decides whether or not the patient has CKD.
The doctor can add a new patient’s constraints to the dataset. He can also see the patient
record and treatment details, and depending on the test result the system can predict the
stage as well as whether or not the patient has CKD.
4.7 Entity – Relationship Diagram
Fig 4.7 E-R Diagram of CKD
(Labels recoverable from the figure: entities Login, Patients, Stages, Treatment and
Constraints; attributes id, Password, User Type, pid, pname, Email, tid, tname, Age, Sugar
and Serum; stages Stage 1 to Stage 3; relationships Register, Modify, Upload, Has and
Contains, with 1-to-n cardinalities.)
An Entity-Relationship Model (ER Model) is a data model for describing the data or
information aspects of a business domain or its process requirements, in an abstract way
that lends itself to ultimately being implemented in a database such as a relational database.
The main components of ER models are entities and the relationships that can exist
among them.
An Entity-Relationship Model is a systematic way of describing and defining a
business process. The process is modeled as components (entities) that are linked with
each other by relationships that express the dependencies and requirements between
them.
4.8 Database Structure:
Table 1 – Users
FIELD            TYPE                    NULL        KEY
UserID           String [varchar(20)]    Not null    Primary Key
Password         String [varchar(20)]
User Type        String [varchar(20)]
EmailId          String [varchar(50)]

Table 2 – Kidney Disease
FIELD            TYPE                    NULL        KEY
Disease ID       Int, Auto Generate      Not null    Primary Key
Disease Type     String [varchar(20)]

Table 3 – Stages
FIELD            TYPE                    NULL        KEY
Stage ID         Int, Auto Generate      Not null    Primary Key
Disease ID       Int                     Not null    Foreign Key
Stage            String [varchar(20)]

Table 4 – Attribute (constraints)
FIELD            TYPE                    NULL        KEY
Attribute ID     Int, Auto Generate      Not null    Primary Key
Attribute        String [varchar(20)]

Table 5 – Values (constraint range)
FIELD            TYPE                    NULL        KEY
Value ID         Int, Auto Generate      Not null    Primary Key
Attribute ID     Int                     Not null    Foreign Key
Value            String [varchar(20)]

Table 6 – Patients
FIELD            TYPE                    NULL        KEY
Patient ID       Int, Auto Generate      Not null    Primary Key
Patient Name     String [varchar(20)]    Not null
Patient Age      Int                     Not null
Gender           String [varchar(20)]
Marital Status   String [varchar(20)]
Occupation       String [varchar(20)]
Contact No       String [varchar(20)]
Email ID         String [varchar(50)]
Address          String [varchar(500)]
Photo            String [varchar(500)]
Admitted Date    Date Time
Stage ID         Int                     Not null    Foreign Key

Table 7 – Patient Attributes
FIELD            TYPE                    NULL        KEY
PAttributeId     Int, Auto Generate      Not null    Primary Key
Patient ID       Int                     Not null    Foreign Key
Value ID         Int                     Not null    Foreign Key

Table 8 – Treatment
FIELD            TYPE                    NULL        KEY
Treatment ID     Int, Auto Generate      Not null    Primary Key
Stage ID         Int                     Not null    Foreign Key
Treatment        String [varchar(500)]
Last Updated     Date Time

Fig 4.8 Data Structure Table
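As an illustration of how the foreign keys in these tables tie the records together, the sketch below creates four of the tables (Kidney Disease, Stages, Patients and Treatment) in an in-memory SQLite database using Python. This is an assumption-laden sketch and not the project's schema script: the project targets SQL Server, the snake_case column names are adapted from the field names above, only a subset of the Patients columns is included, and the `create_database` helper is hypothetical.

```python
import sqlite3

# Subset of the tables above; names are snake_case adaptations for illustration.
SCHEMA = """
CREATE TABLE kidney_disease (
    disease_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    disease_type VARCHAR(20)
);
CREATE TABLE stages (
    stage_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    disease_id INTEGER NOT NULL REFERENCES kidney_disease(disease_id),
    stage      VARCHAR(20)
);
CREATE TABLE patients (
    patient_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    patient_name VARCHAR(20) NOT NULL,
    patient_age  INTEGER NOT NULL,
    stage_id     INTEGER NOT NULL REFERENCES stages(stage_id)
);
CREATE TABLE treatment (
    treatment_id INTEGER PRIMARY KEY AUTOINCREMENT,
    stage_id     INTEGER NOT NULL REFERENCES stages(stage_id),
    treatment    VARCHAR(500),
    last_updated TEXT
);
"""


def create_database() -> sqlite3.Connection:
    """Create the illustrative schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    conn = create_database()
    conn.execute("INSERT INTO kidney_disease (disease_type) VALUES ('CKD')")
    conn.execute("INSERT INTO stages (disease_id, stage) VALUES (1, 'Stage 1')")
    conn.execute(
        "INSERT INTO patients (patient_name, patient_age, stage_id) "
        "VALUES ('Example Patient', 50, 1)"
    )
    print(conn.execute("SELECT patient_name, stage_id FROM patients").fetchall())
```

With the foreign keys enforced, inserting a patient whose `stage_id` does not exist in `stages` is rejected by the database, so a patient record always points at a defined stage.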

Chapter 5
IMPLEMENTATION
Implementation can be described as the realization of an application, or the execution
of a plan, idea, model, design, specification, standard, algorithm, or policy. In computer
science, an implementation is the realization of a technical specification or
algorithm as a program, a software component, or any other computer system through
computer programming and deployment. Many implementations may exist for a given
specification or standard.
5.1 Overview of .NET:
.NET Framework: The .NET Framework is a computing platform that
simplifies application development in the highly distributed environment of the Internet.
The .NET Framework is designed to fulfill the following objectives:
• To provide a consistent object-oriented programming environment whether object code
  is stored and executed locally, executed locally but Internet-distributed, or executed
  remotely.
• To provide a code-execution environment that minimizes software deployment
  and versioning conflicts.
• To provide a code-execution environment that guarantees safe execution of code,
  including code created by an unknown or semi-trusted third party.
• To provide a code-execution environment that eliminates the performance
  problems of scripted or interpreted environments.
• To make the developer experience consistent across widely varying types of
  applications, such as Windows-based applications and Web-based applications.
• To build all communication on industry standards to ensure that code based on the
  .NET Framework can integrate with any other code.
The .NET Framework has two main components:

• The common language runtime
• The .NET Framework class library
The common language runtime is the foundation of the .NET Framework. You can
think of the runtime as an agent that manages code at execution time, providing core
services such as memory management, thread management, and remoting, while also
enforcing strict type safety and other forms of code accuracy that ensure security and
robustness. In fact, the concept of code management is a fundamental principle of the
runtime. Code that targets the runtime is known as managed code, while code that does
not target the runtime is known as unmanaged code.
The class library, the other main component of the .NET Framework, is a
comprehensive, object-oriented collection of reusable types that you can use to develop
applications ranging from traditional command-line or graphical user interface (GUI)
applications to applications based on the latest innovations provided by ASP.NET,
such as Web Forms and XML Web services.
5.2 Introduction to ASP.NET
ASP.NET is a unified web development platform that provides the services
necessary for you to build enterprise-class web applications. While ASP.NET is largely
syntax compatible with Active Server Pages (ASP), it provides a new programming model
and infrastructure that allow you to create a powerful new class of applications. ASP.NET
is part of the .NET Framework and allows you to take full advantage of the features of the
common language runtime, such as type safety, inheritance, language interoperability and
versioning.
ASP.NET is supported on Windows 2000 (Professional, Server and Advanced
Server), Windows XP Professional and the Windows Server 2003 family for both client
and server applications.
In addition, to develop ASP.NET server applications, the following software is also
required:
• Windows 2000 Server or Advanced Server with Service Pack 2, Windows
  XP Professional or 64-Bit Edition, or one of the Windows Server 2003
  family products.
• MDAC 2.7 for data access.
• Internet Information Services.

5.3 Introduction to C#:

C# (pronounced as ‘C Sharp’) is a new computer-programming language
developed by Microsoft Corporation, USA. C# is a fully object-oriented language
like Java and is the first component-oriented language. It has been designed to support
the key features of the .NET Framework, the new development platform of Microsoft for
building component-based software solutions. It is a simple, efficient, productive and
type-safe language derived from the popular C and C++ languages. Although it belongs
to the family of C/C++, it is a purely object-oriented, modern language suitable for
developing Web-based applications.
C# is designed for building robust, reliable and durable components to handle
real-world applications. Major highlights of C# are:
• It is a brand new language derived from the C/C++ family.
• It simplifies and modernizes C++.
• It is the only component-oriented language available today.
• It is the only language designed for the .NET Framework.
• It is a concise, lean and modern language.
• It combines the best features of many commonly used languages: the
  productivity of Visual Basic, the power of C++ and the elegance of Java.
• It is intrinsically object-oriented and web-enabled.
• It has a lean and consistent syntax.
• It embodies today’s concern for simplicity, productivity and robustness.
• It will become the language of choice for .NET programming.
• Major parts of the .NET Framework are actually coded in C#.
ADO.Net - Database Connectivity:

Most applications need data access at some point, making it a crucial
component when working with applications. Data access means making the application
interact with a database, where all the data is stored. Different applications have
different requirements for database access. ASP.NET uses ADO.NET (ActiveX Data Objects
for .NET) as its data access and manipulation technology, which also enables us to work
with data on the Internet. Data access in ADO.NET relies on two components: the DataSet
and the Data Provider.

1. DataSet
The DataSet is a disconnected, in-memory representation of data. It can be
considered a local copy of the relevant portions of the database. The DataSet is
held in memory, and the data in it can be manipulated and updated independently of the
database. When work with the DataSet is finished, the changes can be written back to the
central database. The data in a DataSet can be loaded from any valid data
source, such as a Microsoft SQL Server database, an Oracle database or a Microsoft
Access database.
2. Data Provider
The Data Provider is responsible for providing and maintaining the connection to
the database. A Data Provider is a set of related components that work together to
provide data in an efficient and performance-driven manner. The .NET Framework
currently comes with two Data Providers: the SQL Data Provider, which is designed only
to work with Microsoft’s SQL Server 7.0 or later, and the OleDb Data Provider, which
allows us to connect to other types of databases such as Access and Oracle. Each Data
Provider consists of the following component classes:
• The Connection object provides a connection to the database.
• The Command object is used to execute a command.
• The DataReader object provides a forward-only, read-only, connected record set.
• The DataAdapter object populates a disconnected DataSet with data and performs
  updates.
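The disconnected pattern described above (fill a local copy, close the connection, edit the copy, then reconnect to write the changes back) can be illustrated as follows. This is an analogy written in Python with the standard `sqlite3` module, not ADO.NET code: the `fill` and `update` helpers loosely mirror what a DataAdapter does for a DataSet, and the `patients` table, the temporary database file and all function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def fill(db_path):
    """Open a connection, copy the rows into memory, then close the connection."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT id, name FROM patients ORDER BY id").fetchall()
    finally:
        conn.close()
    return [{"id": row[0], "name": row[1]} for row in rows]


def update(db_path, rows):
    """Reconnect and write the in-memory changes back to the database."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an error is raised
            conn.executemany(
                "UPDATE patients SET name = :name WHERE id = :id", rows
            )
    finally:
        conn.close()


def demo():
    """Run one fill, edit and update cycle against a temporary database file."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        conn = sqlite3.connect(db_path)
        with conn:
            conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT)")
            conn.execute("INSERT INTO patients (name) VALUES ('old name')")
        conn.close()

        local_copy = fill(db_path)          # no connection is open after this call
        local_copy[0]["name"] = "new name"  # edit the copy while disconnected
        update(db_path, local_copy)         # push the change back
        return fill(db_path)
    finally:
        os.remove(db_path)


if __name__ == "__main__":
    print(demo())
```

The point of the pattern is that the connection is held only for the short fill and update steps, which matters for a web application where many users share a limited pool of database connections.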
5.4 Introduction to SQL Server:

Microsoft SQL Server is a full-featured relational database management system
(RDBMS) that offers a variety of administrative tools to ease the burdens of database
development, maintenance and administration. Among the more frequently used tools are
Enterprise Manager, Query Analyzer, SQL Profiler, Service
Manager, Data Transformation Services and Books Online.
Enterprise Manager is the main administrative console for SQL Server
installations. It provides you with a graphical "birds-eye" view of all of the SQL Server
installations on your network. You can perform high-level administrative functions that
affect one or more servers, schedule common maintenance tasks or create and modify the
structure of individual databases.

Query Analyzer offers a quick and dirty method for performing queries against
any of your SQL Server databases. It's a great way to quickly pull information out of a
database in response to a user request, test queries before implementing them in other
applications, create/modify stored procedures and execute administrative tasks.
SQL Profiler provides a window into the inner workings of your database. You
can monitor many different event types and observe database performance in real time.
SQL Profiler allows you to capture and replay system "traces" that log various activities.
It's a great tool for optimizing databases with performance issues or troubleshooting
particular problems.
Service Manager is used to control the MSSQLServer (the main SQL Server
process), MSDTC (Microsoft Distributed Transaction Coordinator) and SQLServerAgent
processes. An icon for this service normally resides in the system tray of machines
running SQL Server. You can use Service Manager to start, stop or pause any one of
these services.
Data Transformation Services (DTS) provide an extremely flexible method for
importing and exporting data between a Microsoft SQL Server installation and a large
variety of other formats. The most commonly used DTS application is the "Import and
Export Data" wizard found in the SQL Server program group.
5.5 Reason for choosing .NET:

5.5.1 Limitations of C:
• C developers are forced to contend with manual memory management.
• Ugly pointer arithmetic.
• C is a structured programming language without object-oriented features.
• Programmers need a thorough knowledge of good programming techniques.
5.5.2 Limitations of C++:
• C++ can be thought of as an object-oriented layer on top of C.
• It involves manual memory management.
• Ugly pointer arithmetic.
• Ugly syntactical constructs.

5.5.3 Limitations of JAVA/J2EE:

• Java programmers must use Java front to back during the development cycle.
• It is not appropriate for many graphically or numerically intensive applications.
.NET provides a solution to all the above-mentioned problems.
5.6 DATA MINING TECHNIQUES:

5.6.1 Classification Rules:
Classification is a process of finding a model (or function) that describes and
distinguishes data classes or concepts. The model is derived based on the analysis of a set
of training data (i.e., data objects for which the class labels are known).
5.6.2 Naive Bayes Algorithm

Naive Bayes is a probabilistic classifier based on Bayes theorem with strong
independence assumptions between the features. Bayes theorem provides a way of
calculating the posterior probability, P(c|x), from P(c), P(x), and P(x|c). Naive Bayes
classifier assumes that the effect of value of predictor (x) on the given class (c) is
independent of the values of other predictors. This assumption is called class conditional
independence.
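For reference, the relationship described above can be written out explicitly. This is the standard form of Bayes' theorem and of the class conditional independence assumption; the symbols x_1, ..., x_k for the individual predictors are notation introduced here for the illustration.

```latex
P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)},
\qquad
P(c \mid x_1, \dots, x_k) \;\propto\; P(c) \prod_{i=1}^{k} P(x_i \mid c)
```

Because P(x) is the same for every class, only the product on the right needs to be compared across classes when classifying a record.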
Naive Bayes Algorithm Steps:
Step 1: Scan the dataset (storage servers).
Retrieve the data required for mining from sources such as a database, cloud
storage or an Excel sheet.
Step 2: Calculate the probability of each constraint value (using n, n_c, m and p).
For each attribute value of the new record, calculate the probability of occurrence
using the formula given in Step 3. The formula is applied once for each class (disease).
Step 3: Apply the formula
P (constraint value (a_i) | subject value (v)) = (n_c + m*p) / (n + m)
Where: n = the number of training examples that have the constraint value a_i
 n_c = the number of those examples that belong to the class v
 p = a prior estimate for P (in the sample example, 1 / number of classes)
 m = the equivalent sample size (in the sample example, the number of attributes)

Step 4: Multiply the probabilities by p

For each class, multiply the probabilities of all the attribute values together with p;
the resulting products are used for classification.
Step 5: Compare the products and assign the new record to the class with the highest
value.
Sample Example
Attributes (Constraints) – S1, S2, S3 [m=3]
Subject (Disease) – CKD, NOT CKD [p=1/2=0.5]
Training Dataset
Patient Name | S1 (X,Y,Z) | S2 (A,B,C) | S3 (P,Q,R) | Disease (subject)
Anil | X | A | P | CKD
Ajay | X | B | Q | CKD
Arun | Y | B | P | NOT CKD
Kumar | Z | A | R | CKD
Naveen | Z | C | R | NOT CKD
New Patient data – Akash: Constraints (S1-X, S2-A, S3-R), Disease – CKD / NOT CKD?
For each constraint value and each class, P = [n_c + (m*p)] / (n+m) with m=3 and p=0.5:
Constraint value | Class | n | n_c | P
X | CKD | 2 | 2 | [2+(3*0.5)]/(2+3) = 0.7
X | NOT CKD | 2 | 0 | [0+(3*0.5)]/(2+3) = 0.3
A | CKD | 2 | 2 | [2+(3*0.5)]/(2+3) = 0.7
A | NOT CKD | 2 | 0 | [0+(3*0.5)]/(2+3) = 0.3
R | CKD | 2 | 1 | [1+(3*0.5)]/(2+3) = 0.5
R | NOT CKD | 2 | 1 | [1+(3*0.5)]/(2+3) = 0.5
CKD – 0.7 * 0.7 * 0.5 * 0.5 (p) = 0.1225
NOT CKD – 0.3 * 0.3 * 0.5 * 0.5 (p) = 0.0225
Since CKD > NOT CKD, the new patient is classified as CKD.
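The sample example can be reproduced with a short sketch. It follows the counting convention used in the example (n counts the training rows that have the attribute value, n_c counts those of them that belong to the class), which differs from the usual textbook m-estimate, where n counts the rows of the class. The names TRAINING, m_estimate, score and classify are hypothetical and chosen only for this illustration.

```python
# Training rows from the sample example: (S1, S2, S3, disease).
TRAINING = [
    ("X", "A", "P", "CKD"),      # Anil
    ("X", "B", "Q", "CKD"),      # Ajay
    ("Y", "B", "P", "NOT CKD"),  # Arun
    ("Z", "A", "R", "CKD"),      # Kumar
    ("Z", "C", "R", "NOT CKD"),  # Naveen
]
CLASSES = ["CKD", "NOT CKD"]
M = 3                 # equivalent sample size (number of attributes in the example)
P = 1 / len(CLASSES)  # uniform prior estimate, 0.5


def m_estimate(attr_index, value, label):
    """Return (n_c + m*p) / (n + m), counting n and n_c as in the example."""
    matching = [row for row in TRAINING if row[attr_index] == value]
    n = len(matching)
    n_c = sum(1 for row in matching if row[3] == label)
    return (n_c + M * P) / (n + M)


def score(patient, label):
    """Multiply the per-attribute estimates together with the prior p."""
    result = P
    for attr_index, value in enumerate(patient):
        result *= m_estimate(attr_index, value, label)
    return result


def classify(patient):
    """Return the class with the highest product, plus all products."""
    scores = {label: score(patient, label) for label in CLASSES}
    return max(scores, key=scores.get), scores
```

Calling classify(("X", "A", "R")) for the new patient Akash yields products of about 0.1225 for CKD and 0.0225 for NOT CKD, so the record is assigned to CKD.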
5.6.3: C4.5 Algorithm
C4.5 is one of the best-known algorithms in data mining. It was developed by
Ross Quinlan. In this project, the C4.5 algorithm has been implemented to predict the stage of
CKD of the patients based on clinical test constraints.
C4.5 Algorithm Steps:
Step 1: Scan the dataset (storage servers)
Step 2: for each attribute a, calculate the gain [number of occurrences]
Step 3: Let a_best be the constraint of highest gain [highest count]
Step 4: Create a decision node based on a_best – retrieval of nodes [patient] where the
attribute values matches with a_best.
Step 5: recur on the sub-lists [list of patient] and calculate the count of outcomes [Stages]
– termed as sub nodes. Based on the highest count we classify the new node.
Sample Example
Attributes (Features) – F1, F2, F3 [m=3]
Subject (stages) – S1, S2 [p=1/2=0.5]

Training Dataset
Name F1(X,Y,Z) F2(A,B,C) F3(P,Q,R) Stage (subject)
Anil X A P S1
Kumar X B Q S1
Ajay Y B P S2
Naveen Z A R S1
Akash Z A Q S2
New Patient Features – Akul F1-X, F2-A, F3-R Which Stage - ?

Feature Count (X) in the dataset = 2
Feature Count (A) in the dataset = 3
Feature Count (R) in the dataset = 1
Sort ();
Feature Count
A 3
X 2
R 1
A – S1 (2) & S2 (1);
Output
Stage Priority
S1 2
S2 1
Since S1 has the highest count, the new patient Akul is diagnosed with CKD in stage S1.
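The count-based procedure in the steps above can be sketched as follows. This is a minimal illustration of the simplified procedure as it is described in this section (highest occurrence count, then highest stage count); it is not a full C4.5 implementation, which selects attributes by an entropy-based gain ratio. The names TRAINING and predict_stage are hypothetical.

```python
from collections import Counter

# Training rows from the sample example: (name, F1, F2, F3, stage).
TRAINING = [
    ("Anil", "X", "A", "P", "S1"),
    ("Kumar", "X", "B", "Q", "S1"),
    ("Ajay", "Y", "B", "P", "S2"),
    ("Naveen", "Z", "A", "R", "S1"),
    ("Akash", "Z", "A", "Q", "S2"),
]


def predict_stage(features):
    """Predict a stage for a new patient given the tuple (F1, F2, F3)."""
    # Step 2: count how often each of the new patient's values occurs.
    counts = {}
    for index, value in enumerate(features):
        counts[(index, value)] = sum(
            1 for row in TRAINING if row[index + 1] == value
        )

    # Step 3: pick the feature value with the highest count (a_best).
    best_index, best_value = max(counts, key=counts.get)

    # Step 4: retrieve the patients whose value matches a_best.
    matching = [row for row in TRAINING if row[best_index + 1] == best_value]

    # Step 5: count the stages among them and return the most frequent one.
    stage_counts = Counter(row[4] for row in matching)
    return stage_counts.most_common(1)[0][0], dict(stage_counts)
```

For the new patient Akul, predict_stage(("X", "A", "R")) selects the value A (count 3) and returns stage S1 with the counts S1 = 2 and S2 = 1.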

5.7 Pseudo code
5.7.1 Pseudo code for Login
main ()
{
LOGIN ();
Admin ();
Receptionist ();
Doctor ();
Patient ();
}
LOGIN ()
{
GET User_ID/Email_Id;
GET Password;
If (User_ID==entered User_ID and Password==entered Password)
{
User_type is fetched from Database
LOGIN SUCCESSFUL
If (User_type==1)
Admin ();
Else If (User_type==2)
Receptionist ();
Else If (User_type==3)
Doctor ();
Else
Patient ();
}
Else
LOGIN FAILED
}
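The login flow above can be sketched in runnable form. This is only an illustration under stated assumptions: the user store is an in-memory dictionary instead of the project database, the type codes 3 and 4 for Doctor and Patient are assumed (the pseudo code only shows 1 and 2), the sample account is invented, and the password is stored as a salted PBKDF2 hash, which the pseudo code does not specify.

```python
import hashlib
import hmac

# User type codes; 1 and 2 follow the pseudo code, 3 and 4 are assumptions.
ROLES = {1: "Admin", 2: "Receptionist", 3: "Doctor", 4: "Patient"}


def hash_password(password, salt):
    """Derive a salted hash so that plain passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def make_user(password, user_type, salt=b"demo-salt"):
    """Build a hypothetical user record for the in-memory store."""
    return {
        "salt": salt,
        "password_hash": hash_password(password, salt),
        "user_type": user_type,
    }


# Hypothetical stand-in for the database table of registered users.
USERS = {"doctor@example.com": make_user("secret", 3)}


def login(users, user_id, password):
    """Return the role name on success, or None if the login fails."""
    record = users.get(user_id)
    if record is None:
        return None  # user is not registered
    candidate = hash_password(password, record["salt"])
    if not hmac.compare_digest(candidate, record["password_hash"]):
        return None  # wrong password
    return ROLES[record["user_type"]]
```

A caller would route to the matching home page based on the returned role and show the login failure message when the result is None.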

5.7.2 Pseudo code for Admin
Admin()
{
add_staff();
Stages ();
Constraint ();
Values ();
Account ();
}
add_staff()
{
GET User_type;
GET User_Id;
GET password;
GET Email_Id;
if(User_Id==entered User_Id)
User_Id already exist
else
staff is added
}
Stages ()
{
GET Stage;
If (Stage==entered stage)
Stage exists
Else
Stage is added
}
Constraint ()
{
GET Constraint;
if(Constraint==entered constraint)
constraint already exists
else
New constraint is added
}
Values ()
{
GET values;
if(value<=constraint(max value) and value>=constraint(min value))
Value is accepted
else
Invalid
}
Account()
{
GET old_password;
GET new_password;
GET confirm_password;
if(old_password==existing password && new_password==confirm_password)
password changed successfully
else
unsuccessful
}
5.7.3 Pseudo code for Doctor:
Doctor ()
{
Upload_patientdetails();
View_patientdetails();
Result();
Treatment_details();
}
Upload_patientdetails()
{

Get patient_details();
Add_constraints();
}
Add_constraints()
{
Get patient_constraints();
}
View_patientdetails()
{
Display_patientdetails();
}
Result()
{
If(result==CKD)
{
Patient has CKD;
If(stage==stage1)
Patient has stage1;
Else If(stage==stage2)
Patient has stage2;
Else If(stage==stage3)
Patient has stage3;
Else If(stage==stage4)
Patient has stage4;
Else
Patient has stage5;
}
Else
Patient does not have CKD;
}
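Treatment_details ()
{
If(stage==stage1)
Display treatment_details_of_stage1;
Else If(stage==stage2)
Display treatment_details_of_stage2;
Else If(stage==stage3)
Display treatment_details_of_stage3;
Else If(stage==stage4)
Display treatment_details_of_stage4;
Else
Display treatment_details_of_stage5;
}
Account()
{
GET old_password;
GET new_password;
GET confirm_password;
if(old_password==existing password && new_password==confirm_password)
password changed successfully
else
unsuccessful
}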
5.7.4. Pseudo code for receptionist:
Receptionist()
{
Upload_patientdetails();
Billing();
Account();
}
Upload_patientdetails()
{
Get patient_details;
Add_constraints();
}
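Add_constraints()
{
Get patient_constraints;
}

Account()
{
GET old_password;
GET new_password;
GET confirm_password;
if(old_password==existing password && new_password==confirm_password)
password changed successfully
else
unsuccessful
}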
5.7.5 Pseudo code for patient:
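Patient ()
{
View_treatmentdetails();
Give_feedback();
Account();
}
View_treatmentdetails()
{
Get treatment_details;
}
Give_feedback()
{
Upload_feedback;
}

Account()
{
GET old_password;
GET new_password;
GET confirm_password;
if(old_password==existing password && new_password==confirm_password)
password changed successfully
else
unsuccessful
}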
5.8 Advantages
• The proposed system is a medical sector application that automates the prediction of
Chronic Kidney Disease.
• Reduces the time required to analyze test results and predict CKD.
• Reduces kidney failure due to diabetes.
• Reduces the cardiovascular death rate for persons on dialysis.
• Reduces the total number of deaths for persons with a functioning kidney
transplant.
• We have developed and performed an internal validation of five models for CKD
progression from stage I to stage V. Our models leverage different types of
variables (demographic, laboratory and/or clinical documentation data) that are
collected routinely during the course of clinical care as part of the electronic health
record, as well as the longitudinal aspect of the records as encoded through
filters.
• In the absence of laboratory and documentation information, the simplest model
(eGFR model) identifies that low eGFR at the time of CKD stage III diagnosis is
associated with higher risk of progression. Furthermore, younger patients with
impaired kidney function (stage III CKD) progress more rapidly toward stage V
CKD. Consistent with current knowledge of CKD, male gender was found to be
associated with more rapid loss of eGFR, and the laboratory test models (RLT)
identified laboratory data known to be associated with CKD progression.
• We found that text is a valuable predictor for CKD progression and that the use of
time series models to characterize patient state can substantially improve
predictive accuracy for progression. In particular, the model which incorporated
demographic, laboratory, and clinical documentation data had the highest
concordance of the models considered.
• Risk prediction in CKD has been studied extensively, with dozens of available
risk models with acceptable performance (discrimination 0.56–0.94). Most
developed classifiers use readily obtainable information, including age,
demographics, and laboratory data. Hence, laboratory data, comorbidities, and
occasional vital signs are the sole dimensions of contemporary CKD classifiers.
Age, sex, and eGFR are included in almost all models, but fewer than half use
proteinuria (qualitative assessment or quantitative proteinuria or albuminuria),
serum creatinine, serum albumin, or blood pressure.
5.9 Limitation
• Because our dataset consists of a non-curated, real-world set of patient characteristics,
as recorded through clinical care, there is some potential noise in the collected
variables. For instance, given the lack of high quality information about ethnicity, we
cannot assess which ethnic groups are well represented in our dataset. This fact may
introduce noise in the GFR calculations.
• The models we designed and validated are based on data from a single institution.
While there is value in focusing on a single institution at a time (the risk predictions
are relevant to the characteristics of the institution’s patient population for instance),
the model validity and its generalizability would be better demonstrated over data
from several institutions.
• In particular, because of the potential variations in clinical vocabulary and overall
language in the documentation across different institutions, there would likely be a
benefit to generalizing the risk model to patient records from other institutions. Our
study requires longitudinal documentation (both inpatient and outpatient notes over
many years, for a large set of patient records). Since there are no publicly available
datasets (even de-identified) with these properties, extending this study to other
datasets is outside the scope of this study and an important limitation of the work.
• Short of training a model for data from different institutions, the models presented in
this study are in theory portable to different institutions. In particular, the
unsupervised NLP techniques described here (topic modeling) are actually conducive
to such an approach, as they identify patterns in the language of any given corpus
without any prior knowledge of the topics or vocabulary to expect. To address the
potential differences in language from one institution to another, the topic models
would have to be learned on documentation from the new institutions.

Chapter 6
TESTING
Testing is the process of evaluating a system or its component(s) with the intent of
finding out whether it satisfies the specified requirements. This activity yields the
actual results, the expected results and the difference between them. In simple words, testing is executing
a system in order to identify any gaps, errors or missing requirements relative to the
actual requirements.
Testing is the practice of making objective judgments regarding the extent to
which the system (device) meets, exceeds or fails to meet stated objectives.
6.1 Purpose of testing

Testing is used to provide customers with bug-free and reliable software. The
software should not cause problems while in use, and software testing is conducted so
that it can be used efficiently. Software is costly to develop, and a customer who faces
problems while using it may incur huge losses; testing is conducted to avoid such
losses.
Testing is done to analyze whether the application developed meets the
requirements. The main purpose of testing is to check for the existence of defects or errors
in a program, project or product, based on predefined instructions or
conditions.
The following are some of the important reasons for which testing of an application is required:
• To reduce the number of bugs in the code.
• To provide a quality product.
• To verify whether all the requirements are met.
• To satisfy the customer’s needs.
• To provide bug-free software.
• To establish the reliability of the software.
• To prevent the user from encountering problems.
• To verify that it behaves “as specified”.
• To validate that what has been specified is what the user actually wanted.

6.2 Black Box Testing
Black box testing is the technique of testing without any knowledge of the interior workings of
the application. The tester is oblivious to the system architecture and
does not have access to the source code. Typically, when performing a black box test, a
tester will interact with the system's user interface by providing inputs and examining
outputs without knowing how and where the inputs are worked upon.
6.3 White Box Testing
White box testing is the detailed investigation of the internal logic and structure of the
code. White box testing is also called glass box testing or open box testing. In order to
perform white box testing on an application, the tester needs to possess knowledge of the
internal working of the code. The tester needs to look inside the source code and
find out which unit/chunk of the code is behaving inappropriately.
6.4 Different Levels of Testing
Fig 6.4: Different Levels of Testing

6.4.1 Unit Testing

Unit Testing is a level of the software testing process where individual
units/components of a software/system are tested. The purpose is to validate that each unit
of the software performs as designed.
This type of testing is performed by the developers (White Box Testing) before
the setup is handed over to the testing team to formally execute the test cases. Each
developer performs unit testing on the individual units of source code assigned to
them. The developers use test data that is separate from the test data of the quality
assurance team.
The goal of unit testing is to isolate each part of the program and show that
individual parts are correct in terms of requirements and functionality.
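As an illustration of unit testing, the sketch below tests one small unit in isolation using Python's standard unittest module. The unit under test, value_in_range, is a hypothetical stand-in for the range check applied to constraint values; it is not taken from the project code, and the numeric ranges in the tests are arbitrary examples.

```python
import unittest


def value_in_range(value, min_value, max_value):
    """Hypothetical unit under test: True if value lies within [min, max]."""
    return min_value <= value <= max_value


class ValueInRangeTest(unittest.TestCase):
    """Each test method checks one behaviour of the unit in isolation."""

    def test_accepts_value_inside_range(self):
        self.assertTrue(value_in_range(5, 1, 10))

    def test_accepts_boundary_values(self):
        self.assertTrue(value_in_range(1, 1, 10))
        self.assertTrue(value_in_range(10, 1, 10))

    def test_rejects_value_outside_range(self):
        self.assertFalse(value_in_range(0, 1, 10))
        self.assertFalse(value_in_range(11, 1, 10))


def run_tests():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ValueInRangeTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test exercises a single expectation (inside the range, on the boundaries, outside the range), so a failure points directly at the behaviour that is broken.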
6.4.2 Integration Testing

Integration Testing is a level of the software testing process where individual units
are combined and tested as a group. The purpose of this level of testing is to expose faults
in the interaction between integrated units. The testing of combined parts of an
application to determine if they function correctly together is Integration testing.
6.4.3 System Testing

This is the next level in the testing and tests the system as a whole. Once all the
components are integrated, the application as a whole is tested rigorously to see that it
meets Quality Standards. This type of testing is performed by a specialized testing team.
System Testing is a level of the software testing process where a complete, integrated
system/software is tested. The purpose of this test is to evaluate the system’s compliance
with the specified requirements.
6.4.4 Acceptance Testing

Acceptance testing or User Acceptance Testing is a level of the software testing
process where a system is tested for acceptability. The purpose of this test is to evaluate
the system’s compliance with the business requirements and assess whether it is
acceptable for delivery.

6.4.5 TEST CASES
TEST CASE ID | DESCRIPTION | INPUT | EXPECTED RESULT | ACTUAL OUTPUT | STATUS (P/F) | COMMENT
TC001 | Home page | Execute and run the application | The application should run without interrupts | Application is running successfully | Pass |
TC002 | Home page display | http://localhost:5219/Login.aspx | Home page has to be displayed | Home page displayed | Pass |
TC003 | View User login | Click on User Login link | User login page should be displayed | User login page displayed successfully | Pass |
TC004 | Validation of Email Textbox | Email=" ", Login Button Click | Required field validator has to be displayed (*) | Required field validator has been displayed (*) | Pass |
TC005 | Validation of Email Textbox | Email="anil@gmail.com", Login Button Click | Required field validator shouldn't be displayed (*) | Required field validator has not been displayed | Pass |
TC006 | Validation of Password Textbox | Password=" ", Login Button Click | Required field validator has to be displayed (*) | Required field validator has been displayed (*) | Pass |
TC007 | Validation of Password Textbox | Password="anil@gmail.com", Login Button Click | Required field validator shouldn't be displayed (*) | Required field validator has not been displayed (*) | Pass |
TC008 | If the user is not registered one | Email="abc@Gmail.Com", Password=**** | Has to be logged in as User and navigate to login.aspx page | Error Message Display | Pass | User not found
TC009 | Registered user logs in to system by entering email and password (if user is admin) | Email=abc@gmail.com, Password=**** | Has logged in as Admin and navigate to admin homepage | Successful login | Pass |
TC010 | Registered user logs in to system by entering email and password (if user is doctor) | Email=abc@gmail.com, Password=**** | Has logged in as Doctor and navigate to doctor homepage | Successful login | Pass |
TC011 | Registered user logs in to system by entering email and password (if user is receptionist) | Email=abc@gmail.com, Password=**** | Has logged in as Receptionist and navigate to receptionist homepage | Successful login | Pass |
TC012 | Registered user logs in to system by entering email and password (if user is patient) | Email="abc@gmail.com", Password=**** | Has logged in as Patient and navigate to patient homepage | Successful login | Pass |
TC013 | Login button working | Login button click | Has to login to respective home pages | Didn't login | Fail | Login button not working
TC014 | Login button working | Login button click | Has to login to respective home pages | Login to respective home pages based on user type | Pass |
TC015 | Validation of submit button | Submit button click | Data will be stored in the database and message is displayed | Database was not found | Fail | Back end database is not found
TC016 | Validation of submit button | Submit button click | Data will be stored in the database and message is displayed | Data stored in the database and message is displayed | Pass |
TC017 | View add staff | Click on add staff link from admin home page | Add staff page has to be displayed | Add staff page was not displayed | Fail | error message displayed
TC018 | View add staff | Click on add staff link from admin home page | Add staff page has to be displayed | Add staff page displayed | Pass |
TC019 | View disease type | Click on disease type link from the admin home page | Add disease type page has to be displayed | Add disease type page displayed successfully | Pass |
TC020 | Setting the type of disease | Admin enters the type of diseases | It should accept the disease type and store it in the database | Disease type is accepted and is saved successfully | Pass |
TC021 | View add constraints | Click on add constraints link from admin home page | Add constraints page has to be displayed | Add constraints page was not displayed | Fail | error message displayed
TC022 | View add constraints | Click on add constraints link from admin home page | Add constraints page has to be displayed | Add constraints page displayed | Pass |
TC023 | View add ranges | Click on add ranges link from admin homepage | Add ranges page has to be displayed | Add ranges page was not displayed | Fail | error message displayed
TC024 | View Stages | Click on Stages link in the admin home page | Add stages page has to be displayed | Add stages page is displayed successfully | Pass |
TC025 | Setting the different type of stages | Admin enters the different stages | It should accept the different stages and store it in the database | Different stages are accepted and saved successfully | Pass |
TC026 | View add ranges | Click on add ranges link from admin homepage | Add ranges page has to be displayed | Add ranges page displayed | Pass |
TC027 | View add stages | Click on add stages link from admin homepage | Add stages page has to be displayed | Add stages page was not displayed | Fail | error message displayed
TC028 | View add stages | Click on add stages link from admin homepage | Add stages page has to be displayed | Add stages page displayed | Pass |
TC029 | View change password | Click on change password link from admin homepage | Change password page has to be displayed | Change password page was not displayed | Fail | error message displayed
TC030 | View change password | Click on change password link from admin homepage | Change password page has to be displayed | Change password page displayed | Pass |
TC031 | View treatment details | Click on treatment details link from patient homepage | Treatment details page has to be displayed | Treatment details page was not displayed | Fail | error message displayed
TC032 | View treatment details | Click on treatment details link from patient homepage | Treatment details page has to be displayed | Treatment details page displayed | Pass |
TC033 | View feedback | Click on feedback link from patient homepage | Feedback page has to be displayed | Feedback page was not displayed | Fail | error message displayed
TC034 | View feedback | Click on feedback link from patient homepage | Feedback page has to be displayed | Feedback page displayed | Pass |
TC035 | | Click on change ... patient homepage | | | |
TC036 | | Click on change password link from patient homepage | | | Pass |
TC037 | View patient details | Click on patient details link from doctor homepage | Patient details page has to be displayed | Patient details page was not displayed | Fail | error message displayed
TC038 | View patient details | Click on patient details link from doctor homepage | Patient details page has to be displayed | Patient details page displayed | Pass |
TC039 | View upload treatment | Click on upload treatment link from doc page | Upload treatment page has to be displayed | Upload treatment page was not displayed | Fail | error message displayed
TC040 | View upload treatment | Click on upload treatment link from doctor homepage | Upload treatment page has to be displayed | Upload treatment page displayed | Pass |
TC041 | View reporting | Click on reporting link from the doctor home page | Generate report page has to be displayed | Generate Report page is displayed successfully | Pass |
TC042 | Generating Report | Doctor sets the particular disease type and stage to generate report | The Report should be generated with the particular disease type and stage | Report is generated successfully | Pass |
TC043 | View result | Click on view result link from doctor homepage | Result page has to be displayed | Result page was not displayed | Fail | error message displayed
TC044 | View result | Click on view result link from doctor homepage | Result page has to be displayed | Result page displayed | Pass |
TC045 | View add constraints | Click on add constraints link from doctor homepage | Add constraints page has to be displayed | Add constraints page was not displayed | Fail | error message displayed
TC046 | View add constraints | Click on add constraints link from doctor home page | Add constraints page has to be displayed | Add constraints page displayed | Pass |
TC047 | | Click on change ... doctor homepage | | | |
TC048 | | Click on change ... doctor homepage | | | |
TC049 | View patient registration | Click on patient registration link from receptionist homepage | Patient registration has to be displayed | Patient registration page was not displayed | Fail | error message displayed
TC050 | View patient registration | Click on patient registration link from receptionist homepage | Patient registration has to be displayed | Patient registration page displayed | Pass |
TC051 | View billing details | Click on the billing details link | Billing details has to be displayed | Billing details page was not displayed | Fail | error message displayed
TC052 | View billing details | Click on the billing details link | Billing details has to be displayed | Billing details page displayed | Pass |
TC053 | View patient details | Click on patient details link from receptionist homepage | Patient details page has to be displayed | Patient details page was not displayed | Fail | error message displayed
TC054 | View patient details | Click on patient details link from receptionist homepage | Patient details page has to be displayed | Patient details page displayed | Pass |
TC055 | View reporting | Click on reporting link from the receptionist home page | Generate report page has to be displayed | Generate Report page is displayed successfully | Pass |
TC056 | | Click on change ... | | | |
TC057 | | Click on change ... | | | |
TC058 | Admin sign out | Click on sign out link in admin homepage | Admin should be redirected to the website homepage | Website homepage was not displayed | Fail | error message displayed
TC059 | Doctor change password | Click on the change password link from doctor homepage | The system should allow doctor to change his password by confirming his old password and new password. If the old password is correct, message should be displayed | An error message is getting displayed for incorrect old password | Fail |
TC060 | Doctor change password | Click on the change password link from doctor homepage | The system should allow doctor to change his password by confirming his old password and new password. If the old password is correct, message should be displayed | The doctor can change his password by confirming his old password and new password. A message is displayed for successful password change | Pass |
TC061 | Receptionist change password | Click on the change password link from receptionist homepage | The system should allow receptionist to change his password by confirming his old password and new password. If the old password is correct, message should be displayed | An error message is getting displayed for incorrect old password | Fail |
TC062 | Receptionist change password | Click on the change password link from receptionist homepage | The system should allow receptionist to change his password by confirming his old password and new password. If the old password is correct, message should be displayed | The receptionist can change his password by confirming his old password and new password. A message is displayed for successful password change | Pass |
TC063 | Creation of staff using add staff | Admin enters id and password of staff from admin page | The new staff should be created with given id and password | Staff is created with id and password successfully | Pass |
TC064 | Deletion of staff | Click on delete staff option from admin page | The staff should be deleted | Staff is deleted successfully | Pass |
TC065 | Setting of constraint | Click on set constraint option from admin page | It should accept constraint name and store it in database | Constraint is accepted and is saved successfully | Pass |
TC066 | Update constraint | Click update constraint button from admin page | It should accept the change made to constraint and update it in database | Constraint is updated successfully | Pass |
TC067 | Delete constraint | Click delete constraint button from admin page | The constraint should be deleted | Constraint is deleted successfully | Pass |
TC068 | Setting the range of respective constraint | Click on set constraint range option from admin page | It should accept constraint range and store it in database | Constraint is accepted and is saved successfully | Pass |
TC069 | Update constraint range | Click update constraint range button from admin page | It should accept the change made to constraint range and update it in database | Constraint range is updated successfully | Pass |
TC070 | Setting the range of respective constraint | Click on set constraint range option from admin page | It should accept constraint range and store it in database | Constraint is accepted and is saved successfully | Pass |
TC071 | Update constraint range | Click update constraint range button from admin page | It should accept the change made to constraint range and update it in database | Constraint range is updated successfully | Pass |
TC072 | Delete constraint range | Click delete constraint range button from admin page | The constraint range should be deleted | Constraint range is deleted successfully | Pass |
TC073 | Admin sign out | Click on sign out link in admin homepage | Admin should be redirected to the website homepage | Website homepage is displayed | Pass |
TC074 | Update treatment details | Click on update treatment details link from patient homepage | Treatment details has to be updated | Treatment details page updated successfully | Pass |
TC075 | Delete treatment details | Click on delete treatment details link from patient homepage | Treatment details has to be deleted | Treatment details page deleted | Pass |
TC076 | Verification of prediction of disease | The constraint and constraint range are given as input and result button is clicked | The disease should be predicted by considering constraint, constraint range and old patient record. If ckd is present display "disease type=ckd" else "disease type=not ckd" | Disease prediction is successful | Pass |
TC077 | Doctor sign out | Click on sign out link in doctor page | Doctor should be redirected to the website homepage | Website homepage is displayed | Pass |
TC078 | Receptionist sign out | Click on sign out link in reception page | Receptionist should be redirected to the website homepage | Website homepage is displayed | Pass |
TC079 | Patient sign out | Click on sign out link in patient page | Patient should be redirected to the website homepage | Website homepage is displayed | Pass |


Chapter 7
SNAPSHOTS
Login Page
Fig 7.1: Login page
Login Types
Fig 7.2: Login Types

Invalid Condition
Fig 7.3: Invalid Condition
Doctor Login
Fig 7.4: Doctor Login

Admin Login
Fig 7.5: Admin Login
Doctor Home Page
Fig 7.6: Doctor Home Page

Reporting Page
Fig 7.7: Reporting Page
To Upload Treatment Detail
Fig 7.8: Upload Treatment Detail

Change Password
Fig 7.9: Change Password
Receptionist Home Page
Fig 7.10: Receptionist Home Page

Admin Home Page
Fig 7.11: Admin Home Page
To Select Disease Type
Fig 7.12: Disease Type

To Upload Disease and Stages
Fig 7.13: Upload Disease and Stages
To Add Constraints
Fig 7.14: Addition of Constraints

To Set Constraints Ranges
Fig 7.15: Set Constraint Ranges

CONCLUSION
Chronic Kidney Disease has been predicted and diagnosed using two data mining classifiers: ANN
and Naive Bayes. The factors considered in this work include age, diabetes, blood pressure and
RBC count. The work can be extended by considering other parameters such as food type, working
environment, living conditions, availability of clean water and environmental factors for
kidney disease detection. This project is a medical sector application that helps medical
practitioners predict disease types based on symptoms. Patients can also predict diseases by
entering symptoms in the form of sentences. The application automates disease prediction: it
identifies the disease, its types and its complications from the clinical database quickly and
economically. This is accomplished by applying the Naive Bayes algorithm for classification, a
technique that comes under data mining. The proposed work takes symptoms as input and predicts
the disease based on old patients' data.
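
As a rough illustration of the classification step summarised above, the following Python sketch trains a categorical Naive Bayes model on a handful of made-up patient records and predicts a label for a new patient. The records, the feature names and the functions train and predict are hypothetical and are not taken from the project's implementation or dataset. Laplace smoothing is an assumption added here so that a feature value never seen with a class does not force that class's probability to zero.

```python
import math
from collections import Counter, defaultdict

# Hypothetical records, invented purely for illustration (not data from the report).
# Each record pairs categorical feature values with a known diagnosis.
RECORDS = [
    ({"age": "old", "diabetes": "yes", "blood_pressure": "high"}, "ckd"),
    ({"age": "old", "diabetes": "yes", "blood_pressure": "normal"}, "ckd"),
    ({"age": "young", "diabetes": "no", "blood_pressure": "high"}, "ckd"),
    ({"age": "young", "diabetes": "no", "blood_pressure": "normal"}, "not ckd"),
    ({"age": "old", "diabetes": "no", "blood_pressure": "normal"}, "not ckd"),
    ({"age": "young", "diabetes": "no", "blood_pressure": "normal"}, "not ckd"),
]


def train(records):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(label for _, label in records)
    value_counts = defaultdict(Counter)  # (label, feature) -> Counter of values
    feature_values = defaultdict(set)    # feature -> set of values seen in training
    for features, label in records:
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
            feature_values[name].add(value)
    return class_counts, value_counts, feature_values


def predict(model, features):
    """Return the class with the highest log posterior under Naive Bayes."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_p = math.log(count / total)  # log prior P(class)
        for name, value in features.items():
            seen = value_counts[(label, name)][value]
            k = len(feature_values[name])
            # Laplace-smoothed estimate of P(value | class)
            log_p += math.log((seen + 1) / (count + k))
        scores[label] = log_p
    return max(scores, key=scores.get)


model = train(RECORDS)
new_patient = {"age": "old", "diabetes": "yes", "blood_pressure": "high"}
print(f"disease type={predict(model, new_patient)}")  # prints: disease type=ckd
```

The printed string follows the "disease type=ckd" / "disease type=not ckd" format expected by test case TC076. Log probabilities are summed instead of multiplying raw probabilities so that the score stays numerically stable as more features are added.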

FUTURE ENHANCEMENT
Monitoring of Patients using IoT Devices
If a patient is unable to visit the hospital for every check-up, he or she can be monitored
remotely from any place using IoT devices.
Query Module
A query module can be added to the application as a future enhancement, so that the doctor,
receptionist and admin can interact with each other.

REFERENCES
[1] Aditya Sunda N., Pushpa Latha P., Rama Chandra M. (2012, June). Performance Analysis of
Classification Data Mining Techniques over Heart Disease Data Base. International Journal of
Engineering Science and Advanced Technology (IJESAT), (pp. 470-478), 2012.
[2] Ilayaraja M., Meyyappan T. (2013, February). Mining Medical Data to Identify Frequent
Diseases using Apriori Algorithm. In Pattern Recognition, Informatics and Mobile Engineering
(PRIME), 2013 International Conference on (pp. 194-199). IEEE.
[3] Ravindra B. V., Sriraam N., Geetha M. (2014, November). Discovery of Significant
Parameters in Kidney Dialysis Data Sets by K means Algorithm. In Circuits, Communication,
Control and Computing (I4C), 2014 International Conference on (pp. 452-454). IEEE.
[4] Peter T. J., Somasundaram K. (2012, March). An Empirical Study on Prediction of Heart
Disease using Classification Data Mining Techniques. In Advances in Engineering, Science and
Management (ICAESM), 2012 International Conference on (pp. 514-518). IEEE.
[5] Mostafa Ghannad Rezaie, Hamid Soltanian Zadeh. Interactive Knowledge Discovery for
Temporal Lobe Epilepsy. International Journal of Advanced Science and Technology, (pp. 45-48), 2013.
[6] Neha Sharma, Er. Rohit Kumar Verma (2016, September). Prediction of Kidney Disease by
using Data Mining Techniques. International Journal of Advance Research in Computer Science
and Engineering (IJARCSSE), vol 6, issue 9, 2016.
[7] Rajan J. R., Chelvan C. C. (2013, December). A Survey on Mining Techniques for Early
Lung Cancer Diagnoses. In Green Computing, Communication and Conservation of Energy
(ICGCE), 2013 International Conference on (pp. 918-922). IEEE.
[8] Lakshmi K. R., Nagesh Y., VeeraKrishna M. (2014). Performance Comparison of Three Data
Mining Techniques for Predicting Kidney Dialysis Survivability. International Journal of
Advances in Engineering & Technology (IJAET), 7(1), 242-254, 2014.
[9] Agarwal Y., Pandey H. M. (2014, September). Performance Evaluation of Different
Techniques in The Context of Data Mining-A case of an eye disease. In Confluence the Next
Generation Information Technology Summit (Confluence), 2014 5th International Conference-
(pp. 72-76). IEEE.
[10] Ahmed S., Tanzir Kabir M., Tanzeem Mahmood N., Rahman R. M. (2014, December).
Diagnosis of Kidney Disease using Fuzzy Expert System. In Software, Knowledge, Information
Management and Applications (SKIMA), 2014 8th International Conference on (pp. 1-8). IEEE.


DEPARTMENT OF IS&E MIT MYSORE
