
STRESS DETECTION FOR IT PROFESSIONALS

USING MACHINE LEARNING

PUSHADAPU BHAVANI 20951A0522
INDU PRIYA TENTU 20951A0556
ABHISHEK 20951A0505
SYMPTOMS BASED DISEASE PREDICTION

A Project Report
Submitted in partial fulfillment of the

requirements for the award of the degree of

Bachelor of Technology in Computer Science & Engineering


by

Pushadapu Bhavani 20951A0522


Indu Priya Tentu 20951A0556
Abhishek 20951A0505

Department of Computer Science Engineering

INSTITUTE OF AERONAUTICAL ENGINEERING


(Autonomous) Dundigal, Hyderabad - 500 043, Telangana
APRIL 2024

© 2024, P.Bhavani, T. Indu Priya, P. Abhishek.


All rights reserved

DECLARATION

I certify that.

a. the work contained in this report is original and has been done by me under the guidance
of my supervisor(s).
b. the work has not been submitted to any other Institute for any degree or diploma.

c. I have followed the guidelines provided by the Institute in preparing the report.

d. I have conformed to the norms and guidelines given in the Ethical Code of Conduct of
the Institute.

e. whenever I have used materials (data, theoretical analysis, figures, and text) from other
sources, I have given due credit to them by citing them in the text of the report and giving
their details in the references. Further, I have taken permission from the copyright owners
of the sources, whenever necessary.

Place: Signature of the Student:

Date: Pushadapu Bhavani


Indu Priya Tentu
Abhishek

CERTIFICATE

This is to certify that the project report entitled “Symptoms based Disease Prediction”
submitted by Ms. Pushadapu Bhavani, Ms. Indu Priya Tentu, and Mr. Abhishek to the Institute of
Aeronautical Engineering, Hyderabad, in partial fulfillment of the requirements for the
award of the Degree Bachelor of Technology in Computer Science and Engineering is a
bonafide record of work carried out by them under my/our guidance and supervision. In whole
or in parts, the contents of this report have not been submitted to any other institute for the
award of any Degree.

Supervisor: Head of the Department:

Ms. K. Sangeeta Dr. C. Madhusudhan Rao


Assistant Professor Professor and HOD, CSE

Date:


APPROVAL SHEET
This project report entitled “Symptoms based Disease Prediction” by Ms. Pushadapu Bhavani,
Ms. Indu Priya Tentu, and Mr. Abhishek is approved for the award of the Degree Bachelor of
Technology in Computer Science and Engineering.

Examiners Supervisor(s)
Ms. K. Sangeeta
Principal
Dr. L. V. Narasimha Prasad

Date:

Place:
ACKNOWLEDGEMENT
The satisfaction that accompanies the successful completion of any task would be incomplete
without the mention of the people who made it possible and whose constant guidance and
encouragement crown all the efforts with success. I thank our college management and respected
Sri M. Rajashekar Reddy, Chairman, IARE, Dundigal for providing me with the necessary
infrastructure to conduct the project work.

I express my sincere thanks to Dr. L. V. Narasimha Prasad, Professor and Principal who has been
a great source of information for my work, and Dr. C. Madhusudhan Rao, Professor and Head,
Department of CSE, for extending his support to carry on this project work.

I am especially thankful to our supervisor Ms. K. Sangeeta, Assistant Professor, Department of
CSE, for her internal support and professionalism, which helped me in shaping the project into a
successful one. I take this opportunity to express my thanks to one and all who directly or
indirectly helped me in bringing the effort to its present form.


ABSTRACT
Keywords: Disease Datasets, Chatbot, Machine Learning, Convolutional Neural Network.

Symptoms based Disease prediction project presents the development of a Chatbot designed to
assist in disease diagnosis by analyzing user-input symptoms and providing predictions of
potential diseases. The Chatbot also offers information on suitable diets and facilitates the
booking of doctor appointments. While the appointment booking is not a real-time feature, the
Chatbot efficiently predicts diseases and provides essential medical guidance, along with details
of doctors and hospitals. To accomplish this, we leverage machine learning techniques,
specifically a Convolutional Neural Network (CNN) algorithm, to train the Chatbot. The CNN
algorithm is trained on a comprehensive dataset that maps disease names (under the 'Source'
column) to their associated symptoms (under the 'Target' column). By analyzing these symptoms,
the Chatbot accurately predicts diseases and offers tailored dietary recommendations and home
remedies. This innovative approach represents a promising step towards accessible and efficient
disease diagnosis and healthcare information dissemination.
CONTENTS
Title Page I
Cover Page II
Declaration III
Certificate by Supervisor IV
Approval Sheet V
Acknowledgement VI
Abstract VII
Contents VIII
List of Figures IX
List of Abbreviations X

Chapter 1 Introduction 1-7
1.1 Introduction 1
1.2 Existing System 2
1.3 Demerits of Existing System 2
1.4 Proposed System 3
1.5 Merits of Proposed System over Existing System 3
1.6 Requirements Specification 4-7
1.6.1 Software Requirements
1.6.2 Hardware Requirements

Chapter 2 Literature Survey 7-11

Chapter 3 Methodology and Implementation 12-25
3.1 Methodology 12-13
3.2 System Design 13-16
3.3 Algorithm 16-18
3.4 Sample Code 19-25

Chapter 4 Results and Discussion 26-34

Chapter 5 Conclusion and Future Scope 35-37
5.1 Conclusion 35-36
5.2 Future Scope 36-37

Chapter 6 References 37-38

List of Publications

LIST OF FIGURES

FIG.NO FIG.NAME PG.NO


3.1.1 Disease Data sets 22
3.2.1 System Architecture 23
3.3.1 Convolutional Neural Network 27
4.1 Python Server 29
4.2 Home Page 30
4.3 Register Page 36
4.4 New User Sign up page 37
4.5 Login page 37
4.6 Symptoms selection Page 38
4.7 Disease prediction page 39
4.9 E mail Notification Page 39
4.10 Symptoms selection page 40
4.11 Disease prediction Page 40
4.12 Unknown disease prediction 41
4.13 Diet Guidance 42

LIST OF ABBREVIATIONS

CNN Convolutional Neural Network

NLP Natural Language Processing

GUI Graphical User Interface

FC Fully Connected

BSD Berkeley Software Distribution

DFD Data Flow Diagram

URL Uniform Resource Locator

DB Database

CHAPTER 1 INTRODUCTION

1.1 INTRODUCTION

In an era characterized by the rapid advancement of technology and artificial
intelligence, healthcare stands to benefit significantly from innovative solutions.
One such solution is the development of a Symptoms based Disease Prediction Chatbot,
which harnesses the power of machine learning and Convolutional Neural
Networks (CNN) to offer accessible, efficient, and informed healthcare guidance.
This Chatbot represents a convergence of cutting-edge technology, medical
knowledge, and user-friendly interfaces.

Healthcare systems worldwide face numerous challenges, including increasing
patient loads, resource constraints, and the need for timely and accurate diagnosis.
The Symptoms based Disease Prediction model addresses these challenges by
providing users with the ability to input their symptoms and receive predictions
about potential diseases. To analyze these symptoms, the Chatbot draws on a
rich dataset that associates disease names with their corresponding symptoms. The
machine learning algorithm, in this case the CNN, learns to identify patterns in
symptoms and make accurate disease predictions.

Furthermore, the Chatbot extends its functionality beyond disease prediction. It
offers customized dietary recommendations based on the predicted disease,
enabling users to proactively manage their health. Additionally, the Chatbot
facilitates the booking of doctor appointments, although this feature operates in a
non-real-time mode.

1.2 EXISTING SYSTEM:


In the current system, the user interaction follows a linear design that progresses
from symptom extraction to symptom mapping. Once the corresponding symptom
is identified, the system proceeds to diagnose whether the patient's condition is
major or minor. In the case of a serious illness, the patient is directed to the
appropriate doctor, whose information is retrieved from the system based on the
user's input. The Chatbot dialogue design is represented using a finite state graph
to ensure accurate diagnosis. Logic for state transitions is implemented, and natural
language generation templates are employed to initiate conversation with the user
and gather responses. The agent operates in three main conversational phases:
collection of basic information, symptom extraction, and diagnosis. Initially, the
bot requests the user's login email and password, then enters the symptom
extraction state until it gathers sufficient information for evaluation. Users have the
option to re-enter the loop to consult with the doctor about another set of symptoms
after receiving their initial diagnosis. Additionally, users can access their chat
history to review previous discussions.
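
The three conversational phases described above can be pictured as a small
finite state machine. The following is only a minimal sketch under our own
assumptions: the names Phase, next_phase, and MIN_SYMPTOMS are hypothetical,
and the rule that three symptoms count as "sufficient information" is an
illustrative choice, not something specified by the existing system.

```python
from enum import Enum, auto


class Phase(Enum):
    BASIC_INFO = auto()          # collect login email and password
    SYMPTOM_EXTRACTION = auto()  # keep asking for symptoms
    DIAGNOSIS = auto()           # report the diagnosis


# Hypothetical threshold: how many symptoms count as "sufficient information".
MIN_SYMPTOMS = 3


def next_phase(phase, logged_in, symptoms, wants_another_round=False):
    """Return the next conversational phase given the current state."""
    if phase is Phase.BASIC_INFO:
        # Stay here until the login details have been accepted.
        return Phase.SYMPTOM_EXTRACTION if logged_in else Phase.BASIC_INFO
    if phase is Phase.SYMPTOM_EXTRACTION:
        # Keep asking until enough symptoms have been gathered.
        if len(symptoms) >= MIN_SYMPTOMS:
            return Phase.DIAGNOSIS
        return Phase.SYMPTOM_EXTRACTION
    # After a diagnosis the user may re-enter the loop with new symptoms.
    return Phase.SYMPTOM_EXTRACTION if wants_another_round else Phase.DIAGNOSIS
```

Keeping the transition logic in one function makes the linear nature of the
design visible: every state has at most one forward transition and one
self-loop.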
1.3 DEMERITS OF EXISTING SYSTEM
The existing system follows a linear dialogue design, which may not effectively
handle complex or dynamic conversations. It limits the user's ability to provide
information in a natural way.

The existing system's diagnosis process relies heavily on manual symptom
extraction, mapping, and decision-making. This can be time-consuming and prone
to errors.

While users can view their chat history, the existing system may not offer a
comprehensive history of past interactions, limiting the ability to track
health-related discussions.
1.4 PROPOSED SYSTEM:
The proposed Symptoms based Disease Prediction system is designed to be an
advanced and user-centric healthcare tool that combines artificial intelligence,
specifically Convolutional Neural Networks (CNN), with a vast dataset of
disease-symptom associations to deliver precise disease predictions. Users can simply
input their symptoms, and the Chatbot will utilize the trained CNN to analyze the
data and provide potential disease outcomes. This system not only aids in early
disease detection but also extends its functionality by suggesting tailored diets and
home remedies based on the predicted condition. Moreover, the Chatbot assists
users in finding appropriate doctors and hospitals, streamlining the process of
seeking medical care.
1.5 MERITS OF PROPOSED SYSTEM OVER EXISTING SYSTEM:
1. The proposed system leverages Convolutional Neural Networks (CNN) and
a rich dataset for accurate disease prediction, ensuring a higher degree of
automation and accuracy.
2. The Chatbot in the proposed system streamlines symptom analysis,
enabling quicker and more reliable diagnoses, which can be crucial in healthcare.
3. The proposed system is designed with the user's convenience in mind,
offering a more natural and accessible way to input symptoms and receive medical
guidance.
4. In addition to disease prediction, the proposed system provides tailored
dietary recommendations and helps users find appropriate healthcare professionals
and facilities.

1.6 REQUIREMENTS

1.6.1 SOFTWARE REQUIREMENTS

Software requirements entail specifying the necessary software resources and
prerequisites for optimal functioning of an application. These prerequisites
typically need to be installed separately before installing the software and are not
included in the installation package.

Platform -- A platform in computing refers to a framework, either in hardware or
software, that facilitates the execution of software. This encompasses elements
such as computer architecture, operating systems, and programming languages
along with their runtime libraries.

When defining software requirements, the operating system is one of the primary
considerations. Compatibility with different versions of the same operating system
family is crucial, although complete backward compatibility cannot always be
guaranteed. For instance, software designed for one version of Windows may not
function on an earlier version, though some level of backward compatibility is
often maintained.

For example, most software designed for Microsoft Windows XP does not run on
Microsoft Windows 98, although the converse is not always true. Similarly,
software developed using newer features of Linux Kernel v2.6 typically does not
function or compile correctly on Linux distributions using Kernel v2.2 or v2.4.

APIs and drivers -- For software that extensively utilizes specialized hardware
devices like high-end display adapters, special APIs or updated device drivers are
necessary. DirectX serves as a notable example, offering a collection of APIs
tailored for multimedia tasks, particularly in game programming, on Microsoft
platforms.

Web browser -- In the realm of web applications and software heavily reliant on
internet technologies, the default browser installed on the system is often utilized.
Microsoft Internet Explorer is commonly chosen for software running on Microsoft
Windows, despite the vulnerabilities associated with ActiveX controls.

1) Python IDLE (Python 3.7)

2) MySQL Workbench

1.6.2 HARDWARE REQUIREMENTS

The most common set of requirements defined by an operating system or software
application is the physical computer resources, also known as hardware. Alongside
hardware requirements, a hardware compatibility list (HCL) is often provided,
particularly for operating systems. This list includes tested, compatible, and
occasionally incompatible hardware devices for a specific operating system or
application. The following subsections delve into the different facets of hardware
requirements.

Architecture – Computer operating systems are tailored for specific computer
architectures. While some software applications are platform-independent, many
are restricted to particular operating systems running on specific architectures.
Although architecture-independent operating systems and applications do exist, the
majority require recompilation to operate on a new architecture. Below is a list of
common operating systems alongside their supported architectures.

Processing power – The processing power of the central processing unit (CPU)
stands as a fundamental system requirement for any software. For software
operating on x86 architecture, this power is typically defined by the model and
clock speed of the CPU. However, other critical features influencing speed and
power, such as bus speed, cache, and MIPS, are frequently overlooked. This
simplified definition of power can be misleading, as CPUs from different
manufacturers, like AMD Athlon and Intel Pentium, may exhibit varying
throughput speeds despite similar clock speeds. Intel Pentium CPUs have garnered
significant popularity and are often referenced in this context.

Memory – Every software, upon execution, occupies a portion of the computer's
random access memory (RAM). Memory requirements are established by
evaluating the demands of the application, operating system, supporting software,
files, and concurrent processes. Additionally, optimal performance of unrelated
software running on a multitasking computer system is taken into account when
determining these requirements.

Secondary storage – The hard-disk requirements for software vary depending on
several factors. These include the size of the software installation, temporary files
generated during installation or operation, and potential utilization of swap space if
the available RAM is inadequate.

Display Adapter -- Software applications that demand superior computer graphics
display, such as graphic design software and high-end games, typically specify
high-end display adapters in their system requirements.

Peripherals -- Certain software applications require extensive or specialized use of
peripherals, necessitating higher performance or functionality from such devices.
These peripherals may include CD-ROM drives, keyboards, pointing devices,
network devices, and others.

Operating System: Limited to Windows only.

Processor: Requires an Intel Core i5 processor or higher.

CHAPTER 2 LITERATURE SURVEY

2.1 Chatbot for Disease Prediction and Treatment Recommendation using
Machine Learning:

https://ieeexplore.ieee.org/document/8862707

ABSTRACT: Hospitals serve as the primary avenue for individuals seeking
medical check-ups, disease diagnosis, and treatment recommendations, constituting
a universal practice worldwide. Widely regarded as a reliable method for assessing
health status, visiting a hospital and consulting with a doctor is ingrained in
societal norms. However, there is a burgeoning interest in exploring alternatives to
this conventional approach. The proposed system seeks to leverage natural
language processing and machine learning concepts to develop a Chatbot
application. This Chatbot aims to mimic human interaction, allowing users to
engage in dialogue similar to conversing with another person. Through a sequence
of inquiries, the Chatbot identifies user symptoms, predicts potential diseases, and
offers treatment recommendations. Despite the potential benefits, research suggests
that such systems are not extensively adopted, and there exists limited awareness
among the populace regarding their existence and capabilities.

2.2 A Machine Learning based Medical Chatbot for detecting diseases:

https://ieeexplore.ieee.org/document/9754016

ABSTRACT: To streamline disease detection efficiently and effectively, we
propose a medical Chatbot framework. This framework operates by collecting
various symptoms associated with specific diseases and leveraging different
datasets related to these ailments. Through the application of various Deep
Learning and Machine Learning algorithms, we aim to anticipate diseases
accurately. Machine Learning Techniques offer a promising avenue for gathering
information through the construction of classification and regression models using
collected datasets. The insights derived from such data are instrumental in disease
prediction. While multiple Machine Learning and Deep Learning techniques are
capable of prediction, selecting the most suitable approach can be challenging. In
this system, we have implemented Natural Language Processing and Artificial
Neural Networks to optimize efficiency and accuracy. These technologies
collectively enhance the performance of the Chatbot framework, enabling it to
provide valuable insights into disease detection.

2.3 A Medical Chatbot:

https://ijcttjournal.org/archives/ijctt-v60p106

ABSTRACT: Users often lack comprehensive knowledge about various
treatments or symptoms associated with specific diseases. Consequently, even for
minor health concerns, individuals resort to time-consuming hospital visits.
Additionally, managing telephonic complaints can be burdensome. This issue can
be addressed through the implementation of a medical Chatbot, which offers
guidance on healthy living and alleviates the need for physical hospital visits.

Operating on Natural Language Processing principles, these Chatbots enable users
to submit health-related queries conveniently. Utilizing Google API for voice-to-text
and text-to-voice conversion enhances accessibility. Queries are directed to the
Chatbot, and responses are provided via an Android app interface. A key objective
of this web-based platform is to analyze customer sentiments, ensuring a holistic
approach to healthcare support.

2.4 A Self-Diagnosis Medical Chatbot Using Artificial Intelligence:

https://matjournals.in/index.php/JoWDWD/article/view/2334

ABSTRACT: Accessing healthcare is crucial for leading a healthy life, yet
obtaining timely consultations with doctors can often prove challenging. In
response, a proposed solution involves developing a medical Chatbot powered by
Artificial Intelligence (AI) capable of diagnosing diseases and furnishing basic
information about them prior to doctor consultations. This initiative aims to
mitigate healthcare costs and enhance accessibility to medical knowledge.

Some Chatbots serve as medical reference resources, empowering patients to learn
more about their conditions and improve their well-being. To fully realize the
benefits of such a Chatbot, it must possess the capability to diagnose a wide range
of diseases and furnish essential information. A text-to-text diagnosis bot engages
users in discussions regarding their medical concerns and delivers personalized
diagnoses based on symptoms. Consequently, individuals gain insights into their
health status and can take appropriate measures for their well-being.


2.5 Medical Chatbot for Disease Prediction using Machine Learning:

http://www.ijaresm.com/medical-chatbot-for-disease-prediction-using-machine-learning

ABSTRACT: The primary objective behind developing the medical Chatbot is to
offer time and cost savings in various scenarios. In today's fast-paced world,
people frequently utilize the internet and often find it challenging to allocate time
for hospital visits. Thus, a Chatbot serves as a convenient alternative for addressing
their medical inquiries. To address this need, a web application has been
developed. However, many existing applications solely rely on automated Chatbots
that do not regularly update their training datasets. Moreover, some applications
only offer live chat options without providing additional features. The proposed
system addresses these drawbacks by integrating both live chat and automated
Chatbot functionalities within a single web application, eliminating the need for
separate installations. Powered by Natural Language Processing (NLP), the
medical Chatbot facilitates communication between computers and humans. This
technology enables the system to interpret and respond to user queries in a
conversational manner, providing a seamless user experience. Given that many
complex diseases manifest initially with mild symptoms, early detection is critical.
By leveraging NLP algorithms, the Chatbot can detect subtle symptoms and
potentially identify underlying health issues. The system's responses are delivered
through a graphical user interface (GUI), simulating a conversation with a
healthcare professional and enhancing user engagement.


CHAPTER 3 METHODOLOGY AND IMPLEMENTATION

3.1 METHODOLOGY

In this project we develop a Chatbot that analyses input symptoms, predicts the
disease, and then displays diet information and doctor appointment booking
details. It is not a real-time application for booking a doctor, but it displays the
predicted disease and diet information along with doctor and hospital details. To
identify a disease, the Chatbot must be trained with machine learning so that it can
take symptoms as input and predict the disease. To train the Chatbot we have used
the CNN algorithm, which is trained on the dataset below.

Fig 3.1.1 disease dataset

In the above dataset, the ‘Source’ column refers to the disease name and the
‘Target’ column refers to all possible symptoms of that disease. By analyzing those
symptoms, the CNN predicts the disease and suggests diets with home remedies.
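
To illustrate the ‘Source’/‘Target’ layout, the sketch below groups symptoms
under their disease using only the Python standard library. The sample rows,
the one-pair-per-row layout, and the helper name load_disease_symptoms are
assumptions made for illustration; the actual file may store the symptoms
differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows: one (disease, symptom) pair per line, using the
# 'Source' and 'Target' column names described above.
SAMPLE_CSV = """Source,Target
influenza,fever
influenza,cough
influenza,fatigue
migraine,headache
migraine,nausea
"""


def load_disease_symptoms(csv_text):
    """Group the 'Target' symptoms under each 'Source' disease."""
    mapping = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        disease = row["Source"].strip().lower()
        symptom = row["Target"].strip().lower()
        if symptom not in mapping[disease]:  # skip duplicate pairs
            mapping[disease].append(symptom)
    return dict(mapping)


disease_symptoms = load_disease_symptoms(SAMPLE_CSV)
```

The resulting dictionary maps each disease name to its list of symptoms, which
is a convenient starting point for building the training vectors.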


To implement this project we have designed the following modules:

1) Register: using this module, users can sign up with the application by
providing an email ID and contact number, so that an email can be sent with
the predicted disease and diet details.

2) User: using this module, the user can log in to the application.

3) Chatbot: using this module, the user can enter symptoms; the Chatbot then
predicts the disease, diet, and doctor details, displays the output, and
sends an email to the registered email ID.

4) Lifestyle & Disease Information: using this module, the user can select a
disease name, and the system will suggest foods to take and foods to avoid,
along with doctor details.
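
As an illustration of the email step in the Chatbot module, the sketch below
composes the notification with the standard library's EmailMessage class. The
function name build_report_email, the sender address, and the message layout
are hypothetical; the message is only built here, not sent.

```python
from email.message import EmailMessage


def build_report_email(recipient, disease, foods_to_take, foods_to_avoid,
                       sender="chatbot@example.com"):
    """Compose (but do not send) the prediction summary email."""
    message = EmailMessage()
    message["From"] = sender  # placeholder address, an assumption
    message["To"] = recipient
    message["Subject"] = f"Predicted condition: {disease}"
    message.set_content(
        f"Predicted disease: {disease}\n"
        f"Foods to take: {', '.join(foods_to_take)}\n"
        f"Foods to avoid: {', '.join(foods_to_avoid)}\n"
    )
    return message


report = build_report_email("user@example.com", "influenza",
                            ["warm fluids", "fruit"], ["fried food"])
```

Separating message construction from delivery keeps the Chatbot logic testable
without a mail server; the composed message could later be handed to
smtplib.SMTP.send_message.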

3.2 SYSTEM ARCHITECTURE:

Fig.3.2.1 System architecture



User Interface: Web interface, mobile app, or messaging platform (e.g., Facebook
Messenger, WhatsApp). Allows users to input their symptoms and receive
predictions.

Chatbot Interface: Natural Language Processing (NLP) module to understand user
inputs. Dialog Management system to maintain the conversation flow. Integration
with the symptom checker module for processing user symptoms.

Symptom Checker Module: Responsible for interpreting user-provided symptoms.
Matches symptoms against a database of known diseases. Utilizes algorithms like
Bayesian networks, decision trees, or machine learning models for prediction. May
incorporate medical knowledge bases or APIs for disease-symptom relationships.

Prediction Engine: Core engine responsible for predicting diseases based on
symptoms. Utilizes data analytics, machine learning, or rule-based algorithms.
Trained on historical data of symptoms and diagnosed diseases.
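
A rule-based version of the symptom matching step can be sketched as a set
overlap score. This is a simple baseline chosen for illustration (Jaccard
similarity), not the CNN used in this project; the names jaccard,
rank_diseases, and KNOWLEDGE_BASE are hypothetical, and the knowledge base
holds made-up entries.

```python
def jaccard(a, b):
    """Overlap between two symptom collections, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_diseases(user_symptoms, knowledge_base):
    """Return (disease, score) pairs sorted from best to worst match."""
    scores = [(d, jaccard(user_symptoms, s)) for d, s in knowledge_base.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Illustrative entries only; a real knowledge base would come from the dataset.
KNOWLEDGE_BASE = {
    "influenza": ["fever", "cough", "fatigue"],
    "migraine": ["headache", "nausea"],
}

ranking = rank_diseases(["fever", "cough"], KNOWLEDGE_BASE)
```

A baseline like this is useful as a sanity check: a trained model should rank
diseases at least as sensibly as plain symptom overlap.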

Database: Stores user data (anonymized if necessary), including symptoms and
predicted diseases. May include a knowledge base of diseases, symptoms,
treatments, and related information. Structured database for efficient querying and
storage.

External APIs/Services: Integration with external services for fetching additional
medical information (e.g., drug interactions, treatment guidelines). Access to
verified medical databases for up-to-date information on diseases and symptoms.

Security: Encryption of sensitive data (e.g., user health information). Compliance
with healthcare data regulations (e.g., HIPAA, GDPR) if applicable. Measures to
prevent unauthorized access and data breaches.


Functional Requirements:

1. Data Collection
2. Data Preprocessing
3. Training and Testing
4. Modeling
5. Predicting

Non-Functional Requirements:

Non-functional requirements specify the quality attributes of a software
system. They judge the software system based on responsiveness,
usability, security, portability, and other non-functional standards that are
critical to the success of the software system. An example of a non-functional
requirement is “how fast does the website load?” Failing to meet
non-functional requirements can result in systems that fail to satisfy user
needs. Non-functional requirements allow you to impose constraints or
restrictions on the design of the system across the various agile backlogs.
For example, the site should load in 3 seconds when the number of
simultaneous users is greater than 10000. Describing non-functional
requirements is just as critical as describing functional requirements.

● Usability requirement
● Manageability requirement
● Serviceability requirement
● Recoverability requirement
● Data Integrity requirement
● Security requirement
● Availability requirement
● Interoperability requirement
● Maintainability requirement
● Reliability requirement
● Regulatory requirement

3.3 ALGORITHM:

CNN: A Convolutional Neural Network (CNN) is a specialized network
architecture within deep learning tailored for tasks involving image recognition and
processing pixel data. While various types of neural networks exist in deep
learning, CNNs stand out as the preferred architecture for object identification and
description. CNNs excel in processing inputs such as images, speech, or audio
signals due to their superior performance. These networks typically comprise three
main types of layers:

Convolutional layer

Pooling layer

Fully Connected (FC) layer

The Convolutional layer serves as the initial layer in a CNN. While subsequent
layers may employ Convolutional or pooling methods, the complete connection
process occurs in the final layer. As data progresses through each layer, the CNN's
complexity increases, allowing it to recognize finer details in images. Initial layers
focus on fundamental features like color and edges, gradually identifying larger
object details until reaching the target identification stage.
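
To make the three layer types concrete, the following sketch runs one forward
pass through a convolutional layer, a pooling layer, and a fully connected
layer on a toy 1-D input. It uses plain Python lists rather than a deep
learning framework, and the kernel, weights, and bias are hand-picked,
untrained values chosen only for illustration.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as used in CNNs)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def dense(values, weights, bias):
    """Fully connected layer: one weight row per output unit."""
    return [sum(v * w for v, w in zip(values, row)) + b for row, b in zip(weights, bias)]


# Toy input and hand-picked (untrained) parameters, for illustration only.
x = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
features = max_pool1d(relu(conv1d(x, [1.0, -1.0, 1.0])), size=2)
scores = dense(features, weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.5])
```

In a trained network the kernel and the dense weights are learned from data;
here they are fixed so the flow of values through each layer is easy to follow.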

Fig.3.3.1 Convolutional neural Network

Data Representation: Convert symptoms into a structured format suitable for
CNNs. This could involve encoding symptoms as binary vectors or using
techniques like one-hot encoding.
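
A binary (multi-hot) encoding of symptoms can be sketched as follows. The
helper names build_vocabulary and encode and the sample symptoms are
illustrative assumptions; symptoms not present in the vocabulary are simply
ignored in this sketch.

```python
def build_vocabulary(disease_symptoms):
    """Assign each distinct symptom a fixed index (sorted for reproducibility)."""
    symptoms = sorted({s for group in disease_symptoms.values() for s in group})
    return {symptom: index for index, symptom in enumerate(symptoms)}


def encode(symptoms, vocabulary):
    """Binary vector with a 1 at the index of every reported symptom."""
    vector = [0] * len(vocabulary)
    for symptom in symptoms:
        index = vocabulary.get(symptom)
        if index is not None:  # symptoms outside the vocabulary are ignored
            vector[index] = 1
    return vector


vocab = build_vocabulary({"influenza": ["fever", "cough"],
                          "migraine": ["headache", "nausea"]})
encoded = encode(["fever", "nausea", "rash"], vocab)
```

Every input then has the same fixed length, one position per known symptom,
which is what a convolutional or fully connected layer expects.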

Dataset Preparation: Collect a dataset containing symptom data and corresponding
disease labels. Ensure the dataset is appropriately labeled and balanced to avoid
biases.

Data Augmentation: Depending on the size of your dataset, you may need to
augment the data to prevent overfitting. Techniques such as adding noise to the
input data or generating synthetic samples can be beneficial.
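
One simple way to add noise to binary symptom vectors is to randomly switch
off some of the reported symptoms, mimicking users who mention only part of
what they experience. This is a sketch under our own assumptions: the function
name augment and the drop probability of 0.3 are illustrative choices.

```python
import random


def augment(vector, drop_prob, rng):
    """Randomly switch off present symptoms, keeping at least one active."""
    active = [i for i, bit in enumerate(vector) if bit == 1]
    noisy = list(vector)
    for i in active:
        if rng.random() < drop_prob:
            noisy[i] = 0
    if active and sum(noisy) == 0:
        noisy[rng.choice(active)] = 1  # never emit an all-zero sample
    return noisy


rng = random.Random(0)  # fixed seed so the sketch is reproducible
original = [1, 0, 1, 1, 0]
samples = [augment(original, drop_prob=0.3, rng=rng) for _ in range(5)]
```

Each synthetic sample keeps the original disease label, since it contains only
a subset of that disease's symptoms.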

Model Architecture: Design a CNN architecture suitable for processing the
symptom data. This might involve several Convolutional layers followed by
pooling layers to extract relevant features. Consider techniques like dropout
regularization to prevent overfitting.
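
Dropout can be illustrated in a few lines. The sketch below shows the common
"inverted dropout" variant, in which surviving activations are rescaled during
training so that no scaling is needed at prediction time; the function name
and the rate of 0.5 are illustrative assumptions.

```python
import random


def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(values)  # dropout is disabled at prediction time
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
activations = [1.0] * 1000
dropped = dropout(activations, rate=0.5, rng=rng)
```

Because a different random subset of units is silenced on every training step,
the network cannot rely on any single feature, which is what reduces
overfitting.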

Training: Divide the dataset into training, validation, and testing sets. Utilize the
training data to train the CNN model. Employ the validation set for
hyperparameter tuning and to monitor model performance. Assess the trained model's
generalization ability by evaluating it with the testing set.
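
The three-way split can be sketched with the standard library alone. The
70/15/15 proportions, the fixed seed, and the function name split_dataset are
assumptions for illustration; the report does not state the proportions used
for the disease dataset.

```python
import random


def split_dataset(samples, train=0.7, val=0.15, seed=42):
    """Shuffle once, then cut into train / validation / test partitions."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # everything left over is the test set
    )


train_set, val_set, test_set = split_dataset(range(100))
```

The three partitions are disjoint by construction, so the test set stays
unseen until the final evaluation.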

Prediction: After training and evaluation, apply the CNN model to predict diseases
based on input symptoms. Implement post-processing steps to interpret model
predictions and provide meaningful outputs to users.
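
One possible post-processing step is to turn the raw output scores into
probabilities and to report an unknown disease when the model is not confident
enough. The softmax conversion is standard; the 0.5 confidence threshold, the
function name interpret, and the "unknown" label are assumptions made for this
sketch.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(scores, labels, min_confidence=0.5):
    """Return the most likely label, or 'unknown' if the model is unsure."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return "unknown", probs[best]
    return labels[best], probs[best]


confident = interpret([2.0, 0.0], ["influenza", "migraine"])
unsure = interpret([0.0, 0.0, 0.0], ["influenza", "migraine", "asthma"])
```

Returning the probability alongside the label lets the interface show users
how certain the prediction is.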
Model Evaluation: Evaluate the CNN model's performance using metrics such as
accuracy, precision, recall, and F1-score. Perform additional analyses, such as
confusion matrix analysis, to gain insights into the model's strengths and
weaknesses.
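
The metrics named above can be computed directly from the true and predicted
labels. The sketch below treats one class at a time as the positive label; the
label lists are made-up values used only to show the arithmetic.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Made-up labels for illustration.
y_true = ["flu", "flu", "migraine", "migraine", "flu"]
y_pred = ["flu", "migraine", "migraine", "migraine", "flu"]
flu_precision, flu_recall, flu_f1 = per_class_metrics(y_true, y_pred, "flu")
overall_accuracy = accuracy(y_true, y_pred)
```

Reporting precision and recall per disease, rather than accuracy alone, shows
whether the model misses rare diseases even when overall accuracy looks high.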

Deployment: Deploy the trained CNN model in a production environment, such as
a web application or mobile app, enabling users to input symptoms and receive
disease predictions. Implement mechanisms for real-time performance monitoring
and model updates as needed.


3.4 SAMPLE CODE:

import pandas as pd
from time import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from imblearn.over_sampling import SMOTE
from sklearn.utils import resample
from collections import Counter


# Sorting the data
def sort_data_by_date(file_path):
    # Read the CSV file and return its rows ordered by the "date" column
    df = pd.read_csv(file_path, parse_dates=True)
    sorted_df = df.sort_values(["date"])
    return sorted_df

def split_train_test_data(root="./data-set",
                          drive_file="/ST1200NM0007_last_10_day.csv",
                          ignore_cols=["date", "serial_number", "model",
                                       "capacity_bytes", "failure"],
                          resample_data=False, smote_data=False):
    df = pd.read_csv(root + drive_file, parse_dates=True)

    # Separate good (failure == 0) and bad (failure == 1) rows, sorted by date
    df_good = df.loc[df["failure"] == 0]
    df_bad = df.loc[df["failure"] == 1]
    df_good = df_good.sort_values(["date"])
    df_bad = df_bad.sort_values(["date"])

    good_y = df_good["failure"]
    bad_y = df_bad["failure"]

    # Take first 60% of the data as train and the remaining 40% as test
    X_train_good, X_test_good, y_train_good, y_test_good = train_test_split(
        df_good, good_y, train_size=0.6, shuffle=False)
    X_train_bad, X_test_bad, y_train_bad, y_test_bad = train_test_split(
        df_bad, bad_y, train_size=0.6, shuffle=False)
    print("Bad Y test count:", len(y_test_bad))
    print("Good Y test count:", len(y_test_good))

    if resample_data:
        # Oversample the bad training rows (with replacement) to match the
        # number of good training rows; only training rows are resampled
        X_train_bad = resample(X_train_bad, replace=True,
                               n_samples=len(X_train_good), random_state=1)
        X_train_bad = X_train_bad.sort_values(["date"])
        y_train_bad = X_train_bad["failure"]

    # Concatenate good and bad rows to get the final train and test sets
    X_train = pd.concat([X_train_good, X_train_bad], axis=0)
    X_test = pd.concat([X_test_good, X_test_bad], axis=0)
    y_train = pd.concat([y_train_good, y_train_bad], axis=0)
    y_test = pd.concat([y_test_good, y_test_bad], axis=0)

    # Remove identifier columns and the label from the feature matrices
    X_train.drop(columns=ignore_cols, inplace=True)
    X_test.drop(columns=ignore_cols, inplace=True)

    if smote_data:
        # Balance the classes with synthetic minority samples
        sm = SMOTE(random_state=42)
        X_train, y_train = sm.fit_resample(X_train, y_train)

    print("LABEL COUNT: ", Counter(y_train))
    return (X_train, X_test, y_train, y_test)

def get_train_test_data(ignore_cols=["date", "serial_number", "model",
                                     "capacity_bytes", "failure"],
                        resample_data=False, smote_data=False):
    data_root_dir = "./data-set"
    good_drives_file = "/k_only_good.csv"
    failed_drives_file = "/k_only_failed.csv"

    # Sort df by date
    good_drives = sort_data_by_date(data_root_dir + good_drives_file)
    failed_drives = sort_data_by_date(data_root_dir + failed_drives_file)
    print("Done reading data")

    good_y = good_drives["failure"]
    failed_y = failed_drives["failure"]

    # Take the first 70% of the data as train and the remaining 30% as test
    X_train_good, X_test_good, y_train_good, y_test_good = train_test_split(
        good_drives, good_y, train_size=0.7, shuffle=False)
    X_train_failed, X_test_failed, y_train_failed, y_test_failed = train_test_split(
        failed_drives, failed_y, train_size=0.7, shuffle=False)

    print("Bad Y test count:", len(y_test_failed))
    print("Good Y test count:", len(y_test_good))
    # df.head(int(len(df)*(n/100)))

    if resample_data:
        # Oversample failed drives to match the number of good training rows
        X_train_failed = resample(X_train_failed, replace=True,
                                  n_samples=len(X_train_good), random_state=1)
        X_train_failed = X_train_failed.sort_values(["date"])
        print("Shape of train for good drives: ", X_train_good.shape)
        print("Shape of train for failed drives: ", X_train_failed.shape)
        y_train_failed = X_train_failed["failure"]

    # Concatenate good and failed data sets to get the final train and test sets
    X_train = pd.concat([X_train_good, X_train_failed], axis=0)
    X_test = pd.concat([X_test_good, X_test_failed], axis=0)
    y_train = pd.concat([y_train_good, y_train_failed], axis=0)
    y_test = pd.concat([y_test_good, y_test_failed], axis=0)
    print("X train shape: ", X_train.shape)

    # Drop identifier columns and the label itself from the features
    X_train = X_train.drop(columns=ignore_cols)
    X_test = X_test.drop(columns=ignore_cols)

    if smote_data:
        sm = SMOTE(random_state=42)
        X_train, y_train = sm.fit_resample(X_train, y_train)

    return (X_train, X_test, y_train, y_test)

def run(models=[RandomForestClassifier(max_depth=2, random_state=0)]):
    X_train, X_test, y_train, y_test = split_train_test_data(
        drive_file="/ST1200NM0007_last_10_day.csv", smote_data=True)
    # X_train, X_test, y_train, y_test = get_train_test_data(resample_data=True)
    print("Got data!!")

    for model in models:
        print(type(model).__name__)

        # Train the model and report the training time in minutes
        start = time()
        model.fit(X_train, y_train)
        end = time()
        print("Time to train:", str((end - start) / 60), " min")

        # Evaluate on the held-out test set
        y_pred = model.predict(X_test)
        print("Accuracy: ", accuracy_score(y_test, y_pred))
        print("Scores:\n", classification_report(y_test, y_pred))


if __name__ == "__main__":
    models_list = []

    xgbc = XGBClassifier()
    models_list.append(xgbc)

    rfc = RandomForestClassifier(max_depth=2, random_state=0)
    models_list.append(rfc)

    # mlpc = MLPClassifier(solver='lbfgs', alpha=1e-5,
    #                      hidden_layer_sizes=(5, 2), random_state=1)
    # models_list.append(mlpc)

    run(models_list)

# conda install -c conda-forge xgboost
# pip install xgboost
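
The accuracy and per-class scores printed by the run function can be reproduced by hand on a small example. The sketch below assumes binary labels where 1 marks a failure. The function names and the toy label lists are hypothetical; the listing itself relies on scikit-learn's accuracy_score and classification_report.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for one class; 0.0 when a ratio is undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels, made up for illustration: 1 = failed drive, 0 = healthy drive.
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 1, 0, 0, 1]

acc = accuracy(y_true, y_pred)
precision, recall = precision_recall(y_true, y_pred)
print("Accuracy:", acc)
print("Precision:", precision, "Recall:", recall)
```

On imbalanced data, accuracy alone can look high even when few failures are detected, which is why the per-class recall is worth reading alongside it.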


CHAPTER 4 RESULTS AND DISCUSSIONS

To execute the project, follow these steps:

Copy the contents of the "DB.txt" file.

Paste the contents into the MySQL console to create the required database.

Double-click the "run.bat" file to start the Python server.

Once the server has started, the screen below is shown.

Fig 4.1 Python server


In the above screen the Python server has started. Now open a browser, enter the URL
http://127.0.0.1:8000/index.html, and press the Enter key to get the page below.
Fig 4.2 Home page
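
Whether the server is reachable at the address given above (http://127.0.0.1:8000/index.html) can also be checked from a script before opening the browser. The following is a minimal sketch using only the Python standard library. The function names build_url and server_is_up are hypothetical, and it assumes the server answers the index page with HTTP status 200.

```python
from urllib.error import URLError
from urllib.request import urlopen


def build_url(host="127.0.0.1", port=8000, page="index.html"):
    """Compose the page address from host, port and page name."""
    return f"http://{host}:{port}/{page}"


def server_is_up(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False
```

Calling server_is_up(build_url()) after "run.bat" has started should return True under the assumption above; a False result suggests the server is not running or is listening on a different port.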

Fig 4.3 register page


In the above screen the user enters the sign-up details. Give a valid mail ID so that you can
receive mails, then press the button to get the page below.
Fig 4.4 new user sign up page

In the above screen the blue text shows that sign-up is completed. Now click on the
‘User’ link to log in as a user.

Fig 4.5 log in page

In the above screen the user logs in; after login the page below is shown.

In the above screen click on the ‘Chatbot’ link to get the page below.


Fig 4.6 select symptoms page

On the above Chatbot page, type some symptoms; here the symptoms entered are
‘patches rashes’. Then press the button to get a reply from the Chatbot, as in the
screen below.

Fig 4.7 Disease prediction page

In the above screen the blue text shows the predicted disease, ‘Fungal Infection’. The
lines below it give home remedies along with diet details; scroll down the page to
view the complete details.
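
How typed symptoms could be mapped to a disease can be sketched with a toy example. The snippet below is not the project's model (the report describes a CNN trained on a disease-symptom data set). It assumes a hypothetical two-entry table built from the two examples shown in this chapter, and it scores each disease by the number of symptom words it shares with the user's text.

```python
import re

# Hypothetical toy table; the two pairs come from the examples in this chapter.
DISEASE_SYMPTOMS = {
    "Fungal Infection": {"patches", "rashes"},
    "Heart Attack": {"chest", "pain"},
}


def tokenize(text):
    """Lower-case the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def predict_disease(text):
    """Return the disease with the largest symptom-word overlap, or None."""
    words = tokenize(text)
    scores = {disease: len(words & symptoms)
              for disease, symptoms in DISEASE_SYMPTOMS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(predict_disease("patches rashes"))
```

A lookup like this only recognizes words that appear in the table; a trained model is what lets the Chatbot generalize beyond exact matches.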

Fig 4.8

In the above screen we can see the doctor details. The same information is also sent by
mail, as in the screen below.

Fig 4.9 E-mail notification page

In the above email we can see the disease details with diet and remedies. You can search
for any symptoms in the same way; another example is given below.
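
Composing such a notification can be sketched with Python's standard email package. This is a minimal illustration, not the project's mail code: the function name build_report_email, the example.com addresses, and the placeholder remedy and diet strings are all hypothetical, and the message is only built here, not sent.

```python
from email.message import EmailMessage


def build_report_email(sender, recipient, disease, remedies, diet):
    """Build a plain-text e-mail summarizing a prediction (does not send it)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Disease prediction: {disease}"
    body = "\n".join([
        f"Predicted disease: {disease}",
        f"Home remedies: {remedies}",
        f"Diet: {diet}",
    ])
    msg.set_content(body)
    return msg


# Hypothetical addresses and placeholder text, for illustration only.
message = build_report_email(
    sender="chatbot@example.com",
    recipient="user@example.com",
    disease="Fungal Infection",
    remedies="(remedy text goes here)",
    diet="(diet text goes here)",
)
print(message["Subject"])
```

A message built this way could then be handed to smtplib.SMTP.send_message, given a mail server and credentials that this sketch does not assume.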

Fig 4.10 symptoms selection page

In the above screen the symptom entered is ‘Chest Pain’; the output is shown below.

Fig 4.11 Disease prediction page

In the above screen the blue text shows that the Chatbot predicted the disease as ‘Heart Attack’
for the symptom ‘Chest Pain’. You can search for any symptoms in the same way. Now click
on the ‘Lifestyle & Disease Information’ link to view static information about diseases.

Fig 4.12 unknown disease prediction page

In the above screen the user can select a specific disease and then press the button to get
disease and diet information, as in the screen below.

Fig 4.13 diet guidance

In the above screen the user can see some answers about the selected disease along with
doctor details.

CHAPTER 5 CONCLUSION AND FUTURE SCOPE

5.1 CONCLUSION

The Symptoms based Disease Prediction system addresses key gaps in the existing
healthcare system. By leveraging Convolutional Neural Networks (CNN) and a
comprehensive data set of disease-symptom associations, it lets users receive
disease predictions based on their symptoms. The Chatbot's ability to provide
personalized dietary recommendations and to streamline the doctor appointment
booking process further improves the overall healthcare experience. The project is
a step towards democratizing healthcare information and making it readily
accessible to a broader population: it promotes early disease detection and
educates users on managing their health effectively. The system also has the
potential to reduce healthcare costs and ease the burden on medical professionals.
Through the integration of natural language processing (NLP), machine learning
algorithms, and medical knowledge bases, the Chatbot can interpret user-provided
symptoms and provide predictions regarding potential diseases. In conclusion, the
system illustrates the transformative impact of AI in healthcare, contributing to
improved health outcomes and enhanced patient care.

5.2 FUTURE SCOPE

Future work includes enhancing the Chatbot's real-time capabilities for doctor
appointments, incorporating continuous learning to improve diagnosis accuracy,
and expanding the database with more diseases and symptoms.

Enhanced Accuracy: Continuously refining the prediction algorithms and updating
the medical knowledge base can improve the accuracy of disease predictions.
Incorporating feedback mechanisms from healthcare professionals and users can
further enhance the Chatbot's performance.

Integration of Advanced AI Techniques: Exploring advanced AI techniques
such as deep learning, reinforcement learning, and ensemble methods could lead to
more robust and accurate disease prediction models.

Personalized Recommendations: Incorporating user-specific data such as medical
history, demographics, and lifestyle factors can enable the Chatbot to provide
personalized disease risk assessments and recommendations for preventive
measures.

Expansion of Features: Beyond disease prediction, the Chatbot can be expanded to
offer additional features such as medication reminders, lifestyle advice, and
teleconsultations with healthcare professionals for further evaluation and
guidance.
LIST OF PUBLICATIONS

I JOURNALS

1. P. Bhavani, T. Indu Priya, P. Abhishek, Ms. K. Sangeeta, “Symptoms based disease
prediction”, International Journal for Multidisciplinary Research (IJFMR 2024), Bengaluru, India.
Status: Submitted

II PRESENTATIONS IN INTERNATIONAL CONFERENCE

2. P. Bhavani, T. Indu Priya, P. Abhishek, Ms. K. Sangeeta, “Symptoms based disease
prediction”, International Journal for Innovative Science and Research Technology (IJISRT 2024).
Status: Submitted

3. P. Bhavani, T. Indu Priya, P. Abhishek, Ms. K. Sangeeta, “Symptoms based disease
prediction”, 3rd IEEE International Conference on Artificial Intelligence for Internet of Things (AIIoT) 2024.
Status: Submitted

1  Name of the Student: P. Bhavani, T. Indu Priya, P. Abhishek
2  Email ID and Phone Number:
   20951A0522@iare.ac.in, 9063297145
   20951A0556@iare.ac.in, 7075100314
   20951A0505@iare.ac.in, 7036699866
3  Roll Number: 20951A0522, 20951A0556, 20951A0505
4  Date of submission:
5  Name of the Guide: Ms. K. Sangeeta
6  Title of the project work / research article: Symptoms based Disease Prediction
7  Department: Computer Science and Engineering
8  Details of the payment:
9  No. of times submitted: First / Second / Third
   (First time: free; second time: Rs 200/-; third time: Rs 500/-; thereafter multiples of the third)
10 Similarity content (%) (up to 25% acceptable): 1st / 2nd / 3rd / 4th

For R & D Centre Use
Date of plagiarism check:
Similarity report percentage:
R&D staff Name and Signature:

I / We hereby declare that the above-mentioned research work is original and does not contain any
plagiarized content. The similarity index of this research work is……………..
Justification for similarity index: