PUSHADAPU BHAVANI 20951A0522
INDU PRIYA TENTU 20951A0556
ABHISHEK 20951A0505
SYMPTOMS BASED DISEASE PREDICTION
A Project Report
Submitted in partial fulfillment of the
DECLARATION
I certify that:
a. the work contained in this report is original and has been done by me under the guidance
of my supervisor(s).
b. the work has not been submitted to any other Institute for any degree or diploma.
c. I have followed the guidelines provided by the Institute in preparing the report.
d. I have conformed to the norms and guidelines given in the Ethical Code of Conduct of
the Institute.
e. whenever I have used materials (data, theoretical analysis, figures, and text) from other
sources, I have given due credit to them by citing them in the text of the report and giving
their details in the references. Further, I have taken permission from the copyright owners
of the sources, whenever necessary.
CERTIFICATE
This is to certify that the project report entitled “Symptoms based Disease Prediction”
submitted by Ms. Pushadapu Bhavani, Ms. Indu Priya Tentu, and Mr. Abhishek to the Institute of
Aeronautical Engineering, Hyderabad, in partial fulfillment of the requirements for the
award of the Degree Bachelor of Technology in Computer Science and Engineering is a
bonafide record of work carried out by them under my/our guidance and supervision. In whole
or in parts, the contents of this report have not been submitted to any other institute for the
award of any Degree.
Supervisor: Head of the Department:
Date:
APPROVAL SHEET
This project report entitled “Symptoms based Disease Prediction” by Ms. Pushadapu Bhavani,
Ms. Indu Priya Tentu, and Mr. Abhishek is approved for the award of the Degree Bachelor of
Technology in Computer Science and Engineering.
Examiners Supervisor(s)
Ms. K. Sangeeta
Principal Dr. L. V. Narasimha Prasad
Date:
Place:
ACKNOWLEDGEMENT
The satisfaction that accompanies the successful completion of any task would be incomplete
without the mention of the people who made it possible and whose constant guidance and
encouragement crown all the efforts with success. I thank our college management and respected
Sri M. Rajashekar Reddy, Chairman, IARE, Dundigal for providing me with the necessary
infrastructure to conduct the project work.
I express my sincere thanks to Dr. L. V. Narasimha Prasad, Professor and Principal who has been
a great source of information for my work, and Dr. C. Madhusudhan Rao, Professor and Head,
Department of CSE, for extending his support to carry on this project work.
ABSTRACT
The Symptoms based Disease Prediction project presents the development of a Chatbot designed to
assist in disease diagnosis by analyzing user-input symptoms and providing predictions of
potential diseases. The Chatbot also offers information on suitable diets and facilitates the
booking of doctor appointments. While the appointment booking is not a real-time feature, the
Chatbot predicts diseases and provides essential medical guidance, along with details
of doctors and hospitals. To accomplish this, we leverage machine learning techniques,
specifically a Convolutional Neural Network (CNN) algorithm, to train the Chatbot. The CNN
algorithm is trained on a dataset that maps disease names (under the 'Source'
column) to their associated symptoms (under the 'Target' column). By analyzing these symptoms,
the Chatbot predicts diseases and offers tailored dietary recommendations and home
remedies. This approach represents a promising step towards accessible and efficient
disease diagnosis and healthcare information dissemination.
Keywords: Disease Datasets, Chatbot, Machine Learning, Convolutional Neural Network.
CONTENTS
Title Page
Cover Page
Declaration
Certificate by Supervisor
Approval Sheet
Acknowledgement
Abstract
Contents
List of Figures
List of Abbreviations
1.1 Introduction
List of Publications
LIST OF FIGURES
LIST OF ABBREVIATIONS
FC Fully Connected
DB Database
CHAPTER 1 INTRODUCTION
1.1 INTRODUCTION
While users can view their chat history, the existing system may not offer a
comprehensive history of past interactions, limiting the ability to track health-
related discussions.
1.4 PROPOSED SYSTEM:
The proposed Symptoms based Disease Prediction system is designed to be an
advanced and user-centric healthcare tool that combines artificial intelligence,
specifically Convolutional Neural Networks (CNN), with a large dataset of
disease-symptom associations to deliver disease predictions. Users simply
input their symptoms, and the Chatbot uses the trained CNN to analyze the
data and report potential diseases. This system not only aids in early
disease detection but also extends its functionality by suggesting tailored diets and
home remedies based on the predicted condition. Moreover, the Chatbot assists
users in finding appropriate doctors and hospitals, streamlining the process of
seeking medical care.
1.5 MERITS OF PROPOSED SYSTEM OVER EXISTING SYSTEM:
1. The proposed system leverages Convolutional Neural Networks (CNN) and
a rich dataset for disease prediction, aiming for a higher degree of
automation and accuracy.
2. The Chatbot in the proposed system streamlines symptom analysis,
enabling quicker and more reliable diagnoses, which can be crucial in healthcare.
3. The proposed system is designed with the user's convenience in mind,
offering a more natural and accessible way to input symptoms and receive medical
guidance.
4. In addition to disease prediction, the proposed system provides tailored
dietary recommendations and helps users find appropriate healthcare professionals
and facilities.
1.6 REQUIREMENTS
When defining software requirements, the operating system is one of the primary
considerations. Compatibility with different versions of the same operating system
family is crucial, although complete backward compatibility cannot always be
guaranteed. For instance, software designed for one version of Windows may not
function on an earlier version, though some level of backward compatibility is
often maintained.
For example, most software designed for Microsoft Windows XP does not run on
Microsoft Windows 98, although the converse is not always true. Similarly,
software developed using newer features of Linux Kernel v2.6 typically does not
function or compile correctly on Linux distributions using Kernel v2.2 or v2.4.
APIs and drivers – For software that extensively utilizes specialized hardware
devices like high-end display adapters, special APIs or updated device drivers are
necessary. DirectX serves as a notable example, offering a collection of APIs
tailored for multimedia tasks, particularly in game programming, on Microsoft
platforms.
Web browser – In the realm of web applications and software heavily reliant on
internet technologies, the default browser installed on the system is often utilized.
Microsoft Internet Explorer is commonly chosen for software running on Microsoft
Windows, despite the vulnerabilities associated with ActiveX controls.
2) MySQL Workbench
Processing power – The processing power of the central processing unit (CPU)
stands as a fundamental system requirement for any software. For software
operating on x86 architecture, this power is typically defined by the model and
clock speed of the CPU. However, other critical features influencing speed and
power, such as bus speed, cache, and MIPS, are frequently overlooked. This
simplified definition of power can be misleading, as CPUs from different
manufacturers, like AMD Athlon and Intel Pentium, may exhibit varying
throughput speeds despite similar clock speeds. Intel Pentium CPUs have garnered
significant popularity and are often referenced in this context.
Secondary storage – The hard-disk requirements for software vary depending on
several factors. These include the size of the software installation, temporary files
generated during installation or operation, and potential utilization of swap space if
the available RAM is inadequate.
CHAPTER 2 LITERATURE SURVEY
https://ieeexplore.ieee.org/document/8862707
https://ieeexplore.ieee.org/document/9754016
https://ijcttjournal.org/archives/ijctt-v60p106
https://matjournals.in/index.php/JoWDWD/article/view/2334
http://www.ijaresm.com/medical-chatbot-for-disease-prediction-using-machine-learning
3.1 METHODOLOGY
In this project we develop a Chatbot that analyses input symptoms, predicts the
disease, and then displays diet information and doctor appointment details. It is
not a real-time application for booking a doctor; instead, it displays the predicted
disease and diet information along with doctor and hospital details. To identify a
disease, the Chatbot must be trained with machine learning so that it can take
symptoms as input and predict the disease. To train the Chatbot we use the CNN
algorithm, which is trained on a dataset with ‘Source’ and ‘Target’ columns.
In this dataset, the ‘Source’ column holds the disease name and the ‘Target’ column
holds all possible symptoms of that disease. By analysing those symptoms, the
CNN predicts the disease, and the Chatbot suggests diets along with home remedies.
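As a minimal sketch of how such a dataset could be prepared for training, the snippet below reads rows in the ‘Source’ (disease) / ‘Target’ (symptoms) layout and turns each symptom string into a fixed-length binary vector. The sample rows, the space-separated symptom format, and the helper names (`tokenize`, `load_rows`, `build_vocab`, `vectorize`) are assumptions for illustration; the report does not show the actual file or preprocessing code.

```python
import csv
import io
import re

# Hypothetical sample in the 'Source' (disease) / 'Target' (symptoms) layout.
SAMPLE_CSV = """Source,Target
Fungal Infection,itching skin rash patches
Heart Attack,chest pain breathlessness sweating
Fungal Infection,patches rashes itching
"""

def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def load_rows(handle):
    """Return (disease, symptom_tokens) pairs from a Source/Target CSV."""
    return [(row["Source"], tokenize(row["Target"])) for row in csv.DictReader(handle)]

def build_vocab(rows):
    """Map every symptom word to a column index, in sorted order."""
    words = sorted({word for _, tokens in rows for word in tokens})
    return {word: index for index, word in enumerate(words)}

def vectorize(tokens, vocab):
    """Binary bag-of-words vector; words outside the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for word in tokens:
        if word in vocab:
            vector[vocab[word]] = 1
    return vector

rows = load_rows(io.StringIO(SAMPLE_CSV))
vocab = build_vocab(rows)
labels = sorted({disease for disease, _ in rows})
features = [vectorize(tokens, vocab) for _, tokens in rows]
```

A user message such as ‘patches rashes’ would go through the same `tokenize` and `vectorize` steps before being passed to the trained model.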
1) Register: using this module, users can sign up with the application by providing
an email ID and contact number, so that an email can be sent with the predicted
disease and diet details.
4) Lifestyle & Disease Information: using this module, the user can select a disease
name, and the system will suggest foods to take and foods to avoid, along with
doctor details.
User Interface: Web interface, mobile app, or messaging platform (e.g., Facebook
Messenger, WhatsApp). Allows users to input their symptoms and receive
predictions.
Functional Requirements:
1. Data Collection
2. Data Preprocessing
3. Training and Testing
4. Modeling
5. Predicting
● Usability requirement
● Manageability requirement
● Serviceability requirement
● Recoverability requirement
● Data Integrity requirement
● Security requirement
● Availability requirement
● Interoperability requirement
● Maintainability requirement
● Reliability requirement
● Regulatory requirement
3.3 ALGORITHM:
Convolutional layer
Pooling layer
The Convolutional layer serves as the initial layer in a CNN. Subsequent
layers may be further Convolutional or pooling layers, and the fully connected (FC)
layer is the final layer. As data progresses through the layers, the features the
CNN extracts become more complex, allowing it to recognize larger portions of an
image. Initial layers focus on fundamental features like color and edges, and later
layers identify larger elements of the object until the target is finally identified.
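The data flow through these layer types can be traced with a small pure-Python sketch. It applies a one-dimensional convolution, a ReLU activation, max pooling, and a fully connected layer to a short numeric vector. The input, kernel, and weights are hypothetical values chosen only to make the arithmetic easy to follow, and the one-dimensional form is an assumption suited to symptom vectors rather than images; it is not the project's trained network.

```python
def conv1d(signal, kernel):
    """Valid 1D convolution (cross-correlation, as in most CNN libraries)."""
    width = len(kernel)
    return [
        sum(signal[start + offset] * kernel[offset] for offset in range(width))
        for start in range(len(signal) - width + 1)
    ]

def relu(values):
    """Replace negative activations with zero."""
    return [max(0.0, value) for value in values]

def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is kept."""
    return [max(values[start:start + size]) for start in range(0, len(values), size)]

def fully_connected(values, weights, biases):
    """One output score per class: dot(weights[c], values) + biases[c]."""
    return [
        sum(w * v for w, v in zip(row, values)) + bias
        for row, bias in zip(weights, biases)
    ]

# Hypothetical input and parameters, chosen only to trace the data flow.
signal = [0.0, 1.0, 1.0, 0.0, 0.0, 1.0]
kernel = [1.0, -1.0]
feature_map = relu(conv1d(signal, kernel))
pooled = max_pool(feature_map, 2)
scores = fully_connected(pooled, [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]], [0.0, 0.5])
```

In a trained network the kernel and the fully connected weights are learned from data rather than fixed by hand, and many kernels are applied in parallel.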
Fig 3.3.1 Convolutional Neural Network
Data Augmentation: Depending on the size of the dataset, you may need to
augment the data to prevent overfitting. Techniques such as adding noise to the
input data or generating synthetic samples can be beneficial.
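One possible form of such noise, shown here only as a sketch, is to create synthetic variants of a symptom list by randomly dropping some symptoms, which mimics users who report only part of what they experience. The function name `augment_by_dropout`, the drop probability, and the sample symptoms are assumptions and are not taken from the project code.

```python
import random

def augment_by_dropout(tokens, copies, drop_prob, rng):
    """Create `copies` variants of a symptom list, each with some symptoms removed.

    At least one symptom is always kept so that no variant is empty.
    """
    variants = []
    for _ in range(copies):
        kept = [token for token in tokens if rng.random() >= drop_prob]
        if not kept:
            kept = [rng.choice(tokens)]
        variants.append(kept)
    return variants

rng = random.Random(42)  # fixed seed so the sketch is repeatable
symptoms = ["itching", "skin", "rash", "patches"]
synthetic = augment_by_dropout(symptoms, copies=3, drop_prob=0.3, rng=rng)
```

Each synthetic variant keeps the disease label of the row it was derived from.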
Training: Divide the dataset into training, validation, and testing sets. Utilize the
training data to train the CNN model. Employ the validation set for
hyperparameter tuning and to monitor model performance. Assess the trained model's
generalization ability by evaluating it with the testing set.
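A three-way split of this kind can be sketched as follows. The 70/15/15 proportions, the function name `split_dataset`, and the placeholder samples are assumptions for illustration; the report does not state the proportions used for the Chatbot model.

```python
import random

def split_dataset(samples, train_pct=70, val_pct=15, seed=0):
    """Shuffle a copy of `samples` and cut it into train/validation/test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    train_end = len(shuffled) * train_pct // 100
    val_end = train_end + len(shuffled) * val_pct // 100
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]

# Placeholder (symptoms, disease) pairs standing in for real dataset rows.
samples = [(f"symptoms_{i}", f"disease_{i % 4}") for i in range(20)]
train, val, test = split_dataset(samples)
```

Shuffling before cutting keeps each subset representative; the test portion is set aside until tuning on the validation set is finished.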
Prediction: After training and evaluation, apply the CNN model to predict diseases
based on input symptoms. Implement post-processing steps to interpret model
predictions and provide meaningful outputs to users.
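A typical post-processing step, sketched here under assumptions, converts the raw scores of the final layer into probabilities with softmax and keeps only the most likely diseases. The class list, the score values, and the `k` and `min_prob` thresholds are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

def top_predictions(scores, labels, k=3, min_prob=0.05):
    """Return up to k (label, probability) pairs, most likely first."""
    ranked = sorted(zip(labels, softmax(scores)), key=lambda pair: pair[1], reverse=True)
    return [(label, prob) for label, prob in ranked[:k] if prob >= min_prob]

labels = ["Fungal Infection", "Heart Attack", "Migraine"]  # hypothetical classes
raw_scores = [2.0, 0.5, -1.0]                               # hypothetical model output
for label, prob in top_predictions(raw_scores, labels):
    print(f"{label}: {prob:.2f}")
```

Reporting a probability alongside each disease name lets the interface show how confident the model is instead of presenting a single answer as certain.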
Model Evaluation: Evaluate the CNN model's performance using metrics such as
accuracy, precision, recall, and F1-score. Perform additional analyses, such as
confusion matrix analysis, to gain insights into the model's strengths and
weaknesses.
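These metrics can be computed directly from the true and predicted labels. The sketch below uses a small hypothetical set of labels, not results from the project, and treats one class at a time as the positive class.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(y_true, y_pred))

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treating it as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical labels for five test samples.
y_true = ["fungal", "fungal", "heart", "heart", "heart"]
y_pred = ["fungal", "heart", "heart", "heart", "fungal"]
matrix = confusion_counts(y_true, y_pred)
precision, recall, f1 = per_class_metrics(y_true, y_pred, "heart")
```

The off-diagonal entries of the confusion matrix, such as `matrix[("heart", "fungal")]`, show which diseases the model confuses with each other.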
import numpy as np
import pandas as pd

def sort_data_by_date(file_path):
    df = pd.read_csv(file_path, parse_dates=True)
    sorted_df =
def split_the_train_test_the_data(roots="./data-set",
        drive_file="/ST1200NM0007_last_10_day.csv",
        ignore_cols=["date", "serial_number", "model", "capacity_bytes", "failure"],
        resample_data=False, smote_data=False):
    # Separate healthy drives (failure == 0) from failed drives (failure == 1)
    df_good = df.loc[df["failure"] == 0]
    df_bad = df.loc[df["failure"] == 1]
    df_good = df_good.sort_values(["date"])
    df_bad = df_bad.sort_values(["date"])
    good_y = df_good["failure"]
    bad_y = df_bad["failure"]
    # Take first 70% of the data as a train and the rest 30% as the test
    len(y1_test_good))
    if resample_data:
        X_train_bad = X_train_bad.sort_values(["date"])
        y_train_bad = X_train_bad["failure"]
    X_train = pd.concat([X_train_good, X_train_bad], axis=0)
    if smote_data:
        sm = SMOTE(random_state=42)
def get_train_test_data(ignore_cols=["date", "serial_number", "model", "capacity_bytes",
                                     "failure"], resample_data=False, smote_data=False):
    data_root_dir = "./data-set"
    good_drives_file = "/k_only_good.csv"
    failed_drives_file = "/k_only_failed.csv"
    # Sort df by date
    good_drives = sort_data_by_date(data_root_dir + good_drives_file)
    failed_drives = sort_data_by_date(data_root_dir + failed_drives_file)
    good_y = good_drives["failure"]
    failed_y = failed_drives["failure"]
    # Take first 70% of the data as the train and the rest 30% of the data as test
    # df.head(int(len(df)*(n/100)))
    if resample_data:
        X1_train_failed = X1_train_failed.sort_values(["date"])
    # Concatenating good and failed data-set to get test data-set and final train
    if smote_data:
    return (X1_train, X1_test, y1_train, y1_test)
print(type(model).__name__)
start = time()
model.fit(X1_train, y1_train)
end = time()
y1_pred = model.predict(X1_test)
(max_depth=2, random_state=0)
modelslist.append(rfca)
# modelslist.append(mlpcp)
run(modelslist)
Paste the contents into the MySQL console to create the necessary database.
Once the database has been created, the project can be run.
In the above screen, the Python server has started. Now open a browser, enter the URL
http://127.0.0.1:8000/index.html, and press the Enter key to get the page below.
Fig 4.2 Home page
In the above screen, the user enters the sign-up details. Give a valid mail ID so that
you can receive mails, and press the button to get the page below.
Fig 4.4 New user sign up page
In the above screen, the blue text shows that sign-up is complete. Now click on the
‘User’ link to log in as a user.
Fig 4.5 Log in page
In the above screen, the user logs in; after login, the page below is shown.
On the Chatbot page, type some symptoms. Here I gave the symptoms
‘patches rashes’. Then press the button to get a reply from the Chatbot, as in the
screen below.
Fig 4.7 Disease prediction
In the above screen, the blue text shows the predicted disease, ‘Fungal Infection’.
The lines below it show home remedies along with diet details; scroll down
the page to view the complete details.
Fig 4.8
In the above screen we can see the doctor details. The same information is also sent
by mail, as in the screen below.
Fig 4.9 E-mail notification
In the above email we can see the disease details with diet and remedies. Similarly,
you can search for any symptoms; below is another example.
In the above screen I gave the symptom ‘Chest Pain’, and below is the output.
Fig 4.11 Disease prediction
In the above screen, the blue text shows that the Chatbot predicted the disease
‘Heart Attack’ for the symptom ‘Chest Pain’. Similarly, you can search for any
symptoms. Now click on the ‘Lifestyle & Disease Information’ link to view static
information about a disease.
In the above screen, the user can select a specific disease and then press the button
to get disease and diet information, as in the screen below.
Fig 4.13 Diet guidance
In the above screen, the user can see answers about the selected disease along with
doctor details.
5.1 CONCLUSION
5.2 FUTURE SCOPE
Future work includes enhancing the Chatbot's real-time capabilities for doctor
appointments, incorporating continuous learning to improve diagnosis accuracy,
and expanding the database with more diseases and symptoms.
LIST OF PUBLICATIONS
I JOURNALS
5 Name of the Guide: Ms. K. Sangeeta
6 Title of the project work/research article: Symptoms based Disease Prediction
7 Department Computer Science and Engineering
8 Details of the payment
I / We hereby declare that, the above mentioned research work is original & it doesn’t contain any
plagiarized contents. The similarity index of this research work is……………..
Justification for similarity index: