Disease Prediction Using ML
net/publication/372859217
All content following this page was uploaded by Tatchemo GAUIAFAING Ronald on 03 August 2023.
By
MATRICULE: 21SI-004973
CERTIFICATION
This is to certify that this project titled “DISEASE PREDICTION WEB-APP USING
MACHINE LEARNING” was carried out by TATCHEMO GUIAFAING RONALD,
registration number 21SI-004973, an undergraduate senior year student in the School of
Information Technology of the Department of Software Engineering, Catholic
University Institute of Buea.
………………………………… Date…………………………………
……………………………………… Date………………………………………..
DEDICATION
I dedicate this project to the Lord Jesus Christ, who gave me the grace and strength to go through this
amazing degree program. I’m very grateful for the constant support and encouragement I have
received from family and friends. A special thanks to all the Academic Mentors I have come across
since I was a freshman; their lectures, support, and disapproval have contributed a lot to my adaptation
to this wide-ranging field of Information technology. The future would have never been that bright
without all of them.
ACKNOWLEDGMENTS
I am most grateful to the Almighty God for providing me with love, grace, mercy,
wisdom, and strength.
I appreciate all the guidance received from the V.P. of Academic Affairs,
Research, and Cooperation, Dr. FELICITAS ATABONG MOKOM.
I also thank Dr. DOUMBU REINE CHARLYE GUIAFAING for her expertise in the domain
of health care. I am grateful for her constant guidance, assistance, inspiration,
advice, expertise, love, and support, which made this work a success.
LIST OF ABBREVIATIONS
Kaggle: A subsidiary of Google and an online community of data scientists and machine learning
engineers. It is on this platform that the models for all four sicknesses were trained and tested,
then downloaded and inserted into the final application for future use.
PYTHON: A general-purpose language that can be used to create a variety of different programs.
ANACONDA: An open-source distribution of the Python and R programming languages for M.L
that aims to simplify package management and deployment.
M.L: Machine Learning
Support Vector Machine: Supervised learning models with associated learning algorithms that
analyze data for classification and regression analysis.
Datasets: A dataset is a collection of data.
Accuracy score: Accuracy is one metric for evaluating classification models.
Disease prediction: Predicting the user's disease based on the symptoms that the user (doctor)
provides as input.
NEURAL NETWORK: An artificial intelligence (AI) method that teaches computers to process data in
a way that is inspired by the human brain; hence it is said to use deep learning.
Spyder: An open-source, cross-platform integrated development environment for scientific
programming in the Python language.
DJANGO: A high-level Python web framework that encourages rapid development and clean,
pragmatic design.
STREAMLIT: An open-source app framework for Machine Learning.
MVC: Model View Controller; this is implemented in the app using Django.
TENSORFLOW: An open-source machine learning library.
Template Folder: The folder where HTML files are placed for further use in the code with the
Django framework.
Static Folder: The folder where CSS, image, JS, and SCSS files are placed for further use in the
code with the Django framework.
TSL Key: A security key generated by Google, based on the user's email address, in order to ease
user authentication before the user is able to predict a patient's sickness.
API: Application Programming Interface
LGR: Logistic Regression Model
TABLE OF CONTENT
DECLARATION
CERTIFICATION
DEDICATION
ACKNOWLEDGMENTS
LIST OF ABBREVIATIONS
TABLE OF FIGURES
ABSTRACT
1. INTRODUCTION
1.1 Background of the Study
1.2 Problem Statement
1.3 Aims and Objectives
1.4 SCOPE AND LIMITATIONS OF THE WORK
1.4.1 SCOPE
1.4.2 LIMITATIONS OF THE WORK
2. LITERATURE REVIEW
3. METHODOLOGY
A. SOFTWARE DEVELOPMENT: SICKNESS EDUCATIONAL SITE
A.1 PROCEDURE
A.1.1 System Design
A.1.2 Implementation
A.1.3 Testing
A.1.4 Deployment and maintenance
A.2 Project Plan
3.1 Frameworks Used in this Project
B. MACHINE LEARNING MODEL: PREDICTION APP
B.1 ALGORITHM USED
B.2 WORKFLOW OF MACHINE LEARNING
4 IMPLEMENTATION
4.1 Configuration
4.2 Test Results
5 CONCLUSION
5.1 Future Work
REFERENCES
TABLE OF FIGURES
INDEX OF TABLES
Table 1: Comparative study using various algorithms in the literature review
Table 2: Used Software Tools
Table 3: Hardware components used
ABSTRACT
Disease Prediction Using Machine Learning is a system used to predict diseases based
on the symptoms given by patients or any other user. Disease prediction in humans
also means predicting the probability of a patient's disease after examining the
combinations of the patient's symptoms. This analysis in the medical industry would
lead to streamlined and expedited treatment of patients. Previous researchers have
primarily emphasized machine learning models, mainly Support Vector Machine (SVM)
and K-nearest neighbors (KNN), for the detection of diseases with symptoms as
parameters. However, the data used by prior researchers for training the model is not
transformed, and the model is completely dependent on the symptoms, while its
accuracy is poor. Hence, there is a need to design a modified model for better
accuracy and early prediction of human disease. The proposed model improves
efficacy and accuracy by resolving the issues of the earlier researchers' models. It
uses a medical dataset from Kaggle and transforms the data by assigning weights
based on their rarity. The model is then trained on this dataset using a combination
of machine learning algorithms, including SVM and a Neural Network. Parallel to this,
the history of the patient can be analyzed using the LSTM algorithm. SVM is then used
to conclude the possible disease. The proposed model has achieved better accuracy
and reliability compared to state-of-the-art methods. It can contribute to the
automation of the healthcare industry. It will be connected to hospital websites,
and only an authenticated user (a doctor) will be connected to the main predicting
web application for a possible prediction.
1. INTRODUCTION
1.1 Background of the Study
Machine Learning is the domain that uses past data to make predictions. It is the
study of computer systems in which a model learns from data and experience. A machine
learning algorithm has two phases: 1) Training and 2) Testing. Machine learning
technology has been struggling for decades to predict disease from a patient's
symptoms and history. Healthcare issues can be solved efficiently by using machine
learning technology. We are applying machine learning concepts to keep track of
patients' health. ML allows us to build models that quickly clean and process data and
deliver results faster. By using this system, doctors will make good decisions related
to patient diagnoses, and accordingly good treatment will be given to the patient,
which improves patient healthcare services. Healthcare is a prime example of a field
into which machine learning can be introduced. For the prediction of diseases,
existing work relies on linear, KNN, and Decision Tree algorithms.
Specialists find it difficult to make decisions about illnesses because they may not
have skills in all areas. To address this issue, it is necessary to develop a disease
prediction system that combines medical knowledge with an integrated system to produce
the best results and help society [1].
✓ According to World-data Info, there are 0.09 doctors per 1,000 inhabitants in
Cameroon.
✓ Specialists find it difficult to make decisions about the illnesses because they
may not have skills in all areas.
1.3 Aims and Objectives
✓ Educate the population with respect to the major illnesses faced by the
population in our community (such as diabetes, breast cancer, Parkinson's,
and Heart Disease), especially how they are transmitted, how they can be
avoided, and the preventive measures put in place by the Government.
2. LITERATURE REVIEW
Numerous studies have been done on predicting disease using different machine learning
techniques and algorithms that can be used by medical institutions. This paper reviews
some of those studies, covering the techniques they used and the results they obtained.
The reviews are given below:
A. Reviews:
Min Chen et al. [2] proposed a disease prediction system using machine learning
algorithms. For the prediction of disease, they used techniques such as the CNN-UDRP
algorithm, CNN-MDRP algorithm, Naive Bayes, K-Nearest Neighbor, and Decision Tree.
The proposed system had an accuracy of 94.8%.
Sayali Ambekar et al. [3] recommended Disease Risk Prediction and used a convolutional
neural network to perform the task. In this paper, machine learning techniques such as
the CNN-UDRP algorithm, Naive Bayes, and the KNN algorithm are used. The system is
trained on structured data, and its accuracy reaches 82%, achieved by using Naïve Bayes.
Naganna Chetty et al. [4] developed a system that gives improved results for disease
prediction using a fuzzy approach, with techniques such as the KNN classifier, Fuzzy
c-means clustering, and the Fuzzy KNN classifier. In this paper, diabetes and liver
disorder prediction is done; the accuracy for diabetes is 97.02% and for liver disorder
is 96.13%.
Dhiraj Dahiwade et al. [5] designed a model for disease prediction using machine
learning approaches, with techniques such as KNN and CNN. A Perceptron model is used in
this system, which predicts heart disease based on basic symptoms such as age, sex, and
pulse rate. The accuracy of this suggested system is 91%.
Ankita Dewan et al. [8] recommended a disease prediction system that uses a hybrid data
mining classification technique for predicting heart disease. This system uses techniques
such as Neural Network, Decision Tree, and Naive Bayes. The accuracy of this system is 87%.
YEAR | AUTHOR | PURPOSE | METHOD USED | ACCURACY
2017 | Min Chen et al. [2] | Proposed a disease prediction system in his paper where he used machine learning algorithms | CNN-UDRP algorithm, CNN-MDRP algorithm, Naive Bayes, K-Nearest Neighbor, Decision Tree | 0.95
2018 | Sayali Ambekar et al. [3] | Recommended Disease Risk Prediction and used a convolution neural network to perform the task | CNN-UDRP algorithm, Naive Bayes, and KNN algorithm | The highest accuracy of 82% is achieved by Naïve Bayes
2019 | Dhiraj Dahiwade et al. [5] | Designed a model for prediction of the disease using approaches of machine learning | K-Nearest Neighbor (KNN) and Convolutional Neural Network (CNN) | KNN: 95%; CNN: 98%
2017 | Lambodar Jena et al. [6] | Focused on risk prediction for chronic diseases by taking advantage of distributed machine learning classifiers | Naive Bayes; Multilayer Perceptron | 95%; 99.7%
 |  |  |  | 84.42%; Heart Disease: 87.12%
 |  |  |  | Diabetes: 74.03%; Heart Disease: 70.97%
2017 | Deeraj Shetty et al. [10] | Studied the uses of data mining for diabetes disease prediction | Naïve Bayes and KNN | KNN gives better accuracy compared to Naïve Bayes
2017 | Rashmi G Saboji et al. [11] | Tried to find a scalable solution that can predict heart disease utilizing classification mining | Random Forest Algorithm | 0.98
Table 1: Comparative study using various algorithms in the literature review
3. METHODOLOGY
Developing software is not easy; designing it is a complicated and time-consuming process.
During the program design life cycle, programmers customize software for clients. Programmers
often change their designs according to customers' requirements and other limitations. The
design process is unstable and complex, while the requirements of customers are always
idealistic. It is therefore imperative to make the right decision when choosing which software
development methodology to use.
It is essential to know that building software is not simply about writing some code and then being
done. To begin with, software developers talk to their clients about what they want the software to be
and what its major functions are. Then a development methodology is needed. Most companies use
public methodologies; some companies use their own. Awad divides those methodologies into two
categories: heavyweight and lightweight. In Awad's words, heavyweight methodology refers to
traditional methodologies; the most classical traditional methodology is waterfall, which is the
only one I will discuss and compare with agile methodology.
Implementing this Disease Prediction System Using Machine Learning combines two parts. The first
is Software Development Principles, for the website that will help educate communities with
respect to each pathology. The second is the Machine Learning Model, which involves obtaining the
dataset from Kaggle, pre-processing the data, splitting the data into training and testing data,
and passing the split data to a machine learning algorithm (SVM, LGR, or Deep Neural Network, as
the case may be). The trained classifier is then saved, and new data is passed to it through an
if/else condition that returns either 0 or 1, that is, "Patient is having..." or "Patient is not
having...", as the case may be, with respect to the 4 various sicknesses.
A. SOFTWARE DEVELOPMENT: SICKNESS EDUCATIONAL SITE
A.1 PROCEDURE
To achieve the objective of the system, the software development methodology needs to
be chosen wisely to better plan the flow of the system's development. There are
various types of models in the Software Development Life Cycle (SDLC), including the
V-Shaped Model, Evolutionary Prototyping Model, Spiral Model, Iterative Waterfall
Model, and Agile Model. The choice made for the development was the Iterative
Waterfall Model.
The figure above shows the methodology that will be used in this research. The waterfall approach
emphasizes a structured and well-defined progression between each phase.
Each phase consists of a defined set of activities and deliverables that must be accomplished
before going to the next phase or step. The first phase tries to capture what the system
will do, which is its requirements; the second phase determines how it will be designed; the third
phase is the actual programming; the fourth phase is the full system testing; and the fifth and
final phase focuses on implementation tasks, such as documentation. The six
stages involved in the iterative waterfall model are as follows:
1. Requirement Analysis
2. System Design
3. Implementation
4. Testing
5. Deployment
6. Maintenance
During the requirement analysis phase, existing systems are analyzed, and all the requirements
needed to develop the new system are identified. Information regarding the system is
gathered and studied, in the form of journals, articles, or research papers. The findings are
summarized and analyzed to derive the functional and non-functional requirements of the system.
A.1.1.1 Functional Requirements
I) Any individual can educate himself or herself about various pathologies, especially how they
are transmitted and how they can be prevented.
II) They can subscribe to hospital newsletters in order to receive information weekly.
III) They can leave a message for the hospital office by using the contact form.
I) Doctors can log in to the system or create an account, which has to be activated by IT
personnel at each hospital.
II) Once the doctor has logged in, he or she is redirected to the Multiple Disease Prediction
App for a possible prediction, where a form is presented to be filled out.
A.1.1 System Design
The system design process normally partitions the requirements into either hardware or software
systems. It builds the overall system architecture. Software design involves identifying and
describing the fundamental abstractions of software systems and their relationships. Based on the
detailed requirements obtained from the first phase of this project, the software
architectural design task is performed, which includes identifying the data flow, the class
diagram of this application, and a test plan. The data flow diagram shows the flow of
information and the transformations applied when data moves in and out of a system.
The diagram above shows the data flow of a common user's interaction with the system.
Figure 3: Data flow of a Medical Doctor interaction with the system
The diagram above shows the data flow of the (approved) medical doctor's interaction
with the system. The class diagram and E-R diagram are used to illustrate the objects,
their properties and methods, and the relationships between objects once the user has
authenticated and been redirected to the Prediction Application.
Figure 4: ER-Diagram of the Prediction System Once the Doctor Has Logged In
A.1.2 Implementation
Based on the architectural design from the previous phase, the implementation task is
performed. The whole application is partitioned into sets of program modules.
A.1.3 Testing
The system is built and tested as a complete system to ensure that the software requirements have been
met. This is done at the end of the Machine Learning part, which consists of training the model and
testing the trained model before inserting it into the Spyder application and thus connecting it to the
Django web application. After my own testing of the full system, the system is released to the public
to be tested by other users. This is the phase where the application is fully tested to ensure it is
up and running without any errors. At this stage, the application is carefully monitored to ensure
that it is error-free and works as expected.
A.1.4 Deployment and maintenance
This is the longest phase; the application is deployed to an online server and put into use. Maintenance
involves correcting errors that were not discovered in the early stages, improving the implementation
of the system units, and enhancing the system's components. In summary, with the iterative waterfall
model, the project is divided into phases; typically, the previous phase's deliverables have to be
completed before the next can be executed, but where something in an earlier phase needs to be
corrected, the model allows you to iterate back to the previous phases for modifications. This model
was chosen because of the nature of the project, in which some phases need modifications in order for
the other phases to be modified and enhanced. The approach allows for a good estimation of the time
allocated for each of the major phases of the project.
A.2 Project Plan
A project plan is a formal document designed to guide the control and execution of a project. The
primary uses of the project plan are to document planning assumptions and decisions, facilitate
communication among project stakeholders, and document approved scope, cost, and schedule
baselines. This project plan will include cost estimation and analysis, and a project schedule.
Software | Reason of Usage
Linux (UBUNTU 23.04) | Most popular Linux distribution, with good performance and security enhancements. It is a lightweight operating system and consumes fewer hardware resources.
Python (3.9.17) | Python is an easy-to-learn, object-oriented programming language that comes with a lot of libraries and packages for developing software.
Kaggle | It provides an online environment that helps to train and test on the downloaded dataset before connecting the result to the Spyder application.
Web browser (Google Chrome) | It is used as a platform for finding articles for the literature review (previous work).
Table 2: Used Software Tools
The table above lists the software and hardware components used for both the web app and the
Machine Learning Model.
3.1 Frameworks Used in this Project
Bootstrap: It is the most popular HTML, CSS, and JS framework for developing responsive, mobile-
first projects on the web.
Django: Django is an open-source Python web framework used for the rapid development of pragmatic,
maintainable, cleanly designed, and secure websites.
Django is based on the MVT (Model-View-Template) architecture. MVT is a software design pattern for
developing a web application.
It is a collection of three important components: Model, View, and Template. The Model helps to handle
the database. It is a data access layer that handles the data. The Template is a presentation layer that
handles the user interface part completely. The View executes the business logic, interacts
with a model to carry data, and renders a template. Django follows the MVC pattern but maintains its
own conventions.
Streamlit (Spyder): Even though the project has an educational part, in which each sickness is
explained to the user, especially how it is transmitted and how it can be avoided, the principal
objective of the research work is to ensure that doctors are able to determine, after filling out
a form, whether a patient is sick or not.
Spyder was built specifically for data science. Its interface allows the user to scroll through
various data variables and also offers an online help option. The output of the code can be viewed in
the Python console on the same screen (for this research, Streamlit is used). I can work on
different scripts at the same time and then try them out one by one in the same console or in
different ones, as preferred; all the variables used are stored in the variable explorer tab. It
also provides an option to view graphs and visualizations in the plot window.
The model is trained and tested on Kaggle, then downloaded as a file with the .sav extension
and connected to the Spyder app. A form is then created in the Spyder app so that the data entered
in the form can be passed to the trained model after the user has been given access by the Django
web app.
For 3 of the disease predictions, the SVM algorithm is used, which is a supervised learning
algorithm. In supervised learning, we feed our data to the machine learning model, and the model
learns from the data and its respective labels. In this case, we train our models with several
pieces of medical information (such as the blood glucose level and the insulin level of patients),
along with whether the person has diabetes or not. The latter acts as the label showing whether the
person is diabetic or not. Once we feed this data to our support vector machine, it tries
to plot the data and, once it is plotted, tries to find a hyperplane.
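As a minimal sketch of the supervised learning step described above, the snippet below fits a linear support vector machine with scikit-learn on a handful of made-up (glucose, insulin) readings. The values, the variable names, the label encoding, and the choice of a linear kernel are illustrative assumptions, not the project's actual data or settings.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: each row is (blood glucose, insulin); values are made up.
X = np.array([
    [85.0, 60.0],
    [90.0, 70.0],
    [95.0, 65.0],
    [160.0, 180.0],
    [170.0, 200.0],
    [180.0, 190.0],
])
# Assumed label encoding: 0 = non-diabetic, 1 = diabetic.
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel makes the separating hyperplane explicit.
classifier = SVC(kernel="linear")
classifier.fit(X, y)

# Hyperplane parameters: points with w . x + b = 0 lie on the boundary.
w = classifier.coef_[0]
b = classifier.intercept_[0]

train_predictions = classifier.predict(X)
print("weights:", w, "bias:", b)
print("predictions on the training data:", train_predictions)
```

The model only ever sees the feature rows and their labels; the hyperplane it finds is summarized by `w` and `b`.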
1: From Kaggle.com, I will download the datasets, which contain the data and the respective
labels that will be fed into our models.
2: I will pre-process the data and try to analyze it. To make the data suitable for the
machine learning model, we need to standardize it. (Because there are a lot of attributes
here, there is a lot of medical information.) So standardizing this data is important.
3: Once I have pre-processed the data, I will split it into training and testing data, and train
our machine learning model with the training data.
4: Once I have split the data into training data and testing data, we will feed it to our Support
Vector Machine models. We will be using classification models, which will classify
whether the patient is diabetic or non-diabetic. (This is for the case of diabetes prediction.)
5: Once I have a trained Support Vector Machine classifier, when we give it new data, I can
predict whether the person is diabetic or non-diabetic.
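The five steps above can be sketched end to end as follows. This is an illustrative outline under stated assumptions: synthetic two-feature data stands in for the Kaggle dataset, a linear kernel is assumed, and all variable names and numeric values are hypothetical.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Step 1 (stand-in): synthetic data replaces the downloaded Kaggle dataset.
rng = np.random.default_rng(0)
healthy = rng.normal(loc=[95.0, 80.0], scale=10.0, size=(50, 2))
diabetic = rng.normal(loc=[165.0, 190.0], scale=10.0, size=(50, 2))
X = np.vstack([healthy, diabetic])
Y = np.array([0] * 50 + [1] * 50)  # assumed encoding: 1 = diabetic

# Step 2: standardize the features (zero mean, unit variance per column).
scaler = StandardScaler()
X_standardized = scaler.fit_transform(X)

# Step 3: split into training and testing data.
X_train, X_test, Y_train, Y_test = train_test_split(
    X_standardized, Y, test_size=0.2, stratify=Y, random_state=2
)

# Step 4: train the Support Vector Machine classifier and evaluate it.
classifier = SVC(kernel="linear")
classifier.fit(X_train, Y_train)
test_accuracy = accuracy_score(Y_test, classifier.predict(X_test))

# Step 5: predict for new data, applying the same standardization first.
new_patient = np.array([[170.0, 200.0]])
new_prediction = classifier.predict(scaler.transform(new_patient))[0]

print("test accuracy:", test_accuracy)
print("prediction for the new patient:", new_prediction)
```

The sketch follows the order given in the text (standardize, then split). A common alternative is to fit the scaler on the training split only, so that no information from the test data influences pre-processing.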
This image makes the hyperplane visible. The hyperplane separates the two groups of data.
When I feed new data to the model, it tries to place that data point into one of
these two groups.
Figure 8: Workflow of Prediction Architecture for Patients with Heart Diseases
The Support Vector Machine model's function is shown in the image above. In effect, it
plots the dataset entered in terms of X and Y (or Feature 1 and Feature 2), where one
set of points is people with Parkinson's and the other set is people without
Parkinson's (or any other disease the model is trained on). The SVM finds the best line
of separation between the two sets of data.
Support vectors are the data points that lie very close to this hyperplane. If the
orientation of this hyperplane changes, the SVM classifier also changes. When a new
data point is given, the classifier finds out on which side of the hyperplane the data
point lies. That is how SVM works. Two dimensions are not sufficient in some cases, so
we sometimes need more than two dimensions to obtain the result of our testing.
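To make the "which side of the hyperplane" idea concrete, here is a small self-contained sketch. The weights, bias, and sample points are hypothetical; a trained SVM would supply its own learned values.

```python
def side_of_hyperplane(weights, bias, point):
    """Return 1 if the point lies on the positive side of w . x + b = 0, else 0."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if score > 0 else 0


# Hypothetical hyperplane in two dimensions: x1 + x2 - 10 = 0.
weights_2d = [1.0, 1.0]
bias = -10.0

above = side_of_hyperplane(weights_2d, bias, [8.0, 7.0])  # score = 5.0
below = side_of_hyperplane(weights_2d, bias, [2.0, 3.0])  # score = -5.0

# The same rule applies unchanged when a third feature is added.
weights_3d = [1.0, 1.0, 1.0]
above_3d = side_of_hyperplane(weights_3d, bias, [2.0, 3.0, 9.0])  # score = 4.0

print(above, below, above_3d)
```

The point (2, 3) falls on the negative side in two dimensions, while adding a third feature value of 9 moves it to the positive side, which illustrates why extra dimensions can change the result.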
Figure 11: Screenshot of the 4 downloaded datasets for a possible prediction
The figure above is a capture of the datasets for each sickness, downloaded from Kaggle to my
local repository (computer). They are used locally to ensure that the web app functions as trained,
with the same accuracy for each sickness.
These datasets and the final trained models are then moved into the Spyder directory, and their
paths are referenced in our Spyder app, where a form is generated for each and combined at the
end using the option menu of Streamlit.
4 IMPLEMENTATION
4.1 Configuration
As mentioned earlier in the Methodology section, this project is divided into two paths:
the Software Development section and the building of the Machine Learning Model.
Having downloaded each dataset on Kaggle, I created an account on Kaggle and created
a directory for each of those projects.
Figure 12: Configuration of the Kaggle Notebook for the Heart Disease datasets
Figure 13: Downloading the Required Software
The Outcome column is dropped from each dataset, and then each dataset is split with
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.1, stratify = Y, random_state = 2).
4 variables are created, as seen above: X_train (containing the features of all the training data),
X_test (containing the features of all the test data), Y_train (containing the targets
corresponding to X_train), and Y_test (containing the targets corresponding to X_test).
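The effect of the split parameters quoted above can be checked with a short sketch. The array below is a hypothetical stand-in (100 samples, 8 features, 30% positive labels), not the project's dataset; only the `train_test_split` arguments are taken from the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for a dataset with the Outcome column already separated:
# X holds the features, Y holds the 0/1 targets.
X = np.arange(800, dtype=float).reshape(100, 8)
Y = np.array([1] * 30 + [0] * 70)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.1, stratify=Y, random_state=2
)

# test_size=0.1 holds out 10% of the rows; stratify=Y keeps the share of
# positive labels the same in both parts; random_state makes the split repeatable.
print(X.shape, X_train.shape, X_test.shape)
print("positive share:", Y.mean(), Y_train.mean(), Y_test.mean())
```

Without `stratify`, a small test set could by chance contain very few positive cases, which would make the accuracy score on it less informative.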
Figure 14: Representation of the datasets split in two
Figure 15: Accuracy score of the Training data of Diabetic dataset
Figure 16: Visualization of the Train model for the Diabetic Datasets
The pickle library is used to save these models. A filename variable is set to
trained_model.sav, and the trained model is held in a variable called classifier. The file named by
filename is opened in wb (write, binary) mode, so the file is written in binary format, and what is
written is nothing more than the classifier. This is, in effect, the model trained on the diabetic
dataset. In this case, I have used the PIMA diabetic dataset (diabetes is a chronic condition that
causes a person's blood sugar level to become too high). Running this code creates a file called
trained_model.sav. The same prediction is then carried out through a webpage: a user interface is put
in place, and through this interface the user is prompted to enter these details. The code then predicts
whether the person is diabetic or not. Each model is saved and downloaded in this way. Spyder IDE has
been used to design the user interface.
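
The save step described above can be sketched as follows. The names trained_model.sav, classifier, and the wb mode come from the text; the training data is synthetic, and the choice of LogisticRegression is an assumption made only to have a fitted model to save. The file is written to a temporary directory so that the sketch leaves nothing behind.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for the real dataset (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
Y = (X[:, 1] > 0).astype(int)

# Any fitted scikit-learn estimator can be pickled the same way.
classifier = LogisticRegression(max_iter=1000).fit(X, Y)

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, "trained_model.sav")

    # "wb" = write, binary: pickle serializes the classifier to bytes.
    with open(filename, "wb") as f:
        pickle.dump(classifier, f)

    # "rb" = read, binary: reloading the file gives back an equivalent model.
    with open(filename, "rb") as f:
        loaded_model = pickle.load(f)

same_predictions = bool((loaded_model.predict(X) == classifier.predict(X)).all())
print("round trip preserved predictions:", same_predictions)
```

A pickled model should only be loaded from a trusted source and with a compatible scikit-learn version, since unpickling executes code stored in the file.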
Figure 17: All the trained models and datasets have been downloaded for future use
The image above shows both the trained models (which have the extension .sav) and the datasets
downloaded from Kaggle at the beginning of the project.
The local implementation is essentially the same as the one done online on Kaggle.
A function is created, and all of the declared prediction steps are included in it. Having
created the function, I created the interface (diabetic webpage), which presents the user with
fields to fill in (the Outcome field is left out, since it is the value the model predicts).
What the user enters is passed to the trained model, and the model returns an
answer of either 0 or 1.
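
A minimal sketch of such a prediction function is given below. The function name diabetes_prediction, the message strings, and the stand-in model are hypothetical; in the project the model would be the one loaded from the .sav file rather than one fitted on synthetic data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in for the model loaded from trained_model.sav (assumption):
# fitted here on synthetic data with 8 features so the sketch runs alone.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 8))
Y = (X[:, 1] + X[:, 5] > 0).astype(int)
loaded_model = LogisticRegression(max_iter=1000).fit(X, Y)


def diabetes_prediction(input_data):
    """Return (label, message) for one row of feature values."""
    # The model expects a 2-D array: one row, one column per feature.
    row = np.asarray(input_data, dtype=float).reshape(1, -1)
    label = int(loaded_model.predict(row)[0])
    if label == 1:
        message = "The person is diabetic"
    else:
        message = "The person is not diabetic"
    return label, message


# The values a user would type into the form, in training-column order.
label, message = diabetes_prediction([0, 2.0, 0, 0, 0, 2.0, 0, 0])
print(label, message)
```

The reshape to a single row is the step that is easiest to forget: the form yields a flat list of values, while predict expects one row per patient.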
Figure 19: Spyder IDE showing the loaded trained models for the four diseases
The figure above is a screenshot of the combined trained models loaded into the Spyder
application. The form is arranged using col1, col2, and col3 definitions, which divide each
row into three columns. A row that would otherwise hold a single field now holds three fields,
each of numeric data type, which makes the application less cumbersome.
The option-menu function is then called, and the result is viewed using Streamlit,
which serves the web application.
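
The three-column arrangement and the menu can be sketched as below. This is an illustration under assumptions: the field names are those of the public PIMA diabetes dataset (Outcome excluded), the menu title, menu entries, and button label are hypothetical, and option_menu is taken from the third-party streamlit-option-menu package. To try it, call render_app() at the end of the script and start it with `streamlit run`.

```python
# Feature names of the public PIMA diabetes dataset, Outcome excluded.
DIABETES_FIELDS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


def rows_of(fields, width=3):
    """Group field names into rows of at most `width` items."""
    return [fields[i:i + width] for i in range(0, len(fields), width)]


def render_app():
    """Draw the sidebar menu and the diabetes form inside a Streamlit script."""
    # Imported here so the layout helper above works without Streamlit installed.
    import streamlit as st
    from streamlit_option_menu import option_menu

    with st.sidebar:
        selected = option_menu(
            "Disease Prediction",  # hypothetical menu title
            ["Diabetes Prediction", "Heart Disease Prediction"],
            default_index=0,
        )

    if selected == "Diabetes Prediction":
        values = {}
        for row in rows_of(DIABETES_FIELDS):
            # One st.columns(3) call per row: three number fields side by side.
            for col, name in zip(st.columns(3), row):
                with col:
                    values[name] = st.number_input(name, value=0.0)
        if st.button("Diabetes Test Result"):
            st.write(values)  # a real app would pass these to the model


layout = rows_of(DIABETES_FIELDS)
print(layout)
```

Grouping the eight fields three per row gives three rows instead of eight, which is the reduction in form length that the paragraph above describes.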
Figure 20: Visualization of the Final Version of the Prediction App
Figure 22: Joining the Django front-end website with the Prediction Application
Figure 23: Front view of the web-app combined with the Streamlit (Prediction App)
Figure 24: Visualizing the Sign-In Option Menu
5 CONCLUSION
This project aims to predict disease based on symptoms. The project is set up in such a
way that the system takes the symptoms entered by the approver (a medical doctor) as input and
generates a disease prediction as output. An average prediction accuracy of
95% is obtained. The GRAILS system was used to successfully incorporate
the disease predictor. In order not only to predict sickness but also to eradicate or reduce
its propagation, a front-end website was also implemented where any user can
learn about various diseases, especially the most prevalent ones, and how
they are transmitted. If the user has more questions, the Contact Us page provides
a form to fill out; the user is then contacted a few days later by a nurse,
doctor, or other healthcare professional, depending on the nature of the
questions.
The application can therefore be mounted on or added to any hospital website where
needed.
5.1 Future Work
I intend to replace the Contact Us page with a chatbot interface
that, using the principles of deep learning, assists any user.
I intend to ensure that the data entered by the user in the Spyder application is
stored for future training of the model, since some diseases may mutate and become more
and more difficult to predict.
I intend to put the final application version online for free for use in underprivileged
communities in Cameroon and across Africa.