Disease Prediction and Drug Recommendation Using Machine Learning
Presented By - Batch-D15
Manaal Ahmed Kabeer (18D21A05L1)
P. Manisha (18D21A05M2)
K. Prathibha (18D21A05J9)
Coordinator - Dr. M. Ramasubramanian (Professor)
Guide - Ms. Y. Divya (Asst. Professor)
CONTENT
Abstract
Introduction
Existing system
Disadvantages
Problem statement
Proposed system
Advantages
System requirement specifications
System design architecture model
Algorithm
Data flow diagram
Modules
Functional Requirements and Non Functional Requirements
UML Diagrams
Algorithm Explanation
Conclusion
ABSTRACT
The 21st century has seen many profound technological advances, including many advances in medicine. As a result, life expectancy has increased from 30-40 years in the 19th century to 60-70 years in the 21st century. With the rise of artificial intelligence since 1950, there has also been considerable development in other areas such as the medical field. We propose a solution that uses different concepts of Artificial Intelligence / Machine Learning. We will implement two components: disease prediction, and medicine/drug recommendation based on the predicted disease.
In the proposed solution, the first component takes the symptoms entered by the user and predicts the disease using machine learning models such as Logistic Regression and Support Vector Machine. The second component recommends medicines using a recommendation algorithm from the Python package LightFM. As an additional feature for incurable diseases, we propose using other AI techniques, such as a Deep Neural Network, to recommend the drug.
INTRODUCTION
A huge number of variables must be considered to understand the complete diagnostic process, yet no model has analyzed them all successfully.
Since the arrival of advanced computing, doctors have relied on technology in various ways, such as surgical representation and X-ray photography, but the technology has perpetually stayed behind.
Diagnosis still depends on the doctor's experience because of the many factors involved, ranging from medical records to weather conditions.
To tackle this drawback, medical decision support systems must be used.
Such a system can assist doctors in making the correct decision.
A medical decision support system covers both the process of attempting to determine or identify possible diseases or disorders and the opinion reached by this process.
EXISTING SYSTEM
Medical decision making can be an extremely specialized and difficult job because of the many factors involved, or in the case of rare diseases.
These factors include stress and fatigue; misdiagnosis may result from anything from a doctor's lack of knowledge to incomplete information.
A standard algorithm may have to go through all the variables, such as prevailing conditions, medical history, family records and various other factors in the patient records, as well as the sheer magnitude of possible hidden factors.
Differential diagnosis methods can be used to identify the presence of an entity where multiple alternatives are possible; the term also refers to the list of candidate alternatives.
This method needs a process of elimination, or the gathering of information that shrinks the probability of candidate conditions to negligible levels.
DISADVANTAGES
PROBLEM STATEMENT
In the proposed solution, the first component takes the symptoms entered by the user and predicts the disease using machine learning models such as Logistic Regression and Support Vector Machine.
The second component recommends medicines using recommendation algorithms.
The Python package LightFM provides a recommendation algorithm, which will be used for recommending medicines.
As an additional feature for incurable diseases, we propose using other AI techniques, such as a Deep Neural Network, to recommend the drug.
PROPOSED SYSTEM
The proposed system reduces the large number of variables and finds the most probable diseases by using the K-Means algorithm.
This algorithm is well suited to clustering a large number of diseases.
K-Means is an unsupervised learning algorithm used to solve clustering problems.
The main idea is to determine k centroids, one for each cluster. The different tests performed on the patients serve as the attributes for clustering.
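The clustering idea above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the project's actual implementation; the function names, the patient values and the choice of k = 2 are assumptions made for the example. In practice a library implementation would normally be used instead.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100, seed=0):
    """Cluster points (tuples of floats) into k groups."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random as the initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical test results per patient: (body temperature, fasting glucose).
patients = [
    (36.8, 90.0), (37.0, 95.0), (36.9, 92.0),
    (38.9, 160.0), (39.1, 170.0), (38.7, 165.0),
]
centroids, labels = k_means(patients, k=2)
print(labels)
```

Each patient ends up with a cluster label, and the centroids summarize the typical test values of each group. Which cluster is numbered 0 or 1 depends on the random starting centroids.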
ADVANTAGES
SYSTEM REQUIREMENTS
Hardware Requirements-
Processor: Intel i3
RAM: 4 GB (min)
Hard Disk: 128 GB
Keyboard: Standard Windows Keyboard
Mouse: Two or Three Button Mouse
Monitor: Any
Software Requirements-
Operating System: Windows 7+
Server-side Script: Python 3.6+
IDE: PyCharm
Libraries Used: Pandas, NumPy, Flask
SYSTEM DESIGN-ARCHITECTURE
ALGORITHM
LOGISTIC REGRESSION
SUPPORT VECTOR MACHINE
DATA FLOW DIAGRAM
The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on this data, and the output data generated by the system.
The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components: the system processes, the data used by the processes, the external entities that interact with the system, and the information flows in the system.
A DFD shows how information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
A DFD may be used to represent a system at any level of abstraction. It may be partitioned into levels that represent increasing information flow and functional detail.
[Figure: data flow diagram. User -> Check -> No: Unauthorized user; Yes: Select symptom -> Logistic Regression -> Drug Prediction -> End process]
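The flow in the diagram above can be sketched as a small Python function. This is only an illustration of the control flow: the user names, the lookup tables and the `predict_disease` stand-in are hypothetical placeholders, and a trained classifier would take the place of the stand-in.

```python
# Hypothetical registered users and drug lookup table, for illustration only.
AUTHORIZED_USERS = {"alice"}
DRUGS_BY_DISEASE = {"disease_a": ["drug_x"], "disease_b": ["drug_y"]}


def predict_disease(symptoms):
    """Stand-in for the trained classifier (logistic regression in the diagram)."""
    return "disease_a" if "symptom_1" in symptoms else "disease_b"


def handle_request(user, symptoms):
    """Follow the diagram: check the user, predict the disease, look up drugs."""
    if user not in AUTHORIZED_USERS:
        # "No" branch of the check: stop without making a prediction.
        return {"status": "unauthorized user"}
    disease = predict_disease(symptoms)
    return {
        "status": "ok",
        "disease": disease,
        "drugs": DRUGS_BY_DISEASE.get(disease, []),
    }


print(handle_request("alice", {"symptom_1", "symptom_2"}))
print(handle_request("mallory", {"symptom_1"}))
```

Keeping the authorization check as the first step means an unauthorized request never reaches the prediction or recommendation stages.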
MODULES
Select symptoms
In this module the user selects the symptoms.
Logistic Regression
This module applies logistic regression.
Support Vector Machine
This module applies the Support Vector Machine.
Multi-Layer Perceptron
This module applies the Multi-Layer Perceptron.
Disease Prediction
This module predicts the disease.
Drug Prediction
This module predicts and recommends the drug.
FUNCTIONAL AND NON-FUNCTIONAL REQUIREMENTS
The functional requirements, or overall description documents, include the product perspective and features, operating system and operating environment, graphics requirements, design constraints and user documentation. The appropriation of requirements and implementation constraints gives a general overview of the project with regard to its areas of strength and deficit and how to tackle them.
Python IDLE 3.7 (or)
Anaconda 3.7 (or)
Jupyter (or)
Google Colab
Minimum hardware requirements are very dependent on the particular software being developed by a given Enthought Python / Canopy / VS Code user. Applications that need to store large arrays/objects in memory will require more RAM, whereas applications that need to perform numerous calculations or tasks more quickly will require a faster processor.
Operating system: Windows, Linux
Processor: minimum Intel i3
RAM: minimum 4 GB
Hard disk: minimum 250 GB
UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling language in the field of object-oriented software engineering. The standard is managed, and was created by, the Object Management Group.
The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying, visualizing, constructing and documenting the artifacts of software systems, as well as for business modeling and other non-software systems.
The UML represents a collection of best engineering practices that have proven successful in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented software and the software development process. The UML uses mostly graphical notations to express the design of software projects.
USE CASE DIAGRAM
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by and created from a use-case analysis. Its purpose is to present a graphical overview of the functionality provided by a system in terms of actors, their goals (represented as use cases), and any dependencies between those use cases. The main purpose of a use case diagram is to show what system functions are performed for which actor. Roles of the actors in the system can be depicted.
CLASS DIAGRAM
The class diagram is used to refine the use case diagram and define a detailed design of the system. The class diagram classifies
the actors defined in the use case diagram into a set of interrelated classes. The relationship or association between the classes
can be either an "is-a" or "has-a" relationship. Each class in the class diagram may be capable of providing certain functionalities.
These functionalities provided by the class are termed "methods" of the class. Apart from this, each class may have certain
"attributes" that uniquely identify the class.
OBJECT DIAGRAM
The object diagram is a special kind of class diagram. An object is an instance of a class. This essentially means that an object
represents the state of a class at a given point of time while the system is running. The object diagram captures the state of different
classes in the system and their relationships or associations at a given point of time.
STATE DIAGRAM
A state diagram, as the name suggests, represents the different states that objects in the system undergo during their life cycle. Objects in the system change states in response to events. In addition to this, a state diagram also captures the transition of the object's state from an initial state to a final state in response to events affecting the system.
ACTIVITY DIAGRAM
The process flows in the system are captured in the activity diagram. Similar to a state diagram, an activity diagram also
consists of activities, actions, transitions, initial and final states, and guard conditions.
SEQUENCE DIAGRAM
A sequence diagram represents the interaction between different objects in the system. The important aspect of a sequence
diagram is that it is time-ordered. This means that the exact sequence of the interactions between the objects is represented step by
step. Different objects in the sequence diagram interact with each other by passing "messages".
COLLABORATION DIAGRAM
A collaboration diagram groups together the interactions between different objects. The interactions are listed as numbered
interactions that help to trace the sequence of the interactions. The collaboration diagram helps to identify all the possible
interactions that each object has with other objects.
ALGORITHM EXPLANATION
LOGISTIC REGRESSION
Logistic regression is one of the most popular machine learning algorithms and belongs to the supervised learning family. It is used to predict a categorical dependent variable from a given set of independent variables. Logistic regression is similar to linear regression in that both estimate the parameters of a prediction equation from the given training data. Linear regression predicts the value of a continuous dependent variable, whereas logistic regression predicts the probability of an event or class that depends on other factors.
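As a rough illustration of the description above, the following sketch trains a binary logistic regression model by gradient descent using only the Python standard library. It is a simplified stand-in rather than the project's code: the symptom columns, the labels, the learning rate and the number of epochs are all assumptions chosen for the example, and a multi-class library implementation would normally be used for real disease prediction.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, features):
    """Probability that the label is 1 for one feature vector."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic_regression(X, y, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            # Gradient of the log loss for one sample is (p - label) * x.
            error = predict_probability(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(X)
    return weights, bias


# Hypothetical data: each row is [fever, cough, rash]; label 1 = disease present.
X = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0], [0, 1, 0]]
y = [1, 1, 1, 0, 0, 0]
weights, bias = train_logistic_regression(X, y)
print(round(predict_probability(weights, bias, [1, 1, 0]), 3))
```

The output of `predict_probability` is a probability, so a threshold such as 0.5 turns it into a class decision.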
SUPPORT VECTOR MACHINE
A support vector machine is a supervised learning algorithm that analyses the trends and characteristics of a data set and solves classification and regression problems. It is based on the learning framework of VC theory (Vapnik-Chervonenkis theory). Each training data point is marked as belonging to one of two categories, and the algorithm then iteratively builds a boundary that separates the data points into two groups such that the gap between the groups and the boundary is as wide as possible.
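The maximum-margin idea described above can be illustrated with a small linear SVM trained by sub-gradient descent on the hinge loss. This is a simplified sketch using only the standard library, not the project's implementation; the data points, the hyperparameters and the function names are assumptions, and kernel SVMs from a library would normally be preferred in practice.

```python
def train_linear_svm(X, y, learning_rate=0.01, reg=0.01, epochs=1000):
    """Fit a linear SVM by sub-gradient descent on the regularized hinge loss.

    Labels must be +1 or -1.
    """
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            score = sum(w * x for w, x in zip(weights, features)) + bias
            margin = label * score
            # Regularization shrinks the weights, which favors a wider margin.
            weights = [w - learning_rate * reg * w for w in weights]
            if margin < 1:
                # The point is inside the margin or misclassified:
                # move the boundary away from it.
                weights = [
                    w + learning_rate * label * x
                    for w, x in zip(weights, features)
                ]
                bias += learning_rate * label
    return weights, bias


def classify(weights, bias, features):
    """Return +1 or -1 depending on the side of the boundary."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score >= 0 else -1


# Hypothetical, linearly separable data: two numeric test results per patient.
X = [[1.0, 1.0], [1.5, 0.5], [2.0, 1.5], [4.0, 4.5], [4.5, 4.0], [5.0, 5.0]]
y = [-1, -1, -1, 1, 1, 1]
weights, bias = train_linear_svm(X, y)
print([classify(weights, bias, row) for row in X])
```

Only points that fall inside the margin trigger an update, which is why the final boundary is determined by the points closest to it (the support vectors).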
CONCLUSION
The first component takes the symptoms entered by the user and predicts the disease using machine learning models such as Logistic Regression and Support Vector Machine. The second component recommends medicines using the recommendation algorithm provided by the Python package LightFM. As an additional feature for incurable diseases, we propose using other AI techniques, such as a Deep Neural Network, to recommend the drug.
This system has a large scope, as it offers the following features:
Automation of disease diagnosis.
Paper-free work, helping the environment.
Increased efficiency and accuracy, helping patients in the future.
Management of information related to diseases.