Heart Disease Documentation
Submitted by
R.PARAMESH 21CS1120
K.BHUVANKUMAR 21CS1111
V.DHANUSH 21CS1113
S.ANANDHARAJ 21CS1105
PONDICHERRY, INDIA.
May 2024
PSV COLLEGE OF ARTS AND SCIENCE
PONDICHERRY UNIVERSITY
PUDUCHERRY – 607 402
ACKNOWLEDGEMENT
We are greatly thankful to all those who helped us in making this project successful.
We sincerely thank Thiru. S.SELVAMANI and our Secretary Dr. S.VIKNESH M.Tech, Ph.D., for all their support.
It is a great pleasure to thank our Director Dr. N.GOBU M.E., MBA, Ph.D., MISTE, AMIE, MIIW for his valuable guidance, support, and encouragement in our project work.
We convey our deep and sincere thanks to our Principal Dr. K.KAMALAKKAN, Ph.D., of PSV College of Arts and Science for having given us this opportunity.
We wish to express our deep sense of gratitude to our Head of the Department, Mr. R.BABU M.C.A., M.E., for his valuable guidance and encouragement throughout this project work.
We express our grateful thanks to our Project Guide, Ms. T.SANTHIYA B.Sc., M.Sc., Professor, for her able guidance and useful suggestions, which helped us in completing the project.
We also thank all our Department Faculty and the Lab Administrator for their timely guidance and valuable assistance in the project work.
Finally, we thank our parents for their blessings and our friends, classmates, and seniors for their help and wishes for the successful completion of this project.
ABSTRACT
I INTRODUCTION
II PROBLEM DEFINITION
5.2 ARCHITECTURE
IX CODING
X SCREENSHOTS
XI CONCLUSION
XIII BIBLIOGRAPHY
ABSTRACT
Heart disease prediction is crucial for informed healthcare decisions, enabling early
interventions and lifestyle adjustments. Machine learning and deep learning methods have
been instrumental in developing predictive models for heart disease detection. Commonly
employed algorithms such as Support Vector Machines (SVM), logistic regression, XGBoost,
and LightGBM have demonstrated accuracies ranging from 73.77% to 88.5%. In our project,
we aim to enhance the efficiency of heart disease prediction by leveraging the power of
ensemble classifiers. Specifically, we employ a Convolutional Neural Network (CNN)
alongside a Recurrent Neural Network (RNN). By combining these models through an
ensemble classifier, we seek to improve predictive accuracy and contribute to more effective
and reliable heart disease risk assessment. This research endeavors to advance the field by
exploring innovative methodologies for enhancing the performance of heart disease
prediction models.
I. INTRODUCTION
The growth in medical data collection presents a new opportunity for physicians to
improve patient diagnosis. In recent years, practitioners have increasingly used computer
technologies to improve decision-making support. In the healthcare industry, deep learning
is becoming an important tool to aid the diagnosis of patients. Deep learning is an analytical
approach used when a task is large and difficult to program explicitly, such as transforming
medical records into knowledge, predicting pandemics, and analyzing genomic data. Recent
studies have used deep learning techniques to diagnose different cardiac problems and make
predictions. A major problem in deep learning is the high dimensionality of the dataset; in
our project we use an ensemble classifier built from CNN and RNN models to detect heart
disease accurately. Analyzing many features requires a large amount of memory and can
lead to overfitting, so weighting features reduces redundant data and processing time, thus
improving the performance of the algorithm. Dimensionality reduction uses feature
extraction to transform and simplify data, while feature selection reduces the dataset by
removing useless features.
2.1 PROBLEM DEFINITION
Expert decision systems based on machine learning classifiers and the application of
artificial fuzzy logic can effectively detect heart disease (HD), and as a result the death rate
decreases. The Cleveland heart disease dataset has also been used by various researchers for
the HD identification problem. Deep learning predictive models built on ensemble classifiers
need appropriate data for training and testing. The performance of a machine learning model
can be increased when a balanced dataset is used for its training and testing. Moreover, the
model's predictive abilities can be improved by using appropriate and related features from
the data. Hence, data balancing and feature selection are significantly important for
improving model performance. In the literature, various diagnosis techniques have been
proposed by different researchers; however, these techniques do not diagnose HD
effectively.
The current heart disease prediction system relies on traditional machine learning and
potentially deep learning methods, but it operates on a standalone-model basis. Commonly
used algorithms, such as Support Vector Machines (SVM), logistic regression, XGBoost,
and LightGBM, may have been employed individually to predict heart disease risk based on
specific datasets.
PROPOSED SYSTEM:
Augment the existing dataset with additional relevant features and ensure its
comprehensiveness to capture a more nuanced representation of heart disease risk factors.
Implement advanced data preprocessing techniques to handle outliers, imbalances, and
complex relationships within the dataset, including sophisticated methods for handling
missing values and robust feature scaling. Introduce an ensemble framework that combines
the predictive power of the Convolutional Neural Network (CNN) and the temporal
understanding of the Recurrent Neural Network (RNN); the ensemble uses a classifier to
merge the outputs of the individual models. Capitalize on the diversity of the CNN and
RNN algorithms, which bring distinct perspectives to heart disease prediction; this diversity
aims to enhance the overall robustness of the predictive model.
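As a hedged illustration of how such an ensemble might merge the two models' outputs (the function name and probability values are hypothetical, not the project's actual code), a simple soft-voting combiner can be sketched in plain Python:

```python
def ensemble_predict(cnn_prob, rnn_prob, threshold=0.5):
    """Average the two models' predicted probabilities of heart
    disease (soft voting) and apply a decision threshold."""
    avg = (cnn_prob + rnn_prob) / 2
    return 1 if avg >= threshold else 0

print(ensemble_predict(0.8, 0.6))  # 1 (disease predicted)
print(ensemble_predict(0.3, 0.4))  # 0 (no disease predicted)
```

In practice the probabilities would come from trained CNN and RNN models, and the combiner could be weighted or learned rather than a plain average.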
ADVANTAGES:
HARDWARE AND SOFTWARE REQUIREMENTS:
RAM – 8 GB
Framework – Flask
Language – Python
PYTHON
INTRODUCTION TO PYTHON
Python is a high-level, object-oriented programming language that was created by Guido
van Rossum. It is also called a general-purpose programming language, as it is used in
almost every domain we can think of, as mentioned below:
Web Development
Software Development
Game Development
AI & ML
Data Analytics
This list can go on, but why is Python so popular? Let's see in the next topic.
So let me explain:
Every programming language serves some purpose or use case in a given domain. For
example, JavaScript is the most popular language amongst web developers, as it gives
developers the power to build applications via frameworks like React, Vue, and Angular,
which are used to build beautiful user interfaces. Every language has its pros and cons. Python
is general-purpose, which means it is widely used in every domain; the reason is that it is
very simple to understand and scalable, because of which the speed of development is fast.
Besides that, learning Python doesn't require any programming background, which is why
it's popular amongst developers as well. Python has a simple syntax similar to the English
language, and the syntax allows developers to write programs with fewer lines of code.
Since it is open source, there are many libraries available that make developers' jobs easier,
which ultimately results in high productivity. Developers can easily focus on business logic,
and Python is an in-demand skill in the digital era, where information is available in large
data sets.
In the era of the digital world, there is a lot of information available on the internet,
which can be confusing. A good starting point is to follow the documentation. Once we are
familiar with the concepts and terminology, we can dive deeper.
YouTube: https://www.youtube.com/watch?v=_uQrJ0TkZlc
CodeAcademy: https://www.codecademy.com/catalog/language/python
I hope you are now excited to get started, and you might be wondering where you can start
coding. There are a lot of options available: we can use any IDE we are comfortable with,
but for those who are new to the programming world, some Python IDEs are listed below:
2) PyCharm: https://www.jetbrains.com/pycharm/
3) Spyder: https://www.spyder-ide.org/
4) Atom: https://atom.io/
Real-World Examples:
1) NASA (National Aeronautics and Space Administration): One of NASA's shuttle support
contractors, United Space Alliance, developed a fast Workflow Automation System (WAS).
An internal resource on the critical project stated:
“Python allows us to tackle the complexity of programs like the WAS without getting bogged
down in the language.”
NASA also publishes a website (https://code.nasa.gov/) listing around 400 open-source
projects that use Python.
2) Netflix: Various projects at Netflix use Python, including:
Central Alert Gateway
Chaos Gorilla
Security Monkey
Chronos
Amongst all these projects, regional failover stands out: the system decreased outage time
from 45 minutes to 7 minutes with no additional cost.
3) Instagram: Instagram also uses Python extensively. They built their photo-sharing
social platform using Django, a web framework for Python, and they have been able to
successfully upgrade the framework without any technical challenges.
2) Game Development: PySoy and PyGame are two Python libraries that are used for game
development.
4) Desktop GUI: Python offers many toolkits and frameworks with which we can build
desktop applications. PyQt, PyGTK, and PyGUI are some of the GUI frameworks.
Well, the reality is like the infinity logo we can see above. In the programming realm,
there is no such thing as mastery; it's simply a trial-and-error process. For example,
yesterday I was writing some code where I tried to print the value of a variable before
declaring it inside a function, and I encountered a new error named "UnboundLocalError".
Now here is the main part: what approach should we follow in order to master Python
programming?
print("Hello World")
Variables in Python:
my_var = 100
As you can see here, we have created a variable named "my_var" and assigned the
value 100 to it.
Data types describe the kind of value a variable holds, while data structures are responsible
for deciding how to store this data in a computer's memory.
my_str = "ABCD"
As you can see here, we have assigned a value “ABCD” to a variable my_str.
This is basically a string data type in Python.
my_dict = {1: 100, 2: 200, 3: 300}
Again this is just the tip of the iceberg. There are lots of data types and data
structures in Python. To give a basic idea about data structures in Python, here
is the complete list:
1. Lists
2. Dictionaries
3. Sets
4. Tuples
5. Frozensets
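A quick sketch creating one of each of the structures listed above (the values are arbitrary):

```python
my_list = [1, 2, 3]              # ordered, mutable
my_dict = {1: 100, 2: 200}       # key-value pairs
my_set = {1, 2, 2, 3}            # duplicates collapse -> {1, 2, 3}
my_tuple = (1, 2, 3)             # ordered, immutable
my_frozen = frozenset({1, 2})    # immutable set

print(len(my_set))  # 3
```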
Like every programming language, Python provides control-flow constructs, and this is one
of the most important concepts that we need to master.
IF-ELIF-ELSE conditionals:
x = 10
if x > 100:
    print("Greater than 100")
elif x == 100:
    print("Exactly 100")
else:
    print("Do nothing")
As you can see in the above example, we have created what is known as the if-
elif-else ladder
For loop:
for i in "Python":
print(i)
PRO Tip:
Once you start programming with Python, you will see that if we miss any white space,
Python will start giving errors. This is known as indentation in Python. Python is very strict
about indentation. Python was created with a mindset to help everyone become a neat
programmer. This indentation scheme was introduced in one of Python's early PEPs (Python
Enhancement Proposals).
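A small demonstration of how strict Python is about indentation; here compile() is used so that the badly indented code can be checked without stopping the script:

```python
good = "if True:\n    print('ok')\n"   # body indented correctly
bad = "if True:\nprint('ok')\n"        # body not indented

compile(good, "<good>", "exec")        # compiles without complaint
try:
    compile(bad, "<bad>", "exec")
except IndentationError as err:
    print("IndentationError:", err.msg)
```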
While The Python Language Reference describes the exact syntax and semantics of
the Python language, this library reference manual describes the standard library that is
distributed with Python. It also describes some of the optional components that are commonly
included in Python distributions.
Python’s standard library is very extensive, offering a wide range of facilities as
indicated by the long table of contents listed below. The library contains built-in modules
(written in C) that provide access to system functionality such as file I/O that would
otherwise be inaccessible to Python programmers, as well as modules written in Python that
provide standardized solutions for many problems that occur in everyday programming.
Some of these modules are explicitly designed to encourage and enhance the portability of
Python programs by abstracting away platform-specifics into platform-neutral APIs.
The Python installers for the Windows platform usually include the entire standard
library and often also include many additional components. For Unix-like operating systems
Python is normally provided as a collection of packages, so it may be necessary to use the
packaging tools provided with the operating system to obtain some or all of the optional
components.
We’ve mentioned namespaces, publishing packages and importing modules. If any of these
terms or concepts aren’t entirely clear to you, we’ve got you! In this section, we’ll cover
everything you’ll need to really grasp the pipeline of using Python packages in your code.
Importing a Python Package
We’ll import a package using the import statement:
Let’s assume that we haven’t yet installed any packages. Python comes with a big
collection of pre-installed packages known as the Python Standard Library. It includes tools
for a range of use cases, such as text processing and doing math. Let’s import the latter:
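For example, we can import the standard library's math module and use it right away, with nothing to install:

```python
import math  # ships with Python as part of the standard library

print(math.sqrt(16))      # 4.0
print(math.factorial(5))  # 120
```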
You might think of an import statement as a search trigger for a module. Searches are
strictly organized: At first, Python looks for a module in the cache, then in the standard
library and finally in a list of paths. This list may be accessed after importing sys (another
standard library module).
The sys.path command returns all the directories in which Python will try to find a package.
It may happen that you’ve downloaded a package but when you try importing it, you get an
error:
In such cases, check whether your imported package has been placed in one of Python’s
search paths. If it hasn’t, you can always expand your list of search paths:
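A minimal sketch of expanding the search paths (the directory name here is hypothetical):

```python
import sys

sys.path.append("/opt/my_packages")  # hypothetical extra directory
print("/opt/my_packages" in sys.path)  # True
```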
At that point, the interpreter will have more than one more location to look for packages
after receiving an import statement.
If you’d like to import multiple resources from the same source, you can simply comma-
separate them in the import statement:
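For instance, importing two names from math in a single statement:

```python
from math import log, e  # natural logarithm and Euler's number

print(round(log(e), 3))  # 1.0
```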
There is, however, always a small risk that your variables will clash with other variables in
your namespace. What if one of the variables in your code was named log, too? It would
overwrite the log function, causing bugs. To avoid that, it’s better to import the package as
we did before. If you want to save typing time, you can alias your package to give it a
shorter name:
Aliasing is a pretty common technique. Some packages have commonly used aliases: For
instance, the numerical computation library NumPy is almost always imported as “np.”
Another option is to import all a module’s resources into your namespace:
However, this method poses serious risk since you usually don’t know all the names
contained in a package, increasing the likelihood of your variables being overwritten. It’s
for this reason that most seasoned Python programmers will discourage use of the wildcard
* in imports. Also, as the Zen of Python states, “namespaces are one honking great idea!”
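Both techniques side by side, as a small sketch using the standard math module:

```python
import math as m       # alias: shorter name, no risk of clashes
print(m.pi > 3.14)     # True

from math import *     # wildcard: discouraged, floods the namespace
print(floor(2.9))      # 2
```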
How to Install a Python Package
How about packages that are not part of the standard library? The official repository for
finding and downloading such third-party packages is the Python Package Index, usually
referred to simply as PyPI. To install packages from PyPI, use the package installer pip.
pip can install Python packages from any source, not just PyPI. If you installed Python
using Anaconda or Miniconda, you can also use the conda command to install Python
packages.
While conda is very easy to use, it’s not as versatile as pip. So if you cannot install a
package using conda, you can always try pip instead.
Reloading a Module
If you’re programming in interactive mode, and you change a module’s script, these
changes won’t be imported, even if you issue another import statement. In such case, you’ll
want to use the reload() function from the importlib library:
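A minimal sketch, reloading the standard math module (in real use you would reload your own edited module):

```python
import importlib
import math

math = importlib.reload(math)  # re-executes the module's source
print(round(math.pi, 2))       # 3.14
```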
Another important file is setup.py. Using the setuptools package, this file provides detailed
information about your project and lists all dependencies — packages required by your
code to run properly.
Publishing to PyPI is beyond the scope of this introductory tutorial. But if you do have a
package for distribution, your project should include two more files: a README.md
written in Markdown, and a license. Check out the official Python Packaging User Guide
(PyPUG) if you want to know more.
INSTALLING PACKAGES
It's important to note that the term "package" in this context is being used to describe a
bundle of software to be installed (i.e., as a synonym for a distribution). It does not refer to
the kind of package that you import in your Python source code (i.e., a container of modules).
It is common in the Python community to refer to a distribution using the term “package”.
Using the term “distribution” is often not preferred, because it can easily be confused with a
Linux distribution, or another larger software distribution like Python itself.
This section describes the steps to follow before installing other Python packages.
Before you go any further, make sure you have Python and that the expected version is
available from your command line. You can check this by running:
Unix/macOS
python3 --version
Windows
py --version
You should get some output like Python 3.6.3. If you do not have Python, please install the
latest 3.x version from python.org or refer to the Installing Python section of the Hitchhiker’s
Guide to Python.
Note
This command and the other suggested commands in this tutorial are intended to be run in a
shell (also called a terminal or console). See the Python for Beginners getting started
tutorial for an introduction to using your operating system's shell and interacting with
Python.
Note
If you’re using an enhanced shell like IPython or the Jupyter notebook, you can run system
commands like those in this tutorial by prefacing them with a ! character:
!{sys.executable} --version
Python 3.6.3
It’s recommended to write {sys.executable} rather than plain python in order to ensure that
commands are run in the Python installation matching the currently running notebook (which
may not be the same Python installation that the python command refers to).
Note
Due to the way most Linux distributions are handling the Python 3 migration, Linux users
using the system Python without creating a virtual environment first should replace
the python command in this tutorial with python3 and the python -m pip command
with python3 -m pip --user. Do not run any of the commands in this tutorial with sudo: if
you get a permissions error, come back to the section on creating virtual environments, set
one up, and then continue with the tutorial as written.
Ensure you can run pip from the command line
Additionally, you’ll need to make sure you have pip available. You can check this by
running:
Unix/macOS
python3 -m pip --version
Windows
py -m pip --version
If you installed Python from source, with an installer from python.org, or via Homebrew you
should already have pip. If you’re on Linux and installed using your OS package manager,
you may have to install pip separately, see Installing pip/setuptools/wheel with Linux
Package Managers.
If pip isn’t already installed, then first try to bootstrap it from the standard library:
Unix/macOS
python3 -m ensurepip --default-pip
Windows
py -m ensurepip --default-pip
Run python get-pip.py. This will install or upgrade pip. Additionally, it will
install setuptools and wheel if they're not installed already.
Warning
Be cautious if you're using a Python install that's managed by your operating system
or another package manager. get-pip.py does not coordinate with those tools, and may
leave your system in an inconsistent state. You can use
python get-pip.py --prefix=/usr/local/ to install in /usr/local, which is designed for
locally installed software.
Ensure pip, setuptools, and wheel are up to date
While pip alone is sufficient to install from pre-built binary archives, up to date copies of
the setuptools and wheel projects are useful to ensure you can also install from source
archives:
Unix/macOS
python3 -m pip install --upgrade pip setuptools wheel
Windows
py -m pip install --upgrade pip setuptools wheel
See the section below for details, but here's the basic venv command to use on a typical
Linux system:
Unix/macOS
python3 -m venv tutorial_env
source tutorial_env/bin/activate
Windows
py -m venv tutorial_env
tutorial_env\Scripts\activate
This will create a new virtual environment in the tutorial_env subdirectory, and configure the
current shell to use it as the default python environment.
Creating Virtual Environments
Imagine you have an application that needs version 1 of LibFoo, but another application
requires version 2. How can you use both these applications? If you install everything into
/usr/lib/python3.6/site-packages (or whatever your platform’s standard location is), it’s easy
to end up in a situation where you unintentionally upgrade an application that shouldn’t be
upgraded.
Or more generally, what if you want to install an application and leave it be? If an application
works, any change in its libraries or the versions of those libraries can break the application.
Also, what if you can’t install packages into the global site-packages directory? For instance,
on a shared host.
In all these cases, virtual environments can help you. They have their own installation
directories and they don’t share libraries with other virtual environments.
Currently, there are two common tools for creating Python virtual environments:
Using venv:
Unix/macOS
python3 -m venv <DIR>
source <DIR>/bin/activate
Windows
py -m venv <DIR>
<DIR>\Scripts\activate
Using virtualenv:
Unix/macOS
python3 -m virtualenv <DIR>
source <DIR>/bin/activate
Windows
virtualenv <DIR>
<DIR>\Scripts\activate
For more information, see the venv docs or the virtualenv docs.
The use of source under Unix shells ensures that the virtual environment’s variables are set
within the current shell, and not in a subprocess (which then disappears, having no useful
effect).
In both of the above cases, Windows users should _not_ use the source command, but should
rather run the activate script directly from the command shell like so:
<DIR>\Scripts\activate
Managing multiple virtual environments directly can become tedious, so the dependency
management tutorial introduces a higher level tool, Pipenv, that automatically manages a
separate virtual environment for each project and application that you work on.
Use pip for Installing
pip is the recommended installer. Below, we’ll cover the most common usage scenarios. For
more detail, see the pip docs, which includes a complete Reference Guide.
The most common usage of pip is to install from the Python Package Index using
a requirement specifier. Generally speaking, a requirement specifier is composed of a project
name followed by an optional version specifier. PEP 440 contains a full specification of the
currently supported specifiers. Below are some examples.
To install the latest version of "SomeProject":
Unix/macOS
python3 -m pip install "SomeProject"
Windows
py -m pip install "SomeProject"
To install a specific version:
Unix/macOS
python3 -m pip install "SomeProject==1.4"
Windows
py -m pip install "SomeProject==1.4"
To install greater than or equal to one version and less than another:
Unix/macOS
python3 -m pip install "SomeProject>=1,<2"
Windows
py -m pip install "SomeProject>=1,<2"
To install a version that's "compatible" with a certain version:
Unix/macOS
python3 -m pip install "SomeProject~=1.4.2"
Windows
py -m pip install "SomeProject~=1.4.2"
In this case, this means to install any version "==1.4.*" that's also ">=1.4.2".
pip can install from either Source Distributions (sdist) or Wheels, but if both are present on
PyPI, pip will prefer a compatible wheel. You can override pip's default behavior by e.g.
using its --no-binary option.
Wheels are a pre-built distribution format that provides faster installation compared to Source
Distributions (sdist), especially when a project contains compiled extensions.
If pip does not find a wheel to install, it will locally build a wheel and cache it for future
installs, instead of rebuilding the source distribution in the future.
Upgrading packages
Unix/macOS
python3 -m pip install --upgrade SomeProject
Windows
py -m pip install --upgrade SomeProject
To install packages that are isolated to the current user, use the --user flag:
Unix/macOS
python3 -m pip install --user SomeProject
Windows
py -m pip install --user SomeProject
For more information see the User Installs section from the pip docs.
Note that the --user flag has no effect when inside a virtual environment - all installation
commands will affect the virtual environment.
If SomeProject defines any command-line scripts or console entry points, --user will cause
them to be installed inside the user base’s binary directory, which may or may not already be
present in your shell’s PATH. (Starting in version 10, pip displays a warning when installing
any scripts to a directory outside PATH.) If the scripts are not available in your shell after
installation, you’ll need to add the directory to your PATH:
On Linux and macOS you can find the user base binary directory by running
python -m site --user-base and adding bin to the end. For example, this will typically
print ~/.local (with ~ expanded to the absolute path to your home directory), so you'll
need to add ~/.local/bin to your PATH. You can set your PATH permanently
by modifying ~/.profile.
On Windows you can find the user base binary directory by running
py -m site --user-site and replacing site-packages with Scripts. For example, this could
return C:\Users\Username\AppData\Roaming\Python36\site-packages, so you would
need to set your PATH to
include C:\Users\Username\AppData\Roaming\Python36\Scripts. You can set your
user PATH permanently in the Control Panel. You may need to log out for
the PATH changes to take effect.
Requirements files
Install a list of requirements specified in a requirements file:
Unix/macOS
python3 -m pip install -r requirements.txt
Windows
py -m pip install -r requirements.txt
Install a project from VCS in “editable” mode. For a full breakdown of the syntax, see pip’s
section on VCS Support.
Installing from local src in Development Mode, i.e. in such a way that the project appears to
be installed, but yet is still editable from the src tree.
Install from a local directory containing archives (and don’t check PyPI)
Unix/macOS
To install from other data sources (for example Amazon S3 storage) you can create a helper
application that presents the data in a PEP 503 compliant index format, and use the
--extra-index-url flag to direct pip to use that index.
./s3helper --port=7777
python -m pip install --extra-index-url http://localhost:7777 SomeProject
Installing Prereleases
Find pre-release and development versions, in addition to stable versions. By default, pip
only finds stable versions.
Unix/macOS
python3 -m pip install --pre SomeProject
Windows
py -m pip install --pre SomeProject
Unix/macOS
python3 -m pip install SomePackage[PDF]
python3 -m pip install SomePackage[PDF]==3.0
4.1 ARCHITECTURE
• Data pre-processing
• Ensemble classifier
MODULE DESCRIPTION
Heart disease data is pre-processed after collection of the various records. The dataset
contains a number of patient records, some of which have missing values. Those records
have been removed from the dataset, and the remaining patient records are used in
pre-processing. A multiclass variable and binary classification are introduced for the
attributes of the given dataset. The multiclass variable is used to check the presence or
absence of heart disease: if the patient has heart disease, the value is set to 1; otherwise it
is set to 0, indicating the absence of heart disease. The pre-processing of data is carried out
by converting medical records into diagnosis values. The results of data pre-processing
indicate which records show a value of 1, establishing the presence of heart disease, while
the remaining records reflect a value of 0, indicating its absence.
With a keen focus on data quality, this module addresses missing values and outliers, and
applies appropriate feature scaling. It encodes categorical variables, ensuring seamless
integration into machine learning models, and balances the dataset to rectify potential class
imbalances. For datasets with temporal components, it sequences the data to align with the
temporal considerations of the Recurrent Neural Network (RNN) and the Convolutional
Neural Network (CNN). The output is a refined and preprocessed dataset, primed for
optimal performance in subsequent modeling.
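The core of the steps described above can be sketched in plain Python; the field names and values below are hypothetical stand-ins for the actual dataset columns, not the project's real code:

```python
# Hedged sketch: drop records with missing values, then map the
# multiclass target (0 = absence, >0 = presence) to binary 0/1.
records = [
    {"age": 63, "chol": 233, "target": 2},
    {"age": 45, "chol": None, "target": 0},  # missing value -> removed
    {"age": 56, "chol": 294, "target": 0},
]

clean = [r for r in records if None not in r.values()]
for r in clean:
    r["target"] = 1 if r["target"] > 0 else 0  # presence/absence of HD

print(len(clean), [r["target"] for r in clean])  # 2 [1, 0]
```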
ENSEMBLE CLASSIFIER:
ALGORITHMS:
There are other types of neural networks in deep learning, but for identifying and recognizing
objects, CNNs are the network architecture of choice. This makes them highly suitable for
computer vision (CV) tasks and for applications where object recognition is vital, such as
self-driving cars and facial recognition.
Artificial neural networks (ANNs) are a core element of deep learning algorithms. One type
of ANN is the recurrent neural network (RNN), which uses sequential or time-series data as
input. It is suitable for applications involving natural language processing (NLP), language
translation, speech recognition, and image captioning.
The CNN is another type of neural network that can uncover key information in both time
series and image data. For this reason, it is highly valuable for image-related tasks, such as
image recognition, object classification and pattern recognition. To identify patterns within
an image, a CNN leverages principles from linear algebra, such as matrix multiplication.
CNNs can also classify audio and signal data.
A CNN's architecture is analogous to the connectivity pattern of the human brain. Just like
the brain consists of billions of neurons, CNNs also have neurons arranged in a specific way.
In fact, a CNN's neurons are arranged like the brain's frontal lobe, the area responsible for
processing visual stimuli. This arrangement ensures that the entire visual field is covered,
thus avoiding the piecemeal image processing problem of traditional neural networks, which
must be fed images in reduced-resolution pieces. Compared to the older networks, a CNN
delivers better performance with image inputs, and also with speech or audio signal inputs.
A deep learning CNN consists of three layers: a convolutional layer, a pooling layer and a
fully connected (FC) layer. The convolutional layer is the first layer while the FC layer is the
last.
From the convolutional layer to the FC layer, the complexity of the CNN increases. It is this
increasing complexity that allows the CNN to successively identify larger portions and more
complex features of an image until it finally identifies the object in its entirety.
Convolutional layer. The majority of computations happen in the convolutional layer, which
is the core building block of a CNN. A second convolutional layer can follow the initial
convolutional layer. The process of convolution involves a kernel or filter inside this layer
moving across the receptive fields of the image, checking if a feature is present in the image.
Over multiple iterations, the kernel sweeps over the entire image. After each iteration a dot
product is calculated between the input pixels and the filter. The final output from the series
of dot products is known as a feature map or convolved feature. Ultimately, the image is
converted into numerical values in this layer, which allows the CNN to interpret the image
and extract relevant patterns from it.
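The kernel-sweeping dot product described above can be illustrated with a toy convolution in plain Python (valid padding, stride 1; the image and kernel values are arbitrary):

```python
image = [[1, 2, 0],
         [0, 1, 3],
         [4, 0, 1]]
kernel = [[1, 0],
          [0, 1]]

def conv2d(img, k):
    """Slide the kernel over the image, taking a dot product at
    each position to build the feature map."""
    kh, kw = len(k), len(k[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            row.append(sum(img[i + a][j + b] * k[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

print(conv2d(image, kernel))  # [[2, 5], [0, 2]]
```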
Pooling layer. Like the convolutional layer, the pooling layer also sweeps a kernel or filter
across the input image. But unlike the convolutional layer, the pooling layer reduces the
number of parameters in the input and also results in some information loss. On the positive
side, this layer reduces complexity and improves the efficiency of the CNN.
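A toy sketch of 2×2 max pooling with stride 2 (arbitrary values), showing how each window keeps only its largest value, shrinking the feature map and discarding the rest:

```python
fmap = [[1, 3, 2, 0],
        [5, 6, 1, 2],
        [7, 2, 9, 4],
        [0, 1, 3, 8]]

def max_pool2x2(m):
    """Keep the maximum of each non-overlapping 2x2 window."""
    return [[max(m[i][j], m[i][j + 1], m[i + 1][j], m[i + 1][j + 1])
             for j in range(0, len(m[0]), 2)]
            for i in range(0, len(m), 2)]

print(max_pool2x2(fmap))  # [[6, 2], [7, 9]]
```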
Fully connected layer. The FC layer is where image classification happens in the CNN based
on the features extracted in the previous layers. Here, fully connected means that all the
inputs or nodes from one layer are connected to every activation unit or node of the next
layer.
All the layers in the CNN are not fully connected because it would result in an unnecessarily
dense network. It also would increase losses and affect the output quality, and it would be
computationally expensive.
A CNN can have multiple layers, each of which learns to detect the different features of an
input image. A filter or kernel is applied to each image to produce an output that gets
progressively better and more detailed after each layer. In the lower layers, the filters can
detect simple features such as edges.
At each successive layer, the filters increase in complexity to check and identify features that
uniquely represent the input object. Thus, the output of each convolved image -- the partially
recognized image after each layer -- becomes the input for the next layer. In the last layer,
which is an FC layer, the CNN recognizes the image or the object it represents.
With convolution, the input image goes through a set of these filters. As each filter activates
certain features from the image, it does its work and passes on its output to the filter in the
next layer. Each layer learns to identify different features and the operations end up being
repeated for dozens, hundreds or even thousands of layers. Finally, all the image data
progressing through the CNN's multiple layers allow the CNN to identify the entire object.
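The layer progression described above, convolution, pooling, then a fully connected classifier, can be sketched with the Keras API. The layer sizes and the 28×28 single-channel input here are illustrative assumptions, not the project's configuration:

```python
from tensorflow.keras import layers, models

# Convolution -> pooling -> convolution -> pooling -> flatten -> fully connected
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),          # toy grayscale image input
    layers.Conv2D(16, 3, activation="relu"),  # low-level features (e.g. edges)
    layers.MaxPooling2D(2),
    layers.Conv2D(32, 3, activation="relu"),  # more complex features
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # final classification over 10 classes
])
```

Each successive convolution sees a pooled summary of the previous one, which is how the network builds up from simple to complex features.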
Deep learning is a subset of machine learning that uses neural networks with at least three
layers. Compared to a network with just one layer, a network with multiple layers can deliver
more accurate results. Both RNNs and CNNs are used in deep learning, depending on the
application.
For image recognition, image classification and computer vision (CV) applications, CNNs
are particularly useful because they provide highly accurate results, especially when a lot of
data is involved. The CNN also learns the object's features in successive iterations as the
object data moves through the CNN's many layers. This direct (and deep) learning eliminates
the need for manual feature extraction (feature engineering).
CNNs can be retrained for new recognition tasks and built on preexisting networks. These
advantages open up new opportunities to use CNNs for real-world applications without
increasing computational complexities or costs.
As seen earlier, CNNs are more computationally efficient than regular NNs since they use
parameter sharing. The models are easy to deploy and can run on any device, including
smartphones.
A Recurrent Neural Network (RNN) is a type of neural network where the output from the
previous step is fed as input to the current step. In traditional neural networks, all the inputs
and outputs are independent of each other. However, in cases where it is required to predict
the next word of a sentence, the previous words are needed, and hence there is a need to
remember them. Thus the RNN came into existence, which solved this issue with
the help of a Hidden Layer. The main and most important feature of RNN is its Hidden state,
which remembers some information about a sequence. The state is also referred to as
Memory State since it remembers the previous input to the network. It uses the same
parameters for each input as it performs the same task on all the inputs or hidden layers to
produce the output. This reduces the complexity of parameters, unlike other neural networks.
Artificial neural networks that do not have looping nodes are called feedforward neural
networks. Because all information is passed only forward, this kind of neural network is also
referred to as a multi-layer neural network.
Information moves from the input layer to the output layer – if any hidden layers are present
– unidirectionally in a feedforward neural network. These networks are appropriate for image
classification tasks, for example, where input and output are independent. Nevertheless, their
inability to retain previous inputs automatically renders them less useful for sequential data
analysis.
The fundamental processing unit in a Recurrent Neural Network (RNN) is the recurrent unit
(rather than a "recurrent neuron"). This unit has the unique ability to
maintain a hidden state, allowing the network to capture sequential dependencies by
remembering previous inputs while processing. Long Short-Term Memory (LSTM) and
Gated Recurrent Unit (GRU) versions improve the RNN’s ability to handle long-term
dependencies.
The RNN works through the following steps:
1. Calculate the current state from the current input and the previous state.
2. Repeat for as many time steps as the problem requires, combining the information from all
the previous states.
3. Once all the time steps are completed, use the final state to calculate the output.
4. Compare the output to the actual (target) output and generate the error.
5. Back-propagate the error through the network to update the weights; in this way the
network (RNN) is trained using Backpropagation Through Time (BPTT).
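The recurrence described above can be sketched in NumPy. The tanh activation and the small dimensions are illustrative assumptions; note that the same weight matrices are reused at every time step, which is the parameter sharing mentioned earlier:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4

# The SAME parameters are reused at every time step (parameter sharing)
W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
b_h = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    """Compute the current state from the current input and the previous state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

sequence = rng.normal(size=(5, input_dim))   # 5 time steps of input
h = np.zeros(hidden_dim)                     # initial hidden (memory) state
for x_t in sequence:
    h = rnn_step(x_t, h)                     # h carries information forward
# h is the final state used to calculate the output
```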
ADVANTAGES
An RNN retains information across time steps, which makes it useful for time-series
prediction, since remembering previous inputs matters there. The variant designed to hold
such information over long spans is called Long Short-Term Memory (LSTM).
Recurrent neural networks are even used together with convolutional layers to extend the
effective pixel neighborhood.
DISADVANTAGES:
Standard RNNs suffer from vanishing and exploding gradients during training, which makes
learning long-term dependencies difficult. To overcome these problems, several advanced
versions of the RNN have been developed; some of these are described below.
A Bidirectional Neural Network (BiNN) is a variation of the recurrent neural network in
which the input information flows in both directions, and the outputs of the two directions are
combined to produce the final output. A BiNN is useful in situations where the context of the
input is important, such as NLP tasks and time-series analysis problems.
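The two-directional flow can be sketched by running one recurrent pass forward and one backward over the sequence and concatenating the per-step states. The dimensions and tanh activation below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
T, input_dim, hidden_dim = 6, 3, 4
x = rng.normal(size=(T, input_dim))   # toy sequence of 6 time steps

def run_direction(seq, W_xh, W_hh):
    """Run a simple recurrent pass over seq, returning the state at each step."""
    h = np.zeros(hidden_dim)
    states = []
    for x_t in seq:
        h = np.tanh(W_xh @ x_t + W_hh @ h)
        states.append(h)
    return np.array(states)

# Independent parameters for the forward and backward directions
Wf_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1
Wf_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
Wb_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1
Wb_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1

forward = run_direction(x, Wf_xh, Wf_hh)                # left-to-right pass
backward = run_direction(x[::-1], Wb_xh, Wb_hh)[::-1]   # right-to-left, re-aligned
combined = np.concatenate([forward, backward], axis=1)  # both contexts per step
```

Each row of `combined` now carries context from both before and after that time step, which is why this structure helps in NLP and time-series tasks.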
An RNN is considered the better choice over a plain deep neural network when the data is
sequential. There are significant differences between RNNs and feedforward deep neural
networks.
TECHNICAL DESCRIPTION
A deep neural network is simply a shallow neural network with more than one hidden
layer. Each neuron in the hidden layer is connected to many others. Each arrow has a weight
property attached to it, which controls how much that neuron's activation affects the others
attached to it.
The word 'deep' in deep learning refers to these deep hidden layers, and the network
derives its effectiveness from them. Selecting the number of hidden layers depends on the
nature of the problem and the size of the data set. The following figure shows a deep neural
network with two hidden layers.
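A forward pass through a two-hidden-layer network like the one in the figure can be sketched as follows. The layer sizes and the ReLU/sigmoid activations are illustrative assumptions (13 inputs roughly matching the heart dataset's feature count):

```python
import numpy as np

rng = np.random.default_rng(42)
sizes = [13, 8, 8, 1]   # input layer, two hidden layers, one output

# One weight matrix per layer; each entry is the weight on one "arrow"
weights = [rng.normal(size=(m, n)) * 0.1 for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

def forward(x):
    """Propagate x through the hidden layers to a single sigmoid output."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(0.0, W @ a + b)      # ReLU in the hidden layers
    z = weights[-1] @ a + biases[-1]
    return 1.0 / (1.0 + np.exp(-z))         # sigmoid output, e.g. a risk score

y = forward(rng.normal(size=13))
```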
In this section, we covered a high-level overview of how an artificial neural network works.
APPLICATIONS
Deep learning has a plethora of applications in almost every field such as health care,
finance, and image recognition. In this section, let's go over a few applications.
Health care: With easier access to accelerated GPU and the availability of huge
amounts of data, health care use cases have been a perfect fit for applying deep
learning. Using image recognition, cancer detection from MRI imaging and x-rays has
been surpassing human levels of accuracy. Drug discovery, clinical trial matching,
and genomics have been other popular health care-based applications.
Autonomous vehicles: Though self-driving is a risky field to automate, it has
recently taken a turn towards becoming a reality. From recognizing a stop sign to
seeing a pedestrian on the road, deep learning-based models are trained and tested
in simulated environments to monitor progress.
e-commerce: Product recommendations have been one of the most popular and
profitable applications of deep learning. With more personalized and accurate
recommendations, customers can easily shop for the items they are looking for
and can view all of the options they can choose from. This also accelerates
sales and thus benefits sellers.
Personal assistant: Thanks to advancements in the field of deep learning, having a
personal assistant is as simple as buying a device like Alexa or Google Assistant.
These smart assistants use deep learning in various aspects such as personalized voice
and accent recognition, personalized recommendations, and text generation.
Clearly, these are only a small portion of the vast applications to which deep learning can
be applied. Stock market predictions and weather predictions are also equally popular fields
in which deep learning has been helpful.
Though deep learning methods gained immense popularity in the last 10 years or so,
the idea has been around since the mid-1950s, when Frank Rosenblatt invented the perceptron
on an IBM 704 machine. It was a two-layer electronic device that had the ability to
detect shapes and do reasoning. Advancements in this field in recent years are primarily
because of the increase in computing power and high-performance graphical processing units
(GPUs), coupled with the large increase in the wealth of data these models have at their
disposal for learning, as well as interest and funding from the community for continued
research. Though deep learning has taken off in the last few years, it does come with its own
set of challenges that the community is working hard to resolve.
NEED FOR DATA
The deep learning methods prevalent today are very data hungry, and many complex
problems such as language translation don't have sophisticated data sets available. Deep
learning methods to perform neural machine translation to and from low-resource languages
often perform poorly, and techniques such as domain adaptation (applying learnings gained
from developing high-resource systems to low-resource scenarios) have shown promise in
recent years. For problems such as pose estimation, it can be arduous to generate such a high
volume of data. The synthetic data the model ends up training on differs a lot in reality from
the "in-the-wild" setup in which the model ultimately needs to perform.
Even though deep learning algorithms have proven to beat human-level accuracy,
there is no clear way to backtrack and provide the reasoning behind each prediction that's
made. This makes it difficult to use in applications such as finance where there are mandates
to provide the reasoning behind every loan that is approved or rejected.
Another dimension that tends to be an issue is the underlying bias in the data itself,
which can lead to poor performance of the model on crucial subsets of the data. Learning
agents that use a reward-based mechanism sometimes stop behaving ethically because all
they require to minimize system error is to maximize the reward they accrue. In one
well-known example, an agent simply stopped playing the game and ended up in an infinite
loop of collecting reward points. While this might be acceptable in a game scenario, wrong or unethical
decisions can have a profound negative impact in the real world. A strong need exists to
allow models to learn in a balanced fashion.
VII APPENDICES
7.1 CODING
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Load the heart disease dataset (file name assumed to be heart.csv)
df = pd.read_csv("heart.csv")
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 303 entries, 0 to 302
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 age 303 non-null int64
1 sex 303 non-null int64
2 cp 303 non-null int64
3 trestbps 303 non-null int64
4 chol 303 non-null int64
5 fbs 303 non-null int64
6 restecg 303 non-null int64
7 thalach 303 non-null int64
8 exang 303 non-null int64
9 oldpeak 303 non-null float64
10 slope 303 non-null int64
11 ca 303 non-null int64
12 thal 303 non-null int64
13 target 303 non-null int64
dtypes: float64(1), int64(13)
memory usage: 33.3 KB
df.isna().sum()
age 0
sex 0
cp 0
trestbps 0
chol 0
fbs 0
restecg 0
thalach 0
exang 0
oldpeak 0
slope 0
ca 0
thal 0
target 0
dtype: int64
df.describe()
df.sex.value_counts()
sex
1 207
0 96
Name: count, dtype: int64
pd.crosstab(df.cp, df.target)
pd.crosstab(df.cp, df.target).plot(kind="bar",
figsize=(10, 6),
color=["salmon", "lightblue"])
X = df.drop("target", axis=1)
y = df["target"]
X.head()
y.head()
0 1
1 1
2 1
3 1
4 1
Name: target, dtype: int64

np.random.seed(42)
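The classification report that follows is consistent with a train/test split and a K-Nearest Neighbors classifier evaluated on 61 test rows. A sketch of such a pipeline is shown below; the split ratio, `n_neighbors`, and the synthetic stand-in data (shaped like the heart dataset) are assumptions, since the notebook uses `X` and `y` from the DataFrame:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

# Stand-in data with the heart dataset's shape (303 rows, 13 features);
# in the notebook, X and y come from the DataFrame instead.
X, y = make_classification(n_samples=303, n_features=13, random_state=42)

# 20% of 303 rows gives the 61-row test set seen in the report below
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

knn = KNeighborsClassifier(n_neighbors=5)   # n_neighbors is an assumption
knn.fit(X_train, y_train)
y_pred_knn = knn.predict(X_test)
print(classification_report(y_test, y_pred_knn))
```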
accuracy 0.69 61
macro avg 0.69 0.69 0.69 61
weighted avg 0.69 0.69 0.69 61
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred_knn)
random_forest_classifier = RandomForestClassifier(n_estimators=100,
random_state=42)
accuracy 0.84 61
macro avg 0.84 0.84 0.84 61
weighted avg 0.84 0.84 0.84 61
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train.reshape(X_train.shape[0], -1)).reshape(X_train.shape)
X_test_scaled = scaler.transform(X_test.reshape(X_test.shape[0], -1)).reshape(X_test.shape)
# Model definition
cnn_input = Input(shape=(X_train_scaled.shape[1], X_train_scaled.shape[2]))
cnn_layer = Conv1D(filters=128, kernel_size=3, activation='relu')(cnn_input)
cnn_layer = MaxPooling1D(pool_size=2)(cnn_layer)
cnn_output = Flatten()(cnn_layer)
model.compile(optimizer=Adam(learning_rate=0.0001),
loss='binary_crossentropy', metrics=['accuracy'])
early_stopping = EarlyStopping(monitor='val_loss', patience=10,
restore_best_weights=True)
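The listing above omits the dense head and the `Model` construction between the flattened CNN output and the compile step. A plausible, self-contained completion consistent with the layers shown might look like this; the dense layer sizes and the stand-in input shape (13 features, 1 channel) are assumptions:

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv1D, MaxPooling1D, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Stand-in shape: 303 samples, 13 features, 1 channel (as in the notebook data)
cnn_input = Input(shape=(13, 1))
cnn_layer = Conv1D(filters=128, kernel_size=3, activation='relu')(cnn_input)
cnn_layer = MaxPooling1D(pool_size=2)(cnn_layer)
cnn_output = Flatten()(cnn_layer)

dense = Dense(64, activation='relu')(cnn_output)   # assumed head size
output = Dense(1, activation='sigmoid')(dense)     # binary heart-disease target
model = Model(inputs=cnn_input, outputs=output)

model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='binary_crossentropy', metrics=['accuracy'])
```

With this head in place, `model.fit(X_train_scaled, y_train, ...)` with the early-stopping callback produces the training log below.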
Epoch 1/100
...
Epoch 79/100
7/7 [==============================] - 0s 28ms/step - loss: 0.4746 - accuracy:
0.7824 - val_loss: 0.5207 - val_accuracy: 0.7755
# Create a DataFrame
model_compare = pd.DataFrame(model_scores, index=["accuracy"])
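The comparison DataFrame can then be plotted as a bar chart. The scores below are illustrative stand-ins echoing the accuracies reported earlier; in the notebook, `model_scores` comes from each fitted model's evaluation:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripted use
import matplotlib.pyplot as plt

# Stand-in scores; in the notebook these come from each model's test accuracy
model_scores = {"KNN": 0.69, "Random Forest": 0.84, "CNN": 0.78}
model_compare = pd.DataFrame(model_scores, index=["accuracy"])

ax = model_compare.T.plot.bar(legend=False, rot=0)
ax.set_ylabel("accuracy")
plt.tight_layout()
```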
NEGATIVE RESULT:
POSITIVE RESULT:
CONCLUSION
The project successfully introduces an ensemble approach for heart disease prediction,
showcasing improved accuracy and interpretability. The ensemble framework, combining
CNN and RNN, proves to be a promising avenue for advancing predictive modeling in
healthcare. The comprehensive report serves as a valuable resource for stakeholders,
outlining methodologies, results, and future avenues for research and improvement.
FUTURE ENHANCEMENT
The Data Processing Module successfully handled missing values, outliers, and applied
feature scaling, resulting in a refined and cleaned dataset. The thorough data processing lays a
solid foundation for subsequent model training, ensuring that the models are fed with high-
quality and standardized inputs. The Ensemble Model Module effectively combined
predictions from the Convolutional Neural Network (CNN) and the Recurrent Neural
Network (RNN), showcasing improved predictive accuracy. By leveraging the strengths of
both CNN and RNN, the ensemble approach demonstrates the potential for enhanced
performance, capturing a more comprehensive understanding of heart disease risk factors.
The integration of CNN and RNN in an ensemble demonstrates the potential for harnessing
diverse modeling approaches. The project contributes to the field of healthcare analytics,
offering a robust tool for early heart disease detection and informed decision-making.
IX BIBLIOGRAPHY
[1] S. J. Pasha and E. S. Mohamed, "Novel feature reduction (NFR) model with machine
learning and data mining algorithms for effective disease risk prediction," IEEE Access,
vol. 8, pp. 184087–184108, 2020.
[2] Y. Khan, U. Qamar, N. Yousaf, and A. Khan, "Machine learning techniques for heart
disease datasets: A survey," in Proc. 11th Int. Conf. Mach. Learn. Comput. (ICMLC), Zhuhai,
China, 2019, pp. 27–35.
[3] S. Goel, A. Deep, S. Srivastava, and A. Tripathi, "Comparative analysis of various
techniques for heart disease prediction," in Proc. 4th Int. Conf. Inf. Syst. Comput. Netw.
(ISCON), Mathura, India, Nov. 2019, pp. 88–94.
[5] S. Mohan, C. Thirumalai, and G. Srivastava, "Effective heart disease prediction using
hybrid machine learning techniques," IEEE Access, vol. 7, pp. 81542–81554, 2019.
[7] D. W. Hosmer, S. Lemeshow, and E. D. Cook, Applied Logistic Regression, 2nd ed. New
York, NY, USA: Wiley, 2000.
[9] R. Atallah and A. Al-Mousa, "Heart disease detection using machine learning majority
voting ensemble method," in Proc. 2nd Int. Conf. New Trends Comput. Sci. (ICTCS), Oct.
2019, pp. 1–6.
[10] A. Gupta, L. Kumar, R. Jain, and P. Nagrath, "Heart disease prediction using
classification (Naive Bayes)," in Proc. 1st Int. Conf. Comput., Commun., Cyber-Secur. (ICS).
Singapore: Springer, 2020, pp. 561–573.