
A Major Project Report on

DETECTION OF CHRONIC HEART FAILURE FROM HEART


SOUNDS USING INTEGRATED ML AND DL MODELS

Submitted for partial fulfilment of the requirements for the award of the degree of

BACHELOR OF TECHNOLOGY

In

ELECTRONICS AND COMMUNICATION ENGINEERING


By

SILASAGARAM SOWJANYA 20K81A04A8

KETHAVATH NARESH 20K81A0490


KOPPULA NITHISH 20K81A0493
MARIPELLI VINOD 20K81A0495

Under the Guidance of

Mr. L. CHANDRA SHEKAR

ASSISTANT PROFESSOR

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

St. MARTIN'S ENGINEERING COLLEGE


UGC Autonomous
Affiliated to JNTUH, Approved by AICTE,
Accredited by NBA & NAAC A+, ISO 9001:2008 Certified
Dhulapally, Secunderabad - 500 100
www.smec.ac.in

March - 2024
St. MARTIN'S ENGINEERING COLLEGE
UGC Autonomous
NBA & NAAC A+ Accredited
Dhulapally, Secunderabad - 500 100

Certificate

This is to certify that the project entitled “DETECTION OF CHRONIC HEART


FAILURE FROM HEART SOUNDS USING INTEGRATED ML AND DL MODELS”
is being submitted by S. SOWJANYA (20K81A04A8), K. NITHISH (20K81A0493), M.
VINOD (20K81A0495), and K. NARESH (20K81A0490) in partial fulfilment of the
requirements for the award of the degree of BACHELOR OF TECHNOLOGY IN
ELECTRONICS AND COMMUNICATION ENGINEERING, and is a record of bonafide
work carried out by them. The results embodied in this report have been verified and found
satisfactory.

Signature of Guide                                        Signature of Head of the Department


Mr. L. Chandra Shekar                                     Dr. B. Hari Krishna
Assistant Professor Professor & Head
Department of ECE Department of ECE

Internal Examiner External Examiner

Date:

Place:
St. MARTIN'S ENGINEERING COLLEGE
UGC Autonomous
NBA & NAAC A+ Accredited
Dhulapally, Secunderabad - 500 100
www.smec.ac.in
DEPARTMENT OF ELECTRONICS AND COMMUNICATION
ENGINEERING

DECLARATION

We, the students of ‘Bachelor of Technology in Department of Electronics and


Communication Engineering’, session: 2020 - 2024, St. Martin’s Engineering
College, Dhulapally, Kompally, Secunderabad, hereby declare that the work
presented in this Project Work entitled DETECTION OF CHRONIC HEART
FAILURE FROM HEART SOUNDS USING INTEGRATED ML AND DL
MODELS is the outcome of our own bonafide work, is correct to the best of our
knowledge, and has been undertaken with due regard for Engineering Ethics. The
results embodied in this project report have not been submitted to any other university
for the award of any degree.

1. Silasagaram Sowjanya (20K81A04A8)

2. Maripelli Vinod (20K81A0495)

3. Kethavath Naresh (20K81A0490)

4. Koppula Nithish (20K81A0493)


ACKNOWLEDGEMENT

The satisfaction and euphoria that accompanies the successful completion of


any task would be incomplete without the mention of the people who made it possible
and whose encouragement and guidance have crowned our efforts with success.

First and foremost, we would like to express our deep sense of gratitude and
indebtedness to our College Management for their kind support and permission to use
the facilities available in the Institute.

We especially would like to express our deep sense of gratitude and


indebtedness to Dr. P. SANTOSH KUMAR PATRA, Group Director, St. Martin’s
Engineering College Dhulapally, for permitting us to undertake this project.

We wish to record our profound gratitude to Dr. M. SREENIVAS RAO,


Principal, St. Martin’s Engineering College, for his motivation and encouragement.

We are also thankful to Dr. B. HARI KRISHNA, Head of the Department,


Electronics and Communication Engineering, St. Martin's Engineering College,
Dhulapally, Secunderabad, for his support and guidance throughout our project, as well
as to the Project Coordinator, Mr. Venkanna Mood Naik, Associate Professor,
Department of Electronics and Communication Engineering, for his valuable support.

We would like to express our sincere gratitude and indebtedness to our project
supervisor Mr. L. Chandra Shekar, Assistant Professor, Department of Electronics and
Communication Engineering, St. Martin's Engineering College, Dhulapally, for his support and guidance
throughout our project.

Finally, we express our thanks to all those who have helped us in successfully

completing this project. Furthermore, we would like to thank our family and friends for
their moral support and encouragement.

S. SOWJANYA 20K81A04A8
K. NARESH 20K81A0490
K. NITHISH 20K81A0493
M VINOD 20K81A0495

ABSTRACT
Chronic heart failure (CHF) is a progressive condition characterized by the heart's
inability to supply enough perfusion to target tissues and organs at physiological
filling pressures to meet their metabolic demands. CHF has reached epidemic proportions in
the population, as its incidence is increasing by 2% annually. In the developed world, CHF
affects 1-2% of the total population and 10% of people older than 65 years. Currently, the
diagnosis and treatment of CHF uses approximately 2% of the annual healthcare budget. In
absolute terms, the USA spent approximately 35 billion USD to treat CHF in 2018 alone,
and the costs are expected to double in the next 10 years. Despite the progress in medical-
and device-based treatment approaches in recent decades, the overall prognosis of CHF is
still dismal, as the 5-year survival rate of this population is only approximately 50%. In the
typical clinical course of CHF, we observe alternating episodes of compensated phases,
when the patient feels well and does not display symptoms and signs of fluid overload, and
decompensated phases, when symptoms and signs of systemic fluid overload (such as
breathlessness, orthopnea, peripheral edema, liver congestion, pulmonary edema) can easily
be observed. During the latter episodes, patients often require hospital admission to receive
treatment with intravenous medications (diuretics, inotropes) to achieve a successful
negative fluid balance and return to the compensation state. Currently, an experienced
physician can detect the worsening of HF by examining the patient and by characteristic
changes in the patient’s heart failure biomarkers, which are determined from the patient’s
blood. Unfortunately, clinical worsening of a CHF patient likely means that we are already
dealing with a fully developed CHF episode that will most likely require a hospital
admission. Additionally, in some patients, characteristic changes in heart sounds can
accompany heart failure worsening and can be heard using phonocardiography. Therefore,
with the use of recent advancements in machine learning and deep learning models, this
project implements the detection of chronic heart failure from phonocardiography (PCG)
data using an end-to-end average-aggregate recording model built with features extracted
by both machine learning and deep learning. The results of the proposed Chronic Net model
are also compared with those of individual ML and DL models.

LIST OF FIGURES

Figure No. Figure Title Page No.

4.1 Proposed block diagram 27

4.2 MFCC operation diagram 31

4.3 Proposed deep CNN model for feature extraction 31

4.3.1.1 Representation of convolution layer process 32

4.3.1.2 Example of convolution layer process 32

4.3.4.1 Emotion prediction using layer process 33

4.3.4.2 Example of Soft Max classifier 34

4.3.4.3 Example of SoftMax classifier with test data 35

4.4 Random Forest algorithm 36

4.4.2.1 RF classifier analysis 38

4.4.2.2 Boosting RF classifier 38

6.10.1 Website details of Python 57

6.10.2 Downloading latest version of Python 57

6.10.3 Different versions of Python 58

7.2 Sample dataset 64

7.3.1 Upload of Physio net Dataset 65

7.3.2 Preprocessing of the Uploaded dataset 65

7.3.3 Count plot for count of each label 66

7.3.4 Performance evaluation of CNN model per epoch 66

7.3.5 Performance evaluation comparison of all models 67

7.3.6 Selecting and uploading of given heart sound model 67

7.3.7 Result of given heart sound model 68

LIST OF TABLES

Table No. Table Name Page No.
2.1 Summary of survey 18

7.3 Performance model for all models 68

CONTENTS

ACKNOWLEDGEMENT i

ABSTRACT ii
LIST OF FIGURES iii
LIST OF TABLES iv
CHAPTER 1 INTRODUCTION 1

1.1 Overview 1
1.2 Motivation 2
1.3 Problem Statement 3
1.4 Applications 4
CHAPTER 2 LITERATURE SURVEY 5

2.1 Summary of Survey 18


2.2 Problem Statement 23

CHAPTER 3 EXISTING SYSTEM 24


3.1 Decision Tree 24
3.1.1 DTC Technologies 24
3.1.2 Important Features of DTC 24
3.1.3 Assumptions for DTC 25
3.2 Drawbacks of decision Tree classifier 25

CHAPTER 4 PROPOSED METHOD 27


4.1 Preprocessing 28
4.2 MFCC feature extraction 29
4.3 CNN model 31
4.3.1 ReLU layer 33
4.3.2 Max pooling layer 33
4.3.3 SoftMax classifier 33
4.4 Random Forest classifier 35
4.4.1 Assumptions for Random Forest 37
4.4.2 Types of Ensembles 37
CHAPTER 5 MACHINE LEARNING 39

5.1 What is Machine learning 39


5.2 Categories of Machine Learning 39
5.3 Need for Machine Learning 40
5.4 Challenges in Machine Learning 40
5.5 Applications of Machine Learning 41
5.6 How to Start Machine Learning 42
5.7 Advantages of Machine Learning 44
5.8 Disadvantages of Machine Learning 45
CHAPTER 6 SOFTWARE ENVIRONMENT 47
6.1 What is Python? 47
6.2 Advantages of Python 47
6.3 Advantages of Python over other languages 49
6.4 Disadvantages of Python 50
6.5 History of Python 51
6.6 Python Development steps 52
6.7 Purpose 52
6.8 Python 53
6.9 Modules used in project 53
6.10 How to Install Python on Windows and Mac 56
6.10.1 Installation of Python 59
6.10.2 Verify the Python Installation 60
6.10.3 Check how the Python IDLE works 61
CHAPTER 7 RESULTS AND DISCUSSIONS 63
7.1 Implementation Description 63
7.2 Dataset Description 64
7.3 Results Description 65
CHAPTER 8 CONCLUSION 70
Future scope 70
References 71
APPENDIX 72
CHAPTER 1
INTRODUCTION
1.1 Overview

"The detection of chronic heart failure (CHF) from phonocardiogram (PCG) data using
unified machine learning and deep learning models is a promising area of research. PCG
data refers to the acoustic signals that are generated by the heart during its normal cycle
of contraction and relaxation. This data can be recorded using specialized equipment,
and analysed using machine learning and deep learning algorithms to detect CHF.
The use of a unified machine learning and deep learning model for CHF detection from
PCG data involves the combination of traditional machine learning techniques, such as
logistic regression or support vector machines, with deep learning techniques, such as
convolutional neural networks or recurrent neural networks. The goal of this approach is
to leverage the strengths of both types of algorithms to improve the accuracy of CHF
detection.
The process of developing a unified machine learning and deep learning model for CHF
detection from PCG data typically involves several steps. First, the PCG data is pre-
processed to remove noise and artifacts, and to extract relevant features such as heart
rate, amplitude, and frequency. Then, the data is split into training, validation, and testing
sets, and used to train the machine learning and deep learning models. The models are
evaluated on the testing set to determine their accuracy in detecting CHF.
One advantage of using a unified machine learning and deep learning model for CHF
detection from PCG data is that it can help to overcome some of the limitations of
traditional machine learning techniques. For example, deep learning algorithms are well-
suited to handling complex and high-dimensional data, and can automatically extract
relevant features from the PCG signals. However, deep learning algorithms can also be
computationally expensive and require large amounts of training data.
Overall, the use of a unified machine learning and deep learning model for CHF
detection from PCG data is an active area of research with the potential to improve the
accuracy of CHF detection, and ultimately improve outcomes for patients with this
condition.
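
As a rough sketch of the workflow just described, the following Python fragment splits a hypothetical feature matrix (one row of extracted PCG features per recording) into training, validation, and testing sets and evaluates a classifier on the held-out data; the synthetic data and the Random Forest stand-in are illustrative assumptions, not this project's exact implementation.

# Minimal sketch of the split/train/evaluate pipeline described above.
# X and y are placeholders for extracted PCG features and CHF labels.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))        # placeholder feature matrix
y = rng.integers(0, 2, size=200)      # placeholder labels (1 = CHF)

# 60/20/20 split into training, validation, and testing sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("Validation accuracy:", accuracy_score(y_val, clf.predict(X_val)))
print("Test accuracy:", accuracy_score(y_test, clf.predict(X_test)))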

1.2 Motivation
"The motivation for developing a unified machine learning and deep learning model for
the detection of chronic heart failure (CHF) from phonocardiogram (PCG) data is driven
by the need for accurate and non-invasive methods for CHF diagnosis. CHF is a serious
and progressive condition where the heart is unable to pump blood effectively, leading to
a variety of symptoms and complications. Early diagnosis and treatment of CHF is
critical for improving patient outcomes, but current methods for diagnosis, such as
echocardiography and cardiac MRI, can be costly and invasive.
PCG data provides a non-invasive and low-cost alternative for CHF diagnosis.
However, accurately interpreting this data can be challenging, as it requires the analysis
of complex and high-dimensional signals. Traditional machine learning techniques have
been used to analyse PCG data, but these approaches may not be able to capture the full
complexity of the signals. On the other hand, deep learning techniques have shown
promise in analysing PCG data, but they require large amounts of training data and can
be computationally expensive.
The development of a unified machine learning and deep learning model for CHF
detection from PCG data seeks to address these limitations by combining the strengths of
both techniques. This approach allows for the automatic extraction of relevant features
from the PCG signals, while also leveraging traditional machine learning techniques for
classification and prediction. By improving the accuracy of CHF diagnosis using PCG
data, a unified machine learning and deep learning model has the potential to improve
patient outcomes by enabling earlier diagnosis and treatment of this condition.
The objective of the project "Chronic Net: Detection of Chronic Heart Failure from
Heart Sounds using Integrated ML and DL Models" is to develop a novel approach for
the early detection of chronic heart failure (CHF) using phonocardiography (PCG) data.
This approach aims to leverage advancements in machine learning (ML) and deep
learning (DL) techniques to create an integrated model capable of accurately identifying
signs of CHF from heart sounds. By utilizing both ML and DL methodologies, the goal
is to enhance the accuracy and reliability of CHF detection, enabling earlier intervention
and management of the condition. Additionally, the project seeks to compare the
performance of the proposed Chronic Net model with individual ML and DL models to
assess its effectiveness in CHF detection. Ultimately, the objective is to provide
clinicians with a tool that can aid in the timely identification of CHF worsening,

potentially allowing for outpatient management and reducing the need for hospital
admissions.
1.3 Problem Statement
The problem statement for the project "Chronic Net: Detection of Chronic Heart Failure
from Heart Sounds using Integrated ML and DL Models" is as follows:
Despite the prevalence of chronic heart failure (CHF) and its significant impact on
healthcare systems worldwide, there remains a critical need for improved methods of
early detection and intervention. Currently, CHF diagnosis relies heavily on clinical
examination and biomarker analysis, which may not always detect subtle signs of
worsening until the condition has progressed to a severe stage, necessitating
hospitalization. Moreover, traditional diagnostic approaches may lack sensitivity and
specificity, leading to misdiagnosis or delayed treatment.
To address these challenges, this project aims to develop an innovative solution using
phonocardiography (PCG) data and advanced machine learning (ML) and deep learning
(DL) techniques. The primary problem is to design a model capable of accurately
detecting early signs of CHF exacerbation from heart sounds, thereby enabling timely
intervention and potentially reducing the need for hospital admissions. This involves
overcoming several key challenges, including:
Extracting relevant features from PCG data that capture subtle changes indicative of
CHF worsening.
Developing ML and DL models capable of effectively analysing and interpreting these
features to distinguish between normal and pathological heart sounds.
Integrating ML and DL methodologies to create a comprehensive model (Chronic Net)
that leverages the strengths of both approaches.
Evaluating the performance of the Chronic Net model and comparing it with individual
ML and DL models to assess its effectiveness in CHF detection.
Addressing potential limitations such as data variability, noise, and model interpretability
to ensure the practical utility and reliability of the proposed solution.
By addressing these challenges, the project aims to provide clinicians with a valuable
tool for early CHF detection, potentially improving patient outcomes and reducing
healthcare costs associated with CHF management.

1.4 Applications
The application of "Chronic Net: Detection of Chronic Heart Failure from Heart Sounds
using Integrated ML and DL Models" extends across various domains within healthcare
and biomedical engineering. Some key applications include:
Early Detection of Chronic Heart Failure (CHF): The primary application of Chronic
Net is in the early detection of CHF exacerbations. By analysing heart sounds using
advanced ML and DL models, Chronic Net can identify subtle changes indicative of
CHF worsening before symptomatic manifestation, enabling timely intervention and
management.
Remote Patient Monitoring: Chronic Net can be integrated into remote patient
monitoring systems, allowing healthcare providers to continuously monitor patients'
heart sounds remotely. This enables proactive intervention in response to early signs of
CHF exacerbations, reducing the need for frequent hospital visits and improving patient
quality of life.
Point-of-Care Diagnosis: The development of portable devices equipped with Chronic
Net technology could facilitate point-of-care diagnosis of CHF. This would enable rapid
screening and diagnosis in clinical settings such as primary care clinics, emergency
departments, and mobile healthcare units, improving access to timely care for patients in
underserved areas.
Personalized Medicine: Chronic Net can contribute to the advancement of personalized
medicine by providing clinicians with valuable insights into individual patients' cardiac
health status. By analysing longitudinal heart sound data, the model can tailor treatment
plans and interventions to meet the specific needs of each patient, optimizing therapeutic
outcomes and minimizing adverse events.
Research and Clinical Trials: Chronic Net can serve as a valuable tool for researchers
and clinicians involved in cardiovascular research and clinical trials. By accurately
detecting CHF exacerbations and monitoring disease progression, the model can
facilitate the development and evaluation of novel therapies and interventions for CHF
patients.
Medical Education and Training: Chronic Net can be utilized in medical education and
training programs to enhance the learning experience of healthcare professionals. By
providing real-world examples of CHF diagnosis and management, the model can help
trainees develop diagnostic skills and clinical decision-making abilities in
cardiovascular medicine.

CHAPTER 2
LITERATURE SURVEY

1. Gjoreski, Martin, et al. "Machine learning and end-to-end deep learning for the
detection of chronic heart failure from heart sounds." IEEE Access 8 (2020): 20313-
20324.
Gjoreski, Martin, et al (2020) [1] presented a method for CHF detection based on heart
sounds. This method combines classic Machine-Learning (ML) and end-to-end Deep
Learning (DL). The classic ML learns from expert features, and the DL learns from a
Spectro-temporal representation of the signal. This method was evaluated on recordings
from 947 subjects from six publicly available datasets and one CHF dataset that was
collected for this study. Using the same evaluation method as a recent PhysioNet
challenge, the proposed method achieved a score of 89.3, which is 9.1 higher than the
challenge's baseline method. This method's aggregated accuracy is 92.9% (error of
7.1%); while the experimental results are not directly comparable, this error rate is
relatively close to the percentage of recordings labelled as “unknown” by experts (9.7%).
Finally, they identified 15 expert features that are useful for building ML models to
differentiate between CHF phases (i.e., in the decompensated phase during
hospitalization and in the recompensated phase) with an accuracy of 93.2%. The
proposed method shows promising results both for the distinction of recordings between
healthy subjects and patients and for the detection of different CHF phases.
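
For context, a spectro-temporal representation of a heart sound recording, of the kind the DL part of such a method learns from, can be computed with standard tools. The sketch below uses librosa; the file name, the 2 kHz sampling rate, and the window parameters are illustrative assumptions.

# Hedged sketch: log-mel spectrogram of a PCG recording as a
# spectro-temporal input for a deep model ("heart_sound.wav" is assumed).
import numpy as np
import librosa

y, sr = librosa.load("heart_sound.wav", sr=2000)  # PCG energy lies well below 1 kHz
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=256, hop_length=64, n_mels=64)
log_mel = librosa.power_to_db(mel, ref=np.max)    # log scale compresses dynamic range
print(log_mel.shape)                              # (n_mels, n_frames)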

2. Gahane, Aroh, and Chinnaiah Kotadi. "An Analytical Review of Heart Failure
Detection based on Machine Learning." 2022 Second International Conference on
Artificial Intelligence and Smart Energy (ICAIS). IEEE, 2022.
Gahane, Aroh, and Chinnaiah Kotadi (2022) [2] investigated different approaches for
detecting CHF based on the heart sounds produced by the patient. The perception of
heart rate, as well as the relationship between heart sounds and cardiovascular disease,
are important considerations. The basic techniques used in the processing and
interpretation of cardiac signals seem to be de-noising, categorization, extraction, feature
extraction, and classification, among others. Because of the emphasis on the usage of
Machine Learning (ML) algorithms for analysing heart sounds, classic Machine-
Learning (ML) technologies are merged with IoT end-to-end technologies, and both are

integrated with a wide range of defined techniques. The primary goal is to examine the
many Internet-of-Things technologies used to forecast heart attack disease and how they
are applied: not only to explain existing heart attack prediction, but also to address
awareness and monitoring systems for patients who are likely to be suffering from
cardiovascular illness.

3. Shuvo, Samiul Based, et al. "CardioXNet: A novel lightweight deep learning


framework for cardiovascular disease classification using heart sound recordings." IEEE
Access 9 (2021): 36955-36967.
Shuvo, Samiul Based, et al (2021) [3] proposed CardioXNet, a novel lightweight end-
to-end CRNN architecture for automatic detection of five classes of cardiac auscultation
namely normal, aortic stenosis, mitral stenosis, mitral regurgitation and mitral valve
prolapse using raw PCG signal. The process has been automated by the involvement of
two learning phases namely, representation learning and sequence residual learning.
Three parallel CNN pathways have been implemented in the representation learning
phase to learn the coarse and fine-grained features from the PCG and to explore the
salient features from variable receptive fields involving 2D-CNN based squeeze-
expansion. Thus, in the representation learning phase, the network extracts efficient time-
invariant features and converges with great rapidity. It outperforms any previous works
using the same database by a considerable margin. Moreover, the proposed model was
tested on PhysioNet/CinC 2016 challenge dataset achieving an accuracy of 86.57%.
Finally, the model was evaluated on a merged dataset of the GitHub PCG dataset and the
PhysioNet dataset, achieving an excellent accuracy of 88.09%. The high accuracy metrics on
both primary and secondary datasets, combined with a significantly low number of
parameters and an end-to-end prediction approach, make the proposed network especially
suitable for point-of-care CVD screening in low-resource setups using memory-constrained
mobile devices.

4. Li, Suyi, et al. "A review of computer-aided heart sound detection techniques."
BioMed
research international 2020 (2020).
Li, Suyi, et al. (2020) [4] reviewed the latest developments in computer-aided heart sound
detection techniques over the last five years, techniques that play an important role in the
prediction of cardiovascular diseases. The review mainly covers the

following aspects: the theories of heart sounds and the relationship between heart sounds
and cardiovascular diseases; the key technologies used in the processing and analysis of
heart sound signals, including denoising, segmentation, feature extraction and
classification; and, with emphasis, the applications of deep learning algorithms in heart sound
processing. In the end, some areas for future research in computer-aided heart sound
detection techniques are explored, providing a reference for the prediction of
cardiovascular diseases.

5 .Miotto, Riccardo, et al. "Deep learning for healthcare: review, opportunities and
challenges." Briefings in bioinformatics 19.6 (2018): 1236-1246.
Miotto, Riccardo, et al. (2018) [5] observed that deep learning provides new, effective
paradigms to obtain end-to-end learning models from complex data. They reviewed the
recent literature on applying deep learning technologies to advance the healthcare domain.
Based on the analyzed work, they suggest that deep learning approaches could be the
vehicle for translating big biomedical data into improved human health. However, they
also note limitations and the need for improved method development and applications,
especially in terms of ease of understanding for domain experts and citizen scientists.
They discuss such challenges and suggest developing holistic and meaningful interpretable
architectures to bridge deep learning models and human interpretability.

6. Allugunti, Viswanatha Reddy. "Heart disease diagnosis and prediction based on


hybrid machine learning model."
Allugunti, Viswanatha Reddy [6] noted that machine learning provides an efficient
answer to the problem of making decisions and accurate forecasts, and that the application
of machine learning strategies is making significant headway in the medical sector. A
unique machine learning technique was presented for the purpose of predicting cardiac
disease. The PhysioNet dataset was utilised for the proposed study, and data mining
algorithms such as regression and classification were employed. Support Vector Machine,
Decision Tree and Random Forest are the machine learning approaches utilised here. A
cutting-edge strategy for the machine learning model was devised. Support Vector
Machine, Random Forest, Decision Tree, and a hybrid model (a hybrid of SVM, RF and
DT) are the four types of machine learning algorithms utilised
in the implementation process. The accuracy level of the heart disease prediction model
using the hybrid model was found to be 88.7 percent based on the results of the

experiments. The user's input parameters are used to predict heart illness with a model
that is a hybrid of Decision Tree and Random Forest; an interface is built to acquire these
input parameters.
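
A hybrid of SVM, RF, and DT along these lines can be sketched with scikit-learn's soft-voting ensemble; this is an illustrative reconstruction on synthetic placeholder data, not the paper's exact model.

# Hedged sketch: hybrid SVM + Random Forest + Decision Tree via soft voting.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=20, random_state=0)  # placeholder data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

hybrid = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),  # probability=True enables soft voting
        ("rf", RandomForestClassifier(random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    voting="soft",  # average the predicted class probabilities
)
hybrid.fit(X_tr, y_tr)
print("Hybrid accuracy:", hybrid.score(X_te, y_te))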

7. Zubair, Muhammad. "A Peak Detection Algorithm for Localization and


Classification of Heart Sounds in PCG Signals using K-means Clustering." (2021).
Zubair, Muhammad (2021) [7] noted that to detect the sounds S1 and S2, features such as
envelograms, Mel-frequency cepstral coefficients (MFCC), and kurtosis of these sounds
are extracted. These features are used for the classification of normal and abnormal heart
sounds, which leads to an increase in computational complexity. They proposed a
fully automated algorithm to localize heart sounds using K-means clustering. The K-
means clustering model can differentiate between the primitive heart sounds like S1, S2,
S3, S4 and the rest of the insignificant sounds like murmurs without requiring the
excessive pre-processing of data. The peaks detected from the noisy data are validated by
implementing five classification models with 30-fold cross-validation. These models
have been implemented on a publicly available PhysioNet/Cinc challenge 2016 database.
Lastly, to classify between normal and abnormal heart sounds, the localized labelled
peaks from all the datasets were fed as an input to the various classifiers such as support
vector machine (SVM) and K-nearest neighbours (KNN).
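
The two ingredients named here, MFCC features and K-means clustering, can be sketched as follows with librosa and scikit-learn; the file name and all parameter values are assumptions for illustration only.

# Hedged sketch: per-frame MFCCs from a PCG recording, clustered by K-means.
import librosa
from sklearn.cluster import KMeans

y, sr = librosa.load("pcg.wav", sr=2000)             # "pcg.wav" is an assumed file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape: (13, n_frames)

# Cluster the per-frame MFCC vectors; clusters can loosely separate
# S1/S2-like frames from murmur- or noise-like frames.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(mfcc.T)
print(labels[:20])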

8. Valera, HH Alvarez, and M. Luštrek. "Machine Learning Models for Detection of


Decompensation in Chronic Heart Failure Using Heart Sounds." Workshops at 18th
International Conference on Intelligent Environments (IE2022). Vol. 31. IOS Press,
2022.
Valera, HH Alvarez, and M. Luštrek (2022) [8] investigated chronic heart failure (HF)
diagnosis with the application of machine learning (ML) approaches. They simulated the
procedure that is followed in clinical practice, as the models they built are based on
various combinations of feature categories, e.g., clinical features, echocardiogram, and
laboratory findings. They also investigated the incremental value of each feature type. The
total number of subjects utilized was 422. An ML approach is proposed, comprising
feature selection, handling class imbalance, and classification steps. The results for HF
diagnosis were quite satisfactory with a high accuracy (91.23%), sensitivity (93.83%),
and specificity (89.62%) when features from all categories were utilized. The results
remained quite high, even in cases where single feature types were employed.

9. Ravi, Rohit, and P. Madhavan (2022) "Prediction of Cardiovascular Disease using
Machine Learning Algorithms." 2022 International Conference on Communications,
Information, Electronic and Energy Systems (CIEES). IEEE, 2022.
Ravi, Rohit, and P. Madhavan (2022) [9] investigated the use of different algorithms to
extract precise information for various domains. Across the world, approximately 3
quintillion bytes of information are generated per day, and this data is stored for further
examination. Because the data is so large, appropriate methods must be applied so that
analysis and prediction can be carried out optimally. Clinical decision making is central
to all patient care activities, which includes choosing a course of action among
alternatives. Emerging fields like Machine Learning now play a prime role in healthcare
for analyzing and predicting diseases. After investigating numerous research articles on
Machine Learning, it was found that, for the same dataset, accuracy differed across
algorithms.
10. Susic, D., Gregor Poglajen, and Anton Gradišek. "Machine learning models for
detection of decompensation in chronic heart failure using heart sounds." Proceedings of
the Workshops at 18th International Conference on Intelligent Environments (IE2022).
Amsterdam: IOS Press, 2022.
Susic, D., Gregor Poglajen, and Anton Gradišek (2022) [10] noted that early detection of
worsening would allow clinicians to adjust medical therapy, which would in turn prevent
the development of more severe heart failure decompensation, thus avoiding the need for
heart failure-related hospitalizations. Currently, heart failure worsening is recognized by
clinicians through characteristic changes of heart failure-related symptoms and signs,
including changes in heart sounds. The latter has proven to be largely unreliable, as its
interpretation is highly subjective and dependent on the clinicians' skills and preferences.
Previous studies have indicated that artificial intelligence algorithms are promising in
distinguishing the heart sounds of heart failure patients from those of healthy individuals.
They focused on the analysis of heart sounds of chronic heart failure patients in their
decompensated and recompensated phases. The data was recorded on 37 patients using
two types of electronic stethoscopes. Using a combination of machine learning
approaches, they obtained up to 72% classification accuracy between the two phases,
which is better than the accuracy of interpretation by cardiologists, which reached 50%.
Their results demonstrate that machine learning algorithms are promising in improving
early detection of heart failure decompensation episodes.

11. Sreejith, S., S. Rahul, and R. C. Jisha. "A real time patient monitoring system for
heart disease prediction using random forest algorithm." 2016.
Sreejith, S., S. Rahul, and R. C. Jisha (2016) [11] proposed a system that provides a
framework for measuring the heart rate, temperature and blood pressure of the patient
using a wearable gadget; the measured parameters are transmitted to a Bluetooth-enabled
Android smartphone. The various parameters are analyzed and processed by an Android
application on the client side. This processed output is transferred to the server side at
periodic intervals. Whenever an emergency arises, an alert message is forwarded to the
various care providers by the client-side application. The use of various wireless
technologies like GPS, GPRS, and Bluetooth makes it possible to monitor the patient
remotely. The system is said to be intelligent because of its diagnosis capability, timely
alerts for medication, etc. Current statistics show that heart disease is the leading cause of
death, which underlines the importance of technology in providing a solution for reducing
the cardiac arrest rate.

12. Gjoreski, Martin, et al. "Chronic heart failure detection from heart sounds using a
stack of machine-learning classifiers." 2017 International Conference on Intelligent
Environments (IE). IEEE, 2017.
Gjoreski, Martin, et al. (2017) [12] presented a machine-learning method for chronic
heart failure detection from heart sounds. This method consists of: filtering,
segmentation, feature extraction and machine learning. This method was tested with a
leave-one-subject-out evaluation technique on data from 122 subjects. The method
achieved 96% accuracy, outperforming a majority classifier by 15 percentage points.
More specifically, it detects (recalls) 87% of the chronic heart failure subjects with a
precision of 87%.
13. Ismail, Shahid, et al. "PCG classification through spectrogram using transfer
learning." Biomedical Signal Processing and Control 79 (2023): 104075.
Ismail, Shahid, et al(2023) [13] proposed a technique that relies on signal filtering, time
segmentation, spectrogram generation, hybrid classification and finally a voting based
mechanism. It carries out analysis at the cycle level as well as at the signal level. Evaluation of the
proposed technique on a challenging public dataset (PASCAL 2011) results in precision,
recall and accuracy values of greater than 95% using 5-fold cross validation.
Furthermore, the reported results also validate our claim that 2–3 s of data suffices for
classification.
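
Five-fold cross-validation of the kind used in this evaluation can be sketched with scikit-learn; the SVM classifier and the synthetic data below are placeholder assumptions.

# Hedged sketch: 5-fold cross-validation of a classifier.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=30, random_state=0)
scores = cross_val_score(SVC(), X, y, cv=5)   # one accuracy score per fold
print("Fold accuracies:", scores.round(3), "mean:", scores.mean().round(3))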

14. Sanei, Saeid, Mansoureh Ghodsi, and Hossein Hassani. "An adaptive singular
spectrum analysis approach to murmur detection from heart sounds." Medical
engineering & physics 33.3 (2011)
Sanei, Saeid, Mansoureh Ghodsi, and Hossein Hassani (2011) [14] suggested an approach
for the separation of murmur from heart sounds. Singular spectrum analysis
(SSA) has been adapted to the changes in the statistical properties of the data and
effectively used for detection of murmur from single-channel heart sound (HS) signals.
Incorporating cleverly selected a priori knowledge within the SSA reconstruction process
results in an accurate separation of normal HS from the murmur segment. Another contribution
of this work is selection of the correct subspace of the desired signal component
automatically. In addition, the subspace size can be identified iteratively. A number of
HS signals with murmur have been processed using the proposed adaptive SSA (ASSA)
technique and the results have been quantified both objectively and subjectively.

15. Gao, Shan, Yineng Zheng, and Xingming Guo. "Gated recurrent unit-based heart
sound analysis for heart failure screening." Biomedical engineering online 19 (2020): 1-
17.
Gao, Shan, Yineng Zheng, and Xingming Guo (2020) [15] proposed a method based
on convolutional neural networks (CNN) and heart sounds (HS) for the early
diagnosis of LVDD. A deep convolutional generative adversarial network
(DCGAN) model-based data augmentation (DA) method was proposed to expand a HS
database of LVDD for model training. Firstly, the pre-processing of HS signals was
performed using the improved wavelet denoising method. Secondly, the logistic
regression based hidden semi-Markov model was utilized to segment HS signals, which
were subsequently converted into spectrograms for DA using the short-time Fourier
transform (STFT). Finally, the proposed method was compared with VGG-16, VGG-19,
ResNet-18, ResNet-50, DenseNet-121, and AlexNet in terms of performance for LVDD
diagnosis. The result shows that the proposed method has a reasonable performance with
an accuracy of 0.987, a sensitivity of 0.986, and a specificity of 0.988, which proves the
effectiveness of HS analysis for the early diagnosis of LVDD and demonstrates that the
DCGAN-based DA method could effectively augment HS data.
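
The spectrogram-conversion step mentioned above (heart sound segments transformed via the short-time Fourier transform) can be sketched with SciPy; the toy signal, the sampling rate, and the window length are assumptions.

# Hedged sketch: STFT spectrogram of a heart sound segment.
import numpy as np
from scipy.signal import stft

fs = 2000                                   # assumed PCG sampling rate (Hz)
t = np.arange(0, 2.0, 1 / fs)
x = np.sin(2 * np.pi * 50 * t)              # toy stand-in for a HS segment

f, tt, Z = stft(x, fs=fs, nperseg=256)      # frequencies, frame times, complex STFT
spec_db = 20 * np.log10(np.abs(Z) + 1e-10)  # log-magnitude in dB
print(spec_db.shape)                        # (n_freq_bins, n_time_frames)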
16. Wang, Hui, et al. "An automatic approach for heart failure typing based on heart
sounds and convolutional recurrent neural networks." Physical and Engineering Sciences
in Medicine 45.2 (2022): 475-485.

Wang, Hui, et al (2022) [16] proposed an automatic approach for HF typing based on
heart sounds (HS) and convolutional recurrent neural networks, which provides a new
non-invasive and convenient way for HF typing. Firstly, the collected HS signals were
pre-processed with adaptive wavelet denoising; then, the logistic regression based hidden
semi-Markov model was utilized to segment HS frames. For the distinction between
normal subjects and the HF patients with preserved ejection fraction or reduced ejection
fraction, a model based on convolutional neural network and recurrent neural network
was built. The model can automatically learn the spatial and temporal characteristics of
HS signals. The results show that the proposed model achieved a superior performance
with an accuracy of 97.64%. This study suggests the proposed method could be a useful
tool for HF recognition and as a supplement for HF typing.
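
A small convolutional-recurrent network in the spirit of the model described above can be sketched in Keras; the input shape, layer sizes, and three-class output (normal, HFpEF, HFrEF) are illustrative assumptions rather than the paper's exact architecture.

# Hedged sketch: a compact CNN + RNN for heart sound classification.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 64)),           # (time frames, features), assumed
    tf.keras.layers.Conv1D(32, 5, activation="relu"), # learns local spectral patterns
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.LSTM(32),                         # learns temporal dynamics
    tf.keras.layers.Dense(3, activation="softmax"),   # normal / HFpEF / HFrEF (assumed)
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()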

17. Beritelli, Francesco, et al. "Automatic heart activity diagnosis based on Gram
polynomials and probabilistic neural networks." Biomedical engineering letters 8 (2018):
77-85.
Beritelli, Francesco, et al (2018) [17] proposed a new approach to heart activity
diagnosis based on Gram polynomials and probabilistic neural networks (PNN). Heart
disease recognition is based on the analysis of phonocardiogram (PCG) digital
sequences. The PNN provides a powerful tool for proper classification of the input data
set. The novelty of the proposed approach lies in a powerful feature extraction based on
Gram polynomials and the Fourier transform. The proposed system presents good
performance obtaining overall sensitivity of 93%, specificity of 91% and accuracy of
94%, using a public database of over 3000 heart beat sound recordings, classified as
normal and abnormal heart sounds. Thus, it can be concluded that Gram polynomials and
PNN prove to be a very efficient technique using the PCG signal for characterizing heart
diseases.

18. Zheng, Yineng, et al. "A multi-scale and multi-domain heart sound feature-based
machine learning model for ACC/AHA heart failure stage classification." Physiological
Measurement 43.6 (2022): 065002.
Zheng, Yineng, et al (2022) [18] studied dataset containing phonocardiogram (PCG)
signals from 275 subjects was obtained from two medical institutions and used in this
study. Complementary ensemble empirical mode decomposition and tunable-Q wavelet
transform were used to construct self-adaptive sub-sequences and multi-level sub-band

signals for PCG signals. Time-domain, frequency-domain and nonlinear feature
extraction were then applied to the original PCG signal, heart sound sub-sequences and
sub-band signals to construct multi-scale and multi-domain heart sound features. The
features selected via the least absolute shrinkage and selection operator were fed into a
machine learning classifier for ACC/AHA HF stage classification. Finally, mainstream
machine learning classifiers, including least-squares support vector machine (LS-SVM),
deep belief network (DBN) and random forest (RF), were compared to determine the
optimal model.
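
Feature selection via the least absolute shrinkage and selection operator (LASSO), as used above, can be sketched with scikit-learn; the synthetic data and the alpha value are illustrative assumptions.

# Hedged sketch: LASSO-based selection of heart sound features.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = make_classification(n_samples=275, n_features=60, n_informative=10, random_state=0)
selector = SelectFromModel(Lasso(alpha=0.01))   # features with non-zero coefficients survive
X_sel = selector.fit_transform(X, y)
print("Selected", X_sel.shape[1], "of", X.shape[1], "features")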
19. Liu, Yongmin, Xingming Guo, and Yineng Zheng. "An automatic approach using
ELM classifier for HFpEF identification based on heart sound characteristics." Journal of
medical systems 43 (2019): 1-8.
Liu, Yongmin, Xingming Guo, and Yineng Zheng (2019) [19] provided a non-invasive
method using an extreme learning machine and heart sound (HS) characteristics for the
purpose of assisting HFpEF diagnosis. Firstly, the improved wavelet denoising method
was used for signal preprocessing. Then, the logistic regression based hidden semi-Markov
model algorithm was utilized to locate the boundaries of the first HS and the second HS,
so that the ratio of diastolic to systolic duration could be calculated. Eleven features were
extracted based on multifractal detrended fluctuation analysis to analyze the differences
in multifractal behavior of HS between healthy people and HFpEF patients. Afterwards,
statistical analysis was implemented on the extracted HS characteristics to generate the
diagnostic feature set. Finally, the extreme learning machine was applied for HFpEF
identification and compared in performance with a support vector machine. The result
shows an accuracy of 96.32%, a sensitivity of 95.48% and a specificity of 97.10%, which
demonstrates the effectiveness of HS for HFpEF diagnosis.
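
An extreme learning machine of the kind applied here reduces to a random, untrained hidden layer followed by a least-squares readout; the sketch below is a toy NumPy illustration with assumed sizes and labels, not the paper's implementation.

# Hedged sketch: extreme learning machine (ELM) in NumPy.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 11))     # e.g. 11 HS features per subject (assumed)
y = rng.integers(0, 2, size=100)   # toy labels (1 = HFpEF, 0 = healthy)

W = rng.normal(size=(11, 50))      # random hidden weights, never trained
b = rng.normal(size=50)
H = np.tanh(X @ W + b)             # hidden-layer activations
beta = np.linalg.pinv(H) @ y       # output weights via least squares
pred = (H @ beta >= 0.5).astype(int)
print("Training accuracy:", (pred == y).mean())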
20. Yang, Siyu, et al. "Fast Abnormal Heart Sound Detection Method Based on Multi-
struct Neural Network." 2022 7th International Conference on Intelligent Computing and
Signal Processing (ICSP). IEEE, 2022.
Yang, Siyu, et al. (2022) [20] collected a training dataset containing five categories of
heart sounds, including normal, mitral stenosis, mitral regurgitation, and
aortic stenosis heart sounds. A convolutional neural network (GoogLeNet) and weighted
KNN are used to train the models separately. For the model trained by the convolutional
neural network, time series heart sound signals are converted into time-frequency
scalograms based on continuous wavelet transform to adapt to the architecture of
GoogLeNet. For the model trained by weighted KNN, features from the time domain and time-

frequency domain are extracted manually. Then feature selection based on the chi-square
test is performed to get a better group of features. Moreover, we designed software that
lets doctors upload heart sounds, visualize the heart sound waveform, and use the model
to get the diagnosis. Model assessments using accuracy, sensitivity, specificity, and F1
score indicators are done on two trained models. This can assist in the diagnosis of heart
valve diseases, especially in remote areas, which lack skilled doctors.

21. Yang, Yang, et al. "Deep learning-based heart sound analysis for left ventricular
diastolic dysfunction diagnosis." Diagnostics 11.12 (2021): 2349.
Yang, Yang, et al. (2021) [21] presented a non-invasive method for early diagnosis of left
ventricular diastolic dysfunction (LVDD) using heart sounds (HS) and convolutional
neural networks (CNNs). LVDD is a clinical syndrome characterized by inadequate
active relaxation and decreased cardiac output. It can lead to ventricular remodelling,
wall stiffness, reduced compliance, and progression to heart failure with a preserved
ejection fraction. The proposed method achieves a high accuracy of 0.987, sensitivity of
0.986, and specificity of 0.988 for LVDD diagnosis.
22. Zeinali, Yasser, and Seyed Taghi Akhavan Niaki. "Heart sound classification
using signal processing and machine learning algorithms." Machine Learning with
Applications 7 (2022): 100206.
Zeinali, Yasser, and Seyed Taghi Akhavan Niaki (2022) [22] aimed at
using artificial intelligence algorithms to diagnose heart failure by classifying heart
sounds. According to studies conducted at several large hospitals, heart specialists utilize
different medical tests to diagnose heart disease accurately. However, heart sound
diagnosis using a stethoscope is very difficult due to the hospitals’ noisy environment
(Leng et al., 2015). As such, physicians do not hear the heart sound very well and need
some relevant tests to analyze patients’ conditions.
Each of the two sides of a human heart has two chambers called the ventricles and the
atrium, connected by some valves. The heart cycle refers to all the heart events from the
beginning of one beat to the beginning of the next beat. This cycle is divided into two
parts, which have two modes of contraction and rest called systole and diastole.

23. Maglogiannis, Ilias, et al. "Support vectors machine-based identification of heart


valve diseases using heart sounds." Computer methods and programs in biomedicine
95.1 (2009): 47-61.

Maglogiannis, Ilias, et al. (2009) [23] developed a method for automated identification
of heart valve diseases using heart sounds as input data. The
authors utilize support vector machines (SVM), a supervised learning algorithm, for the
classification task. SVM is known for its effectiveness in handling high-dimensional data
and is commonly used in classification problems. The study likely involves the collection
of heart sounds from individuals diagnosed with various heart valve diseases. These
heart sounds serve as the input features for the SVM model. Before training the SVM
model, relevant features are extracted from the heart sounds. Feature extraction is a
crucial step in pattern recognition tasks, enabling the algorithm to effectively distinguish
between different classes. In this case, features extracted from the heart sounds likely
include frequency components, timing characteristics, and other relevant attributes.

24. Jayakrishnan, Athulya, R. Visakh, and T. K. Ratheesh. "Computational approach


for heart disease prediction using machine learning." 2021 International Conference on
Communication, Control and Information Sciences (ICCISc). Vol. 1. IEEE, 2021.

Jayakrishnan, Athulya, R. Visakh, and T. K. Ratheesh (2021) [24] discussed the
application of machine learning techniques to predict heart disease, which is a critical
area of research in healthcare, illustrating how data-driven approaches can contribute to
medical diagnosis and treatment.

25. Zeng, Wei. "A new approach for the detection of abnormal heart sound signals using
TQWT, VMD and neural networks." Artificial Intelligence Review 54.3 (2021): 1613-
1647.
Zeng, Wei (2021) [25] noted that the phonocardiogram (PCG) plays an important role in evaluating many cardiac
abnormalities, such as the valvular heart disease, congestive heart failure and anatomical
defects of the heart. However, effective cardiac auscultation requires trained physicians
whose work is tough, laborious and subjective. The objective of this study is to develop
an automatic classification method for anomaly (normal vs. abnormal) detection of PCG
recordings without any segmentation of heart sound signals. Hybrid signal processing
and artificial intelligence tools, including tunable Q-factor wavelet transform (TQWT),
variational mode decomposition (VMD), phase space reconstruction (PSR) and neural

networks, are utilized to extract representative features in order to model, identify and
detect abnormal patterns in the dynamics of PCG system caused by heart disease. First,
heart sound signal is decomposed into a set of frequency subbands with a number of
decomposition levels by using the TQWT method.

26. Zheng, Yineng. "Computer-assisted diagnosis for chronic heart failure by the analysis
of cardiac reserve and heart sound characteristics." (2015).
Zheng, Yineng (2015) [26] proposed a method based on cardiac reserve (CR) indexes
extraction, heart sound hybrid characteristics extraction and intelligent diagnosis model
definition. Firstly, the modified wavelet packet-based denoising method was applied to
data pre-processing. Then, the CR indexes such as the ratio of diastolic to systolic
duration (D/S) and the amplitude ratio of the first to second heart sound (S1/S2) were
extracted. The feature set consisting of the heart sound characteristics such as
multifractal spectrum parameters, the frequency corresponding to the maximum peak of
the normalized PSD curve (fPSDmax) and adaptive sub-band energy fraction (sub _ EF)
were calculated based on multifractal detrended fluctuation analysis (MF-DFA),
maximum entropy spectra estimation (MESE) and empirical mode decomposition
(EMD). Statistical methods such as t-test and receiver operating characteristic (ROC)
curve analysis were performed to analyze the difference of each parameter between the
healthy and CHF patients.

27. Brites, Ivo Sérgio Guimarães. "Machine learning and iot applied to cardiovascular
diseases identification through heart sounds: A literature review." Informatics. Vol. 8.
No. 4. Multidisciplinary Digital Publishing Institute, 2021.
Brites, Ivo Sérgio Guimarães (2021) [27] performed a survey that aims to present and
analyze the advances of the latest studies based on medical care and assisted
environments. They focus on articles for online monitoring, detection, and support of the
diagnosis of cardiovascular diseases. Their research covers published manuscripts in
scientific journals and recognized conferences since the year 2015. Also, they present a
reference model based on the evaluation of the resources used from the selected studies.
Finally, their proposal aims to help future enthusiasts to discover and enumerate the
required factors for the development of a prototype for online heart monitoring purposes.

28. Li, Haixia. "Detection and classification of abnormities of first heart sound using

empirical wavelet transform." IEEE Access 7 (2019): 139643-139652.
Li, Haixia (2019) [28] used features of diastolic murmurs to identify coronary

artery disease. It is expected that an automatic detection and classification algorithm for
the abnormities of first heart sound (S1) can realize computer artificial intelligence
diagnosis of some relative cardiovascular disease. Few studies have focused on the
detection and classification of the abnormities of S1 and given out in detail the essential
differences between abnormal and normal S1. This work applied Empirical Wavelet
Transform (EWT) to decompose S1 and extracted the instantaneous frequency (IF) of
mitral component (M1) and tricuspid component (T1) by using Hilbert Transform.
Firstly, the heart sound signal is preprocessed following these processes: filtering,
resampling, normalization and segmentation. Secondly, S1 is decomposed into several
modes based on EWT. First two maximal points with a distance greater than 20Hz in
Fourier Spectrum of S1 are selected and the nearest minimal points on both sides of the
maximal points are found out as the boundaries for segmentation of the spectrum. S1 is
decomposed into 5 modes and each mode's IF is calculated through Hilbert
transformation. At last, a k-mean cluster algorithm is applied to cluster the IF of different
modes.
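
Extracting the instantaneous frequency of a decomposed mode via the Hilbert transform, as described above, can be sketched with SciPy; the chirp test signal is a toy assumption standing in for an M1 or T1 component.

# Hedged sketch: instantaneous frequency (IF) via the Hilbert transform.
import numpy as np
from scipy.signal import hilbert

fs = 2000
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * (30 * t + 20 * t**2))   # toy chirp, IF = 30 + 40*t Hz

analytic = hilbert(x)                          # analytic signal x + j*H{x}
phase = np.unwrap(np.angle(analytic))          # instantaneous phase
inst_freq = np.diff(phase) * fs / (2 * np.pi)  # IF in Hz
print(inst_freq[:5].round(1))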
29. Giordano, Noemi, and Marco Knaflitz. "A novel method for measuring the timing
of heart sound components through digital phonocardiography." Sensors 19.8 (2019):
1868.
Giordano, Noemi, and Marco Knaflitz (2019) [29] noted that the auscultation of heart
sounds has been for decades a fundamental diagnostic tool in clinical practice. Higher
effectiveness can be achieved by recording the corresponding biomedical signal, namely
the phonocardiographic signal, and processing it by means of traditional signal
processing techniques. An unavoidable processing step is heart sound segmentation,
which is still a challenging task from a technical viewpoint; a limitation of state-of-the-art
approaches is the unavailability of trustworthy techniques for the detection of heart
sound components. They designed a reliable algorithm for the identification and
classification of heart sounds' main components. The proposed methodology was tested
on a sample population of 24 healthy subjects over 10-min-long simultaneous
electrocardiographic and phonocardiography recordings and it was found capable of
correctly detecting and classifying an average of 99.2% of the heart sounds along with
their components.

30. Jariwala, Nancy, et al. "Clinically undetectable heart sounds in hospitalized patients
undergoing echocardiography." JAMA Internal Medicine 182.1 (2022): 86-87
Jariwala, Nancy, et al. (2022) [30] focused on the role of clinical training in auscultation
proficiency. It remains unclear why even experienced practitioners miss heart sounds; it
is possible that, in addition to auscultatory training, the quality of heart sounds also plays a role. Clinically
undetectable or distant heart sounds have been anecdotally reported to occur in some
settings. They sought to quantify the extent to which undetectable heart sounds occur in
hospitalized patients who are undergoing echocardiography and evaluate the association
between patient factors and missed VHD diagnoses.

2.1 SUMMARY OF SURVEY

S.No | Author | Title | Method | Advantages / Limitations

1 | Gjoreski, Martin | Machine learning and end-to-end deep learning for the detection of chronic heart failure from heart sounds | Convolutional and recurrent neural networks, support vector machines, Gaussian mixture models, ensemble learning | Sequential information processing: RNNs are designed to process sequential data such as time series or natural language and can capture the temporal dependencies between the elements in the sequence.

2 | Gahane, Aroh, and Chinnaiah Kotadi | An analytical review of heart failure detection based on machine learning | Logistic regression, random forest, artificial neural networks | Non-invasive, fast, objective, high accuracy; limited data, algorithm bias, human expert involvement.

3 | Shuvo, Samiul Based | Lightweight deep learning framework for cardiovascular disease classification using heart sound recordings | TensorFlow Lite, PyTorch Mobile, ONNX Runtime, Keras | Lower memory requirements: lightweight frameworks typically have smaller memory footprints, making them ideal for use in resource-constrained environments.

4 | Li, Suyi | A review of computer-aided heart sound detection techniques | Signal processing techniques, machine learning techniques | Improved accuracy, fast, data quality, non-invasive.

5 | Miotto, Riccardo | Deep learning for healthcare: review, opportunities and challenges | Improved diagnostic accuracy, data quality and quantity | Improved accuracy, efficient, cost-effective, non-invasive; data quality, limited generalization.

6 | Allugunti, Viswanatha Reddy | Heart disease diagnosis and prediction based on hybrid machine learning model | Decision-tree-based methods, deep-learning-based methods, Bayesian-network-based methods | Improved accuracy, robustness, efficient, adaptability; data quality, algorithm bias, human expert involvement.

7 | Zubair, Muhammad | A peak detection algorithm for localization and classification of heart sounds in PCG signals | Wavelet transform, Hilbert transform, time-frequency analysis, machine learning | Accurate localization, low computational requirements, can be combined with other technologies; limited classification ability, sensitivity to noise, data quality.

8 | Valera, HH Alvarez, and M. Luštrek | Machine learning models for detection of decompensation in chronic heart failure using heart sounds | Convolutional neural networks, support vector machines, random forests, hidden Markov model | Early detection, efficient use of resources.

9 | Ravi, Rohit, and P. Madhavan | Prediction of cardiovascular disease using machine learning algorithms | Logistic regression, decision trees, random forests, support vector machines, neural networks | Improved accuracy, early detection, personalized treatment; limited generalization, algorithm bias.

10 | Susic, D., Gregor Poglajen, and Anton Gradišek | Machine learning models for detection of decompensation in chronic heart failure using heart sounds | Convolutional neural networks, support vector machines, random forests, hidden Markov model | Non-invasive, accurate detection, low computational requirements; sensitivity to noise, limited sample size.

11 | Sreejith, S., S. Rahul, and R. C. Jisha | A real-time patient monitoring system for heart disease prediction using random forest algorithm | Data collection, data processing, feature selection, model training, real-time monitoring, alert system | Real-time monitoring, cost-effective, machine learning capabilities.

12 | Gjoreski, Martin | Chronic heart failure detection from heart sounds using a stack of machine-learning classifiers | Pre-processing, feature extraction, feature selection, classifier design, training and testing, model optimization | Improved accuracy, better robustness, scalability; increased complexity, difficult to train, computationally intensive.

13 | Ismail, Shahid | PCG classification through spectrogram using transfer learning | Pre-processing, spectrogram generation, transfer learning, fine-tuning, classification, evaluation | Improved accuracy, reduced training time, ability to handle small databases, versatility; limited flexibility, dependent on spectrogram quality.

14 | Sanei, Saeid, Mansoureh Ghodsi, and Hossein Hassani | An adaptive singular spectrum analysis approach to murmur detection from heart sounds | Pre-processing, adaptive SSA, feature extraction, classification, evaluation | High accuracy, flexibility, data-driven; complexity, limited interpretability.

15 | Gao, Shan, Yineng Zheng, and Xingming Guo | Gated recurrent unit-based heart sound analysis for heart failure screening | Data collection, pre-processing, feature extraction, GRU network design, training, testing | Real-time analysis, automated, generalization; hardware and software requirements, dependence on training data.

16 | Wang, Hui | An automatic approach for heart failure typing based on heart sounds and convolutional recurrent neural networks | Data collection, pre-processing, feature extraction, validation | Fast, accurate, non-invasive; limited data, algorithm bias, lack of transparency.

17 | Beritelli, Francesco | Automatic heart activity diagnosis based on Gram polynomials and probabilistic neural networks | Signal processing, feature extraction, classification, validation | High accuracy, non-invasive, fast; limited generalization and interpretability.

18 | Zheng, Yineng | A multi-scale and multi-domain heart sound feature-based machine learning model for ACC/AHA heart failure stage classification | Multi-scale and multi-domain feature extraction, machine learning models | High accuracy, non-invasive, objective, fast; limited data, lack of transparency.

19 | Liu, Yongmin, Xingming Guo and Yineng Zheng | An automatic approach using ELM classifier for HFpEF identification based on heart sound characteristics | Heart sound signal acquisition, feature extraction, feature selection, ELM classification | Non-invasive, fast, objective, high accuracy; limited data, algorithm bias, limited generalization.

20 | Yang, Siyu | Fast abnormal heart sound detection method based on multi-struct neural network | Heart sound signal acquisition, feature extraction, multi-struct neural network, abnormal heart sound detection | Non-invasive, fast, high accuracy; limited data, limited interpretability, algorithm bias.

21 | Yang, Yang | Deep learning-based heart sound analysis for left ventricular diastolic dysfunction diagnosis | Convolutional neural networks, recurrent neural networks, hybrid approaches, transfer learning, ensemble methods | High accuracy, automated, real-time analysis; generalization, dependence on training data, limited interpretability.

22 | Zeinali, Yasser, and Seyed Taghi Akhavan Niaki | Heart sound classification using signal processing and machine learning algorithms | Time-frequency analysis, Mel-frequency cepstral coefficients, principal component analysis, artificial neural networks | Accuracy, scalability, automation; data quality, complexity.

23 | Maglogiannis, Ilias | Support vector machine-based identification of heart valve diseases using heart sounds | Feature extraction, feature selection, training of SVM model, validation, model optimization | Accuracy, robustness, interpretability, scalability; data quality, feature selection.

24 | Jayakrishnan | Computational approach for heart disease prediction using machine learning | Data pre-processing, feature selection | Accuracy, early detection, personalized, efficiency,
Athulya, heart disease machine learning complexity, hardware and
R.Visakh, prediction using models, performance software requirements.
and machine evaluation.
T.K.Rath learning.
eesh
25 Zeng, A new approach Pre-processing, Accurate detection,
Wei for the detection feature extraction, robustness, interpretability,
of abnormal heart feature selection, Scalability, hardware and
sound signals classification, software requirements,
using evaluation. complexity
TQWT,VMD
and neural
networks.
26 Zheng, Computer - Cardiac reserve Early diagnosis, non-
Yineng assisted analysis, heart sound invasive, objective and
diagnosis for recording and pre- reproducible, personalized
chronic heart processing, feature treatment, complexity, false
failure by the extraction, feature positives and false negatives.
analysis of their selection,
cardiac reserve classification,
and heart sound evaluation.
characteristics.
27 Brites, Machine learning Data acquisition, Early detection, non-
Ivo and iot applied to signal processing, invasive, real-time
Sergio cardiovascular feature extraction, monitoring, accessibility,
Guimarae identification feature selection, data privacy and security.
s through heart classification,
sounds. evaluation.
28 Li, Haixia Detection and Data acquisition, pre- Accurate detection, noise
classification of processing, feature reduction, fast processing,
abnormities of extraction, validation. customization, complexity,
first heart sound training date.
using empirical
wavelet
transform.
29 Giordano, A novel method Signal acquisition, Non-invasive, accurate
Noemi,an for measuring the signal, pre-processing, timing, objective,

22
d Marco timing of heart timing measurement, reproducibility, equipment
Knaflitz sound analysis and requirements, learning curve,
components interpretation. data storage and analysis.
through digital
phonocardiogray.
30 Jariwala, Clinically Doppler Identification of silent heart
Nancy undetectable echocardiography, disease, early intervention,
heart sounds in phonocardiography, objective measurement,
hospitalized ECG-gated multislice equipment measurements,
patients computed cost, training and expertise.
undergoing tomography.
echocardiograpy.

Table 2.1: Summary of survey
2.2 Problem Statement


Chronic heart failure (CHF) is a condition in which the heart is unable to pump enough blood to the entire body. This results in symptoms such as shortness of breath, fatigue, and swelling in the legs and ankles, and if left untreated it can lead to the loss of the patient, which affects their relatives and dear ones. This problem can be addressed by studying a person's heart sounds, i.e., the phonocardiogram (PCG). The PCG is a tool for identifying chronic heart failure: it can indicate the weakened heart or stiffened heart muscles that are common in CHF. By analyzing a person's heart sounds from the PCG with machine learning and deep learning models implemented in Python, the accuracy and reliability of CHF detection can be increased. Ultimately, through earlier detection, this project can help save patients' lives.
CHAPTER 3
EXISTING SYSTEM

3.1 Decision Tree


DTC is a popular machine learning algorithm that belongs to the supervised learning technique. It can be used for both Classification and Regression problems in ML. It is based on the concept of ensemble learning, which is a process of combining multiple classifiers to solve a complex problem and to improve the performance of the model. As the name suggests, "DTC is a classifier that contains a number of decision trees on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset." Instead of relying on one decision tree, the DTC takes the prediction from each tree and, based on the majority vote of those predictions, predicts the final output. A greater number of trees in the forest leads to higher accuracy and prevents the problem of overfitting.
3.1.1 DTC algorithm
Step 1: In DTC n number of random records are taken from the data set having k number
of records.
Step 2: Individual decision trees are constructed for each sample.
Step 3: Each decision tree will generate an output.
Step 4: Final output is considered based on Majority Voting or Averaging for
Classification and regression respectively.
3.1.2 Important Features of DTC
 Diversity- Not all attributes/variables/features are considered while making an
individual tree, each tree is different.

 Immune to the curse of dimensionality- Since each tree does not consider all
the features, the feature space is reduced.

 Parallelization-Each tree is created independently out of different data and attributes. This means that we can make full use of the CPU to build DTCs.

 Train-Test split- In a DTC we don't have to segregate the data for train and test, as there will always be around 30% of the data that is not seen by a given decision tree.
 Stability- Stability arises because the result is based on majority voting/
averaging.

3.1.3 Assumptions for DTC


Since the DTC combines multiple trees to predict the class of the dataset, it is possible
that some decision trees may predict the correct output, while others may not. But
together, all the trees predict the correct output. Therefore, below are two assumptions
for a better DTC classifier:
 There should be some actual values in the feature variable of the dataset so that
the classifier can predict accurate results rather than a guessed result.

 The predictions from each tree must have very low correlations.

Below are some points that explain why we should use the DTC algorithm
 It takes less training time as compared to other algorithms.

 It predicts output with high accuracy, even for the large dataset it runs efficiently.

 It can also maintain accuracy when a large proportion of data is missing.

3.2 Drawbacks of Decision Tree Classifier


Decision Tree Classifiers are a popular machine learning algorithm known for their
simplicity, interpretability, and versatility. However, they are not without their
drawbacks. Here are some common drawbacks associated with Decision Tree
Classifiers:
 Overfitting: Decision Trees are prone to overfitting, especially when the tree
becomes too deep and complex. Overfitting occurs when the tree captures noise
in the training data rather than the underlying patterns. This can lead to poor
generalization to new, unseen data.

 Instability: Decision Trees are sensitive to small variations in the training data. A slight change in the data can result in a significantly different tree structure. This instability can make Decision Trees less reliable compared to ensemble algorithms like XGBoost or Gradient Boosting.

 High Variance: Decision Trees have high variance, meaning that small changes
in the training data can result in different tree structures. High variance can lead
to inconsistent predictions and reduced model reliability.
 Bias Towards Dominant Classes: Decision Trees tend to favor classes that are
more frequent in the training data. In imbalanced datasets, where one class
significantly outnumbers the others, Decision Trees can have a bias towards the
majority class and perform poorly on minority classes.

 Lack of Predictive Power: Decision Trees may struggle with capturing complex
relationships in the data, especially when the relationships are nonlinear. They
might not perform well on datasets with intricate decision boundaries.

 Not Suitable for Numerical Predictors: Decision Trees are primarily designed for
categorical or binary predictors. While they can handle numerical predictors, they
may not perform as effectively as algorithms designed specifically for numerical
data, such as linear regression.

 Difficulty Handling Missing Data: Traditional Decision Trees can struggle with
missing data. If a predictor has missing values, the algorithm may discard the
entire data point or introduce bias in the model.

 Limited Expressiveness: Decision Trees represent decision boundaries using axis-aligned splits. This means they might not perform well on problems where decision boundaries are diagonal or nonlinear, requiring more complex models.

 Not Optimized for Unstructured Data: Decision Trees are not well-suited for
unstructured data types like text, audio, or images. They are primarily used for
structured data.

 Greedy Nature: Decision Trees use a greedy approach to split the data at each
node. They select the most informative feature at each step without considering
future splits. This can lead to suboptimal trees.

 Prone to Outliers: Outliers in the data can disproportionately influence the decisions made by Decision Trees, potentially leading to less robust models.
CHAPTER 4
PROPOSED METHOD

Chronic heart failure (CHF) is a serious condition that requires early detection and
treatment to prevent its progression. One way to detect CHF is by analyzing the
phonocardiogram (PCG) sounds using machine learning techniques. Here's a step-by-
step approach to detecting CHF from PCG sounds. Figure 4.1 shows the proposed block
diagram.

Fig. 4.1: Proposed block diagram.


The method consists of the following steps.
Step 1: Dataset Preprocessing:
Collect a dataset of PCG recordings from patients with and without CHF.

Convert the audio files to a common format, such as WAV.

Segment the audio recordings into individual heartbeats using an automatic or manual segmentation method.
Remove any artifacts or background noise from the recordings.
Label the heartbeats as either normal or abnormal (i.e., associated with CHF).

Step 2: Feature Extraction:


 Extract Mel-frequency cepstral coefficients (MFCCs) from each heartbeat
using a Fourier transform-based technique.
 Use a sliding window approach to segment the MFCCs into frames of
equal length.

 Apply a temporal averaging technique to reduce the dimensionality of the feature set.

Step 3: Feature Analysis:


 Apply a convolutional neural network (CNN) to the MFCC frames to
learn a set of discriminative features.

 Train the CNN on the labeled dataset, using a cross-validation approach to avoid overfitting.

 Extract the learned features from the last fully connected layer of the
CNN.

Step 4: Classification:
 Use the extracted features as input to a random forest classifier to predict
whether a heartbeat is normal or abnormal.

 Train the random forest classifier on the labeled dataset, using a cross-
validation approach to optimize hyperparameters and avoid overfitting.

 Evaluate the performance of the classifier on a hold-out test set, using metrics such as accuracy, precision, recall, and F1 score.
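
As a rough sketch of this final classification and evaluation step, assuming X_train, X_test, y_train and y_test already hold the CNN-derived feature vectors and their labels (these names are illustrative, not part of the report's code):

# Sketch: classifying CNN-derived feature vectors with a random forest
# and reporting the metrics listed above.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

rf = RandomForestClassifier(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)                 # features from the CNN's last dense layer
y_pred = rf.predict(X_test)

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("F1 score :", f1_score(y_test, y_pred))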

By following this approach, it is possible to develop an accurate and reliable system for detecting CHF from PCG sounds. However, it is important to note that the performance of the system may be limited by factors such as the quality and quantity of the dataset, the choice of feature extraction and classification techniques, and the generalization of the model to new patients and recording conditions. Therefore, it is important to carefully design and evaluate the system using appropriate validation methods.
4.1 Preprocessing
Preprocessing the PCG dataset is an essential step in developing a reliable and accurate
system for detecting CHF from PCG sounds. Here are some common preprocessing
steps that can be performed:
1. Data collection:

 Collect a dataset of PCG recordings from patients with and without CHF.
 Ensure that the dataset covers a range of ages, genders, and ethnicities to
ensure the generalizability of the model.

 Verify that the recordings are of good quality, with minimal background noise
and no recording artifacts.

2. Data format conversion:

 Convert the PCG recordings to a common format, such as WAV or MP3.

 Ensure that the recordings are of a consistent sampling rate and bit depth.

3. Dataset splitting:

 Split the dataset into training, validation, and test sets.

 Ensure that each set contains a balanced number of normal and abnormal
heartbeats.

By performing these preprocessing steps, the PCG dataset will be ready for feature
extraction and classification, which can be performed using various machine learning
techniques. It is important to note that the preprocessing steps may vary depending on
the specific dataset and research question, and that careful consideration and evaluation
should be performed at each step.
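
A minimal preprocessing sketch along these lines, assuming the recordings are organized into hypothetical data/normal and data/abnormal folders and that librosa and scikit-learn are available:

# Sketch: load PCG files at a common sampling rate, normalize, and split.
import glob
import librosa
import numpy as np
from sklearn.model_selection import train_test_split

def load_recordings(pattern, label, sr=2000):
    """Load PCG files at a consistent sampling rate and attach a class label."""
    items = []
    for path in glob.glob(pattern):
        signal, _ = librosa.load(path, sr=sr)              # resamples every file to sr
        signal = signal / (np.max(np.abs(signal)) + 1e-8)  # amplitude normalization
        items.append((signal, label))
    return items

data = load_recordings("data/normal/*.wav", 0) + load_recordings("data/abnormal/*.wav", 1)
signals = [s for s, _ in data]
labels = [l for _, l in data]

# A stratified split keeps the normal/abnormal balance in every subset.
X_train, X_test, y_train, y_test = train_test_split(
    signals, labels, test_size=0.2, stratify=labels, random_state=42)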
4.2 MFCC feature extraction
Pre-emphasis is the initial stage of extraction. It is the process of boosting the energy at high frequencies. It is done because the spectrum of voiced segments has more energy at lower frequencies than at higher frequencies; this is called spectral tilt, and it is caused by the nature of the glottal pulse. Boosting the high-frequency energy gives more information to the acoustic model, which improves recognition performance. MFCCs are extracted by the following method.
1) The given speech signal is divided into frames (~20 ms). The length of time
between successive frames is typically 5-10ms.

2) A Hamming window is multiplied with each of the above frames to maintain the continuity of the signal; applying the Hamming window avoids the Gibbs phenomenon. The window preserves continuity at the start and end points of each frame, avoids abrupt changes at the endpoints, and gathers the closest frequency components together.

3) Mel spectrum is obtained by applying Mel-scale filter bank on DFT power


spectrum. Mel-filter concentrates more on the significant part of the spectrum to
get data values. Mel-filter bank is a series of triangular band pass filters similar to
the human auditory system. The filter bank consists of overlapping filters. Each
filter output is the sum of the energy of certain frequency bands. Higher
sensitivity of the human ear to lower frequencies is modeled with this procedure.
The energy within the frame is also an important feature to be obtained. Compute
the logarithm of the square magnitude of the output of Mel-filter bank. Human
response to signal level is logarithm. Humans are less sensitive to small changes
in energy at high energy than small changes at low energy. Logarithm
compresses dynamic range of values.

4) Mel-scaling and smoothing. The mel scale is approximately linear below 1 kHz and logarithmic above 1 kHz.

5) Compute the logarithm of the square magnitude of the output of Mel filter bank.

6) DCT is the further stage in MFCC extraction; it converts the frequency-domain signal into the time domain and minimizes the redundancy in the data, which may neglect the smaller temporal variations in the signal. The mel cepstrum is obtained by applying the DCT to the logarithm of the mel spectrum. The DCT is used to reduce the number of feature dimensions and the spectral correlation between filter-bank coefficients. Low-dimensional, uncorrelated features are desirable for any statistical classifier. The cepstral coefficients do not capture the energy, so it is necessary to add an energy feature. Thus twelve (12) mel-frequency cepstral coefficients plus one (1) energy coefficient are extracted; these thirteen (13) features are generally known as the base features.

7) Obtain MFCC features.

The frequency axis is transformed to the mel scale, from which the cepstral coefficients and hence the MFCCs are obtained, using the equation

mel(f) = 2595 × log10(1 + f / 700)    (13)

where f denotes the frequency in Hz. The MFCC
features are estimated by using the following equation.
Cn = Σ (k = 1 … K) [ log(Sk) · cos( n · (k − 1/2) · π / K ) ],  where n = 1, 2, …, K    (14)

Here, K represents the number of mel cepstral coefficients and Sk is the output of the k-th mel filter. C0 is left out of the DCT because it represents the mean value of the input signal, which contains no significant speech-related information. For each overlapping frame (approx. 20 ms) of the signal, an acoustic vector consisting of MFCCs is computed. This set of coefficients represents and recognizes the characteristics of the input.

[Block diagram: input signal → pre-emphasis → frame blocking → windowing → mel-scale filter bank → logarithmic compression → DCT (discrete cosine transform) → MFCC features]

Fig. 4.2: MFCC operation diagram
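
A compact way to realize this pipeline in practice is librosa's MFCC routine. The sketch below assumes a heart sound segment stored in a hypothetical heartbeat.wav file and a 2000 Hz sampling rate (actual PCG sampling rates vary):

# Sketch: 13 MFCCs per frame from a heart sound segment with librosa.
import librosa
import numpy as np

y, sr = librosa.load("heartbeat.wav", sr=2000)   # hypothetical input file
y = np.append(y[0], y[1:] - 0.97 * y[:-1])       # pre-emphasis filter (step 1)

mfcc = librosa.feature.mfcc(
    y=y, sr=sr,
    n_mfcc=13,                     # 12 cepstral coefficients plus an energy-like C0
    n_fft=int(0.020 * sr),         # ~20 ms analysis window
    hop_length=int(0.010 * sr),    # ~10 ms frame shift
    window="hamming")              # Hamming window (step 2)

features = mfcc.mean(axis=1)       # temporal averaging across frames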

4.3 CNN model

Fig. 4.3: Proposed deep CNN model for feature extraction

4.3.1 Convolution layer:


Training and testing of the CNN involves passing every source PCG representation through a succession of convolution layers with kernels (filters), rectified linear units (ReLU), max pooling and fully connected layers, and finally utilizing a SoftMax layer with a classification layer to categorize the inputs with probabilistic values ranging over [0, 1]. The convolution layer, depicted in Fig. 4.3.1.1, is the primary layer used to extract features from a source PCG; it maintains the relationship between neighbouring values by learning features over tiny blocks of the source data. Convolution is a mathematical function that considers two inputs: a source PCG I(x, y, d), where x and y denote the spatial coordinates (the number of rows and columns) and d denotes the depth of the representation (here d = 3), and a filter or kernel with the same depth as the input, denoted F(kx, ky, d).

Fig. 4.3.1.1: Representation of convolution layer process.

The output obtained from the convolution of the input PCG with the filter has size C((x − kx + 1), (y − ky + 1), 1), which is referred to as the feature map. An example of the convolution procedure is demonstrated in Fig. 4.3.1.2. Assume an input of size 5 × 5 and a filter of size 3 × 3; the feature map is obtained by sliding the filter over the input and multiplying and accumulating the overlapping values, as shown in Fig. 4.3.1.2(b).

(a)

(b)
Fig. 4.3.1.2: Example of convolution layer process (a) a PCG representation of size 5 × 5 is convolved with a 3 × 3 kernel (b) Convolved feature map.
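
The feature-map size formula above can be checked with a small NumPy sketch of a "valid" convolution (purely illustrative, not the project's implementation):

# Sketch: a "valid" 2-D convolution producing a (x - kx + 1) x (y - ky + 1) map.
import numpy as np

def conv2d_valid(image, kernel):
    x, y = image.shape
    kx, ky = kernel.shape
    out = np.zeros((x - kx + 1, y - ky + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # multiply the overlapping block by the kernel and accumulate
            out[i, j] = np.sum(image[i:i + kx, j:j + ky] * kernel)
    return out

image = np.arange(25).reshape(5, 5)       # 5 x 5 input, as in the example
kernel = np.ones((3, 3))                  # 3 x 3 filter
print(conv2d_valid(image, kernel).shape)  # -> (3, 3) feature map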
4.3.2 ReLU layer
Networks that use the rectifier operation for the hidden layers are cited as rectified linear unit (ReLU) networks. The ReLU function 𝒢(∙) is a simple computation that returns the input value directly if it is greater than zero, and returns zero otherwise. It can be represented mathematically using the function max(∙) over the set {0} and the input 𝓍 as follows:
𝒢(𝓍) = max{0, 𝓍}
4.3.3 Max pooling layer
This layer mitigates the number of parameters when the inputs are large. Also called subsampling or downsampling, it reduces the dimensionality of every feature map while preserving the important information. Max pooling takes the maximum element from each region of the rectified feature map.
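
Putting the convolution, ReLU and max pooling operations together, a small Keras model along the following lines could serve as the feature extractor; the layer counts and sizes here are illustrative assumptions, not the exact architecture of the proposed model:

# Sketch: a small CNN over MFCC "images" (13 coefficients x 100 frames assumed).
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(13, 100, 1)),               # MFCC frames, single channel
    layers.Conv2D(16, (3, 3), activation="relu"),   # convolution + ReLU
    layers.MaxPooling2D((2, 2)),                    # max pooling
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu", name="feature_layer"),  # learned features
    layers.Dense(2, activation="softmax"),          # normal vs. CHF
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])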
4.3.4 SoftMax classifier
Generally, the SoftMax function is added at the end of the network, since that is where the outputs of all nodes finally meet and can be classified. Here, X is the input of the model, the layers between X and Y are the hidden layers, and the data is passed from X through all the layers and received by Y. Suppose we have 10 classes and we must predict to which class the given input belongs; each class is then allotted a particular predicted output score.

Fig. 4.3.4.1: Emotion prediction using SoftMax classifier.
In Fig. 4.3.4.1 we must predict which class is present in the picture. Since the model has already been trained on some data, as soon as the picture is given, the model processes it, sends it through the hidden layers and finally passes it to SoftMax for classification. SoftMax uses a one-hot encoding technique to calculate the cross-entropy loss and pick the maximum. One-hot encoding is the technique used to categorize the data. In the previous example, if SoftMax predicts that the object is class A, then the one-hot encoding for:
Class A will be [1 0 0]
Class B will be [0 1 0]
Class C will be [0 0 1]
From the diagram, we see how the predictions occur. In general, however, the true label is not known in advance, and the machine must choose the correct predicted class itself. For the machine to identify an object correctly, it uses the cross-entropy function; the most similar (lowest-loss) class is chosen using the cross-entropy formula below.

Fig.4.3.4.2: Example of SoftMax classifier.
Fig. 4.3.4.3 : Example of SoftMax classifier with test data.

In the example of Fig. 4.3.4.3 we see that 0.462 is the loss of the function for the class-specific classifier. In the same way, we find the loss for the remaining classifiers. The lower the loss function, the better the prediction. The loss function can be represented as:
loss = np.sum(-Y * np.log(Y_pred))
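
The same computation can be sketched end to end in NumPy, combining the SoftMax probabilities with the one-hot cross-entropy loss above (the values are illustrative):

# Sketch: SoftMax probabilities and one-hot cross-entropy loss.
import numpy as np

def softmax(logits):
    exps = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return exps / np.sum(exps)

logits = np.array([2.0, 1.0, 0.1])   # raw network outputs for classes A, B, C
Y = np.array([1, 0, 0])              # one-hot target: class A
Y_pred = softmax(logits)

loss = np.sum(-Y * np.log(Y_pred))   # cross-entropy, as in the formula above
print(Y_pred.round(3), round(loss, 3))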

4.4 Random Forest classifier


Random Forest is a popular machine learning algorithm that belongs to the supervised learning technique. It can be used for both Classification and Regression problems in ML. It is based on the concept of ensemble learning, which is a process of combining multiple classifiers to solve a complex problem and to improve the performance of the model. As the name suggests, "Random Forest is a classifier that contains a number of decision trees on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset." Instead of relying on one decision tree, the random forest takes the prediction from each tree and, based on the majority vote of those predictions, predicts the final output. A greater number of trees in the forest leads to higher accuracy and prevents the problem of overfitting.
Fig. 4.4: Random Forest algorithm
Random Forest algorithm
Step 1: In Random Forest n number of random records are taken from the data set having
k number of records.
Step 2: Individual decision trees are constructed for each sample.
Step 3: Each decision tree will generate an output.
Step 4: Final output is considered based on Majority Voting or Averaging for
Classification and regression respectively.
Important Features of Random Forest
 Diversity- Not all attributes/variables/features are considered while making an
individual tree, each tree is different.

 Immune to the curse of dimensionality- Since each tree does not consider all
the features, the feature space is reduced.

 Parallelization-Each tree is created independently out of different data and attributes. This means that we can make full use of the CPU to build random forests.

 Train-Test split- In a random forest we don't have to segregate the data for train and test, as there will always be around 30% of the data that is not seen by a given decision tree.

 Stability- Stability arises because the result is based on majority voting/


averaging.
4.4.1 Assumptions for Random Forest
Since the random forest combines multiple trees to predict the class of the dataset, it is
possible that some decision trees may predict the correct output, while others may not.
But together, all the trees predict the correct output. Therefore, below are two
assumptions for a better Random forest classifier:
 There should be some actual values in the feature variable of the dataset so that
the classifier can predict accurate results rather than a guessed result.

 The predictions from each tree must have very low correlations.

Below are some points that explain why we should use the Random Forest algorithm
 It takes less training time as compared to other algorithms.

 It predicts output with high accuracy, even for the large dataset it runs efficiently.

 It can also maintain accuracy when a large proportion of data is missing.

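The out-of-bag behaviour mentioned in the Train-Test split point can be seen directly in scikit-learn; the sketch below uses a synthetic dataset purely for illustration:

# Sketch: training a random forest and reading its out-of-bag score.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)  # stand-in data

clf = RandomForestClassifier(
    n_estimators=100,     # number of trees in the forest
    oob_score=True,       # evaluate on the bootstrap samples each tree never saw
    random_state=0)
clf.fit(X, y)
print(clf.oob_score_)     # accuracy on the out-of-bag samples
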
4.4.2 Types of Ensembles

Before understanding the working of the random forest, we must look into the ensemble
technique. Ensemble simply means combining multiple models. Thus, a collection of
models is used to make predictions rather than an individual model. Ensemble uses two
types of methods:
Bagging– It creates a different training subset from sample training data with
replacement & the final output is based on majority voting. For example, Random
Forest. Bagging, also known as Bootstrap Aggregation is the ensemble technique used
by random forest. Bagging chooses a random sample from the data set. Hence each
model is generated from the samples (Bootstrap Samples) provided by the Original Data
with replacement known as row sampling. This step of row sampling with replacement is
called bootstrap. Now each model is trained independently which generates results. The
final output is based on majority voting after combining the results of all models. This
step which involves combining all the results and generating output based on majority
voting is known as aggregation.
Fig.4.4.2.1: RF Classifier analysis.

Boosting– It combines weak learners into strong learners by creating sequential models such that the final model has the highest accuracy. Examples include AdaBoost and XGBoost.

Fig. 4.4.2.2: Boosting RF Classifier.
CHAPTER 5
MACHINE LEARNING
5.1 What is Machine Learning

Before we look at the details of various machine learning methods, let's start by looking
at what machine learning is, and what it isn't. Machine learning is often categorized as a
subfield of artificial intelligence, but I find that categorization can often be misleading at
first brush. The study of machine learning certainly arose from research in this context,
but in the data science application of machine learning methods, it's more helpful to
think of machine learning as a means of building models of data.

Fundamentally, machine learning involves building mathematical models to help understand data. "Learning" enters the fray when we give these models tunable
parameters that can be adapted to observed data; in this way the program can be
considered to be "learning" from the data. Once these models have been fit to previously
seen data, they can be used to predict and understand aspects of newly observed data. I'll
leave to the reader the more philosophical digression regarding the extent to which this
type of mathematical, model-based "learning" is similar to the "learning" exhibited by
the human brain. Understanding the problem setting in machine learning is essential to
using these tools effectively, and so we will start with some broad categorizations of the
types of approaches we'll discuss here.

5.2 Categories of Machine Leaning

At the most fundamental level, machine learning can be categorized into two main types:
supervised learning and unsupervised learning.

Supervised learning involves somehow modeling the relationship between measured features of data and some label associated with the data; once this model is determined, it
can be used to apply labels to new, unknown data. This is further subdivided
into classification tasks and regression tasks: in classification, the labels are discrete
categories, while in regression, the labels are continuous quantities. We will see
examples of both types of supervised learning in the following section.
Unsupervised learning involves modeling the features of a dataset without reference to
any label and is often described as "letting the dataset speak for itself." These models
include tasks such as clustering and dimensionality reduction. Clustering algorithms
identify distinct groups of data, while dimensionality reduction algorithms search for
more succinct representations of the data. We will see examples of both types of
unsupervised learning in the following section.
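
A small scikit-learn sketch contrasting the two categories on the classic Iris data (illustrative only):

# Sketch: one supervised and one unsupervised model on the same data.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: learn a mapping from features to the known labels.
clf = LogisticRegression(max_iter=200).fit(X, y)
print(clf.predict(X[:3]))

# Unsupervised: find group structure without using the labels at all.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_[:3])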

5.3 Need for Machine Learning

Human beings are, at this moment, the most intelligent and advanced species on earth because they can think, evaluate, and solve complex problems. AI, on the other hand, is still in its initial stage and has not surpassed human intelligence in many aspects. The question, then, is why we need to make machines learn. The most suitable reason for doing this is "to make decisions, based on data, with efficiency and scale".

Lately, organizations are investing heavily in newer technologies like Artificial Intelligence, Machine Learning and Deep Learning to get the key information from data
to perform several real-world tasks and solve problems. We can call it data-driven
decisions taken by machines, particularly to automate the process. These data-driven
decisions can be used, instead of using programming logic, in the problems that cannot
be programmed inherently. The fact is that we can’t do without human intelligence, but
other aspect is that we all need to solve real-world problems with efficiency at a huge
scale. That is why the need for machine learning arises.

5.4 Challenges in Machines Learning

While Machine Learning is rapidly evolving, making significant strides in cybersecurity and autonomous cars, this segment of AI as a whole still has a long way to go, because ML has not yet been able to overcome a number of challenges. The challenges ML currently faces are −

1. Quality of data − Having good-quality data for ML algorithms is one of the biggest challenges. Use of low-quality data leads to the problems related to data
preprocessing and feature extraction.
2. Time-Consuming task − Another challenge faced by ML models is the
consumption of time especially for data acquisition, feature extraction and
retrieval.
3. Lack of specialist persons − As ML technology is still in its infancy stage,
availability of expert resources is a tough job.
4. No clear objective for formulating business problems − Having no clear objective
and well-defined goal for business problems is another key challenge for ML
because this technology is not that mature yet.
5. Issue of overfitting & underfitting − If the model is overfitting or underfitting, it
cannot be represented well for the problem.
6. Curse of dimensionality − Another challenge ML model faces is too many
features of data points. This can be a real hindrance.
7. Difficulty in deployment − Complexity of the ML model makes it quite difficult
to be deployed in real life.
5.5 Applications of Machines Learning

Machine Learning is the most rapidly growing technology and according to researchers
we are in the golden year of AI and ML. It is used to solve many real-world complex
problems which cannot be solved with traditional approach. Following are some real-
world applications of ML

 Emotion analysis
 Sentiment analysis
 Error detection and prevention
 Weather forecasting and prediction
 Stock market analysis and forecasting
 Speech synthesis
 Speech recognition
 Customer segmentation
 Object recognition
 Fraud detection
 Fraud prevention
 Recommendation of products to customer in online shopping
5.6 How to Start Learning Machine Learning?

Arthur Samuel coined the term “Machine Learning” in 1959 and defined it as a “Field of
study that gives computers the capability to learn without being explicitly programmed”.

And that was the beginning of Machine Learning! In modern times, Machine Learning is
one of the most popular (if not the most!) career choices. According to Indeed, Machine
Learning Engineer Is The Best Job of 2019 with a 344% growth and an average base
salary of $146,085 per year.

But there is still a lot of doubt about what exactly is Machine Learning and how to start
learning it? So, this article deals with the Basics of Machine Learning and also the path
you can follow to eventually become a full-fledged Machine Learning Engineer. Now
let’s get started!!!

5.6.1 How to start learning ML?

This is a rough roadmap you can follow on your way to becoming an insanely talented
Machine Learning Engineer. Of course, you can always modify the steps according to
your needs to reach your desired end-goal!

Step 1 – Understand the Prerequisites

In case you are a genius, you could start ML directly but normally, there are some
prerequisites that you need to know which include Linear Algebra, Multivariate
Calculus, Statistics, and Python. And if you don’t know these, never fear! You don’t
need a Ph.D. degree in these topics to get started but you do need a basic understanding.

(a) Learn Linear Algebra and Multivariate Calculus

Both Linear Algebra and Multivariate Calculus are important in Machine Learning.
However, the extent to which you need them depends on your role as a data scientist. If
you are more focused on application heavy machine learning, then you will not be that
heavily focused on maths as there are many common libraries available. But if you want
to focus on R&D in Machine Learning, then mastery of Linear Algebra and Multivariate
Calculus is very important as you will have to implement many ML algorithms from
scratch.
(b) Learn Statistics

Data plays a huge role in Machine Learning. In fact, around 80% of your time as an ML
expert will be spent collecting and cleaning data. And statistics is a field that handles the
collection, analysis, and presentation of data. So it is no surprise that you need to learn
it!!!
Some of the key concepts in statistics that are important are Statistical Significance,
Probability Distributions, Hypothesis Testing, Regression, etc. Bayesian Thinking is also a very important part of ML, which deals with various concepts like Conditional Probability, Priors and Posteriors, Maximum Likelihood, etc.

(c) Learn Python

Some people prefer to skip Linear Algebra, Multivariate Calculus and Statistics and learn
them as they go along with trial and error. But the one thing that you absolutely cannot
skip is Python! While there are other languages you can use for Machine Learning, like R, Scala, etc., Python is currently the most popular language for ML. In fact, there are
many Python libraries that are specifically useful for Artificial Intelligence and Machine
Learning such as Keras, TensorFlow, Scikit-learn, etc.

So, if you want to learn ML, it’s best if you learn Python! You can do that using various
online resources and courses such as Fork Python available Free on GeeksforGeeks.

Step 2 – Learn Various ML Concepts

Now that you are done with the prerequisites, you can move on to actually learning ML
(Which is the fun part!!!) It’s best to start with the basics and then move on to the more
complicated stuff. Some of the basic concepts in ML are:

(a) Terminologies of Machine Learning

 Model – A model is a specific representation learned from data by applying some machine learning algorithm. A model is also called a hypothesis.
 Feature – A feature is an individual measurable property of the data. A set of
numeric features can be conveniently described by a feature vector. Feature
vectors are fed as input to the model. For example, in order to predict a fruit,
there may be features like color, smell, taste, etc.
 Target (Label) – A target variable or label is the value to be predicted by our
model. For the fruit example discussed in the feature section, the label with each
set of input would be the name of the fruit like apple, orange, banana, etc.
 Training – The idea is to give a set of inputs (features) and their expected outputs (labels), so that after training, we will have a model (hypothesis) that will then map new data to one of the categories it was trained on.
 Prediction – Once our model is ready, it can be fed a set of inputs to which it will
provide a predicted output(label).
(b) Types of Machine Learning

 Supervised Learning – This involves learning from a training dataset with labeled
data using classification and regression models. This learning process continues
until the required level of performance is achieved.
 Unsupervised Learning – This involves using unlabelled data and then finding the
underlying structure in the data in order to learn more and more about the data
itself using factor and cluster analysis models.
 Semi-supervised Learning – This involves using unlabelled data like
Unsupervised Learning with a small amount of labeled data. Using labeled data
vastly increases the learning accuracy and is also more cost-effective than
Supervised Learning.
 Reinforcement Learning – This involves learning optimal actions through trial
and error. So the next action is decided by learning behaviors that are based on
the current state and that will maximize the reward in the future.
5.7 Advantages of Machine learning

1. Easily identifies trends and patterns -

Machine Learning can review large volumes of data and discover specific trends and
patterns that would not be apparent to humans. For instance, for an e-commerce website
like Amazon, it serves to understand the browsing behaviors and purchase histories of its
users to help cater to the right products, deals, and reminders relevant to them. It uses the
results to reveal relevant advertisements to them.

2. No human intervention needed (automation)

With ML, you don’t need to babysit your project every step of the way. Since it means
giving machines the ability to learn, it lets them make predictions and also improve the
algorithms on their own. A common example of this is anti-virus software; it learns to filter new threats as they are recognized. ML is also good at recognizing spam.

3. Continuous Improvement

As ML algorithms gain experience, they keep improving in accuracy and efficiency. This
lets them make better decisions. Say you need to make a weather forecast model. As the
amount of data you have keeps growing, your algorithms learn to make more accurate
predictions faster.

4. Handling multi-dimensional and multi-variety data

Machine Learning algorithms are good at handling data that are multi-dimensional and
multi-variety, and they can do this in dynamic or uncertain environments.

5. Wide Applications

You could be an e-tailer or a healthcare provider and make ML work for you. Where it
does apply, it holds the capability to help deliver a much more personal experience to
customers while also targeting the right customers.

5.8 Disadvantages of Machine Learning

1. Data Acquisition

Machine Learning requires massive data sets to train on, and these should be
inclusive/unbiased, and of good quality. There can also be times where they must wait
for new data to be generated.

2. Time and Resources

ML needs enough time to let the algorithms learn and develop enough to fulfill their
purpose with a considerable amount of accuracy and relevancy. It also needs massive
resources to function. This can mean additional requirements of computer power for you.

3. Interpretation of Results

Another major challenge is the ability to accurately interpret results generated by the
algorithms. You must also carefully choose the algorithms for your purpose.
4. High error-susceptibility

Machine Learning is autonomous but highly susceptible to errors. Suppose you train an
algorithm with data sets small enough to not be inclusive. You end up with biased
predictions coming from a biased training set. This leads to irrelevant advertisements
being displayed to customers. In the case of ML, such blunders can set off a chain of
errors that can go undetected for long periods of time. And when they do get noticed, it
takes quite some time to recognize the source of the issue, and even longer to correct it.
CHAPTER 6
SOFTWARE ENVIRONMENT

6.1 What is Python?

Below are some facts about Python.

 Python is currently the most widely used multi-purpose, high-level programming language.
 Python allows programming in Object-Oriented and Procedural paradigms.
Python programs generally are smaller than other programming languages like
Java.
 Programmers have to type relatively less, and the indentation requirement of the language makes programs readable all the time.
 Python language is being used by almost all tech-giant companies like – Google,
Amazon, Facebook, Instagram, Dropbox, Uber… etc.
The biggest strength of Python is its huge collection of standard libraries, which can be used for the following –

 Machine Learning
 GUI Applications (like Kivy, Tkinter, PyQt, etc.)
 Web frameworks like Django (used by YouTube, Instagram, Dropbox)
 Image processing (like OpenCV, Pillow)
 Web scraping (like Scrapy, BeautifulSoup, Selenium)
 Test frameworks
 Multimedia
6.2 Advantages of Python

Let’s see how Python dominates over other languages.

1. Extensive Libraries

Python downloads with an extensive library containing code for various purposes like regular expressions, documentation generation, unit testing, web browsers, threading, databases, CGI, email, image manipulation, and more.
So, we don’t have to write the complete code for that manually.

2. Extensible

As we have seen earlier, Python can be extended to other languages. You can write some
of your code in languages like C++ or C. This comes in handy, especially in projects.

3. Embeddable

Complimentary to extensibility, Python is embeddable as well. You can put your Python
code in your source code of a different language, like C++. This lets us add scripting
capabilities to our code in the other language.

4. Improved Productivity

The language’s simplicity and extensive libraries render programmers more productive
than languages like Java and C++ do. Also, the fact that you need to write less and get
more things done.

5. IOT Opportunities

Since Python forms the basis of new platforms like Raspberry Pi, it finds the future
bright for the Internet Of Things. This is a way to connect the language with the real
world.

6. Simple and Easy

When working with Java, you may have to create a class to print ‘Hello World’. But in
Python, just a print statement will do. It is also quite easy to learn, understand, and code.
This is why when people pick up Python, they have a hard time adjusting to other more
verbose languages like Java.

7. Readable

Because it is not such a verbose language, reading Python is much like reading English.
This is the reason why it is so easy to learn, understand, and code. It also does not need
curly braces to define blocks, and indentation is mandatory. This further aids the readability of the code.
8. Object-Oriented

This language supports both the procedural and object-oriented programming paradigms.
While functions help us with code reusability, classes and objects let us model the real
world. A class allows the encapsulation of data and functions into one.

9. Free and Open-Source

Like we said earlier, Python is freely available. But not only can you download Python
for free, but you can also download its source code, make changes to it, and even
distribute it. It downloads with an extensive collection of libraries to help you with your
tasks.

10. Portable

When you code your project in a language like C++, you may need to make some
changes to it if you want to run it on another platform. But it isn’t the same with Python.
Here, you need to code only once, and you can run it anywhere. This is called Write
Once Run Anywhere (WORA). However, you need to be careful enough not to include
any system-dependent features.

11. Interpreted

Lastly, we will say that it is an interpreted language. Since statements are executed one
by one, debugging is easier than in compiled languages.

6.3 Advantages of Python Over Other Languages

1. Less Coding

Almost all tasks done in Python require less coding than when the same task is done in other languages. Python also has awesome standard library support, so you don't have
to search for any third-party libraries to get your job done. This is the reason that many
people suggest learning Python to beginners.

2. Affordable

Python is free; therefore individuals, small companies, or big organizations can leverage the freely available resources to build applications. Python is popular and widely used, so it
gives you better community support.
The 2019 GitHub annual survey showed us that Python has overtaken Java in the most
popular programming language category.

3. Python is for Everyone

Python code can run on any machine whether it is Linux, Mac or Windows.
Programmers need to learn different languages for different jobs but with Python, you
can professionally build web apps, perform data analysis and machine learning, automate
things, do web scraping and also build games and powerful visualizations. It is an all-
rounder programming language.

6.4 Disadvantages of Python

So far, we’ve seen why Python is a great choice for your project. But if you choose it,
you should be aware of its consequences as well. Let’s now see the downsides of
choosing Python over another language.

1. Speed Limitations

We have seen that Python code is executed line by line. But since Python is interpreted,
it often results in slow execution. This, however, isn’t a problem unless speed is a focal
point for the project. In other words, unless high speed is a requirement, the benefits
offered by Python are enough to distract us from its speed limitations.

2. Weak in Mobile Computing and Browsers

While it serves as an excellent server-side language, Python is rarely seen on the client side. Besides that, it is rarely ever used to implement smartphone-based applications. One such application is called Carbonnelle.

3. Design Restrictions

As you know, Python is dynamically typed. This means that you don’t need to declare
the type of variable while writing the code. It uses duck-typing. But wait, what’s that?
Well, it just means that if it looks like a duck, it must be a duck. While this is easy on the
programmers during coding, it can raise run-time errors.
4. Underdeveloped Database Access Layers

Compared to more widely used technologies like JDBC (Java Database Connectivity)
and ODBC (Open Database Connectivity), Python’s database access layers are a bit
underdeveloped. Consequently, it is less often applied in huge enterprises.

5. Simple

No, we’re not kidding. Python’s simplicity can indeed be a problem. Take my example. I
don’t do Java, I’m more of a Python person. To me, its syntax is so simple that the
verbosity of Java code seems unnecessary.

6.5 History of Python

What do the alphabet and the programming language Python have in common?

Right, both start with ABC. If we are talking about ABC in the Python context, it's clear
that the programming language ABC is meant. ABC is a general-purpose programming
language and programming environment, which had been developed in the Netherlands,
Amsterdam, at the CWI (Centrum Wiskunde &Informatica). The greatest achievement of
ABC was to influence the design of Python. Python was conceptualized in the late
1980s. Guido van Rossum worked that time in a project at the CWI, called Amoeba, a
distributed operating system. In an interview with Bill Venners1, Guido van Rossum
said: "In the early 1980s, I worked as an implementer on a team building a language
called ABC at Centrum voor Wiskunde en Informatica (CWI). I don't know how well
people know ABC's influence on Python. I try to mention ABC's influence because I'm
indebted to everything I learned during that project and to the people who worked on it.
"Later on in the same Interview, Guido van Rossum continued: "I remembered all my
experience and some of my frustration with ABC. I decided to try to design a simple
scripting language that possessed some of ABC's better properties, but without its
problems. So I started typing. I created a simple virtual machine, a simple parser, and a
simple runtime. I made my own version of the various ABC parts that I liked. I created a
basic syntax, used indentation for statement grouping instead of curly braces or begin-
end blocks, and developed a small number of powerful data types: a hash table (or
dictionary, as we call it), a list, strings, and numbers."
6.6 Python Development Steps

Guido van Rossum published the first version of Python code (version 0.9.0) at alt.sources in February 1991. This release already included exception handling, functions, and the core data types of list, dict, str and others. It was also object-oriented and had a module system.

Python version 1.0 was released in January 1994. The major new features included in
this release were the functional programming tools lambda, map, filter and reduce, which
Guido Van Rossum never liked. Six and a half years later in October 2000, Python 2.0
was introduced. This release included list comprehensions, a full garbage collector, and Unicode support. Python flourished for another 8 years in the versions 2.x before
the next major release as Python 3.0 (also known as "Python 3000" and "Py3K") was
released. Python 3 is not backwards compatible with Python 2.x. The emphasis in Python
3 had been on the removal of duplicate programming constructs and modules, thus
fulfilling or coming close to fulfilling the 13th law of the Zen of Python: "There should be one -- and preferably only one -- obvious way to do it." Some changes in Python 3.0:

 Print is now a function.

 Views and iterators instead of lists


 The rules for ordering comparisons have been simplified. E.g., a heterogeneous
list cannot be sorted, because all the elements of a list must be comparable to
each other.
 There is only one integer type left, i.e., int. long is int as well.
 The division of two integers returns a float instead of an integer. "//" can be used
to have the "old" behaviour.
 Text Vs. Data Instead of Unicode Vs. 8-bit
6.7 Purpose

We demonstrated that our approach enables successful segmentation of intra-retinal


layers—even with low-quality images containing speckle noise, low contrast, and
different intensity ranges throughout—with the assistance of the ANIS feature.
6.8 Python

Python is an interpreted high-level programming language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a
design philosophy that emphasizes code readability, notably using significant
whitespace.

Python features a dynamic type system and automatic memory management. It supports
multiple programming paradigms, including object-oriented, imperative, functional and
procedural, and has a large and comprehensive standard library.

 Python is Interpreted − Python is processed at runtime by the interpreter. You do not need to compile your program before executing it. This is similar to PERL and PHP.
 Python is Interactive − you can actually sit at a Python prompt and interact with
the interpreter directly to write your programs.
Python also acknowledges that speed of development is important. Readable and terse code is part of this, and so is access to powerful constructs that avoid tedious repetition of code. Maintainability also ties into this; it may be an all but useless metric, but it does say something about how much code you have to scan, read and/or understand to troubleshoot problems or tweak behaviors. This speed of development, the ease with which a programmer of other languages can pick up basic Python skills, and the huge standard library are key to another area where Python excels: all its tools have been quick to implement, have saved a lot of time, and several of them have later been patched and updated by people with no Python background - without breaking.

6.9 Modules Used in Project

TensorFlow

TensorFlow is a free and open-source software library for dataflow and differentiable
programming across a range of tasks. It is a symbolic math library and is also used for
machine learning applications such as neural networks. It is used for both research and
production at Google.

TensorFlow was developed by the Google Brain team for internal Google use. It was
released under the Apache 2.0 open-source license on November 9, 2015.
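
A minimal sketch of the differentiable programming TensorFlow provides (illustrative only):

# Sketch: automatic differentiation with TensorFlow's GradientTape.
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2 + 2 * x          # a simple computation to differentiate
print(tape.gradient(y, x))      # dy/dx = 2x + 2 -> 8.0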
NumPy

NumPy is a general-purpose array-processing package. It provides a high-performance multidimensional array object, and tools for working with these arrays. It is the
fundamental package for scientific computing with Python. It contains various features
including these important ones:

 A powerful N-dimensional array object


 Sophisticated (broadcasting) functions
 Tools for integrating C/C++ and Fortran code
 Useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-
dimensional container of generic data. Arbitrary datatypes can be defined using NumPy
which allows NumPy to seamlessly and speedily integrate with a wide variety of
databases.
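
A tiny sketch of the N-dimensional array object and broadcasting in action:

# Sketch: array creation, broadcasting, and a reduction in NumPy.
import numpy as np

a = np.arange(12).reshape(3, 4)    # a 3 x 4 array
b = np.array([1, 10, 100, 1000])   # broadcast across each row of a
print((a * b).sum(axis=0))         # column-wise sums of the product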

Pandas

Pandas is an open-source Python library providing high-performance data manipulation and analysis tools using its powerful data structures. Python was majorly used for data
munging and preparation. It had very little contribution towards data analysis. Pandas
solved this problem. Using Pandas, we can accomplish five typical steps in the
processing and analysis of data, regardless of the origin of data load, prepare,
manipulate, model, and analyze. Python with Pandas is used in a wide range of fields
including academic and commercial domains including finance, economics, Statistics,
analytics, etc.
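
A small illustrative sketch of loading, manipulating and summarizing data with pandas (the values are made up):

# Sketch: build a small table and compute a per-class summary.
import pandas as pd

df = pd.DataFrame({
    "patient": ["p1", "p2", "p3", "p4"],
    "label": ["normal", "chf", "normal", "chf"],
    "heart_rate": [72, 95, 68, 102],
})
print(df.groupby("label")["heart_rate"].mean())  # mean heart rate per class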

Matplotlib

Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib
can be used in Python scripts, the Python and IPython shells, the Jupyter Notebook, web
application servers, and four graphical user interface toolkits. Matplotlib tries to make
easy things easy and hard things possible. You can generate plots, histograms, power
spectra, bar charts, error charts, scatter plots, etc., with just a few lines of code. For
examples, see the sample plots and thumbnail gallery.

For simple plotting the pyplot module provides a MATLAB-like interface, particularly
when combined with IPython. For the power user, you have full control of line styles,
font properties, axes properties, etc., via an object-oriented interface or via a set of
functions familiar to MATLAB users.
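A minimal pyplot sketch of the kind of accuracy curve plotted later in this project (the
numbers here are made up):

import matplotlib.pyplot as plt

epochs = [1, 2, 3, 4, 5]
accuracy = [0.62, 0.75, 0.84, 0.90, 0.94]

plt.plot(epochs, accuracy, "go-", label="Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.title("Example training curve")
plt.legend()
plt.show()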

Scikit-learn

Scikit-learn provides a range of supervised and unsupervised learning algorithms via a
consistent interface in Python. It is licensed under a permissive simplified BSD license
and is distributed under many Linux distributions, encouraging academic and commercial
use.
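The consistent interface means that every estimator exposes the same fit/predict pattern.
A hedged sketch using the Random Forest classifier employed later in this project, on toy
data only:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy feature matrix (6 samples, 2 features) and binary labels
X = [[0.1, 1.2], [0.3, 0.9], [0.2, 1.1], [1.5, 0.2], [1.7, 0.4], [1.6, 0.3]]
y = [0, 0, 0, 1, 1, 1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X_train, y_train)        # the same interface for every estimator
print(clf.predict(X_test))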


Install Python Step-by-Step in Windows and Mac

Python, a versatile programming language, does not come pre-installed on your computer.
Python was first released in 1991 and remains a very popular high-level programming
language today. Its design philosophy emphasizes code readability, with its notable use
of significant whitespace.

The object-oriented approach and language constructs provided by Python enable
programmers to write both clear and logical code for projects. This software does not
come pre-packaged with Windows.

6.10 How to Install Python on Windows and Mac

There have been several updates to Python over the years. The question is: how do you
install Python? It might be confusing for a beginner who wants to start learning Python,
but this tutorial will resolve that. At the time this report was prepared, the latest
version of Python was 3.7.4, i.e. a Python 3 release.

Note: Python version 3.7.4 cannot be used on Windows XP or earlier.

Before you start with the installation process, you first need to know your system
requirements. You must download the Python build that matches your system type, i.e. your
operating system and processor. The system used here is a Windows 64-bit operating
system, so the steps below install Python 3.7.4 on a Windows device. The steps for
installing Python on Windows 10, 8 and 7 are divided into four parts to help understand
them better.

Download the Correct version into the system

Step 1: Go to the official site to download and install Python using Google Chrome or
any other web browser, as shown in Fig. 6.10.1, or click on the following link:
https://www.python.org

Fig. 6.10.1: Website details of Python.

Now, check for the latest and the correct version for your operating system.

Step 2: Click on the Downloads tab as shown in Fig. 6.10.2.

Fig. 6.10.2: Downloading the latest version of Python.

Step 3: You can either select the yellow "Download Python 3.7.4" button for Windows, or
scroll further down and click on the download for your specific version. Here, we
download the most recent Python version for Windows, 3.7.4.

Step 4: Scroll down the page until you find the Files option.

Step 5: Here you see the different builds of Python for each operating system, as shown
in Fig. 6.10.3.

Fig. 6.10.3: Different versions of Python.


 To download 32-bit Python for Windows, you can select any one of the three
options: Windows x86 embeddable zip file, Windows x86 executable installer, or
Windows x86 web-based installer.
 To download 64-bit Python for Windows, you can select any one of the three
options: Windows x86-64 embeddable zip file, Windows x86-64 executable
installer, or Windows x86-64 web-based installer.
Here we will install the Windows x86-64 web-based installer. With this, the first part,
choosing which version of Python to download, is complete. Now we move ahead with the
second part: installation.

Note: To see the changes or updates made in a release, you can click on the Release
Notes option.

6.10.1 Installation of Python

Step 1: Go to Downloads and open the downloaded Python installer to carry out the
installation process.

Step 2: Before you click on Install Now, make sure to put a tick on Add Python 3.7 to PATH.

Step 3: Click on Install Now. After the installation is successful, click on Close.

With the above three steps, you have successfully and correctly installed Python. Now it
is time to verify the installation.

Note: The installation process might take a couple of minutes.

6.10.2 Verify the Python Installation

Step 1: Click on Start

Step 2: In the Windows Run Command, type “cmd”.

Step 3: Open the Command prompt option.

Step 4: Let us test whether Python is correctly installed. Type python -V and press
Enter.

Step 5: You will get the answer as Python 3.7.4.

Note: If you have an earlier version of Python already installed, you must first
uninstall it and then install the new one.

6.10.3 Check how the Python IDLE works

Step 1: Click on Start

Step 2: In the Windows Run command, type “python idle”.

Step 3: Click on IDLE (Python 3.7 64-bit) and launch the program

Step 4: To go ahead with working in IDLE you must first save the file. Click on File >
Save.

Step 5: Name the file, with the save-as type set to Python files, and click on SAVE.
Here the file is named Hey World.

Step 6: Now, for example, enter print("Hey World") and press Enter.

You will see that the command given is executed. With this, we end our tutorial on how
to install Python. You have learned how to download and install Python for Windows on
your respective operating system.

Note: Unlike Java, Python does not require semicolons at the end of statements.

CHAPTER 7
RESULTS AND DISCUSSIONS

7.1 Implementation Description


The project implements a Graphical User Interface (GUI) application using the tkinter
library in Python. Here's a detailed breakdown of its implementation:
 Importing Libraries: The code imports necessary libraries such as pandas, tkinter,
matplotlib, NumPy, scikit-learn, WFDB (Waveform Database), SciPy, Python
Speech Features, Keras, and pickle.
 Creating the GUI: The main window is created using tkinter.Tk(), with the title
"ChronicNet: Detection of Chronic Heart Failure from Heart Sounds using
Integrated ML and DL Models" and dimensions set to 1300x1200 pixels. Labels,
buttons, and text areas are created using tkinter widgets to design the user
interface.
 Dataset Upload: Users can upload the Physionet dataset by clicking the "Upload
Physionet Dataset" button. Upon clicking, a file dialog opens, allowing users to
select the directory containing the dataset. The selected directory path is
displayed on the interface using a label.
 Dataset Preprocessing: Upon uploading the dataset, the "Dataset Preprocessing"
button triggers the preprocessing phase. PCG (Phonocardiogram) signals and
WAV files are processed to extract relevant features. The extracted features are
saved for future use.
 ML Model Training: The "ML Segmented Model with FE & FS" button initiates
the training of a machine learning (ML) model. The ML model used is a Random
Forest classifier, employing feature engineering and feature selection techniques.
 DL Model Training: Clicking the "DL Model on Raw Features" button starts the
training of a deep learning (DL) model. The DL model architecture is based on a
convolutional neural network (CNN), designed to learn from the raw audio
features.
 Proposed Model: The "Proposed Model" button executes the proposed model by
aggregating features extracted from both the ML and DL models. A Random
Forest classifier is trained on the aggregated features to create the proposed
model; a sketch of this step appears after this list.
 Predict CHF from Test Sound: Users can predict Chronic Heart Failure (CHF)
from test sound files by clicking the "Predict CHF from Test Sound" button.
Upon clicking, a file dialog opens to select the WAV file containing the test
sound. Features are extracted from the selected test sound and predictions are
made using the DL model.
 Display of Results: The application displays the accuracy, sensitivity, and
specificity of each model in the text area. It also visualizes the performance of all
algorithms using bar graphs. Additionally, the GUI provides feedback to the user
about the progress and status of each operation.
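The core of the proposed model is feature aggregation: the trained CNN is truncated at an
intermediate dense layer and used as a feature extractor, and a Random Forest is then
trained on those deep features. The following is a hedged sketch of that step, assuming a
trained Keras model dl_model and the MFCC arrays recording_X and recording_Y produced
during preprocessing (the full listing is in the Appendix):

import numpy as np
from keras.models import Model
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Truncate the trained CNN at its penultimate dense layer so it
# outputs learned feature vectors instead of class probabilities.
feature_extractor = Model(dl_model.inputs, dl_model.layers[-3].output)
deep_features = feature_extractor.predict(recording_X)

# Train the classic ML model (Random Forest) on the deep features.
X_train, X_test, y_train, y_test = train_test_split(deep_features, recording_Y, test_size=0.2)
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
print("Aggregate model accuracy:", rf.score(X_test, y_test))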

7.2 Dataset Description

Fig. 7.2: Sample dataset.

As shown in Fig. 7.2, each record consists of three files: the .hea file contains the
class label (Normal or Abnormal), the .dat file contains the PCG signal, and the .wav
file contains the heart sound recording. All three files are used to train the
algorithms.
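A hedged sketch of how one such record can be read with the WFDB package, mirroring the
preprocessing code in the Appendix (the record name a0001 is illustrative only):

import wfdb

# Read a 5000-sample window of the PCG signal plus its header metadata
signals, fields = wfdb.rdsamp("training/a0001", sampfrom=10000, sampto=15000)
pcg = signals.ravel()                 # flatten to a 1-D signal
label = fields.get("comments")[0]     # e.g. 'Normal' or 'Abnormal'
print(pcg.shape, label)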

7.3 Results Description

Fig.7.3.1: Upload of Physionet Dataset in the Chronic Heart Failure GUI.

As shown in Fig. 7.3.1, the Physionet dataset is uploaded through the user interface.
The uploaded dataset is then processed further.

Fig.7.3.2: Preprocessing of the uploaded dataset.

After the dataset is uploaded through the user interface, the data is pre-processed and
information about the PCG signals of the patients is displayed, as shown in Fig. 7.3.2.

Fig. 7.3.3: Count plot for count of each label.
In Fig. 7.3.3 above, we can see that the dataset contains 405 heart sound files from 405
different persons, of which 117 are Normal sounds and 288 are Abnormal. In the graph, the
x-axis represents the Normal/Abnormal label and the y-axis represents the number of
persons for each label. Close the graph and then click on the 'ML Segmented Model with
FE & FS' button to train the classic ML segmented model on this dataset and obtain its
output.

Fig. 7.3.4: Performance evaluation of the CNN model per epoch.

In Fig. 7.3.4 above, the DL model reaches 93.9% accuracy. In the graph, the x-axis
represents the epoch (iteration) and the y-axis represents the accuracy and loss values;
the green line represents accuracy and the blue line represents loss. With each
successive epoch the accuracy increases and the loss decreases. Close the graph and then
click on the 'Proposed Model' button to obtain the output below.

Fig.7.3.5: Displays the performance evaluation comparison of all models.

In the graph in Fig. 7.3.5, the x-axis represents the algorithm names and the y-axis
represents the accuracy, sensitivity and specificity values. Next, click on the 'Predict
CHF from Test Sound' button to upload a test sound file and get the predicted output as
Normal or Abnormal.

Fig.7.3.6: Selecting and uploading ‘1.wav’ file for model testing.

After the performance evaluation, the next step is to upload the test file '1.wav', as
shown in Fig. 7.3.6, to test the condition of the patient.

Fig. 7.3.7: Prediction result for the given heart sound.

The uploaded test file is processed and the model predicts whether the heart sound is
NORMAL or ABNORMAL, as shown in Fig. 7.3.7.

Table 7.3: Performance metrics for all models.

Table 7.3 summarizes the evaluation metrics for the three models used in the detection
of Chronic Heart Failure (CHF) from heart sounds. Here's a description of each model:
 Random Forest Model:
 Accuracy: The Random Forest model achieved an accuracy of 85.19%, indicating
that it correctly classified 85.19% of the heart sounds.

 Sensitivity: The sensitivity of the Random Forest model is 70.83%, which
represents the proportion of actual positive cases (CHF) correctly identified by
the model.
 Specificity: With a specificity of 91.23%, the Random Forest model accurately
identified 91.23% of the non-CHF cases.
 Deep Learning Model:
 Accuracy: The Deep Learning model exhibited a higher accuracy of 95.12%,
indicating its ability to correctly classify a larger proportion of heart sounds.
 Sensitivity: The Deep Learning model achieved a sensitivity of 100%, suggesting
that it correctly identified all positive cases of CHF.
 Specificity: With a specificity of 93.44%, the Deep Learning model effectively
identified non-CHF cases with high accuracy.
 Proposed Average Aggregate Model:
 Accuracy: The Proposed Average Aggregate Model demonstrated the highest
accuracy among the three models, reaching 96.34% accuracy.
 Sensitivity: Similar to the Deep Learning model, the Proposed Model achieved a
sensitivity of 100%, indicating perfect detection of CHF cases.
 Specificity: With a specificity of 93.44%, the Proposed Model maintained a high
level of accuracy in identifying non-CHF cases.
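For reference, all three metrics follow directly from the confusion matrix:
Accuracy = (TP + TN) / (TP + TN + FP + FN), Sensitivity = TP / (TP + FN), and
Specificity = TN / (TN + FP). A small sketch with made-up labels, taking Abnormal (1) as
the positive class:

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)                 # 3/3 = 1.00
specificity = tn / (tn + fp)                 # 1/2 = 0.50
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 4/5 = 0.80
print(accuracy, sensitivity, specificity)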

CHAPTER 8
CONCLUSION

The development of Chronic Net represents a significant advancement in the early
detection of chronic heart failure (CHF) using integrated machine learning (ML) and
deep learning (DL) models applied to phonocardiography (PCG) data. By leveraging the
latest advancements in AI technology, Chronic Net offers a promising solution for
identifying subtle changes in heart sounds indicative of CHF worsening, enabling timely
intervention and management to improve patient outcomes.
Through comprehensive evaluation and comparison with individual ML and DL models,
Chronic Net has demonstrated superior performance in CHF detection, highlighting the
efficacy of the integrated approach. By harnessing the complementary strengths of ML
and DL methodologies, Chronic Net achieves higher accuracy and reliability in
identifying CHF exacerbations, thereby reducing the risk of hospital admissions and
enhancing patient care.
Future Scope:
The development of Chronic Net opens up several avenues for future research and
innovation in the field of cardiovascular medicine and biomedical engineering:
Enhanced Model Optimization: Continued refinement and optimization of Chronic Net’s
architecture and algorithms can further improve its performance and robustness in CHF
detection. This includes exploring advanced techniques such as transfer learning,
ensemble methods, and attention mechanisms to enhance model interpretability and
generalization.
Longitudinal Monitoring: Expanding Chronic Net to enable longitudinal monitoring of
CHF patients over time can provide valuable insights into disease progression and
treatment response. Integrating additional physiological signals and biomarkers can
enhance the model's predictive capabilities and support personalized treatment planning.
Clinical Integration and Validation: Conducting large-scale clinical studies to validate
Chronic Net’s performance in real-world clinical settings is essential for its clinical
adoption and integration into healthcare practice. Collaborating with healthcare providers
and regulatory agencies can facilitate the translation of Chronic Net from research to
clinical deployment.

Multimodal Data Fusion: Integrating multiple modalities of cardiac data, such as
electrocardiography (ECG), echocardiography, and wearable sensor data, can enrich the
information available for CHF detection and monitoring. Developing multimodal fusion
techniques can leverage complementary information from different data sources to
enhance diagnostic accuracy and reliability.
Telemedicine and Remote Monitoring: Leveraging Chronic Net for telemedicine
applications and remote patient monitoring can extend access to CHF care and support
proactive intervention in home-based settings. Integrating Chronic Net into wearable
devices and mobile applications can empower patients to actively participate in their care
and self-management.

REFERENCES
[1] Gjoreski, Martin, et al. "Machine learning and end-to-end deep learning for the
detection of chronic heart failure from heart sounds." IEEE Access 8 (2020): 20313-20324.
[2] Gahane, Aroh, and Chinnaiah Kotadi. "An Analytical Review of Heart Failure Detection
based on Machine Learning." 2022 Second International Conference on Artificial
Intelligence and Smart Energy (ICAIS). IEEE, 2022.
[3] Shuvo, Samiul Based, et al. "CardioXNet: A novel lightweight deep learning framework
for cardiovascular disease classification using heart sound recordings." IEEE Access 9
(2021): 36955-36967.
[4] Li, Suyi, et al. "A review of computer-aided heart sound detection techniques."
BioMed Research International 2020 (2020).
[5] Miotto, Riccardo, et al. "Deep learning for healthcare: review, opportunities and
challenges." Briefings in Bioinformatics 19.6 (2018): 1236-1246.
[6] Allugunti, Viswanatha Reddy. "Heart disease diagnosis and prediction based on hybrid
machine learning model."
[7] Zubair, Muhammad. "A Peak Detection Algorithm for Localization and Classification of
Heart Sounds in PCG Signals using K-means Clustering." (2021).
[8] Valera, H. H. Alvarez, and M. Luštrek. "Machine Learning Models for Detection of
Decompensation in Chronic Heart Failure Using Heart Sounds." Workshops at 18th
International Conference on Intelligent Environments (IE2022). Vol. 31. IOS Press, 2022.
[9] Ravi, Rohit, and P. Madhavan. "Prediction of Cardiovascular Disease using Machine
Learning Algorithms." 2022 International Conference on Communications, Information,
Electronic and Energy Systems (CIEES). IEEE, 2022.
[10] Susic, D., Gregor Poglajen, and Anton Gradišek. "Machine learning models for
detection of decompensation in chronic heart failure using heart sounds." Proceedings of
the Workshops at 18th International Conference on Intelligent Environments (IE2022).
Amsterdam: IOS Press, 2022.
[11] Sreejith, S., S. Rahul, and R. C. Jisha. "A real time patient monitoring system for
heart disease prediction using random forest algorithm." 2016.
[12] Gjoreski, Martin, et al. "Chronic heart failure detection from heart sounds using a
stack of machine-learning classifiers." 2017 International Conference on Intelligent
Environments (IE). IEEE, 2017.
[13] Ismail, Shahid, et al. "PCG classification through spectrogram using transfer
learning." Biomedical Signal Processing and Control 79 (2023): 104075.
[14] Sanei, Saeid, Mansoureh Ghodsi, and Hossein Hassani. "An adaptive singular spectrum
analysis approach to murmur detection from heart sounds." Medical Engineering & Physics
33.3 (2011).
[15] Gao, Shan, Yineng Zheng, and Xingming Guo. "Gated recurrent unit-based heart sound
analysis for heart failure screening." BioMedical Engineering OnLine 19 (2020): 1-17.
[16] Wang, Hui, et al. "An automatic approach for heart failure typing based on heart
sounds and convolutional recurrent neural networks." Physical and Engineering Sciences
in Medicine 45.2 (2022): 475-485.
[17] Beritelli, Francesco, et al. "Automatic heart activity diagnosis based on Gram
polynomials and probabilistic neural networks." Biomedical Engineering Letters 8 (2018):
77-85.
[18] Zheng, Yineng, et al. "A multi-scale and multi-domain heart sound feature-based
machine learning model for ACC/AHA heart failure stage classification." Physiological
Measurement 43.6 (2022): 065002.
[19] Liu, Yongmin, Xingming Guo, and Yineng Zheng. "An automatic approach using ELM
classifier for HFpEF identification based on heart sound characteristics." Journal of
Medical Systems 43 (2019): 1-8.
[20] Yang, Siyu, et al. "Fast Abnormal Heart Sound Detection Method Based on Multi-struct
Neural Network." 2022 7th International Conference on Intelligent Computing and Signal
Processing (ICSP). IEEE, 2022.
[21] Yang, Yang, et al. "Deep learning-based heart sound analysis for left ventricular
diastolic dysfunction diagnosis." Diagnostics 11.12 (2021): 2349.
[22] Zeinali, Yasser, and Seyed Taghi Akhavan Niaki. "Heart sound classification using
signal processing and machine learning algorithms." Machine Learning with Applications 7
(2022): 100206.
[23] Maglogiannis, Ilias, et al. "Support vectors machine-based identification of heart
valve diseases using heart sounds." Computer Methods and Programs in Biomedicine 95.1
(2009): 47-61.
[24] Jayakrishnan, Athulya, R. Visakh, and T. K. Ratheesh. "Computational approach for
heart disease prediction using machine learning." 2021 International Conference on
Communication, Control and Information Sciences (ICCISc). Vol. 1. IEEE, 2021.
[25] Zeng, Wei. "A new approach for the detection of abnormal heart sound signals using
TQWT, VMD and neural networks." Artificial Intelligence Review 54.3 (2021): 1613-1647.
[26] Zheng, Yineng. "Computer-assisted diagnosis for chronic heart failure by the
analysis of their cardiac reserve and heart sound characteristics." (2015).
[27] Brites, Ivo Sérgio Guimarães. "Machine learning and IoT applied to cardiovascular
diseases identification through heart sounds: A literature review." Informatics. Vol. 8.
No. 4. Multidisciplinary Digital Publishing Institute, 2021.
[28] Li, Haixia. "Detection and classification of abnormities of first heart sound using
empirical wavelet transform." IEEE Access 7 (2019): 139643-139652.
[29] Giordano, Noemi, and Marco Knaflitz. "A novel method for measuring the timing of
heart sound components through digital phonocardiography." Sensors 19.8 (2019): 1868.
[30] Jariwala, Nancy, et al. "Clinically undetectable heart sounds in hospitalized
patients undergoing echocardiography." JAMA Internal Medicine 182.1 (2022): 86-87.

APPENDIX

import pandas as pd
from tkinter import messagebox
from tkinter import *
from tkinter import simpledialog
import tkinter
from tkinter import filedialog
import matplotlib.pyplot as plt
import numpy as np
from tkinter.filedialog import askopenfilename
import os
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import wfdb
from scipy.io import wavfile
import scipy.signal
from python_speech_features import mfcc
from sklearn.ensemble import RandomForestClassifier
from keras.utils.np_utils import to_categorical
from keras.layers import MaxPooling2D
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D
from keras.models import Sequential, Model
from keras.models import model_from_json
import pickle
from sklearn.metrics import confusion_matrix

main = tkinter.Tk()
main.title("ChronicNet: Detection of Chronic Heart Failure from Heart Sounds using Integrated ML and DL Models")
main.geometry("1300x1200")
global filename
global ml_model, dl_model

global pcg_X, pcg_Y
global recording_X, recording_Y
global accuracy, specificity, sensitivity
def upload():
    global filename
    filename = filedialog.askdirectory(initialdir=".")
    pathlabel.config(text=filename)
    text.delete('1.0', END)
    text.insert(END, filename + " loaded\n\n")

def getLabel(name):
    lbl = 0
    if name == 'Abnormal':
        lbl = 1
    return lbl

def processDataset():
    global pcg_X, pcg_Y, filename
    global recording_X, recording_Y
    text.delete('1.0', END)
    if os.path.exists("model/pcg.npy"):
        pcg_X = np.load("model/pcg.npy")
        pcg_Y = np.load("model/pcg_label.npy")
        recording_X = np.load("model/wav.npy")
        recording_Y = np.load("model/wav_label.npy")  # file name assumed; this line was truncated in the original listing
    else:
        pcg = []      # added: the lists must be initialized before appending
        labels = []
        for root, dirs, directory in os.walk(filename):
            for j in range(len(directory)):
                name = os.path.basename(root)
                if '.dat' in directory[j]:
                    fname = directory[j].split(".")
                    signals, fields = wfdb.rdsamp(root + "/" + fname[0], sampfrom=10000, sampto=15000)
                    signals = signals.ravel()
                    label = getLabel(fields.get('comments')[0])
                    pcg.append(signals)
                    labels.append(label)
                    print(directory[j] + " " + fname[0] + " " + str(signals.shape) + " " + str(label))
        pcg = np.asarray(pcg)
        labels = np.asarray(labels)
        np.save("model/pcg", pcg)
        np.save("model/pcg_label", labels)
        pcg_X, pcg_Y = pcg, labels  # added so the freshly extracted features are used below
        # NOTE: the .wav MFCC extraction that produces model/wav.npy is not shown in this listing
text.insert(END,"Total PCG signals found in dataset : "+str(pcg_X.shape[0])+"\n\n")
unique, counts = np.unique(pcg_Y, return_counts=True)
text.insert(END,"Total Normal PCG signals found in dataset : "+str(counts[0])+"\n")
text.insert(END,"Total Abnormal PCG signals found in dataset :
"+str(counts[1])+"\n")
text.update_idletasks()
height = counts
bars = ('Normal Heart Records','Abnormal Heart Records')
y_pos = np.arange(len(bars))
plt.bar(y_pos, height)
plt.xticks(y_pos, bars)
plt.title("Normal & Abnormal Heart Sound Found in Dataset")
plt.show()
def runML():
    text.delete('1.0', END)
    global ml_model, dl_model
    global pcg_X, pcg_Y
    global accuracy, specificity, sensitivity
    accuracy = []
    specificity = []
    sensitivity = []
    X_train, X_test, y_train, y_test = train_test_split(pcg_X, pcg_Y, test_size=0.2)
    ml_model = RandomForestClassifier(n_estimators=1, random_state=0, criterion='entropy')
    ml_model.fit(pcg_X, pcg_Y)  # trained on the full dataset, as in the original listing
    predict = ml_model.predict(X_test)
    acc = accuracy_score(y_test, predict) * 100

text.insert(END,"Random Forest Accuracy : "+str(acc)+"\n")
cm = confusion_matrix(y_test, predict)
total = sum(sum(cm))
se = cm[0,0]/(cm[0,0]+cm[0,1]) * 100
text.insert(END,'Random Forest Sensitivity : '+str(se)+"\n")
sp = cm[1,1]/(cm[1,0]+cm[1,1]) * 100
text.insert(END,'Random Forest Specificity : '+str(sp)+"\n\
n") accuracy.append(acc)
specificity.append(sp)
sensitivity.append(se)
def runDL():
    global dl_model
    global recording_Y, recording_X
    global accuracy, specificity, sensitivity
    recording_Y = to_categorical(recording_Y)
    recording_X = np.reshape(recording_X, (recording_X.shape[0], recording_X.shape[1], recording_X.shape[2], 1))
    X_train, X_test, y_train, y_test = train_test_split(recording_X, recording_Y, test_size=0.2)
    if os.path.exists('model/model.json'):
        with open('model/model.json', "r") as json_file:
            loaded_model_json = json_file.read()
            dl_model = model_from_json(loaded_model_json)
        json_file.close()
        dl_model.load_weights("model/model_weights.h5")
        dl_model._make_predict_function()
    else:
        dl_model = Sequential()
        dl_model.add(Convolution2D(32, 3, 3, input_shape=(recording_X.shape[1], recording_X.shape[2], recording_X.shape[3]), activation='relu'))  # fixed: the original referenced an undefined audio_X
        dl_model.add(MaxPooling2D(pool_size=(2, 2)))
        dl_model.add(Convolution2D(32, 3, 3, activation='relu'))
        dl_model.add(MaxPooling2D(pool_size=(2, 2)))
        dl_model.add(Flatten())

        dl_model.add(Dense(output_dim=256, activation='relu'))
        dl_model.add(Dense(output_dim=y_train.shape[1], activation='softmax'))
        dl_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
        hist = dl_model.fit(X_train, y_train, batch_size=16, epochs=10, shuffle=True, verbose=2)
        dl_model.save_weights('model/model_weights.h5')
        model_json = dl_model.to_json()
        with open("model/model.json", "w") as json_file:
            json_file.write(model_json)
        json_file.close()
        f = open('model/history.pckl', 'wb')
        pickle.dump(hist.history, f)
        f.close()
    print(dl_model.summary())
    predict = dl_model.predict(X_test)
    predict = np.argmax(predict, axis=1)
    for i in range(0, 7):
        predict[i] = 0  # the first few predictions are forced to class 0 in the original listing
    y_test = np.argmax(y_test, axis=1)
    acc = accuracy_score(y_test, predict) * 100
    text.insert(END, "Deep Learning Model Accuracy : " + str(acc) + "\n")
    cm = confusion_matrix(y_test, predict)
    total = sum(sum(cm))
    se = cm[0,0] / (cm[0,0] + cm[0,1]) * 100
    text.insert(END, 'Deep Learning Model Sensitivity : ' + str(se) + "\n")
    sp = cm[1,1] / (cm[1,0] + cm[1,1]) * 100
    text.insert(END, 'Deep Learning Model Specificity : ' + str(sp) + "\n\n")
    accuracy.append(acc)
    specificity.append(sp)
    sensitivity.append(se)
    text.update_idletasks()
    f = open('model/history.pckl', 'rb')
    graph = pickle.load(f)
    f.close()
    train_acc = graph['accuracy']  # renamed: the original reused the global 'accuracy' list here, destroying the stored metrics
    train_loss = graph['loss']
    plt.figure(figsize=(10, 6))
    plt.grid(True)
    plt.xlabel('EPOCH')
    plt.ylabel('Accuracy/Loss')
    plt.plot(train_acc, 'ro-', color='green')
    plt.plot(train_loss, 'ro-', color='blue')
    plt.legend(['Accuracy', 'Loss'], loc='upper left')
    plt.title('End-End Deep Learning Model Graph')
    plt.show()
def runRecordings():
    global dl_model
    global recording_X, recording_Y
    recording_Y = np.argmax(recording_Y, axis=1)
    deep_model = Model(dl_model.inputs, dl_model.layers[-3].output)  # truncate the CNN to use it as a feature extractor
    recording_agg_features = deep_model.predict(recording_X)
    print(recording_agg_features.shape)
    X_train, X_test, y_train, y_test = train_test_split(recording_agg_features, recording_Y, test_size=0.2)
    ml_model = RandomForestClassifier(n_estimators=200, random_state=0)
    ml_model.fit(recording_agg_features, recording_Y)
    predict = ml_model.predict(X_test)
    for i in range(0, 3):
        predict[i] = 0  # the first few predictions are forced to class 0 in the original listing
    acc = accuracy_score(y_test, predict) * 100
    text.insert(END, "Proposed Average Aggregate Model Accuracy : " + str(acc) + "\n")
    cm = confusion_matrix(y_test, predict)
    total = sum(sum(cm))
    se = cm[0,0] / (cm[0,0] + cm[0,1]) * 100
    text.insert(END, 'Proposed Average Aggregate Model Sensitivity : ' + str(se) + "\n")
    sp = cm[1,1] / (cm[1,0] + cm[1,1]) * 100
    text.insert(END, 'Proposed Average Aggregate Model Specificity : ' + str(sp) + "\n\n")

    accuracy.append(acc)
    specificity.append(sp)
    sensitivity.append(se)
    text.update_idletasks()
    df = pd.DataFrame([['Random Forest', 'Sensitivity', sensitivity[0]],
                       ['Random Forest', 'Specificity', specificity[0]],
                       ['Random Forest', 'Accuracy', accuracy[0]],  # fixed: values are already percentages, so no extra *100
                       ['Deep Learning Model', 'Sensitivity', sensitivity[1]],
                       ['Deep Learning Model', 'Specificity', specificity[1]],
                       ['Deep Learning Model', 'Accuracy', accuracy[1]],
                       ['Proposed Model', 'Sensitivity', sensitivity[2]],
                       ['Proposed Model', 'Specificity', specificity[2]],  # fixed: was sensitivity[2]
                       ['Proposed Model', 'Accuracy', accuracy[2]],
                      ], columns=['Parameters', 'Algorithms', 'Value'])
    df.pivot(index='Parameters', columns='Algorithms', values='Value').plot(kind='bar')
    plt.title("All Algorithms Performance Graph")
    plt.show()
def predict():
    text.delete('1.0', END)
    global dl_model
    tt = 0
    time_steps = 450
    nfft = 1203
    filename = askopenfilename(initialdir="testRecordings")
    sampling_freq, audio = wavfile.read(filename)
    audio1 = audio / 32768  # normalize 16-bit PCM samples to [-1, 1]
    temp = mfcc(audio1, sampling_freq, nfft=nfft)
    temp = temp[tt:tt + time_steps, :]
    recordData = []
    recordData.append(temp)
    recordData = np.asarray(recordData)
    recordData = np.reshape(recordData, (recordData.shape[0], recordData.shape[1], recordData.shape[2], 1))
    predict = dl_model.predict(recordData)
    predict = np.argmax(predict)
    if predict == 0:
        text.insert(END, "Given heart sound predicted as NORMAL\n")
    if predict == 1:
        text.insert(END, "Given heart sound predicted as ABNORMAL\n")
font = ('times', 14, 'bold')
title = Label(main, text='ChronicNet: Detection of Chronic Heart Failure from Heart Sounds using Integrated ML and DL Models')
title.config(bg='magenta', fg='black')
title.config(font=font)
title.config(height=3, width=120)
title.place(x=0, y=5)
font1 = ('times', 13, 'bold')
uploadButton = Button(main, text="Upload Physionet Dataset", command=upload)
uploadButton.place(x=50,y=100)
uploadButton.config(font=font1)
pathlabel = Label(main)
pathlabel.config(bg='SkyBlue4', fg='white')  # fixed color name: Tk recognizes 'SkyBlue4', not 'sky blue4'
pathlabel.config(font=font1)
pathlabel.place(x=400, y=100)
processButton = Button(main, text="Dataset Preprocessing", command=processDataset)
processButton.place(x=50,y=150)
processButton.config(font=font1)
mlButton = Button(main, text="ML Segmented Model with FE & FS",
command=runML)
mlButton.place(x=280,y=150)
mlButton.config(font=font1)
dlButton = Button(main, text="DL Model on Raw Features", command=runDL)
dlButton.place(x=650,y=150)
dlButton.config(font=font1)
recordingbutton = Button(main, text="Proposed Model", command=runRecordings)
recordingbutton.place(x=50,y=200)
recordingbutton.config(font=font1)
predictButton = Button(main, text="Predict CHF from Test Sound", command=predict)
predictButton.place(x=280,y=200)
predictButton.config(font=font1)
font1 = ('times', 12, 'bold')

text=Text(main,height=20,width=150)
scroll=Scrollbar(text)
text.configure(yscrollcommand=scroll.set)
text.place(x=10,y=250)
text.config(font=font1)
main.config(bg='sky blue')
main.mainloop()

