
Publications of the Astronomical Society of Australia (2022), 1–8

doi:

REVIEW

Overview of machine learning and deep learning models for acoustic signal-based COVID-19 identification

Y. Lakrad, Y. Siyah, and M. EL Omari
Computer and Telecommunication Research Laboratory, Master IT, Faculty of Science, Rabat, Morocco
Author for correspondence: Y. Siyah, Email: youssef.siyah@um5r.ac.ma.

(Received ; revised ; accepted ; first published online )

Abstract
COVID-19 can be pre-screened on the basis of symptoms and confirmed by laboratory tests such as reverse transcription polymerase chain reaction (RT-PCR). The RT-PCR test is expensive, time consuming, and requires contact that violates social distancing rules. The rapid antigen test (RAT) is another molecular testing method that alleviates the time limitation of RT-PCR but suffers from a high false negative rate (low sensitivity). It is therefore necessary to look for alternative methods to detect COVID-19. The techniques presented in this overview are machine learning and deep learning techniques based on the sound of coughing and breathing. These techniques can detect the virus quickly and give everyone the opportunity to test at home, without going to a laboratory, by recording audio clips of their coughing and breathing sounds and submitting the data anonymously. In this overview, we examine different machine learning and deep learning methods. After a comparison between these methods, the LSTM, 8-KNN and GA-ML models were found to give an accuracy of 98%, which is the highest reported among the models covered in this overview.
Keywords: COVID-19, Machine learning, Deep learning, Audio

1. INTRODUCTION
The COVID-19 pandemic, also known as the coronavirus pandemic, is an ongoing global pandemic of coronavirus disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The novel virus was first identified from an outbreak in Wuhan, China, in December 2019. Attempts to contain it there failed, allowing the virus to spread to other areas of China and later worldwide. The World Health Organization (WHO) declared the outbreak a public health emergency of international concern on 30 January 2020 and a pandemic on 11 March 2020. As of 22 August 2022, the pandemic had caused more than 596 million cases and 6.45 million confirmed deaths, making it one of the deadliest in history. Figure 1 shows the evolution of COVID-19 cases and deaths up to August 2022. The WHO reports fever, dry cough, loss of taste and smell, and fatigue as the most common symptoms of COVID-19; the symptoms of a severe COVID-19 condition are mainly shortness of breath, loss of appetite, confusion, persistent pain or pressure in the chest, and temperature above 38 degrees Celsius. Monitoring the development of the pandemic and screening the population for symptoms is mandatory. Arguably the procedures most used are temperature measurement, e.g. before boarding a plane, and diverse corona rapid tests, e.g. before being allowed to visit a care home. In the clinical test for diagnosing COVID-19 infection, an anterior nasal swab sample is collected, as suggested by Hanson et al. [17].
To date, reverse transcription polymerase chain reaction (RT-PCR) is considered the gold standard for testing coronavirus [32]. However, the RT-PCR test requires person-to-person contact to administer, needs variable time to produce results, and is still unaffordable to most global populations. Sometimes it is unpleasant for children. Not least, this test is not yet accessible to people living in remote areas, where medical facilities are scarce [2]. Alarmingly, physicians suspect that the general public refuses the COVID-19 test for fear of stigma [1]. Governments worldwide have initiated free massive testing campaigns to stop the spread of this virus, and these campaigns are costing them billions of dollars per day at an average rate of $23 per test [30]. Hence, easily accessible, quick, and affordable testing is essential to limit the spread of the virus. COVID-19 detection methods using human audio signals can play an important role here.
Recently, AI has been extensively implemented in the digital health sector and specifically in the COVID-19 health crisis, due to the variety of information it provides, such as COVID-19 growth-rate detection, risk and infection severity identification, and death prediction. AI is a broad umbrella that consists of many subdivisions, including machine learning (ML) and deep learning (DL), which both imitate the functionality of the human brain and its behaviors, based on the data that is fed to them [18]. AI has numerous applications in both speech-signal processing and digital-image processing. Both facilitate the process of controlling, monitoring, and overcoming the COVID-19 epidemic through their four-step procedure: detection, prevention, recovery, and response [28]. Scientists believe that it is possible to determine the presence of a COVID-19 infection by analyzing the sounds generated by the respiratory system, whether cough, breathing, or regular speech. Furthermore, medical imaging data such as chest X-ray and lung computed tomography (CT) scans can be beneficial in COVID-19 detection [5].

In this overview, we present several machine learning and deep learning techniques applied to sounds of the respiratory system (coughing, breathing, speech) and evaluate these techniques to select the best performing model against COVID-19. The organization of this document is as follows: Section 2 presents the data sets collected and details the classification methods with their performance. In Section 3, the experiments and results are detailed in a table. Section 4 defines machine and deep learning and then presents the steps for developing a machine learning model. Finally, the conclusion of the reported work and the future directions of the proposed approach are presented in Section 5.

Figure 1. COVID-19 cases reported weekly by WHO Region, and global deaths, as of 21 August 2022

2. Materials and methods
Model 1: In 2022, Rumana Islam, Esam Abdel-Raheem and Mohamed Tarique [20] provided a method that uses deep learning to automatically classify and detect COVID-19 disease from cough sound. This model used a database of cough sound samples from COVID-19 patients called Virufy. Virufy is a volunteer-run organization which has built a global database to identify COVID-19 patients using AI. The database contains both clinical and participatory data. Clinical data is accurate, as it was collected and processed in a hospital according to a standard operating procedure (SOP). Subjects were confirmed as healthy individuals (i.e. COVID-19 negative) or COVID-19 patients (i.e. COVID-19 positive) using the RT-PCR test, and the data was labeled accordingly.
Model 2: In 2021, Yunus Emre Erdoğan and Ali Narin [15] provided a method based on machine learning algorithms to automatically classify and detect COVID-19 disease from cough sound. The first step of this model is the extraction of traditional features using the empirical mode decomposition (EMD) and the discrete wavelet transform (DWT) of the acoustic cough data; these features are then selected by the ReliefF algorithm and classified by a support vector machine (SVM). This method used a database consisting of cough acoustic samples of COVID-19(+) and COVID-19(-) subjects obtained from the open access site https://virufy.org/. The data was provided by a mobile application developed by Stanford University. The data belong to a total of 1187 people, all labeled as positive or negative according to the results of the RT-PCR test. The database contains 595 COVID-19(+) and 592 COVID-19(-) tagged people. All data has been noise-removed and, in addition, all data were normalized before the study.
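To make the Model 2 pipeline more concrete, the sketch below shows one plausible implementation of wavelet-based feature extraction followed by feature selection and SVM classification. It is only an illustration under assumptions: PyWavelets is used for the DWT, the ReliefF selection of [15] is replaced by a univariate ANOVA F-test selection as a stand-in, and the cough recordings and labels are random placeholders.

# Illustrative sketch only: DWT feature extraction + feature selection + SVM,
# loosely following the Model 2 pipeline [15]. ReliefF is replaced here by a
# univariate ANOVA F-test selection as a stand-in; data are random placeholders.
import numpy as np
import pywt
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def dwt_features(signal, wavelet="db4", level=5):
    """Summarize each DWT sub-band by simple statistics (energy, mean, std)."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    feats = []
    for c in coeffs:
        feats.extend([np.sum(c ** 2), np.mean(c), np.std(c)])
    return np.array(feats)

# Hypothetical dataset: 200 one-second cough clips at 16 kHz with binary labels.
rng = np.random.default_rng(0)
clips = rng.standard_normal((200, 16000))
labels = rng.integers(0, 2, size=200)

X = np.array([dwt_features(c) for c in clips])
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.3,
                                                    random_state=0)

model = make_pipeline(StandardScaler(),
                      SelectKBest(f_classif, k=10),
                      SVC(kernel="linear"))
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))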
Model 3: In 2022, Madhurananda Pahar, Marisa Klopper, Robin Warren and Thomas Niesler [25] provided a method to detect COVID-19 in cough, breath and speech using deep transfer learning and bottleneck features. The first step of this model is the extraction of the characteristics of the samples, which are the mel frequency cepstral coefficients (MFCC), the energies of linearly spaced and logarithmic filter banks, their velocity and acceleration coefficients, the zero-crossing rate (ZCR) of the signal [34] and the kurtosis [34]. Feature selection is then performed using CNN, LSTM and ResNet50 architectures to improve the performance of COVID-19 classification on cough, breath and speech audio signals, and the shallow classifiers are LR, SVM, KNN and MLP. This method used a database containing 10.29 h of annotated audio recordings with four class labels (cough, speech, sneeze, noise), available to pre-train the neural architectures. The composition of this dataset is summarized as 11202 cough sounds (2.45 h of audio), 2.91 h of speech from male and female participants, 1013 sneeze sounds (13.34 min of audio) and 2.98 h of other unvoiced sounds.
Model 4: In 2022, Ezz El-Din Hemdan, Walid El-Shafai and Amged Sayed [19] provided a method called CR19 to detect COVID-19 in cough audio signals using machine learning algorithms for automated medical diagnostic applications. CR19 consists of a hybrid GA-ML with six different machine learning algorithms, namely Linear Regression, Naïve Bayes, K Nearest Neighbors, Support Vector Machine, Logistic Regression, and Decision Tree. The first stage of this model is data acquisition and collection; in this phase the audio data are collected from mobile phone sources. The second stage is data preprocessing, in which the collected data are loaded and adapted for further processing within the proposed framework, so that a positive or negative COVID-19 case can be indicated for each audio signal in the data set. Thereafter, phase 3 performs the training, and finally the classification and detection of COVID-19 in cough data is carried out.
Model 5: In 2022, Mahmoud Aly, Kamel H. Rahouma and Safwat M. Ramzy [6] provided a COVID-19 diagnostic method using machine learning and crowdsourced breathing and voice recordings. The network architecture of this deep learning model is very simple; it contains three layers: the first layer contains 16 nodes with a ReLU activation function, the second layer is a dropout layer (rate 0.5) to reduce overfitting, and the third layer contains a single node with a sigmoid activation function. This model used the Coswara dataset, where each user recorded 9 different types of sounds such as coughing, breathing, and speaking, tagged with COVID-19 status. The database contains 9 separate datasets; each dataset contains samples of a single type of sound labeled (deep breath, shallow breath, heavy cough, shallow cough, fast count, normal count, vowel-A, vowel-E, vowel-O) with COVID-19 status (positive) or (negative). The total number of samples in each dataset was 1299, of which 126 were positive and 1173 negative.
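As an illustration of how small the Model 5 network is, the following sketch builds a comparable three-layer classifier in Keras. The input dimension, placeholder data and training settings are hypothetical, not the exact configuration reported in [6].

# Minimal sketch of a three-layer binary classifier in the spirit of Model 5 [6]:
# a 16-node ReLU layer, a dropout layer (rate 0.5), and a single sigmoid output.
# The feature dimension and data below are hypothetical placeholders.
import numpy as np
import tensorflow as tf

num_features = 40                       # assumed size of the per-clip feature vector
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(num_features,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Placeholder arrays standing in for labelled Coswara feature vectors.
rng = np.random.default_rng(0)
X = rng.standard_normal((1299, num_features)).astype("float32")
y = rng.integers(0, 2, size=1299).astype("float32")
model.fit(X, y, epochs=3, batch_size=32, validation_split=0.2, verbose=0)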
Model 6: In 2022, I. Anee Daisy and R. Lavanya [12] provided a deep learning-based method for non-vocal human sound classification and COVID-19 cough detection. This model uses a convolutional neural network (CNN) to extract the characteristics of each audio signal; the acoustic characteristics are selected in the time and frequency domains, including the short-term energy, the loudness, the zero-crossing rate (ZCR), the power spectral density (PSD), and the spectral entropy. This model used the Coswara dataset, where each user recorded 9 different types of sounds such as coughing, breathing, and speaking, tagged with COVID-19 status. The COVID-19 identification in this study is based on the ResNet50 variant, with a database containing different types of audio sounds such as cough sounds, snoring, sneezing and COPD.
Model 7: In 2022, Mohammed Usman, Vinit Kumar Gunjan, Mohd Wajid, Mohammed Zubair and Kazy Noor-e-alam Siddiquee [33] provided a method using speech as a biomarker for COVID-19 detection based on machine learning. This model runs on the Microsoft Azure Machine Learning Studio (MAMLS) cloud platform to perform binary classification, and classification performance is analyzed and compared for five state-of-the-art classification algorithms available in MAMLS. This model uses short-term Fourier transform (STFT) coefficients as the characteristics of each audio recording. The algorithms used on this platform are NN, SVM, LoR, BDT and DF. The speech recordings used in this study include two categories: speech from healthy individuals with no known pre-existing medical condition at resting heart rate, and speech from asymptomatic COVID-19 positive people.
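The hand-crafted time and frequency descriptors mentioned for Models 6 and 7 (short-term energy, ZCR, PSD-based spectral entropy) can be computed with a few lines of NumPy/SciPy. The sketch below is a generic illustration on a synthetic one-second signal, not the exact feature extractor used in [12] or [33].

# Generic illustration of the time/frequency features cited for Model 6 [12]:
# short-term energy, zero-crossing rate (ZCR) and spectral entropy from the PSD.
# The one-second synthetic signal below stands in for a real cough recording.
import numpy as np
from scipy.signal import welch

fs = 16000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 440 * t) + 0.1 * np.random.default_rng(0).standard_normal(fs)

def short_term_energy(x, frame=400, hop=160):
    frames = [x[i:i + frame] for i in range(0, len(x) - frame, hop)]
    return np.array([np.sum(f ** 2) for f in frames])

def zero_crossing_rate(x, frame=400, hop=160):
    frames = [x[i:i + frame] for i in range(0, len(x) - frame, hop)]
    return np.array([np.mean(np.abs(np.diff(np.sign(f)))) / 2 for f in frames])

def spectral_entropy(x, fs):
    freqs, psd = welch(x, fs=fs)           # power spectral density (PSD)
    p = psd / np.sum(psd)                   # normalize to a probability distribution
    return -np.sum(p * np.log2(p + 1e-12))

print("mean short-term energy:", short_term_energy(signal).mean())
print("mean ZCR:", zero_crossing_rate(signal).mean())
print("spectral entropy (bits):", spectral_entropy(signal, fs))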
Model 8: In 2021, Ahmed Fakhry, Xinyi Jiang, Jaclyn Xiao, Gunvant Chaudhari, Asriel Han, Amil Khanzada et al. of Virufy (a non-profit research organization developing artificial intelligence (AI) technology to screen for COVID-19 from cough) [16] provided a deep learning-based method for classification of non-vocal human sounds and detection of COVID-19 cough. This model uses multi-branch deep learning (a multimodal data-based prognosis model) and the Coughvid dataset, a database of breath, cough, and voice sounds for COVID-19 diagnosis. The study reported experimental results reaching an AUC of 91%. Finally, this study aims to draw attention to the importance of the human voice alongside other respiratory sounds for sound-based COVID-19 diagnosis.
Model 9: In 2021, Mahmoud Aly, Kamel H. Rahouma and Safwat M. Ramzy [6] provided a machine learning-based method for classification of non-vocal human sounds and detection of COVID-19 cough. This model uses ResNet50 (a 50-layer deep convolutional neural network) on the Coswara dataset, where each user recorded 9 different types of sounds such as coughing, breathing, and speaking, labeled with COVID-19 status. A combination of models trained on different sounds can diagnose COVID-19 more accurately than a single model trained on cough or breath only. Finally, this study aims to draw attention to the importance of the human voice alongside other respiratory sounds for sound-based COVID-19 diagnosis.
Model 10: In 2021, Harry Coppock, Alex Gaskell, Panagiotis Tzirakis, Alice Baird, Lyn Jones, Björn Schuller et al. [11] provided a machine learning-based method for classification of non-vocal human sounds and detection of COVID-19 cough. This model uses a ResNet (an artificial neural network (ANN)) with mel-spectrogram features, and it used a database of breath, cough, and voice sounds from COVID-19 positive people for COVID-19 diagnosis.
Model 11: In 2021, Pahar et al. presented a machine learning-based COVID-19 cough classifier that can discriminate COVID-19 positive coughs from negative and healthy coughs recorded on a smartphone. The datasets used contain both forced and natural coughs; the Coswara dataset, which is publicly available, contains 92 COVID-19 positive subjects and 1079 healthy subjects. The method uses a residual-based CNN, ResNet50, which is the classifier that gives the best performance.
Model 12: In 2022, Kawther A. Al-Dhlan [4] provided a method for detecting COVID-19 using a deep learning approach. The system proposed in this study consists of two steps, preprocessing and classification. A least mean squares filter removes artifacts or noise from the input speech signal in the preprocessing stage. After preprocessing, a Generative Adversarial Network (GAN) classifier analyzes the filtered signal to classify COVID-19 and non-COVID-19 signals. The database used in this model contains healthy and unhealthy sound samples, including COVID-19 labels.
Model 13: In 2022, Ali Bou Nassif, Ismaïl Shahin, Mohamed Bader, Abdelfatah Hassan and Naoufel Werghi [23] provided a COVID-19 detection system using deep learning algorithms based on speech and image data. In this study, the features extracted for the optimal representation of the speech signal are the Mel frequency cepstral coefficients (MFCC) [2], and the model used is the long short-term memory model (LSTM). LSTM is an advanced variant of RNN: it stores information over an extended period of time, and past data is easier to retrieve from memory. The database used in this model consists of 1159 cough, breath and speech sound samples obtained from 592 participants, divided into 379 healthy patients and 213 COVID-19 infected patients. All samples in the dataset were captured using a moving microphone.
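A minimal sketch of an MFCC-plus-LSTM classifier in the spirit of Model 13 is shown below. It assumes librosa for MFCC extraction and Keras for the network; the recording path, sequence length and layer sizes are placeholders rather than the configuration reported in [23].

# Minimal MFCC + LSTM sketch in the spirit of Model 13 [23]. librosa is assumed
# for feature extraction; the audio, sequence length and layer sizes are placeholders.
import numpy as np
import librosa
import tensorflow as tf

def mfcc_sequence(path, sr=16000, n_mfcc=13):
    """Load a recording and return its MFCC sequence, shape (frames, n_mfcc)."""
    y, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

n_frames, n_mfcc = 100, 13              # assumed fixed-length MFCC sequences
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_frames, n_mfcc)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Placeholder tensors standing in for labelled cough/breath/speech MFCC sequences.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, n_frames, n_mfcc)).astype("float32")
y = rng.integers(0, 2, size=256).astype("float32")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)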

Model 14: In 2022, Ezz El-Din Hemdan, Walid El-Shafai and Amged Sayed [19] used a novel machine learning approach to detect COVID-19 patients, both symptomatic and asymptomatic. A research group at the University of Cambridge shared a dataset of cough and breath sound samples from 582 healthy subjects and 141 COVID-19 patients; the collected dataset includes data from 245 healthy individuals, 78 asymptomatic and 18 symptomatic COVID-19 patients. The CNN stacking model is based on a meta-learning logistic regression classifier using spectrograms (the energy distribution of the sound plotted against time and frequency) generated from the users' data, using the combined data set (Cambridge and collected).
Model 15: In 2020, Kotra Venkata Sai Ritwik, Shareef Babu Kalluri and Deepu Vijayasenan [29] attempted to study the presence of cues regarding COVID-19 disease in voice data. They use an approach that is similar to speaker recognition. The dataset consists of audio clips extracted from YouTube videos of television interviews (often a video call) of COVID-19 positive patients. The speech portion of the audio was extracted at a sampling rate of 44.1 kHz. The audio samples were converted to a single-channel (mono) wav format, passed through a 300 Hz to 3.4 kHz bandpass filter to simulate telephone-quality speech, and finally downsampled to 8 kHz. They collected speech data from nineteen speakers, including ten COVID-19 positive speakers and nine COVID-19 negative speakers; the composition of male and female speakers in each class is presented in Table 1 of their paper, together with the number of utterances per class and gender. In this work, they also use the short-term mel spectrum as low-level features and follow an approach similar to the supervectors originally used in speaker recognition to extract features at the utterance level. An SVM trained with these features is used to predict COVID-19 from the speech data.
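The telephone-bandwidth preprocessing described for Model 15 (mono conversion, 300-3400 Hz bandpass, downsampling to 8 kHz) can be reproduced with SciPy. The sketch below is a generic illustration on a synthetic 44.1 kHz signal, not the authors' exact code.

# Generic illustration of the Model 15 preprocessing chain [29]:
# 300-3400 Hz bandpass filtering followed by downsampling from 44.1 kHz to 8 kHz.
# The synthetic tone below stands in for speech extracted from an interview video.
import numpy as np
from scipy.signal import butter, filtfilt, resample_poly

fs_in, fs_out = 44100, 8000
t = np.arange(fs_in) / fs_in                      # one second of audio
speech = np.sin(2 * np.pi * 1000 * t)             # placeholder "speech" signal

# 4th-order Butterworth bandpass, 300-3400 Hz, applied forward and backward.
b, a = butter(4, [300, 3400], btype="bandpass", fs=fs_in)
filtered = filtfilt(b, a, speech)

# Rational-rate resampling: 44100 * (80 / 441) = 8000 Hz.
telephone_quality = resample_poly(filtered, up=80, down=441)
print(len(telephone_quality))                      # ~8000 samples for one second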
Model 16: In 2021, Rohan Kumar Das, Maulik Madhavi and Haizhou Li [13] focused on new acoustic front-ends for the detection of COVID-19, since the data for the challenge is very limited. They consider auditory acoustic indices based on the long-term transform, the gammatone filter bank and the equivalent rectangular bandwidth (ERB) spectrum to capture discriminative signal characteristics for COVID-19 detection. The database released for the DiCOVA challenge is derived from the Coswara corpus, which is collected via a crowd-sourcing platform from COVID-19 positive and negative individuals. Each participant provided 9 audio recordings, including soft and loud cough, soft and deep breathing, sustained phonation of vowels, and counting 1-20 at fast and normal rates using the web application. The collected data are then divided into two tracks for the DiCOVA Challenge: Track 1 includes sound recordings of the individuals' coughs, while Track 2 consists of deep breathing, vowels, and normal-speed number counts. The authors focus only on the Track 1 database in this work. They find that their three auditory acoustic features perform better than the baseline MFCC features with the LR and RF classifiers; among them, erbSpec-RF obtains the best AUC.
Model 17: In 2022, Ezz El-Din Hemdan, Walid El-Shafai and Amged Sayed [19] proposed a framework for efficient detection and diagnosis of COVID-19 using hybrid machine learning algorithms with genetic algorithms applied to cough audio signals. The proposed CR19 framework combines genetic algorithms (a genetic algorithm (GA) is a metaheuristic that belongs to the broader class of evolutionary algorithms (EA)) with classical classifiers to improve the efficiency and accuracy of classification on the cough dataset. GA-KNN performs better than the other classifiers, indicating that this model provides a very good diagnosis of COVID-19 from the patient's cough; GA-KNN shows an accuracy of more than 97% in the diagnosis of COVID-19. Finally, this study aims to draw attention to the importance of the human voice and other respiratory sounds for the sound-based diagnosis of COVID-19.
In all the above works, the working steps are the same, and the only difference is the choice of the algorithm that the researchers use for the same purpose, which is to achieve an accurate and reliable system to detect the presence of the coronavirus in the subjects to be tested. Figure 2 represents the different stages of audio processing used to diagnose COVID-19.

Figure 2. Framework for classification of COVID-19 status on speech

3. RESULTS
3.1 Performances
To evaluate the results of a given system, either in machine learning or in deep learning, the evaluation metrics of this system (accuracy, recall, F1-score, specificity and precision) are calculated as shown in Table 1, based on the confusion matrix entries (TP, TN, FP and FN), which are the numbers of true positives, true negatives, false positives and false negatives [21], respectively.

• TP: the number of correctly classified positive samples.
• FN: the number of positive samples misclassified as negative.
• FP: the number of negative samples misclassified as positive.
• TN: the number of correctly classified negative samples.

Based on this matrix, the performance measures are computed as shown in the following table:

Table 1. The metric equations.
Equation Number | Equation Name | Equation
1 | Accuracy | (TN + TP)/(TN + TP + FN + FP)
2 | Recall | TP/(TP + FN)
3 | Specificity | TN/(TN + FP)
4 | Precision | TP/(TP + FP)
5 | F1-Score | 2 x ((Precision x Recall)/(Precision + Recall))
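As a concrete illustration of the formulas in Table 1, the short sketch below computes all five metrics from hypothetical confusion-matrix counts. It is a generic helper, not code from any of the reviewed papers.

# Illustration of the Table 1 metrics computed from hypothetical confusion-matrix
# counts (TP, TN, FP, FN). These numbers are placeholders, not results from the
# reviewed studies.
def classification_metrics(tp, tn, fp, fn):
    accuracy = (tn + tp) / (tn + tp + fn + fp)
    recall = tp / (tp + fn)                      # also called sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * (precision * recall) / (precision + recall)
    return {"accuracy": accuracy, "recall": recall, "specificity": specificity,
            "precision": precision, "f1": f1}

print(classification_metrics(tp=90, tn=85, fp=15, fn=10))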
3.2 Results & Discussion
Model 1: The performance of the proposed system [20] in terms of accuracy, precision, F1 score and NPV for the time-domain feature vector is 0.912, 0.829, 0.889 and 0.873 respectively; for the frequency-domain feature vector it is 1.000, 0.975, 0.974 and 0.952; and for the mixed-domain feature vector it is 0.941, 0.938, 0.937 and 0.934. The comparison between these vectors shows that the proposed system achieves better performance with the frequency-domain feature vector computed from the cough sound samples, and is thus more accurate and reliable than the other vectors.
Model 2: In this study [15], the highest performance values obtained for the linear SVM are a precision of 98.4%, a recall of 99.5%, a specificity of 97.3%, an accuracy of 97.4% and an F1 score of 98.6%, using the EMD and DWT measurements (traditional features) and ReliefF feature selection. The highest performance values obtained from ResNet50 deep features selected with ReliefF for the CNN are a precision of 97.8%, recall of 98.5%, specificity of 97.3%, accuracy of 97.4%, and an F1 score of 98.0%. From this comparison, it can be stated that features obtained by traditional feature approaches show higher performance than deep features.
Model 3: In this study [25], transfer learning and bottleneck feature extraction are evaluated using CNN, LSTM and ResNet50 architectures to improve the performance of COVID-19 classification based on cough, breath and speech audio cues. The experimental results provided by the proposed system reach an area under the ROC curve (AUC) of 0.982 for cough, followed by an AUC of 0.942 for breathing and 0.923 for speech.
Model 4: In this study [19], a method called CR19 is provided to detect COVID-19 in cough audio signals using machine learning algorithms for automated medical diagnostic applications. The study provides experimental results demonstrating that the GA improves the accuracy of all classifiers compared to non-GA based techniques. Among these six classifiers, GA-KNN was more than 97% accurate in diagnosing COVID-19 from cough audio signals.
Model 5: In this study [6], the experimental results for the combination of sounds (shallow breathing, heavy cough, shallow cough, fast count, E vowel, O vowel) reached an AUC of 96.4%, and the AUC for the combination (heavy cough, shallow cough, E vowel) was 92%. Ultimately, this study demonstrates the importance of using different breath, cough, and speech sounds for accurate and more reliable diagnosis and screening of COVID-19.
Model 6: In this study [12], the model successfully detected COVID-19 subjects with a maximum sensitivity of 94.21%, a specificity of 94.96% and an area under the receiver operating characteristic curve (AUROC) of 0.90.
Model 7: The experimental results obtained in [33] using the five classification algorithms showed an average level of performance, with the evaluated measures around 70%, but the best performance was observed for the decision-tree-based algorithm, which achieved the highest recall value of 0.7892. A higher recall value means fewer misclassifications of the "false negative" kind: the cost of misclassifying a COVID-19 positive sample as COVID-19 negative is high, and therefore recall is the evaluation measure that should be maximized when adjusting model parameters.
Model 8: In this study [16], the model adopts multi-branch deep learning (a multimodal data-based prediction model), and the experimental results reach an AUC of 91%. Finally, this study aims to draw attention to the importance of the human voice and other breath sounds for sound-based diagnosis of COVID-19.
Model 9: In this study [6], the results show that by averaging the predictions of several separately trained and evaluated models for different types of sounds, a simple binary classifier achieves an AUC of 96.4%. Finally, this study aims to draw attention to the importance of the human voice alongside other breath sounds in the sound-based diagnosis of COVID-19.
Model 10: In this study [11], the results show that an AUC of 96.4% can be achieved using a simple binary classifier.
Model 11: In this study, the method uses a CNN based on residuals, ResNet50, which is the best performing classifier, with an AUC of 94%.
Model 12: The proposed system [4] reduces the validation loss and increases the validation accuracy, which enables the model to learn with a low root mean square error; the experimental results obtained using the proposed GAN method are an accuracy of 96.54%, a recall of 96.15%, a precision of 98.56%, and an F-measure of 0.96. These results are good compared to the ANN, CNN and RNN models [4].
Model 13: For the proposed system [23], three types of experiments were conducted, using speech-based, image-based, and speech-and-image-based models. Long short-term memory (LSTM) was used for the vocal classification of the patient's cough, voice and breathing, achieving a precision greater than 98%. Additionally, the CNN models VGG16, VGG19, DenseNet201, ResNet50, InceptionV3, InceptionResNetV2, and Xception were calibrated for the classification of chest X-ray images. The VGG16 model outperforms all other CNN models, achieving 85.25% precision without fine-tuning and 89.64% after applying fine-tuning techniques. Additionally, the speech-and-image-based model was evaluated using the same seven models, achieving a precision of 82.22% with the InceptionResNetV2 model. From this comparison of the results we conclude that the best model is the LSTM, which is based only on speech.
Model 14: In this study, a novel machine learning method with a CNN as a helper is used to detect COVID-19 cases that are both symptomatic and asymptomatic. The accuracy, sensitivity, and specificity were 96.5%, 96.42%, and 95.47% for symptomatic cases and 98.85%, 97.01%, and 99.6% for asymptomatic cases, respectively.
Model 15: In this study [29], the authors use the short-term mel spectrum as low-level features and follow an approach similar to the supervectors originally used in speaker recognition to extract features at the utterance level. An SVM trained with these features is used to predict COVID-19 from the speech data, with an accuracy of 88.6% and an F1-score of 92.7%.
Model 16: In this study [13], the three auditory acoustic features performed better than the baseline MFCC features with the LR and RF classifiers. Among them, erbSpec-RF achieves the best AUC of 73.4% on the validation set, showing its strong potential for detecting COVID-19 from cough sounds.
Model 17: In this study [19], genetic algorithms (meta-heuristics belonging to the broader class of evolutionary algorithms (EA)) are used to improve the efficiency and accuracy of classification on the cough dataset. GA-ANN outperforms the other classifiers, demonstrating that this model provides a very good diagnosis of COVID-19 based on the patient's cough, with an accuracy of more than 97%. Finally, this study aims to draw attention to the importance of the human voice and other breath sounds in the sound-based diagnosis of COVID-19.
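To make the GA-plus-classifier idea behind CR19 more tangible, the sketch below wraps a small hand-rolled genetic algorithm around a KNN classifier to select feature subsets. It is a generic illustration on synthetic data, not the CR19 implementation of [19], and all parameters (population size, generations, mutation rate) are arbitrary.

# Generic sketch of GA-based feature selection around a KNN classifier, in the
# spirit of the GA-ML / CR19 idea [19]. Synthetic data, arbitrary GA settings.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=30, n_informative=8,
                           random_state=0)

def fitness(mask):
    """Cross-validated KNN accuracy using only the features selected by `mask`."""
    if mask.sum() == 0:
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(knn, X[:, mask.astype(bool)], y, cv=3).mean()

pop_size, n_generations, mutation_rate = 20, 15, 0.05
population = rng.integers(0, 2, size=(pop_size, X.shape[1]))

for _ in range(n_generations):
    scores = np.array([fitness(ind) for ind in population])
    # Tournament selection: keep the better of two randomly chosen individuals.
    parents = []
    for _ in range(pop_size):
        i, j = rng.integers(0, pop_size, size=2)
        parents.append(population[i] if scores[i] >= scores[j] else population[j])
    parents = np.array(parents)
    # Single-point crossover between consecutive parents.
    children = parents.copy()
    for k in range(0, pop_size - 1, 2):
        point = rng.integers(1, X.shape[1])
        children[k, point:], children[k + 1, point:] = (
            parents[k + 1, point:].copy(), parents[k, point:].copy())
    # Bit-flip mutation.
    flips = rng.random(children.shape) < mutation_rate
    population = np.where(flips, 1 - children, children)

best = population[np.argmax([fitness(ind) for ind in population])]
print("selected features:", np.flatnonzero(best), "accuracy:", fitness(best))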
4. Artificial Intelligence
4.1 Machine Learning
Machine Learning, or automatic learning [8], is a scientific field and, more specifically, a subcategory of artificial intelligence. It consists of letting algorithms discover "patterns", namely recurring structures, in data sets. This data can be numbers, words, images, statistics, etc. The development of a Machine Learning model involves four main steps, represented in Figure 3.
The first step is to select and prepare a set of training data. This data will be used to feed the Machine Learning model so that it learns how to solve the problem for which it is designed. The data can be labeled, in order to indicate to the model the characteristics it will have to identify; it can also be unlabeled, in which case the model will have to spot and extract the recurring features by itself.
The second step is to select an algorithm to run on the training dataset. The type of algorithm to use depends on the type and volume of training data and on the type of problem to be solved.
The third step is training the algorithm. This is an iterative process: variables are run through the algorithm, the results are compared with those it should have produced, and the "weights" and bias are then adjusted to increase the accuracy of the result. The variables are run again until the algorithm produces the correct result most of the time. The algorithm, thus trained, is the Machine Learning model.
The fourth and final step is to use and improve the model. We use the model on new data, the origin of which depends on the problem to be solved. For example, a Machine Learning model designed to detect spam will be used on emails. A minimal code sketch of this four-step workflow is given at the end of this section.
Machine learning algorithms are divided into two categories, supervised and unsupervised algorithms [7]. Supervised machine learning is an elementary but strict technique: operators present the computer with sample inputs and the desired outputs, and the computer searches for a way to produce those outputs from those inputs. The goal is for the computer to learn the general rule that maps inputs to outputs. The main supervised machine learning algorithms are: random forests, decision trees, the K-NN (k-Nearest Neighbors) algorithm, linear regression, the Naïve Bayes algorithm, support vector machines (SVM), logistic regression and gradient boosting. In unsupervised machine learning, the algorithm itself determines the structure of the input (no labels are applied to the algorithm). This approach can be a goal in itself (which makes it possible to discover structures buried in the data) or a means to achieve a certain goal; it is also called "feature learning". The main unsupervised machine learning algorithms are: K-Means, hierarchical clustering and dimensionality reduction.

4.2 Deep learning
Deep Learning [8] is a branch of Machine Learning, but it is the most commonly used today. It is an invention of Geoffrey Hinton, dating from 1986. Simply put, Deep Learning is an improved version of Machine Learning. Deep learning uses a technique that gives it a superior ability to detect even the most subtle patterns: the deep neural network. The depth corresponds to the large number of layers of computing nodes that make up these networks and work together to process data and deliver predictions. These neural networks are directly inspired by the functioning of the human brain: computing nodes are comparable to neurons, and the network itself is similar to the brain.
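The following sketch illustrates the four-step supervised workflow of Section 4.1 with scikit-learn. The data is a synthetic placeholder, and the choice of a K-NN classifier is only one example among the supervised algorithms listed above.

# Illustration of the four-step machine learning workflow from Section 4.1,
# using scikit-learn and synthetic placeholder data.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Step 1: select and prepare labeled training data (synthetic here).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

# Step 2: choose an algorithm suited to the data and problem (here, K-NN).
model = KNeighborsClassifier(n_neighbors=8)

# Step 3: train the algorithm on the training set.
model.fit(X_train, y_train)

# Step 4: use and evaluate the model on new, unseen data.
predictions = model.predict(X_test)
print("accuracy on unseen data:", accuracy_score(y_test, predictions))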

Figure 3. Machine learning steps

Table 2. Comparison with many previous studies for sound-based COVID-19 diagnosis.

Research | Dataset | Sound type | Models/Classifiers | Performance
Fakhry et al. (Virufy) [16] | Coughvid | Cough | Multi-Branch Deep Learning | AUC=91%
Pahar et al. [24] | Coswara | Cough | ResNet50 | AUC=98%
Coppock et al. [11] | COVID-19 Sounds | Cough, Breathing | ResNet | AUC=84.6%
N. Sharma (2020) [31] | Healthy and COVID-19-positive: 941 | Cough, Breathing, Vowel, Counting (1-20) | Random forest classifier | ACC=76.74%
C. Brown et al. (2021) [9] | COVID-19-positive: 141, Non-COVID: 298 | Cough and Breathing | CNN using spectrograms | ACC=80%
Pahar et al. (2021) [24] | Coswara | Cough | Residual-based CNN | ACC=95.33%
Pahar et al. (2021) [26] | Coswara | Breaths | ResNet50 | AUC=92%
Erdoğan et al. [15] | Coswara | Cough | ResNet50 CNN | ACC=96.83%
N. Sharma [31] | Healthy/COVID(+)=941 | Cough/Breathing | Spectral contrast, MFCC / RF | ACC=66.76%
Proposed system [31] | COVID(+)=50, COVID(-)=50 | Cough | MFCC, chroma vector / DNN | ACC=97.5%
Pahar et al. (2022) [25] | Coswara | Coughing, talking, sneezing, noise | CNN, LSTM and ResNet50 | AUC=94.2%
Hemdan et al. (2022) [19] | Coswara | Breath sounds, coughing and speech | CR19 | ACC=97%
Usman et al. (2022) [33] | COVID-19(+) and COVID-19(-) | Speech | STFT | ACC=78.92%
Al-Dhlan (2022) [4] | Healthy, unhealthy | Sound | GAN | ACC=96.54%
Nassif et al. (2022) [23] | Coswara | Coughing, breathing, speaking | LSTM | ACC=98%
Das et al. (2021) [13] | Coswara | Cough, breath, counting from 1-20 | erbSpec/RF | AUC=81%
Das et al. (2021) [13] | Coswara | Cough, breath, counting from 1-20 | GTCC/MLP | AUC=78.61%
Chaudhari et al. (2020) [10] | Coswara, Coughvid | Cough | CNN, LSTM, CRNN | AUC=77.1%
Ritwik et al. (2020) [29] | COVID(+)=10, COVID(-)=19 | Cough | SVM | ACC=88.6%
Akman et al. (2022) [3] | CCS, CSS | Cough, Speech | CIdeR | ACC=78%
Muguli et al. (2021) [22] | Coswara | Cough, Breath | RF | AUC=76.85%
Despotovic et al. (2021) [14] | Coswara, Coughvid | Cough | GA-ML | ACC=98%
Rahman et al. (2022) [27] | Cambridge | Coughing, breathing | 8-KNN | ACC=98%

5. Conclusion

The COVID-19 outbreak has had a significant effect on the well-being of people around the world, with a sharp increase in the number of casualties. Deep learning and machine learning techniques have provided considerable help since the beginning of the global epidemic. In this overview, several classification approaches for COVID-19 based on coughing and breathing sounds have been presented together with the performance of each method. The comparison of the obtained results was listed in Table 2 to identify the best technique for COVID-19 classification. According to the latter, the LSTM, 8-KNN and GA-ML models are the most efficient compared to the others, with an accuracy of 98%, without forgetting the ResNet50 model, which also performs with an Area Under the ROC Curve (AUC) of 98%.
For future work, more focus will be given to investigating the progression level of COVID-19 patients by using cough sound analysis. Furthermore, since some other respiratory diseases produce similar cough sounds, it is imperative to compare the cough sound features of COVID-19 patients with those of other respiratory diseases.

References
[1] 2021, More than the virus, fear of stigma is stopping people from getting tested: Doctors, The New Indian Express, available at https://www.newindianexpress.com/states/karnataka/2020/aug/06/more-than-virus-fear-of-stigma-is-stopping-people-from-getting-tested-doctors-2179656.html. Accessed on November 22.
[2] 2021, World Bank and WHO, Half of the world lacks access to essential health services, 100 million still pushed into extreme poverty because of health expenses, available at https://www.who.int/news/item/13-12-2017-world-bank-and-who-half-the-world-lacks-access-to-essential-health-services-100-million-still-pushed-into-extreme-poverty-because-of-health-expenses. Accessed on November 23.
[3] Akman, A., Coppock, H., Gaskell, A., et al. 2022, Frontiers in Digital Health, 4
[4] Al-Dhlan, K. A. 2022, International Journal of Speech Technology, 25, 641
[5] Alafif, T., Tehame, A. M., Bajaba, S., Barnawi, A., & Zia, S. 2021, International Journal of Environmental Research and Public Health, 18, 1117
[6] Aly, M., Rahouma, K. H., & Ramzy, S. M. 2022, Alexandria Engineering Journal, 61, 3487
[7] Ayodele, T. O. 2010, New Advances in Machine Learning, 3, 19
[8] Bastien, L. 2020, Machine Learning: Définition, fonctionnement, utilisations
[9] Brown, C., Chauhan, J., Grammenos, A., et al. 2020, arXiv preprint arXiv:2006.05919
[10] Chaudhari, G., Jiang, X., Fakhry, A., et al. 2020, arXiv preprint arXiv:2011.13320
[11] Coppock, H., Gaskell, A., Tzirakis, P., et al. 2021, BMJ Innovations, 7
[12] Daisy, M. I. A., & Lavanya, R.
[13] Das, R. K., Madhavi, M., & Li, H. 2021, in 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 4276–4280
[14] Despotovic, V., Ismael, M., Cornil, M., Mc Call, R., & Fagherazzi, G. 2021, Computers in Biology and Medicine, 138, 104944
[15] Erdoğan, Y. E., & Narin, A. 2021, Computers in Biology and Medicine, 136, 104765
[16] Fakhry, A., Jiang, X., Xiao, J., et al. 2021, arXiv preprint arXiv:2103.01806
[17] Hanson, K. E., Caliendo, A. M., Arias, C. A., et al. 2020, Clinical Infectious Diseases
[18] Hassan, A., Shahin, I., & Alsabek, M. B. 2020, in 2020 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI), IEEE, 1–5
[19] Hemdan, E. E.-D., El-Shafai, W., & Sayed, A. 2022, Journal of Ambient Intelligence and Humanized Computing, 1
[20] Islam, R., Abdel-Raheem, E., & Tarique, M. 2022, Biomedical Engineering Advances, 3, 100025
[21] Madeeh, O. D., & Abdullah, H. S. 2021, IOP Publishing, 012008
[22] Muguli, A., Pinto, L., Sharma, N., et al. 2021, arXiv preprint arXiv:2103.09148
[23] Nassif, A. B., Shahin, I., Bader, M., Hassan, A., & Werghi, N. 2022, Mathematics, 10, 564
[24] Pahar, M., Klopper, M., Warren, R., & Niesler, T. 2021, Computers in Biology and Medicine, 135, 104572
[25] Pahar, M., Klopper, M., Warren, R., & Niesler, T. 2022, Computers in Biology and Medicine, 141, 105153
[26] Pahar, M., & Niesler, T. 2021
[27] Rahman, T., Ibtehaz, N., Khandakar, A., et al. 2022, Diagnostics, 12, 920
[28] Rajkarnikar, L., Shrestha, S., & Shrestha, S. 2021, Int. J. Adv. Eng, 4, 337
[29] Ritwik, K. V. S., Kalluri, S. B., & Vijayasenan, D. 2020, arXiv preprint arXiv:2011.04299
[30] Kliff, S. 2021, Most Coronavirus Tests Cost About $100. Why Did One Cost $2,315? The New York Times, available at https://www.nytimes.com/2020/06/16/upshot/coronavirus-test-cost-varies-widely.html. Accessed on November 23.
[31] Sharma, N., Krishnan, P., Kumar, R., et al. 2020, arXiv preprint arXiv:2005.10548
[32] Udugama, B., Kadhiresan, P., Kozlowski, H. N., et al. 2020, ACS Nano, 14, 3822
[33] Usman, M., Gunjan, V. K., Wajid, M., Zubair, M., et al. 2022, Computational Intelligence and Neuroscience, 2022
