Professional Documents
Culture Documents
Paul Et Al. - 2023
Paul Et Al. - 2023
Paul Et Al. - 2023
https://doi.org/10.1007/s00521-021-06737-6 (0123456789().,-volV)(0123456789().
,- volV)
Received: 8 March 2021 / Accepted: 21 September 2021 / Published online: 5 January 2022
Ó The Author(s) 2021
Abstract
Novel Coronavirus 2019 disease or COVID-19 is a viral disease caused by severe acute respiratory syndrome coronavirus 2
(SARS-CoV-2). The use of chest X-rays (CXRs) has become an important practice to assist in the diagnosis of COVID-19 as
they can be used to detect the abnormalities developed in the infected patients’ lungs. With the fast spread of the disease, many
researchers across the world are striving to use several deep learning-based systems to identify the COVID-19 from such CXR
images. To this end, we propose an inverted bell-curve-based ensemble of deep learning models for the detection of COVID-
19 from CXR images. We first use a selection of models pretrained on ImageNet dataset and use the concept of transfer
learning to retrain them with CXR datasets. Then the trained models are combined with the proposed inverted bell curve
weighted ensemble method, where the output of each classifier is assigned a weight, and the final prediction is done by
performing a weighted average of those outputs. We evaluate the proposed method on two publicly available datasets: the
COVID-19 Radiography Database and the IEEE COVID Chest X-ray Dataset. The accuracy, F1 score and the AUC ROC
achieved by the proposed method are 99.66%, 99.75% and 99.99%, respectively, in the first dataset, and, 99.84%, 99.81% and
99.99%, respectively, in the other dataset. Experimental results ensure that the use of transfer learning-based models and their
combination using the proposed ensemble method result in improved predictions of COVID-19 in CXRs.
Keywords COVID-19 detection Convolutional neural network Ensemble learning Chest X-ray Bell-shape function
123
16114 Neural Computing and Applications (2023) 35:16113–16127
alternative [5] as the person under investigation can be imaged 3. The proposed approach is evaluated on COVID-19
in a more isolated environment, thereby lowering the risk of Radiography Database [10] and IEEE COVID Chest
spreading the virus. Although a vaccine has been developed, it X-ray Dataset [11] and state-of-the-art results are
will take time to vaccinate the entire world population, espe- obtained.
cially in developing countries [6]. The remaining paper is structured as follows: Sect. 2 pro-
With the recent developments in data-driven Deep vides a quick review of the past methods related to the
Learning (DL), various DL models like convolutional neural research topic under consideration. In Sect. 3, we discuss
networks (CNNs) are being used extensively to study med- the proposed approach. Section 4 presents the results fol-
ical images [7]. CNNs are achieving state-of-the-art perfor- lowed by a brief discussion on the same. We end with
mances in classification into disease classes for diagnosis concluding remarks outlining some future research plans in
and also in segmentation of the region of interest (ROI) in Sect. 5.
medical images. This is enabled by the fact that CNNs can
learn local features very accurately from a given medical
image which can be a CT scan or a CXR. Combining outputs 2 Related work
of multiple classifiers to generate the final output is a popular
approach to enhance the performance of classification. The Several methods such as transfer learning, ensembling,
combination of ensemble algorithms works on the output etc., have been proposed in the literature to improve the
scores of the individual classifiers, which may have different performance of the DL models. More recently, researchers
architectures to capture different elements of data or differ- have applied these techniques in several domains of image
ent input vectors generated from the same data instance [8]. processing like facial expression recognition [12], image
Existing popular rank level or confidence score level fusion [13], malware classification [14], etc. Such methods
ensemble methods like majority voting, sum-rule (soft vot- have also been used in the medical image processing
ing) [9] focus on a linear combination of the classifiers’ domain. A recent work by Dolz et al. [15] uses CNN
outputs to generate the final prediction, lacking any consid- ensembles for infant’s brain MRI segmentation. Another
eration of the output vector quality. work by Efaz et al. [16] uses deep CNN supported
In this paper, we propose a novel weighted average ensembles for computer-aided diagnosis of malaria. Savelli
ensemble method to combine the confidence scores of et al. [17] have also developed a similar method for small
various pretrained CNN models to achieve better perfor- lesion detection. It is, therefore, logical that similar meth-
mance in detecting COVID-19 from CXR images. The ods have also been applied for COVID-19 detection. We
inverted bell curve is used to assign weights to the classi- highlight a few such works below.
fiers’ outputs. The more we move further from the centre of Gianchandani et al. [18] have also used an approach
the bell we attain higher weight values, and thus the shape where they use transfer learning and ensembling to
of the inverted bell is utilized to calculate the weight for an improve the performance of their DL models. The authors
output vector. Both the classifiers’ output quality and the have considered the VGG16, ResNet152V2, Incep-
overall performance of the classifiers are considered, tionResNetV2 and DenseNet201 models in their work
thereby providing a more justifying combination of clas- which are trained using transfer learning. The authors then
sifier outputs. Transfer learning is used where the CNN show that a deep neural ensembling provides better results
models are first pretrained on a huge dataset to learn basic when compared to each of the models. The work utilized
image-related features. Then they transfer the knowledge two datasets for training the DL models. The first one was
with some fine-tuning to classify CXR images to help the obtained from Kaggle and used for binary classification.
medical practitioners in the diagnosis of COVID-19. We The second one was collected by a team of researchers in
highlight the benefits of the proposed inverted bell-curve- collaboration with doctors and was used for multi-class
based ensemble method to improve the accuracy and classification. It contained 1203 CXRs equally split among
robustness of these transfer learning models. the 3 classes of COVID affected, normal and pneumonia
To summarize, the contributions of this work are as affected, respectively. The accuracy and F1 scores are
follows: reported as 96.15% and 0.961 for the binary classification
1. We propose an ensemble of transfer learning models to task, and as 99.21% and 0.99 for the multi-class classifi-
classify CXR images to detect COVID-19. cation task, respectively.
2. We propose a novel ensemble method that uses an A recent work by Ouyang et al. [19] also utilizes
inverted Bell curve to assign weight to the output of the ensembling. The authors have developed a dual-sampling
classifiers and performs weighted average to obtain the attention network to detect COVID-19 in CT scan images.
final output vector. To deal with the imbalance in the distribution of the
123
Neural Computing and Applications (2023) 35:16113–16127 16115
infection regions in the lungs, a dual-sampling strategy was pneumonia affected and normal, with 900 X-rays belong-
used. Two separate 3D ResNet34 models were trained ing to the COVID affected class. The authors have reported
using different sampling strategies, and finally, the pre- the accuracy, sensitivity, specificity and F1 score values as
dictions were combined using weighted average ensem- 97.78%, 97.75% 96.25% and 95.25%, respectively.
bling. For training and validation, a dataset consisting of Ezzat et al. [26] also use a similar approach where they
2186 CT scans from 1588 patients was used. For the testing have used the Gravitational Search Algorithm to choose the
stage, another independent dataset comprising of 2796 CT optimal hyperparameters for a DenseNet121 CNN model.
scans from 2057 patients was used. The AUC, accuracy The authors go on to show that such a method performs
and F1 score values are reported as 0.944, 87.5% and better than the state-of-the-art InceptionV3 model. In the
82.0%, respectively, on the testing dataset. work, a combination of two datasets, the Cohen dataset and
Similarly, in [20], the authors have used ensembling and the Kaggle Chest X-ray Dataset, has been used. The final
iterative pruni ng to improve the classification results of dataset contained 99 COVID-19-positive X-rays and 207
their DL models. Four publicly available CXR datasets are COVID-19-negative X-rays which also included some
used in the work. One pneumonia-related dataset was used other diseases like pneumonia, SARS, etc. in addition to
for modality-specific training before training on the normal X-rays. The authors have reported the accuracy and
COVID-19 CXRs. The idea is that training on a similar F1 score of their method as 98.38% and 98%, respectively.
dataset of CXRs will be beneficial for the DL models. The The availability of a large quantity of training data is
authors report their accuracy and AUC as 99.01% and also required for the success of DL models. However, in
0.9972, respectively. the emerging domains, there is often a lack of training data.
Zhang et al. [21] have used two-stage transfer learning Waheed et al. [27] have proposed an auxiliary classifier
and a deep residual network framework for the classifica- generative adversarial network (ACGAN)-based model
tion of CXR images. The authors used pretrained ResNet34 termed as COVIDGAN to tackle this issue. The authors
model and fine-tuned the model on a large dataset of have used a dataset consisting of 1124 CXRs of which 403
pneumonia CXR images. The authors then used a feature are COVID-19 infected, and the rest are normal. It has been
smoothing layer and a feature extraction layer and utilized derived from 3 open-sourced datasets. The authors have
them along with layers transferred from the fine-tuned shown that including the synthetic images generated by
ResNet34 model. A fully connected layer at the end of the COVIDGAN in a VGG16 classifier improves the perfor-
network produces the final output. The authors used two mance of the model. The accuracy, F1 score, sensitivity
datasets of CXR images, one with 5860 images which were and specificity improve to 95%, 0.95, 90% and 97%,
used for the first stage of training and the other one with respectively, from 85%, 0.85%, 69% and 95%,
739 images which was used for the later stage. The testing respectively.
accuracy is reported as 91.08 % by the authors.
Jaiswal et al. [22], in their work, have utilized transfer 2.1 Research gap
learning in DL models to detect COVID-19 in CT scan
images. The authors have used the ImageNet dataset for As highlighted in the previous section, ensembling-based
pretraining and the SARS-CoV-2 CT scan dataset for approaches are widely used in different image classifica-
training the models. The authors have observed that the tion tasks among others. They have also been used in a few
DenseNet201 model performs the best as compared to the methods proposed for COVID-19 detection. The most
VGG16, ResNet152V2 and InceptionResNetV2 models. common techniques used include summation, majority
The training, testing and validation accuracies are reported voting, averaging and weighted averaging of the predic-
as 99.82%, 96.25% and 97.40%, respectively. tions obtained from the classifiers considered for forming
Recently, several works ([23, 24]) have also been pro- the ensemble. These approaches provide a significant
posed which make use of optimization algorithms along improvement in performance in most cases. However, an
with DL for COVID-19 detection. The work by Goel et al. important observation is that these methods do not consider
[25] introduced an optimized CNN termed as OptCoNet for the quality of the predictions while producing the output.
the purpose of COVID-19 diagnosis from CXRs. The These techniques simply apply the corresponding operation
proposed CNN model consists of feature extraction com- to obtain the output. We may also choose to use some
ponents and classification components as usual. However, secondary classifiers [28] which can make use of some
the hyperparameters of the CNN (like learning rate, num- learning algorithms. This learning process is based on the
ber of epochs, etc.) have been optimized by using the Grey optimization of some metrics like accuracy or F1 score. For
Wolf Optimization algorithm. A dataset comprising of example, the work by Gianchandani et al. [18] mentioned
2700 CXRs collected from various public repositories was previously uses a neural network-based secondary classi-
used. There were three classes in all: COVID affected, fier. Applying the similar technique in our experimental
123
16116 Neural Computing and Applications (2023) 35:16113–16127
setup does not produce a superior result. This secondary 3 Proposed approach
classification stage also does not give any separate treat-
ment based on the quality of the predictions, which may This section describes the methods used in this study. All
explain the previous observation. these methods put together to create the proposed analysis
In many cases, it has been observed that classifiers pipeline whose block diagram is shown in Fig. 1.
obtain high accuracy on a particular task even if the quality
of the predictions is inferior. Here, we assume the accuracy 3.1 Preliminaries
to be the fraction of correctly predicted classes where each
prediction is the class with the maximum probability as In the domain of computer vision, CNNs have proved to be
predicted by the model under consideration. By quality, we the best tool by achieving excellence in a wide array of
refer to the difference among the predicted probabilities. research problems including image classification, object
This is to some extent a metric to measure the confidence detection, image segmentation, etc. The convolution layer
of the model. For example, between two classifiers with does most of the computation. These layers convolve the
predicted class probabilities as [0.8, 0.1, 0.1] and input with a filter and pass it to the next layers as the
[0.4, 0.3, 0.3], we would prefer the former classifier. In output. Like the former, pooling layers do not have any
terms of the accuracy viewpoint mentioned previously, weights associated with them. The pooling layers help to
both would produce the same output. However, from the reduce the dimensions of the intermediate feature maps
quality viewpoint, the first classifier would be preferred before they are passed through the activation function.
since it predicts the class with high confidence. The dif- Although this downsampling strategy using pooling layers
ferences in probability scores of the predicted class (0.8) loses some data, it helps in preventing overfitting and
with the other classes (0.1 and 0.1) are very high. The reduces the complexity of the overall network. The final
second classifier is said to be unsure of the prediction since convolution layer is generally followed by a fully con-
the differences in scores of the predicted class (0.4) with nected network (FC), where all the neurons in one layer are
the other classes (0.3 and 0.3) are very low. Its output is connected to the outputs of the previous layer.
very close to the uniform random prediction of Activation functions are used to introduce nonlinearity
[0.33, 0.33, 0.33]. Hence, we would consider the first in neural networks. The activation functions that we have
classifier as the better one, since the difference between the used include: Rectified Linear Unit (ReLU), Softmax and
probabilities of the predicted class and the other classes is Sigmoid. Practically, ReLU has been found to be better as
minimal. compared to sigmoidal functions for intermediate activa-
In the present work, we use the inverted bell curve in tion in a network. It speeds up the convergence of
order to minimize this issue. The aim is to introduce some stochastic gradient descent (SGD) and also reduces the
robustness in the ensembling process while improving vanishing gradient problem.
classification accuracy at the same time. Dropout [29] is a method of regularization and is fre-
quent used in CNNs. Normally, overfitting is a major
Confidence
CXR Images Transfer Learning Score Ensemble Output
Final Output
Densenet-161 Vector
Inverted Bell-
Resnet-18 curve Based
Ensemble
Final Prediction
VGG-16
Fig. 1 Graphical representation of the proposed ensemble method used for detecting COVID-19 from CXR images
123
Neural Computing and Applications (2023) 35:16113–16127 16117
123
16118 Neural Computing and Applications (2023) 35:16113–16127
S11
Combination 3.3.2 Sum rule (Soft voting)
Algorithm
Classifier 1 S12
Score 1
Class 1 Let us consider output of some ith classifier (i 2 ½0; k) is
S1N
oi ¼ ½s0i ; s1i ; :::; sCi where sji is the confidence score of jth
Classifier 2 class (j 2 ½0; C). Now define majority as summation of the
S21 Score 2
Class 2
S22 vectors s0 i where s0 ki is 1 if only argmaxj sji ¼ k, any other
S2N
value is 0. So if the final output vector Y ¼ ½Y0 ; Y1 ; :::; YC
is produced by majority voting then
Score N
Class N
Classifier M
SM1
X
k
j
SM 2
Yj ¼ s0 i ð1Þ
i¼0
SMN
123
Neural Computing and Applications (2023) 35:16113–16127 16119
3.3.3 Borda count we will get higher values of f(x), similar amount at both
direction due to the fact that ðx bÞ term is squared in the
This method is a voting technique that works on the rank equation. This very idea is used in the context of assigning
level of the classifiers [37]. The confidence score sji in each weights to the outputs of each classifier.
classifier’s output is assigned with a rank rij in such a way Let us consider two independent classifiers P and Q
that the highest score value gets the lowest rank value. In produce [0.8, 0.1, 0.1] and [0.5, 0.3, 0.2] as output confi-
Borda count, the rank values are added to get the combined dence scores for some input X. Though both of these
rank output R ¼ ½R0 ; R1 ; :::; RC . classifiers predict the X belongs to class-0, the classifier P
does it more confidently. Therefore, while doing the
X
k
weighted average of these scores, we must assign more
Rj ¼ rij ð3Þ
i¼0 weight to the classifier P for this output. In doing so, the
property of f(x) discussed above is used. Let the minima of
The final class is predicted by performing argmin on R. To f(x) be at x ¼ 0:5, then we will get higher values of f(x) as
enhance the method a weight value wi is attached to each we get closer to 0 or 1 because these are respectively lower
classifier, which can be calculation by logistic regression
and upper bounds for sji . It can be easily shown that minima
[38] and the final rank is counted by taking the weighted
of f(x) exists at x ¼ b. So the value of b is taken as 0.5 to
sum.
satisfy our requirement. The value of c determines the
range of the weights and it is chosen as 0.5 experimentally.
3.4 Proposed method: inverted bell curve
There may arrive a situation when some classifiers
weighted ensemble
having very poor performance metrics over a dataset but
for some instances it produces the outputs confidently.
This section presents the mathematical formulation of the
Therefore, without suppressing the classifiers’ impacts
proposed ensemble methods. Let there be C classes in
completely, we aim to weaken its contribution in the
dataset and k classifiers trained on the dataset. In this paper
ensemble output, and we consider the accuracy of the
value of k is taken is 3 but k can take any finite value. Let sji classifier by taking a ¼ 1=Ai ; in f(x), where Ai is the
be the confidence score for jth class predicted by the ith accuracy for the ith classifier. So the weight wi assigned to
classifier. The confidence scores are the output of softmax;
the output of ith classifier is
hence, the output of some ith classifier will follow:
X
C
X
C wi ¼ Ai f ðsji Þ where ð6Þ
sji ¼1 where sji 2 ½0; 1 ð4Þ j¼0
j¼0
123
16120 Neural Computing and Applications (2023) 35:16113–16127
123
Neural Computing and Applications (2023) 35:16113–16127 16121
123
16122 Neural Computing and Applications (2023) 35:16113–16127
123
Neural Computing and Applications (2023) 35:16113–16127 16123
Table 5 Performance evaluation (in %) on IEEE COVID Chest X-ray Table 7 Evaluation metric on Kaggle Pneumonia Dataset
Dataset
Model Accuracy(in %)
Model Accuracy F1 score AUC
DenseNet-161 83.52
DenseNet-161 99.59 99.51 99.57 ResNet-18 83.14
ResNet-18 99.59 99.51 99.57 VGG-16 83.71
VGG-16 99.75 99.71 99.79 Proposed 86.11
Proposed 99.84 99.81 99.99
Borda count 98.11 99.59 DL-based models like CNNs generally provide better
Majority voting 98.97 99.59 performance than conventional white-box machine learn-
Sum-rule 98.97 99.59 ing models techniques like regression, decision trees, etc.
Proposed 99.66 99.84 However, it is to be noted that DL-based models are black-
box models in general. It is difficult to obtain explainability
for the predictions which may be important in certain fields
Table 6 shows the comparison between the popular like medical image processing. Here, medical professionals
ensemble techniques and the proposed method on of the want a prediction to come from the relevant artefacts
datasets mentioned in this paper. present in the input image (X-Ray, CT scan, etc.) and not
We observe from Tables 2 and 5 that the present from irrelevant parts of the image like the background.
approach achieves the best results on both datasets. On the The work reported in [48] is one such work that provides
first dataset, the DenseNet-161 model produces the best an explainable machine learning approach for EEG-based
performance and has a 99.66% accuracy. On the second brain-computer interface systems. In this work, the core
dataset, the VGG-16 model achieves the best results and prediction is performed using a CNN model. To introduce
has an accuracy of 99.75%. However, the present ensem- explainability in the system, the authors have used occlu-
bling approach outperforms both the above and achieves sion sensitivity analysis along with saliency maps seg-
accuracy values of 99.66% and 99.84%, respectively, on mentation through k-means clustering. Occlusion
the two datasets. This demonstrates the robustness of the sensitivity analysis is a simple approach where patches of
123
16124 Neural Computing and Applications (2023) 35:16113–16127
the input image are occluded using a mask and the effect found in the upper respiratory tract. Further development of
on the output is observed. This is used to infer the regions the virus results in fibrin accumulation on the alveolar
of interest. region causing reduced gas exchange in the lung. The
The work in [49] is another work where the authors have models individually pay attention to small and different
introduced explainability in a machine learning framework. regions looking for the textural changes caused by this
They have evaluated their approach for Glioma Cancer fibrin accumulation [42]. So the proposed ensemble of
prediction and have shown a comparable performance with these models can help in considering all of these textural
black box methods with the added advantage of explain- changes found in the chest X-rays.
able predictions. Besides, the work reported in [50] is a Table 8 compares the proposed method with some
recent method, where an explainable deep learning recent works in the same domain. Tang et al. [44] imple-
framework has been presented for COVID-19 diagnosis in ment an ensemble of multiple snapshot models of COV-
chest X-rays. The authors have utilized the Grad-CAM IDNet [40] on the COVIDx Dataset. The authors use a
approach for obtaining explainability from their CNN base weighted ageing ensemble technique to combine the
learners. snapshot models. Qiao et al. [45] use focal loss bases
In the present work, we use Grad-CAM [39] to capture neural ensemble on a combined dataset of IEEE COVID
the region of attention for the three models used in this Dataset and a Kaggle Pneumonia Dataset. Chowdhury
study. The idea behind Grad-CAM is to calculate the et al. [47] use an ensemble of a number of EfficientNet
weighted average of the feature maps obtained from a snapshots on COVIDx dataset. Turkoglu et al. [46] apply
particular layer in a model where the weights are the gra- Relief feature selection algorithm to select deep features
dients of the feature maps calculated on the predicted class from transfer learned AlexNet. This method is evaluated on
score. We choose the final convolution layer for Grad- a combined dataset made up of three public datasets.
CAM as it is considered to have the best compromise
between detailed spatial information and high-level 4.5 Error case analysis
semantics [39]. In Fig. 11, the superimposed image of the
Grad-CAM mask and the input CXR is shown. The rows in Table 9 shows a failure case encountered with the proposed
the image grid correspond to VGG-16, ResNet-18 and method. Upon careful observation, it can be found that all
DenseNet-161, respectively, from top to bottom. Each of the predictions from the base learners are very weak.
column represents the same input CXR. The region of Not a single model has a confidence score over 60% for
interest is shown as red spots in the figure. It can be their predicted labels. Though the VGG network predicts
observed that different models put attention on different the correct label, it has the lowest confidence score among
regions of the same CXR which can prove to be useful for all the other prediction confidences. Our method is unable
the combination of these models. to emphasize such low confidence score leading to incor-
Figure 11 also shows that the regions of attention of the rect prediction.
models are near the upper respiratory tract and alveolar Figure 12 shows the Grad-CAM results for a failure
lobes. The initial effects of SARS-CoV-2 are generally case. From the figure, it is noted that VGG-16 network
VGG-16
ResNet-18
DenseNet-161
Fig. 11 Grad-CAM performed on the last convolution layer for VGG-16 (top row), ResNet-18 (middle row) and DenseNet-161 (bottom row)
123
Neural Computing and Applications (2023) 35:16113–16127 16125
Table 8 Comparison of the proposed method with other deep learning methods in previous studies
Work ref. Datasets Accuracy (%)
123
16126 Neural Computing and Applications (2023) 35:16113–16127
meta-heuristics for hyperparameter tuning of the neural 5. Rubin GD, Ryerson CJ, Haramati LB, Sverzellati N, Kanne JP,
network to improve detection performance. Raoof S et al (2020) The role of chest imaging in patient man-
agement during the COVID-19 pandemic: a multinational con-
sensus statement from the Fleischner Society. Chest
Acknowledgements Ashis Paul, Arpan Basu and Ram Sarkar are
158(1):106–116
thankful to the Centre for Microprocessor Applications for Training,
6. Kaiser MS et al (2021) iWorkSafe: Towards Healthy Workplaces
Education and Research (CMATER) laboratory of the Computer
during COVID-19 with an Intelligent pHealth App for Industrial
Science and Engineering Department, Jadavpur University, Kolkata,
Settings. IEEE Access 9:13814–13828
India, for providing infrastructural support.
7. Mahmud M, Kaiser MS, Hussain A, Vassanelli S (2018) Appli-
cations of Deep Learning and Reinforcement Learning to Bio-
Author Contributions This work was carried out in close collabora-
logical Data. IEEE Trans Neural Netw Learn Syst
tion between all co-authors. All authors have contributed to, seen and
29(6):2063–2079
approved the final manuscript.
8. Ruiz J, Mahmud M, Modasshir M, Kaiser MS, et al. 3D Dense-
Net Ensemble in 4-Way Classification of Alzheimer’s Disease.
Code availability The code can be accessed via the following GitHub
In: International Conference on Brain Informatics. Springer;
repository: https://github.com/ashis0013/Inverted-Bell-Curve-
2020. p. 85–96
Ensemble.
9. Ho TK, Hull JJ, Srihari SN (1994) Decision combination in
multiple classifier systems. IEEE Trans Pattern Anal Mach Intell
Declarations 16(1):66–75
10. Chowdhury ME, Rahman T, Khandakar A, Mazhar R, Kadir MA,
Conflict of interest The authors declare that the research was con- Mahbub ZB et al (2020) Can AI help in screening viral and
ducted in the absence of any commercial or financial relationships COVID-19 pneumonia? IEEE Access 8:132665–132676
that could be construed as a potential conflict of interest. 11. Cohen JP, Morrison P, Dao L. COVID-19 image data collection.
arXiv 200311597. 2020. Available from: https://github.com/
Ethical Approval All procedures performed in studies involving ieee8023/covid-chestxray-dataset
human participants were in accordance with the ethical standards of 12. Liu K, Zhang M, Pan Z. Facial expression recognition with CNN
the institutional and/or national research committee and with the 1964 ensemble. In: 2016 international conference on cyberworlds
Helsinki Declaration and its later amendments or comparable ethical (CW). IEEE; 2016. p. 163–166
standards. 13. Amin-Naji M, Aghagolzadeh A, Ezoji M (2019) Ensemble of
CNN for multi-focus image fusion. Inf Fusion 51:201–214
Informed Consent Informed consent was obtained from all individual 14. Vasan D, Alazab M, Wassan S, Safaei B, Zheng Q (2020) Image-
participants included in the study. Based malware classification using ensemble of CNN architec-
tures (IMCEC). Comput Secur 92:101748
Open Access This article is licensed under a Creative Commons 15. Dolz J, Desrosiers C, Wang L, Yuan J, Shen D, Ayed IB (2020)
Attribution 4.0 International License, which permits use, sharing, Deep CNN ensembles and suggestive annotations for infant brain
adaptation, distribution and reproduction in any medium or format, as MRI segmentation. Comput Med Imag Graph 79:101660
long as you give appropriate credit to the original author(s) and the 16. Efaz ET, Alam F, Kamal MS. Deep CNN-Supported Ensemble
source, provide a link to the Creative Commons licence, and indicate CADx Architecture to Diagnose Malaria by Medical Image. In:
if changes were made. The images or other third party material in this Proceedings of International Conference on Trends in Compu-
article are included in the article’s Creative Commons licence, unless tational and Cognitive Engineering. Springer; 2021. p. 231–243
indicated otherwise in a credit line to the material. If material is not 17. Savelli B, Bria A, Molinara M, Marrocco C, Tortorella F (2020)
included in the article’s Creative Commons licence and your intended A multi-context cnn ensemble for small lesion detection. Artif
use is not permitted by statutory regulation or exceeds the permitted intell Med 103:101749
use, you will need to obtain permission directly from the copyright 18. Gianchandani N, Jaiswal A, Singh D, Kumar V, Kaur M. Rapid
holder. To view a copy of this licence, visit http://creativecommons. COVID-19 diagnosis using ensemble deep transfer learning
org/licenses/by/4.0/. models from chest radiographic images. J Ambient Intell Human
Comput 2020:1–13
19. Ouyang X, Huo J, Xia L, Shan F, Liu J, Mo Z et al (2020) Dual-
sampling attention network for diagnosis of COVID-19 from
References community acquired pneumonia. IEEE Trans Med Imag
39(8):2595–2605
1. Cucinotta D, Vanelli M (2020) WHO declares COVID-19 a 20. Rajaraman S, Siegelman J, Alderson PO, Folio LS, Folio LR,
pandemic. Acta Bio Medica: Atenei Parmensis. 91(1):157 Antani SK (2020) Iteratively pruned deep learning ensembles for
2. Chan JFW, Yuan S, Kok KH, To KKW, Chu H, Yang J et al COVID-19 Detection in Chest X-Rays. IEEE Access
(2020) A familial cluster of pneumonia associated with the 2019 8:115041–115050
novel coronavirus indicating person-to-person transmission: a 21. Zhang R, Guo Z, Sun Y, Lu Q, Xu Z, Yao Z et al (2020)
study of a family cluster. lanceT 395(10223):514–523 COVID19XrayNet: a Two-Step Transfer Learning Model for the
3. Dey N, Rajinikanth V, Fong SJ, Kaiser MS, Mahmud M (2020) COVID-19 Detecting Problem Based on a Limited Number of
Social-Group-Optimization Assisted Kapur’s Entropy and Mor- Chest X-Ray Images. Interdiscip Sci Comput Life Sci
phological Segmentation for Automated Detection of COVID-19 12(4):555–565
Infection from Computed Tomography Images. Cogn Comput 22. Jaiswal A, Gianchandani N, Singh D, Kumar V, Kaur M. Clas-
12(5):1011–1023 sification of the COVID-19 infected patients using DenseNet201
4. Kermany DS, Goldbaum M, Cai W, Valentim CC, Liang H, based deep transfer learning. J Biomol Struct Dynam. 2020:1–8
Baxter SL et al (2018) Identifying medical diagnoses and treat- 23. Chattopadhyay S, Dey A, Singh PK, Geem ZW, Sarkar R. Covid-
able diseases by image-based deep learning. Cell 19 Detection by Optimizing Deep Residual Features with
172(5):1122–1131 Improved Clustering-Based Golden Ratio Optimizer.
123
Neural Computing and Applications (2023) 35:16113–16127 16127
Diagnostics. 2021;11(2). Available from: https://www.mdpi.com/ Proceedings Eighth International Workshop on Frontiers in
2075-4418/11/2/315 Handwriting Recognition. IEEE; 2002. p. 195–200
24. Sen S, Saha S, Chatterjee S, Mirjalili S, Sarkar R (2021) A bi- 38. Monwar MM, Gavrilova ML (2009) Multimodal biometric sys-
stage feature selection approach for COVID-19 prediction using tem using rank-level fusion approach. IEEE Trans Syst Man
chest CT images. Appl Intell 51:8985–9000 Cybernet Part B (Cybernet) 39(4):867–878
25. Goel T, Murugan R, Mirjalili S, Chakrabartty DK (2020) 39. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra
OptCoNet: an optimized convolutional neural network for an D. Grad-cam: Visual explanations from deep networks via gra-
automatic diagnosis of COVID-19. Appl Intell 51(3):1351–1366 dient-based localization. In: Proceedings of the IEEE interna-
26. Ezzat D, Hassanien AE, Ella HA (2020) An optimized deep tional conference on computer vision; 2017. p. 618–626
learning architecture for the diagnosis of COVID-19 disease 40. Wang L, Lin ZQ, Wong A (2020) Covid-net: a tailored deep
based on gravitational search optimization. Appl Soft Comput convolutional neural network design for detection of covid-19
98:106742 cases from chest x-ray images. Sci Rep 10(1):1–12
27. Waheed A, Goyal M, Gupta D, Khanna A, Al-Turjman F, Pin- 41. Ieracitano C, Paviglianiti A, Campolo M, Hussain A, Pasero E,
heiro PR (2020) CovidGAN: data augmentation using auxiliary Morabito F (2020) A novel automatic classification system based
classifier GAN for improved Covid-19 Detection. IEEE Access on hybrid unsupervised and supervised machine learning for
8:91916–91923 electrospun nanofibers. Ieee/caa J Autom Sinica. 8:64–76
28. Mukhopadhyay A, Singh PK, Sarkar R, Nasipuri M (2018) A 42. Gallelli L, Zhang L, Wang T, Fu F (2020) Severe Acute Lung
study of different classifier combination approaches for hand- Injury Related to COVID-19 Infection: a Review and the Possible
written Indic Script Recognition. J Imag 4(2):39 Role for Escin. J Clin Pharmacol 60:815–825
29. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdi- 43. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D.,
nov R (2014) Dropout: a simple way to prevent neural networks Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold,
from overfitting. J Mach Learn Res 15(1):1929–1958 G. & Gelly, S. An image is worth 16x16 words: Transformers for
30. Ioffe S, Szegedy C. Batch normalization: Accelerating deep image recognition at scale. Arxiv PreprintArxiv:2010.11929.
network training by reducing internal covariate shift. In: Inter- (202)
national conference on machine learning. PMLR; 2015. 44. Tang, S., Wang, C., Nie, J., Kumar, N., Zhang, Y., Xiong, Z. &
p. 448–456 Barnawi, A. EDL-COVID: Ensemble Deep Learning for COVID-
31. Simonyan K, Zisserman A. Very deep convolutional networks for 19 Cases Detection from Chest X-Ray Images. Ieee Transactions
large-scale image recognition. arXiv preprint arXiv:14091556. On Industrial Informatics. (2021)
2014 45. Qiao Z, Bae A, Glass L, Xiao C, Sun J (2021) FLANNEL (focal
32. He K, Zhang X, Ren S, Sun J. Deep residual learning for image loss based neural network ensemble) for COVID-19 detection.
recognition. In: Proceedings of the IEEE conference on computer J Am Med Inform Assoc 28:444–452
vision and pattern recognition; 2016. p. 770–778 46. Turkoglu M (2021) COVIDetectioNet: COVID-19 diagnosis
33. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely system based on X-ray images using features selected from pre-
Connected Convolutional Networks. In: 2017 IEEE Conference learned deep features ensemble. Appl Intell 51:1213–1226
on Computer Vision and Pattern Recognition (CVPR); 2017. 47. Chowdhury, N., Kabir, M., Rahman, M. & Rezoana, N. ECOV-
p. 2261–2269 Net: An Ensemble of Deep Convolutional Neural Networks
34. Tulyakov S, Jaeger S, Govindaraju V, Doermann D (2008) Based on EfficientNet to Detect COVID-19 From Chest X-rays.
Review of classifier combination methods. Machine learning in Arxiv PreprintArxiv:2009.11850. (202)
document analysis and recognition. Springer, Heidelberg, 48. Ieracitano C, Mammone N, Hussain A, Morabito F (2021) A
pp 361–386 novel explainable machine learning approach for EEG-based
35. Lee DS, Srihari SN. A theory of classifier combination: the neural brain-computer interface systems. Neural Comput Appl 1–14.
network approach. In: Proceedings of 3rd International Confer- https://doi.org/10.1007/s00521-020-05624-w [Online first]
ence on Document Analysis and Recognition. vol. 1. IEEE; 1995. 49. Pintelas E, Liaskos M, Livieris I, Kotsiantis S, Pintelas P (2020)
p. 42–45 Explainable machine learning framework for image classification
36. Mahmud M, Kaiser MS. Machine Learning in Fighting Pan- problems: case study on glioma cancer prediction. J Imag 6:37
demics: A COVID-19 Case Study. In: Santosh KC, Joshi A, 50. Singh R, Ey R, Babu R (2021) COVIDScreen: explainable deep
editors. COVID-19: Prediction, Decision-Making, and its learning framework for differential diagnosis of COVID-19 using
Impacts. Lecture Notes on Data Engineering and Communica- chest X-Rays. Neural Comput Appl 33:8871–8892
tions Technologies. Singapore: Springer; 2021. p. 77–81
37. Van Erp M, Vuurpijl L, Schomaker L. An overview and com- Publisher’s Note Springer Nature remains neutral with regard to
parison of voting methods for pattern recognition. In: jurisdictional claims in published maps and institutional affiliations.
123