
Effectiveness of Deep Learning Models for Covid-19 Detection on Real-life Data

Dhyey Patel, Kavish Mehta, Aparna Menon, Sara Parveen
Electrical and Computer Engineering Department, University of Waterloo, Waterloo, Ontario, Canada

Abstract— Respiratory illnesses represent a formidable challenge to the medical community, with COVID-19's emergence in 2020 presenting a particularly alarming threat to lung health. This development has underscored the pressing need for more effective methods of studying and diagnosing the disease. In recent years, Artificial Intelligence (AI) techniques have been increasingly employed for this purpose. However, the real-world efficacy of these methods remains a crucial consideration. The present study seeks to evaluate the accuracy of AI models on real-world data and improve their performance through the use of an image augmentation technique, i.e., Canny edge detection. Specifically, we aim to measure the effectiveness of these models in identifying and diagnosing respiratory diseases using real-world data, while also exploring how image augmentation techniques can enhance their accuracy. Through this research, we hope to shed light on the feasibility and potential utility of AI-based approaches to respiratory disease diagnosis and treatment, as well as highlight areas where future improvements are needed.

Index Terms—COVID-19, Deep Learning, Convolutional Neural Networks, Image Augmentation, Canny Edge Detection, Transfer learning

I. INTRODUCTION

The COVID-19 pandemic caused by the SARS-CoV-2 virus has resulted in the need for effective and efficient diagnostic methods. The predominant method currently used to diagnose COVID-19 is the RT-PCR (Reverse Transcription Polymerase Chain Reaction) test. Its limitations in terms of time and resources have led to the exploration of alternative methods that are more convenient and accurate. Methods using Deep Learning models have been among the most explored alternatives.

In this context, our review paper [1] explored the use of Deep Learning techniques for identifying respiratory diseases. Our study compared the effectiveness of using Chest X-Rays (CXRs) and Computerized Tomography scans (CT-scans) as the primary datasets and evaluated widely-used Convolutional Neural Network (CNN) models for image segmentation and identification. During our review, we came across a study by Silva P et al. [2], in which they applied Deep Learning methods to real-life data and tried to measure the effectiveness of their model. While we appreciate the cross-dataset evaluation approach they used, we found that the results they achieved were suboptimal, leaving room for improvement. Therefore, our research aims to build upon their existing work.

In contrast to Silva P et al.'s implementation of only two architectures based on EfficientNet-B0 [3], we evaluate the performance of other widely-used models such as ResNet50_v2 [4], AlexNet [5], ChexNet [6], and MobileNetV2 [7] for this purpose.

In the course of our study, we were inspired by the work of S K T Hwa et al. [8] and Shou-Ming Hou et al. [9], who used advanced Canny edge detection techniques on CXRs to improve the efficiency of deep learning models. In our work we investigate the impact of using Canny edge detection [10] on the efficiency of deep learning models trained on CT-scans. A detailed explanation of our implementation, motivations, findings, and conclusions can be found in the upcoming sections.

II. DATASET

In this section, we discuss the datasets used for this study. We used the same datasets as Silva P et al. [2] in their work. Dataset-1 (SARS-CoV-2 CT-scan dataset [11]) was divided into training and validation sets, whereas Dataset-2 (COVID-CT dataset [12]) was used for testing.

A. Distribution of Data

As discussed earlier, Dataset-1 [11] was used for training and validation purposes, whereas Dataset-2 [12] was used for testing. Dataset-1 consists of a total of 2482 CT-scan images and Dataset-2 consists of a total of 746 CT-scan images. Table 1 summarizes how the data was split for training, validation and testing.

Table 1: Summary of data distribution

  Dataset          Purpose       Covid   Non-Covid
  Dataset-1 [11]   Training      1000    1000
  Dataset-1 [11]   Validation    252     230
  Dataset-2 [12]   Testing       349     397
In Dataset-2, 349 CT-scan images are of Covid-infected patients and 397 CT-scan images are of non-infected patients. The disparity observed in the count of non-Covid cases between our work and the study conducted by Silva P et al. can be attributed to a temporal difference in the data sources. Specifically, it is likely that the source utilized in our analysis had been updated since the time of Silva P et al.'s study, leading to variations in the reported figures for non-COVID patients.
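For concreteness, the split described above could be wired up as in the minimal Keras sketch below; the directory names, image size, and the fractional validation split (approximating the roughly 80/20 division of Dataset-1 shown in Table 1) are illustrative assumptions, since the paper does not describe its input pipeline.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)  # illustrative input size; not specified in the paper
BATCH = 32

# Dataset-1 (SARS-CoV-2 CT-scan dataset [11]) -> training and validation
# Dataset-2 (COVID-CT dataset [12])           -> held-out test set
# "dataset1"/"dataset2" are hypothetical folders, each with one
# sub-directory per class (e.g. "covid" and "non_covid").
train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset1", validation_split=0.2, subset="training", seed=42,
    image_size=IMG_SIZE, batch_size=BATCH)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset1", validation_split=0.2, subset="validation", seed=42,
    image_size=IMG_SIZE, batch_size=BATCH)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset2", image_size=IMG_SIZE, batch_size=BATCH, shuffle=False)
```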
B. Variety in Test Data

Dataset-2, as reported in reference [12], has been curated by compiling CT-scans from diverse sources to provide a comprehensive range of data for model testing. The dataset has been designed to simulate the real-life data that a model may encounter in practical settings. To this end, the dataset incorporates several self-induced errors such as textual information or highlights on images, variations in size, and changes in contrast, as depicted in Figure-1.

Figure-1: Images from Dataset-2 [12] showing the variety in the test dataset (images might be scaled down for convenience)

III. MOTIVATION AND METHODS

In this section, we discuss the methodologies adopted in our research and the driving factors behind them. Upon reviewing Silva P et al.'s [2] research, we determined that additional preprocessing techniques could enhance model accuracies when analyzing real-life data. Furthermore, we concluded that basing findings on a single model is inadequate, and that we should include other widely-used models in our analysis. Therefore, drawing inspiration from S K T Hwa et al.'s [8] and Shou-Ming Hou et al.'s [9] papers and our own review paper [1], we incorporated Canny edge detection as a preprocessing technique and evaluated the results using several other Convolutional Neural Network (CNN) architectures such as ResNet50_v2 [4], AlexNet [5], ChexNet [6], and MobileNetV2 [7].

A. Pre-processing Motivation

A.1 Canny Edge Detection [10]

In medical diagnostics, it is necessary for a model to have good accuracy. One of the most reliable edge detection techniques developed to date is the Canny edge detection algorithm, which is highly precise and rigorously defined. This method has gained recognition for its ability to provide consistent and accurate edge detection results. S K T Hwa et al.'s [8] and Shou-Ming Hou et al.'s [9] research validates the above statement and motivated us to use Canny edge detection for our research.

In Figure-2 we can see a CT-scan image and its corresponding Canny-edge-processed image.

Figure-2: (1) Original CT-scan image [12] (2) Canny-edge processed image

By utilizing this pre-processing technique, we can extract the essential edge-based features from a CT-scan and eliminate any extraneous features that may have been introduced during the image extraction process. This is particularly relevant in real-world scenarios where CT-scan images from diverse sources may be susceptible to varying degrees of quality defects.

Although the original form of Canny edge detection is not without its imperfections, there are several techniques for improving it, as demonstrated by the research of S K T Hwa et al. [8] and Shou-Ming Hou et al. [9]. Nevertheless, utilizing Canny edge detection in its original form has yielded enhanced accuracy for machine learning models.
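As an illustration of how this pre-processing step can be applied, the sketch below uses OpenCV; the Canny thresholds, blur kernel, and target size are assumptions, as the paper does not report the exact parameters used.

```python
import cv2
import numpy as np

def canny_preprocess(path, low=100, high=200, size=(224, 224)):
    """Load a CT-scan, apply Canny edge detection, and return a 3-channel
    edge map that can be fed to a CNN. Threshold values are illustrative."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, size)
    img = cv2.GaussianBlur(img, (5, 5), 0)   # smooth to suppress noisy edges
    edges = cv2.Canny(img, low, high)        # binary edge map (values 0 or 255)
    return np.stack([edges] * 3, axis=-1)    # replicate to 3 channels for the CNN
```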
B. Modeling Motivation

In the course of our research, we endeavored to address the primary concerns noted in Silva P et al.'s [2] work pertaining to the vanishing gradient problem and the bulky nature of the models. To mitigate these issues, we employed four distinct models: ResNet50_v2 [4], AlexNet [5], ChexNet [6] and MobileNetV2 [7].
B.1 ResNet50_v2

ResNet50_v2 [4], which includes 48 convolution layers, 1 MaxPooling layer, and 1 Average Pooling layer, was selected as our base model. The model was initialized with ImageNet [13] weights through transfer learning. The architecture begins with a convolutional layer of 64 kernels of size 7x7 with a stride of 2, followed by a MaxPooling layer. The subsequent stages consist of bottleneck blocks of 1x1, 64, 3x3, 64, and 1x1, 256 kernels repeated three times; 1x1, 128, 3x3, 128, and 1x1, 512 kernels repeated four times; 1x1, 256, 3x3, 256, and 1x1, 1024 kernels repeated six times; and 1x1, 512, 3x3, 512, and 1x1, 2048 kernels repeated three times, totaling nine layers in the final stage. The network ends with a fully connected layer of 1000 nodes and a softmax function. Skip (shortcut) connections around each block address the vanishing gradient issue by providing alternative paths for the gradient to flow.
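A minimal transfer-learning sketch for this configuration is given below; the classification head, input size, and training settings are illustrative assumptions, since the paper does not report its exact head or hyper-parameters.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# ResNet50V2 backbone with ImageNet weights, topped with a small binary head.
base = tf.keras.applications.ResNet50V2(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False  # freeze the backbone; only the new head is trained first

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(2, activation="softmax"),  # Covid vs. Non-Covid
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```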
B.2 AlexNet

AlexNet [5] is a smaller model whose architecture uses ReLU (rectified linear unit) activation functions in the hidden layers, which helps with the vanishing gradient problem, and dropout regularization to prevent overfitting.

AlexNet has five convolutional layers, starting with 96 filters of 11×11 size, a stride of 4 pixels, and no padding. The other layers have different filter sizes, numbers of filters, and padding sizes. The model also has three fully connected layers, with the first two having 4096 neurons and the last one having 1000 neurons. The output layer uses softmax activation to predict class probabilities.

We can see that the size of the model is very small compared to the other models, and yet we are able to rectify the errors that were present in Silva P et al.'s paper [2].
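The description above corresponds to the AlexNet-style sketch below; the pooling placement and the two-class output head are simplifications and assumptions for this study's setting.

```python
from tensorflow.keras import layers, models

alexnet = models.Sequential([
    layers.Conv2D(96, 11, strides=4, activation="relu",
                  input_shape=(227, 227, 3)),                  # conv1: 96 x 11x11, stride 4
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(256, 5, padding="same", activation="relu"),  # conv2
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(384, 3, padding="same", activation="relu"),  # conv3
    layers.Conv2D(384, 3, padding="same", activation="relu"),  # conv4
    layers.Conv2D(256, 3, padding="same", activation="relu"),  # conv5
    layers.MaxPooling2D(3, strides=2),
    layers.Flatten(),
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(2, activation="softmax"),  # 2 classes here instead of the original 1000
])
```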
B.3 MobileNetV2

MobileNetV2 [7] uses a building block called the "inverted residual with linear bottleneck". This building block consists of three consecutive layers: a 1x1 convolutional layer that expands the number of channels, a depthwise separable convolutional layer that reduces the spatial dimensions, and another 1x1 convolutional layer that compresses the number of channels back to the original size.

The output of the building block is obtained by adding the input feature maps to the output feature maps, which creates a "shortcut" connection that improves the information flow through the network and helps to avoid the vanishing gradient problem.

Towards the end of the network, there is a global average pooling layer, followed by a dropout layer, and ultimately a fully connected layer that utilizes the softmax activation function.
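A sketch of this building block is shown below, following the structure described above; the expansion factor of 6 and ReLU6 activations follow the MobileNetV2 paper [7], while the exact channel counts per stage are omitted.

```python
from tensorflow.keras import layers

def inverted_residual(x, filters, expansion=6, stride=1):
    """Inverted residual block with linear bottleneck (sketch)."""
    in_ch = x.shape[-1]
    h = layers.Conv2D(expansion * in_ch, 1, padding="same", use_bias=False)(x)  # 1x1 expand
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(6.0)(h)
    h = layers.DepthwiseConv2D(3, strides=stride, padding="same",
                               use_bias=False)(h)                               # depthwise 3x3
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(6.0)(h)
    h = layers.Conv2D(filters, 1, padding="same", use_bias=False)(h)            # 1x1 linear projection
    h = layers.BatchNormalization()(h)
    if stride == 1 and in_ch == filters:
        h = layers.Add()([x, h])  # shortcut connection when shapes match
    return h
```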
B.4 ChexNet

ChexNet [6] is a convolutional neural network (CNN) architecture that was specifically designed for chest X-ray image analysis and the diagnosis of 14 different chest diseases. The ChexNet architecture is built on a 121-layer DenseNet (DenseNet-121) backbone [6]. Due to its significant results on X-ray images, we decided to test it on CT-scans as well.

ChexNet's 121-layer architecture uses dense skip connections, which allow for better information flow through the network and faster convergence during training. The bulk of the architecture consists of a series of densely connected blocks, in which each layer receives the feature maps of all preceding layers as input; these blocks help ChexNet extract features at different scales and levels of abstraction.
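A CheXNet-style model can be sketched as below, using the 121-layer DenseNet backbone described in the original paper [6]; the two-class head replaces the 14-label output of the original, and the ImageNet initialization is an assumption.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

backbone = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))

chexnet_style = models.Sequential([
    backbone,
    layers.GlobalAveragePooling2D(),
    layers.Dense(2, activation="softmax"),  # Covid vs. Non-Covid instead of 14 disease labels
])
```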
IV. RESULTS

Table 2: Training and validation results for the four models

  Model       Train Accuracy   Val Accuracy   Train Loss   Val Loss
  ResNet50    96.78%           90.59%         11.19%       21.37%
  AlexNet     91.12%           90.25%         30.81%       37.63%
  MobileNet   90.90%           82.76%         35.78%       39.45%
  ChexNet     88.70%           81.50%         47.30%       60.05%

After training all four models, ResNet50_v2 gave the most satisfactory results on the validation set without overfitting, with a validation accuracy of 94.59% and a training accuracy of 96.78%.

A. Comparison

In our research, we have evaluated the performance of four different neural network models, namely ResNet50_v2 [4], AlexNet [5], ChexNet [6] and MobileNetV2 [7]. Our primary objective was to classify a disease accurately; therefore, the accuracy of the models was of utmost importance, even though we also aimed to increase the speed of execution by using smaller models.

We found that the smaller-sized AlexNet and MobileNetV2 models resulted in lower accuracy, and thus it was necessary to use a well-generalized model. We chose ResNet50_v2 as our primary model due to its higher accuracy and the presence of skip connections that reduced complexity and prevented over-fitting.
The ResNet50_v2 model converged smoothly, with its training and validation loss curves flattening out at low values as the number of epochs increased, as demonstrated by Figure-3.

Figure-3: ResNet50_v2 Accuracy vs Epochs

In contrast, ChexNet showed quick convergence in the initial epochs with high accuracy, but then oscillated between local maxima and minima due to its high complexity and variance, as illustrated in Figure-4. Therefore, we concluded that ResNet50_v2 was the most suitable model for our study.

Figure-4: ChexNet Accuracy vs Epochs

B. Confusion Matrix

Figure-5: Confusion Matrix of ResNet50_v2

ResNet50_v2 has been identified as the most stable model as per the discussion above, and this can be seen from the confusion matrix in Figure-5. The total accuracy of the model is calculated to be 62.46%. Further analysis of the model reveals its precision, recall, and F1-score to be 0.5831, 0.6934, and 0.6335, respectively. The obtained precision and recall values indicate that the model has a higher true positive rate than false positive rate, which is promising. However, the F1-score suggests that there is scope for improvement in balancing precision and recall, which may further enhance the overall performance of the model.
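The reported F1-score is consistent with the reported precision and recall, since F1 is their harmonic mean; the short check below uses only the values quoted above.

```python
# precision = TP / (TP + FP), recall = TP / (TP + FN)
precision, recall = 0.5831, 0.6934
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.6335, matching the reported F1-score
```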
V. CONCLUSION

From this research, we can conclude that the performance of Deep Learning models on real-life images is not yet up to the mark for medical diagnostic applications. There can be many irregularities in real-life data which the model might never have seen before, leading to ambiguous results.

A. Limitations

During the research, potential challenges related to the deployment of the model in an open environment were identified. Dataset-2 [12] contains images with text or highlighted areas that may negatively affect the model's accuracy. Additionally, CT scans of patients with diseases other than COVID-19 may interfere with the model's predictions. Some images that produce a blurry output after Canny edge detection may also be misclassified. Furthermore, the model's inability to recognize patterns outside of its learned dataset may lead to incorrect predictions in the future.

B. Future advancement possibilities

This section explores potential avenues for future research and development to build on the findings presented. The literature consulted for the Canny edge detection method [8][9] suggests that an improved version of the Canny edge algorithm could potentially enhance the accuracy of the model.

Additionally, rather than solely testing the model with error-inducing data, there is the possibility of training the model on such data to enable better generalization. An ensemble of all these models may also work more efficiently than each model individually.
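As one possible direction, the soft-voting ensemble mentioned above could be sketched as follows; `trained_models` and `batch` are assumed to come from the training and pre-processing steps described earlier.

```python
import numpy as np

def ensemble_predict(trained_models, batch):
    """Average the class probabilities of several Keras models (soft voting)."""
    probs = np.mean([m.predict(batch, verbose=0) for m in trained_models], axis=0)
    return probs.argmax(axis=1)  # predicted class index per image (label order assumed)
```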

VI. REFERENCES

[1] Patel, D., Mehta, K., Menon, A., & Parveen, S. (2023). Covid-19 detection using medical images of the human thoracic region with the help of deep learning models. University of Waterloo.

[2] Silva, P., Luz, E., Silva, G., Moreira, G., Silva, R., Lucio, D., & Menotti, D. (2020). COVID-19 detection in CT images with deep learning: A voting-based scheme and cross-datasets analysis. Informatics in Medicine Unlocked, 20, 100427. doi: 10.1016/j.imu.2020.100427. Epub 2020 Sep 14. PMID: 32953971; PMCID: PMC7487744.

[3] Tan, M., & Le, Q. V. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. arXiv preprint arXiv:1905.11946.

[4] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).

[5] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84-90.

[6] Rajpurkar, P., Irvin, J., Zhu, K., Yang, B., Mehta, H., Duan, T., ... & Ng, A. Y. (2017). CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning. arXiv preprint arXiv:1711.05225.

[7] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. (2018). MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4510-4520).

[8] Hwa, S. K. T., Bade, A., & Hijazi, M. A. (2020, November). Enhanced Canny edge detection for Covid-19 and pneumonia X-Ray images. In IOP Conference Series: Materials Science and Engineering (Vol. 979, No. 1, p. 012016). IOP Publishing.

[9] Hou, S. M., Jia, C. L., Hou, M. J., Fernandes, S. L., & Guo, J. C. (2021). A study on weak edge detection of COVID-19's CT images based on histogram equalization and improved Canny algorithm. Computational and Mathematical Methods in Medicine, 2021.

[10] Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, (6), 679-698.

[11] Soares, E., Angelov, P., Biaso, S., Froes, M. H., & Abe, D. K. (2020). SARS-CoV-2 CT-scan dataset: A large dataset of real patients CT scans for SARS-CoV-2 identification. MedRxiv, 2020-04.

[12] Zhao, J., Zhang, Y., He, X., & Xie, P. (2020). COVID-CT-Dataset: A CT scan dataset about COVID-19.

[13] Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248-255). IEEE.

[14] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2818-2826).
