
First Progress Report on

Chest X-Ray Disease Classification Using Deep Learning


Submitted in partial fulfillment of the degree of
Bachelor of Technology
(Computer Science and Engineering)
Submitted by:
Gaurav Chaudhary (03815002718)
Keshav Bansal (05115002718)
Kumar Tatsat (05415002718)

Under the supervision of


Mr. Sushil Kumar

Department of Computer Science and Engineering


Maharaja Surajmal Institute of Technology
Janakpuri, New Delhi
2018-22
INDEX

1. Title page
2. Abstract (also contains problem statement)
3. Introduction
4. Literature survey
5. Research gaps
6. Objectives
7. Methodology of study
8. Work left to do
9. Limitations
10. References
Abstract
In this project, we will implement a deep learning pipeline for chest X-ray disease classification. We also aim to present a comparison of various deep learning models for this application in conjunction with our data preprocessing pipeline. In particular, we want to examine whether, given suitable preprocessing, smaller models can perform as well as some deep CNNs. For this we are using OpenCV and PyTorch.
Introduction

Every year, millions of adults are diagnosed with pulmonary pathologies such as pulmonary atelectasis, cardiomegaly, pulmonary effusion, infiltration, pulmonary mass, nodule, pneumonia, and pneumothorax [2, 3]. According to the Indian Council of Medical Research, in 2019 alone about 1.6 million people died due to COPD (chronic obstructive pulmonary disease) [24]. The COVID-19 pandemic also brought our healthcare infrastructure to its knees, with millions of people dying of breathing difficulty caused by pulmonary fibrosis (a complication of COVID-19).

CXRs (chest X-rays) are currently one of the best available methods for diagnosing pulmonary pathologies [3], playing a crucial role in clinical healthcare and epidemiological studies. However, assigning the correct pathology to an X-ray image is an involved task that relies heavily on the availability of expert radiologists.

In this project, we propose a novel deep-learning-based pipeline for classifying the aforementioned pulmonary pathologies from CXRs. While other researchers have used deep learning for this application [7], we are using a combination of preprocessing techniques, such as bone suppression [13] and HE (histogram equalization) [25], to enhance our model's capability.

Our project aims to help healthcare workers provide quality diagnosis quickly and cheaply, even in the most remote places.
Literature Survey

The application of deep convolutional neural networks to disease classification from CXRs became widely popular with the RSNA Pneumonia Detection Challenge [1] on Kaggle, after the Stanford team of Andrew Ng et al. presented their CheXNet model [7], which was based on a densely connected CNN. The CheXNet team used the ChestX-ray14 dataset, released by Wang et al. in 2017, which contains about 112,000 X-rays annotated with up to 14 different thoracic pathologies [5, 6]. Alongside the model, they published a research paper on CXR disease classification reporting its performance: CheXNet achieved an F1 score of about 0.435, significantly higher than the 0.387 achieved by Stanford radiologists. Their paper demonstrated the viability of deep learning techniques for chest X-ray disease classification.

Later, in 2020, a group from the University of Saskatchewan, along with researchers from the National Institute of Technology, Trichy, India and International Road Dynamics, Canada, published their research on COVID CXRs with a CheXNet-based deep CNN called COVID-CXNet [8]. The team also reported their base model's performance, which was comparable to CheXNet even though it had significantly fewer parameters. To achieve this, they applied various data preprocessing techniques to the CXRs, such as lung segmentation [11], CLAHE, and BEASF [16, 25]. They were also able to improve on some weaknesses of CheXNet: CheXNet was often distracted by the textual data printed on the images, a problem the team tackled by lung segmentation using U-Nets, forcing the model to focus on the required ROI. Their final model, Hierarchical Multi-class COVID-CXNet, achieved an F1 score of 0.94 for binary classification and 0.85 for three-class classification. Although the model performed very well, they pointed out a major problem: the lack of training data. Only about 7,700 images [9] were available for training a COVID detector.

Furthermore, image enhancement based on histogram equalization is especially necessary for medical images, as it increases image contrast and makes subtle non-linearities more distinguishable. There are various histogram-equalization-based enhancement methods, such as DHE, BEASF, and CLAHE. Radiologists also use manual contrast improvement to better diagnose masses and nodules.

Apart from the aforementioned techniques, researchers have also tried bone suppression [12, 13] to increase model performance. Bone suppression separates the soft tissues from the bones, removing a source of distraction for deep learning models. The authors of [12] demonstrated the efficacy of this process for detecting tuberculosis from CXRs.
Research Gaps

1. CheXNet was often found to be distracted by the textual data on X-ray images, a problem that can be solved by lung segmentation.
2. The University of Saskatchewan team used lung segmentation, BEASF, and CLAHE in their data preprocessing pipeline, but left out bone suppression.
3. Although the model presented by the University of Saskatchewan researchers performed well at detecting COVID with fewer parameters than the other models cited in their paper, they stated the lack of data as a limitation and a source of bias in their model.
4. Another group of independent researchers tried bone shadow suppression, but left out histogram equalization and lung segmentation from their data preprocessing pipeline, both of which have proved effective in increasing model performance.
Objectives

The objectives of our project are as follows:

1. Building upon the existing research on the effectiveness of data preprocessing for medical images, we want to find and implement an optimized data preprocessing pipeline that achieves lung segmentation [11, 14] using neural networks like U-Net, bone suppression using an autoencoder network, and HE using techniques like BEASF, CLAHE, and DHE.
2. To increase the model's explainability, we want to implement the Grad-CAM [21] algorithm, which would help determine whether the model is looking at the correct places.
3. To measure model performance, we use the F1 score for multi-class classification (a minimal sketch of this metric follows this list).
4. We propose an Inception-based [19] deep CNN trained from scratch for this application.
5. Along with our proposed model, we would compare the results with SqueezeNet-based and ResNet-34-based deep learning models.
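As a point of reference for objective 3, below is a minimal sketch of how the macro-averaged F1 score could be computed from model outputs; it assumes scikit-learn is available alongside PyTorch, and the function name is ours, not part of any library.

import torch
from sklearn.metrics import f1_score

def multiclass_f1(logits: torch.Tensor, targets: torch.Tensor) -> float:
    # logits:  (N, num_classes) raw scores from the classification head
    # targets: (N,) integer ground-truth labels
    preds = logits.argmax(dim=1).cpu().numpy()
    # Macro averaging weighs every pathology class equally, which matters
    # here because the class counts are highly imbalanced.
    return f1_score(targets.cpu().numpy(), preds, average="macro")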
Research Methodology

Figure 1: Proposed DL Pipeline

Our research methodology can be described in the following steps:

1. Dataset preparation:
a. We are using the NIH Chest X-ray dataset [6] for the final disease classification.
b. We are combining it with the open-source COVID-19 dataset, "COVID-19 Image Data Collection" [9], compiled by a team from IEEE.
c. For the segmentation model, we are using the Montgomery County chest X-ray set (MC) and the Shenzhen chest X-ray set from the U.S. National Library of Medicine [10].
d. For bone suppression, we are using the BSE-JSRT dataset [13].
2. Bone suppression model training: we are using an autoencoder-based model for bone suppression [22].
3. HE: for this step, we use OpenCV [18] and Python's NumPy library.
4. Lung segmentation: for this we are using a U-Net model, which has proven very effective in medical segmentation tasks [26], along with edge dilation to better preserve the lung structure.
5. In the final stage, we have chosen a deep learning model that accepts the inputs from the preprocessing pipeline together with the reshaped X-ray image for context preservation [19]. For the outputs, we have a primary output path for pulmonary pathology classification using a softmax function and a secondary output for binary classification on the COVID-19 dataset using a sigmoid function (see the sketch after this list). We primarily use an Inception architecture to keep the number of parameters low while giving attention to different levels of detail in the X-ray images.
6. Along with the Inception-based model, we will use a SqueezeNet and a ResNet-34 model to compare performance.
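To illustrate step 5, here is a rough sketch of how a shared backbone could feed the two output paths. DualHeadClassifier, backbone, and feat_dim are placeholder names for illustration, not our final implementation.

import torch.nn as nn

class DualHeadClassifier(nn.Module):
    # `backbone` and `feat_dim` are placeholders; in our pipeline the
    # backbone would be the Inception-based CNN described below.
    def __init__(self, backbone: nn.Module, feat_dim: int, num_pathologies: int = 9):
        super().__init__()
        self.backbone = backbone
        # Primary path: one score per pulmonary pathology (softmax at inference)
        self.pathology_head = nn.Linear(feat_dim, num_pathologies)
        # Secondary path: a single logit for COVID vs. non-COVID (sigmoid)
        self.covid_head = nn.Linear(feat_dim, 1)

    def forward(self, x):
        feats = self.backbone(x)  # expected shape: (N, feat_dim)
        return self.pathology_head(feats), self.covid_head(feats)

Training would pair nn.CrossEntropyLoss (softmax) on the primary head with nn.BCEWithLogitsLoss (sigmoid) on the secondary head.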
Methodology of Study

Objective-1:

1) Datasets (50% complete): We have collected the datasets required for the project; their details are given in the table below:

Name of dataset                         Size of dataset        # Data classes

CXR8                                    119,631                9
Shenzhen Lung Segmentation Dataset      662                    2
Montgomery Lung Segmentation Dataset    138                    2
JSRT Bone Suppression Dataset           39 (4,000 augmented)   2
IEEE Dataset for Covid Chest X-Rays     -                      2

Table: Dataset collection

For the CXR8 dataset, here are the details:

Name of pathology     # of images

Atelectasis           5,789
Cardiomegaly          1,010
Effusion              6,331
Infiltration          10,317
Mass                  6,046
Nodule                1,971
Pneumonia             1,062
Pneumothorax          2,793
Normal                84,312
Total                 119,631

Table: CXR classification dataset


For lung segmentation, the following datasets are used:

Dataset name                   No. of X-rays (w/ masks)

Shenzhen Hospital CXR Set      326 normal, 336 abnormal
Montgomery County CXR Set      80 normal, 58 abnormal

Table: Lung segmentation datasets

For the bone suppression dataset: 154 conventional chest radiographs with a lung nodule and 93 radiographs without a nodule were selected from 14 medical centers and digitized by a laser digitizer with a 2048 x 2048 matrix size (0.175-mm pixels) and a 12-bit gray scale. The dataset was developed by the JSRT (Japanese Society of Radiological Technology).

2) Histogram Equalization: For this we tested three algorithms (CLAHE, BEASF, and DHE). From the comparison results (figure omitted), CLAHE performs the best among the tested algorithms.
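For reference, applying CLAHE to a grayscale CXR with OpenCV takes only a few lines. The file paths, clip limit, and tile grid size below are illustrative values, not our tuned settings.

import cv2

# Load a CXR as a single-channel image; the path is a placeholder.
img = cv2.imread("cxr_sample.png", cv2.IMREAD_GRAYSCALE)

# CLAHE: contrast-limited adaptive histogram equalization on local tiles,
# which avoids the noise over-amplification of plain global HE.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)

cv2.imwrite("cxr_sample_clahe.png", enhanced)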
Objective-4: (50% complete)

1) Lung Segmentation: We have created all but one of the scripts; final testing remains. Below is the model architecture.

Lung Segmentation (U-Net) model

Layer                         Output Shape           # of Params

ModuleList: 1                 -                      -
- DoubleConv: 2-1             -                      -
-- Sequential: 3-1            [-1, 64, 512, 512]     37,696
MaxPool2d: 1-1                [-1, 64, 256, 256]     -
ModuleList: 1                 -                      -
- DoubleConv: 2-2             -                      -
-- Sequential: 3-2            [-1, 128, 256, 256]    221,696
MaxPool2d: 1-2                [-1, 128, 128, 128]    -
ModuleList: 1                 -                      -
- DoubleConv: 2-3             -                      -
-- Sequential: 3-3            [-1, 256, 128, 128]    885,760
MaxPool2d: 1-3                [-1, 256, 64, 64]      -
ModuleList: 1                 -                      -
- DoubleConv: 2-4             -                      -
-- Sequential: 3-4            [-1, 512, 64, 64]      3,540,992
MaxPool2d: 1-4                [-1, 512, 32, 32]      -
DoubleConv: 1-5               -                      -
- Sequential: 2-5             -                      -
-- Conv2d: 3-5                -                      4,718,592
-- BatchNorm2d: 3-6           -                      2,048
-- ReLU: 3-7                  -                      -
-- Conv2d: 3-8                -                      9,437,184
-- BatchNorm2d: 3-9           -                      2,048
-- ReLU: 3-10                 [-1, 1024, 32, 32]     -
ModuleList: 1                 -                      -
- ConvTranspose2d: 2-6        -                      2,097,664
- DoubleConv: 2-7             -                      -
-- Sequential: 3-11           [-1, 512, 64, 64]      7,079,936
- ConvTranspose2d: 2-8        -                      524,544
- DoubleConv: 2-9             -                      -
-- Sequential: 3-12           [-1, 256, 128, 128]    1,770,496
- ConvTranspose2d: 2-10       -                      131,200
- DoubleConv: 2-11            -                      -
-- Sequential: 3-13           [-1, 128, 256, 256]    442,880
- ConvTranspose2d: 2-12       -                      32,832
- DoubleConv: 2-13            -                      -
-- Sequential: 3-14           [-1, 64, 512, 512]     110,848
Conv2d: 1-6                   [-1, 1, 512, 512]      65

Total trainable parameters                           31,036,481
Total parameters                                     31,036,481

While testing the model with just one image pair, we got these results (output figure omitted).
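The DoubleConv unit that recurs in the table above is the standard U-Net building block. The sketch below is consistent with the listed parameter counts (two bias-free 3x3 convolutions, each followed by batch normalization and ReLU), though it is an illustration rather than our exact code.

import torch.nn as nn

class DoubleConv(nn.Module):
    # Two (Conv2d -> BatchNorm2d -> ReLU) stages, as listed in the table.
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

For example, DoubleConv(1, 64) reproduces the 37,696 parameters of the first Sequential row in the table.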
2) Bone Suppression: We have created all but one of the scripts for model testing. Below is the
model architecture:

Bone Suppression (AE) model

Layer                         Output Shape           Param #

Sequential: 1-1 (encoder)     [-1, 64, 64, 64]       -
- Conv2d: 2-1                 [-1, 16, 512, 512]     144
- LeakyReLU: 2-2              -                      -
- MaxPool2d: 2-3              -                      -
- Conv2d: 2-4                 -                      4,608
- BatchNorm2d: 2-5            -                      64
- LeakyReLU: 2-6              -                      -
- MaxPool2d: 2-7              -                      -
- Conv2d: 2-8                 -                      18,432
- BatchNorm2d: 2-9            -                      128
- LeakyReLU: 2-10             -                      -
- MaxPool2d: 2-11             [-1, 64, 64, 64]       -

Sequential: 1-2 (decoder)     [-1, 1, 513, 513]      -
- ConvTranspose2d: 2-12       [-1, 32, 129, 129]     18,464
- LeakyReLU: 2-13             -                      -
- ConvTranspose2d: 2-14       [-1, 16, 257, 257]     4,624
- LeakyReLU: 2-15             -                      -
- ConvTranspose2d: 2-16       [-1, 1, 513, 513]      145
- Tanh: 2-17                  -                      -

Total trainable parameters                           46,609
Total parameters                                     46,609
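For illustration, a compact encoder-decoder consistent with the channel progression in the table (1 -> 16 -> 32 -> 64 down, mirrored back up) could look like the sketch below. The exact strides, paddings, and activation slopes are assumptions, so the intermediate spatial sizes may differ slightly from the table.

import torch.nn as nn

class BoneSuppressionAE(nn.Module):
    # Encoder-decoder sketch; channel widths follow the table above.
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1, bias=False), nn.LeakyReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1, bias=False), nn.BatchNorm2d(32),
            nn.LeakyReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1, bias=False), nn.BatchNorm2d(64),
            nn.LeakyReLU(), nn.MaxPool2d(2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3, stride=2), nn.LeakyReLU(),
            nn.ConvTranspose2d(32, 16, 3, stride=2), nn.LeakyReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2), nn.Tanh(),  # soft-tissue image
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))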


3) Classification: We have created all but two of the scripts: one for training the models and one for saving the training checkpoints. Below are the model architectures:

a) Inception-based model: The model shown here is not final, as one of the fully connected layers currently accounts for about 96% of the model's parameters. We are working on updating this architecture.

Inception Based model

Layer                         Output Shape           Param #

conv_block: 1-1               [-1, 32, 256, 256]     -
- Conv2d: 2-1                 [-1, 32, 256, 256]     1,600
- BatchNorm2d: 2-2            [-1, 32, 256, 256]     64
- LeakyReLU: 2-3              [-1, 32, 256, 256]     -
MaxPool2d: 1-2                [-1, 32, 128, 128]     -
conv_block: 1-3               [-1, 64, 128, 128]     -
- Conv2d: 2-4                 [-1, 64, 128, 128]     18,496
- BatchNorm2d: 2-5            [-1, 64, 128, 128]     128
- LeakyReLU: 2-6              [-1, 64, 128, 128]     -
MaxPool2d: 1-4                [-1, 64, 64, 64]       -
Inception_block: 1-5          [-1, 112, 64, 64]      -
- conv_block: 2-7             [-1, 26, 64, 64]       -
-- Conv2d: 3-1                [-1, 26, 64, 64]       1,690
-- BatchNorm2d: 3-2           [-1, 26, 64, 64]       52
-- LeakyReLU: 3-3             [-1, 26, 64, 64]       -
- Sequential: 2-8             [-1, 46, 64, 64]       -
-- conv_block: 3-4            [-1, 32, 64, 64]       2,144
-- conv_block: 3-5            [-1, 46, 64, 64]       13,386
- Sequential: 2-9             [-1, 20, 64, 64]       -
-- conv_block: 3-6            [-1, 16, 64, 64]       1,072
-- conv_block: 3-7            [-1, 20, 64, 64]       8,060
- Sequential: 2-10            [-1, 20, 64, 64]       -
-- MaxPool2d: 3-8             [-1, 64, 64, 64]       -
-- conv_block: 3-9            [-1, 20, 64, 64]       1,340
MaxPool2d: 1-6                [-1, 112, 32, 32]      -
Inception_block: 1-7          [-1, 192, 32, 32]      -
- conv_block: 2-11            [-1, 51, 32, 32]       -
-- Conv2d: 3-10               [-1, 51, 32, 32]       5,763
-- BatchNorm2d: 3-11          [-1, 51, 32, 32]       102
-- LeakyReLU: 3-12            [-1, 51, 32, 32]       -
- Sequential: 2-12            [-1, 77, 32, 32]       -
-- conv_block: 3-13           [-1, 51, 32, 32]       5,865
-- conv_block: 3-14           [-1, 77, 32, 32]       35,574
- Sequential: 2-13            [-1, 38, 32, 32]       -
-- conv_block: 3-15           [-1, 13, 32, 32]       1,495
-- conv_block: 3-16           [-1, 38, 32, 32]       12,464
- Sequential: 2-14            [-1, 26, 32, 32]       -
-- MaxPool2d: 3-17            [-1, 112, 32, 32]      -
-- conv_block: 3-18           [-1, 26, 32, 32]       2,990
Inception_block: 1-8          [-1, 268, 32, 32]      -
- conv_block: 2-15            [-1, 102, 32, 32]      -
-- Conv2d: 3-19               [-1, 102, 32, 32]      19,686
-- BatchNorm2d: 3-20          [-1, 102, 32, 32]      204
-- LeakyReLU: 3-21            [-1, 102, 32, 32]      -
- Sequential: 2-16            [-1, 110, 32, 32]      -
-- conv_block: 3-22           [-1, 44, 32, 32]       8,580
-- conv_block: 3-23           [-1, 110, 32, 32]      43,890
- Sequential: 2-17            [-1, 24, 32, 32]       -
-- conv_block: 3-24           [-1, 8, 32, 32]        1,560
-- conv_block: 3-25           [-1, 24, 32, 32]       4,872
- Sequential: 2-18            [-1, 32, 32, 32]       -
-- MaxPool2d: 3-26            [-1, 192, 32, 32]      -
-- conv_block: 3-27           [-1, 32, 32, 32]       6,240
Inception_block: 1-9          [-1, 370, 32, 32]      -
- conv_block: 2-19            [-1, 110, 32, 32]      -
-- Conv2d: 3-28               [-1, 110, 32, 32]      29,590
-- BatchNorm2d: 3-29          [-1, 110, 32, 32]      220
-- LeakyReLU: 3-30            [-1, 110, 32, 32]      -
- Sequential: 2-20            [-1, 180, 32, 32]      -
-- conv_block: 3-31           [-1, 90, 32, 32]       24,390
-- conv_block: 3-32           [-1, 180, 32, 32]      146,340
- Sequential: 2-21            [-1, 40, 32, 32]       -
-- conv_block: 3-33           [-1, 20, 32, 32]       5,420
-- conv_block: 3-34           [-1, 40, 32, 32]       20,120
- Sequential: 2-22            [-1, 40, 32, 32]       -
-- MaxPool2d: 3-35            [-1, 268, 32, 32]      -
-- conv_block: 3-36           [-1, 40, 32, 32]       10,840
MaxPool2d: 1-10               [-1, 370, 16, 16]      -
Inception_block: 1-11         [-1, 832, 16, 16]      -
- conv_block: 2-23            [-1, 256, 16, 16]      -
-- Conv2d: 3-37               [-1, 256, 16, 16]      94,976
-- BatchNorm2d: 3-38          [-1, 256, 16, 16]      512
-- LeakyReLU: 3-39            [-1, 256, 16, 16]      -
- Sequential: 2-24            [-1, 320, 16, 16]      -
-- conv_block: 3-40           [-1, 160, 16, 16]      59,680
-- conv_block: 3-41           [-1, 320, 16, 16]      461,760
- Sequential: 2-25            [-1, 128, 16, 16]      -
-- conv_block: 3-42           [-1, 32, 16, 16]       11,936
-- conv_block: 3-43           [-1, 128, 16, 16]      102,784
- Sequential: 2-26            [-1, 128, 16, 16]      -
-- MaxPool2d: 3-44            [-1, 370, 16, 16]      -
-- conv_block: 3-45           [-1, 128, 16, 16]      47,744
AvgPool2d: 1-12               [-1, 832, 10, 10]      -
Dropout: 1-13                 [-1, 83200]            -
Linear: 1-14                  [-1, 384]              31,949,184
Dropout: 1-15                 [-1, 384]              -
Linear: 1-16                  [-1, 128]              49,280
Dropout: 1-17                 [-1, 128]              -
Linear: 1-18                  [-1, 9]                1,161

Total trainable parameters                           33,213,254
Total parameters                                     33,213,254
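The conv_block and Inception_block units above follow the GoogLeNet pattern [19] of four parallel branches concatenated along the channel axis. The sketch below illustrates that structure with the branch widths passed in as arguments; it is an illustration consistent with the table, not our exact code.

import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    # The `conv_block` in the tables: Conv2d -> BatchNorm2d -> LeakyReLU.
    def __init__(self, in_ch, out_ch, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, **kwargs)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.LeakyReLU()  # negative slope is an assumption

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class InceptionBlock(nn.Module):
    # Four parallel branches (1x1, 1x1->3x3, 1x1->5x5, pool->1x1) whose
    # outputs are concatenated along the channel axis.
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = ConvBlock(in_ch, c1, kernel_size=1)
        self.b2 = nn.Sequential(
            ConvBlock(in_ch, c3_red, kernel_size=1),
            ConvBlock(c3_red, c3, kernel_size=3, padding=1))
        self.b3 = nn.Sequential(
            ConvBlock(in_ch, c5_red, kernel_size=1),
            ConvBlock(c5_red, c5, kernel_size=5, padding=2))
        self.b4 = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            ConvBlock(in_ch, pool_proj, kernel_size=1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

# Example: InceptionBlock(64, 26, 32, 46, 16, 20, 20) reproduces the
# 112-channel output and the parameter counts of Inception_block 1-5 above.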


b) ResNet-34-based model: Below is the model's architecture:

ResNet-34 Based model

Layer                          Output Shape           Param #

Conv2d: 1-1                    [-1, 32, 256, 256]     1,568
BatchNorm2d: 1-2               [-1, 32, 256, 256]     64
LeakyReLU: 1-3                 [-1, 32, 256, 256]     -
MaxPool2d: 1-4                 [-1, 32, 128, 128]     -

Each residual block below is the sequence Conv2d (1x1 reduce), BatchNorm2d, LeakyReLU, Conv2d (3x3), BatchNorm2d, LeakyReLU, Conv2d (1x1 expand), BatchNorm2d, LeakyReLU; the first block of each stage adds a projection-shortcut Sequential (1x1 Conv2d + BatchNorm2d). Parameter-free LeakyReLU rows are omitted for brevity.

Sequential: 1-5 (stage 1)      [-1, 64, 128, 128]     -
- block: 2-1 (32 -> 64, with shortcut)
-- Conv2d: 3-1                 [-1, 32, 128, 128]     1,024
-- BatchNorm2d: 3-2            [-1, 32, 128, 128]     64
-- Conv2d: 3-4                 [-1, 32, 128, 128]     9,216
-- BatchNorm2d: 3-5            [-1, 32, 128, 128]     64
-- Conv2d: 3-7                 [-1, 64, 128, 128]     2,048
-- BatchNorm2d: 3-8            [-1, 64, 128, 128]     128
-- Sequential: 3-9 (shortcut)  [-1, 64, 128, 128]     2,176
- block: 2-2 (64 -> 64)
-- Conv2d: 3-11                [-1, 32, 128, 128]     2,048
-- BatchNorm2d: 3-12           [-1, 32, 128, 128]     64
-- Conv2d: 3-14                [-1, 32, 128, 128]     9,216
-- BatchNorm2d: 3-15           [-1, 32, 128, 128]     64
-- Conv2d: 3-17                [-1, 64, 128, 128]     2,048
-- BatchNorm2d: 3-18           [-1, 64, 128, 128]     128

Sequential: 1-6 (stage 2)      [-1, 128, 64, 64]      -
- block: 2-3 (64 -> 128, stride 2, with shortcut)
-- Conv2d: 3-20                [-1, 64, 128, 128]     4,096
-- BatchNorm2d: 3-21           [-1, 64, 128, 128]     128
-- Conv2d: 3-23                [-1, 64, 64, 64]       36,864
-- BatchNorm2d: 3-24           [-1, 64, 64, 64]       128
-- Conv2d: 3-26                [-1, 128, 64, 64]      8,192
-- BatchNorm2d: 3-27           [-1, 128, 64, 64]      256
-- Sequential: 3-28 (shortcut) [-1, 128, 64, 64]      8,448
- block: 2-4 (128 -> 128)
-- Conv2d: 3-30                [-1, 64, 64, 64]       8,192
-- BatchNorm2d: 3-31           [-1, 64, 64, 64]       128
-- Conv2d: 3-33                [-1, 64, 64, 64]       36,864
-- BatchNorm2d: 3-34           [-1, 64, 64, 64]       128
-- Conv2d: 3-36                [-1, 128, 64, 64]      8,192
-- BatchNorm2d: 3-37           [-1, 128, 64, 64]      256
- block: 2-5 (128 -> 128): same layers and parameter counts as block 2-4 (layers 3-39 to 3-47)

Sequential: 1-7 (stage 3)      [-1, 256, 32, 32]      -
- block: 2-6 (128 -> 256, stride 2, with shortcut)
-- Conv2d: 3-48                [-1, 128, 64, 64]      16,384
-- BatchNorm2d: 3-49           [-1, 128, 64, 64]      256
-- Conv2d: 3-51                [-1, 128, 32, 32]      147,456
-- BatchNorm2d: 3-52           [-1, 128, 32, 32]      256
-- Conv2d: 3-54                [-1, 256, 32, 32]      32,768
-- BatchNorm2d: 3-55           [-1, 256, 32, 32]      512
-- Sequential: 3-56 (shortcut) [-1, 256, 32, 32]      33,280
- block: 2-7 (256 -> 256)
-- Conv2d: 3-58                [-1, 128, 32, 32]      32,768
-- BatchNorm2d: 3-59           [-1, 128, 32, 32]      256
-- Conv2d: 3-61                [-1, 128, 32, 32]      147,456
-- BatchNorm2d: 3-62           [-1, 128, 32, 32]      256
-- Conv2d: 3-64                [-1, 256, 32, 32]      32,768
-- BatchNorm2d: 3-65           [-1, 256, 32, 32]      512
- blocks: 2-8 and 2-9 (256 -> 256): same layers and parameter counts as block 2-7 (layers 3-67 to 3-84)

Sequential: 1-8 (stage 4)      [-1, 512, 16, 16]      -
- block: 2-10 (256 -> 512, stride 2, with shortcut)
-- Conv2d: 3-85                [-1, 256, 32, 32]      65,536
-- BatchNorm2d: 3-86           [-1, 256, 32, 32]      512
-- Conv2d: 3-88                [-1, 256, 16, 16]      589,824
-- BatchNorm2d: 3-89           [-1, 256, 16, 16]      512
-- Conv2d: 3-91                [-1, 512, 16, 16]      131,072
-- BatchNorm2d: 3-92           [-1, 512, 16, 16]      1,024
-- Sequential: 3-93 (shortcut) [-1, 512, 16, 16]      132,096
- block: 2-11 (512 -> 512)
-- Conv2d: 3-95                [-1, 256, 16, 16]      131,072
-- BatchNorm2d: 3-96           [-1, 256, 16, 16]      512
-- Conv2d: 3-98                [-1, 256, 16, 16]      589,824
-- BatchNorm2d: 3-99           [-1, 256, 16, 16]      512
-- Conv2d: 3-101               [-1, 512, 16, 16]      131,072
-- BatchNorm2d: 3-102          [-1, 512, 16, 16]      1,024

AdaptiveAvgPool2d: 1-9         [-1, 512, 1, 1]        -
Linear: 1-10                   [-1, 128]              65,664
Linear: 1-11                   [-1, 64]               8,256
Linear: 1-12                   [-1, 9]                585

Total trainable parameters                            2,917,609
Total parameters                                      2,917,609
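The block units above follow a bottleneck residual pattern: 1x1 reduce, 3x3, 1x1 expand, plus a projection shortcut when the shape changes. As a consistency check, the illustrative sketch below instantiated as ResidualBlock(64, 64, 128, stride=2) reproduces the parameter counts of block 2-3; it is a sketch, not our exact code.

import torch.nn as nn

class ResidualBlock(nn.Module):
    # Bottleneck residual block: 1x1 -> 3x3 -> 1x1, each with BatchNorm,
    # added to an identity or 1x1-projection shortcut.
    def __init__(self, in_ch, mid_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.LeakyReLU(),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.LeakyReLU(),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        if stride != 1 or in_ch != out_ch:
            # Projection shortcut (the shortcut `Sequential` rows in the table)
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch))
        else:
            self.shortcut = nn.Identity()
        self.act = nn.LeakyReLU()

    def forward(self, x):
        return self.act(self.body(x) + self.shortcut(x))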

c) SqueezeNet-based model: Below is the model's architecture:

SqueezeNet Based model

Layer                          Output Shape           Param #

Conv2d: 1-1                    [-1, 32, 512, 512]     288
BatchNorm2d: 1-2               [-1, 32, 512, 512]     64
LeakyReLU: 1-3                 [-1, 32, 512, 512]     -
MaxPool2d: 1-4                 [-1, 32, 256, 256]     -

Fire: 1-5                      [-1, 64, 256, 256]     -
- Conv2d: 2-1 (squeeze)        [-1, 16, 256, 256]     512
- BatchNorm2d: 2-2             [-1, 16, 256, 256]     32
- LeakyReLU: 2-3               [-1, 16, 256, 256]     -
- Conv2d: 2-4 (expand 1x1)     [-1, 32, 256, 256]     512
- BatchNorm2d: 2-5             [-1, 32, 256, 256]     64
- Conv2d: 2-6 (expand 3x3)     [-1, 32, 256, 256]     4,608
- BatchNorm2d: 2-7             [-1, 32, 256, 256]     64
- LeakyReLU: 2-8               [-1, 64, 256, 256]     -

Fire: 1-6                      [-1, 128, 256, 256]    -
- Conv2d: 2-9 (squeeze)        [-1, 16, 256, 256]     1,024
- BatchNorm2d: 2-10            [-1, 16, 256, 256]     32
- LeakyReLU: 2-11              [-1, 16, 256, 256]     -
- Conv2d: 2-12 (expand 1x1)    [-1, 64, 256, 256]     1,024
- BatchNorm2d: 2-13            [-1, 64, 256, 256]     128
- Conv2d: 2-14 (expand 3x3)    [-1, 64, 256, 256]     9,216
- BatchNorm2d: 2-15            [-1, 64, 256, 256]     128
- LeakyReLU: 2-16              [-1, 128, 256, 256]    -

Fire: 1-7                      [-1, 256, 256, 256]    -
- Conv2d: 2-17 (squeeze)       [-1, 32, 256, 256]     4,096
- BatchNorm2d: 2-18            [-1, 32, 256, 256]     64
- LeakyReLU: 2-19              [-1, 32, 256, 256]     -
- Conv2d: 2-20 (expand 1x1)    [-1, 128, 256, 256]    4,096
- BatchNorm2d: 2-21            [-1, 128, 256, 256]    256
- Conv2d: 2-22 (expand 3x3)    [-1, 128, 256, 256]    36,864
- BatchNorm2d: 2-23            [-1, 128, 256, 256]    256
- LeakyReLU: 2-24              [-1, 256, 256, 256]    -

MaxPool2d: 1-8                 [-1, 256, 128, 128]    -

Fire: 1-9                      [-1, 256, 128, 128]    -
- Conv2d: 2-25 (squeeze)       [-1, 32, 128, 128]     8,192
- BatchNorm2d: 2-26            [-1, 32, 128, 128]     64
- LeakyReLU: 2-27              [-1, 32, 128, 128]     -
- Conv2d: 2-28 (expand 1x1)    [-1, 128, 128, 128]    4,096
- BatchNorm2d: 2-29            [-1, 128, 128, 128]    256
- Conv2d: 2-30 (expand 3x3)    [-1, 128, 128, 128]    36,864
- BatchNorm2d: 2-31            [-1, 128, 128, 128]    256
- LeakyReLU: 2-32              [-1, 256, 128, 128]    -

Fire: 1-10                     [-1, 384, 128, 128]    -
- Conv2d: 2-33 (squeeze)       [-1, 48, 128, 128]     12,288
- BatchNorm2d: 2-34            [-1, 48, 128, 128]     96
- LeakyReLU: 2-35              [-1, 48, 128, 128]     -
- Conv2d: 2-36 (expand 1x1)    [-1, 192, 128, 128]    9,216
- BatchNorm2d: 2-37            [-1, 192, 128, 128]    384
- Conv2d: 2-38 (expand 3x3)    [-1, 192, 128, 128]    82,944
- BatchNorm2d: 2-39            [-1, 192, 128, 128]    384
- LeakyReLU: 2-40              [-1, 384, 128, 128]    -

Fire: 1-11                     [-1, 384, 128, 128]    -
- Conv2d: 2-41 (squeeze)       [-1, 48, 128, 128]     18,432
- BatchNorm2d: 2-42            [-1, 48, 128, 128]     96
- LeakyReLU: 2-43              [-1, 48, 128, 128]     -
- Conv2d: 2-44 (expand 1x1)    [-1, 192, 128, 128]    9,216
- BatchNorm2d: 2-45            [-1, 192, 128, 128]    384
- Conv2d: 2-46 (expand 3x3)    [-1, 192, 128, 128]    82,944
- BatchNorm2d: 2-47            [-1, 192, 128, 128]    384
- LeakyReLU: 2-48              [-1, 384, 128, 128]    -

Fire: 1-12                     [-1, 512, 128, 128]    -
- Conv2d: 2-49 (squeeze)       [-1, 64, 128, 128]     24,576
- BatchNorm2d: 2-50            [-1, 64, 128, 128]     128
- LeakyReLU: 2-51              [-1, 64, 128, 128]     -
- Conv2d: 2-52 (expand 1x1)    [-1, 256, 128, 128]    16,384
- BatchNorm2d: 2-53            [-1, 256, 128, 128]    512
- Conv2d: 2-54 (expand 3x3)    [-1, 256, 128, 128]    147,456
- BatchNorm2d: 2-55            [-1, 256, 128, 128]    512
- LeakyReLU: 2-56              [-1, 512, 128, 128]    -

MaxPool2d: 1-13                [-1, 512, 64, 64]      -

Fire: 1-14                     [-1, 512, 64, 64]      -
- Conv2d: 2-57 (squeeze)       [-1, 64, 64, 64]       32,768
- BatchNorm2d: 2-58            [-1, 64, 64, 64]       128
- LeakyReLU: 2-59              [-1, 64, 64, 64]       -
- Conv2d: 2-60 (expand 1x1)    [-1, 256, 64, 64]      16,384
- BatchNorm2d: 2-61            [-1, 256, 64, 64]      512
- Conv2d: 2-62 (expand 3x3)    [-1, 256, 64, 64]      147,456
- BatchNorm2d: 2-63            [-1, 256, 64, 64]      512
- LeakyReLU: 2-64              [-1, 512, 64, 64]      -

Conv2d: 1-15                   [-1, 10, 64, 64]       5,120
AvgPool2d: 1-16                [-1, 10, 16, 16]       -
Dropout: 1-17                  [-1, 2560]             -
Linear: 1-18                   [-1, 384]              983,424
Dropout: 1-19                  [-1, 384]              -
Linear: 1-20                   [-1, 128]              49,280
Dropout: 1-21                  [-1, 128]              -
Linear: 1-22                   [-1, 9]                1,161

Total trainable parameters                            1,756,137
Total parameters                                      1,756,137
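The Fire modules above follow the SqueezeNet design: a 1x1 squeeze convolution feeding parallel 1x1 and 3x3 expand convolutions whose outputs are concatenated. A sketch consistent with the table's layer ordering is shown below; the channel arguments are examples, and activation slopes are assumptions.

import torch
import torch.nn as nn

class Fire(nn.Module):
    # 1x1 squeeze -> parallel (1x1, 3x3) expands -> channel concatenation.
    def __init__(self, in_ch, squeeze_ch, expand_ch):
        super().__init__()
        self.squeeze = nn.Sequential(
            nn.Conv2d(in_ch, squeeze_ch, 1, bias=False),
            nn.BatchNorm2d(squeeze_ch),
            nn.LeakyReLU(),
        )
        self.expand1x1 = nn.Sequential(
            nn.Conv2d(squeeze_ch, expand_ch, 1, bias=False),
            nn.BatchNorm2d(expand_ch),
        )
        self.expand3x3 = nn.Sequential(
            nn.Conv2d(squeeze_ch, expand_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(expand_ch),
        )
        self.act = nn.LeakyReLU()  # applied once after concatenation

    def forward(self, x):
        s = self.squeeze(x)
        return self.act(torch.cat([self.expand1x1(s), self.expand3x3(s)], dim=1))

# Example: Fire(32, 16, 32) reproduces the parameter counts and the
# 64-channel output of Fire 1-5 above.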


Work left to do:

1) Implement the Grad-CAM algorithm (a sketch of the hook-based approach appears after this list).
2) Train the individual modules.
3) Get the scores for their individual performance.
4) Orchestrate all the modules together to build the pipeline.
5) Fine-tune the final pipeline.
6) Check the performance of the three presented deep CNNs for pathology detection.
7) Check the performance of the above-mentioned models for COVID detection.
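For item 1, below is an outline of the hook-based Grad-CAM computation [21] we intend to implement. It assumes a model that returns a single logits tensor; all names are placeholders, and this is a sketch rather than our final implementation.

import torch

def grad_cam(model, x, target_layer, class_idx):
    # `target_layer` is a convolutional nn.Module inside `model`.
    feats, grads = {}, {}
    fh = target_layer.register_forward_hook(
        lambda module, inputs, output: feats.update(a=output))
    bh = target_layer.register_full_backward_hook(
        lambda module, grad_in, grad_out: grads.update(g=grad_out[0]))

    logits = model(x)                # forward pass records the activations
    logits[0, class_idx].backward()  # backward pass records the gradients
    fh.remove()
    bh.remove()

    # Channel weights: global-average-pool the gradients over H and W
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)
    cam = torch.relu((weights * feats["a"]).sum(dim=1))  # weighted sum + ReLU
    return cam / cam.max().clamp(min=1e-8)               # normalized heatmap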

Limitations:

1) Our model's performance would mainly be limited by the lack of quality data for the bone suppression module.
2) Another limitation is the computational power required to train the model efficiently.
3) We are currently fixing the input size of the modules at 512x512; with more computational capacity, we would be able to increase the input image size and hence the amount of information the model has to work with.
Gantt chart
References
[1] “RSNA Pneumonia Detection Challenge,” Kaggle.com. [Online]. Available: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. [Accessed: 04-Sep-2021].

[2] Cdc.gov. [Online]. Available: https://www.cdc.gov/nchs/data/nhamcs/web_tables/2015_ed_web_tables.pdf#%5B%7B%22num%22%3A80%2C%22gen%22%3A0%7D%2C%7B%22name%22%3A%22XYZ%22%7D%2C-89%2C777%2C0.930573%5D. [Accessed: 04-Sep-2021].

[3] Cdc.gov. [Online]. Available: https://www.cdc.gov/nchs/data/nvsr/nvsr66/nvsr66_06_tables.pdf#%5B%7B%22num%22%3A109%2C%22gen%22%3A0%7D%2C%7B%22name%22%3A%22FitH%22%7D%2C554%5D. [Accessed: 04-Sep-2021].

[4] T. Franquet, “Imaging of community-acquired pneumonia,” J. Thorac. Imaging, vol. 33, no. 5, pp. 282–294, 2018.

[5] B. Kelly, “The chest radiograph,” Ulster Med. J., vol. 81, no. 3, pp. 143–148, 2012.

[6] X. Wang, Y. Peng, L. Lu, Z. Lu, M. Bagheri, and R. M. Summers, “ChestX-Ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[7] P. Rajpurkar et al., “CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning,” arXiv [cs.CV], 2017.

[8] A. Haghanifar, M. M. Majdabadi, Y. Choi, S. Deivalakshmi, and S. Ko, “COVID-CXNet: Detecting COVID-19 in frontal chest X-ray images using deep learning,” arXiv [eess.IV], 2020.

[9] J. P. Cohen, P. Morrison, L. Dao, K. Roth, T. Q. Duong, and M. Ghassemi, “COVID-19 Image Data Collection: Prospective predictions are the future,” arXiv [q-bio.QM], 2020.

[10] S. Jaeger, S. Candemir, S. Antani, Y.-X. J. Wáng, P.-X. Lu, and G. Thoma, “Two public chest X-ray datasets for computer-aided screening of pulmonary diseases,” Quant. Imaging Med. Surg., vol. 4, no. 6, pp. 475–477, 2014.

[11] S. Rajaraman, L. Folio, J. Dimperio, P. Alderson, and S. Antani, “Improved semantic segmentation of tuberculosis-consistent findings in chest X-rays using augmented training of modality-specific U-Net models with weak localizations,” arXiv [cs.CV], 2021.

[12] S. Rajaraman, G. Zamzmi, L. Folio, P. Alderson, and S. Antani, “Chest X-ray bone suppression for improving classification of tuberculosis-consistent findings,” Diagnostics (Basel), vol. 11, no. 5, p. 840, 2021.

[13] R. Tanaka, S. Sanada, M. Oda, M. Suzuki, K. Sakuta, and H. Kawashima, “Improved accuracy of image guided radiation therapy (IMRT) based on bone suppression technique,” in 2013 IEEE Nuclear Science Symposium and Medical Imaging Conference (2013 NSS/MIC), 2013, pp. 1–2.

[14] Y. Gordienko et al., “Deep learning with lung segmentation and bone shadow exclusion techniques for chest X-ray analysis of lung cancer,” arXiv [cs.LG], 2017.

[15] M. Gusarev, R. Kuleev, A. Khan, A. Ramirez Rivera, and A. M. Khattak, “Deep learning models for bone suppression in chest radiographs,” in 2017 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2017, pp. 1–7.

[16] J. Ma, X. Fan, S. X. Yang, X. Zhang, and X. Zhu, “Contrast limited adaptive histogram equalization based fusion for underwater image enhancement,” Preprints, 2017.

[17] A. M. Reza, “Realization of the contrast limited adaptive histogram equalization (CLAHE) for real-time image enhancement,” J. VLSI Sign. Process. Syst. Sign. Image Video Technol., vol. 38, no. 1, pp. 35–44, 2004.

[18] I. Culjak, D. Abram, T. Pribanic, H. Dzapo, and M. Cifrek, “A brief introduction to OpenCV,” in 2012 Proceedings of the 35th International Convention MIPRO, 2012, pp. 1725–1730.

[19] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

[20] A. Paszke et al., “PyTorch: An imperative style, high-performance deep learning library,” Advances in Neural Information Processing Systems, vol. 32, 2019.

[21] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual explanations from deep networks via gradient-based localization,” arXiv [cs.CV], 2016.

[22] D. Y. Oh and I. D. Yun, “Learning bone suppression from dual energy chest X-rays using adversarial networks,” arXiv [cs.CV], 2018.

[23] K. K. Bressem, L. C. Adams, C. Erxleben, B. Hamm, S. M. Niehues, and J. L. Vahldiek, “Comparing different deep learning architectures for classification of chest radiographs,” Sci. Rep., vol. 10, no. 1, p. 13590, 2020.

[24] “India: air pollution deaths by type 2019,” Statista.com. [Online]. Available: https://www.statista.com/statistics/1194824/india-air-pollution-deaths-by-type/. [Accessed: 04-Sep-2021].

[25] Nithyananda C R, Ramachandra A C, and Preethi, “Survey on Histogram Equalization method based Image Enhancement techniques,” in 2016 International Conference on Data Mining and Advanced Computing (SAPIENCE), 2016, pp. 150–158.

[26] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” arXiv [cs.CV], 2015.
