
Segmentation of Acute Ischemic Stroke Lesion

from Non-Contrast CT Scans


Manas Kumar Nag

Central University of Rajasthan


Anup Kumar Sadhu
Medical College and Hospitals Campus
Samiran Das
Helmholtz Institute Freiberg, Helmholtz Zentrum Dresden Rossendorf

Research Article

Keywords: Non-Contrast CT, Ischemic Stroke Lesion Analysis, Lesion Segmentation, Attention,
Convolution

Posted Date: March 22nd, 2024

DOI: https://doi.org/10.21203/rs.3.rs-4131026/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Additional Declarations: No competing interests reported.


Segmentation of Acute Ischemic Stroke Lesion
from Non-Contrast CT Scans
Manas Kumar Nag1*, Anup Kumar Sadhu2 and Samiran Das3
1* Department of Biomedical Engineering, Central University of
Rajasthan, Bandar Sindri, Ajmer, 305817, Rajasthan, India.
2 EKO Diagnostics, Medical College and Hospitals Campus, Kolkata,

700073, West Bengal, India.


3 Helmholtz Institute Freiberg, Helmholtz Zentrum Dresden Rossendorf,

Freiberg, 09599, Saxony, Germany.

*Corresponding author(s). E-mail(s): manas.nag@curaj.ac.in;

Abstract
This paper proposes an advanced deep learning model for detecting and segmenting
acute ischemic brain lesions from Non-Contrast CT scans. We introduce a 3D
CoAtNet model for accurately detecting hypo-intense lesions, which even
experienced clinicians often fail to detect with satisfactory accuracy.
Computer-aided detection and segmentation can therefore act as an assistive tool
for clinicians. However, most prevalent machine learning models have been unable
to obtain satisfactory detection accuracy. We propose an efficient deep learning
model based on the 3D CoAtNet, which combines the attention mechanism with
convolution. This combination helps the shallow encoder-decoder structure to perform
better feature extraction, which plays a crucial role in separating brain tissue
from ischemic stroke lesions. Dilation rates of 1, 3, and 5 were applied to each
encoder convolution block. The proposed method achieved a Dice similarity score
of 75% and a Jaccard index of 69%.

Keywords: Non-Contrast CT, Ischemic Stroke Lesion Analysis, Lesion Segmentation, Attention, Convolution

1 Introduction
Stroke has emerged as a major cause of long-term disability and mortality [1]. Accord-
ing to recent statistics [2], ischemic and hemorrhagic strokes constitute 85% and
15% of all stroke cases, respectively. Ischemic stroke occurs when the flow of oxygenated
blood to the brain is restricted by a clot in a blood vessel. Magnetic Resonance
Imaging (MRI) is a widely used non-invasive imaging technique for detecting ischemic
stroke. Although MRI, especially diffusion MRI, detects ischemic stroke lesions quickly,
Computed Tomography (CT) is economical and has fewer exclusion criteria, and is
therefore gaining popularity for the detection of ischemic stroke lesions. Because the
lesion is hypo-intense, its detection remains challenging even for clinicians.
Several recent works have attempted to detect ischemic stroke lesions from MRI scans,
but very few have attempted to segment ischemic stroke lesions from Non-Contrast
CT (NCCT) scans. NCCT shows only subtle signs of early ischemic change in the
brain (up to 6 hours after stroke onset), and even ischemic changes occurring after
6 hours are generally quite challenging to observe. Consequently, detecting ischemic
stroke near the lateral ventricles becomes impossible in most cases. Further, ischemic
changes are generally hypo-intense relative to healthy brain tissue. As a consequence,
accurate detection of lesions is indeed a challenging task. However, thrombolytic
therapy can be started only after the ischemic changes have been detected.
Rajini et al. [3] used traditional classifiers such as k-NN, SVM, ANN, and decision
trees to identify CT slices containing ischemic stroke, and segmented the lesions from
CT images using a Laplacian pyramid and fuzzy c-means clustering. A related
work [4] introduced the D-Net architecture for segmentation of ischemic infarction,
and hemorrhagic and ischemic infarcts were segmented simultaneously from follow-up
non-contrast CT scans of ischemic stroke patients in [5]. The nn-U-Net framework
was used to evaluate early ischemic changes from Non-Contrast CT scans [6]. In
another related work, a deep network named EIS-Net was proposed for segmenting
early ischemic stroke lesions from NCCT scans [7]. IS-Net was likewise proposed for
the segmentation of ischemic stroke lesions from CT scans [8].

Some recent works have attempted to detect ischemic stroke lesions by segmentation.
However, most of these segmentation-based methods, such as Alshehri et al. [9],
Zoetmulder et al. [10], Thiyagarajan et al. [11], An et al. [12], Verclytte et al. [13],
and Liu et al. [14], detect the lesion from MRI scans. In practice, developing countries
with limited healthcare infrastructure generally prefer NCCT data over MRI: NCCT
is cost-effective and has fewer exclusion criteria than MRI. However, segmenting
lesions in NCCT is considerably more difficult than segmenting magnetic resonance
images. Efficient, accurate automated segmentation methods are therefore necessary
for the timely detection of ischemic stroke lesions. Since the affected area also includes
the penumbra and the core, only precise segmentation methods can accurately
delineate the ischemic stroke.
The U-Net model [15] is a widely used deep learning model for the segmentation of
medical images and generally achieves satisfactory performance. However, some recent
works have noted that the 3D U-Net often struggles to extract discriminative features
for separating ischemic stroke lesions from healthy brain tissue, mainly because of its
shallow architecture. This work introduces an efficient deep neural network, termed
3D CoAtNet, to detect the lesions accurately. The proposed 3D CoAtNet model [16]
incorporates two different learning paradigms: convolution and attention. The CoAtNet
contains an encoder-decoder-like architecture to extract high-level deep features from
the input samples. The network incorporates dilated convolutional blocks with dilation
rates of 1, 3, and 5, and uses skip connections for unrestricted flow of information
between the corresponding encoder and decoder blocks. The succeeding relative
attention blocks identify the important regions of the image and extract robust
attention maps. The significant contributions of this paper are listed below:
• The work proposes the 3D CoAtNet model as a feature extractor. The 3D CoAtNet
extracts rich, high-level features useful for the subsequent segmentation task.
• The proposed model combines two efficient deep learning paradigms, convolution
and attention. As a consequence, the proposed network attains considerably higher
efficacy in the experiments conducted.

2 Materials and Methods


2.1 Data Collection
Sixty Non-Contrast CT scans were collected from the public APIS database [17]. The
APIS database provides NCCT scans together with apparent diffusion coefficient
(ADC) maps for the same patients; only the NCCT scans were used in this study.
Unlike diffusion-weighted MRI, the ADC map has a low signal in the ischemic region,
which therefore appears dark.
A total of 550 non-contrast brain CT scans of clinically confirmed acute ischemic
stroke were obtained from EKO Diagnostics, Medical College and Hospitals campus,
Kolkata. The brain CT scans were captured by a multi-detector CT scanner (Brivo
series 385, 16 detectors, manufactured by GE Healthcare). The window width was set
to the default 100 Hounsfield units (HU), and the window center was fixed at 40 HU.
The size of each volume was 512 × 512 with 96 slices. The slice thickness was 1.25 mm,
with a rescale slope of 1, a rescale intercept of −1024, and a pixel spacing of
0.42 × 0.42 mm. The raw DICOM data were collected and converted into the
Neuroimaging Informatics Technology Initiative (NIfTI) format. In Fig. 1 the (a) NCCT
and (b) ADC scans of the same patient suffering from ischemic stroke are shown.
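The acquisition settings above can be illustrated with a small sketch: the rescale slope and intercept map raw DICOM stored values to Hounsfield units, and the window (center 40 HU, width 100 HU) maps HU to a displayable range. The function names here are ours, for illustration only.

```python
import numpy as np

def to_hu(raw, slope=1.0, intercept=-1024.0):
    """Convert raw DICOM stored values to Hounsfield units."""
    return raw * slope + intercept

def apply_window(hu, center=40.0, width=100.0):
    """Clip HU values to the display window and rescale to [0, 1]."""
    lo, hi = center - width / 2.0, center + width / 2.0
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)

raw = np.array([24.0, 1014.0, 1064.0, 1114.0, 2048.0])
win = apply_window(to_hu(raw))
print(win)  # [0.  0.  0.5 1.  1. ]
```

With center 40 and width 100, everything below −10 HU maps to 0 and everything above 90 HU maps to 1, which is why the narrow brain window is used for subtle ischemic hypodensity.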

2.2 Data Pre-Processing


The skull was removed from each NCCT scan using FSL, since the skull plays no role
in this study; its removal reduces the number of classes and hence the computational
complexity. Nearest-neighbour resampling was then applied to each CT scan to
convert anisotropic voxels into isotropic voxels.
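The nearest-neighbour resampling step can be sketched as follows. This is an illustrative sketch, not the paper's code: `scipy.ndimage.zoom` with `order=0` performs nearest-neighbour interpolation, and the spacing values below are examples.

```python
import numpy as np
from scipy.ndimage import zoom

def resample_isotropic(vol, spacing, target=0.5):
    """Resample an anisotropic volume to isotropic voxels of edge length
    `target` (mm) using nearest-neighbour interpolation (order=0)."""
    factors = [s / target for s in spacing]
    return zoom(vol, factors, order=0)

vol = np.zeros((4, 4, 2))                      # toy volume, 0.5 x 0.5 x 1.25 mm voxels
iso = resample_isotropic(vol, spacing=(0.5, 0.5, 1.25))
print(iso.shape)  # (4, 4, 5)
```

Nearest-neighbour interpolation is the natural choice for label masks as well, since it never invents intermediate intensity (or class) values.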

Fig. 1: (a) NCCT scan of an ischemic patient; (b) Apparent Diffusion Coefficient map of the same patient

2.3 Proposed Methodology


Several previous works utilized the 3D U-Net for ROI segmentation in medical
images. However, the 3D U-Net is essentially a shallow network and is generally
unable to extract features representative enough to distinguish an ischemic stroke
lesion from healthy brain tissue. Consequently, the 3D U-Net and other standard deep
neural networks tend to obtain relatively low accuracy on this segmentation task.
CNNs are translation invariant, fast, and generalizable, but their limited receptive
field prevents them from capturing meaningful global contextual features. Vision
transformers, in contrast, are scalable models that extract global features, but they
generalize poorly from small datasets and hence require a large amount of data to
reach satisfactory accuracy. As per some recent works [18], [19], combining convolutions
with attention mechanisms allows better feature representation, generalization, and
scalability. The proposed work therefore utilizes the recently proposed efficient 3D
Convolution with Attention Network (CoAtNet) for accurately detecting the lesions.
The proposed CoAtNet combines the advantages of advanced deep models such as
vision transformers and convolutions: it merges convolutional blocks with the attention
mechanism to achieve strong generalization ability and rich feature representations.
The network essentially unifies depthwise convolution and self-attention via simple
relative attention, and stacks convolution layers and attention layers. The 3D CoAtNet,
a modified version of the original network [20], first extracts hierarchical local spatial
features using convolutional blocks and subsequently uses transformer blocks to extract
essential global spatial information. The 3D CoAtNet architecture shown in Fig. 2
illustrates the different blocks. The CoAtNet proposed in this work contains two
convolutional blocks followed by two transformer blocks, and uses MBConv blocks
instead of generic convolutional blocks. The MBConv blocks extract spatial
features [21] using depthwise convolution according to equation (1):

y_i = \sum_{j \in L(i)} W_{i-j} \circledast x_j \qquad (1)

where x_i, y_i \in \mathbb{R}^{D}, with x_i the input and y_i the output, and L(i) is the local
neighborhood of i, such as a 3 × 3 × 3 kernel centered on i.

MBConv comprises a residual block followed by an inverted residual block, in which the
feature dimension passes through a bottleneck: the features are widened by a 1 × 1
convolution and compressed by a subsequent 3 × 3 depthwise convolution layer. The
inverted bottleneck first expands the input channel size by a factor of 4 and finally
projects the 4× wide hidden channels back to the original channel size, with a residual
connection enabled. The features extracted by the MBConv blocks are fed to the
subsequent encoder section. The proposed 3D CoAtNet model was pre-trained on the
ImageNet dataset [22]; the pretrained weights were saved and applied for feature
extraction from the ischemic CT scans.
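Equation (1) can be made concrete with a naive NumPy sketch of a depthwise 3-D convolution, in which every channel is filtered only by its own kernel. This is a reference implementation for clarity, not the optimized layer used in the network.

```python
import numpy as np

def depthwise_conv3d(x, w):
    """Naive depthwise 3-D convolution, y_i = sum_{j in L(i)} W_{i-j} * x_j:
    channel c of x is filtered only by kernel w[c], with zero padding."""
    C, D, H, W = x.shape
    k = w.shape[1]                       # cubic kernel size, e.g. 3
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p), (p, p)))
    y = np.zeros_like(x)
    for c in range(C):
        for d in range(D):
            for h in range(H):
                for v in range(W):
                    y[c, d, h, v] = np.sum(w[c] * xp[c, d:d+k, h:h+k, v:v+k])
    return y

x = np.ones((1, 3, 3, 3))   # one channel, 3x3x3 volume of ones
w = np.ones((1, 3, 3, 3))   # one 3x3x3 kernel for that channel
y = depthwise_conv3d(x, w)
print(y[0, 1, 1, 1], y[0, 0, 0, 0])  # 27.0 8.0
```

The center voxel sees the full 27-voxel neighborhood, while a corner voxel sees only the 8 voxels inside the zero-padded border, which is exactly the local neighborhood L(i) of equation (1).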

Fig. 2: Schematic diagram of the 3D CoAtNet

The self-attention mechanism, central to the transformer blocks, allows the network
to uncover the important spatial features essential for obtaining high efficacy. The
self-attention mechanism, computed as per equation (2), calculates the weights from
the normalized similarity between x_i and x_j.

Fig. 3: Schematic of the complete model: (top) Multi-Level Dilated Residual Block and (bottom) design of MLDR blocks in the final network

y_i = \sum_{j \in g} \frac{\exp(x_i^{T} x_j)}{\sum_{k \in g} \exp(x_i^{T} x_k)} \, x_j \qquad (2)

where g denotes the global spatial space. Since the attention output for position i
depends on all inputs x_j, it captures the full global context.
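Equation (2) can be sketched directly in NumPy: `x` holds one feature vector per spatial position, and the softmax weights each position j by its similarity to position i. This is a toy illustration, not the paper's code.

```python
import numpy as np

def global_self_attention(x):
    """y_i = sum_j softmax_j(x_i^T x_j) * x_j over the global space g."""
    logits = x @ x.T                               # pairwise x_i^T x_j
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)        # softmax over j
    return attn @ x                                # weighted sum of values

# If every position carries the same feature vector, the softmax-weighted
# average returns that vector unchanged:
y = global_self_attention(np.ones((4, 3)))
print(y)  # all ones
```

Because every y_i mixes all positions, the attention block is inherently global, unlike the local neighborhood L(i) of the depthwise convolution.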
The proposed model uses an encoder-decoder structure similar to the 3D U-Net.
The convolution blocks perform dilated convolution with dilation rates of 1, 3, and
5, and a skip connection links each encoder layer with the corresponding decoder
layer. Figure 2 illustrates the schematic diagram of the model. Since the output is
binary, a sigmoid layer is used at the end. The CoAtNet model thus combines the
advantages of the self-attention mechanism and CNNs in a single block. The 3D
CoAtNet attempts to resolve the restricted receptive field of the convolutional blocks
by incorporating relative weighting: the convolution weights w_{i−j} depend only on
the relative difference between pixel indices, not on the absolute position of the
pixel. Since enlarging the receptive field improves performance at the cost of increased
computational complexity, a proper trade-off between the two is necessary in practical
applications.
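The effect of the dilation rates 1, 3, and 5 on the receptive field can be illustrated by explicitly inflating a kernel: a rate-r dilation inserts r − 1 zeros between taps, growing a size-k kernel to an effective span of k + (k − 1)(r − 1). This 1-D sketch is illustrative only, not the network code.

```python
import numpy as np

def dilate_kernel(w, rate):
    """Inflate a 1-D kernel by inserting (rate - 1) zeros between taps."""
    k = len(w)
    out = np.zeros(k + (k - 1) * (rate - 1))
    out[::rate] = w
    return out

for rate in (1, 3, 5):                 # the dilation rates used in the paper
    print(rate, len(dilate_kernel(np.ones(3), rate)))  # spans 3, 7, 11
```

A size-3 kernel therefore covers 3, 7, and 11 voxels at rates 1, 3, and 5, widening the receptive field without adding parameters or extra computation.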

The network combines the convolution kernel with the attention matrix to achieve
both high model capacity and satisfactory generalization. This combination can be
performed either after or before normalizing the data with the Softmax operation.
With post-normalization, the output is

y_i^{post} = \sum_{j \in g} \left( \frac{\exp(x_i^{T} x_j)}{\sum_{k \in g} \exp(x_i^{T} x_k)} + w_{i-j} \right) x_j \qquad (3)

while the output obtained with pre-normalization is

y_i^{pre} = \sum_{j \in g} \frac{\exp(x_i^{T} x_j + w_{i-j})}{\sum_{k \in g} \exp(x_i^{T} x_k + w_{i-k})} \, x_j \qquad (4)

The CoAtNet model obtains high accuracy in various visual recognition tasks.
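The pre-normalization variant of equation (4) can be sketched by adding the relative bias w_{i−j} to the similarity logits before the softmax. This toy 1-D version, including the `w_rel` array indexed by the offset i − j, is our own construction for illustration.

```python
import numpy as np

def relative_attention_pre(x, w_rel):
    """Equation (4): softmax over (x_i^T x_j + w_{i-j}), then sum of x_j.
    w_rel has length 2n - 1 and is indexed by the offset (i - j)."""
    n = x.shape[0]
    idx = np.arange(n)
    logits = x @ x.T + w_rel[(idx[:, None] - idx[None, :]) + n - 1]
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)        # softmax over j
    return attn @ x

# With a zero relative bias this reduces to plain self-attention, so
# identical inputs come back unchanged:
x = np.ones((3, 2))
print(relative_attention_pre(x, np.zeros(5)))  # all ones
```

Because the bias depends only on the offset i − j, the attention inherits the convolution-like translation behavior while keeping the input-adaptive softmax weighting.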

2.4 Training CT data


In this work, a combination of focal loss and Dice loss is used as the loss function, as
shown in equation (5). The focal loss reduces the contribution of relatively easy data
points, while the Dice loss handles the class-imbalance problem; equal weight was
given to the focal loss and the generalized Dice loss to take advantage of both. Batch
normalization was applied to speed up training and to avoid the problem of vanishing
gradients. ReLU activation was used, and optimization was performed using the Adam
optimizer (learning rate 0.001, exponential decay rate for the mean 0.9, exponential
decay rate for the variance 0.99, epsilon 10^{-7}). The proposed model with these
hyper-parameters was trained on the NCCT scans containing ischemic stroke lesions.
Out of the 60 CT scans collected from APIS, 50 were used for training and the
remaining 10 for validation; the 550 CT scans collected in-house were used for testing.

L_{Total} = 0.5 \, L_{BinaryDice} + 0.5 \, L_{BinaryFocal} \qquad (5)
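Equation (5) can be sketched as follows. The paper does not state the focal-loss parameters, so γ = 2 and the smoothing term `eps` are assumptions here.

```python
import numpy as np

def dice_loss(p, t, eps=1e-7):
    """Soft binary Dice loss: 1 - 2|P.T| / (|P| + |T|)."""
    inter = np.sum(p * t)
    return 1.0 - (2.0 * inter + eps) / (np.sum(p) + np.sum(t) + eps)

def focal_loss(p, t, gamma=2.0, eps=1e-7):
    """Binary focal loss: cross-entropy down-weighted for easy voxels."""
    p = np.clip(p, eps, 1.0 - eps)
    pt = np.where(t == 1, p, 1.0 - p)     # probability of the true class
    return float(np.mean(-((1.0 - pt) ** gamma) * np.log(pt)))

def total_loss(p, t):
    """L_Total = 0.5 * L_Dice + 0.5 * L_Focal, as in equation (5)."""
    return 0.5 * dice_loss(p, t) + 0.5 * focal_loss(p, t)

t = np.array([1.0, 0.0, 1.0, 0.0])        # toy ground-truth voxels
print(total_loss(t.copy(), t))            # perfect prediction -> ~0 loss
print(total_loss(1.0 - t, t))             # worst case -> large loss
```

The Dice term directly optimizes overlap and is insensitive to the huge number of background voxels, while the focal term keeps gradients focused on hard, ambiguous voxels.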

2.5 Post-Processing of Segmented data


Some misclassified voxels remained after testing the trained model. These misclassified
voxels were removed with bwareaopen using a small area threshold of 200: connected
components smaller than 200 voxels, which correspond to healthy brain tissue
misclassified as lesion, are removed, while the threshold is small enough to avoid
removing true ischemic stroke lesions.
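A Python analogue of this cleanup step can be sketched as follows. The paper used MATLAB's bwareaopen; this sketch substitutes `scipy.ndimage.label` to achieve the same effect.

```python
import numpy as np
from scipy.ndimage import label

def remove_small_objects(mask, min_voxels=200):
    """Remove connected components smaller than min_voxels, analogous
    to MATLAB's bwareaopen(mask, 200)."""
    labels, n = label(mask)
    keep = np.zeros_like(mask, dtype=bool)
    for i in range(1, n + 1):
        comp = labels == i
        if comp.sum() >= min_voxels:
            keep |= comp
    return keep

mask = np.zeros((20, 20, 20), dtype=bool)
mask[1:11, 1:11, 1:11] = True     # 1000-voxel "lesion", kept
mask[15:17, 15:17, 15:17] = True  # 8-voxel speck, removed
print(remove_small_objects(mask).sum())  # 1000
```

Only components below the 200-voxel threshold are discarded, so genuine lesions, which are far larger, pass through untouched.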

Table 1: The average Dice score and Jaccard coefficient.

Method                              Average Dice Score   Average Jaccard Score
3D U-Net [15]                       0.60                 0.52
V-Net [23]                          0.61                 0.52
3D Residual U-Net [24]              0.63                 0.55
3D Attention U-Net [25]             0.64                 0.55
3D Attention Residual U-Net [26]    0.64                 0.52
nn-U-Net [6]                        0.65                 0.60
Proposed Method                     0.75                 0.69

2.6 Evaluation of the Proposed Method


This work evaluated the efficacy of the segmentation by comparison against manual
delineations. To this end, standard accuracy metrics were used: the F1/Dice score
(equation 6), the Jaccard index (equation 7), precision, recall/sensitivity, and
specificity. The lesion volume was measured by multiplying the number of lesion
voxels in the scan by the cubic voxel spacing of the scan. These calculations were
performed using the MedPy package (version 0.3.0) with Python 3.9.10.

DSI = \frac{2\,TP}{2\,TP + FP + FN} \qquad (6)

JC = \frac{|AV \cap GT|}{|AV \cup GT|} \qquad (7)
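Equations (6) and (7) and the volume measurement can be sketched as below. The paper used MedPy's metric functions; these hand-rolled versions are for illustration, with the default spacing taken from the acquisition parameters reported in Section 2.1.

```python
import numpy as np

def dice_score(pred, gt):
    """DSI = 2*TP / (2*TP + FP + FN), equation (6)."""
    tp = np.sum(pred & gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    return 2.0 * tp / (2.0 * tp + fp + fn)

def jaccard_index(pred, gt):
    """JC = |AV intersect GT| / |AV union GT|, equation (7)."""
    return np.sum(pred & gt) / np.sum(pred | gt)

def lesion_volume_mm3(mask, spacing=(0.42, 0.42, 1.25)):
    """Lesion volume: voxel count times the per-voxel volume in mm^3."""
    return mask.sum() * float(np.prod(spacing))

pred = np.array([1, 1, 0, 0], dtype=bool)
gt   = np.array([1, 0, 1, 0], dtype=bool)
print(dice_score(pred, gt), jaccard_index(pred, gt))  # 0.5 0.3333...
```

Note that the Dice score is always at least as large as the Jaccard index (DSI = 2·JC / (1 + JC)), which matches the 0.75 versus 0.69 pattern reported in Table 1.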

3 Results and Discussions


Whole CT scans were segmented to identify ischemic stroke lesions. The hypo-intense
nature of the lesions makes their detection challenging for clinicians, and earlier
detection of an ischemic stroke lesion enables earlier treatment. The average results
are shown in Table 1. The model was trained on 60 NCCT scans and tested on 550
in-house CT scans to demonstrate its robustness.
Table 1 compares the performance of the proposed model with other existing 3D
deep learning models. The results suggest that the 3D CoAtNet outperformed the
other models: it obtained an average Dice score of 0.75, significantly higher than the
closest competitor, nn-U-Net, which achieved a Dice score of 0.65. Similarly, the
average Jaccard scores for CoAtNet and nn-U-Net are 0.69 and 0.60, respectively.
The compared models were trained on the APIS dataset [17] and tested on the 550
CT scans collected in-house. The 3D U-Net, V-Net, and the other 3D U-Net variants
(3D Residual U-Net, 3D Attention U-Net, and 3D Attention Residual U-Net) are
shallow networks without external feature extractors, whereas the proposed study
uses an external feature extractor. These features enabled the model to attain a Dice
similarity index of 75% and a Jaccard coefficient of 69%. The nn-U-Net is known for
its strong segmentation accuracy, but in this study the CoAtNet used for feature
extraction was not trained from scratch; the weights saved from ImageNet pre-training
were used to save training time. Original NCCT scans of ischemic stroke lesions are
shown in row (a1-a3) of Fig. 4, the manually delineated ischemic stroke lesions in
row (b1-b3), and the predicted ischemic stroke lesions in row (c1-c3). In c1 and c3
the ischemic stroke lesion lies close to the lateral ventricle, yet the proposed model
can still delineate it; the iso-intensity of the lesion and the lateral ventricle makes
this delineation task challenging. In a2, the ischemic stroke lesion is slowly transforming
into a hemorrhagic mass, and the proposed model delineated the lesion along with the
hemorrhagic transformation. In future work, the transformed hemorrhagic mass should
be delineated separately from the ischemic stroke lesion.

Fig. 4: (a1-a3) Original ischemic NCCT scans; (b1-b3) manually delineated ischemic stroke lesions; (c1-c3) predicted ischemic stroke lesions

Since current methods for stroke detection are very time-consuming, we explored
faster approaches (under 6 hours) to stroke detection. We also plan to explore the
efficacy of domain adaptation techniques that convert NCCT scans into d-MRI-like
scans in order to detect the early signs of stroke accurately. In the future, we intend
to parcellate ischemic stroke lesions into penumbra and core regions from NCCT
scans; here, too, domain adaptation could help convert NCCT scans into perfusion
CT scans.

4 Conclusion
This paper proposed a new, efficient deep learning model for segmenting ischemic
stroke lesions from Non-Contrast CT scans. The proposed 3D CoAtNet is a
contemporary model that combines the convolutional strategy with the self-attention
strategy for this segmentation task. The convolutional filters uncover important local
features from the data thanks to their high inductive bias, while the self-attention
mechanism captures global spatial features and identifies the key regions. The 3D
CoAtNet extracts high-level features, which are fed to a subsequent 3D U-Net; dilation
rates of 1, 3, and 5 were used to inflate the kernels and expand the receptive field.
The feature extractor identifies vital features that facilitate the separation of ischemic
stroke lesion voxels from healthy brain tissue voxels. The proposed model thereby
overcomes the limitation of traditional shallow networks such as the U-Net, which
generally fail to identify these crucial features. The 3D U-Net structure used after
feature extraction employs skip connections to mitigate the vanishing gradient problem.

Acknowledgments
Funding Acknowledgement: The authors would like to acknowledge that no funding
has been received for this study.

Compliance with the ethical standard


Conflict of interest: The authors declare that they do not have any conflict of
interest.
Ethical approval: All procedures performed in studies involving human participants
were in accordance with the ethical standard of the institutional and/or national
research committee and with the 1964 Helsinki Declaration and its later amendments
or comparable ethical standards.
Informed consent: Informed consent was obtained from all participants included in
this study.

References
[1] Kim, J., Thayabaranathan, T., Donnan, G.A., Howard, G., Howard, V.J., Roth-
well, P.M., Feigin, V., Norrving, B., Owolabi, M., Pandian, J., et al.: Global stroke
statistics 2019. International Journal of Stroke 15(8), 819–838 (2020)

[2] Pandya, R.S., Mao, L., Zhou, H., Zhou, S., Zeng, J., Popp, A.J., Wang,
X.: Central nervous system agents for ischemic stroke: neuroprotection mecha-
nisms. Central Nervous System Agents in Medicinal Chemistry (Formerly Current
Medicinal Chemistry-Central Nervous System Agents) 11(2), 81–97 (2011)

[3] Rajini, N.H., Bhavani, R.: Computer aided detection of ischemic stroke using
segmentation and texture features. Measurement 46(6), 1865–1874 (2013)

[4] Yahiaoui, A.F.Z., Bessaid, A.: Segmentation of ischemic stroke area from ct
brain images. In: 2016 International Symposium on Signal, Image, Video and
Communications (ISIVC), pp. 13–17 (2016). IEEE

[5] Kuang, H., Menon, B.K., Qiu, W.: Segmenting hemorrhagic and ischemic infarct
simultaneously from follow-up non-contrast ct images in patients with acute
ischemic stroke. IEEE Access 7, 39842–39851 (2019)

[6] El-Hariri, H., Neto, L.A.S.M., Cimflova, P., Bala, F., Golan, R., Sojoudi, A.,
Duszynski, C., Elebute, I., Mousavi, S.H., Qiu, W., et al.: Evaluating nnu-net
for early ischemic change segmentation on non-contrast computed tomography
in patients with acute ischemic stroke. Computers in biology and medicine 141,
105033 (2022)

[7] Kuang, H., Menon, B.K., Sohn, S.I., Qiu, W.: Eis-net: Segmenting early infarct
and scoring aspects simultaneously on non-contrast ct of patients with acute
ischemic stroke. Medical Image Analysis 70, 101984 (2021)

[8] Yang, H., Huang, C., Nie, X., Wang, L., Liu, X., Luo, X., Liu, L.: Is-net: Auto-
matic ischemic stroke lesion segmentation on ct images. IEEE Transactions on
Radiation and Plasma Medical Sciences (2023)

[9] Alshehri, F., Muhammad, G.: A few-shot learning-based ischemic stroke seg-
mentation system using weighted mri fusion. Image and Vision Computing 140,
104865 (2023)

[10] Zoetmulder, R., Baak, L., Khalili, N., Marquering, H.A., Wagenaar, N., Benders,
M., Aa, N.E., Išgum, I.: Brain segmentation in patients with perinatal arterial
ischemic stroke. NeuroImage: Clinical 38, 103381 (2023)

[11] Thiyagarajan, S.K., Murugan, K.: Arithmetic optimization-based k means algo-


rithm for segmentation of ischemic stroke lesion. Soft Computing, 1–13 (2023)

[12] An, J., Wendt, L., Wiese, G., Herold, T., Rzepka, N., Mueller, S., Koch, S.P., Hoff-
mann, C.J., Harms, C., Boehm-Sturm, P.: Deep learning-based automated lesion
segmentation on mouse stroke magnetic resonance images. Scientific Reports
13(1), 13341 (2023)

[13] Verclytte, S., Gnanih, R., Verdun, S., Feiweier, T., Clifford, B., Ambarki, K.,
Pasquini, M., Ding, J.: Ultrafast mri using deep learning echoplanar imaging for
a comprehensive assessment of acute ischemic stroke. European Radiology 33(5),
3715–3725 (2023)

[14] Liu, C.-F., Zhao, Y., Yedavalli, V., Leigh, R., Falcao, V., Miller, M.I., Hillis,
A.E., Faria, A.V.: Automatic comprehensive radiological reports for clinical acute
stroke mris. Communications Medicine 3(1), 95 (2023)

[15] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomed-
ical image segmentation. In: Medical Image Computing and Computer-Assisted
Intervention–MICCAI 2015: 18th International Conference, Munich, Germany,
October 5-9, 2015, Proceedings, Part III 18, pp. 234–241 (2015). Springer

[16] Yu, J., Ma, T., Chen, H., Lai, M., Ju, Z., Xu, Y.: Marrying global–local spatial
context for image patches in computer-aided assessment. IEEE Transactions on
Systems, Man, and Cybernetics: Systems (2023)

[17] Gómez, S., Mantilla, D., Garzón, G., Rangel, E., Ortiz, A., Sierra-Jerez, F.,
Martínez, F.: Apis: A paired ct-mri dataset for ischemic stroke segmentation
challenge. arXiv preprint arXiv:2309.15243 (2023)

[18] Dai, Z., Liu, H., Le, Q.V., Tan, M.: Coatnet: Marrying convolution and attention
for all data sizes. Advances in neural information processing systems 34, 3965–
3977 (2021)

[19] Pan, X., Ge, C., Lu, R., Song, S., Chen, G., Huang, Z., Huang, G.: On the
integration of self-attention and convolution. In: Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition, pp. 815–825 (2022)

[20] Dai, Z., Liu, H., Le, Q.V., Tan, M.: Coatnet: Marrying convolution
and attention for all data sizes. In: Ranzato, M., Beygelzimer, A.,
Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Infor-
mation Processing Systems, vol. 34, pp. 3965–3977. Curran Associates,
Inc. (2021). https://proceedings.neurips.cc/paper_files/paper/2021/file/20568692db622456cc42a2e853ca21f8-Paper.pdf

[21] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2:
Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)

[22] Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: Imagenet: A large-
scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision
and Pattern Recognition, pp. 248–255 (2009). IEEE

[23] Milletari, F., Navab, N., Ahmadi, S.-A.: V-net: Fully convolutional neural net-
works for volumetric medical image segmentation. In: 2016 Fourth International
Conference on 3D Vision (3DV), pp. 565–571 (2016). IEEE

[24] Bhalerao, M., Thakur, S.: Brain tumor segmentation based on 3d residual u-net.
In: International MICCAI Brainlesion Workshop, pp. 218–225 (2019). Springer

[25] Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori,
K., McDonagh, S., Hammerla, N.Y., Kainz, B., et al.: Attention u-net: Learning
where to look for the pancreas. arXiv preprint arXiv:1804.03999 (2018)

[26] Thomas, E., Pawan, S., Kumar, S., Horo, A., Niyas, S., Vinayagamani, S.,
Kesavadas, C., Rajan, J.: Multi-res-attention unet: a cnn model for the segmen-
tation of focal cortical dysplasia lesions from magnetic resonance images. IEEE
Journal of Biomedical and Health Informatics 25(5), 1724–1734 (2020)

