Segmentation of Acute Ischemic Stroke Lesion from Non-Contrast CT Scans
Research Article
Keywords: Non-Contrast CT, Ischemic Stroke Lesion Analysis, Lesion Segmentation, Attention,
Convolution
DOI: https://doi.org/10.21203/rs.3.rs-4131026/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
This paper proposes an advanced deep learning model for detecting and segmenting
acute ischemic brain lesions from Non-Contrast CT scans. It introduces a 3D CoAtNet
model for accurately detecting hypo-intense lesions, which even experienced clinicians
often fail to detect with satisfactory accuracy. Computer-aided detection and
segmentation can therefore act as an assistive tool for clinicians. However, most
prevalent machine learning models have been unable to achieve satisfactory detection
accuracy. We propose an efficient deep learning model based on the 3D CoAtNet, which
combines the attention mechanism with convolution. This helps the shallow
encoder-decoder structure perform better feature extraction, which plays a crucial
role in separating healthy brain tissue from ischemic stroke lesions. Dilation rates
of 1, 3, and 5 were applied in each encoder convolution block. The model achieved a
Dice similarity score of 75% and a Jaccard index of 69%.
1 Introduction
Stroke has emerged as a major cause of long-term disability and mortality [1]. Accord-
ing to recent statistics [2], ischemic stroke and hemorrhagic stroke constitute 85%
and 15% of total stroke cases, respectively. Ischemic stroke occurs when the flow of
oxygenated blood to the brain is restricted by a clot in a blood vessel. Magnetic
Resonance Imaging (MRI) is a widely used non-invasive imaging technique for detecting
ischemic stroke. Although MRI, especially diffusion MRI (dMRI), is a fast way to
detect ischemic stroke lesions, Computed Tomography (CT) is economical and has fewer
exclusion criteria, and is therefore gaining popularity for the detection of ischemic
stroke lesions. However, because the lesion is hypointense, its detection on CT is
challenging for clinicians.
Several recent works have attempted to detect ischemic stroke lesions from MRI scans.
However, very few have attempted to segment ischemic stroke lesions from Non-Contrast
CT (NCCT) scans. NCCT does not clearly show the subtle early ischemic changes in the
brain (up to 6 hours after the onset of stroke), and ischemic changes occurring after
6 hours of stroke onset are also generally quite challenging to observe. Consequently,
detecting ischemic stroke near the lateral ventricles becomes nearly impossible in
most cases. Further, the ischemic changes are generally hypo-intense relative to
healthy brain tissue. As a consequence, accurate detection of lesions is indeed a
challenging task. However, thrombolytic therapy can be started only after the ischemic
changes have been detected.
Rajini et al. [3] utilized traditional classifiers such as k-NN, SVM, ANN, and
decision trees to identify CT slices containing ischemic stroke lesions. A related
work [4] segmented the ischemic stroke area from CT images using the Laplacian pyramid
and fuzzy c-means clustering. The D-Net architecture was introduced for segmenting
ischemic infarction and hemorrhagic transformation from follow-up non-contrast CT
scans of ischemic stroke patients [5]. The nnU-Net framework was used to evaluate
early ischemic changes from Non-Contrast CT scans [6]. In another related work, a deep
network named EIS-Net was proposed for segmenting early ischemic stroke lesions from
NCCT scans [7]. Finally, IS-Net was proposed for the segmentation of ischemic stroke
lesions from CT scans [8].
Some recent works have observed that the 3D U-Net often faces challenges in extracting
discriminative features for separating ischemic stroke lesions from healthy brain
tissue, mainly due to its shallow architecture. This work introduces an efficient deep
neural network, termed 3D CoAtNet, to accurately detect the lesions. The proposed 3D
CoAtNet model [16] incorporates two different learning paradigms: convolution and
attention. The CoAtNet contains an encoder-decoder-like architecture to extract
high-level deep features from the input samples. The network essentially incorporates
dilated convolutional blocks with dilation rates of 1, 3, and 5, and uses skip
connections for the unrestricted flow of information between the respective blocks of
the encoder and decoder. The succeeding relative attention blocks identify the
important regions of the image and extract robust attention maps. The significant
contributions of this paper are listed below:
• The work proposes the 3D CoAtNet model as a feature extractor. The 3D CoAtNet
extracts rich, high-level features useful for the subsequent segmentation task.
• The proposed model combines two efficient learning paradigms, convolution and
attention. As a consequence, the proposed network attains considerably higher
efficacy in the experiments conducted.
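To make the effect of the dilation rates concrete, the following sketch (a simple
calculation, not the authors' code) computes the effective extent of a dilated kernel
and the receptive field obtained by stacking stride-1 dilated 3 × 3 × 3 convolutions
with rates 1, 3, and 5:

```python
def effective_kernel(k, d):
    """Effective extent of a k-tap kernel with dilation rate d."""
    return d * (k - 1) + 1

def stacked_receptive_field(k, dilations):
    """Receptive field (per axis) of stacked stride-1 convolutions."""
    rf = 1
    for d in dilations:
        rf += (k - 1) * d  # each stride-1 layer adds (k-1)*d voxels
    return rf

# Dilation rates 1, 3, and 5 applied to 3 x 3 x 3 encoder convolutions
print(effective_kernel(3, 5))                 # a rate-5 kernel spans 11 voxels
print(stacked_receptive_field(3, [1, 3, 5]))  # the stacked blocks see 19 voxels
```

Dilation thus widens the receptive field without adding parameters, which is the
motivation for using it inside the shallow encoder.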
Fig. 1: (a) NCCT scan of an ischemic patient; (b) Apparent
Diffusion Coefficient map of an ischemic patient
MBConv blocks extract spatial features [21] using depthwise convolution according to
equation 1:

y_i = Σ_{j ∈ L(i)} W_{i−j} ⊛ x_j    (1)

where x_i, y_i ∈ R^D, x_i being the input and y_i the output, and L(i) is the local
neighborhood of i, e.g., the 3 × 3 × 3 kernel centered on i.
MBConv comprises a residual block followed by an inverted residual block, in which the
feature dimension is passed through a bottleneck. The features are widened through a
1 × 1 convolution and then processed by a subsequent 3 × 3 depth-wise convolution
layer. The inverted bottleneck initially expands the input channel size 4 times and
finally projects the 4-times-wide hidden channels back to the original channel size,
with a residual connection enabled. The features extracted by the MBConv blocks are
fed to the subsequent encoder section. The proposed 3D CoAtNet model was pre-trained
on the ImageNet dataset [22]; the pretrained weights were saved and applied for
feature extraction from the ischemic CT scans.
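The inverted bottleneck described above can be sketched in NumPy as follows. This is
an illustration only, with hypothetical helper names; the actual model uses trained 3D
layers plus normalization and activation details not shown here:

```python
import numpy as np

def pointwise(x, w):
    """1x1x1 convolution: pure channel mixing. x: (C, D, H, W), w: (C_out, C_in)."""
    return np.einsum('oc,cdhw->odhw', w, x)

def depthwise3(x, k):
    """Per-channel 3x3x3 convolution with zero padding (equation 1).
    x: (C, D, H, W), k: (C, 3, 3, 3) -- one kernel per channel."""
    C, D, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1), (1, 1)))
    y = np.empty_like(x)
    for c in range(C):
        for d in range(D):
            for h in range(H):
                for w in range(W):
                    y[c, d, h, w] = np.sum(xp[c, d:d + 3, h:h + 3, w:w + 3] * k[c])
    return y

def mbconv(x, w_expand, k_dw, w_project):
    """Inverted bottleneck: expand channels 4x, depthwise conv, project back."""
    h = np.maximum(pointwise(x, w_expand), 0.0)  # 1x1x1 expansion + ReLU
    h = np.maximum(depthwise3(h, k_dw), 0.0)     # 3x3x3 depthwise + ReLU
    h = pointwise(h, w_project)                  # 1x1x1 linear projection
    return x + h                                 # residual connection
```

For an input with C channels, `w_expand` has shape (4C, C), `k_dw` has shape
(4C, 3, 3, 3), and `w_project` has shape (C, 4C), matching the 4× expansion and
projection described above.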
The self-attention mechanism, central to the transformer blocks, allows the network
to uncover important spatial features essential for obtaining high efficacy. The
self-attention mechanism, computed as per equation 2, calculates the weights based on
the re-normalized similarity between x_i and x_j.
Fig. 3: Schematic of the complete model (Top) Multi-Level Dilated
Residual Block and (Bottom) Design of MLDR blocks in Final Network
y_i = Σ_{j ∈ G} [ exp(x_i^T x_j) / Σ_{k ∈ G} exp(x_i^T x_k) ] x_j    (2)

where G denotes the global spatial space. Since the output of the attention block
depends on all of the inputs x_i, x_j, it captures the whole global context.
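Equation 2 translates directly into a few lines of NumPy, treating the flattened
spatial positions as rows of a feature matrix (an illustrative sketch, not the model's
optimized implementation):

```python
import numpy as np

def global_self_attention(x):
    """Equation 2: y_i = sum_j softmax_j(x_i^T x_j) x_j over the global space G.
    x: (N, D) -- N spatial positions, D features per position."""
    sim = x @ x.T                               # pairwise similarities x_i^T x_j
    sim = sim - sim.max(axis=1, keepdims=True)  # subtract row max for stability
    w = np.exp(sim)
    w = w / w.sum(axis=1, keepdims=True)        # re-normalize over all j in G
    return w @ x                                # weighted average of the inputs
```

Because every output position attends to every input position, the cost grows
quadratically in the number of voxels, which is one reason attention blocks are
typically applied at coarser feature-map resolutions.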
The proposed model uses an encoder-decoder structure similar to the 3D U-Net. The
convolution blocks perform dilated convolution with dilation rates of 1, 3, and 5.
The network also includes skip connections between the respective encoder and decoder
layers. Figure 2 illustrates the schematic diagram of the model. Since the output is
binary, a sigmoid layer is used at the end. Thus the CoAtNet model combines the
advantages of the self-attention mechanism and CNNs in a single block. The 3D CoAtNet
attempts to resolve the restricted receptive field of the convolution blocks by
incorporating relative weighting. Convolution blocks are translation equivariant
because the convolution weights w_{i−j} depend on the relative difference of the
pixel indices, not the actual position of the pixel. Since an increase in the
receptive field size improves the performance at the cost of increased computational
complexity, a proper trade-off between these two is necessary in practical applications.
The network combines the convolution kernel and the obtained attention matrix to
achieve high model capacity and satisfactory generalization. This combination can be
performed either after (post) or before (pre) normalizing the attention weights with
the Softmax operation:

y_i^post = Σ_{j ∈ G} [ exp(x_i^T x_j) / Σ_{k ∈ G} exp(x_i^T x_k) + w_{i−j} ] x_j    (3)

y_i^pre = Σ_{j ∈ G} [ exp(x_i^T x_j + w_{i−j}) / Σ_{k ∈ G} exp(x_i^T x_k + w_{i−k}) ] x_j    (4)
The CoAtNet model obtains high accuracy in various visual recognition tasks.
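The two variants in equations 3 and 4 differ only in where the relative bias w_{i−j}
enters. A NumPy sketch, with the bias supplied as a precomputed (N, N) matrix
(`rel_bias[i, j] = w_{i-j}`, a hypothetical name), might look like:

```python
import numpy as np

def relative_attention_pre(x, rel_bias):
    """Equation 4: bias added inside the Softmax (the variant the CoAtNet paper adopts)."""
    logits = x @ x.T + rel_bias                        # x_i^T x_j + w_{i-j}
    logits = logits - logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    w = w / w.sum(axis=1, keepdims=True)
    return w @ x

def relative_attention_post(x, rel_bias):
    """Equation 3: bias added to the already-normalized attention weights."""
    sim = x @ x.T
    sim = sim - sim.max(axis=1, keepdims=True)
    w = np.exp(sim)
    w = w / w.sum(axis=1, keepdims=True)
    return (w + rel_bias) @ x
```

With a zero bias, both variants reduce to the plain global self-attention of
equation 2; a nonzero bias injects the convolution-like dependence on relative
position into the attention weights.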
Table 1: The average Dice score and Jaccard coefficient.

Method                              Average Dice Score   Average Jaccard Score
3D U-Net [15]                       0.60                 0.52
V-Net [23]                          0.61                 0.52
3D Residual U-Net [24]              0.63                 0.55
3D Attention U-Net [25]             0.64                 0.55
3D Attention Residual U-Net [26]    0.64                 0.52
nn U-Net [6]                        0.65                 0.60
Proposed Method                     0.75                 0.69
DSI = (2 × TP) / (2 × TP + FP + FN)    (6)

JC = |AV ∩ GT| / |AV ∪ GT|    (7)
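Equations 6 and 7 translate directly into code; a small NumPy sketch over binary masks
(AV = predicted volume, GT = ground truth):

```python
import numpy as np

def dice_score(pred, gt):
    """Equation 6: DSI = 2*TP / (2*TP + FP + FN) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = (pred & gt).sum()    # true positives
    fp = (pred & ~gt).sum()   # false positives
    fn = (~pred & gt).sum()   # false negatives
    return 2 * tp / (2 * tp + fp + fn)

def jaccard_index(pred, gt):
    """Equation 7: JC = |AV intersect GT| / |AV union GT| for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return (pred & gt).sum() / (pred | gt).sum()

pred = np.array([1, 1, 0, 0])
gt = np.array([1, 0, 1, 0])
print(dice_score(pred, gt))     # 0.5
print(jaccard_index(pred, gt))  # 0.3333...
```

The two metrics are related by DSI = 2·JC / (1 + JC), so they always rank methods
identically; reporting both, as in Table 1, mainly aids comparison across papers.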
The CoAtNet used for feature extraction was not trained from scratch; the saved
weights from the ImageNet dataset were used to reduce training time. Original NCCT
scans of ischemic stroke lesions are visualized in rows a1-a3 of Fig. 4, the manually
delineated ischemic stroke lesions are shown in rows b1-b3, and the predicted ischemic
stroke lesions are shown in rows c1-c3. In c1 and c3, the ischemic stroke lesion is
close to the lateral ventricle, and the proposed model is still able to delineate it;
the iso-intensity of the lesion and the lateral ventricle makes detection and
delineation challenging. In a2, the ischemic stroke lesion slowly transforms into a
hemorrhagic mass, and the proposed model delineated the lesion along with the
hemorrhagic transformation. In future work, the transformed hemorrhagic mass should be
delineated separately from the ischemic stroke lesion.
Since the current methods for stroke detection are very time-consuming, we explored
faster approaches (under 6 hours) for stroke detection. Besides, we plan to explore
the efficacy of domain adaptation techniques for converting NCCT scans to dMRI scans
to accurately detect the early signs of stroke. In the future, we also intend to
parcellate ischemic stroke lesions into the penumbra and core areas from NCCT scans;
in this regard, domain adaptation could help convert the NCCT scans into perfusion
CT scans.
4 Conclusion
This work proposed a new, efficient deep learning model for segmenting ischemic
stroke lesions from Non-Contrast CT scans. The proposed 3D CoAtNet is a contemporary
model that combines the convolutional strategy with the self-attention strategy for
the aforementioned segmentation task. The convolutional filters uncover important
local features in the data due to their high inductive bias, while the self-attention
mechanism realizes the global spatial features and identifies the key regions. The 3D
CoAtNet extracts high-level features, which are fed to the subsequent 3D U-Net. Here,
dilation rates of 1, 3, and 5 were used to inflate the kernels and expand the
receptive field. The feature extractor identifies vital features that facilitate the
easy segregation of ischemic stroke lesion voxels from healthy brain tissue voxels.
The proposed model thus overcomes the limitation of traditional shallow networks like
U-Net, which generally fail to identify these crucial features. The 3D U-Net structure
used after feature extraction employs skip connections to resolve the vanishing
gradient issue.
Acknowledgments
Funding Acknowledgement: The authors would like to acknowledge that no funding
has been received for this study.
References
[1] Kim, J., Thayabaranathan, T., Donnan, G.A., Howard, G., Howard, V.J., Roth-
well, P.M., Feigin, V., Norrving, B., Owolabi, M., Pandian, J., et al.: Global stroke
statistics 2019. International Journal of Stroke 15(8), 819–838 (2020)
[2] S Pandya, R., Mao, L., Zhou, H., Zhou, S., Zeng, J., John Popp, A., Wang,
X.: Central nervous system agents for ischemic stroke: neuroprotection mecha-
nisms. Central Nervous System Agents in Medicinal Chemistry (Formerly Current
Medicinal Chemistry-Central Nervous System Agents) 11(2), 81–97 (2011)
[3] Rajini, N.H., Bhavani, R.: Computer aided detection of ischemic stroke using
segmentation and texture features. Measurement 46(6), 1865–1874 (2013)
[4] Yahiaoui, A.F.Z., Bessaid, A.: Segmentation of ischemic stroke area from ct
brain images. In: 2016 International Symposium on Signal, Image, Video and
Communications (ISIVC), pp. 13–17 (2016). IEEE
[5] Kuang, H., Menon, B.K., Qiu, W.: Segmenting hemorrhagic and ischemic infarct
simultaneously from follow-up non-contrast ct images in patients with acute
ischemic stroke. IEEE Access 7, 39842–39851 (2019)
[6] El-Hariri, H., Neto, L.A.S.M., Cimflova, P., Bala, F., Golan, R., Sojoudi, A.,
Duszynski, C., Elebute, I., Mousavi, S.H., Qiu, W., et al.: Evaluating nnu-net
for early ischemic change segmentation on non-contrast computed tomography
in patients with acute ischemic stroke. Computers in biology and medicine 141,
105033 (2022)
[7] Kuang, H., Menon, B.K., Sohn, S.I., Qiu, W.: Eis-net: Segmenting early infarct
and scoring aspects simultaneously on non-contrast ct of patients with acute
ischemic stroke. Medical Image Analysis 70, 101984 (2021)
[8] Yang, H., Huang, C., Nie, X., Wang, L., Liu, X., Luo, X., Liu, L.: Is-net: Auto-
matic ischemic stroke lesion segmentation on ct images. IEEE Transactions on
Radiation and Plasma Medical Sciences (2023)
[9] Alshehri, F., Muhammad, G.: A few-shot learning-based ischemic stroke seg-
mentation system using weighted mri fusion. Image and Vision Computing 140,
104865 (2023)
[10] Zoetmulder, R., Baak, L., Khalili, N., Marquering, H.A., Wagenaar, N., Benders,
M., Aa, N.E., Išgum, I.: Brain segmentation in patients with perinatal arterial
ischemic stroke. NeuroImage: Clinical 38, 103381 (2023)
[12] An, J., Wendt, L., Wiese, G., Herold, T., Rzepka, N., Mueller, S., Koch, S.P., Hoff-
mann, C.J., Harms, C., Boehm-Sturm, P.: Deep learning-based automated lesion
segmentation on mouse stroke magnetic resonance images. Scientific Reports
13(1), 13341 (2023)
[13] Verclytte, S., Gnanih, R., Verdun, S., Feiweier, T., Clifford, B., Ambarki, K.,
Pasquini, M., Ding, J.: Ultrafast mri using deep learning echoplanar imaging for
a comprehensive assessment of acute ischemic stroke. European Radiology 33(5),
3715–3725 (2023)
[14] Liu, C.-F., Zhao, Y., Yedavalli, V., Leigh, R., Falcao, V., Miller, M.I., Hillis,
A.E., Faria, A.V.: Automatic comprehensive radiological reports for clinical acute
stroke mris. Communications Medicine 3(1), 95 (2023)
[15] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomed-
ical image segmentation. In: Medical Image Computing and Computer-Assisted
Intervention–MICCAI 2015: 18th International Conference, Munich, Germany,
October 5-9, 2015, Proceedings, Part III 18, pp. 234–241 (2015). Springer
[16] Yu, J., Ma, T., Chen, H., Lai, M., Ju, Z., Xu, Y.: Marrying global–local spatial
context for image patches in computer-aided assessment. IEEE Transactions on
Systems, Man, and Cybernetics: Systems (2023)
[17] Gómez, S., Mantilla, D., Garzón, G., Rangel, E., Ortiz, A., Sierra-Jerez, F.,
Martı́nez, F.: Apis: A paired ct-mri dataset for ischemic stroke segmentation
challenge. arXiv preprint arXiv:2309.15243 (2023)
[18] Dai, Z., Liu, H., Le, Q.V., Tan, M.: Coatnet: Marrying convolution and attention
for all data sizes. Advances in neural information processing systems 34, 3965–
3977 (2021)
[19] Pan, X., Ge, C., Lu, R., Song, S., Chen, G., Huang, Z., Huang, G.: On the
integration of self-attention and convolution. In: Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition, pp. 815–825 (2022)
[20] Dai, Z., Liu, H., Le, Q.V., Tan, M.: Coatnet: Marrying convolution
and attention for all data sizes. In: Ranzato, M., Beygelzimer, A.,
Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Infor-
mation Processing Systems, vol. 34, pp. 3965–3977. Curran Associates,
Inc. (2021). https://proceedings.neurips.cc/paper_files/paper/2021/file/
20568692db622456cc42a2e853ca21f8-Paper.pdf
[21] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2:
Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)
[22] Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: Imagenet: A large-
scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision
and Pattern Recognition, pp. 248–255 (2009). IEEE
[23] Milletari, F., Navab, N., Ahmadi, S.-A.: V-net: Fully convolutional neural net-
works for volumetric medical image segmentation. In: 2016 Fourth International
Conference on 3D Vision (3DV), pp. 565–571 (2016). IEEE
[24] Bhalerao, M., Thakur, S.: Brain tumor segmentation based on 3d residual u-net.
In: International MICCAI Brainlesion Workshop, pp. 218–225 (2019). Springer
[25] Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori,
K., McDonagh, S., Hammerla, N.Y., Kainz, B., et al.: Attention u-net: Learning
where to look for the pancreas. arXiv preprint arXiv:1804.03999 (2018)
[26] Thomas, E., Pawan, S., Kumar, S., Horo, A., Niyas, S., Vinayagamani, S.,
Kesavadas, C., Rajan, J.: Multi-res-attention unet: a cnn model for the segmen-
tation of focal cortical dysplasia lesions from magnetic resonance images. IEEE
Journal of Biomedical and Health Informatics 25(5), 1724–1734 (2020)