A Novel Modified U-shaped 3-D Capsule Network MUDCap3 for Stroke Lesion Segmentation From Brain MRI

Subin Sahayam
Department of Computer Science and Engineering
Indian Institute of Information Technology
Design and Manufacturing, Kancheepuram
Chennai 600127, India
coe18d001@iiitdm.ac.in, 0000-0003-1129-895X

Abirami A
Department of Computer Science and Engineering
Indian Institute of Information Technology
Design and Manufacturing, Kancheepuram
Chennai 600127, India
edm16b001@iiitdm.ac.in

Umarani Jayaraman
Department of Computer Science and Engineering
Indian Institute of Information Technology
Design and Manufacturing, Kancheepuram
Chennai 600127, India
umarani@iiitdm.ac.in

Abstract—Stroke is the death of brain cells due to a lack of blood supply, caused by a blood clot or a blood vessel rupture. It is a dominant cause of disability and death globally. Rehabilitation of post-stroke patients requires an efficient treatment plan, and segmentation of stroke lesions from brain images is the first step towards treatment planning. Generally, stroke lesions are manually segmented from brain Magnetic Resonance (MR) images by neuroradiologists. Manual segmentation requires expertise and is costly, time-consuming, error-prone, and labor-intensive. Automated stroke lesion segmentation overcomes these drawbacks. U-Nets have given state-of-the-art results even on small-scale medical image datasets. However, medical images such as MRI are usually noisy even after pre-processing, and the pooling layer used in U-Nets, while reducing the number of parameters, tends to lose important information. Capsule networks have been proposed to overcome the drawbacks of the pooling layer for classification tasks. Motivated by this, a novel modified U-shaped 3D capsule network (MUDCap3) is proposed. The 3D design of the network lets it learn volumetric information from 3D MRI. The proposed MUDCap3 model achieves a dice score of 0.67 on the ATLAS dataset, outperforming several state-of-the-art models in the literature.

Index Terms—Capsule Network, Deep Learning, Segmentation, U-net, MRI, Stroke

I. INTRODUCTION

Cerebrovascular accident, commonly known as stroke, is a medical condition caused by damage to brain cells due to an insufficient supply of oxygen-rich blood to the brain or sudden bleeding in the brain. It is a prominent cause of death and disability around the world. Hence, the diagnosis and treatment of stroke are of utmost importance. Based on statistics from the World Health Organization (WHO) on the top 10 causes of death worldwide, stroke is the second leading cause of death [1]. These statistics are shown in Fig. 1.

Fig. 1: Top 10 causes of death in 2016 worldwide by WHO [1].

Stroke diagnosis includes performing brain Computed Tomography (CT), Magnetic Resonance Imaging (MRI), CT and MR arteriograms, and other blood tests. Among these, MRI is widely used for treatment planning as it accurately discerns changes to brain tissues and cells. Diagnosis is initiated by segmenting stroke lesions from brain MRI images [2]. Manual tracing is time-intensive (4.8–9.6 hours/subject). Automated lesion segmentation requires the least time (1 minute/subject), followed by semi-automated segmentation (24.9 minutes/subject) [3]. In addition to manual segmentation

978-0-7381-2447-6/20/$31.00 © 2020 IEEE

Authorized licensed use limited to: University of Sargodha. Downloaded on November 06,2023 at 09:42:16 UTC from IEEE Xplore. Restrictions apply.
being time-intensive, the inter-rater similarity of 67% (±7%) [4] shows the need for labor with expertise. Automated stroke lesion segmentation using brain MRI images overcomes such drawbacks.

Brain MRI provides information about the regions of damaged brain cells. The MRI scanner can be operated in different imaging modes to capture different information from the subject. The three common MRI imaging modes are T1-weighted, T2-weighted, and FLAIR. Any of the three MRI images, or a combination of them, could be used for stroke lesion segmentation and diagnosis. However, stroke neuroimaging can be successfully marked with a T1-weighted anatomical MRI [5]. Therefore, only segmentation of stroke lesions from T1-weighted MRI has been considered. A sample input T1-weighted image and its respective stroke lesion segmentation ground truth image is shown in Fig. 2.

Fig. 2: Input brain MR image and expected lesion mask output from the ATLAS dataset.

With the advent of machine learning algorithms, such as machine learning classifiers [6], convolutional neural networks [5], [7], U-Net [8]–[10], DeepMedic [11], and residual networks [12], [13], automatic segmentation of stroke lesions has reached results close to manual segmentation, but has not achieved sufficient efficiency. Also, some machine learning algorithms are computationally intensive and require special hardware [10], [11]. These challenges remain a block to the wide use of automated segmentation models.

A quick overview of various algorithms for stroke lesion segmentation in the literature is given in Table I. Convolutional neural networks consist of a series of convolution and max-pooling layers cascaded to extract important features from an input. While the convolution layer extracts features, the max-pooling layer passes these features to the next layer, simultaneously reducing the dimension of the image. Through convolutional neural networks, it is possible to classify the presence of a lesion in the image for stroke diagnosis. However, in addition to identifying the presence of lesions in the image, identification of the lesion region is important for treatment planning. It is possible to get the lesion region through a deep learning model called the U-Net [8]. It proposed an encoder-decoder style model, where the encoder consists of convolution and max-pooling layers, and the decoder performs upsampling and reconstruction. Later, this network was improved to include skip connections that carry information from the encoder at one level to the decoder at the same level. U-ResNet [12] further improved the segmentation by adding residual connections. Following these algorithms, an ensemble of such networks has been used for the segmentation task [10].

Paper Title | Year | Model | Imaging
Deep Neural Networks Segment Neuronal Membranes in Electron Microscopy Images [5] | 2012 | CNN | Electron Microscopy
U-net: Convolutional networks for biomedical image segmentation [8] | 2015 | U-Net | Light Microscopy
DeepMedic for Brain Tumor Segmentation [11] | 2016 | DeepMedic | MRI
Voxel-based Gaussian naïve Bayes classification of ischemic stroke lesions in individual T1-weighted MRI scans [6] | 2016 | Voxel-based Gaussian Naive Bayes | MRI
Capsules for Object Segmentation [14] | 2017 | Capsule Network | CT
White matter hyperintensity and stroke lesion segmentation and differentiation using convolutional neural networks [12] | 2018 | U-ResNet | MRI
D-UNet: a dimension-fusion U-shape network for chronic stroke lesion segmentation [9] | 2019 | Dimension-fused U-Net | MRI
CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke [15] | 2019 | Cross-Level fusion and Context Inference Network | MRI
X-Net: Brain Stroke Lesion Segmentation Based on Depthwise Separable Convolution and Long-range Dependencies [16] | 2019 | X-Net | MRI
A multi-path 2.5-dimensional convolutional neural network system for segmenting stroke lesions in brain MRI images [10] | 2020 | U-Net Ensemble | MRI

TABLE I: Literature overview of stroke lesion segmentation algorithms

Although convolutional neural networks have revolutionized the segmentation process through the modified U-Net model, the max-pooling layer omits significant information about the incoming image, which is required during segmentation. A convolutional neural network predicts the presence of lesions based on the features. It tends to eliminate the orientation and spatial information of these features, which are significant to segmenting a region as a lesion. The capsule network, a novel deep learning algorithm introduced by Hinton et al. [17], has proved to be extremely efficient for object segmentation [14] and has outperformed state-of-the-art models. Capsule networks eliminate the max-pooling function and instead store feature information in vector form.
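To make the information loss from max-pooling concrete, here is a minimal NumPy sketch (illustrative only, not from the paper): two feature maps that differ only in where a feature sits inside a pooling window produce identical max-pooled outputs, so the feature's location is discarded.

```python
import numpy as np

def max_pool_2x2(x):
    """Non-overlapping 2x2 max-pooling over a 2D feature map."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Two maps with the same feature at different positions
# inside the same 2x2 pooling window...
a = np.array([[9.0, 0.0], [0.0, 0.0]])
b = np.array([[0.0, 0.0], [0.0, 9.0]])

# ...pool to identical outputs: the position is lost.
print(max_pool_2x2(a))  # [[9.]]
print(max_pool_2x2(b))  # [[9.]]
```

The feature's strength survives pooling, but its position within the window does not, which is precisely the spatial information a segmentation model needs.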

Unlike a CNN, which stores scalar feature information, a capsule stores a vector. The direction of the vector represents information about features such as pose and orientation, along with the vector length and dimension. Through the dynamic routing mechanism, capsule networks also bring flexibility to the network while training. However, capsule networks are computationally intensive due to the dynamic routing mechanism. By incorporating constrained routing [14], this problem can be overcome. When training such a network, the loss metric is used to direct the network towards training optimally. Commonly, the dice coefficient and the marginal loss are used for training and validating a network designed for the segmentation problem [10], [11], [16]. In addition to these losses, this paper also incorporates the reconstruction loss, as suggested by Hinton et al. for capsule networks, to improve network training [17]. This paper extends these steps to 3D images.

The main contribution of the paper is the modified U-shaped 3D capsule network. As far as the authors are aware, the replacement of the max-pooling layers in 3D U-Nets with capsule networks and its application to the segmentation of stroke lesions has not been studied. The paper is organized as follows: in Section II, the proposed MUDCap3 model is discussed; Section III discusses the experimentation and results obtained on the ATLAS dataset [18]; and Section IV presents the conclusions on the proposed MUDCap3 model.

II. PROPOSED MODEL

Capsule networks were proposed to solve the classification problem [17]. They consist of primary and secondary capsules dynamically routed to each other. To perform image segmentation, capsule layers can be arranged in an encoder-decoder style U-network [14]. Since the input is a 3D image, spatial information is captured by 3D convolution and convolution capsule layers, which are incorporated in this model. MUDCap3 is analogous to a 3D U-Net where the convolution and max-pooling layers are replaced with convolution capsules, and the deconvolution layer is replaced with a deconvolution capsule. The flow diagram and the outline of the proposed MUDCap3 model are shown in Fig. 3.

The dimensions of the 3D image and the labels along the three different axes are different. To keep the training consistent along any axis, the image has been resized to 256 x 256 x 256. The resizing function pads the input and the ground truth image and extrapolates the missing values.

The architecture of the MUDCap3 model is shown in Fig. 4. The convolution layer in the model detects basic features in the input image. At the convolution layer, a number of filters perform convolution with the input image and output the feature map. Skip connections pass the information captured at different blocks of the convolution path to blocks at the deconvolution path. This information may be lost at the intermediary layers and is essential for reconstruction. The information passed through a skip connection is concatenated with an upsampled feature image and passed to a convolution layer. The bottleneck layer resembles the bottleneck layer in the U-Net model, where the two convolution layers are replaced by convolution capsule layers.

A. Convolution capsule block

The convolution capsule block consists of two convolution capsule layers routed by a locally constrained dynamic routing algorithm. This is shown in Fig. 4. Each convolution capsule layer performs convolution, reshape, and squash operations. This is given in Algorithm 1.

The convolution capsule layer performs 3D convolution. For an image I of dimension (x, x, y) with c channels, a set of n filters F, each of kernel size (k, k, k), and stride 1, the 3D convolution for each filter j is performed by

    o_j = Σ_{m=1}^{c} I[m] * F[j]    (1)

where j ranges from 0 to n and * denotes convolution.

Following the convolution, the reshape function converts the feature maps from the convolution output into the number of vectors required by the higher-level capsules. Following the reshaping, the capsule performs the squash transformation to convert the lengths of the vectors to probabilities and their orientations to feature information.

Algorithm 1 Vector output of capsule
Input: Capsule i in layer l-1, capsule k in layer l, capsule i prediction vector û_i, log prior probability b_ik that capsule i is routed to capsule k, and weight matrix W_ik (i, k, û_i, b_ik, W_ik)
Output: Capsule vector output v_k.
1: Coupling coefficient learned through routing: c_ik = exp(b_ik) / Σ_k exp(b_ik)
2: Total input to capsule k: p_k = Σ_i c_ik W_ik û_i
3: Vector output by squashing the total input: v_k = (||p_k||^2 / (1 + ||p_k||^2)) · (p_k / ||p_k||)
4: return v_k

B. Deconvolution capsule block

The purpose of the deconvolution capsule block is analogous to the deconvolution block in U-Net. It reconstructs the feature image to the original dimensions for segmentation. It consists of a deconvolution capsule layer, a skip connection, a concatenation layer, and a convolution capsule layer.

The deconvolution capsule layer incorporates convolution transpose functionality. This function performs a transformation that changes the output of the convolution function back to the dimensions of the input of that function. In addition to the 3D convolution transpose function, the vectors are also reshaped to the output dimension of the transformation.

C. Locally-constrained dynamic routing

The capsules in layer l pass information to layer l+1 capsules through a dynamic routing algorithm. The coupling coefficient is learned through multiple iterations, practically 3.
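The squash transformation and the weighted-sum step of Algorithm 1 can be sketched in NumPy as below. This is an illustrative single-capsule toy, not the paper's implementation; in particular, since only one output capsule is shown, the coupling coefficients are normalized here over the lower-level capsules rather than over output capsules.

```python
import numpy as np

def squash(p):
    """Squash of Algorithm 1, step 3:
    v = (|p|^2 / (1 + |p|^2)) * (p / |p|),
    shrinking the vector length into [0, 1) while keeping its direction."""
    norm_sq = float(np.dot(p, p))
    norm = np.sqrt(norm_sq) + 1e-9  # guard against division by zero
    return (norm_sq / (1.0 + norm_sq)) * (p / norm)

def capsule_output(u_hat, b):
    """Steps 1-3 of Algorithm 1 for one higher-level capsule:
    coupling coefficients from the routing logits b, total input p_k
    as the coupling-weighted sum of prediction vectors, then squash."""
    c = np.exp(b) / np.sum(np.exp(b))         # coupling coefficients
    p_k = np.sum(c[:, None] * u_hat, axis=0)  # total input to capsule k
    return squash(p_k)

# Toy example: 3 lower-level capsules predict an 8-D output capsule.
u_hat = np.arange(24, dtype=np.float64).reshape(3, 8)
v = capsule_output(u_hat, np.zeros(3))        # uniform log priors
print(np.linalg.norm(v) < 1.0)  # True: the length acts like a probability
```

The squash factor |p|^2 / (1 + |p|^2) guarantees the output length stays below 1, which is what lets a capsule's length be read as the probability that the entity it represents is present.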

Fig. 3: The flow diagram and architecture of the proposed MUDCap3 model. In the training phase, the set of MRI images is resized to 256 x 256 x 256 and passed to the MUDCap3 model, with the loss function computed against the target segmentation labels; in the testing phase, the resized input MRI is passed through the trained model, and the learned coupling coefficients are also used for image reconstruction.

Fig. 4: The architecture of the proposed MUDCap3 model, built from convolution capsule blocks (5x5x5 3D convolutions with routing 1 and 3), deconvolution capsule blocks (5x5x5 3D deconvolutions with routing 1), skip connections, and a bottleneck layer.
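The "Resize Images 256 x 256 x 256" step in Fig. 3 brings every volume to a common cube. A minimal zero-padding sketch is below; it is an assumption about the pre-processing (the paper's resizing function also extrapolates missing values, which is omitted here), and the function name is illustrative.

```python
import numpy as np

def pad_to_cube(volume, size=256, fill=0.0):
    """Pad a 3D volume symmetrically with a fill value so that every
    axis has length `size` (e.g. ATLAS volumes are 197 x 233 x 189)."""
    pads = []
    for dim in volume.shape:
        total = size - dim
        if total < 0:
            raise ValueError("volume larger than target size")
        pads.append((total // 2, total - total // 2))  # before/after pad
    return np.pad(volume, pads, mode="constant", constant_values=fill)

vol = np.ones((197, 233, 189), dtype=np.float32)
cube = pad_to_cube(vol)
print(cube.shape)  # (256, 256, 256)
```

Padding, unlike interpolation, leaves the original voxel intensities untouched, so the lesion boundaries in the ground truth stay exact.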

The coupling coefficient modifies the output of the capsules in layer l, and thereby the connections between capsules in layers l and l+1. However, this routing algorithm is computationally expensive for a U-shaped network, as each capsule in layer l is routed to every capsule in layer l+1. This hinders the capsule network from being scaled to a 3D U-Net based model. To overcome this, the routing is locally constrained to a kernel k of height h and width w. It is given by Algorithm 2.

Algorithm 2 Locally-constrained dynamic routing
Input: Kernel k at location (m, n) with dimension (h, w) in layer l, set of all capsules I in layer l, set of all capsules MN at position (m, n) in layer l+1, and o iterations (m, n, h, w, I, MN).
Output: b_mn.
1: for g = 0 to o do
2:    layer l capsules: c_i = exp(b_i) / Σ_k exp(b_k)
3:    layer l+1 capsules mn: p_mn = Σ_i c_i(mn) W_i(mn) û_i
4:    layer l+1 capsules mn: v_mn = (||p_mn||^2 / (1 + ||p_mn||^2)) · (p_mn / ||p_mn||)
5:    layer l and l+1 capsules i and mn respectively: b_i|mn = b_i|mn + W_i(mn) û_i · v_mn
6: end for
7: return b_mn

III. EXPERIMENTS AND RESULTS

A. Dataset

The proposed model has been trained and validated on the ATLAS database [18]. The ATLAS database has 3-dimensional (197 x 233 x 189) T1-weighted MR images of the brain obtained from 229 subjects affected by stroke around the world. The dataset also contains average T1-weighted structural images as a template, as well as a comma-separated values file with information on the type of lesions (such as embolic or hemorrhagic), the number of lesions and their locations, the primary stroke regions, and the territory of the lesions. This data can be used not only for the segmentation of lesions but also for the classification of the type of stroke, which is required for treatment planning.

B. Results

An ensemble of marginal and reconstruction losses is used in the MUDCap3 model to facilitate better learning. The dice coefficient has been used to measure the correctness of the segmentation produced by the proposed model. The dice coefficient is given by

    D = (2 Σ_{i}^{N} p_i q_i) / (Σ_{i}^{N} p_i^2 + Σ_{i}^{N} q_i^2)    (2)

where p_i and q_i are the values of pixel i of the predicted and ground truth images, respectively.

The lesion output images from the MUDCap3 model are shown in Fig. 5. The dice coefficient captures whether the model finds the foreground and background correctly. All the segmentation models in the referred literature used the dice coefficient to measure the quality of the segmented lesion region.

Fig. 5: MUDCap3 output from different viewing planes: Input (left), Ground Truth (middle), Predicted lesion region (right).

The dice coefficient of MUDCap3 along with other state-of-the-art models is given in Table II. MUDCap3 has been trained for 35 epochs with 160 T1-weighted images using the Keras-TensorFlow framework. The proposed MUDCap3 model achieved a dice coefficient of 0.67 after 35 epochs. This dice coefficient is comparable with the state-of-the-art algorithms trained on the ATLAS dataset, as given in Table II. MUDCap3 is a modification of the U-net model, and from the results obtained, MUDCap3 performs better than the baseline U-net model. However, due to the large number of network parameters, the performance of MUDCap3 is limited to a dice coefficient of 0.67. Although a few deep learning-based algorithms have provided better dice scores on different datasets, MUDCap3 provides comparable results for the ATLAS dataset.

Model | Dice Coefficient
U-net [8] | 0.42
X-Net [16] | 0.48
D-Unet [9] | 0.54
DeepMedic [11] | 0.59
2.5DCNN [10] | 0.69
Proposed MUDCap3 | 0.67

TABLE II: Comparison of results with other state-of-the-art algorithms on the ATLAS dataset

IV. CONCLUSION

In this paper, MUDCap3, a U-shaped 3-D capsule network for the segmentation of lesions from brain MR images, has been proposed. It consists of a cascade of convolution layers and primary convolutional capsules clustered in the form of a U-net for image segmentation with a locally constrained dynamic routing algorithm.

• The problem of spatial information loss, as in the case of 2D models, and the information loss due to max-pooling in convolutional neural networks and U-networks have been addressed.
• A dynamic and locally constrained routing is implemented to improve the accuracy of the learning algorithm while simultaneously reducing the problem of parameter explosion in capsule networks.
• Class imbalance has been addressed through the vectorized and affine-transform-invariant nature of the feature information stored in capsules and the locally constrained routing algorithm.
• The capsule network can learn pose information, so lesions diagnosed separately from the axial, coronal, or sagittal planes can now be addressed from a single perspective, thereby eliminating the need for plane-wise diagnosis.

MUDCap3 has been compared with state-of-the-art algorithms using the dice coefficient metric. The proposed model has shown comparable results with a dice coefficient of 0.67 while eliminating several challenges in medical image segmentation.

REFERENCES

[1] W. H. Organization et al., "Global health estimates 2016: Deaths by cause, age, sex by country and by region 2000-2016; 2018," World Health Organization, 2019. [Online; accessed 09-November-2020]. Available: https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death
[2] M. Katan and A. Luft, "Global burden of stroke," in Seminars in Neurology, vol. 38, no. 2. Georg Thieme Verlag, 2018, pp. 208–211.
[3] J. K. Udupa and S. Samarasekera, "Fuzzy connectedness and object definition: theory, algorithms, and applications in image segmentation," Graphical Models and Image Processing, vol. 58, no. 3, pp. 246–261, 1996.
[4] D. G. Lowe, "Object recognition from local scale-invariant features," in Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2. IEEE, 1999, pp. 1150–1157.
[5] K. L. Ito, H. Kim, and S.-L. Liew, "A comparison of automated lesion segmentation approaches for chronic stroke T1-weighted MRI data," Human Brain Mapping, vol. 40, no. 16, pp. 4669–4685, 2019.
[6] J. C. Griffis, J. B. Allendorfer, and J. P. Szaflarski, "Voxel-based Gaussian naïve Bayes classification of ischemic stroke lesions in individual T1-weighted MRI scans," Journal of Neuroscience Methods, vol. 257, pp. 97–108, 2016.
[7] K. Kamnitsas, C. Ledig, V. F. Newcombe, J. P. Simpson, A. D. Kane, D. K. Menon, D. Rueckert, and B. Glocker, "Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation," Medical Image Analysis, vol. 36, pp. 61–78, 2017.
[8] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.
[9] Y. Zhou, W. Huang, P. Dong, Y. Xia, and S. Wang, "D-UNet: a dimension-fusion U-shape network for chronic stroke lesion segmentation," IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2019.
[10] Y. Xue, F. G. Farhat, O. Boukrina, A. Barrett, J. R. Binder, U. W. Roshan, and W. W. Graves, "A multi-path 2.5 dimensional convolutional neural network system for segmenting stroke lesions in brain MRI images," NeuroImage: Clinical, vol. 25, p. 102118, 2020.
[11] K. Kamnitsas, E. Ferrante, S. Parisot, C. Ledig, A. V. Nori, A. Criminisi, D. Rueckert, and B. Glocker, "DeepMedic for brain tumor segmentation," in International Workshop on Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. Springer, 2016, pp. 138–149.
[12] R. Guerrero, C. Qin, O. Oktay, C. Bowles, L. Chen, R. Joules, R. Wolz, M. d. C. Valdés-Hernández, D. Dickie, J. Wardlaw et al., "White matter hyperintensity and stroke lesion segmentation and differentiation using convolutional neural networks," NeuroImage: Clinical, vol. 17, pp. 918–934, 2018.
[13] N. Tomita, S. Jiang, M. E. Maeder, and S. Hassanpour, "Automatic post-stroke lesion segmentation on MR images using 3D residual convolutional neural network," NeuroImage: Clinical, p. 102276, 2020.
[14] R. LaLonde and U. Bagci, "Capsules for object segmentation," arXiv preprint arXiv:1804.04241, 2018.
[15] B. B. Avants, N. Tustison, and G. Song, "Advanced normalization tools (ANTS)," Insight Journal, vol. 2, no. 365, pp. 1–35, 2009.
[16] K. Qi, H. Yang, C. Li, Z. Liu, M. Wang, Q. Liu, and S. Wang, "X-Net: Brain stroke lesion segmentation based on depthwise separable convolution and long-range dependencies," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2019, pp. 247–255.
[17] S. Sabour, N. Frosst, and G. E. Hinton, "Dynamic routing between capsules," in Advances in Neural Information Processing Systems, 2017, pp. 3856–3866.
[18] S.-L. Liew, J. M. Anglin, N. W. Banks, M. Sondag, K. L. Ito, H. Kim, J. Chan, J. Ito, C. Jung, N. Khoshab et al., "A large, open source dataset of stroke anatomical brain images and manual lesion segmentations," Scientific Data, vol. 5, p. 180011, 2018.
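For completeness, the dice coefficient of Eq. (2), the metric used for every comparison above, can be computed directly. The function and toy masks below are illustrative only:

```python
import numpy as np

def dice_coefficient(pred, truth, eps=1e-7):
    """Dice score D = 2*sum(p*q) / (sum(p^2) + sum(q^2)) over all
    voxels, following Eq. (2); eps guards against two empty masks."""
    p = pred.ravel().astype(np.float64)
    q = truth.ravel().astype(np.float64)
    return 2.0 * np.sum(p * q) / (np.sum(p * p) + np.sum(q * q) + eps)

# Toy binary masks: 2 overlapping voxels, 3 predicted, 3 ground truth.
pred = np.array([1, 1, 1, 0, 0, 0])
truth = np.array([0, 1, 1, 1, 0, 0])
print(round(dice_coefficient(pred, truth), 3))  # 0.667  (= 2*2 / (3+3))
```

For binary masks, p^2 = p and q^2 = q, so Eq. (2) reduces to the familiar 2|P ∩ Q| / (|P| + |Q|) overlap ratio.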

