Big Data in Multimodal
Medical Imaging
Edited by
Ayman El-Baz and Jasjit S. Suri
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have
been made to publish reliable data and information, but the author and publisher cannot assume responsibility for
the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace
the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission
to publish in this form has not been obtained. If any copyright material has not been acknowledged please write
and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted,
or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, includ-
ing photocopying, microfilming, and recording, or in any information storage or retrieval system, without written
permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers,
MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety
of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment
has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only
for identification and explanation without intent to infringe.
Preface
Editors
Contributors
Acknowledgments
Index
Preface
This book covers state-of-the-art approaches to big data in the field of medical
image analysis and healthcare. Big data refers to the amalgamation and processing of
huge datasets that are composed of different data types (e.g., clinical, genomic,
imaging, and pathological) and that have rapidly become more massive and complex,
particularly with the advent of new technologies. Broadly speaking, we are considering
data of high volume that are obtained from a variety of sources and processed at high speed.
Currently, big data is a very active area of research and features prominently in
medical research, where it has achieved very promising results in medical imaging,
content-based image retrieval, healthcare automation, diagnosis, and prediction of
many diseases.
Moreover, big data analysis is now trending in many commercial applications in
healthcare, which include, but are not limited to, computer-aided diagnosis
systems that could help physicians diagnose many diseases and select the
optimal treatment plan. Various studies are discussed in this book
to demonstrate how powerful big data is in the medical field. They involve radiomics,
multimodal image fusion, image segmentation, and registration, as well as applications
to computational health information, pre-clinical dementia, cardiovascular diseases,
acute renal rejection, prostate cancer, and diabetic retinopathy.
In summary, the main aim of this book is to help advance scientific research within
the broad field of big data. The book focuses on major trends and challenges in this
area, and it presents work aimed at identifying the new achievements and their use in
biomedical analysis.
Ayman El-Baz
Jasjit S. Suri
Acknowledgments
The completion of this book would not have been possible without the participation
and assistance of many people, not all of whose names can be enumerated. Their
contributions are sincerely appreciated and gratefully acknowledged. The editors
would particularly like to express their deep appreciation and indebtedness to Dr.
Ali H. Mahmoud, Yaser ElNakieb, Omar Dekhil, and Mohamed Ali for their endless
support.
Ayman El-Baz
Jasjit S. Suri
Chapter 1
Multimodal Imaging Radiomics
and Machine Learning
Gengbo Liu, Youngho Seo, Debasis Mitra, and Benjamin L. Franc
1.1 Introduction
1.2 High-Quality Standardized Imaging Data Acquisition
1.3 Segmentation of the Region of Interest from the Image
1.4 High-Throughput Feature Extraction
1.4.1 First-Order Features with Image Statistics
1.4.2 Shape- and Size-Based Features
1.4.3 Second-Order Features
1.4.4 Wavelet-Based Features
1.4.5 Feature Extraction Software
1.5 Clinical Predictive Modeling with Machine Learning
1.5.1 Feature Selection Methods
1.5.2 Radiomic Features Classifying Outcomes
1.5.2.1 Clustering Algorithms
1.5.2.2 Regression Model
1.5.2.3 Support Vector Machine
1.6 Big Data and Machine Learning
1.7 Conclusions
References
Appendix
A. First-Order Statistics Features
B. Shape- and Size-Based Features
C. GLCM Features
D. GLRLM Features
E. Neighborhood Gray Tone Difference Matrix (NGTDM) Features
1.1 Introduction
According to Gerlinger et al. [1], single tumor biopsy samples can lead to
underestimation of intratumor heterogeneity. Heterogeneity at the gene, cell, and tissue
levels of a solid tumor limits the accuracy and representativeness of invasive detection
results. To develop the concept of personalized medicine and attain a better understanding
of tumor heterogeneity noninvasively, the concept of radiomics was proposed by Lambin
et al. [2] in 2012. The suffix “omics”, as in genomics, refers to the objects of study of
a certain field and its relationships to biology and medicine. As genomics attempts to
relate genes to phenotype, radiomics studies relate medical images to phenotypes of
image reconstruction algorithm, etc., influence the quality of images. For this reason,
many standardized image datasets are being made available to the research community.
For example, the National Institutes of Health (NIH) and the National Cancer Institute
(NCI) have established the Lung Image Database Consortium (LIDC) [7], The Cancer
Genome Atlas (TCGA) [8], and The Cancer Imaging Archive (TCIA) [9], among others,
covering medical images of the lung, brain, breast, prostate, and other organs.
1.4.3 Second-Order Features
Second-order features capture long-range interactions of pixel or voxel values in
an image. They include the gray level cooccurrence matrix (GLCM), the gray level size
zone matrix (GLSZM), and the gray level run length matrix (GLRLM). Because the GLCM
is one of the most important second-order features, its mathematical meaning and
applications are discussed in this section.
The GLCM (formulas in Appendix A.3) is based on the second-order statistical combined
conditional probability density of the image, which takes the spatial relationship
of pixels into account. Haralick et al. [24] first proposed this matrix together with 14
corresponding statistics and showed that those statistics reflect the relative positions
of the various gray levels over the image. Although GLCM features distinguish different
textures well, they are time consuming and computationally expensive to compute, because
the GLCM must be generated first and the 14 texture features extracted afterward.
Many researchers have therefore attempted to reduce the computation time in several ways.
Ulaby et al. [25, 26] reported that many GLCM features are correlated with
each other and selected only four independent texture features, namely, energy, entropy,
contrast, and correlation. Another study [27] showed that two parameters,
energy and contrast, are the most efficient of the 14 originally proposed parameters
for discriminating different textures. To simplify the computation of the GLCM,
Bo et al. [28] proved that the GLCM tends to a constant for different distances and
angles when the distance is large enough. As one improvement on the GLCM, the gray
level cooccurrence linked list (GLCLL) [29], which stores only the nonzero cooccurring
probabilities, offers better computational speed [30].
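The construction described above can be illustrated with a minimal sketch that builds a GLCM for one pixel offset and derives the four features singled out by Ulaby et al. (energy, entropy, contrast, and correlation). The function names, the symmetric counting, the base-2 logarithm in the entropy, and the 4 x 4 test image are assumptions made for illustration; they are not taken from the cited studies, and published definitions differ in normalization details.

```python
import math


def glcm(image, dx, dy, levels):
    """Return a normalized, symmetric GLCM for the pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # symmetric: count the pair in both orders
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset is larger than the image")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Energy, entropy, contrast and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    energy = sum(v * v for _, _, v in cells)
    # Entropy in bits (log base 2 is an assumption; some texts use ln).
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    correlation = cov / denom if denom > 0 else 0.0
    return {
        "energy": energy,
        "entropy": entropy,
        "contrast": contrast,
        "correlation": correlation,
    }


# Hypothetical 4x4 image quantized to 4 gray levels (illustrative values only).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, dx=1, dy=0, levels=4)  # horizontal neighbors at distance 1
features = glcm_features(p)
```

Most entries of `p` are zero even in this tiny example, which is the sparsity that a linked-list representation such as GLCLL is said to exploit.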
Many researchers have used GLCM features to analyze multimodal medical
images. In one study, Hatt et al. [31] investigated the association between the
metabolically active tumor volume (MATV) and GLCM-derived texture features in
esophageal cancer and non-small cell lung cancer (NSCLC). In an ultrasound study [32],
the authors investigated differences in features between normal and postradiotherapy
parotid glands by extracting eight GLCM features at a distance of 4 pixels and
at all angles in two-dimensional images. The results showed significant differences
in GLCM features between the normal and postradiotherapy parotid
glands. Petkovska et al. [33] reported that GLCM features extracted from
contrast-enhanced CT differentiated malignant from benign
nodules better than visual inspection by three experienced radiologists.
1.4.4 Wavelet-Based Features
Wavelets are small oscillatory waveforms that can be stretched to different sizes
and convolved against a signal to extract wavelet coefficients. The coefficients are, thus,
scale and translation invariant. The wavelet transform decomposes the input signal into
different components and then studies each component at a resolution matched to its scale.
It essentially splits the signal into a low-pass sub-band (abbreviated “L”, also called
the approximation level) and a high-pass sub-band (abbreviated “H”, also called the
detail level) at each scale. A wavelet-transformed 2D image
is shown in Figure 1.2 as an example. The wavelet transform-based method processes
an image at multiple levels of resolution and returns a series of gray level edge maps at
different resolutions. After the wavelet transform has generated coefficients at different
levels (which are themselves 2D images of varying resolutions), all the radiomic
features described before, such as first-order features, shape- and size-based features, and
second-order features, can be extracted again from the high-pass and low-pass sub-band
images. This can generate thousands of radiomic features [34]. Many studies have used
wavelet-based features to characterize medical images. Yang et al. [35, 36]
used Gómes et al.’s informative classifying features and developed a feature extraction
scheme using the wavelet transform that further improved the diagnostic performance.
FIGURE 1.2: Wavelet-transformed images. The upper-left image is the original, and the
images to the right show three different levels of wavelet coefficients. High-pass
filtering runs along the horizontal direction, and low-pass filtering runs along the
vertical direction.
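The sub-band splitting described in this section can be sketched with the simplest wavelet, the Haar wavelet, applied to non-overlapping 2 x 2 blocks. This is an illustrative assumption: the chapter does not say which wavelet family the cited studies used, and the band naming here (first letter for the horizontal filter, second letter for the vertical filter) is only one of several conventions in use. The helper names and the test image are hypothetical.

```python
def haar2d(image):
    """One level of an orthonormal 2D Haar transform on 2x2 pixel blocks.

    Returns four half-resolution sub-bands. Naming assumption: the first
    letter is the filter along the horizontal direction, the second letter
    the filter along the vertical direction (L = low pass, H = high pass).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image sides must be even")
    bands = {name: [] for name in ("LL", "HL", "LH", "HH")}
    for r in range(0, rows, 2):
        out = {name: [] for name in bands}
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            out["LL"].append((a + b + d + e) / 2.0)  # approximation
            out["HL"].append((a - b + d - e) / 2.0)  # horizontal detail
            out["LH"].append((a + b - d - e) / 2.0)  # vertical detail
            out["HH"].append((a - b - d + e) / 2.0)  # diagonal detail
        for name in bands:
            bands[name].append(out[name])
    return bands


def decompose(image, levels):
    """Transform the LL band repeatedly; one dict of sub-bands per level."""
    result = []
    current = image
    for _ in range(levels):
        bands = haar2d(current)
        result.append(bands)
        current = bands["LL"]
    return result


def band_energy(band):
    """A first-order feature (sum of squares) computed on one sub-band."""
    return sum(v * v for row in band for v in row)


# Hypothetical 4x4 image (illustrative values only).
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pyramid = decompose(image, levels=2)
level1_energy = {name: band_energy(band) for name, band in pyramid[0].items()}
```

Each sub-band is itself a small image, so any first-order or texture feature can be computed on it in the same way as on the original, which is how the feature count multiplies across levels and sub-bands.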
Features with better stability showed higher prognostic performance, a phenomenon
that may be due to reduced noise in the stable features. In one study, Aerts et al. [18]
selected the top features in four feature groups out of the 100 most stable features, which
were determined from two testing datasets. According to the stability rank on the test and
retest CT scan dataset, they selected the four most significant radiomic features:
statistical energy, shape compactness, gray level nonuniformity in the original image,
and gray level nonuniformity in HLH wavelet-filtered images. Their results showed
that radiomic features have prognostic value in both lung cancer and head and neck cancer.
Leijenaar et al. [40] assessed the test-retest and interobserver stability of every
extracted feature using the intraclass correlation coefficient (ICC), which is
calculated by the Kruskal–Wallis one-way analysis of variance (ANOVA) method and
indicates the reliability of feature measurements. The similarity between the test-retest
and interobserver stability rankings of features was evaluated by Spearman’s rank
correlation coefficient.
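The two stability measures mentioned above can be sketched as follows. The ICC shown here is the standard one-way random-effects form, ICC(1,1), built from ordinary ANOVA mean squares; that choice is an assumption and may differ from the variant used in the cited study. All feature names and measurement values are hypothetical.

```python
import math


def icc_oneway(measurements):
    """ICC(1,1): each row holds the k repeated measurements of one subject."""
    n, k = len(measurements), len(measurements[0])
    grand = sum(sum(row) for row in measurements) / (n * k)
    means = [sum(row) / k for row in measurements]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(measurements, means) for x in row
    ) / (n * (k - 1))
    denom = ms_between + (k - 1) * ms_within
    return (ms_between - ms_within) / denom if denom > 0 else 0.0


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for idx in range(pos, end + 1):
            ranks[order[idx]] = (pos + end) / 2.0 + 1.0
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


# Hypothetical data: feature -> per-patient [measurement 1, measurement 2].
test_retest = {
    "contrast": [[1.0, 1.6], [1.2, 0.9], [1.5, 1.1]],
    "energy": [[10.0, 10.2], [14.0, 13.9], [18.0, 18.1]],
    "entropy": [[3.0, 3.1], [3.6, 3.4], [4.1, 4.2]],
}
interobserver = {
    "contrast": [[1.1, 1.5], [1.0, 1.3], [1.4, 1.0]],
    "energy": [[10.1, 10.0], [13.8, 14.1], [18.2, 18.0]],
    "entropy": [[3.0, 3.3], [3.5, 3.2], [4.0, 4.3]],
}
names = sorted(test_retest)
icc_test_retest = [icc_oneway(test_retest[name]) for name in names]
icc_interobserver = [icc_oneway(interobserver[name]) for name in names]
ranking_agreement = spearman(icc_test_retest, icc_interobserver)
```

An ICC near 1 marks a feature whose repeated measurements agree closely, and a high `ranking_agreement` would indicate that the two stability assessments order the features similarly.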
In addition to improving the predictive performance of machine learning
algorithms, feature selection reduces the dimensionality of the radiomics feature
space and thus improves computational efficiency. Feature selection methods can be
classified as univariate or multivariate [41]. Some studies have
investigated the effect of feature selection methods on predictive models [43]. Parmar et
al. [41, 42, 44] analyzed the effect of 14 feature selection methods on the predictive
performance for patient survival and found that the prediction accuracy of radiomic
features is influenced by the number of features, the feature selection method, and the
pattern recognition classifier. Among the 14 feature selection algorithms, the Wilcoxon
test-based feature selection method (WLCX), one of the univariate methods, showed the
highest performance for most numbers of selected features.
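A univariate, Wilcoxon-style selection step can be sketched as follows: each feature is scored on its own by the rank-sum (Mann-Whitney U) statistic between two outcome classes, and the top-scoring features are kept. This is a simplified illustration under stated assumptions, not the WLCX procedure evaluated by Parmar et al.; the scoring rule, function names, and data are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for idx in range(pos, end + 1):
            ranks[order[idx]] = (pos + end) / 2.0 + 1.0
        pos = end + 1
    return ranks


def rank_sum_score(feature, labels):
    """Score one feature by |U / (n1 * n0) - 0.5|, with labels in {0, 1}.

    U is the rank-sum statistic of class 1. The score is 0 when the two
    classes are indistinguishable by rank and 0.5 when they are perfectly
    separated. Both classes are assumed to be non-empty.
    """
    ranks = average_ranks(feature)
    n1 = sum(1 for label in labels if label == 1)
    n0 = len(labels) - n1
    r1 = sum(rank for rank, label in zip(ranks, labels) if label == 1)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    return abs(u1 / (n1 * n0) - 0.5)


def select_top_features(X, labels, k):
    """Return the column indices of the k best-scoring features."""
    n_features = len(X[0])
    scores = [
        rank_sum_score([row[j] for row in X], labels) for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Hypothetical data: 6 patients x 2 features, binary outcome labels.
X = [
    [1.0, 5.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [7.0, 2.0],
    [8.0, 6.0],
    [9.0, 3.0],
]
labels = [0, 0, 0, 1, 1, 1]
selected = select_top_features(X, labels, k=1)
```

Because each feature is scored independently, redundancy between features is ignored; that is the usual trade-off of univariate methods against multivariate ones.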
1.5.2.2 Regression Model
Linear regression is a model that describes the predictive relationship between
a dependent variable and one or more independent variables. The linear regression
model has been used to analyze the temporal relationship between ongoing therapy
and radiomic features. For example, Yang et al. [46] compared tumor texture features
of fluorodeoxyglucose (FDG) heterogeneity with tumor SUV measures on patients’
PET images and classified their chemoradiotherapy responses at different time points.
Temporal changes were analyzed with the linear regression model. The temporal
behaviors in the early phase of treatment showed that texture features are a better
predictor of the outcome of therapeutic intervention than SUV measures. In another study
[47], a linear regression model was used to assess the relationship between the
radiation dose and radiomic features. Li et al. [48] demonstrated significant
associations between radiomic features and multigene assay recurrence scores using
multiple linear regression models.
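As a minimal sketch of how a temporal trend can be summarized with linear regression, the following fits a straight line to one feature measured at several time points by ordinary least squares. The time points and feature values are hypothetical and are not taken from the cited studies.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Fraction of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0


# Hypothetical texture feature measured at weeks 0 to 4 of treatment.
weeks = [0.0, 1.0, 2.0, 3.0, 4.0]
texture = [5.1, 4.6, 4.2, 3.5, 3.1]
slope, intercept = fit_line(weeks, texture)
fit_quality = r_squared(weeks, texture, slope, intercept)
```

The sign and size of `slope` summarize how the feature changes per week, which is the kind of temporal behavior that can then be compared between responders and non-responders.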
Unlike linear regression, where the dependent variable is normally continuous, a
logistic regression model is used when the dependent variable takes only a limited
number of possible or categorical values. In King et al.’s study [49], radiomic features
were correlated with tumor local failure and control at 2 years. A logistic regression
model was used to predict local failure by univariate and multivariate analyses of the
apparent diffusion coefficient parameters, tumor stage, and tumor volume. The poorly
responding tumors showed significantly higher skewness and kurtosis values than the
well-controlled tumors. El et al. [50] employed radiomic features to predict treatment
outcomes in patients with cervical cancers and head and neck cancers. The association
between different extracted radiomic features and nine patients’ overall survival was
investigated with a logistic regression model. The results showed that shape-based
features had the highest categorical predictive capability for failure risk, while commonly
used SUV descriptive statistics had the lowest predictive ability.
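A logistic regression model for a binary outcome can be sketched with plain batch gradient descent on the log loss. The learning rate, number of steps, function names, and data are assumptions chosen for illustration; the cited studies would have used established statistical software, and nothing here reproduces their analyses.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, row):
    """Estimated probability that the outcome is 1 for one feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b)


def fit_logistic(X, y, lr=0.1, steps=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, target in zip(X, y):
            err = predict_proba(w, b, row) - target
            for j in range(p):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical standardized feature (e.g. skewness) and binary outcome
# (1 = local failure, 0 = local control); values are illustrative only.
X = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
risk_low = predict_proba(w, b, [-1.5])
risk_high = predict_proba(w, b, [1.5])
```

The fitted weight can be read as the change in log odds of the outcome per unit increase of the feature, which is what makes the model convenient for univariate and multivariate association analyses.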
The Cox regression model, also referred to as the Cox proportional hazards regression
model, is applied mainly to the prognosis of survival-time outcomes of tumors and
other chronic diseases based on one or more predictors. Cook et al. [51] measured tumor
radiomic features such as coarseness, contrast, busyness, and complexity, which were
derived from a PET dataset of 53 patients with NSCLC treated with chemoradiotherapy. Cox
regression was used to examine the effects of the radiomic features on survival
outcomes. Therapy responders showed lower coarseness and higher contrast and
busyness than non-responders, whereas none of the SUV parameters predicted treatment
response accurately. These results indicate that coarseness, busyness, and contrast
differentiate responders from non-responders to chemoradiotherapy better than SUV
measures, and coarseness was an independent predictor
of overall patient survival. In another study [52], a Cox model was trained to evaluate
the capability of radiomic features to predict distant metastasis.
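The quantity that a Cox model maximizes, the partial log-likelihood, can be written down in a few lines. The sketch below evaluates it, computes its gradient, and fits the coefficients by simple gradient ascent. Tied event times are handled with the Breslow convention, and the step size, iteration count, function names, and data are all assumptions for illustration rather than the procedure of any cited study.

```python
import math


def linear_scores(beta, X):
    """Linear predictor beta . x for every subject."""
    return [sum(b * v for b, v in zip(beta, row)) for row in X]


def cox_partial_loglik(beta, times, events, X):
    """Cox partial log-likelihood (events: 1 = event observed, 0 = censored)."""
    scores = linear_scores(beta, X)
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue
        # Risk set: every subject still under observation at time t_i.
        risk = sum(math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        total += scores[i] - math.log(risk)
    return total


def cox_gradient(beta, times, events, X):
    """Gradient of the partial log-likelihood with respect to beta."""
    scores = linear_scores(beta, X)
    grad = [0.0] * len(beta)
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        weights = [math.exp(scores[j]) for j in risk]
        norm = sum(weights)
        for d in range(len(beta)):
            expected = sum(wt * X[j][d] for wt, j in zip(weights, risk)) / norm
            grad[d] += X[i][d] - expected
    return grad


def fit_cox(times, events, X, lr=0.05, steps=500):
    """Gradient ascent on the partial log-likelihood, starting from zero."""
    beta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = cox_gradient(beta, times, events, X)
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta


# Hypothetical cohort: survival time in months, event flag, one feature.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
events = [1, 1, 1, 1, 1, 1]
X = [[2.0], [3.0], [1.0], [2.5], [0.5], [1.0]]
beta = fit_cox(times, events, X)
hazard_ratio = math.exp(beta[0])  # hazard ratio per unit increase of the feature
```

A positive coefficient means that larger feature values go with a higher hazard, that is, shorter survival; in this toy cohort the patients with larger feature values tend to have earlier events.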
plus surgery). SVM and logistic regression (LR) models were used to predict pathologic
tumor response to treatment. Comparing different models and feature groups, they
found that SVM models constructed from radiomic features combined with conventional
SUV measures and clinical parameters significantly improved the pathologic response
prediction. Some studies [54, 55] have shown that many image features are able to
differentiate tumor stages using different medical imaging modalities. In one study, Mu
et al. [56] classified cervical cancer patients into early stage and advanced stage using
54 PET-based image features. One of the radiomic features, run percentage (RP), which
was selected by the SVM classifier, was the most discriminative index and differentiated
early stage from advanced stage with high accuracy. In
another tumor stage study, Dong et al. [57] assessed radiomic features on PET images
of 40 patients with esophageal squamous cell carcinoma. Correlations were observed
between SUVmax, GLCM-entropy, and GLCM-energy on the one hand and the T and N
stages of patients on the other. Xu et al. [58] developed a computer-aided diagnosis
(CAD) system to differentiate malignant and benign bone and soft-tissue lesions on
FDG PET/CT images. After selection of an optimal subset of radiomic feature
parameters, an SVM classifier was used to automatically differentiate different bone
tumor tissues. The results showed that the CAD method with the combined PET and
CT optimal radiomic features had a better ability in differential diagnosis than
PET or CT texture analysis alone. Compared with the histological diagnosis
of the lesions, the diagnostic method based on the classification results of their radiomic
features achieved relatively good accuracy.
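The kind of classifier used in these studies can be illustrated with a linear soft-margin SVM trained by batch subgradient descent on the regularized hinge loss. This is a deliberately simple stand-in: the kernels, solvers, and hyperparameters of the cited studies are not given in the text, and the regularization strength, learning rate, function names, and data below are assumptions.

```python
def decision_value(w, b, row):
    """Signed score w . x + b of one sample relative to the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, row)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=1000):
    """Minimize lam/2 * |w|^2 + mean hinge loss; labels must be +1 or -1."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularizer
        grad_b = 0.0
        for row, label in zip(X, y):
            # Only samples inside the margin contribute to the subgradient.
            if label * decision_value(w, b, row) < 1.0:
                for j in range(p):
                    grad_w[j] -= label * row[j] / n
                grad_b -= label / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, row):
    """Class label (+1 or -1) from the side of the separating hyperplane."""
    return 1 if decision_value(w, b, row) >= 0 else -1


# Hypothetical data: two standardized radiomic features per lesion,
# label +1 = malignant, -1 = benign (illustrative values only).
X = [
    [2.0, 2.0], [3.0, 3.0], [2.0, 3.0],
    [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0],
]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
training_predictions = [predict(w, b, row) for row in X]
```

In practice the features would be selected and standardized first, and performance would be estimated on data not used for training, for example by cross-validation.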
the veracity of diversely acquired data. Finally, we believe these two solutions will
have to adjust to the fact that the data will continuously grow and no static model for a
fixed dataset will be sufficient. However, that is for the near future. In this context, the
newly emerging field of topological data analysis (TDA) [62] may help in understanding
the nature of a dataset. As an unsupervised machine learning model, this technique
studies the topology of data points (images) in the underlying feature space. Instead of
clustering the data points, TDA rather investigates more complex structures like con-
nectivity between data. Such topological understanding may provide much better guid-
ance to the supervised machine learning algorithms [63] or may even be embedded in
the latter for a robust algorithm [64] to address the Big Data challenges.
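One very small piece of this idea, tracking how data points connect as the distance scale grows, can be sketched by counting the connected components of a neighborhood graph at several thresholds. This only illustrates the simplest topological summary and is an assumption about what such an analysis might start from; it is not a full TDA pipeline, and the points, thresholds, and function names are hypothetical.

```python
import math


def count_components(points, eps):
    """Number of connected components when points within eps are linked."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(len(points)):
        for b in range(a + 1, len(points)):
            if math.dist(points[a], points[b]) <= eps:
                parent[find(a)] = find(b)
    return len({find(i) for i in range(len(points))})


def component_profile(points, thresholds):
    """Component count at each distance threshold, from fine to coarse."""
    return [(eps, count_components(points, eps)) for eps in sorted(thresholds)]


# Hypothetical images embedded as points in a 2D feature space.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
profile = component_profile(points, [0.5, 1.5, 20.0])
```

A component count that stays constant over a wide range of thresholds suggests structure that is present in the data rather than created by one particular choice of scale.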
Computing Hardware: The remarkable success of deep learning algorithms has dazzled
not only the machine learning community at large but also the medical imaging community.
However, these algorithms are hungry not just for data but also for computing resources.
Most of them need multiple high-performance computing nodes, such
as computing-focused graphics processing units (GPUs) with sufficient local memory
close to the computational units. For example, Nvidia (Santa Clara, CA) provides a
dedicated web page for medical imaging applications on its GPU platforms [65] and
announced Project Clara in March 2018 [66]. Many other vendors are developing
specialized hardware for deep learning algorithms as well.
Image Data Mart: Another direction in which Big Data in medical imaging
needs to grow is Data Mart development. Such image databases have many dimensions,
e.g., imaging modalities, disease models, and metadata associated with
study questions or protocols. Many such databases are currently being made
available, e.g., the recent release by NIH of 100,000 chest X-ray images from 30,000
patients for lung cancer research [67]. According to the announcement, the project
will also release torso images of the same type for research on abdominal diseases [68].
Other similar lung cancer datasets are available on the web. Eventually, such large
data release projects will need to be consolidated when they are too similar, or made
easy to cross-reference, in order to pose higher-level research questions, e.g., connecting
the genotype and phenotype of a disease. The National Cancer Informatics Program
provides a step toward such an integrated view of data. It took more than 30 years for
the genomics databases to provide researchers with a disciplined view for data access.
We expect that this experience will be harnessed to give scientists comprehensive access
to imaging databases.
1.7 Conclusions
The machine learning models built upon unique and important radiomic features
provide a new way not only to study tumor heterogeneity but also to quantitatively
analyze the changes in micro-level gene or protein patterns that are hidden behind
medical images. Many studies have applied radiomic features and reached promising
results in precision medicine. As a new research field, radiomics still needs further
exploration. With the standardization of medical imaging data, and the rapid develop-
ment of various methods of image segmentation, feature extraction, feature selection
and machine learning algorithms, the application of radiomic features will have far-
reaching effects and lead to tremendous changes in radiology. As larger datasets are