17.1 Classification of Adrenal Lesions Through Spatial Bayesian Modeling of GLCM
Fig. 1: Illustrative patterns of GLCM in adrenal lesions from CT imaging for 3 representative subjects, with pixel-level ROIs on the left and GLCMs on the right.

discerning the extent and pathogenesis of a particular disease. To overcome this limitation, we developed a model-based Bayesian probabilistic framework for classification of GLCM objects. By introducing a Gaussian process to account for spatial dependencies among the GLCM entries, we derive a probability model that characterizes the distribution of the entire normalized GLCM as a multivariate response surface. The method, which we refer to as a Bayesian Spatial Gaussian Process Classifier (BSGC), was applied in a cancer detection context to discriminate malignant from benign adrenal lesions based on enhancement patterns observed on CT. Compared with currently employed analytic approaches, which pass radiomics-based texture features to a support vector machine, logistic regression, artificial neural network, or random forest, the proposed method is shown to yield substantial improvements in classification accuracy. The methodology offers insight into how imaging data may be better utilized to identify complex patterns that characterize the intrinsic heterogeneity observed in tumor pathology.

2. HIERARCHICAL STATISTICAL FRAMEWORK

This section describes how spatial modeling techniques, namely Gaussian Markov Random Field (GMRF) priors, can be employed in a hierarchical statistical model to capture patterns of dependence within the GLCM structure and to yield probability-based measures for prediction and classification.

2.2. Gaussian Markov Random Field (GMRF)

GMRF models have been studied extensively as tools for characterizing spatial variation of observable lattice-type data as well as unobservable latent variables in hierarchical models. We define a GMRF by specifying a set of full conditional distributions for $\eta_i$:

$$\eta_i(s_k) \mid \eta_i(s_1), \ldots, \eta_i(s_{k-1}), \eta_i(s_{k+1}), \ldots, \eta_i(s_n) \sim N\left( \sum_{d \neq k} \frac{W_{kd}\, \eta_i(s_d)}{B_k},\ \frac{1}{\tau_\eta B_k} \right),$$

where $B_k$ is a normalizing factor, $\tau_\eta$ is a precision parameter, and $W_{kd}$ represents a "weight" characterizing the strength of dependence between lattice elements $k$ and $d$, $k \neq d$. Through Brook's Lemma [8], the joint distribution is

$$f(\eta_i) \propto \exp\left( -\frac{\tau_\eta}{2}\, \eta_i' (B - W)\, \eta_i \right).$$

Specifying $W$ as the adjacency matrix, with $W_{kd}$ indicating whether lattice elements $k$ and $d$ are adjacent neighbors, and $B$ as a diagonal matrix with $k$th entry $B_k = D_k + q$, where $D_k$ denotes the number of neighbors of lattice element $k$ and $q > 0$ is a diagonal offset term, ensures a proper joint multivariate normal distribution with $(B - W)$ symmetric and positive definite with non-positive off-diagonal entries [9].

2.3. Bayesian Hierarchical Model and Posterior Inference

To complete the conditional sampling model, given the model parameters and the latent, spatially correlated random effects, the observed empirical rates for patient $i$ are assumed to follow a Gaussian distribution:

$$y_i \mid \beta, \eta_i, \tau \sim MVN\left( y_i \mid x_i \beta + \eta_i,\ \tau^{-1} I \right).$$

Prior specification for $\eta_i$ follows from the GMRF formulation,

$$\eta_i \mid \tau_\eta, d \sim MVN\left( \eta_i \mid 0,\ \tau_\eta^{-1} (D + \mathrm{diag}(d) - W)^{-1} \right).$$
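The GMRF construction of Section 2.2 can be illustrated with a minimal NumPy sketch. It builds the adjacency matrix $W$ for an 8 x 8 lattice (one cell per GLCM entry), forms the precision matrix $\tau_\eta (B - W)$ with $B_k = D_k + q$, and draws one realization of the latent field. The function names `lattice_adjacency` and `gmrf_precision`, the shared-edge (rook) neighborhood, and the values `tau_eta=2.0` and `q=0.1` are illustrative assumptions, not choices taken from the study.

```python
import numpy as np

def lattice_adjacency(n_rows, n_cols):
    """Adjacency matrix W for an n_rows x n_cols lattice.

    Assumption: two cells are neighbors when they share an edge (rook rule).
    """
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            k = r * n_cols + c
            if r + 1 < n_rows:                      # neighbor below
                W[k, k + n_cols] = W[k + n_cols, k] = 1.0
            if c + 1 < n_cols:                      # neighbor to the right
                W[k, k + 1] = W[k + 1, k] = 1.0
    return W

def gmrf_precision(W, tau_eta, q):
    """Precision matrix tau_eta * (B - W), with B = diag(D_k + q)."""
    D = W.sum(axis=1)                               # D_k: number of neighbors of cell k
    B = np.diag(D + q)                              # q may be a scalar or a vector of offsets
    return tau_eta * (B - W)

W = lattice_adjacency(8, 8)                         # 8 gray levels -> 64 lattice cells
Q = gmrf_precision(W, tau_eta=2.0, q=0.1)           # illustrative parameter values

# The smallest eigenvalue is positive, so the joint distribution is proper.
min_eig = np.linalg.eigvalsh(Q).min()

# Draw eta ~ N(0, Q^{-1}) using the Cholesky factor of the precision matrix.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(Q)                           # Q = L L'
z = rng.standard_normal(Q.shape[0])
eta = np.linalg.solve(L.T, z)                       # Cov(eta) = (L L')^{-1} = Q^{-1}
```

Because $D - W$ is a graph Laplacian with smallest eigenvalue zero, the offset $q$ is what makes the precision matrix strictly positive definite; passing a vector for `q` gives the cell-specific offsets $\mathrm{diag}(d)$ used in the prior for $\eta_i$.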
Hierarchical model specification is complete with prior distributions for the parameters:

$$
\begin{aligned}
\beta &\sim MVN(\beta \mid m_0, \Sigma_0), \\
\tau &\sim \mathrm{Gamma}(\tau \mid a, d), \\
\tau_\eta &\sim \mathrm{Gamma}(\tau_\eta \mid b, g), \text{ and} \\
d &\sim \mathrm{Gamma}(d \mid e, f).
\end{aligned}
$$

A vague prior for $\beta$, e.g. $\Sigma_0 = 1000 \cdot I$ and $m_0 = (0, 0, \ldots, 0)'$, was used in our study to promote maximal data learning. Following recommendations in [8], we set $a = d = 0.001$, $b = g = 0.1$, and chose $e = 1$, $f = 0.0001$. Bayesian posterior inference was conducted through Markov chain Monte Carlo (MCMC) sampling and post-MCMC computation of posterior summary statistics.

2.4. Predictive Discriminant Analysis

Unlike regression-based classifiers, which rely on linear predictors, the Bayesian paradigm facilitates class prediction in the presence of collinearity through probability-based measures that characterize the distributions of interdependent observable predictors under candidate classes. Bayesian discriminant-type predictive classifiers are fully specified through the predictive density of an observable predictor and the prior probability of each class. Denote the observed GLCM of a new, heretofore unclassified object by $y_{N+1}$ with potential covariates $x_{N+1}$. Additionally, let $c = \{c_1, c_2, \ldots, c_h\}$ denote the set of all possible classes to which object $y_{N+1}$ could be assigned. The classification probability for any class configuration follows from our model specification as proportional to the product of the prior probability of class $c_k$ and the value of the conditional predictive distribution for $y_{N+1}$ under class $c_k$,

$$P(c = c_k \mid y_{N+1}, y, x_{N+1}, x) = \frac{P(y_{N+1} \mid y, x, x_{N+1}, c = c_k)\, P(c = c_k)}{\sum_{l=1}^{h} P(y_{N+1} \mid y, x, x_{N+1}, c = c_l)\, P(c = c_l)}, \qquad (1)$$

where

$$
\begin{aligned}
P(y_{N+1} \mid y, x, x_{N+1}, c = c_k) = \int & P\left(y_{N+1} \mid \eta_{N+1}^{c_k}, y^{c_k}, x^{c_k}, x_{N+1}, \beta^{c_k}, \tau^{c_k}, \tau_\eta^{c_k}, \rho^{c_k}\right) \\
& \times P\left(\beta^{c_k}, \eta_{N+1}^{c_k}, \tau^{c_k}, \tau_\eta^{c_k}, \rho^{c_k} \mid y^{c_k}, x^{c_k}\right) \\
& d\left(\beta^{c_k}, \eta_{N+1}^{c_k}, \tau^{c_k}, \tau_\eta^{c_k}, \rho^{c_k}\right),
\end{aligned}
$$

is computed by averaging over posterior samples obtained from MCMC. The prior values for $P(c = c_k)$ can be specified based on the frequency of class $c_k$ observed in a training sample or through other available information. Class labels can be assigned in accordance with the highest class probability (1), which is used as the basis for evaluation in our simulation and case studies using a leave-one-out cross-validation (LOOCV) strategy.

3. SIMULATION STUDY

Performance of the proposed BSGC classifier was evaluated through simulation and compared to four popular approaches in the GLCM texture analysis literature. Specifically, classification performance based on GLCM-derived texture features (such as contrast, homogeneity, energy, and entropy) using logistic regression, support vector machine (SVM), artificial neural network (ANN), and random forest was compared to that of the BSGC approach [4, 7, 10–14]. To formulate a sampling model capable of reproducing the spatial patterns of GLCMs that were evident in our diagnostic cancer study of adrenal lesions, we considered GLCMs constructed from 8 gray levels with normalized element-wise probability densities arising from a bivariate normal distribution with $\mu = (2 + c, 6 - c)'$ and

$$\Sigma = \begin{pmatrix} 10 & -0.5 \\ -0.5 & 10 \end{pmatrix}.$$

A final smoothed event rate surface was obtained for GLCM simulation by smoothing over the Gaussian-derived empirical rate surface, calculated as the number of generated points in each grid cell in proportion to the total number of generated points over the entire grid surface. In addition, to obtain the discrete GLCM space, we scaled the rate surface by a random integer between 50 and 5000 characterizing image size, with a distribution obtained from the empirical distribution of image sizes observed in our case study, and rounded the values to the nearest integers. Simulation scenarios were constructed by considering distributions of GLCMs generated under different choices of the mean shift parameter $c$.

Figure 2 depicts the mean rate surface used to simulate GLCM-derived patterns with $c = 0, 1, 2, 3.5$, respectively. Simulation scenarios were formulated to shift the peak count cells from upper left toward lower right along the diagonal, reflecting varying patterns pertaining to the spatial clustering of dense versus non-dense tissues observable in our study of adrenal lesions using GLCMs based on CT. Class $c = 0$ is considered the reference group to which every other class is compared. Simulated results of classification accuracy under LOOCV are displayed in Figure 3, where BSGC is shown to outperform competing methods that attempt to use the information contained in GLCMs by reducing the multivariate functional structure to a set of summary statistics (i.e., GLCM-derived features). In fact, all feature-based classification approaches failed to yield monotonic trends as $c$ increased in our simulation study. In contrast, as $c$ becomes larger, the patterns in the GLCM space become more separable from the reference group, so the discrimination accuracy of the BSGC method increases monotonically. Thus, when functional structure is present in the GLCM, our proposed BSGC method was more robust than the feature-based classification approaches. Non-monotonic trends in performance were apparent for feature-based classifiers, which tended to achieve best performance near $c = 2$, corresponding
to GLCMs with mean peak counts of occurrence at moderate gray levels. Considering the symmetry inherent to GLCMs that characterize all directions, these methods resulted in diminished predictive performance for increasing values of $c$ because 180° counterclockwise rotations produce similar GLCM-derived summary statistics, thereby attenuating the true extent of data signal between classes of $c = 0$ and $c = 4$. By contrast, the BSGC classifier characterizes the positional information of peak counts via the adjacency matrix, which results in robustness to shifts in the mean count surface and enhances separability with increasing $c$.

Fig. 2: Simulation scenarios for comparing GLCM-based classifiers. Each panel depicts the assumed mean rate surface used to simulate GLCM-derived patterns with a different choice of $c$: upper left, $c = 0$; upper right, $c = 1$; lower left, $c = 2$; lower right, $c = 3.5$.

Fig. 3: Simulation results. Classification accuracy obtained from prediction under LOOCV.

4. CASE STUDY

Specifically, to obtain the summary statistics reported in Table 1, LOOCV was implemented as follows: at each step, the observables (either GLCM or GLCM-derived features) from a single lesion were omitted from the training set, while posterior inference was implemented using data from the remaining lesions. Thereafter, a class (benign or malignant) was predicted by each method for the ROIs contributed by the omitted patient and compared with each region's true known status. With a relative improvement in accuracy of more than 70% (0.80 versus at most 0.47), the BSGC method outperformed the existing textural feature-based classifiers.

Table 1: Case study results. Classification accuracy obtained from prediction under LOOCV.
Logistic Random
BSGC SVM ANN
Regression Forest
Accuracy 0.80 0.46 0.47 0.47 0.45
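The class assignment rule of Equation (1) and the LOOCV evaluation loop can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study: `toy_log_predictive` is a hypothetical stand-in (an isotropic Gaussian around the class mean) for the BSGC predictive density, the data are synthetic, and all function names are invented for the example. In the actual method the predictive density for each class would be obtained by averaging the model likelihood over MCMC draws, which is what `log_mean_exp` is meant to suggest.

```python
import numpy as np

def log_mean_exp(a):
    """Stable log of the mean of exp(a), e.g. for averaging likelihoods over MCMC draws."""
    a = np.asarray(a, dtype=float)
    m = a.max()
    return m + np.log(np.mean(np.exp(a - m)))

def class_probabilities(log_pred, prior):
    """Equation (1): predictive density times class prior, normalized over classes."""
    log_post = np.asarray(log_pred, dtype=float) + np.log(prior)
    log_post -= log_post.max()                      # guard against underflow
    post = np.exp(log_post)
    return post / post.sum()

def loocv_accuracy(Y, labels, log_predictive):
    """Leave one object out, score it under each class, assign the most probable class."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    hits = 0
    for i in range(len(Y)):
        keep = np.arange(len(Y)) != i
        Y_tr, lab_tr = Y[keep], labels[keep]
        # Class priors from training frequencies, as suggested in Section 2.4.
        prior = np.array([(lab_tr == c).mean() for c in classes])
        log_pred = [log_predictive(Y[i], Y_tr[lab_tr == c]) for c in classes]
        probs = class_probabilities(log_pred, prior)
        hits += int(classes[np.argmax(probs)] == labels[i])
    return hits / len(Y)

def toy_log_predictive(y_new, Y_class):
    """Hypothetical stand-in for the model-based predictive density (assumption)."""
    return -0.5 * np.sum((y_new - Y_class.mean(axis=0)) ** 2)

# Synthetic two-class data, used only to exercise the loop.
rng = np.random.default_rng(1)
Y = np.vstack([rng.normal(0.0, 1.0, (20, 4)), rng.normal(3.0, 1.0, (20, 4))])
labels = np.array([0] * 20 + [1] * 20)
accuracy = loocv_accuracy(Y, labels, toy_log_predictive)

# Two classes with equal priors and predictive densities 0.2 and 0.6.
probs = class_probabilities([np.log(0.2), np.log(0.6)], np.array([0.5, 0.5]))
```

Working on the log scale matters in practice because predictive densities for a full GLCM are products over many cells and can underflow when exponentiated directly.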
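For the simulation design of Section 3, the following sketch shows one way a count-valued GLCM could be generated for a given mean shift $c$. Several details are assumptions made for illustration: the grid is taken to be unit cells on $[0, 8] \times [0, 8]$, the number of generated points (`n_points`) is arbitrary, the scaling integer is drawn uniformly from 50 to 5000 rather than from the empirical image-size distribution, and the smoothing step is omitted because the smoother is not specified in the text. The function name `simulate_glcm` is hypothetical.

```python
import numpy as np

def simulate_glcm(c, rng, n_levels=8, n_points=20000, size_range=(50, 5000)):
    """Simulate one integer-valued GLCM under mean shift c (illustrative sketch)."""
    mu = np.array([2.0 + c, 6.0 - c])
    Sigma = np.array([[10.0, -0.5], [-0.5, 10.0]])
    pts = rng.multivariate_normal(mu, Sigma, size=n_points)

    # Count generated points per grid cell; points outside the grid are dropped.
    edges = np.arange(n_levels + 1)
    counts, _, _ = np.histogram2d(pts[:, 0], pts[:, 1], bins=[edges, edges])

    # Empirical rate surface: cell counts relative to all points on the grid.
    rate = counts / counts.sum()

    # Scale by a random "image size" and round to obtain discrete counts.
    # Assumption: uniform draw; the study uses an empirical size distribution.
    scale = rng.integers(size_range[0], size_range[1] + 1)
    return np.rint(rate * scale).astype(int)

rng = np.random.default_rng(2)
glcms = {c: simulate_glcm(c, rng) for c in (0, 1, 2, 3.5)}
```

Each entry of `glcms` is an 8 x 8 array of non-negative integers whose total is close to the sampled scale factor, differing only through rounding.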
Journal of Surgical Oncology, vol. 13, no. 1, pp. 117, 2015.

[3] R. M. Haralick, K. Shanmugam, et al., “Textural features for image classification,” IEEE Transactions on Systems, Man, and Cybernetics, no. 6, pp. 610–621, 1973.

[14] M. Pawar, D. K. Sharma, and R. Giri, “Multiclass skin disease classification using Neural Network,” International Journal of Computer Science and Information Technology Research, vol. 2, no. 4, pp. 189–193, 2014.