Voxel-Based Morphometry With Unified Segmentation


Ged Ridgway
Centre for Medical Image Computing
University College London
Thanks to:
John Ashburner and the FIL Methods Group.
Preprocessing in SPM
• Realignment
– With non-linear unwarping for EPI fMRI
• Slice-time correction
• Coregistration
• Normalisation
• Segmentation
• Smoothing
SPM8b’s unified tissue segmentation and spatial normalisation procedure
But first, an introduction to Computational Neuroanatomy
Aims of computational neuroanatomy
• Many interesting and clinically important
questions might relate to the shape or local size
of regions of the brain
• For example, whether (and where) local patterns
of brain morphometry help to:
– Distinguish schizophrenics from healthy controls
– Understand plasticity, e.g. when learning new skills
– Explain the changes seen in development and aging
– Differentiate degenerative disease from healthy aging
– Evaluate subjects on drug treatments versus placebo
Alzheimer’s Disease example
[Figure: baseline image (standard clinical MRI, 1.5T T1 SPGR, 1x1x1.5mm voxels); repeat image (12 month follow-up, rigidly registered); subtraction image]
SPM for group fMRI
[Diagram: for each subject, fMRI time-series → preprocessing → statistical modelling → results query → “contrast” image; the contrast images enter group-wise statistics, giving an spm T map]
SPM for structural MRI
[Diagram: for each subject, high-res T1 MRI → ? → group-wise statistics]
The need for tissue segmentation
• High-resolution MRI reveals fine structural detail
in the brain, but not all of it is reliable or interesting
– Noise, intensity-inhomogeneity, vasculature, …
• MR Intensity is usually not quantitatively
meaningful (in the same way that e.g. CT is)
– fMRI time-series allow signal changes to be analysed
statistically, compared to baseline or global values
• Regional volumes of the three main tissue types:
gray matter, white matter and CSF, are well-
defined and potentially very interesting
Examples of segmentation
[Figure: GM and WM segmentations overlaid on original images]
[Figure: structural image, GM and WM segments, and brain-mask (sum of GM and WM)]
Segmentation – basic approach
• Intensities are modelled by a Gaussian Mixture
Model (AKA Mixture Of Gaussians)
• With a specified number of components
• Parameterised by means, variances and mixing
proportions (prior probabilities for components)
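As a minimal sketch of the mixture model described above (not SPM's implementation), the following evaluates a one-dimensional Gaussian mixture and the posterior probability of each component for a few intensities. The three component means, variances and mixing proportions are hypothetical values chosen only for illustration.

```python
import numpy as np

def gaussian_pdf(y, mean, var):
    """Density of a Gaussian with the given mean and variance, evaluated at y."""
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def mixture_responsibilities(y, means, variances, mixing):
    """Posterior probability of each mixture component for each intensity in y."""
    y = np.asarray(y, dtype=float)[:, None]                # shape (n_voxels, 1)
    weighted = mixing * gaussian_pdf(y, means, variances)  # shape (n_voxels, K)
    return weighted / weighted.sum(axis=1, keepdims=True)

# Hypothetical three-component model (illustrative values only).
means = np.array([0.2, 0.5, 0.8])
variances = np.array([0.01, 0.01, 0.01])
mixing = np.array([0.2, 0.4, 0.4])

resp = mixture_responsibilities([0.2, 0.5, 0.8], means, variances, mixing)
labels = resp.argmax(axis=1)   # most probable component per intensity
```

Each row of `resp` sums to one, so it can be read as a soft segmentation of that intensity across the components.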
Non-Gaussian Intensity Distributions
• Multiple MoG components per tissue class allow
non-Gaussian distributions to be modelled
– E.g. accounting for partial volume effects
– Or possibility of deep GM differing from cortical GM
Tissue Probability Maps
• Tissue probability maps (TPMs) can be used to
provide a spatially varying prior distribution,
which is tuned by the mixing proportions
– These TPMs come from the segmented images of
many subjects, done by the ICBM project
Class priors
• The probability of class k at voxel i, given weights γ, is then:
$$P(c_i = k \mid \gamma) = \frac{\gamma_k b_{ik}}{\sum_{j=1}^{K} \gamma_j b_{ij}}$$
• Where $b_{ij}$ is the value of the jth TPM at voxel i.
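The class-prior formula above can be sketched directly in NumPy. The TPM values and weights below are made up for illustration; `class_priors` is a hypothetical helper name, not an SPM function.

```python
import numpy as np

def class_priors(tpm, gamma):
    """Return P(c_i = k | gamma) for TPM values b_ik (n_voxels x K) and weights gamma (K,)."""
    weighted = tpm * gamma                                  # gamma_k * b_ik
    return weighted / weighted.sum(axis=1, keepdims=True)   # divide by sum_j gamma_j * b_ij

# Two voxels, three tissue classes (illustrative numbers).
b = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
gamma = np.array([1.0, 2.0, 1.0])
priors = class_priors(b, gamma)
```

With all weights equal the priors reduce to the TPM values themselves, which is one way to see that γ only retunes the balance between classes.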
Aligning the tissue probability maps
• Initially affine-registered using a multi-dimensional form of mutual information
• Iteratively warped to improve the fit of the unified segmentation model to the data
– Familiar DCT-basis function concept, as used in normalisation
MRI Bias Correction
• MR images are corrupted by smoothly varying
intensity inhomogeneity caused by magnetic
field imperfections and subject-field interactions
– Would make intensity distribution spatially variable
• A smooth intensity correction can be modelled
by a linear combination of DCT basis functions
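To illustrate the idea of a smooth field built from a few DCT basis functions, here is a one-dimensional sketch. It assumes a DCT-II basis and fits the coefficients by ordinary least squares to a known, hypothetical gain ramp; the real model estimates the coefficients jointly with the segmentation, which this sketch does not attempt.

```python
import numpy as np

def dct_basis(n_points, n_basis):
    """Columns are the first n_basis DCT-II basis functions sampled at n_points."""
    i = np.arange(n_points)[:, None]
    m = np.arange(n_basis)[None, :]
    return np.cos(np.pi * (2 * i + 1) * m / (2 * n_points))

n = 128
B = dct_basis(n, 4)                    # only low spatial frequencies
x = np.linspace(0.0, 1.0, n)
true_field = 1.0 + 0.3 * x             # hypothetical smooth gain along one axis

# Least-squares coefficients of the linear combination of basis functions.
coeffs, *_ = np.linalg.lstsq(B, true_field, rcond=None)
fitted_field = B @ coeffs

# A uniform tissue of intensity 100 seen through the gain, then corrected.
observed = 100.0 * true_field
corrected = observed / fitted_field
```

Keeping only a handful of low-frequency columns is what forces the estimated correction to vary smoothly.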
Summary of the unified model
• SPM8b implements a generative model
– Principled Bayesian probabilistic formulation
• Combines deformable tissue probability maps
with Gaussian mixture model segmentation
– The inverse of the transformation that aligns the
TPMs can be used to normalise the original image
• Bias correction is included within the model
Segmentation clean-up
• Results may contain some non-brain tissue (dura, scalp, etc.)
• This can be removed automatically using simple morphological filtering operations
– Erosion
– Conditional dilation
[Figure: lower segmentations have been cleaned up]
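The erosion and conditional dilation steps can be sketched on a small 2D binary mask. This is an illustrative toy using a 4-neighbour structuring element and hypothetical iteration counts, not SPM's clean-up routine.

```python
import numpy as np

def _neighbourhood(mask):
    """The mask and its four axis-aligned neighbours, using zero padding."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    return [p[1:-1, 1:-1], p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]]

def erode(mask):
    """A pixel survives only if it and all four neighbours are set."""
    return np.logical_and.reduce(_neighbourhood(mask))

def dilate(mask):
    """A pixel is set if it or any of its four neighbours is set."""
    return np.logical_or.reduce(_neighbourhood(mask))

def clean_up(mask, n_erode=1, n_dilate=2):
    """Erode, then dilate conditionally on the original mask."""
    core = mask.copy()
    for _ in range(n_erode):
        core = erode(core)
    for _ in range(n_dilate):
        core = dilate(core) & mask   # growth is only allowed inside the original mask
    return core

# Toy "segmentation": a brain block, a thin scalp strip, and a thin bridge joining them.
mask = np.zeros((20, 20), dtype=bool)
mask[5:15, 5:15] = True    # brain
mask[1, 2:18] = True       # scalp strip, one pixel thick
mask[2:5, 10] = True       # bridge, one pixel wide

cleaned = clean_up(mask)
```

In this toy case the scalp strip disappears and the brain block is fully recovered; a short stub of the bridge can remain, since conditional dilation regrows into any part of the original mask that touches the eroded core.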
Limitations of the current model
• Assumes that the brain consists of only GM and
WM, with some CSF around it.
– No model for lesions (stroke, tumours, etc.)
• Prior probability model is based on relatively
young and healthy brains
– Less appropriate for subjects outside this population
• Needs reasonable quality images to work with
– No severe artefacts
– Good separation of intensities
– Good initial alignment with TPMs...
Extensions (possible or prototype)
• Multispectral modelling: $\mu_k \to \boldsymbol{\mu}_k$, $\sigma_k \to \boldsymbol{\sigma}_k$, $\beta \to \{\beta_s\}$
– (New Segment Toolbox)
• Deeper Bayesian philosophy
– E.g. priors over means and variances
– Marginalisation of nuisance variables
– Model comparison
• Groupwise model (enormous!)
• Combination with DARTEL (see later and new seg tbx)
• More tissue priors e.g. deep grey, meninges, etc.
• Imaging physics
– See Fischl et al. 2004, as cited in A&F introduction
Voxel-Based Morphometry
• In essence VBM is Statistical Parametric
Mapping of segmented tissue density
• The exact interpretation of gray matter
concentration or density is complicated, and
depends on the preprocessing steps used
– It is not interpretable as neuronal packing density or
other cytoarchitectonic tissue properties, though
changes in these microscopic properties may lead to
macro- or mesoscopic VBM-detectable differences
A brief history of VBM
• A Voxel-Based Method for the Statistical Analysis of
Gray and White Matter Density… Wright, McGuire,
Poline, Travere, Murray, Frith, Frackowiak and Friston.
NeuroImage 2(4), 1995 (!)
– Rigid reorientation (by eye), semi-automatic scalp editing and
segmentation, 8mm smoothing, SPM statistics, global covariates.
• Voxel-Based Morphometry – The Methods. Ashburner
and Friston. NeuroImage 11(6 pt.1), 2000
– Non-linear spatial normalisation, automatic segmentation
– Thorough consideration of assumptions and confounds
A brief history of VBM
• A Voxel-Based Morphometric Study of Ageing… Good,
Johnsrude, Ashburner, Henson and Friston. NeuroImage
14(1), 2001
– Optimised GM-normalisation (“a half-baked procedure”),
modulation of segments with Jacobian determinants
• Unified Segmentation. Ashburner and Friston.
NeuroImage 26(3), 2005
– Principled generative model for segmentation using
deformable priors
• A Fast Diffeomorphic Image Registration Algorithm.
Ashburner. Neuroimage 38(1), 2007
– Large deformation normalisation to average shape templates
• …
VBM overview
• Unified segmentation and spatial normalisation
• Optional modulation with Jacobian determinant
• Optional computation of tissue totals/globals
• Gaussian smoothing
• Voxel-wise statistical analysis
VBM in pictures
Segment → Normalise → Modulate (?) → Smooth → Voxel-wise statistics
At each voxel xyz, the smoothed values from the N subjects form the data vector of a general linear model:
$$Y = \begin{pmatrix} a_{1xyz} \\ a_{2xyz} \\ \vdots \\ a_{Nxyz} \end{pmatrix} = X \beta_{xyz} + e_{xyz}, \qquad e_{xyz} \sim N(0, \sigma^2_{xyz} V)$$
For example, for two groups:
$$X = \begin{pmatrix} 1 & 0 \\ 1 & 0 \\ \vdots & \vdots \\ 0 & 1 \\ 0 & 1 \end{pmatrix}$$
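The voxel-wise statistics step can be sketched as an ordinary least-squares fit of the same two-group design at every voxel, followed by a t-statistic for a contrast. This sketch assumes V is the identity (independent, equal-variance errors); the data values are invented and `voxelwise_t` is a hypothetical helper, not an SPM function.

```python
import numpy as np

def voxelwise_t(Y, X, c):
    """t-statistics for contrast c, with Y of shape (n_subjects, n_voxels)."""
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)       # one column of betas per voxel
    resid = Y - X @ beta
    dof = X.shape[0] - np.linalg.matrix_rank(X)
    sigma2 = (resid ** 2).sum(axis=0) / dof            # residual variance per voxel
    var_c = c @ np.linalg.pinv(X.T @ X) @ c            # design-dependent variance factor
    return (c @ beta) / np.sqrt(sigma2 * var_c)

# Two groups of three subjects, matching the design matrix pattern above.
X = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
c = np.array([1.0, -1.0])                              # group 1 minus group 2

# Smoothed tissue values at two voxels (illustrative numbers).
Y = np.array([[0.60, 0.50],
              [0.62, 0.48],
              [0.58, 0.52],
              [0.50, 0.50],
              [0.52, 0.48],
              [0.48, 0.52]])
t = voxelwise_t(Y, X, c)
```

The first voxel has a group difference and a large t-value; the second has identical group means and a t-value of zero up to rounding.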
VBM Subtleties
• Whether to modulate
• Adjusting for total GM or Intracranial Volume
• How much to smooth
• Limitations of linear correlation
• Statistical validity
Modulation
• Multiplication of the warped (normalised) tissue intensities so that their regional or global volume is preserved
– Can detect differences in completely registered areas
• Otherwise, we preserve concentrations, and are detecting mesoscopic effects that remain after approximate registration has removed the macroscopic effects
– Flexible (not necessarily “perfect”) registration may not leave any such differences
[Figure: native, modulated and unmodulated images; intensity = tissue density]
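A one-dimensional toy example may help show why modulation preserves volume. The mapping `phi` from template to native coordinates is a hypothetical smooth warp chosen for illustration; the tissue profile is likewise invented.

```python
import numpy as np

x = np.arange(0.0, 101.0)                             # native-space coordinates
tissue = np.exp(-(x - 50.0) ** 2 / (2 * 5.0 ** 2))    # native tissue density profile

u = x.copy()                                          # template-space coordinates
phi = u + 8.0 * np.sin(2 * np.pi * u / 100.0)         # hypothetical map: template -> native
jacobian = np.gradient(phi, u)                        # d(phi)/du, the 1D "Jacobian determinant"

warped = np.interp(phi, x, tissue)                    # unmodulated: concentrations preserved
modulated = warped * jacobian                         # modulated: total amount preserved

native_total = tissue.sum()
unmodulated_total = warped.sum()
modulated_total = modulated.sum()
```

Here the warp stretches the structure in template space, so the unmodulated total is inflated, while multiplying by the Jacobian brings the total back to the native-space value.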
“Globals” for VBM
• Shape is really a multivariate concept
– Dependencies among volumes in different regions
• SPM is mass univariate
– Combining voxel-wise information with “global” integrated tissue volume provides a compromise
– Using either ANCOVA or proportional scaling
[Figure, above: (ii) is globally thicker, but locally thinner than (i) – either of these effects may be of interest to us. Below: the two “cortices” on the right both have equal volume…]
Figures from: Voxel-based morphometry of the human brain… Mechelli, Price, Friston and Ashburner. Current Medical Imaging Reviews 1(2), 2005.
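The two ways of using a global value mentioned above can be sketched as follows. Both helpers are hypothetical illustrations, and the data are constructed so that the voxel value is purely proportional to the global, meaning neither approach should report a group effect.

```python
import numpy as np

def proportional_scaling(Y, global_values):
    """Divide each subject's voxel values by that subject's global value."""
    return Y / global_values[:, None]

def ancova_design(group, global_values):
    """Two group-indicator columns plus a mean-centred global covariate."""
    group = np.asarray(group)
    centred = global_values - global_values.mean()
    return np.column_stack([group == 0, group == 1, centred]).astype(float)

group = np.array([0, 0, 0, 1, 1, 1])
total_gm = np.array([700.0, 720.0, 740.0, 650.0, 670.0, 690.0])  # illustrative globals
Y = (0.001 * total_gm)[:, None]        # one voxel, driven entirely by the global

scaled = proportional_scaling(Y, total_gm)
X = ancova_design(group, total_gm)
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
group_difference = float(beta[0, 0] - beta[1, 0])
```

Proportional scaling removes the global by division, whereas ANCOVA models it additively as a nuisance covariate; in this constructed case both leave no residual group difference.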
Total Intracranial Volume (TIV/ICV)
• “Global” integrated tissue volume may be
correlated with interesting regional effects
– Correcting for globals in this case may overly reduce
sensitivity to local differences
– Total intracranial volume integrates GM, WM and
CSF, or attempts to measure the skull-volume directly
• Not sensitive to global reduction of GM+WM
(cancelled out by CSF expansion – skull is fixed!)
– Correcting for TIV in VBM statistics may give more
powerful and/or more interpretable results
Smoothing
• The analysis will be most sensitive to effects that
match the shape and size of the kernel
• The data will be more Gaussian and closer to a
continuous random field for larger kernels
• Results will be rough and noise-like if too little
smoothing is used
• Too much will lead to distributed, indistinct blobs
Smoothing
• Between 7 and 14mm is probably best
– (lower is okay with better registration, e.g. DARTEL)
• The results below show two fairly extreme
choices, 5mm on the left, and 16mm, right
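Kernel sizes such as those above are quoted as full width at half maximum (FWHM). A minimal one-dimensional sketch of the conversion to a Gaussian standard deviation and of the smoothing itself follows; the 1.5mm voxel size is an assumption made for the example.

```python
import numpy as np

def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm / np.sqrt(8.0 * np.log(2.0))

def gaussian_kernel(fwhm_mm, voxel_mm):
    """Normalised 1D Gaussian kernel sampled on the voxel grid."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    radius = int(np.ceil(4 * sigma_vox))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_vox) ** 2)
    return kernel / kernel.sum()

def smooth_1d(signal, fwhm_mm, voxel_mm=1.5):
    """Convolve a 1D signal with a Gaussian kernel of the given FWHM."""
    return np.convolve(signal, gaussian_kernel(fwhm_mm, voxel_mm), mode="same")

impulse = np.zeros(201)
impulse[100] = 1.0
narrow = smooth_1d(impulse, 5.0)    # the small kernel from the example above
wide = smooth_1d(impulse, 16.0)     # the large kernel from the example above
```

Both outputs keep the same total, but the wider kernel spreads it over more voxels, which is the "distributed, indistinct blobs" effect in miniature.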
Nonlinearity
Caution may be needed when looking for linear
relationships between grey matter
concentrations and some covariate of interest.
[Figure: circles of uniformly increasing area, smoothed; plot of intensity at circle centres versus area]
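The circle experiment can be reproduced numerically in a few lines. This sketch assumes an isotropic 2D Gaussian kernel with a hypothetical standard deviation of 6 pixels; for that case the centre value of a smoothed disc of area A is 1 − exp(−A / (2πσ²)) in the continuous limit, which is clearly not linear in A.

```python
import numpy as np

def smoothed_centre_value(radius, sigma, half_width=60):
    """Value at the centre of a binary disc after Gaussian smoothing."""
    coords = np.arange(-half_width, half_width + 1, dtype=float)
    xx, yy = np.meshgrid(coords, coords)
    r2 = xx ** 2 + yy ** 2
    disc = (r2 <= radius ** 2).astype(float)
    kernel = np.exp(-0.5 * r2 / sigma ** 2)
    kernel /= kernel.sum()
    # The smoothed image at the centre is the kernel-weighted sum of the disc.
    return float((disc * kernel).sum())

sigma = 6.0
areas = np.linspace(50.0, 500.0, 10)          # uniformly increasing areas (pixels)
radii = np.sqrt(areas / np.pi)
centre = np.array([smoothed_centre_value(r, sigma) for r in radii])

# Continuous-limit prediction for comparison.
predicted = 1.0 - np.exp(-areas / (2.0 * np.pi * sigma ** 2))
```

The centre intensity rises quickly for small circles and then saturates towards one, so equal steps in area do not give equal steps in smoothed intensity.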
VBM’s statistical validity
• Residuals are not normally distributed
– Little impact on uncorrected statistics for experiments
comparing reasonably sized groups
– Probably invalid for experiments that compare single
subjects or tiny groups with a larger control group
• Need to use nonparametric tests that make fewer
assumptions, e.g. permutation testing with SnPM
• Correction for multiple comparisons
– RFT correction based on peak heights should be OK
• Correction using cluster extents is problematic
– SPM usually assumes that the smoothness of the
residuals is spatially stationary
• VBM residuals have spatially varying smoothness
• Bigger blobs expected in smoother regions
– Toolboxes are now available for non-stationary
cluster-based correction
• http://www.fmri.wfubmc.edu/cms/NS-General
• False discovery rate
– Less conservative than FWE
– Popular in morphometric work
• (almost universal for cortical thickness in FS)
– Recently questioned…
• Topological FDR in SPM8
– See release notes for details and paper
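The permutation testing idea mentioned above for small or unbalanced groups can be sketched with a maximum-statistic test, which also gives family-wise error control over voxels. This is a generic illustration using an absolute difference of group means and simulated data; it is not SnPM's procedure, and all names and values are hypothetical.

```python
import numpy as np

def max_stat_permutation_test(Y, labels, n_perm=1000, seed=0):
    """FWE-corrected p-values per voxel from the permutation distribution of the maximum."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)

    def abs_mean_diff(lab):
        return np.abs(Y[lab].mean(axis=0) - Y[~lab].mean(axis=0))

    observed = abs_mean_diff(labels)
    # Largest statistic over all voxels for each random relabelling.
    max_null = np.array([abs_mean_diff(rng.permutation(labels)).max()
                         for _ in range(n_perm)])
    exceed = (max_null[:, None] >= observed[None, :]).sum(axis=0)
    # Count the observed labelling as one of the permutations.
    return (exceed + 1) / (n_perm + 1)

# Simulated data: 16 subjects, 5 voxels, a large group effect at voxel 0 only.
rng = np.random.default_rng(1)
labels = np.array([True] * 8 + [False] * 8)
Y = rng.normal(0.0, 1.0, size=(16, 5))
Y[labels, 0] += 5.0

p_fwe = max_stat_permutation_test(Y, labels)
```

No distributional assumption on the residuals is needed here; the only assumption is that subjects are exchangeable between groups under the null hypothesis.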
Variations on VBM
• “All modulation, no gray matter”
– Jacobian determinant “Tensor” Based Morphometry
– Davatzikos et al. (1996) JCAT 20:88-97
• Deformation field morphometry
– Cao and Worsley (1999) Ann Stat 27:925-942
– Ashburner et al (1998) Hum Brain Mapp 6:348-357
• Other variations on TBM
– Chung et al (2001) NeuroImage 14:595-606
Deformation and shape change
Figures from Ashburner and Friston, “Morphometry”, Ch.6
of Human Brain Function, 2nd Edition, Academic Press
Deformation fields and Jacobians
[Figure: original, warped and template images; deformation vector field; Jacobian matrix; determinant of Jacobian matrix encodes voxel’s volume change]
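As a small sketch of how a Jacobian determinant map can be obtained from a deformation field, the following uses finite differences on a 2D field. The affine deformation is a hypothetical test case chosen because its determinant is known exactly.

```python
import numpy as np

def jacobian_determinant_2d(phi_x, phi_y, spacing=1.0):
    """Determinant of the Jacobian of a 2D mapping given on a grid (rows = y, columns = x)."""
    dphix_dy, dphix_dx = np.gradient(phi_x, spacing)
    dphiy_dy, dphiy_dx = np.gradient(phi_y, spacing)
    return dphix_dx * dphiy_dy - dphix_dy * dphiy_dx

# Hypothetical affine deformation: phi(x, y) = (1.2 x + 0.1 y, 0.9 y).
yy, xx = np.mgrid[0:10, 0:12].astype(float)
phi_x = 1.2 * xx + 0.1 * yy
phi_y = 0.9 * yy

det = jacobian_determinant_2d(phi_x, phi_y)   # expected 1.2 * 0.9 = 1.08 everywhere
```

Values above one indicate local expansion and values below one local contraction, which is the quantity used for modulation and for tensor-based morphometry.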
Longitudinal VBM
• Intra-subject registration over time much more
accurate than inter-subject normalisation
• Imprecise inter-subject normalisation
– Spatial smoothing required
• Different methods have been developed to
reduce the danger of expansion and contraction
cancelling out…
Longitudinal VBM variations
• Voxel Compression mapping separates
expansion and contraction before smoothing
– Scahill et al (2002) PNAS 99:4703-4707
• Longitudinal VBM multiplies longitudinal volume
change with baseline or average grey matter
density
– Chételat et al (2005) NeuroImage 27:934-946
Longitudinal VBM variations
[Figure: late and early images; late CSF and early CSF; warped early image, difference and relative volumes; early CSF “modulated” by relative volume; late CSF − early CSF and late CSF − modulated CSF, smoothed]
Nonrigid registration developments
• Large deformation concept
– Regularise velocity not displacement
• (syrup instead of elastic)
• Leads to concept of geodesic
– Provides a metric for distance between shapes
– Geodesic or Riemannian average = mean shape
• If velocity assumed constant computation is fast
– Ashburner (2007) NeuroImage 38:95-113
– DARTEL toolbox in SPM8b
• Currently initialised from unified seg_sn.mat files
DARTEL exponentiates a velocity flow field to get a deformation field
[Figure: velocity flow field]
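One common way to exponentiate a constant velocity field is scaling and squaring: take a very small step along the velocity, then repeatedly compose the resulting map with itself. The one-dimensional sketch below illustrates that general technique with a hypothetical velocity field and linear interpolation; it is not the DARTEL toolbox code.

```python
import numpy as np

def exponentiate_velocity(v, x, n_squarings=8):
    """Approximate the deformation exp(v) on grid x by scaling and squaring."""
    u = v / 2 ** n_squarings                  # displacement of a very small step
    for _ in range(n_squarings):
        # Compose the map with itself: phi(phi(x)) = x + u(x) + u(x + u(x)).
        u = u + np.interp(x + u, x, u)
    return x + u

x = np.linspace(0.0, 100.0, 201)
v = 15.0 * np.sin(np.pi * x / 100.0)          # hypothetical velocity, zero at both ends

phi = exponentiate_velocity(v, x)             # forward deformation
phi_inv = exponentiate_velocity(-v, x)        # exponentiating -v gives the inverse

# Composing forward and inverse maps should return (approximately) the identity.
round_trip = np.interp(phi, x, phi_inv)
```

The resulting map is monotonic (one-to-one), and the inverse comes for free by negating the velocity, which is a large part of the appeal of the velocity-based parameterisation.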

Example geodesic shape average
[Figure: average on Riemannian manifold versus linear average (not on Riemannian manifold)]
DARTEL average template evolution
[Figure: grey matter average of 452 subjects (affine), evolving over iterations to 471 subjects (DARTEL)]
Questioning Intersubject normalisation
• Registration algorithms might find very different
correspondences to human experts
– Crum et al. (2003) NeuroImage 20:1425-1437
• Higher dimensional warping improves image
similarity but not necessarily landmark
correspondence
– Hellier et al. (2003) IEEE TMI 22:1120-1130
Questioning Intersubject normalisation
• Subjects can have fundamentally different
sulcal/gyral morphological variants
– Caulo et al. (2007) Am. J. Neuroradiol. 28:1480-85
• Sulcal landmarks don’t always match underlying
cytoarchitectonics
– Amunts, et al. (2007) NeuroImage 37(4):1061-5
Intersubject normalisation opportunities
• High-field high-resolution MR may have potential to image cytoarchitecture
• Will registration be better or worse at higher resolution?
– More information to use
– More severe discrepancies?
– Need rougher deformations
– Non-diffeomorphic?
[Figure: 4.7T FSE. De Vita et al (2003) Br J Radiol 76:631-7]

Intersubject normalisation opportunities
• Regions of interest for fMRI can be defined from
functional localisers or orthogonal SPM contrasts
– No obvious equivalent for single-subject structural MR
• Potential to include diffusion-weighted MRI
information in registration?
– Zhang et al. (2006) Med. Image Analysis 10:764-785
Summary of key points
• VBM performs voxel-wise statistical analysis on
smoothed (modulated) normalised segments
• SPM8b performs segmentation and spatial
normalisation in a unified generative model
• Intersubject correspondence is imperfect
– Smoothing alleviates this problem to some extent
• Also improves statistical validity
• Some current research is focussed on more
sophisticated registration models
Unified segmentation in detail
An alternative explanation to the paper
and to John’s slides from London ’07
http://www.fil.ion.ucl.ac.uk/spm/course/slides07/Image_registration.ppt
Unified segmentation from the GMM upwards…
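The standard Gaussian mixture model
Voxel i, class k:
$$p(y_i \mid c_i = k) = N(y_i \mid \mu_k, \sigma_k^2)$$
$$p(y_i, c_i = k) = \gamma_k N(y_i \mid \mu_k, \sigma_k^2)$$
$$p(y_i) = \sum_k \gamma_k N(y_i \mid \mu_k, \sigma_k^2)$$
Assumes independence (but spatial priors later...):
$$p(\mathbf{y}) = \prod_i \sum_k \gamma_k N(y_i \mid \mu_k, \sigma_k^2) = p(\mathbf{y} \mid \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) \qquad (1\text{-}5)$$
Could solve with EM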
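A minimal sketch of how EM could be used to fit this standard mixture model to intensities is given below. It covers only the plain GMM (no bias field, no spatial priors), uses simulated intensities, and the starting values are hypothetical.

```python
import numpy as np

def fit_gmm_em(y, means, variances, mixing, n_iter=50):
    """Fit a 1D Gaussian mixture by EM, starting from the given parameters."""
    y = np.asarray(y, dtype=float)[:, None]
    means = np.array(means, dtype=float)
    variances = np.array(variances, dtype=float)
    mixing = np.array(mixing, dtype=float)
    for _ in range(n_iter):
        # E-step: posterior probability of each class at each voxel.
        pdf = np.exp(-0.5 * (y - means) ** 2 / variances) / np.sqrt(2 * np.pi * variances)
        resp = mixing * pdf
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted updates of the parameters.
        nk = resp.sum(axis=0)
        means = (resp * y).sum(axis=0) / nk
        variances = (resp * (y - means) ** 2).sum(axis=0) / nk
        mixing = nk / y.shape[0]
    return means, variances, mixing

# Simulated intensities from two well-separated classes.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.3, 0.05, 600), rng.normal(0.7, 0.05, 400)])

means, variances, mixing = fit_gmm_em(y, [0.2, 0.8], [0.02, 0.02], [0.5, 0.5])
```

With well-separated classes the estimates settle close to the simulated means and proportions; real intensity histograms overlap far more, which is part of the motivation for adding spatial priors.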
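Spatially modify mean and variance with bias field
$$p(\mathbf{y}) = \prod_i \sum_k \gamma_k N(y_i \mid \mu_k, \sigma_k^2)$$
$$\mu_k \to \mu_k / \rho_i(\beta), \qquad \sigma_k \to \sigma_k / \rho_i(\beta)$$
Note spatial dependence (on voxel i); β are the coefficients for a linear combination of DCT basis functions.
$$p(\mathbf{y}) = \prod_i \sum_k \gamma_k N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\beta)}, \left(\frac{\sigma_k}{\rho_i(\beta)}\right)^2\right) \qquad (10)$$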
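Anatomical priors through mixing coefficients
$$p(\mathbf{y}) = \prod_i \sum_k \gamma_k N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\beta)}, \left(\frac{\sigma_k}{\rho_i(\beta)}\right)^2\right)$$
Basic idea: $\gamma_k \to \gamma_{ik}$. Note spatial dependence (on voxel i).
Implementation:
$$\gamma_k \to \frac{\gamma_k b_{ik}}{\sum_j \gamma_j b_{ij}}$$
with $b_{ik}$ prespecified and $\gamma_k$ estimated.
$$p(\mathbf{y}) = \prod_i \sum_k \frac{\gamma_k b_{ik}}{\sum_j \gamma_j b_{ij}} N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\beta)}, \left(\frac{\sigma_k}{\rho_i(\beta)}\right)^2\right) \qquad (12)$$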
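Aside: MRF Priors (A&F, Gaser’s VBM5 toolbox)
$$\gamma_k \to \frac{\gamma_k b_{ik}}{\sum_j \gamma_j b_{ij}}$$
$$\gamma_k \to \frac{b_{ik} \exp\left(\sum_{m=1}^{K} \gamma_{km} r_{mi}\right)}{\sum_j b_{ij} \exp\left(\sum_{m=1}^{K} \gamma_{jm} r_{mi}\right)} \qquad (45)$$
where $r_{mi}$ is the probable number of neighbours in class m, for voxel i.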
Spatially deformable priors (inverse of normalisation)
$$p(\mathbf{y}) = \prod_i \sum_k \frac{\gamma_k b_{ik}}{\sum_j \gamma_j b_{ij}} N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\beta)}, \left(\frac{\sigma_k}{\rho_i(\beta)}\right)^2\right)$$
Simple idea! $b_{ik} \to b_{ik}(\alpha)$. Prior for voxel i depends on some general transformation model, parameterised by α.
SPM8b’s model is affine + DCT warp, with ~1000 DCT basis functions.
Optimisation is tricky…
(13)
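Spatially deformable priors (inverse of normalisation)
$$p(\mathbf{y}) = \prod_i \sum_k \frac{\gamma_k b_{ik}}{\sum_j \gamma_j b_{ij}} N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\beta)}, \left(\frac{\sigma_k}{\rho_i(\beta)}\right)^2\right)$$
$$b_{ik} \to b_{ik}(\alpha)$$
$$p(\mathbf{y}) = \prod_i \sum_k \frac{\gamma_k b_{ik}(\alpha)}{\sum_j \gamma_j b_{ij}(\alpha)} N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\beta)}, \left(\frac{\sigma_k}{\rho_i(\beta)}\right)^2\right)$$
$$p(\mathbf{y}) = p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma})$$
(14, pretty much)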
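Objective function so far…
$$p(\mathbf{y}) = p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) = \prod_i \sum_k \frac{\gamma_k b_{ik}(\alpha)}{\sum_j \gamma_j b_{ij}(\alpha)} N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\beta)}, \left(\frac{\sigma_k}{\rho_i(\beta)}\right)^2\right)$$
$$E = -\log p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) = -\sum_i \log p(y_i \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma})$$
$$E = -\sum_i \log \sum_k \frac{\gamma_k b_{ik}(\alpha)}{\sum_j \gamma_j b_{ij}(\alpha)} N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\beta)}, \left(\frac{\sigma_k}{\rho_i(\beta)}\right)^2\right)$$
(14, I think...)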
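Objective function with regularisation
$$p(\mathbf{y}) = p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma})$$
Assumes priors independent:
$$p(\mathbf{y}, \boldsymbol{\alpha}, \boldsymbol{\beta} \mid \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) = p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma})\, p(\boldsymbol{\alpha})\, p(\boldsymbol{\beta})$$
$$p(\boldsymbol{\alpha}) = N(0, C_\alpha), \qquad p(\boldsymbol{\beta}) = N(0, C_\beta)$$
$$E = -\log p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma})$$
$$F = -\log p(\mathbf{y}, \boldsymbol{\alpha}, \boldsymbol{\beta} \mid \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) = E - \log p(\boldsymbol{\alpha}) - \log p(\boldsymbol{\beta})$$
$C_\alpha$: $\boldsymbol{\alpha}^T C_\alpha^{-1} \boldsymbol{\alpha}$ gives deformation’s bending energy
(15,16)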
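For reference, the penalty terms can be written out explicitly. This is the standard negative log density of a zero-mean multivariate Gaussian, applied to the two priors above; the constants collect terms that do not depend on α or β.

```latex
\begin{align}
-\log p(\boldsymbol{\alpha}) &= \tfrac{1}{2}\,\boldsymbol{\alpha}^T C_\alpha^{-1} \boldsymbol{\alpha}
  + \tfrac{1}{2}\log\left|2\pi C_\alpha\right| \\
-\log p(\boldsymbol{\beta}) &= \tfrac{1}{2}\,\boldsymbol{\beta}^T C_\beta^{-1} \boldsymbol{\beta}
  + \tfrac{1}{2}\log\left|2\pi C_\beta\right| \\
F &= E + \tfrac{1}{2}\,\boldsymbol{\alpha}^T C_\alpha^{-1} \boldsymbol{\alpha}
  + \tfrac{1}{2}\,\boldsymbol{\beta}^T C_\beta^{-1} \boldsymbol{\beta} + \text{const}
\end{align}
```

So regularisation simply adds quadratic penalties on the warp and bias coefficients to the data term E.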
Optimisation approach
Minimising:
$$F = -\sum_i \log \sum_k \frac{\gamma_k b_{ik}(\alpha)}{\sum_j \gamma_j b_{ij}(\alpha)} N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\beta)}, \left(\frac{\sigma_k}{\rho_i(\beta)}\right)^2\right) - \log p(\boldsymbol{\alpha}) - \log p(\boldsymbol{\beta})$$
with respect to {α, β, μ, σ, γ} is very difficult…
Iterated Conditional Modes is used – this alternately
optimises certain sets of parameters, while keeping the
rest fixed at their current best solution
• EM used for mixture parameters
• Levenberg-Marquardt (LM) used for bias and
warping parameters
– Note unified segmentation model with Gaussian
assumptions has a “least-squares like” log(objective)
making it ideal for Gauss-Newton or LM optimisation
• Local opt, so starting estimates must be good
– May need to manually reorient troublesome scans
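The alternating scheme can be illustrated on a deliberately tiny problem. The quadratic objective below is a hypothetical stand-in for F with just two scalar "parameter sets", a and b; each update is the exact minimiser over one variable with the other held fixed, which is the essence of Iterated Conditional Modes.

```python
def objective(a, b):
    """Toy coupled objective standing in for F (illustrative only)."""
    return (a - 1.0) ** 2 + (b + 2.0) ** 2 + a * b

def icm_minimise(n_sweeps=20):
    """Alternately minimise over a and b, each time holding the other fixed."""
    a, b = 0.0, 0.0
    for _ in range(n_sweeps):
        a = 1.0 - b / 2.0       # solves d(objective)/da = 0 for fixed b
        b = -2.0 - a / 2.0      # solves d(objective)/db = 0 for fixed a
    return a, b

a_hat, b_hat = icm_minimise()   # joint minimum is at a = 8/3, b = -10/3
```

For this convex toy the sweeps converge to the joint minimum from any start; in the real model the objective is not convex, hence the emphasis above on good starting estimates.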
Unified segmentation from the GMM upwards…
• Repeat until convergence…
– Hold γ, μ, σ² and α constant, and minimise E w.r.t. β
• Levenberg-Marquardt strategy, using dE/dβ and d²E/dβ²
– Hold γ, μ, σ² and β constant, and minimise E w.r.t. α
• Levenberg-Marquardt strategy, using dE/dα and d²E/dα²
– Hold α and β constant, and minimise E w.r.t. γ, μ and σ²
• Expectation Maximisation
Note ICM steps
Figure from C. Gaser
Results of the Generative model
Key flaw, lack of neighbourhood
correlation – “whiteness” of noise
Motivates (H)MRF priors, which
should encourage contiguous
tissue classes
(Note, MRF prior is not equivalent
to smoothing each resultant
tissue segment, but differences in
eventual SPMs may be minor…)
