

Current Opinion in Drug Discovery & Development 2009 12(1):40-52


Thomson Reuters (Scientific) Ltd ISSN 1367-6733

Metabolomics in pharmaceutical research and development:
Metabolites, mechanisms and pathways
Ethan Y Xu, William H Schaefer & Qiuwei Xu*
Address
Department of Safety Assessment, Merck Research Laboratories,
770 Sumneytown Pike, West Point, PA 19486, USA
Email: qiuwei_xu@merck.com
*To whom correspondence should be addressed

In recent years, quantitative metabolomics has played increasingly important roles in pharmaceutical research and development.
Metabolic profiling of biofluids and tissues can provide a panoramic view of abundance changes in endogenous metabolites to
complement transcriptomics and proteomics in monitoring cellular responses to perturbations such as diseases and drug treatments.
Precise identification and accurate quantification of metabolites facilitate downstream pathway and network analysis using software
tools for the discovery of clinically accessible and minimally invasive biomarkers of drug efficacy and toxicity. Metabolite abundance
profiles are also indicative of biochemical phenotypes, which can be used to identify novel quantitative trait loci in genome-wide
association studies. This review summarizes recent experimental and computational efforts to improve the metabolomics technology
as well as progress towards in-depth integration of metabolomics with other disparate omics datasets to build mechanistic models
in the form of detailed and testable hypotheses.
Keywords Biochemical mechanisms, identification, quantification, metabolic profiling, metabolomics, metabonomics, NMR, pathway
analysis, systems biology

Abbreviations
DSS-d6 2,2-dimethyl-2-silapentane-5-sulfonate sodium,
FDR false discovery rate, HSQC heteronuclear single
quantum correlation, LC-MS liquid chromatography mass
spectrometry, MRS magnetic resonance spectroscopy,
MS mass spectrometry, NMR nuclear magnetic resonance,
RT-PCR reverse transcription polymerase chain reaction,
QTL quantitative trait locus

Introduction
Metabolomics, also known as metabonomics [1] or
metabolic profiling [2], originated from Linus Pauling's
seminal vision of generating information-rich quantitative
response profiles from human biofluids to evaluate defined
diets for orthomolecular medicine [3], a form of alternative
medicine aimed at preventing and treating disease with
natural products. Since then, metabolomics has evolved
into a valuable tool in systems biology and permeated
into diverse areas such as investigative toxicology [4-8],
pharmaceutical lead optimization [9], environmental science
[10-12], epidemiology [13-15], disease and population
stratification [16-18], pharmacology [19-21], plant biology
[22-24], cellular biochemistry [25-27], and human nutrition
[28-32]. Throughout this review, the term 'metabolomics'
will be used to describe the application of analytical
chemistry tools to follow changes of endogenous
metabolites in biofluids, cells and tissues.
Nuclear magnetic resonance (NMR) and mass spectrometry
(MS) have been the most widely used analytical platforms
in fingerprinting spectral differences and profiling
endogenous small molecules in metabolomic analyses. 1D
1H-NMR has the advantage of being precisely quantitative
and non-destructive, while simultaneously detecting all
proton-bearing molecules from one single spectrum.
Molecules are distinguished in 1D 1H-NMR spectra by their
chemical shifts, peak multiplicities and coupling constants.
1H-NMR has a large linear dynamic range for quantification
and can detect in the micromolar range. MS, often coupled
with liquid or gas chromatographic (LC- or GC-) separation
techniques, has an advantage in sensitivity although it is
only semi-quantitative (or relatively quantitative) for most
applications. LC-MS or GC-MS identifies molecules based
on accurately determining mass-to-charge ratios of
molecular ions and their fragment ions, as well as the
associated retention time from the chromatographic
separation. MS has a lower detection limit in the nanomolar
range. This review will focus on data acquisition and
processing using NMR-based quantitative metabolomics,
and readers interested in MS analytical strategies and
algorithm developments in MS-based metabolomics are
referred to references [33-36].
Evolving from early limitations of analytical technologies,
a shift toward identification of all possible endogenous
molecules has been made possible by the high resolution
and sensitivity of modern NMR and MS technology,
developments of robust algorithms, establishment of
inclusive spectral reference libraries, and interactive
software. The identification of molecules is often driven
by a need to understand and interpret the observed
changes of endogenous metabolites in terms of relevant
biochemical mechanisms and metabolic pathways, enabling
independent confirmation of any observation by orthogonal
or independent analytical platforms, different animal
models, and other 'omics approaches.
The interpretation of abundance changes of endogenous
metabolites requires the physiological context of
biochemical pathways and metabolite-protein interaction
networks. Metabolic profiles are molecular representations
of physiological and functional status; they are molecular
phenotypes of functioning genes and proteins [22].
Metabolomes in higher organisms can be viewed in the
context of the extended central dogma of molecular
biology (see Figure 1). They are often affected by symbiotic
organisms [37,38], diets [39-41] and other environmental
factors [13,30,37,40].
This review begins by discussing data processing and
analysis algorithms for chemometric and quantitative
metabolomics. The importance of reference metabolite
databases will be highlighted in the context of quantitative
NMR metabolomics. An overview of recent literature will
be provided, describing efforts to integrate metabolomics
with other 'omics profiling technologies, and new data
analysis tools to extract biochemical pathway information
from metabolite identification and quantification.

Chemometric metabolomics and quantitative metabolomics
Quantitative NMR analysis in biological systems can be
traced back to as early as 1971 [42], shortly after its
inception for quantitative chemical analysis [43]. The
accuracy and robustness of quantitative NMR have been
validated even in rigorous GMP (Good Manufacturing
Practice) environments for biological product release and
characterization [44,45]. As early as 1989, Nicholson and
his colleagues published a paper on quantitative 1H-NMR
analyses of excreted urine metabolites from rats dosed
with cadmium [46]. Given its nearly 100% natural
abundance and large gyromagnetic ratio, 1H is the most
sensitive of the stable NMR-active nuclei, including 31P and 13C.
Despite some early successes, analysis of 1D 1H-NMR spectra
was often hampered by extensive and complicated peak overlap
from the dozens to hundreds of detectable chemicals in
tissues and biofluids.

Chemometric metabolomics
There are two different data analysis approaches for dealing
with the large numbers of complex 1H-NMR spectra
in a metabolomic study. The first method is multivariate
pattern recognition or chemometric analysis [47].
Chemometric metabolomics focuses on the identification of
global trends in spectral peak patterns rather than on the
identification and quantification of endogenous metabolites
in spectra of overlapping signals from mixtures. The
analysis provides an unbiased representation of the whole
metabolomic dataset and saves a significant amount of
time by avoiding the tedious task of unambiguous
metabolite identification. However, there are growing
concerns over poor inter-laboratory reproducibility of
the multivariate metabolic fingerprints derived from
chemometric analyses, the inability to confirm chemometric
findings with complementary technologies, and the lack
of biological insights due to the absence of unequivocal
metabolite identification [48].

Figure 1. The extended central dogma of molecular biology.
[Diagram labels: Organism; Environment (diet, xenobiotics, stressors); Genome (DNA); Transcriptome (mRNA); Proteome (protein); Metabolome (metabolites M1 to M4); Systems biology.]

In recent years, several new techniques of multivariate
data analysis have been developed for chemometric
metabolomics. Statistical total correlation spectroscopy
(STOCSY), proposed by Cloarec et al at Imperial
College London, exploits the multicollinearity of the
chemical shift variables in a set of NMR spectra to
generate a pseudo-2D NMR spectrum that resembles
homonuclear total correlation spectroscopy (TOCSY) [49].
The application of STOCSY to NMR spectra often faces
the following challenges: (i) selection of a threshold value
of Pearson correlation coefficient (PCC) when drawing
contour plots to filter out background noise; (ii) alignment
of NMR spectra to remove small variations in peak
position, line width, or peak shape due to the effect of pH,
concentration and ionic strength; and (iii) limited dynamic
ranges of peak intensities.
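The correlation step at the heart of STOCSY can be sketched in a few lines. The example below is a minimal illustration and not the published implementation [49]: it assumes the spectra are already aligned and stored as rows of a matrix, and the function name `stocsy_trace`, the synthetic intensities and the PCC threshold of 0.8 are all assumptions made here for illustration.

```python
import numpy as np

def stocsy_trace(spectra, driver_index):
    """Pearson correlation of every chemical shift variable with one driver variable.

    spectra: array of shape (n_samples, n_points), one aligned spectrum per row.
    Returns an array of length n_points with values in [-1, 1].
    """
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                    # center each chemical shift variable
    norms = np.sqrt((Xc ** 2).sum(axis=0))
    norms[norms == 0] = np.inf                 # constant variables get correlation 0
    driver = Xc[:, driver_index]
    return (Xc.T @ driver) / (norms * norms[driver_index])

# Synthetic data (assumed): 20 "spectra" with 6 chemical shift variables.
conc_a = np.linspace(1.0, 5.0, 20)             # metabolite A, peaks in columns 0 and 3
conc_b = (conc_a - 3.0) ** 2                   # metabolite B, uncorrelated with A by construction
spectra = np.zeros((20, 6))
spectra[:, 0] = conc_a
spectra[:, 3] = 0.5 * conc_a
spectra[:, 1] = conc_b
spectra[:, 4] = 2.0 * conc_b

trace = stocsy_trace(spectra, driver_index=0)
threshold = 0.8                                # PCC cut-off, an arbitrary choice here
print(np.flatnonzero(trace > threshold))       # columns that co-vary with the driver
```

In this toy case only the two columns belonging to metabolite A exceed the threshold. With real spectra the choice of threshold and the quality of alignment, the first two challenges listed above, largely determine what survives the filter.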
The Imperial College team later extended the STOCSY
concept to examine the correlation patterns between
NMR and LC-MS spectra, and named the new method
statistical heterospectroscopy (SHY) [50]. It was
demonstrated that a number of metabolites were
identified by cross-correlations between NMR and LC-MS
peaks, therefore improving the efficiency of metabolite
identification for those metabolites detectable by both
techniques [50].
Built on multivariate projection models such as Principal
Component Analysis (PCA) and Partial Least Squares
(PLS), the metabolite projection analysis (MPA) method
facilitates metabolite identification in a metabolomic
experiment [51]. NMR spectra for a set of known pure
metabolites are projected into the PCA or PLS models built
from the sample spectra, allowing for the instantaneous
identification of metabolites from scores plots without the
need for tedious annotation of signals on the loadings plots.
MPA has obvious appeal for the practice of chemometric
metabolomics, but it cannot rule out false-positive
identifications, so manual inspection is often required to
confirm the list of identified metabolites.

Quantitative metabolomics
The second data analysis approach in NMR metabolomics
is the identification and quantification of endogenous
metabolites from the NMR spectra. This approach is now
known as quantitative metabolomics [52]. This analysis is
different from the traditional targeted metabolite analysis
used in clinical chemistry where the analytical protocols
are limited to the predefined set of metabolites while
changes of all other small molecules are ignored.
Quantitative metabolomics is unbiased and is suitable for
the identification and quantification of all detectable
metabolites. However, the metabolite identification process
is limited by the size of the NMR reference spectral library
of known endogenous metabolites. Unidentified NMR
peaks can still be captured in this open-ended quantitative
NMR analysis by recording the peak positions and scaled
intensities with respect to an internal reference compound
such as DSS-d6 (2,2-dimethyl-2-silapentane-5-sulfonate
sodium). Additional NMR data acquisition such as 2D
homonuclear or heteronuclear correlation spectra in
combination with other orthogonal analytical methods
such as LC-MS can be deployed later to determine the
chemical structures of the metabolites. Identification of
endogenous metabolites is an interactive process. When
unexpected metabolites are identified in an open-profiling
experiment, other metabolites in the same or related
pathways can become candidates for follow-up targeted
analyses. This iterative metabolite analysis with a focus
on biochemical pathways has the potential to complete the
cycle of hypothesis generation and hypothesis testing in
the elucidation of underlying mechanisms of complex
diseases and drug-induced toxicities [14].
The advantage of the quantitative metabolomics approach
is highlighted in translational research projects, which
focus on the progress from preclinical to clinical testing
in the drug development process [53-55]. For example,
ex vivo metabolite profiles in cerebrospinal fluid can be
related to in vivo MRS (magnetic resonance spectroscopy)
measurements of the same set of metabolites in brain
[54]. Unlike genes and proteins, most endogenous
metabolites are identical across different tissues, organs
and species. It can be straightforward to relate
perturbation-related metabolite abundance changes
between different animal species. For example, renal
proximal tubule toxicity causes the same small molecules
to appear in the urine of rodents, monkeys and humans,
although the sequences of the membrane transporter proteins
for these molecules, and of the genes encoding them, may
vary among different species.
The practice of quantitative NMR metabolomics requires
correct NMR settings and sample preparation to achieve
precise and accurate quantification of metabolite
signals. Attention has to be given to parameters such as
recycle delay, well-contained water suppression and the
spectral width. In order to avoid spin saturation, a recycle
delay of 5 to 7 T1 is often recommended. Applying water
suppression by selective shaped pulse such as in WET
(water suppression enhanced through T1 effects) [56] can
avoid the intensity attenuation of nearby peaks (especially
peaks due to anomeric protons of sugar molecules such
as glucose, mannose and galactose that appear close to
the water signal) or even exchangeable protons (eg,
urea). A large spectral window or sharp filters such as a
brick-wall digital filter can avoid peak clipping at either
side of the spectral window.
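The recommendation of 5 to 7 T1 can be checked with a short calculation. The sketch below assumes the simplest case, in which longitudinal magnetization is fully saturated at time zero and then recovers mono-exponentially; the function name is an assumption made here for illustration.

```python
import math

def recovered_fraction(delay_in_t1):
    """Fraction of equilibrium longitudinal magnetization recovered after a
    delay given in multiples of T1, assuming full saturation at time zero:
    M(t) / M0 = 1 - exp(-t / T1)."""
    return 1.0 - math.exp(-delay_in_t1)

for n in (1, 3, 5, 7):
    print(n, round(recovered_fraction(n), 4))
```

Under this assumption a delay of 5 T1 recovers about 99.3% of the signal and 7 T1 about 99.9%, whereas 1 T1 recovers only about 63%, which would bias the quantification of slowly relaxing protons.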
The consistency of spectral intensity can be maintained
by maintaining the same instrument settings throughout
the data acquisition process for all samples. This
consistency can also be achieved by scaling the spectral
intensity relative to the peak area of an internal reference
compound that is added at a known concentration. The
latter approach, in which the reference compound is added
during sample preparation [57,58], is preferred for its
convenience and consistency. It also enables calculation
of absolute concentrations of endogenous metabolites in
each sample by peak area
integration or by spectral deconvolution methods, as
discussed below.
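For a well-resolved peak, the scaling against an internal reference reduces to a ratio of peak areas normalized by the number of contributing protons. The sketch below is a minimal illustration under that assumption; the function name and all numerical values are hypothetical, and the proton counts used are 9 for the trimethylsilyl singlet of DSS-d6 and 3 for the acetate methyl group.

```python
def concentration_from_reference(area, n_protons, ref_area, ref_n_protons, ref_conc):
    """Absolute concentration of a metabolite from the area of one resolved peak.

    area, ref_area: integrated peak areas (same arbitrary units).
    n_protons, ref_n_protons: protons contributing to each peak.
    ref_conc: known concentration of the internal reference.
    """
    if ref_area <= 0 or n_protons <= 0 or ref_n_protons <= 0:
        raise ValueError("areas and proton counts must be positive")
    return ref_conc * (area / n_protons) / (ref_area / ref_n_protons)

# Hypothetical values: 0.5 mM DSS-d6 (9 protons, area 9.0) and an
# acetate methyl peak (3 protons, area 6.0).
acetate_mM = concentration_from_reference(6.0, 3, 9.0, 9, 0.5)
print(acetate_mM)
```

The calculation is valid only when both peaks are fully relaxed and free of overlap; overlapping peaks call for deconvolution instead.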

Metabolite quantification has also been proposed with 2D
1H-13C HSQC (heteronuclear single quantum correlation)
spectroscopy [59]. The large chemical shift dispersion
along the 13C dimension helps to resolve overlapping
peaks that often complicate 1D 1H-NMR spectral analysis
for metabolomics. However, a single internal quantitative
reference is not sufficient to quantify all metabolites using
2D HSQC because the cross peak intensities depend on
many factors such as one-bond 1H-13C coupling constants
of individual 1H-13C pairs. Therefore, quantification has
to rely on external calibration curves. Given hundreds
of NMR-detectable endogenous metabolites, setting up
calibration curves for all of these chemicals is unrealistic.
Large sample quantities are also recommended to achieve
an adequate signal-to-noise ratio within a manageably
short acquisition time window. However, biofluid
samples such as cerebrospinal fluid are often limited in
availability. Hence, metabolite identification via 2D
heteronuclear NMR appears to be practical only when
large quantities of materials are available (eg, plant
metabolomics studies).

Data analysis algorithms for quantitative metabolomics
The calculation of metabolite concentrations from NMR
spectra is often carried out by two different approaches.
Traditionally, metabolite concentration is determined
using peak area integration when peaks are well resolved.
This approach is based on the principle that the NMR
signals excited by radiofrequency pulses respond linearly
to the concentration of distinctive nuclei (eg, methyl
protons in acetate) in the analyzed samples. This is
exemplified by a recent study
that measured lipid components of skin samples using
well-resolved lipid peaks [60]. However, overlapping
peaks frequently appear in the NMR spectra of biological
samples. In such cases, deconvolution algorithms have
to be applied to assist metabolite quantification.
The second approach is based on spectral lineshape
deconvolution. An NMR spectrum of a biological sample
can be viewed as a linear composition of the spectra of
each of the individual components. Therefore, a complex
mixture spectrum can be decomposed into constituent
reference spectra with corresponding coefficients to
account for their relative abundance in the original
spectrum. NMR spectra of individual metabolites are often
distinctive and characteristic based on the number of
peaks, peak positions, peak multiplicity, and splitting
patterns of peak multiplicity (ie, coupling constants).
Deconvolution can be achieved either by manual
adjustment of peak intensity and position for all constituent
peaks [57] or by automated deconvolution algorithms
(Figure 2).
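The linear composition described above can be written compactly. The notation is an assumption introduced here for illustration: y is the sampled mixture spectrum, r_k the reference spectrum of metabolite k acquired at a known concentration, c_k its coefficient, and ε the residual.

```latex
\mathbf{y} \;=\; \sum_{k=1}^{K} c_k\,\mathbf{r}_k + \boldsymbol{\varepsilon}
\;=\; \mathbf{R}\,\mathbf{c} + \boldsymbol{\varepsilon},
\qquad
\hat{\mathbf{c}} \;=\; \arg\min_{\mathbf{c}}\;
\lVert \mathbf{y} - \mathbf{R}\,\mathbf{c} \rVert_2^{2}
```

Here R is the matrix whose columns are the reference spectra. Each estimated coefficient, multiplied by the concentration at which its reference spectrum was acquired, gives the concentration of that metabolite in the sample.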

Figure 2. An example workflow and an illustration of deconvolution for quantitative NMR metabolomics.
[Panel A flow chart labels: sample preparation (buffered solution; add an internal reference standard, eg, DSS-d6); spectral optimization (gradient shimming; pulse calibration; water suppression calibration); data acquisition (phase, reference and scale spectra with an internal reference, eg, DSS-d6); search reference library; quantify and deconvolve (peak integration; deconvolution by SVD or least squares regression); metabolites and quantities. Panel B horizontal axis: 3.27 to 3.23 ppm.]
(A) A typical flow chart of quantitative NMR metabolomics. (B) An illustration of lineshape-based deconvolution of overlapping NMR peaks
for metabolite quantification. The solid curves represent original NMR peaks, while dashed curves (with solid circles) represent the fitted
peaks based on reference chemicals. Two lower curves (dashed and dot-dashed) are deconvolved peaks of taurine and trimethylamine
oxide (TMAO), respectively. The intensities of the deconvolved peaks are the products of the reference spectra of taurine and TMAO and their
contributing coefficients.
DSS-d6 2,2-dimethyl-2-silapentane-5-sulfonate sodium, SVD singular value decomposition

Singular value decomposition (SVD) provides a single
unique answer to quantifying and identifying metabolites
in mixtures by solving linear response equations [58]
without requiring initial seeding values to start the
calculations. Unlike least squares regression (see below),
SVD only produces coefficients for spectra of contributing
metabolites, thus avoiding forced negative coefficients
often seen in least-squares fitting. The accuracy and
robustness of the SVD method has been demonstrated
on a set of synthetic mixtures, and in quantifying taurine
and trimethylamine oxide (TMAO) from rat urine samples
in the crowded spectral region between 3.23 and 3.27 ppm
[58].
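How an SVD yields the coefficients without starting values can be sketched as follows. This is a generic pseudo-inverse solution of the linear response equations, offered as an illustration only; it does not reproduce the selection of contributing metabolites described in [58]. The Lorentzian lineshapes, peak positions, linewidths and coefficients are synthetic assumptions, loosely placed in the 3.23 to 3.27 ppm region.

```python
import numpy as np

def svd_solve(R, y, rcond=1e-10):
    """Solve y ~ R @ c through the SVD pseudo-inverse of R.

    R: (n_points, n_refs) matrix whose columns are reference spectra.
    y: (n_points,) mixture spectrum.
    Singular values below rcond * largest are discarded.
    """
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    keep = s > rcond * s.max()
    return Vt[keep].T @ ((U[:, keep].T @ y) / s[keep])

def lorentzian(ppm, center, width):
    """Unit-height Lorentzian line with full width at half maximum `width`."""
    half = width / 2.0
    return half ** 2 / ((ppm - center) ** 2 + half ** 2)

# Two overlapping synthetic reference lines (assumed positions and widths).
ppm = np.linspace(3.20, 3.30, 501)
R = np.column_stack([lorentzian(ppm, 3.25, 0.004),
                     lorentzian(ppm, 3.26, 0.004)])
true_c = np.array([2.0, 0.5])
y = R @ true_c                      # noise-free mixture spectrum

c_hat = svd_solve(R, y)
print(c_hat)
```

Because the solution is obtained in closed form from the decomposition, no initial guess is needed, in contrast to iterative lineshape fitting.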
An alternative deconvolution method solves the linear
response equations by least squares regression. A method
called LCModel has been widely used for in vivo MRS
quantification of endogenous metabolites in human and
animal brains [61,62]. Gipson et al developed a weighted
least squares method on binned spectral intensities to
search for differences among treatment groups and
identify metabolites [63]. In this algorithm, weighting
factors are inversely proportional to signal variances so
that peaks of significant intensities are weighted more
heavily.
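The effect of inverse-variance weighting can be illustrated with a small example. The sketch below is a generic weighted least squares fit and not the algorithm of Gipson et al [63]; the binned reference matrix, the variances and the corrupted bin are made-up values chosen for illustration.

```python
import numpy as np

def weighted_least_squares(R, y, variances):
    """Minimize sum_i (y_i - (R @ c)_i)**2 / variances_i over c.

    Implemented by scaling each row by 1 / sqrt(variance) and calling
    an ordinary least squares solver.
    """
    sw = 1.0 / np.sqrt(np.asarray(variances, dtype=float))
    coef, *_ = np.linalg.lstsq(R * sw[:, None], y * sw, rcond=None)
    return coef

# Four bins, two reference metabolites (hypothetical binned intensities).
R = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, 2.0]])
true_c = np.array([1.0, 3.0])
y = R @ true_c
y[3] += 2.0                                  # one unreliable bin
variances = np.array([1.0, 1.0, 1.0, 100.0]) # that bin has a large variance

c_weighted = weighted_least_squares(R, y, variances)
c_unweighted = weighted_least_squares(R, y, np.ones(4))
print(c_weighted, c_unweighted)
```

Down-weighting the high-variance bin keeps the estimate close to the true coefficients, while the unweighted fit is pulled toward the corrupted intensity.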
Keys to the success of deconvolution-based quantification
algorithms include exhaustive collection of reference NMR
spectra and development of robust algorithms and efficient
software for peak alignment. Significant progress has
been made in building reference spectra databases to
facilitate metabolite identification and quantification.
However, improvements are still required in the
development of peak alignment algorithms, and the
implementation of such algorithms in interactive data
processing software.
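A very simple form of local peak alignment is sketched below as a point of reference. It searches for the integer shift that maximizes the overlap between a sample region and a reference region. This is a deliberately basic stand-in chosen for illustration, not one of the published alignment algorithms, and the function name and synthetic Gaussian peaks are assumptions.

```python
import numpy as np

def best_shift(reference, sample, max_shift):
    """Integer shift (in data points) that, applied to `sample` with np.roll,
    maximizes its dot product with `reference` within +/- max_shift."""
    shifts = range(-max_shift, max_shift + 1)
    scores = [np.dot(reference, np.roll(sample, s)) for s in shifts]
    return shifts[int(np.argmax(scores))]

# Synthetic region of 100 points: the sample peak sits 3 points to the
# right of the reference peak.
x = np.arange(100)
reference = np.exp(-0.5 * ((x - 50) / 2.0) ** 2)
sample = np.exp(-0.5 * ((x - 53) / 2.0) ** 2)

shift = best_shift(reference, sample, max_shift=10)
aligned = np.roll(sample, shift)   # np.roll wraps around; harmless here because the edges are near zero
print(shift, int(np.argmax(aligned)))
```

A rigid shift of this kind cannot correct changes in line width or peak shape, which is one reason why more elaborate algorithms and interactive inspection remain necessary.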

The need for peak alignment is common in the processing
of NMR and chromatography data. Although peak alignment
can be conducted manually, it is preferable to have it
performed by software tools implementing robust alignment
algorithms. Various algorithms are being developed and
tested such as the fuzzy Hough transform for aligning
peaks in an image representation of NMR spectra [64],
and several warping algorithms to align sample peaks
to pre-selected reference peaks using interpolation of
spectral data points [65-69]. Developing an algorithm to
achieve global alignment for all peak regions of a spectrum
is a challenge. It is more practical to align peaks locally
in a few selected small regions where quantifying by
deconvolution would follow. The intensities of peaks in
other regions for the same molecules can be constructed
with the deconvolution coefficients. Alignment and
quantification can, therefore, be further evaluated by
comparing reconstructed peak intensities with actual
NMR peaks in those regions.
A flexible and powerful software interface should combine
existing, robust algorithms to efficiently process the peak
alignments, and be able to highlight misaligned regions
so that users can easily inspect and correct the
misalignments interactively. During the software
development process, it is important to clarify the
requirements for sample preparation, NMR data acquisition,
spectral pre-processing and deconvolution algorithms.
For example, the inclusion of a quantitative internal
reference would help scale spectral intensity consistently
and uniformly, and correct data acquisition settings would
warrant consistent spectral quality and ease the process
of spectral phase and baseline correction.
Deconvolution algorithms could facilitate not only the
quantification of overlapping peaks but also that of
low-intensity peaks. The integration of low-intensity peaks
is often plagued by low signal-to-noise ratios. It is often
straightforward to recognize a compound of low abundance
in a 1H-NMR spectrum by overlaying distinctive reference
peaks onto a sample spectrum. Both overlapping and
low-intensity peaks account for a significant portion of
signals in NMR spectra. Deconvolution algorithms are
therefore indispensable to quantitative metabolomics in
their capability to extract the abundance profiles of
hundreds of metabolites from a single 1H-NMR spectrum.
One successful example of deconvolution-based NMR
quantification has been the profiling of serum lipoproteins
[70-72]. Using pre-acquired reference spectra of
lipoprotein sub-particles, this method successfully
measured lipoprotein subclasses such as chylomicrons,
very low density lipoprotein (VLDL), low-density lipoprotein
(LDL) and high-density lipoprotein (HDL). The approach
circumvented the need to physically separate each
subclass of particles while achieving satisfactory results.

Reference databases for metabolite identification and quantification
Current estimates suggest that there may be as few as
2500 to 3000 endogenous metabolites (active or present)
in the human metabolome; however, this number is
expected to change as the metabolite detection technologies
become more sensitive and comprehensive [48,73].
The chemical structures of many of these endogenous
metabolites are known during the enzymatic transformation
in metabolic pathways, and are often conserved among
different species across phylogeny. While chemometric
metabolomics could focus on examining global trends
in spectral patterns without identifying the underlying
metabolites, the practice of quantitative metabolomics
requires the extraction of detectable and recognizable
metabolite abundance profiles from the spectra of
complex mixtures of biofluids (eg, urine, plasma/serum,
cerebrospinal fluid), tissue biopsies or cultured cell lysates.
The collection of quantitative 1D 1H-NMR spectra of pure
compounds to build a spectral reference database is
essential for this metabolite identification and quantification
process.
Currently, there are a few public and proprietary NMR
spectral libraries. These include the Human Metabolome
Database (HMDB; freely available) [74], the Madison
Metabolomics Consortium Database (MMCD; freely
available to academic users) [75], the Chenomx database
(proprietary) [57], and many other proprietary databases.
HMDB (www.hmdb.ca) contains NMR spectra for at least
755 pure chemicals. Among them are 1D 1H and 2D 1H-13C
HSQC NMR spectra. These chemicals include not only
endogenous metabolites but also xenobiotics found in
human biofluids at concentrations of > 1 µM. Similarly,
MMCD (mmcd.nmrfam.wisc.edu) contains NMR spectra for
477 compounds. These spectra include 1D 1H, 13C, 13C-DEPT
(Distortionless Enhancement by Polarization Transfer), 2D
homonuclear and heteronuclear spectra. For metabolomic
studies based on LC-MS, METLIN (metlin.scripps.edu)
has become a valuable database tool since it became
available in 2005 [76,77]. KEGG (www.genome.jp/kegg)
and PubChem (pubchem.ncbi.nlm.nih.gov) remain the
most useful all-purpose metabolite databases, offering
hyperlinked biochemical and chemical knowledge not
captured in those specialized metabolite databases for
NMR and LC-MS metabolomics.
HMDB provides annotations to each metabolite, for example,
the matrices (eg, biofluids and tissues) where those
metabolites are usually found, with normal and abnormal
reference ranges of concentrations. The collection of
this type of information is tedious, involving literature
searching, semi-automated text mining and exquisite
measurements in targeted biomatrices [78]. This
information is especially valuable at the stage of post-identification
biochemical interpretation of changes in
metabolite abundance. It is not uncommon for a preclinical
toxicity study to be limited by the number of test subjects,
especially with non-rodent large animals, where normal
population variability is difficult to assess. In those cases,
biochemical interpretation of a metabolomic profiling
experiment will benefit from the use of historical data in
published literature or databases.

The proprietary Chenomx database is a part of the
metabolomics software for 1D 1H-NMR spectral analysis
and contains at least 270 reference compounds. This
NMR database has two additional dimensions affecting
metabolite peak appearance: pH and NMR field strength.
Because chemical shifts of many metabolite peaks are
sensitive to pH, the collection of NMR spectra over a
range of pH conditions and the inclusion of pH-dependent
chemical shifts in the database help to search for chemicals
and peak alignment when pH-indicative chemicals such
as imidazole or difluorotrimethylsilanylphosphonic acid
(DFTMP) [79] are added during sample preparation
for monitoring solution pH. In addition, spectral peak
dispersion is dependent on the magnetic field of the NMR
spectrometer: high-field magnets can often resolve
otherwise overlapping peaks. This feature is particularly
relevant to line shape-based metabolite quantification, where
proper peak dispersion helps to align peaks in the reference
spectra to their corresponding peaks in sample spectra.
Quantitative reference spectra require the presence of an
internal reference such as DSS-d6, or other appropriate
small molecules of known concentrations. The
concentrations of reference compounds should be similar
to the concentrations commonly found in samples. The
concentration of the internal reference standard (eg, DSS-d6)
should also be similarly close. High concentrations can
affect chemical shifts, distort peak line shapes, and
contribute to variance of quantification when sample
concentration is disparately lower than that of a reference
compound. A typical concentration range for a reference
compound is ~ 1 to 2 mM, at which most abundant
chemicals in biofluids are detected by 1D 1H-NMR and a good
signal-to-noise ratio can be achieved by acquiring reference
spectra in a short time.
Given the invaluable role of NMR reference spectra for
metabolite identification and quantification, establishing
a commonly shared database containing a library of
quantitative NMR reference spectra will undoubtedly save
time and cost by avoiding duplicate efforts. The large
size of a library with contributions from all participants
can accelerate quantitative NMR metabolomics. However,
standardization of spectral collection conditions and
spectral quality still remains to be set. Quality control
standards will need to be achieved before NMR spectra
can be accepted for repository.

Elucidating biochemical mechanisms of diseases and drug toxicities
In toxicity studies, metabolomics, transcriptomics and
proteomics can all yield useful information about changes
in biochemistry related to the toxicological and pathological
effects. However, biochemical interpretation of results from
each of these 'omics individually can lead to premature
conclusions as many toxicants may perturb cellular
processes at multiple molecular levels. In recent years,
the concept of systems biology has emerged to describe
the integrated study of complex biological systems at
multiple molecular levels that are consistent with the
extended central dogma of molecular biology (Figure 1). This
has inspired a similar concept of systems toxicology [80]
to describe a system-wide evaluation of a living organism
before and after perturbation by toxicants or stressors
via the integration of molecular profiling data and
measurement of conventional toxicological endpoints.
Integration of data from different 'omics platforms can
lead to coherent modeling of perturbed biochemical
processes and reduce the effects of measurement noise.
Many studies have attempted to integrate molecular
profiling data from various combinations of transcriptomics,
proteomics and metabolomics. In those experiments,
biofluid and tissue samples collected from the same set of
animals or from comparably treated animals were analyzed
in parallel by applying different 'omics technologies. Most
of these studies presented integrated results at the level
of interpretations already drawn from separate analyses
of individual datasets rather than co-analysis of different
'omics datasets [81].
Given the difficulty of integrating different 'omics data
types in the context of a detailed biological model
[82], one solution is to generate testable hypotheses
concerning potential protein and transcript changes
that can occur based on measured drug-induced changes
of well-known small-molecule metabolites. Chen et al
applied LC-MS metabolomic analysis of deproteinized
serum samples from control and dextran sulfate sodium
(DSS)-treated mice to investigate the pathogenic
mechanism of acute ulcerative colitis [83]. The
team found that DSS treatment elicited an increase
of the ratio of stearoyl lysophosphatidylcholine
(LPC; 18:0) over oleoyl LPC (18:1) that bore a striking
resemblance to the phenotype of stearoyl-CoA
desaturase 1 (SCD1) knockout mice. This observation
led to the hypothesis that DSS might downregulate hepatic
SCD1 at the mRNA or protein level. After verifying this
hypothesis by real-time RT-PCR (reverse transcription-polymerase
chain reaction) of whole liver mRNA and
immunoblot analysis of liver microsomes, the researchers
further conducted a series of experiments to reveal
that inhibition of SCD1-mediated oleic acid biogenesis
exacerbates proinflammatory responses to DSS challenges.
This case study has provided a compelling demonstration
of the power of conventional hypothesis-driven integration
of targeted mRNA and protein measurements with the
analysis of metabolomics data in discovering novel
therapeutic targets and diagnostic biomarkers. In
addition, the study illustrates how phenotypic similarities
between drug-induced toxicities and genetic disorders
such as inborn errors of metabolism could be exploited
to facilitate the discovery process of toxicity biomarkers
[84].
Experiments have also been carried out to integrate
metabolomics with transcriptomics data analysis at the
deeper level of detailed biological models [85]. Urine NMR
metabolomic and kidney transcriptomic profiles were
obtained from the same set of animals dosed with the
nephrotoxicants, cisplatin or gentamicin, and
metabolite-transcript correlation analysis was performed in the
context of relevant literature knowledge of renal
physiology. The two types of data were used together
in a pathway enrichment analysis (see below) and for the
calculation of correlation coefficients before any inferences
were drawn. By using urinary profiles of identified
metabolites from quantitative metabolomics rather
than binned NMR spectral regions from chemometric
analysis, Xu et al were able to build a detailed model,
including genes for membrane transporters involved
in the proximal tubule reabsorption of nutrient
metabolites and upstream transcription factors that are
downregulated by the nephrotoxicants entering the
tubular epithelial cells. This research could be extended
to the examination of global metabolite-transcript
correlation networks for potential revelation of additional
biochemical pathways perturbed by nephrotoxicants.
Although the available data might be insufficient for the
application of more advanced methods of causal inference
such as Bayesian networks [86], knowledge-guided
visualization of correlation patterns is expected to offer
more insight into the mechanisms of drug-induced target
organ toxicities.
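The metabolite-transcript correlation step described above can be illustrated with a minimal Python sketch. It assumes that each analyte has one value per animal, listed in the same animal order. The function names and all numerical values are hypothetical and are not data from [85].

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


def correlation_table(metabolites, transcripts):
    """Correlate every metabolite profile with every transcript profile."""
    return {
        (m_name, t_name): pearson(m_values, t_values)
        for m_name, m_values in metabolites.items()
        for t_name, t_values in transcripts.items()
    }


# Hypothetical values, one per animal, in the same animal order.
urine_metabolites = {"glucose": [1.0, 1.2, 3.5, 4.1, 6.0]}
kidney_transcripts = {"SLC5A2": [9.8, 9.5, 7.2, 6.9, 5.1]}

table = correlation_table(urine_metabolites, kidney_transcripts)
for pair, r in sorted(table.items(), key=lambda item: item[1]):
    print(pair, round(r, 3))
```

A strongly negative coefficient in such a table would only flag a candidate relationship; as the text notes, the interpretation still depends on literature knowledge of the underlying physiology.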

Emerging applications of metabolomics in
genotype-phenotype association studies
The characterization of genotype-phenotype associations
for complex human diseases remains a challenging task
because of the involvement and complicated interplay of
multiple genes and environmental factors. Quantitative
trait loci (QTL) are regions of the genome that contribute to
variations in a quantitative trait as measured on continuous
scales [87]. The critical difference between a qualitative
Mendelian trait and a quantitative trait is not the number of
segregating loci, but the effect size of phenotypic variations
between genotypes in comparison to the individual
variation within genotypic classes [88]. While classical
QTL analysis can readily associate a broad genomic region
with a physiological phenotype, it usually does not reveal
the molecular mechanisms underlying that phenotype. In
recent years, transcriptomic profiling of mRNA abundance
by microarrays has been deployed to expand the types of
phenotypes analyzed in genetic linkage and association
studies [87]. Genetic correlations of such gene
expression QTL (eQTL) with organism-level phenotypes
would facilitate the identification of causal mutations and
pinpoint the perturbed molecular pathways that result in a
particular phenotype [89-91].
The route to linking a QTL to a physiological phenotype
often involves the consideration of changes in the
steady-state levels of endogenous metabolites, in addition to
changes in mRNA abundance. Those changes in metabolite
abundance could correlate with various genetic, epigenetic,
transcriptional, translational, post-translational and
environmental modulations of the organism-level phenotype
[22,92]. Some metabolites are important intermediates
in signal transduction pathways that can regulate gene
expression at the transcriptional level [93]. As such, plant
biologists have introduced the concept of metabolic QTL
(mQTL) into quantitative genetics. The first report in this
field demonstrated the application of mQTL mapping as a
novel approach in tomato improvement [94]. The
metabolite profiling was limited to a targeted selection
of 74 metabolites for relative quantification by
GC-MS. Correlation analysis of all of the possible pairs of
metabolite and physiological phenotype allowed the
modeling of several regulatory networks of plant
metabolism. Keurentjes and coworkers applied untargeted
LC-MS metabolomic profiling to the genetic analysis of
natural variations of metabolite abundance in Arabidopsis
[95]. This proof-of-concept study identified mQTLs for
~75% of all mass peaks detected by LC-MS, suggesting
that the abundance of most plant metabolites is under
quantitative genetic control. mQTL mapping in Arabidopsis
was later integrated with eQTL analysis to show that
natural variations in aliphatic glucosinolates have
feedback control on the transcript abundance of
genes that encode enzymes involved in glucosinolate
biosynthesis [96].
mQTL mapping was first applied to the genetic analysis of
mammalian metabolic phenotypes by Dumas et al [97]. The
research group used chemometric NMR metabolomics to
profile a subset of serum samples from a diabetic inbred
strain (GK) crossed with a normoglycemic strain (BN) of
rats (F2), and identified approximately 110 consistent
mQTLs with LOD (logarithm of the odds) scores greater
than 3 (ie, odds of more than 1000 to 1 in favor of linkage
between the QTL and the trait). After statistical validation
with rigorous permutation tests at a threshold false
discovery rate (FDR) of 0.05, approximately 12 of
the mQTLs were found to be significant. To overcome the
lack of metabolite identities in the association of
chemometric NMR fingerprints with genomic regions,
the authors also assigned candidate metabolite names to
approximately 158 mQTLs by searching their in-house NMR
spectral database. One particular spectral region, centered
on chemical shift 7.86 ppm, was experimentally confirmed
to represent benzoate. Dumas et al then incorporated
liver transcriptomic data from the two parental strains
to suggest that the benzoate-associated mQTL on
chromosome 14 might represent the candidate gene
UDP-glucuronosyltransferase 2B (Ugt2b). The gene identity
was confirmed by a genomic Southern blot analysis,
suggesting that Ugt2b deficiency in GK rats may account
for serum benzoate accumulation in F2 animals carrying
the GK mQTL homozygous genotype.
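The LOD threshold used above can be made concrete with a short Python sketch. It assumes the conventional definition of the LOD score as the base-10 logarithm of the likelihood ratio for linkage versus no linkage. The function names are hypothetical and are not taken from [97].

```python
import math


def lod_score(likelihood_linked, likelihood_unlinked):
    """Base-10 log of the likelihood ratio for linkage versus no linkage."""
    if likelihood_linked <= 0 or likelihood_unlinked <= 0:
        raise ValueError("likelihoods must be positive")
    return math.log10(likelihood_linked / likelihood_unlinked)


def odds_from_lod(lod):
    """Odds in favor of linkage that correspond to a given LOD score."""
    return 10.0 ** lod


def passes_threshold(lod, threshold=3.0):
    """True when the LOD score exceeds the chosen threshold."""
    return lod > threshold


# A LOD score of 3 corresponds to odds of 1000 to 1 in favor of linkage.
print(odds_from_lod(3.0))
print(passes_threshold(3.5), passes_threshold(2.9))
```

A LOD threshold is a pointwise criterion, which is one reason the genome-wide significance of candidate mQTLs is assessed separately, as with the permutation tests and FDR threshold described above.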
The power of integrating mQTL with eQTL analysis to
reveal genetic architecture and molecular pathways
underlying disease susceptibility was successfully
demonstrated by Ferrara and coworkers in a study of
an F2 intercross between diabetes-resistant and
diabetes-susceptible mouse strains [98]. Targeted profiling of
67 liver metabolites was conducted by combining the
application of stable-isotope internal standards and LC-MS
(for amino acids and acylcarnitines) or GC-MS (for organic
acids). Genotypic analysis was integrated with liver mRNA
profiling and metabolite profiling data to construct
causal phenotype networks for the genetic control of
hepatic metabolic processes using a novel method [99].
One of the advantages of targeted metabolite profiling
with absolute quantification by MS analyses is the rapid
resolution of biochemical pathway information from simple
visualization of metabolite-metabolite and
metabolite-transcript correlation patterns. This causal network
model further predicted that the modulation of glutamine
and/or glutamate levels should lead to changes in
the transcript abundance of three genes, encoding
alanine:glyoxylate aminotransferase (Agxt), arginase 1
(Arg1) and phosphoenolpyruvate carboxykinase 1 (Pck1).
The directionality of this prediction was confirmed in vitro
when glutamine (10 mM) was added to cultured primary
hepatocytes; real-time RT-PCR assays indicated that the
mRNA expression of all three genes was upregulated.

Pathway and network analysis in metabolomics
The biochemical interpretation of complex transcriptomic,
proteomic and metabolomic profiles often requires the
use of pathway and network tools that provide integrated
visualizations of differential expression patterns measured
by molecular profiling, and of the interaction networks
of cellular components extracted from knowledge bases
of biomedical literature. In the field of microarray
transcriptomics, statistical algorithms, such as gene set
enrichment analysis (GSEA), gene set analysis (GSA)
and parametric analysis of gene set enrichment (PAGE),
have gained widespread popularity for the interpretation
of genome-wide expression profiles in the context of
biochemical pathways associated with predefined specific
sets of genes [100-102]. Open-source tools such as GenMAPP
(genmapp.org) and VisANT (visant.bu.edu) as well as
commercial tools such as MetaCore (GeneGo Inc) and
Ingenuity Pathway Analysis (Ingenuity Systems Inc) have
enabled the visualization of biochemical interactions between
genes, proteins and metabolites as functional modules or
canonical pathways [103].
The typical number of known metabolites identified
by the current NMR and LC-MS technologies is on the
order of hundreds, while a microarray transcriptomic
experiment can readily profile tens of thousands of genes.
The scale difference between measurable metabolomes
and transcriptomes has made the application of most
enrichment analysis algorithms to metabolomics data very
challenging. Only simple enrichment analysis, such as
hypergeometric ranking without direct use of the metabolite
abundance values, has been applied to the pathway
analysis of metabolic profiles [85]. Additional difficulties
in applying pathway analysis algorithms to metabolomics
data arise from the lack of a controlled vocabulary for the
functional annotation of metabolite identities. Availability
of a widely accepted metabolite ontology (MO), which is
analogous to gene ontology [104,105], will allow more
transcriptomics pathway analysis tools to be adapted
to the field of metabolomics (Figure 3). The Ontology
Working Group (OWG) was established in 2006 as a part
of the Metabolomics Standards Initiative
(msi-ontology.sourceforge.net), but a delivery timetable for a practical
MO has yet to be announced.
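The simple hypergeometric ranking mentioned above can be sketched in a few lines of Python. This is an illustrative reading of that approach, not the procedure used in [85]: it treats the significantly changed metabolites as a draw without replacement from all identified metabolites and uses only set membership, not abundance values. The function names, pathway names and metabolite labels are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_changed, n_overlap):
    """Upper-tail probability P(X >= n_overlap) of the hypergeometric
    distribution: n_changed metabolites drawn without replacement from
    n_background identified metabolites, n_pathway of which belong to
    the pathway of interest."""
    if not 0 <= n_pathway <= n_background:
        raise ValueError("pathway size must lie within the background")
    if not 0 <= n_changed <= n_background:
        raise ValueError("changed set must lie within the background")
    upper = min(n_pathway, n_changed)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_changed - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_background, n_changed)


def rank_pathways(background, changed, pathways):
    """Rank pathways by enrichment p value, smallest first."""
    results = []
    for name, members in pathways.items():
        members = members & background  # ignore unmeasured metabolites
        overlap = len(members & changed)
        p = hypergeom_enrichment_p(
            len(background), len(members), len(changed), overlap
        )
        results.append((name, overlap, p))
    return sorted(results, key=lambda row: row[2])


# Hypothetical example: 8 identified metabolites, 3 of them changed.
background = set("abcdefgh")
changed = {"a", "b", "c"}
pathways = {"P1": {"a", "b", "c"}, "P2": {"a", "e", "f", "g"}}
ranked = rank_pathways(background, changed, pathways)
for name, overlap, p in ranked:
    print(name, overlap, round(p, 4))
```

With only hundreds of identified metabolites, pathway sets are small and such p values are coarse, which is consistent with the scale limitation discussed above.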
In the absence of powerful pathway analysis algorithms
for metabolomics data, one effective approach to obtain
insights into the perturbation of metabolic pathways is to
overlay plots of metabolite abundance across treatment
groups onto the canonical pathway maps. To discover
novel metabolic biomarkers for drug-induced oxidative
stress, Soga et al analyzed liver metabolite profiles from
vehicle- and acetaminophen-treated C57BL6 mice using
capillary electrophoresis with electrospray ionization
time-of-flight mass spectrometry (CE-TOF-MS) [106]. By
pre-analyzing 569 pure metabolite standards selected from
the KEGG LIGAND database, these researchers identified
132 standard metabolites among 1859 CE-TOF-MS peaks
detected in both groups of mouse liver samples. Using
self-developed software tools for the differential display of
metabolites between controls and acetaminophen-treated
animals, the researchers discovered extensive depletion
of glutathione (GSH) and its oxidized form GSSG. Another
significantly changed cationic metabolite (m/z 290.135),
which was unknown at the time of analysis, was later
identified as ophthalmate using tandem MS. Ophthalmate
differs from GSH by the replacement of the cysteine residue
with 2-aminobutyrate. Given the knowledge that ophthalmate
is an analog of GSH, the overlay of differential metabolite
profiles onto the pathway diagrams of GSH and
ophthalmate biosynthesis immediately suggested that
γ-glutamylcysteine synthetase (GCS) might be activated
during GSH depletion and/or that glutathione synthetase
(GS) might possess an affinity for
γ-glutamyl-2-aminobutyrate (a dipeptide in ophthalmate), although lower
than for γ-glutamylcysteine (a dipeptide in GSH). Thus,
insights from the visualization of differential metabolite
profiles in the context of metabolic pathways reaffirmed
ophthalmate to be a potential oxidative stress biomarker that
responds to GSH depletion.

Figure 3. Metabolite ontology integrates metabolite databases to enable knowledge-based pathway enrichment analyses.
[Schematic, panel labels only: HMDB, BioCyc, KEGG, PubChem, METLIN and ChEBI (flat files parsing, cross referencing by name and ID mapping, database integration); PubMed articles and expert knowledge (curation, indexing); pathway ontology; document-metabolite matrix; browsing, searching, pathway enrichment analysis.]
Metabolite ontology can serve as a controlled vocabulary to integrate the cross references of endogenous metabolites in public databases
by synonym and identity (ID) mapping. Expert knowledge can be curated from published literature to construct a pathway ontology for
metabolomics. A pathway ontology can be used to index PubMed abstracts and full-text articles to build a document-metabolite matrix, on
which significantly changed metabolites in a metabolomics experiment can be mapped to functionally related biochemical pathways. Among all
public metabolite databases, HMDB (Human Metabolome Database) appears to have the most inclusive coverage of metabolite synonyms and
IDs, while BioCyc appears to contain the most detailed pathway ontology for endogenous metabolites.
ChEBI Chemical Entities of Biological Interest, HMDB human metabolome database, KEGG Kyoto Encyclopedia of Genes and Genomes.

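The assignment of the m/z 290.135 cation to ophthalmate can be checked with a short mass calculation, sketched here in Python. The elemental formulas (C11H19N3O6 for ophthalmate, C10H17N3O6S for GSH) and the assumption of a singly protonated ion are standard values supplied for illustration and are not stated in the review. The function names are hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}
PROTON = 1.00727646  # mass of a proton (Da)


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[el] * count for el, count in formula.items())


def mz_protonated(formula):
    """m/z of the singly protonated ion [M+H]+ (an assumption here)."""
    return neutral_mass(formula) + PROTON


def ppm_error(observed, theoretical):
    """Mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# GSH with cysteine replaced by 2-aminobutyrate: one more C, two more H, no S.
GLUTATHIONE = {"C": 10, "H": 17, "N": 3, "O": 6, "S": 1}
OPHTHALMATE = {"C": 11, "H": 19, "N": 3, "O": 6}

theoretical = mz_protonated(OPHTHALMATE)
print(round(theoretical, 4))
print(round(ppm_error(290.135, theoretical), 1))
```

Under these assumptions the calculated value agrees with the reported peak to within about a millidalton, although accurate mass alone does not establish identity, which is why tandem MS was needed.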
The overlay of differential profiles onto canonical metabolic
pathway maps was expanded by Munger et al to include
abundance profiles of both metabolites and enzyme-coding
mRNA transcripts [107]. Targeted LC-MS/MS metabolite
quantification was applied and parallel microarray analysis
was used to investigate the metabolome dynamics of
human fibroblast (HF) cells during the course of human
cytomegalovirus (HCMV) infection. Although targeted
metabolite quantification via selected reaction monitoring
is labor intensive, the use of isotope-labeled internal
standards reduces many inherent experimental artifacts
of LC-MS, such as ion suppression, and can produce
high-quality metabolite datasets. HCMV-induced increases of
cellular metabolites involved in glycolysis, the Krebs cycle,
and pyrimidine nucleotide biosynthesis were found to be
mirrored by the transcriptional upregulation of enzymes in
the same metabolic pathways. HCMV induction of cellular
enzymes was confirmed by quantitative RT-PCR and
enzymatic activity assays. Another control experiment
to compare the metabolite profiles from quiescent
mock-infected G0 cells, HCMV-infected quiescent G0 cells
and actively growing cells led to the conclusion that HCMV
did not modify the cellular metabolome by merely releasing
the cell cycle arrest and that the virus reprogrammed host
cell metabolic status to produce a unique signature of
permissive infection [107]. Additional insights into HCMV
perturbation of host cell metabolism might be revealed if
the authors could apply metabolite-transcript correlation
analysis to their high-quality datasets.
Correlation analysis can be deployed as a powerful tool
for the integration of molecular profiling data to offer
insight into the mechanisms of toxicity when measurement
noise from bioanalytical platforms can be managed to be
consistently less than the animal-to-animal variability of
in vivo drug toxicity studies. Cross-tissue correlation
network analysis was introduced to integrate
transcriptomics, proteomics and metabolomics data for
the selection of accessible biomarkers in drug-induced
hepatotoxicity [108]. Data were combined from LC-MS
analysis of plasma and liver lipids, targeted GC-MS analysis
of several classes of plasma metabolites, LC-MS analysis
of liver proteins, and microarray profiling of liver mRNA
to build correlation networks in a rat hepatotoxicity study
of an undisclosed drug candidate. At the threshold
FDR-adjusted p value of 0.15, 172 significant correlations
between plasma and liver analytes and 17,327 significant
correlations within liver analytes were identified. Nine
disjoint correlation subnetworks emerged from the
172 plasma-to-liver correlation pairs with the most
interesting of these being the one centered on hepatic
UDP-glucuronosyltransferase 1A1 (UGT1A1), the three
tryptic peptides of which were significantly elevated by
drug treatment. Eleven forms of triglycerides were
uniformly lower in the plasma while six of these forms
accumulated in liver tissues of drug-treated animals,
which is consistent with the histopathological observation
of hepatic steatosis. However, only observing correlation
patterns between the hepatic abundance of a drug
metabolism enzyme UGT1A1 and that of plasma or liver lipid
profiles is not sufficient to establish any direct mechanistic
links between UGT1A1 and lipids in the context of
drug-induced hepatotoxicity. The integration of correlation
analysis with literature knowledge was important for the
researchers to generate a testable hypothesis that the
drug or its metabolites might impair hepatic triglyceride
export by altering the biosynthesis of phosphatidylcholine.
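The way disjoint subnetworks emerge from a list of significant correlation pairs can be sketched in Python. The example keeps pairs whose FDR-adjusted p value is at or below a threshold (0.15, as in the study above) and groups the analytes into connected components with a union-find structure. This is an illustrative technique rather than the method of [108]; the analyte names and p values are hypothetical.

```python
def significant_edges(pairs, threshold=0.15):
    """Keep (a, b) pairs whose FDR-adjusted p value is <= threshold."""
    return [(a, b) for a, b, adjusted_p in pairs if adjusted_p <= threshold]


def connected_components(edges):
    """Group nodes into disjoint subnetworks using union-find."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path compression
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    # Largest subnetwork first; ties broken by sorted member names.
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


# Hypothetical plasma-to-liver pairs: (plasma analyte, liver analyte, adj. p)
pairs = [
    ("plasma_TG_1", "liver_UGT1A1", 0.01),
    ("plasma_TG_2", "liver_UGT1A1", 0.10),
    ("plasma_X", "liver_Y", 0.05),
    ("plasma_Z", "liver_W", 0.40),  # above threshold, dropped
]
subnetworks = connected_components(significant_edges(pairs))
for members in subnetworks:
    print(sorted(members))
```

Each component is only a set of mutually connected correlations; as noted above, deciding which component is mechanistically meaningful still requires literature knowledge.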
Significant correlations can often be identified between
metabolites and transcripts involved in seemingly unrelated
metabolic pathways, making the biochemical interpretation
of those correlation patterns very difficult. To circumvent
this inherent difficulty in system-wide correlation network
analysis, Xu et al used global pathway enrichment
analysis and relevant literature knowledge to focus
their metabolite-transcript correlation analysis in the
context of renal physiological processes perturbed by
model nephrotoxicants [85]. The examination of the
detailed topology of enriched canonical pathways in the
MetaCore knowledgebase led to their discovery of negative
correlations between urinary glucose abundance and kidney
mRNA levels of sodium-dependent glucose transporters
SLC5A1/2 and those between monocarboxylates and
SLC16A7. Furthermore, knowledge from published
literature that had not yet been captured in MetaCore led
to the discovery of the negative correlations between
neutral amino acids and the orphan renal transporter
SLC6A18 as well as the positive correlations between
transporter genes and their putative upstream transcription
factors. This type of 'targeted' correlation analysis
provides rational links between two disparate datasets
and reveals new biological insights into the underlying
biochemical mechanisms in the form of concrete and testable
hypotheses. This data analysis method might have wider
applicability to the integrated pathway analysis of diverse
'omics datasets in computational systems biology.

Conclusions

Quantitative metabolomics is expected to be increasingly
integrated into the process of drug discovery and
development. Identification and quantification of
endogenous metabolites in biofluids and tissues allow
scientists to survey global cellular responses to perturbations
and to observe mechanistic biochemical interactions in
preclinical disease models and toxicity studies. The addition
of metabolomics to the systems biology toolbox holds
the promise of delivering safer and more efficacious drug
candidates to pharmaceutical development pipelines.
Quantitative metabolomics is poised to play increasingly
important roles in identifying novel efficacy and toxicity
biomarkers for the pharmaceutical industry despite being
database dependent and more tedious to conduct than
the currently dominant chemometric metabolomics. This
review has focused on the recent progress in developing
innovative strategies and database tools to streamline the
workflow of quantitative metabolomics. It remains a major
challenge for quantitative metabolomics to determine more
quantifiable metabolites in body fluids and tissue biopsies
to enable better coverage of the cellular metabolome.
On the other hand, the precise metabolite identities and
abundance profiles from quantitative metabolomics have
already enabled the deep integration of metabolomics
with other 'omics technologies to complete the circle of
systems biology. This tight integration has also significantly
facilitated the extraction of pathway and network
information from metabolomics data for the generation
of testable hypotheses on the underlying mechanisms of
complex diseases and drug-induced toxicities.

Acknowledgments

The authors would like to thank Jill Williams for her
meticulous preparation of Figures 1 and 3, and Xiaodan
Zhang for his contributions to the generation of Figure 2
and Figure 3 prototypes. We are grateful to Dr Frank
Sistare for his support of our metabolomics projects and
critical reading of the manuscript. We also thank Drs Steve
Pitzenberger, Michael Klimas and Michael Lassman for their
valuable comments on the manuscript.

References


of outstanding interest
of special interest

1. Nicholson JK, Lindon JC, Holmes E: 'Metabonomics':
Understanding the metabolic responses of living systems to
pathophysiological stimuli via multivariate statistical analysis
of biological NMR spectroscopic data. Xenobiotica (1999)
29(11):1181-1189.

2. Horning EC, Horning MG: Metabolic profiles: Gas-phase
methods for analysis of metabolites. Clin Chem (1971)
17(8):802-809.

3. Pauling L, Robinson AB, Teranishi R, Cary P: Quantitative
analysis of urine vapor and breath by gas-liquid
partition chromatography. Proc Natl Acad Sci USA (1971)
68(10):2374-2376.

21. van Doorn M, Vogels J, Tas A, van Hoogdalem EJ, Burggraaf J,
Cohen A, van der Greef J: Evaluation of metabolite profiles
as biomarkers for the pharmacological effects of
thiazolidinediones in type 2 diabetes mellitus patients and
healthy volunteers. Br J Clin Pharmacol (2007) 63(5):562-574.
22. Fiehn O: Metabolomics - the link between genotypes and
phenotypes. Plant Mol Biol (2002) 48(1-2):155-171.

4. Nicholson JK, Connelly J, Lindon JC, Holmes E: Metabonomics:
A platform for studying drug toxicity and gene function.
Nat Rev Drug Discov (2002) 1(2):153-161.

23. Sanchez DH, Siahpoosh MR, Roessner U, Udvardi M, Kopka J: Plant
metabolomics reveals conserved and divergent metabolic
responses to salinity. Physiol Plant (2008) 132(2):209-219.

5. Robertson DG: Metabonomics in toxicology: A review. Toxicol Sci
(2005) 85(2):809-822.

6. Clarke CJ, Haselden JN: Metabolic profiling as a tool for
understanding mechanisms of toxicity. Toxicol Pathol (2008)
36(1):140-147.

24. Dixon RA, Gang DR, Charlton AJ, Fiehn O, Kuiper HA, Reynolds
TL, Tjeerdema RS, Jeffery EH, German JB, Ridley WP, Seiber JN:
Applications of metabolomics in agriculture. J Agric Food Chem
(2006) 54(24):8984-8994.

7. Robertson D, Reily MD, Cantor GH: Metabonomics in preclinical
pharmaceutical discovery and development. In: The Handbook
of Metabonomics and Metabolomics. Holmes E (Ed), Elsevier,
Amsterdam, The Netherlands (2007):241-277.

8. Nicholas PC, Kim D, Crews FT, Macdonald JM: 1H NMR-based
metabolomic analysis of liver, serum, and brain following
ethanol administration in rats. Chem Res Toxicol (2008)
21(2):408-420.

9. Dieterle F, Schlotterbeck G, Ross A, Niederhauser U, Senn H:
Application of metabonomics in a compound ranking study
in early drug development revealing drug-induced excretion
of choline into urine. Chem Res Toxicol (2006) 19(9):1175-1181.

10. Viant MR: Environmental metabolomics using 1H-NMR
spectroscopy. In: Methods in Molecular Biology. Environmental
Genomics. Martin CC (Ed), Humana Press, Totowa, NJ, USA (2007)
410:137-150.

11. Miller MG: Environmental metabolomics: A SWOT analysis
(strengths, weaknesses, opportunities, and threats).
J Proteome Res (2007) 6(2):540-545.
12. Griffin JL, Shore RF: Applications of metabonomics within
environmental toxicology. In: The Handbook of Metabonomics
and Metabolomics. Holmes E (Ed), Elsevier, Amsterdam, The
Netherlands (2007):517-532.
13. Holmes E, Loo RL, Stamler J, Bictash M, Yap IK, Chan Q, Ebbels T,
De Iorio M, Brown IJ, Veselkov KA, Daviglus ML et al: Human
metabolic phenotype diversity and its association with diet
and blood pressure. Nature (2008) 453(7193):396-400.
14. Griffin JL, Vidal-Puig A: Current challenges in metabolomics
for diabetes research: A vital functional genomic tool or just a
ploy for gaining funding? Physiol Genomics (2008) 34(1):1-5.
15. Griffith HR, den Hollander JA, Okonkwo OC, O'Brien T, Watts RL,
Marson DC: Brain N-acetylaspartate is reduced in Parkinson
disease with dementia. Alzheimer Dis Assoc Disord (2008)
22(1):54-60.
16. Mäkinen VP, Soininen P, Forsblom C, Parkkonen M, Ingman P,
Kaski K, Groop PH, Ala-Korpela M: 1H NMR metabonomics
approach to the disease continuum of diabetic complications
and premature death. Mol Syst Biol (2008) 4:167.
17. Assfalg M, Bertini I, Colangiuli D, Luchinat C, Schafer H, Schutz
B, Spraul M: Evidence of different metabolic phenotypes in
humans. Proc Natl Acad Sci USA (2008) 105(5):1420-1424.
18. Raftery D, Gowda GA: An approaching new wave of
multicomponent biomarker diagnostics? J Urol (2008)
179(6):2089-2090.

19. Clayton TA, Lindon JC, Cloarec O, Antti H, Charuel C, Hanton
G, Provost JP, Le Net JL, Baker D, Walley RJ, Everett JR et al:
Pharmaco-metabonomic phenotyping and personalized drug
treatment. Nature (2006) 440(7087):1073-1077.
20. Kaddurah-Daouk R, McEvoy J, Baillie RA, Lee D, Yao JK,
Doraiswamy PM, Krishnan KR: Metabolomic mapping of atypical
antipsychotic effects in schizophrenia. Mol Psychiatry (2007)
12(10):934-945.

25. Rabinowitz JD: Cellular metabolomics of Escherichia coli.
Expert Rev Proteomics (2007) 4(2):187-198.

26. Ni Q, Reid KR, Burant CF, Kennedy RT: Capillary LC-MS for
high sensitivity metabolomic analysis of single islets of
Langerhans. Anal Chem (2008) 80(10):3539-3546.
27. Winder CL, Dunn WB, Schuler S, Broadhurst D, Jarvis R,
Stephens GM, Goodacre R: Global metabolic profiling of
Escherichia coli cultures: An evaluation of methods for
quenching and extraction of intracellular metabolites.
Anal Chem (2008) 80(8):2939-2948.
An informative paper comparing various quenching and extraction methods
for intracellular metabolite profiling.
28. German JB, Roberts MA, Watkins SM: Personal metabolomics
as a next generation nutritional assessment. J Nutr (2003)
133(12):4260-4266.
29. Rezzi S, Ramadan Z, Fay LB, Kochhar S: Nutritional metabonomics:
Applications and perspectives. J Proteome Res (2007)
6(2):513-525.
30. Rezzi S, Ramadan Z, Martin FP, Fay LB, van Bladeren P, Lindon JC,
Nicholson JK, Kochhar S: Human metabolic phenotypes link
directly to specific dietary preferences in healthy individuals.
J Proteome Res (2007) 6(11):4469-4477.
31. Walsh MC, Brennan L, Malthouse JP, Roche HM, Gibney MJ:
Effect of acute dietary standardization on the urinary,
plasma, and salivary metabolomic profiles of healthy humans.
Am J Clin Nutr (2006) 84(3):531-539.
32. Gibney MJ, Walsh M, Brennan L, Roche HM, German B, van Ommen
B: Metabolomics in human nutrition: Opportunities and
challenges. Am J Clin Nutr (2005) 82(3):497-503.
33. Lu W, Bennett BD, Rabinowitz JD: Analytical strategies for
LC-MS-based targeted metabolomics. J Chromatogr B Analyt
Technol Biomed Life Sci (2008) 871(2):236-242.
Summary of the requirements for quantitative metabolomics using
LC-MS. Suggestions for sample preparation, settings of chromatography
and ionization, and selection of quantitative analysis using multiple
reaction monitoring and high-resolution MS measurements.
34. Wilson ID, Plumb R, Granger J, Major H, Williams R, Lenz EM:
HPLC-MS-based methods for the study of metabonomics. J
Chromatogr B Analyt Technol Biomed Life Sci (2005) 817(1):67-76.
35. Want EJ, Nordstrom A, Morita H, Siuzdak G: From exogenous to
endogenous: The inevitable imprint of mass spectrometry in
metabolomics. J Proteome Res (2007) 6(2):459-468.
36. Bennett BD, Yuan J, Kimball EH, Rabinowitz JD: Absolute
quantitation of intracellular metabolite concentrations by
an isotope ratio-based approach. Nat Protoc (2008)
3(8):1299-1311.
37. Dumas ME, Maibaum EC, Teague C, Ueshima H, Zhou B, Lindon JC,
Nicholson JK, Stamler J, Elliott P, Chan Q, Holmes E: Assessment
of analytical reproducibility of 1H NMR spectroscopy based
metabonomics for large-scale epidemiological research:
The INTERMAP Study. Anal Chem (2006) 78(7):2199-2208.
38. Nicholson JK, Holmes E, Wilson ID: Gut microorganisms,
mammalian metabolism and personalized health care.
Nat Rev Microbiol (2005) 3(5):431-438.

39. Bertram HC, Hoppe C, Petersen BO, Duus J, Mølgaard C,
Michaelsen KF: An NMR-based metabonomic investigation on
effects of milk and meat protein diets given to 8-year-old boys.
Br J Nutr (2007) 97(4):758-763.

57. Weljie AM, Newton J, Mercier P, Carlson E, Slupsky CM: Targeted
profiling: Quantitative analysis of 1H NMR metabolomics data.
Anal Chem (2006) 78(13):4430-4442.

58. Xu Q, Sachs JR, Wang TC, Schaefer WH: Quantification and
identification of components in solution mixtures from 1D
proton NMR spectra using singular value decomposition.
Anal Chem (2006) 78(20):7175-7185.
Provides a simple but robust mathematical solution to solving linearly
additive NMR spectra by using reference component spectra encoded with
quantitative properties.

42. Rücker G, Bohn G, Fell AF: [Identification and quantitative
determination of ureides, methaqualone and barbiturates in
autopsy material by NMR-spectroscopy]. Arch Toxikol (1971)
27(2):168-172.

59. Lewis IA, Schommer SC, Hodis B, Robb KA, Tonelli M, Westler
WM, Sussman MR, Markley JL: Method for determining molar
concentrations of metabolites in complex solutions from
two-dimensional 1H-13C NMR spectra. Anal Chem (2007)
79(24):9385-9390.

43. Shoolery JN, Smithson LH: The use of a high resolution NMR
spectrometer controlled by a dedicated computer for
quantitative analytical chemistry. J Am Oil Chem Soc (1970)
47(5):153-157.

60. Robosky LC, Wade K, Woolson D, Baker JD, Manning ML, Gage DA,
Reily MD: Quantitative evaluation of sebum lipid components
with nuclear magnetic resonance. J Lipid Res (2008)
49(3):686-692.

44. Xu Q, Abeygunawardana C, Ng AS, Sturgess AW, Harmon BJ,
Hennessey JP Jr: Characterization and quantification of
C-polysaccharide in Streptococcus pneumoniae capsular
polysaccharide preparations. Anal Biochem (2005)
336(2):262-272.

61. Provencher SW: Estimation of metabolite concentrations from
localized in vivo proton NMR spectra. Magn Reson Med (1993)
30(6):672-679.

40. Stella C, Beckwith-Hall B, Cloarec O, Holmes E, Lindon JC, Powell J,
van der Ouderaa F, Bingham S, Cross AJ, Nicholson JK: Susceptibility
of human metabolic phenotypes to dietary modulation.
J Proteome Res (2006) 5(10):2780-2788.
41. Gidley MJ: Naturally functional foods - challenges and
opportunities. Asia Pac J Clin Nutr (2004) 13 (Suppl):S31.

45. Malz F, Jancke H: Validation of quantitative NMR. J Pharm Biomed
Anal (2005) 38(5):813-823.
46. Nicholson JK, Higham DP, Timbrell JA, Sadler PJ: Quantitative high
resolution 1H NMR urinalysis studies on the biochemical effects
of cadmium in the rat. Mol Pharmacol (1989) 36(3):398-404.
47. Coen M, Hong YS, Clayton TA, Rohde CM, Pearce JT, Reily MD,
Robertson DG, Holmes E, Lindon JC, Nicholson JK: The mechanism
of galactosamine toxicity revisited; a metabonomic study.
J Proteome Res (2007) 6(7):2711-2719.
48. Lewis GD, Asnani A, Gerszten RE: Application of metabolomics
to cardiovascular biomarker and pathway discovery. J Am Coll
Cardiol (2008) 52(2):117-123.
An insightful review discussing the limitations of chemometrics
approaches to biological understanding of metabolomics data.
49. Cloarec O, Dumas ME, Craig A, Barton RH, Trygg J, Hudson J,
Blancher C, Gauguier D, Lindon JC, Holmes E, Nicholson J: Statistical
total correlation spectroscopy: An exploratory approach for
latent biomarker identification from metabolic 1H NMR data
sets. Anal Chem (2005) 77(5):1282-1289.
50. Crockford DJ, Holmes E, Lindon JC, Plumb RS, Zirah S,
Bruce SJ, Rainville P, Stumpf CL, Nicholson JK: Statistical
heterospectroscopy, an approach to the integrated analysis
of NMR and UPLC-MS data sets: Application in metabonomic
toxicology studies. Anal Chem (2006) 78(2):363-371.
51. Dieterle F, Ross A, Schlotterbeck G, Senn H: Metabolite projection
analysis for fast identification of metabolites in metabonomics.
Application in an amiodarone study. Anal Chem (2006)
78(11):3551-3561.

62. Provencher SW: Automatic quantitation of localized in vivo 1H
spectra with LCModel. NMR Biomed (2001) 14(4):260-264.
63. Gipson GT, Tatsuoka KS, Sweatman BC, Connor SC: Weighted
least-squares deconvolution method for discovery of group
differences between complex biofluid 1H NMR spectra.
J Magn Resonance (2006) 183(2):269-277.
64. Csenki L, Alm E, Torgrip RJ, Aberg KM, Nord LI, Schuppe-Koistinen
I, Lindberg J: Proof of principle of a generalized fuzzy Hough
transform approach to peak alignment of one-dimensional 1H
NMR data. Anal Bioanal Chem (2007) 389(3):875-885.
65. Nielsen N-PV, Carstensen JM, Smedsgaard J: Aligning of single
and multiple wavelength chromatographic profiles for
chemometric data analysis using correlation optimized
warping. J Chromatogr A (1998) 805(1-2):17-35.
66. Pravdova V, Walczak B, Massart DL: A comparison of two
algorithms for warping of analytical signals. Anal Chim Acta
(2002) 456(1):77-92.
67. van Nederkassel AM, Daszykowski M, Eilers PH, Heyden YV:
A comparison of three algorithms for chromatograms
alignment. J Chromatogr A (2006) 1118(2):199-210.
68. Eilers PH: Parametric time warping. Anal Chem (2004)
76(2):404-411.

69. Wu W, Daszykowski M, Walczak B, Sweatman BC, Connor SC,
Haselden JN, Crowther DJ, Gill RW, Lutz MW: Peak alignment of
urine NMR spectra using fuzzy warping. J Chem Inf Model (2006)
46(2):863-875.
70. Otvos JD, Jeyarajah EJ, Bennett DW: Quantification of plasma
lipoproteins by proton nuclear magnetic resonance
spectroscopy. Clin Chem (1991) 37(3):377-386.

52. Wishart DS: Quantitative metabolomics using NMR. Trends Anal
Chem (2008) 27(3):228-237.
An informative overview on the history and practical settings of quantitative
NMR metabolomics.

71. Hiltunen Y, Ala-Korpela M, Jokisaari J, Eskelinen S, Kiviniitty K,
Savolainen M, Kesäniemi YA: A lineshape fitting model for 1H
NMR spectra of human blood plasma. Magn Reson Med (1991)
21(2):222-232.

53. Serkova NJ, Van Rheen Z, Tobias M, Pitzer JE, Wilkinson JE,
Stringer KA: Utility of magnetic resonance imaging and
nuclear magnetic resonance-based metabolomics for
quantification of inflammatory lung injury. Am J Physiol Lung Cell
Mol Physiol (2008) 295(1):L152-L161.

72. Ala-Korpela M: Potential role of body fluid 1H NMR
metabonomics as a prognostic and diagnostic tool. Expert Rev
Mol Diagnostics (2007) 7(6):761-773.

54. Leibfritz D, Dreher W, Willker W: In vivo NMR applications
of metabonomics. In: The Handbook of Metabonomics and
Metabolomics. Holmes E (Ed), Elsevier, Amsterdam, The Netherlands
(2007):489-516.
55. Shulman RG, Rothman DL (Ed): Metabolomics by In Vivo NMR.
John Wiley & Sons Ltd, Chichester, UK (2005).
56. Ogg RJ, Kingsley PB, Taylor JS: WET, a T1- and B1-insensitive
water-suppression method for in vivo localized 1H NMR spectroscopy.
J Magn Resonance B (1994) 104(1):1-10.

73. Gomase VS, Changbhale SS, Patil SA, Kale KV: Metabolomics.
Curr Drug Metab (2008) 9(1):89-98.
74. Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, Cheng D,
Jewell K, Arndt D, Sawhney S, Fung C et al: HMDB: The Human
Metabolome Database. Nucleic Acids Res (2007) 35:D521-D526.
HMDB is a high-quality metabolite database to facilitate both NMR- and
MS-based quantitative metabolomics.
75. Cui Q, Lewis IA, Hegeman AD, Anderson ME, Li J, Schulte CF,
Westler WM, Eghbalnia HR, Sussman MR, Markley JL: Metabolite
identification via the Madison Metabolomics Consortium
Database. Nat Biotechnol (2008) 26(2):162-164.


76. Smith CA, O'Maille G, Want EJ, Qin C, Trauger SA, Brandon TR,
Custodio DE, Abagyan R, Siuzdak G: METLIN: A metabolite mass
spectral database. Ther Drug Monit (2005) 27(6):747-751.
77. Benton HP, Wong DM, Trauger SA, Siuzdak G: XCMS2: Processing
tandem mass spectrometry data for metabolite identification
and structural characterization. Anal Chem (2008)
80(16):6382-6389.
78. Wishart DS, Lewis MJ, Morrissey JA, Flegel MD, Jeroncic K, Xiong Y,
Cheng D, Eisner R, Gautam B, Tzur D, Sawhney S et al: The human
cerebrospinal fluid metabolome. J Chromatogr B Analyt Technol
Biomed Life Sci (2008) 871(2):164-173.
Illustrates cerebrospinal fluid metabolite profiling as an excellent
example of integrated metabolite-based quantitative metabolomics
through literature text mining, identification and quantification by NMR,
GC-MS, and LC-FTMS (Fourier transform MS).
79. Reily MD, Robosky LC, Manning ML, Butler A, Baker JD, Winters RT:
DFTMP, an NMR reagent for assessing the near-neutral pH of
biological samples. J Am Chem Soc (2006) 128(38):12360-12361.
80. Waters MD, Fostel JM: Toxicogenomics and systems toxicology:
Aims and prospects. Nat Rev Genet (2004) 5(12):936-948.
81. Lindon JC, Holmes E, Nicholson JK: Global systems biology
through integration of "omics" results. In: The Handbook of
Metabonomics and Metabolomics. Holmes E (Ed), Elsevier, Amsterdam,
The Netherlands (2007):533-555.
82. Stoughton RB, Friend SH: How molecular profiling could
revolutionize drug discovery. Nat Rev Drug Discov (2005)
4(4):345-350.
83. Chen C, Shah YM, Morimura K, Krausz KW, Miyazaki M,
Richardson TA, Morgan ET, Ntambi JM, Idle JR, Gonzalez FJ:
Metabolomics reveals that hepatic stearoyl-CoA desaturase 1
downregulation exacerbates inflammation and acute colitis.
Cell Metab (2008) 7(2):135-147.
An elegant series of experiments and analyses demonstrating the
generation and verification of hypotheses on unobserved protein and
mRNA changes from observed metabolite changes.
84. Vangala S, Tonelli A: Biomarkers, metabonomics, and drug
development: Can inborn errors of metabolism help in
understanding drug toxicity? AAPS J (2007) 9(3):E284-E297.
85. Xu EY, Perlina A, Vu H, Troth SP, Brennan RJ, Aslamkhan AG, Xu Q:
Integrated pathway analysis of rat urine metabolic profiles
and kidney transcriptomic profiles to elucidate the systems
toxicology of model nephrotoxicants. Chem Res Toxicol (2008)
21(8):1548-1561.
One of the first reports to attempt the integration of quantitative
metabolomics with transcriptomics at the level of data co-analysis.
Knowledge-based pathway enrichment analysis and metabolite-transcript
correlation analysis were combined to build detailed biological models.
86. Friedman N, Linial M, Nachman I, Pe'er D: Using Bayesian
networks to analyze expression data. J Comput Biol (2000)
7(3-4):601-620.
87. Rockman MV, Kruglyak L: Genetics of global gene expression.
Nat Rev Genet (2006) 7(11):862-872.
An excellent review on the application of eQTL to elucidate the genetic
architecture of quantitative traits. Readers unfamiliar with quantitative
genetics are referred to this article for concise definitions of relevant
terminology.
88. Griffiths AJF, Wessler SR, Lewontin RC, Carroll SB: Quantitative
Genetics. In: Introduction to Genetic Analysis, 9th Edition. WH
Freeman & Co Ltd, New York, NY, USA (2008):639-678.
89. Brem RB, Yvert G, Clinton R, Kruglyak L: Genetic dissection of
transcriptional regulation in budding yeast. Science (2002)
296(5568):752-755.
90. Hubner N, Wallace CA, Zimdahl H, Petretto E, Schulz H,
Maciver F, Mueller M, Hummel O, Monti J, Zidek V, Musilova A et al:
Integrated transcriptional profiling and linkage analysis for
identification of genes underlying disease. Nat Genet (2005)
37(3):243-253.
91. Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V,
Ruff TG, Milligan SB, Lamb JR, Cavet G, Linsley PS et al: Genetics
of gene expression surveyed in maize, mouse and man.
Nature (2003) 422(6929):297-302.

92. Giovannoni JJ: Breeding new life into plant metabolism.
Nat Biotechnol (2006) 24(4):418-419.

93. Sonoda J, Pei L, Evans RM: Nuclear receptors: Decoding
metabolic disease. FEBS Lett (2008) 582(1):2-9.
94. Schauer N, Semel Y, Roessner U, Gur A, Balbo I, Carrari F,
Pleban T, Perez-Melis A, Bruedigam C, Kopka J, Willmitzer L et al:
Comprehensive metabolic profiling and phenotyping of
interspecific introgression lines for tomato improvement.
Nat Biotechnol (2006) 24(4):447-454.
95. Keurentjes JJ, Fu J, de Vos CH, Lommen A, Hall RD, Bino RJ,
van der Plas LH, Jansen RC, Vreugdenhil D, Koornneef M: The
genetics of plant metabolism. Nat Genet (2006) 38(7):842-849.
96. Wentzell AM, Rowe HC, Hansen BG, Ticconi C, Halkier BA,
Kliebenstein DJ: Linking metabolic QTLs with network and
cis-eQTLs controlling biosynthetic pathways. PLoS Genet (2007)
3(9):1687-1701.
97. Dumas ME, Wilder SP, Bihoreau MT, Barton RH, Fearnside JF,
Argoud K, D'Amato L, Wallis RH, Blancher C, Keun HC, Baunsgaard
D et al: Direct quantitative trait locus mapping of mammalian
metabolic phenotypes in diabetic and normoglycemic rat
models. Nat Genet (2007) 39(5):666-672.
98. Ferrara CT, Wang P, Neto EC, Stevens RD, Bain JR, Wenner BR,
Ilkayeva OR, Keller MP, Blasiole DA, Kendziorski C, Yandell BS et al:
Genetic networks of liver metabolism revealed by integration
of metabolic and transcriptional profiling. PLoS Genet (2008)
4(3):e1000034.
Describes a case study integrating mQTL and eQTL to build experimentally
testable causal models of the regulation of mouse liver metabolism.
99. Chaibub Neto E, Ferrara CT, Attie AD, Yandell BS: Inferring causal
phenotype networks from segregating populations. Genetics
(2008) 179(2):1089-1100.
100. Efron B, Tibshirani R: On testing the significance of sets of genes.
Ann Appl Stat (2007) 1(1):107-129.
The GSA package represents the state of the art in gene-based pathway
enrichment analysis algorithms.
101. Kim SY, Volsky DJ: PAGE: Parametric analysis of gene set
enrichment. BMC Bioinformatics (2005) 6:144.
102. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL,
Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES,
Mesirov JP: Gene set enrichment analysis: A knowledge-based
approach for interpreting genome-wide expression profiles.
Proc Natl Acad Sci USA (2005) 102(43):15545-15550.
103. Ganter B, Zidek N, Hewitt PR, Muller D, Vladimirova A: Pathway
analysis tools and toxicogenomics reference databases for
risk assessment. Pharmacogenomics (2008) 9(1):35-54.
104. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry
JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA et al:
Gene ontology: Tool for the unification of biology. The Gene
Ontology Consortium. Nat Genet (2000) 25(1):25-29.
105. Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R,
Eilbeck K, Lewis S, Marshall B, Mungall C, Richter J et al: The
Gene Ontology (GO) database and informatics resource.
Nucleic Acids Res (2004) 32:D258-D261.
106. Soga T, Baran R, Suematsu M, Ueno Y, Ikeda S, Sakurakawa T,
Kakazu Y, Ishikawa T, Robert M, Nishioka T, Tomita M: Differential
metabolomics reveals ophthalmic acid as an oxidative stress
biomarker indicating hepatic glutathione consumption.
J Biol Chem (2006) 281(24):16768-16776.
An interesting application of visualization software tools to overlay
differential metabolite profiles onto metabolic pathway maps for the
discovery of a potential oxidative stress biomarker.
107. Munger J, Bajad SU, Coller HA, Shenk T, Rabinowitz JD: Dynamics
of the cellular metabolome during human cytomegalovirus
infection. PLoS Pathog (2006) 2(12):e132.
An excellent study integrating targeted LC-MS/MS metabolite
quantification with transcriptomic profiling and enzyme activity assays.
Solid conclusions were drawn on how viral infection reprograms host cell
metabolism.
108. Adourian A, Jennings E, Balasubramanian R, Hines WM, Damian
D, Plasterer TN, Clish CB, Stroobant P, McBurney R, Verheij ER,
Bobeldijk I et al: Correlation network analysis for data
integration and biomarker selection. Mol Biosystems (2008)
4(3):249-259.
