Toxicogenomic Analysis Methods For Predictive Toxicology 2006
www.elsevier.com/locate/jpharmtox
Appraisal of state-of-the-art
Abstract
Toxicogenomics, the application of genomic data to elucidate or predict an organism’s response to a toxicant, can inform the drug
development process in important ways. It is apparent that standardized approaches to many types of toxicogenomic questions are still being
formulated. Specifically, a significant body of proof-of-principle studies has emerged that demonstrates a range of statistical methodologies
applied to predictive toxicology. These studies rely on class prediction methods – mathematical models generated using the gene expression
profiles of known toxins from representative toxicological classes – to predict the toxicological effect of a compound based on the
similarities between its gene expression profile and the profiles of a given toxicological class. Class prediction methods hold promise for
increasing the rate at which compounds can be evaluated for toxicity early in the drug discovery process, while at the same time reducing the
length of toxicological studies and their associated costs. Class prediction methods are informed by class comparison and class discovery
steps, which inform, respectively, the selection of genes whose response can be used to distinguish among the toxicological classes and the
number of classes distinguishable using the response of these genes. Together these steps use a variety of complementary statistical
techniques to achieve a successful class prediction model. This report attempts to review some of the themes that appear to be emerging in the
application of these techniques to predictive toxicology methods over toxicogenomics’ short history.
© 2005 Elsevier Inc. All rights reserved.
J. Maggioli et al. / Journal of Pharmacological and Toxicological Methods 53 (2006) 31–37

response to phenotypes (Pognan, 2004). Class prediction methods only indicate possible relationships between gene
response and phenotypes. Further study is necessary to distinguish causative from reactive genes.

In addition, the information-rich data set and the dynamic nature of gene expression present computational challenges
for the routine use of class prediction methods in drug development. Genomic data sets are complex matrices, often
containing thousands of individual data points. Genes useful for class prediction are selected for inclusion in the
discriminatory gene set, whose expression values (the gene signature) can be used to distinguish among the
toxicological classes studied. This discriminatory gene set must be discovered against a complex background of other
gene expression changes, some resulting from factors unrelated to the treatment (e.g., sampling time) and others a
result of the treatment, but not useful for class prediction, as they are perturbed in a similar way by diverse toxins
(e.g., genes involved in metabolic pathways) (Hamadeh, Bushel, Jayadev, Martin et al., 2002).

Gene expression experiments also present a challenge to traditional statistical significance testing because
significant change must be calculated for datasets with many variables (potentially tens of thousands) but few
available experimental replicates. Finally, toxins may affect gene expression in complex ways, requiring statistical
methods that can consider the interaction of expression changes: genes excluded by statistical significance tests like
ANOVA (because they do not change significantly across groups) may still have predictive value when coupled to the
response of genes that do change.

The most successful computational methods for class prediction are the supervised learning methods (or classifiers).
These methods rely on a training set consisting of gene expression profiles from representatives of the different
toxicological classes to be modeled. The gene signatures from samples in the training set and the knowledge of their
origins (toxicological class) are used to derive a set of algorithms that can be used to classify unknowns. Class
prediction methods most often follow class comparison and class discovery steps, which, respectively, inform the
selection of the discriminatory gene set and help to define the toxicological classes distinguishable by gene
expression signatures. These steps often make use of complementary statistical techniques. For example, in the class
comparison step, statistical significance tests like ANOVA may be applied to select genes that vary among the
toxicological classes being studied. Clustering techniques, considered unsupervised learning methods in that they only
consider gene expression data and not toxicological class information when representing similarities between
treatments, are usually applied in the class discovery step. Literature examples exist that describe the class
comparison, class discovery, and class prediction steps for the classification of known hepatotoxins as either
peroxisome proliferators or enzyme inducers (Hamadeh, Bushel, Jayadev, DiSorbo et al., 2002), of known toxins into one
of five characterized toxicological classes (Thomas et al., 2001), and of toxic metals into seven or nine distinct
groups (Tsai et al., 2005).

The published accounts of class prediction methods, including the related steps of class comparison and class
discovery, show much diversity in the statistical methods applied, with some general themes repeated in many, but not
all, studies (e.g., the use of unsupervised learning methods for class discovery and supervised learning methods for
class prediction). This review attempts to survey the ways different groups are approaching the computational
challenges posed by the use of gene expression data for predictive toxicology and provide a discussion of possible
future directions for class prediction methods.

2. The process of predictive toxicogenomics

At a high level, the process leading up to a successful class prediction model can be represented as three to five
steps (see Fig. 1).

Fig. 1. An abstract view of the relationship of class prediction to the related steps that inform it.

Data Preparation: Datasets are corrected for sources of variability that result from causes other than the treatments
under study (e.g., hybridization differences in preparation of the microarrays, variable recoveries of mRNA,
fluorescent dye labeling efficiencies).

Class Comparison: The prepared data from a training set are analyzed to define the discriminatory gene set, the set of
genes that allow for differentiation among
the toxicological classes represented in the training set.

Class Discovery: The similarity among treatments present in the training set is visualized using techniques like
clustering, an unsupervised learning method that groups treatments based only on the similarities in their gene
signatures and does not employ knowledge about the samples’ toxicological class.

Class Prediction: Unknown or blinded samples are assigned to toxicological classes. Typically, a classifier (supervised
learning method) is applied to the gene signatures of the training set samples to generate a mathematical model for
predicting the toxicological class of unknowns.

Evaluation: The model generated for class prediction is evaluated. Blinded samples can be used to estimate success
rates in predicting the toxicological class of unknowns, or individual samples from the training set can be used to
evaluate the model, using a “leave one out” validation approach.

Current literature examples include a wide variety of techniques used for class prediction and the related steps shown
in Fig. 1. Though it is convenient to break toxicogenomic studies down into discrete steps, mapping statistical methods
to these steps can be a challenge. Multiple statistical methods are often evaluated at each step in the process of
generating a class prediction model. Furthermore, techniques commonly associated with one step may be applied in
another. For example, supervised learning methods, typically associated with the class prediction step, may be applied
as a class comparison technique, to identify the discriminatory gene set (Hamadeh, Bushel, Jayadev, DiSorbo et al.,
2002). Finally, because supervised learning methods do not lend themselves well to visualization, a number of other
statistical techniques may be used to provide qualitative views of the similarities between treatment groups. The flow
diagram shown in Fig. 2 attempts to capture the diversity of approaches used to generate class prediction methods.

[Fig. 2 diagram. Box labels: Normalized data; Hypothesis Testing; Learning Methods; Data-Driven Discovery; Candidate
Discriminatory Gene Sets; Clinical, Histopathology, Proteomic, Metabolomic; Classification Model; Evaluation or
Validation.]
Fig. 2. An information-centric view of the class prediction process and the steps that inform it. Families of techniques are represented by the blue boxes (e.g.,
Hypothesis Testing includes parametric methods like t-tests, ANOVA, and non-parametric methods like Wilcoxon and SAM). In any one study, multiple
techniques from the same family are often applied for comparison. The evaluation step informs the success of each technique. Selection of statistical methods
and discriminatory gene sets is often refined in an iterative process to generate a final classification model for unknowns or blinded samples.

2.1. Data preparation

All microarray experiments are affected by systematic and random error. Random error can be generated by factors such
as background noise, scanner noise, and hybridization noise. The ideal way to reduce random error is to generate many
replicates and perform data analysis on the combined replicates. When replicates are limiting, as they often are in
microarray studies, an estimation of random errors can be useful. Systematic errors have known causes or well
understood behaviors, and can be corrected. Examples of microarray systematic error include scanner sensitivity or
non-zero background intensities. Preprocessing algorithms such as background subtraction, normalization, and
de-trending can reduce or eliminate systematic error. An in-depth discussion of data preprocessing and normalization
methods is beyond the scope of this paper, but can be found in a number of references (e.g., Baldi & Hatfield, 2002).

2.2. Class comparison

A common approach to class comparison is to search for a discriminatory gene set among expression profiles generated
from studies of toxins representative of known toxicological classes. Statistical significance testing is often used
to select the discriminatory gene sets. For example, to select discriminatory genes from a training set of expression
profiles from rats exposed to nine toxic metals, an ANOVA F-test was used to find those genes that varied significantly
across the nine treatment groups and an OVA (one-versus-all) test identified gene expression that varied significantly
when each group was compared to the average of the other eight (Tsai et al., 2005). The resulting two discriminatory
gene sets (the set defined from the F-test and the union of the nine groups returned from the OVA analysis), as well as
a third set, consisting of those genes appearing in both original sets, were then evaluated for their ability to
classify toxic metals successfully.

Another approach is to reduce the dimensionality of the complex data set using dimension-reducing techniques such as
principal component analysis (PCA), multidimensional scaling (MDS), or wavelet transformation (Yang, Blomme, & Waring,
2004). Rather than requiring the selection of specific genes from a data set, these techniques reduce the high
dimensionality of the original data set, which can include thousands of variables, into a smaller number of weighted
variables. One disadvantage of this approach is that information about which genes are modified most for individual
classes is obscured (Tsai et al., 2005).

A combination approach is sometimes used. To identify a discriminatory set for classifying hepatotoxins, ANOVA
analysis was used to identify the top 200 genes that varied among the groups studied. Wavelet transformation was then
applied to the expression profiles from these 200 genes, reducing their response into seven components (Yang et al.,
2004). These seven components were carried forward to the class prediction stage.

2.3. Class discovery

Clustering methods are often applied to visualize the similarities between individual treatments as well as multiple
treatments from different toxicological classes. Hierarchical clustering represents the distance between samples
visually: more similar gene expression profiles are grouped together. Clustering is a valuable exploratory technique
for helping to characterize how many classes a given set of treatments can be divided into.

Two common types of clustering algorithms are hierarchical and partitioning algorithms. Hierarchical algorithms yield
a hierarchy of clusters for a data set that can be visualized in a dendrogram tree (see Fig. 3). Data sets belonging to
the same branch of a cluster are similar to each other at some level, whereas data sets in separate branches are less
similar.

Partitioning algorithms like K-Means divide the data set into an a priori specified number of clusters that are viewed
in tabular format to make inferences about their similarity within a cluster. Because partitioning algorithms result
in bins, unique inferences about the relationship of each data point in a cluster to every other data point in the
cluster are not apparent. Similarly, further inferences about the relationship of all the data points in one cluster to
all the data points in another cluster cannot be drawn (see Fig. 4).

The results of clustering are influenced by the type of similarity measure used to calculate the distance between
items in the clusters. Distance-based similarity measures, like Euclidean distance, emphasize the magnitude of the fold
changes between data sets. Correlation-based measures, such as correlation with mean subtraction, emphasize the pattern
of the fold changes.

Cluster analysis was used to study how gene response varied over time for a given treatment (Hamadeh, Bushel, Jayadev,
Martin et al., 2002). The results of time course analysis can help further refine the genes included in the
discriminatory gene set, as one of the common goals for a class prediction method is a time-independent model, that
is, a model that excludes genes whose response is highly unstable with time. In the same study, clustering, PCA, and
correlation analysis were all used to demonstrate the similarities between test compounds from the two classes
(peroxisome proliferators and enzyme inducers) and provide preliminary evidence that creation of a model to distinguish
these classes should be attainable.

Hierarchical clustering algorithms using distance-based (Euclidean distance) or correlation-based (one minus
correlation coefficient) similarity measures were compared for their ability to cluster datasets from rats treated
with nine toxic metals (Tsai et al., 2005). Though clustering was an investigative step to examine the number of
classes represented by the nine treatments, the clusters were compared by examining their ability to group replicate
treatments together into eight groups in a class prediction type exercise (groupings were defined by the study
authors). Though clustering methods are useful for investigating similarities between treatments (class discovery),
clustering is not recommended for use in class prediction. Clustering is a subjective technique, whose results are
highly influenced by selection of the clustering algorithm and similarity metric (Simon, Radmacher, Dobbin, & McShane,
2003).
Fig. 3. An example of two types of hierarchical clustering algorithms applied to the data sets derived from rats treated with 15 known hepatotoxins (taken from
Waring et al., 2001). The 2D clusters were generated using the Rosetta Resolver® System. Reproduced with permission.
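
As a rough illustration of how a hierarchical algorithm builds the kind of tree shown in such a dendrogram, the sketch
below implements naive agglomerative clustering with average linkage in plain Python. It is not the method used by any
of the cited studies or by the Rosetta Resolver software; the function name average_linkage, the sample names, and the
expression values are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, distance=euclidean):
    """Naive agglomerative clustering.

    Returns the merge history as (members_a, members_b, height) tuples,
    which is the information a dendrogram draws.
    """
    clusters = [(name,) for name in profiles]  # start with one cluster per sample
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                pair_dists = [distance(profiles[p], profiles[q])
                              for p in clusters[i] for q in clusters[j]]
                d = sum(pair_dists) / len(pair_dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical replicate profiles for two treatments (illustrative values only).
profiles = {
    "toxin_A_rep1": [2.0, 1.9, 0.1, 0.0],
    "toxin_A_rep2": [2.1, 2.0, 0.0, 0.1],
    "toxin_B_rep1": [0.1, 0.0, 1.8, 2.0],
    "toxin_B_rep2": [0.0, 0.2, 2.0, 1.9],
}

merges = average_linkage(profiles)
for left, right, height in merges:
    print(left, "+", right, "at height", round(height, 3))
```

With these assumed profiles the replicates of each treatment merge first and the two treatment groups join last, at a
much greater height. Passing a different distance function (for example, one minus the correlation coefficient) or a
different linkage rule can change the merge order, which is why the text cautions against treating cluster membership
as a prediction.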
A number of other publications have begun to detail the many obstacles that remain before class prediction methods can
begin to fulfill the promises of accelerating the drug development process, or possibly even replacing some traditional
toxicological studies. Among these challenges are the cost-intensive process of building relevant databases of gene
expression profiles of known toxins (Lühe et al., 2005; Van Delft et al., 2005), the difficulties of comparing gene
expression data collected using different technologies (Hayes et al., 2005), and the challenge of making useful
predictions of toxicity across species or from in vitro systems (e.g., cultured primary hepatocytes) to living organs
in human beings (Pognan, 2004).

Though many obstacles remain, work continues to make class prediction methods robust and sufficiently relevant for
routine use in toxicological evaluation of novel compounds. A continual refinement in the application and evaluation of
computational approaches will undoubtedly continue to be an important part of this effort.

References

Afshari, C. A., Nuwaysir, E. F., & Barrett, J. C. (1999). Application of complementary DNA microarray technology to carcinogen identification, toxicology, and drug safety evaluation. Cancer Research, 59, 4759–4760.
Baldi, P., & Hatfield, W. (2002). DNA microarrays and gene expression. Cambridge, UK: Cambridge University Press.
Dudoit, S., Fridlyand, J., & Speed, T. P. (2002). Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association, 97, 77–87.
Hamadeh, H. K., Bushel, P. R., Jayadev, S., DiSorbo, O., Bennett, L., & Li, L., et al. (2002). Prediction of compound signature using high density gene expression profiling. Toxicological Sciences, 67, 232–240.
Hamadeh, H. K., Bushel, P. R., Jayadev, S., Martin, K., DiSorbo, O., & Sieber, S., et al. (2002). Gene expression analysis reveals chemical-specific profiles. Toxicological Sciences, 67, 219–231.
Hayes, K. R., Vollrath, A. L., Zastrow, G. M., McMillian, B. J., Craven, M., & Jovanovich, S., et al. (2005). EDGE: A centralized resource for the comparison, analysis, and distribution of toxicogenomic information. Molecular Pharmacology, 67, 1360–1368.
Lühe, A., Suter, L., Ruepp, S., Singer, T., Weiser, T., & Albertini, S. (2005). Toxicogenomics in the pharmaceutical industry: Hollow promises or real benefit? Mutation Research, 575(1–2), 102–115.
Pognan, F. (2004). Genomics, proteomics and metabonomics in toxicology: Hopefully not ‘fashionomics’. Pharmacogenomics, 5(7), 879–893.
Simon, R., Radmacher, M. D., Dobbin, K., & McShane, L. M. (2003). Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. Journal of the National Cancer Institute, 95, 14–18.
Thomas, R. S., Rank, D. R., Penn, S. G., Zastrow, G. M., Hayes, K. R., & Pande, K., et al. (2001). Identification of toxicologically predictive gene sets using cDNA microarrays. Molecular Pharmacology, 60, 1189–1194.
Tsai, C. A., Lee, T. C., Ho, I. C., Yang, U. C., Chen, C. H., & Chen, J. J. (2005). Multi-class clustering and prediction in the analysis of microarray data. Mathematical Biosciences, 193, 79–100.
Ulrich, R., & Friend, S. (2001). Toxicogenomics and drug discovery: Will new technologies help us produce better drugs? Nature Reviews Drug Discovery, 1, 84–88.
Van Delft, J. H. M., van Agen, E., van Breda, S. G. J., Herwijnen, M. H., Staal, Y. C. M., & Kleinjans, J. C. S. (2005). Comparison of supervised clustering methods to discriminate genotoxic from non-genotoxic carcinogens by gene expression profiling. Mutation Research, 575, 1–3.
Waring, J. F., Jolly, R. A., Ciurlionis, R., Lum, P. Y., Praestgaard, J. T., & Morfitt, D. C., et al. (2001). Clustering of hepatotoxins based on mechanism of toxicity using gene expression profiles. Toxicology and Applied Pharmacology, 175, 28–42.
Yang, Y., Blomme, E. A., & Waring, J. F. (2004). Toxicogenomics in drug discovery: From preclinical studies to clinical trials. Chemico-Biological Interactions, 150, 71–85.