
Update on Systems Biology Tools

Bioinformatic and Systems Biology Tools to Generate Testable Models of Signaling Pathways and Their Targets¹

Andrea Pitzschke and Heribert Hirt*


Department of Applied Genetics and Cell Biology, University of Natural Resources and Applied Life
Sciences, 1190 Vienna, Austria (A.P.); Department of Plant Molecular Biology, Max F. Perutz Laboratories,
University of Vienna, 1030 Vienna, Austria (H.H.); and Unité de Recherche en Génomique Végétale Plant
Genomics Laboratory, 91057 Evry, France (H.H.)

With more and more high-throughput data becoming available, scientists are faced with the challenge to develop or apply intelligent software to extract essential information from large-scale data sets. If used in a smart way, some bioinformatic programs can aid in many ways to elucidate the function of a gene of interest, including modes of regulation and synthesis, its posttranslational modifications and potential interaction partners, and last but not least processes that are regulated by its gene products. Examples of combinatory applications of bioinformatic tools that lead to the generation (and subsequent confirmation) of hypotheses (Table I) are described below, with a focus on the deciphering of cellular processes regulated by mitogen-activated protein kinase (MAPK) cascades in Arabidopsis (Arabidopsis thaliana).

Plants need to cope with a wide range of challenging environmental conditions. The successful adaptation/response to such stresses requires the efficient and specific transduction of environmental signals. In stress signal transduction, a prominent role is played by MAPK cascades, which minimally consist of a MAPK kinase kinase (MAPKKK), a MAPK kinase (MAPKK), and a MAPK. Via a phosphorelay mechanism, these modules transduce incoming signals to activate MAPKs that subsequently phosphorylate specific target proteins (for review, see Colcombet and Hirt, 2008; Pitzschke et al., 2009c). So far, experimental evidence exists only for a very few MAPK substrates, but a proteomic phosphoarray approach suggests that transcription factors (TFs) are the major targets of MAPKs (Popescu et al., 2009). Phosphorylation of TFs can potentially alter their subcellular localization, protein stability, or DNA-binding activity. MAPK cascades may thus be primary regulators of stimulus-dependent adaptation of gene expression. The Arabidopsis genome encodes for 60 to 80 MAPKKKs, 10 MAPKKs, and 20 MAPKs. The present challenge is to elucidate (1) which of the thousands of theoretically possible signaling modules are indeed formed, (2) which stimuli are conveyed, (3) which targets are addressed, and (4) what is the biological role of the respective signaling modules.

COMPARISON OF EXPRESSION PROFILES

One approach is to use correlative transcriptome analysis as a relatively unbiased technique. Hereby, microarray profiles of signaling cascade mutants can be compared from a wide range of organisms, e.g. by using the Genevestigator tool (https://www.genevestigator.com/gv/index.jsp). Mutants whose transcriptome profiles significantly overlap are likely to act in common signaling cascades. The extent of such overlaps can be conveniently visualized in Venn diagrams, e.g. using the tool at http://www.pangloss.com/seidel/Protocols/venn4.cgi, where expression profiles of up to four mutants can be compared. The program also generates lists of gene IDs occurring in two, three, or four entered data sets. Further inspection of the list of commonly regulated genes can give indications on the processes controlled by a theoretical signaling module, e.g. using Genevestigator, a rich source for transcriptome data on spatio-temporal expression patterns, mutant profiles, and responses to numerous treatments/growth conditions.

The following example emphasizes the consistencies with respect to similarities in expression profiles, phenotype, and hormone accumulation and thus documents the robustness and usefulness of transcriptome-based approaches. Rather than confirming correlations predicted from experimental results, with comparatively little effort such tools can generate reasonable hypotheses, which can subsequently be experimentally validated.
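
The overlap analysis described above can also be reproduced locally with a few set operations. The following is a minimal Python sketch under stated assumptions, not the implementation behind the Venn generator cited above: the function names are hypothetical, the gene identifiers (g1 to g5) are placeholders, and the assignment of genes to mutants is invented purely for illustration.

```python
from collections import Counter


def overlap_regions(gene_sets):
    """Group genes by the exact combination of data sets they occur in
    (i.e. the regions of a Venn diagram)."""
    membership = {}
    for name, genes in gene_sets.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(name)
    regions = {}
    for gene, names in membership.items():
        regions.setdefault(frozenset(names), set()).add(gene)
    return regions


def genes_in_at_least(gene_sets, k):
    """Return the genes that occur in at least k of the entered data sets."""
    counts = Counter(g for genes in gene_sets.values() for g in set(genes))
    return {g for g, n in counts.items() if n >= k}


# Hypothetical lists of differentially regulated genes (placeholder IDs).
profiles = {
    "mekk1": {"g1", "g2", "g3", "g4"},
    "mkk1/2": {"g2", "g3", "g4"},
    "mpk4": {"g3", "g4", "g5"},
}

print(sorted(genes_in_at_least(profiles, 2)))  # ['g2', 'g3', 'g4']
print(sorted(genes_in_at_least(profiles, 3)))  # ['g3', 'g4']
```

The list returned for k equal to the number of data sets corresponds to the central region of the Venn diagram and is the natural input for subsequent inspection of commonly regulated genes.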
EXAMPLE: MEKK1-MKK1/2-MPK4 AND BEYOND

MEKK1-MKK1/2-MPK4 engage in a signaling cascade that is activated in response to pathogen attack (Gao et al., 2008). Bimolecular fluorescence complementation analysis showed that both MPK4 and MEKK1 interact with MKK1 and MKK2 (Gao et al., 2008). mekk1, mpk4, and mkk1/mkk2 double knockout mutants show spontaneous cell lesions and highly

¹ This work was supported by grants from the Austrian Science Foundation.
* Corresponding author; e-mail hirt@evry.inra.fr.
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Heribert Hirt (hirt@evry.inra.fr).
www.plantphysiol.org/cgi/doi/10.1104/pp.109.149583
Plant Physiology®, February 2010, Vol. 152, pp. 460–469, www.plantphysiol.org © 2009 American Society of Plant Biologists
Downloaded from www.plantphysiol.org on April 23, 2015 - Published by www.plant.org
Copyright © 2010 American Society of Plant Biologists. All rights reserved.
Systems Biology Tools for Dissecting Signalling Pathways

Table I. List of bioinformatic tools described in this review


Tool | Link | Description | Reference

DNA tools
Genevestigator | https://www.genevestigator.com/gv/index.jsp | Study the expression and regulation of genes in a broad variety of contexts | Zimmermann et al. (2004)
AttedII | http://atted.jp/ | Find coexpressed genes | Obayashi et al. (2007)
TAIR bulk sequence download | http://www.arabidopsis.org/tools/bulk/sequences/index.jsp | Bulk download of sequences (transcript, regulatory regions, untranslated region, introns...) for lists of gene IDs of interest | Garcia-Hernandez et al. (2002)
TAIR patmatch | http://www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl | Pattern matching tool to search for short (<20 residues) nucleotide or peptide sequences, or ambiguous/degenerate patterns in Arabidopsis DNA/protein sequences | Yan et al. (2005)
PLACE | www.dna.affrc.go.jp/PLACE/ | Detect known cis-elements within a promoter set of interest | Higo et al. (1998)
PlantCARE | http://sphinx.rug.ac.be:8080/PlantCARE/ | Detect known cis-elements within a promoter set of interest | Lescot et al. (2002)
TAIR motif finder | http://www.arabidopsis.org/tools/bulk/motiffinder/index.jsp | Detect novel cis-elements within a promoter set of interest |
AlignACE | http://atlas.med.harvard.edu/cgi-bin/alignace.pl | Detect novel cis-elements within a promoter set of interest |
STAMP | http://www.benoslab.pitt.edu/stamp/ | Alignment, similarity, and database matching for DNA motifs | Mahony and Benos (2007)
POBO | http://ekhidna.biocenter.helsinki.fi/poxo/pobo/pobo | Calculate and visually display the significance of putatively enriched DNA elements | Kankainen and Holm (2004)
MotifMatcher | http://users.soe.ucsc.edu/~kent/improbizer/motifMatcher.html | Visualize user-defined multiple DNA motifs on promoter sets of interest as beads on a string |
MapMan | http://gabi.rzpd.de/projects/MapMan | Map transcript profiling data onto pathways and onto genetic maps; generate response overlays | Thimm et al. (2004); Usadel et al. (2009)

Protein tools
TAIR patmatch | http://www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl | Retrieve lists of gene IDs containing a (<20 residues) nucleotide or peptide sequence of interest | Garcia-Hernandez et al. (2002)
BAR interactome | http://bar.utoronto.ca/interactions/cgi-bin/arabidopsis_interactions_viewer.cgi | Find interacting proteins; 70,000 predicted protein-protein interaction data by Geisler-Lee et al. (2007) and 2,800 documented Arabidopsis protein-protein interactions | Geisler-Lee et al. (2007)
TargetP | http://www.cbs.dtu.dk/services/TargetP/ | Subcellular localization (prediction algorithms) | Horton et al. (2007)
WolfPSort | http://wolfpsort.org/ | Subcellular localization (prediction algorithms) |
SUBA | http://www.plantenergy.uwa.edu.au/suba2/ | Subcellular localization (from prediction algorithms and experimental evidence) | Heazlewood et al. (2007)
Affytree | http://bioinfoserver.rsbs.anu.edu.au/utils/affytrees/ | Phylogeny tool | Frickey et al. (2008)
Genebee ClustalW | www.genebee.msu.su/clustal | Alignment and phylogenetic analysis of amino acid sequences | Frickey et al. (2008)

Classification tools
TAIR GO annotation | http://www.arabidopsis.org/tools/bulk/go/index.jsp | GO annotation search, functional categorization | Garcia-Hernandez et al. (2002)
TAIR chromosome map tool | http://www.arabidopsis.org/jsp/ChromosomeMap/tool.jsp | Displays entered geneIDs on the Arabidopsis chromosomes | Garcia-Hernandez et al. (2002)
Venn generator | http://www.pangloss.com/seidel/Protocols/venn4.cgi | Venn diagram generator. Up to four lists of geneIDs can be analyzed |

elevated levels of reactive oxygen species. Moreover, they display a severely dwarfed phenotype, which is correlated with the strong accumulation of salicylic acid (SA), a major hormone in biotic pathogen responses. Accordingly, the sensitivity to the plant pathogen Pseudomonas syringae is reduced in these pathway mutants. For these reasons, the MEKK1-MKK1/2-MPK4 cascade has been ascribed a role as a negative regulator of innate immune responses in plants (Gao et al., 2008).
Pitzschke and Hirt

For all mutants affected in this MAPK module, transcriptome analyses have been performed (Qiu et al., 2008a; Pitzschke et al., 2009a). Indeed, the gene expression profiles of these mutants are highly similar. Consistent with the hierarchical order in the signaling cascade, mekk1 shows the largest set of differentially regulated genes, followed by mkk1/2 and eventually mpk4. Moreover, many of the common differentially regulated genes are known to be SA-responsive genes and/or are associated with redox regulation (Qiu et al., 2008a; Pitzschke et al., 2009a). In agreement with the partial redundancy of MKK1 and MKK2, the expression profiles of mkk1 or mkk2 single mutants hardly overlap with those of mekk1, mkk1/mkk2, and mpk4 mutants (Gao et al., 2008). Microarray data-based online tools (e.g. https://www.genevestigator.com/gv/index.jsp) also reveal a strong correlation of transcriptome profiles of mekk1, mkk1/mkk2, and mpk4 with several other mutants, such as constitutive expression of PR genes5 (cpr5) and nonexpressor of pathogenesis-related genes1 (npr1), suggesting further commonalities between these mutants. A more targeted bioinformatic approach of comparative transcriptome studies, Functional Associations by Response Overlap, has also highlighted the relatedness of mekk1, mkk1/2, and mpk4 profiles with those of cpr5 and npr1 (Nielsen et al., 2007). Indeed, cpr5 and npr1 mutants are also dwarfed and have highly elevated SA levels (Cao et al., 1994; Bowling et al., 1997).

An interesting characteristic of many SA-accumulating mutants is that their dwarfed phenotype (often associated with sterility/poor seed production) can be rescued by growing these plants at elevated temperatures, in line with the observed negative correlation of the heat-induced and the mpk4 mutant gene expression profiles (revealed by Functional Associations by Response Overlap analysis; Nielsen et al., 2007). Applying this knowledge to lines of interest may assist the positioning of the corresponding gene in the signaling cascade and can help to yield a larger pool of precious seeds through adjustment of growth conditions.

Web-based tools that integrate large sets of microarrays have the potential to reveal novel correlations between responses. To give an example, we observed a strong negative correlation between the expression response to SA and CO2 by Genevestigator analysis. It may therefore be worth testing the CO2 response of mekk1, mkk1/2, and mpk4 mutants with respect to phenotype, SA levels, and transcriptional changes. Likewise, this observation may also indicate that increasing environmental pollution (CO2) renders plants more susceptible to pathogen attack, and a recent study provides experimental evidence for this in silico-based assumption (Lake and Wade, 2009).

HOW TO GO FROM GENE TO FUNCTION

To understand the functional significance of gene expression profiles displayed by a mutant of interest, a search for statistically overrepresented “functional” and “cellular compartment” terms, using the gene ontology (GO) tool (e.g. http://www.arabidopsis.org/tools/bulk/go/index.jsp), is another promising approach. Not unexpectedly, in our example, this tool detects an enrichment of the GO terms “stress-responsive” and “transcription factor activity” in the list of mekk1, mkk1/2, and mpk4 up-regulated genes. Moreover, GO term analysis revealed that among the genes down-regulated in mekk1 and mpk4 those encoding plastidic or chloroplastic proteins are significantly overrepresented (Pitzschke et al., 2009a), which may indicate that these mutants also regulate processes related to photosynthesis to prevent further ROS production.

HOW TO IDENTIFY POTENTIALLY COREGULATED GENES: THE ARABIDOPSIS CHROMOSOME MAP TOOL

The highly user-friendly setup and the diversity of tools provided by The Arabidopsis Information Resource (TAIR) enable the researcher to subject genes of interest to further bioinformatic analysis. For example, the tool (http://www.arabidopsis.org/jsp/ChromosomeMap/tool.jsp), which displays the position of entered genes on the five Arabidopsis chromosomes, is useful for revealing clustering of genes on a particular chromosomal region. Such local clustering can be an indication for transcriptional coregulation, as, for example, evidenced in a study on the cluster of Arabidopsis RPP5 locus R genes involved in pathogen response (Yi and Richards, 2007).

HOW TO MANAGE THE FLOOD OF DATA

Despite their unquestionable value for driving research progress, whole-genome microarrays have their drawbacks. The experiment as such is very cost intensive. Moreover, the huge data set generated from these arrays often confronts the scientist with the (hard) decision of which subset of differentially expressed genes to investigate further (in silico). Prior selection might therefore be advisable. For those researchers particularly interested in defense-related responses, the small-scale expression array (Sato et al., 2007), which analyzes transcript abundance of 321 genes associated with pathogen response (and a set of genes for normalization), may be a good alternative, either for the experiment as such and/or as a preselection for the downstream data analysis. Results from a recent miniarray study investigating the response of seven defense-affected mutants (coi1, dde2, ein2, mpk3, pad4, cbp60g-1, and sid2) to P. syringae treatment (Wang et al., 2009) provide a manageable data set for a first comparison with our own data. The closer the transcriptome profile of a mutant of interest is to any of these mutants, the higher the probability that the corresponding proteins engage in a common

pathway. Moreover, if a mutant profile shows strongest overlap with the subset of genes coregulated in several of the other mutants (e.g. subsets of the seven defense-related mutants), the corresponding protein may be an upstream regulator acting before stress signaling bifurcation into individual pathways.

Similar to the usefulness of the above-described miniarray for pathogen response-focused research, a global map of gene expression within 15 different zones of the root corresponding to cell types and tissues at progressive developmental stages allows researchers a preselection of large data sets (retrieved from published or our own microarrays) for the analysis of developmental aspects (Birnbaum et al., 2005). Likewise, a report from Leonhardt et al. (2004) provides a list of genes that allows a preselection for guard cell- and mesophyll-expressed genes.

MAPPING OF MICROARRAY DATA ONTO PATHWAYS AND GENETIC MAPS: THE MAPMAN TOOL

One valuable tool, which allows analysis of large data sets and facilitates the assignment of clusters of genes showing major transcriptional changes to areas of function, is MapMan (http://gabi.rzpd.de/projects/MapMan). MapMan groups genes on the Arabidopsis Affymetrix 22 K array into >200 hierarchical categories, thereby providing an overview of various cellular processes. Due to its complexity, we will not describe this tool in detail, but recommend the following articles (Thimm et al., 2004; Usadel et al., 2009). Ideally, upon reading of the articles, MapMan should be visited and data sets, including those of your own experiments, explored.

Briefly, MapMan allows superimposition of different data sets in overlay plots and thus facilitates the identification of shared features, both globally and on a gene-to-gene basis.

By grouping genes that are probably involved in a common area of function, the MapMan tool can reveal trends toward repression or induction, which might not be obvious at the single gene level. The data sets of responses of interest can originate from your own experiments or can be downloaded from published microarrays. The analysis is also facilitated by the option to focus and visualize certain major pathways, such as “metabolism” or “DNA synthesis.”

The usefulness of MapMan has been demonstrated by the analysis of the Arabidopsis starvation response: The transcript profile of wild-type seedlings harvested at the end of the night was compared either to wild-type seedlings that had been incubated in the dark for an additional 6-h period or to starchless pgm mutants harvested at the end of the night. The MapMan-generated overlay plot revealed strong correlation between these two sugar-depletion conditions. As might be expected, the common transcriptional response indicates repression of photosynthesis and Suc, starch, and lipid synthesis, while genes involved in lipid, amino acid, and carbohydrate breakdown are largely induced. Novel aspects of sugar depletion were also revealed, e.g. a trend to preferential induction of cell wall synthesis-involved genes and repression of genes involved in cell wall breakdown. Furthermore, previous indications on a cross talk between sugar-sensing and abscisic acid- and ethylene-sensing pathways (Rook et al., 2001; Brocard et al., 2002; Leon and Sheen, 2003) could be substantiated. MapMan is being updated continuously, and a conversion of this tool now also allows comparison of responses in different organisms, as demonstrated by the comparison of diurnal changes in Arabidopsis and tomato (Solanum lycopersicum) expression profiles (Urbanczyk-Wochniak et al., 2006).

Despite its unquestionable value, MapMan has the major drawback that many genes cannot be categorized into certain MapMan-defined areas of function and are therefore not considered in the analysis. For example, in a study on the Arabidopsis response to Fusarium, the majority of genes could not be assigned to any of the known function categories (Yuan et al., 2008). If one’s list of genes of interest contains several “genes of unknown function,” a further separate inspection might be advisable. Using ClustalW (http://www.genebee.msu.su/clustal/basic.html), potential phylogenetic relatedness between the corresponding proteins can be detected, which will help to assign putative roles/implications of those proteins to the process that is investigated. This, in turn, can help to refine the MapMan data sets and thus facilitate future analyses.

PREDICTION OF PATHWAY MODULES THROUGH CORRELATION OF GENE EXPRESSION

Genes that are coexpressed over multiple data sets are likely to show functional relatedness. This knowledge may help to predict which proteins act in a common pathway or, as in this particular case, which MAPK signaling components engage in a common module. Using the AttedII tool (http://atted.jp/), lists of genes whose expression correlates with that of a gene of interest can be generated and correlation coefficients calculated. To test its suitability, we queried AttedII to predict components potentially associated with MKK4, a stress-related MAPKK whose transcript abundance alters in response to numerous stimuli (e.g. as evidenced in Genevestigator). AttedII reveals strong gene expression correlation of MKK4 with MKK5 and also with MPK3, but not with any other MAPK signaling component. MKK4 and MKK5 are known to be functionally redundant, to be controlled by the MAPKKK YODA, and to act as upstream regulators of the MAPKs MPK3 and MPK6 (Wang et al., 2007). Neither YODA nor MPK6 is among the predicted MKK4-correlated genes, most likely due to their ubiquitous expression. Further genes correlating with MKK4 expression are promising candidates for encoding additional components involved in MKK4-mediated signal transduction. The above example shows

the usefulness of gene expression correlation-based hypothesis generation, but also reveals the limitations of this approach for constitutively expressed genes.

HOW TO EXPLOIT MICROARRAY DATA FOR THE IDENTIFICATION OF TARGETS OF SIGNALING CASCADES

The rich pool of publicly available microarray data can not only be screened by bioinformatic tools for hypothesizing the composition of signaling pathways, but is also suitable for making predictions on the TFs and promoter elements that control a set of coexpressed genes.

Although each type of signal requires a specific cellular response, the transcript abundance of some genes is altered in response to multiple signals. This can be exemplified by the identification of a set of common stress genes using a clustering method (Ma and Bohnert, 2007). Using publicly available microarray data of transcriptional changes in response to various abiotic and biotic stresses, 197 common stress-responsive genes were identified. Similar studies were reported by Swindell (2006) and by Kant et al. (2008) for nine and 16 abiotic stress conditions, respectively. Based on GO annotation (kinase, TF, etc.), the latter report classified a subset of 289 genes as multiple stress regulatory genes (MSTRs), including several members of the WRKY and bZIP protein families, which are known to be stress associated (for review, see Jakoby et al., 2002; Ulker and Somssich, 2004). Considering the transcriptional response of these factors to very diverse signals, one may position MSTRs at the early steps of stress signaling responses. MSTRs can be expected to have a high turnover and to be controlled at multiple levels to allow fast adaptation and to prevent a prolonged activation of downstream signaling processes that would interfere with plant growth and development. These characteristics render MSTRs prime candidates for posttranslational modifications such as phosphorylation by protein kinases or ubiquitin-mediated stability control.

Given that some signaling cascades are activated very rapidly (e.g. MAPK cascades are activated within minutes), candidate targets for diverse signaling pathways might be found by screening transcriptome data sets for very early responses in a similar fashion as done for MSTRs.

The identification of early targets of signaling cascades and knowledge on the modes controlling their activity is also of tremendous value for applied science, because appropriate manipulation may minimize the effort of creating crops with desired traits such as resistance to multiple stresses. The key regulators may be expressed in a controllable system, e.g. by using chemically inducible expression or nuclear translocation systems, thereby circumventing undesirable side effects on growth/development that are often associated with constitutive overexpression of genes.

HOW TO EXPLOIT MICROARRAY DATA FOR THE IDENTIFICATION OF TRANSCRIPTIONAL REGULATORS

To decipher the modes controlling the expression of a set of coexpressed genes, an in-depth inspection of their upstream regulatory regions may provide further information. During the immediate responses to a given stimulus, the signaling may not have bifurcated yet into highly complex downstream pathways. Therefore, early induced genes are likely to underlie regulation by common TF(s) and therefore share common DNA motifs in their regulatory regions. Promoter sequences of user-defined length can be downloaded from several Arabidopsis databases, e.g. TAIR (http://www.arabidopsis.org/tools/bulk/sequences/index.jsp), and subsequently screened for the presence of certain DNA motifs. While PLACE (www.dna.affrc.go.jp/PLACE/) or PlantCARE (http://sphinx.rug.ac.be:8080/PlantCARE/) are useful for detecting known cis-elements within a set of promoters, the TAIR motif finder (http://www.arabidopsis.org/tools/bulk/motiffinder/index.jsp) and AlignACE tool (http://atlas.med.harvard.edu/cgi-bin/alignace.pl) allow identification of potentially novel DNA motifs shared by multiple promoters. Once candidate motifs have been identified, the statistical significance of their enrichment can be assessed using the POBO tool (http://ekhidna.biocenter.helsinki.fi/poxo/pobo/pobo), which compares motif abundance in the given promoter set to the Arabidopsis background frequencies. This tool has, for example, proven useful for documenting the strong enrichment of W boxes in the promoters of WRKY18-dependent, SA-inducible genes (Wang et al., 2006).

Subsequent to this statistical analysis, the functional relevance of enriched candidate DNA motifs in mediating stress responses can be experimentally validated using synthetic promoter-reporter gene constructs in transgenic plants or transfected protoplasts. The latter system also allows candidate TFs to be tested, with minimal effort, for their ability to induce/repress gene expression driven by a motif of interest, as, for example, evidenced in Rushton et al. (2002) or Pitzschke et al. (2009b).

HOW TO FIND THE TARGETS OF TRANSCRIPTION FACTORS

As an alternative to starting with the identification of multiple signal-responsive genes through the comparison of multiple signal-dependent expression profiles, an equally attractive approach for the elucidation of signaling cascades is the detailed characterization of a TF of interest, e.g. a known or predicted substrate of a signaling cascade.

For their characterization, a phylogenetic analysis may provide first indications about the dimerization behavior and sometimes even about potential DNA target motifs. However, high homology within the DNA-binding domain of two TFs does not necessarily

correlate with target motif similarity. For example, the bZIP factors of tobacco (Nicotiana tabacum) RSG2 and tomato VSF-1 have highly conserved bZIP domains, yet they bind to completely different DNA motifs (Ringli and Keller, 1998; Fukazawa et al., 2000). The bZIP domain of Arabidopsis VIP1 is strongly related to those of RSG2 and VSF-1, and VIP1 had been shown earlier to be phosphorylated by MPK3 in a stress-dependent manner and to undergo cytoplasmic-nuclear translocation (Djamei et al., 2007). Where no further information on the DNA motifs targeted by a TF of interest is available, random DNA selection assays (RDSAs) may be applied to generate data that subsequently can be analyzed by a range of bioinformatic tools (Pitzschke et al., 2009b).

In RDSA (Fig. 1), random double-stranded DNA fragments, usually 15 to 20 nucleotides long and flanked by defined primer-annealing sites, are incubated with recombinant TF protein. Candidate motifs are enriched through a repetitive selection-amplification procedure. RDSA yields a range of candidate DNA motifs that can be screened for common elements and aligned using the STAMP tool (http://www.benoslab.pitt.edu/stamp/). Electrophoretic mobility shift assays and mutagenesis of the candidate motif(s) are then used for confirming the binding and specificity of the TF to those motifs. Once such a motif has been found and confirmed, target genes of the TF can be predicted. For this, the TAIR patmatch tool (http://www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl) provides a tab-delimited file of position, number, and orientation for all genes harboring such motifs in a user-defined region (e.g. within 500 bp promoter regions). In the case of VIP1, this information aided the prediction of one of its target genes, MYB44, which was later confirmed by promoter-reporter gene activation and chromatin immunoprecipitation (Pitzschke et al., 2009b). For several multimeric TFs the spacing between adjacent target DNA motifs is crucial for transactivating activity. If knowledge about the

Figure 1. Flow chart of a possible strategy to identify the DNA motif(s) and promoters targeted by a TF of interest. Bioinformatic
tools are shown in italics.
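
The motif-search step of this strategy can be illustrated with a short script. The following is a minimal Python sketch, assuming the promoter sequences have already been downloaded as plain strings; it is not the TAIR patmatch implementation. The function names, gene IDs, and sequences are hypothetical, and the degenerate pattern TTGACY is used here only as an example of a W box-like motif written with IUPAC ambiguity codes.

```python
import re

# IUPAC nucleotide ambiguity codes translated into regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile a degenerate motif; the lookahead reports overlapping hits."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + pattern + "))")


def find_motif(promoters, motif):
    """Return (gene_id, 1-based position, strand, matched sequence) tuples."""
    forward = motif_regex(motif)
    reverse = motif_regex(reverse_complement(motif))
    palindromic = reverse_complement(motif) == motif.upper()
    hits = []
    for gene_id, seq in promoters.items():
        seq = seq.upper()
        for m in forward.finditer(seq):
            hits.append((gene_id, m.start() + 1, "+", m.group(1)))
        if not palindromic:  # avoid reporting the same site twice
            for m in reverse.finditer(seq):
                hits.append((gene_id, m.start() + 1, "-", m.group(1)))
    return hits


# Hypothetical promoter fragments (placeholder IDs and sequences).
promoters = {
    "geneA": "AAATTGACCGG",
    "geneB": "CCGGTCAATT",
    "geneC": "ACGTACGT",
}
for hit in find_motif(promoters, "TTGACY"):
    print("\t".join(str(field) for field in hit))
```

Each output line is tab-delimited and lists gene, position, orientation, and the matched sequence; the number of hits per gene follows from counting lines per gene ID. Positions on the minus strand are reported relative to the plus-strand sequence as entered.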


spacing exists (e.g. the preferred spacing between W boxes targeted by certain groups of WRKYs; Ciolkowski et al., 2008), the number of further candidate target genes can be narrowed down. A fast visual tool for this application is the MotifMatcher tool (http://users.soe.ucsc.edu/~kent/improbizer/motifMatcher.html), which depicts multiple user-defined motifs (entered as matrix), each in a different color, on a set of promoters of interest as beads on a string (Fig. 1).

HOW TO USE PROTEOMIC DATA IN CONSTRUCTING SIGNALING NETWORKS

Hypotheses on signaling pathway compositions can not only be generated through gene expression-based analyses, but also through proteomic approaches. The classical experimental approaches for retrieving a list of candidate interactors of a protein of interest are yeast two-hybrid (Y2H) screens and mass spectrometry (MS) analysis of purified protein complexes. Whereas Y2H analyses have the potential to predict direct protein interaction partners, MS analysis of protein complexes primarily indicates that the proteins are in more or less complicated assemblies of proteins. The low degree of overlap in Y2H and MS studies in yeast further cautions against a naive interpretation of these data sets. Y2H screens suffer from a relatively high degree of false positives that can be generated by multiple factors that are inherent in the system, including overexpression, artificial interaction of two components in the same compartment, or misfolding of the protein of interest due to fusion to yeast bait or prey proteins, respectively. MS studies of protein complexes, on the other hand, suffer from copurification of more or less abundant contaminants and the possibility that the proteins may not be interacting directly. Despite these drawbacks, valuable information can nonetheless be obtained from in silico analysis of publicly available interaction data, e.g. by using the tool provided at http://bar.utoronto.ca/interactions/cgi-bin/arabidopsis_interactions_viewer.cgi. This tool queries a huge database of confirmed Arabidopsis interacting proteins retrieved from Biomolecular Interaction Network Database and from high-density Arabidopsis protein microarrays, and provides details about the experimental evidence. It also integrates data from macro- and microarray-based phosphoprotein arrays that led to the identification of Arabidopsis

Figure 2. Prediction of signaling components through integration of transcriptome and proteome arrays. By searching for the
overlap between candidate proteins identified from peptide-based microarrays (e.g. interactors of a regulatory protein in a
process of interest; right) and proteins encoded by genes differentially expressed under conditions of interest (left), high-priority
candidates involved in the immediate downstream signaling can be defined. As exemplified here, multiple stress-responsive
genes that encode for proteins for which phosphorylation by a stress-associated kinase/phosphatase has been observed are likely
to be key components in early stress signal transduction.
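The overlap search summarized in Figure 2 can be sketched in a few lines of Python. This is only a minimal illustration, assuming that both data sets are available as plain lists of gene identifiers; the function name `prioritize_candidates` and all identifiers shown are hypothetical placeholders, not values taken from any published array.

```python
def prioritize_candidates(differentially_expressed, array_substrates):
    """Return identifiers found in both data sets, sorted alphabetically.

    Identifiers are normalized (whitespace stripped, upper-cased) so that
    trivial formatting differences between sources do not hide an overlap.
    """
    expressed = {gene.strip().upper() for gene in differentially_expressed}
    substrates = {gene.strip().upper() for gene in array_substrates}
    return sorted(expressed & substrates)


# Hypothetical identifiers, for illustration only.
stress_responsive_genes = ["AT1G00010", "at1g00020", "AT1G00030", "AT1G00040"]
kinase_substrates = ["AT1G00020", "AT1G00040 ", "AT1G00050"]

candidates = prioritize_candidates(stress_responsive_genes, kinase_substrates)
print(candidates)  # ['AT1G00020', 'AT1G00040']
```

A plain set intersection treats every hit equally; in practice one would probably also carry along fold changes or the type of experimental evidence to rank the overlapping candidates.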

MAPK candidate substrates (Feilner et al., 2005; Popescu et al., 2009).

Because a Y2H- or protein microarray-based predicted interaction does not necessarily mean that two proteins truly interact in planta, the list of candidate interacting proteins can be narrowed down by applying additional selection criteria: (1) check the spatiotemporal expression pattern of the corresponding genes (does gene x expression overlap with that of gene y; useful tools are: https://www.genevestigator.com/gv/index.jsp and http://atted.jp/); and (2) compare the subcellular localization of the proteins (a chloroplast-localized protein is unlikely to interact with a nuclear protein).

Obviously, data merely based on prediction algorithms (e.g. http://wolfpsort.org/, http://www.cbs.dtu.dk/services/TargetP/) need to be interpreted with caution. The more complex SUBA tool (http://www.plantenergy.uwa.edu.au/suba2/) integrates prediction-based information with data based on experimental evidence (MS/MS, GFP fusion protein localization studies). Integration of transcriptomic and proteomic data, from your own and published arrays, can further facilitate the identification of top candidates (Fig. 2). Once a list of a manageable number of candidate interaction partners has been established, their ability to bind to the protein of interest can be experimentally validated (coimmunoprecipitation/bimolecular fluorescence complementation/fluorescence resonance energy transfer).

The interaction viewer tool and the screening of published lists of protein-protein interactions can also aid the prediction of (further) partners that interact with a protein of interest (if protein A interacts with B, and B with C, then A might also interact with C). For instance, MPK4 has been shown to interact with MKS1 (for MAPK substrate 1). In turn, MKS1 was found to interact with two WRKY TFs, WRKY25 and WRKY33, in yeast. Both WRKYs are involved in biotic stress signaling, which in turn is clearly linked to MPK4. In an elegant series of experiments, Qiu et al. (2008b) showed that MPK4 exists in nuclear complexes with the WRKY33 TF. This complex depends on the MPK4 substrate MKS1. Challenge with pathogenic elicitors leads to the activation of MPK4 and phosphorylation of MKS1. Subsequently, complexes with MKS1 and WRKY33 are released from MPK4, and WRKY33 is recruited to the promoter of PAD3, encoding an enzyme required for the synthesis of antimicrobial camalexin. MKS1 serves to fine-tune WRKY33-mediated PAD3 expression. In line with this scenario, wrky33 mutants exhibit enhanced susceptibility to necrotrophic pathogens, whereas overexpression of WRKY33 increases resistance (Zheng et al., 2006). A recent transcriptome study has revealed further potential target genes of WRKY33, including CYP71A13, which encodes a cytochrome P450 monooxygenase required for camalexin synthesis (Petersen et al., 2008). The ATTED-II tool predicts a strong coregulation of PAD3 with CYP71A13, and the promoters of both genes carry multiple W boxes, suggesting that both genes are subject to a common regulatory mechanism, i.e. through WRKY33 (Petersen et al., 2008).

In-depth analysis of available phosphopeptide sequences may aid the prediction of peptide motifs that are recognized by a given kinase. Moreover, similar to the screening of Arabidopsis genes carrying a motif of interest in their upstream regulatory region, the TAIR patmatch tool can be applied to generate a list of candidate Arabidopsis proteins that harbor a given peptide motif. Additional confidence about the functional relevance of a candidate peptide motif may also be obtained through phylogenetic analysis. For example, Arabidopsis NPR1 and its orthologs in other plants carry a characteristic DSXXXS peptide (phosphodegron), which marks the protein for phosphorylation-dependent proteasomal degradation (Spoel et al., 2009). Phylogenetic analysis can, for example, be performed using the tool at http://bioinfoserver.rsbs.anu.edu.au/utils/affytrees/, which provides information about the homologs of a protein of interest in other plant species.

The functional relevance of candidate peptide motifs can then be experimentally verified (e.g. through in vitro phosphorylation). Subsequently, hybrid/artificial kinases can be created that modify proteins other than their true targets or that prevent phosphorylation of a protein by outcompeting the true modifying upstream kinase. Given that phosphorylation events are a common feature in the signaling of critical responses/processes in animals, this approach has high potential, for example, for tumorigenesis/cancer therapy research.

In summary, this review documents the usefulness, robustness, and limitations of applying various transcriptome-, promoterome-, and proteome-based bioinformatic tools for deciphering signaling pathways in Arabidopsis. Clearly, only a small subset of available tools is described, and their virtually unlimited number of elaborate combinations harbors high potential to significantly speed up progress in signaling research. Also, modeling approaches, for example, based on kinetic data, harbor a huge potential to dissect signaling pathways. In the future, experiments can be designed in a highly targeted manner and in silico analysis will replace bench work to a large extent.

Received October 16, 2009; accepted November 12, 2009; published November 13, 2009.

LITERATURE CITED

Birnbaum K, Jung JW, Wang JY, Lambert GM, Hirst JA, Galbraith DW, Benfey PN (2005) Cell type-specific expression profiling in plants via cell sorting of protoplasts from fluorescent reporter lines. Nat Methods 2: 615–619
Bowling SA, Clarke JD, Liu Y, Klessig DF, Dong X (1997) The cpr5 mutant of Arabidopsis expresses both NPR1-dependent and NPR1-independent resistance. Plant Cell 9: 1573–1584
Brocard IM, Lynch TJ, Finkelstein RR (2002) Regulation and role of the Arabidopsis abscisic acid-insensitive 5 gene in abscisic acid, sugar, and stress response. Plant Physiol 129: 1533–1543
Cao H, Bowling SA, Gordon AS, Dong X (1994) Characterization of an


Arabidopsis mutant that is nonresponsive to inducers of systemic acquired resistance. Plant Cell 6: 1583–1592
Ciolkowski I, Wanke D, Birkenbihl RP, Somssich IE (2008) Studies on DNA-binding selectivity of WRKY transcription factors lend structural clues into WRKY-domain function. Plant Mol Biol 68: 81–92
Colcombet J, Hirt H (2008) Arabidopsis MAPKs: a complex signalling network involved in multiple biological processes. Biochem J 413: 217–226
Djamei A, Pitzschke A, Nakagami H, Rajh I, Hirt H (2007) Trojan horse strategy in Agrobacterium transformation: abusing MAPK defense signaling. Science 318: 453–456
Feilner T, Hultschig C, Lee J, Meyer S, Immink RG, Koenig A, Possling A, Seitz H, Beveridge A, Scheel D, et al (2005) High throughput identification of potential Arabidopsis mitogen-activated protein kinases substrates. Mol Cell Proteomics 4: 1558–1568
Frickey T, Benedito VA, Udvardi M, Weiller G (2008) AffyTrees: facilitating comparative analysis of Affymetrix plant microarray chips. Plant Physiol 146: 377–386
Fukazawa J, Sakai T, Ishida S, Yamaguchi I, Kamiya Y, Takahashi Y (2000) Repression of shoot growth, a bZIP transcriptional activator, regulates cell elongation by controlling the level of gibberellins. Plant Cell 12: 901–915
Gao M, Liu J, Bi D, Zhang Z, Cheng F, Chen S, Zhang Y (2008) MEKK1, MKK1/MKK2 and MPK4 function together in a mitogen-activated protein kinase cascade to regulate innate immunity in plants. Cell Res 18: 1190–1198
Garcia-Hernandez M, Berardini TZ, Chen G, Crist D, Doyle A, Huala E, Knee E, Lambrecht M, Miller N, Mueller LA, et al (2002) TAIR: a resource for integrated Arabidopsis data. Funct Integr Genomics 2: 239–253
Geisler-Lee J, O’Toole N, Ammar R, Provart NJ, Millar AH, Geisler M (2007) A predicted interactome for Arabidopsis. Plant Physiol 145: 317–329
Heazlewood JL, Verboom RE, Tonti-Filippini J, Small I, Millar AH (2007) SUBA: the Arabidopsis Subcellular Database. Nucleic Acids Res 35: D213–D218
Higo K, Ugawa Y, Iwamoto M, Higo H (1998) PLACE: a database of plant cis-acting regulatory DNA elements. Nucleic Acids Res 26: 358–359
Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai K (2007) WoLF PSORT: protein localization predictor. Nucleic Acids Res 35: W585–W587
Jakoby M, Weisshaar B, Droge-Laser W, Vicente-Carbajosa J, Tiedemann J, Kroj T, Parcy F (2002) bZIP transcription factors in Arabidopsis. Trends Plant Sci 7: 106–111
Kankainen M, Holm L (2004) POBO, transcription factor binding site verification with bootstrapping. Nucleic Acids Res 32: W222–W229
Kant P, Gordon M, Kant S, Zolla G, Davydov O, Heimer YM, Chalifa-Caspi V, Shaked R, Barak S (2008) Functional-genomics-based identification of genes that regulate Arabidopsis responses to multiple abiotic stresses. Plant Cell Environ 31: 697–714
Lake JA, Wade RN (2009) Plant-pathogen interactions and elevated CO2: morphological changes in favour of pathogens. J Exp Bot 60: 3123–3131
Leon P, Sheen J (2003) Sugar and hormone connections. Trends Plant Sci 8: 110–116
Leonhardt N, Kwak JM, Robert N, Waner D, Leonhardt G, Schroeder JI (2004) Microarray expression analyses of Arabidopsis guard cells and isolation of a recessive abscisic acid hypersensitive protein phosphatase 2C mutant. Plant Cell 16: 596–615
Lescot M, Dehais P, Thijs G, Marchal K, Moreau Y, Van de Peer Y, Rouze P, Rombauts S (2002) PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Res 30: 325–327
Ma S, Bohnert HJ (2007) Integration of Arabidopsis thaliana stress-related transcript profiles, promoter structures, and cell-specific expression. Genome Biol 8: R49
Mahony S, Benos PV (2007) STAMP: a web tool for exploring DNA-binding motif similarities. Nucleic Acids Res 35: W253–W258
Nielsen HB, Mundy J, Willenbrock H (2007) Functional Associations by Response Overlap (FARO), a functional genomics approach matching gene expression phenotypes. PLoS One 2: e676
Obayashi T, Kinoshita K, Nakai K, Shibaoka M, Hayashi S, Saeki M, Shibata D, Saito K, Ohta H (2007) ATTED-II: a database of co-expressed genes and cis elements for identifying co-regulated gene groups in Arabidopsis. Nucleic Acids Res 35: D863–D869
Petersen K, Fiil BK, Mundy J, Petersen M (2008) Downstream targets of WRKY33. Plant Signal Behav 3: 1033–1034
Pitzschke A, Djamei A, Bitton F, Hirt H (2009a) A major role of the MEKK1-MKK1/2-MPK4 pathway in ROS signalling. Mol Plant 2: 120–137
Pitzschke A, Djamei A, Teige M, Hirt H (2009b) VIP1 response elements mediate mitogen-activated protein kinase 3-induced stress gene expression. Proc Natl Acad Sci USA 106: 18414–18419
Pitzschke A, Schikora A, Hirt H (2009c) MAPK cascade signalling networks in plant defence. Curr Opin Plant Biol 12: 421–426
Popescu SC, Popescu GV, Bachan S, Zhang Z, Gerstein M, Snyder M, Dinesh-Kumar SP (2009) MAPK target networks in Arabidopsis thaliana revealed using functional protein microarrays. Genes Dev 23: 80–92
Qiu JL, Fiil BK, Petersen K, Nielsen HB, Botanga CJ, Thorgrimsen S, Palma K, Suarez-Rodriguez MC, Sandbech-Clausen S, Lichota J, et al (2008b) Arabidopsis MAP kinase 4 regulates gene expression through transcription factor release in the nucleus. EMBO J 27: 2214–2221
Qiu JL, Zhou L, Yun BW, Nielsen HB, Fiil BK, Petersen K, Mackinlay J, Loake GJ, Mundy J, Morris PC (2008a) Arabidopsis mitogen-activated protein kinase kinases MKK1 and MKK2 have overlapping functions in defense signaling mediated by MEKK1, MPK4, and MKS1. Plant Physiol 148: 212–222
Ringli C, Keller B (1998) Specific interaction of the tomato bZIP transcription factor VSF-1 with a non-palindromic DNA sequence that controls vascular gene expression. Plant Mol Biol 37: 977–988
Rook F, Corke F, Card R, Munz G, Smith C, Bevan MW (2001) Impaired sucrose-induction mutants reveal the modulation of sugar-induced starch biosynthetic gene expression by abscisic acid signalling. Plant J 26: 421–433
Rushton PJ, Reinstadler A, Lipka V, Lippok B, Somssich IE (2002) Synthetic plant promoters containing defined regulatory elements provide novel insights into pathogen- and wound-induced signaling. Plant Cell 14: 749–762
Sato M, Mitra RM, Coller J, Wang D, Spivey NW, Dewdney J, Denoux C, Glazebrook J, Katagiri F (2007) A high-performance, small-scale microarray for expression profiling of many samples in Arabidopsis-pathogen studies. Plant J 49: 565–577
Spoel SH, Mou Z, Tada Y, Spivey NW, Genschik P, Dong X (2009) Proteasome-mediated turnover of the transcription coactivator NPR1 plays dual roles in regulating plant immunity. Cell 137: 860–872
Swindell WR (2006) The association among gene expression responses to nine abiotic stress treatments in Arabidopsis thaliana. Genetics 174: 1811–1824
Thimm O, Blasing O, Gibon Y, Nagel A, Meyer S, Kruger P, Selbig J, Muller LA, Rhee SY, Stitt M (2004) MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J 37: 914–939
Ulker B, Somssich IE (2004) WRKY transcription factors: from DNA binding towards biological function. Curr Opin Plant Biol 7: 491–498
Urbanczyk-Wochniak E, Usadel B, Thimm O, Nunes-Nesi A, Carrari F, Davy M, Blasing O, Kowalczyk M, Weicht D, Polinceusz A, et al (2006) Conversion of MapMan to allow the analysis of transcript data from Solanaceous species: effects of genetic and environmental alterations in energy metabolism in the leaf. Plant Mol Biol 60: 773–792
Usadel B, Poree F, Nagel A, Lohse M, Czedik-Eysenberg A, Stitt M (2009) A guide to using MapMan to visualize and compare Omics data in plants: a case study in the crop species, Maize. Plant Cell Environ 32: 1211–1229
Wang D, Amornsiripanitch N, Dong X (2006) A genomic approach to identify regulatory nodes in the transcriptional network of systemic acquired resistance in plants. PLoS Pathog 2: e123
Wang H, Ngwenyama N, Liu Y, Walker JC, Zhang S (2007) Stomatal development and patterning are regulated by environmentally responsive mitogen-activated protein kinases in Arabidopsis. Plant Cell 19: 63–73
Wang L, Tsuda K, Sato M, Cohen JD, Katagiri F, Glazebrook J (2009) Arabidopsis CaM binding protein CBP60g contributes to MAMP-induced SA accumulation and is involved in disease resistance against Pseudomonas syringae. PLoS Pathog 5: e1000301


Yan T, Yoo D, Berardini TZ, Mueller LA, Weems DC, Weng S, Cherry JM, Rhee SY (2005) PatMatch: a program for finding patterns in peptide and nucleotide sequences. Nucleic Acids Res 33: W262–W266
Yi H, Richards EJ (2007) A cluster of disease resistance genes in Arabidopsis is coordinately regulated by transcriptional activation and RNA silencing. Plant Cell 19: 2929–2939
Yuan J, Zhu M, Lightfoot DA, Iqbal MJ, Yang JY, Meksem K (2008) In silico comparison of transcript abundances during Arabidopsis thaliana and Glycine max resistance to Fusarium virguliforme. BMC Genomics (Suppl 2) 9: S6
Zheng Z, Qamar SA, Chen Z, Mengiste T (2006) Arabidopsis WRKY33 transcription factor is required for resistance to necrotrophic fungal pathogens. Plant J 48: 592–605
Zimmermann P, Hirsch-Hoffmann M, Hennig L, Gruissem W (2004) GENEVESTIGATOR: Arabidopsis microarray database and analysis toolbox. Plant Physiol 136: 2621–2632
