
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/265332508

Functional Genomics and Proteomics: Basics, Opportunities and Challenges

Chapter · January 2003


DOI: 10.1007/978-3-642-55539-8_3


5 authors, including:

Stefan R Schmidt, BioAtrium AG
Jeno Gyuris, Békés Megyei Pándy Kálmán Kórház


All content following this page was uploaded by Stefan R Schmidt on 22 October 2017.



CHAPTER 3

Functional Genomics and Proteomics: Basics, Opportunities and Challenges
NIKOLAI KLEY, STEFAN SCHMIDT, VIVIAN BERLIN, HANNES LOFERER, JENO GYURIS

Contents
3.1 Introduction
3.2 Functional Genomics: Technologies and Applications
3.2.1 Genome Profiling
3.2.1.1 Gene Expression Profiling
3.2.1.2 Mapping DNA-Protein Interactions
3.2.1.3 DNA Variations
3.2.2 Phenotype-Based Functional Genomics
3.2.2.1 Induced Phenotypes and Gene Function
3.2.2.2 Applications of Phenotype-Based Functional Genomics: Drug Discovery and Diagnostics
3.3 Proteomics: Technologies and Applications
3.3.1 Proteome Expression Profiling
3.3.1.1 2D-PAGE-MALDI-MS
3.3.1.2 Multidimensional LC-MS/MS
3.3.1.3 Isotopic Methods
3.3.1.4 Protein Chip
3.3.1.5 Protein Arrays
3.3.2 Profiling of Macromolecular Interactions: Elucidation of Proteome Networks
3.3.2.1 Molecular Interaction Screening Technologies
3.3.2.2 Applications of Molecular Interaction Screening Technologies
3.3.3 Probing the Proteome: Discovery of Surrogate Ligands
3.3.3.1 Protein-Antibody Interaction Screening Technologies
3.3.3.2 Combinatorial Peptide Libraries and Surrogate Ligand Discovery
3.3.3.3 Implications for Diagnostics, Prognostics and Therapy
3.4 Chemical Genomics and Proteomics
3.4.1 Impact of Small Molecules on Genome Expression
3.4.2 Small Molecule-Target Interactions
3.5 Functional Genomics and Proteomics: Implications for Molecular and Nuclear Medicine
3.6 Functional Genomics and Proteomics: Opportunities and Challenges
3.7 References

3.1
Introduction

Genomics and proteomics are changing our understanding of biology. To date, the greatest impact has come from DNA sequencing projects, which recently culminated in the unveiling of the near-complete 3.2-billion base-pair sequence of the human genome (International Human Genome Sequencing Consortium 2001; Venter et al. 2001). We have thus advanced from having only limited information about the genetic details of biology to possessing an immense amount of structural information about individual genes. The complete genome sequences of more than 60 species are now available in databases, and many more are expected to become available in the near future. These information resources alone will have a significant impact on biomedical research. Even more importantly, blueprints of genomes can provide the basis for the integration of complex data sets derived from a wide range of studies in genomics, functional genomics, and proteomics. The resulting increase in genetic and biological information will have an even greater impact on biomedical research and the way medicine is practiced.

In general terms, genomics refers to the generation of information about genes and genomes by systematic approaches that can be performed at an industrial scale, such as the sequencing and physical mapping of genes. A richness of information that is relevant to basic, applied, and medical sciences can be derived from blueprints of genomes. This includes, but is not necessarily limited to, information about: (1) the identity and modular structure of genes and their encoded proteins; (2) the classification of gene and protein families, and their evolutionary relationships; (3) the molecular basis of evolution (i.e., changes that have led to speciation and existing phylogeny);
(4) the linkage of specific genetic markers to inherited traits and disease susceptibility; and (5) the molecular basis of susceptibility/responsiveness to drug treatment (pharmacogenomics).

However, only limited information about the biological function of genes and proteins can be extracted from sequence information alone, despite the diverse bioinformatic queries to which genome sequences can be subjected at present. Indeed, the surprisingly low number of genes predicted to be encoded by the human genome (currently estimated to be 30,000-40,000 genes) implies that regulatory processes such as differential gene expression, alternative splicing, and post-transcriptional and post-translational regulatory processes must significantly contribute to the complexity of molecular events underlying cellular processes and specification. Indeed, it is estimated that, on average, ten variants of each protein exist (which includes splice variants and post-translationally modified variants). This raises the complexity of the proteome significantly beyond that of the genome. If one includes differential expression as a dynamic parameter, the number of possible states of the proteome is larger by yet several orders of magnitude. This is what underlies the true complexity of cellular functions. A major challenge ahead will be to use genomic information resources to understand the functions of all genes and the proteins they encode. Understanding how genes and proteins collaborate and interact to carry out cellular processes - that is, understanding how complex biological systems operate ("systems biology") - will be among the most difficult challenges ahead.

Functional genomics refers to the systematic generation and analysis of information about what genes do, and involves a broad range of technologies and experimental approaches that strive to integrate genomic information with information gathered from gene-driven experimentation. These include genome-wide analysis of gene expression (analysis of the transcriptome) as well as the analysis of phenotypic effects associated with induced changes in gene expression. Operating on a large scale, functional genomics is a truly multidisciplinary science that integrates genomics, genetics, molecular and cellular biology, computer science (bioinformatics), and engineering and automation technologies. Proteomics integrates a similarly diverse range of technologies but focuses on the large scale analysis of the functions of protein products encoded by genomes - and includes (1) the analysis of cellular protein expression, protein modifications, and protein-protein interactions; (2) the mapping of signaling pathways; and ultimately (3) the development of a proteome network of cellular signaling pathways.

Chemical genomics and proteomics refers to the systematic analysis of the impact of organic small molecule drugs on the genome and proteome. This chemistry-oriented analysis is focused on improving our understanding of the molecular basis of the mechanism of action of drugs on a genome- and proteome-wide level. Current strategies include: (1) imaging the dynamics of the impact of small molecules, and their metabolites, on the expression of genomes (transcriptome analysis) and proteomes; (2) uncovering cellular targets and pathways that underlie cellular responses to small molecules; and (3) the rational design of small molecules for the selective perturbation of endogenously expressed or selectively engineered proteins. Insights gained from the use of these strategies can afford a better understanding of what drugs do and how they act. Thus, collectively, chemical genomics and proteomics strive to improve our understanding of the target specificity of a compound, reveal early in the discovery process potential safety issues with respect to identified target and off-target interactions and effects, highlight potential new uses of compounds, and guide medicinal chemistry to the synthesis of drugs with more selective and improved properties. Furthermore, in the near future, chemical genomics and proteomics data may be integrated with pharmacogenomic data on allelic variances occurring in populations and, thereby, improve the prediction of drug response profiles. Chemical genomics and proteomics will also promote the emergence of small molecules that can be used as tools to probe the function of genes and proteins. To quote Richard Klausner, director of the NCI: "The discovery, modification and annotation of small molecules in terms of their ability to probe and perturb biological targets will be one of the central tools of the post-genomic era". Chemical genomics emulates all the principles of genetics, but rather than relying on genetic mutations to dissect function, it uses small molecules. These may prove particularly useful in probing the function of proteins in vivo.

The exploitation of genomics, functional genomics, and proteomics to elucidate physiological and pathological processes will play an important role in the future practice of medicine. In this chapter we review the basic principles as well as recent technological advances in functional genomics and proteomics, and examine how progress in these areas affects our understanding of systems biology and the molecular basis of disease, and how it stimulates the development of novel research tools, diagnostics, prognostics, and therapeutics.

3.2
Functional Genomics: Technologies and Applications

3.2.1
Genome Profiling

3.2.1.1
Gene Expression Profiling

The traditional gene-by-gene approach will not suffice to meet the sheer magnitude of the challenge of understanding biological systems with 40,000 or more genes. It will be necessary to take "global views" of biological processes, which requires a simultaneous monitoring of the activity and regulation of as many cellular components as possible. Global analysis of gene expression represents an important piece of such a puzzle. Imaging transcriptional programs can reveal how global gene expression is remodeled during changes in cell growth, physiology, pathology, or environment, and can provide important information about gene function. Already, the recent development of technologies that enable gene expression studies at large scale has begun to have a profound impact on biological research, pharmacology, and medicine - as exemplified, for instance, by:

1. Determination of the tissue and cell type specificity of gene expression. A gene with expression restricted to a certain cell type or tissue is unlikely to be involved in the pathology of a disease affecting another cell type or tissue, unless a secreted protein is involved. Knowledge of the tissue (and cell) specificity of gene expression is also critical for assembling biological pathways (co-expression) and validating suitable targets for therapeutic intervention.
2. Cell-signaling studies. Alteration of gene expression is a critical step in most cellular signaling pathways. Gene-expression studies have been designed to identify genes whose expression depends on a cell state or on the functions of specific components of signaling or transcription apparatuses. For instance, as studies in yeast indicate, functions of a gene can be predicted from the expression profile of a cell with a mutation in that gene.
3. Composite gene expression studies to infer the function(s) of a gene on the basis of its co-regulation with other genes. Large-scale gene expression profiling and gene clustering studies have shown, for both prokaryotes and eukaryotes, that genes that regulate specific cellular processes (e.g., RNA splicing, cell cycle progression, glycolysis, and other metabolic pathways) are often co-regulated. Consequently, when analyzed in the context of the pattern of expression of thousands of genes, the similarity of the behavior of a gene to that of other genes with known function can provide clues to biological function ("guilt by association").
4. Determination of gene expression patterns in disease. Diversion of normal physiology is frequently accompanied by a panoply of histological and biochemical changes, including changes in gene expression patterns. The up- or downregulation of gene activity can be either the cause of the pathophysiology or the result of disease. Consequently, genes may be identified which (1) can serve as diagnostic markers, (2) cause disease and can be targeted for therapeutic intervention, and/or (3) are expressed as a consequence of disease and can lead to strategies aimed at alleviation of symptoms. Cancer is a good example for which the utility of gene expression has been amply demonstrated in the discovery of novel therapeutic targets as well as in the classification of cancers. Gene expression profiles will become a valuable tool with which pathologists and oncologists can obtain a more global quantitative approach in the classification of cancers and the prediction of outcomes. It will allow physicians to follow the progression of disease even when there is no histological evidence of change. Furthermore, in conjunction with gene mutation and gene polymorphism analysis, the ability to detect differences in human cancers by the difference in expression profiles is likely to aid in the selection of appropriate therapies.
5. Gene expression studies in disease models in inbred animals. Detailed profiling of gene expression in model systems can yield important insights into cellular, animal, and human physiology critical to the discovery and validation of therapeutic targets.
6. Gene expression studies in pathogens. With the entire genome sequence of many pathogens now available, new approaches toward understanding the molecular basis of pathogenesis can be gained.
For instance, gene expression studies can provide insights into the biology of acute infection vs. latency in vivo, virulence factors, and host response to pathogens. It should also be possible to obtain signatures for pathogens that are diagnostic, even when the etiologic agent is not known.
7. Gene expression in response to drug treatment. Gene expression studies can give important insights into the mechanism of action of small molecule drugs and drug resistance mechanisms. They can also be used to delineate and predict adverse events based on the identification of genes with toxicity potential. In conjunction with traditional toxicity studies, the identification of deregulated gene products that can be used as surrogate markers will further improve the drug development process.

It is apparent that gene expression analysis is an important functional genomics tool that has many applications in the basic and applied medical sciences. It is thus not surprising that many gene expression technologies have been developed over the years. These include techniques that generally depend on DNA sequencing (such as sequencing of EST libraries or SAGE elements), or on PCR-based differential display methods, microarrays, or gene traps (which allow direct measurements of gene activity in intact cells). Microarrays have many advantages over other gene expression technologies, which are often laborious and insensitive, and they seem likely to become a standard tool of both molecular biology research and clinical diagnostic research. The most significant advantage of microarrays is the ability to analyze the expression of thousands of genes in a parallel and systematic way, enabling the investigator to take a more "global view" of biological processes as they are reflected in signature gene expression changes. Genome sequencing efforts will provide the necessary information to display a large number of genes on high-density arrays. Industrialization of chip production and free market competition will reduce costs over time and make such tools more widely available to the scientific community. Due to the importance of microarray technology, we will focus in this review on a brief description of currently used array-based profiling technologies. In vivo profiling methods based on gene traps will also be addressed, as they represent an important avenue of gene expression analysis in intact cells and in in vivo model systems.

3.2.1.1.1
Microarrays

DNA microarrays provide a simple and natural yet systematic and comprehensive vehicle for exploring the genome. The power and universality of DNA microarrays as experimental tools derive from the exquisite specificity and affinity of complementary base-pairing. A DNA copy of an individual gene provides an ideal reagent for specific and quantitative detection and measurement of the sequence of the gene, even in an extremely complex mixture. Expression analysis is carried out by hybridizing RNA/cDNA to the immobilized DNA. Several different DNA microarray technologies are currently in use (Fig. 3.1).

In one type of array, DNA is printed and subsequently immobilized on solid supports (Cheung et al. 1999). The first such solid support described was nylon, and nylon remains widely used. This array boasts superior sensitivity with its utilization of radioactively labeled probes, and is relatively inexpensive (Duggan et al. 1999). The small amount of RNA sample required for hybridization makes this technology particularly suitable for the analysis of expression profiles in microdissected disease tissues. However, glass supports have distinct advantages as well: DNA probes can be covalently attached onto glass surfaces, glass is a durable material, it produces low background as compared to nylon membranes when fluorescence detection is used, and it is nonporous so that hybridization volumes can be kept to a minimum. Thus, despite the higher sensitivity of nylon-based filter arrays, the use of nonporous solid supports is becoming a more popular technology with time. It has facilitated miniaturization and utilizes the safer fluorescence-based signal detection methods. The method pioneered by Pat Brown and colleagues uses arrays where DNA is printed on glass microscope slides using a robotic "arrayer" (Schena et al. 1995; Shalon et al. 1996; Eisen and Brown 1999). Information about the relative abundance of genes in two DNA or RNA samples is obtained through the labeling of such samples with different fluorescent dyes. These are then mixed and hybridized to the arrayed DNA spots. After hybridization, the fluorescence of each dye is measured separately. The ratio of signals reflects the relative abundance of the sequence of each gene in the two RNA or DNA samples.
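
The two-dye comparison described above can be illustrated with a short sketch. It is a minimal example only, assuming that one intensity per gene is already available for each dye channel; the function name `log2_ratios`, the gene labels, and the intensity values are hypothetical, and the log2 transform and the intensity floor are common conventions rather than steps prescribed in this chapter.

```python
import math


def log2_ratios(channel_a, channel_b, background=0.0, floor=1.0):
    """Return per-gene log2(channel_a / channel_b) for one co-hybridized array.

    channel_a, channel_b: dicts mapping gene label -> raw intensity for the
    two fluorescent dyes. `background` is subtracted from every spot, and
    `floor` keeps corrected intensities positive (both are assumptions).
    """
    ratios = {}
    for gene in channel_a:
        a = max(channel_a[gene] - background, floor)
        b = max(channel_b[gene] - background, floor)
        ratios[gene] = math.log2(a / b)
    return ratios


# Hypothetical intensities: sample of interest vs. reference sample.
sample = {"geneA": 1600.0, "geneB": 400.0, "geneC": 100.0}
reference = {"geneA": 400.0, "geneB": 400.0, "geneC": 400.0}

ratios = log2_ratios(sample, reference)
for gene, value in sorted(ratios.items()):
    print(gene, value)
```

On a log2 scale, a positive value indicates higher relative abundance in the first sample, a negative value indicates lower abundance, and zero indicates no difference, which treats induction and repression symmetrically.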
[Figure 3.1: flow diagram. Panel labels: Isolation of RNA; Probe preparation and labelling; Hybridization; Oligonucleotide synthesis & array chips; Fluorescence reader; Sample 1; Sample 2; panels a and b.]

Fig. 3.1. Array-based gene expression monitoring. A flow path schematically represents the interacting components required for monitoring RNA abundance levels using DNA microarrays generated by robotic arraying of DNA (e.g., cDNA arrays from PCR products) or by in situ oligonucleotide synthesis (oligonucleotide arrays, Affymetrix chip technology). (a) cDNA arrays produced on nylon filter membranes are hybridized with radioactively labeled probes (generally 33P-cDNA). Subsequent to image analysis, the signal intensities (adjusted using various quality control measures such as background subtraction and multiplication with a normalization factor - for instance, normalization to the average of all probed genes, or normalization to the total signal of some housekeeping genes) obtained by hybridization of probes generated from different samples can be arranged in a data matrix containing expression values for each probed gene and for each condition (sample type). Various types of data analyses may be performed, such as pairwise comparisons when two samples are compared, or multiple-pairwise comparisons or clustering analyses (hierarchical and partitioning clustering methods) when more samples are used in the analysis. (b) cDNA arrays on glass supports or oligonucleotide arrays (Affymetrix chips) are usually hybridized with probes generated using fluorescent dyes (usually Cy3 and Cy5). From two different conditions, cDNAs are labeled with the two different dyes, and the two samples are co-hybridized to a single array (as opposed to two matching arrays as required for radioactive probes from different samples). Subsequently, the array is scanned at two different wavelengths to detect the relative transcript abundance for each condition

Other arrays utilize oligonucleotides rather than DNA or cDNA (Fodor et al. 1991; Khrapko et al. 1991; Southern et al. 1992). Among these, the method for high density spatial synthesis of oligonucleotides, as developed by Steve Fodor and colleagues (and commercialized by Affymetrix), is the best known (Fodor et al. 1991; Lockhart et al. 1996). This method exploits photolithography technology (an adaptation from computer chip production) in parallel oligonucleotide synthesis, and has been used so far to produce chips with 400,000 or more distinct oligonucleotides.
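
The background subtraction, normalization, and data-matrix steps named in the legend of Fig. 3.1 can be sketched as follows. This is an illustrative example only, assuming a single constant background per array and simple scaling by the average signal of a reference gene set (all probed genes by default, or a chosen set of housekeeping genes); the function names, gene labels, and intensities are hypothetical and do not come from the chapter.

```python
from statistics import mean


def normalize_array(intensities, background, reference_genes=None):
    """Background-subtract one array, then scale by a reference average."""
    corrected = {g: max(v - background, 0.0) for g, v in intensities.items()}
    # Reference set: all probed genes by default, or chosen housekeeping genes.
    reference = reference_genes if reference_genes else list(corrected)
    scale = mean(corrected[g] for g in reference)
    if scale == 0:
        raise ValueError("reference genes carry no signal after correction")
    return {g: v / scale for g, v in corrected.items()}


def build_matrix(arrays, background, reference_genes=None):
    """Arrange normalized values as {gene: [value per condition]}."""
    conditions = sorted(arrays)
    normalized = {
        c: normalize_array(arrays[c], background, reference_genes)
        for c in conditions
    }
    genes = sorted(normalized[conditions[0]])
    return conditions, {g: [normalized[c][g] for c in conditions] for g in genes}


# Hypothetical raw intensities; sample_2 was hybridized with a brighter probe.
arrays = {
    "sample_1": {"g1": 110.0, "g2": 210.0, "g3": 310.0},
    "sample_2": {"g1": 210.0, "g2": 410.0, "g3": 610.0},
}

conditions, matrix = build_matrix(arrays, background=10.0)
for gene, row in matrix.items():
    print(gene, dict(zip(conditions, row)))
```

In this toy data set the second array is uniformly twice as bright after background correction, so normalization removes the difference and every gene has the same value in both columns. Rows of such a matrix are the usual input to the pairwise comparisons and clustering analyses mentioned in the legend.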
The emergence of microarray technologies, although still in their infancy, has greatly facilitated large scale gene expression studies. It is important to note that large scale gene expression profiling does not simply entail a scaling up of numbers of events that can be measured simultaneously. As alluded to previously, large data sets harbor information about patterns and systematic features, which can support the building of a picture of complex systems. Bioinformaticians are challenged with evolving tools to facilitate the search and display of "hidden" features. These include sophisticated tools for statistical analysis as well as visualization tools that display data in an easily comprehensible manner. In turn, these should facilitate the formulation of experimentally testable hypotheses. For a description of some such analysis tools, the interested reader is referred to recent reviews covering this topic (Wittes and Friedman 1999; Eisen et al. 1998).

Microarrays have been employed in many of the types of investigations briefly outlined in Sect. 3.2.1.1. For instance, in the basic research arena, microarrays have been applied to the study of genome-wide patterns of gene expression in response to a variety of imposed stimuli. Many early studies have focused on the yeast model system and revealed waves of gene expression that correlate with cell cycle progression and various environmental conditions (DeRisi et al. 1997; Lashkari et al. 1997; Wodicka et al. 1997; Spellman et al. 1998). The results from such studies clearly indicate that gene expression data reflect information about gene function and even about the physical association of gene products. Numerous clusters of co-expressed genes, representing diverse expression patterns across even a limited set of conditions, were strikingly coherent in their cellular functions (Eisen et al. 1998). Other investigations have explored whether expression profiles of mutant cells can be used to classify the functions of previously uncharacterized genes (Hughes et al. 2000). Microarrays have also been applied in many other studies involving mammalian cells. These include, for instance: (1) the genome-wide analysis of the mitotic cell cycle (Cho et al. 1998); (2) the study of the transcription response of human fibroblasts to serum (Iyer et al. 1999); (3) the study of the response to activation of a specific gene, such as the MYC oncogene; (4) the large scale identification of secreted and membrane-associated proteins (Diehn et al. 2000); (5) the differential analysis of normal and disease tissues (Perou et al. 1999); and (6) the analysis of tumor subtypes (see below). Clearly, large scale gene expression studies will help to reveal cellular circuitry at an unprecedented level of detail and lead to the discovery of novel targets for therapeutic intervention. Another application is the study of cell response to drug treatment (see Sect. 3.4.1).

Microarray studies have also been used to improve clinical disease diagnosis. For example, microarrays are already providing insights into cancer that would be difficult, if not impossible, to obtain by a gene-by-gene approach. They have been used to identify specific subtypes of a variety of cancers, including leukemias (Golub et al. 1999) and lymphomas (Alizadeh et al. 2000), cutaneous malignant melanoma (Bittner et al. 2000), breast cancer (Perou et al. 1999) and colon cancer (Alon et al. 1999). For instance, comparison of profiles of acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL), two hematologic malignancies that are often difficult to differentiate by standard pathological examination of the diseased cells, allows the distinction between these cancers without previous knowledge of these classes (Golub et al. 1999). Importantly, an automatically derived class predictor (based on identified marker genes) is able to determine the class of new leukemia cases. Another example is highlighted by studies of diffuse large B-cell lymphomas (DLBCL), a clinically heterogeneous disease group. Clustering analyses of microarray data resulted in the identification of marker genes that allowed the definition of two DLBCL groups: one had the signature of B-cells from germinal centers, the other the signature of activated B-cells (Alizadeh et al. 2000). The clinical outcome of patients carrying the activated B-cell-like signature was much worse than that of patients with the germinal center B-cell signature. These studies, although still preliminary, indicate how large scale gene expression studies may improve the resolution of cancer subtypes. Similar kinds of approaches to other diseases are also being undertaken (Friddle et al. 2000) that should accelerate our understanding of disease etiologies, and thus the development of mechanism-based therapeutics and tools for better diagnosis and prognosis (Khan et al. 2001).

3.2.1.1.2
Non-Array Based Gene Expression Methods: In Vivo Analysis of Gene Expression

Microarrays and other sequencing- or PCR-based methods for quantifying gene expression are limited to the use of RNA isolated from cells or tissues. Certain experimental paradigms, however, require measurement of gene expression in intact cells or whole organisms.

Methods that utilize gene traps have been successfully used in this context (Brown et al. 1998). These are primarily based on the use of retroviral vector systems that support the integration of reporter genes into transcriptionally active chromosomal regions. Integration may result in knocking out the targeted gene, and this strategy has been used to generate large embryonic stem (ES) cell libraries for the generation of transgenic knock-out animal model systems (Zambrowicz et al. 1998). However, integration may also be utilized to measure the activity of the gene into which the transgene construct has integrated (Whitney et al. 1998; Ishida and Leder 1999; Akiyama et al. 2000; Medico et al. 2001; Mitchell et al. 2001). Inclusion of appropriate splice acceptor and splice donor sites in the transgene constructs puts expression of the transgene reporter under the control of the endogenous promoter of the affected gene, thereby providing an in situ readout of endogenous gene activity. Numerous reporters, primarily drug-resistance genes or genes encoding fluorescent proteins such as green fluorescent protein (GFP), have been used in this context. Retroviral as well as plasmid-based gene reporter traps have been used to generate complex cell populations with numerous integration sites. Such cell populations may be analyzed to profile gene expression at a genome-wide level. Fluorescent reporter probes enable real time analysis of gene expression using such systems and will be useful in the dissection of signaling pathways and in the study of the effects of environmental agents (Whitney et al. 1998).

The development of noninvasive methods for measuring reporter activity would further expand the use of such systems to in vivo analysis of single or composite gene expression. Such an advance would have applications in a wide range of fields, including the development of animal model systems using gene "knock-in" strategies and gene therapy settings. The recent development of PET reporter genes may provide such an avenue. Positron emission tomography (PET) is a noninvasive imaging modality used to study biochemical and biological processes in vivo. PET probes incorporate a positron-emitting isotope attached to a molecule of interest. Such tracers are usually positron-labeled ligands for receptors or positron-labeled enzyme substrates. Retention of PET probes in living tissue generally occurs after ligand binding or after conversion of a substrate to "trapped" product(s), and can be measured and quantified directly from PET scanner images. This approach has already found application in the development of PET reporters for measuring gene expression. For instance, a PET-reporter-gene approach based on herpes simplex virus type 1 thymidine kinase (HSV1-tk) trapping of positron-labeled substrates in cells expressing HSV1-tk has been reported (Gambhir et al. 1998, 1999). Others have shown that coupling target and reporter expression through internal ribosome entry sites (IRES) could be used for quantitation of the expression of virtually any gene of interest from the respective transgene in intact cells (Yu et al. 2000). Such bi-cistronic reporter systems may be of particular use in gene therapy.

In addition to clinical applications, transgenic and "knock-in" animal models incorporating PET reporters could be used to measure endogenous gene expression in many experimental settings. Thus, integrating large scale experiments that identify novel genes of interest with strategies for in vivo monitoring of gene expression changes will further facilitate linking genotypes and phenotypes. Further exploitation and development of PET reporter systems should prove useful for both basic and clinical research.

3.2.1.2
Mapping DNA-Protein Interactions

It will be useful to augment global gene expression data with information about the function of the factors that directly regulate genes. Again, DNA arrays have shown promise in mapping the binding sites of proteins to target DNA sequences. Two complementary techniques have already been used: chromatin immunoprecipitation and DNA adenine methyltransferase identification (DamID) (Ren et al. 2000; Iyer et al. 2001; Van Steensel et al. 2001). These techniques can provide genome-wide biochemical information about DNA-protein interactions.

3.2.1.3
DNA Variations

The elucidation of DNA variations has been central to the discovery of genetic mutations underlying many inherited and sporadic diseases. Polymorphic DNA variations are also used as markers of genes and genomes with which researchers perform genetic analysis in outbred species where matings are not controlled. In the context of sequencing of the human genome, a huge collection of single nucleotide polymorphisms (SNPs)
is being discovered, and these SNPs are anticipated to provide a basis for more efficient mapping of disease genes and for understanding the role of gene variations in altered gene function, susceptibility to disease, and the response of individual patients to drug treatment (Chakravarti 2001). The general expectation is that SNPs will have a significant impact on basic and clinical research (Risch 2000; Roses 2000), although some fundamental questions, such as the role of common genetic variants in causing human disease, and the nature of variations within and between populations, still need to be answered.

Many different methodologies are being explored in the analysis of DNA variations, a detailed discussion of which is beyond the scope of this review. However, although not strictly within the realm of functional genomics, gene variance analysis will have a significant impact on the determination of functional differences between polymorphic variants; that is, how these result in phenotypic polymorphisms. Microarrays, as described earlier in this review, have had a great impact on large-scale gene expression profiling. Microarrays can also be used to study DNA, primarily for identifying and genotyping mutations and polymorphisms (Brown and Botstein 1999; Hacia 1999; Lockhart and Winzeler 2000).

Array-based characterization and identification of novel DNA variants have largely been performed using oligonucleotide arrays, exploiting the ability to perform custom synthesis at high density. Thus, oligonucleotides with defined substitutions can be used to scan a target sequence for mutations. Such arrays have been used to detect variants in the HIV genome, human mitochondria, and various disease genes such as the p53 tumor-suppressor gene. Array technology is likely to become an important tool in diagnostics, as in genotyping cancer subtypes, thereby aiding the oncologist in the selection of appropriate therapies. SNPs will no doubt soon have an impact on the discovery of the molecular basis of complex polygenic diseases and drug susceptibility (pharmacogenomics).

3.2.2
Phenotype-Based Functional Genomics

Data emerging from the functional genomics approaches described in the previous sections largely relate to the surveying of molecular differences between biological samples and the placement of identified targets into known signaling pathways. Although large scale studies facilitate the prediction of gene function, any candidate target will ultimately need to be characterized functionally in more detail, utilizing a spectrum of different technologies and biological model systems.

The goal of phenotype-based functional genomics is to accelerate this process through the identification of targets on the basis of their function in the regulation of specific cellular phenotypes. Thus, target discovery and validation are addressed in parallel. In other words, a combinatorial screening paradigm first identifies elements that induce certain phenotypes, and the genes pertaining to such elements are then identified - a reversed order of discovery from that usually pursued in functional genomics (conceptually similar to principles followed in reverse genetics).

Classical genetic analysis in genetically tractable organisms has been among the approaches traditionally taken to study gene function. This involves the introduction of mutations and the subsequent observation of the mutant phenotype. Complex biological processes have been dissected by genetic screens in which a large collection of random mutations is introduced into the genome of the model organism and mutants with the desired altered phenotype are selected and analyzed. Classical genetics, combined with molecular genetics and genomics, will remain a powerful tool for analyzing the workings of complex biological systems.

Mammalian cells are widely used as model systems to gain insights into normal and disease-related processes. However, with the exception of mutagenesis, none of the classical genetic tools are applicable to mammalian cell systems. Somatic mammalian cells are asexual diploids, and it is impossible to perform genetic crosses for the generation of well-characterized cell strains - the cornerstone of classical genetic analysis. Functional genomics tools have been developed to address these limitations.

3.2.2.1
Induced Phenotypes and Gene Function

In an effort to emulate all the principles of classical genetics, various functional genomics tools have been developed to dissect the function of individual genes or perform genome-wide target discovery screens in mammalian cells. The central element of these tools is the cDNA itself. Individual cDNAs, or a library of cDNAs representing the entire genetic content of a
cell or an organism, are inserted into vector systems that allow efficient gene transfer and expression in a variety of mammalian cells. The most prominent of these vector systems are based on retroviruses because of their versatility, efficiency, and ease of manipulation (Morgenstern and Land 1990; Hannon et al. 1999). The expression of a cDNA in sense orientation leads to overexpression of the gene product in the target cell, mimicking the effect of a "gain of function" mutation. The antisense expression of a cDNA leads to the expression of an RNA molecule whose sequence is complementary to the mRNA. The binding of antisense RNA to the target mRNA interferes with its translation, leading to the knock-out or knock-down of gene activity. To improve the efficiency of inhibition, small antisense fragment libraries can be generated. Small antisense RNA fragments may inhibit gene expression more effectively than complete antisense mRNA (Nellen and Sczakiel 1996). The antisense expression of a cDNA or cDNA fragments is intended to mimic the effect of a "loss of function" mutation (Deiss and Kimchi 1991; Holzmayer et al. 1992; Gudkov and Roninson 1997).

The overexpression or antisense-mediated inhibition of the expression of a gene or its encoded protein results in a reversible alteration of the cellular phenotype, and this can provide valuable insights into gene function. Many examples in the scientific literature demonstrate that the overexpression of sense and antisense cDNAs results in induced phenotypic changes that reflect functional properties of the gene of interest (see, for example, Gallagher et al. 1997; Camero et al. 2000; Gudkov et al. 1999; Xu et al. 2001). Genome-wide genetic screens are conducted with complex sense or antisense cDNA expression libraries to isolate genes whose "gain of function" (sense libraries) or "loss of function" (antisense libraries) induces the desired phenotypic change. Other approaches may use combinatorial peptide expression libraries, if the discovery of a peptide surrogate ligand with target inhibitory function is desired.

Typically, a screen begins with the transfer of a library into the target cells. The transduced cell population is grown under selective conditions to allow for the enrichment of cells that display the phenotype of interest. Cells with the altered phenotype are collected and the library-derived genetic element (cDNA or antisense fragment) whose expression is responsible for the induction of the phenotype is recovered and amplified. This enriched sub-library of functional elements may be used in multiple rounds of phenotypic selections (cycle cloning). Cycle cloning is important in increasing the validity of the screen since mammalian cells are inherently heterogeneous and prone to spontaneous reversions, and many interesting phenotypes are leaky. Ultimately, functional screens will select the relevant genes that are rate-limiting in the control of a biological process, and can lead to the identification of novel targets and unpredicted mechanisms. In the past decade, functional genetic screens have been used with great success to study complex biological processes such as tumor suppression, apoptosis, cellular senescence and drug resistance (see, for example, Gudkov et al. 1993; Deiss et al. 1995; Garkavtsev et al. 1996; Sun et al. 1998; Hudson et al. 1999; Mahon and Whitehead 2001).

3.2.2.2
Applications of Phenotype-Based Functional Genomics: Drug Discovery and Diagnostics

The sequencing of the human genome revealed the existence of thousands of potential drug targets. Functional genomics is being used to validate these potential targets in order to establish a causative relationship between the target and the disease of interest.

Target validation studies have multiple goals. First, they attempt to link the target mechanistically to the disease and demonstrate the role of the target in pathogenesis. Second, they attempt to demonstrate that the inhibition of the target in diseased cells results in a desired therapeutic effect. These experiments may indicate the efficacy of a future drug developed against the target. A third goal is to determine whether the inhibition of the target has a toxic effect on normal cells.

Phenotype-based functional genomics screens achieve these goals through the use of cellular disease models and antisense-mediated inhibition of target expression. Antisense inhibition of gene function mimics the effect of a drug that would bind to and inhibit the target protein. Such an approach can be taken at a genome-wide level or at the level of specific gene families that are of special interest for drug discovery. Using bioinformatic search tools, genes encoding proteins with the same predicted biochemical activity (gene families) can be identified in genomic databases. Gene families encoding protein kinases, ion channels, G protein coupled receptors, and proteases can be identified by the presence of characteristic motifs in the predicted protein sequences. The challenge for functional genomics is to match individual members of gene families
to disease indicators and demonstrate that the targeting of those particular members would be therapeutically beneficial. This task requires the simultaneous validation of hundreds of genes in multiple disease models.

Phenotype-based approaches have utility not only in the discovery and validation of drug targets but also in the development of diagnostic tools. For instance, the identification of specific cell surface markers for diseased tissues would facilitate the development of imaging agents for early detection of pathological states. Similarly, the identification of proteins secreted by the diseased tissue would form the basis for the development of novel blood tests for early disease detection. These efforts can be aided by phenotype-based functional studies designed to find membrane-bound or secreted proteins. Such an approach takes advantage of the fact that membrane targeting of both secreted and cell surface proteins requires a short, amino-terminal hydrophobic peptide called a signal peptide. The sequence of the signal peptide uniquely identifies the secreted proteins. The goal is the isolation of cDNA fragments from a library made from the diseased tissue that encode signal peptides using a signal sequence trap (Tashiro et al. 1993; Kojima and Kitamura 1999).

The signal trap is based on the use of truncated membrane-bound cell-surface molecules such as CD4, CD5, or CD95. These proteins are rendered incapable of localizing to the surface of the cell if they lack a functional signal sequence. The fusion of a cDNA fragment that encodes a functional signal peptide will restore the localization of the truncated proteins to the surface of the cell. Cells expressing the marker on the surface can be isolated using marker-specific antibodies and a cell sorter or magnetic beads. After the recovery of the cDNAs from the positive cells, a catalogue of membrane-bound and secreted proteins can be compiled from normal and diseased tissues. Candidate diagnostic markers that are abundant in the diseased sample can be further validated for the development of novel diagnostics using additional techniques.

3.3
Proteomics: Technologies and Applications

As a counterpart to functional genomics, proteomics addresses the challenge of the dissection of the function of proteins. Proteins are more difficult to work with than DNA, which explains the comparatively slower development of large scale technology platforms that would facilitate the analysis of proteins at a proteome-wide level. Nonetheless, significant developments in this area have been achieved in the past few years. Technologies have emerged that facilitate more sensitive and larger scale profiling of changes in protein expression, and the mapping of protein-protein interactions and signaling pathways. As already noted in the introduction of this chapter, large scale analysis of the regulation and dynamic interactions of components of the proteome will be indispensable in the functional characterization and annotation of the genome, and will have an impact on all areas of basic and medical sciences.

Here we discuss recent developments in proteomics technologies and their impact on profiling the dynamic regulation of the proteome, the mapping of "interactomes" (protein-protein interaction networks), the development of novel tools to probe the function of proteins in vivo, some of which are predicted to aid in the development of imaging agents, and the biology underlying physiological and pathological processes.

3.3.1
Proteome Expression Profiling

3.3.1.1
2D-PAGE-MALDI-MS

The most frequently used method for protein separation and identification is two-dimensional polyacrylamide gel electrophoresis (2D-PAGE, 2D-GE) followed by mass spectrometry (MS; Gygi and Aebersold 2000; Fig. 3.2). This classical profiling method consists of several steps, beginning with sample preparation. No single method of sample preparation can be applied universally due to the diverse nature of samples that are analyzed, in contrast to the more uniform sample preparation methods involving RNA or DNA. Methods for sample preparation include stepwise precipitation, immuno-affinity isolation and/or subcellular fractionation. Following sample preparation, proteins are separated in two dimensions according to two independent and distinct properties, isoelectric point and molecular weight. Recently, advances in 2D-GE technology have resulted in improved resolution of proteins with polyacrylamide gels (Goerg et al. 2000). To date, 2D-GE is still the most powerful method used to separate com-

[Figure 3.2, flow diagram: source of protein (cell or tissue samples); protein extraction/solubilization; isoelectric focusing (first dimension); SDS-PAGE (second dimension); comparative image analysis; spot excision; protease digestion; mass spectrometric identification of spots/peptide fragments; database analysis; protein ID]

Fig. 3.2. Protein expression monitoring. A flow path schematically represents the use of two-dimensional gel electrophoresis for visualizing proteins and their relative expression levels. This approach may be used to compare the expression levels of proteins in two or more different samples/conditions (e.g., normal and diseased). Proteins are solubilized and the protein mixture applied to a "first dimension" gel strip that separates proteins based on their isoelectric points. Subsequently, the strip is subjected to reduction and alkylation and applied to a "second dimension" SDS-PAGE gel. Proteins are then separated on the basis of size. Gels are fixed and proteins visualized by silver staining. Visualized spots are then recorded and quantified. Spots are excised, tryptic digests generated, and peptides analyzed by mass spectrometric analysis
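
The last steps of the workflow in Fig. 3.2 (tryptic digestion, mass measurement, and comparison with a database) can be illustrated with a minimal sketch of peptide mass fingerprinting. It assumes a simplified trypsin rule (cleavage after K or R unless the next residue is P), no missed cleavages, no modifications, and neutral monoisotopic masses. The function names, the two toy sequences, and the 0.2 Da tolerance are hypothetical choices made for illustration and do not come from the chapter.

```python
# Standard monoisotopic residue masses in Da; one water is added per peptide.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence):
    """Virtual cleavage: cut after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def fingerprint(sequence):
    """Theoretical peptide masses for one stored protein sequence."""
    return [peptide_mass(p) for p in tryptic_digest(sequence)]


def count_matches(observed, theoretical, tol=0.2):
    """Number of observed masses within tol Da of any theoretical mass."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def identify(observed, database, tol=0.2):
    """Name of the database entry whose fingerprint matches best."""
    return max(database,
               key=lambda name: count_matches(observed, fingerprint(database[name]), tol))


# Hypothetical two-entry database and a simulated measurement of protein_B.
database = {
    "protein_A": "MKWVTFISLLLLFSSAYSR",
    "protein_B": "GAVLKPEPTIDERSTK",
}
observed = [mass + 0.05 for mass in fingerprint(database["protein_B"])]
best_hit = identify(observed, database)
```

Real search engines additionally handle missed cleavages, modifications, and statistical scoring, so the tighter the mass tolerance that the instrument supports, the fewer spurious candidates survive.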

plex protein mixtures. Separated proteins are visualized by chromophoric staining, isotopic labeling, or with fluorescent dyes, and further examined by image analysis. This process is only semi-automated and still requires manual editing of critical positions. In order to achieve statistically significant detection of protein expression differences between two different samples, 2D-GE has to be repeated several times. Individual protein spots of interest are excised, alkylated, reduced, and digested with trypsin. The resulting peptides are mixed with a large excess of UV-absorbing matrix, dried on a spot, ionized by a pulsed laser, then extracted by an electrical field into the mass analyzer. This method, known as matrix-assisted laser desorption-ionization mass spectroscopy (MALDI-MS), creates a peptide mass fingerprint. This signature of a protein is calibrated against the defined internal standard of trypsin auto-cleavage products and compared to a database of tryptic peptides created by virtual cleavage of stored protein sequences. State-of-the-art software is capable of including various potential post-translational modifications into the search algorithm to improve the confidence level of protein identification. Confidence in the database search results is directly correlated with mass accuracy. Recently, methods have been established to automate the processes of spot identification, cutting, digestion, and MS-analysis.

The coupling of 2D-GE and MALDI-MS is well established now (Pandey and Mann 2000). However, it still suffers from limitations in reproducibility, sensitivity, and automation potential. MS data also have shortcomings in that peptide masses are often not sufficient to unambiguously identify a protein, and these shortcomings are of particular relevance when working with organisms whose genomes are not completely sequenced. The range of applications of 2D-GE is thus still limited (Pandey and Mann 2000). Still, 2D-GE has been successfully applied to profile the expression of fractions of the proteome of diverse species, including human. Complementary technologies, such as those described below, may circumvent some of the limitations inherent in 2D-GE approaches and provide larger scale solutions for the future.

3.3.1.2
Multidimensional LC-MS/MS

Methods to characterize complex protein mixtures without the need for pre-purification increase the efficiency of protein identification and avoid difficulties associated with 2D-GE. The most prominent limitations of 2D-GE (slow spot excision, low detection limit, low loading capacity, and biased applicability) can be partially overcome by automatically separating proteins by multidimensional liquid chromatography. This method also applies orthogonal separation principles based on different physicochemical parameters. In multidimensional liquid chromatography, the crude lysate (which may even contain solubilized membrane proteins) is first digested as a complex mixture of proteins and then subjected to a series of liquid chromatographic (LC) steps before analysis with an electrospray ionization mass spectrometer (ESI-MS). Coupling the injection-capillary to a low-flow chromatographic system permits direct on-line analysis. The electrospray source operates at atmospheric pressure, making interfacing relatively simple (Rowley et al. 2000). Recently, a biphasic microcapillary column integrating directly overlaid beds of strong anionic exchange beads with reversed phase particles was applied successfully to separate the eukaryotic 80S ribosome and the yeast proteome (Link et al. 1999; Washburn et al. 2001). The combination of fast data acquisition with sophisticated search algorithms capable of subtracting identified peptides before the next round of database searching, such as SEQUEST, leads to a higher sample throughput than can be achieved with any other method (Yates et al. 1995). The success rate of this approach is increasing with the completion of sequenced genomes. Current systems do not yet achieve the resolving power of 2D-GE but have the potential to improve in the future.

3.3.1.3
Isotopic Methods

Isotopic labeling methods facilitate better quantitative measurements of dynamic changes in protein expression. One approach involves stable isotopic dilution, and utilizes heavy isotopes as internal standards that can easily be differentiated by MS from non-labeled samples (de Lenheer et al. 1985). This method usually introduces the isotopic label before protein extraction, which prohibits its application to biopsied tissue samples. Another promising method for quantification consists of three modules and is called isotope-coded affinity tag (ICAT). The elements are: an affinity tag, a linker to incorporate stable isotopes, and a reactive group specific to thiol groups present at cysteines. Proteins from two samples are denatured, reduced, and labeled with either the heavy or the light variant of the isotope. Then the samples are combined, digested, and isolated by affinity chromatography directed against the affinity tag. The column is coupled to a tandem MS and ratios of heavy and light versions of the peptides are determined. Additionally, the identity of the peptide can be revealed by MS/MS sequencing. The major drawbacks of this method are: (1) the attachment of additional mass (which complicates database searches), and (2) the biased selectivity for cysteine-containing peptides (Gygi et al. 1999v). The method nevertheless holds great promise for differential protein profiling at a larger scale and for discovering novel disease proteins and markers.

3.3.1.4
Protein Chip

Expression profiles can also be analyzed by replacing the tedious and complex process of 2D-GE with a sample preparation on a chip. This technology reduces the complexity of the protein mixture by separating the proteins into sets of proteins with common properties. Protein Chip arrays contain various affinity surfaces, allowing separation according to various protein characteristics such as charge, hydrophobicity, and metal binding. The different surfaces bind subsets of the protein mixture. After differential washing and removal of unbound material, the bound proteins are analyzed in a time-of-flight mass spectrometer (TOF MS) (Fung et al. 2001). The technology has been used for disease profiling in cancer and to define biomarkers significant for different disease statuses (von Eggeling et al. 2000). The Protein Chip (Ciphergen, Fremont, Calif., USA) system promises to become a powerful proteomics tool for highlighting the differences in protein expression profiles directly from complex lysates (Senior 1999), but major technological improvements are still needed for broader applications.

3.3.1.5
Protein Arrays

Protein arrays are inspired by the success of DNA arrays and are likely to become another important tool for monitoring protein expression profiles and for the larger scale analysis of biochemical properties of proteins. Successful implementation of protein arrays for expression profiling requires access to protein and ligand libraries, suitable solid supports, immobilization techniques, and sensitive detection devices. Protein binding molecules can be derived from small molecules, oligonucleotides, aptamers, antibodies, and macromolecules such as phages (Borrebaeck 2000). Immobilization supports include filter membranes [polyvinylidene difluoride (PVDF), nitrocellulose], and glass, plastic, metal or silicone surfaces. Detection may be based on electromagnetic waves (absorption, reflection, transmission, fluorescence, luminescence, phospho-imaging), label-free techniques such as surface plasmon resonance, monitoring of molecular parameters such as electrochemical properties (conductivity), atomic force microscopy (Jones et al. 1998), or mass spectrometry (Borrebaeck et al. 2001). The most advanced example of this approach has been the investigation of the presence and relative abundance of 115 different antigens on an antibody array by differential labeling of extracts with Cy3 and Cy5 fluorescent dyes (Haab 2001). When dealing with a limited number of target molecules, a miniaturized sandwich enzyme-linked immunosorbent assay (ELISA), which requires a pair of antibodies, one for binding and one for detecting the antigen, may be used (Mendoza et al. 1999; Joos et al. 2000; Huang 2001). Currently, the greatest obstacles for protein arrays are the limited stability of the immobilized molecules, the limited availability of monospecific antibodies (which might be overcome by recombinant and synthetic antibody generation techniques; see Sect. 3.3.3.1) and the high variance of affinities and on/off rates. Protein array technologies are still in their infancy but will no doubt grow to become larger scale proteome profiling tools as more technical advances are made in the near future.

3.3.2
Profiling of Macromolecular Interactions: Elucidation of Proteome Networks

Molecular interactions are essential to many biological processes. Constitutive and induced noncovalent associations of proteins and multienzyme complexes play a major role in the regulation of cellular machineries implicated in diverse processes such as DNA replication, transcription, translation, and metabolic and signal transduction pathways. Much of modern biological research is concerned with the identification of proteins involved in such cellular processes, with determining their function, and with elucidating how, when, and where they interact with other proteins involved in specific biochemical pathways. Proteomics technologies that facilitate the large scale mapping of protein interactions will pave the way toward the establishment of maps of cellular circuitry and functions of the proteome.
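
One way to picture such a map of cellular circuitry is as an undirected graph in which proteins are nodes and each detected pairwise interaction is an edge. The sketch below is a minimal illustration under that assumption. The protein names and interaction pairs are invented placeholders, and the helper names are hypothetical.

```python
from collections import defaultdict, deque


def build_interaction_map(pairs):
    """Store each pairwise interaction as an undirected edge."""
    graph = defaultdict(set)
    for protein_a, protein_b in pairs:
        graph[protein_a].add(protein_b)
        graph[protein_b].add(protein_a)
    return graph


def connected_modules(graph):
    """Group proteins that are linked directly or through partners."""
    seen, modules = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        module, queue = set(), deque([start])
        while queue:  # breadth-first walk over interaction partners
            protein = queue.popleft()
            if protein in module:
                continue
            module.add(protein)
            queue.extend(graph[protein] - module)
        seen |= module
        modules.append(module)
    return modules


# Invented example data: two separate groups of interacting proteins.
pairs = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
interaction_map = build_interaction_map(pairs)
modules = connected_modules(interaction_map)
```

Connected groups in such a graph are only candidate complexes or pathways; whether they correspond to functional units still has to be tested experimentally.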

3.3.2.1
Molecular Interaction Screening Technologies

Biochemical separation technologies have been successfully applied to the study of macromolecular complexes, although they are limited to in vitro analysis of such complexes (Pandey and Mann 2000). Developments in mass spectrometry technology are greatly facilitating the scaling up of such studies (Shevchenko et al. 1996). Proteomics has also led to the development of methods for the detection and identification of protein-protein or other macromolecular interactions in intact cells. The most widely known and used technologies that support large scale screening of interactions are the yeast two hybrid (Y2H) and phage display technologies (for a more detailed description of the latter, see Sect. 3.3.3).

The Y2H system is a powerful method for the in vivo analysis of protein-protein interactions in intact cells, and uses yeast as a surrogate host system (Fields and Song 1989; Chien et al. 1991; Gyuris et al. 1993; Vidal and Legrain 1999). It is based on the reconstitution of a transcription factor complex, which is mediated by the interaction of two fusion proteins (Fig. 3.3 A). These fusion proteins consist of two interacting protein entities fused to either a DNA-binding component or a transcription activation domain component. An interaction-driven reconstitution of transcription activation

[Figure 3.3 A, Y2H schematic: (a) no interaction and (b) positive interaction, each showing the DB-binding site and the reporter gene. Inset: yeast cell array with reporter expression]

Fig. 3.3 A-C. Cell-based protein-protein interaction monitoring. Variations of two-hybrid themes. (A) The classic yeast-two-hybrid (Y2H) assay. Reconstitution of an active transcription factor complex using two hybrid proteins. One hybrid protein encodes a DNA binding domain (DB) fused to a protein X (bait). The second hybrid protein encodes a transcription activation domain (AD) fused to a protein Y (prey). Protein Y may be encoded by a cDNA library when library screens are performed with the objective to identify a prey protein that binds to a bait protein X of interest. Interaction of the two hybrid proteins results in increased transcription of a reporter gene. Increased reporter expression/activity therefore reflects a positive protein-protein interaction. Inset: Yeast cells expressing hybrid protein may be arrayed and grown on filter membranes, and reporter expression monitored by image analysis in ways similar to image analyses performed with DNA arrays (see Fig. 3.1). When LacZ is used as a reporter, yeast cells turn blue-greenish upon induction of transcription.
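
The readout logic of panel A and its array inset can be summarised in a few lines: the reporter is induced only when the DB-bait and AD-prey hybrids interact, so an arrayed screen yields a grid of reporter-positive and reporter-negative positions. The toy model below assumes a lookup table of true interactions standing in for the biology. The table, the protein names, and the function names are hypothetical, and the false positives and false negatives of real screens are ignored.

```python
def reporter_on(bait, prey, interactions):
    """Reporter is induced only if the DB-bait and AD-prey hybrids interact."""
    return frozenset((bait, prey)) in interactions


def array_screen(baits, preys, interactions):
    """One row per bait; each entry marks whether that prey position is positive."""
    return {
        bait: [reporter_on(bait, prey, interactions) for prey in preys]
        for bait in baits
    }


# Hypothetical ground truth used in place of real binding data.
interactions = {
    frozenset(("X1", "Y2")),
    frozenset(("X2", "Y1")),
    frozenset(("X2", "Y3")),
}
baits = ["X1", "X2"]
preys = ["Y1", "Y2", "Y3"]

grid = array_screen(baits, preys, interactions)
for bait in baits:
    # '#' marks a reporter-positive colony, '.' a negative one.
    print(bait, "".join("#" if hit else "." for hit in grid[bait]))
```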

[Figure 3.3 B, USPS schematic: (a) no interaction and (b) positive interaction, each showing Nub, Cub and the reporter; on interaction, ubiquitin protease cleavage leads to reporter release and reporter degradation (+/- selection based on reporter). Inset: USPS in yeast, spot columns 1-6, with rows "X and Y interact" and "no interaction", shown under no selection and under selective conditions]

Fig. 3.3. B Ubiquitin split protein system (USPS). C-terminal domain of ubiquitin is fused to protein X (bait) and a reporter. N-terminal domain of ubiquitin (usually some mutant form of Nub with lower affinity for Cub) is fused to protein Y (prey). Interaction of bait and prey proteins brings Nub and Cub fragments in close proximity and drives reconstitution of ubiquitin. This is now recognized by one or more endogenous ubiquitin proteases, resulting in cleavage at the C-terminal end and release of the fused reporter. Depending on the nature of the N-terminal residue of the cleaved reporter, the reporter may remain intact or be targeted for degradation by processes governed by the N-end rule of protein degradation. Consequently, depending on choice of reporter (and cloning strategy), positive or negative selection schemes may be used to monitor protein-protein interactions. Inset: shows use of USPS in yeast employing a reporter (ura3) that undergoes degradation upon release. Yeast cells expressing ura3 reporter are sensitive to the drug 5-FOA (i.e., no growth under selective conditions when no interaction occurs, right panel - row 2 - columns 5-6). In contrast, if X and Y interact, resulting in degradation of reporter, cells become resistant to drug treatment (positive selection, right panel - row 1 - columns 5-6). Columns 1-3 and 4-6 show decreasing amounts of spotted yeast cells.
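
Because the ura3 readout in the inset is inverted (interaction removes the reporter, and loss of the reporter permits growth), the selection logic is easy to misread. The sketch below restates it as two small functions. It is a simplified truth-table model built only from the caption above: the function and parameter names are hypothetical, and partial cleavage or incomplete degradation is not modelled.

```python
def reporter_intact(interaction, destabilizing_n_terminus=True):
    """The reporter persists unless it is released and then degraded.

    Release requires bait-prey interaction (ubiquitin is reconstituted and
    cleaved). A released reporter is degraded only if its new N-terminal
    residue is destabilizing under the N-end rule.
    """
    released = interaction
    degraded = released and destabilizing_n_terminus
    return not degraded


def grows_on_5foa(interaction, destabilizing_n_terminus=True):
    """Cells with active ura3 reporter are 5-FOA sensitive and do not grow."""
    ura3_active = reporter_intact(interaction, destabilizing_n_terminus)
    return not ura3_active


# Print the four combinations as a small truth table.
for interaction in (False, True):
    for destabilizing in (True, False):
        print(interaction, destabilizing, grows_on_5foa(interaction, destabilizing))
```

With a stabilizing N-terminal residue the released reporter stays active, which is why the choice of reporter and cloning strategy decides whether the scheme acts as a positive or a negative selection.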

activity leads to induction of the expression of a responsive reporter (for a more detailed review on Y2H, see Mendelsohn and Brent 1999; Vidal and Legrain 1999). Variations on the Y2H theme have been used to study protein-protein, protein-RNA (SenGupta et al. 1996), protein-peptide (Yang et al. 1995), protein-single chain antibody (Visintin et al. 1999), and protein-small molecule interactions (Licitra et al. 1996). Despite the great utility of this technology in interaction mapping, it also has limitations. It is limited to the analysis of soluble proteins or soluble domains of proteins in the nucleus of yeast cells; i.e., it is not applicable to many signaling events, and not suitable for the analysis of full length transcription factors or membrane proteins. Recent developments utilizing repression domains may, however, make the Y2H system more appropriate to the study of transcription factors (Hirst et al. 2001). Additionally, interactions are analyzed in yeast cells, an environment extraneous to proteins derived from other host systems.

Driven by the limitations of the aforementioned systems, other technologies have been developed. These include the ubiquitin split protein sensor technology (USPS) (Johnsson and Varshavsky 1994; Stagl-
54 CHAPTER 3 Functional Genomics and Proteomics: Basics, Opportunities and Challenges

peA
a.
o interaction
(no enzyme act ivity)

E-fl E-f2

b. Po itive interaction
(enzyme active)

Enzyme
Recon titution
of enzymatic
activity
c

Fig. 3.3. C Protein complementation assay (PCA). Concep- enzyme. Interaction is driven by the interaction of proteins
tually similar to the USPS technology. In this scenario, an X (bait) and Y (prey) fused to the two enzyme fragments.
active enzyme is reconstituted through "forced" interaction Consequently, changes in enzyme activity reflect a change
of N-terminal (Efl) and C-terminal fragments (Ef2) of the in interaction of hybrid proteins

jar et al. 1998; Wittke et al. 1999; Rojo-Niersbach et USPS, this assay is based on the reconstitution of an
al. 2000), and other two component protein fragment enzymatic activity. Interactions can be detected either
complementation assays (PCAs). The USPS technol- by scoring for cellular growth using specific culture
ogy (Fig. 3.3 B) is based upon a protein-protein inter- conditions for DHFR-deficient cells, or by using a flu-
action/dimerization-assisted reassembly of fragmen- orescence-labeled methotrexate (MTX) as a reporter
ted ubiquitin, which results in susceptibility of the re- probe. MTX is capable of binding to DHFR with a
constituted ubiquitin to cellular ubiquitin-specific 1 : 1 stoichiometry. This system is also applicable to
proteases. These recognize the intact ubiquitin struc- diverse host systems, including bacterial systems
ture and can cleave the protein between the C-termi- (Hochschild and Dove 1998; Joung et al. 2000).
nus of ubiquitin and a C-terminally fused reporter There are also other interaction technologies based
protein. Interaction can be measured either as a re- on resonance energy transfer between reporter pro-
lease of reporter activity or loss in reporter activity. teins with fluorescent or bioluminescent properties,
USPS interaction screening can be performed in yeast and these may be used to measure interactions in real
as well as mammalian and plant cells. It is also suit- time. These technologies are known as FRET (fluores-
able for the analysis of membrane proteins and full cence resonance energy transfer) and BRET (biolumi-
length transcription factors. The USPS technology nescence resonance energy transfer). Green fluores-
promises to become yet another high throughput in- cent protein (GFP) and variants thereof (e.g., YFP
teraction screening system for the mapping of cellular and RFP; yellow and red fluorescent proteins) have
circuitry. been used in FRET to monitor protein-protein inter-
Conceptually similar to USPS is a PCA technology actions by measuring energy transfer between these
(Fig. 3.3 C) based on dihydrofolate reductase (DHFR) reporter molecules (Devi 2000). Energy transfer oc-
(Pelletier et al. 1998; Remy et al. 1999). PCA-DHFR is curs when proteins fused to these reporters interact
based on the reconstitution of two DHFR fragments and bring the reporter molecules in close physical
through induced proximity mediated by interacting proximity to one another. The same principle is used
proteins fused to the DHFR fragments. In contrast to in BRET, using yet other variants of GFP (Devi 2000).
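
The reason resonance energy transfer reports on proximity can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50% efficient. The following is only a minimal sketch: the function name is illustrative, and the default R0 of 5 nm is an assumed round value in the range commonly quoted for fluorescent-protein pairs, not a figure taken from this chapter.

```python
def fret_efficiency(r_nm: float, r0_nm: float = 5.0) -> float:
    """Forster transfer efficiency for a donor-acceptor distance r_nm.

    r0_nm is the Forster radius (distance of 50% efficiency); the default
    of 5.0 nm is an assumed, illustrative value.
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distances must be non-negative and R0 positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # Efficiency falls off steeply once the reporters are more than R0 apart,
    # which is why a signal indicates that the fused proteins interact.
    for r in (2.5, 5.0, 7.5, 10.0):
        print(f"r = {r:4.1f} nm  ->  E = {fret_efficiency(r):.3f}")
```

Because of the sixth-power dependence, doubling the distance from R0 reduces the efficiency from one half to under two percent, so in practice only reporters held together by interacting fusion partners give an appreciable signal.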

3.3.2.2
Applications of Molecular Interaction Screening Technologies

The interaction technologies described in the previous section have so far been primarily used in cell culture systems to characterize or identify novel protein-protein interactions. However, USPS and PCA could in principle be extended to incorporate a number of different reporters, some of which could even support detection by non-invasive imaging techniques. This could allow for real time in vivo monitoring of protein-protein interactions, an objective that is currently being pursued by various investigators in the field.

The molecular interaction screening systems discussed here have been applied to the identification of multiple types of interactions. These include protein-protein interactions, peptide-protein interactions, single chain antibody-protein target interactions, and small molecule-target interactions (see Sect. 3.4.2). Consequently, they can also be utilized to find new surrogate ligands, as can also be achieved with phage display approaches (see Sect. 3.3.3), for probing the function of genes and proteins. Such surrogate ligands may be used, for example, in the development of novel peptide or antibody-based imaging agents, and in the mapping of cellular signaling pathways and proteome networks.

To date, application of molecular interaction screening technologies to large scale mapping of cellular signaling pathways has been pioneered by the Y2H technology, primarily because Y2H is well established, and because yeast is a model system that can easily be adapted and manipulated for the screening of large and complex cDNA expression libraries. It has been directed toward a mapping of "interactomes" of yeast (Ito et al. 2000, 2001; Uetz et al. 2000), C. elegans (Walhout et al. 2000), and the infectious agents H. pylori (Rain et al. 2001) and vaccinia virus (McCraith et al. 2000). Characterization of other model system "interactomes," such as the Drosophila "interactome," is underway. Numerous studies utilize Y2H and other interaction technologies in the characterization of human protein interactions and the mapping of cellular signaling pathways, with the goal of establishing maps of the human "interactome."

Much is to be gained by proteomics efforts to develop refined methods for measuring macromolecular interactions and for the large scale mapping of "interactomes." They represent cornerstone activities in (1) the elucidation of the function and regulation of genes, proteins, and signaling pathways; (2) the discovery and validation of drug discovery targets; and (3) the development of assays for monitoring specific pathway activities in multiple experimental settings (see also Sect. 3.5). Integration of data sets emerging from genomics and functional genomics research with data sets obtained from large scale proteomics studies will lead to a better understanding of the molecular basis of cellular processes and will facilitate the manipulation of such processes for the development of therapeutic agents.

3.3.3
Probing the Proteome: Discovery of Surrogate Ligands

Having the complete sequence of an entire genome in databases is not sufficient to elucidate the biological function of the encoded genes. There is no strict linear relationship between genes and encoded proteins since the protein product of one gene might be present in many isoforms and post-translationally modified versions in the same cell. Therefore, probing protein function inside the cell requires the ability to differentiate individual proteins, isoforms, and modified derivatives from one another. This can be achieved through the use of surrogate ligands that bind with high specificity and affinity to the protein of interest. Most often, surrogate ligands are either antibodies or peptides that are isolated from large combinatorial libraries. In the simplest case, surrogate ligands are used to detect the protein target inside the cell and in complex biological mixtures. In more complex situations, the surrogate ligands not only bind to the target protein but also modulate its activity. They may inhibit the target's biochemical activity, prevent its interaction with regulatory subunits, or interfere with its interactions with other cellular proteins.

3.3.3.1
Protein-Antibody Interaction Screening Technologies

In the immune system, vast libraries of antibodies and T-cell receptors recognize virtually any foreign entity. When the interaction of a specific antibody with an antigen occurs, proliferation of the antibody-producing cells is stimulated, leading to the selective amplification of that particular antibody-producing clone. Because of this design, the immune system efficiently selects for antibodies capable of recognizing virtually any molecular shape.

This central feature of the immune system has been exploited to generate polyclonal and monoclonal antibodies against specific targets. The theoretical and practical issues of producing polyclonal and monoclonal antibodies are discussed in great detail elsewhere (Harlow and Lane 1998). Early methods of antibody generation and isolation were time-consuming and hampered by low throughput. The emergence of antibody display methods and the availability of antibody display libraries have overcome these shortcomings and made possible the simultaneous isolation of antibodies that specifically recognize dozens or even hundreds of targets.

This breakthrough was made possible by two technical developments. First, it was demonstrated that libraries of randomly recombined heavy and light chains can generate antibodies with defined specificity (Huse et al. 1989). Second, it was shown that foreign DNA fragments can be inserted into gene III of filamentous phages (such as f1 and M13), which encodes phage coat protein pIII, to create a fusion protein with the foreign sequence at the N-terminus (Smith 1985). The fusion protein is incorporated into the virion, which retains infectivity and displays the foreign protein in a form that can bind to other proteins. The phage displaying a protein with the desired binding specificity to an immobilized target can be extracted from the total population of recombinant phages by affinity selection. The recovered, "affinity-enriched" phage population can be amplified and subjected to another round of selection. This process of selective enrichment for proteins with the desired binding specificity is called "panning," and is made possible by the linkage of the genotype (the foreign DNA fragment inside the virion) to the phenotype (the binding affinity of the displayed protein domain).

Antibody libraries displayed on the surface of phage are either single-chain (scFv) or fragment-antigen-binding (Fab) libraries. In the Fab libraries, gene fragments encoding entire light- and heavy-chain antibody binding regions are amplified by PCR from immune cells, combined and displayed on the surface of phage. In single-chain antibody libraries, only the variable domains of the light chain (VL) and heavy chain (VH) are amplified, spliced together and displayed. Only these VH and VL regions mediate antigen recognition in the Fab fragment.

A typical phage display antibody library contains approximately 10^10 to 10^11 antibody fragments, which corresponds to the number of different binding specificities and affinities. Antibodies that recognize an immobilized target are enriched from this library by cycles of selective enrichment, ultimately leading to the isolation of a subset of antibodies that may recognize different epitopes on the target and have different binding affinities (for review see Barbas et al. 2001). This process of selective enrichment is highly amenable to automation, allowing the parallel screening of the library with a large number of different antigens. This feature makes antibodies a very valuable tool to study the entire proteome.

3.3.3.2
Combinatorial Peptide Libraries and Surrogate Ligand Discovery

Surrogate peptide ligands, like antibodies, specifically recognize target proteins and can be isolated from combinatorial peptide libraries. In these libraries, random oligonucleotides are inserted into a filamentous phage gene III or gene VIII, allowing the fusion of the encoded random peptides to the N-terminus of the coat proteins pIII or pVIII, respectively. Peptides displayed as pIII fusions can be present in up to 5 copies per virion because there are 5 copies of pIII protein on each virion. The number of pVIII fused peptides can vary from dozens to hundreds per virion because there are around 3000 pVIII proteins on every virion. Typically, higher-binding-affinity peptides can be isolated from the pIII library, because the avidity effect of multivalent display is weaker at the lower pIII copy number.

Peptide libraries may differ in the length of the random region and may contain fixed residues that impose predetermined structural constraints on the peptides, such as disulfide bridged loops or α-helices. The peptides may also be displayed on the surface of folded proteins or protein domains (for review, see Smith and Petrenko 1997; Barbas et al. 2001).

Random peptide libraries are searched for peptides that bind to the target protein using cycles of affinity selection as described for antibodies. Peptide ligands that recognize the target can be found because only a limited number of critical residues are required for the specific interaction between the peptide and the target. Peptides usually bind at or near the natural ligand-binding site or active site of the target, thereby acting as natural ligand agonists or antagonists.

Random antibody or peptide libraries can also be enriched through binding to intact cells and used to identify ligands for cell surface receptors. These surrogate antibody or peptide ligands can be used to discriminate between different cell types with distinct peptide binding signatures, employed as reagents for receptor identification and cloning, or used as tools to induce receptor-mediated phenotypic changes in target cells.

Recently, another type of random peptide library has been developed for the intracellular expression of peptides. The expressed peptides ("perturbagens") may perturb the biological activity of their intracellular targets, leading to detectable phenotypic change. Peptides are recovered from cells that exhibit the desired phenotype, and are used in subsequent experiments to isolate their target(s). The peptide/target pair could form the basis for mechanism-based drug discovery efforts (Caponigro et al. 1998).

3.3.3.3
Implications for Diagnostics, Prognostics and Therapy

Surrogate ligands may form the basis for the development of diagnostic and prognostic agents if alterations in the concentration, localization, or modification of a target are linked to disease development and/or therapeutic outcome. Using these ligands, in vitro assays can be developed to detect and measure the target in tissue biopsies or body fluids. The linkage of the surrogate ligands to an imaging agent may allow the development of a new generation of in vivo diagnostic and prognostic tools. Antibody and peptide ligands may also be used for the generation of protein chips for the parallel detection of tens to hundreds of proteins in one and the same biological sample.

Surrogate ligands may also modulate the biological activity of their target molecules, and may therefore provide a route for the development of therapeutic agents. Antibodies have an excellent biodistribution profile and are stable in vivo. This property, combined with the relatively straightforward generation of fully human antibodies, makes antibodies suitable drug candidates if the goal is to target cell surface receptors and secreted proteins. The promise of antibodies as therapeutic agents has been substantiated by the recent success of antibody drugs such as Herceptin and Rituxan. Currently, there are hundreds of antibodies in development for multiple therapeutic indications.

Peptides are less stable in vivo than antibodies, and may require chemical modification and/or the development of sustained release formulations for the delivery of the peptide drug. Nonetheless, there are currently over ten peptide drugs on the market and many more are in development. Surrogate peptide ligands provide useful information for nonpeptidic drug development as well, since the interaction of the peptide with the target provides critical structural information that could be exploited either by screening for nonpeptidic molecules with the same binding specificity, or by a more systematic chemistry effort to incorporate the critical interactions revealed by the peptide binding to the target into synthetic small molecules.

3.4
Chemical Genomics and Proteomics

As briefly discussed in the introduction of this chapter, chemical genomics and proteomics refer to the systematic analysis of the impact of organic small molecules on the genome and proteome (MacBeath 2001). The power of this approach is that global effects of drug treatment can be studied. In the same way that systems biology using genomics is supplanting studies of individual genes, chemical genomics could potentially replace the analysis of drugs and small molecules using conventional assays that measure individual cellular parameters.

Chemical genomics may address a number of questions: What biological processes are affected by drug treatment? What is the mechanism of action and what are the precise molecular target(s) through which a compound achieves a therapeutic effect? What additional targets are affected? Are these additional targets associated with toxicity, or with other potential therapeutic applications? Chemical genomics can also be applied to the design of small molecules that can probe the function of genes and proteins in vitro and in vivo.

Multiple strategies that seek to meet such objectives have been developed in recent years, and integrate many aspects of biology and chemistry. Here we focus primarily on (a) large scale gene expression profiling studies, and (b) the search for and characterization of targets of small molecules.

3.4.1
Impact of Small Molecules on Genome Expression

One branch of chemical genomics utilizes differential-gene-expression technologies to analyze the biological effects of drug treatment. The focus here is on DNA

microarrays (described in Sect. 3.2.1.1). RNA, extracted from cultured cells or from tissues of animals, is reverse transcribed, labeled, and used to probe whole genome arrays or arrays containing sequences that represent a subset of the genome. Differences in gene expression, which may be caused by drug treatment or genetic mutation, are monitored. Expression profiles of test compounds are compared to those of reference compounds with known mechanisms of action and to those of mutations in genes of known function. By pattern matching, the cellular pathways and even the molecular targets of compounds of unknown mechanism can be revealed. The more comprehensive the reference data set, the greater the probability that compounds can be accurately categorized.

In one example, expression profiling in yeast revealed that the molecular target of dyclonine, a topical anesthetic, is ERG2, a gene required for ergosterol biosynthesis. The human homolog of ERG2 is the sigma receptor, a neurosteroid-interacting protein that regulates potassium conductance (Hughes et al. 2000), suggesting that this receptor may represent the human target of dyclonine. This study highlights how a non-human model organism may be utilized to elucidate the molecular mechanism of a therapeutic agent in human cells.

Good drugs are specific for a biological target and have minimal effects on other targets. A major objective in developing new drugs is to determine if a compound is specific for the intended target. Chemical genomics provides a comprehensive approach to the analysis of the genome-wide consequences of a given perturbation, as well as a method for identifying unanticipated activities of drugs, i.e., off-target effects. Using DNA microarray technology, the expression levels of thousands of genes are measured simultaneously, giving a "fingerprint" of a compound's activity at the genomic level. Genetic and pharmacologic inhibition of gene function can result in extremely similar changes in gene expression, thus providing a method for confirming a potential target. The pattern of gene expression in cells containing a deletion of the potential target, when studied in drug-treated vs. untreated cells, can also provide information about off-target effects. This approach has been used in studies of the immunosuppressants FK506 and cyclosporin, confirming calcineurin as the molecular target of these drugs and identifying off-target effects of FK506 (Marton et al. 1998).

A major problem in the pharmaceutical industry is the high failure rate of compounds that enter development. Fifty percent of drugs are lost between nomination for development and clinical use. There is an 80% failure after compounds enter the clinic due to lack of efficacy, toxicity issues, or competition, to give an overall failure rate of 90% (0.5 x 0.2 = 0.1, i.e., only about one nominated compound in ten succeeds). Efforts are underway to use genomics to predict compound toxicity. This involves the generation of expression profiles for various tissues (e.g., liver and kidney) from animals treated with compounds of known toxicity. The relationship between the expression profiles can then be compared to traditional measures of toxicity. The goal of this approach is to develop predictive tools for the assessment of toxicity at an early stage in the drug discovery process to decrease the failure rate of drugs that enter development.

In summary, large scale gene expression analysis is a powerful approach for monitoring and decoding the cellular pathways affected by drug treatment. Such analysis relies on a reference database of profiles generated from compounds of known mechanism and toxicity, or from mutations in genes of known function. Pattern recognition is employed to determine what cellular pathways are affected by chemical perturbation. This tool has the potential to rapidly assess the specificity and selectivity of compounds in the drug discovery process, identify compounds with novel mechanisms of action, and define new therapeutic applications for compounds that have unanticipated effects.

3.4.2
Small Molecule-Target Interactions

Classic approaches for detection of ligand-protein receptor interactions have relied primarily on in vitro biochemical methods, including radiolabeled ligand binding, affinity chromatography, and photo-crosslinking. These methods are often laborious and time-consuming, and suffer from the requirement of obtaining sufficient material for purification, peptide sequence analysis, and subsequent cloning of the cDNAs encoding these targets. Technological advances in solid support technology (Shimizu et al. 2000) and protein separation and analysis methods (see Sect. 3.3.1) are improving the range of applications of biochemical methods. However, higher throughput and more generally applicable methods for the detection of interactions between organic small molecules and their targets are necessary in order to speed up the drug discovery process. Here we describe a few cell-based technologies that conceptually derive from two-hybrid interaction screening technologies (see Sect. 3.3.2).

[Figure 3.4: schematic of Y3H with a CID ligand. Panel a: no interaction. Panel b: positive interaction. Each panel shows the DB-binding site upstream of the reporter gene. Example: LBD, DHFR (dihydrofolate reductase); ligand, MTX-SMOL (methotrexate linked to small molecule).]

Fig. 3.4. Monitoring of small molecule-protein target interactions. A schematic representation of a variation of the Y2H system (see Fig. 3.3A), adapted for the analysis of the interaction of proteins as mediated by a cell permeable small molecule (compound induced dimerization, CID). A bifunctional small molecule is synthesized that incorporates an entity with known target binding properties (e.g., methotrexate, MTX; marked by a symbol in the figure) fused via a linker to a compound of interest (square with light gray shading). Interaction of the chimeric small molecule with the ligand binding domain (LBD; e.g., DHFR) of the DB-fusion protein and the prey target molecule (Y) of the AD-fusion protein promotes CID and induction of the transcription of the reporter gene. An AD-cDNA library may be used to screen for prey proteins that interact with a small molecule of interest

One method that highlights the concept of two-hybrid interaction screening technologies as applied to assaying compound-mediated protein dimerization is the yeast three-hybrid (Y3H) technology (Licitra and Liu 1996). It shares many properties with its predecessor, the Y2H technology (Fig. 3.3A), but differs in that interaction between fusion proteins is mediated by a bridging small molecule (Fig. 3.4). The bridging small molecule is a bifunctional synthetic ligand that is used to induce protein dimerization. In brief, one component of the bifunctional ligand is an entity with known target binding properties (for example, methotrexate, which binds with high affinity to the target protein DHFR; Lin et al. 2000), and the other component is the compound of interest. These two components are linked by a specifically designed linker. Interaction of the compound of interest with a target protein (the "prey") results in a compound-mediated dimerization (recruitment) of two distinct target fusion proteins (e.g., DHFR fused to a DNA-binding protein, DB-DHFR, and the prey fused to a transcription activation domain, prey-AD), forming a complex that bears the properties of a functional transcription factor (e.g., a complex such as: DB-DHFR x MTX-cpd x Target-AD). This transcription factor can activate specific reporter genes integrated into the yeast genome. Thus, binding of a small molecule to its target protein may be monitored via the activation of reporters that promote the growth of yeast cells (when auxotrophic markers are used) or that can be assayed using colorimetric assays (e.g., LacZ assay). Other systems that are designed for protein-protein interaction screening (see Sect. 3.3.2) may also be adapted for the analysis of small molecule-target interaction. This is a very powerful demonstration of how integration of proteomics and chemistry can lead to development of novel platforms that accelerate drug discovery.
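
The logic of a Y3H screen can be summarized in a toy model: the reporter is switched on only when the anchor moiety of the bifunctional ligand binds the DB-fused ligand binding domain and, at the same time, the compound of interest binds the AD-fused prey. The sketch below is purely illustrative. All names (BifunctionalLigand, reporter_active, screen_library, "cpdA", "kinaseX" and the other library entries) are hypothetical, and binding is reduced to a yes/no lookup, which ignores affinity, cell permeability, and expression levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BifunctionalLigand:
    anchor: str    # moiety with a known receptor, e.g. "MTX" for DHFR
    compound: str  # compound of interest, joined to the anchor by a linker


def reporter_active(ligand, bait_lbd, prey, binders):
    """Return True if the ligand bridges DB-bait and prey-AD.

    `binders` is a set of (small molecule, protein) pairs assumed to bind.
    """
    anchor_bound = (ligand.anchor, bait_lbd) in binders
    prey_bound = (ligand.compound, prey) in binders
    return anchor_bound and prey_bound


def screen_library(ligand, bait_lbd, library, binders):
    """Return the preys from an AD-fusion library that activate the reporter."""
    return [p for p in library if reporter_active(ligand, bait_lbd, p, binders)]


if __name__ == "__main__":
    # Hypothetical binding data used only for illustration.
    binders = {("MTX", "DHFR"), ("cpdA", "kinaseX")}
    ligand = BifunctionalLigand(anchor="MTX", compound="cpdA")
    library = ["kinaseX", "proteinY", "proteinZ"]
    print(screen_library(ligand, "DHFR", library, binders))
```

The model also makes the usual control explicit: if the bait lacks a domain that binds the anchor, no prey can score positive, whatever its affinity for the compound of interest.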

Cellular interaction screening systems for monitoring drug-target interactions will find uses in many different settings. They provide more general and rapid methods for large scale screening of cDNA libraries for target proteins (target discovery). They are not hampered by scarcity of biological samples. They can be employed in high throughput mutagenesis studies to discover receptors with higher affinity to small molecule ligands (target evolution), profile the interaction of small molecules with target polymorphic variants (pharmacogenomics/proteomics), study structure-function relationships, and develop new types of high throughput small molecule screening assays for drug discovery. The synthesis of new kinds of synthetic small molecule libraries that are compatible with three-hybrid systems has already been reported (Koide et al. 2001).

Recently, an interesting approach for engineering small molecules that disrupt protein function has been reported. This approach can be used to investigate the function of genes that, until now, have defied functional analysis because of inherent limitations in the genetic manipulation of mammalian cells (Bishop et al. 2000). It integrates biology, structural biology and chemistry. Shokat and colleagues focused on a class of enzymes known as kinases. They mutated a selected kinase gene to change the site on the enzyme that binds ATP. The result was an enlarged ATP-binding pocket that did not, however, affect the enzyme's normal function. By adding various bulky chemical groups to small molecules already known to inhibit kinases, they identified analogs of such inhibitors that specifically fit into the enlarged active site and selectively inhibited the mutagenized variant but not the wild-type form of the enzyme. A nonspecific kinase inhibitor was thus transformed into a highly specific one. This allowed them to study the function of certain kinases in model systems such as yeast, in which the wild-type form of the kinase of interest was replaced by the mutant inhibitor-responsive variant. The concept of "drug-target fitting" has previously been applied to proteins other than kinases. Clackson et al. (1998) reported the synthesis of analogs of the FK506 immunosuppressive agent that bind mutant forms of FKBP with higher affinity than wild-type FKBP. Such analogs are being used in gene therapy settings as small molecule dimerizers that induce protein-protein interactions and thereby activate various types of biological processes in a controlled fashion (Clackson 2000).

Other approaches to the analysis of small molecule-target interactions take advantage of recent advances in array technology developed for various functional genomics applications such as gene expression profiling. Schreiber and his colleagues, for instance, have created microarrays printed with libraries of small molecules (MacBeath et al. 1999). These arrays may be interrogated with diverse target populations, including purified proteins or even cell extracts, to detect known as well as novel types of interactions. Although still in their infancy, the further development of such array technology platforms will likely result in opportunities to profile the mechanism of action of small molecule drugs.

3.5
Functional Genomics and Proteomics: Implications for Molecular and Nuclear Medicine

In the context of the revelation of the blueprint of genomes, functional genomics and proteomics have begun to have an impact on many disciplines in basic science. Obtaining more "global views" of the multiplicity of components encoded by genomes, and how they interact and are regulated at the level of the proteome, is accelerating annotation of the genome, the understanding of gene function, the dissection of molecular signaling pathways, and the validation of drug discovery targets. Such enlarged views accelerate our understanding of the molecular basis of cellular processes, and how these regulate physiological and pathological phenotypes. Integration of data sets obtained from large scale "survey" profiling studies with those from more focused gene and protein function studies that biologists are traditionally trained to seek will constantly refine the picture of cellular circuitry with a goal of unraveling the many mysteries underlying organ and whole body system functions.

The impact of functional genomics and proteomics is not limited to basic science research. These disciplines have already begun to have an impact on medical sciences in numerous ways, including the discovery of novel drug targets (target discovery), the introduction of new guiding principles in medicinal chemistry for the development of novel mechanism-based therapeutics (drug discovery), the anticipation of potential side effects of drugs based on an understanding of their molecular mechanism of action (toxicogenomics), and the establishment of new guiding principles in the stratification of patient populations on the basis of an understanding that drug response
is influenced by genotype (pharmacogenomics).

Functional genomics and proteomics research is leading to the discovery of many novel types of markers
for disease diagnosis and prognosis. For instance, the emerging large number of new tumor-cell markers,
discovered in part using microarray-based gene expression profiling, are likely to allow physicians to
eventually tailor cancer therapies to their individual patients. Further, response to a specific
therapy may be predicted not only on the basis of the patient's genotype, but also by the expression
pattern of a set of specific markers. We are still some way from translating clues provided by gene and
protein expression studies into validated diagnostic or prognostic tools, but the path is clearly
visible. As certain functional genomics and proteomics tools become more refined and more widely used
in the clinical arena, they will become an integral part of up-to-date medicine.

Functional genomics and proteomics are also providing the basic and medical scientist with an
increasing number of new research tools for probing genomes and proteomes. These include small
molecules, peptides, and antibodies, many of which are relevant to the field of nuclear medicine. Two
major foci of nuclear medicine - non-invasive imaging of organs and the destruction of cancer cells,
which rely largely on the localization of radioactive compounds within organs of the body - will
benefit from the current genomics and proteomics revolution. Several benefits derive from technological
developments as described in this review alone. A sampling follows. (1) Profiling-mediated discovery of
novel tissue-specific markers, in particular cell surface markers, will spur the development of more
specific imaging agents and therapeutics. (2) Non-invasive reporter assays (such as PET reporters; see
Sect. 3.2.1.1.2), when utilized in conjunction with genome-manipulation technologies (e.g.,
transgene-expression technologies and directed-gene "knock-in" methodologies), should enable the in
vivo monitoring of gene expression. Such efforts would have direct implications for gene therapy as
well. (3) The discovery of numerous new types of receptor/target ligands, including peptide- and
antibody-based ligands (for both extracellular and intracellular targets), could form the basis of new
imaging and therapeutic agents. (4) A better characterization of the target interaction landscape of
small molecule drugs should improve the analysis and understanding of drug-target interaction imaging
data. (5) The discovery of signaling pathways and new enzymes within these pathways, in conjunction
with new strategies in assay development, should lead to the development of new agents that enable the
monitoring of the activity of specific enzymes and pathways in vivo. For example, new tools have
recently been developed for the imaging of metalloprotease activity in vivo.

Cancer studies have shown a positive correlation between cancer progression and expression of
extracellular proteinases, such as the metalloproteinases (MMPs). Malignant cells depend on these
proteinases in order to disrupt basement membranes, invade neighboring tissues and metastasize to
different organs. A method that would allow in vivo measurement of MMP activity would be highly
desirable, in particular for monitoring the therapeutic efficacy of any anti-MMP drug.

Bremer et al. (2001) have developed a method that is based on non-immunogenic MMP substrates that
accumulate in tumors and can be used as enzymatic reporter probes. Cleavage by MMPs converts the probe
into a fluorescent product, which is detected using near-infrared fluorescence imaging (NIRF), a newly
developed technology. Similar methods may be followed for the analysis of other proteases, although
this is more difficult for intracellular proteases. However, whole cell assays for the monitoring of
the cellular activity of cellular proteases have been and are being developed: for example, in
substrate-based fluorescence assays for apoptotic caspases the cleaved product is detected on the basis
of an induced fluorescence signal. Such caspase assays are useful in monitoring the activity of
cellular circuits converging on this important class of enzymes, and it is possible that future
technological developments will translate the use of this kind of assay into development of in vivo
imaging agents.

Further progress in the development of technologies that enable the monitoring of the association or
dissociation of macromolecules, such as protein-protein interactions, may also lead to novel means by
which to measure the activity of signaling pathways in intact cells. For instance, as previously
discussed in this review, interaction technologies based on the induced proximity of fluorescent or
bioluminescent reporter molecules are already being used in monitoring protein-protein interactions in
intact cells. Similarly, protein fragment complementation assays, in which an interaction between
macromolecules promotes the reconstitution of reporter activity, may in the future be used to monitor
in vivo activity of specifically induced interactions and signaling pathways. If successfully applied
to reporters that can be monitored with noninvasive imaging methods, such proteomics technologies could
prove useful for applications in nuclear medicine. Technologies based on small-molecule-induced
dimerization of macromolecules are already being applied in gene therapy settings (Clackson 2000).

Significant advances in small molecule or macromolecule delivery technologies have been made in recent
years. For instance, peptide sequences encoded by diverse proteins (such as HIV-Tat and antennapedia)
have been shown to be effective signals in shuttling peptides, and even proteins, into mammalian cells
(Ford et al. 2001). Applications of these peptides for in vivo delivery have also been reported.
Efforts inspired by these discoveries have led to the design of synthetic molecules that can be linked
to other small molecules of interest and promote their uptake into cells (Wender et al. 2000). Such
developments promise to have significant implications for basic science research and drug delivery
strategies. As discussed previously in this review, functional genomics and proteomics-based
combinatorial functional screens can lead to the discovery of target-specific peptide surrogate
ligands, which, in combination with delivery and labeling technologies, may be used in target function
studies, and, possibly, as future imaging agents for intracellular targets.

3.6
Functional Genomics and Proteomics: Opportunities and Challenges

The potential for functional genomics and proteomics is almost unlimited. Practiced at a smaller scale,
research that utilizes genomics and proteomics technologies is already generating important information
about gene and protein functions and the molecular basis of disease, whereas large-scale "surveying"
and "cataloging" studies are currently giving rise to complex information we cannot yet fully
understand. Integrating the newly emerging "global view" approach, or systems biology, with more
traditional gene-by-gene function studies will be critical in the functional mapping of the human
genome. Potential applications in clinical medicine have been discussed throughout this review.

An interdisciplinary spirit will be necessary in order to fully reap the benefits from the genomics
revolution. Geneticists will need to communicate with chemists, physiologists with physicists, cell
biologists with computer scientists, and so on. Especially challenging will be the development of
coherent and uniform genomic information storage platforms and of tools that allow the scientist to
investigate and understand relationships among diverse data elements. Genomics research is utterly
dependent on computer science. Thus, biologists will need to understand the concepts behind relational
databases, concepts that include the reduction of a vast amount of information to defined types of
entities and their enumeration, and to relations between such entities - concepts that date only from
the 1970s (Codd 1998). The evolving multidisciplinary genomic and proteomic sciences will define a new
scientific activity that integrates basic and applied sciences, an important objective of which is
improved diagnosis, treatment, and prevention of disease.

Acknowledgements. We would like to thank Dr. Margaret Lee Kley for critical reading of the manuscript
and many helpful suggestions, and Zephyr Secher for help in preparing the manuscript.

3.7
References

Akiyama N, Matsuo Y, Sai H et al (2000) Identification of a series of transforming growth factor beta-responsive genes by retrovirus-mediated gene trap screening. Mol Cell Biol 20:3266-3273
Alizadeh AA, Eisen MB, Davis RE et al (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403:503-511
Alon U, Barkai N, Notterman DA et al (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA 96:6745-6750
Barbas CF, Burton DR, Scott JK, Silverman GJ (2001) Phage display: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York
Bishop AC, Ubersax JA, Petsch DT et al (2000) A chemical switch for inhibitor-sensitive alleles of any protein kinase. Nature 407:395-401
Bittner M, Meltzer P, Chen Y et al (2000) Molecular classification of cutaneous malignant melanoma by gene expression profiling. Nature 406:536-540
Borrebaeck CA (2000) Antibodies in diagnostics - from immunoassays to protein chips. Immunol Today 21:379-382
Borrebaeck CA, Ekstrom S, Hager AC et al (2001) Protein chips based on recombinant antibody fragments: a highly sensitive approach as detected by mass spectrometry. Biotechniques 30:1126-1132
Bremer C, Tung CH, Weissleder R (2001) In vivo molecular target assessment of matrix metalloproteinase inhibition. Nat Med 7:743-748
Brown PO, Botstein D (1999) Exploring the new world of the genome with DNA arrays. Nat Genet 21:33-37
Brown SD, Nolan PM (1998) Mouse mutagenesis - systematic studies of mammalian gene function. Hum Mol Genet 7:1627-1633
Caponigro G, Abedi MR, Hurlburt AP et al (1998) Transdominant genetic analysis of a growth control pathway. Proc Natl Acad Sci USA 95:7508-7513
Carnero A, Hudson JD, Hannon GJ et al (2000) Loss-of-function genetics in mammalian cells: the p53 tumor suppressor model. Nucleic Acids Res 28:2234-2241
Chakravarti A (2001) To a future of genetic medicine. Nature 409:822-823
Cheung VG, Morley M, Aguilar F et al (1999) Making and reading microarrays. Nat Genet 21:15-19
Chien C, Bartel PL, Sternglanz R et al (1991) The two hybrid system: a method to identify and clone genes for proteins that interact with a protein of interest. Proc Natl Acad Sci USA 88:9578-9582
Cho RJ, Campbell MJ, Winzeler EA et al (1998) A genome-wide transcription analysis of the mitotic cell cycle. Mol Cell 2:65-73
Clackson T (2000) Regulated gene expression systems. Gene Ther 7:120-125
Clackson T, Yang W, Rozamus LW et al (1998) Redesigning an FKBP-ligand interface to generate chemical dimerizers with novel specificity. Proc Natl Acad Sci USA 95:10437-10442
Codd EF (1998) A relational model of data for large shared data banks. 1970. MD Comput 15:162-166
De Leenheer AP, Lefevere MF, Lambert WE, Colinet ES (1985) Isotope-dilution mass spectrometry in clinical chemistry. Adv Clin Chem 24:111-161
De Risi JL, Iyer V, Brown PO (1997) Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278:680-686
Deiss LP, Kimchi A (1991) A genetic tool used to identify thioredoxin as a mediator of a growth inhibitory signal. Science 252:117-120
Deiss LP, Feinstein E, Berissi H et al (1995) Identification of a novel serine/threonine kinase and a novel 15-kD protein as potential mediators of the gamma interferon-induced cell death. Genes Dev 9:15-30
Devi LA (2000) G-protein-coupled receptor dimers in the lime light. Trends Pharmacol Sci 21:324-326
Diehn M, Eisen MB, Botstein D et al (2000) Large-scale identification of secreted and membrane-associated gene products using DNA microarrays. Nat Genet 25:58-62
Duggan DJ, Bittner M, Chen Y et al (1999) Expression profiling using DNA microarrays. Nat Genet 21:10-14
Eisen MB, Brown PO (1999) DNA arrays for analysis of gene expression. In: Weissman SM (ed) cDNA preparation and characterization. Methods in enzymology. Academic Press, San Diego, pp 179-205
Eisen MB, Spellman PT, Brown PO et al (1998) Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95:14863-14868
Fields S, Song OK (1989) A novel genetic system to detect protein-protein interactions. Nature 340:245-246
Fodor SPA, Read JL, Pirung MC et al (1991) Light-directed, spatially addressable parallel chemical synthesis. Science 251:767-773
Ford KG, Souberbielle BE, Darling D, Farzaneh F (2001) Protein transduction: an alternative to genetic intervention? Gene Ther 8:1-4
Friddle CJ, Koga T, Rubin EM et al (2000) Expression profiling reveals distinct sets of genes altered during induction and regression of cardiac hypertrophy. Proc Natl Acad Sci USA 97:6745-6750
Fung ET, Thulasiraman V, Weinberger SR, Dalmasso EA (2001) Protein biochips for differential profiling. Curr Opin Biotechnol 12:65-69
Gallagher WM, Cairney M, Schott B et al (1997) Identification of p53 genetic suppressor elements which confer resistance to cisplatin. Oncogene 14:185-193
Gambhir SS, Barrio JR, Wu L et al (1998) Imaging of adenoviral-directed herpes simplex virus type 1 thymidine kinase gene expression in mice with ganciclovir. J Nucl Med 39:2003-2011
Gambhir SS, Barrio JR, Phelps ME et al (1999) Imaging adenoviral-directed reporter gene expression in living animals with positron emission tomography. Proc Natl Acad Sci USA 96:2333-2338
Garkavtsev I, Kazarov A, Gudkov AV et al (1996) Suppression of the novel growth inhibitor p33ING1 promotes neoplastic transformation. Nat Genet 14:415-420
Golub TR, Slonim DK, Tamayo P et al (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531-537
Goerg A, Obermaier C, Boguth G et al (2000) The current state of two-dimensional electrophoresis with immobilized pH gradients. Electrophoresis 21:1037-1053
Gudkov AV, Zelnick CR, Kazarov AR et al (1993) Isolation of genetic suppressor elements, inducing resistance to topoisomerase II-interactive cytotoxic drugs, from human topoisomerase II cDNA. Proc Natl Acad Sci USA 90:3231-3235
Gudkov AV, Roninson IB (1997) Isolation of genetic suppressor elements (GSEs) from random fragment cDNA libraries in retroviral vectors. Methods Mol Biol 69:221-240
Gudkov AV, Roninson IB, Brown R (1999) Functional approaches to gene isolation in mammalian cells. Science 285:299
Gygi SP, Aebersold R (2000) Mass spectrometry and proteomics. Curr Opin Chem Biol 4:489-494
Gygi SP, Rochon Y, Franza BR, Aebersold R (1999a) Correlation between protein and mRNA abundance in yeast. Mol Cell Biol 19:1720-1730
Gygi SP, Rist B, Gerber SA et al (1999b) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol 17:994-999
Gyuris J, Golemis E, Chertkov H et al (1993) Cdi1, a human G1 and S phase protein phosphatase that associates with Cdk2. Cell 75:791-803
Haab BB (2001) Advances in protein microarray technology for protein expression and interaction profiling. Curr Opin Drug Discov Devel 4:116-123
Hacia J (1999) Resequencing and mutational analysis using oligonucleotide microarrays. Nat Genet 21:42-47
Hannon GJ, Sun P, Concklin DS et al (1999) MaRX: an approach to genetics in mammalian cells. Science 283:1129-1130
Harlow E, Lane D (1998) Using antibodies: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York
Hirst M, Ho C, Sabourin L et al (2001) A two-hybrid system for transactivator bait proteins. Proc Natl Acad Sci USA 98:8726-8731
Hochschild A, Dove S (1998) Protein-protein contacts that activate and repress prokaryotic transcription. Cell 92:597-600
Holzmayer TA, Pestov DG, Roninson IB (1992) Isolation of dominant negative mutants and inhibitory antisense RNA sequences by expression selection of random DNA fragments. Nucleic Acids Res 20:711-717
Huang R (2001) Detection of multiple proteins in an antibody-based protein microarray system. J Immunol Methods 255:1-13
Hudson JD, Shoaibi MA, Maestro R et al (1999) A proinflammatory cytokine inhibits p53 tumor suppressor activity. J Exp Med 190:1375-1382
Hughes TR, Marton MJ, Jones AR et al (2000) Functional discovery via a compendium of expression profiles. Cell 102:109-126
Huse WD, Sastry L, Iverson SA et al (1989) Generation of a large combinatorial library of the immunoglobulin repertoire in phage lambda. Science 246:1275-1281
International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409:860-921
Ishida Y, Leder P (1999) RET: a polyA-trap retrovirus vector for the reversible disruption and expression monitoring of gene in living cells. Nucleic Acids Res 27:35
Ito T, Tashiro K, Muta S et al (2000) Toward a protein-protein interaction map of the budding yeast: a comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. Proc Natl Acad Sci USA 97:1143-1147
Ito T, Chiba T, Ozawa R et al (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci USA 98:4569-4574
Iyer VR, Eisen MB, Ross DT et al (1999) The transcriptional program in the response of human fibroblasts to serum. Science 283:83-87
Iyer VR, Horak CE, Scafe CS et al (2001) Genomic binding sites of the yeast cell cycle transcription factors SBF and MBF. Nature 409:533-538
Jones VW, Kenseth JR, Porter MD et al (1998) Microminiaturized immunoassays using atomic force microscopy and compositionally patterned antigen arrays. Anal Chem 70:1233-1241
Joos TO, Schrenk M, Hopfl P et al (2000) A microarray enzyme-linked immunosorbent assay for autoimmune diagnostics. Electrophoresis 21:2641-2650
Johnsson N, Varshavsky A (1994) Split ubiquitin as a sensor of protein interactions in vivo. Proc Natl Acad Sci USA 91:10340-10344
Joung JK, Ramm EI, Pabo CO (2000) A bacterial two-hybrid selection system for studying protein-DNA and protein-protein interactions. Proc Natl Acad Sci USA 97:7382-7387
Khan J, Wei JS, Ringner M et al (2001) Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med 7:673-679
Kharpko KR, Khorlin AA, Ivanov IB et al (1991) Hybridization of DNA with oligonucleotides immobilized in gel: a convenient method for detecting single base substitutions. Mol Biol 25:581-591
Koide K, Finkelstein JM, Ball Z, Verdine GL (2001) A synthetic library of cell-permeable molecules. J Am Chem Soc 123:398-408
Kojima T, Kitamura T (1999) A signal sequence trap based on a constitutively active cytokine receptor. Nat Biotechnol 17:487-490
Lashkari DA, De Risi JL, McCusker JH et al (1997) Yeast genome microarrays for parallel genetic and gene expression analysis of the yeast genome. Proc Natl Acad Sci USA 94:13057-13062
Licitra EJ, Liu JO (1996) A three-hybrid system for detecting small ligand-protein receptor interactions. Proc Natl Acad Sci USA 93:12817-12821
Lin H, Abida WM, Sauer RT et al (2000) Dexamethasone-Methotrexate: an efficient chemical inducer of protein dimerization in vivo. J Am Chem Soc 122:4247-4248
Link AJ, Eng J, Schieltz DM et al (1999) Direct analysis of protein complexes using mass spectrometry. Nat Biotechnol 17:676-682
Lockhart DJ, Winzeler EA (2000) Genomics, gene expression and DNA arrays. Nature 405:827-836
Lockhart DJ, Dong H, Byrne MC et al (1996) Expression monitoring by hybridization to high density oligonucleotide arrays. Nat Biotechnol 14:1675-1680
MacBeath G (2001) Chemical genomics: what will it take and who gets to play? Genome Biol 2:2005
MacBeath G, Koehler AN, Schreiber SL (1999) Printing small molecules as microarrays and detecting protein-ligand interactions en masse. J Am Chem Soc 121:7967-7968
Mahon GM, Whitehead IP (2001) Retrovirus cDNA expression library screening for oncogenes. Methods Enzymol 332:211-221
Marton MJ, De Risi JL, Bennett HA et al (1998a) Drug target validation and identification of secondary drug target effects using DNA microarrays. Nat Med 4:1293-1301
McCraith S, Holtzman T, Moss B, Fields S (2000) Genome-wide analysis of vaccinia virus protein-protein interactions. Proc Natl Acad Sci USA 97:4879-4884
Medico E, Gambarotta ZG, Gentile A et al (2001) A gene trap vector system for identifying transcriptionally responsive genes. Nat Biotechnol 19:579-582
Mendelsohn AR, Brent R (1999) Protein interaction methods: towards an endgame. Science 284:1948-1950
Mendoza LG, McQuary P, Mongan A et al (1999) High-throughput microarray-based enzyme-linked immunosorbent assay (ELISA). Biotechniques 27:778-780; 782-788
Mitchell K, Pinson KI, Kelly OG et al (2001) Functional analysis of secreted and transmembrane proteins critical for mouse development. Nat Genet 28:241-249
Morgenstern JP, Land H (1990) Advanced mammalian gene transfer: high titre retroviral vectors with multiple drug selection markers and a complementary helper-free packaging cell line. Nucleic Acids Res 18:3587-3596
Nellen W, Sczakiel G (1996) In vitro and in vivo action of antisense RNA. Mol Biotechnol 6:7-15
Pandey A, Mann M (2000) Proteomics to study genes and genomes. Nature 405:837-846
Pelletier JN, Campbell-Valois FX, Michnick SW (1998) Oligomerization domain-directed reassembly of active dihydrofolate reductase from rationally designed fragments. Proc Natl Acad Sci USA 95:12141-12146
Perou CM, Jeffrey SS, Van De Rijn M et al (1999) Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc Natl Acad Sci USA 96:9212-9217
Rain JC, Selig L, De Reuse H et al (2001) The protein-protein interaction map of Helicobacter pylori. Nature 409:211-215
Remy I, Michnick SW (1999) Clonal selection and in vivo quantitation of protein interactions with protein-fragment complementation assays. Proc Natl Acad Sci USA 96:5394-5399
Ren B, Robert F, Wyrick JJ et al (2000) Genome-wide location and function of DNA binding proteins. Science 290:2306-2309
Risch NJ (2001) Searching for genetic determinants in the new millennium. Nature 405:847-856
Rojo-Niersbach E, Morley D, Heck S, Lehming N (2000) A new method for the selection of protein interactions in mammalian cells. Biochem J 348:585-590
Roses AD (2000) Pharmacogenetics and the practice of medicine. Nature 405:857-865
Rowley A, Choudhary JS, Marzioch M et al (2000) Applications of protein mass spectrometry in cell biology. Methods 20:383-397
Schena M, Shalon D, Davis RW, Brown PO (1995) Quantitative monitoring of gene expression patterns with a cDNA microarray. Science 270:467-470
SenGupta DJ, Zhang B, Kraemer B et al (1996) Three-hybrid system to detect RNA-protein interactions in vivo. Proc Natl Acad Sci USA 93:8496-8501
Senior K (1999) Fingerprinting disease with protein chip arrays. Mol Med Today 5:326-327
Shalon D, Smith SJ, Brown PO (1996) A DNA micro-array system for analyzing complex DNA samples using two-color fluorescent probe hybridization. Genome Res 6:639-645
Shevchenko A, Jensen ON, Podtelejnikov AV et al (1996) Linking genome and proteome by mass spectrometry: large-scale identification of yeast proteins from two dimensional gels. Proc Natl Acad Sci USA 94:14440-14445
Shimizu N, Sugimoto K, Tang J et al (2000) High-performance affinity beads for identifying drug receptors. Nat Biotechnol 18:877-881
Smith GP (1985) Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science 228:1315-1317
Smith GP, Petrenko VA (1997) Phage display. Chem Rev 97:391-410
Southern EM, Maskos U, Elder JK (1992) Analyzing and comparing nucleic acid sequences by hybridization to arrays of oligonucleotides: evaluation using experimental models. Genomics 13:1008-1017
Spellman PT, Sherlock G, Zhang MQ et al (1998) Comprehensive identification of cell-cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 9:3273-3297
Stagljar I, Korostensky C, Johnsson N et al (1998) A genetic system based on split-ubiquitin for the analysis of interactions between protein in vivo. Proc Natl Acad Sci USA 95:5187-5192
Sun P, Dong P, Dai K et al (1998) p53-independent role of MDM2 in TGF-beta1 resistance. Science 282:2270-2272
Tashiro K, Tada H, Heilker R et al (1993) Signal sequence trap: a cloning strategy for secreted proteins and type I membrane proteins. Science 261:600-603
Uetz P, Giot L, Cagney G et al (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403:623-627
Van Steensel B, Delrow J, Henikoff S (2001) Chromatin profiling using targeted DNA adenine methyltransferase. Nat Genet 27:304-308
Venter JC, Adams MD, Myers EW et al (2001) The sequencing of the human genome. Science 291:1304-1351
Vidal M, Legrain P (1999) Yeast forward and reverse 'n'-hybrid systems. Nucleic Acids Res 27:919-929
Visintin M, Tse E, Axelson H et al (1999) Selection of antibodies for intracellular function using a two-hybrid in vivo system. Proc Natl Acad Sci USA 96:11723-11728
Von Eggeling F, Davies H, Lomas L et al (2000) Tissue-specific microdissection coupled with Protein Chip array technologies: applications in cancer research. Biotechniques 29:1066-1070
Walhout AJ, Sorcella R, Lu X et al (2000) Protein interaction mapping in C. elegans using proteins involved in vulval development. Science 287:116-122
Washburn MP, Wolters D, Yates JR III (2001) Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat Biotechnol 19:242-247
Wender PA, Mitchell DJ, Pattabiraman K et al (2000) The design, synthesis, and evaluation of molecules that enable or enhance cellular uptake: peptoid molecular transporters. Proc Natl Acad Sci USA 97:13003-13008
Whitney M, Rockenstein E, Cantin G et al (1998) A genome-wide functional assay of signal transduction in living mammalian cells. Nat Biotechnol 16:1329-1333
Wittes J, Friedman H (1999) Searching for evidence of altered gene expression: a comment on statistical analysis of microarray data. J Natl Cancer Inst 91:400-401
Wittke S, Lewke N, Mueller S et al (1999) Probing the molecular environment of membrane proteins in vivo. Mol Cell Biol 10:2519-2530
Wodicka L, Dong H, Mittman M et al (1997) Genome-wide expression monitoring in Saccharomyces cerevisiae. Nat Biotechnol 15:1359-1367
Xu X, Leo C, Jang Y et al (2001) Dominant effector genetics in mammalian cells. Nat Genet 27:23-29
Yang M, Wu Z, Fields S (1995) Protein-peptide interactions analyzed with the yeast two-hybrid system. Nucleic Acids Res 23:1152-1162
Yates JR III, Eng JK, McCormack AL, Schieltz D (1995) Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal Chem 67:1426-1436
Yu Y, Annala AJ, Barrio JR et al (2000) Quantification of target gene expression by imaging reporter gene expression in living animals. Nat Med 6:933-937
Zambrowicz BP, Friedrich GA, Buxton EC et al (1998) Disruption and sequence identification of 2000 genes in mouse embryonic stem cells. Nature 392:608-611
