cDNA Microarray

INTRODUCTION

A DNA microarray is a multiplex technology used in molecular biology and in medicine.
It consists of an arrayed series of thousands of microscopic spots of DNA
oligonucleotides, called features, each containing picomoles of a specific DNA sequence.
Each sequence can be a short section of a gene or other DNA element and is used as a
probe to hybridize a cDNA or cRNA sample (called the target) under high-stringency
conditions. Probe-target hybridization is usually detected and quantified by
fluorescence-based detection of fluorophore-labeled targets to determine the relative
abundance of nucleic acid sequences in the target.

In standard microarrays, the probes are attached to a solid surface by a covalent bond to a
chemical matrix (via epoxy-silane, amino-silane, lysine, polyacrylamide or others). The
solid surface can be glass or a silicon chip, in which case the array is commonly known as
a gene chip, or colloquially as an Affy chip when an Affymetrix chip is used. Other
microarray platforms, such as Illumina, use microscopic beads instead of a large solid
support. DNA arrays differ from other types of microarray only in that they either measure
DNA or use DNA as part of their detection system.

Microarray technology evolved from Southern blotting, where fragmented DNA is
attached to a substrate and then probed with a known gene or fragment. The use of a
collection of distinct DNAs in arrays for expression profiling was first described in 1987,
and the arrayed DNAs were used to identify genes whose expression is modulated by
interferon. These early gene arrays were made by spotting cDNAs onto filter paper with a
pin-spotting device. The use of miniaturized microarrays for gene expression profiling
was first reported in 1995, and a complete eukaryotic genome (Saccharomyces cerevisiae)
on a microarray was published in 1997.
PRINCIPLE

The microarray technology consists of spotting PCR products or long
oligonucleotides (50-mer to 70-mer) on glass slides at densities of up to 6000 spots/cm².
These slides are hybridised using fluorescent targets (cDNAs or genomic
DNAs). The fluorescent molecules most commonly used are members of the
cyanine family, Cy3 and Cy5. After hybridisation, the signals are detected using a
fluorescence scanner. The use of two different fluorochromes allows the
determination of hybridisation signals from two distinct strains in one single
experiment.
Once the fluorescent intensities have been obtained, the major part of the work is
the analysis of the data in order to extract the biological information.

The whole procedure can be divided into five steps:


 Target preparation
 Hybridization
 Slide scanning
 Data analysis
 Expression profile clustering
MATERIALS

DNA sources
About 5200 human cDNA clones of the IMAGE library were obtained from the RZPD
Resource Centre (Berlin, Germany). Some 21 000 random shotgun clones representing
the genome of Trypanosoma brucei were provided by Najib El-Sayed of the Institute for
Genomic Research (TIGR, Rockville, USA). Nearly 4550 shotgun clones covering the
entire genome of Pseudomonas putida as a minimal tiling path were obtained from
Helmut Hilbert of Qiagen (Hilden, Germany). PCR products for some 21 000 predicted
open reading frames (ORFs) of Drosophila melanogaster were produced directly from
genomic DNA. The template for some 7300 ORF-specific PCR products of Candida
albicans was strain SC5314 (Can14).

PCR amplification
PCR amplifications were performed in 384- or 96-well microtitre plates. For PCR on the
cDNA and shotgun clones, 0.2 µM of the respective, vector-specific primer pairs
d(TCACACAGGAAACAGCTATGAC) and d(GTAAAACGACGGCCAGTG) (human clones),
d(TTGTAAAACGACGGCCAGTG) and d(GCGGATAACAATTTCACACAGGA)
(T. brucei) or d(TCGGATCCACTAGTAACG) and d(GGCCGCCAGTGTGATG)
(P. putida) (all from Interactiva, Ulm, Germany) were used. The reactions were started by
inoculating 25 or 100 µl of PCR mix, usually in 10 mM Tris–HCl, pH 8.3, 2.25 mM
MgCl2, 50 mM KCl, 0.2 mM each dATP, dTTP, dGTP and dCTP, 1.5 M betaine, 0.1 mM
cresol red and 2 U Taq polymerase, with a few Escherichia coli cells transferred from a
growth culture using a plastic 384- or 96-pin gadget (Genetix, New Milton, UK). The
plates were incubated for 3 min at 94°C, before 35 cycles of denaturation at 94°C for 30
s, annealing at 51°C for 30 s and elongation at 72°C for 90 s were performed, followed by
a final elongation phase at 72°C for 10 min. In some cases, the PCR was performed
without betaine. The Drosophila ORFs were initially amplified from 100 ng genomic DNA
with some 43 000 gene-specific primers, all of which contained one of several common
tag sequences of 15 nt length at their 5'-ends. Subsequent re-amplification was carried out
using the matching tag-specific primer pair. PCR products of C. albicans ORFs were
produced from 20 ng genomic DNA with 7300 specific primer pairs.
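
As a rough check on instrument time, the hold times of the cycling program described above can simply be added up. The sketch below is illustrative arithmetic only: it ignores the ramping time between temperatures, which the text does not give, and all names in it are chosen for this example.

```python
# Cycling program as stated in the text. Ramp times between temperature
# steps are NOT given in the source and are ignored here (assumption).
INITIAL_DENATURATION_S = 3 * 60   # 3 min at 94 °C
CYCLES = 35
CYCLE_STEPS_S = {
    "denaturation (94 °C)": 30,
    "annealing (51 °C)": 30,
    "elongation (72 °C)": 90,
}
FINAL_ELONGATION_S = 10 * 60      # 10 min at 72 °C


def total_hold_time_s() -> int:
    """Return the sum of all programmed hold times in seconds."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_ELONGATION_S


if __name__ == "__main__":
    seconds = total_hold_time_s()
    print(f"{seconds} s = {seconds / 60:.1f} min")  # 6030 s = 100.5 min
```

The real run time on a thermocycler will be longer, since heating and cooling between steps add to every cycle.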

Microarray production process:

DNA fragments amplified by PCR are spotted onto a glass microscope slide that has been
coated with polylysine beforehand. The polylysine coating fixes the DNA to the slide
through electrostatic interactions. In our case, the PCR fragments are the expressed
parts (ORFs) of the 6200 genes of Saccharomyces cerevisiae (baker's yeast). Slide
preparation is completed by blocking the polylysine that is not bound to DNA, in order
to prevent the target from binding to it. Prior to hybridisation, the DNA is denatured to
obtain single-stranded DNA on the microarray, which allows the probe to bind to the
complementary strand from the target.

Target preparation:

RNA is extracted from the two yeast cultures whose expression levels we want to
compare. The messenger RNA is then converted into cDNA by reverse transcription.
At this stage, the cDNA from the first culture is labelled with a green dye, whereas
the cDNA from the second culture is labelled with a red dye.

The available target-preparation methods can be divided into two groups: first-strand
cDNA that is labeled or tagged with a capture sequence, or the generation of antisense
RNA (aRNA) from double-stranded cDNA during an in vitro transcription (IVT)
reaction. Labeled cDNA can be prepared via direct incorporation of a
fluorophore-labeled nucleotide or through incorporation of an aminoallyl-labeled
nucleotide, followed by coupling of a fluorophore containing an amine-reactive group to
the aminoallyl nucleotide (Schena et al. 1995; for review, see Lockhart and Winzeler 2000).
Alternatively, the first-strand cDNA can be tagged with a capture sequence that is used
for subsequent detection steps (Stears et al. 2000). DNA microarrays containing short
oligonucleotide probes (<35 nucleotides long) require more target for each hybridization,
which requires an amplification method with smaller sample sizes. Typically, the
generation of aRNA (aRNA is also commonly called complementary RNA or cRNA) is
preceded by first-strand synthesis of cDNA using an oligonucleotide primer containing a
bacteriophage T7 RNA polymerase promoter proximal to an oligo(dT) sequence (van
Gelder et al. 1990; Eberwine et al. 1992; Lockhart et al. 1996). After second-strand cDNA
synthesis and cDNA purification, an IVT reaction is performed using T7 RNA
polymerase in the presence of labeled nucleotides. Alternatives to this labeling strategy
produce unlabeled aRNA, followed by a cDNA synthesis in the presence of a
fluorophore-labeled nucleotide (Wang et al. 2000). Any target preparation method
requires a linear amplification of the available transcripts to be representative of the
transcript population.

Hybridisation:

The green-labelled and red-labelled cDNAs are mixed together (this mixture is called the
target) and placed on the matrix of spotted single-stranded DNA (called the probe). The
chip is then incubated overnight at 60 °C. At this temperature, a DNA strand that
encounters its complementary strand pairs with it to form double-stranded DNA. The
fluorescent DNA thus hybridises to the spotted DNA.

Discrepancies in microarray results are a consequence of differences in microarray
measures, such as accuracy (i.e. ‘the degree of conformity of the measured quantity to its
actual (true) value’), sensitivity (i.e. ‘the concentration range of target molecules in which
accurate measurements can be made’), reproducibility (i.e. ‘the degree to which repeated
measurements of the same quantity will show the same or similar results’) and specificity
(i.e. ‘the ability of a probe to provide a signal that is influenced only by the presence of
the target molecule’).

Accuracy, sensitivity and reproducibility may be affected by several effectors. These
measures and their effectors are discussed by Dufva and by Draghici et al., and will not be
detailed here. An example of an effector of sensitivity, reproducibility and accuracy is the
type of microarray platform: oligonucleotide arrays have been found to be more
reproducible and sensitive than cDNA arrays, and some oligonucleotide arrays have been
found to be more accurate than others. Sensitivity is also affected by probe density (i.e.
the number of different probes that are fabricated in a given area), which has been shown
to be an effector of the availability of probes for hybridization; this availability may also
be affected by the steric restrictions imposed by the solid microarray surface. A higher
availability of probes for hybridization has been demonstrated to increase sensitivity. In
addition, sensitivity is affected by the hybridization signal-to-noise ratio (i.e. the ratio
between the spot signal and that of the background): a low background increases
microarray hybridization sensitivity.

Low specificity of microarray hybridizations has been suggested to be one of the prime
measures affecting discrepancies in gene-expression profiles between different probes
targeting the same region of a given transcript or between different microarray platforms;
in the present review, we will highlight the issue of microarray-hybridization specificity
as a key measure that, once improved, may increase the validity of microarray results.
Microarrays consist of multiple probes. Hence, a prime key for specificity during
microarray hybridization, for either short-oligomer or cDNA microarrays, is the ability of
the probe to discriminate between different target molecules.

Probes are designed to be complementary to the target molecule according to the
Watson–Crick rules of binding. Therefore, a probe with high specificity to its target
molecule should provide a signal influenced only by the presence of the target molecule.
Nevertheless, a perfect match in terms of sequence-similarity-based complementarity
between a probe and its target molecule does not guarantee specificity. This is due to the
presence of thousands of target molecules during microarray hybridization, each target
molecule being composed of tens, hundreds or thousands of the four nucleotide bases, and
to the effect of different effectors (discussed subsequently) of hybridization specificity,
which may alter the ability of a probe to bind to a target molecule. Hence, there is often
some degree of microarray-probe hybridization to a target molecule which is not strictly
complementary to it or, vice versa, a variable number of target molecules that are
hybridized to a microarray probe which is not exactly complementary to them.
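
The difference between a strictly complementary target and a nearly complementary one can be illustrated with a small sequence comparison. The sketch below is a deliberately simplified model: it only counts mismatched positions between a probe and an equal-length target, and it ignores hybridization thermodynamics, gaps and partial overlaps. The function names and example sequences are invented for this illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mismatches(probe: str, target: str) -> int:
    """Count mismatched positions between a probe and an equal-length target.

    A strictly complementary target is identical to the reverse complement
    of the probe, so it yields zero mismatches.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    expected = reverse_complement(probe)
    return sum(1 for a, b in zip(expected, target.upper()) if a != b)


if __name__ == "__main__":
    probe = "ATGCGTAC"                      # hypothetical 8-mer probe
    print(reverse_complement(probe))        # GTACGCAT
    print(mismatches(probe, "GTACGCAT"))    # 0: perfect match
    print(mismatches(probe, "GTACGGAT"))    # 1: candidate for cross-hybridization
```

A target with a low but non-zero mismatch count is the kind of molecule that may still bind the probe and contribute signal that does not come from the intended target.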

FOUR LEVELS OF HYBRIDIZATION SPECIFICITY

We define four levels of hybridization specificity in the context of microarray
hybridization. The first is that of hybridization between a single probe molecule and a
single target molecule. The two molecules may exhibit perfect hybridization, partial
hybridization (i.e. the target molecule is only partially hybridized to the probe) or no
hybridization.

The second level of specificity is that of a spot. At this level, the multiple probe
molecules that compose one spot are hybridized to multiple target molecules. The spot
probes may exhibit perfect, partial or no hybridization with the target molecules. Notably,
at this level, partial hybridization may take one or both of two forms: only some of the
probes may be hybridized to the target molecule, or probes may be hybridized to only some
of the target molecules. This partial hybridization, at the spot level, may be a result of
cross-hybridization (i.e. hybridization between sequences that are not strictly
complementary), due to the presence and hybridization of nontarget molecules with
sequences similar to that of the spot probes. Since a spot is composed of multiple probes,
a single spot may simultaneously bear all combinations of one to four of the presented
probe-target molecule types of binding.

The third level of specificity is that of a spot-set (or, in Affymetrix terminology,
‘probe-set’), in which multiple spots represent different segments of the same reference
sequence (e.g. different exons of a gene). At this level, different spots of a spot-set may
exhibit perfect hybridization with the target molecule; partial hybridization with the
target molecule due to the presence of probes with mismatches to the target molecule as a
result of, for example, an annotation error in the gene sequence, or intended mismatches
introduced to quantify nonspecific hybridization; no hybridization due to, for example,
alternative splicing of a transcript, leading to probes with no match to the target
molecule; or cross-hybridization due to, for example, a spot within a spot-set that
represents an evolutionarily conserved gene segment and hybridizes with nontarget
molecules derived from various gene-family members.

The fourth level of specificity is that of the microarray, in which a variable number of
spot-sets may exhibit different forms of hybridization with target sequences: perfect
hybridization (i.e. all target molecules are hybridized to their representative spot-sets and
all spot-sets are hybridized to the target molecules they represent), partial hybridization in
either direction, no hybridization (i.e. target molecules are not hybridized to any spot-set
or spot-sets do not match any target molecules) or cross-hybridization (e.g. target
molecules of different genes hybridize to the same spot-set or target molecules of a
particular gene hybridize to several different genes’ spot-sets). These different forms may
exist for a large number of different target molecules or spot-sets.

Slide scanning:

A laser excites each spot and the fluorescent emission is collected by a
photomultiplier tube (PMT) coupled to a confocal microscope. We obtain two images in
which grey scales represent the fluorescence intensities read. If we replace the grey
scale by a green scale for the first image and a red scale for the second one, then by
superimposing the two images we obtain one image composed of spots ranging from green
(where only DNA from the first condition is fixed) to red (where only DNA from the
second condition is fixed), passing through yellow (where DNA from the two
conditions is fixed in equal amounts).
Data analysis:

We now have two microarray images from which we have to estimate the number of
DNA molecules in each experimental condition. To do so, we measure the signal
at the emission wavelength of the green dye and the signal at the emission wavelength
of the red dye. We then normalise these signals according to various parameters (amount
of yeast in each culture condition, emission power of each dye, …). We assume that the
amount of fluorescent DNA fixed is proportional to the amount of mRNA initially present
in each cell, and we calculate the red/green fluorescence ratio. If this ratio is
greater than 1 (red on the image), the gene expression is greater in the second
experimental condition; if this ratio is smaller than 1 (green on the image), the gene
expression is greater in the first condition. We can visualize these differences in
expression using software such as ArrayPlot, which was developed in the laboratory (see
image below). From the list of spot intensities, this software displays the red intensity
of each spot as a function of its green intensity.
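
The ratio calculation described above can be sketched in a few lines. The normalisation used here, scaling the red channel so that both channels have the same total intensity, is an assumption made for the example; the text only says that signals are normalised according to various parameters. Background subtraction is omitted, and the function names and intensity values are invented.

```python
def normalised_ratios(red, green):
    """Return red/green ratios per spot after global scaling of the red channel.

    Assumption for this sketch: both channels should have the same total
    intensity, so red is multiplied by sum(green) / sum(red).
    """
    if len(red) != len(green):
        raise ValueError("both channels must contain the same number of spots")
    if any(g <= 0 for g in green) or sum(red) <= 0:
        raise ValueError("intensities must be positive")
    scale = sum(green) / sum(red)
    return [(r * scale) / g for r, g in zip(red, green)]


def interpret(ratio):
    """Translate a red/green ratio into the reading given in the text."""
    if ratio > 1:
        return "higher in second condition (red)"
    if ratio < 1:
        return "higher in first condition (green)"
    return "equal in both conditions (yellow)"


if __name__ == "__main__":
    red = [400.0, 100.0, 200.0, 100.0]     # made-up spot intensities
    green = [100.0, 100.0, 100.0, 100.0]
    for ratio in normalised_ratios(red, green):
        print(ratio, interpret(ratio))
```

With these made-up values the red channel is twice as bright overall, so it is scaled by 0.5 before the ratios are taken.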

Fabrication

Microarrays can be manufactured in different ways, depending on the number of probes
under examination, costs, customization requirements, and the type of scientific question
being asked. Arrays from commercial vendors may have as few as 10 probes or as many as
2.1 million micrometre-scale probes.

Surface engineering

The first step of DNA microarray fabrication involves surface engineering of a substrate
in order to obtain desirable surface properties for the application of interest. Optimal
surface properties are those which produce high signal to noise ratios for the DNA targets
of interest. Generally, this involves maximizing the probe surface density and activity
while minimizing the non-specific binding of the targets of interest. Methods of surface
engineering vary depending on the platform material, design, and application.

Spotted vs. oligonucleotide arrays

Microarrays can be fabricated using a variety of technologies, including printing with
fine-pointed pins onto glass slides, photolithography using pre-made masks,
photolithography using dynamic micromirror devices, ink-jet printing, or
electrochemistry on microelectrode arrays.

In spotted microarrays, the probes are oligonucleotides, cDNA or small fragments of
PCR products that correspond to mRNAs. The probes are synthesized prior to deposition
on the array surface and are then "spotted" onto glass. A common approach uses an
array of fine pins or needles, controlled by a robotic arm, that is dipped into wells
containing DNA probes and then deposits each probe at a designated location on the
array surface. The resulting "grid" of probes represents the nucleic acid profiles of the
prepared probes and is ready to receive complementary cDNA or cRNA "targets" derived
from experimental or clinical samples. This technique is used by research scientists
around the world to produce "in-house" printed microarrays from their own labs. These
arrays may be easily customized for each experiment, because researchers can choose the
probes and printing locations on the arrays, synthesize the probes in their own lab (or
collaborating facility), and spot the arrays. They can then generate their own labeled
samples for hybridization, hybridize the samples to the array, and finally scan the arrays
with their own equipment. This provides a relatively low-cost microarray that may be
customized for each study, and avoids the costs of purchasing often more expensive
commercial arrays that may represent vast numbers of genes that are not of interest to the
investigator. Some publications indicate that in-house spotted microarrays may not
provide the same level of sensitivity as commercial oligonucleotide arrays,
possibly owing to the small batch sizes and reduced printing efficiencies when compared
to industrial manufacturers of oligo arrays.

In oligonucleotide microarrays, the probes are short sequences designed to match parts of
the sequence of known or predicted open reading frames. Although oligonucleotide
probes are often used in "spotted" microarrays, the term "oligonucleotide array" most
often refers to a specific technique of manufacturing. In oligonucleotide arrays, short
oligonucleotide sequences designed to represent a single gene or family of gene
splice-variants are synthesized directly on the array surface instead of being deposited
as intact sequences. Sequences may be longer (60-mer probes
such as the Agilent design) or shorter (25-mer probes produced by Affymetrix) depending
on the desired purpose; longer probes are more specific to individual target genes, shorter
probes may be placed at higher density across the array and are cheaper to manufacture.
One technique used to produce oligonucleotide arrays is photolithographic
synthesis (Affymetrix) on a silica substrate, where light and light-sensitive
masking agents are used to "build" a sequence one nucleotide at a time across the entire
array. Each applicable probe is selectively "unmasked" prior to bathing the array in a
solution of a single nucleotide, then a masking reaction takes place and the next set of
probes is unmasked in preparation for a different nucleotide exposure. After many
repetitions, the sequences of every probe become fully constructed. More recently,
Maskless Array Synthesis from NimbleGen Systems has combined flexibility with large
numbers of probes.
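
The unmask-and-expose cycle described above can be mimicked with a toy simulation. This is only a sketch of the scheduling logic, assuming the nucleotides are offered in a fixed A, C, G, T order and that every coupling succeeds; it models none of the chemistry. The function name and the example probes are invented for the illustration.

```python
def synthesise(probes):
    """Simulate in situ synthesis of all probes, one nucleotide exposure at a time.

    Returns the built sequences and the number of nucleotide exposures used.
    Assumption: nucleotides are offered cyclically in the order A, C, G, T.
    """
    for probe in probes:
        if set(probe) - set("ACGT"):
            raise ValueError(f"invalid base in probe {probe!r}")

    built = ["" for _ in probes]
    exposures = 0
    while any(len(b) < len(p) for b, p in zip(built, probes)):
        for nucleotide in "ACGT":
            # "Unmask" only the features whose next required base is this one.
            unmasked = [
                i for i, (b, p) in enumerate(zip(built, probes))
                if len(b) < len(p) and p[len(b)] == nucleotide
            ]
            if not unmasked:
                continue  # no feature needs this base now, skip the exposure
            for i in unmasked:
                built[i] += nucleotide  # coupling, assumed to be 100 % efficient
            exposures += 1
    return built, exposures


if __name__ == "__main__":
    probes = ["ACG", "CGT", "AAT"]          # hypothetical 3-mer probes
    built, exposures = synthesise(probes)
    print(built, exposures)                 # ['ACG', 'CGT', 'AAT'] 6
```

All features are extended in parallel, so the number of exposures depends on the probe length and sequence content rather than on the number of probes on the array.
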
Two-channel vs. one-channel detection

Diagram of typical dual-colour microarray experiment.

Two-color microarrays or two-channel microarrays are typically hybridized with cDNA
prepared from two samples to be compared (e.g. diseased tissue versus healthy tissue)
and that are labeled with two different fluorophores. Fluorescent dyes commonly used for
cDNA labelling include Cy3, which has a fluorescence emission wavelength of 570 nm
(corresponding to the green part of the light spectrum), and Cy5 with a fluorescence
emission wavelength of 670 nm (corresponding to the red part of the light spectrum). The
two Cy-labelled cDNA samples are mixed and hybridized to a single microarray that is
then scanned in a microarray scanner to visualize fluorescence of the two fluorophores
after excitation with a laser beam of a defined wavelength. Relative intensities of each
fluorophore may then be used in ratio-based analysis to identify up-regulated and
down-regulated genes.

Oligonucleotide microarrays often contain control probes designed to hybridize with
RNA spike-ins. The degree of hybridization between the spike-ins and the control probes
is used to normalize the hybridization measurements for the target probes. Although
absolute levels of gene expression may be determined in the two-color array, analysis of
the relative differences in expression among different spots within a sample and between
samples is the preferred method of data analysis for the two-color system. Examples of
providers of such microarrays include Agilent with their Dual-Mode platform, Eppendorf
with their DualChip platform for fluorescence labeling, and TeleChem International with
Arrayit.

In single-channel microarrays or one-color microarrays, the arrays are designed to give
estimations of the absolute levels of gene expression. Therefore the comparison of two
conditions requires two separate single-dye hybridizations. As only a single dye is used,
the data collected represent absolute values of gene expression. These may be compared
to other genes within a sample or to reference "normalizing" probes used to calibrate data
across the entire array and across multiple arrays. Three popular single-channel systems
are the Affymetrix "Gene Chip", the Applied Microarrays "CodeLink" arrays, and the
Eppendorf "DualChip & Silverquant". One strength of the single-dye system lies in the
fact that an aberrant sample cannot affect the raw data derived from other samples,
because each array chip is exposed to only one sample (as opposed to a two-color system
in which a single low-quality sample may drastically impinge on overall data precision
even if the other sample was of high quality). Another benefit is that data are more easily
compared to arrays from different experiments; the absolute values of gene expression
may be compared between studies conducted months or years apart. A drawback to the
one-color system is that, when compared to the two-color system, twice as many
microarrays are needed to compare samples within an experiment.

Expression profile clustering:

We can then try to group genes that share the same expression profile across several
experiments. This clustering can be done step by step, as in phylogenetic analysis, by
calculating a similarity criterion between expression profiles and grouping the most
similar ones. We can also use more complex techniques such as principal component
analysis or neural networks.

Finally, hierarchical clustering is usually displayed as a matrix in which each column
represents one experiment and each row a gene. Ratios are displayed using a colour
scale going from green (repressed genes) to red (induced genes).
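
The step-by-step grouping described above can be sketched as a small agglomerative clustering routine. The choices made here are assumptions for the example, not the method prescribed by the text: similarity is the Pearson correlation between profiles, clusters are compared by their average pairwise similarity, and merging stops at a requested number of clusters. Gene names and values are invented, and profiles with zero variance are not handled.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def hierarchical_clusters(profiles, n_clusters):
    """Merge the two most similar clusters until n_clusters remain.

    profiles maps a gene name to its list of ratios, one per experiment.
    """
    clusters = [{gene} for gene in profiles]

    def similarity(c1, c2):
        # Average-linkage: mean correlation over all cross-cluster gene pairs.
        return mean(pearson(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: similarity(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


if __name__ == "__main__":
    profiles = {                       # made-up ratios over three experiments
        "geneA": [1.0, 2.0, 3.0],
        "geneB": [2.0, 4.0, 6.1],
        "geneC": [3.0, 2.0, 1.0],
        "geneD": [6.0, 4.1, 2.0],
    }
    print(hierarchical_clusters(profiles, 2))
```

In this toy data set geneA and geneB rise together while geneC and geneD fall together, so the two groups end up in separate clusters.
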
Uses and types
Arrays of DNA can be spatially arranged, as in the commonly known gene chip (also
called genome chip, DNA chip or gene array), or can be specific DNA sequences labelled
such that they can be independently identified in solution. The traditional solid-phase
array is a collection of microscopic DNA spots attached to a solid surface, such as glass,
plastic or silicon biochip. The affixed DNA segments are known as probes (although
some sources use different terms such as reporters). Thousands of them can be placed in
known locations on a single DNA microarray.

DNA microarrays can be used to detect DNA (as in comparative genomic hybridization),
or detect RNA (most commonly as cDNA after reverse transcription) that may or may not
be translated into proteins. The process of measuring gene expression via cDNA is called
expression analysis or expression profiling.

Since an array can contain tens of thousands of probes, a microarray experiment can
accomplish that many genetic tests in parallel. Therefore arrays have dramatically
accelerated many types of investigation.

Applications include:

Technology or application: Synopsis

Gene expression profiling: In an mRNA or gene expression profiling experiment the
expression levels of thousands of genes are simultaneously monitored to study the
effects of certain treatments, diseases, and developmental stages on gene expression.
For example, microarray-based gene expression profiling can be used to identify genes
whose expression is changed in response to pathogens or other organisms by comparing
gene expression in infected to that in uninfected cells or tissues.

Comparative genomic hybridization: Assessing genome content in different cells or
closely related organisms.

Chromatin immunoprecipitation on Chip: DNA sequences bound to a particular protein can
be isolated by immunoprecipitating that protein (ChIP); these fragments can then be
hybridized to a microarray (such as a tiling array), allowing the determination of
protein binding site occupancy throughout the genome. Examples of proteins to
immunoprecipitate are histone modifications (H3K27me3, H3K4me2, H3K9me3, etc.),
Polycomb-group proteins (PRC2: Suz12, PRC1: YY1) and trithorax-group proteins (Ash1) to
study the epigenetic landscape, or RNA polymerase II to study the transcription
landscape.

SNP detection: Identifying single nucleotide polymorphisms among alleles within or
between populations. Several applications of microarrays make use of SNP detection,
including genotyping, forensic analysis, measuring predisposition to disease,
identifying drug candidates, evaluating germline mutations in individuals or somatic
mutations in cancers, assessing loss of heterozygosity, or genetic linkage analysis.

Alternative splicing detection: An exon junction array design uses probes specific to
the expected or potential splice sites of predicted exons for a gene. It is of
intermediate density, or coverage, between a typical gene expression array (with 1-3
probes per gene) and a genomic tiling array (with hundreds or thousands of probes per
gene). It is used to assay the expression of alternative splice forms of a gene. Exon
arrays have a different design, employing probes designed to detect each individual
exon for known or predicted genes, and can be used for detecting different splicing
isoforms.

Tiling array: Genome tiling arrays consist of overlapping probes designed to densely
represent a genomic region of interest, sometimes as large as an entire human
chromosome. The purpose is to empirically detect expression of transcripts or
alternatively spliced forms which may not have been previously known or predicted.
