
Techniques

A Tutorial for DNA Microarray Expression Profiling


Harm van Bakel1 and Frank C.P. Holstege1,*
1Department of Physiological Chemistry, University Medical Center Utrecht, 3508 AB Utrecht, The Netherlands
*Correspondence: f.c.p.holstege@umcutrecht.nl

Introduction
Microarray technology has revolutionized biology by allowing experimentation at the genomic rather than the single-gene level. Since its inception, the technology has spawned a myriad of different applications (Hoheisel, 2006). This tutorial focuses on the use of DNA microarrays for gene expression analysis and aims to clarify commonly used experimental approaches and analysis methods.

Experimental Basics
DNA microarrays make it possible to simultaneously assess mRNA expression of thousands of genes. This is achieved by arraying a large number of DNA probes onto a small surface, each matching a unique gene (or part of a gene) in the genome, to which one or more labeled mRNA samples from cells or tissues of interest are hybridized (Figure 1A). A large number of microarray platforms exist that differ in the type of probe and manufacturing technology (Irizarry et al., 2005). A broad distinction can be made according to the use of one-color versus two-color labeling (Figure 1A). In one-color experiments, each sample (also known as a target) is labeled with the same fluorescent dye and hybridized to a separate microarray. In two-color experiments, samples are differentially labeled, using two different fluorescent dyes, and are directly compared by mixing and hybridizing the two samples on a single microarray. The former approach requires very consistent manufacturing to minimize array-to-array variation. The latter approach benefits from the direct comparison between samples on a single array. However, comparing more than two samples becomes complicated, necessitating either the use of a common reference sample or a hybridization strategy that combines multiple different pairs of samples (Figure 1B) (Churchill, 2002). Both single- and dual-channel platforms are frequently used and yield comparable results (Irizarry et al., 2005).

Figure 1. DNA Microarray Experiments and Study Designs

(A) Overview of how the effects of a stimulus on a cell culture may be studied using microarrays. Total RNA is first isolated from each condition, followed by steps to enrich for and in some cases to amplify mRNA. This material, cDNA or, in the case of amplification, cRNA, is directly or indirectly labeled, for example by generating cDNA in a reverse-transcription reaction with labeled nucleotides. For two-color experiments, samples from each condition are labeled with a different fluorescent dye such as Cy3 (green) or Cy5 (red), and then combined and hybridized onto a single microarray. Following quantification of the fluorescent signals in each dye channel, changes in gene expression can be calculated from the intensities of the individual channels. For one-color experiments, each sample is labeled identically and hybridized onto different microarrays. Comparisons between conditions are done analogously to two-color experiments, by comparing the signals obtained from each microarray. (B) In two-color microarray experiments, three designs are possible. In a direct pairwise design, two groups are directly compared by hybridizing a sample from each group in the different dye channels. To prevent dye-bias artifacts, samples from each group should be swapped between dye channels in different hybridizations. Reference design experiments can be used to compare two or more groups. Samples from each group are hybridized against a common reference sample or a common reference pool of samples. Loop designs organize samples in a circular manner, and each microarray is used to compare two consecutive samples in the chain. In large loops, additional hybridizations (spokes) can be performed to reduce the distance between samples that are far apart. Because of inherent difficulties in the data analysis of the loop design, it is not used as often as the reference and the pairwise designs.
In the schematic layouts, letters represent different sample groups and connecting lines indicate hybridizations.

2008 Elsevier Inc. All rights reserved.

Regardless of the platform used, differences in hybridization efficiency between probes make it difficult to accurately translate measured fluorescent spot intensities into absolute mRNA levels. The quantified hybridization signals for each probe are therefore generally represented as a ratio that captures the change in expression between the conditions studied (Eisen et al., 1998). The meaning of expression ratios is intuitively easy to understand, but their asymmetrical distribution for upregulated genes (between 1 and infinity) and downregulated genes (between 0 and 1) can make even trivial operations, such as calculating a mean expression ratio, difficult. The ratios are therefore generally log transformed prior to any further data analysis, resulting in a symmetrical distribution of up- and downregulated genes.

The log-transformed expression ratios obtained after quantification of the scanned microarray images cannot be directly used for data analysis. Differences in background, labeling efficiencies, and the amount of labeled material can mean that signals between the dye channels in two-color experiments, or between the different hybridizations in one-color experiments, are unbalanced. These imbalances can be nonhomogeneous. For example, a weak background autofluorescence of DNA spots in one channel will have a stronger effect on probes corresponding to lowly expressed genes than on those representing highly expressed genes (Figure 2A).
For these reasons, the signals between dye channels or between individual arrays must be normalized so that they reflect true expression differences across the complete intensity range (for a review, see Quackenbush, 2002). Early normalization methods were based on median signal intensities or on the assumption of constant expression of housekeeping genes. These approaches are now discouraged because they cannot deal with intensity-dependent bias and because there are no genes with constant expression across all conditions. Nonlinear normalization approaches, such as lowess (Yang et al., 2002), balance dye-channel intensities on the premise that the majority of genes do not change expression during the course of an experiment (Figure 2B). These methods perform very well, provided that there are no global changes in gene expression patterns affecting more than a quarter of all genes. Such global changes are lost during normalization, and standard normalization then results in misrepresentation of differentially expressed genes.

Figure 2. Comparing Microarray Data


(A and B) MA scatter plots showing the results of a self-versus-self experiment (same sample hybridized in each dye channel) before (A) and after (B) normalization. The MA plot is a variant of a normal scatter plot whereby log2-transformed expression ratios (M) are plotted as a function of the averaged background-subtracted intensity (A) of the two channels (R and G). The raw data (A) show a bias toward the G channel (Cy3) in the low-intensity range. (B) Data after nonlinear lowess normalization on genes per array subgrid. Expression ratios are now balanced between the two dye channels, and the Cy3 bias is removed. (C) Example of a Venn diagram showing the degree of overlap between gene lists obtained from three different experiments. The number of genes unique to each list and the sizes of the overlaps between lists are indicated.
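The M and A values plotted in an MA plot can be computed per spot from the two channel intensities. The short Python sketch below is illustrative only: the function name `ma_values` and the spot intensities are hypothetical, and A is taken as the mean of the log2 intensities, which is a common convention but an assumption here rather than something stated in the caption.

```python
import math

def ma_values(red, green):
    """Return (M, A) for one spot from background-subtracted channel intensities."""
    m = math.log2(red / green)                      # log2 expression ratio
    a = 0.5 * (math.log2(red) + math.log2(green))   # mean log2 intensity (assumed convention)
    return m, a

# Hypothetical spots: (R channel intensity, G channel intensity)
spots = {
    "twofold_up": (2000.0, 1000.0),
    "twofold_down": (1000.0, 2000.0),
    "unchanged": (1500.0, 1500.0),
}

for name, (r, g) in spots.items():
    m, a = ma_values(r, g)
    print(f"{name}: ratio={r / g:.2f}  M={m:+.2f}  A={a:.2f}")
```

The raw ratios for a twofold increase and a twofold decrease (2.0 and 0.5) lie at unequal distances from 1, whereas their log2 values lie at equal distances from 0. This is the symmetry that makes averaging log ratios meaningful.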

Studies that involve inactivation of components of the general transcription machinery or conditions that profoundly affect cell growth are particularly at risk for such effects. In these cases, externally spiked in vitro synthesized control RNAs can be added in equal quantities to each sample and used as a basis for normalization instead (van Bakel and Holstege, 2004).
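The idea behind intensity-dependent normalization can be sketched in a few lines of Python. This is not the lowess procedure of Yang et al. (2002): a running median over spots of similar intensity is swapped in as a simpler stand-in, and the function name, window size, and simulated self-versus-self data are all assumptions made for illustration. Like lowess, the sketch relies on the premise that most genes do not change expression.

```python
import math
import random
from statistics import median

def running_median_normalize(red, green, window=51):
    """Center log2 ratios on the local median of spots with similar intensity."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    order = sorted(range(len(m)), key=lambda i: a[i])   # spots ranked by intensity
    half = window // 2
    normalized = [0.0] * len(m)
    for rank, i in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        local_bias = median(m[j] for j in order[lo:hi])  # bias at this intensity
        normalized[i] = m[i] - local_bias
    return normalized

# Simulated self-versus-self hybridization (hypothetical data): the same signal
# in both channels, plus a constant extra background in the G channel.
random.seed(0)
true_signal = [2 ** random.uniform(6, 14) for _ in range(500)]
red = [s * 2 ** random.gauss(0, 0.1) for s in true_signal]
green = [s * 2 ** random.gauss(0, 0.1) + 100.0 for s in true_signal]

raw_m = [math.log2(r / g) for r, g in zip(red, green)]
norm_m = running_median_normalize(red, green)
print("median M before:", round(median(raw_m), 3))
print("median M after: ", round(median(norm_m), 3))
```

Because the correction is estimated from the bulk of the spots, a global shift affecting most genes would be removed along with the technical bias, which is the situation in which spiked-in controls are the safer basis for normalization.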
Defining Differential Expression
Central to most microarray experiments is the goal of determining the set of genes that are differentially expressed between samples. This is not always straightforward, considering the large number of genes that are being evaluated and the biological and technical variation in their expression measurements. A simple fold-change cutoff at an arbitrary value, still used in many studies, does not provide any measure of confidence for the genes selected. It also results in the false positive selection of many lowly expressed or unexpressed genes that do not exhibit real changes in expression levels but are more prone to be affected by measurement noise and other artifacts (Cui and Churchill, 2003). Such problems underscore the need for statistical testing in microarray data analysis.

Due to the relatively high cost of microarrays, the number of replicates is usually small, especially when compared to the large number of genes tested. This poses a significant problem for statistical analysis, as it means that only a limited number of measurements are available to estimate individual gene variance (Draghici, 2003). Statistical methods have been developed for microarray data analysis that address this issue by combining gene-specific variance with the global variance across all genes (Cui and Churchill, 2003). This takes advantage of the thousands of gene measurements available in each hybridization while still allowing for differences in individual gene variance. Another advantage offered by many of these specialized methods is their ability to estimate the statistical distribution of expression ratios by random sampling from the underlying data. The resulting increase in sensitivity and specificity to detect expression changes provides a distinct advantage over conventional statistical approaches such as standard t tests, which are poorly suited to determining differential expression from so few replicates.
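The benefit of combining gene-specific variance with the global variance can be illustrated with a toy calculation. The Python sketch below is not the method of any particular package: the equal weighting of the two variances, the use of the median as the global variance, and the five example genes are assumptions chosen only to show the effect.

```python
import math
from statistics import mean, median, variance

def t_statistics(log_ratios_by_gene):
    """Return {gene: (ordinary_t, moderated_t)} from replicate log2 ratios."""
    variances = {g: variance(v) for g, v in log_ratios_by_gene.items()}
    global_var = median(variances.values())   # variance typical of all genes
    result = {}
    for gene, values in log_ratios_by_gene.items():
        n = len(values)
        avg = mean(values)
        ordinary = avg / math.sqrt(variances[gene] / n)
        # Assumption: equal weight for the gene-specific and the global variance.
        pooled_var = 0.5 * (variances[gene] + global_var)
        moderated = avg / math.sqrt(pooled_var / n)
        result[gene] = (ordinary, moderated)
    return result

# Hypothetical log2 ratios from three replicate hybridizations per gene.
data = {
    "gene_a": [1.9, 2.1, 2.0],      # large, consistent change
    "gene_b": [0.10, 0.11, 0.10],   # tiny change, variance small by chance
    "gene_c": [0.3, -0.4, 0.1],
    "gene_d": [-0.2, 0.5, -0.1],
    "gene_e": [1.0, 0.2, 1.5],
}

stats = t_statistics(data)
for gene, (ordinary, moderated) in stats.items():
    print(f"{gene}: ordinary t = {ordinary:6.2f}   moderated t = {moderated:6.2f}")
```

With only three replicates, the ordinary t statistic ranks gene_b almost as highly as gene_a, purely because its variance estimate happens to be very small. Borrowing information from the other genes removes this artifact while leaving gene_a clearly on top.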
Statistical testing requires a minimal number of replicate measurements. An important distinction must be made between microarray measurements obtained from technical and from biological replicates. The expression of many genes, particularly those involved in responses to stress, is very sensitive to slight differences in treatment or growth conditions. Therefore, repeated measurements on a single biological sample are much less meaningful than repeated measurements on several biological replicates. When assessing reported expression differences, particular attention should therefore be paid to the number of biological replicates.

Because many genes are tested simultaneously in a microarray experiment, standard significance thresholds are not appropriate to assess confidence. When considering an experiment with 20,000 genes, applying even a relatively conservative p value cutoff of 0.01 for each gene individually would a priori result in about 200 genes (20,000 × 0.01) being reported as significant by chance alone, even if the number of true positives is actually much lower. It is therefore important that p values reported in the literature have undergone a multiple-testing correction (MTC) that adjusts the p value of each gene for the number of tests that were performed, that is, the total number of probes on the microarray. These correction methods have historically been aimed at reducing the chance of reporting any erroneous results. This type of correction tends to be quite conservative and can reduce the power to recognize significant results. In microarray experiments where multiple significant expression changes are expected, it is often justifiable to accept a small number of false positives for the benefit of increased power. Many microarray studies therefore aim to control the rate at which errors occur among the reported genes, known as the false discovery rate (FDR), typically setting a cutoff at 5%. An excellent overview of MTC methods available for microarray studies is given elsewhere (Dudoit et al., 2003).

It is important to note that depending on the goal of the experiment, not all applications require an equally high degree of accuracy in the reported gene lists. Many microarray studies are initially performed as screens for differential expression, with the goal of further analyzing only a few genes. In these cases virtually any statistical method will suffice, as long as subsequent analyses focus on only the best-ranked genes and these are all subjected to independent validation of the change in expression by alternative methods such as reverse transcriptase-polymerase chain reaction. Applications that require a much higher degree of accuracy for all reported expression changes include global analyses whereby conclusions are based on the entire profile of differentially expressed genes, for example in studies aimed at identifying pathways or other functional relationships (Hughes et al., 2000; van de Peppel et al., 2005).

Visualizing Expression Differences
Direct comparisons between samples are typically displayed in scatter plots.
In most cases, these will show intensity measures that were obtained from each dye channel in a two-color experiment, or from two independent hybridizations in a one-color setup. Features that are expressed at similar levels between conditions will be lined up around the central diagonal, whereas differentially expressed genes deviate from this line. The use of scatter plots should be combined with statistics to indicate which features are reliably differentially expressed. Other applications of scatter plots that may be encountered are for reproducibility assessment or to compare the performance of different technology platforms and methods (Mahadevappa and Warrington, 1999). The degree of reproducibility can be expressed as a correlation coefficient, which gives an indication of how closely two experiments match. For directly comparing ratios rather than intensities, scatter plots are less suitable. This is because in most experiments, the majority of genes will not show expression differences. As a result, correlation between ratios derived from different experiments will largely be determined by the genes that are most affected by measurement noise, that is, the genes with little or no expression. When correlations of ratios are presented, they should therefore be restricted to only those genes that show statistically significant expression differences in any of the individual experiments that are being compared.

Figure 3. Clustering and Functional Characterization

Gene expression data from a set of Saccharomyces cerevisiae mutants with single-gene deletions (as indicated at bottom) in various stages of clustering. (A) No clustering. (B) Hierarchical clustering on genes, using a standard correlation coefficient as distance measure. Genes with similar expression patterns across all experiments are now grouped together. The dendrogram on the left represents the similarities between expression profiles, where genes that are connected at a higher level in the tree have a more similar expression pattern. (C) Data clustered on both genes and experiments, now also showing which experiments are more similar to each other. Modified from van de Peppel et al. (2005). (D) Simplified gene ontology (GO) terms for the yeast HXK1 gene in each of the three distinct ontologies present in GO: biological process, cellular component, and molecular function. Each GO term will have an additional number of genes associated with it that have been annotated with the same term. As the specificity of each term increases from top to bottom in the schematic, the number of annotated genes decreases. GO classifications differ from standard hierarchies in that a single GO term can have multiple parents (broader classes to which the term refers). For example, the biological process term monosaccharide metabolism is preceded by both alcohol metabolism and cellular carbohydrate metabolism, as it is part of both.

Another way of comparing results between experiments is to determine the degree of overlap in differentially regulated genes, with the aim of identifying biologically significant relationships. Such overlaps are often presented in Venn diagrams, which use intersecting circles to show all logical combinations of gene lists (Figure 2C). Preferably, the circles and their overlapping areas will be scaled to indicate the number of genes involved. Overlaps between gene lists should be accompanied by p values derived from hypergeometric tests that indicate the chance of finding the observed overlap when randomly sampling similar numbers of genes from the genome (Khatri and Draghici, 2005). Standard p value cutoffs suffice to identify meaningful overlaps, provided that MTC methods are applied when more than one overlap is considered. Although there is little doubt that analysis of the overlap between gene lists from different experiments can be very powerful, there are potential pitfalls that should be avoided. Important determinants for the degree of overlap between data sets are the initial sizes of the gene lists identified in each experiment. These in turn depend greatly on the statistical power of the individual studies. Even when using the same statistical method to select for differentially expressed genes, the study that incorporates more replicates has increased power to detect small expression differences and is therefore likely to report more expression changes. In other words, such comparisons should ideally be performed on gene lists generated in precisely the same way, with the same number of replicates.
In the absence of reliable estimates of sensitivity and specificity for each experiment, care must be taken when using Venn diagrams to demonstrate biologically meaningful overlaps.

Reducing Complexity: Clustering
Selecting for differential expression often results in large numbers of genes. Without further processing and classification, it can be very difficult to distinguish higher-order structure in such data. To reduce complexity, it is possible to group together genes that show similar expression patterns over a series of conditions. This is commonly referred to as clustering, and encompasses a wide variety of different methods for distinguishing groups of coregulated genes. There are two main types of clustering, unsupervised and supervised. Unsupervised clustering requires no prior knowledge to discover structure and uses a predefined metric (e.g., correlation) to discover a natural grouping in a data set. In contrast, supervised methods need a set of samples with known classification that is used to train an algorithm to classify a new data set. The latter is often used in cancer research, for example to predict the clinical outcome of treating new patients based on the tumor expression profiles of patients for which the outcome is already known (for a review, see Ransohoff, 2004).

Among the best-known unsupervised approaches are hierarchical clustering methods (Eisen et al., 1998) that start by assigning every gene to its own cluster and then sequentially merge the most closely related clusters together based on a measure of similarity. This organizes the expression data in a tree-like structure where groups of genes with similar expression patterns are connected on the same branch (Figures 3A and 3B). Distinct groups of genes can then be selected by setting a cutoff at a certain branch length. It is also possible to cluster on both genes and experimental conditions (two-dimensional clustering), which makes it easier to interpret cluster diagrams (compare Figure 3B with Figure 3C). Applications include time-course analyses, quality control to see whether replicate hybridizations cluster together, disease profiling to determine new subclassifications, and determination of functional relationships between collections of gene deletions. Hierarchical clustering methods are well suited to identify genes that have closely related expression patterns, but it can be more difficult to interpret connections further down the tree, as the algorithms attempt to connect heterogeneous clusters based on averaged expression profiles. It is also important to realize that only the depth at which branches connect in the tree is a measure of their similarity and not the order of branches from left to right or top to bottom. Branches can be swiveled at the branch point, and there are therefore multiple ways of displaying the same tree. Care should be taken when interpreting cluster diagrams that have been visually arranged for maximal contrast between profiles of interest.
There are many alternative clustering methods (Slonim, 2002). Partitioning algorithms such as k-means divide the complete set of genes into smaller groups with similar expression patterns. For these methods, the number of clusters (k) is generally defined beforehand. K-means clustering should be used with care, as the number of initial groups has a large effect on the final partitioning. A reasonable estimate of the number of groups to use for the k-means algorithm can be obtained by first using other techniques, such as principal component analysis (PCA). PCA aims to reduce the complexity of a data set by focusing on the most important parts (i.e., the principal components) that capture the maximum amount of variation in a data set. As a result, a complex data set that contains multiple experiments can still be visualized in two- or three-dimensional plots that capture most of the underlying gene expression differences. A more sophisticated partitioning method uses self-organizing maps (SOM), which have the advantage that they also describe the relationships between clusters. Finally, other clustering methods, such as the signature algorithm (Ihmels et al., 2002), have the distinct advantage that they allow genes to be part of more than one cluster or even no cluster at all. This allows for a more natural representation of biological regulation where the same genes can be part of multiple regulatory modules controlled by different transcription factors.

Each clustering method has its advantages and drawbacks. The choice of clustering algorithm therefore depends on the specific application. When evaluating microarray results, attention should be given to the suitability of the chosen method and, importantly, the manner in which clusters have been selected (e.g., cutoff depth in hierarchical trees or number of groups for k-means clustering). Preferably, some measure of stability of the clusters should be reported. Such measures can be derived from counting the number of times a particular grouping is observed with different clustering methods or upon repeated clusterings with the same method after adding a small amount of noise or random subsampling of the data in each round (Kerr and Churchill, 2001; McShane et al., 2002).

Classification Using Gene Ontology
Clusters or other lists of differentially expressed genes can often be meaningless without insight into the biological process that is represented. Such insight can be obtained for each gene individually, by extracting available annotation information from literature resources or public repositories such as the ENSEMBL database. More commonly, gene ontology (GO) classifications are used (Ashburner et al., 2000). The GO project aims to provide a structured, controlled vocabulary that can be used to systematically describe the properties of gene products. This is vital to avoid possible confusion that can arise when different terms are used to describe similar characteristics (e.g., pore or channel).
It also addresses a potential bias when researchers focus on only one particular characteristic common to the genes of interest, without taking into account the frequency with which this characteristic occurs in the rest of the genome. GO consists of three ontologies that classify genes according to biological process, cellular component, and molecular function. As an example, the simplified hierarchical classification of the Saccharomyces cerevisiae HXK1 gene is given in Figure 3D. By examining which GO terms are frequently found in a gene cluster, it is possible to assign a putative function to that cluster. Such GO class assignments should be accompanied by a p value (e.g., derived from a hypergeometric test) to indicate the random chance of finding genes of that particular class together in a cluster. Care must be taken when interpreting these p values for GO classes that are associated with only a handful of genes (e.g., n < 5). Inclusion of even a single gene annotated for a rare GO term will result in that term being reported with high significance, and may even outrank more common GO terms that are present at much higher frequency in the gene list of interest. To address this problem, the reported probabilities should be adjusted in favor of GO terms supported by more genes, and readers are encouraged to use readily available tools that employ such measures (Dennis et al., 2003; Hosack et al., 2003). Finally, it can be difficult to adjust the p values for the number of GO categories that are tested (i.e., multiple-testing correction), as these are highly interrelated (Figure 3D) and genes may be assigned to multiple categories (Khatri and Draghici, 2005). Resampling approaches can be used to estimate false discovery rates for GO classifications (Khatri and Draghici, 2005; Zeeberg et al., 2003) and provide more realistic p values.
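The hypergeometric p value mentioned above, and the caution about rare GO terms, can be made concrete with a small calculation. The Python sketch below is illustrative: the function name `hypergeom_pvalue`, the genome size of 6,000 genes, the cluster size of 100, and the two GO terms are hypothetical, and no multiple-testing correction is applied. The same calculation applies to the overlap between two gene lists.

```python
from math import comb

def hypergeom_pvalue(genome_size, annotated, list_size, overlap):
    """Probability of at least `overlap` annotated genes in a random gene list.

    genome_size: total number of genes
    annotated:   genes in the genome that carry the GO term
    list_size:   genes in the cluster or list of interest
    overlap:     genes in the list that carry the GO term
    """
    upper = min(annotated, list_size)
    total = comb(genome_size, list_size)
    tail = sum(comb(annotated, k) * comb(genome_size - annotated, list_size - k)
               for k in range(overlap, upper + 1))
    return tail / total

# Hypothetical cluster of 100 genes from a genome of 6,000 genes.
p_common = hypergeom_pvalue(6000, 300, 100, 8)   # common term, 8 genes in cluster
p_rare = hypergeom_pvalue(6000, 2, 100, 1)       # rare term, a single gene in cluster

print(f"common term (300 genes annotated, 8 in cluster): p = {p_common:.3f}")
print(f"rare term (2 genes annotated, 1 in cluster):     p = {p_rare:.3f}")
```

In this example the rare term, supported by a single gene, receives a smaller p value than the common term supported by eight genes. This is the kind of ranking artifact that adjusted scores and resampling approaches are meant to counter.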
Despite the obvious power of such analyses, gene function annotations and hence GO classifications are currently incomplete, meaning that there will be an inevitable bias toward well-studied mechanisms. A lack of significant GO assignments should therefore never be taken as an absence of biological relevance, but rather as an opportunity to discover novel biological function.

Future Developments
Microarray technology has been available since the late 1990s, but the technology, applications, and analysis methods are still developing. One important change that will impact the methods described above is the rapidly increasing number of probes placed on microarrays. The increased resolution of the latest-generation microarrays now makes it possible to use probes that correspond to individual exons or even tile complete genomes (David et al., 2006; Stolc et al., 2004). Besides improving accuracy for detecting expression changes, the increased coverage enables detection of splice variants and, in the case of tiling arrays, aids the discovery of novel transcripts. In both cases, more sophisticated statistical methods are required to determine differential expression of individual genes. Although it is too early to accurately predict such developments, the advent of deep sequencing methods may even make many array-based methods obsolete in the future.
REFERENCES
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., et al. (2000). Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25-29.
Churchill, G.A. (2002). Fundamentals of experimental design for cDNA microarrays. Nat. Genet. 32 (Suppl), 490-495.
Cui, X., and Churchill, G.A. (2003). Statistical tests for differential expression in cDNA microarray experiments. Genome Biol. 4, 210.
David, L., Huber, W., Granovskaia, M., Toedling, J., Palm, C.J., Bofkin, L., Jones, T., Davis, R.W., and Steinmetz, L.M. (2006). A high-resolution map of transcription in the yeast genome. Proc. Natl. Acad. Sci. USA 103, 5320-5325.
Dennis, G., Jr., Sherman, B.T., Hosack, D.A., Yang, J., Gao, W., Lane, H.C., and Lempicki, R.A. (2003). DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 4, P3.
Draghici, S. (2003). Data Analysis Tools for DNA Microarrays (Princeton, NJ: Chapman & Hall/CRC).
Dudoit, S., Shaffer, J.P., and Boldrick, J.C. (2003). Multiple hypothesis testing in microarray experiments. Stat. Sci. 18, 71-103.
Eisen, M.B., Spellman, P.T., Brown, P.O., and Botstein, D. (1998). Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95, 14863-14868.
Hoheisel, J.D. (2006). Microarray technology: beyond transcript profiling and genotype analysis. Nat. Rev. Genet. 7, 200-210.
Hosack, D.A., Dennis, G.J., Sherman, B.T., Lane, H.C., and Lempicki, R.A. (2003). Identifying biological themes within lists of genes with EASE. Genome Biol. 4, R70.
Hughes, T.R., Marton, M.J., Jones, A.R., Roberts, C.J., Stoughton, R., Armour, C.D., Bennett, H.A., Coffey, E., Dai, H., He, Y.D., et al. (2000). Functional discovery via a compendium of expression profiles. Cell 102, 109-126.
Ihmels, J., Friedlander, G., Bergmann, S., Sarig, O., Ziv, Y., and Barkai, N. (2002). Revealing modular organization in the yeast transcriptional network. Nat. Genet. 31, 370-377.
Irizarry, R.A., Warren, D., Spencer, F., Kim, I.F., Biswal, S., Frank, B.C., Gabrielson, E., Garcia, J.G., Geoghegan, J., Germino, G., et al. (2005). Multiple-laboratory comparison of microarray platforms. Nat. Methods 2, 345-350.
Kerr, M.K., and Churchill, G.A. (2001). Bootstrapping cluster analysis: assessing the reliability of conclusions from microarray experiments. Proc. Natl. Acad. Sci. USA 98, 8961-8965.
Khatri, P., and Draghici, S. (2005). Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics 21, 3587-3595.
Mahadevappa, M., and Warrington, J.A. (1999). A high-density probe array sample preparation method using 10- to 100-fold fewer cells. Nat. Biotechnol. 17, 1134-1136.
McShane, L.M., Radmacher, M.D., Freidlin, B., Yu, R., Li, M.C., and Simon, R. (2002). Methods for assessing reproducibility of clustering patterns observed in analyses of microarray data. Bioinformatics 18, 1462-1469.
Quackenbush, J. (2002). Microarray data normalization and transformation. Nat. Genet. 32 (Suppl), 496-501.
Ransohoff, D.F. (2004). Rules of evidence for cancer molecular-marker discovery and validation. Nat. Rev. Cancer 4, 309-314.
Slonim, D.K. (2002). From patterns to pathways: gene expression data analysis comes of age. Nat. Genet. 32 (Suppl), 502-508.
Stolc, V., Gauhar, Z., Mason, C., Halasz, G., van Batenburg, M.F., Rifkin, S.A., Hua, S., Herreman, T., Tongprasit, W., Barbano, P.E., et al. (2004). A gene expression map for the euchromatic genome of Drosophila melanogaster. Science 306, 655-660.
van Bakel, H., and Holstege, F.C. (2004). In control: systematic assessment of microarray performance. EMBO Rep. 5, 964-969.
van de Peppel, J., Kettelarij, N., van Bakel, H., Kockelkorn, T.T.J.P., van Leenen, D., and Holstege, F.C. (2005). Mediator expression profiling epistasis reveals a signal transduction pathway with antagonistic submodules and highly specific downstream targets. Mol. Cell 19, 511-522.
Yang, Y.H., Dudoit, S., Luu, P., Lin, D.M., Peng, V., Ngai, J., and Speed, T.P. (2002). Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res. 30, e15.
Zeeberg, B.R., Feng, W., Wang, G., Wang, M.D., Fojo, A.T., Sunshine, M., Narasimhan, S., Kane, D.W., Reinhold, W.C., Lababidi, S., et al. (2003). GoMiner: a resource for biological interpretation of genomic and proteomic data. Genome Biol. 4, R28.

Please cite this article as:


van Bakel, H., and Holstege, F.C.P. (2007). A Tutorial for DNA Microarray Expression Profiling. In Evaluating Techniques in Biochemical Research, D. Zuk, ed. (Cambridge, MA: Cell Press), http://www.cellpress.com/misc/page?page=ETBR.
