
LETTERS

An expression atlas of rice mRNAs and small RNAs


© 2007 Nature Publishing Group http://www.nature.com/naturebiotechnology

Kan Nobuta1,2, R C Venu4, Cheng Lu1,2, André Beló2, Kalyan Vemaraju1, Karthik Kulkarni1, Wenzhong Wang1,
Manoj Pillay1, Pamela J Green1–3, Guo-liang Wang4 & Blake C Meyers1,2

Identification of all expressed transcripts in a sequenced genome is essential both for genome analysis and for realization of the goals of systems biology. We used the transcriptional profiling technology called ‘massively parallel signature sequencing’ to develop a comprehensive expression atlas of rice (Oryza sativa cv Nipponbare). We sequenced 46,971,553 mRNA transcripts from 22 libraries, and 2,953,855 small RNAs from 3 libraries. The data demonstrate widespread transcription throughout the genome, including sense expression of at least 25,500 annotated genes and antisense expression of nearly 9,000 annotated genes. An additional set of ~15,000 mRNA signatures mapped to unannotated genomic regions. The majority of the small RNA data represented lower abundance short interfering RNAs that match repetitive sequences, intergenic regions and genes. Among these, numerous clusters of highly regulated small RNAs were readily observed. We developed a genome browser (http://mpss.udel.edu/rice) for public access to the transcriptional profiling data for this important crop.

Because of its scientific, economic and cultural importance, the sequencing of the rice (Oryza sativa ssp. japonica cv Nipponbare) genome1 represents a milestone in plant biology. The recent annotation of the rice genome (The Institute for Genomic Research, TIGR version 4.0) includes 55,890 features that represent 42,653 predicted protein-coding genes and 13,237 transposable elements2. Experimental evidence from full-length cDNA and expressed sequence tags (ESTs) is critical for genome analysis3, yet these data are incomplete, in that they are subsaturating and miss nonpolyadenylated transcripts such as small RNAs. Small RNAs of 21–24 nucleotides are well characterized in Arabidopsis thaliana, but little is known about the diversity of these molecules in other plant species. Several categories are known, including short interfering RNAs (siRNAs) and microRNAs (miRNAs), both of which silence genes by targeting complementary mRNAs for degradation4. siRNAs can also trigger transcriptional silencing by guiding nuclear complexes that target either histone modifications or DNA methylation or both5. Small RNAs are best discovered and measured by deep sequencing approaches that have high sensitivity and specificity.

One advantage of using a signature-based expression profiling method such as massively parallel signature sequencing (MPSS) to improve genome annotation is its potential to characterize previously unknown transcripts. Transcriptional analyses of the A. thaliana genome with MPSS have identified extensive alternative polyadenylation, as well as large numbers of natural antisense transcripts, novel transcripts from unannotated genomic regions, noncoding RNAs and small RNAs6,7. Sequence-based data have very high specificity and accuracy for assessing gene activity, because they are not subject to cross-hybridization. In addition to enhancing genome annotation, MPSS data provide quantitative expression information6.

To clarify the complexity of polyadenylated transcripts in rice, we sequenced 22 mRNA libraries using MPSS (Table 1, Supplementary Table 1 and Supplementary Data online). These libraries, from 12 diverse untreated tissues (with some replicates) and six abiotic stress treatments, included 46,971,553 transcripts. The nonredundant set comprised 249,990 distinct sequences. The expression values of these signatures ranged over four orders of magnitude, with the majority found in the range of 1 to 100 transcripts per million (TPM) (Supplementary Table 1 online). Filtering to capture the most ‘reliable’ signatures removed most erroneous sequences6, leaving a set of 46,251,966 signatures (Table 1 and Supplementary Fig. 1 online); these comprised 121,581 distinct signatures. This represents the deepest reported set of plant transcriptional data. This sampling depth is high enough that new transcripts are infrequently discovered (Fig. 1a).

Signatures matching only once in the genome (hits = 1) provide a conservative assessment of transcript diversity and active genes, whereas including duplicated signatures (hits > 0) provides an upper boundary (Supplementary Table 2 online). The majority of the signatures (~75%) in the libraries matched sense-strand transcripts of nearly half of the annotated rice genes (20,821 of 42,653,

Table 1 Summary statistics for rice MPSS libraries

Category                                      mRNA total    Small RNA total
Libraries                                     22            3
Signatures sequenced (a)                      46,971,553    2,953,855
Distinct signatures (b)                       249,990       284,301
Distinct genome-matched signatures (c)        81,961        221,592
Abundance of genome-matched signatures (d)    46,251,966    1,948,368

(a) All sequencing reactions combined for each type of library. (b) The nonredundant set of either the complete set of MPSS signatures or those that match to the genome. (c) For the genome-matched mRNA signatures, only those that passed the reliability filter are included. (d) The sum of the observed frequency for all distinct, genome-matched signatures.
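
The TPM scale used for the expression values, and the genome-matched fractions implied by the small RNA column of Table 1, can be illustrated with a short calculation. This is a minimal sketch that assumes simple proportional scaling of raw counts within one library; the function name `to_tpm` and the toy signature labels are hypothetical, and the normalization actually applied to the MPSS data is described in the cited methods rather than here.

```python
def to_tpm(raw_counts):
    """Scale raw signature counts from one library to transcripts per million.

    Assumption: plain proportional scaling; not necessarily the exact
    normalization used for the published MPSS data.
    """
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("library contains no signatures")
    return {sig: count * 1_000_000 / total for sig, count in raw_counts.items()}


# Hypothetical toy library: labels and counts are invented for illustration.
library = {"sig_A": 50, "sig_B": 150, "sig_C": 800}
tpm = to_tpm(library)

# Fractions implied by the small RNA column of Table 1.
distinct_matched_fraction = 221_592 / 284_301        # distinct signatures matching the genome
abundance_matched_fraction = 1_948_368 / 2_953_855   # sequenced small RNAs matching the genome

print(tpm)
print(f"{distinct_matched_fraction:.1%} of distinct small RNA signatures match the genome")
print(f"{abundance_matched_fraction:.1%} of sequenced small RNAs match the genome")
```

The first fraction rounds to the 78% quoted in the text for distinct small RNAs; the second shows that the genome-matched share is lower when measured by total abundance.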

1Delaware Biotechnology Institute, 2Department of Plant and Soil Sciences, 3College of Marine and Earth Studies, University of Delaware, Newark, Delaware 19711.
4Department of Plant Pathology, The Ohio State University, Columbus, OH 43210. Correspondence should be addressed to B.C.M. (meyers@dbi.udel.edu).
Received 26 October 2006; accepted 25 January 2007; published online 11 March 2007; corrected online 23 March 2007; doi:10.1038/nbt1291

NATURE BIOTECHNOLOGY VOLUME 25 NUMBER 4 APRIL 2007 473



[Figure 1 graphic not reproduced. Panel a plots the number of distinct signatures against the total number of signatures sampled (in millions) for all signatures and for reliable signatures, with endpoint slopes of 487 and 34 per 1,000,000 marked. Panel b compares genes absent by MPSS (21,832) with genes present by MPSS (20,821). Panel c shows a heatmap alongside example genes Os04g32650, Os09g35990, Os10g37190 and Os03g18910, with signature abundances in the NYL and NSL libraries given in TPM.]

Figure 1 Genome-wide transcriptional analysis by mRNA MPSS. (a) Discovery rates for new rice transcripts based on the 46,971,553 mRNA MPSS signatures, resampled 1,000 times with replacement. Each signature was weighted by abundance. The gold line indicates 249,990 distinct, unfiltered signatures, whereas the green line indicates 121,581 reliable signatures. Each slope was calculated at the endpoint, indicating the final discovery rate for new transcripts after sampling 50 million transcripts. Sampling of additional tissues or treatments may increase this rate. (b) mRNA MPSS signatures representing sense-strand transcripts (classes 1, 2, 5 and 7), compared to annotated rice genes and excluding transposons and small genes (<50 amino acids). Most genes with support by MPSS have additional support from full-length cDNA data and high similarity to A. thaliana genes. Total gene numbers are indicated in the center (white sections), and genes without cDNA support are indicated by the blue sections labeled “others.” Numbers in parentheses are potential pack-MULEs found in each category. HH and LH, high and low homology to A. thaliana, respectively. Using signatures with hits > 1 slightly increases MPSS supported genes (Supplementary Fig. 14 online). (c) Each row in the heatmap represents a different gene preselected for evidence of a substantial number of alternative transcripts. Yellow and blue are upregulated and downregulated transcripts in salt-treated (NSL) versus untreated young leaf (NYL), respectively. Green represents no difference in expression. The “All” column is the NSL/NYL ratio (the expression ratio of all salt-treated to all young leaf transcripts), without considering alternatively terminated transcripts. Column 1 is the expression level of the 3′-most MPSS signature, the longest transcript, whereas column 5 is the shortest transcript, and 5′-most MPSS signature. Example genes are shown to the right, with the different transcripts and expression in the two libraries indicated. Red and pink boxes represent coding regions and untranslated regions, respectively, and the black triangles indicate the five MPSS signatures that correspond to the genes.
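
The resampling behind the discovery-rate curves in panel (a) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it assumes that each replicate draws signatures with replacement in proportion to their abundance, counts the distinct signatures seen at a few sampling depths, and averages across replicates. The function names, the toy abundance table and the use of the last two checkpoints to estimate the endpoint slope are all assumptions.

```python
import random


def discovery_curve(abundance, checkpoints, replicates=10, seed=0):
    """Mean number of distinct signatures seen at each sampling depth.

    abundance: dict mapping signature -> observed abundance (used as weight).
    checkpoints: sampling depths at which distinct signatures are counted.
    """
    rng = random.Random(seed)
    signatures = list(abundance)
    weights = [abundance[s] for s in signatures]
    depth = max(checkpoints)
    wanted = set(checkpoints)
    totals = {c: 0 for c in checkpoints}
    for _ in range(replicates):
        # Abundance-weighted sampling with replacement.
        draws = rng.choices(signatures, weights=weights, k=depth)
        seen = set()
        for i, sig in enumerate(draws, start=1):
            seen.add(sig)
            if i in wanted:
                totals[i] += len(seen)
    return {c: totals[c] / replicates for c in checkpoints}


def endpoint_slope(curve):
    """New distinct signatures per additional draw, from the last two depths."""
    x0, x1 = sorted(curve)[-2:]
    return (curve[x1] - curve[x0]) / (x1 - x0)


# Hypothetical skewed abundance table: a few common and many rare signatures.
toy_abundance = {f"sig{i}": 1000 // (i + 1) for i in range(200)}
curve = discovery_curve(toy_abundance, checkpoints=[500, 1000, 2000, 4000])
slope = endpoint_slope(curve)
print(curve)
print(f"endpoint discovery rate: {slope * 1_000_000:.0f} new signatures per million draws")
```

A flattening curve, and therefore a small endpoint slope, is what indicates that additional sequencing of the same libraries would rarely reveal new signatures.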

Fig. 1b and Supplementary Table 2a online). This number is comparable to the number of gene annotations supported by EST or full-length cDNA data (21,328) (Fig. 1b). Of genes with MPSS support, 84% are supported by EST or full-length cDNA data, and only 18% of those lacking MPSS support have ESTs or full-length cDNAs (Fig. 1b). Because the ESTs or full-length cDNAs are derived from >60 libraries (versus our 22 libraries), we concluded that substantial sampling depth may substitute for tissue or treatment diversity. Among the 20,821 genes with MPSS support, 87% had high homology to A. thaliana genes, whereas only 39% of the genes without MPSS support fall into this category (Fig. 1b). Because numerous MPSS-supported low-homology genes (1,873) match ESTs or full-length cDNAs, many validated rice genes apparently lack orthologs in A. thaliana. In A. thaliana, 17 MPSS libraries confirmed expression of 21,193 genes. By comparison to our rice data, these data suggest that the two plant genomes have similar transcriptional complexities, although the respective genome size and gene number of rice are 3 times and 1.5 times those of A. thaliana.

We expected that similar tissues and treatments would share sets of expressed genes. Hierarchical clustering of the libraries was performed using the 20,821 genes with MPSS support, and the pollen library was an outgroup compared to all other libraries (Supplementary Fig. 2 online). Only one-third of the pollen-specific genes had A. thaliana orthologs. The unique transcriptional pattern of the pollen library may result from its haploid nature; the number of library-specific genes was the highest in pollen (392), including an unusually large number of transposable elements (Supplementary Fig. 3 online).

To examine the association between chromosomal architecture and transcriptional activity, we compared gene density and expression levels for each chromosome (Supplementary Fig. 4 online). Expression activity was observed around the gene-dense regions, and the MPSS data are consistent with studies of centromeres 3, 4 and 8 indicating that genes in these regions are active8–10. Some rice centromeres, like Cen3, may be evolving from genic regions to repeat-based mature centromeres9.

Genic positions of MPSS signatures identify transcripts resulting from alternative polyadenylation and 3′-splicing, as well as those which are antisense transcripts. Among the 20,821 expressed rice genes, more than half (11,941) had multiple sense signatures that represent alternative transcripts. These transcripts can show marked differences in their expression levels (Fig. 1c). This type of analysis provides a novel view of complex gene expression events that may be important in better understanding gene function. We also identified 11,001 antisense signatures corresponding to 8,023 annotated genes (Supplementary Table 2a online). Some natural antisense transcripts are coexpressed and induce alternative splicing, some induce dsRNA cleavage and show reciprocal expression patterns11, and some even generate regulatory small RNAs12. MPSS analysis identified natural antisense transcripts for many rice genes, often with highly specific expression patterns. A comparison of sense and antisense expression levels demonstrates the difficulty of generalizing the function of natural antisense transcripts (Supplementary Fig. 5 online), which may require gene-by-gene analyses to discern their complex regulatory mechanism.

The 22 libraries included 13,461 intergenic signatures (Supplementary Table 2a online) that could be (i) misannotated 3′-untranslated regions of known genes, (ii) novel genes or (iii) noncoding RNAs including miRNAs. Sixty-three intergenic mRNA signatures matched 30 known miRNA genes (precursors) and 68 noncoding RNA genes from the registry13 (Supplementary Table 3 online). Consistent with previous reports7, many noncoding RNAs were expressed in the stigma-ovary library (Supplementary Table 3b online). Like


[Figure 2 graphic not reproduced. Panel a plots the number of distinct signatures against the number of hits to the genome for rice and Arabidopsis. Panel b plots the count of genes, transposable elements (TEs) or intergenic regions (IGRs) against the number of distinct signatures. Panel c shows small RNA abundance (TPQ), mRNA abundance (TPM) and genes and repeats (bp) along chromosome 8. Panel d plots the number of signatures matching the annotated hairpin of the pre-miRNA for the FLR, SNU and STM libraries in abundance classes of 1–25, 26–100 and >100 TPQ. Panel e shows Os01g20930, Os02g49580 and Os10g37080 with their flanking TIRs.]

Figure 2 Deep sequencing of rice small RNAs by MPSS. (a) Match frequencies of small RNAs in rice versus those of A. thaliana. We determined the number of genomic matches (‘hits’) to their respective genomes for 149,978 and 56,920 distinct small RNA MPSS signatures from rice and A. thaliana inflorescence libraries. The A. thaliana data have been described elsewhere7. (b) The number of small RNAs matching individual protein-coding genes, transposable elements and intergenic regions. (c) Distributions of small RNAs, mRNA expression and genes or repeats across rice chromosome 8, plotted as moving averages of five adjacent bins of 100 kb. The light-green, pink, and light-blue lines (top) are small RNAs in inflorescence, stem and young leaf, respectively. mRNA levels in the same tissues are shown in green, red and blue (middle). Black and gray lines (bottom) are densities of genes and repeats. Blue vertical shading indicates the approximate position of the centromere. (d) Abundance values for small RNA signatures matching to the hairpin of the 171 genome-mapped, known rice miRNAs. The y-axis indicates the number of distinct signatures in each abundance class. (e) Small RNAs map to the terminal inverted repeats (TIRs, black arrowheads) of pack-MULE elements described elsewhere14,15. Yellow shading indicates transposon-like sequences; orange indicates inverted repeats; black triangles are small RNAs; red and blue boxes are annotated exons. Additional features are as described in Supplementary Figure 7 online.
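
The smoothing described for panel (c), moving averages of five adjacent 100-kb bins, can be sketched as below. This is a minimal illustration under stated assumptions: positions are 0-based chromosome coordinates, abundances are summed per bin, and windows are simply truncated at the chromosome ends. The caption does not say how edges were handled, and the function names and toy coordinates are hypothetical.

```python
def bin_counts(hits, chrom_length, bin_size=100_000):
    """Sum abundances of mapped signatures into fixed-size bins.

    hits: iterable of (position, abundance) pairs, 0-based positions.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    bins = [0.0] * n_bins
    for position, abundance in hits:
        bins[position // bin_size] += abundance
    return bins


def moving_average(values, window=5):
    """Centered moving average; windows are truncated at the ends (assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical small RNA hits on a 500-kb stretch: (position, abundance).
toy_hits = [(50_000, 3), (120_000, 2), (130_000, 5), (450_000, 1)]
bins = bin_counts(toy_hits, chrom_length=500_000)
print(bins)
print(moving_average(bins))
```

Averaging over five bins trades positional resolution for a smoother profile, which makes broad features such as pericentromeric enrichment easier to see at chromosome scale.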

‘housekeeping’ genes, many intergenic signatures were expressed in all 22 libraries and were supported by EST or full-length cDNA data. Clearly there are a substantial number of unidentified genes in rice. Some of these are expressed across a range of tissues and developmental stages, whereas others may encode small peptides or generate uncharacterized, regulatory small RNAs such as miRNAs or siRNAs. However, intergenic transcripts were less broadly expressed, with each intergenic transcript expressed in an average of 6.65 libraries versus 14.42 for gene-associated signatures, and much more weakly expressed than genic transcripts (Supplementary Fig. 4 online). This low expression contrasts with the prevailing abundance of sense signatures and suggests that these intergenic transcripts may have distinct characteristics unlike those of normal, protein-coding genes.

To investigate the complexity of small RNAs in rice, we next generated MPSS small RNA libraries of rice inflorescence, stem, and seedling leaf. Of the 284,301 distinct small RNAs, 78% matched the rice genome, representing 1,948,368 of 2,953,855 total sequences (Table 1). Like A. thaliana7, the inflorescence library was proportionally more complex, perhaps reflecting stronger germline silencing (Supplementary Table 4 online). More than half of the distinct signatures matched unique sites in the genome, but there were many more highly duplicated signatures in the rice genome than were observed in A. thaliana (Fig. 2a). Unlike small RNAs from A. thaliana7, those from rice are not as concentrated across the pericentromeric regions and are instead more widely distributed on the chromosomes (Supplementary Fig. 6 online). The two completely sequenced rice centromeres are characterized by small RNAs corresponding to the centromeric repeats (Supplementary Fig. 7 online).

Because the distributions of A. thaliana small RNAs are strongly correlated with those of repetitive sequences, we examined the relationship between rice small RNAs, genes and repeats. Rice small RNAs were strongly associated with intergenic regions (Supplementary Fig. 8 online), and some chromosomes demonstrated a pericentromeric concentration when we examined the small RNA abundance in addition to the distribution (Fig. 2 and Supplementary Fig. 9 online). Most chromosomal regions in rice show a range of patterns of small RNAs, consistent with a dispersed arrangement of genes, transposons and miniature inverted repeat transposable elements (Supplementary Fig. 10 online). Notably, abrupt transitions between concentrated siRNA clusters and active genes were often observed (Supplementary Fig. 11 online), suggesting a localized shift from silenced to active chromatin within a very short physical distance. This type of abrupt transition is supported by methylation profiling studies that can identify heterochromatin via biochemical (rather than cytological) means14,15. These studies have indicated a strong colocalization of siRNA clusters and DNA methylation, even within cytologically defined euchromatin. Many rice small RNAs were derived from transposons or retrotransposons, but many also matched unannotated intergenic regions (Supplementary Table 5 online). At least 12,766 of the 13,237 annotated retrotransposon or transposon-related sequences in the rice genome had matches to small RNAs.


As we have demonstrated elsewhere6,7, repetitive sources of siRNAs produce numerous distributed small RNAs, whereas miRNAs produced small focused clusters of specific sequences. Rice genes matched distinct small RNAs at a rate at least as high as repeats and intergenic regions (Fig. 2b and Supplementary Fig. 8 online), often resulting from miniature inverted repeat transposable elements or other small repeats embedded within an intron. As in A. thaliana, tandem repeats and inverted repeats are rich sources of small RNAs (Supplementary Fig. 12 online). From all three libraries, there were many clusters of small RNAs, particularly in intergenic, unannotated regions of the rice genome (Supplementary Table 6 online). This indicates that the rice genome is much richer in silenced sequences compared with A. thaliana, consistent with the higher degree of repetitive DNA. Because small RNAs may also interact with imperfectly matched targets, their biological effect may be far more substantial than we have indicated.

The miRNA registry includes 182 rice miRNAs13, of which 171 were mapped in the genome and 130 were expressed, accounting for 8.8%, 36.6% and 12.6% of the inflorescence, seedling and stem small RNAs. These percentages are much lower than those for A. thaliana, which is consistent with a substantial abundance of repeat-associated siRNAs in the more complex rice genome6,7. The lack of small RNA biogenesis mutants in rice hinders our ability to distinguish siRNAs and miRNAs. However, miRNAs are abundant, consistently expressed and conserved, and we identified numerous conserved and consistently expressed small RNAs (Supplementary Fig. 13 online), suggesting that these data include many novel miRNAs.

With the three libraries, we examined the developmental regulation of rice small RNAs. The chromosomal distributions of small RNAs suggested a high degree of similarity between young leaf and the stem small RNAs (Fig. 2c and Supplementary Fig. 9 online), both of which differed in comparison to the inflorescence data. However, the stem library produced the greatest number of small RNA clusters or genes that were substantially much more abundant in only one of the three libraries (Fig. 2d and Supplementary Table 7 online), suggesting an unusual degree of small RNA regulation in the stem. This is the first report of small RNAs from plant stems, and it is possible that some of these small RNAs are involved in signal transmission between leaves and inflorescences16. In contrast to A. thaliana6,7, many small RNA clusters were substantially reduced in the inflorescence as compared with amounts in seedlings and the stem (Supplementary Table 7 online).

The rice genome contains many gene fragments (‘pack-MULEs’) generated by transposons17. Our mRNA MPSS data identified 17,886 genes without expression data (Fig. 1b), >90% of which have no known function, suggesting that many of these are inactive or are pseudogenes. We compared the mRNA and small RNA MPSS data to predicted rice pack-MULEs17, using 8,271 previously identified MULEs18. This identified 1,358 potential pack-MULEs among the 17,886 unexpressed genes. These elements represent just one of several classes of gene-shuffling transposable elements19–21, so other inactive genes are likely to be transposed fragments. Small RNAs matched to many of the terminal inverted repeats of these pack-MULEs but infrequently to the internal gene fragments (Fig. 2e). The combined mRNA and small RNA datasets may offer an experimentally based system for pack-MULE identification in rice and other plant genomes.

Taking into account annotated, expressed genes as well as small RNAs, a very high proportion of the rice genome is actively transcribed. Although there are thousands of annotated genes lacking both mRNA and small RNA expression data, detection of their expression may require sampling of highly specialized tissues, cell types or treatments. The rice small RNA data indicate an extensive and complex repertoire of such molecules. This vastly exceeds that of A. thaliana, consistent with increased genome sizes correlating with an increased complexity of small RNAs. This suggests that larger plant genomes, such as those of most crops, will require deeper sequencing. Additional complexity may be found in analyses of nonpolyadenylated transcripts22. A comprehensive understanding of the network of gene expression events in rice or other crops will require concerted efforts such as ours to characterize the activities and functions of a comprehensive catalog of genomic components.

METHODS
High- and low-homology rice genes. The high- and low-homology rice genes were identified according to their similarity to A. thaliana genes using a threshold of a BLASTP e-value <1.0e-7.

MULE analysis. The rice MULE data have been previously described18 and were downloaded from http://www.genome.org/. Because the International Rice Genome Sequencing Project annotation version 2.0 was used for their analyses, we remapped all the MULEs onto TIGR4.0. After mapping these sequences, potential pack-MULEs (genes flanked on both sides by MULEs) with or without MPSS signatures were identified. In addition, we also used these sequences to identify pack-MULEs associated with the intergenic MPSS signatures, because these could be pack-MULEs, which correspond to the unannotated transcripts.

Analysis of alternative termination. As an example of the differential expression of alternative transcripts, we focused on two libraries (NYL and NSL) to examine the effect of salt stress. The genes with alternative termination sites were identified from the MPSS data (multiple sense-strand MPSS signatures associated with a single gene), and from this set, those genes were selected that had at least two sense signatures demonstrating tenfold higher levels of expression in one library than the other. For each gene and for each library, the expression levels and the sum of the expression levels were recorded for five MPSS signatures located at 3′ end of each gene. The expression level of the NSL signatures was divided by that of the corresponding NYL signatures, with the resulting values log-transformed and loaded into R to generate a heatmap.

Additional methods. Detailed methods are available in Supplementary Methods online.

Accession codes. Gene Expression Omnibus (GEO): series identifier GSE7107, platform identifiers GPL3777 and GPL3776 for mRNA and small RNA samples, respectively; sample identifiers, GSM169562, GSM169564, GSM169566, GSM169567, GSM169568, GSM169569, GSM169570, GSM170900, GSM170901, GSM170902, GSM170903, GSM170904, GSM170905, GSM170906, GSM170907, GSM170909, GSM170912, GSM170912, GSM170914, GSM170917, GSM170919 and GSM170921. The raw and normalized MPSS data are also available at http://mpss.udel.edu/rice and this website allows users to query these data based on physical location, gene identifiers or by sequence.

Note: Supplementary information is available on the Nature Biotechnology website.

ACKNOWLEDGMENTS
We are grateful to C. Haudenschild, TIGR’s rice annotation project, S. Singh Tej, M. Nakano, R. German, A. Hetawal, R. Gupta and S. Kaushik. This work was supported by US National Science Foundation awards 0321437 (B.C.M. and G.-l.W.) and 0439186 (P.J.G. and B.C.M.), and US Department of Agriculture 2005-35064-15326 (B.C.M. and P.J.G.).

AUTHOR CONTRIBUTIONS
K.N. performed research, analyzed data and wrote the manuscript, R.C.V. and C.L. performed laboratory research and provided useful discussions, A.B., K.V., K.K., W.W. and M.P. performed computational research; P.J.G. and G.-l.W. designed research and wrote manuscript; B.C.M. designed research, analyzed data, and coordinated and wrote the manuscript.
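
The gene selection and ratio steps described under "Analysis of alternative termination" in Methods can be sketched in a few lines. This is an illustration only, not the authors' pipeline (which ended with a heatmap drawn in R). Several choices here are assumptions that the text does not specify: a pseudocount of 1 TPM to avoid division by zero, a base-2 logarithm, and the reading that a signature counts as differential when it differs at least tenfold in either direction. Gene names and abundances are hypothetical.

```python
import math

PSEUDOCOUNT = 1.0  # assumption: avoids division by zero for 0-TPM signatures


def fold_change(nsl, nyl):
    """NSL/NYL expression ratio with a pseudocount added to both libraries."""
    return (nsl + PSEUDOCOUNT) / (nyl + PSEUDOCOUNT)


def is_tenfold(nsl, nyl):
    """True if expression differs at least tenfold in either direction."""
    ratio = fold_change(nsl, nyl)
    return ratio >= 10 or ratio <= 0.1


def select_genes(genes):
    """Keep genes with at least two signatures showing a tenfold difference."""
    return {
        gene: sigs
        for gene, sigs in genes.items()
        if sum(is_tenfold(s["NSL"], s["NYL"]) for s in sigs) >= 2
    }


def heatmap_row(sigs):
    """Log2 ratio for the summed signatures, then one value per signature."""
    total_nsl = sum(s["NSL"] for s in sigs)
    total_nyl = sum(s["NYL"] for s in sigs)
    row = [math.log2(fold_change(total_nsl, total_nyl))]
    row += [math.log2(fold_change(s["NSL"], s["NYL"])) for s in sigs]
    return row


# Hypothetical genes, each with five 3'-end signatures (abundances in TPM).
genes = {
    "gene_A": [{"NYL": 37, "NSL": 0}, {"NYL": 0, "NSL": 0}, {"NYL": 0, "NSL": 0},
               {"NYL": 0, "NSL": 0}, {"NYL": 0, "NSL": 63}],
    "gene_B": [{"NYL": 20, "NSL": 25}, {"NYL": 0, "NSL": 0}, {"NYL": 0, "NSL": 0},
               {"NYL": 0, "NSL": 0}, {"NYL": 0, "NSL": 0}],
}
selected = select_genes(genes)
rows = {gene: heatmap_row(sigs) for gene, sigs in selected.items()}
print(rows)
```

In the toy data, gene_A switches between two signatures under salt treatment, so its summed ratio is modest even though two individual signatures change strongly, which is the pattern the heatmap is meant to expose.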


COMPETING INTERESTS STATEMENT
The authors declare no competing financial interests.

Published online at http://www.nature.com/naturebiotechnology
Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions

1. International Rice Genome Sequencing Project. The map-based sequence of the rice genome. Nature 436, 793–800 (2005).
2. Yuan, Q. et al. The Institute for Genomic Research Osa1 rice genome annotation database. Plant Physiol. 138, 18–26 (2005).
3. Kikuchi, S. et al. Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science 301, 376–379 (2003).
4. Bartel, D.P. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116, 281–297 (2004).
5. Verdel, A. et al. RNAi-mediated targeting of heterochromatin by the RITS complex. Science 303, 672–676 (2004).
6. Meyers, B.C. et al. Analysis of the transcriptional complexity of Arabidopsis thaliana by massively parallel signature sequencing. Nat. Biotechnol. 22, 1006–1011 (2004).
7. Lu, C. et al. Elucidation of the small RNA component of the transcriptome. Science 309, 1567–1569 (2005).
8. Cheng, Z. et al. Functional rice centromeres are marked by a satellite repeat and a centromere-specific retrotransposon. Plant Cell 14, 1691–1704 (2002).
9. Yan, H. et al. Genomic and genetic characterization of rice cen3 reveals extensive transcription and evolutionary implications of complex centromere. Plant Cell 18, 3227–3238 (2006).
10. Yan, H. et al. Transcription and histone modifications in the recombination-free region spanning a rice centromere. Plant Cell 17, 3227–3238 (2005).
11. Jen, C.H., Michalopoulos, I., Westhead, D. & Meyer, P. Natural antisense transcripts with coding capacity in Arabidopsis may have a regulatory role that is not linked to double-stranded RNA degradation. Genome Biol. 6, R51 (2005).
12. Borsani, O., Zhu, J., Verslues, P.E., Sunkar, R. & Zhu, J.K. Endogenous siRNAs derived from a pair of natural cis-antisense transcripts regulate salt tolerance in Arabidopsis. Cell 123, 1279–1291 (2005).
13. Griffiths-Jones, S. The microRNA Registry. Nucleic Acids Res. 32, D109–D111 (2004).
14. Lippman, Z. et al. Role of transposable elements in heterochromatin and epigenetic control. Nature 430, 471–476 (2004).
15. Zhang, X. et al. Genome-wide high-resolution mapping and functional analysis of DNA methylation in Arabidopsis. Cell 126, 1189–1201 (2006).
16. Yoo, B.-C. et al. A systemic small RNA signaling system in plants. Plant Cell 16, 1979–2000 (2004).
17. Jiang, N., Bao, Z., Zhang, X., Eddy, S.R. & Wessler, S.R. Pack-MULE transposable elements mediate gene evolution in plants. Nature 431, 569–573 (2004).
18. Juretic, N., Hoen, D.R., Huynh, M.L., Harrison, P.M. & Bureau, T.E. The evolutionary fate of MULE-mediated duplications of host gene fragments in rice. Genome Res. 15, 1292–1297 (2005).
19. Morgante, M. et al. Gene duplication and exon shuffling by helitron-like transposons generate intraspecies diversity in maize. Nat. Genet. 37, 997–1002 (2005).
20. Britten, R. Transposable elements have contributed to thousands of human proteins. Proc. Natl. Acad. Sci. USA 103, 1798–1803 (2006).
21. Lipatov, M., Lenkov, K., Petrov, D. & Bergman, C. Paucity of chimeric gene-transposable element transcripts in the Drosophila melanogaster genome. BMC Biol. 3, 24 (2005).
22. Cheng, J. et al. Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science 308, 1149–1154 (2005).
