A New Principle of RNA Folding Based on Pseudoknotting (1985), Pleij CW
Nucleic
Department of Biochemistry, State University of Leiden, Wassenaarseweg 64, 2333 AL Leiden, The
Netherlands
Received 3 December 1984; Revised and Accepted 5 February 1985
ABSTRACT Tertiary interactions involving hairpin or interior loops of RNA can lead to extended quasi-continuous double helical stem regions, consisting of coaxially stacked segments of duplex RNA, bridged by single-stranded connections. This type of compact folding plays a role in various strategic regions of RNA molecules. Their role in ribosome functioning, RNA splicing and recognition of tRNA-like structures is discussed.
INTRODUCTION Insight into the three-dimensional structure of single-stranded RNA is essential for a full understanding of its functions. It is striking therefore how little we know about the specific folding of these biopolymers in space. Except for a few tRNAs (1-5) no high resolution structural information is available. Even for 5S RNA, a molecule whose structure has been studied in great detail (6), no definitive model for its secondary structure exists, let alone its tertiary structure. In many cases, the available information has led to proposals for the secondary structure in which single-stranded regions alternate with double-stranded regions. The latter contain the classical Watson-Crick G·C and A·U base pairs and sometimes G·U base pairs. It is increasingly becoming clear that other combinations also have to be considered (7). In such secondary structure models, characteristic features like hairpin, bulge, interior and bifurcation loops can be discerned beside the normal double-stranded stem regions. Frequently, with little experimental data at hand, these proposals are based on computer-aided predictions (8,9), which unfortunately have only limited value (10). In general, the folding of RNA molecules will reflect a delicate balance between base-base interactions and electrostatic repulsion of the negatively charged phosphate residues. In these foldings an important contribution to the stability of the RNA structure is furnished by the stacking interactions of the base residues (11). Under high salt conditions with sufficient screen...
© IRL Press Limited, Oxford, England.
FOLDING OF tRNA-LIKE STRUCTURES AT THE 3' TERMINI OF VIRAL RNAs The 3' termini of various plant viral RNAs like turnip yellow mosaic virus (TYMV) RNA, brome mosaic virus (BMV) RNA, and tobacco mosaic virus (TMV) RNA are recognized by a variety of tRNA-specific enzymes, e.g. tRNA nucleotidyltransferase and a specific aminoacyl tRNA synthetase (15-16). We recently described structure mapping studies on labelled 3' terminal RNA fragments of all three individual viral RNAs. Models of the secondary structure were proposed based upon enzymatic digestions with RNase T1, S1 nuclease and the double-strand specific RNase from the venom of the cobra Naja naja oxiana, and upon chemical modification of adenosine residues with diethyl pyrocarbonate and of cytidine residues with dimethyl sulfate. The cloverleaf structure as found in canonical tRNA appeared to be absent in the tRNA-like structures of the viral RNAs. Similar proposals were made by Florentz et al. for TYMV RNA (17) and by Ahlquist et al. for BMV RNA (18). We have shown, however, that three-dimensional foldings appear to have evolved which closely resemble the L-shape of the elongator tRNAs. These models provide a basis for understanding how these viral RNAs have met the requirements for faithful recognition by tRNA-specific enzymes. For a more detailed discussion of these models, see ref. 12-14. Of great interest here is the construction of the RNA domain which comprises the aminoacyl acceptor arm of the tRNA-like structures. To illustrate this point the secondary structure of the last 42 nucleotides at the 3' terminus of TYMV RNA is shown in fig. 1a. Experimental data and sequence comparisons with other plant viral RNAs pointed to a base pairing interaction involving the
Figure 1. Three-dimensional folding of the aminoacyl acceptor arm of the tRNA-like structure of TYMV RNA (panels a-c; figure not reproduced, caption only partly recoverable). The base pairing involving the triple G and triple C sequences is illustrated by dashed lines; the rest of the tRNA-like structure is given by a solid line. Note the absence of an aminoacyl stem formed by base pairing between the 3' and 5' side of the structure. For further details about experimental details and a discussion of these models, see ref. 12-14.
Figure 2. The formation of extended double helices in RNA chains, based on tertiary interactions of a special kind, shown in four presentations. S1 and S2 represent stem regions formed by normal Watson-Crick base pairing, and L1 and L2 the single-stranded RNA regions connecting the double helical segments S1 and S2. (a) Conventional representation of the secondary structure in which nucleotides from the hairpin loop adjacent to stem region S1 base pair with a complementary region at the 3' side of the hairpin. (b) Schematic illustration of the building principle in a graphical format. (c) Schematic folding. (d) Three-dimensional folding, showing the quasi-continuous double-stranded helix of 8 base pairs and the two crossing loops connecting the two double helical segments.
From our experience with the plant viral RNA structures we here make the important assumption that for all interactions of this kind stem S1 and stem S2 are stacked on top of each other in such a way that a quasi-continuous, right-handed double helix is formed, comparable to A-RNA (fig. 2d). This assumption of course is only valid if the single-stranded connecting loops L1 and L2 pose no steric constraints upon this structure. Due to the handedness of the double helix and the polarity of the chain, loops L1 and L2 will not be equivalent: loop L1 crosses the deep groove and loop L2 the shallow groove of the double helix. This must have consequences for the length and the conformation of each of the two loops. A first insight into the minimal number of nucleotides needed in loop L1 or L2 is provided by an analysis of the data which emerge from our three-dimensional models of the tRNA-like 3' termini of the plant viral RNAs. In table 1 we have summarized these data in terms of the number of base pairs in stem S1 and S2 and the number of nucleotides in loop L1 and L2.
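The building principle of fig. 2 can also be phrased in terms of nucleotide positions: a base pair (i, j) of stem S1 and a base pair (k, l) of stem S2 interleave, i.e. i < k < j < l. The following is a minimal Python sketch of that idea, not a method taken from the paper. It assumes 0-based positions and the 5' to 3' layout S1(5' strand), L1, S2(5' strand), S1(3' strand), L2, S2(3' strand), with no unpaired nucleotide between the 5' strand of S2 and the 3' strand of S1. The function names `h_type_pairs` and `find_crossing_pairs` and the segment lengths in the example are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def h_type_pairs(s1, l1, s2, l2):
    """Base pairs of stems S1 and S2 for the assumed layout
    S1(5') - L1 - S2(5') - S1(3') - L2 - S2(3'), using 0-based positions."""
    s1_5 = range(0, s1)
    s2_5 = range(s1 + l1, s1 + l1 + s2)
    s1_3 = range(s1 + l1 + s2, 2 * s1 + l1 + s2)
    s2_3 = range(2 * s1 + l1 + s2 + l2, 2 * s1 + l1 + 2 * s2 + l2)
    # Antiparallel pairing: the first nucleotide of the 5' strand pairs
    # with the last nucleotide of the 3' strand.
    stem1 = list(zip(s1_5, reversed(s1_3)))
    stem2 = list(zip(s2_5, reversed(s2_3)))
    return stem1, stem2


def find_crossing_pairs(pairs):
    """Return every couple of base pairs (i, j), (k, l) with i < k < j < l.

    Nested or side-by-side pairs never satisfy this condition; interleaved
    pairs, as between S1 and S2 in a pseudoknot, always do.
    """
    ordered = sorted(tuple(sorted(p)) for p in pairs)
    crossing = []
    for (i, j), (k, l) in combinations(ordered, 2):
        if i < k < j < l:
            crossing.append(((i, j), (k, l)))
    return crossing


# Illustrative lengths only (not taken from table 1).
stem1, stem2 = h_type_pairs(s1=4, l1=3, s2=4, l2=3)
crossing = find_crossing_pairs(stem1 + stem2)
print(len(stem1) + len(stem2), "base pairs in the stacked helix,",
      len(crossing), "crossing couples")
```

In this representation every pair of S1 crosses every pair of S2, while pairs within one stem are nested, which is why a search restricted to nested structures cannot represent the interaction.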
Beside the RNAs studied experimentally we also included related viral RNAs whose 3' terminal sequences are known and which can be folded in similar tertiary structures (12-14, 18, 19). The lowest number of base pairs found in the stem regions is 3, which presumably is also the
minimal number. The maximum found so far is 7. The lower limit for either loop seems to be 2 nucleotides, though bridging a different number of base pairs. The upper limit is less well defined and can be a hundred or even more than one thousand nucleotides (see below). Such large loops of course will possess secondary structure themselves, as was obvious in the case of BMV RNA (13). An intriguing outcome is the finding that 2 nucleotides are sufficient to span a distance of 6 base pairs over the deep groove of the double helix. Analysis of the known geometry of the RNA-A helix shows that this is feasible (fig. 3a). Based on the known coordinates for synthetic RNA double helices (20) we have calculated the distance between one phosphate atom (P6) on one strand and others on the opposite strand (fig. 3b). Depending on the direction chosen, either the deep groove (P0 - P-6) or the shallow groove (P - P6) is crossed (see also legend to fig. 3). Figure 3c shows that the distance between two phosphates is minimal when 7 base pairs over the deep groove are bridged (10.1 Å). This distance is almost half of that between P0 and P0 (17.4 Å) and similar to that between P0 and P2 on one strand (11.0 Å). Bridging the deep groove with 2 nucleotides therefore seems possible. Connecting the two strands over the shallow groove takes at least 16.9 Å, while the distances crossing 2, 3 and 4 base pairs do not differ very much and are actually of the same order of magnitude as for P0 - P0. Because the
Figure 3. Distances between phosphate atoms in a regular RNA double helix of the A-type. The shortest distance between a fixed phosphate residue on one strand (P6) and the phosphates on the other strand is given, though it should be realized that in some cases elements of the double helix like base pairs are penetrated. (a) Three-dimensional representation showing the deep and shallow groove of the double helix. Going upwards from phosphate P0, which is located opposite to P6 on the other strand, the phosphates are indicated with a positive subscript, whereas negative ones are used going downwards. (b) Two-dimensional scheme. Vertical bars represent base pairs, the arrows the shortest distances between the phosphate residues involved. (c) Graphical representation (horizontal axis: phosphate number). Distances were calculated based on published coordinates of a regular RNA-A double helix (20).
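The kind of calculation summarized in fig. 3 can be illustrated with a strongly simplified model in which the phosphates of both strands lie on a cylinder around the helix axis. The Python sketch below is only that: a sketch under stated assumptions, not the procedure used for fig. 3c, which relied on published coordinates (20). Apart from the 11 base pairs per turn mentioned in the text, every default value (rise, radius, angular phase and axial shift between the strands) is a placeholder assumption, so the numbers it produces should not be expected to match the distances quoted above. The class name `HelixModel` is hypothetical, and which sign of the phosphate number corresponds to the deep or the shallow groove depends on the chosen phase and handedness convention, which the sketch does not settle.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HelixModel:
    """Idealized double helix with phosphates on a cylinder.

    All defaults except bp_per_turn are illustrative assumptions.
    """
    bp_per_turn: float = 11.0    # from the text: 11 base pairs per turn
    rise: float = 2.8            # Å per base pair (assumed)
    radius: float = 8.7          # Å, phosphate to helix axis (assumed)
    strand_phase: float = 150.0  # degrees between the two strands (assumed)
    strand_shift: float = 0.0    # Å axial offset of strand 1 (assumed)

    def phosphate(self, n, strand):
        """Cartesian coordinates of phosphate number n on strand 0 or 1."""
        phase = self.strand_phase if strand else 0.0
        shift = self.strand_shift if strand else 0.0
        angle = math.radians(n * 360.0 / self.bp_per_turn + phase)
        return (self.radius * math.cos(angle),
                self.radius * math.sin(angle),
                n * self.rise + shift)

    def distance(self, n_fixed, n_other):
        """Straight-line distance between phosphate n_fixed on strand 0
        and phosphate n_other on strand 1."""
        return math.dist(self.phosphate(n_fixed, 0),
                         self.phosphate(n_other, 1))


model = HelixModel()
# Distance profile from one fixed phosphate to phosphates -10..10
# on the opposite strand, analogous in form to fig. 3c.
profile = {n: model.distance(0, n) for n in range(-10, 11)}
closest = min(profile, key=profile.get)
print("closest opposite-strand phosphate in this toy model:", closest)
```

As in the figure legend, these are straight-line distances, so a short value does not by itself show that a chain can follow that path without passing through base pairs.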
minimal number of nucleotides required for closing a hairpin loop is accepted to be three (21), we assume that this number is also three for loop L2. This is actually observed in the case of TYMV RNA (see table 1). Bridging the shallow groove with 2 nucleotides (see TMV RNA) may therefore require a distortion of the double helix and/or a change in the regular nucleotide conformation(s) in the crossing loop. Stretching of the ribose phosphate backbone, however, by changes in the ribose puckering (C3'-endo to C2'-endo) and torsion angle γ (C4'-C5') will probably not be sufficient (22). If two nucleotides can cross the shallow groove, one has to envision that even one nucleotide might suffice for the deep groove. In the latter case it is conceivable that the two phosphates concerned (P0 and P or P7, fig. 3) might come into closer proximity by a bending of the double helix over this groove, as found to be possible for DNA (23). Duplex RNA, however, is reported to be less flexible than the DNA double helix (24,25). On the other hand, an
Figure 4. Models of RNA secondary structures showing the possible role of "pseudoknots" in RNA splicing and ribosome functioning. (a) Part of the secondary structure of the 16S ribosomal RNA from E. coli. The formation of stem region 1 by base pairing between the two complementary boxed regions was confirmed by phylogenetic comparisons. For further details see ref. 31 and 32. (b) Coaxial stacking of stem regions 1 and 2 in 16S rRNA based on the new building principle described in the text. Helical segment 1 might be stacked on helix 3 by including the G·A base pair at the end of helix 3. (c) Schematic representation of the generalized secondary structure model of fungal mitochondrial introns according to Davies et al. (34). Only key regions of the intron structure are given. The thick line represents the exon regions, the thin line the intron, the arrows show the splice sites and the zigzag line the internal guide sequence. For a full description of this model and for the symbols used see ref. 34. (d) Coaxial stacking of the double helix segments P1, P10 and P2, which brings the two splice sites into close proximity.
either an A residue bulged out or with the incorporation of a G·A pair. If this model is correct, it not only offers a rational explanation for the enigmatic coexistence of helices 1 and 2 but also implies that the molecule has to undergo a considerable conformational change to adopt another structure. Such a conformational transition may play an important role as a switch in ribosome functioning. So far this novel feature of RNA folding has only been recognized at the extremities of the RNA chains: at the 3' end of viral RNA and at the 5' end of rRNA. That it may also be involved in the folding of internal parts of the RNA chain is suggested by the recently proposed mechanism for autocatalytic RNA self-splicing. The latter phenomenon has been reported by Cech for the nuclear rRNA precursor of Tetrahymena and may also be
Figure 5. Hypothetical structure of an aminoacyl acceptor arm in a tRNA-like structure. Coaxial stacking of 12 base pairs can be obtained by base pairing of nucleotides in the interior loop and a complementary region at the 5' side. The structure gives rise to two connecting loops of which one (most proximal to the 3' terminus) crosses the double helix. The hairpin loop containing 7 nucleotides corresponds to the TΨ-loop of classical tRNA (see also fig. 1). The exact number of base pairs in the various double helical segments is here chosen arbitrarily as 4 in each segment, but may vary in principle from 3 to 6.
and presumably functionally important core of the structure. We suggest that the helix formed by base pairing of E' and F stacks coaxially with helix P8. In principle other variants may be envisaged involving hairpin, interior, bulge, and bifurcation loops. They all potentially lead to the formation of extended double helices.
KNOTTED STRUCTURES IN RNA AND COMPUTER-AIDED PREDICTIONS OF RNA SECONDARY STRUCTURE All intramolecular tertiary interactions in RNA described above are examples of what have been called "knotted structures" (8), though of a special kind. If the tertiary interactions were to comprise one turn of a double helix or more, the formation of a so-called "real knot" in the RNA chain is conceivable by the threading of one of the two free ends through a loop (loop L1 for the 3' end and loop L2 for the 5' end, fig. 2c). No true knots in RNA molecules, however, have been reported. We therefore prefer the term "pseudoknot", coined by Studnicka et al. (8), for the tertiary interaction we found in the plant viral RNAs. So far, not more than 6-7 base pairs are involved in this pseudoknot formation, which is well below one full turn of an RNA duplex (11 base pairs). Helices of this length might prevent that
CONCLUDING REMARKS In this paper we have shown that pseudoknots are present in RNA molecules. In itself it is not surprising to see that under physiological conditions long range base pairing interactions in the three-dimensional folding of RNA do occur. It may be stressed here, however, that these tertiary interactions involving hairpin or interior loops can give rise to long extended double helical stem regions. More work is needed to get insight in their precise folding, their stability and their spread in natural RNAs.
23. Levitt, M. (1983) Cold Spring Harbor Symp. Quant. Biol. 47, 251-262.
24. Bolton, P.H. and Kearns, D.R. (1979) J. Am. Chem. Soc. 101, 479-484.
25. Arnott, S., Chandrasekaran, R. and Selsing, E. (1975) in "Structure and Conformation of Nucleic Acids and Protein-Nucleic Acid Interactions", Sundaralingam, M. and Rao, S.T., eds, pp. 577-596, University Park Press, Baltimore.