Chapter 14 - RNA Processing
14 RNA Molecules and RNA Processing
• The Immense Dystrophin Gene
• Gene Structure
Gene Organization
Introns
The Concept of the Gene Revisited
• Messenger RNA
The Structure of Messenger RNA
Pre-mRNA Processing
The Addition of the 5′ Cap
The Addition of the Poly(A) Tail
RNA Splicing
Alternative Processing Pathways
RNA Editing
• Transfer RNA
The Structure of Transfer RNA
Transfer RNA Gene Structure
and Processing
• Ribosomal RNA
The Structure of the Ribosome
Ribosomal RNA Gene Structure
and Processing
• Small Interfering RNAs and
MicroRNAs
RNA Interference
Model Genetic Organism: The
Nematode Worm Caenorhabditis
elegans
For almost 50 years, entertainer Jerry Lewis has served as
national chairman of the Muscular Dystrophy Association, a
partnership between scientists and citizens aimed at fighting
neuromuscular diseases. (Courtesy of the Muscular Dystrophy Association.)
RNA Molecules and RNA Processing 373
unaffected mothers. In spite of this early recognition of its hereditary basis, the biochemical
cause of Duchenne muscular dystrophy remained a mystery until 1987.
In 1985, Louis Kunkel and his colleagues at Harvard Medical School observed a boy
with Duchenne muscular dystrophy whose X chromosome had a visible deletion on the
short arm. Reasoning that this boy’s disease was caused by the absence of a gene within the
deletion, they recognized that the deletion pointed to the location on the X chromosome of
the gene responsible for Duchenne muscular dystrophy. Kunkel and his colleagues located
and cloned the piece of DNA responsible for the disease. Shortly thereafter, the sequence of
the gene was determined, and the protein that it encodes was isolated. This large protein,
called dystrophin, consists of nearly 4000 amino acids and is an integral component of
muscle cells. Persons with Duchenne muscular dystrophy lack functional dystrophin.
The dystrophin gene is among the most remarkable of all genes yet examined. It’s huge,
encompassing more than 2 million nucleotides of DNA. However, only about 12,000
nucleotides encode its amino acids. Why is the dystrophin gene so large? What are all those
other nucleotides doing?
The unusual properties of the dystrophin gene make sense only in the context of RNA
processing—the alteration of RNA after it has been transcribed. Dystrophin messenger
RNA, like many eukaryotic RNAs, undergoes extensive processing after transcription,
including the removal of large sections that are not required for translation. Chapter 13
focused on transcription—the process of RNA synthesis. In this chapter, we will examine
the function and processing of RNA.
We begin by taking a careful look at the nature of the gene. Next, we examine messen-
ger RNA (mRNA), its structure, and how it is modified in eukaryotes after transcription.
We’ll also see how, through alternative pathways of RNA modification, one gene can
produce several different proteins. Then, we turn to transfer RNA (tRNA), the adapter
molecule that forms the interface between amino acids and mRNA in protein synthesis. We
examine ribosomal RNA (rRNA), the structure and organization of rRNA genes, and how
rRNAs are processed. Finally, we consider a newly discovered class of very small RNAs that
play important roles in RNA degradation, translation inhibition, and other functions.
As we explore the world of RNA and its role in gene function, we will see evidence of
two important characteristics of this nucleic acid. First, RNA is extremely versatile, both
structurally and biochemically. It can assume a number of different secondary structures,
which provide the basis for its functional diversity. Second, RNA processing and function
frequently include interactions between two or more RNA molecules.
Gene Structure
What is a gene? In Chapter 3, it was noted that the definition of gene would appear to change as we explored different aspects of heredity. A gene was defined there as an inherited factor that determines a characteristic. This definition may have seemed vague, because it says nothing about what a gene is, only what it does. Nevertheless, this definition was appropriate for our purposes at the time, because our focus was on how genes influence the inheritance of traits. It wasn’t necessary to consider the physical nature of the gene in learning the rules of inheritance.
Knowing something about the chemical structure of DNA and the process of transcription now enables us to be more precise about what a gene is. Chapter 10 described how genetic information is encoded in the base sequence of DNA; so a gene consists of a set of DNA nucleotides. But how many nucleotides are encompassed in a gene, and how is the information in these nucleotides organized? In 1902, Archibald Garrod suggested, correctly, that genes code for proteins (see pp. 47–48). Proteins are made of amino acids; so a gene contains the nucleotides that specify the amino acids of a protein. We could, then, define a gene as a set of nucleotides that specifies the amino acid sequence of a protein, which indeed was, for many years, the working definition of a gene. As geneticists learned more about the structure of genes, however, it became clear that this concept of a gene was an oversimplification.
Gene Organization
Early work on gene structure was carried out largely through the examination of mutations in bacteria and viruses. This research led Francis Crick in 1958 to propose
that genes and proteins are colinear; that is, there is a direct correspondence between the nucleotide sequence of DNA and the amino acid sequence of a protein (FIGURE 14.1). The concept of colinearity suggests that the number of nucleotides in a gene should be proportional to the number of amino acids in the protein encoded by that gene. In a general sense, this concept is true for genes found in bacterial cells and many viruses, although these genes are slightly longer than expected if colinearity is strictly applied (the mRNAs encoded by the genes contain sequences at their ends that do not specify amino acids). At first, eukaryotic genes and proteins also were generally assumed to be colinear, but there were hints that eukaryotic gene structure is fundamentally different. Eukaryotic cells contain far more DNA than is required to encode proteins (see Chapter 11). Furthermore, many large RNA molecules observed in the nucleus were absent from the cytoplasm, suggesting that nuclear RNAs undergo some type of change before they are exported to the cytoplasm.
14.1 The concept of colinearity suggests that a continuous sequence of nucleotides in DNA encodes a continuous sequence of amino acids in a protein.
DNA: 5′ CGTGGATACACTTTTGCCGTTTCT 3′
     3′ GCACCTATGTGAAAACGGCAAAGA 5′
mRNA (after transcription): 5′ CGUGGAUACACUUUUGCCGUUUCU 3′
Polypeptide chain (after translation): Arg Gly Tyr Thr Phe Ala Val Ser
1 A continuous sequence of nucleotides in the DNA… 2 …codes for a continuous sequence of amino acids in the protein.
Conclusion: With colinearity, the number of nucleotides in the gene is proportional to the number of amino acids in the protein.
Experiment (FIGURE 14.2)
Question: Is the coding sequence in a gene always continuous?
Methods: 1 Mix DNA with complementary RNA and heat to separate DNA strands. 2 Cool the mixture. Complementary sequences pair.
Results: DNA may reanneal with its complementary strand… or with RNA. Noncoding regions of DNA are seen as loops.
Most geneticists were nevertheless surprised by the announcement in the 1970s that four coding sequences in a gene from a eukaryotic virus were interrupted by nucleotides that did not specify amino acids. This discovery was made when the viral DNA was hybridized with the mRNA transcribed from it, and the hybridized structure was examined with the use of an electron microscope (FIGURE 14.2). The DNA was clearly much longer than the mRNA, because regions of DNA looped out from the
hybridized molecules. These regions contained nucleotides in the DNA that were absent from the coding nucleotides in the mRNA. Many other examples of interrupted genes were subsequently discovered; it quickly became apparent that most eukaryotic genes consist of stretches of coding and noncoding nucleotides.
CONCEPTS
When a continuous sequence of nucleotides in DNA encodes a continuous sequence of amino acids in a protein, the two are said to be colinear. The discovery of coding and noncoding regions within eukaryotic genes shows that not all genes are colinear with the proteins that they encode.
Introns
Many eukaryotic genes contain coding regions called exons and noncoding regions called intervening sequences or introns. For example, the ovalbumin gene has eight exons and seven introns; the gene for cytochrome b has five exons and four introns (FIGURE 14.3). All the introns and the exons are initially transcribed into RNA but, after transcription, the introns are removed by splicing and the exons are joined to yield the mature RNA.
Introns are common in eukaryotic genes but are rare in bacterial genes. For a number of years after their discovery, introns were thought to be entirely absent from prokaryotic genomes, but they have now been observed in archaea, bacteriophages, and even some eubacteria. Introns are present in mitochondrial and chloroplast genes, as well as nuclear genes. In eukaryotic genomes, the size and number of introns appear to be directly related to increasing organismal complexity. Yeast genes contain only a few short introns; Drosophila introns are longer and more numerous; and most vertebrate genes are interrupted by long introns. All classes of genes (those that code for rRNA, tRNA, and proteins) may contain introns. The number and size of introns vary widely: some eukaryotic genes have no introns, whereas others may have more than 60; intron length varies from fewer than 200 nucleotides to more than 50,000. Introns tend to be longer than exons, and most eukaryotic genes contain more noncoding nucleotides than coding nucleotides. Finally, most introns do not encode proteins (an intron of one gene is not usually an exon for another), although geneticists are finding a growing number of exceptions.
There are four major types of introns (Table 14.1). Group I introns, found in some rRNA genes, are self-splicing: they can catalyze their own removal. Group II introns are present in some protein-encoding genes of mitochondria, chloroplasts, and a few eubacteria; they also are self-splicing, but their mechanism of splicing differs from that of the group I introns. Nuclear pre-mRNA introns are the best studied; they include introns located in the protein-encoding genes of the nucleus. The splicing mechanism by which these introns are removed is similar to that of the group II introns, but nuclear introns are not self-splicing; their removal requires snRNAs (discussed later) and a number of proteins. Transfer RNA introns, found in tRNA genes, utilize yet another splicing mechanism that relies on enzymes to cut and reseal the RNA. In addition to these major groups, there are several other types of introns.
We’ll take a detailed look at the chemistry and mechanics of RNA splicing later in the chapter. For now, we should keep in mind two general characteristics of the splicing process: (1) the splicing of all pre-mRNA introns takes place in the nucleus; and (2) the order of exons in DNA is usually maintained in the spliced RNA. The coding sequences of a gene may be split up, but they are not usually jumbled up.
CONCEPTS
Many eukaryotic genes contain exons and introns, both of which are transcribed into RNA, but introns are later removed by RNA processing. The number and size of introns vary from gene to gene; they are common in many eukaryotic genes but uncommon in bacterial genes.
[FIGURE 14.3: two genes with their introns marked, one with exons numbered 1 to 8 and one with exons numbered 1 to 5. DNA is transcribed into RNA, and introns are removed by RNA splicing to give the mRNA (5′ to 3′).]
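The contrast between a colinear gene and an interrupted gene can be sketched in a few lines of Python. The sketch reuses the DNA sequence shown in Figure 14.1; the intron sequence, its position, and all function names are hypothetical and chosen only for illustration, and the codon table lists just the eight codons that this sequence needs.

```python
# Partial codon table: only the codons present in the Figure 14.1 sequence.
CODON_TABLE = {
    "CGU": "Arg", "GGA": "Gly", "UAC": "Tyr", "ACU": "Thr",
    "UUU": "Phe", "GCC": "Ala", "GUU": "Val", "UCU": "Ser",
}


def transcribe(coding_strand: str) -> str:
    """Return the RNA that matches the DNA coding strand (T replaced by U)."""
    return coding_strand.replace("T", "U")


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as half-open (start, end) indexes; exon order is kept."""
    pieces, position = [], 0
    for start, end in sorted(introns):
        pieces.append(pre_mrna[position:start])
        position = end
    pieces.append(pre_mrna[position:])
    return "".join(pieces)


def translate(mrna: str) -> list[str]:
    """Read the mRNA three nucleotides at a time from its first base."""
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)]


COLINEAR_GENE = "CGTGGATACACTTTTGCCGTTTCT"   # from Figure 14.1
HYPOTHETICAL_INTRON = "GTAAGTCTAACAG"        # assumption, not from the text
INTERRUPTED_GENE = COLINEAR_GENE[:12] + HYPOTHETICAL_INTRON + COLINEAR_GENE[12:]

pre_mrna = transcribe(INTERRUPTED_GENE)
mature_mrna = splice(pre_mrna, [(12, 12 + len(HYPOTHETICAL_INTRON))])

print(len(COLINEAR_GENE), len(INTERRUPTED_GENE))  # gene lengths differ
print(translate(mature_mrna))                     # same eight amino acids
```

Both genes specify the same eight amino acids, but only the first has a length proportional to the length of its protein, which is the sense in which an interrupted gene is not colinear.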
14.6 Most eukaryotic mRNAs have a 5′ cap. The cap consists of a nucleotide with 7-methylguanine attached to the pre-mRNA by a unique 5′–5′ bond (shown in detail in the bottom box). The cap is added shortly after the initiation of transcription. A methyl group is added to position 7 of the guanine base of the newly added (now the terminal) nucleotide and to the 2′ position of each sugar of the next two nucleotides.
Several different enzymes take part in the addition of the 5′ cap. The initial step is carried out by an enzyme that associates with RNA polymerase II. Because neither RNA polymerase I nor RNA polymerase III has this associated enzyme, RNA molecules transcribed by these polymerases (rRNAs, tRNAs, and some snRNAs) are not capped.
The Addition of the Poly(A) Tail
A second type of modification to eukaryotic mRNA is the addition of 50 to 250 adenine nucleotides at the 3′ end, forming a poly(A) tail. These nucleotides are not encoded in the DNA but are added after transcription (FIGURE 14.7) in a process termed polyadenylation. Many eukaryotic genes transcribed by RNA polymerase II are transcribed well beyond the end of the coding sequence (see Chapter 13); the extra material at the 3′ end is then cleaved and the poly(A) tail is added. For some pre-mRNA molecules, more than 1000 nucleotides may be removed from the 3′ end.
Processing of the 3′ end of pre-mRNA requires sequences both upstream and downstream of the cleavage site (FIGURE 14.8). The consensus sequence AAUAAA is usually from 11 to 30 nucleotides upstream of the cleavage site (see Figure 14.7) and determines the point at which cleavage will take place. A sequence rich in uracil nucleotides (or guanine and uracil nucleotides) is typically downstream of the cleavage site.
In mammals, 3′ cleavage and the addition of the poly(A) tail require a complex consisting of several proteins: cleavage and polyadenylation specificity factor (CPSF); cleavage stimulation factor (CstF); at least two cleavage factors (CFI and CFII); and polyadenylate polymerase (PAP). CPSF binds to the upstream AAUAAA consensus sequence, whereas CstF binds to the downstream sequence. The pre-mRNA is cleaved, and CstF and the cleavage factors leave the complex; the cleaved 3′ end of the pre-mRNA is then degraded. CPSF and PAP remain bound to the pre-mRNA and carry out polyadenylation. After the addition of approximately 10 adenine nucleotides, a poly(A)-binding protein (PABII) attaches to the poly(A) tail and increases the rate of polyadenylation. As more of the tail is synthesized, additional molecules of PABII attach to it.
The poly(A) tail confers stability on many mRNAs, increasing the time during which the mRNA remains intact and available for translation before it is degraded by cellular enzymes. The stability conferred by the poly(A) tail depends on the proteins that attach to the tail. The poly(A) tail also facilitates attachment of the ribosome to the mRNA.
The eukaryotic mRNAs that code for core histone proteins (see Chapter 11) are unique in that they lack a poly(A) tail and depend on a different mechanism for 3′ cleavage. This process requires the formation of a hairpin structure in the pre-mRNA and a small ribonucleoprotein particle (snRNP) called U7 (FIGURE 14.9). The U7 particle contains an snRNA with nucleotides that are complementary to a sequence on the pre-mRNA just downstream of the cleavage site, and U7 most likely binds to this sequence. A hairpin-binding protein binds to the hairpin structure and stabilizes the binding of U7 to the complementary sequence on the pre-mRNA. The hairpin-binding protein also stabilizes the mRNA and increases its rate of translation.
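For a typical (non-histone) pre-mRNA, the cleavage and polyadenylation steps can be sketched as a string operation. This is a simplified illustration, not a model of the protein complex: the function name, the example sequence, and the choice to measure `offset` from the end of the AAUAAA signal to the cut are assumptions, and the downstream U-rich sequence is not checked.

```python
POLY_A_SIGNAL = "AAUAAA"


def cleave_and_polyadenylate(pre_mrna: str, offset: int = 15,
                             tail_length: int = 200) -> str:
    """Cut downstream of the first AAUAAA and append a poly(A) tail.

    offset: nucleotides between the end of the signal and the cleavage
            site (assumed here to fall in the 11 to 30 range given in the text).
    tail_length: adenines to add (the text gives 50 to 250).
    """
    if not 11 <= offset <= 30:
        raise ValueError("offset should be between 11 and 30 nucleotides")
    if not 50 <= tail_length <= 250:
        raise ValueError("tail_length should be between 50 and 250")

    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        raise ValueError("no AAUAAA consensus sequence found")

    cleavage_site = signal_start + len(POLY_A_SIGNAL) + offset
    if cleavage_site > len(pre_mrna):
        raise ValueError("transcript ends before the cleavage site")

    # Everything 3' of the cleavage site is discarded (it is degraded in the
    # cell); the tail is not encoded in the DNA, so it is simply appended.
    return pre_mrna[:cleavage_site] + "A" * tail_length


# Hypothetical transcript: body, signal, 15-nt spacer, GU-rich downstream region.
example = "GGCUAC" + POLY_A_SIGNAL + "C" * 15 + "GUGUUGUGUU"
processed = cleave_and_polyadenylate(example, offset=15, tail_length=50)
print(processed)
```

The downstream GU-rich stretch is absent from the result, and the adenines that replace it appear nowhere in the input, mirroring the point that the tail is added after transcription.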
14.8 Processing of the 3′ end of pre-mRNA requires a consensus sequence and several factors.
[Figure 14.8 labels: AAUAAA consensus sequence; cleavage and polyadenylation specificity factor (CPSF); cleavage stimulation factor (CstF); cleavage factors (CF); polyadenylate polymerase (PAP); U-rich sequence; cleavage site. Steps shown: 2 Cleavage takes place. Cleavage factors and the 3′ end of the pre-mRNA are released. 3 The 3′ end of the pre-mRNA is degraded. 4 Polyadenylate polymerase adds adenine nucleotides to the 3′ end of the new mRNA… 5 …and poly(A)-binding protein (PABII) attaches to the poly(A) tail and … elongate the poly(A) tail.]
14.9 Eukaryotic mRNAs that lack a poly(A) tail depend on a different mechanism for 3′ cleavage. Cleavage requires the presence of U7 snRNA, which has bases complementary to a consensus sequence downstream of the 3′ cleavage site. Cleavage depends on the formation of a hairpin structure near the 3′ end of the pre-mRNA; base pairing probably takes place between the complementary regions of the pre-mRNA and the U7 snRNA.
CONCEPTS
Eukaryotic pre-mRNAs are processed at their 5′ and 3′ ends. A cap, consisting of a modified nucleotide and several methyl groups, is added to the 5′ end. The cap facilitates the binding of a ribosome, increases the stability of the mRNA, and may affect the removal … the addition of a poly(A) tail.
RNA Splicing
The other major type of modification of eukaryotic pre-mRNA is the removal of introns by RNA splicing. This modification takes place in the nucleus, usually after transcription and the addition of the poly(A) tail but before …
… these splice sites possess short consensus sequences. Most introns in pre-mRNA begin with GU and end with AG, indicating that these sequences play a crucial role in splicing. Indeed, changing a single nucleotide at either of these sites does prevent splicing. A few introns in pre-mRNA begin with AU and end with AC. These introns are spliced by a process that is similar to that seen in GU . . . AG introns but utilizes a different set of splicing factors. This discussion will focus on splicing of the more common GU . . . AG introns.
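The role of the GU and AG boundary nucleotides can be illustrated with a small check. In this sketch the function names and sequences are hypothetical, only the common GU . . . AG class is recognized, and leaving an unrecognized intron in the transcript is a simplification standing in for "splicing is prevented".

```python
def has_consensus_boundaries(intron: str) -> bool:
    """True when the intron begins with GU and ends with AG."""
    return len(intron) >= 4 and intron.startswith("GU") and intron.endswith("AG")


def splice_if_recognized(exon1: str, intron: str, exon2: str) -> str:
    """Join the exons only when both splice-site dinucleotides are intact."""
    if has_consensus_boundaries(intron):
        return exon1 + exon2
    # Simplification: an unrecognized intron is left in the transcript.
    return exon1 + intron + exon2


normal_intron = "GUAAGUCUAACAG"   # hypothetical sequence
mutant_intron = "GAAAGUCUAACAG"   # single change at the 5' splice site (U to A)

print(splice_if_recognized("AUGGCC", normal_intron, "UUUUAA"))
print(splice_if_recognized("AUGGCC", mutant_intron, "UUUUAA"))
```

A single-nucleotide change at either boundary is enough to change the outcome, which is the behavior the text describes for mutations at the splice sites.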
14.11 The splicing of nuclear introns requires a two-step process. First, cleavage takes place at the 5′ splice site, and a lariat is formed by the attachment of the 5′ end of the intron to the branch point. Second, cleavage takes place at the 3′ splice site, and two exons are spliced together.
[Figure 14.11 steps shown: 1 The mRNA is cut at the 5′ splice site. 2 The 5′ end of the intron attaches to the branch point. 3 A cut is made at the 3′ splice site.]
snRNA interaction                              Function
U1 with 5′ splice site                         U1 attaches to 5′ end of intron; commits intron to splicing; no direct role in splicing
U2 with branch point                           Positions 5′ end of intron near branch point for lariat formation
U2 with U6                                     Holds 5′ end of intron near branch point
U6 with 5′ splice site                         Positions 5′ end of intron near branch point
U5 with 3′ end of first exon                   Anchors first exon to spliceosome subsequent to cleavage; juxtaposes two ends of exon for splicing
U5 with 3′ end of one exon and 5′ end of the other   Juxtaposes two ends of exon for splicing
U4 with U6                                     Delivers U6 to intron; no direct role in splicing
The guanine nucleotide in the consensus sequence at the 5′ splice site bonds with the adenine nucleotide at the branch point. This bonding is accomplished through transesterification, a chemical reaction in which the OH group on the 2′-carbon atom of the adenine nucleotide at the branch point attacks the 5′ phosphodiester bond of the guanine nucleotide at the 5′ splice site, cleaving it and forming a new 5′–2′ phosphodiester bond between the guanine and adenine nucleotides.
In the second step of RNA splicing, a cut is made at the 3′ splice site and, simultaneously, the 3′ end of exon 1 becomes covalently attached (spliced) to the 5′ end of exon 2. This bond also forms through a transesterification reaction, in which the 3′-OH group attached to the end of exon 1 attacks the phosphodiester bond at the 3′ splice site, cleaving it and forming a new phosphodiester bond between the 3′ end of exon 1 and the 5′ end of exon 2; the intron is released as a lariat. The intron becomes linear when the bond breaks at the branch point and is then rapidly degraded by nuclear enzymes. The mature mRNA consisting of the exons spliced together is exported to the cytoplasm, where it is translated.
Although splicing is illustrated in Figure 14.11 as a two-
14.12 RNA splicing takes place within the spliceosome.
[Figure 14.12 shows a pre-mRNA (5′ exon 1, intron with branch point A, exon 2 3′). Steps shown: 1 U1 attaches to the 5′ splice site. 2 U2 attaches to the branch point. 3 A complex of U4, U5, and U6 joins the spliceosome. 4 U1 and U4 are released. 7 Two transesterification reactions join the exons together and release the intron as a lariat with U2, U5, and U6 attached.]
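The two transesterification steps can be laid out as a small bookkeeping sketch that records what exists after each step. It tracks only which pieces are joined, not the chemistry; the function name, the dictionary keys, the example sequences, and the branch-point position are all hypothetical.

```python
def two_step_splice(exon1: str, intron: str, exon2: str,
                    branch_index: int) -> tuple[dict[str, str], dict[str, str]]:
    """Return the products present after each transesterification step."""
    if not intron.startswith("G"):
        raise ValueError("expected a guanine at the 5' end of the intron")
    if intron[branch_index] != "A":
        raise ValueError("expected an adenine at the branch point")

    # Step 1: the 2'-OH of the branch-point adenine attacks the 5' splice site.
    # Exon 1 is freed with a 3'-OH end; the intron, now looped into a lariat
    # by a 5'-2' bond, is still attached to exon 2.
    after_step1 = {
        "free_exon1": exon1,
        "lariat_intermediate": intron + exon2,
    }

    # Step 2: the 3'-OH of exon 1 attacks the 3' splice site.
    # The exons are joined and the intron is released as a lariat.
    after_step2 = {
        "mature_rna": exon1 + exon2,
        "lariat_intron": intron,
    }
    return after_step1, after_step2


step1, step2 = two_step_splice("AUGGCC", "GUAAGUCUAACAG", "UUUUAA", branch_index=9)
print(step1)
print(step2)
```

Keeping the intermediate explicit shows why exon order is preserved: exon 2 never leaves the molecule that exon 1 is rejoined to.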
[Figure, panels (a) and (b): splicing of a group II intron between exon 1 and exon 2, with the 5′ and 3′ splice sites marked. 1 An adenine nucleotide within the intron attacks a guanine nucleotide at the 5′ end of the intron,… 2 …creating a 3′-OH group at the end of exon 1 and a lariat structure within the intron.]
Alternative Processing Pathways
Another finding that complicates the view of a gene as a sequence of nucleotides that specifies the amino acid sequence of a protein is the existence of alternative processing pathways, in which a single pre-mRNA is processed in different ways to produce alternative types of mRNA, resulting in the production of different proteins from the same DNA sequence.
One type of alternative processing is alternative splicing, in which the same pre-mRNA can be spliced in more than one way to yield multiple mRNAs that are translated into different amino acid sequences and thus different proteins (FIGURE 14.16a). Another type of alternative processing requires the use of multiple 3′ cleavage sites (FIGURE 14.16b); two or more potential sites for cleavage and polyadenylation are present in the pre-mRNA. In the example in Figure 14.16b, cleavage at the first site produces a relatively short mRNA, compared with the mRNA produced through cleavage at the second site. The use of an alternative cleavage site may or may not produce a different protein, depending on whether the position of the site is before or after the termination codon.
Both alternative splicing and multiple 3′ cleavage sites can exist in the same pre-mRNA transcript; an example is seen in the mammalian gene that encodes calcitonin; this gene contains six exons and five introns (FIGURE 14.17a). The entire gene is transcribed into pre-mRNA (FIGURE 14.17b). There are two possible 3′ cleavage sites. In cells of the thyroid gland, 3′ cleavage and polyadenylation take place after the fourth exon, and the first three introns are then removed to produce a mature mRNA consisting of exons 1, 2, 3, and 4 (FIGURE 14.17c). This mRNA is translated into the hormone calcitonin. In brain cells, the identical pre-mRNA is transcribed from DNA, but it is processed differently. Cleavage and polyadenylation take place after the sixth exon, yielding an initial transcript that includes all six exons.
[FIGURE 14.16: (a) Alternative splicing. A pre-mRNA containing exons 1, 2, and 3 is transcribed and receives a poly(A) tail. Either two introns are removed to yield one mRNA (exon 1, exon 2, exon 3)… or two introns and exon 2 are removed to yield a different mRNA (exon 1, exon 3). (b) Multiple 3′ cleavage sites. A pre-mRNA containing exons 1 and 2 has two 3′ cleavage sites (1 and 2); mRNA products of different lengths are produced after splicing.]
[FIGURE 14.17: (a) DNA with exons 1 through 6. (b) Transcription. (c) mRNA with exons 1, 2, 3, and 4 and a poly(A) tail: calcitonin. (d) mRNA with exons 1, 2, 3, 5, and 6 and a poly(A) tail: calcitonin-gene-related peptide (CGRP).]
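The two processing routes for the calcitonin pre-mRNA summarized in Figure 14.17 can be written as one function with two settings. This is only a bookkeeping sketch: exons are represented by labels rather than sequences, and the function and parameter names are hypothetical.

```python
def process_calcitonin_transcript(cleave_after_exon: int,
                                  skipped_exons: tuple[int, ...] = ()) -> list[str]:
    """List the exons present in the mature mRNA.

    cleave_after_exon: the exon after which 3' cleavage and polyadenylation
                       take place (4 or 6 in the calcitonin gene).
    skipped_exons: exons removed together with the introns during splicing.
    """
    if cleave_after_exon not in (4, 6):
        raise ValueError("the gene described has cleavage sites after exon 4 or exon 6")
    retained = [n for n in range(1, cleave_after_exon + 1) if n not in skipped_exons]
    # Exon order is never rearranged; exons are only kept or dropped.
    return [f"exon {n}" for n in retained]


thyroid_mrna = process_calcitonin_transcript(cleave_after_exon=4)
brain_mrna = process_calcitonin_transcript(cleave_after_exon=6, skipped_exons=(4,))

print("calcitonin:", thyroid_mrna)
print("CGRP:", brain_mrna)
```

The same six-exon gene yields two exon combinations, and in both the retained exons stay in their original order.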
During splicing, exon 4 (part of the calcitonin mRNA) is removed, along with all the introns; so only exons 1, 2, 3, 5, and 6 are present in the mature mRNA (FIGURE 14.17d). When translated, this mRNA produces a protein called calcitonin-gene-related peptide (CGRP), which has an amino acid sequence quite different from that of calcitonin. Alternative splicing may produce different combinations of exons in the mRNA, but the order of the exons is not usually changed. Different processing pathways contribute to gene regulation, as discussed in Chapter 16.
Alternative processing is an important source of protein diversity in vertebrates; an estimated 40% to 60% of all human genes are alternatively spliced. Many human genetic diseases arise from mutations that affect pre-mRNA splicing; indeed, about 15% of single-base substitutions that result in human genetic diseases alter pre-mRNA splicing. Some of these mutations interfere with recognition of the normal 5′ and 3′ splice sites. Others create new splice sites. Mutations within exons can also interfere with the binding of SR proteins to exonic splicing enhancers, causing exons to be omitted from the mature mRNA.
CONCEPTS
Alternative splicing enables exons to be spliced together in different combinations to yield mRNAs that encode different proteins. Alternative 3′ cleavage sites allow pre-mRNA to be cleaved at different sites to produce mRNAs of different lengths.
RNA Editing
A long-standing principle of molecular genetics is that, except for a few RNA viruses, genetic information ultimately resides in the nucleotide sequence of DNA (Chapter 10). This information is transcribed into mRNA, and mRNA is then translated into a protein. The assumption that all information about the amino acid sequence of a protein resides in DNA is violated by a process called RNA editing. In RNA
editing, the coding sequence of an mRNA molecule is altered after transcription, and so the protein has an amino acid sequence that differs from that encoded by the gene.
RNA editing was first detected in 1986 when the coding sequences of mRNAs were compared with the coding sequences of the DNAs from which they had been transcribed. Discrepancies were found for some nuclear genes in mammalian cells and for mitochondrial genes in plant cells. In these cases, there had been substitutions in some of the
CONNECTING CONCEPTS
Eukaryotic Gene Structure and Pre-mRNA Processing
Chapters 13 and 14 have introduced a number of different components of genes and RNA molecules, including promoters, 5′ untranslated regions, coding sequences, introns, 3′ untranslated regions, poly(A) tails, and caps. Let’s see how some of these components are combined to create a typical eukaryotic gene and how a mature mRNA is produced from them.
[FIGURE 14.19, panels recoverable from the page: (a) DNA with promoter and RNA-coding region; the enhancer is typically upstream, but could be downstream or in an intron. (c) Pre-mRNA, 5′ to 3′, containing AAUAAA: 2 A 5′ cap is added. (d) Pre-mRNA with the 3′ cleavage site marked: 3 Cleavage at the 3′ end is approximately 10 nucleotides downstream of the consensus sequence.]
The promoter, which typically encompasses about 100 nucleotides upstream of the transcription start site, is necessary for transcription to take place but is itself not usually transcribed when protein-encoding genes are transcribed by RNA polymerase II (FIGURE 14.19a). Farther upstream or downstream of the start site, there may be enhancers that also regulate transcription.
In transcription, all the nucleotides between the transcription start site and the stop site are transcribed into pre-mRNA, including exons, introns, and a long 3′ end that is later cleaved from the transcript (FIGURE 14.19b). Notice that the 5′ end of the first exon contains the sequence that codes for the 5′ untranslated region and that the 3′ end of the last exon contains the sequence that codes for the 3′ untranslated region.
The pre-mRNA is then processed to yield a mature mRNA. The first step in this processing is the addition of a cap to the 5′ end of the pre-mRNA (FIGURE 14.19c). Next, the 3′ end is cleaved at a site downstream of the AAUAAA consensus sequence in the last exon (FIGURE 14.19d). Immediately after cleavage, a poly(A) tail is added to the 3′ end (FIGURE 14.19e). Finally, the introns are removed to yield the mature mRNA (FIGURE 14.19f). The mRNA now contains 5′ and 3′ untranslated regions, which are not translated into amino acids, and the nucleotides that carry the protein-coding sequences. The nucleotide sequence of a small gene (the human interleukin 2 gene), with these components identified, is presented in FIGURE 14.20.
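The three regions of a mature mRNA named above can be located with a short scan for the start and stop codons. This is a minimal sketch under two assumptions that are labeled in the code: translation is taken to begin at the first AUG, and the example sequence is hypothetical. The cap is not represented.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def partition_mrna(mrna: str) -> tuple[str, str, str]:
    """Split a mature mRNA into (5' UTR, coding sequence, 3' UTR).

    Assumption: the first AUG is the start codon. The coding sequence
    returned includes both the start codon and the stop codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Walk codon by codon in the reading frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")


# Hypothetical mature mRNA: 5' UTR, short coding sequence, 3' UTR with the
# AAUAAA consensus sequence, and a very short poly(A) tail for readability.
example = "GGCACC" + "AUGGCCUUUUAA" + "CCGAAUAAAGC" + "AAAA"
five_utr, coding, three_utr = partition_mrna(example)
print(five_utr, coding, three_utr)
```

In this sketch the AAUAAA signal and the poly(A) tail both fall within the returned 3′ region, consistent with the text's statement that the consensus sequence lies in the last exon.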
TATA box
5’ ….CATCAGAAGAGGAAAAATGAAGGTAATGTTTTTTCAGACAGGTAAAGTCTTTGAAAATATGTGTAATATGTAAAACATTTTGACACCCCCATAATATTTTTCCAGAATTAACAGTATAAATTGCATCTCTTG
TTCAAGAGTTCCCTATCACTCTCTTTAATCACTACTCACAGTAACCTCAACTCCTGCCACAATGTACAGGATGCAACTCCTGTCTTGCATTGCACTAAGTCTTGCACTTGTCACAAACAGTGCACCTACTTCAA
Transcription start site Start codon
Exon 1 Intron 1
GTTCTACAAAGAAAACACAGCTACAACTGGAGCATTTACTTCTGGATTTACAGATGATTTTGAATGGAATTAATGTAAGTATATTTCCTTTCTTACTAAAATTATTACATTTAGTAATCTAGCTGGAGATCATTTCT
Exon 2
TAATAACAATGCATTATACTTTCTTAGAATTACAAGAATCCCAAACTCACCAGGATGCTCACATTTAAGTTTTACATGCCCAAGAAGGTAAGTACAATATTTTATGTTCAATTTCTGTTTTAATAAAATTCAAAGTA
ATATGAAAATTTGCACAGATGGGACTAATAGCAGCTCATCTGAGGTAAAGAGTAACTTTAATTTGTTTTTTTGAAAACCCAAGTTTGATAATGAAGCCTCTATTAAAACAGTTTTACCTATATTTTTAATATATATTT
Intron 2
GTGTGTTGGTGGGGGTGGGAAGAA- - - (+2400bp)- - - -TGCAGAAAGTCTAACATTTTGCAAAGCCAAATTAAGCTAAAACCAGTGAGTCAACTATCACTTAACGCTAGTCATAGGTACTTGAGCCCTAGTTTT
TCCAGTTTTATAATGTAAACTCTACTGGTCCATCTTTACAGTGACATTGAGAACAGAGAGAATGGTAAAAACTACATACTGCTACTCCAAATAAAATAAATTGGAAATTAATTTCTGATTCTGACCTCTATGTAAA
Exon 3
CTGAGCTGATGATAATTATTATTCTAGGCCACAGAACTGAAACATCTTCAGTGTCTAGAAGAAGAACTCAAACCTCTGGAGGAAGTGCTAAATTTAGCTCAAAGCAAAAACTTTCACTTAAGACCCAGGGACT
Intron 3
TAATCAGCAATATCAACGTAATAGTTCTGGAACTAAAGGTAAGGCATTACTTTATTTGCTCTCCTGGAAATAAAAAAAAAAAAGTAGGGGGAAAAGT----(+1900 BP)-----CTTGAAAATAAAGGCAACAGGCCTA
Exon 4
TAAGACTTCAATTGGGAATAACTGTATATAAGGTAAACTACTCTGTACTTTAAAAAATTAACATTTTTCTTTTATAGGGATCTGAAACAACATTCATGTGTGAATATGCTGATGAGACAGCAACCATTGTAGAATTT
CTGAACAGATGGATTACCTTTTGTCAAAGCATCATCTCAACACTGACTTGATAATTAAGTGCTTCCCACTTAAAACATATCAGGCCTTCTATTTATTTAAATATTTAAATTTTATATTTATTGTTGAATGTATGGTTT
Stop codon
GCTACCTATTGTAACTATTATTCTTAATCTTAAAACTATAAATATGGATCTTTTATGATTCTTTTTGTAAGCCCTAGGGGCTCTAAAATGGTTTCACTTATTTATCCCAAAATATTTATTATTATGTTGAATGTTAAATA
TAGTATCTATGTAGATTGGTTAGTAAAACTATTTAATAAATTTGATAAATATAAACAAGCCTGGATATTTGTTATTTTGGAAACAGCACAGAGTAAGCATTTAAATATTTCTTAGTTACTTGTGTGAACTGTAGGATG
Poly(A) consensus sequence 3’ cleavage site
GTTAAAATGCTTACAAAAGTCACTCTTTCTCTGAAGAAATATGTAGAACAGAGATGTAGACTTCTCAAAAGCCCTTGCTTT 3’
You can see that noncoding introns occupy large parts of genes, even when large numbers of bases are not individually listed.
Exons code for fewer than 165 amino acids, a small protein.
[FIGURE 14.22: three views of a tRNA.
This computer-generated space-filling molecular model shows the three-dimensional structure of a tRNA.
This ribbon model emphasizes the internal regions of base pairing.
This flattened cloverleaf model shows pairing between complementary nucleotides. Labeled parts: amino acid attachment site (always CCA) at the 3′ end, acceptor arm, DHU arm, TψC arm, anticodon arm, extra arm (size varies), hydrogen bonds between paired bases, and rare bases.
The anticodon comprises three bases and interacts with a codon in mRNA.
This icon for tRNA will be used in subsequent chapters.]
If there are only four bases in DNA and if all RNA molecules are transcribed from DNA, how do tRNAs acquire these additional bases? Modified bases arise from chemical changes made to the four standard bases after transcription. These changes are carried out by special tRNA-modifying enzymes. For example, the addition of a methyl group to uracil creates the modified base ribothymine.
The structures of all tRNAs are similar, a feature critical to tRNA function. Most tRNAs contain between 74 and 95 nucleotides, some of which are complementary to each other and form intramolecular hydrogen bonds. As a result, each tRNA has a cloverleaf structure (FIGURE 14.22). The cloverleaf has four major arms. If we start at the top and proceed clockwise around the tRNA shown at the right in Figure 14.22, the four major arms are the acceptor arm, the TψC arm, the anticodon arm, and the DHU arm. Three of the arms (the TψC, anticodon, and DHU arms) consist of a stem and a loop. The stem is formed by the pairing of complementary nucleotides, and the loop lies at the terminus of the stem, where there is no nucleotide pairing. Instead of having a loop, the acceptor arm includes the 5′ and 3′ ends of the tRNA molecule. All tRNAs have the same sequence (CCA) at the 3′ end, where the amino acid attaches to the tRNA; so clearly this sequence is not responsible for specifying which amino acid will attach to the tRNA.
The TψC arm is named for the bases of three nucleotides in the loop of this arm: thymine (T), pseudouracil (ψ), and cytosine (C). The anticodon arm lies at the bottom of the tRNA. Three nucleotides at the end of this arm make up the anticodon, which pairs with the corresponding codon on mRNA to ensure that the amino acids link in the correct order. The DHU arm is so named because it often contains the modified base dihydrouridine.
Although each tRNA molecule folds into a cloverleaf owing to the complementary pairing of bases, the cloverleaf is not the three-dimensional (tertiary) structure of tRNAs found in the cell. The results of X-ray crystallographic studies have shown that the cloverleaf folds upon itself to form an L-shaped structure, as illustrated by the space-filling and ribbon models in Figure 14.22. Notice that the acceptor stem is at one end of the tertiary structure and the anticodon is at the other end.
Transfer RNA Gene Structure and Processing
The genes that produce tRNAs may be scattered about the genome or may be in clusters. In E. coli, the genes for some tRNAs are present in a single copy, whereas the genes for other tRNAs are present in several copies; eukaryotic cells usually have many copies of each tRNA gene. All tRNA molecules in both bacterial and eukaryotic cells undergo processing after transcription.
In E. coli, several tRNAs are usually transcribed together as one large precursor tRNA, which is then cut up into pieces, each containing a single tRNA. Additional nucleotides may then be removed one at a time from the 5′ and 3′ ends of the tRNA in a process known as trimming. Base-modifying enzymes may then change some of the standard bases into modified bases, and additional bases (such as CCA at the 3′ end) may be added (FIGURE 14.23). Different tRNAs are processed in different ways; so a generic processing pathway for all tRNAs is not possible. Eukaryotic tRNAs are processed
[Figure 14.23 steps shown: 1 A large precursor tRNA… 2 …is cleaved to produce an individual tRNA molecule. 3 An intron is removed by splicing,… 4 …and bases are added to the 3′ end. 5 Modification of several bases produces the mature tRNA. Labels: 5′ splice site, 3′ splice site, the nucleotides that will form the anticodon, and the anticodon.]
14.23 Transfer RNAs are processed in both bacterial and eukaryotic cells.
Different tRNAs are modified in different ways. One example is shown here.
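The sequence-level steps of the example in Figure 14.23 (cleavage of the precursor, intron removal, and addition of bases to the 3′ end) can be sketched as follows. Because different tRNAs are processed in different ways, this is one illustrative route only; the coordinates, sequences, and function name are hypothetical, and base modification is not modeled because it changes the chemistry of individual bases rather than the order of nucleotides.

```python
from typing import Optional


def process_trna(precursor: str, trna_start: int, trna_end: int,
                 intron: Optional[tuple[int, int]] = None) -> str:
    """Return a processed tRNA sequence from a larger precursor.

    trna_start, trna_end: half-open indexes of the tRNA within the precursor
                          (stands in for cleavage and trimming).
    intron: optional half-open indexes of an intron, relative to the
            excised tRNA.
    """
    # Cleavage and trimming: keep only the tRNA portion of the precursor.
    trna = precursor[trna_start:trna_end]

    # Splicing: cut out the intron and reseal the two halves.
    if intron is not None:
        intron_start, intron_end = intron
        trna = trna[:intron_start] + trna[intron_end:]

    # Addition of bases: every mature tRNA ends in CCA at its 3' end.
    if not trna.endswith("CCA"):
        trna += "CCA"
    return trna


# Hypothetical precursor: leader, 5' half, intron, 3' half, trailer.
precursor = "AAUU" + "GCGGAUUU" + "GUACGU" + "AGCUCGCA" + "UUUU"
mature = process_trna(precursor, trna_start=4, trna_end=26, intron=(8, 14))
print(mature)
```

Passing `intron=None` covers precursors that contain no intron, and a tRNA whose gene already encodes the terminal CCA is returned without extra bases.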
[Figure 14.24 labels and steps shown: methyl groups; 2 The RNA is cleaved into several intermediates… 3 …and trimmed. 4 Mature rRNA molecules are the result. Mature RNAs: 16S rRNA, tRNA, 23S rRNA, and 5S rRNA (prokaryotic); 18S rRNA, 5.8S rRNA, and 28S rRNA (eukaryotic).]
14.24 Ribosomal RNA is processed after transcription. Note that eukaryotic 5S rRNA
is transcribed separately from the small eukaryotic rRNA gene.