
14 RNA MOLECULES AND RNA PROCESSING
• The Immense Dystrophin Gene
• Gene Structure
Gene Organization
Introns
The Concept of the Gene Revisited
• Messenger RNA
The Structure of Messenger RNA
Pre-mRNA Processing
The Addition of the 5′ Cap
The Addition of the Poly(A) Tail
RNA Splicing
Alternative Processing Pathways
RNA Editing
• Transfer RNA
The Structure of Transfer RNA
Transfer RNA Gene Structure and Processing
• Ribosomal RNA
The Structure of the Ribosome
Ribosomal RNA Gene Structure and Processing
• Small Interfering RNAs and MicroRNAs
RNA Interference
Model Genetic Organism: The Nematode Worm Caenorhabditis elegans

For almost 50 years, entertainer Jerry Lewis has served as national chairman of the Muscular Dystrophy Association, a partnership between scientists and citizens aimed at fighting neuromuscular diseases. (Courtesy of the Muscular Dystrophy Association.)

The Immense Dystrophin Gene


The most common and devastating of the muscular dystrophies is Duchenne muscular
dystrophy, a fatal disease that strikes nearly 1 in 3500 males. At birth, affected boys
appear normal. The first symptom is mild muscle weakness appearing between 3 and
5 years of age: the child stumbles frequently, has difficulty climbing stairs, and is unable
to rise from a sitting position. In time, the arm and leg muscles become progressively
weaker. By age 11, those affected are usually confined to a wheelchair and, by age 20, most
persons with Duchenne muscular dystrophy have died. At present, there is no cure for
the disease.
Duchenne muscular dystrophy was first recognized in 1852, and the disease was fully
described in 1861 by Benjamin A. Duchenne, a French physician. Even before Mendel’s
laws were discovered, physicians noticed its X-linked pattern of inheritance, remarking
that the disease developed almost exclusively in males and seemed to be inherited through
unaffected mothers. In spite of this early recognition of its hereditary basis, the biochemical
cause of Duchenne muscular dystrophy remained a mystery until 1987.
In 1985, Louis Kunkel and his colleagues at Harvard Medical School observed a boy
with Duchenne muscular dystrophy whose X chromosome had a visible deletion on the
short arm. Reasoning that this boy’s disease was caused by the absence of a gene within the
deletion, they recognized that the deletion pointed to the location on the X chromosome of
the gene responsible for Duchenne muscular dystrophy. Kunkel and his colleagues located
and cloned the piece of DNA responsible for the disease. Shortly thereafter, the sequence of
the gene was determined, and the protein that it encodes was isolated. This large protein,
called dystrophin, consists of nearly 4000 amino acids and is an integral component of
muscle cells. Persons with Duchenne muscular dystrophy lack functional dystrophin.
The dystrophin gene is among the most remarkable of all genes yet examined. It’s huge,
encompassing more than 2 million nucleotides of DNA. However, only about 12,000
nucleotides encode its amino acids. Why is the dystrophin gene so large? What are all those
other nucleotides doing?
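The scale of the puzzle can be made concrete with a little arithmetic. The sketch below uses only the rounded figures quoted above (about 2 million nucleotides in the gene, about 12,000 coding nucleotides), together with the fact that three nucleotides specify one amino acid. The constant and function names are illustrative and are not taken from the text.

```python
# Rounded figures quoted in the text; the names are illustrative only.
GENE_LENGTH_NT = 2_000_000  # "more than 2 million nucleotides of DNA"
CODING_NT = 12_000          # "only about 12,000 nucleotides encode its amino acids"
NT_PER_AMINO_ACID = 3       # three nucleotides specify one amino acid


def coding_fraction(coding_nt: int, gene_nt: int) -> float:
    """Return the fraction of the gene that specifies amino acids."""
    return coding_nt / gene_nt


def encoded_amino_acids(coding_nt: int) -> int:
    """Return how many amino acids the coding nucleotides can specify."""
    return coding_nt // NT_PER_AMINO_ACID


fraction = coding_fraction(CODING_NT, GENE_LENGTH_NT)
amino_acids = encoded_amino_acids(CODING_NT)
print(f"Coding fraction: {fraction:.1%}")
print(f"Amino acids specified: {amino_acids}")
```

With these round numbers, well under 1% of the gene specifies amino acids, and the amino acid count is in line with the "nearly 4000" given for dystrophin.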
The unusual properties of the dystrophin gene make sense only in the context of RNA
processing—the alteration of RNA after it has been transcribed. Dystrophin messenger
RNA, like many eukaryotic RNAs, undergoes extensive processing after transcription,
including the removal of large sections that are not required for translation. Chapter 13
focused on transcription—the process of RNA synthesis. In this chapter, we will examine
the function and processing of RNA.
We begin by taking a careful look at the nature of the gene. Next, we examine messenger
RNA (mRNA), its structure, and how it is modified in eukaryotes after transcription.
We’ll also see how, through alternative pathways of RNA modification, one gene can
produce several different proteins. Then, we turn to transfer RNA (tRNA), the adapter
molecule that forms the interface between amino acids and mRNA in protein synthesis. We
examine ribosomal RNA (rRNA), the structure and organization of rRNA genes, and how
rRNAs are processed. Finally, we consider a newly discovered class of very small RNAs that
play important roles in RNA degradation, translation inhibition, and other functions.
As we explore the world of RNA and its role in gene function, we will see evidence of
two important characteristics of this nucleic acid. First, RNA is extremely versatile, both
structurally and biochemically. It can assume a number of different secondary structures,
which provide the basis for its functional diversity. Second, RNA processing and function
frequently include interactions between two or more RNA molecules.

www.whfreeman.com/pierce More information about Duchenne muscular dystrophy and the dystrophin gene

Gene Structure
What is a gene? In Chapter 3, it was noted that the definition of gene would appear to change as we explored different aspects of heredity. A gene was defined there as an inherited factor that determines a characteristic. This definition may have seemed vague, because it says nothing about what a gene is, only what it does. Nevertheless, this definition was appropriate for our purposes at the time, because our focus was on how genes influence the inheritance of traits. It wasn’t necessary to consider the physical nature of the gene in learning the rules of inheritance.
Knowing something about the chemical structure of DNA and the process of transcription now enables us to be more precise about what a gene is. Chapter 10 described how genetic information is encoded in the base sequence of DNA; so a gene consists of a set of DNA nucleotides. But how many nucleotides are encompassed in a gene, and how is the information in these nucleotides organized? In 1902, Archibald Garrod suggested, correctly, that genes code for proteins (see pp. 47–48). Proteins are made of amino acids; so a gene contains the nucleotides that specify the amino acids of a protein. We could, then, define a gene as a set of nucleotides that specifies the amino acid sequence of a protein, which indeed was, for many years, the working definition of a gene. As geneticists learned more about the structure of genes, however, it became clear that this concept of a gene was an oversimplification.

Gene Organization
Early work on gene structure was carried out largely through the examination of mutations in bacteria and viruses. This research led Francis Crick in 1958 to propose
that genes and proteins are colinear: that there is a direct correspondence between the nucleotide sequence of DNA and the amino acid sequence of a protein (FIGURE 14.1). The concept of colinearity suggests that the number of nucleotides in a gene should be proportional to the number of amino acids in the protein encoded by that gene. In a general sense, this concept is true for genes found in bacterial cells and many viruses, although these genes are slightly longer than expected if colinearity is strictly applied (the mRNAs encoded by the genes contain sequences at their ends that do not specify amino acids). At first, eukaryotic genes and proteins also were generally assumed to be colinear, but there were hints that eukaryotic gene structure is fundamentally different. Eukaryotic cells contain far more DNA than is required to encode proteins (see Chapter 11). Furthermore, many large RNA molecules observed in the nucleus were absent from the cytoplasm, suggesting that nuclear RNAs undergo some type of change before they are exported to the cytoplasm.

[Figure 14.1. DNA: 5′ CGTGGATACACTTTTGCCGTTTCT 3′ paired with 3′ GCACCTATGTGAAAACGGCAAAGA 5′. Transcription yields mRNA: 5′ CGUGGAUACACUUUUGCCGUUUCU 3′. Translation of its codons yields the polypeptide chain Arg Gly Tyr Thr Phe Ala Val Ser. 1 A continuous sequence of nucleotides in the DNA… 2 …codes for a continuous sequence of amino acids in the protein. Conclusion: With colinearity, the number of nucleotides in the gene is proportional to the number of amino acids in the protein.]
14.1 The concept of colinearity suggests that a continuous sequence of nucleotides in DNA encodes a continuous sequence of amino acids in a protein.

[Figure 14.2. Experiment. Question: Is the coding sequence in a gene always continuous? Methods: 1 Mix DNA with complementary RNA and heat to separate DNA strands. 2 Cool the mixture. Complementary sequences pair. Results: DNA may reanneal with its complementary strand… or with RNA. Noncoding regions of DNA are seen as loops. Conclusion: Coding sequences in a gene may be interrupted by noncoding sequences.]
14.2 The noncolinearity of eukaryotic genes was discovered by hybridizing DNA and mRNA. (Electromicrograph from O.L. Miller, B.R. Beatty, D.W. Fawcett/Visuals Unlimited.)

Most geneticists were nevertheless surprised by the announcement in the 1970s that four coding sequences in a gene from a eukaryotic virus were interrupted by nucleotides that did not specify amino acids. This discovery was made when the viral DNA was hybridized with the mRNA transcribed from it, and the hybridized structure was examined with the use of an electron microscope (FIGURE 14.2). The DNA was clearly much longer than the mRNA, because regions of DNA looped out from the
hybridized molecules. These regions contained nucleotides in the DNA that were absent from the coding nucleotides in the mRNA. Many other examples of interrupted genes were subsequently discovered; it quickly became apparent that most eukaryotic genes consist of stretches of coding and noncoding nucleotides.

CONCEPTS
When a continuous sequence of nucleotides in DNA encodes a continuous sequence of amino acids in a protein, the two are said to be colinear. The discovery of coding and noncoding regions within eukaryotic genes shows that not all genes are colinear with the proteins that they encode.

Introns
Many eukaryotic genes contain coding regions called exons and noncoding regions called intervening sequences or introns. For example, the ovalbumin gene has eight exons and seven introns; the gene for cytochrome b has five exons and four introns (FIGURE 14.3). All the introns and the exons are initially transcribed into RNA but, after transcription, the introns are removed by splicing and the exons are joined to yield the mature RNA.
Introns are common in eukaryotic genes but are rare in bacterial genes. For a number of years after their discovery, introns were thought to be entirely absent from prokaryotic genomes, but they have now been observed in archaea, bacteriophages, and even some eubacteria. Introns are present in mitochondrial and chloroplast genes, as well as nuclear genes. In eukaryotic genomes, the size and number of introns appear to be directly related to increasing organismal complexity. Yeast genes contain only a few short introns; Drosophila introns are longer and more numerous; and most vertebrate genes are interrupted by long introns. All classes of genes (those that code for rRNA, tRNA, and proteins) may contain introns. The number and size of introns vary widely: some eukaryotic genes have no introns, whereas others may have more than 60; intron length varies from fewer than 200 nucleotides to more than 50,000. Introns tend to be longer than exons, and most eukaryotic genes contain more noncoding nucleotides than coding nucleotides. Finally, most introns do not encode proteins (an intron of one gene is not usually an exon for another), although geneticists are finding a growing number of exceptions.
There are four major types of introns (Table 14.1). Group I introns, found in some rRNA genes, are self-splicing: they can catalyze their own removal. Group II introns are present in some protein-encoding genes of mitochondria, chloroplasts, and a few eubacteria; they also are self-splicing, but their mechanism of splicing differs from that of the group I introns. Nuclear pre-mRNA introns are the best studied; they include introns located in the protein-encoding genes of the nucleus. The splicing mechanism by which these introns are removed is similar to that of the group II introns, but nuclear introns are not self-splicing; their removal requires snRNAs (discussed later) and a number of proteins. Transfer RNA introns, found in tRNA genes, utilize yet another splicing mechanism that relies on enzymes to cut and reseal the RNA. In addition to these major groups, there are several other types of introns.
We’ll take a detailed look at the chemistry and mechanics of RNA splicing later in the chapter. For now, we should keep in mind two general characteristics of the splicing process: (1) the splicing of all pre-mRNA introns takes place in the nucleus; and (2) the order of exons in DNA is usually maintained in the spliced RNA; the coding sequences of a gene may be split up, but they are not usually jumbled up.

CONCEPTS
Many eukaryotic genes contain exons and introns, both of which are transcribed into RNA, but introns are later removed by RNA processing. The number and size of introns vary from gene to gene; they are common in many eukaryotic genes but uncommon in bacterial genes.

[Figure 14.3. The ovalbumin gene (exons 1 to 8, separated by introns) and the cytochrome b gene (exons 1 to 5, separated by introns), each shown as DNA and as mRNA. DNA is transcribed into RNA, and introns are removed by RNA splicing, so that each mRNA contains only the joined exons.]
14.3 The coding sequences of many eukaryotic genes are disrupted by noncoding introns.
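The removal of introns and joining of exons described in this section can be sketched in a few lines. This is a minimal illustration rather than a model of the splicing machinery: the sequences are invented, the exon coordinates are supplied by hand, and the function name `splice` is hypothetical.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments in their original 5'-to-3' order.

    Each exon is a (start, end) pair of 0-based, end-exclusive positions.
    Sorting the pairs keeps the exons in the order found in the transcript.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy transcript (hypothetical sequences): two exons separated by one intron.
exon_1 = "AUGGCC"
intron = "GUAAGUCCUUACUAACUUUUCAG"
exon_2 = "UUUUAA"
pre_mrna = exon_1 + intron + exon_2

exon_coordinates = [
    (0, len(exon_1)),
    (len(exon_1) + len(intron), len(pre_mrna)),
]
mature_rna = splice(pre_mrna, exon_coordinates)
print(mature_rna)
```

The intron is transcribed along with the exons, so it is present in `pre_mrna`; only the spliced product lacks it.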
Table 14.1 Major types of introns

Type of intron      Location                                                   Splicing
Group I             Some rRNA genes                                            Self-splicing
Group II            Protein-encoding genes in mitochondria and chloroplasts    Self-splicing
Nuclear pre-mRNA    Protein-encoding genes in the nucleus                      Spliceosomal
tRNA                tRNA genes                                                 Enzymatic

Note: There are also several types of minor introns, including group III introns, twintrons, and archaeal introns.

The Concept of the Gene Revisited
How does the presence of introns affect our concept of a gene? It no longer seems appropriate to define a gene as a sequence of nucleotides that codes for amino acids in a protein, because this definition excludes introns, which do not specify amino acids. This definition also excludes nucleotides that code for the 5′ and 3′ ends of an mRNA molecule, which are required for translation but do not code for amino acids. And defining a gene in these terms also excludes sequences that encode rRNA, tRNA, and other RNAs that do not encode proteins. In view of our current understanding of DNA structure and function, we need a more satisfactory definition of gene.
Many geneticists have broadened the concept of a gene to include all sequences in the DNA that are transcribed into a single RNA molecule. Defined in this way, a gene includes all exons, introns, and those sequences at the beginning and end of the RNA that are not translated into a protein. This definition also includes DNA sequences that code for rRNAs, tRNAs, and other types of nonmessenger RNA. Some geneticists have expanded the definition of a gene even further, to include the entire transcription unit: the promoter, the RNA coding sequence, and the terminator.

CONCEPTS
The discovery of introns forced a reevaluation of the definition of the gene. Today, a gene is often defined as a DNA sequence that codes for an RNA molecule or the entire DNA sequence required to transcribe and code for an RNA molecule.

Messenger RNA
As soon as DNA was identified as the source of genetic information, it became clear that DNA could not directly encode proteins. In eukaryotic cells, DNA resides in the nucleus, yet most protein synthesis takes place in the cytoplasm. Geneticists recognized that an additional molecule must take part in the transfer of genetic information.
The results of studies of bacteriophage infection conducted in the late 1950s and early 1960s pointed to RNA as a likely candidate for this transport function. Bacteriophages inject their DNA into bacterial cells, where the DNA is replicated, and large amounts of phage protein are produced on the bacterial ribosomes. As early as 1953, Alfred Hershey discovered a type of RNA that was synthesized rapidly after bacteriophage infection. Findings from later studies showed that the bacteriophage T2 produced short-lived RNA having a nucleotide composition similar to that of phage DNA but quite different from that of the bacterial RNA. These observations were consistent with the idea that RNA was copied from DNA and that this RNA then directed the synthesis of proteins.
At the time, ribosomes were known to be somehow implicated in protein synthesis, and much of the RNA in a cell was known to be in the form of ribosomes. Each gene was thought to direct the synthesis of a special type of ribosome in the nucleus, which then moved to the cytoplasm and produced a specific protein. Using equilibrium density gradient centrifugation (see Figure 12.2), Sydney Brenner, François Jacob, and Matthew Meselson demonstrated in 1961 that new ribosomes are not produced during the burst of protein synthesis that accompanies phage infection (FIGURE 14.4). The genetic information needed to produce new phage proteins was not carried by the ribosomes.
In a related experiment, François Gros and his colleagues infected E. coli cells with bacteriophages while radioactively labeled (“hot”) uracil was added to the medium (which would become incorporated into newly produced phage RNA). After a few minutes, they transferred the cells to a medium that contained unlabeled (“cold”) uracil. This type of experiment is called a pulse–chase experiment: the cells are exposed to a brief pulse of label, which is then “chased” by cold, unlabeled precursor. Pulse–chase experiments make it possible to follow, by tracking the presence of the radioactivity, products of short-term biochemical events, such as RNA synthesis immediately following phage infection. Gros and his coworkers found that the newly produced phage RNA was short lived, lasting only a few minutes, and was associated with ribosomes but was distinct from them. They concluded that newly synthesized, short-lived RNA carries the genetic information for protein structure to the ribosome. The term messenger RNA was coined for this carrier.

The Structure of Messenger RNA
Messenger RNA functions as the template for protein synthesis; it carries genetic information from DNA to a ribosome and helps to assemble amino acids in their correct order. Each amino acid in a protein is specified by a set of three nucleotides in the mRNA, called a codon. Both prokaryotic and eukaryotic mRNAs contain three primary
regions (FIGURE 14.5). The 5′ untranslated region (5′ UTR; sometimes called the leader) is a sequence of nucleotides that is at the 5′ end of the mRNA and does not code for the amino acid sequence of a protein. In bacterial mRNA, this region contains a consensus sequence called the Shine-Dalgarno sequence, which serves as the ribosome-binding site during translation; it is found approximately seven nucleotides upstream of the first codon translated into an amino acid (called the start codon). Eukaryotic mRNA has no equivalent consensus sequence in its 5′ untranslated region. In eukaryotic cells, ribosomes bind to a modified 5′ end of mRNA, as discussed later in the chapter.

[Figure 14.4. Experiment. Question: Do ribosomes carry genetic information? Methods: 1 E. coli were grown in medium containing heavy isotopes (15N and 13C) through several generations so that the heavy isotopes would become incorporated into all E. coli ribosomes. 2 The cells were moved into medium containing light isotopes (14N and 12C)… 3 …and infected with bacteriophage. 4 If new ribosomes were produced after phage infection, they would contain 14N and 12C and would be relatively light. 5 After phage proteins were produced, ribosomes were separated by equilibrium density gradient centrifugation. Results: 6 Only old ribosomes containing heavy isotopes (15N and 13C) were found. Conclusion: Ribosomes are not produced in phage reproduction.]
14.4 Brenner, Jacob, and Meselson demonstrated that ribosomes do not carry genetic information.

[Figure 14.5. An mRNA drawn 5′ to 3′: the 5′ untranslated region (containing the Shine-Dalgarno sequence, in prokaryotes only), the start codon, the protein-coding region, the stop codon, and the 3′ untranslated region.]
14.5 Three primary regions of mature mRNA are the 5′ untranslated region, the protein-coding region, and the 3′ untranslated region.

The next section of mRNA is the protein-coding region, which comprises the codons that specify the amino acid sequence of the protein. The protein-coding region begins with a start codon and ends with a stop codon. The last region of mRNA is the 3′ untranslated region (3′ UTR; sometimes called a trailer), a sequence of nucleotides that is at the 3′ end of the mRNA and is not translated into protein. The 3′ untranslated region affects the stability of mRNA and the translation of the mRNA protein-coding sequence.

CONCEPTS
Messenger RNA molecules contain three main regions: a 5′ untranslated region, a protein-coding region, and a 3′ untranslated region. The 5′ and 3′ untranslated regions do not code for the amino acids of a protein.

Pre-mRNA Processing
In bacterial cells, transcription and translation take place simultaneously; while the 3′ end of an mRNA is undergoing transcription, ribosomes attach to the Shine-Dalgarno sequence near the 5′ end and begin translation. Because transcription and translation are coupled, there is little opportunity for the bacterial mRNA to be modified before protein synthesis. In contrast, transcription and translation are separated in both time and space in eukaryotic cells. Transcription takes place in the nucleus, whereas most translation takes place in the cytoplasm; this separation provides an opportunity for eukaryotic RNA to be modified before it is translated. Indeed, eukaryotic mRNA is extensively altered after transcription. Changes are made to the 5′ end, the 3′ end, and the protein-coding section of the RNA molecule. The initial transcript of protein-encoding genes of eukaryotic cells is called pre-mRNA, whereas the mature, processed transcript is mRNA. We will reserve the term mRNA for
RNA molecules that have been completely processed and are ready to undergo translation.
Recent research findings have demonstrated that some translation in eukaryotes does take place in the nucleus. Therefore, some eukaryotic transcription and translation may be coupled as in prokaryotes. The significance of this coupling for RNA processing is not yet clear.

The Addition of the 5′ Cap
One type of modification of eukaryotic pre-mRNAs is the addition at their 5′ ends of a structure called a 5′ cap. This capping consists of the addition of an extra nucleotide at the 5′ end of the mRNA and methylation by the addition of a methyl group (CH3) to the base in the newly added nucleotide and to the 2′–OH group of the sugar of one or more nucleotides at the 5′ end (FIGURE 14.6). Capping takes place rapidly after the initiation of transcription and, as will be discussed in more depth in Chapter 15, the 5′ cap functions in the initiation of translation. Cap-binding proteins recognize the cap and attach to it; a ribosome then binds to these proteins and moves downstream along the mRNA until the start codon is reached and translation begins. The presence of a 5′ cap also increases the stability of mRNA and influences the removal of introns.
As noted in the discussion of transcription in Chapter 13, three phosphates are present at the 5′ end of all RNA molecules, because phosphates are not cleaved from the first ribonucleoside triphosphate in the transcription reaction. The 5′ end of pre-mRNA can be represented as 5′–pppNpNpN . . ., in which the letter N represents a ribonucleotide and p represents a phosphate. Shortly after the initiation of transcription, one of these phosphates is removed and a guanine nucleotide is added (see Figure 14.6). This guanine nucleotide is attached to the pre-mRNA by a unique 5′–5′ bond, which is quite different from the usual 5′–3′ phosphodiester bond that joins all the other nucleotides in RNA: essentially, the guanine nucleotide is attached upside down to the 5′ end of the pre-mRNA. One or more methyl groups are then added to the 5′ end; the first of these methyl groups is added to position 7 of the base of the terminal guanine nucleotide, making the base 7-methylguanine. Next, a methyl group may be added to the 2′ position of the sugar in the second and third nucleotides, as shown in Figure 14.6. Rarely, additional methyl groups may be attached to the bases of the second and third nucleotides of the pre-mRNA.

[Figure 14.6. The 5′ end of the mRNA (5′ PPP N P N P 3′) is modified in steps. 1 One of the three phosphates at the 5′ end of the mRNA is removed,… 2 …and a guanine nucleotide (with its phosphate) is added. 3 Methyl groups are added to position 7 of the base of the terminal guanine nucleotide,… 4 …and to the 2′ position of the sugar in the second and third nucleotides. 5 The base on the initial nucleotide also may be methylated. A detail box shows 7-methylguanine, the 7-methyl group, and the 2′-methyl groups.]
14.6 Most eukaryotic mRNAs have a 5′ cap. The cap consists of a nucleotide with 7-methylguanine attached to the pre-mRNA by a unique 5′–5′ bond (shown in detail in the bottom box). The cap is added shortly after the initiation of transcription. A methyl group is added to position 7 of the guanine base of the newly added (now the terminal) nucleotide and to the 2′ position of each sugar of the next two nucleotides.
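The order of the capping steps can be traced with the 5′–pppNpNpN shorthand used in this section. The sketch below is only a bookkeeping aid. The prefixes "m7" (methyl group on position 7 of guanine) and the suffix "m" (2′ methyl group on a sugar) are notational assumptions added here, and the function name is hypothetical.

```python
def add_five_prime_cap(five_prime_end: str = "pppNpNpN") -> list[str]:
    """Return the 5' end after each capping step, in shorthand notation."""
    if not five_prime_end.startswith("ppp"):
        raise ValueError("expected a 5' triphosphate end such as 'pppNpNpN'")

    steps = [five_prime_end]

    # Step 1: one of the three 5' phosphates is removed.
    diphosphate_end = five_prime_end[1:]
    steps.append(diphosphate_end)

    # Step 2: a guanine nucleotide, with its phosphate, is added (5'-5' bond).
    guanine_added = "Gp" + diphosphate_end
    steps.append(guanine_added)

    # Step 3: position 7 of the terminal guanine base is methylated.
    base_methylated = "m7" + guanine_added
    steps.append(base_methylated)

    # Step 4: the 2' position of the sugar in the next two nucleotides
    # may be methylated (written here as "Nm").
    sugar_methylated = base_methylated.replace("N", "Nm", 2)
    steps.append(sugar_methylated)

    return steps


cap_steps = add_five_prime_cap()
for step in cap_steps:
    print(step)
```

The three phosphates between G and the first N in the final strings reflect the 5′–5′ linkage: two phosphates remain from the transcript and one arrives with the guanine nucleotide.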
Several different enzymes take part in the addition of the 5′ cap. The initial step is carried out by an enzyme that associates with RNA polymerase II. Because neither RNA polymerase I nor RNA polymerase III has this associated enzyme, RNA molecules transcribed by these polymerases (rRNAs, tRNAs, and some snRNAs) are not capped.

The Addition of the Poly(A) Tail
A second type of modification to eukaryotic mRNA is the addition of from 50 to 250 adenine nucleotides at the 3′ end, forming a poly(A) tail. These nucleotides are not encoded in the DNA but are added after transcription (FIGURE 14.7) in a process termed polyadenylation. Many eukaryotic genes transcribed by RNA polymerase II are transcribed well beyond the end of the coding sequence (see Chapter 13); the extra material at the 3′ end is then cleaved and the poly(A) tail is added. For some pre-mRNA molecules, more than 1000 nucleotides may be removed from the 3′ end.
Processing of the 3′ end of pre-mRNA requires sequences both upstream and downstream of the cleavage site (FIGURE 14.8). The consensus sequence AAUAAA is usually from 11 to 30 nucleotides upstream of the cleavage site (see Figure 14.7) and determines the point at which cleavage will take place. A sequence rich in uracil nucleotides (or guanine and uracil nucleotides) is typically downstream of the cleavage site.
In mammals, 3′ cleavage and the addition of the poly(A) tail require a complex consisting of several proteins: cleavage and polyadenylation specificity factor (CPSF); cleavage stimulation factor (CstF); at least two cleavage factors (CFI and CFII); and polyadenylate polymerase (PAP). CPSF binds to the upstream AAUAAA consensus sequence, whereas CstF binds to the downstream sequence. The pre-mRNA is cleaved, and CstF and the cleavage factors leave the complex; the cleaved 3′ end of the pre-mRNA is then degraded. CPSF and PAP remain bound to the pre-mRNA and carry out polyadenylation. After the addition of approximately 10 adenine nucleotides, a poly(A)-binding protein (PABII) attaches to the poly(A) tail and increases the rate of polyadenylation. As more of the tail is synthesized, additional molecules of PABII attach to it.
The poly(A) tail confers stability on many mRNAs, increasing the time during which the mRNA remains intact and available for translation before it is degraded by cellular enzymes. The stability conferred by the poly(A) tail depends on the proteins that attach to the tail. The poly(A) tail also facilitates attachment of the ribosome to the mRNA.
The eukaryotic mRNAs that code for core histone proteins (see Chapter 11) are unique in that they lack a poly(A) tail and depend on a different mechanism for 3′ cleavage. This process requires the formation of a hairpin structure in the pre-mRNA and a small nuclear ribonucleoprotein particle (snRNP) called U7 (FIGURE 14.9). The U7 particle contains an snRNA with nucleotides that are complementary to a sequence on the pre-mRNA just downstream of the cleavage site, and U7 most likely binds to this sequence. A hairpin-binding protein binds to the hairpin structure and stabilizes the binding of U7 to the complementary sequence on the pre-mRNA. The hairpin-binding protein also stabilizes the mRNA and increases its rate of translation.

[Figure 14.7. A pre-mRNA (5′ … AAUAAA … 3′) is cleaved at a position from 11 to 30 nucleotides downstream of the consensus sequence, in the 3′ untranslated region. The addition of adenine nucleotides (polyadenylation) takes place at the 3′ end of the pre-mRNA, generating the poly(A) tail. Conclusion: In pre-mRNA processing, a poly(A) tail is added through cleavage and polyadenylation.]
14.7 Most eukaryotic mRNAs have a 3′ poly(A) tail.
[Figure 14.8. A pre-mRNA with the AAUAAA consensus sequence, the cleavage site, and a downstream U-rich sequence. 1 A complex of proteins (cleavage and polyadenylation specificity factor, CPSF; cleavage stimulation factor, CstF; cleavage factors, CF; and polyadenylate polymerase, PAP) links the consensus sequence and downstream U-rich sequence. 2 Cleavage takes place. Cleavage factors and the 3′ end of the pre-mRNA are released. 3 The 3′ end of the pre-mRNA is degraded. 4 Polyadenylate polymerase adds adenine nucleotides to the 3′ end of the new mRNA… 5 …and poly(A)-binding protein (PABII) attaches to the poly(A) tail and increases the rate of polyadenylation. 6 Polyadenylation and continued PABII binding elongate the poly(A) tail.]
14.8 Processing of the 3′ end of pre-mRNA requires a consensus sequence and several factors.

[Figure 14.9. A pre-mRNA with a hairpin bound by the hairpin-binding protein, followed by the cleavage site and a consensus sequence (5′ GAAAGA 3′) that lies in the region of probable pairing with the U7 snRNA (3′ CUUUCU 5′).]
14.9 Eukaryotic mRNAs that lack a poly(A) tail depend on a different mechanism for 3′ cleavage. Cleavage requires the presence of U7 snRNA, which has bases complementary to a consensus sequence downstream of the 3′ cleavage site. Cleavage depends on the formation of a hairpin structure near the 3′ end of the pre-mRNA; base pairing probably takes place between the complementary regions of the pre-mRNA and the U7 snRNA.

CONCEPTS
Eukaryotic pre-mRNAs are processed at their 5′ and 3′ ends. A cap, consisting of a modified nucleotide and several methyl groups, is added to the 5′ end. The cap facilitates the binding of a ribosome, increases the stability of the mRNA, and may affect the removal of introns. Processing at the 3′ end includes cleavage downstream of an AAUAAA consensus sequence and the addition of a poly(A) tail.

RNA Splicing
The other major type of modification of eukaryotic pre-mRNA is the removal of introns by RNA splicing. This modification takes place in the nucleus, usually after transcription and the addition of the poly(A) tail but before the RNA moves to the cytoplasm.

Consensus sequences and the spliceosome. Splicing requires the presence of three sequences in the intron. One end of the intron is referred to as the 5′ splice site, and the other end is the 3′ splice site (FIGURE 14.10); these splice sites possess short consensus sequences. Most introns in pre-mRNA begin with GU and end with AG, indicating that these sequences play a crucial role in splicing. Indeed, changing a single nucleotide at either of these sites does prevent splicing. A few introns in pre-mRNA begin with AU and end with AC. These introns are spliced by a process that is similar to that seen in GU . . . AG introns but utilizes a different set of splicing factors. This discussion will focus on splicing of the more common GU . . . AG introns.

[Figure 14.10. Exon 1, intron, and exon 2, with the 5′ consensus sequence (C/A)AG | GU(A/G)AGU at the 5′ splice site, the branch point sequence YNYYRAY within the intron, and the 3′ consensus sequence CAG | G at the 3′ splice site.]

14.10 Splicing of pre-mRNA requires consensus sequences. Critical consensus sequences are present at the 5′ splice site, the branch point, and the 3′ splice site. In the consensus sequence surrounding the branch point (YNYYRAY), Y is any pyrimidine, R is any purine, A is adenine, and N is any base.

The third sequence important for splicing is at the branch point, which is an adenine nucleotide that lies from 18 to 40 nucleotides upstream of the 3′ splice site (see Figure 14.10). The sequence surrounding the branch point does not have a strong consensus but usually takes the form YNYYRAY (Y is any pyrimidine, N is any base, R is any purine, and A is adenine). The deletion or mutation of the adenine nucleotide at the branch point prevents splicing.

Splicing takes place within a large complex called the spliceosome, which consists of several RNA molecules and many proteins. The RNA components are small nuclear RNAs (Chapter 13) ranging in length from 107 to 210 nucleotides; these snRNAs associate with proteins to form small ribonucleoprotein particles (snRNPs, usually pronounced “snurps”). Each snRNP contains a single snRNA molecule and multiple proteins. The spliceosome is composed of five snRNPs, named for the snRNAs that they contain (U1, U2, U4, U5, and U6), and some proteins not associated with an snRNA.

CONCEPTS
Introns in nuclear genes contain three consensus sequences critical to splicing: a 5′ splice site, a 3′ splice site, and a branch point. Splicing of pre-mRNA takes place within a large complex called the spliceosome, which consists of snRNAs and proteins.

The process of splicing
To illustrate the process of RNA splicing, we’ll first consider the chemical reactions that take place. Then we’ll see how these splicing reactions constitute a set of coordinated processes within the context of the spliceosome.

Before splicing takes place, an intron lies between an upstream exon (exon 1) and a downstream exon (exon 2), as shown in FIGURE 14.11. Pre-mRNA is spliced in two distinct steps. In the first step, the pre-mRNA is cut at the 5′ splice site. This cut frees exon 1 from the intron, and the 5′ end of the intron attaches to the branch point; that is, the intron folds back on itself, forming a structure called a lariat.
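The three consensus features of a GU . . . AG intron described above can be checked on a sequence with a few lines of code. The following is only a minimal sketch: the function name `check_intron`, the example sequence, and the convention for counting the distance from the branch-point adenine to the 3′ splice site are illustrative assumptions, not part of the chapter.

```python
import re

# Branch-point consensus YNYYRAY from the text:
# Y = pyrimidine (C or U), R = purine (A or G), N = any base, A = adenine.
# The lookahead allows overlapping candidate motifs to be reported.
BRANCH_MOTIF = re.compile(r"(?=[CU][ACGU][CU][CU][AG]A[CU])")


def check_intron(intron: str) -> dict:
    """Report the consensus features of an intron given as RNA, 5' to 3'."""
    intron = intron.upper()
    distances = []
    for match in BRANCH_MOTIF.finditer(intron):
        adenine_index = match.start() + 5  # the A is the sixth base of YNYYRAY
        # Assumed convention: count nucleotides from the branch-point A
        # (inclusive) to the 3' end of the intron.
        distances.append(len(intron) - adenine_index)
    return {
        "starts_with_GU": intron.startswith("GU"),
        "ends_with_AG": intron.endswith("AG"),
        "branch_distances": distances,
        "branch_in_range": any(18 <= d <= 40 for d in distances),
    }


# Hypothetical intron built for illustration only.
example = "GU" + "A" * 10 + "UACUAAC" + "U" * 21 + "AG"
report = check_intron(example)
print(report)
```

A real intron is far longer than this toy sequence, and the text notes that the branch-point consensus is weak, so a match to YNYYRAY is a hint rather than proof of a functional branch point.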

[Figure 14.11. Pre-mRNA with exon 1, intron, and exon 2; the 5′ splice site, the branch-point adenine, and the 3′ splice site are marked.]
1 The mRNA is cut at the 5′ splice site.
2 The 5′ end of the intron attaches to the branch point.
3 A cut is made at the 3′ splice site.
4 The intron is released as a lariat,…
5 …and the two exons are spliced together.
6 The bond holding the lariat is broken, and the linear intron is degraded.
7 The spliced mRNA is exported to the cytoplasm and translated.

14.11 The splicing of nuclear introns requires a two-step process. First, cleavage takes place at the 5′ splice site, and a lariat is formed by the attachment of the 5′ end of the intron to the branch point. Second, cleavage takes place at the 3′ splice site, and two exons are spliced together.

Table 14.2 RNA–RNA interactions in pre-mRNA splicing

Interaction                                            Function
U1 with 5′ splice site                                 U1 attaches to 5′ end of intron; commits intron to splicing; no direct role in splicing
U2 with branch point                                   Positions 5′ end of intron near branch point for lariat formation
U2 with U6                                             Holds 5′ end of intron near branch point
U6 with 5′ splice site                                 Positions 5′ end of intron near branch point
U5 with 3′ end of first exon                           Anchors first exon to spliceosome subsequent to cleavage; juxtaposes two ends of exon for splicing
U5 with 3′ end of one exon and 5′ end of the other     Juxtaposes two ends of exon for splicing
U4 with U6                                             Delivers U6 to intron; no direct role in splicing

The guanine nucleotide in the consensus sequence at the 5′ splice site bonds with the adenine nucleotide at the branch point. This bonding is accomplished through transesterification, a chemical reaction in which the OH group on the 2′-carbon atom of the adenine nucleotide at the branch point attacks the 5′ phosphodiester bond of the guanine nucleotide at the 5′ splice site, cleaving it and forming a new 5′–2′ phosphodiester bond between the guanine and adenine nucleotides.

In the second step of RNA splicing, a cut is made at the 3′ splice site and, simultaneously, the 3′ end of exon 1 becomes covalently attached (spliced) to the 5′ end of exon 2. This bond also forms through a transesterification reaction, in which the 3′-OH group attached to the end of exon 1 attacks the phosphodiester bond at the 3′ splice site, cleaving it and forming a new phosphodiester bond between the 3′ end of exon 1 and the 5′ end of exon 2; the intron is released as a lariat. The intron becomes linear when the bond breaks at the branch point and is then rapidly degraded by nuclear enzymes. The mature mRNA consisting of the exons spliced together is exported to the cytoplasm, where it is translated.

Although splicing is illustrated in Figure 14.11 as a two-step process, the reactions are in fact coordinated within the spliceosome. A key feature of the spliceosome is a series of interactions between the mRNA and snRNAs and between different snRNAs (summarized in Table 14.2). These interactions depend on complementary base pairing between the different RNA molecules and bring the essential components of the pre-mRNA transcript and the spliceosome close together, which makes splicing possible.

The spliceosome is assembled on the pre-mRNA transcript in a step-by-step fashion (FIGURE 14.12). First, snRNP U1 attaches to the 5′ splice site, and then U2 attaches to the branch point. A complex consisting of U4, U5, and U6 (which form a single snRNP) joins the spliceosome. This addition causes a conformational change in the spliceosome, the intron loops over, and the 5′ splice site is brought close to the branch point. Particles U1 and U4 dissociate from the spliceosome, with the subsequent formation of base pairs between U6 and U2 and between U6 and the 5′ splice site. The 5′ splice site, 3′ splice site, and branch point are in close proximity, held together by the spliceosome. The two transesterification reactions take place, joining the two exons together and releasing the intron as a lariat.

[Figure 14.12. Pre-mRNA with exon 1, intron (branch-point A marked), and exon 2, shown with snRNPs U1, U2, U4, U5, and U6 as the spliceosome assembles.]
1 U1 attaches to the 5′ splice site.
2 U2 attaches to the branch point.
3 A complex of U4, U5, and U6 joins the spliceosome.
4 U1 and U4 are released.
5 The 5′ splice site, 3′ splice site, and branch point are in close proximity,…
6 …and are held together by pairing between the pre-mRNA and the snRNP.
7 Two transesterification reactions join the exons together and release the intron as a lariat with U2, U5, and U6 attached.

14.12 RNA splicing takes place within the spliceosome.
The consensus sequences found at the 5′ and 3′ ends of the introns are clearly important in splicing; however, in more complex eukaryotes with long introns, these sequences by themselves are not sufficient for the splicing machinery to properly recognize the ends of the intron. In these organisms, proper recognition of the 5′ and 3′ splice sites requires consensus sequences, termed exonic splicing enhancers (ESEs), located in the adjacent exons. Unlike the transcriptional enhancers discussed in Chapter 13, an ESE must be in a particular position relative to the splice site. In the splicing reactions, SR (serine- and arginine-rich) proteins bind to ESEs and help recruit the splicing machinery to the 5′ and 3′ splice sites and to the branch point.

Most mRNAs are produced from a single pre-mRNA molecule from which the exons are spliced together. However, in some organisms mRNAs may be produced by splicing together exons from two or more different pre-mRNAs; this process is called trans-splicing.

Nuclear organization
RNA splicing, which takes place in the nucleus, must occur before the RNA can move into the cytoplasm. For many years, the nucleus was viewed as a biochemical soup, in which components such as the spliceosome diffused and reacted randomly. Now, the nucleus is believed to have a highly ordered internal structure, with transcription and RNA processing taking place at particular locations within it. By attaching fluorescent tags to pre-mRNA and using special imaging techniques, researchers have been able to observe the location of pre-mRNA as it is transcribed and processed. The results of these studies revealed that intron removal and other processing reactions take place at the same sites as those of transcription (FIGURE 14.13), suggesting that these processes may be physically coupled. This suggestion is supported by the observation that part of RNA polymerase II is also required for the splicing and 3′ processing of pre-mRNA.

14.13 Intron removal, processing, and transcription take place at the same site. RNA tracks can be seen in the nucleus of a eukaryotic cell. Fluorescent tags were attached to DNA (red) and RNA (green). Transcribed RNA does not disperse; rather, it accumulates near the site of synthesis and follows a defined track during processing. (R. W. Dirks, K. C. Daniël, and A. K. Raap.)

CONCEPTS
Intron splicing of nuclear genes is a two-step process: (1) the 5′ end of the intron is cleaved and attached to the branch point to form a lariat and (2) the 3′ end of the intron is cleaved and the two exons are spliced together. These reactions take place within the spliceosome.

Self-splicing introns
Some introns are self-splicing, meaning that they possess the ability to remove themselves from an RNA molecule. These self-splicing introns fall into two major categories. Group I introns are found in a variety of genes, including some rRNA genes in protists, some mitochondrial genes in fungi, and even some bacteriophage genes. Although the lengths of group I introns vary, all of them fold into a common secondary structure with nine looped stems (FIGURE 14.14a), which are necessary for splicing. Transesterification reactions are required for the splicing of group I introns (FIGURE 14.14b).

Group II introns, present in some mitochondrial genes, also have the ability to self-splice. All group II introns fold into similar secondary structures (FIGURE 14.15a). The splicing of group II introns is accomplished by a mechanism that has some similarities to the spliceosome-mediated splicing of nuclear genes; splicing takes place through two transesterification reactions that generate a lariat structure (FIGURE 14.15b). Because of these similarities, group II introns and nuclear pre-mRNA introns have been suggested to be evolutionarily related; perhaps the nuclear introns evolved from self-splicing group II introns and later adopted the proteins and snRNAs of the spliceosome to carry out the splicing reaction.

CONCEPTS
Self-splicing introns are of two types: group I introns and group II introns. These introns have complex secondary structures that enable them to catalyze their excision from RNA molecules without the aid of enzymes or other proteins.
[Figure 14.14. (a) Folded group I intron between exon 1 and exon 2, with the 5′ and 3′ splice sites marked; everything except the exons is removed by splicing. (b) Reaction steps:]
1 The 3′-OH group of a guanine nucleotide attacks and cleaves the 5′ end of the intron.
2 The guanine nucleotide is added to the 5′ end of the intron,…
3 …and a free 3′-OH group is generated at the end of exon 1.
4 3′-OH at the end of exon 1 attacks the 5′ end of exon 2,…
5 …cleaving the intron at its 3′ end, releasing the intron, and splicing the two exons together.
Conclusion: A group I intron is removed through a unique self-splicing reaction.

14.14 Group I introns undergo self-splicing. (a) Secondary structure of a group I intron. (b) Self-splicing of a group I intron.

[Figure 14.15. (a) Folded group II intron between exon I and exon II. (b) Reaction steps:]
1 An adenine nucleotide within the intron attacks a guanine nucleotide at the 5′ end of the intron,…
2 …creating a 3′-OH group at the end of exon 1 and a lariat structure within the intron.
3 The intron is removed as a lariat,…
4 …and the two exons are spliced together.
Conclusion: A group II intron is removed through a self-splicing reaction similar to that of nuclear introns.

14.15 Group II introns undergo self-splicing by a different mechanism from that for group I introns. (a) Secondary structure of a group II intron. (b) Self-splicing of group II introns, which is similar to the splicing of nuclear introns.

Alternative Processing Pathways
Another finding that complicates the view of a gene as a sequence of nucleotides that specifies the amino acid sequence of a protein is the existence of alternative processing pathways, in which a single pre-mRNA is processed in different ways to produce alternative types of mRNA, resulting in the production of different proteins from the same DNA sequence.

One type of alternative processing is alternative splicing, in which the same pre-mRNA can be spliced in more than one way to yield multiple mRNAs that are translated into different amino acid sequences and thus different proteins (FIGURE 14.16a). Another type of alternative processing requires the use of multiple 3′ cleavage sites (FIGURE 14.16b); two or more potential sites for cleavage and polyadenylation are present in the pre-mRNA. In the example in Figure 14.16b, cleavage at the first site produces a relatively short mRNA, compared with the mRNA produced through cleavage at the second site. The use of an alternative cleavage site may or may not produce a different protein, depending on whether the position of the site is before or after the termination codon.

Both alternative splicing and multiple 3′ cleavage sites can exist in the same pre-mRNA transcript. An example is seen in the mammalian gene that encodes calcitonin; this gene contains six exons and five introns (FIGURE 14.17a). The entire gene is transcribed into pre-mRNA (FIGURE 14.17b). There are two possible 3′ cleavage sites. In cells of the thyroid gland, 3′ cleavage and polyadenylation take place after the fourth exon, and the first three introns are then removed to produce a mature mRNA consisting of exons 1, 2, 3, and 4 (FIGURE 14.17c). This mRNA is translated into the hormone calcitonin. In brain cells, the identical pre-mRNA is transcribed from DNA, but it is processed differently. Cleavage and polyadenylation take place after the sixth exon, yielding an initial transcript that includes all six exons.

[Figure 14.16. (a) Alternative splicing: DNA with exon 1, intron 1, exon 2, intron 2, and exon 3 is transcribed into pre-mRNA, which undergoes 3′ cleavage and polyadenylation. Either two introns are removed to yield one mRNA (exon 1, exon 2, exon 3, poly(A) tail), or two introns and exon 2 are removed to yield a different mRNA (exon 1, exon 3, poly(A) tail). (b) Multiple 3′ cleavage sites: DNA with exon 1, an intron, and exon 2 is transcribed into pre-mRNA with two 3′ cleavage sites. Cleavage may be at 3′ site 1 or at 3′ site 2; mRNA products of different lengths are produced after splicing.]
Conclusion: Both alternative splicing and multiple 3′ cleavage sites produce different mRNAs from a single pre-mRNA.

14.16 Eukaryotic cells have alternative pathways for processing pre-mRNA. (a) With alternative splicing, pre-mRNA can be spliced in different ways to produce different mRNAs. (b) With multiple 3′ cleavage sites, there are two or more potential sites for cleavage and polyadenylation; use of the different sites produces mRNAs of different lengths.

[Figure 14.17. (a) DNA with exons 1 through 6. (b) Pre-mRNA with exons 1 through 6 and two 3′ cleavage sites. (c) Thyroid-cell mRNA: exons 1, 2, 3, and 4 with a poly(A) tail. (d) Brain-cell mRNA: exons 1, 2, 3, 5, and 6 with a poly(A) tail.]
1 In thyroid cells, cleavage and polyadenylation take place at the end of exon 4,…
2 …producing an mRNA that contains exons 1, 2, 3, and 4.
3 Translation produces the hormone calcitonin.
4 In brain cells, 3′ cleavage takes place at the end of exon 6.
5 During splicing, exon 4 is eliminated with the five introns,…
6 …producing an mRNA that contains exons 1, 2, 3, 5, and 6.
7 Translation yields calcitonin-gene-related peptide (CGRP).

14.17 Pre-mRNA encoded by the gene for calcitonin undergoes alternative processing.
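The two processing pathways of the calcitonin pre-mRNA can be summarized as a small model in which a transcript is just an ordered list of exon numbers. This is a sketch for illustration only; the function names `mature_mrna` and `describe` are hypothetical, and the model ignores the actual nucleotide sequences.

```python
def mature_mrna(cleave_after_exon, skipped_exons=()):
    """Return the exon numbers kept in the mature mRNA.

    cleave_after_exon: last exon retained by 3' cleavage and polyadenylation.
    skipped_exons: exons removed together with the introns during splicing.
    """
    return [n for n in range(1, cleave_after_exon + 1) if n not in skipped_exons]


def describe(exons):
    """Format an exon list in the style of the figure labels."""
    return "5' " + " ".join(f"Exon {n}" for n in exons) + " AAAAA 3'"


# Thyroid cells: cleavage after exon 4, no exon skipped (calcitonin mRNA).
thyroid = mature_mrna(cleave_after_exon=4)
# Brain cells: cleavage after exon 6, exon 4 removed with the introns (CGRP mRNA).
brain = mature_mrna(cleave_after_exon=6, skipped_exons={4})

print(describe(thyroid))
print(describe(brain))
```

Because the exons are filtered from an ordered range, their order in the product is never changed, which mirrors the observation that alternative splicing alters the combination of exons but not usually their order.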

During splicing, exon 4 (part of the calcitonin mRNA) is removed, along with all the introns; so only exons 1, 2, 3, 5, and 6 are present in the mature mRNA (FIGURE 14.17d). When translated, this mRNA produces a protein called calcitonin-gene-related peptide (CGRP), which has an amino acid sequence quite different from that of calcitonin. Alternative splicing may produce different combinations of exons in the mRNA, but the order of the exons is not usually changed. Different processing pathways contribute to gene regulation, as discussed in Chapter 16.

Alternative processing is an important source of protein diversity in vertebrates; an estimated 40% to 60% of all human genes are alternatively spliced. Many human genetic diseases arise from mutations that affect pre-mRNA splicing; indeed, about 15% of single-base substitutions that result in human genetic diseases alter pre-mRNA splicing. Some of these mutations interfere with recognition of the normal 5′ and 3′ splice sites. Others create new splice sites. Mutations within exons can also interfere with the binding of SR proteins to exonic splicing enhancers, causing exons to be omitted from the mature mRNA.

CONCEPTS
Alternative splicing enables exons to be spliced together in different combinations to yield mRNAs that encode different proteins. Alternative 3′ cleavage sites allow pre-mRNA to be cleaved at different sites to produce mRNAs of different lengths.

RNA Editing
A long-standing principle of molecular genetics is that, except for a few RNA viruses, genetic information ultimately resides in the nucleotide sequence of DNA (Chapter 10). This information is transcribed into mRNA, and mRNA is then translated into a protein. The assumption that all information about the amino acid sequence of a protein resides in DNA is violated by a process called RNA editing. In RNA editing, the coding sequence of an mRNA molecule is altered after transcription, and so the protein has an amino acid sequence that differs from that encoded by the gene.

RNA editing was first detected in 1986 when the coding sequences of mRNAs were compared with the coding sequences of the DNAs from which they had been transcribed. Discrepancies were found for some nuclear genes in mammalian cells and for mitochondrial genes in plant cells. In these cases, there had been substitutions in some of the nucleotides of the mRNA. More extensive RNA editing has been found in the mRNA for some mitochondrial genes in trypanosome parasites (which cause African sleeping sickness). In some mRNAs of these organisms, more than 60% of the sequence is determined by RNA editing. Different types of RNA editing have now been observed in mRNAs, tRNAs, and rRNAs from a wide range of organisms; they include the insertion and the deletion of nucleotides and the conversion of one base into another.

If the modified sequence in an edited RNA molecule doesn’t come from a DNA template, then how is it specified? A variety of mechanisms may bring about changes in RNA sequences. In some cases, molecules called guide RNAs (gRNAs) play a crucial role. The gRNAs contain sequences that are partly complementary to segments of the preedited RNA, and the two molecules undergo base pairing in these sequences (FIGURE 14.18). After the mRNA is anchored to the gRNA, the mRNA undergoes cleavage and nucleotides are added, deleted, or altered according to the template provided by gRNA. The ends of the mRNA are then joined together.

In other cases, enzymes bring about base conversion. In humans, for example, a gene is transcribed into mRNA that codes for a lipid-transporting polypeptide called apolipoprotein-B100, which has 4563 amino acids and is synthesized in liver cells. A truncated form of the protein called apolipoprotein-B48, with only 2153 amino acids, is synthesized in intestinal cells. The truncated protein is produced from an edited version of the same mRNA that codes for apolipoprotein-B100. In editing, an enzyme deaminates a cytosine base, converting it into uracil. This conversion changes a codon that specifies the amino acid glutamine into a stop codon that prematurely terminates translation, resulting in the shortened protein.

[Figure 14.18. Preedited mRNA: 5′ AAAAGGGCUUUAACUUCA 3′. Guide RNA: 3′ UUUAAAUAUAUAAUAGAAAAUUGAAGU 5′. Mature mRNA: 5′ AAAUUUAUGUGUUGUCUUUUAACUUCA 3′.]
1 The preedited mRNA pairs with guide RNA.
2 The guide RNA serves as a template for the addition, deletion, or alteration of bases.
3 The mature mRNA is then released.
Conclusion: Guide RNA adds nucleotides to the pre-mRNA that were not encoded by the DNA.

14.18 RNA editing is carried out by guide RNAs. The guide RNA has sequences that are partly complementary to those of the preedited mRNA and pairs with it. After pairing, the mRNA undergoes cleavage and new nucleotides are added, with sequences in the gRNA serving as a template. The ends of the mRNA are then joined together.

CONCEPTS
Individual nucleotides in the interior of pre-mRNA may be changed, added, or deleted by RNA editing. The amino acid sequence produced by the edited mRNA is not the same as that encoded by DNA.

CONNECTING CONCEPTS
Eukaryotic Gene Structure and Pre-mRNA Processing
Chapters 13 and 14 have introduced a number of different components of genes and RNA molecules, including promoters, 5′ untranslated regions, coding sequences, introns, 3′ untranslated regions, poly(A) tails, and caps. Let’s see how some of these components are combined to create a typical eukaryotic gene and how a mature mRNA is produced from them.

[Figure 14.19. (a) DNA with the promoter, the RNA-coding region (exon 1, intron, exon 2, intron, exon 3), the transcription start site, and the end of transcription; the enhancer is typically upstream, but could be downstream or in an intron. (b) through (e) Pre-mRNA with the 5′ untranslated region, the 3′ untranslated region, the AAUAAA consensus sequence, and the 3′ cleavage site. (f) Mature mRNA with the 5′ untranslated region, protein-coding region, 3′ untranslated region, and poly(A) tail; the introns have been removed.]
1 Introns, exons, and a long 3′ end are all transcribed into pre-mRNA.
2 A 5′ cap is added.
3 Cleavage at the 3′ end is approximately 10 nucleotides downstream of the consensus sequence.
4 Polyadenylation at the cleavage site produces the poly(A) tail.
5 Finally, the introns are removed,…
6 …producing the mature mRNA.

14.19 Mature eukaryotic mRNA is produced when pre-mRNA is transcribed and undergoes several types of processing.

The promoter, which typically encompasses about 100 nucleotides upstream of the transcription start site, is necessary for transcription to take place but is itself not usually transcribed when protein-encoding genes are transcribed by RNA polymerase II (FIGURE 14.19a). Farther upstream or downstream of the start site, there may be enhancers that also regulate transcription.

In transcription, all the nucleotides between the transcription start site and the stop site are transcribed into pre-mRNA, including exons, introns, and a long 3′ end that is later cleaved from the transcript (FIGURE 14.19b). Notice that the 5′ end of the first exon contains the sequence that codes for the 5′ untranslated region and that the 3′ end of the last exon contains the sequence that codes for the 3′ untranslated region.

The pre-mRNA is then processed to yield a mature mRNA. The first step in this processing is the addition of a cap to the 5′ end of the pre-mRNA (FIGURE 14.19c). Next, the 3′ end is cleaved at a site downstream of the AAUAAA consensus sequence in the last exon (FIGURE 14.19d). Immediately after cleavage, a poly(A) tail is added to the 3′ end (FIGURE 14.19e). Finally, the introns are removed to yield the mature mRNA (FIGURE 14.19f). The mRNA now contains 5′ and 3′ untranslated regions, which are not translated into amino acids, and the nucleotides that carry the protein-coding sequences. The nucleotide sequence of a small gene (the human interleukin 2 gene), with these components identified, is presented in FIGURE 14.20.
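The order of processing steps just described (capping, 3′ cleavage downstream of AAUAAA, polyadenylation, and intron removal) can be sketched as simple string operations. Everything specific in this sketch is an assumption made for illustration: the function name `process_pre_mrna`, the toy sequences, the `cap-` label standing in for the modified nucleotide, the fixed cleavage offset of 10 nucleotides taken from the figure, and the 20-nucleotide tail.

```python
POLY_A_SIGNAL = "AAUAAA"


def process_pre_mrna(pre_mrna, introns, cleavage_offset=10, tail_length=20):
    """Return a toy mature mRNA from a pre-mRNA string.

    introns: list of (start, end) index pairs, end exclusive, in pre-mRNA
    coordinates; all introns are assumed to lie upstream of the poly(A) signal.
    """
    # Look for the consensus sequence only in the last exon.
    last_intron_end = max((end for _, end in introns), default=0)
    signal_start = pre_mrna.find(POLY_A_SIGNAL, last_intron_end)
    if signal_start == -1:
        raise ValueError("no AAUAAA consensus sequence in the last exon")

    # 3' cleavage: discard everything past the cleavage site.
    cleavage_site = signal_start + len(POLY_A_SIGNAL) + cleavage_offset
    cleaved = pre_mrna[:cleavage_site]

    # Splicing: keep the stretches between introns and join them in order.
    exons, position = [], 0
    for start, end in sorted(introns):
        exons.append(cleaved[position:start])
        position = end
    exons.append(cleaved[position:])

    # Cap at the 5' end, poly(A) tail at the 3' end.
    return "cap-" + "".join(exons) + "A" * tail_length


# Hypothetical sequences, far shorter than any real gene.
exon1 = "GGAUGGCC"
intron = "GUAAGUUUUCAG"
exon2 = "GCUUAACC" + "AAUAAA" + "C" * 10 + "UUUUUUUU"
pre_mrna = exon1 + intron + exon2
mrna = process_pre_mrna(pre_mrna, introns=[(len(exon1), len(exon1) + len(intron))])
print(mrna)
```

The sketch applies cleavage before splicing so that the intron coordinates, which refer to the unprocessed transcript, remain valid; in the cell these steps are carried out by protein and RNA complexes rather than by position lookups.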

TATA box

5’ ….CATCAGAAGAGGAAAAATGAAGGTAATGTTTTTTCAGACAGGTAAAGTCTTTGAAAATATGTGTAATATGTAAAACATTTTGACACCCCCATAATATTTTTCCAGAATTAACAGTATAAATTGCATCTCTTG

TTCAAGAGTTCCCTATCACTCTCTTTAATCACTACTCACAGTAACCTCAACTCCTGCCACAATGTACAGGATGCAACTCCTGTCTTGCATTGCACTAAGTCTTGCACTTGTCACAAACAGTGCACCTACTTCAA
Transcription start site Start codon
Exon 1 Intron 1
GTTCTACAAAGAAAACACAGCTACAACTGGAGCATTTACTTCTGGATTTACAGATGATTTTGAATGGAATTAATGTAAGTATATTTCCTTTCTTACTAAAATTATTACATTTAGTAATCTAGCTGGAGATCATTTCT
Exon 2
TAATAACAATGCATTATACTTTCTTAGAATTACAAGAATCCCAAACTCACCAGGATGCTCACATTTAAGTTTTACATGCCCAAGAAGGTAAGTACAATATTTTATGTTCAATTTCTGTTTTAATAAAATTCAAAGTA

ATATGAAAATTTGCACAGATGGGACTAATAGCAGCTCATCTGAGGTAAAGAGTAACTTTAATTTGTTTTTTTGAAAACCCAAGTTTGATAATGAAGCCTCTATTAAAACAGTTTTACCTATATTTTTAATATATATTT
Intron 2
GTGTGTTGGTGGGGGTGGGAAGAA- - - (+2400bp)- - - -TGCAGAAAGTCTAACATTTTGCAAAGCCAAATTAAGCTAAAACCAGTGAGTCAACTATCACTTAACGCTAGTCATAGGTACTTGAGCCCTAGTTTT

TCCAGTTTTATAATGTAAACTCTACTGGTCCATCTTTACAGTGACATTGAGAACAGAGAGAATGGTAAAAACTACATACTGCTACTCCAAATAAAATAAATTGGAAATTAATTTCTGATTCTGACCTCTATGTAAA
Exon 3
CTGAGCTGATGATAATTATTATTCTAGGCCACAGAACTGAAACATCTTCAGTGTCTAGAAGAAGAACTCAAACCTCTGGAGGAAGTGCTAAATTTAGCTCAAAGCAAAAACTTTCACTTAAGACCCAGGGACT
Intron 3
TAATCAGCAATATCAACGTAATAGTTCTGGAACTAAAGGTAAGGCATTACTTTATTTGCTCTCCTGGAAATAAAAAAAAAAAAGTAGGGGGAAAAGT----(+1900 BP)-----CTTGAAAATAAAGGCAACAGGCCTA
Exon 4
TAAGACTTCAATTGGGAATAACTGTATATAAGGTAAACTACTCTGTACTTTAAAAAATTAACATTTTTCTTTTATAGGGATCTGAAACAACATTCATGTGTGAATATGCTGATGAGACAGCAACCATTGTAGAATTT

CTGAACAGATGGATTACCTTTTGTCAAAGCATCATCTCAACACTGACTTGATAATTAAGTGCTTCCCACTTAAAACATATCAGGCCTTCTATTTATTTAAATATTTAAATTTTATATTTATTGTTGAATGTATGGTTT
Stop codon
GCTACCTATTGTAACTATTATTCTTAATCTTAAAACTATAAATATGGATCTTTTATGATTCTTTTTGTAAGCCCTAGGGGCTCTAAAATGGTTTCACTTATTTATCCCAAAATATTTATTATTATGTTGAATGTTAAATA

TAGTATCTATGTAGATTGGTTAGTAAAACTATTTAATAAATTTGATAAATATAAACAAGCCTGGATATTTGTTATTTTGGAAACAGCACAGAGTAAGCATTTAAATATTTCTTAGTTACTTGTGTGAACTGTAGGATG
Poly(A) consensus sequence 3’ cleavage site
GTTAAAATGCTTACAAAAGTCACTCTTTCTCTGAAGAAATATGTAGAACAGAGATGTAGACTTCTCAAAAGCCCTTGCTTT 3’

You can see that non-coding introns occupy large parts of genes, even when large numbers of bases are not individually listed. Exons code for fewer than 165 amino acids, a small protein.

14.20 This representation of the nucleotide sequence of the human interleukin 2 gene includes the TATA box, transcription start site, start and stop codons, introns, exons, poly(A) consensus sequence, and the 3′ cleavage site.

Transfer RNA
In 1956, Francis Crick proposed the idea of a molecule that transports amino acids to the ribosome and interacts with codons in mRNA, placing amino acids in their proper order in protein synthesis. By 1963, the existence of such an adapter molecule, called transfer RNA, had been confirmed. Transfer RNA serves as a link between the genetic code in mRNA and the amino acids that make up a protein. Each tRNA attaches to a particular amino acid and carries it to the ribosome, where the tRNA adds its amino acid to the growing polypeptide chain at the position specified by the genetic instructions in the mRNA. We’ll take a closer look at the mechanism of this process in Chapter 15.

Each tRNA is capable of attaching to only one type of amino acid. The complex of tRNA plus its amino acid can be written in abbreviated form by adding a three-letter superscript representing the amino acid to the term tRNA. For example, a tRNA that attaches to the amino acid alanine is written as tRNAᴬˡᵃ. Because 20 different amino acids are found in proteins, there must be a minimum of 20 different types of tRNA. In fact, most organisms possess at least 30 to 40 different types of tRNA, each encoded by a different gene (or, in some cases, multiple copies of a gene) in DNA.

The Structure of Transfer RNA
A unique feature of tRNA is the occurrence of rare modified bases. All RNAs have the four standard bases (adenine, cytosine, guanine, and uracil) specified by DNA, but tRNAs have additional bases, including ribothymine, pseudouridine (which is also occasionally present in snRNAs and rRNA), and dozens of others. The structures of two of these modified bases are shown in FIGURE 14.21.

[Figure 14.21. Chemical structures of uracil and of two modified bases derived from it: ribothymidine (labeled "addition of methyl group") and pseudouridine (labeled "addition of amino group").]

14.21 Two of the modified bases found in tRNAs. All the modified bases in tRNAs are produced by the chemical alteration of the four standard RNA bases.

[Figure 14.22: three views of a tRNA. A computer-generated space-filling molecular model shows the three-dimensional structure of a tRNA. A ribbon model emphasizes the internal regions of base pairing. A flattened cloverleaf model shows pairing between complementary nucleotides. Labeled features of the cloverleaf: the amino acid attachment site (always CCA) at the 3′ end, the acceptor arm, the TψC arm, the extra arm (size varies), the anticodon arm, the DHU arm, rare bases, and hydrogen bonds between paired bases. The anticodon comprises three bases and interacts with a codon in mRNA. The tRNA icon shown here will be used in subsequent chapters.]

14.22 All tRNAs possess a common secondary structure, the cloverleaf structure. The base sequence in the flattened model is for tRNA^Ala.
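
The interaction between an anticodon and a codon noted in Figure 14.22 can be illustrated with a short Python sketch. It assumes strict Watson-Crick pairing of the four standard RNA bases and antiparallel alignment of the two strands; modified bases and any relaxed pairing rules are deliberately ignored, and the function name `anticodon_for` is a hypothetical one chosen for this illustration.

```python
# Watson-Crick partners for the four standard RNA bases.
# Assumption: modified bases and non-standard pairing are ignored.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with a codon (written 5'->3')."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in RNA_COMPLEMENT for base in codon):
        raise ValueError(f"expected three RNA bases (A, C, G, U), got {codon!r}")
    # The two strands pair in antiparallel orientation, so the codon is
    # read backwards while each base is replaced by its partner.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


# GCC is one of the codons for alanine.
print(anticodon_for("GCC"))  # GGC
```

Real tRNAs often carry modified bases in the anticodon, so this sketch shows only the base-pairing logic, not the anticodon of any particular tRNA.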

If there are only four bases in DNA and if all RNA molecules are transcribed from DNA, how do tRNAs acquire these additional bases? Modified bases arise from chemical changes made to the four standard bases after transcription. These changes are carried out by special tRNA-modifying enzymes. For example, the addition of a methyl group to uracil creates the modified base ribothymine.

The structures of all tRNAs are similar, a feature critical to tRNA function. Most tRNAs contain between 74 and 95 nucleotides, some of which are complementary to each other and form intramolecular hydrogen bonds. As a result, each tRNA has a cloverleaf structure (Figure 14.22). The cloverleaf has four major arms. If we start at the top and proceed clockwise around the tRNA shown at the right in Figure 14.22, the four major arms are the acceptor arm, the TψC arm, the anticodon arm, and the DHU arm. Three of the arms (the TψC, anticodon, and DHU arms) consist of a stem and a loop. The stem is formed by the pairing of complementary nucleotides, and the loop lies at the terminus of the stem, where there is no nucleotide pairing. Instead of having a loop, the acceptor arm includes the 5′ and 3′ ends of the tRNA molecule. All tRNAs have the same sequence (CCA) at the 3′ end, where the amino acid attaches to the tRNA; so clearly this sequence is not responsible for specifying which amino acid will attach to the tRNA. The TψC arm is named for the bases of three nucleotides in the loop of this arm: thymine (T), pseudouracil (ψ), and cytosine (C). The anticodon arm lies at the bottom of the tRNA. Three nucleotides at the end of this arm make up the anticodon, which pairs with the corresponding codon on mRNA to ensure that the amino acids link in the correct order. The DHU arm is so named because it often contains the modified base dihydrouridine.

Although each tRNA molecule folds into a cloverleaf owing to the complementary pairing of bases, the cloverleaf is not the three-dimensional (tertiary) structure of tRNAs found in the cell. The results of X-ray crystallographic studies have shown that the cloverleaf folds upon itself to form an L-shaped structure, as illustrated by the space-filling and ribbon models in Figure 14.22. Notice that the acceptor stem is at one end of the tertiary structure and the anticodon is at the other end.

Transfer RNA Gene Structure and Processing

The genes that produce tRNAs may be scattered about the genome or may be in clusters. In E. coli, the genes for some tRNAs are present in a single copy, whereas the genes for other tRNAs are present in several copies; eukaryotic cells usually have many copies of each tRNA gene. All tRNA molecules in both bacterial and eukaryotic cells undergo processing after transcription.

In E. coli, several tRNAs are usually transcribed together as one large precursor tRNA, which is then cut up into pieces, each containing a single tRNA. Additional nucleotides may then be removed one at a time from the 5′ and 3′ ends of the tRNA in a process known as trimming. Base-modifying enzymes may then change some of the standard bases into modified bases, and additional bases (such as CCA at the 3′ end) may be added (Figure 14.23). Different tRNAs are processed in different ways; so a generic processing pathway for all tRNAs is not possible. Eukaryotic tRNAs are processed

[Figure 14.23: processing of a precursor tRNA into a mature tRNA. (1) A large precursor tRNA (2) is cleaved to produce an individual tRNA molecule. (3) An intron is removed by splicing, (4) and bases are added to the 3′ end. (5) Modification of several bases produces the mature tRNA. Labeled features: the 5′ and 3′ splice sites flanking the intron, the bases that will form the anticodon, and the CCA added at the 3′ end. Conclusion: tRNA processing may include cleavage, splicing, base addition, and base modification.]

14.23 Transfer RNAs are processed in both bacterial and eukaryotic cells. Different tRNAs are modified in different ways. One example is shown here.

in a manner similar to that for bacterial tRNAs: most are transcribed as larger precursors that are then cleaved, trimmed, and modified to produce mature tRNAs.

Some eukaryotic tRNA genes possess introns of variable length that must be removed in processing. For example, about 40 of the 400 tRNA genes in yeast contain a single intron that is always found adjacent to the 3′ side of the anticodon. The tRNA introns are shorter than those found in pre-mRNA and do not have the consensus sequences found at the intron–exon junctions of pre-mRNAs. The splicing process for tRNA genes (see Figure 14.23) is quite different from the spliceosome-mediated reactions that remove introns from protein-encoding genes. The intron in the precursor tRNA is cut at both ends by an endonuclease enzyme, which releases the linear intron from the rest of the tRNA. The two pieces of tRNA, which are held together by intramolecular bonding, are then folded and ligated to produce the mature tRNA.

CONCEPTS

All tRNAs are similar in size and have a common secondary structure known as the cloverleaf. Transfer RNAs contain modified bases and are extensively processed after transcription in both bacterial and eukaryotic cells.

Ribosomal RNA

Within ribosomes, the genetic instructions contained in mRNA are translated into the amino acid sequences of polypeptides. Thus, ribosomes play an integral part in the transfer of genetic information from genotype to phenotype. We will examine the role of ribosomes in the process of translation in Chapter 15. Here, we will consider ribosome structure and examine how ribosomes are processed before becoming functional.

Table 14.3 Composition of ribosomes in bacterial and eukaryotic cells

Cell type    Ribosome size   Subunit       rRNA component            Proteins
Bacterial    70S             Large (50S)   23S (2900 nucleotides)    31
                                           5S (120 nucleotides)
                             Small (30S)   16S (1500 nucleotides)    21
Eukaryotic   80S             Large (60S)   28S (4700 nucleotides)    49
                                           5.8S (160 nucleotides)
                                           5S (120 nucleotides)
                             Small (40S)   18S (1900 nucleotides)    33

Note: The letter S stands for Svedberg unit.
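
The entries in Table 14.3 lend themselves to a quick arithmetic check. The Python sketch below transcribes the table into a dictionary and totals the proteins and rRNA nucleotides of each ribosome; it also compares the sum of the subunit S values with the S value of the whole ribosome. The dictionary layout and the function name `summarize` are hypothetical choices for this illustration, and the figures are the rounded values from the table.

```python
# Values transcribed from Table 14.3 (S = Svedberg units; nt = nucleotides).
RIBOSOMES = {
    "bacterial": {
        "ribosome_S": 70,
        "subunits": {
            "large": {"S": 50, "rRNA_nt": [2900, 120], "proteins": 31},
            "small": {"S": 30, "rRNA_nt": [1500], "proteins": 21},
        },
    },
    "eukaryotic": {
        "ribosome_S": 80,
        "subunits": {
            "large": {"S": 60, "rRNA_nt": [4700, 160, 120], "proteins": 49},
            "small": {"S": 40, "rRNA_nt": [1900], "proteins": 33},
        },
    },
}


def summarize(cell_type: str) -> dict:
    """Total the table entries for one cell type."""
    entry = RIBOSOMES[cell_type]
    subunits = list(entry["subunits"].values())
    return {
        "ribosome_S": entry["ribosome_S"],
        "sum_of_subunit_S": sum(s["S"] for s in subunits),
        "total_proteins": sum(s["proteins"] for s in subunits),
        "total_rRNA_nt": sum(sum(s["rRNA_nt"]) for s in subunits),
    }


for cell_type in RIBOSOMES:
    print(cell_type, summarize(cell_type))
```

For the bacterial ribosome the subunit values sum to 80, not 70, and for the eukaryotic ribosome they sum to 100, not 80. Protein counts and nucleotide counts, in contrast, simply add.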



The Structure of the Ribosome

The ribosome is one of the most abundant organelles in the cell: a single bacterial cell may contain as many as 20,000 ribosomes, and eukaryotic cells possess even more. Ribosomes typically contain about 80% of the total cellular RNA. They are complex organelles, each consisting of more than 50 different proteins and RNA molecules (Table 14.3). A functional ribosome consists of two subunits, a large ribosomal subunit and a small ribosomal subunit, each of which consists of one or more pieces of RNA and a number of proteins. The sizes of the ribosomes and their RNA components are given in Svedberg (S) units (a measure of how rapidly an object sediments in a centrifugal field). It is important to note that Svedberg units are not additive; in other words, combining a 10S structure and a 20S structure does not necessarily produce a 30S structure, because the sedimentation rate is affected by the three-dimensional structure as well as the mass. The three-dimensional structure of the bacterial ribosome has been elucidated in great detail through the use of X-ray crystallography. More will be said about the ribosome's structure in Chapter 15.

Ribosomal RNA Gene Structure and Processing

The genes for rRNA, like those for tRNA, can be present in multiple copies, and the numbers vary among species (Table 14.4); all copies of the rRNA gene in a species are identical or nearly identical. In bacteria, rRNA genes are dispersed, but, in eukaryotic cells, they are clustered, with the genes arrayed in tandem, one after another.

Table 14.4 Number of rRNA genes in different organisms

Species             Copies of rRNA genes per genome
Escherichia coli    7
Yeast               100–200
Human               280
Frog                450

Eukaryotic cells possess two types of rRNA genes: a large gene that encodes 18S rRNA, 28S rRNA, and 5.8S rRNA, and a small gene that encodes the 5S rRNA. All three bacterial rRNAs (23S rRNA, 16S rRNA, and 5S rRNA) are encoded by a single type of gene.

Ribosomal RNA is processed in both bacterial and eukaryotic cells. In E. coli, the immediate product of transcription is a 30S rRNA precursor (Figure 14.24a). Methyl groups (CH3) are added to specific bases and to the 2′-carbon atom of some of the ribose sugars of this 30S precursor, which is then cleaved into several pieces and trimmed to produce 16S rRNA, 23S rRNA, and 5S rRNA, along with one or more tRNAs.

Eukaryotic rRNAs undergo similar processing (Figure 14.24b). Small nucleolar RNAs (snoRNAs) help to cleave and modify eukaryotic rRNAs (as well as some archaeal rRNAs) and help to assemble the processed rRNAs into

[Figure 14.24: processing of rRNA precursors. (a) Prokaryotic rRNAs: a 30S precursor rRNA transcript yields 16S rRNA, a tRNA, 23S rRNA, and 5S rRNA. (b) Eukaryotic rRNAs: a 45S precursor rRNA transcript yields 18S rRNA, 5.8S rRNA, and 28S rRNA. Steps shown: (1) Methyl groups are added to specific bases and to the 2′-carbon atom of some ribose sugars. (2) The RNA is cleaved into several intermediates (3) and trimmed. (4) Mature rRNA molecules are the result.]

14.24 Ribosomal RNA is processed after transcription. Note that eukaryotic 5S rRNA is transcribed separately from the small eukaryotic rRNA gene.
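
The relations summarized in Figure 14.24 can also be written down as a small lookup table. The Python sketch below records which mature RNAs come from each precursor and which rRNA is transcribed separately; the data layout and the function name `mature_rrnas` are hypothetical and serve only to restate the figure.

```python
# Products of each precursor transcript, as listed in Figure 14.24.
RRNA_PROCESSING = {
    "bacterial": {
        "precursor": "30S",
        "products": ["16S rRNA", "tRNA", "23S rRNA", "5S rRNA"],
    },
    "eukaryotic": {
        "precursor": "45S",
        "products": ["18S rRNA", "5.8S rRNA", "28S rRNA"],
    },
}

# Eukaryotic 5S rRNA is transcribed from its own, separate gene.
SEPARATELY_TRANSCRIBED = {"eukaryotic": ["5S rRNA"]}


def mature_rrnas(cell_type: str) -> set:
    """Return every mature rRNA for a cell type, whatever its gene of origin."""
    products = RRNA_PROCESSING[cell_type]["products"]
    # Keep only the rRNAs; the bacterial precursor also releases tRNA.
    from_precursor = {p for p in products if p.endswith("rRNA")}
    return from_precursor | set(SEPARATELY_TRANSCRIBED.get(cell_type, []))


for cell_type in RRNA_PROCESSING:
    print(cell_type, sorted(mature_rrnas(cell_type)))
```

The result for each cell type matches the rRNA components listed for its ribosome in Table 14.3.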
