
SYBT

MOLECULAR BIOLOGY

Course objectives:-
The objective of this course is to provide insight into the mechanisms of
Gene Expression and Regulation.

Learning outcomes:-
By the end of the course the student will be able to:
• Discuss the mechanisms associated with Gene Expression at the
level of Transcription and Translation.
• Discuss the mechanisms associated with Regulation of Gene
Expression in Prokaryotes and Eukaryotes
UNIT I
Gene Expression-Transcription

Gene Expression- an Overview.


Transcription Process in Prokaryotes :
RNA Synthesis; Promoters and Enhancers; Initiation of
Transcription at Promoters; Elongation and Termination of an RNA
Chain.

Transcription in Eukaryotes :
Eukaryotic RNA Polymerases; Eukaryotic Promoters; Transcription
of Protein Coding Genes by RNA Polymerase; Eukaryotic mRNAs;
Transcription of other genes; Spliceosomes; RNA editing.
UNIT II
Gene Expression-Translation

Nature of Genetic Code. Wobble Hypothesis.

Translation :
Process of Protein Synthesis (Initiation,
Elongation, Translocation, Termination).

Post Translation Modifications. Protein sorting.


UNIT III
Regulation of Gene Expression.

In Prokaryotes:
In Bacteria : lac Operon & trp Operon of E. coli.
In Viruses : Lytic / Lysogenic Regulation.

In Eukaryotes :
Operons in Eukaryotes; Control of Transcriptional
Initiation; Gene Silencing and Genomic Imprinting; Post-
Transcriptional Control; RNA Interference.
References

 iGenetics, Peter Russell, Pearson Education

 Microbial Genetics, Freifelder, Narosa Publishing House
Gene Expression-Transcription

 Key Questions
• What is the central dogma?
• What are the four main types of RNA molecules
in cells?
• How is an RNA chain synthesized?
• How is transcription initiated, elongated, and terminated
in bacteria?
• How does transcription occur in eukaryotes?
• How is functional mRNA produced from the initial
transcript of a protein-coding gene in eukaryotes?
1956: Crick proposed the CENTRAL DOGMA
GENE EXPRESSION

Four main types of RNA


 mRNA (messenger RNA)
 rRNA (ribosomal RNA)
 tRNA (transfer RNA)
 snRNA (small nuclear RNA)

1. mRNA (messenger RNA) encodes the amino acid sequence of a polypeptide.
mRNAs are the transcripts of protein-coding genes. Translation of an mRNA
produces a polypeptide.

2. rRNA (ribosomal RNA), with ribosomal proteins, makes up the ribosomes,
the structures on which mRNA is translated.

3. tRNA (transfer RNA) brings amino acids to ribosomes during translation.

4. snRNA (small nuclear RNA), with proteins, forms complexes that are used
in eukaryotic RNA processing to produce functional mRNAs.
Transcription Process

Transcription is the process of transferring the genetic information in DNA
into RNA base sequences.

The DNA unwinds in a short region next to a gene.

RNA polymerase catalyzes the synthesis of an RNA molecule in the 5′-to-3′
direction along the 3′-to-5′ template strand of the DNA.

Only one strand of the double-stranded DNA is transcribed into an RNA
molecule.
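The strand orientation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are hypothetical, and the sketch assumes the template strand is written 3′-to-5′ so that reading it left to right yields the RNA in the 5′-to-3′ direction.

```python
# Base-pairing rules: DNA template base -> RNA base added by RNA polymerase.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base-pairing rules between the two DNA strands.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def template_from_coding(coding_5_to_3: str) -> str:
    """Return the template strand (written 3'->5') paired with a coding strand written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in coding_5_to_3.upper())


# Hypothetical 9-nucleotide template, written 3'->5'.
rna = transcribe("TACGGCATT")
print(rna)  # AUGCCGUAA
```

Note that the RNA has the same sequence as the non-template (coding) DNA strand, with U in place of T.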
Initiation of Transcription in
E. coli

A bacterial gene may be divided into three sequences with respect to its
transcription:

1. A promoter, a sequence upstream of the start of the RNA-coding sequence.
The RNA polymerase interacts with the promoter.

2. The RNA-coding sequence itself, that is, the DNA sequence transcribed by
RNA polymerase into the RNA transcript.

3. A terminator, specifying where transcription stops.


Gene Promoters in Transcription in
E. coli

 A promoter is a sequence upstream of the start of the RNA-coding sequence.
The RNA polymerase interacts with the promoter.

 The promoter sequence serves to orient the RNA polymerase.

 A gene with its promoter is an independent unit.

 Two DNA sequences in most promoters of E. coli genes are critical for the
initiation of transcription.

 These sequences are generally found at the -35 and -10 regions, that is,
centered about 35 and 10 base pairs upstream of the transcription start site.
The consensus sequence for the -35 region (the -35 box) is 5′-TTGACA-3′, and
for the -10 region (the -10 box, formerly called the Pribnow box) it is
5′-TATAAT-3′.
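As a rough illustration of how the two consensus boxes could be located in a sequence, the sketch below scans a DNA string for a -35-like hexamer followed by a -10-like hexamer. The spacer lengths (16 to 18 bp) and the mismatch tolerance are assumptions made for this example, not values from the slides, and the function names and test sequence are hypothetical.

```python
MINUS_35_CONSENSUS = "TTGACA"
MINUS_10_CONSENSUS = "TATAAT"


def count_mismatches(window: str, consensus: str) -> int:
    """Count positions where the window differs from the consensus."""
    return sum(a != b for a, b in zip(window, consensus))


def find_sigma70_like_promoters(dna, spacers=(16, 17, 18), max_mismatches=2):
    """Return (minus35_start, minus10_start, total_mismatches) for each candidate.

    spacers and max_mismatches are illustrative assumptions; max_mismatches
    is applied to each box separately.
    """
    dna = dna.upper()
    box_len = len(MINUS_35_CONSENSUS)
    hits = []
    for i in range(len(dna) - box_len + 1):
        mm35 = count_mismatches(dna[i:i + box_len], MINUS_35_CONSENSUS)
        if mm35 > max_mismatches:
            continue
        for spacer in spacers:
            j = i + box_len + spacer
            window = dna[j:j + box_len]
            if len(window) < box_len:
                continue  # ran off the end of the sequence
            mm10 = count_mismatches(window, MINUS_10_CONSENSUS)
            if mm10 <= max_mismatches:
                hits.append((i, j, mm35 + mm10))
    return hits


# Hypothetical sequence: perfect -35 box, 17-bp spacer, perfect -10 box.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GGGGGGG"
print(find_sigma70_like_promoters(example))  # [(2, 25, 0)]
```

Because real promoters deviate from the consensus, a mismatch count like this gives only a crude proxy for how well a promoter might be recognized.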
Initiation of Transcription in
E. coli
RNA polymerase of E. coli

 Only one type of RNA polymerase is found in bacteria.

 All classes of genes (protein-coding genes, tRNA genes, and rRNA genes) are
transcribed by it.

 Initiation of transcription of a gene requires a form of RNA polymerase
called the holoenzyme (or complete enzyme).

 The holoenzyme consists of the core enzyme form of RNA polymerase, which
consists of two α, one β, and one β′ polypeptide, bound to another
polypeptide called a sigma factor (σ), which ensures that the RNA polymerase
binds in a stable way only at promoters (promoter-specific binding).
Initiation of Transcription in
E. coli

 The RNA polymerase holoenzyme binds to the promoter.

 The holoenzyme contacts the -35 sequence and then binds to the full
promoter while the DNA is still in standard double helix form, a state called
the closed promoter complex.

 Then the holoenzyme untwists the DNA in the -10 region. The untwisted
form of the promoter is called the open promoter complex.

 The sigma factor of the holoenzyme plays a key role in these steps by
contacting the promoter directly at the -35 and -10 sequences.

 Once the RNA polymerase is bound at the -10 box, it is oriented properly to
begin transcription at the correct nucleotide of the gene.
Initiation of Transcription in
E. coli

 Promoters differ in their sequences, so the binding efficiency of RNA
polymerase varies.

 Promoters of most genes in E. coli have the -35 and -10 recognition
sequences, which are recognized by a sigma factor with a molecular weight of
70,000 Da, called σ70.

 There are other sigma factors in E. coli, which regulate gene expression by
binding to the core RNA polymerase and permitting the holoenzyme to
recognize different promoters.

 For example, under conditions of high heat (heat shock) and other forms of
stress, σ32 (molecular weight 32,000 Da) increases in amount, directing
some RNA polymerase molecules to bind to the promoters of genes that
encode proteins needed to cope with the stress.
Prokaryotic Sigma Factors
Elongation of Transcription in
E. coli

 RNA synthesis takes place in a region of DNA that has separated into single
strands to form a transcription bubble.
 Once initiation succeeds and the elongation stage is established, the RNA
polymerase begins to move along the DNA and the sigma factor is released.
 The core enzyme alone is able to complete the
transcription of the gene.
 In E. coli growing at 37°C, transcription occurs at about 40
nucleotides/sec.
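The elongation rate quoted above allows a quick back-of-the-envelope estimate of how long a transcript takes to make. The sketch below assumes a constant rate with no pausing, and the 1,200-nucleotide gene length is a hypothetical example.

```python
ELONGATION_RATE_NT_PER_SEC = 40  # approximate rate for E. coli at 37°C, from the text


def transcription_time_seconds(transcript_length_nt, rate=ELONGATION_RATE_NT_PER_SEC):
    """Estimate elongation time, assuming a constant rate and no pausing."""
    return transcript_length_nt / rate


# Hypothetical 1,200-nucleotide transcript.
print(transcription_time_seconds(1200))  # 30.0
```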
Elongation of Transcription in
E. coli

 During the elongation stage, the core RNA polymerase moves along,
untwisting the DNA double helix ahead of itself to expose a new segment of
single-stranded template DNA. Behind the untwisted region, the two DNA
strands reform into double-stranded DNA.

 Within the untwisted region, about 9 RNA nucleotides are base paired to
the DNA in a temporary RNA–DNA hybrid; the rest of the newly synthesized
RNA exits the enzyme as a single strand.
Elongation of Transcription in
E. coli
RNA polymerase has two proofreading activities.

 In one, the enzyme removes an incorrectly inserted nucleotide by backing
up one step and then inserts the correct nucleotide in a forward step.

 In the other, the enzyme moves back one or more nucleotides and then
cleaves the RNA at that position before resuming RNA synthesis in the
forward direction.
Termination of Transcription in
E. coli

 The termination of bacterial gene transcription is signaled by terminator
sequences.
 In E. coli, the protein Rho (ρ) plays a role in the termination of
transcription of some genes. The terminators of such genes are called
Rho-dependent terminators (also, type II terminators).

 For other genes, the core RNA polymerase terminates transcription;
terminators for those genes are called Rho-independent terminators (also,
type I terminators).
Rho independent terminators
(type I terminators).

 Rho-independent terminators consist of an inverted repeat sequence that is
about 16 to 20 base pairs upstream of the transcription termination point,
followed by a string of about 4 to 8 A–T base pairs.

 The RNA polymerase transcribes the terminator sequence, which is part of
the initial RNA-coding sequence of the gene.

 Because of the inverted repeat arrangement, the RNA folds into a hairpin
loop structure, which causes the RNA polymerase to slow and then pause in
its catalysis of RNA synthesis.

 The string of U nucleotides downstream of the hairpin (transcribed from the
A–T run) destabilizes the pairing between the new RNA chain and the DNA
template strand, and RNA polymerase dissociates from the template;
transcription has terminated.

 Mutations that disrupt the hairpin partially or completely prevent
termination.
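The two features of a Rho-independent terminator, an inverted repeat that can fold into a hairpin and a downstream run of U nucleotides, can be searched for with a simple scan. The sketch below is a toy detector under stated assumptions: the stem length (6), loop range (3 to 8), and minimum U-run (4) are illustrative parameters, the sequences are hypothetical, and only perfect stems are recognized.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would base pair with rna in a hairpin stem."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def find_terminators(rna, stem=6, loop_range=(3, 8), min_u_run=4):
    """Return (stem_start, hairpin_end) for each perfect hairpin followed by a U run.

    stem, loop_range, and min_u_run are illustrative assumptions.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - stem + 1):
        left = rna[i:i + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop          # start of the right arm of the stem
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_u_run]
            if right == reverse_complement(left) and tail == "U" * min_u_run:
                hits.append((i, j + stem))
    return hits


# Hypothetical terminator: 6-nt stem, 4-nt loop, then a run of six U's.
terminator = "CC" + "GCCGCC" + "AGUU" + "GGCGGC" + "UUUUUU" + "AA"
print(find_terminators(terminator))  # [(2, 18)]

# Mutating the right arm disrupts the hairpin, so nothing is detected.
mutant = "CC" + "GCCGCC" + "AGUU" + "GGAAGC" + "UUUUUU" + "AA"
print(find_terminators(mutant))  # []
```

The mutant case mirrors the last bullet above: without an intact hairpin, the signal for termination is lost.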
Rho dependent terminators
(type II terminators).

 Rho binds to the C-rich terminator sequence in the transcript upstream of the
transcription termination site.

 Rho then moves along the transcript until it reaches the RNA polymerase, where the
most recently synthesized RNA is base paired with the template DNA.

 Rho is a helicase enzyme, meaning that it can unwind double-stranded nucleic acids.

 When Rho reaches the RNA polymerase, its helicase activity unwinds the helix formed between
the RNA and the DNA template strand.

 ATP hydrolysis provides the needed energy.

 The new RNA strand is then released, the DNA double helix reforms, and the RNA
polymerase and Rho dissociate from the DNA.

 Transcription has terminated.


Prokaryotic Transcription

In E. coli, the initiation and termination of transcription are signaled by
specific sequences that flank the RNA-coding
sequence of the gene. The promoter is recognized by the
sigma factor component of the RNA polymerase–sigma
factor complex. Two types of termination sequences are
found, and a particular gene has one or the other. One
type of terminator is recognized by the RNA polymerase
alone, and the other type is recognized by the enzyme in
association with the Rho factor.
Transcription in Eukaryotes

In eukaryotes, three different RNA polymerases transcribe the genes for
four main types of RNAs.
• RNA polymerase I, located in the nucleolus, catalyzes the
synthesis of three of the RNAs found in ribosomes: the 28S, 18S,
and 5.8S rRNA molecules.
• RNA polymerase II, located in the nucleoplasm, synthesizes
messenger RNAs (mRNAs) and some small nuclear RNAs
(snRNAs).
• RNA polymerase III, located in the nucleoplasm, synthesizes:
(1) transfer RNAs (tRNAs);
(2) 5S rRNA, a small rRNA molecule found in each ribosome;
(3) the snRNAs not made by RNA polymerase II.
Transcription of protein-coding genes by
RNA polymerase II.

 Eukaryotic protein-coding genes are transcribed by RNA polymerase
II and have specific promoter sequences.
 The product of transcription is a precursor mRNA
(pre-mRNA) molecule.
 This pre-mRNA transcript must be modified,
processed, or both to produce the mature, functional
mRNA molecule.
 This mRNA molecule is translated to generate a
polypeptide.
Promoters of protein-coding
genes

 The promoters of protein-coding genes span about 200 base pairs upstream
of the transcription initiation site and contain various sequence elements.

 Two general regions of the promoter are:
(1) the core promoter; and
(2) promoter-proximal elements.
Promoters of protein-coding
genes

Core promoter
 Set of cis-acting sequence elements needed for the transcription machinery
to start RNA synthesis at the correct site.
 These elements are typically within no more than 50 bp upstream of the
start site.
 The best-characterized core promoter elements are:
(1) a short sequence element called Inr (initiator), which spans the
transcription initiation start site (defined as +1); and
(2) the TATA box, or TATA element (also called the Goldberg-Hogness box,
after its discoverers), located at about position -30. The TATA box has the
seven-nucleotide consensus sequence 5′-TATAAAA-3′.

 The Inr and TATA elements specify where the transcription machinery
assembles and determine where transcription will begin.
Promoters of protein-coding
genes

Promoter-proximal elements
 Present upstream from the TATA box, in the area from -50 to -200 nucleotides from the
start site of transcription. Examples of these elements are
(1) the CAAT (“cat”) box, named for its consensus sequence and located at about -75; and
(2) the GC box, with consensus sequence 5′-GGGCGG-3′, located at about -90.

 Promoters contain various combinations of core promoter elements and
promoter-proximal elements that together determine promoter function.

 The promoter-proximal elements are important in determining how and when a gene
is expressed.
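To make the position numbering concrete, the sketch below reports where the promoter elements named above occur relative to the transcription start site (+1). It is a minimal example using exact matches to the consensus sequences; the function name, the `tss_index` parameter, and the test sequence (padded with N as an unspecified base) are hypothetical.

```python
# Consensus sequences taken from the text; real elements often deviate from them.
ELEMENTS = {"TATA box": "TATAAAA", "CAAT box": "CAAT", "GC box": "GGGCGG"}


def find_promoter_elements(dna, tss_index):
    """Return (element_name, position) pairs, with positions relative to +1.

    tss_index is the string index of the +1 nucleotide. Upstream positions are
    negative, and there is no position 0.
    """
    dna = dna.upper()
    hits = []
    for name, motif in ELEMENTS.items():
        start = dna.find(motif)
        while start != -1:
            offset = start - tss_index
            position = offset if offset < 0 else offset + 1
            hits.append((name, position))
            start = dna.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical promoter: GC box at -90, CAAT box at -75, TATA box at -30.
promoter = ("N" * 10 + "GGGCGG" + "N" * 9 + "CAAT" + "N" * 41
            + "TATAAAA" + "N" * 23 + "A" + "N" * 9)
print(find_promoter_elements(promoter, tss_index=100))
# [('GC box', -90), ('CAAT box', -75), ('TATA box', -30)]
```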
Activators & Enhancers

 Promoters contain various combinations of core promoter elements and
promoter-proximal elements that together determine promoter function.
 Regulatory proteins called activators determine the
efficiency of transcription initiation.
 Other sequences, called enhancers, are required for the
maximal transcription of a gene.
 Enhancers are another type of cis-acting element which
function either upstream or downstream from the
transcription initiation site.
 Enhancers modulate transcription from a distance.
Eukaryotic Transcription Initiation

 Accurate initiation of transcription of a protein-coding gene involves the assembly of
RNA polymerase II and a number of other proteins called general transcription
factors (GTFs) on the core promoter.
 None of the three eukaryotic RNA polymerases binds directly to promoter DNA. Instead,
particular GTFs bind first and recruit the RNA polymerase to form a complex.
 Other GTFs then bind, and transcription can begin.
 The GTFs are numbered for the RNA polymerase with which they work and are
lettered to reflect their order of discovery.
 For example, TFIID is the fourth general transcription factor (D) discovered that
works with RNA polymerase II.
 For protein-coding genes, the GTFs and RNA polymerase II bind to promoter
elements in a particular order to produce the complete transcription initiation
complex, also called the preinitiation complex (PIC) because it is ready to begin
transcription.
 Binding of activators to promoter proximal elements and to enhancer elements
determines the overall efficiency of transcription initiation at a particular promoter.
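The GTF naming convention described above (a Roman numeral for the RNA polymerase, a letter for the order of discovery) is regular enough to decode mechanically. The small parser below is a sketch for illustration only; the function name is hypothetical and it handles only names of the simple form TF + numeral + single letter.

```python
import re


def parse_gtf_name(name: str):
    """Split a name such as 'TFIID' into (polymerase numeral, letter, discovery order)."""
    match = re.fullmatch(r"TF(I{1,3})([A-Z])", name)
    if match is None:
        raise ValueError(f"not a recognized GTF name: {name!r}")
    numeral, letter = match.groups()
    discovery_order = ord(letter) - ord("A") + 1  # A -> 1, B -> 2, ...
    return numeral, letter, discovery_order


print(parse_gtf_name("TFIID"))   # ('II', 'D', 4)
print(parse_gtf_name("TFIIIB"))  # ('III', 'B', 2)
```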
Assembly of preinitiation complex

 First, TFIID binds to the TATA box to form the initial committed complex.
 The multisubunit TFIID consists of the TATA-binding protein
(TBP), which recognizes the TATA box sequence, and several other
proteins called TBP-associated factors (TAFs). In vitro, the TFIID–TATA
box complex acts as a binding site for the sequential addition of other
transcription factors.
 Initially, TFIIA and then TFIIB bind, followed by RNA polymerase II and
TFIIF, to produce the minimal transcription initiation complex. (RNA
polymerase II, like all eukaryotic RNA polymerases, cannot directly
recognize and bind to promoter elements.)
 Next, TFIIE and TFIIH bind to produce the complete transcription
initiation complex, also called the preinitiation complex (PIC). TFIIH’s
helicase-like activity now unwinds the promoter DNA, and transcription
is ready to begin.
The Structure and Production of Eukaryotic
mRNAs

The mature, biologically active mRNA in both prokaryotic and eukaryotic
cells has three main parts:
1) A 5′ untranslated region (5′ UTR; also called a leader
sequence) at the 5′ end;
2) The protein-coding sequence, which specifies the amino
acid sequence of a protein during translation; and
3) A 3′ untranslated region (3′ UTR; also called a trailer
sequence) at the 3′ end. The 3′ UTR may contain sequence
information that signals the stability of the particular
mRNA.
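The three-part layout can be illustrated by splitting an mRNA string at its coding-sequence boundaries. The sketch below makes simplifying assumptions that go beyond the slides: the coding sequence is taken to start at the first AUG and to end at the first in-frame stop codon (UAA, UAG, or UGA), with the stop codon counted as part of the coding sequence. The function name and example sequence are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna: str):
    """Return (five_prime_utr, coding_sequence, three_prime_utr).

    Assumes translation starts at the first AUG, which is a simplification.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Walk codon by codon from the start codon until an in-frame stop codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")


# Hypothetical mRNA: 5-nt leader, 3-codon coding sequence, 10-nt trailer.
leader, coding, trailer = split_mrna("GGACC" + "AUGGCUUAA" + "GCAAUAAAGC")
print(leader, coding, trailer)  # GGACC AUGGCUUAA GCAAUAAAGC
```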
The Structure of Eukaryotic mRNAs
mRNA production in
Prokaryotes and Eukaryotes.
 mRNA production is different in bacteria and eukaryotes.
 In bacteria the RNA transcript functions directly as the mRNA
molecule.
 In addition, because bacteria lack a nucleus, an mRNA begins to
be translated on ribosomes before it has been transcribed
completely; this process is called coupled transcription and
translation.
 In eukaryotes the RNA transcript (the pre-mRNA) is modified in
the nucleus by RNA processing to produce the mature mRNA.
Also, the mRNA must migrate from the nucleus to the cytoplasm
(where the ribosomes are located) before it can be translated.
Thus, a eukaryotic mRNA is always transcribed completely and
then processed before it is translated.
mRNA production in
Prokaryotes and Eukaryotes.

 Another fundamental difference between bacterial and eukaryotic mRNAs
is that bacterial mRNAs often are polycistronic, meaning that they contain
the amino acid-coding information from more than one gene.
 Eukaryotic mRNAs typically are monocistronic, meaning that
they contain the amino acid-coding information from just
one gene. The eukaryotic system allows for additional levels
of control of gene expression, which is particularly
important in the more complex, multicellular organisms.
Processes for the synthesis of functional mRNA in
prokaryotes and eukaryotes

 In bacteria, the mRNA synthesized by RNA polymerase does not have to
be processed before it can be translated by ribosomes. Also, because there
is no nuclear membrane, mRNA translation can begin while transcription
continues, resulting in a coupling of transcription and translation.
 In eukaryotes, the primary RNA transcript is a precursor-
mRNA (pre-mRNA) molecule, which is processed in the
nucleus by the addition of a 5’ cap and a 3’ poly(A) tail and
the removal of introns. Only when that mRNA is
transported to the cytoplasm can translation occur.
Production of Mature mRNA in Eukaryotes

 Eukaryotic mRNAs are modified at both the 5’ and 3’ ends.


 In eukaryotes in general, protein-coding genes typically have
non-amino acid–coding sequences called introns between the
other sequences that are present in mRNA, the exons.
 The term intron is derived from intervening sequence—a
sequence that is not translated into an amino acid sequence—
and the term exon is derived from expressed sequence.
 Exons include the 5’ and 3’ UTRs, as well as the amino acid-
coding portions.
 In the processing of pre-mRNA to the mature mRNA molecule,
the introns are removed.
5′ Modification of RNA

 Once RNA polymerase II has made about 20 to 30 nucleotides of
pre-mRNA, a capping enzyme adds a guanine nucleotide, most commonly
7-methylguanosine (m7G), to the 5′ end.
 The addition involves an unusual 5′-5′ linkage, rather than a 5′-3′ linkage.
 The process is called 5′ capping.
 The sugars of the next two nucleotides are also modified by
methylation.
 The 5′ cap remains throughout processing and is present in the mature
mRNA.
 It protects against degradation by exonucleases because of the unusual
linkage.
 The 5′ cap is also important for the binding of the ribosome as an initial
step of translation.
3′ Modification

 Most eukaryotic pre-mRNAs become modified at their 3′ ends by the
addition of a sequence of about 50 to 250 adenine nucleotides called a
poly(A) tail.
 There is no DNA template for the poly(A) tail.
 The poly(A) tail remains when the pre-mRNA is processed to
mature mRNA.
 The poly(A) tail is required for efficient export of the mRNA from the
nucleus to the cytoplasm.
 Once in the cytoplasm, the poly(A) tail protects the 3′ end of the mRNA
by buffering coding sequences against early degradation by
exonucleases.
 The poly(A) tail also plays important roles in the initiation of translation
by ribosomes and in processes that regulate the stability of mRNA.
3′ Modification

 Addition of the poly(A) tail defines the 3′ end of an mRNA strand and is
associated with the termination of transcription of protein-coding genes.
 Addition of the poly(A) tail is signaled when mRNA transcription proceeds
past the poly(A) site, a site in the RNA transcript that is about 10 to 30
nucleotides downstream of the poly(A) consensus sequence 5′-AAUAAA-3′.
 A number of proteins, including CPSF (cleavage and polyadenylation
specificity factor) protein, CstF (cleavage stimulation factor) protein, and
two cleavage factor proteins (CFI and CFII), then bind to and cleave the RNA
at the poly(A) site.
 Then, the enzyme poly(A) polymerase (PAP), which is bound to CPSF, adds A
nucleotides to the 3′ end of the RNA using ATP as the substrate to produce
the poly(A) tail.
 Poly(A) binding protein II (PABII) molecules bind to the poly(A) tail as it is
synthesized.
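The relationship between the poly(A) signal and the poly(A) site can be sketched as a search followed by a cut-and-extend step. In the example below, the 10-to-30-nucleotide window is measured from the end of the AAUAAA signal, which is an interpretation made for this sketch; the function names, the chosen cleavage index, and the test transcript are hypothetical.

```python
POLY_A_SIGNAL = "AAUAAA"


def poly_a_site_windows(rna, min_offset=10, max_offset=30):
    """Return (signal_start, window_start, window_end) for each AAUAAA signal.

    The window covers indices 10 to 30 nt downstream of the end of the signal,
    clipped to the end of the transcript.
    """
    rna = rna.upper()
    windows = []
    pos = rna.find(POLY_A_SIGNAL)
    while pos != -1:
        signal_end = pos + len(POLY_A_SIGNAL)
        window_start = signal_end + min_offset
        window_end = min(signal_end + max_offset, len(rna))
        if window_start <= window_end:
            windows.append((pos, window_start, window_end))
        pos = rna.find(POLY_A_SIGNAL, pos + 1)
    return windows


def cleave_and_polyadenylate(rna, cleavage_index, tail_length=200):
    """Cut the transcript at cleavage_index and append an untemplated poly(A) tail."""
    return rna[:cleavage_index] + "A" * tail_length


# Hypothetical transcript with one signal followed by 40 nt of downstream sequence.
transcript = "GGCC" + "AAUAAA" + "C" * 40
print(poly_a_site_windows(transcript))  # [(4, 20, 40)]
processed = cleave_and_polyadenylate(transcript, cleavage_index=25, tail_length=50)
print(len(processed))  # 75
```

The tail is added as a literal run of A's with no template, matching the point that the poly(A) tail is not encoded in the DNA.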
mRNA Splicing

 Pre-mRNAs often contain a number of introns.


 Introns must be excised from each pre-mRNA to produce a
mature mRNA that can be translated into the encoded
polypeptide.
 The mature mRNA, then, contains RNA copies of the exons in
the gene.
 These are now contiguously arranged in the RNA molecule
without being separated by intron sequences.
 Transcription of the gene results in a pre-mRNA containing both
exon and intron sequences.
 This RNA is found only in the nucleus.
 The intron sequence is excised by processing events, and the
flanking exon sequences are spliced together to produce a
mature mRNA.
Processing of Pre-mRNA to Mature mRNA

 Messenger RNA production from genes with introns involves transcription
of the gene by RNA polymerase II, addition of the 5′ cap and poly(A) tail to
produce the pre-mRNA molecule, and processing of the pre-mRNA in the
nucleus to remove the introns and splice the exons together to produce the
mature mRNA.
 Introns typically begin with 5′-GU and end with AG-3′,
although more than just those nucleotides are needed to
specify a junction between an intron and an exon.
 Introns in pre-mRNAs are removed and exons joined in the
nucleus by mRNA splicing.
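The outcome of splicing, introns removed and exons joined, can be modeled as string slicing once the intron coordinates are known. The sketch below takes those coordinates as given (finding them is the hard part and is not attempted here) and only checks the GU...AG rule mentioned above. The function name and example sequence are hypothetical.

```python
def splice(pre_mrna, introns):
    """Remove introns and join the flanking exons.

    introns is a list of (start, end) index pairs, end exclusive, that do not
    overlap. Each intron must begin with GU and end with AG.
    """
    pre_mrna = pre_mrna.upper()
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} does not follow the GU...AG rule")
        exons.append(pre_mrna[cursor:start])  # exon sequence before this intron
        cursor = end
    exons.append(pre_mrna[cursor:])  # final exon
    return "".join(exons)


# Hypothetical pre-mRNA: exon 1, a 14-nt intron, exon 2.
pre_mrna = "AUGGCC" + "GUAAGUCCUAACAG" + "UUCUAA"
print(splice(pre_mrna, [(6, 20)]))  # AUGGCCUUCUAA
```

As the text notes, GU and AG alone are not sufficient to define a real splice junction, so this check is a necessary condition only.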
RNA Splicing

 1. U1 snRNP binds to the 5′ splice junction of the intron. This binding is primarily the
result of base pairing of U1 snRNA in the snRNP to the 5′ splice junction.
 2. U2 snRNP binds to a sequence called the branch point sequence, which is located
upstream of the 3′ splice junction. This binding occurs as a result of the base pairing of
U2 snRNA in the snRNP to the branch-point sequence.
 3. A U4/U6 snRNP and a U5 snRNP interact, and the combination binds to the U1 and
U2 snRNPs, causing the intron to loop and thereby bringing its two junctions close
together.
 4. U4 snRNP dissociates, resulting in the formation of the active spliceosome.
 5. The snRNPs in the spliceosome cleave the intron from exon 1 at the 5′ splice
junction, and the now free 5′ end of the intron bonds to a particular A nucleotide in
the branch-point sequence. This looped-back structure is called an RNA lariat structure.
The lariat structure involves an unusual 2′–5′ phosphodiester bond formed between
the 2′ OH of the adenine nucleotide in the branch-point sequence and the 5′
phosphate of the guanine nucleotide at the 5′ end of the intron. The A itself remains in
normal 3′–5′ linkage with its adjacent nucleotides of the intron.
 6. Next, the spliceosome excises the intron (still in lariat shape) by cleaving it at the 3′
splice junction and then ligates exons 1 and 2 together. The snRNPs are released at this
time. The process is repeated for each intron.
RNA editing

 RNA editing involves the posttranscriptional insertion or deletion of
nucleotides or the conversion of one base to another.
 As a result, the functional RNA molecule has a base sequence that
does not match the base-pair sequence of its DNA coding sequence.
 RNA editing was discovered in the mid-1980s in some mitochondrial
mRNAs of trypanosomes, the parasitic protozoa that cause sleeping
sickness.
 Many RNAs that are edited are encoded by the mitochondrial and
chloroplast genomes.
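The point that an edited RNA no longer matches its DNA-derived sequence can be shown by comparing the two position by position. The sketch below covers only the base-conversion case (equal-length sequences); insertions and deletions would require an alignment step, which is not attempted. The function name and both sequences are hypothetical.

```python
def substitution_edits(unedited: str, edited: str):
    """List (index, original_base, edited_base) for each converted base.

    Only base conversions are handled, so both RNAs must have the same length.
    """
    if len(unedited) != len(edited):
        raise ValueError("insertion/deletion editing needs an alignment, not handled here")
    return [
        (index, before, after)
        for index, (before, after) in enumerate(zip(unedited.upper(), edited.upper()))
        if before != after
    ]


# Hypothetical transcript in which one C has been converted to U.
print(substitution_edits("AUGCAAUGG", "AUGUAAUGG"))  # [(3, 'C', 'U')]
```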