Notes-Ecoli Expression-2023
Proteins are used as therapeutics, e.g. growth hormones, insulin, vaccines etc.
Proteins are used as enzymes for various purposes, e.g. restriction enzymes, polymerases,
proteases, lipases etc.
For research (structure-function studies): we need a good amount of protein with
high purity to characterize that protein.
High yield
Proper folding
A guideline is important for the investigator in the decision-making process for
choosing an appropriate expression system. However, even with the described guidelines
there are many circumstances when it is not obvious which expression system is the best
choice, and multiple expression systems must be tried before an optimal
system is identified. Four expression systems are mainly used in laboratories: Escherichia coli,
Pichia pastoris, baculovirus/insect cell, and mammalian expression systems.
Why E. coli?
Among these systems, E. coli remains the most widely used host for recombinant
protein expression. E. coli is easy to transform, grows quickly in simple media, and
requires inexpensive equipment for growth and storage. Also in most cases, E. coli can be
made to produce adequate amounts of protein suitable for the intended application and
there are relatively straightforward methods to scale-up bioproduction.
1. Expression Vector
In general, things that affect these can affect the level of protein
expression.
The picture above shows a diagram of a typical expression vector with an expression cassette
containing all the elements needed for regulated high level expression of a protein in E.
coli.
i. Promoter
Promoter Strength
A strong promoter may not be best for all situations. Overproduction of RNA may
saturate translation machinery, and maximizing RNA synthesis may not be desirable or
necessary. A weaker promoter may actually give higher steady-state levels of soluble,
intact protein than one that is rapidly induced.
Inducibility
Foreign proteins can be toxic when over-expressed. An inducible promoter keeps gene
expression off until it is time to turn it on. The promoters can be:
Types of Promoters
Promoters used in E. coli expression vectors can be divided into three categories
depending on their origin and mode of function
E. coli's own promoters were the first ones used to drive overexpression of proteins in
bacteria. These are strong promoters, and can be induced with relatively inexpensive
chemicals, but they have largely been superseded by other promoters.
lac:
trp:
Although not naturally found in E. coli, the synthetic tac and trc promoters can be
classified as E. coli promoters, as they are created by fusing different elements of the lac
and trp promoters, making them more powerful than either of the parental promoters
alone. Tac and trc promoters are:
araB Promoter:
Arabinose promoter is perhaps the latest entry to the E. coli promoter family, and offers
very tight control of the expression. Several vectors (e.g. pBAD) with ara promoter are
available from Invitrogen and in particular the thioredoxin fusions are worth having a
look at. One of the advantages of the pBAD vectors is the broad range of inducer (L-
arabinose) concentrations where expression occurs and the ability to fine tune the
expression level to maximise solubility. The araB promoter is the:
T7 promoter system
The system of choice for most scientists is the T7 system, which is based on the
powerful promoter of gene 10 of T7 phage and the fast and processive RNA polymerase of
the same phage (the product of gene 1).
Originally developed by William Studier in the late 1980s, it has become the most popular
expression system today. Novagen sells the pET system commercially, with
dozens of vectors carrying different fusions etc. The T7 promoter is:
A protein or a peptide located at either the C- or N-terminus of the target protein, which
facilitates one or several of the following characteristics of the protein:
Improved expression/yield
Enhanced solubility
Improved detection and purification
Desired localization
Enterokinase recognizes the sequence (Asp)4Lys and cleaves immediately after the lysine
residue removing the fusion tag:
The cleavage site and fusion tag can also be synthesized at the C-terminus.
To facilitate assay of the fusion proteins, short antibody recognition sequences can be
incorporated between the tag and the protease cleavage site.
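The enterokinase cleavage rule described above can be illustrated with a minimal Python sketch. It assumes the protein is given in one-letter amino acid code, so (Asp)4Lys becomes DDDDK. The function name and the example sequence (a His-tag followed by the site and a short target peptide) are hypothetical, and real cleavage efficiency depends on factors that a string search cannot model.

```python
# (Asp)4Lys written in one-letter amino acid code
ENTEROKINASE_SITE = "DDDDK"


def cleave_after_enterokinase_site(protein: str) -> list[str]:
    """Split a protein sequence immediately after each DDDDK motif."""
    fragments = []
    start = 0
    while True:
        hit = protein.find(ENTEROKINASE_SITE, start)
        if hit == -1:
            break
        cut = hit + len(ENTEROKINASE_SITE)  # cut right after the lysine
        fragments.append(protein[start:cut])
        start = cut
    if start < len(protein):
        fragments.append(protein[start:])  # remainder after the last site
    return fragments


# Hypothetical N-terminal fusion: His-tag + cleavage site + target peptide
fusion = "MHHHHHH" + "DDDDK" + "ASTQ"
tag_part, target_part = cleave_after_enterokinase_site(fusion)
print(tag_part)     # tag plus cleavage site
print(target_part)  # released target protein
```

Because the cut falls after the lysine, an N-terminal tag is removed without leaving extra residues on the target protein.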
Inverted repeats followed by a string of A residues (on the template strand), present
downstream from the site of insertion of the cloned gene:
BL21(DE3):
BL21(DE3) derivatives:
BL21(DE3)Star: Reduced mRNA degradation due to the rne131 mutation (the rne gene
encodes RNase E, an enzyme that degrades mRNA)
Many extracellular eukaryotic proteins contain disulfide bonds that stabilise their
structure, but production of such proteins in E. coli can be problematic: the cytoplasm of
the bacteria is reducing, so disulfide bonds are unlikely to form and, if formed, their
stability is very low. To overcome this problem, strains with a more oxidising intracellular
environment have been developed, mainly by the group of Jon Beckwith of Harvard
University, and commercialised by Novagen. These strains have a deletion of either
thioredoxin reductase (trxB) alone or of trxB and glutathione reductase (gor), and as a result
allow (some) disulfide bond formation in the cytoplasm.
These strains have an oxidizing environment in the cytoplasm, thus allowing disulfide
formation in the cytoplasm:
- be in frame with the ATG codon and/or any fusion tag, and be in frame with the stop codon
As the vectors used for expression work carry all the elements needed for high level
expression, all that is left for you to do is to create an insert that takes advantage of that
design.
Perhaps the most important thing is to make sure your insert is cloned in the correct
orientation and in the right translation frame to produce the protein you have decided to
make. If you have no N-terminal fusion, you need to make sure an initiation codon ATG
for methionine is present and in frame with the rest of the protein. You also need to make
sure a sensible stop codon exists to avoid producing excessively long tails. Most
commercial vectors will have stop codons in all three frames, but rather than relying on
this, introduce one yourself in the PCR primer in the ideal position. Of course if you are
using a C-terminal fusion, you have to ensure the reading frame continues in the right
frame and the stop codon will be the one provided by the vector. You will also need to
make sure you can clone the insert to the vector(s) of choice by computing a restriction
map with your insert's sequence.
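The frame checks described above can be sketched in a few lines of Python. This is an illustration only: it assumes the insert is supplied as the coding strand, 5' to 3', with no N-terminal fusion, and the function names are hypothetical. The second helper counts occurrences of a restriction site inside the insert, a crude stand-in for a full restriction map. The BamHI site GGATCC is used purely as an example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_reading_frame(cds: str) -> list[str]:
    """Return a list of problems found in a coding sequence (empty list = OK)."""
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not cds.startswith("ATG"):
        problems.append("does not start with ATG")
    # Split into complete codons, ignoring any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with an in-frame stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("contains a premature in-frame stop codon")
    return problems


def count_internal_sites(insert: str, site: str) -> int:
    """Count occurrences of a restriction site in the insert (overlaps included)."""
    insert, site = insert.upper(), site.upper()
    return sum(
        1
        for i in range(len(insert) - len(site) + 1)
        if insert[i:i + len(site)] == site
    )


print(check_reading_frame("ATGGCTTAA"))                   # no problems expected
print(check_reading_frame("ATGTAAGCTTAA"))                # premature stop
print(count_internal_sites("ATGGGATCCGCTTAA", "GGATCC"))  # BamHI site inside insert
```

An enzyme whose site occurs inside the insert cannot be used for cloning without cutting the gene itself, so a non-zero count is a reason to pick a different enzyme or vector.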
Once you have done the cloning, made your minipreps and verified that the correct insert is
found in the plasmid, you will still need to confirm the correctness of the sequence by
sending a sample to a sequencing service. PCR (even with the fanciest error-free
polymerases) will create some errors and it would not be sensible to skip sequencing of
the construct at this stage.
N-terminus of the Insert
- Restriction enzyme recognition site (plus a few extra bases at the 5' end for effective
digestion)
- ATG codon in frame for the initiating methionine
- Sequence that anneals with the target gene (in frame)
- Optimized codons for E. coli
For example:
C-terminus of the Insert
- Restriction enzyme recognition site (plus a few extra bases at the 5' end for effective
digestion)
- STOP codon (TAA/TAG/TGA) in frame with your gene
- Sequence that anneals with the target gene
- Remember to reverse complement!
For example:
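A minimal Python sketch of how the two primers described in the lists above might be assembled. Everything specific here is an assumption made for illustration: the short gene sequence, the BamHI (GGATCC) and XhoI (CTCGAG) sites, the "GCGC" 5' overhang, and the annealing length. The sketch assumes the gene is given as the coding strand, starting with ATG and without its stop codon. It ignores melting temperature and the reading frame of the vector, which must be checked separately.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def forward_primer(gene: str, site: str, overhang: str = "GCGC",
                   anneal_len: int = 18) -> str:
    """Extra 5' bases + restriction site + start of the gene (begins with ATG)."""
    return overhang + site + gene[:anneal_len].upper()


def reverse_primer(gene: str, site: str, stop: str = "TAA",
                   overhang: str = "GCGC", anneal_len: int = 18) -> str:
    """Extra 5' bases + reverse complement of (end of gene + stop codon + site)."""
    coding_end = gene[-anneal_len:].upper() + stop + site
    return overhang + reverse_complement(coding_end)


# Hypothetical 30-nt coding sequence without a stop codon
gene = "ATGGCTAGCAAAGGTGAAGAACTGTTCACC"
print(forward_primer(gene, "GGATCC", anneal_len=12))
print(reverse_primer(gene, "CTCGAG", anneal_len=12))
```

Building the 3' end on the coding strand first and reverse complementing it in one step keeps the stop codon in frame and avoids the common mistake of forgetting to reverse complement the reverse primer.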
Transformation Methods:
Chemical
Involves a brief heat shock to ice-cold cells mixed with the vector DNA.
Electroporation
The cells and DNA are placed between two flat electrodes in a special cuvette and
subjected to a brief (~5 ms) high voltage (1800 V) pulse to produce transient pores in the
cell membrane through which the DNA enters the cell.
Important: Prior to electroporation, the cells must be washed and suspended in deionized
water to remove salts that would otherwise cause electrical arcing in the cuvette.
Selection of Transformed Cells (Selection Markers)
Most vectors carry at least one gene that confers antibiotic resistance to the host cells.
e.g., a selectable marker gene such as bla, the gene that encodes β-lactamase, an enzyme
that modifies ampicillin into a form that is non-toxic to the bacterium.
Tetracycline (tetR) and kanamycin (KanR) resistance markers are also available.
Codon Bias
Codon usage may also affect the level of protein expression. If the gene of interest
contains codons not commonly used in E. coli, low expression may result due to the
depletion of tRNAs for the rarer codons. When one or more rare codons are encountered,
translational pausing may result, slowing the rate of protein synthesis and exposing the
mRNA to degradation. This potential problem is of particular concern when the sequence
encodes a protein >60kDa, when rare codons are found at high frequency, or when
multiple rare codons are found over a short distance of the coding sequence. However,
the appearance of a rare codon does not necessarily lead to poor expression. It is best to
try expression of the native gene, and then make changes if needed. Strategies
include mutating the gene of interest to use optimal codons for the host organism,
and co-transforming the host with rare tRNA genes.
Rare codons: Arginine AGG, AGA, CGA; Leucine CTA; Isoleucine ATA; Proline CCC.
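The rare codon list above lends itself to a quick scan of a coding sequence. The following Python sketch assumes the sequence is given as the coding strand, in frame from its first base. The function names, the window of 5 codons and the threshold of 2 rare codons are arbitrary values chosen for illustration, not recommendations from these notes.

```python
# Rare codons listed above: Arg AGG, AGA, CGA; Leu CTA; Ile ATA; Pro CCC
RARE_CODONS = {"AGG", "AGA", "CGA", "CTA", "ATA", "CCC"}


def find_rare_codons(cds: str) -> list[tuple[int, str]]:
    """Return (codon_index, codon) for every rare codon, reading in frame."""
    cds = cds.upper()
    hits = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in RARE_CODONS:
            hits.append((i // 3, codon))
    return hits


def has_rare_cluster(hits: list[tuple[int, str]], window: int = 5,
                     min_count: int = 2) -> bool:
    """True if min_count or more rare codons fall within `window` consecutive codons."""
    positions = [index for index, _ in hits]
    for k in range(len(positions) - min_count + 1):
        if positions[k + min_count - 1] - positions[k] < window:
            return True
    return False


hits = find_rare_codons("ATGAGGAGACTGTAA")
print(hits)                    # two adjacent rare arginine codons
print(has_rare_cluster(hits))  # they fall within one window
```

A flagged cluster is only a hint. As noted above, a rare codon does not necessarily lead to poor expression, so it is still worth trying the native gene first.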
Post-translational Modification
E. coli does not glycosylate or phosphorylate proteins or recognize proteolytic processing
signals from eukaryotes, so take this into account when designing the cloning strategy. If
proteolytic processing is needed, it is best to express only the coding sequences for the
fully processed protein. If the protein of interest requires glycosylation for activity, and
full activity is important in the final use, consider a eukaryotic host.
Protein Functionality
Each requirement placed on a recombinant protein will affect the choice of expression
system. If a protein is to be used only to prepare antibody, it need not be soluble or
active, and the production of inclusion bodies (aggregates of improperly folded protein)
in E. coli may be all that is needed. Alternatively, if a protein’s biological activity will be
assayed, or if it is to be used in structural studies (NMR, crystallography, etc.), a properly
folded and soluble form will be required.