Manipulation of DNA: Gene


BT8601 GENETIC ENGINEERING AND GENOMICS

UNIT I BASICS OF RECOMBINANT DNA TECHNOLOGY


Manipulation of DNA – Restriction and Modification enzymes, Design of linkers and adaptors.
Characteristics of cloning and expression vectors based on plasmid and bacteriophage, Vectors for
yeast, insect and mammalian systems, Prokaryotic and eukaryotic expression host systems,
Introduction of recombinant DNA into host cells and selection methods.
Manipulation of DNA
Introduction
Gene:
A gene can be defined as the region of DNA (or RNA in case of virus) that controls a discrete
hereditary characteristic, usually corresponding to a single protein or RNA. This includes the entire
functional unit, encompassing coding (exons) and noncoding sequences (introns and regulatory
sequences).
• Exons and introns which represent the coding and noncoding regions are present in a eukaryotic
gene. Introns are absent in prokaryotes.
• The introns are removed by splicing and the exons are translated in tandem to yield the functional
polypeptide that further undergoes post translational modification to become functional. These
functional polypeptides (proteins) are targeted to various organelles in the cell or exported out of the
cell for carrying out various intracellular and extracellular processes respectively.

Genome:
Genome is the complete set of genetic information of a cell or an organism; in particular, the
complete sequence of DNA/RNA that carries this information. In diploid organisms, it refers to the
haploid set of chromosomes present in a cell. Depending on its localization, genome may be nuclear
or organellar. Organellar genomes are again of two types: mitochondrial and chloroplast genome.
Genome size differs significantly between species. It is only loosely related to the size and
complexity of an organism: many small organisms in fact have bigger genomes than their larger
counterparts, as can be seen in the table below.
Species                     Organism type     Genome size (Mb)
Triticum aestivum           Plant             16000
Homo sapiens                Mammal            3200
Arabidopsis thaliana        Plant             125
Drosophila melanogaster     Insect            180
Caenorhabditis elegans      Nematode worm     97
Saccharomyces cerevisiae    Yeast             12.1
Escherichia coli            Bacterium         4.64
Haemophilus influenzae      Bacterium         1.83
Mycoplasma genitalium       Bacterium         0.58
Definition of recombinant DNA
• Production of a unique DNA molecule by joining together two or more DNA fragments
not normally associated with each other
• DNA fragments are usually derived from different biological sources

STEPS INVOLVED IN CREATION OF A rDNA


1. Selection of the gene of interest
2. Selection of the vector
3. Treatment with restriction endonuclease
4. Ligation using ligase
5. Transformation in suitable host
6. Screening of the recombinant plasmid
7. Expression of particular protein.
Applications
• Gene isolation/purification/synthesis
• Sequencing/Genomics/Proteomics
• Polymerase chain reaction (PCR)
• Mutagenesis (reverse genetics)
• Expression analyses (transcriptional and translational levels)
• Restriction fragment length polymorphisms (RFLPs)
• Biochemistry/ Molecular modeling
• High throughput screening
• Combinatorial chemistry
• Gene therapy
• Recombinant Vaccines
• Genetically modified crops
• Biosensors
• Monoclonal antibodies
• Cell/tissue culture
• Xenotransplantation (transplantation, implantation or infusion into a human recipient of
live cells, tissues, or organs from a nonhuman animal source)
• Bioremediation
• Production of next generation antibiotics
• Forensics
• Bioterrorism detection
Restriction and Modification enzymes
DNA manipulative enzymes can be grouped into four broad classes, depending on the
type of reaction that they catalyze:
• Nucleases are enzymes that cut, shorten, or degrade nucleic acid molecules.
• Ligases join nucleic acid molecules together.
• Polymerases make copies of molecules.
• Modifying enzymes remove or add chemical groups.
Nucleases
• Nucleases degrade DNA molecules by breaking the phosphodiester bonds that link one
nucleotide to the next in a DNA strand. There are two different kinds of nuclease:
– Exonucleases remove nucleotides one at a time from the end of a DNA molecule.
– Endonucleases are able to break internal phosphodiester bonds within a DNA molecule.
Restriction Endonucleases
• Also called restriction enzymes
• 1962: "molecular scissors" discovered in bacteria
• E. coli bacteria have an enzymatic immune system that recognizes and destroys foreign DNA
• About 3,000 enzymes have been identified, around 200 have unique properties, and many are
purified and available commercially
Restriction Enzymes: Molecular Scissors
• Restriction enzymes (endonucleases) cut DNA at specific sequences
• What kinds of bonds are broken when restriction enzymes cut?
– Covalent (phosphodiester) bonds within each single strand
– Hydrogen bonds between strands, as a result of the strands coming apart
Origins of Restriction Enzymes
• Naturally found in different types of bacteria
• Bacteria use restriction enzymes to protect themselves from foreign DNA
• Bacteria have mechanisms to protect themselves from the actions of their own restriction
enzymes
• Also found in actinomycetes (filamentous bacteria) such as Nocardia and Streptomyces

Biological Role of RE
• Restriction-modification system: restriction enzymes are paired with methylases.
• Methylases are enzymes that add methyl groups to specific nucleotides within the recognition
sequence. The methylation prevents recognition by the restriction enzyme.
• Therefore, the restriction enzyme within a cell doesn't destroy the cell's own DNA. However, the
restriction enzyme can destroy foreign DNA that enters the cell, such as bacteriophage DNA.
• The system is thus composed of a restriction endonuclease and a methylase.
• Each bacterial species and strain has its own combination of restriction and methylating
enzymes.

Diversity of Enzymes
Enzyme     Source organism                  Recognition sequence
EcoRI      Escherichia coli R               G/AATTC
BamHI      Bacillus amyloliquefaciens H     G/GATCC
HindIII    Haemophilus influenzae Rd        A/AGCTT
PstI       Providencia stuartii             CTGCA/G
PmeI       Pseudomonas mendocina            GTTT/AAAC
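The way a recognition sequence and its cut position ("/") determine the fragments can be sketched in a few lines of Python. This is only an illustration: it treats the top strand of a linear molecule, uses a subset of the enzymes listed above, and the function name `digest` and the test sequence are made up for the example.

```python
# Recognition sequences written 5'->3'; "/" marks the cut on this strand.
ENZYMES = {
    "EcoRI": "G/AATTC",
    "BamHI": "G/GATCC",
    "PstI": "CTGCA/G",
    "PmeI": "GTTT/AAAC",
}

def digest(seq, enzyme):
    """Cut the top strand of a linear DNA sequence with one enzyme."""
    pattern = ENZYMES[enzyme]
    offset = pattern.index("/")          # cut position inside the site
    site = pattern.replace("/", "")
    seq = seq.upper()
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])        # piece after the last cut
    return fragments

dna = "TTGAATTCAAGGATCCTT"               # hypothetical test sequence
print(digest(dna, "EcoRI"))              # ['TTG', 'AATTCAAGGATCCTT']
print(digest(dna, "BamHI"))              # ['TTGAATTCAAG', 'GATCCTT']
```

An enzyme with no site in the sequence returns the molecule uncut, as a single fragment.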
MECHANISM OF CUTTING
Nomenclature
• Smith and Nathans (1973) proposed the enzyme naming scheme:
– Three-letter acronym for each enzyme derived from the source organism
– First letter from genus
– Next two letters represent species
– Additional letter or number represents the strain or serotype
– For example, the enzyme HindII was isolated from Haemophilus influenzae serotype d.
• Named for bacterial genus, species, strain, and type
Example: EcoRI
Genus: Escherichia
Species: coli
Strain: R
Order discovered: I (the first enzyme identified from this strain)
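As a rough illustration of the naming scheme, the sketch below splits an enzyme name into its parts with a regular expression. The pattern and the helper name `parse_enzyme_name` are assumptions made for this example; it only handles names that end in a Roman numeral and would need extending for other forms.

```python
import re

ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}

# genus initial, two species letters, optional strain/serotype, Roman numeral
NAME = re.compile(r"^([A-Z])([a-z]{2})([A-Za-z0-9]*?)([IVX]+)$")

def parse_enzyme_name(name):
    """Split a restriction enzyme name into the parts of the naming scheme."""
    match = NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised enzyme name: {name}")
    genus, species, strain, numeral = match.groups()
    return {
        "genus_initial": genus,
        "species_letters": species,
        "strain": strain or None,        # None when no strain is given
        "order": ROMAN.get(numeral),     # order of discovery in that strain
    }

print(parse_enzyme_name("EcoRI"))
print(parse_enzyme_name("HindIII"))
```

For EcoRI this gives genus initial E, species letters co, strain R and order 1; for HindIII it gives H, in, d and 3.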
Uses for Restriction Enzymes
• RFLP analysis (Restriction Fragment Length Polymorphism)
• DNA sequencing
• DNA storage – libraries
• Transformation
• Large scale analysis – gene chips
• Restriction analysis
Restriction analysis means using restriction enzymes to find out information about a piece of DNA.
We can use restriction enzymes to find out:
– The size of a plasmid
– Whether there are any restriction sites for a particular enzyme on a piece of DNA (e.g. EcoRI)
– How many restriction sites there are for a particular enzyme
– Where the restriction sites are located
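The reasoning behind restriction analysis of a plasmid can be sketched as follows. For a circular molecule, the number of fragments equals the number of sites and the fragment sizes add up to the plasmid size. The function name, plasmid length and cut positions below are hypothetical values chosen for illustration.

```python
def circular_fragment_sizes(plasmid_length, cut_positions):
    """Fragment sizes (bp) after cutting a circular plasmid at the given positions."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [plasmid_length]          # no site: the circle stays intact
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    # the last fragment runs from the final cut, through position 0, to the first cut
    sizes.append(plasmid_length - cuts[-1] + cuts[0])
    return sorted(sizes, reverse=True)

# Hypothetical 5000 bp plasmid with three sites for one enzyme
sizes = circular_fragment_sizes(5000, [500, 2000, 4200])
print(sizes)        # [2200, 1500, 1300]
print(sum(sizes))   # 5000, the size of the plasmid
```

A single site gives one linear fragment of full plasmid length, which is why a single-cutter is a convenient way to estimate plasmid size on a gel.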
Recognition sites have symmetry (palindromic)
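"Palindromic" here means that the site reads the same 5'→3' on both strands, that is, the sequence equals its own reverse complement. A minimal check, with helper names chosen for this example:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(site):
    """True if the site reads the same on both strands."""
    return site.upper() == reverse_complement(site)

print(is_palindromic("GAATTC"))   # True  (EcoRI site)
print(is_palindromic("GGATCC"))   # True  (BamHI site)
print(is_palindromic("GATTACA"))  # False
```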
Isoschizomers
Isoschizomers are pairs of restriction enzymes specific to the same recognition sequence.
• For example, Sph I (GCATG/C) and Bbu I (GCATG/C) are isoschizomers of each other.
An enzyme that recognizes the same sequence but cuts it differently is a neoschizomer. Neoschizomers
are a specific type (subset) of isoschizomers.
• For example, Sma I (CCC/GGG) and Xma I (C/CCGGG) are neoschizomers of each other.
An enzyme that recognizes a slightly different sequence but produces the same ends is an isocaudomer.
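The distinction can be expressed as a comparison of recognition sequence and cut position. The sketch below is illustrative only; the function names are made up, and the HpaII/MspI pair (both C/CGG) is a well-known isoschizomer pair brought in for the example rather than taken from these notes.

```python
def parse(pattern):
    """Split 'CCC/GGG' into (recognition site, cut offset)."""
    return pattern.replace("/", ""), pattern.index("/")

def relationship(pattern_a, pattern_b):
    """Classify two enzymes from their recognition/cut patterns."""
    site_a, cut_a = parse(pattern_a)
    site_b, cut_b = parse(pattern_b)
    if site_a != site_b:
        return "different recognition sequences"
    if cut_a == cut_b:
        return "isoschizomers (same site, same cut)"
    return "neoschizomers (same site, different cut)"

print(relationship("CCC/GGG", "C/CCGGG"))  # SmaI vs XmaI
print(relationship("C/CGG", "C/CGG"))      # HpaII vs MspI
```

SmaI leaves blunt ends while XmaI leaves a 5' overhang, so the choice between neoschizomers changes which ends are available for ligation.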
Differences between restriction enzymes

Type II restriction endonucleases cut DNA at specific nucleotide sequences


The central feature of type II restriction endonucleases (which will be referred to simply as
"restriction endonucleases" from now on) is that each enzyme has a specific recognition sequence at
which it cuts a DNA molecule. A particular enzyme cleaves DNA at the recognition sequence and
nowhere else. For example, the restriction endonuclease called PvuI (isolated from Proteus vulgaris)
cuts DNA only at the hexanucleotide CGATCG. In contrast, a second enzyme from the same
bacterium, called PvuII, cuts at a different hexanucleotide, in this case CAGCTG. Many restriction
endonucleases recognize hexanucleotide target sites, but others cut at four, five, eight, or even longer
nucleotide sequences. Sau3A (from Staphylococcus aureus strain 3A) recognizes GATC, and AluI
(Arthrobacter luteus) cuts at AGCT. There are also examples of restriction endonucleases with
degenerate recognition sequences, meaning that they cut DNA at any one of a family of related sites.
HinfI (Haemophilus influenzae strain Rf), for instance, recognizes GANTC, so cuts at GAATC,
GATTC, GAGTC, and GACTC.
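Two points from this paragraph lend themselves to a short sketch: expanding a degenerate site into the family of sequences it covers, and estimating how often a site of a given length occurs. The spacing estimate assumes random DNA with equal base frequencies, which real genomes only approximate. Function names are chosen for this example.

```python
from itertools import product

# Only the symbols needed here; N stands for any base
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}

def expand(site):
    """List every concrete sequence matched by a degenerate site."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in site))]

def expected_spacing(site):
    """Average distance (bp) between sites, assuming random DNA
    with all four bases equally frequent."""
    probability = 1.0
    for base in site:
        probability *= len(IUPAC[base]) / 4
    return round(1 / probability)

print(expand("GANTC"))             # ['GAATC', 'GACTC', 'GAGTC', 'GATTC']
print(expected_spacing("GATC"))    # 256   (Sau3A, four-base site)
print(expected_spacing("CGATCG"))  # 4096  (PvuI, six-base site)
print(expected_spacing("GANTC"))   # 256   (HinfI, four fixed positions)
```

Under this assumption a four-base cutter cuts roughly sixteen times more often than a six-base cutter.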
Modification enzymes
DNA modifying enzymes
There are numerous enzymes that modify DNA molecules by addition or removal of specific
chemical groups. The most important are as follows:
• Alkaline phosphatase (from E. coli, calf intestinal tissue, or arctic shrimp), which removes the
phosphate group present at the 5′ terminus of a DNA molecule (Figure 4.6a).
• Polynucleotide kinase (from E. coli infected with T4 phage), which has the reverse effect to alkaline
phosphatase, adding phosphate groups onto free 5′ termini.
• Terminal deoxynucleotidyl transferase (from calf thymus tissue), which adds one or more
deoxyribonucleotides onto the 3′ terminus of a DNA molecule (Figure 4.6c).
Nucleases
• Nuclease enzymes degrade nucleic acids by breaking the phosphodiester bond that holds the
nucleotides together.
• Restriction enzymes are good examples of endonucleases, which cut within a DNA strand.
• A second group of nucleases, which degrade DNA from the termini of the molecule, are
known as exonucleases.
• Apart from restriction enzymes, there are four useful nucleases that are often used in genetic
engineering:
– Bal 31 (exonuclease)
– exonuclease III (exonuclease)
– deoxyribonuclease I (DNase I) (endonuclease)
– S1 nuclease (endonuclease)
• These enzymes differ in their precise mode of action and provide the genetic engineer with a
variety of strategies for attacking DNA.

Mode of action of various nucleases.


(a) Nuclease Bal 31 is a complex enzyme. Its primary activity is a fast-acting 3′ exonuclease,
which is coupled with a slow-acting endonuclease. When Bal 31 is present at a high
concentration these activities effectively shorten DNA molecules from both termini.
(b) Exonuclease III is a 3′ exonuclease that generates molecules with protruding 5′ termini.
(c) DNase I cuts either single-stranded or double-stranded DNA at essentially random sites.
(d) Nuclease S1 is specific for single-stranded RNA or DNA.
(e) In addition to DNA-specific nucleases, there are ribonucleases (RNases), which act on RNA.
Polymerases
• Polymerase enzymes synthesise copies of nucleic acid molecules and are used in many
genetic engineering procedures.
• When describing a polymerase enzyme, the terms ‗DNA-dependent‘ or ‗RNA-dependent‘
may be used to indicate the type of nucleic acid template that the enzyme uses.
• Thus, a
– DNA-dependent DNA polymerase copies DNA into DNA,
– an RNA-dependent DNA polymerase copies RNA into DNA, and
– a DNA-dependent RNA polymerase transcribes DNA into RNA.
Commonly used DNA polymerases:
• DNA Polymerase, Large Fragment
• DNA Polymerase I
• Klenow Fragment
• Klenow Fragment, exo–
• phi29 DNA Polymerase
• T4 DNA Polymerase
• T7 DNA Polymerase
Applications
• Blunting of DNA ends: fill-in of 5'-overhangs or/and removal of 3'-overhangs
• Blunting of PCR products with 3'-dA overhangs
• Synthesis of labeled DNA probes by the replacement reaction
• Oligonucleotide-directed site-specific mutagenesis
• Ligation-independent cloning of PCR products
Reverse transcriptase (RTase) is an RNA-dependent DNA polymerase, and therefore produces
a DNA strand from an RNA template.
• It has no associated exonuclease activity.
• The enzyme is used mainly for copying mRNA molecules in the preparation of cDNA
(complementary or copy DNA) for cloning, although it will also act on DNA templates.
• Reverse transcriptase is a key enzyme in the generation of cDNA; the enzyme is an
RNA-dependent DNA polymerase, which produces a DNA copy of an mRNA molecule.
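What "RNA-dependent DNA polymerase" means at the sequence level can be shown with a toy function that returns the first-strand cDNA for a given mRNA. This is a sketch of base pairing only, not of the enzymology; the function name and the short mRNA are invented for the example.

```python
# RNA template base -> DNA base incorporated opposite it
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")

def first_strand_cdna(mrna):
    """First-strand cDNA (written 5'->3') complementary to an mRNA (5'->3')."""
    return mrna.upper().translate(RNA_TO_DNA)[::-1]

mrna = "AUGGCCAAAA"                # hypothetical mRNA ending in a short poly(A)
print(first_strand_cdna(mrna))     # TTTTGGCCAT
```

The cDNA begins with a run of T residues because synthesis starts opposite the poly(A) tail at the 3′ end of the mRNA.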

Ribonuclease H
• The enzyme RNase H is a non-sequence-specific endonuclease that catalyzes the cleavage of the
RNA strand in RNA:DNA hybrids via a hydrolytic mechanism.

Function
• It is commonly used to destroy the RNA template after first-strand complementary DNA
(cDNA) synthesis by reverse transcription, as well as in procedures such as nuclease protection
assays.
• RNase H can also be used to degrade specific RNA strands when a complementary DNA oligo is
hybridized, such as the removal of the poly(A) tail from mRNA hybridized to oligo(dT), or
the destruction of a chosen non-coding RNA inside or outside the living cell.
• To terminate the reaction, a chelator, such as EDTA, is often added to sequester the required
metal ions in the reaction mixture.
Enzymes that modify the ends (termini) of DNA molecules
• The enzymes alkaline phosphatase, polynucleotide kinase, and terminal transferase act on the
termini of DNA molecules and provide important functions that are used in a variety of ways.
• The phosphatase and kinase enzymes, as their names suggest, are involved in the removal or
addition of phosphate groups.
• Bacterial alkaline phosphatase (there is also a similar enzyme, calf intestinal alkaline
phosphatase) removes phosphate groups from the 5′ ends of DNA, leaving a 5′-OH group.
Terminal Deoxynucleotidyl Transferase
• Terminal Deoxynucleotidyl Transferase (TdT), a template-independent DNA polymerase,
catalyzes the repetitive addition of deoxyribonucleotides to the 3'-OH of
oligodeoxyribonucleotides and single-stranded and double-stranded DNA .
• TdT requires an oligonucleotide of at least three nucleotides to serve as a primer.
• With RNA as the acceptor, TdT shows variable performance, which depends strongly on the
tertiary structure of the acceptor RNA 3'-end and the nature of the nucleotide. Generally,
efficiency is lower than with a DNA acceptor.

Source
E. coli cells carrying a cloned gene encoding calf thymus terminal deoxynucleotidyl
transferase.
Highlights
• Cloned and produced in E. coli
• Excellent stability and purity compared to native TdT
• Economical
Applications
• Addition of homopolymeric tails to plasmid DNA and to cDNA
• Double- or single-stranded DNA 3´-termini labeling with radioactively labeled or non-
radioactively labeled nucleotides
• Addition of single nucleotides to the 3´ ends of DNA for in vitro mutagenesis
• Production of synthetic homo- and heteropolymers
• RACE (Rapid Amplification of cDNA Ends)
• In situ Localization of Apoptosis
• Resolving gel compressions and artifact banding in DNA sequencing
Phosphatases & Kinases
• Alkaline phosphatase and polynucleotide kinase are used for DNA and RNA dephosphorylation or
phosphorylation. Examples:
– FastAP Alkaline Phosphatase
– T4 Polynucleotide Kinase
T4 Polynucleotide Kinase
• Polynucleotide Kinase (T4 PNK) catalyzes the transfer of the gamma-phosphate from ATP to
the 5'-OH group of single- and double-stranded DNAs and RNAs, oligonucleotides or
nucleoside 3'-monophosphates (forward reaction).
• The reaction is reversible. In the presence of ADP T4 Polynucleotide Kinase exhibits 5'-
phosphatase activity and catalyzes the exchange of phosphate groups between 5'-P-oligo-
polynucleotides and ATP (exchange reaction).

Highlights
• Active in restriction enzyme, RT, and T4 DNA Ligase buffers
Applications
• Labeling of nucleic acids' 5'-termini to be used as:
– probes for hybridization
– probes for transcript mapping
– markers for gel electrophoresis
– primers for DNA sequencing
– primers for PCR
• 5'-phosphorylation of oligonucleotide, PCR products, other DNA or RNA prior to ligation
• Phosphorylation of PCR primers
• Detection of DNA modification by the [32P]-postlabeling assay
• Removal of 3'-phosphate groups
Alkaline phosphatase
• Alkaline phosphatase is a hydrolase enzyme responsible for removing phosphate
groups from many types of molecules, including nucleotides, proteins, and alkaloids.
The process of removing the phosphate group is called dephosphorylation.
Use in research
• A typical use of alkaline phosphatases in the lab is removing the 5′-phosphate groups
from cut vector DNA to prevent self-ligation.
• Common alkaline phosphatases used in research include:
– Shrimp alkaline phosphatase (SAP), from a species of Arctic shrimp (Pandalus
borealis)
– Calf-intestinal alkaline phosphatase (CIP)
– Placental alkaline phosphatase (PALP) and its C-terminally truncated version that
lacks the last 24 amino acids (constituting the domain that targets for GPI membrane
anchoring), the secreted alkaline phosphatase (SEAP)
Design of linkers and adaptors
Sticky ends and blunt ends
For the reasons detailed in the preceding section, compatible sticky ends are desirable on the
DNA molecules to be ligated together in a gene cloning experiment. Often these sticky ends can
be provided by digesting both the vector and the DNA to be cloned with the same restriction
endonuclease, or with different enzymes that produce the same sticky end, but it is not always
possible to do this. A common situation is where the vector molecule has sticky ends, but the
DNA fragments to be cloned are blunt-ended. Under these circumstances one of three methods
can be used to put the correct sticky ends onto the DNA fragments.
Linkers
The first of these methods involves the use of linkers. These are short pieces of double-stranded
DNA, of known nucleotide sequence, that are synthesized in the test tube.

A typical linker is shown in Figure 4.21a. It is blunt-ended, but contains a restriction site, BamHI
in the example shown. DNA ligase can attach linkers to the ends of larger blunt-ended DNA
molecules. Although a blunt end ligation, this particular reaction can be performed very
efficiently because synthetic oligonucleotides, such as linkers, can be made in very large
amounts and added into the ligation mixture at a high concentration. More than one linker will
attach to each end of the DNA molecule, producing the chain structure shown in Figure 4.21b.
However, digestion with BamHI cleaves the chains at the recognition sequences, producing a
large number of cleaved linkers and the original DNA fragment, now carrying BamHI sticky
ends. This modified fragment is ready for ligation into a cloning vector restricted with BamHI.
• Synthetic, short, double-stranded oligonucleotides of known sequence.
• Blunt-ended on both sides and containing a restriction site.
• After ligation to the target DNA, treatment with the restriction enzyme produces sticky ends.
• e.g. a linker carrying a site for BamHI.
Adaptors
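• A synthetic double-stranded oligonucleotide with one blunt end and one sticky end.
• The blunt end is ligated to the blunt ends of the target DNA to produce a new DNA molecule with
sticky ends.
• Problem: the sticky ends of adaptors can base pair with each other and be ligated into dimers.
• Solution: the sticky end of the adaptor carries a 5′-OH instead of a 5′-P (for example, after
treatment with alkaline phosphatase), so adaptors cannot be ligated to each other.
• After attachment to the target DNA, treatment with polynucleotide kinase restores the
5′-phosphate group.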

There is one potential drawback with the use of linkers. Consider what would happen if the
blunt-ended molecule shown in Figure 4.21b contained one or more BamHI recognition
sequences. If this was the case, the restriction step needed to cleave the linkers and produce the
sticky ends would also cleave the blunt-ended molecule (Figure 4.22). The resulting fragments
will have the correct sticky ends, but that is no consolation if the gene contained in the blunt-
ended fragment has now been broken into pieces.
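The linker procedure and its drawback can be simulated on the top strand. In this sketch the linker sequence is an illustrative blunt-ended oligonucleotide containing a BamHI site, and both insert sequences are invented; only one linker per end is modelled, whereas in practice chains of linkers attach.

```python
BAMHI_SITE, BAMHI_CUT = "GGATCC", 1      # BamHI cuts G/GATCC
LINKER = "CCGGATCCGG"                    # illustrative linker with one BamHI site

def cut(seq, site=BAMHI_SITE, offset=BAMHI_CUT):
    """Top-strand fragments after digesting a linear molecule."""
    pieces, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        pieces.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    pieces.append(seq[start:])
    return pieces

def clone_with_linkers(insert):
    """Ligate one linker to each blunt end, then digest with BamHI."""
    ligated = LINKER + insert.upper() + LINKER
    return cut(ligated)

ok = clone_with_linkers("ATGAAACCCTTTTAG")       # no internal BamHI site
bad = clone_with_linkers("ATGAAAGGATCCTTTTAG")   # internal BamHI site
print(len(ok), ok[1])    # 3 fragments; the middle one carries the whole insert
print(len(bad))          # 4 fragments; the insert itself has been cut
```

Checking the insert for internal sites before choosing a linker is therefore a sensible first step; when a site is present, adaptors or a different linker avoid the problem.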
The second method of attaching sticky ends to a blunt-ended molecule is designed to avoid this
problem. Adaptors, like linkers, are short synthetic oligonucleotides. But unlike linkers, an
adaptor is synthesized so that it already has one sticky end (Figure 4.23a). The idea is of course
to ligate the blunt end of the adaptor to the blunt ends of the DNA fragment, to produce a new
molecule with sticky ends. This may appear to be a simple method but in practice a new problem
arises. The sticky ends of individual adaptor molecules could base pair with each other to form
dimers (Figure 4.23b), so that the new DNA molecule is still blunt-ended (Figure 4.23c). The
sticky ends could be recreated by digestion with a restriction endonuclease, but that would defeat
the purpose of using adaptors in the first place. The answer to the problem lies in the precise
chemical structure of the ends of the adaptor molecule. Normally the two ends of a
polynucleotide strand are chemically distinct, a fact that is clear from a careful examination of
the polymeric structure of DNA (Figure 4.24a). One end, referred to as the 5′ terminus, carries a
phosphate group (5′-P); the other, the 3′ terminus, has a hydroxyl group (3′-OH). In the double
helix the two strands are antiparallel (Figure 4.24b), so each end of a double-stranded molecule
consists of one 5′-P terminus and one 3′-OH terminus. Ligation takes place between the 5′-P and
3′-OH ends (Figure 4.24c). Adaptor molecules are synthesized so that the blunt end is the same
as "natural" DNA, but the sticky end is different. The 3′-OH terminus of the sticky end is the
same as usual, but the 5′-P terminus is modified: it lacks the phosphate group, and is in fact a 5′-
OH terminus (Figure 4.25a). DNA ligase is unable to form a phosphodiester bridge between 5′-
OH and 3′-OH ends. The result is that, although base pairing is always occurring between the
sticky ends of adaptor molecules, the association is never stabilized by ligation (Figure 4.25b).
Adaptors can therefore be ligated to a blunt-ended DNA molecule but not to themselves. After
the adaptors have been attached, the abnormal 5′-OH terminus is converted to the natural 5′-P
form by treatment with the enzyme polynucleotide kinase.
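The chemical rule that makes adaptors work is that ligase needs a 5′-P next to a 3′-OH. It can be captured in a very small model. The function names and the string labels "P" and "OH" are conventions invented for this sketch.

```python
def can_ligate(five_prime_end, three_prime_end):
    """DNA ligase joins a 5'-phosphate to a 3'-hydroxyl, and nothing else."""
    return five_prime_end == "P" and three_prime_end == "OH"

def kinase(five_prime_end):
    """Polynucleotide kinase converts a 5'-OH terminus into 5'-P."""
    return "P" if five_prime_end == "OH" else five_prime_end

adaptor_blunt_5 = "P"      # blunt end: same as natural DNA
adaptor_sticky_5 = "OH"    # sticky end: modified, lacks the phosphate

# Blunt end of the adaptor to the 3'-OH of the target fragment: ligates
print(can_ligate(adaptor_blunt_5, "OH"))            # True
# Sticky end of one adaptor to the sticky end of another: no ligation
print(can_ligate(adaptor_sticky_5, "OH"))           # False
# After kinase treatment the sticky end can be ligated into the vector
print(can_ligate(kinase(adaptor_sticky_5), "OH"))   # True
```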
Producing sticky ends by homopolymer tailing
• Homopolymer: A strand composed of one type of nucleotide.
• Homopolymer tailing: the in vitro addition of the same nucleotide by the enzyme terminal
deoxynucleotidyl transferase (from calf thymus) to the 3′-OH of a duplex DNA molecule.
• e.g. complementary poly(dC) and poly(dG) tails for vector and target DNA respectively.
The technique of homopolymer tailing offers a radically different approach to the
production of sticky ends on a blunt-ended DNA molecule. A homopolymer is simply a
polymer in which all the subunits are the same. A DNA strand made up entirely of, say,
deoxyguanosine is an example of a homopolymer, and is referred to as polydeoxyguanosine
or poly(dG). Tailing involves using the enzyme terminal deoxynucleotidyl transferase (p. 50)
to add a series of nucleotides onto the 3′-OH termini of a double-stranded DNA molecule. If
this reaction is carried out in the presence of just one deoxyribonucleotide, a homopolymer
tail is produced (Figure 4.26a). Of course, to be able to ligate together two tailed molecules,
the homopolymers must be complementary. Frequently polydeoxycytosine (poly(dC)) tails
are attached to the vector and poly(dG) to the DNA to be cloned. Base pairing between the
two occurs when the DNA molecules are mixed (Figure 4.26b). In practice, the poly(dG) and
poly(dC) tails are not usually exactly the same length, and the base-paired recombinant
molecules that result have nicks as well as discontinuities (Figure 4.26c). Repair is therefore
a two-step process, using Klenow polymerase to fill in the nicks followed by DNA ligase to
synthesize the final phosphodiester bonds. This repair reaction does not always have to be
performed in the test tube. If the complementary homopolymer tails are longer than about 20
nucleotides, then quite stable base-paired associations are formed. A recombinant DNA
molecule, held together by base pairing although not completely ligated, is often stable
enough to be introduced into the host cell in the next stage of the cloning experiment (see
Figure 1.1). Once inside the host, the cell's own DNA polymerase and DNA ligase repair the
recombinant DNA molecule, completing the construction begun in the test tube.
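A toy model of tailing and annealing makes the length argument concrete. The sequences, tail lengths and function names are hypothetical; the 20-nucleotide threshold is the approximate figure quoted in the text, not a precise rule.

```python
def tail(strand, base, n):
    """Add n copies of one nucleotide to the 3' end (strand written 5'->3')."""
    return strand.upper() + base * n

def anneal(c_tail_length, g_tail_length):
    """Return (base pairs formed, unpaired nucleotides left as a gap)."""
    paired = min(c_tail_length, g_tail_length)
    gap = abs(c_tail_length - g_tail_length)
    return paired, gap

def stable_without_ligation(paired, threshold=20):
    """Tails longer than about 20 nt pair stably enough for transformation."""
    return paired > threshold

vector_strand = tail("GATTACA", "C", 25)   # poly(dC) tail on the vector
insert_strand = tail("ATGCCC", "G", 30)    # poly(dG) tail on the insert
paired, gap = anneal(25, 30)
print(vector_strand[-25:] == "C" * 25)     # True
print(paired, gap)                         # 25 5
print(stable_without_ligation(paired))     # True
```

The gap of five nucleotides is what the polymerase fills before ligase seals the remaining break, either in the test tube or inside the host cell.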
Characteristics of cloning and expression vectors based on
plasmid and bacteriophage
Characteristics of plasmid and Vector
Vector

Vector is an agent that can carry a DNA fragment into a host cell.

Classification based on the purpose

If it is used for reproducing the DNA fragment, it is called a "cloning vector".


If it is used for expressing certain gene in the DNA fragment, it is called an "expression vector".

Criteria involved in selecting of cloning vector


• Objective of cloning
• Ease of working
• Knowledge about the vector
• Suitability
• Reliability
Features of Vector Design
• A plasmid origin of replication
• An antibiotic resistance marker
• A strong regulatable promoter
• An efficient translation initiation site
• An efficient E. coli signal sequence for periplasmic production
Cloning Vectors
• A cloning vector is a plasmid that can be modified to carry new genes.
• Plasmids useful as cloning vectors must have:
– An origin of replication.
– A selectable marker (antibiotic resistance gene, such as ampR and tetR).
– Multiple cloning site (MCS) (site where insertion of foreign DNA will not disrupt
replication or inactivate essential markers).
– Easy to purify away from host DNA.
• A vector is used to amplify a single molecule of DNA into many copies. A DNA
fragment must be inserted into a cloning vector.
• A cloning vector is a DNA molecule that has an origin of replication and is capable of
replicating in a bacterial cell.
• Most vectors are genetically engineered plasmids or phages. There are also cosmid
vectors, bacterial artificial chromosomes, and yeast artificial chromosomes.
Expression vector
• Expression vectors require translation of the vector's insert, thus requiring more
components than simpler transcription-only vectors.
• Expression in different host organism would require different elements,
– a promoter for initiation of transcription,
– a ribosomal binding site for translation initiation, and
– termination signals.
Prokaryotes expression vector
• Promoter - commonly-used inducible promoters are promoters derived from lac operon
and the T7 promoter.
• Translation initiation site - Shine-Dalgarno sequence at the ribosomal binding site.
Eukaryotes expression vector
• Eukaryotic expression vectors require sequences that encode:
• Polyadenylation signal: directs addition of a poly(A) tail to the end of the transcribed
pre-mRNA, which protects the mRNA from exonucleases, ensures transcriptional and
translational termination, and so stabilizes the mRNA.
• Minimal UTR length: UTRs contain specific characteristics that may impede
transcription or translation, and thus the shortest UTRs or none at all are encoded for in
optimal expression vectors.
• Kozak sequence: Vectors should encode for a Kozak sequence in the mRNA, which
assembles the ribosome for translation of the mRNA.

Types of vectors
Vector                             Insert size (kb)
Bacterial plasmid                  6-12
Bacteriophage                      25
Cosmids                            35
Bacterial artificial chromosome    300
Yeast artificial chromosome        200-1000
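If the figures in this list are read as approximate upper limits on insert size in kb (an interpretation assumed here, since the list itself gives no column heading), choosing a vector class for a given fragment becomes a simple lookup. The dictionary and function name are made up for the example.

```python
# Upper figures from the list above, read as approximate insert capacity in kb
# (assumption: the list does not label its numbers explicitly)
CAPACITY_KB = {
    "Bacterial plasmid": 12,
    "Bacteriophage": 25,
    "Cosmids": 35,
    "Bacterial artificial chromosome": 300,
    "Yeast artificial chromosome": 1000,
}

def suitable_vectors(insert_kb):
    """Vector classes whose assumed capacity is at least the insert size."""
    return [name for name, cap in CAPACITY_KB.items() if insert_kb <= cap]

print(suitable_vectors(8))     # every class can hold an 8 kb fragment
print(suitable_vectors(30))    # cosmids, BACs and YACs
print(suitable_vectors(500))   # only YACs
```

In practice the smallest vector that fits is usually preferred, because smaller constructs are easier to handle and transform.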
Type of vector               Target cell
1. Plasmid                   Bacteria
2. Bacteriophages            Bacteria
3. Cosmid                    Bacteria
4. Viruses                   Animal cell
5. Yeast cloning vectors     Yeast
6. Ti and Ri plasmids        Plants
Plasmid
A plasmid is a circular, self-replicating DNA molecule carrying a few useful but non-essential
genes.
Other names
• Cloning vector
• Vector
• Cloning vehicle
• Carrier DNA

Occurrence
• Prokaryotic organisms
• Eukaryotic organisms such as Entamoeba histolytica, yeast, etc.
Size
Plasmid size varies from 1 kbp to over 400 kbp.
Copy number (usual copy number 1-50)
Plasmids are described as low, moderate or high copy number plasmids.
Advantages
Plasmids are easy to manipulate and isolate from bacteria (kits).
After being modified, they can be integrated into other genomes, plants, protists, mammals,
thereby conferring to other organisms whatever genetic functionality they carry.

Desirable properties of plasmid cloning vehicles


An ideal cloning vehicle would have the following three properties:
• low molecular weight;
• ability to confer readily selectable phenotypic traits on host cells;
• single sites for a large number of restriction endonucleases, preferably in genes with a readily
scorable phenotype.
The advantages of a low molecular weight are several. First, the plasmid is much easier to
handle, i.e. it is more resistant to damage by shearing, and is readily isolated from host cells.
Secondly, low-molecular-weight plasmids are usually present as multiple copies (see Table 4.2),
and this not only facilitates their isolation but leads to gene dosage effects for all cloned genes.
Finally, with a low molecular weight there is less chance that the vector will have multiple
substrate sites for any restriction endonuclease (see below). After a piece of foreign DNA is
inserted into a vector, the resulting chimeric molecules have to be transformed into a suitable
recipient. Since the efficiency of transformation is so low, it is essential that the chimeras have
some readily scorable phenotype. Usually this results from some gene, e.g. antibiotic resistance,
carried on the vector, but could also be produced by a gene carried on the inserted DNA.
General properties of plasmid vectors:
• Size: small, since transformation efficiency increases as size decreases
• Copy number: multiple copies
• Genetic marker: one or more, for identification of the recombinants
• Origin of replication: its own origin of replication
• Unique restriction sites: polylinkers or multiple cloning sites (MCS)
• Ease of screening (lacZ)
• Insertional inactivation
• Pathogenicity
Specific properties of plasmid vectors:
• they must be appropriate to the host, e.g. ori, selection
• they must be appropriate to your objectives:
– expression or non-expression
– expression of RNA or protein
– integration or non-integration
– low or high copy number
– ease of sequencing
– facilitating library construction ("the size problem")
– working in different species (shuttle vectors)
Plasmids are classified
1. by their ability to be transferred to other bacteria
Conjugative
• High molecular weight and low copy number; tra genes present.
• Conjugation is the sexual transfer of plasmids to another bacterium through a pilus.
• These plasmids possess the 25 genes required for transfer.
• Example: F plasmid
Non-conjugative
• Low molecular weight and high copy number; tra genes absent.
• Non-conjugative plasmids don't initiate conjugation.
• They can only be transferred with the help of conjugative plasmids.
• Example: ColE1 plasmid

2. By function
1. Fertility (F) plasmids: capable of conjugation (they contain the genes for the pili).
2. Resistance (R) plasmids: contain gene(s) that confer resistance against one or
several antibiotics or poisons.
3. Col plasmids: contain genes coding for colicins, proteins that can kill other bacteria.
4. Degradative plasmids: enable the host to digest unusual substances, e.g. toluene or salicylic acid.
5. Virulence plasmids: turn a bacterium into a pathogen.
6. Cryptic plasmids: plasmids that have no known function.
7. Tumour-inducing plasmids: cause tumour formation in plants.
8. Addiction system. These plasmids produce both a long-lived poison and a short-lived
antidote. Daughter cells that retain a copy of the plasmid survive, while a daughter cell
that fails to inherit the plasmid dies or suffers a reduced growth-rate because of the
lingering poison from the parent cell.
Classification based on copy number
Relaxed plasmid – multiple copies per cell (often more than 100)
A plasmid that replicates independently of the main bacterial chromosome and is present
in 10-500 copies per cell.
Example: pUC19
Stringent plasmid – limited number of copies per cell (1-4 copies per cell)
A plasmid that only replicates along with the main bacterial chromosome and is present
as a single copy, or at most several copies, per cell.
Example: pSC101
Based on replication strategy
A few types of plasmid are also able to replicate by inserting themselves into the bacterial
chromosome (Figure 2.3b). These integrative plasmids or episomes may be stably maintained in
this form through numerous cell divisions, but always at some stage exist as independent
elements.

Conformation
In bacteria, plasmids are found in the supercoiled form.
Supercoiled and relaxed forms can be distinguished under the electron microscope, by
electrophoresis or after centrifugation. The supercoiled form migrates faster in electrophoresis
and sediments faster in centrifugation. Topoisomerases convert one form into the other, as
described earlier.
Host range of plasmid
• Plasmids encode only a few of the proteins required for their own replication and in many
cases encode only one of them. All the other proteins required for replication, e.g. DNA
polymerases, DNA ligase, helicases, etc., are provided by the host cell.
• The genes for those replication proteins that are plasmid-encoded are located very close to
the ori (origin of replication) sequences at which the proteins act.
Thus, only a small region surrounding the ori site is required for replication. Other parts of the
plasmid can be deleted and foreign sequences can be added to the plasmid and replication will
still occur.
This feature of plasmids has greatly simplified the construction of versatile cloning vectors.
The host range of a plasmid is determined by its ori region.
Plasmids whose ori region is derived from plasmid Col E1 have a restricted host range: they
only replicate in enteric bacteria, such as E. coli, Salmonella, etc.
Other promiscuous plasmids have a broad host range and these include RP4 and
RSF1010.
• Plasmids of the RP4 type will replicate in most Gram-negative bacteria, to which they are
readily transmitted by conjugation. Such promiscuous plasmids offer the potential of
readily transferring cloned DNA molecules into a wide range of genetic backgrounds.
• Plasmids like RSF1010 are not conjugative but can be transformed into a wide range of
Gram-negative and Gram-positive bacteria, where they are stably maintained.
• Many of the plasmids isolated from Staphylococcus aureus also have a broad host range
and can replicate in many other Gram-positive bacteria.
• Plasmids with a broad host range encode most, if not all, of the proteins required for
replication. They must also be able to express these genes and thus their promoters and
ribosome binding sites must have evolved such that they can be recognized in a
diversity of bacterial families.
Incompatibility of plasmids
Plasmid incompatibility is the inability of two different plasmids to coexist in the same cell in the
absence of selection pressure.
The term incompatibility can only be used when it is certain that entry of the second plasmid has
taken place and that DNA restriction is not involved.
Groups of plasmids which are mutually incompatible are considered to belong to the same
incompatibility (Inc) group.
Over 30 incompatibility groups have been defined in E. coli and 13 for plasmids of S. aureus.
Plasmids will be incompatible if they have the same mechanism of replication
control. Not surprisingly, by changing the sequence of the RNA I/RNA II region of plasmids
with antisense control of copy number, it is possible to change their incompatibility group.
Alternatively, they will be incompatible if they share the same par (partitioning) region.
Difference between genomic DNA / chromosomal DNA / plasmid DNA
• Genomic DNA: the entire complement of DNA contained within a cell. For a bacterial
cell, genomic DNA therefore includes plasmids.
• In short, genomic DNA = chromosomal DNA + plasmid DNA (+ mitochondrial and
chloroplast DNA, if any).
• Chromosomal DNA: the DNA of the chromosome(s), found in the nucleus of eukaryotic
cells and in the nucleoid of bacteria.
• Plasmid DNA: extrachromosomal, self-replicating, autonomous DNA present in the cell.
Classification of plasmid vectors
  - Natural plasmid vectors
  - Constructed (artificial) plasmid vectors
Natural plasmid vectors
  - Used for cloning without any modification.
Most natural plasmids cannot be used for gene cloning because:
  - They are large in size.
  - They have no genetic markers.
  - They have no unique sites for common restriction enzymes in the marker gene.
  - They may confer pathogenicity on the host.
Constructed plasmid vectors
To overcome these drawbacks of natural plasmids, constructed (derived or artificial) plasmid
vectors are used. These are the vectors most widely used in gene transfer.
Examples:
• pBR322, pBR325, pBR327, pBR328
• pUC8, pUC18
• pGEM3Z
pBR322

• Although pBR322 lacks the more sophisticated features of the newest cloning vectors,
and so is no longer used extensively in research, it still illustrates the important,
fundamental properties of any plasmid cloning vector.
• We will therefore begin our study of E. coli vectors by looking more closely at pBR322.
• The name "pBR322" conforms with the standard rules for vector nomenclature:
• "p" indicates that this is indeed a plasmid.
• "BR" identifies the laboratory in which the vector was originally constructed
(BR stands for Bolivar and Rodriguez, the two researchers who developed pBR322).
• "322" distinguishes this plasmid from others developed in the same laboratory (there are
also plasmids called pBR325, pBR327, pBR328, etc.).
• The genetic and physical map of pBR322 gives an indication of why this plasmid was
such a popular cloning vector.
• The first useful feature of pBR322 is its size. In general a vector ought to be less than 10
kb in size, to avoid problems such as DNA breakdown during purification. pBR322 is
4363 bp, which means that not only can the vector itself be purified with ease, but so can
recombinant DNA molecules constructed with it. Even with 6 kb of additional DNA, a
recombinant pBR322 molecule is still a manageable size.
• The second feature of pBR322 is that, it carries two sets of antibiotic resistance genes.
• Either ampicillin or tetracycline resistance can be used as a selectable marker for cells
containing the plasmid, and each marker gene includes unique restriction sites that can be
used in cloning experiments.
• Insertion of new DNA into pBR322 that has been restricted with PstI, PvuI, or ScaI
inactivates the ampR gene, and insertion using any one of eight restriction endonucleases
(notably BamHI and HindIII) inactivates tetracycline resistance.
• This great variety of restriction sites that can be used for insertional inactivation means
that pBR322 can be used to clone DNA fragments with any of several kinds of sticky end.
• A third advantage of pBR322 is that it has a reasonably high copy number.
• Generally there are about 15 molecules present in a transformed E. coli cell, but this
number can be increased, up to 1000–3000, by plasmid amplification in the presence of a
protein synthesis inhibitor such as chloramphenicol
• An E. coli culture therefore provides a good yield of recombinant pBR322 molecules.
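The insertional-inactivation logic described above can be sketched in Python. This is a minimal illustration, not a full plasmid map: the table lists only the enzymes named in the text, and the function and variable names are hypothetical.

```python
# Marker gene inactivated by cloning into each pBR322 site named in the text.
# Only the enzymes listed above are included; this is not a complete map.
PBR322_SITE_TO_MARKER = {
    "PstI": "ampR",
    "PvuI": "ampR",
    "ScaI": "ampR",
    "BamHI": "tetR",
    "HindIII": "tetR",
}


def recombinant_phenotype(enzyme):
    """Expected resistance phenotype of a cell carrying pBR322 with an
    insert at the given site (True means the cell is resistant)."""
    inactivated = PBR322_SITE_TO_MARKER[enzyme]
    return {
        "ampicillin": inactivated != "ampR",
        "tetracycline": inactivated != "tetR",
    }


def is_recombinant(enzyme, grows_on_amp, grows_on_tet):
    """True if a colony's growth pattern matches a plasmid carrying an
    insert, rather than a self-ligated vector resistant to both drugs."""
    expected = recombinant_phenotype(enzyme)
    return (grows_on_amp, grows_on_tet) == (
        expected["ampicillin"],
        expected["tetracycline"],
    )
```

For a BamHI cloning experiment, for example, `is_recombinant("BamHI", True, False)` holds for colonies that grow on ampicillin but not on tetracycline, whereas colonies resistant to both antibiotics carry the vector without an insert.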
The pedigree of pBR322

• The remarkable convenience of pBR322 as a cloning vector did not arise by chance.
• The plasmid was in fact designed in such a way that the final construct would possess
these desirable properties.
• An outline of the scheme used to construct pBR322 is shown in Figure 6.2a.
• It can be seen that its production was a tortuous business that required full and skilful
use of DNA manipulative techniques.
• A summary of the result of these manipulations is provided in Figure 6.2b, from which it
can be seen that pBR322 comprises DNA derived from three different naturally occurring
plasmids.
• The ampR gene originally resided on the plasmid R1, a typical antibiotic resistance
plasmid that occurs in natural populations of E. coli (p. 17).
• The tetR gene is derived from R6-5, a second antibiotic resistance plasmid.
• The replication origin of pBR322, which directs multiplication of the vector in host cells,
is originally from pMB1, which is closely related to the colicin-producing plasmid ColE1.

pBR327

• pBR322 was developed in the late 1970s, the first research paper describing its use being
published in 1977.
• Since then many other plasmid cloning vectors have been constructed, the majority of
these derived from pBR322 by manipulations similar to those summarized in Figure 6.2a.
• One of the first of these was pBR327, which was produced by removing a 1089 bp
segment from pBR322.
• This deletion left the ampR and tetR genes intact, but changed the replicative and
conjugative abilities of the resulting plasmid.
• As a result, pBR327 differs from pBR322 in two important ways:
• pBR327 has a higher copy number than pBR322, being present at about 30–45
molecules per E. coli cell.
• This is not of great relevance as far as plasmid yield is concerned, as both plasmids can
be amplified to copy numbers greater than 1000.
• However, the higher copy number of pBR327 in normal cells makes this vector more
suitable if the aim of the experiment is to study the function of the cloned gene. In these
cases gene dosage becomes important, because the more copies there are of a cloned gene,
the more likely it is that its effect on the host cell will be detectable.
• The deletion also destroys the conjugative ability of pBR322, making pBR327 a non-
conjugative plasmid that cannot direct its own transfer to other E. coli cells.
• This is important for biological containment, averting the possibility of a recombinant
pBR327 molecule escaping from the test tube and colonizing bacteria in the gut of a
careless molecular biologist.
• In contrast, pBR322 could theoretically be passed to natural populations of E. coli by
conjugation, though in fact pBR322 also has safeguards (though less sophisticated ones)
to minimize the chances of this happening. pBR327 is, however, preferable if the cloned
gene is potentially harmful should an accident occur.

pUC series
• pUC8—a Lac selection plasmid
• pUC8 is descended from pBR322, although only the replication origin and the ampR
gene remain.
• The nucleotide sequence of the ampR gene has been changed so that it no longer contains
the unique restriction sites: all these cloning sites are now clustered into a short segment
of the lacZ′ gene carried by pUC8.
• pUC8 has three important advantages that have led to it becoming one of the most
popular E. coli cloning vectors.
• The first of these is fortuitous: the manipulations involved in construction of pUC8 were
accompanied by a chance mutation, within the origin of replication, which results in the
plasmid having a copy number of 500–700 even before amplification.
• This has a significant effect on the yield of cloned DNA obtainable from E. coli cells
transformed with recombinant pUC8 plasmids.
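• The second advantage is that identification of recombinant cells can be achieved by a
single step process, by plating onto agar medium containing ampicillin plus X-gal (p. 79).
With both pBR322 and pBR327, selection of recombinants is a two-step procedure,
requiring replica plating from one antibiotic medium to another.
• The third advantage is the clustering of the restriction sites, which allows a DNA fragment
with two different sticky ends to be cloned directly.
• Other vectors in the pUC series carry different combinations of restriction sites and provide
even greater flexibility in the types of DNA fragment that can be cloned.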
• Furthermore, the restriction site clusters in these vectors are the same as the clusters in
the equivalent M13mp series of vectors.
• DNA cloned into a member of the pUC series can therefore be transferred directly to its
M13mp counterpart, enabling the cloned gene to be obtained as single-stranded DNA
pGEM3Z

• pGEM3Z (Figure 6.4a) is very similar to a pUC vector: it carries the ampR and lacZ′
genes, the latter containing a cluster of restriction sites, and it is almost exactly the same
size. The distinction is that pGEM3Z has two additional short pieces of DNA, each of which
acts as the recognition site for attachment of an RNA polymerase enzyme.
• These two promoter sequences lie on either side of the cluster of restriction sites used for
introduction of new DNA into the pGEM3Z molecule.
• This means that if a recombinant pGEM3Z molecule is mixed with purified RNA
polymerase in the test tube, transcription occurs and RNA copies of the cloned fragment
are synthesized (Figure 6.4b).
• The RNA that is produced could be used as a hybridization probe or might be required
for experiments aimed at studying RNA processing (e.g., the removal of introns) or
protein synthesis.
• The promoters carried by pGEM3Z and other vectors of this type are not the standard
sequences recognized by the E. coli RNA polymerase.
• Instead, one of the promoters is specific for the RNA polymerase coded by T7
bacteriophage and the other for the RNA polymerase of SP6 phage.
• These RNA polymerases are synthesized during infection of E. coli with one or other of
the phages and are responsible for transcribing the phage genes.
• They are chosen for use in in vitro transcription as they are very active enzymes:
remember that the entire lytic infection cycle takes only 20 minutes (p. 18), so the
phage genes must be transcribed very quickly.
Phage vectors
Bacteriophages, or phages as they are commonly known, are viruses that specifically infect
bacteria.
• Like all viruses, phages are very simple in structure, consisting merely of a DNA (or
occasionally ribonucleic acid (RNA)) molecule carrying a number of genes, including
several for replication of the phage, surrounded by a protective coat or capsid made up of
protein molecules
The two main types of phage structure:
(a) head-and tail (e.g. λ); (b) filamentous (e.g. M13).
Phage Cloning Vectors

• Fragments of up to about 23 kb can be accommodated by a phage vector.
• Lambda (λ) is the most commonly used phage.
• About 60% of the genome is needed for the lytic pathway.
• Segments of the lambda DNA are removed and a stuffer fragment is put in.
• The stuffer fragment keeps the vector at a correct size and carries marker genes that are
removed when foreign DNA is inserted into the vector.
• Example: Charon 4A lambda
• When Charon 4A lambda is intact, beta-galactosidase reacts with X-gal and the plaques
turn blue.
• When the DNA segment replaces the stuffer region, the lac5 gene, which codes for
beta-galactosidase, is missing; no beta-galactosidase is formed and the plaques are white.

Insertion vectors
Replacement vectors
• Replace the nonessential region of the phage genome with exogenous DNA
• High transformation efficiency (about 1000 times higher than with a plasmid)
A λ replacement vector has two recognition sites for the restriction endonuclease used for
cloning. These sites flank a segment of DNA that is replaced by the DNA to be cloned (Figure
6.13a). Often the replaceable fragment (or “stuffer fragment” in cloning jargon) carries
additional restriction sites that can be used to cut it up into small pieces, so that its own
re-insertion during a cloning experiment is very unlikely. Replacement vectors are generally
designed to carry larger pieces of DNA than insertion vectors can handle. Recombinant selection
is often on the basis of size, with non-recombinant vectors being too small to be packaged into
λ phage heads. An example of a replacement vector is λEMBL4, which can carry up to 20 kb of
inserted DNA by replacing a segment flanked by pairs of EcoRI, BamHI, and SalI sites. Any of
these three restriction endonucleases can be used to remove the stuffer fragment, so DNA
fragments with a variety of sticky ends can be cloned. Recombinant selection with λEMBL4 can
be on the basis of size, or can utilize the Spi phenotype.
High-capacity vectors enable genomic libraries to be constructed
The main use of all λ-based vectors is to clone DNA fragments that are too long to be handled by
plasmid or M13 vectors. A replacement vector, such as λEMBL4, can carry up to 20 kb of new
DNA, and some cosmids can manage fragments up to 40 kb. This compares with a maximum
insert size of about 8 kb for most plasmids and less than 3 kb for M13 vectors. The ability to
clone such long DNA fragments means that genomic libraries can be generated. A genomic
library is a set of recombinant clones that contains all of the DNA present in an individual
organism. An E. coli genomic library, for example, contains all the E. coli genes, so any desired
gene can be withdrawn from the library and studied. Genomic libraries can be retained for many
years, and propagated so that copies can be sent from one research group to another. The big
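question is how many clones are needed for a genomic library? The answer can be calculated
with the formula N = ln(1 − P) / ln(1 − a/b), where N is the number of clones required, P is the
probability that any given gene will be present, a is the average size of the DNA fragments
inserted into the vector, and b is the total size of the genome.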
For humans and other mammals
several hundred thousand clones are required. It is by no means impossible to obtain several
hundred thousand clones, and the methods used to identify a clone carrying a desired gene
(Chapter 8) can be adapted to handle such large numbers, so genomic libraries of these sizes are
by no means unreasonable. However, ways of reducing the number of clones needed for
mammalian genomic libraries are continually being sought. One solution is to develop new
cloning vectors able to handle longer DNA inserts.

The most popular of these vectors are bacterial artificial chromosomes (BACs), which are
based on the F plasmid (p. 16). The F plasmid is relatively large and vectors derived from it have
a higher capacity than normal plasmid vectors. BACs can handle DNA inserts up to 300 kb in
size, reducing the size of the human genomic library to just 30,000 clones. Other high-capacity
vectors have been constructed from bacteriophage P1, which has the advantage over λ of being
able to squeeze 110 kb of DNA into its capsid structure. Cosmid-type vectors based on P1 have
been designed and used to clone DNA fragments ranging in size from 75 to 100 kb. Vectors that
combine the features of P1 vectors and BACs, called P1-derived artificial chromosomes (PACs),
also have a capacity of up to 300 kb.
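The effect of insert size on the number of clones needed for a genomic library can be estimated with a short Python sketch. It uses the standard relation N = ln(1 − P) / ln(1 − a/b). The human genome size of 3,200,000 kb, the probability P = 0.95 and the function name are assumptions made for illustration.

```python
import math


def clones_needed(insert_kb, genome_kb, probability=0.95):
    """Number of clones N needed so that any given sequence is present
    with the stated probability: N = ln(1 - P) / ln(1 - a/b),
    where a is the insert size and b is the genome size."""
    if not 0 < probability < 1:
        raise ValueError("probability must be between 0 and 1")
    if not 0 < insert_kb < genome_kb:
        raise ValueError("insert must be smaller than the genome")
    fraction = insert_kb / genome_kb
    # log1p(-x) computes ln(1 - x) accurately when x is very small
    return math.ceil(math.log(1 - probability) / math.log1p(-fraction))


HUMAN_GENOME_KB = 3_200_000  # assumed haploid genome size (about 3.2 x 10^9 bp)

for vector, insert_kb in [("lambda replacement", 20), ("cosmid", 40), ("BAC", 300)]:
    print(vector, insert_kb, "kb:", clones_needed(insert_kb, HUMAN_GENOME_KB))
```

Under these assumptions a 20 kb insert requires several hundred thousand clones, while a 300 kb BAC insert brings the figure down to roughly 32,000, the same order as the 30,000 quoted above.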
M13 Phage vector

M13 is an example of a filamentous phage (see Figure 2.5b) and is completely different in
structure from λ.
M13, a single-stranded filamentous phage
• Phage DNA is packaged in the core of a helical particle
• The length of the particle is dependent on the length of the DNA
• In all M13 preparations the following occur:
  - polyphage: more than one genome length
  - miniphage: 0.2–0.5 genome length
  - maxiphage: genetically defective but more than one genome length
• Molecular biologists use this to create cloning vectors
• Long stretches of DNA can be inserted into non-essential regions
• There is no sharp cut-off in the length that can be packaged
• Packaging efficiency decreases somewhat with increasing length
• Genomes 10% longer are not affected; genomes 50% longer replicate more slowly
• Furthermore, the M13 DNA molecule is much smaller than the λ genome, being only 6407
nucleotides in length.
• It is circular and is unusual in that it consists entirely of single-stranded DNA.
• The smaller size of the M13 DNA molecule means that it has room for fewer genes than
the λ genome.
• This is possible because the M13 capsid is constructed from multiple copies of just three
proteins (requiring only three genes), whereas synthesis of the λ head-and-tail structure
involves over 15 different proteins. In addition, M13 follows a simpler infection cycle than λ
and does not need genes for insertion into the host genome.
• Injection of an M13 DNA molecule into an E. coli cell occurs via the pilus, the structure
that connects two cells during sexual conjugation (see Figure 2.4).
• Once inside the cell the single-stranded molecule acts as the template for synthesis of a
complementary strand, resulting in normal double-stranded DNA (Figure 2.11a).

• This molecule is not inserted into the bacterial genome, but instead replicates until over
100 copies are present in the cell (Figure 2.11b).
• When the bacterium divides, each daughter cell receives copies of the phage genome,
which continues to replicate, thereby maintaining its overall numbers per cell.
• As shown in Figure 2.11c, new phage particles are continuously assembled and released,
about 1000 new phages being produced during each generation of an infected cell.
• Several features of M13 make this phage attractive as a cloning vector.
• The genome is less than 10 kb in size, well within the range desirable for a potential
vector. In addition, the double-stranded replicative form (RF) of the M13 genome
behaves very much like a plasmid, and can be treated as such for experimental purposes.
• It is easily prepared from a culture of infected E. coli cells and can be reintroduced by
transfection.
• Most importantly, genes cloned with an M13-based vector can be obtained in the form of
single-stranded DNA.
• Single-stranded versions of cloned genes are useful for several techniques, notably DNA
sequencing and in vitro mutagenesis.
• Cloning in an M13 vector is an easy and reliable way of obtaining single-stranded
DNA for this type of work.
• M13 vectors are also used in phage display, a technique for identifying pairs of genes
whose protein products interact with one another.

Cloning vectors based on M13 bacteriophage

• The most essential requirement for any cloning vector is that it has a means of replicating
in the host cell.
• For plasmid vectors this requirement is easy to satisfy, as relatively short DNA sequences
are able to act as plasmid origins of replication, and most, if not all, of the enzymes
needed for replication are provided by the host cell.
• Elaborate manipulations, such as those that resulted in pBR322 (see Figure 6.2a), are
therefore possible so long as the final construction has an intact, functional replication
origin.
• With bacteriophages such as M13 and λ, the situation as regards replication is more
complex.
• Phage DNA molecules generally carry several genes that are essential for replication,
including genes coding for components of the phage protein coat and phage-specific
DNA replicative enzymes.
• Alteration or deletion of any of these genes will impair or destroy the replicative ability
of the resulting molecule.
• There is therefore much less freedom to modify phage DNA molecules, and generally
phage cloning vectors are only slightly different from the parent molecule.
How to construct a phage cloning vector
The problems in constructing a phage cloning vector are illustrated by considering M13.
The normal M13 genome is 6.4 kb in length, but most of this is taken up by ten closely packed
genes (Figure 6.5), each essential for the replication of the phage. There is only a single 507-
nucleotide intergenic sequence into which new DNA could be inserted without disrupting one
of these genes, and this region includes the replication origin which must itself remain intact.
Clearly there is only limited scope for modifying the M13 genome. The first step in construction
of an M13 cloning vector was to introduce the lacZ′ gene into the intergenic sequence. This
gave rise to M13mp1, which forms blue plaques on X-gal agar (Figure 6.6a). M13mp1 does not
possess any unique restriction sites in the lacZ′ gene. It does, however, contain the
hexanucleotide GGATTC near the start of the gene. A single nucleotide change would make this
GAATTC, which is an EcoRI site. This alteration was carried out using in vitro mutagenesis (p.
200), resulting in M13mp2 (Figure 6.6b). M13mp2 has a slightly altered lacZ′ gene (the sixth
codon now specifies asparagine instead of aspartic acid), but the β-galactosidase enzyme
produced by cells infected with M13mp2 is still perfectly functional.
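The single-base change that created the EcoRI site can be checked with a short Python sketch. The helper names are hypothetical, and placing the affected codon at positions 2–4 of the hexamer is an assumption inferred from the stated change from aspartic acid to asparagine.

```python
ECORI_SITE = "GAATTC"

# Only the two codons needed for this example (standard genetic code).
CODON_TABLE = {"GAT": "Asp", "AAT": "Asn"}


def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def near_sites(sequence, site=ECORI_SITE, max_mismatches=1):
    """Yield (position, window, changes) for every window of the sequence
    within max_mismatches of the site. Each change is (offset, old, new)."""
    sequence = sequence.upper()
    for i in range(len(sequence) - len(site) + 1):
        window = sequence[i:i + len(site)]
        if mismatches(window, site) <= max_mismatches:
            changes = [
                (k, old, new)
                for k, (old, new) in enumerate(zip(window, site))
                if old != new
            ]
            yield i, window, changes


def codon_change(before, after, offset=1):
    """Amino acids encoded before and after the change, assuming the
    codon starts at the given offset within the hexamer."""
    return (
        CODON_TABLE[before[offset:offset + 3]],
        CODON_TABLE[after[offset:offset + 3]],
    )
```

Calling `list(near_sites("GGATTC"))` reports a single G to A change at the second position, and `codon_change("GGATTC", "GAATTC")` gives Asp and Asn, consistent with the altered sixth codon described above.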

• The next step in the development of M13 vectors was to introduce additional restriction
sites into the lacZ′ gene. This was achieved by synthesizing in the test tube a short
oligonucleotide, called a polylinker, which consists of a series of restriction sites and has
EcoRI sticky ends (Figure 6.7a). This polylinker was inserted into the EcoRI site of
M13mp2, to give M13mp7 (Figure 6.7b), a more complex vector with four
possible cloning sites (EcoRI, BamHI, SalI, and PstI).
• The polylinker is designed so that it does not totally disrupt the lacZ′ gene: a reading
frame is maintained throughout the polylinker, and a functional, though altered,
β-galactosidase enzyme is still produced.
• The most sophisticated M13 vectors have more complex polylinkers inserted into the
lacZ′ gene.
• An example is M13mp8, which has the same series of restriction sites as the plasmid
pUC8 (p. 92). As with the plasmid vector, one advantage of M13mp8 is its ability to take
DNA fragments with two different sticky ends.
• Although M13 vectors are very useful for the production of single-stranded versions of
cloned genes, they do suffer from one disadvantage. There is a limit to the size of DNA
fragment that can be cloned with an M13 vector, with 1500 bp generally being looked on
as the maximum capacity, though fragments up to 3 kb have occasionally been cloned.
To get around this problem a number of hybrid vectors ("phagemids") have been
developed by combining a part of the M13 genome with plasmid DNA.

• An example is provided by pEMBL8 (Figure 6.8a), which was made by transferring into
pUC8 a 1300 bp fragment of the M13 genome. This piece of M13 DNA contains the
signal sequence recognized by the enzymes that convert the normal double-stranded
M13 molecule into single-stranded DNA before secretion of new phage particles.
• This signal sequence is still functional even though detached from the rest of the M13
genome, so pEMBL8 molecules are also converted into single-stranded DNA and
secreted as defective phage particles (Figure 6.8b). All that is necessary is that the E. coli
cells used as hosts for a pEMBL8 cloning experiment are subsequently infected with
normal M13 to act as a helper phage, providing the necessary replicative enzymes and
phage coat proteins.
• pEMBL8, being derived from pUC8, has the polylinker cloning sites within the lacZ′
gene, so recombinant plaques can be identified in the standard way on agar containing X-
gal. With pEMBL8, single-stranded versions of cloned DNA fragments up to 10 kb in
length can be obtained, greatly extending the range of the M13 cloning system.
In vitro packaging systems
• Infective phage particles can be assembled in the test tube.
• Purified empty heads, lambda DNA carrying the foreign insert (about 50 kb, with cos ends)
and tail assemblies are mixed in a test tube.
• The result is infective bacteriophage.
• Uses
• Cloning pieces too big for plasmids
• Generation of ds DNA

Cloning in M13
Purpose: When a fragment is required as single-stranded DNA, an M13 vector can be used as
a common cloning tool.
Preparation of ssDNA:
1. Cloning: standard plasmid cloning method can be used to incorporate recombinant DNA into
M13 vectors;
2. Transformation: the M13 then infects sensitive E. coli cells;
3. Plating: the host cells grow to form the plaques;
4. Isolation: the ssDNA may then be isolated from phage particles in the growth medium of the
plate.
Screening: Blue-white screening, based on an MCS within lacZ′, has been engineered into M13 vectors.
Examples: M13mp18 and M13mp19 are a pair of vectors in which the MCSs are in
opposite orientations relative to the M13 origin of replication.
Hybrid plasmid-M13 vectors
Definition: A number of small plasmid vectors, for example pBluescript, have been developed
to incorporate M13 functionality.
Structure: They contain both plasmid and M13 origins of replication, but do not possess the
genes required for the full phage life cycle.
Working ways:
1. Plasmid way: they normally propagate as true plasmids, and have the advantages of rapid
growth and easy manipulation of plasmid vectors;
2. Phage way: they can be induced to produce single-stranded phage particles by co-infection
with a fully functional helper phage, which provides the gene products required for single-strand
production and packaging.

Vectors for yeast, insect and mammalian systems

Yeast vector
The yeast Saccharomyces cerevisiae is one of the most important organisms in biotechnology.
As well as its role in brewing and breadmaking, yeast has been used as a host organism for the
production of important pharmaceuticals from cloned genes. Development of cloning vectors for
yeast was initially stimulated by the discovery of a plasmid that is present in most strains of S.
cerevisiae (Figure 7.1). The 2 μm plasmid, as it is called, is one of only a very limited number of
plasmids found in eukaryotic cells.

The main types of yeast vector are:
  - Yeast episomal plasmid, YEp
  - Yeast integrative plasmid, YIp
  - Yeast centromere plasmid, YCp
  - Yeast replicating plasmid, YRp
YEp Vectors.
• The 2 μm ori is responsible for the high copy-number and high frequency of
transformation of YEp vectors.
• Most YEp plasmids are relatively unstable and even under conditions of selective growth,
only 60 to 95 percent of the cells retain the YEp plasmid.
• The copy number of most YEp plasmids ranges from 10 to 40 copies per cell.
• Although this system is used for small scale expression studies, the use of YEp vectors in
large-scale manufacturing is not advisable.

• Yeast replicative plasmids (YRps) are able to multiply as independent plasmids because
they carry a chromosomal DNA sequence that includes an origin of replication.
Replication origins are known to be located very close to several yeast genes, including
one or two which can be used as selectable markers. YRp7 (Figure 7.6b) is an example of
a replicative plasmid. It is made up of pBR322 plus the yeast gene TRP1. This gene,
which is involved in tryptophan biosynthesis, is located adjacent to a chromosomal origin
of replication. The yeast DNA fragment present in YRp7 contains both TRP1 and the
origin.
• Three factors come into play when deciding which type of yeast vector is most suitable
for a particular cloning experiment. The first of these is transformation frequency, a
measure of the number of transformants that can be obtained per microgram of plasmid
DNA. A high transformation frequency is necessary if a large number of recombinants
are needed, or if the starting DNA is in short supply. YEps have the highest
transformation frequency, providing between 10,000 and 100,000 transformed cells
per μg. YRps are also quite productive, giving between 1000 and 10,000 transformants
per μg, but a YIp yields less than 1000 transformants per μg, and only 1–10 unless special
procedures are used. The low transformation frequency of a YIp reflects the fact that the
rather rare chromosomal integration event is necessary before the vector can be retained
in a yeast cell. The second important factor is copy number. YEps and YRps have the
highest copy numbers: 20–50 and 5–100, respectively. In contrast, a YIp is usually
present at just one copy per cell. These figures are important if the objective is to obtain
protein from the cloned gene, as the more copies there are of the gene the greater the
expected yield.
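The transformation frequencies quoted above can be turned into a rough planning aid. The Python sketch below is illustrative only: the names are hypothetical, and the YIp range is an assumption that combines the "less than 1000" upper limit with the 1–10 figure quoted for work without special procedures.

```python
# Transformants per microgram of plasmid DNA, as (low, high) ranges
# taken from the figures quoted in the text.
TRANSFORMANTS_PER_UG = {
    "YEp": (10_000, 100_000),
    "YRp": (1_000, 10_000),
    "YIp": (1, 1_000),  # assumed range: 1-10 without special procedures
}


def expected_transformants(vector, dna_ug):
    """Return the (low, high) number of transformants expected from the
    given mass of plasmid DNA in micrograms."""
    low, high = TRANSFORMANTS_PER_UG[vector]
    return low * dna_ug, high * dna_ug


def suitable_vectors(required, dna_ug):
    """Vector types whose lower-bound yield meets the required number of
    transformants for the available DNA."""
    return [
        vector
        for vector in TRANSFORMANTS_PER_UG
        if expected_transformants(vector, dna_ug)[0] >= required
    ]
```

With 1 μg of DNA and a target of 5000 transformants, for instance, only the YEp class meets the requirement at its lower bound. Copy number and stability still have to be weighed separately.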

YIp and YCp Vectors
• The YIp integrative vectors are vectors that do not replicate autonomously, but integrate
into the genome at low frequencies by homologous recombination.
• Integration of circular plasmid DNA by homologous recombination leads to a copy of the
vector sequence flanked by two direct copies of the yeast sequence.
• Typically, YIp vectors integrate as a single copy. However, methods to integrate multiple
copies and stable cell lines with up to 15-20 copies of recombinant gene integrations have
been developed for over-expressing specific genes.
• YIp plasmids with two yeast segments, such as YFG1 and the URA3 marker, have the
potential to integrate at either of the genomic loci, whereas vectors containing repetitive
DNA sequences, such as Ty elements or rDNA, can integrate at any of the multiple sites
within the genome.
• YCp yeast centromere plasmid vectors are autonomously replicating vectors containing
centromere sequences (CEN), and autonomously replicating sequences (ARS).
• The YCp vectors are typically present at very low copy numbers from 1 to 3 per cell.
• These vectors are also relatively unstable and not very useful in high level expression but
are used as regular cloning vectors (e.g., pYC2, pBM272).
Yeast autonomously replicating sequences
  - About 400 origins exist in the 16 chromosomes of S. cerevisiae. Each yeast origin
sequence, called an autonomously replicating sequence (ARS), confers on a plasmid the ability
to replicate in yeast and is a required element in yeast artificial chromosomes.
  - In ARS1, the only essential element is a 15-bp segment designated element A. Three other
short segments, the B1, B2 and B3 elements, are required for efficient functioning of ARS1. A
complex of proteins called ORC (origin recognition complex) binds specifically to element A in
ARS1 in an ATP-dependent manner.

Insect vector
Cloning vectors for insects
The fruit fly, Drosophila melanogaster, has been and still is one of the most important model
organisms used by biologists. Its potential was first recognized by the famous geneticist Thomas
Hunt Morgan, who in 1910 started to carry out genetic crosses between fruit flies with different
eye colors, body shapes, and other inherited characteristics. These experiments led to the
techniques still used today for gene mapping in insects and other animals. More recently, the
discovery that the homeotic selector genes of Drosophila—the genes that control the overall
body plan of the fly—are closely
related to equivalent genes in mammals, has led to D. melanogaster being used as a model for
the study of human developmental processes. The importance of the fruit fly in modern biology
makes it imperative that vectors for cloning genes in this organism are available.
P elements as cloning vectors for Drosophila
The development of cloning vectors for Drosophila has taken a different route to that followed
with bacteria, yeast, plants, and mammals. No plasmids are known in Drosophila and although
fruit flies are, like all organisms, susceptible to infection with viruses, these have not been used
as the basis for cloning vectors. Instead, cloning in Drosophila makes use of a transposon called
the P element.
Transposons are common in all types of organism. They are short pieces of DNA (usually less
than 10 kb in length) that can move from one position to another in the chromosomes of a cell. P
elements, which are one of several types of transposon in Drosophila, are 2.9 kb in length and
contain three genes flanked by short inverted repeat sequences at either end of the element
(Figure 7.18a). The genes code for transposase, the enzyme that carries out the transposition
process, and the inverted repeats form the
recognition sequences that enable the enzyme to identify the two ends of the inserted transposon.
As well as moving from one site to another within a single chromosome, P elements can also
jump between chromosomes, or between a plasmid carrying a P element and one of the fly's
chromosomes (Figure 7.18b). The latter is the key to the use of P elements as cloning vectors.
The vector is a plasmid that carries two P elements. The DNA to be cloned is inserted into the
first of these, which disrupts its transposase gene so that this element is inactive. The second
P element carried by the plasmid is therefore one that has an intact version of the transposase
gene. Ideally this second element should not itself be transferred to the Drosophila chromosomes,
so it has its "wings clipped": its inverted repeats are removed so that the transposase does not
recognize it as being a real P element (Figure 7.18c). Once the gene to be cloned has been inserted into the
vector, the plasmid DNA is microinjected into fruit fly embryos. The transposase from the
wings-clipped P element directs transfer of the engineered P element into one of the fruit fly
chromosomes. If this happens within a germline nucleus, then the adult fly that develops from
the embryo will carry copies of the cloned gene in all its cells. P element cloning was first
developed in the 1980s and has made a number of important contributions to Drosophila
genetics.

Cloning vectors based on insect viruses


Although virus vectors have not been developed for cloning genes in Drosophila, one type of
virus, the baculovirus, has played an important role in gene cloning with other insects.
Baculovirus expression system
• Baculoviruses are a family of large rod-shaped viruses.
• The genome is circular, double-stranded DNA ranging from 80 to 180 kbp.
• The foreign gene is placed under the transcriptional control of the strong polyhedrin promoter.
Advantages of working with the baculovirus system
• High expression levels using the polyhedrin or p10 promoter
• Supports post-translational modifications
• The baculovirus expression vector system (BEVS) enables simultaneous expression of multiple genes
• Expressed proteins do not have size limitations
• Capable of producing cytotoxic proteins
Baculovirus
• Baculoviruses are found in invertebrates, primarily insect species.
• They are not infectious for vertebrates or plants, and their promoters are inactive in
mammalian cells.
• The genome is covalently closed, circular, double-stranded DNA of about 134 kbp, and it can
accommodate large fragments of foreign DNA.
• They are divided into two groups on the basis of their structure:
– Nucleopolyhedroviruses (NPV)
– Granuloviruses
• NPVs are the ones mainly used as expression vectors, e.g. Autographa californica NPV
(AcMNPV), isolated from the larva of the alfalfa looper.
• The baculovirus expression system is based on the ability to propagate AcMNPV in insect
cells.
• It uses many of the protein modification, processing and transport systems present in higher
eukaryotic cells.
• The virus can be propagated to high titres and is adapted for growth in suspension cultures,
so large amounts of recombinant protein can be obtained with relative ease.
Bac-to-Bac Baculovirus Expression System
An efficient site-specific transposition system to generate baculovirus for high-level expression
of recombinant proteins
Advantages
1. Recombinant virus DNA isolated from selected colonies is not mixed with parental,
non-recombinant virus.
- Easy colony screening: lacZα gene (Bluo-gal or X-gal)
2. Time-saving: less time is needed to identify and purify a recombinant virus.
Vectors used:
1. Donor plasmid vector into which the gene(s) of interest will be cloned
2. Baculovirus shuttle vector (bacmid): mini-attTn7 target site, lacZ, kanamycin resistance marker
3. Helper plasmid: transposase, tetracycline resistance marker
Steps in recombinant baculovirus production
• Clone the gene of interest in the pFastBac donor plasmid.
• The expression cassette in pFastBac, together with an SV40 polyadenylation signal, is flanked
by the left and right arms of Tn7 to form a mini-Tn7.
• The recombinant pFastBac is transformed into the E. coli host strain DH10Bac, which contains a
baculovirus shuttle vector (bacmid) with a mini-attTn7 target site.
• A helper plasmid supplies the transposase that transposes the gene of interest from pFastBac
to the bacmid (shuttle vector).
• Transposition of the mini-Tn7 into the mini-attTn7 target site generates a recombinant
bacmid.
• This recombinant bacmid can now be used to transfect insect cell lines.
Insects & Insect cells
• Baculoviruses infect lepidopteran insects (butterflies and moths) and insect cell lines.
• Commonly used cell lines are Sf9 and Sf21, derived from the pupal ovarian tissue of the fall
armyworm Spodoptera frugiperda, and High Five, derived from the ovarian cells of the
cabbage looper.
Media:
• Grace's Insect Medium, unsupplemented: contains L-glutamine
• Grace's Insect Medium, supplemented: contains additional TC yeastolate and lactalbumin
hydrolysate
• Trichoplusia ni Medium-Formulation Hink (TNM-FH): contains 10% FBS
Requirements for proper cell culture
• Temperature: optimal range is 27–28 °C
• pH: optimal range is 6.1 to 6.4
• Aeration: passive O2 diffusion is required for optimal growth and recombinant protein
expression
• Osmolality: optimum is 345–380 mOsm/kg
• FBS: when working with suspension cultures it is advisable to use 10–20% FBS to give
protection from cellular shear forces
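As a quick illustration, the optimal ranges listed above can be written as a lookup and used to flag a culture reading that drifts out of range. The dictionary keys and the function name are hypothetical; only the numeric ranges come from the list.

```python
# Optimal ranges for insect cell culture, as listed in the notes above.
OPTIMA = {
    "temperature_c": (27.0, 28.0),
    "ph": (6.1, 6.4),
    "osmolality_mosm_per_kg": (345.0, 380.0),
}


def out_of_range(reading):
    """Return the names of the parameters that fall outside their optimal range."""
    problems = []
    for name, (low, high) in OPTIMA.items():
        if not (low <= reading[name] <= high):
            problems.append(name)
    return problems


good = {"temperature_c": 27.5, "ph": 6.2, "osmolality_mosm_per_kg": 360.0}
too_warm = {"temperature_c": 37.0, "ph": 6.2, "osmolality_mosm_per_kg": 360.0}
print(out_of_range(good))
print(out_of_range(too_warm))
```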
Mammalian vectors
Cloning in mammals
At present, gene cloning in mammals is carried out for one of three reasons. The first is to
achieve a gene knockout, which is an important technique used to help determine the function of
an unidentified gene (p. 213). These experiments are usually carried out with rodents such as
mice. The other two are the production of recombinant protein in mammalian cell culture and
gene therapy.

Retroviruses
1. Retroviruses have a ssRNA genome.
2. Two copies of the sense ssRNA genome are present within the viral particle.
3. When they infect a cell, the ssRNA is converted into a dsDNA copy by reverse transcriptase
(RT); retroviruses therefore belong to class VI.
4. This dsDNA copy, the provirus, is integrated into the host cell genome by a viral integrase
enzyme.
5. Replication and transcription occur from this dsDNA intermediate.
6. Retroviruses vary in complexity; the HIVs lie at the complex extreme.
Retroviral vectors

Retroviral vector method
• A retrovirus is a virus that carries its genetic material in the form of RNA rather than
DNA.
• Retroviruses are used as vectors to transfer genetic material into the host cell.
• The result is a chimera, an organism consisting of tissues or parts of diverse genetic
constitution.
• Chimeras are inbred for as many as 20 generations until homozygous transgenic offspring
(carrying the desired transgene in every cell) are born.
Advantage: an effective means of integrating the transgene into the genome of a recipient cell.
Disadvantages:
• Only small pieces of DNA (approximately 8 kilobases) can be transferred.
• The transferred DNA may lack essential adjacent sequences for regulating the expression of the
transgene.
• The genome of the retroviral strain (helper virus) that is needed to create large quantities
of the vector DNA can be integrated into the same nucleus as the transgene.
Viruses as cloning vectors for mammals
For many years it was thought that viruses would prove to be the key to cloning in mammals.
This expectation has only partially been realized. The first cloning experiment involving
mammalian cells was carried out in 1979 with a vector based on simian virus 40 (SV40). This
virus is capable of infecting several mammalian species, following a lytic cycle in some hosts
and a lysogenic cycle in others. The genome is 5.2 kb in size
(Figure 7.19a) and contains two sets of genes, the "early" genes, expressed early in the
infection cycle and coding for proteins involved in viral DNA replication, and the "late" genes,
coding for viral capsid proteins. SV40 suffers from the same problem as λ and the plant
caulimoviruses, in that packaging constraints limit the amount of new DNA that can be inserted
into the genome. Cloning with SV40 therefore involves replacing one or more of the existing
genes with the DNA to be cloned. In the original experiment a segment of the late gene region
was replaced (Figure 7.19b), but early gene replacement is also an option. In the years since
1979, a number of other types of virus have been used to clone genes in mammals.
These include:
Adenoviruses, which enable DNA fragments of up to 8 kb to be cloned, longer than is possible
with an SV40 vector, though adenoviruses are more difficult to handle because their genomes are
bigger.
Papillomaviruses, which also have a relatively high capacity for inserted DNA. Bovine
papillomavirus (BPV), which causes warts on cattle, is particularly attractive because it has an
unusual infection cycle in mouse cells, taking the form of a multicopy plasmid with about 100
molecules present per cell. It does not
cause the death of the mouse cell, and BPV molecules are passed to daughter cells on cell
division, giving rise to a permanently transformed cell line. Shuttle vectors consisting of BPV
and E. coli sequences, and capable of replication in both mouse and bacterial cells, have been
used for the production of recombinant proteins in mouse cell lines.
Adeno-associated virus (AAV), which is unrelated to adenovirus but often found in the same
infected tissues, because AAV makes use of some of the proteins synthesized by adenovirus in
order to complete its replication cycle. In the absence of this helper virus, the AAV genome
inserts into its host's DNA. With most integrative viruses this is a random event, but AAV has
the unusual property of
always inserting at the same position, within human chromosome 19. Knowing exactly where the
cloned gene will be in the host genome is important if the outcome of the cloning experiment
must be checked rigorously, as is the case in applications such as gene therapy. AAV vectors are
therefore looked on as having major potential in this area.
Retroviruses, which are the most commonly used vectors for gene therapy. Although they insert
at random positions, the resulting integrants are very stable, which means that the therapeutic
effects of the cloned gene will persist for some time. We will return to gene therapy in Chapter
14.
Prokaryotic and eukaryotic expression host systems
 Produces large amounts of a specific protein
 Permits studies of the structure and function of proteins
 Can be useful when proteins are rare cellular components or difficult to isolate
Common problems with bacterial expression systems
• Low expression levels:
▪ change promoter
▪ change plasmid
▪ change cell type
▪ add rare tRNAs for rare codons on second plasmid
• Severe protein degradation:
– use protease inhibitors or a protease-deficient host strain
– try induction at lower temperature
• Missing post-translational modification: co-express with the modifying enzyme (e.g. a kinase)
• Glycosylation will not be carried out:
– use yeast or mammalian expression system
• Misfolded protein (inclusion bodies):
– co-express with GroEL, a chaperone
– try refolding buffers
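One of the fixes listed for low expression is to supply rare tRNAs, so it helps to know how many rare codons a gene contains. The sketch below counts them in a coding sequence. The set of rare codons is an assumption for illustration (a commonly cited short list for E. coli), not something given in these notes, and the function name is hypothetical.

```python
# Assumed, commonly cited list of codons that are rare in E. coli (DNA alphabet).
RARE_CODONS = {"AGG", "AGA", "ATA", "CTA", "CCC", "GGA"}


def rare_codon_report(cds):
    """Count rare codons in an in-frame coding sequence (DNA or RNA letters)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # Positions are 1-based codon numbers, counting the start codon as 1.
    positions = [n for n, codon in enumerate(codons, start=1) if codon in RARE_CODONS]
    return {
        "codons": len(codons),
        "rare": len(positions),
        "positions": positions,
        "fraction": len(positions) / len(codons) if codons else 0.0,
    }


# Made-up six-codon sequence: ATG AGG AGA CTG ATA TAA
report = rare_codon_report("ATGAGGAGACTGATATAA")
print(report)
```

A high fraction, or several rare codons in a row, would suggest trying a host that carries the extra tRNA genes on a second plasmid.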
Gene Expression Systems in Prokaryotes
• In Escherichia coli
- lambda system - pL
- T7 system – pT7
- Lac system - plac
- Trp system
- synthetic systems – ptac, ptrc
- controlled by recombinase
• In Bacillus + other bacteria

The T7 promoter system


Reporter Gene Vectors
• A gene that encodes a protein whose activity can be easily assayed in a cell in which it is
not normally expressed
• These genes are linked to regulatory sequences whose function is being tested
• Changes in transcriptional activity from the regulatory sequences are detected by changes
in the level of reporter gene expression
EUKARYOTIC EXPRESSION HOST: elaborate on yeast, mammalian artificial
chromosomes and mammalian vectors

Introduction of recombinant DNA into host cells and selection methods

Conjugation
Bacterial conjugation is the transfer of genetic material between bacterial cells by direct
cell-to-cell contact or by a bridge-like connection between two cells.
Electroporation:
Electroporation is a physical method used to introduce polar molecules into a host cell
through the cell membrane. In this procedure, a large electric pulse temporarily disturbs
the phospholipid bilayer, allowing molecules like DNA to pass into the cell.
Gene gun:
• A gene gun, or biolistic particle delivery system, originally designed for plant
transformation, is a device for injecting cells with genetic information; the inserted
genetic material is termed a transgene.
• The payload is an elemental particle of a heavy metal coated with plasmid DNA. This
technique is often simply referred to as bioballistics or biolistics.
• This device is able to transform almost any type of cell, including plant cells, and is not
limited to genetic material of the nucleus: it can also transform organelles, including
plastids.
Transfection of eukaryotic cells
Problem: The transfection of DNA into eukaryotic cells is more problematic than E.coli
transformation, and efficiency of the process is much lower.
Reasons and solutions:
• In yeast and plant cells, the cell wall must first be digested; the resulting wall-less cells
then take up DNA easily.
• Animal cells in culture take up DNA at low efficiency. The efficiency may be increased if
the DNA is presented on the cell surface as a calcium phosphate precipitate.


Recombinants selection methods

IDENTIFICATION OF POSITIVE CLONES


• One of the first steps is to identify clones carrying the recombinant plasmid, with the
desired DNA insert.
• This can be done by 'picking' clones - choosing individual bacterial colonies in order to
isolate the plasmid DNA from each of them.
• Single bacterial colonies are grown in culture broth containing the selection antibiotic in
order to maintain the plasmid.
• The plasmid DNA is extracted by the standard minipreparation technique and then
analysed by restriction digest.
• After digesting the DNA, the different-sized fragments are separated by agarose gel
electrophoresis and their sizes are determined by comparison with a known DNA molecular
weight marker.
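The restriction-digest check described above comes down to predicting which fragment sizes an empty vector and a recombinant should give. A minimal sketch for a circular plasmid follows; the plasmid sizes and cut positions are hypothetical numbers chosen for illustration, and the function name is made up.

```python
def circular_fragments(plasmid_len, cut_positions):
    """Fragment sizes (bp, largest first) from cutting a circular plasmid."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [plasmid_len]  # uncut circle: a single molecule of full length
    # Fragments between consecutive cuts, plus the one that wraps past position 0.
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    fragments.append(plasmid_len - cuts[-1] + cuts[0])
    return sorted(fragments, reverse=True)


# Hypothetical 3000-bp vector with two sites flanking the cloning site.
empty_vector = circular_fragments(3000, [100, 120])
# The same vector carrying a 1500-bp insert between those two sites.
recombinant = circular_fragments(4500, [100, 1620])
print(empty_vector)
print(recombinant)
```

On a gel, the recombinant would show the insert band of about 1.5 kb next to the vector backbone, whereas the empty vector would show only the backbone and a fragment too small to see.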
Screening for Inserts

Antibiotic screening

Screening using probes


In molecular biology, a hybridization probe is a fragment of DNA or RNA of variable
length (usually 100-1000 bases long) which can be radioactively labeled. It can then be
used in DNA or RNA samples to detect the presence of nucleotide sequences (the DNA
target) that are complementary to the sequence in the probe. The probe thereby hybridizes
to single-stranded nucleic acid (DNA or RNA) whose base sequence allows probe-target
base pairing due to complementarity between the probe and target. The labeled probe is
first denatured (by heating or under alkaline conditions such as exposure to sodium
hydroxide) into single stranded DNA (ssDNA) and then hybridized to the target ssDNA
(Southern blotting) or RNA (Northern blotting) immobilized on a membrane or in situ.
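The base-pairing rule behind probe hybridization can be sketched in a few lines: a probe binds wherever the target strand contains the probe's reverse complement. The example handles only perfect matches on a single DNA strand, and the sequences and function names are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_binding_sites(target_strand, probe):
    """0-based positions on the target strand where the probe pairs perfectly."""
    site = reverse_complement(probe)
    target = target_strand.upper()
    hits = []
    start = target.find(site)
    while start != -1:
        hits.append(start)
        start = target.find(site, start + 1)
    return hits


target = "TTGACCGTAAGCTTGGA"   # made-up single-stranded target, 5' to 3'
probe = "GCTTACG"              # made-up probe, complementary to CGTAAGC
print(probe_binding_sites(target, probe))
```

Real hybridization tolerates some mismatches depending on temperature and salt concentration, which this exact-match search does not model.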

Immunological screening
Immunological screening is one of the most versatile expression cloning strategies.
• The prerequisite is a suitable antibody.
• It does not matter whether the protein is functional.
• The recognition target is generally an epitope.
• An antibody can be raised against the target if there is sufficient information about its
sequence.
The steps of the first immunological screening techniques
 Transformed cells were inoculated on Petri dishes and allowed to form colonies.
 The colonies were lysed to release the antigen from positive clones
 A sheet of polyvinyl coated with the appropriate antibody was put onto the surface of the
plate,
 antigen–antibody complexes formed.
 The sheet was removed and exposed to 125I-labelled IgG specific to a different
determinant on the surface of the antigen
 The sheet was then washed and exposed to X-ray film.
Vectors for cloning large fragments of DNA
Cosmids
 Behave both as plasmids and as phages;
 Contain the cos sites of λ and plasmid origin of replication;
 Cosmids use the λ packaging system to package large DNA fragments bounded by λ cos
sites, which circularize and replicate as plasmids after infection of E.coli cells.
 Some cosmid vectors have two cos sites, and are cleaved to produce two cos ends, which
are ligated to the ends of target fragments and packaged into λ particles.
 Cosmids have a capacity for cloned DNA of 30-45 kb.
 Circular ds DNA.
 Carry more DNA than plasmid and can be maintained and manipulated as plasmids

SuperCos 1 Cloning Site Region


(sequence shown 1–71)
Features, in order 5′ to 3′: EcoR I, Not I, T3 promoter, BamH I, T7 promoter, Not I, EcoR I
gaattcgcggccgcaattaaccctcactaaagggatccctatagtgagtcgtattatgcggccgcgaattc
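The restriction sites named for this region can be located directly in the 71-base sequence shown above. The sketch below searches for the standard recognition sequences of EcoRI (GAATTC), NotI (GCGGCCGC) and BamHI (GGATCC); the function name is made up for the example.

```python
SUPERCOS_MCS = (
    "gaattcgcggccgcaattaaccctcactaaagggatccctatagtgagtcgtattatgcggccgcgaattc"
)

RECOGNITION_SITES = {"EcoRI": "GAATTC", "NotI": "GCGGCCGC", "BamHI": "GGATCC"}


def site_positions(seq, site):
    """1-based start positions of every occurrence of `site` in `seq`."""
    seq, site = seq.upper(), site.upper()
    last_start = len(seq) - len(site)
    return [i + 1 for i in range(last_start + 1) if seq.startswith(site, i)]


positions = {
    name: site_positions(SUPERCOS_MCS, site)
    for name, site in RECOGNITION_SITES.items()
}
print(len(SUPERCOS_MCS))
print(positions)
```

The single BamHI site sits between two NotI sites and two EcoRI sites, so an insert cloned at BamHI can be cut out again with either of the flanking enzymes.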
cos site recognition:
λ-Terminase is an endonuclease that recognizes a sequence of about 100 bp at the cos region of
the λ genome.
It cleaves in this region to generate termini with complementary 12-base 5′ overhangs.
λ-Terminase is the product of the A and Nu1 genes of bacteriophage λ, and is involved in the
resolution of concatemeric genomes during the phage assembly process.
This enzyme is a heteromeric protein possessing several activities, including DNA
binding, nicking of the DNA at the cos site and dissociation of the cohesive ends.
ATP serves as a required cofactor for λ-Terminase activity.
While only the binding of ATP is necessary for cos site nicking, ATP hydrolysis is necessary for
cohesive end dissociation.

Cosmid vectors
As we have seen, concatemers of unit-length λ DNA molecules can be efficiently packaged if the
cos sites, substrates for the packaging-dependent cleavage, are 37–52 kb apart (75–105% the size
of λ+ DNA). In fact, only a small region in the proximity of the cos site is required for
recognition by the packaging system (Hohn 1975). Plasmids have been constructed which
contain a fragment of λ DNA including the cos site (Collins & Brüning 1978, Collins & Hohn
1979, Wahl et al. 1987, Evans et al. 1989). These plasmids have been termed cosmids and can be
used as gene-cloning vectors in conjunction with the in vitro packaging system. Figure 5.1 shows
a gene-cloning scheme employing a cosmid. Packaging the cosmid recombinants into phage
coats imposes a desirable selection upon their size. With a cosmid vector of 5 kb, we demand the
insertion of 32–47 kb of foreign DNA – much more than a phage-λ vector can accommodate.
Note that, after packaging in vitro, the particle is used to infect a suitable host. The recombinant
cosmid DNA is injected and circularizes like phage DNA but replicates as a normal plasmid
without the expression of any phage functions. Transformed cells are selected on the basis of a
vector drug-resistance marker.
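The size selection imposed by packaging is simple arithmetic: the cos sites must be 37–52 kb apart, so the acceptable insert size is that window minus the size of the vector. A short sketch, with a hypothetical function name:

```python
# Distance allowed between cos sites for efficient packaging, from the text (kb).
PACKAGING_WINDOW_KB = (37.0, 52.0)


def insert_window(vector_kb, window=PACKAGING_WINDOW_KB):
    """Smallest and largest insert (kb) that a cosmid of this size can package."""
    low, high = window
    if vector_kb >= high:
        raise ValueError("vector alone exceeds the packaging limit")
    return (max(low - vector_kb, 0.0), high - vector_kb)


print(insert_window(5.0))  # the 5-kb cosmid discussed above
```

For the 5-kb cosmid this reproduces the 32–47 kb range given in the text; a larger vector shifts the whole window downwards by the same amount.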

Cosmids provide an efficient means of cloning large pieces of foreign DNA. Because of their
capacity for large fragments of DNA, cosmids are particularly attractive vectors for constructing
libraries of eukaryotic genome fragments. Partial digestion with a restriction endonuclease
provides suitably large fragments. However, there is a potential problem associated with the use
of partial digests in this way. This is due to the possibility of two or more genome fragments
joining together in the ligation reaction, hence creating a clone containing fragments that were
not initially adjacent in the genome. This would give an incorrect picture of their chromosomal
organization. The problem can be overcome by size fractionation of the partial digest.
Even with sized foreign DNA, in practice cosmid clones may be produced that contain non-
contiguous DNA fragments ligated to form a single insert. The problem can be solved by
dephosphorylating the foreign DNA fragments so as to prevent their ligation together. This
method is very sensitive to the exact ratio of target-to-vector DNAs (Collins & Brüning 1978)
because vector-to-vector ligation can occur. Furthermore, recombinants with a duplicated vector
are unstable and break down in the host by recombination, resulting in the propagation of a
non-recombinant cosmid vector. Such difficulties have been overcome in a cosmid-cloning procedure
devised by Ish-Horowicz and Burke (1981). By appropriate treatment of the cosmid vector pJB8
(Fig. 5.2), left-hand and right-hand vector ends are purified which are incapable of self-ligation
but which accept dephosphorylated foreign DNA. Thus the method eliminates the need to size
the foreign DNA fragments and prevents formation of clones containing short foreign DNA or
multiple vector sequences. An alternative solution to these problems has been devised by Bates
and Swift (1983) who have constructed cosmid c2XB. This cosmid carries a BamHI insertion
site and two cos sites separated by a blunt-end restriction site (Fig. 5.3). The creation of these
blunt ends, which ligate only very inefficiently under the conditions used, effectively prevents
vector self-ligation in the ligation reaction. Modern cosmids of the pWE and sCos series (Wahl
et al. 1987, Evans et al. 1989) contain features such as: (i) multiple cloning sites (Bates & Swift
1983, Pirrotta et al. 1983, Breter et al. 1987) for simple cloning using non-size-selected DNA;
(ii) phage promoters flanking the cloning site; and (iii) unique NotI, SacII or SfiI sites (rare
cutters, see Chapter 6) flanking the cloning site to permit removal of the insert from the vector as
single fragments. Mammalian expression modules encoding dominant selectable markers
(Chapter 10) may also be present, for gene transfer to mammalian cells if required.
Phagemids
• Can be rescued as single-stranded DNA;
• Have both phage and plasmid characteristics;
• Require a helper phage for single-strand rescue;
• Carry two RNA polymerase promoters (T7 and T3).
• Phagemids are plasmids containing both a plasmid ori and an M13 origin of replication.
• Because of the plasmid origin of replication, the phagemid propagates like a normal
plasmid under most conditions (i.e., double-stranded circles, high copy number, small
size, stable clones, accommodates large inserts, etc.).
• If some proteins and enzymes from M13 are furnished to the cell containing the phagemid,
then the phagemid is induced to produce single-stranded DNA and eventually phage
particles.
• The easiest way to supply these gene products is to infect the phagemid-carrying strain with
an M13 helper phage, such as M13K07.
pBluescript
• Small plasmid
• High copy number
• The ampicillin-resistance (Ampr) gene is a selectable marker.
• The lacZ-alpha fragment (with the MCS inside it) provides the scorable marker: transformed
cells that contain unaltered plasmid produce blue colonies on IPTG/X-gal plates, whereas
transformed cells whose plasmid has an insert disrupting the lacZ-alpha fragment produce
white colonies.
• The f1 origin allows for phagemid rescue.
• Suited to commonly used cloning and sequencing procedures.
• Extensive polylinker with 21 unique restriction enzyme recognition sites.
• Flanking the polylinker are T7 and T3 RNA polymerase promoters that can be used to
synthesize RNA in vitro.
• The (+) and (–) orientations of the f1 intergenic region allow the rescue of sense or
antisense ssDNA by a helper phage.
BACs and PACs as alternatives to cosmids
Phage P1 is a temperate bacteriophage which has been extensively used for genetic analysis of
Escherichia coli because it can mediate generalized transduction. Sternberg and co-workers have
developed a P1 vector system which has a capacity for DNA fragments as large as 100 kb
(Sternberg 1990, Pierce et al. 1992). Thus the capacity is about twice that of cosmid clones but
less than that of yeast artificial chromosome (YAC) clones (see p. 159). The P1 vector contains a
packaging site (pac) which is necessary for in vitro packaging of recombinant molecules into
phage particles. The vectors contain two loxP sites. These are the sites recognized by the phage
recombinase, the product of the phage cre gene, and which lead to circularization of the
packaged DNA after it has been injected into an E. coli host expressing the recombinase (Fig.
5.4). Clones are maintained in E. coli as low-copy-number plasmids by selection for a vector
kanamycin-resistance marker. A high copy number can be induced by exploitation of the P1 lytic
replicon (Sternberg 1990). This P1 system has been used to construct genomic libraries of
mouse, human, fission yeast and Drosophila DNA (Hoheisel et al. 1993, Hartl et al. 1994).
Shizuya et al. (1992) have developed a bacterial cloning system for mapping and analysis of
complex genomes. This BAC system (bacterial artificial chromosome) is based on the
single-copy sex factor F of E. coli. This vector (Fig. 5.5) includes the λ cosN and P1 loxP
sites, two cloning sites (HindIII and BamHI) and several G+C-rich restriction enzyme sites (e.g.
SfiI, NotI, etc.) for potential excision of the inserts. The cloning site is also flanked by T7 and SP6
promoters for generating RNA probes. This BAC can be transformed into E. coli very
efficiently, thus avoiding the packaging extracts that are required with the P1 system. BACs are
capable of maintaining human and plant genomic fragments of greater than 300 kb for over 100
generations with a high degree of stability (Woo et al. 1994) and have been used to construct
genome libraries with an average insert size of 125 kb (Wang et al. 1995a). Subsequently,
Ioannou et al. (1994) have developed a P1-derived artificial chromosome (PAC), by combining
features of both the P1 and the F-factor systems. Such PAC vectors are able to handle inserts in
the 100–300 kb range. The first BAC vector, pBAC108L, lacked a selectable marker for
recombinants. Thus, clones with inserts had to be identified by colony hybridization. While this
once was standard practice in gene manipulation work, today it is considered to be inconvenient!
Two widely used BAC vectors, pBeloBAC11 and pECBAC1, are derivatives of pBAC108L in
which the original cloning site is replaced with a lacZ gene carrying a multiple cloning site (Kim
et al. 1996, Frijters et al. 1997). pBeloBAC11 has two EcoRI sites, one in the lacZ gene and one
in the CmR gene, whereas pECBAC1 has only the EcoRI site in the lacZ gene. Further
improvements to BACs have been made by replacing the lacZ gene with the sacB gene
(Hamilton et al. 1996). Insertional inactivation of sacB permits growth of the host cell on
sucrose-containing media, i.e. positive selection for inserts. Frengen et al. (1999) have further
improved these BACs by including a site for the insertion of a transposon. This enables genomic
inserts to be modified after cloning in bacteria, a procedure known as retrofitting. The principal
uses of retrofitting are the simplified introduction of deletions (Chatterjee & Coren 1997) and the
introduction of reporter genes for use in the original host of the genomic DNA (Wang et al.
2001).
Choice of vector
The maximum size of insert that the different vectors will accommodate is shown in Table 5.1.
The size of insert is not the only feature of importance. The absence of chimeras and deletions is
even more important. In practice, some 50% of YACs show structural instability of inserts or are
chimeras in which two or more DNA fragments have become incorporated into one clone. These
defective YACs are unsuitable for use as mapping and sequencing reagents and a great deal of
effort is required to identify them. Cosmid inserts sometimes contain the same aberrations and the
greatest problem with them arises when the DNA being cloned contains tandem arrays of
repeated sequences. The problem is particularly acute when the tandem array is several times
larger than the allowable size of a cosmid insert. Potential advantages of the BAC and PAC
systems over YACs include lower levels of chimerism (Hartl et al. 1994, Sternberg 1994), ease
of library generation and ease of manipulation and isolation of insert DNA. BAC clones seem to
represent human DNA far more faithfully than their YAC or cosmid counterparts and appear to
be excellent substrates for shotgun sequence analysis, resulting in accurate contiguous sequence
data (Venter et al. 1996).
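Insert capacity is the first filter when choosing among these vectors. The sketch below uses approximate upper limits read from the passages above (cosmid 45 kb, P1 100 kb, PAC 300 kb, BAC 300 kb); these are assumed round figures, since Table 5.1 is not reproduced here, and the BAC value is conservative because the text reports inserts of more than 300 kb. YACs are left out because no capacity is quoted for them in this section.

```python
# Assumed upper insert sizes in kb, rounded from the figures quoted in the text.
MAX_INSERT_KB = {"cosmid": 45, "P1": 100, "PAC": 300, "BAC": 300}


def vectors_for(insert_kb):
    """Names of the vectors whose quoted capacity covers an insert of this size."""
    return [name for name, cap in MAX_INSERT_KB.items() if insert_kb <= cap]


print(vectors_for(40))
print(vectors_for(150))
```

Capacity only narrows the choice; as the passage stresses, insert stability and the frequency of chimeric clones matter even more.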

Expression vector – Characteristics, RNA probe synthesis, High level expression of proteins, Protein solubilization, purification and export.
• Expression vectors are required if one wants to prepare RNA probes from the cloned
gene or to purify large amounts of the gene product. In either case, transcription of the
cloned gene is required. Although it is possible to have the cloned gene under the
control of its own promoter, it is more usual to utilize a promoter specific to the vector.
Such vector-carried promoters have been optimized for binding of the E. coli RNA
polymerase and many of them can be regulated easily by changes in the growth
conditions of the host cell.
• E. coli RNA polymerase is a multi-subunit enzyme. The core enzyme consists of two
identical α subunits and one each of the β and β′ subunits. The core enzyme is not active
unless an additional subunit, the σ factor, is present. RNA polymerase recognizes
different types of promoters depending on which type of σ factor is attached. The most
common promoters are those recognized by the RNA polymerase with σ70. A large
number of σ70 promoters from E. coli have been analysed and a compilation of over 300
of them can be found in Lisser and Margalit (1993). A comparison of these promoters has
led to the formulation of a consensus sequence (Fig. 5.6). If the transcription start point is
assigned the position +1, then this consensus sequence consists of the −35 region
(5′-TTGACA-) and the −10 region, or Pribnow box (5′-TATAAT). RNA polymerase must
bind to both sequences to initiate transcription. The strength of a promoter, i.e. how many
RNA copies are synthesized per unit time per enzyme molecule, depends on how close its
sequence is to the consensus. While the −35 and −10 regions are the sites of nearly all
mutations affecting promoter strength, other bases flanking these regions can affect
promoter activity (Hawley & McClure 1983, Dueschle et al. 1986, Keilty & Rosenberg
1987). The distance between the −35 and −10 regions is also important. In all cases
examined, the promoter was weaker when the spacing was increased or decreased from
17 bp.
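The two features described above, closeness to the −35 and −10 consensus hexamers and the 17-bp spacing between them, can be tallied for any candidate promoter. The sketch below is only a counting aid under simple assumptions: the caller supplies the hexamer positions, the example sequence is made up, and a match count is not a real predictor of promoter strength.

```python
CONSENSUS_35 = "TTGACA"   # -35 region consensus from the text
CONSENSUS_10 = "TATAAT"   # -10 region (Pribnow box) consensus from the text
OPTIMAL_SPACING = 17      # bp between the two hexamers


def consensus_matches(hexamer, consensus):
    """Number of positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def describe_promoter(seq, pos35, pos10):
    """Summarise a promoter given 0-based starts of its -35 and -10 hexamers."""
    seq = seq.upper()
    box35 = seq[pos35:pos35 + 6]
    box10 = seq[pos10:pos10 + 6]
    spacing = pos10 - (pos35 + 6)
    return {
        "minus35_matches": consensus_matches(box35, CONSENSUS_35),
        "minus10_matches": consensus_matches(box10, CONSENSUS_10),
        "spacing_bp": spacing,
        "spacing_offset": spacing - OPTIMAL_SPACING,
    }


# Made-up sequence: perfect hexamers separated by a 17-bp spacer.
example = "GG" + "TTGACA" + "ATTAATCATCGGCTCGT" + "TATAAT" + "G"
report = describe_promoter(example, pos35=2, pos10=25)
print(report)
```

A promoter with fewer matches, or with a spacing offset other than zero, would be expected to be weaker than this idealized example.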
• Upstream (UP) elements located 5′ of the −35 hexamer in certain bacterial promoters are
A+T-rich sequences that increase transcription by interacting with the α subunit of RNA
polymerase. Gourse et al. (1998) have identified UP sequences conferring increased
activity to the rrn core promoter. The best UP sequence was portable and increased
heterologous protein expression from the lac promoter by a factor of 100.
• Once RNA polymerase has initiated transcription at a promoter, it will polymerize
ribonucleotides until it encounters a transcription-termination site in the DNA.
Bacterial DNA has two types of transcription-termination site: factor-independent and
factor-dependent. As their names imply, these types are distinguished by whether they
work with RNA polymerase and DNA alone or need other factors before they can
terminate transcription. The factor-independent transcription terminators are easy to
recognize because they have similar sequences: an inverted repeat followed by a string
of A residues in the template strand (Fig. 5.7). Transcription is terminated in the string
of A residues, resulting in a string of U residues at the 3′ end of the mRNA. The
factor-dependent transcription terminators have very little sequence in common with
each other. Rather, termination involves interaction with one of the three known E. coli
termination factors, Rho (ρ), Tau (τ) and NusA. Most expression vectors incorporate a
factor-independent termination sequence downstream from the site of insertion of the
cloned gene.
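Because factor-independent terminators share a recognizable pattern, they can be searched for in a transcript. The sketch below looks in the mRNA for a perfect inverted repeat (a potential stem-loop) followed immediately by a run of U residues. The stem length, loop sizes and minimum U-run are arbitrary assumptions chosen for the example; natural terminators tolerate mismatches that this simple search ignores.

```python
# Minimal search for factor-independent terminator-like motifs in an mRNA:
# an inverted repeat (stem-loop) followed directly by a run of U residues.
# ASSUMPTION: stem=6, loop sizes 3-8 and min_u=4 are illustrative defaults.

RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string written 5'->3'."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def find_terminators(mrna, stem=6, loops=range(3, 9), min_u=4):
    """Return a list of (stem_start, loop_length) for each motif found."""
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for i in range(len(mrna) - stem + 1):
        left = mrna[i:i + stem]
        for loop in loops:
            j = i + stem + loop               # start of the right arm
            right = mrna[j:j + stem]
            tail = mrna[j + stem:j + stem + min_u]
            if right == reverse_complement(left) and tail == "U" * min_u:
                hits.append((i, loop))
    return hits


if __name__ == "__main__":
    # Hypothetical test transcript: stem GCCGCC, loop GAAA, then six U residues.
    example = "AAC" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUUUU" + "AC"
    print(find_terminators(example))
```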

Vectors for making RNA probes


• Although single-stranded DNA can be used as a sequence probe in hybridization
experiments, RNA probes are preferred. The reasons for this are that the rate of
hybridization and the stability are far greater for RNA–DNA hybrids compared with
DNA–DNA hybrids. To make an RNA probe, the relevant gene sequence is cloned in a
plasmid vector such that it is under the control of a phage promoter. After purification, the
plasmid is linearized with a suitable restriction enzyme and then incubated with the phage
RNA polymerase and the four ribonucleoside triphosphates (Fig. 5.8). No transcription
terminator is required because the RNA polymerase will fall off the end of the linearized
plasmid. There are three reasons for using a phage promoter. First, such promoters are
very strong, enabling large amounts of RNA to be made in vitro. Secondly, the phage
promoter is not recognized by the E. coli RNA polymerase and so no transcription will
occur inside the cell. This minimizes any selection of variant inserts. Thirdly, the RNA
polymerases encoded by phages such as SP6, T7 and T3 are much simpler molecules to
handle than the E. coli enzyme, since the active enzyme is a single polypeptide. If it is
planned to probe RNA or single-stranded DNA sequences, then it is essential to prepare
RNA probes corresponding to both strands of the insert. One way of doing this is to have
two different clones corresponding to the two orientations of the insert. An alternative
method is to use a
cloning vector in which the insert is placed between two different, opposing phage
promoters (e.g. T7/T3 or T7/SP6) that flank a multiple cloning sequence (see Fig. 5.5).
Since each of the two promoters is recognized by a different RNA polymerase, the
direction of transcription is determined by which polymerase is used.
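The effect of choosing one polymerase or the other can be sketched in a few lines. The layout assumed here (T7 promoter reading into the insert along the top strand, T3 promoter reading in from the opposite side) is a hypothetical arrangement for illustration; real vectors differ in which promoter sits on which side.

```python
# Run-off transcripts from an insert flanked by two opposing phage promoters.
# ASSUMED layout:  T7 promoter -->  [insert, top strand 5'->3']  <-- T3 promoter

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string written 5'->3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def runoff_transcript(insert_top_strand, polymerase):
    """Return the RNA (5'->3') made across the insert by the chosen polymerase."""
    top = insert_top_strand.upper()
    if polymerase == "T7":
        rna_as_dna = top                      # same sense as the top strand
    elif polymerase == "T3":
        rna_as_dna = reverse_complement(top)  # same sense as the bottom strand
    else:
        raise ValueError("polymerase must be 'T7' or 'T3' in this sketch")
    return rna_as_dna.replace("T", "U")


if __name__ == "__main__":
    insert = "ATGGCC"
    print(runoff_transcript(insert, "T7"))  # sense-strand probe
    print(runoff_transcript(insert, "T3"))  # complementary probe
```

The two transcripts are reverse complements of each other, which is why a single clone can supply probes for both strands.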

• A further improvement has been introduced by Evans et al. (1995). In their LITMUS
vectors, the polylinker regions are flanked by two modified T7 RNA polymerase promoters.
Each contains a unique restriction site (SpeI or AflII) that has been engineered into the T7
promoter consensus sequence such that cleavage with the corresponding endonuclease
inactivates that promoter. Both promoters are active despite the presence of the engineered
sites. Selective unidirectional transcription is achieved by simply inactivating the other
promoter by digestion with SpeI or AflII prior to in vitro transcription (Fig. 5.9). Since
efficient labelling of RNA probes demands that the template be linearized prior to
transcription, at a site downstream from the insert, cutting at the site within the undesired
promoter performs both functions in one step. Should the cloned insert contain either an
SpeI or an AflII site, the unwanted promoter can be inactivated by cutting at one of the
unique sites within the polylinker.

Vectors for maximizing protein synthesis

• Provided that a cloned gene is preceded by a promoter recognized by the host cell, then
there is a high probability that there will be detectable synthesis of the cloned gene
product. However, much of the interest in the application of recombinant DNA technology
is the possibility of facile synthesis of large quantities of protein, either to study its
properties or because it has commercial value. In such instances, detectable synthesis is
not sufficient: rather, it must be maximized. The factors affecting the level of expression
of a cloned gene are shown in Table 5.2 and are reviewed by Baneyx (1999). Of these
factors, only promoter strength is considered here. It is not enough to select the strongest
promoter possible: the effects of overexpression on the host cell also need to be considered.
Many gene products can be toxic to the host cell even when synthesized in small amounts.
Examples include
surface structural proteins (Beck & Bremer 1980), proteins, such as the PolA gene
product, that regulate basic cellular metabolism (Murray & Kelley 1979), the cystic
fibrosis transmembrane conductance regulator (Gregory et al. 1990) and lentivirus
envelope sequences (Cunningham et al. 1993). If such cloned genes are allowed to be
expressed there will be a rapid selection for mutants that no longer synthesize the toxic
protein. Even when overexpression of a protein is not toxic to the host cell, high-level
synthesis exerts a metabolic drain on the cell. This leads to slower growth and hence in
culture there is selection for variants with lower or no expression of the cloned gene
because these will grow faster. To minimize the problems associated with high-level
expression, it is usual to use a vector in which the cloned gene is under the control of a
regulated promoter. Many different vectors have been constructed for regulated expression
of gene inserts but most of those in current use contain one of the following controllable
promoters: λ PL, T7, trc (tac) or BAD. Table 5.3 shows the different levels of expression
that can be achieved when the
gene for chloramphenicol transacetylase (CAT) is placed under the control of three of
these promoters. The trc and tac promoters are hybrid promoters derived from the lac and
trp promoters (Brosius 1984). They are stronger than either of the two parental promoters
because their sequences are more like the consensus sequence. Like lac, the trc and tac
promoters are inducible by lactose and isopropyl-β-D-thiogalactoside (IPTG). Vectors
using these promoters also carry the lacO operator and the lacI gene, which encodes the
repressor. The pET vectors are a family of expression vectors that utilize phage T7
promoters to regulate synthesis of cloned gene products (Studier et al. 1990).
• The general strategy for using a pET vector is shown in Fig. 5.10. To provide a source of
phage T7 RNA polymerase, E. coli strains that contain gene 1 of the phage have been
constructed. This gene is cloned downstream of the lac promoter, in the chromosome, so
that the phage polymerase will only be synthesized following IPTG induction. The newly
synthesized T7 RNA polymerase will then transcribe the foreign gene in the pET
plasmid. If the protein product of the cloned gene is toxic, it is possible to minimize the
uninduced level of T7 RNA polymerase. First, a plasmid compatible with pET vectors is
selected and the T7 lysS gene is cloned in it. When introduced into a host cell carrying a
pET plasmid, the lysS gene product (T7 lysozyme) will bind any residual T7 RNA polymerase
(Studier 1991, Zhang & Studier 1997). Also, if a lac operator is placed between the T7
promoter and the cloned gene, this will further reduce transcription of the insert in the
absence of IPTG (Dubendorff & Studier 1991). Improvements in the yield of heterologous
proteins can sometimes be achieved by use of selected host cells (Miroux & Walker 1996).
• The λ PL promoter system combines very tight transcriptional control with high levels of
gene expression. This is achieved by putting the cloned gene
under the control of the PL promoter carried on a vector, while the PL promoter is
controlled by a cI repressor gene in the E. coli host. This cI gene is itself under the
control of the tryptophan (trp) promoter (Fig. 5.11). In the absence of exogenous
tryptophan, the cI gene is transcribed and the cI repressor binds to the PL promoter,
preventing expression of the cloned gene. Upon addition of tryptophan, the trp repressor
binds to the trp operator upstream of the cI gene, preventing synthesis of the cI repressor.
In the absence of cI repressor, there is a high
level of expression from the very strong PL promoter. The pBAD vectors, like the ones
based on PL promoter, offer extremely tight control of expression of cloned genes
(Guzman et al. 1995). The pBAD vectors carry the promoter of the araBAD (arabinose)
operon and the gene encoding the positive and negative regulator of this promoter, araC.
AraC is a transcriptional regulator that forms a complex with L-arabinose. In the absence
of arabinose, AraC binds to the O2 and I1
half-sites of the araBAD operon, forming a 210 bp DNA loop and thereby blocking
transcription (Fig. 5.12). As arabinose is added to the growth medium, it binds to AraC,
thereby releasing the O2 site. This in turn causes AraC to bind to the I2 site adjacent to
the I1 site. This releases the DNA loop and allows transcription to begin. Binding of
AraC to I1 and I2 is activated in the presence of cAMP activator protein (CAP) plus cAMP.
If glucose is added to the growth medium,
this will lead to a repression of cAMP synthesis, thereby decreasing expression from the
araBAD promoter. Thus one can titrate the level of cloned gene product by varying the
glucose and arabinose content of the growth medium (Fig. 5.12). According to Guzman
et al. (1995), the pBAD vectors permit fine-tuning of gene expression. All that is required
is to change the sugar composition of the medium. However, this is disputed by others
(Siegele & Hu 1997, Hashemzadeh-Bonehi et al. 1998). Many of the vectors designed
for high-level expression also contain translation-initiation signals optimized for E. coli
expression.
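The switching behaviour of the regulated promoters described in this section can be summarized as simple logic. The sketch below is a purely qualitative model written for illustration: it treats each inducer as present or absent and ignores leaky expression, inducer concentration and the disputed fine-tuning of pBAD. All function names are hypothetical.

```python
# Qualitative on/off logic for three regulated expression systems.
# ASSUMPTION: inducers are modelled as booleans; leakiness and dose are ignored.


def lambda_pl_expression(tryptophan_added: bool) -> bool:
    """PL system: cI is made from the trp promoter only when tryptophan is absent."""
    ci_repressor_made = not tryptophan_added
    return not ci_repressor_made  # PL is active only without cI repressor


def pet_expression(iptg_added: bool) -> bool:
    """pET system: IPTG induces chromosomal T7 RNA polymerase, which transcribes the insert."""
    t7_polymerase_made = iptg_added
    return t7_polymerase_made


def pbad_expression(arabinose_added: bool, glucose_added: bool) -> str:
    """pBAD system: arabinose switches AraC to activation; glucose lowers cAMP."""
    if not arabinose_added:
        return "off"
    return "reduced" if glucose_added else "on"


if __name__ == "__main__":
    print(lambda_pl_expression(tryptophan_added=True))
    print(pet_expression(iptg_added=False))
    print(pbad_expression(arabinose_added=True, glucose_added=True))
```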

Vectors to facilitate protein purification


• Many cloning vectors have been engineered so that the protein being expressed will be
fused to another protein, called a tag, which can be used to facilitate protein purification.
Examples of tags include glutathione-S-transferase, the MalE (maltose-binding) protein
and multiple histidine residues, which can easily be purified by affinity chromatography.
The tag vectors are usually constructed so that the coding sequence for an amino acid
sequence cleaved by a specific protease is inserted between the coding sequence for the
tag and the gene being expressed. After purification, the tag protein can be cleaved off
with the specific protease to leave a normal or nearly normal protein. It is also possible to
include in the tag a protein sequence that can be assayed easily. This permits assay of the
cloned gene product when its activity is not known or when the usual assay is
inconvenient. Three different examples of tags are given below. The reader requiring a
more detailed insight should consult the review by LaVallie and McCoy (1995).
• To use a polyhistidine fusion for purification, the gene of interest is first engineered into
a vector in which there is a polylinker downstream of six histidine residues and a
proteolytic cleavage site. In the example shown in Fig. 5.13, the cleavage site is that for
enterokinase. After induction of synthesis of the fusion protein, the cells are lysed and the
viscosity of the lysate is reduced by nuclease treatment. The lysate is then applied to a
column containing immobilized divalent nickel, which selectively binds the polyhistidine
tag. After washing away any contaminating proteins, the fusion protein is eluted from the
column and treated with enterokinase to release the cloned gene product.
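The final proteolytic step can be sketched in code. Enterokinase recognizes four aspartates followed by a lysine (DDDDK in one-letter code) and cuts after the lysine. The fusion sequence used below is invented for the example, and the two residues preceding the target stand in for hypothetical polylinker-encoded amino acids.

```python
# In silico enterokinase cleavage of a polyhistidine fusion protein.
# ASSUMPTION: the example sequences are invented; only the recognition site
# (Asp)4-Lys with cleavage after the lysine reflects the enzyme's known specificity.

ENTEROKINASE_SITE = "DDDDK"


def cleave_fusion(fusion, site=ENTEROKINASE_SITE):
    """Split `fusion` after the first protease site.

    Returns (tag_with_site, released_product), or None if no site is present.
    """
    position = fusion.find(site)
    if position == -1:
        return None
    cut = position + len(site)
    return fusion[:cut], fusion[cut:]


def has_internal_site(target, site=ENTEROKINASE_SITE):
    """True if the target protein itself would be cut by the protease."""
    return site in target


if __name__ == "__main__":
    tag = "MHHHHHHGS" + ENTEROKINASE_SITE   # His6 tag, short spacer, cleavage site
    product = "EF" + "MKTAYIAK"             # hypothetical linker residues + target
    print(cleave_fusion(tag + product))
    print(has_internal_site("MKTAYIAK"))
```

A check such as `has_internal_site` is worth running before choosing a vector, since a target that carries the site internally would be cut along with the tag and needs a different protease.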
• For the cloned gene to be expressed correctly, it has to be in the correct translational
reading frame. This is achieved by having three different vectors, each with a polylinker in
a different reading frame (see Fig. 5.13). Enterokinase recognizes the sequence (Asp)4Lys
and cleaves immediately after the lysine residue. Therefore, after enterokinase cleavage,
the cloned gene protein will have a few extra amino acids (those encoded by the polylinker
between the cleavage site and the insert) at the N terminus. If desired, the cleavage site
and polyhistidines can be synthesized at the C terminus. If the cloned gene product itself
contains an enterokinase cleavage site, then an alternative protease, such as thrombin or
factor Xa, with a different cleavage site can be used. To facilitate assay of the fusion
proteins, short
antibody recognition sequences can be incorporated into the tag between the affinity label
and the protease cleavage site. Some examples of recognizable epitopes are given in
Table 5.4. These antibodies can be used to detect, by western blotting, fusion proteins
carrying the appropriate epitope. Note that a polyhistidine tag at the C terminus can
function for both assay and purification.
• Biotin is an essential cofactor for a number of carboxylases important in cell
metabolism. The biotin in these enzyme complexes is covalently attached at a specific
lysine residue of the biotin carboxyl carrier protein. Fusions made to a segment of the
carrier protein are recognized in E. coli by biotin ligase, the product of the birA gene,
and biotin is covalently attached in an ATP-dependent reaction. The expressed fusion
protein can be purified using streptavidin affinity chromatography (Fig. 5.14). E. coli
expresses a single endogenous biotinylated protein, but it does not bind to streptavidin
in its native configuration, making the affinity purification highly specific for the
recombinant fusion protein. Biotin on the fusion protein has an additional advantage: its
presence can be detected with enzymes coupled to streptavidin.
• The affinity purification systems described above suffer from the disadvantage that a
protease is required to separate the target protein from the affinity tag. Also, the
protease has to be separated from the protein of interest. Chong et al. (1997, 1998) have
described a unique purification system that has neither of these disadvantages. The
system utilizes a protein splicing element, an intein, from the Saccharomyces cerevisiae
VMA1 gene (see Box 5.2). The intein is modified such that it undergoes a self-cleavage
reaction at its N terminus at low temperatures in the presence of a thiol such as
β-mercaptoethanol.

• The gene encoding the target protein is inserted into a multiple cloning site (MCS) of a
vector to create a fusion between the C terminus of the target gene and the N terminus of
the gene encoding the intein. DNA encoding a small (5 kDa) chitin-binding domain from
Bacillus circulans was added to the C terminus of the intein for affinity purification (Fig.
5.15). The above construct is placed under the control of an IPTG-inducible T7 promoter.
When crude extracts from induced cells are passed through a chitin column, the fusion
protein binds and all contaminating proteins are washed through. The fusion is then
induced to undergo intein-mediated self-cleavage on the column by incubation with a
thiol. This releases the target protein, while the intein–chitin-binding domain remains
bound to the column.

Vectors to promote solubilization of expressed proteins


• One of the problems associated with the overproduction of proteins in E. coli is the
sequestration of the product into insoluble aggregates or ‘inclusion bodies’ (Fig. 5.16).
They were first reported in strains overproducing insulin A and B chains (Williams et al.
1982). At first, their formation was thought to be restricted to the overexpression of
heterologous proteins in E. coli, but they can form in the presence of high levels of
normal E. coli proteins, e.g. subunits of RNA polymerase (Gribskov & Burgess 1983).
Two parameters that can be manipulated to reduce inclusion-body formation are
temperature and growth rate. There are a number of reports which show that lowering the
temperature of growth increases the yield of correctly folded, soluble protein (Schein &
Noteborn 1988, Takagi et al. 1988, Schein 1991). Media compositions and pH values that
reduce the growth rate also reduce inclusion-body formation. Renaturation of misfolded
proteins can sometimes be achieved following solubilization in guanidinium
hydrochloride (Lilie et al. 1998).

• Three ‘genetic’ methods of preventing inclusion-body formation have been described. In
the first of these, the host cell is engineered to overproduce a chaperon (e.g. DnaK,
GroEL or GroES proteins) in addition to the protein of interest (Van Dyk et al. 1989,
Blum et al. 1992, Thomas et al. 1997). Castanie et al. (1997) have developed a series of
vectors which are compatible with pBR322-type plasmids and which encode the
overproduction of chaperons. These vectors can be used to test the effect of chaperons on
the solubilization of heterologous gene products. Even with excess chaperon there is no
guarantee of proper folding. The second method involves making minor changes to the
amino acid sequence of the target protein. For example, cysteine-to-serine changes in
fibroblast growth factor minimized inclusion-body formation (Rinas et al. 1992). The
third method is derived from the observation that many proteins produced as insoluble
aggregates in their native state are synthesized in soluble form as thioredoxin fusion
proteins (LaVallie et al. 1993). More recently, Davis et al. (1999) have shown that the
NusA and GrpE proteins, as well as bacterioferritin, are even better than thioredoxin at
solubilizing proteins expressed at a high level. Kapust and Waugh (1999) have reported
that the maltose-binding protein is also much better than thioredoxin. Building on the
work of LaVallie et al. (1993), a series of vectors has been developed in which the gene
of interest is cloned into an MCS and the gene product is produced as a thioredoxin fusion
protein with an enterokinase cleavage site at the fusion point. After synthesis, the fusion
protein is released from the
producing cells by osmotic shock and purified. The desired protein is then released by
enterokinase cleavage. To simplify the purification of thioredoxin fusion proteins, Lu et
al. (1996) systematically mutated a cluster of surface amino acid residues. Residues 30
and 62 were converted to histidine and the modified (‘histidine patch’) thioredoxin could
now be purified by affinity chromatography on immobilized divalent nickel. An alternative
purification method was developed by Smith et al. (1998). They synthesized a gene in
which a short biotinylation peptide is fused to the N terminus of the thioredoxin gene to
generate a new protein called BIOTRX. They constructed a vector carrying the BIOTRX
gene, with an MCS at the C terminus, and the birA gene. After cloning a gene in the
MCS, a fused protein is produced which can be purified by affinity chromatography on
streptavidin columns.
• An alternative way of keeping recombinant proteins soluble is to export them to the
periplasmic space (see next section). However, even here they may still be insoluble.
Barth et al. (2000) solved this problem by growing the producing bacteria under osmotic
stress (4% NaCl plus 0.5 mol/l sorbitol) in the presence of compatible solutes.
Compatible solutes are low-molecular-weight osmolytes, such as glycine betaine, that
occur naturally in halophilic bacteria and are known to protect proteins at high salt
concentrations. Adding glycine betaine for the cultivation of E. coli under osmotic stress
not only allowed the bacteria to grow under these otherwise inhibitory conditions but also
produced a periplasmic environment for the generation of correctly folded recombinant
proteins.

Vectors to promote protein export

• Gram-negative bacteria such as E. coli have a complex wall–membrane structure
comprising an inner, cytoplasmic membrane separated from an outer membrane by a cell
wall and periplasmic space. Secreted proteins may be released into the periplasm or
integrated into or transported across the outer membrane. In E. coli it has been
established that protein export through the inner membrane to the periplasm or to the
outer membrane is achieved by a universal mechanism known as the general export
pathway (GEP). This involves the sec gene products (for review see Lory 1998). Proteins
that enter the GEP are synthesized in the cytoplasm with a signal sequence at the N
terminus. This sequence is cleaved by a signal or leader peptidase during transport. A
signal sequence has three domains: a positively charged amino-terminal region, a
hydrophobic core, consisting of five to 15 hydrophobic amino acids, and a leader
peptidase cleavage site. A signal sequence attached to a normally cytoplasmic protein
will direct it to the export pathway.
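The domain structure of a signal sequence suggests a crude screening heuristic, sketched below. It checks only two of the three features named above: a positively charged residue near the N terminus and an uninterrupted hydrophobic stretch of 5 to 15 residues. The choice of hydrophobic residues, the five-residue N-region window and the example peptide (modelled on the OmpA signal sequence and to be treated as illustrative input) are all assumptions; the leader peptidase cleavage site is not modelled.

```python
# Crude heuristic for signal-sequence-like N termini.
# ASSUMPTIONS: residue sets, window sizes and the example peptide are illustrative;
# the leader peptidase cleavage site is not checked.

HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")


def longest_hydrophobic_run(peptide):
    """Length of the longest uninterrupted stretch of hydrophobic residues."""
    best = run = 0
    for aa in peptide:
        run = run + 1 if aa in HYDROPHOBIC else 0
        best = max(best, run)
    return best


def looks_like_signal(peptide, n_region=5, core_min=5, core_max=15):
    """True if the peptide has a charged N-region followed by a hydrophobic core."""
    peptide = peptide.upper()
    has_positive = any(aa in POSITIVE for aa in peptide[:n_region])
    core = longest_hydrophobic_run(peptide[n_region:])
    return has_positive and core_min <= core <= core_max


if __name__ == "__main__":
    ompa_like = "MKKTAIAIAVALAGFATVAQA"   # illustrative OmpA-style signal peptide
    cytoplasmic_like = "MSDEQNSTPEGSDDK"  # invented hydrophilic N terminus
    print(looks_like_signal(ompa_like))
    print(looks_like_signal(cytoplasmic_like))
```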
• Many signal sequences derived from naturally occurring secretory proteins (e.g. OmpA,
β-lactamase, alkaline phosphatase and phage M13 gIII) support the efficient translocation
of heterologous peptides across the inner membrane when fused to their amino termini. In
some cases, however, the preproteins are not readily exported and either become ‘jammed’
in the inner membrane, accumulate in precursor inclusion bodies or are rapidly degraded
within the cytoplasm. In practice, it may be necessary to try several signal sequences
(Berges et al. 1996) and/or overproduce different chaperons to optimize the translocation
of a particular heterologous protein. A first step would be to try the secretion vectors
that are offered by a number of molecular biology suppliers and are variants of the
vectors described above.
• It is possible to engineer proteins such that they are transported through the outer
membrane and are secreted into the growth medium. This is achieved by making use of
the type I, Sec-independent secretion system. The prototype type I system is the
haemolysin transport system, which requires a short carboxy-terminal secretion signal,
two translocators (HlyB and HlyD), and the outer-membrane protein TolC. If the protein of
interest is fused to the carboxy-terminal secretion signal of a type I-secreted protein, it
will be secreted into the medium provided HlyB, HlyD and TolC are expressed as well
(Spreng et al. 2000). An alternative presentation of recombinant proteins is to express
them on the surface of the bacterial cell using any one of a number of carrier proteins (for
review, see Cornelis 2000).
