RESEARCH
Biology Group, MS-M888 and 2Biophysics Group, MS-P244, Los Alamos National Laboratory, Los Alamos, NM 87545. 3University of New Mexico,
Albuquerque, NM 87131. *Corresponding author (e-mail: waldo@telomere.lanl.gov).
Formation of the chromophore of green fluorescent protein (GFP) depends on the correct folding of the
protein. We constructed a folding reporter vector, in which a test protein is expressed as an N-terminal
fusion with GFP. Using a test panel of 20 proteins, we demonstrated that the fluorescence of Escherichia
coli cells expressing such GFP fusions is related to the productive folding of the upstream protein
domains expressed alone. We used this fluorescent indicator of protein folding to evolve proteins that are
normally prone to aggregation during expression in E. coli into closely related proteins that fold robustly
and are fully soluble and functional. This approach to improving protein folding does not require functional assays for the protein of interest and provides a simple route to improving protein folding and expression by directed evolution.
Keywords: protein folding, solubility, reporter, aggregation, directed evolution, green fluorescent protein, inclusion body
http://biotech.nature.com
mophilic archaeon Pyrobaculum aerophilum (ref. 12), in E. coli at 37°C as N-terminal GFP fusions. Because protein insolubility can result from
other factors besides misfolding on the ribosome, we focused on
cytoplasmic proteins, excluding candidates with homologies to
known membrane-bound, membrane-associated, or periplasmic
proteins. Striking gene-dependent (up to 50-fold) differences in
whole-cell GFP fluorescence (Fig. 1) could not be explained by the
total expression levels, which varied at most by 20% over the entire
set of proteins (data not shown). Instead, the fluorescence was
directly related to the fraction of the overexpressed protein found in
the supernatant of lysed cells expressing the corresponding nonfusion protein under identical conditions (Fig. 1).
The correlation between nonfusion solubility and GFP fusion
fluorescence is not perfect. For example, the solubility of protein 8
(purine-nucleoside phosphorylase) is underestimated, whereas that
of protein 9 (soluble hydrogenase) is overestimated (Fig. 1).
Nonetheless, failure of the GFP chromophore to form in the fusion
context is surprisingly well correlated with the likelihood that the
protein of interest will be aggregated when expressed without the
GFP tag. Our results are consistent with the hypothesis that the GFP
folding trajectory is sensitive to the misfolding of the fused upstream
proteins. Based on the limited data set of 20 proteins presented in
Figure 1, and assuming that GFP fluorescence is an indicator of the
correct tertiary folding of the GFP moiety (ref. 11), the threshold of full solubility (implying correct folding and avoidance of aggregation) for
the test protein expressed alone occurs when the downstream GFP
domain has an approximately 90% probability of misfolding in the
fusion context.
Figure 1. Solubility of proteins expressed in E. coli is highly correlated
with fluorescence of E. coli expressing corresponding GFP fusions.
Randomly selected proteins from Pyrobaculum aerophilum in order
of increasing GFP fusion fluorescence: (1) tartrate dehydratase β-subunit; (2) nucleoside-diphosphate kinase; (3) tyrosine tRNA
synthetase; (4) polysulfide reductase subunit; (5) methyltransferase;
(6) GTP cyclohydrolase I; (7) aspartate-semialdehyde dehydrogenase; (8) purine-nucleoside phosphorylase; (9) soluble
hydrogenase; (10) cysteine tRNA synthetase; (11) 3-hexulose 6-phosphate synthase; (12) nirD protein; (13) C-type cytochrome
biogenesis factor; (14) phosphate cyclase; (15) hydrogenase
expression/formation protein (hypE); (16) chorismate mutase; (17)
DNA-directed RNA polymerase; (18) ribosomal protein S9p; (19)
translation initiation factor; (20) sulfite reductase (dissimilatory
subunit). GFP, soluble variant of GFP expressed alone. Dashed line
indicates threshold above which test proteins are fully soluble. Axes: fraction soluble; ln(fluorescence).
In vitro transcription/translation. Using six additional proteins
of bacterial and vertebrate origin, we repeated the GFP fusion experiment with an in vitro protein synthesis system in which the bulk
concentration of newly synthesized polypeptides is reduced by a factor of at least 1,000 relative to their concentration in E. coli (refs 13, 14). The
production of the fluorescent GFP fusion protein was initiated by
addition of the DNA template and appeared to be complete within
about 30 min at 37°C. When normalized by a control expressing GFP
alone, the GFP fusion fluorescence in the in vitro system and in E. coli closely agreed (Table 1). This lack of dependence on the bulk
concentration of translated polypeptides implies that the GFP fluorescence reports folding of the fusion partner occurring cotranslationally (ref. 15), or soon after translation.
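The normalization used for the in vivo and in vitro comparison can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the raw readings are made-up placeholders, and the 0.10 cutoff is only our reading of the statement that full solubility corresponds to roughly a 90% probability of GFP misfolding.

```python
# Assumed cutoff: a reading of "approximately 90% probability of misfolding",
# i.e. about 10% of the GFP-alone signal. Not a value stated by the authors.
ASSUMED_SOLUBILITY_THRESHOLD = 0.10


def normalized_fluorescence(fusion_signal: float, gfp_alone_signal: float) -> float:
    """Express a fusion's fluorescence as a fraction of the GFP-alone control."""
    if gfp_alone_signal <= 0:
        raise ValueError("GFP-alone control signal must be positive")
    return fusion_signal / gfp_alone_signal


def predicted_fully_soluble(
    normalized: float, threshold: float = ASSUMED_SOLUBILITY_THRESHOLD
) -> bool:
    """Return True when the normalized signal is at or above the assumed cutoff."""
    return normalized >= threshold


# Hypothetical raw readings in arbitrary units (not data from the paper).
control = 1000.0
for label, raw in [("dim fusion", 30.0), ("bright fusion", 500.0)]:
    norm = normalized_fluorescence(raw, control)
    print(label, round(norm, 3), predicted_fully_soluble(norm))
```

Dividing by the GFP-alone control puts readings from E. coli cells and from the coupled transcription/translation system on the same scale, so they can be compared directly.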
Evolution of protein folding. The GFP folding reporter distinguishes proteins that fold robustly and are highly soluble when
expressed in E. coli from those that tend to misfold and aggregate.
Such a reporter system could be used in a directed evolution
process (refs 16, 17) to evolve proteins that normally misfold into related ones
that fold properly. As a test of directed evolution of protein folding
we chose the mutant C33T of gene V protein (ref. 18) and bullfrog H-subunit ferritin (refs 19, 20). Beginning with DNA encoding the insoluble wild-type proteins, we used DNA shuffling (ref. 6) to generate and recombine
mutations, and the GFP folding reporter to identify variants with
improved folding. Each protein was subjected to four rounds of forward evolution to generate soluble variants, followed by three
rounds of backcrossing (ref. 6) against parental DNA to remove nonessential mutations. With each cycle of evolution, both the nonfusion solubility and GFP fusion fluorescence increased (Fig. 2).
Robustly folding ferritin variants. The rapid Fe-mineralization
phenotype of H-subunit ferritin requires at least seven key amino
acids (ref. 21). Thus, the ferritin system can be used to test whether directed
evolution of protein folding can be accomplished without a loss of
function. Thirty of the ferritin clones that were most fluorescent
when expressed with the GFP reporter system were sequenced. These
comprised just three variants we designated HM-1 (N47D + Q55L +
E58R + T93P + G146E), HM-2 (N47D + E58K + E59A + T93P +
G146E), and HM-3 (K53R + Q55R + T93P + G146E). HM-3 also
contained two silent mutations at Asp120 (GAG to GAC) and at
Gln138 (CAG to CAA). These changed the codon usage without
changing the amino acid being coded. The variants HM-1 and HM692
Table 1 column headings: E. coli cells (a); coupled transcription and translation (b)
Insoluble proteins: Bullfrog H-subunit ferritin; Gene V (C33T); XylR
0.034 ± 0.004; 0.030 ± 0.005; 0.023 ± 0.003; 0.031 ± 0.003; 0.041 ± 0.005; 0.031 ± 0.003
Soluble proteins: Bullfrog L-subunit ferritin; Gene V (wild type); Maltose-binding protein
0.58 ± 0.02; 0.40 ± 0.03; 0.43 ± 0.02; 0.53 ± 0.03; 0.43 ± 0.02; 0.50 ± 0.03
These observations and the results described in this work underscore the sensitivity of the GFP folding trajectory to the expression
environment. It will be interesting to see whether new GFP variants
that abolish the sensitivity of the GFP chromophore formation to
the presence of misfolded fused protein domains can be generated.
Using the accessibility of small fused domains to indicate protein
folding can yield paradoxical results (ref. 26), because such domains may
be accessible or partially functional even when the fusion partner is
aggregated or misfolded. In contrast, GFP fluorescence is a robust,
direct, sensitive indicator of the productive folding of fused protein
domains. We successfully improved the folding and solubility of the
multimeric protein ferritin, in which 24 subunits associate to form a large, stable protein assembly. The method worked
even though the subunit C termini of many members of the
ferritin family tend to become buried during folding (ref. 26). This suggests that the fluorescence of the GFP translational fusion could
provide a convenient metric for driving the evolution of other protein scaffolds as well.
In this work, for the first time, functional and soluble variants of
a normally insoluble protein have been obtained without recourse to
functional screens for the protein of interest. Where protein function
is being modified by random mutation and functional screens are
time-consuming or difficult, our method can be used to preadapt
proteins for improved folding robustness and to screen mutants for
folding potential. GFP fluorescence can be assessed on the basis of a
single cell, so large numbers of clones (>10^6) can be rapidly screened
for solubility by fluorescence-activated cell sorting (ref. 27). Our demonstration that the GFP folding reporter system works both in vivo and
in vitro opens up the possibility of high-throughput genome-wide in
vitro screening of protein folding from PCR-amplified genes, facilitating a class-directed approach to structural proteomics (ref. 28). In addition to improving protein expression, this approach should have
wide applicability to the design of novel protein structures, theoretical and empirical studies of protein folding, screening large numbers
of proteins and protein fragments for solubility, finding and modifying efficient folding partners, and even engineering hosts for
improved protein expression. The use of GFP as a sensitive fluorescent indicator of protein folding should enable the evolution of
closely related sets of polypeptides that differ in their ability to
fold, thereby shedding new light on the folding code.
Experimental protocol
Cloning. Genes coding test proteins were amplified by conventional PCR
from plasmids available in-house (gene V and xylR), plasmids purchased
from commercial sources (maltose-binding protein, malE; Invitrogen, San
Diego, CA), or genomic DNA (P. aerophilum). Bullfrog H-subunit and L-subunit ferritin genes were cloned from Rana catesbeiana tadpole red cells by
reverse-transcription PCR using a commercially available kit (Perkin-Elmer,
Foster City, CA) according to the manufacturer's instructions. Gene V
C33(TGT)→T33(ACT) was engineered using conventional PCR techniques.
Incorporating two nucleotide changes in the codon guarded against trivial mutation to the soluble wild-type sequence (i.e., by the reversion T33C) in subsequent directed-evolution experiments. Clones were isolated and sequences verified by dye-terminator sequencing. Specific ferritin mutants were engineered by overlap
PCR.
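The reasoning behind the TGT to ACT codon choice can be checked with a short calculation. The sketch below counts how many single-base changes separate the engineered threonine codon from each cysteine codon in the standard genetic code; the function name is ours and the example is illustrative only.

```python
# Cysteine codons in the standard genetic code.
CYS_CODONS = ("TGT", "TGC")

# Codons stated in the text for position 33 of gene V protein.
WILD_TYPE_CODON = "TGT"   # Cys33
ENGINEERED_CODON = "ACT"  # Thr33


def point_changes(codon_a: str, codon_b: str) -> int:
    """Count positions at which two equal-length codons differ."""
    if len(codon_a) != len(codon_b):
        raise ValueError("codons must have the same length")
    return sum(a != b for a, b in zip(codon_a, codon_b))


changes_to_each_cys = {c: point_changes(ENGINEERED_CODON, c) for c in CYS_CODONS}
min_changes_to_revert = min(changes_to_each_cys.values())

print(changes_to_each_cys)
print(min_changes_to_revert)
```

Because every route from ACT back to a cysteine codon needs at least two simultaneous base changes, a single point mutation introduced during shuffling cannot restore the wild-type residue.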
GFP folding reporter. The BglII/XhoI fragment of pET21(a+) (Novagen,
Madison, WI) was inserted into the corresponding site of pET28(a+), and the
BamHI/EcoRI site was replaced with the DNA fragment GGATCCGCTGGCTCCGCTGCTGGTTCTGGCGAATTC coding for amino acid linker
GSAGSAAGSGEF. We avoided large bulky hydrophobic residues in designing
the linker. A longer (GGGS)4 linker was also tried, and did not appear to
change the performance of the folding reporter. We chose to use the shorter
GSAGSAAGSGEF linker because it reduced the amount of homologous
repeats in the coding sequence, which could have resulted in deletions by
homologous recombination during the shuffling protocol. A soluble GFP
variant was engineered based on a variant that folds well in E. coli (ref. 22) using site-directed mutagenesis to eliminate the internal NdeI and BamHI sites and incorporate the red-shift S65T mutation (ref. 27) and the folding mutation F64L (ref. 29),
and inserted into the EcoRI/XhoI site of the vector. The NdeI/BamHI cloning
site was replaced by the frameshift stuffer with three translational stops
CATATGTGTTAACTGAGTAGGATCC, and the resulting vector digested
with NdeI and BamHI to receive inserts.
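Both sequences given above can be verified by translating them with the standard genetic code. The following sketch is an illustrative check, with helper names of our own choosing. It confirms the linker peptide, then reads the stuffer in all three frames starting at the ATG of the NdeI site.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon; '*' marks a stop, trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Sequences copied from the text.
LINKER_DNA = "GGATCCGCTGGCTCCGCTGCTGGTTCTGGCGAATTC"
STUFFER_DNA = "CATATGTGTTAACTGAGTAGGATCC"

linker_peptide = translate(LINKER_DNA)

# Read the stuffer in each of the three frames, beginning at the start codon.
start = STUFFER_DNA.index("ATG")
stuffer_frames = [translate(STUFFER_DNA[start + offset:]) for offset in range(3)]

print(linker_peptide)
for offset, peptide in enumerate(stuffer_frames):
    print(offset, peptide, "*" in peptide)
```

The linker translates to GSAGSAAGSGEF, and each reading frame of the stuffer contains a stop codon, so an uncut or self-ligated vector should not produce a GFP fusion in any frame.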
Fluorescence measurements and protein solubility. Cultures were
grown at 37°C in Luria-Bertani (LB) media containing 30 µg/ml kanamycin
and induced with 1 mM isopropylthiogalactoside (IPTG) at the indicated temperature. Cells were diluted to OD600 nm = 0.15 in 10 mM Tris, pH 7.5, 0.15
bound Fe(II). The Fe(III) zones were developed as described (ref. 32). Briefly, the membrane
was treated with a solution of 1% HCl + 1% potassium ferrocyanide
(Turnbull Blue reaction) at ambient temperature (~24°C) for 10 min. After
copious washing with distilled water, the Prussian blue spots were intensified by treating with 10 mM H2O2 + 10 mM diaminobenzidine in 10 mM
Tris, pH 8.0 (buffer C), for 5 min in the dark. The membrane was copiously
washed with distilled water, transferred to a Petri plate, and scanned on a
Hewlett-Packard 5P flatbed scanner while still moist.
Acknowledgments
13. Zubay, G. In vitro synthesis of protein in microbial systems. Annu. Rev. Genet. 7, 267–287 (1973).
14. Neidhardt, F.C. in Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology (ed. Neidhardt, F.C.) 3–6 (American Society of Microbiology, Washington, DC; 1987).
15. Fedorov, A.N. & Baldwin, T.O. Cotranslational protein-folding. J. Biol. Chem. 272, 32715–32718 (1997).
16. Arnold, F.H. Directed evolution: creating biocatalysts for the future. Chem. Eng. Sci. 51, 5091–5102 (1996).
17. Zhao, H.M. & Arnold, F.H. Optimization of DNA shuffling for high-fidelity recombination. Nucleic Acids Res. 25, 1307–1308 (1997).
18. Terwilliger, T.C., Zabin, H.B., Horvath, M.P., Sandberg, W.S. & Schlunk, P.M. In-vivo characterization of mutants of the bacteriophage-F1 gene-V protein isolated by saturation mutagenesis. J. Mol. Biol. 236, 556–571 (1994).
19. Dickey, L.F. et al. Differences in the regulation of messenger-RNA for housekeeping and specialized-cell ferritin: a comparison of 3 distinct ferritin complementary DNAS, the corresponding subunits, and identification of the 1st processed pseudogene in Amphibia. J. Biol. Chem. 262, 7901–7907 (1987).
20. Waldo, G.S. & Theil, E.C. in Ferritin and Iron Biomineralization. Comprehensive Supramolecular Chemistry 5. (vol. ed. Susslick, K.) 65–91 (Pergamon Press, Elsevier Science Ltd, Oxford, UK; 1996).
21. Harrison, P.M. & Arosio, P. The ferritins: molecular-properties, iron storage function and cellular-regulation. Biochim. Biophys. Acta-Bioenerg. 1275, 161–203 (1996).
22. Crameri, A., Whitehorn, E.A., Tate, E. & Stemmer, W.P.C. Improved green fluorescent protein by molecular evolution using DNA shuffling. Nat. Biotechnol. 14, 315–319 (1996).
23. Heim, R., Prasher, D.C. & Tsien, R.Y. Wavelength mutations and posttranslational autooxidation of green fluorescent protein. Proc. Natl. Acad. Sci. USA 91, 12501–12504 (1994).
24. Reid, B.G. & Flynn, G.C. Chromophore formation in green fluorescent protein. Biochemistry 36, 6786–6791 (1997).
25. Makino, Y., Amada, K., Taguchi, H. & Yoshida, M. Chaperonin-mediated folding of green fluorescent protein. J. Biol. Chem. 272, 12468–12474 (1997).
26. Jappelli, R., Luzzago, A., Tataseo, P., Pernice, I. & Cesareni, G. Loop mutations can cause a substantial conformational change in the carboxy terminus of the ferritin protein. J. Mol. Biol. 227, 532–543 (1992).
27. Cormack, B.P., Valdivia, R.H. & Falkow, S. FACS-optimized mutants of the green fluorescent protein (GFP). Gene 173, 33–38 (1996).
28. Terwilliger, T.C. et al. Class-directed structure determination: foundation for a protein structure initiative. Protein Sci. 9, 1851–1856 (1998).
29. Heim, R., Cubitt, A.B. & Tsien, R.Y. Improved green fluorescence. Nature 373, 663–664 (1995).
30. Zhang, Y. et al. Expression of eukaryotic proteins in soluble form in Escherichia coli. Protein Expr. Purif. 12, 159–165 (1998).
31. http://rsb.info.nih.gov/nih-image/. NIH-Image is a public domain image processing program developed at the U.S. National Institutes of Health.
32. Moos, T. & Mollgard, K. A sensitive post-DAB enhancement technique for demonstration of iron in the central-nervous-system. Histochem. J. 99, 471–475 (1993).
Copyright of Nature Biotechnology is the property of Nature Publishing Group and its content may not be
copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written
permission. However, users may print, download, or email articles for individual use.