Ebook Directed Evolution Methods and Protocols Methods in Molecular Biology 2461 Andrew Currin Editor Online PDF All Chapter
Methods in
Molecular Biology 2461
Andrew Currin
Neil Swainston Editors
Directed
Evolution
Methods and Protocols
METHODS IN MOLECULAR BIOLOGY
Series Editor
John M. Walker
School of Life and Medical Sciences
University of Hertfordshire
Hatfield, Hertfordshire, UK
Edited by
This Humana imprint is published by the registered company Springer Science+Business Media, LLC part of Springer
Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
Preface
Directed evolution (DE) is a powerful approach for the engineering of biological molecules,
often employed to introduce novel desirable characteristics. Successful DE projects require
the integration of a number of different key approaches, including computational design,
DNA mutagenesis and cloning, biochemistry, and screening. With such an array of skills
required, detailed methodologies and protocols provide an invaluable insight into the latest
techniques. We are therefore delighted to share the new chapters in this edition and hope
that these provide a useful resource to incorporate into new experimental workflows. We
would like to thank all the contributing authors for their efforts in writing these chapters,
particularly as this has been completed during the COVID-19 pandemic.
In recent years, advances in technology have permitted DE to be employed at a larger scale,
at high (<10^6 samples per day) and ultra-high (>10^6 samples per day) throughput. These
approaches rely greatly on in silico capabilities to design, process, and analyze the
experiments, which often generate large datasets. Such experiments are enhanced through the use of a
learn process, to understand and predict the relationship between genotype and phenotype.
This edition of Methods in Molecular Biology attempts to equip an experimenter with the
latest techniques in DE, covering aspects at each stage of the Design-Build-Test-Learn cycle.
Contents
Preface
Contributors
1 Designing Overlap Extension PCR Primers for Protein Mutagenesis: A Programmatic Approach
Xiaofang Huang, Liangting Xu, Chuyun Bi, Lili Zhao, Limei Zhang, Xuanyang Chen, Shiqian Qi, and Shiqiang Lin
2 Recombination of Single Beneficial Substitutions Obtained from Protein Engineering by Computer-Assisted Recombination (CompassR)
Haiyang Cui, Mehdi D. Davari, and Ulrich Schwaneberg
3 Nondegenerate Saturation Mutagenesis: Library Construction and Analysis via MAX and ProxiMAX Randomization
Anupama Chembath, Ben P. G. Wagstaffe, Mohammed Ashraf, Marta M. Ferreira Amaral, Laura Frigotto, and Anna V. Hine
4 Antha-Guided Automation of Darwin Assembly for the Construction of Bespoke Gene Libraries
P. Handal-Marquez, M. Koch, D. Kestemont, S. Arangundy-Franklin, and V. B. Pinheiro
5 SpeedyGenesXL: an Automated, High-Throughput Platform for the Preparation of Bespoke Ultralarge Variant Libraries for Directed Evolution
Joanna C. Sadler, Neil Swainston, Mark S. Dunstan, Andrew Currin, and Douglas B. Kell
6 Facile Assembly of Combinatorial Mutagenesis Libraries Using Nicking Mutagenesis
Monica B. Kirby and Timothy A. Whitehead
7 GeneORator: An Efficient Method for the Systematic Mutagenesis of Entire Genes
Lucy Green, Nigel S. Scrutton, and Andrew Currin
8 Rapid Cloning of Random Mutagenesis Libraries Using PTO-QuickStep
Pawel Jajesniak, Kang Lan Tee, and Tuck Seng Wong
9 Construction of Strong Promoters by Assembling Sigma Factor Binding Motifs
Yonglin Zhang, Yang Wang, Jianghua Li, Chao Wang, Guocheng Du, and Zhen Kang
10 Application of Restriction Free (RF) Cloning in Circular Permutation
Boudhayan Bandyopadhyay and Yoav Peleg
11 Site-Directed Mutagenesis Method Mediated by Cas9
Wanping Chen, Wenwen She, Aitao Li, Chao Zhai, and Lixin Ma
Index
Contributors
JEAN CHRISTOPHE GELLY • Laboratoire d’Excellence GR-Ex, Paris, France; BIGR, DSIMB,
UMR_S1134, INSERM, University of Paris & University of Reunion, Paris, France
LUCY GREEN • Manchester Synthetic Biology Research Centre for Fine and Speciality
Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, School of Chemistry,
Faculty of Science and Engineering, University of Manchester, Manchester, UK
P. HANDAL-MARQUEZ • Rega Institute, KU Leuven, Leuven, Belgium
ANNA V. HINE • College of Health and Life Sciences, Aston University, Aston Triangle,
Birmingham, UK
XIAOFANG HUANG • Key Laboratory of Crop Biotechnology, Fujian Agriculture and Forestry
University, Fujian Province Universities, Fuzhou, China; College of Life Science, Fujian
Agriculture and Forestry University, Fuzhou, China
PAWEL JAJESNIAK • Department of Chemical and Biological Engineering, ChELSI Institute
and Advanced Biomanufacturing Centre, University of Sheffield, England, UK
ZHEN KANG • The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of
Education, Jiangnan University, Wuxi, China; The Key Laboratory of Industrial
Biotechnology, Ministry of Education, School of Biotechnology, Wuxi, China; The Science
Center for Future Foods, Jiangnan University, Wuxi, China
DOUGLAS B. KELL • Department of Biochemistry and Systems Biology, Institute of Systems,
Molecular and Integrative Biology, University of Liverpool, Liverpool, UK; The Novo
Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kgs
Lyngby, Denmark
D. KESTEMONT • Rega Institute, KU Leuven, Leuven, Belgium
MONICA B. KIRBY • Department of Chemical and Biological Engineering, University of
Colorado, Boulder, CO, USA
M. KOCH • Synthace Ltd., London, UK
AITAO LI • State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei
Collaborative Innovation Center for Green Transformation of Bio-resources, School of Life
Sciences, Hubei University, Wuhan, People’s Republic of China
JIANGHUA LI • The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of
Education, Jiangnan University, Wuxi, China; The Key Laboratory of Industrial
Biotechnology, Ministry of Education, School of Biotechnology, Wuxi, China
SHIQIANG LIN • Key Laboratory of Crop Biotechnology, Fujian Agriculture and Forestry
University, Fujian Province Universities, Fuzhou, China; College of Life Science, Fujian
Agriculture and Forestry University, Fuzhou, China
LIXIN MA • State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei
Collaborative Innovation Center for Green Transformation of Bio-resources, School of Life
Sciences, Hubei University, Wuhan, People’s Republic of China
LEOPOLDO FERREIRA MARQUES MACHADO • Manchester Institute of Biotechnology (MIB), The
University of Manchester, Manchester, UK; Department of Chemistry, The University of
Manchester, Manchester, UK
YOAV PELEG • Structural Proteomics Unit (SPU), Life Sciences Core Facilities (LSCF),
Weizmann Institute of Science, Rehovot, Israel
V. B. PINHEIRO • Rega Institute, KU Leuven, Leuven, Belgium; Institute of Structural and
Molecular Biology, University College London, London, UK
SHIQIAN QI • Department of Urology, State Key Laboratory of Biotherapy, West China
Hospital, College of Life Sciences, Sichuan University, Chengdu, China
ANNA JOËLLE RUFF • Institute of Biotechnology, RWTH Aachen University, Aachen,
Germany
Abstract
Overlap extension PCR is routinely used to generate mutagenic genes for functional and structural
studies of proteins. However, designing the overlapping mutagenic primers and the gene primers
manually is time-consuming. In this chapter, we present a Python script that searches all possible
primer combinations according to preset definitions and calculates the necessary parameters of each
primer, which facilitates the primer design process. The script provides up to 256 primer pairs
for selection.
Key words Site-directed mutagenesis, Overlap extension PCR, Overlapping mutagenic primers,
Primer design
1 Introduction
Andrew Currin and Neil Swainston (eds.), Directed Evolution: Methods and Protocols, Methods in Molecular Biology, vol. 2461,
https://doi.org/10.1007/978-1-0716-2152-3_1, © Springer Science+Business Media, LLC, part of Springer Nature 2022
Fig. 1 The procedure of overlap extension PCR for producing mutagenic genes. The sense strand of the full-
length gene is shown with a long thick blue arrow. The antisense strand of the full-length gene is shown with a
long thick brown arrow. The full-length gene is used as a template to amplify the mutagenic upstream
fragment with the forward gene primer (thin blue arrow) and the reverse overlapping mutagenic primer (thin
brown arrow with three green triangles representing the mutagenic codon), and the mutagenic downstream
fragment with the forward overlapping mutagenic primer (thin blue arrow with three purple triangles
representing the mutagenic codon) and the reverse gene primer (thin brown arrow), respectively. In the
third PCR reaction, the mutagenic upstream fragment and the mutagenic downstream fragment, which can
bridge when annealing, are used as the template to generate the full-length mutagenic gene with gene
primers
2 Materials
2.3 Algorithm
A schematic view of the algorithm is shown in Fig. 2. Based on our experience, asymmetric
primers with 3′ overhangs balance the requirements of amplifying the upstream fragment,
amplifying the downstream fragment, and overlapping the two fragments in the third PCR run.
The parameters defining the search ranges are shown in Fig. 2. As shown in the figure,
i ranges from 0 to 3, so there are four possible lengths for the upstream flanking region of
the forward overlapping mutagenic primer; j also ranges from 0 to 3, so there are four
possible lengths for the downstream flanking region of the forward overlapping mutagenic
primer. Thus, there are 4 × 4 = 16 possible lengths for the forward overlapping mutagenic
primer. Likewise, the number of possible reverse overlapping mutagenic primers is also 16.
Taking both the forward and the reverse overlapping mutagenic primers into account, the
number of all possible primer pairs is 16 × 16 = 256. The program lists the parameters for
each pair of primers, including the sequence, length, GC percentage, and melting temperature.
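The search space described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' script: the function names, the example template sequence, and the orientation chosen for the reverse primer (15 + l nucleotides upstream and 9 + k nucleotides downstream of the codon in sense-strand coordinates) are all hypothetical. Melting temperature is left out to keep the sketch free of external dependencies.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def enumerate_primer_pairs(gene, codon_start, new_codon):
    """List every (i, j, k, l) primer pair for one codon replacement.

    codon_start is the 0-based index of the first base of the target codon.
    """
    codon_end = codon_start + 3
    # The longest flank is 15 + 3 = 18 nt on either side of the codon.
    if codon_start < 18 or codon_end + 18 > len(gene):
        raise ValueError("codon is too close to the end of the gene")
    mutated = gene[:codon_start] + new_codon + gene[codon_end:]
    pairs = []
    for i, j, k, l in product(range(4), repeat=4):
        forward = mutated[codon_start - (9 + i):codon_end + (15 + j)]
        # Assumed orientation: in sense coordinates the reverse primer spans
        # 15 + l nt upstream and 9 + k nt downstream of the codon.
        reverse = reverse_complement(
            mutated[codon_start - (15 + l):codon_end + (9 + k)]
        )
        pairs.append({
            "ijkl": (i, j, k, l),
            "forward": forward,
            "reverse": reverse,
            "forward_gc": gc_percent(forward),
            "reverse_gc": gc_percent(reverse),
        })
    return pairs


# Hypothetical 60-nt template; the codon at index 27 is replaced by GCG.
GENE = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGAT"
PAIRS = enumerate_primer_pairs(GENE, 27, "GCG")
print(len(PAIRS), "primer pairs enumerated")
```

Each of the four indices takes four values, so the loop visits 4 × 4 × 4 × 4 = 256 combinations, matching the count given in the text.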
Fig. 2 Asymmetric primers with overhangs for overlap extension PCR for mutagenesis. The forward
overlapping mutagenic primer is shown with a thick blue arrow consisting of the mutagenic codon (purple
triangles), with flanking 9 + i nucleotides and 15 + j nucleotides at the N- and C-terminus respectively.
The reverse overlapping mutagenic primer is shown with a thick brown arrow consisting of the mutagenic
codon (green triangles), with flanking regions of 9 + k nucleotides and 15 + l nucleotides. The values i, j, k,
and l represent the lengths of the flanking regions within the primers, which define the primers’ lengths to be
searched by the program. The ranges of i, j, k, and l are all between 0 and 3 nucleotides (including three
nucleotides). The overlapping region between the two primers, critical for bridging the two mutated fragments
in generating the full-length mutagenic gene, is shown with brown dashes
3 Methods
……
……
4 Notes
Acknowledgments
Chapter 2
Abstract
A large number of beneficial substitutions can be obtained from a successful directed enzyme evolution
campaign and/or (semi)rational design. Recombining some of these beneficial substitutions is expected
to yield a much higher degree of performance through synergistic effects. However, systematic
recombination studies show that poorly performing variants are often obtained after recombination of
three to four individual beneficial substitutions, which limits the ability of protein engineers to
exploit nature’s potential in generating better-performing enzymes. The Computer-assisted Recombination
(CompassR) strategy allows identified beneficial substitutions to be recombined effectively and
efficiently in order to generate active enzymes with improved performance. Here, we describe in detail
the CompassR procedure with an example of recombining four substitutions and discuss some important
practical issues that should be considered (such as the selection of protein structures, number of
FoldX runs, and evaluation of calculations) when applying the CompassR rule. The core part of this
protocol (system setup, ΔΔGfold calculation, and CompassR application) is transferable to other enzymes
and to any recombination of single beneficial substitutions.
Key words Protein engineering, Directed evolution, Single substitution, Recombination, The relative
free energy of folding (ΔΔGfold), FoldX
1 Introduction
Andrew Currin and Neil Swainston (eds.), Directed Evolution: Methods and Protocols, Methods in Molecular Biology, vol. 2461,
https://doi.org/10.1007/978-1-0716-2152-3_2, © Springer Science+Business Media, LLC, part of Springer Nature 2022
Fig. 1 Overview of all BSLA recombinants generated in the recombination of each category (“intra-category”)
and the beneficial substitutions F17S, V54K, and G155P with beneficial substitutions from categories A (light
green), B (light blue), and C (light purple) (“intercategory”). Categories (A, B, and C; on the left) are composed
of 13 selected beneficial substitutions obtained from the BSLA-SSM library and grouped according to their
ΔΔGfold values. Notations of recombinants: dark green: residual activity (in buffer) ≥80% of the BSLA wild
type activity. Orange: residual activity (in buffer) between 10–80% of the BSLA wild type activity. Red: residual
activity (in buffer) is between 0 and 10% of the BSLA wild type activity and referred to as “inactive”
recombinant
Fig. 2 Computer-assisted Recombination (CompassR) rule workflow. Preparation: The initial enzyme structure
and substitution list are needed, and the FoldX plugin for YASARA needs to be downloaded and installed. Step 1:
Load the PDB file of wild type enzyme. Step 2: Select the target substitution for calculation of ΔΔGfold. Step 3:
Set parameters for FoldX. Step 4: Collection of ΔΔGfold results and alignment with CompassR rule; Step 5:
Make the recombination plan for beneficial substitutions
2 Material
2.1 The Initial Enzyme Structure and Substitution List
1. The initial BSLA wild-type crystal structure (as an example, we use PDB ID 1i6w [47],
Chain A, resolution 1.5 Å) is taken from the Protein Data Bank (www.rcsb.org) (see Note 2).
2. The PDB file is renamed after removing water molecules and other ligands, for example,
1i6w_A_noSOL.pdb (see Note 3).
3. A list of beneficial substitutions selected for further recombination experiments should be
prepared (e.g., F17S, D64N, G104Q, V165E). These substitutions will be filtered by the
CompassR rule after calculating their ΔΔGfold values (see Note 4).
2.2 Setup of the FoldX Plugin for YASARA
1. Download and install YASARA Structure version 16.7.22 (http://www.yasara.org) (see Note 5).
2. After free and simple registration for academic users on the FoldX website
(http://foldxsuite.crg.eu/), download your system-specific FoldX Suite 4.0 (a zip file
containing foldx.exe and rotabase.txt) and the FoldX plugin for YASARA (a zip file named
yasaraPlugin.zip containing several .py files and others) (see Note 1).
3 Methods
3.1 Load the PDB File of Wild Type Enzyme (Step 1)
1. Open YASARA Structure software (see Note 7).
2. Go to File > Load > PDB file.
3. Look for the PDB file (i.e., 1i6w_A_noSOL.pdb) and click “OK” (see Note 8).
3.2 Select the Target Substitution for Calculation of the Relative Folding Free Energy (ΔΔGfold, Step 2)
1. Go to Analyze > FoldX > Mutate residue.
2. Select a residue for FoldX analysis in the sequence menu by double-clicking, e.g., Phe17.
3. In the “select FoldX routines” window, activate the options “FoldX RepairPDB” and
“Calculate stability change” for calculating the relative folding free energies
(ΔΔGfold = ΔGfold,sub − ΔGfold,wt) and click “OK” to go to the next step (see Note 9).
4. Select the new amino acid residue, for example, Ser, and click “OK.”
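When several substitutions have to be processed, it can help to keep their labels in the same form that appears in the console output of Subheading 3.4 (e.g., FA17S: original residue, chain, position, new residue). The following is a minimal sketch of that bookkeeping and of the ΔΔGfold definition given above. The function names are hypothetical and are not part of FoldX or YASARA, and chain A is assumed because the example structure uses chain A of 1i6w.

```python
import re

SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def to_console_label(substitution, chain="A"):
    """Convert 'F17S' into the 'FA17S' style label seen in the console output."""
    match = SUBSTITUTION.fullmatch(substitution)
    if match is None:
        raise ValueError(f"unrecognized substitution: {substitution!r}")
    wild_type, position, new_residue = match.groups()
    return f"{wild_type}{chain}{position}{new_residue}"


def ddg_fold(dg_fold_sub, dg_fold_wt):
    """Relative folding free energy: ΔΔGfold = ΔGfold,sub − ΔGfold,wt."""
    return dg_fold_sub - dg_fold_wt


# The four example substitutions listed in Subheading 2.1.
LABELS = [to_console_label(s) for s in ["F17S", "D64N", "G104Q", "V165E"]]
print(LABELS)
```

With this sign convention, a negative ΔΔGfold means the substituted protein is calculated to fold more favorably than the wild type.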
3.3 Set Parameters for FoldX (Step 3)
1. Keep the default option in the “Set FoldX option (1)” window, which means that only the
“Move neighbors” option is activated, and click “OK” (see Note 10).
2. Set parameters: Number of runs: 5; Temperature (K): 298; pH: 7; Ionic strength (100): 5;
VdW design: 2 (see Note 11).
3. Click “OK”; FoldX then starts to run, and an additional FoldX program terminal appears.
3.4 Collection of ΔΔGfold Results and Alignment with the CompassR Rule (Step 4)
1. Collect the ΔΔGfold values (kcal/mol) of the five runs from the YASARA console after the
FoldX calculation has terminated. After pressing the space key, the results are shown in the
YASARA console as follows (see Note 12).
Plugin>nameObj | total_energy
Plugin>-------------------------+-------------
Plugin>FA17S:Object1_Repair_1_0 | -0.0153279
Plugin>FA17S:Object1_Repair_1_1 | -0.0161321
Plugin>FA17S:Object1_Repair_1_2 | -0.0263127
Plugin>FA17S:Object1_Repair_1_3 | 0.00703162
Plugin>FA17S:Object1_Repair_1_4 | 0.0459613
Plugin>nameObj | total_energy
Plugin>-----------------------+-------------
Plugin>FA17S:Object1_Repair_1 | -0.00095594
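Copying five values by hand is error-prone, so the console text can also be parsed. The sketch below is a hypothetical helper, not part of the FoldX plugin. It assumes the row layout shown above, treats names ending in two numeric suffixes (e.g., _1_0) as per-run rows and any other data row as the summary, and keys rows by name so that a row printed twice is counted once.

```python
import re
from statistics import mean

CONSOLE = """\
Plugin>nameObj | total_energy
Plugin>-------------------------+-------------
Plugin>FA17S:Object1_Repair_1_0 | -0.0153279
Plugin>FA17S:Object1_Repair_1_1 | -0.0161321
Plugin>FA17S:Object1_Repair_1_2 | -0.0263127
Plugin>FA17S:Object1_Repair_1_3 | 0.00703162
Plugin>FA17S:Object1_Repair_1_4 | 0.0459613
Plugin>nameObj | total_energy
Plugin>-----------------------+-------------
Plugin>FA17S:Object1_Repair_1 | -0.00095594
"""

# A data row is "Plugin>" + name + "|" + a decimal number.
ROW = re.compile(r"^Plugin>(\S+)\s*\|\s*([-+]?\d+(?:\.\d+)?)\s*$")
# Assumed naming: per-run rows end in "_<n>_<run index>".
RUN_SUFFIX = re.compile(r"_\d+_\d+$")


def parse_foldx_console(text):
    """Split console rows into per-run values and summary values."""
    per_run, summary = {}, {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header or separator line
        name, value = match.group(1), float(match.group(2))
        if RUN_SUFFIX.search(name):
            per_run[name] = value
        else:
            summary[name] = value
    return per_run, summary


PER_RUN, SUMMARY = parse_foldx_console(CONSOLE)
AVERAGE = mean(list(PER_RUN.values()))
print(f"{len(PER_RUN)} runs, mean ddG_fold = {AVERAGE:.6f} kcal/mol")
```

For the example above, the mean of the five per-run values agrees with the summary row to within rounding, which is a quick check that all five runs were captured.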
3.5 Create a Recombination Plan for Beneficial Substitutions (Step 5)
1. When single substitutions with ΔΔGfold < +0.36 kcal/mol are recombined (e.g.,
ΔΔGfold,F17S = 0.03 kcal/mol, ΔΔGfold,D64N = +0.09 kcal/mol), one can expect active
recombinants with improved properties (green), e.g., F17S-D64N (see upper part in Fig. 1).
2. When beneficial substitutions with ΔΔGfold values ranging from +0.36 to +7.52 kcal/mol are
recombined (e.g., ΔΔGfold,V165E = +4.89 kcal/mol), one cannot predict whether the
recombinants will be inactive or active (unpredictable behavior; orange), for example,
F17S-V165E and others (see middle part in Fig. 1).
3. Recombination of beneficial substitutions with ΔΔGfold > +7.52 kcal/mol (e.g.,
ΔΔGfold,G104Q = +14.38 kcal/mol) results in activity-reduced even
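The three ranges in Subheading 3.5 amount to a simple threshold filter. The sketch below assumes the thresholds are read as ΔΔGfold < +0.36 kcal/mol and ΔΔGfold > +7.52 kcal/mol, with the boundary values themselves assigned to the middle range, which is an assumption. The function name and category labels are hypothetical. The ΔΔGfold values are the example values quoted in the text.

```python
LOWER_LIMIT = 0.36  # kcal/mol; below this, recombination is expected to work
UPPER_LIMIT = 7.52  # kcal/mol; above this, recombinants lose activity


def compassr_category(ddg_fold):
    """Assign a substitution to one of the three CompassR ranges."""
    if ddg_fold < LOWER_LIMIT:
        return "recombine"
    if ddg_fold <= UPPER_LIMIT:
        return "unpredictable"
    return "avoid"


# Example values (kcal/mol) quoted in Subheading 3.5.
EXAMPLES = {"F17S": 0.03, "D64N": 0.09, "V165E": 4.89, "G104Q": 14.38}
PLAN = {name: compassr_category(value) for name, value in EXAMPLES.items()}
RECOMBINE = sorted(name for name, category in PLAN.items() if category == "recombine")
print(PLAN)
print("Recombine first:", "-".join(RECOMBINE))
```

For these four substitutions the filter keeps F17S and D64N for recombination, flags V165E as unpredictable, and excludes G104Q, in line with the examples given in the text.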
4 Notes
References
1. Cui H, Cao H, Cai H, Jaeger K-E, Davari MD, Schwaneberg U (2020) Computer-assisted
recombination (CompassR) teaches us how to recombine beneficial substitutions from directed
evolution campaigns. Chem Eur J 26(3):643–649. https://doi.org/10.1002/chem.201903994
2. Bornscheuer UT, Hauer B, Jaeger KE, Schwaneberg U (2019) Directed evolution empowered
redesign of natural proteins for the sustainable production of chemicals and pharmaceuticals.
Angew Chem Int Ed 58(1):36–40. https://doi.org/10.1002/anie.201812717
3. Liebeton K, Zonta A, Schimossek K, Nardini M, Lang D, Dijkstra BW, Reetz MT, Jaeger KE
(2000) Directed evolution of an enantioselective lipase. Chem Biol 7(9):709–718
4. Bhuiya M-W, Liu C-J (2010) Engineering monolignol 4-O-methyltransferases to modulate lignin
biosynthesis. J Biol Chem 285(1):277–285
5. Rowe LA, Geddie ML, Alexander OB, Matsumura I (2003) A comparison of directed evolution
approaches using the β-glucuronidase model system. J Mol Biol 332(4):851–860
6. Bloom JD, Meyer MM, Meinhold P, Otey CR, MacMillan D, Arnold FH (2005) Evolving strategies
for enzyme engineering. Curr Opin Chem Biol 15(4):447–452
7. Rübsam K, Davari MD, Jakob F, Schwaneberg U (2018) KnowVolution of the polymer-binding
peptide LCI for improved polypropylene binding. Polymers 10(4):423
8. Tokuriki N, Tawfik DS (2009) Stability effects of mutations and protein evolvability. Curr
Opin Struct Biol 19(5):596–604
9. Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS (2006) Robustness–epistasis link
shapes the fitness landscape of a randomly drifting protein. Nature 444(7121):929–932.
https://doi.org/10.1038/nature05385
10. Firnberg E, Labonte JW, Gray JJ, Ostermeier M (2014) A comprehensive, high-resolution map
of a gene’s fitness landscape. Mol Biol Evol 31(6):1581–1592
Abstract
Protein engineering can enhance desirable features and improve performance outside of the natural
context. Several strategies have been adopted over the years for gene diversification, and engineering of
modular proteins in particular is most effective when a high-throughput, library-based approach is
employed. Nondegenerate saturation mutagenesis plays a dynamic role in engineering proteins by targeting
multiple codons to generate massively diverse gene libraries. Herein, we describe the nondegenerate
saturation mutagenesis techniques that we have developed for contiguous (ProxiMAX) and noncontiguous
(MAX) randomized codon generation to create precisely defined, diverse gene libraries, in the context of
other fully nondegenerate strategies. ProxiMAX randomization comprises saturation cycling with repeated
cycles of blunt-ended ligation, type IIS restriction, and PCR amplification, and is now a commercially
automated process predominantly used for antibody library generation. MAX randomization encompasses
a manual process of selective hybridisation between individual custom oligonucleotide mixes and a
conventionally randomized template and is principally employed in the research laboratory setting, to engineer
alpha helical proteins and active sites of enzymes. DNA libraries generated using either technology create
high-throughput amino acid substitutions via codon randomization, to generate genetically diverse clones.
Key words Nondegenerate, Saturation mutagenesis, Randomized gene libraries, Codon randomization,
Protein engineering, Library design, Genetic code, Amino acids, Genetic diversity,
Oligonucleotides
1 Introduction
Andrew Currin and Neil Swainston (eds.), Directed Evolution: Methods and Protocols, Methods in Molecular Biology, vol. 2461,
https://doi.org/10.1007/978-1-0716-2152-3_3, © Springer Science+Business Media, LLC, part of Springer Nature 2022
1.4 ProxiMAX Randomization
Although MAX randomization can saturate multiple codons, its drawback is its inability to
saturate more than two adjacent codons at a time. This results from the essential presence of
the conserved addressing region within each selection oligo (Fig. 1b). We therefore developed
ProxiMAX randomization, a nondegenerate mutagenesis technique designed to saturate multiple,
contiguous codons [9, 10]. ProxiMAX randomization relies on saturation cycling, which entails
repeated cycles of blunt-ended ligation, type IIS restriction, and PCR amplification. Figure 2
demonstrates the ProxiMAX randomization technique for saturating contiguous codons. The
technique utilizes four groups of oligonucleotides: three donor pools and an acceptor
sequence. The donor pools possess MAX codons at their 3′ ends, encoding each of the desired
amino acids defined by the user (see Notes 1 and 2). The donors can be partially
double-stranded oligonucleotides, fully double-stranded DNA, or self-complementary hairpin
oligonucleotides, though experience favors the latter.
Fig. 1 Schematic representation of the MAX randomization process (Subheading 3.1). A single template
oligonucleotide is synthesized to be fully degenerate at the designated, saturated codons. Meanwhile, a set of
up to 20 small selection oligonucleotides are synthesized individually, for each saturated position. Each such
oligonucleotide contains a conserved addressing region, typically of six bases and one MAX codon, which
represents the preferred codon for subsequent expression of a single amino acid. (a) Schematic showing three
positions being saturated with individual oligonucleotide pools. Each selection oligonucleotide consists of a
short six base invariant region, complementary to the template and one MAX codon. The selection
oligonucleotides, along with two flanking oligonucleotides, anneal to the template and are ligated together. The
ligated strand is then selectively amplified with primers complementary to the terminal oligonucleotides
(P1 and P2), to generate a randomization cassette. (b) Exemplar design of a cassette containing four MAX
randomized (saturated) codons in a section of a tyrosyl tRNA synthetase gene
[Fig. 2 Schematic of ProxiMAX randomization; graphic not reproduced. Labels recovered from the
figure: Donor sets (Set 1, Set 2, Set 3; each with MlyI and MAX; primers P1, P2, P3); Acceptor
DNA (Rev); Ligate/combine/amplify; After 6 cycles; Constant region; Ligate completed acceptor
to required constant region]
Typically, individual donor oligos are pooled and blunt-end ligated onto a double-stranded,
5′-phosphorylated acceptor sequence. Ligation is followed by PCR amplification, digestion
with the type IIS restriction enzyme MlyI, and purification of the resulting product. The
MlyI recognition site lies upstream of the MAX codon in the donor side of the sequence, and
the enzyme cuts downstream of its recognition site, creating a blunt end. Thus MlyI
1.5 Other Routes to Nondegenerate Saturated DNA Libraries
Slonomics employs a fully automated, proprietary platform to synthesize randomized cassettes
from thousands of hairpin oligonucleotides in order to generate highly diverse combinatorial
gene libraries [14, 15]. Unfortunately, this patented technology is no longer widely available
to the scientific community as a synthetic service. Thus commercial preparation of
nondegenerate saturated libraries is currently limited to massively parallel
oligonucleotide/gene synthesis.
1.6 Library Size, Sequence Space, and Screening
Library size is determined by a number of factors including the nature of the genetic code,
the type of mutagenic codon used in library design, and the number of designated sites for
mutagenesis within the library [16]. Removing DNA degeneracy from the randomization scheme
maximizes mutagenesis potential by optimizing the use of sequence space. The sequence space
occupied by a DNA library is the proportion of required sequences that can actually be
created, physically. Sequence space is a fundamental consideration when using saturation
mutagenesis techniques, as the number of required sequences for full diversity (all of the
different randomization combinations) should be below the maximum possible number of
sequences that can be synthesized (and ideally, subsequently screened) using any given
approach. Although generation of massive molecular diversity is theoretically possible, the
practicality of generating such libraries and the ability to screen them will always
Fig. 3 Observed vs expected distribution of codons in six saturated positions within an E. coli AlaRS gene. Two
cassettes encompassing positions 41 & 43 and 212, 214, and 216 respectively were constructed using
automated ProxiMAX addition [11] of hexamer donors (positions 43, 216, and 214; positions 42, 215, and
213 were specified as conserved codons) and single codon donors (positions 41 and 212). These cassettes
were then joined to three framework sections via ligation of BsaI-digested fragments. Position 170 was
contained within one of these fragments and owing to its isolation within the gene was simply constructed
using a carefully balanced mixture of PCR primers [13]. NGS analysis of the resulting library was performed
using Isogenica’s proprietary software
2 Materials
Potassium acetate 50 mM
Tris-acetate (pH 7.9) 20 mM
Magnesium acetate 10 mM
BSA 100 μg/ml
Reagents:
Equipment:
3 Methods
3.1 MAX Randomization
MAX randomization cassette(s) are made for each discrete region of
randomization as illustrated schematically in Fig. 1. The maximum
number of saturated codons within an individual cassette is limited
by the practical length of synthesis for the template strand (Fig. 1)
and also by the minimum mass of DNA required in order to contain
all theoretical template component sequences. For example,
theoretically, a 93-mer template oligo could contain 9 positions of
randomization (sufficient to hybridize with 9 different selection
oligo pools plus a 6 base overlap at each end). However, assuming
an average MW of a nucleotide = 330, the average MW of a 93-mer
oligo is 30,690. Nine codons each saturated with NNN equates to
64^9 or 1.8 × 10^16 different sequences of template DNA. Thus, for a
minimum of one molecule of each possible template sequence
(assuming a perfect distribution during NNN synthesis),
~30 nmol would be required (1.8 × 10^16/Avogadro’s number)
which equates to ~920 μg of template DNA. For most applications,
this is not practical. In practice, we typically recommend limiting
each cassette to a maximum of six saturated codons. By the same
calculations, this equates to just ~110 fmol or ~2.5 ng of a 66-mer
template for one copy of each template sequence. This allows us to
use quantities far in excess of those minimum values and also allows
comfortably for the multiple dilutions that are required during the
randomization process. Once constructed, multiple cassettes may
later be joined together either by overlap PCR or else by seamless
cloning methods such as Golden Gate Assembly.
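The arithmetic in the preceding paragraph can be reproduced with a short script. This is only a sketch of the calculation as described in the text: it assumes the stated average nucleotide MW of 330, a perfectly even NNN distribution, and one molecule per sequence. The function name min_template is illustrative and does not come from the chapter.

```python
AVOGADRO = 6.022e23   # molecules per mole
NT_MW = 330.0         # average g/mol per nucleotide, as assumed in the text


def min_template(n_codons, oligo_len_nt):
    """Return (sequences, moles, grams) for one copy of every NNN template.

    n_codons      number of NNN-saturated codons in the template
    oligo_len_nt  total template length in nucleotides
    """
    sequences = 64 ** n_codons               # 64 possible codons per NNN site
    moles = sequences / AVOGADRO             # one molecule of each sequence
    grams = moles * oligo_len_nt * NT_MW     # mass at the average oligo MW
    return sequences, moles, grams


if __name__ == "__main__":
    for n_codons, length in ((9, 93), (6, 66)):
        seqs, mol, g = min_template(n_codons, length)
        print(f"{n_codons} codons, {length}-mer: {seqs:.2e} sequences, "
              f"{mol * 1e15:.3g} fmol, {g * 1e9:.3g} ng")
```

For nine codons in a 93-mer this gives roughly 30 nmol and 920 μg; for six codons in a 66-mer it gives roughly 114 fmol and 2.5 ng, in line with the rounded figures quoted above.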
Two constant oligonucleotides (End 1 and End 2), two pri-
mers (P1, the first 18 bases of End 1 and P2, the reverse comple-
ment of the last 18 bases of End 2), a template oligonucleotide
having conventional NNN saturation at the positions of MAX
randomization and corresponding sets of MAX oligonucleotide
pools (e.g., Fig. 1b) are ordered for synthesis. Since the majority
of these oligonucleotides are short, no special quality of DNA is
required, but if finances allow, it is convenient to order the MAX
3.2 Joining Multiple MAX Randomization Cassettes by Overlap PCR
Where multiple cassettes are required and joining is to be achieved
by overlap PCR, End 1 and End 2 oligos are designed to have an
18 base overlap with the corresponding end oligos of neighboring
cassettes.
1. In the first stage of MAX library assembly, equal high volumes
of individual cassettes (e.g., 30 μl each of 3 neighboring cassettes)
are combined in a 100 μl reaction containing 0.2 mM
dNTPs and 1 unit of Pfu DNA polymerase and are amplified
without any additional primers, under the following cycling
conditions: initial denaturation at 98 °C for 2 min followed
by 20 cycles of denaturation for 30 s at 98 °C, annealing for
30 s and extension for 2 min at 72 °C, followed by a final
extension step at 72 °C for 5 min. The annealing temperature
is selected as a compromise between the optimal annealing
temperatures of individual cassettes.
2. The process of optimizing final product yield by serially dilut-
ing the resulting PCR product and optimizing the annealing
temperature is performed as described in Subheading 3.1, step
3, but instead using primers that flank the entire, combined
product rather than primers P1 and P2.
3. The optimized whole-product PCR is then scaled up as
required and the resulting final product is purified using a
PCR purification kit according to manufacturer’s instructions.
DNA concentration of the purified library is determined by
measuring the absorbance at 260 nm.
4. Finally, the combined product is analyzed by NGS.
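Before ordering oligonucleotides for overlap PCR, the 18 base overlap between neighboring cassettes can be checked in software. The sketch below is an illustration only: it assumes both end oligos are written 5′ to 3′ on the same (top) strand, and the function name and example sequences are hypothetical rather than taken from the chapter.

```python
def shared_overlap(upstream_end2, downstream_end1, overlap=18):
    """Check the overlap between two neighboring cassettes.

    Returns True if the last `overlap` bases of the upstream cassette's
    End 2 equal the first `overlap` bases of the downstream cassette's
    End 1. Both sequences are assumed to be top-strand, 5' to 3'.
    """
    if len(upstream_end2) < overlap or len(downstream_end1) < overlap:
        return False
    return upstream_end2[-overlap:].upper() == downstream_end1[:overlap].upper()


if __name__ == "__main__":
    # Hypothetical end oligos, invented purely for this example.
    end2_cassette_a = "GGATCCAAGCTTGCATGCCTGCAGGTCGAC"
    end1_cassette_b = "GCATGCCTGCAGGTCGACTCTAGA"
    print(shared_overlap(end2_cassette_a, end1_cassette_b))
```

A check of this kind only confirms sequence identity across the overlap; it says nothing about annealing temperature, which still has to be optimized as described in step 2.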
3.3 Joining Multiple MAX Randomization Cassettes by Golden Gate Assembly
If Golden Gate Assembly is to be used to join MAX randomization
cassettes, the 5′ ends of the P1 and P2 oligonucleotides are
extended to contain appropriate Type IIS restriction sites and a
4 base overlap between fragments. Golden Gate Assembly is then
performed using a kit according to manufacturer’s instructions.
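When choosing the 4 base overlaps, it is common practice in Golden Gate design (a general convention, not something specified in this chapter) to avoid duplicated overhangs, palindromic overhangs, and overhangs that are the reverse complement of one another. A minimal sketch of such a check follows; the function names and example overhangs are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def overhang_problems(overhangs):
    """Return a list of human-readable problems for a set of 4-base overhangs."""
    problems = []
    seen = set()
    for oh in (o.upper() for o in overhangs):
        if len(oh) != 4 or set(oh) - set("ACGT"):
            problems.append(f"{oh}: not a 4-base A/C/G/T sequence")
            continue
        if oh == revcomp(oh):
            problems.append(f"{oh}: palindromic, can self-ligate")
        if oh in seen:
            problems.append(f"{oh}: duplicated")
        elif revcomp(oh) in seen:
            problems.append(f"{oh}: reverse complement of another overhang")
        seen.add(oh)
    return problems


if __name__ == "__main__":
    # Hypothetical junction overhangs, for illustration only.
    print(overhang_problems(["AATG", "GCTT", "TACC"]))
    print(overhang_problems(["AATG", "CATT", "GATC"]))
```

An empty list means no conflict was found under these simple rules; kit manufacturers may publish their own, more detailed overhang recommendations, which take precedence.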
3.4 Denaturation Gradient PCR: A Useful Tip for MAX Randomization of Repetitive Sequences
When saturating highly repetitive sequences such as those encoding
α-helical repeat proteins, concatemers can occasionally result during
the amplification stage that cannot be resolved subsequently by
optimizing the dilution of the template and the annealing temperature
of the PCR. However, in these unusual cases, we have discovered
that such concatemers may be resolved successfully by running
a denaturation gradient PCR.
Figure 4 illustrates a 90 bp, highly repetitive gene fragment that
contained 5 saturated codons and was generated by MAX random-
ization. During PCR optimization, neither altered annealing tem-
peratures nor dilution of the template DNA gave rise to an
improvement in the yield of the required 90 bp PCR product. In
fact, multiple bands of similar intensity to the desired product could
be seen at several annealing temperatures and for all template dilu-
tions and annealing gradients tested.
Having failed to generate the required single product via tradi-
tional optimization, we then hypothesized that akin to the melt
analysis employed in some qPCR experiments [21], a reduced
denaturation temperature might prevent the denaturation (and
consequently, the amplification) of any longer, concatemeric pro-
ducts. In essence, by employing a temperature gradient at the
denaturation stage of cycling, a temperature capable of denaturing
the target sequence, but not the longer undesirable sequences,
might be identified (see Note 4).
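To plan such an experiment, it can help to list the denaturation temperature that each column of a gradient block would receive. The sketch below assumes a simple linear gradient across the block, which is an approximation: real thermocyclers report their own per-column values, and those should be used in practice. The 81 to 87 °C range matches the protocol that follows; the column count and function name are illustrative.

```python
def gradient_temperatures(low_c, high_c, n_columns=12):
    """Evenly spaced temperatures (in °C) from low_c to high_c inclusive.

    Assumes a linear gradient across n_columns block columns.
    """
    if n_columns < 2:
        raise ValueError("a gradient needs at least two columns")
    step = (high_c - low_c) / (n_columns - 1)
    return [round(low_c + i * step, 1) for i in range(n_columns)]


if __name__ == "__main__":
    for column, temp in enumerate(gradient_temperatures(81.0, 87.0, 7), start=1):
        print(f"column {column}: {temp} °C")
```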
The following methods were employed to test this hypothesis:
1. A PCR master mix was made to contain 1× Pfu DNA polymerase
buffer, 0.2 mM dNTPs, 50 pmol each of forward and
reverse primers, 3 units of Pfu DNA polymerase, and 1 μl
ligated template.
2. Having previously optimized an annealing temperature of
53.4 °C, a denaturation gradient from 81 °C to 87 °C was
applied for PCR optimization.
3. The following cycling conditions were used: initial denaturation
for 2 min at 84 °C, followed by 30 cycles of 30 s denaturation
gradient from 81–87 °C, annealing for 30 s at 53.4 °C,
extension for 10 s at 72 °C and a final extension step with
1 cycle for 60 s at 72 °C followed by hold at 4 °C. These
reagent conditions and cycling times were within manufacturer’s
recommended specifications, with the brief extension
time of 10 s selected to complement the manufacturer’s
recommendation of 2 min/1 kb of the product.
4. The resulting PCR products were electrophoresed in 3% TAE
agarose gels, stained with ethidium bromide and imaged on a
Syngene G:Box.
5. As illustrated in Fig. 4, (lane 3), subsequent amplification of
this product under the exact PCR conditions described above
1. For the first codon (Fig. 2), individual hairpin donors from set
1 are combined at a final concentration of 1 μM (= 1 pmol/μl)
total donor DNA (precise concentrations of individual components
will vary according to the numbers of donors chosen).
2. Ten pmol of total donor DNA is then ligated to 3.3 pmol
acceptor sequence in a 20 μl reaction volume containing
1× ligase buffer (containing 1 mM ATP) and 1 μl (400 Weiss
units) T4 DNA ligase, incubating at room temperature for 2 h
followed by 37 °C incubation for 30 min. The acceptor
sequence can be any blunt-ended, double-stranded, linear
piece of DNA (synthetic, PCR product or other) that bears a
5′ phosphate group at the point of ligation (Fig. 2).
3. The ligation mix is then diluted 1000-fold and amplified using
the appropriate forward primer according to donor set and a
universal reverse primer for the acceptor sequence, in duplicate
100 μl PCR reactions containing 1× Pfu buffer, 0.1 mM
dNTPs, 50 pmol forward and reverse primers, 1 unit of Pfu
DNA polymerase, and 1 μl of diluted template.
4. Following examination by gel electrophoresis, the PCR reac-
tions are combined and purified using a MinElute PCR Purifi-
cation Kit following manufacturer’s instructions.
5. The purified DNA is recovered in 30 μl of sterile dH2O and
25 μl of this purified product is digested in a 50 μl reaction
volume containing 1× CutSmart® buffer and 10 units of MlyI
at 37 °C for 1 h followed by heat inactivation at 65 °C for
20 min.
6. A sample of the resulting product is examined by electrophore-
sis in a 3% agarose gel to confirm digestion and the concentra-
tion of the remainder is determined by measuring A260 in a
NanoDrop spectrophotometer. The volume of the digested
DNA corresponding to 3.3 pmol is then calculated. Gel purifi-
cation of the digested product may be performed at this stage,
if required.
7. Steps 1–6 are repeated with the next set of oligonucleotide
donors (i.e., cycling donor sets 1–3) each time using the
MlyI-digested product from the previous round as the new acceptor,
until all required codons have been added.
8. Once sufficient codons have been added, a final, constant
region is added to the last codon by blunt end ligation and
the resulting product is once more amplified by PCR as
described in step 5.
9. Where multiple ProxiMAX fragments are joined together, this
is typically achieved by incorporating Type IIS restriction sites
(such as BsaI) into both the acceptor sequence and the final
constant region ligated in step 8. The resulting BsaI fragments
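The volume calculation in step 6 (the volume of digested DNA corresponding to 3.3 pmol) can be sketched as follows. The conversion assumes an average mass of 660 g/mol per base pair of double-stranded DNA, a widely used approximation that is not stated in this chapter; the function names and the example concentration and fragment length are hypothetical.

```python
DSDNA_BP_MW = 660.0   # assumed average g/mol per base pair of dsDNA


def pmol_per_ul(ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/µl) to pmol/µl."""
    # ng divided by g/mol gives nmol; multiply by 1000 for pmol.
    return ng_per_ul * 1000.0 / (length_bp * DSDNA_BP_MW)


def volume_for_pmol(target_pmol, ng_per_ul, length_bp):
    """Volume (µl) of stock that contains target_pmol of the fragment."""
    return target_pmol / pmol_per_ul(ng_per_ul, length_bp)


if __name__ == "__main__":
    # Hypothetical A260-derived reading: 50 ng/µl of a 100 bp acceptor.
    volume = volume_for_pmol(3.3, ng_per_ul=50.0, length_bp=100)
    print(f"{volume:.2f} µl of digested product contains 3.3 pmol")
```

The same helper applies in each round of addition, since the acceptor grows by one codon (or one hexamer) per cycle and its length should be updated accordingly.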
3.6 NGS Data Analysis of Saturated Libraries: Part 1, Creating an Excel Data File
NGS is undertaken routinely to assess the quality and diversity of
the generated mutant libraries. Even though many software
resources have been released over the years for evaluation of NGS
data, none of the commercially available resources are well-suited to
assess the distribution of codons at anticipated saturated positions
across a saturation mutagenesis library. Consequently, we have
developed the following data analysis strategies, which involve no
specialized equipment or programs, in order to assess MAX
randomization libraries. Here, we exemplify the strategies using a
mutant library generated using MAX randomization to saturate
seven codons in an α-helical repeat protein:
1. NGS was accomplished via a commercial Illumina MiSeq service,
following 2 × 250 bp paired-end sequencing.
2. Once the NGS data is received, the two paired end reads (fastq
format) are uploaded onto Galaxy bioinformatics software
(https://usegalaxy.org/) along with the reference library
sequence (fasta format) for alignment.
3. Since the two reads, when combined, will form the full-length
randomized library, the first step in data analysis is to join the
two paired end reads together, and this is achieved using the
“fastq-join” function (using default settings) which joins two
paired-end reads on overlapping ends.
4. The joined sequences are then subjected to quality control
using the filter fastq function, where the reads are filtered by
quality score and length. Since this cassette was 336 bp in length,
all sequences with a minimum length of 290 bp and a maximum
length of 336 bp were included in the analysis. The 290 bp
minimum was chosen so that all seven randomized regions in the
library would be included in the filtered sequences. All other
settings were left at their defaults.
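The length filter in step 4, and the kind of per-position codon tally that the analysis is ultimately aiming at, can be illustrated outside Galaxy with a few lines of Python. This is a sketch and not the Galaxy implementation: it assumes the joined reads are plain strings already in the same frame and orientation as the reference (no indels), and the function names, codon offsets, and toy reads are hypothetical.

```python
from collections import Counter


def length_filter(reads, min_len=290, max_len=336):
    """Keep joined reads whose length lies within [min_len, max_len]."""
    return [r for r in reads if min_len <= len(r) <= max_len]


def codon_counts(reads, codon_starts):
    """Tally the codon found at each 0-based start offset across all reads."""
    counts = {start: Counter() for start in codon_starts}
    for read in reads:
        for start in codon_starts:
            codon = read[start:start + 3].upper()
            if len(codon) == 3:           # skip reads too short for this site
                counts[start][codon] += 1
    return counts


if __name__ == "__main__":
    # Toy 9-base "reads" with one saturated codon at offset 3.
    toy_reads = ["ATGGCTAAA", "ATGTGGAAA", "ATGGCTAA", "ATGGCTAAAC"]
    kept = length_filter(toy_reads, min_len=9, max_len=9)
    for start, tally in codon_counts(kept, [3]).items():
        print(start, dict(tally))
```

With real data the reads would first need to be quality-filtered and aligned to the reference, as in the Galaxy steps above, before fixed offsets can be trusted.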