Fungal Cellulases


This is an open access article published under an ACS AuthorChoice License, which permits

copying and redistribution of the article or any adaptations for non-commercial purposes.

Review

pubs.acs.org/CR

Fungal Cellulases
Christina M. Payne,†,# Brandon C. Knott,‡,# Heather B. Mayes,§,# Henrik Hansson,⊥ Michael E. Himmel,∥
Mats Sandgren,⊥ Jerry Ståhlberg,⊥ and Gregg T. Beckham*,‡

† Department of Chemical and Materials Engineering and Center for Computational Sciences, University of Kentucky, 177 F. Paul Anderson Tower, Lexington, Kentucky 40506, United States
‡ National Bioenergy Center, National Renewable Energy Laboratory, 15013 Denver West Parkway, Golden, Colorado 80401, United States
§ Department of Chemical and Biological Engineering, Northwestern University, 2145 Sheridan Road, Evanston, Illinois 60208, United States
∥ Biosciences Center, National Renewable Energy Laboratory, 15013 Denver West Parkway, Golden, Colorado 80401, United States
⊥ Department of Chemistry and Biotechnology, Swedish University of Agricultural Sciences, Uppsala BioCenter, Almas allé 5, SE-75651 Uppsala, Sweden


CONTENTS
1. Introduction
2. Cellulose
2.1. Cellulose Structures
2.2. Cellulose Microfibrils
2.3. Cellulose Substrates
2.3.1. Microcrystalline Cellulose (MCC) from Plants
2.3.2. MCC from Microbes
2.3.3. Phosphoric Acid Swollen Cellulose (PASC)
2.3.4. Cellulose Crystallinity
3. Glycoside Hydrolase Catalytic Mechanisms
3.1. Retaining and Inverting Mechanisms
3.2. Carbohydrate Ring Puckering
4. Early Developments in Fungal Cellulases
4.1. History of the Discovery and Improvement of T. reesei Strains: Premolecular Era
4.1.1. Early Work at U.S. Army Natick Laboratories
4.1.2. Early International Effort for Strain Improvement
4.2. New Understanding from the Molecular Era: Cloning in T. reesei and S. cerevisiae
4.2.1. Cloning and Protein Production in T. reesei
4.2.2. RUT C-30 Revealed
4.2.3. Cloning T. reesei Genes in S. cerevisiae: The Road to Consolidated Bioprocessing
4.3. T. reesei Cellulases: Understanding the Mechanisms of Action
4.3.1. T. reesei as an Early Model for Cellulase Action
4.3.2. GHs and Related Enzymes
5. Carbohydrate-Binding Modules and Linkers
5.1. Family 1 Carbohydrate-Binding Modules
5.2. Linkers
6. Family 7 Glycoside Hydrolases
6.1. Structural Studies and Catalytic Function
6.1.1. TrCel7A: Wild-Type
6.1.2. TrCel7A Catalytic Mutants
6.1.3. F. oxysporum Cel7B with Active Site-Spanning Nonhydrolyzable Inhibitor
6.1.4. TrCel7B
6.1.5. H. insolens Cel7B S37W/P39W Mutant
6.1.6. TrCel7A: Cello-Oligomer Complexes
6.1.7. P. chrysosporium Cel7D
6.1.8. TrCel7A: Exo Loop Engineering
6.1.9. TeCel7A
6.1.10. PcCel7D Bound with Disaccharides
6.1.11. M. albomyces Cel7B
6.1.12. Heterobasidion irregulare Cel7A
6.1.13. T. harzianum Cel7A
6.1.14. L. quadripunctata Cel7B
6.1.15. TrCel7A Michaelis Complex and Glycosyl-Enzyme Intermediate
6.1.16. GH7 Catalytic Insights from Molecular Simulation
6.2. Processivity, Kinetic Modeling, and Visualization
6.3. Product Inhibition
6.4. Pyroglutamate
6.5. Glycosylation
6.6. Protein Engineering
6.7. Conclusions
7. Family 6 Glycoside Hydrolases
7.1. Structural Studies
7.1.1. TrCel6A: Wild-Type
7.1.2. T. fusca Cel6A
7.1.3. TrCel6A: Y169F Variant
7.1.4. H. insolens Cel6A: Apo
7.1.5. H. insolens Cel6A: Cello-Oligomer Complexes
7.1.6. TrCel6A: Non-Hydrolyzable Ligands
7.1.7. H. insolens Cel6B
7.1.8. H. insolens Cel6A: D416A/Thio-Oligosaccharide Complex
7.1.9. TrCel6A: D175A and D221A Variants
7.1.10. H. insolens Cel6A: D405N Variant
7.1.11. H. insolens Cel6A: D416A/Isofagomine Complex
7.1.12. C. cinerea Cel6C
7.1.13. C. cinerea Cel6A
7.1.14. C. thermophilum Cel6A
7.1.15. TrCel6A Variants HJPlus and 3C6P
7.2. Catalytic Function
7.2.1. Catalytic Acid
7.2.2. pKa Modifying Residues
7.2.3. Catalytic Base
7.2.4. Catalytic Priming of Ring Distortion
7.2.5. Substrate Binding
7.2.6. Product Inhibition
7.2.7. Processive Catalytic Cycle
7.2.8. Synergistic and Processive Function
7.3. Glycosylation
7.4. Protein Engineering
7.5. Conclusions
8. Family 5 Glycoside Hydrolases
8.1. Structural Studies
8.1.1. Catalytic Function
8.1.2. T. aurantiacus Cel5A
8.1.3. P. rhizinflata EglA/CelA
8.1.4. T. reesei Cel5A (Formerly EG II/EG III)
8.2. Characterization of Activity and Specificity
8.2.1. T. viride EG III
8.2.2. TrCel5A
8.2.3. H. insolens Cel5A
8.2.4. Other Fungal GH5s
8.3. Protein Engineering
8.4. Conclusions
9. Family 12 Glycoside Hydrolases
9.1. Structural Studies
9.1.1. Overall Structure
9.1.2. GH12 Ligand Complex Structures
9.2. Plant Cell Wall Loosening/Extension Activity by GH12 Enzymes
9.3. GH Clan-C: Structure and Sequence Comparison
9.4. Enzyme Discovery and Engineering
9.5. Conclusions
10. Family 45 Glycoside Hydrolases
10.1. Structural Studies
10.1.1. Subfamily A
10.1.2. Subfamily B
10.1.3. Subfamily C
10.2. Catalytic Function
10.3. Similarities of GH45s to Expansins and Swollenins
10.4. Conclusions
11. Lytic Polysaccharide Monooxygenases
11.1. Initial Discoveries of Oxidative Function
11.2. Mechanistic and Structural Studies
11.3. Conclusions
12. Modeling Enzymatic Hydrolysis
12.1. Ordinary Differential Equation-Based Models
12.2. Agent-Based Models
13. Concluding Remarks
Author Information
Corresponding Author
Author Contributions
Notes
Biographies
Acknowledgments
Abbreviations
References

Received: July 2, 2014
Published: January 28, 2015

© 2015 American Chemical Society. DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448

1. INTRODUCTION
Lignocellulosic biomass has enormous potential to contribute to worldwide energy, chemical, and material demands in a renewable, sustainable manner. In the United States alone, it has been estimated that 30% of the current petroleum usage could be offset via biomass conversion to transportation fuels.1 Indeed, in both the United States and European Union, biomass is currently the most abundant source of energy from renewable sources. Given escalating energy demands worldwide, especially in liquid transportation fuels,2,3 coupled with concerns of global climate change through continued use of fossil fuel resources, it is likely that biomass utilization will be a primary contributor in the near- to mid-term to the global sustainable energy portfolio.4−7

All plant cells exhibit thick cell walls that primarily consist of polysaccharides and the aromatic polymer lignin. These polymers have evolved to form complex composite materials that render plant cells highly resistant to attack from pathogens.8 Plant polysaccharides primarily consist of cellulose, the β-1,4-linked homopolymer of glucose; hemicellulose, a heterogeneous, branched polysaccharide primarily made up of β-1,4-linked polymers including xylan, glucuronoxylan, xyloglucan, glucomannan, and arabinoxylan backbones with heterogeneous side chains;9 and pectin, a typically minor component in cell walls consisting of a complex set of polysaccharide polymers enriched in α-linked galacturonic acid or galacturonic acid and rhamnose monomers.10,11 Lignin is a heterogeneous, branched, alkyl-aromatic polymer comprising three phenylpropanoid monomers linked by myriad C−O and C−C bonds that are likely formed through radical coupling reactions during cell wall synthesis.12 The enzymatic machinery for the synthesis of plant cell walls is a matter of intense research with many outstanding questions.8−15 Cellulose, hemicellulose, and lignin represent approximately 20−50%, 15−35%, and 10−30% of plant cell walls, respectively, on a dry weight basis.16 Due to its prevalence as the most abundant polymer in terrestrial plants, cellulose is the most abundant biological material on Earth. In addition, cellulose is the most recalcitrant carbohydrate polymer to catalytic degradation when compared to other plant cell wall polysaccharides.6 Cellulose serves a key structural function in plants and is synthesized by complex enzymatic machinery during cell wall synthesis.14,15 The sugars covalently locked in cellulose and hemicellulose represent a vast, renewable feedstock for the production of fuels and chemicals. Lignin, in addition, represents a potential feedstock for valorization to
fuels and chemicals, although selective conversion of lignin to value-added products remains a significant challenge.17−19

For production of fuels and chemicals from lignocellulosic biomass, overcoming the heterogeneity and recalcitrance of plant cell walls in a cost-effective manner at scales sufficient to offset fossil-fuel-derived resources is a major technical challenge. Second-generation biofuels production facilities are currently under development, and several have been constructed to date around the world, with a primary aim to convert lignocellulosic biomass to ethanol. These facilities generally utilize a “biochemical conversion” process wherein biomass is first size-reduced through milling or chipping, followed by a mild thermochemical pretreatment step to render plant cell wall materials more amenable to attack by biocatalysts. The enzymatic hydrolysis step then depolymerizes cellulose and residual hemicellulose to sugars. The last step upgrades and converts sugars to fuel or chemicals (Figure 1).20 In the large suite of process options, the biomass depolymerization steps, namely pretreatment and enzymatic hydrolysis, have long been identified as the most costly portion of the conversion process.21−23 Many different pretreatment options have been examined including dilute acid, hot water, steam explosion, ammonia fiber expansion, alkaline, lime, maleic acid, and others, as extensively discussed in recent reviews and comprehensive technology comparisons.24−30 The enzymatic hydrolysis step represents the second portion of biomass depolymerization and is a major cost driver in bioethanol production due to the high cost of enzyme production.21−23,31 However, cellulolytic enzymes are incredibly selective for glucose production and produce fewer downstream catalyst inhibitors relative to high-temperature deconstruction. Thus, significant efforts have been expended to understand and improve natural paradigms for enzymatic depolymerization of biomass with the aim to decrease the cost of sugar production for fuels and chemicals production.6,32−34

Figure 1. Overall view of a conventional biochemical conversion process to produce fuels and chemicals from lignocellulosic biomass. Cellulase enzymes can be used to convert the cellulose portion of nonfood biomass, such as agricultural waste and energy crops, into fermentable sugars for subsequent conversion to renewable fuels and chemicals.

Recently, cellulase enzyme research has been accelerated due to renewed interest in the production of ethanol from lignocellulosic biomass. Ethanol is a useful blend stock for light-duty vehicles in the transportation sector, but there are significant issues with its large-scale use including problems associated with hygroscopy, blending limits with gasoline, and the need for a new distribution system beyond petroleum-derived fuels. Thus, third-generation biofuels are now undergoing focused research and development with the goal of cost-effective production of infrastructure-compatible fuels from lignocellulosic biomass including fuels to fulfill demands in the gasoline, diesel, jet fuel, and maritime sectors. From the perspective of biochemical conversion processes that utilize carbohydrate intermediates such as clean sugars and carbohydrate derivatives as feedstocks, this body of work can be very broadly categorized into biological and catalytic upgrading of sugars to hydrocarbon fuels.5,35−40 Multiple strategies have emerged for each class of fuel and chemical production, which have been widely reviewed in the past several years.5,35−40 Many of these strategies still rely on the production of sugars or sugar derivatives, and thus, there remains significant incentive to reduce the cost of selective biomass depolymerization to carbohydrates through some combination of thermochemical pretreatment and enzymatic hydrolysis.

Many organisms across the kingdoms of life have evolved the necessary enzymatic machinery for converting cellulose to soluble species for a food and energy source. Given the complexity of plant cell walls, most biomass-degrading organisms employ a battery of enzymes with synergistic function to break down polysaccharides41,42 and, in some cases, lignin.43−47 To date, two primary paradigms have been discovered in cellulose depolymerization: the “free” enzyme
paradigm and the cellulosomal paradigm. The free enzyme paradigm represents the case wherein enzymes diffuse as single catalytic units, often accompanied by binding modules covalently attached together via linker domains, exemplified by the enzyme suite from the filamentous fungus Trichoderma reesei (or Hypocrea jecorina).41 The primary mode of action of free cellulases is one wherein endoglucanases (EGs), or nonprocessive cellulases, act by cleaving cellulose chains in amorphous regions, and cellobiohydrolases (CBHs), or processive cellulases, attach to cellulose chain ends and depolymerize and hydrolyze cellulose chains typically into disaccharide units, down the length of a chain. In solution, cellobiose is then cleaved by β-glucosidases into glucose for cellular uptake by organisms. The recent discovery of selective, oxidative enzymes adds another element of endo-acting action to this paradigm wherein chains are likely cleaved in crystalline regions.48−52

The other well-characterized paradigm for enzymatic biomass degradation is the use of cellulosomes wherein noncovalent cohesin−dockerin interactions enable large complexes up to hundreds of enzymes to operate in close proximity on large protein scaffolds. Cellulosomes were first discovered in the anaerobic rumen bacterium, Clostridium thermocellum.53−56 More recently, additional paradigms have perhaps begun to emerge57,58 including one with multimodular enzymes that contain multiple catalytic domains (CDs) per protein, which appears to be a natural “interpolation” between free cellulases and cellulosomes in function,59 and another with polysaccharide utilization loci, but additional characterization of these potential new paradigms will be required.

Of all biomass-degrading organisms, fungi play a pivotal role, as they are responsible for the vast majority of the biomass degradation in nature. Fungi have colonized a vast range of terrestrial and marine environments, and are thus of vital importance for the recycling of carbon on Earth, a capacity with broad implications in global ecology, biogeochemistry, agriculture, and, more recently, for industrial applications. Broadly, two primary modes of fungal biomass degradation exist employing either enzymatic or chemical means to break down biomass. Brown-rot fungi initially utilize Fenton chemistry to generate hydroxyl radicals, which attack plant cell walls via powerful oxidation reactions.60,61 Conversely, filamentous fungi characterized as soft-rots and white-rots have long been known to primarily employ enzymatic means to break down biomass. Many filamentous fungi produce high titers of effective cellulolytic enzymes employing the “free” enzyme paradigm mentioned above, and thus, filamentous fungal enzyme production has become a cornerstone of industrial biofuels research and development.6,34 In particular, the isolation of the soft-rot ascomycete fungus T. reesei in the South Pacific in the 1940s followed by its characterization at the Natick Research Laboratories marked the beginning of the development of filamentous fungi for biomass conversion purposes.62 Subsequently, a wealth of mechanistic information regarding fungal cellulase structure and function has been reported from the 1980s onward. This research area has been accelerated in the past decade by significant governmental and commercial investments in research into biofuels production worldwide.

Here, we review enzymatic mechanisms utilized by fungi to depolymerize cellulose, with the primary focus on discoveries made since the first structural reports for each enzyme family. We highlight open questions related to furthering our understanding of these biologically and industrially important enzymes. Cellulases, unlike many commonly studied enzymes, act on an insoluble substrate, and cellulose represents a complex, heterogeneous macromolecule; thus, we first briefly review physical and chemical aspects of cellulose relevant to enzyme−substrate interactions. The initial structural reports of cellulases primarily beginning in the early 1990s enabled identification of substrate interactions and catalytic mechanisms, and given that most fungal enzymes that depolymerize cellulose employ a hydrolytic mechanism, we discuss general aspects of how cellulases prime cellulose for hydrolysis. We then briefly review research efforts into cellulase mechanisms up until the initial structural reports. Many well-characterized cellulases are multimodular with binding modules and linkers that enable multiple functions within the same enzyme; the efforts dedicated to the study of binding function and multimodularity are reviewed here as well. As fungi and most other biomass-degrading organisms secrete enzyme cocktails, we then discuss the primary components of fungal cellulolytic cocktails including glycoside hydrolase (GH) family 7, 6, 5, 12, and 45 cellulases, in this order. We note that we do not discuss β-glucosidases (GH1 and GH3 enzymes), which cleave cellobiose to glucose in solution, as these enzymes have been recently reviewed and primarily act on soluble substrates.63 Additionally, fungi and other biomass-degrading organisms employ a vast diversity of enzymes aimed at other substrates in the plant cell wall such as hemicelluloses and pectins; these polysaccharides are also not reviewed here, primarily because cellulose is the most recalcitrant carbohydrate substrate in the cell wall compared to the other polysaccharides, but the importance of other cell wall polysaccharides is of significant relevance for biomass conversion. For each GH family, we focus on the molecular-level aspects of cellulase action by reviewing studies from structural biology, biophysical and biochemical measurements including microscopy, spectroscopy, scattering, and modeling. All of these tools are needed to develop a comprehensive, mechanistic view of cellulase structure and function. Additionally, we review exciting new discoveries in mechanistic paradigms for cellulose depolymerization, namely the recent discovery of lytic polysaccharide monooxygenases (LPMOs), which employ copper, oxygen, and a reducing agent to oxidatively cleave cellulose. In summary, work in recent years has demonstrated that substantial gains are still possible in reducing enzyme loadings for biomass conversion, and for further gains, fundamental science, enzyme discovery and screening, and improved methods for enzyme engineering will be required.

2. CELLULOSE
Cellulose is the β-1,4-linked homopolymer of β-D-glucose and exhibits a reducing and nonreducing end, the former of which can ring open to produce an aldehyde form. Given its natural prevalence and utility as a fuel and chemical precursor, and as a material with myriad applications, the study of cellulose is vast and diverse. As such, we briefly review salient physical and chemical properties relevant to cellulose deconstruction by cellulolytic enzymes in nature. Even this narrowing of scope leaves a great deal of informative and influential work to be addressed. We note this section is by no means meant to be an exhaustive review of cellulose structure, but, rather, a brief introduction setting the stage for discussion of cellulose deconstruction and the methods used to study cellulases. Lastly, we note that the mechanisms of cellulose synthesis, which are key for elucidating its structure in plants, have long been studied,
and identified as one of the major challenges in plant biology.14,64−66 Exciting recent work has begun to illustrate the molecular-level details of cellulose biosynthesis.15,67−70

Figure 2. Natural and synthetic cellulose polymorphs. For each of the four polymorphs, the “end-on” view is shown at top and the “top-down” view at bottom. Celluloses Iα and Iβ are naturally occurring polymorphs exhibiting only intralayer hydrogen bonding.88,89 The differences between the two polymorphs are most easily observed from the “top-down” view, which illustrates the subtle differences in interlayer chain stacking. Celluloses II and III_I, the result of chemically pretreating cellulose I, are significantly different in their chain stacking arrangement.90,94 Hydrogen bonding, shown in yellow, occurs between sheets as well as across the layers.

2.1. Cellulose Structures
Enzymes that break down recalcitrant polysaccharides must overcome multiple challenges in their catalytic action. At the atomic level, β-1,4-linked polysaccharides exhibit incredibly
strong covalent bonds. In an influential study, Wolfenden and colleagues estimated that O-glycosidic linkages such as those found in cellulose, chitin, and other polysaccharides are 2 and 4 orders of magnitude more stable than DNA or peptide bonds, respectively, to uncatalyzed hydrolysis at neutral pH, with a half-life of approximately 5 million years.71,72 Indeed, chemically intact cellulose and chitin (a similar substrate based on N-acetylglucosamine monomers) have been found in fossilized plants that are significantly older.73−77 Given these bond strengths, GHs are incredibly proficient enzymes in that they can provide rate enhancements (kcat/knon) up to 10^17-fold. This in turn makes GHs the most powerful hydrolytic enzymes known that do not employ metals or other cofactors.78,79

The covalent bond strength of the β-1,4 glycosidic linkage in cellulose is merely one of the challenges that enzymes face when depolymerizing this recalcitrant substrate. Cellulose microfibrils in plants pack into tightly bound, crystalline lattices wherein only a fraction of the chains are accessible to enzymatic attack on the microfibril surface, which forms yet another barrier that fungi and other biomass-degrading organisms must overcome. Individual cellulose chains are cosynthesized by large, membrane-bound terminal complexes, which simultaneously extrude and assemble multiple chains of cellulose into elementary microfibrils.14,64−66 Understanding how enzymes depolymerize cellulose from a mechanistic perspective is predicated on knowing the localized crystalline structure of cellulose chains and the shapes and properties of the cellulose microfibrils. Each question is briefly described below.

Figure 3. Four of the putative Iβ microfibril models proposed in recent years. For many years, the 36-chain diamond shape has been the prevailing model.64,104−108 NMR and X-ray scattering techniques have suggested alternative models may better fit the data. Fernandes et al. suggested the 24-chain rectangle and diamond models, the former of which was also proposed by Thomas et al.102,103 Newman et al. recently suggested the 18-chain diamond model fits wide-angle X-ray scattering data well.109

Cellulose can pack into multiple crystalline forms, or polymorphs. Natural systems, including plants, produce cellulose I, the study of which dates back to the 19th century.80−84 The structure of cellulose I as a set of parallel-oriented chains was originally reported by Gardner and Blackwell in 1974 using cellulose from the alga Valonia ventricosa.85 In 1984, plant cellulose was further shown to be a mixture of the cellulose Iβ and Iα polymorphs in two papers from Vanderhart and Atalla.86,87 In 2002 and 2003, Nishiyama and co-workers reported refined structures of both cellulose Iβ and Iα obtained from a tunicate (Halocynthia roretzi) and from the freshwater alga Glaucocystis nostochinearum.88,89 These structures were obtained from synchrotron X-ray and neutron diffraction studies of oriented cellulose films and were obtained at nearly atomic resolution in both cases such that hydrogen bond patterns could be established. The primary differences between the cellulose Iβ and Iα polymorphs reside in the hydrogen bonding patterns and the interlayer chain stacking arrangement (Figure 2). Cellulose Iβ forms two different layers, dubbed the “center” and “origin” layers, whereas cellulose Iα forms a unit cell with a single chain. In both cellulose I polymorphs, the hydrogen bonds only exist within single layers with no intersheet hydrogen bonds, which was noted at the time to be surprising, further suggesting that the van der Waals contacts in cellulose contribute a great deal to its overall thermodynamic stability. It should also be noted that both cellulose structures originated from cellulose microfibrils that are known to be much bigger in diameter than elementary plant cellulose microfibrils, which may have an impact on the structures, as discussed in more detail below.

Certain chemical treatments can convert cellulose I to other crystalline forms. Treatment with solvents such as sodium hydroxide (known as mercerization)90 or dissolution in ionic liquids91,92 can convert the parallel chains in native cellulose into an antiparallel arrangement with both inter- and intralayer hydrogen bonding interactions, producing cellulose II (Figure 2). The structure of cellulose II was also recently presented.90,93
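As a rough numerical check on the half-life and rate enhancement quoted earlier in this section for O-glycosidic bonds, the short Python sketch below converts the approximately 5 million year uncatalyzed half-life into a first-order rate constant and multiplies it by the 10^17-fold upper bound on GH rate enhancement. It assumes simple first-order kinetics and that both figures refer to comparable conditions, neither of which is stated in the text; the variable names are illustrative, and the resulting turnover number is only an order-of-magnitude illustration, not a measured value for any particular cellulase.

```python
import math

# Assumption: a Julian year (365.25 days) is used for the unit conversion.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def first_order_rate_constant(half_life_s: float) -> float:
    """Return k (s^-1) for a first-order process: k = ln(2) / t_half."""
    return math.log(2) / half_life_s


# Values quoted in the text.
half_life_years = 5e6      # uncatalyzed half-life of the O-glycosidic bond
rate_enhancement = 1e17    # upper bound on k_cat / k_non for GHs

# Uncatalyzed rate constant implied by the half-life (assumes first-order decay).
k_non = first_order_rate_constant(half_life_years * SECONDS_PER_YEAR)

# Turnover number implied if the full 10^17-fold enhancement applied
# (illustrative only; real enzymes vary widely).
k_cat_implied = k_non * rate_enhancement

print(f"k_non is about {k_non:.2e} s^-1")
print(f"implied k_cat is about {k_cat_implied:.0f} s^-1")
```

Under these assumptions the uncatalyzed rate constant comes out on the order of 10^-15 s^-1, and the implied turnover number is on the order of hundreds per second, which illustrates why the quoted enhancement is so large.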
Less severe chemical treatments, for example in ammonia, can also convert cellulose I or cellulose II into cellulose III (sometimes called cellulose III_I and III_II, indicating that it originates from cellulose I or II, respectively).94 Cellulose III forms staggered layers with intra- and interlayer hydrogen bonding interactions, unlike in cellulose I where the chains pack into flat layers (Figure 2). Both cellulose II and III typically exhibit greater digestibility by cellulase enzymes.20,95,96 Cellulose IV has been suggested as another polymorph,97 but less characterization has been conducted to date, and it has been recently suggested that it is likely quite similar to cellulose Iβ.98 Several thermodynamic measurements were conducted in the 1980s to understand the differences between cellulose I and II, which revealed relative enthalpic stabilities;99 additional rigorous thermodynamic and kinetic experiments will be required to more fully understand the interconversion between cellulose polymorphs.

2.2. Cellulose Microfibrils
In the cellulose Iβ and Iα structures determined by Nishiyama et al.,88,89 the cellulose microfibrils were very large in diameter (10−20 nm). Conversely, plant cellulose chains pack into microfibrils with much smaller diameters,100−103 and, thus, have much higher surface-to-volume ratios than cellulose from algae or tunicates. Regardless of the source, the cross-sectional shape of the microfibril is quite likely dictated by the shape and arrangement of the terminal synthase complexes. It has long been hypothesized that the elementary microfibril size in plants is 36 chains, primarily on the basis of imaging of terminal complexes combined with other analyses (Figure 3).64,104−108 More recently, new reports have begun to challenge the notion of a 36-chain model by applying a variety of scattering and nuclear magnetic resonance (NMR) techniques.100,102,103,109 Kennedy and co-workers examined cellulose from celery collenchyma, which is similar to cellulose from higher plants, using NMR and scattering methods. In a careful study with their assumptions discussed thoroughly, they conclude that the microfibril diameters are between 2.4 and 3.2 nm, which corresponds to a cellulose microfibril consisting of 15−25 chains each. In 2011, the same group used a wide variety of scattering and spectroscopic tools to examine microfibril cross sections in spruce wood.102 With the assumption that the number of chains in the microfibril must be divisible by 6, their findings suggest that spruce wood microfibrils comprise about 24 chains. In terms of microfibril shape, they suggest either a diamond or a rectangular-shaped microfibril, with the latter model fitting their experimental observations better (Figure 3). Moreover, their results suggest that the microfibrils are likely twisted and that the surfaces are disordered.102 In 2013, the same group again examined celery collenchyma cellulose microfibrils and demonstrated that the best-fit model was a 24-chain microfibril with 8 layers of 3 chains each.103 Subsequently, a study from Newman et al. applied similar methods to mung bean cellulose.109 They fit their scattering and NMR data to a 36-chain microfibril model but were not able to match the experimental data to the calculated diffractograms. Conversely, Newman et al. were able to achieve good agreement between the computed and measured diffractograms and spectra using 24-chain and 18-chain models, with the 18-chain model providing the best fit (Figure 3). Overall, this recent body of work suggests that cellulose microfibrils in higher plants may be smaller than the commonly assumed 36-chain models.

Figure 4. Computer simulation suggests that the cellulose Iβ microfibril twists by approximately 1.5° per cellobiose unit when it is solvated in water and the ends are not fixed. The existence of cellulose twist remains an open experimental question. A representative microfibril begins to twist after only 1 ns, shown at left. Reprinted with permission from ref 114. Copyright 2005 Elsevier Ltd.

Computer simulations have also been applied alongside structural studies to gain additional insights into molecular-level cellulose properties.110−113 Simulations in particular have provided supporting evidence for the twisting of microfibrils, and as such, several recent highlights that are complementary to structural, NMR, and scattering studies are discussed here. In an early study, Matthews et al. simulated a 36-chain model of cellulose Iβ and demonstrated that the microfibril is prone to twisting by approximately 1.5° per cellobiose unit (Figure 4).114 Additional computational studies have reported twisting in microfibrils of various sizes and shapes with multiple empirically fit energetic representations (force fields) for the carbohydrates,115−118 and recent simulations have suggested the
1314 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Chemical Reviews Review

physical basis for this twisting phenomenon.119 As mentioned above, experimental studies have also suggested that plant microfibrils indeed exhibit twist.102 The effects of cellulose microfibril twisting on cellulase action remain largely unknown.

Beyond atomistic simulations, multiple groups have developed coarse-grained models for studying cellulose at increasingly larger scales and at a variety of resolutions.120−127 Coarse-grained models are quite important for the study of cellulose phenomena such as enzyme action on the substrate, interactions of multiple cellulose microfibrils, or interactions with other biopolymers, all of which occur at long length and time scales. Going forward, improved simulation models for cellulose coupled to the development of high-resolution imaging will likely converge such that simulation and experiment can be directly connected. To this end, Ciesielski et al. recently reported the development of a transmission electron microscopy (TEM) method wherein atomic coordinates for cellulose microfibrils can be mapped onto the experimentally measured nanoscale architecture of the plant cell wall after mild thermochemical pretreatment.128 This study highlights the power of combined experimental and computational tools to understand the behavior of cellulose microfibrils in the cell wall context, which is key to understanding the physical and chemical environments that cellulases encounter during their enzymatic action.

It is commonly thought that cellulose comprises crystalline and amorphous regions. Although significant research has been conducted toward this question, there remains much controversy around this topic, and a clear definition of amorphous cellulose does not exist. Habibi et al. note that amorphous cellulose likely arises from chain dislocations wherein microfibrils distort due to internal strain.129 Taking the definition of amorphous cellulose as distorted regions along the length of crystalline fibers aligns well with the conventional model of cellulose hydrolysis by GH enzyme cocktails (ref 32) mentioned previously. Namely, EGs cleave cellulose chains in amorphous regions along the cellulose crystals, and CBHs attach to chains and processively hydrolyze glycosidic linkages down the chain. During processive hydrolysis in crystalline regions, CBHs must decrystallize individual chains of cellulose, which requires thermodynamic work. Several studies have examined the amount of work that is required to decrystallize chains from the surface of cellulose microfibrils with computer simulation.130−132 Beckham et al. examined chain decrystallization of cellulose Iβ, Iα, II, and III, demonstrating that the work to decrystallize chains of celluloses Iβ and Iα is greater than that of celluloses II and III.130 Moreover, it was shown that an increasing number of inter- and intralayer interactions increases the decrystallization work in essentially an equivalent manner, further suggesting that interlayer and intralayer interactions contribute similarly to the thermodynamic stabilization of cellulose.130

2.3. Cellulose Substrates

Accurate experimental representation of the nano- and microstructures of cellulose is a notoriously difficult prospect. Throughout the cellulase literature, multiple model celluloses are used as representative crystalline and amorphous substrates. Commonly, Avicel, bacterial microcrystalline cellulose (BMCC), tunicate or algal cellulose, various forms of pretreated biomass, or non-natural polymorphs derived from the many types of cellulose I are used to represent crystalline cellulose. Amorphous cellulose is often represented by phosphoric acid swollen cellulose (PASC), regenerated cellulose, or soluble, oligomeric substrates. Each of the “clean” cellulose substrates (i.e., free of hemicellulose or lignin) exhibits significantly different properties including degree of polymerization, degree of crystallinity, microfibril and fiber shapes, available reactive surface area for cellulase reactions, and possibly more variables that substantially impact the effectiveness of enzymatic deconstruction. The use of many model substrates, often with significant variation even within substrates, can make direct comparison of cellulase performance data difficult across laboratories and studies. For pretreated substrates, which typically contain other residual plant cell wall polymers, direct comparisons are even more challenging. Thus, we note that caution should be taken when comparing activity data across studies as substrate variations can greatly influence the outcomes of activity measurements. As the study of cellulase action is invariably linked to the substrate properties, these properties should not be overlooked. Detailed substrate characterization is essential in cellulase studies. Going forward, the combination of structural, nanoscale imaging, and modeling will continue to be important for elucidating the features of cellulose relevant to enzymatic deconstruction. Some of the more prevalent substrates are described below.

2.3.1. Microcrystalline Cellulose (MCC) from Plants. Purified, microcrystalline celluloses have been used extensively for comparing cellulase performance.96 These celluloses are made from wood pulp and are available commercially, including the following: Sigmacell, Solka-floc, and Avicel. Cotton linters, the dust-like byproduct from cotton processing, are also used for cellulose assays. Cotton linters are essentially fragmented, but chemically unmodified, cotton fibers reduced to small size (ca. 100 mesh). Whatman No. 1 filter paper, commonly used to determine the international filter paper unit, is made from cotton cellulose. The amorphous cellulose content of MCC (Avicel) can be increased by a mechanical treatment process known as ball milling. Ball-milled cellulose, especially when prepared under conditions of low water content, has been shown to be considerably more digestible than the starting material.127,133

2.3.2. MCC from Microbes. Cellulose is produced by green algae and some bacteria, primarily of the genera Acetobacter, Sarcina, and Agrobacterium. Acetobacter xylinum cellulose is most commonly used for cellulose digestion experiments as it can be produced in fermenters in yields as high as 15 g/L.134 The bacterial cell produces protofibrils of approximately 2−4 nm in diameter, which are eventually bundled into ribbon-shaped microfibrils of about 80 × 4 nm².135 Natural bacterial cellulose (BC) must be purified before use in enzyme assays. For A. xylinum cellulose, cellulose from culture medium is treated with sodium or potassium hydroxide and acetic acid, followed by repeated washing with ultrapure water.136 The resulting BMCC has microfibrils of approximately 0.1−10 μm in width, which is 100 times thinner than the microfibrils found in plant cell walls. It has also been demonstrated that purified A. xylinum BMCC has a degree of polymerization of about 800.137 Green algae in which crystalline cellulose is the major component of the cell walls include the Cladophorales (Cladophora, Chaetomorpha, Rhizoclonium, and Microdictyon) and a few members of Siphonocladales (Valonia, Dictyosphaeria, Siphonocladus, and Boergesenia). V. ventricosa produces large cellulose microfibrils (compared to plant cell wall microfibrils),138 which have been proposed to be in a 33 × 38 chain configuration with lengths of

hundreds of nanometers to a few micrometers.139 Valonia and Cladophora produce celluloses with an exceptionally high degree of crystallinity, approximately 95% from XRD,140 and for this reason make ideal substrates for cellulase studies. Algal cellulose is indeed complex, perhaps more than plant cellulose, considering that, in each Cladophora microfibril, the two cellulose I polymorphs were suggested to coexist, alternating either longitudinally or laterally.141 Koyama et al. further suggested that there are three types of cellulose I polymorphs found in green algae: Iα-broad microfibrils, Iβ-flat ribbons, and Iβ-small microfibrils with random orientation.142 Although highly crystalline, the specific surface area of Cladophora cellulose powder has been reported to be as high as 95 m²/g from N2 gas adsorption studies,143 which is much higher than the value for BMCC (∼1 m²/g). Today, the molecular and polymeric basis for the action of cellulases on algal celluloses is not clear. The advantages of using BMCC and algal celluloses are thus considerable, yet inconsistencies in preparation practices between laboratories can introduce difficulties in comparing assay results.

2.3.3. Phosphoric Acid Swollen Cellulose (PASC). Walseth first developed a procedure for producing high-reactivity cellulose suitable for cellulase activity studies by swelling air-dried cellulose in 85% phosphoric acid.144 After crystalline cellulose is dissolved, solubilized precursors form, which can subsequently be hydrolyzed. Cellulose dissolution in phosphoric acid involves two processes: an esterification reaction between hydroxyl groups of cellulose and phosphoric acid to form cellulose phosphate and a competition of hydrogen bond formation between the hydroxyl groups of cellulose and hydrogen bond formation between hydroxyl groups of cellulose and water molecules or hydrogen ions.145 During this acid treatment, some glycosidic bond hydrolysis occurs, which reduces the degree of polymerization, although the effects can be controlled. Following regeneration with water, free phosphoric acid is recovered, and the resulting cellulose is amorphous without significant recrystallization. PASC has been used widely as a test substrate for CBHs. Note that, in contrast to PASC, which has no chemical modification, carboxymethyl cellulose (CMC) retains esterified carboxymethyl side chains and is thus suitable only for testing EG action.

2.3.4. Cellulose Crystallinity. The crystallinity index (CI) of celluloses is a key parameter when selecting substrates for enzyme assays. Cellulose CI has been measured using several different techniques including XRD, solid-state ¹³C NMR, infrared (IR) spectroscopy, and Raman spectroscopy.146 There have also been several methods used for calculating CI from the raw spectrographic data, particularly for XRD. Methods using Fourier transform (FT)-IR spectroscopy determine CI by measuring relative peak heights or areas.147,148 Thygesen et al. compared four different analysis techniques involving XRD and reported that the CI of Avicel cellulose varied significantly depending on the technique used.149 In 2012, Park et al. made critical comparisons between the different techniques using XRD and solid-state ¹³C NMR.150 Comparisons were made with literature data for the CI of one type of cellulose (Avicel PH-101) using these methods. Park et al. also reported the CI values for seven crystalline cellulose preparations and BMCC, and further recommended that the simple, peak height XRD measurement be excluded from comparisons and the remaining values from other XRD and NMR methods be averaged to obtain the “best value” for CI. These results are shown in Table 1.

Table 1. Values for CI from Combined XRD and NMR Analysis (ref 150)

Cellulose tested   XRD peak deconvolution   XRD amorphous subtraction   NMR C4 peak separation   Average
Cladophora         ND                       80                          ND                       80
BMCC               73                       82                          74                       76
Avicel PH-101      61                       78                          57                       65
SigmaCell 50       61                       79                          56                       65
SigmaCell 20       64                       67                          53                       61
SolkaFloc          57                       57                          44                       53

3. GLYCOSIDE HYDROLASE CATALYTIC MECHANISMS

Given the diversity of monosaccharides and the multiple types of glycosidic linkages possible, carbohydrates form the most diverse set of biomolecules in nature. As such, the enzymatic machinery to synthesize, modify, and deconstruct carbohydrates is vast.151−153 The Carbohydrate-Active Enzymes Database (www.CAZy.org) is a manually curated list of the primary enzyme classes known to act on carbohydrates.151−153 Since its inception in 1998, CAZy has become an invaluable resource in carbohydrate enzymology. More recently, a sister site, CAZypedia (www.cazypedia.org), has begun to develop descriptions of the CAZy Database entries for each enzyme class and family within each class. The protein classes covered in CAZy as of the time of this review include glycosyltransferases, carbohydrate esterases, polysaccharide lyases, auxiliary activities, carbohydrate-binding modules (CBMs), and GHs. Glycosyltransferases (EC 2.4-) catalyze the formation of glycosidic bonds with nucleotide phosphate or lipid phosphate leaving groups and to date have been classified into 95 families.154 Carbohydrate esterases (EC 3.1.1- or 3.1.5-) are responsible for de-O- or de-N-acylation of polysaccharides, such as acetyl xylan esterases, and comprise 16 known families with multiple nonclassified sequences, perhaps suggesting that other families exist. Polysaccharide lyases (EC 4.2.2-)155 employ β-elimination reaction mechanisms to cleave uronic acid-containing polysaccharides, such as those commonly found in pectins.10,11 To date, polysaccharide lyases form 23 characterized families.151,155 “Auxiliary activities” or AAs are a more recent addition to CAZy,152 and currently include 12 families of enzymes in total, with 8 known to be active during lignin degradation and 4 known to be directly active on polysaccharides, namely LPMOs. Oftentimes, catalytic function for carbohydrates is associated with binding function, and thus, CAZy contains a classification scheme for CBMs as well, which to date represent 69 families, as will be discussed in section 5.156 Lastly, GHs are included in the CAZy Database. GHs are the primary drivers of enzymatic polysaccharide degradation in nature and represent a vast set of enzymes. To date, 132 GH families have been characterized. As most cellulases are GHs, we focus on their description in this section from a general, mechanistic viewpoint. Individual GH families that fungi employ, namely GHs from Families 5, 6, 7, 12, and 45, are described in separate sections below. We note that newly discovered LPMOs do not employ a hydrolytic mechanism; their mechanism of action is described in section 11.

3.1. Retaining and Inverting Mechanisms

As proposed by Koshland in 1953, nearly all known GHs employ one of two mechanisms: either retaining or inverting hydrolysis.157 Inverting mechanisms proceed via a single

Scheme 1. Two Primary Catalytic Mechanisms of GHs

(A) Inverting GHs employ a single displacement catalytic mechanism wherein a water molecule conducts nucleophilic attack at the anomeric carbon of the −1 sugar, a catalytic base abstracts a proton from the attacking water molecule, and a catalytic acid transfers a proton to the glycosidic oxygen to cleave the glycosidic linkage, resulting in an inversion of stereochemistry at the anomeric carbon. (B) Retaining GHs employ a two-step, double displacement catalytic mechanism. In the first step, the nucleophilic residue attacks the anomeric carbon simultaneously with the proton transfer from the acid residue to the glycosidic oxygen, resulting in the formation of the glycosyl-enzyme intermediate and cleavage of the glycosidic bond. In the second step, a water molecule enters the active site and attacks the anomeric carbon while simultaneously transferring a proton to the catalytic base, thus restoring the enzyme active site for subsequent catalysis.

catalytic step (Scheme 1A), wherein a water molecule conducts nucleophilic attack at the anomeric carbon of an oligosaccharide or polysaccharide, a proton from water is transferred to the catalytic base, and a proton is transferred from the catalytic acid to cleave the glycosidic linkage. Generally, both the catalytic acid and base residues exhibit carboxylate groups (i.e., Asp or Glu). This reaction results in an inversion of stereochemistry at the anomeric center. Before the next catalytic cycle, the catalytic acid and base must be reset to their configuration for catalysis. Scheme 1A illustrates the inversion from a β-linkage to an α-linkage. Conversely, retaining mechanisms are two-step reactions (Scheme 1B). The first step in retaining hydrolysis (typically termed “glycosylation”) involves proton transfer from the catalytic acid and attack at the anomeric carbon by the nucleophile residue to form a covalent glycosyl-enzyme intermediate and invert the stereochemistry of the sugar covalently bound to the enzyme. In the second step, termed “deglycosylation”, a water molecule enters the enzyme active site and conducts nucleophilic attack at the anomeric carbon. The glycosyl-enzyme intermediate bond is broken, and a proton is transferred to the catalytic base (which was the acid in the first step), resetting the enzyme and again inverting the stereocenter at the anomeric carbon for an overall net retention of stereochemistry. In retaining mechanisms, the carbohydrate product of glycosylation is typically not assumed or illustrated to participate in deglycosylation, and generally, both the catalytic acid and nucleophile are carboxylate residues. Several GH families have been discovered that do not follow this typical paradigm, which have been recently reviewed.158 All cellulases reviewed here follow the typical retaining or inverting hydrolysis mechanisms shown in Scheme 1 with the exception of LPMOs reviewed in section 11.

While the mechanisms put forth by Koshland are now widely accepted,159−161 there has historically been some debate about whether bond cleavage and formation occur in concerted steps in an SN2 reaction, as shown in Scheme 1, or via a carbocation intermediate as in an SN1-type reaction,162−166 or even via an acyclic oxocarbenium ion,167 although subsequent studies have not supported ring-opening as part of the mechanism.168 Researchers now discuss oxocarbenium-like transition states (TSs), rather than a distinct ionic intermediate, as the lifetime of an ion in an enzyme active site would be less than a molecular vibration.169−171 The oxocarbenium-like TSs show extended distances between atoms involved in bond cleavage and formation, sometimes referred to as “exploded” TS.161,170

GH enzymes often have multiple carbohydrate binding sites in their CDs. For example, family 7 glycoside hydrolase (GH7) cellulases exhibit at least 9 subsites for binding cello-oligomers, and catalysis generally takes place 2 glucose units from the reducing end of the chain bound in the enzyme tunnel.172−174 As GHs ubiquitously feature binding subsites for carbohydrate residues in polysaccharides, Davies, Wilson, and Henrissat proposed a scheme for the naming of carbohydrate binding sites similar to the scheme previously proposed by Biely, Krátký, and Vršanská.175,176 Specifically, they propose the use of the −n to +n system used by molecular enzymologists, where −n represents the nonreducing end and the +n represents the reducing end subsites. Thus, glycosidic bond cleavage in GHs occurs between the −1 and +1 subsites. This nomenclature has become nearly universally accepted in the carbohydrate enzymology and structural biology communities and will be used throughout this review.175,176
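
As a minimal sketch of the −n to +n subsite nomenclature, the short Python example below labels the subsites of a binding tunnel and counts the residues on each side of the scissile bond. The function names are hypothetical, and the 7/2 split for GH7 is an assumption inferred from the statements that GH7 cellulases have at least 9 subsites and cleave 2 glucose units from the reducing end.

```python
def subsite_labels(n_minus: int, n_plus: int) -> list[str]:
    """Label subsites from the nonreducing (-n) end to the reducing (+n) end."""
    minus = [f"-{i}" for i in range(n_minus, 0, -1)]
    plus = [f"+{i}" for i in range(1, n_plus + 1)]
    return minus + plus


def cleavage_products(occupied: list[str]) -> tuple[int, int]:
    """Count residues on each side of the scissile bond (between -1 and +1)."""
    n_nonreducing = sum(1 for label in occupied if label.startswith("-"))
    n_reducing = sum(1 for label in occupied if label.startswith("+"))
    return n_nonreducing, n_reducing


# Assumed GH7-like tunnel: 7 minus subsites and 2 plus subsites (9 in total).
gh7_subsites = subsite_labels(n_minus=7, n_plus=2)
print(gh7_subsites)
print(cleavage_products(gh7_subsites))
```

With a fully occupied tunnel under this assumption, the two residues in the +1 and +2 subsites leave together after cleavage, which corresponds to a cellobiose product.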

Figure 5. Representations of the IUPAC conformations of six-membered rings: chair (C) has four atoms in the same plane, with one atom above that
plane and an atom on the opposite side of the ring below the plane; envelope (E) has five atoms in one plane, with the sixth either above or below the
plane; half-chair (H) has four atoms in one plane, with one atom above the plane and an adjacent atom below the plane; skew (S) has four atoms in the
same plane, with one atom above the plane and an atom two positions away below the plane; and boat (B) has four atoms in the same plane, with two
atoms on opposite sides of the ring either both above or both below the plane.
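
The ring shapes in Figure 5 can be located quantitatively with Cremer−Pople puckering coordinates. The Python sketch below computes the total puckering amplitude Q and the angles θ and φ for a six-membered ring from Cartesian coordinates, following the standard Cremer−Pople construction as we understand it. The function names `cremer_pople` and `ideal_ring` are hypothetical, and the test rings are idealized hexagons with prescribed out-of-plane displacements, not real sugar geometries.

```python
import math


def cremer_pople(coords):
    """Return (Q, theta_deg, phi_deg) for six ring atoms given in ring order."""
    n_atoms = 6
    if len(coords) != n_atoms:
        raise ValueError("expected six ring atoms")
    # Shift the ring so that its geometric center is at the origin.
    center = [sum(p[k] for p in coords) / n_atoms for k in range(3)]
    r = [[p[k] - center[k] for k in range(3)] for p in coords]
    ang = [2.0 * math.pi * j / n_atoms for j in range(n_atoms)]
    # Mean-plane normal from the two projection vectors R' and R''.
    r1 = [sum(r[j][k] * math.sin(ang[j]) for j in range(n_atoms)) for k in range(3)]
    r2 = [sum(r[j][k] * math.cos(ang[j]) for j in range(n_atoms)) for k in range(3)]
    normal = [
        r1[1] * r2[2] - r1[2] * r2[1],
        r1[2] * r2[0] - r1[0] * r2[2],
        r1[0] * r2[1] - r1[1] * r2[0],
    ]
    length = math.sqrt(sum(c * c for c in normal))
    normal = [c / length for c in normal]
    # Out-of-plane displacement of each ring atom.
    z = [sum(r[j][k] * normal[k] for k in range(3)) for j in range(n_atoms)]
    # Puckering components for m = 2 (q2, phi) and m = 3 (q3).
    a = math.sqrt(2.0 / n_atoms) * sum(z[j] * math.cos(2.0 * ang[j]) for j in range(n_atoms))
    b = -math.sqrt(2.0 / n_atoms) * sum(z[j] * math.sin(2.0 * ang[j]) for j in range(n_atoms))
    q2 = math.hypot(a, b)
    q3 = math.sqrt(1.0 / n_atoms) * sum((-1) ** j * z[j] for j in range(n_atoms))
    total_q = math.hypot(q2, q3)
    theta = math.degrees(math.atan2(q2, q3))
    phi = math.degrees(math.atan2(b, a)) % 360.0
    return total_q, theta, phi


def ideal_ring(z_offsets, radius=1.4):
    """Hypothetical test ring: a regular hexagon with given z displacements."""
    return [
        (radius * math.cos(2.0 * math.pi * j / 6), radius * math.sin(2.0 * math.pi * j / 6), z_offsets[j])
        for j in range(6)
    ]


chair = ideal_ring([0.25 * (-1) ** j for j in range(6)])
boat = ideal_ring([0.25 * math.cos(4.0 * math.pi * j / 6) for j in range(6)])
print("chair:", cremer_pople(chair))
print("boat: ", cremer_pople(boat))
```

In this sketch, the two chairs sit at the poles of the sphere (θ near 0° or 180°), and boat and skew forms sit on the equator (θ near 90°). Which pole corresponds to a particular chair depends on the atom ordering supplied, so the ordering convention should be checked before comparing with published values.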

3.2. Carbohydrate Ring Puckering

A significant number of crystallographic studies have been conducted to date to study GH catalysis, as detailed in subsequent sections of this review for cellulolytic enzymes, and many studies have employed transition state analogues to capture the Michaelis complex of carbohydrates in GH active sites. A universal observation to date is that GHs distort carbohydrate ring geometries in Michaelis complexes for catalysis away from the chair conformations that are thermodynamically stable in aqueous solution.159,161,169,177,178 To systematically classify sugar puckering geometries, Schwartz developed the original nomenclature for describing the 38 canonical puckering conformations of pyranose rings,179 which was subsequently adopted by the International Union of Pure and Applied Chemistry (IUPAC).180 This system describes pyranose rings as chair (C), envelope (E), half-chair (H), skew (S), and boat (B) conformations, as shown in Figure 5. The naming system further uses superscript and subscript numbers for each ring conformation to denote which atoms are outside of the reference plane formed by four atoms (e.g., B1,4 denotes a boat-shaped pyranose ring with the C1 and C4 carbons below the reference plane). To quantitatively describe puckering, Cremer and Pople proposed a spherical coordinate system that uniquely describes the pyranose ring conformations as a function of three parameters that describe a sphere (Figure 6).181 This has become the standard method to quantify the degree of ring puckering and is quite useful when stable puckering geometries in enzymes are intermediate between the 38 IUPAC puckering geometries.182−184 Hill and Reilly introduced a system of triangular decomposition that is particularly well-suited for molecular simulations and also uniquely identifies the exact puckering conformation.185

Figure 6. The 38 IUPAC-designated puckering conformations for six-membered rings are projected here on a two-dimensional representation of the Cremer−Pople sphere.

Carbohydrate ring puckering in GH active sites has been identified since the seminal structural study of the hen egg-white lysozyme,186 which was the first structure of an enzyme ever solved.187 On the basis of analogy with SN1 transition states, it was suggested that the ring distortion is a key component of the hydrolysis mechanism, contributing to the formation of a carbonium ion and weakening the C1-glycosidic bond.186 Other research groups have proposed that the role of ring distortion is overestimated on the basis of lack of difference in observed binding constants (ref 188) or calculated low-energy conformations.189 However, structural studies of the hen egg-white lysozyme confirmed that a key substrate ring is distorted from the chair conformation, which investigators attributed to forming a geometry that supported formation of an oxocarbenium ion at C1 (ref 190) or to weaken the scissile glycosidic bond by creating a higher-energy ground state.162,191

In one of the first published structures of a cellulase, specifically a family 6 glycoside hydrolase (GH6), Rouvinen et al. noted that the sugar ring in the −1 subsite (at the time referred to as the “B subsite”) was distorted from the solution-stable 4C1 conformation. The authors were originally unsure if this distortion was functionally significant.192 Barr et al. later reported that GH mutants that could allow the −1 sugar to relax to a chair conformation exhibit increased binding affinity and decreased hydrolytic activity, supporting the proposal that ring distortion is important for catalysis.193 Zou et al. observed distorted sugars in the −1 subsite in studies of T. reesei Cel6A (TrCel6A) with a nonhydrolyzable ligand, identifying the residue whose steric clash with the hydroxymethyl group forces the sugar ring into a distorted conformation.194 They proposed that the distortion is integral to the catalytic mechanism, providing for nonperiplanar orientation between the scissile bond and a doubly occupied, nonbonding orbital from the ring oxygen.194 As reviewed in detail below, many more cellulase structures have since been solved that reveal puckered carbohydrate rings in the −1 subsites; certainly, the same is also true for GHs beyond cellulolytic GHs, but these additional structures are outside of the scope of this review. For capturing such catalytic reaction coordinates, GH structural biology efforts have relied on the use of synthetic ligands, such as thio-linked sugars, which cannot be hydrolyzed, or on other transition state analogue ligands in native enzymes.161,178,195−197 Withers et al. pioneered the use of fluoro-sugars in mechanistic GH studies wherein fluorine atoms are substituted for hydroxyl groups in pyranose sugars. For example, substitution at the C2 position can result in a decrease in deglycosylation rates in retaining GHs, enabling capture of glycosyl-enzyme intermediates.198−202 Other substitutions can lead to the ability to probe mechanistic steps in GHs when coupled to NMR, mass spectrometry, and enzyme kinetics analysis.203 Moreover, the use of native substrates in catalytically inactive mutants, such as the mutation of glutamate, a common acid/base, to glutamine, is a common approach. Through approximately a decade of work

starting in the 1990s, it became clear that cellulases from a variety of GH families commonly distort pyranose rings from the 4C1 conformation to a puckered conformation, observed in the 1S3 (refs 204−208), 2SO (refs 192, 194, 209), BO,3 (refs 192, 209), or 2,5B (ref 210) geometries. The preponderance of data indicating conservation of puckering in enzymes from a wide range of GH families and different organisms indicates that ring distortion indeed plays a central role in catalysis of glycosidic bond cleavage, and spurred efforts to further understand the role of puckering in catalysis.

Proposals for the role of puckering include providing for antiperiplanar alignment of the ring oxygen lone pair with the leaving group (the scissile glycosidic bond), a requirement of Deslongchamps’s theory of stereoelectronic control,211 later referred to by the more specific name “antiperiplanar lone pair hypothesis (ALPH)”.212 This requirement in enzymatic reactions has been refuted on the basis that predictions from this theory have been proven false,169,170 such as the prediction that α-linked substrates would retain their ground state, chair conformation during reaction.166,213 In counterpoint to such objections, Nerinckx et al. noted that observed pucker conformations seemingly counter to the ALPH could in fact substantiate the ALPH if they were points on a catalytic itinerary with the 1C4 inverted chair, rather than the solution-stable 4C1 chair conformation, and the 1C4 geometry could convert to 4C1 in a subsequent step.214 This suggestion opposes the “principle of least nuclear motion”, which posits that enzymatic itineraries adopt conformations which minimize such large conformational changes.168

Deslongchamps further theorized that ring “distortion raises the energy of the ground state and thus lowers the energy of activation for bond cleavage”,166 in agreement with the earlier proposal by Jencks.162 Warshel vigorously repudiated this “strain theory” based on theoretical models showing that it cannot provide a significant catalytic effect.215,216 Additionally, researchers have noted that observed puckering conformations align β-linkages in an axial position suitable for nucleophilic attack.159,186,217,218 While it may be a factor in catalysis, it is not a universal feature of GH substrate puckered geometries.184

Warshel advocated that the most important enzymatic contribution to catalysis is stereoelectronic stabilization of the transition state,215,216 harkening back to Pauling’s theories on enzyme mechanisms.219 Blake et al. originally proposed a stereoelectronic argument for ring distortion, suggesting that the puckered conformation aids catalysis by allowing the ring oxygen to share charge with the anomeric carbon, stabilizing a carbonium ion formed during the proposed reaction mechanism.163,186 As previously discussed, it is now widely believed that GHs employ oxocarbenium-like TSs, rather than ionic intermediates, but the basic concept of the puckered geometry stabilizing a positive charge at the anomeric carbon still applies, and stabilizing the TS is exactly what Pauling and Warshel champion. The proposal that puckered ring geometries stabilize positive charge at the anomeric carbon continues to hold wide support.159,161,196,220

As modifications to the electronic structure through carbohydrate ring puckering are now an obvious feature of GH mechanisms, quantum mechanical approaches coupled to structural biology studies are a clear means to probe the mechanistic underpinnings of carbohydrate catalysis.159 A series of theoretical studies have been conducted to determine the effects of ring distortion on catalysis. Smith performed one of the earlier studies using 2-oxanol.221 This work suggested that ring distortion may obviate the need to form a high-energy oxocarbenium ion (instead forming a lower-energy oxonium ion−water complex) as well as lowering the transition state energy by stretching the scissile glycosidic bond.221 To investigate whether preferential properties for catalysis could be observed in isolated puckered monosaccharides, several studies have examined differences between puckered conformations of β-D-glucose. The metadynamics studies by Biarnés et al. investigated the relative stability of different puckered conformations by employing Cartesian-collective variables to explore the puckering potential energy surface.222 While the chosen collective variables resulted in distortions in puckering amplitude and energy barriers,223 Biarnés et al. found that bond lengths and partial charges of key atoms change as a function of puckering geometry, with puckered geometries observed in enzyme complexes displaying catalytically favorable properties such as elongated C1−O1 bond distance, shortened C1−O5 bond distance, and a higher partial charge on C1.222

Barnett and Naidoo performed a study of β-D-glucose puckering ensuring more complete sampling of puckering geometries, revealing the free-energy landscape calculated with the semiempirical method PM3CARB-1.182 The initial investigation was followed by a study in which they compared free energy differences calculated with different semiempirical methods against the density functional theory method B3LYP/6-311++G(d,p).183 The trends revealed by the semiempirical methods were qualitatively similar, with notable quantitative differences: PM3CARB-1 revealed that the second-lowest energy conformation is BO,3, just 1.6 kcal/mol higher in free energy than the lowest-energy 4C1 conformation obtained, while B3LYP/6-311++G(d,p) identified BO,3 as the third-lowest conformation, at 5.5 kcal/mol higher than the lowest-energy 4C1 conformation. Together, these papers confirm that different puckered geometries afford measurably different properties that would make them more or less amenable to catalysis, and they show the results are sensitive to the method employed.

Recently, Mayes et al. employed a highly accurate electronic structure method (CCSD(T)/6-311+G(d,p)//B3LYP/6-311+G(2df,p)) in a study that ensured thorough sampling of monosaccharide ring geometries.184 In addition to the 38 IUPAC puckered conformations, monosaccharides have exocyclic groups that are free to rotate at ambient temperatures, resulting in the notorious carbohydrate flexibility that challenges characterization of the structural and dynamic properties of carbohydrates.224 The study by Mayes et al. compared puckering behavior of five biologically important sugars, and the differences among them testify to the importance of exocyclic groups in defining puckering landscapes. For β-D-glucose, they confirmed that the puckered conformations employed by GH active sites offer a combination of catalytically advantageous properties, such as a higher partial charge at C1 to make it a better target for nucleophilic attack. Furthermore, these conformations have lower barriers for ring interconversion.

In the following sections, we review specific details of each cellulase family represented prevalently in fungi, which comprise both inverting and retaining enzymes. Attention is given primarily to specific elements of the catalytic steps in each cellulase family. For more comprehensive reviews of general GH catalytic mechanisms, we refer readers to recent reviews.158,159,161,196
1319 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Chemical Reviews Review

Figure 7. Genealogy of T. reesei mutants developed internationally from World War II to about 2000. Mutagenesis was effected by radiation or
chemical treatments. At the time, exposure to a linear accelerator was an effective means of irradiation. Most commercial preparations used today are
based on proprietary improvements of RUT-C30.
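
The strain derivations named in the text of section 4 (for example, QM9414 from QM9123, and RUT-C30 from NG-14) can be collected into a small lookup table, which makes the lineage of any one strain easier to follow. The Python sketch below is illustrative only and is not taken from the review or from Figure 7: it encodes only derivations that the text states explicitly, it collapses multiple rounds of mutagenesis into a single step, and the names PARENT and lineage are hypothetical helpers introduced here.

```python
# Maps each strain to the strain it was derived from, using only derivations
# stated in the text of section 4. Intermediate mutagenesis rounds are
# collapsed into one step. PARENT and lineage are hypothetical names.
PARENT = {
    "QM9123": "QM6a",
    "QM9414": "QM9123",
    "MCB-77": "QM9414",
    "M-7": "QM9414",
    "NG-14": "M-7",
    "RUT-C30": "NG-14",
    "CL-847": "RUT-C30",
    "ALKO3620": "QM9414",
}


def lineage(strain):
    """Return the chain of strains from `strain` back to its earliest listed ancestor."""
    chain = [strain]
    # Walk up the table until a strain with no recorded parent is reached.
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


if __name__ == "__main__":
    print(" <- ".join(lineage("RUT-C30")))
```

Calling lineage("CL-847") in the same way traces that strain through RUT-C30 back to QM6a. Strains whose parent is not given in the text, such as RL-P37, are deliberately left out of the table.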

4. EARLY DEVELOPMENTS IN FUNGAL CELLULASES
Prior to the landmark publications of the first structures of fungal cellulases, many biochemical
experiments shed light on the nature of cellulase enzymes, especially from the model fungus
T. reesei. As a preface to detailed structural and mechanistic descriptions, we first briefly
describe some of the initial developments in cellulase enzymology related to the initial isolation
of T. reesei, the early development work to produce mutant strains for industrial production of
cellulases, and the first studies related to fingerprinting the cellulolytic cocktail of T. reesei.
We note that the naming scheme of cellulases changed from their initial characterization to adopt
the standard CAZy GH classification. Throughout section 4, we typically employ the “classical” name
[e.g., the previously used T. reesei “CBH I” versus the new, commonly used T. reesei “Cel7A”
(TrCel7A)], and provide the new, widely used enzyme names at the end of this section in Table 2.
Beyond this section, we only utilize the new naming schemes for enzymes.

4.1. History of the Discovery and Improvement of T. reesei Strains: Premolecular Era
4.1.1. Early Work at U.S. Army Natick Laboratories. The history of the discovery and improvement of
the Trichoderma strains has been thoroughly studied and reviewed.225−229 Many Trichoderma species
are known today to be cellulolytic, a characteristic of their saprophytic lifestyle. These species
include T. reesei, T. lignorum, T. koningii, T. harzianum, T. longibrachiatum, T. virens, and
T. pseudokoningii. Studies of decomposing cotton militaria sent from Bougainville Island (Solomon
Islands) in the South Pacific to the U.S. Army Quartermaster Research and Development Center at
Natick, Massachusetts, forged our understanding of an important cellulolytic fungus, eventually
named T. viride QM6a.62 Note that T. viride was eventually reclassified as T. reesei in honor of
Elwyn Reese.6 Postwar work at Natick by Reese and Mandels led to the modern concept of
saccharification of biomass to fermentable sugars effected by the powerful T. reesei enzyme
cocktails.230 An important aside is that, later in the 1970s, Gauss and co-workers at the Bio
Research Center Company (Nagoya, Japan) patented the concept of combining, in one tank, T. reesei
fungal enzymes, milled biomass, and fermentative organisms.231 This revolutionary concept, later
improved in a U.S. patent owned by the Gulf Research & Development Company,232 dramatically
increased thermodynamic “pull” to products, and was called simultaneous saccharification and
fermentation, or SSF.

The wild-type strain of T. reesei, QM6a, has been thoroughly studied and, importantly, used to
generate the modern strains of enhanced industrial microorganisms by radiation mutagenesis.
T. reesei QM9414, generated from QM9123, was the first production grade strain (Figure 7).233 QM9414
was found to produce approximately 2 times more cellulase protein than QM9123,234 which was in turn
superior to the QM6a parent strain by about the same extent.235 This era of very early strain
improvement is today difficult to follow for a number of reasons, including the limited methods of
protein and activity determination available at the time. For example, the Biuret copper oxidation
assay, known today to be highly influenced by sugars and amino acids, was the dominant protein
assay. Today, the bicinchoninic acid or dinitrosalicylic acid reagent assays are preferred.236 The
other problem was that workers worldwide used vastly different cellulase performance assays,
normally based on lab-specific digestion curves, given simply as “units”. A true universal cellulase
assay, first proposed by Ghose in 1987, helped alleviate these problems by defining a relevant
measure for cellulases, later known as the international filter paper unit.237

4.1.2. Early International Effort for Strain Improvement. Starting from QM9414, researchers at the
Natick Lab used ultraviolet light to generate the MCB-77 series of mutants.238 MCB-77 strains
demonstrated volumetric productivity of 90 IU/L/h, compared to 30 IU/L/h for QM9414.239 The M series
of mutants, produced at Rutgers University from QM9414, yielded strain M-7.240 M-7 was then treated with
nitrosoguanidine, which, after suitable performance selection on cellulose, resulted in the NG
T. reesei mutant series. Strain NG-14 showed approximately 5 times the filter paper activity, and
twice the activities of cellobiase (or β-glucosidase) and EG of QM9414.241 NG-14 also produced about
twice as much cellulase protein as QM9414, or about 1.4 mg/mL.

Montenecourt and Eveleigh used T. reesei NG-14 to generate the next series of mutants by chemical
mutagenesis, including the C and E series. From this series, the Rutgers C30 and E58 are of interest
as they are carbon catabolite derepressed strains. Both of these strains showed about 4−6 times the
cellobiase (or β-glucosidase) activity of QM6a, although the filter paper activity was comparable.
The RUT-C30 strain was superior to all other strains reported at the time, producing about 2.2 g/L
cellulase protein,240 a trait later attributed to an increased endoplasmic reticulum content.242
Application of RUT-C30 to SSF further enhanced its performance on cellulose and biomass.243 Another
hypercellulolytic strain produced during the late 1970s at Rutgers was RL-P37.240 RL-P37 is thought
to be the parent strain further developed by the U.S. cellulase industry and used for large-scale
production today. During this time, reports of a Cetus Corporation proprietary strain, known as L27,
a regulatory mutant of QM9414, revealed that considerable effort was being applied to improving
T. reesei by the industrial sector.244 Besides the cellulase improvement programs discussed above at
Natick and Rutgers, industrial sponsorship of this work was also ongoing in the 1980s−1990s in
Finland (VTT), the United States (Cetus), Japan (Kyowa), and France (CAYLA).233 Smaller programs
were also underway in Portugal, Czechoslovakia, and India. A report from Portnoy (2011) suggested
that the only other T. reesei strain roughly comparable to RUT-C30, CAYLA’s CL-847, was in fact
derived from RUT-C30.245

4.2. New Understanding from the Molecular Era: Cloning in T. reesei and S. cerevisiae
4.2.1. Cloning and Protein Production in T. reesei. In 1987, Penttilä et al. reported the successful
transformation of T. reesei using a plasmid carrying the dominant selectable marker amdS.246
Heterologous DNA was found to be integrated at several different locations in the T. reesei genome,
often in multiple tandem copies. The successful expression of an Escherichia coli β-galactosidase in
T. reesei was also reported by these workers.246 This same year, the same group further reported the
characterization of the major genes coding cellulases from T. reesei.247 The group from VTT reported
the first review of homologous and heterologous protein expression in T. reesei, citing both
promoter changes and gene inactivation as critical for successful outcomes.248 The strong, inducible
promoter for the cbh1 gene is highly recommended in this review. In the same year, Harkki et al.249
reported the first example of genetic engineering of T. reesei for the purpose of varying the
production of key cellulases.249 These workers produced a strain of T. reesei in which the cbh1 gene
was inactivated and the gene coding the major EG, egl1, was cloned into a vector carrying the CBH I
promoter and terminator, which resulted in the overexpression of EGI. The following year, 1992,
progress to date and outlook for cloning heterologous genes in T. reesei were discussed in a
landmark review from VTT.250 The ability to utilize a CBH I deletion strain of T. reesei was
demonstrated the following year by the publication of perhaps the first paper reporting the
engineering of this enzyme. Srisodsuk et al. showed that the linker of the three-domain T. reesei
CBH I [CD−linker−cellulose binding domain (CBD)] could be shortened or otherwise modified
genetically to explore its possible role in cellulose saccharification, as discussed in
section 5.251

At this time, new tools were proposed for improving the transformation efficiency of T. reesei,
notably hygromycin resistance.252 In 1995, Nakari-Setälä et al. reported the development of a
modified strain of T. reesei able to grow on glucose, where catabolite repression shuts down the
normal cascade of cellulase expression, and to produce targeted proteins using novel promoters.253
Although expression levels of the cellulase CDs reported in this study were low, about 100 mg/L,
this report marked a very important early step in engineering promoter performance as well as
movement toward engineered T. reesei strains that were able to secrete heterologous proteins into
the culture medium without the high background interference from other cellulases. A critical review
in 1995 by Keränen and Penttilä outlined the state of the art for expression of heterologous
proteins in filamentous fungi.254 The next year, a report from Ilmén et al. described the isolation
and characterization of the cre1 genes of the filamentous fungi T. reesei and T. harzianum.255 We
know today that, in multicellular ascomycetes, the C2H2 type transcription factor CreA/CRE1 acts as
a repressor mediating carbon catabolite repression.256 In T. reesei, CreA/CRE1 binds to the
promoters of the respective target genes via the consensus motif, 5′-SYGGRG-3′, and represses
transcription.257 Ilmén et al.258 also reported the first detailed analyses of the CBH I promoter.
This work set the stage for follow-up work from the same group the next year which began to describe
regulation of cellulase expression in T. reesei at the molecular level.259 In 2000, Pakula et al.
reported the development of a novel isotope labeling approach for measuring protein synthesis and
secretion in T. reesei in an attempt to understand the rate-limiting events occurring in cellulase
production.260 Over the next few years, work at VTT continued to investigate the protein
transcription/expression factors responsible for efficient production of cellulases, including ACEI,
ACEII, and RHOIII.261−263 A few years later, the VTT group reported the characterization of two
genes, ire1 and ptc2, implicated in the T. reesei unfolded protein response.264

Following this early work, reports began to emerge regarding the development of new strains of
T. reesei that had been genetically modified to relieve catabolite repression and thus improve
target protein production when cultures are grown on glucose. A leading example was the report from
VTT summarizing development of strains of T. reesei in which the cre1 genes were deleted or modified
by truncation (cre1-I) to improve expression of plant cell wall degrading enzymes under noninducing
growth conditions.265 It is noteworthy that the cre1-I gene was found earlier in the RUT-C30 strain
and thus explains some of the ability of this strain, produced by chemical mutagenesis in the 1980s,
to produce cellulases when grown on glucose. Kubicek et al.266 summarized promising genetic
strategies for improving protein production in T. reesei. It is known that plant cell wall degrading
enzymes are inducible and under both positive and negative transcriptional control. In this review,
the authors summarize what is known about the positive control elements, which include XYR1, ACE2,
and HAP2/3/5, and the negative elements, ACE1 and CRE1. The authors further propose increased
research focus on signal transduction pathways and other potential factors known to influence gene
regulation, such as light cycling and intensity. More recently, Steiger et al.267 reported the next
generation of T. reesei strains
developed for engineering, namely strains that are susceptible to homologous gene integration and
employ reusable bidirectionally selectable markers.

4.2.2. RUT-C30 Revealed. In the modern era of systems biology, the critical mutations conferring
superior performance to RUT-C30 and its descendants have been elucidated.268 In 2009, a massively
parallel sequencing effort was used to compare the genomes of T. reesei RUT-C30 and its direct
ancestor NG14 with the published genome of the wild-type organism, QM6a.269 The genomes of RUT-C30
and NG14 were found to be missing over 100 kb of genomic DNA present in QM6a, encompassing 18 large
deletions in RUT-C30. These deletions included the cre1 gene truncation identified earlier by Ilmén
et al.258 and 15 small deletions or insertions in RUT-C30. As stated above, cre1 regulates
catabolite repression in most cellulolytic fungi and is responsible for the strong inhibitory effect
glucose has on cellulase production.258,270 For wild-type Trichoderma, the “cellulase signaling
cascade” is initiated by natural environmental chemical inducers, such as cellobiose. However,
large-scale enzyme production is usually conducted with more powerful inducers, such as sophorose
and lactose. We note that a process has recently been reported which converts glucose-rich,
biomass-derived monosaccharide mixtures to sophorose and other products using enzymatic
transglycosylation,271 providing a more cost-effective source of this inducer. Beyond the mutations
to cre1, 211 single nucleotide variants in RUT-C30 were found by Ilmén et al.258 and Seidl et al.270
These mutations and deletions affected 43 genes in NG14 and in RUT-C30.269 One of these genes in
RUT-C30, gls2α, encodes the glucosidase II α-subunit,272 an enzyme important for trimming N-linked
oligosaccharides in glycoproteins, such as cellulases. Indeed, detailed biochemical analyses of
Cel7A purified from RUT-C30 broth show evidence that this enzyme is not glycosylated normally.273 As
noted by Peterson and Nevalainen,274 several of the industrial strains derived from QM6a were
selected for resistance to the chemical mutagen, nitrosoguanidine, which is now known to be a
glycosylation inhibitor.275

In contrast to QM6a, RUT-C30 can be grown on glucose for cellulase production, and improvements in
growing and inducing RUT-C30 have led to reports of its cellulase productivity as high as 30 g/L.
Proprietary strains of T. reesei used in industry, based on RUT-C30 and its descendants, have led to
strains delivering more than 100 g/L cellulase protein.229 In spite of this outstanding industrial
record, RUT-C30 remains a marginal producer of heterologous proteins. Examples of such expression
results include the following, in descending order of production: Hormoconis resinae glucoamylase P
(0.5 g/L),276 Melanocarpus albomyces laccase (0.23 g/L),277 Acrophialophora nainiana XynVI
(0.17 g/L),278 and CBH I-Fab fusion antibody (0.15 g/L).279 All other examples of protein expression
from RUT-C30 reported are below these levels. However, T. reesei strain ALKO3620 (produced from
QM9414 by multiple rounds of mutation) was reported to produce 1.9 g/L of Nonomuraea flexuosa
Xyn11A,280 which may make this case the highest publicly available heterologous protein expression
example in a mutant T. reesei strain. Peterson and Nevalainen274 suggest that the causes of poor
heterologous protein expression lie in the incompatibilities of foreign peptides with natural
mechanisms of protein recognition and disposal within the cell, known collectively as the unfolded
protein response.

4.2.3. Cloning T. reesei Genes in S. cerevisiae: The Road to Consolidated Bioprocessing. In many
ways, the concept of direct microbial conversion, also known as consolidated bioprocessing, inspired
the early work in T. reesei molecular biology. The molecular cloning era of T. reesei cellulases
began with simultaneous reports from the U.S. and Europe in the October 1983 issue of Bio/Technology
that the T. reesei gene, cbh1, had been successfully cloned in the heterologous host, E. coli, using
lambda phage technology.281 The work from North America was reported by the research team from Cetus
Corporation,281 and the work from Europe was reported by VTT.282 At this time, it was the stated goal
of Cetus Corporation to transform S. cerevisiae with the four dominant cellulase genes from
T. reesei in order to engineer a cellulose-digesting, ethanol-producing organism. To demonstrate the
level of international competition ongoing at this time, we note that four years later workers at
VTT reported several landmark accomplishments relevant to this same objective. First, Penttilä et
al. reported the successful cloning of active EG I and EG III in S. cerevisiae using the yeast
phosphoglycerate kinase promoter.283 At this time, these workers noted that the molecular weight of
EG III was higher than expected and proposed potential problems with hyperglycosylation in yeast.
Importantly, in 1988, Penttilä et al. reported the expression of CBH I and CBH II in
S. cerevisiae.284 However, these heterologous enzymes were found to be highly polydisperse in
molecular weight, to bind poorly to crystalline cellulose, and to be active only on amorphous
cellulose. Pilot scale (200 L) production of T. reesei CBH II in S. cerevisiae was reported by VTT
in 1990.285 Cetus Corporation discontinued its work on cellulases in the mid-1980s and was sold to
Chiron Corporation in 1991. In 1993, this group reported the coexpression of CBH II and EG I in
S. cerevisiae and showed that both enzymes were active on crystalline cellulose and acted
synergistically.286 The group at VTT continued to express T. reesei genes in S. cerevisiae well into
the next decade, for example, including expression of the family 5 glycoside hydrolase (GH5)
mannanase gene, man1.287 The next leap in understanding of heterologous gene processing in
S. cerevisiae resulted from studies of the unfolded protein response.288

Many years later, work to express T. reesei plant cell wall degrading enzymes continues worldwide,
and VTT has remained involved in this field. In 2011, a notable report described a systematic study
to express, at higher titers than previously reported, active CBH I and II in S. cerevisiae using an
engineered chimera approach.289 These authors were able to report the fermentation of MCC to ethanol
by S. cerevisiae strains expressing CBHs with the addition of β-glucosidase. This concept is, of
course, the direct microbial conversion or consolidated bioprocessing process. The team authoring
this work included authors from VTT, Mascoma Corporation, and the University of Stellenbosch, South
Africa. In the year of this review, Voutilainen et al. have reported the expression of active,
thermally stable chimeric CBHs in S. cerevisiae.290

4.3. T. reesei Cellulases: Understanding the Mechanisms of Action
4.3.1. T. reesei as an Early Model for Cellulase Action. Likely because of the significant resource
availability for T. reesei at the time, this fungus soon became the archetype microorganism used to
study cellulase digestion at the level of the enzymes it produces, instead of the World War II era
focus on microbial action and effects. In 1950, Reese and co-workers reported that although many
fungi could hydrolyze derivatized celluloses, only a few could grow on crystalline cellulose, such
as cotton.291 At this time, the molecular nature of the cellulase
concept was not known. We note that, during the war years, only crude salt and metal ion mediated
solvent precipitation of proteins could be applied to protein separations. These methods, developed
for fractionating blood plasma proteins for the war effort, were entirely unsuitable for
purification of microbial enzymes. Postwar developments in protein biochemistry, which emerged
immediately after the war, helped the nascent cellulase biochemistry programs. Perhaps the most
significant of these developments was the work in the 1950s on enhanced protein separation
methodology in Uppsala, Sweden. Work by Pederson in Uppsala during this era led to the development
of modern size exclusion chromatography with early columns being packed with simple agarose gel
spheres.292 With the ability to separate active, secreted enzymes, modern cellulase biochemistry was
born.

Simple, atmospheric pressure column chromatographic techniques were used in the early and mid-1960s
to purify cellulases for study.293−295 As this was before the time of widespread use of
electrophoretic gels, analytical ultracentrifugation was the primary tool for characterizing the
molecular weights and degree of polydispersity of proteins. Using these new tools for biochemistry,
Mandels and Reese293 suggested a protein level concept for cellulase action, called “C1−Cx”
(Figure 8). Enabled by this newly acquired ability to study the action of relatively homogeneous
enzyme preparations and measure their respective concentrations accurately, they described in this
report the possible role of C1 enzymes, required for attack on crystalline cellulose, and the more
prevalent Cx enzymes, needed for hydrolysis of soluble, derivatized cellulose such as carboxymethyl
cellulose (CMC). Characterization of the hypothetical C1 enzyme lagged behind progress on the Cx
enzymes for many years. Putative Cx enzymes were identified as endo-(1−4)-β-glucanases,
exo-(1−4)-β-glucanases, and β-glucosidases.296 During the 1960s, it was commonly thought that the C1
and Cx enzymes worked in a synergistic way to hydrolyze crystalline cellulose, although no molecular
detail was available. One proposal during this time was that C1 was a protein that decrystallized
cellulose by displacing native hydrogen bonds in the microfibril, leading to a more available
structure for Cx.297 In 1969, Eriksson298 is credited with first proposing that an exoglucanase
could be playing the role of C1, a view supported three years later in a report by Halliwell and
co-workers.299 The landmark work reported by Berghem and Pettersson in 1973300 clearly showed that
an exo-cellulase, purified from T. viride, was highly active on crystalline cellulose and the best
candidate for C1. This enzyme, later classified as “cellulose 1,4-β-cellobiosidase (nonreducing end)
or CBH I”, EC 3.2.1.91, is today reclassified as EC 3.2.1.176 “cellulose 1,4-β-cellobiosidase
(reducing end)”.151 The picture emerging from work done by the late 1990s is depicted in Figure 9.
In this view, EGs attack amorphous cellulose surface regions of the microfibril, revealing new
cleavage sites for exoglucanases. Exoglucanases also attack the available free cellulose ends, with
β-glucosidases converting cellobiose to glucose. Although a second exoglucanase from T. reesei had
been reported for some time, CBH I (Cel7A) and CBH II (Cel6A) were shown to be functionally distinct
by van Tilbeurgh and co-workers301 and to correspond to distinct cellulose digestion morphologies
determined microscopically.302 The same year, Wood proposed a model depicting the cellulose
stereochemical specificity of these enzymes, called CBH (B′) and CBH (B), a concept not discussed
much today. In 1984, van Tilbeurgh and co-workers were first to describe the systematic multistep
chromatographic purification of all key cellulases from T. reesei.303

Figure 8. Diagram of first concept for the roles and specificities of enzymes hydrolyzing cellulose,
known as the C1−Cx model. Adapted with permission from ref 293. Copyright 1964 Society for
Industrial Microbiology and 1999 Springer Science and Business Media.

Figure 9. This cartoon represents the classical endo/exo model of cellulase enzyme action in
T. reesei and many other cellulolytic fungi. Here, the dominant EG, EG I, or Cel7B acts only on
amorphous cellulose and functions as the Cx activity described by Reese decades before.

A curious observation from this time was that the optimum synergistic ratio of cellulases purified
from T. reesei was CBH I:EG I (1:1) and CBH II:EG II (95:1).305 For the latter case, these results
are consistent with the view that the EG creates new chain ends for the exoglucanase in an almost
catalytic role. However, for the GH7 enzymes (CBH I and EG I, later known as Cel7A and Cel7B,
respectively), the 1:1 ratio suggests another mechanism. Wood306 proposed that the CBH I/EG I pair
functions with close physical coupling so that, immediately following an internal bond cleavage
event by EG I, CBH I is able to occupy the newly revealed reducing chain end. This not only permits
rapid hydrolysis, but also reduces the possibility that the broken chain can “reanneal” into the
crystal surface. We are not aware that more recent work has confirmed or refuted this theory;
however, given the challenges now known for removing all traces of EG contamination from CBH I
preparations, caution is recommended.

Initially reported by Ståhlberg et al.304 and more recently by Kurašin and Väljamäe307 is a view of
CBH I mechanism that was not envisioned by the simple exo/endo model. Using cellulose reducing end
group analysis following digestion, Ståhlberg et al. suggested that T. reesei produces no true
exo-cellulases (Figure 10). The work by Kurašin and Väljamäe used a similar reducing end analysis
strategy to show that both TrCel7A and Phanerochaete chrysosporium Cel7D are also able to conduct
“endo-initiation” as well as the well-known exo-initiation leading to hydrolysis. It is worth noting
that, even with this “extra”
reactive feature, CBH I by itself cannot hydrolyze more than about 60% of a model cellulose
substrate, such as Avicel.

Figure 10. Concept of endo-initiating CBHs was introduced by Ståhlberg et al. in 1993.304 This new
role for both CBH I or Cel7A and CBH II or Cel6A somewhat supports the concept of C1; however, the
endo-initiation function may not be as dramatic as first envisioned by Reese.

Today, cellulase digestion mechanisms are once again viewed in a sense related to the original
concepts from Reese, i.e., C1 and Cx. However, some inconsistencies are apparent given the classical
view shown in Figure 9. Reese suggested that C1 was a decrystallizing protein factor whose function
may not be bond-breaking (at least, not hydrolytic), but which instead possesses an ability to swell
or disrupt cellulose, perhaps by intercalating between cellulose chains in the elementary
microfibril, the consequence of which is action by other enzymes causing complete digestion to
cellobiose. The view shown in Figure 9 actually suggests that Cx is not a single class of enzymes,
but the combined action of EGs and exoglucanases that work together to expose the necessary ends of
cellulose chains needed to yield simple sugars. Neither the EGs nor the exoglucanases from fungi
cause global decrystallization of cellulose. This hydrolytic enzyme system functions in an ablative
manner, peeling one layer at a time. The essential dilemma here is that no single protein is known
to dramatically reduce cellulose crystallinity. Some proteins may cause some local cellulose
disruption and morphological changes, for example, the expansins308 and expansin-like proteins
(swollenins),309 but their reported effects on subsequent hydrolytic action are modest.310

So, are there protein factors that truly reflect the notions of Reese four decades ago? The
relatively recent work to define the action of LPMOs has again posed some new potential answers to
this question. LPMOs were originally classified as fungal GH61 enzymes and nonfungal members of
family 33 CBM, but have been reclassified,152 and are now implicated in the oxidative cleavage of
cellulose and other plant polymers. As reviewed extensively in section 11, LPMOs are known to act in
two reaction schemes on crystalline cellulose, generating oxidized and nonoxidized chain ends
(Figure 11). Some LPMOs have been shown to oxidize glucose at position C1, releasing lactones that
are hydrolyzed to aldonic acids,311 whereas other enzymes act on the nonreducing end, producing
ketoaldoses, or a combination thereof.312 The copper oxidases seem to attack the highly crystalline
regions of cellulose in contrast to hydrolases. In this regard, LPMOs may act like Reese’s C1
enzymes.313

Figure 11. Current view of cellulose degradation by many filamentous fungi, combining both
hydrolytic and oxidative fragmentation reactions. Note that the action of LPMO may provide a new
chain end for CBH I (or Cel7A), even though it is oxidized.

4.3.2. GHs and Related Enzymes. Today, the secretome of T. reesei is fairly well understood, thanks
to decades of traditional biochemistry and the recent assembling of much of the genome, i.e., 34 Mbp
of nearly contiguous sequence comprising 9129 predicted genes.41 T. reesei QM6a is known to produce
at least 193 GHs, 93 glycosyl transferases, 5 polysaccharide lyases, 17 carbohydrate esterases, and
41 CBMs.314 It has been pointed out that the outstanding performance of T. reesei on biomass is
somewhat surprising, considering that other fungi have a greater diversity of critical cellulase
components. For example, T. virens, Aspergillus nidulans, and Postia placenta have 256, 251, and 248
GHs, respectively.268 The CAZy database lists two CBHs, six EGs, two LPMOs, and seven
β-D-glucosidases for QM6a. Furthermore, as illustrated by Martinez and co-workers,41 T. reesei has
an extremely limited set of the enzymes needed to degrade plant cell walls (accessory enzymes,
hemicellulases, acetyl esterases) and especially living walls (pectinases). Seiboth and
co-workers268 concluded that T. reesei’s success in the biosphere must stem from its efficient
cellulase induction system and extremely high cellulase production and secretion capability.

Some caveats to this cocktail paradigm deserve attention. First, it has been recently suggested193
that some cellulases should be categorized as “processive EGs” for structural and functional
reasons. Structurally, some cellulases have binding sites that are intermediate between the closed
tunnels of CBHs and the open clefts of EGs. Functionally, some enzymes have intermediate levels of
processivity, depending on the definition of processivity used.315 Also, as discussed above, for
many years, enzymes now referred to as “cellobiohydrolases” were called “exoglucanases” because of
their assumed tendency to begin their processive runs at a cellulose chain end (thus, “exo”).
However, the term exoglucanase is now essentially obsolete as a class of GH given the early and now
more recently confirmed revelation that GH7 CBHs can perform hydrolysis in an endo-initiation
fashion.304,307 In addition, the more recently discovered class of enzymes known as LPMOs (formerly
GH61 or family 33 CBM (CBM33)) have been recognized as a vital part of an efficient cellulase
cocktail. These enzymes are not GHs, even though they were once listed in GH or CBM families. Given
these nuanced (and at times misleading) enzymatic classifications, it may be more helpful and
appropriate to think in terms of “modes of action” rather than types of cellulases.

As suggested above, T. reesei secretes a multiplicity of the key cellulose degrading enzymes,
especially the β-glucosidases, EGs, and CBHs. The enzymes shown in Table 2 are those cited
1324 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Table 2. T. reesei QM9414 and QM6a Cellulases Reported in CAZy151−153

classical name | CAZy name | common name | product configuration | reported by | substrate specificity | ref
Bgl2, EC3.2.1.21 | GH1 Cel1A | β-glucosidase 2 | retaining | Takashima et al., 1999 (ref 317) | p-nitrophenyl-β-D-glucoside, p-nitrophenyl-β-D-cellobioside (pNPC), methylumbelliferyl-β-D-glucoside, 5-bromo-4-chloro-3-indolyl-β-D-glucoside | Takashima et al., 1999 (ref 317); Saloheimo et al., 2002 (ref 318)
EC3.2.1.21 | GH1 Cel1B | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319)
Bgl1, EC3.2.1.21 | GH3 Cel3A | β-glucosidase 1 | retaining | Mach, 1993 (ref 320) | Glc2/Glc3/Glc4/Glc5/Glc6, gentiobiose, laminaribiose, laminaritriose, sophorose, 2-chloro-4-nitrophenyl-β-D-glucopyranoside, p-nitrophenyl-β-D-glucopyranoside, CMC, laminarin, β-glucan | Korotkova et al., 2009 (ref 321); Karkehabadi et al., 2014 (ref 322)
EC3.2.1.21 | GH3 Cel3B | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319)
EC3.2.1.21 | GH3 Cel3C | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319)
EC3.2.1.21 | GH3 Cel3D | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319)
EC3.2.1.21 | GH3 Cel3E | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319)
EGII (formerly EGIII), EC3.2.1.4 | GH5 Cel5A | endoglucanase II | retaining | Saloheimo et al., 1988 (ref 323) | CMC-Na, Avicel, ball-milled cellulose, PASC | Qin et al., 2008 (ref 324)
EC3.2.1.4 | GH5 Cel5B | endoglucanase | retaining | Foreman et al., 2003 (ref 319) | predicted endoglucanase activity | Foreman et al., 2003 (ref 319)
CBHII, EC3.2.1.91 | GH6 Cel6A | cellobiohydrolase II | inverting | Teeri et al., 1987 (ref 325) | Avicel, CMC, Glc3/Glc4/Glc5/Glc6, PASC | Poidevin et al., 2013 (ref 326); Ståhlberg et al., 1993 (ref 304)
CBHI, EC3.2.1.176, EC3.2.1.91 | GH7 Cel7A | cellobiohydrolase I | retaining | Shoemaker et al., 1983 (ref 281) | 4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenol-β-D-lactoside, 3,4-dinitrophenyl-β-D-cellobioside, 3,4-dinitrophenyl-β-D-lactoside, BMCC | Boer and Koivula, 2003 (ref 327); Becker et al., 2001 (ref 328)
EGI, EC3.2.1.4 | GH7 Cel7B | endoglucanase I | retaining | Penttilä et al., 1986 (ref 329) | Glc3/Glc4/Glc5/Glc6, pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan, barley β-glucan, hydroxyethylcellulose | Van Arsdell et al., 1987 (ref 330); Biely et al., 1991 (ref 331); Bailey et al., 1999 (ref 286); Vlasenko et al., 2010 (ref 332)
EGIII, EC3.2.1.151, EC3.2.1.4 | GH12 Cel12A | endoglucanase III | retaining | Fowler et al., 2001 (ref 333) | CMC, PASC, Avicel, Glc4/Glc5, barley β-glucan, glucomannan, filter paper | Karlsson et al., 2002 (ref 334)
EGV, EC3.2.1.4 | GH45 Cel45A | endoglucanase V | inverting | Saloheimo et al., 1994 (ref 335) | CMC, PASC, Avicel, Glc3/Glc4/Glc5, barley β-glucan, glucomannan, filter paper | Karlsson et al., 2002 (ref 334)
Egl6, EC3.2.1.151 | GH74 Cel74A | endoglucanase and xyloglucanase | inverting | Foreman et al., 2003 (ref 319) | xyloglucan, hydroxyethylcellulose | Benkő et al., 2008 (ref 336)
EG7 | AA9 (formerly GH61) Cel61B | LPMO | oxidative | Foreman et al., 2003 (ref 319) | cellulose | Karkehabadi et al., 2008 (ref 337)

currently in CAZy. Some families, for example GHs 6 and 7,
contain enzymes with vastly different and mechanistically
synergistic activities (as described in sections 6 and 7). Other
families, GHs 1 and 3, contain many enzymes with essentially
the same activity (cleavage of the glycosidic bond in cellobiose);
upon closer inspection, however, some of these enzymes have
subtle differences in substrate specificity. Family GH7 is
somewhat enigmatic in that it contains both a CBH (Cel7A)
and an EG (Cel7B) based on the same protein fold (folded β
sheet sandwich). In this case, the EG homologue has substrate
tunnel associated peptide loops of shorter length than its CBH
counterpart (see section 6). We also note that the EGs found in
GH families 5, 7, and 12 act on cellulose to leave the terminal
hydroxyl in the retaining configuration. Family GH45 EGs leave
the terminal hydroxyl in the inverted configuration.

For more comprehensive reviews of the early literature on T.
reesei enzymology and strain development, we refer the readers
to several major reviews.32,316 The following sections primarily
focus on the developments in fungal cellulases following the
initial structural determinations of each enzyme family (and
CBMs) toward understanding structure−function relationships.

5. CARBOHYDRATE-BINDING MODULES AND LINKERS

Biomass-degrading enzymes work at solid−liquid interfaces, and
the concentration of catalytic units at the surface is directly
related to the extent of substrate turnover. Thus, many enzymes
that work on polysaccharides are multimodular with catalytic
function accomplished by a single or multiple CDs coupled to a
binding function via one or more CBMs; these two domains are
connected together by linker peptides of varying length and
structure. To date, 69 distinct families of CBMs have been
discovered and characterized according to the CAZy database
(family 33 CBMs have been reclassified as oxidative enzymes, as
discussed below).151−153,156 As many biomass-degrading fungi
commonly employ family 1 CBMs for plant cell wall
degradation, we review the history and developments in
structure−function studies primarily of this particular family of
CBMs in this section. Moreover, CBMs are connected to CDs
via linkers, and thus, the roles of these domains are also
reviewed. We note that the study of CBMs is vast, and here we
primarily focus on family 1 CBMs. We discuss other CBMs
mainly when findings are relevant to our collective understanding
of general CBM behavior. For more general
perspectives on CBMs, readers are referred to several reviews
from the past decade.156,338−342

5.1. Family 1 Carbohydrate-Binding Modules

That many GHs contain both binding and catalytic function was
first reported in a study by Van Tilbeurgh et al.343 Therein,
papain was used to proteolytically cleave the TrCel7A enzyme
into two primary fragments: a 56 kDa domain that lost catalytic
activity on cellulose and retained full activity on small molecule
substrates (the CD), and a 10 kDa domain identified as the
glycosylated C-terminal domain (the CBM-linker).343 The
authors went on to show that the binding affinity of the 56
kDa domain to crystalline cellulose is significantly reduced
without the C-terminal domain. On the basis of these results,
the authors proposed that TrCel7A is a multimodular protein
with a binding and catalytic function.343 Following this
important discovery, Tomme et al. further characterized the
proteolysis products of TrCel7A and TrCel6A from papain
cleavage.344 They similarly demonstrated that both enzymes are
cleaved into small, glycosylated domains at the C- and N-termini
of Cel7A and Cel6A, respectively, both of which are bound with
high affinity to crystalline cellulose. Both “core” domains
isolated from the two cellulases retained activity against small
molecule substrates. This study,344 along with similar work the
same year on two bacterial cellulases from Cellulomonas fimi,345
solidified the concept that cellulases can exhibit multimodular
structures with CBDs. Upon the discovery of additional
carbohydrate-binding ligands beyond cellulose, the term CBD
was replaced with the broader concept of CBMs.156

Kraulis et al. solved the first structure of a family 1 CBM using
NMR spectroscopy in 1989, which is shown in Figure 12.346
The 36-residue TrCel7A CBM was prepared via solid-state
peptide synthesis. The structure revealed an irregular, triple-stranded,
antiparallel β-sheet arrangement with an amphiphilic
character with a large, hydrophilic flat face exhibiting three
tyrosine residues and a large number of polar amino acids, and a
hydrophobic face on the “wedge” portion of the CBM. The
CBM sequence contains four cysteine residues comprising two
disulfide bonds, and all three possible combinations were
investigated to determine the most likely pairing. The disulfide
bonds are also shown in Figure 12. Perhaps the most cited
observation from this original work is the presence of three
conserved aromatic residues on the flat, hydrophilic face of the
CBM (Tyr5, Tyr31, and Tyr32 in the TrCel7A CBM). As
discussed below in detail, this flat face is implicated as the
binding face to crystalline cellulose.

Figure 12. TrCel7A family 1 CBM structure solved by Kraulis et al.346
Several residues of interest are highlighted including Tyr5, Asn29,
Tyr31, Tyr32, Gln34 on the hydrophilic, flat face of the CBM and the
two disulfide bonds in the protein. The structure forms an irregular,
triple-stranded β-sheet core. (B) Sequence alignment of the TrCel7A,
TrCel7B, and TrCel6A CBMs. The figure was generated with ESPript
(http://espript.ibcp.fr).347

The family 1 CBM structure was followed by the structures of
many more CBMs from various families. By 2004, a paradigm
had emerged for three types of CBMs, which were categorized
as type A, B, and C CBMs in a seminal review paper from
Boraston and co-workers in 2004.156 Type A CBMs, which
include family 1, are characterized by flat faces that bind to
crystalline cellulose or chitin. Type B CBMs exhibit extended
grooves or clefts for binding single sugar chains, such as those
found in hemicellulose, pectin, or amorphous regions of
polymers such as cellulose or chitin. Lastly, type C CBMs are

characterized as those that bind mono-, di-, and trisaccharides.
In 2004, three functions were attributed to CBMs: substrate
targeting, enhancements to enzyme−substrate proximity, and
disruption of the substrate.156 Boraston and co-workers
published a recent revision of the A−B−C CBM paradigm
wherein type B CBMs are classified as those that bind in endo-mode
and type C CBMs bind in exo-mode.338 Type A CBMs
remain classified as previously.

The publication of the landmark structural study from
Kraulis et al.346 enabled a very large body of work
investigating the function of family 1 CBMs.348−364
With the knowledge of a multidomain CBH structure, Ståhlberg
and co-workers proposed an initial model for CBH action in
1991 using TrCel7A as a model enzyme, which formed much of
the basis for our collective model of CBH action.348 By
examining the binding capacity of the intact TrCel7A, the CD
alone, and the CBM-linker alone, it was proposed that the CBM
is responsible for targeting the intact enzyme to crystalline
regions, thereby increasing the concentration of the CD at the
cellulose surface. The CBM was also proposed to enable
two-dimensional diffusion of the CBH enzyme on the cellulose
surface to enable efficient catalytic performance.348
Two-dimensional diffusion of a type A family 2 CBM from C. fimi
was later measured using fluorescence recovery after
photobleaching experiments, with an estimated diffusion rate of
2 × 10^−11 to 1.2 × 10^−10 cm^2/s. To our knowledge, similar
experiments have not yet been conducted for family 1 CBMs.

Reinikainen et al. expressed several variants of the full-length
TrCel7A in yeast, focusing on mutations on the flat face
(Y492A, Y492D, and Y492H; this is Tyr31 in the CBM-only
sequence numbering) and on the hydrophobic wedge face
(P477R).349 As will be discussed in more detail in the GH7
section below (section 6), this was one of the first studies to
note that heterologous expression of fungal cellulases leads to
lower activity than enzymes expressed natively, in this case by
more than a factor of 2 in activity against crystalline cellulose.
This observation was attributed to a higher extent of
glycosylation from yeast expression as observed by a large
heterogeneous population of enzymes on a gel. Within the
heterologously expressed enzymes, the authors observed that
wild-type Cel7A and the Y492H mutant exhibit roughly
equivalent activity, whereas Y492A and Y492D result in a
significant decrease in activity against crystalline cellulose
accompanied by a concomitant decrease in binding affinity, as
measured at low temperature. The P477R mutation also
resulted in similarly low activity and binding affinity as
Y492A; however, as the authors note, this could be due to
structural changes to the CBM given the removal of a proline
residue. This was not clearly resolved until a later study from
Reinikainen et al., wherein Y492A, Y492H, and P477R were all
produced in the native organism with the native cbh1 gene
knocked out.350 Interestingly, the P477R mutant bound
similarly to the wild-type Cel7A enzyme, contrary to the same
enzyme produced heterologously. Both the Tyr492 mutants
bound to a lower extent than the wild-type enzyme, by
approximately 60%. At a concentration of 1 M MgSO4, all
mutants performed similarly to the wild-type enzyme on
crystalline cellulose, which the authors state suggests that the
hydrophobic effect plays a significant role in CBM binding to
the substrate. The similarity of the P477R results to the
wild-type enzyme and the reduction in affinity and activity with
mutations to Tyr492 led the authors to suggest that the flat,
aromatic-containing face of the CBM is likely the binding face to
crystalline cellulose.350

Around the same time, Linder and co-workers published a
series of comprehensive studies wherein the TrCel7A CBM was
investigated in great detail with six point-mutant CBMs
(predominantly alanine substitutions)
produced via solid-state peptide synthesis and examined with
binding affinity measurements and 2-D NMR spectroscopy,
including Y5A, P16R, N29A, Y31A, Y32A, and Q34A (Table
3).352 The individual point mutants of Tyr5 and Tyr32 both
completely lost affinity to cellulose. P16R was the least-affected
mutant, similar to results from Reinikainen et al.349,350 wherein
this amino acid position had the least effect on Cel7A activity.
The 2-D NMR spectroscopy suggests that the Y5A mutation
affects the overall compactness of the structure, which was later
verified by Mattinen et al. by solving NMR structures of this
CBM.356 In all other cases, the 2-D NMR spectra suggest
only minimal changes to the CBM structures upon single-point
mutations, which was later verified structurally for the Y31A and
Y32A CBMs.356 Additional NMR spectroscopy of the same
CBMs was conducted with cellohexaose present in solution by
Mattinen et al.357 From this study, the authors suggested that
the flat putative binding face of the family 1 CBM aligns in
parallel with cellohexaose, thus potentially serving as a model for
how the CBM binds to crystalline cellulose.357

Table 3. Free Energy of Binding for TrCel7A CBM Mutants
Relative to Wild-Type352

CBMs compared (a) | ΔΔG (kJ/mol)
Cel7A wild-type → Cel7A P16R | 1.2
Cel7A wild-type → Cel7A N29A | 2.4
Cel7A wild-type → Cel7A Q34A | 4.9
Cel7A wild-type → Cel7A Y31A | 7.3
(a) Values for Y5A and Y32A, which lost their affinity to cellulose, were
not determined.

An examination of the differences between the CBMs from
TrCel7A and TrCel7B closely followed these initial family 1
CBM structure−function studies.351,358,359 When comparing the
CBM binding affinity alone, it was observed that the CBMs from
these two enzymes exhibit significantly different binding
affinities, with the Cel7B CBM possessing a greater affinity to
crystalline cellulose. Starting with the Cel7A CBM sequence, the
authors made multiple mutations, and found that the Y5W
mutation alone can explain much of the difference in binding
affinity between the two CBMs351 (Table 4). A subsequent
study from Srisodsuk et al. compared the binding affinity of a
hybrid Cel7A CD and linker with the Cel7B CBM, which
demonstrated slightly higher activity on BMCC.358 This study
was closely followed by the solution structure of the Cel7B
CBM.359 The structure is quite similar to the TrCel7A structure
with the primary exceptions of an additional disulfide bond
between Cys2 and Cys18, and a tryptophan residue at the
7-position in place of Tyr5. The alignment of aromatic residues on
the flat, hydrophilic surface of the CBM is quite similar.

Table 4. Relative Free Energies of Binding for the CBMs of
TrCel7A, TrCel7A Y5W, and TrCel7B351

CBMs compared | ΔΔG (kJ/mol)
Cel7A wild-type → Cel7A Y5W | −1.1
Cel7A Y5W → Cel7B wild-type | −1.4
Cel7A wild-type → Cel7B wild-type | −2.4
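
The relative binding free energies in Tables 3 and 4 can be read as fold changes in the binding constant through the standard relation K_variant/K_reference = exp(−ΔΔG/RT). The sketch below is only illustrative: it assumes that ΔΔG is defined as the binding free energy of the variant minus that of the reference (so a positive value means weaker binding), and it assumes a temperature of 298.15 K, which is not stated with the tables. The function and dictionary names are hypothetical.

```python
import math

# Gas constant in kJ/(mol*K).
R_KJ_PER_MOL_K = 8.314462618e-3


def affinity_fold_change(ddg_kj_per_mol, temperature_k=298.15):
    """Return K_variant / K_reference for a relative binding free energy.

    Assumption: ddG = dG_bind(variant) - dG_bind(reference), so a positive
    value means weaker binding. The default temperature is an illustrative
    choice, not a value taken from the cited studies.
    """
    return math.exp(-ddg_kj_per_mol / (R_KJ_PER_MOL_K * temperature_k))


# ddG values in kJ/mol, transcribed from Table 3 (relative to wild-type Cel7A CBM).
TABLE_3 = {"P16R": 1.2, "N29A": 2.4, "Q34A": 4.9, "Y31A": 7.3}

# ddG values in kJ/mol, transcribed from Table 4 (relative to wild-type Cel7A CBM).
TABLE_4 = {"Cel7A Y5W": -1.1, "Cel7B wild-type": -2.4}

if __name__ == "__main__":
    for name, ddg in {**TABLE_3, **TABLE_4}.items():
        ratio = affinity_fold_change(ddg)
        print(f"{name:>16}: ddG = {ddg:+.1f} kJ/mol, K/K_ref = {ratio:.2f}")
```

Under these assumptions, the 7.3 kJ/mol penalty for Y31A corresponds to a binding constant roughly 20-fold lower than wild type, whereas the −2.4 kJ/mol difference for the Cel7B CBM corresponds to a two- to threefold higher binding constant.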


Srisodsuk et al. also noted that the activity of the Cel7B-CBM-containing
chimeric enzyme with the Cel7A CD exhibited
higher activity.358 In a similar vein, Takashima et al. later
conducted an exhaustive study of CBM binding affinity
correlated to GH7 CBH activity using the H. grisea GH7
CBH (Figure 13).365 Therein, they demonstrated that
tryptophan residues in the “outer” positions on the CBM
(corresponding to Tyr5 and Tyr31 in the TrCel7A CBM)
yielded the highest binding affinity measured for the H. grisea
Cel7A CBM, and this correlated to the highest measured activity
of the full-length enzyme. The authors also found a nearly linear
correlation between activity on Avicel and CBM binding affinity,
highlighting the key relationship between CBMs and cellulolytic
performance.365

Figure 13. Correlation between the relative affinity constant (x-axis)
and activity on Avicel (y-axis) of each CBM mutant in the study.
Reprinted with permission from ref 365. Copyright 2007 Elsevier.

Linder and Teeri also examined both the binding and
reversibility of the TrCel7A CBM on crystalline cellulose using a
sensitive tritium-labeling approach combined with dilution
experiments.353 Therein, they produced the CBM in E. coli as
a chimera with the TrCel6A CBM-TrCel7A linker-TrCel7A
CBM construct, described in detail in another study.354 The
resulting protein was cleaved to isolate the Cel7A CBM and
included 11 residues from the linker domain. The authors
demonstrated that the binding of the CBM was temperature
dependent, with higher binding at lower temperatures (the
authors tested 4, 22, and 30 °C) (Figure 14). More importantly,
they demonstrated that the CBM binding was completely
reversible, addressing a question in the literature regarding if
CBMs could desorb from cellulose. The authors conclude that
the rate of adsorption and desorption of CBMs and the binding
affinity to cellulose should be optimally balanced to maximize
cellulase activity and minimize nonproductive binding.353

Figure 14. T. reesei CBM binding temperature dependence: 3 h at 4 °C
(●), 18 h at 4 °C (○), 3 h at 22 °C (×), 18 h at 22 °C (■), 43 h at 22
°C (▲), 3 h at 50 °C (□), and 18 h at 50 °C (Δ). ▼ corresponds to
CBHI CBD, and ▽ to CBHII CBD, both at 18 h at 22 °C. Reprinted
with permission from ref 354. Copyright 1996 American Society for
Biochemistry and Molecular Biology.

Interestingly, however, a subsequent study from Carrard and
Linder demonstrated that the TrCel6A CBM could not be
desorbed from cellulose over a temperature range from 4 to 50
°C, even 8 days after dilution, in stark contrast to the similar
TrCel7A CBM.360 To ascertain the differences, the authors
examined two key differences between the two CBMs: namely,
Tyr5 in the Cel7A CBM is a tryptophan in the Cel6A CBM, and
the latter has an extra disulfide bond. The W7Y mutation in the
Cel6A CBM led to a dramatic loss in binding affinity relative to
the wild-type Cel7A CBM. The removal of the third disulfide
bond in the Cel6A CBM resulted in a decrease in binding
affinity and off-rate, although not as drastic as the W7Y
mutation, suggesting that the Cel6A CBM rigidity contributes to
the CBM-cellulose interaction and that the presence of the
tryptophan at the 7-position significantly affects the binding
affinity.360 In a separate study, the same authors demonstrated
that the native TrCel7A CBM binding affinity was not
significantly affected by pH.361

More recently, Guo and Catchmark conducted a thorough
study using isothermal titration calorimetry and adsorption
isotherms to compare the binding characteristics of TrCel7A
and TrCel6A CBMs expressed in E. coli on crystalline cellulose,
Avicel, PASC, and small cellodextrin chains.364 To enable
quantitative comparison of binding affinity across various
cellulose substrates, the authors first developed a methodology
to quantify the available surface area for CBM binding based on
nitrogen adsorption and static light scattering. Similar to
previous conclusions from Carrard and Linder,360 the authors
demonstrated that the binding affinity of the TrCel6A CBM is
significantly higher, specifically by an order of magnitude relative
to the TrCel7A CBM (10^6 M^−1 versus 10^5 M^−1, respectively).364
The application of isothermal titration calorimetry enabled the
delineation between enthalpic and entropic contributions to
binding affinity for the first time for family 1 CBMs, which
demonstrated that the binding affinity to the cellulose surface
was enthalpically favorable and entropically unfavorable. This
finding is in direct contrast to an influential, earlier study

wherein isothermal titration calorimetry experiments on a family
2 CBM (also a type A CBM) suggested that type A CBM
binding is entropically driven.366 Interestingly, Guo and
Catchmark also examined the binding specificity of both CBMs
to NaBH4-treated cellulose microfibrils, a treatment that modifies
cellulose chain reducing ends, finding that the Cel6A CBM exhibits much
lower binding affinity.364 They attribute this difference to the
fact that the Cel6A CBM may recognize the reducing ends of
cellulose chains, whereas significantly less difference in binding
was observed for the Cel7A CBM. The authors speculate that
the potential preference in the Cel6A CBM binding may lead
the intact enzyme to bind near reducing ends of cellulose
chains.364

In 2003, the binding-face specificity of two families of type A
CBMs, including a large library of family 1 CBMs and a family 3
CBM from C. thermocellum, to cellulose was reported.362 Lehtiö
et al. fused CBMs of interest to a modified staphylococcal
protein A, which was then coupled to an immuno-gold label. By
binding the CBMs to very large cellulose Iα microfibrils from
Valonia, the authors could then visualize the specific cellulose
binding face for the CBMs directly with TEM and diffraction
measurements. It was shown that in all cases the CBMs bound
to the 110 face of cellulose Iα (Figure 15), which is the
hydrophobic face (the hydrophobic face in cellulose Iβ is 100).
This finding shed light directly on the type A CBM-cellulose
interaction by ascertaining that the protein−ligand binding
surface is the flat surface with the glucopyranose rings directly
exposed. It should be noted that this type of face only exists in
the natural cellulose I polymorphs with multiple chains
exhibiting exposed faces; cellulose II and III do not exhibit
“flat” faces with multiple parallel chain faces exposed.88−90,93,94
The authors also measured the binding affinity of multiple
CBMs and noted that the addition of two tryptophan residues to
the TrCel7A CBM does not increase the binding affinity beyond
the effect of a single Y-to-W mutation.362 Unsurprisingly, later
imaging work with intact TrCel7A also suggested that the whole
enzyme binds to and attacks cellulose on the hydrophobic
surface, likely mediated by the CBM.367

Figure 15. Crystal structure faces of the cellulose Iα polymorph. The
circle indicates the 110 face, exposed in worn crystals, which is the
CBM binding site. Reprinted with permission from ref 362. Copyright
2003 National Academy of Sciences.

More recently, Sugimoto et al. examined the family 1 CBM
binding behavior to cellulose in more quantitative detail.363 The
authors fused a fluorescent protein with the TrCel7A CBM and
measured binding isotherms to various crystalline and
amorphous cellulose substrates. The authors fit their binding
isotherms to four binding models, and found that the Hill
binding model provided the best fit for CBM-binding to
cellulose, with the Langmuir isotherm model also providing a
reasonable fit (correlation coefficients of 0.9995 and 0.9915,
respectively) at 5 °C on Cladophora cellulose. The authors
attribute the goodness-of-fit of the Hill model to the explicit
treatment of a steric exclusion effect induced by surface
crowding of the CBM-fluorescent protein complex and infer
that the length and flexibility of the linker domain will dictate
the size of the exclusion area.363

Figure 16. (A) Glycosylated TrCel7A CBM on the hydrophobic
surface of cellulose, as studied by Taylor et al.359 (B) LOGO
representation372 of a multiple sequence alignment of family 1 CBMs
suggests that multiple putative O-glycosylation sites may exist on family
1 CBMs.

To date, nearly all studies of family 1 CBMs have used either
solid-state peptide synthesis or E. coli expression, neither of
which imparts glycosylation to the protein. However, detailed
mass spectrometry studies from Harrison et al. in 1998 showed
that the TrCel7A family 1 CBM exhibits glycosylation at the
Thr1 and Ser3 positions, as illustrated in Figure 16.368 Mass
spectrometry analysis was not conducted beyond Tyr5. Given
the importance of glycosylation in cellulase activity,369 as
discussed in more detail below in several GH sections, and the
sequence conservation of putative O-glycan sites on family 1
CBMs (Figure 16),370 fungi may employ glycans on CBMs for
multiple functions. In terms of cellulose binding affinity, Taylor
et al. employed thermodynamic cycle calculations with
molecular dynamics (MD) simulations to examine how
O-glycans, especially at Ser3 and a conserved site at Ser14, are able
to modify the binding affinity over the nonglycosylated

variant.371 The predictive capability of these simulations was
first tested by comparing to previous experimental data for
amino acid substitutions. The thermodynamic cycle calculations
were in quantitative agreement, for example, with the Y5W
mutation. From there, the authors predicted that a single
mannose at Ser3 could modify the binding affinity by the same
order as a tryptophan mutation. Moreover, the simulations
predicted that the addition of mannosylation at different sites
would modulate the binding affinity by a significantly different
magnitude, suggesting that the location and extent of
glycosylation has a major impact on the change in properties.371

The predictions that glycosylation affects CBM binding
affinities were recently tested experimentally.373 Chen et al.
developed a solid-state glycopeptide synthesis approach to
rapidly produce single glycoforms of the TrCel7A CBM. Two
sets of glycoforms (for a total of 20 CBMs) were synthesized:
the first set focused on addition of glycan motifs to single amino
acids, namely mono-, di-, and trimannose groups at Thr1, Ser3,
and Ser14, whereas the second set was designed to study
the effects of adding glycans to multiple amino acid residues
simultaneously, as would be found naturally.368 For each
glycoform, the thermal stability, resistance to thermolysin
cleavage, and the binding affinity to BMCC were tested. For
the addition of glycans to single sites, Chen et al. discovered that
Ser3 has the largest proteolysis-stabilizing effect (by up to a
factor of 10), increase in thermal stability (up to 12 °C), and
binding affinity increase to BMCC (up to a factor of 4).373
Addition of glycans at Thr1 does not demonstrate significant
differences in proteolytic and thermal stability, but a
disaccharide at Thr1 is able to increase the binding affinity
significantly to BMCC. Addition of glycans at Ser14 essentially
improved all properties, but not to the extent of Ser3. For the
addition of glycans to multiple sites, single mannosylation at
each site was able to increase the binding affinity to BMCC by
7.4-fold, well over the increases demonstrated via amino acid
substitutions.351,352 An increase of up to 50-fold in thermolysin
resistance was demonstrated with a concomitant increase in
thermal stability of up to 16 °C for several variants. From a
biological perspective, glycan-bearing residues are highly
conserved in the 1, 3, and 14 positions on family 1 CBMs,
and given the ability to affect multiple, beneficial properties, it is
likely that CBM glycosylation is employed commonly by fungi.
Given the prevalence of glycosylation in fungal cellulases, this
work demonstrates that the study of CBMs should explicitly
consider the effects of glycosylation to measure physiologically
relevant properties.373

Another interesting question related to CBMs is the ability for
substrate disruption, as discussed by Boraston et al.156 Din et al.
published an early study on CBM substrate disruption with the
family 2 CBM from the C. fimi EG, CenA.374 Therein, they
observed that application of the CBM-linker domain on ramie
cotton fibers resulted in surface roughening. The authors
suggested that this CBM-cellulose interaction was the result of
nonhydrolytic disruption of the substrate. However, it was not
demonstrated that the CBM-treated substrates were subsequently
more digestible by reducing-sugar assays nor were
cellulose crystallinity measurements conducted. Ståhlberg et al.
incubated the TrCel7A CBM-linker domain with Avicel, but did
not observe additional susceptibility of the resulting substrate to
digestion by intact Cel7A.348 Several subsequent studies have
reported observing substrate disruption. Gao et al. claimed that
the Penicillium janthinellum CBM-linker was able to synergize
with an EG from T. pseudokoningii on Avicel and cotton fibers,
but no data were included in their study to quantitatively
support this claim.375 Instead, the authors show scanning
electron microscopy (SEM) images taken before and after
treatment with the P. janthinellum CBM-linker domain. Xiao
et al.376 and Wang et al.377 report similar findings, also without
demonstrating synergistic cellulose depolymerization for the
CBM from a T. reesei EG and from a family 7 CBH from T.
pseudokoningii, respectively. These studies overall lack sufficient
detail to substantiate the interpretations of SEM and FTIR data.
More recently, Hall and co-workers reported an in-depth study
of “CBM pretreatment” of Avicel and fibrous cellulose (cotton
linters).378 Therein, they noted that a 15 h pretreatment at 42
°C followed by enzymatic hydrolysis at 50 °C was able to
slightly improve the digestibility of Avicel and was able to
slightly decrease the crystallinity as measured by the height of
the 200 peak in X-ray diffraction measurements. It is unknown
how pretreatment with CBMs affected the conversion at long
times or the final yield of reducing sugars from these
experiments. Despite continued claims of substrate disruption,379
the data for a disruption effect remain somewhat sparse,
and a full study of CBM disruption in terms of synergy with
cellulases at high substrate conversions coupled to detailed
substrate characterization does not yet exist, thus severely
limiting the ability to ascertain true “disruption” effects by
CBMs.

Many other CBM effects on fungal cellulase activity and
behavior beyond substrate disruption have been investigated.
Hall and co-workers compared the thermal stability of intact
TrCel7A, the CD alone, and the CBM-linker, the latter two
isolated from papain cleavage of the intact enzyme.380 They
found melting temperatures of 59, 51, and 66 °C, respectively,
suggesting that the CBM-linker is a significant stabilizer of the
intact enzyme. Voutilainen et al. produced chimeric enzymes of
the Thermoascus aurantiacus Cel7A (TaCel7A) CBH, which
natively lacks a CBM and linker, to either the TrCel7A or
Chaetomium thermophilum Cel7A CBM-linker.381 On Avicel at
high temperature, this resulted in a significant increase in
activity, suggesting that CBM-linker pairs from thermostable
cellulases confer disparate benefits in terms of thermal stability
to chimeric CBHs.

Voutilainen et al. also published a recent study wherein they
built chimeras with CBMs from families 1, 2, and 3 of the
thermostable Talaromyces emersonii Cel7A (TeCel7A) enzyme
and an engineered TeCel7A, both of which natively lack a
CBM.290 This study revealed that the CBM3-containing
enzymes bind approximately 20% more to Avicel over the
family 1 and family 2 CBM-containing enzymes at both 45 and
60 °C. Correspondingly, the family 3 CBM-containing cellulases
are able to solubilize a greater amount of Avicel in 24 h across a
range of temperatures from 45 to 65 °C. A similar study was
reported from Kim et al. wherein a large library of EGs was
expressed with varying CBMs.382 The authors found that the
activity of the EG increases generally with the addition of CBMs,
and that the thermal stability in some cases, even with the
addition of CBMs and linkers from mesophilic organisms, was
improved.

Although many cellulases have CBMs, there are many
examples of fungal and nonfungal cellulases that do not employ
them in natural biomass degradation. There are likely multiple
physiological and evolutionary reasons for the lack of CBMs in
some cases, such as biomass degradation in organisms that
densely pack solids into their digestive organs.383 Recently, an
elegant study from Várnai et al. shed light on one potential
1330 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Chemical Reviews Review

reason by examining the performance of a T. reesei enzyme
cocktail at high solids loadings.384 Therein, they demonstrated
that T. reesei cellulases with their CBMs and linkers removed are
able to achieve the same extents of conversion on both Avicel
and pretreated wheat straw at 20% solids loading. The most
likely reason for this observation is that, at high solids loadings,
the diffusion length for the enzymes to productively bind to
substrate is much shorter than in low solids loading digestions,
thus precluding the need for CBMs to ensure that the
concentration of active enzymes at the substrate interface is
high. For natural systems, high dry-matter content may also
preclude the need for CBMs, but a systematic correlation
coupled to other environmental and evolutionary considerations
has not been conducted to our knowledge. For commercial
biomass conversion, the work from Várnai et al. suggests that
additional research should be conducted to understand the need
or lack thereof for CBMs in cellulolytic enzymes in industrial
contexts, as this could lower the overall mass loading of enzymes
for biomass degradation to sugars.384

Given the relatively small size of family 1 CBMs coupled to
the inherent difficulty in directly studying molecular-level
interactions at the heterogeneous cellulose interface, family 1
CBMs have been the subject of computational studies since
1995 aimed at further elucidating cellulose−CBM interactions
at the molecular level.121,123,370,371,385−395 Nimlos et al.
conducted MD simulations of the TrCel7A CBM on the
hydrophobic face of cellulose and observed that Tyr13
undergoes a conformational change from being tucked into
the CBM structure to bind directly to the substrate. It is
noteworthy that this simulation was conducted with a previous
version of the CHARMM force field,396 and this behavior has
not been observed in subsequent simulations with more updated
potentials. The conservation of an aromatic residue in this
position (Figure 16), however, suggests that this aromatic
residue is important for a structural or functional reason, and
future experimental work will likely shed additional light on this
question. Multiple studies were later reported that suggest how
CBMs translate on the cellulose surface.121,123,370,392,394,395
Potential energy surfaces of a family 1 CBM on both a
coarse-grained and atomistic surface of cellulose Iβ revealed that the
Cel7A CBM displays potential energy wells every cellobiose unit
(roughly 1 nm) over the surface of cellulose (Figure 17). A
much more detailed study was later published by Nimlos et al.
describing multiple aspects of CBM behavior on cellulose. This
study demonstrated that there is a thermodynamic driving force
for the TrCel7A CBM to translate from hydrophilic surfaces to
the preferred hydrophobic face, that the CBM will translate
along the hydrophobic surface in both a forward and backward
direction with equal probability, and that the flat, hydrophilic
face of the CBM is the thermodynamically preferred face for
cellulose binding.

Figure 17. Simulation of a family 1 CBM on a cellulose Iβ surface was
used to calculate potential energy surfaces as shown. The minimum
energy wells, shown in dark blue and violet, represent preferential
locations for the CBM at the surface. These locations are spaced
approximately 1 nm apart, which is the length of a cellobiose unit. Reprinted
with permission from ref 370. Copyright 2010 American Chemical
Society.

Taken together, significant insights have been gained
regarding the function of family 1 CBMs, especially accelerated
and informed by the initial structural work.346 Yet, these studies
also illustrate that our collective understanding of CBM function
for activity and stability in the context of full-length,
multimodular enzymes and the corresponding potential for protein
engineering via CBMs remain limited. Moreover, as is clear from
this section as well, most of our collective understanding of
family 1 CBMs stems almost solely from the model T. reesei
system, despite the prevalence of family 1 CBMs in many other
fungi. Undoubtedly, there is significant potential for improving
cellulase properties via deeper understanding of CBM function
in the context of both cellulase performance and stabilization.

5.2. Linkers

In multimodular cellulases, linkers of various length and
sequence diversity connect CBMs to CDs. In some cases,
these linkers are very short, and the CBMs are thus in intimate
contact with the CDs.397,398 In fungi, however, linkers
characterized to date tend to be longer and exhibit glycosylation,
thus allowing greater separation between CBMs and CDs. Here,
we review the characterization of linkers and their putative
functions beyond domain connection. We include discussion of
nonfungal linkers where appropriate to understanding their
general function.

Figure 18. Overall “tadpole” topologies of the TrCel7A and TrCel6A
enzymes. The C- and N-termini are noted, highlighting the reverse
locations of their cores (CDs). The Cel7A linker displays a
characteristic “girdle”, while the Cel6A linker comprises two repeating
units. Adapted with permission from ref 399. Copyright 1998 Elsevier.

Some of the original work related to linker domains and
cellulase modularity was conducted with small-angle X-ray
scattering (SAXS) using T. reesei and C. fimi cellulases as model
systems.399−403 Abuja and co-workers examined both Cel7A and
Cel6A, and in both cases, determined that the enzymes
exhibited a “tadpole”-like structure with a large core domain
(the CD) and a long, flexible “tail” (the CBM-linker; Figure
18).399 On the basis of the work from van Tilbeurgh et al.,343
there was already strong experimental evidence indicating that
the “tail” domain was responsible for carbohydrate binding. A
“collar”-like section in the Cel7A linker region was also


attributed to the presence of O-glycosylation.400 It was noted
that the TrCel6A tail region was significantly longer than its
GH7 counterpart from the same organism. A second study on
TrCel7A demonstrated that the enzyme becomes elongated
from 18 to 22 nm in the presence of xylan, which was attributed to
lengthening in the CBM-linker region.401

Langsford et al. published the seminal study demonstrating that
glycosylation of cellulase linkers in C. fimi serves to protect
against proteolysis.404 Specifically, the authors isolated two
cellulases from C. fimi and expressed their counterparts in E. coli,
the latter of which does not natively glycosylate proteins. The
properties of the enzymes in terms of kinetics on small molecule
substrates (such as CMC or pNPC) and thermal and pH
stabilities were not affected by glycosylation.404 Binding was
measured at 0 °C on Avicel and also suggested that, at those
conditions, glycosylation had no effect. In the presence of crude
protease from C. fimi, however, the recombinant cellulases
showed fragmentation patterns that suggested the
proline-threonine rich linker domains are proteolytically cleaved.
Cellulase performance in Avicel digestion was not directly
compared in the absence of protease. Overall, this study led
Langsford et al. to suggest that the C. fimi cellulases were
multimodular with CBMs and CDs, similar to that of van
Tilbeurgh et al.343 Importantly, they also suggested that the
proline-threonine rich domain between these two functional
domains is a “hinge region” (now commonly referred to as the
“linker”) that contains O-glycosylation that protects against
protease action.404 Although an equivalent study has not been
conducted to our knowledge in fungi, perhaps due to the
inherent difficulties in expressing fungal cellulases in E. coli or
fully deglycosylating fungal cellulases, it is likely that the finding
from Langsford and co-workers is similar in multimodular
cellulases regardless of origin in that O-glycosylation on linker
domains tends to shield the linkers from proteolysis during
cellulose depolymerization in an extracellular, competitive
environment.

Srisodsuk et al. examined the activity of TrCel7A as a function
of linker length.251 Therein, the authors produced two mutant
enzymes, one in which the approximately first one-third of the
linker was removed, comprising Gly434-Gly444, referred to by
the authors as the “hinge” region, and the other from Gly434 to
Gly460, which essentially is removal of the entire linker. The
hinge mutation resulted in similar overall performance of the
enzyme relative to the wild-type, but with a reduced binding
capacity at high loading. The mutant removing the entire linker
drastically reduced the enzymatic activity toward crystalline
cellulose. The overall interpretation of the linker’s role by the
authors was that it is important for maintaining sufficient
distance between the CBM and CD and that it facilitates
“dynamic adsorption” to the surface of cellulose.251 An earlier,
similar study on a bacterial cellulase also demonstrated that
removal of the linker domain significantly reduces its activity
toward both crystalline and amorphous cellulose.405

Boisset et al. used static light scattering to examine the
H. insolens EG V (GH45 cellulase), which contains a family 1 CBM and a
33-residue linker.406 They found that the average length of each
residue in the linker domain is approximately 2 Å, which is not
in agreement with α-helices, polyproline helices, or β-sheets.
From these initial SAXS and light scattering studies, it was not
clear if linkers are stiff or flexible. Receveur and co-workers later
reported a comprehensive SAXS study wherein they examined
the full-length H. insolens GH45 EG (the same enzyme as from
Boisset et al.406), the CD alone, the enzyme with the CBM
removed, the CBM alone, a variant where a substantial portion
of the linker was removed, and a polyproline insertion in the
linker.407 They found that the CBM presence does not alter the
overall conformation of the enzyme and that the CBM was not
discernible from the linker region. The results for the wild-type
enzyme and the shortened-linker variant both demonstrate that
the linker volume is quite substantial and extended, suggesting
either (or both) that the glycosylation on the linker provides an
excluded volume effect or that the linker is quite flexible, but
extended. The polyproline mutant exhibited a considerable
narrowing in the region where the proline insertion was made,
suggesting that this region imparted significant local rigidity.
The authors interpret their data to suggest that the linker
provides the means to separate the CBM and CD and optimize
their relative geometry, similar to the interpretation of Srisodsuk
et al.251 The authors go on to state that the conformational
landscape of the linker may impart a “caterpillar”-like motion
between the CBM and CD during catalytic action on insoluble
cellulose. This model is one wherein free energy is gained by
compression of the linker during catalysis, which is dissipated by
sliding of the CBM.407

In the aforementioned study from Receveur et al.,407 the
authors could not resolve the CBM in the full-length H. insolens
GH45 EG, thus limiting their ability to decipher linker behavior
relative to CBM behavior in their SAXS experiments. To
overcome this problem and focus only on the linker domain,
von Ossowski et al. built a chimeric cellulase with 2 GH6
cellulases from H. insolens (Cel6A and Cel6B) with an
88-residue linker between them, wherein both the CDs were
completely discernible from the linker region.408 Using SAXS
combined with molecular modeling, the authors demonstrated
that the chimeric cellulase adopts a huge range of conformations
in solution accessible at low energetic cost with equal probability
(thus, free energetically similar) from compressed to extended,
as measured by the end-to-end distance. To our knowledge, this
study marked the first explicit connection of linkers with
intrinsically disordered proteins, which has become a large field
in the last 10−15 years.409−413 The authors also propose that the
O-glycans may provide greater extension between the
subdomains, and that their results provide additional evidence
toward an inchworm- or caterpillar-like mechanism. Additional
studies have been published that include SAXS analysis of intact
cellulases with broadly similar conclusions.414

Besides SAXS, NMR and MD simulation have also been used to
characterize linker behavior. Poon et al. used NMR spectroscopy
to examine the proline-threonine rich linker of a family 10
xylanase from C. fimi.415 Therein, they showed that the linker
samples conformational space on very fast time scales, from
picoseconds to nanoseconds, and that glycans serve to dampen
mobility. In two subsequent papers, we examined linker
behavior in 3 GH7 cellulases and the TrCel6A linker.416,417
Beckham et al. first used replica-exchange MD to examine the
TrCel7A linker with and without O-glycosylation.416 It was
shown that the Cel7A linker was an intrinsically disordered
protein in solution and that the primary role of the glycans in the
isolated domain was to serve an excluded volume effect, as
measured by the conformational free energy of the linkers as a
function of the end-to-end distance. A later study from
Sammond et al. demonstrated similar behavior for three
additional fungal cellulase linkers.417 Therein, the lack of
structure predicted by replica-exchange MD simulation was
confirmed for the nonglycosylated variants of all four linkers
with circular dichroism spectroscopy.
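The per-residue length argument from the light scattering study of Boisset et al. can be made concrete with a short calculation. The following is a minimal sketch: the 33-residue linker and the roughly 2 Å per residue come from the text, whereas the reference rise-per-residue values for regular secondary structures are approximate textbook numbers assumed here for illustration, and all function and variable names are hypothetical.

```python
# Approximate textbook rise-per-residue values in angstroms.
# These are assumptions for illustration, not numbers taken from the review.
REFERENCE_RISE = {
    "alpha-helix": 1.5,
    "polyproline II helix": 3.1,
    "beta-strand": 3.4,
}


def segment_length(n_residues, rise_per_residue):
    """Contour length (angstroms) of a segment with a fixed rise per residue."""
    return n_residues * rise_per_residue


def relative_deviation(observed_rise, reference_rise):
    """Fractional difference between an observed and a reference rise."""
    return (observed_rise - reference_rise) / reference_rise


N_RESIDUES = 33       # linker length of H. insolens EG V, from the text
OBSERVED_RISE = 2.0   # approximate angstroms per residue, from the text

if __name__ == "__main__":
    observed = segment_length(N_RESIDUES, OBSERVED_RISE)
    print(f"observed linker length: {observed:.1f} A")
    for name, rise in REFERENCE_RISE.items():
        ideal = segment_length(N_RESIDUES, rise)
        dev = relative_deviation(OBSERVED_RISE, rise)
        print(f"{name}: ideal {ideal:.1f} A, deviation {dev:+.0%}")
```

Under these assumed reference values, the observed rise differs from each regular structure by roughly a third or more, which is consistent with the statement that the measurement does not agree with α-helices, polyproline helices, or β-sheets.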

Figure 19. Characteristics of cellulase linkers; data adapted from Sammond et al.417 (A) Linker length histograms for fungal GH7 and GH6 cellulases.
(B) O-glycosylation distribution across linker sequences measured by the prevalence of serine and threonine residues for fungal GH7 and GH6
cellulases.
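The quantities summarized in Figure 19 (linker length and the distribution of serine and threonine along the linker) are simple to compute from sequence. The sketch below is not the procedure of Sammond et al.; it is an assumed, minimal implementation with hypothetical function names, and the example sequence is invented purely for illustration and is not a real cellulase linker.

```python
def linker_stats(seq, terminal_window=2):
    """Length and composition summary for one linker sequence (one-letter codes)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    # Residues within `terminal_window` of either end of the linker.
    termini = seq[:terminal_window] + seq[-terminal_window:]
    return {
        "length": n,
        "ser_thr_fraction": (seq.count("S") + seq.count("T")) / n,
        "pro_fraction": seq.count("P") / n,
        "gly_terminal_fraction": termini.count("G") / len(termini),
    }


def ser_thr_profile(seq, n_bins=4):
    """Fraction of Ser/Thr in each of n_bins equal-width segments along the linker."""
    seq = seq.upper()
    counts = [0] * n_bins
    totals = [0] * n_bins
    for i, aa in enumerate(seq):
        b = min(i * n_bins // len(seq), n_bins - 1)
        totals[b] += 1
        if aa in "ST":
            counts[b] += 1
    return [c / t if t else 0.0 for c, t in zip(counts, totals)]


# Invented example sequence, NOT a real linker.
EXAMPLE = "GGPTTSTPGG"

if __name__ == "__main__":
    print(linker_stats(EXAMPLE))
    print(ser_thr_profile(EXAMPLE, n_bins=2))
```

Applied to a curated set of linkers, per-family histograms of `length` and averaged `ser_thr_profile` values would give plots analogous to panels A and B; Ser/Thr prevalence is used here only as a proxy for potential O-glycosylation sites, as in the figure caption.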

It has often been mentioned that linkers in cellulase enzymes
do not exhibit sequence conservation or high homology to one
another. In the aforementioned study from Sammond et al., an
alternative set of methods was applied to study the similarities
and differences between linkers besides sequence alignments.417
Instead of the conventional sequence identity, linker lengths,
amino acid content, glycosylation distributions, and these
variables as a function of one another, of the GH family, and
of the origin (i.e., either bacterial or fungal) were compared. In
doing so, quantitative patterns in linker characteristics begin to
emerge, some of which are illustrated in Figure 19.417 For
example, multimodular fungal GH6 and GH7 cellulases exhibit a
significant difference in linker length (with GH6 cellulase linker
average length of 42 and GH7s at 30 residues), the functional
reason for which remains unknown. In all cases examined,
O-glycosylation was found to be approximately uniformly
distributed across the length of linkers, suggesting that it is
needed in a distributed manner to protect against proteolysis or
to serve additional, unknown functions. Glycine residues were
found to be clustered at the termini of all linkers studied,
suggesting the need for flexibility at the junction between
ordered domains and the linker, perhaps for orientation during
catalytic action. The amino acid content of bacterial and fungal
linkers differed significantly, with higher proline content in
bacterial cellulase linkers and higher Ser/Thr content in fungal
cellulase linkers. Taken together, these results begin to suggest
that cellulase linkers exhibit properties that are optimized for
specific catalytic function, the mechanistic underpinnings for
which are mostly unknown.417 Given the wealth of sequence
data for cellulases and other carbohydrate-active enzymes in the
CAZy database,151−153 this bioinformatics approach offers a
straightforward, quantitative means to develop new hypotheses
regarding linker function and should provide a framework for
the comparison of linkers beyond simple sequence alignments.

Detailed characterization of the glycosylation pattern of
linkers is essential to understand their behavior. To that end,
several in-depth mass spectrometry studies have been conducted
to characterize the glycosylation patterns on cellulase linkers.
For TrCel7A, Harrison et al. published the first detailed study of
the linker domain glycosylation in 1998; the primary results
from which are shown in Figure 20.368 Therein, the authors
examined Cel7A from a hyper-cellulase-producing strain of T.
reesei (ALKO2877) and determined that all serine and threonine
residues in the linker exhibited at least a single O-mannose
residue. They also found that a mannose residue on the linker
exhibited a sulfate group, but the role of this sulfation remains
unknown. In 2001, Hui et al. characterized the linker of the same
enzyme produced in RUT-C30, Iogen-M4, and Iogen-B13 T.

reesei strains.273 In all cases, the serine and threonine residues
exhibited heterogeneous mannose distributions, with a single
phosphorylation on an undetermined site in the linker. Shortly
after, Hui et al. published a second study characterizing the
glycosylation of TrCel6A, TrCel7B, and of the GH5 EG,
TrCel5A.418 The following numbers of O-glycans
were detected on each linker: 39−46 for Cel6A,
24−34 for Cel7B, and 32−42 for Cel5A. Stals et al. conducted a
thorough examination of glycosylation on the TrCel7A linker as
a function of growth conditions and found that, in minimal
media at low pH, 19−29 mannose residues were present,
whereas, in rich media, the O-glycans on the linker were
trimmed back to 16−23 residues.419 On the basis of analysis of
fungal glycosylation pathways and the aforementioned mass
spectrometry data, the T. reesei O-glycans are likely primarily
mannose units connected by α-O linkages to serine or
threonine.273,368,418−422 However, other carbohydrate
monomers, such as glucose, as well as branching and straight chain
O-glycans are known to be found in yeast and fungi, including T.
reesei.423 Mannose and other carbohydrate monomers are
known also to be connected by both α2 and α6 linkages.
These data, when taken together, further complicate the ability
to examine the impact of linker glycosylation in a systematic
manner.

Figure 20. (A) TrCel7A linker glycosylation pattern from Harrison et
al.368 (B) Molecular snapshot of the linker from Beckham et al.416

Many studies have been conducted to understand how linkers
behave in isolation or how intact multimodular enzymes behave
in solution. Conversely, our collective understanding of linker
function during enzymatic action is limited given that cellulases
act at a solid−liquid interface, which is inherently difficult to
study at the molecular level. Related to cellulase action vis-à-vis
linker function, Receveur et al. proposed the inchworm
hypothesis wherein, during catalytic action, the linker will
store free energy that is dissipated during catalysis or CBM
movement, thus driving the enzyme forward. This hypothesis
was tested computationally by using a free energy method
(umbrella sampling) by compressing and extending an isolated
linker domain over a cellulose slab, which suggested that there
was indeed a barrier.424 However, the computed barrier for
compression was found to be incredibly high, likely given that
convergence of simulations of these types is quite difficult with
such a low-resolution reaction coordinate as end-to-end distance
of a large, glycosylated linker peptide.416,424

Ting et al. proposed a theoretical mechanochemical model of
how a multimodular CBH such as TrCel7A can work at the
solid−liquid interface.425 Therein, they assumed that the CBM
and CD are random walkers connected by a linker modeled as a
spring. The possible steps in the model, governed by a master
equation, are CBM motion in a “forward” or “backward”
direction, and CD motion (once productively bound to a
cellulose chain) in a “forward” direction, i.e., a hydrolysis and
processivity event. Note that the events of processive cellulolytic
action are described in detail in sections 6.2 and 7.2.7. The rate
constant governing hydrolysis and processivity for the CD
accounts for both the work required to decrystallize a single
polymer chain and the compression of the linker as the CD
moves forward. The model demonstrates that the maximum
enzyme velocity on the surface is reached at intermediate linker
stiffness, for a given length. The overall recommendations from
this theoretical model are that optimization needs to be
conducted not only of the CD hydrolysis rates, as was already
known, but also of the linker length and stiffness.

Toward further understanding linker behavior during
cellulase action, we recently reported a combined computational
and experimental study wherein the interactions of the TrCel7A
and TrCel6A enzymes with the surface of cellulose were
examined.393 From long MD simulations, it was predicted that
the glycosylated linkers in both enzymes were able to bind to
cellulose, as illustrated in Figure 21. Subsequently, the binding
affinities to cellulose were experimentally measured for both the
glycosylated TrCel7A CBM-linker (isolated via papain cleavage)
and the CBM alone (produced with solid-state peptide
synthesis). These experimental measurements demonstrate
that the linker indeed is able to increase the binding affinity to
cellulose by a factor of 10 over the CBM alone, as determined from
fits to a Langmuir isotherm model, and that it likely does not function as
a spring between the two structured domains. The MD
simulations further revealed that the linker binding is dynamic,
and the linker lacks secondary structure upon binding, thus
suggesting that its binding mechanism is one that is nonspecific
compared to that of the CBM. These results hearken back to the

Figure 21. Molecular snapshots of TrCel7A and TrCel6A wherein the linker binds to the cellulose surface from microsecond-long MD simulations.
These computational predictions of cellulase linkers enhancing binding of CBMs to the cellulose surface were corroborated experimentally via binding
isotherm measurements.393
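The binding isotherm measurements mentioned in the caption were interpreted with a Langmuir model. As a minimal sketch of what a 10-fold change in affinity means in that model, the code below evaluates a single-site Langmuir isotherm for two dissociation constants that differ by a factor of 10. The numerical values, the use of a dissociation constant as the affinity measure, the assumption of an identical maximum binding capacity for both constructs, and all names are illustrative assumptions rather than values or definitions from ref 393.

```python
def langmuir_bound(free_conc, b_max, k_d):
    """Single-site Langmuir isotherm: bound amount at a given free concentration.

    free_conc and k_d share the same (arbitrary) concentration units;
    b_max is the maximum binding capacity of the substrate.
    """
    if free_conc < 0 or b_max < 0 or k_d <= 0:
        raise ValueError("concentrations must be non-negative and k_d positive")
    return b_max * free_conc / (k_d + free_conc)


# Illustrative, assumed parameters (arbitrary units).
B_MAX = 1.0
KD_CBM = 1.0                     # CBM alone
KD_CBM_LINKER = KD_CBM / 10.0    # 10-fold higher affinity with the linker

if __name__ == "__main__":
    for c in (0.01, 0.1, 1.0, 10.0):
        cbm = langmuir_bound(c, B_MAX, KD_CBM)
        cbm_linker = langmuir_bound(c, B_MAX, KD_CBM_LINKER)
        print(f"free={c:>5}: CBM {cbm:.3f}, CBM-linker {cbm_linker:.3f}")
```

In this model the bound amounts differ by nearly the full factor of 10 only at low free concentration; near saturation both curves approach the same capacity, which is why affinities are extracted from the low-concentration region or from a fit to the whole isotherm.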

Table 5. Summary of Biochemical Characterizations of Fungal GH7 Cellulases
organism | expression host | enzyme | pH opt | temp opt (°C) | substrate for opt | substrate specificity | ref | comments
Acremonium ther- Trichoderma ree- Cel7A 5.0 60 4-methylumbelliferyl- 4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenol- Voutilainen et al., 2008,381
mophilum sei β-D-lactoside β-D-lactoside, Avicel, PASC, filter paper, hydroxyethylcel- Szijártó et al., 2011483
ALKO4245 lulose, xylan, p-nitrophenyl-β-D-glucoside
Acremonium sp. Cel7A pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, Vlasenko et al., 2010332 assay condition 50 °C and pH 5.0;
CBS265.95 CMC, xyloglucan, xylan, arabinoxylan, mannan, galacto- 72% residual activity on CMC after
mannan 3 h at 40 °C and pH 5.0
Acremonium sp. Cel7B pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, Vlasenko et al., 2010332 assay conditions 50 °C and pH 5.0;
CBS265.95 CMC, xyloglucan, xylan, arabinoxylan, mannan, galacto- 92% residual activity on CMC after
mannan 3 h at 40 °C and pH 5.0
Aspergillus aculeatus F-50 CBHI 3.0 60 Avicel Avicel, insoluble oligosaccharides (DP20), CMC, alkali-swollen cellulose, pNPL Takada et al., 1998,484,485 Kanamasa et al., 2003486 identical optima, but lower activity, when expressed in S. cerevisiae
Aspergillus nidulans Pichia pastoris CBHI, CMC, Glc3/Glc4/Glc5/Glc6 Bauer et al., 2006487 assay conditions 37 °C and pH 4.5
FGSC A4 CbhB
Aspergillus nidulans Pichia pastoris EglB 5.5 42 CMC CMC, Glc4/Glc5/Glc6, barley β-glucan, lichenan, xyloglucan Bauer et al., 2006487
FGSC A4
Aspergillus niger CBS 513.88 CbhA CMC Gielkens et al., 1999488 assay conditions 30 °C, using partially purified enzyme
Aspergillus niger CBS 513.88 CbhB CMC Gielkens et al., 1999488 assay conditions 30 °C, using partially purified enzyme
Aspergillus oryzae CelB 4.0 45 CMC CMC, barley β-glucan Kitamoto et al., 1996,489 Kotaka stable below 50 °C and at pH
et al., 2008490 3.0−7.0; A. oryzae KBN616 used
for CMC activity; A. oryzae RIB40
used for barley β-glucan activity

Bispora sp. MEY-1 Pichia pastoris Bgl7A 5.0 60 lichenan barley β-glucan, lichenan, CMC, laminarin, xylan Luo et al., 2010,491 Zhang et al. stable at 60 °C for at least 1 h and at
2013492 pH 1.0−8.0
Chaetomium ther- Trichoderma ree- Cel7A 4.0 65 4-methylumbelliferyl- 4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenol- Voutilainen et al., 2008,381
mophilum sei β-D-lactoside β-D-lactoside, Avicel, PASC, filter paper, hydroxyethylcel- Szijártó et al., 2011483
lulose, xylan, p-nitrophenyl-β-D-glucoside
Chaetomium ther- Pichia pastoris Cbh3 5.0 60 pNPC pNPC, MCC, filter paper Li et al., 2009493
mophilum CT2
Chrysosporium lucknowense Cel7A 5.0−5.5 pNPL pNPL, pNPC, Avicel, CMC, cotton Gusakov et al., 2005494 assay conditions 40 °C; maintained >90% activity after 7 h at 60 °C
Cladorrhinum foe- Cel7 pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, Vlasenko et al., 2010332 assay conditions 50 °C and pH 5.0;
cundissimum CMC, xyloglucan, xylan, arabinoxylan, mannan, galacto- 30% residual activity on CMC after
mannan 3 h at 60 °C and pH 5.0
Claviceps purpurea Cel1 CMC, Avicel Müller et al., 2007495
T5
Fusarium oxysporum EG I, pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, Vlasenko et al., 2010332 assay conditions 50 °C and pH 5.0;
Cel7B CMC, xyloglucan, xylan, arabinoxylan, mannan, galacto- no detectable residual activity on
mannan CMC after 3 h at 60 °C and pH 5.0
Fusicoccum sp. Pichia pastoris CBHI 5.0 40 4-methylumbelliferyl- Avicel, filter paper, 4-methylumbelliferyl-β-D-cellobioside Kanokratana et al., 2008496 stable at pH 3−11 and maintains
BCC4124 β-D-cellobioside ∼50% at 70−90 °C for 30 min
Heterobasidion irreg- Cel7A 4.0 45 pNPL pNPL Momeni et al., 2013449
ulare TC 32-1
Humicola grisea var. Aspergillus ory- Exo1 5.0 65 pNPC Avicel, CMC, Glc2/Glc3/Glc4/Glc5/Glc6, p-nitrophenyl-β-D- Takashima et al., 1998497 stable at pH 3.0−10.0 at 4 °C for
thermoidea zae glucoside, pNPC 20 h
IFO9854
Humicola grisea var. Aspergillus ory- CBHI 5.0 60 pNPC Avicel, CMC, xylan, Glc2/Glc3/Glc4/Glc5/Glc6, p-nitro- Takashima et al., 1996,498 stable at pH 2.0−10.0 at 4 °C for 20
thermoidea zae phenyl-β-D-glucoside, pNPC Takashima et al., 1998497 h, and at 55 °C for 10 min
IFO9854

Humicola grisea var. Aspergillus ory- EGL1 5.0 55−60 pNPC Avicel, CMC, xylan, Glc2/Glc3/Glc4/Glc5/Glc6, p-nitro- Takashima et al., 1996,498 stable at pH 5.0−11.0 at 4 °C for
thermoidea zae phenyl-β-D-glucoside, pNPC Takashima et al., 1998497 20 h, and at 60 °C for 10 min
IFO9854
Humicola insolens Aspergillus ory- CBH1, 5.5 Glc3 PASC, Glc3/Glc4/ Glc5/Glc6, Avicel Schou et al., 1993,442 Schülein, assay conditions 37 °C
zae Cel7A 1997,499 Xu et al., 2009500
Humicola insolens Aspergillus ory- EG1, 5−6 CMC CMC, PASC, Glc3/Glc4/ Glc5/Glc6 pNPC, pNPL, Avicel, Schou et al., 1993,442 Schülein, assay conditions 40 °C; 35% residual
zae Cel7B BC, pretreated corn stover, xyloglucan, xylan, arabinox- 1997,499 Vlasenko et al., activity on CMC after 3 h at 60 °C
ylan, mannan, galactomannan 2010332 and pH 5.0
Irpex lacteus MC-2 Cel1, Ex-1 5.0 50 Avicel Avicel, CMC, BC, pNPL, pNPC Hamada et al., 1999501
Irpex lacteus MC-2 Cel2, Ex-2 5.0 50 Avicel Avicel, CMC, BC, pNPL, pNPC Hamada et al., 1999501
Melanocarpus albo- Cel7A 6.0 65−70 hydroxyethylcellulose hydroxyethylcellulose, 4-methylumbelliferyl-β-D-lactoside, Miettinen-Oinonen et al.,
myces CMC, Avicel, PASC 2004,502 Szijártó et al.,
2008503
Melanocarpus albo- Saccharomyces Cel7B 65 Avicel 4-methylumbelliferyl-β-D-lactoside, Avicel, CMC, hydroxye- Voutilainen et al., 2007,504 assay conditions pH 6.0
myces cerevisiae (cbh) thylcellulose, PASC, filter paper, 2-chloro-4-nitrophenol-β- Szijártó et al., 2008,503 Miet-
D-lactoside tinen-Oinonen et al., 2004,502
Voutilainen et al., 2009505
Myceliophthora ther- Pichia pastoris EG7A 5.0 60 CMC barley β-glucan, CMC, lichenan, arabinoxylan, xylan, filter Karnaouri et al., 2014506 stable at pH 3−11, retaining initial
mophila paper, hydroxyethylcellulose, Avicel, Glc3/Glc4/Glc5 activity after 24 h
Penicillium chrysoge- CBH1 pNPC Hou et al., 2007507
num FS010
Penicillium decum- CBHI, pNPC Gao et al., 2012508
bens Cel7A

Penicillium decum- Saccharomyces Cel7B 4.0 60 CMC CMC, barley β-glucan, PASC, pNPC, Avicel, xylan Wei et al., 2010509 stable at pH 3−8 at 4 °C for 16 h;
bens cerevisiae maintained >90% initial activity
after 1 h at 60 °C
Penicillium occitanis CBHI 4−5 60 pNPC/PASC pNPC, pNPL, Avicel, filter paper, PASC, Glc3/Glc5 Limam et al., 1995510 stable at pH 2−9; maintains activity
Pol6 below 60 °C, but loses ∼50%
activity after 30 min at 60 °C, and
inactivated at 70 °C
Penicillium pulvillo- CBH1 4.2 Avicel Avicel, CMC, p-nitrophenyl-β-D-glucoside, glucuronoxylan Marjamaa et al. 2013511 assay conditions 45 °C
rum
Phanerochaete chrys- CBH62, Avicel, pNPL, pNPC, CMC Uzcategui et al., 1991512
osporium CBH1.1,
Cel7C
Phanerochaete chrys- Cbh58, Avicel, pNPL, pNPC, BMCC, hydroxyethylcellulose amor- Uzcategui et al., 1991,512 von
osporium CBH1.2, phous cellulose Ossowski et al. 2003461
Cel7D
Talaromyces emerso- Cbh1A, 4.1 66−69 2-chloro-4-nitrophenyl- Avicel; pNPC, pNPL, 2-chloro-4-nitrophenyl-β-D-cellobio- Tuohy et al., 2002513 pH optimum measured at 50 °C;
nii IMI 392299 CBH IB, β-D-cellobioside, side, 2-chloro-4-nitrophenol-β-D-cellotrioside, 2-chloro-4- temperature optimum measured at
(Rasamsonia Cel7A pNPL nitrophenol-β-D-lactoside, 4-methyl-umbelliferyl-β-D-cello- pH 5.0; T1/2 of 68 min at 80 °C
emersonii) trioside and pH 5.0
Talaromyces funiculosus IMI 378536 [Penicillium funiculosum] XynA 3.5 55 barley β-glucan barley β-glucan, CMC, pNPL, pNPC, Glc3/Glc4/Glc5, xylan, arabinoxylan Texier et al., 2012,514 Furniss et al., 2005515 maintains >80% activity at pH 3−4.5
Thermoascus auran- Trichoderma ree- Cel7A 5.0 65 4-methylumbelliferyl- 4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenol- Voutilainen et al., 2008381
tiacus ALKO4242 sei β-D-lactoside β-D-lactoside, Avicel, PASC
Thermoascus auran- Saccharomyces Cel7A 6.0 65 pNPL Avicel, PASC, pNPL, pNPC Hong et al., 2003516 stable at pH 3.0−9.0 at 40 °C for 24
Review

tiacus IFO 9748 cerevisiae h; maintained >80% initial activity

DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
after 1 h at 65 °C
Table 5. continued
temp
opt (°
organism expression host enzyme pH opt C) substrate for opt substrate specificity ref comments
Thielavia australien- Cel7A Avicel, PASC Xu et al., 2009500
sis
Thielavia terrestris Cel7A Avicel, PASC Xu et al., 2009500
Chemical Reviews

Thielavia terrestris Cel7C PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, Vlasenko et al., 2010332 assay conditions 50 °C and pH 5.0
xylan, arabinoxylan, mannan, galactomannan
Trichoderma harzia- CBHI, 5.0 50 pNPC BMCC, Avicel, Sigmacell 20, CMC, pNPC, cNPL Colussi et al., 2011,517 Textor
num FP 108/ Cel7A et al., 2013466
IOC-3844
Trichoderma longi- Aspergillus nidu- Egl1, 4.5 50 CMC CMC, barley β-glucan, laminarin, xylan Ganga et al., 1997518 narrow pH activity (35% activity on
brachiatum lans Cel7A CMC at pH 4 and 5.5); opt on
CECT 2606 xylan at pH 4.5 and 60 °C
Trichoderma pseu- Pichia pastoris Cel7B 45 4-methylumbelliferyl- 4-methylumbelliferyl-β-D-lactoside, CMC Mitrovic et al., 2014519 assay conditions pH 4.8
dokoningii β-D-lactoside
Trichoderma reesei Saccharomyces EGI, Cel7B 60 PASC CMC, Glc3/Glc4/Glc5/Glc6, pNPC, pNPL, PASC, Avicel, Van Arsdell et al., 1987,330 assay conditions pH 5.0; 50%
cerevisiae BC, pretreated corn stover, CMC, xyloglucan, xylan, Biely et al., 1991,331 Bailey et residual activity on CMC after 3 h
arabinoxylan, mannan, galactomannan, barley β-glucan, al., 1999,286 Vlasenko et al., at 60 °C and pH 5.0
hydroxyethylcellulose 2010332
Trichoderma reesei CBHI, 4.5 65 4-methylumbelliferyl- 4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenol- Boer and Koivula, 2003,327
L27 Cel7A β-D-lactoside β-D-lactoside, 3,4- dinitrophenyl-β-D-cellobioside, 3,4- Becker et al., 2001328
dinitrophenyl-β-D-lactoside, BMCC
Trichoderma viride EGI 5.0 50 CMC CMC, Glc3 Kwon et al., 1999 retains >80% activity at pH 3.5−6.0
HK-75
Trichoderma viride Saccharomyces CBHI 5.8 60 CMC CMC Song et al., 2010520 retains >70% of maximal activity at

1337
AS 3.3711 cerevisiae pH 3.8−7.0 and 60% of maximal
activity at 40−90 °C
Table 6. GH7 Rate Constants^a

isolated CBH (columns 1−4) | CBH plus EG (columns 5−8)
ref | property | value | substrate | property | value | substrate | ref
476 | kcat (s−1) | 7.1 ± 3.9 | Iα | | | |
564 | kcat (s−1) | 6.8 ± 3.5 | III | | | |
307 | kcat (s−1) | 2.8 ± 0.4 | BC | | | |
557 | kcat (s−1) | 2.2 ± 0.5 | BC | | | |
561 | kcat (s−1) | 2.4 | BMCC | kcat (s−1) | 1.5 ± 0.2 | BC | 557
470 | kcat,gly (s−1) | 0.415 | Glc8 | | | |
441 | kcat,gly (s−1) | 10.8 | Glc9 | | | |
564 | koff (s−1) | 0.20 ± 0.01 | III | | | |
307 | koff (s−1) | 0.0007 ± 0.1 | BC | | | |
561 | koff (s−1) | 0.01 | BMCC | | | |
307 | Papp | 61 ± 14 | BC | | | |
557 | Papp | 66 ± 7 | BC | Papp | 50 ± 3 | BC | 557
561 | Papp | 22.6 | BMCC | | | |
557 | kobs (s−1) | 0.1 ± 0.05 | BC | kobs (s−1) | 1.45 ± 0.5 | BC | 557

^a Advanced experimental techniques have allowed for the determination of various rate constants in the processive cycle of GH7 CBHs. In each case, the CBH is intact TrCel7A, and the EG is TrCel5A. Substrate abbreviations not previously defined are III = cellulose III, Iα = cellulose Iα, Glc8 = cellooctaose, and Glc9 = cellononaose. The rate constant kcat represents the rate constant for the complete inner processive cycle that includes processivity, hydrolysis, and product expulsion, whereas kcat,gly specifically describes the barrier for the glycosylation reaction and was computed via advanced molecular simulation techniques. Papp was called n in the original publication by Cruys-Bagger et al.561 Rate constant kobs is calculated from the rate of cellobiose produced normalized by the concentration of CBH with occupied active site.557
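The relationship between the kcat and koff entries in Table 6 can be made concrete with a short calculation. The sketch below is not taken from the cited studies: it assumes a simple competing-rates picture in which a bound enzyme either completes a processive cycle (rate kcat) or dissociates (rate koff), so that the expected number of cycles per dissociation event is kcat/koff. The dictionary layout and the function name are illustrative only; the numerical values are copied from Table 6 for the three references that report both constants.

```python
# Rate constants for intact TrCel7A copied from Table 6 (units: s^-1).
# Keys are the reference numbers used in the table.
TABLE6 = {
    "564": {"substrate": "cellulose III", "kcat": 6.8, "koff": 0.20},
    "307": {"substrate": "BC", "kcat": 2.8, "koff": 0.0007},
    "561": {"substrate": "BMCC", "kcat": 2.4, "koff": 0.01},
}


def cycles_per_dissociation(kcat: float, koff: float) -> float:
    """Return kcat/koff.

    Assumption (not stated in Table 6): catalysis and dissociation are
    competing first-order processes, so the mean number of completed
    cycles before dissociation is kcat/koff.
    """
    if koff <= 0:
        raise ValueError("koff must be positive")
    return kcat / koff


RATIOS = {
    ref: cycles_per_dissociation(row["kcat"], row["koff"])
    for ref, row in TABLE6.items()
}

for ref, ratio in RATIOS.items():
    print(f"ref {ref} ({TABLE6[ref]['substrate']}): kcat/koff = {ratio:.0f}")
```

Under this assumption the ratios for refs 307 and 561 (roughly 4000 and 240) are far larger than the Papp values reported by the same references (61 and 22.6), so the ratio should be read as an upper-bound style estimate rather than as the apparent processivity itself.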

Abuja et al. SAXS study from 1989 wherein the addition of xylan to TrCel7A was demonstrated to stiffen the linker.401

It has been known for some time that modifications to cellulase linkers modify the enzyme activity, typically in detrimental ways when major changes are made.251,405 On the basis of pioneering work using SAXS and other biophysical methods, we now know that cellulase linkers such as those found in fungi are intrinsically disordered proteins.407,408,414,416,417 For cellulases from T. reesei, linker glycan patterns are either known or at least the range of mannose residues has been characterized on linkers.273,368,418−420 New bioinformatics analyses have emerged recently that suggest a more general means to analyze linkers beyond sequence alignments, yielding more detailed quantitative information about linker function and optimization.417 Lastly, new theoretical analysis coupled with biophysical measurements has suggested that glycosylated linkers play a direct role in cellulose binding, similar to the CBM but with a nonspecific, dynamic role.393 However, despite these strides, key questions remain regarding the detailed mechanistic roles of cellulase linkers. For example, it is unknown why cellulases from different families employ linkers of different average lengths, or why and how linkers from different organisms affect overall enzyme stability and activity.290 Certainly glycosylation is known to be important for linker protection404 and more recently for binding to cellulose,393 but glycosylation is known to be quite heterogeneous and it can differ significantly between fungi.421 How linker sequences and glycan patterns coevolved is an open question. Moreover, most fungal cellulase linkers have been examined from cellulases from T. reesei or similar fungi. However, some fungal cellulases, such as those found in rumen fungi, employ linkers with dramatically different sequence characteristics, such as extremely high asparagine content.417 Clearly, many structure−function studies on both CBMs and linkers remain yet to be done to fully understand their detailed roles in cellulose deconstruction.

6. FAMILY 7 GLYCOSIDE HYDROLASES

GH7 enzymes are commonly among the most prevalent cellulolytic enzymes in secretomes of biomass-degrading fungi, almost undoubtedly because the processive GH7 cellulases provide the majority of hydrolytic turnover during fungal cellulose depolymerization. For example, T. reesei secretes a single GH7 CBH and a single GH7 EG.41 White-rot basidiomycete fungi, such as the model fungus P. chrysosporium, degrade lignin in plant cell walls most likely for enhanced access to biomass polysaccharides, and also employ GH7 CBHs and EGs for cellulolytic action.426,427 Similar to some organisms that employ GH7 CBHs to degrade biomass, P. chrysosporium has multiple GH7 CBHs, and in virtually all organisms that exhibit multiple GH7 CBH genes, the need for multiple, similar CBHs from the same family is as yet unknown. Moreover, unlike other GH enzymes in this review (i.e., GH6, GH5, GH12, and GH45), GH7 enzymes have not been found in bacteria or archaea to date, but mainly have been found in fungi. The CAZy database currently lists nearly 5000 GH7 sequences, although some are noted to be gene fragments only, not full-length enzymes.151−153 Quite recently, GH7 enzymes have been characterized in crustaceans,428 protist symbionts,429 stramenopiles,428 and slime molds (or social amoeba),430−432 demonstrating that they are not only found in fungi. Interestingly, as highlighted by King et al., these nonfungal GH7 cellulases offer distinct evolutionary branches from fungal cellulases, inspiring the study of the similarities and differences with their fungal counterparts.383,428,432,433

The first GH7 cellulase to be discovered was characterized from T. reesei in the late 1970s and early 1980s and was originally denoted CBH I.434−438 As mentioned in section 4, the gene was subsequently sequenced simultaneously by two groups in the early 1980s.281,282 Initial work from Pettersson et al. also showed that TrCel7A was a multimodular protein; these were among the first studies to discover the coupling of binding and catalytic function in cellulases, as discussed above.343,344 As a result of the significant body of work on GH7 cellulases, especially CBHI from T. reesei

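Table 7. Reported Fungal GH7 Crystal Structures

source and original name in primary citation | PDB code | resolution (Å) | brief highlights | ref

CBH Structures
Trichoderma reesei CBH1/Cel7A | 1CEL | 1.80 | first GH7 structure reported; complex with o-iodobenzyl-1-thio-β-D-cellobioside | 172
Trichoderma reesei CBH1/Cel7A | 1DY4 | 1.90 | complex with (S)-propranolol | 463
Trichoderma reesei CBH1/Cel7A | 1EGN | 1.60 | engineered variant E223S/A224H/L225V/T226A/D262G | 328
Trichoderma reesei CBH1/Cel7A | 1Q2B | 1.60 | engineered variant D241C/D249C | 461
Trichoderma reesei CBH1/Cel7A | 1Q2E | 1.75 | engineered variant, deletion of residues 245−252; complex with Glc2-S-Glc2 | 461
Trichoderma reesei CBH1/Cel7A | 2V3I | 1.05 | highest resolution of a CBH1 from T. reesei; complex with (R)-dihydroxy phenanthrenolol | unpublished
Trichoderma reesei CBH1/Cel7A | 2CEL | 2.00 | active-site mutant E212Q | 446
Trichoderma reesei CBH1/Cel7A | 3CEL | 2.00 | active-site mutant E212Q; complex with cellobiose | 446
Trichoderma reesei CBH1/Cel7A | 4CEL | 2.20 | active-site mutant D214N | 446
Trichoderma reesei CBH1/Cel7A | 5CEL | 1.90 | active-site mutant E212Q; complex with two cellotetraose molecules | 173
Trichoderma reesei CBH1/Cel7A | 6CEL | 1.70 | active-site mutant E212Q; complex with cellopentaose and cellotetraose | 173
Trichoderma reesei CBH1/Cel7A | 7CEL | 1.90 | active-site mutant E217Q; complex with cellohexaose and cellobiose | 173
Trichoderma reesei CBH1/Cel7A | 4C4C | 1.45 | active-site mutant E217Q; Michaelis complex | 441
Trichoderma reesei CBH1/Cel7A | 4C4D | 1.32 | active-site mutant E217Q; covalent glycosyl-enzyme intermediate trapped using DNP-2-deoxy-2-fluoro-cellotrioside | 441
Heterobasidion irregulare Cel7A | 2YG1 | 1.90 | | 449
Heterobasidion irregulare Cel7A | 2XSP | 1.70 | complex with xylose | 449
Limnoria quadripunctata Cel7B | 4GWA | 1.60 | first structure of a nonfungal GH7 CBH from a salt tolerant marine animal | 383
Limnoria quadripunctata Cel7B | 4HAP | 1.60 | complex with cellobiose | 383
Limnoria quadripunctata Cel7B | 4HAQ | 1.90 | complex with cellobiose and cellotriose | 383
Limnoria quadripunctata Cel7B | 4IPM | 1.14 | complex with thiocellobiose | 383
Melanocarpus albomyces Cel7B | 2RFW | 1.60 | thermotolerant GH7 CBH1 | 465
Melanocarpus albomyces Cel7B | 2RFY | 1.70 | complex with cellobiose | 465
Melanocarpus albomyces Cel7B | 2RFZ | 1.80 | complex with cellotriose | 465
Melanocarpus albomyces Cel7B | 2RG0 | 2.10 | complex with cellotetraose | 465
Phanerochaete chrysosporium Cel7D | 1GPI | 1.32 | first structure of a GH7 CBH1 from a basidiomycete | 460
Phanerochaete chrysosporium Cel7D | 1H46 | 1.52 | complex with (R)-propranolol | 462
Phanerochaete chrysosporium Cel7D | 1Z3T | 1.70 | complex with cellobiose | 458
Phanerochaete chrysosporium Cel7D | 1Z3V | 1.61 | complex with lactose | 458
Phanerochaete chrysosporium Cel7D | 1Z3W | 1.70 | complex with cellobioimidazole | 458
Talaromyces emersonii Cel7A | 1Q9H | 2.35 | first structure of a thermostable GH7 CBH | 464
Talaromyces emersonii Cel7A | 3PFJ | 1.36 | | unpublished
Talaromyces emersonii Cel7A | 3PFX | 1.26 | complex with cellobiose | unpublished
Talaromyces emersonii Cel7A | 3PFZ | 1.10 | complex with cellotetraose | unpublished
Talaromyces emersonii Cel7A | 3PL3 | 1.18 | complex with cellopentaose | unpublished
Trichoderma harzianum Cel7A | 2Y9N | 2.89 | | unpublished
Trichoderma harzianum Cel7A | 2YOK | 1.67 | | 466

EG Structures
Trichoderma reesei Cel7B | 1EG1 | 3.60 | | 450
Fusarium oxysporum Cel7B | 1OVW | 2.70 | first GH7 EG; complex with thiocellotriose | 174
Fusarium oxysporum Cel7B | 2OVW | 2.30 | complex with cellobiose | 448
Fusarium oxysporum Cel7B | 3OVW | 2.30 | | 448
Fusarium oxysporum Cel7B | 4OVW | 2.30 | complex with epoxybutyl cellobiose | 448
Humicola insolens Cel7B | 1A39 | 2.20 | engineered variant S37W/P39W | 457
Humicola insolens Cel7B | 2A39 | 2.20 | wild-type | 452
Humicola insolens Cel7B | 1DYM | 1.75 | engineered variant E197A | 452
Humicola insolens Cel7B | 1OJI | 2.15 | engineered variant E197S | 653
Humicola insolens Cel7B | 1OJJ | 1.40 | engineered variant E197S; complex with lactose | 653
Humicola insolens Cel7B | 1OJK | 1.50 | engineered variant E197S; complex with cellobiose | 653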

in the late 1970s and 1980s, these enzymes were included in the first classification of cellulases from Henrissat et al., denoted as family C enzymes based on hydrophobic cluster analysis.439

In this section, we review the history of GH7 studies primarily starting from the first structural report in 1994.172 As these enzymes have been the focus of many studies, given their abundance in natural biomass degrading fungi and importance in industrial biomass conversion, a significant number of studies have been applied to elucidate the GH7 catalytic mechanism, understand the basis of GH7 CBH processivity, improve their thermostability and activity, and ascertain other important features of their structure and function. Here, we focus on structure and activity studies of these key enzymes. We note that we do not include a significant review of studies before the first

Figure 22. Crystal structure of the first GH7 CBH and EG. The ligand from the TrCel7A Michaelis complex (PDB code 4C4C, ref 441) is shown in all panels. (A) CBH TrCel7A CD (PDB code 1CEL, ref 172) view from side, exhibiting the β sandwich structure that is characteristic of GH7 enzymes. TrCel7A was the first GH7 structure solved and is the best-characterized member of GH7. (B) TrCel7A view from bottom showing the more closed substrate binding “tunnel”. (C) EG F. oxysporum Cel7B (PDB code 1OVW, ref 174) view from side. (D) FoCel7B view from the bottom showing the more open binding “groove”. (E) TrCel7A Michaelis complex (PDB code 4C4C, ref 441) exemplifies the standard numbering of the substrate binding sites (catalytic residues shown in green for reference). A cellulose chain enters from the −7 site. Hydrolysis occurs between the −1 and +1 sites; thus, the +1/+2 sites are termed the “product sites”.
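The subsite numbering described in the Figure 22 caption can be written down as a small data structure. The following is only an illustrative sketch: the list and function names are hypothetical, and the layout (sites −7 through −1 on the substrate side, +1 and +2 on the product side, no subsite 0) is taken from the caption's description of the TrCel7A Michaelis complex.

```python
# Subsite labels along the TrCel7A tunnel, following the Figure 22E caption:
# the chain enters at -7, and +1/+2 are the product sites (there is no subsite 0).
SUBSITES = [-7, -6, -5, -4, -3, -2, -1, 1, 2]


def split_at_scissile_bond(subsites):
    """Partition subsite labels at the bond cleaved between -1 and +1."""
    if 0 in subsites:
        raise ValueError("glycoside hydrolase subsite numbering skips 0")
    substrate_side = [s for s in subsites if s < 0]
    product_side = [s for s in subsites if s > 0]
    return substrate_side, product_side


SUBSTRATE_SIDE, PRODUCT_SIDE = split_at_scissile_bond(SUBSITES)

# Two glucosyl units sit in the product sites, i.e., a disaccharide
# (cellobiose) is released after hydrolysis.
print(len(SUBSTRATE_SIDE), len(PRODUCT_SIDE))
```

With this layout the scissile bond is flanked by the last negative and the first positive label, and the two occupied product sites correspond to the cellobiose unit released in each catalytic cycle.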

structural reports, except where needed, as these studies have been reviewed extensively.32,316,440 We note that, despite the huge body of work conducted to date on these enzymes, there are still major elements of their function that remain to be elucidated, especially related to improving their stability and activity. Moreover, given the wealth of GH7 sequences available from genomics and metagenomics efforts, expression and testing of GH7 gene products are now of acute importance for the continued development of structure−activity relationships. A summary of the GH7 structures discussed is provided in Table 7, and biochemical data for characterized GH7s are provided in Table 5.

6.1. Structural Studies and Catalytic Function

6.1.1. TrCel7A: Wild-Type. The first crystal structure of a GH7 member was presented in 1994 (PDB code 1CEL), revealing the structure of the catalytic core of the CBH TrCel7A,172 whose primary structural characteristic is a large β-sandwich formed by two large antiparallel β-sheets. This original structure was a complex with inhibitor o-iodobenzyl-1-thio-β-D-cellobioside; Figure 22A,B shows the 1CEL crystal structure with the cellononaose ligand from the solved structure of the Michaelis complex (PDB code 4C4C, ref 441), published 20 years later. The structure revealed a binding tunnel that was estimated to have 7 binding sites, roughly twice as long as that of the TrCel6A CBH (the only other CBH with a solved structure at that time192). As in Cel6A, the binding tunnel was lined with tryptophan residues (three in Cel6A and four in Cel7A). Though it was previously determined via ¹H NMR that TrCel7A (later generalized to the entire GH7 family442) employs a two-step, retaining catalytic mechanism (contrasted with the one-step, inverting mechanism of TrCel6A),443 the structural machinery had not yet been revealed. This mechanism utilizes two glutamate residues that generally reside approximately 5.5 Å apart in the catalytically active conformation (Figure 23).444 One of these residues is the nucleophile for the first step (glycosylation), and the other is the acid/base. The acid/base donates a proton to the glycosidic oxygen in the first step and removes a proton from a water molecule that serves as the second step nucleophile. On the basis of the 1CEL crystal structure, two glutamate residues were proposed as the catalytic residues: Glu217 as the catalytic acid/base and Glu212 as the nucleophile. On the basis of its proximity to the O4 atom of the o-iodobenzyl-1-thio-β-D-cellobioside glucosyl moiety (occupying the putative position of the cleavable glycosidic bond), Glu217 was suggested as the acid/base (Figure 23).172 A third acidic residue, Asp214 in TrCel7A, was noted to potentially be involved in the chemical steps172 due to its proximity to the active site and close contact with the putative nucleophile, Glu212. In addition, the location of the proposed active site suggested that the cellulose chain is cleaved from its reducing end. This was consistent with previous findings that established the preference of TrCel7A to hydrolyze the reducing ends of cellulose chains using cello-oligosaccharides with tritium-labeled reducing ends.445 It was also proposed that the longer binding tunnel, along with the location of cleavage that is skewed toward the tunnel exit, would make Cel7A more processive than Cel6A. Given its importance as the first structural representative of a GH7 CBH and the fact that many studies have used this enzyme as the model GH7 CBH, in the discussion that follows, the numbering of individual enzyme residues corresponds to that of TrCel7A, unless otherwise noted.

Figure 23. First structural picture of a GH7 active site. TrCel7A (PDB code 1CEL) was the first GH7 structure solved. The o-iodobenzyl-1-thio-β-D-cellobioside inhibitor captured in the product sites helped to identify Glu217 as the potential acid/base in the retaining mechanism. Glu212 was proposed as the nucleophile with the role of Asp214 not yet elucidated.

6.1.2. TrCel7A Catalytic Mutants. Site-directed mutagenesis of TrCel7A was subsequently employed in 1996 to further probe the roles of the triad of acidic residues in glycosidic bond cleavage by mutation to their isosteric amide counterparts.446 The catalytic activity of the individual point mutants E212Q, D214N, and E217Q was impaired on 2-chloro-4-nitrophenol-β-D-lactoside, with kcat reduced to 1/2000, 1/85, and 1/370 of the wild-type value, respectively. In addition, E212Q and E217Q mutants lost all catalytic activity on crystalline cellulose.446 Crystal structures of each mutant showed that the active site architecture and overall fold of the protein are identical in all the mutants and the wild-type, thus confirming that the activity loss was due to the catalytic roles of these three residues (PDB codes 2CEL, 3CEL, and 4CEL). Importantly, the D214N structure (PDB code 4CEL) revealed a calcium ion bound to Glu212, supporting the hypothesis that this residue is the charged species in the precatalytic state.

Figure 24. Enzymatic substrate distortion in the GH7 active site. FoCel7B (PDB code 1OVW, ref 174) was the first GH7 EG structure solved. Note the ring distortion at the −1 subsite that is midway between a ⁴E envelope and a ¹,⁴B boat conformation. Sulfur atoms (shown in yellow) replace the glycosidic oxygen atoms. Glu197 is the nucleophile, and Glu202 is the catalytic acid/base.

6.1.3. F. oxysporum Cel7B with Active Site-Spanning Nonhydrolyzable Inhibitor. Also in 1996, important structural details of the substrate at the active site during catalysis were revealed by the first crystal structure of a GH7 EG from F. oxysporum (FoCel7B) complexed with a nonhydrolyzable thiooligosaccharide inhibitor (PDB code 1OVW, Figure 22C,D).174 EGI was previously characterized as having four binding sites via analytical HPLC measurements and kinetic assays of this enzyme on reduced cello-oligomers of varying length (DP 3−6).442 The active site-spanning substrate analogue captured in the FoCel7B crystal structure occupies the −2, −1, and +1 binding sites (Figure 24). Along with the structure of a chitin-degrading enzyme published earlier the same year,447 this crystal structure was the first to reveal an


Figure 25. Loop structures in GH7 enzymes. The major loops of TrCel7A (PDB code 4C4C, ref 441) and TrCel7B (PDB code 1EG1, ref 450). The loop nomenclature is taken from Momeni et al.449 Note the deletions of most major loops in TrCel7B. The ligand from 4C4C is shown in both panels.

Figure 26. Key structural differences among GH7 EGs. (A) At the tunnel entrance, TrCel7B (PDB code 1EG1, shown in orange) is the only GH7 EG with solved structure to naturally maintain the tryptophan stacking seen in TrCel7A. Also shown is the HiCel7B S37W/P39W mutant (PDB code 1A39, magenta). (B) Similar to the tunnel entrance, TrCel7B is the only one of the three GH7 EGs with solved structure to maintain the aromatic stacking (Tyr38) at the −4 site seen in TrCel7A (Trp38). The HiCel7B S37W/P39W mutant is also shown, as well as the residues occupying this site in FoCel7B (PDB code 1OVW, gray) and wild-type HiCel7B (PDB code 2A39, slate): Ile37 and Ser37, respectively. (C) The product binding sites are quite different in the three EGs as compared with TrCel7A. Two of the three arginine residues that contact the ligand in TrCel7A are absent in all three EGs (Arg251 and Arg394). Arg267, however, is found in HiCel7B and FoCel7B. These two EGs also have an additional residue that hydrogen bonds to the substrate (His209) that is not found in either of the T. reesei enzymes. In all panels, the background protein from TrCel7B (transparent orange “cartoon”) and the ligand from the TrCel7A Michaelis complex (PDB code 4C4C, in aquamarine “sticks”) are shown.
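The kcat reductions reported for the TrCel7A catalytic mutants in section 6.1.2 (1/2000, 1/85, and 1/370 of wild-type) can be restated as apparent increases in the activation free energy. The sketch below is an illustration rather than an analysis from the cited work: it assumes transition-state theory applies to kcat with the same prefactor for mutant and wild-type, so that ΔΔG‡ = RT ln(kcat,wt/kcat,mut), and it assumes a temperature of 298.15 K because the assay temperature is not given in this section. All names in the code are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15    # K; assumed value, the assay temperature is not stated here

# Fold reductions in kcat relative to wild-type TrCel7A on
# 2-chloro-4-nitrophenol-beta-D-lactoside, as quoted in section 6.1.2.
FOLD_REDUCTION = {"E212Q": 2000.0, "D214N": 85.0, "E217Q": 370.0}


def apparent_ddg(fold_reduction: float, temperature: float = T_ASSUMED) -> float:
    """Apparent barrier increase, RT*ln(kcat_wt/kcat_mut), in kcal/mol."""
    if fold_reduction <= 0:
        raise ValueError("fold reduction must be positive")
    return R_KCAL * temperature * math.log(fold_reduction)


DDG = {mutant: apparent_ddg(fold) for mutant, fold in FOLD_REDUCTION.items()}

for mutant, value in DDG.items():
    print(f"{mutant}: apparent ddG of about {value:.1f} kcal/mol")
```

Under these assumptions the three mutations correspond to apparent barrier increases of a few kcal/mol each, with the putative nucleophile mutant E212Q showing the largest effect and D214N the smallest, in line with the ordering of the kcat reductions.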

intact glycosidic bond across the cleavage sites (Figure 24).190 The glucosyl ring at the −1 subsite was distorted into a nonchair conformation (later termed a “skew boat”448). As discussed in section 3.2, GHs have been proposed to distort substrate ring conformations to aid catalytic bond cleavage. This structure was further evidence that this distortion is indeed critical for providing the nucleophile access to the −1 anomeric carbon for nucleophilic attack.174 The structure of FoCel7B also provided additional structural evidence confirming the importance of the three acidic residues that had been identified previously for TrCel7A. Glu202 (corresponding to TrCel7A Glu217) hydrogen bonds (at an O-S distance of 2.7 Å) to the glycosidic sulfur of the inhibitor between the −1 and +1 glucosyl rings (Figure 24). In addition, Glu197 (corresponding to TrCel7A Glu212) was poised for nucleophilic attack, residing 3.2 Å away from the −1 anomeric carbon. Finally, it was noted that Asp199 (corresponding to TrCel7A Asp214) formed a hydrogen bond (2.5 Å) to the catalytic nucleophile. Though its catalytic role was still unclear, it was speculated to be involved in “proton shuffling” around the active site, maintaining the catalytically active protonation states for the nucleophile and acid/base. Though not discussed at length in the original publication, this structure was also the first to reveal the dramatic loop shortenings that were later identified as being characteristic of GH7 EGs. In fact, comparison with the structure of TrCel7A shows that loops B1, B2, B3, B4, A1, and A4 are all considerably shortened or absent altogether in FoCel7B. Figure 25 provides the loop nomenclature utilized herein for GH7 cellulases, following Momeni et al.449

6.1.4. TrCel7B. In 1997, the crystal structure of T. reesei EG (PDB code 1EG1), at the time defined as EGI but later renamed to TrCel7B, revealed the same overall fold as TrCel7A but with a completely open binding site “cleft” compared to the tunnel observed for TrCel7A.450 In TrCel7B, loops B2, B3, B4, and A4 are absent (Figure 25). However, loop A1 is of similar length in TrCel7B as it is in TrCel7A, though slightly more open in the crystal structure. Sequence alignments had previously indicated that major deletions seen in TrCel7B mapped to the tunnel-

Figure 27. Comparison of the loops of GH7 EGs. Compared with GH7 CBHs, GH7 EGs display significantly shorter (or altogether deleted) loops
that connect the two faces of the β sandwich. (A) TrCel7B (PDB code 1EG1) displays the most truncated loop structures of any GH7 cellulase with a
solved structure. (B) FoCel7B (PDB 1OVW) and (C) HiCel7B (PDB code 2A39) have slightly more prominent B3 and B4 loops than TrCel7B. In all
panels, the ligand from the TrCel7A Michaelis complex (PDB code 4C4C) is shown in aquamarine “sticks”.

forming loop regions of TrCel7A. The TrCel7B crystal structure confirmed the absence of these loops, yielding a more open binding cleft. These differences in loop structure potentially provided a strong rationale for the difference in functionality of CBHs versus EGs. The additional surface loops in CBHs were hypothesized to prevent an extracted cellulose chain from readhering to the crystalline surface postcatalysis as well as keeping the chain threaded in the tunnel for multiple hydrolytic events before dissociation, enabling CBHs to processively cleave cellobiose much more effectively than EGs.

The same year, the identity of the GH7 catalytic nucleophile was confirmed by isolating the glycosyl-enzyme intermediate of FoCel7B.451 The glycosyl-enzyme intermediate was captured by incubation of the enzyme with 2′,4′-dinitrophenyl 2-deoxy-2-fluoro-β-cellobioside, a method that has found great utility in identifying active site residues, as mentioned in section 3.2.170,206,452−454 This class of inhibitors slows both steps of the retaining mechanism due to the presence of the C2 fluorine. Coupling this with a good leaving group that accelerates only the first step (glycosylation) allows for “trapping” of the glycosyl-enzyme intermediate.202,451,455 Subsequent to trapping, characterization via mass spectrometry allows for identification of the nucleophilic residue. In this case, the nucleophile was identified as Glu197, which is completely conserved in GH7s and corresponds to Glu212 in TrCel7A. Around the same time, a combination of comparative liquid chromatography coupled online to electrospray ionization mass spectrometry, tandem mass spectrometry, and microsequencing applied to TrCel7A bound with an epoxide-based inhibitor provided direct experimental proof for Glu212 as the catalytic nucleophile for that enzyme and also revealed the glycosylation pattern of the core protein.456

6.1.5. H. insolens Cel7B S37W/P39W Mutant. Also in 1997, the structure of a double mutant of H. insolens Cel7B (HiCel7B; PDB code 1A39) was published wherein additional sugar-binding sites were engineered on the basis of comparison with TrCel7A.457 Two tryptophan residues were inserted (S37W and P39W) to introduce “+3” and “+4” binding sites (Figure 26A,B). The resulting mutant maintained wild-type level of activity on soluble substrates, but had a slightly decreased Michaelis constant KM (by 30%) on PASC, indicating slightly higher binding on longer substrates. Subsequent publication of the wild-type structure of this enzyme (PDB code 2A39)452 further described the more open binding cleft of EGs first observed structurally for FoCel7B and first discussed in detail for TrCel7B (Figure 27).450 These findings also solidified the structural basis for the observed differences in EGs and CBHs. The same study that presented wild-type HiCel7B also utilized site-directed mutagenesis to confirm the catalytic roles of Glu202 (acid/base, corresponding to Glu217 in TrCel7A) and Glu197 (nucleophile, corresponding to Glu212 in TrCel7A).452 E197A and E202A mutants had no detectable catalytic activity, either on reduced oligosaccharides or p-nitrophenylcellobioside substrates.452 This study also confirmed the identity of the catalytic nucleophile for HiCel7B utilizing the technique previously applied to FoCel7B451 in which the glycosyl-enzyme intermediate is trapped and subsequently characterized by mass spectrometry.

The three EG structures described directly above constitute the only GH7 EG structures solved to date. The most dramatic structural difference between these EGs and TrCel7A is in the loops that protrude from the two faces of the β sandwich. All of the major loops (with the exception of A1 at the tunnel entrance) are shortest in TrCel7B compared with the other GH7 EGs; thus, its binding cleft is the most open of any GH7 cellulase characterized to date (Figure 25B and Figure 27A). The shortenings/deletions of both the so-called “exo” loop (B3) as well as another large loop (B2) are particularly dramatic in TrCel7B. However, compared with HiCel7B and FoCel7B, TrCel7B has the longest entrance site loop (A1), which is on par with that of TrCel7A. Further comparing HiCel7B with FoCel7B shows that the loop structures for these two enzymes are nearly identical, with the minor exception of a slightly lengthened loop in HiCel7B on the back of the binding tunnel entrance.

All three GH7 EGs align remarkably well to TrCel7A with regards to active site protein residues (Figure 28). The aromatic−carbohydrate interactions (discussed in more detail below) are conserved in all three EGs at Trp367 (over the −2 subsite) and Trp376 (over the +1 subsite). However, the important interaction between Trp40 and the −7 sugar residue is naturally present in only one known EG structure, TrCel7B (Figure 26A). TrCel7B is also the only EG to naturally maintain an aromatic stacking interaction at the −4 subsite, albeit with a tyrosine rather than the tryptophan found in TrCel7A (both at residue number 38, Figure 26B). The other two EGs have no aromatic interaction here; this site is occupied by isoleucine in FoCel7B and by serine in HiCel7B.

Figure 28. Sequence alignment of major GH7 enzymes. Sequence alignment of three GH7 CBHs (TrCel7A, PcCel7D, and HirCel7A) and three EGs (TrCel7B, HiCel7B, and FoCel7B). Strictly conserved residues are shown in red block, and chemically similar residues in red text. The blue boxes indicate chemical similarity across a grouping of residues. The secondary structural elements and residue numbering of TrCel7A are shown above the sequences. Loop structures (A1, B1, etc.) are shown in black boxes. The catalytic triad is denoted by yellow stars. The sequence alignment was generated with ESPript (http://espript.ibcp.fr).347

Despite the exceptional similarity in catalytic machinery and some similarity in aromatic−carbohydrate interactions, significant differences exist not only in the aforementioned loop morphologies, but also in relevant enzyme residues at the tunnel entrance and the product binding sites. As noted above, amongst EGs only TrCel7B maintains the protein−carbohydrate stacking between Trp40 and the −7 subsite sugar at the tunnel entrance. Though both enzymes lack this interaction,

Figure 29. Comparison of the loops of GH7 CBHs. The loops that extend from the faces of the β sandwich in GH7 CBHs enclose the substrate
binding tunnel to varying degrees. Among GH7 CBHs with known structures, the A2, A3, A4, B1, B2, and B4 loops are fairly similar. (A) TrCel7A
encloses substrate most fully of any GH7 CBH. (B) PcCel7D features a shortening of the A1 loop and a natural deletion of six residues on loop B3
(“exo” loop) that give it a more open active site than TrCel7A (Figure 30A). (C) HirCel7A features a lengthened A1 loop (Figure 30A) and a slightly
shortened B3 loop (compared with TrCel7A) due to the natural deletion of two residues. In all panels, the ligand from the TrCel7A Michaelis complex
(PDB code 4C4C) is shown in aquamarine “sticks”.

HiCel7B and FoCel7B are not identical at the entrance, as HiCel7B has essentially no interactions with the ligand here, but FoCel7B has an arginine residue (Arg41, FoCel7B numbering) within hydrogen bonding distance to both the −7 and −6 subsites that TrCel7A lacks. The Asn49 residue in TrCel7A forms hydrogen bonds to both the −7 and −6 subsite glucosyl moieties, which is not found in any of the GH7 EGs.
Moving toward the tunnel exit, an increasing number of discrepancies are present between TrCel7A and the three GH7 EGs. This is partly due to missing loops that enclose the tunnel exit in TrCel7A (A4 and B4, Figure 25). Another key difference in the product sites is three arginine residues present in TrCel7A (Arg251, Arg267, and Arg394) that are at or close to hydrogen bonding distance to the +1/+2 glucosyl residues (Figure 26C). Arg251 is located at the base of loop B3 and has been identified as being particularly important to product coordination.458 The guanido group of this residue has been observed to make direct hydrogen bonds with the O5 and O6 atoms of the sugar in the +1 binding site.458 All three EGs lack this residue. All three also lack Arg394, which has been identified in a computational study of processivity in TrCel7A as being one of the key drivers of processive motion.459 The conservation of these arginine residues in processive cellobiohydrolases (Figure 28) and their absence in relatively nonprocessive EGs perhaps affirm their importance in driving processive motion. In contrast to the other two arginine residues present at the TrCel7A product sites, HiCel7B and FoCel7B maintain the Arg267 interaction. These latter two EGs also have an additional interaction at the product sites that both T. reesei enzymes lack: His209, which is within hydrogen bonding distance to the C2 hydroxyl of the +1 glucosyl residue.
6.1.6. TrCel7A: Cello-Oligomer Complexes. The extensive insights into GH7 architecture provided by the three EG structures were greatly enhanced by subsequent publication of new CBH structures. For example, crystal structures of TrCel7A E212Q and E217Q mutants published in 1998 gave much greater insight into the binding of cello-oligomers in GH7 CBHs.173 These structures included the binding of (1) two cellotetraose molecules on either side of the vacant −3 site (PDB code 5CEL), (2) cellopentaose in −6 to −2 and cellotetraose in +1 to +4 (PDB code 6CEL), and (3) cellohexaose in −7 to −2 and cellobiose in +1/+2 (PDB code 7CEL). This more complete view of cellulose chain binding revealed 9−10 binding sites (−7 to either +2 or +3) in a 50 Å long tunnel (the +3 site has almost no interaction with enzymatic residues, and the +4 site is outside of the binding tunnel and has no carbohydrate−protein interactions). From −7 to −4 the chain binding for all ligands overlaps almost perfectly. Two twists occur between −4 and −2 that essentially turn the cellulose chain upside down. By linking the cellohexaose and cellobiose contained in 7CEL and modeling a “skew boat” configuration in the −1 site (based on previously published chitobiase (ref 447) and FoCel7B (ref 174) structures), a theoretical model of the Michaelis complex (PDB code 8CEL) was also presented that became the basis for modeling studies of this enzyme for the next 15 years.
6.1.7. P. chrysosporium Cel7D. The white-rot basidiomycete P. chrysosporium secretes six unique GH7 CBHs.426 The year 2001 marked the publication of the crystal structure of PcCel7D, the first GH7 CBH crystal structure from a basidiomycete,460 and revealed a substrate binding “groove” that is more open than TrCel7A, but not as open as the GH7 EGs. This more open architecture is the result of several loop deletions/shortenings compared with TrCel7A. At the binding tunnel entrance, PcCel7D has a shortened A1 loop but also an extra tyrosine (Tyr47) covering the entrance on the opposite side of the tunnel entrance that TrCel7A lacks (Figures 29 and 30A). In addition, PcCel7D has a much shorter B3 loop and slightly shorter B2 loop compared with TrCel7A. These loop variations give a more accessible active site and may be the structural explanation for PcCel7D’s enhanced kcat and Km on small soluble substrates. The more open structure may also explain the reduction in the binding of the cellobiose product, easing product inhibition for this enzyme relative to TrCel7A.461 The specific residues responsible for this are discussed further in section 6.3.
The difference in substrate binding architecture between TrCel7A and PcCel7D may also be relevant to the binding of different enantiomers of the β-blocker propranolol. Both enzymes prefer the S enantiomer over the R,462 yet crystal structures of TrCel7A (PDB code 1DY4) could only be obtained with S,463 while PcCel7D (PDB code 1H46) only gave crystals with R.462 The enantioselectivity of TrCel7A may be a largely entropy-driven process, as it has been shown to increase with temperature.462 Thus, the more open active site, and the resulting increase in solvation, could explain this difference.
6.1.8. TrCel7A: Exo Loop Engineering. The comparison of TrCel7A and PcCel7D was extended in 2003 by extensive

Figure 30. Key structural differences among GH7 CBHs. (A) GH7 enzymes exhibit A1 loops that are shortened (exemplified by PcCel7D in light blue,
PDB code 1GPI) and lengthened (exemplified by HirCel7A in light pink, PDB code 2YG1) compared with TrCel7A (green, PDB code 4C4C). On
the opposite side of the glucan chain, Trp40 is conserved in all cases. HirCel7A and PcCel7D both have an extra tyrosine that protrudes over the tunnel
entrance: HirCel7A Tyr101 over the top and PcCel7D Tyr47 on the opposite side. (B) Variation in tyrosines on the A3 and B3 loops in GH7 CBHs:
the lack of Tyr371 (numbering for TrCel7A) renders the B3 loop of ThCel7A (magenta, PDB code 2Y9L) more flexible, and LqCel7A (yellow, PDB
code 4GWA) lacks the Tyr247 equivalent and has a slightly shortened B3 loop; TrCel7A possesses a tyrosine at the tip of both the A3 and B3 loops,
and these have multiple conformations (PDB code 1CEL, shown in teal, exemplifies the alternate conformation). (C) Key variable residues that
coordinate the glucosyl residues in the product sites. TrCel7A lacks the aspartate interaction at the +2 site that is seen in PcCel7D (Asp336) and
HirCel7A (Asp347). All three CBHs maintain the three arginine residues shown. In addition, due to the shortening of the B3 loop in PcCel7D (six
residues) and HirCel7A (two residues), both of these lack Tyr247 and Thr246, which are possessed by TrCel7A. In all panels, the ligand from the
TrCel7A Michaelis complex is shown in aquamarine “sticks” (PDB code 4C4C).

engineering of the active-site loop B3 (referred to as the “exo” loop in the original publication) of TrCel7A; this loop is the most prominent active-site structural difference between these two enzymes.461 This included (separately) the introduction of a Y247F mutation at the tip of the loop (no crystal structure presented), deletion of the middle eight residues of the loop (PDB code 1Q2E), and stabilization by introduction of a disulfide bridge (PDB code 1Q2B). The three mutations showed little effect on the hydrolysis of small, soluble substrates. The deletion mutant gave enhanced activity on amorphous cellulose, but a 50% activity reduction on crystalline cellulose; this was associated with a reduction in processive character. The disulfide bridge resulted in enhanced activity on both amorphous and crystalline cellulose. Taken together, these data confirmed the previous hypothesis that the B3 loop is integral to the high processivity of TrCel7A.
6.1.9. TeCel7A. In 2004, the crystal structure of T. emersonii (or Rasamsonia emersonii) CBH IB (TeCel7A) represented the first structure of a thermostable GH7 CBH and also the first of a GH7 CBH naturally lacking a CBM (PDB code 1Q9H).464 In general, the structure of TeCel7A is quite similar to TrCel7A, with two exceptions. First, the tip of the B2 loop was not resolved in the TeCel7A crystal structure, so comparison is not possible. Also, the A1 loop at the binding tunnel entrance is extremely similar to that of PcCel7D (Figure 30A), which is somewhat shortened from that of TrCel7A.464
6.1.10. PcCel7D Bound with Disaccharides. Further structural studies of PcCel7D in 2005 focused on the interaction with inhibitors: natural product cellobiose (PDB code 1Z3T), lactose (PDB code 1Z3V), and cellobioimidazole (PDB code 1Z3W).458 The structural information revealed here was used to explain the differences in affinities for cellobiose and lactose between PcCel7D and TrCel7A. Residues on the B3 loop may provide more possibilities for interaction with both the substrate and the product in TrCel7A. TrCel7A, however, lacks a conserved acidic residue (Asp336) that was shown to interact with a glucopyranose in the +2 subsite of PcCel7D (Figure 30C). The studies also revealed that Tris (buffer molecule) can bind in the active site, and that both Tris and calcium inhibit PcCel7D CBH activity.
6.1.11. M. albomyces Cel7B. The structure of another thermostable GH7 CBH natively lacking a CBM was published in 2008 from M. albomyces.465 MaCel7B (PDB code 2RFW) was crystallized with four molecules within one asymmetric unit and was determined both in apo form and with bound cellobiose, cellotriose, and cellotetraose. This gave, altogether, 16 different “structure snapshots” of the same enzyme and an overall picture of surprisingly large conformational variability in the substrate interaction with the enzyme. The structure of MaCel7B is also quite similar to TrCel7A. The most significant difference from TrCel7A is that MaCel7B possesses a slightly elongated entrance loop A1, which features a unique tyrosine residue (Tyr100, which neither PcCel7D nor TrCel7A possesses) that sits “above” the −7 binding site, opposite the conserved tryptophan at this site (Trp40 in TrCel7A).
6.1.12. Heterobasidion irregulare Cel7A. The comparison of PcCel7D and TrCel7A was taken one step further by Momeni and Payne et al. in 2013.449 This publication presented the structure of Cel7A from the root-rot fungus and basidiomycete H. irregulare (PDB code 2YG1). HirCel7A also possesses the elongated A1 loop seen in MaCel7B, including the extra tyrosine residue (Tyr101 in HirCel7A; Figure 30A).465 HirCel7A has a slightly shorter B3 loop (deletion of two residues) compared with TrCel7A (and PcCel7D lacks this loop entirely), resulting in the loss of stable contacts across the binding tunnel with loop A3 that TrCel7A possesses (Figures 29 and 30B). Structural comparisons of the three enzymes were complemented by MD simulations to examine the flexibility of tunnel closing loops and the potential role for these in the recognition of substrate, endo-initiation, and the processive action of the enzyme. On this basis, it was suggested that HirCel7A exhibits intermediate properties

between TrCel7A and PcCel7D in terms of processivity and possible endo-initiation capability.
6.1.13. T. harzianum Cel7A. In 2013, the structure of T. harzianum Cel7A (ThCel7A; PDB code 2Y9L), an enzyme with 81% sequence identity with TrCel7A, revealed a few significant structural differences.466 The entrance to the substrate-binding tunnel is more open in ThCel7A than TrCel7A, due to the shortening of the A1 loop, very similar to that seen in PcCel7D (Figure 30A) and TeCel7A. The second highlighted difference was the potentially greater flexibility of the B3 loop (denoted loop 4 in the original publication) as a result of a single residue substitution between ThCel7A (Ala384) and TrCel7A (Tyr371). This substitution is not on the B3 loop itself, which is highly conserved between the enzymes, but on the opposing side of the tunnel (Figure 30B). MD simulations of the two enzymes confirmed that this substitution does increase the flexibility of the B3 loop and results in a more open binding tunnel.466
6.1.14. L. quadripunctata Cel7B. The year 2013 also featured publication of the first structure of a nonfungal GH7 CBH from the marine woodborer L. quadripunctata.383 Four high-resolution LqCel7B structures (PDB codes 4GWA, 4HAP, 4HAQ, and 4IPM) constituted the basis for a structural analysis and comparisons with TrCel7A using MD simulations. LqCel7B was shown to have a highly acidic surface charge, which may be the source of its high activity in saline environments (Figure 31).
At the substrate tunnel entrance LqCel7B exhibits the same A1 loop elongation, with extra tyrosine (Tyr121 in this case) also seen in MaCel7B (ref 465) and HirCel7A (Figure 30A).449 The B3 loop conformation for LqCel7B is quite similar to that of HirCel7A (PDB code 2YG1). HirCel7A and LqCel7B both lack the tyrosine (Tyr247) that TrCel7A possesses that interacts with loop A3 across the binding tunnel (Figure 30B). Where LqCel7B and HirCel7A differ, though, is in the tyrosine (Tyr371 in TrCel7A) on the opposing A3 loop: LqCel7B possesses this tyrosine, whereas HirCel7A does not (Figure 30B).

Figure 31. Electrostatic potential distribution on the solvent accessible surface of LqCel7B. Electrostatic potential between −7 kT/e and 7 kT/e is shown as a colored gradient from red (acidic) to blue (basic). (A) LqCel7B possesses an anomalously high frequency of acidic residues on its surface. These are likely required for activity in its native marine environment. (B) The surface charge of TrCel7A is much less acidic. Adapted from ref 383.

6.1.15. TrCel7A Michaelis Complex and Glycosyl-Enzyme Intermediate. In 2014, two structural snapshots of key steps in the GH7 retaining mechanism were published (Figure 32), namely the Michaelis complex (PDB code 4C4C) and glycosyl-enzyme intermediate (PDB code 4C4D) of TrCel7A, both as acid/base disabled mutants (E217Q).441 The Michaelis complex featured a cellononaose chain that spans the entire binding tunnel. The glycosyl-enzyme intermediate contains a cellohexaose molecule covalently bound to the nucleophile and a cellobiose product filling the +1/+2 sites. The glycosyl-enzyme intermediate was captured by the highly successful method206,451−453 of incubation with 2,4-dinitrophenyl 2-deoxy-2-fluoro-β-cellotrioside, a mechanism-based suicide inhibitor.170,206,452−454 These structures constitute the first experimentally determined structural picture of the Michaelis complex for a GH7 CBH (confirming much of the geometry in the theoretical model of the Michaelis complex presented in 1998 with PDB code 8CEL; ref 173) and the first glycosyl-enzyme intermediate for any GH7 enzyme.441

Figure 32. Michaelis complex and glycosyl-enzyme intermediate of TrCel7A. (A) TrCel7A Michaelis complex (PDB code 4C4C; ref 441). Note the ⁴E conformation of the substrate at the −1 site. (B) TrCel7A glycosyl-enzyme intermediate (PDB code 4C4D; ref 441) with covalent bond between the nucleophile and the broken cellooligomer chain. The cellobiose product is in unprimed glycosyl-enzyme intermediate mode. Note the approximately 30° rotation of the nucleophile during glycosylation.

6.1.16. GH7 Catalytic Insights from Molecular Simulation. Connecting the many geometrical changes between the static configurations captured in crystal structures with the dynamical variables that drive chemical reactions necessitates the use of computational modeling and simulation. Moreover, modeling is essential for the computation of individual free energy barriers and rates of fundamental process steps.111,159,467,468 This makes computational studies crucial to the development of enzymatic structure−function relationships.111 To that end, many computational studies have

Figure 33. Hydrolytic free energy barriers for TrCel7A. Free energy barriers for the hydrolytic steps of glycosylation (left) and deglycosylation (right)
for TrCel7A acting on a cellulose chain are shown in addition to that for the nucleophilic water movement that is coupled to cellobiose product
movement (middle).441 Rate calculations based on these free energy barriers reveal that glycosylation is rate-limiting within the hydrolytic steps. M
denotes the Michaelis complex.

examined the chemical steps of GH7 enzymes, both of EG TrCel7B (ref 469) and CBH TrCel7A, as reviewed below.441,470−472
Zhang et al. used hybrid QM/MM (quantum mechanics/molecular mechanics) calculations to compute the two-dimensional potential energy surface as a function of the key breaking and forming bonds for both hydrolytic steps of p-nitrophenyl-β-D-lactoside (pNPL) hydrolysis by EG TrCel7B.469 They found that the barrier to glycosylation (18.9 kcal/mol) is higher than the barrier to deglycosylation (10.5 kcal/mol). Site-directed mutagenesis experiments were also performed, with the goal of attributing functional roles for several near-active site residues that are well-conserved in GH7 enzymes. R108K, Y146F, Y170F, and D172N mutants revealed catalytic activities that were decreased between 130- and 7700-fold. Subsequent MD simulations of these mutants revealed a disrupted hydrogen bond network that resulted in more distant interactions with the catalytic residues, thus hindering either the nucleophilic attack or the proton transfer. This likely increases the catalytic free energy barriers, explaining the observed reduction in activity and underscoring the catalytic importance of the environment near the active site beyond simply the “catalytic residues”.
Li et al. utilized QM and QM/MM to perform single-point calculations for both steps of the hydrolytic mechanism of TrCel7A.471 Their QM calculations indicate that the free energy barrier for step 2, at more than 30 kcal/mol, is more than twice as high as that for step 1 (a glycosylation barrier of around 14 kcal/mol). Yan et al. performed QM/MM umbrella sampling on wild-type TrCel7A as well as E212Q and D214N mutants.472 They studied only step 1 of the hydrolytic mechanism, finding a free energy barrier of more than 30 kcal/mol in all three cases. In addition to being difficult to reconcile with one another, these free energy barriers are difficult to reconcile with experimental hydrolysis rates, which would suggest a significantly lower hydrolytic barrier.
Barnett et al. also utilized QM/MM umbrella sampling simulations for step 1 with wild-type TrCel7A,470 utilizing two coordinates (one breaking and one forming bond) along which to sample free energy. Their prediction of the step 1 free energy barrier was 17.5 kcal/mol, from which transition state theory predicts a reaction rate of 0.4 s−1. This study also presented details on the ring puckering itinerary for step 1 and the importance of the oxocarbenium character of the transition state, as described in section 3 for GH catalysis.
Knott et al. calculated free energy barriers for both steps 1 and 2 for TrCel7A.441 Path sampling simulations were utilized to elucidate the reaction mechanism prior to computing free energies.473−475 This is in contrast to previous computational investigations in which free energies were computed along assumed and unverified coordinates (e.g., bond lengths). Path sampling revealed that the glycosylation reaction coordinate (Figure 33, “RC1”) contains components of the forming and breaking bonds as well as a rotation in the nucleophile. This finding was corroborated by the crystallographic snapshots presented in the same study of the Michaelis complex (PDB code 4C4C) and glycosyl-enzyme intermediate (PDB code 4C4D). Comparing the conformation of nucleophile Glu212 in the two structures (Figure 32) reveals an approximately 30° twist, consistent with the rotation seen in the simulations. In between the two hydrolytic steps, the cellobiose product cleaved during glycosylation shifts slightly toward the tunnel exit in order to allow the nucleophilic water access to the anomeric carbon reaction center. The free energy barrier for this transition was computed along a distance coordinate describing the proximity of this water to the active site (Figure 33). An additional noteworthy aspect of this work was the finding that deglycosylation proceeds via a product-assisted mechanism: the reaction coordinate (Figure 33, “RC2”) involves forming and breaking bonds as well as the orientation of a hydroxyl from the cellobiose product, which positions the catalytic water for nucleophilic attack on the anomeric carbon. Subsequent free energy and dynamic calculations facilitated TST rate calculations of 10.8 and 5300 s−1 for steps 1 and 2, respectively, thus confirming step 1 as rate-limiting in the hydrolytic cycle. This rate agrees well with the rate of processive cellulose hydrolysis by TrCel7A on crystalline cellulose measured by Igarashi et al. via high-speed AFM (7.1 ± 3.9 s−1).476
In addition to these studies that explicitly studied bond-breaking and bond-forming in GH7 enzymes, several other computational studies have helped to shed light on other factors influencing enzymatic catalysis. These factors include substrate ring distortion,477 protonation of catalytic residues,478,479 mutations,472,480 and protein interfacial allostery.481,482 In addition, the glycosynthetic ability of the nucleophilic mutant (HiCel7B E197S mutant) was examined via QM/MM metadynamics simulations.480
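The barriers and rates quoted in this section can be placed on a common scale with the Eyring form of transition state theory, k = (kB·T/h)·exp(−ΔG‡/RT). The following is a minimal sketch and not the procedure used in any of the cited studies: it assumes a transmission coefficient of 1 and a temperature of 298.15 K (neither is stated above), and the function names are illustrative.

```python
import math

KB = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J*s
R = 1.98720425864e-3   # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal, temp_k=298.15):
    """Rate constant (1/s) for a free energy barrier in kcal/mol.

    Assumes a transmission coefficient of 1 (an assumption of this sketch).
    """
    return (KB * temp_k / H) * math.exp(-barrier_kcal / (R * temp_k))


def barrier_from_rate(rate_per_s, temp_k=298.15):
    """Free energy barrier (kcal/mol) implied by a rate constant in 1/s."""
    return R * temp_k * math.log((KB * temp_k / H) / rate_per_s)


def barrier_shift_from_fold(fold, temp_k=298.15):
    """Barrier increase (kcal/mol) equivalent to a fold decrease in rate."""
    return R * temp_k * math.log(fold)


if __name__ == "__main__":
    # Barriers quoted in the text (kcal/mol)
    for barrier in (14.0, 17.5, 18.9, 30.0):
        print(f"{barrier:5.1f} kcal/mol -> {eyring_rate(barrier):.3g} 1/s")
    # Rates quoted in the text (1/s)
    for rate in (7.1, 10.8, 5300.0):
        print(f"{rate:7.1f} 1/s -> {barrier_from_rate(rate):.1f} kcal/mol")
    # Activity losses quoted for the TrCel7B mutants
    for fold in (130, 7700):
        print(f"{fold}-fold -> +{barrier_shift_from_fold(fold):.1f} kcal/mol")
```

Under these assumptions, a 17.5 kcal/mol barrier corresponds to a rate on the order of 1 s−1, the same order of magnitude as the 0.4 s−1 quoted above, whereas a 30 kcal/mol barrier corresponds to a rate well below 10−8 s−1. A measured rate of 7.1 s−1 corresponds to a barrier of roughly 16 kcal/mol, which illustrates why barriers above 30 kcal/mol are difficult to reconcile with experiment. By the same relation, a 130- to 7700-fold loss in activity is equivalent to a barrier increase of roughly 3 to 5 kcal/mol.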

Figure 34. Complete processive cycle of a GH7 CBH. TrCel7A is shown with its CD, linker, and CBM in gray “cartoon” representation. N-
glycosylation and O-glycosylation are shown in blue and yellow, respectively. The cellulose surface is shown in green and the cellobiose product in
magenta. Following the adsorption of the CBM and CD to the substrate and initial chain threading, TrCel7A processively cleaves cellobiose from a
cellulose chain end. The “Processive Cycle” includes chain processivity, hydrolysis, and product expulsion (Figure 35). This processive cycle occurs
repeatedly until the enzyme desorbs from the cellulose surface.

Figure 35. Hypothesized hydrolytic processive cycle of a GH7 CBH inferred from structural data. The inner processive cycle of a GH7 CBH consists
of the following seven steps: after adsorption, decrystallization, and initial chain threading, the substrate fills the −7 to −1 sites in pre-slide mode
(upper left); (1) processive motion of CBH by one cellobiose unit fills the product sites, with all glucosyl residues in the stable chair conformation
(slide mode, upper middle); (2) catalytic activation rotates the chain ∼90° and distorts the −1 glucosyl residue into half-chair or envelope
conformation, forming the Michaelis complex (upper right) and allowing the nucleophile access to the anomeric carbon reaction center; (3) the first
chemical step, glycosylation, cleaves cellobiose from the reducing end of the cellulose chain and forms a covalent bond between the nucleophile and
the broken chain (unprimed glycosyl-enzyme intermediate, lower right); (4) a shift in the product produces the primed glycosyl-enzyme intermediate
(lower middle) such that the deglycosylation nucleophilic water has access to the anomeric carbon reaction center; (5) the second chemical step,
deglycosylation, produces the product complex, wherein the glycosyl-enzyme intermediate is broken and the catalytic residues are regenerated (lower
left); (6) product expulsion vacates the +1/+2 sites (upper left). The processive cycle ends when the enzyme dissociates from the cellulose chain. It
should also be noted that it is possible for a CBH to perform endo-type initiation,304,307 in which case the enzyme would initiate this cycle in slide
mode (or similar, possibly with a longer chain on the product side of the active site) and the remainder of the processive cycle would proceed as
depicted. In all panels, the nucleophile (Glu212 in TrCel7A) is on top, and the catalytic acid/base (Glu217 in TrCel7A) is on bottom in green “sticks”.
For clarity, only selected hydrogen atoms are shown: the hydrogen bonded to the −1 anomeric carbon shows the stereochemistry and that of Glu217
shows its protonation state throughout the processive cycle.
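
Because the cycle in Figures 34 and 35 repeats until the enzyme dissociates, a processive run can be summarized by a count of hydrolytic events. Section 6.2 defines apparent processivity as the number of hydrolytic events per initiated processive run; the sketch below computes that ratio from hypothetical per-run counts. The second function is an assumption introduced only for illustration and is not taken from this text: it treats each cycle as a race between hydrolysis (rate k_cat) and dissociation (rate k_off), for which the expected number of cuts before dissociation is k_cat/k_off. All names and numbers are made up.

```python
def apparent_processivity(events_per_run):
    """Mean number of hydrolytic events per initiated processive run."""
    if not events_per_run:
        raise ValueError("need at least one initiated run")
    return sum(events_per_run) / len(events_per_run)


def expected_cuts_two_rate(k_cat, k_off):
    """Expected cuts before dissociation in a simple two-rate model.

    Each cycle ends in hydrolysis with probability k_cat / (k_cat + k_off),
    so the mean number of cuts before the first dissociation is k_cat / k_off.
    This model is an illustrative assumption, not a result from the text.
    """
    if k_cat < 0 or k_off <= 0:
        raise ValueError("k_cat must be >= 0 and k_off must be > 0")
    return k_cat / k_off


if __name__ == "__main__":
    runs = [12, 35, 7, 21, 25]  # hypothetical cuts observed in five runs
    print("apparent processivity:", apparent_processivity(runs))
    print("two-rate estimate:", expected_cuts_two_rate(k_cat=8.0, k_off=0.5))
```

The two numbers need not agree for a real enzyme: as discussed in section 6.2, measured values depend on the substrate and on assumptions about the initiation mode.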

6.2. Processivity, Kinetic Modeling, and Visualization
The wealth of structural and biochemical data on GH7 enzymes has revealed a common overall fold as well as a common catalytic machinery. Clues have also been provided by structures as to differences in functionality (e.g., processivity) between CBHs and EGs as well as within these classes. This section discusses the discrete steps in the processive cycle of GH7 enzymes, with a particular focus on CBHs. Visualization studies of GH7 enzymes on lignocellulosic biomass are then reviewed, as they predominantly provide qualitative information about cellulase action. Kinetic modeling and novel high-speed AFM measurements make this knowledge quantitative and have the capability to provide rates for the individual steps of the processive cycle. We then discuss structural and molecular modeling studies, which allow for identification of the molecular underpinnings of cellulase action including the individual steps of the processive cycle. Finally, the reversible blockage of this processive cycle via product inhibition is discussed and reviewed.
GH7 CBHs act from the reducing end of cellulose chains445 and perform many hydrolytic events before disassociating from a cellulose chain. This overall processive cycle includes at least the following steps (Figure 34): adsorption to the crystalline cellulose surface (possibly preceded by the adsorption of an attached CBM), cellulose chain decrystallization, chain threading through the binding tunnel or direct binding (i.e., endo-initiation304), hydrolysis, product expulsion, and desorption. The “repeating processive” cycle that generally repeats many times before chain dissociation includes processive motion, catalytic activation, hydrolysis, and product expulsion (Figure 35). The rate-limiting step in the processive cycle of a cellulase is an oft-discussed topic in the literature, due to its importance as the primary target for protein engineering efforts.
For an EG, the processive cycle shown in Figure 35 is modified in that the chain threading and product expulsion are omitted. Chain acquisition without threading is thought to be readily accomplished by EGs (as opposed to CBHs) due to their open binding site clefts that lack the enclosing loops found in CBHs. GH7 CBHs are also capable of performing hydrolysis in an endo-fashion.304,307 It has been speculated that, with the ability for CBHs to perform endo-initiation, their enclosing loops may open, allowing entry of the cellulose chain into the active site without chain threading.304,307 Whether or not a CBH performing endo-initiation subsequently disassociates immediately or embarks on a processive run is still an open question.
Quantifying the processive ability of an enzyme is useful both in generally describing the mechanistic behavior and in identifying opportunities for activity enhancements. Unfortunately, measuring processivity is not straightforward, and measurements performed via different techniques are not readily comparable.521 Formally, processive ability is defined as the number of hydrolytic events performed per number of initiated processive runs. This definition of processive ability is frequently referred to in the literature as “apparent processivity”. Kurašin and Väljamäe describe a complementary measurement, “intrinsic processivity”, that describes the theoretical processive potential of a GH, in the limit of ideal polymeric substrate turnover.307 A handful of techniques have been developed to assess apparent and intrinsic processivity, some of which are described below. These methods generally capitalize on the relatively consistent nature of the processive GH product profile.307,522−531 However, characterization of processive ability by measuring produced soluble products requires some assumptions be made regarding initial binding mode (i.e., endo vs exo, where some processive enzymes are known to exhibit both initiation modes) and can frequently lead to misinterpretation or overestimation of processivity.461,532 Furthermore, these methods are also extremely sensitive to the substrates selected, where the number of available free chain ends can drastically affect measured product profiles.533 A recent review provides an excellent assessment of the pros and cons of each available technique.521 Finally, we caution the reader to approach experimental determinations of processivity with an eye toward the difficulty in quantification and exercise good judgment with respect to interpretations of processive ability and claims of endo- or exo-initiated binding.
The direct in situ visualization of cellulase action on cellulosic substrates has provided valuable clues to their mechanistic action and has recently been reviewed.534,535 These visualization methods offer the ability to resolve single molecules on the cellulose surface, allow for observing single cellulases interacting with the cellulose surface, and provide temporal tracking of the effects of cellulases on the cellulose surface (e.g., the formation of fissures). TEM was the first method applied to visualizing the structural dynamics of enzymatic cellulose degradation directly on the cellulose surface both for the complete T. reesei system536 and for TrCel7A in isolation.537
White and Brown visually presented the EG/CBH synergistic action of T. reesei enzymes.536 They found that CBH or EG in isolation could not produce cellulose microfibril degradation, though EG could produce some splaying of chains (notably, results were only reported for 1 h of incubation). When acting in concert, however, they could completely dissolve microfibrils within 30 min of incubation.
Particularly notable among these early investigations were those of Chanzy and co-workers on GH7 CBHs.537,538 These studies provided direct visual information regarding processive CBH action and have had a long-lasting impact on the field. Early work showed the binding of TrCel7A to crystalline cellulose and its subsequent degradation. Significantly, complete degradation of crystalline cellulose was shown by TrCel7A in isolation (within 48 h), without the need for any EG present, producing cellobiose as the major product (determined by HPLC).537 A follow-up study visualized gold nanoparticle-labeled TrCel7A molecules via TEM. TrCel7A retains 60% of the hydrolytic activity even with the bulky gold label (5 nm in diameter, the same order of magnitude as TrCel7A). It was found that TrCel7A binds preferentially to the hydrophobic (100) face of cellulose microfibrils.538
Several other TEM studies focused on GH7 enzymes,536−544 including those utilizing gold labeling538,539 and labeling with gold-coupled monoclonal antibodies (immuno-EM).541,544 Immuno-EM was utilized to determine that TrCel7A and TrCel7B preferentially bind to the crystalline and amorphous regions of substrate, respectively.544
One limitation of TEM is that it cannot visualize hydrated cellulases; thus, it cannot be used in the native cellulose/cellulase environment. Conventional atomic force microscopy (AFM), however, can be applied in liquid environments at atmospheric temperature and pressure, without sample modification (labeling, coating, etc.). Thus, AFM enabled the first visualization of cellulase action in a biologically relevant environment.545 Subsequent AFM studies revealed further details of enzymatic action.367,545−550 For example, AFM visualization of TrCel7A action on crystalline cellulose suggested, under environmental conditions, that degradation was

primarily on a single face.367 When combined with the previous
finding from TEM that the family 1 CBM of TrCel7A
preferentially binds to the hydrophobic face of crystalline
cellulose (as described in section 5),362 this may indicate
TrCel7A localization primarily to the hydrophobic face of
crystalline cellulose. This confirmed in an aqueous environment
and with the native enzyme what Chanzy et al. had previously
found via TEM of gold-labeled TrCel7A.538 AFM also showed
the formation of path-like indentations on the substrate when
incubated with TrCel7A, considerably different than the effect of
EG,547 and that the processive motion of TrCel7A is impeded by
amorphous regions of the substrate.546 In addition, AFM has
been utilized to study synergistic effects between TrCel7A with
TrCel6A and EGs546,551,552 and in HiCel7A with Cel6A540 as
well as with EG present.553 Real-time AFM has also examined
EG action by itself (TrCel7B).549 Fluorescently labeling
TrCel7A coupled with confocal microscopy554 or total internal
reflection fluorescence microscopy555 allows for tracking the
CBH’s motion along cellulose fibrils. The latter of these studies
found that upward of 90% of TrCel7A molecules were stationary
on the cellulose surface during the observation interval.555
These studies highlight the utility of visualization studies in
enhancing the understanding of cellulase−cellulose dynamical
interactions as well as cellulase−cellulase interactions in some
cases.

The general insights provided by cellulase visualization have
been solidified, refined, and quantified by kinetic modeling
studies and novel HS-AFM measurements. These studies
provide the capability of determining rate constants for the
individual steps in the processive cycle (Figures 34 and 35) and
how these constants depend on the nature and concentration of
both substrate and enzyme. These models are often directly
linked to hydrolysis experiments providing feedback between
kinetic experiments and theories that seek to rationalize the
results. The goals of these kinetic models are often focused on
identifying the rate-limiting step(s) in the processive cycle of a
CBH or an enzyme cocktail. Cellulase synergy and the origin of
the “burst phase” often seen in the first few minutes of
hydrolysis experiments are two other primary areas of focus of
these studies. With some notable exceptions, we largely focus
our attention here on those studies that have appeared from
2009 to the present, as both Zhang and Lynd33 and
subsequently Bansal et al.556 have nicely reviewed the literature
on this topic for publications appearing before 2009.

Ståhlberg et al.348 examined the adsorption and hydrolysis of
intact TrCel7A, isolated CBM, and isolated catalytic core on
MCC. They found that the CBM increased both the adsorption
and hydrolytic activity. To explain these results, they proposed a
model for the action of two-domain cellulases: the core is biased
toward amorphous regions or chain ends whereas the CBM is
biased toward the crystalline regions. Attachment of the CBM to
the core ensures that the core will have an elevated
concentration on the crystalline regions, thus increasing the
probability of cleavage. The importance of considering at least
two different types of surface morphologies was thus
emphasized.

Beginning around 2009, multiple studies were presented that
revolutionized our quantitative understanding of CBH
action.307,315,476,531,557−562 Taken together, they constitute a
significant step forward in our collective understanding of CBH
function, beginning with the seminal study from Igarashi et al. in
which high-speed AFM was utilized to spatially track individual
TrCel7A molecules on crystalline cellulose with a temporal
resolution of 1−4 frames/s.558 Isolated CDs (sans CBM) moved
with comparable velocity to intact enzymes with CBM at around
3.5 nm/s. However, E212Q (catalytic nucleophile) and W40A
(binding tunnel entrance) mutants do not move at all. Thus, it
was concluded that hydrolysis and chain loading are both critical
for movement. The E212Q mutant was immobilized on the
surface longer than the W40A mutant, suggesting W40 is critical
for initial chain threading. On the basis of the observations
regarding the immobile E212Q mutant and the comparable
velocities of intact enzyme and isolated CD, they concluded that
movement is inherently coupled to catalytic activity. While it is
possible that the cleavage reaction actually induces the forward
processive motion, these observations at least imply that the
catalytic event is a prerequisite for motion to continue.
Additionally, the role of the CBM was determined to be solely
to enhance the concentration of enzyme molecules on the
substrate, without a further substrate-modifying role.

In 2010, Jalak and Väljamäe sought to identify the cause of the
rapid rate reduction characteristic of enzymatic hydrolysis531 by
quantifying the concentration of CBHs with a cellulose chain
productively bound to cellulose (meaning that the cellulose
chain was bound in the active site). Experimentally, this was
accomplished by measuring the degree of inhibition of TrCel7A
and PcCel7D for a small MW reporter molecule, in this case
pNPL. The rate of pNP production (product of pNPL
hydrolysis) is readily related to the fraction of CBH with free
active site (and thus the productively bound fraction). The
observed catalytic constant can be calculated as the ratio of
cellobiose formation rate (cellulose hydrolysis product) to the
concentration of CBHs with occupied active site. As candidates
for the source of the rapid rate retardation, previously proposed
hypotheses such as cellobiose product inhibition, depletion of
more readily hydrolyzable substrate, depletion of available chain
ends, inactivation through irreversible surface binding, and
overcrowding of bound CBHs could not account for their data.
However, if “steric obstacles” that limit the processive action of
CBHs were considered, the results could be rationalized. In this
way, the enzyme dissociation rate from the surface (koff) is
implicated as the slow step of the overall CBH processive cycle
with only CBHs present. Following the initial “burst” phase, the
rate of hydrolysis is governed by koff and the average obstacle-free
path of a CBH on a cellulose chain. This explanation also
accounts for the differences in hydrolysis rates wherein CBHs
with more open binding tunnels (e.g., PcCel7D vs TrCel7A)
more readily dissociate from cellulose chains, resulting in lower
processivity, but higher cellobiose production. Thus, there are
“costs and benefits” to high processivity.563,564 The concept of
steric obstacles that limit CBH catalytic production has been
quite influential in subsequent studies performed by a number of
groups.

The concept of a CBH processing along a cellulose chain
according to its “true” catalytic constant before becoming
“stuck” at an obstacle necessitates (at least) two major CBH
populations. In 2011, Igarashi et al.476 presented high speed
AFM data of TrCel7A as well as TrCel6A that provided a
powerful visual verification of this “dual population” model. In
this study, TrCel7A was observed to alternate between stopping
and sliding. The velocity distribution of CBHs could be very
well accounted for by assuming two populations, one with near
zero velocity and the other with nonzero velocity (7.1 ± 3.9 nm/s).
CBHs were shown to stop, and then accumulate behind a
surface obstacle (described as a “traffic jam”). A new dimension
was added to this dynamic picture when a group of CBHs was

Figure 36. Cellulase architecture correlates with function. Shortened or absent substrate-enclosing loops result in a more open architecture which
gives rise to functional differences. The rate constants and processivity estimates are from Kurašin and Väljamäe.307 (A) CBH TrCel7A has the most
closed substrate-binding tunnel (PDB code 1CEL172), which gives rise to the highest Pintr and lowest koff of the enzymes considered in the study. The
ligand shown is from the TrCel7A Michaelis complex (PDB code 4C4C441). (B) CBH PcCel7D (PDB code 1GPI460) has a more open substrate-
binding tunnel than TrCel7A due mostly to the shortening of the A1 and B3 loops (Figure 29). The ligand shown is from the TrCel7A Michaelis
complex. (C) EG TrCel5A (PDB code 3QR3566), shown in complex with the ligand from Bacillus agaradhaerens Cel5A (PDB code 4A3H206). (D) EG
TrCel12A (PDB code 1H8V567) shown with the ligand from Humicola grisea Cel12A (PDB code 1UU6208). In all panels, the enzymes are shown to
scale and oriented with the acid/base on top and the nucleophile on the bottom (all four utilize a retaining mechanism). The substrate for each of the
listed measurements is amorphous cellulose, with the exception of PendoBC,307 which was performed on BMCC.
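
As a rough illustration of how the intrinsic processivity follows from the two rate constants, the sketch below evaluates Pintr = kcat/koff. The function name and the choice of inputs are assumptions made here for illustration: the numbers are the pre-steady-state TrCel7A estimates quoted elsewhere in this section (kcat of about 4 s−1 and koff of about 0.022 s−1), not the values tabulated in Figure 36, and dissociation is treated as a simple first-order process.

```python
def intrinsic_processivity(kcat: float, koff: float) -> float:
    """Return Pintr = kcat / koff (dimensionless).

    kcat: rate constant of the processive catalytic cycle, in 1/s.
    koff: dissociation rate constant, in 1/s.
    """
    if kcat < 0 or koff <= 0:
        raise ValueError("kcat must be non-negative and koff must be positive")
    return kcat / koff


# Illustrative inputs (assumption): TrCel7A estimates quoted in the text.
KCAT_PER_S = 4.0
KOFF_PER_S = 0.022

p_intr = intrinsic_processivity(KCAT_PER_S, KOFF_PER_S)

# Mean bound lifetime if dissociation is first order (assumption).
mean_residence_s = 1.0 / KOFF_PER_S

print(f"Pintr ~ {p_intr:.0f} catalytic events per binding event")
print(f"mean residence time ~ {mean_residence_s:.0f} s")
```

With these inputs the ratio is on the order of 10^2, which matches the statement that kcat is about 2 orders of magnitude faster than koff. The apparent processivity on a real substrate would be expected to fall below this ceiling.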

seen to stop, accumulate, and then resume motion without
dissociating. This may correspond to the collective action of
several CBHs removing an obstacle in their path. TrCel6A by
itself was observed to bind, but it did not slide nor did it
appreciably degrade cellulose. The combination of TrCel6A with
TrCel7A degraded cellulose much more efficiently than the sum
of the action of the two, giving a powerful visual representation
of CBH synergy, and it was suggested that this may be due to
TrCel6A making endo-like cuts in the crystalline cellulose
surface.

Kurašin and Väljamäe built upon this quantification of the
overall processive cycle by isolating its individual steps and
connecting cellulase functional properties to structure.307 The
resulting seminal study provided an unprecedented holistic
understanding of CBH action in isolation. Recognizing both the
importance of cellulase processivity and also the difficulty in its
measurement, Kurašin and Väljamäe sought to develop a robust
method for its quantification. They note that the intrinsic
processivity Pintr (determined by the ratio of the hydrolytic rate
kcat to the dissociation rate koff) can only be realized on a
“perfect” polymer. On real substrates, the inherent heterogeneity
introduces steric obstacles that halt forward processive
motion. Thus, they sought to measure the ratio of the number of
actual catalytic events before dissociation on a real polymer (i.e.,
the apparent processivity Papp). The primary experimental
challenge for the quantification of apparent processivity is
determination of the number of processive run initiations. This
was elegantly addressed by selectively “tagging” only the
reducing ends of cellulose with diaminopyridine (DAP) and
detecting the release of DAP-labeled sugars (after cleavage by
reducing end specific CBHs TrCel7A and PcCel7D) with
sensitive fluorescence detection. This analysis revealed that Papp
for the CBHs increased with increased enzyme loading,
indicating that there was another mode of initiation besides
exo-action, namely endo-type initiations. This led them to
perform similar experiments with reduced BC to measure the
number of endo-initiation events. Endo-initiation capability by
TrCel7A had previously been reported,304 but there was some
uncertainty about these findings.565 TrCel7A was shown to
perform endo-initiation with probability Pendo of 0.41−0.55 on
BC (PcCel7D, Pendo = 0.73−0.82) and 0.83−0.93 on amorphous
cellulose (PcCel7D, Pendo = 0.92). Also, the Papp of TrCel7A was
similar to that of PcCel7D on amorphous cellulose and on BC;
however, Papp for each CBH was nearly 3 times higher on BC,
leading to the conclusion that this parameter is determined by
the substrate properties. The dissociation rate koff was also found
to be dependent on the nature of the substrate. Moreover, the
koff values for the EGs were 2 orders of magnitude above those
of the CBHs. This is in contrast to kcat, which was of the same
order of magnitude for the CBHs as for two EGs (TrCel5A and
TrCel12A) and for the different types of cellulose. Pintr was also
determined by separately estimating kcat and koff. These rate
constants were determined by measuring product formation rate
in the regimes where the processes described by these constants
are rate-limiting (the initial “burst” phase for kcat and the
subsequent linear regime for koff). Pintr was 1−2 orders of
magnitude greater than Papp, confirming that CBHs do not reach
their full processive potential on real substrates, and thus
processivity is substrate limited. The authors thus conclude that
the dissociation of a stalled, nonproductively bound CBH is the
rate-limiting step in the overall processive cycle and the primary
target for the selection of cellulases, though it must be noted that
this conclusion is based on measurements of individual enzymes
in isolation (e.g., a CBH in the absence of any synergism with an
EG, other CBH, or LPMO). The trends in various measured
parameters (koff, Pendo, Pintr) are indicative of a deep connection
between enzyme structure and intrinsic kinetic properties
(Figure 36). The key structural difference between EGs and
CBHs is the shortening or deletion of several loops in EGs that
cover the substrate binding tunnel (Figures 25, 27, and 28).
Also, PcCel7D has a shortening of the important B3 loop (aka
the “exo” loop) as compared to TrCel7A (Figure 29). These
loop differences are the likely structural basis for the increased
Pendo of PcCel7D versus TrCel7A (and even higher Pendo for

Figure 37. CBHs acting in isolation and the two modes of endo/exo synergism. (A) The rate of cellobiose production by CBHs acting in isolation (i.e.,
no EGs present) is limited by dissociation from cellulose when hindered by obstacles (most notably the amorphous regions of cellulose).307 (B) The
traditional picture of endo/exo synergism (right side of panel B) is that EGs create random cuts in the cellulose surface that provide starting points for
CBH processive action. Jalak et al.557 showed that another important role for EGs was to help CBHs “escape” blockage (left side of panel B) by
amorphous regions of cellulose, as shown in panel A. Reprinted with permission from ref 557. Copyright 2012 the American Society for Biochemistry
and Molecular Biology, Inc.
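
The observed catalytic constant discussed in this section is the cellobiose formation rate divided by the concentration of CBH with an occupied active site. The sketch below works through that arithmetic. It assumes, as a simplification, that pNPL turnover is directly proportional to the fraction of enzyme with a free active site; the function names and all numerical inputs are hypothetical and are not taken from the cited studies.

```python
def bound_fraction(v_pnp_with_cellulose: float, v_pnp_without_cellulose: float) -> float:
    """Fraction of CBH whose active site is occupied by a cellulose chain.

    Assumption: the pNP release rate scales linearly with the fraction of
    enzyme that still has a free active site.
    """
    if v_pnp_without_cellulose <= 0:
        raise ValueError("reference pNP rate must be positive")
    free = v_pnp_with_cellulose / v_pnp_without_cellulose
    if not 0.0 <= free <= 1.0:
        raise ValueError("free fraction must lie between 0 and 1")
    return 1.0 - free


def observed_catalytic_constant(v_cellobiose: float, total_enzyme: float,
                                fraction_bound: float) -> float:
    """kobs = cellobiose formation rate / concentration of occupied CBH."""
    occupied = total_enzyme * fraction_bound
    if occupied <= 0:
        raise ValueError("no enzyme with an occupied active site")
    return v_cellobiose / occupied


# Hypothetical inputs: rates in uM/s, enzyme concentration in uM.
fb = bound_fraction(v_pnp_with_cellulose=0.6, v_pnp_without_cellulose=1.0)
k_obs = observed_catalytic_constant(v_cellobiose=0.2, total_enzyme=1.0,
                                    fraction_bound=fb)
print(f"bound fraction = {fb:.2f}, kobs = {k_obs:.2f} 1/s")
```

Normalizing by the occupied enzyme rather than by the total enzyme is what allows kobs to be compared with kcat: if kobs approaches kcat, the productively bound enzymes are turning over at close to their catalytic limit.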

EGs). In addition, PcCel7D more easily dissociates from a
cellulose chain than TrCel7A (reflected in a higher koff, and thus
a lower Pintr). The fact that the koff for the EGs studied is 2 orders
of magnitude higher than for the CBHs also seems to implicate
loop structures as the source of the differences in activity of
these cellulases (Figure 36).

Similarly, Praestgaard et al.562 developed an explicit kinetic
model to describe the “burst” kinetics seen in cellobiose
production by CBHs, such as Cel7A wherein the steady-state is
preceded by a transient burst in activity. They compare their
results to calorimetric measurements performed with TrCel7A.
The key components of the theory are the relative reaction rate
constants for adsorption, processive hydrolysis, and desorption,
as well as blockage by obstacles on the cellulose surface. They
find that the burst in activity persists until the CBHs start
encountering obstacles in significant proportion. Their model is
capable of capturing this burst phase; however, to capture the
double exponential decay in cellobiose production rate seen
experimentally, it was necessary to add random enzyme
inactivation (with no correlation to the stage of the processive
cycle) to their model. This model was further developed and
applied to further mine the experimental wealth of mechanistic
information on fast (i.e., non-rate-limiting) steps of the
processive cycle afforded by the pre-steady-state regime.560
This constituted the first such pre-steady-state investigation of a
cellulase on its natural substrate, insoluble cellulose. For
TrCel7A, it was found that the rate of cellobiose production
reaches a maximum after just 5−8 s, and then declines rapidly.
They calculate 4 glycosidic bonds cleaved per second (kcat),
concluding that dissociation (koff) is rate-limiting (0.022/s).
Complexation (kon) is 3× faster than dissociation (koff), and kcat
(which includes hydrolysis, processive motion, and product
expulsion) is 2 orders of magnitude faster than either kon or koff.

The stage was now set for a paradigm shift in the
understanding of cellulase synergism. Although some studies
had indicated a role for EGs in “surface cleaning”,568,569 the
prevailing model of endo/exo synergism posited that EGs make
internal cellulose chain cuts that produce starting points for
CBH action.570 However, the recent revelation that the rate-limiting
step for processive CBH action in isolation was
dissociation (Figure 37A) and not association,307 suggested that
the synergistic power of EGs might somehow be related to CBH
dissociation. Jalak et al.557 demonstrated that the hydrolytic rate
constant of TrCel7A on BC increased with the addition of EG
TrCel5A under steady-state conditions, unaccountable by the
traditional paradigm. This led to the new hypothesis: the role of
EGs is not only to help CBHs attach to the cellulose surface, but
also to detach from the cellulose surface (Figure 37B), the step
which was now gaining traction as rate-limiting in the CBH
processive cycle (though EGs themselves were prone to
reversible deactivation by surface heterogeneity315). Both
modes of synergy were found to act in concert, but the
traditional paradigm accounts for a significant portion of the
synergistic effect only at high enzyme/substrate ratios.557 At
optimal enzyme/substrate ratios, the new synergistic mechanism
dominates. Under these conditions (optimal enzyme/substrate
loadings and with EG present), the result is that the
rate-limiting step for the conversion of cellulose to glucose is the
CBH processive cycle.557 In other words, the combined steps of
processive motion, hydrolysis, and product expulsion (Figure
35) are rate-limiting in the overall processive cycle which
includes association and dissociation (Figure 34). This proposal
was corroborated by the fact that, at these optimal enzyme/substrate
loadings, kobs (calculated from the rate of cellobiose
production normalized by the number of TrCel7A molecules
with occupied active site) approached kcat (the rate constant for
the processive cycle involving hydrolysis, product expulsion, and
processive motion, Table 6). These findings were also
demonstrated to be true on lignocellulose.557 In addition, it
was noted that the apparent processivity calculated for TrCel7A

Figure 38. Free energy landscape of the hydrolytic and processive steps for TrCel7A. The barriers for glycosylation,441 product movement,441
deglycosylation,441 processive motion,459 and catalytic activation459 were computed via advanced molecular simulation techniques. These barriers
represent all of the key steps in the CBH processive cycle (in between adsorption and desorption) with the exception of product expulsion, which has
previously been experimentally ruled out as a rate-limiting factor in these enzymes.557 Because product expulsion is not explicitly considered here, the
overall free energy change for the processive cycle will not be as dramatic as shown. Taken with previous results, these findings led to the conclusion
that the glycosylation reaction is the rate-limiting step in the enzymatic deconstruction of cellulose by TrCel7A.459
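
To give a feel for how strongly the computed barriers separate the steps kinetically, the sketch below converts each barrier quoted in this section into an approximate rate constant with the Eyring equation from transition state theory, k = (kB T/h) exp(−ΔG‡/RT). This conversion is an illustration added here, not the analysis performed in the cited simulation studies. It assumes a temperature of 298.15 K, a transmission coefficient of 1, and that each reported barrier can be treated as an activation free energy for a single elementary step.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J*s
R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Eyring estimate of a rate constant (1/s) from a free energy barrier.

    Assumes a transmission coefficient of 1.
    """
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kcal_per_mol / (R_KCAL * temperature_k))


# Barriers (kcal/mol) as quoted in the text for TrCel7A.
BARRIERS = {
    "glycosylation": 15.5,
    "product movement": 2.0,
    "deglycosylation": 11.6,
    "processive motion": 4.2,
    "catalytic activation": 2.9,
}

rates = {step: eyring_rate(dg) for step, dg in BARRIERS.items()}
slowest_step = min(rates, key=rates.get)

# Print the steps from slowest to fastest.
for step, k in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{step:22s} {BARRIERS[step]:5.1f} kcal/mol   k ~ {k:.3g} 1/s")
print("slowest step:", slowest_step)
```

Because the rate depends exponentially on the barrier, the roughly 4 kcal/mol gap between glycosylation and deglycosylation already corresponds to a difference of several hundredfold in rate under these assumptions, while the low-barrier steps are faster by many orders of magnitude.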

is similar to the DP of the BC substrate (DP ∼ the length of
obstacle-free path = nfree); thus, the “obstacles” were identified
with the amorphous regions of the BC (Papp ∼ DP ∼ nfree),
suggesting that CBH is not able to digest in the amorphous
regions.557 When a CBH encounters an amorphous region of
substrate, it stalls and must dissociate before it can continue
hydrolyzing the substrate. Endolytic cuts made in the substrate
provide “escape routes” for the CBH before it reaches the
amorphous regions and facilitate avoidance of the amorphous
regions. An extremely relevant extension of this work would be
to perform similar experiments with other cellulolytic enzymes
including nonreducing end specific CBHs, EGs from other GH
families, and/or LPMOs to understand their molecular-level
synergistic effects.

The rate-limiting step of the processive cycle of a CBH had
now been narrowed to those steps that comprise the “inner
processive cycle” (Figure 35), namely, processive motion,
catalytic activation, glycosylation, product movement, deglycosylation,
and product expulsion.557 These are also the steps that
take place in the observed CBH processive motion in high-speed
AFM studies.476,558,564 Further narrowing which of these steps
limits the overall production of cellobiose would be extremely
difficult to accomplish experimentally, but molecular simulation
has provided insight in this regard. Product expulsion has been
studied via molecular simulation571,572 and has been ruled out as
the rate-limiting step in CBH action.557 The free energy barriers
for glycosylation, product movement, and deglycosylation were
calculated as 15.5, 2.0, and 11.6 kcal/mol, respectively;441 the
reaction rate for glycosylation was found to be much lower than
that for deglycosylation making it the rate-limiting hydrolytic
step (discussed in more detail in section 6.5). Free energy
profiles for the remaining steps of the TrCel7A processive cycle
were elucidated in 2014.459 The barrier for processive motion
(advancing along the cellulose chain by one cellobiose unit) was
calculated via umbrella sampling and found to be higher than
that for catalytic activation (the formation of the Michaelis
complex) at 4.2 and 2.9 kcal/mol, respectively. Both of these
barriers are significantly lower than the 15.5 kcal/mol barrier for
glycosylation (Figure 38).441 The barrier to processive motion
may seem surprisingly small, given the many enzyme−substrate
interactions that must be broken for the cellulose chain to
advance. However, many new and strong interactions are
formed upon chain advancement, and given the flexibility of the
residues lining the binding tunnel, these can start to be formed
before prior interactions are fully broken. Knott et al. pointed in
particular to the strong binding of polar residues at the product
sites that are empty following product expulsion (and
immediately prior to processive motion).459 Taken with
previous computational and experimental results, this computational
evidence suggests that the rate-limiting step in the overall
conversion of cellulose is the glycosylation reaction. As such,
accelerating this step is predicted to constitute a primary
target for the engineering of improved GH7 CBHs. In addition,
a significant stabilization of −8.1 kcal/mol results from the
processive motion. This “driving force” for processive motion
was explained in terms of the particularly strong binding of the
leading glucosyl residue of the cellulose chain, which advances to
the +2 binding site. The primary residues responsible for this
coordination (Asp259 and Arg394) are conserved in GH7
CBHs, but not in GH7 EGs (Figure 28). Further analysis of
these simulations also suggested that the aromatic residues that
line the binding tunnel are primarily for providing the tunnel
shape, guiding the cellulose chain to the active site in the correct
orientation and conformation.

Figure 39. Product binding modes in GH7 enzymes. The various binding poses of the glucosyl residues filling the product sites represented by
TrCel7A crystal structures. These include slide mode (exhibited by PDB code 5CEL),446 cut mode also known as the Michaelis complex (PDB code
4C4C),441 unprimed glycosyl-enzyme intermediate (PDB code 4C4D),441 and primed glycosyl-enzyme intermediate mode (PDB code 3CEL).446 For
reference, the catalytic residues of the unprimed glycosyl-enzyme intermediate (PDB code 4C4D) are shown.

An important theoretical advance in the treatment of
enzymatic adsorption was the utilization of the quasi-steady-state
assumption to give physical significance to the parameters
of the oft-used (and inherently nonprocessive) Michaelis−Menten
(MM) framework.559 The MM framework is often
applied to processive enzymes (e.g., cellulases), because it is able
to fit the experimental data reasonably well. However, the
justification for applying the MM framework to processive
systems had never been firmly established; thus, the fitting
parameters obtained lacked physical meaning. Cruys-Bagger et
al. applied the quasi-steady-state assumption to a set of
processive enzyme reactions, namely, association (kon),
processive hydrolysis (kcat, which includes processivity, hydrolysis,
and product expulsion), and dissociation (koff). Thus, they
connected MM parameters with rate constants for the discrete
processive steps of a CBH. Relatively straightforward data
obtained from standard assay techniques could now be
converted to rate constants for the elementary steps via simple
mathematical expressions. This approach was subsequently
applied to the hydrolysis of three different types of cellulose
(Regenerated AC, Avicel, and BMCC) by two variants of
TrCel7A (intact with linker/CBM and cleaved catalytic core),
finding that the on-rate varies with substrate load but not with
enzyme load.561 Dissociation was found to be rate-limiting
except at very low substrate loads, where association became
slower. The CBM was shown to increase the rate on crystalline
but not amorphous cellulose; in fact, cleaving the CBM actually
increased the rate of cellobiose production on RAC. This may
indicate a role for the CBM in substrate disruption (see section
5) or simply indicate a low affinity of the CBM for more
amorphous regions of substrate. This study also found that
dissociation was unaffected by the presence of the CBM
indicating that the dissociation of a cellulose strand from the CD
is slower than CBM disassociation.

Structural and molecular modeling studies have provided
valuable insight into the molecular basis of the processive cycle
of GH7 CBHs. For example, the binding of various substrates
and products in GH7 CBHs provides snapshots of the discrete
steps in the hydrolytic processive cycle shown in Figure 39.573
Ubhayasekera et al. studied binding in the product subsites for
PcCel7D and TrCel7A, which showed two distinct binding
modes for glucosyl residues in the product sites, referred to as
“cut” and “slide” modes.458 Slide mode was named on the basis
of the proposal that this represents how the product sites are
filled when the cellulose chain first slides across the active site.
Structurally, the TrCel7A structure with two cellotetraose
molecules bound (PDB code 5CEL173) best represents this
stage of the processive cycle. The formation of the Michaelis
complex would then involve a transition (“catalytic activation”)
to cut mode, so-called because it was suggested to represent the
positioning of a glucopyranose immediately prior to enzymatic
cleavage. The recently solved structure of the TrCel7A
Michaelis complex (PDB code 4C4C441) represents this stage.
More recently, these product modes have been refined and
expanded to include the “unprimed glycosyl-enzyme intermediate”
and “primed glycosyl-enzyme intermediate” immediately
after glycosylation and before deglycosylation, respectively.573
The unprimed glycosyl-enzyme intermediate mode
essentially overlays cut mode in the product sites and represents
the product position immediately following the first chemical
step (glycosylation). The TrCel7A glycosyl-enzyme intermediate
structure (PDB code 4C4D441) features a covalently bound
cellohexaose molecule in the −6 to −1 subsites and a cellobiose
product in the +1/+2 sites in unprimed glycosyl-enzyme
intermediate mode. While the product remains in unprimed
glycosyl-enzyme intermediate mode, there is insufficient space
between cellobiose and the anomeric carbon reaction center for
the nucleophilic water molecule to approach. Thus, before the
second step of the catalytic cycle (deglycosylation) can proceed,
the product shifts slightly toward the tunnel exit, into the
“primed glycosyl-enzyme intermediate” mode (Figure 35); this
is the position the cellobiose product occupies during the
deglycosylation (exhibited by product complex, PDB code
3CEL446). The “priming” refers to being primed for the second
catalytic step.

From the first GH7 CBH and EG structures, aromatic
residues that line the substrate-binding tunnel have been
identified as being ubiquitously conserved in GH7 enzymes.172,450
The first structure of TrCel7A revealed the
presence of four tryptophan residues lining the binding
tunnel,172 Trp40, Trp38, Trp367, and Trp376, which stack
with the −7, −4, −2, and +1 glucosyl moieties, respectively.
TrCel7B maintains these interactions with one variation: Trp38
in Cel7A is Tyr38 in Cel7B.450 Later crystal structures of
TrCel7A with long oligosaccharides confirmed that these
residues indeed stack face-to-face with the glycosyl moieties of
the substrate.173

Mutation of the aromatic residues in CBH tunnels is typically
significantly detrimental to enzyme activity. In particular,
aromatic residues at the entrance to the binding tunnel have
been shown to have a dramatic impact on catalytic function.
Koivula published the seminal study on this topic in 1996.574
Therein, mutation of the leading tryptophan residue in the +4
subsite of TrCel6A was demonstrated to abolish activity on
crystalline cellulose, but not on amorphous substrates. This
study suggested that the leading tryptophan residue might act as
a recognition site for acquiring cellulose chains in a crystalline
context. von Ossowski et al. first made brief reference to similar
results being discovered for TrCel7A, but no data were
reported.461 In their initial report on HS-AFM, Igarashi et al.
demonstrated that the TrCel7A W40A mutant, wherein the
leading tryptophan residue in the −7 subsite is converted to
alanine, did not slide on crystalline cellulose.558 This result was
later corroborated in a more in-depth HS-AFM study from
Nakamura et al.575 In 2013, two computational studies, one
included with HS-AFM results from Nakamura and co-workers575
and one from GhattyVenkataKrishna et al.,576 were
reported that shed further light on the role of Trp40 in chain
acquisition. Namely, both studies essentially simultaneously
reported that a cellononaose chain was placed in silico with the
leading glucose residue bound to Trp40; the chain advances by a
cellobiose unit into the CD to bind to the −5 to −7
subsites.575,576 In the aforementioned computational and
experimental study of TrCel7A and TrCel6A on cellulose,
MD simulation of the full length enzymes on the surface of
cellulose demonstrated that, in both enzymes, the entrance
tryptophan residue interacted directly with the first glucose
being decrystallized from the substrate.393

It is also noted that, in some GH7 CBHs, there is an aromatic
residue in the A1 loop that structurally resides on the opposite
side of the conserved tryptophan residue in the −7 subsite. For
example, in a ligand-bound structure of LqCel7B, the tyrosine
residue at this position was shown to bind to the planar face of
the glucosyl moiety bound in the −7 position.383 A computational
examination of the HirCel7A CBH structure, which also
exhibits a tyrosine residue on the A1 loop, suggested similar
binding behavior.449

Beyond the entrance tryptophan residue, no experimental
mutation work has been reported, to our knowledge, for GH7

similar to results found in TrCel6A,578 but they seemed to be
completely deleterious to ligand binding in the EG Cel7B.577 In
related GHs, Horn et al. and Zakariassen et al. published two
seminal studies on GH18 chitinases wherein aromatic mutations
closer to the catalytic center of the enzymes resulted in dramatic
impacts to processivity, but yielded significantly higher activities
on chitosan, a soluble form of chitin.563,579 The authors of those
studies suggested that aromatic residues in GH tunnels likely are
vital for processive action, but that processivity comes at a cost
in enzyme performance. For GH7 cellulases, clearly, further
experimental work will be required to fully elucidate the role of
the aromatic residues in enzyme tunnels and clefts.

Advanced molecular simulation methods have also contributed
to the formation of a molecular basis for CBH
processivity. Payne et al.580 utilized free energy perturbation
with replica-exchange MD to develop a connection between
structure and processivity. The binding free energy, ΔG°b, of
several family 7 CBHs was calculated, and a novel theoretical
connection was made to experimentally observable quantities:

\[
\frac{\Delta G_b^{\circ}}{RT} = -\ln\!\left(\frac{P_{\mathrm{intr}}\, k_{\mathrm{on}}}{k_{\mathrm{cat}}}\right)
\]

In a comparison of the binding free energy of TrCel7A with a
tunnel-spanning cellononaose ligand to that of a celloheptaose
filling only the reactant side of the tunnel, it was calculated that
filling the product sites results in stronger binding by 11.1 kcal/mol.580
Normalized by the number of binding sites, the product
sites bind the ligand more tightly than the substrate sites. This
exceptionally strong binding in the product sites of CBHs is
likely a key factor in driving processive motion. Following
product expulsion, the product sites are empty, and the cellulose
chain must advance forward by one cellobiose unit (Figure 35).
This study revealed the significant stabilization experienced by
the enzyme-cellulose complex by filling the product sites.
Furthermore, tight binding of cellobiose in the product binding
sites is reminiscent of the well-known phenomenon of cellobiose
inhibition in TrCel7A.581,582 The value reported by Payne et al.,
−11.1 kcal/mol,580 is in excellent agreement with a prior study
by Bu et al., −11.2 kcal/mol,571 obtained using an entirely
different computational approach. This prior study highlighted
the likely relationship of favorable cellobiose product binding to
product inhibition. The effects of cellobiose inhibition on
processive cellulolytic action are described further in section 6.3.

Pingali et al.583 employed small angle neutron scattering
(SANS) in order to probe the effect of pH on the structure of
TrCel7A in solution. Their findings indicated that at higher pH
(7.0, 6.0, and 5.3) the enzyme shape is well-defined and
compact, consistent with the crystal structure.172 However, at
pH 4.2 (near the pH for optimal catalytic activity), the CD
adopts a conformation that is intermediate between this
compact form and a fully denatured state, while maintaining
its secondary structure. It was speculated that this may indicate
enhanced conformational flexibility between the secondary
cellulases on the equivalent Trp38, Trp376, and Trp367 from structure elements that would allow increased access for binding
TrCel7A. Computational investigation of both Cel7A and to the cellulose chain, thus explaining the optimum in catalytic
Cel7B has provided some clues as to how these residues might activity.
relate to enzymatic processivity,577 as likely does analogy to Molecular simulation has provided insight into the molecular
work in other GH families. In a computational investigation, level roots of the pH-dependent, higher-level conformational
Taylor et al. found that mutating these aromatic residues to changes observed via SANS by Pingali et al. The typical protocol
alanine resulted in reduced binding affinities in all cases, but they for MD simulation involves fixing the protonation state of the
did so to different extents between TrCel7A, a CBH, and Cel7B, protein’s titratable residues based on the solution pH and the
an EG. The mutational effects were fairly localized in Cel7A, residue’s pKa and local environment. However, changes in local
1356 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Chemical Reviews Review

environment and protein conformational changes could result in protonation/deprotonation processes that will be missed in this scheme. Bu et al.479 utilized constant pH MD (CPHMD) in order to directly couple the protonation state of the titratable residues of TrCel7A and TrCel6A to the solution pH conditions. At a pH of 5.0, near the pH of optimal activity, the boat (TrCel7A) or skew boat (TrCel6A) conformations for the −1 sugar are favored over the stable chair. Increased loop flexibility, particularly in loops B2 and B3 (and to a lesser extent A2 and A4), was observed in TrCel7A at pH 5 when compared to pH 7 due to altered hydrogen bonding interactions. Similar increased loop flexibility was seen in TrCel6A. These findings have important implications for catalysis and chain processivity, respectively. In addition, comparing the apo to the substrate-bound enzyme revealed significant differences in the pKa values of the active-site titratable residues.

Maupin and co-workers applied these techniques to the CBH MaCel7B and determined that the active site was highly charge-coupled.478 In particular, Asp214 and His228 are critical for shuffling protons around the active site and maintaining the catalytically active states of Glu212 (deprotonated) and Glu217 (protonated). Incorporation of the CPHMD and replica exchange MD results into a kinetic model demonstrated that charge coupling between Asp214, Glu217, and His228 was essential to reproducing the experimental kinetic−pH profile. Follow-up work using these tools examined the flexibility of tunnel-enclosing loops as a function of solution pH.478 The findings of Bu et al. were affirmed in that increased loop flexibility with varying pH is the likely source for the pH-dependent morphology seen in the neutron scattering studies.

6.3. Product Inhibition

While strong binding at the product sites of CBHs may be a key factor in driving processive motion, it also has an undesired consequence for cellulose hydrolysis: product inhibition. There is great potential for inhibition of enzyme cocktails due to the exceptionally heterogeneous nature of lignocellulosic hydrolysis, which involves a complex substrate as well as multiple components of the enzyme cocktail. However, the most dramatic inhibition of cellulases is due to the reaction products, namely cellobiose and glucose. Product inhibition retards the overall conversion rate of lignocellulose to the end product glucose and is particularly nefarious at the high substrate loadings utilized industrially.584,585 Additionally, product inhibition has been considered as a factor in the rapid rate retardation that occurs at short time and low conversion.586−588 Various reviews have included discussion of product inhibition,32,33,556 with a recent excellent two-part review focused exclusively on the topic.584,589 Both TrCel7A (CBH) and TrCel7B (EG) are inhibited by cellobiose, and this likely originates from their relatively high binding affinities for cellobiose. As GH7 CBHs constitute the majority of industrial cellulase mixtures, product inhibition constitutes an important consideration for achieving high product yields in the enzymatic hydrolysis of cellulose,556,584 which can have significant impact on the efficiency of biomass conversion.590,591 Though other strategies exist for relieving product inhibition in cellulases, including product removal via membrane filtration585,592 and conversion via cellobiose dehydrogenase (CDH),593 the most commonly employed strategy is conversion of cellobiose to glucose via β-glucosidases.594−597

Early work by Halliwell and Griffin598 on the activity of T. koningii component C1 (CBH) showed that cellobiose inhibited the hydrolytic activity on BC. At 79 μM cellobiose, only 40% of the activity was retained; at 158 μM, all activity was lost. The substrate concentration did not significantly affect measured inhibition, and addition of β-glucosidase greatly increased the solubilization of cotton fibers. Bacterial cellulases from C. thermocellum were later shown to be inhibited by cellobiose and, to a lesser extent, glucose.599 Addition of β-glucosidase from Aspergillus niger was later shown to increase the glucose yield by mitigating cellobiose inhibition.594 The authors suggest that a combination of product and substrate inhibition causes the detected retardation of hydrolysis at high substrate loading (greater than 10%).

Lee and Fan studied the kinetics of the T. reesei cellulase system on insoluble cellulose and the effect of extended hydrolysis times.588 They suggested that the reduction in hydrolysis rate may be caused by the inhibitory effect of the formed products but also the transformation of cellulose into a less digestible form of increased crystallinity. They also showed that the cellulase cocktail is more strongly inhibited by cellobiose than glucose. The product inhibition mechanism was suggested to be deactivation of the substrate-adsorbed enzyme and thus uncompetitive inhibition. However, Holtzapple et al. estimated the binding of cellobiose to a cocktail of cellulases from T. reesei and concluded that cellobiose inhibition was noncompetitive.581 The authors discuss the possibility of a regulatory site other than the active site. However, they conclude that the binding of sugar inhibitors is likely at the active site given that the active site is structured to strongly bind polymeric forms of these sugars, the low diffusivity of the insoluble substrate, and the stronger binding of cellobiose versus glucose (because it has more sites of attachment). They also compared the glucose and ethanol inhibition of the cellulases, concluding that glucose inhibition was 1.4× greater than ethanol inhibition, indicating that conversion of glucose to ethanol effectively reduces the inhibition.

Väljamäe et al. detected cellobiose production by monitoring the solution absorbance of a coupled reaction with CDH.600 They concluded that the early rate retardation is not because of product inhibition by measuring initial cellulose hydrolysis rates by TrCel7A both in the presence and in the absence of initial cellobiose and finding that they follow an identical time course.600 Furthermore, adding fresh substrate to an already slowed-down experiment produced a new “burst” phase, indicating that the retardation could not be due to deactivation/inactivation of the enzymes themselves. These conclusions were later affirmed for the bacterial EGs E2 (GH6) and E5 (GH5), where addition of β-glucosidase did not stimulate cellulose hydrolysis when measured on filter paper or PASC.601

Though some early studies measured product inhibition with crystalline substrates,602−605 many focused on small, soluble substrates.461,606−610 Vonhoff et al. measured cellobiose inhibition of TrCel7A on 2-chloro-4-nitrophenol-β-D-lactoside to be Ki = 20 μM,608 a figure which has often been used for comparison in subsequent literature. These findings contributed to strong product inhibition being considered as a source of the cellulose hydrolysis rate retardation seen at low conversion.581,582 Gruno et al. studied product inhibition on one CBH (TrCel7A) and three EGs (TrCel7B, TrCel5A, and TrCel12A) from the T. reesei cellulolytic system via 3H reducing end labeled BC and amorphous cellulose.611 They tracked short time (5−10 s) radioactive product release in the presence of varying concentrations of background cellobiose. With such a
short time interval, one avoids the effects of continuous alterations in cellulose structure due to hydrolysis as well as the complex kinetics that may be relevant at later times, and thus approximates a steady-state situation. Although the precise nature of the inhibition (competitive or mixed) was not determined, the relative strength of inhibition was calculated (assuming a competitive paradigm, though assuming mixed inhibition gives Ki values within the standard error of one another). The value for the apparent competitive inhibition constant found for TrCel7A on 3H-labeled BC was approximately 1.5 mM,611 or 2 orders of magnitude higher than what had been found for the same enzyme on small, soluble substrates.461,607,608 Thus, product inhibition is actually about 100-fold weaker on the more industrially relevant solid, insoluble substrates. Product inhibition for two of the EGs studied (TrCel7B and TrCel5A) was about an order of magnitude weaker than for TrCel7A611 (consistent with what had been found previously on small molecule substrates607,612 and later confirmed on PASC613), and may be due to the more open binding site cleft of EGs. An anomalous trend was found for EG TrCel12A wherein activity actually increased with cellobiose concentration, suggesting this EG is activated by the product; however, this is more likely an experimental artifact due to enhancement of transglycosylation.611 A simple kinetic model was also presented wherein the type of inhibition experienced by the enzyme was found to be dependent upon the relative binding strengths of the nonproductive (i.e., product sites empty) and productive (i.e., binding tunnel-spanning) enzyme−substrate complexes. Competitive inhibition results when the productive complex is bound much more tightly than the nonproductive; conventional mixed-type inhibition results when the opposite is true. This was a landmark study toward determining the relevance of product inhibition for the industrial production of fuels and chemicals via enzymatic hydrolysis of lignocellulosic biomass.

Bommarius et al. presented a study of the hydrolysis of MCC (Avicel) prepared via three different methods of pretreatment.596 Product inhibition mainly affected the hydrolysis rate during “phase I” of hydrolysis, up to about 30% substrate conversion and when the hydrolysis rate is highest. In the later phases when the hydrolysis rate has slowed down, other factors such as “jamming” of enzyme molecules on the substrate were suggested to be rate-determining.

In a study of the synergism between endo- and exo-acting cellulases on 14C-labeled BC, Jalak et al. studied cellobiose inhibition557 under both single-turnover and steady-state conditions and concluded that the inhibition of hydrolysis rate by cellobiose was stronger under steady-state than under single-turnover conditions. The product mainly competed with the cellulose chains for binding to TrCel7A, reducing the number of initiations. Cellobiose also seemed to slow down the processive movement of the CBH on the cellulose chain. This was expected since cellobiose bound in the product site should restrict the possibility of a processive movement and should contribute to a noncompetitive component of the inhibition. However, the calculated values of kinetic constants suggest that the expulsion of cellobiose should not be rate-limiting for the turnover. Additionally, the kinetic analysis led to the suggestion that cellobiose also can bind in the substrate sites in direct competition with the cellulosic substrate. Thus, both competitive and noncompetitive components of inhibition should be involved in cellobiose inhibition of TrCel7A, resulting in an overall mixed type of inhibition. The same group later utilized similar methods in thermophilic/thermotolerant fungi (A. thermophilum, T. aurantiacus, and C. thermophilum) to demonstrate that GH7 CBHs were more sensitive to product inhibition than were GH6 CBHs and EGs.614 Also, cellobiose inhibition decreased as temperature increased. The inhibition measured using 14C-labeled BC did not correlate with that measured using low molecular-weight model substrates, suggesting that the latter should not be used to measure product inhibition as a parameter for selecting enzymes for lignocellulose conversion557 and affirming what was concluded by Gruno et al.611 and later by Murphy et al.615 The latter study compared inhibition by cellobiose and glucose for the five major cellulases from T. reesei and confirmed previous findings that TrCel7A was most sensitive to cellobiose.

Structural studies of cellulases (section 6.2) have provided some links to the molecular-level origins of the phenomenon of product inhibition. Becker et al. attempted to engineer TrCel7A with regard to its pH optimum by analogy with HiCel7B, which has a higher and broader pH optimum than either TrCel7A or TrCel7B.328 HiCel7B has an extra histidine residue near the catalytic acid/base that the T. reesei enzymes do not have. Five point mutations were made to TrCel7A, adding this histidine as well as changing nearby residues to accommodate the bulkier histidine residue (formerly an alanine). The primary goal of shifting to a more alkaline pH optimum was achieved as well as an unexpected additional consequence: cellobiose inhibition was greatly relieved in the mutant CBH (Ki increased from 20 μM to 755 μM, both on 2-chloro-4-nitrophenol-β-D-lactoside substrate). In addition, inhibition in the wild-type was competitive, whereas, in the mutant, it was mixed. This is all the more interesting when one considers that the “binding pose” of the cellobiose in the product sites as well as the directly contacting ligands of the enzyme are nearly identical in the wild-type and the mutant. However, the weaker binding was attributed to the loss of two water-mediated interactions between cellobiose and Thr226 and Asp262. von Ossowski et al.461 presented another protein engineering study wherein the exo loop of TrCel7A was mutated in various ways to connect its structure with catalytic function. One mutant in particular had eight residues deleted from the exo loop, which significantly relieved inhibition by cellobiose (as well as changing its type from competitive to mixed). Ki was measured to be 24 μM for the wild-type TrCel7A on pNPL and 300 μM for the exo loop deletion mutant on 2-chloro-4-nitrophenol-β-D-lactoside. This study also considered PcCel7D (Ki = 180 μM on 2-chloro-4-nitrophenol-β-D-lactoside), which has an intermediate length exo loop owing to the natural deletion of six residues. The authors pinpoint structural roots for the relative inhibitory strengths of these three enzymes. The deletion mutant loses carbohydrate interactions with Tyr247, Thr246, Tyr252, and Arg251 (residue numbering for TrCel7A). However, PcCel7D maintains the Arg251 residue (Arg240 in PcCel7D), which has hydrogen bonding interactions with both of the product site glucosyl residues. Further comparative studies of TrCel7A and PcCel7D examined the binding of several inhibitors in addition to cellobiose.458 It was observed that the loss of hydrogen bonding interactions along with the more open binding tunnel represents the structural basis for weaker binding of cellobiose to PcCel7D, and thus reduced product inhibition.458,461 The more open tunnel provides more access to solvent and more potential disruption of protein−carbohydrate electrostatic interactions.

Textor et al. presented the crystal structure and inhibition experiments on ThCel7A noting that a key difference between
this enzyme and TrCel7A is in the exo loop mobility, as discussed in section 6.2.466 The exo loop itself is quite similar to that of TrCel7A, but its mobility is enhanced due to a missing interaction with Tyr371 (TrCel7A numbering) from the loop on the opposite side of the binding tunnel (loop A3). This increased loop mobility was intimated by increased temperature factors in the crystal structure for this region and confirmed with MD simulations. This difference may be the key to explaining why this enzyme’s inhibition by cellobiose is significantly reduced, with a Ki = 7.2 mM on pNPC that is more than 2 orders of magnitude greater than that of TrCel7A on 2-chloro-4-nitrophenol-β-D-lactoside.

Molecular simulation has served to connect structural roots of product inhibition with thermodynamics. Bu et al. calculated the absolute binding free energy of both glucose and cellobiose in the product sites of TrCel7A. Using two different computational methods, they found the absolute binding free energy of cellobiose to be −14.4 kcal/mol (via steered MD) and −11.2 kcal/mol (via free energy perturbation MD), signifying that cellobiose is 11.2−14.4 kcal/mol more stable in the TrCel7A product sites than in solution.571 In order to pinpoint specific residues that contribute to this strong binding, the product site residues that interacted most strongly with cellobiose in the simulations (Arg251, Asp259, Asp262, Trp376, and Tyr381) were mutated to alanine, each resulting in significantly weakened cellobiose binding. The same group followed up on this work to elucidate how product binding differs between processive and nonprocessive cellulases as well as how the presence of a ligand in the substrate binding sites (−7 to −1) affects the binding strength.572 Absolute binding free energy calculations with GH7 (CBH TrCel7A and EG TrCel7B) and GH6 (CBH TrCel6A and EG H. insolens Cel6B) revealed that cellobiose binding in the product sites is dramatically stronger in CBHs than in EGs, lending a thermodynamic basis to what had been observed experimentally.607,611−613 This result was rationalized in terms of the structural differences between EGs and CBHs: the more open architecture of the binding cleft in EGs leads to increased solvation, thus weakening the protein−carbohydrate interactions in the product sites. Payne et al.580 found via an independent computational method that filling the product sites of TrCel7A results in an 11.1 kcal/mol stabilization580 (in agreement with the results from Bu et al.571). This serves as another confirmation of the exceptionally strong binding that is present in the product sites in CBHs, which is likely a key factor in product inhibition. Theoretical and kinetic modeling studies have also helped to shed light on the reaction kinetics, mechanisms, and enzymatic synergism of product inhibition on enzymatic hydrolysis (refs 581, 582, 588, 591, 600, 602, 605, 609, and 616−619), and this body of literature has recently been reviewed.584,618

Beyond cellobiose product inhibition, cellulase cocktails face other complicating inhibitory factors. For example, many other species besides the cellobiose product exist in the hydrolytic system which also inhibit the GH enzymes, including lignin,620 ethanol,616,617 glucose,581,614 lactose,458,606 various ions,458,598 solvents (including ethanol, butanol, and acetone),581 hemicellulose-derived sugars (including mannose, galactose, and xylose),587,621 xylan and xylooligomers,622 and long (DP 7−16) xylo- and gluco-oligosaccharides produced during pretreatment (which were shown to inhibit T. reesei CBHs 100× more strongly than cellobiose).623 Cellobiose is a stronger inhibitor than each of these with the exception of the last.623 In addition, the common industrial technique for product inhibition relief (addition of β-glucosidases) is complicated by the fact that β-glucosidases themselves are inhibited by glucose585,591,624 and by gluconic acid.625 β-Glucosidase inhibition by gluconic acid is particularly significant because the LPMOs that have become an indispensable part of industrial cellulase cocktails produce gluconic acid in significant amounts.626,627 In addition, β-glucosidases have transglycosylation capabilities628−631 wherein they assemble di- and oligosaccharides rather than deconstruct them, a phenomenon that is exacerbated as glucose concentrations increase.

6.4. Pyroglutamate

Another common feature of GH7 enzymes, and other secreted proteins from, e.g., T. reesei, is an N-terminal modification of a glutamic acid residue to pyroglutamate, also referred to as pyrrolidone carboxylic acid or PCA (Scheme 2 and Figure 40) (refs 172, 309, 323, 329, 436, 438, 567, and 632−634). This modification was known before the first GH7 crystal structures were solved due to the nonstandard signatures on mass spectrometry observed during N-terminal sequencing.436,438,632 The enzyme responsible for this chemical modification is glutaminyl cyclase.635,636 This post-translational modification is thought to be responsible for protection against degradation by exo-acting peptidase enzymes.637,638 Similarly, removal of this post-translational modification by a calf liver pyroglutamate aminopeptidase is well-known, a method that has been commonly used for preparing eukaryotic proteins for N-terminal sequencing since the 1980s.639 GH7 structures to date have been derived from fungal expression hosts, and as mentioned above, pyroglutamate is ubiquitously observed in these structures. However, the role of this post-translational modification was only recently reported in the academic literature.640 Motivated to understand the significant differences in GH7 activity and stability imparted by heterologous expression in S. cerevisiae, Dana et al. expressed T. emersonii (R. emersonii) Cel7A in S. cerevisiae and Neurospora crassa with a linker and CBM taken from A. thermophilum and Agaricus bisporus, respectively.640,641 Improper disulfide bond formation in S. cerevisiae and the extent of hyperglycosylation were both examined, and reported not to be a major factor in the activity differences. Specifically, free thiols were not detected, suggesting that all cysteine residues were paired, and the yeast expression system used exhibits a deletion in the glycosylation pathway resulting in minimization of hyperglycosylation. Further deglycosylation by EndoH and a mannosidase enzyme resulted in an increase in activity by 25%, but did not fully explain the disparity in activity between the fungal and yeast expressed GH7 enzyme, which led the authors to examine pyroglutamate (Scheme 2). Treatment of the S. cerevisiae-expressed Cel7A with glutaminyl cyclase in vitro resulted in essentially identical activity and stability to the enzyme expressed in N. crassa.640

Scheme 2. Pyroglutamate Chemistry: The Chemistry of N-Terminal Glutamine Cyclization
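As a numerical aside to the inhibition constants quoted in section 6.3, a Ki can be placed on a free-energy scale if it is treated as a dissociation constant relative to a 1 M standard state, i.e., ΔG° = RT ln(Ki). The Python sketch below does this for the two TrCel7A values quoted in the text (20 μM on a soluble lactoside substrate and approximately 1.5 mM on 3H-labeled BC) and also writes out the textbook rate laws for competitive and mixed inhibition. The temperature of 298.15 K, the function names, and the use of idealized Michaelis−Menten forms are assumptions made here for illustration; they are not the kinetic models used in the cited studies, which deal with insoluble substrates.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ki_to_free_energy(ki_molar, temperature_k=298.15):
    """Free energy (kcal/mol) for a dissociation-type constant, 1 M standard state.

    Assumption: Ki is interpreted as a simple dissociation constant.
    """
    return R_KCAL * temperature_k * math.log(ki_molar)

def rate_competitive(s, i, vmax, km, ki):
    """Textbook competitive inhibition: the inhibitor raises the apparent Km only."""
    return vmax * s / (km * (1.0 + i / ki) + s)

def rate_mixed(s, i, vmax, km, ki, ki_prime):
    """Textbook mixed inhibition: inhibitor binds free enzyme (ki) and ES complex (ki_prime)."""
    return vmax * s / (km * (1.0 + i / ki) + s * (1.0 + i / ki_prime))

# Ki values for cellobiose inhibition of TrCel7A as quoted in the text.
for label, ki in [("soluble lactoside substrate, Ki = 20 uM", 20e-6),
                  ("3H-labeled BC, Ki ~ 1.5 mM", 1.5e-3)]:
    print(f"{label}: dG = {ki_to_free_energy(ki):.1f} kcal/mol")
```

Under these assumptions the two constants correspond to roughly −6.4 and −3.9 kcal/mol, so the apparent weakening of inhibition on the insoluble substrate amounts to about 2.6 kcal/mol. Setting `ki_prime` to infinity in `rate_mixed` recovers the competitive expression, which is one way to see mixed inhibition as the more general case.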
Figure 40. N-Terminal pyroglutamate exemplified by TeCel7A (PDB code 3PFJ). The ligand from the TrCel7A Michaelis complex is shown in
aquamarine “sticks” (PDB code 4C4C).

6.5. Glycosylation

GH7 glycosylation significantly affects activity and represents a promising target for protein engineering efforts. Glycosylation is a post-translational modification that serves a myriad of biological functions in recognition and signaling.642,643 Secreted carbohydrate-active enzymes are often “decorated” by N-linked and O-linked glycosylation, wherein small molecule carbohydrates are covalently attached to specific sites on the protein. N-glycosylation attaches the carbohydrate (generally highly branched mannose or single N-acetylglucosamine residues (GlcNAc)) to the β-amide group of an asparagine in a N−X−S/T motif (where X ≠ P), and O-glycosylation attaches the carbohydrate (generally one to three mannose residues) to the β-hydroxyl group of a serine or threonine.369 Glycosylation of CBMs and linkers is discussed in sections 5.1 and 5.2, respectively. In what follows, we focus our attention on the glycosylation of the catalytic domain (CD) of GH7 cellulases.

Early work by Maras et al. characterized six dominant forms of N-glycans on TrCel7A produced from the RUT-C30 strain:644 GlcMan8GlcNAc2, GlcMan7GlcNAc2, Man7GlcNAc2, ManPGlcMan7GlcNAc2, GlcMan5GlcNAc2, and Man5GlcNAc2. Klarskov et al. successfully identified three sites for glycosylation in TrCel7A (out of four putative N-glycan attachment sites).456 Liquid chromatography coupled electrospray ionization mass spectrometry (LC-ESMS) results indicated that a single GlcNAc residue was attached at Asn45, Asn270, and Asn384. The relatively small glycan prompted speculation that the glycan had been trimmed by an endoglycosidase. The three sites identified in this early study have been subsequently confirmed as the primary attachment sites for N-glycosylation on the TrCel7A CD (Figure 41).369 For example, Harrison et al. analyzed a particular strain of T. reesei, ALKO2877, and confirmed via ESMS that only three of the four postulated TrCel7A N-glycosylation sites actually possessed attached sugar residues.368 They also demonstrated via monosaccharide analysis that only single GlcNAc residues attached at these sites. O-glycosylation of GH7 CDs has not been reported to date, in contrast to the linker and CBM (see section 5).368,645

Figure 41. TrCel7A CD glycosylation. The CD of TrCel7A is shown in green transparent “surface”, N-glycans are shown in slate and red “sticks”, and the cellulose surface is shown in aquamarine and red “sticks”. N-glycosylation attaches to the TrCel7A CD at 3 locations: Asn45, Asn270, and Asn384. Blue squares denote GlcNAc, and green circles denote mannose. Both images show a representative glycan at each binding site, though variability has been observed experimentally, as discussed in the text.
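The N−X−S/T rule (X ≠ P) described in this section lends itself to a simple sequence scan. The Python sketch below is illustrative only: the function name `find_sequons` and the toy peptides are hypothetical and are not taken from TrCel7A or from any cited study. A matching motif marks only a putative attachment site; as noted above, only three of the four postulated TrCel7A sites were found to carry sugars.

```python
import re

# Putative N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping motifs (e.g., "NNTS") are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")

def find_sequons(sequence):
    """Return (1-based Asn position, motif) pairs for each N-X-S/T match with X != P."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]

# Hypothetical toy peptide, NOT a real cellulase sequence.
toy = "MKNGSAANPTQWNATLL"
for position, motif in find_sequons(toy):
    print(f"Asn{position}: {motif}")

# A single Ala->Ser change two residues after an Asn creates a new sequon
# (hypothetical peptides again).
print(find_sequons("QNGAK"), find_sequons("QNGSK"))
```

In the toy peptide the scan reports the motifs at Asn3 and Asn13 and skips the N−P−T stretch at position 8, which is excluded by the X ≠ P condition. The last line shows, on hypothetical peptides, how a single substitution can introduce a putative attachment site.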
Hui et al. examined TrCel7A from RUT-C30 and two derivative strains via MS and found that single GlcNAc residues were predominantly found at Asn45 and Asn384 for all three strains, but high mannose chains tended to attach at Asn270 with a strain-dependent degree of variability.273 The Asn270-linked glycan was identified as Man8GlcNAc2 in RUT-C30, single GlcNAc in one derivative strain, and a mixture of the two in the other derivative strain. These differences may be due to differences in the presence of glycan-trimming glycosidases, which vary with strain and growth conditions. This may also be the source of the shorter (and more consistent) length of glycans attached to Asn45 and Asn384: perhaps steric hindrance prevents glycosidases from accessing Asn270 glycans to the extent that they do at Asn45 and Asn384. Follow-up work by this group on TrCel7B revealed Asn56 (single GlcNAc residue) and Asn182 (Man8GlcNAc2) as glycan-attachment sites for this EG.418

The dependence of the glycosylation pattern of TrCel7A on growth conditions was probed systematically by Stals et al.419 Four different growth conditions were examined: minimal medium (resulting in a solution pH of 2.5), corn steep liquor-enriched medium (pH 5), CaCO3-supplemented minimal medium (pH 7), and fed-batch cultivation (pH 4). Differences in growth conditions were shown to affect the number of glycosylation sites that were occupied (ranging from zero to three), attached glycan (Man5GlcNAc2 to Man8GlcNAc2), glucosylation (i.e., a glucosyl moiety attached to the terminus of the glycan, e.g., GlcMan8GlcNAc2), phosphorylation (e.g., ManPGlcMan8GlcNAc2), and frequency of single GlcNAc attachment. Minimal medium tended to give a higher degree of glycan occupancy than rich medium. Phosphorylation and terminal glucosylation did not trend discernibly with the growth medium conditions. The observation from Hui et al.273 that Asn270-linked glycans tend to be more resistant to hydrolytic trimming was affirmed.

Stals et al. also studied the effect on glycosylation produced by different strains, all under minimal medium growth conditions.646 Wild-type strain QM6A, RUT-C30 mutant, and four other high-cellulase producing mutants (RL-P37, QM9414, VTT-D-80133, and VTT-D-78085) were compared. RUT-C30 and RL-P37 (both selected using ultraviolet light and 2-deoxyglucose-supplemented media) produced longer glycans (GlcMan7−8GlcNAc2) whereas the wild-type and other mutants produced trimmed glycans (Man5−6GlcNAc2), which may indicate an inefficient glucosidase in the endoplasmic reticulum in the former two strains. Interestingly, these two strains secrete 3−5 times more total cellulase than wild-type QM6A or QM9414,647 though the connection between extended glycans and cellulase production was not directly elucidated.

A complementary study by Adney et al. utilized site-directed mutagenesis to individually manipulate the glycan attachment sites on the Cel7A CD from T. reesei and Penicillium funiculosum.648 Subsequent expression in A. niger and characterization revealed all mutants with glycosylation removed (two for TrCel7A and three for PfCel7A) exhibited improved cellulose conversion. The N384A mutant for TrCel7A, however, had a much more dramatic effect than did the N270A mutant, increasing the activity by 70%. This residue resides on a substrate-binding tunnel loop and is near to the cellulose surface when the enzyme is bound. This was significant in that it linked glycosylation function with the enzyme structure and attached glycans. In addition to three mutants that removed glycosylation, an additional mutant for PfCel7A (A196S, creating an N−X−S motif) introduced a new glycosylation attachment site at Asn194. This increase in glycosylation actually gave the most dramatic improvement (85% increase in activity) for any of the PfCel7A mutants. Thermal stability also tended to be reduced for the mutants versus the wild-type, both for addition and removal of glycosylation.

Gao et al. utilized homologous expression of four glycoforms of P. decumbens Cel7A to probe N-glycosylation at two attachment sites, Asn137 and Asn470.508 They found ManxGlcNAcy residues (where x = 2−5 and y = 1−2) attached at Asn137. The glycoform with the lowest amount of mannose in its glycosylation was shown to have zero detectable activity on pNPC, but it was only this glycoform that synergistically increased the glucose yield of an industrial enzyme cocktail (by a factor of 2). Site-directed mutagenesis of Asn137 and Asn470 to aspartate (thus removing the glycan attachment sites) resulted in higher activities (the double mutant had 65% higher activity than the most active of the four glycoforms).

These studies have demonstrated that it is possible to characterize and manipulate glycosylation to alter cellulase activity. Looking to the future, studies that can elucidate the underlying structural features and interactions that give rise to the observed differences in enzymatic function (e.g., hydrolytic activity and thermal stability) will be particularly valuable for harnessing glycosylation for improved cellulases.

6.6. Protein Engineering

As mentioned multiple times, GH7 enzymes are often the most prevalent members of fungal cellulolytic cocktails and provide a significant amount of hydrolytic potential. Thus, unsurprisingly, they have received significant attention in academic, government, and industrial research laboratories for engineering higher activity, higher thermal stability, and advantageous modification of other properties. However, a major problem in the
Jeoh et al. expressed TrCel7A in A. niger, resulting in 6-fold engineering of GH7 cellulases, especially GH7 CBHs, is the
increased N-glycan attachment to the CD.645 The increased use of a reliable expression host that is able to properly fold and
glycosylation resulted in a concomitant decrease in activity (on glycosylate the enzymes such that the activity and stability are
both bacterial MCC and PASC) and increase in nonproductive comparable to those expressed in filamentous fungi. To date, it
binding (as compared to homologously expressed TrCel7A). seems that yeast such as S. cerevisiae is able to express certain
The original activity could be restored by supplementing with GH7 CBHs such as TeCel7A (albeit still at low levels compared
N-glycosidase that trimmed back the glycans. The decrease in to filamentous fungi) but not others such as TrCel7A.289 The
activity and binding relative to homologous TrCel7A were less sequence relationships that give rise to this nonuniform
dramatic (though not eradicated) when the CBM and linker expression level are unknown but are essential to understand
were proteolytically cleaved. The recombinant CD displayed a in order to develop high-throughput expression systems for
reduced binding affinity compared to homologous TrCel7A, GH7 CBH expression and screening. Nevertheless, significant
indicating that the glycans may sterically hinder its attachment efforts in the open literature either using rational design or
to the substrate surface (and thus account for the reduced semirandom mutations identified through computational
activity). methods have demonstrated that GH7 engineering is feasible
1361 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Chemical Reviews Review

to date, as briefly reviewed here. We note that we do not review
the patent literature in detail here.

Early work in GH7 CBH engineering from Becker et al.328
and Boer and Koivula327 demonstrated that it is possible to
rationally shift the pH optimum of TrCel7A toward a more
alkaline optimum through site-directed mutagenesis. This pH
optimum shift was based on comparison of the TrCel7A (CBH)
active site to the HiCel7B (EG) active site, the latter of which
contains a histidine residue near the acid/base residue (Glu217
in TrCel7A), which corresponds to an alanine residue in
TrCel7A. To accommodate the bulky histidine residue in place
of alanine in the TrCel7A active site, four additional mutations
were conducted, resulting in an enzyme with a basic pKa shift of
approximately +0.7 pKa units, but a lower kcat/KM overall
compared to the wild-type on 3,4-dinitrophenyl-β-D-lactoside.
Boer and Koivula subsequently examined the wild-type and the
same pentamutant TrCel7A to understand the enzyme stability
as a function of pH. They demonstrated that the pentamutant
was less stable overall than the wild-type at both acidic and
alkaline pH, suggesting that mutations to shift pH optima
should go hand in hand with mutations aimed at improving
thermal stability.327

Voutilainen and co-workers from VTT have also published an
extensive series of studies to improve the thermal stability of
GH7 CBHs via several different methods.381,504,505,649 In a study
from 2007, they used a high-throughput robotics system to
generate a library of random mutants of MaCel7B expressed in
S. cerevisiae, and screened activity as a function of temperature
on 4-methylumbelliferyl-β-D-lactoside.504 The authors found
that a single mutation in the hydrophobic core of the enzyme
(S290T) was able to improve the Tm of the enzyme by 1.5 °C
and was able to double the activity of the enzyme on Avicel at 70
°C compared to the wild-type. In 2009, Voutilainen et al.
described a subsequent engineering study using MaCel7B
wherein the S290T mutation was combined with the addition of
a tenth disulfide bridge near the tunnel entrance (positionally
analogous to the tenth disulfide bridge in the TrCel7A CD), to
produce a 4.5 °C increase in Tm.505 The authors also employed
the strategy of expressing MaCel7B with the TrCel7A
CBM-linker domain attached, which resulted in a 2.5 °C increase in
Tm. Comparison of the MaCel7B activity to TrCel7A (both
enzymes were studied as both full-length enzymes with
CBM-linker and as solitary CDs) demonstrated that, at 45 °C,
TrCel7A in both cases was more active, but MaCel7B was
significantly more active at 70 °C, also in both cases. The lower
overall activity of MaCel7B at lower temperature was attributed
to a higher degree of product inhibition. Nevertheless, these two
studies demonstrated the ability to engineer a GH7 CBH for
higher stability and activity through a combination of both
random and rational mutagenesis.504,505

The same group from VTT also examined means to improve
the thermal stability of another thermophilic GH7 CBH, namely
TeCel7A, expressed in S. cerevisiae.649 Therein, they used a
computational prediction tool to predict 5 new sites for
engineering in disulfide bonds, all near the active site tunnel.
Like MaCel7B, TeCel7A natively contains 9 disulfide bonds, and
the addition of 3 independent disulfides increased the Tm in the
range 3.5−5 °C each. The combination of all three successful
engineered disulfide bonds increased the Tm by 9 °C. Two single
disulfide bond mutants and a triple mutant with all three
disulfide bonds added improved the activity on Avicel at 75 °C,
and the optimal triple mutant was able to hydrolyze Avicel at 80
°C with only slightly reduced performance relative to its activity
at 75 °C, thus demonstrating that the addition of disulfide bonds
can improve the thermal stability of GH7 CBHs.

Arnold and co-workers from the California Institute of
Technology have also made substantial contributions to GH7
CBH engineering primarily on the basis of computational
prediction tools such as structure-guided recombination.650−652
In 2010, Heinzelman et al. used five GH7 CBH parent enzymes
to predict and express (in S. cerevisiae) 28 chimeric GH7 CBHs,
on the basis of a “background” structure of the TeCel7A enzyme,
chosen due to its high expression levels in yeast.650 The authors
found multiple combinations of mutations (with an average of
37 mutations relative to the closest parent enzyme) that resulted
in substantial thermal stability improvements. Moreover, activity
on MCC for 6 chimeras was slightly retained at higher
temperatures than the parent enzymes (68, 70 °C). Interestingly,
activity at 37 °C in 6 chimeras was improved, also on
MCC, but the overall conversion time was quite short (90 min)
when the relative activity measurements were made. The same
group reported a follow-up study in 2012 from Komor et al.651
starting with 5 chimeras from the previous work. Therein, they
used a method based on GH7 CBH sequence alignments and
the FoldX force field to predict the effect on ΔGfolding for each
amino acid residue. Using this approach, the authors were able
to demonstrate an improved variant with 8 additional stabilizing
mutations that exhibits a T50 of 72.1 °C, which is a substantial
increase in thermal stability. This mutant also retained a
significant extent of activity on MCC up to 70 °C. Via yet
another method, namely noncontiguous recombination from
the same group, Smith et al. located 6 single amino acid
mutations that are each able to improve the TrCel7A stability by
1−3 °C each.652 Overall, these studies from Arnold et al.
demonstrate that computational protein design tools combined
with large screening studies can identify both single-point
mutations and blocks of enzymes that can directly contribute to
improved stability.

High-throughput, random mutagenesis has also proven to be
effective, given a reliable secretion system. Dana et al. reported
the development of an S. cerevisiae strain capable of producing
GH7 CBHs with limited glycosylation at consistent titers.641
Using this system, they developed a library of GH7 CBHs with
biased clique shuffling starting with 11 parent Cel7A genes, 86%
of which were active. Overall, 51 chimeras were identified with
improved thermal stability, and several were shown to have
activity on Avicel hydrolysis significantly higher than TeCel7A at
60 and 65 °C.

6.7. Conclusions

Our collective understanding of GH7 cellulases to date is quite
considerable, especially given recent developments toward more
quantitative underpinnings of their action on cellulose. In
summary, the following important features of GH7 cellulases
have been elucidated:

(1) GH7 cellulases employ a two-step, retaining mechanism,
and CBHs and EGs form two distinct populations that
differ primarily in their loop structures near the ligand.
(2) GH7 CBHs can engage cellulose chains via exo-initiation
or endo-initiation, i.e., binding to chains by the end or via
internal binding to chains.
(3) The rate-limiting step in GH7 CBH action in the absence
of synergistic enzymes is likely to be substrate dissociation,
either caused by obstacles or amorphous regions of
cellulose.

Figure 42. TrCel6A structure and active site details. (A) The overall structure of the TrCel6A CD from PDB code 1QK2, showing the distorted β/α
barrel, with labeled structural elements.194 (B) The TrCel6A structure rotated to show the active site tunnel enclosed by an N-terminal and C-terminal
loop (pink and blue cartoon, respectively). (C) Four of the TrCel6A binding sites (−2 to +2) revealed by the cocrystallized nonhydrolyzable o-
iodobenzyl-1-thio-β-D-cellobioside ligand (cyan stick). Aromatic residues are shown in magenta stick, and residues that have been hypothesized to
participate in catalysis are shown in green stick. The water molecules likely involved in catalysis are shown as red spheres.

(4) The rate-limiting step in GH7 CBH action in the
presence of synergistic enzymes is likely
the processive velocity (i.e., the combined steps of
hydrolysis, product expulsion, and processive motion).
Synergistic enzymes, especially those that provide
endolytic action, likely provide points of detachment for
GH7 CBHs, thus removing the rate limitation of substrate
dissociation.
(5) Computational studies of the entire GH7 CBH processive
cycle suggest that the glycosylation step in hydrolysis is
the rate-limiting step between hydrolysis, product
expulsion, and processive motion.
(6) The CBM-linker domains on GH7 cellulases seemingly
do not significantly contribute to the enzyme performance
once the enzyme is productively bound to the
substrate, but likely play a major role in enzyme targeting,
and thus impact the overall enzyme performance.

Although GH7s represent one of the most well studied (if not
the most well studied) classes of cellulolytic enzymes to date,
there are many open questions based on our current
understanding of their action. Given their significant presence
in industrial cocktails combined with their powerful hydrolytic
action, engineering these enzymes is of keen importance for
industrial biomass conversion. The development of accurate
structure−activity relationships for the GH7 cellulases still
remains nascent.

One of the primary challenges in studying these enzymes is
expression in convenient heterologous expression systems,
which severely limits high-throughput production of adequate
amounts of enzyme for biochemical, structural, and enzyme
performance studies. Slowly emerging tools in specialized yeast
strains, engineering of filamentous fungi, and cell-free expression
systems for these enzymes, combined with a deeper understanding
of the effects of post-translational modifications on
activity and stability, will eventually enable an era wherein rapid
screening of GH7 cellulases from natural diversity and
engineered enzymes will be possible. The recent publication
of the effect of pyroglutamate clearly established its importance,
if not yet the mechanism by which it functions, in enzyme
activity and stability.640 Clearly, a more comprehensive
understanding of GH7 glycosylation is now of paramount
importance, an area which has been only slightly reported on to
date in the academic literature given the complexity of detailed
glycan analysis.369

On the basis of quite recent results, it seems that the
glycosylation step in the retaining mechanism is the primary rate
limitation of GH7 CBHs. If this conclusion is correct, which
seems logical given the strength of glycosidic bonds, then the
primary target for GH7 improvement is kcat. Certainly, future
kinetics studies similar to that of Kurašin et al. in the presence of
additional cellulolytic enzymes (e.g., GH6 cellulases, additional
EGs, β-glucosidases, and LPMOs) on a broader range of
substrates will continue to inform the direct target for GH7
improvement. Furthermore, the development of additional,
more utilitarian methods for studying processivity of GH7
cellulases in the presence of additional enzymes will be of
paramount importance for the systematic comparison of GH7

Table 8. Reported GH6 Crystal Structures

source and original name in primary citation | PDB code | resolution (Å) | brief highlights | ref
Fungal CBH Structures
Trichoderma reesei CBHII/Cel6A | 3CBH | 2.00 | first GH6 structure reported; cocrystallized with a nonhydrolyzable ligand (o-iodobenzyl-1-thio-β-D-cellobioside) | 192
 | 1CB2 | 2.00 | Y169F mutant | 574
 | 1QK2 | 2.00 | wild-type with (Glc)2-S-(Glc)2 | 194
 | 1QJW | 1.90 | Y169F mutant with (Glc)2-S-(Glc)2 | 194
 | 1QK0 | 2.10 | wild-type with m-iodobenzyl-β-D-glucopyranosyl-β(1,4)-D-xylopyranoside | 194
 | 1HGW | 2.10 | D175A mutant | 656
 | 1HGY | 2.20 | D221A mutant with β-D-glucose | 656
 | 4AU0 | 1.70 | D221A mutant with 6-chloro-4-methylumbelliferyl-β-cellobioside | 704
 | 4AX6 | 2.30 | D221A mutant with 6-chloro-4-phenylumbelliferyl-β-cellobioside | 704
 | 4AX7 | 1.70 | D221A mutant with 4-methylumbelliferyl-β-cellobioside | 704
Humicola insolens Cel6A | 1BVW | 1.92 | | 659
 | 2BVW | 1.70 | binding of glucose in the −2 subsite and cellotetraose in +1 through +4 subsite confirms the existence of 6 subsites | 662
 | 1GZ1 | 1.90 | D416A mutant with (Glc)2-S-(Glc)2 | 209
 | 1OCB | 1.75 | wild-type complexed with fluoresceinylthioureido-derivatized tetrasaccharide | 665
 | 1OC6 | 1.50 | D405N mutant | 665
 | 1OC5 | 1.70 | D405N complexed with 4II-thio-β-cellotetraoside substrate | 665
 | 1OC7 | 1.11 | D405N complexed with methyl-4,4II,4III,4IV-tetrathio-α-cellopentoside substrate | 665
 | 1OCJ | 1.30 | D416A complexed with methyl-4,4II,4III,4IV-tetrathio-α-cellopentoside substrate | 665
 | 1OCN | 1.31 | D416A complexed with cellobio-derived isofagomine | 666
Coprinopsis cinerea Cel6C | 3A64 | 1.60 | | 670
 | 3ABX | 1.40 | wild-type complexed with p-nitrophenyl-β-D-cellotrioside | 670
 | 3A9B | 1.20 | wild-type complexed with cellobiose | 670
 | 3VOF | 1.60 | D102A complexed with glucose | 671
Coprinopsis cinerea Cel6A | 3VOG | 1.45 | wild-type complexed with HEPES | 671
 | 3VOH | 2.40 | wild-type complexed with cellobiose | 671
 | 3VOI | 2.00 | wild-type complexed with p-nitrophenyl-β-D-cellotrioside and with a Mg2+ ion in the active site | 671
 | 3VOJ | 2.29 | D164A | 671
Chaetomium thermophilum Cel6A | 4A05 | 1.90 | wild-type complexed with cellobiose and cellotetraose with a Li+ ion in the active site | 672
HJPlus chimera Cel6A | 4I5R | 1.50 | chimera of H. insolens, T. reesei, and C. thermophilum | 674
3C6P chimera Cel6A | 4I5U | 1.22 | chimera of H. insolens, T. reesei, and C. thermophilum | 674
Fungal EG Structure
Humicola insolens Cel6B | 1DYS | 1.60 | | 663
Bacterial CBH Structures
Thermobifida fusca E3/Cel6B | 4B4H | 1.50 | wild-type | 683
 | 4B4F | 2.20 | wild-type complexed with cellohexaose (chain A) or cellotetraose (chain B) | 683
 | 4AVO | 1.80 | D274A complexed with cellohexaose | 686
 | 4AVN | 2.00 | D226A/S232A complexed with glucose and cellotetraose | 686
Bacterial EG Structures
Thermobifida fusca E2 (Cel6A) | 1TML | 1.80 | this apo structure was virtually identical to the previously reported fungal GH6 CBH, but lacked one of the tunnel-enclosing loops | 658
 | 2BOD | 1.50 | wild-type complexed with (Glc)2-S-(Glc)2 | 688
 | 2BOE | 1.15 | Y73S mutant | 688
 | 2BOF | 1.64 | Y73S complexed with cellotetraose | 688
 | 2BOG | 1.04 | Y73S complexed with (Glc)2-S-(Glc)2 | 688
Mycobacterium tuberculosis H37Rv Cel6 | 1UP0 | 1.75 | wild-type complexed with cellobiose | 667
 | 1UP3 | 1.6 | wild-type complexed with (Glc)2-S-(Glc)2 | 667
 | 1UOZ | 1.10 | wild-type complexed with thio-cellopentaoside | 667
 | 1UP2 | 1.9 | wild-type complexed with cellobio-derived isofagomine | 667

enzyme performance across substrates and in cocktails of
varying composition.

7. FAMILY 6 GLYCOSIDE HYDROLASES

GH6 was one of the original six families identified through
hydrophobic cluster analysis.439 Originally denoted family B
GHs, T. reesei CBH II and two EGs from C. fimi and
Streptomyces were identified as sharing over 20% sequence
identity. The tertiary structure of these enzymes was unknown
at the time, but only a year later, TrCel6A would be the first
reported cellulase structure providing critical insight into both
the general structure of this family and key modes of CBH
action.
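
Sequence identity figures such as the "over 20%" value quoted above depend on how the comparison is defined. As a minimal illustrative sketch (this is not hydrophobic cluster analysis and is not the procedure used in the cited studies), percent identity between two already-aligned sequences could be computed as below. The function name percent_identity, the choice to exclude gapped columns from the denominator, and the toy sequences are all assumptions made for illustration; other conventions (for example, dividing by the full alignment length) give different numbers.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumes both inputs come from the same pairwise alignment, so they have
    equal length, with gaps written as '-'.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment; these are NOT real GH6 sequences.
toy_a = "ACDE-FGHIK"
toy_b = "ACDQWFG-IR"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")  # prints 75.0% identity
```

In this toy case, 8 columns are compared and 6 match, giving 75%. With real sequences the alignment step itself (and its gap penalties) would also influence the result.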

Figure 43. Sequence alignment of the six fungal GH6 structures reviewed here. Strictly conserved residues are shown in red block, and chemically
similar residues in red text. The blue boxes indicate chemical similarity across a grouping of residues. The secondary structural elements of TrCel6A are
shown above the sequences. The catalytic acid is marked by a yellow star. Residues with hypothesized or confirmed roles in catalysis are marked by
magenta stars. Active site loops A and B are marked by black boxes. The figure was generated with ESPript (http://espript.ibcp.fr).347

By 1991, membership of this family had grown by one to
include Microbispora bispora EG A.654 Family B was renamed
family 6, reflecting the growing availability of sequence data.654
Recent sequence analysis suggests GH6 enzymes may be further
categorized into eight subfamilies with accessory loop lengths
defining the features of the different groups.655 Today, GH6
encompasses nearly 500 protein sequences from both bacteria
and eukaryota remaining one of the smaller designated

families;151 we stress that this by no means diminishes the value
of this family in industrial biomass conversion.

Members of the GH6 family have been experimentally
characterized as exhibiting two primary modes of action,
endo-hydrolytic cellulose action (EC 3.2.1.4) and exo-hydrolytic
cellulose action (EC 3.2.1.91), where many are thought to
display both to some extent. Ståhlberg et al. proposed that all
GH6s can create new chain ends in cellulose as an EG would,
and thus, none are purely exoglucanases.304 While cellulolytic
action in general is critical to industrial biomass conversion,
GH6s play a particularly unique role by virtue of their
complementary mode of action. The GH6 family is currently
the only known family having cellulases that act from the
nonreducing cellulose chain end.151 The most effective enzyme
cocktails contain cellulases that recognize both the reducing and
nonreducing ends allowing rapid, synergistic degradation of
crystalline content.305,436,476,540 As such, GH6 cellulases are
primary components in biomass degradation cocktails.

In addition to their industrial relevance, GH6s have been
critical instruments in the general study of cellulase action.
TrCel6A was first uncovered in the characterization of the
culture filtrate components from the enhanced QM 9414
Trichoderma strain,435 though the enzyme would not explicitly
be named until 1980 as CBHII.436 Several years later, Teeri et al.
isolated and sequenced the CBHII gene adding a second
sequence by which to compare CBHI (TrCel7A).325 The
solution of the T. reesei CBHII CD 3-D structure marked a
major turning point in cellulase research.192 Biochemical
characterization prior to this had successfully identified CBHII
as a modular enzyme consisting of a CD and a CBM.344,399
Additionally, key details regarding the CBHII mode of action
had been uncovered including that the enzyme acts processively
from the nonreducing end of cellulose and that it yields
primarily cellobiose through an anomeric inversion
stereochemical pathway.302,443 With the solution of the first GH6
crystal structure, details regarding processive action began to
emerge.192 In addition to revealing the overall enzyme topology
and active site location (Figure 42), the CBHII structure solved
by Rouvinen et al. uncovered a central question motivating
many subsequent structural and biochemical studies, namely the
lack of a readily identifiable catalytic base, which is reviewed at
length here. As we note below, although there was significant
debate for many years, there is now a general consensus that the
GH6 catalytic mechanism proceeds via a “water wire” or
Grotthuss mechanism potentially with or without an explicit
catalytic base on the enzyme, although this still lacks definitive
confirmation.573,656 Over time, CBHII would receive its current
designation of Cel6A (TrCel6A), which we use throughout the
remainder of our discussion.657

In this section, we describe our current understanding of
fungal GH6 cellulase structure, function, and efforts to engineer
thermal stability and activity in these important enzymes. We
describe the hypothesized catalytic mechanism and the
structural and biochemical studies that support the proposed
mechanism, the many studies describing processive action and,
to a lesser extent, the synergism studies. As bacterial GH6
cellulases often provide interesting and insightful comparison to
fungal GH6s, we occasionally deviate from the primary focus of
the review on what we hope is an informative detour. A
summary of discussed GH6 structures is provided in Table 8.

7.1. Structural Studies

7.1.1. TrCel6A: Wild-Type. The first cellulase structure
revealed the catalytic core of TrCel6A and the general GH6 fold
as a distorted β/α barrel (PDB code 3CBH192). Much like the
classic (β/α)8 barrel, this structure consists of a core region of
β-strands surrounded by a constellation of α-helices. The noted
distortion arises from the exclusion of one of the eight β-strands
from within the core barrel region. The C-terminal loop
connecting the β7 strand to the α8 helix (shown in blue in Figure
42B) and the loop connecting the β2 strand to the α4 helix
(shown in pink in Figure 42B) form the enzyme active site. For
convenience, we have labeled these loops as loop A and loop B,
respectively, shown on the multiple sequence alignment in
Figure 43.

The active site region of TrCel6A is a 20 Å-long tunnel
formed by two loops, loops A and B.192 The loops partially
encompass the cellulose substrate and several water molecules.
In subsequent structures, the flexibility of these loops would be
elucidated as well as the potential relevance in nucleophilic
attack (Figure 42B).194 The enclosure somewhat restricts the
tunnel volume and thus the substrate flexibility. This latter
observation is consistent with previously observed CBH product
profiles, as substrate rearrangement to a productive binding
conformation is prevented when the chain is advanced by a
single glucose moiety.302 Rouvinen et al. proposed that the
tunnel shape is typical for processive attack of cellulose chains
and suggested the loops surrounding the active site appear to be
ideally positioned to assist in this processive action.

In addition to uncovering the fold of GH6, the TrCel6A
structure illustrated how the enzyme binds a cello-oligomer
within the tunnel-shaped active site. Cocrystallization with
o-iodobenzyl-1-thio-β-D-cellobioside inhibitor identified four
substrate binding subsites, which Rouvinen et al. labeled A−D;
the structure has not been deposited in the PDB. Current
nomenclature refers to these sites as −2 through +2 (Figure
42C).175 The binding subsites are largely defined by the spatial
arrangement of aromatic residues within the active site. The
thio-oligosaccharide complex illustrated carbohydrate-π stacking
interactions with three tryptophan residues (Figure 42C).

The TrCel6A structure also enabled an initial proposal for the
GH6 catalytic mechanism. At the time of the Rouvinen et al.
study, GH6 enzymes were known to invert substrate
stereochemistry,443 yet candidate residues corresponding to this
mechanism had not been definitively identified. The new
structure showed two carboxylate oxygens belonging to Asp175
and Asp221 located less than 5 Å from the glycosidic bond
between the −1 and +1 subsites.192 Rouvinen et al. proposed
that this location is the cleavage site and that these two residues
likely participate in catalysis. Specifically, the authors put forth
the hypothesis that Asp221 was protonated, and thus the likely
catalytic acid, and that Asp175 was charged aiding in
protonation of Asp221 as necessary. A water molecule in
position to attack the anomeric carbon at the cleavage site was
also captured (Figure 42C). Identification of a residue acting as
the catalytic base proved less straightforward, with the authors
offering two potential candidate residues, Asp263 or Asp401.
This latter observation spawned many subsequent studies and
controversy, and definitive identification of the catalytic base, if
one indeed is required, still remains elusive.

As we continue to discuss GH6 structure and catalytic
function, it is helpful to compare subsequent structures
providing insight by analogy. Figure 43 provides a sequence
alignment of TrCel6A alongside several of the more relevant

GH6 sequences. The active site loops as well as residues of catalytic base. The Asp156 carbonyl oxygen was buried in the
particular interest to understanding the catalytic mechanism active site, though still within hydrogen bonding distance of the
have been highlighted. catalytic acid. Spezio et al. suggested Asp156 more likely
7.1.2. T. f usca Cel6A. The first EG representative from functions to modulate pKa of the catalytic acid. The other
family 6 was solved just a few years after TrCel6A and proved proposed catalytic base, Asp401 in TrCel6A (Asp265), was
useful in identifying topological differences between EGs and observed forming a salt bridge with a neighboring arginine.
CBHs. Spezio et al. solved the structure of T. fusca Cel6A Spezio et al. suggested Asp265 was readily ionizable and
(designated E2 at the time; PDB code 1TML).658 Although a adequately positioned to accommodate both the ligand and
bacterial EG sharing only 26% sequence identity with TrCel6A, attacking water molecule in the inverting stereochemical
the topologies of the two enzymes are quite similar. As is mechanism; thus, the hypothesis from this study was that
characteristic of the GH6 fold, Tf Cel6A also displays the Asp265 serves as the catalytic base in GH6s.658
distorted β/α barrel, though with different α-helix lengths. A 7.1.3. TrCel6A: Y169F Variant. With an unclear picture of
notable difference between the two structures is the topology of the residues involved in catalytic function, Koivula et al. set out
the active site. Tf Cel6A is missing a homologous N-terminal to examine the role of a centrally located tyrosine residue
loop, which yields a more open cleft-shaped active site (Tyr169) in catalysis and substrate binding.574 The authors
potentially to allow flexibility in endo-initiated attack of suggest Tyr169 sterically hinders the glucan substrate in the −1
crystalline cellulose (Figure 44). On the surface, the active site binding site, resulting in formation of the catalytically active
conformation. The Y169F variant was generated and solved in
its apo form (PDB code 1CB2). This new structure displayed
negligible structural differences relative to the original wild-type
structure. Weighing in on the ongoing debate as to catalytic
residues, Koivula et al. contradicted proposals that Asp401 or
the Tf Cel6A homologue could act as the catalytic base. The
authors argued that Asp401’s participation in two salt bridges
would likely prevent it from extracting a proton from the
attacking water. Turning attention to Asp175, Koivula et al.
confirmed the previous structural interpretation indicating that
the residue is important for pKa modulation and ensures
protonation of the catalytic acid.192,574 This observation was
arrived at on the basis of unpublished mutagenesis studies.
Additionally, the authors noted that Asp175’s position could potentially serve to stabilize an oxocarbenium-like transition state.
Ultimately through structural evidence and specificity and kinetic characterization, Koivula et al. arrived at the hypothesis that Tyr169 serves to distort the −1 subsite glucose moiety positioning the glycosidic linkage for catalysis. Comparisons of catalytic constants toward short cello-oligosaccharides were examined for both the wild-type and the Y169F variant, and kcat for cellotriose and cellotetraose were 4 times lower for the Y169F variant. Specificity constants were also decreased. This indicates Tyr169 plays a key role in catalysis. The Y169F mutation was also accompanied by an altered pH activity profile leading one to speculate that Tyr169 may also play an indirect role in ensuring protonation of the catalytic acid.
Figure 44. Comparison of the fungal CBH TrCel6A (1QK2)194 and bacterial EG TfCel6A structures (1TML).658 The TrCel6A structure is shown on the left in gray surface. The thio-oligosaccharide ligand is shown in cyan stick. Active site loops A and B are shown in pink and blue, respectively. The TfCel6A structure is shown on the right in green surface. For illustration, TfCel6A is shown bound to the thio-oligosaccharide from the TrCel6A 1QK2 structure. The TfCel6A active site is notably more open than the enclosed tunnel-shaped active site of TrCel6A as a result of shorter and missing loop regions. The cleft- or groove-shaped active site is a common attribute of many EGs.
structural differences between TfCel6A and TrCel6A are consistent with the notion of a strict delineation of endo- versus exo-initiated attack; though as noted before, it is known that CBHs are also capable of endo-initiation despite having a tunnel-shaped active site,304 making such a delineation based on topology alone not definitive.
The TfCel6A structure confirmed several suppositions from Rouvinen et al. regarding catalytic residues while casting doubt on others.192 Spezio et al. noted that a homologous pair of aspartates was in a similar location to Asp175 and Asp221 in TrCel6A.658 Asp117 (Asp221 in TrCel6A) was likely protonated given its relative distance to neighboring residues and was reported as the catalytic acid on this structural basis. However, Spezio et al. suggested that the Asp79 (Asp175 in TrCel6A) was unlikely to participate in catalysis, as it shifted away from the active site by 4 Å. Notably, the TfCel6A structure did not contain a ligand, which may have allowed a greater degree of active site flexibility. Spezio et al. did make the concession that the loop containing Asp79 may shift upon binding.
Similarly, Spezio et al. examined the identity of the catalytic base on the basis of structural data. Focusing on the aspartate homologous to TrCel6A Asp263 (Asp156/TfCel6A), they found it was unlikely this residue participated in catalysis as the
7.1.4. H. insolens Cel6A: Apo. The solution of the catalytic core of H. insolens CBH Cel6A (HiCel6A) and subsequent characterization cast doubt on its categorization as a strictly exo-enzyme (PDB code 1BVW).659 HiCel6A is strikingly similar to TrCel6A, sharing 64% sequence identity. Thus, one may reasonably presume the two enzymes share a great deal of characteristics related to specificity, kinetics, and processive action. Up to this point, TrCel6A had been largely described as an exo-active CBH. However, several groups were approaching the conclusion that an enzyme, but this one in particular, may not be strictly delineated into either the endo- or exo-category.304,660 Varrot et al. observed hydrolysis of a fluoresceinyl-derivatized oligosaccharide substrate by HiCel6A, the functional group of which appeared far too large to reside in the closed, tunnel-shaped active site of the HiCel6A structure. The authors argued that, to accommodate the fluorescein group, the active site of HiCel6A must necessarily exhibit flexibility in the active site loop regions. Flexibility of these loops would in
1367 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Chemical Reviews Review

theory allow for endo-initiation on cellulose, explaining more recent observations indicative of such.209
The catalytic acid, Asp226, was also observed in two orientations, perhaps representing two different protonation states (Figure 45).659 In one conformation, the distance between the carbonyl oxygen of the proposed catalytic acid and the proposed catalytic base is 9.5 Å (Asp401/TrCel6A and Asp405/HiCel6A), which is consistent with a typical single-displacement mechanism. Varrot et al. further agreed with the hypothesis that an aspartate serves to modulate the pKa of the catalytic acid, as Asp268 was appropriately positioned to do so. However, while the HiCel6A Asp180 is in an equivalent position to Asp175 of TrCel6A, the authors regarded its relevance to catalysis as inconsequential on the basis of its distance from the cleavage site. They cite consistency with mutagenesis studies performed on similar enzymes as further support of this conclusion.192,661
Figure 45. Two observed conformations of the HiCel6A catalytic acid likely correspond to two different protonation states. The 1BVW structure is shown in green cartoon. The catalytic acid, Asp226, and the proposed catalytic base, Asp405, are shown in stick. The distance between the two residues in one conformation is consistent with a single-displacement catalytic mechanism, and thus, Asp405 (Asp401/TrCel6A) was not excluded from consideration on the basis of distance.
7.1.5. H. insolens Cel6A: Cello-Oligomer Complexes. Shortly after solution of the original HiCel6A apo structure,659 Varrot et al. reported a glucose/cellotetraose-bound structure of HiCel6A confirming their previous hypothesis that the active site loops are flexible, changing conformation upon substrate binding, and providing significant molecular-level insight into ligand binding (PDB code 2BVW).662 The noted conformational change in the active site loops leads to a more enclosed active site tunnel that increased the number of contacts with the ligand. This likely also applies to the TrCel6A active site loops, indirectly explaining previous observations that TrCel6A is able to access and hydrolyze internal bonds.304 Direct observation of this phenomenon in TrCel6A would soon follow,194 and computational studies would later define molecular mechanisms associated with the conformational change.571
Six ligand-binding subsites, −2 through +4, were directly observed as a result of these new oligosaccharide complexes. The −1 subsite was unoccupied, and the glycosyl units in the occupied subsites adopted a relaxed 4C1 conformation. Additionally, the catalytic acid, Asp226, was observed in close proximity to the +1 glucose and within a realistic distance of Asp405, again not ruling it out as the catalytic base. Varrot et al. further suggest that if one models the missing −1 glucose moiety, Asp405 would be a suitable distance from the −1 anomeric carbon so as to activate a water molecule for nucleophilic attack. The authors maintained their opinion that Asp180 significantly impacts the pKa of the catalytic acid given the large conformational change about the Cα−Cβ bond allowing Asp180 to hydrogen bond with the glucose.
7.1.6. TrCel6A: Non-Hydrolyzable Ligands. A set of three new ligand-bound complexes of TrCel6A was instrumental in addressing the issue of loop flexibility upon substrate binding.194 These structures included the following: wild-type complexed with (Glc)2-S-(Glc)2 (PDB code 1QK2), Y169F complexed with (Glc)2-S-(Glc)2 (PDB code 1QJW), and wild-type complexed with m-iodobenzyl-β-D-glucopyranosyl-β(1,4)-D-xylopyranoside (PDB code 1QK0). Zou et al. analyzed the observed active site loop conformations of these three new structures in the context of the available structures at the time. The authors found that the conformational states of the active site loops generally fall into four categories including “most closed”, “more open”, “even more open”, and “most open”. The tunnel-forming loop was responsive to modifications in the active site including ligand binding as well as site-directed mutagenesis (Y169F, D175A, and D221A). The former three conformations are illustrated in Figure 46. Clearly, the TrCel6A active site is quite flexible with its “pincer-like” narrowing of the tunnel likely playing a significant role in the enzyme’s mode of action.
Figure 46. Flexibility of the TrCel6A active site loop A as observed in four separate structures. The “most open” active site observed via structural studies was captured in the 1HGW structure, shown in yellow cartoon.656 The 1QJW active site represents the “most closed” active site, dark teal cartoon.194 Two intermediate active site loop conformations, “more open” and “even more open”, were captured in 1QK2 and 1QK0, gray cartoon and magenta cartoon, respectively.194 The ligand, cyan stick, is the thio-oligosaccharide linked ligand from the 1QK2 structure. For illustration, Ser181 is shown in stick on each of the loop A conformations demonstrating the impressive range of motion of this residue.
Additionally, the crystallization of TrCel6A in the presence of nonhydrolyzable ligands was effective in capturing TrCel6A with the −1 glucose moiety in the 2SO distorted conformation

providing additional insight in the ongoing efforts to define the catalytic residues. In both the wild-type/(Glc)2-S-(Glc)2 complex and the Y169F complex, ligand binding in the active site is essentially the same despite each having very different active site loop conformations. The wild-type/m-iodobenzyl-β-D-glucopyranosyl-β(1,4)-D-xylopyranoside complex captured the −1 sugar in the 4C1 conformation; however, the xylosyl unit of the ligand was in the −1 site with the glucosyl moiety in the −2 site. Zou et al. note that if the −1 subunit sugar had a hydroxymethyl group as in the native substrate, it would severely clash with the Tyr169 side chain. The Y169F mutant also captured the −1 glucose in a distorted conformation, suggesting Tyr169 plays an indirect role in catalysis.194 Overall, these findings suggest the −1 ring distortion arises from the need to avoid steric clashes with the surrounding protein, yet the distortion is likely integral to the catalytic mechanism.
Zou et al. also explicitly examined the three structures for clues regarding the identity of the catalytic residues with respect to existing characterizations and structural knowledge. Two primary conclusions were made. The pair of aspartyl residues previously proposed as catalytic acid and pKa modifying residue, Asp221 and Asp175, respectively, serve as the primary catalytic machinery, and neither of the putative catalytic bases, Asp263 and Asp401, are convincingly suitable candidates. With respect to the first conclusion, Zou et al. suggested structural evidence that pointed toward a catalytic mechanism that is dependent upon the conformation of the active site loops: (1) In the most commonly captured orientation corresponding to most apo structures, the tunnel is generally closed, opening as necessary to facilitate substrate entry. (2) When the substrate binds within the tunnel, the −1 glucosyl moiety distorts, which is likely crucial for catalysis. The aspartyl pair is locked in an interaction with Tyr169 and Arg174 that facilitates the protonation of Asp221. (3) A loop conformational change tightens the tunnel breaking the Asp175/Asp221 interaction with Tyr169/Arg174 and positions two water molecules close to the anomeric carbon of the −1 sugar. (4) The Asp175/Asp221 hydrogen bond is broken to allow Asp221 to donate its proton to the substrate leaving group, and Asp175 activates a neighboring water molecule above the anomeric carbon. (5) Catalysis takes place, and the α-anomer of cellobiose is expelled as product alongside a final conformational change of the active site loops. Processive action on cellulose would then follow.
Examining the putative catalytic bases, Zou et al. reported that Asp263 is too distant to be directly involved in catalysis but may modify pH behavior of the enzyme, as previously proposed.658,659 Asp401 was not ruled out as the catalytic base by distance and was likely to be in the correct charge state to activate a water molecule. However, its side chain was not close enough to a negatively charged residue and, thus, may not be suitable to deprotonate the water molecule. Furthermore, the Asp401 side chain does not interact with a suitable water molecule in either of the thio-oligomer complexes but, instead, hydrogen bonds with the −1 sugar O3. The placement of Asp401 relative to the face of the sugar ring rules it out as the catalytic base. As discussed below, subsequent reports suggest that Zou et al.’s reported structures do actually indicate a catalytic role for Asp401, though not as expected. Instead, it has later been suggested that the backbone of Asp401 interacts with the attacking water, and thus, this residue may be important for proper alignment of the catalytic center.573,656
7.1.7. H. insolens Cel6B. The solution of the H. insolens EG Cel6B (formerly EG VI) structure provided a unique perspective on the relationship of EG and CBH active site topology with activity (PDB code 1DYS).663 Up to this point, the tunnel-shaped active sites of CBHs defined by flexible active site loops were largely thought to be a construct allowing processive hydrolysis of crystalline cellulose.316,440 EGs were identified through the lack of these active site loops resulting in a groove- or cleft-shaped binding site. Deletion of one of the key CBH active site loops in a C. fimi CBH indeed increased overall activity toward CMC, but the activity toward smaller cello-oligosaccharides was simultaneously decreased, clouding the notion that the absence of active site loops strictly delineates activity toward soluble substrates.660
Adding to the suggestion that EGs may not always resemble the canonical open active site architecture, Davies et al. reported the structure of the HiCel6B, which displayed a striking homology with family 6 CBHs.663 As with familial representatives TrCel6A, TfCel6A, and HiCel6A, HiCel6B exhibits the characteristic GH6 distorted β/α-barrel fold. However, HiCel6B is missing one C-terminal loop (loop B) present in the CBHs (Figure 43) creating a more open binding site, yet not equivalent to other GH6 EGs. The authors noted that the remaining N-terminal loop (loop A) may close upon substrate binding allowing Ser98 (homologous to TrCel6A Ser181) to interact with the −1 and/or +1 subsite substrate. Such a conformational change could also allow Glu97 to interact with the active site. Other EGs have a conserved histidine at the Glu97 position, while CBHs exhibit an alanine.
In the ongoing search for the catalytic base, Davies et al. note that Asp316 (homologous to TrCel6A Asp401) displayed a salt bridge to the conserved Arg269.663 Corresponding to findings from Zou et al.,194 this configuration suggests Asp316 (and TrCel6A’s Asp401) is an unlikely catalytic base in HiCel6B, but the authors noted that a small rearrangement of the −1 subsite sugar could better position the residue. Beyond the catalytic base, Ala182 hydrogen bonds with the pKa modulator Asp180 that in turn orients the catalytic acid, Asp139, for catalysis.
7.1.8. H. insolens Cel6A: D416A/Thio-Oligosaccharide Complex. With the identity of the catalytic base still unknown, Varrot et al. set out to investigate a set of particularly perplexing findings related to the proposed catalytic base in HiCel6A, Asp405.209 The authors reported unpublished observations that the EG HiCel6B is rendered inactive upon mutation of the homologous residue (Asp316). However, mutation of Asp405 in the CBH HiCel6A resulted in only a 100−300-fold reduction in activity rather than complete annihilation. These results led the authors to propose that the CBHs of the GH6 family may benefit from catalytic “rescue” as a result of the phenotypic differences in architecture. One of the basic residues proposed as a possible “rescue” residue in HiCel6A was Asp416. Varrot et al. suggest that the residue sits in such a way that the Asp could potentially activate a water molecule for inverting attack on the −1 glucose anomeric carbon. Moreover, this residue is located on a loop that is missing in homologous EGs and, thus, would be a likely candidate for maintaining minimal levels of activity should the potential base (Asp405) be modified.
Solution of a thio-oligosaccharide bound D416A variant of HiCel6A was unable to confirm the hypothesis that Asp416 serves as a “rescue” residue because no water molecule was observed in a position for nucleophilic attack (PDB code 1GZ1).209 Nonetheless, the D416A structure was successful in capturing a thiol-linked cellotetraose molecule spanning the +2 to −2 binding subsites with the −1 glucose moiety in the 2SO conformation. This structure was also very briefly one of the
“most open” GH6 CBH active sites, in terms of position of the two active site loops A and B. At this point in time, evidence convincingly pointed toward distortion of the Michaelis complex in the catalytic itinerary of retaining GHs; however, it was unknown as to whether inverting GHs exhibited a similar transition state distortion. Previous GH6 structures had captured a similar 2SO conformation in the −1 binding subsite,194 but the nearest potential transition state to the 2SO conformation on the Stoddard diagram is the 2,5B boat conformation, which initially seemed an unlikely transition state. The D416A variant provided mounting structural evidence of what appears to be a valid intermediate catalytic state (2,5B) of inverting GH6s.
Overall, no significant structural changes were found for the D416A mutant except for the open conformation of the active-site loops. The authors report only a single direct hydrogen bond between the −1 glucose O3 hydroxyl and the proposed catalytic base Asp405; other protein−substrate interactions appeared to be mediated entirely by water. During proof of the manuscript, Varrot et al. included a postscript209 acknowledging the findings of another paper published at nearly the same time.656 Koivula et al. had solved the structure of an even more open TrCel6A structure, but more importantly, they had proposed a solvent-mediated Grotthuss mechanism for deprotonation that was consistent with the solvent-mediated interactions observed in the D416A HiCel6A structure. Furthermore, Koivula et al. reported molecular simulations that support the 2,5B transition state conformation.
7.1.9. TrCel6A: D175A and D221A Variants. The Koivula et al. study in 2002 remains one of the most influential structural investigations of catalytic residue function in GH6s.656 The structures of TrCel6A D175A and D221A mutants (PDB codes 1HGW and 1HGY, respectively) were solved, and activity measurements, kinetic isotope experiments, and molecular simulations were used to ascribe functions to Asp221 and Asp175.656 A key finding of this study was that the catalytic mechanism appears to proceed via a water-mediated Grotthuss mechanism (Figure 47), and that Asp175 may accept the proton as the catalytic base.656 Moreover, the distortion of the substrate in the −1 binding subsite was shown to be stable over a relatively short MD simulation, suggesting the conformation is stabilized by the surrounding protein environment. An additional feature of note is that the D175A structure remains the most open GH6 active site conformation yet observed.
Figure 47. Proposed GH6 Grotthuss mechanism. In this mechanism, the role of catalytic base is fulfilled by a wire that shuttles the proton from the attacking water.573 The residues are labeled according to the TrCel6A sequence. The Ser181 and Asp401 backbone carbonyl oxygens stabilize the attacking water, and the Asp175 side chain stabilizes a second water molecule that will accept the proton from the attacking water in an inverting mechanism.
The idea that Asp221 was the catalytic acid was not a new one going into this study, nor was that of Asp175 as a pKa modulator. Up to this point, most available data pointed toward Asp221 as the catalytic acid.192,658,661 The D175A structure captured Asp221 positioned close to where the −1/+1 glycosidic linkage would be located (PDB code 1HGW). This position matched an earlier structure of the Y169F TrCel6A variant which used a thio-oligosaccharide to capture a substrate spanning the active site (PDB code 1QJW).194 Using MD simulation to model the oxygen-linked oligosaccharide, Koivula et al. observed that the hydrogen bond between Asp221 and Asp175 breaks revealing a new, catalytically competent side chain conformation of Asp221. This conformation had been observed previously in TfCel6A and HiCel6A structures but was the first observation of a protonated carboxyl-oxygen oriented toward the glycosidic linkage in TrCel6A. Activity measurements of the D221A variant on cellotriose, cellotetraose, cellopentaose, and cellohexaose indicated a complete loss of competency; however, binding specificity was maintained. The findings overwhelmingly support Asp221 as the catalytic acid candidate.
Data supporting Asp175’s role in catalysis had been less convincing up to this point. Koivula et al. began investigating the function of Asp175 in catalysis by testing the natively expressed D175A variant on oligosaccharides. They showed that the variant was effectively inactivated but, like D221A, maintained binding specificity. The recombinantly expressed D175A variant maintained 2−3% residual activity on barley β-glucan, suggesting, as the authors state, that it is not the catalytic base “in a normal sense”.656 The D175A variant also markedly lowered the pH-rate profile, suggesting a role in pKa modulation. The authors also put forth a convincing argument surrounding the role of Asp175 in transition state stabilization. Examining cellobiosyl fluoride substrate reaction kinetics, Koivula et al. calculated an exceptionally high charge buildup in the TrCel6A transition state. Electrostatic stabilization would be required to maintain such an unfavorable ΔΔG⧧, and Asp175 is uniquely positioned to perform this role. The structure captures the −1 glucose ring oxygen and the Asp175 carboxylate oxygen within 4.5 Å of each other with no other shielding atoms between them.
Many previous studies had proposed Asp401 as the most likely candidate for the catalytic base, but contradictory results prohibited conclusive identification.661 Using structural insights and molecular simulation, Koivula et al. laid out a rational discussion describing the evidence against Asp401 acting as the catalytic base in TrCel6A. They first described the role of Asp401 in catalysis and why prior studies have been inconclusive at best. Structural data indicate that Asp175 and Ser181 coordinate two water molecules positioned so as to attack the −1 glucose anomeric carbon. The water coordinated by Ser181 is also coordinated by the backbone carbonyl of Asp401. Simulation of a carbocationic intermediate confirmed this water molecule as the likely catalytic water. The authors further added that prior studies mutating Asp401 do not in fact remove the catalytic base but rather induce a charge imbalance that destabilizes the transition state. The removal of Asp401 would leave two nearby positively charged residues without acidic compensation near the active site.
Instead of a catalytic base-mediated mechanism, Koivula et al. suggested proton transfer occurs through a water wire in a Grotthuss-type mechanism, illustrated in Figure 47, wherein proton transfer occurs to a neighboring water upon nucleophilic
attack, which may then transfer a proton to Asp175 serving as indirect catalytic base.664 The authors noted that because there is little evidence of bond formation with the nucleophilic water in the transition state, there is likely no need for a traditional catalytic base-mediated mechanism. Examination of solvent isotope effects on kcat for cellotriose substrates supported the active site architecture observed in the structural study, and thus the proposed catalytic roles. Interestingly, a Grotthuss mechanism for GH6 catalysis had previously been considered in the 1999 review by Davies et al.170 Ultimately, the plausibility of this mechanism had been dismissed as inconsistent with a prior study claiming to have identified a GH6 catalytic base residue.661
Figure 48. Proposed mechanism of catalysis and “virtual processivity” as captured through structural studies of HiCel6A bound to thio-oligosaccharide inhibitors. The figure numbers referenced in the captions at right refer to the figures from the original publication. Reprinted with permission from ref 665. Copyright 2003 Elsevier.
7.1.10. H. insolens Cel6A: D405N Variant. In 2003, Varrot et al. revisited their structural investigation into the possibility of a catalytic “rescue” residue upon mutation of the proposed catalytic base, Asp405.665 The authors were searching for a structural explanation as to why mutagenesis of Asp405 results in only a 100−300-fold decrease in activity rather than complete abolishment, as in the H. insolens EG. A series of five structures were reported toward this goal. These structures included the following: wild-type HiCel6A complexed with methyl 4-S-(4III-FTU-β-cellotriosyl)-4-thio-β-D-glucopyranoside (PDB code 1OCB), an apo D405N mutant (PDB code 1OC6), a D405N mutant complexed with methyl 4II-thio-β-cellotetraoside and methyl 4,4II,4III,4IV-tetrathio-α-cellopentaoside (PDB codes 1OC5 and 1OC7, respectively), and a D416A mutant complexed with methyl 4,4II,4III,4IV-tetrathio-α-cellopentaoside (PDB code 1OCJ).665 The authors noted that there appear to be relatively few structural consequences of mutating Asp405. Neighboring residues rotate slightly but maintain a similar interaction profile with the asparagine. Asp405 maintains a salt bridge with the neighboring Arg357 as well, the destruction of which was speculated to result in a significant local instability.
Varrot et al. concluded that Asp405 must have a participatory role in catalysis since the mutation does not result in local structural unfolding. Prior studies, as well as this one, noted that Asp405 is directly involved in substrate binding through a hydrogen bond with the C3 hydroxyl group of the −1 glucose moiety. Additionally, it was proposed that Asp405 may serve to stabilize the proposed 2,5B transition state intermediate, though not directly observed as part of this study. The authors again left open the possibility that Asp405 is the catalytic base.
The study of the D405N mutants bound to the unique set of fluorogenic and thio-oligosaccharides also uncovered a series of structures trapped in the act of “virtual processivity”.665 Varrot et al. illustrated a proposed schematic of catalysis and processivity in GH6 enzymes (Figure 48). The initial step, productive binding mode, was captured in the wild-type structure bound to the fluorescing thio-oligomer (1OCB). Though the ligand does not span the +1/−1 binding sites where cleavage of the glycosidic linkage would occur, the catalytic acid reflects a catalytically competent conformation. This structure also captured a ligand binding within the −3 and −4 subsites, which had not been previously defined for any GH6 enzyme. The authors suggested this finding illustrates that there is no absolute requirement for cellulose chains to enter the enzyme tunnel from one end only, supporting the ability of this particular enzyme to act in an endo-active fashion as well as exo-active. In the productively bound conformation, the protein makes approximately 15 different hydrogen bonds directly with the ligand and a similar number of solvent-mediated hydrogen bonds.
The start of the processive event, “processivity commences” (Figure 48), was captured in the 1OC7 structure, the D405N mutant bound to the thio-linked cellopentamer.665 The first sugar lies intermediate to the −1/+1 binding site with the trailing sugars also occupying intermediate subsite binding positions as the chain threads in the tunnel. The interactions of the protein with the ligand are similar to the productive binding mode. The chain is hypothesized to progress through the tunnel maintaining a nonproductive binding mode conformation. The 1OC5 structure of the D405N mutant bound to the thio-linked cellotetramer illustrates the nonproductive binding mode in which the ligand sits in the +1 through +4 binding subsites in a relaxed chair conformation. Each of the glucose moieties’ faces is rotated 180° from that of the productively bound ligand conformation suggesting the ligand has partially completed the processive motion through the tunnel. Contrasting the captured productive binding mode, the protein makes very few hydrogen bonds directly with the ligand in the nonproductive conformation. Nearly all of the interactions are through solvent-mediated hydrogen bonds as the ligand slides through the tunnel. The catalytic acid is tilted away from the substrate. This frequently captured dual orientation of the GH6 catalytic acid, both of which were captured in this study, has long been attributed to the change in local pH affecting protonation state. Varrot et al. suggested that the conformational change may actually serve a role in processivity, putatively to accommodate the unusual orientation of the sugars as they find their way toward a catalytically competent binding position. Overall, this study documents a fascinating series of structures illustrating a proposed processive mechanism and sheds light on the ability of HiCel6A to act in endo-initiation mode.
7.1.11. H. insolens Cel6A: D416A/Isofagomine Complex. Revisiting the H. insolens D416A variant, Varrot et al. used a cellobiose-derived isofagomine inhibitor to successfully capture the 2,5B transition state conformation in the −1 binding subsite (PDB code 1OCN).666 This was the first structural report of this unusual catalytic itinerary, traversing from the 2SO Michaelis complex conformation through the 2,5B transition state intermediate.194,209 The structure reveals the distortion is driven primarily through steric interactions with neighboring Tyr174 coinciding with the hypothesized role of the homologous residue in TrCel6A.574 The structure also effectively supported the Grotthuss mechanism for GH6 cellulases. A water molecule was captured in position for attack on the anomeric center of the −1 isofagomine moiety (Figure 49). The formerly proposed catalytic base, Asp405, was mediating a water−ligand interaction by stabilizing the interaction with the main-chain carbonyl. Ser186 on loop A, the pKa modifying/indirect putative base Asp180, and a second water molecule are also implicated in catalysis. This structure unambiguously suggests GH6 catalysis occurs via a Grotthuss-type water wire mechanism with Asp180 potentially being able to accept a proton during catalysis (Figure 47).
Figure 49. Transition state intermediate of HiCel6A as captured in the isofagomine bound complex with the D416A variant (PDB code 1OCN).666 Asp405, once proposed as the catalytic base, is shown mediating the Grotthuss-type water wire mechanism through hydrogen bonding of its main chain carbonyl with a water molecule. Ser186, of loop A, also stabilizes the water wire. A second water molecule hydrogen bonds with the first, and Asp180 completes the wire interaction. Asp226, the catalytic acid, is positioned away from the pKa modifying Asp180 during this intermediate state. Tyr174 forces the energetically unfavorable 2,5B conformation of the −1 moiety through steric interactions. This transition state intermediate appears to be general for GH6 inverting enzymes.
Interestingly, the Grotthuss-type water wire mechanism may be feasible even in the absence of a homologous serine residue in loop A. Mycobacterium tuberculosis H37Rv Cel6, a bacterial EG, exhibits an alanine substitution in this position. Structures of the bacterial EG complexed with nonhydrolyzable substrates indicate that, despite the alanine substitution, a water molecule is still able to correctly position for inverting attack of the −1 moiety (PDB codes 1UP0, 1UP3, 1UOZ, and 1UP2).667 The stabilization of the water molecule is alternatively maintained by a conserved asparagine homologous to Asp401 in TrCel6A. This latter study suggests that GH6 enzymes have evolved to maintain the water wire catalytic mechanism.
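Because the preceding subsections refer to equivalent catalytic positions by different residue numbers in each enzyme, the correspondences stated explicitly in the text can be collected into a small lookup, sketched here in Python. The dictionary layout and the helper name `to_trcel6a` are illustrative assumptions and are not taken from any of the cited studies. Only equivalences given explicitly above are included; pairings implied by role alone (for example, the catalytic acids Asp221 in TrCel6A and Asp226 in HiCel6A) are deliberately left out.

```python
from typing import Optional

# Residue equivalences stated explicitly in the text, keyed by TrCel6A numbering.
# The structure of this table is a hypothetical convenience, not a published resource.
EQUIVALENTS = {
    "Asp175": {"TfCel6A": "Asp79", "HiCel6A": "Asp180"},
    "Asp221": {"TfCel6A": "Asp117"},
    "Asp263": {"TfCel6A": "Asp156"},
    "Asp401": {"HiCel6A": "Asp405", "HiCel6B": "Asp316"},
    "Ser181": {"HiCel6B": "Ser98"},
}


def to_trcel6a(enzyme: str, residue: str) -> Optional[str]:
    """Return the TrCel6A residue equivalent to `residue` in `enzyme`.

    Returns None when the text gives no explicit equivalence.
    """
    for tr_residue, others in EQUIVALENTS.items():
        if others.get(enzyme) == residue:
            return tr_residue
    return None


if __name__ == "__main__":
    for enzyme, residue in [("HiCel6A", "Asp405"), ("TfCel6A", "Asp79"), ("HiCel6B", "Ser98")]:
        print(f"{enzyme} {residue} -> TrCel6A {to_trcel6a(enzyme, residue)}")
```

With this table, `to_trcel6a("HiCel6A", "Asp405")` returns `"Asp401"`, while a residue with no stated equivalence returns `None` rather than a guess.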
7.1.12. C. cinerea Cel6C. The genes encoding five new GH6 enzymes (CelA through CelE) from C. cinerea were discovered in 2009.668 Of the five, only CcCel6A exhibits a CBM and is most closely related to TrCel6A.669 The remaining GH6 cellulases were more similar in homology to HiCel6B. Liu et al. confirmed that C. cinerea Cel6A (CcCel6A) exhibits a CBM, while CcCel6B and CcCel6C do not. Yoshida et al. claim that all five GH6s exhibit CBH activity on the basis of cellobiose production from PASC. This latter observation became increasingly pertinent upon the solution of the structure of CcCel6C in the year that followed.

Liu et al. reported the solution of the CcCel6C structure in complex with cellobiose and p-nitrophenyl-β-D-cellotrioside, which appears to be the first report of a GH6 structure from a basidiomycete (PDB codes 3A64, 3ABX, and 3A9B).670 The authors suggested the structure was an additional example of a GH6 CBH based on prior characterization on PASC despite the high sequence homology with the H. insolens EG Cel6B. However, in that study, CcCel6C was tested for activity on CMC and demonstrated specificity for the substrate unlike any other GH6 CBH. The new structure suggests the CcCel6C active site is significantly more open than any of the known GH6 CBHs. Active site loops A and B, which are not missing (Figure 43), were captured in an open conformation in each instance and appear to be insensitive to the substrate binding-based conformational change noted in both TrCel6A and HiCel6A. Accordingly, the serine residue of loop A (Ser181 of TrCel6A) was not noted to interact with the substrate. A lysine substitution in loop A is suspected to be responsible for the reduction in flexibility in the loops overall. Additionally, a tyrosine present in the −3 binding site of other GH6 CBHs is missing in CcCel6C, making for a much more open product-binding site by comparison.

Overall, the evidence that CcCel6C is indeed a CBH is not compelling. In fact, specificity toward CMC and the appearance of what was described as a cleft-shaped active site surrounded by fixed open loops points more toward EG behavior, or at the very least, a high degree of endo-initiation. High sequence homology with the EG HiCel6B is also concerning. In the absence of assays on more crystalline substrates such as Avicel or BMCC, the categorization of CcCel6C activity as a CBH remains unclear.

7.1.13. C. cinerea Cel6A. In a subsequent study, Tamura et al. investigated conformational changes within the active sites of both CcCel6A and CcCel6C resulting from substrate binding or mutation of the pKa modifying catalytic aspartate.671 Four CcCel6A structures were reported: CcCel6A bound to 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES; PDB code 3VOG), CcCel6A bound to cellobiose (PDB code 3VOH), CcCel6A bound to p-nitrophenyl-β-D-cellotrioside (PDB code 3VOI), and an apo CcCel6A D164A variant (PDB code 3VOJ). A fifth structure of a CcCel6C D102A variant with glucose in the −2 binding site was also reported for comparison to the group’s prior wild-type structure of CcCel6C.670

The CcCel6A structure was the second basidiomycete GH6 structure reported. As noted before, CcCel6A exhibits a CBM, whereas all other C. cinerea GH6s do not. CcCel6A demonstrates 52% sequence identity with HiCel6A and 48% identity with TrCel6A. Additionally, CcCel6A was reported to produce cellobiose from PASC but not hydrolyze either p-nitrophenyl-β-D-cellotrioside or CMC, indicative of CBH activity. However, relative or specific activity data have never been directly reported, making direct comparisons between CcCel6A and CcCel6C and to other GH6s difficult. Tamura et al. report that the CcCel6A-HEPES structure maintains an “open” tunnel conformation, while the CcCel6A-cellobiose and CcCel6A-p-nitrophenyl-β-D-cellotrioside structures demonstrate the “closed” form of the active site tunnel that now appears to be characteristic of GH6 CBH loops upon substrate binding.194,671 The conformational changes in moving from open to closed forms were essentially identical to changes observed in HiCel6A. Interactions of CcCel6A with the bound ligands were reported to be consistent with the Grotthuss-type catalytic mechanism previously proposed for GH6s.656

Rather than the new CcCel6A structure, the primary focus of the Tamura et al. study was actually the conformational changes associated with mutation of Asp102 in CcCel6C.671 Reversing their original conclusion that the active site of CcCel6C is rigid by comparison to other GH6 CBHs,670 the authors noted that the D102A mutation induces a significant degree of flexibility in the active site loops. A “motion angle” measurement devised to account for conformational changes indicates that the CcCel6C D102A variant active site loops tighten significantly in comparison to both the wild-type active site and the CcCel6A structures. Tamura et al. go on to suggest that the conformational change in the CcCel6C D102A variant is the most drastic yet observed in a GH6 CBH. However, superimposition of the two CcCel6C structures with the “most closed” and “most open” extreme conformations from TrCel6A (1QJW and 1HGW, respectively) indicates that the purported drastic conformational change is only moderately greater than that of TrCel6A (Figure 50). The CcCel6C D102A variant exhibits loop structuring nearly identical to that of the most closed TrCel6A structure, 1QJW. Overall, the more open active site of CcCel6C wild-type again supports the enhanced ability of this particular GH6 to perform endo-initiated attack relative to other GH6 CBHs.

Figure 50. Superimposition of CcCel6C structures with the TrCel6A “most open” and “most closed” structures. The wild-type CcCel6C structure, orange cartoon, represents the open conformation, which Tamura et al. reported as the most drastic open conformation of a GH6 yet.671 The TrCel6A D221A variant in the “most open” conformation is shown in yellow cartoon for direct comparison to CcCel6C.656 Upon binding a glucose molecule, the CcCel6C D102A variant active site loops close, shown in pink cartoon. This latter conformation is nearly identical to the “most closed” conformation from the TrCel6A Y169F variant, shown in teal cartoon.194 The thio-oligosaccharide ligand from 1QK2 is shown in cyan stick.194
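
The sequence identities quoted in this section (for example, 52% between CcCel6A and HiCel6A) depend on how the calculation is defined. The following is a minimal sketch of one common convention, assuming a precomputed pairwise alignment and counting only columns where neither sequence has a gap. The function name `percent_identity` and the toy fragments are hypothetical illustrations; they are not real Cel6A sequences, and the cited studies may have used a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumes the two strings come from the same pairwise alignment, so they
    have equal length and use '-' for gaps. Other conventions (for example,
    dividing by the shorter sequence length) give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments used only to exercise the function.
toy_a = "ACD-EFGHIK"
toy_b = "ACDWEYGH-K"
print(percent_identity(toy_a, toy_b))
```

Because the result changes with the choice of denominator and alignment method, identities from different studies are best compared only when computed the same way.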

7.1.14. C. thermophilum Cel6A. Solution of the C. thermophilum Cel6A structure provided additional structural evidence in support of a general Grotthuss-type mechanism for GH6 enzymes and again suggested GH6 enzymes have an active site particularly suited for endo-initiated attack (PDB code 4A05).672 CtCel6A displays the typical GH6 distorted β/α barrel including both full-length active site loops A and B.672 The enzyme shares 77% sequence identity with HiCel6A and 63% with TrCel6A, suggesting it is a CBH. Specificity for BMCC and filter paper supports this assessment.673 The wild-type CtCel6A structure captured a cellobiose molecule in the −3 to −2 binding subsites and a cellotetraose molecule in the +1 to +4 subsites. This is the second such account of a −3 binding site for a GH6 enzyme and is suggestive of the ability of the active site to bind substrate in a manner so as to allow glycosidic cleavage of products other than just cellobiose.665 One of the more interesting features of this structure was the tetrahedrally coordinated Li⁺ ion located in the same position as the anomeric carbon of the Michaelis complex. The Li⁺ ion likely mimics the oxocarbenium-ion-like transition state complex, coordinating with three water molecules and the O4 hydroxyl of the +1 sugar, consistent with a water-mediated inverting catalytic mechanism in GH6 enzymes.

7.1.15. TrCel6A Variants HJPlus and 3C6P. Increasing the ability of cellulases to withstand heat while maintaining a reasonable degree of activity is an important area of industrial enzymatic biomass conversion research. Wu et al. have been particularly successful in identifying a set of mutations through directed evolution that is capable of increasing T50 by nearly 20 °C over wild-type.674 To understand the structural implications of these mutations and the molecular-level origins of added thermal stability, Wu et al. solved the structures of the TrCel6A thermally stable variants HJPlus and 3C6P (PDB codes 4I5R and 4I5U, respectively).674 The thermal stability characterization efforts are discussed in greater detail in section 7.4, but briefly, we mention that the HJPlus variant consisted of 48 mutations with respect to the wild-type TrCel6A enzyme. The 3C6P variant consisted of an additional 7 mutations with respect to HJPlus, a total of 55 with respect to wild-type. Both variants maintain a high degree of structural similarity with wild-type TrCel6A. With so many mutations, the authors necessarily did not discuss contributions from each mutation toward thermal stability and focused primarily on residues providing stability gains in 3C6P over HJPlus. Improvements in thermal stability were attributed to five residues, all of which were located near the surface of the enzyme. Mutations at the surface of globular proteins are frequently capable of enhancing thermal stability.675,676 The five mutations were M135L, Q277L, S317P, S406P, and S413P. Only the serine to proline mutations were solvent exposed, however. The former two mutations appear to improve thermal stability through improved hydrophobic interactions. In addition to proximity to the surface, serine to proline mutations, as in the 3C6P variant, have long been reported to contribute stability in loop regions by restricting conformational freedom.677−679 In HJPlus, proline substitutions appear to be well-tolerated, though they do not always enhance stability. In fact, the three identified in the 3C6P variant were the only serine to proline mutations to significantly improve thermal stability. While this particular approach to engineering enhanced stability while maintaining activity was effective, it is difficult to develop a general principle by which other proteins may be modified to the same effect.

7.2. Catalytic Function

Structural and biochemical studies have been essential to elucidate the catalytic mechanism of GH6 enzymes, although many questions still remain. To date, these studies have enabled a general hypothesis for the unusual single-displacement catalytic mechanism adopted by GH6s. In the section to follow, we summarize our current understanding of the GH6 catalytic mechanism including the roles of key residues in the active site, the role of water in catalysis, aromatic residue function in substrate binding, and conformational changes along the processive cycle.

Four residues are well-conserved across GH6 enzymes: the amino acids homologous to TrCel6A Asp221, Asp175, Asp263, and Asp401 (Figure 43).655 Residues homologous to Asp221 putatively operate as the catalytic acid in conjunction with the pKa modifying and potential “indirect” catalytic aspartate Asp175. Several residues, including Asp263 and Asp401, have been proposed as the catalytic base, though definitive evidence remains elusive. Additionally, Tyr169 has been reported to play a critical role in transition state stabilization. The function of each of these residues is discussed below.

7.2.1. Catalytic Acid. Structural and biochemical characterizations overwhelmingly suggest the GH6 catalytic acid is an aspartic acid residue positioned to donate the proton to the substrate leaving group. Rouvinen et al. observed that TrCel6A is active over a pH range 4.0−7.0, a range that allows both charged and uncharged aspartyl side chains.192 Analyzing the structural geometry, they suggested that Asp221 is likely to be protonated and is in position to act as the catalytic acid.192 Mutation of this TrCel6A residue to alanine results in complete loss of activity, where the differences in activity are not attributable to substrate binding.192,656 This finding is consistent across the GH6 family including bacterial and fungal representatives, as well as in EG and CBH functions (refs 194, 479, 574, 656, 658, 670, and 680−684). Structural studies have also captured the catalytic acid in two distinct orientations.194,656,659,670,683,685 While the two conformations have been attributed to different protonation states, Varrot et al. suggest the conformations may represent accommodations made within the active site to enable procession of the cello-oligomer.665

7.2.2. pKa Modifying Residues. In addition to the catalytic acid, it is generally accepted that the GH6 catalytic mechanism relies on an adjacent secondary aspartic acid that serves, at minimum, to prime the catalytic acid pKa. Upon solution of the first GH6 structure, Rouvinen et al. posited that Asp175 could force protonation of Asp221 and stabilize a transitional positive charge at the substrate ring oxygen in the −1 binding subsite.192 Site-directed mutagenesis and characterization of activity confirmed the D175A mutant displayed approximately 20% of the activity of wild-type, which suggests the residue plays a supporting role in catalysis.192 The putative role of Asp175 as a pKa modifying residue was again suggested by Koivula et al., who observed significantly reduced, but measurable, activity of the D175A mutant.656,684

Similar behavior has been observed in bacterial GH6 representatives as well. Studies characterizing the role of the homologous C. fimi Cel6A residue, Asp216, again suggested an indirect role of the residue in catalysis. As a result of the D216A mutation, Damude et al. observed reductions in activity of 18- to 1380-fold depending on the substrate, but they noted the mutants maintained a considerable amount of activity.661 The homologous residue in TfCel6A (Asp79) is also reported to
function as a pKa modifying residue despite its structurally observed location 11 Å away from the catalytic acid.680 Wolfgang et al. obtained pH activity profiles of the D79A, D79E, and D79N mutants demonstrating that Asp79 raises the pKa of the corresponding catalytic acid.

Besides modulating pKa and potentially acting as the catalytic base, as discussed in the next subsection, additional roles for Asp175 and its homologous residues have been suggested. Koivula et al. noted that Asp175 is ideally positioned to function in stabilization of the oxocarbenium-like transition state.574 In a later study, the authors used molecular modeling and structural studies with fluoroglycosyl ligands to investigate this hypothesis.656 They confirmed Asp175 stabilizes the exceptionally electron-deficient fluorine transition state and suggested this interaction also takes place with oligosaccharides. Interestingly, Vuong et al. found that the D226A TfCel6B mutant, homologous to Asp175, could hydrolyze a soluble substrate at the same level as wild-type producing large oligosaccharides, but the same mutant could not hydrolyze crystalline cellulose to produce cellobiose. This latter finding suggests this residue may play a role in processivity.682 Tamura et al. also suggest this residue plays a crucial role in active site loop functionality in CcCel6A and CcCel6C, as deletion enabled a significant conformational change (Figure 50).671

Asp263 in TrCel6A and its homologous residues have also been suggested to serve as pKa modifying residues. Structurally, this aspartic acid sandwiches the catalytic acid with an adjacent aspartic acid. Originally, Rouvinen et al. suggested that Asp263, or alternatively Asp401, may act as the catalytic base.192 Later, several studies successfully ruled out Asp263 as the catalytic base and suggested Asp263 likely increases the pKa of the catalytic acid.194,659,661 Site-directed mutagenesis and activity measurements of the homologous residue (Asp156) in TfCel6A support this proposal, with decreased activity found for D156A and D156N but not for D156E.680

7.2.3. Catalytic Base. The definitive assignment of a catalytic base in the inverting GH6 mechanism remains an open question. Recent evidence even suggests that GH6 enzymes do not require a direct catalytic base as part of their single-displacement hydrolytic mechanism (i.e., a residue that could accept a proton directly from the attacking nucleophilic water molecule).

The structural study of TrCel6A by Rouvinen et al. was one of the first instances in which Asp401 was proposed as the catalytic base.192 Spezio et al. later proposed that the homologous residue, Asp265, in TfCel6A would aid in activity by ensuring the enzyme’s charged state, effectively as the catalytic base.658 Damude et al. later used kinetic measurements to investigate the role of this residue, Asp392, in CfCel6A.661 After the catalytic acid mutant, the CfCel6A D392A mutant displayed the second-largest reduction in activity, with a decrease by a factor of approximately 3 × 10⁴ on both CMC and PASC. Combined with geometric considerations from the crystal structures of TrCel6A and TfCel6A, the authors posited that Asp392 (Asp401 in TrCel6A) serves as the catalytic base.

The Damude et al. study has been the source of much confusion over the years, as only a handful of other studies suggest this residue causes such a drastic decrease in catalytic activity. One of the first suggestions that TrCel6A’s Asp401 may not actually be the catalytic base was put forth by Koivula et al., who noted that the salt bridge in which Asp401 participates would likely interfere with proton abstraction from the attacking water molecule.574 The presence of water molecules near the catalytic center led some researchers to briefly consider the possibility that catalysis proceeds via a Grotthuss mechanism in GH6s.170,659 However, the prospect was ultimately dismissed (and then later re-examined) on the basis of the Damude et al. study identifying the Asp401 equivalent as a catalytic base.

In a comprehensive examination of the aspartic acids of TfCel6A, Wolfgang et al. determined that various Asp265 mutants (homologous to TrCel6A Asp401) retained anywhere from 2% to 10% of wild-type activity on CMC and PASC.680 This suggested that the residue was not playing a direct role in catalysis, as a catalytic base would. To test whether Asp265 and Asp79 (the pKa modifying residue) functioned cooperatively as a base providing residual activity when only one was mutated, the authors tested the activity of the double mutant D79N/D265N. The observed results were similar to what would be expected from a cumulative, independent decrease rather than a marked decrease resulting from removal of the catalytic base.

Molecular modeling studies were particularly insightful in developing our current understanding of how GH6 enzymes proceed without a traditional catalytic base. Modeling the TrCel6A transition state as an oxocarbenium cation, Koivula et al. scanned several initial configurations of the active site to identify potential catalytic base residues. However, the authors concluded there were no suitable candidates to act as a direct base. Some of the models revealed conformations wherein water molecules were positioned so as to attack the anomeric carbon of the oxocarbenium cation. The Ser181 side chain and the Asp401 backbone stabilized one water molecule, which also connected to the Asp175 side chain via a second water molecule. After bond cleavage in the simulation, the distorted −1 sugar ring relaxed to the chair conformation and moved slightly out of the active site. Koivula et al. thus suggested there is little need for a protein-based catalytic base and instead suggested a proton could be transferred from the nucleophilic water to Asp175 by means of a second water in a Grotthuss-type mechanism; thus, Asp175 could act as an indirect catalytic base via a second water molecule. In addition to reviving the notion of a water-mediated Grotthuss mechanism, the authors also predicted the 2,5B conformation of the oxocarbenium cation intermediate. Subsequent structural studies capturing transition state intermediates later confirmed this for other GH6 enzymes.665,666 To date, the assignment of Asp175 as the catalytic base is the most plausible hypothesis that has been put forth.

Because definitive evidence for the role of Asp175 is still lacking, the search for a suitable GH6 catalytic base continues. In TfCel6B, Vuong et al. mutated all potential catalytic base residues within 6 Å of the −1/+1 subsites to an alanine.682 The authors suggested that an exogenous nucleophile such as sodium azide could partially “rescue” enzyme activity in the absence of a catalytic base and thus tested this hypothesis. Sodium azide did not “rescue” activity for any of the mutants nor was a catalytic base identified. However, the catalytic relevance of each of the mutated residues was confirmed. Activity of the D226A/S232A (equivalent to TrCel6A Asp175 and Ser181) double mutation was partially “rescued” by low concentrations of sodium azide. The authors suggest that the azide ion activates a water molecule that performs hydrolysis. This “rescue” suggests Asp226 and Ser232 activate a catalytic water via a proton-transferring network as a means of hydrolysis. The TfCel6B structure of the D226A/S232A double mutant later supported this hypothesis.686
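
The double-mutant reasoning above (D79N/D265N in TfCel6A) rests on a simple expectation: if two mutations act independently, their residual activities multiply, which is equivalent to their apparent free energy penalties adding. The sketch below illustrates that arithmetic only. The function names and the input fractions are hypothetical, the source does not report the individual values used here, and equating relative activity with a ratio of rate constants is an assumption.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ/(mol*K)


def _check_fraction(frac: float) -> None:
    """Residual activity must be a fraction of wild-type in (0, 1]."""
    if not 0.0 < frac <= 1.0:
        raise ValueError("residual activity must be in the interval (0, 1]")


def expected_double_mutant_fraction(frac_a: float, frac_b: float) -> float:
    """Residual activity expected if the two mutations act independently."""
    _check_fraction(frac_a)
    _check_fraction(frac_b)
    return frac_a * frac_b


def apparent_barrier_increase(frac: float, temperature_k: float = 298.15) -> float:
    """Apparent increase in activation free energy (kJ/mol).

    Assumes relative activity equals the ratio of mutant to wild-type
    rate constants, so the penalty is -RT ln(fraction).
    """
    _check_fraction(frac)
    return -GAS_CONSTANT_KJ * temperature_k * math.log(frac)


# Hypothetical single-mutant residual activities, for illustration only.
single_a = 0.10
single_b = 0.05
double = expected_double_mutant_fraction(single_a, single_b)
print(f"expected double-mutant activity: {double:.3%} of wild-type")
print(f"summed apparent penalty: {apparent_barrier_increase(double):.1f} kJ/mol")
```

Under this model, an observed double-mutant activity close to the product of the single-mutant fractions is what independent effects would produce, whereas a much larger loss would point to the two residues sharing a role.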

7.2.4. Catalytic Priming of Ring Distortion. While many residues are likely involved in the catalytic priming of the substrate for catalysis, the centrally located tyrosine has been shown to function in steric distortion of the −1 glucose moiety.574 An early investigation by Koivula et al. effectively established the role of Tyr169 and homologous tyrosines through crystallography and biochemical characterization of Tyr169 mutant activity.574 Through structural evidence and characterization of pH dependence of cellotetraose hydrolysis in the wild-type and Y169F variant, the authors proposed that Tyr169 contributes to ring distortion in the −1 subsite as well as helps to ensure protonation of the catalytic acid.574 Zou et al. later confirmed Tyr169 aids in distortion of the −1 glucose but noted that the Y169F mutant exhibits the same distortion.194 Later, molecular modeling studies confirmed the strongly stabilizing nature of Tyr169 in maintaining the distorted 2SO intermediate conformation.656

Similar results in homologous GH6s suggest the tyrosine at the −1 binding subsite has conserved function. In the bacterial TfCel6A, Tyr73 mutants display tighter binding and lower hydrolytic activity in comparison to wild-type.687,688 Barr et al. proposed that the additional volume created by these mutations allows the −1 subsite sugar ring to bind in the relaxed 4C1 conformation, indicating the role of Tyr73 could be to align the −1 sugar into an optimal position relative to the catalytic residues.687 Alternatively, they noted that Tyr73 could stabilize an intermediate oxocarbenium ion through cation−π interaction, consistent with the observation that Y73S had lower activity than Y73F.687,689 In TfCel6A, Tyr73 is farther from the catalytic acid than homologous residues in TrCel6A, and thus would only be able to aid in pKa modulation through conformational changes of the active site. The same roles have been reported for the −1 binding site tyrosines of HiCel6A and TfCel6B as well.666,682,686

7.2.5. Substrate Binding. As with most GHs, family 6 cellulases exhibit aromatic-lined substrate binding sites in both cleft and tunnel architectures. The binding sites have been historically defined by the number of glucose moieties that may bind along the length of the tunnel or cleft. Since GH6 enzymes are nonreducing end specific, positive numbers denote the “substrate” side of the active site, and negative numbers denote the “product” side. The only known exception to this generalization appears to be the GH6 CBH from T. emersonii that has been reported to act from both the reducing and nonreducing end of cellulose.513 TrCel6A exhibits at least six binding subsites from +4 to −2.525,606 Four of these sites, +2 to −2, were observed in the seminal TrCel6A structure and have been extensively characterized over the years.192,574,690−693 The overall number of binding sites across GH6s varies from at least 4 to as many as 8 depending on the enzyme.574,665,667,694,695 However, generalizations as to the roles of residues in the core binding sites +2 to −2 are insightful and will be discussed here.

Product side binding sites, −2 and −1, have been shown to tightly bind the ligand, aiding in both positioning the glycosidic linkage for catalytic cleavage and stabilization of the cleaved dimeric product. In TrCel6A, the −2 subsite exhibits a carbohydrate−aromatic stacking interaction mediated by a tryptophan residue, Trp135. Little experimental evidence characterizing this binding site in TrCel6A is available; however, molecular modeling studies suggest the tryptophan residue tightly binds the ligand both to maintain the −1 ring distortion and for product stability.578 This strong association of the product with the −1 and −2 subsites may in fact contribute to the moderate product inhibition observed in TrCel6A.344,581,599,606,693 Studies of bacterial GH6s TfCel6A and TfCel6B have been influential in defining protein−substrate interactions in the −2 binding site.687,696 In TfCel6A, site-directed mutagenesis of a lysine residue to a histidine in the −2 binding site was shown to improve performance on PASC over filter paper, suggesting this residue plays a rate-limiting role in hydrolysis of crystalline cellulose.696 In examining TfCel6B, Vuong and Wilson identified an asparagine, Asn282, which significantly impacts the ability of the enzyme to hydrolyze crystalline substrates. Mutations of this asparagine to alanine and aspartate to alanine increased overall activity on BMCC and filter paper and increased processivity.528 Overall, characterization studies suggest the −2 binding site is important for catalysis, processivity, and product binding.

Distortion of the −1 glucopyranose ring in the active sites of GHs is thought to be necessary for catalysis, as discussed previously. In GH6s, this distortion occurs on the product side of the cleavage site. A number of studies focus on understanding both how enzyme structure contributes to distortion and energetic contributions from distortion. Study of the first cellulase structure, TrCel6A,192 as well as several subsequent structural studies, observed tilting of the −1 glucopyranose ring out of the plane of the other substrate units and noted puckering of the ring from the relaxed chair conformation.194,209,574,665,666,683 Early molecular modeling simulations of TfCel6A in complex with cellotetraose focused on investigating the conformation of the glucosyl unit bound in the −1 subsite, confirming the stability of the −1 puckering.692 Koivula et al. noted that while TrCel6A subsites −2, +1, and +2 contain tryptophan residues, the −1 subsite is instead composed of a lysine, aspartate, and a tyrosine (described above), all of which likely contribute to the distortion of the sugar.574 The authors also proposed that binding in the +2 subsite causes strain that manifests as ring distortion in the −1 subsite based on the 700-fold higher specificity constant for cellotetraose over cellotriose. Consistent with the proposal that interactions beyond the −1 subsite are involved in ring distortion and productive substrate binding,574,690 Payne et al. used free energy calculations to find that mutations to tryptophan residues in multiple binding sites affected the conformation of the −1 sugar.578 Specifically, mutating the +2, +1, and −2 tryptophans (Trp269, Trp397, and Trp135, respectively) to alanine resulted in the −1 sugar ring relaxing to a 4C1 conformation. Mutation of the +4 site tryptophan, Trp272, did not cause ring relaxation, however. Using molecular simulation, Bu et al. further revealed that the relative stability of the distorted conformation is sensitive to pH, with the 2SO conformation favorable at the TrCel6A optimum pH of 5.479

Exo-mode initiation in processive enzymes is thought to occur via the acquisition of free chain ends, which are threaded into the active site. Accordingly, the enzymes appear to exhibit machinery enabling acquisition through carbohydrate-π stacking with an aromatic residue at the entrance of the enzyme active site. Koivula et al. hypothesized that TrCel6A exhibited an additional two binding sites, +3 and +4, wherein Trp272 in the +4 binding site plays a key role in chain acquisition.525 Trp272 mutants retained near wild-type hydrolytic capability on amorphous cellulose but were significantly impaired on BMCC. At the same time, the binding affinity of the mutants was similar to wild-type, confirming the hypothesis that the residue was critical to chain acquisition. Notably, this
tryptophan is conserved in GH6 CBHs but not in EGs. Zhang et al. later confirmed the TfCel6B tyrosine functions in a similar manner.526

Figure 51. Hypothesized processive catalytic cycle for exo-initiated attack. Pre-slide mode: The “more open” TrCel6A structure, 1QK2, represents the “pre-slide” mode conformation of a processive GH6.194 Loop A (pink cartoon) is in the open conformation with Ser181 far from the catalytic center. The catalytic acid Asp221 is hydrogen bonding with Asp175, the pKa modifying residue and putative catalytic base. Residues thought to participate in catalysis, directly or indirectly, are shown in green stick. TfCel6B was captured in a similar “more open” conformation with a full-length cello-oligomer in the 4AVO structure.686 Thus, we use the 4AVO cello-oligomer (cyan stick) in the pre-slide mode, slide mode, and Michaelis complex panels. Slide mode: The cello-oligomer processes through the active site, filling the −1 and −2 binding sites. The protein (1QK2) has not yet changed conformation in slide mode, allowing the oligomer to pass unobstructed. Michaelis complex: The protein undergoes a conformational change positioning loop A near the catalytic center, represented here by the TrCel6A 1QJW structure.194 In the Michaelis complex, the backbone of Asp401 and the side chains of Asp175 and Ser181 form a network mediated by two water molecules, red spheres. The catalytic residues illustrating the Michaelis complex have been selected from various structures to represent this intermediate state (1QJW: Asp175, Asp401, Ser181, and water molecules; 1QK2: Tyr169; 1HGW: Asp221 from chain B).656 Substrate−product complex: Hydrolysis occurs, putatively via the Grotthuss mechanism, breaking the glycosidic linkage. The protein (1QJW) maintains the tightened active site conformation throughout hydrolysis, producing an α-cellobiose product molecule. The product and substrate ligand shown is that of the TfCel6B 4AVN structure with a modeled −1 glucose based on the 4AVO ligand.686 The product molecule is expelled, and the processive catalytic cycle is reset.

7.2.6. Product Inhibition. GH6 product inhibition has not been as extensively examined as in complementary GH7 cellulases (see section 6.3). Several early studies examined T. reesei cellulase product inhibition, using cocktails rather than purified enzymes.581,599 These studies reached the conclusion that cellulases in the cocktail were indeed inhibited by the cellobiose product, but it became clear this effect was likely dominated by the interaction of TrCel7A with cellobiose rather than TrCel6A. At 25 °C, TrCel6A has a reported cellobiose inhibition constant (Ki) of approximately 500 M⁻¹,301 while the inhibition constant for TrCel7A with cellobiose is roughly 50 000 M⁻¹.607 Following on from this work, Teleman et al. developed a hydrolysis progress curve model including product inhibition parameters and went on to suggest the TrCel6A cellobiose inhibition constant may be closer to 5000 M⁻¹ than 500 M⁻¹.691 Nevertheless, cellobiose inhibition in TrCel6A pales in comparison to values observed for TrCel7A. Moderate levels of glucose inhibition of TrCel6A have also been observed with an inhibition constant of 150 M⁻¹.301,691 The inhibition of TrCel6A has been described as noncompetitive inhibition and can reduce cellotriose hydrolysis by a factor of 30.691 However, glucose inhibition does not appear to be universal across GH6s, as glucose did not significantly inhibit TfCel6A hydrolysis of CMC (ref 697) or Avicel (ref 698).

Attempts to understand molecular contributions to cellobiose product inhibition in GH6s are accordingly limited. Molecular simulation suggested that the strong substrate binding at the −2 subsite likely contributes to TrCel6A product inhibition.578 Site-directed mutagenesis of TfCel6B active site residues has been shown to alleviate cellobiose inhibition.526 However, this improvement in hydrolysis comes at the expense of the ability to degrade crystalline cellulose. In the same study, a glycine to proline loop mutation inexplicably increased activity on
1377 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Chemical Reviews Review

amorphous and crystalline substrates while increasing cellobiose inhibition. A molecular-level explanation of this observation remains elusive.
7.2.7. Processive Catalytic Cycle. The inverting catalytic mechanism described here is common to all GH6 enzymes. CBHs in the GH6 family are also thought to exhibit an encompassing “processive catalytic cycle” wherein the hydrolysis step is one part of the entire process of successively cleaving crystalline polysaccharides (see section 6.2). In this section, we outline the steps of the hypothesized processive catalytic cycle. We note that this is a hypothesized mechanism and that some of the steps, particularly “pre-slide mode” and “slide mode”, are not likely accurate indications of endo-initiated attack by GH6s. The cycle presented here includes the original proposal by Zou et al., described above, and combines additional structural-based evidence from processive CBHs that followed the Zou et al. study.194,573 Figure 51 illustrates the hypothesized processive catalytic cycle using structures available in the literature. The putative steps include the following: (1) In the pre-slide mode, the processive catalytic cycle begins after the GH6 has acquired a free chain end from the nonreducing end of the polysaccharide as part of the initial processivity event.20 The ligand occupies the substrate binding sites (+1 through +4 or beyond, depending on the enzyme), while the product binding sites (−1/−2) remain unoccupied. At this point, the active site loops are in the “open” conformation such that the side chain of the serine implicated in catalysis is more than 10 Å away from the catalytic center. The catalytic acid (Asp221) and pKa modifying/putative base aspartate (Asp175) hydrogen bond with each other. (2) In the slide mode, the cello-oligosaccharide proceeds through the active site tunnel by a cellobiose unit filling the −1 and −2 binding sites. The −1 glucose moiety becomes distorted through steric interactions with a tyrosine. Asp221 is protonated during this step but maintains its interaction with Asp175. Interaction of this latter aspartate with a neighboring arginine allows it to maintain the charged state. (3) In the Michaelis complex, a conformational change in loop A breaks the interaction of the catalytic acid with Asp175. As a result, the protonated catalytic acid rotates toward the glycosidic linkage. The active site tunnel closes, and the serine residue of loop A (Ser181) moves toward the catalytic center. Two water molecules are stabilized by newly formed interactions with serine and two aspartates (Asp175, Asp401). (4) In the substrate−product complex, hydrolysis occurs via the Grotthuss mechanism leaving an inverted α-cellobiose product in the −1 and −2 subsites. The product is expelled from the active site, potentially with a conformational change of loop A, and the processive catalytic cycle begins anew.
7.2.8. Synergistic and Processive Function. Efficient polysaccharide deconstruction has long been known to require a complex mixture of CBHs, a host of EGs, and accessory enzymes all working together through complementary function.32 Synergistic activity between TrCel6A and TrCel7A was reported as early as 1980 prompting an entire body of research devoted to understanding the mechanisms behind this phenomenon.436 Chanzy and Henrissat hypothesized that orthogonal action of TrCel6A and TrCel7A could be responsible for their synergy.302 As discussed previously, it eventually became apparent that GHs do not exist as strictly delineated “endo” and “exo” active enzymes suggesting additional possibilities for synergistic function of the two enzymes.304 Ståhlberg et al. suggested T. reesei CBHs are capable of two essential modes of cellulolytic degradation.304 In line with its classification as a CBH, TrCel6A may acquire a free chain end and processively cleave hydrolytic linkages. However, on the basis of production of reducing ends on both PASC and filter paper as well as reversible binding, TrCel6A was reported to exhibit a high degree of endo-initiation activity allowing the enzyme to randomly hydrolyze internal glycosidic bonds. In TrCel6A, this ability to cleave internal linkages similar to that of an EG has been attributed in part to the flexibility of the tunnel-forming active site loops.304 Later, Harjunpää would report that TrCel6A exhibits strictly processive action, finding no cellotetraose yield from cellohexaose, as would be expected if the enzyme randomly cleaved glycosidic linkages.693 These opposing findings hint at the difficulty in describing exo-activity and processive function using accessible characterization techniques. Currently, no consensus approach has been widely adopted in examination of GH6 exo-activity, and thus, comparison of findings across laboratories is difficult at best. Difficulties associated with cellulase processivity measurements have been described above in section 6.2, and a recent review outlines limitations of available techniques, which are currently substantial.521 With this in mind, we describe our evolving understanding of the mechanism by which GH6s deconstruct cellulose, controversy in description of processive function, and technologies being developed to move forward in our understanding of GH6 processive function.
Description of the endo- or exo-initiation character of GH6 cellulases has long been qualitatively determined through soluble reducing sugar assays, where enzymes exhibiting a greater degree of exo-initiation activity produce fewer insoluble sugars relative to more endo-active enzymes.523 According to this approach, TfCel6A exhibits a high degree of endo-activity relative to the high exo-active character of TfCel6B. In this same study, TrCel6A and TfCel6B were reported to be functionally equivalent in synergistic mixtures of cellulases indicating a similar mechanism in degradation of cellulose. The conclusion that TrCel6A is an exo-active cellulase was long unquestioned. However, recent characterization and visualization studies now suggest GH6 CBHs may also exhibit strongly endo-characteristic activities.304,476,546,600,699,700
Several subsequent visualization studies support the hypothesis that CBHs TrCel6A and HiCel6A exhibit a high degree of endo-initiation. TEM captured clear images of the effects of incubating HiCel6A, HiCel7A, and a mixture of the two on digestion of BC ribbons.540 Upon incubation with HiCel7A, Boisset et al. observed thinning of the cellulose ribbons indicative of processive degradation of the polysaccharide. When the BC was incubated with HiCel6A, the fibrils were also thinned in several locations, but more importantly, the fibrils were cut into shorter fragments. The latter behavior is suggestive of endo-initiated degradation of the cellulose fibrils. Together, HiCel6A and HiCel7A exhibit exo/exo synergy, working from opposite ends of the polysaccharide and using complementary modes of degradation. Boisset et al. illustrated this phenomenon in a schematic reproduced here in Figure 52.540 The authors made the important observation that choice of substrate greatly affects the outcome of the experiment. They argued that the BC ribbons are the optimal substrate for observing the endo-processive action of HiCel6A; this substrate does not exhibit the abnormally high degree of structural defects of pretreated substrates such as BMCC, Avicel, or even Valonia microfibrils that may preclude endo-activity.
Recently, real-time visualization of cellulase action on cellulose using high-speed AFM has been reported.476 Igarashi

et al. investigated the synergistic effects of TrCel7A and TrCel6A finding, as has often been reported, that the two act in a synergistic exo/exo fashion. Interestingly, the authors noted that when an ammonia pretreated cellulose polymorph, cellulose III_I, was incubated with only TrCel6A for 3 and 8 min, little change in morphology of the microfibril was observed.476 However, subsequent addition of TrCel7A resulted in remarkably faster digestion than with either enzyme alone. It seems reasonable to conclude that TrCel6A may function in a highly endo-active mode on cellulose III_I creating new chain ends for TrCel7A that would otherwise be unavailable. It has also been speculated that TrCel6A may even synergistically remove obstacles from the path of TrCel7A resulting in enhanced hydrolysis. One of the more intriguing results of this work relative to TrCel6A can be seen in the movies supporting the manuscript; in isolation, TrCel6A appears to be nonprocessive. Over the time span of the high-speed AFM measurements, the enzyme molecules do not deviate significantly from their initial positions. This latter finding is counter to the community’s working hypothesis that TrCel6A is a processive cellulase and highlights the need for accurate GH6 processivity assays.
Much of the difficulty in uniformly characterizing GH6 activity, degree of processivity, and initial mode of attack arises from the fact that there are relatively few model substrates available for nonreducing end specific enzymes. For decades, reducing end specific assays have taken advantage of chromogenic and fluorogenic cellobiosides and lactosides.701−703 The specific chemistry of the reducing end of the cellulose and cello-oligomers allows for attachment of reporter molecules, whereas the nonreducing end lacks this unique chemistry.521 Use of reducing-end attached chromogenic or fluorogenic molecules in examining GH6 activity is generally uninformative, as the enzymes exhibit relatively poor hydrolysis of the heteroglycosidic linkages or preferentially cleave the homoglycosidic linkage of longer model substrates. A recent advance in fluorogenic substrates for characterization of GH6 activity made use of molecular docking studies to understand how GH6s bind 4-methylumbelliferyl-β-D-cellobioside substrates and suggested rational improvements based on the findings.704 Wu et al. discovered that GH6 enzymes may nonproductively bind with the traditional umbelliferyl substrate and that substitutions at the C4 or C6 position of the umbelliferyl motif greatly improved hydrolytic turnover of the heteroglycosidic linkage.704 The 6-chloro-4-methyl-umbelliferyl-β-D-cellobiose reporter molecule was deemed to be the most successful improvement, as the molecule demonstrated an appropriate balance of activity, solubility, and fluorescence in comparison with other C4/C6 modifications. Nevertheless, GH6 turnover of these molecules remains low when compared to GH7 activity. The authors optimistically interpreted this to mean there are additional gains to be had in design of GH6 reporter molecules.

Figure 52. Schematic of the action of HiCel7A and HiCel6A on BC ribbons. The two enzymes exhibit complementary function, working from the reducing end, R, and the nonreducing end, NR, of the microfibril, respectively. (A) The primarily exo-acting HiCel7A sharpens the microfibril into long, thinner ribbons. (B) HiCel6A exhibits a dual mode of action, thinning the ribbons in certain regions but also cutting the fiber using endo-initiated hydrolysis. Reprinted with permission from ref 540. Copyright 2000 American Society for Microbiology.

7.3. Glycosylation
To our knowledge, TrCel6A is the only GH6 for which glycosylation has been explicitly characterized. Hui et al. used a combination of capillary liquid chromatography-electrospray and matrix-assisted laser desorption and ionization time-of-flight mass spectrometry to locate N-linked glycans on the CD and quantify the number of O-linked glycans attached to the CBM and linker domain.418 The TrCel6A enzyme was purified from the T. reesei RUT-C30 strain fermentation broth and thus closely reflects the native glycan arrangement. The cleaved CBM-linker domain, from residues 1 to 82, exhibits between 39 and 46 O-linked mannose residues appended to the threonines and serines in this region. This is significantly higher than the number of O-linked glycans observed on the linker of TrCel7A, related to the shorter GH7 linker. As the linker length across GH families appears to be conserved,417 this finding may be true of other fungal GH6 enzymes exhibiting linker regions. On the CD of TrCel6A, Hui et al. report there are three N-linked glycan sites at Asn14, Asn289, and Asn310. Of these three, the Asn310 glycan was reported to be a high-mannose glycan with anywhere from 7 to 9 mannose residues attached to the base GlcNAc2.418 This finding was in line with observations made in crystallization of the first TrCel6A structure. Rouvinen et al. found that the TrCel6A CD likely exhibited N-linked glycans at the Asn289 and Asn310 residues, though the only assignment made was a single GlcNAc at Asn310.192 The crystal structure also reported the CD as having O-linked glycans at residues Thr87, Thr97, Ser106, Ser109, Ser110, and Ser115. Additional crystal structures of the natively expressed TrCel6A CD also exhibit various combinations of the attached glycans.194,656,704
Insights into glycosylation of GH6 cellulases other than TrCel6A are limited to findings from structural reports. Unfortunately, the C. cinerea and C. thermophilum GH6 structures were obtained through recombinant expression in E. coli; thus, the structures do not exhibit glycosylation.670−672 The HiCel6A structures were expressed in an alternative fungal host, A. oryzae, and do exhibit some glycosylation captured in the structures. Varrot et al. reported the CD exhibits an N-linked GlcNAc at Asn141 and at least two O-linked mannose residues at Ser127 and Thr118.659 It is likely the CD may exhibit more or

rather different glycans on the surface of the CD in the native host, but the extent to which this is true is currently publicly unavailable.
7.4. Protein Engineering
As with their GH7 counterparts, engineering GH6 CBHs toward higher thermal stability has received significant attention from industrial and academic laboratories. Early efforts focused on improving GH6 stability were targeted toward bacterial representatives. Zhang et al. attempted to improve TfCel6B thermostability by introducing disulfide bonds.526 They created a TfCel6B G234S-G284P double mutant that showed a 2-fold increase in activity on filter paper, but the benefit did not carry over to synergistic mixtures.526 Ai and Wilson attempted to engineer a more thermostable TfCel6B by introducing a disulfide bond by means of the N233C-D506C double mutant (residues Asn233 and Asp506 in TfCel6A correspond to Asn182 and Arg410 in TrCel6A).705 The circular dichroism spectra of the wild-type and double mutant were identical, indicating no change in the final structure. The mutated residues joined the two loops covering the active-site cleft but did not inhibit activity on CMC or PASC. Moderate gains in thermal stability were attained through the double mutation with 100% of activity maintained at 50 °C for 20 h compared to 85% activity retention in the wild-type.705 Unfortunately, the mutations caused a decreased protein yield attributed to lower in vivo expression, prohibiting extension to commercial applications.706
In efforts to increase thermostability in fungal GH6 cellulases, Lantz et al. targeted more than 100 nonconserved TrCel6A residues for mutagenesis. Therein, they concentrated primarily on the CD surface residues hypothesizing that these residues benefit the least from hydrophobic packing stability and may exhibit the largest gains; some linker and CBM sites were also examined.707 Through this comprehensive approach, gains in thermal stability of nearly 7 °C for TrCel6A (and 15 °C for TrCel7A) were achieved. These gains in thermal stability were significant enough to raise the stability of the primary hydrolytic components to that of the more thermally stable EGs TrCel5A and TrCel3A in a commercial enzyme cocktail. An added side benefit of the thermal stability mutations was a moderate gain in TrCel6A activity.
Additional efforts to improve GH6 thermal stability include a significant body of work from the Arnold group, as previously described in the GH7 section.674,708−710 Heinzelman et al. first employed structure guided enzyme recombination of HiCel6A, TrCel6A, and the bacterial GH6 CtCel6A.709 From a sample set of 48 chimeras secreted by a glycosylation-deficient strain of S. cerevisiae, five chimeras exhibited half-lives of inactivation (63 °C) higher than the most thermostable wild-type parent (HiCel6A). None of the three chimeras were more active on PASC at 50 °C than TrCel6A, the most active parent, but each were active at higher temperatures than the parents, retaining activity up to 70 °C. The most thermostable parent, HiCel6A, is inactive above 57 °C. They named the most active chimera HJPlus, which was created by substituting the three “blocks” predicted to be most stabilizing into TrCel6A. In a follow-up study, Heinzelman et al. produced additional GH6 chimeras that their regression model predicted would encode thermostable chimeras.708 The block that most strongly contributed to thermostability contained 10 differences between two parents with different thermostabilities. Mutating each residue in the block identified one mutation, S313C in HiCel6A, which reduced the T50 (temperature at which an enzyme loses 50% of activity during a 10 min incubation) by approximately 10 °C. They tested Cys-Ser mutations on parent enzymes and found an approximately 8 °C increase in T50 for TrCel6A C400S (where the numbering is for the full-length enzyme with a CBM-linker). These corresponding Cys to Ser mutations further improved secretion of the functional enzyme. They proposed that the increased stability of the Cys-Ser mutation could be due either to stronger hydrogen bonding interactions or due to steric reasons, as the Cys residue resides closer to the carbonyl of Pro339 than Ser.
Wu and Arnold further increased the HJPlus chimera thermostability via directed evolution, creating the new chimera 3C6P, which contained seven mutations over HJPlus: S30F, V128A, M135L, Q277L, S317P, S406P, and S413P.674 The mutations creating the 3C6P variant were tested one-by-one in TrCel6A. All mutations except Q276L (equivalent to Q277L in 3C6P) either stabilized or did not affect the T50 of the parent TrCel6A enzyme. The five thermostabilizing mutations in 3C6P are all near the surface, with the Ser to Pro mutations being the only solvent exposed mutations in loop regions. Wu and Arnold proposed that they limit conformational freedom without straining the backbone (the Cα and Cβ atoms in TrCel6A and HJPlus align well with the Cα and Cβ of the corresponding prolines in 3C6P). 3C6P displayed a half-life of 280 min at 75 °C and a T50 of 80.1 °C, a 15 °C increase over HiCel6A and 20 °C increase over TrCel6A. The 3C6P activity at its optimal temperature of 75 °C offers a 10-fold reduction in hydrolysis time compared to HiCel6A activity at its optimal temperature of 60 °C on Avicel (at 60 h). Crucially, it continued to demonstrate synergy, and they found that a mixture of thermostable Cel6A (3C6P) and a thermostable variant of Cel7A from the same group650 releases 1.8 times more cellobiose than a wild-type mixture at their respective optimum temperatures of 70 and 60 °C from Avicel.
On the basis of results from their previous studies wherein free cysteine residues were found to be detrimental to stability in GH6 cellulases,708,709 Wu et al. investigated the role of paired and free cysteine residues in GH6 thermal degradation.710 They noted that some GH6 enzymes contain unpaired, free cysteine residues, including the thermostable chimera 3C6P Cys246. Mutants (C246A, C246G, C246L, and C246S) lacking the free cysteine showed increased extreme (90 °C) temperature tolerance. They noted that HiCel6A and TrCel6A have a free cysteine residue and lower temperature tolerance than wild-type CtCel6A, which does not have a free cysteine. As with 3C6P, mutants of HiCel6A and TrCel6A in which their free cysteine residues were removed showed improved tolerance of high temperature incubation. Further study of the C246G mutant indicated that it was able to retain activity after high temperature incubation due to disulfide-bond-assisted refolding to a productive conformation. They proposed that GH6 thermal inactivation is due to disulfide-bond degradation and thiol-disulfide exchange that results in misfolding, and that removing free cysteine residues limits degradation by the latter mechanism.
GH6 enzymes display a diverse range of pH optimum and activity ranges (Table 9).192,499,680,711,712 Nevertheless, it is frequently desirable from an industrial perspective to understand both the molecular contributions to activity at various pHs and methods for engineering proteins more tolerant of pH conditions beyond optimal. Several groups have investigated the effect of substrate binding on the pKa of the enzyme as well as the effect of pH on active site loops and activity. Damude et al. noted that CfCel6A displays a basic shift in pKa with substrate
Table 9. Summary of Biochemical Characterizations of Fungal GH6 Cellulases
organism | expression host | enzyme | pH opt | temp opt (°C) | substrate for opt | substrate specificity | ref | comments
Agaricus bisporus D649 | Saccharomyces cerevisiae | Cel3 | | | | CMC, filter paper, PASC, barley β-glucan | Chow et al., 1994714 |
Aspergillus nidulans FGSC A4 | Pichia pastoris X-33 | CBHII | 5.5 | 57 | CMC | soluble CMC, cello-oligosaccharides Glc3/Glc4/Glc5/Glc6, barley β-glucan and lichenan, Avicel | Bauer et al., 2006487 |
Chaetomium thermophilum | Pichia pastoris | Cel6A | 4 | 50 | pNPC | pNPC, BMCC, filter paper, CMC | Wang et al., 2012673 |
Chrysosporium lucknowense | | Cel6A | 4.5 | 60 | CMC | CMC, labeled CMC, β-glucan | Emalfrab et al., 2003715 | formerly EG6
Coprinopsis cinerea 5338 | Escherichia coli | Cel6A | 8.0 | | PASC | PASC, Avicel, cellotriose, cellobiose | Liu et al., 2009,669 Tamura et al., 2012671 | assay conditions 30 °C
Coprinopsis cinerea 5338 | Escherichia coli | Cel6B | 7.0 | | PASC | PASC, CMC, Avicel | Liu et al., 2009 | assay conditions 30 °C
Coprinopsis cinerea 5338 | Escherichia coli | Cel6C | 8.0 | | PASC | PASC, CMC, Avicel | Liu et al., 2009 | assay conditions 30 °C
Humicola insolens | | Cel6A | | 60 | Avicel | Avicel; BC ribbons | Boisset et al., 2000,540 Moriya et al., 2003,716 Wu and Arnold, 2013674 | assay conditions pH 5.0
Humicola insolens | Saccharomyces cerevisiae | Cel6B | | | | CMC, PASC, Glc5, reduced Glc6 | Schou et al., 1993,442 Dalbøge and Heldt-Hansen, 1994,717 Davies et al., 2000663 | assay conditions 30 °C and pH 7
Irpex lacteus MC-2 | Pichia pastoris | CBHII (Ex-4) | 5 | 50 | PASC | Avicel, PASC | Toda et al., 2008718 |
Malbranchea cinnamomea | Aspergillus oryzae | Cel6A | | | | PASC, pNPC, Avicel | Wu et al., 2011,719 Xu et al., 2009500 | assay conditions 50 °C and pH 5.0
Neocallimastix patriciarum | Escherichia coli | Cel6A | 5.0 | 40 | Avicel | Avicel, CMC, PASC, lichenan | Denman et al., 1996711 | formerly CelA
Neocallimastix patriciarum J11 | Escherichia coli | CelA | 6.0 | 50 | barley β-glucan | barley β-glucan, lichenan, CMC, Avicel, PASC | Wang et al., 2013720 |
Orpinomyces sp. PC-2 | Escherichia coli | CelA | 4.3−6.8 | 50 | CMC | CMC, Avicel, PASC, lichenan, barley β-glucan, pullulan, Glc4 | Li et al., 1997712 |
Orpinomyces sp. PC-2 | Escherichia coli | CelC | 4.6−7.0 | 40 | CMC | CMC, Avicel, PASC, lichenan, barley β-glucan, arabinogalactan, araban, galactan, pullulan, gum arabic, pachyman, pustulan, cellotetraose | Li et al., 1997712 |
Orpinomyces sp. PC-2 | Escherichia coli | CelF | 5.8−6.2 | 40−50 | CMC | CMC, PASC, barley β-glucan, lichenan, Avicel | Chen et al., 2003721 |
Penicillium decumbens JU-A10 | | Cel6A | 5.0 | 50 | pNPC | pNPC, barley β-glucan | Gao et al., 2011722 |
Phialophora sp. G5 | Pichia pastoris | EgGH6A | 7.0 | 65 | CMC-Na | CMC-Na, barley β-glucan, Avicel, filter paper | Zhao et al., 2012723 | retained >40% activity at pH 4.0−10
Piromyces rhizinflatus 2301 | Escherichia coli | Cel6A | 6.0 | 37−45 | CMC | CMC, barley β-glucan, lichenan, xylan | Tsai et al., 2003724 |
Piromyces sp. E2 | Escherichia coli | Cel6A | | | | Avicel | Harhangi et al., 2003725 | assay conditions 39 °C and pH 6
Podospora anserina S mat+ | Pichia pastoris | Cel6A | 7 | 45 | Avicel | Avicel, CMC, Glc3/Glc4/Glc5/Glc6 | Poidevin et al., 2013326 |
Podospora anserina S mat+ | Pichia pastoris | Cel6B | 7 | 35 | Avicel | Avicel, CMC, Glc3/Glc4/Glc5/Glc6 | Poidevin et al., 2013326 |
Podospora anserina S mat+ | Pichia pastoris | Cel6C | 6 | 25 | Avicel | Avicel, CMC, Glc3/Glc4/Glc5/Glc6 | Poidevin et al., 2013326 |

present, shifting from 5.9 to 6.7 for the protonated group, and 5.7 to 6.3 for the deprotonated group. This shift was attributed to the substrate excluding water from the active site.695 Recently, Bu et al. used molecular modeling to gain insights into the effects of pH on GH6 protein structure and function. Molecular modeling confirms, as proposed in the Damude et al. study, that substrate binding results in a basic shift.479 Bu et al. further demonstrated that pH affects the substrate ring conformation in the −2 subsite and active site loop flexibility, with more flexibility found at the optimal pH.
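To give a feel for the size of the pKa shift reported by Damude et al., the following minimal Python sketch applies the Henderson-Hasselbalch relation to the quoted values (5.9 without substrate, 6.7 with substrate) for the group that must be protonated. It assumes the group titrates as a single independent site, which is a simplification not made in the source; the function name and the pH values sampled are illustrative choices only.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Henderson-Hasselbalch: fraction of a single ionizable group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


# Apparent pKa values quoted in the text for the group that must be protonated.
PKA_FREE = 5.9   # no substrate bound
PKA_BOUND = 6.7  # substrate bound

# Illustrative pH values (an assumption, not taken from the source).
for pH in (5.0, 6.0, 7.0):
    free = fraction_protonated(pH, PKA_FREE)
    bound = fraction_protonated(pH, PKA_BOUND)
    print(f"pH {pH:.1f}: protonated fraction {free:.2f} (free) vs {bound:.2f} (substrate bound)")
```

Under this simple model, a basic shift of the pKa means a larger share of the catalytic acid remains protonated at any given pH once substrate is bound.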
Engineering proteins for pHs beyond the optimum requires consideration of both the overall protein stability as well as the catalytic function. A particularly successful pH engineering effort by Wohlfahrt et al. demonstrates that TrCel6A can be altered so as to maintain both stability and wild-type-like levels of activity at significantly increased pH, similar to what would be encountered under basic lignocellulosic pretreatment strategies.713 The authors targeted three carboxyl-carboxylate pairs in a stable arrangement with low solvent accessibility yet near the flexible active site loops. Under alkaline conditions, the carboxylic acid group of these pairs would likely be deprotonated leading to repulsion of the side chain from the neighboring carboxylate. Mutants were designed replacing the carboxylic acid with an amide partner. Circular dichroism and tryptophan fluorescence confirmed the strategy effectively improved protein stability under alkaline conditions relative to wild-type. The triple mutant (all three carboxylic acids replaced by amides) was the most stable of the proteins examined. The half-life of this protein was extended by 4-fold at pH 8, and activity on cellotetraose was unperturbed. Interestingly, the activity on BMCC was shown to be the same or higher than wild-type under alkaline conditions. Overall, this protein engineering strategy appears to represent a general approach to modification of pH optimums. Furthermore, extrapolation of this approach toward reducing pH optimum appears straightforward, namely by substituting amide-carboxylate pairs with carboxyl-carboxylate pairs.
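As a rough illustration of what a 4-fold longer half-life at pH 8 implies for retained activity, the sketch below compares wild-type and triple-mutant decay. It assumes simple first-order inactivation, which the source does not state, and measures time in units of the wild-type half-life because no absolute half-life is given in the text; the function and constant names are hypothetical.

```python
import math


def residual_activity(t: float, half_life: float) -> float:
    """Fraction of activity remaining after time t, assuming first-order inactivation."""
    return math.exp(-math.log(2.0) * t / half_life)


WT_HALF_LIFE = 1.0     # hypothetical unit: time is counted in wild-type half-lives
FOLD_EXTENSION = 4.0   # from the text: half-life extended 4-fold at pH 8

for t in (1.0, 2.0, 4.0):
    wild_type = residual_activity(t, WT_HALF_LIFE)
    mutant = residual_activity(t, WT_HALF_LIFE * FOLD_EXTENSION)
    print(f"t = {t:.0f} wild-type half-lives: wild-type {wild_type:.2f}, triple mutant {mutant:.2f}")
```

Under these assumptions the mutant only reaches 50% residual activity after four wild-type half-lives, by which point the wild-type would retain about 6%.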

Table 9. continued
[Rotated table page; row alignment was not preserved in extraction. Recoverable entries are listed by column.]
organism: Stilbella annulata; Talaromyces cellulolyticus CF-2612; Thielavia terrestris; Trichoderma reesei QM9414; Trichoderma viride CICC 13038
expression host: Aspergillus oryzae; Pichia pastoris; Saccharomyces cerevisiae
enzyme: CBHII; CBHII; Cel6A; Cel6A; Cel6A
pH opt: 5.0; 5
temp opt (°C): 45; 70
substrate for opt / substrate specificity (column not recoverable): CMC-Na; PASC; Avicel; PASC; CMC-Na; Avicel
substrate specificity: Avicel, CMC, Glc3/Glc4/Glc5/Glc6, PASC
ref: Poidevin et al., 2013,326 Ståhlberg et al., 1993304; Yamanobe et al., 2000726; Brown et al., 2007727; Song et al., 2010520; Wu et al., 2011719
comments: assay conditions 50 °C and pH 5.0; assay conditions 45 °C and pH 5.0

7.5. Conclusions
GH6 cellulases pose a unique challenge in characterization of molecular mechanisms as a result of their catalytic mechanism and specificity. As such, our molecular-level understanding of cellulases in this family is somewhat limited in comparison to other fungal cellulases such as GH7 and GH5 enzymes. The following findings summarize the general consensus regarding features of GH6 cellulases:
(1) GH6 cellulases employ a one-step, inverting mechanism to hydrolyze glycosidic linkages.
(2) The GH6 catalytic base has not been definitively identified. Cellulases of this family most likely undertake hydrolysis via a water-wire mediated Grotthuss mechanism in which a water wire shuttles the resulting proton from the nucleophilic water molecule to a neighboring aspartate, namely Asp175 in TrCel6A.
(3) The GH6 family consists of both CBHs and EGs, both of which are capable of endo-initiated attack with specificity toward the nonreducing end of crystalline cellulose.
(4) GH6 CBHs exhibit disordered loops forming the characteristic tunnel-shaped architecture. The loops demonstrate a remarkable range of flexibility in response to both mutagenesis and ligand binding and may be related to the mechanism of processivity or the endo-active nature of GH6 CBHs. These loops may also play an


indirect role in stabilizing a water molecule as part of the characteristic of enzymes capable of degrading equatorially
catalytic mechanism. oriented glycosidic linkages.730,732,733 More specifically, the GH-
(5) GH6 cellulases serve a critical role in the synergistic activity of cellulase cocktails. They are the only known nonreducing end specific cellulases and have been shown both to create new free chain ends for GH7s and to remove obstacles from the path of more processive CBHs.
(6) While ensuring uniform characterization of GH activity, degree of processivity, and initial mode of attack is an inherently difficult task, GH6s pose additional challenges as nonreducing end specific GHs. Attachment of reporter molecules to the nonreducing end of cellulose is nontrivial, and thus, few reliable model substrates exist for characterization of enzymes exhibiting this specificity.

GH6 family cellulases play a significant role in synergistic biocatalytic conversion of lignocellulosic biomass, yet despite decades of careful study, many questions remain as to even the most basic aspects of their molecular function. The path to definitively answering these questions of GH6 function will encounter obstacles including those common to fungal GHs, such as high-throughput heterologous expression and characterization, as well as the unique aspects related to the GH6 catalytic mechanism. We anticipate the development of reporter molecules for nonreducing end specific CBHs on either cellulose or soluble substrates will represent one of the most meaningful advances toward understanding GH6 catalytic function. A recent publication demonstrated a step toward this goal while also acknowledging the possibility that more advanced reporter molecule substrates may exist.704 Substrates of this type would not only help elucidate the molecular underpinnings of GH6 function but would also prove useful in understanding the extent of processivity and in selecting enzymes for maximum synergistic function as part of biomass conversion cocktails. Furthermore, development of GH6 variants in the search for thermal stability and/or activity improvements will profoundly benefit from a more accurate assessment of enzyme performance.

8. FAMILY 5 GLYCOSIDE HYDROLASES

GH5s have emerged as key EGs in biomass conversion cocktails. GH5 members were originally referred to as family A cellulases when some 21 β-glycanases were categorized into six different families using hydrophobic cluster analysis to group similar sequences.439 The two original fungal members of family A were EG III from T. reesei (TrCel5A) and EG I from Schizophyllum commune. Other members of this family were under investigation at that time,728,729 though without a known sequence, inclusion of these enzymes in the family A classification would not come until many years later. As a growing volume of sequence data became available, the nomenclature expanded to include hydrolytic activity beyond that of β-glycan hydrolysis, and family A cellulases became known as family 5 GHs.654 Today, the GH5 family is one of the largest and most diverse families of GHs with over 4000 entries.151,153

Though a single family designation, GH5s exhibit quite a large degree of variation in specificity and hydrolytic activity, likely as a result of divergence from a common ancestor.730,731 In addition to 1,4-glucanase activity, the GH5 family includes enzymes with reported 1,6-galactanase, 1,3-mannanase, 1,4-xylanase, and xyloglucanase activities. Along with 18 other GH families, GH5s belong to the GH-A clan exhibiting a (β/α)8 fold
[…] A clan, and thus GH5s, belong to the 4/7-superfamily of βα-barrel glycosidases.730,731 Within this superfamily, the retaining mechanism is catalyzed by the conserved glutamates at the ends of β-strands 4 and 7, though very little sequence homology is otherwise observed.730 Given such a broad categorization of low-homology enzymes based on the (β/α)8 fold, significant variety emerges in enzymatic behavior, with at least 20 different experimentally determined enzyme classes within this family alone.734 A recent study revisiting GH5-classified structures in light of newly available structural data even went so far as to recategorize several enzymes as GH30s, illustrating the difficulty of sufficiently describing enzymatic activity based on sequence data or fold alone.735

Within the GH5 family, there are also 51 currently recognized subfamilies that have been defined by phylogenetic analysis, with several undefined subfamilies anticipated as sequence data continues to grow.734 Subfamily classification in GH5, or cellulase family A, began in 1990 with five original subfamily classifications proposed, A1−A5.736 Five additional subfamilies were defined in the 15 years that followed until it became clear that a robust sequence-based subfamily classification approach was possible.734,737−740 Still, these newly defined phylogenetic subfamilies encompass a variety of both specificities and species within the subfamily classification. Of the 51 identified GH5 subfamilies to date, experimentally categorized cellulolytic behavior (EC 3.2.1.4 and EC 3.2.1.91) has been observed in subfamilies 1, 2, 4, 5, 22, 26, 37, and 39, representing bacterial, archaeal, and eukaryotic taxonomies. Given the vast body of GH5 data available as well as the scope of this review, we focus our discussion of GH5 literature on fungal GH5s exhibiting cellulolytic behavior except where discussion of bacterial GH5 mechanisms provides relevant mechanistic insights.

Saloheimo et al. reported the first isolation and sequencing of the egl3 gene coding for T. reesei EGIII.323 EGIII was estimated to have a total mass of 49.8 kDa and was described as having a modular organization, as with the other T. reesei cellulases characterized at that point. The 35 amino acid N-terminal region exhibited particularly high sequence homology with that of CBHII and the C-terminal regions of CBHI and EGI, which we now know derives from the characteristic T. reesei cellulase modular appendage of a family 1 CBM (see section 5). At nearly the same time, Ståhlberg et al. reported the isolation of the EGIII core domain from the 61-residue N-terminal region. The authors described the N-terminal domain as a heavily glycosylated structural element common to the other T. reesei cellulases and posited that modular organization may be a characteristic of all Trichoderma cellulases.741 Saloheimo et al. also identified EGIII as a glycoprotein, predicting one N-glycosylation site on the basis of a surface-exposed Asn-Phe-Thr motif and a great deal of O-glycosylation.323 EGIII was shown to cleave reducing sugars from cellodextrins, though at a rate 50−200 times slower than that of T. reesei EGI. Comparison of the new EGIII sequence with that of an unpublished S. commune EGI sequence indicated 30.4% sequence identity,323 and while this seems low at first glance, it would later become apparent that this family of cellulases is only loosely linked by sequence homology. Saloheimo et al. also noted that a protein previously identified by Shoemaker and Brown in 1978, EG IV,728,729 was likely actually EGIII, as the amino acid compositions were remarkably similar.
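The 30.4% sequence identity quoted above is a simple ratio over aligned positions. As a minimal sketch of how such a figure can be computed, the Python function below counts identical residues over the columns of a pairwise alignment in which neither sequence has a gap. The two aligned strings are invented toy fragments, not the EGIII or S. commune EGI sequences, and the choice of denominator (gap-free columns) is an assumption; published identity values may use a different convention, such as the length of the shorter sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must already be aligned, i.e., have equal length with gap
    characters inserted. The denominator convention is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where those residues are identical
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 7 gap-free columns, 5 of them identical.
    toy_a = "NEPR-WGA"
    toy_b = "NEPKYWAA"
    print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Under this convention, two sequences can share a fold and a handful of strictly conserved catalytic residues while still scoring well below 50%, which is the situation described for the GH5 family.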
1383 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Chemical Reviews Review

The T. reesei family 5 EG is responsible for a significant portion of the fungus's EG action, with some reports estimating the contribution at nearly 55% of total EG activity.742 As such, this EG is often incorporated as a key component in many industrial biomass conversion cocktails. Over the past four decades, many different groups have put a great deal of effort into characterizing its action and defining the limits of its stability. Importantly, we note that over this time this particular enzyme has been referred to in the literature by at least three different names. The EGIII moniker was given upon the original characterization by Saloheimo et al.323 Earlier, a report of a partial amino acid sequence determined by Edman degradation assigned the enzyme EGII.632 It eventually became clear that these were the same enzyme. Up until the late 1990s, most publications referred to this enzyme as EGIII, but more recently, the enzyme has frequently been referred to as EGII. As GH nomenclature evolved, the enzyme has also been referred to as Cel5A. To avoid confusion throughout the remainder of this discussion, we adopt the modern convention of Cel5A with reference to the T. reesei family 5 EG. A summary of GH5 structures that are discussed in this section is provided in Table 10.

Table 10. Reported Fungal GH5 Crystal Structures
source and original name in primary citation | PDB code | resolution (Å) | brief highlights | ref
Thermoascus aurantiacus Cel5A | 1GZJ | 1.62 | first fungal GH5 structure reported | 740
Thermoascus aurantiacus Cel5A | 1H1N | 1.12 | apo wild-type | 753
Piromyces rhizinflata Egl1 | 3AYS | 2.20 | complex with cellotetraose | 762
Trichoderma reesei Cel5A | 3QR3 | 2.05 | apo wild-type | 30

8.1. Structural Studies

The first solved structure of a GH5 enzyme was of the bacterial C. thermocellum EG CelC in 1995.743 Dominguez et al. observed that the general fold corresponded to the (β/α)8 barrel topology first observed in triose phosphate isomerase and referred to as a TIM barrel. This (β/α)8 tertiary structure is by far the most commonly observed fold, with an estimated 10% of all known enzyme structures falling into this class.744,745 It has been suggested that the variety of (β/α)8 barrel structures emerged as a result of divergent evolution from a common ancestor, thus accounting for the vast diversity of sequences and functionality allowing (β/α)8 barrel proteins to function as hydrolases, oxidoreductases, transferases, lyases, and isomerases.746

Within the GH5 family, a core set of amino acid residues distinguishes GH5s from other (β/α)8 barrel hydrolases. These include the catalytic glutamates, identified previously through biochemical characterization, as well as a handful of other residues thought to support catalysis or substrate binding.747−750 Solution of the C. thermocellum CelC structure captured these residues (Glu280, His198, and Trp313) in an arrangement suggesting the nucleophilic glutamate is assisted in catalysis by hydrogen bonding. The cellotriose-bound structure (3 Å) also illustrated carbohydrate−substrate interactions confirming that Glu140 serves as the proton donor and that Asn139 and His90 contact the substrate, occupying key positions near the catalytic glutamates. The vast substrate diversity within the GH5 family and throughout the over 30 (β/α)8 superfamilies likely originates from sequence variations that directly contribute to both major and minor structural variations.751 These structural variations, such as the two minor β-bulges located at strands β3 and β7 of C. thermocellum CelC and the 54-residue subdomain insertion connecting the α6 helix and β6 strand,743 greatly affect overall function and appear to be related to thermal stability (Figure 53A).752

Many GH5 structures spanning a range of taxonomies and substrate specificities were solved in the interim between the appearance of the original CelC structure and the first fungal GH5 cellulase structure in 2002. Several of these structures from bacterial cellulases have been invaluable in defining the catalytic mechanism of GH5 cellulases and will be discussed separately with respect to catalytic function.206,754,755

8.1.1. Catalytic Function. As members of the GH5 family, fungal GH5 cellulases hydrolytically cleave the glycosidic linkages of their substrates using the double-displacement retaining mechanism outlined in Section 3.157 Barras et al. first described this for the bacterial Erwinia chrysanthemi EG Z, identifying the retention of the anomeric carbon product configuration.756,757 The authors went on to speculate that two conserved glutamates served as the catalytic acid/base and enzymatic nucleophile, as we will discuss at length below. Much of the available insight into catalytic mechanisms of GH5s has been observed in bacterial representatives such as that of E. chrysanthemi. Thus, in the following section, we briefly deviate from our focus on fungal cellulases to discuss our current understanding of GH5 catalytic mechanisms in general.

Initial characterization studies identified key catalytic residues through biochemical methods such as mutagenesis and use of labeled substrates. One of the first studies to investigate the catalytic mechanism of GH5s was conducted in 1990, shortly after some of the first GH5 enzymes were discovered.758 Baird et al. aligned 16 different amino acid sequences of known cellulolytic enzymes. Though none of the sequences exhibited greater than 25% similarity to the other sequences in the set, the multiple sequence alignment identified a recurring three-residue motif in all the cellulases examined. Baird et al. had uncovered the Asn-Glu-Pro motif defining in part the catalytic active site of GH5 cellulases. Site-directed mutagenesis of the Glu to Gln in two representative bacterial EGs confirmed a detrimental loss of activity upon mutation. The participation of this conserved Glu in catalysis was again confirmed in both C. thermocellum and E. chrysanthemi GH5s.759,760 Without the benefit of structural insight, these initial studies identified the catalytic acid/base of GH5 cellulases, though it would not be until later that the Glu was confirmed as such.

Shortly after the discovery of the GH5 catalytic acid/base, the complementary nucleophile, also a glutamate, was confirmed. Wang et al. used a clever labeling approach to determine the nucleophilic residue of C. thermocellum CelC, radiolabeling the enzyme with a tritiated inhibitor and spectroscopically analyzing the cleaved, labeled peptide fragments.747 Macarron et al. had previously speculated that this same glutamate was the catalytic nucleophile in TrCel5A.750 The family 5 nucleophilic glutamate generally belongs to a Glu-Xxx-Gly motif, where Xxx is typically an aromatic residue.761 Identification of the conserved acid/base and nucleophile glutamates served in part as the basis for refining GH classifications including family A cellulases and ultimately the development of the family 5 classification.732,733

The double-displacement retaining mechanism is also known to require a water molecule for nucleophilic attack. Evidence of
an appropriately positioned water molecule would not come until later, when structural characterization of the B. agaradhaerens Cel5A reaction pathway was captured by Davies et al.206 Other residues, including an asparagine, histidine, and arginine, have been hypothesized to participate in supporting catalytic roles.740,747 However, it is not clear whether these interactions are conserved across GH5s or what precise role each residue plays.

Figure 53. GH5s are among the many proteins exhibiting the relatively common (β/α)8 topology. (A) In 1995, the first GH5 structure, originating from the bacterium C. thermocellum, was solved.743 The cellulolytic EG CelC, in purple cartoon, is shown aligned with the first fungal GH5 cellulase structure, T. aurantiacus Cel5A in green, solved nearly a decade later.740 (B) The GH5 structures share the same basic topology but exhibit extraordinary diversity in the loop regions connecting the primary β-sheets and α-helices. A third GH5 cellulase structure, also from T. aurantiacus in tan, was solved at nearly the same time as the first fungal Cel5A structure.753 C. thermocellum CelC exhibits a large subdomain insertion between the α6 helix and β6 strand, where the two T. aurantiacus Cel5A structures exhibit a very compact loop (highlighted by the dashed red circle). The cellotriose ligand from C. thermocellum CelC is shown in gray stick, and the positioning of the loops relative to the cellulose substrate suggests functionality in substrate recruiting, though likely using very different mechanisms.

Structural evidence of the double-displacement mechanism in GH5s was obtained in 1998 when Davies et al. captured the entire reaction coordinate in a series of structures from B. agaradhaerens Cel5A, a β-1,4-glycanase.206 Five different structures describe the individual states of the reaction: the initial apo conformation of the active site, the Michaelis complex, two versions of the glycosyl-enzyme intermediate, and the product-bound conformation. We illustrate these states in Figure 54, as this reaction coordinate is general for fungal GH5 cellulases.

Prior to hydrolysis, the cello-oligomer substrate is initially bound within the GH5 active site in the catalytically competent Michaelis complex (Figure 54). The catalytic acid/base, Glu139 for B. agaradhaerens Cel5A, and the nucleophilic glutamate, Glu228, are positioned over the −1 glucopyranose so as to initiate catalysis. The proton of the acid/base hydrogen bonds with the glycosidic oxygen, while the nucleophilic glutamate hydrogen bonds with the anomeric carbon of the −1 glucopyranose ring. The −1 glucopyranose must adopt an energetically unfavorable skew conformation (1S3) to correctly position the leaving group sugar significantly higher than would otherwise naturally occur. The use of a fluorine-substituted dinitrophenyl substrate made capture of the intact glycosidic linkage spanning the −1/+1 binding sites possible.

The first step in the double-displacement hydrolytic mechanism of GH5s is the glycosylation step, in which the glycosyl-enzyme intermediate is formed (Figure 54). At this point, a proton from the catalytic acid/base, Glu139, has been transferred to the leaving group, a non-natural dinitrophenyl group in the Davies et al. study. The nucleophilic Glu228, having attacked the anomeric carbon of the −1 glucopyranose moiety, is then covalently bonded to the oligomeric substrate. The −1 glucopyranose sugar returns to its energetically preferred 4C1 chair conformation. With newfound space in the enzyme active site upon expulsion of the product sugar and relaxation of the substrate, a water molecule takes the place of the glycosidic oxygen. The authors managed to capture two different versions of the GH5 glycosyl-enzyme intermediate, covalently bonded to a cellobiosyl and a cellotriosyl substrate. For the most part, the length of the substrate has little effect on the position of the catalytic residues. However, a conserved tyrosine residue appears to undergo a conformational change as a result of the shorter cellobiosyl covalent attachment. This residue has been identified as a contributor to catalysis,30,743,762 though the exact means by which this occurs remains unknown. The authors refrain from speculating as to the role of this tyrosine, though it appears the residue is critical to transition state stabilization, intermittently interacting with the −1 sugar hydroxyl group and Glu228. Davies et al. do go on to propose that Arg62 is relevant to the correct positioning of the nucleophilic glutamate throughout the catalytic cycle.

Finally, the nucleophilic water molecule effects the second step of the double-displacement mechanism, deglycosylation. A proton from the water is donated to the catalytic acid/base, and the remaining hydroxide caps the cleaved glycosidic linkage. The nucleophilic glutamate is returned to its original negative charge state, and the enzyme is reset to its original state ready to
hydrolyze another glycosidic bond. Generally, most GH5s do not exhibit any significant degree of processive action. Thus, the substrate likely dissociates from the active site, and a new oligomer association forms prior to the next catalytic cycle.

Figure 54. Retaining mechanism of GH5 cellulases illustrated by B. agaradhaerens Cel5A structures captured along the reaction coordinate.206 The catalytic residues are shown in yellow stick. Glu139 is the catalytic acid/base, and Glu228 is the enzymatic nucleophile. Arg62 and Tyr202 appear to serve supportive roles. The substrate, representative of β-1,4-linked glucans, is shown in cyan stick. The water molecule that attacks the substrate anomeric carbon in the deglycosylation step is shown as a red sphere. The Michaelis complex (PDB 4A3H) is shown at left and illustrates the catalytically competent enzyme−substrate conformation. Two covalently bound glycosyl-enzyme intermediate states following the glycosylation step of the double-displacement mechanism have been captured (PDB 5A3H and 6A3H), and the length of the product side substrate appears to affect the conformation of Tyr202. The product, cleaved from the protein following the deglycosylation step, is shown at right (PDB 3A3H).

Overall, the Davies et al. study provides a firm structural basis for understanding the hydrolytic mechanisms of GH5 enzymes, including cellulases.206 However, it is not immediately clear which of the two steps, glycosylation or deglycosylation, is rate-limiting in GH5s or whether this is consistent across the family. Currently, our understanding of the rate at which each step proceeds is limited to theoretical studies of bacterial GH5 cellulases. Liu et al. performed QM/MM simulations of both the glycosylation and deglycosylation steps of Acidothermus cellulolyticus Cel5A.763 The glycosyl-enzyme intermediate was approximated from MD simulations in lieu of a structure-based approximation. The authors report free energy barriers of 25.7 and 29.4 kcal/mol for the glycosylation and deglycosylation reactions, respectively. However, the authors caution that the difference between the two barriers may not be statistically significant given the relatively low level of theory applied. Furthermore, the reported barriers are likely unreasonably high as a whole, again likely as a result of the level of theory. For scale, the Eyring equation relates a 25.7 kcal/mol barrier at 298 K to a rate constant of roughly 10^-6 s^-1, orders of magnitude slower than typical enzymatic turnover. Similarly, Saharay et al. applied QM/MM to examine deglycosylation in B. agaradhaerens Cel5A, obtaining a free energy barrier of 24.2 kcal/mol.764 Saharay et al. do not explain the differences between their calculations and those of Liu et al. despite having used the same level of theory. In general, our fundamental understanding of catalytic mechanisms in this family lags behind that of GH7 and offers an area for continued investigation.

In the sections that follow, we focus on the four fungal GH5 cellulase structures solved to date, noting that difficulties with protein expression and purification have hindered accumulation of a large fungal structural data set (Table 10).765 A multiple sequence alignment of the four fungal structures alongside the bacterial B. agaradhaerens Cel5A sequence is provided in Figure 55 to aid in comparative discussion.

8.1.2. T. aurantiacus Cel5A. Lo Leggio and Larsen reported the first fungal GH5 cellulase structure, a 1.62 Å EG from T. aurantiacus (PDB 1GZJ, Figure 53A).740 In addition to being the first fungal GH5 cellulase structure, the TaCel5A structure documented the first structural representative of the GH5 subfamily 5 classification (formerly A5/6). As with previous GH5 structures, the canonical (β/α)8 fold comprises the general tertiary structure. However, the TaCel5A structure is remarkably compact, with few extensions to the loops connecting the α and β regions (Figure 53B).

The TaCel5A structure does not include a bound ligand, but on the basis of appearance, Lo Leggio and Larsen suggest the EG may be capable of binding up to a celloheptaose oligomer within the −4 to +3 binding sites. The enzyme active site is formed by a wide and shallow groove lined with aromatic residues Trp278, Trp279, Trp273, Trp170, Trp174, and Tyr200 (Figure 56A). This aromatic-mediated substrate recognition
mechanism is typical of GHs,766 yet of the tryptophan residues lining the active site, only Trp273 in the −1 binding site is strictly conserved. However, aromatic residues near the +1 and +2 binding sites are spatially conserved throughout GH5s, suggesting functional relevance. The spatially conserved tryptophan in the +1 binding site of TaCel5A is Trp170. Chemical modification of the homologous residue in TrCel5A decreased kcat/Km by nearly half.749 The authors hypothesize this residue serves a configurational function, positioning the glycopyranose moiety in the −1 binding site in the energetically unfavorable skew conformation necessary for catalysis. The tyrosine residue, Tyr200, located in the −2 binding site of TaCel5A, is also strictly conserved. Prior evidence suggests that the tyrosine hydroxyl group may participate in catalysis through interaction with the nucleophile rather than through direct stacking interaction with the carbohydrate moieties.206,743,747

Figure 55. Sequence alignment of the four fungal GH5 structures discussed here alongside the bacterial B. agaradhaerens Cel5A (BaCel5A) from which catalytic function was elucidated. Notably, the sequence alignment illustrates the relatively low sequence similarity among the members of this GH family. Strictly conserved residues are shown in red block, and chemically similar residues in red text. The blue boxes indicate chemical similarity across a grouping of residues. The secondary structural elements of TrCel5A and BaCel5A are shown above and below the sequences, respectively. The catalytic acid/base motif is shown in a pink box, and the catalytic nucleophile is shown in a yellow box. The figure was generated with ESPript (http://espript.ibcp.fr).347

Figure 56. (A) First fungal GH5 cellulase structures, both T. aurantiacus Cel5A, exhibit an extensive network of aromatic residues comprising the wide and shallow active site groove.740,753 The aromatic residues, shown labeled in yellow stick, serve to mediate carbohydrate substrate interactions. Surprisingly, only Trp273 in the −1 binding site and the catalytically functional Tyr200 are strictly conserved among the aromatic residues. However, many others are spatially conserved by chemically similar residues. (B) A handful of conserved residues, cyan stick, are associated with GH5 cellulolytic function and differentiate these (β/α)8 structures from the thousands of others. Several, including Tyr200, Glu133, and Glu240, directly participate in catalysis. Other conserved residues contribute to structural stability through disulfide bonds, orange stick, and salt bridges, magenta stick.

Outside the aromatic residues, Lo Leggio and Larsen put forth functional roles for several of the conserved residues. Through structural comparison, the authors identified conserved residues including two glycines (Gly8, Gly44), the catalytic glutamates (Glu133, Glu240), as well as the six canonically conserved active site residues (Asn132, Arg49, His93, His198, Tyr200, Trp273). Conservation in GH5s is almost entirely restricted to the active site, as shown in Figure 56B. The exceptions to this in TaCel5A are the two glycine residues and Arg49, whose functions remain unclear. The authors suggest Gly44 is a part of the Schellman C-terminal capping motif conserved throughout GH5s.767 The functional relevance of Gly8 is less obvious, as this residue lies in the middle of the β1 strand away from the substrate and conservation is unnecessary to maintain the (β/α)8 fold. Arg49 is located in the β2 strand. This arginine is conserved in all members of the 4/7 superfamily except those of GH10 and GH26. As with Gly44, this conserved residue is seemingly related to the Schellman C-terminal capping motif, as the motif is also not conserved in GH10 and GH26.

The positions of the catalytic residues in the TaCel5A structure were also informative. As is often the case, the conditions under which the protein was crystallized, pH 9.0, were far from the optimal acidic pH of approximately 4.0. Additionally, the structure did not contain a ligand bound in the active site. Despite these limitations, the catalytic residues were found in a catalytically competent position,740 making superimposition of cello-oligomers from other bacterial GH5 structures straightforward and informative from a mechanistic perspective. The positioning of these residues under adverse conditions may be related to the apparent general flexibility of the GH5 EG active site, as was observed for B. agaradhaerens Cel5A.206 More broadly, mounting structural evidence such as this and recent GH18 studies suggests flexibility of the catalytic residues is a hallmark of endo-active enzymes in general.206,768

A second, even higher resolution (1.12 Å) TaCel5A structure followed almost immediately after the first (PDB 1H1N), again without a bound substrate.753 Van Petegem et al. offer additional molecular level insights not explicitly discussed by Lo Leggio and Larsen.740 The aromatic-lined active site groove of TaCel5A was again noted, but Van Petegem et al. describe an additional aromatic residue (Phe16) that may participate in carbohydrate stacking interactions in the −2 binding site. Though not explicitly described as such, perhaps the most interesting finding from the TaCel5A structure reported by Van Petegem et al. is the discussion of structural details that likely contribute to the thermal stability of this EG. Lo Leggio and Larsen report a single disulfide bond between Cys212 and Cys249, noting that it does not appear to be conserved among GH5s. In contrast, Van Petegem et al. describe the apparent conservation of the same disulfide bond in 20 different GH5 EGs. In both studies, it is unclear which subset of enzymes was used in the sequence comparisons, making it difficult to speculate why conservation of the disulfide bridge appears inconsistent. Nevertheless, disulfide bonds have long been associated with thermal stability and may contribute to the same in TaCel5A.769−771 A salt bridge between Arg154 and Asp188 is also reported, though not as strictly conserved. The Arg154 residue is located toward the end of the α4 helix, and Asp188 is located in a largely unstructured, solvent-exposed loop connecting the α5 helix to the β6 strand. It is tempting to surmise that the presence of the salt bridge in a region otherwise susceptible to denaturation contributes some measure of stability, as, like disulfide bridges, the appearance of salt bridges in thermophilic proteins has been implicated in stability.772−775

Van Petegem et al. also very briefly discuss the function of the loop connecting the α6 and β6 structural elements. Specifically,
the authors describe a potential substrate positioning mechanism based on both an extensive observed hydrogen bonding network in this region as well as superimposition of the T. aurantiacus structure with the cellobiose-bound C. thermocellum CelC (Figure 53B). While proximity of this loop to the C. thermocellum CelC cellobiose does suggest function, little else can be said regarding the mechanism given the significant structural deviations between the two EGs in this region. C. thermocellum CelC exhibits an impressive subdomain insertion represented by a very compact, unstructured loop in TaCel5A. It is difficult to imagine these two drastically different features behave in a similar substrate-recruiting fashion.

8.1.3. P. rhizinflata EglA/CelA. The first fungal GH5 cellulase structure exhibiting a bound ligand was solved by Tseng et al. in 2011.762 The E. coli expressed EG from P. rhizinflata, described in literature as both EglA and CelA (PrEglA), captured the orientation of a cellotriose molecule in the −3 to −1 binding subsites at 2.2 Å resolution (PDB 3AYS). The authors reported that the N-terminal His tag attached for purification prevented soaking in a cellulose ligand by blocking the active site. Cocrystallization with the catalytically inactive PrEglA (E154A) was successful, however. These limitations likely apply to other fungal GH cellulases, explaining in part the long delay in solution of a substrate-bound structure. The authors also solved the complementary apo wild-type structure at 2.0 Å resolution (PDB 3AYR), illustrating that the enzyme underwent little to no change as a result of substrate binding.762

PrEglA is similar to the TaCel5A structure in several ways, including the presence of several spatially conserved aromatic residues lining the active site and the presence of a single disulfide bond. Interestingly, the disulfide bond in PrEglA (Cys27 to Cys43) is located in a different loop from that observed in TaCel5A, connecting the β1 strand to the α1 helix on nearly the opposite side of the protein near the −3 glucopyranose of the bound cellotriose (Figure 57). The PrEglA disulfide bond almost certainly plays an indirect role in substrate positioning given its proximity to Trp44, which is positioned directly over the −3 glucopyranose moiety. This suggests that while both PrEglA and TaCel5A use the same catalytic mechanism to hydrolyze the glycosidic linkage, the two enzymes make use of a different molecular level mechanism for substrate recruitment.

The PrEglA structure also explicitly illustrated molecular phenomena related to catalysis in fungal GH5 cellulases. Tseng et al. note six water-mediated hydrogen bonds between the −1 and −2 glucopyranose moieties in the 3AYS structure (Figure 57).762 The water near the anomeric carbon of the −1 glucopyranose hydrogen bonding with Glu154 has been implicated as a participant in the catalytic mechanism of GH5s, specifically as a means of nucleophilic attack on the glycosyl-enzyme intermediate.206 The PrEglA ligand structure captures the water molecule near the −1 glucopyranose despite being catalytically inactivated by an E154A mutation. Superimposition of the Glu154 side chain from the wild-type structure with E154A confirms the catalytic acid/base points toward the O1 atom of the −1 glucopyranose at 1.7 Å, and the complementary nucleophile, Glu278, hydrogen bonds with the anomeric carbon of the −1 glucopyranose at 3.2 Å. This catalytic arrangement of residues, including nucleophilic attack by the water, is in line with that previously observed in a bacterial GH5 cellulase.206 The conserved Tyr231 was also observed participating in an apparent catalytic arrangement by hydrogen bonding with the Glu278 side chain.

The PrEglA cellotetraose-bound structure offers an intriguing explanation for the enzyme’s putative processive ability. The authors purport, on the basis of previous biochemical assays, that PrEglA is “both an endoglucanase and a cellobiohydrolase”, which we take to mean that the enzyme exhibits both processive and nonprocessive behavior.776 The authors do not attempt to further connect structure to function in an enzyme that certainly resembles an EG by all appearances. However, Tseng et al. indicate the bound cellotetraose ligand is nearly completely surrounded by protein at the −1 binding subsite with the remainder of the active site exposed to water. This observation is likely directly related to the observed processive ability of the enzyme, in both increasing ligand binding free energy and maintaining contact with crystalline substrates, allowing processive hydrolytic events.580

Finally, the authors observe what they consider to be a potentially dimeric structure in which the O1 atom of the −1 sugar appears to hydrogen bond (3.2 Å) with the OE2 of the symmetry-related Glu242.762 Site-directed mutagenesis of E242A indicated overall catalytic function was retained, suggesting dimeric catalysis by the proposed mechanism was unlikely. Nevertheless, the authors go on to suggest that, because the first 100 amino acids encoded by the PrEglA cDNA fragment were similar to the last third of the (β/α)8 barrel sequence, PrEglA comprises at least two CDs acting in tandem. However, the structural evidence to date does not support this hypothesis.

8.1.4. T. reesei Cel5A (Formerly EG II/EG III). The most recent fungal GH5 cellulase structure originates from T. reesei (PDB 3QR3).30,566 The enzyme, Cel5A or formerly EGII/
Figure 57. Active site of P. rhizinf lata EglA, the first fungal GH5
EGIII, accounts for up to 55% of the total EG activity of T. reesei
cellulase structure to capture the position of a bound ligand. The
cellotriose-bound variant structure (3AYS) is shown in aquamarine and thus is a key component of many industrial biomass
cartoon and stick, and the apo wild-type structure (3AYR) in light pink conversion cocktails.742 Thermal stability of the enzymes within
cartoon and stick. Aromatic residues, Trp44 and Tyr231, are shown in these cocktails is key to effective deployment in commercial
yellow stick, and the cellotriose is shown in gray stick. Water molecules processes and has been a primary focus of recent protein
are shown as red spheres. engineering efforts.324,752,777−779 The thermal stability of

Figure 58. First structure of TrCel5A reveals key details regarding disulfide bonds and loop insertions that may aid in understanding contributions to
thermal stability in GH5 EGs. (A) The TrCel5A structure, slate cartoon, is shown aligned with the homologous TaCel5A structure, green cartoon.
Primary differences in the two structures appear in the noted loop regions connecting the primary structural elements. TrCel5A also exhibits a β-
hairpin element, red cartoon, with a tryptophan residue that stacks against the protein face. (B) TrCel5A exhibits four disulfide pairs, orange stick, that
connect primary structural elements, yet surprisingly may not significantly contribute to thermal stability. (C) The active site of TrCel5A contains
several conserved residues, gray stick, that may contribute to stability of transition states in catalysis or substrate binding. The catalytic acid/base,
Glu218, and the nucleophile, Glu329, are shown in yellow stick.

TrCel5A (reported Tm of 69.5 °C) is relatively low compared to other hyperthermally stable GH5s and is a likely target for stability improvement. Lee et al. solved the 2.05 Å apo structure of TrCel5A to uncover the molecular level contributions to thermal stability by comparison; however, the structure uncovers as many questions as it answers.566
The TrCel5A structure’s closest homologue is TaCel5A with 29% sequence identity and 69% sequence similarity. Comparison of TrCel5A with the TaCel5A structure,740,753 a hyperthermally stable enzyme with a Tm of nearly 81 °C,780 can feasibly provide insight into the variation in molecular features contributing to reduced thermal stability in TrCel5A. At first glance, primary differences in the two structures include extended loops connecting the β1 sheet to the α1 helix, the β3 sheet to the α3 helix, and the α5 helix to the β6 sheet (Figure 58A). The TrCel5A structure also features a protruding β-hairpin element at residues 378−385 near the final α8 helix, wherein Trp384 stacks the β-hairpin against the globular face of the protein (Figure 58A). The functional role of the β-hairpin remains unknown. However, β-hairpins have been observed in other bacterial GH5s including Thermotoga maritima Cel5A781 and Clostridium cellulovorans EG D (PDB 3NDY, unpublished). Though Lee et al. do not investigate the contributions of the loop insertions to reduced thermal stability, it is tempting to hypothesize these increasingly disordered regions contribute in part to a reduced Tm relative to TaCel5A. Recent computational investigations of a related enzyme lend credence to this hypothesis, though it remains largely untested experimentally.752
Disulfide bonding in TrCel5A is also a key differentiator from the homologous TaCel5A. TaCel5A exhibits a single disulfide bond pinning the α6 to β6 loop to the top of the α7 helix.740,753 PrEglA also exhibits a single disulfide bond and is a known hyperthermophilic enzyme.762 In contrast, TrCel5A contains 8 cysteine residues forming four disulfide bonds in the protein (Figure 58B). One of the disulfide bonds, between Cys302 and Cys338, corresponds to that observed in TaCel5A. The disulfide bond between Cys86 and Cys92 tethers the C- and N-terminal regions of the β1 to α1 loop. Oddly, the other two disulfide bonds, Cys162 to Cys169 and Cys343 to Cys393, anchor (β/α)8 barrel structural elements directly, including the α2 helix to β3 sheet and the α7 helix to the α8 helix, respectively. Intuitively, one would expect that, with the increased number of disulfide bonds, TrCel5A would exhibit a greater degree of structural stability and thus increased thermal tolerance over TaCel5A.769−771 However, on the basis of the structural evidence presented by Lee et al.,30 there does not appear to be a direct correlation of disulfide bonds and thermal stability. The authors report site-directed mutagenesis of several cysteines to serines resulted in insoluble protein expression in the E. coli recombinant host. Thus, there is little evidence by which to ascertain the contributions to thermal stability imparted by each nonconserved cysteine pair.
Finally, Lee et al. observe catalytic residue orientations in line with the proposed GH5 catalytic mechanism.30,206 The catalytic residues Glu218 and Glu329 are separated by a distance of approximately 5 Å measured from the terminal oxygen atoms, consistent with reports of catalytic residue flexibility (Figure 58C).206 The authors suggest residues Thr328, His288, and Glu218 function simultaneously to raise the donor carboxylate pKa, promoting efficient catalysis. The nucleophilic glutamate, Glu329, is observed hydrogen bonding to the OE2 atom of Arg130 and the OE1 atom of Tyr290 as part of the retaining catalytic mechanism. As with T. aurantiacus,753 conserved residues His174 and Trp362 are reported to participate in substrate binding rather than directly in the catalytic mechanism despite their proximity to the active site.
8.2. Characterization of Activity and Specificity
In general, GH5 cellulolytic function can be classified as endo-initiated, with only 2 of the 4395 members classified as having β-1,4-cellobiosidase activity (EC 3.2.1.91), one from C. thermocellum and one from Teredinibacter turnerae. In the following sections, we briefly describe the wealth of biochemical characterization data available for fungal GH5 cellulases. Each of the enzymes described below has been classified as an endo-β-1,4-glucanase (EC 3.2.1.4), with some purported to exhibit moderate degrees of processive action.
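To put the endo-initiated classification in proportion, the short Python sketch below works through the arithmetic for the counts quoted above (2 entries classified as EC 3.2.1.91 among 4395 GH5 members). The constant and function names are illustrative only and are not taken from any database interface.

```python
# Counts quoted in the text for the GH5 family.
TOTAL_GH5_MEMBERS = 4395      # all GH5 members counted in the text
CELLOBIOSIDASE_MEMBERS = 2    # members classified as EC 3.2.1.91


def fraction_with_activity(count: int, total: int) -> float:
    """Return the fraction of family members carrying a given classification."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


exo_fraction = fraction_with_activity(CELLOBIOSIDASE_MEMBERS, TOTAL_GH5_MEMBERS)
print(f"EC 3.2.1.91 share of GH5 members: {exo_fraction:.3%}")
```

The result is well under one tenth of one percent, which is why the family is treated here as essentially endo-acting.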

8.2.1. T. viride EG III. Some of the earliest reported biochemical characterizations of GH5s came from T. viride several years before cellulase classification would relate their function to other GH5s. In 1978, Shoemaker and Brown submitted two simultaneous reports on the discovery and characterization of a set of four EGs from T. viride, demonstrating purity with SDS-PAGE.728,729 Up to this point, studies had purified fungal EGs (not necessarily GH5s) from T. viride, T. koningii, Fusarium solani, Sporotrichum pulverulentum, and Irpex lacteus, though homogeneity of these early purifications was uncertain.782−785 Shoemaker and Brown described what was ultimately the first biochemical characterization of pure fungal GH5 cellulases, T. viride EGIII and EGIV, both from subfamily 5. The enzymes were reported as 52 and 49.5 kDa, respectively.729 Specific activity assays suggested the EGs were active on CMC, PASC, and cellotriose and higher oligosaccharides with limited Avicel activity.728 Both EGs were reported to have pH optima of 4.0 to 4.5 on PASC and CMC and a noted intolerance for high pH.729
The T. viride cellulases, both EGs and processive cellulases from other GH families, formed the basis of a commercial enzyme preparation, Maxazyme CL, from the Dutch company Gist-brocades (now owned by DSM) and another from Miles Laboratories. After the report from Shoemaker and Brown, several years passed before any additional characterization occurred. In 1985, Beldman et al. reported the purification and characterization of the commercial Maxazyme preparation, though much of the report focuses on the processive cellulase function.786 The authors followed up their characterization efforts with a description of the apparent synergy of the T. viride cellulases, which relies significantly on the presence of EGs such as EGIII and EGIV.787 After these initial reports, much of the literature related to T. viride GH5 EGs surrounds strain engineering for enhanced production with a few reports of recombinant expression for improved stability and pH optimum.
8.2.2. TrCel5A. T. reesei (now also known as Hypocrea jecorina) Cel5A (TrCel5A) is a model fungal GH5 EG (subfamily 5) having found application in some modern cellulase-based commercial enzyme preparations.788 As such, much of the fungal GH5 characterization effort to date has focused on this enzyme. Early characterization focused on delineating activity between the suite of T. reesei cellulases and broadly describing specificity. In 1985, van Tilbeurgh and Claeyssens used 4-methylumbelliferyl-β-D-glycoside substrates in an attempt to explicitly differentiate T. reesei CBHs, EGs, and β-glucosidase activities.301 Notably, TrCel5A was the only cellulase to cleave the fluorescent phenol group from the fluorogenic 4-methylumbelliferyl-β-D-cellotrioside. This technique would later be used to characterize relative activities of commercial enzyme preparations, the results of which highlighted the need to pay close attention to the model substrates used in evaluation of overall activity and synergism, as it is now clear that it is impossible to differentiate CBH from EG activity using artificial fluorogenic substrates such as 4-methylumbelliferyl-β-D-glycoside.789
In 1987, Penttilä et al., having expressed and cloned the egl3 gene encoding TrCel5A, recombinantly expressed the EG in S. cerevisiae in an effort to understand the effects of expression host on activity.283 The authors found that while the recombinant enzymes remain active, the expression host affects both specificity and enzyme morphology. The change of morphology resulting from heterologous expression of TrCel5A compared to the control strain was speculated to result from hyperglycosylation common in yeast expression.
As previously mentioned, Saloheimo et al. described what appears to be the first isolated and sequenced fungal GH5 cellulase.323 Alongside this isolation, the authors characterized the activity of the new EG, as its sequence differed substantially from the known cellulases at the time, TrCel6A, TrCel7A, and TrCel7B, indicating potentially new function. TrCel5A, like TrCel6A, did not hydrolyze cellobiosides or lactosides, and thus, the authors speculated specificity was similar. Saloheimo et al. also reported a pH optimum of 4.0−5.0 on a cellotrioside substrate at 50 °C, comparable to the T. viride EGs.
A flurry of activity surrounding the mode of TrCel5A action occurred in the early 1990s with a series of papers uncovering the catalytic nucleophile,750 described above, putative binding models,748 essential aromatic residues,749 and a high degree of endo-activity.304 With a basic understanding of TrCel6A and TrCel7A specificity and action in place at this point, Macarrón et al. set out to develop a similar level of understanding of TrCel5A action. Having purified the TrCel5A with an attached CBM, the authors report TrCel5A exhibits a stable pH range 4.0−6.3 at 55 °C, retaining 90% of its activity at 65 °C for 30 min.748 Chromophoric substrates were effectively used to identify TrCel5A as having a double-displacement catalytic mechanism, the first report of such. In the case of cleaving the 2-chloro-4-nitrophenol bond, the deglycosylation step was reported as the rate-limiting step, though nucleophilic competition experiments between methanol and water point to the glycosylation step as rate-limiting. Ultimately, no conclusion can be made regarding rate-limitation in the double-displacement mechanisms from the reported data. However, Macarrón et al. were able to propose a five-subsite binding model for TrCel5A based on observed cleavage products (Figure 59). To date, this remains to be validated from a structural standpoint, with only a single apo structure of the enzyme reported.30
Macarrón et al. later investigated the function of three tryptophan residues in binding and catalysis using N-bromosuccinimide modification.749 Given the likely role of conserved aromatic residues in carbohydrate substrate binding, the authors hypothesized chemically modified tryptophans
Figure 59. Proposed five-subsite binding model for TrCel5A, illustrated with two possible hydrolysis pathways for 4-methylumbelliferyl-β-D-cellotrioside, with the open rectangles indicating D-glucosyl moieties and filled rectangles indicating 4-methylumbelliferyl moieties. The lettered, indented block represents the binding site of TrCel5A. Catalysis takes place between the C and D subsites, as indicated by the arrows indicating catalytic protein residues. The possibility of methanol transfer as part of the EG activity is indicated. Reprinted with permission from ref 748. Copyright 1993 the Biochemical Society.
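The five-subsite model described in the Figure 59 caption can be explored with a small enumeration. The Python sketch below is illustrative only. It assumes (our assumptions, not statements from ref 748) that the nonreducing end of the substrate points toward subsite A, that binding is productive only when all four moieties of 4-methylumbelliferyl-β-D-cellotrioside (three glucosyl units, G, followed by the 4-methylumbelliferyl group, MUF) occupy subsites, and that cleavage occurs at the bond spanning subsites C and D. All function and variable names are hypothetical.

```python
SUBSITES = ["A", "B", "C", "D", "E"]  # lettered subsites from the Figure 59 caption
CLEAVAGE_AFTER = "C"                   # catalysis takes place between C and D
SUBSTRATE = ["G", "G", "G", "MUF"]     # nonreducing end first (assumed orientation)


def productive_registers(substrate, subsites=SUBSITES, cleave_after=CLEAVAGE_AFTER):
    """Enumerate binding registers that place a bond across the cleavage point.

    Assumption (ours): every substrate moiety must sit in a subsite, so the
    substrate is slid along the cleft one subsite at a time.
    """
    cut_index = subsites.index(cleave_after) + 1  # subsites on the A side of the cut
    registers = []
    for offset in range(len(subsites) - len(substrate) + 1):
        n_before_cut = cut_index - offset  # moieties bound on the A side of the cut
        if 0 < n_before_cut < len(substrate):
            occupied = subsites[offset:offset + len(substrate)]
            registers.append({
                "occupied": "".join(occupied),
                "products": ("-".join(substrate[:n_before_cut]),
                             "-".join(substrate[n_before_cut:])),
            })
    return registers


for register in productive_registers(SUBSTRATE):
    print(register["occupied"], "->", " + ".join(register["products"]))
```

Under these assumptions the enumeration yields two registers: one that releases the free 4-methylumbelliferyl group and one that releases cellobiose. Whether these correspond exactly to the two pathways drawn in the figure should be checked against ref 748.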


would affect binding and hydrolysis. Chemical modification was able to alter three tryptophans in total: one putatively part of the CBM, one likely very near the catalytic active site, and one unspecified modification not affecting activity or binding. The modification of the CBM-located tryptophan, most likely Trp5, reduced adsorption on Avicel by nearly 50% (see section 5). As TrCel5A is an EG, it is unclear whether this modification would significantly affect enzymatic activity in practice. Modification of the “active site” tryptophan, thought to be Trp255, reduced rates of hydrolysis on small soluble substrates, though primarily through its influence on substrate binding rather than directly through catalysis. This finding is directly related to the primarily endo-active function of TrCel5A.304
In later years, the importance of choice of model substrate and kinetic screening tools became an important emphasis in cellulase research. Frequently, TrCel5A was included in these studies, while TrCel7A or TrCel6A was the primary focus. In developing the now ubiquitous disodium 2,2′-bicinchoninate assay to quickly assess initial reducing end sugar production from cellulases, Johnston et al. evaluated Michaelis−Menten kinetics of TrCel5A relative to the three other primary Trichoderma cellulases, providing a consistent comparison of kinetic data accounting for initial rates of the biphasic systems.790 Kipper et al. similarly worked toward an accurate characterization of cellulase kinetic action and, in doing so, reported the remarkable differences in TrCel5A endo-activity versus TrCel6A and TrCel7A. Related processive characterization efforts also served in verifying the nonprocessive nature of TrCel5A.315 Jäger et al. investigated the merits of using an α-cellulosic substrate in the selection of cellulases for alkaline-pretreated biomass and briefly discussed TrCel5A action on this realistic substrate.791
Knowing that TrCel5A is a multimodular enzyme consisting of a CBM attached via a glycosylated linker to the catalytic domain, a natural follow-up question is to what extent the CBM plays a role in the enzyme’s action. Nidetzky et al., noting the inherent difficulties associated with appropriate selection of binding site models, set out to develop realistic relationships between cellulase binding and rate of substrate hydrolysis.792 As part of this study, the authors also evaluated the effects of the CBM on binding and hydrolysis. After incubation with filter paper substrate and fluorometric assessment of activity, Nidetzky et al. developed Langmuir isotherms describing filter paper adsorption of all the major Trichoderma cellulases. With respect to TrCel5A, the authors found that (1) the extent of adsorption increases with agitation, facilitating mass transfer to the solid substrate, (2) nearly 4 times as many binding sites were available for the intact TrCel5A compared to the CD alone, and (3) the relationship of binding to hydrolysis was nonlinear, suggesting nonproductive binding. These results indicate that the CD alone does little of the work in maintaining enzyme/substrate association, and thus, the CBM is critical to effective hydrolysis at relatively low substrate concentrations.
The effect of CBM substrate binding to real biomass has also been examined to much the same result as described by Nidetzky et al. for filter paper.792 Palonen et al. compared the binding behavior of TrCel7A and TrCel5A both with and without CBMs on steam pretreated softwood and lignin.793 Again, separation of the catalytic core domains from the CBMs significantly decreased binding of the enzymes to the substrates, though the effect on hydrolysis as a result was not determined. An interesting finding from this work was that the TrCel5A CD appears to adsorb to the alkaline isolated lignin to a much greater extent than TrCel7A. This is likely related to the characteristics of the globular surface of TrCel5A, as it has been repeatedly reported that addition of bovine serum albumin (BSA) in assays enhances stability of TrCel5A.323,792,793 Here, addition of BSA significantly reduced the binding of TrCel5A to lignin.
In a related study, Le Costaouëc et al. examined the effects of CBMs on hydrolysis for TrCel7A and TrCel5A as well as TaCel7A and TaCel5A on pretreated wheat straw and spruce.794 TaCel7A and TaCel5A, unlike the Trichoderma cellulases, naturally lack the CBM module and presumably have evolved in such a way as to maintain similar levels of hydrolysis. The authors genetically modified the Thermoascus cellulases with attached CBMs for a direct comparison to intact TrCel7A and TrCel5A. A general conclusion from this study was that, at high substrate loadings, all enzymes performed better without the attached CBMs, most likely by virtue of the high population of free oligosaccharides reducing the need for substrate recognition by the CBM. The Thermoascus cellulases have likely evolved to function in high substrate loading contexts, as both enzymes outperformed the CBM-deficient Trichoderma cellulases. Le Costaouëc et al. also suggest that the CBM−lignin interactions may constitute detrimental nonproductive binding that would be mitigated by the removal of the CBM entirely. Thus, even for Trichoderma cellulases, removal of the CBM for high substrate loading applications may result in better overall performance.
Early characterization efforts were particularly successful in predicting extent of glycosylation in TrCel5A. Both Saloheimo et al. and Ståhlberg et al. described TrCel5A putatively exhibiting one N-linked glycan and heavy O-linked glycosylation on the N-terminal structural domain, i.e., the CBM and linker (as described in section 5).323,741 Nearly 15 years later, Hui et al. would follow up these hypotheses to firmly establish the nature and location of N-linked glycans and the extent of heterogeneity in O-linked glycans of the linker domain.418 Using capillary liquid chromatography-electrospray mass spectrometry and matrix-assisted laser desorption and ionization time-of-flight, Hui et al. characterized the glycosylation patterns of all four major Trichoderma cellulases purified from a T. reesei RUT-C30 strain fermentation broth. The authors confirmed that TrCel5A exhibits a single N-linked GlcNAc at Asn103, the previously predicted location.323,418 The glycan is likely trimmed endogenously after secretion from a higher mannose form. The authors also indicate the linker exhibits anywhere from 32 to 42 O-linked mannose residues, only slightly fewer than the 39−46 attached to the long, glycosylated TrCel6A linker. The O-linked glycan moieties have been shown to play a key role in enzyme substrate binding, and such a high degree of glycosylation as reported here may account in part for the significant differences in adsorption between the intact enzyme and the CBM-deficient versions.792
Product inhibition of cellulases by cellobiose is a point of interest and noted concern when considering industrial conversion process conditions and design of experiments, as detailed in the previous two sections. Conversion rates of cellulase cocktails tend to drastically decline with increasing concentrations of glucose and cellobiose during the initial hydrolysis stage,581 though which enzymes directly contributed to this effect was not immediately clear. Gruno et al. directly examined product inhibition for TrCel7A and three Trichoderma EGs including TrCel5A to delineate contributions from each of these enzymes to the inhibition effect.611 As described

previously, the primary cellobiose inhibition effects tend to derive from processive CBHs. The authors report TrCel5A hydrolysis of tritiated amorphous cellulose is virtually unaffected by product inhibition, as was the case for the other EGs in the study. Comparatively, inhibition of TrCel7A on tritiated BC was nearly 100-fold higher than any of the EGs. It is likely that the relatively weak adsorption of TrCel5A with carbohydrate substrates resulting from the open, shallow binding site cleft plays a significant role in allowing the cleaved cellobiose product to rapidly dissociate from the active site, preventing any significant degree of inhibition. Using a completely different technique of monitoring unmodified cellulose hydrolysis by calorimetry, Murphy et al. would later corroborate this finding.615
Gruno et al. also report turnover numbers for the various EGs on amorphous cellulose. For TrCel5A, amorphous cellulose is hydrolyzed at a rate of 8.0 ± 0.1 s−1.611 Karlsson et al. previously reported a kcat of 65 s−1 for TrCel5A on cellopentaose.334 The disparity is likely not in the actual rate of hydrolysis, but as Gruno et al. point out, the differences may lie in the means by which hydrolysis was detected, namely formation of soluble sugars. In later years, accurate detection of cellulase kinetics has become a focus area of several research groups around the world.315,527,795
The means by which EGs and CBHs work together to deconstruct cellulosic substrates has intrigued the community for several decades. As described in section 4, the literature generally points toward the endo/exo synergism model, wherein EGs serve to randomly hydrolytically cleave glycosidic linkages as a sort of preparation for the processive CBHs that rapidly cleave without dissociation.305 Nidetzky et al. set out to describe synergistic cellulase action between binary mixtures of cellulases on filter paper. The authors point out that a key reason behind many of the inconclusive and contradictory reports was likely insufficient enzyme purity clouding results. Paying close attention to this fact, the authors reported the complementary action of both CBH combinations and CBH/EG combinations. A particularly effective combination was TrCel5A added to the TrCel7A CBH, which enhanced overall activity more than 2-fold. Medve et al. confirm this finding for hydrolysis on Avicel at 40 °C, but the authors go on to suggest that TrCel7A and TrCel5A compete for binding sites so as to negatively affect synergism, with TrCel7A as the more effective competitor.796 This latter finding suggests selection of enzyme ratios, erring on the low side of EGs, may effectively minimize enzyme costs while maintaining endo/exo synergism.797 The role of EGs in endo/exo synergism, in particular TrCel5A, continues to be a major research focus today.798−801 As new approaches for describing cellulase kinetics become available, models describing endo/exo synergism re-emerge, challenging the way we think about processive cellulase action and the role EGs play.557
Thermal stability of an enzyme is essential to industrial relevance, as increasingly higher temperatures facilitate biomass decomposition. Intuitively, it would seem that enzymes secreted from the same native host, such as TrCel6A, TrCel7A, TrCel7B, and TrCel5A, would have similarly evolved thermal stabilities given exposure to the same environmental conditions and stresses. However, this is not necessarily the case as reported in 1992 by Baker et al.802 Using differential scanning calorimetry and complementary tryptophan fluorescence monitoring, the authors discovered that thermal deactivation of all four enzymes is related to thermal unfolding, but that the overall stability of TrCel5A is approximately 10 °C higher than any of the three other cellulases (Figure 60). With a Tm of 75 °C, TrCel5A is significantly more tolerant of adverse process conditions than any of its complements. We note that while a Tm of 75 °C is significant by comparison, TrCel5A is not considered a hyperthermophilic enzyme, and thus, protein engineering efforts tend to focus on further gains in thermal stability as we will discuss below.
Figure 60. Differential scanning microcalorimetry of four T. reesei cellulases, with concentration of ∼1 mg/mL in cell. In the reproduced figure, CBHII is TrCel6A, CBHI is TrCel7A, EGI is TrCel7B, and EGII is TrCel5A. Reprinted with permission from ref 802. Copyright 1992 The Humana Press, Inc.
Protein stability in highly alkaline environments is also a favorable industrial attribute, particularly given the recent focus on ionic liquid pretreatment of biomass.91,803 Wahlström et al. investigated the effects of ionic liquids on TrCel5A and demonstrated that the ionic liquid environment had a markedly detrimental effect on hydrolysis.804 The authors report the CBM is particularly susceptible to the effects of ionic liquids as evidenced by side-by-side comparisons of hydrolysis both with and without a CBM domain.
8.2.3. H. insolens Cel5A. Few other GH5 fungal cellulases have been as well-characterized as TrCel5A. Motivated in part by the industrial applicability, H. insolens EGs have been catalogued and characterized, though sparingly. Prior to the initial report of cloning and sequencing,717 HiCel5A (EGII) was characterized as part of a set of H. insolens cellulases in studies focused on uncovering stereochemistry, specificity, and kinetics of several GH families.442,805 The stereochemistry of HiCel5A was indeterminate from the results of the Schou et al. study.442 However, a similar bacterial EG from family 5 (A at the time) was shown to follow a retaining stereochemical course in accordance with earlier reports of E. chrysanthemi EG Z.756 This account offered yet another clue that GH5 enzymes would all generally follow this stereochemistry.442 The characterization of HiCel5A by Schou et al. also suggested the enzyme active site consists of at least six subsites, as affinity for cellohexaitol over cellopentaitol was significantly higher. This latter finding has yet to be structurally confirmed.
Several years later, Schülein performed a follow-up study to identify pH activity sensitivity and ranges as well as to fill in the gaps regarding stereochemical course and kinetics.499 It had been noted previously that HiCel5A as well as several other H. insolens EGs exhibited specificity for CMC; thus, CMC was used
Table 11. Summary of Biochemical Characterizations of Fungal GH5 Cellulases
GH 5 substrate for
organism expression host enzyme subfamily pH opt temp opt (°C) opt substrate specificity ref comments
Acremonium sp. Cel5A PASC, Avicel, BC, pretreated corn stover, Vlasenko et al., assay condition 50 °C and pH 5.0; 52% residual
CBS265.95 CMC, xyloglucan, xylan, arabinoxylan, 2010332 activity on CMC after 3 h at 60 °C and pH 5.0
mannan, galactomannan
Aspergillus aculea- Cel5B PASC, Avicel, BC, pretreated corn stover, Vlasenko et al.,
Chemical Reviews

assay condition 50 °C and pH 5.0; 100% residual


tus CMC, xyloglucan, xylan, arabinoxylan, 2010332 activity on CMC after 3 h at 60 °C and pH 5.0
mannan, galactomannan (15% at 70 °C)
Aspergillus f umi- Pichia pastoris Egl2 5 5 50 CMC CMC, filter paper Liu et al., 2011807
gatus
Aspergillus f umi- Pichia pastoris Egl3 5 4 60 CMC CMC, filter paper, Avicel Liu et al., 2011807
gatus
Aspergillus nidu- EglA 5 6.5 50 CMC CMC, Avicel Chikamatsu et al.,
lans 1999808 Bauer et
al., 2006487
Aspergillus nidu- Pichia pastoris EglB 4 52 CMC CMC, cello-oligosaccharides Bauer et al., 2006487
lans
Aspergillus niger Pichia pastoris EglB 5 4 70 CMC barley β-glucan, locust bean gum, cellobiose, Li et al., 2012809
CMC, laminarin
Basidiomycete Cel5A 60 PASC PASC, Avicel, BC, pretreated corn stover, Vlasenko et al., assay conditions pH 5.0; 103% residual activity on
CBS495.95 CMC, xyloglucan, xylan, arabinoxylan, 2010332 CMC after 3 h at 40 °C and pH 5.0 (7% at 60 °C
mannan, galactomannan and 2% at 80 °C)
Basidiomycete Cel5B 60 PASC PASC, Avicel, BC, pretreated corn stover, Vlasenko et al., assay conditions pH 5.0; 28% residual activity on
CBS495.95 CMC, xyloglucan, xylan, arabinoxylan, 2010332 CMC after 3 h at 60 °C (5% at 80 °C)
mannan, galactomannan
Daldinia es- EG 6 70 CMC filter paper, Avicel, CMC Karnchanatat et al., activity stimulated by select divalent ions

1394
chscholzii (Eh- 2008810
renb.:Fr.)
Dictyoglomus Escherichia coli Cel5H 5 50−85 CMC CMC Shi et al., 2013811 maintains 78% activity at 4 M NaCl; chimerical
thermophilum CBMs moderately improve specific activity
toward CMC
Fomitopsis palust- EG47 CMC CMC, pNPC, Avicel Yoon et al., 2007812
ris
Fomitopsis pinico- EG 5 60 CMC CMC, Glc4/Glc5 Yoon et al., 2008813
la
Fusarium verticil- E2 5 80 CMC CMC, barley β-glucan, locust bean gum, de Almeida et al.,
lioides xylan, filter paper, cello-oligosaccharides 2013814
Gloeophyllum tra- Cel5A 5 CMC, pNPC, xylan, PASC, Avicel Cohen et al., suggest GtCel5A is a processive EG due to
beum 2005815 generation of cellobiose from Avicel
Gloeophyllum tra- Pichia pastoris Cel5B 5 3.5 30−70 CMC CMC, filter paper Kim et al., 2012816
beum
Humicola grisea Aspergillus ory- Egl2 5 5 75 CMC CMC, Avicel, xylan, cellobiose, Takashima et al.,
zae Glc3/Glc4/Glc5/Glc6 1997817
Myceliophthora Cel5A 5 PASC, Avicel, BC, pretreated corn stover, Vlasenko et al., assay condition 50 °C and pH 5.0; 50% residual
thermophila CMC, xyloglucan, xylan, arabinoxylan, 2010332 activity on CMC after 3 h at 70 °C and pH 5.0
mannan, galactomannan
Neocallimastix frontalis MCH3 Escherichia coli CelA 4 8.5 40 CMC CMC, Avicel, xylan, lichenan Fujino et al., 1998818
Neurospora crassa GH5-1 5 CMC, Avicel Sun et al., 2011819
Orpinomyces joy- CelB2 4 6.6 45 CMC CMC, barley β-glucan, lichenan, pNPC, p- Qiu et al., 2000820
onii nitrophenyl-β-D-cellotrioside
Review

DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Table 11. continued
GH 5 substrate for
organism expression host enzyme subfamily pH opt temp opt (°C) opt substrate specificity ref comments
Orpinomyces joy- CelB29 4 5.8 50 CMC CMC, barley β-glucan, lichenan, pNPC, p- Qiu et al., 2000820
onii nitrophenyl-β-D-cellotrioside
Penicillium brasi- Aspergillus ory- Cel5C 5 4 70 CMC CMC, Avicel Krogh et al., 2009821
lianum zae
Penicillium canes- Escherichia coli 3.4 60 CMC Chulkin et al.,


cens 2009822
Penicillium de- Saccharomyces Cel5A 5 4 60 CMC CMC, barley β-glucan, PASC Wei et al., 2010509 suggest PdCel5A may be processive
cumbens cerevisiae
Penicillium decumbens Pichia pastoris Cel5C 4 4.8 40−50 konjac glucomannan glucomannan, tamarind seed gum, CMC, barley β-glucan Liu et al., 2013823 sequence includes an Ig-like domain
Penicillium echi- Pichia pastoris Egl1 5 5.0−9.0 60 CMC CMC, filter paper, short oligosaccharides Rubini et al., strongly stimulated by calcium
nulatum 2009824
Penicillium janthi- Saccharomyces EGII 5 CMC, hydroxyethylcellulose, barley Mernitz et al.,
nellum cerevisiae β-glucan, lichenan 1996825
Penicillium pino- EG 5 70 CMC CMC, Avicel, cellulose, lichenan, laminarin, Jeya et al., 2010826
philum xylan, Glc4/Glc5
KMJ601
Phanerochaete EG38/Cel5A 5 CMC, Avicel, p-nitrophenyl-β-D-cellotrioside Uzcategui et al.,
chrysosporium 1991827
Phanerochaete EG44/Cel5B CMC, Avicel, p-nitrophenyl-β-D-cellotrioside Uzcategui et al., 1991827
chrysosporium
Phialophora sp. Pichia pastoris EgG5 4.0−5.0 70 CMC CMC, barley β-glucan, galactoglucomannan, Zhao et al., 2012828 retained 40% activity at pH 2.0
G5 filter paper, Avicel

Piromyces equi Cel5A 4 5.1 45 CMC CMC, barley β-glucan, lichenan, galacto- Eberhardt et al., characterization of CD only; narrow pH optimum
mannan, pNPC, xylan 2000829 range and relatively thermally unstable
Piromyces rhizinflata Escherichia coli EglA 4 5.5 50 CMC CMC, pNPC, barley β-glucan, lichenan, xylan, Avicel, filter paper Liu et al., 2001776 deletion of the N-terminal linker significantly affects thermal stability
Piromyces rhizinflata Escherichia coli EglA 4 6 50 CMC CMC, pNPC, barley β-glucan, lichenan, xylan, Avicel, filter paper Tsai et al., 2003724 characterization of CD only
Talaromyces β-glucanase 4.8 80 barley β-glucan barley β-glucan, lichenan Murray et al., strict preference for mixed linkages, markedly
emersonii CBS 2001830 thermostable
814.70
Thermoascus aur- Cel5A 5 4.0−4.4 70−80 CMC CMC, barley β-glucan, lichenan Parry et al., 2002780 putatively exhibits 5 binding subsites
antiacus
Thermoascus aur- Saccharomyces Cel5A 5 6 70 CMC CMC Hong et al., 2003831
antiacus cerevisiae
IFO9748
Thielavia terrestris Cel5A PASC, Avicel, BC, CMC, mannan, galacto- Vlasenko et al.,
mannan 2010332
Trametes hirsuta Aspergillus ory- Eg-1 5 5 50 CMC CMC, PASC, Avicel, kraft pulp Nozaki et al., long linker may aid in moderate activity toward
zae 2007832 Avicel
Trichoderma sp. Escherichia coli C4endoII 5 5 50 CMC CMC, Avicel Sul et al., 2004833
C-4
Trichoderma ree- Saccharomyces Cel5A 5 4.8 CMC-Na CMC-Na, Avicel, ball-milled cellulose, PASC Qin et al., 2008324 assay condition 50 °C
sei cerevisiae
Trichoderma vir- Saccharomyces EGVIII 5 6 60 CMC CMC, cellobiose, Glc3, Avicel Huang et al.,
ide cerevisiae 2009834
Volvariella volva- Pichia pastoris EG1 5 7.5 55 CMC CMC, PASC, filter paper Ding et al., 2001835
cea
to determine pH activity profiles. HiCel5A exhibited one of the broadest pH activity profiles out of the seven H. insolens cellulases examined, maintaining greater than 60% activity over a pH range of 5.5−9. At the optimal pH of 7.5, HiCel5A exhibits a turnover rate of 8 s−1. Schülein also confirmed that HiCel5A acts via the retaining stereochemical mechanism in this study.499 HiCel5A shares nearly 50% identity with TrCel5A, which is a striking degree of similarity given the relatively low homology of GH5s in general. Both of these fungal cellulases belong to subfamily 5 and exhibit an N-terminal CBM. Schülein was the first to describe the existence of the HiCel5A CBM explicitly in the literature, though the sequence had been annotated as such.499,717 Schülein also concludes that, on the basis of the sequence homology, HiCel5A very likely folds according to the canonical (β/α)8 motif of GH5s.

The relative effectiveness of CMC hydrolysis by a handful of industrially relevant EGs was more recently examined.806 Karlsson et al. concluded that HiCel5A was the most active hydrolyzer of the family 5, 7, 12, and 45 EGs examined alongside TrCel7B.806 The authors also showed that HiCel5A was one of the least inhibited enzymes in terms of accepting hydroxyl substitutions on the CMC substrate. This suggests HiCel5A may be somewhat more forgiving in its acceptance of nonideal substrates, which is ideal in terms of cellulose deconstruction but not for molecular probe applications.

8.2.4. Other Fungal GH5s. Vlasenko et al. recently undertook a broad study of EG substrate specificity to identify family-dependent specificities.332 The specificity of five fungal GH5 cellulases (Aspergillus aculeatus Cel5B, Basidiomycete CBS494.95 Cel5A, Basidiomycete CBS495.95 Cel5B, Myceliophthora thermophila Cel5A, and Thielavia terrestris Cel5A) against a variety of substrates was examined, including p-nitrophenyl substrates, CMC, PASC, Avicel, BC, pretreated corn stover, xyloglucan, wheat arabinoxylan, β-1,4-mannan, and galactomannan. Unsurprisingly, family 5 EGs appeared to exhibit a greater preference for PASC than for the other polymeric cellulose substrates. The EGs were also incapable of cleaving p-nitrophenyl substrates, potentially because binding of the small substrates to the long clefts (seven putative subsites) is too weak to induce activity. In comparing CMC and PASC activity levels across all the EGs, Vlasenko et al. suggested that the GH5 EGs appear to be more accommodating of O-substitution by methyl groups and rely less on hydrogen bonding of the substrate for hydrolysis in comparison to family 6, 7, 9, 12, and 45 EGs, a finding that lines up well with those of Karlsson et al.806 All of the GH5 EGs known to have cellulolytic activity were also remarkably effective at mannan degradation. It has been hypothesized that the GH5 family has resulted from divergence of a common ancestral gene,730,731 and this comprehensive specificity study provides evidence pointing toward divergent evolution where the ancestral gene exhibited mannanase activity.

Many additional characterizations of optimum pH and temperature, as well as substrate specificity studies accompanying gene cloning and sequencing efforts, have been conducted. These results are summarized in Table 11.

8.3. Protein Engineering

As with the catalytic mechanism, many of the noted successes in engineering GH5 cellulases pertain to bacterial GH5 EGs. Briefly discussing some of the more pertinent advances from bacterial representatives, we focus on published approaches to engineering specificity, activity, and stability in GH5s. These enhancements have been attained through a variety of approaches including host engineering, heterologous expression, and site-directed mutagenesis efforts.

For the most part, efforts associated with GH5 protein engineering have targeted thermal stability improvements in industrially relevant cellulases. A wealth of data surrounding thermophilic and hyperthermophilic GH5 family members exists by which analogies can be made. To date, both rational design and directed evolution techniques have demonstrated moderate success.752,777,778,836,837 Recently, Liu et al. used directed evolution on a Clostridium phytofermentans GH5 to improve the half-life by 92% at 60 °C.778 The gain was made through identifying a triple point mutation, though the means by which this cluster contributed to stability was not described. As part of this same study, the authors also investigated effects of CBMs on GH5 activity, noting that removal of native CBMs from the bacterial EG clearly had a detrimental effect on activity toward both regenerated cellulose and Avicel.778 Chimerical CBM and Ig domain engineering also frequently reduced activity. A similar approach toward engineering an uncultured bacterial GH5 EG resulted in a 7-fold increase in thermal stability, with the final enzyme having six mutations and an appended non-native family 6 CBM.836 This latter CBM-variant GH5 EG exhibited improved activity toward Avicel. On the basis of this small sample of directed evolution/chimera engineering studies, one can conclude that directed evolution may be a helpful first approach toward uncovering locations contributing to thermal stability. However, the appending of non-native CBMs to the variant does not always result in hydrolytic gains toward increasingly crystalline cellulose substrates.

Strategic implementation of disulfide bonds has also been shown to result in thermal stability gains in bacterial GH5 cellulases. Badieyan et al. performed MD simulations on 12 GH5 cellulases to develop a rules-based approach to engineering stability.752 Correlations with several protein structure dynamic factors were developed to identify which were more likely predictors of overall thermal stability. The authors confirmed that a positive correlation between protein flexibility and optimum activity temperature existed within the subset of GH5s (Figure 61). Using this information, Badieyan et al. identified a particularly thermally susceptible region of C. thermocellum CenC. This happened to be the large subdomain insertion between the α6 helix and β6 strand shown in Figure 53. A new disulfide bond was introduced in this location with the intent of stabilizing the protein without affecting activity. Assays of the wild-type and disulfide-linked mutant with pNPC suggested the disulfide bridge consistently resulted in improved activities at elevated temperatures. The Tm of C. thermocellum CenC was improved by approximately 4 °C with the additional disulfide bond. As has been noted previously in the literature,675 the mutation actually had little effect on overall protein fluctuations; rather, the reduced flexibility was confined to the subdomain region. This latter observation likely accounts for the maintained activity with the addition of the disulfide bond.

Hyperglycosylation of fungal GH5 cellulases by heterologous expression in yeast has been shown to positively impact thermal stability. In two separate studies, TrCel5A was heterologously expressed in S. cerevisiae and P. pastoris.779,838 In each case, the half-life of TrCel5A at 70 °C was at least doubled as a result of the hyperglycosylation. Both Qin et al. and Samanta et al. note that hyperglycosylation did not detrimentally affect specific activity on CMC, as is sometimes a concern when non-native glycans block substrate binding.779,838 This approach to
improved thermal stability is inherently dependent on the location of readily glycosylated serine and threonine residues at the surface. Thus, it is unlikely that this is a universal approach to engineering thermal stability in fungal GH5 cellulases. Individual enzymes will have to be characterized to ensure specific activity is not adversely affected in each case, and the extent to which hyperglycosylation aids in stability may vary.

Stability of TrCel5A at extreme pH values is also a motivator of several protein engineering studies, the alteration of which would enable harsher alkaline biomass pretreatment conditions, a pretreatment option under intense development currently.839−842 Wang et al. describe the use of directed evolution to increase the pH optimum of TrCel5A.843 The approach identified a surface-exposed asparagine, Asn321, that had been substituted by a threonine, shifting the optimal pH of TrCel5A from 4.8 to 5.4. Further site-directed mutagenesis at this site, N321H, broadened the optimal range to pH 6.0. The authors conclude this conserved residue is a key determinant of TrCel5A pH stability. Thus, this mechanism of improved alkaline resistance may similarly apply to other homologous GH5s. Qin et al. later applied the same approach to identify additional residues contributing to the specific pH profile of TrCel5A. In this latter study, a series of charged, surface-exposed residues were again shown to contribute to the pH optimum.324 Qin et al. tested several single point asparagine variants and three cluster variants, one of which interestingly included a substrate-binding aromatic residue. The alkaline resistivity was improved to a maximum pH of 6.2. The key takeaway message from both of these studies appears to be that charged residues at the surface of TrCel5A significantly affect pH stability. Notably, the gains in TrCel5A pH stability at increasingly higher pH do not yet reach those of alkaline levels, and significant surface residue remodeling is likely necessary to achieve such a goal.

Site-directed mutagenesis has also been successfully implemented to alter substrate specificity and activity in fungal GH5s. After examining conserved residues in several GH5 sequences, Wang et al. identified several key residues putatively involved with substrate binding and catalysis in Macrophomina phaseolina Egl1.844 The wild-type enzyme was capable of hydrolyzing both CMC and cellopentaose, the shortest cello-oligomer able to be hydrolyzed. Mutation of most of the identified residues abolished activity on both CMC and cellopentaose. However, the authors noted that mutation of one of the conserved aspartate residues to alanine yielded a variant that was unable to hydrolyze cellopentaose but maintained activity on CMC. A follow-up assay indicated the D232A M. phaseolina Egl1 variant was capable of hydrolyzing cellohexaose. The results suggest that substrate specificity in GH5 cellulases can be modified through a single point mutation, potentially in the same fashion in other GH5s, as the mutated aspartate was a conserved residue. Nevertheless, the benefit of increasing minimum EG substrate size with respect to industrial applications is unclear.

Perhaps one of the more well-known instances of GH5 engineering is the significant activity improvement from a single active site mutation in A. cellulolyticus Cel5A.845 Baker et al. mutated a tyrosine implicated in product inhibition (Y245G) to alleviate stacking interactions slowing the rate at which the cellobiose product leaves the active site. The improvement in activity was 40% over wild-type as a result of the staggering 1480% reduction in affinity of the active site for cellobiose, as measured by the inhibition constant. Unfortunately, given that this residue is not conserved either in type or similarity, it is difficult to directly define a lesson that may be applied in engineering fungal GH5s. In fact, TaCel5A already displays a glycine in the same location, suggesting product inhibition in TaCel5A may have been addressed through natural evolutionary changes. TrCel5A displays an asparagine and may similarly be unaffected by product inhibition at this site.

Finally, heterologous hosts capable of rapidly producing GH5s demonstrating near-native activity are particularly useful for screening enzymes in the search for novel activities and properties. Nakazawa et al. describe successful recombinant expression of the key T. reesei EGs, Cel7B, Cel5A, and Cel12A.846 The authors report the first description of a prokaryotic host (E. coli) expressing all three EGs from T. reesei. The authors characterize the activity, stability, and specificity of each of the recombinant EGs, as they believe demonstration of properties similar to those of the natively expressed EGs represents a potentially advantageous opportunity to use the E. coli host in high-throughput mutagenesis studies. The recombinant TrCel5A CD, the only GH5 of the three EGs, behaved remarkably similarly to TrCel5A expressed in T. reesei. The TrCel5A CD, targeting only β-1,4 glycosidic linkages in model substrates, maintained a broad pH range of 4.5−6, with an optimum pH of 5.5 at 50 °C. The enzyme also maintained stability at 50 °C for 60 min, though activity dropped by 60% with a 10 °C increase. In general, this study offers a promising alternative host for high-throughput expression of TrCel5A mutants.

8.4. Conclusions

GH5 enzymes represent one of the larger and more diverse groups of GHs with over 20 different hydrolytic modes of action and at least 51 different phylogenetic subfamilies. Though relatively few, the fungal GH5 cellulases have been well-characterized; the industrial relevance of these cellulases is paramount as this family contributes over 50% of fungal EG action in the T. reesei secretome.742 Structural and biochemical characterization efforts over the years have successfully elucidated the following features of GH5 cellulases:

(1) All of the fungal GH5 cellulases discovered to date are EGs.

Figure 61. Using MD simulation of a large set of GH5 EGs from hyperthermophilic, thermophilic, mesophilic, and psychrophilic organisms, Badieyan et al. illustrated a correlation between protein flexibility, as measured by root-mean-square deviation (RMSD), and the optimal activity temperature (OAT) of the enzyme. The authors performed 10 ns MD simulations of each enzyme at three different temperatures. The correlation appears to be linear. Reprinted with permission from ref 752. Copyright 2011 Wiley Periodicals, Inc.
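The Figure 61 caption describes a linear relationship between a flexibility measure (RMSD) and the optimal activity temperature (OAT). As an illustration only, a relationship of this kind can be quantified with an ordinary least-squares fit and a Pearson correlation coefficient. The sketch below is a minimal Python example under stated assumptions: the function name `fit_line` and every numerical value are hypothetical placeholders chosen to exercise the code, not data from Badieyan et al., and the sign of the slope carries no meaning.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-deviations about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical placeholder values (NOT taken from the cited study):
# a per-enzyme RMSD in angstroms and an optimal activity temperature in °C.
rmsd_angstrom = [1.0, 1.5, 2.0, 2.5]
oat_celsius = [30.0, 50.0, 70.0, 90.0]

slope, intercept, r = fit_line(rmsd_angstrom, oat_celsius)
print(f"slope = {slope:.1f} °C/Å, intercept = {intercept:.1f} °C, r = {r:.2f}")
```

With real simulation output, the per-enzyme RMSD values would replace the placeholder list, and a value of r close to ±1 would support describing the trend as linear.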
Table 12. Reported GH12 Crystal Structures


source and original name in primary citation PDB code resolution (Å) brief highlights ref
Archaeal Structure
Pyrococcus furiosus DSM3638 Cel12A 3VGI 1.07 first archaeal GH12 structure 874
Bacterial Structures
Bacillus licheniformis DSM13 BlXG12 2JEM 1.78 apo structure 862
2JEN 1.40 ligand complex structure 862
Rhodothermus marinus ITI378 Cel12A 1HOB 1.80 apo structure, first structure of a thermostable GH12 enzyme 875
2BW8 1.54 apo structure 884
2BWA 1.68 bound cellotriose 884
2BWC 2.15 bound cellotetraose 884
3B7M 2.10 thermostable variant enzyme 884
Streptomyces lividans 1376 CelB 1NLR 1.75 first GH12 structure 871
2NLR 1.20 first GH12 ligand complex structure 872
Streptomyces sp. 11AG8 Cel12A 1OA4 1.50 structure of a thermophilic GH12 enzyme 876
Thermotoga maritima MSB8 Cel12A 3AMH 2.09 wild-type apo structure 877
3AMM 1.98 wild-type soaked with cellotetraose 877
3AMN 1.47 E134C variant soaked with cellobiose 877
3AMP 1.78 E134C variant soaked with cellotetraose 877
3AMQ 1.80 E134C variant cocrystallized with cellobiose 877
3O7O 2.41 891
3VHN 2.50 bound cellobiose ligand 887
3VHO 1.93 Y61GG insertion 887
Fungal Structures
Aspergillus aculeatus KSM510 Xeg1 3VL8 1.90 878
3VL9 1.20 bound xyloglucan 878
Aspergillus niger CBS 120.49 EglA 1KS4 2.50 palladium complex 879
1KS5 2.10 879
Humicola grisea ATCC 22081 Cel12A 1OLR 1.20 apo structure 880
1UU4 1.49 cellobiose ligand 208
1UU5 1.70 cellotetraose ligand with mixed linkages binding from −4 to −1 subsites 208
1UU6 1.40 cellotetraose binding from −2 to +2 subsites by cellopentaose soak 208
1W2U 1.52 thio-oligosaccharide ligand spanning active site 208
Hypocrea schweinitzii ATCC 66965 Cel12A 1OA3 1.70 876
Trichoderma harzianum IOC-3844 Cel12A 4H7M 2.07 881
Trichoderma reesei QM9414 Cel12A 1H8V 1.9 first fungal GH12 structure 567
1OA2 1.5 A35V variant 876
1OLQ 1.7 P210C variant 876

(2) GH5 cellulases exhibit the ubiquitous TIM barrel fold. The generality of this structure allows for significant variation of the loops connecting the major structural elements and likely accounts for the wide variety of observed substrate specificities, thermal stabilities, and pH optima.
(3) Relatively few GH5 fungal structures have been determined (only 4 thus far), and much of what is known regarding mechanistic function has been determined from bacterial representatives.
(4) The catalytic active site of GH5 enzymes is defined by two sequence motifs, an Asn-Glu-Pro motif on the β4 strand that includes the catalytic acid and a Glu-Xxx-Gly motif on the β7 strand (where Xxx is typically an aromatic residue) that contains the complementary Glu nucleophile. The location of the catalytic acid and nucleophile on strands β4 and β7 places GH5 cellulases in the 4/7 superfamily of enzymes.
(5) Hydrolysis of the glycosidic linkage occurs through a two-step, retaining mechanism. The glycosylation step produces a glycosyl-enzyme intermediate resulting in glycosidic bond cleavage. A water molecule then conducts nucleophilic attack during the deglycosylation step, which resets the enzyme to its initial catalytic state. Studies have not yet convincingly determined the rate-limiting step in GH5 catalysis.
(6) Perhaps as a result of the significant sequence diversity within the family, GH5 cellulases respond particularly well to protein engineering efforts. Significant gains in thermal stability have been made through the introduction of disulfide bonds and hyperglycosylation with little adverse impact to activity.
(7) GH5 EGs naturally occur both with and without appended CBMs. The appendage of this module appears to be evolutionarily related to the environmental conditions (i.e., at high-substrate loading conditions, GH5s do not require CBMs for effective cellulolytic conversion).

Bacterial GH5s have been instrumental in developing our understanding of the catalytic mechanisms of cellulases within this family, and a general understanding of the role these enzymes play in industrial biomass conversion is clear. Nevertheless, the role the constellation of loops connecting the major TIM barrel structural elements plays in determining specificity and thermal stability remains an outstanding question. Given the diversity of specificities as well as the range of thermal and pH tolerances within this family of
Table 13. Summary of Biochemical Characterizations of Fungal GH12 Cellulases
temp opt
organism expression host enzyme pH opt (°C) substrate for opt substrate specificity ref comments
Aspergillus aculea- Cmc1/FI- 4.5 50 insoluble cello-oligosac- CMC, PASC, insoluble cello-oligosacchar- Murao et al., 1988892
tus F-50 CMCase charides (average DP ide (av DP 20), Glc4/Glc5/Glc6
20)
Aspergillus aculea- Cel12A xyloglucan, xylan, arabinoxylan, galacto- Vlasenko et al., 2010332
tus mannan
Aspergillus aculeatus Cel12B PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan Vlasenko et al., 2010332
Aspergillus fumiga- Cel12A PASC, Avicel, BC, CMC, xyloglucan Vlasenko et al., 2010332 assay conditions 50 °C and pH 5.0
tus
Aspergillus kawa- Saccharomyces CMCase-I CMC Sakamoto et al., 1995893 assay conditions 30 °C
chii IFO 4308 cerevisiae
Aspergillus niger Escherichia coli EglA 4.5 CMC CMC, xyloglucan, β-glucan Van Den Broeck et al.,
CBS 120.49/ 2001894
N400
Aspergillus niger Kluyveromyces EglA CMC, β-glucan Hasper et al., 2002895 assay conditions 40 °C and pH 4.5
CBS 513.88 lactis
Aspergillus oryzae Aspergillus ory- CelA 5.0 55 CMC CMC Kitamoto et al., 1996489
KBN616 zae
Aspergillus terreus EglD 4.8 50 CMC CMC, Avicel, filter paper Narra et al., 2014896
NIH2624
Chrysosporium EGIII 4.5−6 80 CMC CMC, labeled CMC, filter paper, β-glucan, Emalfarb et al., 2003715
lucknowense Avicel, galacto-mannan
Clonostachys rosea Aspergillus niger EGIII/ o-nitrophenyl-β-D-cello- azurine cross-linked hydroxyethyl cellulose, Goedegebuur et al., Gliocladium roseum Cel12C max Tm of 45.9 °C at pH 5

Cel12C bioside (oNPC) oNPC 2002,853 Sandgren et al.,
2005890
Fomitopsis palustris EG-II/ 3.5−4.0 45−55 CMC CMC, Lichenan, barley β-glucan, Glc6, Shimokawa et al., 2008897 stable up to 55 °C
FFPRI 0507 Cel12 pachyman, laminarin, xyloglucan, gluco-
mannan
Gloeophyllum tra- Aspergillus niger Cel12A 4.5 50 CMC CMC, PASC, filter paper, Glc2/Glc3/Glc4/ Tambor et al., 2012801
beum ATCC Glc5/Glc6
11539
Humicola grisea Aspergillus niger Cel12A 5.5 oNPC azurine cross-linked hydroxyethyl cellulose, Goedegebuur et al., max Tm of 68.6 °C at pH 7, optimum pH of 5.5
ATCC 22081 oNPC 2002,853 Sandgren et al., measured at 40 °C
2005890
Hypocrea schwei- Aspergillus niger Cel12A 5.5 oNPC azurine cross-linked hydroxyethyl cellulose, Goedegebuur et al., max Tm of 49.2 °C at pH 5, optimum pH 5.5 measured
nitzii ATCC oNPC 2002,853 Sandgren et al., at 40 °C
66965 2005890
Myceliophthora Cel12A PASC, Avicel, BC, pretreated corn stover, Vlasenko et al., 2010332 assay conditions 50 °C and pH 5.0
thermophila CMC, xyloglucan, xylan, arabinoxylan
Phanerochaete Cel12A CMC, amorphous cellulose, xylan, mannan, Henriksson et al., 1999898 assay conditions pH 5 and 37 °C
chrysosporium filter paper
K3
Polyporus arcular- Cel3A 4.9 52 CMC CMC, Avicel, PASC, Glc3/Glc4/Glc5/Glc6 Ishihara et al., 2005899
ius 69B-8
Trichoderma reesei Cel12A 5 50 β-glucan CMC, PASC, Avicel, Glc4/Glc5, barley Karlsson et al., 2002334 retained >50% activity at pH 7, and 25% at pH 4; at pH
QM9414 β-glucan, glucomannan, filter paper 5, retained 90% activity at 60 °C, and <10% at 70 °C
Figure 62. In 1997, the first three-dimensional structure of a GH12 enzyme, S. lividans CelB2, was determined at 1.75 Å resolution by Sulzenbacher et
al. (PDB code 1NLR).871 The overall structure fold of SlCelB2, as for all GH12 enzymes with known structures, was found to be a β-sandwich. (A) A
top-down and (B) end-on view of the structure of SlCelB2 (yellow cartoon) is shown aligned with the first fungal GH12 cellulase structure, TrCel12A
(blue/white cartoon, PDB code 1H8V).567 In each panel, the 2-fluorocellotriosyl ligand from the SlCelB2 structure (PDB code 2NLR) is shown in cyan
stick. The catalytic residue motif is also shown in stick. (C) A close-up of the glycosyl-enzyme intermediate (PDB code 2NLR) revealed
the SlCelB2 catalytic nucleophile, Glu120, covalently bound to the ligand (yellow and cyan stick).872 A second distinct conformation of the
2-fluorocellotriosyl ligand was found bound in the crystal structure of SlCelB2, representing the product species from the deglycosylation step of the
double-displacement mechanism (orange sticks). The TrCel12A active site is shown in blue/white sticks for comparison.

enzymes, the GH5 family represents a unique case study for understanding how protein structure contributes to activity and stability. Structural and biochemical characterization of a large set of GH5 enzymes is likely to uncover motifs and mutations that contribute to stability of these enzymes.

9. FAMILY 12 GLYCOSIDE HYDROLASES

Family 12 glycoside hydrolase (GH12) enzymes have been extensively characterized over more than four decades,847 long before this family of enzymes was classified in the CAZy database151−153 or even as cellulase family H, as this group of enzymes was initially named. Cel12A from T. reesei is most likely the first GH12 enzyme to be discovered and characterized in any detail and currently represents the most extensively characterized of all GH12 proteins to date (Table 2). The T. reesei GH12 enzyme (TrCel12A) was initially called EG III and occasionally still is by some research groups. TrCel12A was described in several early studies as a low molecular weight (25 kDa) cellulase. At a total of 218 amino acids, TrCel12A is relatively small compared to the other identified cellulases from T. reesei. TrCel12A was found to be only sparsely glycosylated and is one of the three cellulases from this fungus that does not exhibit a CBM and linker.319,847−851 The first time that this “low-molecular-weight” GH12 cellulase from T. reesei was described in the literature was in a paper by Håkansson et al. from 1973, though the cloning of the TrCel12A gene would not be reported for another two decades.847,851 Some years later, the cloning and heterologous expression of TrCel12A in S. cerevisiae was described.852 The amount of natively expressed TrCel12A has been reported to be relatively small compared to other cellulases expressed by this fungus, less than 1% of the total amount of expressed protein.848

Subsequent genome sequencing campaigns following the discovery of the first GH12 identified a wide range of additional GH12 gene sequence members, though the family as a whole is relatively small. At the time of this review, there are 357 GH12 entries in the CAZy database.151−153 Of these 357 entries, 55 are from Archaea, 187 from Bacteria, 114 from Eukaryota, and 1 entry remains unclassified. In 2002, Goedegebuur et al. described the cloning and characterization of 15 fungal GH12 enzymes.853 As part of this study, the authors also proposed a GH12 subfamily classification structure based on amino acid sequence alignments of the catalytic cores. Accordingly, GH12s can roughly be divided into four phylogenetic subfamilies denoted GH12-1 to GH12-4. Subfamilies GH12-1 and GH12-2 contain fungal representatives; subfamily GH12-3 represents Streptomyces members, and GH12-4 is a Thermophiles subfamily containing most of the archaeal GH12 enzymes. This initial subfamily division of GH12 enzymes would later be confirmed and extended in three additional GH12 characterization papers.854−856 In the 2008 paper from Master et al., in which they characterize a GH12 xyloglucanase from A. niger, the authors state that fungal enzymes with a strong preference for xyloglucan only can be found in subfamily GH12-2. Vlasenko et al. later added that the subfamily classifications are largely undifferentiated in terms of substrate specificity, with EGs and xyloglucanases frequently sharing subfamily classifications.332 This phenomenon likely arises from divergent evolution from a shared ancestral gene.

The specific function of GH12 enzymes in fungal and bacterial plant-degrading systems is not well-characterized or understood, and indeed, many questions remain as to the “true” function of GH12 enzymes. Some biochemical data on the specific activity of these enzymes can be found in the literature, including studies of the activity on soluble850,857 and insoluble substrates.334,806,849,855,858,859 A summary of identified activities of GH12 enzymes is included in Table 13. Several reports have demonstrated that GH12 enzymes, in addition to being endo-β-1,4-glucanases (EC 3.2.1.4), also exhibit activity against β-1,3/1,4-glucan (EC 3.2.1.73),334,856,860 xyloglucan (EC 3.2.1.151), xylan (EC 3.2.1.8),334,499,850,854,861−865 and lichenan.866 However, some GH12s exhibit stronger preferences for the noncellulosic substrates; the GH12 enzymes from Pyrococcus furiosus (EglA)867 and the pathogenic fungus Cochliobolus carbonum (MLG2)868 are reported to prefer mixed-linked β-glucans over 1,4-linked polysaccharides. GH12 enzymes have also been shown to induce extension of plant cell walls, which has been most extensively characterized in TrCel12A.860,869 For
Figure 63. Sequence alignment of six of the GH12 structures reviewed here. Strictly conserved residues are shown in red block, and chemically
similar residues in red text. The blue boxes indicate chemical similarity across a grouping of residues. The secondary structural elements of HgCel12A
are shown above the sequences. The catalytic nucleophile is marked by a yellow star, the proposed catalytic acid/base is marked by a blue star, and the
residue with a supporting role in catalysis is marked by a magenta star. The figure was generated with ESPript (http://espript.ibcp.fr).347

the H. insolens GH12 enzyme, Cel12A, Schülein et al. concluded that the enzyme’s low activity on celluloses, β-glucan, and other polysaccharides implied that “the function of this cellulase remains unclear”, suggesting additional functions remain to be discovered.499
All GH12 enzymes catalyze the cleavage of the β-1,4-glycosidic linkages in the various types of β-glucan with a net retention of the anomeric configuration via a double-displacement mechanism as described in section 4.157 This GH12 catalytic mechanism was first demonstrated in 1997 when Schülein et al. characterized HiCel12A.499 One year later, Zechel et al. used ligand labeling studies to identify the catalytic nucleophile, residue Glu120 of Streptomyces lividans CelB2 (Figure 62).857 The identity of the nucleophile was later also confirmed for TrCel12A in a 2000 mutation study by Okada et al.870 The two catalytic residues in all GH12 enzymes are invariant carboxylates, Glu116 and Glu200 in TrCel12A, which are the catalytic nucleophile and acid/base, respectively. These two catalytic residues are separated by a distance of ∼5.5 Å in TrCel12A, a distance frequently observed for the nucleophile/acid−base involved in the retaining mechanism of GHs.444
In addition to the catalytic nucleophile and acid/base residues (Glu116 and Glu200 in TrCel12A numbering), GH12s involve a third carboxylate residue (Asp99 in TrCel12A numbering) as part of a catalytic triad. This catalytic triad was first observed in GH clan-B enzymes.172,873 A summary of the GH12 structures discussed is provided in Table 12.
9.1. Structural Studies
To date, there are a total of 34 GH12 structures available in the PDB, originating from 12 unique GH12 enzymes: one archaeal,874 five bacterial,862,871,875−877 and six eukaryotic.567,876,878−881 Table 12 summarizes the structures solved and notes the presence or absence of bound ligands. The first 3-D structure of a GH12 enzyme to be solved was that of the bacterial S. lividans CelB2 (SlCelB2) at 1.75 Å resolution (PDB code 1NLR).871 Two years later, the same group presented a ligand complex structure of SlCelB2 at 1.2 Å resolution

exhibiting a trapped 2-fluorocellotriosyl ligand covalently bound to the catalytic nucleophile Glu120 (PDB code 2NLR, Figure 62C).872 These initial structures were influential in determining the overall fold of the enzyme and in uncovering key catalytic components. In 2001, Sandgren et al.
determined the first 3-D structure of a fungal GH12 enzyme,
apo TrCel12A, at 1.9 Å resolution (PDB code 1H8V).567
9.1.1. Overall Structure. GH12 enzymes generally exhibit a β-sandwich fold (Figure 62A) and, together with GH11 enzymes, form GH clan-C in the CAZy classification, as described in more detail below. TrCel12A is a typical structural
example of a GH12 enzyme (Figure 62A) consisting of 15 long
β-strands that fold into two twisted, largely antiparallel β-sheets,
A and B, which pack on top of one another. The convex β-sheet
A consists of six antiparallel strands (A1−6), and the concave β-
sheet B consists of nine largely antiparallel strands (B1−9). The
β-strands in the two β-sheets are numbered consecutively from
1 to 6/9 after their order in the sheets, with β-strands A1/B1
closest to the proposed nonreducing end of the binding cleft
(Figure 62A,B). A single α-helix in the structure packs against
the outer convex surface of β-sheet B, found at the bottom right in Figure 62A. The enzyme is very compact with dimensions of approximately 40 Å × 40 Å × 30 Å. Among GH12 enzymes, there are two highly conserved cysteine residues, Cys4 and Cys32 in TrCel12A, that form a disulfide bond bridging two β-strands. Some GH12 enzymes have been observed with additional cysteine residues,880 but these are not highly conserved among GH12 protein members. In some proteins, these nonconserved cysteine residues occur as free cysteines, and the effect of these free cysteines on catalytic activity and protein stability has been studied in detail.880
Like many fungal cellulases reported to date, the N-terminal glutamine of TrCel12A undergoes a cyclization and condensation reaction with the amine group of the N-terminus producing a cyclic pyroglutamate (section 6.4). This cyclization and condensation of the N-terminus is thought to make the protein resistant to proteolytic degradation.882 Additionally, as described at length for GH7 and GH6 cellulases, fungal GH12 enzymes are often found to be glycosylated. An N-acetylglucosamine residue is found covalently attached to Asn164 in the crystal structure of TrCel12A,567 one of two N-glycosylation amino-acid sequence motifs in the protein, consistent with previous observations of TrCel12A glycosylation (Figure 63).882
The concave surface of β-sheet B in GH12 enzymes forms a large crevice in the molecular surface perpendicular to the β-strand direction of the enzymes, as can be seen in Figure 62A. This crevice is approximately 35 Å long, 8 Å wide, and 15 Å deep, and forms the substrate-binding cleft of GH12 enzymes. It is estimated from the size of the catalytic cleft of GH12 enzymes that these have the potential to bind at least six glycan residues. The crevice of GH12 enzymes contains two glutamate residues, Glu116 and Glu200 of TrCel12A, which are invariant throughout GH12. As stated above, it was predicted that Glu116 was the catalytic nucleophile in TrCel12A, which was directly confirmed by site-directed mutagenesis studies.870
The bottom of subsites +2 and +3 in the substrate binding cleft of all known GH12 enzymes appears to be formed by residues from the “cord” region (Pro129 and Ile130 in TrCel12A), indicated with a red circle in Figure 64. The proline ring of the “cord” is sandwiched between two aromatic residues, Trp120 and Tyr147 in TrCel12A. The cord is structurally well-conserved between all GH12 enzymes. Given the similar nature of the folds, GH11 enzymes also exhibit this “cord” region, and for some of the GH11 enzymes, such as T. reesei XynII (a GH11 xylanase), there have been reports that the “cord” region is flexible upon substrate binding.883 As of yet, there are no signs of similar conformational flexibility in the GH12 cord region.

Figure 64. Cord region of TrCel12A (PDB code 1H8V).567 The red dashed circle encompasses the cord region. Within this region, four residues, shown in stick representation, form the basis of the +2 and +3 binding subsites.

9.1.2. GH12 Ligand Complex Structures. The first GH12 ligand complex structure, as mentioned above, was that of SlCelB2.872 This ligand-bound structure of SlCelB2 captured a 2-fluorocellotriosyl ligand, covalently bound to the enzyme, spanning subsites −3 to −1. Two distinct conformations of the ligand were observed: one where the ligand was covalently bound to the catalytic nucleophile, Glu120, forming a glycosyl-enzyme intermediate,872 and a second form representing the product species produced after the second reaction step of the double-displacement mechanism.
The first fungal GH12 ligand complex structure was that of H. grisea Cel12A (HgCel12A), described in a 2004 paper by Sandgren et al.208 The authors present a total of four different HgCel12A ligand complex structures formed by soaking crystals of the native enzyme with either cellobiose (PDB code 1UU4), cellotetraose (PDB code 1UU5), cellopentaose (PDB code 1UU6), or a thio-linked cellotetraose derivative (G2SG2, PDB code 1W2U). The crystals were soaked at pH values that prevent rapid hydrolysis of the ligand, which made it possible to capture intact linkages across the active site of the wild-type enzyme. A theoretical cellohexaose ligand complex has been proposed from this wealth of structures (Figure 65, top panel). These HgCel12A ligand complex structures enabled mapping of the noncovalent interactions between the enzyme and the glucosyl chain bound in subsites −4 to +2 of the enzyme (Figure 65) and shed light on the mechanism and function of GH12 cellulases. The unhydrolyzed cellopentaose ligand and the G2SG2 cello-oligomer were both found spanning the active site of the catalytically active HgCel12A enzyme. In both cases, the pyranoside bound in subsite −1 displayed a ¹S₃ skew-boat conformation. After soaking HgCel12A in cellotetraose, a cello-oligomer was captured bound across subsites −4 to −1 of the enzyme. Rather surprisingly, this ligand contained a β-1,3-linkage between the two cellobiose units of the oligomer (PDB

code 1UU5). The β-1,3-linkage between the two cellobiose units of the complexed structure was proposed to have been formed via a transglycosylation reaction that most likely had occurred during the ligand soak of the protein crystals. The authors also proposed that the close fit of this β-1,3-linkage ligand found in the HgCel12A liganded structure and the binding sites it occupies suggests a mixed β-glucan activity for this enzyme.

Figure 65. Cartoon representation of the first fungal GH12 ligand complex structures from H. grisea Cel12A.208 (A) A theoretical model ligand complex with a cellohexaose spanning the active site derived from the 1UU4, 1UU5, and 1UU6 structures is shown. The protein is shown in translucent magenta cartoon. Aromatic residues lining the binding cleft as well as catalytic residues are shown in green stick. (B) The cellobiose ligand (1UU4, yellow stick) was found bound in the +1 to +2 subsites. (C) The cellopentaose ligand (1UU6, gray stick) bound across the −2 to +2 subsites. The electron density of one pyranose moiety could not be resolved. (D) Soaking in cellotetraose resulted in a mixed-linked cellotetraose ligand (1UU5, cyan stick) spanning subsites −4 to −1 of the enzyme.

In addition to the SlCelB2 and the HgCel12A ligand complex structures, structures with various types of bound ligands have been determined for four other GH12 enzymes, revealing detailed information regarding substrate specificity and mode of action of GH12 enzymes. The first additional GH12 ligand complex structure determined was that of a thermally stable, bacterial EG presented in 2006 by Crennell et al.884 The Rhodothermus marinus Cel12A (RmCel12A) structure was captured in complex with cellotetraose (PDB code 2BWC) and cellotriose (PDB code 2BWA) by soaking crystals in a solution containing cellopentaose prior to X-ray data collection. In doing so, RmCel12A bound either a cellotetraose or a cellotriose ligand in each of the two copies of the enzyme in the structure, but not both. The cellotetraose ligand was found bound in subsites −3 to +1 and the cellotriose ligand in subsites −3 to −1, all in conformations and positions equivalent to those observed in the HgCel12A complex structure with a bound cellopentaose molecule (1UU6).
In 2007, Gloster et al. presented a 1.4 Å resolution ligand complex structure of a GH12 xyloglucanase from Bacillus licheniformis (BlXG12, PDB code 2JEN).862 The BlXG12 ligand complex structure was obtained from a crystal of a nucleophile variant of this enzyme, E115Q. A crystal of this “dead” variant of BlXG12 was soaked with a 20 mM xyloglucan oligosaccharide mixture prior to X-ray data collection. The BlXG12 ligand complex structure obtained had two structurally different xylosyl ligands bound in the same substrate-binding cleft of the enzyme. One ligand spanned the −4 to −1 subsites, and the second bound to the +1 to +2 subsites. The ligand bound in the −4 to −1 subsites consisted of four β-1,4-glucose moieties decorated with two α-1,6 linked xylose residues branching out from the glucose moieties in subsites −3 and −2. The second ligand, bound in subsites +1 and +2, consisted of two β-1,4-linked glucose moieties, both decorated with an α-1,6 linked xylose residue. The authors concluded that the lack of additional decoration of the two xyloglucan ligands bound in the BlXG12 complex structure, beyond the bound α-1,6 xylose residues, indicates that this enzyme has a preference for “naked” β-1,4-glucan chains over more decorated types of β-1,4-glucan.862
In a paper by Cheng et al. in 2011,877 four ligand complex structures of T. maritima Cel12A (TmCel12A) were presented. These ligand−complex structures were obtained after soaking crystals of TmCel12A, either wild-type protein or an E134C variant, in either cellobiose (PDB codes 3AMN and 3AMQ) or cellotetraose (PDB codes 3AMM and 3AMP). In the two TmCel12A cellobiose-soaked complex structures, two copies of a cellobiose ligand were found bound in the substrate-binding cleft. The first one was bound in subsites −2 and −1 and the second one in subsites +1 and +2. In the two TmCel12A cellotetraose-soaked ligand complex structures, a cellotetraose molecule was found bound to subsites −4 to −1, and in addition to this, a cellobiose molecule was also bound in subsites +1 and +2. It is notable that the glucose molecules occupying subsite −1 in all four TmCel12A ligand complex structures presented in the paper displayed an α-anomeric configuration.
9.2. Plant Cell Wall Loosening/Extension Activity by GH12 Enzymes
Some GH12 enzymes have been found to exhibit plant cell wall loosening and extension capability. This capability has most

Figure 66. Superimposition of GH clan-C members, TrCel12A and TrXynII. (A) Top-down and (B) end-on views of GH12 TrCel12A (blue/white
cartoon, PDB 1H8V)567 structurally aligned with GH11 TrXynII (orange cartoon, PDB 1ENX)885 reveal high structural similarity between the two
families despite very low sequence similarity. The catalytic triads are shown for each structure in stick representation.

been extensively characterized for TrCel12A and was described in great detail in an extensive paper by Yuan et al. in 2001.860 The same group revisited the topic in a subsequent manuscript in 2012.869 In the first of these two studies, Yuan et al. demonstrated that TrCel12A has the ability to induce extension of heat-inactivated type I cell walls from cucumber (Cucumis sativus) and type II cell walls from wheat (Triticum aestivum).860 The authors show that TrCel12A hydrolyzes an important structural polysaccharide of the cell wall, thereby leading to a significant structural weakening of the wall. The weakening was most readily detected as an increase in wall plasticity and elasticity. Because TrCel12A readily hydrolyzes wall glucans such as xyloglucan and β-1,3/β-1,4-glucans but not xylans, mannans, or galactans, the authors conclude that this type of glucan hydrolysis is the probable mechanism for the TrCel12A effects on the wall. The overall conclusion of these two papers is that TrCel12A has the capability to modify both type I and II plant cell walls, but substantial extension of these is found only in type I cell walls.869
9.3. GH Clan-C: Structure and Sequence Comparison
The GH clan-C, including GH11 and GH12, consists of 4 major groups of sequences that correspond to the bacterial and fungal members in each of the two GH families. Despite low sequence identity between the two different subgroups (bacterial and fungal) within each one of these two GH families and the virtual absence of identity between the two GH families forming GH clan-C, the overall 3-D structures for all GH clan-C members exhibit remarkable similarity. Figure 66 illustrates the superimposition of a GH12, TrCel12A, with a GH11, TrXynII. The primary structural difference between the two GH clan-C members is the absence of the first two β-strands, A1 and B1, in TrXynII. This is true of more than half of GH11s. Nevertheless, a structure-based sequence alignment of GH clan-C members, extended to include sequences for which the structure is unknown, reveals that only three amino acids are completely conserved across GH clan-C. Two of the conserved amino acids correspond to the active site nucleophile and the catalytic acid/base (Glu116 and Glu200, respectively, in TrCel12A), both of which are essential to catalysis. The third conserved residue is a valine residue (Val160 in TrCel12A) that is located on the inner concave β-sheet.
9.4. Enzyme Discovery and Engineering
GH12 enzymes are the enzymes most commonly used for enzymatic stonewashing of cloth and as additives in washing powders. These enzymes are produced and sold in bulk quantities by many enzyme producing companies. Due to the commercial interest in utilizing GH12 enzymes as biocatalysts in various biotechnological applications, such as the ones mentioned above, there has been a relatively strong market drive to initiate research programs aimed toward discovery of new, naturally occurring GH12s with suitable properties for these applications and processes.853,859,862,874−877,886 Specifically, many of the processes for which the GH12 enzymes are suitable biocatalysts are carried out at elevated reaction temperatures, 80−100 °C, and under very basic reaction conditions, pH 9 or higher. A direct outcome of these sometimes very harsh application conditions has been that two of the most important goals for GH12 research programs have been to either (1) identify new GH12 enzymes in nature with increased thermal stability with respect to catalytic activity and/or better performance properties at basic or nonphysiological pH conditions,853,859,874−877,880,886−889 or (2) find ways to change/improve these biophysical properties of a GH12 enzyme of interest.
Two of the first thermostable GH12 enzymes to be cloned and characterized in some detail were the two endo-cellulases from Thermotoga neapolitana, TnCelA and TnCelB. Bok et al. presented the results from the study of these two enzymes in

Figure 67. Cartoon representation of TrCel12A showing conformational changes close to residue 35 upon mutating this residue from an alanine to a
valine. The point mutation stabilizes the variant enzyme (blue/white cartoon, PDB 1OA2) by 7.7 °C in comparison with the wild-type enzyme (wheat
cartoon, PDB 1H8V). (A) A close-up of the site of the A35V mutation and (B) overall enzyme structure show the position of residue 35 relative to its
surroundings. A red dashed circle in part B indicates the “cord” region of GH12 enzymes.
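
Section 9.4 quotes thermostability in two different forms: a half-life (40 h at 95 °C for PfEglA) and a residual activity after a fixed incubation (75% after 8 h at 90 °C for RmCel12A). The following is a minimal sketch of how the two forms can be interconverted, assuming simple first-order thermal inactivation. That kinetic model is an assumption made here for illustration and is not stated in the cited studies; the function names are hypothetical.

```python
import math


def residual_activity(t_hours: float, half_life_hours: float) -> float:
    """Fraction of initial activity left after t_hours.

    Assumes first-order (single-exponential) inactivation, which is an
    illustrative assumption and not a claim made by the cited studies.
    """
    return 0.5 ** (t_hours / half_life_hours)


def half_life_from_residual(t_hours: float, fraction_left: float) -> float:
    """Half-life implied by one residual-activity measurement (same assumption)."""
    return t_hours * math.log(0.5) / math.log(fraction_left)


# PfEglA: reported half-life of 40 h at 95 °C.
pf_after_8h = residual_activity(8.0, 40.0)

# RmCel12A: reported 75% residual activity after 8 h at 90 °C.
rm_half_life = half_life_from_residual(8.0, 0.75)

print(f"PfEglA fraction left after 8 h at 95 °C: {pf_after_8h:.3f}")
print(f"RmCel12A implied half-life at 90 °C: {rm_half_life:.1f} h")
```

Because the two enzymes were measured at different temperatures, the converted values are not a direct ranking of stability; they only place the two reported numbers on a common scale under the stated assumption.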

1998.864 The authors show that the E. coli expressed TnCelA and TnCelB enzymes have optimal activities at 95 °C at a pH of 6.0 and 106 °C at a pH of 6.0−6.6, respectively. No conclusions were drawn regarding what biophysical properties might render these two so much more thermostable by comparison to other previously studied GH12 enzymes.
One year later, Bauer et al. described the characterization of a hyper-thermostable GH12 EG from the thermophilic archaeon P. furiosus, PfEglA.867 The authors show that PfEglA has an optimal activity temperature of 100 °C at a pH of 6.0 and a half-life of 40 h at 95 °C. Differential scanning calorimetry was used to determine the denaturing temperature of 112 °C. The ultrahigh resolution (0.8 Å) structure of PfEglA was later described by Kim et al.874 Here, the authors use structure-based evidence to conclude that the hyper-thermostability of PfEglA can be explained in part by a calcium ion bound in a Ca²⁺-binding DxDxDG loop motif located in a loop connecting two β-strands at one end of the binding cleft of PfEglA. The authors also show that if the bound calcium ion is removed from the enzyme by EDTA treatment prior to heat treatment of the enzyme, a dramatic decrease in residual activity of PfEglA after heat treatment occurs.
The first crystal structure of R. marinus Cel12A (RmCel12A), a highly thermostable GH12, helped reveal ion pairs and charged surface residues as an additional mechanism of stabilization.875 The authors show that RmCel12A retains 75% of its initial catalytic activity after 8 h incubation at 90 °C. Structural insight aided the authors in describing the origins of thermostability in RmCel12A; a comparatively large number of ion pairs in combination with other features, such as stabilization of an active site loop and an increase in the number of polar amino acids on the surface of the protein, were reported as responsible for stabilization.
In addition to identifying new GH12 enzymes from nature with better overall performance at elevated process temperatures and nonphysiological pH values, extensive research programs have also been initiated with the specific goal to identify amino acids in the target enzymes that are important for the thermal stability of the molecule or pH optimum, with respect to both overall stability and catalytic activity at elevated temperatures, and thus identify genetic engineering tools for the optimization of these properties. Many of the protein engineering strategies for GH12s have been described in a 2005 review,890 and thus, we only briefly point to very successful efforts as examples. Examples of the outcome of such protein stability engineering campaigns of GH12 enzymes are presented in the studies by Sandgren et al.,876,880 Nakazawa et al.,888 and Cheng et al.887 In the first of these studies, Sandgren et al. demonstrate that single amino acid substitutions can be immensely valuable in improving process relevant properties of GH12.876 The authors performed point mutagenesis studies alongside structural comparisons of homologous GH12 enzymes that were less or more thermally stable to suggest the A35V substitution in TrCel12A. This point mutation successfully increased the Tm of the enzyme by 7.7 °C over wild-type (Figure 67). On the basis of structural evidence, the authors concluded that the thermal unfolding of TrCel12A most likely starts at the position of the identified thermal stabilizing A35V mutation (Figure 67B).
Some attempts to engineer the pH optimum of GH12 enzymes have also been carried out. Unfortunately, these have not been as successful as recent thermal stability engineering efforts. Thus far, the most efficient strategy for identifying GH12 enzymes active at a nonphysiological pH has been to screen for naturally occurring enzymes with altered pH optimum with respect to catalytic activity (e.g., in various extreme environments often found in soda lakes). This type of natural habitat screening for enzymes with altered pH optimum has been described by van Solingen et al.886 and Goedegebuur et al.853 Huang et al. also described the characterization of a GH12 enzyme from the thermoacidophilic archaeon Sulfolobus solfataricus;859 this organism secretes an enzyme having a remarkable pH optimum of 1.8.
9.5. Conclusions
GH12 enzymes are industrially relevant proteins given that some of these are important components in commercial detergent and conditioning formulations. Although GH12 cellulases are important components in industrial applications and reasonably well-characterized structurally, the true function for these enzymes in the fungal plant cell wall degradation

Table 14. Reported GH45 Crystal Structures

source and original name in primary citation    PDB code    resolution (Å)    brief highlights                            ref
Fungal Structures
Humicola insolens EGV (HiCel45A)                1ENG        1.6               first GH45 structure, apo, obsolete         903
                                                2ENG        1.5               replaces the original 1ENG apo structure    904
                                                3ENG        1.9               wild-type, cellobiose complex               904
                                                4ENG        1.9               D10N variant, cellohexaose complex          918
Humicola grisea Egl3 (HgCel45A)                 1HD5        1.66              apo                                         unpublished
Melanocarpus albomyces Cel45A (MaCel45A)        1L8F        1.8               apo                                         912
                                                1OA7        2.0               cellobiose complex                          914
                                                1OA9        2.0               apo                                         914
Mollusk Structure
Mytilus edulis Cel45A (MeCel45A)                1WC2        1.2               apo                                         917

machinery is not well-understood. Here, we generalize the findings discussed above for GH12s:
(1) The T. reesei GH12 enzyme TrCel12A is most likely the first GH12 enzyme to be discovered and characterized in any detail. TrCel12A was initially called EGIII, was described in several early studies as a low molecular weight (25 kDa) cellulase, and is one of the three cellulases from this fungus that do not exhibit a CBM.
(2) Phylogenetic analysis suggests that GH12s can roughly be divided into four phylogenetic subfamilies denoted GH12-1 to GH12-4. Subfamilies GH12-1 and GH12-2 contain fungal representatives. As of yet, specificity has not been strongly linked to these phylogenetic divisions.
(3) GH12 enzymes are among the smallest carbohydrate active enzymes, with dimensions of approximately 40 Å × 40 Å × 30 Å. The small size may aid in giving these enzymes access to small pores in the plant cell walls.
(4) The catalytic domains of GH12 enzymes exhibit a β-sandwich fold, consisting of 15 long β-strands that fold into two twisted, largely antiparallel β-sheets, which pack on top of one another.
(5) The long substrate-binding cleft of the enzyme is formed by the concave surface of the β-sandwich.
(6) The GH12 enzymes exhibit a wide range of cell wall polysaccharide specificities and catalyze the cleavage of β-1,4-glycosidic linkages with a net retention of the anomeric configuration via a double-displacement mechanism.
The specific function of GH12 enzymes in fungal and bacterial plant-degrading systems is not well-characterized or understood, and indeed, many questions remain as to the “true” function of GH12 enzymes. Thus, much remains to be accomplished toward a better and more complete understanding of these enzymes and their hydrolytic mechanisms. Nevertheless, these enzymes play an important role in both cell wall deconstruction and generation, suggesting further examination of enzymes from this interesting GH family is warranted in the context of industrial applications.
10. FAMILY 45 GLYCOSIDE HYDROLASES
Family 45 glycoside hydrolase (GH45) enzymes share several common features with GH12 enzymes; they are generally small by comparison with other GH families, have broad substrate specificity, and can degrade other cell wall polysaccharides in addition to cellulose. It is hypothesized that the size of the enzymes represents an evolutionary advantage, which allows them to penetrate into smaller pores and cavities and thus gain better access to the substrate. Within GH45, there exist some of the smallest known cellulases, for example, T. reesei Cel45A (TrCel45A; also known as T. reesei EG V) consisting of a CD of only ∼165 amino acid residues. Beyond their size, another particularly intriguing aspect of GH45s is that they are structurally and evolutionarily related to expansins, which are ubiquitous in plants, and swollenins, which are present in some fungal species, as discussed below.
GH45 (family K in the original classification of cellulases) was first defined in 1993 by Henrissat and Bairoch when they extended the GH classification with newly available sequences.900 At that time, GH45 consisted of only two members, the bacterial EG EGB from Pseudomonas fluorescens subsp. cellulosa (now Cellvibrio japonicus) and the fungal EGV from H. insolens (HiCel45A). The cloning of the bacterial EGB-encoding gene and characterization of the enzyme was published in 1990 by Gilbert et al.,901 where they recognized that the 482-residue enzyme exhibits modular organization. By expression of truncated variants, it was demonstrated that the catalytic activity resides in the C-terminal domain composed of ∼250 residues and that binding to cellulose is nearly abolished when the N-terminal half is removed. It was found that the enzyme actually contains two CBMs, an N-terminal ∼100-residue family 2 CBM followed by a ∼30-residue family 10 CBM. The two CBMs are separated from each other and from the GH45 catalytic module by typical serine-rich linkers of ∼50 residues each. Activity was highest on lichenan and barley β-glucan, successively lower on CMC and PASC, and very low on crystalline cellulose (Avicel and filter paper). No activity was detected on xylan, laminarin, cellobiose, or artificial cellobioside substrates (4-methylumbelliferyl-β-D-cellobioside, pNPC).901 This activity profile is typical for GH45 cellulases.
The other enzyme, HiCel45A, consists of 284 residues and is also modular, but with the GH45 module (∼210 residues) at the N-terminus followed by a linker and a C-terminal family 1 CBM. The preparation and sequence of the enzyme and its properties for treatment of cotton textiles and other cellulose fibers were described in a 1991 patent from Novo Nordisk A/S (now Novozymes), Denmark.902 Novozymes has successfully exploited the enzyme for textile applications such as depilling, biopolishing, and denim stonewashing, and the product Carezyme is widely used in laundry washing powders. They are often called “color-restoration” washing powders to allude to the color-brightening appearance resulting from the removal of loose fibrils from the surface of cotton fibers.
HiCel45A is one of the most extensively studied members of this cellulase family and has served as a reference protein for GH45. The first three-dimensional structure and identification

Figure 68. GH45 phylogeny (adapted from Sakamoto and Toyohara).909 Sequences from the Plantae, Mollusk, Fungi, Insect, Protozoa, Nematode,
and Bacteria kingdoms are included and colored according to the legend. The GenBank identifier for each sequence is shown in parentheses. At right,
the vertical bars mark the proposed subfamily classification of each sequence. The Plantae sequences are expansins rather than GH45s and thus are not
included in the subfamily classification scheme. The multiple sequence alignment from which the phylogenetic tree was generated was created using
Clustal Omega.910 Only the CDs of the sequences were included in the alignment. The phylogenetic tree was generated with UGENE using the
neighbor joining method and the Jones-Taylor-Thornton distance matrix model.911 The branch length scale (substitutions/site) is shown at bottom
left.
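
The caption above states that the tree was built with the neighbor joining method. As a minimal sketch of what that method does, the following pure-Python example joins taxa from a pairwise distance matrix. The five taxa and their distances are made-up toy values, not GH45 data; the Jones-Taylor-Thornton distance calculation and the UGENE implementation are not reproduced; the function names are hypothetical.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return (node, node, branch_length) edges of an unrooted NJ tree.

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    counter = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node: sum of distances to all other nodes.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimises Q(i, j) = (n - 2) d(i, j) - r(i) - r(j).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        u = f"node{counter}"  # label for the new internal node
        counter += 1
        # Branch lengths from the joined pair to the new node.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        edges.append((i, u, li))
        edges.append((j, u, dij - li))
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


def path_length(edges, start, goal):
    """Sum of branch lengths between two nodes of the tree."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    stack = [(start, None, 0.0)]
    while stack:
        node, parent, total = stack.pop()
        if node == goal:
            return total
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, total + w))
    raise ValueError("nodes are not connected")


# Toy, additive distance matrix for five hypothetical taxa.
names = ["a", "b", "c", "d", "e"]
raw = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
dist = {frozenset(pair): float(value) for pair, value in raw.items()}
tree = neighbor_joining(names, dist)
for a, b, length in tree:
    print(f"{a} - {b}: {length:g}")
```

For an additive matrix such as this toy one, the summed branch lengths between any two leaves reproduce the input distance, which gives a simple check on the result. Real sequence-derived distances are rarely exactly additive, so the tree is then an approximation.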

of catalytic residues903,904 as well as demonstration of an inverting reaction mechanism in GH45 were obtained from this enzyme.442 A summary of GH45 structures discussed is provided in Table 14.

Other examples of modularity within GH45 enzymes have been observed in, e.g., Pichia pastoris, Mucor circinelloides, and Piromyces equi. PpCel45A from P. pastoris GS115 (Komagataella pastoris GS115) carries no fewer than five different family 1 CBMs connected to the C-terminal GH45 CD.905 In M. circinelloides there are two isoenzymes with identical GH45 CDs and either one or two family 1 CBMs at the N-terminus, which are suggested to be produced from the same gene by alternative splicing.906 The first GH45 enzyme to be described in an anaerobe, the rumen fungus P. equi, has no CBM, but three fungal dockerins of ∼40 residues each N-terminally connected to the GH45 CD.829

Today there are over 300 GH45 entries in the CAZy database,151−153 almost exclusively from Eukaryota. Only a few bacterial species are present, such as the shipworm symbiont Teredinibacter turnerae.907 GH45 genes have been found in many fungi, within Ascomycota, Basidiomycota, Mucorales, and Neocallimastigales (P. equi). Otherwise, they seem scattered among rather diverse organisms, including primitive protists (Parabasilian symbionts in the hindgut of termites), nematodes (e.g., the pine wood nematode, Bursaphelenchus xylophilus), mollusks (e.g., the freshwater snail Ampullaria crossean and the edible blue mussel, Mytilus edulis), a narrow group of insects (Cucujiformia beetles, e.g., the mustard beetle, Phaedon cochleariae, and the rice weevil, Sitophilus oryzae), and one arthropod species (the Antarctic springtail, Cryptopygus antarcticus).

On the basis of phylogenetic analysis, a division of GH45 has been proposed into three subfamilies, A, B, and C.905,908 Two main evolutionary lineages can be recognized among GH45s.909 The known bacterial and most fungal enzymes, including HiCel45A, belong to subfamily A together with insect, protozoan, and nematode members (Figure 68). Subfamily B, found in mollusks and a few fungi (e.g., M. edulis Cel45A/MeCel45A and TrCel45A), and subfamily C, so far only represented by a few basidiomycete enzymes (e.g., P. chrysosporium Cel45A/PcCel45A), show higher sequence similarity with each other and with plant expansins and the related fungal swollenins than with subfamily A (Figure 69).
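Comparisons between GH45 subfamilies of this kind are usually quantified as percent sequence identity over a pairwise alignment. The sketch below shows one common way to compute it, assuming two already aligned sequences of equal length with "-" as the gap character. The function name `percent_identity` and the toy sequences are hypothetical, and the choice of denominator (columns without gaps) is only one of several conventions in use, so values computed this way need not match the figures quoted in this section.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration only.
toy_a = "ACDE-FGH"
toy_b = "ACDQWFG-"
print(round(percent_identity(toy_a, toy_b), 1))
```

Dividing by the full alignment length or by the shorter sequence length instead would give lower values for the same pair, which is worth checking when comparing identities reported by different tools.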
1407 DOI: 10.1021/cr500351c
Chem. Rev. 2015, 115, 1308−1448
Chemical Reviews Review

Figure 69. Structural-based sequence alignment of four GH45 enzymes: HiCel45A, MeCel45A, TrCel45A, and PcCel45A. HiCel45A belongs to
subfamily A, MeCel45A and TrCel45A belong to subfamily B, and PcCel45A belongs to subfamily C. For comparison, two expansins, a swollenin, a
loosenin, a GH45-like domain, and a lytic transglycosylase are shown. The expansins are Solanum lycopersicum (Tomato) α-expansin, SlEXPA, and Zea
mays (maize) pollen allergen/β-expansin, ZmEXPB1. The swollenin is T. reesei swollenin, TrSWO1. The loosenin is Bjerkandera adusta Loosenin,
BaLOOS. The GH45-like domain is Emericella nidulans/A. nidulans cellulase/allergen, EnCelA, and the lytic transglycosylase is E. coli lytic
transglycosylase, EcMltA. Strictly conserved residues are shown in red block, and chemically similar residues are in red text. The blue boxes indicate
chemical similarity across a grouping of residues. Given the diversity of tertiary structure across this family, a consensus sequence representative of
general structure is not available. Thus, the secondary structural elements of each sequence are marked in the following fashion: α-helices are marked
by magenta boxes across the sequences, β-sheets are marked by bolded letters and numerically identified above the sequence. The catalytic acid is
marked by a yellow star, and residues in a conserved motif near the catalytic acid are marked by magenta stars. The alignment was generated with
ESPript (http://espript.ibcp.fr).347
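The conserved motifs around the catalytic acid, written as ThrXxxTyr/Phe and HisXxxAsp in section 10.3, can be located in a raw protein sequence with a simple pattern scan. The following is a minimal sketch, assuming one-letter amino acid codes. The function name `find_motifs` and the toy sequence are hypothetical, and a match only flags a candidate position, since short patterns like these also occur by chance outside the β1 and β5 strands.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
MOTIFS = {
    "ThrXxxTyr/Phe": re.compile(r"(?=T.[YF])"),
    "HisXxxAsp": re.compile(r"(?=H.D)"),
}


def find_motifs(sequence: str) -> dict:
    """Return 1-based start positions of each motif in a one-letter sequence."""
    return {
        name: [match.start() + 1 for match in pattern.finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }


# Toy sequence, invented for illustration only.
toy_sequence = "AATGYCCHSDAA"
print(find_motifs(toy_sequence))
```

Candidate hits would still need to be checked against the structure-based alignment to confirm that they fall in the expected strands.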


Figure 70. Top view (left panel) and end-on view (right panel) of the structure of H. insolens Cel45A D10N mutant crystallized with cellohexaose
(4ENG; gray cartoon). Catalytic residues are indicated (red stick), as well as the “lid-loop” (111−118; blue cartoon and stick), other residues of
interest (blue stick), and the two cellotriose units seen in subsites −4 to −2 and +1 to +3, respectively (cyan stick). The 111−118 loop and Asp114
residue (magenta cartoon) of HiCel45A wild-type in complex with cellobiose (3ENG; pink stick) and the corresponding loop in the M. albomyces
Cel45A structures 1L8F (pale green cartoon) and 1OA7 (pale yellow cartoon) are also shown.

10.1. Structural Studies
To date, there are eight GH45 structures of four different enzymes available in the Protein Data Bank (Table 14). Three of these enzymes belong to subfamily A: HiCel45A (2ENG, 3ENG, 4ENG),903,904 H. grisea var. thermoidea (HgCel45A; 1HD5), and M. albomyces (MaCel45A; 1L8F;912 1OA7, 1OA9913,914). The structures of the Humicola enzymes were obtained without the C-terminal linker-CBM,904,915 whereas with M. albomyces, the native enzyme could be used since it only consists of a GH45 module. The CDs of these three enzymes are very similar. The two Humicola CD sequences differ by only three amino acids and exhibit ∼75% sequence identity with M. albomyces.912 The fourth enzyme belongs to subfamily B, MeCel45A (1WC2). The isolation, characterization, and sequencing of MeCel45A have been published,916 but the crystal structure is so far only described in a Doctoral thesis at Uppsala University.917

10.1.1. Subfamily A. The first GH45 structure, HiCel45A in the apo state (PDB code: 1ENG/2ENG), was published in 1993 by Davies et al., revealing a six-stranded, double-psi β-barrel domain, a fold shared by multiple protein superfamilies.903,919 Loops connecting the β-strands protrude from opposite sides of the barrel to form a substrate-binding groove along one face of the β-barrel. The two catalytic residues Asp121 and Asp10, identified by site-directed mutagenesis, define the catalytic center near the middle of the groove, which spans seven subsites from −4 to +3 (Figure 70). Subsequently, Davies et al. published two ligand-bound complexes, the wild-type enzyme with cellobiose and the catalytically deficient mutant, D10N, crystallized with cellohexaose (PDB code 3ENG and 4ENG, respectively).904 The cellobiose ligand binds in the +1 and +2 subsites, and in the D10N mutant, two cellotriosyl moieties are seen, one in subsites +1 to +3 and one in −2 to −4 with subsite −1 unoccupied (Figure 70). In both structures, the free C4 hydroxyl of the glucosyl unit bound in the +1 subsite, corresponding to a glycosidic oxygen in a cellulose substrate, is hydrogen bonded to Asp121. This latter observation supports the role of Asp121 as the catalytic acid. The authors note the surprising lack of carbohydrate−aromatic stacking interactions between the substrate and the enzyme, with the exception of Trp18 in the −4 subsite.

The HiCel45A structure appears to exhibit significant conformational changes upon substrate binding that may be necessary for induction of catalysis. In the apo structure (2ENG), the loop formed by residues 111−118 did not exhibit sufficient electron density to make it possible to build this part of the structure. However, in the cellobiose (3ENG) and the cellohexaose (4ENG) complexes, the loop is ordered, but with significantly different conformations found in the two ligand complex structures. The cellohexaose complex indicates that substrate binding induces “lid closure”, which brings a third aspartate, Asp114, near to the −1 subsite and with the carboxylate group pointing toward the catalytic center. Asp114 also interacts with OH6 of the +1 sugar, as well as with an opposing loop across the active site via van der Waals contacts (with Ile131), and two water-mediated hydrogen bonds (to Ile131 N and Tyr147 OH) effectively enclose the active site in a short tunnel at the point of cleavage. Davies et al. also demonstrated that the activity is essentially abolished in the D10N and D121N mutants, whereas the kcat is reduced by a factor of 40 in the D114N mutant.904

Interestingly, the distance between the glucose residues in the −2 and +1 subsites is about 1 Å longer than expected for either a relaxed or a distorted glucose residue (such as found in lysozyme191) in the −1 subsite. The authors speculate that this might be because the enzyme has evolved to optimally bind to the transition state and that elongation occurs upon protonation during the inverting hydrolysis mechanism (Figure 71). Davies et al. subsequently published a full-length article on further structural refinement of the native HiCel45A enzyme.918

The structure of HgCel45A was deposited in the PDB in 2000 (1HD5; McAuley, K.E., Wilson, K.S., and Schülein, M.), but a

publication has not been reported on this structure. Given that only three amino acids differ from the HiCel45A enzyme, it is not surprising that the overall structure is quite similar.

Two structures of MaCel45A were independently published by two groups in 2003.912,914 Valjakka and Rouvinen reported the structure of the native apo enzyme (1L8F).912 Hirvonen and Papageorgiou reported both the apo enzyme and a cellobiose complex (1OA9, 1OA7, respectively).914 As expected from the high sequence similarity with HiCel45A, the structures are quite similar, with less than 1 Å RMSD between the structures in all cases. In the cellobiose complex structure, the ligand is bound in the −2 and −3 subsites instead of the +1/+2 subsites found in HiCel45A.904,914 The bound glucose residues are slightly shifted relative to the corresponding glucan units in the HiCel45A cellohexaose complex structure.

Figure 71. Davies et al. illustrated their hypothesis for the catalytic mechanism of HiCel45A in their 1995 manuscript.904 The conventional inverting catalytic mechanism is proposed to include elongation of the glycosidic bond in the transition state (middle panel). This supposition was supported by structural evidence indicating the glucose rings in the −2 and +1 binding subsites were 1 Å further apart than known sugar geometries of the time. Reprinted with permission from ref 904. Copyright 1995 American Chemical Society.

10.1.2. Subfamily B. To date, there is only one structure reported in subfamily B of GH45, that of MeCel45A from the North Sea blue mussel M. edulis; this enzyme consists of a GH45 catalytic module with 181 amino acids and no other subdomains.917 There are known fungal representatives of this GH45 subfamily,151−153 but no structures have been reported thus far. The M. edulis enzyme was discovered after homogenization of whole mussels, brief boiling, removal of the solids, and analysis of the supernatant for cellulase activity.916 Subsequently, the enzyme was localized to the crystalline style of the digestive tract. The crystalline style is a rotating, gelatinous rod in the stomach of herbivorous mollusks, which acts as a mechanical device to grind the food in a mortar-and-pestle fashion.920 The structure of the native enzyme isolated from crystalline styles was solved at 1.2 Å resolution (PDB code: 1WC2).

Figure 72. Difference in architecture between subfamilies A and B of GH45. (A) Structure 4ENG of H. insolens Cel45A in subfamily A. (B) Structure 1WC2 of MeCel45A in subfamily B. (C) Superposition of the two structures reveals high similarity at the β-barrel core, while loop regions differ in length and structure. The proposed catalytic acid residues (red stick) are shown, as well as the −4 tryptophans, conserved residues around the catalytic acid, and Asp114 of HiCel45A and the corresponding Asn109 of MeCel45A. The two cellotriose ligands in the 4ENG structure are included in panel C (cyan stick). The orientation and scale are identical in all three panels.

The MeCel45A enzyme shows only 13% sequence identity with HiCel45A from subfamily A, reflecting a comparatively large phylogenetic distance between subfamilies A and B. Nevertheless, the β-barrels superimpose well with an RMSD of 1.6 Å over 105 matched Cα positions, and the structures are quite similar around the catalytic center. The catalytic base (Asp24) and the catalytic acid (Asp132, or Asp10 and Asp121 in HiCel45A) and residues surrounding the catalytic acid are preserved in nearly identical positions (Thr20, Tyr22, Ala50, Ala51, His130 in MeCel45A corresponding to Thr6, Tyr8, Ala73, Ala74, His119 in HiCel45A) (Figure 72). Furthermore, Asn109 in MeCel45A occupies a similar position as Asp114 in the flexible loop of HiCel45A upon “lid closure”, where it may also serve for binding to OH6 of the +1 glucoside. The corresponding loop in MeCel45A, which carries Asn109,


exhibits more protein−protein contacts and is not likely to show similar flexibility as seen in HiCel45A. In both enzymes, a tryptophan platform is exposed at subsite −4, although the residues come from different regions of the sequence (Trp64 in MeCel45A, Trp18 in HiCel45A) and their side chain indole planes are tilted by ∼80° relative to each other.

Apart from the β-barrel core and the active center region, the structures of MeCel45A and HiCel45A are markedly different in architecture due to differences in length of regions that interconnect the β-strands of the barrel. Such regions are extended largely along either one or the other side of the β-barrel in the two enzymes. In MeCel45A, three elongated loops along one side extend the surface of the barrel and make up one wall of the active site (Figure 72B), but all the connections on the other side are rather short and consequently the cleft is more open and shallower than in HiCel45A. The substrate binding cleft of the latter enzyme is primarily formed by loops on the opposite side compared to MeCel45A. However, HiCel45A has one extended loop on the other side as well, the flexible “lid” mentioned above, and the binding groove is more enclosed.

The two distinct patterns of longer and shorter regions between the strands of the β-barrel also show up in the alignment of available GH45 sequences, consistent with the difference in architecture between subfamily A on one hand and B and C on the other (Figure 69). In addition to showing high sequence similarity to other GH45s from mollusks, M. edulis Cel45A is quite similar to GH45s from fungal species such as Trichoderma strains. For example, TrCel45A exhibits 33% sequence identity with MeCel45A, and the alignment points to a similar overall shape with an open and shallow active site groove.

10.1.3. Subfamily C. Upon publication of the expression and characterization of a GH45 enzyme from the white-rot basidiomycete fungus P. chrysosporium, Igarashi et al. suggested that the protein should be classified into a new subfamily C in GH45.908 The PcCel45A enzyme consists of a 180-residue GH45 module without linker-CBM. No structure for GH45 subfamily C cellulases has been published to date. However, given their sequence similarity to subfamily B, it is likely that they will exhibit structures that are more similar to those from subfamily B than A. The sequence identity is 21% between PcCel45A and MeCel45A, and from the length of loop regions, it can be predicted that PcCel45A should have a shallow groove similar to that of MeCel45A (Figure 69).

Quite interestingly, while the proposed catalytic acid and its surroundings seem to be well-preserved, PcCel45A is lacking a counterpart to the aspartic acid assigned as catalytic base in subfamily A and B, suggesting that the enzyme may utilize a different mechanism.908 However, in common with other inverting GH families, the assignment of the catalytic base is less certain than of the catalytic acid within GH45. The proposed catalytic base is not strictly conserved throughout subfamily A and B, and so far it has been experimentally shown to be essential for activity in only one enzyme, HiCel45A of subfamily A.904 Further studies are clearly needed to fully elucidate the catalytic mechanism within GH45.

10.2. Catalytic Function
Most of the over 300 entries of GH45 in the CAZy database are known only from their sequence. Among fungi, enzyme characterization has been reported for about 25 members, nearly all belonging to GH45 subfamily A. Table 15 provides a list of characterized enzymes, pH and temperature optima, and substrate specificity where determined. In most cases, the studies are limited to assays with CMC as substrate, e.g., for demonstration of EG activity and measurements of specific activity, pH dependence, temperature profile, and/or thermal stability. Fewer enzymes have been analyzed in further detail in terms of substrate specificity, product profile, enzyme kinetics, etc. Although available data are limited, some common properties can be distinguished. Generally, activities are comparable or lower on amorphous cellulose (PASC) than on CMC and much lower or negligible on crystalline cellulose (e.g., Avicel and filter paper). When tested, the enzymes were also active on mixed β-1,3/1,4-glucans (e.g., barley β-glucan or lichenan) and glucomannan, but they were essentially inactive on xylan, xyloglucan, laminarin, or galactomannan, and release of aglycon was negligible from chromogenic or fluorogenic substrates such as pNPC and 4-methylumbelliferyl-β-D-cellobioside.332 Cleavage of glycosidic linkages other than β-1,4 has not been demonstrated.

In an extensive study by Vlasenko et al. discussed previously, the substrate specificities of 10 ascomycete and 1 basidiomycete GH45 subfamily A enzymes were compared with EGs of GH families 5, 6, 7, 9, and 12.332 In general, the GH45s showed comparable activity to EGs of GH5, 6, and 7 on polymeric cellulose substrates (including PASC, Avicel, pretreated corn stover, and CMC). The most active among the tested GH45s were the enzymes from T. terrestris, H. grisea, Chaetomium brasiliensis, and Sordaria fimicola. Unlike other EGs, the GH45s were essentially inactive on the hemicellulose substrates tested in this study (xyloglucan, xylan, arabinoxylan, mannan, and galactomannan).

In view of the structural differences discussed above, one might expect significant differences in enzymatic properties between the subfamilies of GH45 to follow. Apart from subfamily A, biochemical data are scarce and have only been reported for two members of subfamily B (TrCel45A334 and Penicillium decumbens Cel45A921), and one in subfamily C (PcCel45A908). Generalizing the findings from these three enzymes, subfamily B and C members appear to exhibit lower cellulolytic activity relative to subfamily A. Karlsson et al. reported that TrCel45A exhibits 2−9-fold lower activity on Avicel, PASC, and CMC relative to TrCel5A, TrCel7B, and TrCel12A; the enzyme produces cellotetraose as major hydrolysis product in contrast to subfamily A, where cellobiose dominates.334 Furthermore, TrCel45A showed higher activity on konjac glucomannan than the other EGs. Only glucose reducing ends were formed, demonstrating that the enzyme does not act as a mannanase on glucomannan. In the case of P. decumbens Cel45A the activity was highest on glucomannan followed by PASC, CMC, and β-glucan, suggesting that the enzyme may actually be regarded as primarily a glucomannanase rather than a cellulase.921 Activity on glucomannan was also observed for PcCel45A of subfamily C, but the specific activity was about 10 times higher on β-1,3/1,4-glucans (lichenan and barley β-glucan) followed by CMC and PASC.908 No activity on MCC or xylan was detected after 120 h of incubation.

10.3. Similarities of GH45s to Expansins and Swollenins
The general fold of the GH45 CD is shared by cell wall acting proteins that do not exhibit hydrolytic activity but may instead offer some disruptive effect (e.g., plant expansins and fungal swollenins).930,931 The structural comparison of an expansin with GH45s (Figure 73) illustrates that their β-barrels overlap closely. The structures are also quite similar at the GH45
Table 15. Biochemical Characterization of Known GH45 Cellulases^a
| organism | expression host | enzyme | pH opt | temp opt (°C) | substrate for opt | substrate specificity | ref | comments |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chrysosporium lucknowense C1 | | Cel45A | 5 | 70 | CMC | CMC, filter paper, β-glucan, Avicel, xylan | Emalfrab et al., 2003715 | Ascomycota |
| Crinipellis scabella | | Cel45A | | | | PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan | Shulein et al., 2005,922 Vlasenko et al., 2010332 | Basidiomycota; retains 25% activity at 70 °C |
| Humicola grisea | | Cel45A | | | | PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan | Vlasenko et al., 2010332 | Ascomycota; retains 40% activity at 80 °C |
| Humicola grisea var. thermoidea | Aspergillus oryzae | Egl3 | 5 | 60 | CMC | CMC, Avicel | Takashima et al., 2007915 | Ascomycota; retained >75% activity at 80 °C for 10 min; stable at pH 3−12 at 4 °C for 20 h |
| Humicola grisea var. thermoidea | Aspergillus oryzae | Egl4 | 6 | 75 | CMC | CMC, Avicel | Takashima et al., 2007915 | Ascomycota; retained >75% activity at 80 °C for 10 min; stable at pH 3−12 at 4 °C for 20 h |
| Humicola insolens DSM 1800 | | Cel45A | | | | CMC, PASC, Avicel, BC, pretreated corn stover, xyloglucan, xylan, arabinoxylan, mannan, galactomannan, azurine cross-linked hydroxyethylcellulose | Dalboege et al., 2007,923 Vlasenko et al., 2010332 | Ascomycota; retains 83% activity at 80 °C |
| Macrophomina phaseolina | | Cel45A | | | | CMC, PASC, Avicel, BC, pretreated corn stover, xyloglucan, xylan, arabinoxylan, mannan, galactomannan | Shulein et al, 2005,922 Vlasenko et al., 2010332 | Ascomycota; retains 22% activity at 70 °C |
| Melanocarpus albomyces | | Cel45A | 6−7 | 70 | hydroxyethylcellulose | CMC; hydroxyethylcellulose | Miettinen-Oinonen et al., 2004,502 Szijártó et al., 2008503 | Ascomycota; retained near 50% activity at pH 10; retained >60% activity at 70 °C |
| Mucor circinelloides FERM P-17446 | Escherichia coli | MCE1 | 7 | 45−50 | CMC | CMC, Avicel | Baba et al., 2005906 | Zygomycota |
| Mucor circinelloides FERM P-17446 | Escherichia coli | MCE2 | 7 | 45−50 | CMC | CMC, Avicel | Baba et al., 2005906 | Zygomycota |
| Penicillium decumbens 114-2 | Pichia pastoris | Cel45A | 5.0−3.5 | 60 60 | konjac glucomannan and CMC | konjac glucomannan, PASC, CMC-Na, barley β-glucan, xylan, Avicel | Liu et al., 2010921 | Ascomycota; subfamily B; 90% relative activity retained at 70 °C and 30% at 80 °C |
| Phanerochaete chrysosporium K-3 | Pichia pastoris | Cel45A | | | | PASC, CMC, lichenan, barley β-glucan, glucomannan | Igarashi et al, 2008908 | Basidiomycota; subfamily C |
| Phialophora sp. G5 | Pichia pastoris | EgGH45 | 6.0−8.0 | 60 | CMC-Na | CMC-Na, barley β-glucan, Avicel, filter paper | Zhao et al., 2012723 | Ascomycota |
| Phycomyces nitens FERM P-17447 | Escherichia coli | PCE1 | 6 | 50 | CMC | CMC | Shimonaka et al., 2004924 | Zygomycota |
| Piromyces equi | | Cel45A | 6.5 | 70 | CMC | CMC, PASC, barley β-glucan, lichenin | Eberhardt et al., 2000829 | max activity at 70 °C, though dropped to 10% after 20 min; stable for at least 1 h at 65 °C |
| Rhizopus oryzae FERM BP-6889 | | RCE I | 5−6 | 55 | CMC | CMC, Glc33, Glc4/Glc5/Glc6 | Murashima et al. 2002925 | Zygomycota |
| Rhizopus oryzae FERM BP-6889 | | RCE 2 | 5−6 | 55 | CMC | CMC, Glc33, Glc4/Glc5/Glc6 | Murashima et al. 2002925 | Zygomycota |
| Rhizopus oryzae FERM BP-6889 | Saccharomyces cerevisiae | RCE 3 | 7.7 | 50 | CMC | CMC, Avicel, Glc4/Glc5/Glc6 | Moriya et al. 2003926 | Zygomycota; >90% activity pH 5−7.7 |
| Staphylotrichum coccosporum IFO 31817 | | STCE1 | 6.0 | 60 | CMC | CMC | Koga et al., 2008927 | Ascomycota |
| Syncephalastrum racemosum BCC18080 | Pichia pastoris | CBH I | 5−6 | 70 | CMC | CMC, azurine cross-linked hydroxyethylcellulose | Wonganu et al., 2008928 | Zygomycota; retained >80% activity after 1 h at 80 °C, and >50% for 4 h at 70 °C |
| Thielavia terrestris | | Cel45A | | | | CMC, PASC, Avicel, BC, arabinoxylan, mannan | Shulein et al, 2005,922 Vlasenko et al., 2010332 | Ascomycota; retains 12% activity at 80 °C |


Table 15. continued
| organism | expression host | enzyme | pH opt | temp opt (°C) | substrate for opt | substrate specificity | ref | comments |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Trichoderma reesei | | Cel45A | 5 | 60 | β-glucan | CMC, PASC, Avicel, Glc3/Glc4/Glc5, barley β-glucan, glucomannan, filter paper | Karlsson et al., 2002334 | Ascomycota; retained 70% of activity at pH 5 and 40−70 °C; subfamily B |
| Ustilago maydis | | Egl1 | | | | CMC | Schauwecker et al, 1995929 | Basidiomycota |
| Volutella colletotrichoides | | Cel45A | | | | CMC, PASC, Avicel, BC, pretreated corn stover, xyloglucan, xylan, arabinoxylan, mannan, galactomannan | Shulein et al, 2005,922 Vlasenko et al., 2010332 | Ascomycota; retains 12% activity at 80 °C |
^a All GH45 enzymes belong to subfamily A unless otherwise noted in the comments.

catalytic center. There is no obvious counterpart to the catalytic base of GH45 subfamily A and B, but the catalytic acid and surrounding residues of GH45s are conserved in the ThrXxxTyr/Phe and HisXxxAsp motifs in the β1 and β5 strands (Thr6, Tyr8, His119, and Asp121 in HiCel45A; Figure 69), both in expansins and swollenins and also in other classes of proteins recently discovered in fungi as discussed below. The conservation of the catalytic acid and its surroundings suggests that the machinery for protonation of a glycosidic oxygen may play an important role in their mechanism of action.

Expansins were first discovered and characterized by Cosgrove and co-workers as proteins responsible for cell wall extension.931 In one of the original functional studies of expansins, McQueen-Mason and Cosgrove demonstrated that these proteins were able to mechanically weaken filter paper.931 Soon after, the same authors showed that the addition of hemicellulose greatly improved the binding of expansins to crystalline cellulose, suggesting that the proteins might target cellulose and hemicellulose simultaneously.932 Many studies have followed these original reports related to the potential mechanisms of expansins, including structure and function studies,933−938 hints regarding the molecular mechanism of expansin action in plant cell wall expansion,930,939−942 and their ability to synergize with GHs in cellulose or biomass hydrolysis. We note that observations of synergy with cellulases have been significantly mixed, with studies ranging from little to no synergy to those suggesting substrate disruption, the latter often in the absence of quantitative results or with only minor increases in synergy when substrate conversion is quantified.943−947 More recently, expansins have been shown to synergize more effectively with xylanases.948 Multiple families of expansins have also been characterized on the basis of available sequence data with the primary expansins studied to date termed α- and β-expansins.308,949−951 The nomenclature of expansins has been described by Kende et al.952 Expansins are vital to and ubiquitous in plants. Plant genomes typically contain on the order of 30 or more expansin genes, the majority of which are α-expansins. Expansin-like proteins are also found in prokaryotes and various nonplant eukaryotes, suggesting that they are likely important for alternate functions in nonplant organisms, such as in root colonization.953−955

The protein structures of three expansins are currently available: two group-1 pollen allergen plant β-expansins from Phleum pratense (1N10, Major Timothy grass pollen allergen Phl P1; Fedorov, Ball, Leistler, Valenta, Almo, unpublished) and Zea mays (ZmEXPB1; 2HCZ),938 and the bacterial EXLX1 from Bacillus subtilis (3D30).937 The structures and sequence comparisons reveal a canonical two-domain fold of expansins. An N-terminal double-psi β-barrel domain of around 110−120 residues, homologous to the GH45 catalytic module, is packed against a C-terminal Ig-like β-sandwich domain of ∼100 residues; the domains are aligned to form a ∼60 Å long putative polysaccharide-binding surface (Figure 73). Recent structures of the bacterial EXLX1 in complex with β-1,3/1,4-glucan and cellulose oligosaccharides show a new type of ligand-mediated dimerization, where the oligosaccharide is sandwiched between the Ig-like domains of two EXLX1 proteins in opposite polarity.935 Both Yennawar et al.938 and Kerff et al.937 provide detailed analyses and discussions of structure and sequence similarities and differences between expansins, GH45s, and other related proteins. Among GH45 structures, MeCel45A is most closely related to expansins,937 and GH45 subfamily B shows higher sequence similarity with expansins than subfamily


Figure 73. GH45 subfamily B is more similar to expansins than subfamily A. (A) Surface representation of the structure of HiCel45A (4ENG) in
subfamily A, with catalytic residues Asp10 and Asp121 in red and Trp18 at subsite −4 in green. (B) Surface of MeCel45A (1WC2) in subfamily B with
corresponding residues highlighted, Asp24, Asp132, and Trp64. (C) Surface of β-expansin EXPB1 from Zea mays (2HCZ), with residues Asp107 and
Trp194 highlighted. (D) Superposition of MeCel45A (1WC2; green) with Asp24, Asp132, and Trp64 shown, and expansin EXPB1 (2HCZ; purple)
with residues Asp107 and Trp194 shown. (E) Close-up of the active site of HiCel45A (blue stick), MeCel45A (green stick), and ZmEXPB1 (magenta
stick) showing the tryptophan residue at subsite −4 (Trp18/64/194), the proposed catalytic base (Asp10/24), and the catalytic acid (Asp121/132/
107) surrounded by conserved tyrosine, threonine, and histidine residues. The two cellotriose ligands in the 4ENG structure (white carbon atoms) are
included in panels D and E.

A. The catalytic center motifs ThrXxxTyr and HisXxxAsp are conserved, but some loops are shorter in the expansins, making the putative sugar-binding surface even more shallow and nearly flat, similar to LPMOs as will be discussed in section 11. This may indicate that expansins act in very close proximity to an insoluble substrate surface.

The name “swollenin” was coined by Saloheimo, Penttilä, and co-workers at VTT, Finland, upon discovery of a new protein, SWO1, from T. reesei that shows structure disruption effects on cotton fibers and filter paper.309 The expression of TrSWO1 was found to be highly upregulated under cellulase-inducing conditions, pointing to an important role in biomass degradation. No structure of any swollenin is known to date, but sequence similarities indicate that the C-terminal half of the protein is homologous to expansins with one GH45-like and one Ig-like domain. In addition, the 475-residue TrSWO1 protein contains an N-terminal family 1 CBM followed by a typical Ser/Thr-rich linker, and a region/domain of unknown structure and function. Several insertions between the β-strands of the GH45 domain, in particular, β1/β2 and β4/β5 (Figure 69), indicate that the putative sugar-binding surface may be more enclosed in swollenins, or that they may form an additional domain in analogy with EcMltA discussed below.

So far, swollenins have only been found in a limited number of ascomycete fungi (11 species in UniProt). In addition to TrSWO1,956 disruptive effects on cellulose and increased adsorption and enhanced activity of cellulases have also been demonstrated for swollenins from Neosartorya fumigata (formerly Aspergillus fumigatus),957 Trichoderma asperellum,958 T. pseudokoningii,959 and Penicillium oxalicum.946 Interestingly, Wang et al. could produce bioactive T. asperellum SWO recombinantly in E. coli by using a cellulose-assisted refolding and purification process.958 They also devised a method to quantitatively measure the disruption activity of the swollenin, by using a bacterial GH5 EG with negligible activity on untreated crystalline cellulose. More recently the effects of TrSWO1 on pretreated corn stover were investigated.960 The swollenin primarily disrupted the hemicellulose fraction and

released substantial amounts of oligomeric and monomeric sugar. With respect to cellulose hydrolysis, synergism was small with the CBH TrCel7A or the EG TrCel5A, whereas pronounced synergism in terms of xylose yield was observed with TrCel5A as well as the xylanases TrXyn10A or TrXyn11A, the latter releasing about 3 times more xylose when combined with TrSWO1.

A new type of GH45-related fungal protein, “loosenin” from the basidiomycete B. adusta (BaLOOS1), was discovered and characterized recently.961 BaLOOS1 binds strongly to both cellulose and chitin, affects the morphology of cotton fibers, and enhances the activity of cellulases, but did not exhibit detectable hydrolytic activity when acting alone. Fibers of Agave tequilana bagasse, known for high recalcitrance, were hydrolyzed 7.5 times faster by a cellulase/xylanase cocktail after BaLOOS1 pretreatment. The protein is only 109 amino acid residues in length and consists of a sole GH45-like domain, presumably similar in structure to that of expansins. The ThrPheTyr sequence is present in strand β1 and the catalytic acid appears to be conserved in strand β5, although the HisXxxAsp motif is replaced by AspLeuAsp (Figure 69). A protein-BLAST search returned over 200 hits from fungi with about 40% or higher sequence identities to BaLOOS1, suggesting that loosenin proteins are ubiquitous in fungi.

Yet another group of fungal proteins related to GH45 and expansins is represented by EnCelA from Emericella nidulans (or A. nidulans) in the sequence alignment in Figure 69. The CelA-encoding gene is constitutively expressed in all developmental cycles of the fungus, but the protein is exclusively present in the cell walls of conidiospores.962 We note that the authors use the name EglD, while the sequence shown in the article is identical to that of EnCelA (Q5AVE5_EMENI) and different from that of E. nidulans EglD (Q5BCX8; EGLD_EMENI) in the Uniprot database. Investigations of the influence of gene expression on morphogenesis, growth, and germination of conidia indicate that the protein may be involved in cell wall remodeling during spore germination. The protein was extracted from conidiospores and assayed in vitro for CMCase activity, but no degradation was detected.962 The GH45-like domain is appended by a region of ∼95 residues, similar in length to that in expansins, but it is uncertain whether it folds into a similar Ig-like domain. Furthermore, the GH45-domain is preceded by a region of ∼150 residues rich in Ser and Thr, which may serve as an O-glycosylated linker of a cell wall anchor. Many members lack this N-terminal region and may represent nonanchored homologues. However, whether they play any significant role in lignocellulose degradation appears to be unknown to date. Several of the sequences are annotated as “cellulase” but without reference to biochemical evidence.

Significant sequence similarity is also recognized between GH45s and a family of lytic transglycosylases, bacterial enzymes involved in cell wall peptidoglycan processing. EcMltA from E. coli shows 31% and 23% sequence identity with MeCel45A and HiCel45A, respectively. The crystal structure of EcMltA reveals a two-domain structure where the larger A domain overlaps closely with that of GH45s (2AE0).963 However, EcMltA has a ∼140 residue long insertion between β1 and β2 of the canonical double-psi β-barrel topology, which folds into an additional β-barrel B domain. A large glycan-binding groove is formed at the interface between the two domains. Again, the catalytic center is strikingly similar to that of GH45s. The catalytic acid-associated motifs (ThrGlyTyr and HisPheAsp in EcMltA) are highly conserved within lytic transglycosylase family 2, and mutation of Asp308 in EcMltA resulted in complete loss of activity.963 Interestingly, though, the mechanism of EcMltA differs from that of GH45s. The EcMltA enzyme cleaves β-1,4-glycosidic bonds in peptidoglycan with a nonhydrolytic mechanism where the 6-hydroxyl is joined to the anomeric carbon to form nonreducing 1,6-anhydro-muropeptides. These lytic transglycosylases thus offer an example of the utilization of the same constellation of amino acids in a nonhydrolytic mechanism, although analogous reaction products have not been demonstrated in GH45s, expansins, or swollenins.

Inspired by the alternate mechanism of EcMltA, the apparent lack of a catalytic base, and the notion of a surprisingly long distance over the −1 subsite in HiCel45A,904 we speculate that, at least in some cases, a mechanism might be employed that utilizes strain in the polysaccharide concomitantly with protonation of the glycosidic oxygen as a means for glycosidic bond cleavage. Perhaps this may be combined with a nucleophilic attack on the anomeric carbon by a hydroxyl group on another polysaccharide, resulting in lytic transglycosylation with inversion of the anomeric configuration and without formation of new reducing ends. Such a mechanism would make sense for expansins in particular, since the activity would be targeted to polysaccharides under tension and would allow for rearrangements in the polysaccharide network without compromising the integrity of the cell wall.

Finally, we conclude that conservation of the catalytic acid and its environment seems to be a recurring theme, suggesting that protonation and glycosidic bond cleavage are involved also in the GH45-related proteins. However, the molecular mechanism for GH45 subfamily C, expansins, swollenin, loosenin, and other related proteins remains to be elucidated.

10.4. Conclusions
GH45 cellulases are clearly industrially relevant proteins given their inclusion in commercial detergent and conditioning formulations. However, of all GH families of fungal cellulases, GH45s are by far the least well characterized. Accordingly, insight into both their structure and function is relatively limited. Here, we generalize the findings discussed above for GH45s:

(1) GH45 CDs exhibit a six-stranded, double-psi β-barrel fold, which is among the smallest carbohydrate active enzyme protein folds. The small size may aid in access to small pores in the substrate.
(2) Phylogenetic analysis suggests three different subfamily classifications (A, B, and C) may exist and are independent of kingdom. Currently, there is no structural representative from subfamily C.
(3) The enzymes exhibit a wide range of cell wall polysaccharide specificities and utilize an inverting hydrolytic mechanism for turnover.
(4) The inverting mechanism is not well-characterized in any GH45; however, in HiCel45A, the ligand bound structure gives rise to the hypothesis that elongation of the substrate in the transition state may be required for hydrolysis.918
(5) Loop regions connecting the central β-strands contribute to specificity and ligand binding. Conformational changes of these loops upon binding have been observed and may be a part of the subfamily A hydrolytic mechanism. Subfamily B has a more open active site as a result of loop deletions relative to subfamily A.
(6) GH45s are structurally similar to a host of other proteins including expansins, swollenins, loosenins, and lytic

Table 16. Reported LPMO Crystal Structures (a)

| source and original name in primary citation | PDB code | resolution (Å) | ion charge | brief highlights | ref |
|---|---|---|---|---|---|
| Fungal LPMO Structures | | | | | |
| Trichoderma reesei GH61B | 2VTC | 1.60 | Ni2+ | first GH61 structure reported | 337 |
| Thielavia terrestris GH61E | 3EII | 2.25 | Zn2+ | Zn-bound GH61 structure; noted similarities with aromatics on surface to family 1 CBMs | 967 |
| Thielavia terrestris GH61E | 3EJA | 1.90 | Mg2+ | Mg-bound GH61 structure | 967 |
| Thermoascus aurantiacus GH61 | 2YET | 1.50 | Cu2+ | demonstrated copper was the active metal and observed N-terminal histidine N-methylation | 51 |
| Thermoascus aurantiacus GH61 | 3ZUD | 1.25 | Cu2+ | Cu(NO3)2-soaked GH61 to confirm copper binding | 51 |
| Phanerochaete chrysosporium GH61D | 45BQ | 1.75 | Cu2+ | first basidiomycete GH61 structure reported | 993 |
| Neurospora crassa LPMO-2 | 4EIR | 1.10 | Cu2+ | first confirmed type 2 LPMO structure reported with activity for the C4 carbon | 995 |
| Neurospora crassa LPMO-3 | 4EIS | 1.37 | Cu2+ | first confirmed type 3 LPMO structure reported with activity for the C1 and C4 carbons | 995 |
| Aspergillus oryzae LPMO (AA11) | 4MAH | 1.55 | Zn2+ | first reported AA11 structure; Zn cation in the active site | 1024 |
| Aspergillus oryzae LPMO (AA11) | 4MAI | 1.40 | Cu2+ | first reported AA11 structure; Cu(I) cation in the active site | 1024 |
| Nonfungal LPMO Structures | | | | | |
| Serratia marcescens CBP21 | 2BEM | 1.55 | N/A | first family 33 CBM structure reported | 972 |
| Serratia marcescens CBP21 | 2BEN | 1.80 | N/A | Y54A mutant of CBP21 | 972 |
| Enterococcus faecalis V583 CBM33A | 4A02 | 0.95 | N/A | apo structure of a CBM33 enzyme | 980 |
| Serratia marcescens CBP21 | 2LHS | n/a (NMR) | N/A | NMR structure; demonstrated that CBP21 is a rigid macromolecule | 990 |
| Bacillus amyloliquefaciens CBM33 | 2YOW | 1.80 | N/A | apo structure of CBM33 enzyme | 991 |
| Bacillus amyloliquefaciens CBM33 | 2YOX | 1.90 | Cu1+ | Cu-bound structure with a Cu(I) ion; reduction was induced by X-rays | 991 |
| Bacillus amyloliquefaciens CBM33 | 2YOY | 1.70 | Cu1+ | Cu-bound structure with a Cu(I) ion; reduction was induced by ascorbate | 991 |
| Enterococcus faecalis CBM33A | 4ALC | 1.49 | Cu2+ | first reported CBM33/LPMO from family AA9 with a +2 copper oxidation state collected by a specialized diffraction method | 998 |
| Enterococcus faecalis CBM33A | 4ALT | 1.49 | Cu2+ | photoreduced CBM33/LPMO from family AA9 with a +1 copper oxidation state | 998 |
| Streptomyces coelicolor LPMO10B | 4OY6 | 1.29 | Cu2+ | one of two first cellulose active bacterial (AA10) LPMO structures solved | 991 |
| Streptomyces coelicolor LPMO10C | 4OY7 | 1.50 | Cu1+ | second of two first cellulose active bacterial (AA10) LPMO structures solved | 991 |

(a) Both fungal and non-fungal LPMOs are included.

transglycosylases, many of which have conserved major components of the GH45 active site.

Much remains to be accomplished toward understanding GH45 enzymes and their hydrolytic mechanisms. The current dearth of information cannot even definitively address the question of consistent mechanisms across subfamilies. Even the simplest of questions such as the identity of the catalytic base is likely to remain unaddressed for some time given the difficulty in attributing this role in inverting enzymes (see GH6 discussion). Nevertheless, these enzymes and the GH45-like counterparts play a role in both cell wall deconstruction and generation, suggesting further examination of this interesting family is warranted both for their potential in industrial applications and for understanding their role in natural biomass-degrading contexts.

11. LYTIC POLYSACCHARIDE MONOOXYGENASES
The structures of ligand-bound GHs described above marked a turning point in our collective understanding of plant cell wall degradation starting in the 1990s that has enabled more than two decades of intense mechanistic research and development. However, as discussed above, cellulose and similar polysaccharides pack tightly into dense crystalline lattices on which GHs must act. Given the inherent recalcitrance of cellulose, decrystallization of individual chains requires thermodynamic work.130−132,580,964 Over 60 years ago, Reese and co-workers speculated that nature likely employs additional mechanisms to disrupt the crystalline lattice of cellulose.291 To this end, a discovery was made in 2010 that verified this hypothesis and overturned the conventional cellulose depolymerization paradigm.48 Namely, the characterization of a new family of biomass-degrading enzymes, now referred to as LPMOs,42,50,965 revealed a completely new mechanistic approach for cleaving glycosidic bonds in cellulose (and chitin). It was shown that LPMOs can be quite synergistic with the traditional GH enzyme cocktails described above, even before the basis for their mechanisms was reported.966,967 These enzymes were previously characterized either as family 33 CBMs or family 61 glycoside hydrolase (GH61) cellulases from nonfungal and fungal origin, respectively. LPMOs are now understood to be ubiquitous in nature, to be highly upregulated during biomass degradation, and to exhibit significant diversity both within and between biomass-degrading organisms.60,965,968,969

Given that LPMOs are emerging as key components of cellulase cocktails, likely due to their ability to create chain breaks in crystalline regions of cellulose where EGs would not be able to productively bind to the substrate, here we briefly review the discoveries thus far in this new field. We note that in

Figure 74. S. marcescens CBP21 structure and initial mechanistic discoveries. (A) The overall structure of S. marcescens CBP21 with the active site
residues highlighted in stick format.972 PDB code: 2BEM, chain C. His28, Glu60, Ala112, His114, and Phe187 are shown in stick format. The
coordination distances from the ion in the structure to the histidine residues are highlighted. (B) The S. marcescens CBP21 active site.972

this section we describe research into both fungal and nonfungal LPMOs as the mechanistic developments to date are closely linked. Unlike previous sections, we primarily follow a chronological description of the scientific developments and discoveries, as this field of cellulase research is quite nascent. At the end of the section, we then summarize the current state of the knowledge and several interesting future directions. Despite significant progress, there is still much to learn about the mechanisms employed by these fascinating, newly discovered enzymes.

Lastly, the nomenclature related to these enzymes has evolved considerably and likely will continue to do so for some time.42,50,152 Although all enzymes described in this section are LPMOs, we generally refer to the enzymes as was done historically (e.g., we refer to them as “GH61” or “CBM33”, described below, when discussing studies before the term LPMO was coined). We note that the term LPMO is now reasonably well accepted and conveys the salient details of their currently reported function. Remarks on the likely nomenclature for these enzymes are provided at the end of this section. Additionally, descriptions of structures discussed in this section are provided in Table 16.

11.1. Initial Discoveries of Oxidative Function
Prior to 2005, family 33 CBMs from insect viruses had been shown to bind to chitin, but no mechanistic details were revealed.970,971 In 2005, Vaaje-Kolstad et al. reported the first family 33 CBM structure from the chitinolytic bacterium Serratia marcescens (Table 16).972 The protein was termed “chitin-binding protein 21” (CBP21), where the “21” denotes the fact that the protein is 21 kDa.966,972 The overall 3-D structure of CBP21 was characterized as a “budded” fibronectin type III fold (Figure 74). In the same study, mutations of residues on the putative chitin-binding surface were conducted, all of which were shown to reduce the binding affinity to the substrate.972 The CBP21 structure harbors a metal ion between the conserved N-terminal histidine (His28) and another conserved histidine (His114), but its function was not initially realized (Figure 74). Shortly after, the same group published a study wherein they systematically demonstrated that CBP21 dramatically boosts β-chitin conversion in the presence of any of the three native GH18 chitinases from S. marcescens (two chitobiohydrolases, one endo-chitinase) or in mixtures with all three GH18 chitinases present.966 One of the conserved histidine residues (His114) of CBP21 was mutated, which resulted in complete elimination of the synergistic benefits of action.966 A later report from Moser et al.973 additionally demonstrated that two CBM33s from T. fusca (denoted E7 and E8) were also able to synergize with S. marcescens chitinases in the depolymerization of β-chitin and to a lesser extent with T. fusca cellulases at low concentrations on filter paper assays.973 These reports marked the first in-depth biochemical and structural studies of CBM33 proteins and, especially in the case of S. marcescens CBP21, hinted at an unknown, powerful mechanism for disrupting crystalline polysaccharide substrates.966,972,974

In 2010, Harris and co-workers reported an in-depth study of GH61 enzymes from the fungi T. aurantiacus and T. terrestris.967 Before the study from Harris et al., GH61s were thought to exhibit weak EG activity, for example on CMC,487,975−977 and their considerable synergy with the T. reesei secretome had only been briefly reported.978 Harris and co-workers demonstrated that either GH61 enzyme added to a native T. reesei cellulase cocktail at low concentrations (4 mg GH61/g cellulose) significantly enhanced the T. reesei enzyme cocktail performance on pretreated corn stover (Figure 75).967 Indeed, the authors reported that overall protein loading could be reduced by a factor of 2 if a GH61 enzyme was added, a reduction with significant cost-saving implications for enzymatic hydrolysis of pretreated biomass.21−23,967 Intriguingly, the authors also showed that there was no synergy on various clean cellulose substrates (e.g., Avicel), which at the time led to the hypothesis that GH61s act on either hemicellulose or lignin. The T. terrestris GH61E enzyme structure was solved (Figure 76), which was the second representative structure from this family (the first structural member was T. reesei GH61B (TrGH61B) reported two years earlier337). In both the structural reports from Harris et al.967 and from Karkehabadi et al.,337 it was speculated that GH61 enzymes are not GHs due to the lack of conserved carboxylate pairs in sufficient proximity required for GH action.169,177 Karkehabadi et al. also noted a structural similarity to S. marcescens CBP21.337 The first two GH61 structures solved, similar to CBM33s,972 both exhibit a metal-binding site coordinated by two histidine residues on the flat face of the protein. Harris et al. both varied the metal and removed it altogether with EDTA, and demonstrated that the presence of a

divalent metal cation is essential for the observed performance enhancements. Mutation of both of the conserved, ligating histidine residues to alanine completely abolished activity as measured by synergistic performance with a T. reesei cocktail on pretreated corn stover.967 Lastly, Harris et al. noted a similarity in the orientation and spacing of three tyrosine residues on the flat face of T. terrestris GH61E to that of family 1 CBMs, perhaps suggesting the putative GH61 binding face to the cellulose surface (Figure 76).346,351,352,355,362,967 Taken together, these studies showed that GH61s and CBM33s share similar structural functionalities and that the metal coordination was important for synergism with GHs.337,966,967,972

Figure 75. Comparison of cellulase activity after 75 h of the specified T. reesei cellulases alone (light gray) or with T. aurantiacus GH61A included at 10% of the total protein loading of 2.5 mg/g of cellulose (dark gray). The T. reesei mixtures tested comprised ratios of 4.5:2.5:1.0 for Cel7A:Cel6A:Cel7B.967

Scheme 3. Overall Reaction for S. marcescens CBP21 Action on Chitin (a)
(a) Separate mass spectrometry experiments with ¹⁸O-labeled molecular oxygen and water demonstrated that S. marcescens CBP21 action involves a tandem oxidation−hydrolysis mechanism.48

The major breakthrough in understanding the mechanistic basis of these enzymes was reported in 2010 when Vaaje-Kolstad et al. observed that CBP21 can depolymerize β-chitin alone in the presence of a reducing agent and molecular oxygen, pointing toward an oxidative mechanism for CBP21.48 Specifically, oligomers were released from β-chitin that exhibited an aldonic acid functionality at the C1 carbon. Using mass spectrometry experiments with isotopically labeled oxygen atoms in either ¹⁸O₂ or with H₂¹⁸O, the authors demonstrated that the aldonic acid functionality incorporated one oxygen atom from O2 and one from H2O (Scheme 3). This led the authors to postulate a mechanism wherein both oxidation and hydrolysis occur as a result of CBP21 action. Another noteworthy observation from this work is that cyanide significantly inhibits CBP21 action. As cyanide is a well-known inhibitor of O2 binding to metal centers in enzymes and catalysts, this suggests that molecular oxygen binds directly to the histidine-coordinated metal center in the enzyme. Additionally, superoxide dismutase does not inhibit the reaction significantly, implying that the metal center in the oxygen-bound form of the enzyme is shielded from bulk solvent. Overall, these results suggest that CBP21 acts directly on the surface of β-chitin, to which it is known to bind with high affinity,966,972 and that molecular oxygen binds to the metal center primarily when the enzyme is actively engaged on the substrate for oxidation.48 It was noted that, given the similarity in structure and metal-binding site between CBM33s and GH61s, these enzymes may employ similar chemical mechanisms.48 This seminal study marked a turning point in biomass conversion research in that the conventional, nearly universal hydrolytic paradigm was overturned; a drastically different mechanism in the cleavage of glycosidic bonds in polysaccharides was revealed.48 It is now generally hypothesized that these oxidative enzymes produce chain breaks in crystalline regions of cellulose (and chitin),

Figure 76. T. terrestris GH61 enzyme and active site.967 PDB code: 3EII (A) An overall structural view is shown with active site residues highlighted in
yellow stick. The aromatic residues forming the putative cellulose binding face of the enzyme are shown in green stick. (B) The active site
representation illustrates the coordination distances of the zinc ion (gray sphere) to His1, His68, and two water molecules shown in red sphere.


Figure 77. T. aurantiacus GH61 structure. (A) Overall structure of the T. aurantiacus GH61 (LPMO) with the active site residues shown in stick
format.51 PDB code: 2YET, chain A. His1, His86, Gln173, Tyr175 shown in stick format. (B) Active site of the T. aurantiacus GH61 (LPMO)
enzyme.51 Note that the Nε2 atom of the N-terminal histidine exhibits a methyl group of unknown function. The two red spheres coordinating the
copper ion are water molecules. The coordination geometry of the copper ion in this enzyme is octahedral, indicating a +2 oxidation state.

which likely is synergistic with conventional GH enzymes due to the presence of additional attachment and detachment sites for processive GHs. EGs, on the other hand, are thought to act primarily in amorphous regions where the substrate is more accessible for complexation.557

From the initial discovery of oxidative activity in CBP21, many studies have been subsequently published to characterize various features of the LPMO catalytic mechanisms, to understand substrate specificity and regioselectivity, and to understand the range of reducing agents able to potentiate this oxidative activity. Soon after the initial characterization of CBP21 oxidative action, Forsberg et al. reported that a two-domain CBM33 enzyme from Streptomyces coelicolor A3(2) is active on cellulose.979 Similar to CBP21, the S. coelicolor CBM33 primarily produced oxidized oligomers that had an even number of glycans in the product. Additional chitinolytic CBM33 structures have been reported from the same group, which exhibit a product profile similar to that of S. marcescens CBP21.980

11.2. Mechanistic and Structural Studies
Subsequently, four studies were published in rapid succession in 2011, several of which demonstrated that GH61s employ copper as the active metal and demonstrated that multiple chemical or enzymatic reducing agents can promote oxidative activity.49−51,981 Langston et al. demonstrated that CDH is able to act as a reducing agent for GH61 activity (specifically for the T. aurantiacus GH61 enzyme and H. insolens CDH).49 It has long been known that CDHs are coregulated with cellulases, and their specific function beyond cellobiose oxidation has been the subject of intense study.982,983 CDH consists of a heme domain and a flavin adenine dinucleotide (FAD) domain. The FAD domain of CDH is responsible for a 2-electron oxidation of cellobiose, and the heme domain is believed to transfer the electrons from the FAD domain to another electron acceptor, including in this case to GH61 enzymes. Structures of the heme and FAD domains are known,984,985 and kinetic measurements of electron transfer rates have been reported.986,987 Langston et al. also showed that the natively expressed CDH and GH61 enzymes from T. terrestris are synergistic in cellulose depolymerization.

With the GH61 from T. aurantiacus expressed in A. oryzae, Quinlan et al. employed isothermal titration calorimetry, electron paramagnetic resonance (EPR) spectroscopy, crystallographic studies, and activity assays to demonstrate that copper is the active metal.51 They found that copper binds very strongly to the metal-binding site (the Kd was shown to be tighter than 1 nM),51 and they also observed a methylation at the Nε2 atom of the N-terminal histidine residue in the enzyme active site (Figure 77). This histidine N-methylation was also present in the previously reported T. terrestris and T. reesei GH61 structures,337,967 but was not described in the papers describing the structures. In unrelated systems, this post-translational modification has been shown to modify copper-binding systems,988 perhaps by modulating the binding affinity of copper via a pKa shift of the histidine residues989,990 or by modifying the overall electronic structure of the system such that the oxygen activation and reactivity is altered.989−991

Overall, the active site geometry in the T. aurantiacus GH61 structure appears to harbor a Cu(II) cation and displays classic Jahn−Teller distortion with elongated coordination bonds in the axial positions. In the equatorial plane, the copper ion is ligated in a bidentate fashion to the N-terminal histidine and monodentate to a second histidine residue and a polyethylene glycol hydroxyl group, the latter from the crystallization buffer. The authors coined the term “histidine brace” for this structural motif that binds copper.51 Via activity assays, they also demonstrate that copper is essential to the reactivity of the enzyme, and that reducing agents such as ascorbate and gallate can potentiate activity.51 Action at the C6-carbon was also postulated, but evidence of this type of activity was not reported and has not been definitively verified to date in the open literature. Although at least one other study has presented indirect evidence that C6-oxidation may occur, it may be that these observations are actually C4 oxidation, as discussed below.992

Shortly after, Phillips et al. reported the detailed characterization of an N. crassa GH61 enzyme and its interaction with CDH from the same organism.50 Specifically, they knocked out the CDH-expressing gene in N. crassa, which resulted in a significant reduction in the cellulolytic performance of the secreted enzyme cocktail in the mutant strain. Add-back studies with CDH from M. thermophila resulted in a full rescue of performance, suggesting that CDH is a primary driver of biomass depolymerization in N. crassa.50 The authors then systematically demonstrated that both metals and oxygen were important for

Scheme 4. Originally Proposed Catalytic Mechanism for LPMO Action50,312,a

a
The final hydroxylated product (top left) will undergo a rapid elimination reaction to form a C1-lactone and cleave the glycosidic bond. It is noted
that an oxidative attack at the C4 carbon follows the same overall mechanism in terms of the elementary steps and differs only in the position of the
oxidative attack and the final product (namely a 4-keto sugar instead of a C1-lactone).
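
The elementary steps that the accompanying text attributes to Scheme 4 can be written compactly as the schematic sequence below. This is a sketch under stated assumptions rather than the authors' own notation: R−H stands for the substrate C1−H or C4−H bond, formal charges and copper ligands are omitted, and the proton consumed in the final hydroxylation step is not stated in the text and is assumed here only so that the net reaction balances.

```latex
% Schematic sketch of the steps described for Scheme 4.
% Assumptions: R-H is the substrate C1-H or C4-H; formal charges on copper
% species are omitted; the H+ in the last step is assumed for mass balance.
\begin{align*}
\mathrm{Cu(II)} + e^- &\rightarrow \mathrm{Cu(I)} \\
\mathrm{Cu(I)} + \mathrm{O_2} &\rightarrow \mathrm{Cu(II){-}O{-}O^{\bullet}} \\
\mathrm{Cu(II){-}O{-}O^{\bullet}} + \mathrm{R{-}H} &\rightarrow \mathrm{Cu(II){-}O{-}OH} + \mathrm{R^{\bullet}} \\
\mathrm{Cu(II){-}O{-}OH} + e^- + \mathrm{H^+} &\rightarrow \mathrm{Cu(II){-}O^{\bullet}} + \mathrm{H_2O} \\
\mathrm{Cu(II){-}O^{\bullet}} + \mathrm{R^{\bullet}} + \mathrm{H^+} &\rightarrow \mathrm{Cu(II)} + \mathrm{R{-}OH} \\[4pt]
\text{net:}\quad \mathrm{R{-}H} + \mathrm{O_2} + 2e^- + 2\mathrm{H^+} &\rightarrow \mathrm{R{-}OH} + \mathrm{H_2O}
\end{align*}
```

Summing the five steps cancels every copper species and the substrate radical, leaving the two-electron, one-oxygen-atom insertion expected of a monooxygenase; the hydroxylated product R−OH is the species that then undergoes the elimination described in the scheme footnote.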

the CDH enhancement of the N. crassa secretome by loading with EDTA and running the hydrolysis reaction anaerobically, respectively. Given these observations, the authors then analyzed GH61 proteins in the N. crassa secretome with tryptic digestions, LC-MS/MS analysis, and inductively coupled plasma-atomic emission spectroscopy (ICP-AES), which demonstrated that the GH61 enzymes from N. crassa bind copper in a 1:1 ratio. On the basis of the detailed chemical analysis conducted in their study, Phillips et al. thus independently concluded that copper was the active metal in GH61 enzymes and that CDH potentiates its activity. By analyzing multiple GH61s, the authors also discovered the presence of 2 types of specificities: some N. crassa GH61s produced C1-oxidized species (aldonic acids), dubbed “type 1”, whereas some GH61s were suggested to form 4-keto sugars via oxidation at the C4 carbon (dubbed “type 2”). Phillips and colleagues proposed that GH61 and CBM33 enzymes should be termed “polysaccharide monooxygenases”, which was later modified with the term “lytic”,42 thus generally becoming referred to as LPMOs.965 This study was also the first to propose a chemical mechanism for GH61 action, illustrated in Scheme 4.

The mechanism proposed by Phillips et al. consists of the following steps: reduced CDH transfers an electron to the “resting” Cu(II)-bound LPMO to form Cu(I), which then binds molecular oxygen (Scheme 4). Oxygen then abstracts an electron from Cu(I) to form a Cu(II)−superoxo species (Cu(II)−O−O•), which can then abstract a hydrogen atom from either the C1 or C4 carbon of the substrate to form a Cu(II)−hydroperoxo species (Cu(II)−O−OH) and a radical on the substrate centered at the point of hydrogen atom abstraction. Another electron and proton are transferred from CDH to form water and a Cu(II)−oxyl species (Cu(II)−O•). The Cu(II)−oxyl species then hydroxylates the substrate, which undergoes a spontaneous and rapid elimination reaction to form a lactone or 4-keto sugar at the C1 or C4 carbon, respectively, resulting in glycosidic bond cleavage.50 The lactone form can then further, either spontaneously or via enzymatic catalysis, form an aldonic acid, which aligns with the mass spectrometry experiments reported by Vaaje-Kolstad et al. in the initial report of CBP21 employing an oxidative catalytic mechanism.48

Directly following the reports from Quinlan et al.51 and Phillips et al.,50 Westereng and co-workers reported that P. chrysosporium GH61D, expressed heterologously in Pichia pastoris, is also a copper-dependent enzyme that is primarily specific to glycosidic bond cleavage at the C1 carbon in cellulose.981 Likely due to the expression in P. pastoris, the N-terminal histidine did not exhibit N-methylation as observed earlier in the T. aurantiacus,51 T. terrestris,967 or T. reesei structures.337 However, even without the N-methylation, the P. chrysosporium GH61D was still active on cellulose.981 The same observation has also been made for two GH61s from Podospora anserina, also expressed in P. pastoris.992 Direct comparisons of an oxidative enzyme with and without this post-translational modification have not been reported to date in the open literature. A more recent follow-up study was also published wherein the crystal structure of P. chrysosporium GH61D produced from P. pastoris was reported.993 This LPMO structure was the first reported from a basidiomycete fungus, which seems to utilize LPMOs quite ubiquitously in wood degradation.994

These initial mechanistic and structural characterizations of LPMO enzymes in 2011 demonstrated unequivocally that copper is a highly active cofactor in LPMO action, that CBM33s and GH61s share some commonality in their metal-binding sites that suggests an overall similar mechanism, and that a reducing agent, either enzymatic or from small molecules, and molecular oxygen are needed for catalysis by LPMOs.48,50,51,979,981 In 2012, more reports began to generally refer to these enzymes as “LPMOs”, and several additional details regarding LPMO action were reported. Beeson et al. reported an additional characterization of N. crassa LPMO action wherein they more definitively assigned the type 2 LPMO products to be 4-keto sugars,312 as suggested earlier in Phillips et al.50 Soon after, the same group

published two structures of natively expressed N. crassa LPMOs (GH61s), one of which (a type 2 LPMO) was shown to be specific for oxidation at the C4 carbon, and the other of which is a type 3 LPMO that carried out oxidation both at the C1 and C4 carbons in cellulose.995 In both structures an N-methylation of the N-terminal histidine reported earlier was observed.51,995 Interestingly, the active site in both enzymes displays an oblong molecule coordinated end-on (η¹) to the copper cation. The Cu−O1 distances reported are 2.96, 2.92, and 3.44 Å (chains A and B in N. crassa LPMO-2 and chain A in N. crassa LPMO-3, respectively).995 For the species in N. crassa LPMO-2, assuming that the species is an O−O species, the O−O bond distance is 1.16 Å. This observation led the authors to speculate that this may be a dioxygen species, namely superoxo, weakly bound to the copper atom.995 However, this observation has been questioned given the distance from the copper ion to the putative O1 oxygen. In Cu(II)−O−O• superoxide species, the Cu−O bond distance should be around 1.33 Å.996 Nonetheless, another interesting observation was made in the N. crassa LPMO-3 structure; the molecule crystallizes as a face-to-face dimer, and a tyrosine residue near the active site of its neighboring LPMO was hydroxylated to 3,4-dihydroxyphenylalanine, perhaps suggesting that the X-rays reduced copper and activated the enzyme, resulting in hydroxylation of the nearby tyrosine residue in the neighboring molecule.995

The study from Li et al. also presented sequence alignments grouped by LPMO type to ascertain which residues are important for regioselectivity.995 On the basis of these alignments, the authors described potential enzyme−substrate interactions that give rise to the different LPMO types via docking the LPMOs onto the cellulose surface. They noted several differences between type 1, 2, and 3 LPMOs including differences in the number and placement of aromatic residues and putative N-glycans near the binding surface, informed by previous studies on family 1 CBM−cellulose interactions.346,352,371 More recently, the same group has developed further insights into the regioselectivity of LPMOs. Vu et al. combined phylogenetic analysis over a much larger set of LPMOs with experimental studies of additional N. crassa LPMOs to identify structural motifs that are important for regioselectivity.997 Vu et al. demonstrated that type 3 LPMOs exhibit an extra loop near the N-terminus, consisting of approximately 12 amino acid residues. Removal of this loop in an N. crassa LPMO resulted in the production of primarily aldonic acids (i.e., oxidation at the C1 carbon or conversion to a type 1 LPMO), suggesting that this loop is important for C4 specificity.997 Multiple additional conserved residues were found from their phylogenetic analysis, all of which now highlight the need for a large mutation campaign to further understand the structural aspects of LPMO regioselectivity. Interestingly, the comprehensive sequence analysis of fungal LPMOs from Vu et al. highlights two subfamilies of LPMOs that they mention have no structural or functional characterization to date (Figure 4 in ref 997), which is perhaps an area ripe for discovery of new LPMO activities or substrate specificities.997

In many of the earlier LPMO studies, it was assumed that the flat face of LPMOs is the primary binding face to crystalline substrates. To verify this hypothesis and to gain further insights into the copper binding in bacterial LPMOs (CBM33s), Aachmann and co-workers conducted a detailed NMR and isothermal titration calorimetry study on S. marcescens CBP21.990 Interestingly, they applied an elegant NMR method to demonstrate that the flat face harboring the copper-binding site is indeed the binding face to crystalline chitin. Additionally, it is likely that LPMOs first require in situ reduction of Cu(II) in the active site to “activate” the enzyme for oxygen binding, as also proposed by Phillips et al.50 To understand the binding affinity differences between monovalent and divalent copper, Aachmann et al. used competition assays to demonstrate that Cu(I) binds more strongly to CBP21 than Cu(II) (1.2 nM compared to 55 nM, respectively). Similar to their previous report,48 the authors also show that cyanide directly inhibits CBP21, but with NMR, they demonstrate that cyanide interacts with the copper ion, thus more definitively showing that the metal takes part directly in the reaction.990

To further elucidate the nature of LPMOs with Cu(I) bound to the active site, Hemsworth et al. recently reported a structure photoreduced with X-rays.991 Similar to the study from Aachmann et al.,990 they demonstrate that Cu(II) binds with very high affinity to a CBM33 from Bacillus amyloliquefaciens, ranging between 6 nM at pH 5 and 80 nM at pH 7.991 Interestingly, they also measure the melting temperature of the enzyme with and without copper bound, which shows an astounding 20 °C increase in the presence of copper. An X-ray photoreduced structure exhibits a T-shaped configuration, indicative of a Cu(I)-bound state. A primary outcome from the work of Hemsworth et al. that should be noted in structural studies of LPMOs is that X-rays can reduce copper, and thus proper care should be taken when analyzing the active site geometries to note the coordination of the copper atom, as proper analysis of the coordination geometry can immediately suggest the oxidation state of the copper ion.991 Lastly, the authors give significant mechanistic weight to the differences in nonfungal LPMOs (CBM33s) and fungal GH61s in this study, as discussed in more detail below.991

As a further illustration of the sensitivity of LPMO active sites to X-ray radiation, Gudmundsson et al. also recently reported a series of bacterial LPMO structures from Enterococcus faecalis CBM33A (EfaCBM33A) along the reduction pathway from Cu²⁺ to Cu⁺.998 The original EfaCBM33A structure was solved previously in an apo form.980 A specialized photoreduction method using helical X-ray diffraction was employed to collect structures along an increasingly intense dosage profile. These structures definitively revealed the change in LPMO active site architecture in the two oxidation states, namely with the Cu²⁺ state exhibiting a trigonal bipyramidal geometry and the Cu⁺ state in a T-shaped configuration. Comparison of these structures to similar small-molecule crystal structures bound to copper unequivocally demonstrated that the oxidation states of copper in the low and high dosage structures are Cu²⁺ and Cu⁺, respectively. Moreover, the method employed in this study provides a straightforward means to solve LPMO crystal structures in both copper oxidation states.

Until the end of 2013, all LPMOs reported in the literature were only active on the insoluble substrates cellulose and chitin. Although likely an important feature of LPMOs utilized by biomass-degrading organisms, this substrate preference creates significant challenges in the evaluation of kinetics, study of the catalytic mechanism, and detailed molecular-level characterization of enzyme−substrate interactions. Isaksen et al. recently described the biochemical characterization of the first reported LPMO acting on soluble oligosaccharides of cellulose.999 Namely, an N. crassa LPMO (NCU02916), or NcrLPMO9C, was found to exhibit substantial activity on cellopentaose and cellohexaose with specificity for the C4 carbon. For cellopentaose oxidation, a nonoxidized trimer and a C4-oxidized dimer

Scheme 5. Overall Reaction Scheme for C1 and C4 Oxidation of Cellulose by LPMOs and Subsequent Hydrolysis

(A) C1 oxidation produces a lactone species, and subsequent hydrolysis produces an aldonic acid functionality at C1, overall resulting in glycosidic bond cleavage.48,50,51 (B) C4 oxidation produces a 4-keto sugar, which can then undergo hydrolysis to form a gem-diol species.50,312,999
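
The mass bookkeeping implied by Scheme 5 can be illustrated with a short sketch. It is not taken from the cited studies: it only assumes standard monoisotopic atomic masses, treats all species as neutral molecules (no adduct ions), and uses hypothetical helper names. Oxidation to the lactone or 4-keto sugar removes two hydrogen atoms from a native oligomer, and hydration to the aldonic acid or gem-diol adds one water.

```python
# Monoisotopic masses in Da, computed from standard atomic masses.
GLUCOSE = 180.06339   # C6H12O6
WATER = 18.01056      # H2O
H2 = 2.01565          # two hydrogen atoms


def cello_oligomer_mass(dp):
    """Neutral mass of a native cello-oligomer with `dp` glucose units."""
    return dp * GLUCOSE - (dp - 1) * WATER


def oxidized_masses(dp):
    """Masses of the native, dehydro, and hydrated forms of one fragment.

    'dehydro' is the lactone (C1 oxidation) or the 4-keto sugar (C4 oxidation);
    'hydrated' is the aldonic acid or the gem-diol, respectively.
    """
    native = cello_oligomer_mass(dp)
    dehydro = native - H2
    hydrated = dehydro + WATER
    return {"native": native, "dehydro": dehydro, "hydrated": hydrated}


# Cellopentaose cleaved into a native trimer and a C4-oxidized dimer,
# the product pair described in the text for NcrLPMO9C.
pentaose = cello_oligomer_mass(5)
trimer = oxidized_masses(3)["native"]
keto_dimer = oxidized_masses(2)["dehydro"]
gem_diol_dimer = oxidized_masses(2)["hydrated"]

for name, mass in [("cellopentaose", pentaose), ("native trimer", trimer),
                   ("4-keto dimer", keto_dimer), ("gem-diol dimer", gem_diol_dimer)]:
    print(f"{name:15s} {mass:9.4f} Da")
```

The two cleavage products together weigh one oxygen atom more than the starting cellopentaose, consistent with a monooxygenase inserting a single oxygen atom into the substrate.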

Scheme 6. Copper−Oxyl Catalytic Mechanism Examined by Density Functional Theory Calculations

Density functional theory calculations predicted that a Cu(II)−oxyl species is the most likely reactive oxygen species (ROS) in the LPMO catalytic mechanism; this species has been predicted previously to exhibit significant oxidative power. The barrier for this reaction mechanism is predicted to be 18.8 kcal/mol. The rate-limiting step is hydrogen abstraction from the substrate. H-abstraction from the C1 and C4 carbons is predicted to be energetically equivalent.1003
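
For a sense of scale, the following minimal sketch converts the 18.8 kcal/mol barrier quoted for the Cu(II)−oxyl mechanism, and the 43.0 kcal/mol barrier reported in the text for the Cu(II)−superoxo mechanism, into approximate rate constants with the Eyring equation. This is an illustration rather than part of the cited study: it assumes a temperature of 298.15 K, a transmission coefficient of 1, and that the computed potential-energy barriers can stand in for activation free energies. The function name `eyring_rate` is hypothetical.

```python
import math

# Physical constants
K_B = 1.380649e-23        # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34 # Planck constant, J*s
R_KCAL = 1.987204259e-3   # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal_per_mol, temperature_k=298.15):
    """Eyring rate constant in 1/s, assuming a transmission coefficient of 1."""
    prefactor = K_B * temperature_k / H_PLANCK
    return prefactor * math.exp(-barrier_kcal_per_mol / (R_KCAL * temperature_k))


# Barriers taken from the text (assumed here to approximate free energies).
k_oxyl = eyring_rate(18.8)       # Cu(II)-oxyl pathway
k_superoxo = eyring_rate(43.0)   # Cu(II)-superoxo pathway
ratio = k_oxyl / k_superoxo

print(f"k(oxyl)     ~ {k_oxyl:.2e} 1/s")
print(f"k(superoxo) ~ {k_superoxo:.2e} 1/s")
print(f"ratio       ~ {ratio:.2e}")
```

Under these assumptions the oxyl pathway turns over on a time scale of seconds, whereas the superoxo pathway is slower by many orders of magnitude, which is the quantitative content of the statement that 18.8 kcal/mol is a far more feasible barrier at biological conditions.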

are observed, suggesting binding in a −3 to +2 mode in the enzyme using the nomenclature for carbohydrate binding sites.175,176,999 This enabled more detailed characterization of the reaction products with NMR spectroscopy than was possible before, which when combined with MS/MS analysis definitively reveals a 4-keto sugar product that further hydrolyzes to a geminal diol in solution (Scheme 5).999 Even more importantly, this discovery enables incredible potential for experimental elucidation of the LPMO catalytic mechanism. As a result, substrate-bound structures may now be possible via NMR or X-ray crystallography. The development of LPMO kinetic assays will likely also follow soon, which will enable experimental determination of the rate-limiting steps in LPMO action. By using unreactive analogues (e.g., cyanide in place of oxygen), isolation of reactive intermediates will likely be possible, which will enable dramatic advances in our collective understanding of these important oxidative enzymes. Using the same enzyme, the Eijsink group subsequently reported that it is active on xyloglucan.1000 This report highlights the potential scope of LPMO substrate specificity, and suggests that oxidative depolymerization of polysaccharides is likely more general than for just cellulose and chitin. Indeed, a recent study has identified an LPMO that is active on starch.1001

As noted in multiple studies since the initial report that LPMOs were oxidative enzymes, understanding the catalytic mechanism is one of the most important outstanding problems for LPMOs.48,50,51,312,991,995,1002 To that end, the first report of the LPMO catalytic mechanism was recently published.1003 Using density functional theory calculations, Kim et al. used an active site, or “theozyme” model,1004 of the T. aurantiacus LPMO51 to first examine how oxygen binds to a Cu(I)-bound state in the LPMO active site. With this activated structure, two catalytic mechanisms for hydrogen abstraction and substrate hydroxylation were compared with transition state calculations, one with a copper−superoxo reactive oxygen species (ROS) and the other utilizing a copper−oxyl ROS. For oxygen binding to the LPMO active site, it was predicted that oxygen binds end-on (η¹) to copper with an O−O bond distance of 1.31 Å, which is in good agreement with the ideal bond length of Cu(II) superoxide species of 1.33 Å.996 From there, a “superoxo” mechanism was tested wherein the LPMO−Cu(II)−superoxo species abstracts a hydrogen atom from the C1 (or C4) carbon of a cellobiose unit, which was chosen to represent the cellulose chain. This reaction forms a radical centered on the substrate carbon atom (C1 or C4) and an LPMO−Cu(II)−hydroperoxo species. The O−O bond then breaks, and the hydroxyl group undergoes a rebound mechanism to form a covalent bond with the C1 (or C4) carbon. This results in a hydroxylated substrate and an LPMO−Cu(II)−oxyl species, and the oxyl group is then reduced to water by action of a reducing agent. The potential energy surface (PES) for this reaction exhibits a barrier of 43.0 kcal/mol, regardless of attack at either the C1 or C4 carbon.1003 Given such a high barrier for a Cu(II)−superoxo mechanism, it was hypothesized that a much more powerful ROS was needed for cellulose hydroxylation. Cu(II)−oxyl species have previously been predicted to exhibit a much stronger oxidative character than Cu(II)−superoxo.1005,1006 Given the likely very short lifetime of Cu(II)−oxyl, it has only been experimentally isolated in a single study to date.1007 Nonetheless, this species has previously been implicated in methane and aromatic hydroxylation reactions.1008−1010 Thus, Cu(II)−oxyl was subsequently hypothesized to be the ROS in LPMO action. The reaction cycle was proposed as follows (Scheme 6): a reducing agent first acts on the LPMO−Cu(II)−superoxo species to produce water and an LPMO−Cu(II)−oxyl species. LPMO−Cu(II)−oxyl abstracts a hydrogen from the substrate to form an LPMO−Cu(II)−hydroxyl species, which then undergoes an oxygen-rebound mechanism to hydroxylate the substrate. The overall barrier for this reaction mechanism was calculated to be 18.8 kcal/mol, which is a much more feasible energy barrier at biological conditions than 43.0 kcal/mol for a “superoxo” mechanism. The rate-limiting step in the Cu(II)−oxyl mechanism is C−H abstraction from the substrate, and abstractions from the C1 and C4 carbon are energetically quite similar.1003 Although the two proposed reaction mechanisms from Kim et al.1003 draw heavily on the proposed catalytic cycle from Phillips and colleagues50 (Scheme 4) and the study of other copper oxygenases (refs 1005, 1006, 1009, 1011−1023), there are some important contrasting features. Namely, the mechanisms proposed by Kim et al. do not require formation of two radicals simultaneously, and do not employ a reducing agent at disparate parts of the catalytic cycle.1003 Additionally, N-methylation of the N-terminal histidine was also studied with the theozyme approach, but no appreciable differences were predicted on the basis of the theozyme method. This study provided the first detailed examination of the LPMO catalytic mechanism, and implicated a ROS for LPMO action of substantial oxidative power,1005,1006 commensurate with the incredibly strong glycosidic bonds in polysaccharides.71 Certainly, however, more work remains to be conducted to fully characterize the LPMO catalytic mechanism.

Given the reclassification of GH61s and CBM33s as LPMOs, the curators of the CAZy database recently published an update in which many oxidative biomass-degrading enzymes including LPMOs were catalogued.152 With the diverse activity of this portfolio of enzymes, the large suite of oxidative enzymes was given the generic title “auxiliary activity” or AA. Fungal LPMOs, formerly GH61s, are classified as AA9 enzymes whereas nonfungal LPMOs are classified as AA10 enzymes.152

Additional families of LPMOs have also recently begun to emerge. Hemsworth and colleagues reported the discovery of a new LPMO family, classified as AA11 on the CAZy database.1024 The authors indicate that this class of LPMOs is phylogenetically distinct from the currently classified oxidative enzymes, and characterize a specific member of this new family, which is a chitinolytic LPMO from A. oryzae that forms aldonic acids. The structure of the AA11 LPMO is overall quite similar to both previously characterized fungal and nonfungal LPMOs. A conserved alanine residue is contained in the active site, similar to nonfungal LPMOs, and a tyrosine residue resides axial to the copper ion. On the basis of this observation and detailed analysis of the EPR spectra, the authors suggest that AA11 LPMOs are intermediate between the two known families (Figure 78).

Figure 78. A. oryzae AA11 LPMO structure. (A) Overall structure of the A. oryzae LPMO with the active site residues His1, Ala58, His60, Glu138, and Tyr140 shown in stick format.1024 PDB code: 4MAI. (B) Active site of the A. oryzae LPMO enzyme. The copper ion coordination state indicates that the copper ion is in a +1 oxidation state.

In 2014, a report from the Eijsink group was also published which demonstrated that AA10 LPMOs also exhibit a broader regioselectivity than just C1 oxidation.991 In a structural and
biochemical study from Forsberg et al., the authors solved high-resolution structures of a pair of AA10 LPMOs from S. coelicolor, and showed that one was a C1 oxidizer and the other specific to C4 oxidation. The enzymes were also shown to be synergistic in their degradation of cellulose. They also demonstrated that the copper affinities, redox potentials, and EPR spectra are quite similar for both the C1 and C4 oxidizing LPMOs. The outcome from this study, based on a structural comparison of the active sites, is that regioselectivity is likely directed by residues beyond the first “shell” from the copper binding site of LPMOs, similar to hypotheses from AA9 LPMOs.

From a more industrial standpoint, additional discoveries have been made related to LPMO action. Related to the original observation from Harris et al.967 that LPMOs do not synergize with GH cocktails on clean cellulose substrates, Dimarogona and colleagues demonstrated that the presence of lignin specifically boosts the activity of a fungal LPMO.1025 Cannella et al. conducted a study with the LPMO-containing Cellic CTec2 cocktail (Novozymes, Inc.) at high solids loadings, and demonstrated that up to 4% of the glucose released was gluconic acid.626 The authors went on to further demonstrate that the β-glucosidase in Cellic CTec2 cleaves the glycosidic bond in cellobionic acid 10 times slower than in cellobiose and that gluconic acid is a significant inhibitor of the β-glucosidase. Similar to the results of Dimarogona et al.,1025 Cannella and colleagues demonstrate that the presence of lignin negates the need to add external reducing agents.626 In a subsequent study, Cannella and Jørgensen demonstrate that simultaneous saccharification and fermentation conditions reduce the amount of gluconic acid production, likely due to oxygen uptake by the fermentative organism.627 Lastly, the analysis of heterogeneous, oxidized products from LPMOs has been a significant challenge in the field that has required the development of new analytical approaches. In a series of two papers, Isaksen et al. and Westereng et al. report detailed spectroscopic and chromatographic methods that enable chemically accurate and robust methodologies for analysis of oxidized products.52,999

11.3. Conclusions

In terms of LPMO mechanisms of action, to date, we collectively know the following:

(1) LPMO action requires a reducing agent and molecular oxygen and a copper ion in the active site.48,50,51,981
(2) The reducing agent required for LPMO action can be enzymatic (CDH), an externally added small molecule, or lignin-derived molecules found in biomass.48−51,312,626,965,967,979,995,997,999,1002,1024−1026
(3) LPMOs have been shown to definitively act either on C1 or C4 carbons,50,312,995,997,999 and several motifs have been identified from biochemical and structural studies that impart regioselectivity.997 Oxidative action at the C1 carbon yields a lactone, which can be hydrolyzed to an aldonic acid motif, whereas C4 oxidation yields a 4-keto sugar that can hydrolyze to a geminal diol.
(4) There have been three “auxiliary activity” families characterized according to the CAZy database thus far on the basis of sequence differences.152,1024 Different LPMOs within the AA9 and AA10 families can act either on insoluble forms of cellulose or chitin and, in one reported case to date, on soluble cello-oligomers,999 xyloglucan,1000 and starch.1001
(5) A theoretical prediction has been reported that suggests the fungal LPMO mechanism employs a Cu(II)−oxyl ROS via an oxygen-rebound mechanism for polysaccharide hydroxylation.1003

LPMOs are a recent discovery, and they are rapidly emerging as a very important enzyme addition to the canon of cellulolytic enzymes, which to date have primarily been hydrolytic. They are also attracting interest from a fundamental perspective, given their unique copper-binding sites and the fact that they catalyze cleavage of very strong glycosidic bonds.1027 LPMOs are able to synergize with GHs, likely as endo-acting enzymes that act directly on the surface of crystalline polysaccharides. This ability to act on crystalline regions offers a plausible explanation of the apparent synergy of LPMOs with EGs, which are thought to primarily act in more accessible, amorphous regions of cellulose. As further evidence toward this hypothesis, in the case of α-chitin, low crystallinity in the initial substrate reduces the boosting potential of CBP21.1028

As discussed in section 2, glycosidic linkages in polysaccharides are among the strongest covalent linkages found in nature.71 The hydrolytic enzymes that cleave these linkages are thus some of the most powerful enzymes known, and the discovery of LPMOs demonstrates that other enzymatic paradigms are also able to selectively cleave these bonds.48 Transition metals are usually required in enzymes or oxidation catalysts to circumvent spin-forbidden transitions in the triplet ground state of molecular oxygen. The oxidative power required to break strong bonds in carbohydrates, coupled to the active site geometries revealed in LPMO structures, has drawn comparison to methane monooxygenase, which activates one of the strongest known C−H bonds.50,51,1002,1009 Similarly, Kim et al. predicted that an incredibly powerful oxidative species, Cu(II)−oxyl, is likely responsible for polysaccharide hydroxylation in a fungal LPMO.1003

Going forward, there are undoubtedly many open questions related to the mechanisms of these fascinating enzymes. In a recent perspective on LPMOs,1002 Hemsworth, Davies, and Walton posited that nonfungal (CBM33, AA10) and fungal (GH61, AA9) LPMOs may employ a different catalytic mechanism based on differences in the active site structures, illustrated for representative structures in Figures 74 and 77, and their EPR spectra. Specifically, nonfungal LPMOs examined to date exhibit a conserved alanine residue in the active site that is absent in fungal AA9 LPMO structures and sequence alignments. This alanine residue may alter the initial binding of oxygen1003 or modulate the identity of the ROS.991,1002 Additionally, the identity of the axially coordinating residue was thought for some time to be mainly tyrosine in fungal AA9 LPMOs and phenylalanine in nonfungal AA10 LPMOs, but it is not known how universal this delineation is, or if the axial residue plays a significant role in the reaction. In fact, Harris et al. demonstrated in their initial study that the mutation of the tyrosine residue to phenylalanine in a fungal LPMO reduced the observed synergy, but did not remove it altogether, suggesting that the enzyme was still active.967 Later reports of AA10 LPMOs have shown that the axial residue can be tyrosine.1029 The report of the AA11 LPMO family was the first fungal LPMO that exhibits an axial phenylalanine residue and an alanine in the active site, in a seemingly intermediate active site arrangement between AA9 and AA10 LPMOs.1024 Additionally, there is a highly conserved residue in the active site that is typically glutamine in fungal LPMOs and glutamic acid in nonfungal LPMOs, but again its role in the oxidation reaction is unknown. Additionally, the lack of N-methylation on the N-terminal histidine in bacterially expressed AA10 LPMOs, or
fungal LPMO heterologously expressed in yeast, suggests that perhaps filamentous fungi are able to employ this post-translational modification whereas bacteria and some yeast cannot. If this modification is significant for catalytic action, then perhaps it might lead to differences in activity or even a different mechanism. Lastly, X-ray radiation is able to rapidly photoreduce the copper ion from a copper(II) oxidation state to a copper(I) oxidation state in AA10 (and for at least one case, AA11) LPMOs. However, fungal LPMOs solved to date with similar X-ray crystallography approaches are able to harbor copper(II) ions seemingly with no consideration for photoreduction. This difference may suggest clues as to the differences in the mechanisms of these LPMO families. Regardless, although the LPMO enzymes in the AA9−11 families are distinctly separated in sequence space, it is not clear how universal the active site differences are between the three LPMO families, nor if this has significant ramifications on the oxidation mechanism. Thus far, LPMO products and substrates do not significantly differ between the families relative to the internal variation per family. Undoubtedly, as more information is collectively elucidated regarding the mechanisms of these novel enzymes, their classification will likely warrant continual re-evaluation. The differences in reactivity, now among the three LPMO families,1024 will require considerable biochemical, structural, and theoretical work to fully unravel.

Additionally, to date very little has been reported on the electron transfer steps besides a study on the interaction of a fungal LPMO with two different CDHs.1026 Sygmund et al. examined the effect of the two native CDHs from N. crassa in their interaction with an LPMO from the same organism, and found that there are differences in the efficiency of CDH/LPMO interaction, but the authors point out that this requires a significantly expanded scope to be able to ascertain LPMO−CDH intermolecular interactions and catalytic efficiencies.1026 Intramolecular electron transfer in LPMOs has been postulated to occur via clusters of aromatic residues,991,995 but the test of this hypothesis has not yet been reported. The need for further studies into the interactions of CDH or small-molecule reducing agents with LPMOs is obvious, and may be appropriate for biophysical studies such as those utilizing surface plasmon resonance, NMR, or even cocrystallization.

Beyond the detailed, molecular mechanisms of LPMO action, there are many questions that need to be answered related to industrial applications of LPMOs. The design of optimal mixtures of GHs and LPMOs will be required for biomass depolymerization in a biofuels or chemicals context, and this work will require massive efforts similar to the decades of efforts already spent optimizing GH cocktails and understanding enzyme synergy. The application of sophisticated enzymology tools such as those reviewed above from Väljamäe et al.,307,531,557 Westh et al.,559,560 and Igarashi et al.476,558 will undoubtedly help elucidate the mechanisms of synergy in a more detailed fashion, which will accelerate the development of optimized GH/LPMO cocktails.

Lastly, the discovery of LPMOs 20 years after the first structures of GH enzymes from fungi172,173,192 and 60 years after the C1−Cx hypothesis was put forth by Reese et al.291 highlights the fact that we still have much to learn about novel biomass depolymerization mechanisms in nature. There are likely more discoveries on the level of LPMOs that we still have yet to uncover about how fungi and organisms from other kingdoms of life degrade plant cell walls.

12. MODELING ENZYMATIC HYDROLYSIS

Extremely valuable information regarding the catalytic and processive mechanisms of CBHs and EGs can be obtained by considering each component in isolation. However, synergism between multiple cellulase components cannot be captured by models that only consider one cellulase at a time. Kinetic models of multiple enzyme components (cocktails) allow for testing of predictions and quantifying the effects of varying enzyme and substrate concentrations and properties.33,556 We review here two basic categories of kinetic models that have been applied to the enzymatic hydrolysis of cellulose, namely ordinary differential equation (ODE)-based models and agent-based models. In what follows, we first highlight ODE-based models (Figure 79), followed by agent-based models (Figure 80).

Figure 79. Illustrative example of an ODE-based model for the enzymatic degradation of crystalline cellulose. (A) An example of the discrete steps of CBH processive/hydrolytic activity that may be captured by an ODE-based kinetic model. Kinetic schemes to capture the CBH processive action may assume (B) that CBH can only produce cellobiose or (C) that initial cuts produce glucose, cellobiose, and cellotriose with nonzero probability. Note that the cycle in part A does not correspond precisely to either of the schemes shown in parts B or C. Part A is reprinted with permission from ref 532. Copyright 2011 American Chemical Society. Parts B and C are reprinted with permission from ref 1040. Copyright 2011 Wiley Periodicals, Inc.

12.1. Ordinary Differential Equation-Based Models

ODE-based kinetic models first formulate a set of process steps (Figure 79A) and then develop a set of differential equations that describe these steps (Figure 79B,C). The rate constants necessary for each process step generally come from experiments (or sometimes from simulations) wherein the kinetics of individual steps have been isolated. Varying system inputs (e.g., starting concentrations of substrate or enzyme, substrate properties, the reaction rate constants, etc.) and solving the
resulting set of differential equations can reveal the dependency of various outputs (product concentrations, conversion, degree of synergism, etc.) on the inputs.

Figure 80. Illustrative example of an agent-based model for crystalline cellulose enzymatic degradation. (A) At the beginning of an adsorption/hydrolysis simulation the substrate surface lattice is perfectly ordered. (B) After adsorption equilibrium has been reached, nonproductively bound CBHs contribute to a reduced apparent processivity. (C) Surface erosion is exemplified after limited hydrolysis and (D) after extended hydrolysis. Reprinted with permission from ref 600. Copyright 2001 John Wiley and Sons.

In the earliest model of this type, Suga et al. predicted the molecular weight distribution of saccharides released from an insoluble substrate, considering EG and CBH (the latter acting exclusively at chain ends) both in isolation and in concert.1030 They modeled the substrate−enzyme interaction as Michaelis−Menten kinetics, but this is solely for the reaction and not for adsorption, which they did not consider. They assumed that all β-glycosidic bonds are accessible to the enzymes. They noted that, with EG alone, the production of monomers (in this case, cellobiose) is extremely low due to its random cleavage of internal bonds. On the other hand, CBHs produce only monomers, but have a low concentration of chain-end glycosidic bonds available to cleave. In concert, however, they demonstrated that the monomer production was drastically increased, thus capturing endo/exo synergism.1030

The model of Okazaki et al. added to the approach of Suga et al. the inclusion of β-glucosidase activity as well as consideration of product inhibition.1031 Their results predicted that CBH catalytic activity is inversely proportional to the initial substrate DP and that EG activity is not dependent on DP. However, the synergistic effect of employing EG and CBH in concert actually increases as the substrate DP increases.1031

Converse and Optekar considered a binary cocktail of EG and

enzyme loading exists in between these two limits, resulting in a maximum degree of synergy.1032

Eriksson et al. modeled the synergistic action of TrCel7A and TrCel7B (a CBH and an EG, respectively) on steam-pretreated spruce and attempted to account for the rapid rate decrease seen in enzymatic hydrolysis.569 Their experiments indeed exhibited this rate decline, and they were able to rule out enzyme thermal stability and product inhibition as major causes. Addition of TrCel7B increases the hydrolysis rate more than adding TrCel7A, which in turn has a greater effect than adding substrate. Their explanation for EG/CBH synergism centered on the ability of EGs to “remove” disordered cellulose chains that act as obstacles to processive CBH action. This proposal has gained considerable traction in recent years as convincing experimental evidence has pointed to steric obstacles as a rate-limiting factor for CBHs acting in isolation307 and their removal as a vital role for EGs.557

Zhang and Lynd developed a model for an enzyme cocktail with parameters chosen from the experimental literature for TrCel7A, TrCel6A, and TrCel7B.1034 Notably, they apply a single set of enzyme parameters to various “substrates” (by examining characteristic values of degree of polymerization, DP, and fraction of β-glucosidic bonds available to cellulase). They conclude that these two substrate parameters are sufficient to capture various phenomena such as the dependence of cellulase synergy on the substrate properties, enzyme loading, and reaction time. In addition, of the two substrate properties, they predict that enhancing substrate availability is more beneficial to increasing activity than decreasing DP.1034

Bommarius et al.596 compared two approaches to the problem of enzyme adsorption, diffusion, and reaction, namely fractal kinetics and jamming kinetics. Fractal kinetics accounts for the heterogeneous nature of the substrate that restricts cellulase diffusion, whereas jamming kinetics focuses on the obstruction of cellulase motion due to other bound cellulases. Experimentally, they studied the effect of using an array of different pretreatments on rate slowdown. They found that the rate of hydrolysis dropped by 2−3 orders of magnitude at high degrees of conversion, but that this was not dependent on the pretreatment method. In fact, they conclude that pretreatment and fractal kinetics are irrelevant to the rate decline seen at high degrees of conversion. Their model explained the decline, however, as being due to the jamming of enzymes bound to the substrate surface.596

Zhou et al. introduced a model that explicitly accounted for the changing morphology of the substrate due to enzymatic hydrolysis as well as the cellulose chain fragmentation.1035−1037 This is an important concept, as the evolution of accessible surface area is likely to be a complex function of multiple dynamic parameters throughout the course of hydrolysis. This model was applied to the T. reesei system of Cel7A, Cel6A, and Cel7B degrading Avicel,1035 and they were able to explicitly capture the hydrolytic evolution of cellulose substrate. Their results indicated that, in addition to averaged substrate
CBH1032 in order to rationalize the fact that the degree of characteristics such as DP, internal surface area, and enzyme
synergism had been shown1033 to go through a maximum as the accessibility fraction, that the distribution of structural
total enzyme concentration increased. Their model featured heterogeneities is an important characteristic to be considered.
competitive adsorption of the two cellulases for a limited Their model captured the rapid rate retardation of enzymatic
number of sites. At low enzyme concentrations, the system is hydrolysis, and the authors proposed that this distribution is an
also at low conversion, and the CBH is less dependent on the important contributing factor in the slowdown. Their “surface
new chain ends created by the EG. At high enzyme ablation” model is conceptually similar to the lattice models of
concentrations, CBH dominates the surface sites and thus Sild et al.600,1038 discussed below. Within this model, there are
restricts the EG from creating new chain ends. An optimal two relevant substrate-related time scales: the time to degrade a

single glucan chain (nonmorphological) and the time to completely degrade the substrate (which necessitates morphological consideration). The model is able to correlate the first time scale with the initial rapid rate slowdown,1037 a finding that is compatible with the steric obstacles hypothesis.307 However, the model is not able to match experiments in regard to the second time scale, leading to the conclusion that other inhibitory factors must be at work in this regime.1037

Levine et al. introduced a mechanistic model that included the following novel features: enzyme adsorption and complexation were modeled as distinct steps, the equilibrium assumption for adsorption was abandoned, and the cellulosic substrate was represented as a polydisperse distribution of spheres.1039 They applied this to the individual and synergistic behavior of Cel7A and Cel5A from T. reesei. Their model performed well except with low surface area, where enzyme competition for adsorption sites is particularly high and enzyme crowding is most relevant. Also, the experimentally observed retardation of hydrolysis could only be accounted for with abnormally strong product inhibition or short enzyme half-lives; thus, they concluded that neither of these was the primary cause of the slowdown. They also found that enzyme crowding cannot be the sole reason for the hydrolytic rate slowdown; model results only showed this effect with multiple enzymes present and with high substrate loadings, whereas experiments find the rate slowdown to be universal (even with single enzymes). They noted that “structural heterogeneities” may play a significant role, but their model did not account for these.

Levine et al. later expanded their previous model to track individual product formation (glucose, cellobiose, and cellotriose) instead of just overall hydrolytic activity,1040 with the goal of developing a rational method for determining the optimum cellulase ratios for hydrolysis experiments (using Cel7A, Cel6A, and Cel5A from T. reesei). They considered various combinations of substrate DP and surface area, including values representative of BMCC. The model results suggested that optimal enzyme ratios are 1:0:1 with Cel7B:Cel7A:Cel6A at shorter times (24 h) and 1:1:0 at longer times (72 h); the authors indicated this may be related to the relative thermal stabilities of the two CBHs.

Informed by the kinetic models just described, Fox et al. experimentally examined the kinetics of CBH chain complexation and how EGs affect these kinetics.532 They employed long-time hydrolysis trials (>100 h) of BMCC by CBH Cel7A from T. longibrachiatum and EG II from T. emersonii. They measured initial-cut product release and equated this to initial chain complexation. They also measured the processive-cut products and deduced processivity from the ratio of processive-cut products to initial-cut products. Via material balances on the initial-cut products and the processive-cut products, they developed a system of ODEs. Even when chain ends for CBH action were in excess (due to a high concentration of EG), the rate of generation of initial-cut products was an order of magnitude lower than the hydrolysis rate of soluble cellohexaose. From this and other evidence, it was concluded that chain complexation is rate-limiting. This was in contrast to the findings of Kurašin and Väljamäe.307 However, Fox et al. also found that increasing EG concentration decreases the length of a processive run for the CBH while also increasing the hydrolytic conversion. This finding perhaps supports the idea that EGs help to “rescue” CBHs that are trapped behind obstacles.557 This would explain the trends seen in processivity and conversion within the obstacle-limiting framework, though it does not fit as well in the complexation-limited framework. Fox et al.’s calculated apparent processivity was much lower than the intrinsic processivity; thus, they concluded that CBHs do have their processive runs halted by steric obstacles on the surface,532 consistent with Kurašin and Väljamäe.307

Griggs et al. developed a population balance model to capture the hydrolysis of cellulose by the action of an EG and a CBH.1041 For the CBH, distinct steps for adsorption, complexation, processive hydrolysis, and desorption were modeled. They captured the temporal development of the chain length distribution for EG in isolation, CBH in isolation, and EG/CBH in concert for both amorphous and structured substrate. In a follow-up study, β-glucosidase action was included to model a complete cocktail. The rate-limiting factor in this study was determined to be the availability of substrate.1042

One of the most difficult aspects of gaining mechanistic insight into how cellulases work is that surface interactions are very difficult to characterize.1043 Fox et al. applied photoactivated localization microscopy (PALM) to quantify the binding affinity of six different CBMs for regions of a cellulosic cotton substrate of varying crystallinity.1044 Their results demonstrated quantitatively the invalidity of applying a purely binary classification to cellulose (amorphous vs crystalline). They quantitatively assessed the binding preference of the different CBMs studied and found a continuum of binding preferences. They then related these findings to binary synergistic enzyme activity by synthesizing chimeras composed of A. cellulolyticus GH5 EG, TrCel7A linker, and each of the six CBMs. Their results revealed a significant new element of cellulase synergism. If two cellulases worked either in totally different regions of cellulose or in the same region, there was no synergistic effect. However, enzymes with similar but nonidentical binding site preferences gave optimal synergism, by enhancing the susceptibility of the substrate for one another. These conclusions have clear implications for selecting enzymes for cocktails with optimal synergism.

Gao et al. challenged the conventional paradigm that increased hydrolytic activity positively correlates with an increase in surface-bound enzyme.1045 With a cocktail of TrCel7A, TrCel6A, and TrCel7B, they showed that although the binding coefficient on cellulose III was only half that on native cellulose Iβ, the hydrolytic activity was increased. In fact, they showed that enzyme loading can be decreased by 5-fold on cellulose III while achieving hydrolysis rates comparable to those on cellulose Iβ. To rationalize this observation, they developed a kinetic model that couples enzyme binding, chain decrystallization/sliding, and hydrolysis as distinct steps in the enzymatic processive cycle. Their model showed that their nonintuitive experimental result can be reproduced if the enzymes’ processive ability (quantified by the rate constant k_slide) was enhanced while their initial chain-association ability (quantified by nk_on, where n is the number of available binding sites and k_on is the adsorption rate constant) was diminished.

12.2. Agent-Based Models

Agent-based models (also known as cellular automata) treat the cellulosic substrate and enzymes as individual entities and assign behaviors or properties to each on the basis of previously published parameters. They have the advantage of giving a spatial dimension to traditional models that are solely based on differential equations. In addition, they provide the capability of visual inspection, which can give insight into mechanisms and guide further model development (Figure 80).
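As a rough illustration of the spatial bookkeeping that agent-based models make possible, the following Python sketch performs random sequential adsorption of enzymes onto a one-dimensional periodic lattice on which candidate binding sites overlap. It is a toy under stated assumptions, not a reimplementation of any published model: the function names (`jam_lattice`, `coverage`), the lattice size, and the three-site footprint are all hypothetical, and desorption, hydrolysis, and processive motion are deliberately left out.

```python
import random


def jam_lattice(n_sites=601, footprint=3, seed=0):
    """Irreversibly adsorb enzymes on a periodic 1D lattice until jammed.

    Each enzyme blocks `footprint` consecutive lattice sites. Candidate
    binding positions overlap: any site may serve as the start of a footprint.
    All values here are illustrative assumptions, not published parameters.
    """
    rng = random.Random(seed)
    occupied = [False] * n_sites
    # Without desorption, a blocked position stays blocked, so trying every
    # start position once in random order is enough to reach the jammed state.
    starts = list(range(n_sites))
    rng.shuffle(starts)
    bound = []
    for start in starts:
        cells = [(start + i) % n_sites for i in range(footprint)]
        if not any(occupied[c] for c in cells):
            for c in cells:
                occupied[c] = True
            bound.append(start)
    return occupied, bound


def coverage(occupied):
    """Fraction of lattice sites covered by bound enzymes."""
    return sum(occupied) / len(occupied)


if __name__ == "__main__":
    lattice, enzymes = jam_lattice()
    print(f"bound enzymes: {len(enzymes)}, coverage: {coverage(lattice):.3f}")
```

Because bound enzymes leave gaps narrower than one footprint, the jammed coverage stays below full occupancy, which is the kind of crowding effect that differential-equation models cannot represent spatially.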

Finally, this type of model more naturally captures physical phenomena on the substrate surface such as enzyme crowding and nonproductive binding to substrate.

Sild et al. employed an infinite, anisotropic, two-dimensional lattice simulation with overlapping binding sites (of only one type) with the goal of estimating the amount of bound enzyme as a function of free enzyme.1038 Simple rate equations governed the binding and dissociation of CBH (e.g., Cel7A) to/from an initially perfect crystalline surface (e.g., Avicel). Though experimentally it had been shown that the adsorption of TrCel7A to cellulose could be accounted for by assuming two different types of binding sites (perhaps signifying regions of high and low crystallinity), Sild et al. demonstrated with their model that if the binding sites are allowed to overlap (as is the case experimentally),348 the data are fit equally well by a model that assumes only one type of binding site. Their data also indicate that, at high cellulase loadings, the surface reaches an apparent equilibrium before the theoretical maximum coverage due to surface crowding and slow rearrangement. This results in surface coverage that is as much as 40% below the theoretical maximum, depending on the shape of the adsorbing molecule.

Later, Väljamäe et al.600 coupled experiments with an expanded version of the Monte Carlo simulation model of Sild et al.1038 by adding two significant aspects: TrCel6A was included (in addition to TrCel7A), and hydrolysis, as well as binding, was considered. Experimentally, they found hydrolytic rate retardation at short times (in the first 5−10 min). Also, preincubation of BMCC with TrCel7A actually decreased the reaction velocity for subsequent hydrolysis with TrCel7A. However, preincubation with TrCel6A increased the reaction velocity for subsequent hydrolysis by TrCel7A. Thus, so-called exo/exo synergism was not dependent on the CBHs being incubated simultaneously. They subsequently sought to explain these results via Monte Carlo simulations. They attributed the initial rate retardation to two primary factors: steric hindrance by nonproductively bound cellulases and cellulose surface erosion due to CBH action. Their simulations support these two modes of CBH inhibition (Figure 80). The early stages of adsorption/desorption simulations featured the most drastic increase in bound CBHs, causing a concomitant increase in enzyme obstacles. In addition, CBH processive action degraded the surface from an initially homogeneous one to a drastically eroded one; at long times, a steady-state erosion pattern having a constant retarding effect should be established.600

Fenske et al. examined the case of a single cellulase that has nonzero exo and endo capabilities acting on an insoluble polysaccharide substrate, modeled as a two-dimensional lattice of fixed DP.1046 Enzymes were allowed to move and cleave bonds on the surface in a stochastic fashion (with fixed probabilities). Their results indicated that substrate inhibition was present at high substrate loadings with only one cellulase component. This was due to a decrease in “autosynergism”, that is, the endo activity of an enzyme aiding the exo activity of the same enzyme.1046

Warden et al. developed a cellular automata model to simulate the deconstruction of cellulose to glucose by cellulase cocktails.1047 They developed a three-dimensional spatial lattice model that included control over many physical and chemical variables involved in the saccharification of cellulose. They utilized published values for the cellulase system of T. reesei (section 4.3) and examined the effect on glucose production of varying enzyme loadings, adsorption strengths, and catalytic activities of the EGs and CBHs in the system. In general, they found that varying the enzyme/substrate ratio and adsorption strength (for either CBH or EG) from the literature values resulted in a decrease in glucose production. They found decreasing glucose production with time and attributed this to enzyme crowding on the surface and decreased cellulose surface area.

Asztalos et al. utilized a spatial model for cellulose degradation that “forms a bridge” between all-atom MD simulation and higher-level ODE models and that accounted for enzyme crowding on the surface.1048 They developed a coarse-grained stochastic model of endo and exo cellulases with a two-dimensional surface modeling crystalline cellulose. They examined TrCel7A, TrCel6A, and TrCel7B and included the following reactive events: adsorption, interchain hydrogen bond breaking, hydrolysis of glycosidic bonds, and desorption. Endo/exo synergism was qualitatively reproduced.1048

In summary, both ODE- and agent-based models of cellulose degradation by enzyme cocktails have supplemented and improved experimental characterization of the substrate and enzymatic action. In particular, they have shed light on the mechanisms and influencing factors of significant experimentally observed phenomena including cellulase synergism and the rapid initial retardation in hydrolysis rate. These models offer the ability to test mechanistic explanations for these phenomena as well as to strip the system down to its basic essentials in order to offer these insights. As such, they can inform the selection of process conditions, such as optimal cellulase loadings and ratios between cocktail components. Finally, they have the potential to assist in the determination of the relative rates of the individual steps of the processive cycle of CBHs. Going forward, these models have the potential to enhance mechanistic understanding, particularly as they are increasingly coupled to advanced experimental techniques (refs 102, 103, 128, 307, 476, 521, 532, 534, 535, 546, 548, 552, 557, 558, 560, 561, 564, 795, and 1044) and include the synergistic action of CBHs (with endo and exo capability and acting from both the reducing end and the nonreducing end), EGs, and LPMOs.

13. CONCLUDING REMARKS

Fungi have evolved to be the most powerful and prevalent biomass-degrading organisms in nature, exhibiting a diverse range of lifestyles for the turnover of lignocellulosic material on Earth. Given their significant activity and ability to be readily produced at high titers on the industrial scale, fungal cellulase cocktails are an excellent starting point for industrial biorefinery applications. Indeed, industrial-scale lignocellulosic biorefineries, mostly slated to produce bioethanol at this point, are beginning to come online worldwide at the time of this review. Given the scale of fuel demand and the significant potential for biomass conversion to offset fossil fuel usage, the use of fungal cellulases will likely grow dramatically in the coming years, which in turn could make fungal cellulases the most widely produced industrial enzymes in the world by a large margin. As enzymes remain a major cost in renewable fuels production, this in turn makes even “small” gains in enzyme or cocktail performance of significant importance in the renewable energy economy.

As covered in this review, significant strides in our fundamental understanding of cellulase action have been made in the past several decades, especially driven by the structural biology efforts starting in the early 1990s. However, challenges to “well-accepted” models and the basic physical mechanisms of cellulose deconstruction have been reported in the last several

years that highlight how much we have yet to understand. For example, the role of hydrolytic EGs has been recently revealed to be generation of “dissociation” points for CBHs, instead of the commonly cited attachment points (see section 6.2). Additionally, the recent discovery of LPMOs highlights the fact that basic cellulolytic paradigms are not fully characterized, and it is possible that other enzyme activities exist that have not yet been discovered, or like LPMOs, may be misclassified. Beyond those two examples, many open questions remain even regarding basic catalytic mechanisms in some cases (e.g., GH6 cellulases and LPMOs), the molecular basis for activity differences within the same family of cellulolytic enzymes, the basis of synergistic function in enzyme cocktails (e.g., why are multiple EGs needed? What function do LPMOs serve beyond EGs?), and enzyme−substrate interactions across the multiple length scales likely of importance in cellulose (and biomass) deconstruction. Unlike most enzymes that work in solution, cellulases function at a solid−liquid interface on a physically heterogeneous substrate in the case of clean cellulose and a physically and chemically heterogeneous substrate in the case of native or pretreated plant cell walls. Our lack of understanding of these solid substrates, coupled to the inherent difficulties associated with effective kinetics measurements of single enzymes and enzyme cocktails, makes understanding and improving cellulase action a daunting technical challenge. Moreover, for both fundamental investigations and industrial applications, the substrate of choice must be carefully considered for interpretation of results, and in the latter case of pretreated biomass, the substrate treatment must be “co-optimized” with the enzyme cocktail design. Taken together, these recent findings and open questions continue to challenge our collective basic premise of cellulose deconstruction and highlight the fact that exciting, impactful advances are yet to be made in the study of cellulose deconstruction in both natural and applied contexts.

Going forward, these open questions will be addressed through the continued application of traditional structural biology and biochemical measurements, development and application of advanced biophysical measurements, kinetic and molecular modeling coupled to experimental findings, screening and mining natural diversity of both natural enzymes and secretomes, detailed substrate characterization, and the development of novel methods for engineering enzymes that inherently work at solid−liquid interfaces in a cocktail context. The elucidation of rate-limiting steps in “simple” cocktails via advanced methodologies developed in the last several years has begun to uncover some aspects of cellulolytic action that can be considered targets for enzyme engineering. Continued development and application of these types of quantitative methods in increasingly complex systems and across substrates will further elucidate the physical and chemical mechanisms of cellulase action, and highlight opportunities for improvement of enzyme performance.

In conclusion, fungal cellulases, some of which remain to be discovered, work in concert to accomplish one of the most important processes in nature, namely the turnover of recalcitrant cellulose. This process is of paramount importance in the global carbon cycle and may become one of the most industrially important enzymatic reactions given the desperately needed drive toward a renewable energy-based global society. Understanding the function, diversity, and mechanisms of fungal cellulases remains an extremely challenging, massive endeavor but one of considerable importance for both the natural and industrial world. Advances from the worldwide research community will be needed to comprehensively address these open questions and to harness the considerable potential of these fascinating enzymes.

AUTHOR INFORMATION

Corresponding Author
*E-mail: gregg.beckham@nrel.gov.

Author Contributions
#These authors contributed equally to the review.

Notes
The authors declare no competing financial interest.

Biographies

Christina M. Payne is an assistant professor in the Department of Chemical and Materials Engineering at the University of Kentucky and the August T. Larsson Guest Researcher at the Swedish University of Agricultural Sciences in Uppsala, Sweden. She received her Ph.D. in chemical engineering from Vanderbilt University in 2007. After a brief foray into the world of chemical process engineering, she undertook postdoctoral studies at the National Renewable Energy Laboratory (NREL) in 2011 and was promoted to staff scientist the same year. In 2012, she joined the faculty at the University of Kentucky where her research focuses on application of computational tools to investigate protein structure−function relationships and biocatalysis.

Brandon C. Knott is a Postdoctoral Research Fellow in the National Bioenergy Center at NREL and recipient of the NREL Director’s Fellowship. He obtained his Ph.D. from the University of California, Santa Barbara, in 2012 in Chemical Engineering for research into the mechanisms of nucleation from solution utilizing statistical mechanics, computer simulation, and experiments. His current research at NREL


focuses on utilizing advanced computational approaches to elucidate mechanisms of enzymatic catalysis relevant to biofuels production.

Heather B. Mayes is a Ph.D. candidate in the Department of Chemical and Biological Engineering at Northwestern University in Evanston, IL, where she is using computational chemistry to elucidate thermochemical and enzymatic routes for cellulose decomposition to produce renewable fuels and chemicals. She is a Department of Energy Computational Science Graduate Fellow, Chicago ARCS Scholar, and active member of the Society of Women Engineers. Before returning to school, she worked as a chemical engineering consultant for Jacobs Consultancy, helping transportation fuel companies evaluate potential new processes and make existing processes safer and more energy-efficient.

Henrik Hansson is a Researcher at the Department of Chemistry and Biotechnology, Swedish University of Agricultural Sciences, Uppsala, Sweden. He obtained his Ph.D. from the Royal Institute of Technology, Stockholm, in 2002 for research in protein−protein interaction using NMR and other biophysical techniques. A postdoctoral period followed at Uppsala University, Uppsala, Sweden, working with Dr. Gerard J. Kleywegt in the laboratory of Professor T. Alwyn Jones. In 2006, he went to the Swedish University of Agricultural Sciences and Department of Molecular Biology, now the Department of Chemistry and Biotechnology, to use crystallography and structural biology to study cellulases and other cellulose degrading enzymes.

Michael E. Himmel is a Research Fellow in the Biosciences Center at NREL. He obtained his Ph.D. from Colorado State University in 1980 in Biochemistry. At NREL, he has conducted, led, and designed research in protein biochemistry, recombinant technology, enzyme engineering, and new microorganism discovery. During the past three decades, he has contributed over 300 journal articles, 7 books, and 22 patents to the literature. In 2004, he was a recipient of an R&D 100 award for “Advanced Catalytic System for Biomass Conversion”.

Mats Sandgren is an Associate Professor at the Department of Chemistry and Biotechnology, Swedish University of Agricultural Sciences, Uppsala, Sweden, and has studied cellulose degrading enzymes since 1998. At Uppsala University he obtained his Ph.D. in molecular biology in 2003, in the group of Alwyn Jones, for research on structure−function relationships in cellulases. He and his research group moved to the Swedish University of Agricultural Sciences in 2007 and focus now on research on structure−function relationships in cellulases and other carbohydrate-active enzymes.

Jerry Ståhlberg is an Associate Professor at the Department of Chemistry and Biotechnology, Swedish University of Agricultural


Sciences, Uppsala, Sweden, and has studied cellulose-degrading enzymes since the mid-1980s. At Uppsala University he obtained his Ph.D. in biochemistry with Göran Pettersson, and he then joined the structural biology group of Alwyn Jones. He moved to the Swedish University of Agricultural Sciences in 1998 and is leading research on structure−function relationships in cellulases and other carbohydrate-active enzymes.

Gregg T. Beckham is a Senior Engineer in the National Bioenergy Center at NREL in Golden, Colorado. He obtained his Ph.D. from the Massachusetts Institute of Technology in 2007 in Chemical Engineering for research into solid-state nucleation mechanisms using statistical mechanics and molecular simulation. After joining NREL in 2008, his research interests have focused on understanding structure−function relationships in carbohydrate-active enzymes, chemical catalysis, and process development for carbohydrate and lignin valorization.

ACKNOWLEDGMENTS

We thank our colleagues, past and present, for many productive collaborations and engaging discussions that contributed to ideas in this review. C.M.P. acknowledges the August T. Larsson Guest Researcher Programme at the Swedish University of Agricultural Sciences for funding. B.C.K. and G.T.B. acknowledge the NREL Laboratory Directed Research and Development Program and Director’s Fellowship Program for funding. G.T.B. and M.E.H. acknowledge funding from the US Department of Energy BioEnergy Technologies Office. H.B.M. thanks the DOE Computational Science Graduate Fellowship (CSGF), provided under Grant DE-FG02-97ER25308 and the ARCS Foundation Inc., Chicago Chapter. H.H., M.S., and J.S. thank the faculty for Natural Resources and Agriculture at the Swedish University of Agricultural Sciences for support of research through the program “MicroDrivE”.

ABBREVIATIONS

AFM atomic force microscopy
BC bacterial cellulose
BMCC bacterial microcrystalline cellulose
CBD cellulose-binding domain
CBH cellobiohydrolase
CBM carbohydrate-binding module
CBM33 family 33 carbohydrate-binding module
CBP21 chitin-binding protein 21
CD catalytic domain
CDH cellobiose dehydrogenase
CI crystallinity index
CMC carboxymethyl cellulose
EG endoglucanase
EPR electron paramagnetic resonance
GH glycoside hydrolase
GH5 family 5 glycoside hydrolase
GH6 family 6 glycoside hydrolase
GH7 family 7 glycoside hydrolase
GH12 family 12 glycoside hydrolase
GH45 family 45 glycoside hydrolase
GH61 family 61 glycoside hydrolase
Glc3 cellotriose
Glc4 cellotetraose
Glc5 cellopentaose
Glc6 cellohexaose
Glc8 cellooctaose
Glc9 cellononaose
LPMO lytic polysaccharide monooxygenase
MCC microcrystalline cellulose
MD molecular dynamics
NMR nuclear magnetic resonance
PASC phosphoric acid swollen cellulose
oNPC o-nitrophenyl-β-D-cellobioside
pNPC p-nitrophenyl-β-D-cellobioside
pNPL p-nitrophenol-β-D-lactoside
RMSD root-mean-square deviation
SAXS small-angle X-ray scattering
SEM scanning electron microscopy
TEM transmission electron microscopy
XRD X-ray diffraction

REFERENCES

(1) Perlack, R. D.; Wright, L. L.; Turhollow, A. F.; Graham, R. L.; Stokes, B. J.; Erbach, D. C. Biomass as Feedstock for A Bioenergy and Bioproducts Industry: The Technical Feasibility of a Billion-Ton Annual Supply; DOE/GO-102005-2135; U.S. Department of Energy: Oak Ridge, TN, 2005.
(2) Conti, J. J.; Holtberg, P. D.; Diefenderfer, J. R.; Napolitano, S. A.; Schaal, M.; Turnure, J. T.; Westfall, L. D. Annual Energy Outlook 2014; DOE/EIA-0383(2014); U.S. Energy Information Administration: Washington, DC, 2014.
(3) Brandt, A. R.; Millard-Ball, A.; Ganser, M.; Gorelick, S. M. Environ. Sci. Technol. 2013, 47, 8031.
(4) Pacala, S.; Socolow, R. Science 2004, 305, 968.
(5) Ragauskas, A. J.; Williams, C. K.; Davison, B. H.; Britovsek, G.; Cairney, J.; Eckert, C. A.; Frederick, W. J.; Hallett, J. P.; Leak, D. J.; Liotta, C. L.; Mielenz, J. R.; Murphy, R.; Templer, R.; Tschaplinski, T. Science 2006, 311, 484.
(6) Himmel, M. E.; Ding, S. Y.; Johnson, D. K.; Adney, W. S.; Nimlos, M. R.; Brady, J. W.; Foust, T. D. Science 2007, 315, 804.
(7) Somerville, C.; Youngs, H.; Taylor, C.; Davis, S. C.; Long, S. P. Science 2010, 329, 790.
(8) Somerville, C.; Bauer, S.; Brininstool, G.; Facette, M.; Hamann, T.; Milne, J.; Osborne, E.; Paredez, A.; Persson, S.; Raab, T.; Vorwerk, S.; Youngs, H. Science 2004, 306, 2206.
(9) Scheller, H. V.; Ulvskov, P. Annu. Rev. Plant Biol. 2010, 61, 263.
(10) Mohnen, D. Curr. Opin. Plant Biol. 2008, 11, 266.
(11) Atmodjo, M. A.; Hao, Z. Y.; Mohnen, D. Annu. Rev. Plant Biol. 2013, 64, 747.
(12) Boerjan, W.; Ralph, J.; Baucher, M. Annu. Rev. Plant Biol. 2003, 54, 519.
(13) Carpita, N. C. Annu. Rev. Plant Physiol. Plant Mol. Biol. 1996, 47, 445.
(14) Somerville, C. Annu. Rev. Cell Dev. Biol. 2006, 22, 53.
(15) Morgan, J. L. W.; Strumillo, J.; Zimmer, J. Nature 2013, 493, 181.
(16) Pauly, M.; Keegstra, K. Plant J. 2008, 54, 559.
(17) Zakzeski, J.; Bruijnincx, P. C. A.; Jongerius, A. L.; Weckhuysen, B. M. Chem. Rev. 2010, 110, 3552.


(18) Ragauskas, A. J.; Beckham, G. T.; Biddy, M. J.; Chandra, R.;
Chen, F.; Davis, M. F.; Davison, B. H.; Dixon, R. A.; Gilna, P.; Keller,
M.; Langan, P.; Naskar, A. K.; Saddler, J. N.; Tschaplinski, T. J.; Tuskan,
G. A.; Wyman, C. E. Science 2014, 344, 1246843.
(19) Linger, J. G.; Vardon, D. R.; Guarnieri, M. T.; Karp, E. M.;
Hunsinger, G. B.; Franden, M. A.; Johnson, C. W.; Chupka, G.;
Strathmann, T. J.; Pienkos, P. T.; Beckham, G. T. Proc. Natl. Acad. Sci.
U.S.A. 2014, 111, 12013.
(20) Chundawat, S. P. S.; Beckham, G. T.; Himmel, M. E.; Dale, B. E.
Annu. Rev. Chem. Biomol. Eng. 2011, 2, 121.
(21) Aden, A.; Foust, T. Cellulose 2009, 16, 535.
(22) Humbird, D.; Davis, R.; Tao, L.; Kinchin, C.; Hsu, D.; Aden, A.;
Schoen, P.; Lukas, J.; Olthof, B.; Worley, M.; Sexton, D.; Dudgeon, D.
Process Design and Economics for Biochemical Conversion of Lignocellulosic
Biomass to Ethanol; NREL/TP-5100-47764; National Renewable
Energy Laboratory: Golden, CO, 2011.
(23) Davis, R.; Tao, L.; Tan, E.; Biddy, M. J.; Beckham, G. T.; Scarlata,
C.; Jacobson, J.; Cafferty, K.; Ross, J.; Lukas, J.; Knorr, D.; Schoen, P.
Process Design and Economics for the Conversion of Lignocellulosic Biomass
to Hydrocarbons: Dilute-Acid Prehydrolysis and Enzymatic Hydrolysis
Deconstruction of Biomass to Sugars and Biological Conversion of Sugars to
Hydrocarbons; NREL/TP-5100-60223; National Renewable Energy
Laboratory: Golden, CO, 2013.
(24) Mosier, N.; Wyman, C.; Dale, B.; Elander, R.; Lee, Y. Y.;
Holtzapple, M.; Ladisch, M. Bioresour. Technol. 2005, 96, 673.
(25) Wyman, C. E.; Dale, B. E.; Elander, R. T.; Holtzapple, M.;
Ladisch, M. R.; Lee, Y. Y. Bioresour. Technol. 2005, 96, 1959.
(26) Wyman, C. E.; Dale, B. E.; Elander, R. T.; Holtzapple, M.;
Ladisch, M. R.; Lee, Y. Y. Bioresour. Technol. 2005, 96, 2026.
(27) Wyman, C. E.; Dale, B. E.; Elander, R. T.; Holtzapple, M.;
Ladisch, M. R.; Lee, Y. Y.; Mitchinson, C.; Saddler, J. N. Biotechnol.
Prog. 2009, 25, 333.
(28) Garlock, R. J.; Balan, V.; Dale, B. E.; Pallapolu, V. R.; Lee, Y. Y.;
Kim, Y.; Mosier, N. S.; Ladisch, M. R.; Holtzapple, M. T.; Falls, M.;
Sierra-Ramirez, R.; Shi, J.; Ebrik, M. A.; Redmond, T.; Yang, B.;
Wyman, C. E.; Donohoe, B. S.; Vinzant, T. B.; Elander, R. T.; Hames,
B.; Thomas, S.; Warner, R. E. Bioresour. Technol. 2011, 102, 11063.
(29) Tao, L.; Aden, A.; Elander, R. T.; Pallapolu, V. R.; Lee, Y. Y.;
Garlock, R. J.; Balan, V.; Dale, B. E.; Kim, Y.; Mosier, N. S.; Ladisch, M.
R.; Falls, M.; Holtzapple, M. T.; Sierra, R.; Shi, J.; Ebrik, M. A.;
Redmond, T.; Yang, B.; Wyman, C. E.; Hames, B.; Thomas, S.; Warner,
R. E. Bioresour. Technol. 2011, 102, 11105.
(30) Wyman, C. E.; Balan, V.; Dale, B. E.; Elander, R. T.; Falls, M.;
Hames, B.; Holtzapple, M. T.; Ladisch, M. R.; Lee, Y. Y.; Mosier, N.;
Pallapolu, V. R.; Shi, J.; Thomas, S. R.; Warner, R. E. Bioresour. Technol.
2011, 102, 11052.
(31) Klein-Marcuschamer, D.; Oleskowicz-Popiel, P.; Simmons, B. A.;
Blanch, H. W. Biotechnol. Bioeng. 2012, 109, 1083.
(32) Lynd, L. R.; Weimer, P. J.; van Zyl, W. H.; Pretorius, I. S.
Microbiol. Mol. Biol. Rev. 2002, 66, 506.
(33) Zhang, Y. H. P.; Lynd, L. R. Biotechnol. Bioeng. 2004, 88, 797.
(34) Zhang, Y. H. P.; Himmel, M. E.; Mielenz, J. R. Biotechnol. Adv.
2006, 24, 452.
(35) Stephanopoulos, G. Science 2007, 315, 801.
(36) Lynd, L. R.; Laser, M. S.; Brandsby, D.; Dale, B. E.; Davison, B.;
Hamilton, R.; Himmel, M.; Keller, M.; McMillan, J. D.; Sheehan, J.;
Wyman, C. E. Nat. Biotechnol. 2008, 26, 169.
(37) Atsumi, S.; Liao, J. C. Curr. Opin. Biotechnol. 2008, 19, 414.
(38) Alonso, D. M.; Bond, J. Q.; Dumesic, J. A. Green Chem. 2010, 12,
1493.
(39) Peralta-Yahya, P. P.; Zhang, F. Z.; del Cardayre, S. B.; Keasling, J.
D. Nature 2012, 488, 320.
(40) Jang, Y. S.; Kim, B.; Shin, J. H.; Choi, Y. J.; Choi, S.; Song, C. W.;
Lee, J.; Park, H. G.; Lee, S. Y. Biotechnol. Bioeng. 2012, 109, 2437.
(41) Martinez, D.; Berka, R. M.; Henrissat, B.; Saloheimo, M.; Arvas,
M.; Baker, S. E.; Chapman, J.; Chertkov, O.; Coutinho, P. M.; Cullen,
D.; Danchin, E. G. J.; Grigoriev, I. V.; Harris, P.; Jackson, M.; Kubicek,
C. P.; Han, C. S.; Ho, I.; Larrondo, L. F.; de Leon, A. L.; Magnuson, J.
K.; Merino, S.; Misra, M.; Nelson, B.; Putnam, N.; Robbertse, B.;
Salamov, A. A.; Schmoll, M.; Terry, A.; Thayer, N.; Westerholm-Parvinen, A.;
Schoch, C. L.; Yao, J.; Barbote, R.; Nelson, M. A.; Detter,
C.; Bruce, D.; Kuske, C. R.; Xie, G.; Richardson, P.; Rokhsar, D. S.;
Lucas, S. M.; Rubin, E. M.; Dunn-Coleman, N.; Ward, M.; Brettin, T. S.
Nat. Biotechnol. 2008, 26, 553.
(42) Medie, F. M.; Davies, G. J.; Drancourt, M.; Henrissat, B. Nat. Rev.
Microbiol. 2012, 10, 227.
(43) Tien, M.; Kirk, T. K. Proc. Natl. Acad. Sci. U.S.A. 1984, 81, 2280.
(44) Kirk, T. K.; Farrell, R. L. Annu. Rev. Microbiol. 1987, 41, 465.
(45) Bugg, T. D. H.; Winfield, C. J. Nat. Prod. Rep. 1998, 15, 513.
(46) Bugg, T. D. H.; Ahmad, M.; Hardiman, E. M.; Rahmanpour, R.
Nat. Prod. Rep. 2011, 28, 1883.
(47) Bugg, T. D. H.; Ahmad, M.; Hardiman, E. M.; Singh, R. Curr.
Opin. Biotechnol. 2011, 22, 394.
(48) Vaaje-Kolstad, G.; Westereng, B.; Horn, S. J.; Liu, Z. L.; Zhai, H.;
Sørlie, M.; Eijsink, V. G. H. Science 2010, 330, 219.
(49) Langston, J. A.; Shaghasi, T.; Abbate, E.; Xu, F.; Vlasenko, E.;
Sweeney, M. D. Appl. Environ. Microbiol. 2011, 77, 7007.
(50) Phillips, C. M.; Beeson, W. T.; Cate, J. H.; Marletta, M. A. ACS
Chem. Biol. 2011, 6, 1399.
(51) Quinlan, R. J.; Sweeney, M. D.; Lo Leggio, L.; Otten, H.;
Poulsen, J.-C. N.; Johansen, K. S.; Krogh, K. B. R. M.; Jørgensen, C. I.;
Tovborg, M.; Anthonsen, A.; Tryfona, T.; Walter, C. P.; Dupree, P.; Xu,
F.; Davies, G. J.; Walton, P. H. Proc. Natl. Acad. Sci. U.S.A. 2011, 108,
15079.
(52) Westereng, B.; Agger, J. W.; Horn, S. J.; Vaaje-Kolstad, G.;
Aachmann, F. L.; Stenstrøm, Y. H.; Eijsink, V. G. H. J. Chromatogr. A
2013, 1271, 144.
(53) Bayer, E. A.; Chanzy, H.; Lamed, R.; Shoham, Y. Curr. Opin.
Struct. Biol. 1998, 8, 548.
(54) Bayer, E. A.; Belaich, J. P.; Shoham, Y.; Lamed, R. Annu. Rev.
Microbiol. 2004, 58, 521.
(55) Doi, R. H.; Kosugi, A. Nat. Rev. Microbiol. 2004, 2, 541.
(56) Fontes, C. M. G. A.; Gilbert, H. J. Annu. Rev. Biochem. 2010, 79,
655.
(57) Brunecky, R.; Alahuhta, M.; Xu, Q.; Donohoe, B. S.; Crowley, M.
F.; Kataeva, I. A.; Yang, S.-J.; Resch, M. G.; Adams, M. W. W.; Lunin, V.
V.; Himmel, M. E.; Bomble, Y. J. Science 2013, 342, 1513.
(58) Naas, A. E.; Mackenzie, A. K.; Mravec, J.; Schückel, J.; Willats, W.
G. T.; Eijsink, V. G. H.; Pope, P. B. mBio 2014, 5, e01401.
(59) Resch, M. G.; Donohoe, B. S.; Baker, J. O.; Decker, S. R.; Bayer,
E. A.; Beckham, G. T.; Himmel, M. E. Energy Environ. Sci. 2013, 6,
1858.
(60) Eastwood, D. C.; Floudas, D.; Binder, M.; Majcherczyk, A.;
Schneider, P.; Aerts, A.; Asiegbu, F. O.; Baker, S. E.; Barry, K.;
Bendiksby, M.; Blumentritt, M.; Coutinho, P. M.; Cullen, D.; de Vries,
R. P.; Gathman, A.; Goodell, B.; Henrissat, B.; Ihrmark, K.; Kauserud,
H.; Kohler, A.; LaButti, K.; Lapidus, A.; Lavin, J. L.; Lee, Y. H.;
Lindquist, E.; Lilly, W.; Lucas, S.; Morin, E.; Murat, C.; Oguiza, J. A.;
Park, J.; Pisabarro, A. G.; Riley, R.; Rosling, A.; Salamov, A.; Schmidt,
O.; Schmutz, J.; Skrede, I.; Stenlid, J.; Wiebenga, A.; Xie, X. F.; Kues,
U.; Hibbett, D. S.; Hoffmeister, D.; Hogberg, N.; Martin, F.; Grigoriev,
I. V.; Watkinson, S. C. Science 2011, 333, 762.
(61) Goodell, B.; Jellison, J.; Liu, J.; Daniel, G.; Paszczynski, A.;
Fekete, F.; Krishnamurthy, S.; Jun, L.; Xu, G. J. Biotechnol. 1997, 53,
133.
(62) Reese, E. T. Biotechnol. Bioeng. Symp. 1976, 6, 9.
(63) Sørensen, A.; Lübeck, M.; Lübeck, P.; Ahring, B. Biomolecules
2013, 3, 612.
(64) Brown, R. M., Jr.; Saxena, I. M. Plant Physiol. Biochem. 2000, 38,
57.
(65) Doblin, M. S.; Kurek, I.; Jacob-Wilk, D.; Delmer, D. P. Plant Cell
Physiol. 2002, 43, 1407.
(66) Saxena, I. M.; Brown, R. M., Jr. Ann. Bot. 2005, 96, 9.
(67) Hu, S. Q.; Gao, Y. G.; Tajima, K.; Sunagawa, N.; Zhou, Y.;
Kawano, S.; Fujiwara, T.; Yoda, T.; Shimura, D.; Satoh, Y.; Munekata,
M.; Tanaka, I.; Yao, M. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 17957.
(68) Mazur, O.; Zimmer, J. J. Biol. Chem. 2011, 286, 17601.


(69) Sethaphong, L.; Haigler, C. H.; Kubicki, J. D.; Zimmer, J.;
Bonetta, D.; DeBolt, S.; Yingling, Y. G. Proc. Natl. Acad. Sci. U.S.A.
2013, 110, 7512.
(70) Omadjela, O.; Narahari, A.; Strumillo, J.; Melida, H.; Mazur, O.;
Bulone, V.; Zimmer, J. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 17856.
(71) Wolfenden, R.; Lu, X.; Young, G. J. Am. Chem. Soc. 1998, 120,
6814.
(72) Wolfenden, R.; Snider, M. J. Acc. Chem. Res. 2001, 34, 938.
(73) Jahren, A. H. Annu. Rev. Earth Planet. Sci. 2007, 35, 509.
(74) Richter, S. L.; Johnson, A. H.; Dranoff, M. M.; LePage, B. A.;
Williams, C. J. Geochim. Cosmochim. Acta 2008, 72, 2744.
(75) Jahren, A. H.; Sternberg, L. S. Geology 2008, 36, 99.
(76) Stankiewicz, B. A.; Briggs, D. E.; Evershed, R. P.; Flannery, M. B.;
Wuttke, M. Science 1997, 276, 1541.
(77) Ballantyne, A. P.; Rybczynski, N.; Baker, P. A.; Harington, C. R.;
White, D. Palaeogeogr., Palaeoclimatol., Palaeoecol. 2006, 242, 188.
(78) Wolfenden, R. Chem. Rev. 2006, 106, 3379.
(79) Wolfenden, R.; Yuan, Y. J. Am. Chem. Soc. 2008, 130, 7548.
(80) Dumas, J.-B. C. R. Hebd. Seances Acad. Sci. 1839, 8, 51.
(81) Nishikawa, S.; Ono, S. Proc. Tokyo Math.-Phys. Soc. 1913, 7, 131.
(82) Meyer, K. H.; Mark, H. Ber. Dtsch. Chem. Ges. 1928, 61, 593.
(83) Meyer, K. H.; Misch, L. Helv. Chim. Acta 1937, 20, 232.
(84) Honjo, G.; Watanabe, M. Nature 1958, 181, 326.
(85) Gardner, K. H.; Blackwell, J. Biopolymers 1974, 13, 1975.
(86) Atalla, R. H.; VanderHart, D. L. Science 1984, 223, 283.
(87) VanderHart, D. L.; Atalla, R. H. Macromolecules 1984, 17, 1465.
(88) Nishiyama, Y.; Langan, P.; Chanzy, H. J. Am. Chem. Soc. 2002,
124, 9074.
(89) Nishiyama, Y.; Sugiyama, J.; Chanzy, H.; Langan, P. J. Am. Chem.
Soc. 2003, 125, 14300.
(90) Langan, P.; Nishiyama, Y.; Chanzy, H. Biomacromolecules 2001,
2, 410.
(91) Swatloski, R. P.; Spear, S. K.; Holbrey, J. D.; Rogers, R. D. J. Am.
Chem. Soc. 2002, 124, 4974.
(92) Pinkert, A.; Marsh, K. N.; Pang, S.; Staiger, M. P. Chem. Rev.
2009, 109, 6712.
(93) Langan, P.; Nishiyama, Y.; Chanzy, H. J. Am. Chem. Soc. 1999,
121, 9940.
(94) Wada, M.; Chanzy, H.; Nishiyama, Y.; Langan, P. Macromolecules
2004, 37, 8548.
(95) Chundawat, S. P. S.; Bellesia, G.; Uppugundla, N.; Sousa, L. D.;
Gao, D. H.; Cheh, A. M.; Agarwal, U. P.; Bianchetti, C. M.; Phillips, G.
N.; Langan, P.; Balan, V.; Gnanakaran, S.; Dale, B. E. J. Am. Chem. Soc.
2011, 133, 11163.
(96) Fan, L. T.; Lee, Y.-H.; Beardmore, D. H. Biotechnol. Bioeng. 1980,
22, 177.
(97) Atalla, R. H.; Ellis, J. D.; Schroeder, L. R. J. Wood Chem. Technol.
1984, 4, 465.
(98) Nishiyama, Y. J. Wood Sci. 2009, 55, 241.
(99) Dale, B. E.; Tsao, G. T. J. Appl. Polym. Sci. 1982, 27, 1233.
(100) Kennedy, C. J.; Cameron, G. J.; Sturcova, A.; Apperley, D. C.;
Altaner, C.; Wess, T. J.; Jarvis, M. C. Cellulose 2007, 14, 235.
(101) Ha, M. A.; Apperley, D. C.; Evans, B. W.; Huxham, M.; Jardine,
W. G.; Vietor, R. J.; Reis, D.; Vian, B.; Jarvis, M. C. Plant J. 1998, 16,
183.
(102) Fernandes, A. N.; Thomas, L. H.; Altaner, C. M.; Callow, P.;
Forsyth, V. T.; Apperley, D. C.; Kennedy, C. J.; Jarvis, M. C. Proc. Natl.
Acad. Sci. U.S.A. 2011, 108, E1195.
(103) Thomas, L. H.; Forsyth, V. T.; Sturcova, A.; Kennedy, C. J.;
May, R. P.; Altaner, C. M.; Apperley, D. C.; Wess, T. J.; Jarvis, M. C.
Plant Physiol. 2013, 161, 465.
(104) Kimura, S.; Laosinchai, W.; Itoh, T.; Cui, X. J.; Linder, C. R.;
Brown, R. M., Jr. Plant Cell 1999, 11, 2075.
(105) Mueller, S. C.; Brown, R. M., Jr. J. Cell Biol. 1980, 84, 315.
(106) Endler, A.; Persson, S. Mol. Plant 2011, 4, 199.
(107) Herth, W. Planta 1983, 159, 347.
(108) Ding, S. Y.; Himmel, M. E. J. Agric. Food. Chem. 2006, 54, 597.
(109) Newman, R. H.; Hill, S. J.; Harris, P. J. Plant Physiol. 2013, 163,
1558.
(110) Bellesia, G.; Asztalos, A.; Shen, T. Y.; Langan, P.; Redondo, A.;
Gnanakaran, S. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2010, 66,
1184.
(111) Beckham, G. T.; Bomble, Y. J.; Bayer, E. A.; Himmel, M. E.;
Crowley, M. F. Curr. Opin. Biotechnol. 2011, 22, 231.
(112) O’Sullivan, A. C. Cellulose 1997, 4, 173.
(113) French, A. D. In Advances in Carbohydrate Chemistry and
Biochemistry; Elsevier Academic Press Inc.: San Diego, 2012; Vol 67, pp
19−93.
(114) Matthews, J. F.; Skopec, C. E.; Mason, P. E.; Zuccato, P.;
Torget, R. W.; Sugiyama, J.; Himmel, M. E.; Brady, J. W. Carbohydr.
Res. 2006, 341, 138.
(115) Paavilainen, S.; Rog, T.; Vattulainen, I. J. Phys. Chem. B 2011,
115, 3747.
(116) Zhao, Z.; Shklyaev, O. E.; Nili, A.; Mohamed, M. N. A.; Kubicki,
J. D.; Crespi, V. H.; Zhong, L. H. J. Phys. Chem. A 2013, 117, 2580.
(117) Matthews, J. F.; Beckham, G. T.; Bergenstrahle-Wohlert, M.;
Brady, J. W.; Himmel, M. E.; Crowley, M. F. J. Chem. Theory Comput.
2012, 8, 735.
(118) Matthews, J. F.; Bergenstrahle, M.; Beckham, G. T.; Himmel,
M. E.; Nimlos, M. R.; Brady, J. W.; Crowley, M. F. J. Phys. Chem. B
2011, 115, 2155.
(119) Hadden, J. A.; French, A. D.; Woods, R. J. Biopolymers 2013, 99,
746.
(120) Srinivas, G.; Cheng, X. L.; Smith, J. C. J. Chem. Theory Comput.
2011, 7, 2539.
(121) Bu, L. T.; Beckham, G. T.; Crowley, M. F.; Chang, C. H.;
Matthews, J. F.; Bomble, Y. J.; Adney, W. S.; Himmel, M. E.; Nimlos,
M. R. J. Phys. Chem. B 2009, 113, 10994.
(122) Hynninen, A. P.; Matthews, J. F.; Beckham, G. T.; Crowley, M.
F.; Nimlos, M. R. J. Chem. Theory Comput. 2011, 7, 2137.
(123) Wohlert, J.; Berglund, L. A. J. Chem. Theory Comput. 2011, 7,
753.
(124) Bellesia, G.; Chundawat, S. P. S.; Langan, P.; Redondo, A.; Dale,
B. E.; Gnanakaran, S. J. Phys. Chem. B 2012, 116, 8031.
(125) Chang, R.; Gross, A. S.; Chu, J. W. J. Phys. Chem. B 2012, 116,
8074.
(126) Markutsya, S.; Devarajan, A.; Baluyut, J. Y.; Windus, T. L.;
Gordon, M. S.; Lamm, M. H. J. Chem. Phys. 2013, 138, 214108.
(127) Zhao, H.; Kwak, J. H.; Wang, Y.; Franz, J. A.; White, J. M.;
Holladay, J. E. Energy Fuels 2006, 20, 807.
(128) Ciesielski, P. N.; Matthews, J. F.; Tucker, M. P.; Beckham, G.
T.; Crowley, M. F.; Himmel, M. E.; Donohoe, B. S. ACS Nano 2013, 7,
8011.
(129) Habibi, Y.; Lucia, L. A.; Rojas, O. J. Chem. Rev. 2010, 110, 3479.
(130) Beckham, G. T.; Matthews, J. F.; Peters, B.; Bomble, Y. J.;
Himmel, M. E.; Crowley, M. F. J. Phys. Chem. B 2011, 115, 4118.
(131) Payne, C. M.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J.
Phys. Chem. Lett. 2011, 2, 1546.
(132) Cho, H. M.; Gross, A. S.; Chu, J. W. J. Am. Chem. Soc. 2011,
133, 14033.
(133) Kwan, C. C.; Ghadiri, M.; Papadopoulos, D. G.; Bentham, A. C.
Chem. Eng. Technol. 2003, 26, 185.
(134) Hwang, J. W.; Yang, Y. K.; Hwang, J. K.; Pyun, Y. R.; Kim, Y. S.
J. Biosci. Bioeng. 1999, 88, 183.
(135) Iguchi, M.; Yamanaka, S.; Budhiono, A. J. Mater. Sci. 2000, 35,
261.
(136) Bielecki, S.; Krystynowicz, A.; Turkiewicz, M.; Kalinowska, H.
In Polysaccharides and Polyamides in the Food Industry; Steinbüchel, A.,
Rhee, S. K., Eds.; Wiley-Blackwell: Weinheim, Germany, 2005; pp 31−85.
(137) Wanichapichart, P.; Kaewnopparat, S.; Buaking, K.; Puthai, W. J.
Sci. Technol. 2002, 24, 855.
(138) Horikawa, Y.; Sugiyama, J. Cellulose 2008, 15, 419.
(139) Sugiyama, J.; Harada, H.; Fujiyoshi, Y.; Uyeda, N. Mokuzai
Gakkaishi 1984, 30, 98.
(140) Ek, R.; Gustafsson, C.; Nutt, A.; Iversen, T.; Nyström, C. J. Mol.
Recognit. 1998, 11, 263.
(141) Imai, T.; Sugiyama, J. Macromolecules 1998, 31, 6275.


(142) Koyama, M.; Sugiyama, J.; Itoh, T. Cellulose 1997, 4, 147.
(143) Mihranyan, A.; Llagostera, A. P.; Karmhag, R.; Strømme, M.;
Ek, R. Int. J. Pharm. 2004, 269, 433.
(144) Walseth, C. S. Tappi 1952, 35, 228.
(145) Whitmore, R. E.; Atalla, R. H. Int. J. Biol. Macromol. 1985, 7,
182.
(146) Schenzel, K.; Fischer, S.; Brendler, E. Cellulose 2005, 12, 223.
(147) Kataoka, Y.; Kondo, T. Macromolecules 1998, 31, 760.
(148) Åkerholm, M.; Hinterstoisser, B.; Salmén, L. Carbohydr. Res.
2004, 339, 569.
(149) Thygesen, A.; Oddershede, J.; Lilholt, H.; Thomsen, A. B.;
Ståhl, K. Cellulose 2005, 12, 563.
(150) Park, S.; Baker, J. O.; Himmel, M. E.; Parilla, P. A.; Johnson, D.
K. Biotechnol. Biofuels 2010, 3, 10.
(151) Cantarel, B. L.; Coutinho, P. M.; Rancurel, C.; Bernard, T.;
Lombard, V.; Henrissat, B. Nucleic Acids Res. 2009, 37, D233.
(152) Levasseur, A.; Drula, E.; Lombard, V.; Coutinho, P.; Henrissat,
B. Biotechnol. Biofuels 2013, 6, 41.
(153) Lombard, V.; Golaconda Ramulu, H.; Drula, E.; Coutinho, P.
M.; Henrissat, B. Nucleic Acids Res. 2014, 42, D490.
(154) Lairson, L. L.; Henrissat, B.; Davies, G. J.; Withers, S. G. Annu.
Rev. Biochem. 2008, 77, 521.
(155) Lombard, V.; Bernard, T.; Rancurel, C.; Brumer, H.; Coutinho,
P. M.; Henrissat, B. Biochem. J. 2010, 432, 437.
(156) Boraston, A. B.; Bolam, D. N.; Gilbert, H. J.; Davies, G. J.
Biochem. J. 2004, 382, 769.
(157) Koshland, D. E., Jr. Biol. Rev. 1953, 28, 413.
(158) Jongkees, S. A. K.; Withers, S. G. Acc. Chem. Res. 2014, 47, 226.
(159) Davies, G. J.; Planas, A.; Rovira, C. Acc. Chem. Res. 2012, 45,
308.
(160) van der Kamp, M. W.; Mulholland, A. J. Biochemistry 2013, 52,
2708.
(161) Vocadlo, D. J.; Davies, G. J. Curr. Opin. Chem. Biol. 2008, 12,
539.
(162) Jencks, W. P. Annu. Rev. Biochem. 1962, 32, 639.
(163) Phillips, D. C. Proc. Natl. Acad. Sci. U.S.A. 1967, 57, 484.
(164) Vernon, C. A. Proc. R. Soc. London, Ser. B 1967, 167, 389.
(165) Chipman, D. M.; Sharon, N. Science 1969, 165, 454.
(166) Deslongchamps, P. Stereoelectronic Effects in Organic Chemistry;
Pergamon Press: New York, 1983.
(167) Post, C. B.; Karplus, M. J. Am. Chem. Soc. 1986, 108, 1317.
(168) Sinnott, M. L. Adv. Phys. Org. Chem. 1988, 24, 113.
(169) Sinnott, M. L. Chem. Rev. 1990, 90, 1171.
(170) Davies, G.; Sinnott, M. L.; Withers, S. G. In Comprehensive
Biological Catalysis: A Mechanistic Reference; Sinnott, M., Ed.; Academic
Press: San Diego, 1998; Vol. I, pp 119−208.
(171) Vocadlo, D. J.; Davies, G. J.; Laine, R.; Withers, S. G. Nature
2001, 412, 835.
(172) Divne, C.; Ståhlberg, J.; Reinikainen, T.; Ruohonen, L.;
Pettersson, G.; Knowles, J. K. C.; Teeri, T. T.; Jones, T. A. Science 1994,
265, 524.
(173) Divne, C.; Ståhlberg, J.; Teeri, T. T.; Jones, T. A. J. Mol. Biol.
1998, 275, 309.
(174) Sulzenbacher, G.; Driguez, H.; Henrissat, B.; Schülein, M.;
Davies, G. J. Biochemistry 1996, 35, 15280.
(175) Davies, G. J.; Wilson, K. S.; Henrissat, B. Biochem. J. 1997, 321,
557.
(176) Biely, P.; Krátký, Z.; Vršanská, M. Eur. J. Biochem. 1981, 119,
559.
(177) Davies, G.; Henrissat, B. Structure 1995, 3, 853.
(178) Vasella, A.; Davies, G. J.; Bohm, M. Curr. Opin. Chem. Biol.
2002, 6, 619.
(179) Schwarz, J. C. P. J. Chem. Soc., Chem. Commun. 1973, 14, 505.
(180) Joint Commission on Biochemical Nomenclature. Eur. J.
Biochem. 1980, 111, 295.
(181) Cremer, D.; Pople, J. A. J. Am. Chem. Soc. 1975, 97, 1354.
(182) Barnett, C. B.; Naidoo, K. J. Mol. Phys. 2009, 107, 1243.
(183) Barnett, C. B.; Naidoo, K. J. J. Phys. Chem. B 2010, 114, 17142.
(184) Mayes, H. B.; Broadbelt, L. J.; Beckham, G. T. J. Am. Chem. Soc.
2014, 136, 1008.
(185) Hill, A. D.; Reilly, P. J. J. Chem. Inf. Model. 2007, 47, 1031.
(186) Blake, C. C. F.; Mair, G. A.; North, A. C. T.; Phillips, D. C.;
Sarma, V. R. Proc. R. Soc. London, Ser. B 1967, 167, 365.
(187) Blake, C. C. F.; Koenig, D. F.; Mair, G. A.; North, A. C. T.;
Phillips, D. C.; Sarma, V. R. Nature 1965, 206, 757.
(188) Schindler, M.; Assaf, Y.; Sharon, N.; Chipman, D. M.
Biochemistry 1977, 16, 423.
(189) Pincus, M. R.; Scheraga, H. A. Biochemistry 1981, 20, 3960.
(190) Ford, L. O.; Johnson, L. N.; Machin, P. A.; Phillips, D. C.; Tjian,
R. J. Mol. Biol. 1974, 88, 349.
(191) Strynadka, N. C. J.; James, M. N. G. J. Mol. Biol. 1991, 220, 401.
(192) Rouvinen, J.; Bergfors, T.; Teeri, T.; Knowles, J. K.; Jones, T. A.
Science 1990, 249, 380.
(193) Barr, B. K.; Hsieh, Y.-L.; Ganem, B.; Wilson, D. B. Biochemistry
1996, 35, 586.
(194) Zou, J.-y.; Kleywegt, G. J.; Ståhlberg, J.; Driguez, H.; Nerinckx,
W.; Claeyssens, M.; Koivula, A.; Teeri, T. T.; Jones, T. A. Structure
1999, 7, 1035.
(195) Driguez, H. Top. Curr. Chem. 1997, 187, 85.
(196) Rye, C. S.; Withers, S. G. Curr. Opin. Chem. Biol. 2000, 4, 573.
(197) Diot, J.; García-Moreno, M. I.; Gouin, S. G.; Ortiz Mellet, C.;
Haupt, K.; Kovensky, J. Org. Biomol. Chem. 2009, 7, 357.
(198) Withers, S. G.; Street, I. P.; Bird, P.; Dolphin, D. H. J. Am. Chem.
Soc. 1987, 109, 7530.
(199) Withers, S. G.; Rupitz, K.; Street, I. P. J. Biol. Chem. 1988, 263,
7929.
(200) White, A.; Tull, D.; Johns, K.; Withers, S. G.; Rose, D. R. Nat.
Struct. Biol. 1996, 3, 149.
(201) Withers, S. G.; Warren, R. A. J.; Street, I. P.; Rupitz, K.;
Kempton, J. B.; Aebersold, R. J. Am. Chem. Soc. 1990, 112, 5887.
(202) Withers, S. G.; Street, I. P. J. Am. Chem. Soc. 1988, 110, 8551.
(203) Williams, S. J.; Withers, S. G. Carbohydr. Res. 2000, 327, 27.
(204) Czjzek, M.; Cicek, M.; Zamboni, V.; Bevan, D. R.; Henrissat, B.;
Esen, A. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 13555.
(205) Verdoucq, L.; Morinière, J.; Bevan, D. R.; Esen, A.; Vasella, A.;
Henrissat, B.; Czjzek, M. J. Biol. Chem. 2004, 279, 31796.
(206) Davies, G. J.; Mackenzie, L.; Varrot, A.; Dauter, M.; Brzozowski,
A. M.; Schülein, M.; Withers, S. G. Biochemistry 1998, 37, 11707.
(207) Varrot, A.; Davies, G. J. Acta Crystallogr., Sect. D: Biol.
Crystallogr. 2003, 59, 447.
(208) Sandgren, M.; Berglund, G. I.; Shaw, A.; Ståhlberg, J.; Kenne, L.;
Desmet, T.; Mitchinson, C. J. Mol. Biol. 2004, 342, 1505.
(209) Varrot, A.; Frandsen, T. P.; Driguez, H.; Davies, G. J. Acta
Crystallogr., Sect. D: Biol. Crystallogr. 2002, 58, 2201.
(210) Guérin, D. M. A.; Lascombe, M.-B.; Costabel, M.; Souchon, H.;
Lamzin, V.; Béguin, P.; Alzari, P. M. J. Mol. Biol. 2002, 316, 1061.
(211) Deslongchamps, P. Pure Appl. Chem. 1975, 43, 351.
(212) Bennet, A. J.; Sinnott, M. L. J. Am. Chem. Soc. 1986, 108, 7287.
(213) Deslongchamps, P. Pure Appl. Chem. 1993, 65, 1161.
(214) Nerinckx, W.; Desmet, T.; Claeyssens, M. ARKIVOC 2006, 13,
90.
(215) Warshel, A. Computer Modeling of Chemical Reactions in Enzymes
and Solutions; Wiley-Interscience: New York, 1991.
(216) Warshel, A.; Sharma, P. K.; Kato, M.; Xiang, Y.; Liu, H.; Olsson,
M. H. M. Chem. Rev. 2006, 106, 3210.
(217) Fort, S.; Coutinho, P. M.; Schülein, M.; Nardin, R.; Cottaz, S.;
Driguez, H. Tetrahedron Lett. 2001, 42, 3443.
(218) Walvoort, M. T. C.; van der Marel, G. A.; Overkleeft, H. S.;
Codée, J. D. C. Chem. Sci. 2013, 4, 897.
(219) Pauling, L. Chem. Eng. News 1946, 24, 1375.
(220) Withers, S. G. Pure Appl. Chem. 1995, 67, 1673.
(221) Smith, B. J. J. Am. Chem. Soc. 1997, 119, 2699.
(222) Biarnés, X.; Ardèvol, A.; Planas, A.; Rovira, C.; Laio, A.;
Parrinello, M. J. Am. Chem. Soc. 2007, 129, 10686.
(223) Sega, M.; Autieri, E.; Pederiva, F. J. Chem. Phys. 2009, 130,
225102.
(224) DeMarco, M. L.; Woods, R. J. Glycobiology 2008, 18, 426.


(225) Ryu, D. D. Y.; Mandels, M. Enzyme Microb. Technol. 1980, 2, 91.
(226) Montenecourt, B. S. Trends Biotechnol. 1983, 1, 156.
(227) Eveleigh, D. E. Philos. Trans. R. Soc. London, Ser. A 1987, 321,
435.
(228) Persson, I.; Tjerneld, F.; Hahn-Hägerdal, B. Process Biochem.
1991, 26, 65.
(229) Cherry, J. R.; Fidantsef, A. L. Curr. Opin. Biotechnol. 2003, 14,
438.
(230) Mandels, M.; Reese, E. T. J. Bacteriol. 1957, 73, 269.
(231) Gauss, W. F.; Suzuki, S.; Takagi, M. (Bio Research Center
Company Limited) Manufacture of Alcohol from Cellulosic Materials
Using Plural Ferments. Bio Research Center Company Limited. U.S. Patent
3,990,944, Nov 9, 1976.
(232) Silver, R. S. (Gulf Research & Development Company)
Saccharification Method. U.S. Patent 4,409,329, Oct. 11, 1983.
(233) El Gogary, S.; Leite, A.; Crivellaro, O.; Dorry, H. E.; Eveleigh,
D. E. In TRICEL 89: An International Symposium on Trichoderma
Cellulases; Kubicek, C. P., Esterbauer, H., Eveleigh, D. E., Steiner, W.,
Eds., Royal Chemical Society: London, U.K., 1990; pp 200−211.
(234) Sternberg, D.; Mandels, G. R. J. Bacteriol. 1979, 139, 761.
(235) Mandels, M.; Weber, J.; Parizek, R. Appl. Microbiol. 1971, 21,
152.
(236) Himmel, M. E.; Adney, W. S.; Rivard, C. J.; Baker, J. O. In
Energy from Biomass and Wastes XVI: Proceedings of the Institute of Gas
Technology Conference; Institute of Gas Technology: Orlando, FL, 1992;
pp 529−543.
(237) Ghose, T. K. Pure Appl. Chem. 1987, 59, 257.
(238) Gallo, B. J.; Andreotti, R.; Roche, C.; Ryu, D.; Mandels, M.
Biotechnol. Bioeng. Symp. 1978, 8, 89.
(239) Ryu, D.; Andreotti, R.; Mandels, M.; Gallo, B.; Reese, E.
Biotechnol. Bioeng. 1979, 21, 1887.
(240) Montenecourt, B. S.; Eveleigh, D. E. In Hydrolysis of Cellulose:
Mechanisms of Enzymatic and Acid Catalysis; Brown, R., Jurasek, L.,
Eds.; American Chemical Society: Washington, DC, 1979; pp 289−301.
(241) Montenecourt, B. S.; Eveleigh, D. E. Appl. Environ. Microbiol.
1977, 34, 777.
(242) Ghosh, A.; Alrabiai, S.; Ghosh, B. K.; Trimiño-Vazquez, H.;
Eveleigh, D. E.; Montenecourt, B. S. Enzyme Microb. Technol. 1982, 4,
110.
(243) Singhania, R.; Sukumaran, R.; Pandey, A. Appl. Biochem.
Biotechnol. 2007, 142, 60.
(244) Shoemaker, S. P.; Raymond, J. C.; Bruner, R. In Trends in the
Biology of Fermentations for Fuels and Chemicals; Hollaender, A., Rabson,
R., Rogers, P., Pietro, A., Valentine, R., Wolfe, R., Eds.; Plenum Press:
New York, 1981; pp 89−109.
(245) Portnoy, T.; Margeot, A.; Seidl-Seiboth, V.; Le Crom, S.; Ben
Chaabane, F.; Linke, R.; Seiboth, B.; Kubicek, C. P. Eukaryotic Cell
2011, 10, 262.
(246) Penttilä, M.; Nevalainen, H.; Rättö, M.; Salminen, E.; Knowles,
J. Gene 1987, 61, 155.
(247) Knowles, J.; Lehtovaara, P.; Penttilä, M.; Teeri, T.; Harkki, A.;
Salovuori, I. Antonie van Leeuwenhoek 1987, 53, 335.
(248) Uusitalo, J. M.; Helena Nevalainen, K. M.; Harkki, A. M.;
Knowles, J. K. C.; Penttilä, M. E. J. Biotechnol. 1991, 17, 35.
(249) Harkki, A.; Mäntylä, A.; Penttilä, M.; Muttilainen, S.; Bühler, R.;
Suominen, P.; Knowles, J.; Nevalainen, H. Enzyme Microb. Technol.
1991, 13, 227.
(250) Teeri, T. T.; Penttilä, M.; Keränen, S.; Nevalainen, H.; Knowles,
J. K. C. In Biotechnology of Filamentous Fungi; Finkelstein, D. B., Ball, C.,
Eds.; Newnes: Boston, 1992; pp 417−445.
(251) Srisodsuk, M.; Reinikainen, T.; Penttilä, M.; Teeri, T. T. J. Biol.
Chem. 1993, 268, 20756.
(252) Mach, R. L.; Schindler, M.; Kubicek, C. P. Curr. Genet. 1994, 25,
567.
(253) Nakari-Setälä, T.; Penttilä, M. Appl. Environ. Microbiol. 1995,
61, 3650.
(254) Keränen, S.; Penttilä, M. Curr. Opin. Biotechnol. 1995, 6, 534.
(255) Ilmén, M.; Thrane, C.; Penttilä, M. E. Mol. Gen. Genet. 1996,
251, 451.
(256) Mach, R. L.; Strauss, J.; Zeilinger, S.; Schindler, M.; Kubicek, C.
P. Mol. Microbiol. 1996, 21, 1273.
(257) Takashima, S.; Iikura, H.; Nakamura, A.; Masaki, H.; Uozumi,
T. FEMS Microbiol. Lett. 1996, 145, 361.
(258) Ilmén, M.; Onnela, M. L.; Klemsdal, S.; Keränen, S.; Penttilä, M.
Mol. Gen. Genet. 1996, 253, 303.
(259) Ilmén, M.; Saloheimo, A.; Onnela, M.-L.; Penttilä, M. E. Appl.
Environ. Microbiol. 1997, 63, 1298.
(260) Pakula, T. M.; Uusitalo, J.; Saloheimo, M.; Salonen, K.; Aarts, R.
J.; Penttilä, M. Microbiology 2000, 146, 223.
(261) Saloheimo, A.; Aro, N.; Ilmén, M.; Penttilä, M. J. Biol. Chem.
2000, 275, 5817.
(262) Aro, N.; Saloheimo, A.; Ilmén, M.; Penttilä, M. J. Biol. Chem.
2001, 276, 24309.
(263) Vasara, T.; Salusjärvi, L.; Raudaskoski, M.; Keränen, S.; Penttilä,
M.; Saloheimo, M. Mol. Microbiol. 2001, 42, 1349.
(264) Valkonen, M.; Penttilä, M.; Saloheimo, M. Mol. Genet. Genomics
2004, 272, 443.
(265) Nakari-Setälä, T.; Paloheimo, M.; Kallio, J.; Vehmaanperä, J.;
Penttilä, M.; Saloheimo, M. Appl. Environ. Microbiol. 2009, 75, 4853.
(266) Kubicek, C. P.; Mikus, M.; Schuster, A.; Schmoll, M.; Seiboth,
B. Biotechnol. Biofuels 2009, 2, 1.
(267) Steiger, M. G.; Vitikainen, M.; Uskonen, P.; Brunner, K.; Adam,
G.; Pakula, T.; Penttilä, M.; Saloheimo, M.; Mach, R. L.; Mach-Aigner,
A. R. Appl. Environ. Microbiol. 2011, 77, 114.
(268) Seiboth, B.; Ivanova, C.; Seidl-Seiboth, V. In Biofuel
Production-Recent Developments and Prospects; Bernardes, M. A. d.
S., Ed., InTech: Rijeka, Croatia, 2011; pp 309−340.
(269) Le Crom, S.; Schackwitz, W.; Pennacchio, L.; Magnuson, J. K.;
Culley, D. E.; Collett, J. R.; Martin, J.; Druzhinina, I. S.; Mathis, H.;
Monot, F.; Seiboth, B.; Cherry, B.; Rey, M.; Berka, R.; Kubicek, C. P.;
Baker, S. E.; Margeot, A. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 16151.
(270) Seidl, V.; Gamauf, C.; Druzhinina, I.; Seiboth, B.; Hartl, L.;
Kubicek, C. BMC Genomics 2008, 9, 327.
(271) England, G. R.; Kelley, A.; Mitchinson, C. (Genencor
International, Inc.) Induction of Gene Expression Using a High
Concentration Sugar Mixture. U.S. Publication 2010/0009408 A1, Jan.
14, 2010.
(272) Geysens, S.; Pakula, T.; Uusitalo, J.; Dewerte, I.; Penttilä, M.;
Contreras, R. Appl. Environ. Microbiol. 2005, 71, 2910.
(273) Hui, J. P. M.; Lanthier, P.; White, T. C.; McHugh, S. G.;
Yaguchi, M.; Roy, R.; Thibault, P. J. Chromatogr. B 2001, 752, 349.
(274) Peterson, R.; Nevalainen, H. Microbiology 2012, 158, 58.
(275) Datema, R.; Schwarz, R. T. Eur. J. Biochem. 1978, 90, 505.
(276) Joutsjoki, V. V.; Kuittinen, M.; Torkkeli, T. K.; Suominen, P. L.
FEMS Microbiol. Lett. 1993, 112, 281.
(277) Kiiskinen, L.-L.; Kruus, K.; Bailey, M.; Ylösmäki, E.; Siika-aho,
M.; Saloheimo, M. Microbiology 2004, 150, 3065.
(278) Salles, B. C.; Te’o, V. S. J.; Gibbs, M. D.; Bergquist, P. L.; Filho,
E. X. F.; Ximenes, E. A.; Nevalainen, K. M. H. Biotechnol. Lett. 2007, 29,
1195.
(279) Nyyssönen, E.; Penttilä, M.; Harkki, A.; Saloheimo, A.;
Knowles, J. K. C.; Keränen, S. Nat. Biotechnol. 1993, 11, 591.
(280) Paloheimo, M.; Mäntylä, A.; Kallio, J.; Puranen, T.; Suominen,
P. Appl. Environ. Microbiol. 2007, 73, 3215.
(281) Shoemaker, S.; Schweickart, V.; Ladner, M.; Gelfand, D.; Kwok,
S.; Myambo, K.; Innis, M. Bio/Technology 1983, 1, 691.
(282) Teeri, T.; Salovuori, I.; Knowles, J. Bio/Technology 1983, 1, 696.
(283) Penttilä, M. E.; André, L.; Saloheimo, M.; Lehtovaara, P.;
Knowles, J. K. C. Yeast 1987, 3, 175.
(284) Penttilä, M. E.; André, L.; Lehtovaara, P.; Bailey, M.; Teeri, T.
T.; Knowles, J. K. C. Gene 1988, 63, 103.
(285) Zurbriggen, B.; Bailey, M. J.; Penttilä, M. E.; Poutanen, K.;
Linko, M. J. Biotechnol. 1990, 13, 267.
(286) Bailey, M. J.; Siika-aho, M.; Valkeajarvi, A.; Penttilä, M. E.
Biotechnol. Appl. Biochem. 1993, 17, 65.


(287) Stålbrand, H.; Saloheimo, A.; Vehmaanperä, J.; Henrissat, B.; Penttilä, M. Appl. Environ. Microbiol. 1995, 61, 1090.
(288) Valkonen, M.; Penttilä, M.; Saloheimo, M. Appl. Environ. Microbiol. 2003, 69, 2065.
(289) Ilmén, M.; den Haan, R.; Brevnova, E.; McBride, J.; Wiswall, E.; Froehlich, A.; Koivula, A.; Voutilainen, S. P.; Siika-aho, M.; la Grange, D. C.; Thorngren, N.; Ahlgren, S.; Mellon, M.; Deleault, K.; Rajgarhia, V.; van Zyl, W. H.; Penttilä, M. Biotechnol. Biofuels 2011, 4, 30.
(290) Voutilainen, S. P.; Nurmi-Rantala, S.; Penttilä, M.; Koivula, A. Appl. Microbiol. Biotechnol. 2014, 98, 2991.
(291) Reese, E. T.; Siu, R. G. H.; Levinson, H. S. J. Bacteriol. 1950, 59, 485.
(292) Pedersen, K. O. In Les Protéines: Rapports et Discussions; Stoops, R., Ed.; Institut International de Chimie Solvay: Bruxelles, 1953; pp 19−62.
(293) Mandels, M.; Reese, E. T. In Developments in Industrial Microbiology; Society for Industrial Microbiology: New York, 1964; pp 5−20.
(294) Selby, K.; Maitland, C. C. Biochem. J. 1965, 94, 578.
(295) Li, L. H.; Flora, R. M.; King, K. W. Arch. Biochem. Biophys. 1965, 111, 439.
(296) Reese, E. T.; Mandels, M. In Cellulose and Cellulose Derivatives; Bikales, N. M., Segal, L., Eds.; Wiley Interscience: New York, 1971; pp 1079−1094.
(297) Liu, T. H.; King, K. W. Arch. Biochem. Biophys. 1967, 120, 462.
(298) Eriksson, K.-E. In Cellulases and Their Applications; Hajny, G. J., Reese, E. T., Eds.; American Chemical Society: Washington, DC, 1969; pp 83−104.
(299) Halliwell, G.; Griffin, M.; Vincent, R. Biochem. J. 1972, 127, 43.
(300) Berghem, L. E. R.; Pettersson, L. G. Eur. J. Biochem. 1973, 37, 21.
(301) Van Tilbeurgh, H.; Pettersson, G.; Bhikhabhai, R.; De Boeck, H.; Claeyssens, M. Eur. J. Biochem. 1985, 148, 329.
(302) Chanzy, H.; Henrissat, B. FEBS Lett. 1985, 184, 285.
(303) van Tilbeurgh, H.; Bhikhabhai, R.; Pettersson, L. G.; Claeyssens, M. FEBS Lett. 1984, 169, 215.
(304) Ståhlberg, J.; Johansson, G.; Pettersson, G. Biochim. Biophys. Acta 1993, 1157, 107.
(305) Henrissat, B.; Driguez, H.; Viet, C.; Schülein, M. Bio/Technology 1985, 3, 722.
(306) Wood, T. M. Biochem. Soc. Trans. 1985, 13, 407.
(307) Kurašin, M.; Väljamäe, P. J. Biol. Chem. 2011, 286, 169.
(308) Sampedro, J.; Cosgrove, D. J. Genome Biol. 2005, 6, 242.
(309) Saloheimo, M.; Paloheimo, M.; Hakola, S.; Pere, J.; Swanson, B.; Nyyssönen, E.; Bhatia, A.; Ward, M.; Penttilä, M. Eur. J. Biochem. 2002, 269, 4202.
(310) Baker, J. O.; King, M. R.; Adney, W. S.; Decker, S. R.; Vinzant, T. B.; Lantz, S. E.; Nieves, R. E.; Thomas, S. R.; Li, L.-C.; Cosgrove, D. J.; Himmel, M. E. Appl. Biochem. Biotechnol. 2000, 84−86, 217.
(311) Beeson, W. T.; Iavarone, A. T.; Hausmann, C. D.; Cate, J. H. D.; Marletta, M. A. Appl. Environ. Microbiol. 2011, 77, 650.
(312) Beeson, W. T.; Phillips, C. M.; Cate, J. H. D.; Marletta, M. A. J. Am. Chem. Soc. 2012, 134, 890.
(313) Kostylev, M.; Wilson, D. Biofuels 2012, 3, 61.
(314) Kubicek, C. P.; Herrera-Estrella, A.; Seidl-Seiboth, V.; Martinez, D. A.; Druzhinina, I. S.; Thon, M.; Zeilinger, S.; Casas-Flores, S.; Horwitz, B. A.; Mukherjee, P. K.; Mukherjee, M.; Kredics, L.; Alcaraz, L. D.; Aerts, A.; Antal, Z.; Atanasova, L.; Cervantes-Badillo, M. G.; Challacombe, J.; Chertkov, O.; McCluskey, K.; Coulpier, F.; Deshpande, N.; von Döhren, H.; Ebbole, D.; Esquivel-Naranjo, E. U.; Fekete, E.; Flipphi, M.; Glaser, F.; Gómez-Rodríguez, E. Y.; Gruber, S.; Han, C.; Henrissat, B.; Hermosa, R.; Hernández-Oñate, M.; Karaffa, L.; Kosti, I.; Le Crom, S.; Lindquist, E.; Lucas, S.; Lübeck, M.; Lübeck, P.; Margeot, A.; Metz, B.; Misra, M.; Nevalainen, H.; Omann, M.; Packer, N.; Perrone, G.; Uresti-Rivera, E. E.; Salamov, A.; Schmoll, M.; Seiboth, B.; Shapiro, H.; Sukno, S.; Tamayo-Ramos, J. A.; Tisch, D.; Wiest, A.; Wilkinson, H. H.; Zhang, M.; Coutinho, P. M.; Kenerley, C. M.; Monte, E.; Baker, S. E.; Grigoriev, I. V. Genome Biol. 2011, 12, 1.
(315) Murphy, L.; Cruys-Bagger, N.; Damgaard, H. D.; Baumann, M. J.; Olsen, S. N.; Borch, K.; Lassen, S. F.; Sweeney, M.; Tatsumi, H.; Westh, P. J. Biol. Chem. 2012, 287, 1252.
(316) Teeri, T. T. Trends Biotechnol. 1997, 15, 160.
(317) Takashima, S.; Nakamura, A.; Hidaka, M.; Masaki, H.; Uozumi, T. J. Biochem. 1999, 125, 728.
(318) Saloheimo, M.; Kuja-Panula, J.; Ylösmäki, E.; Ward, M.; Penttilä, M. Appl. Environ. Microbiol. 2002, 68, 4546.
(319) Foreman, P. K.; Brown, D.; Dankmeyer, L.; Dean, R.; Diener, S.; Dunn-Coleman, N. S.; Goedegebuur, F.; Houfek, T. D.; England, G. J.; Kelley, A. S.; Meerman, H. J.; Mitchell, T.; Mitchinson, C.; Olivares, H. A.; Teunissen, P. J. M.; Yao, J.; Ward, M. J. Biol. Chem. 2003, 278, 31988.
(320) Mach, R. L., Klonierung und Charakterisierung einiger Gene des Kohlenstoffmetabolismus von Trichoderma reesei. Ph.D. Thesis, Institute of Biochemistry and Technology, Vienna, Austria, 1993.
(321) Korotkova, O. G.; Semenova, M. V.; Morozova, V. V.; Zorov, I. N.; Sokolova, L. M.; Bubnova, T. M.; Okunev, O. N.; Sinitsyn, A. P. Biochemistry 2009, 74, 569.
(322) Karkehabadi, S.; Helmich, K. E.; Kaper, T.; Hansson, H.; Mikkelsen, N.-E.; Gudmundsson, M.; Piens, K.; Fujdala, M.; Banerjee, G.; Scott-Craig, J. S.; Walton, J. D.; Phillips, G. N.; Sandgren, M. J. Biol. Chem. 2014, 289, 31624.
(323) Saloheimo, M.; Lehtovaara, P.; Penttilä, M.; Teeri, T. T.; Ståhlberg, J.; Johansson, G.; Pettersson, G.; Claeyssens, M.; Tomme, P.; Knowles, J. K. C. Gene 1988, 63, 11.
(324) Qin, Y.; Wei, X.; Song, X.; Qu, Y. J. Biotechnol. 2008, 135, 190.
(325) Teeri, T. T.; Lehtovaara, P.; Kauppinen, S.; Salovuori, I.; Knowles, J. Gene 1987, 51, 43.
(326) Poidevin, L.; Feliu, J.; Doan, A.; Berrin, J.-G.; Bey, M.; Coutinho, P. M.; Henrissat, B.; Record, E.; Heiss-Blanquet, S. Appl. Environ. Microbiol. 2013, 79, 4220.
(327) Boer, H.; Koivula, A. Eur. J. Biochem. 2003, 270, 841.
(328) Becker, D.; Braet, C.; Brumer, H., III; Claeyssens, M.; Divne, C.; Fagerström, B. R.; Harris, M.; Jones, T. A.; Kleywegt, G. J.; Koivula, A.; Mahdi, S.; Piens, K.; Sinnott, M. L.; Ståhlberg, J.; Teeri, T. T.; Underwood, M.; Wohlfahrt, G. Biochem. J. 2001, 356, 19.
(329) Penttilä, M.; Lehtovaara, P.; Nevalainen, H.; Bhikhabhai, R.; Knowles, J. Gene 1986, 45, 253.
(330) Van Arsdell, J. N.; Kwok, S.; Schweickart, V. L.; Ladner, M. B.; Gelfand, D. H.; Innis, M. A. Bio/Technology 1987, 5, 60.
(331) Biely, P.; Vršanská, M.; Claeyssens, M. Eur. J. Biochem. 1991, 200, 157.
(332) Vlasenko, E.; Schülein, M.; Cherry, J.; Xu, F. Bioresour. Technol. 2010, 101, 2405.
(333) Fowler, T.; Mitchinson, C. (Genencor International, Inc.) Mutant EGIII Cellulase, DNA Encoding Such EGIII Compositions and Methods for Obtaining Same. U.S. Patent US 6,187,732 B1, Feb. 13, 2001.
(334) Karlsson, J.; Siika-aho, M.; Tenkanen, M.; Tjerneld, F. J. Biotechnol. 2002, 99, 63.
(335) Saloheimo, A.; Henrissat, B.; Hoffrén, A.-M.; Teleman, O.; Penttilä, M. Mol. Microbiol. 1994, 13, 219.
(336) Benkő, Z.; Siika-aho, M.; Viikari, L.; Réczey, K. Enzyme Microb. Technol. 2008, 43, 109.
(337) Karkehabadi, S.; Hansson, H.; Kim, S.; Piens, K.; Mitchinson, C.; Sandgren, M. J. Mol. Biol. 2008, 383, 144.
(338) Gilbert, H. J.; Knox, J. P.; Boraston, A. B. Curr. Opin. Struct. Biol. 2013, 23, 669.
(339) Shoseyov, O.; Shani, Z.; Levy, I. Microbiol. Mol. Biol. Rev. 2006, 70, 283.
(340) Hashimoto, H. Cell. Mol. Life Sci. 2006, 63, 2954.
(341) Hilden, L.; Johansson, G. Biotechnol. Lett. 2004, 26, 1683.
(342) Guillén, D.; Sánchez, S.; Rodríguez-Sanoja, R. Appl. Microbiol. Biotechnol. 2010, 85, 1241.
(343) van Tilbeurgh, H.; Tomme, P.; Claeyssens, M.; Bhikhabhai, R.; Pettersson, G. FEBS Lett. 1986, 204, 223.


(344) Tomme, P.; van Tilbeurgh, H.; Pettersson, G.; Van Damme, J.; Vandekerckhove, J.; Knowles, J.; Teeri, T.; Claeyssens, M. Eur. J. Biochem. 1988, 170, 575.
(345) Gilkes, N. R.; Warren, R. A. J.; Miller, R. C., Jr.; Kilburn, D. G. J. Biol. Chem. 1988, 263, 10401.
(346) Kraulis, P. J.; Clore, G. M.; Nilges, M.; Jones, T. A.; Pettersson, G.; Knowles, J.; Gronenborn, A. M. Biochemistry 1989, 28, 7241.
(347) Gouet, P.; Robert, X.; Courcelle, E. Nucleic Acids Res. 2003, 31, 3320.
(348) Ståhlberg, J.; Johansson, G.; Pettersson, G. Bio/Technology 1991, 9, 286.
(349) Reinikainen, T.; Ruohonen, L.; Nevanen, T.; Laaksonen, L.; Kraulis, P.; Jones, T. A.; Knowles, J. K. C.; Teeri, T. T. Proteins 1992, 14, 475.
(350) Reinikainen, T.; Teleman, O.; Teeri, T. T. Proteins 1995, 22, 392.
(351) Linder, M.; Lindeberg, G.; Reinikainen, T.; Teeri, T. T.; Pettersson, G. FEBS Lett. 1995, 372, 96.
(352) Linder, M.; Mattinen, M. L.; Kontteli, M.; Lindeberg, G.; Ståhlberg, J.; Drakenberg, T.; Reinikainen, T.; Pettersson, G.; Annila, A. Protein Sci. 1995, 4, 1056.
(353) Linder, M.; Teeri, T. T. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 12251.
(354) Linder, M.; Salovuori, I.; Ruohonen, L.; Teeri, T. T. J. Biol. Chem. 1996, 271, 21268.
(355) Linder, M.; Teeri, T. T. J. Biotechnol. 1997, 57, 15.
(356) Mattinen, M.-L.; Kontteli, M.; Kerovuo, J.; Drakenberg, T.; Annila, A.; Linder, M.; Reinikainen, T.; Lindeberg, G. Protein Sci. 1997, 6, 294.
(357) Mattinen, M.-L.; Linder, M.; Teleman, A.; Annila, A. FEBS Lett. 1997, 407, 291.
(358) Srisodsuk, M.; Lehtiö, J.; Linder, M.; Margolles-Clark, E.; Reinikainen, T.; Teeri, T. T. J. Biotechnol. 1997, 57, 49.
(359) Mattinen, M.-L.; Linder, M.; Drakenberg, T.; Annila, A. Eur. J. Biochem. 1998, 256, 279.
(360) Carrard, G.; Linder, M. Eur. J. Biochem. 1999, 262, 637.
(361) Linder, M.; Nevanen, T.; Teeri, T. T. FEBS Lett. 1999, 447, 13.
(362) Lehtiö, J.; Sugiyama, J.; Gustavsson, M.; Fransson, L.; Linder, M.; Teeri, T. T. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 484.
(363) Sugimoto, N.; Igarashi, K.; Wada, M.; Samejima, M. Langmuir 2012, 28, 14323.
(364) Guo, J.; Catchmark, J. M. Biomacromolecules 2013, 14, 1268.
(365) Takashima, S.; Ohno, M.; Hidaka, M.; Nakamura, A.; Masaki, H.; Uozumi, T. FEBS Lett. 2007, 581, 5891.
(366) Creagh, A. L.; Ong, E.; Jervis, E.; Kilburn, D. G.; Haynes, C. A. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 12229.
(367) Liu, Y. S.; Baker, J. O.; Zeng, Y. N.; Himmel, M. E.; Haas, T.; Ding, S. Y. J. Biol. Chem. 2011, 286, 11195.
(368) Harrison, M. J.; Nouwens, A. S.; Jardine, D. R.; Zachara, N. E.; Gooley, A. A.; Nevalainen, H.; Packer, N. H. Eur. J. Biochem. 1998, 256, 119.
(369) Beckham, G. T.; Dai, Z.; Matthews, J. F.; Momany, M.; Payne, C. M.; Adney, W. S.; Baker, S. E.; Himmel, M. E. Curr. Opin. Biotechnol. 2012, 23, 338.
(370) Beckham, G. T.; Matthews, J. F.; Bomble, Y. J.; Bu, L. T.; Adney, W. S.; Himmel, M. E.; Nimlos, M. R.; Crowley, M. F. J. Phys. Chem. B 2010, 114, 1447.
(371) Taylor, C. B.; Talib, M. F.; McCabe, C.; Bu, L.; Adney, W. S.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J. Biol. Chem. 2012, 287, 3147.
(372) Crooks, G. E.; Hon, G.; Chandonia, J.-M.; Brenner, S. E. Genome Res. 2004, 14, 1188.
(373) Chen, L. Q.; Drake, M. R.; Resch, M. G.; Greene, E. R.; Himmel, M. E.; Chaffey, P. K.; Beckham, G. T.; Tan, Z. P. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 7612.
(374) Din, N.; Gilkes, N. R.; Tekant, B.; Miller, R. C., Jr.; Warren, R. A. J.; Kilburn, D. G. Bio/Technology 1991, 9, 1096.
(375) Gao, P.-J.; Chen, G.-J.; Wang, T.-H.; Zhang, Y.-S.; Liu, J. Acta Biochim. Biophys. Sin. 2000, 33, 13.
(376) Xiao, Z.; Gao, P.; Qu, Y.; Wang, T. Biotechnol. Lett. 2001, 23, 711.
(377) Wang, L.; Zhang, Y.; Gao, P. Sci. China, Ser. C: Life Sci. 2008, 51, 620.
(378) Hall, M.; Bansal, P.; Lee, J. H.; Realff, M. J.; Bommarius, A. S. Bioresour. Technol. 2011, 102, 2910.
(379) Wang, Y. G.; Tang, R. T.; Tao, J.; Wang, X. N.; Zheng, B. S.; Feng, Y. J. Biol. Chem. 2012, 287, 29568.
(380) Hall, M.; Rubin, J.; Behrens, S. H.; Bommarius, A. S. J. Biotechnol. 2011, 155, 370.
(381) Voutilainen, S. P.; Puranen, T.; Siika-aho, M.; Lappalainen, A.; Alapuranen, M.; Kallio, J.; Hooman, S.; Viikari, L.; Vehmaanperä, J.; Koivula, A. Biotechnol. Bioeng. 2008, 101, 515.
(382) Kim, T.-W.; Chokhawala, H. A.; Nadler, D.; Blanch, H. W.; Clark, D. S. Biotechnol. Bioeng. 2010, 107, 601.
(383) Kern, M.; McGeehan, J. E.; Streeter, S. D.; Martin, R. N. A.; Besser, K.; Elias, L.; Eborall, W.; Malyon, G. P.; Payne, C. M.; Himmel, M. E.; Schnorr, K.; Beckham, G. T.; Cragg, S. M.; Bruce, N. C.; McQueen-Mason, S. J. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 10189.
(384) Várnai, A.; Siika-aho, M.; Viikari, L. Biotechnol. Biofuels 2013, 6, 30.
(385) Hoffrén, A. M.; Teeri, T. T.; Teleman, O. Protein Eng. 1995, 8, 443.
(386) Mulakala, C.; Reilly, P. J. Proteins: Struct., Funct., Bioinf. 2005, 60, 598.
(387) Nimlos, M. R.; Matthews, J. F.; Crowley, M. F.; Walker, R. C.; Chukkapalli, G.; Brady, J. W.; Adney, W. S.; Cleary, J. M.; Zhong, L. H.; Himmel, M. E. Protein Eng., Des. Sel. 2007, 20, 179.
(388) Zhong, L.; Matthews, J. F.; Crowley, M. F.; Rignall, T.; Talon, C.; Cleary, J. M.; Walker, R. C.; Chukkapalli, G.; McCabe, C.; Nimlos, M. R.; Brooks, C. L.; Himmel, M. E.; Brady, J. W. Cellulose 2008, 15, 261.
(389) Zhong, L. H.; Matthews, J. F.; Hansen, P. I.; Crowley, M. F.; Cleary, J. M.; Walker, R. C.; Nimlos, M. R.; Brooks, C. L.; Adney, W. S.; Himmel, M. E.; Brady, J. W. Carbohydr. Res. 2009, 344, 1984.
(390) Tavagnacco, L.; Mason, P. E.; Schnupf, U.; Pitici, F.; Zhong, L. H.; Himmel, M. E.; Crowley, M.; Cesaro, A.; Brady, J. W. Carbohydr. Res. 2011, 346, 839.
(391) Nimlos, M. R.; Beckham, G. T.; Matthews, J. F.; Bu, L. T.; Himmel, M. E.; Crowley, M. F. J. Biol. Chem. 2012, 287, 20603.
(392) Shiiba, H.; Hayashi, S.; Yui, T. Carbohydr. Res. 2013, 374, 96.
(393) Payne, C. M.; Resch, M. G.; Chen, L. Q.; Crowley, M. F.; Himmel, M. E.; Taylor, L. E., II; Sandgren, M.; Ståhlberg, J.; Stals, I.; Tan, Z. P.; Beckham, G. T. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 14646.
(394) Shiiba, H.; Hayashi, S.; Yui, T. Cellulose 2012, 19, 635.
(395) Yui, T.; Shiiba, H.; Tsutsumi, Y.; Hayashi, S.; Miyata, T.; Hirata, F. J. Phys. Chem. B 2010, 114, 49.
(396) Mackerell, A. D.; Feig, M.; Brooks, C. L. J. Comput. Chem. 2004, 25, 1400.
(397) Sakon, J.; Irwin, D.; Wilson, D. B.; Karplus, P. A. Nat. Struct. Biol. 1997, 4, 810.
(398) van Aalten, D. M. F.; Synstad, B.; Brurberg, M. B.; Hough, E.; Riise, B. W.; Eijsink, V. G. H.; Wierenga, R. K. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 5842.
(399) Abuja, P. M.; Pilz, I.; Claeyssens, M.; Tomme, P. Biochem. Biophys. Res. Commun. 1988, 156, 180.
(400) Abuja, P. M.; Schmuck, M.; Pilz, I.; Tomme, P.; Claeyssens, M.; Esterbauer, H. Eur. Biophys. J. 1988, 15, 339.
(401) Abuja, P. M.; Pilz, I.; Tomme, P.; Claeyssens, M. Biochem. Biophys. Res. Commun. 1989, 165, 615.
(402) Pilz, I.; Schwarz, E.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J.; Gilkes, N. R. Biochem. J. 1990, 271, 277.
(403) Meinke, A.; Schmuck, M.; Gilkes, N. R.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J. Glycobiology 1992, 2, 321.
(404) Langsford, M. L.; Gilkes, N. R.; Singh, B.; Moser, B.; Miller, R. C., Jr.; Warren, R. A. J.; Kilburn, D. G. FEBS Lett. 1987, 225, 163.
(405) Shen, H.; Schmuck, M.; Pilz, I.; Gilkes, N. R.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J. J. Biol. Chem. 1991, 266, 11335.


(406) Boisset, C.; Borsali, R.; Schülein, M.; Henrissat, B. FEBS Lett. 1995, 376, 49.
(407) Receveur, V.; Czjzek, M.; Schülein, M.; Panine, P.; Henrissat, B. J. Biol. Chem. 2002, 277, 40887.
(408) von Ossowski, I.; Eaton, J. T.; Czjzek, M.; Perkins, S. J.; Frandsen, T. P.; Schülein, M.; Panine, P.; Henrissat, B.; Receveur-Bréchot, V. Biophys. J. 2005, 88, 2823.
(409) Wright, P. E.; Dyson, H. J. J. Mol. Biol. 1999, 293, 321.
(410) Dunker, A. K.; Brown, C. J.; Lawson, J. D.; Iakoucheva, L. M.; Obradovic, Z. Biochemistry 2002, 41, 6573.
(411) Dunker, A. K.; Lawson, J. D.; Brown, C. J.; Williams, R. M.; Romero, P.; Oh, J. S.; Oldfield, C. J.; Campen, A. M.; Ratliff, C. M.; Hipps, K. W.; Ausio, J.; Nissen, M. S.; Reeves, R.; Kang, C.; Kissinger, C. R.; Bailey, R. W.; Griswold, M. D.; Chiu, W.; Garner, E. C.; Obradovic, Z. J. Mol. Graphics Modell. 2001, 19, 26.
(412) Dunker, A. K.; Obradovic, Z. Nat. Biotechnol. 2001, 19, 805.
(413) Dyson, H. J.; Wright, P. E. Nat. Rev. Mol. Cell Biol. 2005, 6, 197.
(414) Lima, L. H. F.; Serpa, V. I.; Rosseto, F. R.; Sartori, G. R.; Neto, M. D.; Martinez, L.; Polikarpov, I. Cellulose 2013, 20, 1573.
(415) Poon, D. K. Y.; Withers, S. G.; McIntosh, L. P. J. Biol. Chem. 2007, 282, 2091.
(416) Beckham, G. T.; Bomble, Y. J.; Matthews, J. F.; Taylor, C. B.; Resch, M. G.; Yarbrough, J. M.; Decker, S. R.; Bu, L. T.; Zhao, X. C.; McCabe, C.; Wohlert, J.; Bergenstrahle, M.; Brady, J. W.; Adney, W. S.; Himmel, M. E.; Crowley, M. F. Biophys. J. 2010, 99, 3773.
(417) Sammond, D. W.; Payne, C. M.; Brunecky, R.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. PLoS One 2012, 7, e48615.
(418) Hui, J. P. M.; White, T. C.; Thibault, P. Glycobiology 2002, 12, 837.
(419) Stals, I.; Sandra, K.; Geysens, S.; Contreras, R.; Van Beeumen, J.; Claeyssens, M. Glycobiology 2004, 14, 713.
(420) Christiansen, M. N.; Kolarich, D.; Nevalainen, H.; Packer, N. H.; Jensen, P. H. Anal. Chem. 2010, 82, 3500.
(421) Deshpande, N.; Wilkins, M. R.; Packer, N.; Nevalainen, H. Glycobiology 2008, 18, 626.
(422) Goto, M. Biosci. Biotechnol. Biochem. 2007, 71, 1415.
(423) Cummings, R. D.; Doering, T. L. In Essentials of Glycobiology; Varki, A., Cummings, R. D., Esko, J. D., Freeze, H. H., Stanley, P., Bertozzi, C. R., Hart, G. W., Etzler, M. E., Eds.; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 2009.
(424) Zhao, X.; Rignall, T. R.; McCabe, C.; Adney, W. S.; Himmel, M. E. Chem. Phys. Lett. 2008, 460, 284.
(425) Ting, C. L.; Makarov, D. E.; Wang, Z. G. J. Phys. Chem. B 2009, 113, 4970.
(426) Martinez, D.; Larrondo, L. F.; Putnam, N.; Gelpke, M. D. S.; Huang, K.; Chapman, J.; Helfenbein, K. G.; Ramaiya, P.; Detter, J. C.; Larimer, F.; Coutinho, P. M.; Henrissat, B.; Berka, R.; Cullen, D.; Rokhsar, D. Nat. Biotechnol. 2004, 22, 695.
(427) Wymelenberg, A. V.; Minges, P.; Sabat, G.; Martinez, D.; Aerts, A.; Salamov, A.; Grigoriev, I.; Shapiro, H.; Putnam, N.; Belinky, P.; Dosoretz, C.; Gaskell, J.; Kersten, P.; Cullen, D. Fungal Genet. Biol. 2006, 43, 343.
(428) King, A. J.; Cragg, S. M.; Li, Y.; Dymond, J.; Guille, M. J.; Bowles, D. J.; Bruce, N. C.; Graham, I. A.; McQueen-Mason, S. J. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 5345.
(429) Todaka, N.; Moriya, S.; Saita, K.; Hondo, T.; Kiuchi, I.; Takasu, H.; Ohkuma, M.; Piero, C.; Hayashizaki, Y.; Kudo, T. FEMS Microbiol. Ecol. 2007, 59, 592.
(430) Eichinger, L.; Pachebat, J. A.; Glockner, G.; Rajandream, M. A.; Sucgang, R.; Berriman, M.; Song, J.; Olsen, R.; Szafranski, K.; Xu, Q.; Tunggal, B.; Kummerfeld, S.; Madera, M.; Konfortov, B. A.; Rivero, F.; Bankier, A. T.; Lehmann, R.; Hamlin, N.; Davies, R.; Gaudet, P.; Fey, P.; Pilcher, K.; Chen, G.; Saunders, D.; Sodergren, E.; Davis, P.; Kerhornou, A.; Nie, X.; Hall, N.; Anjard, C.; Hemphill, L.; Bason, N.; Farbrother, P.; Desany, B.; Just, E.; Morio, T.; Rost, R.; Churcher, C.; Cooper, J.; Haydock, S.; van Driessche, N.; Cronin, A.; Goodhead, I.; Muzny, D.; Mourier, T.; Pain, A.; Lu, M.; Harper, D.; Lindsay, R.; Hauser, H.; James, K.; Quiles, M.; Madan Babu, M.; Saito, T.; Buchrieser, C.; Wardroper, A.; Felder, M.; Thangavelu, M.; Johnson, D.; Knights, A.; Loulseged, H.; Mungall, K.; Oliver, K.; Price, C.; Quail, M. A.; Urushihara, H.; Hernandez, J.; Rabbinowitsch, E.; Steffen, D.; Sanders, M.; Ma, J.; Kohara, Y.; Sharp, S.; Simmonds, M.; Spiegler, S.; Tivey, A.; Sugano, S.; White, B.; Walker, D.; Woodward, J.; Winckler, T.; Tanaka, Y.; Shaulsky, G.; Schleicher, M.; Weinstock, G.; Rosenthal, A.; Cox, E. C.; Chisholm, R. L.; Gibbs, R.; Loomis, W. F.; Platzer, M.; Kay, R. R.; Williams, J.; Dear, P. H.; Noegel, A. A.; Barrell, B.; Kuspa, A. Nature 2005, 435, 43.
(431) Sucgang, R.; Kuo, A.; Tian, X. J.; Salerno, W.; Parikh, A.; Feasley, C. L.; Dalin, E.; Tu, H.; Huang, E. Y.; Barry, K.; Lindquist, E.; Shapiro, H.; Bruce, D.; Schmutz, J.; Salamov, A.; Fey, P.; Gaudet, P.; Anjard, C.; Babu, M. M.; Basu, S.; Bushmanova, Y.; van der Wel, H.; Katoh-Kurasawa, M.; Dinh, C.; Coutinho, P. M.; Saito, T.; Elias, M.; Schaap, P.; Kay, R. R.; Henrissat, B.; Eichinger, L.; Rivero, F.; Putnam, N. H.; West, C. M.; Loomis, W. F.; Chisholm, R. L.; Shaulsky, G.; Strassmann, J. E.; Queller, D. C.; Kuspa, A.; Grigoriev, I. V. Genome Biol. 2011, 12, R20.
(432) Kunii, M.; Yasuno, M.; Shindo, Y.; Kawata, T. Dev. Genes Evol. 2014, 224, 25.
(433) Sethi, A.; Kovaleva, E. S.; Slack, J. M.; Brown, S.; Buchman, G. W.; Scharf, M. E. Arch. Insect Biochem. Physiol. 2013, 84, 175.
(434) Bissett, F. H. J. Chromatogr. A 1979, 178, 515.
(435) Fägerstam, L. G.; Pettersson, L. G. FEBS Lett. 1979, 1100, 363.
(436) Fägerstam, L. G.; Pettersson, L. G. FEBS Lett. 1980, 119, 97.
(437) Shoemaker, S.; Watt, K.; Tsitovsky, G.; Cox, R. Nat. Biotechnol. 1983, 1, 687.
(438) Fägerstam, L. G.; Pettersson, L. G.; Engström, J. Å. FEBS Lett. 1984, 167, 309.
(439) Henrissat, B.; Claeyssens, M.; Tomme, P.; Lemesle, L.; Mornon, J.-P. Gene 1989, 81, 83.
(440) Teeri, T. T.; Koivula, A.; Linder, M.; Wohlfahrt, G.; Divne, C.; Jones, T. A. Biochem. Soc. Trans. 1998, 26, 173.
(441) Knott, B. C.; Momeni, M. H.; Crowley, M. F.; Mackenzie, L. F.; Götz, A. W.; Sandgren, M.; Withers, S. G.; Ståhlberg, J. S.; Beckham, G. T. J. Am. Chem. Soc. 2013, 136, 321.
(442) Schou, C.; Rasmussen, G.; Kaltoft, M.-B.; Henrissat, B.; Schülein, M. Eur. J. Biochem. 1993, 217, 947.
(443) Knowles, J. K. C.; Lehtovaara, P.; Murray, M.; Sinnott, M. L. J. Chem. Soc., Chem. Commun. 1988, 21, 1401.
(444) McCarter, J. D.; Withers, S. G. Curr. Opin. Struct. Biol. 1994, 4, 885.
(445) Vrsanska, M.; Biely, P. Carbohydr. Res. 1992, 227, 19.
(446) Ståhlberg, J.; Divne, C.; Koivula, A.; Piens, K.; Claeyssens, M.; Teeri, T. T.; Jones, T. A. J. Mol. Biol. 1996, 264, 337.
(447) Tews, I.; Perrakis, A.; Oppenheim, A.; Dauter, Z.; Wilson, K. S.; Vorgias, C. E. Nat. Struct. Biol. 1996, 3, 638.
(448) Sulzenbacher, G.; Schülein, M.; Davies, G. J. Biochemistry 1997, 36, 5902.
(449) Momeni, M. H.; Payne, C. M.; Hansson, H.; Mikkelsen, N. E.; Svedberg, J.; Engström, Å.; Sandgren, M.; Beckham, G. T.; Ståhlberg, J. J. Biol. Chem. 2013, 288, 5861.
(450) Kleywegt, G. J.; Zou, J. Y.; Divne, C.; Davies, G. J.; Sinning, I.; Ståhlberg, J.; Reinikainen, T.; Srisodsuk, M.; Teeri, T. T.; Jones, T. A. J. Mol. Biol. 1997, 272, 383.
(451) Mackenzie, L. F.; Davies, G. J.; Schülein, M.; Withers, S. G. Biochemistry 1997, 36, 5893.
(452) Mackenzie, L. F.; Sulzenbacher, G.; Divne, C.; Jones, T. A.; Wöldike, H. F.; Schülein, M.; Withers, S. G.; Davies, G. J. Biochem. J. 1998, 335, 409.
(453) Namchuk, M. N.; McCarter, J. D.; Becalski, A.; Andrews, T.; Withers, S. G. J. Am. Chem. Soc. 2000, 122, 1270.
(454) Withers, S. G.; Aebersold, R. Protein Sci. 1995, 4, 361.
(455) Street, I. P.; Kempton, J. B.; Withers, S. G. Biochemistry 1992, 31, 9970.
(456) Klarskov, K.; Piens, K.; Ståhlberg, J.; Høj, P. B.; Van Beeumen, J.; Claeyssens, M. Carbohydr. Res. 1997, 304, 143.
(457) Davies, G. J.; Ducros, V.; Lewis, R. J.; Borchert, T. V.; Schülein, M. J. Biotechnol. 1997, 57, 91.


(458) Ubhayasekera, W.; Muñoz, I. G.; Vasella, A.; Ståhlberg, J.; Mowbray, S. L. FEBS J. 2005, 272, 1952.
(459) Knott, B. C.; Crowley, M. F.; Himmel, M. E.; Ståhlberg, J.; Beckham, G. T. J. Am. Chem. Soc. 2014, 136, 8810.
(460) Muñoz, I. G.; Ubhayasekera, W.; Henriksson, H.; Szabó, I.; Pettersson, G.; Johansson, G.; Mowbray, S. L.; Ståhlberg, J. J. Mol. Biol. 2001, 314, 1097.
(461) von Ossowski, I.; Ståhlberg, J.; Koivula, A.; Piens, K.; Becker, D.; Boer, H.; Harle, R.; Harris, M.; Divne, C.; Mahdi, S.; Zhao, Y. X.; Driguez, H.; Claeyssens, M.; Sinnott, M. L.; Teeri, T. T. J. Mol. Biol. 2003, 333, 817.
(462) Muñoz, I. G.; Mowbray, S. L.; Ståhlberg, J. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2003, 59, 637.
(463) Ståhlberg, J.; Henriksson, H.; Divne, C.; Isaksson, R.; Pettersson, G.; Johansson, G.; Jones, T. A. J. Mol. Biol. 2001, 305, 79.
(464) Grassick, A.; Murray, P. G.; Thompson, R.; Collins, C. M.; Byrnes, L.; Birrane, G.; Higgins, T. M.; Tuohy, M. G. Eur. J. Biochem. 2004, 271, 4495.
(465) Parkkinen, T.; Koivula, A.; Vehmaanperä, J.; Rouvinen, J. Protein Sci. 2008, 17, 1383.
(466) Textor, L. C.; Colussi, F.; Silveira, R. L.; Serpa, V.; de Mello, B. L.; Muniz, J. R. C.; Squina, F. M.; Pereira, N.; Skaf, M. S.; Polikarpov, I. FEBS J. 2013, 280, 56.
(467) Gao, J. L.; Truhlar, D. G. Annu. Rev. Phys. Chem. 2002, 53, 467.
(468) Garcia-Viloca, M.; Gao, J.; Karplus, M.; Truhlar, D. G. Science 2004, 303, 186.
(469) Zhang, Y.; Yan, S. H.; Yao, L. S. J. Phys. Chem. B 2013, 117, 8714.
(470) Barnett, C. B.; Wilkinson, K. A.; Naidoo, K. J. J. Am. Chem. Soc. 2011, 133, 19474.
(471) Li, J. H.; Du, L. K.; Wang, L. S. J. Phys. Chem. B 2010, 114, 15261.
(472) Yan, S. H.; Li, T.; Yao, L. S. J. Phys. Chem. B 2011, 115, 4982.
(473) Bolhuis, P. G.; Chandler, D.; Dellago, C.; Geissler, P. L. Annu. Rev. Phys. Chem. 2002, 53, 291.
(474) Peters, B.; Beckham, G. T.; Trout, B. L. J. Chem. Phys. 2007, 127, 034109.
(475) Peters, B.; Trout, B. L. J. Chem. Phys. 2006, 125, 054108.
(476) Igarashi, K.; Uchihashi, T.; Koivula, A.; Wada, M.; Kimura, S.; Okamoto, T.; Penttilä, M.; Ando, T.; Samejima, M. Science 2011, 333, 1279.
(477) Barnett, C. B.; Wilkinson, K. A.; Naidoo, K. J. J. Am. Chem. Soc. 2010, 132, 12800.
(478) Granum, D. M.; Vyas, S.; Sambasivarao, S. V.; Maupin, C. M. J. Phys. Chem. B 2013, 118, 434.
(479) Bu, L. T.; Crowley, M. F.; Himmel, M. E.; Beckham, G. T. J. Biol. Chem. 2013, 288, 12175.
(480) Zhang, Y.; Yan, S.; Yao, L. Theor. Chem. Acc. 2013, 132, 1367.
(481) Lin, Y.; Silvestre-Ryan, J.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T.; Chu, J.-W. J. Am. Chem. Soc. 2011, 133, 16617.
(482) Lin, Y.; Beckham, G. T.; Himmel, M. E.; Crowley, M. F.; Chu, J.-W. J. Phys. Chem. B 2013, 117, 10750.
(483) Szijártó, N.; Horan, E.; Zhang, J.; Puranen, T.; Siika-Aho, M.; Viikari, L. Biotechnol. Biofuels 2011, 4, 2.
(484) Takada, G.; Kawaguchi, T.; Sumitani, J.; Arai, M. Biosci. Biotechnol. Biochem. 1998, 62, 1615.
(485) Takada, G.; Kawaguchi, T.; Sumitani, J.-I.; Arai, M. J. Ferment. Bioeng. 1998, 85, 1.
(486) Kanamasa, S.; Mochizuki, M.; Takada, G.; Kawaguchi, T.; Sumitani, J.-I.; Arai, M. J. Biosci. Bioeng. 2003, 95, 627.
(487) Bauer, S.; Vasu, P.; Persson, S.; Mort, A. J.; Somerville, C. R. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 11417.
(488) Gielkens, M. M. C.; Dekkers, E.; Visser, J.; de Graaff, L. H. Appl. Environ. Microbiol. 1999, 65, 4340.
(489) Kitamoto, N.; Go, M.; Shibayama, T.; Kimura, T.; Kito, Y.; Ohmiya, K.; Tsukagoshi, N. Appl. Microbiol. Biotechnol. 1996, 46, 538.
(490) Kotaka, A.; Bando, H.; Kaya, M.; Kato-Murai, M.; Kuroda, K.; Sahara, H.; Hata, Y.; Kondo, A.; Ueda, M. J. Biosci. Bioeng. 2008, 105, 622.
(491) Luo, H.; Yang, J.; Yang, P.; Li, J.; Huang, H.; Shi, P.; Bai, Y.; Wang, Y.; Fan, Y.; Yao, B. Appl. Microbiol. Biotechnol. 2010, 85, 1015.
(492) Zhang, Y.; Xu, X.; Zhou, X.; Chen, R.; Yang, P.; Meng, Q.; Meng, K.; Luo, H.; Yuan, J.; Yao, B.; Zhang, W. PLoS One 2013, 8, e81993.
(493) Li, Y.-L.; Li, H.; Li, A.-N.; Li, D.-C. J. Appl. Microbiol. 2009, 106, 1867.
(494) Gusakov, A. V.; Sinitsyn, A. P.; Salanovich, T. N.; Bukhtojarov, F. E.; Markov, A. V.; Ustinov, B. B.; van Zeijl, C.; Punt, P.; Burlingame, R. Enzyme Microb. Technol. 2005, 36, 57.
(495) Müller, U.; Tenberge, K. B.; Oeser, B.; Tudzynski, P. Mol. Plant-Microbe Interact. 1997, 10, 268.
(496) Kanokratana, P.; Chantasingh, D.; Champreda, V.; Tanapongpipat, S.; Pootanakit, K.; Eurwilaichitr, L. Protein Expression Purif. 2008, 58, 148.
(497) Takashima, S.; Iikura, H.; Nakamura, A.; Hidaka, M.; Masaki, H.; Uozumi, T. J. Biochem. 1998, 124, 717.
(498) Takashima, S.; Nakamura, A.; Hidaka, M.; Masaki, H.; Uozumi, T. J. Biotechnol. 1996, 50, 137.
(499) Schülein, M. J. Biotechnol. 1997, 57, 71.
(500) Xu, F.; Ding, H.; Tejirian, A. Enzyme Microb. Technol. 2009, 45, 203.
(501) Hamada, N.; Ishikawa, K.; Fuse, N.; Kodaira, R.; Shimosaka, M.; Amano, Y.; Kanda, T.; Okazaki, M. J. Biosci. Bioeng. 1999, 87, 442.
(502) Miettinen-Oinonen, A.; Londesborough, J.; Joutsjoki, V.; Lantto, R.; Vehmaanperä, J. Enzyme Microb. Technol. 2004, 34, 332.
(503) Szijártó, N.; Siika-aho, M.; Tenkanen, M.; Alapuranen, M.; Vehmaanperä, J.; Réczey, K.; Viikari, L. J. Biotechnol. 2008, 136, 140.
(504) Voutilainen, S. P.; Boer, H.; Linder, M. B.; Puranen, T.; Rouvinen, J.; Vehmaanperä, J.; Koivula, A. Enzyme Microb. Technol. 2007, 41, 234.
(505) Voutilainen, S. P.; Boer, H.; Alapuranen, M.; Jänis, J.; Vehmaanperä, J.; Koivula, A. Appl. Microbiol. Biotechnol. 2009, 83, 261.
(506) Karnaouri, A. C.; Topakas, E.; Christakopoulos, P. Appl. Microbiol. Biotechnol. 2014, 98, 231.
(507) Hou, Y.; Wang, T.; Long, H.; Zhu, H. Acta Biochim. Biophys. Sin. 2007, 39, 101.
(508) Gao, L.; Gao, F.; Wang, L. S.; Geng, C. L.; Chi, L. L.; Zhao, J.; Qu, Y. B. J. Biol. Chem. 2012, 287, 15906.
(509) Wei, X. M.; Qin, Y. Q.; Qu, Y. B. J. Microbiol. Biotechnol. 2010, 20, 265.
(510) Limam, F.; Chaabouni, S. E.; Ghrir, R.; Marzouki, N. Enzyme Microb. Technol. 1995, 17, 340.
(511) Marjamaa, K.; Toth, K.; Bromann, P. A.; Szakacs, G.; Kruus, K. Enzyme Microb. Technol. 2013, 52, 358.
(512) Uzcategui, E.; Ruiz, A.; Montesino, R.; Johansson, G.; Pettersson, G. J. Biotechnol. 1991, 19, 271.
(513) Tuohy, M. G.; Walsh, D. J.; Murray, P. G.; Claeyssens, M.; Cuffe, M. M.; Savage, A. V.; Coughlan, M. P. Biochim. Biophys. Acta 2002, 1596, 366.
(514) Texier, H.; Dumon, C.; Neugnot-Roux, V.; Maestracci, M.; O’Donohue, M. J. J. Ind. Microbiol. Biotechnol. 2012, 39, 1569.
(515) Furniss, C. S. M.; Williamson, G.; Kroon, P. A. J. Sci. Food Agric. 2005, 85, 574.
(516) Hong, J.; Tamaki, H.; Yamamoto, K.; Kumagai, H. Appl. Microbiol. Biotechnol. 2003, 63, 42.
(517) Colussi, F.; Serpa, V.; Delabona, P. d. S.; Manzine, L. R.; Voltatodio, M. L.; Alves, R.; Mello, B. L.; Pereira, N., Jr.; Farinas, C. S.; Golubev, A. M.; Santos, M. A. M.; Polikarpov, I. J. Microbiol. Biotechnol. 2011, 21, 808.
(518) Ganga, A.; González-Candelas, L.; Ramón, D.; Pérez-González, J. A. J. Agric. Food Chem. 1997, 45, 2359.
(519) Mitrovic, A.; Flicker, K.; Steinkellner, G.; Gruber, K.; Reisinger, C.; Schirrmacher, G.; Camattari, A.; Glieder, A. J. Mol. Catal. B: Enzym. 2014, 103, 16.
(520) Song, J.; Liu, B.; Liu, Z.; Yang, Q. Mol. Biol. Rep. 2010, 37, 2135.
(521) Horn, S. J.; Sørlie, M.; Vårum, K. M.; Väljamäe, P.; Eijsink, V. G. H. Methods Enzymol. 2012, 510, 69.
(522) Doner, L. W.; Irwin, P. L. Anal. Biochem. 1992, 202, 50.


(523) Irwin, D. C.; Spezio, M.; Walker, L. P.; Wilson, D. B. Biotechnol. (560) Cruys-Bagger, N.; Elmerdahl, J.; Praestgaard, E.; Tatsumi, H.;
Bioeng. 1993, 42, 1002. Spodsberg, N.; Borch, K.; Westh, P. J. Biol. Chem. 2012, 287, 18451.
(524) Irwin, D.; Shin, D.-H.; Zhang, S.; Barr, B. K.; Sakon, J.; Karplus, (561) Cruys-Bagger, N.; Tatsumi, H.; Ren, G. R.; Borch, K.; Westh, P.
P. A.; Wilson, D. B. J. Bacteriol. 1998, 180, 1709. Biochemistry 2013, 52, 8938.
(525) Koivula, A.; Kinnari, T.; Harjunpaa, V.; Ruohonen, L.; Teleman, (562) Praestgaard, E.; Elmerdahl, J.; Murphy, L.; Nymand, S.;
A.; Drakenberg, T.; Rouvinen, J.; Jones, T. A.; Teeri, T. T. FEBS Lett. McFarland, K. C.; Borch, K.; Westh, P. FEBS J. 2011, 278, 1547.
1998, 429, 341. (563) Horn, S. J.; Sikorski, P.; Cederkvist, J. B.; Vaaje-Kolstad, G.;
(526) Zhang, S.; Irwin, D. C.; Wilson, D. B. Eur. J. Biochem. 2000, 267, Sørlie, M.; Synstad, B.; Vriend, G.; Vårum, K. M.; Eijsink, V. G. H. Proc.
3101. Natl. Acad. Sci. U.S.A. 2006, 103, 18089.
(527) Kipper, K.; Väljamäe, P.; Johansson, G. Biochem. J. 2005, 385, (564) Nakamura, A.; Watanabe, H.; Ishida, T.; Uchihashi, T.; Wada,
527. M.; Ando, T.; Igarashi, K.; Samejima, M. J. Am. Chem. Soc. 2014, 136, 4584.
(528) Vuong, T. V.; Wilson, D. B. Appl. Environ. Microbiol. 2009, 75, 6655.
(529) Watson, B. J.; Zhang, H.; Longmire, A. G.; Moon, Y. H.; Hutcheson, S. W. J. Bacteriol. 2009, 191, 5697.
(530) Velleste, R.; Teugjas, H.; Väljamäe, P. Cellulose 2010, 17, 125.
(531) Jalak, J.; Väljamäe, P. Biotechnol. Bioeng. 2010, 106, 871.
(532) Fox, J. M.; Levine, S. E.; Clark, D. S.; Blanch, H. W. Biochemistry 2012, 51, 442.
(533) Zhang, Y. H. P.; Lynd, L. R. Biomacromolecules 2005, 6, 1510.
(534) Bubner, P.; Plank, H.; Nidetzky, B. Biotechnol. Bioeng. 2013, 110, 1529.
(535) Goacher, R. E.; Selig, M. J.; Master, E. R. Curr. Opin. Biotechnol. 2014, 27, 123.
(536) White, A. R.; Brown, R. M., Jr. Proc. Natl. Acad. Sci. U.S.A. 1981, 78, 1047.
(537) Chanzy, H.; Henrissat, B.; Vuong, R.; Schülein, M. FEBS Lett. 1983, 153, 113.
(538) Chanzy, H.; Henrissat, B.; Vuong, R. FEBS Lett. 1984, 172, 193.
(539) Blanchette, R. A.; Abad, A. R.; Cease, K. R.; Lovrien, R. E.; Leathers, T. D. Appl. Environ. Microbiol. 1989, 55, 2293.
(540) Boisset, C.; Fraschini, C.; Schülein, M.; Henrissat, B.; Chanzy, H. Appl. Environ. Microbiol. 2000, 66, 1444.
(541) Donohoe, B. S.; Selig, M. J.; Viamajala, S.; Vinzant, T. B.; Adney, W. S.; Himmel, M. E. Biotechnol. Bioeng. 2009, 103, 480.
(542) Imai, T.; Boisset, C.; Samejima, M.; Igarashi, K.; Sugiyama, J. FEBS Lett. 1998, 432, 113.
(543) Lee, H. J.; Brown, R. M., Jr. J. Biotechnol. 1997, 57, 127.
(544) Nieves, R. A.; Ellis, R. P.; Todd, R. J.; Johnson, T. J. A.; Grohmann, K.; Himmel, M. E. Appl. Environ. Microbiol. 1991, 57, 3163.
(545) Lee, I.; Evans, B. R.; Lane, L. M.; Woodward, J. Bioresour. Technol. 1996, 58, 163.
(546) Ganner, T.; Bubner, P.; Eibinger, M.; Mayrhofer, C.; Plank, H.; Nidetzky, B. J. Biol. Chem. 2012, 287, 43215.
(547) Lee, I.; Evans, B. R.; Woodward, J. Ultramicroscopy 2000, 82, 213.
(548) Santa-Maria, M.; Jeoh, T. Biomacromolecules 2010, 11, 2000.
(549) Wang, J. P.; Quirk, A.; Lipkowski, J.; Dutcher, J. R.; Hill, C.; Mark, A.; Clarke, A. J. Langmuir 2012, 28, 9664.
(550) Jeoh, T.; Santa-Maria, M. C.; O’Dell, P. J. Carbohydr. Polym. 2013, 97, 581.
(551) Wang, J. P.; Quirk, A.; Lipkowski, J.; Dutcher, J. R.; Clarke, A. J. Langmuir 2013, 29, 14997.
(552) Bubner, P.; Dohr, J.; Plank, H.; Mayrhofer, C.; Nidetzky, B. J. Biol. Chem. 2012, 287, 2759.
(553) Boisset, C.; Pétrequin, C.; Chanzy, H.; Henrissat, B.; Schülein, M. Biotechnol. Bioeng. 2001, 72, 339.
(554) Luterbacher, J. S.; Walker, L. P.; Moran-Mirabal, J. M. Biotechnol. Bioeng. 2013, 110, 108.
(555) Jung, J.; Sethi, A.; Gaiotto, T.; Han, J. J.; Jeoh, T.; Gnanakaran, S.; Goodwin, P. M. J. Biol. Chem. 2013, 288, 24164.
(556) Bansal, P.; Hall, M.; Realff, M. J.; Lee, J. H.; Bommarius, A. S. Biotechnol. Adv. 2009, 27, 833.
(557) Jalak, J.; Kurašin, M.; Teugjas, H.; Väljamäe, P. J. Biol. Chem. 2012, 287, 28802.
(558) Igarashi, K.; Koivula, A.; Wada, M.; Kimura, S.; Penttilä, M.; Samejima, M. J. Biol. Chem. 2009, 284, 36186.
(559) Cruys-Bagger, N.; Elmerdahl, J.; Praestgaard, E.; Borch, K.; Westh, P. FEBS J. 2013, 280, 3952.
(565) Henrissat, B. Cellul. Commun. 1998, 5, 84.
(566) Lee, T. M.; Farrow, M. F.; Arnold, F. H.; Mayo, S. L. Protein Sci. 2011, 20, 1935.
(567) Sandgren, M.; Shaw, A.; Ropp, T. H.; Wu, S.; Bott, R.; Cameron, A. D.; Ståhlberg, J.; Mitchinson, C.; Jones, T. A. J. Mol. Biol. 2001, 308, 295.
(568) Väljamäe, P.; Sild, V.; Nutt, A.; Pettersson, G.; Johansson, G. Eur. J. Biochem. 1999, 266, 327.
(569) Eriksson, T.; Karlsson, J.; Tjerneld, F. Appl. Biochem. Biotechnol. 2002, 101, 41.
(570) Wood, T. M.; McCrae, S. I. Biochem. J. 1972, 128, 1183.
(571) Bu, L. T.; Beckham, G. T.; Shirts, M. R.; Nimlos, M. R.; Adney, W. S.; Himmel, M. E.; Crowley, M. F. J. Biol. Chem. 2011, 286, 18161.
(572) Bu, L. T.; Nimlos, M. R.; Shirts, M. R.; Ståhlberg, J.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J. Biol. Chem. 2012, 287, 24807.
(573) Beckham, G. T.; Ståhlberg, J.; Knott, B. C.; Himmel, M. E.; Crowley, M. F.; Sandgren, M.; Sørlie, M.; Payne, C. M. Curr. Opin. Biotechnol. 2014, 27, 96.
(574) Koivula, A.; Reinikainen, T.; Ruohonen, L.; Valkeajarvi, A.; Claeyssens, M.; Teleman, O.; Kleywegt, G. J.; Szardenings, M.; Rouvinen, J.; Jones, T. A.; Teeri, T. T. Protein Eng. 1996, 9, 691.
(575) Nakamura, A.; Tsukada, T.; Auer, S.; Furuta, T.; Wada, M.; Koivula, A.; Igarashi, K.; Samejima, M. J. Biol. Chem. 2013, 288, 13503.
(576) GhattyVenkataKrishna, P. K.; Alekozai, E. M.; Beckham, G. T.; Schulz, R.; Crowley, M. F.; Uberbacher, E. C.; Cheng, X. Biophys. J. 2013, 104, 904.
(577) Taylor, C. B.; Payne, C. M.; Himmel, M. E.; Crowley, M. F.; McCabe, C.; Beckham, G. T. J. Phys. Chem. B 2013, 117, 4924.
(578) Payne, C. M.; Bomble, Y.; Taylor, C. B.; McCabe, C.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J. Biol. Chem. 2011, 286, 41028.
(579) Zakariassen, H.; Aam, B. B.; Horn, S. J.; Vårum, K. M.; Sørlie, M.; Eijsink, V. G. H. J. Biol. Chem. 2009, 284, 10610.
(580) Payne, C. M.; Jiang, W.; Shirts, M. R.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J. Am. Chem. Soc. 2013, 135, 18831.
(581) Holtzapple, M.; Cognata, M.; Shu, Y.; Hendrickson, C. Biotechnol. Bioeng. 1990, 36, 275.
(582) Gusakov, A. V.; Sinitsyn, A. P. Biotechnol. Bioeng. 1992, 40, 663.
(583) Pingali, S. V.; O’Neill, H. M.; McGaughey, J.; Urban, V. S.; Rempe, C. S.; Petridis, L.; Smith, J. C.; Evans, B. R.; Heller, W. T. J. Biol. Chem. 2011, 286, 32801.
(584) Andrić, P.; Meyer, A. S.; Jensen, P. A.; Dam-Johansen, K. Biotechnol. Adv. 2010, 28, 308.
(585) Gan, Q.; Allen, S. J.; Taylor, G. Biochem. Eng. J. 2002, 12, 223.
(586) Tengborg, C.; Galbe, M.; Zacchi, G. Enzyme Microb. Technol. 2001, 28, 835.
(587) Xiao, Z. Z.; Zhang, X.; Gregg, D. J.; Saddler, J. N. Appl. Biochem. Biotechnol. 2004, 113, 1115.
(588) Lee, Y. H.; Fan, L. T. Biotechnol. Bioeng. 1983, 25, 939.
(589) Andrić, P.; Meyer, A. S.; Jensen, P. A.; Dam-Johansen, K. Biotechnol. Adv. 2010, 28, 407.
(590) Gan, Q.; Allen, S. J.; Taylor, G. Process Biochem. 2003, 38, 1003.
(591) Gusakov, A. V.; Sinitsyn, A. P.; Klyosov, A. A. Biotechnol. Bioeng. 1987, 29, 906.
(592) Gavlighi, H. A.; Meyer, A. S.; Mikkelsen, J. D. Biotechnol. Lett. 2013, 35, 205.
(593) Igarashi, K.; Samejima, M.; Eriksson, K. E. L. Eur. J. Biochem. 1998, 253, 101.


(594) Dekker, R. F. H.; Wallis, A. F. A. Biotechnol. Bioeng. 1983, 25, 3027.
(595) Sternberg, D.; Vijayakumar, P.; Reese, E. T. Can. J. Microbiol. 1977, 23, 139.
(596) Bommarius, A. S.; Katona, A.; Cheben, S. E.; Patel, A. S.; Ragauskas, A. J.; Knudson, K.; Pu, Y. Metab. Eng. 2008, 10, 370.
(597) Teugjas, H.; Väljamäe, P. Biotechnol. Biofuels 2013, 6, 105.
(598) Halliwell, G.; Griffin, M. Biochem. J. 1973, 135, 587.
(599) Johnson, E. A.; Reese, E. T.; Demain, A. L. J. Appl. Biochem. 1982, 4, 64.
(600) Väljamäe, P.; Sild, V.; Pettersson, G.; Johansson, G. Eur. J. Biochem. 1998, 253, 469.
(601) Zhang, S.; Wolfgang, D. E.; Wilson, D. B. Biotechnol. Bioeng. 1999, 66, 35.
(602) Bezerra, R. M. F.; Dias, A. A. Appl. Biochem. Biotechnol. 2004, 112, 173.
(603) Oh, K. K.; Kim, S. W.; Jeong, Y. S.; Hong, S. I. Appl. Biochem. Biotechnol. 2000, 89, 15.
(604) Philippidis, G. P.; Smith, T. K.; Wyman, C. E. Biotechnol. Bioeng. 1993, 41, 846.
(605) Wald, S.; Wilke, C. R.; Blanch, H. W. Biotechnol. Bioeng. 1984, 26, 221.
(606) Claeyssens, M.; van Tilbeurgh, H.; Tomme, P.; Wood, T. M.; McRae, S. I. Biochem. J. 1989, 261, 819.
(607) van Tilbeurgh, H.; Claeyssens, M. FEBS Lett. 1985, 187, 283.
(608) Vonhoff, S.; Piens, K.; Pipelier, M.; Braet, C.; Claeyssens, M.; Vasella, A. Helv. Chim. Acta 1999, 82, 963.
(609) Hsu, T. A.; Gong, C. S.; Tsao, G. T. Biotechnol. Bioeng. 1980, 22, 2305.
(610) Nidetzky, B.; Zachariae, W.; Gercken, G.; Hayn, M.; Steiner, W. Enzyme Microb. Technol. 1994, 16, 43.
(611) Gruno, M.; Väljamäe, P.; Pettersson, G.; Johansson, G. Biotechnol. Bioeng. 2004, 86, 503.
(612) van Tilbeurgh, H.; Loontiens, F. G.; De Bruyne, C. K.; Claeyssens, M. Methods Enzymol. 1988, 160, 45.
(613) Du, F. Y.; Wolger, E.; Wallace, L.; Liu, A.; Kaper, T.; Kelemen, B. Appl. Biochem. Biotechnol. 2010, 161, 313.
(614) Teugjas, H.; Väljamäe, P. Biotechnol. Biofuels 2013, 6, 104.
(615) Murphy, L.; Bohlin, C.; Baumann, M. J.; Olsen, S. N.; Sørensen, T. H.; Anderson, L.; Borch, K.; Westh, P. Enzyme Microb. Technol. 2013, 52, 163.
(616) Bezerra, R. M. F.; Dias, A. A. Appl. Biochem. Biotechnol. 2005, 126, 49.
(617) Bezerra, R. M. F.; Dias, A. A.; Fraga, I.; Pereira, A. N. Appl. Biochem. Biotechnol. 2006, 134, 27.
(618) Bezerra, R. M. F.; Dias, A. A.; Fraga, I.; Pereira, A. N. Appl. Biochem. Biotechnol. 2011, 165, 178.
(619) Holtzapple, M. T.; Caram, H. S.; Humphrey, A. E. Biotechnol. Bioeng. 1984, 26, 753.
(620) Berlin, A.; Balakshin, M.; Gilkes, N.; Kadla, J.; Maximenko, V.; Kubo, S.; Saddler, J. J. Biotechnol. 2006, 125, 198.
(621) Kumar, R.; Wyman, C. E. Biotechnol. Bioeng. 2014, 111, 1341.
(622) Qing, Q.; Yang, B.; Wyman, C. E. Bioresour. Technol. 2010, 101, 9624.
(623) Kont, R.; Kurašin, M.; Teugjas, H.; Väljamäe, P. Biotechnol. Biofuels 2013, 6, 135.
(624) Dekker, R. F. H. Biotechnol. Bioeng. 1986, 28, 1438.
(625) Dale, M. P.; Ensley, H. E.; Kern, K.; Sastry, K. A. R.; Byers, L. D. Biochemistry 1985, 24, 3530.
(626) Cannella, D.; Hsieh, C.-w. C.; Felby, C.; Jørgensen, H. Biotechnol. Biofuels 2012, 5, 26.
(627) Cannella, D.; Jørgensen, H. Biotechnol. Bioeng. 2014, 111, 59.
(628) Krisch, J.; Bencsik, O.; Papp, T.; Vagvolgyi, C.; Tako, M. Bioresour. Technol. 2012, 114, 555.
(629) Christakopoulos, P.; Goodenough, P. W.; Kekos, D.; Macris, B. J.; Claeyssens, M.; Bhat, M. K. Eur. J. Biochem. 1994, 224, 379.
(630) Gao, L.; Gao, F.; Zhang, D. Y.; Zhang, C.; Wu, G. H.; Chen, S. L. Bioresour. Technol. 2013, 147, 658.
(631) He, H. Y.; Qin, Y. L.; Chen, G. G.; Li, N.; Liang, Z. Q. Appl. Biochem. Biotechnol. 2013, 169, 870.
(632) Bhikhabhai, R.; Pettersson, L. G. FEBS Lett. 1984, 167, 301.
(633) Saarelainen, R.; Paloheimo, M.; Fagerström, R.; Suominen, P. L.; Nevalainen, K. M. H. Mol. Gen. Genet. 1993, 241, 497.
(634) Xu, J.; Takakuwa, N.; Nogawa, M.; Okada, H.; Morikawa, Y. Appl. Microbiol. Biotechnol. 1998, 49, 718.
(635) Fischer, W. H.; Spiess, J. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 3628.
(636) Busby, W. H.; Quackenbush, G. E.; Humm, J.; Youngblood, W. W.; Kizer, J. J. Biol. Chem. 1987, 262, 8532.
(637) Hinke, S. A.; Pospisilik, J. A.; Demuth, H.-U.; Mannhart, S.; Kühn-Wache, K.; Hoffmann, T.; Nishimura, E.; Pederson, R. A.; McIntosh, C. H. J. Biol. Chem. 2000, 275, 3827.
(638) Van Coillie, E.; Proost, P.; Van Aelst, I.; Struyf, S.; Polfliet, M.; De Meester, I.; Harvey, D. J.; Van Damme, J.; Opdenakker, G. Biochemistry 1998, 37, 12672.
(639) Podell, D. N.; Abraham, G. N. Biochem. Biophys. Res. Commun. 1978, 81, 176.
(640) Dana, C. M.; Dotson-Fagerstrom, A.; Roche, C. M.; Kal, S. M.; Chokhawala, H. A.; Blanch, H. W.; Clark, D. S. Biotechnol. Bioeng. 2014, 111, 842.
(641) Dana, C. M.; Saija, P.; Kal, S. M.; Bryan, M. B.; Blanch, H. W.; Clark, D. S. Biotechnol. Bioeng. 2012, 109, 2710.
(642) Varki, A. Glycobiology 1993, 3, 97.
(643) Hart, G. W.; Copeland, R. J. Cell 2010, 143, 672.
(644) Maras, M.; DeBruyn, A.; Schraml, J.; Herdewijn, P.; Claeyssens, M.; Fiers, W.; Contreras, R. Eur. J. Biochem. 1997, 245, 617.
(645) Jeoh, T.; Michener, W.; Himmel, M. E.; Decker, S. R.; Adney, W. S. Biotechnol. Biofuels 2008, 1, 10.
(646) Stals, I.; Sandra, K.; Devreese, B.; Van Beeumen, J.; Claeyssens, M. Glycobiology 2004, 14, 725.
(647) Sheirneiss, G.; Montenecourt, B. S. Appl. Microbiol. Biotechnol. 1984, 20, 46.
(648) Adney, W. S.; Jeoh, T.; Beckham, G. T.; Chou, Y. C.; Baker, J. O.; Michener, W.; Brunecky, R.; Himmel, M. E. Cellulose 2009, 16, 699.
(649) Voutilainen, S. P.; Murray, P. G.; Tuohy, M. G.; Koivula, A. Protein Eng., Des. Sel. 2010, 23, 69.
(650) Heinzelman, P.; Komor, R.; Kanaan, A.; Romero, P.; Yu, X.; Mohler, S.; Snow, C.; Arnold, F. Protein Eng., Des. Sel. 2010, 23, 871.
(651) Komor, R. S.; Romero, P. A.; Xie, C. B.; Arnold, F. H. Protein Eng., Des. Sel. 2012, 25, 827.
(652) Smith, M. A.; Bedbrook, C. N.; Wu, T.; Arnold, F. H. ACS Synth. Biol. 2013, 2, 690.
(653) Ducros, V. M.-A.; Tarling, C. A.; Zechel, D. L.; Brzozowski, A. M.; Frandsen, T. P.; von Ossowski, I.; Schülein, M.; Withers, S. G.; Davies, G. J. Chem. Biol. 2003, 10, 619.
(654) Henrissat, B. Biochem. J. 1991, 280, 309.
(655) Mertz, B.; Kuczenski, R. S.; Larsen, R. T.; Hill, A. D.; Reilly, P. J. Biopolymers 2005, 79, 197.
(656) Koivula, A.; Ruohonen, L.; Wohlfahrt, G.; Reinikainen, T.; Teeri, T. T.; Piens, K.; Claeyssens, M.; Weber, M.; Vasella, A.; Becker, D.; Sinnott, M. L.; Zou, J.-Y.; Kleywegt, G. J.; Szardenings, M.; Ståhlberg, J.; Jones, T. A. J. Am. Chem. Soc. 2002, 124, 10015.
(657) Henrissat, B.; Teeri, T. T.; Warren, R. A. J. FEBS Lett. 1998, 425, 352.
(658) Spezio, M.; Wilson, D. B.; Karplus, P. A. Biochemistry 1993, 32, 9906.
(659) Varrot, A.; Hastrup, S.; Schülein, M.; Davies, G. J. Biochem. J. 1999, 304, 297.
(660) Meinke, A.; Damude, H. G.; Tomme, P.; Kwan, E.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J.; Gilkes, N. R. J. Biol. Chem. 1995, 270, 4383.
(661) Damude, H. G.; Withers, S. G.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J. Biochemistry 1995, 34, 2220.
(662) Varrot, A.; Schülein, M.; Davies, G. J. Biochemistry 1999, 38, 8884.
(663) Davies, G. J.; Brzozowski, A. M.; Dauter, M.; Varrot, A.; Schülein, M. Biochem. J. 2000, 348, 201.


(664) de Grotthuss, C. J. T. Ann. Chim. 1806, 58, 54.
(665) Varrot, A.; Frandsen, T. P.; von Ossowski, I.; Boyer, V.; Cottaz, S.; Driguez, H.; Schülein, M.; Davies, G. J. Structure 2003, 11, 855.
(666) Varrot, A.; Macdonald, J.; Stick, R. V.; Pell, G.; Gilbert, H. J.; Davies, G. J. Chem. Commun. 2003, 946.
(667) Varrot, A.; Leydier, S.; Pell, G.; Macdonald, J. M.; Stick, R. V.; Henrissat, B.; Gilbert, H. J.; Davies, G. J. J. Biol. Chem. 2005, 280, 20181.
(668) Yoshida, M.; Sato, K.; Kaneko, S.; Fukuda, K. Biosci. Biotechnol. Biochem. 2009, 73, 67.
(669) Liu, Y.; Igarashi, K.; Kaneko, S.; Tonozuka, T.; Samejima, M.; Fukuda, K.; Yoshida, M. Biosci. Biotechnol. Biochem. 2009, 73, 1432.
(670) Liu, Y.; Yoshida, M.; Kurakata, Y.; Miyazaki, T.; Igarashi, K.; Samejima, M.; Fukuda, K.; Nishikawa, A.; Tonozuka, T. FEBS J. 2010, 277, 1532.
(671) Tamura, M.; Miyazaki, T.; Tanaka, Y.; Yoshida, M.; Nishikawa, A.; Tonozuka, T. FEBS J. 2012, 279, 1871.
(672) Thompson, A. J.; Heu, T.; Shaghasi, T.; Benyamino, R.; Jones, A.; Friis, E. P.; Wilson, K. S.; Davies, G. J. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2012, 68, 875.
(673) Wang, X.-J.; Peng, Y.-J.; Zhang, L.-Q.; Li, A.-N.; Li, D.-C. Appl. Microbiol. Biotechnol. 2012, 95, 1469.
(674) Wu, I.; Arnold, F. H. Biotechnol. Bioeng. 2013, 110, 1874.
(675) Eijsink, V. G. H.; Bjørk, A.; Gåseidnes, S.; Sirevåg, R.; Synstad, B.; van den Burg, B.; Vriend, G. J. Biotechnol. 2004, 113, 105.
(676) Eijsink, V. G. H.; Gåseidnes, S.; Borchert, T. V.; van den Burg, B. Biomol. Eng. 2005, 22, 21.
(677) Matthews, B. W.; Nicholson, H.; Becktel, W. J. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 6663.
(678) Watanabe, K.; Masuda, T.; Ohashi, H.; Mihara, H.; Suzuki, Y. Eur. J. Biochem. 1994, 226, 277.
(679) Prajapati, R. S.; Das, M.; Sreeramulu, S.; Sirajuddin, M.; Srinivasan, S.; Krishnamurthy, V.; Ranjani, R.; Ramakrishnan, C.; Varadarajan, R. Proteins: Struct., Funct., Bioinf. 2007, 66, 480.
(680) Wolfgang, D. E.; Wilson, D. B. Biochemistry 1999, 38, 9746.
(681) André, G.; Kanchanawong, P.; Palma, R.; Cho, H.; Deng, X.; Irwin, D.; Himmel, M. E.; Wilson, D. B.; Brady, J. W. Protein Eng. 2003, 16, 125.
(682) Vuong, T. V.; Wilson, D. B. FEBS J. 2009, 276, 3837.
(683) Sandgren, M.; Wu, M.; Karkehabadi, S.; Mitchinson, C.; Kelemen, B. R.; Larenas, E. A.; Ståhlberg, J.; Hansson, H. J. Mol. Biol. 2013, 425, 622.
(684) Ruohonen, L.; Koivula, A.; Reinikainen, T.; Valkeajärvi, A.; Teleman, A.; Claeyssens, M.; Szardenings, M.; Jones, T. A.; Teeri, T. T. In Trichoderma reesei Cellulases and Other Hydrolases: Enzyme Structures, Biochemistry, Genetics and Applications; Suominen, P., Reinikainen, T., Eds.; Foundation for Biotechnical and Industrial Fermentation Research: Espoo, Finland, 1993; pp 87−96.
(685) Zhang, S.; Wilson, D. B. J. Biotechnol. 1997, 57, 101.
(686) Wu, M.; Bu, L.; Vuong, T. V.; Wilson, D. B.; Crowley, M. F.; Sandgren, M.; Ståhlberg, J.; Beckham, G. T.; Hansson, H. J. Biol. Chem. 2013, 288, 33107.
(687) Barr, B. K.; Wolfgang, D. E.; Piens, K.; Claeyssens, M.; Wilson, D. B. Biochemistry 1998, 37, 9220.
(688) Larsson, A. M.; Bergfors, T.; Dultz, E.; Irwin, D. C.; Roos, A.; Driguez, H.; Wilson, D. B.; Jones, T. A. Biochemistry 2005, 44, 12915.
(689) Dougherty, D. A. Science 1996, 271, 163.
(690) van Tilbeurgh, H.; Loontiens, F. G.; Engelborgs, Y.; Claeyssens, M. Eur. J. Biochem. 1989, 184, 553.
(691) Teleman, A.; Koivula, A.; Reinikainen, T.; Valkeajärvi, A.; Teeri, T. T.; Drakenberg, T.; Teleman, O. Eur. J. Biochem. 1995, 231, 250.
(692) Taylor, J. S.; Teo, B. T.; Wilson, D. B.; Brady, J. W. Protein Eng. 1995, 8, 1145.
(693) Harjunpää, V.; Teleman, A.; Koivula, A.; Ruohonen, L.; Teeri, T. T.; Teleman, O.; Drakenberg, T. Eur. J. Biochem. 1996, 240, 584.
(694) Konstantinidis, A. K.; Marsden, I.; Sinnott, M. L. Biochem. J. 1993, 291, 883.
(695) Damude, H. G.; Ferro, V.; Withers, S. G.; Warren, R. A. J. Biochem. J. 1996, 315, 467.
(696) Zhang, S.; Barr, B. K.; Wilson, D. B. Eur. J. Biochem. 2000, 267, 244.
(697) Calza, R. E.; Irwin, D. C.; Wilson, D. B. Biochemistry 1985, 24, 7797.
(698) Ghangas, G. S.; Wilson, D. B. Appl. Environ. Microbiol. 1988, 54, 2521.
(699) Amano, Y.; Shiroishi, M.; Nisizawa, K.; Hoshino, E.; Kanda, T. J. Biochem. 1996, 120, 1123.
(700) Quirk, A.; Lipkowski, J.; Vandenende, C.; Cockburn, D.; Clarke, A. J.; Dutcher, J. R.; Roscoe, S. G. Langmuir 2010, 26, 5007.
(701) Okada, G.; Nisizawa, K. J. Biochem. 1975, 78, 297.
(702) Okada, G.; Nisizawa, T.; Nisizawa, K. Biochem. J. 1966, 99, 214.
(703) Wood, T. M. Biochem. J. 1971, 121, 353.
(704) Wu, M.; Nerinckx, W.; Piens, K.; Ishida, T.; Hansson, H.; Sandgren, M.; Ståhlberg, J. FEBS J. 2013, 280, 184.
(705) Ai, Y.-C.; Wilson, D. B. Enzyme Microb. Technol. 2002, 30, 804.
(706) Ai, Y.-C.; Zhang, S.; Wilson, D. B. Enzyme Microb. Technol. 2003, 32, 331.
(707) Lantz, S. E.; Goedegebuur, F.; Hommes, R.; Kaper, T.; Kelemen, B. R.; Mitchinson, C.; Wallace, L.; Ståhlberg, J.; Larenas, E. A. Biotechnol. Biofuels 2010, 3, 20.
(708) Heinzelman, P.; Snow, C. D.; Smith, M. A.; Yu, X. L.; Kannan, A.; Boulware, K.; Villalobos, A.; Govindarajan, S.; Minshull, J.; Arnold, F. H. J. Biol. Chem. 2009, 284, 26229.
(709) Heinzelman, P.; Snow, C. D.; Wu, I.; Nguyen, C.; Villalobos, A.; Govindarajan, S.; Minshull, J.; Arnold, F. H. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 5610.
(710) Wu, I.; Heel, T.; Arnold, F. H. Biochim. Biophys. Acta 2013, 1834, 1539.
(711) Denman, S.; Xue, G.-P.; Patel, B. Appl. Environ. Microbiol. 1996, 62, 1889.
(712) Li, X.-L.; Chen, H.; Ljungdahl, L. G. Appl. Environ. Microbiol. 1997, 63, 4721.
(713) Wohlfahrt, G.; Pellikka, T.; Boer, H.; Teeri, T. T.; Koivula, A. Biochemistry 2003, 42, 10095.
(714) Chow, C.-M.; Yagüe, E.; Raguz, S.; Wood, D. A.; Thurston, C. F. Appl. Environ. Microbiol. 1994, 60, 2779.
(715) Emalfarb, M. A.; Burlingame, R. P.; Olson, P. T.; Sinitsyn, A. P.; Parriche, M.; Bousson, J. C.; Pynnonen, C. M.; Punt, P. J.; Van Zeijl, C. M. J. (Dyadic International, Inc.) Transformation System in the Field of Filamentous Fungal Hosts. U.S. Patent 6,573,086 B1, June 3, 2003.
(716) Moriya, T.; Watanabe, M.; Sumida, N.; Okakura, K.; Murakami, T. Biosci. Biotechnol. Biochem. 2003, 67, 1434.
(717) Dalbøge, H.; Heldt-Hansen, H. P. Mol. Gen. Genet. 1994, 243, 253.
(718) Toda, H.; Nagahata, N.; Amano, Y.; Nozaki, K.; Kanda, T.; Okazaki, M.; Shimosaka, M. Biosci. Biotechnol. Biochem. 2008, 72, 3142.
(719) Wu, W.; Lange, L.; Skovlund, D. A.; Liu, Y. (Novozymes A/S) Polypeptides Having Cellobiohydrolase II Activity and Polynucleotides Encoding Same. U.S. Patent 7,867,744 B2, Jan. 11, 2011.
(720) Wang, H.-C.; Chen, Y.-C.; Huang, C.-T.; Hseu, R.-S. Protein Expression Purif. 2013, 90, 153.
(721) Chen, H.; Li, X.-L.; Blum, D. L.; Ximenes, E. A.; Ljungdahl, L. G. Appl. Biochem. Biotechnol. 2003, 108, 775.
(722) Gao, L.; Wang, F.; Gao, F.; Wang, L.; Zhao, J.; Qu, Y. Bioresour. Technol. 2011, 102, 8339.
(723) Zhao, J.; Shi, P.; Li, Z.; Yang, P.; Luo, H.; Bai, Y.; Wang, Y.; Yao, B. Bioresour. Technol. 2012, 121, 404.
(724) Tsai, C.-F.; Qiu, X.; Liu, J.-H. Anaerobe 2003, 9, 131.
(725) Harhangi, H. R.; Freelove, A. C. J.; Ubhayasekera, W.; van Dinther, M.; Steenbakkers, P. J. M.; Akhmanova, A.; van der Drift, C.; Jetten, M. S. M.; Mowbray, S. L.; Gilbert, H. J.; Op den Camp, H. J. M. Biochim. Biophys. Acta, Gene Struct. Expression 2003, 1628, 30.
(726) Yamanobe, T.; Watanabe, M.; Hamaya, T.; Sumida, N.; Aoyagi, K.; Murakami, T. (Japan as represented by Director General of Agency of Industrial Science and Technology, Meiji Seika Kaisha Ltd.) Protein Having Cellulase Activities and Process for Producing the Same. U.S. Patent 6,127,160. Oct. 3, 2000.


(727) Brown, K.; Harris, P.; De Leon, A. L.; Merino, S. T. (Novozymes, Inc.) Polypeptides Having Cellobiohydrolase Activity and Polynucleotides Encoding Same. U.S. Patent 7,220,565, May 22, 2007.
(728) Shoemaker, S. P.; Brown, R. D., Jr. Biochim. Biophys. Acta 1978, 523, 133.
(729) Shoemaker, S. P.; Brown, R. D., Jr. Biochim. Biophys. Acta 1978, 523, 147.
(730) Jenkins, J.; Lo Leggio, L.; Harris, G.; Pickersgill, R. FEBS Lett. 1995, 362, 281.
(731) Pickersgill, R.; Harris, G.; Lo Leggio, L.; Mayans, O.; Jenkins, J. Biochem. Soc. Trans. 1998, 26, 190.
(732) Henrissat, B.; Bairoch, A. Biochem. J. 1996, 316, 695.
(733) Henrissat, B.; Callebaut, I.; Fabrega, S.; Lehn, P.; Mornon, J. P.; Davies, G. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 7090.
(734) Aspeborg, H.; Coutinho, P. M.; Wang, Y.; Brumer, H., III; Henrissat, B. BMC Evol. Biol. 2012, 12, 186.
(735) St John, F. J.; Gonzalez, J. M.; Pozharski, E. FEBS Lett. 2010, 584, 4435.
(736) Béguin, P. Annu. Rev. Microbiol. 1990, 44, 219.
(737) Hilge, M.; Gloor, S. M.; Rypniewski, W.; Sauer, O.; Heightman, T. D.; Zimmermann, W.; Winterhalter, K.; Piontek, K. Structure 1998, 6, 1433.
(738) Larsson, A. M.; Anderson, L.; Xu, B.; Muñoz, I. G.; Usón, I.; Janson, J.-C.; Stålbrand, H.; Ståhlberg, J. J. Mol. Biol. 2006, 357, 1500.
(739) Lo Leggio, L.; Parry, N. J.; Van Beeumen, J.; Claeyssens, M.; Bhat, M. K.; Pickersgill, R. W. Acta Crystallogr., Sect. D: Biol. Crystallogr. 1997, 53, 599.
(740) Lo Leggio, L.; Larsen, S. FEBS Lett. 2002, 523, 103.
(741) Ståhlberg, J.; Johansson, G.; Pettersson, G. Eur. J. Biochem. 1988, 173, 179.
(742) Suominen, P. L.; Mäntylä, A. L.; Karhunen, T.; Hakola, S.; Nevalainen, H. Mol. Gen. Genet. 1993, 241, 523.
(743) Dominguez, R.; Souchon, H.; Spinelli, S.; Dauter, Z.; Wilson, K. S.; Chauvaux, S.; Béguin, P.; Alzari, P. M. Nat. Struct. Biol. 1995, 2, 569.
(744) Reardon, D.; Farber, G. K. FASEB J. 1995, 9, 497.
(745) Copley, R. R.; Bork, P. J. Mol. Biol. 2000, 303, 627.
(746) Nagano, N.; Orengo, C. A.; Thornton, J. M. J. Mol. Biol. 2002, 321, 741.
(747) Wang, Q. P.; Tull, D.; Meinke, A.; Gilkes, N. R.; Warren, R. A. J.; Aebersold, R.; Withers, S. G. J. Biol. Chem. 1993, 268, 14096.
(748) Macarrón, R.; Acebal, C.; Castillón, M. P.; Dominguez, J. M.; de la Mata, I.; Pettersson, G.; Tomme, P.; Claeyssens, M. Biochem. J. 1993, 289, 867.
(749) Macarrón, R.; Henrissat, B.; Claeyssens, M. Biochim. Biophys. Acta, Gen. Subj. 1995, 1245, 187.
(750) Macarron, R.; Van Beeumen, J.; Henrissat, B.; de la Mata, I.; Claeyssens, M. FEBS Lett. 1993, 316, 137.
(751) Murzin, A. G.; Brenner, S. E.; Hubbard, T.; Chothia, C. J. Mol. Biol. 1995, 247, 536.
(752) Badieyan, S.; Bevan, D. R.; Zhang, C. Biotechnol. Bioeng. 2012, 109, 31.
(753) Van Petegem, F.; Vandenberghe, I.; Bhat, M. K.; Van Beeumen, J. Biochem. Biophys. Res. Commun. 2002, 296, 161.
(754) Ducros, V.; Czjzek, M.; Belaich, A.; Gaudin, C.; Fierobe, H. P.; Belaich, L. P.; Davies, G. J.; Haser, R. Structure 1995, 3, 939.
(755) Sakon, J.; Adney, W. S.; Himmel, M. E.; Thomas, S. R.; Karplus, P. A. Biochemistry 1996, 35, 10648.
(756) Barras, F.; Bortoligerman, I.; Bauzan, M.; Rouvier, J.; Gey, C.; Heyraud, A.; Henrissat, B. FEBS Lett. 1992, 300, 145.
(757) Gebler, J.; Gilkes, N. R.; Claeyssens, M.; Wilson, D. B.; Béguin, P.; Wakarchuk, W. W.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J.; Withers, S. G. J. Biol. Chem. 1992, 267, 12559.
(758) Baird, S. D.; Hefford, M. A.; Johnson, D. A.; Sung, W. L.; Yaguchi, M.; Seligy, V. L. Biochem. Biophys. Res. Commun. 1990, 169, 1035.
(759) Navas, J.; Béguin, P. Biochem. Biophys. Res. Commun. 1992, 189, 807.
(760) Py, B.; Bortoli-German, I.; Haiech, J.; Chippaux, M.; Barras, F. Protein Eng. 1991, 4, 325.
(761) Clarke, A. J.; Drummelsmith, J.; Yaguchi, M. FEBS Lett. 1997, 414, 359.
(762) Tseng, C.-W.; Ko, T.-P.; Guo, R.-T.; Huang, J.-W.; Wang, H.-C.; Huang, C.-H.; Cheng, Y.-S.; Wang, A. H. J.; Liu, J.-R. Acta Crystallogr., Sect. F: Struct. Biol. Cryst. Commun. 2011, 67, 1189.
(763) Liu, J.; Wang, X.; Xu, D. J. Phys. Chem. B 2010, 114, 1462.
(764) Saharay, M.; Guo, H.-B.; Smith, J. C.; Guo, H. In Computational Modeling in Lignocellulosic Biofuel Production; Nimlos, M. R., Crowley, M. F., Eds.; American Chemical Society: Washington, DC, 2010; pp 135−154.
(765) Medve, J.; Lee, D.; Tjerneld, F. J. Chromatogr. A 1998, 808, 153.
(766) Quiocho, F. A. Annu. Rev. Biochem. 1986, 55, 287.
(767) Aurora, R.; Rose, G. D. Protein Sci. 1998, 7, 21.
(768) Payne, C. M.; Baban, J.; Horn, S. J.; Backe, P. H.; Arvai, A. S.; Dalhus, B.; Bjørås, M.; Eijsink, V. G. H.; Sørlie, M.; Beckham, G. T.; Vaaje-Kolstad, G. J. Biol. Chem. 2012, 287, 36322.
(769) Matsumura, M.; Signor, G.; Matthews, B. W. Nature 1989, 342, 291.
(770) Vieille, C.; Zeikus, G. J. Microbiol. Mol. Biol. Rev. 2001, 65, 1.
(771) Volkin, D. B.; Klibanov, A. M. J. Biol. Chem. 1987, 262, 2945.
(772) Elcock, A. H. J. Mol. Biol. 1998, 284, 489.
(773) Hendsch, Z. S.; Jonsson, T.; Sauer, R. T.; Tidor, B. Biochemistry 1996, 35, 7621.
(774) Perutz, M. F.; Raidt, H. Nature 1975, 255, 256.
(775) Waldburger, C. D.; Schildbach, J. F.; Sauer, R. T. Nat. Struct. Mol. Biol. 1995, 2, 122.
(776) Liu, J. H.; Tsai, C. F.; Liu, J. W.; Cheng, K. J.; Cheng, C. L. Enzyme Microb. Technol. 2001, 28, 582.
(777) Lin, L.; Meng, X.; Liu, P.; Hong, Y.; Wu, G.; Huang, X.; Li, C.; Dong, J.; Xiao, L.; Liu, Z. Appl. Microbiol. Biotechnol. 2009, 82, 671.
(778) Liu, W.; Zhang, X.-Z.; Zhang, Z.; Zhang, Y. H. P. Appl. Environ. Microbiol. 2010, 76, 4914.
(779) Samanta, S.; Basu, A.; Halder, U. C.; Sen, S. K. J. Microbiol. 2012, 50, 518.
(780) Parry, N. J.; Beever, D. E.; Owen, E.; Nerinckx, W.; Claeyssens, M.; Van Beeumen, J.; Bhat, M. K. Arch. Biochem. Biophys. 2002, 404, 243.
(781) Pereira, J. H.; Chen, Z.; McAndrew, R. P.; Sapra, R.; Chhabra, S. R.; Sale, K. L.; Simmons, B. A.; Adams, P. D. J. Struct. Biol. 2010, 172, 372.
(782) Berghem, L. E. R.; Pettersson, L. G.; Axiofredriksson, U. B. Eur. J. Biochem. 1976, 61, 621.
(783) Okada, G. J. Biochem. 1975, 77, 33.
(784) Okada, G. J. Biochem. 1976, 80, 913.
(785) Selby, K.; Maitland, C. C. Biochem. J. 1967, 104, 716.
(786) Beldman, G.; Searle-Van Leeuwen, M. F.; Rombouts, F. M.; Voragen, F. G. J. Eur. J. Biochem. 1985, 146, 301.
(787) Beldman, G.; Voragen, A. G. J.; Rombouts, F. M.; Pilnik, W. Biotechnol. Bioeng. 1988, 31, 173.
(788) Vinzant, T. B.; Adney, W. S.; Decker, S. R.; Baker, J. O.; Kinter, M. T.; Sherman, N. E.; Fox, J. W.; Himmel, M. E. Appl. Biochem. Biotechnol. 2001, 91−93, 99.
(789) Claeyssens, M.; Aerts, G. Bioresour. Technol. 1992, 39, 143.
(790) Johnston, D. B.; Shoemaker, S. P.; Smith, G. M.; Whitaker, J. R. J. Food Biochem. 1998, 22, 301.
(791) Jäger, G.; Wu, Z.; Garschhammer, K.; Engel, P.; Klement, T.; Rinaldi, R.; Spiess, A. C.; Büchs, J. Biotechnol. Biofuels 2010, 3, 18.
(792) Nidetzky, B.; Steiner, W.; Claeyssens, M. Biochem. J. 1994, 303, 817.
(793) Palonen, H.; Tjerneld, F.; Zacchi, G.; Tenkanen, M. J. Biotechnol. 2004, 107, 65.
(794) Le Costaouëc, T.; Pakarinen, A.; Várnai, A.; Puranen, T.; Viikari, L. Bioresour. Technol. 2013, 143, 196.
(795) Cruys-Bagger, N.; Ren, G.; Tatsumi, H.; Baumann, M. J.; Spodsberg, N.; Andersen, H. D.; Gorton, L.; Borch, K.; Westh, P. Biotechnol. Bioeng. 2012, 109, 3199.
(796) Medve, J.; Karlsson, J.; Lee, D.; Tjerneld, F. Biotechnol. Bioeng. 1998, 59, 621.


(797) Karlsson, J.; Medve, J.; Tjerneld, F. Appl. Biochem. Biotechnol. 1999, 82, 243.
(798) Billard, H.; Faraj, A.; Lopes Ferreira, N.; Menir, S.; Heiss-Blanquet, S. Biotechnol. Biofuels 2012, 5, 9.
(799) Donnelly, M. K.; Moran-Mirabal, J. M.; Corgie, S. C.; Craighead, H. G.; Walker, L. P. Biophys. J. 2010, 98, 749A.
(800) Jeoh, T.; Wilson, D. B.; Walker, L. P. Biotechnol. Prog. 2006, 22, 270.
(801) Tambor, J. H.; Ren, H.; Ushinsky, S.; Zheng, Y.; Riemens, A.; St-Francois, C.; Tsang, A.; Powlowski, J.; Storms, R. Appl. Microbiol. Biotechnol. 2012, 93, 203.
(802) Baker, J. O.; Tatsumoto, K.; Grohmann, K.; Woodward, J.; Wichert, J. M.; Shoemaker, S. P.; Himmel, M. E. Appl. Biochem. Biotechnol. 1992, 34−35, 217.
(803) Li, C.; Knierim, B.; Manisseri, C.; Arora, R.; Scheller, H. V.; Auer, M.; Vogel, K. P.; Simmons, B. A.; Singh, S. Bioresour. Technol. 2010, 101, 4900.
(804) Wahlström, R.; Rovio, S.; Suurnäkki, A. RSC Adv. 2012, 2, 4472.
(805) Schülein, M.; Tikhomirov, D. F.; Schou, C. In Trichoderma reesei Cellulases and Other Hydrolases: Enzyme Structures, Biochemistry, Genetics, and Applications: Proceedings of the TRICEL93 Symposium, June 2−5, 1993, Espoo, Finland; Suominen, P., Reinikainen, T., Eds.; Foundation for Biotechnical and Industrial Fermentation Research: Helsinki, 1993; pp 109−116.
(806) Karlsson, J.; Momcilovic, D.; Wittgren, B.; Schülein, M.; Tjerneld, F.; Brinkmalm, G. Biopolymers 2002, 63, 32.
(807) Liu, D.; Zhang, R.; Yang, X.; Xu, Y.; Tang, Z.; Tian, W.; Shen, Q. Protein Expression Purif. 2011, 79, 176.
(808) Chikamatsu, G.; Shirai, K.; Kato, M.; Kobayashi, T.; Tsukagoshi, N. FEMS Microbiol. Lett. 1999, 175, 239.
(809) Li, C.-H.; Wang, H.-R.; Yan, T.-R. Molecules 2012, 17, 9774.
(810) Karnchanatat, A.; Petsom, A.; Sangvanicha, P.; Piapukiew, J.; Whalley, A. J. S.; Reynolds, C. D.; Gadd, G. M.; Sihanonth, P. Enzyme Microb. Technol. 2008, 42, 404.
(811) Shi, R.; Li, Z.; Ye, Q.; Xu, J.; Liu, Y. Bioresour. Technol. 2013, 142, 338.
(812) Yoon, J.-J.; Cha, C.-J.; Kim, Y.-S.; Son, D.-W.; Kim, Y.-K. J. Microbiol. Biotechnol. 2007, 17, 800.
(813) Yoon, J.-J.; Cha, C.-J.; Kim, Y.-S.; Kim, W. Biotechnol. Lett. 2008, 30, 1373.
(814) de Almeida, M. N.; Falkoski, D. L.; Guimaraes, V. M.; Ramos, H. J.; Visser, E. M.; Maitan-Alfenas, G. P.; de Rezende, S. T. Bioresour. Technol. 2013, 143, 413.
(815) Cohen, R.; Suzuki, M. R.; Hammel, K. E. Appl. Environ. Microbiol. 2005, 71, 2412.
(816) Kim, H. M.; Lee, Y. G.; Patel, D. H.; Lee, K. H.; Lee, D.-S.; Bae, H.-J. J. Ind. Microbiol. Biotechnol. 2012, 39, 1081.
(817) Takashima, S.; Nakamura, A.; Masaki, H.; Uozumi, T. Biosci. Biotechnol. Biochem. 1997, 61, 245.
(818) Fujino, Y.; Ogata, K.; Nagamine, T.; Ushida, K. Biosci. Biotechnol. Biochem. 1998, 62, 1795.
(819) Sun, J.; Phillips, C. M.; Anderson, C. T.; Beeson, W. T.; Marletta, M. A.; Glass, N. L. Protein Expression Purif. 2011, 75, 147.
(820) Qiu, X.; Selinger, B.; Yanke, L.; Cheng, K. Gene 2000, 245, 119.
(821) Krogh, K. B. R. M.; Kastberg, H.; Jørgensen, C. I.; Berlin, A.; Harris, P. V.; Olsson, L. Enzyme Microb. Technol. 2009, 44, 359.
(822) Chulkin, A. M.; Loginov, D. S.; Vavilova, E. A.; Abyanova, A. R.; Zorov, I. N.; Kurzeev, S. A.; Koroleva, O. V.; Benevolenskii, S. V. Biochemistry (Moscow) 2009, 74, 655.
(823) Liu, G.; Qin, Y.; Hu, Y.; Gao, M.; Peng, S.; Qu, Y. Enzyme Microb. Technol. 2013, 52, 190.
(824) Rubini, M. R.; Dillon, A. J.; Kyaw, C. M.; Faria, F. P.; Pocas-Fonseca, M. J.; Silva-Pereira, I. J. Appl. Microbiol. 2010, 108, 1187.
(825) Mernitz, G.; Koch, A.; Henrissat, B.; Schulz, G. Curr. Genet. 1996, 29, 490.
(826) Jeya, M.; Joo, A.-R.; Lee, K.-M.; Sim, W.-I.; Oh, D.-K.; Kim, Y.-S.; Kim, I.-W.; Lee, J.-K. Appl. Microbiol. Biotechnol. 2010, 85, 1005.
(828) Zhao, J.; Shi, P.; Huang, H.; Li, Z.; Yuan, T.; Yang, P.; Luo, H.; Bai, Y.; Yao, B. Appl. Microbiol. Biotechnol. 2012, 95, 947.
(829) Eberhardt, R. Y.; Gilbert, H. J.; Hazlewood, G. P. Microbiology 2000, 146, 1999.
(830) Murray, P. G.; Grassick, A.; Laffey, C. D.; Cuffe, M. M.; Higgins, T.; Savage, A. V.; Planas, A.; Tuohy, M. G. Enzyme Microb. Technol. 2001, 29, 90.
(831) Hong, J.; Tamaki, H.; Yamamoto, K.; Kumagai, H. Biotechnol. Lett. 2003, 25, 657.
(832) Nozaki, K.; Seki, T.; Matsui, K.; Mizuno, M.; Kanda, T.; Amano, Y. Biosci. Biotechnol. Biochem. 2007, 71, 2375.
(833) Sul, O. J.; Kim, J. H.; Park, S. J.; Son, Y. J.; Park, B. R.; Chung, D. K.; Jeong, C. S.; Han, I. S. Appl. Biochem. Biotechnol. 2004, 66, 63.
(834) Huang, X. M.; Yang, Q.; Liu, Z. H.; Fan, J. X.; Chen, X. L.; Song, J. Z.; Wang, Y. Appl. Biochem. Biotechnol. 2010, 162, 103.
(835) Ding, S. J.; Ge, W.; Buswell, J. A. Eur. J. Biochem. 2001, 268, 5687.
(836) Telke, A. A.; Zhuang, N.; Ghatge, S. S.; Lee, S.-H.; Shah, A. A.; Khan, H.; Um, Y.; Shin, H.-D.; Chung, Y. R.; Lee, K. H.; Kim, S.-W. PLoS One 2013, 8, e65727.
(837) Xiao-Zhou, Z.; Zhang, Y. H. P. Microb. Biotechnol. 2011, 4, 98.
(838) Qin, Y.; Wei, X.; Liu, X.; Wang, T.; Qu, Y. Protein Expression Purif. 2008, 58, 162.
(839) Banerjee, G.; Car, S.; Scott-Craig, J.; Hodge, D.; Walton, J. Biotechnol. Biofuels 2011, 4, 16.
(840) Li, Q.; Gao, Y.; Wang, H.; Li, B.; Liu, C.; Yu, G.; Mu, X. Bioresour. Technol. 2012, 125, 193.
(841) Banerjee, G.; Car, S.; Liu, T.; Williams, D. L.; Meza, S. L.; Walton, J. D.; Hodge, D. B. Biotechnol. Bioeng. 2012, 109, 922.
(842) Karp, E. M.; Donohoe, B. S.; O’Brien, M. H.; Ciesielski, P. N.; Mittal, A.; Biddy, M. J.; Beckham, G. T. ACS Sustainable Chem. Eng. 2014, 2, 1481.
(843) Wang, T.; Liu, X.; Yu, Q.; Zhang, X.; Qu, Y.; Gao, P.; Wang, T. Biomol. Eng. 2005, 22, 89.
(844) Wang, H.; Jones, R. W. Appl. Microbiol. Biotechnol. 1997, 48, 225.
(845) Baker, J. O.; McCarley, J. R.; Lovett, R.; Yu, C. H.; Adney, W. S.; Rignall, T. R.; Vinzant, T. B.; Decker, S. R.; Sakon, J.; Himmel, M. E. Appl. Biochem. Biotechnol. 2005, 121, 129.
(846) Nakazawa, H.; Okada, K.; Kobayashi, R.; Kubota, T.; Onodera, T.; Ochiai, N.; Omata, N.; Ogasawara, W.; Okada, H.; Morikawa, Y. Appl. Microbiol. Biotechnol. 2008, 81, 681.
(847) Håkansson, U.; Fägerstam, L.; Pettersson, G.; Andersson, L. Biochim. Biophys. Acta 1978, 524, 385.
(848) Ülker, A.; Sprey, B. FEMS Microbiol. Lett. 1990, 57, 215.
(849) Sprey, B.; Uelker, A. FEMS Microbiol. Lett. 1992, 71, 253.
(850) Hayn, M.; Klinger, R.; Esterbauer, H. In Trichoderma Reesei Cellulases and Other Hydrolases; Suominen, P., Reinikainen, T., Eds.; Foundation for Biotechnical and Industrial Fermentation Research: Helsinki, 1993; pp 147−151.
(851) Ward, M.; Wu, S.; Dauberman, J.; Weiss, G.; Larenas, E.; Bower, B.; Rey, M.; Clarkson, K.; Bott, R. The Tricell 93 Symposium; Espoo, Finland, 1993; pp 153−158.
(852) Okada, H.; Tada, K.; Sekiya, T.; Yokoyama, K.; Takahashi, A.; Tohda, H.; Kumagai, H.; Morikawa, Y. Appl. Environ. Microbiol. 1998, 64, 555.
(853) Goedegebuur, F.; Fowler, T.; Phillips, J.; van der Kley, P.; van Solingen, P.; Dankmeyer, L.; Power, S. D. Curr. Genet. 2002, 41, 89.
(854) Master, E. R.; Zheng, Y.; Storms, R.; Tsang, A.; Powlowski, J. Biochem. J. 2008, 411, 161.
(855) Song, B.-C.; Kim, K.-Y.; Yoon, J.-J.; Sim, S.-H.; Lee, K.; Kim, Y.-S.; Kim, Y.-K.; Cha, C.-J. J. Microbiol. Technol. 2008, 18, 404.
(856) Takeda, T.; Takahashi, M.; Nakanishi-Masuno, T.; Nakano, Y.; Saitoh, H.; Hirabuchi, A.; Fujisawa, S.; Terauchi, R. Appl. Microbiol. Biotechnol. 2010, 88, 1113.
(857) Zechel, D. L.; He, S.; Dupont, C.; Withers, S. G. Biochem. J.
(827) Uzcategui, E.; Johansson, G.; Ek, B.; Pettersson, G. J. Biotechnol. 1998, 336, 139.
1991, 21, 143. (858) Sprey, B.; Bochem, H.-P. FEMS Microbiol. Lett. 1992, 97, 113.


(859) Huang, Y.; Krauss, G.; Cottaz, S.; Driguez, H.; Lipps, G. Biochem. J. 2005, 385, 581.
(860) Yuan, S.; Wu, Y.; Cosgrove, D. J. Plant Physiol. 2001, 127, 324.
(861) Damasio, A. R.; Ribeiro, L. F.; Ribeiro, L. F.; Furtado, G. P.; Segato, F.; Almeida, F. B.; Crivellari, A. C.; Buckeridge, M. S.; Souza, T. A.; Murakami, M. T.; Ward, R. J.; Prade, R. A.; Polizeli, M. L. Biochim. Biophys. Acta 2012, 1824, 461.
(862) Gloster, T. M.; Ibatullin, F. M.; Macauley, K.; Eklöf, J. M.; Roberts, S.; Turkenburg, J. P.; Bjørnvad, M. E.; Jørgensen, P. L.; Danielsen, S.; Johansen, K. S.; Borchert, T. V.; Wilson, K. S.; Brumer, H.; Davies, G. J. J. Biol. Chem. 2007, 282, 19177.
(863) Wicher, K.; Abou-Hachem, M.; Halldórsdóttir, S.; Thorbjarnadóttir, S.; Eggertsson, G.; Hreggvidsson, G.; Nordberg Karlsson, E.; Holst, O. Appl. Microbiol. Biotechnol. 2001, 55, 578.
(864) Bok, J.-D.; Yernool, D. A.; Eveleigh, D. E. Appl. Environ. Microbiol. 1998, 64, 4774.
(865) Powlowski, J.; Mahajan, S.; Schapira, M.; Master, E. R. Carbohydr. Res. 2009, 344, 1175.
(866) Grishutin, S. G.; Gusakov, A. V.; Dzedzyulya, E. I.; Sinitsyn, A. P. Carbohydr. Res. 2006, 341, 218.
(867) Bauer, M. W.; Driskill, L. E.; Callen, W.; Snead, M. A.; Mathur, E. J.; Kelly, R. M. J. Bacteriol. 1999, 181, 284.
(868) Kim, H.; Ahn, J.-H.; Görlach, J. M.; Caprari, C.; Scott-Craig, J. S.; Walton, J. D. Mol. Plant-Microbe Interact. 2001, 14, 1436.
(869) Park, Y. B.; Cosgrove, D. J. Plant Physiol. 2012, 158, 1933.
(870) Okada, H.; Mori, K.; Tada, K.; Nogawa, M.; Morikawa, Y. J. Mol. Catal. B: Enzym. 2000, 10, 249.
(871) Sulzenbacher, G.; Shareck, F.; Morosoli, R.; Dupont, C.; Davies, G. J. Biochemistry 1997, 36, 16032.
(872) Sulzenbacher, G.; Mackenzie, L. F.; Wilson, K. S.; Withers, S. G.; Dupont, C.; Davies, G. J. Biochemistry 1999, 38, 4826.
(873) Keitel, T.; Simon, O.; Borriss, R.; Heinemann, U. Proc. Natl. Acad. Sci. U.S.A. 1993, 90, 5287.
(874) Kim, H.-W.; Kataoka, M.; Ishikawa, K. FEBS Lett. 2012, 586, 1009.
(875) Crennell, S. J.; Hreggvidsson, G. O.; Nordberg Karlsson, E. J. Mol. Biol. 2002, 320, 883.
(876) Sandgren, M.; Gualfetti, P. J.; Shaw, A.; Gross, L. S.; Saldajeno, M.; Day, A. G.; Jones, T. A.; Mitchinson, C. Protein Sci. 2003, 12, 848.
(877) Cheng, Y.-S.; Ko, T.-P.; Wu, T.-H.; Ma, Y.; Huang, C.-H.; Lai, H.-L.; Wang, A. H. J.; Liu, J.-R.; Guo, R.-T. Proteins: Struct., Funct., Bioinf. 2011, 79, 1193.
(878) Yoshizawa, T.; Shimizu, T.; Hirano, H.; Sato, M.; Hashimoto, H. J. Biol. Chem. 2012, 287, 18710.
(879) Khademi, S.; Zhang, D.; Swanson, S. M.; Wartenberg, A.; Witte, K.; Meyer, E. F. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2002, 58, 660.
(880) Sandgren, M.; Gualfetti, P. J.; Paech, C.; Paech, S.; Shaw, A.; Gross, L. S.; Saldajeno, M.; Berglund, G. I.; Jones, T. A.; Mitchinson, C. Protein Sci. 2003, 12, 2782.
(881) Prates, É. T.; Stankovic, I.; Silveira, R. L.; Liberato, M. V.; Henrique-Silva, F.; Pereira, N., Jr.; Polikarpov, I.; Skaf, M. S. PLoS One 2013, 8, e59069.
(882) Robeva, A.; Politi, V.; Shannon, J. D.; Bjarnason, J. B.; Fox, J. W. Biomed. Biochim. Acta 1991, 50, 769.
(883) Muilu, J.; Törrönen, A.; Peräkylä, M.; Rouvinen, J. Proteins: Struct., Funct., Bioinf. 1998, 31, 434.
(884) Crennell, S. J.; Cook, D.; Minns, A.; Svergun, D.; Andersen, R. L.; Nordberg Karlsson, E. J. Mol. Biol. 2006, 356, 57.
(885) Törrönen, A.; Harkki, A.; Rouvinen, J. EMBO J. 1994, 13, 2493.
(886) van Solingen, P.; Meijer, D.; van der Kleij, W. A. H.; Barnett, C.; Bolle, R.; Power, S. D.; Jones, B. E. Extremophiles 2001, 5, 333.
(887) Cheng, Y.-S.; Ko, T.-P.; Huang, J.-W.; Wu, T.-H.; Lin, C.-Y.; Luo, W.; Li, Q.; Ma, Y.; Huang, C.-H.; Wang, A. J.; Liu, J.-R.; Guo, R.-T. Appl. Microbiol. Biotechnol. 2012, 95, 661.
(888) Nakazawa, H.; Okada, K.; Onodera, T.; Ogasawara, W.; Okada, H.; Morikawa, Y. Appl. Microbiol. Biotechnol. 2009, 83, 649.
(889) Spilliaert, R.; Hreggvidsson, G. O.; Kristjansson, J. K.; Eggertsson, G.; Palsdottir, A. Eur. J. Biochem. 1994, 224, 923.
(890) Sandgren, M.; Ståhlberg, J.; Mitchinson, C. Prog. Biophys. Mol. Biol. 2005, 89, 246.
(891) Forse, G. J.; Ram, N.; Banatao, D. R.; Cascio, D.; Sawaya, M. R.; Klock, H. E.; Lesley, S. A.; Yeates, T. O. Protein Sci. 2011, 20, 168.
(892) Murao, S.; Sakamoto, R.; Arai, M. Methods Enzymol. 1988, 160, 274.
(893) Sakamoto, S.; Tamura, G.; Ito, K.; Ishikawa, T.; Iwano, K.; Nishiya, N. Curr. Genet. 1995, 27, 435.
(894) Van Den Broeck, H. C.; De Graaff, L. H.; Visser, J.; Van Ooijen, A. J. J. (Gist-Brocades, B.V.) Fungal Cellulases. U.S. Patent 6,190,890 B1, Feb. 20, 2001.
(895) Hasper, A. A.; Dekkers, E.; van Mil, M.; van de Vondervoort, P. J. I.; de Graaff, L. H. Appl. Environ. Microbiol. 2002, 68, 1556.
(896) Narra, M.; Dixit, G.; Divecha, J.; Kumar, K.; Madamwar, D.; Shah, A. R. Int. Biodeterior. Biodegrad. 2014, 88, 150.
(897) Shimokawa, T.; Shibuya, H.; Nojiri, M.; Yoshida, S.; Ishihara, M. Appl. Environ. Microbiol. 2008, 74, 5857.
(898) Henriksson, G.; Nutt, A.; Henriksson, H.; Pettersson, B.; Ståhlberg, J.; Johansson, G.; Pettersson, G. Eur. J. Biochem. 1999, 259, 88.
(899) Ishihara, H.; Imamura, K.; Kita, M.; Aimi, T.; Kitamoto, Y. Mycoscience 2005, 46, 148.
(900) Henrissat, B.; Bairoch, A. Biochem. J. 1993, 293, 781.
(901) Gilbert, H. J.; Hall, J.; Hazlewood, G. P.; Ferreira, L. M. A. Mol. Microbiol. 1990, 4, 759.
(902) Rasmussen, G.; Mikkelsen, J. M.; Schuelein, M.; Patkar, S. A.; Hagen, F.; Hjort, C. M.; Hastrup, S. (Novo Nordisk A/S) A Cellulase Preparation Comprising an Endoglucanase Enzyme. W.O. Patent 1991017243 A1, Nov. 14, 1991.
(903) Davies, G. J.; Dodson, G. G.; Hubbard, R. E.; Tolley, S. P.; Dauter, Z.; Wilson, K. S.; Hjort, C.; Mikkelsen, J. M.; Rasmussen, G.; Schülein, M. Nature 1993, 365, 362.
(904) Davies, G. J.; Tolley, S. P.; Henrissat, B.; Hjort, C.; Schülein, M. Biochemistry 1995, 34, 16210.
(905) Couturier, M.; Feliu, J.; Haon, M.; Navarro, D.; Lesage-Meessen, L.; Coutinho, P. M.; Berrin, J. G. Microb. Cell Fact. 2011, 10, 103.
(906) Baba, Y.; Shimonaka, A.; Koga, J.; Kubota, H.; Kono, T. J. Bacteriol. 2005, 187, 3045.
(907) Yang, J. C.; Madupu, R.; Durkin, A. S.; Ekborg, N. A.; Pedamallu, C. S.; Hostetler, J. B.; Radune, D.; Toms, B. S.; Henrissat, B.; Coutinho, P. M.; Schwarz, S.; Field, L.; Trindade-Silva, A. E.; Soares, C. A. G.; Elshahawi, S.; Hanora, A.; Schmidt, E. W.; Haygood, M. G.; Posfai, J.; Benner, J.; Madinger, C.; Nove, J.; Anton, B.; Chaudhary, K.; Foster, J.; Holman, A.; Kumar, S.; Lessard, P. A.; Luyten, Y. A.; Slatko, B.; Wood, N.; Wu, B.; Teplitski, M.; Mougous, J. D.; Ward, N.; Eisen, J. A.; Badger, J. H.; Distel, D. L. PLoS One 2009, 4, e6085.
(908) Igarashi, K.; Ishida, T.; Hori, C.; Samejima, M. Appl. Environ. Microbiol. 2008, 74, 5628.
(909) Sakamoto, K.; Toyohara, H. Comp. Biochem. Physiol., Part B: Biochem. Mol. Biol. 2009, 152, 390.
(910) Sievers, F.; Wilm, A.; Dineen, D.; Gibson, T. J.; Karplus, K.; Li, W.; Lopez, R.; McWilliam, H.; Remmert, M.; Söding, J.; Thompson, J. D.; Higgins, D. G. Mol. Syst. Biol. 2011, 7, 539.
(911) Okonechnikov, K.; Golosova, O.; Fursov, M.; UGENE Team. Bioinformatics 2012, 28, 1166.
(912) Valjakka, J.; Rouvinen, J. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2003, 59, 765.
(913) Hirvonen, M.; Papageorgiou, A. C. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2002, 58, 336.
(914) Hirvonen, M.; Papageorgiou, A. C. J. Mol. Biol. 2003, 329, 403.
(915) Takashima, S.; Iikura, H.; Nakamura, A.; Hidaka, M.; Masaki, H.; Uozumi, T. J. Biotechnol. 1999, 67, 85.
(916) Xu, B. Z.; Hellman, U.; Ersson, B.; Janson, J. C. Eur. J. Biochem. 2000, 267, 4970.
(917) Xu, B. Z. Endoglucanase and Mannanase from Blue Mussel, Mytilus edulis: Purification, Characterization, Gene and Three Dimensional Structure. Ph.D. Thesis, Center for Surface Biotechnology, Uppsala Biomedical Center, Uppsala University, Uppsala, Sweden, 2002.


(918) Davies, G. J.; Dodson, G.; Moore, M. H.; Tolley, S. P.; Dauter, Z.; Wilson, K. S.; Rasmussen, G.; Schülein, M. Acta Crystallogr., Sect. D: Biol. Crystallogr. 1996, 52, 7.
(919) Castillo, R. M.; Mizuguchi, K.; Dhanaraj, V.; Albert, A.; Blundell, T. L.; Murzin, A. G. Structure 1999, 7, 227.
(920) Brock, V.; Kennedy, V. S. J. Exp. Mar. Biol. Ecol. 1992, 159, 51.
(921) Liu, G.; Wei, X.; Qin, Y.; Qu, Y. J. Gen. Appl. Microbiol. 2010, 56, 223.
(922) Schülein, M.; Henriksen, T.; Lassen, S. F.; Kauppinen, M. S. (Novozymes A/S) Endoglucanases. U.S. Patent 6,855,531 B2, Feb. 15, 2005.
(923) Dalboege, H.; Diderichsen, B.; Sandal, T.; Kauppinen, S. (Novozymes A/S) Method of Providing Novel DNA Sequences. W.O. Patent 1997043409 A3, Feb. 26, 1998.
(924) Shimonaka, A.; Baba, Y.; Koga, J.; Nakane, A.; Kubota, H.; Kono, T. Biosci. Biotechnol. Biochem. 2004, 68, 2299.
(925) Murashima, K.; Nishimura, T.; Nakamura, Y.; Koga, J.; Moriya, T.; Sumida, N.; Yaguchi, T.; Kono, T. Enzyme Microb. Technol. 2002, 30, 319.
(926) Moriya, T.; Murashima, K.; Nakane, A.; Yanai, K.; Sumida, N.; Koga, J.; Murakami, T.; Kono, T. J. Bacteriol. 2003, 185, 1749.
(927) Koga, J.; Baba, Y.; Shimonaka, A.; Nishimura, T.; Hanamura, S.; Kono, T. Appl. Environ. Microbiol. 2008, 74, 4210.
(928) Wonganu, B.; Pootanakit, K.; Boonyapakron, K.; Champreda, V.; Tanapongpipat, S.; Eurwilaichitr, L. Protein Expression Purif. 2008, 58, 78.
(929) Schauwecker, F.; Wanner, G.; Kahmann, R. Biol. Chem. Hoppe-Seyler 1995, 376, 617.
(930) Cosgrove, D. J. Nature 2000, 407, 321.
(931) McQueen-Mason, S.; Cosgrove, D. J. Proc. Natl. Acad. Sci. U.S.A. 1994, 91, 6574.
(932) McQueen-Mason, S. J.; Cosgrove, D. J. Plant Physiol. 1995, 107, 87.
(933) Wang, T.; Park, Y. B.; Caporini, M. A.; Rosay, M.; Zhong, L. H.; Cosgrove, D. J.; Hong, M. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 16444.
(934) Georgelis, N.; Tabuchi, A.; Nikolaidis, N.; Cosgrove, D. J. J. Biol. Chem. 2011, 286, 16814.
(935) Georgelis, N.; Yennawar, N. H.; Cosgrove, D. J. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 14830.
(936) Kim, I. J.; Ko, H.-J.; Kim, T.-W.; Choi, I.-G.; Kim, K. H. Biotechnol. Bioeng. 2013, 110, 401.
(937) Kerff, F.; Amoros, A.; Herman, R.; Sauvage, E.; Petrella, S.; Filée, P.; Charlier, P.; Joris, B.; Tabuchi, A.; Nikolaidis, N.; Cosgrove, D. J. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 16876.
(938) Yennawar, N. H.; Li, L.-C.; Dudzinski, D. M.; Tabuchi, A.; Cosgrove, D. J. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 14664.
(939) Cosgrove, D. J. Plant Physiol. 1998, 118, 333.
(940) Darley, C. P.; Forrester, A. M.; McQueen-Mason, S. J. Plant Mol. Biol. 2001, 47, 179.
(941) Cosgrove, D. J. Nat. Rev. Mol. Cell Biol. 2005, 6, 850.
(942) Lipchinsky, A. Acta Physiol. Plant. 2013, 35, 3277.
(943) Kim, E. S.; Lee, H. J.; Bang, W. G.; Choi, I. G.; Kim, K. H. Biotechnol. Bioeng. 2009, 102, 1342.
(944) Wei, W.; Yang, C.; Luo, J.; Lu, C.; Wu, Y.; Yuan, S. J. Plant Physiol. 2010, 167, 1204.
(945) Lin, H.; Shen, Q.; Zhan, J. M.; Wang, Q.; Zhao, Y. H. PLoS One 2013, 8, e75022.
(946) Kang, K.; Wang, S.; Lai, G.; Liu, G.; Xing, M. BMC Biotechnol. 2013, 13, 42.
(947) Georgelis, N.; Nikolaidis, N.; Cosgrove, D. J. Carbohydr. Polym. 2014, 100, 17.
(948) Lee, H. J.; Kim, I. J.; Kim, J. F.; Choi, I. G.; Kim, K. H. Bioresour. Technol. 2013, 149, 516.
(949) Li, Y.; Jones, L.; McQueen-Mason, S. Curr. Opin. Plant Biol. 2003, 6, 603.
(950) Lee, Y.; Choi, D.; Kende, H. Curr. Opin. Plant Biol. 2001, 4, 527.
(951) Cosgrove, D. J. Curr. Opin. Plant Biol. 2000, 3, 73.
(952) Kende, H.; Bradford, K. J.; Brummell, D. A.; Cho, H. T.; Cosgrove, D. J.; Fleming, A. J.; Gehring, C.; Lee, Y.; McQueen-Mason, S.; Rose, J. K. C.; Voesenek, L. Plant Mol. Biol. 2004, 55, 311.
(953) Darley, C. P.; Li, Y.; Schaap, P.; McQueen-Mason, S. J. FEBS Lett. 2003, 546, 416.
(954) Qin, L.; Kudla, U.; Roze, E. H. A.; Goverse, A.; Popeijus, H.; Nieuwland, J.; Overmars, H.; Jones, J. T.; Schots, A.; Smant, G.; Bakker, J.; Helder, J. Nature 2004, 427, 30.
(955) Brotman, Y.; Briff, E.; Viterbo, A.; Chet, I. Plant Physiol. 2008, 147, 779.
(956) Jäger, G.; Girfoglio, M.; Dollo, F.; Rinaldi, R.; Bongard, H.; Commandeur, U.; Fischer, R.; Spiess, A. C.; Büchs, J. Biotechnol. Biofuels 2011, 4, 1.
(957) Chen, X.-a.; Ishida, N.; Todaka, N.; Nakamura, R.; Maruyama, J.-i.; Takahashi, H.; Kitamoto, K. Appl. Environ. Microbiol. 2010, 76, 2556.
(958) Wang, T.-Y.; Chen, H.-L.; Lu, M.-Y. J.; Chen, Y.-C.; Sung, H.-M.; Mao, C.-T.; Cho, H.-Y.; Ke, H.-M.; Hwa, T.-Y.; Ruan, S.-K.; Hung, K.-Y.; Chen, C.-K.; Li, J.-Y.; Wu, Y.-C.; Chen, Y.-H.; Chou, S.-P.; Tsai, Y.-W.; Chu, T.-C.; Shih, C.-C. A.; Li, W.-H.; Shih, M.-C. Biotechnol. Biofuels 2011, 4, 24.
(959) Zhou, Q.; Lv, X.; Zhang, X.; Meng, X.; Chen, G.; Liu, W. World J. Microbiol. Biotechnol. 2011, 27, 1905.
(960) Gourlay, K.; Hu, J.; Arantes, V.; Andberg, M.; Saloheimo, M.; Penttilä, M.; Saddler, J. Bioresour. Technol. 2013, 142, 498.
(961) Quiroz-Castañeda, R. E.; Martínez-Anaya, C.; Cuervo-Soto, L. I.; Segovia, L.; Folch-Mallol, J. L. Microb. Cell Fact. 2011, 10, 8.
(962) Bouzarelou, D.; Billini, M.; Roumelioti, K.; Sophianopoulou, V. Fungal Genet. Biol. 2008, 45, 839.
(963) van Straaten, K. E.; Dijkstra, B. W.; Vollmer, W.; Thunnissen, A.-M. W. H. J. Mol. Biol. 2005, 352, 1068.
(964) Beckham, G. T.; Crowley, M. F. J. Phys. Chem. B 2011, 115, 4516.
(965) Horn, S. J.; Vaaje-Kolstad, G.; Westereng, B.; Eijsink, V. G. H. Biotechnol. Biofuels 2012, 5, 45.
(966) Vaaje-Kolstad, G.; Horn, S. J.; van Aalten, D. M. F.; Synstad, B.; Eijsink, V. G. H. J. Biol. Chem. 2005, 280, 28492.
(967) Harris, P. V.; Welner, D.; McFarland, K. C.; Re, E.; Poulsen, J. C. N.; Brown, K.; Salbo, R.; Ding, H. S.; Vlasenko, E.; Merino, S.; Xu, F.; Cherry, J.; Larsen, S.; Lo Leggio, L. Biochemistry 2010, 49, 3305.
(968) Wymelenberg, A. V.; Gaskell, J.; Mozuch, M.; Sabat, G.; Ralph, J.; Skyba, O.; Mansfield, S. D.; Blanchette, R. A.; Martinez, D.; Grigoriev, I.; Kersten, P. J.; Cullen, D. Appl. Environ. Microbiol. 2010, 76, 3599.
(969) Yakovlev, I.; Vaaje-Kolstad, G.; Hietala, A. M.; Stefanczyk, E.; Solheim, H.; Fossdal, C. G. Appl. Microbiol. Biotechnol. 2012, 95, 979.
(970) Schnellmann, J.; Zeltins, A.; Blaak, H.; Schrempf, H. Mol. Microbiol. 1994, 13, 807.
(971) Li, Z.; Li, C.; Yang, K.; Wang, L.; Yin, C.; Gong, Y.; Pang, Y. Virus Res. 2003, 96, 113.
(972) Vaaje-Kolstad, G.; Houston, D. R.; Riemen, A. H. K.; Eijsink, V. G. H.; van Aalten, D. M. F. J. Biol. Chem. 2005, 280, 11313.
(973) Moser, F.; Irwin, D.; Chen, S. L.; Wilson, D. B. Biotechnol. Bioeng. 2008, 100, 1066.
(974) Eijsink, V. G. H.; Vaaje-Kolstad, G.; Vårum, K. M.; Horn, S. J. Trends Biotechnol. 2008, 26, 228.
(975) Karlsson, J.; Saloheimo, M.; Siika-aho, M.; Tenkanen, M.; Penttilä, M.; Tjerneld, F. Eur. J. Biochem. 2001, 268, 6498.
(976) Koseki, T.; Mese, Y.; Fushinobu, S.; Masaki, K.; Fujii, T.; Ito, K.; Shiono, Y.; Murayama, T.; Iefuji, H. Appl. Microbiol. Biotechnol. 2008, 77, 1279.
(977) Saloheimo, M.; Nakari-Setälä, T.; Tenkanen, M.; Penttilä, M. Eur. J. Biochem. 1997, 249, 584.
(978) Merino, S.; Cherry, J. In Biofuels; Olsson, L., Ed.; Springer: Berlin, 2007; pp 95−120.
(979) Forsberg, Z.; Vaaje-Kolstad, G.; Westereng, B.; Bunæs, A. C.; Stenstrøm, Y.; MacKenzie, A.; Sørlie, M.; Horn, S. J.; Eijsink, V. G. H. Protein Sci. 2011, 20, 1479.


(980) Vaaje-Kolstad, G.; Bøhle, L. A.; Gåseidnes, S.; Dalhus, B.; Bjørås, M.; Mathiesen, G.; Eijsink, V. G. H. J. Mol. Biol. 2012, 416, 239.
(981) Westereng, B.; Ishida, T.; Vaaje-Kolstad, G.; Wu, M.; Eijsink, V. G. H.; Igarashi, K.; Samejima, M.; Ståhlberg, J.; Horn, S. J.; Sandgren, M. PLoS One 2011, 6, e27807.
(982) Henriksson, G.; Johansson, G.; Pettersson, G. J. Biotechnol. 2000, 78, 93.
(983) Henriksson, G.; Ander, P.; Pettersson, B.; Pettersson, G. Appl. Microbiol. Biotechnol. 1995, 42, 790.
(984) Hallberg, B. M.; Bergfors, T.; Bäckbro, K.; Pettersson, G.; Henriksson, G.; Divne, C. Structure 2000, 8, 79.
(985) Hallberg, B. M.; Henriksson, G.; Pettersson, G.; Divne, C. J. Mol. Biol. 2002, 315, 421.
(986) Igarashi, K.; Momohara, I.; Nishino, T.; Samejima, M. Biochem. J. 2002, 365, 521.
(987) Igarashi, K.; Yoshida, M.; Matsumura, H.; Nakamura, N.; Ohno, H.; Samejima, M.; Nishino, T. FEBS J. 2005, 272, 2869.
(988) Tickler, A. K.; Smith, D. G.; Ciccotosto, G. D.; Tew, D. J.; Curtain, C. C.; Carrington, D.; Masters, C. L.; Bush, A. I.; Cherny, R. A.; Cappai, R.; Wade, J. D.; Barnham, K. J. J. Biol. Chem. 2005, 280, 13355.
(989) Paiva, A. C. M.; Juliano, L.; Boschcov, P. J. Am. Chem. Soc. 1976, 98, 7645.
(990) Aachmann, F. L.; Sørlie, M.; Skjåk-Bræk, G.; Eijsink, V. G. H.; Vaaje-Kolstad, G. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 18779.
(991) Hemsworth, G. R.; Taylor, E. J.; Kim, R. Q.; Gregory, R. C.; Lewis, S. J.; Turkenburg, J. P.; Parkin, A.; Davies, G. J.; Walton, P. H. J. Am. Chem. Soc. 2013, 135, 6069.
(992) Bey, M.; Zhou, S.; Poidevin, L.; Henrissat, B.; Coutinho, P. M.; Berrin, J.-G.; Sigoillot, J.-C. Appl. Environ. Microbiol. 2013, 79, 488.
(993) Wu, M.; Beckham, G. T.; Larsson, A. M.; Ishida, T.; Kim, S.; Payne, C. M.; Himmel, M. E.; Crowley, M. F.; Horn, S. J.; Westereng, B.; Igarashi, K.; Samejima, M.; Ståhlberg, J.; Eijsink, V. G. H.; Sandgren, M. J. Biol. Chem. 2013, 288, 12828.
(994) Floudas, D.; Binder, M.; Riley, R.; Barry, K.; Blanchette, R. A.; Henrissat, B.; Martinez, A. T.; Otillar, R.; Spatafora, J. W.; Yadav, J. S.; Aerts, A.; Benoit, I.; Boyd, A.; Carlson, A.; Copeland, A.; Coutinho, P. M.; de Vries, R. P.; Ferreira, P.; Findley, K.; Foster, B.; Gaskell, J.; Glotzer, D.; Gorecki, P.; Heitman, J.; Hesse, C.; Hori, C.; Igarashi, K.; Jurgens, J. A.; Kallen, N.; Kersten, P.; Kohler, A.; Kues, U.; Kumar, T. K. A.; Kuo, A.; LaButti, K.; Larrondo, L. F.; Lindquist, E.; Ling, A.; Lombard, V.; Lucas, S.; Lundell, T.; Martin, R.; McLaughlin, D. J.; Morgenstern, I.; Morin, E.; Murat, C.; Nagy, L. G.; Nolan, M.; Ohm, R. A.; Patyshakuliyeva, A.; Rokas, A.; Ruiz-Duenas, F. J.; Sabat, G.; Salamov, A.; Samejima, M.; Schmutz, J.; Slot, J. C.; John, F. S.; Stenlid, J.; Sun, H.; Sun, S.; Syed, K.; Tsang, A.; Wiebenga, A.; Young, D.; Pisabarro, A.; Eastwood, D. C.; Martin, F.; Cullen, D.; Grigoriev, I. V.; Hibbett, D. S. Science 2012, 336, 1715.
(995) Li, X.; Beeson, W. T.; Phillips, C. M.; Marletta, M. A.; Cate, J. H. D. Structure 2012, 20, 1051.
(996) Dietzel, P. D. C.; Kremer, R. K.; Jansen, M. J. Am. Chem. Soc. 2004, 126, 4689.
(997) Vu, V. V.; Beeson, W. T.; Phillips, C. M.; Cate, J. H. D.; Marletta, M. A. J. Am. Chem. Soc. 2014, 136, 562.
(998) Gudmundsson, M.; Kim, S.; Wu, M.; Ishida, T.; Haddad Momeni, M.; Vaaje-Kolstad, G.; Lundberg, D.; Royant, A.; Ståhlberg, J.; Eijsink, V. G. H.; Beckham, G. T.; Sandgren, M. J. Biol. Chem. 2014, in press.
(999) Isaksen, T.; Westereng, B.; Aachmann, F. L.; Agger, J. W.; Kracher, D.; Kittl, R.; Ludwig, R.; Haltrich, D.; Eijsink, V. G. H.; Horn, S. J. J. Biol. Chem. 2014, 289, 2632.
(1000) Agger, J. W.; Isaksen, T.; Várnai, A.; Vidal-Melgosa, S.; Willats, W. G. T.; Ludwig, R.; Horn, S. J.; Eijsink, V. G. H.; Westereng, B. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 6287.
(1001) Vu, V. V.; Beeson, W. T.; Span, E. A.; Farquhar, E. R.; Marletta, M. A. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 13822.
(1002) Hemsworth, G. R.; Davies, G. J.; Walton, P. H. Curr. Opin. Struct. Biol. 2013, 23, 660.
(1003) Kim, S.; Ståhlberg, J.; Sandgren, M.; Paton, R. S.; Beckham, G. T. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 149.
(1004) Tantillo, D. J.; Chen, J. G.; Houk, K. N. Curr. Opin. Chem. Biol. 1998, 2, 743.
(1005) Gherman, B. F.; Tolman, W. B.; Cramer, C. J. J. Comput. Chem. 2006, 27, 1950.
(1006) Huber, S. M.; Ertem, M. Z.; Aquilante, F.; Gagliardi, L.; Tolman, W. B.; Cramer, C. J. Chem. Eur. J. 2009, 15, 4886.
(1007) Schroder, D.; Holthausen, M. C.; Schwarz, H. J. Phys. Chem. B 2004, 108, 14407.
(1008) Comba, P.; Knoppe, S.; Martin, B.; Rajaraman, G.; Rolli, C.; Shapiro, B.; Stork, T. Chem. Eur. J. 2008, 14, 344.
(1009) Himes, R. A.; Karlin, K. D. Curr. Opin. Chem. Biol. 2009, 13, 119.
(1010) Kunishita, A.; Teraoka, J.; Scanlon, J. D.; Matsumoto, T.; Suzuki, M.; Cramer, C. J.; Itoh, S. J. Am. Chem. Soc. 2007, 129, 7248.
(1011) Klinman, J. P. Chem. Rev. 1996, 96, 2541.
(1012) Solomon, E. I.; Sundaram, U. M.; Machonkin, T. E. Chem. Rev. 1996, 96, 2563.
(1013) Aboelella, N. W.; Kryatov, S. V.; Gherman, B. F.; Brennessel, W. W.; Young, V. G.; Sarangi, R.; Rybak-Akimova, E. V.; Hodgson, K. O.; Hedman, B.; Solomon, E. I.; Cramer, C. J.; Tolman, W. B. J. Am. Chem. Soc. 2004, 126, 16896.
(1014) Chen, P.; Solomon, E. I. J. Am. Chem. Soc. 2004, 126, 4991.
(1015) Klinman, J. P. J. Biol. Chem. 2006, 281, 3013.
(1016) Gherman, B. F.; Heppner, D. E.; Tolman, W. B.; Cramer, C. J. JBIC, J. Biol. Inorg. Chem. 2006, 11, 197.
(1017) Maiti, D.; Fry, H. C.; Woertink, J. S.; Vance, M. A.; Solomon, E. I.; Karlin, K. D. J. Am. Chem. Soc. 2007, 129, 264.
(1018) Cramer, C. J.; Tolman, W. B. Acc. Chem. Res. 2007, 40, 601.
(1019) Cramer, C. J.; Gour, J. R.; Kinal, A.; Wtoch, M.; Piecuch, P.; Shahi, A. R. M.; Gagliardi, L. J. Phys. Chem. A 2008, 112, 3754.
(1020) Itoh, S. In Copper-Oxygen Chemistry; John Wiley & Sons, Inc.: Hoboken, NJ, 2011; pp 225−282.
(1021) Osborne, R. L.; Klinman, J. P. In Copper-Oxygen Chemistry; John Wiley & Sons, Inc.: Hoboken, NJ, 2011; pp 1−22.
(1022) Peterson, R. L.; Himes, R. A.; Kotani, H.; Suenobu, T.; Tian, L.; Siegler, M. A.; Solomon, E. I.; Fukuzumi, S.; Karlin, K. D. J. Am. Chem. Soc. 2011, 133, 1702.
(1023) Suess, A. M.; Ertem, M. Z.; Cramer, C. J.; Stahl, S. S. J. Am. Chem. Soc. 2013, 135, 9797.
(1024) Hemsworth, G. R.; Henrissat, B.; Davies, G. J.; Walton, P. H. Nat. Chem. Biol. 2013, 10, 122.
(1025) Dimarogona, M.; Topakas, E.; Olsson, L.; Christakopoulos, P. Bioresour. Technol. 2012, 110, 480.
(1026) Sygmund, C.; Kracher, D.; Scheiblbrandner, S.; Zahma, K.; Felice, A. K. G.; Harreither, W.; Kittl, R.; Ludwig, R. Appl. Environ. Microbiol. 2012, 78, 6161.
(1027) Solomon, E. I.; Heppner, D. E.; Johnston, E. M.; Ginsbach, J. W.; Cirera, J.; Qayyum, M.; Kieber-Emmons, M. T.; Kjaergaard, C. H.; Hadt, R. G.; Tian, L. Chem. Rev. 2014, 114, 3659.
(1028) Nakagawa, Y. S.; Eijsink, V. G. H.; Totani, K.; Vaaje-Kolstad, G. J. Agric. Food Chem. 2013, 61, 11061.
(1029) Forsberg, Z.; Mackenzie, A. K.; Sørlie, M.; Røhr, Å. K.; Helland, R.; Arvai, A. S.; Vaaje-Kolstad, G.; Eijsink, V. G. H. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 8446.
(1030) Suga, K.; Vandedem, G.; Mooyoung, M. Biotechnol. Bioeng. 1975, 17, 433.
(1031) Okazaki, M.; Mooyoung, M. Biotechnol. Bioeng. 1978, 20, 637.
(1032) Converse, A. O.; Optekar, J. D. Biotechnol. Bioeng. 1993, 42, 145.
(1033) Woodward, J.; Lima, M.; Lee, N. E. Biochem. J. 1988, 255, 895.
(1034) Zhang, Y. H. P.; Lynd, L. R. Biotechnol. Bioeng. 2006, 94, 888.
(1035) Zhou, W.; Hao, Z. Q.; Xu, Y.; Schuttler, H. B. Biotechnol. Bioeng. 2009, 104, 275.
(1036) Zhou, W.; Schuttler, H. B.; Hao, Z. Q.; Xu, Y. Biotechnol. Bioeng. 2009, 104, 261.
(1037) Zhou, W.; Xu, Y.; Schuttler, H. B. Biotechnol. Bioeng. 2010, 107, 224.


(1038) Sild, V.; Ståhlberg, J.; Pettersson, G.; Johansson, G. FEBS Lett. 1996, 378, 51.
(1039) Levine, S. E.; Fox, J. M.; Blanch, H. W.; Clark, D. S. Biotechnol. Bioeng. 2010, 107, 37.
(1040) Levine, S. E.; Fox, J. M.; Clark, D. S.; Blanch, H. W. Biotechnol. Bioeng. 2011, 108, 2561.
(1041) Griggs, A. J.; Stickel, J. J.; Lischeske, J. J. Biotechnol. Bioeng. 2012, 109, 665.
(1042) Griggs, A. J.; Stickel, J. J.; Lischeske, J. J. Biotechnol. Bioeng. 2012, 109, 676.
(1043) Igarashi, K. Nat. Chem. Biol. 2013, 9, 350.
(1044) Fox, J. M.; Jess, P.; Jambusaria, R. B.; Moo, G. M.; Liphardt, J.; Clark, D. S.; Blanch, H. W. Nat. Chem. Biol. 2013, 9, 356.
(1045) Gao, D. H.; Chundawat, S. P. S.; Sethi, A.; Balan, V.; Gnanakaran, S.; Dale, B. E. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 10922.
(1046) Fenske, J. J.; Penner, M. H.; Bolte, J. P. J. Theor. Biol. 1999, 199, 113.
(1047) Warden, A. C.; Little, B. A.; Haritos, V. S. Biotechnol. Biofuels 2011, 4, 39.
(1048) Asztalos, A.; Daniels, M.; Sethi, A.; Shen, T. Y.; Langan, P.; Redondo, A.; Gnanakaran, S. Biotechnol. Biofuels 2012, 5, 55.
