Professional Documents
Culture Documents
annurev.bb_.24.060195.003333
annurev.bb_.24.060195.003333
annurev.bb_.24.060195.003333
net/publication/15536289
CITATIONS READS
130 1,480
3 authors:
Charles Delisi
Boston University
400 PUBLICATIONS 14,834 CITATIONS
SEE PROFILE
All content following this page was uploaded by Rakefet Rosenfeld on 15 May 2014.
DESIGN
CONTENTS
INTRODUC TION . . . . ....... ................................. ............ ............... ........... 678
PROBLEM S REQUIRING THE USE OF FLEXIBLE METHODS. .......... ..... 680
De sign o f Pro te ase Inhib itors . . . . . .. ... . . . ........................... ....................... 680
Pep tide Binding to Majo r Histo comp atib ility Comple x Recep to rs . . ..... . . ... . . . 68 1
METHO D S OF DOCKING AND DE SI GN. .. ................... ........ . . ......... . ...... 683
From Rigid to Fle xib le A lgor ithm s . ..... .... . . . ..... . .. . . .... . . ... . . . .. . . . .. ...... . . . . . . . . 684
Strate gies for Fle xib le D ocking and De sign . . . . . . . . . . . . . . . . . . . . . . . . . . . ................. 686
Emp iric al Evalu atio n o f Binding Free Energy... .. . . . .... . ... . . . ... . . ... .. ... . . . . . . .. .. 690
APPL I CA TIONS OF FLEX IBLE METHODS: CURRENT STATUS . .... .... . . . . 692
Design 692
Pep tide Binding to Majo r Histocomp atib ility Comp le x Recep tors . . . . .. . .. . . . . . . 694
CONCLUS IONS . . . . .. .. . . . . . .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 695
ABSTRACT
Docking and design are the major computational steps toward under
standing and affecting receptor-ligand interactions. The flexibility of
many ligands makes these calculations difficult and requires the devel
opment and use of special methods. The need for such tools is illus
trated by two examples: the design of protease inhibitors and the analy
sis and design of peptide antigens binding to specific MHC receptors.
We review the computational concepts that have been extended from
rigid-body to flexible docking, as well as the following important strate
gies for flexible docking and design: (a) Monte Carlo/molecular dynam-
677
1056-8700/95/0610-0677$05.00
678 ROSENFELD, V AIDA & DeLISI
ics docking, (b) in-site combinatorial search, (c) ligand build-up, and
(d) site mapping and fragment assembly. The use of empirical free en
ergy as a target function is discussed. Due to the rapid development
of the methodology, most new methods have been tested on only a
limited number of applications and are likely to improve results ob
tained by more traditional computational or graphic tools.
Annu. Rev. Biophys. Biomol. Struct. 1995.24:677-700. Downloaded from arjournals.annualreviews.org
INTRODUCTION
hibitors, and cofactors are often small flexible organic molecules. Many
hormones and antigens are small or intermediate-size polypeptides. A
predictive understanding of the molecular basis of recognition, which
would encompass the ability to design ligands that bind specific recep
tors and to predict the change in stability caused by site-specific substi
tutions, is central to developing rational drug- and vaccine-design
strategies as well as to elucidating normal and pathological cellular
behavior.
Here, we analyze aspects of computational approaches to the dock
ing of flexible molecules to proteins, as well as to the design of oligopep
tides and other flexible ligands that would bind with high affinity to a
specified receptor site. Design means either determining the chemical
composition of a ligand with desired binding properties (e. g. the amino
acid sequence of an oligopeptide) or, in a somewhat more restrictive
sense, ranking a given set of ligands according to their binding affinities.
Docking means determining the geometry of a receptor-ligand complex
using the structure of the free receptor.
In general, accurate approaches to docking and design require
progress on two major problems: finding a target function (in the most
general case, a computationally viable free energy-evaluation model)
that can accurately weight any specified conformation of the system
and developing an algorithm that efficiently searches a complex energy
landscape for the target function's minimum value.
In spite of the complexity of these problems, several experimental
systems allow simplifying approximations in the computations. In the
simplest systems, the geometries of the reactants remain, to a good
first approximation, unchanged by complex formation (2, 16, 34, 40,
51 , 52). In such rigid-body systems, an algorithm need only search the
six-dimensional rotational/transitional space and can generally make
FLEXIBLE DOCKING AND DESIGN 679
in the complex than when one uses the geometries of the uncomplexed
molecules (70). The difference in accuracy implies that the assumption
of rigidity is not fully valid even for protein-protein complexes in which
the observed structural changes between the bound and free forms are
small (77). In addition, simple scoring functions, such as measures of
surface complementarity (34, 84),surface-area burial (85),solvation
by Boston University on 06/07/09. For personal use only.
for at least another five to ten years. The alternative approach to deter
mining free energy is to use empirical functions (31, 58, 60), several of
which are reviewed here.
that carry out these diverse cleavage processes are collectively referred
to as proteases. Errors in the production or regulation of any number
of these enzymes cause serious pathologies. Protease inhibitors can
be potential drugs (such as the HIV aspartic protease inhibitor), or
investigative tools to elucidate the function of specific proteases.
p3'
P4
P2'
ity of known protease inhibitors are not specific for a single protease
but rather inhibit, to varying levels, several proteases from a given
family.
Because the naturally occurring protease inhibitors are proteins, one
obvious approach to inhibitor design is to modify a segment of the
inhibitor that is implicated in binding. Peptide inhibitors can occasion
ally be used directly for therapeutic purposes-for example, topical
Annu. Rev. Biophys. Biomol. Struct. 1995.24:677-700. Downloaded from arjournals.annualreviews.org
Receptors
OVERVIEW The ability to discriminate foreign from self, the major role
of the immune system, is regulated mainly by the major histocompatibil
ity complex (MHC) receptors. These comprise a highly polymorphic
set of glycoproteins that bind and display fragments of intracellular
proteins on the cell surface, thereby informing other components of
the immune system of the protein composition within the cell. The two
major classes of MHC receptors, I and II, have highly similar struc
tures. In both, the antigenic binding site-a cleft approximately 20 A
long, 8 A deep, and 12 A wide-is formed by a {3-sheet floor bounded
by a-helical walls. Variations in the amino acids that line the cleft modu
late both its shape and chemical nature, thus determining haplotype
specificity.
Class I molecules are found on all nucleated cells, displaying 8- to 1 0-
residue-long fragments of cytosolic proteins. In infected cells, a certain
fraction of these fragments will have originated in proteins that are not
native to the host, thus presenting a foreign molecular arrangement
recognized by T-cell receptors (TCR) on CD8+ cytotoxic T lympho
cytes. This recognition stimulates, among other things, the release of
cytolytic agents that mediate lysis of the infected cell. Class II products
are generally confined to cells of the immune system, such as B cells,
macrophages, and other dendritic cells. The pathway leading to the
presentation of class II receptors on B cells consists of protein recogni
tion by immunoglobulin receptors, followed by endocytosis of the anti
gen-receptor complex, proteolysis of the antigen, and recycling of 13-
to 15-residue-Iong fragments complexed with class II molecules to the
surface. These peptide-MHC complexes, when recognized by the TCR
of CD4 + helper T cells, stimulate production of the various lympho
kines that regulate a humoral (antibody) immune response.
682 ROSENFELD, VAJDA & DeLISI
c. D.
ies have suggested that class II-bound peptides share a common three
dimensional structure (50, 61),
dimensional grid (26, 27). Any target function in which the ligand and
receptor terms are separable can be treated in this manner. The ap
proach can be combined with the use of shape descriptors (32, 52).
Although grid-based energy evaluation applies to rigid receptors only,
ligand energy values are not precomputed and hence the ligand can be
flexible (27).
Annu. Rev. Biophys. Biomol. Struct. 1995.24:677-700. Downloaded from arjournals.annualreviews.org
SOFT POTENTIALS Soft potentials are target functions that allow for
some penetration of the protein surfaces being matched. Even the most
rigid proteins undergo some changes when interacting with another
molecule. Thus, rigid docking procedures frequently rank near-native
conformations after structures that are far from the native, and a more
realistic potential function does not necessarily improve performance.
by Boston University on 06/07/09. For personal use only.
ecules, (b) in-site combinatorial search, (c) ligand buildup, and (d) site
mapping and fragment assembly. Some of the recent docking and design
procedures combine several of these strategies. Although the same
ideas are also important in pharmacophore-matching programs such as
ALLADIN (81), CATALIST (73), and FOUNDATION (30), we re
strict our discussion to docking and design for receptors with known
three-dimensional structures.
confined to small regions at the two ends of the site. As shown below,
one can exploit this property by mapping these regions to find favorable
positions for the end residues and then determining the conformation
of the intervening peptide fragment by chain-closure algorithms (65, 68)
based either on combinatorial search (8) or on a mUltiple copy scaling
relaxation algorithm (90, 91).
where Eel is the electrostatic interaction energy between the ligand and
the receptor, AGh is the solvation free energy, and ASsc is the loss of
side-chain conformational entropy. The constant term includes further
entropy loss resulting from changes in rotational and translational de
grees of freedom and the cratic contribution to entropy. Equation 1 does
not include a van der Waals component because the protein-protein and
protein-water van der Waals interactions are assumed to cancel (60).
Annu. Rev. Biophys. Biomol. Struct. 1995.24:677-700. Downloaded from arjournals.annualreviews.org
Horton & Lewis (31) estimated the free energy of rigid binding by
AG = a�Gapolar + f3�Gpolar + const, 2.
where both solvation free-energy terms, �Gapolar and �Gpolar. are calcu
lated by sums of the form used by Eisenberg & McLachlan (20). The
polar term encompasses only atoms that are involved in hydrogen
bonds and salt bridges; all other atoms contribute to the apolar term.
The coefficients a and f3 are dimensionless parameters determined by
fitting Equation 2 to binding free energies measured in 15 rigid-body
complexes (31). On the basis of general relationships in the thermody
namics of protein folding (57), Murphy and coworkers described the
structural energetics of angiotensin binding (58) with a similar expres
sion that, in addition to a constant, consists of terms that depend on
polar and apolar surface areas.
Whereas Equation I can be derived from the general expression AG
= AE - lAS by considering the energy and entropy differences be
tween bound and free states and using several simplifying assumptions,
the model of Horton & Lewis (Equation 2) is completely empirical.
Furthermore, the two expressions differ strongly, because of the Cou
lombic term Eel in Equation 1. However, in analyzing high-resolution
X-ray structures of protein complexes, we found that the electrostatic
interaction energy Eel can be accurately described as a linear combina
tion of polar and apolar surface areas, probably because surface charges
dominate the interaction energy (80a). Thus, the two free-energy
expressions are, in fact, very similar.
Equations 1 and 2 assume that both the receptor and the ligand are
rigid, but this is clearly a very poor approximation for many ligands,
such as short peptides. The flexibility of the ligand has two conse
quences: Neither (a) the conformational (internal) energy change of the
692 ROSENFELD, V AIDA & DeLISI
ligand, llE, nor (b) the loss of conformational entropy upon binding are
negligible. Because the free ligand adopts many different conforma
tions, one needs to calculate the average energy change (!:lE). The sol
vation free-energy change !:lGh also depends on the conformation of
the free ligand, and hence in Equation 1 it should be replaced by the
ensemble average (!:lGh) (80a). For peptides, the equation must account
for the backbone entropy loss llSbb, which can be estimated by an
Annu. Rev. Biophys. Biomol. Struct. 1995.24:677-700. Downloaded from arjournals.annualreviews.org
Verlinde & HoI (82) recently formulated the most important qualita
tive rules for the design of high-affinity ligands as follows: (a) The steric
and electrostatic complementarity should be excellent; (b) a fair amount
of hydrophobic surface should be buried in the complex; and (c) sub
stantial conformational rigidity of the ligand is required to reduce the
entropy loss upon binding. The terms in Equation 1 and 3 are in good
agreement with these rules. The only difference is that, because of the
lack of the van der Waals term, the equations do not measure steric
complementarity. In fact, the simplified model assumes good packing of
both protein-ligand and protein-solvent interfaces, and hence all steric
clashes have to be removed before the free-energy evaluation, e.g. by
subjecting the complex to local energy minimization using a traditional
molecular-mechanics potential function.
sive drug Captopril. A recent study showed that this compound is highly
flexible in its free state (46), and its success as a drug demonstrates that
protease inhibitors can be designed using traditional tools. However,
Hassall and coworkers (29) later addressed the issue of flexibility and
successfully increased Captopril's potency. They fit a model of the
molecule, which they considered to have two freely rotatable bonds,
by Boston University on 06/07/09. For personal use only.
This assumption forms the basis for two recent studies (65, 68) in
which each of the terminal residues was initially docked in isolation
and the conformation of the intervening chain was then calculated using
a loop-closure algorithm. In the first study, the terminal residues were
docked using a rigid-body grid search augmented by local energy min
imization at energetically favorable grid positions; the second study
utilized a mUltiple-copy (21 , 55) method to simultaneously minimize
Annu. Rev. Biophys. Biomol. Struct. 1995.24:677-700. Downloaded from arjournals.annualreviews.org
CONCLUSIONS
other research groups have not waited for such state-of-the-art technol
ogies and have solved difficult drug-design problems using traditional
tools such as rigid-body docking and visual inspection on the computer
screen. As discussed above, these traditional tools have been success
ful in the design of protease inhibitors, but the problem of docking
peptide antigens in MHC receptors requires a genuine flexible method
by Boston University on 06/07/09. For personal use only.
ology. The new programs are likely to result in improved inhibitors and
ligands in both types of problems.
In spite of the spectacular rate at which computational tools are being
deVeloped, additional work will be required to solve two major difficul
ties. First, both docking and design as defined here lead to nondetermin
istic polynomial (NP)-complete combinatorial problems. Solving such
problems by any exact method presumably requires computing time
that grows faster than any polynomial in N, the number of degrees of
freedom. In practice, investigators avoid the combinatorial explosion
by using criteria that retain only some top-scoring solutions at various
steps of the algorithms. Thus, the final solutions are likely to be subopti
mal rather than optimal. Second, a related and even more important
problem is selecting a target function that estimates the binding free
energy efficiently and with adequate accuracy. Recently developed em
pirical models seem to satisfy these criteria when tested on complexes
of rigid molecules with known structure, but these models have not yet
been coupled with search algorithms. In particular, the success of such
a procedure depends on the sensitivity of the calculated free energies
to inevitable errors in the atomic coordinates. Hence, this sensitivity
requires careful analysis. As we have discussed, the use of free energy
as a target function can be easily extended to flexible ligands in docking
problems, but design requires calculating the free energy of the un
bound ligand, a problem that is substantially more difficult.
ACKNOWLEDGMENTS
We thank Zhiping Weng for reading the manuscript. This work was
supported by NIH/NIAID grant AI30535 (CD and RR) and DOE grant
FG02-93ER61 656 (SV).
FLEXIBLE DOCKING AND DESIGN 697
Any Annual Review chapter, as well as any article cited in an Annual Review chapter,
may be purchased from the Annual Reviews Preprints and Reprints service.
1-800-347-8007;415-259-5017;email:arpr@c1ass.org
Literature Cited
I . Alberg DG, Schreiber SL. 1 993 . Struc 1 2 . Cohen FE, Gregoret LM, Amiri P, Al
Annu. Rev. Biophys. Biomol. Struct. 1995.24:677-700. Downloaded from arjournals.annualreviews.org
and general rules. Immunogenetics 39: 37. Kollman PA. 1 994. Theory of macro
230-42 molecule-ligand interactions . Curro
24. Fremont DH , Matsumura M , Stura Opin. Struct. Bio i . 4: 240-45
EA, Peterson PA, Wilson IA. 1 992. 38. Krystek S, Stouch T, Novotny J . 1993.
Crystal structures of two viral peptides Affinity and specificity of serine endo
in complex with murine MHC class I peptidase-protein inhibitor interac
H-2 K!;. Science 257:919-27 tions: empirical free energy calculation
25 . Gibson KD, Scheraga HA. 1987. Re based on X-ray crystallographic struc
vised algorithms for the build-up pro tures. J. Mol. Bioi. 234:661 -79
by Boston University on 06/07/09. For personal use only.
cedure for predicting protein confor 39. Kuntz ID. 1992. Structure-based stra
mations by energy minimization. J. tegies for drug design and discovery.
Comput. Chem. 8: 826-34 Science 257: 1 078-82
26. Goodford PJ . 1 985. A computational 40. Kuntz ID, Blaney JM, Oatley SJ, Lan
procedure for determining energeti gridge R, Ferrin TE . 1 982. A geometric
cally favorable binding sites on biolog approach to macromolecule-ligand in
ically important macromolecules. J. teractions. J. Mol. Bioi. 1 6 1 :269-88
Med. Chem. 28: 849-75 41. Lawrence MC, Davis Pc. 1992. CLlX:
27. Goodsell DS, Olson AJ. 1990. Auto a search algorithm for finding novel lig
mated docking of substrates to pro ands capable of binding proteins of
teins by simulated annealing. Proteins known three-dimensional structure.
8 : 195-202 Proteins 1 2 : 3 1-41
28. Hart TN, Read NJ. 1 992. A multiple 42. Leach AR. 1 994. Ligand docking to
start Monte-Carlo docking method. proteins with discrete side-chain flexi
Proteins 1 3 : 206-22 bility. J. Mol. B ioi . 235:345-56
29. Hassall CH, Krohn A, Moody CJ, 43 . Leach AR, Kuntz ID. 1992. Confor
Thomas WA. 1 982. The design of a mational analysis of flexible ligands in
new group of angiotensin-converting macromolecular receptor sites. J.
enzyme inhibitors. FEBS Lett. 147: Comp o Chem. 1 3 : 733-48
1 75-79 44. Lewis RA, Dean PM . 1989. Auto
30. Ho CMW, Marshall G. 1993. FOUN mated site-directed drug design: the
DATION: a program to retrieve all concept of spacer skeletons for pri
possible structures containing a user mary structure generation. Proc. R.
defined minimum number of matching Soc. London Ser. B 236: 1 25-40
query elements from three-dimen 45. Lewis RA, Dean PM. 1 989. Auto
sional databases. J. Comput. Aided mated site-directed drug design: the
Mol. Des. 7:3-22 formation of molecular templates in
3 1 . Horton N, Lewis M. 1992. Calculation primary structure generation. Proc. R.
of the free energy of association for Soc. London Ser. B 236: 1 4 1 - 62
protein complexes. Protein Sci. I : 46. Luke BT. 1 994. A quantum mechani
169-81 cal conformational search of Capto
32. Jiang F, Kim SH. 1992. "Soft dock pril, a potent inhibitor of the angioten
ing": matching the molecular surface sin-converting enzyme. J. Mol. Struct.
cubes. J. Mol. B ioi . 2 1 9 : 79-102 309: 1 - 1 1
33. Kaslow RA, Duquesnoy R , Van 47. Madden DR, Garboczi DN, Wiley DC .
Raden M . 1 990. A I , Cw7, B8, DR3 1 994. The antigenic identity of peptide
HLA antigen combination associated MHC complexes: a comparison of the
with rapid decline of T-helper lympho conformations of five viral pep tides
cytes in HIV-I infection. Lancet 335: presented by HLA-A2. Cell 75 :
927-30 693-708
34. Katchalski-Katzir E, Shariv I, Eisen 48. Madden DR, Gorga, JC, Strominger
stein M, Friesem A, Aflalo C , Vakser JL, Wiley DC . 1992. The three-dimel\
IA. 1992. Molecular surface recogni- sional structure of HLA-B27 at 2 . 1 A
FLEXIBLE DOCKING AND DESIGN 699
52. Meng EC, Schoichet BK, Kuntz !D. 65. Rosenfeld R, Zheng Q, Vajda S, De
1992. Automated docking with grid Lisi C. 1993. Computing the structure
based energy evaluation . J. Comput. of bound peptides: application to anti
Chem. 1 3 '.505-24 gen recognition by class I M HCs . J.
53. Metropolis N, Rosenbluth AW, Ro Mol. Bioi. 234:5 1 5-2 1
senbluth MN, Teller AH, Teller E . 66. Rotstein SH, Murcko MA. 1993.
1953. Equation of state calculations by GroupBuild: a fragment-based method
fast computing machines. J. Chem. for de novo drug design. J. Med.
Phys. 2 1 : 1087-92 Chem . 36: 1 700- 1 O
54. Mezei M, Beveridge DL. 1 986. Free 67. Rotstein S H , Murcko MA. 1993. Gen
energy simulations. Ann. N. Y. Acad. Star: a program for de novo drug de
Sci. 494: 1 -23 sign. J. Comput. A ided Mol. Des. 7 :
55. Miranker A, Karplus M. 1 99 1 . Func 23-43
tionality maps of binding sites: a mul 68. Sezerman U , Cornette J, Vaj da S ,
ticopy simultaneous search method. DeLisi C . 1993. Toward computa
Proteins 1 : 29-34 tional determination of peptide-recep
56. Moon J B , Howe J. 1 99 1 . Computer de tor structure . Protein Sci. 2 : 1 827-46
sign of bioactive molecules: a method 69. Shoichet BK, Bodian DL, Kuntz !D.
for receptor-based de novo ligand de 1992. Molecular docking using shape
sign. Proteins 1 1 : 3 14-28 descriptors . J. Comput. Chem. 3 :
57. Murphy KP, Freire E. 1992. Thermo 380-97
dynamics of structural stability and co 70. Shoichet B K , Kuntz !D. 199 1 . Protein
operative folding behaviours in pro docking and complementarity. J. Mol.
teins. Adv. Protein . Chem. 43 : 3 1 3-61 Bioi. 221 :327-46
58. Murphy KP, Xie D . Garcia KC, Amzel 7 1 . Shoichet BK, Stroud RM, Santi D ,
LM, Freire E. 1993. Structural ener Kuntz !D, Perry KM. 1993. Structure
getics of peptide recognition : angio based discovery of inhibitors of thymi
tensin II1antibody binding. Proteins dilate synthase. Science 259: 1 1 45 -50
1 5 : 1 1 20 72. Silver ML, Guo He, Strominger JL,
59. Nishibata y , Itai A. 1993. Confirma Wiley D . 1992. Atomic structure of a
tion of usefulness of a structure con human MHC molecule presenting an
struction program based on three-di influenza virus peptide. Nature 360:
mensional receptor structure for 367-68
rational lead generation . J. Med. 73. Sprague PW. 1 99 1 . CATALIST: a
Chem. 36: 292 1 -28 computer aided drug design system
60. Novotny J, Bruccoleri RE, Saul FA. specifically designed for medicinal
1989. On the attribution of binding en chemists. In Recent Advances in
ergy in antigen-antibody complexes Chemical Information. Proc. /99/
MCPC603 , D 1 . 3 , and HyHEL-5. Bio Chemical Information Conf. , ed. H
chemistry 28:4735-49 Coller, pp. 107- 1 1 . London: R. Soc.
6 1 . O'Sullivan D, Arrhenius T, Sidney J , Chern.
Del Guercio M-F, Albertson M , e t aI . 74. Stern LG, Brown JH, Jardetsky TS ,
1 99 1 . On the interaction o f promiscu Oorga JC, Urban RG, et al. 1994.
ous antigenic peptides with different Crystal structure of the human class
700 ROSENFELD, VAJDA & DeLISI
76. Taylor R. 1 994. Histocompatibility an 84. Walls PH, Sternberg MJE. 1992. New
tigens, protective immunity, and HIV- algorithm to model protein-protein
1 . J. NIH Res. 6:68-71 recognition based on surface comple
77. Totrov M, Abagyan R. 1 994. Detailed mentarity: applications to antibody
ab initio prediction of lyspzyme-anti antigen docking. J. Mol. Bioi. 228:
body complex with 1 .6A accuracy . 277-97
Nat. Struct. Bioi. 1 :259-63 85. Wang H . 1 99 1 . Grid-search molecular
78 . Vajda S. 1993. Conformational filter accessible surface algorithm for solv
ing in polypeptides and proteins. J. ing the protein docking problem. J.
by Boston University on 06/07/09. For personal use only.