Pairwise Additivity of Energy Components in Protein-Ligand Binding: The HIV II Protease-Indinavir Case
Melek N. Ucisik, Danial S. Dashti, John C. Faver, and Kenneth M. Merz Jr.
Citation: The Journal of Chemical Physics 135, 085101 (2011); doi: 10.1063/1.3624750
Published by the AIP Publishing
This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP:
130.209.6.50 On: Fri, 19 Dec 2014 09:07:06
THE JOURNAL OF CHEMICAL PHYSICS 135, 085101 (2011)
An energy expansion (binding energy decomposition into n-body interaction terms for n ≥ 2) to express the receptor-ligand binding energy for the fragmented HIV II protease-Indinavir system is described to address the role of cooperativity in ligand binding. The outcome of this energy expansion is compared to the total receptor-ligand binding energy at the Hartree-Fock, density functional theory, and semiempirical levels of theory. We find that the sum of the pairwise interaction energies approximates the total binding energy to ∼82% for HF and to >95% for both the M06-L density functional and PM6-DH2 semiempirical method. The contribution of the three-body interactions amounts to 18.7%, 3.8%, and 1.4% for HF, M06-L, and PM6-DH2, respectively. We find that the expansion can be safely truncated after n = 3. That is, the contribution of the interactions involving more than three parties to the total binding energy of Indinavir to the HIV II protease receptor is negligible. Overall, we find that the two-body terms represent a good approximation to the total binding energy of the system, which points to pairwise additivity in the present case. This basic principle of pairwise additivity is utilized in fragment-based drug design approaches and our results support its continued use. The present results can also aid in the validation of non-bonded terms contained within common force fields and in the correction of systematic errors in physics-based score functions.
© 2011 American Institute of Physics. [doi:10.1063/1.3624750]
emphasize the pairwise character of additivity by comparing the total free energy of association among a set of N solutes to the sum of free energies of all two-body combinations of these N solutes, specifically to the N(N-1)/2 possible pairings of the solutes calculated one pair at a time for two solutes while the effects of the remaining N-2 solutes were ignored.15 Here, we should make a distinction between our work and this approach: we do not predict anything specifically about free energies but only electronic energies, which is further discussed in the "Theory" section. Nonetheless, this study constitutes a validation of pairwise additivity of interaction energies computed in FBDD studies.

Biochemists examine additivity concepts through double mutant cycles.16 Basically, constructing a double mutant cycle corresponds to constructing a thermodynamic cycle involving the wild-type protein, the protein with a particular point mutation in the region of interest, the protein with a different point mutation again in the region of interest and the protein with both single mutations applied simultaneously as shown in Figure 1. Thus, if free energy additivity holds true, the impact of the double mutation should be equal to the sum of the two single mutations which allows for the prediction of the free energy associated with a certain functionality or structural element of any protein in the cycle from measurements of the other three proteins in the cycle. This practice resembles our strategy, whereby the ligand is kept intact and the binding pocket of the protein is varied, but has focused on addressing the (non)-additivity of free energies. Free energy additivity of point mutations has been observed at enzyme-substrate interfaces in several studies.17–19 However, the observed additivity was largely traced back to the remoteness of the point mutation sites, which implies no interaction between the two sites. Any means of interaction between the two mutated residues, both via direct contact and indirect electrostatic interactions were described by Wells as factors which would cause the simple additivity model to collapse.19 Schreiber and Fersht examined coupling free energies between two mutated sites, $\Delta G_{int}$, which correspond to non-additivity in the thermodynamic cycle of singly- and doubly-mutated proteins.20 They found that residues separated by less than 7 Å showed non-additivity and this non-additive behavior was interpreted to imply cooperativity between those residues. It was concluded that at greater separations the effects of the point mutations were additive implying that the mutual interaction between these point mutations was minimal. In contrast, our studies have explored the additivity of interaction energies beyond these distance boundaries. Some fragments that we examined were separated by less than 7 Å, but only had interaction energies of a few hundredths of a kcal/mol. Nevertheless, the observed free energy additivity behavior of fragment energies located in close proximity can be complex, as demonstrated by Wells.19 Establishment of an inverse correlation between additivity and cooperativity is a very common conclusion. That is, most studies connect the non-additivity to cooperativity among the examined fragments.15, 19, 20 Accordingly a non-zero coupling energy, which is accepted to be the measure of the cooperativity between interacting fragments, implies either a direct interplay mediated by steric, electrostatic, hydrogen-bonding, or hydrophobic interactions, or an indirect interplay through structural changes in the protein or solvent shell. The non-additive character of free energies associated with fragments is largely the result of the non-additivity of entropic terms. However, for enthalpies or energies, in addition to being theoretically justified,21 additivity has been observed in, for example, the isothermal calorimetry experiments of Baum et al. and Olejniczak et al.22, 23 Geometrical changes induced by point mutations can contribute to the non-additivity of observed free energies, but this point is not a major concern in the present work because of the fixed geometry we are using and due to the additivity of enthalpies experimentally observed in Refs. 22 and 23.

In order to understand the role additivity plays, it is important to select the appropriate computational model. The method to be employed should have a good balance between accuracy, cost-efficiency, and feasibility in terms of memory and computer time requirements. Recent work examining the accuracy of several computational methods when compared to complete basis set (CBS) results acquired at the coupled-cluster single double (triple) (CCSD(T)) level of theory suggested that the M06-L (Ref. 24) meta-GGA (generalized gradient approximation) functional is such an appropriate level of theory.25 Even in conjunction with the 6-31G* basis set26 it yielded a narrow error distribution for the bound Indinavir system. We note that in spite of its relative accuracy and speed the M06-L functional has difficulty with convergence for charged systems.27 Thus, substantial effort was expended to deal with convergence problems for fragments containing acetate and methyl guanidinium. For comparison purposes HF/6-31G* and PM6-DH2 calculations were performed as well.

FIG. 1. A representative double mutant cycle involving protein (P)-ligand (L) interactions. X and Y stand for two residues, which are mutated individually and together in the wild type protein (P-XY). The free energy change associated with each mutation is inserted into the following equations in order to obtain the free energy of coupling, $\Delta G_{int}$: $\Delta G_{int} = \Delta G_{P\text{-}XY \to P\text{-}X} - \Delta G_{P\text{-}Y \to P} = \Delta G_{P\text{-}XY \to P\text{-}Y} - \Delta G_{P\text{-}X \to P}$.
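The coupling relation in the Figure 1 caption can be illustrated with a few lines of Python. This is only a sketch: the function name, the closure tolerance, and all numerical values are hypothetical and are not taken from the paper or from Refs. 19 and 20.

```python
def coupling_free_energy(dg_pxy_to_px, dg_py_to_p, dg_pxy_to_py, dg_px_to_p, tol=1e-6):
    """Coupling free energy of a double mutant cycle (same units as the inputs).

    The two routes around the cycle must agree; a mismatch larger than `tol`
    means the four legs do not form a closed thermodynamic cycle.
    """
    route_a = dg_pxy_to_px - dg_py_to_p   # effect of removing Y, with and without X present
    route_b = dg_pxy_to_py - dg_px_to_p   # effect of removing X, with and without Y present
    if abs(route_a - route_b) > tol:
        raise ValueError("the four free energy changes do not close the cycle")
    return route_a

# Hypothetical free energy changes in kcal/mol (illustrative only).
additive = coupling_free_energy(1.0, 1.0, 2.0, 2.0)   # independent sites
coupled = coupling_free_energy(1.5, 1.0, 2.3, 1.8)    # interacting sites
print(f"additive case: {additive:.2f} kcal/mol, coupled case: {coupled:.2f} kcal/mol")
```

A coupling value of zero corresponds to the additive case discussed above, while a non-zero value is what the cited studies interpret as cooperativity between the two sites.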
The present work is organized as follows: first, the model system and the computational details are reviewed. Next, the justification for using the computed energies as a measure of additivity validation is presented. The energy decomposition scheme for the cluster analysis of the binding site of the enzyme complexed with its inhibitor is then introduced. Then the contributions of the n-body interactions to the total energy of the system are evaluated at different levels of theory and the conclusions are given. Finally, the ramifications of our work on FBDD and force fields are discussed.

II. METHODS

The protein-ligand model system is based on the crystal structure of the HIV II protease obtained from Protein Data Bank at 1.9 Å resolution (PDB ID:1HSG).10 The binding pocket was decomposed previously into a total of 21 fragments by Faver et al. in their recent study,25 for which the enzyme-ligand complex structure was obtained from the PDB, hydrogen atoms were added to the structure with the program Reduce28 and were subsequently optimized with the AMBER FF99SB force field.29 These 21 receptor fragments and the ligand were combined to model the binding site. Overlapping receptor fragments were joined to yield a total of 18 fragments from the original 21 fragments. The resultant cluster structure was retained for all subsequent calculations. The final model contained short aliphatic alkanes including ethane, propane, isobutane, and butane along with polar species consisting of acetate, acetic acid, methyl guanidinium, and four peptide chains containing up to 35 atoms. Two tightly coordinated water molecules in the crystal structure were retained and treated as two distinct fragments. The ligand L-735 524,11, 12 an orally bio-available inhibitor of HIV proteases with the commercial name Indinavir, was kept intact in all calculations.

Single point calculations were carried out at the HF, density functional theory (DFT), and semiempirical levels. The 6-31G* Pople type basis set26 was used throughout. The M06-L/6-31G* combination of level of theory was chosen because of its narrow systematic and random error distribution with respect to a CCSD(T)/CBS reference. Moreover, through its parameterization the M06-L functional gives a good account of intermolecular interactions as evidenced by the low systematic error associated with the polar and non-polar interactions relative to CCSD(T)/CBS reference calculations. On the other hand, HF/6-31G* calculations represent another extreme in that this method has both large random and systematic errors largely due to the incorrect treatment of dispersion.25 The large size of the quantum region was the major obstacle to upgrading to much larger basis sets and prompted our choice of the 6-31G* basis set. In order to examine higher order multi-body interaction energies the semiempirical PM6-DH2 (Ref. 30) level of theory was employed due to its accurate performance with regards to the CCSD(T)/CBS results and its fast computational speed. Considering the individual accuracies for polar and non-polar interactions gave us further confidence in the use of PM6-DH2 method in our calculations.25

Convergence problems were encountered for some of the acetate and methyl guanidinium containing systems, which were addressed by using quadratic convergence methods,31 Fermi temperature broadening,32 and shifting of orbital energies. In order to correct for the basis set superposition error the counterpoise method was applied.33 In each calculation the nuclei of the atoms which did not belong to that particular calculation were deleted while their basis functions were retained.

The DFT and HF calculations were performed with the GAUSSIAN 09 (Ref. 34) suite of programs, while MOPAC2009™ (Ref. 35) was used for the semiempirical computations. Visualizations were done using the Visual Molecular Dynamics (VMD) (Ref. 36) program, while the density plots were obtained using the statistical software package R.37

III. THEORY

The process of partitioning a larger molecule into constituent fragments and treating those fragments as unique bonding units within the framework of the larger system formally splits the Hamiltonian of the larger molecule into individual fragment Hamiltonians. The total binding energy is given by the ensemble average of the Hamiltonian H of the full ligand containing N particles, at a volume V and at a temperature T:21

$$E(N, V, T) = \langle H \rangle_{N,V,T}. \tag{1a}$$

Now, let us assume that the Hamiltonians associated with the distinct fragments sum to model the Hamiltonian of the full protein-ligand system. Thus, for a system of n fragments,

$$H = \sum_{i=1}^{n} H_i. \tag{1b}$$

By combining Eq. (1a) and Eq. (1b), we can define an energy expression:

$$E = \langle H \rangle = \left\langle \sum_{i=1}^{n} H_i \right\rangle = \sum_{i=1}^{n} E_i, \tag{1c}$$

suggesting that the energy and enthalpy of the system is additive as long as the Hamiltonian is additive.

The additivity of enthalpies was experimentally studied by Baum et al. using isothermal titration calorimetry for a series of thrombin inhibitors.22 Incorporation of a particular functionality into the inhibitor always corresponded to a specific $\Delta H$, which underlines the independent (additive) behavior of the enthalpy component, whereas the free energies and the entropy components for the same structural changes were much more variable suggesting non-additivity. Furthermore, this finding also supports the idea of associating a functional group present within a molecule with a given enthalpy change, which bolsters the notion of attributing certain energies or enthalpies to fragments of a larger molecule.

In this study we employ an approximation for the energy decomposition of the binding pocket of HIV II protease bound to the ligand Indinavir.10–12 The decomposition scheme we used was adopted from Xantheas's formulation for water clusters.13 The total energy $E_n$ of an n-body cluster can be expanded into one-, two-, three-, four-, ..., n-body terms via
the formula below

$$
\begin{aligned}
E_n &= E(1234 \ldots n) \\
&\equiv \sum_{i=1}^{n} E(i) + \sum_{i=1}^{n-1} \sum_{j>i}^{n} \Delta^2 E(ij) \\
&\quad + \sum_{i=1}^{n-2} \sum_{j>i}^{n-1} \sum_{k>j}^{n} \Delta^3 E(ijk) \\
&\quad + \sum_{i=1}^{n-3} \sum_{j>i}^{n-2} \sum_{k>j}^{n-1} \sum_{l>k}^{n} \Delta^4 E(ijkl) \\
&\quad + \ldots + \Delta^n E(1234 \ldots n).
\end{aligned} \tag{2}
$$

Here, the first term corresponds to the one-body term, the second to the two-body, the third to the three-body, ... and eventually the last term to the n-body term. E(i) denotes the energies of the single fragments or the ligand, E(ij) represents the energies of all possible combinations involving two bodies out of the pool of fragments and the ligand and E(ijk), E(ijkl) describe the energies of the multi-body combinations out of the same pool.

The total binding energy of the Indinavir ligand to the HIV II protease binding pocket, which was split into 18 fragments, was first calculated using Eq. (3),

$$E_{bind} = E_{l+bp} - E_l - E_{bp}, \tag{3}$$

where $E_{bind}$ stands for the total binding energy, $E_{l+bp}$ is the energy of the entire system consisting of the ligand and the fragmented binding pocket, $E_l$ indicates the energy of the ligand, and $E_{bp}$ refers to the energy of the binding pocket comprising its 18 fragments. The total binding energy $E_{bind}$ is compared to the binding energy $E'_{bind}$, which was obtained from the expansion (2) for the same system. If we write $E'_{bind}$ explicitly using Eq. (2) and label the ligand as the 19th fragment, we arrive at

$$
\begin{aligned}
E'_{bind} ={}& \sum_{i=1}^{19} E(i) + \sum_{i=1}^{18} \sum_{j>i}^{19} \Delta^2 E(ij) + \sum_{i=1}^{17} \sum_{j>i}^{18} \sum_{k>j}^{19} \Delta^3 E(ijk) + \sum_{i=1}^{16} \sum_{j>i}^{17} \sum_{k>j}^{18} \sum_{l>k}^{19} \Delta^4 E(ijkl) + \ldots + \Delta^{19} E(1234 \ldots 19) \\
& - E(19) \\
& - \Biggl( \sum_{i=1}^{18} E(i) + \sum_{i=1}^{17} \sum_{j>i}^{18} \Delta^2 E(ij) + \sum_{i=1}^{16} \sum_{j>i}^{17} \sum_{k>j}^{18} \Delta^3 E(ijk) + \sum_{i=1}^{15} \sum_{j>i}^{16} \sum_{k>j}^{17} \sum_{l>k}^{18} \Delta^4 E(ijkl) + \ldots + \Delta^{18} E(1234 \ldots 18) \Biggr).
\end{aligned} \tag{4}
$$

The collection of terms in the first row of the expansion (4) corresponds to $E_{l+bp}$ in Eq. (3), E(19) is equivalent to $E_l$ and the expression in the third row excludes the 19th fragment, namely, the ligand, leaving the energy of the whole binding pocket composed of 18 fragments (n = 1–18) in the absence of the ligand. The difference between $E_{bind}$ and $E'_{bind}$ will help us address the issue of additivity or non-additivity of the interaction energy. As more terms in Eq. (4) are considered, the difference between $E_{bind}$ and $E'_{bind}$ will approach zero. Truncation at lower order terms (e.g., two-body) will allow us to investigate the contribution of the higher order multi-body terms to $E_{bind}$.

Let us turn our attention to the individual n-body terms. The two-, three-, four-, and ... n-body terms are defined as follows:

$$
\begin{aligned}
\Delta^2 E(ij) ={}& E(ij) - \{E(i) + E(j)\} \\
\Delta^3 E(ijk) ={}& E(ijk) - \{E(i) + E(j) + E(k)\} - \{\Delta^2 E(ij) + \Delta^2 E(ik) + \Delta^2 E(jk)\} \\
\Delta^4 E(ijkl) ={}& E(ijkl) - \{E(i) + E(j) + E(k) + E(l)\} \\
& - \{\Delta^2 E(ij) + \Delta^2 E(ik) + \Delta^2 E(il) + \Delta^2 E(jk) + \Delta^2 E(jl) + \Delta^2 E(kl)\} \\
& - \{\Delta^3 E(ijk) + \Delta^3 E(ikl) + \Delta^3 E(jkl) + \Delta^3 E(ijl)\} \\
& \ldots
\end{aligned} \tag{5}
$$

The number of m-body terms out of a system of n bodies, where m ≤ n, is simply given by the number of m-combinations out of a set of n bodies. Thus, for our cluster with 19 bodies, there are $\binom{19}{2} = \frac{19!}{2!\,17!} = 171$ E(ij) values. However, using Eq. (4) and noting that the 18 receptor fragments are common to both expansions in the first and the third rows, we observe that only the E(ij) values involving the ligand, the 19th fragment, will survive. Thus, the total two-body term $\sum \Delta^2 E(ij)$ encountered in Eq. (4) involves $\binom{19}{2} - \binom{18}{2} = 18$ $\Delta^2 E(ij)$ values, which is actually obvious from the fact that there exist 18 fragments to pair with the ligand. The same logic applies to the $\sum \Delta^3 E(ijk)$, $\sum \Delta^4 E(ijkl)$, ... terms, which will be referred to as n-body correction terms in the sense that these terms supplement the total $\sum \Delta^2 E(ij)$ term in converging to $E_{bind}$. In other words, they correct $E'_{bind}$ to approach $E_{bind}$. Hence, 153 $\Delta^3 E(ijk)$ values are required to calculate the total three-body correction term, while 816 $\Delta^4 E(ijkl)$ values and 3060 $\Delta^5 E(ijklm)$ values compose the total four-body and five-body correction terms, respectively. As indicated above, the many-body energy values in the correction terms always involve the ligand since the interactions among only the receptor fragments are cancelled in the total decomposition scheme.

The sum of the two-body terms, namely, the $\sum \Delta^2 E(ij)$ is defined as the additive part of the binding energy decomposition, while the higher multi-body correction terms represent the non-additive part. According to this definition, the neglect of higher order multi-body corrections might be expected to produce a significant difference between $E_{bind}$ and
$E'_{bind}$ demonstrating the importance of non-additivity. This is what was observed in the work of Xantheas for water clusters, where the three-body terms and above represented 30% of the total binding energy. On the other hand, for the energy decomposition scheme given by Eq. (1c), the additive part must be much bigger than the non-additive many-body correction terms. This manuscript aims to elucidate the extent of non-additivity for protein-ligand binding clusters.

Having chosen our cluster we need to address the basis set superposition error (BSSE) because we will employ finite basis sets for our calculations. We applied the counterpoise method to account for BSSE.33 Accordingly, the energies of the many-body subsystems, e.g., E(ijk) for the subsystem (ijk), at the cluster geometry were calculated in the full basis of the entire system, which was denoted by E(ijk|ijk...n). That is, in all the multi-body energy calculations, the basis functions centered on each of the 323 nuclei were kept, while the nuclei, which do not participate in that particular calculation were deleted. The inclusion of these so-called "ghost" orbitals removes the effects of BSSE. This approach completes the formulation of the individual many-body correction terms given in Eq. (5) by converting the expressions into Eq. (6),
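Eq. (6) itself is not reproduced in this copy. As a sketch only, and assuming that Eq. (6) is simply Eq. (5) with every subsystem energy evaluated in the full cluster basis using the notation E(ijk|ijk...n) introduced above, the two lowest correction terms would take the following form; the exact layout of the published equation may differ.

```latex
\begin{aligned}
\Delta^2 E(ij) ={}& E(ij \,|\, ijk \ldots n) - \{E(i \,|\, ijk \ldots n) + E(j \,|\, ijk \ldots n)\} \\
\Delta^3 E(ijk) ={}& E(ijk \,|\, ijk \ldots n)
  - \{E(i \,|\, ijk \ldots n) + E(j \,|\, ijk \ldots n) + E(k \,|\, ijk \ldots n)\} \\
  & - \{\Delta^2 E(ij) + \Delta^2 E(ik) + \Delta^2 E(jk)\}
\end{aligned}
```

In this reading, the $\Delta^2 E$ terms on the right-hand side of the second line are themselves the counterpoise-corrected values from the first line, so the correction propagates to every higher order.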
Figures 2 and 3 visually demonstrate the procedure used to obtain the binding energies $E_{bind}$ and $E'_{bind}$. Insertion of the individual correction terms given in Eq. (6) into Eq. (4) yields the terms that must be computed to obtain the two binding energies. In Figure 2, Eq. (3) is visualized for the HIV II protease-ligand system. The atoms containing ghost orbitals are transparent, while the darker atoms designate the nuclei present in that particular calculation.

In Figure 3, the components of the energy decomposition scheme given by Eq. (4) producing $E'_{bind}$ are represented. The same color-coding scheme used for Figure 2 applies to Figure 3.

IV. RESULTS AND DISCUSSION

First, the total binding energy $E_{bind}$ of the HIV II protease to the Indinavir ligand was computed according to Eq. (3) with the constituent terms shown in Figure 2. It was found to amount to −23.57 kcal/mol, −111.60 kcal/mol, and −131.68 kcal/mol for the HF/6-31G*, M06-L/6-31G*, and PM6-DH2 levels of theory, respectively, as shown in Table I. Next, the correction terms $\sum \Delta^n E(ij \ldots n)$ given by Eq. (6) were obtained. The large jump in the number of the energy values needed to generate the total many-body correction terms set the limit to where we truncated Eq. (6). The limit is mostly forced by the computational expense of the various methods.

Table I shows the individual total correction terms for the three levels of theory, HF, M06-L, and PM6-DH2. The first point to note about these total correction terms is that as n increases, the absolute value of the total correction term associated with n, $\sum \Delta^n E(ij \ldots n)$, decreases. That is, the magnitude of the contributions of the total correction terms associated with that particular n to $E'_{bind}$ diminishes as n rises. For HF, it dropped from |−26.86| kcal/mol to |4.24| kcal/mol, which corresponds to a fall of 84.2%. Thus, the three-body total correction produces 18.7% of the $E'_{bind}$. At the M06-L level, this decline is much sharper: the total two-body correction term was |−111.49| kcal/mol, while the total three-body correction equaled |−4.45| kcal/mol, which yielded a reduction of 96.0%. The $\sum \Delta^3 E(ijk)$ correction contributed 3.84% to the overall $E'_{bind}$. The semiempirical PM6-DH2 level was found to show an even sharper drop of 98.6%, from |−132.86| kcal/mol to |1.92| kcal/mol. The contribution of the three-body correction terms to $E'_{bind}$ was 1.46%. Due to the speed of semiempirical methods higher order correction terms up to n = 5 were included in the computed $E'_{bind}$ for PM6-DH2 and were found to make small contributions. The four-body corrections add up to |−0.76| kcal/mol, which corresponds to 0.58% of the total $E'_{bind}$ and 0.57% of the total two-body correction. The total five-body correction term is |0.08| kcal/mol, or 0.06% of the total $E'_{bind}$ and 0.06% of the overall two-body correction. This equality of percentages in the presence or absence of the higher order correction terms $\sum \Delta^n E(ij \ldots n)$ with n > 2 demonstrates that their effect on
TABLE I. $E_{bind}$, $E'_{bind}$, and the total correction terms for various orders of correction estimated using M06-L, HF, and PM6-DH2. For HF and M06-L, the calculations were completed up to third order (n = 3), while with PM6-DH2, terms up to fifth order (n = 5) were obtained. All energy values are in kcal/mol.
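The percentages quoted in Sec. IV can be rechecked from the totals stated in the text. The sketch below uses only those quoted energies; the variable and function names are our own, and the formula in `pairwise_agreement` is an assumption, chosen because it reproduces the 86.0%, 99.9%, and 99.1% figures given later in the discussion.

```python
# Totals quoted in the text (kcal/mol); signs follow the |...| values as printed.
E_BIND = {"HF": -23.57, "M06-L": -111.60, "PM6-DH2": -131.68}   # from Eq. (3)
CORRECTIONS = {                                                 # two-body, three-body, ...
    "HF": [-26.86, 4.24],
    "M06-L": [-111.49, -4.45],
    "PM6-DH2": [-132.86, 1.92, -0.76, 0.08],
}

def drop_percent(method):
    """Percent fall in magnitude from the two-body to the three-body total."""
    two, three = CORRECTIONS[method][:2]
    return 100.0 * (1.0 - abs(three) / abs(two))

def share_of_expansion(method, order):
    """Percent contribution of the order-n total to the summed expansion."""
    terms = CORRECTIONS[method]
    return 100.0 * abs(terms[order - 2]) / abs(sum(terms))

def pairwise_agreement(method):
    """Assumed definition: 100 * (1 - |two-body total - E_bind| / |E_bind|)."""
    two = CORRECTIONS[method][0]
    return 100.0 * (1.0 - abs(two - E_BIND[method]) / abs(E_BIND[method]))

for method in CORRECTIONS:
    print(method,
          round(drop_percent(method), 1),
          round(share_of_expansion(method, 3), 2),
          round(pairwise_agreement(method), 1))
```

Under these definitions the three-body shares are taken relative to the sum of the computed correction terms, which is how the 18.7%, 3.84%, and 1.46% values in the text appear to have been obtained.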
FIG. 3. Visualization of the components summing up to total correction terms given in Eq. (4). The panels display the ligand and the particular fragments’
nuclei participating in the calculations. Transparent atoms are the “ghosted” ones, while the solid atoms are the ones whose nuclei take part in that particular
calculation. All figures were created using the VMD package (see Ref. 36).
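The bookkeeping behind these components, namely which fragment combinations contain the ligand and how many exist at each order, can be sketched as follows. The fragment labels 1 to 18 for the receptor and 19 for the ligand follow the text; the function names are illustrative assumptions.

```python
from itertools import combinations
from math import comb

N_RECEPTOR = 18   # receptor fragments, labelled 1..18
LIGAND = 19       # the ligand is labelled as the 19th fragment

def n_ligand_terms(order, n_receptor=N_RECEPTOR):
    """Number of order-n terms that survive the cancellation in Eq. (4)."""
    return comb(n_receptor + 1, order) - comb(n_receptor, order)

def ligand_subsets(order, n_receptor=N_RECEPTOR):
    """Yield every order-n fragment combination that contains the ligand."""
    for receptor_part in combinations(range(1, n_receptor + 1), order - 1):
        yield receptor_part + (n_receptor + 1,)

print("all pairs in the 19-body cluster:", comb(19, 2))
for order in range(2, 6):
    print(f"order {order}: {n_ligand_terms(order)} ligand-containing terms")
```

Because every surviving term pairs the ligand with a choice of receptor fragments, the count at order n equals the number of ways to pick n − 1 fragments out of 18, which is why the numbers grow so quickly with n.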
FIG. 5. The distribution of the three-body correction terms, $\Delta^3 E$ (kcal/mol), to the ligand-binding pocket interactions at M06-L, HF, and PM6-DH2 levels are given in red, black, and blue, respectively. The inset shows the peak region in more detail, where the y axis is the density (the AUC = 1).
down as n increases. For n = 2, the range of the magnitudes containing most of the interactions is [0.00:1.00] kcal/mol (Figure 7), while this range loses one order of magnitude for n = 3 (see Figure 8). For n = 4 and n = 5, the corresponding range decreases one and three orders of magnitude further relative to the three-body correction terms, respectively. Hence, there is an overall decline in the magnitudes of the interactions as more "parties" are involved in one particular interaction. To reframe the picture, in spite of the higher number of interaction terms involving increasing number of parties, the cumulative energy correction arising from the entire group of interactions among n bodies shows such a strong decay that at cases for n ≥ 4, it reaches below the limits of accurate computation.
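The recursion of Eq. (5) and the truncated binding energy of Eq. (4) can be illustrated on a toy system. Everything in this sketch is hypothetical: the four "fragments", their energies, and the function names are invented for illustration, and the toy energy function stands in for the quantum chemical calculations used in the paper.

```python
from itertools import combinations

# Hypothetical toy model: fragments 1-3 form the "receptor", fragment 4 the "ligand".
ONE_BODY = {1: -1.0, 2: -2.0, 3: -0.5, 4: -3.0}
PAIR = {(1, 2): -1.0, (1, 4): -4.0, (2, 4): -2.5, (3, 4): 0.5}
TRIPLE = {(1, 2, 4): 0.25}

def toy_energy(subset):
    """Total energy of a fragment subset in the toy model."""
    s = tuple(sorted(subset))
    energy = sum(ONE_BODY[i] for i in s)
    energy += sum(PAIR.get(p, 0.0) for p in combinations(s, 2))
    energy += sum(TRIPLE.get(t, 0.0) for t in combinations(s, 3))
    return energy

def many_body_term(energy, subset):
    """n-body term of Eq. (5): subset energy minus all lower-order terms."""
    s = tuple(sorted(subset))
    term = energy(s)
    for m in range(1, len(s)):
        for sub in combinations(s, m):
            term -= many_body_term(energy, sub)
    return term

def binding_from_expansion(energy, receptor, ligand, max_order):
    """Sum the ligand-containing terms of Eq. (4) up to max_order."""
    total = 0.0
    for order in range(2, max_order + 1):
        for part in combinations(receptor, order - 1):
            total += many_body_term(energy, part + (ligand,))
    return total

def binding_direct(energy, receptor, ligand):
    """Eq. (3): complex minus ligand minus binding pocket."""
    return energy(tuple(receptor) + (ligand,)) - energy((ligand,)) - energy(tuple(receptor))

receptor, ligand = (1, 2, 3), 4
print("direct:", binding_direct(toy_energy, receptor, ligand))
for order in (2, 3, 4):
    print(f"expansion up to n = {order}:",
          binding_from_expansion(toy_energy, receptor, ligand, order))
```

In this toy model the pairwise sum already lies close to the direct value and the expansion becomes exact at n = 3, since no four-body contribution was built in. The plain recursion recomputes lower-order terms many times, so a real application to 19 fragments would cache them.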
FIG. 6. The distributions of the n-body terms for n = 3 (left), n = 4 (middle), and n = 5 (right) at the PM6-DH2 level. These distributions apply to ligand-receptor interactions. The panels plot density against $\Delta^3 E$, $\Delta^4 E$, and $\Delta^5 E$ (kcal/mol), respectively.
FIG. 7. The distributions of the pairwise corrections, $\Delta^2 E$ (kcal/mol), to all the interactions present within the Indinavir-binding pocket complex at the (black) HF level, (red) M06-L level, and (blue) PM6-DH2 level; the y axis is the density. One interaction lies around −110 kcal/mol at both HF and M06-L levels, which involved adjacent opposite charges and has been omitted for the purpose of visual clarity. The detailed peak region is shown in the inset.
Having discussed the limits of accuracy, it is necessary to acknowledge the uncertainty within the consecutive correction terms. Revisiting Eq. (5), the (n + 1)th total correction term of the energy decomposition scheme given in Eq. (4), $\sum \Delta^{n+1} E(ijk..)$, includes the lower nth, (n − 1)th, (n − 2)th, ... and eventually first order correction terms. At this point, any nth order correction term with n > 1 would have some level of uncertainty. The number of lower order terms contributing to the nth order correction term rises with increasing n, which results in accumulation of uncertainty within the
FIG. 8. The distributions of n-body correction terms for n = 3 (left), n = 4 (middle), and n = 5 (right) at the PM6-DH2 level of theory. These correction terms apply to all the interactions involving 3- to 5-parties out of the members of the Indinavir-binding pocket complex system, including all the 18 fragments constituting the binding pocket and the Indinavir ligand. The panels plot density against $\Delta^3 E$, $\Delta^4 E$, and $\Delta^5 E$ (kcal/mol), respectively.
FIG. 9. Magnitude of individual three-body corrections vs. the distance between the nearest atoms of the two receptor fragments involved in the three-body interaction. Interactions involving charged fragments are color-coded with red, water fragments with blue, and other fragments with black. The plot has been obtained by using GNUPLOT 4.2.
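The distance plotted on the horizontal axis of Figure 9, the separation between the nearest atoms of two receptor fragments, can be computed as in the sketch below. The coordinates are hypothetical, the function names are our own, and the 4 Å cutoff in `proximity_class` simply mirrors the threshold mentioned in the discussion; none of this is the authors' actual script.

```python
from math import dist

def nearest_atom_distance(frag_a, frag_b):
    """Smallest interatomic distance between two fragments.

    Each fragment is a sequence of (x, y, z) coordinates; the result has the
    same length unit as the input (Å in the example below).
    """
    return min(dist(a, b) for a in frag_a for b in frag_b)

def proximity_class(distance, cutoff=4.0):
    """Label a fragment pair relative to the cutoff (assumed to be in Å)."""
    return "close" if distance < cutoff else "distant"

# Hypothetical coordinates in Å, for illustration only.
frag_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frag_b = [(4.0, 0.0, 0.0), (1.0, 3.0, 0.0)]

d = nearest_atom_distance(frag_a, frag_b)
print(f"nearest-atom distance: {d:.2f} Å ({proximity_class(d)})")
```

The double loop scales with the product of the fragment sizes, which is unproblematic for fragments of at most a few dozen atoms such as those described in Sec. II.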
n-body total correction term. Hence, although the decreasing magnitudes of the nth order corrections with increasing n have been confirmed at three different levels of theory, we do not claim that the numbers presented for higher n-body corrections are free of uncertainty. What is certain here is that the contribution of corrections to the total energy expansion yielding $E_{\mathrm{bind}}$ decreases as one includes higher terms in the expansion given in Eq. (4).

Although it was found that the pairwise interactions represent the full interaction energy $E_{\mathrm{bind}}$ to 86.0% at the HF level, to 99.9% at the M06-L level, and to 99.1% at the PM6-DH2 level of theory, the slight contribution of the higher order corrections was another point of interest. The cumulative impact of the three-body corrections was very small. However, the individual $\Delta^{3}E(ijk)$ values associated with some of the receptor fragments and the ligand had magnitudes of several kcal/mol. In order to understand whether the magnitudes of these individual three-body corrections could be categorized according to the proximity of the two receptor fragments interacting with the ligand, the distance between the nearest atoms of the non-ligand fragments has been plotted against the corresponding three-body correction at the M06-L level of theory as shown in Figure 9. These data reveal that the significant three-body corrections stem from interactions involving polarization of the ligand or the surrounding receptor fragments. The red data points symbolize the charged fragments, whereas the blue data points stand for the two water fragments and the black data points represent the remaining peptide and hydrocarbon fragments. The charged and hydrogen-bonding fragments significantly polarize their environment, including both the ligand and the non-ligand parties involved in the three-body interaction. Each polarizing entity affects the electron densities on the remaining parties involved in that particular interaction, which evokes a larger magnitude for the individual three-body correction. At distances <4 Å, the impacts of polarization are stronger due to the proximity of the receptor fragments. Only the charged fragments are capable of polarizing the remaining interacting parties over greater distances (>4 Å) and influencing the interaction enough to result in a considerable three-body correction.

Taken together, we conclude that to a good approximation, the protein-ligand complex studied herein behaves additively. This observation supports the use of pairwise additive force fields and enthalpy computation in fragment-based drug design. Most force fields define non-bonded interactions as a double summation, which can be decomposed into van der Waals and electrostatic energies over all interacting atom pairs.⁴¹,⁴² This pairwise treatment of non-bonded interactions in protein-ligand complexes is supported by the sharp drop in the contributions of the n-body correction terms, where n > 2, at multiple levels of theory. Second, as mentioned in the “Introduction,” the observed additivity for the ligand-binding pocket energetics provides a powerful approximation for the potential additivity of the binding enthalpies of the fragments when they are unified into a larger molecule.

This observed additivity has another significant application for the improvement of scoring algorithms based on energy. Faver et al.²⁵ have recently suggested that systematic errors propagate as a simple sum of the errors contained within the individual interactions associated with each fragment pair. Thus, if the sum of the energies of the fragment pairs approximates the total binding energy well enough, then the error propagation formula $\mathrm{Error}_{\mathrm{Systematic}} = \mathrm{Err}_{1} + \mathrm{Err}_{2} + \mathrm{Err}_{3} + \cdots$ represents the total systematic error to a considerable accuracy. This finding confirms the proposed idea of
accounting for the systematic errors in a physics-based score function by constructing a reference library composed of numerous unique interacting fragments and developing accurate error probability density functions based on those interaction libraries. The effective application of a systematic error correction scheme is facilitated through additivity with respect to the fragment energies.

V. CONCLUSIONS

In this work we aimed to show that additivity principles are applicable to electronic interaction energies of fragments making up a larger molecular entity. We employed an energy expansion, which decomposes protein-ligand interaction energies into n terms where each term designates the contributions originating from m interacting fragments with m ≤ n. In our scenario involving the HIV II protease active site complexed to the inhibitor Indinavir, m ranged from 2 to 5. The reliability of this decomposition scheme was confirmed at the HF, M06-L, and PM6-DH2 levels of theory. For all three levels, inclusion of higher order terms with m > 2 brought the interaction energy expansion closer to the exact binding energy of the system, which was obtained by subtraction of energies of the ligand and the binding pocket from that of the full complex ($E_{\mathrm{bind}}$).

Additivity is theoretically and experimentally unsupported in the literature with regard to free energies.²¹–²³ However, for enthalpies and electronic energies, additivity is largely supported in the case studied herein. The narrowing of the distributions for higher order interactions (for m > 2) supports additivity for energies and enthalpies. The energy ranges encompassing the mth order corrections become consistently narrower with increasing m. Moreover, the ranges become so narrow that they cannot be accurately resolved with the present computational methods. Although the number of correction terms summing up to the total m-body correction term increases very rapidly and amounts to thousands at m = 4, the magnitudes of these corrections are so small that they do not accumulate to significant values. In other words, the quick decrease in the magnitudes of the correction terms with rising m overwhelms the increase in the number of single correction terms to be added to yield the total m-body correction term for a particular m. This observation is advantageous from a computational perspective, since the thousands of terms in the m > 3 corrections may safely be neglected in the energy expansion for analogous systems.

From the present work we can arrive at three main conclusions. First, many force fields evaluate non-bonded interactions in a pairwise fashion for protein-ligand systems. Now, having confirmed that the two-body corrections to the overall energy expansion constitute more than 95% of the total protein-ligand interaction energy at the M06-L and PM6-DH2 levels of theory, the use of pairwise potentials in drug design applications is supported. Second, our results place fragment-based drug design on a firmer footing, especially with regard to interaction energy computation. Finally, if the additive model provides a good approximation for the total binding energy of the ligand to the receptor, then the systematic errors in each of the receptor-ligand fragment interactions accumulate as a simple sum, which accurately reveals the total systematic error for the whole binding event. Thus, the observed additivity of fragment energies supports the idea of the post-hoc correction of systematic errors in physics-based score functions.

ACKNOWLEDGMENTS

We thank the National Institutes of Health (Grant Nos. GM044974 and GM066689) for funding the present research. Helpful communication with Professor Alan E. Mark is greatly appreciated. We also acknowledge the University of Florida High-Performance Computing Center for their computational support. M.N.U. and D.S.D. contributed equally to the work.

1 G. M. Makara, J. Med. Chem. 50(14), 3214 (2007).
2 A. A. Alex and M. M. Flocco, Curr. Top. Med. Chem. 7(16), 1544 (2007).
3 P. J. Hajduk and J. Greer, Nat. Rev. Drug Discovery 6(3), 211 (2007).
4 N. Brooijmans and I. D. Kuntz, Annu. Rev. Biophys. Biomol. Struct. 32, 335 (2003).
5 O. Guvench and A. D. MacKerell, Curr. Opin. Struct. Biol. 19(1), 56 (2009).
6 P. Kolb and J. J. Irwin, Curr. Top. Med. Chem. 9(9), 755 (2009).
7 A. R. Leach, B. K. Shoichet, and C. E. Peishoff, J. Med. Chem. 49(20), 5851 (2006).
8 G. Morra, A. Genoni, M. A. C. Neves, K. M. Merz, and G. Colombo, Curr. Med. Chem. 17(1), 25 (2010).
9 M. Congreve, G. Chessari, D. Tisi, and A. J. Woodhead, J. Med. Chem. 51(13), 3661 (2008).
10 Z. G. Chen, Y. Li, E. Chen, D. L. Hall, P. L. Darke, C. Culberson, J. A. Shafer, and L. C. Kuo, J. Biol. Chem. 269(42), 26344 (1994).
11 B. D. Dorsey, R. B. Levin, S. L. McDaniel, J. P. Vacca, J. P. Guare, P. L. Darke, J. A. Zugay, E. A. Emini, W. A. Schleif, J. C. Quintero, J. H. Lin, I. W. Chen, M. K. Holloway, P. M. D. Fitzgerald, M. G. Axel, D. Ostovic, P. S. Anderson, and J. R. Huff, J. Med. Chem. 37(21), 3443 (1994).
12 J. P. Vacca, B. D. Dorsey, W. A. Schleif, R. B. Levin, S. L. McDaniel, P. L. Darke, J. Zugay, J. C. Quintero, O. M. Blahy, E. Roth, V. V. Sardana, A. J. Schlabach, P. I. Graham, J. H. Condra, L. Gotlib, M. K. Holloway, J. Lin, I. W. Chen, K. Vastag, D. Ostovic, P. S. Anderson, E. A. Emini, and J. R. Huff, Proc. Natl. Acad. Sci. U.S.A. 91(9), 4096 (1994).
13 S. S. Xantheas, J. Chem. Phys. 100(10), 7523 (1994).
14 P. V. Benos, M. L. Bulyk, and G. D. Stormo, Nucleic Acids Res. 30(20), 4442 (2002).
15 S. Shimizu and H. S. Chan, J. Chem. Phys. 115(3), 1414 (2001).
16 A. Horovitz, Folding Des. 1(6), R121 (1996).
17 P. J. Carter, G. Winter, A. J. Wilkinson, and A. R. Fersht, Cell 38(3), 835 (1984).
18 A. Horovitz and M. Rigbi, J. Theor. Biol. 116(1), 149 (1985).
19 J. A. Wells, Biochemistry 29(37), 8509 (1990).
20 G. Schreiber and A. R. Fersht, J. Mol. Biol. 248(2), 478 (1995).
21 A. E. Mark and W. F. van Gunsteren, J. Mol. Biol. 240(2), 167 (1994).
22 B. Baum, L. Muley, M. Smolinski, A. Heine, D. Hangauer, and G. Klebe, J. Mol. Biol. 397(4), 1042 (2010).
23 E. T. Olejniczak, P. J. Hajduk, P. A. Marcotte, D. G. Nettesheim, R. P. Meadows, R. Edalji, T. F. Holzman, and S. W. Fesik, J. Am. Chem. Soc. 119(25), 5828 (1997).
24 Y. Zhao and D. G. Truhlar, J. Chem. Phys. 125, 194101 (2006).
25 J. C. Faver, M. L. Benson, X. He, B. P. Roberts, B. Wang, M. S. Marshall, M. R. Kennedy, C. D. Sherrill, and K. M. Merz, J. Chem. Theory Comput. 7(3), 790 (2011).
26 R. Ditchfield, W. J. Hehre, and J. A. Pople, J. Chem. Phys. 54(2), 724 (1971).
27 S. E. Wheeler and K. N. Houk, J. Chem. Theory Comput. 6(2), 395 (2010).
28 J. M. Word, S. C. Lovell, J. S. Richardson, and D. C. Richardson, J. Mol. Biol. 285(4), 1735 (1999).
29 V. Hornak, R. Abel, A. Okur, B. Strockbine, A. Roitberg, and C. Simmerling, Proteins: Struct., Funct., Bioinf. 65(3), 712 (2006).
30 J. J. P. Stewart, J. Mol. Model. 13(12), 1173 (2007).
31 G. B. Bacskay, Chem. Phys. 61(3), 385 (1981).
32 A. D. Rabuck and G. E. Scuseria, J. Chem. Phys. 110(2), 695 (1999).
33 S. F. Boys and F. Bernardi, Mol. Phys. 19(4), 553 (1970).
34 M. J. T. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria et al., GAUSSIAN 09, Gaussian, Inc., Wallingford, CT, 2009.
35 J. J. P. Stewart, MOPAC2009, Stewart Computational Chemistry, Colorado
38 O. Matsuoka, E. Clementi, and M. Yoshimine, J. Chem. Phys. 64(4), 1351 (1976).
39 W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, and M. L. Klein, J. Chem. Phys. 79(2), 926 (1983).
40 H. Kistenmacher, G. C. Lie, H. Popkie, and E. Clementi, J. Chem. Phys.
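As a supplementary illustration of the energy expansion summarized in the Conclusions, the following minimal sketch shows how the binding energy of a ligand to a fragmented receptor can be split into two-body, three-body, and higher-order corrections, and how the pairwise share of the total can then be checked. It is not the authors' code. The fragment labels (`L`, `A`, `B`, `C`), the function `toy_energy`, and every numerical value are hypothetical stand-ins for quantum chemical energies; the corrections are obtained by inclusion-exclusion over fragment subsets that contain the ligand, which is assumed here to correspond to the expansion referred to as Eq. (4).

```python
from itertools import combinations

# Hypothetical toy model: pair and triple contributions in kcal/mol.
# In a real study each subset energy would come from an HF, DFT, or
# semiempirical calculation; these numbers are invented for illustration.
PAIR = {
    frozenset("LA"): -12.0, frozenset("LB"): -7.5, frozenset("LC"): -3.0,
    frozenset("AB"): -1.0, frozenset("BC"): -0.5, frozenset("AC"): -0.2,
}
TRIPLE = {frozenset("LAB"): -0.8, frozenset("LBC"): 0.3}


def toy_energy(subset):
    """Energy of a set of fragments (isolated fragment energies set to zero)."""
    pair_part = sum(e for key, e in PAIR.items() if key <= subset)
    triple_part = sum(e for key, e in TRIPLE.items() if key <= subset)
    return pair_part + triple_part


def many_body_correction(energy, fragments):
    """Inclusion-exclusion correction for one group of fragments."""
    n = len(fragments)
    total = 0.0
    for k in range(1, n + 1):
        sign = (-1) ** (n - k)
        for sub in combinations(fragments, k):
            total += sign * energy(frozenset(sub))
    return total


def binding_expansion(energy, ligand, receptor):
    """Sum the corrections of each order over all subsets containing the ligand."""
    by_order = {}
    for k in range(1, len(receptor) + 1):
        by_order[k + 1] = sum(
            many_body_correction(energy, (ligand,) + sub)
            for sub in combinations(receptor, k)
        )
    return by_order


if __name__ == "__main__":
    receptor = ("A", "B", "C")
    terms = binding_expansion(toy_energy, "L", receptor)
    # Direct binding energy: complex minus binding pocket minus ligand.
    e_bind = (toy_energy(frozenset("LABC"))
              - toy_energy(frozenset(receptor))
              - toy_energy(frozenset("L")))
    for order, value in terms.items():
        print(f"{order}-body total correction: {value:.3f} kcal/mol")
    print(f"expansion sum: {sum(terms.values()):.3f}, direct: {e_bind:.3f}")
    print(f"pairwise share of E_bind: {100 * terms[2] / e_bind:.1f}%")
```

Summing all orders recovers the directly computed binding energy by construction, so the quantity of interest is how much of it the two-body term alone already captures. The number of subsets grows combinatorially with the order, which is why truncating the expansion early is attractive when the higher-order terms are small.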