
Received: 23 May 2019 Revised: 15 July 2019 Accepted: 24 July 2019

DOI: 10.1002/qua.26035

SOFTWARE NEWS UPDATES

Python implementation of the restrained electrostatic potential charge model

Asem Alenaizan¹ | Lori A. Burns¹ | C. David Sherrill¹,²

¹ School of Chemistry and Biochemistry, and Center for Computational Molecular Science and Technology, Georgia Institute of Technology, Atlanta, Georgia
² School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia

Correspondence
Asem Alenaizan, School of Chemistry and Biochemistry, and Center for Computational Molecular Science and Technology, Georgia Institute of Technology, Atlanta, GA 30332-0400.
Email: alenaizana@gmail.com

Funding information
National Science Foundation, Grant/Award Numbers: ACI-1449723, CHE-1504217

Abstract
The restrained electrostatic potential (RESP) charge model is widely used in molecular dynamics simulations, especially for the AMBER and GAFF force fields. We have implemented the RESP scheme using the accessible and widely used Python language and the NumPy numerical library. This article provides a programming-oriented introduction to the RESP scheme and highlights some of the features of NumPy that are useful in scientific computing.

KEYWORDS
atomic charges, restrained electrostatic potential, software

1 | INTRODUCTION

Atomic partial charges are a useful conceptual framework through which chemists explain chemical trends. Atomic charges also enjoy widespread
quantitative use in classical molecular mechanics and molecular dynamics where nonbonded electrostatics are represented by pairwise interactions
between atom-centered partial charges.
Assigning charges to atoms is an ill-defined procedure, and there is no unique approach for defining partial charges. Consequently, a large variety
of charge models has been proposed.[1] Some models parameterize the partial charges using experimental data, while others utilize computational
quantum mechanical approaches to compute partial charges using the properties of the approximate wave function itself or its
observables.[1] Examples of the latter approach include Mulliken charges,[2] which are extracted from the population density of the occupied
orbitals, and charges fitted to approximately reproduce the computed electrostatic potential (ESP) of the molecule.[3]
Murray and Politzer[4] provide an overview of the ESP in chemical contexts. The ESP is the potential felt at a point r in space due
to the electrons and nuclei in a molecule, and it is given by:

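$$V_{\mathrm{ESP}}(\mathbf{r}) = \sum_{k}^{\mathrm{nuclei}} \frac{Z_k}{|\mathbf{r} - \mathbf{r}_k|} - \int d\mathbf{r}'\, \Psi^{*}(\mathbf{r}') \frac{1}{|\mathbf{r} - \mathbf{r}'|} \Psi(\mathbf{r}'), \tag{1}$$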

where Zk is the charge of nucleus k, rk are the positions of the nuclei, and Ψ is the wave function. Practically, the ESP can be utilized to explain
chemical trends. For example, it may be depicted visually in maps that show contrasting colors for regions with positive and negative ESP, which
in turn can identify chemically reactive regions of the molecules. Furthermore, it can be used to gain insight into intermolecular interactions for
noncovalent complexes[5] (although it can also be misinterpreted in this context).[6] Overall, the ESP is a reasonable, physically motivated quantity
to use when fitting partial charges.
Various models determine the partial charges by a least-squares fit of the ESP computed from the point charges to that computed quantum
mechanically. The models include the CHELPG[7] and the Merz-Kollman schemes,[8,9] as well as a more recent model developed by Yang and

Int J Quantum Chem. 2019;1–7. http://q-chem.org © 2019 Wiley Periodicals, Inc. 1


2 ALENAIZAN ET AL.

coworkers.[10] The restrained electrostatic potential (RESP) model is an improvement of the Merz-Kollman scheme.[11] RESP charges are popular
because of their use in the AMBER and GAFF force fields.[12–14] Therefore, it is profitable to have a practical exposition of how to implement the
RESP approach, which the present article aims to address. We use snippets of our code written in the popular Python language to explain the
RESP procedure.

2 | IMPLEMENTATION

2.1 | General considerations


The canonical program for obtaining RESP charges is the RESP module implemented by Christopher I. Bayly, the first author of the original RESP
paper,[11] available in the AmberTools package.[15] The RESP and ESP charge Derive (R.E.D.) project[16] provides a set of tools to automate the
computation of RESP charges, and it has interfaces with the quantum chemistry packages Gaussian[17] and GAMESS.[18] Furthermore, it provides
a web server and database to facilitate the parameterization of new molecules.[19,20] We have used these existing tools for the validation of our
implementation.
Our open-source implementation of the RESP procedure uses the easy-to-learn Python language and the Numerical Python library
(NumPy).[21] Python is increasingly used for programming in the molecular sciences. An implementation of the RESP procedure in Python serves
two objectives. First, it provides a clear representation of the RESP model, which might enable scientists to better understand the details of the
RESP scheme. Second, it facilitates the integration of the RESP scheme into existing or future workflows in quantum chemistry or molecular
dynamics software programs. Our implementation was initially contributed to the Psi4NumPy project.[22] Now, it has been updated to work as a
plug-in to the Psi4 software package[23] and can be installed along with Psi4 using the Conda package manager. If the reader is interested in
extending the program to support other packages, they may contact the corresponding author. For details on the installation of the code, the
reader is referred to the GitHub repository at https://github.com/cdsgroup/resp. Only minimal knowledge of Python and NumPy is required to understand
the following subsections.
Here, we focus on the programmatic aspects of the RESP procedure. For tutorials on the use of RESP charges for molecular dynamics simulation
or the parameterization of new molecules, the reader is referred to existing online materials.[24,25]

2.2 | Grid generation


The first step in the RESP procedure is the generation of grid points around the molecule at which to compute the ESP. The Connolly molecular
surface[26] is used in the RESP model. Typically, four shells beyond the van der Waals surface of the molecule are generated by scaling the
van der Waals radii of the atoms by factors of 1.4, 1.6, 1.8, and 2.0. The recommended density of points on the surface is 1.0 point per Å².[8,9]
Code snippet 2 shows the overall procedure for generating the surface. Users can define their options for grid generation and store them
in the options dictionary. The code loops over the van der Waals shells the user has requested. The vdw_surface function computes
the van der Waals surface of the shell given the atomic coordinates, elements, scale factor, point density, and van der Waals radii. The grid
points are stored in the points variable.

# Get the points at which we're going to calculate the ESP
points = []
for scale_factor in options["VDW_SCALE_FACTORS"]:
    shell = vdw_surface.vdw_surface(coordinates, symbols, scale_factor,
                                    options["VDW_POINT_DENSITY"], options["VDW_RADII"])
    points.append(shell)

(2)

The vdw_surface function computes the grid points. For maximum reproducibility, we used the same scheme and default Merz-Kollman
radii as those implemented in the GAMESS package (version 30 SEPT 2017 (R2)).[18] As shown in code snippet 3, points on the spheres surrounding
each atom in the molecule are calculated. The required number of points is calculated from the density and the scaled radii. Then, points
on a unit sphere around the atom, evenly distributed along the polar and azimuthal angles, are calculated in the function surface (not shown
here). The points are then multiplied by the scaled radius and translated to the coordinates of the atom. Finally, the inner for loop checks
whether any grid point is closer to another atom in the molecule than that atom's scaled radius. If so, the point is excluded. The quantum ESP can then be
computed at the grid points.

"""
vdw_surface function in the vdw_surface.py file.
arguments: atomic coordinates, element symbols, scale factor,
point density, user-defined VDW radii
"""
# loop over atomic coordinates
for i in range(len(coordinates)):
    # calculate approximate number of ESP grid points
    n_points = int(density * 4.0 * np.pi * np.power(radii[elements[i]], 2))
    # generate an array of n_points in a unit sphere around the atom
    dots = surface(n_points)
    # scale the unit sphere by the VDW radius and translate
    dots = coordinates[i] + radii[elements[i]] * dots
    for j in range(len(dots)):
        save = True
        for k in range(len(coordinates)):
            if i == k:
                continue
            # exclude points within the scaled VDW radius of other atoms
            d = np.linalg.norm(dots[j] - coordinates[k])
            if d < radii[elements[k]]:
                save = False
                break
        if save:
            surface_points.append(dots[j])

(3)
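The exclusion test in code snippet 3 can also be written with NumPy broadcasting instead of the two inner loops. The following is a minimal, self-contained sketch under stated assumptions, not the code of the resp package: the helper names unit_sphere_points and shell_points are hypothetical, the radii are illustrative values, and a Fibonacci lattice is swapped in for the polar/azimuthal scheme of the surface function, which is not shown in the article.

```python
import numpy as np

def unit_sphere_points(n_points):
    """Hypothetical stand-in for `surface`: a Fibonacci lattice on the unit sphere."""
    k = np.arange(n_points) + 0.5
    z = 1.0 - 2.0 * k / n_points              # evenly spaced heights in (-1, 1)
    phi = np.pi * (1.0 + np.sqrt(5.0)) * k    # golden-ratio steps in the azimuthal angle
    rho = np.sqrt(1.0 - z ** 2)
    return np.column_stack((rho * np.cos(phi), rho * np.sin(phi), z))

def shell_points(coordinates, scaled_radii, density):
    """Return the points of one shell, dropping points that lie inside another atom."""
    kept = []
    for i, (center, radius) in enumerate(zip(coordinates, scaled_radii)):
        n_points = int(density * 4.0 * np.pi * radius ** 2)
        dots = center + radius * unit_sphere_points(n_points)
        # distance from every dot to every atom: shape (n_dots, n_atoms)
        dist = np.linalg.norm(dots[:, None, :] - coordinates[None, :, :], axis=2)
        inside = dist < scaled_radii[None, :]
        inside[:, i] = False                  # never test a dot against its own atom
        kept.append(dots[~inside.any(axis=1)])
    return np.vstack(kept)

# Toy diatomic: two atoms 1.0 Å apart with illustrative scaled radii (Å)
coordinates = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
scaled_radii = np.array([1.68, 1.68])
points = shell_points(coordinates, scaled_radii, density=1.0)
```

The broadcasted difference `dots[:, None, :] - coordinates[None, :, :]` builds all dot-atom separation vectors at once, so the test over atoms k becomes a single array comparison.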

2.3 | Electrostatic potential fitting


After obtaining the quantum ESP at the grid, we want to compute the classical ESP using the atom-centered point charges qj. We follow the notation
of ref. [9]. The classical Coulomb potential at point i is given in atomic units by:

$$E_i = \sum_{j=1}^{n} \frac{q_j}{r_{ij}}, \tag{4}$$

where n is the number of atoms and rij is the distance between grid point i and atom j. The goal is to find the charges that minimize the squared
difference between the quantum and classical ESPs, that is, $\sum_i (V_i - E_i)^2$, where Vi and Ei are the quantum and classical ESPs, respectively,
computed at point i. As shown in ref. [9], the procedure amounts to solving the matrix equation

$$\mathbf{A}\mathbf{q} = \mathbf{B}, \tag{5}$$

where

$$A_{jk} = \sum_{i=1}^{m} \frac{1}{r_{ij} r_{ik}}, \tag{6}$$

and

$$B_k = \sum_{i=1}^{m} \frac{V_i}{r_{ik}}, \tag{7}$$

and where m is the number of grid points. A is an n × n matrix, and B is a vector with n elements, where n is the number of atoms. The sizes of
A and B may be increased in order to impose fitting constraints as discussed below.
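For readers who want the intermediate step, the matrix equation follows from setting the derivative of the squared difference with respect to each charge to zero. The short derivation below is a sketch that uses only the symbols defined above; the symbol χ² for the sum of squared differences is introduced here for convenience.

```latex
\chi^2 = \sum_{i=1}^{m} \Big( V_i - \sum_{j=1}^{n} \frac{q_j}{r_{ij}} \Big)^2 ,
\qquad
\frac{\partial \chi^2}{\partial q_k}
  = -2 \sum_{i=1}^{m} \frac{1}{r_{ik}} \Big( V_i - \sum_{j=1}^{n} \frac{q_j}{r_{ij}} \Big) = 0
\;\Longrightarrow\;
\sum_{j=1}^{n} \Big( \sum_{i=1}^{m} \frac{1}{r_{ij} r_{ik}} \Big) q_j
  = \sum_{i=1}^{m} \frac{V_i}{r_{ik}} .
```

The bracketed sum on the left is A_jk and the right-hand side is B_k. Because A is symmetric, the n conditions (one per atom k) can be collected into Aq = B.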
Essentially, the left- and right-hand sides in Equation (5) contain information about the classical and quantum ESPs, respectively. Solving the
matrix equation yields the partial charge qj. In Python, the matrix A and vector B can be built thus:

# Construct A: A_jk = sum_i [(1/r_ij)*(1/r_ik)]
a = np.einsum("ij, ik -> jk", r_inverse, r_inverse)

# Construct B: B_k = sum_i (V_i/r_ik)
b = np.einsum("i, ik->k", V, r_inverse)

(8)
Here, we use the powerful NumPy function, einsum, to build A and B according to Equations (6) and (7). einsum can perform tensor operations
using the Einstein convention in which repeating an index implies summing over that index. This provides a clear method of translating
mathematical equations into computer code. The r_inverse variable contains a matrix of the inverse distances 1/rij.
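To make the index bookkeeping concrete, the sketch below builds r_inverse for a small made-up system and checks the two einsum calls against explicit loops. The atom positions, grid points, and ESP values are random placeholders chosen only for illustration; they are not data from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
atoms = rng.normal(size=(3, 3))             # n = 3 hypothetical atom positions
grid = rng.normal(size=(20, 3)) + 10.0      # m = 20 grid points, kept away from the atoms
V = rng.normal(size=20)                     # placeholder ESP values at the grid points

# r_inverse[i, j] = 1 / r_ij via broadcasting: (m, 1, 3) - (1, n, 3) -> (m, n, 3)
r_inverse = 1.0 / np.linalg.norm(grid[:, None, :] - atoms[None, :, :], axis=2)

# The index i appears in the inputs but not in the output, so it is summed over
a = np.einsum("ij,ik->jk", r_inverse, r_inverse)
b = np.einsum("i,ik->k", V, r_inverse)

# The same sums written as explicit loops over grid points i and atoms j, k
m, n = r_inverse.shape
a_loop = np.zeros((n, n))
b_loop = np.zeros(n)
for i in range(m):
    for k in range(n):
        b_loop[k] += V[i] * r_inverse[i, k]
        for j in range(n):
            a_loop[j, k] += r_inverse[i, j] * r_inverse[i, k]
```

For these two contractions, the einsum calls are equivalent to the matrix products `r_inverse.T @ r_inverse` and `V @ r_inverse`; the einsum form has the advantage that the subscripts mirror Equations (6) and (7) directly.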
Adding constraints to the charge-fitting procedure is an important feature. Therefore, we will discuss it here in detail. The only necessary
constraint is that the sum of the partial charges be equal to the molecular charge. In general, constraints are satisfied by adding appropriate columns
and rows to the A matrix and appropriate elements to the B vector. To illustrate the total charge constraint, let us express it mathematically as:

$$\sum_{i}^{n} q_i = Q, \tag{9}$$

where Q is the molecular charge. To account for this in A and B, we need to add columns/rows in the A matrix that have 1 in the matrix elements
which are multiplied by the atomic charges and 0 otherwise. Thus, multiplying this row of A with the atomic charges vector q will generate the
left-hand side in Equation (9). In addition, we need to add an element in B that contains the total molecular charge, corresponding to the right-hand
side of Equation (9). This can be expressed in Python as:

# Capital A and B include constraints, unlike small a and b
# Set the elements in one row and one column in A to 1
# ":molecular_charge_constraint" selects every element before that index,
# i.e., the entries of the row or column that multiply the atomic charges
A[:molecular_charge_constraint, molecular_charge_constraint] = 1
A[molecular_charge_constraint, :molecular_charge_constraint] = 1

# Set the corresponding element in B to the molecular charge
B[molecular_charge_constraint] = molecular_charge

(10)

Suppose that, in addition to the total charge constraint, we need to constrain two equivalent hydrogen atoms i and j to have the same charge.
Using the same approach, we need to add another column/row to the A matrix. Elements that correspond to the unconstrained charges will be
0. For the two elements that correspond to constrained atoms, one will be 1 and the other will be -1. Moreover, we will add another element to
B that equals zero. Thus, performing the multiplication between the relevant entries of A and q and equating it with the corresponding element in
B, we obtain:

$$q_i - q_j = 0, \tag{11}$$

or qi = qj. In this way, we can handle intricate constraints on the charges.
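The two constraints discussed above can be combined in a small, self-contained sketch. The numbers are random placeholders, the molecular charge of -1 is arbitrary, and the layout (atoms first, one extra row and column per constraint) is an assumption made for illustration rather than a description of the resp package internals.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4                                              # number of atoms
r_inverse = rng.uniform(0.1, 1.0, size=(30, n))    # placeholder 1/r_ij values
V = rng.normal(size=30)                            # placeholder ESP values
a = r_inverse.T @ r_inverse                        # Equation (6)
b = V @ r_inverse                                  # Equation (7)

n_constraints = 2
A = np.zeros((n + n_constraints, n + n_constraints))
B = np.zeros(n + n_constraints)
A[:n, :n] = a
B[:n] = b

# Constraint 1: the charges sum to the molecular charge Q = -1
row = n
A[row, :n] = 1.0
A[:n, row] = 1.0
B[row] = -1.0

# Constraint 2: atoms 2 and 3 carry the same charge, q_2 - q_3 = 0
row = n + 1
A[row, 2], A[row, 3] = 1.0, -1.0
A[2, row], A[3, row] = 1.0, -1.0
B[row] = 0.0

solution = np.linalg.solve(A, B)
q = solution[:n]          # the remaining entries are Lagrange multipliers
```

Only the first n entries of the solution are charges; the extra entries are the multipliers that enforce the constraints.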


Having constructed A and B, the final step is to solve Equation (5) to obtain the atomic charges. In Python, this can simply be achieved with
the linear algebra module of the NumPy library, linalg, and standard linear algebra techniques. We tested both the LU and singular value
decompositions for solving the system in a few large systems (150-300 atoms), and we found that they both give practically identical results for the
atomic charges (differences of less than 10⁻¹⁰ a.u.).
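As an illustration of such a comparison, the sketch below solves one small constrained system in two ways: with np.linalg.solve, which relies on an LU factorization (LAPACK gesv), and with an explicit singular value decomposition from np.linalg.svd. The system is a random placeholder, far smaller than the 150-300 atom cases mentioned above, and the choice of these particular calls is an assumption, not a statement of how the tests in the article were run.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
r_inverse = rng.uniform(0.1, 1.0, size=(50, n))   # placeholder 1/r_ij values
V = rng.normal(size=50)                           # placeholder ESP values

A = np.zeros((n + 1, n + 1))
A[:n, :n] = r_inverse.T @ r_inverse
A[n, :n] = A[:n, n] = 1.0                         # total-charge constraint
B = np.append(V @ r_inverse, 0.0)                 # neutral molecule

# LU-based solve
q_lu = np.linalg.solve(A, B)

# SVD-based solve: A = U diag(s) Vt, so A^-1 B = Vt.T diag(1/s) U.T B
U, s, Vt = np.linalg.svd(A)
q_svd = Vt.T @ ((U.T @ B) / s)

max_diff = np.max(np.abs(q_lu - q_svd))
```

For a well-conditioned matrix the two routes should differ only at the level of floating-point round-off; the SVD route additionally exposes the singular values, which can help diagnose near-redundant constraints.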

2.4 | Restraint and iterative solution


The previous procedure has enabled us to fit the partial charges to the quantum ESP. Nevertheless, direct fitting of the partial charges to the
quantum ESP has several problems. They include the poor determination of charges that are far from the molecular surface and the dependence
of the charges on conformation, leading the atomic charges to fluctuate strongly. The RESP procedure aims to alleviate these problems.[11]
The RESP scheme adds a hyperbolic restraint function meant to minimize the magnitude of the poorly determined charges (and hence their variation).
As shown in ref. [11], this corresponds to modifying all the diagonal elements of the matrix A to be:

$$A_{jj} = \sum_i \frac{1}{r_{ij}^2} + a q_j \left(q_j^2 + b^2\right)^{-1/2}, \tag{12}$$

where a determines the strength of the restraint, and b determines how “tight” the hyperbola is around the minimum. b is typically set at 0.1
electrons, while a is set at 0.0005 or 0.001 a.u. The restraint may be limited to nonhydrogen atoms. The restraint is implemented in our
code thus:

index = 0
# loop over all molecules/conformers
for mol in range(n_mol):
    for i in range(n_atoms[mol]):
        # if an element is not hydrogen or if hydrogens are to be restrained
        if not ihfree[mol] or symbols[mol][i] != "H":
            A[index+i, index+i] = (A_unrestrained[index+i, index+i]
                + resp_a[mol]/np.sqrt(q[index+i]**2 + resp_b[mol]**2))
    # move index to the beginning of the block of A for the next molecule
    index += n_atoms[mol] + n_constraints[mol]

(13)

We loop over all atoms in the molecule and check if a charge should be restrained. If so, we add the second term in Equation (12) to the
corresponding diagonal element of the unrestrained A matrix, A_unrestrained.
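The link between Equation (12) and code snippet 13 can be made explicit with a short derivation. The sketch below assumes the hyperbolic penalty of ref. [11] and absorbs constant numerical factors into a; it is offered as a reading aid rather than as the authors' own derivation.

```latex
\chi^2_{\mathrm{rstr}} = a \sum_j \left[ \left(q_j^2 + b^2\right)^{1/2} - b \right],
\qquad
\frac{\partial \chi^2_{\mathrm{rstr}}}{\partial q_j}
  = a\, q_j \left(q_j^2 + b^2\right)^{-1/2}
  = \underbrace{a \left(q_j^2 + b^2\right)^{-1/2}}_{\text{added to } A_{jj}} \; q_j .
```

Once the prefactor is evaluated with the charges of the previous iteration, the gradient is linear in q_j. The prefactor can therefore be placed on the diagonal of A, where the product Aq supplies the remaining factor q_j. This prefactor is the quantity resp_a/np.sqrt(q**2 + resp_b**2) in code snippet 13.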
To illustrate the problem of conformational dependence, consider a methanol molecule where one of the hydrogen atoms in the methyl group
is trans with the hydrogen atom of the O-H group. The charge of that atom will be different from the gauche hydrogen atoms. However, the central
methanol bond is rotatable in molecular dynamics simulations, and therefore, this will lead to essentially equivalent conformations having
different energies. To solve this issue, ref. [11] proposed enforcing the requirement that equivalent hydrogen atoms have identical charges. This
constraint can be implemented using the approach discussed in Section 2.3.
The hyperbolic restraint term depends on the charges, which in turn depend on the restraint term. Hence, an iterative procedure is required
to optimize the charges. We can acquire an initial guess for the charges using the values that we obtained from the unrestrained fit in Section 2.3.
Then, we add the restraint (code snippet 13) and solve the matrix equation. We repeat this procedure until the charges converge. For convergence,
we require that the difference of any of the charges between two consecutive iterations be below a certain threshold. This is accomplished
by the following code:

while dif > toler and niter < maxit:
    niter += 1
    A = restraint(q, A_unrestrained, resp_a, resp_b, ihfree,
                  symbols, n_atoms, n_constraints)
    q = esp_solve(A, B)
    # Extract q vector elements corresponding to charges
    dif = np.sqrt(np.max((q[atom_indices] - q_last[atom_indices])**2))
    q_last = copy.deepcopy(q)

(14)
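The iteration can be tried end to end on a toy problem. The sketch below is self-contained and makes several assumptions: the helper add_restraint is a hypothetical, single-molecule simplification of the restraint function, every atom is restrained, the ESP is generated from made-up reference charges, and resp_a is set to 0.01 (larger than the typical values quoted above) so that the effect of the restraint is easy to see.

```python
import numpy as np

def add_restraint(q, A_unrestrained, resp_a, resp_b, n_atoms):
    """Return a copy of A with the hyperbolic restraint added to the atomic diagonal."""
    A = A_unrestrained.copy()
    idx = np.arange(n_atoms)
    A[idx, idx] += resp_a / np.sqrt(q[:n_atoms] ** 2 + resp_b ** 2)
    return A

rng = np.random.default_rng(3)
n_atoms = 5
reference = np.array([0.4, -0.6, 0.1, 0.05, 0.05])     # made-up charges, sum = 0
r_inverse = rng.uniform(0.1, 1.0, size=(60, n_atoms))   # placeholder 1/r_ij values
V = r_inverse @ reference                               # ESP generated by the reference charges

# Unrestrained system with a total-charge constraint (neutral molecule)
A_unrestrained = np.zeros((n_atoms + 1, n_atoms + 1))
A_unrestrained[:n_atoms, :n_atoms] = r_inverse.T @ r_inverse
A_unrestrained[n_atoms, :n_atoms] = A_unrestrained[:n_atoms, n_atoms] = 1.0
B = np.append(V @ r_inverse, 0.0)

# Initial guess from the unrestrained fit, then iterate to self-consistency
q_unrestrained = np.linalg.solve(A_unrestrained, B)
q = q_unrestrained.copy()
resp_a, resp_b = 0.01, 0.1
dif, niter, toler, maxit = 1.0, 0, 1e-8, 50
while dif > toler and niter < maxit:
    niter += 1
    A = add_restraint(q, A_unrestrained, resp_a, resp_b, n_atoms)
    q_new = np.linalg.solve(A, B)
    dif = np.max(np.abs(q_new[:n_atoms] - q[:n_atoms]))
    q = q_new
```

In this toy setup the unrestrained fit reproduces the reference charges, and the restrained charges are pulled toward zero while the total charge stays fixed by the constraint row.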

The recommended RESP procedure requires two-stage fitting. In the first stage, a small scaling parameter a (0.0005 a.u.) is used, and
symmetry is not enforced. This allows for better fitting of the polar regions of the molecule. In the second, a larger scaling parameter
(0.001 a.u.) is used; the charges in the polar regions are frozen, and symmetry is enforced. The unfrozen atoms in the second stage are
normally in methyl groups. Freezing charges and symmetry enforcement are performed through adding constraints as described in
Section 2.3.
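Freezing and symmetry enforcement both reduce to extra constraint rows, which the following sketch illustrates for a two-stage fit. It is a simplified illustration under stated assumptions: the helper constrained_fit is hypothetical, the data are random placeholders, the atom roles (atoms 0 and 1 frozen, atoms 2 and 3 equivalent) are invented, and the hyperbolic restraint is omitted so that only the constraint handling is shown.

```python
import numpy as np

def constrained_fit(a, b, constraints):
    """Solve a q = b subject to linear constraints given as (coefficients, target) pairs."""
    n, c = len(b), len(constraints)
    A = np.zeros((n + c, n + c))
    B = np.zeros(n + c)
    A[:n, :n] = a
    B[:n] = b
    for row, (coefficients, target) in enumerate(constraints, start=n):
        A[row, :n] = coefficients      # constraint row
        A[:n, row] = coefficients      # matching column for the Lagrange multiplier
        B[row] = target
    return np.linalg.solve(A, B)[:n]

rng = np.random.default_rng(4)
n = 5
r_inverse = rng.uniform(0.1, 1.0, size=(40, n))   # placeholder 1/r_ij values
V = rng.normal(size=40)                           # placeholder ESP values
a, b = r_inverse.T @ r_inverse, V @ r_inverse

# Stage 1: only the total charge (neutral molecule) is constrained
total_charge = (np.ones(n), 0.0)
stage1 = constrained_fit(a, b, [total_charge])

# Stage 2: freeze atoms 0 and 1 at their stage-1 charges and tie atoms 2 and 3
freeze0 = (np.eye(n)[0], stage1[0])
freeze1 = (np.eye(n)[1], stage1[1])
tie23 = (np.array([0.0, 0.0, 1.0, -1.0, 0.0]), 0.0)
stage2 = constrained_fit(a, b, [total_charge, freeze0, freeze1, tie23])
```

A frozen charge is a constraint row with a single 1 and the frozen value as the target, so it uses the same machinery as the total-charge and equivalence constraints.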
Often, a balanced determination of the partial charges requires their fitting to multiple conformers. To perform this, grid points and ESPs are
computed for all the conformers. The A matrix becomes block diagonal; each block contains data for a given conformer. The B vector is also
extended to contain the elements for the different conformers. Constraints are enforced such that corresponding atoms in the different conformers
are forced to have identical charges. Then, the RESP procedure is performed as discussed above. Code snippet 13, for example,
shows the use of the mol variable to index the options for different conformers.
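The block-diagonal construction can be sketched for two conformers of a three-atom molecule. The layout below (both atomic blocks first, then all constraint rows) is an assumption chosen for brevity and differs from the per-conformer layout implied by code snippet 13; the data are random placeholders and the restraint is omitted.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3                                             # atoms per conformer
blocks = []
for _ in range(2):                                # two conformers, each with its own grid
    r_inverse = rng.uniform(0.1, 1.0, size=(40, n))
    V = rng.normal(size=40)
    blocks.append((r_inverse.T @ r_inverse, V @ r_inverse))
(a1, b1), (a2, b2) = blocks

# Unknowns: [charges of conformer 1, charges of conformer 2, 1 + n multipliers]
size = 2 * n + 1 + n
A = np.zeros((size, size))
B = np.zeros(size)
A[:n, :n] = a1
A[n:2 * n, n:2 * n] = a2
B[:n] = b1
B[n:2 * n] = b2

# Total charge of conformer 1 is zero (conformer 2 follows from the ties below)
row = 2 * n
A[row, :n] = A[:n, row] = 1.0

# Tie corresponding atoms: q_j(conformer 1) - q_j(conformer 2) = 0
for j in range(n):
    row = 2 * n + 1 + j
    A[row, j] = A[j, row] = 1.0
    A[row, n + j] = A[n + j, row] = -1.0

solution = np.linalg.solve(A, B)
q1, q2 = solution[:n], solution[n:2 * n]

# Cross-check: a single fit to the summed matrices gives the same charges
S = np.zeros((n + 1, n + 1))
S[:n, :n] = a1 + a2
S[n, :n] = S[:n, n] = 1.0
q_sum = np.linalg.solve(S, np.append(b1 + b2, 0.0))[:n]
```

The total-charge constraint is imposed on one conformer only, because imposing it on both together with the ties would make the constraint rows linearly dependent and the matrix singular.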

3 | CONCLUSIONS

This article has presented a programming-oriented discussion of the RESP charge model and described our recent Python implementation of the
RESP scheme. It has introduced various useful features of the NumPy library for scientific programming and discussed how mathematical equations
can be intuitively translated into a Python program.
The availability of an accessible implementation of the RESP scheme that uses an open-source quantum chemistry package, such as Psi4, will
enable the use of the RESP model to address chemical problems of wide interest. For example, our code has recently been used by MacKerell and
coworkers to obtain charges that were used to train a model that predicts atomic charges and polarizabilities for their use in polarizable force
fields.[27] More traditionally, our code can also be used to compute electrostatic parameters for the AMBER and GAFF force fields.[12–14]
The code is available at https://github.com/cdsgroup/resp. The GitHub repository contains the documented code, installation instructions,
and examples for using the code. Users can also report issues or propose changes in the GitHub repository. The code is available as a Conda package
for easy installation.

ACKNOWLEDGEMENTS

We thank Christopher I. Bayly for helpful discussions.

ORCID

Asem Alenaizan https://orcid.org/0000-0002-0871-664X


Lori A. Burns https://orcid.org/0000-0003-2852-5864
C. David Sherrill https://orcid.org/0000-0002-5570-7666

REFERENCES

[1] C. J. Cramer, Essentials of Computational Chemistry: Theories and Models, Wiley: Chichester, England, 2004.
[2] R. S. Mulliken, J. Chem. Phys. 1955, 23, 1833.
[3] E. Sigfridsson, U. Ryde, J. Comput. Chem. 1998, 19, 377.
[4] J. S. Murray, P. Politzer, WIREs Comput. Mol. Sci. 2011, 1, 153.
[5] P. Politzer, J. S. Murray, Z. Peralta-Inga, Int. J. Quantum Chem. 2001, 85, 676.
[6] S. E. Wheeler, J. W. G. Bloom, J. Phys. Chem. A 2014, 118, 6133.
[7] C. M. Breneman, K. B. Wiberg, J. Comput. Chem. 1990, 11, 361.
[8] U. C. Singh, P. A. Kollman, J. Comput. Chem. 1984, 5, 129.
[9] B. H. Besler, K. M. Merz, P. A. Kollman, J. Comput. Chem. 1990, 11, 431.
[10] H. Hu, Z. Lu, W. Yang, J. Chem. Theory Comput. 2007, 3, 1004.
[11] C. I. Bayly, P. Cieplak, W. Cornell, P. A. Kollman, J. Phys. Chem. 1993, 97, 10269.
[12] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C. Spellmeyer, T. Fox, J. W. Caldwell, P. A. Kollman, J. Am. Chem. Soc.
1995, 117, 5179.
[13] J. A. Maier, C. Martinez, K. Kasavajhala, L. Wickstrom, K. E. Hauser, C. Simmerling, J. Chem. Theory Comput. 2015, 11, 3696.
[14] J. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman, D. A. Case, J. Comput. Chem. 2004, 25, 1157.
[15] D. A. Case, I. Y. Ben-Shalom, S. R. Brozell, D. S. Cerutti, T. E. Cheatham III., V. W. D. Cruzeiro, T. A. Darden, R. E. Duke, D. Ghoreishi, M. K. Gilson,
H. Gohlke, A. W. Goetz, D. Greene, R. Harris, N. Homeyer, S. Izadi, A. Kovalenko, T. Kurtzman, T. S. Lee, S. LeGrand, P. Li, C. Lin, J. Liu, T. Luchko,
R. Luo, D. J. Mermelstein, K. M. Merz, Y. Miao, G. Monard, C. Nguyen, H. Nguyen, I. Omelyan, A. Onufriev, F. Panm, R. Qi, D. R. Roe, A. Roitberg,
C. Sagui, S. Schott-Verdugo, J. Shen, C. L. Simmerling, J. Smith, R. Salomon-Ferrer, J. Swails, R. C. Walker, J. Wang, H. Wei, R. M. Wolf, X. Wu, L. Xiao,
D. M. York, P. A. Kollman, Amber 2018, University of California, San Francisco, CA 2018.
[16] F.Y. Dupradeau, A. Pigache, T. Zaffran, C. Savineau, R. Lelong, N. Grivel, D. Lelong, W. Rosanski, P. Cieplak, Phys. Chem. Chem. Phys. 2010, 12, 7821.
[17] M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, G. Scalmani, V. Barone, G. A. Petersson, H. Nakatsuji, X. Li,
M. Caricato, A. V. Marenich, J. Bloino, B. G. Janesko, R. Gomperts, B. Mennucci, H. P. Hratchian, J. V. Ortiz, A. F. Izmaylov, J. L. Sonnenberg,
D. Williams-Young, F. Ding, F. Lipparini, F. Egidi, J. Goings, B. Peng, A. Petrone, T. Henderson, D. Ranasinghe, V. G. Zakrzewski, J. Gao, N. Rega,
G. Zheng, W. Liang, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T. Vreven,
K. Throssell, J. A. Montgomery Jr., J. E. Peralta, F. Ogliaro, M. J. Bearpark, J. J. Heyd, E. N. Brothers, K. N. Kudin, V. N. Staroverov, T. A. Keith,
R. Kobayashi, J. Normand, K. Raghavachari, A. P. Rendell, J. C. Burant, S. S. Iyengar, J. Tomasi, M. Cossi, J. M. Millam, M. Klene, C. Adamo, R. Cammi,
J. W. Ochterski, R. L. Martin, K. Morokuma, O. Farkas, J. B. Foresman, D. J. Fox, Gaussian16 Revision C.01, Gaussian Inc, Wallingford, CT 2016.
[18] M. W. Schmidt, K. K. Baldridge, J. A. Boatz, S. T. Elbert, M. S. Gordon, J. H. Jensen, S. Koseki, N. Matsunaga, K. A. Nguyen, S. Su, et al., J. Comput.
Chem. 1993, 14, 1347.
[19] E. Vanquelef, S. Simon, G. Marquant, E. Garcia, G. Klimerak, J. C. Delepine, P. Cieplak, F.-Y. Dupradeau, Nucleic Acids Res. 2011, 39, W511.
[20] F.Y. Dupradeau, C. Cézard, R. Lelong, E. Stanislawiak, J. Pêcher, J. C. Delepine, P. Cieplak, Nucleic Acids Res. 2007, 36, D360.
[21] S. van der Walt, S. C. Colbert, G. Varoquaux, Comput. Sci. Eng. 2011, 13, 22.

[22] D. G. A. Smith, L. A. Burns, D. A. Sirianni, D. R. Nascimento, A. Kumar, A. M. James, J. B. Schriber, T. Zhang, B. Zhang, A. S. Abbott, et al., J. Chem. Theory
Comput. 2018, 14, 3504.
[23] R. M. Parrish, L. A. Burns, D. G. A. Smith, A. C. Simmonett, A. E. DePrince III., E. G. Hohenstein, U. Bozkaya, A. Y. Sokolov, R. Di Remigio,
R. M. Richard, et al., J. Chem. Theory Comput. 2017, 13, 3185.
[24] B. Leland, D. Paul, B. Krueger, R. Walker, Amber Advanced Tutorials Tutorial: Setting Up an Advanced System (Including Basic Charge Derivation), http://
ambermd.org/tutorials/advanced/tutorial1/section1.htm (accessed: May 2, 2019).
[25] F. Wang, J. P. Becker, C. Cezard, E. Vanquelef, P. Cieplak, F.Y. Dupradeau, Tutorials describing the use of the R.E.D. tools, the PyRED program, the R.E.DD.B.
database, R.E.D. server and R.E.D. server development, https://upjv.q4md-forcefieldtools.org/Tutorial/ (accessed: May 2, 2019).
[26] M. L. Connolly, J. Appl. Cryst. 1983, 16, 548.
[27] E. Heid, M. Fleck, P. Chatterjee, C. Schröder, A. D. MacKerell, J. Chem. Theory Comput. 2019, 15, 2460.

How to cite this article: Alenaizan A, Burns LA, Sherrill CD. Python implementation of the restrained electrostatic potential charge
model. Int J Quantum Chem. 2019;1–7. https://doi.org/10.1002/qua.26035
