
To pair or not to pair? Machine-learned explicitly-correlated electronic structure for NaCl in water

Niamh O’Neill,1,2,3 Benjamin X. Shi,1,3 Kara Fong,1,3 Angelos Michaelides,1,3,a) and Christoph Schran1,2,3,b)
1) Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
2) Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge, CB3 0HE, UK
3) Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge, CB2 1TN, UK
arXiv:2311.01527v1 [physics.chem-ph] 2 Nov 2023

The extent of ion pairing in solution is an important phenomenon to rationalise transport and thermodynamic
properties of electrolytes. A fundamental measure of this pairing is the potential of mean force (PMF)
between the solvated ions. The relative stabilities of the paired and solvent separated states in the PMF are
highly sensitive to the underlying potential energy surface. However, direct application of accurate electronic
structure methods to resolve this property is challenging, since long simulations are required. Leveraging
developments in machine learning potentials and electronic structure methods, we obtain wavefunction based
models with RPA and MP2 for the prototypical system of Na and Cl ions in water. We show that even among
these methods, discrepancies in the PMF still remain, and also highlight shortcomings of density functional
theory and classical force-field predictions. These models are primed for application to computationally
intensive electrolyte properties including transport coefficients and even confined systems, all of which are
highly sensitive to their chosen reference electronic structure method.
Keywords: Dissolution, Aqueous Phase, Molecular Simulations

a) Electronic mail: am452@cam.ac.uk
b) Electronic mail: cs2121@cam.ac.uk

Understanding the nature of ion pairing and the solvation structure of electrolyte solutions is a fundamental challenge in the quest to design efficient next generation energy storage devices.1 For example, ionic conductivity and redox stability are predominantly influenced by the electrolyte solvation structure.2,3 Moreover, an understanding of the solvation behaviour of ions is crucial to ensure a uniform and stable solid-liquid interphase to limit dendrite formation, which currently presents a significant challenge with respect to efficiency and safety of electrochemical devices.4 More generally for transport properties, Peng et al. showed that the diffusion of sodium ions at interfaces is significantly affected by their hydration number.5

The potential of mean force between an ion pair in solution provides a direct window into the ion pairing behaviour and solution structure of electrolytes. However, for the prototypical electrolyte solution of NaCl in water, no experimental benchmark exists. Moreover, from a computational modelling perspective, it is highly sensitive to the underlying potential energy surface6,7 and there is currently no quantitative or qualitative consensus on the PMF of NaCl in water. For example, two of the best-established classical force-fields for NaCl in water, the Joung-Cheatham (JC)8 and Smith-Dang (SD)9 models, disagree significantly in their PMFs, with the JC model predicting thermodynamically unfavourable ion pairing.10,11 In general, electrolytes are very challenging to model: on top of the already arduous situation for bulk water,12 they have the additional complication of ion-water and ion-ion interactions that need to be accurately described.13 Therefore, it is clear that to rationalise the PMF, it is imperative to first have a model that accurately describes the structure of NaCl in water.

In fact, to faithfully capture many dynamical and structural properties of electrolytes, it is generally accepted that a model that explicitly treats their electronic structure is mandatory.14,15 Until now, the workhorse method of computational chemistry, density functional theory (DFT), has been the method of choice for studying these systems, offering an acceptable balance of accuracy and computational overhead. While the solvation structure of water around ions has been the subject of numerous previous ab initio simulation studies,16–21 there is currently no consensus on a sufficiently accurate model to describe NaCl in water. In fact, both Duignan et al. and Panagiotopoulos et al. have recently highlighted the urgent need for accurate ab initio models for electrolytes to capture their dynamics15 and collective properties.6

For example, tried and tested exchange-correlation (XC) functionals for liquid water, such as the generalised gradient approximation (GGA) revPBE-D3, are not guaranteed to perform well when ions are added.18,19,22 This is because standard GGAs tend to perform poorly for interactions involving charges, overestimating electrostatic contributions to binding energies23 along with their well-known delocalisation error,24,25 both arising from the self-interaction error. Moreover, the interplay of the electronic structure method and nuclear quantum effects has been shown to be highly sensitive for liquid water,26–28 but their impact on electrolyte solutions has not been exhaustively explored so far.29 Inclusion of more complex
ingredients into the density functional approximation following Perdew’s Jacob’s Ladder30 may improve the description of water-water31,32 or ion-water33 interactions; however, a model that faithfully captures the behaviour of ions in solution for the right reasons remains elusive.

Going beyond DFT, correlated wavefunction-based methods such as the Random Phase Approximation (RPA) and second-order Møller-Plesset Perturbation Theory (MP2) are expected to perform well for electrolyte systems. These methods naturally incorporate van der Waals interactions and do not suffer from delocalisation error.34 They have shown initial promise for liquid water, accurately predicting the correct relative densities for ice and water, which are governed sensitively by a balance of van der Waals and hydrogen bonding interactions.35–37 While Duignan et al. hint that RPA could outperform lower rungs of Jacob’s Ladder for the case of the solvated sodium ion,19 routine application of these high-level methods in condensed phase simulations has been sporadic. This is primarily because of their high computational cost, with canonical scaling behaviour between $\mathcal{O}(N^4)$ and $\mathcal{O}(N^5)$, although reduced-scaling variants also exist.38 Meanwhile, to obtain statistically converged properties to probe the structure of electrolyte solutions, including radial distribution functions, densities and solvation free energies, even DFT becomes extremely computationally challenging. While valuable developments in electronic structure code algorithms and computer hardware36,39 have significantly increased the accessibility of these methods, it would be highly desirable to confine them to a small number of single-point energy (and force) calculations rather than finite temperature simulations, which typically require hundreds of thousands of such computations.

Fortunately, machine learning potentials (MLPs) provide a gateway to perform simulations at ab initio levels of theory with a decrease of several orders of magnitude in their computational cost.40,41 Machine learning models have been successfully trained and applied to study bulk water at various levels of theory,42–46 including recent neural network models for bulk water with MP2 (Refs. 47 and 48) and RPA (Ref. 49). While the natural next step is to add ions to the water, this brings additional complexity to the configuration space to be explored and so the training set must be judiciously chosen to reflect this. To this end we use a previously developed automated active learning framework50 and an initial training set describing NaCl dissolution in water51 to generate MLPs at various levels of electronic structure theory for Na and Cl ions in water.

The forthcoming discussion will simply refer to all of these models by the name of the reference method on which they have been trained. We believe this is valid since we have rigorously benchmarked the capability of the MLPs to reproduce their underlying reference method (see SI). We first explore the capabilities of different DFT XC functionals and correlated wavefunction methods to accurately describe the structure of ions in water, showing that both RPA and MP2 generally reproduce experimental densities and radial distribution functions for both bulk water and solvated ions. We then use these wavefunction-based methods as a benchmark against which to compare the predictions of DFT and classical force-fields for the potential of mean force of a NaCl ion pair in water. In all cases, the simulations performed are well-converged with respect to statistical sampling and are not impacted by finite-size effects, highlighting a major advantage of the MLP approach over ab initio methods. We finish with an outlook on the potential of these models for describing more complex situations such as confined electrolytes and for computing dynamical properties.

I. STRUCTURE OF IONS AND WATER

In Figure 1, we compare with experiment the performance of various DFT XC functionals, along with the two wavefunction-based methods MP2 and RPA, in predicting the radial distribution functions (RDFs) of both bulk water and Na and Cl ions in water. The DFT functionals have been chosen to span the various levels of Jacob’s Ladder, with each level incorporating increased complexity into the XC functional description. Specifically we have chosen the GGA revPBE-D3,55,56 van der Waals inclusive optB88-vdW,57 meta-GGA r2SCAN (Ref. 58) and hybrid revPBE0-D3.56 Beyond DFT, we consider RPA and MP2, with both classical and quantum nuclei. Inclusion of nuclear quantum effects (NQEs) has been shown to be necessary to obtain accurate agreement with experiment for liquid water structural and dynamical properties, including RDFs, diffusion coefficients and spectroscopy.26,31

We first consider the performance of the RPA and MP2 wavefunction-based methods. Overall, out of all methods tested, RPA performs best in describing the structure of both bulk water and Cl and Na ions in water (Figure 1 (a), (d) and (f) respectively), accurately reproducing experimental RDFs. For bulk water, inclusion of nuclear quantum effects (a) for RPA reduces the height of the first peak of the O-O RDF compared to classical nuclei (b), where the first peak height is overestimated by approximately 18%. This yields excellent agreement with experiment, with just a slight overstructuring of the first peak, consistent with previous literature.49 Meanwhile MP2, even with quantum nuclei, shows some deficiencies for bulk water, predicting a more structured liquid by overestimating the height of the first peak by approximately 11%, underestimating the first minimum also by approximately 11% and predicting greater long-range order than experiment. The poorer performance of MP2 for bulk water is consistent with previous literature. Lan et al. showed that MP2 predicts overstructured water and a lower diffusion coefficient compared to experiment,47 attributing some of the shortcomings in that study to an incomplete basis set, while Willow et al. in Ref. 59 also show that MP2 predicts denser water than experiment and DFT at ambient conditions.
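As a rough illustration of the quantity compared throughout this section, the sketch below estimates an RDF from stored trajectory frames by histogramming pair distances under the minimum image convention. It is a minimal sketch and not the analysis code used in this work: the function name `radial_distribution`, its arguments and the assumption of a cubic periodic box are all illustrative choices.

```python
import numpy as np


def radial_distribution(pos_a, pos_b, box_length, r_max, n_bins=100,
                        same_species=False):
    """Estimate g(r) between two groups of atoms in a cubic periodic box.

    pos_a, pos_b : arrays of shape (n_frames, n_atoms, 3), same length units
                   as box_length (hypothetical input layout).
    same_species : set True when pos_a and pos_b are the same atoms
                   (e.g. O-O), so that self-distances are excluded.
    """
    if r_max > box_length / 2:
        raise ValueError("r_max must not exceed half the box length")

    edges = np.linspace(0.0, r_max, n_bins + 1)
    counts = np.zeros(n_bins)
    n_frames, n_a, _ = pos_a.shape
    n_b = pos_b.shape[1]

    for frame_a, frame_b in zip(pos_a, pos_b):
        # All pair displacement vectors, wrapped to the nearest periodic image.
        delta = frame_a[:, None, :] - frame_b[None, :, :]
        delta -= box_length * np.round(delta / box_length)
        dist = np.linalg.norm(delta, axis=-1)
        if same_species:
            dist = dist[~np.eye(n_a, dtype=bool)]  # drop i == j pairs
        counts += np.histogram(dist, bins=edges)[0]

    # Expected pair counts per frame for an ideal gas at the same density.
    shell_volumes = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    n_pairs = n_a * (n_b - 1) if same_species else n_a * n_b
    ideal = n_pairs * shell_volumes / box_length ** 3

    r = 0.5 * (edges[1:] + edges[:-1])
    return r, counts / (n_frames * ideal)
```

With this normalisation an uncorrelated (ideal gas) configuration gives g(r) close to 1 at all distances, so peak heights above 1, such as the first O-O peak discussed above, directly measure local structuring.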

[Figure 1: radial distribution functions versus distance (Å). Columns: bulk water (O-O) and solvated ions (Cl-O, Na-O). Rows: quantum nuclei, beyond-DFT (panel a: NQE MP2, NQE RPA); classical nuclei, beyond-DFT (panels b, d, f: MP2, RPA); classical nuclei, DFT (panels c, e, g: r2SCAN, optB88-vdW, revPBE-D3, revPBE0-D3).]

FIG. 1. Comparison of radial distribution functions with experiment for correlated wavefunction methods and ascending rungs
of Jacob’s Ladder of DFT XC functionals. The first column shows the O-O RDF for bulk water, considering nuclear quantum
effects with RPA and MP2 (a), classical nuclei for RPA and MP2 (b) and classical nuclei for DFT (c). Columns 2 and 3 show
the O-Cl and O-Na RDFs respectively of a solvated NaCl ion pair in water for RPA and MP2 with classical nuclei (d and f) and
DFT with classical nuclei (e and g). Experimental references in each case are shaded in grey. The bulk water O-O experimental
RDF is taken from Ref. 52. Cl-O is neutron scattering data from Ref. 53 for KCl. The Na-O reference from Ref. 18 is the
rescaled peak from X-ray diffraction data.

Upon comparing correlated wavefunction simulations with classical and quantum nuclei (see SI), it seems that the ion-water structure is more forgiving with respect to NQEs than bulk water for both RPA and MP2, with no significant improvement to the structural description when NQEs are included. Both RPA and MP2 with classical nuclei are in excellent agreement with experiment for the Na-O and Cl-O RDFs, accurately reproducing the position and height of the first peak. Simulations that include NQEs result in essentially identical RDFs for these cases, as shown in the SI. It should be noted that only the first peak of the Na-O RDF was quoted from experimental XRD measurements.18 Also, the experimental Cl-O RDF is only available from a KCl solution.53 While there are discrepancies beyond the first peak, the cation should not influence the first solvation peak in the dilute limit.

The challenges of accurately modelling bulk water and ions in solution for DFT are apparent when comparing the different DFT XC functional predictions for the RDFs with experiment. None of the functionals we tested can simultaneously describe water and ions with the same accuracy as the best performing correlated wavefunction method, RPA. As has been previously observed, revPBE-D3 performs very well for liquid water;26,60 however, it has been shown that this is in part due to a fortuitous cancellation of errors, which breaks down upon inclusion of NQEs.26 Moreover, while it also performs well for the Cl-O RDF, it significantly underestimates the first peak of the Na-O RDF. Inclusion of a fraction of exact exchange via the hybrid revPBE0-D3 shows similarly good performance for bulk water and Cl-O, but does not improve the performance for Na-O. In contrast, r2SCAN significantly overstructures bulk water, predicting a first RDF peak for O-O approximately 20% greater than experiment, yet shows good agreement with experiment for both ion types in water. Various density corrections to the r2SCAN functional have been shown to improve its description of liquid water,61 which warrants further tests of their suitability for aqueous electrolytes. Similar to r2SCAN, the van der Waals inclusive optB88-vdW also significantly overstructures bulk water, but accurately predicts the ion-water RDF for both Na-O and Cl-O. The poor performance of both revPBE-D3 and revPBE0-D3 for Na-O can potentially be ascribed to the fact that the D3 correction does not account for changes in dispersion due to charge-transfer effects (i.e., the formation of ions). Compared to anions, cations have been shown to have a significantly different polarizability (i.e. dispersion) than their neutral atom counterparts,62 thus explaining the greater impact on sodium. More sophisticated treatment of these cases, such as the D4 correction,63 methods incorporating iterative Hirshfeld partitioning64 and van der Waals inclusive methods (as shown here with optB88-vdW), may alleviate this problem. It has also been suggested that simply neglecting the D3 correction for the problematic cation interactions can improve agreement with experiment.65

[Figure 2: (a) density (g/cm3) versus NaCl concentration (mol/kg) for MP2, RPA and experiment; (b) density (g/cm3) of a 2 M NaCl solution for MP2, RPA, r2SCAN, optB88-vdW, revPBE-D3 and revPBE0-D3, with 5% and 10% bands around experiment.]

FIG. 2. Comparison of densities with experiment54 for different DFT XC functionals and beyond-DFT methods. Panel (a) shows the NaCl concentration-dependent density for RPA and MP2 at 300 K. Panel (b) compares the predicted densities with experiment at 300 K for a 2 M NaCl solution. In both cases, 5% and 10% boundaries around experiment are shaded in dark and light grey respectively to facilitate comparisons.

Accurately predicting the density is another important benchmark of the ML models and their underlying electronic structure reference method. Figure 2 (a) compares the concentration-dependent density predictions of RPA and MP2 with experiment. This is a stern test of both the electronic structure method and the MLP quality, covering a significant concentration range. Both MP2 and RPA show a similar qualitative increase in density with concentration of NaCl as observed in experiment. MP2 is within 1% of the experimental values, while RPA more significantly overestimates the density, by approximately 5% at lower concentrations and 10% at higher concentrations. Both RPA and MP2 MLP predictions for the bulk water density are also in close agreement with previous ab initio predictions of 0.994 g/cm3 (Ref. 35) and 1.020 g/cm3 (Ref. 37) for RPA and MP2 respectively, and are within 5% of the experimental value. Figure 2 (b) compares the DFT and the wavefunction MLP predictions of the density of a 2 M NaCl solution with the experimental value. There is again a wide spread in DFT predictions, with all functionals tested overestimating the density of the 2 M NaCl solution apart from revPBE0-D3, which underestimates the density by 5% with respect to experiment. In comparison, MP2 and RPA slightly overestimate the density. It should also be noted that while classical force-field models can accurately reproduce experimental densities, particularly in low concentration regimes,66 their agreement arises because the density is a property to which the water model has been explicitly fitted.67 At higher salt concentrations, away from the regions explicitly used in the parameterisation, force-field predictions also deviate from experiment.66

To summarise, although RPA gives excellent structural properties for NaCl in water, it is slightly outperformed by MP2 in terms of the density response with increasing NaCl concentration. However, MP2 yields poorer agreement with experiment for structural properties than RPA. Nevertheless, the commendable all-round performance of RPA and MP2 for all cases of ions and water suggests that they are reliable methods with which to study NaCl in water. Of course the enhanced computational cost for initial training set computations for both RPA and MP2 over DFT should be mentioned, however
this is a one-off investment during training. Once a model is obtained, it can then be applied in simulations with a reduction of several orders of magnitude in computational cost, where the saving is greater for models based on more expensive methods.

II. PMF FOR IONS IN WATER

Going beyond experimentally accessible properties, the potential of mean force is a fundamental property of the electrolyte governing the extent of ion pairing in solution and capturing collective solvent motion. The key features of this quantity for Na and Cl in water are two local minima corresponding to the so-called contact ion pair (CIP) and solvent-separated ion pair (SSIP), with free energies $U_{\mathrm{CIP}}$ and $U_{\mathrm{SSIP}}$ respectively. These are separated by a barrier along the inter-ionic separation coordinate as illustrated in Figure 3 (a), where $U_b$ is defined as $U_{\mathrm{TS}} - U_{\mathrm{CIP}}$. The relative stabilities of these minima ($\Delta U_{\mathrm{SSIP-CIP}} = U_{\mathrm{SSIP}} - U_{\mathrm{CIP}}$), along with the barrier height ($U_b$), are crucial to understanding the kinetics of ion-pair association and dissociation.69 However, as previously mentioned, the PMF is highly sensitive to its underlying potential energy surface.6,7 Among both DFT and classical models, there is a major unresolved question regarding the relative stabilities of the CIP and SSIP,10,68,70,71 with some classical models predicting a more stable CIP by up to 4 kcal/mol compared to DFT, which generally predicts almost degenerate CIP and SSIP states. It should also be noted that the error bars on the DFT predictions are typically much greater, due to the significant computational effort required to converge the PMF. Despite commendable efforts to resolve the PMF using ab initio methods,7,68,70,72 it is computationally intensive to statistically converge, requiring long simulation times (∼ 2-3 ns) and several replicates to obtain statistical error bars. Therefore, in the absence of experimental benchmarks, a reference PMF based on a high-level electronic structure method is particularly valuable to the community. It can provide clear atomistic insight to explain experimental observations of ion pairing, and allow comparison among theoretical models, both ab initio and classical.

The MLPs developed in this work are ideal for efficient and statistically converged calculation of the NaCl PMF at a given level of theory. We compute the PMF via the Na-Cl RDF of one ion pair in a box of 62 waters. This highlights another advantage of the MLP approach, where the potential energy surface along the inter-ionic separation can be sufficiently sampled through standard molecular dynamics simulations, without the requirement for free energy-based methods such as thermodynamic integration (we show in the SI the equivalence of both methods). Overall, over 200 ns of MD simulations were performed, which is significantly beyond the capabilities of ab initio methods.

Figure 3 compares the PMF predictions for an NaCl ion pair in 62 waters for MLPs at the same levels of electronic structure theory described in Section I, as well as some commonly used classical force fields. Statistical convergence and finite size effects have been carefully investigated and details are given in the SI. The key features of the PMF described above are also summarised in Table I. We first note that even among the correlated wavefunction methods, there are still discrepancies in their PMF predictions. RPA predicts a slightly more stable SSIP state than CIP, which is the opposite of MP2, which predicts a greater stability of the CIP over the SSIP. Moreover, MP2 predicts a slightly larger barrier than RPA to go from the CIP to the SSIP state. It has previously been shown that MP2 tends to overbind,73 while RPA underbinds interatomic interactions,74 which seems to be consistent with the PMFs shown here. These results therefore highlight the remaining deficiencies of RPA and MP2. It should be noted that RPA outperforms MP2 with respect to all RDFs as shown in Figure 1, while at low concentrations, RPA and MP2 show comparable performance in reproducing experimental densities. Given the close relationship between the PMF and RDF, along with the low concentration regimes considered in this work, the RPA prediction could be considered more reliable than MP2. Additionally, among all of the ‘first-principles’ methods tested in this work, MP2 is the only method to predict a more stable CIP over SSIP. However, while these are the highest level methods applied to compute the PMF to date, a reference method such as CCSD(T) or Diffusion Monte Carlo (DMC) would be the next step to fully resolve all discrepancies in structural properties and density.

Similar to the RDFs and densities shown in Section I, DFT predictions also vary significantly (although not to the same extent as classical force-fields), and reinforce the considerations for selecting a suitable functional to study electrolyte systems. Nevertheless, they all predict a more stable SSIP state over the CIP, albeit to varying degrees, with r2SCAN and revPBE-D3 showing almost degeneracy, while revPBE0-D3 and optB88-vdW predict greater stability by approximately 0.4 kcal/mol. The free energy barrier between CIP and SSIP for both r2SCAN and optB88-vdW is similar at approximately 1.2 kcal/mol. Meanwhile, although revPBE-D3 and revPBE0-D3 predict similar barrier heights for the CIP to SSIP transition, revPBE-D3 shows a shift to larger distances for the locations of the CIP, SSIP and transition state. Given the wide-ranging performances of the DFT methods tested here with respect to various aspects of the electrolyte structure (shown in Section I), it is perhaps unsurprising that they also struggle to concur on the PMF, a property depending sensitively on the collective interactions of sodium and chlorine with the surrounding water.

The largest discrepancies from all the classes of methods tested are in the classical force-field predictions. The Smith-Dang/SPCE model9 is very similar to the MP2 prediction, with the position of the

[Figure 3: free energy (kcal/mol) versus distance (Å) in three panels: Wavefunction (RPA, MP2), DFT (r2SCAN, optB88-vdW, revPBE-D3, revPBE0-D3) and Force-field (SD-SPCE, JC-SPCE, HMN-SPCE). The CIP and SSIP minima, the transition state (TS) and the barrier Ub are marked.]

FIG. 3. Potential of mean force for the NaCl ion pair in water with associated statistical error bars for various MLPs and force-fields: (a) wavefunction methods RPA and MP2, (b) DFT, (c) classical force-fields. HMN-SPCE is taken from Ref. 68 and JC-SPCE from Ref. 10.

Method        CIP (Å)   TS (Å)   SSIP (Å)   $U_b$ (kcal/mol)   $\Delta U_{\mathrm{SSIP-CIP}}$ (kcal/mol)
RPA           2.82      3.55     4.76       1.33               -0.52
MP2           2.80      3.67     4.70       1.67                0.25
revPBE-D3     3.04      3.67     4.93       0.73               -0.10
r2SCAN        2.74      3.61     4.74       1.29               -0.12
revPBE0-D3    2.91      3.74     4.93       0.75               -0.37
optB88-vdW    2.85      3.55     4.74       1.16               -0.42
SD-SPCE       2.89      3.73     4.98       2.32                0.55
JC-SPCE       2.89      3.51     5.04       0.85               -1.71
HMN-SPCE      2.56      3.60     4.94       4.27                2.26

TABLE I. Summary of key features of the PMF for one NaCl ion pair in a box of 62 waters. The first three columns show the positions of the contact ion pair (CIP), transition state (TS) and solvent-separated ion pair (SSIP), respectively. The next columns give the barrier height $U_b$, defined as the difference between the TS and CIP free energies, and $\Delta U_{\mathrm{SSIP-CIP}}$, the free energy of the SSIP state relative to the CIP state.
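The quantities tabulated above can be read off a PMF curve in a few lines. The sketch below is illustrative only and is not the authors' analysis code: it converts a Na-Cl RDF into a PMF using the relation F(r) = −kT ln g(r) given in the Methods, then takes the first minimum, the following maximum and the next minimum as the CIP, TS and SSIP. The function names, the dictionary keys and the simple neighbour-comparison search are assumptions; a noisy RDF from a real simulation would likely need smoothing before such a search.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def pmf_from_rdf(g, temperature=300.0):
    """Return F(r) = -kT ln g(r) in kcal/mol; empty bins (g = 0) map to +inf."""
    g = np.asarray(g, dtype=float)
    with np.errstate(divide="ignore"):
        return -K_B * temperature * np.log(g)


def pmf_features(r, pmf):
    """Locate CIP, TS and SSIP and return positions, barrier and relative stability.

    Assumes a smooth PMF sampled on an increasing grid r that contains
    both minima and the barrier between them.
    """
    r = np.asarray(r, dtype=float)
    pmf = np.asarray(pmf, dtype=float)
    idx = np.arange(1, len(pmf) - 1)
    minima = idx[(pmf[idx] < pmf[idx - 1]) & (pmf[idx] < pmf[idx + 1])]
    maxima = idx[(pmf[idx] > pmf[idx - 1]) & (pmf[idx] > pmf[idx + 1])]

    cip = minima[0]                  # first local minimum
    ts = maxima[maxima > cip][0]     # first barrier beyond the CIP
    ssip = minima[minima > ts][0]    # next local minimum beyond the barrier

    return {
        "r_cip": r[cip],
        "r_ts": r[ts],
        "r_ssip": r[ssip],
        "barrier": pmf[ts] - pmf[cip],           # U_b = U_TS - U_CIP
        "delta_ssip_cip": pmf[ssip] - pmf[cip],  # U_SSIP - U_CIP
    }
```

With this sign convention a negative `delta_ssip_cip` means the SSIP is the more stable state, as for RPA in Table I, and a positive value means the CIP is more stable, as for MP2.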

SSIP slightly shifted to larger distances. However, the Joung-Cheatham/SPCE and HMN/SPCE models differ significantly from all the other tested methods. JC predicts a more stable SSIP by almost 2 kcal/mol, while HMN/SPCE predicts a more stable CIP by over 2 kcal/mol, and a barrier of over 4 kcal/mol between CIP and SSIP. It should be noted that in this work, we restrict our exploration of the free energy surface to distances of up to 6 Å, in order to resolve the literature debate on the relative stabilities of the CIP and SSIP and the associated transition barrier. However, the efficiency of MLPs means that exploring the dilute limit with even larger simulation cells is now feasible, even with correlated wavefunction methods.

III. DISCUSSION AND CONCLUSION

The machine learning approach used in this work offers a high level of accuracy and efficiency for studying electrolyte solutions, yielding high-quality, well-converged structural properties. Importantly, we compute the disputed potential of mean force of the NaCl ion pair in water using correlated wavefunction methods. We find that correlated wavefunction methods are in good agreement for the description of structural properties and densities, while some smaller differences remain for the relative stability of the contact and solvent-separated ion pairs. DFT can reproduce these results with larger variations depending on the chosen functional, and we do not identify a functional that delivers convincing performance throughout the chosen properties. Compared to
the scale of force-field predictions, in particular for ion pairing, these differences remain small and candidates such as revPBE-D3 or r2SCAN are expected to deliver a good compromise when simulating extended system sizes.

Going forward, the remaining differences among DFT XC functionals and even the discrepancies that remain among the wavefunction-based methods point towards a need for an unambiguous benchmark quality model for these systems. Given the recent successes by Chen et al.75 and Daru et al.76 in performing molecular dynamics for bulk water based on CCSD(T), considered the ‘gold standard’ of quantum chemistry, there is an exciting opportunity to build on this work for the case of ions in water. Their machine learning models accurately predict structural and dynamical properties of liquid water with respect to experiment. Using these approaches to train a model for water including ions is therefore a promising prospect to obtain a high quality potential energy surface for this system that can be used to further benchmark the lower level methods, and also to be directly used for simulations. From a more general perspective, our work highlights how ML potentials can now be used to efficiently screen various levels of theory directly on properties of interest, as has also been recently shown for the case of perovskite phase transitions.77

Beyond simply comparing levels of theory, the PMFs and models generated in this work can next be used to provide valuable insights into experimentally and computationally measured dynamical properties of electrolytes. There has been significant work using ab initio data to parameterise continuum scale models of electrolytes to access quantities such as osmotic coefficients and activities, along with transport properties such as diffusion coefficients and conductivity.6 This work shows that using MLPs offers an alternative yet complementary approach in the quest to obtain a fully ab initio description of electrolytes. From a fundamental perspective, building on the work of Geissler et al., a high level reference model is imperative to establish a quantitative understanding of the kinetics of ion pair dissociation.69 Similarly, a reliable model capable of accurately describing the bulk structure of electrolytes is primed to target dynamical properties of interest, including activity and osmotic coefficients, along with transport properties such as diffusion coefficients and conductivity. Relating the PMF to Onsager transport coefficients, which quantify correlations in ion motion,78 would be highly insightful to rationalise various transport phenomena. In particular, the ionic conductivity, as a measure of the quantity of current an electrolyte can transport, is vital for the development of next generation energy storage devices.79 One important consideration for these problems is the influence of long-range effects, given the large simulation cells required to compute these properties and the long-range electrostatic interactions between ions.

Beyond the bulk, confinement of electrolytes leads to interesting physics and unexpected phenomena that are highly relevant to a range of applications, including blue energy harvesting80 and desalination.81,82 In particular, extension of our current models to explore the intriguing phenomenon of confinement-induced ion pairing is very attractive. Recent work has shown assembly of confined ions in solution into long chains under an electric field, resulting in the so-called ‘memristor’ effect,83 for which accurate atomistic insights are urgently needed.

IV. METHODS

Machine learning potential: MLPs provide a direct mapping between a structure and its energy and forces, bypassing the computationally costly step of solving the Schrödinger equation for each timestep of a simulation. All of the models in this work are based on the seminal work of Behler and Parrinello,84 where we train a committee of eight neural network models45 for Na and Cl ions in water on forces and energies computed at various levels of electronic structure theory. The model is systematically trained over two generations. The first comprises a general training set common to all models obtained from previous work51 for NaCl ions in water. We then use an active learning procedure50 screening different solution concentrations to augment the training set for each model, ensuring that relevant configurations for a given level of theory are included, to give the second generation model used in production simulations. Further technical details of the models’ training and validation are given in the SI.

Molecular dynamics simulations: All classical simulations were carried out using the CP2K/Quickstep code at a constant temperature of 300 K maintained using the CSVR thermostat85 and with a 1 fs timestep.

PMF: The PMF calculations were carried out with a 0.9 mol/kg NaCl solution comprising 1 NaCl ion pair in 62 waters. This was first equilibrated in the NpT ensemble to obtain the equilibrium density for a given level of theory. To ensure sufficient statistical sampling, ten uncorrelated configurations were then sampled from this trajectory and used as independent initial conditions. Production simulations were then performed for at least 2 ns in the NVT ensemble at the previously computed equilibrium density, after which the radial distribution functions were obtained. The potential of mean force between Na and Cl was computed using the relation $F(r) = -kT \ln g_{\mathrm{Na-Cl}}(r)$, where $k$ and $T$ are the Boltzmann constant and temperature respectively, and $g_{\mathrm{Na-Cl}}(r)$ is the radial distribution function between Na and Cl ions. The final result was given by the average of these 10 runs, with the standard deviation providing an error estimate. We show in the SI that this approach is fully consistent with performing constrained molecular dynamics with thermodynamic integration and does not suffer from finite size effects.

RDFs: The RDF calculations for bulk water were obtained from simulations of 126 water molecules in the NVT ensemble at the experimental density. RDFs
for ions were obtained from simulations of a single ion pair and 95 water molecules. Path integral simulations for RPA and MP2 were performed using ring polymer molecular dynamics with 16 replicas, using the PILE thermostat [86].

Electronic structure: Electronic structure calculations were all carried out using the CP2K/Quickstep code. A plane-wave cutoff of 1200 Ry was required to converge the forces on the sodium atoms, and a TZ quality basis set was used for each model. Further details of the specific electronic structure settings for the various levels of theory, including convergence tests, are given in the SI.

DATA AVAILABILITY

All data required to reproduce the findings of this study will be made available upon publication of this study.

CODE AVAILABILITY

All simulations were performed with publicly available simulation software (n2p2, CP2K), while the active learning package is available at GitHub (https://github.com/MarsalekGroup/aml-dev).

ACKNOWLEDGMENTS

N.O.N. acknowledges financial support from the Gates Cambridge Trust. B.X.S. acknowledges support from the EPSRC Doctoral Training Partnership (EP/T517847/1). K.F. acknowledges financial support from Schmidt Science Fellows. A.M. acknowledges support from the European Union under the "n-AQUA" European Research Council project (Grant No. 101071937). C.S. acknowledges financial support from the Alexander von Humboldt Stiftung and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) project number 500244608. We are grateful for computational support and resources from the UK Materials and Molecular Modeling Hub, which is partially funded by EPSRC (Grant Nos. EP/P020194/1 and EP/T022213/1). We are also grateful for computational support and resources from the UK national high-performance computing service, Advanced Research Computing High End Resource (ARCHER2). Access to both the UK Materials and Molecular Modeling Hub and ARCHER2 was obtained via the UK Car-Parrinello consortium, funded by EPSRC grant reference EP/P022561/1.

REFERENCES

[1] J. Holoubek, K. Kim, Y. Yin, Z. Wu, H. Liu, M. Li, A. Chen, H. Gao, G. Cai, T. A. Pascal, P. Liu, and Z. Chen, Electrolyte design implications of ion-pairing in low-temperature Li metal batteries, Energy & Environmental Science 15, 1647 (2022).
[2] N. Yao, X. Chen, Z. H. Fu, and Q. Zhang, Applying classical, ab initio, and machine-learning molecular dynamics simulations to the liquid electrolyte for rechargeable batteries, Chemical Reviews 122, 10970 (2022).
[3] S. Chen, M. Zhang, P. Zou, B. Sun, and S. Tao, Historical development and novel concepts on electrolytes for aqueous rechargeable batteries, Energy & Environmental Science 15, 1805 (2022).
[4] X. Q. Zhang, X. Chen, L. P. Hou, B. Q. Li, X. B. Cheng, J. Q. Huang, and Q. Zhang, Regulating anions in the solvation sheath of lithium ions for stable lithium metal batteries, ACS Energy Letters 4, 411 (2019).
[5] J. Peng, D. Cao, Z. He, J. Guo, P. Hapala, R. Ma, B. Cheng, J. Chen, W. J. Xie, X. Z. Li, P. Jelínek, L. M. Xu, Y. Q. Gao, E. G. Wang, and Y. Jiang, The effect of hydration number on the interfacial transport of sodium ions, Nature 557, 701 (2018).
[6] T. T. Duignan, M. D. Baer, and C. J. Mundy, Ions interacting in solution: Moving from intrinsic to collective properties (2016).
[7] M. D. Baer and C. J. Mundy, Local aqueous solvation structure around Ca2+ during Ca2+···Cl- pair formation, Journal of Physical Chemistry B 120, 1885 (2016).
[8] I. S. Joung and T. E. Cheatham, Determination of alkali and halide monovalent ion parameters for use in explicitly solvated biomolecular simulations, Journal of Physical Chemistry B 112, 9020 (2008).
[9] L. X. Dang and D. E. Smith, Molecular dynamics simulations of aqueous ionic clusters using polarizable water, The Journal of Chemical Physics 99, 6950 (1993).
[10] C. Zhang, F. Giberti, E. Sevgen, J. J. de Pablo, F. Gygi, and G. Galli, Dissociation of salts in water under pressure, Nature Communications 11, 1 (2020).
[11] D. E. Smith and L. X. Dang, Computer simulations of NaCl association in polarizable water, The Journal of Chemical Physics 100, 3757 (1994).
[12] M. J. Gillan, D. Alfè, and A. Michaelides, Perspective: How good is DFT for water?, Journal of Chemical Physics 144, 130901 (2016).
[13] T. T. Duignan, S. M. Kathmann, G. K. Schenter, and C. J. Mundy, Toward a first-principles framework for predicting collective properties of electrolytes, Accounts of Chemical Research 54, 2833 (2021).
[14] Y. Ding, A. A. Hassanali, and M. Parrinello, Anomalous water diffusion in salt solutions, Proceedings of the National Academy of Sciences of the United States of America 111, 3310 (2014).
[15] A. Z. Panagiotopoulos and S. Yue, Dynamics of aqueous electrolyte solutions: Challenges for simulations, Journal of Physical Chemistry B 127, 430 (2023).
[16] A. Bankura, V. Carnevale, and M. L. Klein, Hydration structure of Na+ and K+ from ab initio molecular dynamics based on modern density functional theory, Molecular Physics 112, 1448 (2014).
[17] A. P. Gaiduk and G. Galli, Local and global effects of dissolved sodium chloride on the structure of water, Journal of Physical Chemistry Letters 8, 1496 (2017).
[18] M. Galib, M. D. Baer, L. B. Skinner, C. J. Mundy, T. Huthwelker, G. K. Schenter, C. J. Benmore, N. Govind, and J. L. Fulton, Revisiting the hydration structure of aqueous Na+, Journal of Chemical Physics 146, 10.1063/1.4975608 (2017).
[19] T. T. Duignan, G. K. Schenter, J. L. Fulton, T. Huthwelker, M. Balasubramanian, M. Galib, M. D. Baer, J. Wilhelm, J. Hutter, M. Del Ben, X. S. Zhao, and C. J. Mundy, Quantifying the hydration structure of sodium and potassium ions: Taking additional steps on Jacob's ladder, Physical Chemistry Chemical Physics 22, 10641 (2020).
[20] V. Rozsa, T. A. Pham, and G. Galli, Molecular polarizabilities as fingerprints of perturbations to water by ions and confinement, Journal of Chemical Physics 152, 124501 (2020).
[21] M. Dellostritto, J. Xu, X. Wu, and M. L. Klein, Aqueous solvation of the chloride ion revisited with density functional theory: impact of correlation and exchange approximations, Physical Chemistry Chemical Physics 22, 10666 (2020).
[22] M. Galib, T. T. Duignan, Y. Misteli, M. D. Baer, G. K. Schenter, J. Hutter, and C. J. Mundy, Mass density fluctuations in quantum and classical descriptions of liquid water, Journal of Chemical Physics 146, 244501 (2017).
[23] A. Otero-De-La-Roza and E. R. Johnson, Analysis of density-functional errors for noncovalent interactions between charged molecules, Journal of Physical Chemistry A 124, 353 (2020).
[24] K. R. Bryenton, A. A. Adeleke, S. G. Dale, and E. R. Johnson, Delocalization error: The greatest outstanding challenge in density-functional theory, Wiley Interdisciplinary Reviews: Computational Molecular Science 13, e1631 (2023).
[25] J. Cheng, X. Liu, J. VandeVondele, M. Sulpizi, and M. Sprik, Redox potentials and acidity constants from density functional theory based molecular dynamics, Accounts of Chemical Research 47, 3522 (2014).
[26] O. Marsalek and T. E. Markland, Quantum dynamics and spectroscopy of ab initio liquid water: The interplay of nuclear and electronic quantum effects, Journal of Physical Chemistry Letters 8, 1545 (2017).
[27] L. R. Pestana, N. Mardirossian, M. Head-Gordon, and T. Head-Gordon, Ab initio molecular dynamics simulations of liquid water using high quality meta-GGA functionals, Chemical Science 8, 3554 (2017).
[28] M. Ceriotti, W. Fang, P. G. Kusalik, R. H. McKenzie, A. Michaelides, M. A. Morales, and T. E. Markland, Nuclear quantum effects in water and aqueous systems: Experiment, theory, and current challenges, Chemical Reviews 116, 7529 (2016).
[29] J. Xu, Z. Sun, C. Zhang, M. Dellostritto, D. Lu, M. L. Klein, and X. Wu, Importance of nuclear quantum effects on the hydration of chloride ion, Physical Review Materials 5, L012801 (2021).
[30] J. P. Perdew and K. Schmidt, Jacob's ladder of density functional approximations for the exchange-correlation energy (AIP Publishing, 2001) pp. 1–20.
[31] L. R. Pestana, O. Marsalek, T. E. Markland, and T. Head-Gordon, The quest for accurate liquid water properties from first principles, Journal of Physical Chemistry Letters 9, 5009 (2018).
[32] S. Dasgupta, C. Shahi, P. Bhetwal, J. P. Perdew, and F. Paesani, How good is the density-corrected SCAN functional for neutral and ionic aqueous systems, and what is so right about the Hartree-Fock density?, Journal of Chemical Theory and Computation 18, 4745 (2022).
[33] M. Riera, N. Mardirossian, P. Bajaj, A. W. Götz, and F. Paesani, Toward chemical accuracy in the description of ion-water interactions through many-body representations. Alkali-water dimer potential energy surfaces, Journal of Chemical Physics 147, 2698 (2017).
[34] Y. S. Al-Hamdani and A. Tkatchenko, Understanding non-covalent interactions in larger molecular complexes from first principles, Journal of Chemical Physics 150, 10901 (2019).
[35] M. Del Ben, J. Hutter, and J. VandeVondele, Probing the structural and dynamical properties of liquid water with models including non-local electron correlation, Journal of Chemical Physics 143, 54506 (2015).
[36] M. Del Ben, O. Schütt, T. Wentz, P. Messmer, J. Hutter, and J. VandeVondele, Enabling simulation at the fifth rung of DFT: Large scale RPA calculations with excellent time to solution, Computer Physics Communications 187, 120 (2015).
[37] M. Del Ben, M. Schönherr, J. Hutter, and J. VandeVondele, Bulk liquid water at ambient temperature and pressure from MP2 theory, Journal of Physical Chemistry Letters 4, 3753 (2013).
[38] J. Wilhelm, P. Seewald, M. Del Ben, and J. Hutter, Large-scale cubic-scaling random phase approximation correlation energy calculations using a Gaussian basis, Journal of Chemical Theory and Computation 12, 5851 (2016).
[39] M. Del Ben, J. Hutter, and J. VandeVondele, Forces and stress in second order Møller-Plesset perturbation theory for condensed phase systems within the resolution-of-identity Gaussian and plane waves approach, Journal of Chemical Physics 143, 102803 (2015).
[40] O. T. Unke, S. Chmiela, H. E. Sauceda, M. Gastegger, I. Poltavsky, K. T. Schütt, A. Tkatchenko, and K. R. Müller, Machine learning force fields (2021).
[41] J. Behler and G. Csányi, Machine learning potentials for extended systems: a perspective, European Physical Journal B 94, 1 (2021).
[42] T. Morawietz, A. Singraber, C. Dellago, and J. Behler, How van der Waals interactions determine the unique properties of water, Proceedings of the National Academy of Sciences of the United States of America 113, 8368 (2016).
[43] C. Zhang, F. Tang, M. Chen, J. Xu, L. Zhang, D. Y. Qiu, J. P. Perdew, M. L. Klein, and X. Wu, Modeling liquid water by climbing up Jacob's ladder in density functional theory facilitated by using deep neural network potentials, Journal of Physical Chemistry B 125, 11444 (2021).
[44] Y. Yao and Y. Kanai, Temperature dependence of nuclear quantum effects on liquid water via artificial neural network model based on SCAN meta-GGA functional, Journal of Chemical Physics 153, 44114 (2020).
[45] C. Schran, K. Brezina, and O. Marsalek, Committee neural network potentials control generalization errors and enable active learning, Journal of Chemical Physics 153, 104105 (2020).
[46] B. Cheng, E. A. Engel, J. Behler, C. Dellago, and M. Ceriotti, Ab initio thermodynamics of liquid and solid water, Proceedings of the National Academy of Sciences of the United States of America 116, 1110 (2019).
[47] J. Lan, D. M. Wilkins, V. V. Rybkin, M. Iannuzzi, and J. Hutter, Quantum dynamics of water from Møller-Plesset perturbation theory via a neural network potential, ChemRxiv 10.26434/chemrxiv-2021-n32q8-v2 (2021).
[48] J. Liu, J. Lan, and X. He, Toward high-level machine learning potential for water based on quantum fragmentation and neural networks, Journal of Physical Chemistry A 126, 3926 (2022).
[49] Y. Yao and Y. Kanai, Nuclear quantum effect and its temperature dependence in liquid water from random phase approximation via artificial neural network, Journal of Physical Chemistry Letters 12, 6354 (2021).
[50] C. Schran, F. L. Thiemann, P. Rowe, E. A. Müller, O. Marsalek, and A. Michaelides, Machine learning potentials for complex aqueous systems made simple, Proceedings of the National Academy of Sciences of the United States of America 118, e2110077118 (2021).
[51] N. O'Neill, C. Schran, S. J. Cox, and A. Michaelides, Crumbling crystals: On the dissolution mechanism of NaCl in water, arXiv 10.48550/arXiv.2211.04345 (2022).
[52] L. B. Skinner, C. Huang, D. Schlesinger, L. G. Pettersson, A. Nilsson, and C. J. Benmore, Benchmark oxygen-oxygen pair-distribution function of ambient water from x-ray diffraction measurements with a wide q-range, Journal of Chemical Physics 138, 074506 (2013).
[53] A. K. Soper and K. Weckström, Ion solvation and water structure in potassium halide aqueous solutions, Biophysical Chemistry 124, 180 (2006).
[54] K. S. Pitzer, J. C. Peiper, and R. H. Busey, Thermodynamic properties of aqueous sodium chloride solutions, Journal of Physical and Chemical Reference Data 13, 1 (1984).
[55] Y. Zhang and W. Yang, Comment on "Generalized gradient approximation made simple", Physical Review Letters 80, 890 (1998).
[56] S. Grimme, J. Antony, S. Ehrlich, and H. Krieg, A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu, The Journal of Chemical Physics 132, 154104 (2010).
[57] J. Klimeš, D. R. Bowler, and A. Michaelides, Chemical accuracy for the van der Waals density functional, Journal of Physics: Condensed Matter 22, 022201 (2009).
[58] J. W. Furness, A. D. Kaplan, J. Ning, J. P. Perdew, and J. Sun, Accurate and numerically efficient r2SCAN meta-generalized gradient approximation, Journal of Physical Chemistry Letters 11, 8208 (2020).
[59] S. Y. Willow, X. C. Zeng, S. S. Xantheas, K. S. Kim, and S. Hirata, Why is MP2-water "cooler" and "denser" than DFT-water?, Journal of Physical Chemistry Letters 7, 680 (2016).
[60] L. B. Skinner, M. Galib, J. L. Fulton, C. J. Mundy, J. B. Parise, V. T. Pham, G. K. Schenter, and C. J. Benmore, The structure of liquid water up to 360 MPa from x-ray diffraction measurements using a high q-range and from molecular simulation, Journal of Chemical Physics 144 (2016).
[61] S. Dasgupta, E. Lambros, J. P. Perdew, and F. Paesani, Elevating density functional theory to chemical accuracy for water simulations through a density-corrected many-body formalism, Nature Communications 12, 1 (2021).
[62] S. Ehrlich, J. Moellmann, W. Reckien, T. Bredow, and S. Grimme, System-dependent dispersion coefficients for the DFT-D3 treatment of adsorption processes on ionic surfaces, ChemPhysChem 12, 3414 (2011).
[63] E. Caldeweyher, J. M. Mewes, S. Ehlert, and S. Grimme, Extension and evaluation of the D4 London-dispersion model for periodic systems, Physical Chemistry Chemical Physics 22, 8499 (2020).
[64] T. Bučko, S. Lebègue, J. G. Ángyán, and J. Hafner, Extending the applicability of the Tkatchenko-Scheffler dispersion correction via iterative Hirshfeld partitioning, Journal of Chemical Physics 141, 34114 (2014).
[65] V. Kostal, P. E. Mason, H. Martinez-Seara, and P. Jungwirth, Common cations are not polarizable: Effects of dispersion correction on hydration structures from ab initio molecular dynamics, Journal of Physical Chemistry Letters 14, 4403 (2023).
[66] A. L. Benavides, J. L. Aragones, and C. Vega, Consensus on the solubility of NaCl in water from computer simulations using the chemical potential route, The Journal of Chemical Physics 144, 124504 (2016).
[67] H. J. Berendsen, J. R. Grigera, and T. P. Straatsma, The missing term in effective pair potentials, Journal of Physical Chemistry 91, 6269 (1987).
[68] Y. Yao and Y. Kanai, Free energy profile of NaCl in water: First-principles molecular dynamics with SCAN and ωB97X-V exchange-correlation functionals, Journal of Chemical Theory and Computation 14, 884 (2018).
[69] P. L. Geissler, C. Dellago, and D. Chandler, Kinetic pathways of ion pair dissociation in water, Journal of Physical Chemistry B 103, 3706 (1999).
[70] J. Timko, D. Bucher, and S. Kuyucak, Dissociation of NaCl in water from ab initio molecular dynamics simulations, Journal of Chemical Physics 132, 114510 (2010).
[71] A. R. Finney and M. Salvalaglio, Multiple pathways in NaCl homogeneous crystal nucleation, Faraday Discussions 235, 56 (2022).
[72] E. Pluhařová, O. Marsalek, B. Schmidt, and P. Jungwirth, Ab initio molecular dynamics approach to a quantitative description of ion pairing in water, Journal of Physical Chemistry Letters 4, 4177 (2013).
[73] K. E. Riley, J. A. Platts, J. Řezáč, P. Hobza, and J. G. Hill, Assessment of the performance of MP2 and MP2 variants for the treatment of noncovalent interactions, Journal of Physical Chemistry A 116, 4159 (2012).
[74] X. Ren, A. Tkatchenko, P. Rinke, and M. Scheffler, Beyond the random-phase approximation for the electron correlation energy: The importance of single excitations, Physical Review Letters 106, 153003 (2011).
[75] M. S. Chen, J. Lee, H. Z. Ye, T. C. Berkelbach, D. R. Reichman, and T. E. Markland, Data-efficient machine learning potentials from transfer learning of periodic correlated electronic structure methods: Liquid water at AFQMC, CCSD, and CCSD(T) accuracy, Journal of Chemical Theory and Computation 19, 4510 (2023).
[76] J. Daru, H. Forbert, J. Behler, and D. Marx, Coupled cluster molecular dynamics of condensed phase systems enabled by machine learning potentials: Liquid water benchmark, Physical Review Letters 129, 226001 (2022).
[77] E. Fransson, J. Wiktor, and P. Erhart, Phase transitions in inorganic halide perovskites from machine-learned potentials, Journal of Physical Chemistry C 127, 13773 (2023).
[78] K. D. Fong, J. Self, B. D. McCloskey, and K. A. Persson, Ion correlations and their impact on transport in polymer-based electrolytes, Macromolecules 54, 2575 (2021).
[79] D. Chao, W. Zhou, F. Xie, C. Ye, H. Li, M. Jaroniec, and S. Z. Qiao, Roadmap for advanced aqueous batteries: From design of materials to applications, Science Advances 6 (2020).
[80] A. Siria, M. L. Bocquet, and L. Bocquet, New avenues for the large-scale harvesting of blue energy, Nature Reviews Chemistry 1, 1 (2017).
[81] S. Marbach and L. Bocquet, Osmosis, from molecular insights to large-scale applications, Chemical Society Reviews 48, 3102 (2019).
[82] R. H. Tunuguntla, R. Y. Henley, Y. C. Yao, T. A. Pham, M. Wanunu, and A. Noy, Enhanced water permeability and tunable ion selectivity in subnanometer carbon nanotube porins, Science 357, 792 (2017).
[83] P. Robin, N. Kavokine, and L. Bocquet, Modeling of emergent memory and voltage spiking in ionic transport through angstrom-scale slits, Science 373, 687 (2021).
[84] J. Behler and M. Parrinello, Generalized neural-network representation of high-dimensional potential-energy surfaces, Physical Review Letters 98, 146401 (2007).
[85] G. Bussi, D. Donadio, and M. Parrinello, Canonical sampling through velocity rescaling, Journal of Chemical Physics 126, 14101 (2007).
[86] M. Ceriotti, M. Parrinello, T. E. Markland, and D. E. Manolopoulos, Efficient stochastic thermostatting of path integral molecular dynamics, Journal of Chemical Physics 133, 124104 (2010).
Supporting Information for: Crumbling Crystals: On the Dissolution Mechanism of
NaCl in Water

Niamh O’Neill,1 Benjamin X. Shi,1 Kara Fong,1 Angelos Michaelides,1, a) and Christoph
Schran1, b)
Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road,
Cambridge, CB2 1EW, UK

a) Electronic mail: am452@cam.ac.uk
b) Electronic mail: cs2121@cam.ac.uk

CONTENTS

S1. Electronic Structure
DFT settings
Correlated wave-function theory

S2. Development and validation of machine learning potential
Automated work flow
Details of Model
Validation

S3. Simulation Details
System setup
Convergence tests

S4. Additional Results

References

S1. ELECTRONIC STRUCTURE

DFT settings

All DFT calculations were performed in CP2K to generate both total energies and nuclear gradients [S1]. The electronic density was partitioned into core and valence contributions, with core electrons described using the norm-conserving Goedecker, Teter and Hutter (GTH) pseudopotentials [S2]. Na 2s and 2p electrons were also treated explicitly, given the well-known issue of non-linear core-valence exchange/correlation. Valence electrons were described using the MOLOPT TZV2P basis set [S3]. To cover the range of DFT exchange-correlation (XC) functional approximations, functionals were chosen from ascending rungs of Jacob's Ladder. In short, we used the generalised gradient approximation (GGA) revPBE [S4, S5]; the meta-GGA r2SCAN [S6], which improves on the numerical instabilities of SCAN [S7]; the van der Waals inclusive optB88-vdW [S8]; and the hybrid revPBE0 [S9]. revPBE and revPBE0 were used with the zero-damping variant of Grimme's D3 dispersion correction [S10]. The revPBE0 calculations were performed using the auxiliary density matrix method [S11] to reduce the cost of computing the exact exchange component.

Figure S1 shows the convergence of the forces on the four atom types with respect to the plane-wave cutoff. A large cutoff is required to converge the forces on Na. Using the Gaussian and Augmented Plane Wave (GAPW) method [S12] resolves this issue, with forces converging by 400 Ry. However, the GAPW method is not available for the wavefunction methods RPA and MP2. We therefore consider the effect of these stochastic forces on the sodium atoms with respect to the property of interest in this work: the potential of mean force of Na and Cl in water. Figure S2 shows the PMF for 3 ML models at the revPBE-D3 level of theory using the Gaussian and plane waves (GPW) method with increasing plane-wave cutoff. This property is well converged with respect to the plane-wave cutoff, with an acceptable error of approximately 0.1 kcal/mol. Additionally, Figure S3 shows that the target property, the PMF, is again not affected by using the GPW method rather than the GAPW method.

Correlated wave-function theory

Random-phase approximation (RPA) and second-order Møller-Plesset perturbation theory (MP2) calculations were performed in CP2K [S1] to generate both total energies and nuclear gradients. We used DFT with the PBE functional as the starting point for the RPA correlation energy calculations. The resolution-of-identity (RI) technique was used for these methods [S13]. We used triple-zeta (TZ) quality correlation-consistent basis sets for H and O (taken from CP2K) as well as for Na and Cl. Auxiliary basis sets for the RI integral operations were generated using the automatic auxiliary basis of Stoychev et al. [S14, S15] for Na and Cl, with the defaults (from CP2K) used for H and O. We used the GTH-HF pseudopotentials from Goedecker-Teter-Hutter [S2] for all of the atoms. Figure S4 shows the force convergence on the atoms with respect to the GPW integral cutoff. A cutoff of 300 Ry was shown to be well converged within 0.00001 meV/Å. The calculation is also sensitive to the number of quadrature points, and so 20 were used, which has an error of < 1 mHartree according to the literature [S16].

RPA quadrature points and relative cutoff

[Figure S1: four panels (Cl, Na, O, H) showing force (eV/Å) against plane-wave cutoff (Ry), from 400 to 2000 Ry, for the GPW and GAPW methods.]
FIG. S1. Convergence plots of forces on each atom type (Na, Cl, O, H) vs plane wave cutoff for GAPW (blue) and GPW (green) methods.

[Figure S2: free energy (kcal/mol) against distance (Å) for plane-wave cutoffs of 800, 1200 and 1400 Ry.]
FIG. S2. PMF from MLP @ revPBE-D3 for PW cutoffs at 800, 1200 and 1400 Ry.

[Figure S3: free energy (kcal/mol) against distance (Å) for the 1200 Ry GPW and 1200 Ry GAPW models.]
FIG. S3. PMF for revPBE-D3 MLP with 1200 Ry PW cutoff for both GPW and GAPW methods.

[Figure S4: four panels (Cl, Na, O, H) showing force (eV/Å) against integral plane-wave cutoff (Ry), from 100 to 500 Ry.]
FIG. S4. Convergence plots of forces on each atom type (Na, Cl, O, H) vs integral plane wave cutoff.

[Figure S5 schematic: a common set of 799 configurations (revPBE-D3 geometries, from Ref. S18), with panels labelled 1 NaCl/62 H2O and 2 NaCl/58 H2O, is used to train the Generation-01 models (RPA, MP2, revPBE0-D3, r2SCAN, optB88-vdW, revPBE-D3). NpT simulations with each model then yield the Generation-02 training sets of 1049, 963, 1047, 999, 1049 and 1049 configurations, listed in the same model order.]
FIG. S5. Schematic of training procedure for the 6 ML models in this work.

S2. DEVELOPMENT AND VALIDATION OF MACHINE LEARNING POTENTIAL

Automated work flow

The procedure for developing the committee neural network potentials (C-NNPs) used in this work follows Ref. S17. Overall, 6 models were trained to describe NaCl ions in water at different levels of electronic structure theory (revPBE-D3, optB88-vdW, r2SCAN, revPBE0-D3, MP2 and RPA). The training of the individual models was divided into two generations and is graphically depicted in Figure S5. The first generation comprised a common training set for all models, adapted from previous work [S18]. Specifically, only configurations corresponding to ions in solution were used, since the previous model also contained solid NaCl in water, which was used to explore dissolution. These configurations comprised increasing concentrations of NaCl ions in water (see Figure S5 for specifics). The forces and energies of this common training set were then computed for the different levels of theory used in this paper (see Section S1 for the electronic structure details), and individual models were then trained as described below. To ensure that the relevant configuration space for a given level of theory was suitably covered by the models, NpT simulations were then performed with each model for all of the solution concentrations used in Generation 1. An active learning procedure [S17] was then employed to select relevant configurations from each concentration to add to the model training set. For a given active learning iteration, 20 random structures from a reference trajectory were used to initialise the model. After training 8 NNP members, forces and energies of 2000 randomly selected structures from the reference trajectory were predicted to ascertain the force and energy committee disagreements. The 20 structures with the largest mean force disagreement were added to the training set for the next round of active learning. Convergence was reached when new structures added to the training set did not improve the committee disagreement between points already in the training set, indicating that the training set was sufficiently diverse. Overall, approximately 200 additional structures were added per model, leading to the training set of each model comprising ∼1000 structures (see Figure S5 for model-specific details).
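The selection step of one active learning iteration can be illustrated with a minimal sketch. It is not the authors' implementation (the actual workflow is the one of Ref. S17). The function names, the flat force layout, and the choice of a population standard deviation averaged over all force components as the "mean force disagreement" are assumptions made here for illustration.

```python
from statistics import mean, pstdev


def force_disagreement(member_forces):
    """Committee disagreement for one structure (assumed definition).

    member_forces: one flat list of force components per committee member.
    Returns the standard deviation across members, averaged over components.
    """
    return mean(pstdev(component) for component in zip(*member_forces))


def select_structures(candidates, n_select=20):
    """Return indices of the n_select candidates with the largest disagreement.

    candidates: list of structures, each given as member_forces (see above).
    """
    scores = [force_disagreement(forces) for forces in candidates]
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return ranked[:n_select]
```

With 8 committee members and 2000 candidate structures, as quoted above, `select_structures(candidates, n_select=20)` would return the 20 indices to send for reference calculations.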

Details of Model

The chemical environment around each atom was described using a general set of atom-centered symmetry functions [S19]. There are 10 radial and 4 angular functions for each pair and triple of atoms, respectively, following Ref. S17. All symmetry functions used a cutoff function of angular cosine form with a cutoff radius of 12 Bohr. The committee comprised 8 NNP members of identical architecture, with 2 hidden layers and 25 neurons in each layer. In all cases, random sub-sampling was performed to introduce variability between the committee members, where 10% of the total set of structures were discarded. The weights and biases of the NNPs were optimised using the n2p2 code [S20]. Individual models during active learning were optimised for 15 epochs, while the final C-NNP model used in simulations was optimised for 50 epochs.
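As an illustration of the descriptor, the sketch below evaluates a single radial symmetry function of the Behler-Parrinello type with a cosine cutoff. The functional forms $f_c(r) = \frac{1}{2}[\cos(\pi r / r_c) + 1]$ and $G = \sum_j e^{-\eta (r_{ij} - r_s)^2} f_c(r_{ij})$ are the standard ones and are assumed here; only the 12 Bohr cutoff is taken from the text, and the values of `eta` and `r_shift` are hypothetical, not the parameters used in this work.

```python
import math

R_CUT = 12.0  # cutoff radius in Bohr, as stated in the text


def cosine_cutoff(r, r_cut=R_CUT):
    """Cosine cutoff: 1 at r = 0, decaying smoothly to 0 at r = r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)


def radial_symmetry_function(distances, eta, r_shift=0.0, r_cut=R_CUT):
    """Sum of cutoff-damped Gaussians over neighbour distances (in Bohr).

    eta and r_shift are illustrative parameters, not values from the paper.
    """
    return sum(
        math.exp(-eta * (r - r_shift) ** 2) * cosine_cutoff(r, r_cut)
        for r in distances
    )
```

Neighbours beyond the cutoff contribute nothing, which is why the long-range electrostatics discussed next have to be added separately.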
We have explicitly incorporated long-range effects beyond the cutoff of the symmetry functions (12 Bohr) of the machine learning potential. The predicted energy can in general be written as a sum of short-range and long-range contributions ($E_\mathrm{sr}$ and $E_\mathrm{coul}$, respectively): $E_\mathrm{tot} = E_\mathrm{sr} + E_\mathrm{coul}$. The long-range model was thus trained on the difference between the standard short-ranged model and the Coulomb contribution, calculated using point charges of +1 and -1 for Na and Cl, respectively, and using TIP3P model parameters for water [S21]. We used this model in all production simulations, where the Coulomb contributions were explicitly included via particle mesh Ewald summation. Details on the validation of the final models are presented in the next Section.
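The decomposition $E_\mathrm{tot} = E_\mathrm{sr} + E_\mathrm{coul}$ can be sketched as follows, reading the training target as the reference energy minus a fixed-charge Coulomb baseline. This is a simplified illustration under stated assumptions: it uses a direct pairwise sum for an isolated cluster in atomic units rather than particle mesh Ewald summation, it includes all atom pairs (any intramolecular exclusions are not handled), and the function names are hypothetical. The charges are +1 and -1 for Na and Cl as in the text, with the standard TIP3P values for water.

```python
import math
from itertools import combinations

# Point charges in units of e: ions as stated in the text, water from TIP3P.
CHARGES = {"Na": 1.0, "Cl": -1.0, "O": -0.834, "H": 0.417}


def coulomb_energy(symbols, positions):
    """Pairwise point-charge energy in Hartree for positions in Bohr.

    Non-periodic direct sum over all pairs; a periodic simulation
    would use Ewald summation instead.
    """
    energy = 0.0
    for (sym_i, r_i), (sym_j, r_j) in combinations(zip(symbols, positions), 2):
        energy += CHARGES[sym_i] * CHARGES[sym_j] / math.dist(r_i, r_j)
    return energy


def short_range_target(e_reference, symbols, positions):
    """Energy left for the short-range model: E_sr = E_tot - E_coul."""
    return e_reference - coulomb_energy(symbols, positions)
```

At prediction time the same Coulomb term is added back to the network output, so the two contributions sum to the total energy.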

Validation

We validate the ability of each model to reproduce its underlying reference method based on a validation set of 100 configurations. These comprise a scan along the inter-ion separation coordinate, ensuring that the model is accurate for the full range of the PMF. The force and energy RMSE for each model are shown in Table I. These errors compare favorably with our previous model [S18], which has RMSE values of 37.0 meV/Å for forces and 0.3 meV/atom for energies, as well as with similar reactive systems that have been studied using machine learning potentials, such as the work by Behler et al. in Ref. S22, who quote a force and energy RMSE for a model describing proton transport at ZnO/H2O interfaces of 140.4 meV/Å and 1.0 meV/atom, respectively.
Training errors, forces

TABLE I. Summary of force and energy RMSE for each ML model.

Model         Energy RMSE [meV/atom]   Force RMSE [meV/Å]
revPBE-D3     2.312                    38.224
optB88-vdW    1.080                    37.987
r2SCAN        1.068                    41.590
revPBE0-D3    1.516                    40.685
MP2           0.906                    43.108
RPA           0.873                    46.130
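For reference, error metrics of the kind reported in Table I can be computed along the following lines. This is a generic sketch, not the authors' validation script: the function names are hypothetical, and it assumes that energies are normalised per atom before taking the RMSE and that forces are compared component by component.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


def energy_rmse_per_atom(e_predicted, e_reference, n_atoms):
    """RMSE of per-atom energies; n_atoms gives the atom count per structure."""
    return rmse(
        [e / n for e, n in zip(e_predicted, n_atoms)],
        [e / n for e, n in zip(e_reference, n_atoms)],
    )
```

The force RMSE follows from `rmse` applied to the flattened lists of predicted and reference force components over the validation set.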

[Figure: C-NNP forces F_NNP [eV/Å] plotted against reference forces F_DFT [eV/Å]. Inset: energy RMSE = 1.068 meV/atom, force RMSE = 41.590 meV/Å.]

FIG. S6. Correlation plot for r2SCAN C-NNP predicted forces and corresponding reference DFT
forces, with the light grey line marking perfect correlation.

S3. SIMULATION DETAILS

System setup

The NaCl PMF was obtained from molecular dynamics simulations using the machine
learning potentials described above. The system comprised an NaCl ion pair surrounded by
62 waters in a cubic simulation cell with periodic boundary conditions in the x, y and
z directions. For each model, a 1 ns NpT simulation was first performed to obtain the
equilibrium density. From the NpT simulations, 10 configurations were sampled to give
uncorrelated starting configurations for production simulations run at the model density.
Production simulations were performed in the NVT ensemble, with a timestep of 1 fs at
300 K. The PMF was then obtained from the average of the RDFs from the 10 independent
simulations, using the relation F(r) = −k_B T ln g(r). The standard deviation was used to
quantify the statistical uncertainty. Overall, over 200 ns of ab initio quality simulations were
performed, highlighting the major advantage of the machine learning approach.
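The relation F(r) = −k_B T ln g(r) and the averaging over independent simulations can be sketched as below. This is an illustration under stated assumptions: the function name is hypothetical, all replicas share one r grid, and the uncertainty is taken as the standard deviation of the per-replica PMFs, which is one possible reading of the protocol.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def pmf_from_rdfs(g_replicas, temperature=300.0):
    """Mean PMF and replica-to-replica standard deviation in kcal/mol.

    g_replicas: array (n_replicas, n_bins) of ion-ion RDFs on a shared r grid.
    """
    g = np.asarray(g_replicas, dtype=float)
    kt = KB_KCAL * temperature
    # Bins with g(r) = 0 give an infinite PMF; silence the warnings here.
    with np.errstate(divide="ignore", invalid="ignore"):
        pmf_each = -kt * np.log(g)               # one PMF per replica
        pmf_mean = -kt * np.log(g.mean(axis=0))  # PMF of the averaged RDF
        pmf_std = pmf_each.std(axis=0, ddof=1)   # spread across replicas
    return pmf_mean, pmf_std

# Two identical toy replicas: g = 1 maps to F = 0, g = 2 to F = -kT ln 2.
pmf_mean, pmf_std = pmf_from_rdfs([[1.0, 2.0], [1.0, 2.0]])
```

At 300 K, k_B T is about 0.596 kcal/mol, so a factor of two in g(r) corresponds to roughly 0.41 kcal/mol in the PMF.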

Convergence tests

This section describes several tests to ensure our simulation protocol was statistically
converged and did not suffer from finite-size effects.

Simulation box size
Here we compare the PMF obtained in a large 24.82 Å cubic box comprising 6 NaCl ion pairs
and 332 waters with that obtained in the 12.42 Å cubic box of one ion pair and 62 waters
used in production runs.

[Figure: free energy [kcal/mol] versus distance [Å] for the 12.42 Å and 24.82 Å boxes.]

FIG. S7.

Sampling - thermodynamic integration vs RDF
Since we only use a single ion pair in our simulations, we check that all regions along the
inter-ion separation coordinate were sufficiently sampled by comparing the RDF method with
a thermodynamic integration (TI) scheme. Figure S8 shows the PMF obtained from the RDF and
from TI for 3 of the models, showing that the two approaches are equivalent within their
own statistical errors.
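The TI scheme itself is not spelled out in this section, so the snippet below is only a generic sketch of one common variant: the mean force along the inter-ion axis is sampled at a set of fixed separations and integrated numerically, w(r) = −∫ ⟨f(s)⟩ ds. The function name, the sign convention (positive force pushes the ions apart) and the choice of zero at the largest separation are all assumptions. Depending on how the constraint force is defined, an additional 2 k_B T ln r term may be needed before comparing with −k_B T ln g(r).

```python
import numpy as np

def pmf_from_mean_force(r, mean_force):
    """Integrate the mean radial force to get w(r), shifted so w(r[-1]) = 0.

    r: increasing grid of constrained inter-ion separations.
    mean_force: average force along the inter-ion axis at each separation,
                positive when it pushes the ions apart.
    """
    r = np.asarray(r, dtype=float)
    f = np.asarray(mean_force, dtype=float)
    # Trapezoid-rule increments of -∫ f dr between neighbouring grid points.
    steps = -0.5 * (f[1:] + f[:-1]) * np.diff(r)
    w = np.concatenate(([0.0], np.cumsum(steps)))
    return w - w[-1]

# Toy check with a harmonic well w(r) = (r - 4)^2, i.e. f(r) = -2 (r - 4).
r_grid = np.linspace(2.0, 6.0, 41)
w_ti = pmf_from_mean_force(r_grid, -2.0 * (r_grid - 4.0))
```

Because only free-energy differences are meaningful, the RDF and TI curves have to be aligned at a common reference distance before they are compared.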

[Figure: three panels (RPA, MP2, revPBE-D3) of free energy [kcal/mol] versus distance [Å], each comparing the RDF and TI results.]

FIG. S8. Comparison of thermodynamic integration and RDF approaches for 3 ML models.

Simulation time
One of the major advantages of machine learning potentials is that they give access to much
longer timescales than standard ab initio methods. Here we compute the PMF using one of the
MLPs over a time period of 500 ps, for 2 replicates, which is at the extreme upper end of what
is accessible to ab initio simulations (for the hybrid and wavefunction methods even this
timescale is unfeasible). In Figure S9, we compare this with the PMF converged with the MLP
over 10 replicates of over 3 ns each.

[Figure: free energy [kcal/mol] versus distance [Å] for the 500 ps and 3 ns simulations.]

FIG. S9. Comparison of the PMF obtained from MLP simulations of 500 ps with that from fully
converged MLP simulations of 10 replicates of 3 ns each.

S4. ADDITIONAL RESULTS

Figures S10, S11 and S12 summarise all of the RDFs computed in this work. Figure S10
shows the bulk water RDFs. Figures S11 and S12 show the ion-water and water-water RDFs
respectively for the system of one NaCl ion pair in 95 waters.
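For orientation, an RDF between two distinct species in a cubic periodic cell can be computed along the following lines. This is a minimal sketch, not the analysis code used for the figures: the function name and array layout are assumptions, r_max should not exceed half the box length, and for same-species pairs (for example O-O) the zero self-distances would have to be excluded first.

```python
import numpy as np

def rdf(pos_a, pos_b, box, r_max, n_bins=100):
    """g(r) between species A and B from frames in a cubic periodic box.

    pos_a: (n_frames, n_a, 3), pos_b: (n_frames, n_b, 3), box: edge length.
    """
    pos_a = np.asarray(pos_a, dtype=float)
    pos_b = np.asarray(pos_b, dtype=float)
    # All A-B separation vectors, wrapped with the minimum-image convention.
    d = pos_a[:, :, None, :] - pos_b[:, None, :, :]
    d -= box * np.round(d / box)
    dist = np.linalg.norm(d, axis=-1).ravel()
    counts, edges = np.histogram(dist, bins=n_bins, range=(0.0, r_max))
    # Pair counts expected for an ideal gas at the same density.
    shell = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    n_frames, n_a, n_b = pos_a.shape[0], pos_a.shape[1], pos_b.shape[1]
    ideal = n_frames * n_a * n_b * shell / box ** 3
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, counts / ideal

# One frame, one A-B pair separated by 1.5 Å across the periodic boundary.
r, g = rdf([[[0.5, 0.0, 0.0]]], [[[9.0, 0.0, 0.0]]], box=10.0, r_max=5.0, n_bins=5)
```

In this toy call only the bin covering 1 to 2 Å is populated, which confirms that the minimum-image wrapping is applied.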

[Figure: O-O, H-O and H-H RDF panels versus distance [Å], in three rows: NQE MP2 and NQE RPA; MP2 and RPA; r2SCAN, optB88-vdW, revPBE-D3 and revPBE0-D3.]

FIG. S10. Water-water RDFs for bulk water.

[Figure: Cl-O, Na-O, H-Cl and Na-H RDF panels versus distance [Å], in three rows: NQE MP2 and NQE RPA; MP2 and RPA; r2SCAN, optB88-vdW, revPBE-D3 and revPBE0-D3.]

FIG. S11. Ion-water RDFs for one NaCl in a 95 water box.

[Figure: O-O, H-O and H-H RDF panels versus distance [Å], in three rows: NQE MP2 and NQE RPA; MP2 and RPA; r2SCAN, optB88-vdW, revPBE-D3 and revPBE0-D3.]

FIG. S12. Water-water RDFs for one NaCl in a 95 water box.

REFERENCES

S1 T. D. Kühne, M. Iannuzzi, M. Del Ben, V. V. Rybkin, P. Seewald, F. Stein, T. Laino,
R. Z. Khaliullin, O. Schütt, F. Schiffmann, D. Golze, J. Wilhelm, S. Chulkov, M. H.
Bani-Hashemian, V. Weber, U. Borštnik, M. Taillefumier, A. S. Jakobovits, A. Lazzaro,
H. Pabst, T. Müller, R. Schade, M. Guidon, S. Andermatt, N. Holmberg, G. K. Schenter,
A. Hehn, A. Bussy, F. Belleflamme, G. Tabacchi, A. Glöß, M. Lass, I. Bethune, C. J.
Mundy, C. Plessl, M. Watkins, J. VandeVondele, M. Krack, and J. Hutter, “CP2K: An
electronic structure and molecular dynamics software package - Quickstep: Efficient and
accurate electronic structure calculations,” The Journal of Chemical Physics 152, 194103
(2020).

S2 S. Goedecker and M. Teter, “Separable dual-space Gaussian pseudopotentials,” Physical
Review B - Condensed Matter and Materials Physics 54, 1703–1710 (1996).

S3 J. VandeVondele and J. Hutter, “Gaussian basis sets for accurate calculations on molecular
systems in gas and condensed phases,” The Journal of Chemical Physics 127, 114105 (2007).

S4 J. P. Perdew, K. Burke, and M. Ernzerhof, “Generalized gradient approximation made
simple,” Physical Review Letters 77, 3865–3868 (1996).

S5 Y. Zhang and W. Yang, “Comment on “Generalized gradient approximation made simple”,”
Physical Review Letters 80, 890 (1998).

S6 J. W. Furness, A. D. Kaplan, J. Ning, J. P. Perdew, and J. Sun, “Accurate and numerically
efficient r2SCAN meta-generalized gradient approximation,” Journal of Physical
Chemistry Letters 11, 8208–8215 (2020).

S7 J. Sun, A. Ruzsinszky, and J. Perdew, “Strongly constrained and appropriately normed
semilocal density functional,” Physical Review Letters 115, 036402 (2015).

S8 J. Klimeš, D. R. Bowler, and A. Michaelides, “Chemical accuracy for the van der Waals
density functional,” Journal of Physics: Condensed Matter 22, 022201 (2009).

S9 C. Adamo and V. Barone, “Toward reliable density functional methods without adjustable
parameters: The PBE0 model,” The Journal of Chemical Physics 110, 6158–6170 (1999).

S10 S. Grimme, J. Antony, S. Ehrlich, and H. Krieg, “A consistent and accurate ab initio
parametrization of density functional dispersion correction (DFT-D) for the 94 elements
H-Pu,” The Journal of Chemical Physics 132, 154104 (2010).

S11 M. Guidon, J. Hutter, and J. VandeVondele, “Auxiliary density matrix methods for
Hartree-Fock exchange calculations,” Journal of Chemical Theory and Computation 6,
2348–2364 (2010).

S12 G. Lippert, J. Hutter, and M. Parrinello, “The Gaussian and augmented-plane-wave
density functional method for ab initio molecular dynamics simulations,” Theoretical
Chemistry Accounts 103, 124–140 (1999).

S13 A. Bussy, O. Schütt, and J. Hutter, “Sparse tensor based nuclear gradients for periodic
Hartree-Fock and low-scaling correlated wave function methods in the CP2K software
package: A massively parallel and GPU accelerated implementation,” The Journal of Chemical
Physics 158 (2023), 10.1063/5.0144493/2886896.

S14 G. L. Stoychev, A. A. Auer, and F. Neese, “Automatic generation of auxiliary basis sets,”
Journal of Chemical Theory and Computation 13, 554–562 (2017).

S15 S. Lehtola, “Straightforward and accurate automatic auxiliary basis set generation for
molecular calculations with atomic orbital basis sets,” Journal of Chemical Theory and
Computation 17, 6886–6900 (2021).

S16 M. Del Ben, O. Schütt, T. Wentz, P. Messmer, J. Hutter, and J. VandeVondele, “Enabling
simulation at the fifth rung of DFT: Large scale RPA calculations with excellent time to
solution,” Computer Physics Communications 187, 120–129 (2015).

S17 C. Schran, F. L. Thiemann, P. Rowe, E. A. Müller, O. Marsalek, and A. Michaelides,
“Machine learning potentials for complex aqueous systems made simple,” Proceedings of
the National Academy of Sciences of the United States of America 118, 38 (2021).

S18 N. O’Neill, C. Schran, S. J. Cox, and A. Michaelides, “Crumbling crystals: On the
dissolution mechanism of NaCl in water,” arXiv (2022).

S19 J. Behler, “Atom-centered symmetry functions for constructing high-dimensional neural
network potentials,” The Journal of Chemical Physics 134, 74106 (2011).

S20 A. Singraber, T. Morawietz, J. Behler, and C. Dellago, “Parallel multistream training of
high-dimensional neural network potentials,” Journal of Chemical Theory and Computation
15, 3075–3092 (2019).

S21 W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, and M. L. Klein,
“Comparison of simple potential functions for simulating liquid water,” The Journal of
Chemical Physics 79, 926–935 (1983).

S22 V. Quaranta, M. Hellström, and J. Behler, “Proton-transfer mechanisms at the water-ZnO
interface: The role of presolvation,” Journal of Physical Chemistry Letters 8, 1476–1483
(2017).
