Lecture 10 2019 Monte Carlo Final
• Importance sampling
10.3 Announcements
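• Exam 1 Thursday from 6-7:30 PM in EH 3024. Bring equation sheet, no calculator.
$$E_{\mathrm{LJ}}(r_{ij}) = 4\epsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] \tag{10.1}$$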
Here, $r_{ij}$ is the (scalar) distance between particles i and j, $\epsilon_{ij}$ is a characteristic interaction energy, and $\sigma_{ij}$
is a characteristic interaction length scale describing the approximate diameter of a particle. The
Lennard-Jones potential is broken into an attractive interaction that scales with $r^{-6}$ and a repulsive
potential that scales with $r^{-12}$. The attractive potential represents three contributions to typical
van der Waals interactions that are all attractive and scale with $r^{-6}$: London dispersion forces,
which are related to interactions between instantaneous dipoles that arise from quantum mechanical
considerations; dipole-induced dipole interactions, which are related to attractions between dipoles
University of Wisconsin-Madison Lecture 10
CBE 710, Fall 2019 - Prof. R. C. Van Lehn October 8, 2019
on a molecule and induced dipoles that arise from the polarizability of a different molecule; and
Keesom interactions, which emerge from the orientation dependence of dipole-dipole interactions.
In general, we do not attempt to divide the LJ potential into contributions from these three distinct
interactions, but rather empirically identify parameters for $\epsilon$ and $\sigma$ which capture all three effects.
The repulsive potential represents Pauli exclusion, which acts to ensure that particle wave functions
do not overlap. There is no single scaling relation for Pauli exclusion other than that it must be
a strong repulsive force, so for computational convenience $r^{-12}$ is chosen since the calculated $r^{-6}$
term can be simply squared.
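The squaring trick described above can be sketched in a few lines of Python. This is a minimal illustration of eq. (10.1), not a production force field; the function name `lennard_jones` and the default reduced units (`epsilon = sigma = 1`) are assumptions made for the example.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy, eq. (10.1), for a scalar separation r > 0."""
    sr6 = (sigma / r) ** 6   # attractive r^-6 term
    sr12 = sr6 * sr6         # repulsive r^-12 term, obtained by squaring
    return 4.0 * epsilon * (sr12 - sr6)


r_min = 2.0 ** (1.0 / 6.0)   # location of the energy minimum when sigma = 1
print(lennard_jones(1.0))    # energy at r = sigma, where the two terms cancel
print(lennard_jones(r_min))  # energy at the minimum, approximately -epsilon
```

The energy crosses zero at r = σ and reaches its minimum of -ε at r = 2^(1/6) σ, which is why σ is read as an approximate particle diameter and ε as the interaction strength.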
In addition to van der Waals interactions, it is typical to associate charges (or partial charges,
to account for the unequal distribution of electrons throughout a molecule that leads to dipoles) to
atoms or particles in a system. These charges interact via a long-range Coulombic potential:
$$E_{\mathrm{coulomb}}(r_{ij}) = \frac{1}{4\pi\epsilon_0\epsilon_r}\frac{q_i q_j}{r_{ij}} \tag{10.2}$$
where $\epsilon_0$ is the permittivity of free space, $\epsilon_r$ is the relative dielectric constant (1 in vacuum, 2-4 in
oil, 80 in water), and $q_i$ is the charge on particle i. In practice, Coulombic interactions are difficult
to calculate in simulations with periodic boundary conditions because they decay slowly, so
the minimum image convention severely underestimates the total magnitude of electrostatic
interactions. Instead, advanced techniques, such as Ewald summation, have been developed to handle
their calculation. Such methods are outside the scope of this discussion but are covered in the
Frenkel and Smit textbook if you would like to review them on your own.
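To get a feel for the magnitudes in eq. (10.2), the sketch below evaluates the Coulomb energy of two unit charges in SI units and reports it relative to the thermal energy. The separation of 1 nm and the temperature of 300 K are assumptions chosen for illustration, as is the function name `coulomb_energy`.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m
K_B = 1.380649e-23            # Boltzmann constant, J/K


def coulomb_energy(q_i, q_j, r, eps_r=1.0):
    """Coulomb pair energy in joules, eq. (10.2); charges in C, r in m."""
    return q_i * q_j / (4.0 * math.pi * EPSILON_0 * eps_r * r)


kT = K_B * 300.0  # thermal energy at an assumed temperature of 300 K
r = 1.0e-9        # assumed separation of 1 nm
e_vacuum = coulomb_energy(E_CHARGE, E_CHARGE, r, eps_r=1.0) / kT
e_water = coulomb_energy(E_CHARGE, E_CHARGE, r, eps_r=80.0) / kT
print(f"vacuum: {e_vacuum:.1f} kT, water: {e_water:.2f} kT")
```

Under these assumptions the interaction is tens of kT in vacuum but below 1 kT in water, which shows how strongly the dielectric constant screens electrostatics.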
Many other non-bonded interactions are possible and in common use, but these represent the
two most common functional forms used in most atomistic simulations.
Another example is the interaction between nearest-neighbors used in the Ising model, which
can be easily represented in MC simulations:
$$E_{\mathrm{Ising}}(r_{ij}) = \begin{cases} -J, & \text{if } i, j \text{ are neighbors} \\ 0, & \text{otherwise} \end{cases} \tag{10.4}$$
In principle, many other possible interactions could be defined; here we only included a subset
that are commonly found in the simulation literature and map directly to many experimental
problems.
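$$E(\mathbf{r}^N) = \frac{1}{2}\sum_{i}^{N}\sum_{j}^{N} E(|\mathbf{r}_{ij}|) \quad \text{with free boundaries} \tag{10.5}$$
$$E(\mathbf{r}^N) = \frac{1}{2}\sum_{\mathbf{n}}\sum_{i}^{N}\sum_{j}^{N} E(|\mathbf{r}_{ij} + \mathbf{n}L|) \quad \text{with periodic boundaries} \tag{10.6}$$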
Here, $\mathbf{r}^N$ is a vector representing the set of all N particle positions, $\mathbf{r}_{ij}$ is the (vector) distance
between particles i and j, $\mathbf{n}$ is a vector of 3 arbitrary integers, and L is the size of the box. The factor
of 1/2 eliminates overcounting of pairs of atoms. With periodic boundaries, each particle can in principle
interact with every possible periodic image since the system is infinite. This may not be desirable, so typically
interactions use the minimum image convention, meaning that the distance used in computing
pair potentials is the shortest possible distance between two particles, taking into account periodic
boundary conditions. Thus, the separation in any one dimension of the box can never exceed L/2.
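The minimum image convention can be sketched as follows for a cubic box. This is an illustrative sketch, not a reference implementation: the helper names (`minimum_image`, `pair_distance`, `total_energy`, `lj`) are hypothetical, and the energy sum runs over unique pairs i < j, which assumes self-interaction terms (i = j) are excluded and is otherwise equivalent to the double sum with the factor of 1/2.

```python
import math


def minimum_image(dx, box_length):
    """Wrap one component of a separation into [-L/2, L/2]."""
    return dx - box_length * round(dx / box_length)


def pair_distance(r_i, r_j, box_length):
    """Shortest distance between two particles in a cubic periodic box."""
    return math.sqrt(sum(minimum_image(a - b, box_length) ** 2
                         for a, b in zip(r_i, r_j)))


def total_energy(positions, box_length, pair_energy):
    """Sum pair_energy over unique pairs (i < j) using minimum image distances."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            energy += pair_energy(pair_distance(positions[i], positions[j], box_length))
    return energy


def lj(r):
    """Lennard-Jones energy in reduced units (epsilon = sigma = 1)."""
    sr6 = (1.0 / r) ** 6
    return 4.0 * (sr6 * sr6 - sr6)


# Two particles near opposite faces of a box of length 10 are close
# neighbors through the periodic boundary.
positions = [(0.5, 0.0, 0.0), (9.5, 0.0, 0.0)]
print(pair_distance(positions[0], positions[1], 10.0))
print(total_energy(positions, 10.0, lj))
```

Without the wrapping step the two particles would appear to be 9 length units apart; with it, they interact at a separation of 1.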
In practice, PBCs are used for the vast majority of simulations, but you must be careful that
they do not introduce artifacts. If the value of L is small relative to long-wavelength modes of
the system, for example, then the presence of PBCs could limit these modes. For example, lipid
bilayers, while largely planar, undulate out-of-plane over length scales of hundreds of nanometers;
these undulations are often damped in simulations that are too small to properly capture them.
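$$Z = \sum_{\mathbf{r}_1}\sum_{\mathbf{r}_2}\sum_{\mathbf{r}_3}\cdots\sum_{\mathbf{r}_N} e^{-\beta E(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots \mathbf{r}_N)} \tag{10.7}$$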
In this notation, each sum accounts for all possible positions ($\mathbf{r}_i$) of one of the particles, and
the energy of each configuration is a function of all particle positions. The bold notation indicates
that $\mathbf{r}_i$ is a vector; in this case a vector with 3 coordinates referring to the x/y/z positions of the
particle. We cannot factorize this partition function because the particles are interacting and hence
a single-particle partition function cannot be written without knowledge of the positions/states of
the other particles. If we now assume that particle positions are continuous, rather than discrete,
we can transform our sums to integrals in the classical limit and write the expression as:
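\begin{align}
Z &= \int_{\mathbf{r}_1}\int_{\mathbf{r}_2}\int_{\mathbf{r}_3}\cdots\int_{\mathbf{r}_N} d\mathbf{r}_1\, d\mathbf{r}_2\, d\mathbf{r}_3 \ldots d\mathbf{r}_N\, e^{-\beta E(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots \mathbf{r}_N)} \tag{10.8} \\
&\equiv \int_{V^N} d\mathbf{r}^N \exp\left[-\beta E(\mathbf{r}^N)\right] \tag{10.9}
\end{align}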
Here, each integral runs over some volume which is accessible to each particle in the system
since the particle positions have units of length. We simplify the notation by writing the integral
over a single vector, $\mathbf{r}^N$, which contains the positions of all N particles; in three dimensions, this
is then a vector with 3N coordinates, and integrating over all possible positions is equivalent to
integrating over a 3N-dimensional volume $V^N$, which we call the volume of phase space accessible
to the N particles (often this partition function is written with a normalizing prefactor with units
of 1/volume$^N$ to ensure that the partition function is unitless; we omit this prefactor here).
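With this new notation, the ensemble average in the classical canonical ensemble (with NVT
fixed) is given as:
$$\langle Y \rangle = \frac{\int d\mathbf{r}^N \exp\left[-\beta E(\mathbf{r}^N)\right] Y(\mathbf{r}^N)}{\int d\mathbf{r}^N \exp\left[-\beta E(\mathbf{r}^N)\right]} \tag{10.10}$$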
This notation mirrors the notation used to sum over all states accessible to a system using a
discrete partition function, but here the sum is replaced by an integral over phase space. It is only
a notation change, and conceptually the quantities are the same.
In principle, the integrals in eq. (10.10) could be calculated in a brute force manner by
determining the value of $Y(\mathbf{r}^N)$ for every set of particle coordinates and integrating numerically.
However, such an approach would be impossible computationally because the number of system
configurations becomes effectively infinite for even a small number of particles. Moreover, it is
likely that the vast majority of the system configurations would have a high energy, $E(\mathbf{r}^N) \gg k_B T$,
and as a result the Boltzmann factor for most configurations would be effectively zero. In other words, a
large portion of the phase space ($V^N$) possible for a simulation will be inaccessible due to its high
energy; those configurations will be vanishingly unlikely. Performing such a calculation would thus
be not only nearly impossible, but also highly inefficient. Finally, notice that
to calculate $\langle Y \rangle$, it is not necessary to calculate the values of the integrals in the numerator
and denominator of eq. (10.10) separately; only their ratio must be determined. This observation will form
the basis of the Metropolis Monte Carlo algorithm. We will now describe Monte Carlo sampling in
general, then discuss the Metropolis algorithm.
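To put rough numbers on the brute-force argument above, suppose (purely as an illustrative assumption) that each coordinate is discretized into M = 100 grid points and that the system contains N = 100 particles in three dimensions. The number of configurations on the grid is then

$$M^{3N} = 100^{300} = 10^{600}.$$

Even at an assumed rate of $10^{12}$ energy evaluations per second, enumerating them would take $10^{588}$ seconds. The Boltzmann factor also decays quickly: a configuration with $E = 50\,k_B T$ contributes a weight of only $e^{-50} \approx 2 \times 10^{-22}$ relative to a configuration with $E = 0$.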
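\begin{align}
F &= \int_a^b f(x)\, dx \tag{10.12} \\
&= \int_a^b \frac{f(x)}{p(x)}\, p(x)\, dx \tag{10.13} \\
&= \left\langle \frac{f(x)}{p(x)} \right\rangle \tag{10.14}
\end{align}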
This expression is the ensemble average of an observable, but in the continuum limit. In other
words, this is the continuum version of the expression $\langle Y \rangle = \sum_i p_i Y_i$, where we have replaced
the summation (for discrete states) with an integral and the observable we are computing is $f(x)/p(x)$.
Now, we can calculate the average value of $f(x)/p(x)$ by randomly selecting points within the
interval [a, b] according to the probability distribution p(x), and calculating $f(x)/p(x)$ for each
randomly selected value of x. This stochastic sampling of x from all possible values defines the Monte
Carlo method. If we have an infinite number of trials, then each value of x will be sampled exactly
according to p(x) and the average of $f(x)/p(x)$ computed from the infinite number of trials will be
exactly equal to the value of the integral above. We can thus approximate F by:
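\begin{align}
F &= \left\langle \frac{f(x)}{p(x)} \right\rangle \tag{10.15} \\
&\approx \left\langle \frac{f(x)}{p(x)} \right\rangle_{\text{trials}} \tag{10.16} \\
&= \frac{1}{\tau}\sum_{i}^{\tau} \frac{f(x_i)}{p(x_i)} \tag{10.17}
\end{align}
where for each of the τ trials (samples), $x_i$ is chosen according to the probability p(x).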
Let’s consider a simple example of how we might apply this idea. First, we will choose p(x) to
be a uniform probability density:
$$p(x) = \frac{1}{b-a} \quad \text{for } a \le x \le b \tag{10.18}$$
Then, we can generally approximate F as:
$$F \approx \frac{b-a}{\tau}\sum_{i}^{\tau} f(x_i) \tag{10.19}$$
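The uniform-sampling estimate of eq. (10.19) can be sketched in Python as below. The function name `mc_integrate_uniform`, the fixed seed, and the test integrand sin(x) on [0, π] (whose exact integral is 2) are choices made for this illustration.

```python
import math
import random


def mc_integrate_uniform(f, a, b, n_trials, seed=0):
    """Estimate the integral of f over [a, b] from uniformly sampled points, eq. (10.19)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        x = rng.uniform(a, b)  # draw x from p(x) = 1 / (b - a)
        total += f(x)
    return (b - a) * total / n_trials


estimate = mc_integrate_uniform(math.sin, 0.0, math.pi, n_trials=100_000)
print(f"estimate = {estimate:.4f}, exact = 2")
```

The statistical error of such an estimate shrinks as one over the square root of the number of trials, so each additional digit of accuracy costs roughly 100 times more samples.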
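\begin{align}
Z &= \int_{V^N} d\mathbf{r}^N \exp\left[-\beta E(\mathbf{r}^N)\right] \tag{10.20} \\
&\approx \frac{V^N}{\tau}\sum_{i}^{\tau} \exp\left[-\beta E(\mathbf{r}_i^N)\right] \tag{10.21}
\end{align}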
$V^N$ is again the volume of phase space, which is the 3N-dimensional analogue of the interval
[a, b]; τ is the total number of samples used for the approximation, and $E(\mathbf{r}_i^N)$ is the potential
energy of the system for the specific configuration denoted by i. There are two major problems
with this approach in practice. First, it's difficult to estimate the total phase space volume, $V^N$.
This problem can be avoided, however, by recognizing that calculating the ensemble-average value
of an observable requires only the ratio of two quantities within the phase space $V^N$. So we can
write:
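\begin{align}
\langle Y \rangle_{NVT} &\approx \frac{\frac{V^N}{\tau}\sum_{i}^{\tau} Y_i(\mathbf{r}_i^N) \exp\left[-\beta E(\mathbf{r}_i^N)\right]}{\frac{V^N}{\tau}\sum_{i}^{\tau} \exp\left[-\beta E(\mathbf{r}_i^N)\right]} \tag{10.22} \\
&\approx \frac{\sum_{i}^{\tau} Y_i(\mathbf{r}_i^N) \exp\left[-\beta E(\mathbf{r}_i^N)\right]}{\sum_{i}^{\tau} \exp\left[-\beta E(\mathbf{r}_i^N)\right]} \tag{10.23}
\end{align}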
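The ratio in eq. (10.23) can be tried out on a one-dimensional toy problem. The sketch below is an illustration under stated assumptions rather than a practical method: it uses a hypothetical harmonic energy E(x) = x²/2 with β = 1 (for which ⟨x²⟩ = 1 analytically), a finite sampling interval [-10, 10] standing in for the phase space volume, and the made-up function name `weighted_average_uniform`.

```python
import math
import random


def weighted_average_uniform(observable, energy, lo, hi, beta, n_trials, seed=0):
    """Eq. (10.23): Boltzmann-weighted average over uniformly sampled points."""
    rng = random.Random(seed)
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_trials):
        x = rng.uniform(lo, hi)               # uniform sample of the accessible space
        weight = math.exp(-beta * energy(x))  # Boltzmann factor of this sample
        numerator += observable(x) * weight
        denominator += weight
    return numerator / denominator            # the volume prefactor cancels


# Hypothetical 1D harmonic well, E(x) = x^2 / 2, with beta = 1.
estimate = weighted_average_uniform(lambda x: x * x, lambda x: 0.5 * x * x,
                                    lo=-10.0, hi=10.0, beta=1.0, n_trials=200_000)
print(f"<x^2> = {estimate:.3f}")
```

Note that the interval length never enters the result, mirroring the cancellation of $V^N$ between eqs. (10.22) and (10.23). Most of the uniformly drawn points fall where the Boltzmann factor is negligible, so much of the sampling effort is wasted even in this one-dimensional case.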
contribute meaningfully to the calculation of the ensemble average. This is the essence of importance
sampling. We can perform importance sampling for configurations in the canonical ensemble by
recalling that the probability of finding the system in a given microstate of the canonical ensemble,
$p(\mathbf{r}^N)_{NVT}$, is related to the Boltzmann factor for that state normalized by the partition function.
We can then write:
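$$p(\mathbf{r}^N)_{NVT} = \frac{\exp\left[-\beta E(\mathbf{r}^N)\right]}{Z} \tag{10.24}$$
$$\langle Y \rangle_{NVT} = \int d\mathbf{r}^N\, p(\mathbf{r}^N)_{NVT}\, Y(\mathbf{r}^N) \tag{10.25}$$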
The ensemble average then has the same form as $F = \int_a^b f(x)\, dx$ if we let $x = \mathbf{r}^N$ and $f(x) =
p(\mathbf{r}^N)_{NVT}\, Y(\mathbf{r}^N)$. Following the reasoning above, we can then approximate $\langle Y \rangle_{NVT}$ by:
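\begin{align}
\langle Y \rangle_{NVT} &\approx \left\langle \frac{f(x)}{p(x)} \right\rangle_{\text{trials}} \tag{10.26} \\
&= \left\langle \frac{p(\mathbf{r}^N)_{NVT}\, Y(\mathbf{r}^N)}{p(\mathbf{r}^N)} \right\rangle_{\text{trials}} \tag{10.27}
\end{align}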
From this expression, we see that if we select trials according to the probability distribution
$p(\mathbf{r}^N) = p(\mathbf{r}^N)_{NVT}$, then we get:
$$\langle Y \rangle_{NVT} = \langle Y \rangle_{\text{trials}} \tag{10.28}$$
Thus, we choose configurations in our ensemble according to the canonical ensemble probability
distribution, in which case the ensemble-average value of Y can be estimated by sampling
configurations according to their Boltzmann weight. Finally, we note that we just need to know the
probability of sampling a configuration; we do not necessarily need an expression for the partition
function itself. In principle, we could choose another probability distribution p(x) from which to
sample states, but the choice of $p(\mathbf{r}^N) = p(\mathbf{r}^N)_{NVT}$ is the simplest.
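Eq. (10.28) can be illustrated with a toy case in which the Boltzmann distribution happens to be easy to sample directly. The sketch below assumes a hypothetical one-dimensional harmonic energy E(x) = k x²/2, for which the normalized Boltzmann distribution is a Gaussian with variance 1/(βk); the function name `average_boltzmann_sampled` and its parameters are made up for the example. Real many-particle systems do not allow this kind of direct sampling.

```python
import math
import random


def average_boltzmann_sampled(observable, beta, spring_k, n_trials, seed=0):
    """Eq. (10.28): plain average over samples drawn from the Boltzmann distribution.

    Assumes E(x) = spring_k * x^2 / 2, whose Boltzmann distribution is a
    Gaussian with variance 1 / (beta * spring_k).
    """
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(beta * spring_k)
    total = 0.0
    for _ in range(n_trials):
        x = rng.gauss(0.0, sigma)  # x drawn with probability proportional to exp(-beta E)
        total += observable(x)     # no explicit Boltzmann weight in the average
    return total / n_trials


estimate = average_boltzmann_sampled(lambda x: x * x, beta=1.0, spring_k=1.0,
                                     n_trials=200_000)
print(f"<x^2> = {estimate:.3f}, exact = 1")
```

Because every sample lands where the Boltzmann weight is significant, the observable is simply averaged without reweighting, in contrast to uniform sampling where most points contribute almost nothing.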
Our problem then boils down to: how do we select states according to the correct probability
distribution without knowing the value of the partition function? To do so, we will generate a
Markov chain of states as a means of sampling our distribution. A Markov chain refers to a
sequence of states (i.e. configurations or trials using our previous nomenclature) that satisfy the
following two conditions:
• Each state generated belongs to a finite set of possible outcomes called the state space. The
statistical mechanical analogue to this statement is to say that each microstate generated
belongs to a finite ensemble. We can denote each possible state by $\mathbf{r}_1^N, \mathbf{r}_2^N, \mathbf{r}_3^N, \ldots$ for the
enormous set of possible microstates within the canonical ensemble that we are sampling.
For the canonical ensemble, this state space is equal to $V^N$, the accessible phase space.
• The probability of sampling state i + 1 in the sequence of states sampled depends only on
state i, and not on previous states in the chain.
Since the likelihood of sampling a new state is only related to what current state we are in, we
can define a transition probability, Π(m → n), which defines the likelihood of transitioning from
state m to state n. We can then imagine an algorithm in which we start in some state m then
transition to a new state n with a probability given by Π(m → n) and repeat this for a large number
of trials. If we do this an infinite number of times, then the state m will appear with an overall
probability given by p(m), where p(m) is the limiting probability distribution that does not depend
on any of the other states (unlike the transition probability). When sampling from the canonical
ensemble, then, we want p(m) to equal $p(\mathbf{r}_m^N)_{NVT}$; that is, the likelihood of sampling state m if
we take enough states from our Markov chain is equal to the probability of sampling that state
according to the canonical ensemble distribution function. Thus, we need to find an expression for
the transition probability, Π, that yields this correct limiting distribution. We will return to this
problem in the next lecture.
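The idea of a limiting distribution can be illustrated with the smallest possible Markov chain. The sketch below is not the Metropolis algorithm and does not sample a canonical ensemble; it assumes a hypothetical two-state system with arbitrarily chosen transition probabilities Π(0 → 1) = 0.1 and Π(1 → 0) = 0.3, and the function name `simulate_two_state_chain` is made up. For such a chain the limiting distribution is known analytically: p(0) = Π(1 → 0) / [Π(0 → 1) + Π(1 → 0)].

```python
import random


def simulate_two_state_chain(p_01, p_10, n_steps, seed=0):
    """Run a two-state Markov chain; return the fraction of steps spent in each state.

    p_01 is the transition probability from state 0 to 1; p_10 from 1 to 0.
    """
    rng = random.Random(seed)
    state = 0
    counts = [0, 0]
    for _ in range(n_steps):
        # The next state depends only on the current state (Markov property).
        switch_probability = p_01 if state == 0 else p_10
        if rng.random() < switch_probability:
            state = 1 - state
        counts[state] += 1
    return [c / n_steps for c in counts]


fractions = simulate_two_state_chain(p_01=0.1, p_10=0.3, n_steps=200_000)
limiting = [0.3 / (0.1 + 0.3), 0.1 / (0.1 + 0.3)]  # analytic limiting distribution
print(fractions, limiting)
```

The observed fractions approach the limiting distribution regardless of the starting state, even though each step only uses the transition probabilities out of the current state.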