
APPLIED STOCHASTIC PROCESSES

G.A. Pavliotis

Department of Mathematics, Imperial College London, UK

January 16, 2011



Lectures: Mondays, 10:00-12:00, Huxley 6M42.
Office Hours: By appointment.
Course webpage:
http://www.ma.imperial.ac.uk/~pavl/stoch_proc.htm
Text: Lecture notes, available from the course webpage. Also,
recommended reading from various textbooks/review articles.
The lecture notes are still in progress. Please send me your
comments, suggestions and let me know of any typos/errors
that you have spotted.



This is a basic graduate course on stochastic processes, aimed
towards PhD students in applied mathematics and theoretical
physics.
The emphasis of the course will be on the presentation of
analytical tools that are useful in the study of stochastic models
that appear in various problems in applied mathematics, physics,
chemistry and biology.
The course will consist of three parts: Fundamentals of the theory
of stochastic processes, applications (reaction rate theory, surface
diffusion, ...) and non-equilibrium statistical mechanics.



1 PART I: FUNDAMENTALS OF CONTINUOUS TIME
STOCHASTIC PROCESSES
Elements of probability theory.
Stochastic processes: basic definitions, examples.
Continuous time Markov processes. Brownian motion
Diffusion processes: basic definitions, the generator.
Backward Kolmogorov and the Fokker–Planck (forward
Kolmogorov) equations.
Stochastic differential equations (SDEs); Itô calculus, Itô and
Stratonovich stochastic integrals, connection between SDEs and
the Fokker–Planck equation.
Methods of solution for SDEs and for the Fokker-Planck equation.
Ergodic properties and convergence to equilibrium.



2. PART II: APPLICATIONS.
Asymptotic problems for the Fokker–Planck equation: overdamped
(Smoluchowski) and underdamped (Freidlin-Wentzell) limits.
Bistable stochastic systems: escape over a potential barrier, mean
first passage time, calculation of escape rates etc.
Brownian motion in a periodic potential. Stochastic models of
molecular motors.
Multiscale problems: averaging and homogenization.



3. PART III: NON-EQUILIBRIUM STATISTICAL MECHANICS.
Derivation of stochastic differential equations from deterministic
dynamics (heat bath models, projection operator techniques etc.).
The fluctuation-dissipation theorem.
Linear response theory.
Derivation of macroscopic equations (hydrodynamics) and
calculation of transport coefficients.

4. ADDITIONAL TOPICS (time permitting): numerical methods,
stochastic PDEs, Markov chain Monte Carlo (MCMC).



Prerequisites

Basic knowledge of ODEs and PDEs.


Elementary probability theory.
Some familiarity with the theory of stochastic processes.



Lecture notes will be provided for all the material that we will cover
in this course. They will be posted on the course webpage.
There are many excellent textbooks/review articles on applied
stochastic processes, at a level and style similar to that of this
course.
Standard textbooks are
Gardiner: Handbook of stochastic methods (1985).
Van Kampen: Stochastic processes in physics and chemistry
(1981).
Horsthemke and Lefever: Noise induced transitions (1984).
Risken: The Fokker-Planck equation (1989).
Oksendal: Stochastic differential equations (2003).
Mazo: Brownian motion: fluctuations, dynamics and applications
(2002).
Bhattacharya and Waymire: Stochastic processes with
applications (1990).



Other standard textbooks are
Nelson: Dynamical theories of Brownian motion (1967). Available
from the web (includes a very interesting historical overview of the
theory of Brownian motion).
Chorin and Hald: Stochastic tools in mathematics and science
(2006).
Zwanzig: Nonequilibrium statistical mechanics (2001).
The material on multiscale methods for stochastic processes,
together with some of the introductory material, will be taken from
Pavliotis and Stuart: Multiscale methods: averaging and
homogenization (2008).



Excellent books on the mathematical theory of stochastic
processes are
Karatzas and Shreve: Brownian motion and stochastic calculus
(1991).
Revuz and Yor: Continuous martingales and Brownian motion
(1999).
Stroock: Probability theory, an analytic view (1993).



Well known review articles (available from the web) are:
Chandrasekhar: Stochastic problems in physics and astronomy
(1943).
Hanggi and Thomas: Stochastic processes: time evolution,
symmetries and linear response (1982).
Hanggi, Talkner and Borkovec: Reaction rate theory: fifty years
after Kramers (1990).



The theory of stochastic processes started with Einstein’s work on
the theory of Brownian motion: Concerning the motion, as
required by the molecular-kinetic theory of heat, of particles
suspended in liquids at rest (1905).
explanation of Brown’s observation (1827): when suspended in
water, small pollen grains are found to be in a very animated and
irregular state of motion.
Einstein’s theory is based on
A Markov chain model for the motion of the particle (molecule, pollen
grain...).
The idea that it makes more sense to talk about the probability of
finding the particle at position x at time t, rather than about individual
trajectories.



In his work many of the main aspects of the modern theory of
stochastic processes can be found:
The assumption of Markovianity (no memory) expressed through
the Chapman-Kolmogorov equation.
The Fokker–Planck equation (in this case, the diffusion equation).
The derivation of the Fokker-Planck equation from the master
(Chapman-Kolmogorov) equation through a Kramers-Moyal
expansion.
The calculation of a transport coefficient (the diffusion coefficient)
using macroscopic (kinetic theory-based) considerations:

D = k_B T / (6πηa).

k_B is Boltzmann’s constant, T is the temperature, η is the viscosity
of the fluid and a is the radius of the particle.



Einstein’s theory is based on the Fokker-Planck equation.
Langevin (1908) developed a theory based on a stochastic
differential equation. The equation of motion for a Brownian
particle is
m d²x/dt² = −6πηa dx/dt + ξ,
where ξ is a random force.
There is complete agreement between Einstein’s theory and
Langevin’s theory.
The theory of Brownian motion was developed independently by
Smoluchowski.



The approaches of Langevin and Einstein represent the two main
approaches in the theory of stochastic processes:
Study individual trajectories of Brownian particles. Their evolution is
governed by a stochastic differential equation:

dX/dt = F(X) + Σ(X)ξ(t),
where ξ(t) is a random force.
Study the probability ρ(x, t) of finding a particle at position x at
time t. This probability distribution satisfies the Fokker-Planck
equation:
∂ρ/∂t = −∇ · (F(x)ρ) + (1/2) ∇∇ : (A(x)ρ),
where A(x) = Σ(x)Σ(x)T .



The theory of stochastic processes was developed during the 20th
century:
Physics:
Smoluchowski.
Planck (1917).
Klein (1922).
Ornstein and Uhlenbeck (1930).
Kramers (1940).
Chandrasekhar (1943).
...
Mathematics:
Wiener (1922).
Kolmogorov (1931).
Itô (1940’s).
Doob (1940’s and 1950’s).
...



The One-Dimensional Random Walk

We let time be discrete, i.e. t = 0, 1, . . . . Consider the following
stochastic process S_n:
S_0 = 0;
at each time step it moves by ±1 with equal probability 1/2.
In other words, at each time step we flip a fair coin. If the outcome is
heads, we move one unit to the right. If the outcome is tails, we move
one unit to the left.
Alternatively, we can think of the random walk as a sum of independent
random variables:
S_n = Σ_{j=1}^n X_j,

where X_j ∈ {−1, 1} with P(X_j = ±1) = 1/2.



We can simulate the random walk on a computer:
We need a (pseudo)random number generator to generate n
independent random variables which are uniformly distributed in
the interval [0,1].
If the value of the random variable is greater than 1/2 then the particle
moves to the left, otherwise it moves to the right.
We then take the sum of all these random moves.
The sequence {S_n}_{n=1}^N indexed by the discrete time
T = {1, 2, . . . , N} is the path of the random walk. We use a linear
interpolation (i.e. connect the points {n, S_n} by straight lines) to
generate a continuous path.
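A minimal NumPy sketch of this procedure (the step count and the use of NumPy are illustrative choices, not part of the notes):

```python
# Minimal sketch of the random-walk simulation described above (NumPy).
import numpy as np

rng = np.random.default_rng()
N = 50                                        # number of steps
U = rng.uniform(0.0, 1.0, size=N)             # N iid uniform [0,1] random variables
steps = np.where(U > 0.5, -1, 1)              # value > 1/2: move left; otherwise right
S = np.concatenate(([0], np.cumsum(steps)))   # S_0 = 0, S_1, ..., S_N

# Connecting the points (n, S_n) by straight lines (e.g. matplotlib's plt.plot)
# gives the linearly interpolated, continuous path.
```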




Figure: Three paths of the random walk of length N = 50.




Figure: Three paths of the random walk of length N = 1000.



Every path of the random walk is different: it depends on the
outcome of a sequence of independent random experiments.
We can compute statistics by generating a large number of paths
and computing averages. For example, E(Sn ) = 0, E(Sn2 ) = n.
The paths of the random walk (without the linear interpolation) are
not continuous: the random walk has a jump of size 1 at each time
step.
This is an example of a discrete time, discrete space stochastic
process.
The random walk is a time-homogeneous (the probabilistic law
of evolution is independent of time) Markov (the future depends
only on the present and not on the past) process.
If we take a large number of steps, the random walk starts looking
like a continuous time process with continuous paths.



Consider the sequence of continuous time stochastic processes

Z_t^n := (1/√n) S_{nt}.

In the limit as n → ∞, the sequence {Z_t^n} converges (in some
appropriate sense) to a Brownian motion with diffusion
coefficient D = Δx²/(2Δt) = 1/2.
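A quick Monte Carlo sanity check of this scaling (a sketch; the values of n, t and the sample size are arbitrary): the second moment E(Z_t^n)² should be close to 2Dt = t.

```python
# Check numerically that E[(Z_t^n)^2] ≈ 2Dt = t for the rescaled random walk.
import numpy as np

rng = np.random.default_rng()
n, t, M = 1000, 0.7, 20000                    # scaling parameter, time, number of samples
steps = rng.choice([-1, 1], size=(M, int(n * t)))
Z = steps.sum(axis=1) / np.sqrt(n)            # Z_t^n = S_{nt} / sqrt(n), one value per sample
print(Z.var(), "should be close to", t)
```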



Figure: Sample Brownian paths (5 individual paths and the mean of 1000 paths).



Brownian motion W(t) is a continuous time stochastic process
with continuous paths that starts at 0 (W(0) = 0) and has
independent, normally distributed (Gaussian) increments.
We can simulate the Brownian motion on a computer using a
random number generator that generates normally distributed,
independent random variables.
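For instance, on a uniform grid of [0, 1] one can add up independent N(0, Δt) increments; a minimal sketch (grid size and horizon are arbitrary choices):

```python
# Sketch: sample a Brownian path on [0, 1] from independent Gaussian increments.
import numpy as np

rng = np.random.default_rng()
n_steps, T = 1000, 1.0
dt = T / n_steps
dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)   # independent N(0, dt) increments
W = np.concatenate(([0.0], np.cumsum(dW)))        # W(0) = 0; partial sums give W(t_j)
t = np.linspace(0.0, T, n_steps + 1)
# Plotting t against W (e.g. with matplotlib) produces paths like those in the figure.
```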



We can write an equation for the evolution of the paths of a
Brownian motion Xt with diffusion coefficient D starting at x:

dX_t = √(2D) dW_t,   X_0 = x.

This is an example of a stochastic differential equation.


The probability density of finding X_t at y at time t, given that it was
at x at time t = 0 (the transition probability density ρ(y, t)), satisfies
the PDE
∂ρ/∂t = D ∂²ρ/∂y²,   ρ(y, 0) = δ(y − x).
This is an example of the Fokker-Planck equation.
The connection between Brownian motion and the diffusion
equation was made by Einstein in 1905.



Why introduce randomness in the description of physical systems?
To describe outcomes of a repeated set of experiments. Think of
tossing a coin repeatedly or throwing a die.
To describe a deterministic system for which we have incomplete
information: we have imprecise knowledge of initial and boundary
conditions or of model parameters.
ODEs with random initial conditions are equivalent to stochastic
processes that can be described using stochastic differential
equations.
To describe systems for which we are not confident about the
validity of our mathematical model.



To describe a dynamical system exhibiting very complicated
behavior (chaotic dynamical systems). Determinism versus
predictability.
To describe a high dimensional deterministic system using a
simpler, low dimensional stochastic system. Think of the physical
model for Brownian motion (a heavy particle colliding with many
small particles).
To describe a system that is inherently random. Think of quantum
mechanics.



ELEMENTS OF PROBABILITY THEORY



A collection of subsets of a set Ω is called a σ–algebra if it
contains Ω and is closed under the operations of taking
complements and countable unions of its elements.
A sub-σ–algebra is a collection of subsets of a σ–algebra which
satisfies the axioms of a σ–algebra.
A measurable space is a pair (Ω, F) where Ω is a set and F is a
σ–algebra of subsets of Ω.
Let (Ω, F) and (E , G) be two measurable spaces. A function
X : Ω 7→ E such that the event

{ω ∈ Ω : X (ω) ∈ A} =: {X ∈ A}

belongs to F for arbitrary A ∈ G is called a measurable function or


random variable.



Let (Ω, F) be a measurable space. A function µ : F → [0, 1] is
called a probability measure if µ(∅) = 0, µ(Ω) = 1 and
µ(∪_{k=1}^∞ A_k) = Σ_{k=1}^∞ µ(A_k) for all sequences of pairwise disjoint
sets {A_k}_{k=1}^∞ with A_k ∈ F.
The triplet (Ω, F, µ) is called a probability space.
Let X be a random variable (measurable function) from (Ω, F, µ)
to (E , G). If E is a metric space then we may define expectation
with respect to the measure µ by
E[X] = ∫_Ω X(ω) dµ(ω).

More generally, let f : E → R be G–measurable. Then,

E[f(X)] = ∫_Ω f(X(ω)) dµ(ω).



Let U be a topological space. We will use the notation B(U) to
denote the Borel σ–algebra of U: the smallest σ–algebra
containing all open sets of U. Every random variable from a
probability space (Ω, F, µ) to a measurable space (E , B(E ))
induces a probability measure on E :

µX (B) = PX −1 (B) = µ(ω ∈ Ω; X (ω) ∈ B), B ∈ B(E ).

The measure µX is called the distribution (or sometimes the law)


of X .

Example
Let I denote a subset of the positive integers. A vector
ρ_0 = {ρ_{0,i}, i ∈ I} is a distribution on I if it has nonnegative entries and
its total mass equals 1: Σ_{i∈I} ρ_{0,i} = 1.



We can use the distribution of a random variable to compute
expectations and probabilities:
E[f(X)] = ∫_E f(x) dµ_X(x)

and

P[X ∈ G] = ∫_G dµ_X(x),   G ∈ B(E).

When E = R^d and we can write dµ_X(x) = ρ(x) dx, then we refer
to ρ(x) as the probability density function (pdf), or density with
respect to Lebesgue measure, for X.
When E = Rd then by Lp (Ω; Rd ), or sometimes Lp (Ω; µ) or even
simply Lp (µ), we mean the Banach space of measurable functions
on Ω with norm
‖X‖_{L^p} = (E|X|^p)^{1/p}.



Example

Consider the random variable X : Ω 7→ R with pdf


 
γ_{σ,m}(x) := (2πσ)^{−1/2} exp(−(x − m)²/(2σ)).

Such an X is termed a Gaussian or normal random variable. The
mean is
EX = ∫_R x γ_{σ,m}(x) dx = m
and the variance is
E(X − m)² = ∫_R (x − m)² γ_{σ,m}(x) dx = σ.



Example (Continued)
Let m ∈ Rd and Σ ∈ Rd×d be symmetric and positive definite. The
random variable X : Ω → R^d with pdf
γ_{Σ,m}(x) := ((2π)^d det Σ)^{−1/2} exp(−(1/2)⟨Σ^{−1}(x − m), (x − m)⟩)

is termed a multivariate Gaussian or normal random variable.


The mean is

E(X) = m   (1)

and the covariance matrix is

E[(X − m) ⊗ (X − m)] = Σ.   (2)



Since the mean and variance specify completely a Gaussian
random variable on R, the Gaussian is commonly denoted by
N (m, σ). The standard normal random variable is N (0, 1).
Since the mean and covariance matrix completely specify a
Gaussian random variable on Rd , the Gaussian is commonly
denoted by N (m, Σ).



The Characteristic Function

Many of the properties of (sums of) random variables can be


studied using the Fourier transform of the distribution function.
The characteristic function of the random variable X is defined
to be the Fourier transform of the distribution function
φ(t) = ∫_R e^{itλ} dµ_X(λ) = E(e^{itX}).   (3)

The characteristic function determines uniquely the distribution
function of the random variable, in the sense that there is a
one-to-one correspondence between F(λ) and φ(t).
The characteristic function of a N(m, Σ) is
φ(t) = e^{i⟨m,t⟩ − (1/2)⟨t,Σt⟩}.



Lemma
Let {X_1, X_2, . . . , X_n} be independent random variables with
characteristic functions φ_j(t), j = 1, . . . , n, and let Y = Σ_{j=1}^n X_j with
characteristic function φ_Y(t). Then

φ_Y(t) = Π_{j=1}^n φ_j(t).

Lemma
Let X be a random variable with characteristic function φ(t) and
assume that it has finite moments. Then
E(X^k) = (1/i^k) φ^{(k)}(0).



Types of Convergence and Limit Theorems

One of the most important aspects of the theory of random


variables is the study of limit theorems for sums of random
variables.
The most well known limit theorems in probability theory are the
law of large numbers and the central limit theorem.
There are various different types of convergence for sequences of
random variables.



Definition
Let {Z_n}_{n=1}^∞ be a sequence of random variables. We will say that
(a) Z_n converges to Z with probability one (almost surely) if
P(lim_{n→+∞} Z_n = Z) = 1.
(b) Z_n converges to Z in probability if for every ε > 0
lim_{n→+∞} P(|Z_n − Z| > ε) = 0.
(c) Z_n converges to Z in L^p if
lim_{n→+∞} E|Z_n − Z|^p = 0.
(d) Let F_n(λ), n = 1, 2, . . . , and F(λ) be the distribution functions of
Z_n, n = 1, 2, . . . , and Z, respectively. Then Z_n converges to Z in
distribution if
lim_{n→+∞} F_n(λ) = F(λ)
for all λ at which F is continuous.
Let {X_n}_{n=1}^∞ be iid random variables with EX_n = V. Then, the
strong law of large numbers states that the average of the sum of
the iid random variables converges to V with probability one:
P(lim_{N→+∞} (1/N) Σ_{n=1}^N X_n = V) = 1.

The strong law of large numbers provides us with information
about the behavior of a sum of random variables (or, a large
number of repetitions of the same experiment) on average.
We can also study fluctuations around the average behavior.
Indeed, let E(X_n − V)² = σ². Define the centered iid random
variables Y_n = X_n − V. Then, the sequence of random variables
(1/(σ√N)) Σ_{n=1}^N Y_n converges in distribution to a N(0, 1) random
variable:
lim_{N→+∞} P((1/(σ√N)) Σ_{n=1}^N Y_n ≤ a) = (1/√(2π)) ∫_{−∞}^a e^{−x²/2} dx.
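A short Monte Carlo sketch illustrating both statements, taking the X_n to be ±1 coin flips (so V = 0 and σ = 1); the sample sizes are arbitrary choices:

```python
# Illustrate the strong LLN and the CLT for iid ±1 random variables (V = 0, sigma = 1).
import numpy as np

rng = np.random.default_rng()
N, M = 10_000, 1_000                          # samples per experiment, number of experiments
X = rng.choice([-1.0, 1.0], size=(M, N))

# Strong law of large numbers: the sample averages concentrate around V = 0.
print(np.abs(X.mean(axis=1)).max())           # small (of order 1/sqrt(N))

# Central limit theorem: (1/(sigma sqrt(N))) sum_n Y_n is approximately N(0, 1).
S = X.sum(axis=1) / np.sqrt(N)
print((S <= 1.0).mean())                      # close to Phi(1) ≈ 0.8413
```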



Assume that E|X | < ∞ and let G be a sub–σ–algebra of F. The
conditional expectation of X with respect to G is defined to be
the function E[X |G] : Ω 7→ E which is G–measurable and satisfies
∫_G E[X|G] dµ = ∫_G X dµ   ∀ G ∈ G.

We can define E[f (X )|G] and the conditional probability


P[X ∈ F |G] = E[IF (X )|G], where IF is the indicator function of F , in
a similar manner.



ELEMENTS OF THE THEORY OF STOCHASTIC PROCESSES



Let T be an ordered set. A stochastic process is a collection of
random variables X = {Xt ; t ∈ T } where, for each fixed t ∈ T , Xt
is a random variable from (Ω, F) to (E , G).
The measurable space {Ω, F} is called the sample space. The
space (E , G) is called the state space .
In this course we will take the set T to be [0, +∞).
The state space E will usually be Rd equipped with the σ–algebra
of Borel sets.
A stochastic process X may be viewed as a function of both t ∈ T
and ω ∈ Ω. We will sometimes write X (t), X (t, ω) or Xt (ω) instead
of Xt . For a fixed sample point ω ∈ Ω, the function Xt (ω) : T 7→ E
is called a sample path (realization, trajectory) of the process X .



The finite dimensional distributions (fdd) of a stochastic
process are the distributions of the E k –valued random variables
(X (t1 ), X (t2 ), . . . , X (tk )) for arbitrary positive integer k and
arbitrary times ti ∈ T , i ∈ {1, . . . , k}:

F (x) = P(X (ti ) 6 xi , i = 1, . . . , k)

with x = (x1 , . . . , xk ).
We will say that two processes X_t and Y_t are equivalent if they
have the same finite dimensional distributions.
From experiments or numerical simulations we can only obtain
information about the (fdd) of a process.



Definition
A Gaussian process is a stochastic process for which E = R^d and
all the finite dimensional distributions are Gaussian,

F(x) = P(X(t_i) ≤ x_i, i = 1, . . . , k)
= (2π)^{−n/2} (det K_k)^{−1/2} exp(−(1/2)⟨K_k^{−1}(x − µ_k), x − µ_k⟩),

for some vector µ_k and a symmetric positive definite matrix K_k.

A Gaussian process x(t) is characterized by its mean
m(t) := Ex(t)
and the covariance function
C(t, s) = E[(x(t) − m(t)) ⊗ (x(s) − m(s))].

Thus, the first two moments of a Gaussian process are sufficient
for a complete characterization of the process.

Let Ω, F, P be a probability space. Let Xt , t ∈ T (with T = R or Z) be
a real-valued random process on this probability space with finite
second moment, E|Xt |2 < +∞ (i.e. Xt ∈ L2 ).

Definition
A stochastic process Xt ∈ L2 is called second-order stationary or
wide-sense stationary if the first moment EXt is a constant and the
second moment E(Xt Xs ) depends only on the difference t − s:

EXt = µ, E(Xt Xs ) = C(t − s).



The constant µ is called the expectation of the process X_t. We
will set µ = 0.
The function C(t) is called the covariance or the autocorrelation
function of the Xt .
Notice that C(t) = E(Xt X0 ), whereas C(0) = E(Xt2 ), which is
finite, by assumption.
Since we have assumed that Xt is a real valued process, we have
that C(t) = C(−t), t ∈ R.



Continuity properties of the covariance function are equivalent to
continuity properties of the paths of Xt .

Lemma
Assume that the covariance function C(t) of a second order stationary
process is continuous at t = 0. Then it is continuous for all t ∈ R.
Furthermore, the continuity of C(t) is equivalent to the continuity of the
process Xt in the L2 -sense.



Proof.
Fix t ∈ R. We calculate:

|C(t + h) − C(t)|² = |E(X_{t+h}X_0) − E(X_t X_0)|² = |E((X_{t+h} − X_t)X_0)|²
≤ E(X_0²) E(X_{t+h} − X_t)²
= C(0)(EX_{t+h}² + EX_t² − 2EX_t X_{t+h})
= 2C(0)(C(0) − C(h)) → 0,

as h → 0.
Assume now that C(t) is continuous. From the above calculation we
have
E|Xt+h − Xt |2 = 2(C(0) − C(h)), (4)
which converges to 0 as h → 0. Conversely, assume that Xt is
L2 -continuous. Then, from the above equation we get
limh→0 C(h) = C(0).



Notice that from (4) we immediately conclude that
C(0) ≥ C(h), h ∈ R.
The Fourier transform of the covariance function of a second order
stationary process always exists. This enables us to study second
order stationary processes using tools from Fourier analysis.
To make the link between second order stationary processes and
Fourier analysis we will use Bochner’s theorem, which applies to
all nonnegative definite functions.

Definition
A function f(x) : R → R is called nonnegative definite if
Σ_{i,j=1}^n f(t_i − t_j) c_i c̄_j ≥ 0   (5)
for all n ∈ N, t_1, . . . , t_n ∈ R, c_1, . . . , c_n ∈ C.



Lemma
The covariance function of a second order stationary process is a
nonnegative definite function.

Proof.
We will use the notation X_t^c := Σ_{i=1}^n X_{t_i} c_i. We have
Σ_{i,j=1}^n C(t_i − t_j) c_i c̄_j = Σ_{i,j=1}^n E(X_{t_i} X_{t_j}) c_i c̄_j
= E(Σ_{i=1}^n X_{t_i} c_i Σ_{j=1}^n X_{t_j} c̄_j) = E(X_t^c X̄_t^c)
= E|X_t^c|² ≥ 0.



Theorem
(Bochner) There is a one-to-one correspondence between the set of
continuous nonnegative definite functions and the set of finite
measures on the Borel σ-algebra of R: if ρ is a finite measure, then
C(t) = ∫_R e^{ixt} ρ(dx)   (6)

is nonnegative definite. Conversely, any nonnegative definite function
can be represented in the form (6).

Definition
Let Xt be a second order stationary process with covariance C(t)
whose Fourier transform is the measure ρ(dx). The measure ρ(dx) is
called the spectral measure of the process Xt .



In the following we will assume that the spectral measure is
absolutely continuous with respect to the Lebesgue measure on R
with density f (x), d ρ(x) = f (x)dx.
The Fourier transform f (x) of the covariance function is called the
spectral density of the process:
f(x) = (1/2π) ∫_{−∞}^∞ e^{−itx} C(t) dt.

From (6) it follows that the covariance function of a mean
zero, second order stationary process is given by the inverse
Fourier transform of the spectral density:
C(t) = ∫_{−∞}^∞ e^{itx} f(x) dx.

In most cases, the experimentally measured quantity is the


spectral density (or power spectrum) of the stochastic process.



The correlation function of a second order stationary process
enables us to associate a time scale to Xt , the correlation time
τcor :
τ_cor = (1/C(0)) ∫_0^∞ C(τ) dτ = ∫_0^∞ E(X_τ X_0)/E(X_0²) dτ.

The slower the decay of the correlation function, the larger the
correlation time is. We have to assume sufficiently fast decay of
correlations so that the correlation time is finite.



Example
Consider the mean zero, second order stationary process with
covariance function
R(t) = (D/α) e^{−α|t|}.   (7)
The spectral density of this process is:
f(x) = (1/2π)(D/α) ∫_{−∞}^{+∞} e^{−ixt} e^{−α|t|} dt
= (1/2π)(D/α) (∫_{−∞}^0 e^{−ixt} e^{αt} dt + ∫_0^{+∞} e^{−ixt} e^{−αt} dt)
= (1/2π)(D/α) (1/(ix + α) + 1/(−ix + α))
= (D/π) 1/(x² + α²).



Example (Continued)
This function is called the Cauchy or the Lorentz distribution.
The Gaussian stochastic process with covariance function (7) is
called the stationary Ornstein-Uhlenbeck process.
The correlation time is (we have that C(0) = D/α)
τ_cor = ∫_0^∞ e^{−αt} dt = α^{−1}.
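Both formulas are easy to confirm numerically; a sketch using scipy's quadrature, with arbitrary values of α and D:

```python
# Numerical check of the OU spectral density and correlation time formulas.
import numpy as np
from scipy.integrate import quad

alpha, D = 1.3, 0.7
R = lambda t: (D / alpha) * np.exp(-alpha * np.abs(t))

# f(x) = (1/2π) ∫ e^{-ixt} R(t) dt = (1/π) ∫_0^∞ cos(xt) R(t) dt since R is even.
x = 2.0
f_num, _ = quad(lambda t: np.cos(x * t) * R(t) / np.pi, 0, np.inf)
print(f_num, "vs", D / (np.pi * (x**2 + alpha**2)))

# τ_cor = (1/C(0)) ∫_0^∞ R(t) dt should equal 1/α.
I, _ = quad(R, 0, np.inf)
print(I / R(0), "vs", 1 / alpha)
```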



The OU process was introduced by Ornstein and Uhlenbeck in 1930
(G.E. Uhlenbeck, L.S. Ornstein, Phys. Rev. 36, 823 (1930)) as a model
for the velocity of a Brownian particle. It is of interest to calculate the
statistics of the position of the Brownian particle, i.e. of the integral
X(t) = ∫_0^t Y(s) ds,   (8)

where Y (t) denotes the stationary OU process.

Lemma
Let Y (t) denote the stationary OU process with covariance
function (7). Then the position process (8) is a mean zero Gaussian
process with covariance function

E(X (t)X (s)) = 2 min(t, s) + e− min(t,s) + e−max(t,s) − e−|t−s| − 1.



Second order stationary processes are ergodic: time averages equal
phase space (ensemble) averages. An example of an ergodic theorem
for a stationary processes is the following L2 (mean-square) ergodic
theorem.
Theorem
Let {X_t}_{t≥0} be a second order stationary process on a probability
space (Ω, F, P) with mean µ and covariance R(t), and assume that
R(t) ∈ L¹(0, +∞). Then
lim_{T→+∞} E |(1/T) ∫_0^T X(s) ds − µ|² = 0.   (9)



Proof.
We have
E |(1/T) ∫_0^T X(s) ds − µ|² = (1/T²) ∫_0^T ∫_0^T R(t − s) dt ds
= (2/T²) ∫_0^T ∫_0^t R(t − s) ds dt
= (2/T²) ∫_0^T (T − u)R(u) du → 0,

using the dominated convergence theorem and the assumption


R(·) ∈ L1 . In the above we used the fact that R is a symmetric
function, together with the change of variables u = t − s, v = t and an
integration over v.



Assume that µ = 0. From the above calculation we can conclude that,
under the assumption that R(·) decays sufficiently fast at infinity, for
t ≫ 1 we have that
E |∫_0^t X(s) ds|² ≈ 2Dt,

where
D = ∫_0^∞ R(t) dt
is the diffusion coefficient. Thus, one expects that at sufficiently long
times and under appropriate assumptions on the correlation function,
the time integral of a stationary process will approximate a Brownian
motion with diffusion coefficient D.
This kind of analysis was initiated in G.I. Taylor, Diffusion by
Continuous Movements, Proc. London Math. Soc. s2-20 (1922),
pp. 196-212.



Definition
A stochastic process is called (strictly) stationary if all finite
dimensional distributions are invariant under time translation: for any
integer k and times ti ∈ T , the distribution of (X (t1 ), X (t2 ), . . . , X (tk )) is
equal to that of (X (s + t1 ), X (s + t2 ), . . . , X (s + tk )) for any s such that
s + ti ∈ T for all i ∈ {1, . . . , k}. In other words,

P(Xt1 +t ∈ A1 , Xt2 +t ∈ A2 . . . Xtk +t ∈ Ak )


= P(Xt1 ∈ A1 , Xt2 ∈ A2 . . . Xtk ∈ Ak ), ∀t ∈ T .

Let Xt be a strictly stationary stochastic process with finite second


moment (i.e. Xt ∈ L2 ). The definition of strict stationarity implies
that EXt = µ, a constant, and E((Xt − µ)(Xs − µ)) = C(t − s).
Hence, a strictly stationary process with finite second moment is
also stationary in the wide sense. The converse is not true.



Remarks

1 A sequence Y0 , Y1 , . . . of independent, identically distributed


random variables is a stationary process with
R(k) = σ 2 δk 0 , σ 2 = E(Yk )2 .
2 Let Z be a single random variable with known distribution and set
Zj = Z , j = 0, 1, 2, . . . . Then the sequence Z0 , Z1 , Z2 , . . . is a
stationary sequence with R(k) = σ 2 .
3 The first two moments of a Gaussian process are sufficient for a
complete characterization of the process. A corollary of this is that
a second order stationary Gaussian process is also a (strictly)
stationary process.



The most important continuous time stochastic process is
Brownian motion. Brownian motion is a mean zero, continuous
(i.e. it has continuous sample paths: for a.e ω ∈ Ω the function Xt
is a continuous function of time) process with independent
Gaussian increments.
A process X_t has independent increments if for every sequence
t_0 < t_1 < · · · < t_n the random variables

Xt1 − Xt0 , Xt2 − Xt1 , . . . , Xtn − Xtn−1

are independent.
If, furthermore, for any t1 , t2 and Borel set B ⊂ R

P(Xt2 +s − Xt1 +s ∈ B)

is independent of s, then the process Xt has stationary


independent increments.



Definition

A one dimensional standard Brownian motion W (t) : R+ → R is a


real valued stochastic process with the following properties:
1 W (0) = 0;
2 W (t) is continuous;
3 W (t) has independent increments.
4 For every t > s ≥ 0, W(t) − W(s) has a Gaussian distribution with
mean 0 and variance t − s. That is, the density of the random
variable W(t) − W(s) is
g(x; t, s) = (2π(t − s))^{−1/2} exp(−x²/(2(t − s)));   (10)



Definition (Continued)
A d –dimensional standard Brownian motion W (t) : R+ → Rd is a
collection of d independent one dimensional Brownian motions:

W (t) = (W1 (t), . . . , Wd (t)),

where Wi (t), i = 1, . . . , d are independent one dimensional


Brownian motions. The density of the Gaussian random vector
W (t) − W (s) is thus
g(x; t, s) = (2π(t − s))^{−d/2} exp(−‖x‖²/(2(t − s))).

Brownian motion is sometimes referred to as the Wiener process .



Figure: Brownian sample paths (5 individual paths and the mean of 1000 paths).



It is possible to prove rigorously the existence of the Wiener process
(Brownian motion):
Theorem
(Wiener) There exists an almost-surely continuous process W_t with
independent increments such that W_0 = 0 and, for each t ≥ 0, the
random variable W_t is N(0, t). Furthermore, W_t is almost surely locally
Hölder continuous with exponent α for any α ∈ (0, 1/2).

Notice that Brownian paths are not differentiable.



Brownian motion is a Gaussian process. For the d –dimensional
Brownian motion, and for I the d × d dimensional identity, we have
(see (1) and (2))
EW(t) = 0   ∀ t ≥ 0
and
E[(W(t) − W(s)) ⊗ (W(t) − W(s))] = (t − s)I.   (11)

Moreover,
E[W(t) ⊗ W(s)] = min(t, s)I.   (12)



From the formula for the Gaussian density g(x, t − s), eqn. (10),
we immediately conclude that W (t) − W (s) and
W (t + u) − W (s + u) have the same pdf. Consequently, Brownian
motion has stationary increments.
Notice, however, that Brownian motion itself is not a stationary
process.
Since W (t) = W (t) − W (0), the pdf of W (t) is

g(x, t) = (1/√(2πt)) e^{−x²/(2t)}.
We can easily calculate all moments of the Brownian motion:
E(W^n(t)) = (1/√(2πt)) ∫_{−∞}^{+∞} x^n e^{−x²/(2t)} dx
= 1 · 3 · · · (n − 1) t^{n/2} if n is even, and 0 if n is odd.



We can define the OU process through the Brownian motion via a
time change.

Lemma
Let W (t) be a standard Brownian motion and consider the process

V (t) = e−t W (e2t ).

Then V (t) is a Gaussian second order stationary process with mean 0


and covariance
K (s, t) = e−|t−s| .



Definition
A (normalized) fractional Brownian motion WtH , t > 0 with Hurst
parameter H ∈ (0, 1) is a centered Gaussian process with continuous
sample paths whose covariance is given by

1 2H 
E(WtH WsH ) = s + t 2H − |t − s|2H . (13)
2
Fractional Brownian motion has the following properties.
1 When H = 1/2, W_t^{1/2} becomes the standard Brownian motion.
2 W_0^H = 0, EW_t^H = 0, E(W_t^H)² = |t|^{2H}, t ≥ 0.
3 It has stationary increments, E(W_t^H − W_s^H)² = |t − s|^{2H}.
4 It has the following self similarity property:
(W_{αt}^H, t ≥ 0) = (α^H W_t^H, t ≥ 0), α > 0,
where the equivalence is in law.



The Poisson Process

Another fundamental continuous time process is the Poisson process:

Definition
The Poisson process with intensity λ, denoted by N(t), is an
integer-valued, continuous time, stochastic process with independent
increments satisfying
P[N(t) − N(s) = k] = e^{−λ(t−s)} (λ(t − s))^k / k!,   t > s ≥ 0, k ∈ N.
Both Brownian motion and the Poisson process are
homogeneous (or time-homogeneous): the increments
between successive times s and t depend only on t − s.
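One standard way to simulate N(t) on a finite interval, not spelled out on this slide but a well-known fact, uses that the waiting times between successive jumps are independent exponential random variables with mean 1/λ; a sketch:

```python
# Sketch: simulate a Poisson process with intensity lam on [0, T] via
# independent exponential waiting times between successive jumps.
import numpy as np

rng = np.random.default_rng()
lam, T = 2.0, 10.0
jump_times = []
t = rng.exponential(1.0 / lam)
while t < T:
    jump_times.append(t)
    t += rng.exponential(1.0 / lam)

def N(t):
    """Number of jumps in (0, t]."""
    return int(np.searchsorted(jump_times, t, side="right"))

print(N(T), "jumps; E N(T) =", lam * T)
```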



A very useful result is that we can expand every centered
(EXt = 0) stochastic process with continuous covariance (and
hence, every L2 -continuous centered stochastic process) into a
random Fourier series.
Let (Ω, F, P) be a probability space and {Xt }t∈T a centered
process which is mean square continuous. Then, the
Karhunen-Loève theorem states that X_t can be expanded in the
form
X_t = Σ_{n=1}^∞ ξ_n φ_n(t),   (14)

where the ξ_n are an orthogonal sequence of random variables
with E|ξ_k|² = λ_k, where {λ_k, φ_k}_{k=1}^∞ are the eigenvalues and
eigenfunctions of the integral operator whose kernel is the
covariance of the process X_t. The convergence is in L²(P) for
every t ∈ T.
If Xt is a Gaussian process then the ξk are independent Gaussian
random variables.
Set T = [0, 1]. Let K : L²(T) → L²(T) be defined by
Kψ(t) = ∫_0^1 K(s, t)ψ(s) ds.

The kernel of this integral operator is a continuous (in both s and
t), symmetric, nonnegative function. Hence, the corresponding
integral operator has eigenvalues and eigenfunctions

Kφ_n = λ_n φ_n,   λ_n ≥ 0,   (φ_n, φ_m)_{L²} = δ_{nm},

such that the covariance operator can be expanded in a uniformly
convergent series

B(t, s) = Σ_{n=1}^∞ λ_n φ_n(t) φ_n(s).



The random variables ξ_n are defined as
ξ_n = ∫_0^1 X(t) φ_n(t) dt.

The orthogonality of the eigenfunctions of the covariance operator


implies that
E(ξn ξm ) = λn δnm .
Thus the random variables are orthogonal. When X (t) is
Gaussian, then ξk are Gaussian random variables. Furthermore,
since they are also orthogonal, they are independent Gaussian
random variables.



Example
The Karhunen-Loève Expansion for Brownian Motion. We set
T = [0, 1]. The covariance function of Brownian motion is
C(t, s) = min(t, s). The eigenvalue problem Cψ_n = λ_n ψ_n becomes
∫_0^1 min(t, s) ψ_n(s) ds = λ_n ψ_n(t).

Or,
∫_0^t s ψ_n(s) ds + t ∫_t^1 ψ_n(s) ds = λ_n ψ_n(t).
We differentiate this equation twice:
∫_t^1 ψ_n(s) ds = λ_n ψ_n′(t)   and   −ψ_n(t) = λ_n ψ_n″(t),

where primes denote differentiation with respect to t.



Example
From the above equations we immediately see that the right boundary
conditions are ψ(0) = ψ ′ (1) = 0. The eigenvalues and eigenfunctions
are
ψ_n(t) = √2 sin((2n + 1)πt/2),   λ_n = (2/((2n + 1)π))²,   n = 0, 1, . . . .

Thus, the Karhunen-Loève expansion of Brownian motion on [0, 1] is

W_t = √2 Σ_{n=0}^∞ ξ_n (2/((2n + 1)π)) sin((2n + 1)πt/2).   (15)
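A sketch of sampling approximate Brownian paths from a truncation of (15), taking the ξ_n to be iid N(0, 1) random variables (truncation level and sample size are arbitrary choices):

```python
# Sample approximate Brownian paths from a truncated Karhunen-Loève expansion.
import numpy as np

rng = np.random.default_rng()
t = np.linspace(0.0, 1.0, 500)
n_modes, n_paths = 200, 2000
omega = (2 * np.arange(n_modes) + 1) * np.pi / 2      # ω_n = (2n+1)π/2, n = 0, 1, ...
xi = rng.normal(size=(n_modes, n_paths))              # iid N(0,1) coefficients
W = np.sqrt(2.0) * (np.sin(np.outer(t, omega)) / omega) @ xi   # one path per column

print(W[-1].var(), "should be close to Var W_1 = 1")
```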



The Path Space

Let (Ω, F, µ) be a probability space, (E , ρ) a metric space and let


T = [0, ∞). Let {Xt } be a stochastic process from (Ω, F, µ) to
(E , ρ) with continuous sample paths.
The above means that for every ω ∈ Ω we have that
Xt ∈ CE := C([0, ∞); E ).
The space of continuous functions CE is called the path space of
the stochastic process.
We can put a metric on C_E as follows:
ρ_{C_E}(X^1, X^2) := Σ_{n=1}^∞ 2^{−n} max_{0≤t≤n} min(ρ(X_t^1, X_t^2), 1).

We can then define the Borel sets on CE , using the topology


induced by this metric, and {Xt } can be thought of as a random
variable on (Ω, F, µ) with state space (CE , B(CE )).
The probability measure PXt−1 on (CE , B(CE )) is called the law of
{Xt }.
The law of a stochastic process is a probability measure on its
path space.
Example
The space of continuous functions CE is the path space of Brownian
motion (the Wiener process). The law of Brownian motion, that is the
measure that it induces on C([0, ∞), Rd ), is known as the Wiener
measure.



MARKOV STOCHASTIC PROCESSES



Let (Ω, F) be a measurable space and T an ordered set. Let
X = Xt (ω) be a stochastic process from the sample space (Ω, F)
to the state space (E , G). It is a function of two variables, t ∈ T
and ω ∈ Ω.
For a fixed ω ∈ Ω the function Xt (ω), t ∈ T is the sample path of
the process X associated with ω.
Let K be a collection of subsets of Ω. The smallest σ–algebra on
Ω which contains K is denoted by σ(K) and is called the
σ–algebra generated by K.
Let Xt : Ω 7→ E , t ∈ T . The smallest σ–algebra σ(Xt , t ∈ T ), such
that the family of mappings {Xt , t ∈ T } is a stochastic process
with sample space (Ω, σ(Xt , t ∈ T )) and state space (E , G), is
called the σ–algebra generated by {Xt , t ∈ T }.



A filtration on (Ω, F) is a nondecreasing family {Ft , t ∈ T } of
sub–σ–algebras of F: Fs ⊆ Ft ⊆ F for s 6 t.
We set F∞ = σ(∪t∈T Ft ). The filtration generated by Xt , where
Xt is a stochastic process, is

FtX := σ (Xs ; s 6 t) .

We say that a stochastic process {Xt ; t ∈ T } is adapted to the


filtration {Ft } := {Ft , t ∈ T } if for all t ∈ T , Xt is an Ft –measurable
random variable.

Definition
Let {Xt } be a stochastic process defined on a probability space
(Ω, F, µ) with values in E and let FtX be the filtration generated by
{Xt }. Then {Xt } is a Markov process if

P(Xt ∈ Γ|FsX ) = P(Xt ∈ Γ|Xs ) (16)

for all t, s ∈ T with t > s, and Γ ∈ B(E ).


The filtration FtX is generated by events of the form
{ω|Xs1 ∈ B1 , Xs2 ∈ B2 , . . . Xsn ∈ Bn , } with
0 6 s1 < s2 < · · · < sn 6 s and Bi ∈ B(E ). The definition of a
Markov process is thus equivalent to the hierarchy of equations

P(Xt ∈ Γ|Xt1 , Xt2 , . . . Xtn ) = P(Xt ∈ Γ|Xtn ) a.s.

for n > 1, 0 6 t1 < t2 < · · · < tn 6 t and Γ ∈ B(E ).



Roughly speaking, the statistics of Xt for t > s are completely
determined once Xs is known; information about Xt for t < s is
superfluous. In other words: a Markov process has no memory.
More precisely: when a Markov process is conditioned on the
present state, then there is no memory of the past. The past and
future of a Markov process are statistically independent when the
present is known.
A typical example of a Markov process is the random walk: in
order to find the position x(t + 1) of the random walker at time
t + 1 it is enough to know its position x(t) at time t: how it got to
x(t) is irrelevant.
A non-Markovian process X_t can be described through a
Markovian one Y_t by enlarging the state space: the additional
variables that we introduce account for the memory in X_t. This
"Markovianization" trick is very useful since there are many more
tools for analyzing Markovian processes.



With a Markov process {Xt } we can associate a function
P : T × T × E × B(E ) → R+ defined through the relation
P[X_t ∈ Γ|F_s^X] = P(s, t, X_s, Γ),

for all t, s ∈ T with t > s and all Γ ∈ B(E ).


 
Assume that X_s = x. Since P[X_t ∈ Γ|F_s^X] = P[X_t ∈ Γ|X_s] we can
write
P(Γ, t|x, s) = P[X_t ∈ Γ|X_s = x].
The transition function P(Γ, t|x, s) is (for fixed t, x, s) a
probability measure on E with P(E, t|x, s) = 1; it is
B(E)–measurable in x (for fixed t, s, Γ) and satisfies the
Chapman–Kolmogorov equation
P(Γ, t|x, s) = ∫_E P(Γ, t|y, u) P(dy, u|x, s)   (17)
for all x ∈ E, Γ ∈ B(E) and s, u, t ∈ T with s ≤ u ≤ t.


The derivation of the Chapman-Kolmogorov equation is based on
the assumption of Markovianity and on properties of the
conditional probability:
1 Let (Ω, F , µ) be a probability space, X a random variable from
(Ω, F , µ) to (E, G) and let F1 ⊂ F2 ⊂ F. Then

E(E(X |F2 )|F1 ) = E(E(X |F1 )|F2 ) = E(X |F1 ). (18)

2 Given G ⊂ F we define the function PX (B|G) = P(X ∈ B|G) for


B ∈ F. Assume that f is such that E(f (X )) < ∞. Then
E(f(X)|G) = ∫_R f(x) P_X(dx|G).   (19)



Now we use the Markov property, together with equations (18)
and (19) and the fact that s < u ⇒ FsX ⊂ FuX to calculate:

P(Γ, t|x, s) := P(Xt ∈ Γ|Xs = x) = P(Xt ∈ Γ|FsX )


= E(IΓ (Xt )|FsX ) = E(E(IΓ (Xt )|FsX )|FuX )
= E(E(IΓ (Xt )|FuX )|FsX ) = E(P(Xt ∈ Γ|Xu )|FsX )
= E(P(Xt ∈ Γ|Xu = y)|Xs = x)
Z
= P(Γ, t|Xu = y)P(dz, u|Xs = x)
Z R

=: P(Γ, t|y, u)P(dy, u|x, s).


R

IΓ (·) denotes the indicator function of the set Γ. We have also set
E = R.



The CK equation is an integral equation and is the fundamental
equation in the theory of Markov processes. Under additional
assumptions we will derive from it the Fokker-Planck PDE, which
is the fundamental equation in the theory of diffusion processes,
and will be the main object of study in this course.
A Markov process is homogeneous if

P(t, Γ|Xs = x) := P(s, t, x, Γ) = P(0, t − s, x, Γ).

We set P(0, t, ·, ·) = P(t, ·, ·). The Chapman–Kolmogorov (CK)
equation becomes
P(t + s, x, Γ) = ∫_E P(s, x, dz) P(t, z, Γ).   (20)



Let Xt be a homogeneous Markov process and assume that the
initial distribution of Xt is given by the probability measure
ν(Γ) = P(X0 ∈ Γ) (for deterministic initial conditions–X0 = x– we
have that ν(Γ) = IΓ (x) ).
The transition function P(x, t, Γ) and the initial distribution ν
determine the finite dimensional distributions of X by

P(X0 ∈ Γ1 , X (t1 ) ∈ Γ1 , . . . , Xtn ∈ Γn )


Z Z Z
= ... P(tn − tn−1 , yn−1 , Γn )P(tn−1 − tn−2 , yn−2 , dyn−1 )
Γ0 Γ1 Γn−1
· · · × P(t1 , y0 , dy1 )ν(dy0 ). (21)

Theorem
(Ethier and Kurtz 1986, Sec. 4.1) Let P(t, x, Γ) satisfy (20) and
assume that (E , ρ) is a complete separable metric space. Then there
exists a Markov process X in E whose finite-dimensional distributions
are uniquely determined by (21).
Let Xt be a homogeneous Markov process with initial distribution
ν(Γ) = P(X0 ∈ Γ) and transition function P(x, t, Γ). We can calculate
the probability of finding Xt in a set Γ at time t:
P(X_t ∈ Γ) = ∫_E P(x, t, Γ) ν(dx).

Thus, the initial distribution and the transition function are sufficient to
characterize a homogeneous Markov process. Notice that they do not
provide us with any information about the actual paths of the Markov
process.



The transition probability P(Γ, t|x, s) is a probability measure.
Assume that it has a density for all t > s:
P(Γ, t|x, s) = ∫_Γ p(y, t|x, s) dy.

Clearly, for t = s we have P(Γ, s|x, s) = IΓ (x).


The Chapman-Kolmogorov equation becomes:
∫_Γ p(y, t|x, s) dy = ∫_Γ ∫_R p(y, t|z, u) p(z, u|x, s) dz dy,

and, since Γ ∈ B(R) is arbitrary, we obtain the equation

p(y, t|x, s) = ∫_R p(y, t|z, u) p(z, u|x, s) dz.   (22)

The transition probability density is a function of 4 arguments: the


initial position and time x, s and the final position and time y, t.
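As a concrete sanity check, (22) can be verified numerically for the Brownian transition density p(y, t|x, s) = (2π(t − s))^{−1/2} exp(−(y − x)²/(2(t − s))); a sketch using scipy:

```python
# Numerical check of the Chapman-Kolmogorov equation (22) for Brownian motion.
import numpy as np
from scipy.integrate import quad

def p(y, t, x, s):
    """Brownian transition density from (x, s) to (y, t), t > s."""
    return np.exp(-(y - x)**2 / (2 * (t - s))) / np.sqrt(2 * np.pi * (t - s))

x, s, u, t, y = 0.3, 0.0, 0.7, 1.5, -0.4              # arbitrary points with s < u < t
lhs = p(y, t, x, s)
rhs, _ = quad(lambda z: p(y, t, z, u) * p(z, u, x, s), -np.inf, np.inf)
print(lhs, "vs", rhs)                                  # the two values should agree
```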



In words, the CK equation tells us that, for a Markov process, the
transition from x, s to y, t can be done in two steps: first the system
moves from x to z at some intermediate time u. Then it moves from z
to y at time t. In order to calculate the probability for the transition from
(x, s) to (y, t) we need to sum (integrate) the transitions from all
possible intermediary states z. The above description suggests that a
Markov process can be described through a semigroup of operators,
i.e. a one-parameter family of linear operators with the properties

P0 = I, Pt+s = Pt ◦ Ps ∀ t, s > 0.

A semigroup of operators is characterized through its generator.



Indeed, let P(t, x, dy) be the transition function of a homogeneous
Markov process. It satisfies the CK equation (20):
P(t + s, x, Γ) = ∫_E P(s, x, dz) P(t, z, Γ).

Let X := Cb (E ) and define the operator


(P_t f)(x) := E(f(X_t)|X_0 = x) = ∫_E f(y) P(t, x, dy).

This is a linear operator with

(P0 f )(x) = E(f (X0 )|X0 = x) = f (x) ⇒ P0 = I.



Furthermore:
(P_{t+s} f)(x) = ∫ f(y) P(t + s, x, dy)
= ∫ ∫ f(y) P(s, z, dy) P(t, x, dz)
= ∫ (∫ f(y) P(s, z, dy)) P(t, x, dz)
= ∫ (P_s f)(z) P(t, x, dz)
= (P_t ◦ P_s f)(x).

Consequently:
P_{t+s} = P_t ◦ P_s.



Let (E , ρ) be a metric space and let {Xt } be an E -valued
homogeneous Markov process. Define the one parameter family
of operators Pt through
P_t f(x) = ∫ f(y) P(t, x, dy) = E[f(X_t)|X_0 = x]

for all f (x) ∈ Cb (E ) (continuous bounded functions on E ).


Assume for simplicity that Pt : Cb (E ) → Cb (E ). Then the
one-parameter family of operators Pt forms a semigroup of
operators on Cb (E ).
We define by D(L) the set of all f ∈ Cb (E ) such that the strong
limit
Lf = lim_{t→0} (P_t f − f)/t,
exists.



Definition
The operator L : D(L) → Cb (E ) is called the infinitesimal generator
of the operator semigroup Pt .

Definition
The operator L : Cb (E ) → Cb (E ) defined above is called the generator
of the Markov process {Xt }.

The study of operator semigroups started in the late 40’s


independently by Hille and Yosida. Semigroup theory was
developed in the 50’s and 60’s by Feller, Dynkin and others,
mostly in connection to the theory of Markov processes.
Necessary and sufficient conditions for an operator L to be the
generator of a (contraction) semigroup are given by the
Hille-Yosida theorem (e.g. Evans Partial Differential Equations,
AMS 1998, Ch. 7).



The semigroup property and the definition of the generator of a
semigroup imply that, formally at least, we can write:

Pt = exp(Lt).

Consider the function u(x, t) := (P_t f)(x). We calculate its time
derivative:
∂u/∂t = d/dt (P_t f) = d/dt (e^{Lt} f)
= L(e^{Lt} f) = LP_t f = Lu.

Furthermore, u(x, 0) = P_0 f(x) = f(x). Consequently, u(x, t)
satisfies the initial value problem

∂u/∂t = Lu,   u(x, 0) = f(x).   (23)



When the semigroup Pt is the transition semigroup of a Markov
process Xt , then equation (23) is called the backward
Kolmogorov equation. It governs the evolution of an observable

u(x, t) = E(f (Xt )|X0 = x).

Thus, given the generator of a Markov process L, we can


calculate all the statistics of our process by solving the backward
Kolmogorov equation.
In the case where the Markov process is the solution of a
stochastic differential equation, then the generator is a second
order elliptic operator and the backward Kolmogorov equation
becomes an initial value problem for a parabolic PDE.



The space Cb (E ) is natural in a probabilistic context, but other
Banach spaces often arise in applications; in particular when
there is a measure µ on E , the spaces Lp (E ; µ) sometimes arise.
We will quite often use the space L²(E; µ), where µ is the
invariant measure of our Markov process.
The generator is frequently taken as the starting point for the
definition of a homogeneous Markov process.
Conversely, let P_t be a contraction semigroup (let X be a
Banach space and T : X → X a bounded operator; then T is a
contraction provided that ‖Tf‖_X ≤ ‖f‖_X ∀ f ∈ X), with
D(P_t) ⊂ C_b(E), closed. Then, under mild technical hypotheses,
there is an E–valued homogeneous Markov process {X_t}
associated with P_t defined through

E[f(X(t))|F_s^X] = P_{t−s} f(X(s))

for all t, s ∈ T with t > s and f ∈ D(Pt ).



Example
The one dimensional Brownian motion is a homogeneous Markov
process. The transition function is the Gaussian defined in the
example in Lecture 2:
 
P(t, x, dy) = γ_{t,x}(y) dy,   γ_{t,x}(y) = (1/√(2πt)) exp(−|x − y|²/(2t)).

The semigroup associated to the standard Brownian motion is the heat
semigroup P_t = e^{(t/2) d²/dx²}. The generator of this Markov process is (1/2) d²/dx².

Notice that the transition probability density γ_{t,x} of the one
dimensional Brownian motion is the fundamental solution (Green’s
function) of the heat (diffusion) PDE

∂γ/∂t = (1/2) ∂²γ/∂x².



Example
The Ornstein-Uhlenbeck process Vt = e−t W (e2t ) is a homogeneous
Markov process. The transition probability density is the Gaussian

p(y, t|x, s) := p(Vt = y|Vs = x)


!
1 |y − xe−(t−s) |2
= p exp − .
2π(1 − e−2(t−s) ) 2(1 − e−2(t−s) )

The semigroup associated to the Ornstein-Uhlenbeck process is
P_t = e^{t(−x d/dx + d²/dx²)}. The generator of this Markov process is
L = −x d/dx + d²/dx².

Notice that the transition probability density of the 1d
OU process is the fundamental solution (Green’s function) of the
Fokker-Planck PDE
∂p/∂t = ∂(xp)/∂x + ∂²p/∂x².
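A short symbolic sanity check of this statement (a sketch using sympy, written in the forward variable y, with t standing for t − s and x the initial point):

```python
# Check symbolically that the OU transition density above satisfies
# dp/dt = d(y p)/dy + d^2 p/dy^2.
import sympy as sp

y, x = sp.symbols("y x", real=True)
t = sp.symbols("t", positive=True)
var = 1 - sp.exp(-2 * t)                                # variance 1 - e^{-2t}
p = sp.exp(-(y - x * sp.exp(-t))**2 / (2 * var)) / sp.sqrt(2 * sp.pi * var)

residual = sp.diff(p, t) - sp.diff(y * p, y) - sp.diff(p, y, 2)
print(sp.simplify(residual / p))                        # should print 0
```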
The semigroup Pt acts on bounded measurable functions.
We can also define the adjoint semigroup Pt∗ which acts on
probability measures:
P_t^* µ(Γ) = ∫_R P(X_t ∈ Γ|X_0 = x) dµ(x) = ∫_R P(t, x, Γ) dµ(x).

The image of a probability measure µ under Pt∗ is again a


probability measure. The operators Pt and Pt∗ are adjoint in the
L²-sense:
∫_R P_t f(x) dµ(x) = ∫_R f(x) d(P_t^*µ)(x).   (24)



We can, formally at least, write

Pt∗ = exp(L∗ t),

where L∗ is the L2 -adjoint of the generator of the process:


∫ Lf h dx = ∫ f L^*h dx.

Let µt := Pt∗ µ. This is the law of the Markov process and µ is the
initial distribution. An argument similar to the one used in the
derivation of the backward Kolmogorov equation (23) enables us
to obtain an equation for the evolution of µt :

∂µ_t/∂t = L^*µ_t,   µ_0 = µ.



Assuming that µt = ρ(y, t) dy, µ = ρ0 (y) dy this equation
becomes:
∂ρ/∂t = L^*ρ,   ρ(y, 0) = ρ_0(y).   (25)
This is the forward Kolmogorov or Fokker-Planck equation.
When the initial conditions are deterministic, X0 = x, the initial
condition becomes ρ0 = δ(y − x).
Given the initial distribution and the generator of the Markov
process Xt , we can calculate the transition probability density by
solving the Forward Kolmogorov equation. We can then calculate
all statistical quantities of this process through the formula
E(f(X_t)|X_0 = x) = ∫ f(y) ρ(t, y; x) dy.

We will derive rigorously the backward and forward Kolmogorov


equations for Markov processes that are defined as solutions of
stochastic differential equations later on.



We can study the evolution of a Markov process in two different
ways:
Either through the evolution of observables
(Heisenberg/Koopman)
∂(P_t f)/∂t = L(P_t f),
or through the evolution of states
(Schrödinger/Frobenius-Perron)
∂(P_t^*µ)/∂t = L^*(P_t^*µ).
We can also study Markov processes at the level of trajectories.
We will do this after we define the concept of a stochastic
differential equation.



A very important concept in the study of limit theorems for
stochastic processes is that of ergodicity.
This concept, in the context of Markov processes, provides us with
information on the long–time behavior of a Markov semigroup.

Definition
A Markov process is called ergodic if the equation

Pt g = g, g ∈ Cb (E ) ∀t > 0

has only constant solutions.

Roughly speaking, ergodicity corresponds to the case where the


semigroup Pt is such that Pt − I has only constants in its null
space, or, equivalently, to the case where the generator L has
only constants in its null space. This follows from the definition of
the generator of a Markov process.



Under some additional compactness assumptions, an ergodic
Markov process has an invariant measure µ with the property that,
in the case T = R+ ,
lim_{t→+∞} (1/t) ∫_0^t g(X_s) ds = Eg(x),

where E denotes the expectation with respect to µ.


This is a physicist’s definition of an ergodic process: time
averages equal phase space averages.
Using the adjoint semigroup we can define an invariant measure
as the solution of the equation

Pt∗ µ = µ.

If this measure is unique, then the Markov process is ergodic.



Using this, we can obtain an equation for the invariant measure in
terms of the adjoint of the generator L∗ , which is the generator of
the semigroup Pt∗ . Indeed, from the definition of the generator of a
semigroup and the definition of an invariant measure, we conclude
that a measure µ is invariant if and only if

L∗ µ = 0

in some appropriate generalized sense ((L∗ µ, f ) = 0 for every


bounded measurable function).
Assume that µ(dx) = ρ(x) dx. Then the invariant density
satisfies the stationary Fokker-Planck equation

L∗ ρ = 0.

The invariant measure (distribution) governs the long-time


dynamics of the Markov process.



If X0 is distributed according to µ, then so is Xt for all t > 0. The
resulting stochastic process, with X0 distributed in this way, is
stationary .
In this case the transition probability density (the solution of the
Fokker-Planck equation) is independent of time: ρ(x, t) = ρ(x).
Consequently, the statistics of the Markov process is independent
of time.



Example
The one dimensional Brownian motion is not an ergodic process: the
null space of the generator L = (1/2) d²/dx² on R is not one dimensional!

Example
Consider a one-dimensional Brownian motion on [0, 1], with periodic
boundary conditions. The generator of this Markov process L is the
d2
differential operator L = 21 dx 2 , equipped with periodic boundary
conditions on [0, 1]. This operator is self-adjoint. The null space of both
L and L∗ comprises constant functions on [0, 1]. Both the backward
Kolmogorov and the Fokker-Planck equation reduce to the heat
equation
∂ρ/∂t = (1/2) ∂²ρ/∂x²
with periodic boundary conditions in [0, 1]. Fourier analysis shows that
the solution converges to a constant at an exponential rate.

Pavliotis (IC) StochProc January 16, 2011 110 / 367


Example
The one dimensional Ornstein-Uhlenbeck (OU) process is a
Markov process with generator

L = −αx d/dx + D d²/dx².
The null space of L comprises constants in x. Hence, it is an
ergodic Markov process. In order to calculate the invariant
measure we need to solve the stationary Fokker–Planck equation:

L*ρ = 0, ρ > 0, ‖ρ‖_{L¹(R)} = 1. (26)

Pavliotis (IC) StochProc January 16, 2011 111 / 367


Example (Continued)
Let us calculate the L2 -adjoint of L. Assuming that f , h decay
sufficiently fast at infinity, we have:
∫_R (Lf) h dx = ∫_R [ (−αx ∂_x f) h + (D ∂_x² f) h ] dx
              = ∫_R [ f ∂_x(αx h) + f (D ∂_x² h) ] dx =: ∫_R f L*h dx,

where
L*h := d/dx (αx h) + D d²h/dx².
dx dx
We can calculate the invariant distribution by solving
equation (26).
The invariant measure of this process is the Gaussian measure
µ(dx) = √(α/(2πD)) exp( −αx²/(2D) ) dx.

If the initial condition of the OU process is distributed according to the invariant measure µ, then the OU process is stationary.

Pavliotis (IC) StochProc January 16, 2011 112 / 367
Let Xt be the 1d OU process and let X0 ∼ N (0, D/α). Then Xt is a
mean zero, Gaussian second order stationary process on [0, ∞)
with correlation function
R(t) = (D/α) e^{−α|t|}
and spectral density

f(x) = (D/π) · 1/(x² + α²).

Furthermore, the OU process is the only real-valued mean zero


Gaussian second-order stationary Markov process defined on R.
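A short numerical check (not in the original notes) of the stated covariance R(t) = (D/α) e^{−α|t|}: the sketch below integrates one stationary OU path with Euler–Maruyama and compares empirical lagged averages with the exact formula. The estimation method and all parameters are illustrative assumptions.

```python
import numpy as np

# Sketch: empirical autocovariance of a stationary OU path vs. R(t) = (D/alpha) e^{-alpha |t|}.
rng = np.random.default_rng(1)
alpha, D = 1.0, 0.5
dt, n_steps = 1e-2, 500_000

x = np.empty(n_steps)
x[0] = np.sqrt(D / alpha) * rng.standard_normal()   # X_0 ~ N(0, D/alpha): stationary start
for k in range(1, n_steps):
    x[k] = x[k-1] - alpha * x[k-1] * dt + np.sqrt(2 * D * dt) * rng.standard_normal()

for lag_time in (0.0, 0.5, 1.0, 2.0):
    lag = int(round(lag_time / dt))
    empirical = np.mean(x[:n_steps - lag] * x[lag:])
    exact = (D / alpha) * np.exp(-alpha * lag_time)
    print(lag_time, empirical, exact)
```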

Pavliotis (IC) StochProc January 16, 2011 113 / 367


DIFFUSION PROCESSES

Pavliotis (IC) StochProc January 16, 2011 114 / 367


A Markov process consists of three parts: a drift (deterministic), a
random process and a jump process.
A diffusion process is a Markov process that has continuous
sample paths (trajectories). Thus, it is a Markov process with no
jumps.
A diffusion process can be defined by specifying its first two
moments:

Pavliotis (IC) StochProc January 16, 2011 115 / 367


Definition
A Markov process Xt with transition probability P(Γ, t|x, s) is called a
diffusion process if the following conditions are satisfied.
1 (Continuity). For every x and every ε > 0
∫_{|x−y|>ε} P(dy, t|x, s) = o(t − s) (27)

uniformly over s < t.


2 (Definition of drift coefficient). There exists a function a(x, s) such
that for every x and every ε > 0
∫_{|y−x|≤ε} (y − x) P(dy, t|x, s) = a(x, s)(t − s) + o(t − s), (28)

uniformly over s < t.

Pavliotis (IC) StochProc January 16, 2011 116 / 367


Definition (Continued)
3. (Definition of diffusion coefficient). There exists a function b(x, s)
such that for every x and every ε > 0
∫_{|y−x|≤ε} (y − x)² P(dy, t|x, s) = b(x, s)(t − s) + o(t − s), (29)

uniformly over s < t.

Pavliotis (IC) StochProc January 16, 2011 117 / 367


In Definition 39 we had to truncate the domain of integration since we
didn’t know whether the first and second moments exist. If we assume
that there exists a δ > 0 such that
lim_{t→s} 1/(t − s) ∫_{R^d} |y − x|^{2+δ} P(s, x, t, dy) = 0, (30)

then we can extend the integration over the whole Rd and use
expectations in the definition of the drift and the diffusion coefficient.
Indeed, let k = 0, 1, 2 and notice that

∫_{|y−x|>ε} |y − x|^k P(s, x, t, dy)
  = ∫_{|y−x|>ε} |y − x|^{2+δ} |y − x|^{k−(2+δ)} P(s, x, t, dy)
  ≤ (1/ε^{2+δ−k}) ∫_{|y−x|>ε} |y − x|^{2+δ} P(s, x, t, dy)
  ≤ (1/ε^{2+δ−k}) ∫_{R^d} |y − x|^{2+δ} P(s, x, t, dy).

Pavliotis (IC) StochProc January 16, 2011 118 / 367


Using this estimate together with (30) we conclude that:
lim_{t→s} 1/(t − s) ∫_{|y−x|>ε} |y − x|^k P(s, x, t, dy) = 0, k = 0, 1, 2.

This implies that assumption (30) is sufficient for the sample paths to
be continuous (k = 0) and for the replacement of the truncated
integrals in (28) and (29) by integrals over Rd (k = 1 and k = 2,
respectively). The definitions of the drift and diffusion coefficients
become:

lim_{t→s} E[ (X_t − X_s)/(t − s) | X_s = x ] = a(x, s) (31)

and

lim_{t→s} E[ (X_t − X_s) ⊗ (X_t − X_s)/(t − s) | X_s = x ] = b(x, s). (32)
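As an aside (not in the original notes), formulas (31)–(32) can be checked numerically: simulate many short trajectories started at a fixed point x and estimate the conditional moments. The OU model and all parameters below are assumptions chosen for illustration.

```python
import numpy as np

# Sketch: estimate a(x) and b(x) from the short-time conditional moments (31)-(32)
# for the OU process dX = -alpha*X dt + sqrt(2D) dW, so a(x) = -alpha*x and b(x) = 2D.
rng = np.random.default_rng(2)
alpha, D = 1.0, 0.5
x0, h, n_samples, n_sub = 1.5, 1e-2, 200_000, 10
dt = h / n_sub

x = np.full(n_samples, x0)
for _ in range(n_sub):                     # integrate each sample over [0, h]
    x += -alpha * x * dt + np.sqrt(2 * D * dt) * rng.standard_normal(n_samples)

increments = x - x0
print(increments.mean() / h, -alpha * x0)  # drift estimate vs a(x0)
print((increments**2).mean() / h, 2 * D)   # diffusion estimate vs b(x0)
```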

Pavliotis (IC) StochProc January 16, 2011 119 / 367


Notice also that the continuity condition can be written in the form

P (|Xt − Xs | > ε|Xs = x) = o(t − s).

Now it becomes clear that this condition implies that the probability of
large changes in Xt over short time intervals is small. Notice, on the
other hand, that the above condition implies that the sample paths of a
diffusion process are not differentiable: if they were, then the right
hand side of the above equation would have to be 0 when t − s ≪ 1.
The sample paths of a diffusion process have the regularity of
Brownian paths. A Markovian process cannot be differentiable: we
can define the derivative of a sample path only for processes for
which the past and future are not statistically independent when
conditioned on the present.

Pavliotis (IC) StochProc January 16, 2011 120 / 367


Let us denote the expectation conditioned on Xs = x by Es,x . Notice
that the definitions of the drift and diffusion coefficients (31) and (32)
can be written in the form

Es,x (Xt − Xs ) = a(x, s)(t − s) + o(t − s).

and
 
Es,x (Xt − Xs ) ⊗ (Xt − Xs ) = b(x, s)(t − s) + o(t − s).

Consequently, the drift coefficient defines the mean velocity vector


for the stochastic process Xt , whereas the diffusion coefficient (tensor)
is a measure of the local magnitude of fluctuations of Xt − Xs about the
mean value. Hence, we can write locally:

X_t − X_s ≈ a(s, X_s)(t − s) + σ(s, X_s) ξ_t,

where b = σσ^T and ξ_t is a mean zero Gaussian process with

E^{s,x}(ξ_t ⊗ ξ_t) = (t − s) I.

Pavliotis (IC) StochProc January 16, 2011 121 / 367


Since we have that

Wt − Ws ∼ N (0, (t − s)I),

we conclude that we can write locally:

∆Xt ≈ a(s, Xs )∆t + σ(s, Xs )∆Wt .

Or, replacing the differences by differentials:

dXt = a(t, Xt )dt + σ(t, Xt )dWt .

Hence, the sample paths of a diffusion process are governed by a


stochastic differential equation (SDE).
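The local relation ∆X_t ≈ a(t, X_t)∆t + σ(t, X_t)∆W_t is exactly the Euler–Maruyama discretisation of the SDE. The sketch below (not part of the notes; drift, diffusion and parameters are illustrative assumptions) shows a minimal implementation.

```python
import numpy as np

# Sketch: Euler-Maruyama, i.e. Delta X ≈ a(t, X) Delta t + sigma(t, X) Delta W.
def euler_maruyama(a, sigma, x0, dt, n_steps, rng):
    """Generate one sample path of dX = a(t, X) dt + sigma(t, X) dW."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    t = 0.0
    for k in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal()
        x[k + 1] = x[k] + a(t, x[k]) * dt + sigma(t, x[k]) * dW
        t += dt
    return x

rng = np.random.default_rng(3)
path = euler_maruyama(lambda t, x: -x, lambda t, x: 1.0,
                      x0=2.0, dt=1e-3, n_steps=5000, rng=rng)
print(path[0], path[-1])
```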

Pavliotis (IC) StochProc January 16, 2011 122 / 367


Theorem
(Kolmogorov) Let f (x) ∈ Cb (R) and let
u(x, s) := E( f(X_t) | X_s = x ) = ∫ f(y) P(dy, t|x, s) ∈ C_b²(R).

Assume furthermore that the functions a(x, s), b(x, s) are continuous
in both x and s. Then u(x, s) ∈ C 2,1 (R × R+ ) and it solves the final
value problem

−∂u/∂s = a(x, s) ∂u/∂x + (1/2) b(x, s) ∂²u/∂x², lim_{s→t} u(x, s) = f(x). (33)

Pavliotis (IC) StochProc January 16, 2011 123 / 367


Proof:
First we notice that the continuity assumption (27), together with the
fact that the function f(x) is bounded, implies that

u(x, s) = ∫_R f(y) P(dy, t|x, s)
        = ∫_{|y−x|≤ε} f(y) P(dy, t|x, s) + ∫_{|y−x|>ε} f(y) P(dy, t|x, s)
        ≤ ∫_{|y−x|≤ε} f(y) P(dy, t|x, s) + ‖f‖_{L∞} ∫_{|y−x|>ε} P(dy, t|x, s)
        = ∫_{|y−x|≤ε} f(y) P(dy, t|x, s) + o(t − s).

Pavliotis (IC) StochProc January 16, 2011 124 / 367


We add and subtract the final condition f (x) and use the previous
calculation to obtain:
u(x, s) = ∫_R f(y) P(dy, t|x, s) = f(x) + ∫_R (f(y) − f(x)) P(dy, t|x, s)
        = f(x) + ∫_{|y−x|≤ε} (f(y) − f(x)) P(dy, t|x, s)
               + ∫_{|y−x|>ε} (f(y) − f(x)) P(dy, t|x, s)
        = f(x) + ∫_{|y−x|≤ε} (f(y) − f(x)) P(dy, t|x, s) + o(t − s).

Now the final condition follows from the fact that f (x) ∈ Cb (R) and the
arbitrariness of ε.

Pavliotis (IC) StochProc January 16, 2011 125 / 367


Now we show that u(s, x) solves the backward Kolmogorov equation.
We use the Chapman-Kolmogorov equation (17) to obtain
u(x, σ) = ∫_R f(z) P(dz, t|x, σ) (34)
        = ∫_R ∫_R f(z) P(dz, t|y, ρ) P(dy, ρ|x, σ)
        = ∫_R u(y, ρ) P(dy, ρ|x, σ). (35)

The Taylor series expansion of the function u(x, ρ) in x gives

u(z, ρ) − u(x, ρ) = ∂u(x, ρ)/∂x (z − x) + (1/2) ∂²u(x, ρ)/∂x² (z − x)² (1 + α_ε), |z − x| ≤ ε, (36)

where

α_ε = sup_{ρ, |z−x|≤ε} | ∂²u(x, ρ)/∂x² − ∂²u(z, ρ)/∂x² |.

Notice that, since u(x, s) is twice continuously differentiable in x,
lim_{ε→0} α_ε = 0.
Pavliotis (IC) StochProc January 16, 2011 126 / 367
We combine now (35) with (36) to calculate

( u(x, s) − u(x, s + h) ) / h
  = (1/h) [ ∫_R P(dy, s + h|x, s) u(y, s + h) − u(x, s + h) ]
  = (1/h) ∫_R P(dy, s + h|x, s) ( u(y, s + h) − u(x, s + h) )
  = (1/h) ∫_{|x−y|<ε} P(dy, s + h|x, s) ( u(y, s + h) − u(x, s + h) ) + o(1)
  = ∂u/∂x (x, s + h) (1/h) ∫_{|x−y|<ε} (y − x) P(dy, s + h|x, s)
    + (1/2) ∂²u/∂x² (x, s + h) (1/h) ∫_{|x−y|<ε} (y − x)² P(dy, s + h|x, s) (1 + α_ε) + o(1)
  = a(x, s) ∂u/∂x (x, s + h) + (1/2) b(x, s) ∂²u/∂x² (x, s + h) (1 + α_ε) + o(1).

Equation (33) follows by taking the limits ε → 0, h → 0.
Pavliotis (IC) StochProc January 16, 2011 127 / 367
Assume now that the transition function has a density p(y, t|x, s). In
this case the formula for u(x, s) becomes
u(x, s) = ∫_R f(y) p(y, t|x, s) dy.

Substituting this in the backward Kolmogorov equation we obtain

∫_R f(y) [ ∂p(y, t|x, s)/∂s + A_{s,x} p(y, t|x, s) ] dy = 0, (37)

where

A_{s,x} := a(x, s) ∂/∂x + (1/2) b(x, s) ∂²/∂x².

Since (37) is valid for arbitrary functions f(y), we obtain a partial
differential equation for the transition probability density:

−∂p(y, t|x, s)/∂s = a(x, s) ∂p(y, t|x, s)/∂x + (1/2) b(x, s) ∂²p(y, t|x, s)/∂x². (38)
Notice that the variation is with respect to the "backward" variables
x, s. We can also obtain an equation with respect to the "forward"
variables y, t, the Forward Kolmogorov equation .
Pavliotis (IC) StochProc January 16, 2011 128 / 367
Now we derive the forward Kolmogorov equation or Fokker-Planck
equation. We assume that the transition function has a density with
respect to Lebesgue measure.
P(Γ, t|x, s) = ∫_Γ p(y, t|x, s) dy.

Theorem
(Kolmogorov) Assume that conditions (27), (28), (29) are satisfied and
that p(y, t|·, ·), a(y, t), b(y, t) ∈ C 2,1 (R × R+ ). Then the transition
probability density satisfies the equation

∂p/∂t = −∂/∂y ( a(y, t) p ) + (1/2) ∂²/∂y² ( b(y, t) p ), lim_{t→s} p(y, t|x, s) = δ(x − y). (39)

Pavliotis (IC) StochProc January 16, 2011 129 / 367


Proof
Fix a function f (y) ∈ C02 (R). An argument similar to the one used in the
proof of the backward Kolmogorov equation gives
lim_{h→0} (1/h) [ ∫ f(y) p(y, s + h|x, s) dy − f(x) ] = a(x, s) f_x(x) + (1/2) b(x, s) f_xx(x), (40)
where subscripts denote differentiation with respect to x. On the other
hand

Pavliotis (IC) StochProc January 16, 2011 130 / 367


∫ f(y) ∂/∂t p(y, t|x, s) dy = ∂/∂t ∫ f(y) p(y, t|x, s) dy
  = lim_{h→0} (1/h) ∫ ( p(y, t + h|x, s) − p(y, t|x, s) ) f(y) dy
  = lim_{h→0} (1/h) [ ∫ p(y, t + h|x, s) f(y) dy − ∫ p(z, t|x, s) f(z) dz ]
  = lim_{h→0} (1/h) [ ∫∫ p(y, t + h|z, t) p(z, t|x, s) f(y) dy dz − ∫ p(z, t|x, s) f(z) dz ]
  = lim_{h→0} (1/h) ∫ p(z, t|x, s) [ ∫ p(y, t + h|z, t) f(y) dy − f(z) ] dz
  = ∫ p(z, t|x, s) [ a(z, t) f_z(z) + (1/2) b(z, t) f_zz(z) ] dz
  = ∫ [ −∂/∂z ( a(z, t) p(z, t|x, s) ) + (1/2) ∂²/∂z² ( b(z, t) p(z, t|x, s) ) ] f(z) dz.

Pavliotis (IC) StochProc January 16, 2011 131 / 367


In the above calculation we used the Chapman-Kolmogorov equation. We
have also performed two integrations by parts and used the fact that,
since the test function f has compact support, the boundary terms
vanish.
Since the above equation is valid for every test function f (y), the
forward Kolmogorov equation follows.
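As a numerical sanity check (not part of the notes), one can integrate the forward Kolmogorov equation for the OU process with a simple finite-difference scheme and compare against the exact transition density quoted later in these slides. The grid, time step and parameters below are assumptions chosen for the illustration.

```python
import numpy as np

# Sketch: explicit finite differences for the OU Fokker-Planck equation
#   dp/dt = d/dx(alpha*x*p) + D d^2 p/dx^2,
# started from the exact density at t0 and compared with the exact density at t1.
alpha, D, y = 1.0, 0.5, 1.0            # initial point X_0 = y

def p_exact(x, t):
    var = (D / alpha) * (1.0 - np.exp(-2 * alpha * t))
    mean = np.exp(-alpha * t) * y
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

x = np.linspace(-6.0, 6.0, 241)
dx = x[1] - x[0]
dt, t0, t1 = 1e-4, 0.1, 1.0

p = p_exact(x, t0)
for _ in range(int(round((t1 - t0) / dt))):
    flux = alpha * x * p
    dflux = (np.roll(flux, -1) - np.roll(flux, 1)) / (2 * dx)
    lap = (np.roll(p, -1) - 2 * p + np.roll(p, 1)) / dx**2
    p = p + dt * (dflux + D * lap)
    p[0] = p[-1] = 0.0                 # decay at the truncated boundary

print(np.max(np.abs(p - p_exact(x, t1))))   # discretisation error, small
print(dx * p.sum())                         # total mass stays close to 1
```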

Pavliotis (IC) StochProc January 16, 2011 132 / 367


Assume now that initial distribution of Xt is ρ0 (x) and set s = 0 (the
initial time) in (39). Define
p(y, t) := ∫ p(y, t|x, 0) ρ0(x) dx.

We multiply the forward Kolmogorov equation (39) by ρ0(x) and
integrate with respect to x to obtain the equation

∂p(y, t)/∂t = −∂/∂y ( a(y, t) p(y, t) ) + (1/2) ∂²/∂y² ( b(y, t) p(y, t) ), (41)

together with the initial condition

p(y, 0) = ρ0 (y). (42)

The solution of equation (41) provides us with the probability density that the
diffusion process X_t, which initially was distributed according to the
probability density ρ0(x), takes the value y at time t. Alternatively, we can
think of the solution to (39) as the Green's function for the PDE (41).
Pavliotis (IC) StochProc January 16, 2011 133 / 367
Quite often we need to calculate joint probability densities, for
example the joint density of X_{t1} and X_{t2}. From the properties
of conditional probability we have, for t1 ≥ t2,

p(x1, t1, x2, t2) = p(x1, t1|x2, t2) p(x2, t2).
Using the joint probability density we can calculate the statistics of a
function of the diffusion process Xt at times t and s:
Z Z
E(f (Xt , Xs )) = f (y, x)p(y, t|x, s)p(x, s) dxdy. (43)

The autocorrelation function at time t and s is given by


Z Z
E(Xt Xs ) = yxp(y, t|x, s)p(x, s) dxdy.

In particular,
Z Z
E(Xt X0 ) = yxp(y, t|x, 0)p(x, 0) dxdy.

Pavliotis (IC) StochProc January 16, 2011 134 / 367


The drift and diffusion coefficients of a diffusion process in Rd are
defined as:
lim_{t→s} 1/(t − s) ∫_{|y−x|<ε} (y − x) P(dy, t|x, s) = a(x, s)

and

lim_{t→s} 1/(t − s) ∫_{|y−x|<ε} (y − x) ⊗ (y − x) P(dy, t|x, s) = b(x, s).

The drift coefficient a(x, s) is a d-dimensional vector field and the
diffusion coefficient b(x, s) is a d × d symmetric matrix (second order
tensor). The generator of a d-dimensional diffusion process is

L = a(s, x) · ∇ + (1/2) b(s, x) : ∇∇
  = Σ_{j=1}^d a_j ∂/∂x_j + (1/2) Σ_{i,j=1}^d b_ij ∂²/(∂x_i ∂x_j).

Pavliotis (IC) StochProc January 16, 2011 135 / 367


Assuming that the first and second moments of the multidimensional
diffusion process exist, we can write the formulas for the drift vector
and diffusion matrix as
 
lim_{t→s} E[ (X_t − X_s)/(t − s) | X_s = x ] = a(x, s) (44)

and

lim_{t→s} E[ (X_t − X_s) ⊗ (X_t − X_s)/(t − s) | X_s = x ] = b(x, s). (45)
Notice that from the above definition it follows that the diffusion matrix
is symmetric and nonnegative definite.

Pavliotis (IC) StochProc January 16, 2011 136 / 367


THE FOKKER-PLANCK EQUATION

Pavliotis (IC) StochProc January 16, 2011 137 / 367


Consider a diffusion process on Rd with time-independent drift
and diffusion coefficients. The Fokker-Planck equation is
∂p/∂t = −Σ_{j=1}^d ∂/∂x_j ( a_j(x) p ) + (1/2) Σ_{i,j=1}^d ∂²/(∂x_i ∂x_j) ( b_ij(x) p ), t > 0, x ∈ R^d, (46a)
p(x, 0) = f(x), x ∈ R^d. (46b)

Write it in non-divergence form:

∂p/∂t = Σ_{j=1}^d ã_j(x) ∂p/∂x_j + (1/2) Σ_{i,j=1}^d b_ij(x) ∂²p/(∂x_i ∂x_j) + c̃(x) p, t > 0, x ∈ R^d, (47a)
p(x, 0) = f(x), x ∈ R^d, (47b)

Pavliotis (IC) StochProc January 16, 2011 138 / 367


where

ã_j(x) = −a_j(x) + Σ_{i=1}^d ∂b_ij/∂x_i,   c̃(x) = (1/2) Σ_{i,j=1}^d ∂²b_ij/(∂x_i ∂x_j) − Σ_{i=1}^d ∂a_i/∂x_i.

The diffusion matrix is always nonnegative. We will assume that it


is actually positive, i.e. we will impose the uniform ellipticity
condition:
Σ_{i,j=1}^d b_ij(x) ξ_i ξ_j ≥ α‖ξ‖², ∀ ξ ∈ R^d, (48)

Furthermore, we will assume that the coefficients ã, b, c̃ are


smooth and that they satisfy the growth conditions

‖b(x)‖ ≤ M, ‖ã(x)‖ ≤ M(1 + ‖x‖), ‖c̃(x)‖ ≤ M(1 + ‖x‖²). (49)

Pavliotis (IC) StochProc January 16, 2011 139 / 367


We will call a solution to the Cauchy problem for the
Fokker–Planck equation (47) a classical solution if:
1 p ∈ C^{2,1}(R^d, R^+).
2 ∀T > 0 there exists a c > 0 such that

  ‖p(t, x)‖_{L∞(0,T)} ≤ c e^{α‖x‖²}.

3 lim_{t→0} p(t, x) = f(x).

Pavliotis (IC) StochProc January 16, 2011 140 / 367


Theorem
Assume that conditions (48) and (49) are satisfied, and assume that
2
|f | 6 ceαkxk . Then there exists a unique classical solution to the
Cauchy problem for the Fokker–Planck equation. Furthermore, there
exist positive constants K , δ so that
 
|p|, |p_t|, ‖∇p‖, ‖D²p‖ ≤ K t^{(−n+2)/2} exp( −(1/2t) δ‖x‖² ).

This estimate enables us to multiply the Fokker-Planck equation


by monomials x n and then to integrate over Rd and to integrate by
parts.
For a proof of this theorem see Friedman, Partial Differential
Equations of Parabolic Type, Prentice-Hall 1964.

Pavliotis (IC) StochProc January 16, 2011 141 / 367


The FP equation as a conservation law
We can define the probability current to be the vector whose ith
component is
J_i := a_i(x) p − (1/2) Σ_{j=1}^d ∂/∂x_j ( b_ij(x) p ).

The Fokker–Planck equation can be written as a continuity


equation:
∂p/∂t + ∇ · J = 0.
Integrating the FP equation over Rd and integrating by parts on
the right hand side of the equation we obtain
(d/dt) ∫_{R^d} p(x, t) dx = 0.
Consequently:
‖p(·, t)‖_{L¹(R^d)} = ‖p(·, 0)‖_{L¹(R^d)} = 1.
Pavliotis (IC) StochProc January 16, 2011 142 / 367
Boundary conditions for the Fokker–Planck
equation

So far we have been studying the FP equation on Rd .


The boundary condition was that the solution decays sufficiently
fast at infinity.
For ergodic diffusion processes this is equivalent to requiring that
the solution of the backward Kolmogorov equation is an element
of L2 (µ) where µ is the invariant measure of the process.
We can also study the FP equation in a bounded domain with
appropriate boundary conditions.
We can have absorbing, reflecting or periodic boundary
conditions.

Pavliotis (IC) StochProc January 16, 2011 143 / 367


Consider the FP equation posed in Ω ⊂ Rd where Ω is a bounded
domain with smooth boundary.
Let J denote the probability current and let n be the unit normal
vector to the surface.
1 We specify reflecting boundary conditions by setting

n · J(x , t) = 0, on ∂Ω.

2 We specify absorbing boundary conditions by setting

p(x , t) = 0, on ∂Ω.

3 When the coefficients of the FP equation are periodic functions, we
might also want to consider periodic boundary conditions, i.e. the
solution of the FP equation is periodic in x with period equal to that
of the coefficients.

Pavliotis (IC) StochProc January 16, 2011 144 / 367


Reflecting BC correspond to the case where a particle which
evolves according to the SDE corresponding to the FP equation
gets reflected at the boundary.
Absorbing BC correspond to the case where a particle which
evolves according to the SDE corresponding to the FP equation
gets absorbed at the boundary.
There is a complete classification of boundary conditions in one
dimension, the Feller classification: the BC can be regular, exit,
entrance and natural.

Pavliotis (IC) StochProc January 16, 2011 145 / 367


Examples of Diffusion Processes

Set a(t, x) ≡ 0, b(t, x) ≡ 2D > 0. The Fokker-Planck equation


becomes:
∂p/∂t = D ∂²p/∂x², p(x, s|y, s) = δ(x − y).

This is the heat equation, which is the Fokker-Planck equation for
Brownian motion (Einstein, 1905). Its solution is

p_W(x, t|y, s) = 1/√(4πD(t − s)) exp( −(x − y)²/(4D(t − s)) ).

Pavliotis (IC) StochProc January 16, 2011 146 / 367


Assume that the initial distribution is

pW (x, s|y, s) = W (y, s).

The solution of the Fokker-Planck equation for Brownian motion


with this initial distribution is
Z
PW (x, t) = p(x, t|y, s)W (y, s) dy.

The Gaussian distribution is the fundamental solution (Green’s


function) of the heat equation (i.e. the Fokker-Planck equation for
Brownian motion).

Pavliotis (IC) StochProc January 16, 2011 147 / 367


Set a(t, x) = −αx, b(t, x) ≡ 2D > 0:

∂p/∂t = α ∂(xp)/∂x + D ∂²p/∂x².

This is the Fokker-Planck equation for the Ornstein-Uhlenbeck
process (Ornstein-Uhlenbeck, 1930). Its solution is

p_OU(x, t|y, s) = √( α/(2πD(1 − e^{−2α(t−s)})) ) exp( −α(x − e^{−α(t−s)} y)² / (2D(1 − e^{−2α(t−s)})) ).

Proof: take the Fourier transform in x, use the method of


characteristics and take the inverse Fourier transform.

Pavliotis (IC) StochProc January 16, 2011 148 / 367


Set y = 0, s = 0. Notice that

lim_{α→0} p_OU(x, t) = p_W(x, t).

Thus, in the limit where the friction coefficient goes to 0, we
recover the distribution function of Brownian motion from that of the
OU process.
Notice also that
lim_{t→+∞} p_OU(x, t) = √(α/(2πD)) exp( −αx²/(2D) ).

Thus, the Ornstein-Uhlenbeck process is an ergodic Markov


process. Its invariant measure is Gaussian.
We can calculate all moments of the OU process.

Pavliotis (IC) StochProc January 16, 2011 149 / 367


Define the nth moment of the OU process:
M_n = ∫_R x^n p(x, t) dx, n = 0, 1, 2, . . . .

Let n = 0. We integrate the FP equation over R to obtain:

∫_R ∂p/∂t dx = α ∫_R ∂(xp)/∂x dx + D ∫_R ∂²p/∂x² dx = 0,

after an integration by parts and using the fact that p(x, t) decays
sufficiently fast at infinity. Consequently:

(d/dt) M_0 = 0 ⇒ M_0(t) = M_0(0) = 1.

In other words:

(d/dt) ‖p‖_{L¹(R)} = 0 ⇒ ‖p(·, t)‖_{L¹(R)} = ‖p(·, 0)‖_{L¹(R)} = 1.

Consequently: probability is conserved.
Pavliotis (IC) StochProc January 16, 2011 150 / 367
Let n = 1. We multiply the FP equation for the OU process by x,
integrate over R and perform an integration by parts to obtain:

(d/dt) M_1 = −αM_1.

Consequently, the first moment converges exponentially fast to 0:

M_1(t) = e^{−αt} M_1(0).

Pavliotis (IC) StochProc January 16, 2011 151 / 367


Let now n ≥ 2. We multiply the FP equation for the OU process by
x^n and integrate by parts (once on the first term on the RHS and
twice on the second) to obtain:

(d/dt) ∫_R x^n p dx = −αn ∫_R x^n p dx + Dn(n − 1) ∫_R x^{n−2} p dx.

Or, equivalently:

(d/dt) M_n = −αn M_n + Dn(n − 1) M_{n−2}, n ≥ 2.
dt
This is a first order linear inhomogeneous differential equation.
We can solve it using the variation of constants formula:
M_n(t) = e^{−αnt} M_n(0) + Dn(n − 1) ∫_0^t e^{−αn(t−s)} M_{n−2}(s) ds.
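A small numerical illustration (not in the notes): the moment hierarchy can be integrated directly and its long-time values compared with the stationary moments given on the next slide. The initial condition X_0 = 0 and all parameters are assumptions made for the example.

```python
import numpy as np

# Sketch: integrate dM_n/dt = -alpha n M_n + D n(n-1) M_{n-2} with forward Euler and
# compare M_2, M_4 at large time with the stationary moments 1*3*...*(n-1) (D/alpha)^{n/2}.
alpha, D = 1.0, 0.5
n_max, dt, T = 4, 1e-3, 10.0

M = np.zeros(n_max + 1)
M[0] = 1.0                     # X_0 = 0: higher moments vanish initially
for _ in range(int(T / dt)):
    dM = np.zeros_like(M)
    for n in range(1, n_max + 1):
        lower = M[n - 2] if n >= 2 else 0.0
        dM[n] = -alpha * n * M[n] + D * n * (n - 1) * lower
    M += dt * dM

print(M[2], D / alpha)                 # stationary second moment D/alpha
print(M[4], 3 * (D / alpha) ** 2)      # stationary fourth moment 3 (D/alpha)^2
```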

Pavliotis (IC) StochProc January 16, 2011 152 / 367


The stationary moments of the OU process are:
⟨x^n⟩_OU = √(α/(2πD)) ∫_R x^n e^{−αx²/(2D)} dx
         = 1 · 3 · · · (n − 1) (D/α)^{n/2}  for n even,  and 0 for n odd.

We have that
lim_{t→∞} M_n(t) = ⟨x^n⟩_OU.

If the initial conditions of the OU process are stationary, then:

M_n(t) = M_n(0) = ⟨x^n⟩_OU.

Pavliotis (IC) StochProc January 16, 2011 153 / 367


Set a(x) = µx, b(x) = σ²x². This is the geometric Brownian
motion. The generator of this process is

L = µx ∂/∂x + (σ²x²/2) ∂²/∂x².

Notice that this operator is not uniformly elliptic.
The Fokker-Planck equation of the geometric Brownian motion is:

∂p/∂t = −∂/∂x ( µx p ) + ∂²/∂x² ( (σ²x²/2) p ).

Pavliotis (IC) StochProc January 16, 2011 154 / 367


We can easily obtain an equation for the nth moment of the
geometric Brownian motion:

(d/dt) M_n = ( µn + (σ²/2) n(n − 1) ) M_n, n ≥ 2.

The solution of this equation is

M_n(t) = e^{(µ + (n−1)σ²/2) n t} M_n(0), n ≥ 2,

and

M_1(t) = e^{µt} M_1(0).
Notice that the nth moment might diverge as t → ∞, depending
on the values of µ and σ:

lim_{t→∞} M_n(t) = 0        if σ²/2 < −µ/(n − 1),
lim_{t→∞} M_n(t) = M_n(0)   if σ²/2 = −µ/(n − 1),
lim_{t→∞} M_n(t) = +∞       if σ²/2 > −µ/(n − 1).
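The moment formula is easy to verify by Monte Carlo, since geometric Brownian motion can be sampled exactly. The sketch below is not from the notes; all parameter values are illustrative assumptions.

```python
import numpy as np

# Sketch: Monte Carlo check of M_n(t) = exp((mu + (n-1) sigma^2/2) n t) M_n(0)
# using the exact sample X_t = X_0 exp((mu - sigma^2/2) t + sigma W_t).
rng = np.random.default_rng(4)
mu, sigma, x0, t = 0.05, 0.3, 1.0, 2.0
n_samples = 1_000_000

W = np.sqrt(t) * rng.standard_normal(n_samples)
X = x0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)

for n in (1, 2, 3):
    mc = np.mean(X**n)
    exact = x0**n * np.exp((mu + (n - 1) * sigma**2 / 2) * n * t)
    print(n, mc, exact)
```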

Pavliotis (IC) StochProc January 16, 2011 155 / 367


Gradient Flows

Let V(x) = (1/2) αx². The generator of the OU process can be written
as:

L = −∂_x V ∂_x + D ∂_x².
Consider diffusion processes with a potential V (x), not
necessarily quadratic:

L = −∇V (x) · ∇ + D∆. (50)

This is a gradient flow perturbed by noise whose strength is


D = kB T where kB is Boltzmann’s constant and T the absolute
temperature.
The corresponding stochastic differential equation is

dX_t = −∇V(X_t) dt + √(2D) dW_t.

Pavliotis (IC) StochProc January 16, 2011 156 / 367


The corresponding FP equation is:

∂p
= ∇ · (∇Vp) + D∆p. (51)
∂t
It is not possible to calculate the time dependent solution of this
equation for an arbitrary potential. We can, however, always
calculate the stationary solution.

Pavliotis (IC) StochProc January 16, 2011 157 / 367


Theorem
Assume that V (x) is smooth and that

e−V (x)/D ∈ L1 (Rd ). (52)

Then the Markov process with generator (50) is ergodic. The unique
invariant distribution is the Gibbs distribution
p(x) = (1/Z) e^{−V(x)/D}, (53)

where the normalization factor Z is the partition function

Z = ∫_{R^d} e^{−V(x)/D} dx.
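The Gibbs distribution can also be observed empirically: a long Euler–Maruyama run of the gradient SDE produces a histogram close to Z^{-1} e^{-V/D}. The double-well potential and all numerical parameters in this sketch are assumptions for illustration; it is not part of the notes.

```python
import numpy as np

# Sketch: sample dX = -V'(X) dt + sqrt(2D) dW for a double-well potential and compare
# the empirical histogram with the Gibbs density Z^{-1} exp(-V/D) on a coarse grid.
rng = np.random.default_rng(5)
D = 0.5
V = lambda x: (x**2 - 1.0) ** 2 / 4.0
dV = lambda x: x * (x**2 - 1.0)

dt, n_steps = 2e-3, 1_000_000
x, samples = 0.0, []
for k in range(n_steps):
    x += -dV(x) * dt + np.sqrt(2 * D * dt) * rng.standard_normal()
    if k % 10 == 0:
        samples.append(x)
samples = np.array(samples)

grid = np.linspace(-2.5, 2.5, 26)
centers = 0.5 * (grid[1:] + grid[:-1])
hist, _ = np.histogram(samples, bins=grid, density=True)
gibbs = np.exp(-V(centers) / D)
gibbs /= np.sum(gibbs) * (centers[1] - centers[0])   # normalise on the grid
print(np.max(np.abs(hist - gibbs)))                  # small for a long enough run
```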

Pavliotis (IC) StochProc January 16, 2011 158 / 367


The fact that the Gibbs distribution is an invariant distribution
follows by direct substitution. Uniqueness follows from a PDEs
argument (see discussion below).
It is more convenient to "normalize" the solution of the
Fokker-Planck equation wrt the invariant distribution.

Theorem
Let p(x, t) be the solution of the Fokker-Planck equation (51), assume
that (52) holds and let ρ(x) be the Gibbs distribution (53). Define
h(x, t) through
p(x, t) = h(x, t)ρ(x).
Then the function h satisfies the backward Kolmogorov equation:

∂h
= −∇V · ∇h + D∆h, h(x, 0) = p(x, 0)ρ−1 (x). (54)
∂t

Pavliotis (IC) StochProc January 16, 2011 159 / 367


Proof.
The initial condition follows from the definition of h. We calculate the
gradient and Laplacian of p:

∇p = ρ∇h − ρhD^{−1}∇V

and

∆p = ρ∆h − 2ρD^{−1}∇V · ∇h − hD^{−1}∆V ρ + h|∇V|²D^{−2}ρ.

We substitute these formulas into the FP equation to obtain

∂h  
ρ = ρ − ∇V · ∇h + D∆h ,
∂t
from which the claim follows.

Pavliotis (IC) StochProc January 16, 2011 160 / 367


Consequently, in order to study properties of solutions to the FP
equation, it is sufficient to study the backward equation (54).
The generator L is self-adjoint, in the right function space.
We define the weighted L2 space L2ρ :
L²_ρ = { f : ∫_{R^d} |f|² ρ(x) dx < ∞ },

where ρ(x) is the Gibbs distribution. This is a Hilbert space with
inner product

(f, h)_ρ = ∫_{R^d} f h ρ(x) dx.

Pavliotis (IC) StochProc January 16, 2011 161 / 367


Theorem
Assume that V (x) is a smooth potential and assume that
condition (52) holds. Then the operator

L = −∇V (x) · ∇ + D∆

is self-adjoint in L2ρ . Furthermore, it is non-positive, its kernel consists


of constants.

Pavliotis (IC) StochProc January 16, 2011 162 / 367


Proof.
Let f, h ∈ C_0²(R^d). We calculate

(Lf, h)_ρ = ∫_{R^d} ( −∇V · ∇f + D∆f ) h ρ dx
          = −∫_{R^d} (∇V · ∇f) h ρ dx − D ∫_{R^d} ∇f · ∇h ρ dx − D ∫_{R^d} (∇f · ∇ρ) h dx
          = −D ∫_{R^d} ∇f · ∇h ρ dx,

from which self-adjointness follows.

Pavliotis (IC) StochProc January 16, 2011 163 / 367


If we set f = h in the above equation we get

(Lf , f )ρ = −Dk∇f k2ρ ,

which shows that L is non-positive.


Clearly, constants are in the null space of L. Assume that f ∈ N (L).
Then, from the above equation we get

0 = −Dk∇f k2ρ ,

and, consequently, f is a constant.


Remark
The expression (−Lf , f )ρ is called the Dirichlet form of the
operator L. In the case of a gradient flow, it takes the form

(−Lf , f )ρ = Dk∇f k2ρ . (55)

Pavliotis (IC) StochProc January 16, 2011 164 / 367


Using the properties of the generator L we can show that the
solution of the Fokker-Planck equation converges to the Gibbs
distribution exponentially fast.
For this we need the following result.

Theorem
Assume that the potential V satisfies the convexity condition

D 2 V > λI.

Then the corresponding Gibbs measure satisfies the Poincaré


inequality with constant λ:
∫_{R^d} f ρ dx = 0 ⇒ ‖∇f‖_ρ ≥ √λ ‖f‖_ρ. (56)

Pavliotis (IC) StochProc January 16, 2011 165 / 367


Theorem
Assume that p(x, 0) ∈ L2 (eV /D ). Then the solution p(x, t) of the
Fokker-Planck equation (51) converges to the Gibbs distribution
exponentially fast:

‖p(·, t) − Z^{−1}e^{−V}‖_{ρ^{−1}} ≤ e^{−λDt} ‖p(·, 0) − Z^{−1}e^{−V}‖_{ρ^{−1}}. (57)

Pavliotis (IC) StochProc January 16, 2011 166 / 367


Proof.
We use (54), (55) and (56) to calculate

−(d/dt) ‖h − 1‖²_ρ = −2 (∂h/∂t, h − 1)_ρ = −2 (Lh, h − 1)_ρ
                   = 2 (−L(h − 1), h − 1)_ρ = 2D ‖∇(h − 1)‖²_ρ
                   ≥ 2Dλ ‖h − 1‖²_ρ.

Our assumption on p(·, 0) implies that h(·, 0) ∈ L²_ρ. Consequently, the
above calculation shows that

‖h(·, t) − 1‖_ρ ≤ e^{−λDt} ‖h(·, 0) − 1‖_ρ.

This, and the definition of h, p = ρh, lead to (57).

Pavliotis (IC) StochProc January 16, 2011 167 / 367


The assumption
∫_{R^d} |p(x, 0)|² Z^{−1} e^{V/D} dx < ∞

is very restrictive (think of the case where V = x²).


The function space L²(ρ^{−1}) = L²(e^{V/D}) in which we prove
convergence is not the right space to use. Since p(·, t) ∈ L¹,
ideally we would like to prove exponentially fast convergence in L¹.
We can prove convergence in L1 using the theory of logarithmic
Sobolev inequalities. In fact, we can also prove convergence in
relative entropy:
Z  
p
H(p|ρV ) := p ln dx.
Rd ρV

Pavliotis (IC) StochProc January 16, 2011 168 / 367


The relative entropy controls the L¹ norm:

‖ρ1 − ρ2‖_{L¹} ≤ C √( H(ρ1|ρ2) ).

Using a logarithmic Sobolev inequality, we can prove exponentially


fast convergence to equilibrium, assuming only that the relative
entropy of the initial conditions is finite.

Theorem
Let p denote the solution of the Fokker–Planck equation (51) where
the potential is smooth and uniformly convex. Assume that the
initial conditions satisfy

H(p(·, 0)|ρ_V) < ∞.

Then p converges to the Gibbs distribution exponentially fast in relative


entropy:
H(p(·, t)|ρ_V) ≤ e^{−λDt} H(p(·, 0)|ρ_V).

Pavliotis (IC) StochProc January 16, 2011 169 / 367


Convergence to equilibrium for kinetic equations, both linear and
non-linear (e.g., the Boltzmann equation) has been studied
extensively.
It has been recognized that the relative entropy plays a very
important role.
For more information see
On the trend to equilibrium for the Fokker-Planck equation: an
interplay between physics and functional analysis by P.A.
Markowich and C. Villani, 1999.

Pavliotis (IC) StochProc January 16, 2011 170 / 367


Eigenfunction Expansions

Consider the generator of a gradient flow with a uniformly convex


potential
L = −∇V · ∇ + D∆. (58)
We know that
1 L is a non-positive self-adjoint operator on L2ρ .
2 It has a spectral gap:

(Lf, f)_ρ ≤ −Dλ ‖f‖²_ρ,

where λ is the Poincaré constant of the potential V .


The above imply that we can study the spectral problem for −L:

−Lfn = λn fn , n = 0, 1, . . .

Pavliotis (IC) StochProc January 16, 2011 171 / 367


The operator −L has real, discrete spectrum with

0 = λ_0 < λ_1 < λ_2 < . . .

Furthermore, the eigenfunctions {f_j}_{j=1}^∞ form an orthonormal basis
in L²_ρ: we can express every element of L²_ρ in the form of a
generalized Fourier series:

φ = φ_0 + Σ_{n=1}^∞ φ_n f_n,   φ_n = (φ, f_n)_ρ, (59)

with (f_n, f_m)_ρ = δ_{nm} and φ_0 ∈ N(L).


This enables us to solve the time dependent Fokker–Planck
equation in terms of an eigenfunction expansion.
Consider the backward Kolmogorov equation (54).
We assume that the initial conditions h0 (x) = φ(x) ∈ L2ρ and
consequently we can expand it in the form (59).

Pavliotis (IC) StochProc January 16, 2011 172 / 367


We look for a solution of (54) in the form

h(x, t) = Σ_{n=0}^∞ h_n(t) f_n(x).

We substitute this expansion into the backward Kolmogorov
equation:

∂h/∂t = Σ_{n=0}^∞ ḣ_n f_n = L( Σ_{n=0}^∞ h_n f_n ) (60)
      = −Σ_{n=0}^∞ λ_n h_n f_n. (61)

We multiply this equation by fm , integrate wrt the Gibbs measure


and use the orthonormality of the eigenfunctions to obtain the
sequence of equations

ḣn = −λn hn , n = 0, 1, . . .

Pavliotis (IC) StochProc January 16, 2011 173 / 367


The solution is

h0 (t) = φ0 , hn (t) = e−λn t φn , n = 1, 2, . . .

Notice that
1 = ∫_{R^d} p(x, 0) dx = ∫_{R^d} p(x, t) dx
  = ∫_{R^d} h(x, t) Z^{−1} e^{−βV} dx = (h, 1)_ρ = (φ, 1)_ρ = φ_0.

Pavliotis (IC) StochProc January 16, 2011 174 / 367


Consequently, the solution of the backward Kolmogorov equation
is
h(x, t) = 1 + Σ_{n=1}^∞ e^{−λ_n t} φ_n f_n.

This expansion, together with the fact that all eigenvalues are
positive (n > 1), shows that the solution of the backward
Kolmogorov equation converges to 1 exponentially fast.
The solution of the Fokker–Planck equation is

p(x, t) = Z^{−1} e^{−βV(x)} ( 1 + Σ_{n=1}^∞ e^{−λ_n t} φ_n f_n ).
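The eigenfunction expansion is easy to test numerically for the OU process, where the f_n are Hermite polynomials and λ_n = n (with α = D = 1). The sketch below is not from the notes; the initial density N(1, 1), the truncation level and the quadrature order are assumptions made for the illustration.

```python
import numpy as np
from numpy.polynomial import hermite_e as He
from math import factorial

# Sketch: spectral solution of the OU Fokker-Planck equation with alpha = D = 1,
# where f_n(x) = He_n(x)/sqrt(n!), lambda_n = n and rho = N(0, 1).
N = 12
nodes, weights = He.hermegauss(60)        # sum w_i g(z_i) ~ int g(z) e^{-z^2/2} dz

def f_n(n, x):
    c = np.zeros(n + 1); c[n] = 1.0
    return He.hermeval(x, c) / np.sqrt(factorial(n))

rho = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# phi_n = int p0(x) f_n(x) dx with p0 = N(1, 1); substitute x = z + 1.
phi = np.array([np.sum(weights * f_n(n, nodes + 1.0)) / np.sqrt(2 * np.pi)
                for n in range(N + 1)])

t = 0.7
x_test = np.linspace(-3.0, 3.0, 7)
p_spec = rho(x_test) * sum(np.exp(-n * t) * phi[n] * f_n(n, x_test) for n in range(N + 1))
p_exact = np.exp(-(x_test - np.exp(-t))**2 / 2) / np.sqrt(2 * np.pi)   # exact: N(e^{-t}, 1)
print(np.max(np.abs(p_spec - p_exact)))   # truncation/quadrature error, very small
```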

Pavliotis (IC) StochProc January 16, 2011 175 / 367


Self-adjointness

The Fokker–Planck operator of a general diffusion process is not


self-adjoint in general.
In fact, it is self-adjoint if and only if the drift term is the gradient
of a potential (Nelson, ≈ 1960). This is also true in infinite
dimensions (Stochastic PDEs).
Markov processes whose generator is a self-adjoint operator are
called reversible: for all t ∈ [0, T], X_t and X_{T−t} have the same
transition probability (when X_t is stationary).
Reversibility is equivalent to the invariant measure being a Gibbs
measure.
See Thermodynamics of the general diffusion process:
time-reversibility and entropy production, H. Qian, M. Qian, X.
Tang, J. Stat. Phys., 107 (5/6), 2002, pp. 1129–1141.

Pavliotis (IC) StochProc January 16, 2011 176 / 367


Reduction to a Schrödinger Equation
Lemma
The Fokker–Planck operator for a gradient flow can be written in the
self-adjoint form

∂p/∂t = D∇ · ( e^{−V/D} ∇( e^{V/D} p ) ). (62)

Define now ψ(x, t) = e^{V/2D} p(x, t). Then ψ solves the PDE

∂ψ/∂t = D∆ψ − U(x)ψ,   U(x) := |∇V|²/(4D) − ∆V/2. (63)
Let H := −D∆ + U. Then L∗ and H have the same eigenvalues. The
nth eigenfunction φn of L∗ and the nth eigenfunction ψn of H are
associated through the transformation
 
V (x)
ψn (x) = φn (x) exp .
2D
Pavliotis (IC) StochProc January 16, 2011 177 / 367
Remarks

1 Equation (62) shows that the FP operator can be written in
the form

  L* · = D∇ · ( e^{−V/D} ∇( e^{V/D} · ) ).

2 The operator that appears on the right hand side of eqn. (63) has
the form of a Schrödinger operator:

  H = −D∆ + U(x).

3 The spectral problem for the FP operator can be transformed into


the spectral problem for a Schrödinger operator. We can thus use
all the available results from quantum mechanics to study the FP
equation and the associated SDE.
4 In particular, the weak noise asymptotics D ≪ 1 is equivalent to
the semiclassical approximation from quantum mechanics.
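A rough numerical check of this correspondence (not in the notes): for the quadratic potential V(x) = αx²/2, U(x) = α²x²/(4D) − α/2 and the Schrödinger operator H = −D∆ + U should have eigenvalues nα, the eigenvalues of −L* for the OU process. The finite-difference grid, domain and parameters below are assumptions made for the illustration.

```python
import numpy as np

# Sketch: finite-difference eigenvalues of H = -D d^2/dx^2 + U(x),
# with U = |V'|^2/(4D) - V''/2 and V(x) = alpha x^2 / 2.
alpha, D = 1.0, 0.5
x = np.linspace(-8.0, 8.0, 800)
h = x[1] - x[0]
U = (alpha * x) ** 2 / (4 * D) - alpha / 2

# Dirichlet Laplacian plus the potential on the diagonal
H = (-D / h**2) * (np.diag(np.ones(len(x) - 1), 1) - 2 * np.eye(len(x))
                   + np.diag(np.ones(len(x) - 1), -1)) + np.diag(U)
eigvals = np.linalg.eigvalsh(H)
print(np.round(eigvals[:5], 3))    # approximately 0, alpha, 2*alpha, 3*alpha, ...
```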

Pavliotis (IC) StochProc January 16, 2011 178 / 367


Proof.

We calculate

D∇ · ( e^{−V/D} ∇( e^{V/D} f ) ) = D∇ · ( e^{−V/D} ( D^{−1}∇V f + ∇f ) e^{V/D} )
                                 = ∇ · ( ∇V f + D∇f ) = L* f.

Consider now the eigenvalue problem for the FP operator:

−L* φ_n = λ_n φ_n.

Set φ_n = ψ_n exp( −V/(2D) ). We calculate −L* φ_n:

Pavliotis (IC) StochProc January 16, 2011 179 / 367


−L* φ_n = −D∇ · ( e^{−V/D} ∇( e^{V/D} ψ_n e^{−V/2D} ) )
        = −D∇ · ( e^{−V/D} ( ∇ψ_n + ψ_n ∇V/(2D) ) e^{V/2D} )
        = ( −D∆ψ_n + ( |∇V|²/(4D) − ∆V/2 ) ψ_n ) e^{−V/2D}
        = e^{−V/2D} H ψ_n.

From this we conclude that e^{−V/2D} Hψ_n = λ_n ψ_n e^{−V/2D}, from which the
equivalence between the two eigenvalue problems follows.

Pavliotis (IC) StochProc January 16, 2011 180 / 367


Remarks
1 We can rewrite the Schrödinger operator in the form

  H = DA*A,   A = ∇ + ∇V/(2D),   A* = −∇· + ∇V/(2D)· .

2 These are creation and annihilation operators. They can also be
written in the form

  A · = e^{−V/2D} ∇( e^{V/2D} · ),   A* · = −e^{V/2D} ∇ · ( e^{−V/2D} · ).

3 The forward and backward Kolmogorov operators have the same
eigenvalues. Their eigenfunctions are related through

  φ_n^F = φ_n^B exp(−V/D),

where φ_n^B and φ_n^F denote the eigenfunctions of the backward and
forward operators, respectively.
Pavliotis (IC) StochProc January 16, 2011 181 / 367
The OU Process and Hermite Polynomials
The generator of the OU process is
L = −y d/dy + D d²/dy². (64)
The OU process is an ergodic Markov process whose unique
invariant measure is absolutely continuous with respect to the
Lebesgue measure on R with density ρ(y) ∈ C ∞ (R)
ρ(y) = (1/√(2πD)) e^{−y²/(2D)}.
Let L2ρ denote the closure of C∞ functions with respect to the norm
‖f‖²_ρ = ∫_R f(y)² ρ(y) dy.
The space L2ρ is a Hilbert space with inner product
(f, g)_ρ = ∫_R f(y) g(y) ρ(y) dy.
Pavliotis (IC) StochProc January 16, 2011 182 / 367
Consider the eigenvalue problem for L:

−Lfn = λn fn .

The operator L has discrete spectrum in L2ρ :

λn = n, n ∈ N.

The corresponding eigenfunctions are the normalized Hermite


polynomials:

f_n(y) = (1/√(n!)) H_n( y/√D ), (65)

where

H_n(y) = (−1)^n e^{y²/2} (d^n/dy^n) e^{−y²/2}.

Pavliotis (IC) StochProc January 16, 2011 183 / 367


The first few Hermite polynomials are:

H0(y) = 1,
H1(y) = y,
H2(y) = y² − 1,
H3(y) = y³ − 3y,
H4(y) = y⁴ − 6y² + 3,
H5(y) = y⁵ − 10y³ + 15y.

Hn is a polynomial of degree n.
Only odd (even) powers appear in Hn when n is odd (even).

Pavliotis (IC) StochProc January 16, 2011 184 / 367


Lemma
The eigenfunctions {fn (y)}∞n=1 of the generator of the OU process L
satisfy the following properties.
1 They form an orthonormal set in L2ρ :

(fn , fm )ρ = δnm .

2 {fn (y) : n ∈ N } is an orthonormal basis in L2ρ .


3 Define the creation and annihilation operators on C¹(R) by

  A⁺φ(y) = (1/√(D(n + 1))) ( −D dφ/dy (y) + y φ(y) ), y ∈ R,

and

  A⁻φ(y) = √(D/n) dφ/dy (y).

Then

  A⁺ f_n = f_{n+1} and A⁻ f_n = f_{n−1}. (66)
Pavliotis (IC) StochProc January 16, 2011 185 / 367
4. The eigenfunctions f_n satisfy the following recurrence relation

  y f_n(y) = √(Dn) f_{n−1}(y) + √(D(n + 1)) f_{n+1}(y), n ≥ 1. (67)

5. The function

  H(y; λ) = e^{λy − λ²/2}

is a generating function for the Hermite polynomials. In particular,

  H( y/√D ; λ ) = Σ_{n=0}^∞ (λ^n/√(n!)) f_n(y), λ ∈ C, y ∈ R.

Pavliotis (IC) StochProc January 16, 2011 186 / 367


The Klein-Kramers-Chandrasekhar Equation
Consider a diffusion process in two dimensions for the variables q
(position) and momentum p. The generator of this Markov process
is
L = p · ∇_q − ∇_q V · ∇_p + γ( −p · ∇_p + D∆_p ). (68)
The L2 (dpdq)-adjoint is
L∗ ρ = −p · ∇q ρ − ∇q V · ∇p ρ + γ (∇p (pρ) + D∆p ρ) .
The corresponding FP equation is:
∂p
= L∗ p.
∂t
The corresponding stochastic differential equation is the
Langevin equation

Ẍ_t = −∇V(X_t) − γ Ẋ_t + √(2γD) Ẇ_t. (69)
This is Newton’s equation perturbed by dissipation and noise.
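As a numerical aside (not in the notes), a long Euler–Maruyama run of the Langevin equation for a harmonic potential should reproduce the Maxwell–Boltzmann variances of momentum and position. The potential, the integrator and all parameters are assumptions chosen for the illustration.

```python
import numpy as np

# Sketch: Euler-Maruyama for dq = p dt, dp = (-V'(q) - gamma p) dt + sqrt(2 gamma D) dW
# with V(q) = omega0^2 q^2 / 2 and D = kB T = beta^{-1}.
rng = np.random.default_rng(6)
gamma, D, omega0 = 1.0, 0.5, 1.0
dt, n_steps, burn_in = 1e-3, 1_000_000, 100_000

q, p = 0.0, 0.0
qs, ps = [], []
for k in range(n_steps):
    noise = np.sqrt(2 * gamma * D * dt) * rng.standard_normal()
    q, p = q + p * dt, p + (-(omega0**2) * q - gamma * p) * dt + noise
    if k >= burn_in and k % 20 == 0:
        qs.append(q); ps.append(p)

print(np.var(ps), D)                 # Var(p) = beta^{-1} = D
print(np.var(qs), D / omega0**2)     # Var(q) = beta^{-1} / omega0^2
```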
Pavliotis (IC) StochProc January 16, 2011 187 / 367
The Klein-Kramers-Chandrasekhar equation was first derived by
Klein in 1923 and was later studied by Kramers in his famous paper
"Brownian motion in a field of force and the diffusion model of
chemical reactions", Physica 7 (1940), pp. 284-304.
Notice that L∗ is not a uniformly elliptic operator: there are second
order derivatives only with respect to p and not q. This is an
example of a degenerate elliptic operator. It is, however,
hypoelliptic: we can still prove existence and uniqueness of
solutions for the FP equation, and obtain estimates on the
solution.
It is not possible to obtain the solution of the FP equation for an
arbitrary potential.
We can calculate the (unique normalized) solution of the
stationary Fokker-Planck equation.

Pavliotis (IC) StochProc January 16, 2011 188 / 367


Theorem
Assume that V (x) is smooth and that

e−V (x)/D ∈ L1 (Rd ).

Then the Markov process with generator (68) is ergodic. The unique
invariant distribution is the Maxwell-Boltzmann distribution
ρ_β(p, q) = (1/Z) e^{−βH(p,q)}, (70)

where

H(p, q) = (1/2)‖p‖² + V(q)

is the Hamiltonian, β = (k_B T)^{−1} is the inverse temperature and the
normalization factor Z is the partition function

Z = ∫_{R^{2d}} e^{−βH(p,q)} dp dq.

Pavliotis (IC) StochProc January 16, 2011 189 / 367


It is possible to obtain rates of convergence in either a weighted
L2 -norm or the relative entropy norm.

H(p(·, t)|ρ) 6 Ce−αt .

The proof of this result is quite complicated, since the generator L


is degenerate and non-selfadjoint.
See F. Herau and F. Nier, Isotropic hypoellipticity and trend to
equilibrium for the Fokker-Planck equation with a high-degree
potential, Arch. Ration. Mech. Anal., 171(2),(2004), 151–218.
See also C. Villani, Hypocoercivity, AMS 2008.

Pavliotis (IC) StochProc January 16, 2011 190 / 367


Consider a second order differential operator of the form

L = X_0 + Σ_{i=1}^d X_i* X_i,

where {X_j}_{j=0}^d are vector fields (first order differential operators).
Define the Lie algebras

A_0 = Lie(X_1, X_2, . . . , X_d),
A_1 = Lie([X_0, X_1], [X_0, X_2], . . . , [X_0, X_d]),
. . .
A_k = Lie([X_0, X]; X ∈ A_{k−1}), k ≥ 1.

Define

H = Lie(A_0, A_1, . . . ).

Hörmander's Hypothesis. H is full in the sense that, for every x ∈ R^d,
the vectors {H(x) : H ∈ H} span T_x R^d:

span{H(x); H ∈ H} = T_x R^d.
Pavliotis (IC) StochProc January 16, 2011 191 / 367
Theorem (Hörmander/Kolmogorov.)
Under the above hypothesis, the diffusion process Xt with generator L
has a transition density
(P_t f)(x) = ∫_{R^d} p(t, x, y) f(y) dy,

where p(·, ·, ·) is a smooth function on (0, +∞) × R^d × R^d. Moreover,
the function p satisfies Kolmogorov's forward and backward equations

∂p(·, x, ·)/∂t = L*_y p(·, x, ·), x ∈ R^d, (71)

and

∂p(·, ·, y)/∂t = L_x p(·, ·, y), y ∈ R^d. (72)

For every x ∈ R^d the function p(·, x, ·) is the fundamental solution
(Green's function) of (71), such that, for each f ∈ C_0(R^d),
lim_{t→0} ∫_{R^d} p(t, x, y) f(y) dy = f(x).
Pavliotis (IC) StochProc January 16, 2011 192 / 367
Example
Consider the SDE

  ẍ = √2 Ẇ.

Write it as a first order system

  ẋ = y, ẏ = √2 Ẇ.

The generator of this process is

  L = X_0 + X_1², X_0 = y∂_x, X_1 = ∂_y.

Consequently [X_0, X_1] = −∂_x and

  Lie(X_1, [X_0, X_1]) = Lie(∂_y, −∂_x),

which spans T_x R².
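The commutator in this example can be checked symbolically. The sketch below (not from the notes; it requires sympy, and the representation of vector fields as first-order operators acting on a test function is an implementation choice) verifies [X0, X1] = −∂/∂x.

```python
import sympy as sp

# Sketch: symbolic commutator of X0 = y d/dx and X1 = d/dy applied to a test function.
x, y = sp.symbols('x y')
f = sp.Function('f')(x, y)

X0 = lambda g: y * sp.diff(g, x)   # X0 = y d/dx
X1 = lambda g: sp.diff(g, y)       # X1 = d/dy

bracket = sp.simplify(X0(X1(f)) - X1(X0(f)))
print(bracket)                     # -Derivative(f(x, y), x), i.e. [X0, X1] = -d/dx
```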

Pavliotis (IC) StochProc January 16, 2011 193 / 367


Let ρ(q, p, t) be the solution of the Kramers equation and let ρβ (q, p)
be the Maxwell-Boltzmann distribution. We can write

ρ(q, p, t) = h(q, p, t)ρβ (q, p),

where h(q, p, t) solves the equation

∂h/∂t = −Ah + γSh,
where
A = p · ∇q − ∇q V · ∇p , S = −p · ∇p + β −1 ∆p .
The operator A is antisymmetric in L2ρ := L2 (R2d ; ρβ (q, p)), whereas S
is symmetric.

Pavliotis (IC) StochProc January 16, 2011 194 / 367



Let X_i := −∂/∂p_i. The L²_ρ-adjoint of X_i is

  X_i* = −βp_i + ∂/∂p_i.

We have that

  S = −β^{−1} Σ_{i=1}^d X_i* X_i.

Consequently, the generator of the Markov process {q(t), p(t)} can be
written in Hörmander's "sum of squares" form:

  L = A − γβ^{−1} Σ_{i=1}^d X_i* X_i. (73)

We calculate the commutators between the vector fields in (73):

  [A, X_i] = ∂/∂q_i, [X_i, X_j] = 0, [X_i, X_j*] = βδ_ij.
∂qi

Pavliotis (IC) StochProc January 16, 2011 195 / 367


Consequently,

  Lie(X_1, . . . , X_d, [A, X_1], . . . , [A, X_d]) = Lie(∇_p, ∇_q),

which spans T_{p,q} R^{2d} for all p, q ∈ R^d. This shows that the generator L
is a hypoelliptic operator.

Let now Y_i := −∂/∂q_i, with L²_ρ-adjoint Y_i* = ∂/∂q_i − β ∂V/∂q_i. We have that

  X_i* Y_i − Y_i* X_i = β ( p_i ∂/∂q_i − ∂V/∂q_i ∂/∂p_i ).

Consequently, the generator can be written in the form

  L = β^{−1} Σ_{i=1}^d ( X_i* Y_i − Y_i* X_i − γ X_i* X_i ).

Notice also that

  L_V := −∇_q V · ∇_q + β^{−1} ∆_q = −β^{−1} Σ_{i=1}^d Y_i* Y_i.

Pavliotis (IC) StochProc January 16, 2011 196 / 367


The phase-space Fokker-Planck equation can be written in the
form
∂ρ
+ p · ∇q ρ − ∇q V · ∇p ρ = Q(ρ, fB )
∂t
where the collision operator has the form
  
Q(ρ, f_B) = D ∇_p · ( f_B ∇_p( f_B^{−1} ρ ) ).

The Fokker-Planck equation has a similar structure to the


Boltzmann equation (the basic equation in the kinetic theory of
gases), with the difference that the collision operator for the FP
equation is linear.
Convergence of solutions of the Boltzmann equation to the
Maxwell-Boltzmann distribution has also been proved. See
L. Desvillettes and C. Villani: On the trend to global equilibrium for
spatially inhomogeneous kinetic systems: the Boltzmann
equation. Invent. Math. 159, 2 (2005), 245-316.

Pavliotis (IC) StochProc January 16, 2011 197 / 367


We can study the backward and forward Kolmogorov equations
for the Langevin equation (69) by expanding the solution with respect to
the Hermite basis.
We consider the problem in 1d. We set D = 1. The generator of
the process is:
 
L = p∂_q − V'(q)∂_p + γ( −p∂_p + ∂_p² ) =: L_1 + γL_0,

where

L_0 := −p∂_p + ∂_p² and L_1 := p∂_q − V'(q)∂_p.

The backward Kolmogorov equation is

∂h
= Lh. (74)
∂t

Pavliotis (IC) StochProc January 16, 2011 198 / 367


The solution should be an element of the weighted L2 -space
L²_ρ = { f : ∫_{R²} |f|² Z^{−1} e^{−βH(p,q)} dp dq < ∞ }.

We notice that the invariant measure of our Markov process is a


product measure:
e^{−βH(p,q)} = e^{−β|p|²/2} e^{−βV(q)}.

The space L²(e^{−β|p|²/2} dp) is spanned by the Hermite polynomials.
Consequently, we can expand the solution of (74) into the Hermite
basis:

h(p, q, t) = Σ_{n=0}^∞ h_n(q, t) f_n(p), (75)

where f_n(p) = (1/√(n!)) H_n(p).

Pavliotis (IC) StochProc January 16, 2011 199 / 367


Our plan is to substitute (75) into (74) and obtain a sequence of
equations for the coefficients hn (q, t).
We have:

L_0 h = L_0 Σ_{n=0}^∞ h_n f_n = −Σ_{n=0}^∞ n h_n f_n.

Furthermore
L1 h = −∂q V ∂p h + p∂q h.
We calculate each term on the right hand side of the above
equation separately. For this we will need the formulas
∂_p f_n = √n f_{n−1} and p f_n = √n f_{n−1} + √(n+1) f_{n+1}.

Pavliotis (IC) StochProc January 16, 2011 200 / 367



p∂_q h = Σ_{n=0}^∞ ∂_q h_n p f_n = ∂_q h_0 f_1 + Σ_{n=1}^∞ ∂_q h_n ( √n f_{n−1} + √(n+1) f_{n+1} )
       = Σ_{n=0}^∞ ( √(n+1) ∂_q h_{n+1} + √n ∂_q h_{n−1} ) f_n,

with h_{−1} ≡ 0.
Furthermore,

∂_q V ∂_p h = Σ_{n=0}^∞ ∂_q V h_n ∂_p f_n = Σ_{n=0}^∞ ∂_q V h_n √n f_{n−1}
            = Σ_{n=0}^∞ ∂_q V h_{n+1} √(n+1) f_n.

Pavliotis (IC) StochProc January 16, 2011 201 / 367


Consequently:

Lh = (L_1 + γL_0) h
   = Σ_{n=0}^∞ ( −γn h_n + √(n+1) ∂_q h_{n+1}
       + √n ∂_q h_{n−1} − √(n+1) ∂_q V h_{n+1} ) f_n.

Using the orthonormality of the eigenfunctions of L_0 we obtain the
following set of equations which determine {h_n(q, t)}_{n=0}^∞:

ḣ_n = −γn h_n + √(n+1) ∂_q h_{n+1}
      + √n ∂_q h_{n−1} − √(n+1) ∂_q V h_{n+1}, n = 0, 1, . . .

This set of equations is usually called the Brinkman hierarchy
(1956).

Pavliotis (IC) StochProc January 16, 2011 202 / 367


We can use this approach to develop a numerical method for
solving the Klein-Kramers equation.
For this we need to expand each coefficient hn in an appropriate
basis with respect to q.
Obvious choices are other the Hermite basis (polynomial
potentials) or the standard Fourier basis (periodic potentials).
We will do this for the case of periodic potentials.
The resulting method is usually called the continued fraction
expansion. See Risken (1989).

Pavliotis (IC) StochProc January 16, 2011 203 / 367


The Hermite expansion of the distribution function with respect to the
velocity is used in the study of various kinetic equations (including
the Boltzmann equation). It was initiated by Grad in the late 40's.
It is quite often used in the approximate calculation of transport
coefficients (e.g. the diffusion coefficient).
This expansion can be justified rigorously for the Fokker-Planck
equation. See
J. Meyer and J. Schröter, Comments on the Grad Procedure for
the Fokker-Planck Equation, J. Stat. Phys. 32(1) pp.53-69 (1983).
This expansion can also be used in order to solve the Poisson
equation −Lφ = f (p, q). See G.A. Pavliotis and T. Vogiannou
Diffusive Transport in Periodic Potentials: Underdamped
dynamics, Fluct. Noise Lett., 8(2) L155-L173 (2008).

Pavliotis (IC) StochProc January 16, 2011 204 / 367


There are very few potentials for which we can calculate the
eigenvalues and eigenfunctions of the generator of the Markov
process {q(t), p(t)}.
We can calculate everything explicitly for the quadratic (harmonic)
potential
V(q) = (1/2) ω_0² q². (76)

The Langevin equation is

q̈ = −ω_0² q − γ q̇ + √(2γβ^{−1}) Ẇ, (77)

or

q̇ = p, ṗ = −ω_0² q − γp + √(2γβ^{−1}) Ẇ. (78)
This is a linear equation that can be solved explicitly (The solution
is a Gaussian stochastic process).

Pavliotis (IC) StochProc January 16, 2011 205 / 367


The generator of the process {q(t), p(t)} is

L = p∂_q − ω_0² q∂_p + γ( −p∂_p + β^{−1} ∂_p² ). (79)

The Fokker-Planck operator is

L* · = −p∂_q · + ω_0² q∂_p · + γ( ∂_p(p ·) + β^{−1} ∂_p² · ). (80)

The process {q(t), p(t)} is an ergodic Markov process with
Gaussian invariant measure

ρ_β(q, p) dq dp = (βω_0/2π) e^{−βp²/2 − βω_0²q²/2} dq dp. (81)

Pavliotis (IC) StochProc January 16, 2011 206 / 367


For the calculation of the eigenvalues and eigenfunctions of the
operator L it is convenient to introduce creation and annihilation
operator in both the position and momentum variables:

a⁻ = β^{−1/2} ∂_p,   a⁺ = −β^{−1/2} ∂_p + β^{1/2} p, (82)

and

b⁻ = ω_0^{−1} β^{−1/2} ∂_q,   b⁺ = −ω_0^{−1} β^{−1/2} ∂_q + ω_0 β^{1/2} q. (83)

We have that

a⁺a⁻ = −β^{−1} ∂_p² + p∂_p

and

b⁺b⁻ = −ω_0^{−2} β^{−1} ∂_q² + q∂_q.

Consequently, the operator

L̂ = −a⁺a⁻ − b⁺b⁻ (84)

is the generator of the OU process in two dimensions.


Pavliotis (IC) StochProc January 16, 2011 207 / 367
Exercises
1 Calculate the eigenvalues and eigenfunctions of L̂. Show that
there exists a transformation that transforms L̂ into the
Schrödinger operator of the two-dimensional quantum harmonic
oscillator.
2 Show that the operators a± , b ± satisfy the commutation relations

[a+ , a− ] = −1, (85a)

[b + , b − ] = −1, (85b)
[a± , b ± ] = 0. (85c)

Pavliotis (IC) StochProc January 16, 2011 208 / 367


Using the operators a± and b ± we can write the generator L in the
form
L = −γa+ a− − ω0 (b + a− − a+ b − ), (86)
We want to find first order differential operators c ± and d ± so that
the operator (86) becomes

L = −Cc + c − − Dd + d − (87)

for some appropriate constants C and D.


Our goal is, essentially, to map L to the two-dimensional OU
process.
We require that the operators c^± and d^± satisfy the
canonical commutation relations

[c + , c − ] = −1, (88a)

[d + , d − ] = −1, (88b)
[c ± , d ± ] = 0. (88c)
Pavliotis (IC) StochProc January 16, 2011 209 / 367
The operators c ± and d ± should be given as linear combinations
of the old operators a± and b ± . They should be of the form

c + = α11 a+ + α12 b + , (89a)

c − = α21 a− + α22 b − , (89b)


d + = β11 a+ + β12 b + , (89c)
d − = β21 a− + β22 b − . (89d)
Notice that the c− and d − are not the adjoints of c + and d + .
If we substitute now these equations into (87) and equate it
with (86) and into the commutation relations (88) we obtain a
system of equations for the coefficients {αij }, {βij }.

Pavliotis (IC) StochProc January 16, 2011 210 / 367


In order to write down the formulas for these coefficients it is
convenient to introduce the eigenvalues of the deterministic
problem
q̈ = −γ q̇ − ω_0² q.

The solution of this equation is

q(t) = C_1 e^{−λ_1 t} + C_2 e^{−λ_2 t}

with

λ_{1,2} = (γ ± δ)/2,   δ = √(γ² − 4ω_0²). (90)

The eigenvalues satisfy the relations

λ_1 + λ_2 = γ,   λ_1 − λ_2 = δ,   λ_1 λ_2 = ω_0². (91)

Pavliotis (IC) StochProc January 16, 2011 211 / 367


Lemma
Let L be the generator (86) and let c^±, d^± be the operators

c⁺ = (1/√δ)( √λ_1 a⁺ + √λ_2 b⁺ ),   c⁻ = (1/√δ)( √λ_1 a⁻ − √λ_2 b⁻ ), (92a)
d⁺ = (1/√δ)( √λ_2 a⁺ + √λ_1 b⁺ ),   d⁻ = (1/√δ)( −√λ_2 a⁻ + √λ_1 b⁻ ). (92b)

Then c^±, d^± satisfy the canonical commutation relations (88) as well
as

[L, c^±] = −λ_1 c^±,   [L, d^±] = −λ_2 d^±. (93)

Furthermore, the operator L can be written in the form

L = −λ_1 c⁺c⁻ − λ_2 d⁺d⁻. (94)

Pavliotis (IC) StochProc January 16, 2011 212 / 367


Proof. First we check the commutation relations:

[c⁺, c⁻] = (1/δ)( λ_1 [a⁺, a⁻] − λ_2 [b⁺, b⁻] ) = (1/δ)( −λ_1 + λ_2 ) = −1.

Similarly,

[d⁺, d⁻] = (1/δ)( −λ_2 [a⁺, a⁻] + λ_1 [b⁺, b⁻] ) = (1/δ)( λ_2 − λ_1 ) = −1.

Clearly, we have that

[c⁺, d⁺] = [c⁻, d⁻] = 0.

Furthermore,

[c⁺, d⁻] = (1/δ)( −√(λ_1λ_2) [a⁺, a⁻] + √(λ_1λ_2) [b⁺, b⁻] ) = (1/δ)( √(λ_1λ_2) − √(λ_1λ_2) ) = 0.
Pavliotis (IC) StochProc January 16, 2011 213 / 367
Finally:

[L, c⁺] = −λ_1 c⁺c⁻c⁺ + λ_1 c⁺c⁺c⁻
        = −λ_1 c⁺(1 + c⁺c⁻) + λ_1 c⁺c⁺c⁻
        = −λ_1 c⁺,

and similarly for the other equations in (93). Now we calculate

L = −λ_1 c⁺c⁻ − λ_2 d⁺d⁻
  = ( (λ_2² − λ_1²)/δ ) a⁺a⁻ + 0 · b⁺b⁻ + (1/δ) √(λ_1λ_2) (λ_1 − λ_2) a⁺b⁻
    + (1/δ) √(λ_1λ_2) (−λ_1 + λ_2) b⁺a⁻
  = −γ a⁺a⁻ − ω_0( b⁺a⁻ − a⁺b⁻ ),

which is precisely (86). In the above calculation we used (91).

Pavliotis (IC) StochProc January 16, 2011 214 / 367


Theorem
The eigenvalues and eigenfunctions of the generator of the Markov
process {q, p}, (78), are

λ_{nm} = λ_1 n + λ_2 m = (1/2) γ(n + m) + (1/2) δ(n − m), n, m = 0, 1, . . . , (95)

and

φ_{nm}(q, p) = (1/√(n! m!)) (c⁺)^n (d⁺)^m 1, n, m = 0, 1, . . . (96)

Pavliotis (IC) StochProc January 16, 2011 215 / 367


We have

[L, (c⁺)²] = L(c⁺)² − (c⁺)² L
           = (c⁺L − λ_1 c⁺) c⁺ − c⁺(Lc⁺ + λ_1 c⁺)
           = −2λ_1 (c⁺)²,

and similarly [L, (d⁺)²] = −2λ_2 (d⁺)². A simple induction argument
now shows that (exercise)

[L, (c⁺)^n] = −nλ_1 (c⁺)^n and [L, (d⁺)^m] = −mλ_2 (d⁺)^m. (97)

We use (97) to calculate

L (c⁺)^n (d⁺)^m 1 = (c⁺)^n L (d⁺)^m 1 − nλ_1 (c⁺)^n (d⁺)^m 1
                  = (c⁺)^n (d⁺)^m L1 − mλ_2 (c⁺)^n (d⁺)^m 1 − nλ_1 (c⁺)^n (d⁺)^m 1
                  = −nλ_1 (c⁺)^n (d⁺)^m 1 − mλ_2 (c⁺)^n (d⁺)^m 1,

from which (95) and (96) follow.


Pavliotis (IC) StochProc January 16, 2011 216 / 367
Exercise
1 Show that

[L, (c^±)^n] = −nλ_1 (c^±)^n, [L, (d^±)^n] = −nλ_2 (d^±)^n, (98)

[c⁻, (c⁺)^n] = n(c⁺)^{n−1}, [d⁻, (d⁺)^n] = n(d⁺)^{n−1}. (99)

Remark. In terms of the operators a^±, b^± the eigenfunctions of L are

φ_{nm} = √(n! m!) δ^{−(n+m)/2} λ_1^{n/2} λ_2^{m/2} Σ_{ℓ=0}^n Σ_{k=0}^m
         ( 1/( k!(m − k)! ℓ!(n − ℓ)! ) ) (λ_1/λ_2)^{(k−ℓ)/2} (a⁺)^{n+m−k−ℓ} (b⁺)^{ℓ+k} 1.

Pavliotis (IC) StochProc January 16, 2011 217 / 367


The first few eigenfunctions are

φ_00 = 1,

φ_10 = √β ( √λ_1 p + √λ_2 ω_0 q ) / √δ,

φ_01 = √β ( √λ_2 p + √λ_1 ω_0 q ) / √δ,

φ_11 = ( √(λ_1λ_2) ( βp² + ω_0²βq² − 2 ) + (λ_1 + λ_2) βω_0 pq ) / δ,

φ_20 = ( λ_1 (βp² − 1) + λ_2 (ω_0²βq² − 1) + 2√(λ_1λ_2) βω_0 pq ) / (√2 δ),

φ_02 = ( λ_2 (βp² − 1) + λ_1 (ω_0²βq² − 1) + 2√(λ_1λ_2) βω_0 pq ) / (√2 δ).

Pavliotis (IC) StochProc January 16, 2011 218 / 367


The eigenfunctions are not orthonormal.
The first eigenvalue, corresponding to the constant eigenfunction,
is 0:
  λ_00 = 0.
The operator L is not self-adjoint and consequently, we do not
expect its eigenvalues to be real.
Whether the eigenvalues are real or not depends on the sign of
the discriminant ∆ = γ² − 4ω_0².
In the underdamped regime, γ < 2ω_0, the eigenvalues are
complex:

  λ_{nm} = (1/2) γ(n + m) + (i/2) √(4ω_0² − γ²) (n − m), γ < 2ω_0.

This is to be expected, since in the underdamped regime the
dynamics is dominated by the deterministic Hamiltonian dynamics
that give rise to the antisymmetric Liouville operator.
We set ω = (1/2)√(4ω_0² − γ²), i.e. δ = 2iω. The eigenvalues can be
written as

  λ_{nm} = (γ/2)(n + m) + iω(n − m).
Pavliotis (IC) StochProc January 16, 2011 219 / 367
[Figure omitted in this text version: scatter plot of the first few eigenvalues in the complex plane, Re(λ_nm) on the horizontal axis (0 to 3) and Im(λ_nm) on the vertical axis (−3 to 3).]

Figure: First few eigenvalues of L for γ = ω = 1.


Pavliotis (IC) StochProc January 16, 2011 220 / 367
In Figure 34 we present the first few eigenvalues of L in the
underdamped regime.
The eigenvalues are contained in a cone on the right half of the
complex plane. The cone is determined by
λ_{n0} = (γ/2) n + iωn and λ_{0m} = (γ/2) m − iωm.
The eigenvalues along the diagonal are real:
λnn = γn.
In the overdamped regime, γ > 2ω_0, all eigenvalues are real:

  λ_{nm} = (1/2) γ(n + m) + (1/2) √(γ² − 4ω_0²) (n − m), γ > 2ω_0.

In fact, in the overdamped limit γ → +∞, the eigenvalues of the
generator L converge to the eigenvalues of the generator of the
OU process:

  λ_{nm} = γn + (ω_0²/γ)(m − n) + O(γ^{−3}).
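The spectrum λ_{nm} = λ_1 n + λ_2 m is trivial to tabulate numerically in both regimes; the short sketch below (not in the notes; parameter values are illustrative assumptions) reproduces the cone of complex eigenvalues in the underdamped case and the real spectrum in the overdamped case.

```python
import numpy as np

# Sketch: first few eigenvalues lambda_{nm} = lambda_1 n + lambda_2 m of the generator
# for the quadratic potential, in the underdamped and overdamped regimes.
def eigenvalues(gamma, omega0, n_max=3):
    delta = np.sqrt(complex(gamma**2 - 4 * omega0**2))
    lam1, lam2 = (gamma + delta) / 2, (gamma - delta) / 2
    return np.array([[lam1 * n + lam2 * m for m in range(n_max)] for n in range(n_max)])

print(np.round(eigenvalues(gamma=1.0, omega0=1.0), 3))   # underdamped: complex conjugate pairs
print(np.round(eigenvalues(gamma=3.0, omega0=1.0), 3))   # overdamped: real eigenvalues
```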
Pavliotis (IC) StochProc January 16, 2011 221 / 367
The eigenfunctions of L do not form an orthonormal basis in
L²_β := L²(R², Z^{−1}e^{−βH}) since L is not a self-adjoint operator.
Using the eigenfunctions/eigenvalues of L we can easily calculate
the eigenfunctions/eigenvalues of the L²_β-adjoint of L.
The adjoint operator is

  L̂ := −A + γS
     = ω_0( b⁺a⁻ − a⁺b⁻ ) − γ a⁺a⁻
     = −λ_1 (c⁻)*(c⁺)* − λ_2 (d⁻)*(d⁺)*,

where

  (c⁺)* = (1/√δ)( √λ_1 a⁻ + √λ_2 b⁻ ),   (c⁻)* = (1/√δ)( √λ_1 a⁺ − √λ_2 b⁺ ), (100a)
  (d⁺)* = (1/√δ)( √λ_2 a⁻ + √λ_1 b⁻ ),   (d⁻)* = (1/√δ)( −√λ_2 a⁺ + √λ_1 b⁺ ). (100b)
Pavliotis (IC) StochProc January 16, 2011 222 / 367
Exercise
1 Show by direct substitution that L̂ can be written in the form

  L̂ = −λ_1 (c⁻)*(c⁺)* − λ_2 (d⁻)*(d⁺)*.

2 Calculate the commutators

  [(c⁺)*, (c⁻)*], [(d⁺)*, (d⁻)*], [(c^±)*, (d^±)*], [L̂, (c^±)*], [L̂, (d^±)*].

The operator L̂ has the same eigenvalues as L:

  −L̂ ψ_{nm} = λ_{nm} ψ_{nm},

where λ_{nm} are given by (95). The eigenfunctions are

  ψ_{nm} = (1/√(n! m!)) ((c⁻)*)^n ((d⁻)*)^m 1. (101)

Pavliotis (IC) StochProc January 16, 2011 223 / 367


Theorem
The eigenfunctions of L and L̂ satisfy the biorthonormality relation

  ∫∫ φ_{nm} ψ_{ℓk} ρ_β dp dq = δ_{nℓ} δ_{mk}. (102)

Proof. We will use formulas (98) and (99). Notice that using the third and fourth
of these equations together with the fact that c⁻1 = d⁻1 = 0 we can
conclude that (for n ≥ ℓ)

  (c⁻)^ℓ (c⁺)^n 1 = n(n − 1) . . . (n − ℓ + 1) (c⁺)^{n−ℓ} 1. (103)

We have

  ∫∫ φ_{nm} ψ_{ℓk} ρ_β dp dq
    = (1/√(n! m! ℓ! k!)) ∫∫ (c⁺)^n (d⁺)^m 1 ((c⁻)*)^ℓ ((d⁻)*)^k 1 ρ_β dp dq
    = ( n(n − 1) . . . (n − ℓ + 1) m(m − 1) . . . (m − k + 1) / √(n! m! ℓ! k!) )
      ∫∫ (c⁺)^{n−ℓ} (d⁺)^{m−k} 1 ρ_β dp dq
    = δ_{nℓ} δ_{mk},

since all eigenfunctions average to 0 with respect to ρ_β.
Pavliotis (IC) StochProc January 16, 2011 224 / 367
From the eigenfunctions of L̂ we can obtain the eigenfunctions of
the Fokker-Planck operator. Using the formula

  L*(f ρ_β) = ρ_β L̂ f,

we immediately conclude that the Fokker-Planck operator has
the same eigenvalues as those of L and L̂. The eigenfunctions
are

  ψ*_{nm} = ρ_β ψ_{nm} = ρ_β (1/√(n! m!)) ((c⁻)*)^n ((d⁻)*)^m 1. (104)

Pavliotis (IC) StochProc January 16, 2011 225 / 367


STOCHASTIC DIFFERENTIAL EQUATIONS (SDEs)

Pavliotis (IC) StochProc January 16, 2011 226 / 367


In this part of the course we will study stochastic differential
equations (SDEs): ODEs driven by Gaussian white noise.
Let W(t) denote a standard m-dimensional Brownian motion,
h : Z → R^d a smooth vector-valued function and γ : Z → R^{d×m} a
smooth matrix-valued function (in this course we will take
Z = T^d, R^d or R^ℓ ⊕ T^{d−ℓ}).
Consider the SDE

  dz/dt = h(z) + γ(z) dW/dt,   z(0) = z_0. (105)

We think of the term dW/dt as representing Gaussian white noise: a
mean-zero Gaussian process with correlation δ(t − s)I.
The function h in (105) is sometimes referred to as the drift and γ
as the diffusion coefficient.

Pavliotis (IC) StochProc January 16, 2011 227 / 367


Such a process exists only as a distribution. The precise
interpretation of (105) is as an integral equation for
z(t) ∈ C(R+ , Z):
z(t) = z_0 + ∫_0^t h(z(s)) ds + ∫_0^t γ(z(s)) dW(s). (106)

In order to make sense of this equation we need to define the


stochastic integral against W (s).

Pavliotis (IC) StochProc January 16, 2011 228 / 367


The Itô Stochastic Integral

For the rigorous analysis of stochastic differential equations it is


necessary to define stochastic integrals of the form
Z t
I(t) = f (s) dW (s), (107)
0

where W (t) is a standard one dimensional Brownian motion. This


is not straightforward because W (t) does not have bounded
variation.
In order to define the stochastic integral we assume that f (t) is a
random process, adapted to the filtration Ft generated by the
process W (t), and such that
E( ∫_0^T f(s)² ds ) < ∞.

Pavliotis (IC) StochProc January 16, 2011 229 / 367


The Itô stochastic integral I(t) is defined as the L2 –limit of the
Riemann sum approximation of (107):
I(t) := lim_{K→∞} Σ_{k=1}^{K−1} f(t_{k−1}) ( W(t_k) − W(t_{k−1}) ), (108)

where t_k = k∆t and K∆t = t.
Notice that the function f(t) is evaluated at the left end of each
interval [t_{k−1}, t_k] in (108).
The resulting Itô stochastic integral I(t) is a.s. continuous in t.
These ideas are readily generalized to the case where W (s) is a
standard d dimensional Brownian motion and f (s) ∈ Rm×d for
each s.

Pavliotis (IC) StochProc January 16, 2011 230 / 367
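A quick numerical illustration of the left-endpoint sums (108): for f = W the Itô integral ∫0^t W dW equals (W (t)^2 − t)/2, and the Riemann sums reproduce this value. This is only a sketch using numpy; the step number K is an arbitrary choice.

```python
# Left-endpoint Riemann sums (108) for the Ito integral of f = W against W;
# the exact value is (W(t)^2 - t)/2.
import numpy as np

rng = np.random.default_rng(1)
t, K = 1.0, 100_000
dt = t / K
dW = rng.normal(0.0, np.sqrt(dt), size=K)
W = np.concatenate(([0.0], np.cumsum(dW)))   # W(t_k), k = 0, ..., K

ito_sum = np.sum(W[:-1] * dW)                # f evaluated at the left endpoints
exact = 0.5 * (W[-1]**2 - t)
print(ito_sum, exact)                        # the two agree up to O(sqrt(dt))
```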


The resulting integral satisfies the Itô isometry

    E|I(t)|^2 = ∫0^t E|f (s)|^2_F ds,    (109)

where | · |F denotes the Frobenius norm |A|F = √(tr(A^T A)).
The Itô stochastic integral is a martingale:

    E I(t) = 0

and

    E[ I(t) | Fs ] = I(s)    ∀ t > s,

where Fs denotes the filtration generated by W (s).

Pavliotis (IC) StochProc January 16, 2011 231 / 367


Example
Consider the Itô stochastic integral

    I(t) = ∫0^t f (s) dW (s),

where f, W are scalar–valued. This is a martingale with quadratic
variation

    ⟨I⟩t = ∫0^t (f (s))^2 ds.

More generally, for f, W in arbitrary finite dimensions, the integral
I(t) is a martingale with quadratic variation

    ⟨I⟩t = ∫0^t (f (s) ⊗ f (s)) ds.

Pavliotis (IC) StochProc January 16, 2011 232 / 367


The Stratonovich Stochastic Integral

In addition to the Itô stochastic integral, we can also define the
Stratonovich stochastic integral. It is defined as the L2–limit of a
different Riemann sum approximation of (107), namely

    Istrat(t) := lim_{K→∞} Σ_{k=1}^{K−1} (1/2) ( f (tk−1) + f (tk) ) (W (tk) − W (tk−1)),    (110)

where tk = k∆t and K∆t = t. Notice that the function f (t) is
evaluated at both endpoints of each interval [tk−1, tk] in (110).
The multidimensional Stratonovich integral is defined in a similar
way. The resulting integral is written as

    Istrat(t) = ∫0^t f (s) ◦ dW (s).

Pavliotis (IC) StochProc January 16, 2011 233 / 367


The limit in (110) gives rise to an integral which differs from the Itô
integral.
The situation is more complex than that arising in the standard
theory of Riemann integration for functions of bounded variation:
in that case the points in [tk−1, tk] where the integrand is evaluated
do not affect the value of the integral defined via the limiting process.
In the case of integration against Brownian motion, which does not
have bounded variation, the limits differ.
When f and W are correlated through an SDE, then a formula
exists to convert between them.

Pavliotis (IC) StochProc January 16, 2011 234 / 367
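The difference between the two constructions can be seen numerically on a single Brownian path: for f = W the average-endpoint sums (110) converge to W (t)^2/2, exceeding the Itô value by t/2. Again only a sketch, with arbitrary parameters.

```python
# Stratonovich sums (110) versus Ito sums (108) on the same Brownian path:
# for f = W their difference converges to t/2.
import numpy as np

rng = np.random.default_rng(2)
t, K = 1.0, 100_000
dt = t / K
dW = rng.normal(0.0, np.sqrt(dt), size=K)
W = np.concatenate(([0.0], np.cumsum(dW)))

ito_sum = np.sum(W[:-1] * dW)
strat_sum = np.sum(0.5 * (W[:-1] + W[1:]) * dW)   # average of the two endpoints
print(strat_sum - ito_sum, 0.5 * t)               # difference is close to t/2
```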


Existence and Uniqueness of solutions for SDEs

Definition
By a solution of (105) we mean a Z-valued stochastic process {z(t)}
on t ∈ [0, T ] with the properties:
1 z(t) is continuous and Ft −adapted, where the filtration is
generated by the Brownian motion W (t);
2 h(z(t)) ∈ L1 ((0, T )), γ(z(t)) ∈ L2 ((0, T ));
3 equation (105) holds for every t ∈ [0, T ] with probability 1.
The solution is called unique if any two solutions xi (t), i = 1, 2 satisfy

    P( x1 (t) = x2 (t), ∀ t ∈ [0, T ] ) = 1.

Pavliotis (IC) StochProc January 16, 2011 235 / 367


It is well known that existence and uniqueness of solutions for
ODEs (i.e. when γ ≡ 0 in (105)) holds for globally Lipschitz vector
fields h(x).
A very similar theorem holds when γ ≠ 0.
As for ODEs the conditions can be weakened, when a priori
bounds on the solution can be found.

Pavliotis (IC) StochProc January 16, 2011 236 / 367


Theorem
Assume that both h(·) and γ(·) are globally Lipschitz on Z and that z0
is a random variable independent of the Brownian motion W (t) with

    E|z0|^2 < ∞.

Then the SDE (105) has a unique solution z(t) ∈ C(R+; Z) with

    E [ ∫0^T |z(t)|^2 dt ] < ∞    ∀ T < ∞.

Furthermore, the solution of the SDE is a Markov process.

Pavliotis (IC) StochProc January 16, 2011 237 / 367


Remarks
The Stratonovich analogue of (105) is

    dz/dt = h(z) + γ(z) ◦ dW/dt,    z(0) = z0.    (111)

By this we mean that z ∈ C(R+, Z) satisfies the integral equation

    z(t) = z(0) + ∫0^t h(z(s)) ds + ∫0^t γ(z(s)) ◦ dW (s).    (112)

By using definitions (108) and (110) it can be shown that z
satisfying the Stratonovich SDE (111) also satisfies the Itô SDE

    dz/dt = h(z) + (1/2) ∇ · ( γ(z)γ(z)^T ) − (1/2) γ(z) ∇ · γ(z)^T + γ(z) dW/dt,    (113a)
    z(0) = z0,    (113b)

provided that γ(z) is differentiable.
Pavliotis (IC) StochProc January 16, 2011 238 / 367
White noise is, in most applications, an idealization of a stationary
random process with short correlation time. In this context the
Stratonovich interpretation of an SDE is particularly important
because it often arises as the limit obtained by using smooth
approximations to white noise.
On the other hand the martingale machinery which comes with
the Itô integral makes it more important as a mathematical object.
It is very useful that we can convert from the Itô to the
Stratonovich interpretation of the stochastic integral.
There are other interpretations of the stochastic integral, e.g. the
Klimontovich stochastic integral.

Pavliotis (IC) StochProc January 16, 2011 239 / 367


The definition of Brownian motion implies the scaling property

    W (ct) = √c W (t),

where the above should be interpreted as holding in law. From
this it follows that, if s = ct, then

    dW/ds = (1/√c) dW/dt,

again in law.
Hence, if we scale time to s = ct in (105), then we get the equation

    dz/ds = (1/c) h(z) + (1/√c) γ(z) dW/ds,    z(0) = z0.

Pavliotis (IC) StochProc January 16, 2011 240 / 367


The Stratonovich Stochastic Integral: A first
application of multiscale methods

When white noise is approximated by a smooth process this often
leads to Stratonovich interpretations of stochastic integrals, at
least in one dimension.
We use multiscale analysis (singular perturbation theory for
Markov processes) to illustrate this phenomenon in a
one-dimensional example.
Consider the equations

    dx/dt = h(x) + (1/ε) f (x) y,    (114a)
    dy/dt = −(α/ε^2) y + √(2D/ε^2) dV/dt,    (114b)

with V being a standard one-dimensional Brownian motion.

Pavliotis (IC) StochProc January 16, 2011 241 / 367


We say that the process x(t) is driven by colored noise: the
noise that appears in (114a) has non-zero correlation time. The
correlation function of the colored noise η(t) := y(t)/ε is (we take
y(0) = 0)

    R(t − s) = E ( η(t)η(s) ) = (1/ε^2) (D/α) e^{−(α/ε^2)|t−s|}.

The power spectrum of the colored noise η(t) is:

    f^ε(x) = (1/ε^2) (D ε^{−2}/π) · 1/( x^2 + (α ε^{−2})^2 )
           = (D/π) · 1/( ε^4 x^2 + α^2 )  →  D/(π α^2)

and, consequently,

    lim_{ε→0} E ( (y(t)/ε) (y(s)/ε) ) = (2D/α^2) δ(t − s),

which implies the heuristic

    lim_{ε→0} y(t)/ε = √(2D/α^2) dV/dt.    (115)
Pavliotis (IC) StochProc January 16, 2011 242 / 367
Another way of seeing this is by solving (114b) for y/ε:

    y/ε = √(2D/α^2) dV/dt − (ε/α) dy/dt.    (116)

If we neglect the O(ε) term on the right hand side then we arrive,
again, at the heuristic (115).
Both of these arguments lead us to conjecture the limiting Itô SDE:

    dX/dt = h(X) + √(2D/α^2) f (X) dV/dt.    (117)

In fact, as applied, the heuristic gives the incorrect limit.

Pavliotis (IC) StochProc January 16, 2011 243 / 367


Whenever white noise is approximated by a smooth process, the
limiting equation should be interpreted in the Stratonovich sense,
giving

    dX/dt = h(X) + √(2D/α^2) f (X) ◦ dV/dt.    (118)

This is usually called the Wong-Zakai theorem. A similar result is
true in arbitrary finite and even infinite dimensions.
We will show this using singular perturbation theory.

Theorem
Assume that the initial conditions for y(t) are stationary and that the
function f is smooth. Then the solution x(t) of eqn (114a) converges, in
the limit as ε → 0, to the solution of the Stratonovich SDE (118).

Pavliotis (IC) StochProc January 16, 2011 244 / 367
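A Monte Carlo sketch of the Wong–Zakai statement: the colored-noise system (114) is integrated for a small ε and its mean at time T is compared with the predictions of the Stratonovich limit (118) and of the Itô SDE (117). The choices h(x) = −x, f(x) = x, D = 0.5, α = 1 and all numerical parameters are illustrative assumptions.

```python
# Colored noise (114) versus the white-noise limit: with h(x) = -x, f(x) = x the
# Stratonovich limit (118) has mean exp((D/alpha^2 - 1)t); the Ito SDE (117) would
# give exp(-t). All parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(3)
D, alpha, eps = 0.5, 1.0, 0.1
T, dt, M = 1.0, 1.0e-4, 2000                        # M independent samples
n_steps = int(T / dt)

x = np.ones(M)                                      # x(0) = 1
y = rng.normal(0.0, np.sqrt(D / alpha), size=M)     # stationary initial data for y
for _ in range(n_steps):
    dV = rng.normal(0.0, np.sqrt(dt), size=M)
    x += (-x + x * y / eps) * dt
    y += -(alpha / eps**2) * y * dt + np.sqrt(2 * D / eps**2) * dV

print(x.mean())                                     # close to the Stratonovich prediction
print(np.exp((D / alpha**2 - 1.0) * T))             # Stratonovich limit
print(np.exp(-T))                                   # Ito interpretation
```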


Remarks

1. It is possible to prove pathwise convergence under very mild
   assumptions.
2. The generator of the limiting Stratonovich SDE has the form

       Lstrat = h(x)∂x + (D/α^2) f (x) ∂x ( f (x) ∂x · ).

3. Consequently, the Fokker-Planck operator of the Stratonovich
   SDE can be written in divergence form:

       L∗strat · = −∂x ( h(x) · ) + (D/α^2) ∂x ( f (x) ∂x ( f (x) · ) ).

Pavliotis (IC) StochProc January 16, 2011 245 / 367


4. In most applications in physics the white noise is an approximation
of a more complicated noise process with non-zero correlation
time. Hence, the physically correct interpretation of the stochastic
integral is the Stratonovich one.

5. In higher dimensions an additional drift term might appear due to


the noncommutativity of the row vectors of the diffusion matrix. This
is related to the Lévy area correction in the theory of rough paths.

Pavliotis (IC) StochProc January 16, 2011 246 / 367


Proof of Proposition 61. The generator of the process (x(t), y(t)) is

    L = (1/ε^2) ( −α y ∂y + D ∂y^2 ) + (1/ε) f (x) y ∂x + h(x) ∂x
      =: (1/ε^2) L0 + (1/ε) L1 + L2.

The "fast" process is a stationary Markov process with invariant
density

    ρ(y) = √( α/(2πD) ) e^{−α y^2/(2D)}.    (119)

The backward Kolmogorov equation is

    ∂u^ε/∂t = ( (1/ε^2) L0 + (1/ε) L1 + L2 ) u^ε.    (120)

We look for a solution to this equation in the form of a power series
expansion in ε:

    u^ε(x, y, t) = u0 + ε u1 + ε^2 u2 + . . .

Pavliotis (IC) StochProc January 16, 2011 247 / 367


We substitute this into (120) and equate terms of the same power in ε
to obtain the following hierarchy of equations:

    −L0 u0 = 0,
    −L0 u1 = L1 u0,
    −L0 u2 = L1 u1 + L2 u0 − ∂u0/∂t.

The ergodicity of the fast process implies that the null space of the
generator L0 consists only of constants in y. Hence:

    u0 = u(x, t).

The second equation in the hierarchy becomes

    −L0 u1 = f (x) y ∂x u.

Pavliotis (IC) StochProc January 16, 2011 248 / 367


This equation is solvable since the right hand side is orthogonal to the
null space of the adjoint of L0 (this is the Fredholm alternative). We
solve it using separation of variables:

    u1(x, y, t) = (1/α) f (x) ∂x u  y + ψ1(x, t).

In order for the third equation to have a solution we need to require
that the right hand side is orthogonal to the null space of L∗0:

    ∫_R ( L1 u1 + L2 u0 − ∂u0/∂t ) ρ(y) dy = 0.

We calculate:

    ∫_R (∂u0/∂t) ρ(y) dy = ∂u/∂t.

Furthermore:

Pavliotis (IC) StochProc January 16, 2011 249 / 367


    ∫_R L2 u0 ρ(y) dy = h(x) ∂x u.

Finally,

    ∫_R L1 u1 ρ(y) dy = ∫_R f (x) y ∂x ( (1/α) f (x) ∂x u  y + ψ1(x, t) ) ρ(y) dy
                      = (1/α) f (x) ∂x ( f (x) ∂x u ) ⟨y^2⟩ + f (x) ∂x ψ1(x, t) ⟨y⟩
                      = (D/α^2) f (x) ∂x ( f (x) ∂x u )
                      = (D/α^2) f (x) ∂x f (x) ∂x u + (D/α^2) f (x)^2 ∂x^2 u.

Putting everything together we obtain the limiting backward
Kolmogorov equation

    ∂u/∂t = ( h(x) + (D/α^2) f (x) ∂x f (x) ) ∂x u + (D/α^2) f (x)^2 ∂x^2 u,

from which we read off the limiting Stratonovich SDE

    dX/dt = h(X) + √(2D/α^2) f (X) ◦ dV/dt.

Pavliotis (IC) StochProc January 16, 2011 250 / 367
A Stratonovich SDE

    dX (t) = f (X (t)) dt + σ(X (t)) ◦ dW (t)    (121)

can be written as an Itô SDE

    dX (t) = ( f (X (t)) + (1/2) σ(X (t)) (dσ/dx)(X (t)) ) dt + σ(X (t)) dW (t).

Conversely, an Itô SDE

    dX (t) = f (X (t)) dt + σ(X (t)) dW (t)    (122)

can be written as a Stratonovich SDE

    dX (t) = ( f (X (t)) − (1/2) σ(X (t)) (dσ/dx)(X (t)) ) dt + σ(X (t)) ◦ dW (t).

The Itô and Stratonovich interpretations of an SDE can lead to
equations with very different properties!
Pavliotis (IC) StochProc January 16, 2011 251 / 367
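The conversion formulas can be checked numerically: the Heun (predictor–corrector) scheme converges to the Stratonovich solution, so integrating (121) with Heun and the converted Itô SDE with Euler–Maruyama, using the same Brownian increments, should give nearly identical paths. The functions f, σ and all parameters below are illustrative choices.

```python
# Stratonovich SDE (121) via the Heun scheme versus Euler-Maruyama for the
# converted Ito SDE with drift f + (1/2) sigma sigma'; same Brownian increments.
import numpy as np

rng = np.random.default_rng(4)
f = lambda x: -x
sigma = lambda x: 0.5 * x
dsigma = lambda x: 0.5                     # d(sigma)/dx, needed for the conversion

T, N = 1.0, 20_000
dt = T / N
x_strat, x_ito = 1.0, 1.0
for _ in range(N):
    dW = rng.normal(0.0, np.sqrt(dt))
    # Heun step (converges to the Stratonovich solution)
    x_pred = x_strat + f(x_strat) * dt + sigma(x_strat) * dW
    x_strat += 0.5 * (f(x_strat) + f(x_pred)) * dt \
               + 0.5 * (sigma(x_strat) + sigma(x_pred)) * dW
    # Euler-Maruyama for the equivalent Ito SDE
    drift = f(x_ito) + 0.5 * sigma(x_ito) * dsigma(x_ito)
    x_ito += drift * dt + sigma(x_ito) * dW

print(x_strat, x_ito)                      # the two endpoints closely agree
```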
Multiplicative Noise.
When the diffusion coefficient depends on the solution of the SDE
X (t), we will say that we have an equation with multiplicative
noise.
Multiplicative noise can lead to noise induced phase transitions.
See
W. Horsthemke and R. Lefever, Noise-induced transitions,
Springer-Verlag, Berlin 1984.
This is a topic of current interest for SDEs in infinite dimensions
(SPDEs).

Pavliotis (IC) StochProc January 16, 2011 252 / 367


Colored Noise

When the noise which drives an SDE has non-zero correlation


time we will say that we have colored noise.
The properties of the SDE (stability, ergodicity etc.) are quite
robust under "coloring of the noise". See
G. Blankenship and G.C. Papanicolaou, Stability and control of
stochastic systems with wide-band noise disturbances. I, SIAM J.
Appl. Math., 34(3), 1978, pp. 437–476.
Colored noise appears in many applications in physics and
chemistry. For a review see
P. Hanggi and P. Jung Colored noise in dynamical systems. Adv.
Chem. Phys. 89 239 (1995).

Pavliotis (IC) StochProc January 16, 2011 253 / 367


In the case where there is an additional small time scale in the
problem, in addition to the correlation time of the colored noise, it
is not clear what the right interpretation of the stochastic integral is
(in the limit as both small time scales go to 0). This is usually
called the Itô versus Stratonovich problem.
Consider, for example, the SDE

τ Ẍ = −Ẋ + v(X )η ε (t),

where η ε (t) is colored noise with correlation time ε2 .


In the limit where both small time scales go to 0 we can get either
Itô or Stratonovich or neither. See
G.A. Pavliotis and A.M. Stuart, Analysis of white noise limits for
stochastic systems with two fast relaxation times, Multiscale Model.
Simul., 4(1), 2005, pp. 1-35.

Pavliotis (IC) StochProc January 16, 2011 254 / 367


Given the function γ(z) in the SDE (105) we define

    Γ(z) = γ(z)γ(z)^T.    (123)

The generator L is then defined as

    Lv = h · ∇v + (1/2) Γ : ∇∇v.    (124)

This operator, equipped with a suitable domain of definition, is the
generator of the Markov process given by (105).
The formal L2–adjoint operator L∗ is

    L∗v = −∇ · (hv) + (1/2) ∇ · ∇ · (Γv).
2

Pavliotis (IC) StochProc January 16, 2011 255 / 367
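A small symbolic sanity check of the generator/adjoint pair (123)–(124) in one dimension: for dX = −V′(X) dt + √(2β⁻¹) dW the Fokker–Planck operator L∗ should annihilate the Gibbs density e^{−βV}. The potential below is an arbitrary illustrative choice; sympy is assumed to be available.

```python
# Check that L* rho = 0 for rho = exp(-beta V) when h = -V' and Gamma = 2/beta.
import sympy as sp

x, beta = sp.symbols('x beta', positive=True)
V = x**4 / 4 - x**2 / 2                    # illustrative potential
h = -sp.diff(V, x)                         # drift
Gamma = 2 / beta                           # gamma * gamma^T for this SDE

rho = sp.exp(-beta * V)                    # unnormalized invariant density
Lstar_rho = -sp.diff(h * rho, x) + sp.Rational(1, 2) * sp.diff(Gamma * rho, x, 2)
print(sp.simplify(Lstar_rho))              # prints 0
```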


The Itô formula enables us to calculate the rate of change in time
of functions V : Z → Rn evaluated at the solution of a Z-valued
SDE.
Formally, we can write:

    (d/dt) V (z(t)) = LV (z(t)) + ⟨ ∇V (z(t)), γ(z(t)) dW/dt ⟩.

Note that if W were a smooth time-dependent function this
formula would not be correct: there is an additional term in LV,
proportional to Γ, which arises from the lack of smoothness of
Brownian motion.
The precise interpretation of the expression for the rate of change
of V is in integrated form:

Pavliotis (IC) StochProc January 16, 2011 256 / 367


Lemma
(Itô's Formula) Assume that the conditions of Theorem 60 hold. Let
z(t) solve (105) and let V ∈ C^2(Z, Rn). Then the process V (z(t))
satisfies

    V (z(t)) = V (z(0)) + ∫0^t LV (z(s)) ds + ∫0^t ⟨ ∇V (z(s)), γ(z(s)) dW (s) ⟩.

Let φ : Z → R and consider the function

    v(z, t) = E ( φ(z(t)) | z(0) = z ),    (125)

where the expectation is with respect to all Brownian driving
paths. By averaging in the Itô formula, which removes the
stochastic integral, and using the Markov property, it is possible to
obtain the Backward Kolmogorov equation.

Pavliotis (IC) StochProc January 16, 2011 257 / 367


For a Stratonovich SDE the rules of standard calculus apply:
Consider the Stratonovich SDE (121) and let V (x) ∈ C^2(R). Then

    dV (X (t)) = (dV/dx)(X (t)) ( f (X (t)) dt + σ(X (t)) ◦ dW (t) ).

Consider the Stratonovich SDE (121) on Rd (i.e. f ∈ Rd,
σ : Rd → Rd×n, W (t) is standard Brownian motion on Rn). The
corresponding Fokker-Planck equation is:

    ∂ρ/∂t = −∇ · (f ρ) + (1/2) ∇ · ( σ ∇ · (σρ) ).    (126)

Pavliotis (IC) StochProc January 16, 2011 258 / 367


1. The SDE for Brownian motion is:

    dX = √(2σ) dW,    X (0) = x.

The solution is:

    X (t) = x + √(2σ) W (t).

Pavliotis (IC) StochProc January 16, 2011 259 / 367


2. The SDE for the Ornstein-Uhlenbeck process is

    dX = −αX dt + √(2λ) dW,    X (0) = x.

We can solve this equation using the variation of constants formula:

    X (t) = e^{−αt} x + √(2λ) ∫0^t e^{−α(t−s)} dW (s).

We can use Itô's formula to obtain equations for the moments of the
OU process. The generator is:

    L = −α x ∂x + λ ∂x^2.

Pavliotis (IC) StochProc January 16, 2011 260 / 367


We apply Itô’s formula to the function f (x) = x n to obtain:

dX (t)n = LX (t)n dt + 2λ∂X (t)n dW
= −αnX (t)n dt + λn(n − 1)X (t)n−2 dt

+n 2λX (t)n−1 dW .

Consequently:
Z t 
n n
X (t) = x + −αnX (t)n + λn(n − 1)X (t)n−2 dt
0
√ Z t
+n 2λ X (t)n−1 dW .
0

By taking the expectation in the above equation we obtain the


equation for the moments of the OU process that we derived
earlier using the Fokker-Planck equation:
Z t
Mn (t) = x n + (−αnMn (s) + λn(n − 1)Mn−2 (s)) ds.
0

Pavliotis (IC) StochProc January 16, 2011 261 / 367
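The moment equation can be integrated numerically and compared both with a Monte Carlo estimate from the SDE and with the exact second moment. All parameter values below are illustrative.

```python
# Integrate the moment hierarchy M_n' = -alpha n M_n + lambda n(n-1) M_{n-2} for the
# OU process and compare M_2(t) with Monte Carlo and with the exact formula.
import numpy as np

alpha, lam, x0 = 1.0, 0.5, 1.0
T, N = 2.0, 2000
dt = T / N

M = {0: 1.0, 1: x0, 2: x0**2, 3: x0**3, 4: x0**4}   # M_0 = 1, M_{-1} treated as 0
for _ in range(N):
    new = dict(M)
    for n in range(1, 5):
        new[n] = M[n] + dt * (-alpha * n * M[n] + lam * n * (n - 1) * M.get(n - 2, 0.0))
    M = new

rng = np.random.default_rng(5)                      # Monte Carlo for E[X(T)^2]
X = np.full(20_000, x0)
for _ in range(N):
    X += -alpha * X * dt + np.sqrt(2 * lam) * rng.normal(0.0, np.sqrt(dt), X.size)

print(M[2], np.mean(X**2))
print(x0**2 * np.exp(-2 * alpha * T) + (lam / alpha) * (1 - np.exp(-2 * alpha * T)))
```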


3. Consider the geometric Brownian motion

    dX (t) = µX (t) dt + σX (t) dW (t),    (127)

where we use the Itô interpretation of the stochastic differential. The
generator of this process is

    L = µ x ∂x + (σ^2 x^2/2) ∂x^2.

The solution to this equation is

    X (t) = X (0) exp ( (µ − σ^2/2) t + σ W (t) ).    (128)

Pavliotis (IC) StochProc January 16, 2011 262 / 367


To derive this formula, we apply Itô's formula to the function
f (x) = log(x):

    d log(X (t)) = L log(X (t)) dt + σ x ∂x log(X (t)) dW (t)
                 = ( µ x (1/x) + (σ^2 x^2/2) (−1/x^2) ) dt + σ dW (t)
                 = ( µ − σ^2/2 ) dt + σ dW (t).

Consequently:

    log ( X (t)/X (0) ) = ( µ − σ^2/2 ) t + σ W (t),

from which (128) follows.

Pavliotis (IC) StochProc January 16, 2011 263 / 367


Notice that the Stratonovich interpretation of this equation leads to
the solution
X (t) = X (0) exp(µt + σW (t))
Exercise: calculate all moments of the geometric Brownian motion
for the Itô and Stratonovich interpretations of the stochastic
integral.

Pavliotis (IC) StochProc January 16, 2011 264 / 367
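As a partial answer to the exercise, the first moments can be checked by Monte Carlo using the explicit solutions: the Itô solution (128) has mean X(0)e^{µt}, while the Stratonovich solution has mean X(0)e^{(µ+σ²/2)t}. Parameters below are arbitrary.

```python
# First moment of geometric Brownian motion under the two interpretations.
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, x0, t = 0.1, 0.4, 1.0, 1.0
W = rng.normal(0.0, np.sqrt(t), size=200_000)

X_ito = x0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)   # solution (128)
X_strat = x0 * np.exp(mu * t + sigma * W)                    # Stratonovich solution

print(X_ito.mean(), np.exp(mu * t))                          # ~ exp(mu t)
print(X_strat.mean(), np.exp((mu + 0.5 * sigma**2) * t))     # ~ exp((mu + sigma^2/2) t)
```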


Consider the Landau equation:

    dXt/dt = Xt (c − Xt^2),    X0 = x.    (129)

This is a gradient flow for the potential V (x) = (1/4) x^4 − (1/2) c x^2.
When c < 0 all solutions are attracted to the single steady state
X∗ = 0.
When c > 0 the steady state X∗ = 0 becomes unstable and
Xt → √c if x > 0 and Xt → −√c if x < 0.

Pavliotis (IC) StochProc January 16, 2011 265 / 367


Consider additive random perturbations to the Landau equation:

    dXt/dt = Xt (c − Xt^2) + √(2σ) dWt/dt,    X0 = x.    (130)

This equation defines an ergodic Markov process on R: there
exists a unique invariant distribution:

    ρ(x) = Z^{−1} e^{−V (x)/σ},    Z = ∫_R e^{−V (x)/σ} dx,    V (x) = (1/4) x^4 − (1/2) c x^2.

ρ(x) is a probability density for all values of c ∈ R.
The presence of additive noise in some sense "trivializes" the
dynamics.
The dependence of various averaged quantities on c resembles
the physical situation of a second order phase transition.

Pavliotis (IC) StochProc January 16, 2011 266 / 367


Consider now multiplicative perturbations of the Landau equation:

    dXt/dt = Xt (c − Xt^2) + √(2σ) Xt dWt/dt,    X0 = x,    (131)

where the stochastic differential is interpreted in the Itô sense.
The generator of this process is

    L = x (c − x^2) ∂x + σ x^2 ∂x^2.

Notice that Xt = 0 is always a solution of (131). Thus, if we start
with x > 0 (x < 0) the solution will remain positive (negative).
We will assume that x > 0.

Pavliotis (IC) StochProc January 16, 2011 267 / 367


Consider the function Yt = log(Xt). We apply Itô's formula to this
function:

    dYt = L log(Xt) dt + √(2σ) Xt ∂x log(Xt) dWt
        = ( Xt (c − Xt^2) (1/Xt) − σ Xt^2 (1/Xt^2) ) dt + √(2σ) Xt (1/Xt) dWt
        = (c − σ) dt − Xt^2 dt + √(2σ) dWt.

Thus, we have been able to transform (131) into an SDE with
additive noise:

    dYt = [ (c − σ) − e^{2Yt} ] dt + √(2σ) dWt.    (132)

This is a gradient flow with potential

    V (y) = − [ (c − σ) y − (1/2) e^{2y} ].

Pavliotis (IC) StochProc January 16, 2011 268 / 367


The invariant measure, if it exists, is of the form

    ρ(y) dy = Z^{−1} e^{−V (y)/σ} dy.

Going back to the variable x we obtain:

    ρ(x) dx = Z^{−1} x^{(c/σ − 2)} e^{−x^2/(2σ)} dx.

We need to make sure that this distribution is integrable:

    Z = ∫0^{+∞} x^γ e^{−x^2/(2σ)} dx < ∞,    γ = c/σ − 2.

For this it is necessary that

    γ > −1  ⇒  c > σ.

Pavliotis (IC) StochProc January 16, 2011 269 / 367
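A sketch checking the invariant density: Euler–Maruyama for (131) with the illustrative values c = 2, σ = 0.5 (so that c > σ) and a comparison of the empirical second moment with the one computed from ρ(x) ∝ x^{c/σ−2} e^{−x²/(2σ)}.

```python
# Euler-Maruyama for the multiplicative Landau equation (131) and the second
# moment of the invariant density rho(x) ~ x^{c/sigma - 2} exp(-x^2/(2 sigma)).
import numpy as np

rng = np.random.default_rng(7)
c, sigma = 2.0, 0.5
T, dt, M = 10.0, 1.0e-3, 5000
X = np.ones(M)
for _ in range(int(T / dt)):
    dW = rng.normal(0.0, np.sqrt(dt), size=M)
    X += X * (c - X**2) * dt + np.sqrt(2 * sigma) * X * dW

# second moment of the invariant density by a simple Riemann sum
x = np.linspace(1e-6, 6.0, 200_000)
rho = x**(c / sigma - 2.0) * np.exp(-x**2 / (2 * sigma))
print(np.mean(X**2), np.sum(x**2 * rho) / np.sum(rho))    # both close to 3/2 here
```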


Not all multiplicative random perturbations lead to ergodic
behavior!
The dependence of the invariant distribution on c is similar to the
physical situation of first order phase transitions.
Exercise Analyze this problem for the Stratonovich interpretation
of the stochastic integral.
Exercise Study additive and multiplicative random perturbations
of the ODE
dx
= x(c + 2x 2 − x 4 ).
dt
For more information see M.C. Mackey, A. Longtin, A. Lasota
Noise-Induced Global Asymptotic Stability, J. Stat. Phys. 60
(5/6) pp. 735-751.

Pavliotis (IC) StochProc January 16, 2011 270 / 367


Theorem
Assume that φ is chosen sufficiently smooth so that the backward
Kolmogorov equation

    ∂v/∂t = Lv    for (z, t) ∈ Z × (0, ∞),
    v = φ    for (z, t) ∈ Z × {0},    (133)

has a unique classical solution v(z, t) ∈ C^{2,1}(Z × (0, ∞)). Then v is
given by (125), where z(t) solves (106).

Pavliotis (IC) StochProc January 16, 2011 271 / 367


Now we can derive rigorously the Fokker-Planck equation.

Theorem
Consider equation (106) with z(0) a random variable with density
ρ0 (z). Assume that the law of z(t) has a density
ρ(z, t) ∈ C 2,1 (Z × (0, ∞)). Then ρ satisfies the Fokker-Planck
equation
    ∂ρ/∂t = L∗ρ    for (z, t) ∈ Z × (0, ∞),    (134a)
    ρ = ρ0    for (z, t) ∈ Z × {0}.    (134b)

Pavliotis (IC) StochProc January 16, 2011 272 / 367


Proof

Let Eµ denote averaging with respect to the product measure
induced by the measure µ with density ρ0 on z(0) and the
independent driving Wiener measure on the SDE itself.
Averaging over random z(0) distributed with density ρ0(z), we find

    Eµ ( φ(z(t)) ) = ∫_Z v(z, t) ρ0(z) dz
                   = ∫_Z ( e^{Lt} φ )(z) ρ0(z) dz
                   = ∫_Z ( e^{L∗t} ρ0 )(z) φ(z) dz.

But since ρ(z, t) is the density of z(t) we also have

    Eµ ( φ(z(t)) ) = ∫_Z ρ(z, t) φ(z) dz.

Pavliotis (IC) StochProc January 16, 2011 273 / 367


Equating these two expressions for the expectation at time t we
obtain

    ∫_Z ( e^{L∗t} ρ0 )(z) φ(z) dz = ∫_Z ρ(z, t) φ(z) dz.

We use a density argument so that the identity can be extended to
all φ ∈ L2(Z). Hence, from the above equation we deduce that

    ρ(z, t) = ( e^{L∗t} ρ0 )(z).

Differentiation of the above equation gives (134a).
Setting t = 0 gives the initial condition (134b).

Pavliotis (IC) StochProc January 16, 2011 274 / 367


THE SMOLUCHOWSKI AND FREIDLIN-WENTZELL LIMITS

Pavliotis (IC) StochProc January 16, 2011 275 / 367


There are very few SDEs/Fokker-Planck equations that can be
solved explicitly.
In most cases we need to study the problem under investigation
either approximately or numerically.
In this part of the course we will develop approximate methods for
studying various stochastic systems of practical interest.

Pavliotis (IC) StochProc January 16, 2011 276 / 367


There are many problems of physical interest that can be
analyzed using techniques from perturbation theory and
asymptotic analysis:

1 Small noise asymptotics at finite time intervals.


2 Small noise asymptotics/large times (rare events): the theory of
large deviations, escape from a potential well, exit time problems.
3 Small and large friction asymptotics for the Fokker-Planck
equation: The Freidlin–Wentzell (underdamped) and
Smoluchowski (overdamped) limits.
4 Large time asymptotics for the Langevin equation in a periodic
potential: homogenization and averaging.
5 Stochastic systems with two characteristic time scales: multiscale
problems and methods.

Pavliotis (IC) StochProc January 16, 2011 277 / 367


We will study various asymptotic limits for the Langevin equation
(we have set m = 1)

    q̈ = −∇V (q) − γ q̇ + √(2γβ^{−1}) Ẇ.    (135)

There are two parameters in the problem, the friction coefficient γ
and the inverse temperature β.
We want to study the qualitative behavior of solutions to this
equation (and to the corresponding Fokker-Planck equation).
There are various asymptotic limits at which we can eliminate
some of the variables of the equation and obtain a simpler
equation for fewer variables.
In the high temperature limit, β ≪ 1, the dynamics of (135) is
dominated by diffusion: the Langevin equation (135) can be
approximated by free Brownian motion:

    q̇ = √(2γβ^{−1}) Ẇ.

Pavliotis (IC) StochProc January 16, 2011 278 / 367


The small temperature asymptotics, β ≫ 1 is much more
interesting and more subtle. It leads to exponential, Arrhenius
type asymptotics for the reaction rate (in the case of a particle
escaping from a potential well due to thermal noise) or the
diffusion coefficient (in the case of a particle moving in a periodic
potential in the presence of thermal noise)

κ = ν exp (−βEb ) , (136)

where κ can be either the reaction rate or the diffusion coefficient.


The small temperature asymptotics will be studied later for the
case of a bistable potential (reaction rate) and for the case of a
periodic potential (diffusion coefficient).

Pavliotis (IC) StochProc January 16, 2011 279 / 367


Assuming that the temperature is fixed, the only parameter that is
left is the friction coefficient γ. The large and small friction
asymptotics can be expressed in terms of a slow/fast system of
SDEs.
In many applications (especially in biology) the friction coefficient
is large: γ ≫ 1. In this case the momentum is the fast variable
which we can eliminate to obtain an equation for the position. This
is the overdamped or Smoluchowski limit.
In various problems in physics the friction coefficient is small:
γ ≪ 1. In this case the position is the fast variable whereas the
energy is the slow variable. We can eliminate the position and
obtain an equation for the energy. This is the underdamped or
Freidlin-Wentzell limit.
In both cases we have to look at sufficiently long time scales.

Pavliotis (IC) StochProc January 16, 2011 280 / 367


We rescale the solution to (135):

    q^γ(t) = λγ q(t/µγ).

This rescaled process satisfies the equation

    q̈^γ = −(λγ/µγ^2) ∂q V (q^γ/λγ) − (γ/µγ) q̇^γ + √( 2γ λγ^2 µγ^{−3} β^{−1} ) Ẇ.    (137)

Different choices for these two parameters lead to the
overdamped and underdamped limits:

Pavliotis (IC) StochProc January 16, 2011 281 / 367


λγ = 1, µγ = γ^{−1}, γ ≫ 1.
In this case equation (137) becomes

    γ^{−2} q̈^γ = −∂q V (q^γ) − q̇^γ + √(2β^{−1}) Ẇ.    (138)

Under this scaling, the interesting limit is the overdamped limit,
γ ≫ 1.
We will see later that in the limit as γ → +∞ the solution to (138)
can be approximated by the solution to

    q̇ = −∂q V + √(2β^{−1}) Ẇ.

Pavliotis (IC) StochProc January 16, 2011 282 / 367


λγ = 1, µγ = γ, γ ≪ 1:

    q̈^γ = −γ^{−2} ∇V (q^γ) − q̇^γ + √(2γ^{−2} β^{−1}) Ẇ.    (139)

Under this scaling the interesting limit is the underdamped limit,
γ ≪ 1.
We will see later that in the limit as γ → 0 the energy of the
solution to (139) converges to a stochastic process on a graph.

Pavliotis (IC) StochProc January 16, 2011 283 / 367


We consider the rescaled Langevin equation (138):

    ε^2 q̈^γ(t) = −∇V (q^γ(t)) − q̇^γ(t) + √(2β^{−1}) Ẇ (t),    (140)

where we have set ε^{−1} = γ, since we are interested in the limit
γ → ∞, i.e. ε → 0.
We will show that, in the limit as ε → 0, q^γ(t), the solution of the
Langevin equation (140), converges to q(t), the solution of the
Smoluchowski equation

    q̇ = −∇V + √(2β^{−1}) Ẇ.    (141)

Pavliotis (IC) StochProc January 16, 2011 284 / 367


We write (140) as a system of SDEs:

    q̇ = (1/ε) p,    (142)
    ṗ = −(1/ε) ∇V (q) − (1/ε^2) p + √( 2/(βε^2) ) Ẇ.    (143)

This system of SDEs defines a Markov process in phase space.
Its generator is

    Lε = (1/ε^2) ( −p · ∇p + β^{−1} ∆p ) + (1/ε) ( p · ∇q − ∇q V · ∇p )
       =: (1/ε^2) L0 + (1/ε) L1.

This is a singularly perturbed differential operator.
We will derive the Smoluchowski equation (141) using a pathwise
technique, as well as by analyzing the corresponding Kolmogorov
equations.
Pavliotis (IC) StochProc January 16, 2011 285 / 367
We apply Itô’s formula to p:
1 p −1
dp(t) = Lε p(t) dt + 2β ∂p p(t) dW
ε
1 1 1 p −1
= − 2 p(t) dt − ∇q V (q(t)) dt + 2β dW .
ε ε ε
Consequently:
Z Z t p
1 t
p(s) ds = − ∇q V (q(s)) ds + 2β −1 W (t) + O(ε).
ε 0 0

From equation (142) we have that


Z t
1
q(t) = q(0) + p(s) ds.
ε 0

Combining the above two equations we deduce


Z t p
q(t) = q(0) − ∇q V (q(s)) ds + 2β −1 W (t) + O(ε)
0

from which (141) follows.


Pavliotis (IC) StochProc January 16, 2011 286 / 367
Notice that in this derivation we assumed that

    E|p(t)|^2 ≤ C.

This estimate is true, under appropriate assumptions on the
potential V (q) and on the initial conditions.
In fact, we can prove a pathwise approximation result:

    ( E sup_{t∈[0,T]} |q^γ(t) − q(t)|^p )^{1/p} ≤ C ε^{2−κ},

where κ > 0 is arbitrarily small (it accounts for logarithmic
corrections).

Pavliotis (IC) StochProc January 16, 2011 287 / 367
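The pathwise statement can be illustrated numerically: the system (142)–(143) and the Smoluchowski equation (141) are driven by the same Brownian increments and the sup-difference of the positions is computed for two values of ε. The double-well potential and all numerical parameters are illustrative, and the time step must resolve the fast O(ε²) scale.

```python
# Pathwise comparison of the rescaled Langevin system (142)-(143) with the
# Smoluchowski limit (141), both driven by the same Brownian increments.
import numpy as np

def sup_error(eps, T=2.0, dt=1.0e-4, beta=1.0, seed=0):
    Vp = lambda q: q**3 - q                      # V(q) = q^4/4 - q^2/2 (illustrative)
    rng = np.random.default_rng(seed)
    q, p, q_lim, err = 0.0, 0.0, 0.0, 0.0
    for _ in range(int(T / dt)):
        dW = rng.normal(0.0, np.sqrt(dt))
        dq = (p / eps) * dt
        dp = (-Vp(q) / eps - p / eps**2) * dt + np.sqrt(2.0 / (beta * eps**2)) * dW
        q, p = q + dq, p + dp                    # underdamped system
        q_lim += -Vp(q_lim) * dt + np.sqrt(2.0 / beta) * dW   # Smoluchowski limit
        err = max(err, abs(q - q_lim))
    return err

print(sup_error(0.2), sup_error(0.1))            # the error typically decreases with eps
```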


For the rigorous proof see
E. Nelson, Dynamical Theories of Brownian Motion, Princeton
University press 1967.
A similar approximation theorem is also valid in infinite dimensions
(i.e. for SPDEs):
S. Cerrai and M. Freidlin, On the Smoluchowski-Kramers
approximation for a system with an infinite number of degrees of
freedom, Probab. Theory Related Fields, 135 (3), 2006, pp.
363–394.

Pavliotis (IC) StochProc January 16, 2011 288 / 367


The pathwise derivation of the Smoluchowski equation implies
that the solution of the Fokker-Planck equation corresponding to
the Langevin equation (140) converges (in some appropriate
sense to be explained below) to the solution of the Fokker-Planck
equation corresponding to the Smoluchowski equation (141).
It is important in various applications to calculate corrections to
the limiting Fokker-Planck equation.
We can accomplish this by analyzing the Fokker-Planck equation
for (140) using singular perturbation theory.
We will consider the problem in one dimension. This is mainly to
simplify the notation. The multi-dimensional problem can be
treated in a very similar way.

Pavliotis (IC) StochProc January 16, 2011 289 / 367


The Fokker–Planck equation associated to equations (142) and
(143) is

    ∂ρ/∂t = L∗ρ
          = (1/ε) ( −p ∂q ρ + ∂q V (q) ∂p ρ ) + (1/ε^2) ( ∂p (pρ) + β^{−1} ∂p^2 ρ )
          =: ( (1/ε^2) L∗0 + (1/ε) L∗1 ) ρ.    (144)

The invariant distribution of the Markov process {q, p}, if it exists,
is

    ρ0(p, q) = (1/Z) e^{−βH(p,q)},    Z = ∫_{R^2} e^{−βH(p,q)} dpdq,

where H(p, q) = (1/2) p^2 + V (q). We define the function f (p, q, t)
through

    ρ(p, q, t) = f (p, q, t) ρ0(p, q).    (145)

Pavliotis (IC) StochProc January 16, 2011 290 / 367


Theorem
The function f (p, q, t) defined in (145) satisfies the equation

    ∂f/∂t = ( (1/ε^2) ( −p ∂p + β^{−1} ∂p^2 ) − (1/ε) ( p ∂q − ∂q V (q) ∂p ) ) f
          =: ( (1/ε^2) L0 − (1/ε) L1 ) f.    (146)

Remark
This is "almost" the backward Kolmogorov equation, with the
difference that we have −L1 instead of L1. This is related to the
fact that L0 is a symmetric operator in L2(R^2; Z^{−1} e^{−βH(p,q)}),
whereas L1 is antisymmetric.

Pavliotis (IC) StochProc January 16, 2011 291 / 367


Proof.
We note that L∗0 ρ0 = 0 and L∗1 ρ0 = 0. We use this to calculate:

    L∗0 ρ = L∗0 (f ρ0) = ∂p (p f ρ0) + β^{−1} ∂p^2 (f ρ0)
          = ρ0 p ∂p f + ρ0 β^{−1} ∂p^2 f + f L∗0 ρ0 + 2β^{−1} ∂p f ∂p ρ0
          = ( −p ∂p f + β^{−1} ∂p^2 f ) ρ0 = ρ0 L0 f.

Similarly,

    L∗1 ρ = L∗1 (f ρ0) = ( −p ∂q + ∂q V ∂p ) (f ρ0)
          = ρ0 ( −p ∂q f + ∂q V ∂p f ) = −ρ0 L1 f.

Consequently, the Fokker–Planck equation (144) becomes

    ρ0 ∂f/∂t = ρ0 ( (1/ε^2) L0 f − (1/ε) L1 f ),

from which the claim follows.


Pavliotis (IC) StochProc January 16, 2011 292 / 367
We look for a solution to (146) in the form of a power series in ε:

    f (p, q, t) = Σ_{n=0}^{∞} ε^n fn(p, q, t).    (147)

We substitute this expansion into eqn. (146) to obtain the following
system of equations:

    L0 f0 = 0,    (148)
    L0 f1 = L1 f0,    (149)
    L0 f2 = L1 f1 + ∂f0/∂t,    (150)
    L0 fn+1 = L1 fn + ∂fn/∂t,    n = 2, 3, . . .    (151)

The null space of L0 consists of constants in p. Consequently,
from equation (148) we conclude that

    f0 = f (q, t).

Pavliotis (IC) StochProc January 16, 2011 293 / 367


Now we can calculate the right hand side of equation (149):

L1 f0 = p∂q f .

Equation (149) becomes:

L0 f1 = p∂q f .

The right hand side of this equation is orthogonal to N (L∗0 ) and


consequently there exists a unique solution. We obtain this
solution using separation of variables:

f1 = −p∂q f + ψ1 (q, t).

Pavliotis (IC) StochProc January 16, 2011 294 / 367


Now we can calculate the RHS of equation (150). We need to
calculate L1 f1:

    −L1 f1 = ( p ∂q − ∂q V ∂p ) ( p ∂q f − ψ1(q, t) )
           = p^2 ∂q^2 f − p ∂q ψ1 − ∂q V ∂q f.

The solvability condition for (150) is

    ∫_R ( L1 f1 + ∂f0/∂t ) ρOU(p) dp = 0,

from which we obtain the backward Kolmogorov equation
corresponding to the Smoluchowski SDE:

    ∂f/∂t = −∂q V ∂q f + β^{−1} ∂q^2 f.    (152)

Now we solve the equation for f2. We use (152) to write (150) in
the form

    L0 f2 = ( β^{−1} − p^2 ) ∂q^2 f + p ∂q ψ1.

Pavliotis (IC) StochProc January 16, 2011 295 / 367


The solution of this equation is

    f2(p, q, t) = (1/2) ∂q^2 f (q, t) p^2 − ∂q ψ1(q, t) p + ψ2(q, t).

The right hand side of the equation for f3 is

    L1 f2 = (1/2) p^3 ∂q^3 f − p^2 ∂q^2 ψ1 + p ∂q ψ2 − ∂q V ∂q^2 f  p − ∂q V ∂q ψ1.

The solvability condition is

    ∫_R L1 f2 ρOU(p) dp = 0.

This leads to the equation

    −∂q V ∂q ψ1 − β^{−1} ∂q^2 ψ1 = 0.

The only solution to this equation which is an element of
L2(e^{−βV (q)}) is

    ψ1 ≡ 0.
Pavliotis (IC) StochProc January 16, 2011 296 / 367
Putting everything together we obtain the first two terms in the
ε-expansion of the Fokker–Planck equation (146):

    ρ(p, q, t) = Z^{−1} e^{−βH(p,q)} ( f + ε (−p ∂q f ) + O(ε^2) ),

where f is the solution of (152).
Notice that we can rewrite the leading order term of the expansion
in the form

    ρ(p, q, t) = (2πβ^{−1})^{−1/2} e^{−βp^2/2} ρV(q, t) + O(ε),

where ρV = Z^{−1} e^{−βV (q)} f is the solution of the Smoluchowski
Fokker-Planck equation

    ∂ρV/∂t = ∂q ( ∂q V ρV ) + β^{−1} ∂q^2 ρV.

Pavliotis (IC) StochProc January 16, 2011 297 / 367


It is possible to expand the n-th term in the expansion (147) in
terms of Hermite functions (the eigenfunctions of the generator of
the OU process)

    fn(p, q, t) = Σ_{k=0}^{n} fnk(q, t) φk(p),    (153)

where φk(p) is the k-th eigenfunction of L0:

    −L0 φk = λk φk.

We can obtain the following system of equations (L̂ = β^{−1} ∂q − ∂q V):

    L̂ fn1 = 0,
    √( (k + 1)/β^{−1} ) L̂ fn,k+1 + √( k β^{−1} ) ∂q fn,k−1 = −k fn+1,k,    k = 1, 2, . . . , n − 1,
    √( n β^{−1} ) ∂q fn,n−1 = −n fn+1,n,
    √( (n + 1) β^{−1} ) ∂q fn,n = −(n + 1) fn+1,n+1.

Pavliotis (IC) StochProc January 16, 2011 298 / 367


Using this method we can obtain the first three terms in the
expansion:

    ρ(p, q, t) = ρ0(p, q) [ f + ε ( −√(β^{−1}) ∂q f φ1 )
                 + ε^2 ( (β^{−1}/√2) ∂q^2 f φ2 + f20 )
                 + ε^3 ( −√( β^{−3}/3! ) ∂q^3 f φ3 + ( −√(β^{−1}) L̂ ∂q^2 f − √(β^{−1}) ∂q f20 ) φ1 )
                 + O(ε^4) ],

Pavliotis (IC) StochProc January 16, 2011 299 / 367


The Freidlin–Wentzell Limit

Consider now the rescaling λγ = 1, µγ = γ. The Langevin equation
becomes

    q̈^γ = −γ^{−2} ∇V (q^γ) − q̇^γ + √(2γ^{−2} β^{−1}) Ẇ.    (154)

We write equation (154) as a system of two equations:

    q̇^γ = γ^{−1} p^γ,    ṗ^γ = −γ^{−1} V ′(q^γ) − p^γ + √(2β^{−1}) Ẇ.

This is the equation for an O(1/γ) Hamiltonian system perturbed by
O(1) noise. We expect that, to leading order, the energy is conserved,
since it is conserved for the Hamiltonian system. We apply Itô's
formula to the Hamiltonian of the system to obtain

    Ḣ = ( β^{−1} − p^2 ) + √( 2β^{−1} p^2 ) Ẇ,

with p^2 = p^2(H, q) = 2(H − V (q)).

Pavliotis (IC) StochProc January 16, 2011 300 / 367


Thus, in order to study the γ → 0 limit we need to analyze the following
fast/slow system of SDEs:

    Ḣ = ( β^{−1} − p^2 ) + √( 2β^{−1} p^2 ) Ẇ,    (155a)
    ṗ^γ = −γ^{−1} V ′(q^γ) − p^γ + √(2β^{−1}) Ẇ.    (155b)

The Hamiltonian is the slow variable, whereas the momentum (or
position) is the fast variable. Assuming that we can average over the
Hamiltonian dynamics, we obtain the limiting SDE for the Hamiltonian:

    Ḣ = ( β^{−1} − ⟨p^2⟩ ) + √( 2β^{−1} ⟨p^2⟩ ) Ẇ.    (156)

The limiting SDE lives on the graph associated with the Hamiltonian
system. The domain of definition of the limiting Markov process is
defined through appropriate boundary conditions (the gluing
conditions) at the interior vertices of the graph.

Pavliotis (IC) StochProc January 16, 2011 301 / 367


We identify all points belonging to the same connected
component of a level curve {x : H(x) = H}, x = (q, p).
Each point on the edges of the graph corresponds to a trajectory.
Interior vertices correspond to separatrices.
Let Ii, i = 1, . . . , d be the edges of the graph. Then (i, H) defines a
global coordinate system on the graph.
For more information see
Freidlin and Wentzell, Random Perturbations of Dynamical
Systems, Springer 1998.
Freidlin and Wentzell, Random Perturbations of Hamiltonian
Systems, AMS 1994.
Sowers, A Boundary layer theory for diffusively perturbed transport
around a heteroclinic cycle, CPAM 58 (2005), no. 1, 30–84.

Pavliotis (IC) StochProc January 16, 2011 302 / 367


We will study the small γ asymptotics by analyzing the corresponding
backward Kolmogorov equation using singular perturbation theory.
The generator of the process {q^γ, p^γ} is

    Lγ = γ^{−1} ( p ∂q − ∂q V ∂p ) − p ∂p + β^{−1} ∂p^2
       = γ^{−1} L0 + L1.

Let u^γ = E( f (p^γ(p, q; t), q^γ(p, q; t)) ). It satisfies the backward
Kolmogorov equation associated to the process {q^γ, p^γ}:

    ∂u^γ/∂t = ( γ^{−1} L0 + L1 ) u^γ.    (157)

Pavliotis (IC) StochProc January 16, 2011 303 / 367


We look for a solution in the form of a power series expansion in γ:

    u^γ = u0 + γ u1 + γ^2 u2 + . . .

We substitute this ansatz into (157) and equate equal powers of γ to
obtain the following sequence of equations:

    L0 u0 = 0,    (158a)
    L0 u1 = −L1 u0 + ∂u0/∂t,    (158b)
    L0 u2 = −L1 u1 + ∂u1/∂t.    (158c)
    .........

Notice that the operator L0 is the backward Liouville operator of the
Hamiltonian system with Hamiltonian

    H = (1/2) p^2 + V (q).
Pavliotis (IC) StochProc January 16, 2011 304 / 367
We assume that there are no integrals of motion other than the
Hamiltonian. This means that the null space of L0 consists of functions
of the Hamiltonian:

    N (L0) = { functions of H }.    (159)

Let us now analyze equations (158). We start with (158a); eqn. (159)
implies that u0 depends on q, p through the Hamiltonian function H:

    u0 = u(H(p, q), t).    (160)

Now we proceed with (158b). For this we need to find the solvability
condition for equations of the form

    L0 u = f.    (161)

We multiply it by an arbitrary smooth function of H(p, q), integrate over
R2 and use the skew-symmetry of the Liouville operator L0 to deduce:¹

¹ We assume that both u1 and F decay to 0 as |p| → ∞ to justify the integration by
parts that follows.
Pavliotis (IC) StochProc January 16, 2011 305 / 367
    ∫_{R^2} L0 u F (H(p, q)) dpdq = ∫_{R^2} u L∗0 F (H(p, q)) dpdq
                                  = ∫_{R^2} u ( −L0 F (H(p, q)) ) dpdq
                                  = 0,    ∀ F ∈ C_b^∞(R).

This implies that the solvability condition for equation (161) is that

    ∫_{R^2} f (p, q) F (H(p, q)) dpdq = 0,    ∀ F ∈ C_b^∞(R).    (162)

We use the solvability condition in (158b) to obtain that

    ∫_{R^2} ( L1 u0 − ∂u0/∂t ) F (H(p, q)) dpdq = 0.    (163)

To proceed, we need to understand how L1 acts on functions of
H(p, q). Let φ = φ(H(p, q)). We have that

    ∂φ/∂p = (∂H/∂p) (∂φ/∂H) = p ∂φ/∂H

and
Pavliotis (IC) StochProc January 16, 2011 306 / 367
 
    ∂^2φ/∂p^2 = ∂/∂p ( p ∂φ/∂H ) = ∂φ/∂H + p^2 ∂^2φ/∂H^2.

The above calculations imply that, when L1 acts on functions
φ = φ(H(p, q)), it becomes

    L1 = ( β^{−1} − p^2 ) ∂H + β^{−1} p^2 ∂H^2,    (164)

where

    p^2 = p^2(H, q) = 2(H − V (q)).

Pavliotis (IC) StochProc January 16, 2011 307 / 367


We want to change variables in the integral (163) and go from (p, q) to
(H, q). The Jacobian of the transformation is:

    ∂(p, q)/∂(H, q) = ∂p/∂H = 1/p(H, q).

We use this, together with (164), to rewrite eqn. (163) as

    ∫∫ ( [ (β^{−1} − p^2) ∂H + β^{−1} p^2 ∂H^2 ] u − ∂u/∂t ) F (H) p^{−1}(H, q) dHdq = 0.

We introduce the notation

    ⟨·⟩ := ∫ · dq.

The integration over q can be performed "explicitly":

    ∫ ( [ (β^{−1}⟨p^{−1}⟩ − ⟨p⟩) ∂H + β^{−1}⟨p⟩ ∂H^2 ] u − ⟨p^{−1}⟩ ∂u/∂t ) F (H) dH = 0.
∂t

Pavliotis (IC) StochProc January 16, 2011 308 / 367


This equation should be valid for every smooth function F (H), and this
requirement leads to the differential equation

    ⟨p^{−1}⟩ ∂u/∂t = ( β^{−1}⟨p^{−1}⟩ − ⟨p⟩ ) ∂H u + ⟨p⟩ β^{−1} ∂H^2 u,

or,

    ∂u/∂t = ( β^{−1} − ⟨p^{−1}⟩^{−1}⟨p⟩ ) ∂H u + ⟨p^{−1}⟩^{−1}⟨p⟩ β^{−1} ∂H^2 u.

Thus, we have obtained the limiting backward Kolmogorov equation for
the energy, which is the "slow variable". From this equation we can
read off the limiting SDE for the Hamiltonian:

    Ḣ = b(H) + σ(H) Ẇ,    (165)

where

    b(H) = β^{−1} − ⟨p^{−1}⟩^{−1}⟨p⟩,    σ(H) = √( 2β^{−1} ⟨p^{−1}⟩^{−1}⟨p⟩ ).

Pavliotis (IC) StochProc January 16, 2011 309 / 367


Notice that the noise that appears in the limiting equation (165) is
multiplicative, contrary to the additive noise in the Langevin equation.
As is well known from classical mechanics, the action and frequency
are defined as

    I(E) = ∫ p(q, E) dq

and

    ω(E) = 2π ( dI/dE )^{−1},

respectively. Using the action and the frequency we can write the
limiting Fokker–Planck equation for the distribution function of the
energy in a very compact form.

Theorem
The limiting Fokker–Planck equation for the energy distribution
function ρ(E, t) is

    ∂ρ/∂t = ∂/∂E ( I(E) ( 1 + β^{−1} ∂/∂E ) ( ω(E)ρ/(2π) ) ).    (166)
Pavliotis (IC) StochProc January 16, 2011 310 / 367
Proof.
We notice that

    dI/dE = ∫ (∂p/∂E) dq = ∫ p^{−1} dq

and consequently ⟨p^{−1}⟩^{−1} = ω(E)/(2π). Hence, the limiting
Fokker–Planck equation can be written as

    ∂ρ/∂t = ∂/∂E ( ( I(E)ω(E)/(2π) − β^{−1} ) ρ ) + β^{−1} ∂^2/∂E^2 ( (Iω/(2π)) ρ )
          = −β^{−1} ∂ρ/∂E + ∂/∂E ( (Iω/(2π)) ρ ) + β^{−1} ∂/∂E ( (dI/dE) (ωρ/(2π)) )
            + β^{−1} ∂/∂E ( I ∂/∂E (ωρ/(2π)) )
          = ∂/∂E ( (Iω/(2π)) ρ ) + β^{−1} ∂/∂E ( I ∂/∂E (ωρ/(2π)) )
          = ∂/∂E ( I(E) ( 1 + β^{−1} ∂/∂E ) ( ω(E)ρ/(2π) ) ),

where we have used (dI/dE)(ω/(2π)) = 1.
Pavliotis (IC) StochProc January 16, 2011 311 / 367
Remarks

1. We emphasize that the above formal procedure does not provide
   us with the boundary conditions for the limiting Fokker–Planck
   equation. We will discuss this issue in the next section.
2. If we rescale back to the original time-scale we obtain the equation

       ∂ρ/∂t = γ ∂/∂E ( I(E) ( 1 + β^{−1} ∂/∂E ) ( ω(E)ρ/(2π) ) ).    (167)

   We will use this equation later on to calculate the rate of escape
   from a potential barrier in the energy-diffusion-limited regime.

Pavliotis (IC) StochProc January 16, 2011 312 / 367


Exit Time Problems and Escape from a Potential Well

Pavliotis (IC) StochProc January 16, 2011 313 / 367


Escape From a Potential Well

There are many systems in physics, chemistry and biology that


exist in at least two stable states:
Switching and storage devices in computers.
Conformational changes of biological macromolecules (they can
exist in many different states).
The problems that we would like to study are:
How stable are the various states relative to each other.
How long does it take for a system to switch spontaneously from
one state to another?
How is the transfer made, i.e. through what path in the relevant
state space?
How does the system relax to an unstable state?

Pavliotis (IC) StochProc January 16, 2011 314 / 367


The study of bistability and metastability is a very active research
area. Topics of current research include:
The development of numerical methods for the calculation of
various quantities such as reaction rates, transition pathways etc.
The study of bistable systems in infinite dimensions.

Pavliotis (IC) StochProc January 16, 2011 315 / 367


We will consider the dynamics of a particle moving in a bistable
potential, under the influence of thermal noise in one dimension:

    ẋ = −V ′(x) + √(2kB T) Ẇ.    (168)

We will consider potentials that have two local minima, one local
maximum (saddle point) and increase at least quadratically
fast at infinity. This ensures that the state space is "compact", i.e.
that the particle cannot escape to infinity.
The standard potential that satisfies these assumptions is

    V (x) = (1/4) x^4 − (1/2) x^2 + 1/4.    (169)

This potential has a local maximum at x = 0 and two local minima
at x = ±1.

Pavliotis (IC) StochProc January 16, 2011 316 / 367


Figure: Sample path of the solution to equation (168).

Pavliotis (IC) StochProc January 16, 2011 317 / 367


The values of the potential at these three points are:

    V (±1) = 0,    V (0) = 1/4.

We will say that the height of the potential barrier is 1/4. The


physically (and mathematically!) interesting case is when the
thermal fluctuations are weak when compared to the potential
barrier that the particle has to climb over.
More generally, we assume that the potential has two local minima
at the points a and c and a local maximum at b.
We will consider the problem of the escape of the particle from the
left local minimum a.
The potential barrier is then defined as

Eb = V (b) − V (a).

Pavliotis (IC) StochProc January 16, 2011 318 / 367


Our assumption that the thermal fluctuations are weak can be
written as

    kB T / Eb ≪ 1.    (170)
Under condition (170), the particle is most likely to be found close
to a or c. There it will perform small oscillations around either of
the local minima.
We can study the weak noise asymptotics for finite times using
standard perturbation theory.
We can describe locally the dynamics of the particle by
appropriate Ornstein-Uhlenbeck processes.
This result is valid only for finite times: at sufficiently long times
the particle can escape from one local minimum, a say, and
surmount the potential barrier to end up at c. It will then spend a
long time in the neighborhood of c until it escapes again over the
potential barrier and ends up at a.

Pavliotis (IC) StochProc January 16, 2011 319 / 367


This is an example of a rare event. The relevant time scale, the
exit time or the mean first passage time scales exponentially in
β := (kB T )−1 :
τ = ν −1 exp(βEb ).
The rate with which particles escape from a local minimum of the
potential is called the rate of escape or the reaction rate
κ := τ −1 :
κ = ν exp(−β∆E ). (171)
It is important to notice that the escape from a local minimum, i.e.
a state of local stability, can happen only at positive temperatures:
it is a noise assisted event:
In the absence of thermal fluctuations the equation of motion
becomes:
ẋ = −V ′ (x), x(0) = x0 .

Pavliotis (IC) StochProc January 16, 2011 320 / 367


In this case the potential becomes a Lyapunov function:

    dV/dt = V ′(x) dx/dt = −(V ′(x))^2 ≤ 0.

Hence, depending on the initial condition the particle will converge
either to a or c. The particle cannot escape from either state of
local stability.
On the other hand, at high temperatures the particle does not
"see" the potential barrier: it essentially jumps freely from one
local minimum to another.
We will study this problem and calculate the escape rate by
calculating the mean first passage time.

Pavliotis (IC) StochProc January 16, 2011 321 / 367


The Arrhenius-type factor in the formula for the reaction rate,
eqn. (171) is intuitively clear and it has been observed
experimentally in the late nineteenth century by Arrhenius and
others.
What is extremely important both from a theoretical and an
applied point of view is the calculation of the prefactor ν, the rate
coefficient.
A systematic approach for the calculation of the rate coefficient, as
well as the justification of the Arrhenius kinetics, is that of the
mean first passage time method (MFPT).
Since this method is of independent interest and is useful in
various other contexts, we will present it in a quite general setting
and apply it to the problem of the escape from a potential barrier
later on.

Pavliotis (IC) StochProc January 16, 2011 322 / 367


Let Xt be a continuous time diffusion process on Rd whose
evolution is governed by the SDE

    dXt^x = b(Xt^x) dt + σ(Xt^x) dWt,    X0^x = x.    (172)

Let D be a bounded subset of Rd with smooth boundary. Given
x ∈ D, we want to know how long it takes for the process Xt to
leave the domain D for the first time:

    τD^x = inf { t ≥ 0 : Xt^x ∉ D }.

Clearly, this is a random variable. The average of this random
variable is called the mean first passage time (MFPT) or the first
exit time:

    τ(x) := E τD^x.

We can calculate the MFPT by solving an appropriate boundary
value problem.

Pavliotis (IC) StochProc January 16, 2011 323 / 367


Theorem
The MFPT is the solution of the boundary value problem

    −Lτ = 1,    x ∈ D,    (173a)
    τ = 0,    x ∈ ∂D,    (173b)

where L is the generator of the SDE (172).

The homogeneous Dirichlet boundary conditions correspond to an
absorbing boundary: the particles are removed when they reach
the boundary. Other choices of boundary conditions are also
possible.
The rigorous proof of Theorem 67 is based on Itô's formula.

Pavliotis (IC) StochProc January 16, 2011 324 / 367


Derivation of the BVP for the MFPT
Let ρ(X, x, t) be the probability distribution of the particles that
have not left the domain D at time t. It solves the FP equation
with absorbing boundary conditions:

    ∂ρ/∂t = L∗ρ,    ρ(X, x, 0) = δ(X − x),    ρ|∂D = 0.    (174)

We can write the solution to this equation in the form

    ρ(X, x, t) = e^{L∗t} δ(X − x),

where the absorbing boundary conditions are included in the
definition of the semigroup e^{L∗t}.
The homogeneous Dirichlet (absorbing) boundary conditions
imply that

    lim_{t→+∞} ρ(X, x, t) = 0.

That is: all particles will eventually leave the domain.


Pavliotis (IC) StochProc January 16, 2011 325 / 367
The (normalized) number of particles that are still inside D at time
t is

    S(x, t) = ∫_D ρ(X, x, t) dX.

Notice that this is a decreasing function of time. We can write

    ∂S/∂t = −f (x, t),

where f (x, t) is the distribution of first passage times.

Pavliotis (IC) StochProc January 16, 2011 326 / 367


The MFPT is the first moment of the distribution f (x, t):

    τ(x) = ∫0^{+∞} f (x, s) s ds = − ∫0^{+∞} (dS/ds) s ds
         = ∫0^{+∞} S(x, s) ds = ∫0^{+∞} ∫_D ρ(X, x, s) dX ds
         = ∫0^{+∞} ∫_D e^{L∗s} δ(X − x) dX ds
         = ∫0^{+∞} ∫_D δ(X − x) ( e^{Ls} 1 ) dX ds = ∫0^{+∞} ( e^{Ls} 1 ) ds.

We apply L to the above equation to deduce:

    Lτ = ∫0^{+∞} L e^{Ls} 1 ds = ∫0^{+∞} (d/ds) ( e^{Ls} 1 ) ds
       = −1.

Pavliotis (IC) StochProc January 16, 2011 327 / 367


Consider the boundary value problem for the MFPT of the one
dimensional diffusion process (168) on the interval (a, b):

    −β^{−1} e^{βV} ∂x ( e^{−βV} ∂x τ ) = 1.    (175)

We choose a reflecting BC at x = a and an absorbing BC at x = b.
We can solve (175) with these boundary conditions by
quadratures:

    τ(x) = β ∫x^b dy e^{βV (y)} ∫a^y dz e^{−βV (z)}.    (176)

Now we can solve the problem of the escape from a potential well:
the reflecting boundary is at x = a, the left local minimum of the
potential, and the absorbing boundary is at x = b, the local
maximum.

Pavliotis (IC) StochProc January 16, 2011 328 / 367


We can replace the BC at x = a by a repelling BC at x = −∞:

    τ(x) = β ∫x^b dy e^{βV (y)} ∫_{−∞}^y dz e^{−βV (z)}.

When Eb β ≫ 1 the integral with respect to z is dominated by the value
of the potential near a. Furthermore, we can replace the upper limit of
integration by +∞:

    ∫_{−∞}^y e^{−βV (z)} dz ≈ e^{−βV (a)} ∫_{−∞}^{+∞} exp ( −(βω0^2/2) (z − a)^2 ) dz
                            = √( 2π/(βω0^2) ) e^{−βV (a)},

where we have used the Taylor series expansion around the
minimum:

    V (z) = V (a) + (1/2) ω0^2 (z − a)^2 + . . .
Pavliotis (IC) StochProc January 16, 2011 329 / 367
Similarly, the integral with respect to y is dominated by the value of the
potential around the saddle point. We use the Taylor series
expansion

    V (y) = V (b) − (1/2) ωb^2 (y − b)^2 + . . .

Assuming that x is close to a, the minimum of the potential, we
can replace the lower limit of integration by −∞. We finally obtain

    ∫x^b e^{βV (y)} dy ≈ e^{βV (b)} ∫_{−∞}^b exp ( −(βωb^2/2) (y − b)^2 ) dy
                       = (1/2) √( 2π/(βωb^2) ) e^{βV (b)}.

Pavliotis (IC) StochProc January 16, 2011 330 / 367


Putting everything together we obtain a formula for the MFPT:

    τ(x) = ( π/(ω0 ωb) ) exp (βEb).

The rate of arrival at b is 1/τ. Only half of the particles escape.
Consequently, the escape rate (or reaction rate) is given by 1/(2τ):

    κ = ( ω0 ωb/(2π) ) exp (−βEb).    (177)

Pavliotis (IC) StochProc January 16, 2011 331 / 367
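A numerical check of the quadrature formula against the Kramers formula (177) for the bistable potential (169): the double integral (with the reflecting boundary pushed to −∞) is evaluated on a grid and the resulting rate 1/(2τ) is compared with ω0ωb/(2π) e^{−βEb}. The value of β and the grid are illustrative choices.

```python
# MFPT by quadrature for the bistable potential (169) versus the Kramers rate (177).
import numpy as np

beta = 20.0
V = lambda x: x**4 / 4 - x**2 / 2 + 0.25

x = np.linspace(-4.0, 0.0, 400_001)          # grid covering (-infinity, b] in practice
dx = x[1] - x[0]
inner = np.cumsum(np.exp(-beta * V(x))) * dx # int_{-inf}^{y} exp(-beta V(z)) dz

mask = x >= -1.0                             # outer integral from a = -1 to b = 0
tau = beta * np.sum(np.exp(beta * V(x[mask])) * inner[mask]) * dx
kappa_quad = 1.0 / (2.0 * tau)

omega0, omegab, Eb = np.sqrt(2.0), 1.0, 0.25 # V''(-1) = 2, |V''(0)| = 1
kappa_kramers = omega0 * omegab / (2 * np.pi) * np.exp(-beta * Eb)
print(kappa_quad, kappa_kramers)             # close; agreement improves as beta*Eb grows
```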


Consider now the problem of escape from a potential well for the
Langevin equation

    q̈ = −∂q V (q) − γ q̇ + √(2γβ^{−1}) Ẇ.    (178)

The reaction rate depends on the friction coefficient and the
temperature. In the overdamped limit (γ ≫ 1) we retrieve (177),
appropriately rescaled with γ:

    κ = ( ω0 ωb/(2πγ) ) exp (−βEb).    (179)

We can also obtain a formula for the reaction rate for γ = O(1):

    κ = ( ( √( γ^2/4 + ωb^2 ) − γ/2 ) / ωb ) ( ω0/(2π) ) exp (−βEb).    (180)

Naturally, in the limit as γ → +∞, (180) reduces to (179).

Pavliotis (IC) StochProc January 16, 2011 332 / 367


In order to calculate the reaction rate in the underdamped or
energy-diffusion-limited regime γ ≪ 1 we need to study the
diffusion process for the energy, (165) or (166). The result is

    κ = γ β I(Eb) ( ω0/(2π) ) e^{−βEb},    (181)

where I(Eb) denotes the action evaluated at b.
A formula for the escape rate which is valid for all values of the
friction coefficient was obtained by Melnikov and Meshkov in 1986, J.
Chem. Phys 85(2) 1018-1027. This formula requires the
calculation of integrals and it reduces to (179) and (181) in the
overdamped and underdamped limits, respectively.

Pavliotis (IC) StochProc January 16, 2011 333 / 367


The escape rate can be calculated in closed form only in 1d or in
multidimensional problems that can be reduced to a 1d problem
(e.g., problems with radial symmetry).
For more information on the calculation of escape rates and
applications of reaction rate theory see
P. Hänggi, P. Talkner and M. Borkovec: Reaction rate theory: fifty
years after Kramers. Rev. Mod. Phys. 62(2) 251-341 (1990).
R. Zwanzig: Nonequilibrium Statistical Mechanics, Oxford 2001,
Ch. 4.
C.W. Gardiner: Handbook of Stochastic Methods, Springer 2002,
Ch. 9.
Freidlin and Wentzell: Random Perturbations of Dynamical
Systems, Springer 1998, Ch. 6.

Pavliotis (IC) StochProc January 16, 2011 334 / 367


The Kac-Zwanzig model and the Generalized Langevin Equation

Pavliotis (IC) StochProc January 16, 2011 335 / 367


Contents

Introduction: Einstein’s theory of Brownian motion.


Particle coupled to a heat bath: the Kac-Zwanzig model.
Coupled particle/field models.
The Mori-Zwanzig formalism.

Pavliotis (IC) StochProc January 16, 2011 336 / 367


Heavy (Brownian) particle immersed in a fluid.
Collisions of the heavy particle with the light fluid particles.
Statistical description of the motion of the Brownian particle.
Fluid is assumed to be in equilibrium.
Brownian particle performs an effective random walk:

    ⟨X (t)^2⟩ = 2Dt.

The constant D is the diffusion coefficient (which should be


calculated from first principles, i.e. through Green-Kubo formulas).

Pavliotis (IC) StochProc January 16, 2011 337 / 367


Phenomenological theories are based either on an equation for the
evolution of the probability distribution function ρ(q, p, t) (the
Fokker-Planck equation)

    ∂t ρ = −p ∂q ρ + ∂q V ∂p ρ + γ ( ∂p (pρ) + β^{−1} ∂p^2 ρ )    (182)

or on a stochastic equation for the evolution of trajectories, the
Langevin equation

    q̈ = −V ′(q) − γ q̇ + √(2γβ^{−1}) ξ,    (183)

where ξ(t) is a white noise process, i.e. a (generalized) mean zero
Gaussian Markov process with correlation function

    ⟨ξ(t)ξ(s)⟩ = δ(t − s).

Pavliotis (IC) StochProc January 16, 2011 338 / 367


Notice that the noise and dissipation in the Langevin equation are
not independent. They are related through the
fluctuation-dissipation theorem.
This is not a coincidence since, as we will see later, noise and
dissipation have the same source, namely the interaction between
the Brownian particle and its environment (heat bath).
There are two phenomenological constants, the inverse
temperature β and the friction coefficient γ.
Our goal is to derive the Fokker-Planck and Langevin equations
from first principles and to calculate the phenomenological
coefficients β and γ.
We will consider some simple "particle + environment" systems for
which we can obtain rigorously a stochastic equation that
describes the dynamics of the Brownian particle.

Pavliotis (IC) StochProc January 16, 2011 339 / 367


We can describe the dynamics of the Brownian particle/fluid
system through a Hamiltonian of the form

    H(QN, PN; q, p) = HBP(QN, PN) + HHB(q, p) + HI(QN, q),    (184)

where {q, p} := { {qj}_{j=1}^N, {pj}_{j=1}^N } are the positions and
momenta of the fluid particles, N is the number of fluid (heat bath)
particles (we will need to take the thermodynamic limit N → +∞).
The initial conditions of the Brownian particle are taken to be fixed,
whereas the fluid is assumed to be initially in equilibrium (Gibbs
distribution).

Pavliotis (IC) StochProc January 16, 2011 340 / 367



Goal: eliminate the fluid variables {q, p} := {qj }N N
j=1 , {pj }j=1 to
obtain a closed equation for the Brownian particle.
We will see that this equation is a stochastic integrodifferential
equation, the Generalized Langevin Equation (GLE) (in the limit
as N → +∞)
Z t

Q̈ = −V (Q) − R(t − s)Q̇(s) ds + F (t), (185)
0

where R(t) is the memory kernel and F (t) is the noise.


We will also see that, in some appropriate limit, we can derive the
Markovian Langevin equation (183).

Pavliotis (IC) StochProc January 16, 2011 341 / 367


We need to model the interaction between the heat bath particles and
the coupling between the Brownian particle and the heat bath.
The simplest model is that of a harmonic heat bath and of linear
coupling:

    H(QN, PN, q, p) = PN^2/2 + V (QN) + Σ_{n=1}^N ( pn^2/(2mn) + (1/2) kn (qn − λQN)^2 ).    (186)

The initial conditions of the Brownian particle
{QN(0), PN(0)} := {Q0, P0} are taken to be deterministic.
The initial conditions of the heat bath particles are distributed
according to the Gibbs distribution, conditional on the knowledge
of {Q0, P0}:

    µβ(dq dp) = Z^{−1} e^{−βH(q,p)} dq dp,    (187)

where β is the inverse temperature. This is a way of introducing
the concept of temperature into the system (through the average
kinetic energy of the bath particles).
Pavliotis (IC) StochProc January 16, 2011 342 / 367
In order to choose the initial conditions according to µβ(dq dp) we
can take

    qn(0) = λQ0 + √( β^{−1} kn^{−1} ) ξn,    pn(0) = √( mn β^{−1} ) ηn,    (188)

where the ξn, ηn are mutually independent sequences of i.i.d.
N (0, 1) random variables.
Notice that we actually consider the Gibbs measure of an effective
(renormalized) Hamiltonian.
Other choices for the initial conditions are possible. For example,
we can take qn(0) = √( β^{−1} kn^{−1} ) ξn. Our choice of I.C. ensures that
the forcing term in the GLE that we will derive is mean zero (see
below).

Pavliotis (IC) StochProc January 16, 2011 343 / 367


Hamilton’s equations of motion are:

\ddot{Q}_N + V'(Q_N) = \sum_{n=1}^{N} k_n (\lambda q_n - \lambda^2 Q_N),    (189a)

\ddot{q}_n + \omega_n^2 (q_n - \lambda Q_N) = 0, \quad n = 1, \ldots, N,    (189b)

where \omega_n^2 = k_n / m_n.


The equations for the heat bath particles are second order linear
inhomogeneous equations with constant coefficients. Our plan is
to solve them and then to substitute the result in the equations of
motion for the Brownian particle.



We can solve the equations of motion for the heat bath variables
using the variation of constants formula

q_n(t) = q_n(0) \cos(\omega_n t) + \frac{p_n(0)}{m_n \omega_n} \sin(\omega_n t)
         + \omega_n \lambda \int_0^t \sin(\omega_n (t-s))\, Q_N(s)\, ds.

An integration by parts yields

q_n(t) = q_n(0) \cos(\omega_n t) + \frac{p_n(0)}{m_n \omega_n} \sin(\omega_n t) + \lambda Q_N(t)
         - \lambda Q_N(0) \cos(\omega_n t) - \lambda \int_0^t \cos(\omega_n (t-s))\, \dot{Q}_N(s)\, ds.
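Explicitly, the integration by parts rests on the identity \frac{d}{ds} \cos(\omega_n(t-s)) = \omega_n \sin(\omega_n(t-s)):

\omega_n \lambda \int_0^t \sin(\omega_n(t-s))\, Q_N(s)\, ds
  = \lambda \Big[ \cos(\omega_n(t-s))\, Q_N(s) \Big]_{s=0}^{s=t}
    - \lambda \int_0^t \cos(\omega_n(t-s))\, \dot{Q}_N(s)\, ds
  = \lambda Q_N(t) - \lambda Q_N(0) \cos(\omega_n t) - \lambda \int_0^t \cos(\omega_n(t-s))\, \dot{Q}_N(s)\, ds.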



We substitute this in equation (189) and use the initial
conditions (188) to obtain the Generalized Langevin Equation

\ddot{Q}_N = -V'(Q_N) - \lambda^2 \int_0^t R_N(t-s)\,\dot{Q}_N(s)\,ds + \lambda F_N(t),    (190)

where the memory kernel is

R_N(t) = \sum_{n=1}^{N} k_n \cos(\omega_n t)    (191)

and the noise process is

F_N(t) = \sum_{n=1}^{N} \left( k_n (q_n(0) - \lambda Q_0) \cos(\omega_n t) + \frac{k_n\, p_n(0)}{m_n \omega_n} \sin(\omega_n t) \right)
       = \sqrt{\beta^{-1}} \sum_{n=1}^{N} \sqrt{k_n}\, \big( \xi_n \cos(\omega_n t) + \eta_n \sin(\omega_n t) \big).    (192)



1. The noise and the dissipation (memory kernel) are related through the
fluctuation-dissipation theorem (a numerical check is sketched after this list):

\langle F_N(t) F_N(s) \rangle = \beta^{-1} \sum_{n=1}^{N} k_n \big( \cos(\omega_n t)\cos(\omega_n s) + \sin(\omega_n t)\sin(\omega_n s) \big)
                              = \beta^{-1} R_N(t-s).    (193)

2. The noise F_N(t) is a mean zero Gaussian process.
3. The choice of the initial conditions (188) for q, p is crucial for the
form of the GLE and, in particular, for the fluctuation-dissipation
theorem (193) to be valid.
4. The parameter λ measures the strength of the coupling between
the Brownian particle and the heat bath.
5. By choosing the frequencies ω_n and spring constants k_n of the
heat bath particles appropriately we can pass to the limit as
N → +∞ and obtain the GLE with different memory kernels R(t)
and noise processes F(t).
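A minimal Monte Carlo check of (193), assuming nothing beyond (191)–(192); N, β and the frequencies and spring constants used below are arbitrary illustrative choices.

```python
# Sketch: Monte Carlo check of the fluctuation-dissipation relation (193),
# <F_N(t) F_N(s)> = beta^{-1} R_N(t - s), using the kernel (191) and the
# noise (192). N, beta, the omega_n and k_n below are illustrative choices.
import numpy as np

rng = np.random.default_rng(2)
N, beta, M = 50, 2.0, 200_000                 # M = number of noise realizations
omega = rng.uniform(0.0, 5.0, N)              # frequencies omega_n
k = rng.uniform(0.1, 1.0, N)                  # spring constants k_n
t, s = 1.3, 0.4

xi = rng.standard_normal((M, N))
eta = rng.standard_normal((M, N))

def F_N(tt, x, e):
    """Noise (192) at time tt for M realizations (x, e) of (xi_n, eta_n)."""
    return np.sqrt(1.0 / beta) * (np.sqrt(k) * (x * np.cos(omega * tt)
                                                + e * np.sin(omega * tt))).sum(axis=1)

lhs = np.mean(F_N(t, xi, eta) * F_N(s, xi, eta))          # empirical <F_N(t) F_N(s)>
rhs = (1.0 / beta) * np.sum(k * np.cos(omega * (t - s)))  # beta^{-1} R_N(t - s)
print("Monte Carlo:", lhs, "   exact:", rhs)
```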
Let a ∈ (0, 1), 2b = 1 − a and set ω_n = N^a ζ_n, where {ζ_n}_{n=1}^∞ are
i.i.d. with ζ_1 ∼ U(0, 1). Furthermore, we choose the spring
constants according to

k_n = \frac{f^2(\omega_n)}{N^{2b}},

where the function f(ω) decays sufficiently fast at infinity.
We can rewrite the dissipation and noise terms in the form

R_N(t) = \sum_{n=1}^{N} f^2(\omega_n) \cos(\omega_n t)\, \Delta\omega

and

F_N(t) = \sqrt{\beta^{-1}} \sum_{n=1}^{N} f(\omega_n) \big( \xi_n \cos(\omega_n t) + \eta_n \sin(\omega_n t) \big) \sqrt{\Delta\omega},

where \Delta\omega = N^a / N.
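A small numerical illustration (not in the notes) of this construction: for the purely illustrative choice f²(ω) = e^{−ω}, the sum R_N(t) above is a Monte Carlo approximation of ∫₀^∞ e^{−ω} cos(ωt) dω = 1/(1 + t²).

```python
# Sketch: with the illustrative choice f^2(omega) = exp(-omega) (not the choice
# made later in the notes), R_N(t) = sum_n f^2(omega_n) cos(omega_n t) dOmega
# should approach int_0^infty exp(-omega) cos(omega t) domega = 1 / (1 + t^2).
import numpy as np

rng = np.random.default_rng(3)
N, a = 1_000_000, 0.3
zeta = rng.uniform(0.0, 1.0, N)
omega = N**a * zeta                    # omega_n = N^a zeta_n
dOmega = N**a / N                      # Delta omega = N^a / N
f2 = np.exp(-omega)                    # illustrative f^2(omega)

for t in (0.0, 0.5, 1.0, 2.0):
    RN = np.sum(f2 * np.cos(omega * t)) * dOmega
    print(f"t = {t}:  R_N(t) = {RN:.4f},  limit 1/(1+t^2) = {1/(1+t**2):.4f}")
```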
Using properties of Fourier series with random
coefficients/frequencies and of weak convergence of probability
measures, we can pass to the limit:

R_N(t) → R(t) in L^1[0, T],

for almost all {ζ_n}_{n=1}^∞, and

F_N(t) → F(t) weakly in C([0, T], R).

The time T > 0 is finite but arbitrary.
The limiting kernel and noise satisfy the fluctuation-dissipation
theorem (193):

\langle F(t) F(s) \rangle = \beta^{-1} R(t - s).    (194)

Q_N(t), the solution of (190), converges weakly to the solution of
the limiting GLE

\ddot{Q} = -V'(Q) - \lambda^2 \int_0^t R(t-s)\,\dot{Q}(s)\,ds + \lambda F(t).    (195)



The properties of the limiting dissipation and noise are determined
by the function f(ω).
Example: Consider the Lorentzian function

f^2(\omega) = \frac{2\alpha/\pi}{\alpha^2 + \omega^2}    (196)

with α > 0. Then

R(t) = e^{-\alpha |t|}.

The noise process F(t) is a mean zero stationary Gaussian
process with continuous paths and, from (194), exponential
correlation function:

\langle F(t) F(s) \rangle = \beta^{-1} e^{-\alpha |t - s|}.

Hence, F(t) is the stationary Ornstein–Uhlenbeck process

\frac{dF}{dt} = -\alpha F + \sqrt{2\beta^{-1}\alpha}\, \frac{dW}{dt},    (197)

with F(0) ∼ N(0, β^{-1}).
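A quick simulation sketch (not in the notes): sample the stationary OU process (197) with its exact one-step transition and compare its empirical autocorrelation with β⁻¹ e^{−ατ}; α, β, the step size and run length are illustrative choices.

```python
# Sketch: simulate the stationary OU process (197) with its exact transition
# F_{n+1} = exp(-alpha*dt) F_n + sqrt(beta^{-1} (1 - exp(-2*alpha*dt))) * N(0,1)
# and compare the time-averaged autocorrelation with beta^{-1} exp(-alpha*tau).
# alpha, beta, dt and the run length are illustrative choices only.
import numpy as np

rng = np.random.default_rng(4)
alpha, beta = 1.5, 2.0
dt, nsteps = 0.01, 1_000_000

F = np.empty(nsteps)
F[0] = rng.normal(0.0, np.sqrt(1.0 / beta))            # F(0) ~ N(0, beta^{-1})
a_ = np.exp(-alpha * dt)
s_ = np.sqrt((1.0 - np.exp(-2.0 * alpha * dt)) / beta)
for n in range(nsteps - 1):
    F[n + 1] = a_ * F[n] + s_ * rng.normal()

for lag in (0, 10, 50, 100):                           # tau = lag * dt
    emp = np.mean(F[:nsteps - lag] * F[lag:])
    print(f"tau = {lag*dt:5.2f}:  empirical {emp:.4f}   exact {np.exp(-alpha*lag*dt)/beta:.4f}")
```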
The GLE (195) becomes

\ddot{Q} = -V'(Q) - \lambda^2 \int_0^t e^{-\alpha|t-s|}\, \dot{Q}(s)\, ds + \lambda F(t),    (198)

where F(t) is the OU process (197).



Q(t), the solution of the GLE (195), is not a Markov process, i.e.
the future is not statistically independent of the past, when
conditioned on the present. The stochastic process Q(t) has
memory.
We can turn (195) into a Markovian SDE by enlarging the
dimension of state space, i.e. introducing auxiliary variables.
We might have to introduce infinitely many variables!
For the case of the exponential memory kernel, when the noise is
given by an OU process, it is sufficient to introduce one auxiliary
variable.



We can rewrite (198) as a system of SDEs:

\frac{dQ}{dt} = P,
\frac{dP}{dt} = -V'(Q) + \lambda Z,
\frac{dZ}{dt} = -\alpha Z - \lambda P + \sqrt{2\alpha\beta^{-1}}\, \frac{dW}{dt},

where Z(0) ∼ N(0, β^{-1}).
The process {Q(t), P(t), Z(t)} ∈ R^3 is Markovian.
It is a degenerate Markov process: noise acts directly only on one
of the 3 degrees of freedom.
We can eliminate the auxiliary process Z by taking an appropriate
distinguished limit.
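A minimal Euler–Maruyama sketch of the Markovian (Q, P, Z) system above (not from the notes); the quadratic potential and all parameter values are illustrative assumptions. One can check that exp(−β(P²/2 + V(Q) + Z²/2)) is invariant for this system, so at stationarity Var(P) should be close to β⁻¹, which the code uses as a simple consistency check.

```python
# Sketch: Euler-Maruyama for the quasi-Markovian system (Q, P, Z) above.
# V(Q) = Q^2/2 and the parameter values are illustrative choices only.
# The Gibbs density exp(-beta (P^2/2 + V(Q) + Z^2/2)) is invariant for this
# system, so at stationarity Var(P) should be close to 1/beta.
import numpy as np

rng = np.random.default_rng(5)
beta, alpha, lam = 1.0, 2.0, 1.0
dt, nsteps = 1e-3, 1_000_000
Vprime = lambda Q: Q                        # V(Q) = Q^2/2

Q, P = 0.0, 0.0
Z = rng.normal(0.0, np.sqrt(1.0 / beta))    # Z(0) ~ N(0, beta^{-1})
Ps = np.empty(nsteps)
noise_coeff = np.sqrt(2.0 * alpha / beta * dt)
for n in range(nsteps):
    Qn, Pn, Zn = Q, P, Z
    Q = Qn + Pn * dt
    P = Pn + (-Vprime(Qn) + lam * Zn) * dt
    Z = Zn + (-alpha * Zn - lam * Pn) * dt + noise_coeff * rng.normal()
    Ps[n] = P

print("empirical Var(P):", Ps[nsteps // 5:].var(), "  1/beta:", 1.0 / beta)
```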



Set λ = \sqrt{\gamma}\,\varepsilon^{-1}, α = \varepsilon^{-2}. Equations (200), i.e. the Markovian system for (Q, P, Z) above, become

\frac{dQ}{dt} = P,
\frac{dP}{dt} = -V'(Q) + \frac{\sqrt{\gamma}}{\varepsilon}\, Z,
\frac{dZ}{dt} = -\frac{1}{\varepsilon^2}\, Z - \frac{\sqrt{\gamma}}{\varepsilon}\, P + \sqrt{\frac{2\beta^{-1}}{\varepsilon^2}}\, \frac{dW}{dt}.

We can use tools from singular perturbation theory for Markov
processes to show that, in the limit as ε → 0, we have that

\frac{\sqrt{\gamma}}{\varepsilon}\, Z \to \sqrt{2\gamma\beta^{-1}}\, \frac{dW}{dt} - \gamma P.

Thus, in this limit we obtain the Markovian Langevin Equation
(R(t) = γδ(t))

\ddot{Q} = -V'(Q) - \gamma \dot{Q} + \sqrt{2\gamma\beta^{-1}}\, \frac{dW}{dt}.    (201)
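A formal calculation (a sketch only, not the singular perturbation argument itself) that makes this limit plausible: solving the equation for Z by variation of constants,

Z(t) = e^{-t/\varepsilon^{2}} Z(0) + \int_0^t e^{-(t-s)/\varepsilon^{2}} \Big( -\tfrac{\sqrt{\gamma}}{\varepsilon}\, P(s)\, ds + \tfrac{\sqrt{2\beta^{-1}}}{\varepsilon}\, dW(s) \Big),

so that

\frac{\sqrt{\gamma}}{\varepsilon}\, Z(t) = \frac{\sqrt{\gamma}}{\varepsilon}\, e^{-t/\varepsilon^{2}} Z(0)
  - \gamma \int_0^t \varepsilon^{-2} e^{-(t-s)/\varepsilon^{2}}\, P(s)\, ds
  + \sqrt{2\gamma\beta^{-1}} \int_0^t \varepsilon^{-2} e^{-(t-s)/\varepsilon^{2}}\, dW(s).

As ε → 0 the kernel ε^{-2} e^{-(t-s)/ε^2} acts as an approximate delta function at s = t, so the middle term tends to −γP(t); the stochastic convolution has covariance approximately γβ^{-1} ε^{-2} e^{-|t-t'|/ε^2}, which converges to 2γβ^{-1} δ(t − t'), i.e. it behaves like \sqrt{2\gamma\beta^{-1}}\, dW/dt; and the initial-condition term vanishes for every t > 0.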
Whenever the GLE (195) has "finite memory", we can represent it
as a Markovian SDE by adding a finite number of additional
variables.
These additional variables are solutions of a linear system of
SDEs.
This follows from results in approximation theory.
Consider now the case where the memory kernel is a bounded
analytic function. Its Laplace transform

\widehat{R}(s) = \int_0^{+\infty} e^{-st} R(t)\, dt

can be represented as a continued fraction:

\widehat{R}(s) = \cfrac{\Delta_1^2}{s + \gamma_1 + \cfrac{\Delta_2^2}{s + \gamma_2 + \cdots}}, \qquad \gamma_i > 0.    (202)

Since R(t) is bounded, we have that

\lim_{s \to \infty} \widehat{R}(s) = 0.



Consider an approximation R_N(t) such that the continued fraction
representation terminates after N steps.
R_N(t) is bounded, which implies that

\lim_{s \to \infty} \widehat{R}_N(s) = 0.

The Laplace transform of R_N(t) is a rational function:

\widehat{R}_N(s) = \frac{\sum_{j=1}^{N} a_j s^{N-j}}{s^N + \sum_{j=1}^{N} b_j s^{N-j}}, \qquad a_j, b_j \in \mathbb{R}.    (203)

This is the Laplace transform of the autocorrelation function of an
appropriate linear system of SDEs. Indeed, set

\frac{dx_j}{dt} = -b_j x_j + x_{j+1} + a_j \frac{dW_j}{dt}, \qquad j = 1, \ldots, N,    (204)

with x_{N+1}(t) = 0. The process x_1(t) is a stationary Gaussian
process with autocorrelation function R_N(t).
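A small numerical sketch (not from the notes) of how such a chain can be handled: write (204) as a linear SDE dx = Ax dt + B dW, solve the Lyapunov equation AC + CAᵀ + BBᵀ = 0 for the stationary covariance C, and read off the stationary autocorrelation of x₁ as the (1,1) entry of e^{Aτ}C; the dimension and the coefficients a_j, b_j below are illustrative choices.

```python
# Sketch: stationary autocorrelation of x_1 for the linear chain (204), written
# as dx = A x dt + B dW with A upper bidiagonal (-b_j on the diagonal, 1 above).
# For Hurwitz A the stationary covariance C solves A C + C A^T + B B^T = 0 and
# E[x(t+tau) x(t)^T] = expm(A tau) C; the (1,1) entry is the kernel R_N(tau).
# N and the coefficients a_j, b_j below are illustrative choices only.
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

N = 3
b = np.array([1.0, 2.0, 0.5])         # b_j > 0
a = np.array([0.0, 0.0, 1.3])         # a_j (noise only on the last variable here)

A = np.diag(-b) + np.diag(np.ones(N - 1), k=1)
B = np.diag(a)

C = solve_continuous_lyapunov(A, -B @ B.T)     # solves A C + C A^T = -B B^T
for tau in (0.0, 0.5, 1.0, 2.0):
    print(f"tau = {tau}:  R_N(tau) = {(expm(A * tau) @ C)[0, 0]:.5f}")
```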
For N = 1 and b_1 = α, a_1 = \sqrt{2\beta^{-1}\alpha} we derive the GLE (198)
with F(t) being the OU process (197).
Consider now the case N = 2 with b_i = α_i, i = 1, 2, and
a_1 = 0, a_2 = \sqrt{2\beta^{-1}\alpha_2}. The GLE becomes

\ddot{Q} = -V'(Q) - \lambda^2 \int_0^t R(t-s)\,\dot{Q}(s)\,ds + \lambda F_1(t),
\dot{F}_1 = -\alpha_1 F_1 + F_2,
\dot{F}_2 = -\alpha_2 F_2 + \sqrt{2\beta^{-1}\alpha_2}\, \dot{W}_2,

with

\beta^{-1} R(t-s) = \langle F_1(t) F_1(s) \rangle.



We can write (206), the N = 2 system above, as a Markovian system for the variables
{Q, P, Z_1, Z_2}:

\dot{Q} = P,
\dot{P} = -V'(Q) + \lambda Z_1,
\dot{Z}_1 = -\alpha_1 Z_1 + Z_2,
\dot{Z}_2 = -\alpha_2 Z_2 - \lambda P + \sqrt{2\beta^{-1}\alpha_2}\, \dot{W}_2.

Notice that this diffusion process is "more degenerate" than (198):
noise acts on fewer of the degrees of freedom.
It is still, however, hypoelliptic (Hörmander’s condition is satisfied):
there is sufficient interaction between the degrees of freedom
{Q, P, Z1 , Z2 } so that noise (and hence regularity) is transferred
from the degrees of freedom that are directly forced by noise to
the ones that are not.
The corresponding Markov semigroup has nice regularizing
properties. There exists a smooth density.
Stochastic processes that can be written as a Markov process
by adding a finite number of additional variables are called
quasi-Markovian.
Under appropriate assumptions on the potential V(Q), the solution
of the GLE is an ergodic process.
It is possible to study the ergodic properties of a quasi-Markovian
process by analyzing the spectral properties of the generator of
the corresponding Markov process.
This leads to the analysis of the spectral properties of hypoelliptic
operators.



When studying the Kac–Zwanzig model we considered a one
dimensional Hamiltonian system coupled to a finite dimensional
Hamiltonian system with random initial conditions (the harmonic
heat bath) and then passed to the thermodynamic limit N → ∞.
We can also consider a small Hamiltonian system coupled to its
environment, which we model as an infinite dimensional
Hamiltonian system with random initial conditions. We then have a
coupled particle-field model.
The distinguished particle (Brownian particle) is described through
the Hamiltonian

H_{DP} = \frac{1}{2} p^2 + V(q).    (207)
2



We will model the environment through a classical linear field
theory (i.e. the wave equation) with infinite energy:

\partial_t^2 \phi(t, x) = \partial_x^2 \phi(t, x).    (208)

The Hamiltonian of this system is

H_{HB}(\phi, \pi) = \int \big( |\partial_x \phi|^2 + |\pi(x)|^2 \big)\, dx.    (209)

π(x) denotes the conjugate momentum field.
The initial conditions are distributed according to the Gibbs
measure (which in this case is a Gaussian measure) at inverse
temperature β, which we formally write as

"\mu_\beta = Z^{-1} e^{-\beta H(\phi, \pi)}\, d\phi\, d\pi".    (210)

Care has to be taken when defining probability measures in
infinite dimensions.
Under this assumption on the initial conditions, typical
configurations of the heat bath have infinite energy.
In this way, the environment can pump enough energy into the
system so that non-trivial fluctuations emerge.
We will assume linear coupling between the particle and the field:

H_I(q, \phi) = q \int \partial_x \phi(x)\, \rho(x)\, dx,    (211)

where the function ρ(x) models the coupling between the particle
and the field.
This coupling is motivated by the dipole coupling approximation
from classical electrodynamics.



The Hamiltonian of the particle-field model is

H(q, p, \phi, \pi) = H_{DP}(p, q) + H_{HB}(\phi, \pi) + H_I(q, \phi).    (212)

The corresponding Hamiltonian equations of motion are a coupled
system of equations for the particle and the field.
Now we can proceed as in the case of the finite dimensional heat
bath. We can integrate the equations of motion for the heat bath
variables and plug the solution into the equations for the Brownian
particle to obtain the GLE. The final result is

\ddot{q} = -V'(q) - \int_0^t R(t-s)\,\dot{q}(s)\,ds + F(t),    (213)

with appropriate definitions for the memory kernel and the noise,
which are related through the fluctuation-dissipation theorem.



Consider now the (N + 1)-dimensional Hamiltonian (particle + heat
bath) with random initial conditions. The (N + 1)-particle probability
distribution function f_{N+1} satisfies the Liouville equation

\frac{\partial f_{N+1}}{\partial t} + \{f_{N+1}, H\} = 0,    (214)

where H is the full Hamiltonian and {·, ·} is the Poisson bracket

\{A, B\} = \sum_{j=0}^{N} \left( \frac{\partial A}{\partial q_j} \frac{\partial B}{\partial p_j} - \frac{\partial B}{\partial q_j} \frac{\partial A}{\partial p_j} \right).

We introduce the Liouville operator

L_{N+1}\, \cdot = -i \{\cdot, H\}.

The Liouville equation can be written as

i \frac{\partial f_{N+1}}{\partial t} = L_{N+1} f_{N+1}.    (215)
We want to obtain a closed equation for the distribution function of
the Brownian particle. We introduce a projection operator P which
projects onto the distribution function f of the Brownian particle:

P f_{N+1} = f, \qquad (I - P) f_{N+1} = h.

The Liouville equation becomes

i \frac{\partial f}{\partial t} = PL(f + h),    (216a)

i \frac{\partial h}{\partial t} = (I - P)L(f + h).    (216b)



We integrate the second equation and substitute into the first
equation. We obtain

i \frac{\partial f}{\partial t} = PLf - i \int_0^t PL\, e^{-i(I-P)Ls} (I-P)L\, f(t-s)\, ds + PL\, e^{-i(I-P)Lt}\, h(0).    (217)

In the Markovian limit (large mass ratio) we obtain the
Fokker-Planck equation (182).



L. Rey-Bellet, Open Classical Systems.
D. Givon, R. Kupferman and A.M. Stuart, Extracting Macroscopic
Dynamics: Model Problems and Algorithms, Nonlinearity 17
(2004) R55-R127.
R.M. Mazo, Brownian Motion, Oxford University Press, 2002.

