
STOCHASTIC CALCULUS

FIRST PART (FQ307, OMI302) and SECOND PART (FQ308) of a global


Course in Stochastic Calculus (OMI302b)

Francesco RUSSO

ENSTA Paris, Institut Polytechnique de Paris


Unité de Mathématiques Appliquées
828, Bd. des Maréchaux
F-91120 Palaiseau
Tél. +33 1 81 87 21 12
E-mail: francesco.russo@ensta-paris.fr
https://uma.ensta-paris.fr/~russo

November 2020

Key words: Stochastic calculus, Stochastic differential equations, Semimartingales, Itô integral, Itô formula, forward integrals, stochastic flows, Girsanov theorem, Bessel processes, Cox-Ingersoll-Ross process, Backward SDEs.
Abstract. The course includes the following chapters of stochastic calculus.

(i) Motivations: stochastic modeling, probabilistic representations of linear PDEs, stochastic control, filtering, mathematical finance.

(ii) Stochastic processes in continuous time: Gaussian processes, Brownian motion, (local) martingales, semimartingales, Itô processes.

(iii) Stochastic integrals: forward and Itô integrals.

(iv) Itô and chain rule formulae, a first approach to stochastic differential
equations.

(v) Girsanov formulae. Novikov and Beneš conditions. Predictable representation of Brownian martingales.

(vi) Stochastic differential equations with Lipschitz coefficients. Markov flows.

(vii) Stochastic differential equations without Lipschitz coefficients: strong existence, pathwise uniqueness, existence and uniqueness in law. Engelbert-Schmidt criterion. Non-explosion conditions.

(viii) Bessel processes and Cox-Ingersoll-Ross model in mathematical finance.

(ix) Backward stochastic differential equations and connections with semilinear PDEs.

1 Introduction

The present document constitutes a second-level course in stochastic calculus. We would like to illustrate the richness of stochastic calculus both in itself and at the level of applications. It is often related to a microscopic way of observing the world, but it also helps to exhibit important links with the macroscopic vision.
A preliminary list of complementary monographs is the following: [17, 22, 30, 26, 29].
The range of applications goes from mathematical biology (neurosciences, population growth, oncology, epidemiology) to engineering sciences, physics, economics and finance. More recently, applications to very recent fields, including big data, have appeared.
We start with some possible problems.

(A) Stochastic analogue of ordinary differential equations

Allowing randomness in some coefficients of an ordinary differential equation (ODE) produces a more realistic mathematical model of the situation.

PROBLEM 1
Consider a simple model of population growth
$$\frac{dN}{dt}(t) = a(t)\, N(t), \qquad N(0) = A, \tag{1.1}$$
where N (t) is the size of the population at time t, a(t) is the growth rate at
time t. It might happen that a(t) is not completely known, but subject to
some random environmental effects, so that we have

a(t) = r(t) + “noise”, (1.2)

where we do not know the exact behaviour of the noise term, only its proba-
bility distribution. The function r(t) is assumed to be nonrandom. How do
we solve (1.1) in this case?

(B) Stochastic approach to a deterministic problem with bound-
ary or initial condition

PROBLEM 2 The most celebrated example is the stochastic solution of


the Dirichlet problem.
Let U be a (reasonable) domain in R^n and f : ∂U → R a continuous function. Find a continuous function f̃ : Ū → R such that

i) f̃ = f on ∂U,

ii) f̃ is harmonic on U, i.e.
$$\Delta \tilde f = \sum_{i=1}^{n} \frac{\partial^2 \tilde f}{\partial x_i^2} = 0 \quad \text{on } U.$$

In 1944 Kakutani proved that the solution could be expressed in terms of Brownian motion (which will be introduced later): $\tilde f(x) = E(f(B^x_\tau))$, where $(B^x_t,\ t \ge 0)$ is a Brownian motion starting from x and τ is the first exit time from the domain.
It turned out that this was just the tip of the iceberg. The Dirichlet problem could be solved for a large class of elliptic (hypoelliptic) second order PDE operators L more general than the Laplacian. In that case the Brownian motion can be replaced by the solution (X_t) of an associated stochastic differential equation.

PROBLEM 3
We consider the problem
$$\begin{cases} \dfrac{\partial u}{\partial t}(t, x) - \dfrac{1}{2}\Delta u(t, x) = 0 \\ u(0, x) = f(x), \end{cases}$$

where f : R^n → R is continuous, non-negative or with polynomial growth. It is possible to show that
$$u(t, x) = E(f(B^x_t)),$$

is a solution of the problem.
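A minimal numerical sketch (not part of the original text) of this probabilistic representation in Python: we approximate u(t, x) = E(f(B_t^x)) by averaging f over simulated endpoints, using B_t^x ∼ N(x, t); the test function f is an arbitrary choice for which the exact solution is known.

```python
import numpy as np

def heat_mc(f, t, x, n_samples=100_000, seed=0):
    """Monte Carlo approximation of u(t, x) = E[f(B_t^x)],
    where B_t^x ~ N(x, t) is a Brownian motion started at x."""
    rng = np.random.default_rng(seed)
    endpoints = x + np.sqrt(t) * rng.standard_normal(n_samples)
    return f(endpoints).mean()

# Example: f(y) = y**2, for which u(t, x) = x**2 + t exactly.
t, x = 1.0, 0.5
print(heat_mc(lambda y: y**2, t, x))  # close to x**2 + t = 1.25
```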

(C) Optimal stopping

PROBLEM 4
Suppose that a person has an asset or resource (e.g. a house, stocks, oil...)
that she is planning to sell. The price Xt at time t of her asset on the open
market varies according to a stochastic differential equation of the same type
as in PROBLEM 1, i.e.
$$\frac{d}{dt} X_t = X_t\, (r + \alpha\, \text{“noise”}),$$
where r, α are known constants. The discount rate is a known constant ρ.
At what time should she decide to sell?
We assume that she knows the behaviour of (Xs ) up to the present time t,
but because of the noise in the system, she can of course never be sure at
the time of the sale if her choice of time will turn out to be the best. So
what we are searching for is a stopping strategy that gives the best result in the long run, i.e. one that maximizes the expected profit when inflation is taken into account.
This is an optimal stopping problem. It turns out that the solution can be
expressed in terms of the solution of a corresponding boundary value problem
(PROBLEM 2), except that the boundary is unknown (free) as well.

(D) Filtering problem.

PROBLEM 5
Suppose that, in order to improve our knowledge about the solution of a
random differential equation, say of (1.1) in PROBLEM 1, we perform ob-
servations Z(s) of N (s) at times s ≤ t. However, due to inaccuracies in our
measurements we do not really measure N (s) but a disturbed version of it,
i.e.
Z(s) = N (s) + “noise”, (1.3)

So in this case, there are two sources of noise, the second coming from the
error of measurement.

The filtering problem is the following. What is the best estimate of N (s)
satisfying (1.1), based on the observations (1.3) when s ≤ t ?
Intuitively, the problem is to filter the noise away from the observations in an
optimal way. In 1960 Kalman and in 1961 Kalman and Bucy proved what is
now known as the Kalman-Bucy filter. Basically the filter gives a procedure
for estimating the state of a system which satisfies a "noisy" linear differential
equation, based on a series of "noisy" observations. Almost immediately the discovery found applications in aerospace engineering (Ranger, Mariner, Apollo etc.), and it has since known a wide range of other applications.
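As an illustration (not part of the original text), the sketch below implements the scalar Kalman filter recursion for a "noisy" linear state equation observed with noise; all model parameters (a, q, r, etc.) are arbitrary choices for the example.

```python
import numpy as np

def kalman_1d(z, a=1.0, q=0.1, r=0.5, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the state x_{k+1} = a x_k + w_k, Var(w_k) = q,
    observed through z_k = x_k + v_k, Var(v_k) = r. Returns filtered means."""
    x, p, out = x0, p0, []
    for zk in z:
        x, p = a * x, a * a * p + q           # prediction step
        k = p / (p + r)                       # Kalman gain
        x, p = x + k * (zk - x), (1 - k) * p  # update step
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(0)
state = np.cumsum(rng.normal(0, np.sqrt(0.1), 200))   # hidden random walk
obs = state + rng.normal(0, np.sqrt(0.5), 200)        # noisy observations
est = kalman_1d(obs)
print(np.mean((est - state)**2), "<", np.mean((obs - state)**2))
```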

(E) Black-Scholes-Samuelson model (Financial mathematics)

It is a continuous time model describing the prices of two financial assets: a risky one (say a stock with price S_t at time t) and a non-risky one (with price S_t^0 at time t). We suppose that the evolution of S_t^0 is described by the
ordinary differential equation

$$dS^0_t = r S^0_t\, dt, \tag{1.4}$$

where r is a positive constant.

Remark 1.1 This equation is the continuous time version of the difference
equations
$$S^0_{n+1} = S^0_n (1 + r). \tag{1.5}$$

In fact (1.5) gives


$$\Delta S^0_n = r S^0_n\, \Delta t, \tag{1.6}$$

where
$$\Delta t = 1, \qquad \Delta S^0_n = S^0_{n+1} - S^0_n.$$

Now (1.6) motivates (1.4).

In (1.4), the interest rate related to the non-risky asset is constant and equal
to r.

We set $S^0_0 = 1$, so that $S^0_t = e^{rt}$, ∀ t ≥ 0.
We suppose that the evolution of the risky asset price (stock) is described by
SDEs of the type (1.1) and (1.2)

$$dS_t = S_t\, (\mu + \text{“noise”})\, dt.$$

At time t, we denote by π_t^0 (resp. π_t) the quantity of non-risky (resp. risky) asset held in the portfolio.
Suppose that at time t = 0 the investor is offered the right (but not the obligation) to buy one unit of the risky asset at a specified price K at a specified future time t = T. Such a right is called a European call option. How much should the person be willing to pay for such an option? This problem was solved when Fischer Black and Myron Scholes (1973) used stochastic analysis to compute a theoretical value for the price, the now famous Black-Scholes option price formula. They also solved the so-called hedging problem, i.e. determining the investment that the seller of the option has to perform in order to reproduce the sum owed to the owner of the option in case he exercises it. In particular the seller has to evaluate the quantities π_t^0 and π_t to invest at each time t in order to be able to reimburse the owner of the option at the maturity T.
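For completeness, here is a short sketch (not in the original notes) of the resulting closed-form Black-Scholes price of a European call with strike K and maturity T; σ denotes the volatility of the risky asset and r the interest rate of (1.4). The numerical inputs are arbitrary.

```python
from math import log, sqrt, exp
from statistics import NormalDist

def bs_call(S0, K, T, r, sigma):
    """Black-Scholes price of a European call with strike K, maturity T."""
    N = NormalDist().cdf
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S0 * N(d1) - K * exp(-r * T) * N(d2)

print(bs_call(S0=100.0, K=100.0, T=1.0, r=0.05, sigma=0.2))  # about 10.45
```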
PROBLEM 6 Stochastic control

It constitutes a maximization (resp. minimization) problem in expectation.


We consider again the previous (Samuelson) model and let πt0 (resp. πt )
denote the quantity of non-risky (resp. risky) asset that an investor owns in
his portfolio.
The total wealth X_t at time t is given by
$$X_t = \pi^0_t S^0_t + \pi_t S_t.$$
We express by $u_t = \frac{\pi_t S_t}{X_t}$ the portion of wealth invested in the stock (risky asset).

At each time t, the investor can decide the portion ut ; he will put (1−ut )Xt in
the non-risky asset. Given a utility function U and a maturity T , the problem
consists in determining the optimal portfolio, i.e. to find a good allocation
investment, u_t, 0 ≤ t ≤ T, which maximizes at the maturity time the expected utility of the wealth $X^{(u)}_t$; one tries to maximize over 0 ≤ u_t ≤ 1 the expected utility
$$E\left\{ U\!\left( X^{(u)}_T \right) \right\}.$$

2 Basic reminders of probability theory

2.1 Preliminaries

In general, we will be given an underlying probability space (Ω, F, P ).


Let (E, E) be a measurable space, i.e. a set E equipped with a σ-field E. In
most cases E will be of the type

E = Rd , C, R, (2.1)

but it can also be an infinite dimensional Banach space E of real continuous


functions on [0, T]. E will then be the Borel σ-field B(E). If E_1, E_2 are Banach spaces then f : E_1 → E_2 is called Borel if it is measurable with respect to the corresponding Borel σ-fields. If X takes values in E (X : Ω → E), we call law of X (or probability measure induced by X) on (E, E) the probability ν = ν_X : E → [0, 1] defined by
$$\nu(B) = P\{X \in B\} = P(X^{-1}(B)).$$

All the σ-fields included in F will be supposed to be completed with the P-null sets. In particular, if X is a G-measurable r.v. and X = Y a.s., then Y is also G-measurable.
By definition, a property Π will be said to hold ν-a.e. on E if it holds for every x ∈ E except on a ν-negligible set. When ν is the Lebesgue measure dx, we omit the mention of ν and we simply say a.e.
A property Π will be said to hold a.s. if it holds P-a.e. on Ω.

If X = (X1 , . . . , Xd ) is a r.v. taking values in Rd (or random vector), we call
characteristic function of X the function ϕX : Rd → C defined by

$$t \mapsto E\{\exp(i\, t \cdot X)\}, \quad t \in \mathbb{R}^d.$$

Remark 2.1 The characteristic function uniquely determines the law of X.

Let X be a square integrable random vector, i.e. such that E(|X|2 ) < ∞.
The covariance matrix of X is the matrix Γ(X) = (σij )1≤i,j≤d whose
coefficients are given by

$$\sigma_{i,j} = \mathrm{Cov}(X_i, X_j) = E\{(X_i - \mu_i)(X_j - \mu_j)\},$$

where µ = (µ1 , ..., µd ) is the expectation of X, i.e. µi = E(Xi ), 1 ≤ i ≤ d.

Remark 2.2 X1 , · · · , Xd are independent if and only if

$$\varphi_X(t) = \prod_{j=1}^{d} \varphi_{X_j}(t_j), \qquad t = (t_1, \cdots, t_d).$$

Remark 2.3 If the r.v. X_1, . . . , X_d are independent then Γ(X) is diagonal. The converse implication is generally false.

2.2 Real Gaussian random variables

Let µ ∈ R, σ > 0. A real r.v. X (i.e. taking values in E = R) is called Gaussian r.v. or normal r.v. with parameters µ and σ², notation X ∼ N(µ, σ²), if its law admits the density
$$f_{\mu,\sigma}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right).$$

The law of X is equally called Gaussian or normal law with parameters µ


and σ 2 . In particular we also have

$$E(X) = \mu, \qquad \mathrm{Var}(X) = \sigma^2.$$

If µ = 0, σ = 1, X is called standard. Similarly the law of X is also called


standard Gaussian law.

If σ = 0, we say that X ∼ N(µ, σ²) if the law of X is the (discrete) Dirac law δ_µ, i.e.
$$\delta_\mu(A) = \begin{cases} 1 & \text{if } \mu \in A \\ 0 & \text{otherwise.} \end{cases}$$
In that case X ≡ µ a.s.

Proposition 2.4 Suppose X ∼ N(µ, σ²). Then
$$E(e^{zX}) = e^{\mu z + \frac{\sigma^2 z^2}{2}}, \qquad \forall z \in \mathbb{C}. \tag{2.2}$$

Remark 2.5 (i) If Y ∼ N(0, 1), we have
$$E(e^{zY}) = e^{\frac{z^2}{2}}, \qquad \forall z \in \mathbb{C}. \tag{2.3}$$
(ii) The characteristic function of X ∼ N(µ, σ²) is given by
$$\varphi_X(t) = e^{i\mu t - \frac{\sigma^2 t^2}{2}}, \qquad \forall t \in \mathbb{R}.$$

2.3 Gaussian vectors

Definition 2.6 A r.v. X = (X_1, . . . , X_d) taking values in R^d is called Gaussian vector if for every reals a_1, . . . , a_d, the real-valued r.v. $\sum_{i=1}^{d} a_i X_i$ is a Gaussian random variable.

We recall that the components X1 , . . . , Xd of a Gaussian vector are Gaussian


random variables. However, if the coordinates of a random vector X are
Gaussian, this is not enough to ensure that X is a Gaussian vector. When the
r.v. are independent then the situation is different. We recall the proposition
below without proof.

Proposition 2.7 Let X1 , . . . , Xd be independent Gaussian r.v. Then the


vector X = (X1 , . . . , Xd ) is Gaussian.

Theorem 2.8 Let X = (X_1, . . . , X_d) be an R^d-valued Gaussian vector. The r.v. X_1, . . . , X_d are independent if and only if the matrix Γ(X) is diagonal.

Proof. Exercise. Hint: use Remark 2.2.

Exercise 2.9 Let X be a square integrable random vector. Prove that the covariance matrix Γ(X) is a symmetric positive semi-definite matrix.

We need to characterize the notion of multi-dimensional Gaussian law. Let µ = (µ_1, ..., µ_d) in R^d and Γ a symmetric positive semi-definite real matrix. A probability ν on B(R^d) is called Gaussian or normal with parameters µ and Γ if a r.v. X distributed according to ν verifies
$$\varphi_X(t) = \exp\left( i\, t \cdot \mu - \frac{t\, \Gamma\, t^\top}{2} \right).$$

We will use the notation X ∼ N (µ, Γ).


This leads naturally to the following proposition whose proof constitutes
again an exercise, at least when Γ is diagonal.

Proposition 2.10 Let X be a Gaussian vector with E(X) = µ and covariance matrix Γ (by the first statement we mean E(X_i) = µ_i, 1 ≤ i ≤ d).

• Then X ∼ N(µ, Γ).

• If Γ is invertible then the Gaussian law is absolutely continuous with density
$$p(x) = \frac{1}{(2\pi)^{d/2}\, \sqrt{\det \Gamma}}\, \exp\left\{ -\frac{1}{2} (x-\mu)' \Gamma^{-1} (x-\mu) \right\}.$$
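As a sanity check of this characterization (an illustration, not part of the original text), one can sample X ∼ N(µ, Γ) as X = µ + LZ, where Γ = LL^⊤ is the Cholesky factorization and Z is a standard Gaussian vector; µ and Γ below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Gamma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])           # symmetric positive definite

L = np.linalg.cholesky(Gamma)            # Gamma = L @ L.T
Z = rng.standard_normal((100_000, 2))    # i.i.d. standard Gaussian vectors
X = mu + Z @ L.T                         # rows are samples of N(mu, Gamma)

print(X.mean(axis=0))                    # close to mu
print(np.cov(X.T))                       # close to Gamma
```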

2.4 Complements on conditional expectation

Here we only recall some well-known results that one does not necessarily have in mind.
Let (Ω, F, P ) be a probability space and G a sub σ-field of F.

Remark 2.11 If X is an integrable r.v., independent of G, then E(X|G) =


E(X) a.s.

The converse of this property does not hold in general. However the interesting result below holds.

Proposition 2.12 Let X be a given r.v. X is independent of a σ-field G if and only if
$$\forall u \in \mathbb{R} : \quad E(e^{iuX} \mid \mathcal{G}) = E(e^{iuX}) \quad \text{a.s.} \tag{2.4}$$

A useful result in the sequel is the following.

Theorem 2.13 Let (E_1, E_1), (E_2, E_2) be two measurable spaces. Let
X be a G-measurable r.v. taking values in (E1 , E1 ) and Y be a r.v. with
values in (E2 , E2 ). We suppose that Y is independent of G.
Let Φ : (E1 × E2 , E1 ⊗ E2 ) −→ C measurable.
We define ϕ : E1 → C by

ϕ(x) = E(Φ(x, Y )), x ∈ E1 .

Then
E(Φ(X, Y )|G) = ϕ(X) a.s.

provided that the left-hand side exists.

Remark 2.14 i) Under the stated hypotheses, we can calculate E(Φ(X, Y )|G)
as if X were constant.

ii) If Φ(x, y) = f(x)g(y), the result follows by the fundamental properties of conditional expectation.

If (ξα )α∈I is a family of integrable r.v. then E(X|ξα , α ∈ I) will be by


definition E(X|σ(ξ_α, α ∈ I)). Let (ξ_1, . . . , ξ_{n+1}) be a square integrable random vector. In general E(ξ_{n+1}|ξ_1, . . . , ξ_n) is the orthogonal projection of ξ_{n+1} on L²(Ω, σ(ξ_1, . . . , ξ_n), P).
Let ξˆn+1 denote the orthogonal projection of ξn+1 on the finite-dimensional
subspace generated by 1, ξ1 , . . . , ξn i.e. the linear regression of ξn+1 on
ξ1 , . . . , ξn . In general if ηn+1 = E(ξn+1 |ξ1 , . . . , ξn ), then

E(ηn+1 − ξn+1 )2 ≤ E(ξˆn+1 − ξn+1 )2 .

Proposition 2.15 Suppose that the covariance matrix Γ^(n) of (ξ_1, . . . , ξ_n) is invertible. Then the linear regression ξ̂_{n+1} of ξ_{n+1} on ξ_1, ..., ξ_n is given by
$$\hat\xi_{n+1} = \mu_{n+1} + \sum_{j=1}^{n} (\xi_j - \mu_j)\, b_j,$$
where µ_ℓ = E(ξ_ℓ), 1 ≤ ℓ ≤ n + 1. The coefficients (b_j) are solutions of the system
$$\begin{pmatrix} \sigma_{1,n+1} \\ \vdots \\ \sigma_{n,n+1} \end{pmatrix} = \Gamma^{(n)} \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}, \qquad \sigma_{ij} = \mathrm{Cov}(\xi_i, \xi_j).$$

Remark 2.16 When (ξ1 , . . . , ξn+1 ) is Gaussian, then the linear regression
ξˆn+1 coincides with E(ξn+1 |ξ1 , . . . , ξn ).
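Proposition 2.15 translates directly into a small linear solve. The sketch below (illustrative, with synthetic Gaussian data and arbitrary coefficients) estimates Γ^(n) and σ empirically, solves Γ^(n) b = σ, and forms the linear regression ξ̂_{n+1}.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 50_000
xi = rng.standard_normal((N, n))                    # samples of (xi_1, ..., xi_n)
xi_next = xi @ np.array([0.5, -1.0, 2.0]) + 0.1 * rng.standard_normal(N)

Gamma_n = np.cov(xi.T)                              # covariance matrix of (xi_j)
sigma = np.array([np.cov(xi[:, j], xi_next)[0, 1] for j in range(n)])
b = np.linalg.solve(Gamma_n, sigma)                 # solve Gamma^(n) b = sigma

xi_hat = xi_next.mean() + (xi - xi.mean(axis=0)) @ b  # linear regression
print(b)                                   # close to [0.5, -1.0, 2.0]
print(np.mean((xi_hat - xi_next)**2))      # close to the noise variance 0.01
```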

3 Stochastic processes, Brownian motion

We introduce now basic material about the topic, which will be widely used
in the sequel. For more specific properties, the reader can however consult
for instance [17, 30].

3.1 Preliminaries in analysis

We go on with some reminders about measure theory on the real line, see
[18] (Chapter 9) and [31] (Chapter 8) for more details. See also [25].

Definition 3.1 (i) A subdivision of an interval [a, b] is a finite sequence T := (t_i)_{0≤i≤n} of ordered real numbers in [a, b], i.e. a = t_0 < t_1 < t_2 < · · · < t_{n-1} < t_n = b. The mesh of this subdivision is $|T| := \sup_{0 \le i \le n-1}(t_{i+1} - t_i)$.

(ii) If f is a function defined on an interval of R, f(x+) (resp. f(x−)) denotes the right-limit (resp. left-limit) of f at point x. We set ∆f(x) = f(x+) − f(x−).
f is said right-continuous or càd (resp. to have left-limits or làg) if for any x, f(x+) = f(x) (resp. f(x−) exists).
f is said left-continuous or càg (resp. to have right-limits or làd) if for any x, f(x−) = f(x) (resp. f(x+) exists).

Let f : [a, b] → R be a real valued function. The total variation ‖f‖_{[a,b]} of f over [a, b] is the supremum of
$$\sum_{i=0}^{n-1} |f(t_{i+1}) - f(t_i)|,$$
over all the subdivisions (t_i)_{i=0}^{n} of [a, b].

A function f : [0, T] → R is said to be of bounded (or finite) variation if the total variation ‖f‖_{[0,T]} is finite. We denote by BV the set of Borel real functions with bounded variation.
An important property of bounded variation functions is the following.

Proposition 3.2 A function f : [0, T ] → R is of bounded variation if and


only if f is a difference of two non-decreasing functions.

Remark 3.3 Let f : [0, T] → R with finite variation.

(i) Then f is bounded and the function t → ‖f‖_{[0,t]} from [0, T] to R_+ is a non-decreasing function.

(ii) If f : [0, T] → R is differentiable and f′ ∈ L¹([0, T]), then f ∈ BV and $\|f\|_{[0,t]} = \int_0^t |f'|(s)\, ds$.

(iii) f is almost everywhere differentiable.

(iv) If f : [0, T] → R is monotone then f is in BV, and ‖f‖_{[a,b]} = |f(b) − f(a)|.

Throughout the course, T > 0 will be a fixed maturity.

3.2 Generalities on continuous time processes

Let (Ω, F, P) be a probability space and E be a set equipped with a σ-algebra E. We call continuous time stochastic process (or simply process) taking values in (E, E) a family (X_t, t ≥ 0) of r.v. defined on (Ω, F, P) with values in (E, E). The latter space will be omitted if self-explanatory.

Remark 3.4 In practice the index t represents the time.

a) A process can be considered as a map R_+ × Ω → E. When that map is measurable, R_+ × Ω being equipped with the product σ-field B(R_+) ⊗ F and E with the σ-field E, then X is said to be a measurable process.

b) To each ω in Ω, we associate the function R_+ → E, t ↦ X_t(ω), called path of the process.

c) When E = Rd , E will be the Borel σ-field.

d) We will also consider processes indexed by the time interval [0, T ]. Most
of the notions introduced for R+ can be expressed for [0, T ]. We will
maintain some ambiguity with respect to this.

We provide now some general definitions about processes.

Definition 3.5 (i) A process (Xt )t≥0 is said to have independent in-
crements if for every 0 ≤ t0 < t1 < . . . < tn < ∞, the family
{Xt1 − Xt0 , . . . , Xtn − Xtn−1 } is independent.

(ii) It is said to be stationary if for every 0 ≤ t0 < t1 < . . . < tn < ∞,


the law of (Xt0 +h , . . . , Xtn +h ) does not depend on h.

(iii) A process (Xt )t≥0 is said to have stationary increments if for every
0 ≤ t0 < t1 < . . . < tn < ∞, the law of (Xt1 +h − Xt0 +h , . . . , Xtn +h −
Xtn−1 +h ) does not depend on h.

(iv) (Xt )t≥0 is said to be square integrable if E(|Xt |2 ) < ∞, ∀t ≥ 0.

(v) Suppose that E = R. (Xt ) is said to be Gaussian, if for every 0 ≤


t0 < t1 < . . . < tn < ∞, (Xt0 , . . . , Xtn ) is a Gaussian vector.

We state now some basic definitions related to the measurability of processes.


A process (Yt ) is called a modification of (Xt ) if Xt = Yt a.s. for every
positive t.
Two processes (Xt ) and (Yt ) are said indistinguishable if there is a P -
negligible set N such that

Xt (ω) = Yt (ω), ∀t ≥ 0, ∀ω ∈ N c .

If (Xt ) and (Yt ) are indistinguishable, then (Xt ) is a modification of (Yt ).


The concept of process may be extended to the case when the parameter
belongs to a subset of Rn . A family of r.v. (Xt , t ∈ A), where A ⊂ Rn ,
is called random field (indexed by A). Similarly as for processes, we can
speak about indistinguishable random fields and modifications.
When E = Rd , a process (Xt )t≥0 is said to be continuous, if a.s. all its
paths are continuous.
More generally, if a.s. all the paths of a process X verify a given property Π,
we will say that X has the property Π. For instance, X is right-continuous
(or càd), right-continuous with left limits (or càdlàg), increasing, hav-
ing bounded variation, if a.s. all the paths of X fulfill the corresponding
property.

Proposition 3.6 There is at most one left-continuous (resp. right-continuous)


modification of a given process in the sense of indistinguishability. In other
words, if (Yt1 )t≥0 and (Yt2 )t≥0 are two left-continuous (resp. right-continuous)
modifications of (Xt )t≥0 , then they are indistinguishable.

Proof.
It is enough to show that two left-continuous (resp. right-continuous) pro-
cesses (Xt ) and (Yt ), modification of each other are indistinguishable. We
only consider the case when the paths are left-continuous, the other cases

being similar. Let N be a negligible set such that for ω ∉ N the paths t ↦ X_t(ω) and t ↦ Y_t(ω) are left-continuous.
On the other hand, for each n ∈ N*, k ∈ N, there is a null set N_{k,n} such that
$$X_{k/2^n}(\omega) = Y_{k/2^n}(\omega), \qquad \forall \omega \notin N_{k,n}.$$

We set $\tilde N = \bigcup_{k,n} N_{k,n} \cup N$, which is still a null set.
By left-continuity of paths, for each ω ∉ Ñ, t ≥ 0, we have
$$X_t(\omega) = \lim_{n\to\infty} \sum_{k\ge 0} 1_{[\frac{k}{2^n}, \frac{k+1}{2^n}[}(t)\, X_{\frac{k}{2^n}}(\omega) = \lim_{n\to\infty} \sum_{k\ge 0} 1_{[\frac{k}{2^n}, \frac{k+1}{2^n}[}(t)\, Y_{\frac{k}{2^n}}(\omega) = Y_t(\omega). \tag{3.1}$$

Finally, (Xt )t≥0 and (Yt )t≥0 are indistinguishable.

A classical criterion to establish whether a process is continuous is the Garsia-Rodemich-Rumsey criterion, see for instance [3].

Proposition 3.7 Let (X_t, 0 ≤ t ≤ T) be a process such that
$$E\left( |X_t - X_s|^\alpha \right) \le C\, |t-s|^{1+\beta}, \qquad 0 \le s, t \le T, \tag{3.2}$$
where C, α, β > 0.
Then, there is a modification of X (still denoted by the same letter) such that for any γ ∈ ]0, β/α[, a.s. we have
$$\frac{1}{h^\gamma}\, \sup_{|t-s|\le h} |X_t - X_s| \to 0, \quad (h \downarrow 0). \tag{3.3}$$

A simplified version of this result is the so-called Kolmogorov theorem.

Corollary 3.8 Let (X_t, 0 ≤ t ≤ T) be a process satisfying (3.2). Then it admits a modification which is Hölder continuous with parameter γ < β/α. In particular, a.s. there is a constant c(ω) such that
$$|X_t - X_s| \le c(\omega)\, |t-s|^\gamma.$$

We now introduce the concept of filtration and related definitions.

Definition 3.9 (i) A filtration (Ft , t ≥ 0) on (Ω, F, P ) will be an in-


creasing family of sub-σ-fields of F. Namely Ft will be a σ-algebra
included in F, and Fs ⊂ Ft , for any 0 ≤ s ≤ t. The σ-field Ft gener-
ally represents the state of the information available at time t.

(ii) A filtration (F_t) is said to satisfy the usual conditions if F_0 contains all the P-negligible sets in F and if it is right-continuous in the sense
$$\mathcal{F}_{t^+} := \bigcap_{s>t} \mathcal{F}_s = \mathcal{F}_t. \tag{3.4}$$
In the sequel all the filtrations will verify the usual conditions.

(iii) A process (Xt ) is said to be adapted to a filtration (Ft , t ≥ 0), if for


each t, Xt is Ft -measurable.

Remark 3.10 In particular Ft contains the family N of the P -negligible


sets of F. The aim of this technical assumption is to allow the validity of the
following property: if ξ, η are two random variables such that ξ = η a.s., and
ξ is Ft -measurable then η is Ft -measurable.
(In fact, the set N = {ω | ξ(ω) ≠ η(ω)} is negligible. For any B ∈ B(R),
$$\eta^{-1}(B) = (\eta^{-1}(B) \cap N^c) \cup \tilde N = (\xi^{-1}(B) \cap N^c) \cup \tilde N \in \mathcal{F}_t,$$
where Ñ = η^{-1}(B) ∩ N is negligible.)

Given a process (Xt )t≥0 , let (Ft ) be the filtration Ft = σ(Xs , s ≤ t). Ob-
viously (Ft ) is the smallest filtration for which (Xt )t≥0 is adapted. A priori
the above filtration does not verify the usual conditions.

Remark 3.11 Given the (natural) filtration (F_t), it is not difficult to show the
existence of a smallest filtration (FtX ) fulfilling the usual conditions such that
Ft ⊂ FtX for every t ∈ [0, T ]. In particular, F0X contains all the P -negligible
subsets in F. This filtration is called the canonical filtration of the process
(Xt )t≥0 . When we speak about a filtration associated with a process (Xt )t≥0
without other comments, we intend its canonical filtration.

Remark 3.12 (i) If (Xt ), (Yt ) are two modifications of each other, they
will have the same canonical filtration.

(ii) If (X_t) is (F_t)-adapted, then every modification remains (F_t)-adapted.

Sometimes it is not sufficient to only suppose that a process (Xt )t≥0 is (Ft )-
adapted, we need a little bit more.

Definition 3.13 (i) A process (X_t)_{t≥0} is said to be progressively measurable (with respect to (F_t)) if for any t ≥ 0, the map (ω, s) ↦ X_s(ω), from (Ω × [0, t], F_t ⊗ B([0, t])) to (E, E), is measurable.

(ii) Let Pprog be the σ-field generated by subsets A ⊂ Ω × [0, T ] in F such


that 1A is a progressively measurable process. It can be checked that a
process is progressively measurable if and only if it is Pprog -measurable.

Remark 3.14 (i) It is obvious that any (Ft )-progressively measurable pro-
cess is (Ft )-adapted.

(ii) If (Xt ) is (Ft )-adapted, and càd (or càg), then (Xt ) is progressively
measurable, see [17] Proposition 1.13, chapter 1.

(iii) An (Ft )-adapted process is nearly progressively measurable. More pre-


cisely, it admits a progressively measurable modification, [17] Proposi-
tion 1.12, chapter 1. This result will not be used in the sequel.

We introduce now the notion of random time and stopping time.


A r.v. τ with values in [0, T] ∪ {∞} (or in R_+ ∪ {∞} if the process is indexed by the whole half-line) is called random time. If (X_t)_{t≥0} is a stochastic process, (X^τ_t) symbolizes the stopped process and is defined by
$$X^\tau_t = X_{t\wedge\tau}, \qquad t \ge 0.$$

A stopping time is a particular random time. We call stopping time with


respect to a filtration (Ft )t≥0 a random time τ such that

{τ ≤ t} ∈ Ft , ∀t ≥ 0.

Let τ be a stopping time. We introduce

Fτ = {A ∈ F|A ∩ {τ ≤ t} ∈ Ft ; ∀t ≥ 0} .

Fτ represents the available information up to τ .

Exercise 3.15 Prove that F_τ is a σ-field, and that it coincides with F_t when τ = t.

Proposition 3.16 Let S, τ be two stopping times.

a) τ is Fτ -measurable.

b) If S ≤ τ a.s., then FS ⊂ Fτ .

c) S ∧ τ = inf(S, τ) is a stopping time. In particular, if t is a deterministic time, then S ∧ t is again a stopping time.

Proof. Left to the reader.

3.3 Brownian motion

A particularly important example of stochastic process is Brownian mo-


tion. It constitutes the basis of a great quantity of stochastic models, in
physics, engineering and finance.

Definition 3.17 A real valued process (Bt )t≥0 will be called Brownian mo-
tion (or Wiener process) with variance σ 2 , if

(i) it has independent increments,

(ii) it is such that Bt − Bs ∼ N (0, σ 2 (t − s)), for any t > s.

Remark 3.18 a) A Brownian motion admits a continuous modification, see Proposition 3.21 below. So (without loss of generality) we will also suppose that
(iii) B is continuous.

b) Assumptions (i) and (ii) imply that the process has stationary increments.

Theorem 3.19 Let (Bt ) be a Brownian motion. If

B0 ≡ x,

for some x ∈ R then it is a Gaussian process.

Proof. It will be enough to show that for all 0 ≤ t0 < t1 < . . . <
tn , (Bt1 − Bt0 , . . . , Btn − Btn−1 ) is a Gaussian vector. This is a consequence
of independence and of the Gaussian character of each component.

Definition 3.20 A Brownian motion (Wiener process) (Wt ) is called stan-


dard or classical if its variance is 1 and W0 = 0 a.s.

Proposition 3.21 Let W = (W_t) be a standard Brownian motion. Then its paths are Hölder continuous with parameter γ < 1/2.

Proof. Let n be a positive integer. If Z is a standard Gaussian r.v. we have
$$E(W_t - W_s)^{2n} = E(\sqrt{t-s}\, Z)^{2n} = c\, |t-s|^n,$$
where c = E(|Z|^{2n}). According to Corollary 3.8 the conclusion follows taking α = 2n, 1 + β = n, so that $\frac{\beta}{\alpha} = \frac{1}{2} - \frac{1}{2n}$.

In the sequel, when we mention a Brownian motion without further precision, this will be a classical Brownian motion.

In this case the law of W_t is N(0, t), i.e. its density with respect to Lebesgue measure is
$$\frac{1}{\sqrt{2\pi t}}\, \exp\left( \frac{-x^2}{2t} \right).$$
We will also need the notion of Brownian motion related to a particular
filtration (Ft )t≥0 .

Definition 3.22 Let (F_t) be a filtration fulfilling the usual conditions. We will
call (Ft )-Brownian motion (with variance σ 2 ), a (continuous) real process
(Bt )t≥0 which verifies the following properties.

(i) ∀t ≥ 0, Bt is Ft -measurable;

(ii) If s < t, Bt − Bs is independent of Fs ;

(iii) If s < t, Bt − Bs ∼ N (0, σ 2 (t − s)).

Also, we will call classical (Ft )-Brownian motion an (Ft )-Brownian mo-
tion (Wt ) if σ = 1 and W0 = 0 a.s.

Remark 3.23 Let (B_t)_{t≥0} be an (F_t)-Brownian motion. Then the filtration σ(B_s, s ≤ t) ∨ N, where N is the family of P-null sets, is the canonical filtration (F_t^B) of B, see Theorem 31, Section 4, Chapter 1 of [29].

Moreover we have the following.

a) (B_t)_{t≥0} is (F_t)-adapted. Point (i) shows that F_t^B ⊂ F_t.

b) A Brownian motion (Bt ) is an (FtB )-Brownian motion.

c) An (Ft )-Brownian motion is always an (FtB )-Brownian motion.

3.4 White noise

We consider a simple model of population growth of the type
$$\frac{dN}{dt} = a(t)\, N(t), \qquad N(0) = A, \tag{3.5}$$
where N(t) is the population size at time t and a(t) is the growth rate at time t. It can happen that a(t) is not completely determined, because it is influenced by some environmental effects; this drives us to
$$a(t) = r(t) + \alpha\,\text{“noise”}, \tag{3.6}$$

where we do not know the exact behaviour of the noise, but only its probability distribution; its "mean" r(t) is known.
A particular case of (3.5) and (3.6) can be
$$\frac{dN}{dt} = (r + \alpha\,\text{“noise”})\, N(t), \qquad N(0) = A. \tag{3.7}$$
N(t) can represent for instance the price of a resource, such as oil, apartments, currencies, stocks and so on. α could represent the volatility of the resource. If α = 0, one would have N(t) = A e^{rt}.
More generally, we will be interested in equations of the type
$$\frac{dX_t}{dt} = b(t, X_t) + a(t, X_t) \cdot \text{“noise”},$$
where b, a are given functions.

It seems reasonable to look for a noise process (ξ_t) such that
$$\frac{dX_t}{dt} = b(t, X_t) + a(t, X_t)\, \xi_t, \qquad t \ge 0. \tag{3.8}$$
Starting from different situations arising for instance in engineering, one could suppose that (ξ_t)_{t≥0} has, at least approximately, the following properties.

(i) the r.v. (ξt )t≥0 are independent;

(ii) the process (ξt )t≥0 is stationary, i.e. the law of


(ξt1 +h , ξt2 +h , . . . , ξtn +h ) does not depend on h for every (t1 , . . . , tn );

(iii) E(ξt ) = 0, ∀t ≥ 0.

Unfortunately, there is no reasonable stochastic process satisfying (i) and (ii), see for instance [26]. For instance, such a (ξ_t) cannot be continuous. If moreover one requests E(ξ_t²) to be finite, then the function (t, ω) ↦ ξ_t(ω) will not even be measurable with respect to the σ-algebra B([0, T]) ⊗ F, where B(I) is the Borel σ-algebra of I, I ⊂ R.
Nevertheless, it is possible to represent (ξ_t)_{t≥0} as a generalized process, see e.g. [15] (with the help of tempered Schwartz distributions S′). Here we will avoid such difficulties, so formally we write $B_t = \int_0^t \xi_s\, ds$ or $B_t(\omega) = \int_0^t \xi_s(\omega)\, ds$. Properties (i), (ii), (iii) can be translated into

(i’) (Bt ) has independent increments,

(ii’) (Bt ) has stationary increments.

On the other hand, logically we have

(iii’) B0 = 0 a.s.

(iv’) (Bt ) is continuous.

Theorem 3.24 There are µ ∈ R, σ ≥ 0 such that (B_t − µt)_{t≥0} is a Brownian motion with variance σ².
Moreover, if σ > 0, $W_t = \frac{1}{\sigma}(B_t - \mu t)$ is a classical Brownian motion.

Proof. See [14].

Equation (3.8) can be formally rewritten as


$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t a(s, X_s)\, dB_s. \tag{3.9}$$

Difficulty. It is possible to show that a.s., every path of the Brownian


motion is nowhere differentiable, so that it is necessary to give a meaning to
the ”stochastic integral” with respect to dBs appearing in (3.9). It is indeed
also possible to show that the paths of B are a.s. never Hölder-continuous
with parameter γ ≥ 1/2.

4 Martingales and semimartingales.

4.1 Continuous time martingales

The following definition is similar to the case of discrete time processes.


Let (Ft )t≥0 be a filtration on the probability space (Ω, F, P ). An adapted
process (Mt ) of integrable r.v., i.e. verifying E(|Mt |) < ∞, ∀t ≥ 0 is

• an (Ft )-martingale if E(Mt |Fs ) = Ms , ∀t ≥ s;

• an (Ft )-supermartingale if E(Mt |Fs ) ≤ Ms , ∀t ≥ s;

• an (Ft )-submartingale if E(Mt |Fs ) ≥ Ms , ∀t ≥ s.

Remark 4.1 From this definition, we can deduce that if (M_t)_{t≥0} is an (F_t)-martingale, then E(M_t) = E(M_0), ∀t ≥ 0. If (M_t)_{t≥0} is an (F_t)-supermartingale (resp. (F_t)-submartingale) then t ↦ E(M_t) is non-increasing (resp. non-decreasing).

Definition 4.2 An (Ft )-martingale (resp. (Ft )-submartingale, (Ft )-supermartingale),


is said square integrable if E(Mt2 ) < ∞, for any t ≥ 0.

When one speaks about a martingale, without σ-field specification, one refers
to the canonical filtration.

Remark 4.3 An (Ft )-martingale (resp. supermartingale, submartingale) is


also an (FtM )-martingale (resp. supermartingale, submartingale).

Exercise 4.4 Let (Mt ; 0 ≤ t ≤ T ) be a continuous (Ft )-martingale such that


E[MT2 ] < ∞. Prove the following.

(i) For any t ∈ [0, T ], E[Mt2 ] < ∞.

(ii) The function t 7→ E[Mt2 ] is non-decreasing.

(iii) For any s, t in [0, T ],

E[(Mt − Ms )2 |Fs ] = E[Mt2 |Fs ] − Ms2 , if 0 ≤ s ≤ t ≤ T.

Exercise 4.5 Let (M_t) be a submartingale and ϕ : R → R a convex function such that ϕ(M_t) is integrable for every t. We suppose that one of the two following properties is realized.

• (Mt ) is a martingale,

• ϕ is non-decreasing.

Then (ϕ(Mt )) is a submartingale.

At this stage we can provide some examples coming from Brownian motion.

Proposition 4.6 Let (W_t)_{t≥0} be an (F_t)-classical Brownian motion. Then the following processes are (F_t)-martingales:
1) W_t;
2) W_t² − t;
3) $\exp\left( \sigma W_t - \frac{\sigma^2}{2}\, t \right)$.

Proof. If s ≤ t then W_t − W_s is independent of the σ-field F_s. So E(W_t − W_s | F_s) = E(W_t − W_s); since a classical Brownian motion has mean zero, E(W_t − W_s) = 0. Therefore 1) follows.
In order to prove 2), we remark that
$$E(W_t^2 - W_s^2 \mid F_s) = E\big( (W_t - W_s)^2 + 2 W_s (W_t - W_s) \mid F_s \big) = E\big( (W_t - W_s)^2 \mid F_s \big) + 2 W_s\, \underbrace{E(W_t - W_s \mid F_s)}_{0};$$
E(W_t − W_s | F_s) = 0 since (W_t) is a martingale. Therefore
$$E(W_t^2 - W_s^2 \mid F_s) = E((W_t - W_s)^2 \mid F_s) = E(W_t - W_s)^2 = t - s,$$
because W_t − W_s is independent of F_s and W_t − W_s ∼ N(0, t − s). We deduce that E(W_t² − t | F_s) = W_s² − s, when s < t.
We finally prove 3). If s < t,
$$E\left( e^{\sigma W_t - \frac{\sigma^2 t}{2}} \mid F_s \right) = e^{\sigma W_s - \frac{\sigma^2 t}{2}}\, E\left( e^{\sigma(W_t - W_s)} \mid F_s \right),$$
because W_s is F_s-measurable. Since W_t − W_s is independent of F_s we have
$$E\left( e^{\sigma(W_t - W_s)} \mid F_s \right) = E\left( e^{\sigma(W_t - W_s)} \right) = E\left( e^{\sigma G \sqrt{t-s}} \right),$$
where G ∼ N(0, 1). At this stage (2.3) implies that the previous quantity equals $e^{\frac{\sigma^2 (t-s)}{2}}$. This yields the announced result.
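Since a martingale has constant expectation (Remark 4.1), a quick simulation can corroborate Proposition 4.6; the sketch below (purely illustrative, with arbitrary t and σ) checks E(W_t) = 0, E(W_t² − t) = 0 and E(exp(σW_t − σ²t/2)) = 1.

```python
import numpy as np

rng = np.random.default_rng(0)
t, sigma, N = 2.0, 0.7, 1_000_000
Wt = rng.normal(0.0, np.sqrt(t), N)                   # W_t ~ N(0, t)

print(Wt.mean())                                      # ~ 0, martingale 1)
print((Wt**2 - t).mean())                             # ~ 0, martingale 2)
print(np.exp(sigma * Wt - sigma**2 * t / 2).mean())   # ~ 1, martingale 3)
```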

The previous processes provide in fact examples of continuous martingales, since they are based on Brownian motion, which is a continuous process.
The reader may ask if the definition of martingale guarantees a minimum of path regularity. This can be ensured since our "usual" filtrations are essentially right-continuous, see [29], Corollary Chapter I, Section 2.

Lemma 4.7 Any (Ft )-martingale admits a càdlàg modification.

Exercise 4.8 Let A be an integrable random variable and (Mt ) a process


such that Mt = E(A|Ft ).

(i) Show that M is an (Ft )-martingale.

(ii) Conclude that M has a càdlàg modification.

If (M_t)_{t≥0} is a martingale, the relation E(M_t | F_s) = M_s, t > s, can be generalized to random times, provided those are bounded stopping times. The next theorem, namely Doob's stopping time theorem, will be stated without proof.

Theorem 4.9 Let (M_t)_{t≥0} be a càdlàg martingale with respect to a filtration (F_t)_{t≥0} and let τ_1, τ_2 be two stopping times such that τ_1 ≤ τ_2 ≤ K, K being a real positive constant. Then M_{τ_2} is integrable and
$$E(M_{\tau_2} \mid F_{\tau_1}) = M_{\tau_1} \quad P\text{-a.s.} \tag{4.10}$$

Remark 4.10 (i) Let τ be a bounded stopping time. The previous result implies that E(M_τ) = E(M_0). To observe this, it is enough to apply Doob's stopping time theorem to τ_1 = 0, τ_2 = τ and to take the expectation of both members.

(ii) If (Mt ) is a submartingale, the same result holds replacing (4.10) with

$$E(M_{\tau_2} \mid F_{\tau_1}) \ge M_{\tau_1} \quad P\text{-a.s.} \tag{4.11}$$

We now provide an application of this result: the evaluation of the law of the hitting time of a barrier by a classical Brownian motion.
Let (X_t)_{t≥0} be a continuous process adapted to a filtration (F_t) and let a ∈ R. We consider the random time
$$T_a = \begin{cases} \inf\{s \ge 0 \mid X_s \ge a\} & \text{if } \{s \ge 0 \mid X_s \ge a\} \ne \emptyset \\ \infty & \text{if } \{s \ge 0 \mid X_s \ge a\} = \emptyset. \end{cases}$$

Proposition 4.11 Ta is a stopping time with respect to (Ft )t≥0 .

Proof. We can replace (Xt ) with an indistinguishable process whose paths
are continuous, see Remark 3.18 a).

Remark 4.12 The path continuity allows us to write T_a(ω) = min{s ≥ 0 | X_s(ω) ≥ a} if {s ≥ 0 | X_s(ω) ≥ a} ≠ ∅. In particular X_{T_a}(ω) = a if {s ≥ 0 | X_s(ω) ≥ a} ≠ ∅.

Since
$$\{T_a \le t\} = \Big\{ \max_{s\le t} X_s \ge a \Big\},$$
it remains to show that max_{s≤t} X_s is F_t-measurable to conclude that T_a is a stopping time. This is true because it is the limit of
$$\max_{s = t k 2^{-n},\ k = 0,\dots,2^n} X_s,$$
which is obviously F_t-measurable.

Proposition 4.13 Let (W_t)_{t≥0} be an (F_t)-classical Brownian motion. If a ∈ R, we set
$$\tau_a = \begin{cases} \inf\{s \ge 0 \mid W_s = a\} & \text{if } \{s \ge 0 \mid W_s = a\} \ne \emptyset \\ \infty & \text{if } \{s \ge 0 \mid W_s = a\} = \emptyset. \end{cases}$$
Then, τ_a is a.s. a finite stopping time and
$$E\left( e^{-\lambda \tau_a} \right) = e^{-\sqrt{2\lambda}\, |a|}, \qquad \lambda \ge 0. \tag{4.12}$$

Remark 4.14 (4.12) provides the Laplace transform of τ_a. As the characteristic function, it uniquely determines the law.
In fact, λ ↦ E(e^{-λτ_a}) is analytic (being holomorphic) on {Re λ > 0}. The same holds for the right member λ ↦ e^{-√(2λ)|a|}.
Equality (4.12) means that the two previous complex functions are identical on R_+. By analytic continuation, they also coincide on {Re λ > 0}. By continuity, they are equal on {Re λ ≥ 0}; this determines the characteristic function, which uniquely determines the law.

Proof (of Proposition 4.13).


There is no restriction of generality to suppose that all paths of (W_t)_{t≥0} are continuous. The obtained indistinguishable process is still an (F_t)-Brownian motion.
We suppose a ≥ 0. Since the paths are continuous, τa = Ta , so that τa is
again a stopping time.
We will apply Doob's stopping time theorem to the martingale
$$M_t = \exp\left( \sigma W_t - \frac{\sigma^2 t}{2} \right).$$
Unfortunately, we cannot apply that theorem directly to τ_a, which is not a priori bounded. However, if n is a strictly positive integer, Proposition 3.16 says that τ_a ∧ n is still a stopping time, which is this time bounded. Applying Remark 4.10 (i), we obtain E(M_{τ_a∧n}) = 1. Since W_r(ω) ≤ a if r ≤ τ_a(ω),
$$M_{\tau_a \wedge n} = e^{\sigma W_{\tau_a \wedge n} - \frac{\sigma^2}{2}(\tau_a \wedge n)} \le e^{\sigma a}.$$

Moreover, if τ_a < ∞, lim_{n→∞} M_{τ_a∧n} = M_{τ_a}. If τ_a = +∞, then for every t ≥ 0, W_t ≤ a, so that
$$M_{\tau_a \wedge n} \le \exp\left( \sigma a - \frac{\sigma^2 n}{2} \right) \longrightarrow_{n\to\infty} 0.$$
In fact, W_{τ_a∧n} ≤ a and τ_a ∧ n = n. Therefore we have lim_{n→∞} M_{τ_a∧n} = 1_{\{τ_a<∞\}} M_{τ_a}.
By the dominated convergence theorem, we have
$$1 = \lim_{n\to\infty} E(M_{\tau_a \wedge n}) = E\left( 1_{\{\tau_a < \infty\}} M_{\tau_a} \right).$$

Since W_{τ_a} = a if τ_a < ∞,
$$E\left( e^{-\frac{\sigma^2 \tau_a}{2}}\, 1_{\{\tau_a<\infty\}} \right) = E\left( M_{\tau_a}\, e^{-\sigma W_{\tau_a}}\, 1_{\{\tau_a<\infty\}} \right) = e^{-\sigma a}\, \underbrace{E\left( M_{\tau_a}\, 1_{\{\tau_a<\infty\}} \right)}_{1} = e^{-\sigma a}.$$

Letting σ go to 0, we obtain

P {τa < ∞} = 1.

This means that the classical Brownian motion reaches the barrier a with probability one. Then
$$E\left( e^{-\frac{\sigma^2 \tau_a}{2}} \right) = e^{-\sigma a}. \tag{4.13}$$
Setting σ = √(2λ), (4.13) yields (4.12) for a ≥ 0.

Let us suppose now that −a > 0. We remark that

τa = inf{s ≥ 0, −Ws = −a}.

Now (−Wt )t≥0 is an (Ft )-Brownian motion and the result follows.
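A discretized simulation can corroborate (4.12). The sketch below (illustrative; the Euler grid and truncation horizon are arbitrary, and discrete monitoring biases the estimate slightly downwards) estimates E(e^{−λτ_a}) and compares it with e^{−√(2λ)|a|}.

```python
import numpy as np

rng = np.random.default_rng(0)
a, lam = 1.0, 0.5
n_paths, dt, T_max = 50_000, 0.01, 100.0

W = np.zeros(n_paths)
tau = np.full(n_paths, np.inf)         # hitting times; inf = not yet hit
t = 0.0
while t < T_max:
    t += dt
    W += rng.normal(0.0, np.sqrt(dt), n_paths)
    tau[np.isinf(tau) & (W >= a)] = t  # record first passage above a

# Paths with tau = inf contribute exp(-inf) = 0, as in the proof.
print(np.exp(-lam * tau).mean())            # slightly below the target
print(np.exp(-np.sqrt(2 * lam) * abs(a)))   # e^{-1} ≈ 0.3679
```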

Remark 4.15 Doob’s stopping theorem cannot be applied to τa . In particu-


lar τa is not bounded. In fact, if it were the case, we would have E (Wτa ) = 0.
Since Wτa = a, we would obtain a contradiction.

Another consequence of Doob's stopping theorem concerns the stopping of a martingale. The result below is stated without proof.

Proposition 4.16 Let (Mt )0≤t≤T be a càdlàg (Ft )-martingale (resp. sub-
martingale, supermartingale). Then M τ is still an (Ft )-martingale (resp.
submartingale, supermartingale).

Remark 4.17 It is immediate to see that (M_{t∧τ})_{0≤t≤T} is an (F_{t∧τ})-martingale. The statement of the proposition is stronger.

4.2 Local martingales, finite variation processes and uniform convergence in probability

We denote by C([0, T]) the vector algebra of continuous processes indexed by [0, T]. C([0, T]) is an F-space (metrizable, with homogeneous distance, complete) equipped with the topology of uniform convergence in probability (ucp). We recall that a sequence of processes X_n in C([0, T]) converges ucp to a process X (necessarily in C([0, T])) if
$$\sup_{t \le T} |X_n(t) - X(t)| \xrightarrow{\ P\ } 0. \tag{4.14}$$

Remark 4.18 X_n → X ucp if and only if ρ(X_n, X) → 0, where ρ is the distance defined by
$$\rho(X, Y) = E\left( \frac{\sup_{t\le T} |X_t - Y_t|}{1 + \sup_{t\le T} |X_t - Y_t|} \right).$$

Remark 4.19 As mentioned earlier, the space C([0, T ]) equipped with the
distance ρ is a complete metric space.
Indeed the space of random elements with values in the Banach space E := C([0, T]) (equipped with the sup-norm ‖·‖) is a complete metric space if equipped with the metric
$$\rho_C(X, Y) = E\left( \frac{\|X - Y\|}{1 + \|X - Y\|} \right),$$
describing the convergence in probability.
So if (X^n) is a Cauchy sequence in C([0, T]), then by definition of ρ, (X^n) is Cauchy as a sequence of random elements with values in E with respect to the metric ρ_C. Therefore there is a random element X with values in E such that X^n converges in probability to X.
Finally this shows that C([0, T]) is complete.

We denote by C the vector algebra of continuous processes indexed by R+ .


We say again that a sequence (Xn ) in C converges ucp to X if (4.14) holds
for every T > 0. C is again an F-space. We denote by CF the subalgebra of
C constituted by (Ft )- adapted continuous processes.
We provide now some reminders concerning the theory of martingales.
Doob's stopping time theorem allows us to obtain estimates for the maximum of a martingale. If (M_t) is a martingale, we can get a bound for the moments of sup_{0≤t≤T} |M_t|. First, we state the so-called Doob inequality.

Proposition 4.20 Let p > 1, 0 ≤ s < T. If (M_t)_{s≤t≤T} is a right-continuous non-negative (F_t)-submartingale, we have
$$E\left( \sup_{s\le t\le T} M_t^p \right) \le \left( \frac{p}{p-1} \right)^{p} E\left( M_T^p \right).$$

Proof. See [17], Theorem 3.8, ch.1.

Corollary 4.21 Let p > 1, 0 ≤ s < T. Let (M_t)_{s≤t≤T} be a right-continuous (F_t)-martingale. We have
$$E\left( \sup_{s\le t\le T} |M_t|^p \right) \le \left( \frac{p}{p-1} \right)^{p} E\left( |M_T|^p \right).$$

Proof. (|M_t|)_{s≤t≤T} is a submartingale, since x → |x| is a convex function.

Example 4.22 If p = 2, we have
$$E\left( \sup_{0\le t\le T} |M_t|^2 \right) \le 4\, E\left( |M_T|^2 \right).$$
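A quick simulation with Brownian motion (a martingale by Proposition 4.6) illustrates the case p = 2; this sketch is purely illustrative, with an arbitrary grid.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_steps, n_paths = 1.0, 1_000, 10_000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
W = np.cumsum(dW, axis=1)                # Brownian paths on [0, T]

lhs = (np.abs(W).max(axis=1)**2).mean()  # E[ sup_t |W_t|^2 ], about 1.5
rhs = 4 * (W[:, -1]**2).mean()           # 4 E[ |W_T|^2 ], about 4 T = 4
print(lhs, "<=", rhs)
```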

Definition 4.23 Let I = R+ or [0, T ]. A process (Xt )t∈I , is called (Ft )-


local martingale if there is an increasing sequence (τn ) of stopping times
such that

• X τn is an (Ft )-martingale;

• limn→∞ τn = +∞ a.s.

Remark 4.24 • An (Ft )-martingale is an (Ft )-local martingale.

• The set of (Ft )-local martingales is a linear space.

• If M is a càdlàg (Ft )-local martingale, τ is a stopping time, then M τ


is again an (Ft )- local martingale.

• A similar notion can be defined for local submartingale and local


supermartingale.

Definition 4.25 A process S is called (continuous) (Ft )-semimartingale


if it is the sum of a (continuous) (Ft )-local martingale and an (Ft )-adapted
(continuous) finite variation process.

Example 4.26 We give two very simple examples of semimartingales arising


from standard Brownian motion W .

• W itself since it is a martingale.

• St := sups≤t Ws . Why?

Remark 4.27 When the filtration is not mentioned, again the canonical fil-
tration will be underlying.

Proposition 4.28 (Stricker theorem).
Let S be an (Ft )- semimartingale and (Gt ) be a subfiltration of (Ft ) such that
S is (Gt )- adapted. Then S is also a (Gt )- semimartingale.

Proof. See [29], Th. II 2.4 and [34].

We may ask under which conditions a local martingale is a true martingale. The next lemma provides a partial answer.

Lemma 4.29 Let (M_t)_{t∈[0,T]} be an (F_t)-càdlàg local martingale such that
$$\sup_{t\in[0,T]} |M_t| \in L^1.$$
Then (M_t)_{t∈[0,T]} is an (F_t)-martingale.

Proof. Suppose that $M_T^\star = \sup_{t\in[0,T]} |M_t|$ is an integrable r.v. In particular M_t ∈ L¹(Ω) for all t ∈ [0, T]. Let (τ_n)_{n≥1} be a non-decreasing sequence of stopping times such that τ_n ↑ +∞ and (M_{t∧τ_n}; 0 ≤ t ≤ T) is an (F_t)-martingale.
Obviously, for any t ∈ [0, T], M_{t∧τ_n} converges a.s. as n → ∞ to M_t, and $\sup_{t\in[0,T]} |M_{t\wedge\tau_n}| \le M_T^\star \in L^1(\Omega)$. Hence by the Lebesgue dominated convergence theorem, for every t ∈ [0, T], M_{t∧τ_n} converges to M_t in L¹(Ω).
Let us check the martingale property. Let 0 ≤ s ≤ t ≤ T and Λ ∈ F_s. Since (M_{t∧τ_n}; 0 ≤ t ≤ T) is a martingale,
$$E[M_{s\wedge\tau_n} 1_\Lambda] = E[M_{t\wedge\tau_n} 1_\Lambda].$$
The dominated convergence theorem may be applied; n going to ∞, we obtain E[M_s 1_Λ] = E[M_t 1_Λ]. This proves that (M_t) is a martingale.

A basic decomposition in stochastic analysis is the following. It is a particular case among many theorems in the literature; we only state this special case, which is a consequence of Theorem 4.10 and Remark 4.16 of Chapter 1 of [17].

Theorem 4.30 (Doob-Meyer decomposition of a continuous submartingale)
Let X be a continuous (F_t)-submartingale which is non-negative, or such that there is γ > 1 with E(|X_T|^γ) < ∞. Then there is a continuous (F_t)-martingale M and an adapted, continuous, increasing process V (with V_0 = 0) such that X = M + V. The decomposition is unique.

The moment condition above can be replaced by a uniform integrability con-


dition.

Remark 4.31 • In particular, an (Ft )-submartingale is an (Ft )-semimartingale.

• The uniqueness can also be a consequence of the uniqueness of the de-


composition for Dirichlet processes.

By localization arguments (that we will omit) it is possible to establish the


following.

Proposition 4.32 Let M be a continuous (F_t)-local martingale. There is a continuous (F_t)-local martingale N and an increasing process V such that M² = N + V.
The decomposition is unique if we set V_0 = 0.

Exercise 4.33 • Prove the proposition if M is an (Ft )-martingale.

• Provide the explicit Doob-Meyer decomposition for the submartingale


(Wt2 ) if W is a classical Wiener process.

Definition 4.34 Let M be a continuous (F_t)-local martingale. We denote by ⟨M⟩ the increasing process vanishing at zero which appears in the Doob-Meyer decomposition of the local submartingale M². As a result, M² − ⟨M⟩ is an (F_t)-local martingale. ⟨M⟩ will be called oblique bracket or angle bracket.

Remark 4.35 The angle bracket of a classical Brownian motion W equals ⟨W⟩_t = t.

Remark 4.36 Let K = (K_s) be a process such that $\int_0^T |K_s|\, ds < \infty$ a.s.
(i) $V_t = \int_0^t K_s\, ds$ defines a finite variation process and its total variation process is given by
$$\|V\|_{[0,t]} = \int_0^t |K_s|\, ds.$$

(ii) Exercise: deduce that V is a semimartingale with respect to some fil-


tration.

5 Covariation and stochastic integrals via discretization

5.1 Definitions and fundamental properties

We introduce here the notion of covariation of two processes and an associated integral, close to a similar definition of Hans Föllmer, see [12].
Let (X_t)_{t∈[0,T]} be a càdlàg process and let (Y_t)_{t≥0} be a càdlàg or càglàd process.

Remark 5.1 The process Y_{t−} := lim_{s↑t} Y_s is well-defined and it is càglàd.
A càglàd process is locally bounded in the sense that there exists a sequence (τ_n) of stopping times such that Y^{τ_n} is bounded. This happens setting τ_n := inf{s ∈ [0, T] : |Y_s| > n}. See [29] Theorem 15, Chapter IV.

In the sequel, unless mentioned otherwise, processes (X_t)_{t∈[0,T]} will be prolonged by continuity to the whole real line, using the same notation. In particular, we have
$$X_t = \begin{cases} X_0 & \text{if } t \le 0 \\ X_T & \text{if } t > T. \end{cases}$$

In the sequel we will often have to deal with processes depending on a parameter ε which converge to zero when ε → 0. For this reason, the generic notation (R(ε, s), s ∈ [0, T]) indicates a family of processes such that
$$\sup_{s\le T} |R(\varepsilon, s)| \to 0, \quad \varepsilon \to 0,$$
in probability. In the definition below we come back to the notion of subdivision introduced in Definition 3.1.

Definition 5.2 (i) Let us consider a subdivision of [0, T ], which is, by


definition, a finite subset of the type

Π = {0 = t0 < t1 < · · · < tn = T }. (5.1)

For a sequence (Π_m) of subdivisions of [0, T] of the type Π_m = {0 = t^m_0 < t^m_1 < · · · < t^m_n = T}, we say that its mesh converges to zero if $|\Pi_m| := \sup_{0\le i\le n-1}(t^m_{i+1} - t^m_i)$ converges to 0 when m → ∞.

(ii) We consider a subdivision Π of the type (5.1). For t ∈ [0, T], we set
$$S_t^-(\Pi, Y, X) = \sum_{i=0}^{n-1} Y_{t_i\wedge t}\, (X_{t_{i+1}\wedge t} - X_{t_i\wedge t})$$
$$S_t^+(\Pi, Y, X) = \sum_{i=0}^{n-1} Y_{t_{i+1}\wedge t}\, (X_{t_{i+1}\wedge t} - X_{t_i\wedge t})$$
$$S_t^\circ(\Pi, Y, X) = \sum_{i=0}^{n-1} \frac{Y_{t_{i+1}\wedge t} + Y_{t_i\wedge t}}{2}\, (X_{t_{i+1}\wedge t} - X_{t_i\wedge t}) = \frac{S_t^-(\Pi, Y, X) + S_t^+(\Pi, Y, X)}{2}$$
$$C_t(\Pi, Y, X) = \sum_{i=0}^{n-1} (X_{t_{i+1}\wedge t} - X_{t_i\wedge t})(Y_{t_{i+1}\wedge t} - Y_{t_i\wedge t}).$$

Provided that the corresponding limits exist in probability and do not depend on the underlying sequence of subdivisions, we introduce the following definite integrals and covariations.
a) We denote
$$\int_0^t Y\, d^- X := \lim_{|\Pi|\to 0} S_t^-(\Pi, Y, X).$$
b) We set
$$\int_0^t Y\, d^\circ X := \lim_{|\Pi|\to 0} S_t^\circ(\Pi, Y, X).$$

We introduce now the (indefinite) integrals as processes. We say that the forward (resp. symmetric) integral of Y with respect to X exists if, for any t ∈ [0, T], $\int_0^t Y\, d^- X$ (resp. $\int_0^t Y\, d^\circ X$) exists and the process $(\int_0^t Y\, d^- X)_t$ (resp. $(\int_0^t Y\, d^\circ X)_t$) admits a càdlàg version. In that case that version will be chosen by default.
If $S_t^-(\Pi, Y, X)$ (resp. $S_t^\circ(\Pi, Y, X)$) converges ucp (without dependence on the chosen sequence of subdivisions), then we say that the forward (resp. symmetric) integral exists in the ucp sense.
If the forward integral of Y with respect to X exists, for a < b in [0, T] we also set $\int_a^b Y\, d^- X := \int_0^b Y\, d^- X - \int_0^a Y\, d^- X$. Similar considerations can be done for the symmetric integrals.
We define now the notions of covariation and quadratic variation.

c) (Covariation)
We set
$$[Y, X]_t := \lim_{|\Pi|\to 0} C_t(\Pi, Y, X) \tag{5.2}$$
if the limit holds in the ucp sense. [Y, X] is called the covariation of Y and X. If Y = X, then [X] := [X, X] is called quadratic variation of X.
d) (α-variation) Let α ≥ 1. We set
$$[X]^\alpha_t = \lim_{|\Pi|\to 0} \sum_{i=0}^{n-1} |X_{s_{i+1}\wedge t} - X_{s_i\wedge t}|^\alpha.$$

A priori, the previous objects, if they exist, could depend on the chosen family of subdivisions; we will suppose moreover that they are in fact independent of it.

Remark 5.3 i) The covariation of continuous processes is a symmetric


operation.

ii) Let τ be a random time. If [X, Y] exists then
$$[Y^\tau, X^\tau]_t = [Y, X]_{t\wedge\tau}, \tag{5.3}$$
where X^τ is the process X stopped at time τ, defined by X^τ_t = X_{t∧τ}, t ≥ 0.

iii) If the forward (resp. symmetric) integral exists in the ucp sense then
the (indefinite) forward (resp. symmetric) integral exists.

iv) As covariations, also stochastic integrals, if they exist in the ucp sense, allow a stopping property:
$$\int_0^t Y^\tau\, d^\star X = \int_0^t Y\, d^\star X^\tau = \int_0^t Y^\tau\, d^\star X^\tau = \int_0^{t\wedge\tau} Y\, d^\star X,$$
where ⋆ denotes − or ◦.

v) [X, X] coincides with the 2-variation [X]2 .

Remark 5.4 When it exists, [X]^α is an increasing process. If H is measurable (pathwise bounded or positive) one can always define
$$\int_0^\cdot H_s\, d[X]^\alpha_s.$$
In particular, if α = 2, the integral $\int_0^\cdot H_s\, d[X, X]_s$ is meaningful.

Definition 5.5 • If [X, X] exists, then X will be said to be a finite quadratic variation process. [X, X] is called quadratic variation of X. We also use for the covariation the terminology square bracket.

• If [X, X] = 0 (resp. [X]^α = 0), X is called zero quadratic variation process (resp. zero α-variation process).

Proposition 5.6 Suppose that X is a continuous process. If [X]α exists


then [X]β ≡ 0 for any β > α. In particular, if [X, X] exists, then [X]α = 0
for every α > 2.

Proof. Let ε > 0 and let Π be such that |Π| < ε. Setting s_i = t_i ∧ t, 0 ≤ i ≤ n, we have
$$\sum_{i=0}^{n-1} |X_{s_{i+1}} - X_{s_i}|^\beta \le \sup_{0\le i\le n-1} |X_{s_{i+1}} - X_{s_i}|^{\beta-\alpha}\, \sum_{i=0}^{n-1} |X_{s_{i+1}} - X_{s_i}|^\alpha \le \delta(X, \varepsilon)^{\beta-\alpha}\, \sum_{i=0}^{n-1} |X_{s_{i+1}} - X_{s_i}|^\alpha,$$
where δ(X, ε) is the modulus of continuity of X.
Since the paths of X are uniformly continuous on [0, T], the previous expression converges a.s. (and therefore in probability) to zero.

Definition 5.7 A vector (X¹, . . . , Xⁿ) of càdlàg processes is said to have all its mutual brackets (covariations) if [Xⁱ, Xʲ] exists for every 1 ≤ i, j ≤ n.

Remark 5.8 If (X 1 , . . . , X n ) has all its mutual covariations, then we have

[X i + X j , X i + X j ] = [X i , X i ] + 2[X i , X j ] + [X j , X j ]; (5.4)

from the previous equality, it follows that [Xⁱ, Xʲ] is the difference of increasing processes, therefore it has bounded variation; so that bracket is a classical integrator in the Lebesgue-Stieltjes sense.

Remark 5.9 a) Relation (5.4) holds as soon as three of the appearing


brackets exist. More generally an identity of the type I1 + · · · + In = 0
has the following meaning: if n − 1 terms among the Ij exist, the re-
maining one also makes sense and the identity holds true.

b) We will see later, studying Dirichlet processes, that a covariation process


[X, Y ] will not always be a finite variation process; in particular (X, Y )
may not have all its mutual brackets and [X, Y ] will however exist.

The properties below can be established in an elementary way by working out the definition of the integral via discretization.

Proposition 5.10 Let X = (Xt )t∈[0,T ] be a continuous process and Y =


(Yt )t∈[0,T ] be a càdlàg (or càglàd) process.
(i) We have $\int_0^\cdot Y\, d^\circ X = \int_0^\cdot Y\, d^- X + \frac{1}{2}[Y, X]$.
(ii) (Integration by parts).
We have
$$X_t Y_t = X_0 Y_0 + \int_0^t X\, d^- Y + \int_0^t Y\, d^- X + [X, Y]_t, \qquad t \in [0, T],$$

(iii) (Kunita-Watanabe).
If X and Y are such that (X, Y ) has all its mutual brackets, we have
$$|[X, Y]| \le \{[X, X]\, [Y, Y]\}^{\frac{1}{2}}.$$

(iv) If X is a finite quadratic variation process and Y is a zero quadratic
variation process then (X, Y ) has all its mutual brackets and [X, Y ] = 0.

(v) Let X be a bounded variation continuous process. Then for any càdlàg (or càglàd) process Y we have
a) $\int_0^t Y\, d^- X = \int_0^t Y\, dX = \int_0^t Y\, d^\circ X$,
b) [X, Y] = 0. In particular a bounded variation continuous process is a zero quadratic variation process.

Proof (of Proposition 5.10).

(i) Let
Π = {0 = t0 < ... < tn = T }, (5.5)

be an element of a sequence of subdivisions of [0, T ] whose mesh con-


verges to zero. For fixed t ∈ [0, T],
$$S_t^\circ(\Pi, Y, X) = \sum_i \frac{Y_{t_{i+1}\wedge t} + Y_{t_i\wedge t}}{2}\, (X_{t_{i+1}\wedge t} - X_{t_i\wedge t}) = S_t^-(\Pi, Y, X) + \frac{1}{2}\, C_t(\Pi, X, Y).$$
Taking the limit when |Π| goes to zero gives the result (i).

(ii) Let Π be of the type (5.5). For t ∈ [0, T], we write


$$X_t Y_t - X_0 Y_0 = \sum_{i=0}^{n-1} (X_{t_{i+1}\wedge t} Y_{t_{i+1}\wedge t} - X_{t_i\wedge t} Y_{t_i\wedge t}) = S_t^-(\Pi, Y, X) + S_t^-(\Pi, X, Y) + C_t(\Pi, X, Y).$$

Letting |Π| go to zero, the integration by parts is then established.

(iii) This follows from the Cauchy-Schwarz inequality:
$$|C_t(\Pi, Y, X)| \le \sqrt{C_t(\Pi, Y, Y)\, C_t(\Pi, X, X)}.$$

(iv) is a direct consequence of (iii).

(v) The first equality of a) follows from the Lebesgue dominated convergence theorem. In fact $S_t^-(\Pi, Y, X) = \int_0^t Y^\Pi\, dX$, where
$$Y_s^\Pi = \sum_{i=0}^{n-1} Y_{t_i}\, 1_{[t_i, t_{i+1}[}(s).$$

Clearly, for almost all ω,
$$\sup_{t\le T} \left| \int_0^t (Y^\Pi - Y)\, dX \right| \le \int_0^T |Y_s^\Pi - Y_s|\, d\|X\|_{[0,s]}$$
converges to zero since, for each underlying ω, the set of jumps of Y is countable and the total variation measure d‖X‖ of each singleton is zero.
Concerning b), we also use the Lebesgue dominated convergence theorem. Now
$$C_t(\Pi, Y, X) = \int_0^t (Y^{+,\Pi} - Y^\Pi)\, dX + R(|\Pi|, t),$$
where
$$Y_s^{+,\Pi} = \sum_{i=0}^{n-1} Y_{t_{i+1}}\, 1_{]t_i, t_{i+1}]}(s).$$

By arguments similar to those for a), sup_{t≤T} |C_t(Π, Y, X)| converges to zero a.s.
This, together with the first equality of a) and with (i), gives the second equality of a).

We need at this point a Dini type lemma in the stochastic framework. The deterministic version of this lemma is well-known.

Lemma 5.11 Let (Z_ℓ(t), t ∈ [0, T]) be a sequence of continuous processes verifying the following properties.

(i) ∀ℓ ∈ N*, t → Z_ℓ(t) is non-decreasing.

(ii) There is a continuous process (Z(t))_{0≤t≤T} such that Z_ℓ(t) → Z(t) in probability when ℓ goes to infinity.

Then Z_ℓ converges to Z ucp.

Given a process X, the following corollary shows that the convergence in


probability of expression (5.2) with X = Y , guarantees the existence of
[X, X].

Corollary 5.12 Let (X_t)_{0≤t≤T} be a process. Suppose there is a continuous process (A_t) such that C_t(Π, X, X) converges in probability to A_t for any t ∈ [0, T], as |Π| → 0. Then X is a finite quadratic variation process and [X] = A.

Proof (of Lemma 5.11). Since the process $Z$ is continuous and because of assumptions (i) and (ii), it is clear that $Z$ is non-decreasing.
Let $\rho, \alpha > 0$, $N\in\mathbb N^*$. We set $t_i^N = \frac{iT}{N}$, $0\le i\le N$. For almost all $\omega$ in $\Omega$,
$$\sup_i |Z(t_{i+1}^N) - Z(t_i^N)|(\omega) \le \delta\Big(Z(\cdot,\omega);\frac{T}{N}\Big),$$
where $\delta\big(Z(\cdot,\omega);\frac{T}{N}\big)$ is the continuity modulus of $Z(t,\omega)$, $0\le t\le T$. The a.s. convergence of $\delta\big(Z(\cdot,\omega);\frac{T}{N}\big)$ to zero, when $N\to\infty$, implies the one in probability. Therefore, we choose $N$ (fixed) so that
$$P\Big\{\delta\Big(Z(\cdot);\frac{T}{N}\Big) > \frac{\alpha}{4}\Big\} \le \rho.$$
We define
$$A = \Big\{\sup_{0\le t\le T}|Z_\ell(t) - Z(t)| > \alpha\Big\}.$$

From now on, we will simply write $t_i = t_i^N$, $0\le i\le N$. Since
$$\sup_{0\le t\le T}|Z_\ell(t) - Z(t)| = \sup_{0\le i\le N-1}\Big(\sup_{t\in[t_i, t_{i+1}]}|Z_\ell(t) - Z(t)|\Big),$$
the event $A$ is included in the union $\bigcup_{i=0}^{N-1} A_i$ where
$$A_i = \Big\{\sup_{t\in[t_i,t_{i+1}]}|Z_\ell(t) - Z(t)| > \alpha\Big\}, \quad 0\le i\le N-1.$$
Let $t\in[t_i, t_{i+1}]$. Since the processes $Z_\ell$ and $Z$ are non-decreasing, we have
$$Z_\ell(t) - Z(t) \le Z_\ell(t_{i+1}) - Z(t_i).$$

We rearrange the right-hand side of the above inequality,
$$Z_\ell(t_{i+1}) - Z(t_i) = Z_\ell(t_{i+1}) - Z(t_{i+1}) + Z(t_{i+1}) - Z(t_i).$$
Then
$$Z_\ell(t) - Z(t) \le Z_\ell(t_{i+1}) - Z(t_{i+1}) + \delta\Big(Z(\cdot);\frac{T}{N}\Big).$$
Similarly
$$Z_\ell(t) - Z(t) \ge Z_\ell(t_i) - Z(t_i) - \delta\Big(Z(\cdot);\frac{T}{N}\Big).$$
Therefore,
$$|Z_\ell(t) - Z(t)| \le \delta\Big(Z(\cdot);\frac{T}{N}\Big) + |Z_\ell(t_{i+1}) - Z(t_{i+1})| + |Z_\ell(t_i) - Z(t_i)|, \quad \forall t\in[t_i, t_{i+1}].$$
Consequently,
$$A_i \subset \Big\{\delta\Big(Z(\cdot);\frac{T}{N}\Big) > \frac{\alpha}{2}\Big\} \cup \tilde A_i \cup \tilde A_{i+1},$$
where
$$\tilde A_i = \Big\{|Z_\ell(t_i) - Z(t_i)| > \frac{\alpha}{4}\Big\}.$$
Finally,
$$P(A) \le P\Big(\delta\Big(Z(\cdot);\frac{T}{N}\Big) > \frac{\alpha}{4}\Big) + P\Big(\bigcup_{i=0}^{N}\tilde A_i\Big) \le P\Big(\delta\Big(Z(\cdot);\frac{T}{N}\Big) > \frac{\alpha}{4}\Big) + \sum_{i=0}^{N} P(\tilde A_i).$$

Since $Z_\ell(t_i)$ converges in probability to $Z(t_i)$ for every $0\le i\le N$, letting $\ell$ go to $\infty$ yields $\limsup_{\ell\to\infty} P(A) \le \rho$. Since $\rho$ is arbitrary, the result follows.

5.2 The quadratic variation of a (continuous) martingale and


a semimartingale

Let $(M_t; t\ge 0)$ be a continuous local martingale. Then, according to Proposition 4.32, $(M_t^2; t\ge 0)$ admits a Doob-type decomposition
$$M_t^2 = N_t + A_t, \quad t\ge 0, \qquad (5.6)$$

where (Nt ; t ≥ 0) is a continuous local martingale, (At , t ≥ 0) is a non-


decreasing, adapted, continuous process and A0 = 0.
The main result of this subsection is the following.

Theorem 5.13 Let (Mt ; t ≥ 0) be a continuous (Ft )-local martingale. Then
the process A coming from the Doob decomposition of M 2 coincides with
[M, M ]. In particular (Mt2 − [M, M ]t ; t ≥ 0) is a continuous (Ft )-local mar-
tingale. If (Mt ; t ≥ 0) is a continuous and square integrable martingale (see
Definition 4.2) then (Mt2 − [M, M ]t ; t ≥ 0) is a continuous martingale and
E[[M, M ]t ] < ∞, for any t ≥ 0.

Corollary 5.14 Let W be standard (Ft )-Brownian motion. Then [W ]t ≡ t.

Proof. By Proposition 4.6 Wt2 − t is a martingale. The result follows from


Theorem 5.13.
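The identity $[W]_t \equiv t$ can be visualized numerically. Below is a minimal sketch in Python with NumPy (the language, library, grid sizes and random seed are our own choices, not part of the course): it simulates one Brownian path and computes $C(\Pi, W, W)_T$ on finer and finer subdivisions; the sums cluster around $T$ as $|\Pi|\to 0$, in agreement with Corollary 5.14.

```python
import numpy as np

# Numerical illustration of Corollary 5.14: the discretized quadratic
# variation C(Pi, W, W)_T of a Brownian path approaches T as |Pi| -> 0.
# Step sizes and random seed are arbitrary choices for the experiment.
rng = np.random.default_rng(0)
T = 1.0
n_max = 2**16
dW = rng.normal(0.0, np.sqrt(T / n_max), size=n_max)
W = np.concatenate(([0.0], np.cumsum(dW)))  # path sampled on the finest grid

for n in [2**4, 2**8, 2**12, 2**16]:
    step = n_max // n
    incr = W[::step][1:] - W[::step][:-1]   # increments on the mesh-T/n subdivision
    print(f"n = {n:6d}   C(Pi, W, W)_T = {np.sum(incr**2):.4f}")
# The printed sums converge to T = 1 (the convergence is in probability).
```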

The proof of Theorem 5.13 is adapted from Theorem 5.8 chap. 1 of [17]. The
key fact employed here is that, when squaring sums of martingale increments
and taking the expectation, one can neglect the cross-product terms. Before
the sketch of the proof we propose an easy exercise.
Let $A$ be the process appearing in the decomposition (5.6).

Exercise 5.15 Let M be an (Ft )-square integrable martingale. Let 0 ≤ a0 ≤


a ≤ b ≤ u ≤ v. We have the following.

(i) E ((Mv − Mu )(Mb − Ma )|Fa0 ) = 0.

(ii)

E[(Mb − Ma )2 |Fa0 ] = E[Mb2 − Ma2 |Fa0 ] = E(Ab − Aa |Fa0 ); (5.7)

Exercise 5.16 Let Π = {0 = t0 < . . . < tn = T } be a subdivision of [0, T ].


Let $M$ be a square integrable martingale and let $M^2 = N + A$ be the Doob decomposition of $M^2$, where $N$ is a martingale and $A$ is an increasing process vanishing at zero. Then
E(C(Π, M, M )t ) = E(At ).

Proof (of Theorem 5.13). Let 0 ≤ t ≤ T . Let Π be a subdivision of [0, T ]


as in (5.1). We denote by $\Pi_t$ the subdivision $\{0 = s_0 < s_1 < \ldots < s_m\}$, where the $\{s_i\}$ are the distinct elements of $[0,T]$ obtained by setting $s_i = t_i\wedge t$.

Lemma 5.17 Let $M$ be a square integrable $(\mathcal F_t)$-martingale and let $K$ be such that $\sup_{t\le T}|M_t| \le K$. Then, for every $t\in[0,T]$,
$$E\big(C(\Pi, M, M)_t^2\big) \le 3K^4.$$

Proof. Using the linearity of conditional expectation and the martingale property (see Exercise 5.15), for $0\le k\le m-1$ we have
$$E\Big[\sum_{j=k+1}^{m-1}(M_{s_{j+1}} - M_{s_j})^2\,\Big|\,\mathcal F_{s_{k+1}}\Big] = \sum_{j=k+1}^{m-1} E\big((M_{s_{j+1}} - M_{s_j})^2\,\big|\,\mathcal F_{s_{k+1}}\big) = \sum_{j=k+1}^{m-1} E\big(M_{s_{j+1}}^2 - M_{s_j}^2\,\big|\,\mathcal F_{s_{k+1}}\big) \qquad (5.8)$$
$$= E\Big(\sum_{j=k+1}^{m-1}(M_{s_{j+1}}^2 - M_{s_j}^2)\,\Big|\,\mathcal F_{s_{k+1}}\Big) = E\big(M_{s_m}^2 - M_{s_{k+1}}^2\,\big|\,\mathcal F_{s_{k+1}}\big) \le E\big(M_{s_m}^2\,\big|\,\mathcal F_{s_{k+1}}\big) \le K^2.$$

So
$$E\Big[\sum_{k=0}^{m-1}\sum_{j=k+1}^{m-1}(M_{s_{j+1}} - M_{s_j})^2(M_{s_{k+1}} - M_{s_k})^2\Big] = E\Big[\sum_{k=0}^{m-1}(M_{s_{k+1}} - M_{s_k})^2\, E\Big(\sum_{j=k+1}^{m-1}(M_{s_{j+1}} - M_{s_j})^2\,\Big|\,\mathcal F_{s_{k+1}}\Big)\Big] \qquad (5.9)$$
$$\le K^2\, E\Big(\sum_{k=0}^{m-1}(M_{s_{k+1}} - M_{s_k})^2\Big) \le K^4,$$
where the latter line can be justified taking the expectation in (5.8). We also have
$$E\Big(\sum_{k=0}^{m-1}(M_{s_{k+1}} - M_{s_k})^4\Big) \le K^2\, E\Big(\sum_{k=0}^{m-1}(M_{s_{k+1}} - M_{s_k})^2\Big) \le K^4. \qquad (5.10)$$
Inequalities (5.9) and (5.10) imply
$$E\big(C(\Pi, M, M)_t^2\big) = E\Big(\sum_{k=0}^{m-1}(M_{s_{k+1}} - M_{s_k})^4\Big) + 2E\Big[\sum_{k=0}^{m-1}\sum_{j=k+1}^{m-1}(M_{s_{j+1}} - M_{s_j})^2(M_{s_{k+1}} - M_{s_k})^2\Big] \le 3K^4.$$

Lemma 5.18 Let $M$ be a square integrable continuous martingale and let $K$ be such that $\sup_{t\le T}|M_t| \le K$ a.s. Then, given a sequence $(\Pi)$ of subdivisions whose mesh converges to zero, for every $t\in[0,T]$ we have
$$\lim_{|\Pi|\to 0} E\Big(\sum_{k=0}^{m-1}(M_{s_{k+1}} - M_{s_k})^4\Big) = 0.$$

Proof. For any subdivision $\Pi$, we can write
$$\sum_{k=0}^{m-1}(M_{s_{k+1}} - M_{s_k})^4 \le C(\Pi, M, M)\,\delta^2(M, |\Pi|),$$

where δ(M, ε) is the modulus of continuity of M , i.e.

δ(M, ε) := sup{|Mr − Ms |; 0 ≤ r < s ≤ T, s − r ≤ ε}. (5.11)

That quantity is a measurable r.v. because the supremum can be restricted to rational $s$ and $r$, and the rationals are countable. Hölder's inequality implies
$$E\Big(\sum_{k=0}^{m-1}(M_{s_{k+1}} - M_{s_k})^4\Big) \le \big(E(C(\Pi, M, M)^2)\, E(\delta^4(M; |\Pi|))\big)^{\frac12}.$$
As $|\Pi|$ approaches zero, the first factor on the right-hand side remains bounded by Lemma 5.17 and the second factor tends to zero, by the uniform continuity of $M$ and by the Lebesgue dominated convergence theorem.

We go on with the proof of Theorem 5.13. Taking into account the Dini-type Lemma 5.11, it is enough to show that for any fixed $t\in[0,T]$, $C(\Pi, M, M)_t - A_t$ converges to zero in probability. Let $\Pi$ be of the type (5.1), as at the beginning of the present proof.
We first suppose that $\sup_{0\le t\le T}|M_t| \le K$ and that the process $A$ appearing in the Doob decomposition $M^2 = N + A$ is also bounded by $K$, for some constant $K$. Then $N = M^2 - A$ is also bounded, and it is of course even a square integrable martingale. Since for $0\le j\le m-1$ we have
$$(M_{s_{j+1}} - M_{s_j})^2 = M_{s_{j+1}}^2 - M_{s_j}^2 - 2M_{s_j}(M_{s_{j+1}} - M_{s_j}),$$

and taking into account Exercise 5.15, it follows that
$$E\{(C(\Pi,M,M)_t - A_t)^2\} = E\Big[\Big(\sum_{j=0}^{m-1}\{(M_{s_{j+1}} - M_{s_j})^2 - (A_{s_{j+1}} - A_{s_j})\}\Big)^2\Big]$$
$$= \sum_{j,k=0}^{m-1} E\big[\big((M_{s_{j+1}} - M_{s_j})^2 - (A_{s_{j+1}} - A_{s_j})\big)\big((M_{s_{k+1}} - M_{s_k})^2 - (A_{s_{k+1}} - A_{s_k})\big)\big].$$
Now for fixed $j\neq k$ (say $k > j$) we get
$$E\big[\big((M_{s_{j+1}} - M_{s_j})^2 - (A_{s_{j+1}} - A_{s_j})\big)\big((M_{s_{k+1}} - M_{s_k})^2 - (A_{s_{k+1}} - A_{s_k})\big)\big]$$
$$= E\big[\big((M_{s_{j+1}} - M_{s_j})^2 - (A_{s_{j+1}} - A_{s_j})\big)\, E\big((M_{s_{k+1}} - M_{s_k})^2 - (A_{s_{k+1}} - A_{s_k})\,\big|\,\mathcal F_{s_k}\big)\big].$$
Moreover, by Exercise 5.15,
$$E\big((M_{s_{k+1}} - M_{s_k})^2 - (A_{s_{k+1}} - A_{s_k})\,\big|\,\mathcal F_{s_k}\big) = 0.$$
So
$$E\{(C(\Pi,M,M)_t - A_t)^2\} = \sum_{j=0}^{m-1} E\big[\big((M_{s_{j+1}} - M_{s_j})^2 - (A_{s_{j+1}} - A_{s_j})\big)^2\big]$$
$$\le 2\sum_{j=0}^{m-1} E\big[(M_{s_{j+1}} - M_{s_j})^4 + (A_{s_{j+1}} - A_{s_j})^2\big] \le 2\Big(\sum_{j=0}^{m-1} E(M_{s_{j+1}} - M_{s_j})^4 + E\big(A_t\,\delta(A; |\Pi|)\big)\Big).$$
j=0

As the mesh of Π approaches zero, the first term on the right-hand side of this
inequality converges to zero because of Lemma 5.18; the second term does
as well, by the Lebesgue dominated convergence theorem and the sample
path uniform continuity of A. Convergence in L2 implies convergence in
probability, so this proves the theorem for martingales which are uniformly
bounded.
We treat now the general case. We proceed by localization. Let $(\tau_N)$ be the localizing sequence defined by
$$\tau_N := \inf\{t : |M_t| > N\} \wedge \inf\{t : A_t > N\}.$$
$(\tau_N)$ is a suitable sequence of stopping times in the sense that $\bigcup_N \Omega_N = \Omega$ a.s., where $\Omega_N := \{\tau_N > T\}$. Since the stopped process
$$N^{\tau_N} = (M^2)^{\tau_N} - A^{\tau_N} = (M^{\tau_N})^2 - A^{\tau_N}$$

is a martingale, by the first part of the proof we have
$$[M^{\tau_N}, M^{\tau_N}] = A^{\tau_N}.$$
Clearly, on $\Omega_N$ we have $M = M^{\tau_N}$, $A = A^{\tau_N}$. Finally
$$\mathbf 1_{\Omega_N}[M, M] = \mathbf 1_{\Omega_N}[M^{\tau_N}, M^{\tau_N}] = \mathbf 1_{\Omega_N} A^{\tau_N} = \mathbf 1_{\Omega_N} A.$$

Proposition 5.19 Let $(\mathcal F_t)$ be a usual filtration and let $M$ be a càdlàg $(\mathcal F_t)$-local martingale vanishing at zero with $[M] = 0$. Then $M$ is identically zero.

Proof. Let us consider the sequence $(\tau_k)$ intervening in the definition of local martingale. For every $k$, $M^{\tau_k}$ is a martingale, so $(M^2)^{\tau_k}$ is a non-negative submartingale, which is in fact a martingale since $[M^{\tau_k}, M^{\tau_k}] = [M, M]^{\tau_k} \equiv 0$; see the Doob decomposition and Theorem 5.13. Without restriction of generality, we can suppose that $\lim_{k\to+\infty}\tau_k(\omega) = \infty$ for every $\omega\in\Omega$. It will be enough to show that, for every $k$, $\mathbf 1_{\{\tau_k > T\}} M \equiv 0$, or even that $M^{\tau_k} \equiv 0$. Finally, it will be sufficient to show the proposition when $M$ is a martingale such that $M^2$ is a martingale.
Consequently, $E(M_t^2) = E(M_0^2) = 0$ and so $M_t = 0$ a.s. Since $M$ is càdlàg, $M$ is indistinguishable from $0$.

Corollary 5.20 The decomposition of an (Ft )-continuous semimartingale is


unique.

Proof. Taking the difference of two decompositions of the same semimartingale, we may write $0 = M + V$, where $V$ is a finite variation process and $M$ is an $(\mathcal F_t)$-local martingale. The bilinearity of the covariation implies
$$0 = [M, M] + 2[M, V] + [V, V].$$
This implies $[M, M] = 0$ because of Proposition 5.10. Proposition 5.19 allows us to conclude.

Exercise 5.21 Show that if $M, N$ are continuous $(\mathcal F_t)$-local martingales then $(M, N)$ has all its mutual brackets.

At this point we can easily identify the covariation of two semimartingales.

Proposition 5.22 Let S i = M i + V i be two (Ft )-semimartingales, i = 1, 2,


where M i are continuous local martingales and V i bounded variation contin-
uous processes. We have [S 1 , S 2 ] = [M 1 , M 2 ].

Proof. The result follows directly from Proposition 5.10 (v) and the bilin-
earity of the covariation.

Corollary 5.23 Let M, N be two continuous (Ft )-local martingales.

(i) M N − [M, N ] is an (Ft )-local martingale. In particular M N is an


(Ft )-semimartingale.

(ii) If $M, N$ are square integrable martingales then $MN - [M, N]$ is an $(\mathcal F_t)$-martingale.

Proof. Items (i) and (ii) follow from the bilinearity of the covariation and
Theorem 5.13.

6 Complements on martingale theory

6.1 A change of variable

Proposition 6.1 Let $\varphi : \mathbb R_+\to\mathbb R_+$ be strictly increasing. Let $M = (M_t)$ be an $(\mathcal F_t)$-local martingale. We set $M_t^\varphi = M_{\varphi(t)}$. Then $M^\varphi$ is an $(\mathcal F_{\varphi(t)})$-local martingale and $[M^\varphi]_t = [M]_{\varphi(t)}$.

Proof (of Proposition 6.1). Through the usual localization procedure, one can reduce to the case where $M$ is a martingale; in that case $M^\varphi$ is obviously an $(\mathcal F_{\varphi(t)})$-martingale. Concerning the quadratic variation property, by Theorem 5.13 it is enough to show that $(M^\varphi)^2 - [M]^\varphi$ is a martingale. This holds because $N = M^2 - [M]$ is a martingale, together with the definition of the quadratic variation.

6.2 Burkholder-Davis-Gundy inequality

We start with an elementary result, again named after Kunita-Watanabe.

Proposition 6.2 Let $M, N$ be two processes such that $(M, N)$ has all its mutual brackets. Let $H, K$ be measurable processes such that $\int_0^T H_s^2\, d[M,M]_s + \int_0^T K_s^2\, d[N,N]_s < \infty$ a.s. Then for all $t\in[0,T]$
$$\Big|\int_0^t H_r K_r\, d[M,N]_r\Big| \le \sqrt{\int_0^t H_r^2\, d[M,M]_r \int_0^t K_r^2\, d[N,N]_r}.$$
In particular the left-hand side is well-defined.

Exercise 6.3 Discuss the case H = K = 1[a,b] , where 0 ≤ a < b.

Among the most important inequalities of martingale theory is the Burkholder-Davis-Gundy (BDG) inequality.

Proposition 6.4 Let $(M_t)_{t\ge 0}$ be a continuous $(\mathcal F_t)$-local martingale such that $M_0 = 0$. We set $M_t^* = \max_{s\le t}|M_s|$. For every $m > 0$, there are universal positive constants $k_m, K_m$ such that
$$k_m\, E\big([M]_\tau^m\big) \le E\big((M^*_\tau)^{2m}\big) \le K_m\, E\big([M]_\tau^m\big) \qquad (6.1)$$
holds for any finite stopping time $\tau$.

Proof. See [17], Theorem 3.28 chap. 3.

A first consequence of the previous result is the following.

Proposition 6.5 Let $(M_t)_{t\in[0,T]}$ be a continuous $(\mathcal F_t)$-local martingale such that $M_0\in L^p$, $p\ge 1$. Suppose that one of the items below is valid.

(i) $E\big([M]_T^{p/2}\big) < \infty$.

(ii) $\sup_{t\in[0,T]}|M_t| \in L^p$.

Then $(M_t)$ is a martingale in $L^p$.

Corollary 6.6 Let (Mt ) be a continuous (Ft )-local martingale such that
M0 ∈ L2 and
E([M ]T ) < ∞, ∀T > 0.

Then (Mt ) is a square integrable martingale.

Proof (of Proposition 6.5).

By subtraction, it is of course possible to suppose M0 = 0.

• (i) and (ii) are equivalent by BDG.

• The result for p = 1 follows by Lemma 4.29.

• The case p > 1 follows since (ii) implies that supt∈[0,T ] |Mt | ∈ L1 .

7 Stochastic Itô integral

7.1 The case of a square integrable martingale

Let $M$ be a continuous square integrable martingale, i.e. such that $M_t\in L^2$ for all $t\ge 0$. By Theorem 5.13 we know that $[M,M]_T\in L^1$ and that $M^2 - [M,M]$ is a martingale.

Definition 7.1 We denote by $\mathcal H_T^2$ the set of square integrable continuous martingales $M$ such that $M_0 = 0$, equipped with the inner product
$$\langle M, N\rangle_{\mathcal H_T^2} = E(M_T N_T). \qquad (7.2)$$

We remark that, if M, N ∈ HT2 , then M + N and M − N belong to HT2 , and
by Corollary 5.23 M N − [M, N ] is a martingale.

Proposition 7.2 HT2 is a Hilbert space.

Proof. The positivity and bilinearity properties for (7.2) are obvious. Suppose that $\langle M, M\rangle_{\mathcal H_T^2} = 0$. This implies that $E(M_T^2) = 0$. Now $M^2$ is a submartingale with
$$0 = E(M_0^2) \le E(M_t^2) \le E(M_T^2) = 0.$$
Finally $E(M_t^2) = 0$ and so $M_t = 0$ a.s., which implies $M \equiv 0$ in the indistinguishable sense.

It remains to show that $\mathcal H_T^2$ is complete. We consider a Cauchy sequence $(M^{(n)})$. Then $(M_T^{(n)})$ is a Cauchy sequence in $L^2(\Omega, \mathcal F, P)$, which is complete. There is then a r.v., which we will denote $M_T$, such that
$$E\big((M_T^{(n)} - M_T)^2\big) \to 0$$
when $n\to\infty$. By Doob's inequality we have
$$E\Big(\sup_{s\le T}|M_s^{(n)} - M_s^{(m)}|^2\Big) \le 4\, E\big(M_T^{(n)} - M_T^{(m)}\big)^2,$$
which converges to zero when $n, m\to\infty$. Consequently the r.v. $\sup_{s\le T}|M_s^{(n)} - M_s^{(m)}|$ converges in probability to zero, and so $(M^{(n)})$ is a Cauchy sequence in $C_{\mathcal F}([0,T])$ equipped with the usual metric $d$, related to the ucp convergence; it therefore converges to some (continuous) process $M$. This process is square integrable since, for every $t$, $M_t^{(n)}$ converges in $L^2$ to $M_t$.
It remains to show that $M$ is a martingale: indeed
$$M_t = \lim_{n\to\infty} M_t^{(n)} = \lim_{n\to\infty} E(M_T^{(n)}\,|\,\mathcal F_t) = E(M_T\,|\,\mathcal F_t),$$
by the $L^2$-continuity of the conditional expectation. Finally $M$ belongs to $\mathcal H_T^2$ and
$$\|M^{(n)} - M\|^2_{\mathcal H_T^2} = E\big(M_T^{(n)} - M_T\big)^2 \to 0$$
when $n\to\infty$.

Definition 7.3 Given a continuous $(\mathcal F_t)$-local martingale $M$, we denote by $L^2(M)_T$ the set of progressively measurable processes $H$ such that $E\big(\int_0^T H_s^2\, d[M,M]_s\big) < \infty$, equipped with the inner product
$$\langle H, K\rangle_{L^2(M)_T} = E\Big(\int_0^T H_s K_s\, d[M,M]_s\Big).$$

Definition 7.4 We denote by $\mathcal E$ the set of elementary processes, i.e. the set of processes $H$ which are linear combinations of processes of the type
$$\sum_{i=0}^{n-1} H_i\,\mathbf 1_{[t_i, t_{i+1}[}, \qquad (7.3)$$
where $\Pi$ is a subdivision of $[0,T]$ of the type (5.1) and every $H_i$ is a bounded $\mathcal F_{t_i}$-measurable r.v.

Proposition 7.5 E is dense in L2 (M )T .

Proof. Clearly $L^2(M)_T$ is a closed subspace of $L^2(\Omega\times[0,T], \mathcal F_T\otimes\mathcal B([0,T]), m)$, where $m$ is the measure defined, for $A\in\mathcal F_T\otimes\mathcal B([0,T])$, by
$$m(A) = E\Big(\int_0^T \mathbf 1_A(\omega, s)\, d[M,M]_s(\omega)\Big).$$

Therefore it is a Hilbert space. It is enough to show that if $K\in L^2(M)_T$ is orthogonal to $\mathcal E$ then it is null. We set
$$X_t = \int_0^t K_s\, d[M,M]_s, \quad t\in[0,T].$$
We show that $X$ is a martingale. By Cauchy-Schwarz we get
$$E\Big(\int_0^t |K_s|\, d[M,M]_s\Big) \le \Big(E([M,M]_t)\, E\Big(\int_0^t K_s^2\, d[M,M]_s\Big)\Big)^{\frac12} < \infty. \qquad (7.4)$$

Consequently $X_t\in L^1$. Let $0\le s\le t\le T$ and let $Z$ be an $\mathcal F_s$-measurable bounded r.v. The process $H_r = Z\,\mathbf 1_{[s,t[}(r)$ is an element of $\mathcal E$; so, by assumption, it is orthogonal to $K$. Therefore
$$0 = E\Big(\int_0^T H_r K_r\, d[M,M]_r\Big) = E\Big(Z\int_s^t K_r\, d[M,M]_r\Big) = E(Z(X_t - X_s)).$$

This shows that X is a martingale. Since it has bounded variation it is null,


see Proposition 5.10 item (v)(b) and Proposition 5.19. So K = 0 as an
element of L2 (M )T .

If $H$ is of the type (7.3) and $M$ is a continuous square integrable martingale, then we denote
$$(H\cdot M)_t := \sum_{i=0}^{n-1} H_i\,(M_{t_{i+1}\wedge t} - M_{t_i\wedge t}).$$

Exercise 7.6 Show that the process H · M is a square integrable martingale.

Proposition 7.7 If $H\in\mathcal E$ and $M$ is a square integrable continuous martingale then
$$E\big((H\cdot M)_T^2\big) = E\Big(\int_0^T H_s^2\, d[M,M]_s\Big).$$

Proof. By the linearity of expectation and the orthogonality property stated in Exercise 5.15, applied to the martingale $H\cdot M$, we obtain
$$E\big((H\cdot M)_T^2\big) = \sum_{i=0}^{n-1} E\big((H\cdot M)_{t_{i+1}} - (H\cdot M)_{t_i}\big)^2 + \sum_{i,j=0,\ i\neq j}^{n-1} E\big(((H\cdot M)_{t_{i+1}} - (H\cdot M)_{t_i})((H\cdot M)_{t_{j+1}} - (H\cdot M)_{t_j})\big)$$
$$= \sum_{i=0}^{n-1} E\big((H\cdot M)_{t_{i+1}} - (H\cdot M)_{t_i}\big)^2 = \sum_{i=0}^{n-1} E\big(H_i^2\, E((M_{t_{i+1}} - M_{t_i})^2\,|\,\mathcal F_{t_i})\big) = \sum_{i=0}^{n-1} E\big(H_i^2\, E([M,M]_{t_{i+1}} - [M,M]_{t_i}\,|\,\mathcal F_{t_i})\big),$$
where the latter passage can be explained by (5.7) and Theorem 5.13. Consequently this equals
$$\sum_{i=0}^{n-1} E\big(H_i^2\,([M,M]_{t_{i+1}} - [M,M]_{t_i})\big) = E\Big(\int_0^T \sum_{i=0}^{n-1} H_i^2\,\mathbf 1_{[t_i, t_{i+1}[}(s)\, d[M,M]_s\Big) = E\Big(\int_0^T H_s^2\, d[M,M]_s\Big).$$
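The isometry of Proposition 7.7 lends itself to a Monte Carlo check. Below is a minimal sketch under our own assumptions: $M = W$ (so $d[M,M]_s = ds$) and the adapted integrand $H_i = W_{t_i}$, which is not bounded as Definition 7.4 requires but still belongs to $L^2(W)_T$; the sample sizes and seed are arbitrary.

```python
import numpy as np

# Monte Carlo check of E((H.W)_T^2) = E(int_0^T H_s^2 ds) for the
# piecewise-constant adapted integrand H_s = W_{t_i} on [t_i, t_{i+1}[.
rng = np.random.default_rng(1)
T, n, n_mc = 1.0, 64, 200_000
dW = rng.normal(0.0, np.sqrt(T / n), size=(n_mc, n))
W = np.hstack([np.zeros((n_mc, 1)), np.cumsum(dW, axis=1)])

H = W[:, :-1]                      # H_i = W_{t_i}, F_{t_i}-measurable
HdotW = np.sum(H * dW, axis=1)     # (H . W)_T = sum_i H_i (W_{t_{i+1}} - W_{t_i})
lhs = np.mean(HdotW**2)            # E((H.W)_T^2)
rhs = np.mean(np.sum(H**2, axis=1) * (T / n))  # E(int_0^T H_s^2 ds)
print(lhs, rhs)  # both close to T^2/2 = 0.5, since E(W_s^2) = s
```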

Theorem 7.8 The map $\mathcal E\to\mathcal H_T^2$ defined by $H\mapsto H\cdot M$ extends to an isometry from $L^2(M)_T$ to $\mathcal H_T^2$.

Remark 7.9 The extended application will still be denoted by H 7→ H · M .

Proof. By Proposition 7.7, the map $H\mapsto H\cdot M$, defined on $\mathcal E$ as a subspace of $L^2(M)_T$, is an isometry. Since $\mathcal E$ is dense and $\mathcal H_T^2$ is complete, we can extend it to the whole space $L^2(M)_T$. Indeed we have the following general result.

Proposition 7.10 Let $(F_1, d_1)$ and $(F_2, d_2)$ be two metric spaces and $D$ a dense subset of $F_1$. Let $T : D\to F_2$ for which there is $c > 0$ such that
$$d_2(Tx, Ty) \le c\, d_1(x, y) \quad (\text{resp. } d_2(Tx, Ty) = c\, d_1(x, y)), \qquad (7.5)$$
for every $x, y\in D$. If $(F_2, d_2)$ is complete then the map $T$ extends to the whole space $F_1$ in such a way that (7.5) remains true.

Proof. Since $D$ is dense in $F_1$, for every $x\in F_1$ there is a sequence $(x_n)$ in $D$ such that $d_1(x_n, x)\to 0$ when $n\to\infty$. The sequence $(x_n)$ is Cauchy since it converges, and (in)equality (7.5) implies that $(Tx_n)$ is also Cauchy; it therefore converges to some element $z\in F_2$, $F_2$ being complete.

(i) We set $Tx := z$. It is not difficult to see that, choosing a different sequence $(x_n')$, the limit $z$ is still the same, so $Tx$ does not depend on the chosen sequence.

(ii) Since $d_1$ and $d_2$ are continuous on $F_1\times F_1$ and $F_2\times F_2$ respectively, (7.5) is maintained for every $x, y\in F_1$.

Theorem 7.11 Let $H\in L^2(M)_T$. $H\cdot M$ is the only continuous martingale $\Phi$ vanishing at zero such that, for every $N\in\mathcal H_T^2$,
$$[\Phi, N] = \int_0^\cdot H_s\, d[M,N]_s. \qquad (7.6)$$

Proof. We set $\Phi := H\cdot M$ and we prove (7.6). Taking into account Corollary 5.23 and Corollary 5.20, we need to show that
$$(H\cdot M)N - \int_0^\cdot H_r\, d[M,N]_r \ \text{ is a martingale.} \qquad (7.7)$$
For this let $0\le s < t\le T$. We have to show that, for every bounded $\mathcal F_s$-measurable $\xi$, we have
$$E\Big[\Big((H\cdot M)_t N_t - (H\cdot M)_s N_s - \int_s^t H_r\, d[M,N]_r\Big)\xi\Big] = 0. \qquad (7.8)$$
The case $H = Z\,\mathbf 1_{[a,b[}$, $0\le a < b\le T$, where $Z$ is an $\mathcal F_a$-measurable random variable, will be the object of Exercise 7.12, and (7.7) for $H\in\mathcal E$ then follows by linearity. If $H\in L^2(M)_T$, consider a sequence $(H^n)$ in $\mathcal E$ converging to $H$ in $L^2(M)_T$. By Theorem 7.8, Proposition 6.2 (Kunita-Watanabe) and Exercise 7.13 below, we have
$$[H\cdot M, N]_T = \lim_{n\to\infty}[H^n\cdot M, N]_T = \lim_{n\to\infty}\int_0^T H_s^n\, d[M,N]_s = \int_0^T H_s\, d[M,N]_s.$$

It remains to show the uniqueness. Let X be a martingale vanishing at zero,


such that for each square integrable martingale N ,

[X, N ]T = [H · M, N ]T .

By localization, this extends to every martingale N . We take N = X −H ·M ,


we get [X − H · M, X − H · M ] = 0 and therefore X − H · M = 0.

Exercise 7.12 Show (7.8) when $H$ is an elementary process. Hint: make use of (5.7) in Exercise 5.15.

Exercise 7.13 Let N ∈ HT2 . Show that the map M 7→ [M, N ]T is continu-
ous from HT2 to L1 (Ω).

Corollary 7.14 For all martingales $M, N\in\mathcal H_T^2$ and $H\in L^2(M)_T$, $K\in L^2(N)_T$, we get
$$[H\cdot M, K\cdot N] = \int_0^\cdot H_s K_s\, d[M,N]_s.$$

7.2 The case of local martingales

For this part we do not develop all the details; the interested reader can consult [17], chapter 3, section 3.2 D. Let $M$ be a continuous $(\mathcal F_t)$-local martingale vanishing at zero and let $\mathcal P(M)$ be the collection of (equivalence classes of) all progressively measurable processes $X$ satisfying
$$\int_0^T X_t^2\, d[M]_t < \infty, \quad \text{a.s.} \qquad (7.9)$$

The proposition below is the object of Proposition 2.24, Chapter 3 of [17].

Proposition 7.15 Let $H\in\mathcal P(M)$. There is a unique continuous local martingale $\Phi$ vanishing at zero such that for every continuous martingale $N\in\mathcal H_T^2$ we have
$$[\Phi, N] = \int_0^\cdot H_s\, d[M,N]_s. \qquad (7.10)$$

Corollary 7.16 For all continuous local martingales $M, N$ and $H\in\mathcal P(M)$, $K\in\mathcal P(N)$, we get
$$[H\cdot M, K\cdot N] = \int_0^\cdot H_s K_s\, d[M,N]_s.$$

Definition 7.17 The process $\Phi$ characterized by Proposition 7.15 is called the (Itô) stochastic integral of $H$ with respect to $M$; it is also denoted by $\int_0^\cdot H\, dM$.

The next proposition is Proposition 2.26, chapter 3 of [17].

Proposition 7.18 Let $\tau\le T$ be a stopping time and let $(H^n)$ be a sequence of elements of $\mathcal P(M)$ and $H\in\mathcal P(M)$ such that, in probability,
$$\int_0^\tau |H_r^n - H_r|^2\, d[M]_r \to 0.$$
Then
$$\lim_{n\to\infty}\ \sup_{t\le\tau}\Big|\int_0^t H_r^n\, dM_r - \int_0^t H_r\, dM_r\Big| = 0,$$
in probability.


Remark 7.19 Taking $\tau = T$, the previous convergence says that $\int_0^\cdot H^n\, dM$ converges to $\int_0^\cdot H\, dM$ ucp.

Proposition 7.20 Let $M$ be a local martingale, $\alpha, \beta\in\mathbb R$, and let $H^1, H^2$ be elements of $\mathcal P(M)$ such that
$$\int_0^T \big((H_s^1)^2 + (H_s^2)^2\big)\, d[M,M]_s < \infty, \quad \text{a.s.}$$
Then
$$\int_0^\cdot (\alpha H_s^1 + \beta H_s^2)\, dM_s = \alpha\int_0^\cdot H_s^1\, dM_s + \beta\int_0^\cdot H_s^2\, dM_s.$$

Proof. Exercise.

The stochastic integral with respect to a local martingale extends the one related to square integrable martingales. The proof of the proposition below is an easy exercise.

Proposition 7.21 Let $M$ be a square integrable martingale and $H\in L^2(M)_T$. Then $\int_0^\cdot H\, dM = H\cdot M$.

The following chain rule holds; it is an easy consequence of Theorem 7.11 and Proposition 7.15.

Proposition 7.22 Let $M$ be a continuous local martingale and let $H, K\in\mathcal P(M)$ be such that $\int_0^T H_s^2\, d[M,M]_s + \int_0^T H_s^2K_s^2\, d[M,M]_s < \infty$ a.s. Let $X_t := \int_0^t H\, dM$, $0\le t\le T$. Then
$$\int_0^\cdot K\, dX = \int_0^\cdot HK\, dM.$$

8 Itô integral with respect to a semimartingale

The Itô stochastic integral can be extended to the case where the integrator is a continuous $(\mathcal F_t)$-semimartingale $(S_t)$. Let $S = M + V$ be its decomposition, where $M$ is an $(\mathcal F_t)$-local martingale and $V$ an $(\mathcal F_t)$-progressively measurable finite variation process, with $V_0 = 0$ for simplicity.

Example 8.1 (Itô process). Let $(W_t)_{t\ge 0}$ be an $(\mathcal F_t)$-classical Brownian motion, $X_0$ be $\mathcal F_0$-measurable, and $(H_t), (K_t)$ be $(\mathcal F_t)$-progressively measurable processes such that
$$\int_0^T H_s^2\, ds + \int_0^T |K_s|\, ds < \infty \quad \text{a.s.}$$
Then
$$S_t = X_0 + \int_0^t H_s\, dW_s + \int_0^t K_s\, ds$$
is called a $(W, \mathcal F_t)$-Itô process, or simply an Itô process.

Remark 8.2 (i) An Itô process is of course an (Ft )-semimartingale.

(ii) We recall that [W ]t ≡ t.

It is possible to construct the Itô stochastic integral related to $S$.
Let $(Y_t)_{t\ge 0}$ be an $(\mathcal F_t)$-progressively measurable process, $(V_t)$ an $(\mathcal F_t)$-progressively measurable finite variation continuous process, and $M$ a continuous $(\mathcal F_t)$-local martingale. If
$$\int_0^T |Y_s|^2\, d[M]_s + \int_0^T |Y_s|\, d\|V\|_s < \infty \quad \text{a.s.} \qquad (8.11)$$
and $S = M + V$, then
$$\int_0^t Y_s\, dS_s := \int_0^t Y_s\, dM_s + \int_0^t Y_s\, dV_s$$
is well-defined. Moreover it is a semimartingale, see Exercise 8.13 below.

Proposition 8.3 Let $S$ be a continuous semimartingale and let $H, K$ be progressively measurable processes such that $\int_0^T H_s^2\, d[M,M]_s + \int_0^T H_s^2K_s^2\, d[M,M]_s + \int_0^T |H_s|\, d\|V\|_s + \int_0^T |H_sK_s|\, d\|V\|_s < \infty$ a.s. Let $X_t := \int_0^t H\, dS$, $0\le t\le T$. Then
$$\int_0^\cdot K\, dX = \int_0^\cdot HK\, dS.$$

Proof. It follows by Proposition 7.22 and the definition of stochastic inte-


gral with respect to semimartingales.

Application.
Let $(W_t)$ be an $(\mathcal F_t)$-standard Brownian motion. Let $S_t = X_0 + \int_0^t H_s\, dW_s + \int_0^t K_s\, ds$ be an Itô process. Let $(Y_t)$ be an $(\mathcal F_t)$-progressively measurable process such that $\int_0^T Y_s^2H_s^2\, ds + \int_0^T |Y_sK_s|\, ds < +\infty$ a.s. Then
$$\int_0^t Y\, dS = \int_0^t YH\, dW + \int_0^t Y_sK_s\, ds.$$

Exercise 8.4 Let $(Y_t^i)$, $i = 1, 2$, be $(\mathcal F_t)$-progressively measurable processes. Let $S^i$, $i = 1, 2$, be two continuous $(\mathcal F_t)$-semimartingales with decompositions $M^i + V^i$, $i = 1, 2$. We suppose
$$\int_0^T |Y_s^i|^2\, d[M^i]_s + \int_0^T |Y_s^i|\, d\|V^i\|_s < \infty \quad \text{a.s.}, \quad i = 1, 2. \qquad (8.12)$$
Verify the following properties.

i) $\big(\int_0^t Y_s^1\, dS_s^1\big)_{t\ge 0}$ is an $(\mathcal F_t)$-semimartingale.

ii)
$$\Big[\int_0^\cdot Y^1\, dS^1, \int_0^\cdot Y^2\, dS^2\Big]_t = \int_0^t Y^1Y^2\, d[S^1, S^2]_s. \qquad (8.13)$$

Solutions

i) We set $Y = Y^1$, $M = M^1$, $V = V^1$. The $(\mathcal F_t)$-local martingale part is $\big(\int_0^t Y\, dM\big)_{t\ge 0}$. The finite variation part is $\int_0^t Y_s\, dV_s$.

ii) We make use of Corollary 7.16 and of the definition of the stochastic integral with respect to a semimartingale. We also take into account the fact that, whenever $V$ is a bounded variation process, $\int_0^\cdot Y^i\, dV$ is of finite variation, $i = 1, 2$, and so
$$\Big[\int_0^\cdot Y^i\, dV\Big] \equiv [V] \equiv 0.$$
On the other hand, the covariation of a finite variation process and a continuous process is zero.

An important property of localization for Itô integrals is given below.

Remark 8.5 Let $M, N$ be two $(\mathcal F_t)$-local martingales and $(Y_t), (Z_t)$ be $(\mathcal F_t)$-progressively measurable processes such that $\int_0^T Y_s^2\, d[M]_s + \int_0^T Z_s^2\, d[N]_s < \infty$ a.s. Let $\Omega_0\in\mathcal F$ be such that
$$M_t(\omega) = N_t(\omega), \quad Y_t(\omega) = Z_t(\omega), \quad \text{for } t\le T,\ \omega\in\Omega_0.$$
Then
$$\mathbf 1_{\Omega_0}\int_0^t Y_s\, dM_s = \mathbf 1_{\Omega_0}\int_0^t Z_s\, dN_s, \quad t\le T.$$
(This is true for elementary processes; afterwards we pass to the limit.)

An important theorem, which extends the classical Lebesgue integration theory to the Itô integral, is the stochastic Fubini theorem.

Proposition 8.6 (Stochastic Fubini). Let $X = M + V$ be an $(\mathcal F_t)$-semimartingale and $H : \mathbb R^d\times\mathbb R_+\times\Omega\to\mathbb R$ be measurable with respect to the product $\sigma$-field and such that, for every $a$, $(H(a, s))_{s\ge 0}$ is $(\mathcal F_t)$-progressively measurable. Let $\mu$ be a finite Borel measure on $\mathbb R^d$. We also suppose that $(H_t)_{0\le t\le T}$ is either bounded or satisfies
$$\int_{\mathbb R^d} E\Big\{\int_0^T H(a, s)^2\, d[M,M]_s + \int_0^T |H(a, s)|\, d\|V\|_s\Big\}\,\mu(da) < \infty. \qquad (8.14)$$
We set
$$Z_t^a = \int_0^t H(a, s)\, dX_s.$$
Then $(Z_t^a)$ admits a measurable version from $\mathbb R_+\times\mathbb R^d\times\Omega\to\mathbb R$ and
$$\int_{\mathbb R^d} Z_t^a\,\mu(da) = \int_0^t \bar H(s)\, dX_s,$$
where $\bar H(s) = \int_{\mathbb R^d} H(a, s)\,\mu(da)$.

Proof. See [29], theorem 45, ch. 4.

At this stage, there is a very popular integral which is an alternative to the Itô integral.

Definition 8.7 Let $(S_t)_{t\ge 0}$, $(Y_t)_{t\ge 0}$ be two $(\mathcal F_t)$-continuous semimartingales. We then denote
$$\int_0^t Y\circ dS := \int_0^t Y\, dS + \frac12 [Y, S]_t. \qquad (8.15)$$
This process will be called the Stratonovich integral of $Y$ with respect to $S$.
We remark that

• the Itô integral $\int_0^t Y\, dS$ exists,

• $[Y, S]$ exists.
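The difference between the two integrals is already visible in the discretized sums. The following sketch is our own illustration (Python/NumPy, with $Y = S = W$): the left-point sums approximate the Itô integral $\int_0^T W\, dW = \frac12(W_T^2 - T)$, while the symmetric sums approximate the Stratonovich integral $\int_0^T W\circ dW = \frac12 W_T^2$; their difference approximates $\frac12[W,W]_T = T/2$, in agreement with (8.15).

```python
import numpy as np

# Left-point (Ito / forward) versus symmetric (Stratonovich) Riemann sums
# for the integral of W against itself, on one simulated Brownian path.
rng = np.random.default_rng(2)
T, n = 1.0, 100_000
dW = rng.normal(0.0, np.sqrt(T / n), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))

ito = np.sum(W[:-1] * dW)                    # S^-(Pi, W, W)_T
strat = np.sum(0.5 * (W[:-1] + W[1:]) * dW)  # S^o(Pi, W, W)_T
print(ito, 0.5 * (W[-1]**2 - T))   # both approximate (W_T^2 - T)/2
print(strat, 0.5 * W[-1]**2)       # both approximate W_T^2 / 2
print(strat - ito, T / 2)          # difference ~ [W, W]_T / 2 = T/2
```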

8.1 Connections between Itô, Stratonovich and integrals via


discretization

In this section we want to explore the relation between classical integrals


with respect to semimartingales and integrals via discretizations.

Theorem 8.8 Let $(\mathcal F_t)$ be a usual filtration, $X, Y$ be two continuous $(\mathcal F_t)$-semimartingales, and $Z$ be an $(\mathcal F_t)$-progressively measurable càglàd (or càdlàg) process. Then

i) $\int_0^t Z\, d^-X = \int_0^t Z\, dX$,

ii) $\int_0^t Y\, d^\circ X = \int_0^t Y\circ dX$.

Moreover, the previous forward and symmetric integrals exist in the ucp sense.

Proof.
Suppose that $Z$ is càglàd. If it were càdlàg, we could replace $Z$ with $Z^-$, where $Z_s^- = Z_{s-}$, $s\ge 0$, which is càglàd; see Exercise 8.9 below.
We use the fact that a càglàd function $g$ is a pointwise limit of simple functions. We have $S^-(\Pi, Z, X)_t = \int_0^t Z^\Pi\, dX$, where
$$Z^\Pi(s) = \sum_{k=0}^{n-1} Z_{t_k}\mathbf 1_{]t_k, t_{k+1}]}(s).$$

We need to show that $\int_0^t (Z^\Pi - Z)\, dX\to 0$ in the ucp sense. Now $Z^\Pi(s)\to Z_s$ pointwise a.s. We decompose $X = M + V$, where $M$ is an $(\mathcal F_t)$-local martingale and $V$ a bounded variation process. Taking into account Proposition 5.10 (v), it is enough to consider the case $X = M$.
We proceed via localization. For $k > 0$ we consider the stopping time
$$\tau_k = \inf\{t\ge 0\,:\,\sup_{s\le t}|Z_s|\ge k \ \text{or}\ [X]_t\ge k \ \text{or}\ |X_t|\ge k\}.$$
Indeed the càglàd process $(Z_s)$ is locally bounded, see Remark 5.1. We set $\Omega_k = \{\tau_k > T\}$. If $\omega\in\Omega_k$, for $t\in[0,T]$ we have
$$\sup_{s\le t}|Z_s|^2 \le k^2, \quad [X]_t\le k, \quad |X_t|\le k.$$

Now
$$\mathbf 1_{\Omega_k}\int_0^t (Z - Z^\Pi)\, dX = \mathbf 1_{\Omega_k}\int_0^t (Z - Z^\Pi)^{\tau_k}\, dX^{\tau_k}. \qquad (8.16)$$
Moreover
$$\int_0^t \big((Z - Z^\Pi)^{\tau_k}_s\big)^2\,\mathbf 1_{[0,\tau_k]}\, d[X^{\tau_k}] \le 4k^3 \quad \text{a.s.} \qquad (8.17)$$
We have, up to a negligible set,
$$\Omega = \bigcup_k \Omega_k.$$

In fact it is enough to prove that, for any $k > 0$,
$$\sup_{t\le T}\Big|\mathbf 1_{\Omega_k}\int_0^t (Z - Z^\Pi)_s\, dX_s\Big| \qquad (8.18)$$
converges to zero in probability. This will hold if we show that the expectation of the square of (8.18) goes to zero when $|\Pi|\to 0$. Because of (8.16), that expectation equals
$$E\Big(\sup_{t\le T}\Big|\mathbf 1_{\Omega_k}\int_0^t (Z - Z^\Pi)^{\tau_k}\, dX^{\tau_k}\Big|^2\Big) \le 4\, E\Big(\int_0^T \big((Z - Z^\Pi)^{\tau_k}\big)_s^2\,\mathbf 1_{[0,\tau_k]}\, d[X]^{\tau_k}_s\Big).$$
This converges to zero by the Lebesgue dominated convergence theorem, taking into account (8.17).
ii) is a consequence of the previous point and of the definition of the Stratonovich integral.

Exercise 8.9 Let $(\mathcal F_t)$, $X$, $Y$ be as in the statement of Theorem 8.8. Let $Z$ be an $(\mathcal F_t)$-progressively measurable càdlàg process.

(i) Show that the Itô integrals $\int_0^t Z\, dX$ and $\int_0^t Z^-\, dX$ are equal, where $Z_s^- = Z_{s-}$, $s\ge 0$.

(ii) Prove item i) of Theorem 8.8 by first proving that $\int_0^t Z\, d^-X = \int_0^t Z^-\, dX$ if $Z$ is càdlàg.

(iii) Conclude to the validity of ii) in Theorem 8.8 when $Z$ is càdlàg.

9 Stability of covariation and Itô formula

9.1 Recalls in real measure theory

A useful tool of probability theory is weak convergence of probability laws;


that concept coincides with the convergence of the related distribution func-
tions at each continuity point of the limiting distribution function. Here we
will make use of such convergence for the paths realizations of a process. For
reference on the subject we refer to [18, 31].
As mentioned earlier, we denote by BV the set of Borel real functions defined on $\mathbb R_+$ (resp. $[0,T]$) which have locally bounded variation.

Remark 9.1 (i) A bounded variation function is always equal a.e. to a càdlàg bounded variation function. In fact it is a difference of non-decreasing functions.

(ii) For that reason, without restriction of generality, a bounded variation


function can be considered càdlàg.

(iii) A càdlàg (càglàd) function is locally bounded and has at most countably many jumps.

(iv) For basic properties (as those above), the reader can refer to [4], chapter
1.

We equip BV with the topology associated with the following convergence.


A sequence (Vn ) in BV converges to V if

• Vn (0) −→ V (0),

• $dV_n\to dV$ with respect to the vague topology, i.e. for every $\alpha\in C^0(\mathbb R)$ with compact support
$$\int_0^\infty \alpha\, dV_n \to \int_0^\infty \alpha\, dV.$$

The natural topology would be that of total variation on each compact set, but it is too strong for our purposes.

Remark 9.2 a) The sequence $(dV_n)$ converges to $dV$ if and only if, for every $\alpha\in C^0(\mathbb R)$,
$$\int_0^t \alpha\, dV_n \to \int_0^t \alpha\, dV$$
holds at every continuity point $t$ of $V$.

b) If $(V_n)$ is a sequence converging in BV, the total variations are uniformly bounded on each compact set $K$, i.e.
$$\sup_n \int_K d\|V_n\| < \infty. \qquad (9.1)$$

c) Vn → V in BV if and only if Vn (x) → V (x) at every continuity point x


of V and (9.1) holds.

d) We also need the following Dini-type lemma.
Let $(u_n(t), t\in[0,T])$ be a sequence of non-decreasing functions such that $\sup_{t\in[0,T]}|u_n(t)| < \infty$. If $u_n(t)\to u(t)$, $t\in[0,T]$ (pointwise), with $u$ continuous, then the convergence holds uniformly.

9.2 Stability of the covariation

One basic tool of calculus related to covariations is the following.

Proposition 9.3 Let $X = (X^1, \ldots, X^d)$ be a vector of processes having all its mutual covariations and let $F, G\in C^1(\mathbb R^d)$. Then $[F(X), G(X)]$ exists and is given by
$$[F(X), G(X)]_t = \sum_{i,j=1}^d \int_0^t \partial_i F(X)\,\partial_j G(X)\, d[X^i, X^j]. \qquad (9.2)$$

We have the following particular cases.

Remark 9.4 a) An interesting particular case is given by $d = 2$, $F(x_1, x_2) = f(x_1)$, $G(x_1, x_2) = g(x_2)$ with $f, g\in C^1(\mathbb R)$. Then (9.2) implies that $(f(X^1), g(X^2))$ has all its mutual covariations and
$$[f(X^1), g(X^2)]_t = \int_0^t f'(X^1)\, g'(X^2)\, d[X^1, X^2].$$

b) In particular, the family of finite quadratic variation processes is stable under $C^1$ transformations. Let $\varphi_1, \varphi_2\in C^1(\mathbb R)$. Then
$$[\varphi_1(X), \varphi_2(X)]_t = \int_0^t \varphi_1'(X)\,\varphi_2'(X)\, d[X].$$

We recall that the ucp convergence coincides with the convergence in probability of random variables with values in the metric space $E = C[0,T]$. We recall the following fact.

Remark 9.5 A sequence (Zn ) of E-valued random variables converges in


probability to an E-valued random variable Z if and only if for each subse-
quence (nk ) there is a subsequence (nkj ) such that Znkj −→ Z a.s. and in
probability.

Let $\Pi = \{0 = t_0 < \ldots < t_n = T\}$ be an element of a sequence of subdivisions. We set
$$\mu^\Pi(t) = \sum_{i\,:\,t_i\le t}(X_{t_{i+1}} - X_{t_i})^2, \quad t\in[0,T].$$

Exercise 9.6 Let $(\Pi_\ell)$ be a sequence of subdivisions whose mesh converges to zero. Using the definition of $[X,X]$ and Remark 9.5, show that there is a subsequence $(\ell_k)$ such that, a.s., $\mu^{\Pi_{\ell_k}}$ converges in BV to $[X,X]$.

We will not prove Proposition 9.3, but only point a) of Remark 9.4. The Proposition and point b) can be proven analogously.

Proof Sketch of Remark 9.4 a).


Using polarization techniques (bilinearity arguments), it is enough to con-
sider the case X = X 1 = X 2 , f = g.

Let $\Pi = \{0 = t_0 < \ldots < t_n = T\}$ be an element of a sequence of subdivisions whose mesh converges to zero. We set $\Pi_t = \{0 = s_0 < \ldots < s_n = t\}$, obtained by setting $s_i = t_i\wedge t$. We expand
$$f(X_{s_{i+1}}) - f(X_{s_i}) = f'(X_{s_i})(X_{s_{i+1}} - X_{s_i}) + R^\Pi(t)(X_{s_{i+1}} - X_{s_i}),$$
where
$$|R^\Pi(t)| = \Big|\int_0^1 f'\big(X_{s_i} + a(X_{s_{i+1}} - X_{s_i})\big)\, da - f'(X_{s_i})\Big| \le \delta\big(f', \delta(X, |\Pi|)\big),$$
and $\delta(g,\cdot)$ is the modulus of continuity of a function $g$.
We have $\sup_{t\le T}|R^\Pi(t)|\to 0$ in probability since $f'$ and $X$ are uniformly continuous on each compact. We recall that the notation $(R^\Pi(t))$ indicates a family of processes such that $\sup_{t\le T}|R^\Pi(t)|\to 0$ in probability.
So,
$$\big(f(X_{s_{i+1}}) - f(X_{s_i})\big)^2 - f'(X_{s_i})^2(X_{s_{i+1}} - X_{s_i})^2 = R^\Pi(t)(X_{s_{i+1}} - X_{s_i})^2.$$

Summing up from $i = 0$ to $i = n-1$, we get
$$\sum_{i=0}^{n-1}\big(f(X_{s_{i+1}}) - f(X_{s_i})\big)^2 = I_1(t,\Pi) + I_2(t,\Pi),$$
with
$$I_1(t,\Pi) = \int_0^t f'(X_s)^2\, d\mu^\Pi(s) + I_{11}(t,\Pi), \qquad I_2(t,\Pi) \le \sum_{i=0}^{n-1}(X_{t_{i+1}\wedge t} - X_{t_i\wedge t})^2\, R^\Pi(t),$$
where it is not difficult to show that
$$I_{11}(t,\Pi) \le \sup_{t\in[0,T]} f'(X_t)^2\,\delta(X, |\Pi|)^2.$$
Now
$$\sup_{t\le T}|I_2(t,\Pi)| \le \sup_{s\le T}|R^\Pi(s)|\,\sum_{i=0}^{n-1}(X_{t_{i+1}\wedge t} - X_{t_i\wedge t})^2.$$

Since $[X,X]$ exists, $I_2(\cdot,\Pi)\xrightarrow{\text{ucp}} 0$; it remains to prove that
$$\int_0^t Y_s\, d\mu^\Pi(s) \to \int_0^t Y_s\, d[X,X]_s \quad \text{ucp}, \qquad (9.3)$$
where $Y$ is a continuous non-negative process.
Taking into account Remark 9.5, it will be enough to show that there is a sequence $(\Pi_n)$ such that
$$\int_0^t Y_s\, d\mu^{\Pi_n}(s) \to \int_0^t Y_s\, d[X,X]_s \quad \text{ucp.} \qquad (9.4)$$
By Dini's Lemma 5.11, it will be enough to show that (9.4) holds in probability for every fixed $t$.
By Exercise 9.6, we can consider a null set $N$ such that for $\omega\notin N$ the sequence of real functions $(\mu^{\Pi_n}(\cdot)(\omega))$ converges in BV to $[X,X](\omega)$. Therefore
$$\int_0^t Y_s(\omega)\, d\mu^{\Pi_n}(s)(\omega) \to \int_0^t Y_s(\omega)\, d[X,X]_s(\omega),$$
because of Remark 9.2 a) and the fact that $Y$ is continuous.

9.3 Dirichlet processes

The notion of Dirichlet process is the natural generalization of semimartin-


gales. It was introduced by H. Föllmer, see [13] under the inspiration of the
theory of Dirichlet forms.
Let (Ft )t≥0 be a fixed filtration fulfilling the usual hypotheses.

Definition 9.7 An $(\mathcal F_t)$-adapted process $X$ will be called a (continuous) $(\mathcal F_t)$-Dirichlet process if
$$X = M + A, \qquad (9.5)$$

where

• (Mt )t≥0 is a continuous (Ft )-local martingale,

• A is a zero (continuous) quadratic variation process such that A0 = 0.

Remark 9.8 (At ) is an (Ft )-adapted process.

Remark 9.9 An (Ft )-semimartingale is an (Ft )-Dirichlet process.

If $(\mathcal F_t)$ is the canonical filtration of $X$, we will simply say that $X$ is a Dirichlet process, omitting the mention of the filtration.

Lemma 9.10 Decomposition (9.5) is unique.

Proof. We suppose that
$$X = M^i + A^i, \quad i = 1, 2,$$
where $M^i$ is an $(\mathcal F_t)$-local martingale and $A^i$ an $(\mathcal F_t)$-adapted (continuous) zero quadratic variation process such that $A_0^i = 0$. By additivity, we have
$$M + A = 0, \quad M = M^1 - M^2, \quad A = A^1 - A^2.$$

Moreover M0 = 0 since M01 = M02 = X0 .


The bilinearity of the covariation says that

0 = [M + A, M + A]
= [M, M ] + 2[M, A] + [A, A].

Since $A$ has zero quadratic variation, Proposition 5.10 (iv) implies that $[M, A] = 0$, so that $[M] = 0$. Recalling that $M_0 = 0$, Proposition 5.19 yields that $M\equiv 0$. Finally $A$ will also be zero.

We have already seen that the class of finite quadratic variation processes
is stable under C 1 transformations. This property extends to the case of
Dirichlet processes as the following result shows.

Proposition 9.11 Let X be an (Ft )-Dirichlet process, let f ∈ C 1 . Then


f (X) is again an (Ft )-Dirichlet process.

Proof. Let $X = M + A$, as in (9.5). We set
$$M_t^f = f(X_0) + \int_0^t f'(X)\, dM.$$
To conclude, it is enough to show that
$$A_t^f = f(X_t) - M_t^f$$
is a zero quadratic variation process. For this, we again proceed using the bilinearity and symmetry of the covariation.

(i) Remark 9.4 b) implies that
$$[f(X), f(X)]_t = \int_0^t f'(X_s)^2\, d[X]_s = \int_0^t f'(X_s)^2\, d[M]_s.$$

(ii) Since $M^f$ is a local martingale, taking into account Corollary 7.14 and Proposition 5.10 (v), it follows that
$$[M^f, M^f]_t = \int_0^t f'(X_s)^2\, d[M]_s.$$

(iii) Moreover, using similar arguments as before,
$$[f(X), M^f]_t = \int_0^t f'(X)\, d[X, M^f] = \int_0^t f'(X)\, d[M, M^f] = \int_0^t f'(X)^2\, d[M].$$

Consequently $[A^f, A^f] = [f(X), f(X)] - 2[f(X), M^f] + [M^f, M^f] = 0$. This concludes the proof of Proposition 9.11.

Natural examples of Dirichlet processes arise from Brownian motion and


related processes.
Let W be a classical (Ft )-Brownian motion.

Example 9.12 Let $f$ be of class $C^1$. Then $X = f(W)$ is an $(\mathcal F_t)$-Dirichlet process.

Remark 9.13 Let $f\in C^0(\mathbb R)$. We recall a celebrated result of [5]: $f(W)$ is an $(\mathcal F_t)$-semimartingale if and only if $f$ is a difference of two convex functions.

The previous Example and Remark easily show that the class of $(\mathcal F_t)$-Dirichlet processes strictly includes the class of $(\mathcal F_t)$-semimartingales.
Open question. Let $(\mathcal F_t)$ be a filtration such that $X$ is an $(\mathcal F_t)$-Dirichlet process and let $(\mathcal G_t)$ be a subfiltration of $(\mathcal F_t)$ such that $X$ is still $(\mathcal G_t)$-adapted. Is $X$ a $(\mathcal G_t)$-Dirichlet process?
In general, to our knowledge, the answer is unknown. However, at least in the particular case when $X$ is an $(\mathcal F_t)$-semimartingale, $X$ is a $(\mathcal G_t)$-Dirichlet process; in fact Stricker's Theorem 4.28 asserts that if $X$ is an $(\mathcal F_t)$-semimartingale then it is also a $(\mathcal G_t)$-semimartingale.

9.4 Itô formulae for finite quadratic variation process

We start with the one-dimensional case. Let X = (Xt )t≥0 be a continuous


process.

Proposition 9.14 Suppose that $[X, X]$ exists and let $f\in C^2(\mathbb R)$. Then
$$\int_0^\cdot f'(X)\, d^-X \quad \text{and} \quad \int_0^\cdot f'(X)\, d^\circ X \quad \text{exist.} \qquad (9.6)$$
Moreover

a) $f(X_t) = f(X_0) + \int_0^t f'(X)\, d^-X + \frac12\int_0^t f''(X_s)\, d[X,X]_s$.

b) $f(X_t) = f(X_0) + \int_0^t f'(X)\, d^-X + \frac12 [f'(X), X]_t$.

c) $f(X_t) = f(X_0) + \int_0^t f'(X)\, d^\circ X$.

Proof. We use the same notations as before. Let t ≥ 0. Π is a subdivision


Π = {0 = t0 < ... < tm = T }, si = ti ∧ t.

b) follows from a) because Remark 9.4 a) implies that
$$[f'(X), X]_t = \int_0^t f''(X)\, d[X,X].$$
c) follows from b) and Proposition 5.10 (i). It remains to justify a) and (9.6).
We write the Taylor expansion up to second order to obtain, for $t\ge 0$,
$$f(X_{s_{i+1}}) = f(X_{s_i}) + f'(X_{s_i})(X_{s_{i+1}} - X_{s_i}) + \frac{f''(X_{s_i})}{2}(X_{s_{i+1}} - X_{s_i})^2 + R^\Pi(t)(X_{s_{i+1}} - X_{s_i})^2, \qquad (9.7)$$
where again $\sup_{t\le T}|R^\Pi(t)|\to 0$ a.s. because $f''$ is uniformly continuous on each compact. Summing (9.7) from $i = 0$ to $n-1$, we obtain
$$\sum_{i=0}^{n-1}\big(f(X_{s_{i+1}}) - f(X_{s_i})\big) = I_1(t,\Pi) + I_2(t,\Pi) + I_3(t,\Pi), \qquad (9.8)$$
where
$$I_1(t,\Pi) = \sum_{i=0}^{n-1} f'(X_{s_i})(X_{s_{i+1}} - X_{s_i}), \qquad I_2(t,\Pi) = \frac12\sum_{i=0}^{n-1} f''(X_{s_i})(X_{s_{i+1}} - X_{s_i})^2,$$
$$I_3(t,\Pi) \le C(\Pi, X, X)(t)\,|R^\Pi(t)|,$$
with the same notations as in the previous subsection. $I_2(\cdot,\Pi)$ converges ucp to $\frac12\int_0^\cdot f''(X_s)\, d[X,X]$ and $I_3(\cdot,\Pi)$ goes ucp to zero by the same arguments as in the proof of the $C^1$ stability of the covariation.
The left member of (9.8) equals (for each $\Pi$) $f(X_t) - f(X_0)$. Therefore the ucp limit of $I_1(\cdot,\Pi)$ is forced to exist, and it is of course $\int_0^\cdot f'(X)\, d^-X$. This establishes a) at the same time. The existence of $\int_0^\cdot f'(X)\, d^\circ X$ follows again by Proposition 5.10 (i) and Remark 9.4.
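Formula a) can be tested numerically on a Brownian path, for which $[X,X]_t = t$. Below is a minimal sketch (our own illustration, with the arbitrary choice $f(x) = x^4$): the left-point sum approximating $\int_0^T f'(W)\, d^-W$ plus the correction $\frac12\int_0^T f''(W_s)\, ds$ reproduces $f(W_T) - f(W_0)$ up to discretization error.

```python
import numpy as np

# Numerical check of Ito's formula
# f(W_T) = f(0) + int_0^T f'(W) d^-W + (1/2) int_0^T f''(W_s) ds
# for f(x) = x**4 along one Brownian path ([W, W]_t = t).
rng = np.random.default_rng(3)
T, n = 1.0, 200_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))

f = lambda x: x**4
fp = lambda x: 4 * x**3      # f'
fpp = lambda x: 12 * x**2    # f''

forward = np.sum(fp(W[:-1]) * dW)            # left-point (forward) sum I_1
correction = 0.5 * np.sum(fpp(W[:-1])) * dt  # (1/2) int_0^T f''(W_s) ds
print(f(W[-1]) - f(0.0), forward + correction)  # the two numbers agree closely
```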

Before writing the generalization to dimension d, we observe the following.

Lemma 9.15 Let $X$ be a continuous process. The following properties are equivalent.

a) $[X, X]$ exists.

b) $\int_0^\cdot X\, d^-X$ exists.

Proof. a) $\Leftrightarrow$ b) follows from the identity
$$(X_{s_{i+1}} - X_{s_i})^2 = X_{s_{i+1}}^2 - X_{s_i}^2 - 2X_{s_i}(X_{s_{i+1}} - X_{s_i}). \qquad (9.9)$$

Corollary 9.16 Let $X$ be a continuous process. The following properties are equivalent.

a) $[X, X]$ exists.

b) $\int_0^\cdot g(X)\, d^-X$ exists for all $g\in C^1$.

Proof. a) $\Rightarrow$ b) follows by the Itô formula stated in Proposition 9.14 a).
b) $\Rightarrow$ a) follows setting $g(x) = x$ and using Lemma 9.15.

Itô formula admits an extension to the multidimensional case.

Theorem 9.17 Let $X = (X^1, \cdots, X^d)$ be a vector of processes having all its mutual brackets. Let $f\in C^2(\mathbb R^d)$. Suppose that the integrals
$$\int_0^\cdot \partial_i f(X)\, d^-X^i \qquad (9.10)$$
exist for $2\le i\le d$. Then
$$f(X_t) = f(X_0) + \sum_{i=1}^d \int_0^t \partial_i f(X_s)\, d^-X_s^i + \frac12\sum_{i,j=1}^d \int_0^t \partial^2_{i,j} f(X_s)\, d[X^i, X^j]_s. \qquad (9.11)$$
In particular $\int_0^\cdot \partial_1 f(X_s)\, d^-X^1$ exists.

Remark 9.18 a) One could define a vector stochastic integral of an $\mathbb R^d$-valued process $Y$ with respect to an $\mathbb R^d$-valued process $X$, denoted $\int_0^t Y\cdot d^-X$. It would be defined as the limit in probability of
$$\sum_{i=0}^{n-1}\sum_{\ell=1}^d Y_{s_i}^\ell(X_{s_{i+1}}^\ell - X_{s_i}^\ell), \qquad (9.12)$$
where $s_i = t_i\wedge t$ and $(t_i)$ is an element of a sequence of subdivisions whose mesh goes to zero. With this convention, we would not need to suppose (9.10) in Theorem 9.17, and $\sum_{i=1}^d \int_0^\cdot \partial_i f(X)\, d^-X^i$ would be replaced with $\int_0^\cdot \nabla f(X)\cdot d^-X$.
Similar considerations as for (9.12) can be made for the symmetric integral $\int_0^\cdot Y\cdot d^\circ X$.

b) Let $f\in C^{2,1}(\mathbb R\times\mathbb R^{d-1})$. Suppose that $X^2, \cdots, X^d$ are bounded variation processes. In this case, $\int_0^\cdot \partial_i f(X)\, d^-X^i$ coincides with the classical Lebesgue-Stieltjes integral $\int_0^\cdot \partial_i f(X)\, dX^i$, $i\ge 2$. We can show that the conclusion of Itô's formula (see Theorem 9.17) is verified; in particular
$$\int_0^\cdot \partial_1 f(X)\, d^-X^1 \ \text{ exists}$$
and
$$f(X_t) = f(X_0) + \int_0^t \partial_1 f(X)\, d^-X^1 + \sum_{i=2}^d \int_0^t \partial_i f(X_s)\, dX_s^i + \frac12\int_0^t \partial_1^2 f(X_s)\, d[X^1, X^1]_s.$$

c) Let $V$ be a real bounded variation process with values in $\mathbb R_+$ and let $(X_t)$ be a process with values in $\mathbb R^{d-1}$ having all its mutual covariations. Then an Itô formula also holds under the only assumption $f\in C^{1,2}(\mathbb R_+\times\mathbb R^{d-1})$. Indeed we have
$$f(V_t, X_t) = f(V_0, X_0) + \int_0^t \partial_v f(V_s, X_s)\, dV_s + \int_0^t \nabla_x f(V_s, X_s)\cdot d^-X_s + \frac12\sum_{i,j=2}^d \int_0^t \partial^2_{ij} f(V_s, X_s)\, d[X^i, X^j]_s.$$

d) In reality, if one knows a priori that the process $(X_t)$ remains in an open set $O$ and setting $V_t = t$, we only need to suppose $f\in C^{1,2}(\mathbb R_+\times O)$ in order to expand $f(t, X_t)$. For instance, if $X$ is a strictly positive finite quadratic variation process and $f(t, x) := \log x$, we can write
$$\log X_t = \log X_0 + \int_0^t \frac{1}{X_s}\, d^-X_s - \frac12\int_0^t \frac{1}{X_s^2}\, d[X]_s.$$

Taking into account the relation between symmetric and forward integrals, see Proposition 5.10 (i) and Proposition 9.3, we can easily prove the following.

Proposition 9.19 Let $X = (X^1, \cdots, X^n)$ be a vector of processes having all its mutual brackets. Let $f\in C^2(\mathbb R^n)$. Then
$$f(X_t) = f(X_0) + \int_0^t \nabla f(X_s)\cdot d^\circ X_s.$$

9.5 Applications to semimartingales and Itô processes

Let $(S_t)_{t\ge 0}$ be an $(\mathcal F_t)$-semimartingale and $(W_t)_{t\ge 0}$ an $(\mathcal F_t)$-Brownian motion. Let $(X_t)_{t\in[0,T]}$ be an Itô process of the form
$$X_t = X_0 + \int_0^t K_s\, ds + \int_0^t H_s\, dW_s, \quad t\in[0,T]. \qquad (9.13)$$
We recall that $(X_t)$ is an $(\mathcal F_t)$-semimartingale with bounded variation part $V_t = \int_0^t K_s\, ds$.

Proposition 9.20 Let $f\in C^{1,2}(\mathbb R_+\times\mathbb R)$, $(s, x)\mapsto f(s, x)$. We have the following.

i)
$$f(t, S_t) = f(0, S_0) + \int_0^t \frac{\partial f}{\partial s}(s, S_s)\, ds + \int_0^t \frac{\partial f}{\partial x}(s, S_s)\, dS_s + \frac12\int_0^t \frac{\partial^2 f}{\partial x^2}(s, S_s)\, d[S,S]_s.$$

ii)
$$f(t, X_t) = f(0, X_0) + \int_0^t \frac{\partial f}{\partial s}(s, X_s)\, ds + \int_0^t \frac{\partial f}{\partial x}(s, X_s)(H_s\, dW_s + K_s\, ds) + \frac12\int_0^t \frac{\partial^2 f}{\partial x^2}(s, X_s)\, H_s^2\, ds.$$

iii) Let $S^0$ be another $(\mathcal F_t)$-semimartingale. The following integration by parts holds:
$$S_tS_t^0 = S_0S_0^0 + \int_0^t S_s\, dS_s^0 + \int_0^t S_s^0\, dS_s + [S, S^0]_t.$$

Proof.

i) We recall that Itô and forward integrals coincide, see Theorem 8.8. The result then follows from Remark 9.18 c), which follows from Theorem 9.17, with $d = 2$, $V_t = t$, $X_t = S_t$.

ii) We use the chain rule of Proposition 8.3, the fact that $[W,W]_t = t$, and Corollary 7.16, which implies that $[X,X]_t = \int_0^t H_s^2\, d[W]_s$.

iii) It is a consequence of the integration by parts formula of Proposition 5.10 (ii) for forward integrals, and of the fact that again in this case Itô and forward integrals coincide.

Proposition 9.21 a) The product of semimartingales is still a semimartingale.

b) The product of Itô processes is again an Itô process.

c) In particular, the family of $(\mathcal F_t)$-semimartingales (resp. $((\mathcal F_t), W_t)$-Itô processes) is a vector algebra, which is a subalgebra of the algebra $C_{\mathcal F}$ of $(\mathcal F_t)$-adapted continuous processes.

Proof. It is a consequence of integration by parts. In fact, if $Y^1$ and $Y^2$ are two $(\mathcal F_t)$-semimartingales (resp. Itô processes), Proposition 9.20 iii) gives
$$Y_t^1Y_t^2 = Y_0^1Y_0^2 + \int_0^t Y^1\, dY^2 + \int_0^t Y^2\, dY^1 + [Y^1, Y^2]_t.$$
The chain rule of Proposition 8.3 and Exercise 8.4 give the result.

Exercise 9.22 a) Evaluate the quadratic variation $[\sin(W), \sin(W)]$. Let $A$ be a zero quadratic variation process and set $D = W + A$. Deduce $[\sin(D), \sin(D)]$.

b) If $X$ is an Itô process of the form given above, express $\int_0^t X\circ dW$.

c) Suppose that $F$ is of class $C^2(\mathbb R)$. Verify that
$$M_t = F(W_t) - \frac12\int_0^t F''(W_s)\, ds, \quad t\in[0,T],$$
is an $(\mathcal F_t)$-martingale if $\int_{\mathbb R} |F'|^2(y)\,\frac{p(y/\sqrt T)}{1+y^2}\, dy < \infty$, where $p$ is the density of the $N(0,1)$ law.

9.6 A first glance to stochastic differential equations

Let W be an (Ft )-Brownian motion, ξ be an F0 -measurable random variable.


Let us consider some Borel functions a, b : R+ × R → R.

Definition 9.23 We will say that a process $(X_t)_{t\in[0,T]}$ is a solution to
$$\begin{cases} dX_t = b(t, X_t)\, dt + a(t, X_t)\, dW_t \\ X_0 = \xi \end{cases} \qquad (9.14)$$
in the Itô sense, if

• $X$ is $(\mathcal F_t)$-adapted;

• $\int_0^T a^2(s, X_s)\, ds + \int_0^T |b|(s, X_s)\, ds < \infty$ a.s.;

• $X_t = \xi + \int_0^t a(s, X_s)\, dW_s + \int_0^t b(s, X_s)\, ds$, a.s., $\forall t\in[0,T]$.

In particular, a solution has a continuous modification which is an Itô process. Very often $\xi$, which is in general only an $\mathcal F_0$-measurable random variable, will be a real deterministic $x_0$.
We would like to discuss here a first method for finding a solution of equation (9.14). We start by finding a solution in the linear case: let $\mu, \sigma\in\mathbb R$, $s_0 > 0$.
Let $(S_t)_{t\ge 0}$ be a strictly positive solution (if it exists) of
$$\begin{cases} dS_t = \mu S_t\, dt + \sigma S_t\, dW_t \\ S_0 = s_0. \end{cases} \qquad (9.15)$$
We proceed using the Itô formula in an open set, i.e. Remark 9.18 d), since $f(x) = \log x$ is not of class $C^2(\mathbb R)$. Such a solution is an Itô process. We have
$$\log(S_t) = \log(s_0) + \int_0^t \frac{dS_s}{S_s} + \frac12\int_0^t \Big(-\frac{1}{S_s^2}\Big)\,\sigma^2 S_s^2\, ds. \qquad (9.16)$$
Using (9.15), for $Y_t = \log S_t$ we have
$$Y_t = Y_0 + \int_0^t \Big(\mu - \frac{\sigma^2}{2}\Big)\, ds + \int_0^t \sigma\, dW_s. \qquad (9.17)$$
We deduce
$$Y_t = \log S_t = \log S_0 + \Big(\mu - \frac{\sigma^2}{2}\Big)t + \sigma W_t. \qquad (9.18)$$
It seems that
$$S_t = s_0\exp\Big(\Big(\mu - \frac{\sigma^2}{2}\Big)t + \sigma W_t\Big) \qquad (9.19)$$
is a candidate solution of (9.15). We will verify rigorously that this is indeed the unique solution to (9.15).
We start with the existence property, which will be stated in a more general framework in Proposition 9.26 below.

Definition 9.24 Suppose $a$ and $b$, as in the previous definition, are continuous. Let $\xi$ be an $\mathcal F_0$-measurable random variable and let $A = (A_t)_{t\in[0,T]}$ be an $(\mathcal F_t)$-adapted measurable process. We will say that a process $(X_t)_{t\in[0,T]}$ is a solution to
$$\begin{cases} dX_t = b(t, X_t)\, dt + a(t, X_t)\, d^-A_t \\ X_0 = \xi, \end{cases} \qquad (9.20)$$
in the forward sense, if

• $\int_0^T |b|(s, X_s)\, ds < \infty$ a.s.;

• $X_t = X_0 + \int_0^t a(s, X_s)\, d^-A_s + \int_0^t b(s, X_s)\, ds$, a.s., $\forall t\ge 0$.
Remark 9.25 Suppose a, b continuous functions as before.
If A is a (Ft )- semimartingale, an (Ft )-adapted process X is a solution to the
equation in the Itô sense if and only if it is a solution in the forward sense.

Proposition 9.26 Let $A$ be an $(\mathcal F_t)$-adapted finite quadratic variation process. Then
$$S_t = s_0\exp\Big(\mu t - \frac{\sigma^2}{2}[A,A]_t + \sigma A_t\Big) \qquad (9.21)$$
is a solution of (9.20) with $b(t, x) = \mu x$ and $a(t, x) = \sigma x$.

Proof. We have $S = f(V, A)$, where $f(v, x) = s_0\exp(v + \sigma x)$ and $V_t = \mu t - \frac{\sigma^2}{2}[A]_t$. We have
$$\partial_x f(v, x) = \sigma f(v, x), \qquad \partial^2_{xx} f(v, x) = \sigma^2 f(v, x), \qquad \partial_v f(v, x) = f(v, x).$$
Consequently, using the Itô formula (Remark 9.18 c)), we obtain
$$S_t = f(V_t, A_t) = \underbrace{f(0, 0)}_{s_0} + \int_0^t \partial_v f(V_s, A_s)\, dV_s + \int_0^t \partial_x f(V_s, A_s)\, d^-A_s + \frac12\int_0^t \partial^2_{xx} f(V_s, A_s)\, d[A,A]_s$$
$$= s_0 + \int_0^t f(V_s, A_s)\Big(\mu\, ds - \frac{\sigma^2}{2}\, d[A]_s\Big) + \int_0^t \sigma f(V_s, A_s)\, d^-A_s + \frac12\int_0^t \sigma^2 f(V_s, A_s)\, d[A,A]_s$$
$$= s_0 + \mu\int_0^t S_s\, ds + \sigma\int_0^t S_s\, d^-A_s.$$

Exercise 9.27 Let $b, a : \mathbb R\to\mathbb R$ be respectively continuous and of class $C^1$, with $a > 0$. Let also $x_0\in\mathbb R$. Let $(X_t)_{t\in[0,T]}$ be a strictly positive solution to
$$\begin{cases} dX_t = b(X_t)\, dt + a(X_t)\, dW_t \\ X_0 = \xi. \end{cases} \qquad (9.22)$$
Set $H(x) = \int_0^x \frac{1}{a(y)}\, dy$, $x\in\mathbb R$, and set $Y = H(X)$. Show explicitly that $Y$ solves an equation driven by a bounded variation process (hence not involving stochastic integrals).

Remark 9.28 In particular, Proposition 9.26 implies that (9.19) is a solution of (9.15). The next argument will show that it is the unique solution.

So
$$S_t = s_0\exp\Big(\Big(\mu - \frac{\sigma^2}{2}\Big)t + \sigma W_t\Big)$$
is a solution of (9.15); we will suppose that $(X_t)_{t\ge 0}$ is another one. We will try to express the "Itô stochastic differential" of $X_tS_t^{-1}$. We set
$$Z_t = s_0S_t^{-1} = \exp\Big(\Big(-\mu + \frac{\sigma^2}{2}\Big)t - \sigma W_t\Big);$$
if $\mu' = -\mu + \sigma^2$ and $\sigma' = -\sigma$, then $Z_t = \exp\big((\mu' - \frac{\sigma'^2}{2})t + \sigma' W_t\big)$. Proposition 9.26 shows again that
$$Z_t = 1 + \int_0^t Z_s\big(\mu'\, ds + \sigma'\, dW_s\big) = 1 + \int_0^t Z_s\big((\sigma^2 - \mu)\, ds - \sigma\, dW_s\big). \qquad (9.23)$$

We can now express $X_tZ_t$ through integration by parts for semimartingales, see Proposition 9.20. We formally obtain
$$d(X_tZ_t) = X_t\, dZ_t + Z_t\, dX_t + d[X, Z]_t.$$
Since $(X_t)$ is a solution to equation (9.15) and having in mind (9.23), we have
$$[X, Z]_t = \Big[\int_0^\cdot X_s\sigma\, dW_s,\ -\int_0^\cdot Z_s\sigma\, dW_s\Big]_t = -\int_0^t \sigma^2 X_sZ_s\, ds.$$

We deduce that
$$X_tZ_t = s_0 + \int_0^t X_s\, dZ_s + \int_0^t Z_s\, dX_s - \int_0^t \sigma^2 X_sZ_s\, ds$$
$$= s_0 + \int_0^t (\sigma^2 - \mu)X_sZ_s\, ds - \int_0^t \sigma Z_sX_s\, dW_s + \int_0^t \mu Z_sX_s\, ds + \int_0^t \sigma Z_sX_s\, dW_s - \int_0^t \sigma^2 X_sZ_s\, ds = s_0,$$
where we used $dZ_s = Z_s((\sigma^2 - \mu)\, ds - \sigma\, dW_s)$ and $dX_s = X_s(\mu\, ds + \sigma\, dW_s)$.

So $X_tZ_t\equiv s_0$, $\forall t\ge 0$, $P$-a.s. This yields
$$X_t = s_0Z_t^{-1} = S_t, \quad \forall t\ge 0, \ P\text{-a.s.}$$

We have finally been able to prove the following result.

Theorem 9.29 Let $\sigma\in\mathbb R$, $\mu\in\mathbb R$, $s_0\in\mathbb R$, let $(W_t)$ be an $(\mathcal F_t)$-standard Brownian motion and $T > 0$. There is a unique Itô process $(S_t)_{t\in[0,T]}$ (up to indistinguishability) which is a solution to
$$\begin{cases} dX_t = \mu X_t\, dt + \sigma X_t\, dW_t \\ X_0 = s_0. \end{cases} \qquad (9.24)$$
This process is given by
$$S_t = s_0\exp\Big(\Big(\mu - \frac{\sigma^2}{2}\Big)t + \sigma W_t\Big).$$

Remark 9.30 (i) The process $S_t$ that we have provided constitutes the classical model for the price of a stock in the Black-Scholes financial model.

(ii) When $\mu = 0$, $S_t$ is a martingale, see Proposition 4.6. This kind of process is called an exponential martingale.
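The following is a hedged numerical sketch of Theorem 9.29 (our own illustration; the parameter values are arbitrary): it simulates the explicit solution on a grid and compares it with the Euler discretization of (9.24) built from the same Brownian increments.

```python
import numpy as np

# Black-Scholes / geometric Brownian motion: the explicit solution of
# Theorem 9.29 versus an Euler discretization of dX = mu X dt + sigma X dW
# driven by the same simulated Brownian path.
rng = np.random.default_rng(4)
mu, sigma, s0, T, n = 0.1, 0.3, 1.0, 1.0, 10_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))
t = np.linspace(0.0, T, n + 1)

S_exact = s0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)

X = np.empty(n + 1)
X[0] = s0
for i in range(n):
    X[i + 1] = X[i] + mu * X[i] * dt + sigma * X[i] * dW[i]

print(abs(S_exact[-1] - X[-1]))  # small: the Euler scheme tracks the solution
```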

We construct now a classical process of stochastic analysis.

Definition 9.31 (Generalized Ornstein-Uhlenbeck process). Let $(A_t)_{t\ge 0}$ be a general (continuous) process such that $A_0 = \xi$, $\xi$ being any random variable, and let $c\in\mathbb R$. A solution $X$ to
$$X_t = A_t + c\int_0^t X_s\, ds \qquad (9.25)$$
is called an Ornstein-Uhlenbeck process.
Proposition 9.33 will show that it is legitimate to denote the previous integral equation by
$$\begin{cases} dX_t = cX_t\, dt + d^-A_t \\ X_0 = \xi. \end{cases} \qquad (9.26)$$

Remark 9.32 The previous equation admits a unique solution in the class of continuous processes. This is an elementary consequence of a fixed point theorem applied on each path.

Proposition 9.33 The unique solution $(X_t)$ of the previous equation is given by
$$X_t = e^{ct}\xi + \int_0^t e^{c(t-s)}\, d^-A_s.$$

Remark 9.34 The most classical example of an Ornstein-Uhlenbeck process arises from the case $A = \xi + \sigma W$, where $W$ is a classical Brownian motion with canonical filtration $(\mathcal F_t)$ and $\xi$ is an $\mathcal F_0$-measurable r.v. In this case the previous expression gives
$$X_t = e^{ct}\xi + \sigma\int_0^t e^{c(t-s)}\, dW_s.$$

Proof (of Proposition 9.33).
Uniqueness and existence have already been discussed. Let $X$ be a solution. We set $Y_t = X_te^{-ct}$. Since $t\mapsto e^{-ct}$ has bounded variation, we have $[e^{-c\cdot}, X]\equiv 0$. Integration by parts (Proposition 5.10) and the fact that $X$ is a solution of (9.26) give
$$Y_t = \xi + \int_0^t e^{-cs}\, d^-X_s - c\int_0^t X_se^{-cs}\, ds = \xi + \int_0^t e^{-cs}\, d^-A_s + c\int_0^t e^{-cs}X_s\, ds - c\int_0^t e^{-cs}X_s\, ds = \xi + \int_0^t e^{-cs}\, d^-A_s.$$
Since $X_t = e^{ct}Y_t$, we obtain the expression stated in Proposition 9.33.
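For the classical case of Remark 9.34 the process can be simulated exactly; the stochastic-integral expression implies the Gaussian transition $X_{t+h} = e^{ch}X_t + \sigma\sqrt{(e^{2ch}-1)/(2c)}\,\eta$ with $\eta\sim N(0,1)$. Below is a minimal sketch with arbitrary parameters (our own illustration):

```python
import numpy as np

# Exact simulation of the Ornstein-Uhlenbeck process of Remark 9.34,
# X_t = e^{ct} xi + sigma int_0^t e^{c(t-s)} dW_s, via its Gaussian transition
# X_{t+h} = e^{ch} X_t + sigma * sqrt((e^{2ch} - 1) / (2c)) * N(0, 1).
rng = np.random.default_rng(5)
c, sigma, xi, T, n = -1.0, 0.5, 2.0, 5.0, 1000  # mean-reverting for c < 0
h = T / n
std = sigma * np.sqrt((np.exp(2 * c * h) - 1) / (2 * c))
X = np.empty(n + 1)
X[0] = xi
for i in range(n):
    X[i + 1] = np.exp(c * h) * X[i] + std * rng.normal()
print(X[-1], np.exp(c * T) * xi)  # for c < 0 the mean e^{cT} xi decays to 0
```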

9.7 Multidimensional semimartingales and Itô processes

Before evaluating the covariation of two independent stochastic processes, we take the opportunity to state a foundational theorem of probability theory; see for instance Section 4.3 in [1].

Definition 9.35 (i) A non-empty class $\mathcal D$ of subsets of $\Omega$ is called a $\Pi$-system if
$$A, B\in\mathcal D \ \Rightarrow\ A\cap B\in\mathcal D.$$

(ii) A non-empty class $\Lambda$ is called a $\lambda$-system if

(a) $\Omega\in\Lambda$;

(b) $A, B\in\Lambda$, $B\subset A$ $\Rightarrow$ $A - B\in\Lambda$;

(c) if $(A_n)$ is an increasing sequence of elements of $\Lambda$ then $\bigcup_n A_n\in\Lambda$.

The following theorem is called the monotone class theorem.

Theorem 9.36 Let D be a Π-system, Λ a λ-system such that D ⊂ Λ. Then


σ(D) ⊂ Λ.

An easy observation is the following.

Proposition 9.37 Let $\mathcal A, \mathcal B$ be two $\sigma$-fields and $\Lambda = \sigma(\mathcal A\cup\mathcal B)$. We set $\mathcal M = \sigma(A\cap B,\ A\in\mathcal A,\ B\in\mathcal B)$. Then $\Lambda = \mathcal M$.

Proof. It is enough to show $\Lambda\subset\mathcal M$, the other inclusion being obvious. This follows since every element $A$ of $\mathcal A$ (resp. $B$ of $\mathcal B$) can be written as $A = A\cap\Omega$ (resp. $B = \Omega\cap B$).

Proposition 9.38 Let $(M_t)_{t\in[0,T]}$ be a continuous $(\mathcal F_t)$-local martingale and $(Y_t)_{t\in[0,T]}$ be a càdlàg or càglàd $(\mathcal F_t)$-adapted process. If $M$ and $Y$ are independent then $[M, Y] = 0$.

Proof. Let $\mathcal Y_T$ be the $\sigma$-field generated by $(Y_t)_{t\in[0,T]}$. We denote $\mathcal M_t = \sigma(\mathcal Y_T\cup\mathcal F_t^M)$, where $(\mathcal F_t^M)$ is the canonical filtration associated with $M$. Via localization, it is possible to suppose that $M$ is an $(\mathcal F_t^M)$-martingale.
We prove that $(M_t)$ is an $(\mathcal M_t)$-martingale, that is to say
$$E(M_t\,|\,\mathcal M_s) = M_s, \quad t\ge s.$$
In other words,
$$E(M_t\mathbf 1_C) = E(M_s\mathbf 1_C), \quad \forall C\in\mathcal M_s. \qquad (9.27)$$

Now $\mathcal M_s = \sigma(\mathcal D)$ where
$$\mathcal D = \{A\cap B,\ A\in\mathcal F_s^M,\ B\in\mathcal Y_T\},$$
see Proposition 9.37. Let $\Lambda$ be the family of $C$ for which (9.27) is verified.

• $\Omega\in\Lambda$;

• by additivity, if $A, B$ verify (9.27) with $B\subset A$, then (9.27) is true for $C = A - B$;

• if $(C_n)$ is an increasing sequence of elements fulfilling (9.27), then the same holds for $C = \bigcup_n C_n$.

We observe that $\mathcal D\subset\Lambda$: let $C = A\cap B$, $A\in\mathcal F_s^M$, $B\in\mathcal Y_T$. We have
$$E(\mathbf 1_A\mathbf 1_B M_t) = E\big(\mathbf 1_A M_t\, E(\mathbf 1_B\,|\,\mathcal F_t^M)\big) = E(\mathbf 1_A M_t P(B)) = P(B)E(\mathbf 1_A M_t) = P(B)E(\mathbf 1_A M_s)$$
$$= E(\mathbf 1_A P(B) M_s) = E\big(\mathbf 1_A M_s\, E(\mathbf 1_B\,|\,\mathcal F_s^M)\big) = E(\mathbf 1_A\mathbf 1_B M_s).$$

So, by the monotone class theorem, the set $\Lambda$ contains $\sigma(\mathcal D) = \mathcal M_s$, and (9.27) is proved.
Finally $(M_t)$ is also an $(\mathcal M_t)$-martingale. By definition we know that $C_t(\Pi; Y, M) = S_t^+(\Pi, Y, M) - S_t^-(\Pi, Y, M)$, $t\in[0,T]$. Theorem 8.8 implies that the limit, as $|\Pi|$ goes to zero, of $S_t^-(\Pi, Y, M)$, which is the forward integral $\int_0^t Y\, d^-M$, equals the $(\mathcal M_t)$-Itô integral $\int_0^t Y\, dM$.
Without restriction of generality, we can suppose that $M_0 = 0$ and that $Y$ is bounded and càglàd.
For fixed $t$ we have
$$S_t^+(\Pi, Y, M) = \int_0^t Y_s^{\Pi,+}\, dM_s + R(|\Pi|, t),$$
where
$$Y_s^{\Pi,+} = \sum_k Y_{s_{k+1}}\mathbf 1_{]s_k, s_{k+1}]}(s),$$
with $s_k = t_k\wedge t$, $1\le k\le n$.

Since $(Y_s^{\Pi,+})$ is $(\mathcal M_t)$-adapted, this equals the Itô integral $\int_0^t Y_s^{\Pi,+}\, dM_s$. By a similar argument as in the proof of Theorem 8.8, this can be shown to converge ucp again to the Itô integral $\int_0^t Y\, dM$. Hence $C(\Pi; Y, M) = S^+(\Pi, Y, M) - S^-(\Pi, Y, M)$ converges ucp to zero, i.e. $[M, Y] = 0$.
This concludes the proof.

Corollary 9.39 Let S 1 , S 2 be two (Ft )-continuous semimartingales whose


martingale parts are independent. Then [S 1 , S 2 ] = 0.

Theorem 8.8, Theorem 9.17 and Remark 9.18 c), give the following multidi-
mensional Itô formula.

Proposition 9.40 Let $X^1, \ldots, X^m$ be $(\mathcal F_t)$-semimartingales and let $f\in C^{1,2}(\mathbb R_+\times\mathbb R^m)$. Then
$$f(t, X_t^1, \ldots, X_t^m) = f(0, X_0^1, \ldots, X_0^m) + \int_0^t \frac{\partial f}{\partial s}(s, X_s^1, \ldots, X_s^m)\, ds + \sum_{i=1}^m \int_0^t \frac{\partial f}{\partial x_i}(s, X_s^1, \ldots, X_s^m)\, dX_s^i$$
$$+ \frac12\sum_{i,j=1}^m \int_0^t \frac{\partial^2 f}{\partial x_i\partial x_j}(s, X_s^1, \ldots, X_s^m)\, d[X^i, X^j]_s.$$

Definition 9.41 A vector (W 1 , . . . , W p ) of independent classical Brownian


motions is called p-dimensional classical Brownian motion. We can of
course define similarly an (Ft )- p-dimensional classical Brownian mo-
tion.

Remark 9.42 What is the value of $[W^i, W^j]$ when $i\neq j$ and $W^i, W^j$ are two independent classical Brownian motions? Corollary 9.39 affirms that its value is zero.
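This can also be checked numerically; the small sketch below (our own illustration) computes the discretized bracket $C(\Pi, W^1, W^2)_T$ for two independent simulated paths:

```python
import numpy as np

# Discretized covariation of two independent Brownian motions:
# C(Pi, W1, W2)_T = sum of products of increments, which tends to 0
# in probability as the mesh goes to zero (its std is of order T/sqrt(n)).
rng = np.random.default_rng(7)
T, n = 1.0, 100_000
dW1 = rng.normal(0.0, np.sqrt(T / n), size=n)
dW2 = rng.normal(0.0, np.sqrt(T / n), size=n)
print(np.sum(dW1 * dW2))  # close to 0
```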

We continue with an extension of the notion of Itô process.

Definition 9.43 Let $W = (W^1, \ldots, W^p)$ be an $(\mathcal F_t)$-$p$-dimensional classical Brownian motion. We say that a process $(X_t)_{t\ge 0}$ is a $(W, \mathcal F)$-Itô process (or simply a general Itô process) if
$$X_t = X_0 + \int_0^t K_s\, ds + \sum_{i=1}^p \int_0^t H_s^i\, dW_s^i,$$
where

• $X_0$ is $\mathcal F_0$-measurable;

• $(K_t), (H_t^i)$ are $(\mathcal F_t)$-progressively measurable processes;

• $\int_0^T |K_s|\, ds < \infty$ a.s.;

• $\int_0^T (H_s^i)^2\, ds < \infty$ a.s., $1\le i\le p$.

Remark 9.42 and Corollary 7.16 (which allows one to calculate brackets of Itô integrals related to local martingales) provide the following result.

Proposition 9.44 Let $(H_t^i)_{t\in[0,T]}$, $i = 1, 2$, be two $(\mathcal F_t)$-progressively measurable processes such that $\int_0^T (H_s^i)^2\, ds < \infty$ a.s., $i = 1, 2$. Let $W^1, W^2$ be two independent $(\mathcal F_t)$-Brownian motions. Then
$$\Big[\int_0^\cdot H_s^1\, dW_s^1, \int_0^\cdot H_s^2\, dW_s^2\Big] \equiv 0.$$

Exercise 9.45 Let $W$ be a classical $p$-dimensional Brownian motion and $F\in C^2(\mathbb R^p)$. Show that
$$F(W_t) - F(0) - \frac12\int_0^t \Delta F(W_s)\, ds$$
defines an $(\mathcal F_t)$-martingale if $\nabla F$ is bounded.

Proposition 9.44 also allows one to evaluate the covariation of two general Itô processes. Consider in fact
$$X_t^i = X_0^i + \int_0^t K_s^i\, ds + \sum_{j=1}^p \int_0^t H_s^{i,j}\, dW_s^j, \quad i = 1, 2.$$
Then
$$[X^1, X^2]_t = \sum_{j=1}^p \int_0^t H_s^{1,j}H_s^{2,j}\, ds.$$

10 Stochastic differential equations

In Section 9 we studied in detail the solutions of equations

X_t = x_0 + ∫_0^t X_s (µ ds + σ dW_s)

and the Ornstein-Uhlenbeck processes; we recall that examples of such processes are provided by

X_t = x_0 + c ∫_0^t X_s ds + σ W_t,

where W is a classical Brownian motion and c, σ, x_0 are real constants.
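The Ornstein-Uhlenbeck dynamics above can be simulated exactly, since X_{t+h} = e^{ch} X_t + σ ∫_t^{t+h} e^{c(t+h−s)} dW_s and the stochastic integral is Gaussian with variance (e^{2ch} − 1)/(2c). A minimal sketch (it assumes the NumPy library; the parameter values are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(1)
c, sigma, x0 = -0.5, 0.3, 1.0       # illustrative constants
T, n = 5.0, 500
h = T / n
std = sigma * np.sqrt((np.exp(2 * c * h) - 1) / (2 * c))  # exact step st. dev.
X = np.empty(n + 1)
X[0] = x0
for k in range(n):
    X[k + 1] = np.exp(c * h) * X[k] + std * rng.normal()
print("X_T =", X[-1])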


At this point we will first study equations with Lipschitz coefficients, i.e.

X_t = Z + ∫_0^t b(s, X_s) ds + ∫_0^t a(s, X_s) dW_s. (10.1)

We recall that those equations are called Itô stochastic differential equations. A solution of the previous equation is named a diffusion.

10.1 Existence and uniqueness for Itô stochastic differential


equations

Definition 10.1 A function γ : [0, T] × R^m → R^d is called Lipschitz (with respect to the second variable x, uniformly with respect to the first variable t) if there is a constant k > 0 such that

sup_{t∈[0,T]} |γ(t, x) − γ(t, y)| ≤ k|x − y|, ∀x, y ∈ R^m. (10.2)

A function γ : R_+ × R^m → R^d is called Lipschitz (with respect to x, uniformly with respect to t on every compact interval) if, for every T > 0, γ restricted to [0, T] × R^m is Lipschitz in the previous sense.

A function γ : [0, T] × R^m → R^d is said to have linear growth (with respect to x, uniformly with respect to t) if there is a constant C > 0 with

sup_{t∈[0,T]} |γ(t, x)| ≤ C(1 + |x|). (10.3)

A function γ : R_+ × R^m → R^d is said to have linear growth (with respect to x, uniformly for t on each compact interval) if, for every T > 0, the restriction of γ to [0, T] × R^m has linear growth in the previous sense.

Remark 10.2 If γ : [0, T] × R^m → R^d is continuous and Lipschitz then it has linear growth.
In fact, if x ∈ R^m, t ∈ [0, T],

|γ(t, x)| ≤ |γ(t, x) − γ(t, 0)| + |γ(t, 0)|
≤ k|x| + sup_{t∈[0,T]} |γ(t, 0)|
≤ C(|x| + 1), ∀x ∈ R^m,

with C = max(sup_{t∈[0,T]} |γ(t, 0)|; k).

As usual we will consider a probability space (Ω, F, P ) equipped with a usual


filtration (Ft )t≥0 .
Let a, b : R+ × R −→ R be Borel functions, Z a random variable F0 -
measurable and (Wt )t≥0 a classical (Ft )-Brownian motion.

Remark 10.3 a) We recall that (10.1) can be formally expressed by

dX_t = b(t, X_t) dt + a(t, X_t) dW_t, X_0 = Z.

b) If a continuous process fulfills (10.1) P-a.s. for every t ≥ 0, then equality (10.1) holds ∀t ≥ 0 P-a.s., since both members are continuous and so indistinguishable.

Theorem 10.4 We suppose a, b : R_+ × R → R Lipschitz with linear growth. Let Z be a square integrable r.v. that is F_0-measurable. Then (10.1) has a (unique up to indistinguishability) continuous solution X. Moreover, for every T > 0,

E(sup_{s≤T} |X_s|²) < ∞.

Proof It will be enough to discuss existence and uniqueness in [0, T], where T > 0. We denote

A = {(X_t)_{t∈[0,T]} (F_t)-adapted continuous such that E(sup_{t≤T} |X_t|²) < ∞}.

A is a Banach space equipped with the norm

||X|| = (E(sup_{s≤T} |X_s|²))^{1/2}.

If A is a positive constant, the space A can itself be equipped with the norm || · ||_{∼,A} (equivalent to || · ||):

||X||²_{∼,A} = sup_{t∈[0,T]} e^{−At} E(sup_{s≤t} |X_s|²).

In fact

||X||² = sup_{t∈[0,T]} E(sup_{s≤t} |X_s|²).

Let Φ be the map that to a process (X_s)_{s∈[0,T]} in A associates the process (Φ(X)_t)_{t∈[0,T]} defined by

Φ(X)_t = Z + ∫_0^t b(s, X_s) ds + ∫_0^t a(s, X_s) dW_s.

Remark 10.5 Φ sends an element of A into A, i.e., if (X_t)_{t∈[0,T]} ∈ A then

E(sup_{t≤T} |Φ(X)_t|²) < ∞. (10.4)

In fact, Doob's inequality gives

E(sup_{t≤T} |Φ(X)_t|²) ≤ 4E(Z²) + 4E(sup_{t≤T} |∫_0^t a(s, X_s) dW_s|²) + 4E((∫_0^T |b(s, X_s)| ds)²)
≤ 4E(|Z|²) + 16 E(∫_0^T a²(s, X_s) ds) + 4T E(∫_0^T b²(s, X_s) ds)

(provided the expectation in the middle is finite).
Since a and b have linear growth, the previous expression is bounded by

4E(|Z|²) + C²(32 + 8T) E(∫_0^T (1 + |X_s|²) ds),

where C is a linear growth constant.
Since X ∈ A, the previous expression is finite. ✷
On the other hand, if X, Y ∈ A, 0 ≤ t ≤ u, we have

|Φ(X)_t − Φ(Y)_t|² ≤ 2 (∫_0^u |b(s, X_s) − b(s, Y_s)| ds)² + 2 sup_{t∈[0,u]} (∫_0^t (a(s, X_s) − a(s, Y_s)) dW_s)²
and consequently, for u ∈ [0, T],

E(sup_{t≤u} |Φ(X)_t − Φ(Y)_t|²) ≤ 2T E(∫_0^u |b(s, X_s) − b(s, Y_s)|² ds) + 8 E(∫_0^u (a(s, X_s) − a(s, Y_s))² ds)
≤ K̃ E(∫_0^u |X_s − Y_s|² ds),

where K̃ is a constant depending on the Lipschitz constants and T.


Previous expression equals
Z u
eAu − 1
K̃ ds E(|Xs − Ys |2 )e−As eAs ≤ K̃||X − Y ||2∼,A .
0 | {z } A
≤E (supr≤s |Xr −Yr |2 )e−As

88
Therefore

||Φ(X) − Φ(Y)||²_{∼,A} = sup_{u∈[0,T]} E(sup_{s≤u} |Φ(X)_s − Φ(Y)_s|²) e^{−Au}
≤ sup_{u∈[0,T]} (K̃/A)(1 − e^{−Au}) ||X − Y||²_{∼,A}
≤ (K̃/A) ||X − Y||²_{∼,A}.

Choosing A > K̃, the map Φ is a contraction on A equipped with the norm || · ||_{∼,A}.
By the Banach fixed point theorem there is a “unique process in A” (X_t) such that

X_t = Z + ∫_0^t a(s, X_s) dW_s + ∫_0^t b(s, X_s) ds, ∀t ∈ [0, T], P-a.s. (10.5)

The proof of uniqueness (among all solutions, not only those belonging to A) is omitted.
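In practice, solutions of (10.5) are approximated rather than computed in closed form; the simplest scheme is the Euler(-Maruyama) discretization. A minimal sketch (assuming NumPy; the coefficients b(t, x) = −x and a ≡ 1 chosen here are an arbitrary Lipschitz example):

import numpy as np

def euler_maruyama(b, a, z, T, n, rng):
    """Euler scheme for dX = b(t,X) dt + a(t,X) dW on [0,T], X_0 = z."""
    h = T / n
    X = np.empty(n + 1)
    X[0] = z
    t = 0.0
    for k in range(n):
        dW = rng.normal(0.0, np.sqrt(h))
        X[k + 1] = X[k] + b(t, X[k]) * h + a(t, X[k]) * dW
        t += h
    return X

rng = np.random.default_rng(2)
path = euler_maruyama(lambda t, x: -x, lambda t, x: 1.0,
                      z=1.0, T=1.0, n=1000, rng=rng)
print("X_T ≈", path[-1])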

10.2 Vector valued Itô stochastic differential equations

It is possible to define and study Itô stochastic differential equation where


the process solution propagates in Rm . We consider the following objects.

• W = (W 1 , . . . , W p ) a p-dimensional (Ft )-Brownian motion;

• a Borel function b : R+ ×Rm −→ Rm , b(s, x) = (b1 (s, x), . . . , bm (s, x));

• a : R+ × Rm −→ Rm×p (the set of m × p real matrices)


a(s, x) = (aij (s, x))1≤i≤m,1≤j≤p being Borel functions;

• an F0 -measurable Z = (Z 1 , . . . , Z m ) with values in Rm .

We consider now the stochastic differential equation

X_t^i = Z^i + ∫_0^t b^i(s, X_s) ds + Σ_{j=1}^p ∫_0^t a_{ij}(s, X_s) dW_s^j, 1 ≤ i ≤ m, (10.6)

where by a solution we mean again a continuous process (X_t)_{t≥0} with values in R^m, adapted to (F_t)_{t≥0} and such that (10.6) is verified P-a.s. ∀t ∈ [0, T], i ∈ {1, . . . , m}.

(10.6) can be written compactly as

X_t = Z + ∫_0^t b(s, X_s) ds + ∫_0^t a(s, X_s) dW_s. (10.7)

The existence and uniqueness theorem can be generalized in the following way. If x ∈ R^m and a ∈ R^{m×p}, we denote by |x| the Euclidean norm of x and set

|a|² = Σ_{1≤i≤m, 1≤j≤p} a_{ij}².

Theorem 10.6 We suppose a, b Lipschitz with linear growth and E(|Z|²) < ∞. Then there is a solution to (10.7), unique up to indistinguishability. Moreover, that solution verifies

E(sup_{t≤T} |X_t|²) < ∞, ∀T > 0.

Remark 10.7 The proof is fairly similar to the real case.

Remark 10.8 The assumptions of Theorem 10.6 (and of course of Theorem


10.4) can be weakened assuming that coefficients a and b are Borel, locally
Lipschitz and with linear growth. The idea is that the locally Lipschitz prop-
erty provides local existence and uniqueness of possibly exploding solutions
and the linear growth condition guarantees the non-explosion of the solu-
tion. Concerning uniqueness (under locally Lipschitz conditions) we refer for
instance to [29], Theorem 38, Chapter 7, which also discusses existence of
possibly exploding solutions.

10.3 Dirichlet processes and SDEs

Next exercise motivates the introduction of Dirichlet processes for studying


generalized stochastic differential equations.

Exercise 10.9 (Zvonkin transformation).
Let (W_t) be an (F_t)-classical Brownian motion. Let a, b : R → R be respectively of class C¹ and C⁰, with a > 0. We fix x ∈ R. Let (X_t)_{t≥0} be a solution of

X_t = x + ∫_0^t b(X_s) ds + ∫_0^t a(X_s) dW_s. (10.8)

(i) Determine one particular strictly increasing solution h of the equation

(1/2) a² h'' + b h' = 0.

We set ã(x) = (ah')(h^{−1}(x)), where h^{−1} is the inverse of h.

(ii) Verify that Y_t = h(X_t) defines a solution of

dY_t = ã(Y_t) dW_t, Y_0 = h(x). (10.9)

We suppose now that

• a has linear growth;

• y ↦ ∫_0^y (b/a²)(s) ds is a bounded function.

(iii) Verify that problem (10.9) has a unique solution.

(iv) Explain why previous arguments provide uniqueness for problem (10.8).

(v) Establish existence for (10.8).

One motivation for studying Dirichlet processes comes from diffusion in irregular media, see e.g. [33], [23, 24].
Those authors consider formally an equation of the type

dX_t = dW_t + γ'(X_t) dt, (10.10)

where γ is only a continuous function; γ could be for instance the realization of a Brownian motion independent of W. When a = 1, b = γ', the formulation of the stochastic differential equation (10.9) still makes sense, with h'(y) = e^{−2γ(y)}. If Y is a solution of (10.9), then X = h^{−1}(Y) provides, in a sense to be made precise, a solution to (10.8).
We will see that problem (10.9) admits existence and uniqueness in the sense of distribution laws. [10, 11] make the previous statement rigorous; in particular, they investigated generalized SDEs such as (10.10). More recent papers are [32, 9].
Remark that X, being only a C¹ function of the semimartingale Y, will only be a Dirichlet process.

11 Change of probability and martingale representation theorems

11.1 Equivalent probabilities

Let (Ω, F) be a measurable space and let P, Q be two probabilities on it. We recall that Q is absolutely continuous with respect to P if

P(N) = 0 ⇒ Q(N) = 0, ∀N ∈ F.

We symbolize this by Q << P. We will say that P and Q are equivalent if P << Q and Q << P; this will be denoted by P ∼ Q.

Remark 11.1 (Radon-Nikodym Theorem).
Q << P if and only if there exists Z ≥ 0 on (Ω, F) such that

Q(A) = ∫_A Z(ω) dP(ω), ∀A ∈ F.

Z is called the (Radon-Nikodym) density of Q with respect to P, denoted by dQ/dP.

Lemma 11.2 Suppose Q << P with Radon-Nikodym density Z. Then P ∼


Q ⇔ P {Z > 0} = 1.

Proof. ⇐ Let A ∈ F. It is enough to see that Q(A) = 0 ⇒ P(A) = 0. Suppose Q(A) = ∫ 1_A Z dP = 0. This implies 1_A Z = 0 P-a.s. But Z > 0 a.s. yields 1_A = 0 a.s. and so P(A) = 0.
⇒ We set C = {Z = 0}. We have to show that P(C) = 0. Since P << Q, it is enough to show that Q(C) = 0. Now Q(C) = ∫ Z 1_{{Z=0}} dP = 0. ✷

Exercise 11.3 Verify that a discrete probability on the Borel sets of R is not
absolutely continuous with respect to Lebesgue measure.

11.2 Girsanov Theorem

Let (Ω, F, P ) be a probability space with F = FT and let (Ft )t∈[0,T ] be a


filtration fulfilling the usual conditions and consider a classical (Ft )t∈[0,T ] -
Brownian motion (Wt )t∈[0,T ] . Next theorem is known under the name of
Girsanov theorem.

Theorem 11.4 Let (θ_t)_{t∈[0,T]} be a progressively measurable process such that ∫_0^T θ_s² ds < ∞ a.s. We set

L_t = exp(∫_0^t θ_s dW_s − (1/2) ∫_0^t |θ_s|² ds).

We suppose that (L_t)_{t∈[0,T]} is a martingale.
Then, under the probability P* with density L_T with respect to P, the process (B_t)_{t∈[0,T]} defined by

B_t = W_t − ∫_0^t θ_s ds

is an (F_t)_{0≤t≤T}-standard Brownian motion.

Remark 11.5 i) P ⋆ (Ω) = 1 since (Lt ) is a martingale.

ii) Girsanov theorem admits an obvious generalization to the case of a multi-


dimensional Brownian motion.

iii) For the proof, the reader can consult for instance [17], section 3.5.

Remark 11.6 i) (Novikov condition).
A sufficient condition for (L_t)_{0≤t≤T} to be a martingale is the following:

E(exp((1/2) ∫_0^T θ_t² dt)) < ∞.

ii) It is easily possible to show that

dL_t = θ_t L_t dW_t.

This only shows that (L_t)_{t∈[0,T]} is a local martingale. Exercise.
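A numerical sanity check of Girsanov's theorem for a constant θ (a sketch only, assuming NumPy; the values of θ, T and the test function are arbitrary): under P* with density L_T = exp(θW_T − θ²T/2), the law of W_T is Gaussian with mean θT, so E^{P*}(f(W_T)) = E(L_T f(W_T)) can be compared with E(f(N(θT, T))).

import numpy as np

rng = np.random.default_rng(3)
theta, T, n = 0.7, 1.0, 200_000
WT = rng.normal(0.0, np.sqrt(T), n)
LT = np.exp(theta * WT - 0.5 * theta ** 2 * T)   # Girsanov density at time T
f = lambda x: np.maximum(x, 0.0)                 # an arbitrary test function

lhs = np.mean(LT * f(WT))                               # E(L_T f(W_T))
rhs = np.mean(f(rng.normal(theta * T, np.sqrt(T), n)))  # E f(N(θT, T))
print(lhs, rhs)   # the two estimates should be close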

The theorem below constitutes a slight generalization of the Novikov condition; it also admits a multi-dimensional generalization.

Theorem 11.7 Let W be a standard Brownian motion and θ a progressively measurable process such that there are instants 0 = t_0 < t_1 < . . . < t_n = T such that for every i ∈ {1, . . . , n}

E(exp((1/2) ∫_{t_{i−1}}^{t_i} |θ_s|² ds)) < ∞.

Then the process

L_t = exp(∫_0^t θ_s dW_s − (1/2) ∫_0^t |θ_s|² ds), 0 ≤ t ≤ T,

is a martingale. In particular E(L_T) = 1.

Proof. For every i ∈ {1, . . . , n}, we set θ(i)_t = θ_t 1_{[t_{i−1}, t_i[}(t). The Novikov condition is fulfilled for θ(i) and so the process

L_t(θ(i)) := exp(∫_0^t θ(i)_s dW_s − (1/2) ∫_0^t |θ(i)_s|² ds)

is a martingale, so

E(L_{t_i}(θ(i)) | F_{t_{i−1}}) = L_{t_{i−1}}(θ(i)) = 1.

By induction on i ≤ n one can show that E(L_{t_i}) = 1. Indeed, for every i ∈ {1, . . . , n},

E(L_{t_i}) = E(L_{t_{i−1}} E(L_{t_i}(θ(i)) | F_{t_{i−1}})) = E(L_{t_{i−1}}).

Since t_n = T, we deduce that E(L_T) = 1. Since L is a non-negative local martingale, it is a supermartingale (exercise); a supermartingale with constant expectation is a martingale, so L is a martingale. ✷

The following application is due to Benes.

Proposition 11.8 Let W be a standard Brownian motion and b : [0, T] × R → R a Borel function such that there is a constant K with

sup_{t∈[0,T]} |b(t, x)| ≤ K(1 + |x|).

We set

L_t := exp(∫_0^t b(s, W_s) dW_s − (1/2) ∫_0^t b²(s, W_s) ds), t ∈ [0, T].

Then L is a martingale. In particular E(L_T) = 1.

Remark 11.9 Let N be a standard Gaussian r.v. If c_0 < 1/2, then

E(exp(c_0 N²)) < ∞.

Exercise.

Proof (of Proposition 11.8). Let n be a positive integer and t_i = iT/n, i ∈ {1, . . . , n}. For every i ∈ {1, . . . , n},

∫_{t_{i−1}}^{t_i} |b(s, W_s)|² ds ≤ 2K² (T/n) (1 + sup_{0≤s≤T} |W_s|²).

We set c = K²T/n. We choose n large enough so that cT < 1/2. By Remark 11.9, for t ∈ [0, T],

E(exp(c(1 + |W_t|²))) = exp(c) E(exp(ctN²)) ≤ exp(c) E(exp(cT N²)) < ∞,

where N is a standard Gaussian random variable. Since x ↦ exp(c(1 + x²)) is convex, Y_t = exp(c(1 + |W_t|²)) is a submartingale. By Doob's inequality, Proposition 4.20, we get

E(e^{c(1 + sup_{0≤s≤T} |W_s|²)}) = E(sup_{0≤t≤T} e^{c(1 + |W_t|²)}) ≤ 4E(e^{c(1 + |W_T|²)}) < ∞.

The result follows by Theorem 11.7 with θ_s = b(s, W_s). ✷

An important remark concerns the invariance of stochastic integrals under a change of probability, which can be established by the approximation arguments of the construction of the Itô integral.

Remark 11.10 Let (X_t)_{t∈[0,T]} be an ((F_t)_{t∈[0,T]}, P)-semimartingale. Suppose that (X_t)_{t∈[0,T]} is still a semimartingale with respect to an equivalent probability measure Q. Let (H_t)_{t≥0} be such that

∫_0^t H_s dX_s

exists with respect to P. Then

∫_0^t H_s dX_s

also exists with respect to Q and the two integrals coincide.

Example 11.11 Let (X_t)_{t∈[0,T]} be a solution to

dX_t = a(X_t) dW_t + b(X_t) dt, X_0 = x_0, (11.1)

where a, b are Lipschitz real functions, a > 0, b/a is bounded and x_0 is a real number.
In this case

dX_t = a(X_t) dB_t, (11.2)

where

B_t = W_t + ∫_0^t (b/a)(X_s) ds.

We set

L_t = exp(−∫_0^t θ_s dW_s − (1/2) ∫_0^t |θ_s|² ds),

with θ_s := (b/a)(X_s). Since b/a is bounded, the Novikov condition is verified and (L_t)_{t∈[0,T]} is an (F_t)_{t≥0}-martingale. Therefore, under Q defined by dQ = L_T dP, (B_t)_{t∈[0,T]} is an (F_t)_{t≥0}-Brownian motion.
The local martingale (X_t)_{t∈[0,T]} is (under Q) the unique solution of the SDE (11.2). By Theorem 10.4, it is an (F_t)_{t≥0}-square integrable martingale (still with respect to Q).

11.3 Representation theorem for Brownian martingales

We suppose here that (F_t) is the canonical filtration of a classical Brownian motion (W_t). We will say in this case that (F_t) is a Brownian filtration.
We recall that, if (H_t)_{t≥0} is a progressively measurable process such that E(∫_0^T H_t² dt) < ∞, the process (∫_0^t H_s dW_s)_{t∈[0,T]} is a (square integrable) martingale vanishing at 0.
The next theorem shows that all martingales with respect to a Brownian filtration can be represented with the help of a stochastic integral.

Theorem 11.12 Let (M_t)_{0≤t≤T} be an (F_t)_{t∈[0,T]}-martingale such that E(M_t²) < ∞ (i.e. square integrable).
There is a progressively measurable process (H_t)_{t∈[0,T]} such that E(∫_0^T H_s² ds) < ∞ and

M_t = M_0 + ∫_0^t H_s dW_s a.s. ∀t ∈ [0, T]. (11.3)

In particular the martingale admits a continuous modification.

Remark 11.13 This representation is only possible for martingales related


to a Brownian filtration (natural filtration of a Brownian motion). Such
martingales are called Brownian martingales.

From the theorem it follows that, whenever U is an F_T-measurable square integrable random variable, we can express it as

U = E(U) + ∫_0^T H_s dW_s a.s., (11.4)

where (H_t) is a progressively measurable process such that E(∫_0^T H_s² ds) < ∞.
In fact, the point consists in verifying the hypotheses of Theorem 11.12 for M_t = E(U | F_t).

• (M_t)_{t≥0} is square integrable because

E(M_t²) = E(E(U | F_t)²) ≤ E(E(U² | F_t)) = E(U²).

• (M_t)_{t≥0} is an (F_t)-martingale since

E(M_t | F_s) = E(E(U | F_t) | F_s) = E(U | F_s) = M_s, t ≥ s.

Exercise 11.14 Let U be an F_T-measurable r.v. Show that there exists at most one couple (h_0, H), where h_0 ∈ R and H is progressively measurable with E(∫_0^T H_s² ds) < ∞, such that

U = h_0 + ∫_0^T H_s dW_s.

The uniqueness of H holds dt dP-a.e.
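A numerical illustration of the representation (11.4) (a sketch assuming NumPy): for U = W_T², Itô formula gives E(U) = T and H_s = 2W_s, so U should be close to T plus the discretized stochastic integral.

import numpy as np

rng = np.random.default_rng(4)
T, n = 1.0, 100_000
dW = rng.normal(0.0, np.sqrt(T / n), n)
W = np.concatenate(([0.0], np.cumsum(dW)))

U = W[-1] ** 2
stoch_int = np.sum(2 * W[:-1] * dW)  # ∫_0^T 2 W_s dW_s via left-point sums
print(U, T + stoch_int)              # the two values should be close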

12 Markov property of solutions of stochastic dif-
ferential equations (SDEs)

12.1 Definition

A process (X_t)_{t≥0} with values in R^m is said to fulfill the Markov property, and is called a Markov process, when its future behaviour after an instant t depends only on X_t and not on the past before t.
Mathematically speaking, a process (X_t)_{t≥0} verifies the Markov property with respect to a filtration (F_t)_{t≥0} to which it is adapted if, for any bounded Borel function f : R^m → R and for any s ≤ t, we have

E(f(X_t) | F_s) = E(f(X_t) | X_s).

We will see that solutions of SDEs are Markov. We consider the solution (X_t) of an SDE of the type

X_t = x_0 + ∫_0^t a(u, X_u) dW_u + ∫_0^t b(u, X_u) du, (12.1)

where a, b are Lipschitz with linear growth, x_0 ∈ R, and (W_t) is a classical Brownian motion with respect to a filtration (F_t).

Remark 12.1 (i) (X_t)_{t≥0} can be considered as a r.v. ω ↦ X_·(ω) from Ω with values in C(R_+) (equipped with the topology of uniform convergence on each compact set and with the related Borel σ-field). Indeed we can write

X = Φ(x, W_·),

where Φ : R × C(R_+) → C(R_+) is measurable (proof omitted).

(ii) Let 0 ≤ s ≤ t. Similar considerations as in Section 10.1 are valid for the equation

Y_t = Z + ∫_s^t a(u, Y_u) dW_u + ∫_s^t b(u, Y_u) du, t ≥ s, (12.2)

Z being F_s-measurable and square integrable.
Since the Itô integral coincides with the forward integral, the stochastic integral with respect to W coincides with the integral with respect to (W̄_u)_{u≥0}, where W̄_u = W_{s+u} − W_s. We remark that W̄ is a classical Brownian motion independent of F_s.
Consequently, (12.2) with Z = y can also be written as

Y_t = y + ∫_s^t a(u, Y_u) dW̄_u + ∫_s^t b(u, Y_u) du, t ≥ s.

Via point (i), there is Φ̄ : R × C([0, ∞[) → C([s, ∞[) such that Y = Φ̄(y, W̄). In other words, Y is measurable with respect to y and to the process of Brownian motion increments.

(iii) The considerations of this section extend to the case of multi-dimensional


SDEs.

For x ∈ R, (X_t^{s,x})_{t≥s} will denote the unique solution (Y_t)_{t≥s} of

Y_t = x + ∫_s^t a(u, Y_u) dW_u + ∫_s^t b(u, Y_u) du, t ≥ s.

The solution X^{0,x} of (12.1) will simply be denoted by X^x.

Remark 12.2 It is possible to show the existence of a modification of (X_t^{s,x}), continuous in (s, t, x), still denoted by the same symbol. It is a delicate result to establish, see [21]. Similarly, (s, t, x) ↦ ∫_s^t a(u, X_u^{s,x}) dW_u admits a continuous modification.

12.2 Flow of a stochastic differential equation

This notion generalizes the flow property fulfilled in the case of ordinary
differential equations.

Lemma 12.3 Under the conditions of existence and uniqueness of Theorem 10.4, we have

X_t^{0,x} = X_t^{s, X_s^x}, t ≥ s ≥ 0, x ∈ R, P-a.s.

Proof. For every y ∈ R, t ≥ s ≥ 0,

X_t^{s,y} = y + ∫_s^t b(u, X_u^{s,y}) du + ∫_s^t a(u, X_u^{s,y}) dW_u a.s. (12.3)

Since left and right members are continuous in (s, t, y), we have (12.3) for every t ≥ s ≥ 0, y ∈ R, P-a.s.
In fact, two continuous processes (random fields) which are modifications of each other are indeed indistinguishable.
Finally, for y = X_s^x,

X_t^{s, X_s^x} = X_s^x + ∫_s^t b(u, X_u^{s, X_s^x}) du + ∫_s^t a(u, X_u^{s, X_s^x}) dW_u. (12.4)

Now X_t^x is also a solution of the previous equation because, if t ≥ s,

X_t^x = x + ∫_0^t b(u, X_u^x) du + ∫_0^t a(u, X_u^x) dW_u
= x + ∫_0^s b(u, X_u^x) du + ∫_0^s a(u, X_u^x) dW_u + ∫_s^t b(u, X_u^x) du + ∫_s^t a(u, X_u^x) dW_u
= X_s^x + ∫_s^t b(u, X_u^x) du + ∫_s^t a(u, X_u^x) dW_u.

We recall that equation (12.4), which is (12.2) with initial condition Z = X_s^x, has a unique solution. This implies the result. ✷

The Markov property consequently assumes the following form.

Theorem 12.4 Let (X_t)_{t≥0} be a solution of (12.1). Then it has the Markov property with respect to the filtration (F_t). More precisely, for every bounded Borel function f, we have

E(f(X_t) | F_s) = φ(X_s), (12.5)

where

φ(y) = E(f(X_t^{s,y})).

Remark 12.5 We often express the previous equality as

E(f(X_t) | F_s) = E(f(X_t^{s,y}))|_{y=X_s}.

Proof. The flow property says that

X_t^x = X_t^{s, X_s^x}. (12.6)

By Remark 12.1, there is Φ : R × C(R_+) → C([s, +∞[) such that

X^{s,y} = Φ(y, W_{s+u} − W_s; u ≥ 0).

Therefore, by (12.6), for fixed s, t,

X_t^x = Φ(X_s^x, W_{s+u} − W_s; u ≥ 0),

with X_s^x being F_s-measurable and (W_{s+u} − W_s) independent of F_s.
We apply the well-known freezing Theorem 2.13 for the conditional expectation. So the left-hand side of (12.5) equals

E{f(Φ(X_s^x, W_{s+u} − W_s; u ≥ 0)) | F_s}
= E{f(Φ(y, W_{s+u} − W_s; u ≥ 0))}|_{y=X_s^x}
= E(f(X_t^{s,y}))|_{y=X_s^x}. ✷

The previous result can be generalized to the case when Φ depends on the
whole path of the diffusion after time s.

Theorem 12.6 Let (X_t)_{t≥0} be a solution to (12.1) and r(s, x) a non-negative measurable function. If t > s we have

E(e^{−∫_s^t r(u, X_u) du} f(X_t) | F_s) = φ(X_s) P-a.s.

with

φ(y) = E(e^{−∫_s^t r(u, X_u^{s,y}) du} f(X_t^{s,y})).

We also express that equality in the form

E(e^{−∫_s^t r(u, X_u) du} f(X_t) | F_s) = E(e^{−∫_s^t r(u, X_u^{s,y}) du} f(X_t^{s,y}))|_{y=X_s}.

Remark 12.7 We can in fact establish a more general result than the one previously stated. Omitting technical details, we can affirm that, given a functional Φ of the whole path of X after s, we have

E(Φ(X_t^x; t ≥ s) | F_s) = E(Φ(X_t^{s,y}, t ≥ s))|_{y=X_s} P-a.s.

12.3 Infinitesimal generator of a diffusion

Suppose a, b : R → R Lipschitz.
We denote by (Xt )t≥0 a solution of

dXt = b(Xt )dt + a(Xt )dWt . (12.7)

Proposition 12.8 We denote by A the differential operator that to a function g of class C² associates

(Ag)(x) = (a²(x)/2) g''(x) + b(x) g'(x).

Then, if f ∈ C²(R) with bounded first order derivative, M_t = f(X_t) − ∫_0^t Af(X_s) ds is an (F_t)-martingale.

Proof. Exercise.

Remark 12.9 We denote by (X_t^x)_{t≥0} the solution of the SDE (12.7) such that X_0^x = x.
Let f ∈ C²(R) with bounded derivatives. Let K_f be a constant bounding the first and second derivatives and K such that |b(x)| + |a(x)| ≤ K(1 + |x|). Theorem 10.4 of existence and uniqueness for SDEs says that

E(sup_{s≤T} |Af(X_s^x)|) ≤ K_f' (1 + E(sup_{s≤T} |X_s^x|²)) < ∞,

where K_f' is another constant depending on K and K_f.
From Proposition 12.8, we deduce that

E(f(X_t^x)) = f(x) + E(∫_0^t Af(X_s^x) ds).

Consequently, we can apply the Lebesgue dominated convergence theorem since x ↦ Af(x) and s ↦ X_s^x are continuous functions. We deduce that

(d/dt) E(f(X_t^x))|_{t=0} = lim_{t→0} (1/t) E(∫_0^t Af(X_s^x) ds) = Af(x).

The differential operator A is called the infinitesimal generator of the diffusion.
Proposition 12.8 extends to the case where the coefficients of the SDE (12.7) depend on time.
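A Monte Carlo illustration of the formula (d/dt) E(f(X_t^x))|_{t=0} = Af(x) (a sketch assuming NumPy; the choice b(x) = −x, a ≡ 1, f(x) = x², for which Af(x) = 1 − 2x², is arbitrary):

import numpy as np

rng = np.random.default_rng(5)
x, t, n_mc, n_steps = 0.5, 0.01, 400_000, 20
h = t / n_steps
X = np.full(n_mc, x)
for _ in range(n_steps):
    X += -X * h + rng.normal(0.0, np.sqrt(h), n_mc)  # Euler step for the SDE
estimate = (np.mean(X ** 2) - x ** 2) / t            # (E f(X_t^x) - f(x)) / t
print(estimate, 1 - 2 * x ** 2)                      # close for small t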

Proposition 12.10 Let (t, x) ↦ u(t, x) be of class C^{1,2}(R_+ × R). We suppose moreover that the first order derivative with respect to x is bounded. Then the process

M_t = u(t, X_t) − ∫_0^t (∂u/∂t + A_s u)(s, X_s) ds

is a martingale, where A_s is the operator acting on the variable x defined by

(A_s f)(x) = (a²(s, x)/2) f''(x) + b(s, x) f'(x).

Proof. It is analogous to the one of Proposition 12.8; this time we use the time-dependent Itô formula, Proposition 9.20. ✷

In reality, we need a more general result.

Proposition 12.11 Under the assumptions of Proposition 12.10, if r : R_+ × R → R, (t, x) ↦ r(t, x), is bounded and continuous, the process

M_t = e^{−∫_0^t r(s, X_s) ds} u(t, X_t) − ∫_0^t e^{−∫_0^s r(v, X_v) dv} (∂u/∂t + A_s u − ru)(s, X_s) ds

is a martingale.

Proof. That proposition can be established by differentiating the product

e^{−∫_0^t r(s, X_s) ds} u(t, X_t) (12.8)

through the integration by parts formula, Proposition 9.20 iii); in this way, expression (12.8) becomes

u(0, X_0) − ∫_0^t u(s, X_s) e^{−∫_0^s r(v, X_v) dv} r(s, X_s) ds + ∫_0^t e^{−∫_0^s r(v, X_v) dv} du(s, X_s) + [·, ·],

where u(0, X_0) = u(0, x_0) and the covariation term [·, ·] vanishes, since the exponential factor is a bounded variation process. Subsequently, we apply Itô formula to the process u(t, X_t). ✷

The previous result generalizes to the case of vector valued diffusions. Consider the stochastic differential equation

dX_t^i = b^i(t, X_t) dt + Σ_{j=1}^p a_{ij}(t, X_t) dW_t^j, i = 1, . . . , m. (12.9)
We make the assumptions of existence and uniqueness of Section 10.2. For every t, we introduce the operator A_t which to a function f ∈ C²(R^m; R) associates the function

A_t f(x) = (1/2) Σ_{i,j=1}^m σ_{ij}(t, x) (∂²f/∂x_i ∂x_j)(x) + Σ_{j=1}^m b_j(t, x) (∂f/∂x_j)(x),

where (σ_{ij}(t, x)) is the matrix defined by

σ_{ij}(t, x) = Σ_{k=1}^p a_{ik}(t, x) a_{jk}(t, x).

With matrix notations, we will have Σ(t, x) = (σ_{ij}(t, x)) with Σ(t, x) = A(t, x) A⊤(t, x), where A⊤(t, x) is the transpose of the matrix A(t, x) = (a_{ij}(t, x)).

Proposition 12.12 Let (X_t) be a solution of system (12.9), u : R_+ × R^m → R, (t, x) ↦ u(t, x), of class C^{1,2}. We suppose moreover that the first order derivatives with respect to x are bounded.
Let r : R_+ × R^m → R, (t, x) ↦ r(t, x), be a bounded continuous function. Then the process

M_t = e^{−∫_0^t r(v, X_v) dv} u(t, X_t) − ∫_0^t e^{−∫_0^s r(v, X_v) dv} (∂u/∂t + A_s u − ru)(s, X_s) ds

is a martingale.

Proof. It follows by applying integration by parts and the multi-dimensional Itô formula. ✷

Remark 12.13 The differential operator ∂/∂t + A_t is called the Dynkin operator of the diffusion.

Example 12.14 We consider the p-dimensional Brownian motion issued from x, obtained in (12.9) by setting p = m, b_i = 0, a_{ij} = δ_{ij}. In this case, the generator is A_t ≡ (1/2)∆.

12.4 Parabolic partial differential equations and expectation


evaluations

We consider a diffusion (X_t)_{t≥0} with values in R^m, a non-negative Borel function f : R^m → R, and a real continuous and bounded function r(s, x). One of the classical problems of the theory of financial mathematics consists in evaluating

V_s = E(e^{−∫_s^T r(t, X_t) dt} f(X_T) | F_s).

As we have seen at the beginning of this chapter for the case m = 1, we have equally

V_s = F(s, X_s),

where

F(s, x) = E(e^{−∫_s^T r(t, X_t^{s,x}) dt} f(X_T^{s,x})), (12.10)

X^{s,x} being the unique solution to system (12.9) issued from x at time s.
We can extend this result and show that, if r is a function depending only on x (time-homogeneity), we have

E{e^{−∫_s^{s+t} r(X_u^{s,y}) du} f(X_{s+t}^{s,y})} = E{e^{−∫_0^t r(X_u^{0,y}) du} f(X_t^{0,y})}.

In this case the equality in Theorem 12.6 can be rewritten as follows:

E{e^{−∫_s^t r(X_u) du} f(X_t) | F_s} = E{e^{−∫_0^{t−s} r(X_u^{0,y}) du} f(X_{t−s}^{0,y})}|_{y=X_s}.

The next result relates F, defined in (12.10), to the solution of a parabolic partial differential equation.

Theorem 12.15 (Feynman-Kac) Let u : R_+ × R^m → R, (s, x) ↦ u(s, x), be of class C^{1,2}(R_+ × R^m).
We suppose moreover that the first order derivatives with respect to x are bounded.
If

u(T, x) = f(x), ∀x ∈ R^m,
(∂u/∂s + A_s u − ru)(s, x) = 0, ∀(s, x) ∈ ]0, T[ × R^m,

then

u(s, x) = F(s, x) := E(e^{−∫_s^T r(t, X_t^{s,x}) dt} f(X_T^{s,x})).

Proof. We prove the equality u(s, x) = F(s, x) for s = 0.
By Proposition 12.12, we know that the process

M_t = e^{−∫_0^t r(v, X_v^{0,x}) dv} u(t, X_t^{0,x})

is a martingale. Writing E(M_0) = E(M_T) we obtain

u(0, x) = E(e^{−∫_0^T r(v, X_v^{0,x}) dv} u(T, X_T^{0,x})) = E(e^{−∫_0^T r(v, X_v^{0,x}) dv} f(X_T^{0,x})),

since u(T, x) = f(x). The proof is similar when s > 0. ✷

Remark 12.16 In order to calculate F(s, x) given in (12.10), the previous theorem suggests to find the solution of the following problem:

∂u/∂s + A_s u − ru = 0 in ]0, T] × R^m,
u(T, x) = f(x), ∀x ∈ R^m. (12.11)

Remark 12.17 When r = 0 and A_s is the Laplacian, the following property holds. If

∂u/∂s + (1/2)∆u = 0 in ]0, T] × R^m,
u(T, x) = f(x), ∀x ∈ R^m, (12.12)

we have

u(s, x) = E(f(x + W_{T−s})),

(W_s) = (W_s^1, . . . , W_s^p) being a classical Brownian motion (with p = m).
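The probabilistic representation of Remark 12.17 can be checked by Monte Carlo (a sketch assuming NumPy; the test function f(x) = cos x is an arbitrary choice, for which the closed form u(s, x) = cos(x) e^{−(T−s)/2} is available in dimension 1):

import numpy as np

rng = np.random.default_rng(6)
T, s, x, n = 1.0, 0.3, 0.8, 500_000
W = rng.normal(0.0, np.sqrt(T - s), n)     # samples of W_{T-s} in dimension 1
mc = np.mean(np.cos(x + W))                # E f(x + W_{T-s})
exact = np.cos(x) * np.exp(-(T - s) / 2)
print(mc, exact)   # they should agree up to Monte Carlo error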

Problem (12.11) consists in an equation of parabolic type with terminal condition (the function u(T, ·) is given).
(12.11) will be well-posed if we formulate a suitable functional framework; there we can provide existence (and uniqueness) theorems.
Moreover, we will be able to affirm that the solution of (12.11) equals F if we can prove that this solution is regular enough to apply Proposition 12.12. This kind of result can generally be obtained under an ellipticity condition for the operator A_t, which has the form

∃c > 0, ∀(t, x) ∈ [0, T] × R^m, ∀(ξ_1, . . . , ξ_m) ∈ R^m : Σ_{i,j} σ_{ij}(t, x) ξ_i ξ_j ≥ c Σ_{i=1}^m |ξ_i|², (12.13)

together with some regularity assumptions on b and a.

12.5 Partial differential equations with boundary conditions


and expectation calculation

In the sequel of this chapter, we will have a process (X_t)_{t≥0} which is the solution of a stochastic differential equation with coefficients a, b which do not depend on time. Let us also consider r : R → R bounded and continuous.
So we have

dX_t = a(X_t) dW_t + b(X_t) dt. (12.14)

We consider the differential operator

Ag(x) = (1/2) a²(x) g''(x) + b(x) g'(x). (12.15)

We denote Ãg(x) = Ag(x) − r(x)g(x). Equation (12.11) can be written as

∂u/∂t (t, x) + Ãu(t, x) = 0 in ]0, T] × R,
u(T, x) = f(x), ∀x ∈ R. (12.16)
If we state Problem (12.16) no longer on the whole real line, but on O = ]x_0, x_1[, we need to impose boundary conditions at x_0 and x_1.
We will be more particularly interested in the case when we impose zero boundary conditions (of Dirichlet type). Let f : O → R. We try in this case to solve

∂u/∂t (t, x) + Ãu(t, x) = 0 in ]0, T] × O,
u(t, x_0) = u(t, x_1) = 0, ∀t ≤ T, (12.17)
u(T, x) = f(x), ∀x ∈ O.

A regular solution of (12.17) can then be interpreted with the help of the diffusion equation (12.14). We denote by X^{s,x} the solution of (12.14) issued from x at time s.

Theorem 12.18 Let u : R_+ × Ō → R be a function in (t, x) of class C^{1,2}([0, T] × Ō).
We suppose moreover u to be a solution of (12.17).
Then

u(s, x) = E{1_{∀t∈[s,T], X_t^{s,x}∈O} e^{−∫_s^T r(X_t^{s,x}) dt} f(X_T^{s,x})}.

Proof. We check the result when s = 0, the proof being similar in the other cases.
We can extend u from [0, T] × Ō to [0, T] × R conserving the C^{1,2} character of u on [0, T] × R. We keep denoting by u such a prolongation. Through Proposition 12.11, we know that the process

M_t = e^{−∫_0^t r(X_v^{0,x}) dv} u(t, X_t^{0,x}) − ∫_0^t exp(−∫_0^v r(X_α^{0,x}) dα) (∂u/∂t + Au − ru)(v, X_v^{0,x}) dv

is a martingale. Moreover, we set

T^x = inf{t ≥ 0 | X_t^{0,x} ∉ O} if {t ≥ 0 | X_t^{0,x} ∉ O} ≠ ∅, and T^x = ∞ otherwise.

T^x is a stopping time because T^x = T_{x_0}^x ∧ T_{x_1}^x, where

T_ℓ^x = inf{t ≥ 0 : X_t^{0,x} = ℓ}

and the T_ℓ^x are stopping times, see Proposition 4.11.
We set τ^x = T^x ∧ T; it is a bounded stopping time.
Applying the Doob stopping time Theorem 4.9 between 0 and τ^x, we obtain E(M_0) = E(M_{τ^x}); taking into account that, if t ∈ ]0, τ^x[,

(∂u/∂t + Au − ru)(t, X_t^{0,x}) = 0,

we have

u(0, x) = E{e^{−∫_0^{τ^x} r(X_v^{0,x}) dv} u(τ^x, X_{τ^x}^{0,x})}
= E{1_{∀t∈[0,T], X_t^{0,x}∈O} e^{−∫_0^T r(X_v^{0,x}) dv} u(T, X_T^{0,x})}
+ E{1_{∃t∈[0,T], X_t^{0,x}∉O} e^{−∫_0^{τ^x} r(X_v^{0,x}) dv} u(τ^x, X_{τ^x}^{0,x})}.

Now, on the event {∃t ∈ [0, T], X_t^{0,x} ∉ O}, we have τ^x = T^x ≤ T and X_{τ^x}^{0,x} ∈ {x_0, x_1}, so that u(τ^x, X_{τ^x}^{0,x}) = 0 by the boundary condition. On the other hand f(x) = u(T, x), so that

u(0, x) = E{1_{∀t∈[0,T], X_t^{0,x}∈O} e^{−∫_0^T r(X_v^{0,x}) dv} f(X_T^{0,x})}.

This proves the result for s = 0. ✷
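A Monte Carlo sketch of Theorem 12.18 (assuming NumPy; the choices a ≡ 1, b ≡ 0, r ≡ 0, O = ]−1, 1[ and f(x) = cos(πx/2), which vanishes on ∂O, are illustrative): paths leaving O before T contribute zero. Note that checking the exit only on the time grid slightly underestimates the exit probability.

import numpy as np

rng = np.random.default_rng(7)
x, T, n_mc, n_steps = 0.2, 0.5, 200_000, 500
h = T / n_steps
X = np.full(n_mc, x)
alive = np.ones(n_mc, dtype=bool)           # paths which stayed inside O so far
for _ in range(n_steps):
    X += rng.normal(0.0, np.sqrt(h), n_mc)  # Brownian increments (a = 1, b = 0)
    alive &= (X > -1.0) & (X < 1.0)         # kill paths having left O
u0 = np.mean(np.where(alive, np.cos(np.pi * X / 2), 0.0))
print("u(0, 0.2) ≈", u0)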

13 Stochastic differential equations with non-Lipschitz


coefficients

Let a : [0, T] × R^m → R^{m×p}, b : [0, T] × R^m → R^m be Borel functions.

13.1 Generalities

The study of stochastic differential equations is much richer than the study of ordinary differential equations, which are well-posed essentially when the coefficients are Lipschitz, with very few exceptions.
We consider first the following well-known problem with m = p = 1:

ẋ(t) = √(x(t)), x(0) = 0.

This problem is not well-posed since a family of solutions is provided by x ≡ 0 and

x(t) = (t − C)²/4 for t ≥ C, x(t) = 0 for t ∈ [0, C],

where C is an arbitrary non-negative constant. It is possible to show that the previous system perturbed by a white noise (even with small intensity) provides uniqueness, as we will discuss later. Some authors have examined the asymptotic behavior of the unique solution when one switches off the noise. Among the first authors who examined this phenomenon when the noise goes to zero are those of [2].
We come back now to the stochastic case.
Let (Ω, F, P ) be a probability space, a usual filtration (Ft )t≥0 , an (Ft )t≥0 -
classical p-dimensional Brownian motion W = (Wt1 , . . . , Wtp )t≥0 . Let T > 0.

Generally speaking, we will be interested in an SDE of Itô type as follows:

dXt = a(t, Xt )dWt + b(t, Xt )dt (E(a, b)). (13.1)

Definition 13.1 A stochastic process (X_t)_{t∈[0,T]} is said to be a solution of (13.1) if the following holds.

i) (X_t) is (F_t)_{t∈[0,T]}-adapted and measurable.

ii) ∫_0^T (|b(t, X_t)| + |a|²(t, X_t)) dt < ∞ a.s.

iii) X_t = X_0 + ∫_0^t a(s, X_s) dW_s + ∫_0^t b(s, X_s) ds, ∀t ∈ [0, T], P-a.s.,

where the stochastic integral above is defined componentwise as in (10.7). In an obvious way, we can define the notion of solution (X_t)_{t≥0} on the whole real line.

Remark 13.2 Item iii) means that the left and right-hand sides are indistinguishable. If X is a continuous process then the equality holds if, for any t, X_t = X_0 + ∫_0^t a(s, X_s) dW_s + ∫_0^t b(s, X_s) ds a.s.

Definition 13.3 (Strong existence). If for any probability space (Ω, F, P), any classical (F_t)_{t≥0}-Brownian motion (W_t)_{t≥0} with respect to a filtration (F_t)_{t≥0} fulfilling the usual conditions, and any F_0-measurable square integrable random variable γ_0, there is a process (X_t)_{t≥0} solution to E(a, b) with X_0 = γ_0 a.s., we will say that equation E(a, b) admits strong existence.

Definition 13.4 (Pathwise uniqueness). We will say that equation E(a, b)


admits pathwise uniqueness if the following property is fulfilled.
Let (Ω, F, P ) be a probability space, a usual filtration (Ft )t≥0 , an (Ft )t≥0 -
classical Brownian motion (Wt )t≥0 . Let γ0 be a (square integrable) F0 -r.v.
If two processes X, X̃ are two solutions of E(a,b), such that X0 = X̃0 = γ0
a.s. then X and X̃ are indistinguishable.

Definition 13.5 (Existence in law or Weak existence). Let ν be a
probability law (on Rm ). We will say that E(a, b; ν) admits weak existence if
there is a probability space (Ω, F, P ), a usual filtration (Ft )t≥0 , an (Ft )t≥0 -
p-dimensional classical Brownian motion (Wt )t≥0 and a process (Xt )t≥0 so-
lution of E(a, b) where ν is the law of X0 .
We say that E(a, b) admits weak existence if E(a, b; ν) admits weak existence
for every ν.

Definition 13.6 (Uniqueness in law). Let ν be a probability law (on R^m). We say that E(a, b; ν) has a unique solution in law if the following holds. Let (Ω, F, P) (resp. (Ω̃, F̃, P̃)) be a probability space; consider a usual filtration (F_t)_{t≥0} (resp. (F̃_t)_{t≥0}), an (F_t)_{t≥0}-classical Brownian motion (W_t)_{t≥0} (resp. an (F̃_t)_{t≥0}-classical Brownian motion (W̃_t)_{t≥0}), and a process (X_t)_{t≥0} (resp. (X̃_t)_{t≥0}) solution of E(a, b), such that both the law of X_0 and that of X̃_0 are identical to ν. Then X and X̃ have the same law as r.v.s with values in E = C(R_+) (or C([0, T])).
We say that E(a, b) has a unique solution in law if E(a, b; ν) has a unique solution in law for every ν. Sometimes it is reasonable to suppose that ν is a probability measure having a second moment.

Proposition 13.7 (Yamada-Watanabe). The following two properties hold.

i) Pathwise uniqueness implies uniqueness in law.

ii) Weak existence and pathwise uniqueness imply strong existence.

A version can be stated for E(a, b; ν) where ν is a fixed probability law.

Proof. See [17], Proposition 5.3.20 and the corollary of that result.

In Section 10 we considered in some detail the case where a and b are Lipschitz.

Definition 13.8 A function γ : [0, T] × R^m → R^ℓ (resp. γ : R_+ × R^m → R^ℓ) is said to be locally Lipschitz (with respect to x uniformly with respect to t) if, for every K > 0 (resp. for every T > 0, K > 0), γ|_{[0,T]×[−K,K]^m} is Lipschitz (with respect to x uniformly with respect to t).

We come back to notations introduced at the beginning of the chapter.

Remark 13.9 i) If a and b are Lipschitz with linear growth, E(a,b) admits
a unique solution, of course (Ft )t≥0 - adapted. This means that E(a,b)
admits strong existence and pathwise uniqueness. This result can be
deduced from Theorem 10.4.

ii) If a and b are only locally Lipschitz, it is possible to show existence until
some suitable (explosion) stopping time. The local Lipschitz property
guarantees however pathwise uniqueness.

iii) Condition i) on a and b may be replaced by the following condition:


a, b locally Lipschitz with linear growth.

A simple but very efficient inequality in the theory of ordinary differential equations, also useful in stochastic differential equations, is Gronwall's lemma (inequality), whose first version was established by Thomas Hakon Gronwall in 1919.

Lemma 13.10 (Gronwall). Let g : [0, T] → R be an integrable Borel function such that there exist constants a, b ≥ 0 with

g(t) ≤ a + b ∫_0^t g(s) ds, 0 ≤ t ≤ T. (13.2)

Then g(t) ≤ a e^{bt}, t ∈ [0, T].

Proof.
Rt
We set G(t) = a + b 0 g(s)ds. Then by assumption g(t) ≤ G(t).

(i) If g is continuous then G is differentiable and

(e−bt G(t))′ = −be−bt G(t) + e−bt G′ (t) = −be−bt G(t) + be−bt g(t) ≤ 0.
(13.3)

So t 7→ e−bt G(t) is non-increasing and

e−bt g(t) ≤ e−bt G(t) ≤ G(0) = a,

which proves the result in this case.

(ii) Suppose now that g is only Lebesgue integrable. Then G is continuous


and verifies by assumption
Z t Z t
G(t) = a + b g(s)ds ≤ a + b G(s)ds.
0 0

But now the continuous function G fulfills (13.2) so we can apply the
first part of the proof. Finally

g(t) ≤ G(t) ≤ aebt , t ∈ [0, T ].

13.2 Existence and uniqueness in law: an example

We start with an example where E(a, b) does not admit pathwise uniqueness,
even though it admits uniqueness in law.
Before this, we need to state an important result due to Paul Lévy, called
Lévy characterization of Brownian motion.

Theorem 13.11 Let (Mt )t≥0 be an (Ft )t≥0 - continuous local martingale
such that M0 = 0. Then (Mt )t≥0 is an (Ft )t≥0 - classical Brownian motion if
and only if [M, M ]t ≡ t.

Proof. See [30] Theorem 3.6.

We fix m = p = 1. We set b(x) = 0 and

a(x) = sign^+(x) = 1 if x ≥ 0, −1 if x < 0.

Equation E(a, b; δ_0) admits uniqueness in law. In fact, every solution X is a local martingale, vanishing at zero, with quadratic variation [X]_t = t. So by the Lévy characterization theorem X is a classical Brownian motion.

Remark 13.12 A solution X of equation E(a, b; δ_0) is also a solution to

X_t = ∫_0^t sign^−(X_s) dW_s and X_t = ∫_0^t sign(X_s) dW_s,

where

sign^−(x) = 1 if x > 0, −1 if x ≤ 0,

and

sign(x) = 1 if x > 0, −1 if x < 0, 0 if x = 0.

In fact, since X is a Brownian motion, ∫_0^t 1_{{X_s=0}} dW_s = 0 for any t ≥ 0. Indeed, using Fubini's theorem and the isometry of the stochastic integral,

E(∫_0^t 1_{{X_s=0}} ds) = ∫_0^t P{X_s = 0} ds = 0,

since the law of X_s admits a density and therefore is non-atomic.

Later we will show that E(a, 0; δ_0) admits weak existence. Let now (Ω, F, P) be a probability space, (W_t) an (F_t)_{t≥0}-classical Brownian motion with respect to a usual filtration, and (X_t)_{t≥0} a solution of E(a, b; δ_0). Since X is a solution to X_t = ∫_0^t sign^+(X_s) dW_s, t ≥ 0, then X̃_t = −X_t verifies X̃_t = ∫_0^t sign^−(X̃_s) dW_s, t ≥ 0; since X̃ is again a Brownian motion, the argument of Remark 13.12 shows that X̃ also solves E(a, b; δ_0). Since X and X̃ are both solutions of E(a, b; δ_0), E(a, b; δ_0) does not admit pathwise uniqueness.
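The weak solution of the previous equation can be exhibited concretely (a numerical sketch assuming NumPy): starting from a Brownian motion X, one defines W_t = ∫_0^t sign^+(X_s) dX_s; since (sign^+)² = 1, one recovers X_t = ∫_0^t sign^+(X_s) dW_s, so that (X, W) is a weak solution.

import numpy as np

rng = np.random.default_rng(8)
T, n = 1.0, 100_000
dX = rng.normal(0.0, np.sqrt(T / n), n)   # increments of a Brownian motion X
X = np.concatenate(([0.0], np.cumsum(dX)))
sgn = np.where(X[:-1] >= 0, 1.0, -1.0)    # sign^+ evaluated at left endpoints
W = np.concatenate(([0.0], np.cumsum(sgn * dX)))  # the constructed driving BM
# discrete check: ∫_0^T sign^+(X) dW reproduces X_T
print(X[-1], np.sum(sgn * np.diff(W)))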

13.3 Existence and uniqueness in law in the multi-dimensional


case.

We recall here the known results about existence and uniqueness in law in the
multi-dimensional case. Let a : [0, T ] × Rm → Rm×p , b : [0, T ] × Rm → Rm
be two Borel functions.
We start by introducing the so-called non-degeneracy condition for a given matrix-valued function σ.

Assumption 13.13 A function σ : [0, T] × R^m → R^{m×m} is said to fulfill the non-degeneracy condition if

∃C > 0, ∀(t, x) ∈ [0, T] × R^m, ∀(ξ_1, . . . , ξ_m) ∈ R^m : Σ_{i,j=1}^m σ_{ij}(t, x) ξ_i ξ_j ≥ C Σ_{i=1}^m |ξ_i|². (13.4)

The first result of existence in law that we state is taken from Theorem 1 in Section 2.6 of [19].

Theorem 13.14 We suppose m = p, a, b bounded and Borel measurable


with respect to (t, x) such that a fulfills Assumption 13.13. Then E(a, b)
admits existence in law.

In Exercise 7.3.3 of [35], one can find a theorem on existence and uniqueness in law, only for m = p = 1.

Theorem 13.15 We suppose m = p = 1 and a, b bounded and Borel mea-


surable with respect to (t, x) such that for each compact K ⊂ R, there is a
constant C = C(K) such that a2 (t, x) ≥ C, ∀(t, x) ∈ [0, T ] × K. Then E(a, b)
admits existence and uniqueness in law.

Under the non-degeneracy Assumption 13.13 on aa⊤ and minimal regularity conditions we have the following well-posedness result.

Theorem 13.16 We suppose a, b bounded and Borel measurable such that σ = aa⊤ fulfills Assumption 13.13 and

lim_{y→x} sup_{0≤s≤T} |σ(s, y) − σ(s, x)| = 0, x ∈ R^m. (13.5)

Then the problem E(a, b) admits weak existence and uniqueness in law.

Remark 13.17 If aa⊤ is continuous, (13.5) is verified.

A last interesting result is Theorem 12.2.3 in [35] about existence in law.

Theorem 13.18 We suppose a, b continuous bounded. Then E(a, b) admits


existence in law.

We remark that in Theorem 13.18, a could also be degenerate. A consequence of this theorem is the following.

Theorem 13.19 We suppose a, b continuous with linear growth. Let ν be a Borel probability measure on R^m such that ∫_{R^m} |x|^γ ν(dx) < ∞ for some γ > 2m. Then E(a, b; ν) admits existence in law.

Proof (Sketch). We proceed according to the following steps.

(i) We truncate the coefficients a and b defining, for each N > 0, a^N(t, x) = (a^N_{ij}(t, x)) where a^N_{ij}(t, x) = (a_{ij}(t, x) ∧ N) ∨ (−N), and b^N = (b^N_1, . . . , b^N_m) with b^N_i(t, x) = (b_i(t, x) ∧ N) ∨ (−N), 1 ≤ i ≤ m, 1 ≤ j ≤ p.

(ii) According to Theorem 13.18 we are allowed to consider (weak) solutions X^N of E(a^N, b^N; ν). A priori, by the BDG and Jensen inequalities, the moment E(sup_{t≤T} |X_t^N|^γ) is finite, but it depends on N.

(iii) On the other hand, by the previous item, the same BDG and Jensen inequalities, together with Gronwall's lemma, allow one to show the existence of a constant C(γ, T, c_a, c_b), where c_a, c_b are the linear growth constants related to a and b, such that

sup_{t≤T} E(|X_t^N|^γ) ≤ C(γ, T, c_a, c_b). (13.6)

(iv) At this point we can show that the laws of X^N, as r.v.s with values in C([0, T]), are tight. For this we make use of the Prohorov and Kolmogorov theorems, whose consequences for us are summarized in Problem 4.11 in Section 2.4 of [17]. It is enough to show

E(|X_t^N − X_s^N|^α) ≤ const |t − s|^{m+β}, s, t ∈ [0, T], |t − s| ≤ 1, (13.7)

for some α, β > 0. By the BDG and Jensen inequalities and (13.6),

E(|X_t^N − X_s^N|^γ) ≤ const |t − s|^{γ/2}, s, t ∈ [0, T], |t − s| ≤ 1, (13.8)

where the constant const does not depend on N. So (13.7) holds with α = γ, β = γ/2 − m.

(v) Since the laws of X^N are tight, there exists a subsequence (still denoted by X^N) which converges in law. By the Skorohod theorem there is a sequence X̃^N of copies with the same distributions as X^N such that X̃^N converges a.s., in particular ucp, to some process X. Without restriction of generality we still write X^N = X̃^N.

(vi) We set σ = aa⊤. Let f ∈ C²(R^m). By Itô formula,

f(X_t^N) − f(X_0^N) − ∫_0^t ∇f(X_s^N) · b(s, X_s^N) ds − (1/2) Σ_{i,j} ∫_0^t ∂²_{ij} f(X_s^N) σ_{ij}(s, X_s^N) ds

is a martingale. By the previous point, letting N → ∞, the previous processes converge ucp to

f(X_t) − f(X_0) − ∫_0^t ∇f(X_s) · b(s, X_s) ds − (1/2) Σ_{i,j=1}^m ∫_0^t ∂²_{ij} f(X_s) σ_{ij}(s, X_s) ds.

This can be shown to be a local martingale, using the definition of martingale. In other words, (X, P) solves the martingale problem related to σ, b.

(vii) Since the concept of solution in law is equivalent to that of the martingale problem, see Proposition 4.16 of [17], it is possible to show that X solves E(a, b; ν). See also Exercise 13.20 below.

Exercise 13.20 Let (Ω, F, P) be a probability space and (F_t) a filtration fulfilling the usual conditions. Let a : [0, T] × R^m → R^{m×p}, b : [0, T] × R^m → R^m be bounded Borel functions.

(i) Let X be a solution of E(a, b). Show that for every f ∈ C²(R^m),

f(X_t) − f(X_0) − ∫_0^t ∇f(X_s) · b(s, X_s) ds − (1/2) Σ_{i,j} ∫_0^t ∂²_{x_i,x_j} f(X_s) (aa⊤)_{ij}(s, X_s) ds (13.9)

is a local martingale, where a⊤ is the transpose of the matrix a.
If (13.9) is verified for every f ∈ C²(R^m), we say that (X, P) solves the martingale problem related to b and aa⊤.
(ii) Suppose now that (X, Q) solves the previous martingale problem, where X is a continuous process and Q is a probability on (Ω, F). Q is the reference probability for the items below.

(a) Show that the processes

M_t^i := X_t^i − ∫_0^t b_i(s, X_s) ds, t ≥ 0, 1 ≤ i ≤ m,

are (F_t^X)-local martingales under Q, where (F_t^X) is the canonical filtration associated with X.
(b) Show that for every 1 ≤ i, j ≤ m, [M i , M j ] = [X i , X j ].
(c) Let 1 ≤ i, j ≤ m. Express (13.9) for f(x) = x_i x_j to show that

[M^i, M^j]_t = ∫_0^t (aa⊤)_{ij}(s, X_s) ds.

Consider now for simplicity the case m = p = 1 and a > 0. Let M = M^1. Show that the process

W_t := ∫_0^t (1/a(s, X_s)) dM_s, t ≥ 0,

is an (F_t^X)-Brownian motion.


(d) Deduce finally that X is a solution of E(a, b) and E(a, b) admits
weak existence.

Exercise 13.21 Show (13.6) and (13.8) in the case m = p = 1 and T = 1


(without restriction of generality).

13.4 The Engelbert-Schmidt criteria

We proceed with results which are more specifically one-dimensional, stating some results of H.J. Engelbert and W. Schmidt, see [7]. These two authors furnished necessary and sufficient conditions for weak existence and uniqueness in law of SDEs with zero drift.
For a Borel function γ : R → R, we first define

Z(γ) = {x ∈ R | γ(x) = 0};
then we define the set I(γ) as the set of real numbers x such that

∫_{x−ε}^{x+ε} dy/γ²(y) = ∞, ∀ε > 0.

For a clear exposition of the next result, see Theorems 5.4 and 5.5 of [17].

Proposition 13.22 (Engelbert-Schmidt). Suppose that a : R → R, i.e. a does not depend on time, and consider equation E(a, 0; ν) for some law ν on R.

i) E(a, 0; ν) admits weak existence (without explosion) if and only if

I(a) ⊂ Z(a). (13.10)

ii) E(a, 0; ν) admits weak existence and uniqueness in law if and only if

I(a) = Z(a). (13.11)

Remark 13.23 i) If a is continuous then (13.10) is always verified. In fact, if a(x) ≠ 0, there is ε > 0 such that

|a(y)| > 0, ∀y ∈ [x − ε, x + ε].

Therefore x cannot belong to I(a).

ii) (13.10) is verified also for some discontinuous functions, for instance sign^+. This confirms what was affirmed previously, i.e. the weak existence for E(a, 0), see Theorem 13.15.

iii) If a(x) = 1_{{0}}(x), (13.10) is not verified.

iv) If a(x) = |x|^α, α ≥ 1/2, then

Z(a) = I(a) = {0}.

So E(a, 0; ν) admits weak existence and uniqueness in law, independently of the initial condition.

v) The proof is technical and makes use of the Lévy characterization Theorem 13.11 of Brownian motion.

13.5 Feller test for explosion

Let us consider Borel functions a, b : R → R (not depending on time). Suppose moreover a > 0 and that a, b and b/a² are locally integrable. We consider the following infinitesimal generator:

Ag = (a²/2) g'' + b g'.

We denote

Σ(x) := 2 ∫_0^x (b/a²)(y) dy. (13.12)
Let ℓ ∈ C⁰(R) and u_0, u_1 ∈ R. Then there is a unique solution to

Au = ℓ, u(0) = u_0, u'(0) = u_1, (13.13)

which is given by

u(0) = u_0, u'(x) = e^{−Σ(x)} (∫_0^x (2ℓ/a²) e^{Σ}(y) dy + u_1).

Remark 13.24 This procedure can be extended to the case when b is the
derivative of a continuous function β, therefore a distribution, see [10].

Proposition 13.25 (Feller test for explosion). Suppose a > 0 such that a, b and b/a² are locally integrable. Let v be the unique solution to

Av = 1, v(0) = 0, v'(0) = 0.

Then weak existence and uniqueness in law for E(a, b) take place (without explosion, for every t ≥ 0) if and only if

v(−∞) = v(+∞) = +∞. (13.14)

Proof. Theorem 5.5.29 in [17] and the remark below, which follows from (13.13) and (13.14), give the result.

Remark 13.26 The function v above is explicitly given by

v(0) = 0, v'(x) = e^{−Σ(x)} ∫_0^x (2e^{Σ}/a²)(y) dy.

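The quantities of the Feller test can be evaluated numerically (a sketch assuming NumPy and SciPy; the Ornstein-Uhlenbeck-type choice a ≡ 1, b(x) = −x, for which Σ(x) = −x², is arbitrary): v(L) grows rapidly with L, consistently with non-explosion.

import numpy as np
from scipy.integrate import quad

Sigma = lambda x: -x ** 2   # Σ(x) = 2 ∫_0^x b/a² dy for b(y) = -y, a = 1
vprime = lambda x: np.exp(-Sigma(x)) * quad(lambda y: 2 * np.exp(Sigma(y)), 0, x)[0]
for L in (1.0, 2.0, 3.0):
    print(f"v({L}) ≈ {quad(vprime, 0, L)[0]:.3f}")   # increases without bound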
Exercise 13.27 i) Verify that whenever

∫_0^∞ e^{−Σ(y)} dy = ∫_{−∞}^0 e^{−Σ(y)} dy = ∞, (13.15)

then (13.14) is verified.

ii) Deduce in particular that a unique (non-exploding) solution exists in the three following cases:

a) b = 0;
b) Σ is bounded from above;
c) b/a² is integrable.

iii) Suppose a of class C¹. Find necessary and sufficient conditions so that the Stratonovich type equation

X_t = X_0 + ∫_0^t a(X_s) ◦ dW_s

has a (unique Itô process) non-exploding solution.

Solutions

i) It will be enough to check that v(+∞) = +∞, since the −∞ case can be justified similarly. For z ≥ 1 we have

v(z) = 2 ∫_0^z dx e^{−Σ(x)} ∫_0^x (e^{Σ}/a²)(y) dy
= 2 ∫_0^z dy (e^{Σ}/a²)(y) ∫_y^z e^{−Σ(x)} dx
≥ 2 ∫_0^1 dy (e^{Σ}/a²)(y) ∫_1^z e^{−Σ(x)} dx.

Therefore

liminf_{z→+∞} v(z) ≥ 2 ∫_0^1 dy (e^{Σ}/a²)(y) ∫_1^∞ e^{−Σ(x)} dx.

So, if condition (13.15) is verified then v(+∞) = +∞.

ii) This includes the following cases:

a) Σ = 0, since b = 0;

b) Σ is bounded from above, since

e^{−Σ} ≥ e^{−max Σ};

c) b/a² ∈ L¹(R), because in that case Σ is bounded.

iii) We have

∫_0^t a(X_s) ◦ dW_s = ∫_0^t a(X_s) dW_s + (1/2)[a(X), W]_t
= ∫_0^t a(X_s) dW_s + ∫_0^t (a²/4)'(X_s) ds.

So X can be considered as a solution to E(a, b) with b = (a²/4)'. We have

Σ(x) = 2 ∫_0^x ((a²/4)'/a²)(y) dy = (1/2) ∫_0^x ((a²)'/a²)(y) dy = (1/2)(log a²)(y)|_0^x = log(a(x)/a(0)),

so that

e^{Σ(x)} = a(x)/a(0)

and we have

v'(x) = (1/a(x)) ∫_0^x (1/a(y)) dy.

(13.14) becomes

∫_0^∞ dx (1/a(x)) ∫_0^x (1/a(y)) dy = (1/2)(∫_0^∞ dx/a(x))² = +∞;
∫_{−∞}^0 dx (1/a(x)) ∫_x^0 (1/a(y)) dy = (1/2)(∫_{−∞}^0 dx/a(x))² = +∞.

Conditions (13.14) become

∫_0^∞ dx/a(x) = ∫_{−∞}^0 dx/a(x) = +∞.

Proof (of Proposition 13.25).

(Sketch). Without restriction of generality we suppose a to be continuous. We only discuss uniqueness in the particular case when (13.15) is verified, i.e.

∫_0^∞ e^{−Σ(y)} dy = ∫_{−∞}^0 e^{−Σ(y)} dy = ∞.

Let (X_t)_{t≥0} be a solution to E(a, b) related to some filtered probability space and an (F_t)_{t≥0}-classical Brownian motion (W_t)_{t≥0}.
We denote by h : R → R the function such that h' = e^{−Σ}, h(0) = 0; we remark that h ∈ C²(R) and Ah = 0. We set Y_t = h(X_t); using Itô formula we obtain

Y_t = h(X_0) + ∫_0^t h'(X_s) dX_s + (1/2) ∫_0^t h''(X_s) d[X]_s
= h(X_0) + ∫_0^t (ah')(X_s) dW_s + ∫_0^t Ah(X_s) ds
= h(X_0) + ∫_0^t a_0(Y_s) dW_s,

where a_0 = (ah') ∘ h^{−1} and we used Ah = 0. Since Assumption (13.15) is verified, h is surjective, hence bijective. So we have

Y_t = Y_0 + ∫_0^t a_0(Y_s) dW_s. (13.16)

Uniqueness is verified: indeed a_0 is continuous and strictly positive, and we can trivially apply the Engelbert-Schmidt criterion. ✷

Remark 13.28 If (13.15) is not verified and h(−∞) = a, h(+∞) = b, the


solution X explodes if and only if Y attains a or b with a positive probability.

13.6 Results on pathwise uniqueness

We remain in the one-dimensional case.

Proposition 13.29 (Yamada-Watanabe, [37]).
Let a, b : R_+ × R → R and consider again E(a, b). Suppose b globally Lipschitz and h : R_+ → R_+ strictly increasing and continuous such that

• h(0) = 0;

• ∫_0^ε (1/h²)(y) dy = ∞, ∀ε > 0; (13.17)

• |a(t, x) − a(t, y)| ≤ h(|x − y|).

Then E(a, b) admits pathwise uniqueness.

Proposition 13.30 Under the assumptions of Proposition 13.29, we suppose moreover that a, b are continuous and a has linear growth. Then E(a, b) admits strong existence.

Proof. It follows from Theorem 13.19, Proposition 13.29 and Proposition 13.7 ii).

Remark 13.31 In Proposition 13.29, one typical choice is

h(u) = u^α, α ≥ 1/2.

Proof (of Proposition 13.29). Taking into account the conditions imposed on h, there is a strictly decreasing sequence (a_n) in [0, 1], with

a_0 = 1, lim_{n→∞} a_n = 0, ∫_{a_n}^{a_{n−1}} (1/h²)(x) dx = n,

for every n ≥ 1; in fact, for fixed a_{n−1}, we have

lim_{t→0+} ∫_t^{a_{n−1}} (1/h²)(x) dx = ∞

and the function t ↦ ∫_t^{a_{n−1}} (1/h²)(x) dx is continuous. For every n ≥ 1, we define a function ρ_n : R_+ → R_+ such that

• its support is in [a_n, a_{n−1}];

• 0 ≤ ρ_n(x) ≤ 2/(n h²(x)), ∀x > 0;

• ∫_{a_n}^{a_{n−1}} ρ_n(x) dx = 1.

We can show how such a construction is possible. We fix n and, for u ∈ [0, 1], we set

φ(u) = ((1 + u)/2) a_n + ((1 − u)/2) a_{n−1};
ψ(u) = ((1 − u)/2) a_n + ((1 + u)/2) a_{n−1}.

We define the family of functions ρ_n(·, u) such that

ρ_n(x, u) = 0 if x ∉ [a_n, a_{n−1}];
ρ_n(x, u) = 2u/(n h²(x)) if x ∈ [φ(u), ψ(u)];
ρ_n(·, u) linear affine on [a_n, φ(u)[ ∪ ]ψ(u), a_{n−1}].

In fact we have

ρ_n(x, u) ≡ 0 if u = 0; ρ_n(x, u) = 2/(n h²(x)), x ∈ [a_n, a_{n−1}], if u = 1.

We set

Φ(u) = ∫_{a_n}^{a_{n−1}} ρ_n(x, u) dx.

Now Φ : [0, 1] → R_+ is continuous and Φ(0) = 0, Φ(1) = 2.
By the intermediate value theorem, there is u_n ∈ ]0, 1[ such that Φ(u_n) = 1.
We choose ρ_n(x) = ρ_n(x, u_n). We define ψ_n : R → R by

ψ_n(x) = ∫_0^{|x|} dy ∫_0^y ρ_n(z) dz, x ∈ R; (13.18)

ψ_n is an even function with |ψ_n'(x)| ≤ 1. On the other hand, if y > 0, there is N(y) with

n > N(y) ⇒ ∫_0^y ρ_n(z) dz = ∫_{a_n}^{a_{n−1}} ρ_n(z) dz = 1.

Consequently we have

lim_{n→∞} ψ_n(x) = |x|, ∀x ∈ R, and ψ_n(x) ≤ |x|.

Suppose now that we have a probability space (Ω, F, P), an (F_t)-Brownian motion (W_t)_{t≥0} and two solutions X^{(1)} and X^{(2)} of E(a, b) with X_0^{(1)} = X_0^{(2)} a.s. It is enough to show that X^{(1)} and X^{(2)} are indistinguishable under the assumption

E(∫_0^t a²(s, X_s^{(i)}) ds) < ∞, E(∫_0^t |b|(s, X_s^{(i)}) ds) < ∞, (13.19)

for every 0 ≤ t < ∞, i = 1, 2. Indeed, using a localization argument, it is possible to reduce everything to (13.19); we explain the details in Remark 13.32.
We set ∆_t = X_t^{(2)} − X_t^{(1)}, so that we have

∆_t = ∫_0^t (a(s, X_s^{(2)}) − a(s, X_s^{(1)})) dW_s + ∫_0^t (b(s, X_s^{(2)}) − b(s, X_s^{(1)})) ds.

In particular, ∆_t is integrable. Using Itô formula for semimartingales and the fact that ψ_n is of class C²(R), we get

ψ_n(∆_t) = ∫_0^t ψ_n'(∆_s) [b(s, X_s^{(2)}) − b(s, X_s^{(1)})] ds
+ (1/2) ∫_0^t ψ_n''(∆_s) [a(s, X_s^{(2)}) − a(s, X_s^{(1)})]² ds (13.20)
+ ∫_0^t ψ_n'(∆_s) [a(s, X_s^{(2)}) − a(s, X_s^{(1)})] dW_s.

The expectation of the stochastic integral in (13.20) is zero because of (13.19). On the other hand, the expectation of the second integral (without the factor 1/2) is bounded by

E(∫_0^t ψ_n''(∆_s) h²(|∆_s|) ds) ≤ 2t/n,

since ψ_n''(∆_s) = ρ_n(|∆_s|) ≤ 2/(n h²(|∆_s|)).

We conclude that

E(ψ_n(∆_t)) ≤ E(∫_0^t ψ_n'(∆_s) [b(s, X_s^{(2)}) − b(s, X_s^{(1)})] ds) + t/n. (13.21)

If k is the Lipschitz constant of b we get

E(ψ_n(∆_t)) ≤ t/n + k ∫_0^t E(|∆_s|) ds.

We let n go to ∞, which by Lebesgue's dominated convergence theorem gives

E(|∆_t|) ≤ k ∫_0^t E(|∆_s|) ds.

By Gronwall's lemma, we easily obtain E(|∆_t|) ≡ 0 and finally ∆_t = 0 a.s. Since (∆_t) is continuous, it is indistinguishable from zero. ✷

Remark 13.32 We explain here how the proof of Proposition 13.29 can be reduced to the hypothesis (13.19).

(i) First we reduce to the case when the initial condition X_0 belongs to a compact interval [−M, M]. For this we replace the initial probability P with the conditional probability P^M := P(· | X_0 ∈ [−M, M]). Under P^M the Brownian motion remains a Brownian motion and the solutions of the SDE remain solutions of the SDE.

(ii) Suppose now that X_0 lives in a compact interval. For each positive integer N we define the stopping times

τ_i^N := inf{t ∈ [0, T] : |X_t^i| ≥ N}, i = 1, 2,
τ^N := τ_1^N ∧ τ_2^N,

with the convention that the infimum of the empty set is +∞. We remark that (up to a null set) Ω = ∪_N Ω_N, Ω_N := {τ^N > T}. Consequently, it is enough to show that X^1 ≡ X^2 on Ω_N. On Ω_N we have (for i = 1, 2)

X_t^i = X_0 + ∫_0^t a_N(s, X_s^i) dW_s + ∫_0^t b_N(s, X_s^i) ds,

where

a_N(t, x) = a(t, (x ∧ N) ∨ (−N)), b_N(t, x) = b(t, (x ∧ N) ∨ (−N)).

Now b_N is Lipschitz and a_N fulfills the assumption made on a. Consequently (13.19) holds, since a_N and b_N are bounded.

As a direct consequence of Propositions 13.29 and 13.30 we get the following.

Corollary 13.33 Suppose the assumptions of Proposition 13.29 verified, with a, b continuous with linear growth. Then E(a, b) admits strong existence and pathwise uniqueness.

Example 13.34

X_t = ∫_0^t |X_s|^α dW_s, t ≥ 0. (13.22)

We set a(x) = |x|^α, 0 < α < 1, b = 0.
According to the Engelbert-Schmidt notations, we have Z(a) = {0}. Moreover we have the following.

• If α ≥ 1/2 then I(a) = {0}.

• If α < 1/2 then I(a) = ∅.

On the other hand, if α ≥ 1/2,

||x|^α − |y|^α| ≤ h(|x − y|), (13.23)

where h(z) = z^α.
This follows from the fact that, if β ≥ 1, a, t ≥ 0,

(a + t)^β − a^β − t^β ≥ 0. (13.24)

In fact, for fixed a, differentiating in t we easily see that Φ(t) = (a + t)^β − a^β − t^β is increasing; since Φ(0) = 0, (13.24) follows. Consequently, setting β = 1/α, a = |x|^α, t = |y − x|^α, we get

(|y − x|^α + |x|^α)^{1/α} ≥ |x| + |y − x| ≥ |y|.

Therefore

|x|^α + |y − x|^α ≥ |y|^α.

Exchanging the roles of x and y, we also obtain

|y|^α + |y − x|^α ≥ |x|^α.

The two inequalities above imply (13.23).

By Corollary 13.33, (13.22) admits strong existence and pathwise uniqueness. The unique solution is X ≡ 0.
If α < 1/2, X ≡ 0 is still a solution, but it is not the only one; even uniqueness in law fails, by the Engelbert-Schmidt criterion.

Exercise 13.35 (Itô-Watanabe, 1978, [16]). We consider the equation

X_t = 3 ∫_0^t X_s^{1/3} ds + 3 ∫_0^t X_s^{2/3} dW_s. (13.25)

i) Verify whether the conditions of Proposition 13.29 are satisfied.

ii) Show that (13.25) admits an infinity of solutions of the type

X_t^{(θ)} = 0 for 0 ≤ t < β_θ, X_t^{(θ)} = W_t³ for β_θ ≤ t < ∞,

where 0 ≤ θ < ∞ and

β_θ = inf{s ≥ θ | W_s = 0}.

Under the assumptions of Proposition 13.29, we also have a comparison theorem; see [30] (Theorem 3.7, Chap. IX) or [17] (Proposition 2.18, Chap. 5) for the case of time-dependent coefficients.

Theorem 13.36 Let b_1, b_2 : R_+ × R → R be Borel real functions. Let a fulfill the same assumptions as in Proposition 13.29. Let (X_t^i) be (F_t)-adapted processes such that the following holds.

i) X_0^1 ≤ X_0^2 a.s.

ii) b_1(t, x) ≤ b_2(t, x) ∀ t, x.

iii) Either b_1 or b_2 is Lipschitz.

iv) X^i is a solution of E(a, b_i), i = 1, 2.

Then
$$P\{X_t^1 \le X_t^2 \; \forall\, t \ge 0\} = 1.$$
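As a numerical illustration of the comparison theorem (a sketch; the coefficients a(x) = √(|x| ∧ 1) + 0.1, b_1(x) = −x, b_2(x) = 1 − x and all parameters are our own illustrative choices, not from the text), one can run two Euler-Maruyama schemes driven by the same Brownian increments and observe the ordering along the discrete path:

```python
import numpy as np

# Euler-Maruyama illustration of Theorem 13.36: same diffusion a,
# ordered drifts b1 <= b2, ordered initial conditions, same noise.
rng = np.random.default_rng(0)
T, n = 1.0, 2000
dt = T / n
a  = lambda x: np.sqrt(min(abs(x), 1.0)) + 0.1   # square-root-type modulus
b1 = lambda x: -x                                # b1(x) <= b2(x) for all x
b2 = lambda x: 1.0 - x                           # Lipschitz

X1, X2 = 0.0, 0.5                                # X^1_0 <= X^2_0
ordered = True
for _ in range(n):
    dW = rng.normal(0.0, np.sqrt(dt))            # common Brownian increment
    X1 += b1(X1) * dt + a(X1) * dW
    X2 += b2(X2) * dt + a(X2) * dW
    ordered = ordered and (X1 <= X2 + 1e-12)

print("X^1 <= X^2 along the whole discrete path:", ordered)
```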

In some cases E(a, b) admits pathwise uniqueness even if b is only measurable,
and sometimes for unbounded drifts b.
We conclude the section about strong existence and pathwise uniqueness with
a celebrated theorem of A.Yu. Veretennikov.

Theorem 13.37 We set σ = a a^⊤. Assume the following.

(i) a, b are Borel bounded.

(ii) a is Lipschitz with linear growth.

(iii) The non-degeneracy Assumption 13.13 holds.

Then E(a, b) admits strong existence and pathwise uniqueness.

Proof. The result follows from Theorem 1 in [36].

Remark 13.38 (i) This field is still very active. For instance, [20] establishes pathwise uniqueness for the case m = p, a being the identity and b locally in some suitable L^q spaces in time and space. For instance, if b is time-independent, then b is allowed to be locally in L^q(R^m) with q > m.

(ii) Other extensions of pathwise uniqueness, together with suitable properties of the flow, appear in [8], where a, b do not depend on time: b is allowed to be suitably Hölder continuous, while a a^⊤ fulfills Assumption 13.13 and is bounded of class C³.

(iii) Concerning the study of SDEs in the multidimensional case, see also
[9] and references therein.

14 Bessel processes

An application of the previous considerations concerns the class of Bessel and squared Bessel processes.

Example 14.1 Let a(x) = c√|x|, where c is a positive constant. Let b be continuous and Lipschitz. Then E(a, b) admits strong existence and pathwise uniqueness.
In fact, first we observe that a, b are continuous with linear growth. We have
$$|a(x) - a(y)| = c\,\big|\sqrt{|x|} - \sqrt{|y|}\big| \le c\sqrt{|x - y|}$$
by (13.23). Proposition 13.29 can be applied taking h(z) = c√z. By Corollary 13.33, the result follows.

Let x_0 ≥ 0, δ ≥ 0. Consider the SDE E(a, b), with a(x) = 2√|x|, b(t, x) = δ:
$$Z_t = x_0 + 2\int_0^t \sqrt{|Z_s|}\, dW_s + \delta t. \qquad (14.26)$$

According to the comparison Theorem 13.36, the unique solution Z of E(a, b) is non-negative, and so the absolute value in (14.26) can be removed. In fact, for x_0 = δ = 0 the unique solution is Z ≡ 0. So we have
$$Z_t = x_0 + 2\int_0^t \sqrt{Z_s}\, dW_s + \delta t, \quad t \ge 0. \qquad (14.27)$$

Definition 14.2 The unique solution Z to (14.26) is called the squared δ-dimensional Bessel process starting at x_0; it is denoted by BESQ(x_0, δ). For fine properties of this process, see [30], Ch. IX.3.
Taking into account Z ≥ 0, we call δ-dimensional Bessel process starting at √x_0 the process X = √Z. It is denoted by BES(√x_0, δ).

Proposition 14.3 Let p be a positive integer. Let W = (W¹, ..., W^p) be a classical p-dimensional Brownian motion, y_0 ∈ R^p. We set X_t = ‖y_0 + W_t‖. Then (X_t)_{t≥0} is a p-dimensional Bessel process starting at ‖y_0‖.

Proof. We set Z_t = ‖y_0 + W_t‖². By Itô's formula, we have
$$Z_t = \|y_0\|^2 + 2\sum_{i=1}^p \int_0^t (W_s^i + y_0^i)\, dW_s^i + \frac12 \sum_{i=1}^p 2t = \|y_0\|^2 + 2M_t + pt,$$
where
$$M_t = \sum_{i=1}^p \int_0^t (W_s^i + y_0^i)\, dW_s^i.$$
Now, (M_t) is a continuous martingale whose quadratic variation is
$$[M]_t = \sum_{i=1}^p \int_0^t (W_s^i + y_0^i)^2\, ds = \int_0^t \|W_s + y_0\|^2\, ds.$$

Therefore
$$B_t = \int_0^t \frac{dM_s}{\|W_s + y_0\|}$$
defines a classical Brownian motion, because [B]_t = t and because of the Lévy characterization Theorem 13.11 of Brownian motion. Now,
$$2\int_0^t \sqrt{Z_s}\, dB_s = 2\int_0^t \frac{\|W_s + y_0\|}{\|W_s + y_0\|}\, dM_s = 2M_t,$$
and the final result follows.
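A quick Monte Carlo sanity check of the decomposition Z_t = ‖y_0‖² + 2M_t + pt used in the proof (a sketch; the dimension, y_0 and sample sizes below are arbitrary choices of ours): since M is a centered martingale, the empirical mean of ‖y_0 + W_t‖² should be close to ‖y_0‖² + pt.

```python
import numpy as np

# Check that E(||y0 + W_t||^2) = ||y0||^2 + p*t, as read off from
# Z_t = ||y0||^2 + 2 M_t + p t with E(M_t) = 0.
rng = np.random.default_rng(1)
p, t = 3, 2.0
y0 = np.array([1.0, -0.5, 0.25])
n_paths = 200_000

W_t = rng.normal(0.0, np.sqrt(t), size=(n_paths, p))   # W_t ~ N(0, t I_p)
Z_t = np.sum((y0 + W_t) ** 2, axis=1)

print("empirical mean:", Z_t.mean())
print("||y0||^2 + p t:", y0 @ y0 + p * t)
```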

Remark 14.4 a) If δ > 2, it is possible to see that the process BES(x_0, δ) fulfills
$$X_t = x_0 + B_t + \frac{\delta - 1}{2} \int_0^t \frac{ds}{X_s},$$
where B is a standard Brownian motion.

b) X is a Dirichlet process for δ > 0 and a semimartingale if and only if δ ≥ 1, see [38].

Exercise 14.5 Show item a) supposing that Xt > 0 a.s. for every t.

Exercise 14.6 Let (S_t) be a BESQ(x_0, δ). For N > 0, we set τ_N = inf{s ≥ 0 : S_s > N}.

(i) Show that a.s. (τN ) is an increasing sequence of stopping times diverg-
ing to infinity.

(ii) Show that E(StτN ) = x0 + δE(τN ∧ t).

(iii) Calculate E(St ), t ≥ 0.

(iv) Show that E(supt≤T St2 ) < ∞.

(v) Calculate Var(S_t) for any t ≥ 0.

We aim here at establishing the marginal law ρ_t(dy) of S_t, where S is a squared Bessel process. In fact we will determine the Laplace transform, i.e.
$$E(e^{-\lambda S_t}) = \int_0^\infty e^{-\lambda y}\, \rho_t(dy). \qquad (14.28)$$
We will establish a particular case of the celebrated Pitman-Yor theorem, which allows one to calculate the even more general quantity
$$E\left(e^{-\lambda \int_0^\infty S_s\, d\mu(s)}\right),$$
where µ is a Borel measure with compact support on R_+; this corresponds to (14.28) taking µ = δ_t, the Dirac mass at t.
First we need a lemma.

Lemma 14.7 Let Z (resp. Z ′ ) be a BESQ(x, δ) (resp. BESQ(x′ , δ ′ )),


driven by a Brownian motion W (resp. W ′ ). We suppose W, W ′ independent
and defined on the same probability space. Then X = Z + Z ′ is a BESQ(x +
x′ , δ + δ ′ ) process.

Proof. We set X_t = Z_t + Z_t′. Summing the two equations,
$$X_t = Z_t + Z_t' = x + x' + 2\int_0^t \sqrt{Z_s}\, dW_s + \delta t + 2\int_0^t \sqrt{Z_s'}\, dW_s' + \delta' t. \qquad (14.29)$$

In order to conclude, we need to show the existence of some Brownian motion W̃ with
$$\int_0^t \sqrt{Z_s}\, dW_s + \int_0^t \sqrt{Z_s'}\, dW_s' = \int_0^t \sqrt{X_s}\, d\tilde W_s.$$
We introduce the process
$$M_t = \int_0^t 1_{\{X_s > 0\}}\, \frac{\sqrt{Z_s}\, dW_s + \sqrt{Z_s'}\, dW_s'}{\sqrt{X_s}} + \int_0^t 1_{\{X_s = 0\}}\, dW_s'',$$

where W′′ is another Brownian motion independent of W, W′, possibly defined by enlarging the probability space; we omit the details about this. We show that M is a Brownian motion. Since M is a local martingale, according to Lévy's characterization theorem it will be enough to show [M]_t = t. Recalling that (Z_s + Z_s')/X_s = 1 on {X_s > 0}, we obtain
$$[M]_t = \int_0^t 1_{\{X_s > 0\}}\, ds + \int_0^t 1_{\{X_s = 0\}}\, ds = t.$$

Coming back to (14.29), we get
$$\int_0^t \sqrt{X_s}\, dM_s = \int_0^t 1_{\{X_s > 0\}}\big(\sqrt{Z_s}\, dW_s + \sqrt{Z_s'}\, dW_s'\big) + \int_0^t \sqrt{X_s}\, 1_{\{X_s = 0\}}\, dW_s''$$
$$= \int_0^t 1_{\{X_s > 0\}}\big(\sqrt{Z_s}\, dW_s + \sqrt{Z_s'}\, dW_s'\big) = \int_0^t \big(\sqrt{Z_s}\, dW_s + \sqrt{Z_s'}\, dW_s'\big).$$
Indeed √X_s vanishes on {X_s = 0}, and X_s = 0 implies Z_s = Z_s' = 0, Z and Z′ being non-negative. So the result follows.
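Anticipating the explicit Laplace transform of Theorem 14.12 below, the lemma is consistent with it (our own remark): for the one-dimensional marginals,
$$\left(\frac{1}{2\lambda t + 1}\right)^{\delta/2} e^{-\frac{\lambda x}{2\lambda t + 1}} \cdot \left(\frac{1}{2\lambda t + 1}\right)^{\delta'/2} e^{-\frac{\lambda x'}{2\lambda t + 1}} = \left(\frac{1}{2\lambda t + 1}\right)^{(\delta + \delta')/2} e^{-\frac{\lambda(x + x')}{2\lambda t + 1}},$$
i.e. the product of the transforms of BESQ(x, δ) and BESQ(x′, δ′) is the transform of BESQ(x + x′, δ + δ′).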

We go on with another significant partial result.

Theorem 14.8 Let µ be a Borel finite measure on [0, T]. We set
$$\varphi(x, \delta) := E\left(e^{-\lambda \int_0^\infty X_s\, d\mu(s)}\right),$$
where X is a BESQ(x, δ). Then
$$\varphi(x, \delta) = \varphi(x, 0)\, \varphi(0, \delta).$$

Proof.
We remark that this quantity is finite, since µ is a positive measure and the squared Bessel process is non-negative.
Let Z (resp. Z′) be a BESQ(x, δ) (resp. BESQ(x′, δ′)), driven by independent Brownian motions, for x, x′ ∈ R_+, δ, δ′ ≥ 0. Using Lemma 14.7 and the independence of Z and Z′ we have
$$\varphi(x + x', \delta + \delta') = E\left(e^{-\lambda \int_0^\infty (Z_s + Z_s')\, d\mu(s)}\right) = E\left(e^{-\lambda \int_0^\infty Z_s\, d\mu(s)}\right) E\left(e^{-\lambda \int_0^\infty Z_s'\, d\mu(s)}\right) = \varphi(x, \delta)\, \varphi(x', \delta').$$

So we have proved that

ϕ(x + x′ , δ + δ ′ ) = ϕ(x, δ)ϕ(x′ , δ ′ ). (14.30)

In particular, for x, δ ≥ 0,

ϕ(x, δ) = ϕ(x, 0)ϕ(0, δ).

The next lemma is technical.

Lemma 14.9 (i) φ(x, δ) = φ(x/δ, 1)^δ, if δ > 0.

(ii) φ(x, δ) = φ(1, δ/x)^x, if x > 0.

Proof. It is enough to prove (i), since (ii) follows similarly. If δ is rational, then (i) is a consequence of the multiplicative property (14.30). The general case follows because φ is continuous, see Lemma 14.10.

Lemma 14.10 The function φ : R_+ × R_+ → R is continuous.

Proof. We omit the details of the proof. However, the exercise below allows one to show that if there are two sequences (x_n) and (δ_n) which converge increasingly (or decreasingly) to x_0 and δ_0 respectively, then φ(x_n, δ_n) → φ(x_0, δ_0).

Exercise 14.11 Let S^i, i = 1, 2, be the solution (on some filtered probability space, equipped with a Brownian motion W) of
$$S_t^i = x_i + 2\int_0^t \sqrt{S_s^i}\, dW_s + \delta_i t, \quad i = 1, 2,$$
where 0 ≤ x_1 ≤ x_2, 0 ≤ δ_1 ≤ δ_2.

(i) Show that S_t^1 ≤ S_t^2, ∀t ∈ [0, T].

(ii) Express E(|S_t^1 − S_t^2|).

(iii) Show that
$$\mathrm{Var}(S_t^2 - S_t^1) \le (x_2 - x_1)\, t + \frac{(\delta_2 - \delta_1)\, t^2}{2}.$$
(iv) Using Doob's inequality, find an upper bound for
$$E\left(\sup_{t \le T} |S_t^1 - S_t^2|^2\right).$$

(v) Let (x_n) (resp. (δ_n)) be an increasing sequence of non-negative numbers converging to x_0 (resp. δ_0). Let S^n be the solution to
$$S_t = x_n + 2\int_0^t \sqrt{S_s}\, dW_s + \delta_n t, \quad n \in \mathbb N.$$
Deduce that the sequence S^n converges ucp to S^0.

(vi) Deduce that φ(x_n, δ_n) → φ(x_0, δ_0) when µ has its support in a compact interval [0, T].

The next theorem allows one to calculate the Laplace transform of the law of a squared Bessel process.

Theorem 14.12 Let S be a BESQ(x, δ). Then
$$E(\exp(-\lambda S_t)) = \left(\frac{1}{2\lambda t + 1}\right)^{\delta/2} \exp\left(-\frac{\lambda x}{2\lambda t + 1}\right).$$

Proof. We apply the previous results with µ = δ_t, the Dirac mass at t. Let λ > 0. Using the notation φ as before, we have
$$E(\exp(-\lambda S_t)) = \varphi(x, \delta) = \varphi\left(\frac{x}{\delta}, 1\right)^\delta,$$
by Lemma 14.9. So, by Proposition 14.3, it is enough to evaluate the previous quantity for S_t = (√x_0 + W_t)², where (W_t) is a classical Wiener process and x_0 = x/δ. We get
$$\varphi(x_0, 1) = E\left(e^{-\lambda(\sqrt{x_0} + W_t)^2}\right) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb R} e^{-\lambda(\sqrt{x_0} + y\sqrt{t})^2 - \frac{y^2}{2}}\, dy$$
$$= \frac{1}{\sqrt{2\pi}}\, e^{-\lambda x_0} \int_{\mathbb R} e^{-(\lambda t + \frac12)y^2 - 2\lambda y\sqrt{x_0 t}}\, dy$$
$$= \frac{1}{\sqrt{2\pi}}\, e^{-\lambda x_0} \int_{\mathbb R} e^{-\frac12\left\{(y\sqrt{2\lambda t + 1})^2 + 4\lambda y\sqrt{x_0 t}\right\}}\, dy$$
$$= \frac{1}{\sqrt{2\pi}}\, e^{-\lambda x_0}\, e^{\frac{2\lambda^2 x_0 t}{2\lambda t + 1}} \int_{\mathbb R} e^{-\frac12\left(y\sqrt{2\lambda t + 1} + \frac{2\lambda\sqrt{x_0 t}}{\sqrt{2\lambda t + 1}}\right)^2}\, dy$$
$$= \exp\left(-\lambda x_0 + \frac{2\lambda^2 x_0 t}{2\lambda t + 1}\right) \frac{1}{\sqrt{2\lambda t + 1}} \int_{\mathbb R} q(y)\, dy,$$
where q is the density of a Gaussian law, so that ∫_R q(y) dy = 1. Finally
$$\varphi(x_0, 1) = \frac{1}{(1 + 2\lambda t)^{1/2}} \exp\left(\frac{-\lambda x_0}{2\lambda t + 1}\right).$$
Taking into account the definition of x_0, the result follows.
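A numerical cross-check of this formula (a sketch; the Euler scheme with clipping at 0 and all parameters below are illustrative choices of ours): we simulate (14.27) and compare the empirical Laplace transform with the closed form.

```python
import numpy as np

# Monte Carlo check of E(exp(-lam * S_t)) for S = BESQ(x, delta),
# via an Euler scheme for (14.27), clipping at 0 before the square root.
rng = np.random.default_rng(2)
x, delta, t, lam = 1.0, 3.0, 1.0, 0.7
n_steps, n_paths = 400, 100_000
dt = t / n_steps

S = np.full(n_paths, x)
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    S = np.maximum(S + 2.0 * np.sqrt(S) * dW + delta * dt, 0.0)

mc = np.exp(-lam * S).mean()
exact = (2.0 * lam * t + 1.0) ** (-delta / 2.0) \
        * np.exp(-lam * x / (2.0 * lam * t + 1.0))
print("Monte Carlo   :", mc)
print("Theorem 14.12 :", exact)
```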

Exercise 14.13 Deduce E(St ) from previous formula.

15 Cox-Ingersoll-Ross equation

A successful model in finance is provided by the Cox-Ingersoll-Ross model, which is often used to describe the evolution of interest rates.
We consider the following equation:
$$\begin{cases} dX_t = (a + bX_t)\, dt + c\sqrt{|X_t|}\, dW_t \\ X_0 = x_0, \end{cases} \qquad (15.31)$$
where b, c ∈ R and a, x_0 ≥ 0.

Proposition 15.1 Equation (15.31) admits strong existence and pathwise uniqueness. Moreover X_t ≥ 0 a.s.

Proof.
The function x ↦ a + bx is Lipschitz and x ↦ √|x| fulfills the assumption of Proposition 13.29. Moreover those functions are continuous. The result follows by Corollary 13.33.

Remark 15.2 (i) The unique solution X of Proposition 15.1 is called the Cox-Ingersoll-Ross process.

(ii) Again, as in the case of the squared Bessel process, we apply the comparison theorem for SDEs, setting
$$b_1(x) = a + bx, \qquad b_2(x) = bx.$$
Since b_1(x) ≥ b_2(x) and x_0 ≥ 0, the unique solution X of (15.31) is greater than or equal to the unique solution Y of
$$\begin{cases} dY_t = bY_t\, dt + c\sqrt{|Y_t|}\, dW_t \\ Y_0 = 0. \end{cases} \qquad (15.32)$$
Since Y ≡ 0, we have X_t ≥ 0 and we can remove the absolute value in (15.31).

(iii) We can therefore affirm that the Cox-Ingersoll-Ross process is (the unique) solution of
$$\begin{cases} dX_t = (a + bX_t)\, dt + c\sqrt{X_t}\, dW_t \\ X_0 = x_0. \end{cases} \qquad (15.33)$$
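Equation (15.33) is also the natural starting point for simulation. Below is a minimal sketch of a "full truncation" Euler scheme for the Cox-Ingersoll-Ross process (the scheme and all parameters are our own illustrative assumptions; clipping the argument of the square root at 0 mirrors the removal of the absolute value discussed in item (ii)):

```python
import numpy as np

# Full-truncation Euler scheme for dX = (a + b X) dt + c sqrt(X) dW, X_0 = x0.
def simulate_cir(x0, a, b, c, T, n_steps, n_paths, seed=3):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X = np.full(n_paths, float(x0))
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        Xp = np.maximum(X, 0.0)                  # truncate before the sqrt
        X = X + (a + b * Xp) * dt + c * np.sqrt(Xp) * dW
    return np.maximum(X, 0.0)

# Example: the empirical mean should track m'(t) = a + b m(t), m(0) = x0
# (obtained by taking expectations in (15.33)); here b != 0.
x0, a, b, c, T = 1.0, 0.5, -1.0, 0.4, 2.0
X_T = simulate_cir(x0, a, b, c, T, 400, 100_000)
m_T = np.exp(b * T) * x0 + (a / b) * (np.exp(b * T) - 1.0)
print("empirical E(X_T):", X_T.mean(), "  exact:", m_T)
```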

The next step relates the solution of (15.31) to a squared Bessel process. Before this we need a lemma that can be found in [17].

Lemma 15.3 Let (M_t) be an (F_t)-local martingale with quadratic variation [M]_t = ∫_0^t Φ_s² ds, where Φ is a measurable (F_t)-adapted non-negative process. Then there is an enlarged (filtered) probability space (Ω̄, F̄, (F̄_t), P̄) and a classical Wiener process (W̄_t) such that
$$M_t = M_0 + \int_0^t \Phi_s\, d\bar W_s.$$

Remark 15.4 By an enlarged probability space we mean a space of the type
$$\bar\Omega = \Omega \times \Omega_0, \qquad \bar{\mathcal F} = \mathcal F \otimes \mathcal F_0,$$
with P̄ a probability whose marginal on Ω is P. Moreover,
$$\bar{\mathcal F}_t = \mathcal F_t \otimes \mathcal F_0, \qquad \bar W(\omega, \omega_0) = W(\omega_0).$$

Proposition 15.5 Let (X_t) be the solution of (15.33) and let (T_t) be a BESQ(x_0, δ) with δ = 4a/c². Then X_t = e^{bt} T_{φ(t)} in law, where
$$\varphi(t) = \frac{c^2}{4b}\,\big(1 - e^{-bt}\big).$$

Proof. We set R_t = T_{φ(t)} and determine the SDE that it fulfills. Let (Z_t)_{t≥0} be a classical Wiener process such that
$$T_t = T_0 + 2\int_0^t \sqrt{T_s}\, dZ_s + \frac{4a}{c^2}\, t.$$
So, setting $M_t := 2\int_0^t \sqrt{T_s}\, dZ_s$, t ≥ 0, we have
$$R_t = R_0 + \int_0^{\varphi(t)} \frac{4a}{c^2}\, ds + M_{\varphi(t)} = R_0 + a\int_0^t e^{-b\tilde s}\, d\tilde s + M_{\varphi(t)}. \qquad (15.34)$$
Indeed, the change of variable in the second equality comes from
$$\tilde s = \varphi^{-1}(s), \qquad ds = \varphi'(\tilde s)\, d\tilde s = \frac{c^2}{4}\, e^{-b\tilde s}\, d\tilde s, \qquad \varphi : \mathbb R_+ \to \left[0, \frac{c^2}{4b}\right[.$$
Since $[M]_t = 4\int_0^t T_s\, ds$, we have
$$[M_{\varphi(\cdot)}]_t = [M]_{\varphi(t)} = 4\int_0^{\varphi(t)} T_s\, ds = 4\int_0^t T_{\varphi(u)}\, \varphi'(u)\, du = 4\int_0^t R_u\, \varphi'(u)\, du = c^2 \int_0^t R_u\, e^{-bu}\, du.$$
(M_{φ(t)}) is an (F_{φ(t)})-local martingale. According to Lemma 15.3, there is a classical Brownian motion W on an enlarged filtered probability space with
$$M_{\varphi(t)} = c\int_0^t e^{-\frac{bu}{2}} \sqrt{R_u}\, dW_u.$$

Consequently, by (15.34) we have
$$R_t = R_0 + a\int_0^t e^{-bs}\, ds + c\int_0^t e^{-\frac{bs}{2}} \sqrt{R_s}\, dW_s.$$
We now evaluate Y_t = e^{bt} R_t by integration by parts. We get
$$Y_t = Y_0 + \int_0^t e^{bs}\, dR_s + b\int_0^t R_s\, e^{bs}\, ds = Y_0 + \int_0^t e^{bs}\left(a e^{-bs}\, ds + c e^{-\frac{bs}{2}} \sqrt{R_s}\, dW_s + bR_s\, ds\right) = Y_0 + at + c\int_0^t \sqrt{Y_s}\, dW_s + b\int_0^t Y_s\, ds.$$

So Y solves equation (15.33). Since pathwise uniqueness implies uniqueness in law, X and Y have the same law.
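As a quick consistency check of Proposition 15.5 (we only compare first moments): taking expectations in (14.27) gives E(T_s) = x_0 + δs, hence
$$E\big(e^{bt}\, T_{\varphi(t)}\big) = e^{bt}\left(x_0 + \frac{4a}{c^2} \cdot \frac{c^2}{4b}\,(1 - e^{-bt})\right) = e^{bt} x_0 + \frac{a}{b}\,(e^{bt} - 1),$$
which is exactly the solution of m′(t) = a + b m(t), m(0) = x_0, obtained by taking expectations in (15.33).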

Exercise 15.6 Evaluate the Laplace transform of the Cox-Ingersoll-Ross pro-


cess Xt , i.e.
E(e−λXt ).

16 Backward stochastic differential equations

16.1 Preliminaries and motivations.

These equations were motivated by applications to stochastic control. They represent an efficient tool to represent probabilistically non-linear parabolic or elliptic PDEs. Today there are several applications to mathematical finance, see [6]. A nice survey is provided by [27] and a recent reference is given in [28]. The theorems we expose here are written under relatively heavy assumptions, so they are not optimal.
Let (W_t) be a classical Wiener process equipped with its canonical filtration (F_t).
We recall the notation L²(W)_T (see Definition 7.3), indicating the linear space of progressively measurable processes (X_t)_{t∈[0,T]} such that
$$E\left(\int_0^T |X_s|^2\, ds\right) = E\left(\int_0^T |X_s|^2\, d[W]_s\right) < \infty.$$

Assumption 16.1 a) We are given a square integrable (F_T-measurable) random variable ξ, which plays the role of the final condition.

b) A coefficient f : Ω × [0, T] × R × R → R, for which there is a real number K > 0 with the following features.

(i) f(·, y, z) is progressively measurable for every y, z;

(ii) $E\left(\int_0^T f(t, 0, 0)^2\, dt\right) < \infty$;

(iii) |f(t, y, z) − f(t, y′, z′)| ≤ K(|y − y′| + |z − z′|), for every t, y, y′, z, z′.

Remark 16.2 (i) In this first section, Assumption (iii) on f will not be used.

(ii) The Lipschitz conditions with respect to y given in item (iii) are not
necessary. They can be replaced by monotonicity conditions.

(iii) We write everything in the one-dimensional case. The extension to the


multi-dimensional case holds with similar arguments.

Definition 16.3 A pair of adapted real-valued processes (Y, Z) indexed on [0, T] will be said to be a solution to BSDE(ξ, f) if

a) $E\left(\int_0^T |Z_s|^2\, ds\right) < \infty$,

b) $Y_t = \xi + \int_t^T f(s, Y_s, Z_s)\, ds - \int_t^T Z_s\, dW_s$, 0 ≤ t ≤ T, a.s.

Remark 16.4 (i) We remark that the adaptedness of Y implies in partic-


ular that Y0 is deterministic.

(ii) Since the right-hand side in item b) is continuous, every solution Y is a continuous process.

We now discuss some motivations for the study of backward SDEs.

Remark 16.5 (i) f = 0. This case corresponds to the representation theorem of martingales and to the hedging problem in finance.
In this case the equation BSDE(ξ, f) becomes
$$\begin{cases} Y_t = \xi - \int_t^T Z_s\, dW_s \\ E\left(\int_0^T Z_s^2\, ds\right) < \infty. \end{cases}$$
We set Y_t = E(ξ|F_t). By the representation theorem of Brownian martingales, there is Z ∈ L²(Ω × [0, T]), measurable and adapted, such that
$$Y_t = Y_0 + \int_0^t Z_s\, dW_s, \quad t \in [0, T].$$
So
$$Y_t = Y_T - \int_t^T Z_s\, dW_s = \xi - \int_t^T Z_s\, dW_s,$$
which shows that BSDE(ξ, 0) has a solution.
We go on with uniqueness. Let (Y¹, Z¹), (Y², Z²) be two solutions. We set
$$Y = Y^1 - Y^2, \qquad Z = Z^1 - Z^2.$$
We have
$$Y_t = -\int_t^T Z_s\, dW_s, \quad t \in [0, T]. \qquad (16.1)$$
Since $E\left(\int_0^T Z_s^2\, ds\right) < \infty$, $\int_0^\cdot Z_s\, dW_s$ is a (square integrable) martingale. Taking the conditional expectation in (16.1) with respect to F_t, we get Y_t = 0 and so Y ≡ 0. Then
$$0 = \int_0^t Z_s\, dW_s, \quad t \ge 0.$$
Taking the quadratic variation we obtain
$$0 = \int_0^t Z_s^2\, ds,$$
so Z_s(ω) = 0, ds ⊗ dP a.e.

(ii) Suppose the existence of g : R_+ × R³ → R such that
$$f(\omega, t, y, z) = g(t, X_t(\omega), y, z),$$
where (X_t) is the solution of a classical SDE
$$X_t = x_0 + \int_0^t a(s, X_s)\, dW_s + \int_0^t b(s, X_s)\, ds, \qquad E(a, b), \qquad (16.2)$$
where a, b have linear growth. We suppose g bounded and continuous in (y, z) for each fixed t ∈ [0, T]. Let φ : R → R be continuous with linear growth and such that ξ = φ(X_T).
Exercise: Show that ξ is square integrable.
In this case the system
$$\begin{cases} X_t = x_0 + \int_0^t a(s, X_s)\, dW_s + \int_0^t b(s, X_s)\, ds \\ Y_t = \xi + \int_t^T g(s, X_s, Y_s, Z_s)\, ds - \int_t^T Z_s\, dW_s \end{cases} \qquad FBSDE(g, \varphi, a, b),$$
is called forward-backward SDE, or also Markovian SDE.

We suppose the existence of u ∈ C^{1,2}([0, T[ × R), continuous on [0, T] × R, solving
$$\begin{cases} \left(\partial_t u + \frac{a^2}{2}\,\partial^2_{xx} u + b\,\partial_x u\right)(t, x) + g\big(t, x, u(t, x), (a\,\partial_x u)(t, x)\big) = 0, \\ u(T, x) = \varphi(x). \end{cases} \qquad (PDE)$$
We also suppose that ∂_x u is bounded on [0, T[ × R.

Proposition 16.6 The system FBSDE(g, φ, a, b) admits a solution (Y, Z) where
$$\begin{cases} Y_t = u(t, X_t) \\ Z_t = \partial_x u(t, X_t)\, a(t, X_t), \end{cases} \qquad (16.3)$$
and X is a solution of E(a, b).

Proof. We apply Itô's formula to Y_t = u(t, X_t). Let τ < T. We obtain
$$Y_\tau = Y_t + \int_t^\tau \partial_s u(s, X_s)\, ds + \int_t^\tau \partial_x u(s, X_s)\, dX_s + \frac12 \int_t^\tau \partial^2_{xx} u(s, X_s)\, d[X]_s$$
$$= Y_t + \int_t^\tau \partial_x u(s, X_s)\, a(s, X_s)\, dW_s + \int_t^\tau \left(\partial_s u(s, X_s) + b(s, X_s)\,\partial_x u(s, X_s) + \frac12\,\partial^2_{xx} u(s, X_s)\, a^2(s, X_s)\right) ds$$
$$= Y_t + \int_t^\tau Z_s\, dW_s - \int_t^\tau g(s, X_s, Y_s, Z_s)\, ds,$$
so
$$Y_t = Y_\tau - \int_t^\tau Z_s\, dW_s + \int_t^\tau g(s, X_s, Y_s, Z_s)\, ds.$$
Finally, letting τ converge to T, we get
$$Y_t = \xi - \int_t^T Z_s\, dW_s + \int_t^T g(s, X_s, Y_s, Z_s)\, ds.$$

In conclusion, the system FBSDE(g, φ, a, b) constitutes a probabilistic representation of the non-linear PDE.
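A degenerate but instructive special case (our own remark): for g ≡ 0, (PDE) reduces to the Kolmogorov backward equation
$$\partial_t u + \frac{a^2}{2}\,\partial^2_{xx} u + b\,\partial_x u = 0, \qquad u(T, \cdot) = \varphi,$$
and (16.3) gives Y_t = u(t, X_t) = E(φ(X_T)|F_t): we recover the linear Feynman-Kac representation and the situation of Remark 16.5 (i).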

16.2 Existence and uniqueness of the BSDE

We start with a simple basic lemma often used here.

Lemma 16.7 Let Y be an adapted measurable process such that E(sup_{t≤T} Y_t²) < ∞. Let Z ∈ L²(W)_T. Then $M_t := \int_0^t Y_s Z_s\, dW_s$, t ∈ [0, T], is a martingale.

Proof. Since M is a local martingale, by Proposition 6.5 it is enough to prove that $E([M, M]_T^{1/2}) < \infty$. Indeed, by Cauchy-Schwarz, we get
$$E([M, M]_T^{1/2}) = E\left(\sqrt{\int_0^T Y_s^2 Z_s^2\, ds}\right) \le E\left(\sup_{t \le T} |Y_t| \sqrt{\int_0^T Z_s^2\, ds}\right) \le \left\{E\left(\sup_{t \le T} Y_t^2\right)\right\}^{1/2} \left\{E\left(\int_0^T Z_s^2\, ds\right)\right\}^{1/2},$$
which is finite.

From now on, we will make Assumption (iii) (Lipschitz on (y, z)) on f .

Proposition 16.8 Under the above conditions, if (Y, Z) is a solution of the BSDE(ξ, f), then there exists a constant c, which depends only on T and K, such that
$$E\left(\sup_{0 \le t \le T} |Y_t|^2 + \int_0^T |Z_t|^2\, dt\right) \le c\, E\left(|\xi|^2 + \int_0^T |f(t, 0, 0)|^2\, dt\right).$$

Proof. If (Y, Z) is a solution, then
$$Y_t = Y_0 - \int_0^t f(s, Y_s, Z_s)\, ds + \int_0^t Z_s\, dW_s. \qquad (16.4)$$
For each n ∈ N, we introduce the stopping time
$$\tau_n = \inf\{0 \le t \le T : |Y_t| \ge n\}$$
(with the convention that the infimum over an empty set is infinite) and the stopped process
$$Y_t^n = Y_{t \wedge \tau_n}.$$
Since Y is continuous, and therefore almost all paths are bounded on [0, T], we have τ_n → +∞ when n → ∞. Then |Y_t^n| ≤ n ∨ |Y_0| and
$$Y_t^n = Y_0 - \int_0^t 1_{[0, \tau_n]}(s)\, f(s, Y_s^n, Z_s)\, ds + \int_0^t 1_{[0, \tau_n]}(s)\, Z_s\, dW_s.$$

For n ≥ |Y_0| we can write
$$E(|Y_t^n|^2) \le 3\left(|Y_0|^2 + E\left(\int_0^t |f(s, Y_s^n, Z_s)|\, ds\right)^2 + E\int_0^t |Z_s|^2\, ds\right), \quad t \in [0, T].$$
Taking into account Assumption 16.1 b) (ii), (iii), there is a constant C depending on K and T (which in the sequel can change from one line to the other) such that the previous expression is bounded by
$$C\left(|Y_0|^2 + E\int_0^t \big(f(s, 0, 0)^2 + |Y_s^n|^2 + |Z_s|^2\big)\, ds\right).$$
So there exists a quantity C, depending on K, Y_0, $E(\int_0^T f(s, 0, 0)^2\, ds)$, $E(\int_0^T Z_s^2\, ds)$, but not on n, such that
$$E(|Y_t^n|^2) \le C\left(1 + \int_0^t E(|Y_s^n|^2)\, ds\right).$$

Gronwall’s Lemma 13.10 implies

E((Ytn )2 ) ≤ CeCt , t ∈ [0, T ].

On the other hand


lim Ytn = Yt a.s.
n→∞

146
Therefore, by Fatou’s lemma

E(Yt2 ) = E(lim inf (Ytn )2 ) ≤ lim inf E(Ytn )2 ≤ CeCT , t ∈ |0, T ]. (16.5)
n→∞ n→∞

From this, from (16.4) and from b) (iii) in Assumption 16.1 we have
$$E\left(\sup_{t \le T} |Y_t|^2\right) \le c\left(Y_0^2 + K^2 E\int_0^T (|Y_s|^2 + |Z_s|^2)\, ds + E\int_0^T f(s, 0, 0)^2\, ds + E\sup_{t \le T}\left|\int_0^t Z_s\, dW_s\right|^2\right), \qquad (16.6)$$
where c is a constant which may change from line to line. Therefore, by Doob's inequality,
$$E\left(\sup_{t \le T} |Y_t|^2\right) \le c\left(Y_0^2 + E\int_0^T |Y_s|^2\, ds + E\int_0^T |Z_s|^2\, ds + E\int_0^T f(s, 0, 0)^2\, ds\right), \qquad (16.7)$$
where c only depends on K and T. This quantity is finite. Finally, by (16.5),
$$E\left(\sup_{t \le T} |Y_t|^2\right) < \infty. \qquad (16.8)$$

We go on with the proof of the proposition. Let (Y, Z) be a solution. We claim that
$$E\left(\int_t^T Y_s Z_s\, dW_s\right) = 0. \qquad (16.9)$$
This follows from the fact that the local martingale
$$N_t = \int_0^t Y_s Z_s\, dW_s$$
is a true martingale, because of Lemma 16.7 and (16.8).


We apply Itô formula and (16.4) to get
Z T
YT2 = Yt2 −2 Ys dYs + [Y ]T − [Y ]t
t
Z T Z T
= Yt2 −2 Ys f (s, Ys , Zs )ds + 2 Ys Zs dWs
t t
Z T
+ Zs2 ds.
t

Taking the expectation in the previous expression and using (16.9), we get
$$E(\xi^2) = E(Y_t^2) - 2\int_t^T E(Y_s f(s, Y_s, Z_s))\, ds + E\left(\int_t^T Z_s^2\, ds\right).$$
Therefore
$$E(Y_t^2) + E\left(\int_t^T Z_s^2\, ds\right) = E(\xi^2) + 2\int_t^T E(Y_s f(s, Y_s, Z_s))\, ds. \qquad (16.10)$$

On the other hand, taking into account the initial assumptions on f, we get
$$2E(|Y_s f(s, Y_s, Z_s)|) \le 2E\big(|Y_s|\,|f(s, Y_s, Z_s) - f(s, 0, 0)|\big) + 2E\big(|Y_s|\,|f(s, 0, 0)|\big)$$
$$\le 2K\, E\big(|Y_s|(|Y_s| + |Z_s|)\big) + 2E\big(|Y_s|\,|f(s, 0, 0)|\big) \le c\big(E(|Y_s|^2) + E(|Y_s Z_s|) + E(|Y_s|\,|f(s, 0, 0)|)\big),$$
for a certain constant c only depending on T and K. Using
$$ab \le \frac{a^2 + b^2}{2}, \qquad a^2 = c|Y_s|^2, \qquad b^2 = \frac{|Z_s|^2}{c},$$
it yields
$$E(|Y_s Z_s|) \le \frac12\, E\left(c Y_s^2 + \frac{Z_s^2}{c}\right).$$
So
$$2E(|Y_s f(s, Y_s, Z_s)|) \le c\left(E(|Y_s|^2) + \frac{c}{2}\, E(|Y_s|^2) + \frac{1}{2c}\, E(|Z_s|^2) + \frac12\, E(f(s, 0, 0)^2) + \frac12\, E(|Y_s|^2)\right)$$
$$= C_0\big(E(|Y_s|^2) + E(f(s, 0, 0)^2)\big) + \frac12\, E(|Z_s|^2),$$
where C_0 is a constant only depending on K, T. Coming back to (16.10), we obtain
$$E(Y_t^2) + E\left(\int_t^T Z_s^2\, ds\right) \le E(\xi^2) + C_0\, E\left(\int_t^T Y_s^2\, ds + \int_t^T f(s, 0, 0)^2\, ds\right) + \frac12\, E\left(\int_t^T Z_s^2\, ds\right).$$
Therefore, since E(Y_t²) < ∞, t ∈ [0, T],
$$E(Y_t^2) + \frac12\, E\left(\int_t^T Z_s^2\, ds\right) \le E(\xi^2) + C_0 \int_t^T E(Y_s^2)\, ds + C_0\, E\left(\int_0^T f(s, 0, 0)^2\, ds\right). \qquad (16.11)$$
So, by Gronwall's lemma (applied backwardly, with g(t) = E(Y_{T−t}²)), there is a constant C_1 such that
$$E(Y_t^2) \le C_1\left(E(\xi^2) + E\left(\int_0^T f(s, 0, 0)^2\, ds\right)\right), \quad t \in [0, T]. \qquad (16.12)$$

Combining (16.11) and (16.12) provides the final statement, but with the supremum outside the expectation, i.e.
$$\sup_{0 \le t \le T} E(|Y_t|^2) + E\left(\int_0^T |Z_t|^2\, dt\right) \le c\, E\left(|\xi|^2 + \int_0^T |f(t, 0, 0)|^2\, dt\right). \qquad (16.13)$$
The result follows by (16.7), Fubini's theorem and (16.13).

We go on preparing the main theorem of existence and uniqueness.
We denote B² := L²(W)_T × L²(W)_T. We introduce a map Φ : B² → C_F([0, T]) × L²(W)_T: to each couple (U, V) ∈ B² we associate the following couple (Y, Z) of processes. We first define the continuous version of the square integrable martingale
$$M_t := E\left(\xi + \int_0^T f(s, U_s, V_s)\, ds \,\Big|\, F_t\right), \quad t \in [0, T].$$
The r.v. $\xi + \int_0^T f(s, U_s, V_s)\, ds$ is square integrable, since ξ is and (U, V) ∈ B². Observe in particular that
$$\int_0^T f(s, U_s, V_s)^2\, ds \le 2\int_0^T f(s, 0, 0)^2\, ds + 4K^2 \int_0^T (U_s^2 + V_s^2)\, ds. \qquad (16.14)$$
We recall that, by the Brownian martingale representation theorem, there is Z ∈ L²(W)_T such that
$$M_t = E\left(\xi + \int_0^T f(s, U_s, V_s)\, ds\right) + \int_0^t Z_r\, dW_r, \quad t \in [0, T]. \qquad (16.15)$$
We also set
$$Y_t = M_t - \int_0^t f(s, U_s, V_s)\, ds, \quad t \in [0, T], \qquad (16.16)$$
which is a continuous process. In particular Y_T = ξ.

Lemma 16.9 (i) Φ maps a couple of processes (U, V) ∈ B² into a couple (Y, Z) ∈ B².

(ii) E(sup_{t≤T} Y_t²) < ∞ for every (U, V) ∈ B².

(iii) (Y, Z) = Φ(U, V) if and only if
$$Y_t = \xi - \int_t^T Z_r\, dW_r + \int_t^T f(s, U_s, V_s)\, ds, \quad t \in [0, T]. \qquad (16.17)$$

(iv) (Y, Z) is a fixed point of Φ if and only if (Y, Z) is a solution of BSDE(ξ, f).

Proof. Let (U, V) ∈ B². Z ∈ L²(W)_T by construction. About Y, item (i) comes from (ii). Concerning (ii), since M is a square integrable martingale, taking into account Doob's inequality and (16.14), we have
$$E\left(\sup_{t \le T} Y_t^2\right) \le 8E(M_T^2) + 2T\, E\left(\int_0^T f(s, U_s, V_s)^2\, ds\right) < \infty.$$
Concerning item (iii), if (Y, Z) = Φ(U, V) we have
$$Y_t = Y_T - \int_t^T Z_r\, dW_r + \int_t^T f(r, U_r, V_r)\, dr = \xi - \int_t^T Z_r\, dW_r + \int_t^T f(r, U_r, V_r)\, dr, \quad t \in [0, T].$$
If, vice versa, (Y, Z) is an element of B² fulfilling the previous equality, then, taking the conditional expectation with respect to F_t, we get
$$Y_t = E\left(\xi + \int_0^T f(r, U_r, V_r)\, dr \,\Big|\, F_t\right) - \int_0^t f(r, U_r, V_r)\, dr = M_t - \int_0^t f(r, U_r, V_r)\, dr,$$
where
$$M_t = E\left(\xi + \int_0^T f(r, U_r, V_r)\, dr \,\Big|\, F_t\right) \quad \text{a.s.},$$
which says that Φ(U, V) = (Y, Z). This shows (iii). (iv) is a direct consequence of (iii).

Theorem 16.10 Under Assumption 16.1, BSDE(ξ, f) admits a unique solution (Y, Z) in B².

Proof. According to Lemma 16.9, we need to show the existence of a unique fixed point for the map Φ. Let γ ≥ 0. Let (U, V), (U′, V′) ∈ B² and (Y, Z) = Φ(U, V), (Y′, Z′) = Φ(U′, V′). We set Ū = U − U′, V̄ = V − V′, Ȳ = Y − Y′, Z̄ = Z − Z′. Applying Itô's formula to e^{γs}Ȳ_s² between t and T (and using Ȳ_T = 0), we get
$$-e^{\gamma t}\bar Y_t^2 = 2\int_t^T e^{\gamma s}\bar Y_s\, d\bar Y_s + \gamma\int_t^T e^{\gamma s}\bar Y_s^2\, ds + \int_t^T e^{\gamma s}\, d[\bar Y]_s$$
$$= 2\int_t^T e^{\gamma s}\bar Y_s \bar Z_s\, dW_s - 2\int_t^T e^{\gamma s}\bar Y_s\big(f(s, U_s, V_s) - f(s, U_s', V_s')\big)\, ds + \gamma\int_t^T e^{\gamma s}\bar Y_s^2\, ds + \int_t^T e^{\gamma s}\bar Z_s^2\, ds. \qquad (16.18)$$
By Lemma 16.9 (ii) we have E(sup_{t≤T} Ȳ_t²) < ∞, so by Lemma 16.7 $\int_0^\cdot e^{\gamma s}\bar Y_s \bar Z_s\, dW_s$ is a martingale. So, taking the expectation in (16.18) and taking into account the Lipschitz condition, we get
$$e^{\gamma t} E(\bar Y_t^2) + E\left(\int_t^T e^{\gamma s}(\gamma \bar Y_s^2 + \bar Z_s^2)\, ds\right) \le 2K\, E\int_t^T e^{\gamma s}|\bar Y_s|(|\bar U_s| + |\bar V_s|)\, ds$$
$$\le 4K^2\, E\left(\int_t^T e^{\gamma s}|\bar Y_s|^2\, ds\right) + \frac14\, E\left(\int_t^T e^{\gamma s}(|\bar U_s| + |\bar V_s|)^2\, ds\right),$$
making use of the inequality
$$2ab \le 4K a^2 + \frac{b^2}{4K}$$
with a = |Ȳ_s|, b = |Ū_s| + |V̄_s|.

We choose γ = 1 + 4K², hence
$$E\left(\int_0^T e^{\gamma s}(|\bar Y_s|^2 + |\bar Z_s|^2)\, ds\right) \le \frac12\, E\left(\int_0^T e^{\gamma s}(|\bar U_s|^2 + |\bar V_s|^2)\, ds\right),$$
from which it follows that Φ is a strict contraction on B² equipped with the norm ‖·‖_γ, equivalent to the initial norm, where
$$\|(Y, Z)\|_\gamma^2 = E\left(\int_0^T e^{\gamma s}(|Y_s|^2 + |Z_s|^2)\, ds\right),$$
for γ = 1 + 4K². Then, by the Banach fixed point theorem, Φ has a unique fixed point and the theorem is proved.
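The contraction argument also suggests numerical schemes. Below is a minimal sketch of the standard backward Euler scheme with regression-based conditional expectations (all choices here, the driver f(t, y, z) = −ry, the terminal condition ξ = W_T², and the polynomial regression basis, are illustrative assumptions of ours). For this linear driver the exact value is Y_0 = e^{−rT} E(ξ) = e^{−rT} T, which the scheme should approximate.

```python
import numpy as np

# Backward Euler + least-squares regression for the BSDE
#   Y_t = xi + int_t^T f(s, Y_s, Z_s) ds - int_t^T Z_s dW_s,
# with f(t, y, z) = -r*y and xi = W_T^2 (illustrative choices).
rng = np.random.default_rng(4)
T, r = 1.0, 0.3
n_steps, n_paths, deg = 50, 100_000, 4
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), size=(n_steps, n_paths))
W = np.vstack([np.zeros(n_paths), np.cumsum(dW, axis=0)])   # W at grid times

Y = W[-1] ** 2                                  # terminal condition xi
for i in range(n_steps - 1, -1, -1):
    basis = np.vander(W[i], deg + 1)            # regress on powers of W_{t_i}
    # Z_{t_i} is estimated as E[Y_{t_{i+1}} dW_i | F_{t_i}] / dt;
    # it is not needed by this particular driver f = -r*y.
    coef_z, *_ = np.linalg.lstsq(basis, Y * dW[i] / dt, rcond=None)
    Z = basis @ coef_z
    # explicit backward step: Y_{t_i} = E[Y_{t_{i+1}}|F_{t_i}] + f dt
    coef_y, *_ = np.linalg.lstsq(basis, Y, rcond=None)
    Y_cond = basis @ coef_y
    Y = Y_cond + (-r * Y_cond) * dt

print("numerical Y_0:", Y.mean())
print("exact     Y_0:", np.exp(-r * T) * T)
```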

We now want to estimate the difference between two solutions in terms of the difference between the data. We are given two final conditions ξ, ξ′ ∈ L²(Ω) and two drivers f, f′, where f′ verifies Assumption 16.1; let (Y, Z) (resp. (Y′, Z′)) be a solution of BSDE(ξ, f) (resp. BSDE(ξ′, f′)). We also suppose that E(sup_{t≤T} Y_t²) + E(sup_{t≤T} (Y′_t)²) < ∞.

Theorem 16.11 There exists a constant c, which depends on the Lipschitz constant K′ of f′ and on T, such that
$$E\left(\sup_{t \le T} |Y_t - Y_t'|^2 + \int_0^T |Z_s - Z_s'|^2\, ds\right) \le c\, E\left(|\xi - \xi'|^2 + \int_0^T |f(s, Y_s, Z_s) - f'(s, Y_s, Z_s)|^2\, ds\right).$$

Proof. We use Itô's formula to develop the increment of |Y_s − Y_s′|² between s = t and s = T, yielding
$$|Y_t - Y_t'|^2 + \int_t^T |Z_s - Z_s'|^2\, ds = |\xi - \xi'|^2 + 2\int_t^T (Y_s - Y_s')\big(f(s, Y_s, Z_s) - f'(s, Y_s', Z_s')\big)\, ds - 2(M_T - M_t), \qquad (16.19)$$

where $M_t = \int_0^t (Y_s - Y_s')(Z_s - Z_s')\, dW_s$ is a martingale by Lemma 16.7. We remark that
$$2(Y_s - Y_s')\big(f(s, Y_s, Z_s) - f'(s, Y_s', Z_s')\big) \le 2|Y_s - Y_s'|\big(|f(s, Y_s, Z_s) - f'(s, Y_s, Z_s)| + K'|Z_s - Z_s'|\big) + 2K'|Y_s - Y_s'|^2$$
$$\le |Y_s - Y_s'|^2 + |f(s, Y_s, Z_s) - f'(s, Y_s, Z_s)|^2 + 2K'^2|Y_s - Y_s'|^2 + \frac12\, |Z_s - Z_s'|^2 + 2K'|Y_s - Y_s'|^2, \qquad (16.20)$$
where we have used successively the inequalities 2ab ≤ a² + b² and 2ab ≤ 2a²K′ + b²/(2K′), with
a = |Y_s − Y_s′|, b = |f(s, Y_s, Z_s) − f′(s, Y_s, Z_s)|, and a = |Y_s − Y_s′|, b = |Z_s − Z_s′|.

Hence, taking the expectation in (16.19) and taking (16.20) into account, we obtain
$$E\left(|Y_t - Y_t'|^2 + \frac12 \int_t^T |Z_s - Z_s'|^2\, ds\right) \le E(|\xi - \xi'|^2) + E\int_t^T \big(f(s, Y_s, Z_s) - f'(s, Y_s, Z_s)\big)^2\, ds + (1 + 2K' + 2K'^2)\, E\int_t^T |Y_s - Y_s'|^2\, ds. \qquad (16.21)$$
So, by Gronwall's lemma (applied backwardly, with g(t) = E(|Y_{T−t} − Y′_{T−t}|²)), there is a constant C_1 such that
$$E(|Y_t - Y_t'|^2) \le C_1\, E\left(|\xi - \xi'|^2 + \int_0^T |f(s, Y_s, Z_s) - f'(s, Y_s, Z_s)|^2\, ds\right), \quad t \in [0, T]. \qquad (16.22)$$
Combining (16.21) and the previous inequality provides the final statement, but with the supremum outside the expectation, i.e.
$$\sup_{0 \le t \le T} E(|Y_t - Y_t'|^2) + E\left(\int_0^T |Z_t - Z_t'|^2\, dt\right) \le C_2\, E\left(|\xi - \xi'|^2 + \int_0^T |f(s, Y_s, Z_s) - f'(s, Y_s, Z_s)|^2\, ds\right). \qquad (16.23)$$

On the other hand,
$$Y_t - Y_t' = Y_0 - Y_0' - \int_0^t \ell(s)\, ds - \int_0^t \big(f'(s, Y_s, Z_s) - f'(s, Y_s', Z_s')\big)\, ds + \int_0^t (Z_s - Z_s')\, dW_s, \qquad (16.24)$$
where
$$\ell(s) = f(s, Y_s, Z_s) - f'(s, Y_s, Z_s).$$
By Doob's inequality we get
$$E\left(\sup_{t \le T} |Y_t - Y_t'|^2\right) \le 4|Y_0 - Y_0'|^2 + 4T\, E\int_0^T \ell(s)^2\, ds + 8K'^2 T\, E\int_0^T (Z_s - Z_s')^2\, ds + 8K'^2 T\, E\int_0^T (Y_s - Y_s')^2\, ds + 16\, E\left(\int_0^T (Z_s - Z_s')^2\, ds\right).$$
The previous expression, together with (16.23) and Fubini's theorem, provides the final statement.

We continue in the same set-up and prove a comparison theorem.

Theorem 16.12 Let ξ, ξ′ be two square integrable r.v.'s such that ξ ≤ ξ′ a.s. Let f, f′ fulfill Assumption 16.1, with f(t, y, z) ≤ f′(t, y, z) dt ⊗ dP a.e. Let Y (resp. Y′) be the solution of BSDE(ξ, f) (resp. BSDE(ξ′, f′)). We have the following.

(i) Y_s ≤ Y′_s a.s. for all s ∈ [0, T].

(ii) If moreover Y_0 = Y′_0, then Y_t = Y′_t a.s.

(iii) Whenever P{ξ < ξ′} > 0, or f(t, Y′_t, Z′_t) < f′(t, Y′_t, Z′_t) on a set of positive dt ⊗ dP measure, then Y_0 < Y′_0.

Proof. We define
$$\alpha_t = \begin{cases} \dfrac{f(t, Y_t', Z_t') - f(t, Y_t, Z_t')}{Y_t' - Y_t}, & \text{if } Y_t \ne Y_t', \\[2mm] 0, & \text{if } Y_t = Y_t'. \end{cases}$$
We also define the process (β_t)_{0≤t≤T} by
$$\beta_t = \begin{cases} \dfrac{f(t, Y_t, Z_t') - f(t, Y_t, Z_t)}{Z_t' - Z_t}, & \text{if } Z_t \ne Z_t', \\[2mm] 0, & \text{if } Z_t = Z_t'. \end{cases}$$
We note that (α_t)_{0≤t≤T} and (β_t)_{0≤t≤T} are measurable adapted processes, with |α| ≤ K, |β| ≤ K.

For 0 ≤ s ≤ t ≤ T, let
$$\Gamma_{s,t} := \exp\left(\int_s^t \left(\alpha_r - \frac{|\beta_r|^2}{2}\right) dr + \int_s^t \beta_r\, dW_r\right).$$
Define
$$(\bar Y_t, \bar Z_t) = (Y_t' - Y_t, Z_t' - Z_t), \qquad \bar\xi = \xi' - \xi, \qquad U_t = f'(t, Y_t', Z_t') - f(t, Y_t', Z_t').$$
Then (Ȳ, Z̄) solves the linear BSDE
$$\bar Y_t = \bar\xi + \int_t^T (\alpha_s \bar Y_s + \beta_s \bar Z_s)\, ds + \int_t^T U_s\, ds - \int_t^T \bar Z_s\, dW_s. \qquad (16.25)$$
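Indeed (a one-line check of (16.25)): subtracting the two BSDEs and decomposing the difference of the drivers,
$$f'(s, Y_s', Z_s') - f(s, Y_s, Z_s) = \underbrace{f'(s, Y_s', Z_s') - f(s, Y_s', Z_s')}_{= U_s} + \underbrace{f(s, Y_s', Z_s') - f(s, Y_s, Z_s')}_{= \alpha_s \bar Y_s} + \underbrace{f(s, Y_s, Z_s') - f(s, Y_s, Z_s)}_{= \beta_s \bar Z_s}.$$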

We state a lemma.

Lemma 16.13 For 0 ≤ s < t ≤ T we have
$$\bar Y_s = \Gamma_{s,t}\, \bar Y_t + \int_s^t \Gamma_{s,r} U_r\, dr - \int_s^t \Gamma_{s,r}(\bar Z_r + \bar Y_r \beta_r)\, dW_r,$$
$$\bar Y_s = E\left(\Gamma_{s,t}\, \bar Y_t + \int_s^t \Gamma_{s,r} U_r\, dr \,\Big|\, F_s\right).$$

Proof (of Lemma 16.13).
By an easy application of Itô's formula, we can show
$$\Gamma_{s,t} = 1 + \int_s^t \Gamma_{s,r}(\beta_r\, dW_r + \alpha_r\, dr). \qquad (16.26)$$
Then we apply integration by parts to Γ_{s,t}Ȳ_t, using (16.25) and (16.26). (16.25) gives
$$\bar Y_t = \bar Y_s - \int_s^t (\alpha_r \bar Y_r + \beta_r \bar Z_r)\, dr - \int_s^t U_r\, dr + \int_s^t \bar Z_r\, dW_r.$$

Consequently
$$\Gamma_{s,t}\bar Y_t = \underbrace{\Gamma_{s,s}}_{=1}\bar Y_s + \int_s^t \Gamma_{s,r}\, d\bar Y_r + \int_s^t \bar Y_r\, d\Gamma_{s,r} + [\bar Y, \Gamma_{s,\cdot}]_t$$
$$= \bar Y_s + \int_s^t \Gamma_{s,r}(-\alpha_r \bar Y_r - \beta_r \bar Z_r)\, dr - \int_s^t \Gamma_{s,r} U_r\, dr + \int_s^t \Gamma_{s,r}\bar Z_r\, dW_r + \int_s^t \bar Y_r \Gamma_{s,r}(\beta_r\, dW_r + \alpha_r\, dr) + \int_s^t \Gamma_{s,r}\beta_r \bar Z_r\, dr$$
$$= \bar Y_s - \int_s^t \Gamma_{s,r} U_r\, dr + \int_s^t \Gamma_{s,r}(\bar Z_r + \beta_r \bar Y_r)\, dW_r,$$
which implies the first equality. The second equality of the statement follows by taking the conditional expectation of the first expression with respect to F_s.
Indeed, the local martingale
$$M_t := \int_s^t \Gamma_{s,r}(\bar Z_r + \bar Y_r \beta_r)\, dW_r$$
is a martingale by Lemma 16.7. Indeed, first, β is bounded and so $E\left(\int_0^T (\bar Z_r + \bar Y_r \beta_r)^2\, dr\right) < \infty$. Moreover
$$\Gamma_{s,t} \le \exp(T\|\alpha\|_\infty)\, \tilde\Gamma_{s,t}, \quad s \le t \le T, \qquad (16.27)$$
where
$$\tilde\Gamma_{s,t} = \exp\left(-\int_s^t \frac{|\beta_r|^2}{2}\, dr + \int_s^t \beta_r\, dW_r\right), \quad s \le t \le T,$$
which is a martingale because of the Novikov criterion. Indeed
$$E\left(\exp\left(\int_0^T \frac{|\beta_r|^2}{2}\, dr\right)\right) < \infty$$
holds because β is bounded. Now $(\tilde\Gamma_{s,t})_t$ is also square integrable. Indeed
$$\tilde\Gamma_{s,t}^2 = \exp\left(\int_s^t \beta_r^2\, dr\right) L_{s,t},$$

with
$$L_{s,t} = \exp\left(-\int_s^t \frac{|2\beta_r|^2}{2}\, dr + \int_s^t 2\beta_r\, dW_r\right),$$
which is a martingale, again by the Novikov criterion. So $(\tilde\Gamma_{s,t})_t$ is a square integrable martingale and
$$E\left(\sup_{s \le t \le T} \Gamma_{s,t}^2\right) \le \exp(2T\|\alpha\|_\infty)\, E\left(\sup_{s \le t \le T} \tilde\Gamma_{s,t}^2\right),$$
which is finite by Doob's inequality.

Proof of Theorem 16.12.
ξ̄ and U are by assumption non-negative. Item (i) of Theorem 16.12 follows setting t = T in the second equality of the Lemma 16.13 statement. Item (ii) follows setting s = 0 in the same equality and using the strict positivity of Γ_{s,t}. Item (iii) is a consequence of the strict positivity of Γ_{s,t} and of Lemma 16.13 with t = T and s = 0.

The theory of forward-backward SDEs is an important chapter of stochastic analysis, connecting deterministic PDEs and the world of stochastic processes. In the introduction to the topic we have shown that classical solutions of parabolic PDEs produce solutions of an FBSDE. The previous comparison theorem is useful to show that, whenever a forward-backward SDE is well-posed for every initial time and initial condition, its solutions produce solutions of PDEs in a generalized sense, called viscosity solutions.

References

[1] C. D. Aliprantis and K. C. Border. Infinite-dimensional analysis.


Springer-Verlag, Berlin, second edition, 1999. A hitchhiker’s guide.

[2] R. Bafico and P. Baldi. Small random perturbations of Peano phenom-


ena. Stochastics, 6(3-4):279–292, 1981/82.

[3] M. T. Barlow and M. Yor. Semimartingale inequalities via the Garsia-


Rodemich-Rumsey lemma, and applications to local times. J. Funct.
Anal., 49(2):198–229, 1982.

[4] Patrick Billingsley. Convergence of probability measures. Wiley Series
in Probability and Statistics: Probability and Statistics. John Wiley
& Sons Inc., New York, second edition, 1999. A Wiley-Interscience
Publication.

[5] E. Çinlar, J. Jacod, P. Protter, and M. J. Sharpe. Semimartingales and


Markov processes. Z. Wahrsch. Verw. Gebiete, 54(2):161–219, 1980.

[6] N. El Karoui, S. Peng, and M. C. Quenez. Backward stochastic differ-


ential equations in finance. Math. Finance, 7(1):1–71, 1997.

[7] H. J. Engelbert and W. Schmidt. On solutions of one-dimensional


stochastic differential equations without drift. Z. Wahrsch. Verw. Gebi-
ete, 68(3):287–314, 1985.

[8] F. Flandoli, M. Gubinelli, and E. Priola. Flow of diffeomorphisms


for SDEs with unbounded Hölder continuous drift. Bull. Sci. Math.,
134(4):405–422, 2010.

[9] F. Flandoli, E. Issoglio, and F. Russo. Multidimensional stochastic dif-


ferential equations with distributional drift. Trans. Amer. Math. Soc.,
369(3):1665–1688, 2017.

[10] F. Flandoli, F. Russo, and J. Wolf. Some SDEs with distributional drift.
I. General calculus. Osaka J. Math., 40(2):493–542, 2003.

[11] F. Flandoli, F. Russo, and J. Wolf. Some SDEs with distributional drift.
II. Lyons-Zheng structure, Itô’s formula and semimartingale character-
ization. Random Oper. Stochastic Equations, 12(2):145–184, 2004.

[12] H. Föllmer. Calcul d’Itô sans probabilités. In Seminar on Probability,


XV (Univ. Strasbourg, Strasbourg, 1979/1980) (French), volume 850 of
Lecture Notes in Math., pages 143–150. Springer, Berlin, 1981.

[13] H. Föllmer. Dirichlet processes. In Stochastic integrals (Proc. Sympos.,


Univ. Durham, Durham, 1980), volume 851 of Lecture Notes in Math.,
pages 476–478. Springer, Berlin, 1981.

[14] Ĭ. Ī. Gı̄hman and A. V. Skorohod. The theory of stochastic processes. I,
volume 210 of Grundlehren der Mathematischen Wissenschaften [Fun-
damental Principles of Mathematical Sciences]. Springer-Verlag, Berlin,
english edition, 1980. Translated from the Russian by Samuel Kotz.

[15] T. Hida, H.-H. Kuo, J. Potthoff, and L. Streit. White noise, volume
253 of Mathematics and its Applications. Kluwer Academic Publishers
Group, Dordrecht, 1993. An infinite-dimensional calculus.

[16] Kiyosi Itô and Shinzo Watanabe. Introduction to stochastic differential


equations. In Proceedings of the International Symposium on Stochas-
tic Differential Equations (Res. Inst. Math. Sci., Kyoto Univ., Kyoto,
1976), pages i–xxx, New York, 1978. Wiley.

[17] I. Karatzas and S. E. Shreve. Brownian motion and stochastic calculus,


volume 113 of Graduate Texts in Mathematics. Springer-Verlag, New
York, second edition, 1991.

[18] A. N. Kolmogorov and S. V. Fomı̄n. Introductory real analysis. Dover


Publications Inc., New York, 1975. Translated from the second Russian
edition and edited by Richard A. Silverman, Corrected reprinting.

[19] N. V. Krylov. Controlled diffusion processes, volume 14 of Stochastic


Modelling and Applied Probability. Springer-Verlag, Berlin, 2009. Trans-
lated from the 1977 Russian original by A. B. Aries, Reprint of the 1980
edition.

[20] N. V. Krylov and M. Röckner. Strong solutions of stochastic equa-


tions with singular time dependent drift. Probab. Theory Related Fields,
131(2):154–196, 2005.

[21] H. Kunita. Stochastic differential equations and stochastic flows of dif-


feomorphisms. In École d’été de probabilités de Saint-Flour, XII—1982,
volume 1097 of Lecture Notes in Math., pages 143–303. Springer, Berlin,
1984.

[22] D. Lamberton and B. Lapeyre. Introduction au calcul stochastique ap-


pliqué à la finance. Ellipses Édition Marketing, Paris, second edition,
1997.

[23] P. Mathieu. Zero white noise limit through Dirichlet forms, with appli-
cation to diffusions in a random medium. Probab. Theory Related Fields,
99(4):549–580, 1994.

[24] P. Mathieu. Limit theorems for diffusions with a random potential.


Stochastic Process. Appl., 60(1):103–111, 1995.

[25] I. P. Natanson. Theory of functions of a real variable. Frederick Ungar


Publishing Co., New York, 1955. Translated by Leo F. Boron with the
collaboration of Edwin Hewitt.

[26] B. Øksendal. Stochastic differential equations. Universitext. Springer-


Verlag, Berlin, sixth edition, 2003. An introduction with applications.

[27] E. Pardoux. Backward stochastic differential equations and viscosity


solutions of systems of semilinear parabolic and elliptic PDEs of sec-
ond order. In Stochastic analysis and related topics, VI (Geilo, 1996),
volume 42 of Progr. Probab., pages 79–127. Birkhäuser Boston, Boston,
MA, 1998.

[28] Étienne Pardoux and Aurel Rascanu. Stochastic differential equations,


Backward SDEs, Partial differential equations. Springer, 2014.

[29] Ph. Protter. Stochastic integration and differential equations, volume 21


of Applications of Mathematics (New York). Springer-Verlag, Berlin,
1990. A new approach.

[30] D. Revuz and M. Yor. Continuous martingales and Brownian motion,


volume 293 of Grundlehren der Mathematischen Wissenschaften [Fun-
damental Principles of Mathematical Sciences]. Springer-Verlag, Berlin,
third edition, 1999.

[31] W. Rudin. Real and complex analysis. McGraw-Hill Book Co., New
York, third edition, 1987.

[32] F. Russo and G. Trutnau. Some parabolic PDEs whose drift is an


irregular random noise in space. Ann. Probab., 35(6):2213–2262, 2007.

[33] P. Seignourel. Discrete schemes for processes in random media. Probab.


Theory Related Fields, 118(3):293–322, 2000.

[34] Ch. Stricker. Quasimartingales, martingales locales, semimartingales et
filtration naturelle. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete,
39(1):55–63, 1977.

[35] Daniel W. Stroock and S. R. Srinivasa Varadhan. Multidimensional


diffusion processes, volume 233 of Grundlehren der Mathematischen
Wissenschaften [Fundamental Principles of Mathematical Sciences].
Springer-Verlag, Berlin, 1979.

[36] A. Yu. Veretennikov. On strong solutions and explicit formulas for solutions of stochastic integral equations. Sbornik: Mathematics, 39(3):387–403, 1981.

[37] Shinzo Watanabe and Toshio Yamada. On the uniqueness of solutions of


stochastic differential equations. II. J. Math. Kyoto Univ., 11:553–563,
1971.

[38] Marc Yor. Some aspects of Brownian motion. Part II. Lectures in
Mathematics ETH Zürich. Birkhäuser Verlag, Basel, 1997. Some recent
martingale problems.

