
An Introductory Course in Stochastic

Processes
Stéphane Crépey, University of Evry, France
(after Tomasz R. Bielecki, Illinois Institute of Technology, Chicago)
February 13, 2012
This is an introductory course in stochastic processes. Its purpose is to introduce
students to a range of stochastic processes which are used as modeling tools in diverse fields
of application, especially in risk management applications for finance and insurance. In
addition, students will be introduced to some basic stochastic analysis.
The course introduces the most fundamental ideas in the area of modeling and anal-
ysis of real world phenomena in terms of stochastic processes. It covers different classes
of Markov processes: discrete and continuous-time Markov chains, Brownian motion and
diffusion processes. It also presents some aspects of stochastic calculus with emphasis on
the application to financial and insurance modeling, as well as financial engineering.
Main references
1. Introduction to Stochastic Processes, Gregory F. Lawler. Chapman & Hall, old version
1996 (new version 2004).
2. Elementary Stochastic Calculus with Finance in View, Thomas Mikosch. World Scientific,
1998 or later.
3. Stochastic Calculus for Finance II: Continuous-Time Models, Steven E. Shreve. Springer,
2004 or later.
Other references
4. Stochastic Differential Equations and Diffusion Processes, N. Ikeda and S. Watanabe.
Second edition, North-Holland, 1989.
5. Stochastic Integration and Differential Equations, Philip E. Protter. Second edition,
Springer, 2004.
Sections marked with an asterisk (∗) correspond to more advanced material that can be
skipped at first reading.
Contents
I Some classes of discrete-time stochastic processes 7
1 Discrete-time stochastic processes 9
1.1 Conditional expectations and filtrations . . . . . . . . . . . . . . . . . . . . . 9
1.1.1 Main properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Discrete-time Markov chains 13
2.1 Motivation and construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.1 An introductory example . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.2 Definitions and examples . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2 Chapman-Kolmogorov equations . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.1 Long-range behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3 Discrete-time martingales [Lawler, Chapter 5; Mikosch, Section 1.4; Shreve,
Chapter 2 and Section 3.2] 21
3.1 Definitions and examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2 Doob-Meyer decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3 Stopping times and optional stopping theorem . . . . . . . . . . . . . . . . . . 28
3.3.1 Applications to random walks . . . . . . . . . . . . . . . . . . . . . . . 30
3.4 Uniform integrability and martingales . . . . . . . . . . . . . . . . . . . . . . 34
3.5 Martingale convergence theorems . . . . . . . . . . . . . . . . . . . . . . . . . 35
II Some classes of continuous-time stochastic processes 37
4 Continuous-time stochastic processes 39
4.1 Generalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Continuous-time martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2.1 Optional Stopping Theorem . . . . . . . . . . . . . . . . . . . . . . . . 39
5 Continuous-time Markov chains [Lawler, Chapter 3] 41
5.1 Poisson process [Shreve, Sections 11.2 and 11.3] . . . . . . . . . . . . . . . . . 43
5.2 Two-state continuous-time Markov chain . . . . . . . . . . . . . . . . . . . . 46
5.3 Birth-and-death Process [Lawler, Section 3.3.] . . . . . . . . . . . . . . . . . . 47
6 Brownian motion [Lawler, Chapter 8; Mikosch, Section 1.3; Shreve, Sec-
tions 3.3–3.7] 51
6.1 Definition and basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.1.1 Random walk approximation . . . . . . . . . . . . . . . . . . . . . . . 52
6.1.2 Second order properties . . . . . . . . . . . . . . . . . . . . . . . . . . 53
6.2 Markov properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.3 Martingale methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
6.3.1 Martingales associated with Brownian motion . . . . . . . . . . . . . . 58
6.3.2 Exit time from a corridor . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.3.3 Laplace Transform of the first passage time of a drifted Brownian motion 61
6.4 Geometric Brownian motion [see also Mikosch, Example 1.3.8] . . . . . . . . . 63
III Elements of stochastic analysis 65
7 Stochastic integration [Lawler, Chapter 9; Mikosch, Chapter 2; Shreve,
Sections 4.2 and 4.3 ] 67
7.1 Integration with respect to symmetric random walk . . . . . . . . . . . . . . . 67
7.2 The Itô stochastic integral for simple processes . . . . . . . . . . . . . . . . . 68
7.3 The general Itô stochastic integral . . . . . . . . . . . . . . . . . . . . . . . . 70
7.4 Stochastic Integral with respect to a Poisson process . . . . . . . . . . . . . . 72
7.5 Semimartingale Integration Theory [See Protter] (∗) . . . . . . . . . . . . . . 73
8 Itô formula [Mikosch, Chapter 2; Shreve, Section 4.4] 77
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
8.1.1 What about ∫_0^t W_s dW_s ? . . . . . . . . . . . . . . . . . . . . . . . 78
8.1.2 What about ∫_0^t N_{s−} dN_s ? . . . . . . . . . . . . . . . . . . . . . . 78
8.2 Itô formulas for continuous processes . . . . . . . . . . . . . . . . . . . . . . . 78
8.2.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.3 Itô formulas relative to jump processes [See Ikeda and Watanabe] (∗) . . . . . 80
8.3.1 Brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
9 Stochastic differential equations (SDEs) [Mikosch, Chapter 3; Shreve, Sec-
tion 6.2] 85
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
9.2 Diusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
9.2.1 SDEs for diusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
9.2.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
9.3 Solving diusion SDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
9.4 SDEs Driven by a Poisson Process . . . . . . . . . . . . . . . . . . . . . . . . 91
9.5 Jump-Diffusions [See Ikeda and Watanabe] (∗) . . . . . . . . . . . . . . . . . 93
10 Girsanov transformations 95
10.1 Girsanov transformation relative to Gaussian distributions . . . . . . . . . . . 95
10.1.1 Gaussian random variables . . . . . . . . . . . . . . . . . . . . . . . . 95
10.1.2 Brownian motion [Mikosch, Section 4.2; Shreve, Sections 1.6 and 5.2.1] 96
10.2 Girsanov transformation relative to Poisson distributions . . . . . . . . . . . . 97
10.2.1 Poisson random variables . . . . . . . . . . . . . . . . . . . . . . . . . 97
10.2.2 Poisson process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
10.3 Girsanov transformation relative to both Brownian motion and Poisson process 98
10.4 Abstract Bayes formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
11 Feynman-Kac formulas (∗) 101
11.1 Linear case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
11.2 Backward Stochastic Differential Equations . . . . . . . . . . . . . . . . . . . 102
11.2.1 Non-linear Feynman-Kac formula . . . . . . . . . . . . . . . . . . . . . 103
11.2.2 Optimal stopping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Part I
Some classes of discrete-time
stochastic processes
Chapter 1
Discrete-time stochastic processes
1.1 Conditional expectations and filtrations
In this section we discuss the notions of conditional expectations and filtrations, which are
key in the study of stochastic processes.
Definition 1.1. Let Y and X_1, ..., X_n be random variables. The conditional expectation
E(Y | X_1, ..., X_n) is a random variable characterized by two properties:

1. The value of E(Y | X_1, ..., X_n) depends only on the values of X_1, ..., X_n, i.e., we can write
E(Y | X_1, ..., X_n) = g(X_1, ..., X_n) for some function g. If a random variable can be written as
a function of X_1, ..., X_n, it is called measurable with respect to X_1, ..., X_n.

2. Suppose A is any event that depends only on X_1, ..., X_n. Let 1_A denote the indicator
function of A, i.e., the random variable which equals 1 if A occurs and 0 otherwise. Then

E(Y 1_A) = E( E(Y | X_1, ..., X_n) 1_A ).    (1.1)
Let (Ω, P) be the underlying probability space. That is, Ω is the set of elementary
events ω and P is the probability defined on Ω. Every random variable considered here is a
function of ω. Thus, the quantity E(Y | X_1, ..., X_n)(ω) is a value of E(Y | X_1, ..., X_n). Sometimes
a slightly less formal, but a bit more convenient, notation is used: suppose ω is such
that X_i(ω) = x_i, i = 1, 2, 3, ..., n. Then the notation E(Y | X_1, ..., X_n)(x_1, x_2, ..., x_n) or
E(Y | X_1 = x_1, X_2 = x_2, ..., X_n = x_n) is used in place of E(Y | X_1, ..., X_n)(ω). Likewise, for the
value of the indicator random variable 1_A, where A is an event depending on X_1, ..., X_n, the
notation 1_{X(A)}(x_1, x_2, ..., x_n) is used instead of 1_A(ω), with the understanding that
X(A) = {(X_1(ω), X_2(ω), ..., X_n(ω)), ω ∈ A}.

Lawler, pp. 85–87 (pp. 101–106), shows how to compute conditional expectations using joint
and marginal distributions of random variables.
Example 1.2. We illustrate the equality (1.1) with an example in which n = 1. Suppose
that Y and X are discrete random variables and A is an event which involves X. (For con-
creteness you may think of X as the value of the first roll and Y the sum of the two rolls of
dice, and A = {X ≤ 2}.) Then indeed we have, using the Bayes formula in the third line,

E( E(Y | X) 1_A ) = Σ_x E(Y | X = x) 1_{X(A)}(x) P(X = x)
= Σ_x [ Σ_y y P(Y = y | X = x) ] 1_{X(A)}(x) P(X = x)
= Σ_x Σ_y y 1_{X(A)}(x) P(Y = y, X = x)
= Σ_{y,x} ( y 1_{X(A)}(x) ) P(Y = y, X = x) = E(Y 1_A).

For the example involving dice, taking X(A) = {x ≤ 2},

E(Y 1_A) = Σ_{y,x} y 1_{x ≤ 2} P(Y = y, X = x)
= Σ_y y P(Y = y, X = 1) + Σ_y y P(Y = y, X = 2)
= (1/36)(2 + 3 + 4 + 5 + 6 + 7) + (1/36)(3 + 4 + 5 + 6 + 7 + 8)
= 27/36 + 33/36 = 5/3

and

E[ E(Y | X) 1_A ] = E[ (X + 3.5) 1_{X ≤ 2} ]
= Σ_{x=1}^{2} x P(X = x) + 3.5 P(X ≤ 2) = 3/6 + 3.5 · (1/3) = 5/3.
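The hand computation above can be replayed exactly in a few lines of Python (an illustration added here, not part of Lawler's text), enumerating the 36 equally likely outcomes of the two dice and using E(Y | X = x) = x + 3.5:

```python
from fractions import Fraction

# Exact check of E(Y 1_A) = E(E(Y|X) 1_A) for two fair dice:
# X = first roll, Y = sum of both rolls, A = {X <= 2}.
outcomes = [(a, b) for a in range(1, 7) for b in range(1, 7)]
p = Fraction(1, 36)  # each outcome is equally likely

# Left-hand side: E(Y 1_A), summing y over outcomes with X <= 2.
lhs = sum(p * (a + b) for a, b in outcomes if a <= 2)

# Right-hand side: E(E(Y|X) 1_A), using E(Y | X = x) = x + 7/2.
rhs = sum(p * (a + Fraction(7, 2)) for a, b in outcomes if a <= 2)

print(lhs, rhs)  # both equal 5/3
```

Both sides evaluate to 5/3, matching the computation by hand.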
It will be convenient to make the notation more compact. If X_1, X_2, ... is a sequence
of random variables, we will use F_n to denote the information contained in X_1, ..., X_n, and
we will write E(Y | F_n) for E(Y | X_1, ..., X_n). We have F_n ⊆ F_m if 1 ≤ n ≤ m. This is
because the collection X_1, ..., X_n of random variables contains no more information than
X_1, ..., X_n, ..., X_m.

Mikosch, Section 1.4.2, defines the information carried by random variables X_1, ..., X_n in
terms of the associated σ-field σ(X_1, ..., X_n). Thus F_n = σ(X_1, ..., X_n) [we say that F_n is
the σ-field generated by X_1, ..., X_n]. As we already noted, we have F_n ⊆ F_m if 1 ≤ n ≤ m.
A collection F_n, n = 1, 2, 3, ..., of σ-fields satisfying the above property is called a filtration.
[We similarly define a filtration F_t, t ≥ 0, for a continuous time index t, and the conditional
expectations with respect to such filtrations. This will be needed later.]
1.1.1 Main properties

0. Conditional expectation is a linear operation: if a, b are constants,

E(a Y_1 + b Y_2 | F_n) = a E(Y_1 | F_n) + b E(Y_2 | F_n).

1. The following property follows from (1.1) if the event A is the entire sample space, so
that 1_A = 1:

E( E(Y | F_n) ) = E(Y).

1'. [Tower rule] If m ≤ n, then

E( E(Y | F_n) | F_m ) = E(Y | F_m).

2. If Y is measurable with respect to [is a function of] X_1, ..., X_n, then

E(Y | F_n) = Y.

2'. If Y is measurable with respect to X_1, ..., X_n, then for any random variable Z,

E(Y Z | F_n) = Y E(Z | F_n).

3. If Y is independent of X_1, ..., X_n, then

E(Y | F_n) = E(Y).

3'. [See Mikosch, Section 1.4.4, rule 7.] If Y is independent of X_1, ..., X_n and Z is measurable
with respect to X_1, ..., X_n, then for every function φ = φ(y, z),

E( φ(Y, Z) | F_n ) = E_Y φ(Y, Z),

where E_Y φ(Y, Z) means that we freeze Z and take the expectation with respect to Y, so
E_Y φ(Y, Z) = E φ(Y, z) |_{z=Z}.

4. [Projection property of the conditional expectation; see Mikosch, Section 1.4.5.] Let Y
be a random variable with E Y^2 < ∞. The conditional expectation E(Y | F_n) is the random
variable in L^2(F_n) which is closest to Y in the mean square sense:

E[ Y − E(Y | F_n) ]^2 = min_{Z ∈ L^2(F_n)} E(Y − Z)^2.
Example of the verification of the tower rule. Let Y = X_1 + X_2 + X_3, where X_i is the
outcome of the i-th toss of a fair coin, so that P(X_i = 1) = P(X_i = 0) = 1/2, and the X_i's are
independent. Consider

E( E(Y | F_2) | F_1 ) = E( E(X_1 + X_2 + X_3 | X_1, X_2) | X_1 )
= E( X_1 + X_2 + E X_3 | X_1 ) = X_1 + E X_2 + E X_3 = X_1 + 1,

and

E(Y | X_1) = X_1 + E(X_2 + X_3) = X_1 + 1/2 + 1/2 = X_1 + 1.
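The tower rule in this example can also be verified by brute-force enumeration of the eight equally likely coin-toss outcomes (a small Python sketch added for illustration; the helper name cond_exp_Y is ours):

```python
from itertools import product
from fractions import Fraction

# Verify the tower rule for Y = X1 + X2 + X3, fair-coin tosses in {0, 1},
# by exact enumeration of the 8 equally likely outcomes.
outcomes = list(product([0, 1], repeat=3))

def cond_exp_Y(fixed):
    """E(Y | the first len(fixed) tosses equal `fixed`), by exact averaging."""
    matching = [w for w in outcomes if w[:len(fixed)] == fixed]
    return sum(sum(w) for w in matching) * Fraction(1, len(matching))

for x1 in (0, 1):
    # Inner expectation E(Y | X1, X2), then average over X2 given X1:
    inner = [cond_exp_Y((x1, x2)) for x2 in (0, 1)]
    lhs = sum(inner) * Fraction(1, 2)   # E( E(Y|F_2) | F_1 )
    rhs = cond_exp_Y((x1,))             # E( Y | F_1 )
    print(x1, lhs, rhs)                 # both equal x1 + 1
```

For either value of X_1, both sides equal X_1 + 1, as found above.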
Chapter 2
Discrete-time Markov chains
2.1 Motivation and construction
2.1.1 An introductory example
Suppose that R_n denotes the short term interest rate prevailing on day n (n ≥ 0). Suppose
also that the rate R_n is a random variable which may only take two values: Low (L) and
High (H), for every n. [We call the possible values of R_n the states.] Thus, we consider a
random sequence R_n, n = 0, 1, 2, .... Sequences like this are frequently called discrete time
stochastic processes.

Next suppose that we have the following information available about the conditional
probabilities:

P(R_n = j_n | R_0 = j_0, R_1 = j_1, ..., R_{n−2} = j_{n−2}, R_{n−1} = j_{n−1})
= P(R_n = j_n | R_{n−2} = j_{n−2}, R_{n−1} = j_{n−1})    (2.1)

for every n ≥ 2 and for every sequence of states (j_0, j_1, ..., j_{n−2}, j_{n−1}, j_n), and

P(R_n = j_n | R_0 = j_0, R_1 = j_1, ..., R_{n−2} = j_{n−2}, R_{n−1} = j_{n−1})
≠ P(R_n = j_n | R_{n−1} = j_{n−1})    (2.2)

for some n ≥ 2 and for some sequence of states (j_0, j_1, ..., j_{n−2}, j_{n−1}, j_n).

In other words, we know that today's interest rate depends on the entire history of
the past values of the interest rates only through the values of interest rates prevailing on
the two immediately preceding days [this is condition (2.1) above]. But the information
contained in these two values may sometimes affect today's conditional distribution of the
interest rate in a different way than the information provided only by yesterday's value
of the interest rate [this is condition (2.2) above].

The type of stochastic dependence subject to condition (2.2) is not the Markovian type
of dependence. [It will be clear soon what we mean by the Markovian type of dependence.]
However, due to condition (2.1) the stochastic process R_n, n = 0, 1, 2, ... can be en-
larged [or augmented] to a so-called Markov chain that will exhibit the Markovian type of
dependence.

To see this, let us see what happens when we create a new stochastic process X_n, n =
0, 1, 2, ..., by enlarging the state space of the original sequence R_n, n = 0, 1, 2, .... Towards
this end let us define

X_n = (R_n, R_{n+1}).
Observe that the state space for the sequence X_n, n = 0, 1, 2, ... contains four elements:
(L, L), (L, H), (H, L) and (H, H). We shall now examine the conditional probabilities for the
sequence X_n, n = 0, 1, 2, ..., writing i_k = (j_k, j_{k+1}):

P(X_n = i_n | X_0 = i_0, X_1 = i_1, ..., X_{n−2} = i_{n−2}, X_{n−1} = i_{n−1})
= P(R_{n+1} = j_{n+1}, R_n = j_n | R_0 = j_0, R_1 = j_1, ..., R_{n−1} = j_{n−1}, R_n = j_n)
= P(R_{n+1} = j_{n+1} | R_0 = j_0, R_1 = j_1, ..., R_{n−1} = j_{n−1}, R_n = j_n)
= [by condition (2.1)]
= P(R_{n+1} = j_{n+1} | R_{n−1} = j_{n−1}, R_n = j_n)
= P(R_{n+1} = j_{n+1}, R_n = j_n | R_{n−1} = j_{n−1}, R_n = j_n)
= P(X_n = i_n | X_{n−1} = i_{n−1})

for every n ≥ 1 and for every sequence of states (i_0, i_1, ..., i_{n−1}, i_n).

We see that the enlarged sequence X_n exhibits the so-called Markov property.
2.1.2 Definitions and examples

Definition 2.1. A random sequence X_n, n = 0, 1, 2, ..., where X_n takes values in the set
S, is called a Markov chain with the (discrete: finite or countable) state space S if it satisfies
the Markov property:

P(X_n = i_n | X_0 = i_0, X_1 = i_1, ..., X_{n−2} = i_{n−2}, X_{n−1} = i_{n−1})
= P(X_n = i_n | X_{n−1} = i_{n−1})

for every n ≥ 1 and for every sequence of states (i_0, i_1, ..., i_{n−1}, i_n) from the set S.

Every discrete time stochastic process satisfies the following property [given the conditional
probabilities are well defined; property (1.2) in Lawler, p. 7 (p. 9)]:

P(X_0 = i_0, X_1 = i_1, ..., X_{n−2} = i_{n−2}, X_{n−1} = i_{n−1}, X_n = i_n)
= P(X_0 = i_0) P(X_1 = i_1 | X_0 = i_0) ···
P(X_n = i_n | X_0 = i_0, X_1 = i_1, ..., X_{n−2} = i_{n−2}, X_{n−1} = i_{n−1}).

Not every random sequence satisfies the Markov property. Sometimes a random sequence
which is not a Markov chain can be transformed into a Markov chain by means of enlargement
of the state space.

Definition 2.2. A random sequence X_n, n = 0, 1, 2, ..., where X_n takes values in the set S,
is called a time-homogeneous Markov chain with the state space S if it satisfies the Markov
property of Definition 2.1 and, in addition,

P(X_n = i_n | X_{n−1} = i_{n−1}) = q(i_{n−1}, i_n)    (2.3)

for every n ≥ 1 and for every two states i_{n−1}, i_n from the set S, where q : S × S → [0, 1]
is some given function.

We shall only study time-homogeneous Markov chains. A time-inhomogeneous Markov
chain can be transformed into a time-homogeneous one by including the time variable in the
state vector.
Definition 2.3. The (possibly infinite) matrix Q = [q(i, j)]_{i,j∈S} is called the (one-step)
transition matrix for the Markov chain X_n.

The transition matrix for a Markov chain X_n is a stochastic matrix. That is, its rows can
be interpreted as probability distributions [with non-negative entries summing up to unity;
see (1.4) and (1.5) in Lawler]. To every pair (φ_0, Q), where φ_0 = (φ_0(i))_{i∈S} is an initial
probability distribution on S and Q is a stochastic matrix, there corresponds some Markov
chain with the state space S. Such a chain can be constructed via the formula [Equation
(1.3) in Lawler, p. 8]

P(X_0 = i_0, X_1 = i_1, ..., X_{n−2} = i_{n−2}, X_{n−1} = i_{n−1}, X_n = i_n)
= φ_0(i_0) q(i_0, i_1) ··· q(i_{n−1}, i_n).

In other words, the initial distribution φ_0 and the transition matrix Q determine a Markov
chain completely by determining its finite dimensional distributions.
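This construction is easy to mimic on a computer: draw X_0 from φ_0, then repeatedly draw the next state from the row of Q indexed by the current state. A minimal Python sketch (the function name and the example matrix are ours, not Lawler's):

```python
import random

def sample_chain_path(phi0, Q, n_steps, rng=random):
    """Sample X_0, ..., X_n from an initial distribution phi0 (list of
    probabilities over states 0..k-1) and a stochastic matrix Q (list of rows)."""
    path = [rng.choices(range(len(phi0)), weights=phi0)[0]]
    for _ in range(n_steps):
        i = path[-1]  # current state selects the row of Q
        path.append(rng.choices(range(len(Q[i])), weights=Q[i])[0])
    return path

# A two-state example (say 0 = Low, 1 = High), started in state 0:
Q = [[0.75, 0.25],
     [0.25, 0.75]]
phi0 = [1.0, 0.0]
print(sample_chain_path(phi0, Q, 10))
```

By construction, the probability of any sampled path is exactly φ_0(i_0) q(i_0, i_1) ··· q(i_{n−1}, i_n), the formula above.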
Remark 2.4. There is an obvious analogy with a difference equation:

x_{n+1} = a x_n, n = 0, 1, 2, ...,
x_0 = x.

The solution path x_0, x_1, x_2, ... is uniquely determined by the initial condition x and the
transition rule a.
Example 2.5. Let Z_n, n = 1, 2, ... be i.i.d. (independent, identically distributed) random
variables such that P(Z_n = 1) = p, P(Z_n = −1) = q = 1 − p. Define X_0 = 0 and, for
n = 1, 2, 3, ...,

X_n = X_{n−1} + Z_n.

The process X_n, n = 0, 1, 2, ... is a time-homogeneous Markov chain on S = {..., −i, −i +
1, ..., −1, 0, 1, ..., i − 1, i, ...} = the set of all integers, and the corresponding transition
matrix Q is given by

q(i, i+1) = p, q(i, i−1) = q = 1 − p, i = 0, ±1, ±2, ....

This is a random walk (on the integer lattice) starting at zero. If p = 1/2, then the walk is
called symmetric.
Example 2.6. Let Z_n, n = 1, 2, ... be i.i.d. random variables such that P(Z_n = 1) = p,
P(Z_n = −1) = q = 1 − p. Let K be a positive integer, define X_0 = 0 and, for
n = 1, 2, 3, ...,

X_n = −K if X_{n−1} = −K,
X_n = X_{n−1} + Z_n if −K < X_{n−1} < K,
X_n = K if X_{n−1} = K.

The process X_n, n = 0, 1, 2, ... is a time-homogeneous Markov chain on S = {−K, −K +
1, ..., −1, 0, 1, ..., K − 1, K}, and the corresponding transition matrix Q is given by

q(i, i+1) = p, q(i, i−1) = q = 1 − p, −K < i < K,
q(−K, −K) = q(K, K) = 1.

This is a random walk starting at zero with absorbing boundaries at −K and K. If p = 1/2,
then the walk is called symmetric.
Example 2.7. Let Z_n, n = 1, 2, ... be i.i.d. random variables such that P(Z_n = 1) = p,
P(Z_n = −1) = q = 1 − p. Let K be a positive integer, define X_0 = 0 and, for
n = 1, 2, 3, ...,

X_n = −K + 1 if X_{n−1} = −K,
X_n = X_{n−1} + Z_n if −K < X_{n−1} < K,
X_n = K − 1 if X_{n−1} = K.

The process X_n, n = 0, 1, 2, ... is a time-homogeneous Markov chain on S = {−K, −K +
1, ..., −1, 0, 1, ..., K − 1, K}, and the corresponding transition matrix Q is given by

q(i, i+1) = p, q(i, i−1) = q = 1 − p, −K < i < K,
q(−K, −K + 1) = q(K, K − 1) = 1.

This is a random walk starting at zero with reflecting boundaries at −K and K. If p = 1/2,
then the walk is called symmetric.
Example 2.8. Let Z_n, n = 0, 1, 2, ... be i.i.d. random variables such that P(Z_n = 1) = p,
P(Z_n = −1) = q = 1 − p. Then the stochastic process X_n = Z_n, n = 0, 1, 2, ... is a
time-homogeneous Markov chain on S = {−1, 1}, and the corresponding transition matrix
(rows and columns indexed by the states −1, 1) is

Q = ( 1 − p   p )
    ( 1 − p   p ).

Here of course

q(i, 1) = P(X_n = 1 | X_{n−1} = i) = P(X_n = 1) = p

for i = −1, 1, and likewise

q(i, −1) = P(X_n = −1 | X_{n−1} = i) = P(X_n = −1) = 1 − p

for i = −1, 1.
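The walk of Example 2.6 is easy to simulate directly. The following Python sketch (ours, with an arbitrary boundary K = 5) estimates the probability of absorption at the upper boundary; for the symmetric walk started at zero this probability is 1/2 by symmetry:

```python
import random

def absorbed_at_top(K, p, rng):
    """Run the Example 2.6 walk from 0 until it hits -K or K;
    return True if it is absorbed at K."""
    x = 0
    while -K < x < K:
        x += 1 if rng.random() < p else -1
    return x == K

rng = random.Random(0)
K, trials = 5, 20_000
hits = sum(absorbed_at_top(K, 0.5, rng) for _ in range(trials))
print(hits / trials)  # close to 1/2 by symmetry
```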
2.2 Chapman-Kolmogorov equations

Definition 2.9. Given any two states i, j ∈ S, the n-step transition probability q_n(i, j) is
defined as

q_n(i, j) = P(X_n = j | X_0 = i)

for every n ≥ 0. We define the n-step transition matrix Q(n) as

Q(n) = [q_n(i, j)]_{i,j∈S}.

Proposition 2.1. We have

(i) q_1(i, j) = q(i, j) and

q_0(i, j) = 1 if i = j, 0 if i ≠ j,

and thus Q(1) = Q and Q(0) = I (the identity matrix).

(ii) q_n(i, j) = P(X_{k+n} = j | X_k = i) for each k ≥ 0.
Proof. Part (i) is obvious, and part (ii) holds since we only consider time-homogeneous
Markov chains. In fact, for n = 2 we have

P(X_{k+2} = j | X_k = i) = Σ_m P(X_{k+2} = j, X_{k+1} = m | X_k = i)
= Σ_m [ P(X_{k+2} = j, X_{k+1} = m, X_k = i) / P(X_{k+1} = m, X_k = i) ]
  · [ P(X_{k+1} = m, X_k = i) / P(X_k = i) ]
= Σ_m P(X_{k+2} = j | X_{k+1} = m, X_k = i) P(X_{k+1} = m | X_k = i)
= [by the Markov property]
= Σ_m P(X_{k+2} = j | X_{k+1} = m) P(X_{k+1} = m | X_k = i)
= [by time homogeneity]
= Σ_m P(X_2 = j | X_1 = m) P(X_1 = m | X_0 = i)
= Σ_m P(X_2 = j | X_1 = m, X_0 = i) P(X_1 = m | X_0 = i)
= Σ_m P(X_2 = j, X_1 = m | X_0 = i)
= P(X_2 = j | X_0 = i) = q_2(i, j).

A similar argument may be used for an arbitrary n ≥ 1.
Proposition 2.2. The following representation for the n-step transition matrix holds:

Q(n) = Q^n

for every n ≥ 0. [Recall: by definition we have Q^0 = I.]

Proof. The proof is done by induction. See Lawler, p. 11 (p. 13).
Corollary 2.3. The Chapman-Kolmogorov equation is satisfied:

Q(m+n) = Q(m) Q(n) = Q(n) Q(m)

for every m, n ≥ 0. Or, equivalently,

q_{m+n}(i, j) = Σ_k q_m(i, k) q_n(k, j) = Σ_k q_n(i, k) q_m(k, j)

for every m, n ≥ 0 and every i, j ∈ S.

Proof. Q(m+n) = Q^{m+n} = Q^m Q^n = Q(m) Q(n).
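The Chapman-Kolmogorov equation can be checked numerically for any small stochastic matrix. A plain-Python sketch (the 2×2 matrix is an arbitrary example of ours), representing matrices as lists of rows:

```python
# Check Q(m+n) = Q(m) Q(n), where Q(n) = Q^n.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matpow(Q, n):
    # Q^0 is the identity matrix
    P = [[float(i == j) for j in range(len(Q))] for i in range(len(Q))]
    for _ in range(n):
        P = matmul(P, Q)
    return P

Q = [[0.9, 0.1],
     [0.4, 0.6]]
m, n = 3, 2
left = matpow(Q, m + n)                     # Q(m+n) = Q^(m+n)
right = matmul(matpow(Q, m), matpow(Q, n))  # Q(m) Q(n)
print(all(abs(l - r) < 1e-12 for lr, rr in zip(left, right)
          for l, r in zip(lr, rr)))  # True
```

Note also that each row of Q^n sums to one: the n-step matrix is again stochastic.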
The Chapman-Kolmogorov equation provides the basis for the first step analysis:

Q(n+1) = Q Q(n).    (2.4)

We shall see applications later. The last step analysis would be Q(n+1) = Q(n) Q. Observe that
Equation (2.4) and the Chapman-Kolmogorov equations are equivalent. Equation (2.4) can
also be written as

ΔQ(n+1) = A Q(n) = Q(n) A,

where ΔQ(n+1) = Q(n+1) − Q(n) and A = Q − I. Note that the diagonal elements of A are
non-positive and that the rows add to 0. The matrix A is called the generator for any Markov
chain associated with Q.
Definition 2.10. The (unconditional) n-step probabilities φ_n(i) are defined as

φ_n(i) = P(X_n = i)

for every n ≥ 0. In particular, φ_0(i) = P(X_0 = i) (the initial probabilities).

We shall use the notation φ_n = [φ_n(i)]_{i∈S}. This is a (possibly infinite) row vector represent-
ing the distribution of the states of the Markov process at time n.

Proposition 2.4. We have

φ_n = φ_0 Q^n

for every n ≥ 0.

Proof. It is straightforward:

P(X_n = j) = Σ_i P(X_0 = i) P(X_n = j | X_0 = i) = Σ_i φ_0(i) q_n(i, j).

We already know that the n-step transition probability q_n(i, j) is the (i, j) entry of the ma-
trix Q^n.

A recursive equation for the n-step transition probabilities [that is, for the conditional
probabilities P(X_n = j | X_0 = i)] is

Q(n+1) = Q(n) Q, n = 0, 1, 2, ...,

with the initial condition Q(0) = I.

A recursive equation for the unconditional probabilities P(X_n = j) is

φ_{n+1} = φ_n Q, n = 0, 1, 2, ...,

with the initial condition φ_0 corresponding to the distribution of X_0.

See also Example 6 in Lawler, p. 11 (p. 13).
2.2.1 Long-range behavior

By the long time behavior of a Markov chain we mean the behavior of the conditional prob-
abilities Q(n) and the unconditional probabilities φ_n for large n. In view of the fact that
φ_n = φ_0 Q(n) = φ_0 Q^n, this essentially boils down to the behavior of the powers Q^n of the
transition matrix for large n.

Understanding the long time behavior of Markov chains that model real systems is
important for various applications in operations research and engineering [manufacturing,
investment, scheduling, etc.].
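As a preview, the recursion φ_{n+1} = φ_n Q already lets one observe this long-range behavior numerically. In the following Python sketch (our example matrix), the iterates converge to the distribution (0.8, 0.2), which solves π Q = π and no longer depends on φ_0:

```python
# Iterate phi_{n+1} = phi_n Q to observe the long-range distribution.
def step(phi, Q):
    return [sum(phi[i] * Q[i][j] for i in range(len(phi)))
            for j in range(len(Q[0]))]

Q = [[0.9, 0.1],
     [0.4, 0.6]]
phi = [1.0, 0.0]   # start surely in state 0
for _ in range(200):
    phi = step(phi, Q)
print(phi)  # close to (0.8, 0.2)
```

For this Q, the limit can be checked by hand: 0.1 π(0) = 0.4 π(1) together with π(0) + π(1) = 1 gives π = (0.8, 0.2).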
Homework 1: Conditional expectations and discrete-time Markov chains

Recall our introductory example.

1. Supposing that

P(R_{n+2} = L | R_{n+1} = H, R_n = H) = P(R_{n+2} = H | R_{n+1} = L, R_n = L) = 1/8
P(R_{n+2} = L | R_{n+1} = L, R_n = H) = P(R_{n+2} = H | R_{n+1} = H, R_n = L) = 1/2
P(R_{n+2} = H | R_{n+1} = L, R_n = H) = P(R_{n+2} = L | R_{n+1} = H, R_n = L) = 1/2
P(R_{n+2} = L | R_{n+1} = L, R_n = L) = P(R_{n+2} = H | R_{n+1} = H, R_n = H) = 7/8,

derive the transition matrix for the enlarged process X_n.

2. Assuming further P(R_0 = L) = 1/3 and

P(R_1 = L | R_0 = L) = P(R_1 = H | R_0 = H) = 3/4,

what is the probability that the interest rates will be low for three consecutive days
starting from day 0? From day 2?

3. Given that the initial interest rate is low, that is R_0 = L, what is the conditional
probability that R_4 = H?

4. What is the probability that the interest rate will be high in the long range?
Chapter 3
Discrete-time martingales [Lawler,
Chapter 5; Mikosch, Section 1.4;
Shreve, Chapter 2 and Section 3.2]
3.1 Definitions and examples

Definition 3.1. A stochastic process Y_n, n ≥ 0, is a martingale with respect to a filtration
F_n, n ≥ 0, if

(i) E|Y_n| < ∞ for all n ≥ 0,

and

(ii) E(Y_m | F_n) = Y_n for all m ≥ n.    (3.1)

In this definition, as in the introductory section to this part, F_n denotes the information
contained in a sequence X_1, ..., X_n of random variables. The second condition of the def-
inition implies that Y_n is F_n-measurable. The first condition ensures that the conditional
expectations are well defined. When we say that Y_n is a martingale without reference to
F_n, n ≥ 0, we understand that F_n is the information contained in Y_0, ..., Y_n, so X_n = Y_n.

In order to verify (3.1) it is enough to show that for all n

E(Y_{n+1} | F_n) = Y_n,    (3.2)

since then by the tower rule

E(Y_{n+2} | F_n) = E( E(Y_{n+2} | F_{n+1}) | F_n ) = E(Y_{n+1} | F_n) = Y_n,

and so on. We also note that for every n ≥ 0

E(Y_{n+1}) = E[ E(Y_{n+1} | F_n) ] = E(Y_n),

so that a martingale is a process with a constant mean. Because of property (3.2) a martingale
is thought of as a model of a fair game. A process that can be thought of as a model of a
favorable (unfavorable) game is called a submartingale (supermartingale), as defined below.
Definition 3.2. A stochastic process Y_n is a submartingale (supermartingale) with respect
to (F_n) if for all n ≥ 0,

E|Y_n| < ∞,
E(Y_m | F_n) ≥ (≤) Y_n for all m ≥ n,
Y_n is F_n-measurable.

We note that the last condition is automatically satisfied for a martingale. A process Y_n
such that Y_n is measurable with respect to F_n for every n ≥ 0 is called adapted to the
filtration F = (F_n)_{n≥0}. We shall normally consider adapted processes only.
Example 3.3. Let the X_i be i.i.d. random variables with mean μ. Let S_0 = S̃_0 = 0 and for
n ≥ 0 let

S_n = X_1 + ... + X_n,  S̃_n = S_n − nμ.

[This is Example 1 in Lawler, p. 90 (p. 107).] Are all the processes X_n, S_n and S̃_n Markov chains?
Are all these processes martingales with respect to the filtration F_n if μ = 0? What is the
answer to the preceding question when μ ≠ 0?
Example 3.4. [Compare with Example 3, Section 5.2 in Lawler.] Consider a gambler who
is playing a sequence of independent games in each of which he wins one with probability p
or loses one with probability 1 − p. Let X_n, n ≥ 1, be a sequence of i.i.d. random variables
indicating the outcome of the i-th game:

P(X_i = 1) = p = 1 − P(X_i = −1), i ≥ 1, X_0 = 0.

We note that E(X_i) = 2p − 1, i ≥ 1. Suppose that the gambler employs a betting strategy
based on the past history of the game; that is, the bet B_{n+1} on the (n+1)-th game is

B_{n+1} = B_{n+1}(X_1, ..., X_n), n ≥ 0,

where B_{n+1} ≥ 0. Let Y_n, n ≥ 1, denote the gambler's fortune after n games and set Y_0 = 0.
Then

Y_{n+1} = Y_n + B_{n+1}(X_1, ..., X_n) X_{n+1}, n ≥ 0.

Now denote by F_n the information contained in X_0, ..., X_n and consider

E(Y_{n+1} | F_n) = E[ Y_n + B_{n+1}(X_1, ..., X_n) X_{n+1} | F_n ]
= Y_n + B_{n+1}(X_1, ..., X_n) E(X_{n+1})
= Y_n if E(X_{n+1}) = 0, i.e. p = 1/2,
≤ Y_n if E(X_{n+1}) < 0, i.e. p < 1/2,
≥ Y_n if E(X_{n+1}) > 0, i.e. p > 1/2.

Thus when p = 1/2, Y_n is a martingale with respect to F_n. When p < (>) 1/2, Y_n is a super-
martingale (submartingale) with respect to F_n. An interesting aspect of this example, when
p = 1/2, is that no matter what betting strategy is used in the class of strategies based on the
past history of the game, we have E(Y_n) = E(Y_0) = 0, for every n.

Now recall Example 3.3 above. If p = 1/2 then the process

S_n = X_1 + ··· + X_n, n ≥ 0,

is a martingale w.r.t. F_n, n ≥ 0. [It is a supermartingale if p < 1/2, and a submartingale
if p > 1/2.] Next, observe that B_n = B_n(X_1, ..., X_{n−1}) is F_{n−1}-measurable for every n ≥ 1.
[Such a process is called predictable with respect to the filtration F_n.] The gambler's fortune
Y_n can be written as

Y_n = Σ_{k=1}^{n} B_k (S_k − S_{k−1}), n = 0, 1, 2, 3, ....

This expression is a martingale transform of the process S_n by the process B_n. This is
the discrete counterpart of the stochastic integral ∫ B dS. [We know that Y_n is a
martingale (supermartingale, submartingale) with respect to F_n, n ≥ 0 if S_n is a martingale
(supermartingale, submartingale) with respect to F_n, n ≥ 0.]
Example 3.5. [Compare with Example 2, Section 5.2 in Lawler.] This example is a special
case of Example 3.4 with p = 1/2 and the following betting strategy. Bet $1 on the first
game. Stop if you win. If not, double your bet. If you win, stop betting (i.e. set B_n = 0 for
all greater n). Otherwise, keep doubling your bet until you eventually win. This is a very
attractive betting strategy which involves a random stopping rule: you stop when you win.
Let Y_n denote your fortune after n games. Assume Y_0 = 0. We already know from Example
3.4 that Y_n is a martingale, with E Y_n = E Y_0 = 0. But in the present case the gambler
employs a randomized stopping strategy, i.e. the gambler stops the game at the random
time τ = min{i ≥ 1 : Y_i = 1}, the time at which she wins. Note that Y_τ = 1 on {τ < ∞},
and that

P(τ = n) = (1/2)^n, n ≥ 1,

so

P(τ < ∞) = 1.

Therefore the gambler wins 1 in finite time with probability one. In particular

E(Y_τ) = 1 ≠ 0 = E(Y_n), n = 0, 1, 2, ....

The reason why this inequality happens is that τ is an unbounded stopping time [i.e. there
is no finite constant K such that P(τ ≤ K) = 1]. We shall talk about this more in a
following lecture. That is why, employing this randomized [doubling] strategy, the gambler
can guarantee that she finishes the game ahead. However, consider the expected amount
lost before the gambler wins (which is the expected value of the last bet):

E(amount lost) = Σ_{n=0}^{∞} P(τ = n+1) [2^n − 1] = Σ_{n=0}^{∞} (1/2)^{n+1} [2^n − 1] = ∞.

Thus, on average, you need an infinite capital to play a winning game, which makes the
doubling strategy much less attractive.
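A quick simulation (a Python sketch added for illustration) makes both points: every play of the doubling strategy ends with a fortune of exactly 1, while the amount lost before the winning game has a large, unstable sample mean, reflecting its infinite expectation:

```python
import random

def play_doubling(rng):
    """Doubling strategy at p = 1/2: bet 1, double after each loss, stop
    at the first win.  Returns (final fortune, amount lost before the win)."""
    bet, lost = 1, 0
    while rng.random() >= 0.5:   # this game is lost
        lost += bet
        bet *= 2
    # after n losses: bet = 2^n, lost = 2^n - 1, so the fortune is always 1
    return bet - lost, lost

rng = random.Random(1)
results = [play_doubling(rng) for _ in range(10_000)]
print(all(f == 1 for f, _ in results))            # True: she always ends ahead
print(sum(l for _, l in results) / len(results))  # sample mean of losses: unstable
```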
Example 3.6 (Martingales associated with driftless random walk). This complements Ex-
ample 3.3. Let X_i, i ≥ 1, be i.i.d. random variables with E(X_i) = 0, E(X_i^2) = σ^2 < ∞. We
verify that

S_n = x + Σ_{i=1}^{n} X_i, n ≥ 0,

and

M_n = S_n^2 − nσ^2, n ≥ 0,

are martingales with respect to F_n, n ≥ 0, the information contained in X_1, ..., X_n, n ≥ 0, or
equivalently, the information contained in S_0, ..., S_n, n ≥ 0. We have

(i) E|S_n| ≤ |x| + Σ_{i=1}^{n} E|X_i| ≤ |x| + Σ_{i=1}^{n} [ E(X_i^2) ]^{1/2} < ∞,

and

(ii) E(S_{n+1} | F_n) = S_n.

Similarly,

(i) E|M_n| ≤ E S_n^2 + nσ^2 = x^2 + nσ^2 + nσ^2 = x^2 + 2nσ^2 < ∞,

and

(ii) E(M_{n+1} | F_n) = E[ S_{n+1}^2 − (n+1)σ^2 | F_n ]
= E[ S_n^2 + X_{n+1}^2 + 2 S_n X_{n+1} − (n+1)σ^2 | F_n ]
= S_n^2 + σ^2 + 0 − (n+1)σ^2 = M_n.

Recall that S_n is a Markov chain. So S_n is both a Markov chain and a martingale. Is M_n
a Markov chain as well?
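For the symmetric ±1 walk (σ^2 = 1, x = 0), the constant-mean property of M_n can be checked by Monte Carlo. The following Python sketch (ours) estimates E(M_25), which should be near E(M_0) = 0:

```python
import random

# Monte Carlo check that M_n = S_n^2 - n*sigma^2 has constant mean 0
# for the symmetric +/-1 random walk started at x = 0 (sigma^2 = 1).
rng = random.Random(2)
n, trials = 25, 50_000
total = 0.0
for _ in range(trials):
    s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
    total += s * s - n      # one sample of M_n
print(total / trials)       # close to 0
```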
Example 3.7 (Wald's martingale). Let X_i, i ≥ 1, be i.i.d. random variables with E|X_i| <
∞, Var(X_i) < ∞. Set S_n = x + Σ_{i=1}^{n} X_i, and for i ≥ 1 let

m(θ) = E[ exp(θ X_i) ], −∞ < θ < ∞,

be the moment generating function of X_i. Define

Z_n = exp(θ S_n) / [m(θ)]^n, n ≥ 0.

We verify that Z_n is a martingale for every θ. We have

(i) E|Z_n| = E[ exp(θ S_n) ] / [m(θ)]^n = exp(θx) [m(θ)]^n / [m(θ)]^n = exp(θx) < ∞,

and

(ii) E(Z_{n+1} | F_n) = E{ exp(θ S_{n+1}) / [m(θ)]^{n+1} | F_n }
= [ exp(θ S_n) / [m(θ)]^{n+1} ] E(exp θ X_{n+1}) = Z_n.

Now, suppose that each X_i is normally distributed with mean μ and variance σ^2. Then,
letting x = 0 (to simplify the presentation), we have S_n ~ N(nμ, nσ^2), and thus

Z_n = exp( −θnμ − (1/2) θ^2 nσ^2 + θ S_n ).

This model is related to the so-called geometric Brownian motion with a drift.
Example 3.8. Let ξ_i, i ≥ 1, be i.i.d. random variables with P(ξ_i = 1) = p, P(ξ_i = −1) = q = 1 − p, 0 < p, q < 1. Set S_n = x + Σ_{i=1}^n ξ_i. We note that E(ξ_i) = p − q = 2p − 1, and Var(ξ_i) = 1 − (2p − 1)² = 4pq. We now verify that

S̃_n = S_n − n(p − q), n ≥ 0,

and

Z_n = (q/p)^{S_n}, n ≥ 0,

are martingales. That S̃_n is a martingale with respect to σ(ξ_1, . . . , ξ_n), n ≥ 0, follows immediately from Example 3.3 by writing

S̃_n = x + Σ_{i=1}^n [ξ_i − (p − q)].

We now show that Z_n is a Wald's martingale. We have for i ≥ 1

m(θ) = E[exp(θ ξ_i)] = p exp(θ) + q exp(−θ).

If we choose θ = ln(q/p) then m(θ) = 1, and Wald's martingale takes the form

exp[ln(q/p) S_n] = (q/p)^{S_n} = Z_n.
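A quick numerical sanity check of this example (a sketch; the parameter values are mine): since Z_n = (q/p)^{S_n} is a martingale, its mean must stay equal to Z_0 = (q/p)^x for every n.

```python
import random

# Check numerically that Z_n = (q/p)**S_n has constant mean (q/p)**x,
# for a walk with P(step = +1) = p, P(step = -1) = q = 1 - p, S_0 = x.
rng = random.Random(7)
p, x, n_steps, n_paths = 0.4, 2, 10, 200_000
q = 1 - p
ratio = q / p
acc = 0.0
for _ in range(n_paths):
    s = x
    for _ in range(n_steps):
        s += 1 if rng.random() < p else -1
    acc += ratio ** s
est = acc / n_paths
print(est, ratio ** x)   # the two numbers should be close
```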
3.2 Doob-Meyer decomposition

Recall the driftless random walk of Example 3.6 and set S_0 = 0. In this example we saw that the process M_n = S_n² − nσ² was a martingale with respect to the filtration ℱ_n. The process A_n = nσ² is non-decreasing [it is in fact strictly increasing in this case], and it is predictable [of course, since it is deterministic]. Finally, observe that for each n ≥ 0 we have the following decomposition of the process S_n²:

S_n² = S_0² + M_n + A_n.

Thus, we have decomposed the sub-martingale S_n² into a sum of a martingale and a non-decreasing, predictable process. This particular result obtained for our example is a special case of the celebrated result known as the Doob-Meyer decomposition:

Theorem 3.1. Let X_n, n ≥ 0, be a process adapted to some filtration ℱ_n, n ≥ 0. Assume E|X_n| < ∞ for every n ≥ 0. Then X_n has a Doob-Meyer decomposition

X_n = X_0 + M_n + A_n, n ≥ 0,

where M_n is a martingale with M_0 = 0, and A_n is a predictable process with A_0 = 0. The decomposition is unique [in an appropriate sense]. X_n is a sub-martingale if and only if the process A_n is non-decreasing.
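For the sub-martingale X_n = S_n² of the example above, the compensator increments are A_{n+1} − A_n = E[X_{n+1} − X_n | ℱ_n] = σ². A small empirical check (a sketch under fair ±1 steps, so σ² = 1; the names are mine): grouping the observed increments of S_n² by the current position s, each conditional mean should be close to 1 regardless of s.

```python
import random
from collections import defaultdict

# Empirical check of the Doob-Meyer compensator of X_n = S_n^2 with fair
# +/-1 steps: E[X_{n+1} - X_n | S_n = s] = 1 for every s, since
# (s +/- 1)^2 - s^2 = +/- 2s + 1 has conditional mean 1.
rng = random.Random(1)
sums, counts = defaultdict(float), defaultdict(int)
for _ in range(100_000):
    s = 0
    for _ in range(10):
        step = 1 if rng.random() < 0.5 else -1
        sums[s] += (s + step) ** 2 - s ** 2   # observed increment of S_n^2
        counts[s] += 1
        s += step
for s in sorted(k for k, c in counts.items() if c > 5_000):
    print(s, round(sums[s] / counts[s], 3))   # each value close to 1
```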
Definition 3.9. If a process X_n is a square integrable martingale, the predictable quadratic variation process ⟨X⟩_n of X_n is the predictable Doob-Meyer component A_n of X_n², such that X_n² − ⟨X⟩_n is a martingale with respect to ℱ_n.
Homework 2: Conditional expectations and discrete-time martingales

1. Show that if X_n is a martingale then it has uncorrelated increments, i.e.,

E[(X_m − X_n)(X_k − X_l)] = 0,

where 0 ≤ l < k ≤ n < m < ∞.

2. Answer the questions posed in Example 3.3.

(a) Are the processes ξ_n, S_n and S̃_n Markov chains?
(b) Are the processes ξ_n, S_n and S̃_n martingales w.r.t. ℱ_n when μ = 0?
(c) Are the processes ξ_n, S_n and S̃_n martingales w.r.t. ℱ_n when μ ≠ 0?

3. Let S_n = Σ_{i=1}^n ξ_i, n ≥ 0, where ξ_i, i ≥ 1, is a sequence of independent, identically distributed random variables, and ξ_i has an exponential distribution with parameter λ. Identify three different martingales associated with the process S_n, n ≥ 0. Represent these martingales as functions of S_n and the parameter λ.

[Hint: The first martingale is S̃_n = S_n − f(n, λ) [where f(n, λ) = some function of n and λ]. The second martingale is M_n = S̃_n² − g(n, λ) [where g(n, λ) = some function of n and λ]. The third martingale is Z_n = exp(θ S_n) / ψ(n, θ, λ) [where ψ(n, θ, λ) = some function of n, θ and λ].]

4. Let ξ_i, i ≥ 1, be i.i.d. random variables with P(ξ_i = 1) = P(ξ_i = −1) = 1/2. Set S_n = Σ_{i=1}^n ξ_i, n = 0, 1, 2, . . . , and let ℱ_n be the information contained in S_0, . . . , S_n [which is the same as the information contained in ξ_1, . . . , ξ_n]. Finally, let

Z_n = e^{S_n}, n ≥ 0.

(a) Verify whether the process Z_n is a martingale, or super-martingale, or sub-martingale, or neither, with respect to the filtration ℱ_n.
(b) Find a numerical sequence z_n so that the process Z̃_n given as Z̃_n = Z_n / z_n is a (Wald's) martingale with respect to the filtration ℱ_n.
3.3 Stopping times and optional stopping theorem

The optional stopping theorem is also referred to in the literature as Doob's optional sampling theorem.

Definition 3.10. A random variable τ is called a stopping time with respect to ℱ_n, n ≥ 0 (where ℱ_n is the information contained in Y_0, . . . , Y_n) if

(i) τ takes values in {0, 1, . . . , ∞},

(ii) for each n, I(τ = n) is measurable with respect to ℱ_n.

Thus a stopping time is a stopping rule based only on the information contained in ℱ_n. Put another way, if we know which particular events from ℱ_n took place, then we know whether τ = n or not. The notation I(A) is used instead of 1_A for the indicator of an event A in this section.

Example 3.11. Let τ = j, for some j ≥ 0. Clearly, τ is a stopping time. This is the most elementary example of a bounded [finite] stopping time. Let now ξ_i, i ≥ 1, be i.i.d. random variables with P(ξ_i = 1) = p, P(ξ_i = −1) = q = 1 − p, 0 < p, q < 1. Set S_n = Σ_{i=1}^n ξ_i. Let ℱ_n be the information contained in S_0, . . . , S_n [which is the same as the information contained in ξ_1, . . . , ξ_n]. We consider different stopping rules.

1. Let

τ_j = min{n ≥ 0 : S_n = j}, with τ_j = ∞ if S_n ≠ j for all n ≥ 0.

Since I(τ_j = n) is determined by the information in ℱ_n, τ_j is a stopping time with respect to ℱ_n.

2. Let

θ_j = τ_j − 1, j ≠ 0;

then, since I(θ_j = n) = I(τ_j − 1 = n) = I(τ_j = n + 1), I(θ_j = n) is not ℱ_n-measurable (it is ℱ_{n+1}-measurable). Hence θ_j is not a stopping time.

3. τ_j^{(r)}, the r-th passage time of the process S_n to j, r = 1, 2, . . . , is a stopping time with respect to ℱ_n.

4. Let

τ̂_j = max{n ≥ 0 : S_n = j}.

Thus τ̂_j is the last time S_n visits state j. Clearly τ̂_j is not a stopping time.

Exercise 1. Show that if θ is a stopping time, then θ_j := min(θ, j), where j is a fixed integer, is also a stopping time. Clearly θ_j ≤ j.

Exercise 2. If τ and θ are stopping times, then so are min(τ, θ) and max(τ, θ).

Let τ be any non-negative integer valued random variable which is finite with probability one. Let X_n, n = 0, 1, 2, 3, . . . , be a random sequence. Then X_τ denotes the random variable that takes the values X_{τ(ω)}(ω). The following proposition says that you cannot beat a fair game by using a stopping rule which is a bounded stopping time.
Proposition 3.2. Let M_n be a martingale and τ a stopping time with respect to ℱ_n, n ≥ 0, where ℱ_n is the information contained in M_0, . . . , M_n. Then

E(M_{min(τ, n)}) = E(M_0), n ≥ 0.

Proof. We have

M_{min(τ, n)} = M_τ I(τ ≤ n) + M_n I(τ > n)
 = M_τ Σ_{k=0}^n I(τ = k) + M_n I(τ > n)
 = Σ_{k=0}^n M_k I(τ = k) + M_n I(τ > n).

Hence

E(M_{min(τ, n)}) = Σ_{k=0}^n E[M_k I(τ = k)] + E[M_n I(τ > n)]
 = Σ_{k=0}^n E[E(M_n | ℱ_k) I(τ = k)] + E[M_n I(τ > n)]
 = Σ_{k=0}^n E[E(M_n I(τ = k) | ℱ_k)] + E[M_n I(τ > n)]
 = Σ_{k=0}^n E[M_n I(τ = k)] + E[M_n I(τ > n)]
 = E[M_n I(τ ≤ n)] + E[M_n I(τ > n)] = E(M_n) = E(M_0),

where the second equality follows from the martingale property of M_n, the third from the fact that I(τ = k) is measurable with respect to ℱ_k, and the fourth from the tower rule.
In many situations of interest the stopping time is not bounded, but it is almost surely finite, as in the doubling strategy of Example 3.5. In this example E(X_τ) = 1 ≠ 0 = E(X_0). The question arises: when is it that E(M_τ) = E(M_0) for a stopping time which is not bounded? We have

M_τ = M_{min(τ, n)} + M_τ I(τ > n) − M_n I(τ > n).

Hence, using Proposition 3.2, we obtain for every n

E(M_τ) = E(M_0) + E[M_τ I(τ > n)] − E[M_n I(τ > n)]. (3.3)

This provides motivation for the following

Theorem 3.3 (Optional Stopping Theorem). Let M_n be a martingale and τ a stopping time with respect to ℱ_n, n ≥ 0. If

P(τ < ∞) = 1, (3.4)

E|M_τ| < ∞, (3.5)

and

lim_{n→∞} E[M_n I(τ > n)] = 0, (3.6)

then

E(M_τ) = E(M_0). (3.7)
Proof. It follows from (3.3) and (3.6) that we only have to show

lim_{n→∞} E[M_τ I(τ > n)] = 0. (3.8)

By (3.4) and (3.5),

E|M_τ| = Σ_{k=0}^∞ E[|M_τ| I(τ = k)] = Σ_{k=0}^n E[|M_τ| I(τ = k)] + E[|M_τ| I(τ > n)] < ∞. (3.9)

Now (3.8) follows because we see from (3.9) that E[|M_τ| I(τ > n)] is the tail of a convergent series.
Example 3.12. For the doubling strategy of Example 3.5 we know that (3.7) does not hold. We also know that for this strategy P(τ < ∞) = 1 and E|Y_τ| = 1 < ∞, so it must be that (3.6) does not hold. Indeed, as n → ∞,

E[Y_n I(τ > n)] = (1 − 2^n) P(τ > n) = (1 − 2^n)(1/2)^n → −1.
3.3.1 Applications to random walks

Let ξ_i, i ≥ 1, be i.i.d. random variables with P(ξ_i = 1) = p, P(ξ_i = −1) = q = 1 − p, 0 < p, q < 1. Set S_n = x + Σ_{i=1}^n ξ_i, and let

τ = min{n ≥ 0 : S_n = a or S_n = b}, a ≤ x ≤ b,

where a, b are integers. Now τ is a stopping time with respect to ℱ_n, n ≥ 0, where ℱ_n is the information contained in S_0, . . . , S_n. One can show [see Exercise 1.7 in Lawler] that Eτ < ∞, which implies that P(τ < ∞) = 1. Our goal is to compute, using the OST, P(S_τ = b), the probability that S_n reaches b before a, and Eτ, for a symmetric random walk and for a random walk with drift.

Symmetric random walk: p = q = 1/2.

In this case we know from Example 3.6 that S_n and M_n = S_n² − n are martingales with respect to ℱ_n, n ≥ 0, where ℱ_n is the information contained in S_0, . . . , S_n. We note that τ is a stopping time with respect to ℱ_n, n ≥ 0. We apply the OST to S_n and the stopping time τ. Since

E|S_τ| ≤ max(|a|, |b|) < ∞,

and

lim_{n→∞} E[|S_n| I(τ > n)] ≤ lim_{n→∞} max(|a|, |b|) P(τ > n) = 0

(recall that P(τ < ∞) = 1), by the OST we have

E(S_τ) = E(S_0) = x. (3.10)

But also

E(S_τ) = b P(S_τ = b) + a P(S_τ = a). (3.11)

Combining (3.10) and (3.11) we obtain

P(S_τ = b) = (x − a)/(b − a), P(S_τ = a) = (b − x)/(b − a). (3.12)

Setting a = 0 and b = N, we get

P(S_τ = N) = x/N, P(S_τ = 0) = (N − x)/N,

which are the probabilities of winning and ruin, respectively, in the so-called Gambler's ruin problem. To compute Eτ we apply the OST to M_n and τ. We check the assumptions of the OST. We have

E|M_τ| ≤ E(S_τ²) + Eτ < ∞,

and

E[|M_n| I(τ > n)] ≤ E[S_n² I(τ > n)] + n P(τ > n) ≤ max(a², b²) P(τ > n) + n P(τ > n).

Clearly

lim_{n→∞} max(a², b²) P(τ > n) = 0.

To show that lim_{n→∞} n P(τ > n) = 0 we write

Eτ = Σ_{k=1}^n k P(τ = k) + Σ_{k=n+1}^∞ k P(τ = k).

Since Eτ < ∞, Σ_{k=n+1}^∞ k P(τ = k) is the tail of a convergent series, and thus

0 = lim_{n→∞} Σ_{k=n+1}^∞ k P(τ = k) ≥ lim_{n→∞} Σ_{k=n+1}^∞ n P(τ = k) = lim_{n→∞} n P(τ > n),

which shows that (3.6) holds. Hence by the OST we have

E(M_τ) = E(S_τ²) − Eτ = E(S_0²) = x²,

so that, using (3.12), we obtain

Eτ = E(S_τ²) − x² = b² (x − a)/(b − a) + a² (b − x)/(b − a) − x² = (b − x)(x − a). (3.13)

By setting a = 0 and b = N we obtain the expected duration of the game in the Gambler's ruin chain:

Eτ = (N − x) x.
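Both conclusions are easy to confirm by Monte Carlo (a sketch with parameter values of my choosing): stop the symmetric walk at the boundaries and compare the empirical hit probability and mean duration with (3.12) and (3.13).

```python
import random

# Monte Carlo check of (3.12) and (3.13) for the symmetric walk:
# P(S_tau = b) = (x - a)/(b - a) and E[tau] = (b - x)(x - a).
rng = random.Random(3)
a, b, x, n_paths = 0, 10, 3, 100_000
hits_b, total_time = 0, 0
for _ in range(n_paths):
    s, t = x, 0
    while a < s < b:                      # run until one boundary is hit
        s += 1 if rng.random() < 0.5 else -1
        t += 1
    hits_b += (s == b)
    total_time += t
print(hits_b / n_paths)      # close to (x - a)/(b - a) = 0.3
print(total_time / n_paths)  # close to (b - x)*(x - a) = 21
```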
Random walk with drift: p ≠ q.

In this case (see Example 3.8), S̃_n = S_n − n(p − q) and Z_n = (q/p)^{S_n} are martingales with respect to ℱ_n, n ≥ 0, where S_n is defined as in Example 3.8. We first apply the OST to Z_n and τ. We check the assumptions of the OST:

E|Z_τ| ≤ max((q/p)^a, (q/p)^b) < ∞,

and

lim_{n→∞} E[Z_n I(τ > n)] ≤ lim_{n→∞} max((q/p)^a, (q/p)^b) P(τ > n) = 0.

Hence we can apply the OST to conclude

E(Z_τ) = E(Z_0) = (q/p)^x. (3.14)

But

E(Z_τ) = (q/p)^b P(S_τ = b) + (q/p)^a (1 − P(S_τ = b)). (3.15)

From (3.14) and (3.15) we obtain

P(S_τ = b) = [(q/p)^x − (q/p)^a] / [(q/p)^b − (q/p)^a]. (3.16)

Setting a = 0 and b = N in (3.16) we obtain the probability that the Gambler, who starts with x dollars, will win the desired N dollars:

P(S_τ = N) = [(q/p)^x − 1] / [(q/p)^N − 1].
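Formula (3.16) is easy to check by simulation (a sketch; the rates are mine):

```python
import random

# Monte Carlo check of (3.16) for the walk with drift (p != q):
# P(S_tau = b) = ((q/p)**x - (q/p)**a) / ((q/p)**b - (q/p)**a).
rng = random.Random(11)
p, a, b, x, n_paths = 0.45, 0, 10, 3, 100_000
r = (1 - p) / p                          # the ratio q/p
hits_b = 0
for _ in range(n_paths):
    s = x
    while a < s < b:
        s += 1 if rng.random() < p else -1
    hits_b += (s == b)
theory = (r**x - r**a) / (r**b - r**a)
print(hits_b / n_paths, theory)   # the two should be close
```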
We now compute Eτ by applying the OST to the martingale S̃_n. The assumptions of the OST can be verified in the same way as in the previous cases. By the OST we have

E(S̃_τ) = E(S_τ) − Eτ (p − q) = E(S̃_0) = x,

so that

Eτ = [E(S_τ) − x] / (p − q),

where E(S_τ) can be easily computed from (3.16).

Exercise 3. Compute the expected duration of the game for the Gambler's ruin chain with p ≠ q.
Homework 3: Discrete-time optional stopping theorem

1. Exercise 1.

2. Exercise 2.

3. Exercise 3.

4. Let S_n = 1 + Σ_{i=1}^n ξ_i, n ≥ 0, be a random walk with p = 1/4 and q = 3/4 starting from x = 1. Suppose that

τ = min{n ≥ 0 : S_n = −2}.

Use the Optional Stopping Theorem (admitting that its conditions hold) to compute

(a) E(τ)

(b) Var(τ)
3.4 Uniform integrability and martingales

Condition (3.6) is difficult to verify. Here we present some conditions that imply it.

Definition 3.13. A sequence of random variables X_1, X_2, . . . is uniformly integrable (UI for short) if for every ε > 0 there exists a δ > 0 such that, if for some random event A we have P(A) < δ, then

E(|X_n| 1_A) < ε, (3.17)

for each n.

Observe that δ must not depend on n and (3.17) must hold for all values of n.

Example 3.14. Let X_1, X_2, . . . be a random sequence with |X_n| ≤ K < ∞ for every n [K does not depend on n, which means that the sequence is uniformly bounded]. This sequence is UI. To see this, fix ε > 0. Then take δ = ε/K. Now take any event A so that P(A) < δ. We have

E(|X_n| 1_A) ≤ K P(A) < K δ = ε,

for every n. Thus the sequence X_1, X_2, . . . is UI.

Exercise 1. Let the sequence X_n be as in the above example. Consider the sequence S_n = Σ_{k=1}^n X_k. Is the sequence S_n UI?

Here is an equivalent definition of a UI sequence:

Definition 3.15. A sequence of random variables X_1, X_2, . . . is uniformly integrable (UI for short) if

sup_{n≥0} E(|X_n| 1_{|X_n| > a}) → 0 as a → ∞.

Example 3.16. Consider the fortune process Y_n of the doubling strategy from Example 3.5. We know that this process is a martingale with respect to ℱ_n. Is it a UI martingale? In order to answer this question consider the event A_n = {ξ_1 = ξ_2 = ⋯ = ξ_n = −1}. Then P(A_n) = 1/2^n and E(|Y_n| 1_{A_n}) = (2^n − 1)/2^n [this is because Y_n = −(2^n − 1) if the event A_n occurs]. Thus E(|Y_n| 1_{A_n}) = 1 − 1/2^n. Now take any ε < 1. No matter how small (but positive) a δ you select, you will always find n large enough so that P(A_n) < δ and E(|Y_n| 1_{A_n}) ≥ ε. Thus the gambler's fortune process is not a UI martingale.

Suppose now that M_0, M_1, . . . is a UI martingale with respect to some filtration, and that τ is a stopping time s.t. P(τ < ∞) = 1. By uniform integrability we then conclude that [since P(τ > n) → 0]

lim_{n→∞} E(|M_n| 1_{τ > n}) = 0,

so that condition (3.6) holds. Thus, we may state a weaker version of the OST:

Theorem 3.4. Let M_n be a UI martingale and τ a stopping time with respect to ℱ_n. Suppose that P(τ < ∞) = 1 and E|M_τ| < ∞. Then E(M_τ) = E(M_0).

A useful criterion for uniform integrability is the following: if for a sequence of random variables X_n there exists a constant C < ∞ so that E(X_n²) < C for each n, then the sequence X_n is uniformly integrable. [See Lawler p.98 (p.115) for a proof.]
Example 3.17. Consider a driftless random walk S_n as in Example 3.8, assuming P(ξ_i = 1) = P(ξ_i = −1) = 1/2 for every i ≥ 1. That is, we have a symmetric random walk on the integers starting at 0. We know this random walk is a martingale with respect to ℱ_n. Now consider the process S̃_n = S_n / n, n ≥ 1. We have that E(S̃_n²) = 1/n for every n ≥ 1. Thus the sequence S̃_n is UI [of course, as it is a bounded sequence in the first place]. This criterion is not satisfied for the random walk S_n itself. In fact, the random walk S_n is not UI!
3.5 Martingale convergence theorems

The following theorem is important.

Theorem 3.5. Suppose M_0, M_1, . . . is a supermartingale with respect to a filtration ℱ_n, and there exists a finite constant C so that E|M_n| < C for all n. Then there exists a random variable M_∞ so that, with probability one,

M_n → M_∞.

This result is proved in Lawler (Sect. 5.5) for a martingale sequence. We shall skip the general proof. The above convergence means that if you denote A = {ω : lim_{n→∞} M_n(ω) = M_∞(ω)}, then P(A) = 1. This means that the probability that the convergence lim_{n→∞} M_n(ω) = M_∞(ω) does not hold is zero, or that this convergence holds for almost every elementary event ω. Such a mode of convergence of random variables is called convergence with probability one or almost sure convergence [see the appendix in Mikosch].

Corollary 3.6. Suppose M_0, M_1, . . . is a non-negative supermartingale with respect to a filtration ℱ_n. Then there exists a random variable M_∞ so that

M_n → M_∞ with probability one.

For UI supermartingales we obtain a stronger result:

Theorem 3.7. Suppose M_0, M_1, . . . is a UI supermartingale with respect to a filtration ℱ_n. Then there exists a random variable M_∞ so that

M_n → M_∞ with probability one.

Moreover, E(M_∞ | ℱ_n) ≤ M_n. In particular, we get E(M_∞) ≤ E(M_0) [the equality holds here in the case of martingales].

Example 3.18. See Example 6, p.102 (p.120), in Lawler, for an interesting example of a non-UI martingale M_n which almost surely converges to M_∞ = 0, so that E(M_∞) = 0 ≠ E(M_0) = 1.
Homework 4: Discrete-time uniformly integrable martingales

1. Exercise 1.

2. An urn contains k red balls and m green balls at the initial time n = 0. One ball is chosen randomly from the urn. The ball is then put back into the urn together with another ball of the same color. Hence, the total number of balls in the urn grows [Polya's urn scheme, see Lawler Example 4 p.92 (p.109) and Example 3 p.101 (p.119)]. Let X_n denote the proportion of green balls in the urn at time n ≥ 0.

(a) Is X_n a Markov chain?
(b) Is X_n a martingale?
(c) Is it a UI martingale?
Part II

Some classes of continuous-time
stochastic processes
Chapter 4

Continuous-time stochastic processes

4.1 Generalities

So far we have been studying random processes in discrete time. We are turning now to studying random processes in continuous time [see Mikosch, Section 1.2].

A collection of random variables X_t, t ∈ [0, ∞), is called a continuous-time random (or stochastic) process. We shall frequently use the notation X to denote the process (X_t, t ∈ [0, ∞)). X_t denotes the state at time t ≥ 0 of our random process. That is, for every fixed t ≥ 0, X_t is a random variable on some underlying probability space (Ω, P). This means that X_t(·) is a function from Ω to the state space S [that is, X_t(·) : Ω → S]. On the other hand, for every fixed ω ∈ Ω we deal with a trajectory (or a sample path), denoted by X_·(ω), of our random process. That is, X_·(ω) is a function from [0, ∞) to S [that is, X_·(ω) : [0, ∞) → S].

The (natural) filtration generated by the process X is ℱ_t = σ(X_s, 0 ≤ s ≤ t) = the information contained in the random variables X_s, 0 ≤ s ≤ t. A process Y_t is said to be ℱ-adapted if Y_t is ℱ_t-measurable for every t.
4.2 Continuous-time martingales

Since the definitions and results concerning martingales in continuous time are essentially analogous to those in discrete time, in this section we state definitions and results without much elaboration [see Mikosch, Section 1.5].

Definition 4.1. A process Y = (Y_t, t ≥ 0) is called a martingale, resp. submartingale, resp. supermartingale, with respect to the family of information sets ℱ = (ℱ_t, t ≥ 0) [which satisfy ℱ_s ⊆ ℱ_t, s ≤ t] if Y is ℱ-adapted and

(i) E|Y_t| < ∞, t ≥ 0,

(ii) for every s ≤ t one has E(Y_t | ℱ_s) = Y_s, resp. E(Y_t | ℱ_s) ≥ Y_s, resp. E(Y_t | ℱ_s) ≤ Y_s.

4.2.1 Optional Stopping Theorem

Definition 4.2. A nonnegative random variable τ is called a stopping time relative to (ℱ_t, t ≥ 0) if for each t, I(τ ≤ t), the indicator function of the event {τ ≤ t}, is measurable with respect to ℱ_t.

Theorem 4.1. Let (M_t, t ≥ 0) be a martingale and τ a stopping time with respect to (ℱ_t, t ≥ 0). If

(i) P(τ < ∞) = 1,

(ii) E|M_τ| < ∞,

(iii) lim_{t→∞} E[M_t I(τ > t)] = 0,

then

E(M_τ) = E(M_0).
Chapter 5

Continuous-time Markov chains

[Lawler, Chapter 3]

Definition 5.1. A random process X is called a continuous-time Markov chain with a discrete (finite or countable) state space S if for any 0 ≤ s, t and for any y ∈ S it holds that

P(X_{s+t} = y | ℱ_s) = P(X_{s+t} = y | X_s).

The above Markov property can equivalently be stated as: for any sequence of times 0 ≤ t_1 ≤ t_2 ≤ ⋯ ≤ t_{n−1} ≤ t_n < ∞, and for any collection of states x_1, x_2, . . . , x_{n−1}, x_n, we have

P(X_{t_n} = x_n | X_{t_{n−1}} = x_{n−1}, . . . , X_{t_2} = x_2, X_{t_1} = x_1) = P(X_{t_n} = x_n | X_{t_{n−1}} = x_{n−1}).

Definition 5.2. A Markov chain X is time homogeneous iff for all x, y ∈ S and all 0 ≤ s, t we have

P(X_{s+t} = y | X_s = x) = P(X_t = y | X_0 = x) =: p(t; x, y).

Let

Q(t) = (p(t; x, y))_{x,y∈S}, t ≥ 0,

denote the transition probability function for a time-homogeneous Markov chain X. Note that Q(0) = I.

Proposition 5.1. For every s, t ≥ 0, the transition probability function for a time-homogeneous Markov chain X satisfies

(i) 0 ≤ p(t; x, y) ≤ 1, x, y ∈ S,

(ii) Σ_y p(t; x, y) = 1, x ∈ S,

(iii) (Chapman-Kolmogorov equations)

p(s + t; x, y) = Σ_z p(s; x, z) p(t; z, y), x, y ∈ S, s, t ≥ 0, (5.1)

or, equivalently,

Q(s + t) = Q(s) Q(t), s, t ≥ 0.

Proof. Left as an exercise.
Recall that if a real-valued continuous function f(t) satisfies the equation

f(s + t) = f(s) f(t), s, t ≥ 0,

then it is differentiable and such that (with f′ = df/dt)

f′(t) = a f(t), f(t) = e^{at},

for some real number a. Similarly, in the case of a continuous semigroup of transition probabilities Q(t) with a so-called matrix generator A of X, the matrix function Q is differentiable in t ≥ 0 and such that Q(0) = I and, for t ≥ 0,

Q′(t) = Q(t) A (5.2)

(forward form), or, equivalently, since Q(t) commutes with A,

Q′(t) = A Q(t) (5.3)

(backward form). The above equations are called the Kolmogorov equations. The (infinite) matrix form of the unique solution is

Q(t) = e^{tA}, t ≥ 0, (5.4)

where for any matrix A [finite or infinite] the matrix exponential is defined in the usual manner:

e^{tA} = Σ_{n=0}^∞ (tA)^n / n! .

For any initial distribution vector α_0 = (P(X_0 = x), x ∈ S) one then has that

(P(X_t = x), x ∈ S) =: α_t = α_0 Q(t) = α_0 e^{tA}.
Remark 5.3. The Chapman-Kolmogorov equation (5.1) can be equivalently written as

Q(s + t) − Q(t) = Q(t) (Q(s) − I), s, t ≥ 0.

Now, fix t and rewrite the above for s > 0:

[Q(t + s) − Q(t)] / s = Q(t) [Q(s) − Q(0)] / s.

Since the matrix function Q(t) is differentiable, we obtain after letting s ↓ 0

Q′(t) = Q(t) Q′(0)

(with Q′(0) the right derivative of Q(t) at t = 0). Comparing this with the forward equation we conclude that

A = Q′(0).

Remark 5.4. Recall from Chapter 1 that for a discrete-time Markov chain the n-step transition matrix Q_n satisfies the first-step equation: Q_0 = I and, for n = 0, 1, 2, . . . ,

Q_{n+1} = Q Q_n,

or, equivalently, the last-step equation: Q_0 = I and, for n = 0, 1, 2, . . . ,

Q_{n+1} = Q_n Q,

where A = Q − I. The solution to both equations is Q_n = Q^n = (I + A)^n, n = 0, 1, 2, 3, . . . . The forward Kolmogorov equation (5.2) is the continuous-time counterpart of the last-step equation. The backward Kolmogorov equation (5.3) is the continuous-time counterpart of the first-step equation.
5.1 Poisson process [Shreve, Sections 11.2 and 11.3]

Although it is customary to denote a Poisson process by N = (N_t, t ≥ 0), Lawler uses the notation X = (X_t, t ≥ 0). We shall follow Lawler's notation. The notation h stands for a small time increment.

Definition 5.5. Let X_0 = 0. Also, let X_t denote the (random) number of occurrences of some underlying random event in the time interval (0, t], t ≥ 0. If X_t satisfies the following two conditions we shall call X a Poisson process with intensity (or rate) λ > 0:

(i) For every t ≥ 0 we have

P(X_{t+h} − X_t = k) =  λh + o(h)       if k = 1,
                        o(h)            if k ≥ 2,
                        1 − λh + o(h)   if k = 0,

where lim_{h↓0} o(h)/h = 0.

(ii) For any sequence 0 ≤ s_1 ≤ t_1 ≤ s_2 ≤ t_2 ≤ ⋯ ≤ s_n ≤ t_n < ∞, the random variables X_{t_1} − X_{s_1}, X_{t_2} − X_{s_2}, . . . , X_{t_n} − X_{s_n} are independent.

For any 0 ≤ s ≤ t the random variable X_t − X_s denotes the number of occurrences of our underlying random event in the time interval (s, t]. Any random process X satisfying condition (ii) of the above definition is called a process with independent increments. See Lawler, Section 3.1, for a motivating example where X_t represents the number of customers arriving at a service facility by time t. The following important result explains the name Poisson process:

Theorem 5.2. Let X be a Poisson process with rate λ. Then, for any 0 ≤ s, t, we have

P(X_{s+t} − X_s = k) = (λt)^k e^{−λt} / k!, k = 0, 1, 2, . . . .

In other words, the increment X_{s+t} − X_s is a random variable that has the Poisson distribution with parameter λt.
Proof. Let us fix s and denote P_k(t) = P(X_{s+t} − X_s = k). Now,

P_0(t + h) = P(X_{s+t+h} − X_s = 0)
 = P(X_{s+t} − X_s = 0, X_{s+t+h} − X_{s+t} = 0)
 = P(X_{s+t} − X_s = 0) P(X_{s+t+h} − X_{s+t} = 0)
 = P_0(t) [1 − λh + o(h)].

Therefore,

[P_0(t + h) − P_0(t)] / h = −λ P_0(t) + [o(h)/h] P_0(t).

Letting h ↓ 0 we get

P_0′(t) = −λ P_0(t), t ≥ 0, (5.5)

with the initial condition

P_0(0) = P(X_0 = 0) = 1.

The ordinary differential equation (5.5) has the well-known solution (check it as an exercise)

P_0(t) = e^{−λt}, t ≥ 0.

Thus, P(X_{s+t} − X_s = 0) = P_0(t) = e^{−λt} for t ≥ 0.

Next, for any k ≥ 1, we have

P_k(t + h) = Σ_{i=0}^k P(X_{s+t} − X_s = k − i, X_{s+t+h} − X_{s+t} = i)
 = Σ_{i=0}^k P_{k−i}(t) P(X_{s+t+h} − X_{s+t} = i)
 = P_k(t) P(X_{s+t+h} − X_{s+t} = 0) + P_{k−1}(t) P(X_{s+t+h} − X_{s+t} = 1)
   + Σ_{i=2}^k P_{k−i}(t) P(X_{s+t+h} − X_{s+t} = i)
 = P_k(t) [1 − λh + o(h)] + P_{k−1}(t) [λh + o(h)] + Σ_{i=2}^k P_{k−i}(t) o(h).

So,

[P_k(t + h) − P_k(t)] / h = λ (P_{k−1}(t) − P_k(t)) + [o(h)/h] Σ_{i=2}^k P_{k−i}(t).

Again, letting h ↓ 0 we get

P_k′(t) = λ (P_{k−1}(t) − P_k(t)), t ≥ 0, (5.6)

with the initial condition

P_k(0) = P(X_0 = k) = 0,

for k = 1, 2, 3, . . . . Thus, for k = 1 we get

P_1′(t) = λ (e^{−λt} − P_1(t)), t ≥ 0, (5.7)

with the initial condition P_1(0) = 0. The ordinary differential equation (5.7) has the well-known solution

P_1(t) = λt e^{−λt}, t ≥ 0.

Thus, P(X_{s+t} − X_s = 1) = P_1(t) = (λt)^1 e^{−λt} / 1! for t ≥ 0. Proceeding similarly for k ≥ 2 we finally obtain

P(X_{s+t} − X_s = k) = P_k(t) = (λt)^k e^{−λt} / k!, t ≥ 0,

for all k = 0, 1, 2, 3, . . . .
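Theorem 5.2 can be illustrated by simulation (a sketch with parameters of my choosing): build the process from i.i.d. exponential sojourn times, as discussed below for the jump times, and compare the empirical distribution of X_t with the Poisson(λt) probabilities.

```python
import math
import random

# Simulate a Poisson process by summing i.i.d. Exp(lam) sojourn times and
# compare the empirical law of X_t with the Poisson(lam * t) pmf.
rng = random.Random(5)
lam, t, n_paths = 2.0, 3.0, 100_000
counts = [0] * 40
for _ in range(n_paths):
    clock, jumps = 0.0, 0
    while True:
        clock += rng.expovariate(lam)   # exponential sojourn time
        if clock > t:
            break
        jumps += 1
    counts[min(jumps, 39)] += 1
for k in range(10):
    empirical = counts[k] / n_paths
    theory = (lam * t) ** k * math.exp(-lam * t) / math.factorial(k)
    print(k, round(empirical, 4), round(theory, 4))  # columns should agree
```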
We now have the important

Corollary 5.3. Let X be a Poisson process with rate λ. Then the process X is a time-homogeneous Markov chain.

Proof. It is enough to verify that for any three times 0 ≤ s ≤ u ≤ t < ∞ and for any three integers k ≤ m ≤ n we have

P(X_t = n | X_u = m, X_s = k) = P(X_t = n | X_u = m),

and that P(X_t = n | X_u = m) only depends on m, n and the time differential t − u. Now,

P(X_t = n | X_u = m, X_s = k)
 = P(X_t − X_u = n − m, X_u − X_s = m − k, X_s = k) / P(X_u − X_s = m − k, X_s = k)
 = P(X_t − X_u = n − m) = P(X_t − X_u = n − m | X_u = m)
 = P(X_t = n | X_u = m). (5.8)

This proves the Markov property. From Theorem 5.2 we know that

P(X_t − X_u = n − m) = [λ(t − u)]^{n−m} e^{−λ(t−u)} / (n − m)!,

which proves the time homogeneity in view of (5.8).

In particular,

P(X_t = n | X_0 = 0) = P(X_t = n) = (λt)^n e^{−λt} / n!.

Consider the random time τ_1 = min{t ≥ 0 : X_t = 1}. This is the time of the first jump of the process X. Then, of course, we have for all t ≥ 0

P(τ_1 > t) = P(X_t = 0) = e^{−λt},

and so the random time τ_1 has the exponential distribution with parameter λ. More generally, the so-called sojourn times, that is, the random times τ_n that elapse between the consecutive jumps of the Poisson process X, are i.i.d. random variables, each having the exponential distribution with parameter λ.

The previous remark tells us that the trajectories of a Poisson process with rate λ are right-continuous step functions. The height of each step is 1. The length of each step is the value of an exponential random variable with parameter λ. The lengths of different steps are distributed independently. Now sketch a graph of a trajectory (or a sample path) of a Poisson process [you may want to look at Figure 1.2.9 in Mikosch].

The time-t transition matrix of a Poisson process is Q(t) = (P(X_{s+t} = n | X_s = m))_{m,n=0,1,2,...} = (P(X_{s+t} − X_s = n − m))_{m,n=0,1,2,...} = (P_{n−m}(t))_{m,n=0,1,2,...}, where P_{n−m}(t) = 0 for n < m. Thus,

Q(t) =
         0        1        2        3        4     ⋯
  0    P_0(t)   P_1(t)   P_2(t)   P_3(t)   P_4(t)  ⋯
  1      0      P_0(t)   P_1(t)   P_2(t)   P_3(t)  ⋯
  2      0        0      P_0(t)   P_1(t)   P_2(t)  ⋯
  3      0        0        0      P_0(t)   P_1(t)  ⋯
  ⋮      ⋮        ⋮        ⋮        ⋮        ⋮
Let us now introduce the countably infinite matrix

A =
        0     1     2     3     4   ⋯
  0    −λ     λ     0     0     0   ⋯
  1     0    −λ     λ     0     0   ⋯
  2     0     0    −λ     λ     0   ⋯
  3     0     0     0    −λ     λ   ⋯
  ⋮     ⋮     ⋮     ⋮     ⋮     ⋮          (5.9)

Consistently with the general form (5.2)-(5.3) of the Kolmogorov equations of a Markov process, the system of ordinary differential equations (5.5), (5.6) can be written in matrix form as: Q(0) = I and, for t ≥ 0,

Q′(t) = Q(t) A = A Q(t), Q(t) = e^{tA}.

For 0 ≤ s ≤ t the increment X_t − X_s is a random variable that has the Poisson distribution with parameter λ(t − s). We also know that this random variable is independent of all the random variables X_u, 0 ≤ u ≤ s. Thus, we have

E(X_t − λt | ℱ_s) = E(X_t − X_s − λ(t − s) | ℱ_s) + E(X_s − λs | ℱ_s) = 0 + X_s − λs.

This means that the process

M_t := X_t − λt, t ≥ 0,

is a martingale with respect to the natural filtration of X.
5.2 Two-state continuous-time Markov chain

We now consider a two-state Markov chain with the infinitesimal generator

        0     1
A =  0 ( −λ     λ )
     1 (  μ    −μ )

If the process is in state 0 then it waits for a random time τ_0 before it decides to jump to state 1. The random time τ_0 has an exponential distribution with parameter λ. If the process is in state 1 then it waits for a random time τ_1 before it decides to jump to state 0. The random time τ_1 has an exponential distribution with parameter μ.

The forward Kolmogorov equation is

Q′(t) = Q(t) A, Q(0) = I, t ≥ 0;

that is,

p′(t; 0, 0) = −λ p(t; 0, 0) + μ p(t; 0, 1),  p′(t; 0, 1) = λ p(t; 0, 0) − μ p(t; 0, 1),
p′(t; 1, 0) = −λ p(t; 1, 0) + μ p(t; 1, 1),  p′(t; 1, 1) = λ p(t; 1, 0) − μ p(t; 1, 1),

for t ≥ 0, with the initial conditions

p(0; 0, 0) = p(0; 1, 1) = 1, p(0; 1, 0) = p(0; 0, 1) = 0.

The backward Kolmogorov equation is

Q′(t) = A Q(t), Q(0) = I, t ≥ 0;

that is,

p′(t; 0, 0) = −λ p(t; 0, 0) + λ p(t; 1, 0),  p′(t; 0, 1) = −λ p(t; 0, 1) + λ p(t; 1, 1),
p′(t; 1, 0) = μ p(t; 0, 0) − μ p(t; 1, 0),  p′(t; 1, 1) = μ p(t; 0, 1) − μ p(t; 1, 1),

for t ≥ 0, with the initial conditions

p(0; 0, 0) = p(0; 1, 1) = 1, p(0; 1, 0) = p(0; 0, 1) = 0.

The matrix A diagonalizes as follows (check it as an exercise!):

A = ( 1   λ/μ ) ( 0      0     ) ( μ/(λ+μ)    λ/(λ+μ) )
    ( 1   −1  ) ( 0   −(λ+μ)  ) ( μ/(λ+μ)   −μ/(λ+μ) )

Thus, the solution to both equations (forward and backward) is

Q(t) = e^{tA} = ( 1   λ/μ ) ( 1       0         ) ( μ/(λ+μ)    λ/(λ+μ) )
                ( 1   −1  ) ( 0   e^{−t(λ+μ)}   ) ( μ/(λ+μ)   −μ/(λ+μ) )

That is,

p(t; 0, 0) = μ/(λ+μ) + [λ/(λ+μ)] e^{−t(λ+μ)},  p(t; 0, 1) = 1 − p(t; 0, 0),
p(t; 1, 1) = λ/(λ+μ) + [μ/(λ+μ)] e^{−t(λ+μ)},  p(t; 1, 0) = 1 − p(t; 1, 1),

for t ≥ 0.
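The closed form above can be cross-checked against the matrix exponential Q(t) = e^{tA}, here computed by truncating the defining series (a self-contained sketch in pure Python; numerical values are illustrative).

```python
import math

# Check the closed-form Q(t) for the two-state chain against e^{tA},
# computed by truncating the series e^{tA} = sum_n (tA)^n / n!.
lam, mu, t = 1.5, 0.5, 2.0

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm2(A, t, n_terms=60):
    """Truncated series e^{tA} for a 2x2 matrix A."""
    tA = [[t * a for a in row] for row in A]
    result = [[1.0, 0.0], [0.0, 1.0]]   # identity = the n = 0 term
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, n_terms):
        term = [[x / n for x in row] for row in mat_mul(term, tA)]  # (tA)^n/n!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

A = [[-lam, lam], [mu, -mu]]
Q = expm2(A, t)
p00 = mu / (lam + mu) + lam / (lam + mu) * math.exp(-t * (lam + mu))
p11 = lam / (lam + mu) + mu / (lam + mu) * math.exp(-t * (lam + mu))
print(Q[0][0], p00)   # should agree
print(Q[1][1], p11)   # should agree
```

The rows of the computed Q also sum to 1, as they must for a transition matrix.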
Observe that

lim_{t→∞} Q(t) = ( μ/(λ+μ)   λ/(λ+μ) )  =  ( π )
                 ( μ/(λ+μ)   λ/(λ+μ) )     ( π )

where π = (μ/(λ+μ), λ/(λ+μ)). Thus π is the unique stationary distribution for this chain.
5.3 Birth-and-death process [Lawler, Section 3.3]

The infinitesimal generator of a birth-and-death process (BDP) is the infinite matrix

A =
        0        1            2            3          ⋯
  0   −λ_0      λ_0           0            0          ⋯
  1    μ_1   −(λ_1+μ_1)      λ_1           0          ⋯
  2     0       μ_2        −(λ_2+μ_2)     λ_2         ⋯
  3     0        0           μ_3        −(λ_3+μ_3)    ⋯
  ⋮     ⋮        ⋮            ⋮            ⋮

The constants μ_i ≥ 0, i = 0, 1, 2, 3, . . . , represent the death rates at the various states of the process. They are the intensities of downward transitions; note that we always have μ_0 = 0. The constants λ_i ≥ 0, i = 0, 1, 2, 3, . . . , represent the birth rates at the various states of the process. They are the intensities of upward transitions. Observe that the diagonal elements of the matrix A are non-positive, and that the rows of the matrix A sum up to 0.

In each state i the process waits a random amount of time, τ_i, before the process decides to jump to either the higher state i + 1 or the lower state i − 1 [the latter possibility is valid only if i ≥ 1]. The waiting time τ_i has an exponential distribution with parameter μ_i + λ_i. Thus, the intensity (or rate) of a jump out of state i is μ_i + λ_i. Once the process decides to jump from state i, the probability of the jump up [to i + 1] is λ_i/(λ_i + μ_i), and the probability of the jump down [to i − 1] is μ_i/(λ_i + μ_i). Obviously, if μ_i + λ_i = 0 then the process never leaves state i with probability one [we deal here with a degenerate exponential distribution with all probability mass concentrated at ∞].

The Poisson process is a BDP for which μ_i = 0, i = 0, 1, 2, 3, . . . , and λ_i = λ, i = 0, 1, 2, 3, . . . , for some positive λ.
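The waiting-time/jump description above translates directly into a simulation recipe (a sketch; the constant rates λ_i = 1.0, μ_i = 1.5 for i ≥ 1 are my illustrative choice, giving an M/M/1-type queue):

```python
import random

# One trajectory of a birth-and-death process: in state i, wait an
# Exp(lam_i + mu_i) sojourn time, then jump up with probability
# lam_i / (lam_i + mu_i) and down otherwise.
def simulate_bd(t_max, rng, lam=1.0, mu=1.5):
    """Return the list of (jump time, new state) pairs up to time t_max."""
    t, state, path = 0.0, 0, [(0.0, 0)]
    while True:
        birth = lam
        death = mu if state >= 1 else 0.0   # mu_0 = 0: no deaths at state 0
        total_rate = birth + death
        t += rng.expovariate(total_rate)    # Exp(lam_i + mu_i) sojourn time
        if t > t_max:
            return path
        state += 1 if rng.random() < birth / total_rate else -1
        path.append((t, state))

rng = random.Random(9)
path = simulate_bd(20.0, rng)
print(path[:5])   # first few (time, state) pairs of the step trajectory
```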
Homework 5: Poisson processes and continuous-time Markov chains

1. Prove Proposition 5.1.

2. Let X_t be a Markov chain with state space {1, 2} and intensities λ(1, 2) = 1, λ(2, 1) = 4. Find Q(t).

3. Let X be a Poisson process with parameter λ = 2. Determine the following expectations:

(a) E(X_2)
(b) E(X_1²)
(c) E(X_1 X_2)
(d) E(τ_2), E(τ_1)
(e) E(T_2), where T_2 denotes the second jump time of X_t
(f) E(τ_1 τ_2)

4. Lawler, Ex. 3.1. Let X_t = # of calls arrived by time t. Thus, X is a Poisson process with rate λ = 4/hour. Compute:

(a) P(X_1 < 2)
(b) P(X_2 ≥ 8 | X_1 = 6)

5. Let X be a Poisson process with parameter λ. Is the process Y_t = X_t² − λt a martingale with respect to the natural filtration of X?
Chapter 6

Brownian motion [Lawler, Chapter 8; Mikosch, Section 1.3; Shreve, Sections 3.3–3.7]

Up to now we have been dealing with stochastic processes taking at most a countable number of values. The one-dimensional Brownian motion (BM) process takes values in the entire real line, and that is why it is successfully used to model certain types of random continuous motions. [See Mikosch, Section 1.3, for graphs of simulated sample paths of a 1-d BM process.]

6.1 Definition and basic properties

Our intention is to model a random continuous motion that satisfies certain desirable physical postulates. [It is worth emphasizing that here the continuity is understood both in the time variable and in the space variable.] Let X_t denote the position at time t ≥ 0 of our random process. Our postulates regarding X_t are as follows:

- X_0 = 0.

- The random process has independent and time-homogeneous increments. That is, for any 0 ≤ s ≤ t ≤ u ≤ v the random variables X_t − X_s and X_v − X_u are independent. In addition, for any 0 ≤ s ≤ t the distribution of X_t − X_s depends only on the time differential t − s.

Recall that a Poisson process with rate λ satisfies the above two postulates. But it is an integer-valued process. Let us make instead the following additional postulate:

- The sample paths X_·(ω) of our random process are continuous functions from [0, ∞) to the state space S = (−∞, ∞).

It turns out [compare Lawler p.143-144 (p.173-174)] that the above three postulates imply that for any 0 ≤ s ≤ t the distribution of the increment X_t − X_s must be Gaussian:

X_t − X_s ∼ N( μ(t − s), σ²(t − s) )

for some constants μ and σ > 0 [we assume strict positivity of σ to avoid trivialities]. All this motivates the following
Definition 6.1. The stochastic process $(X_t,\ t \geq 0)$ is called Brownian motion with drift, or Wiener process with drift, starting at 0, if $X_0 = 0$ and

1. Sample paths of $X_t$ are continuous functions of $t$,
2. $X_t$ has independent increments: for any sequence of times $0 \leq t_1 < t_2 < \ldots < t_n < \infty$, the increments $X_{t_1} - X_0,\ X_{t_2} - X_{t_1},\ \ldots,\ X_{t_n} - X_{t_{n-1}}$ are independent random variables; for $t \geq s$ the distribution of $X_t - X_s$ depends on $t$ and $s$ only through $t - s$,
3. For $0 \leq s \leq t$,
$$X_t - X_s \sim N\big(\mu(t-s), \sigma^2(t-s)\big).$$

It was indicated above that condition 3 in Definition 6.1 is implied by conditions 1 and 2 (assuming $X_0 = 0$). Nevertheless, it is customary to include this condition as a part of the definition of the BM process. Note that in particular we have $X_t \sim N(\mu t, \sigma^2 t)$. If we change $X_0 = 0$ into $X_0 = x$ in the definition, then $X_t \sim N(x + \mu t, \sigma^2 t)$. Here $\mu$ is called the drift parameter and $\sigma^2$ the variance (or diffusion) parameter.

Frequently, the definition of a BM process is formulated for the case $\mu = 0$. This is the definition of the BM process given in Lawler p.144 (p.174). We have chosen to give the general (i.e. $\mu \in (-\infty, \infty)$) definition above. Such a BM process is called Brownian motion with drift in Lawler, Section 8.7, and denoted by $Y_t$, $t \geq 0$.

Definition 6.2. When $\mu = 0$ and $\sigma^2 = 1$, the process $X_t$, $t \geq 0$, is called standard Brownian motion (SBM) and is often denoted by $W_t$, $t \geq 0$. [Mikosch denotes the SBM by $B_t$, as is also done in some other texts.]

It can be shown that sample paths of a BM, though continuous, are nowhere differentiable [see Lawler p.145 (p.175), or Mikosch p.36]. It can also be shown that sample paths of BM do not have bounded variation on any finite time interval [see Mikosch p.39].
6.1.1 Random walk approximation

A random walk may serve as a discrete time prototype of the BM process. As a matter of fact, the BM process can be constructed as an appropriate limit of random walk processes. Here is how [the construction is done for the case $\mu = 0$ for simplicity; also, compare Lawler p.144-145 (p.174-175), or Mikosch Section 1.3.3]. Let $\varepsilon_1, \varepsilon_2, \ldots$ be i.i.d. with $P(\varepsilon_1 = \pm 1) = \frac{1}{2}$, and let
$$X_t^{h,k} = k\big(\varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_{[t/h]}\big),$$
where $[x]$ denotes the greatest integer $\leq x$. $X_t^{h,k}$ can be interpreted as the time-$t$ location of the particle executing a random walk with size of the step given by $k$ and with time unit equal to $h$. We have
$$E(X_t^{h,k}) = 0, \qquad Var(X_t^{h,k}) = k^2\Big[\frac{t}{h}\Big] \approx k^2\,\frac{t}{h}.$$
Let $h, k \to 0$ in such a way that $Var(X_t^{h,k})$ converges to a finite, positive number. (Note that if we set $k = h$, then $Var(X_t^{h,k}) \to 0$.) This can be accomplished by maintaining $k^2 = \sigma^2 h$, for a finite constant $\sigma$. In particular, by considering $X^n = X^{\frac{1}{n}, \frac{\sigma}{\sqrt{n}}}$ we obtain $Var(X_t^n) \to \sigma^2 t$. We then have
$$X_t^n = \frac{\sigma}{\sqrt{n}}\big(\varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_{[nt]}\big) = \frac{\sigma}{\sqrt{n}}\big(\varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_{[nt]}\big)\,\frac{\sqrt{[nt]}}{\sqrt{[nt]}} = \sigma\sqrt{t}\ \frac{\varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_{[nt]}}{\sqrt{[nt]}}\ \frac{\sqrt{[nt]}}{\sqrt{nt}}$$
for every $n$. We note that, as $n \to \infty$, $\sqrt{[nt]}/\sqrt{nt} \to 1$, and thus by the central limit theorem, as $n \to \infty$,
$$X_t^n \to N(0, \sigma^2 t) \quad \text{in distribution.}$$
In addition one can show that all joint distributions of $X^n$ converge to the corresponding joint normal distributions.
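The convergence above is easy to see numerically. The following sketch (Python with NumPy; the values of $n$, $t$, $\sigma$ and the number of sample paths are arbitrary illustrative choices, not from the text) simulates $X_t^n = \frac{\sigma}{\sqrt{n}}(\varepsilon_1 + \ldots + \varepsilon_{[nt]})$ and checks that its sample mean and variance are close to $0$ and $\sigma^2 t$, as the CLT predicts.

```python
import numpy as np

rng = np.random.default_rng(0)

def scaled_walk(n, t, sigma, paths):
    """Simulate X_t^n = (sigma / sqrt(n)) * (eps_1 + ... + eps_[nt])
    for `paths` independent walks, P(eps = +1) = P(eps = -1) = 1/2."""
    eps = rng.choice([-1.0, 1.0], size=(paths, int(n * t)))
    return sigma / np.sqrt(n) * eps.sum(axis=1)

x = scaled_walk(n=500, t=2.0, sigma=0.5, paths=5000)
mean_x, var_x = x.mean(), x.var()   # CLT: approximately N(0, sigma^2 t) = N(0, 0.5)
```

With a finer time grid (larger $n$) and more paths, the sample variance approaches $\sigma^2 t$ at the usual Monte Carlo rate.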
6.1.2 Second order properties

We know that $X_t \sim N(\mu t, \sigma^2 t)$. We now consider the joint probability density function of $(X_{t_1}, X_{t_2}, \ldots, X_{t_n})$. Recall that the random variables $X_{t_1}, X_{t_2}, \ldots, X_{t_n}$ are said to have a joint normal distribution if they can be represented as
$$X_{t_i} = \sum_{j=1}^{n} a_{i,j}\,\varepsilon_j, \qquad i = 1, 2, \ldots, n,$$
where the $\varepsilon_j$, $1 \leq j \leq n$, are independent normal random variables and the $a_{i,j}$ are arbitrary constants. For Brownian motion, we have
$$X_{t_1} = X_{t_1} - X_0$$
$$X_{t_2} = (X_{t_2} - X_{t_1}) + (X_{t_1} - X_0)$$
$$\ldots$$
$$X_{t_n} = (X_{t_n} - X_{t_{n-1}}) + \ldots + (X_{t_1} - X_0),$$
where, by definition of BM, the increments are independent normal random variables. Hence the distribution of $(X_{t_1}, X_{t_2}, \ldots, X_{t_n})$ is multi-variate normal with $E X_{t_j} = \mu t_j$, $1 \leq j \leq n$, and with covariance matrix $C = (C_{i,j})$ given by
$$C_{i,j} = Cov\big(X_{t_i}, X_{t_j}\big) = \sigma^2 \min(t_i, t_j).$$
We verify the last statement. Assume that $s < t$. Then
$$Cov(X_s, X_t) = E(X_s X_t) - (E X_s)(E X_t)$$
$$= E[X_s(X_t - X_s)] + E X_s^2 - \mu^2 st$$
$$= E X_s\, E(X_t - X_s) + \sigma^2 s + \mu^2 s^2 - \mu^2 st$$
$$= \mu^2 s(t - s) + \sigma^2 s + \mu^2 s^2 - \mu^2 st$$
$$= \sigma^2 s = \sigma^2 \min(s, t).$$
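The identity $Cov(X_s, X_t) = \sigma^2\min(s,t)$ can be checked by simulation. The sketch below (Python with NumPy; the parameter values are arbitrary illustrations) builds $(X_s, X_t)$ from two independent Gaussian increments, exactly as in the decomposition above, and compares the sample covariance with the theoretical value.

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma, s, t, paths = 0.3, 0.7, 1.0, 2.5, 200000
# Build (X_s, X_t) from the two independent Gaussian increments of a drifted BM
inc_0s = rng.normal(mu * s, sigma * np.sqrt(s), paths)            # X_s - X_0
inc_st = rng.normal(mu * (t - s), sigma * np.sqrt(t - s), paths)  # X_t - X_s
x_s = inc_0s
x_t = inc_0s + inc_st

cov_mc = np.cov(x_s, x_t)[0, 1]
cov_theory = sigma**2 * min(s, t)   # sigma^2 * min(s, t) = 0.49
```

Note that the drift $\mu$ plays no role in the covariance, only in the means, which matches the computation in the text.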
6.2 Markov properties

Brownian motion, as a process with independent increments, is a Markov process. This can be verified as follows:
$$P(X_{t+s} \leq y \mid X_s = x, X_{t_1} = x_1, \ldots, X_{t_n} = x_n)$$
$$= P(X_{t+s} - X_s \leq y - x \mid X_s = x, X_{t_1} = x_1, \ldots, X_{t_n} = x_n)$$
$$= P(X_{t+s} - X_s \leq y - x) = P(X_{t+s} \leq y \mid X_s = x),$$
where $0 \leq t_1 < t_2 < \ldots < t_n < s$. Let
$$p(t; x, y) := \frac{\partial}{\partial y}\, P(X_{t+s} \leq y \mid X_s = x) = \frac{\partial}{\partial y}\, P(X_{t+s} - X_s \leq y - x) = \frac{1}{\sqrt{2\pi t}\,\sigma}\, \exp\big[-(y - x - \mu t)^2 / 2\sigma^2 t\big],$$
where the first and second equality follow from the first and second defining property of BM, respectively. The function $p(t; x, y)$ is the probability density function of $X_{t+s}$ given that $X_s = x$. It is called the transition density function of $(X_t,\ t \geq 0)$.

Note that $p(t; x, y)$ depends on $x, y$ only through $(y - x)$. Therefore BM is a spatially homogeneous process as well as a time homogeneous process. You may recall that analogous properties were satisfied for a Poisson process.
Remark 6.3. One has the following two properties of the function $p(t; x, y)$, with $\partial_z := \frac{\partial}{\partial z}$, $\partial_z^2 := \frac{\partial^2}{\partial z^2}$ for every variable $z$.

(i) For every $x \in (-\infty, \infty)$,
$$\partial_t\, p(t; x, y) = \mathcal{A}^* p(t; x, y), \qquad t > 0,\ y \in (-\infty, \infty),$$
where
$$\mathcal{A}^* p(t; x, y) = -\mu\, \partial_y\, p(t; x, y) + \frac{1}{2}\sigma^2\, \partial_y^2\, p(t; x, y). \qquad (6.1)$$
The operator $\mathcal{A}^*$ is the adjoint infinitesimal generator for BM. Equation (6.1) is called the forward Kolmogorov equation for the transition probability density function. [Compare with equation (5.2) in the case of a continuous-time Markov chain.]

(ii) For every $y \in (-\infty, \infty)$,
$$\partial_t\, p(t; x, y) = \mathcal{A} p(t; x, y), \qquad t > 0,\ x \in (-\infty, \infty), \qquad (6.2)$$
where
$$\mathcal{A} p(t; x, y) = \mu\, \partial_x\, p(t; x, y) + \frac{1}{2}\sigma^2\, \partial_x^2\, p(t; x, y).$$
The operator $\mathcal{A}$ is the infinitesimal generator for BM. Equation (6.2) is called the backward Kolmogorov equation for the transition probability density function. [Compare with equation (5.3) in the case of a continuous-time Markov chain.]
The BM process also satisfies the so-called strong Markov property, namely the fact that the process $(W_{t+\tau} - W_\tau,\ t \geq 0)$ is a Brownian motion independent of $\mathcal{F}_\tau$, for every stopping time $\tau$ [see Lawler p.147 (p.178)]. Using this property the following three important features of the SBM process (i.e. $X_t = W_t$) can be demonstrated [see Lawler p.148-150 (p.178-180)]:

Reflection Principle. For any $b > 0$ and for any $t > 0$,
$$P(W_s \geq b \text{ for some } 0 \leq s \leq t) = 2\, P(W_t \geq b).$$
Equivalently,
$$P(\tau_b < t) = 2\, P(W_t \geq b),$$
where $\tau_b = \inf\{t \geq 0 : W_t = b\}$.

Arctan law and recurrence. For any $t > 1$,
$$P(W_s = 0 \text{ for some } 1 \leq s \leq t) = 1 - \frac{2}{\pi}\, \operatorname{Arctan} \frac{1}{\sqrt{t - 1}}.$$
Consequently,
$$P(W_s = 0 \text{ for some } 1 \leq s) = 1.$$

Strong law of large numbers. With probability 1 we have
$$\lim_{t \to \infty} \frac{W_t}{t} = 0.$$
Homework 6: Brownian motion

1. Suppose $W_t$ is a standard Brownian motion and $Y_t = \frac{1}{\sqrt{a}} W_{at}$ with $a > 0$. Show that $Y_t$ is a standard Brownian motion (known as time rescaled standard Brownian motion).

2. Suppose $W_t$ is a standard Brownian motion and $Z_t = t\, W_{1/t}$. Show that $Z_t$ is a standard Brownian motion (known as time reversed standard Brownian motion).

3. Let $W_t$ be a standard Brownian motion. Compute the following conditional probability: $P(W_2 > 0 \mid W_1 > 0)$. Are the events $\{W_1 > 0\}$ and $\{W_2 > 0\}$ independent?

4. Let $W_t$ be a standard Brownian motion.
(a) Express the joint density function of $W_2, W_4, W_6$ in terms of the transition density function of $(W_t,\ t \geq 0)$.
(b) Compute the probability density function of $W_4$ conditional on $W_2 = 0$ and $W_6 = 0$.
(c) Compute $E(W_6 \mid W_2, W_4)$.
(d) Compute $E(W_6 W_2 W_4)$.
6.3 Martingale methods

Let $W_t$ be a standard BM with $W_0 = 0$, and let
$$\tau_a = \min\{t \geq 0 : W_t = a\}, \qquad a \neq 0.$$
Assume $a > 0$, and let $F(t) = P(\tau_a \leq t)$. We have
$$P(W_t \geq a) = P(W_t \geq a,\ \tau_a \leq t) = \int_0^t P(W_t \geq a \mid \tau_a = s)\, dF(s)$$
$$= \int_0^t P\big(W_t \geq a \mid W_s = a,\ (W_u < a,\ u < s)\big)\, dF(s)$$
$$= \int_0^t P\big(W_t - W_s \geq 0 \mid W_s = a,\ (W_u < a,\ u < s)\big)\, dF(s)$$
$$= \int_0^t P(W_t - W_s \geq 0)\, dF(s) = \frac{1}{2}\, P(\tau_a \leq t),$$
where the next-to-last equality follows by the independent increments property and the last one by the distributional properties of BM. Hence
$$P(\tau_a \leq t) = 2\, P(W_t \geq a) = \frac{2}{\sqrt{2\pi t}} \int_a^\infty \exp(-x^2/2t)\, dx = \frac{2}{\sqrt{2\pi}} \int_{a/\sqrt{t}}^\infty \exp(-y^2/2)\, dy,$$
where in the last step we substituted $y = x/\sqrt{t}$. [Note that we have just demonstrated the so-called reflection principle for the standard BM process]. Thus the probability density function of $\tau_a$ is
$$f(t) = \frac{d}{dt}\, P(\tau_a \leq t) = -\frac{2}{\sqrt{2\pi}}\, \exp(-a^2/2t)\, \frac{d}{dt}\Big(\frac{a}{\sqrt{t}}\Big) = \frac{a}{\sqrt{2\pi}}\, t^{-3/2} \exp(-a^2/2t), \qquad t \geq 0,\ a > 0.$$
For $a < 0$, by symmetry,
$$f(t) = \frac{-a}{\sqrt{2\pi}}\, t^{-3/2} \exp(-a^2/2t),$$
so that
$$f(t) = \frac{|a|}{\sqrt{2\pi}}\, t^{-3/2} \exp(-a^2/2t), \qquad a \neq 0,\ t \geq 0,$$
is the pdf of $\tau_a$. It is called an inverse Gaussian density. We have
$$P(\tau_a < \infty) = \lim_{t \to \infty} P(\tau_a \leq t) = \lim_{t \to \infty} \frac{2}{\sqrt{2\pi}} \int_{a/\sqrt{t}}^\infty \exp(-y^2/2)\, dy = 1,$$
and $E\tau_a = \infty$. This shows that standard BM is a null recurrent process [it behaves like a driftless random walk on the integers, see Lawler].

Let $M_t = \max_{0 \leq s \leq t} W_s$, the maximum of BM on the time interval $[0, t]$. We can now easily obtain the distribution of $M_t$:
$$P(M_t \geq a) = P\Big(\max_{0 \leq s \leq t} W_s \geq a\Big) = P(\tau_a \leq t) = \frac{2}{\sqrt{2\pi}} \int_{a/\sqrt{t}}^\infty \exp(-y^2/2)\, dy.$$
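The distribution of the running maximum can be checked against the reflection principle by simulating discretized SBM paths. In the sketch below (Python with NumPy; grid size, horizon and level are arbitrary illustrative choices) the discrete-time maximum slightly underestimates the continuous one, so only rough agreement is expected.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)

t, a, n_steps, paths = 1.0, 1.0, 1000, 5000
dt = t / n_steps
# Discretized SBM paths on [0, t]; record each path's running maximum M_t
incs = rng.normal(0.0, sqrt(dt), size=(paths, n_steps))
m_t = np.cumsum(incs, axis=1).max(axis=1)

p_sim = (m_t >= a).mean()
# Reflection principle: P(M_t >= a) = 2 P(W_t >= a) = 2 (1 - Phi(a / sqrt(t)))
p_theory = 2 * (1 - 0.5 * (1 + erf(a / sqrt(2 * t))))
```

Here $\Phi$ is computed via `math.erf`; refining the grid reduces the discrete-monitoring bias.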
6.3.1 Martingales associated with Brownian motion

Let $(X_t,\ t \geq 0)$ be BM with $\mu = 0$ and variance parameter $\sigma^2$. Then the following processes are martingales:

a. $X_t$
b. $M_t = X_t^2 - \sigma^2 t$
c. $Z_t = \exp\big(\theta X_t - \frac{1}{2}\theta^2\sigma^2 t\big)$, $\theta \in \mathbb{R}$.

To verify that the above processes are martingales we write $X_t = X_t - X_s + X_s$, and use the fact that BM has independent increments. As an example we verify that $X_t$, $t \geq 0$, is a martingale. We have
$$E(X_t \mid \mathcal{F}_s) = E(X_t - X_s + X_s \mid \mathcal{F}_s) = E(X_t - X_s) + X_s = X_s,$$
and $E|X_t| < \infty$, since $X_t$ is a normal random variable. Hence indeed $(X_t,\ t \geq 0)$ is a martingale.

Let now $X_t$, $t \geq 0$, be BM with parameters $\mu \neq 0$ and $\sigma^2$. Then the following processes are martingales:

d. $Y_t = X_t - \mu t$
e. $M_t = (X_t - \mu t)^2 - \sigma^2 t$
f. $Z_t = \exp\big(\theta X_t - \big(\theta\mu + \frac{1}{2}\theta^2\sigma^2\big) t\big)$, $\theta \in \mathbb{R}$.

The processes in c. and f. are Wald's martingales (recall the structure of Wald's martingale in discrete time) since in both cases $Z_t = \exp(\theta X_t)/E[\exp(\theta X_t)]$.
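A martingale has constant expectation, so $E Z_t = E Z_0 = 1$ for Wald's martingale f. The following sketch (Python with NumPy; all parameter values are arbitrary illustrations) verifies this numerically by sampling $X_t \sim N(\mu t, \sigma^2 t)$ directly.

```python
import numpy as np

rng = np.random.default_rng(3)

mu, sigma, theta, t, paths = 0.2, 0.5, 1.3, 2.0, 400000
x_t = rng.normal(mu * t, sigma * np.sqrt(t), paths)   # X_t ~ N(mu t, sigma^2 t)
# Wald's martingale: Z_t = exp(theta X_t - (theta mu + theta^2 sigma^2 / 2) t)
z_t = np.exp(theta * x_t - (theta * mu + 0.5 * theta**2 * sigma**2) * t)
mean_z = z_t.mean()   # should be close to E Z_t = E Z_0 = 1
```

Equivalently, this confirms the normal moment generating function identity $E\exp(\theta X_t) = \exp(\theta\mu t + \frac{1}{2}\theta^2\sigma^2 t)$.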
6.3.2 Exit time from a corridor

Let $X_t$ be BM with parameters $\mu$ and $\sigma^2$, and $X_0 = x$. Then, for levels $a \leq x \leq b$,
$$\tau_a = \min\{t \geq 0 : X_t = a\}, \quad \tau_b = \min\{t \geq 0 : X_t = b\} \quad \text{or} \quad \tau_{a,b} = \tau = \min\{t \geq 0 : X_t = a \text{ or } b\}$$
are stopping times (hitting times of given levels and exit time from a corridor) relative to $(\mathcal{F}_t,\ t \geq 0)$, where $\mathcal{F}_t = \sigma(X_s,\ s \leq t)$. Let us now use OST to compute $P(X_\tau = b)$ and $E\tau$ (given the initial condition $x$ of $X$). We omit the verification of the assumptions of OST as these can be verified similarly as for random walks (see Chapter 2).

Case 1: $\mu = 0$. In this case $(X_t,\ t \geq 0)$ is a martingale. By OST,
$$E X_\tau = E X_0 = x.$$
But
$$E X_\tau = b\, P(X_\tau = b) + a\, P(X_\tau = a) = x.$$
Solving the last equation for $P(X_\tau = b) = 1 - P(X_\tau = a)$ gives
$$P(X_\tau = b) = \frac{x - a}{b - a}, \qquad a \leq x \leq b.$$
To compute $E\tau$ we use the martingale in point b. of Subsection 6.3.1. By OST,
$$E M_\tau = E X_\tau^2 - \sigma^2\, E\tau = x^2,$$
so that
$$E\tau = \frac{E X_\tau^2 - x^2}{\sigma^2} = \Big[\Big(\frac{x - a}{b - a}\Big) b^2 + \Big(\frac{b - x}{b - a}\Big) a^2 - x^2\Big] \Big/ \sigma^2 = \frac{(x - a)(b - x)}{\sigma^2}.$$

Case 2: $\mu \neq 0$. To compute $P(X_\tau = b)$ we apply OST to the martingale in point f. of Subsection 6.3.1:
$$Z_t = \exp\Big(\theta X_t - \Big(\theta\mu + \frac{1}{2}\theta^2\sigma^2\Big) t\Big),$$
with $\theta = \theta^* := -2\mu/\sigma^2$. With this choice of $\theta$,
$$Z_t = \exp\Big(-\frac{2\mu}{\sigma^2}\, X_t\Big) = \exp(\theta^* X_t).$$
By OST,
$$E Z_\tau = E Z_0 = \exp(\theta^* x).$$
Solving
$$E Z_\tau = \exp(\theta^* b)\, P(X_\tau = b) + \exp(\theta^* a)\, P(X_\tau = a) = \exp(\theta^* x)$$
for $P(X_\tau = b) = 1 - P(X_\tau = a)$ gives
$$P(X_\tau = b) = \frac{\exp(\theta^* x) - \exp(\theta^* a)}{\exp(\theta^* b) - \exp(\theta^* a)}.$$
It now follows from the above equation that
$$P(\tau_b < \infty) = \lim_{a \to -\infty} P(X_\tau = b) = \begin{cases} 1 & \text{if } \mu > 0, \text{ hence } \theta^* < 0 \\ \exp[\theta^*(x - b)] < 1 & \text{if } \mu < 0, \text{ hence } \theta^* > 0. \end{cases}$$
Similarly,
$$P(\tau_a < \infty) = \lim_{b \to \infty} P(X_\tau = a) = \lim_{b \to \infty} \frac{\exp(\theta^* b) - \exp(\theta^* x)}{\exp(\theta^* b) - \exp(\theta^* a)} = \begin{cases} \exp[\theta^*(x - a)] < 1 & \text{if } \mu > 0, \text{ hence } \theta^* < 0 \\ 1 & \text{if } \mu < 0, \text{ hence } \theta^* > 0. \end{cases}$$
These results show that BM with $\mu \neq 0$ is a transient process. We shall now compute $E\tau$. For this we use the martingale
$$Y_t = X_t - \mu t, \qquad t \geq 0.$$
By OST we have
$$E Y_\tau = E X_\tau - \mu\, E\tau = x,$$
so that
$$E\tau = \frac{E X_\tau - x}{\mu},$$
where
$$E X_\tau = a\, \frac{\exp(\theta^* b) - \exp(\theta^* x)}{\exp(\theta^* b) - \exp(\theta^* a)} + b\, \frac{\exp(\theta^* x) - \exp(\theta^* a)}{\exp(\theta^* b) - \exp(\theta^* a)},$$
and, as before, $\theta^* = -2\mu/\sigma^2$. Suppose now that we let $a \to -\infty$ and assume that $\mu > 0$, hence $\theta^* < 0$. Then
$$\lim_{a \to -\infty} E X_\tau = b,$$
which shows that
$$E\tau = E\tau_b = \frac{b - x}{\mu}.$$
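The Case 1 formulas $P(X_\tau = b) = \frac{x-a}{b-a}$ and $E\tau = \frac{(x-a)(b-x)}{\sigma^2}$ can be checked by an Euler-type simulation of the driftless BM until it leaves the corridor. The sketch below (Python with NumPy; corridor, step size and path count are arbitrary illustrative choices) carries a modest discretization bias, so only rough agreement is expected.

```python
import numpy as np

rng = np.random.default_rng(4)

x0, a, b, sigma = 0.0, -1.0, 2.0, 1.0
dt, paths = 1e-3, 10000
pos = np.full(paths, x0)
tau = np.zeros(paths)
alive = np.ones(paths, dtype=bool)
hit_b = np.zeros(paths, dtype=bool)
t = 0.0
while alive.any() and t < 50.0:     # time cap: exit by t = 50 is near certain
    pos[alive] += sigma * np.sqrt(dt) * rng.standard_normal(alive.sum())
    t += dt
    crossed_b = alive & (pos >= b)
    crossed_a = alive & (pos <= a)
    hit_b |= crossed_b
    done = crossed_a | crossed_b
    tau[done] = t
    alive &= ~done

p_b = hit_b.mean()      # theory: (x0 - a)/(b - a) = 1/3
mean_tau = tau.mean()   # theory: (x0 - a)(b - x0)/sigma^2 = 2.0
```

Shrinking `dt` reduces the overshoot bias at the barriers, at the cost of longer runtimes.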
6.3.3 Laplace transform of the first passage time of a drifted Brownian motion

Let now $\tau_a$ be the first passage time of BM $X_t$ with $\mu > 0$ and variance parameter $\sigma^2$ to level $a$. We assume $X_0 = x < a$, hence $E\tau_a < +\infty$. Recall that
$$Z_t = \exp\Big(\theta X_t - \Big(\theta\mu + \frac{1}{2}\theta^2\sigma^2\Big) t\Big), \qquad \theta \in \mathbb{R},$$
is a martingale with
$$E Z_t = E Z_0 = \exp(\theta x).$$
Applying OST to $Z_t$ with the stopping time $\tau_a$, we obtain
$$E Z_{\tau_a} = \exp(\theta x),$$
or
$$E\Big[\exp\Big(\theta a - \Big(\theta\mu + \frac{1}{2}\theta^2\sigma^2\Big)\tau_a\Big)\Big] = \exp(\theta x),$$
so that
$$E \exp\Big\{-\Big(\theta\mu + \frac{1}{2}\theta^2\sigma^2\Big)\tau_a\Big\} = \exp[\theta(x - a)]. \qquad (6.3)$$
Let
$$\lambda = \theta\mu + \frac{1}{2}\theta^2\sigma^2.$$
We require that $\lambda > 0$, so that $E\exp(-\lambda\tau_a)$ is the Laplace transform of $\tau_a$. Solving the last equation for $\theta$ gives
$$\theta_\pm = \frac{-\mu \pm \sqrt{\mu^2 + 2\sigma^2\lambda}}{\sigma^2}.$$
Taking the positive root $\theta = \theta_+$ (this guarantees that $\lambda > 0$), and substituting it into the right hand side of (6.3), gives the Laplace transform of $\tau_a$ as
$$E\exp(-\lambda\tau_a) = \exp\Big[-(a - x)\Big(\sqrt{\mu^2 + 2\sigma^2\lambda} - \mu\Big)\Big/\sigma^2\Big].$$
This transform can be inverted to obtain the pdf of $\tau_a$:
$$f(t) = \frac{a - x}{\sigma\sqrt{2\pi t^3}}\, \exp\Big[-\frac{(a - x - \mu t)^2}{2\sigma^2 t}\Big], \qquad t \geq 0.$$
Note that if $x = 0$, $\mu = 0$, and $\sigma^2 = 1$, we formally recover the pdf of $\tau_a$ obtained before for standard BM. However the assumptions of OST are not satisfied in the case $\mu = 0$ since in this case $E\tau_a = +\infty$. Obviously the Laplace transform can be used to obtain moments of $\tau_a$. We have
$$-\frac{d}{d\lambda}\, E\exp(-\lambda\tau_a) = -E\,\frac{d}{d\lambda}\exp(-\lambda\tau_a) = E\big(\tau_a \exp(-\lambda\tau_a)\big),$$
so that
$$-\frac{d}{d\lambda}\, E\exp(-\lambda\tau_a)\Big|_{\lambda = 0} = E\tau_a.$$
Carrying out the computation on the left side of the above equation gives the result obtained previously, that is
$$E\tau_a = \frac{a - x}{\mu}.$$
Higher moments of $\tau_a$ can be obtained by further differentiation.
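The differentiation step can be checked numerically: evaluate the Laplace transform near $\lambda = 0$, take a finite-difference derivative, and compare with $(a - x)/\mu$. The sketch below uses Python with NumPy; the parameter values are arbitrary illustrations.

```python
import numpy as np

# Laplace transform of tau_a for drifted BM (mu > 0), X_0 = x < a:
#   L(lam) = exp(-(a - x) * (sqrt(mu^2 + 2 sigma^2 lam) - mu) / sigma^2)
mu, sigma, x, a = 0.4, 0.8, 0.0, 1.5

def laplace(lam):
    return np.exp(-(a - x) * (np.sqrt(mu**2 + 2 * sigma**2 * lam) - mu) / sigma**2)

# E tau_a = -L'(0), approximated here by a central finite difference
h = 1e-6
e_tau_numeric = -(laplace(h) - laplace(-h)) / (2 * h)
e_tau_theory = (a - x) / mu   # = 3.75
```

Higher moments could be checked the same way with higher-order finite differences at $\lambda = 0$.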
Homework 7: Continuous-time optional stopping theorem

1. Two independent Brownian motions $X_t^1$ and $X_t^2$ with drift parameters $\mu_1$ and $\mu_2$, respectively, where $\mu_1 \leq \mu_2$, and the same variance parameter $\sigma^2$, start out at positions $x_1$ and $x_2$, respectively, where $x_1 < x_2$. Calculate the probability that they will never meet.
Hint: Consider the process $X_t^2 - X_t^1$ and notice that it is also a Brownian motion. Then consider separately the two cases: $\mu_1 = \mu_2$ and $\mu_1 < \mu_2$.

2. Let $X_t$, $t \geq 0$, be a Brownian motion with drift parameter $\mu > 0$ and variance parameter $\sigma^2$. Let
$$\theta = \min\{t \geq 0 : X_t = a\},$$
where $X_0 = x < a$. Compute $E\theta$ and $Var(\theta)$ using the Optional Stopping Theorem.
6.4 Geometric Brownian motion [see also Mikosch, Example 1.3.8]

Definition 6.4. Let $X_t$ be a Brownian motion with parameters $\mu$, $\sigma^2$, and $X_0 = x$. The process
$$S_t = \exp(X_t) = \exp(x + \mu t + \sigma W_t)$$
is called Geometric Brownian motion (GBM).

Note that the state space of $(S_t,\ t \geq 0)$ is $S = (0, \infty)$. Let
$$0 = t_0 < t_1 < t_2 < \ldots < t_n < \infty$$
be an increasing sequence of times, and consider the relative changes
$$\frac{S_{t_1} - S_{t_0}}{S_{t_0}},\ \frac{S_{t_2} - S_{t_1}}{S_{t_1}},\ \ldots,\ \frac{S_{t_n} - S_{t_{n-1}}}{S_{t_{n-1}}},$$
which can be expressed as
$$\exp(X_{t_1} - X_{t_0}) - 1,\ \exp(X_{t_2} - X_{t_1}) - 1,\ \ldots,\ \exp(X_{t_n} - X_{t_{n-1}}) - 1,$$
from which we see that for GBM, relative changes in disjoint time intervals are independent random variables. $S_t$ is also called a lognormal process, and is often used to model prices of financial assets. Modeling prices with GBM involves the assumption that returns are independent from period to period.

We compute $E S_t$ and $Var(S_t)$. To this end we recall that the moment generating function $M(\theta)$ of $\xi \sim N(m, v)$ is
$$M(\theta) = E\exp(\theta\xi) = \exp\Big(\theta m + \frac{\theta^2 v}{2}\Big).$$
In particular, setting $\theta = 1$, we obtain
$$E\exp(\xi) = \exp\Big(m + \frac{v}{2}\Big).$$
In our case, $X_t \sim N(x + \mu t, \sigma^2 t)$. Hence
$$E S_t = E\exp(X_t) = \exp\Big(x + \mu t + \frac{1}{2}\sigma^2 t\Big).$$
We can show in a similar way that
$$Var(S_t) = \exp(2x + 2\mu t + \sigma^2 t)\big(\exp(\sigma^2 t) - 1\big),$$
and also that the mean and the variance of the return are
$$E\Big(\frac{S_t - S_s}{S_s}\Big) = \exp\Big[\mu(t - s) + \frac{1}{2}\sigma^2(t - s)\Big] - 1,$$
$$Var\Big(\frac{S_t - S_s}{S_s}\Big) = \exp\big[2\mu(t - s) + \sigma^2(t - s)\big]\Big\{\exp\big[\sigma^2(t - s)\big] - 1\Big\}.$$
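Both moment formulas follow from the normal mgf argument above and are easy to confirm by simulation. The sketch below (Python with NumPy; the parameter values are arbitrary illustrations) samples $X_t \sim N(x + \mu t, \sigma^2 t)$, exponentiates, and compares sample moments with the closed forms.

```python
import numpy as np

rng = np.random.default_rng(5)

x, mu, sigma, t, paths = 0.0, 0.05, 0.2, 1.0, 1000000
x_t = rng.normal(x + mu * t, sigma * np.sqrt(t), paths)  # X_t ~ N(x + mu t, sigma^2 t)
s_t = np.exp(x_t)                                        # GBM value S_t = exp(X_t)

mean_theory = np.exp(x + mu * t + 0.5 * sigma**2 * t)
var_theory = np.exp(2 * x + 2 * mu * t + sigma**2 * t) * (np.exp(sigma**2 * t) - 1)
mean_mc, var_mc = s_t.mean(), s_t.var()
```

Note the $\frac{1}{2}\sigma^2 t$ convexity correction in the mean: $E\exp(X_t)$ is strictly larger than $\exp(E X_t)$.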
Part III

Elements of stochastic analysis

Chapter 7

Stochastic integration [Lawler, Chapter 9; Mikosch, Chapter 2; Shreve, Sections 4.2 and 4.3]
Our purpose in this topic is to give an overview of the basics of stochastic calculus. Stochastic calculus is one of the mathematical tools used in engineering (e.g. control engineering), in the modern finance industry, in the modern insurance industry, and in modern management science, among others.

We begin with the study of stochastic integrals with respect to a SBM process. We shall proceed in three stages. First, in Section 7.1, we shall define and analyze stochastic integrals with respect to a discrete time symmetric random walk. Next, in Section 7.2, we shall define and analyze stochastic integrals of random step functions with respect to a BM process. Lastly, in Section 7.3, we shall generalize the results of Section 7.2 to stochastic integrals of general stochastic integrands with respect to a SBM process. We shall indicate that the important properties of stochastic integrals derived in Sections 7.1 and 7.2 carry over to the general case studied in Section 7.3. Section 7.4 discusses integration with respect to a Poisson process. Finally, Section 7.5 provides a glimpse of a more general semimartingale integration theory.
7.1 Integration with respect to symmetric random walk

Recall that we constructed a discrete-time stochastic integral with respect to a symmetric random walk in Example 3.4. We called it then a martingale transform of the process $S_n$ by the betting process $B_n$. For convenience we repeat here the content of Example 3.4. We have, for $n = 0, 1, 2, \ldots$,
$$S_n = x + \varepsilon_1 + \cdots + \varepsilon_n,$$
where the $\varepsilon_n$ are i.i.d. random variables with $P(\varepsilon_n = 1) = P(\varepsilon_n = -1) = \frac{1}{2}$ for $n \geq 1$. We know that the symmetric random walk $S_n$ is a martingale with respect to the filtration $\mathcal{F}_n = \sigma(\varepsilon_1, \ldots, \varepsilon_n) = \sigma(S_0, S_1, \ldots, S_n)$. We saw that for $n \geq 0$ the gambler's fortune process $Y_n$ could be represented as a discrete time stochastic integral
$$Y_n = \sum_{k=1}^n B_k\, \Delta S_k, \qquad n = 1, 2, 3, \ldots,$$
where $\Delta S_k = S_k - S_{k-1}$. Recall that the process $B_n$ was supposed to be predictable with respect to the filtration $\mathcal{F}_n$, that is, for every $n \geq 1$ we have that $B_n$ is $\mathcal{F}_{n-1}$-measurable.

Properties enjoyed by the stochastic integral $Y_n = \sum_{k=1}^n B_k \Delta S_k$:

(i) We verified in Example 3.3 that $Y_n$, $n \geq 0$, is a martingale with respect to the filtration $\mathcal{F}_n$;

(ii) $E Y_n = 0$ for every $n \geq 0$. Here is why:
$$E Y_n = E\Big[\sum_{k=1}^n B_k \Delta S_k\Big] = \sum_{k=1}^n E[B_k \Delta S_k] = \sum_{k=1}^n E\big[E[B_k \Delta S_k \mid \mathcal{F}_{k-1}]\big] = \sum_{k=1}^n E\big[B_k\, E[\varepsilon_k \mid \mathcal{F}_{k-1}]\big] = \sum_{k=1}^n E\big[B_k\, E[\varepsilon_k]\big] = 0.$$
Of course, a much faster proof is possible: due to the martingale property we have $E Y_n = E Y_0 = 0$.

(iii) $Var(Y_n) = E Y_n^2 = \sum_{k=1}^n E B_k^2$ for every $n \geq 1$. See Lawler p.164 (p.199).
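Properties (ii) and (iii) can be illustrated by simulating a martingale transform. The sketch below (Python with NumPy) uses the hypothetical betting strategy $B_k = S_{k-1}$ (chosen only because $E B_k^2 = E S_{k-1}^2 = k - 1$ is explicit, so $Var(Y_n) = \sum_{k=1}^n (k-1) = n(n-1)/2$ with $x = 0$); it is not a strategy from the text.

```python
import numpy as np

rng = np.random.default_rng(6)

n, paths = 20, 200000
eps = rng.choice([-1, 1], size=(paths, n))                  # i.i.d. +/-1 steps
s = np.cumsum(eps, axis=1)                                  # S_1, ..., S_n with S_0 = 0
s_prev = np.hstack([np.zeros((paths, 1)), s[:, :-1]])       # S_0, ..., S_{n-1}
# Predictable bet B_k = S_{k-1} (hypothetical choice), so Y_n = sum B_k * eps_k
y_n = (s_prev * eps).sum(axis=1)

mean_y = y_n.mean()   # theory: E Y_n = 0
var_y = y_n.var()     # theory: sum_k E B_k^2 = n(n-1)/2 = 190
```

Any other predictable $B_k$ would give $E Y_n = 0$ as well; only the variance formula changes with the strategy.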
7.2 The Itô stochastic integral for simple processes

Definition 7.1. A stochastic process $Z_t$, $t \in [0, T]$, is called a simple process if it satisfies the following properties:

- There exists a partition
$$\Pi_n : 0 = t_0 < t_1 < \cdots < t_{n-1} < t_n = T,$$
and a sequence of random variables $Z_1, Z_2, \ldots, Z_n$ such that
$$Z_t = \begin{cases} Z_i & \text{if } t_{i-1} \leq t < t_i,\ i = 1, 2, 3, \ldots, n \\ Z_n & \text{if } t = T. \end{cases}$$
- The sequence $(Z_i,\ i = 1, 2, \ldots, n)$ is $(\mathcal{F}_{t_{i-1}},\ i = 1, 2, \ldots, n)$-adapted. That is, $Z_i$ is a function of SBM up to time $t_{i-1}$. Moreover, $E Z_i^2$ is finite.

We can now define the Itô stochastic integral for simple processes $Z$:

Definition 7.2. The Itô stochastic integral for a simple process $Z$ on the interval $(0, t]$, where $t_i \leq t < t_{i+1}$, is given by [a random Riemann-Stieltjes sum]
$$Y_t^n = \int_0^t Z_s\, dW_s = \sum_{k=1}^i Z_k\, \Delta W_{t_k} + Z_{i+1}(W_t - W_{t_i}), \qquad (7.1)$$
where $\Delta W_{t_k} = W_{t_k} - W_{t_{k-1}}$, and where, for $i = 0$, $\sum_{k=1}^0 Z_k \Delta W_{t_k} = 0$. [The pioneering work is the paper by Kiyoshi Itô (1915-2008): Stochastic Integral, Proc. Imperial Acad. Tokyo, 20, 519-524, 1944.]

We can think of the SBM $W_t$ as of a symmetric random walk continuous in time and space. Suppose now that a gambler may place bets depending only on the history of SBM. The bets may be placed only at a certain finite set of times $0 = t_0 < t_1 < \cdots < t_{n-1} < t_n = T$. A bet $Z_i$ placed at time $t_{i-1}$ may only depend on the history of the SBM up to time $t_{i-1}$. The game is stopped at time $t$. If the player bets $Z_i$ at time $t_{i-1}$ then the player receives $Z_i\, \Delta W_{t_i}$ at time $t_i$ if $t_i < t$, and the player receives $Z_i(W_t - W_{t_{i-1}})$ if $t_{i-1} \leq t < t_i$. Then the integral $Y_t^n = \int_0^t Z_s\, dW_s$ represents the player's fortune at time $t$ in such a game.

The Itô stochastic integral of $Z$ is then defined likewise on any interval $(s, t]$ with $0 \leq s \leq t$. When considered as a function of $t$, the Itô stochastic integral for a simple process $Z$, $Y_t^n$, is a stochastic process.
Properties enjoyed by the Itô stochastic integral for simple processes $Z$:

(i) The Itô stochastic integral for simple processes $Z$ is a martingale. We check now the two martingale conditions:

- We have $E|Y_t^n| < \infty$ for all $t \in [0, T]$. This follows from the isometry property (iii) below.
- We have $E(Y_t^n \mid \mathcal{F}_s) = Y_s^n$ for every $0 \leq s \leq t \leq T$. To demonstrate this we first take $t_i \leq s \leq t \leq t_{i+1}$. In this case we have
$$Y_t^n = Y_s^n + Z_{i+1}(W_t - W_s).$$
Thus, since both $Y_s^n$ and $Z_{i+1}$ are $\mathcal{F}_s$-measurable [why?], we have
$$E(Y_t^n \mid \mathcal{F}_s) = Y_s^n + Z_{i+1}\, E(W_t - W_s \mid \mathcal{F}_s) = Y_s^n + Z_{i+1}\, E(W_t - W_s) = Y_s^n,$$
where the second equality follows due to the independent increments property of SBM.

Exercise 1. Verify that $E(Y_t^n \mid \mathcal{F}_s) = Y_s^n$ is true for $t_i \leq s \leq t_{i+1}$ and $t_k \leq t \leq t_{k+1}$ where $t_{i+1} \leq t_k$.

(ii) $E Y_t^n = 0$ for every $t \in [0, T]$.

Exercise 2. Verify property (ii).

(iii) (Isometry property) We have that
$$Var(Y_t^n) = E[(Y_t^n)^2] = \int_0^t E Z_s^2\, ds, \qquad t \in [0, T].$$
Exercise 3. Verify property (iii). [Hint: See Lawler p.166-167 (p.202-203) or Mikosch p.106.]

(iv) (Linearity with respect to integrands) Let $Z_t$ and $U_t$ be two simple processes, and let $a, b$ be two constants. Then
$$\int_0^t (aZ_s + bU_s)\, dW_s = a\int_0^t Z_s\, dW_s + b\int_0^t U_s\, dW_s, \qquad t \in [0, T].$$
This property follows immediately from the linearity property of summation.

(v) (Linearity on adjacent intervals) Let $0 \leq s \leq t \leq T$. Then
$$\int_s^t Z_u\, dW_u = \int_0^t Z_u\, dW_u - \int_0^s Z_u\, dW_u.$$
Exercise 4. Verify property (v). [Hint: See Mikosch p.107.]

(vi) The sample paths of the process $Y_t^n$ are continuous. This follows since the sample paths of $W_t$ are continuous and we have
$$Y_t^n = Y_{t_{i-1}}^n + Z_i(W_t - W_{t_{i-1}}), \qquad t_{i-1} \leq t \leq t_i.$$
Example 7.3. This is a simple, but important example. This example is discussed in detail in Mikosch, Section 2.2.1. Take $Z_i = W_{t_{i-1}}$. Here we have, for $t = t_i$,
$$Y_t^n = \sum_{k=1}^i W_{t_{k-1}}\, \Delta W_{t_k} = \sum_{k=1}^i W_{t_{k-1}}(W_{t_k} - W_{t_{k-1}}) = \frac{1}{2} W_t^2 - \frac{1}{2}\sum_{k=1}^i \big(\Delta W_{t_k}\big)^2.$$
It is demonstrated in Mikosch (p.98) that when the partition $\Pi_n$ becomes finer and finer [i.e. $mesh(\Pi_n) := \max_{i=1,2,\ldots,n} [t_i - t_{i-1}] \to 0$] then the sum $\sum_{k=1}^i (\Delta W_{t_k})^2$ converges to $t$ in the mean square sense. This is a very important observation, as you will see later.

Exercise 5. Suppose that the simple integrand process $Z$ is deterministic. That is, suppose that $Z_1, Z_2, \ldots, Z_n$ are constants. Verify that in this case the stochastic integral $Y_t^n$ is a random variable that has normal distribution with mean zero and variance $\int_0^t (Z_s)^2\, ds$.
7.3 The general Itô stochastic integral

The general Itô stochastic integral for an appropriately regular integrand process $Z_t$, $t \in [0, T]$, is defined as the mean square limit of a sequence of Itô stochastic integrals for simple processes $Z_t^n$, $t \in [0, T]$. The processes $Z_t^n$, $t \in [0, T]$, are chosen in such a way that they converge to the process $Z_t$, $t \in [0, T]$, in an appropriate sense. This was the main idea of K. Itô. Here are some details.

Lemma 7.1. Let $Z$ be a process satisfying the following assumptions:

- $Z$ is adapted to the SBM on $[0, T]$, that is, for every $t \in [0, T]$, the random variable $Z_t$ is a function of $W_s$, $0 \leq s \leq t$.
- The integral $\int_0^T E Z_s^2\, ds$ is finite.

Then, there exists a sequence $\{(Z_t^n,\ t \geq 0),\ n = 1, 2, 3, \ldots\}$ of simple processes so that
$$\int_0^T E\big[Z_s - Z_s^n\big]^2\, ds \to 0$$
as $n \to \infty$.

Proof. See Mikosch, Appendix A4. Construction of an approximating sequence $\{(Z_t^n,\ t \geq 0),\ n = 1, 2, 3, \ldots\}$ of simple processes involves making the partitions $\Pi_n$ finer and finer.

Now, we already know how to evaluate the Itô stochastic integral for each simple process $Z^n$ in the sequence above. Let us denote by $Y_t^n$ the Itô stochastic integral of $Z^n$ on the interval $[0, t]$. From general results in the area of functional analysis it follows that there exists a process $Y_t$, $t \in [0, T]$, so that
$$E \sup_{0 \leq t \leq T} \big[Y_t - Y_t^n\big]^2 \to 0$$
as $n \to \infty$. We say that the sequence of processes $Y^n$ converges in the mean square to the process $Y$. Moreover one can prove that the limit $Y$ does not depend on the choice of a sequence $Z^n$ of simple processes approximating $Z$. As a consequence we can state the following

Definition 7.4. The mean square limit process $Y$ is called the Itô stochastic integral of $Z$, and it is denoted by
$$Y_t = \int_0^t Z_s\, dW_s, \qquad t \in [0, T]. \qquad (7.2)$$
If $Z$ is a simple process then the Itô stochastic integral of $Z$ is given by the Riemann-Stieltjes sum (7.1). Mikosch uses the notation $I_t(Z)$ to denote the Itô stochastic integral of $Z$.

The Itô stochastic integral of $Z$ can be defined likewise on any interval $(s, t]$ with $0 \leq s \leq t$.
Properties enjoyed by the (general) Itô stochastic integral. All the properties enjoyed by the Itô integral of simple processes are inherited by the general Itô stochastic integral $Y = Y(Z)$. Thus,

(i) The (general) Itô stochastic integral for a process $Z$ is a martingale with respect to the natural filtration of the SBM.

(ii) $E Y_t = 0$ for every $t \in [0, T]$.

(iii) (Isometry property) We have that
$$Var(Y_t) = E Y_t^2 = \int_0^t E Z_s^2\, ds, \qquad t \in [0, T].$$

(iv) (Linearity with respect to integrands) Let $Z_t$ and $U_t$ be two admissible integrands, and let $a, b$ be two constants. Then
$$\int_0^t (aZ_s + bU_s)\, dW_s = a\int_0^t Z_s\, dW_s + b\int_0^t U_s\, dW_s, \qquad t \in [0, T].$$

(v) (Linearity on adjacent intervals) Let $0 \leq s \leq t \leq T$. Then
$$\int_s^t Z_u\, dW_u = \int_0^t Z_u\, dW_u - \int_0^s Z_u\, dW_u.$$

(vi) The sample paths of the process $Y_t$ are continuous.

The definition of the general Itô stochastic integral can be extended to the case of $T = \infty$. Suppose that the integrand process $Z$ is deterministic. In this case the stochastic integral $Y_t$ is a random variable that has normal distribution with mean zero and variance $\int_0^t (Z_s)^2\, ds$.
Example 7.5. (This is a continuation of Example 7.3.) From the above results it follows that the mean square limit of the stochastic integrals
$$Y_t^n = \sum_{k=1}^i W_{t_{k-1}}\, \Delta W_{t_k} = \sum_{k=1}^i W_{t_{k-1}}(W_{t_k} - W_{t_{k-1}})$$
is the stochastic integral
$$Y_t = \int_0^t W_s\, dW_s, \qquad t \in [0, T].$$
But, we have seen that
$$Y_t^n = \frac{1}{2} W_t^2 - \frac{1}{2}\sum_{k=1}^i \big(\Delta W_{t_k}\big)^2,$$
and that when the partition $\Pi_n$ becomes finer and finer then the sum $\sum_{k=1}^i (\Delta W_{t_k})^2$ converges to $t$ in the mean square sense. Thus, we obtain the following formula:
$$Y_t = \int_0^t W_s\, dW_s = \frac{1}{2} W_t^2 - \frac{1}{2} t, \qquad t \in [0, T].$$
This result indicates that Itô stochastic calculus is different from ordinary calculus. [But, you may want to read about the Stratonovich and other integrals in Mikosch, Section 2.4.]
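The formula $\int_0^t W_s\, dW_s = \frac{1}{2}W_t^2 - \frac{1}{2}t$ can be seen numerically: compute the left-point Riemann-Stieltjes sums on a fine grid and compare path by path with $\frac{1}{2}W_t^2 - \frac{1}{2}t$. The sketch below uses Python with NumPy; grid size and path count are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(7)

t_end, n, paths = 1.0, 2000, 1000
dt = t_end / n
dw = rng.normal(0.0, np.sqrt(dt), size=(paths, n))        # Brownian increments
w = np.cumsum(dw, axis=1)                                  # W at grid points
w_prev = np.hstack([np.zeros((paths, 1)), w[:, :-1]])      # W_{t_{k-1}} (left points)

riemann = (w_prev * dw).sum(axis=1)            # sum W_{t_{k-1}} (W_{t_k} - W_{t_{k-1}})
ito_value = 0.5 * w[:, -1]**2 - 0.5 * t_end    # (1/2) W_t^2 - (1/2) t

mse = np.mean((riemann - ito_value)**2)        # -> 0 as the mesh dt -> 0
```

The gap on each path equals $\frac{1}{2}\big(t - \sum_k (\Delta W_{t_k})^2\big)$, whose mean square is of order $dt$, which is exactly the quadratic variation convergence quoted from Mikosch.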
7.4 Stochastic integral with respect to a Poisson process

Let $N$ be a PP($\lambda > 0$). In view of the simple structure of the trajectories of a Poisson process it is easy to define a stochastic integral with respect to such a process. However, in order that the stochastic integral with respect to a Poisson process has nice properties, it is required that integrands are predictable and that they satisfy some mild integrability conditions. A detailed discussion of the concept of predictability for continuous time processes is beyond the scope of this course. It will be enough for us to know that whenever a process $Z$ is left-continuous and adapted with respect to the natural filtration of the process $N$, or in the case of an arbitrary deterministic process $Z$ [i.e., a Borel-measurable function of time], then $Z$ is predictable.

We define the stochastic integral of a predictable integrand $Z$ with respect to a Poisson process $N$ as
$$I_t := \int_0^t Z_s\, dN_s := \sum_{s \leq t} Z_s\, \Delta N_s, \qquad (7.3)$$
where $\Delta N_t := N_t - N_{t-}$, with $N_{t-} = \lim_{s \uparrow t} N_s$ the left-limit of $N$ at time $t$.

Note that one can represent $dN_t$ as $\sum_n \delta_{T_n}(dt)$, where $T_n$ represents the $n$-th jump time of $N$ and $\delta_{T_n}$ is a Dirac mass at $T_n$ (a random measure over the half-line). In this perspective one can view the Poisson stochastic integral $I_t$ as the Lebesgue-Stieltjes integral of $Z$ against the measure $dN_t$, $\omega$ by $\omega$. In particular one has $\Delta I_t = Z_t\, \Delta N_t$.

Properties of this integral are analogous to the properties of the Itô integral discussed above, except that the process $I$ is not a martingale with respect to the natural filtration of our Poisson process. The reason why $I$ is not a martingale is that the process $N$ itself is not a martingale. Letting $M_t = N_t - \lambda t$ denote the compensated martingale of $N$, it can be verified that the process $Y$ defined as
$$Y_t = \int_0^t Z_s\, dM_s := \int_0^t Z_s\, dN_s - \lambda \int_0^t Z_s\, ds \qquad (7.4)$$
is a martingale.
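The compensation in (7.4) says in particular that, for a deterministic integrand, $E\int_0^t Z_s\, dN_s = \lambda\int_0^t Z_s\, ds$. The sketch below (Python with NumPy; the integrand $z(s) = \sin s + 1$ and all parameter values are arbitrary illustrations) checks this, using the standard fact that, given $N_t = n$, the jump times of a Poisson process are distributed as $n$ i.i.d. uniforms on $[0, t]$.

```python
import numpy as np

rng = np.random.default_rng(8)

lam, t_end, paths = 3.0, 2.0, 50000

def z(s):
    return np.sin(s) + 1.0   # deterministic, hence predictable, integrand

# I_t = sum of z over the jump times; sample jump counts, then uniform jump times
n_jumps = rng.poisson(lam * t_end, paths)
vals = np.array([z(rng.uniform(0.0, t_end, k)).sum() for k in n_jumps])

mean_I = vals.mean()
compensator = lam * ((1.0 - np.cos(t_end)) + t_end)  # lam * int_0^t (sin s + 1) ds
```

Subtracting the compensator path by path would give samples of the martingale $Y_t$ in (7.4), whose sample mean is close to zero.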
7.5 Semimartingale Integration Theory [See Protter] *

In this section we give a very brief account of the general semimartingale integration theory, such as it is developed for instance in the book by Protter. Given the previous developments the reader should be able to admit the following notions and results without too much harm (a detailed exposition would take us very far away from the scope of these notes).

Semimartingales are a class of integrator-processes giving rise to the most flexible theory of stochastic integration. In mathematical finance another motivation for modeling prices of traded assets as semimartingales is that price processes outside this class involve arbitrages, unless rather strong constraints are imposed on the trading strategies.

It is well known that in a filtration satisfying the usual conditions, every semimartingale can be considered as càdlàg, a French acronym for (almost surely) left limited and right continuous. All semimartingales in these notes are understood in a càdlàg version. In one of several equivalent characterizations, a semimartingale $X$ corresponds to the sum of a local martingale $M$ and of a finite variation process $A$, where:

- A local martingale $M$ admits an increasing sequence of stopping times $\tau_n$ such that every stopped process $M^{\tau_n}$ is a uniformly integrable martingale, and
- A finite variation process $A$ is a difference between two adapted non-decreasing processes starting from 0.

Any such representation $X = M + A$ is called a Doob-Meyer decomposition of the semimartingale $X$. A Doob-Meyer decomposition is not unique in general. However there is at most one such representation of a process $X$ with $A$ predictable. One then talks of the canonical Doob-Meyer decomposition of a special semimartingale $X$. In particular,

Proposition 7.2. A predictable local martingale of finite variation (e.g., a time-differentiable local martingale) is constant.

The stochastic integral of a predictable and locally bounded process $Z$ with respect to a semimartingale $X$ is then defined as
$$Y_t = \int_0^t Z_s\, dX_s := \int_0^t Z_s\, dM_s + \int_0^t Z_s\, dA_s, \qquad (7.5)$$
where $X = M + A$ is a Doob-Meyer decomposition of $X$, and $\int_0^t Z_s\, dM_s$ is defined by localization of $M$. A remarkable fact is that the corresponding notion of stochastic integral is independent of the Doob-Meyer decomposition of $X$ which is used in (7.5). Predictable and locally bounded processes notably include all left-limit processes of the form $\widetilde{Z}_{-}$ where $\widetilde{Z}$ is a semimartingale.

Proposition 7.3. In case $X$ is a local martingale, the integral process $Y$ is again a local martingale.

In case of a continuous integrator $X$, it is possible (as we saw earlier in the case of the Brownian motion) to define the stochastic integral $Y$ for a class of admissible integrands $Z$ larger than that of the predictable and locally bounded processes, namely for integrands $Z$ that are only progressive and subject to suitable integrability conditions. One then has that
$$\int_0^t \widetilde{Z}_{s-}\, dX_s = \int_0^t \widetilde{Z}_s\, dX_s \qquad (7.6)$$
for every admissible semimartingale $\widetilde{Z}$.
Homework 8: Stochastic integration
1. Do
(a) Exercise 1. Verify that $E(Y_t \mid \mathcal{F}_s) = Y_s$ is true for $t_i \le s < t_{i+1}$ and $t_k \le t < t_{k+1}$,
where $t_{i+1} \le t_k$.
(b) Exercise 2. Verify property (ii), that is, verify that $E Y_t = 0$ for every $t \in [0, T]$.
(c) Exercise 3.
(d) Exercise 4.
(e) Exercise 5.
2. Compute the Itô stochastic integral for the process $H_t = 1$, $t \in [0, T]$.
3. Compute the Poisson stochastic integral $I_T$ in (7.3) for the process $H_t = 1$, $t \in [0, T]$.
4. Define $H_t = N_{t-}$, $t \in (0, T]$. Explain why the process $H$ is predictable. Compute the
stochastic integrals $I_t$ and $Y_t$ for the process $H$.
Chapter 8
Itô formula [Mikosch, Chapter 2;
Shreve, Section 4.4]
8.1 Introduction
Consider the integral $\int_0^t s \, ds$. We know that
$$ \int_0^t s \, ds = \frac{1}{2} t^2. $$
Now, think of the function $u(t) = t$. Thus, we have
$$ \int_0^t u(s) \, du(s) = \frac{1}{2} u^2(t). $$
In fact, if $u(t)$ is any differentiable function of $t$, so that the integrals below exist, then we
have
$$ \int_0^t u(s) \, du(s) = \frac{1}{2} u^2(t) - \frac{1}{2} u^2(0). \qquad (8.1) $$
This is just the chain rule formula:
$$ \frac{d[f(u(s))]}{ds} = f'(u(s)) \, u'(s) \qquad (8.2) $$
which for $f(x) = x^2$ yields
$$ d(u^2(t)) = 2 u(t) u'(t) \, dt = 2 u(t) \, du(t) \qquad (8.3) $$
or, in integrated form, (8.1).
Observe next that for an arbitrary differentiable function $f$ (like $f(x) = x^2$ above) the
expression (8.2) can be written as
$$ df(u(t)) = f(u(t + dt)) - f(u(t)) = f'(u(t)) \, du(t). $$
On the other hand, if the function $f$ is more than once differentiable then we have from the
Taylor expansion
$$ f(u(t) + du(t)) - f(u(t)) = f'(u(t)) \, du(t) + \frac{1}{2} f''(u(t)) (du(t))^2 + \cdots, $$
where, as usual, $du(t) = u(t + dt) - u(t)$ is the increment of the function $u$ on the interval
$[t, t + dt]$. So, in case of a differentiable function $u$, we may in fact neglect all terms of order
2 and higher in the above Taylor expansion. This is because $(du(t))^k = o(dt)$ for any $k \ge 2$.
8.1.1 What about $\int_0^t W_s \, dW_s$?
Now, could it be that for the SBM we would have
$$ \int_0^t W_s \, dW_s = \frac{1}{2} W_t^2 \; ? \qquad (8.4) $$
Of course not! First of all, from the properties of the Itô integral we know that the expectation
of the left hand side in (8.4) is zero, whereas the expectation of the right hand side in (8.4)
is $\frac{1}{2} t$. Secondly, we already saw that the true value of the stochastic integral in (8.4) is
$$ \int_0^t W_s \, dW_s = \frac{1}{2} W_t^2 - \frac{1}{2} t. $$
Applying the Taylor expansion to the function $f(W_t) = W_t^2$ we see that
$$ dW_t^2 = (W_t + dW_t)^2 - W_t^2 = 2 W_t \, dW_t + (dW_t)^2. \qquad (8.5) $$
But,
$$ E((dW_t)^2) = E(W_{t+dt} - W_t)^2 = dt. $$
Thus, the term $(dW_t)^2$ is like $dt$ [it is frequently written that $(dW_t)^2 = dt$]. That is, the
term $(dW_t)^2$ is not $o(dt)$, and therefore it must not be neglected in (8.5). This is the reason
why (8.4) is not true. This is the reason why Prof. Kiyoshi Itô invented the Itô calculus, and ...
became famous!
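The discrepancy between (8.4) and the true value can also be checked numerically. The following sketch (in Python with NumPy, which is not part of the course material; the step size and seed are arbitrary illustrative choices) approximates the Itô integral by left-point Riemann sums along one simulated Brownian path and compares it with $\frac{1}{2}W_t^2 - \frac{1}{2}t$:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 200_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)          # Brownian increments
W = np.concatenate(([0.0], np.cumsum(dW)))    # path W_0, ..., W_T

# Ito (left-point) sum: sum_i W_{t_i} (W_{t_{i+1}} - W_{t_i})
ito_sum = np.sum(W[:-1] * dW)
print(ito_sum, 0.5 * W[-1]**2 - 0.5 * T)      # nearly equal
```

The left-point sum lands close to $\frac{1}{2}W_T^2 - \frac{1}{2}T$, and about $\frac{1}{2}T$ away from the naive guess $\frac{1}{2}W_T^2$, in line with the discussion above.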
8.1.2 What about $\int_0^t N_{s-} \, dN_s$?
Could it be that for the Poisson process we would have
$$ \int_0^t N_{s-} \, dN_s = \frac{1}{2} N_t^2 \; ? \qquad (8.6) $$
Of course not! If you already did Exercise 4 from Homework 8, then you know that
$$ \int_0^t N_{s-} \, dN_s = \frac{1}{2} \left( N_t^2 - N_t \right). $$
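On a Poisson path this integral can be computed exactly: just before the $k$-th jump the path sits at $k-1$, so the integral is the sum of the pre-jump levels. A quick pathwise check (Python sketch; the values of $\lambda$, $t$ and the seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, t = 3.0, 2.0
n_jumps = rng.poisson(lam * t)             # N_t for one simulated path

# The integrand N_{s-} equals k-1 at the k-th jump time, so the integral
# over [0, t] is the sum of the pre-jump levels over all jumps up to t.
integral = sum(k - 1 for k in range(1, n_jumps + 1))
identity = 0.5 * (n_jumps**2 - n_jumps)    # claimed closed form
```

The two quantities agree on every path, since $\sum_{k=1}^{n}(k-1) = \frac{1}{2}n(n-1)$.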
8.2 Itô formulas for continuous processes
There is a general semimartingale Itô formula which leads to all of the formulas that you
will see below in this section and in the following sections. We shall not state this general
result though. We shall only state some of its simpler versions that we are going to use in
the future lectures.
Let $f(x)$ be a twice continuously differentiable function. We already know that we must
not neglect the terms $dW_t$ and $(dW_t)^2 = dt$ in the Taylor expansion of $f(W_t + dW_t)$.
However, it is known that we may neglect the higher order terms. The suitably amended
Taylor expansion yields the following simple Itô formula
$$ df(W_t) = f'(W_t) \, dW_t + \frac{1}{2} f''(W_t) \, dt $$
or in integral form
$$ f(W_t) - f(W_s) = \int_s^t f'(W_r) \, dW_r + \frac{1}{2} \int_s^t f''(W_r) \, dr, \qquad s \le t. $$
We know that the process $W_t^2 - t$ is a martingale with respect to the natural filtration of
the SBM. Thus, the quadratic variation process of the martingale $W_t$ is $\langle W \rangle_t = t$. This is
the reason why the term $f''(W_t) \, dt$ is sometimes written as $f''(W_t) \, d\langle W \rangle_t$.
Before we proceed, let us introduce some notation: for a function $f(t, x)$ we denote
$$ \partial_t f(t, x) = \frac{\partial f(t, x)}{\partial t}, \qquad \partial f(t, x) = \frac{\partial f(t, x)}{\partial x}, \qquad \partial^2 f(t, x) = \frac{\partial^2 f(t, x)}{\partial x^2}. $$
The first extension of the simple Itô formula Let the function $f(t, x)$ be once contin-
uously differentiable w.r.t. $t$ and twice continuously differentiable w.r.t. $x$. Then, for every
$0 \le s \le t$,
$$ f(t, W_t) - f(s, W_s) = \int_s^t \partial f(r, W_r) \, dW_r + \int_s^t \left( \partial_r f(r, W_r) + \frac{1}{2} \partial^2 f(r, W_r) \right) dr \qquad (8.7) $$
or in differential form, for $t \ge 0$,
$$ df(t, W_t) = \partial f(t, W_t) \, dW_t + \left( \partial_t f(t, W_t) + \frac{1}{2} \partial^2 f(t, W_t) \right) dt. \qquad (8.8) $$
The second extension of the simple Itô formula Suppose the processes $b_t$ and $\sigma_t$ are
adapted to the natural filtration of the SBM, and are such that the two integrals below are
well defined. Define a new process $X_t$ by
$$ X_t = X_0 + \int_0^t b_s \, ds + \int_0^t \sigma_s \, dW_s, \qquad t \ge 0. $$
Let the function $f(t, x)$ be once continuously differentiable w.r.t. $t$ and twice continuously
differentiable w.r.t. $x$. Then for the process $X$ we have, for $0 \le s \le t$,
$$ f(t, X_t) - f(s, X_s) = \int_s^t \left( \partial_r f(r, X_r) + \partial f(r, X_r) b_r + \frac{1}{2} \partial^2 f(r, X_r) \sigma_r^2 \right) dr + \int_s^t \partial f(r, X_r) \sigma_r \, dW_r \qquad (8.9) $$
or, in differential form,
$$ df(t, X_t) = \left( \partial_t f(t, X_t) + \partial f(t, X_t) b_t + \frac{1}{2} \partial^2 f(t, X_t) \sigma_t^2 \right) dt + \partial f(t, X_t) \sigma_t \, dW_t. \qquad (8.10) $$
Observe that formulas (8.7) and (8.8) are special cases of formulas (8.9) and (8.10) for the
case where $b_t \equiv 0$ and $\sigma_t \equiv 1$.
8.2.1 Examples
Example 8.1. Take $f(x) = x^2$. From the Itô formula we get
$$ W_t^2 - W_s^2 = 2 \int_s^t W_r \, dW_r + \int_s^t 1 \, dr = 2 \int_s^t W_r \, dW_r + (t - s), \qquad 0 \le s \le t. $$
In particular, with $s = 0$ we get
$$ \int_0^t W_s \, dW_s = \frac{1}{2} W_t^2 - \frac{1}{2} t. $$
Example 8.2. Take $f(x) = e^x$. From the Itô formula we get
$$ e^{W_t} - e^{W_s} = \int_s^t e^{W_r} \, dW_r + \frac{1}{2} \int_s^t e^{W_r} \, dr. $$
Recall that for a differentiable function $u(t)$ we have
$$ de^{u(t)} = e^{u(t)} u'(t) \, dt = e^{u(t)} \, du(t) $$
and thus
$$ e^{u(t)} - e^{u(s)} = \int_s^t e^{u(r)} \, du(r). $$
Example 8.3. (Itô exponential) Take $f(t, x) = e^{x - 0.5 t}$. From the first extension of the
Itô formula we get
$$ e^{W_t - 0.5 t} - e^{W_s - 0.5 s} = \int_s^t e^{W_r - 0.5 r} \, dW_r. $$
Example 8.4. (Geometric Brownian motion) Take $f(t, x) = e^{(\mu - 0.5 \sigma^2) t + \sigma x}$, where $\mu \in
(-\infty, \infty)$ and $\sigma > 0$. From the first extension of the Itô formula we get
$$ e^{(\mu - \frac{1}{2} \sigma^2) t + \sigma W_t} - e^{(\mu - \frac{1}{2} \sigma^2) s + \sigma W_s} = \sigma \int_s^t e^{(\mu - \frac{1}{2} \sigma^2) r + \sigma W_r} \, dW_r + \mu \int_s^t e^{(\mu - \frac{1}{2} \sigma^2) r + \sigma W_r} \, dr. $$
Thus, defining a GBM process $S_t$ by
$$ S_t = S_0 \, e^{(\mu - \frac{1}{2} \sigma^2) t + \sigma W_t}, \qquad t \ge 0, $$
we get
$$ S_t - S_s = \sigma \int_s^t S_r \, dW_r + \mu \int_s^t S_r \, dr, \qquad 0 \le s \le t, $$
or, in differential form,
$$ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t, \qquad t \ge 0. $$
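One way to make Example 8.4 concrete is to integrate the SDE $dS_t = \mu S_t \, dt + \sigma S_t \, dW_t$ numerically along one Brownian path and compare the result with the closed-form solution driven by the same path. A minimal sketch (Python; the Euler discretization and all numerical parameters are our own illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, S0, T, n = 0.1, 0.3, 1.0, 1.0, 100_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)      # Brownian increments

S = S0
for dw in dW:                             # Euler steps for dS = mu*S dt + sigma*S dW
    S += mu * S * dt + sigma * S * dw

# Closed-form GBM solution evaluated on the same Brownian path
exact = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dW.sum())
```

With a fine grid the Euler endpoint and the exact solution agree closely, illustrating that the exponential above indeed solves the GBM equation.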
8.3 Itô formulas relative to jump processes [See Ikeda and
Watanabe]*
Suppose the function $f(t, n) : [0, \infty) \times \{0, 1, \ldots\} \to \mathbb{R}$ is differentiable in the first variable, and
such that the process $f(t, N_t)$ satisfies some mild integrability condition (e.g., $f$ bounded).
Then, we have the following Itô formula for a Poisson process (integral form)
$$ f(t, N_t) - f(s, N_s) = \int_s^t \partial_r f(r, N_r) \, dr + \int_s^t \left( f(r, N_r) - f(r, N_{r-}) \right) dN_r \qquad (8.11) $$
in which (in differential form)
$$ \left( f(r, N_r) - f(r, N_{r-}) \right) dN_r = \left( f(r, N_{r-} + 1) - f(r, N_{r-}) \right) dN_r. \qquad (8.12) $$
Example 8.5. Take $f(t, n) = n^2$. We get
$$ N_t^2 = 2 \int_0^t N_{s-} \, dN_s + N_t. $$
Example 8.6. Take $f(t, n) = e^n$. We get
$$ e^{N_t} = 1 + (e - 1) \int_0^t e^{N_{s-}} \, dN_s. $$
Example 8.7. Take $f(t, n) = 2^n$. Then,
$$ 2^{N_t} = 1 + \int_0^t 2^{N_{s-}} \, dN_s. $$
Let $\mathcal{A}$ be the generator of the Poisson process in the sense introduced earlier of the
matrix (5.9). Note that for fixed $t$, $f(t, \cdot)$ may be considered as an infinite column vector
$f = (f(t, 0), f(t, 1), f(t, 2), \ldots)^T$. Consequently, for fixed $t$ the expression $\mathcal{A} f$ defines another
vector $\mathcal{A} f = (\mathcal{A} f(t, 0), \mathcal{A} f(t, 1), \mathcal{A} f(t, 2), \ldots)^T$. In view of the form (5.9) of $\mathcal{A}$, one has that
$$ \mathcal{A} f(t, n) = \lambda \left( f(t, n + 1) - f(t, n) \right). $$
Letting $M_t = N_t - \lambda t$, it follows that (8.12) can be rewritten as the following Itô formula
for a Poisson process (canonical differential form)
$$ df(t, N_t) = \partial_t f(t, N_t) \, dt + \left( f(t, N_{t-} + 1) - f(t, N_{t-}) \right) dN_t = \left( \partial_t f + \mathcal{A} f \right)(t, N_t) \, dt + \left( f(t, N_{t-} + 1) - f(t, N_{t-}) \right) dM_t. $$
Consequently, we have that
$$ f(t, N_t) - \int_0^t \left( \partial_s f(s, N_s) + \mathcal{A} f(s, N_s) \right) ds = f(0, N_0) + \int_0^t \left( f(s, N_{s-} + 1) - f(s, N_{s-}) \right) dM_s, $$
which by application of Proposition 7.3 is a (local) martingale with respect to the natural
filtration of $N$.
In addition to the Poisson process $N$ with intensity $\lambda$, let us consider a standard $d$-
variate Brownian motion $W$, and let $J^{(t)}$ denote a family of i.i.d. $d$-variate random variables
with distribution denoted by $\nu(dy)$, all assumed to coexist on the same probability space.
Given adapted coefficients $b_t$ (a random vector in $\mathbb{R}^d$), $\sigma_t$ (a random matrix in $\mathbb{R}^{d \times d}$) and
a predictable function $\delta_t(x)$ (a random vector in $\mathbb{R}^d$ marked or parameterized by $x \in \mathbb{R}^d$),
we shall now consider an Itô process in the sense of an adapted $d$-variate process $X$ obeying
the following dynamics over $[0, T]$:
$$ dX_t = b_t \, dt + \sigma_t \, dW_t + \delta_t(J^{(t)}) \, dN_t. \qquad (8.13) $$
In particular, the description of the jumps of $X$ is decomposed into, on one hand, the fre-
quency of the jumps of $X$ (given by, on average, $\lambda$ jumps of $N$ per unit of time), and, on
the other hand, the distribution $\nu$ of the marks determining the jump size of $X$ incurred by
a jump of $N$. Given a real valued, regular enough function $f = f(t, x)$ on $[0, T] \times \mathbb{R}^d$, one
then has for any $t \in [0, T]$ the following Itô formula for an Itô process
$$ df(t, X_t) = \partial_t f(t, X_t) \, dt + \partial f(t, X_t) b_t \, dt + \partial f(t, X_t) \sigma_t \, dW_t + \frac{1}{2} \mathrm{Tr} \left( \partial^2 f(t, X_t) a_t \right) dt + \delta f_t(X_t, J^{(t)}) \, dN_t \qquad (8.14) $$
in which:
• $\partial f(t, x)$ and $\partial^2 f(t, x)$ denote the row-gradient and the Hessian of $f$ with respect to $x$,
• $a_t = \sigma_t \sigma_t^T$ is the covariance matrix of $X$,
• $\mathrm{Tr}$ stands for the trace operator (sum of the diagonal elements of a square matrix),
and
• $\delta f_t(x, y) = f(x + \delta_t(y)) - f(x)$.
The Itô formula (8.14) reads equivalently, in canonical form:
$$ df(t, X_t) = \left( \partial_t f(t, X_t) + \partial f(t, X_t) b_t + \frac{1}{2} \mathrm{Tr} \left( \partial^2 f(t, X_t) a_t \right) + \lambda \, \widehat{\delta f}_t(X_t) \right) dt + \partial f(t, X_t) \sigma_t \, dW_t + dM^f_t \qquad (8.15) $$
with
$$ \widehat{\delta f}_t(x) = E \left( \delta f_t(x, J^{(t)}) \mid \mathcal{F}_t \right) = \int_{\mathbb{R}^d} \left( f(x + \delta_t(y)) - f(x) \right) \nu(dy) $$
and
$$ dM^f_t = \delta f_t(X_t, J^{(t)}) \, dN_t - \lambda \, \widehat{\delta f}_t(X_t) \, dt. \qquad (8.16) $$
Moreover, one has the following
Lemma 8.1. Process $M^f$ is a local martingale, and a martingale under suitable integrability
conditions.
In the formalism of measure-stochastic integration, the process $M^f$ can equivalently be
written as the stochastic integral of a predictable random function with respect to a com-
pensated Poisson random measure. Lemma 8.1 then appears as an analog of Proposition 7.3.
Using measure-stochastic integration, it is also possible to adapt and extend the Itô formula
(8.15) to more general Itô processes with an infinite activity of jumps.
8.3.1 Brackets
Introducing the (random) generator
$$ \mathcal{A}_t f(x) = \partial f(t, x) b_t + \frac{1}{2} \mathrm{Tr} \left( \partial^2 f(t, x) a_t \right) + \lambda \, \widehat{\delta f}_t(x), $$
it follows from (8.15), under suitable integrability conditions:
$$ (dt)^{-1} \, E \left( df(t, X_t) \mid \mathcal{F}_t \right) = \partial_t f(t, X_t) + \mathcal{A}_t f(X_t). \qquad (8.17) $$
Given another real valued function $g = g(t, x)$, it is also easy to show that
$$ (dt)^{-1} \, \mathrm{Cov} \left( df(t, X_t), dg(t, X_t) \mid \mathcal{F}_t \right) = \partial f(t, X_t) \, a_t \left( \partial g(t, X_t) \right)^T + \lambda \, \widehat{\delta f \delta g}_t(X_t) = \Gamma^{f, g}_t(X_t) \qquad (8.18) $$
where $(f, g) \mapsto \widehat{\delta f \delta g}$ and $(f, g) \mapsto \Gamma^{f, g}$ are the bilinear carré du champ (random) operators
associated with the linear (random) operators $f \mapsto \mathcal{A}_t f$ and $f \mapsto \widehat{\delta f}_t$, so
$$ \widehat{\delta f \delta g}_t(x) = \widehat{\delta (f g)}_t(x) - f(t, x) \, \widehat{\delta g}_t(x) - g(t, x) \, \widehat{\delta f}_t(x) = \int_{\mathbb{R}^d} \left( f(x + \delta_t(y)) - f(x) \right) \left( g(x + \delta_t(y)) - g(x) \right) \nu(dy) $$
and $\Gamma^{f, g} = \mathcal{A}(f g) - f \, \mathcal{A} g - g \, \mathcal{A} f$. Letting $Y_t = f(t, X_t)$ and $Z_t = g(t, X_t)$, the process
$\mathrm{Cov} \left( dY_t, dZ_t \mid \mathcal{F}_t \right)$ also corresponds to the so-called sharp bracket $d\langle Y, Z \rangle_t$. To sum up,
Proposition 8.2. The (random) generator $f \mapsto \mathcal{A}_t f$ of the process $X$, and its carré du champ
$(f, g) \mapsto \Gamma^{f, g}$, are such that, for all functions $f, g$ of $x$,
$$ E \left( df(t, X_t) \mid \mathcal{F}_t \right) = \left( \partial_t f(t, X_t) + \mathcal{A}_t f(X_t) \right) dt, $$
$$ \mathrm{Cov} \left( df(t, X_t), dg(t, X_t) \mid \mathcal{F}_t \right) = \Gamma^{f, g}_t(X_t) \, dt = d\langle Y, Z \rangle_t. \qquad (8.19) $$
In particular,
$$ (dt)^{-1} \, \mathrm{Var} \left( dY_t \mid \mathcal{F}_t \right) = \Gamma^{f, f}_t(X_t) = \frac{d\langle Y \rangle_t}{dt} = \partial f(t, X_t) \, a_t \left( \partial f(t, X_t) \right)^T + \lambda \, \widehat{\delta f \delta f}_t(X_t) \qquad (8.20) $$
with
$$ \widehat{\delta f \delta f}_t(x) = \int_{\mathbb{R}^d} \left( f(t, x + \delta_t(y)) - f(t, x) \right)^2 \nu(dy). $$
By letting $f$ and $g$ range over the various coordinate mappings of $X$ in the second line of
(8.19), and denoting in matrix form $\langle X \rangle = (\langle X^i, X^j \rangle)_{i, j}$, one obtains that
$$ (dt)^{-1} \, \mathrm{Cov} \left( dX_t \mid \mathcal{F}_t \right) = \frac{d\langle X \rangle_t}{dt} = a_t + \lambda \int_{\mathbb{R}^d} (\delta_t \delta_t^T)(y) \, \nu(dy). \qquad (8.21) $$
Observe that the above sharp brackets compensate the corresponding square brackets
(quadratic covariation and variations) defined by $[Y, Z]_0 = 0$ and
$$ d[Y, Z]_t = \partial f(t, X_t) \, a_t \left( \partial g(t, X_t) \right)^T dt + \delta f_t(X_t, J^{(t)}) \, \delta g_t(X_t, J^{(t)}) \, dN_t, $$
and thus $[Y, Y]_0 = 0$ and
$$ d[Y, Y]_t = \partial f(t, X_t) \, a_t \left( \partial f(t, X_t) \right)^T dt + \left( \delta f_t(X_t, J^{(t)}) \right)^2 dN_t. $$
Notably, if $X$ is a continuous Itô process, the corresponding sharp and square brackets (exist
and) coincide.
The square brackets can equivalently be defined as limits in probability$^1$ of realized
covariance and variance processes. They can be defined as such for any semimartingales
$Y, Z$. They are key in the following semimartingale integration by parts formula (in
differential form):
$$ d(Y_t Z_t) = Y_{t-} \, dZ_t + Z_{t-} \, dY_t + d[Y, Z]_t \qquad (8.22) $$
and can also be used for stating a general semimartingale Itô formula.
$^1$ Or almost sure limits in the case of nested time-meshes.
Homework 9: Itô formula
1. Verify the formulas presented in Examples 8.1-8.4 and 8.5-8.7.
2. Apply the Itô formula to $f(W_t)$ for
(a) $f(x) = x$,
(b) $f(x) = x^3$.
3. Verify that the process $Y$ given as $Y_t = \int_0^t W_s \, dW_s$, $t \ge 0$, is a martingale with respect to
the natural filtration of the process $W$.
4. Denoting as usual $M_t = N_t - \lambda t$:
(a) Verify that the process $Y$ given as $Y_t = \int_0^t N_{s-} \, dM_s$, $t \ge 0$, is a martingale with
respect to the natural filtration of the process $N$;
(b) Compute $J_t = \int_0^t N_s \, dM_s$. Is the process $J_t$ a martingale with respect to the natural
filtration of the process $N$?
5. Derive the Itô formula relative to both a Brownian motion and a Poisson process, i.e. for
$df(t, W_t, N_t)$. Apply this formula to $f(t, x, y) = t x y + t + x + y$.
6. Let $W = (W_t, t \ge 0)$ be a SBM. Define a process $Y$ by
$$ Y_t = \int_0^t e^{W_s} \, dW_s, \qquad t \ge 0. $$
Verify whether the process $Y$ is a martingale with respect to the natural filtration of
$W$.
Chapter 9
Stochastic differential equations
(SDEs) [Mikosch, Chapter 3; Shreve,
Section 6.2]
9.1 Introduction
Consider the ordinary differential equation
$$ \frac{dx(t)}{dt} = b, \qquad t \ge 0, \quad x(0) = x_0. \qquad (9.1) $$
The constant $b$ may be interpreted as an infinitesimal [instantaneous] absolute rate of change
of the function $x(t)$. This is because, in view of (9.1), we have
$$ x(t + dt) - x(t) = b \, dt. $$
The solution to equation (9.1) is
$$ x(t) = x_0 + b t, \qquad t \ge 0. $$
Now, consider the ordinary differential equation
$$ \frac{dx(t)}{dt} = b \, x(t), \qquad t \ge 0, \quad x(0) = x_0. \qquad (9.2) $$
Here, the constant $b$ may be interpreted as an infinitesimal [instantaneous] relative rate of
change of the function $x(t)$. This is because, in view of (9.2), we have
$$ \frac{x(t + dt) - x(t)}{x(t)} = b \, dt. $$
The solution to equation (9.2) is
$$ x(t) = x_0 \, e^{b t}, \qquad t \ge 0. \qquad (9.3) $$
Now, imagine that both rates are perturbed by normally distributed random shocks.
In case of (9.1) this phenomenon can be modeled as
$$ x(t + dt) - x(t) = b \, dt + \sigma (W_{t + dt} - W_t) $$
or, equivalently,
$$ dx(t) = b \, dt + \sigma \, dW_t. \qquad (9.4) $$
In case of (9.2) this phenomenon can be modeled as
$$ \frac{x(t + dt) - x(t)}{x(t)} = b \, dt + \sigma (W_{t + dt} - W_t) $$
or, equivalently,
$$ dx(t) = x(t) \left( b \, dt + \sigma \, dW_t \right). \qquad (9.5) $$
Equations (9.4) and (9.5) are prototypes of stochastic differential equations (SDEs). It
needs to be explained what is meant by a solution to an SDE, and how an SDE can be solved.
9.2 Diffusions
Definition 9.1. A Markov process $X$ on state space $S = (l, r)$, $-\infty \le l < r \le \infty$, is said
to be a diffusion with drift coefficient $b(t, x)$ and diffusion coefficient $\sigma^2(t, x) > 0$, if
(i) $(X_t, t \ge 0)$ has continuous sample paths, and
(ii) the following relations hold as $h \downarrow 0$, for every $(t, x)$:
$$ E(X_{t+h} - X_t \mid X_t = x) = b(t, x) h + o(h) \qquad (9.6) $$
$$ E[(X_{t+h} - X_t)^2 \mid X_t = x] = \sigma^2(t, x) h + o(h). \qquad (9.7) $$
The functions $b(t, x)$ and $\sigma^2(t, x)$ are usually assumed to be continuous. They are also called
the local mean function and the local variance function of a diffusion. Diffusion processes
behave locally like BM.
One has
$$ \mathrm{Var}(X_{t+h} - X_t \mid X_t = x) = E[(X_{t+h} - X_t)^2 \mid X_t = x] - [E(X_{t+h} - X_t \mid X_t = x)]^2 = \sigma^2(t, x) h + o(h) - [b(t, x) h + o(h)]^2 = \sigma^2(t, x) h + o(h). $$
Hence
$$ \mathrm{Var}(X_{t+h} - X_t \mid X_t = x) - E[(X_{t+h} - X_t)^2 \mid X_t = x] = o(h). $$
Therefore (9.7) is equivalent to
$$ \mathrm{Var}(X_{t+h} - X_t \mid X_t = x) = \sigma^2(t, x) h + o(h). \qquad (9.8) $$
In case the coefficients $b$ and $\sigma$ do not depend on time, one calls $X$ a time homogeneous
diffusion.
9.2.1 SDEs for diffusions
Let $\xi$ be a random variable. Let $b(t, x)$ and $\sigma(t, x)$ be two real valued functions. Suppose
now that a real valued process $X$ satisfies the following three properties:
Property 1 The process $X$ is adapted with respect to the filtration generated by $\xi$
and the SBM $W$.
Property 2 The ordinary and Itô integrals below are well-defined for every $t \in [0, T]$:
$$ \int_0^t b(s, X_s) \, ds, \qquad \int_0^t \sigma(s, X_s) \, dW_s. $$
Property 3 The equation
$$ X_t = \xi + \int_0^t b(s, X_s) \, ds + \int_0^t \sigma(s, X_s) \, dW_s $$
is satisfied for all $t \in [0, T]$.
Definition 9.2. We say that a process $X$ is a strong solution to the SDE
$$ dX_t = b(t, X_t) \, dt + \sigma(t, X_t) \, dW_t, \qquad t \ge 0, \quad X_0 = \xi, \qquad (9.9) $$
if the process $X$ satisfies Properties 1-3 above.
A strong solution to the SDE (9.9) is readily seen to be a diffusion in the sense of Definition
9.1. The SDE is thus also known as the diffusion equation with drift coefficient $b(t, x)$,
diffusion coefficient $\sigma(t, x)$ and initial condition $\xi$.
Observe that the strong solution process $X$ to the diffusion SDE (9.9) is an Itô process
in the sense of Section 8.3 (special case without jumps).
Solving equation (9.9) means determining a process $X$ that is a strong solution to (9.9).
[There is another concept of a solution to equation (9.9), the so called weak solution. We shall
not discuss it here however.]
Typically, the drift and the diffusion coefficients, as well as the initial condition, are the
input data for a modeler of physical phenomena. Therefore, if one attempts to model the
evolution of a physical phenomenon using an equation like (9.9) above, one must address
questions of uniqueness and existence of strong solutions to equations like (9.9), just like in
the case of ordinary differential equations. That is to say, one must answer the following two
questions:
• What conditions on $b(t, x)$, $\sigma(t, x)$ and $\xi$ are necessary and/or sufficient for solutions
to equation (9.9) to exist?
• What conditions on $b(t, x)$, $\sigma(t, x)$ and $\xi$ are necessary and/or sufficient for a unique
solution to equation (9.9) to exist?
We shall not address these questions here. The reader is referred to Mikosch, p.138, for a
brief discussion of the above issues.
9.2.2 Examples
Here are some basic examples of time-homogeneous diffusions:
1. BM: $b(x) = b$, $\sigma^2(x) = \sigma^2$; SBM: $b(x) = 0$, $\sigma^2(x) = 1$.
2. Ornstein-Uhlenbeck process, $S = (-\infty, \infty)$:
$$ b(x) = a (b - x), \quad \sigma^2(x) = \sigma^2, \qquad a, b, \sigma^2 > 0. $$
This is a mean-reverting process.
3. Square-root process, $S = (0, \infty)$:
$$ b(x) = b x, \quad \sigma^2(x) = \sigma^2 x, \qquad b, \sigma^2 > 0. $$
4. Square-root process with mean reversion, $S = (0, \infty)$:
$$ b(x) = a (b - x), \quad \sigma^2(x) = \sigma^2 x, \qquad a, b, \sigma^2 > 0. $$
5. Constant Elasticity of Variance diffusion, $S = (0, \infty)$:
$$ b(x) = b x, \quad \sigma^2(x) = \sigma^2 x^{\gamma}, \qquad b, \sigma^2 > 0, \quad 0 \le \gamma \le 2. $$
6. Geometric BM $S_t$, $S = (0, \infty)$. By definition $S_t = \exp(X_t)$, where $X_t$ is BM. An
application of the Itô formula yields
$$ b(s) = s \left( b + \frac{\sigma^2}{2} \right), \qquad \sigma^2(s) = \sigma^2 s^2. $$
The latter result can also be obtained by direct computations of
$$ E(S_{t+h} - S_t \mid S_t = s) = s \left[ \exp \left( b h + \frac{\sigma^2}{2} h \right) - 1 \right] = s \left( 1 + b h + \frac{\sigma^2}{2} h + o(h) - 1 \right) = s \left( b h + \frac{\sigma^2}{2} h \right) + o(h) $$
and
$$ \mathrm{Var}(S_{t+h} - S_t \mid S_t = s) = s^2 \exp(2 b h + \sigma^2 h) \left[ \exp(\sigma^2 h) - 1 \right] = s^2 \left( 1 + 2 b h + \sigma^2 h + o(h) \right) \left( \sigma^2 h + o(h) \right) = \sigma^2 s^2 h + o(h). $$
9.3 Solving diffusion SDEs
Deriving an explicit formula for a strong solution to the SDE (9.9) is not possible in general
[just like in the case of ODEs, or PDEs]. So, in general, one needs to approximate
solutions to equations like (9.9) by using numerical methods [see Mikosch, Section 3.4].
Nevertheless, sometimes it is possible to guess an explicit formula for a strong solution to
the SDE (9.9), and then to use the Itô formula to verify that the guessed formula is a correct
one (Property 3 may be verified using the Itô formula).
We shall discuss several examples of diffusion SDEs that can be explicitly solved by
using Itô formulas.
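The numerical methods alluded to above can be as simple as the Euler-Maruyama scheme, which discretizes (9.9) step by step. A minimal sketch (Python; the function signature and the test case below are our own illustrative choices, not part of the course material); note that with $\sigma \equiv 0$ the scheme reduces to the Euler method for the ODE $x' = b(t, x)$, which gives a cheap sanity check:

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T, n, rng):
    """Simulate one path of dX = b(t,X) dt + sigma(t,X) dW on [0,T] with n steps."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    t = 0.0
    for i in range(n):
        dw = rng.normal(0.0, np.sqrt(dt))
        x[i + 1] = x[i] + b(t, x[i]) * dt + sigma(t, x[i]) * dw
        t += dt
    return x

# Sanity check: with sigma = 0 and b(t,x) = -x the scheme should
# recover the deterministic solution x(t) = e^{-t}.
rng = np.random.default_rng(3)
path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.0, 1.0, 1.0, 1000, rng)
```

The scheme converges (in the strong sense) as the step size shrinks, under the same kind of Lipschitz and growth conditions as in the well-posedness discussion above.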
Example 9.3. Consider the equation
$$ dX_t = dW_t, \qquad t \ge 0; \quad X_0 = 0. \qquad (9.10) $$
Here $b(t, x) = 0$, $\sigma(t, x) = 1$. The obvious strong solution is: $X_t = W_t$, $t \ge 0$.
Exercise 1. Verify that Properties 1-3 are satisfied by this solution.
Example 9.4. Consider the equation
$$ dX_t = b \, dt + \sigma \, dW_t, \qquad t \ge 0; \quad X_0 = x. \qquad (9.11) $$
Here $b(t, x) = b$, $\sigma(t, x) = \sigma$. The obvious strong solution is: $X_t = x + b t + \sigma W_t$, $t \ge 0$.
Exercise 2. Verify that Properties 1-3 are satisfied by this solution.
Example 9.5. Consider the equation
$$ dX_t = dt + 2 \, \mathrm{sgn}(W_t) \sqrt{X_t} \, dW_t, \qquad t \ge 0; \quad X_0 = 0. \qquad (9.12) $$
Here $b(t, x) = 1$, $\sigma(t, x, w) = 2 \, \mathrm{sgn}(w) \sqrt{x}$, where
$$ \mathrm{sgn}(x) = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases} $$
We shall verify that $X_t = W_t^2$, $t \ge 0$, is a strong solution to this equation.
Property 1. For every $t \ge 0$ the random variable $X_t = W_t^2$ is a function of $W_t$.
Property 2. The integral $\int_0^t 1 \, ds = t$ is well defined. The integral $\int_0^t \mathrm{sgn}(W_s) \sqrt{X_s} \, dW_s$
is well defined. This is because the process $\sqrt{X_t}$ is adapted to the SBM, and
$$ \int_0^t E \left[ \left( \mathrm{sgn}(W_s) \sqrt{X_s} \right)^2 \right] ds = \int_0^t s \, ds = \frac{1}{2} t^2 $$
is well defined.
Property 3. Using the Itô formula we get [recall that here $X_0 = 0$]
$$ X_t = 2 \int_0^t W_s \, dW_s + \int_0^t 1 \, ds = 2 \int_0^t \mathrm{sgn}(W_s) \sqrt{X_s} \, dW_s + \int_0^t 1 \, ds, \qquad t \ge 0. $$
Thus, indeed $X_t = W_t^2$, $t \ge 0$, is a strong solution to equation (9.12).
Remark 9.6. Equation (9.12) is not of the form (9.9). In fact, this equation can be
considered as a part of the following system of SDEs for two processes $X$ and $Y$:
$$ dX_t = dt + 2 \, \mathrm{sgn}(Y_t) \sqrt{X_t} \, dW_t, \qquad t \ge 0; \quad X_0 = 0, \qquad (9.13) $$
$$ dY_t = dW_t, \qquad t \ge 0; \quad Y_0 = 0. \qquad (9.14) $$
It can be demonstrated [using the so called Lévy characterization theorem] that the process
$$ \widetilde{W}_t = \int_0^t \mathrm{sgn}(W_s) \, dW_s, \qquad t \ge 0, $$
is a SBM. Observe that $d\widetilde{W}_t = \mathrm{sgn}(W_t) \, dW_t$. Thus the process $X_t = W_t^2$, which is the strong
solution to equation (9.12), is a weak solution to the following SDE
$$ dX_t = dt + 2 \sqrt{X_t} \, d\widetilde{W}_t, \qquad t \ge 0; \quad X_0 = 0. $$
In this sense the process $X_t$ is a homogeneous diffusion with the drift coefficient $b(x) =
b(t, x) = 1$ and the diffusion coefficient $\sigma(x) = \sigma(t, x) = 2 \sqrt{x}$. Finally, observe that the
process $X_t$ is an example of the square-root diffusion (see Subsection 9.2.2).
Example 9.7. Consider the equation
$$ dX_t = \frac{1}{2} X_t \, dt + X_t \, dW_t, \qquad t \ge 0; \quad X_0 = 1. \qquad (9.15) $$
Here $b(t, x) = \frac{1}{2} x$ and $\sigma(t, x) = x$. Using Example 8.2, we easily deduce that the process
$X_t = e^{W_t}$, $t \ge 0$, is a strong solution to this equation.
Exercise 3. Verify that Properties 1-3 are satisfied by this solution.
Example 9.8. Consider the equation
$$ dX_t = b X_t \, dt + \sigma X_t \, dW_t, \qquad t \ge 0; \quad X_0 = e^{\eta}. \qquad (9.16) $$
Here $b(t, x) = b x$ and $\sigma(t, x) = \sigma x$. Using Example 8.4, we easily deduce that the GBM
process
$$ X_t = e^{\eta + (b - \frac{1}{2} \sigma^2) t + \sigma W_t}, \qquad t \ge 0, $$
is a strong solution to equation (9.16). Note that the random variable $Y_t = \ln X_t$ is normally
distributed with mean $\eta + (b - \frac{1}{2} \sigma^2) t$ and variance $\sigma^2 t$. That is, the random variable $X_t$
has a lognormal distribution. Example 9.7 and Example 9.9 below are special cases of this
example.
Example 9.9. Consider the equation
$$ dX_t = (b + \sigma^2/2) X_t \, dt + \sigma X_t \, dW_t, \qquad t \ge 0; \quad X_0 = y. \qquad (9.17) $$
Here $b(t, x) = (b + \sigma^2/2) x$ and $\sigma(t, x) = \sigma x$. Using Example 9.8 above we easily deduce that
the GBM process
$$ X_t = y \, e^{b t + \sigma W_t}, \qquad t \ge 0, $$
is a strong solution to equation (9.17). In particular, for $b = -\frac{1}{2}$, $\sigma = 1$ and $y = 1$ we get
that
$$ X_t = e^{-\frac{1}{2} t + W_t} \qquad (9.18) $$
solves
$$ dX_t = X_t \, dW_t, \qquad t \ge 0; \quad X_0 = 1, \qquad (9.19) $$
and thus it is a Brownian martingale, called the stochastic exponential of the Brownian
motion $W$.
Example 9.10. Consider the Ornstein-Uhlenbeck equation
$$ dX_t = a (b - X_t) \, dt + \sigma \, dW_t, \qquad t \ge 0; \quad X_0 = x. \qquad (9.20) $$
Here $b(t, x) = a (b - x)$ and $\sigma(t, x) = \sigma$. The strong solution to equation (9.20) is the
Ornstein-Uhlenbeck (OU) process
$$ X_t = x e^{-a t} + b (1 - e^{-a t}) + \sigma e^{-a t} \int_0^t e^{a s} \, dW_s, \qquad t \ge 0. $$
Exercise 4. Verify that Properties 2-3 are satisfied here.
The random variable $\int_0^t e^{a s} \, dW_s$ has a normal distribution with mean zero and variance
$\int_0^t e^{2 a s} \, ds$. Thus, for the OU process we have
$$ X_t \sim N \left( x e^{-a t} + b (1 - e^{-a t}), \; \sigma^2 e^{-2 a t} \int_0^t e^{2 a s} \, ds \right). $$
If $a > 0$ then for large values of $t$ the distribution of the OU random variable is close to
$N(b, \frac{\sigma^2}{2 a})$. This is the reason why the constant $b$ is called the mean reversion level.
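Because the OU transition law is Gaussian with explicitly known mean and variance, the process can be simulated exactly and the limiting $N(b, \sigma^2/(2a))$ distribution checked by Monte Carlo. A sketch (Python; parameter values and sample sizes are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
a, b, sigma = 2.0, 1.5, 0.5
dt, n_steps, n_paths = 0.01, 2000, 5000

x = np.zeros(n_paths)                                    # X_0 = 0 on every path
decay = np.exp(-a * dt)
# Exact one-step transition: mean x*e^{-a dt} + b(1 - e^{-a dt}),
# variance sigma^2 (1 - e^{-2 a dt}) / (2a).
sd = sigma * np.sqrt((1 - np.exp(-2 * a * dt)) / (2 * a))
for _ in range(n_steps):
    x = x * decay + b * (1 - decay) + sd * rng.normal(size=n_paths)

mean_T, var_T = x.mean(), x.var()   # compare with b and sigma^2/(2a)
```

At $T = 20$ the transient terms $e^{-aT}$ are negligible, and the sample mean and variance sit close to $b = 1.5$ and $\sigma^2/(2a) = 0.0625$.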
Example 9.11. Consider the SDE
$$ dX_t = -\frac{X_t}{1 - t} \, dt + dW_t, \qquad t \in [0, 1), \quad X_0 = 0. $$
The strong solution of this equation is
$$ X_t = (1 - t) \int_0^t \frac{1}{1 - s} \, dW_s, \qquad t \in [0, 1). $$
It can be shown by continuity that $X_1 = 0$. Thus, the process $X_t$ is a Gaussian process
with the mean function $m(t) := E X_t = 0$ and covariance function $c(t, s) := \mathrm{Cov}(X_t, X_s) =
E X_t X_s = \min(t, s) - t s$, $t, s \in [0, 1]$. The process $X_t$ is known as the Brownian bridge.
In Mikosch, Example 1.3.5, the Brownian bridge is given as
$$ Y_t = W_t - t W_1, \qquad t \in [0, 1]. $$
Observe that the processes $X$ and $Y$ are both Gaussian with the same mean function $m(t)$ and
the same covariance function $c(t, s)$. However, the process $Y$ is not adapted to the filtration
of $W$.
All the above examples, with the exception of Example 9.5, are special cases of the
General Linear SDE (3.32) of Mikosch, Section 3.3.
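The claim that $Y_t = W_t - t W_1$ has covariance $\min(t, s) - t s$ is easy to check by Monte Carlo. A sketch (Python; the grid, sample size and the test pair $(s, t) = (0.3, 0.7)$ are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths, n_steps = 20_000, 100
dt = 1.0 / n_steps

# Brownian paths on the grid dt, 2dt, ..., 1
W = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
t_grid = np.linspace(dt, 1.0, n_steps)
Y = W - t_grid * W[:, -1:]            # bridge Y_t = W_t - t W_1; Y_1 = 0

s, t = 0.3, 0.7
i, j = int(s * n_steps) - 1, int(t * n_steps) - 1
emp_cov = np.mean(Y[:, i] * Y[:, j])  # empirical Cov(Y_s, Y_t)
theory = min(s, t) - s * t            # = 0.09
```

The empirical covariance matches $\min(0.3, 0.7) - 0.3 \cdot 0.7 = 0.09$ up to Monte Carlo noise, and the paths are exactly pinned at $Y_1 = 0$.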
9.4 SDEs Driven by a Poisson Process
Let $N$ be PP($\lambda$). In many ways, SDEs driven by a Poisson process are easier to deal with
than SDEs driven by the SBM. As was the case with SDEs driven by the SBM, the Itô formula
plays a fundamental role in solving SDEs driven by a Poisson process.
Example 9.12. The unique strong solution to the equation
$$ dX_t = \alpha X_{t-} \, dN_t, \qquad t \ge 0; \quad X_0 = 1, $$
is the process $X_t = (1 + \alpha)^{N_t}$, $t \ge 0$. In particular, when $\alpha = 1$ we obtain $X_t = 2^{N_t}$, $t \ge 0$.
Example 9.13. The unique strong solution to the equation
$$ dX_t = b X_{t-} (dt + dN_t), \qquad t \ge 0; \quad X_0 = 1, $$
is the process $X_t = e^{b t} (1 + b)^{N_t}$, $t \ge 0$. In particular, when $b = 1$ we obtain $X_t = e^t \, 2^{N_t}$, $t \ge 0$.
Example 9.14. The unique strong solution of
$$ dX_t = X_{t-} \, dM_t, \qquad t \ge 0; \quad X_0 = 1, $$
is
$$ X_t = 1 + \int_0^t X_{u-} \, dM_u = e^{-\lambda t + N_t \ln 2}, \qquad t \ge 0. $$
Thus, the process $e^{-\lambda t + N_t \ln 2}$ is a martingale. Compare this with (9.18) and (9.19).
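At a fixed date, the martingale property of $e^{-\lambda t + N_t \ln 2} = e^{-\lambda t} \, 2^{N_t}$ reduces to $E[2^{N_t}] = e^{\lambda t}$, which is easy to confirm by Monte Carlo. A sketch (Python; $\lambda$, $t$ and the sample size are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(6)
lam, t, n_paths = 1.0, 1.0, 200_000
N_t = rng.poisson(lam * t, n_paths)       # terminal values of the Poisson process

vals = np.exp(-lam * t) * 2.0**N_t        # candidate martingale at time t
m = vals.mean()                           # should be close to X_0 = 1
```

The sample mean sits near 1, consistent with $E[X_t] = X_0$ for a martingale started at 1.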
Homework 10: Stochastic differential equations
1. Do
(a) Exercise 1.
(b) Exercise 2.
(c) Exercise 3.
(d) Exercise 4.
2. Let $W_t$, $t \ge 0$, be a SBM.
(a) Compute the expectation $E X_T$ and the variance $\mathrm{Var}(X_T)$, where $X_t$ is the strong
solution to the SDE
$$ dX_t = X_t \, dt + X_t \, dW_t, \qquad t \in [0, T], \quad X_0 = 1. $$
Is the process $X$ a martingale with respect to the natural filtration of $W$?
(b) Compute the covariance $\mathrm{Cov}(Y_t, Y_s)$ for $0 \le s \le t$, where $Y_t$ is the strong solution
to the SDE
$$ dY_t = -Y_t \, dt + dW_t, \qquad t \ge 0, \quad Y_0 = 0. $$
In addition compute the mean and the variance of the limiting distribution of the
process $Y$. [Hint: The process $e^t Y_t$ has independent increments.]
(c) What is the distribution of $Y_t$, where $Y$ is the strong solution to the SDE
$$ dY_t = -Y_t \, dt + dW_t, \qquad t \ge 0, \quad Y_0 = \eta, $$
in which $\eta \sim N(0, \frac{1}{4})$ is independent of the SBM $W$?
3. Let $N_t$, $t \ge 0$, be a PP($\lambda$). Compute the covariance $\mathrm{Cov}(Y_t, Y_s)$ for $0 \le s \le t$, where
$Y_t = \ln Z_t$, and where $Z$ is the strong solution to the SDE
$$ dZ_t = Z_{t-} (dt + dN_t), \qquad t \ge 0, \quad Z_0 = 1. $$
4. Let $W_t$, $t \ge 0$, be a SBM. Let $N_t$, $t \ge 0$, be a PP($\lambda$). Let $X_t = 2^{N_t} e^{-3 t} + W_t^2$. Compute
the differential $dX_t$.
9.5 Jump-Diffusions [See Ikeda and Watanabe]*
By a jump-diffusion we mean hereafter an Itô process $X$ in the sense of Section 8.3, but for a
Markovian SDE (8.13), meaning that the random coefficients $b_t$, $\sigma_t$ and $\delta_t(x)$ of (8.13) are
now given deterministically in terms of $X_t$$^1$ as
$$ b_t = b(t, X_t), \qquad \sigma_t = \sigma(t, X_t), \qquad \delta_t(x) = \delta(t, X_t, x). $$
Equation (8.13) is thus now an SDE in $X$. Well-posedness$^2$ of such jump-diffusion SDEs
can be studied by classical Picard iteration techniques under suitable Lipschitz and growth
conditions on the coefficients. A notable feature of the solution is the so-called Markov
property, meaning that
$$ E(\Phi(X_s, s \in [t, T]) \mid \mathcal{F}_t) = E(\Phi(X_s, s \in [t, T]) \mid X_t) $$
for every (possibly path-dependent) functional $\Phi$ of $X$ giving sense to both sides of the
equality. So the past of $X$ does not influence its future, the present of $X$ gathering all the
relevant information.
Given a real valued, regular enough function $f = f(t, x)$ on $[0, T] \times \mathbb{R}^d$, one has by
(8.15) the following Itô formula for a jump-diffusion (canonical form)
$$ df(t, X_t) = (\partial_t + \mathcal{A}) f(t, X_t) \, dt + \partial f(t, X_t) \sigma_t \, dW_t + dM^f_t \qquad (9.21) $$
for the compensated jump (local) martingale
$$ dM^f_t = \delta f(t, X_t, J^{(t)}) \, dN_t - \lambda \, \widehat{\delta f}(t, X_t) \, dt $$
in which we let, for every $t \ge 0$ and $x, y$ in $\mathbb{R}^d$,
$$ \delta f(t, x, y) = f(t, x + \delta(t, x, y)) - f(t, x), \qquad \widehat{\delta f}(t, x) = \int_{\mathbb{R}^d} \delta f(t, x, y) \, \nu(dy), $$
and where the infinitesimal generator $\mathcal{A}$ of $X$ acts on $f$ at time $t$ as
$$ (\mathcal{A} f)(t, x) = \partial f(t, x) b(t, x) + \frac{1}{2} \mathrm{Tr} \left( \partial^2 f(t, x) a(t, x) \right) + \lambda \, \widehat{\delta f}(t, x). \qquad (9.22) $$
Therefore, for every suitable functions $f, g = f, g(x)$ (cf. (8.19)),
$$ E \left( df(t, X_t) \mid \mathcal{F}_t \right) = (\partial_t + \mathcal{A}) f(t, X_t) \, dt, $$
$$ \mathrm{Cov} \left( df(t, X_t), dg(t, X_t) \mid \mathcal{F}_t \right) = \left( \mathcal{A}(f g) - f \, \mathcal{A} g - g \, \mathcal{A} f \right)(t, X_t) \, dt \qquad (9.23) $$
in which, with $Y_t = f(t, X_t)$, $Z_t = g(t, X_t)$,
$$ \left( \mathcal{A}(f g) - f \, \mathcal{A} g - g \, \mathcal{A} f \right)(t, X_t) = \frac{d\langle Y, Z \rangle_t}{dt} = \partial f(t, X_t) \, a(t, X_t) \left( \partial g(t, X_t) \right)^T + \lambda \int_{\mathbb{R}^d} \delta f(t, X_t, y) \, \delta g(t, X_t, y) \, \nu(dy). $$
Also, the conditionings with respect to $\mathcal{F}_t$ in (9.23) can be replaced by conditionings
with respect to $X_t$, by the Markov property of $X$.
$^1$ Or of $X_{t-}$ in the case of $b$ and $\sigma$, which in view of (7.6) makes no difference in (8.13), by continuity of
the time and Brownian integrals.
$^2$ In the so-called strong sense, to which we shall limit ourselves in the context of these notes.
Chapter 10
Girsanov transformations
The Girsanov transformation is a very useful technique for converting processes that are not
martingales into martingales. It amounts to changing the probabilities of random events
(changing probability measures).
10.1 Girsanov transformation relative to Gaussian distribu-
tions
10.1.1 Gaussian random variables
Suppose $\xi$ is a standard normal variable, that is, $\xi \stackrel{P}{\sim} N(0, 1)$. Its probability density function
is $\varphi(x) = (2\pi)^{-\frac{1}{2}} \exp \left( -\frac{x^2}{2} \right)$, $x \in (-\infty, \infty)$. Now, consider a function $u(x) = e^{q x - \frac{q^2}{2}}$, where
$q$ is a constant. Let us now transform the probability measure $P$ via the function $u$ in order
to produce a new probability measure on $\Omega$, denoted by $Q$, and defined by
$$ \frac{dQ(\omega)}{dP(\omega)} = \rho(\omega) \qquad (10.1) $$
where $\rho = u(\xi)$. We call the random variable $\rho$ the density of the measure $Q$ with respect
to the measure $P$. This is equivalent to writing
$$ dQ = \rho \, dP \quad \text{[meaning that $Q$ is absolutely continuous with respect to $P$]} $$
or
$$ dP = \rho^{-1} \, dQ \quad \text{[meaning that $P$ is absolutely continuous with respect to $Q$]}. $$
We say that the measures $P$ and $Q$ are equivalent with respect to each other.
Exercise 1. Verify that $Q$ is a probability measure.
[Hint: Since obviously $Q$ is non-negative and $\sigma$-additive, verifying that $Q$ is a probabil-
ity measure amounts to the verification that $Q(\Omega) = 1$, where $Q(\Omega) = \int_{\Omega} dQ(\omega) = \int_{\Omega} \rho(\omega) \, dP(\omega) =
E_P \rho = \int_{-\infty}^{\infty} u(x) \varphi(x) \, dx$.]
In view of (10.1) we obtain
$$ Q(\xi \in dx) = u(x) P(\xi \in dx) = e^{q x - \frac{q^2}{2}} (2\pi)^{-\frac{1}{2}} \exp \left( -\frac{x^2}{2} \right) dx = (2\pi)^{-\frac{1}{2}} \exp \left( -\frac{(x - q)^2}{2} \right) dx. $$
But the function $(2\pi)^{-\frac{1}{2}} \exp \left( -\frac{(x - q)^2}{2} \right)$ is the density of the normal distribution $N(q, 1)$.
Thus, the random variable $\xi$ has the normal distribution $N(q, 1)$ under the measure $Q$,
which we write $\xi \stackrel{Q}{\sim} N(q, 1)$.
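The change of measure above is easy to test empirically: $Q$-expectations can be computed as $P$-expectations weighted by $\rho = e^{q \xi - q^2/2}$, so reweighted standard normal samples should reproduce the moments of $N(q, 1)$. A sketch (Python; the value of $q$ and the sample size are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(7)
q, n = 0.8, 500_000
x = rng.normal(size=n)                 # samples of xi under P
w = np.exp(q * x - 0.5 * q**2)         # density dQ/dP evaluated on the samples

mass_Q = np.mean(w)                    # E_P[rho] = Q(Omega), should be 1
mean_Q = np.mean(w * x)                # E_Q[xi] = E_P[rho * xi], should be q
```

Up to Monte Carlo noise, the weighted total mass is 1 and the weighted mean is $q$, as the computation with densities predicts.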
10.1.2 Brownian motion [Mikosch, Section 4.2; Shreve, Sections 1.6 and
5.2.1]
Let be a constant as before, let T < and let us consider a process 1 dened as
1
|
= \
|
t, t [0, T],
where the process \
|
is the SBM under some ltered probability space (
|
, 1). Now, dene
a process j
|
by
j
|
(.) = exp
{
\
|
(.)
1
2

2
t
}
, t [0, T], (10.2)
and then dene a new measure Q on
T
by
dQ(.) = j
T
(.)d1(.).
A remarkable result, known as the Girsanov Theorem, states that:
The process j
|
is a martingale with respect to the ltration
|
under the measure 1,
The measure Q is a probability measure,
The process 1
|
is a SBM under the measure Q.
Remark 10.1. The Girsanov theorem can not be generalized to the case T = .
For us, the most important application of the Girsanov theorem is the following example
Example 10.2. (Elimination of the drift term in a linear SDE) Consider the linear SDE
$$dX_t = b X_t\, dt + \sigma X_t\, dW_t, \quad t \in [0,T]; \quad X_0 = e^a. \qquad (10.3)$$
From Example 9.16 we know that the strong solution to this equation is
$$X_t = e^{a + (b - \frac{1}{2}\sigma^2) t + \sigma W_t}, \quad t \in [0,T].$$
This is not a martingale under $P$. Let us introduce a process
$$B_t = W_t - \mu t, \quad t \in [0,T],$$
with $\mu = -\frac{b}{\sigma}$. We may now rewrite equation (10.3) as
$$dX_t = \sigma X_t\, dB_t, \quad t \in [0,T]; \quad X_0 = e^a. \qquad (10.4)$$
Due to the Girsanov theorem the process $B$ is a SBM under $Q$. Thus the process $X_t = e^{a + (b - \frac{1}{2}\sigma^2) t + \sigma W_t} = e^{a - \frac{1}{2}\sigma^2 t + \sigma B_t}$, which is the strong solution to both (10.3) and (10.4), is a martingale under $Q$. This observation plays a crucial role in the so-called risk-neutral approach to pricing financial assets.
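The martingale property under $Q$ means $E_Q[X_T] = E_P[p_T X_T] = X_0 = e^a$. The following Monte Carlo sketch (parameters are illustrative) checks exactly this identity:

```python
import math, random

# Sketch: under Q (density p_T with mu = -b/sigma), the solution of (10.3) should be
# a martingale, so E_P[p_T X_T] = E_Q[X_T] = X_0 = e^a.
a, b, sigma, T, n = 0.0, 0.3, 0.4, 1.0, 200_000
mu = -b / sigma                      # the drift is removed by B_t = W_t - mu*t
rng = random.Random(7)

est = 0.0
for _ in range(n):
    W_T = rng.gauss(0.0, math.sqrt(T))
    X_T = math.exp(a + (b - 0.5 * sigma**2) * T + sigma * W_T)   # strong solution of (10.3)
    p_T = math.exp(mu * W_T - 0.5 * mu**2 * T)                   # Girsanov density (10.2)
    est += p_T * X_T
est /= n

print(est, math.exp(a))   # both should be close to X_0 = e^a
```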
Exercise 2. Verify that the strong solution to (10.4) is a martingale with respect to the filtration $\mathcal{F}_t$.
Exercise 3. Write down a SDE satisfied by the process $p_t$ of (10.2).
10.2 Girsanov transformation relative to Poisson distributions

10.2.1 Poisson random variables

Let $\nu \sim_P \mathcal{P}_\lambda$ be a Poisson random variable with parameter $\lambda$ under $P$, so
$$P(\nu = k) = \frac{e^{-\lambda} \lambda^k}{k!}$$
for $k = 0, 1, 2, \ldots$, and zero otherwise. Letting $m(k) = e^{(\lambda - \kappa) + k \ln\frac{\kappa}{\lambda}}$ for some $\kappa > 0$, define a new measure $Q$ on $(\Omega, \mathcal{F})$ by
$$\frac{dQ(\omega)}{dP(\omega)} = p(\omega),$$
where $p = m(\nu)$. Note
$$E_P\, p = \int_\Omega p(\omega)\, P(d\omega) = \sum_k e^{(\lambda - \kappa) + k \ln(\frac{\kappa}{\lambda})}\, \frac{e^{-\lambda} \lambda^k}{k!} = 1,$$
showing that $Q$ is a probability measure (with total mass equal to one). Observe now that
$$Q(\nu = k) = \int_{\{\omega ;\, \nu(\omega) = k\}} dQ(\omega) = \int_{\{\omega ;\, \nu(\omega) = k\}} p(\omega)\, dP(\omega) = e^{(\lambda - \kappa) + k \ln\frac{\kappa}{\lambda}}\, P(\nu = k) = \frac{e^{-\kappa} \kappa^k}{k!}$$
for $k = 0, 1, 2, \ldots$, and zero otherwise. Thus, $\nu \sim_Q \mathcal{P}_\kappa$.
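The identity $m(k)\, e^{-\lambda}\lambda^k/k! = e^{-\kappa}\kappa^k/k!$ holds term by term, so it can be checked numerically without any simulation. The sketch below (illustrative $\lambda, \kappa$; the truncation level is a choice of this note) compares the reweighted Poisson($\lambda$) pmf with the Poisson($\kappa$) pmf:

```python
import math

# Direct check (truncated series) that reweighting the Poisson(lam) law by
# m(k) = exp((lam - kap) + k*log(kap/lam)) yields the Poisson(kap) law.
lam, kap = 2.0, 3.5   # illustrative parameters

def m(k):             # Radon-Nikodym weight dQ/dP on {nu = k}
    return math.exp((lam - kap) + k * math.log(kap / lam))

K = 80                # truncation level; both tails are negligible beyond it
p_lam = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(K)]
q = [m(k) * p_lam[k] for k in range(K)]                          # candidate Q-pmf
p_kap = [math.exp(-kap) * kap**k / math.factorial(k) for k in range(K)]

total = sum(q)                                                    # should be 1
max_err = max(abs(a - b) for a, b in zip(q, p_kap))               # should vanish
print(total, max_err)
```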
10.2.2 Poisson process

Let $N_t$ be PP($\lambda$) on some filtered probability space $(\Omega, \mathcal{F}_t, P)$. Now, for $\kappa > 0$, define a process $p_t$ by
$$p_t(\omega) = e^{(\lambda - \kappa) t + N_t(\omega) \ln\frac{\kappa}{\lambda}}, \quad t \in [0,T], \qquad (10.5)$$
and then, define a new measure $Q$ on $(\Omega, \mathcal{F}_T)$ by
$$dQ(\omega) = p_T(\omega)\, dP(\omega).$$
An appropriate version of the Girsanov Theorem states that:
• The process $p_t$ is a martingale with respect to the filtration $\mathcal{F}_t$ under the measure $P$,
• The measure $Q$ is a probability measure,
• The process $N_t$ is PP($\kappa$) under the measure $Q$.
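At the terminal date, $p_T$ only depends on $N_T \sim \mathcal{P}_{\lambda T}$, so the statements can be checked by the same reweighting device as before. The sketch below (illustrative parameters; the Poisson sampler is a simple inverse-transform routine, a choice of this note) verifies $E_P[p_T] \approx 1$ and $E_Q[N_T] \approx \kappa T$:

```python
import math, random

# Monte Carlo sketch: reweighting a PP(lam) by (10.5) at time T turns it into a PP(kap).
lam, kap, T, n = 1.5, 2.5, 2.0, 200_000
rng = random.Random(1)

def sample_poisson(mean, rng):
    # inverse-transform sampling of a Poisson variate (adequate for small means)
    u, k, p = rng.random(), 0, math.exp(-mean)
    c = p
    while u > c:
        k += 1
        p *= mean / k
        c += p
    return k

mass = q_mean = 0.0
for _ in range(n):
    N_T = sample_poisson(lam * T, rng)                            # N_T under P
    p_T = math.exp((lam - kap) * T + N_T * math.log(kap / lam))   # density (10.5) at T
    mass += p_T
    q_mean += p_T * N_T
mass, q_mean = mass / n, q_mean / n

print(mass, q_mean)   # expect approximately 1 and kap*T = 5
```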
Example 10.3. (Elimination of the drift term in a linear SDE) Consider the linear SDE
$$dX_t = X_{t-}(-\kappa\, dt + dN_t), \quad t \in [0,T], \quad X_0 = 1. \qquad (10.6)$$
The process $X_t$ is not a martingale with respect to $\mathcal{F}_t$ under $P$. The above equation can of course be written as
$$dX_t = X_{t-}\, d\widetilde{N}_t, \quad t \in [0,T]; \quad X_0 = 1, \qquad (10.7)$$
where we let $\widetilde{N}_t = N_t - \kappa t$. Since, by the Girsanov theorem, $N$ is PP($\kappa$) under $Q$, the compensated process $\widetilde{N}$ is a $Q$-martingale, so the process $X_t$ is a martingale with respect to $\mathcal{F}_t$ under $Q$.
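Assuming, as above, that (10.6) reads $dX_t = X_{t-}(-\kappa\, dt + dN_t)$, its strong solution is $X_t = e^{-\kappa t}\, 2^{N_t}$ (between jumps $X$ decays at rate $\kappa$, and each jump doubles it). The martingale claim then means $E_Q[X_T] = E_P[p_T X_T] = X_0 = 1$, which can be checked by a deterministic truncated series (all parameters below are illustrative):

```python
import math

# Deterministic check (truncated series) that X_t = exp(-kap*t) * 2^{N_t}, the solution
# of (10.6) as reconstructed here, satisfies E_P[p_T X_T] = E_Q[X_T] = X_0 = 1.
lam, kap, T = 1.0, 2.0, 1.5

K = 120               # truncation level for the Poisson series
total = 0.0
for k in range(K):
    prob_k = math.exp(-lam * T) * (lam * T)**k / math.factorial(k)   # P(N_T = k)
    p_T = math.exp((lam - kap) * T + k * math.log(kap / lam))        # density (10.5)
    X_T = math.exp(-kap * T) * 2.0**k                                # solution given N_T = k
    total += p_T * X_T * prob_k

print(total)   # should be 1, up to truncation error
```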
Exercise 4. Verify that the strong solution to (10.7) is a martingale with respect to the natural filtration of $N$ under $Q$.
Exercise 5. Take $\lambda = 1$. Verify that the process $p$ of (10.5) is the strong solution to
$$dp_t = (\kappa - 1)\, p_{t-}(-dt + dN_t), \quad t \in [0,T]; \quad p_0 = 1.$$
10.3 Girsanov transformation relative to both Brownian motion and Poisson process

Girsanov transformation can be applied jointly to a pair $(X, N)$ where $X$ and $N$ are a Brownian motion and a Poisson process, respectively, defined on the same probability space, say $(\Omega, \mathcal{F}, P)$.
If $X \sim$ BM($x, b, \sigma$) and $N \sim$ PP($\lambda$), then one can apply a simultaneous change of probability measure $P$ to a new measure $Q$ so that under the new measure we have that $X \sim$ BM($x, c, \sigma$), for a new drift coefficient $c$, and $N \sim$ PP($\kappa$).
10.4 Abstract Bayes formula

Suppose that $\xi$ is an $\mathcal{F}_T$-measurable and integrable random variable and let $Q$ be defined via $P$ by $\frac{dQ}{dP} = p_T$, for some positive $P$-martingale $p$ with unit mean. We shall admit the following lemma.

Lemma 10.1. A process $X$ is a $Q$-local martingale if and only if $pX$ is a $P$-local martingale.

As a consequence,

Corollary 10.2. The following Bayes formula holds:
$$p_t\, E_Q(\xi \mid \mathcal{F}_t) = E_P(p_T\, \xi \mid \mathcal{F}_t). \qquad (10.8)$$

Proof. One has a (Doob) $Q$-martingale $E_Q(\xi \mid \mathcal{F}_t)$, and therefore by the lemma a $P$-martingale $p_t\, E_Q(\xi \mid \mathcal{F}_t)$. Since $p_t\, E_Q(\xi \mid \mathcal{F}_t)$ is a $P$-martingale with terminal condition $p_T\, \xi$ at $T$, (10.8) follows (admitting the required integrability). □
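The Bayes formula (10.8) can be verified exactly on a finite sample space, where conditional expectations are plain weighted averages over the atoms of the conditioning $\sigma$-field. The sketch below uses a toy two-period binomial space (all numbers are illustrative choices of this note):

```python
# Finite-sample-space illustration of the abstract Bayes formula (10.8):
# p_t E_Q(xi | F_t) = E_P(p_T xi | F_t), checked on each atom of F_1.
omega = ["uu", "ud", "du", "dd"]                   # two-period binomial scenarios
P = {w: 0.25 for w in omega}                       # reference measure
pT = {"uu": 2.0, "ud": 1.0, "du": 0.6, "dd": 0.4}  # positive density with E_P[pT] = 1
xi = {"uu": 3.0, "ud": 1.0, "du": 2.0, "dd": 5.0}  # an F_T-measurable random variable

Q = {w: pT[w] * P[w] for w in omega}               # dQ = pT dP
atoms_F1 = [["uu", "ud"], ["du", "dd"]]            # atoms of F_1 (first move known)

pairs = []
for atom in atoms_F1:
    PA = sum(P[w] for w in atom)
    QA = sum(Q[w] for w in atom)
    p1 = sum(pT[w] * P[w] for w in atom) / PA           # p_1 = E_P(pT | F_1) on this atom
    lhs = p1 * sum(Q[w] * xi[w] for w in atom) / QA     # p_1 * E_Q(xi | F_1)
    rhs = sum(pT[w] * xi[w] * P[w] for w in atom) / PA  # E_P(pT xi | F_1)
    pairs.append((lhs, rhs))

print(pairs)   # the two numbers in each pair coincide
```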
Homework 11: Girsanov transformations

1. Solve Exercises 1-5 above.
2. Let $N_t$, $t \in [0,T]$, be a PP($\lambda$) on a filtered probability space $(\Omega, \mathcal{F}_t, P)$. Let $X$ be the strong solution to the following SDE on this space:
$$dX_t = X_{t-}(-c\, dt + dN_t), \quad t \geq 0, \quad X_0 = 1,$$
for some constant $c > 0$. Define a new probability measure $Q$ on $(\Omega, \mathcal{F}_T)$ so that the process $X$ is a martingale under $Q$.
Chapter 11
Feynman-Kac formulas

11.1 Linear case
Let $X$ be given as a jump-diffusion and let $f = f(t, x)$ be a function such that $f(t, X_t)$ is a local martingale. Then, in view of the Itô formula in canonical form (9.21), one concludes from Proposition 7.2 that the time-differentiable local martingale
$$(\partial_t + \mathcal{G})\, f(t, X_t)\, dt = df(t, X_t) - \partial f(t, X_t)\, \sigma_t\, dW_t - dM^f_t$$
is constant, where $\mathcal{G}$ denotes the generator of $X$. Using for instance BSDE techniques to be mentioned in Section 11.2, this in turn translates into the following partial integro-differential equation (deterministic PIDE) to be satisfied by the function $f$:
$$(\partial_t + \mathcal{G})\, f(t, x) = 0, \quad x \in \mathbb{R}^d. \qquad (11.1)$$
The fundamental situation of this kind corresponds to a Doob-martingale
$$f(t, X_t) := E\big(\phi(X_T) \mid X_t\big) = E\big(\phi(X_T) \mid \mathcal{F}_t\big)$$
for an integrable terminal condition $\phi(X_T)$. Here the second equality, which grounds the martingale property of $f(t, X_t)$, holds in virtue of the Markov property of $X$. In this case the function $f$ can typically be characterized as the unique solution to the PIDE (11.1), along with the terminal condition $f = \phi$ at time $T$.
More generally, given suitable running and terminal cost functions $c$ and $\phi$, and a discount rate function $r$, let
$$u(t, X_t) := E\Big(\int_t^T e^{-\int_t^s r(\zeta, X_\zeta)\, d\zeta}\, c(s, X_s)\, ds + e^{-\int_t^T r(s, X_s)\, ds}\, \phi(X_T) \,\Big|\, X_t\Big)$$
$$= E\Big(\int_t^T e^{-\int_t^s r(\zeta, X_\zeta)\, d\zeta}\, c(s, X_s)\, ds + e^{-\int_t^T r(s, X_s)\, ds}\, \phi(X_T) \,\Big|\, \mathcal{F}_t\Big), \qquad (11.2)$$
by the Markov property of $X$. Let $\beta_t = e^{-\int_0^t r(s, X_s)\, ds}$ denote the discount factor at rate $r(t, X_t)$.
By immediate extensions of the previous computations, one then has:
• On one hand, the following (local) martingale that arises from the integration by parts and Itô formulas applied to $u(t, X_t)$:
$$du(t, X_t) - (\partial_t u + \mathcal{G}u)(t, X_t)\, dt = \partial u(t, X_t)\, \sigma_t\, dW_t + dM^u_t; \qquad (11.3)$$
• On the other hand, the following Doob-martingale (conditional expectation process of an integrable terminal condition) that arises from (11.2):
$$du(t, X_t) + \big(c(t, X_t) - r\, u(t, X_t)\big)\, dt. \qquad (11.4)$$
Subtracting (11.3) from (11.4) yields the local martingale
$$(\partial_t u + \mathcal{G}u + c - ru)(t, X_t)\, dt \qquad (11.5)$$
which is therefore constant as a time-differentiable local martingale (Proposition 7.2). Also accounting for the terminal condition $u = \phi$ at time $T$, this translates into the following PIDE to be satisfied by the function $u$:
$$\begin{cases} u(T, x) = \phi(x), & x \in \mathbb{R}^d \\ (\partial_t u + \mathcal{G}u + c - ru)(t, x) = 0, & t < T,\ x \in \mathbb{R}^d \end{cases} \qquad (11.6)$$
The function $u$ can then typically be characterized and computed (including numerically if needed/possible) as the unique solution in some sense to (11.6).
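For a concrete instance of (11.2) and (11.6), take $X$ a standard Brownian motion (so $\mathcal{G} = \frac{1}{2}\partial_{xx}$), a constant discount rate $r$, no running cost, and $\phi(x) = x^2$; then $u(t,x) = e^{-r(T-t)}(x^2 + T - t)$ solves the PDE, as direct differentiation shows. The Monte Carlo sketch below (illustrative parameters) compares the probabilistic representation with this closed form:

```python
import math, random

# Feynman-Kac sanity check: X = standard BM, constant rate r, c = 0, phi(x) = x^2,
# so u(t,x) = exp(-r(T-t)) * (x^2 + T - t) solves d_t u + 0.5 u_xx - r u = 0, u(T,.) = phi.
r, T, t, x, n = 0.5, 2.0, 0.5, 1.0, 200_000
rng = random.Random(3)

mc = 0.0
for _ in range(n):
    X_T = x + rng.gauss(0.0, math.sqrt(T - t))      # X_T given X_t = x
    mc += math.exp(-r * (T - t)) * X_T**2           # discounted terminal payoff
mc /= n

closed_form = math.exp(-r * (T - t)) * (x**2 + (T - t))
print(mc, closed_form)
```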
11.2 Backward Stochastic Differential Equations

The SDEs that were discussed in Chapter 9 were so-called forward SDEs. This is because any solution process of any such equation was supposed to satisfy a given initial condition. In Example 9.3 we considered the equation
$$dY_t = dW_t, \quad t \geq 0; \quad Y_0 = 0, \qquad (11.7)$$
with the obvious solution $Y_t = W_t$. In the above equation the initial condition was specified: the equation is to be solved forward in time. The backward version of equation (11.7) would read
$$dY_t = dW_t, \quad t \geq 0; \quad Y_T = \xi, \qquad (11.8)$$
where $\xi$ is some random variable. In equation (11.8) the terminal condition is specified: the equation is to be solved backward in time, and it is therefore called a backward stochastic differential equation (BSDE). It is rather clear that equation (11.8) is only solvable if $\xi = W_T + c$, where $c$ is a constant, in which case we have $Y_t = W_t + c$. Note that for $c = 0$ the solution of this equation is the same as the solution of equation (11.7), that is $Y_t = W_t$. Also note that equation (11.8) can be written as
$$dY_t = Z_t\, dW_t, \quad t \geq 0; \quad Y_T = \xi, \qquad (11.9)$$
where $Z_t = 1$ (observe that the process $Z$ is (trivially) adapted to the filtration $\mathcal{F}_t$, $t \geq 0$, generated by the Brownian motion $W$). In the case when $\xi = W_T + c$ we saw that the BSDE (11.9) admits a solution pair $(Y_t, Z_t)$ that is adapted to the filtration $\mathcal{F}_t$, $t \geq 0$, where $Y_t = W_t + c$ and $Z_t = 1$.
In more generality, consider the BSDE (11.9) where $\xi$ is some random variable measurable with respect to $\mathcal{F}_T$ (not necessarily $\xi = W_T + c$). If we additionally suppose that $\xi$ is square integrable, then one can show that the BSDE (11.9) admits a solution pair $(Y_t, Z_t)$ that is adapted to the filtration $\mathcal{F}_t$, $t \geq 0$, where $Y_t = E(\xi \mid \mathcal{F}_t)$ and $Z_t$ is the unique process appearing in the so-called Brownian martingale representation of $\xi$, that is $\xi = E(\xi) + \int_0^T Z_s\, dW_s$. We then have
$$Y_t = Y_0 + \int_0^t Z_s\, dW_s = E(\xi) + \int_0^t Z_s\, dW_s = \xi - \int_t^T Z_s\, dW_s.$$
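A worked instance of this representation (not from the text, but a standard example): for $\xi = W_T^2$ one has $Y_t = E(W_T^2 \mid \mathcal{F}_t) = W_t^2 + (T - t)$ and $Z_t = 2W_t$, since $W_t^2 - t = \int_0^t 2W_s\, dW_s$ by the Itô formula. The sketch below checks $Y_t = Y_0 + \int_0^t Z_s\, dW_s$ pathwise on a fine grid:

```python
import math, random

# Pathwise check of the martingale representation behind (11.9) for xi = W_T^2:
# Y_t = W_t^2 + (T - t), Z_t = 2 W_t, Y_0 = E[W_T^2] = T.
T, n_steps = 1.0, 100_000
dt = T / n_steps
rng = random.Random(5)

W, stoch_int, max_err = 0.0, 0.0, 0.0
for i in range(n_steps):
    dW = rng.gauss(0.0, math.sqrt(dt))
    stoch_int += 2.0 * W * dW          # Ito-Riemann sum of int_0^t Z dW with Z = 2W
    W += dW
    t = (i + 1) * dt
    Y = W * W + (T - t)                # Y_t = E(W_T^2 | F_t)
    Y0 = T                             # Y_0 = E(W_T^2)
    max_err = max(max_err, abs(Y - (Y0 + stoch_int)))

print(max_err)   # discretization error of the stochastic integral; small on fine grids
```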
11.2.1 Non-linear Feynman-Kac formula

In the jump-diffusion set-up of Section 11.1, one can in view of (11.3) and (11.6) interpret the triplet of processes (parameterized by $x \in \mathbb{R}^d$ in the case of $V$)
$$(Y_t, Z_t, V_t(x)) := \big(u(t, X_t),\ \partial u(t, X_t)\, \sigma(t, X_t),\ \delta u(t, X_t, x)\big) \qquad (11.10)$$
as a solution to the following BSDE:
$$\begin{cases} Y_T = \phi(X_T) \text{ and for } t < T, \\ -dY_t = \big(c(t, X_t) - r(t, X_t)\, Y_t\big)\, dt - Z_t\, dW_t - \big(V_t(J_{(t)})\, dN_t - \lambda \widetilde{V}_t\, dt\big), \end{cases} \qquad (11.11)$$
with $\widetilde{V}_t := \int_{\mathbb{R}} V_t(x)\, \nu(dx)$. In the case of a diffusion $X$ (without jumps, so $\lambda = 0$), there is no component $V$ involved in the solution (or formally $V = 0$ above).
In a BSDE perspective, the Feynman-Kac formula (11.2), written in the equivalent form (as obvious from (11.11))
$$Y_t = E\Big(\int_t^T \big(c(s, X_s) - r(s, X_s)\, Y_s\big)\, ds + \phi(X_T) \,\Big|\, \mathcal{F}_t\Big), \qquad (11.12)$$
is regarded as the Feynman-Kac representation of the solution $(Y, Z, V)$ of the BSDE (11.11).
In fact, the simplest way to rigorously solve (11.6) in a function $u$ satisfying (11.2) is actually to go the other way round, namely solving upfront (11.11) in a triplet of processes $(Y_t, Z_t, V_t(x))$, and redoing the above computations in reverse order to establish that the function $u$ then defined via $u(t, X_t) = Y_t$ solves (11.6) and satisfies (11.2).
Note that the intrinsic (non-discounted) form (11.11) of the Feynman-Kac representation (11.2) is implicit, meaning that the right-hand side of (11.12) also depends on $Y$. In this case this is not a real issue however, as revealed by the equivalent explicit discounted representation (11.2). Now, the power of BSDEs precisely lies in the fact that this theory allows one to solve more general problems than the linear equations (11.6), (11.11), namely nonlinear problems in which the BSDE coefficient, $g(t, X_t, Y_t) := c(t, X_t) - r Y_t$ in the case of (11.6), (11.11), depends nonlinearly on $Y$, and possibly also on $Z$ and $V$. Let us thus consider the following BSDE to be solved in a triplet of processes $(Y_t, Z_t, V_t(x))$:
$$\begin{cases} Y_T = \phi(X_T) \text{ and for } t < T, \\ -dY_t = g(t, X_t, Y_t, Z_t, \widetilde{V}_t)\, dt - Z_t\, dW_t - \big(V_t(J_{(t)})\, dN_t - \lambda \widetilde{V}_t\, dt\big), \end{cases} \qquad (11.13)$$
where $\widetilde{V}_t := \int_{\mathbb{R}} V_t(y)\, \rho(t, X_t, y)\, \nu(dy)$ for a suitable (possibly vector-valued) integration kernel $\rho$.
Let now a function $u = u(t, x)$ solve the following semilinear PIDE:
$$\begin{cases} u(T, x) = \phi(x), & x \in \mathbb{R}^d \\ \partial_t u(t, x) + \mathcal{G}u(t, x) + g\big(t, x, u(t, x), \partial u(t, x)\, \sigma(t, x), \widetilde{\delta} u(t, x)\big) = 0, & t < T,\ x \in \mathbb{R}^d \end{cases} \qquad (11.14)$$
with $\widetilde{\delta} u(t, x) := \int_{\mathbb{R}} \delta u(t, x, y)\, \rho(t, x, y)\, \nu(dy)$.
Straightforward extensions of the computations having led from (11.6) to (11.11) show that the triplet $(Y, Z, V)$ given in terms of $u$ by formula (11.10) solves the nonlinear BSDE (11.13). For this reason formula (11.10) is known as a nonlinear Feynman-Kac formula.
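BSDEs of this kind can be solved numerically by backward induction. The sketch below (a choice of this note, not a scheme from the text) treats the linear special case $g(t, x, y) = c - r y$ with $X$ a standard Brownian motion approximated by a binomial tree and $\phi(x) = x$, for which the value $Y_0 = e^{-rT} x_0 + \frac{c}{r}(1 - e^{-rT})$ is known in closed form:

```python
import math

# Backward-induction sketch for the BSDE (11.13) with the linear driver g(y) = c - r*y,
# X = standard BM (binomial approximation with increments +/- sqrt(dt)), phi(x) = x.
c, r, T, x0, N = 1.0, 0.5, 1.0, 0.0, 200
dt = T / N
dx = math.sqrt(dt)

# terminal layer: nodes x0 + (2j - N)*dx, j = 0..N, with Y_T = phi(x) = x
Y = [x0 + (2 * j - N) * dx for j in range(N + 1)]
for i in range(N, 0, -1):                                  # step the BSDE backward in time
    Y = [0.5 * (Y[j] + Y[j + 1]) for j in range(i)]        # E[Y_{t+dt} | F_t] on the tree
    Y = [y + (c - r * y) * dt for y in Y]                  # explicit driver step
Y0 = Y[0]

closed_form = math.exp(-r * T) * x0 + (c / r) * (1 - math.exp(-r * T))
print(Y0, closed_form)   # agree up to O(dt) discretization error
```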
11.2.2 Optimal stopping

BSDEs allow one to deal not only with semilinearities referring to the possible nonlinear dependence of the coefficient $g$ with respect to $(Y, Z, V)$, but also with nonlinearities resulting from optimal stopping features which may be involved in a control problem.
For instance, one may consider instead of $u$ in (11.2), the function $v = v(t, x)$ such that
$$v(t, X_t) := \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{[t,T]}} E\Big(\int_t^\tau e^{-\int_t^s r(\zeta, X_\zeta)\, d\zeta}\, c(s, X_s)\, ds + e^{-\int_t^\tau r(s, X_s)\, ds}\, \phi(X_\tau) \,\Big|\, X_t\Big) \qquad (11.15)$$
or equivalently in implicit intrinsic form:
$$v(t, X_t) = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{[t,T]}} E\Big(\int_t^\tau \big(c(s, X_s) - r(s, X_s)\, v(s, X_s)\big)\, ds + \phi(X_\tau) \,\Big|\, X_t\Big). \qquad (11.16)$$
In these dynamic programming equations, $\mathcal{T}_{[t,T]}$ denotes the set of all $[t,T]$-valued $\mathcal{F}$-stopping times. The set $\mathcal{T}_{[t,T]}$ is uncountable, so that care is needed in taking the supremum over an uncountable family of random variables in the r.h.s. of (11.15), which is taken care of by the use of the essential supremum (ess sup). In particular, one has $v(t, x) \geq \phi(x)$, which arises from considering $\tau = t$ in (11.15), and of course $v(t, x) \geq u(t, x)$. In (11.15), as in (11.2), the conditioning with respect to $\mathcal{F}_t$ can be replaced by the conditioning with respect to $X_t$, by the Markov property of a jump-diffusion $X$.
Computations similar to the above ones allow one to establish the usual connection (11.10) (nonlinear Feynman-Kac formula) between:
• On one hand, the solution $(Y, Z, V, K)$ to the following reflected BSDE:
$$\begin{cases} Y_T = \phi(X_T) \text{ and for } t < T: \\ -dY_t = g(t, X_t, Y_t, Z_t, \widetilde{V}_t)\, dt + dK_t - Z_t\, dW_t - \big(V_t(J_{(t)})\, dN_t - \lambda \widetilde{V}_t\, dt\big), \\ Y_t \geq \phi(X_t) \ \text{ and } \ \big(Y_t - \phi(X_t)\big)\, dK_t = 0; \end{cases} \qquad (11.17)$$
Here $K$ represents a further adapted, continuous and non-decreasing process which is required for preventing the component $Y_t$ of the solution from falling below the barrier level $\phi(X_t)$; this non-decreasing process is only allowed to increase on the random set $\{Y_t = \phi(X_t)\}$, as imposed by the minimality condition in the third line;
• On the other hand, the following obstacle problem:
$$\begin{cases} v(T, x) = \phi(x), & x \in \mathbb{R}^d \\ \max\Big(\partial_t v(t, x) + \mathcal{G}v(t, x) + g\big(t, x, v(t, x), \partial v(t, x)\, \sigma(t, x), \widetilde{\delta} v(t, x)\big),\ \phi(x) - v(t, x)\Big) = 0, & t < T,\ x \in \mathbb{R}^d. \end{cases} \qquad (11.18)$$
The corresponding dynamic programming equation generalizing (11.16) reads as follows:
$$Y_t = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{[t,T]}} E\Big(\int_t^\tau g(s, X_s, Y_s, Z_s, \widetilde{V}_s)\, ds + \phi(X_\tau) \,\Big|\, \mathcal{F}_t\Big). \qquad (11.19)$$
The proof of these results involves some technicalities though, since, in particular, the value function $v$ of an optimal stopping problem is well known to be only of class $\mathcal{C}^{1,1}$ on the boundary $\{v = \phi\}$ of the continuation region $\{v > \phi\}$. This implies that $v$ being a solution to (11.18) (which involves, through the generator $\mathcal{G}$, the second derivatives in space of $v$) can only be understood in a weak sense. But again BSDEs are useful here, since there is a well-established theory connecting solutions of BSDEs to the so-called viscosity class of weak solutions to nonlinear PDEs or PIDEs.
Finally, it is important to have in mind that BSDEs are not only useful theoretically, as we showed, but also in practice, when the dimension $d$ of $X$ is such that the numerical solution of a nonlinear P(I)DE by a deterministic scheme is ruled out by Bellman's curse of dimensionality. For solving such high-dimensional nonlinear problems, BSDE-based simulation schemes are the only viable alternative.
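In low dimension, the reflected BSDE (11.17) / obstacle problem (11.18) can be solved by the same backward induction as before, with the obstacle enforced at each step. The sketch below (all data are illustrative choices of this note: $X$ a standard BM on a binomial tree, driver $g(y) = -ry$, payoff $\phi(x) = \max(K - x, 0)$) also computes the unreflected value and confirms the inequalities $v \geq u$ and $v \geq \phi$ noted above:

```python
import math

# Backward-induction sketch of the reflected BSDE / obstacle problem (11.17)-(11.18):
# same binomial scheme as for the linear BSDE, with reflection on the obstacle phi.
r, T, x0, K, N = 0.1, 1.0, 0.0, 0.2, 200
dt = T / N
dx = math.sqrt(dt)
phi = lambda x: max(K - x, 0.0)        # barrier / stopping payoff

xs = [x0 + (2 * j - N) * dx for j in range(N + 1)]
V = [phi(x) for x in xs]               # reflected value (optimal stopping)
U = list(V)                            # unreflected value, kept for comparison
for i in range(N, 0, -1):
    xs = [x0 + (2 * j - i + 1) * dx for j in range(i)]      # nodes of layer i-1
    V = [0.5 * (V[j] + V[j + 1]) * (1 - r * dt) for j in range(i)]  # driver g(y) = -r*y
    U = [0.5 * (U[j] + U[j + 1]) * (1 - r * dt) for j in range(i)]
    V = [max(v, phi(x)) for v, x in zip(V, xs)]             # reflection: keep V >= phi
v0, u0 = V[0], U[0]

print(v0, u0, phi(x0))   # v0 >= u0 and v0 >= phi(x0), as the theory predicts
```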