Lec

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 57

Lecture Notes STOCHASTIC DIFFERENTIAL EQUATIONS

Andreas Lpker, TU/e Eindhoven

PREFACE
These are the lecture notes of the Mastermath course Stochastic Differential Equations in the semester 2009/2010, held at Utrecht university. The lecture notes follow in large parts (namely the chapters 2 to 7) the exciting textbook "Stochastic calculus and nancial applications" by Michael Steele (see [L7]). The numbering of the sections was adopted from this book, so theorem and section numbering is not always continuous. Additionally, every subsection has its own number, indicated and referred to by double brackets . The notes start with a very brief introduction to probability and measure theory, before introducing martingales, Brownian motion and nally stochastic integrals.

In the Appendix A we give some additional information on topics that couldnt be discussed during the lecture. Appendices B and C consist of a collection of exercises with solutions that were handed out during the classes. The lecture notes close with a short list of literature for further study.

CONTENTS
1 Measure Theory and Probability 1.1 Measure spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Measurable functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3 Lebesgue integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4 L p spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 5 6 7 9

1.5 Product measure and Fubinis theorem . . . . . . . . . . . . . . . . . . . . . . . . 10 1.6 Probability spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 1.7 Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 1.8 Convergence of random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 1.9 Limit theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2 2 Discrete Time Martingales 16

2.1 Classic examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 Theorem 2.2 (Optional stopping theorem) . . . . . . . . . . . . . . . . . . . . . . . 17 2.4 Submartingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.5 Doobs Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 Theorem 2.4 (Doobs Maximal inequality) . . . . . . . . . . . . . . . . . . . . . . . 19 Theorem 2.5 (Doobs L p inequality) . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.6 Martingale convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 Theorem 2.6 (L 2 -bounded Martingale Convergence Theorem) . . . . . . . . . . 20 Theorem 2.8 (L 1 -bounded Martingale Convergence Theorem) . . . . . . . . . . 21 3 Brownian Motion 21

3.0 Multivariate Gaussians and Brownian motion . . . . . . . . . . . . . . . . . . . . 21 3.3 Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 3.4 Wavelet representation of Brownian motion . . . . . . . . . . . . . . . . . . . . . . 24 Theorem 3.1a (Wavelet representation) . . . . . . . . . . . . . . . . . . . . . . . . 24 Theorem 3.1b (Brownian motion representation) . . . . . . . . . . . . . . . . . . . 24 4 Continuous Time Martingales 26

4.2 Conditional Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 4.3 Uniform integrability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 4.4 Continuous Time Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 Theorem 4.1 (Doobs Optional Stopping Theorem) . . . . . . . . . . . . . . . . . . 29 Theorem 4.2 (Maximal inequality) . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 Theorem 4.3 (Martingale limit theorem) . . . . . . . . . . . . . . . . . . . . . . . . 29 4.5 Classic Brownian Motion Martingales . . . . . . . . . . . . . . . . . . . . . . . . . 29 6 The It integral 30

6.0 Bounded variation and Lebesgue-Stieltjes integral . . . . . . . . . . . . . . . . . 30 6.1 Denition of the It Integral for functions in H 2 . . . . . . . . . . . . . . . . . . . 31
2 Theorem 6.1 (It s Isometry on H 0 ) . . . . . . . . . . . . . . . . . . . . . . . . . . 32

Theorem 6.2 (It integral) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 6.4 An explicit Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 7 Localization and Its integral 7.1 It integral for functions in
2 L loc

36 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

7.2 Two special cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

Theorem 7.1 (Continuous functions of Brownian motion) . . . . . . . . . . . . . . 38 Proposition 7.6 (Gaussian integrals) . . . . . . . . . . . . . . . . . . . . . . . . . . 39 7.4 Local martingales and honest ones . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 Proposition 7.8 (Application of local martingales) . . . . . . . . . . . . . . . . . . 40 A More Explanations and Examples 42

(1) p.6 Nonmeasurable sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 (2) p.9 Riemann integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 (3) p.9 Two transformation rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 (4) p.13 The Borel-Cantelli Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 (5) p.27 Conditional expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 (6) p.27 Uniform integrability I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 (7) p.27 Uniform integrability II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 Theorem 6.5 (Approximation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 (8) p.34 Approximation of functions in H 2 (receipe) . . . . . . . . . . . . . . . . . . . 45 B Exercises 45

Measure theory and probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 Discrete Time Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Continuous time martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 It integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 C Solutions D Literature 49 57

MOTIVATION
Consider an ordinary differential equation d X t = (t, X t ), dt t 0, (1)

where we write the unknown function as X t instead of X (t), since we will consider random d processes later. For instance take (t, x) = t x, then dt X t = t X t , with solution X t = 2 X 0 exp( t /2) (see the orange curve in the gure below).
1.0

0.8

0.6

0.4

0.2

0.0 0 200 400 600 800 1000

If we want to add a random disturbance, depending both on t and X t , we take a random noise Wt , multiply it with some sufciently nice function and add it to (1). We end up with an stochastic differential equation (SDE): d X t = (t, X t ) + (t, X t )Wt . dt Equation (2) can also be written as d X t = (t, X t ) dt + (t, X t)Wt dt or
t t

(2)

X t = X0 +

(s, X s ) ds +

(s, X s )Ws ds.

If Ws is a nicely behaving process, then we may study this equation and nd for example conditions for the existence and uniqueness of solutions. Particularly interesting would be to use a white noise Wt , a stochastic process such that d dt B t = Wt ", where B t is the famous Brownian motion. Then we could write
t t

X t = X0 +

(s, X s ) ds +

(s, X s ) dB s

(3)

Clearly, if is continuous, the rst integral is well dened. Unfortunately the dubious white noise process Wt does not exist as a stochastic process (although it has a Wikipedia article...). Brownian motion is not differentiable and it is not clear, if the right integral in (3) (the stochastic integral) exists or has any meaning at all. A considerable part of this course will be devoted to answer the question, whether we can actually give (3) a reasonable meaning.

1 Measure Theory and Probability

MEASURE THEORY AND PROBABILITY 1.1 Measure spaces

1 Measure space ingredients If we want to measure we clearly need (1) objects to be measured: sets. (2) a measure that assigns to each set a real number. All sets to be measured are subsets of some basic set . The family of all sets that we can measure is denoted by F Usually F is smaller then the collection P () of all subsets of . This has some technical reasons that will be mentioned later. 2 -eld To be consistent with our intuition, F should have a certain structure. A -eld (or -eld) in is a family of subsets of , such that (1) F (2) A F A c F (where A c is the complement of A) (3) A 1 , A 2 , . . . F
i =1

Ai F

The sets in F are called measurable sets, (, F ) is a measurable space.

3 Generated -eld We can produce -elds from collections C of subsets of . Just take the intersection of all -elds that contain C (prove that this is indeed a -eld). This intersection is called the -eld generated by C (written (C )). 4 Example - Borel -eld An important example is the following. Let C be the collection of all open sets in [0, 1] (could be another topological space), then B ([0, 1]) = (C ) is called the Borel -eld, the natural candidate to choose as an -eld on [0, 1]. It turns out that if you take the family I of all intervals of [0, 1], then (I ) = B ([0, 1]).

1 Measure Theory and Probability

5 Measures Once we have dened our measurable objects, how can we actually measure them? A measure is a function : F [0, ], such that (1) ( ) = 0 (2) For disjoint sets A 1 , A 2 , . . . F we have

i =1

Ai =

(A i ).

i =1

The triple (, F , ) is called a measure space.

6 Null sets Sets A with (A) = 0 are called null sets. If a property holds for all except for in a null set, then we say the property holds almost everywhere (a.e.). 7 Completion Sometimes we stumble upon the situation that not all subsets of that are subsets of a null set are in F . It is then possible to extend F to a -eld F , such that any subset of a null set is again measurable (and has of course measure zero). This new -eld is called the completion of F . 8 Lebesgue measure Particularly interesting for practical purposes would be a measure on , such that the measure of an interval I is its length. There is indeed such a measure , dened for all sets A B (). If we complete B () to L () = B () we end up with the so called Lebesgue measurable sets and the measure can be extended to L (). 9 An disturbing fact There are actually subsets of that are not in B () and some are not even in L (). These sets can not be measured by . You can do strange things with nonmeasurable sets. If you are interested and brave enough check the notion "BanachTarski paradox" with some internet search engine.

(1)

1.2

Measurable functions

10 Measurable functions Like continuous functions are nice mappings between topological spaces, measurable functions are the natural functions to use with measurable spaces. A function f from a measurable space (1 , F1 ) to another measurable space (2 , F2 ) is measurable if f 1 (B) = { 1 : X () B} F1 , for every B F2 . We also say that f is measurable with respect to F1 .

1 Measure Theory and Probability 11 How can we decide whether a function is measurable?

How many functions are actually measurable and do our favourite functions from calculus belong to this class? It is sometimes very difcult to verify the above denition. It turns out that the class of measurable functions is in fact very high and it takes considerable amount of work to construct functions that do not belong to it. A function f : (, F ) (we omit the Borel -eld) is measurable iff for all y either { : f () y} F or { : f () > y} F or { : f () y} F or { : f () < y} F . You can perform the following operations on measurable functions If f 1 , f 2 , . . . : (, F ) are measurable, then the same holds for
| f 1 |, f 1 + f 2 , f 1 f 2 , f 1 f 2 ,

lim supn f n (!), lim infn f n (!), max{ f 1 , . . ., f n }, min{ f 1 , . . ., f n }.

supn f n , infn f n ,

1.3

Lebesgue integral

Once measures and measurable functions are designed, it is obvious to ask, whether integration can be dened in a suitable way. This is the case, and the denition below turns out to generate an integral that beats our old friend the Riemann integral in most regards. The standard way to construct the so called Lebesgue integral is to use a three step construction that denes the integral rst for a class of simple functions, then for nonnegative limits of them and nally for general measurable functions. 12 (I) Simple functions A real-valued measurable function f : (, F ) is called simple if there are sets A i F , i = 1, 2, . . ., n, such that
n

f (x) = where A i (x) = 1 if x A i and 0 else.

c i A i (x),

(1.1)

i =1

We dene its integral (in the obvious way) by


n

f (x) d (x) =

c i (A i ).
i =1

This integral does not depend on the choice of the sets A i in (1.1)

13 (II) Non-negative measurable functions Note that any nonnegative measurable functon f : (, F ) + is an increasing limit of nonnegative simple functions f 1 , f 2 , . . ..

1 Measure Theory and Probability

We dene f (x) d (x) = lim


n

f n (x) d (x)

One can prove that for any other sequence g 1 , g 2 , . . . with g n f the dened integral coincedes, i.e.
n

lim

g n (x) d (x) = lim

f n (x) d (x)

14 (III) Any measurable function Let f be some measurable function. If f (x) d (x) =
| f | d < then we call f integrable and we dene

f + (x) d (x)

f (x) d (x),
| f | d < iff

where f + (x) = max{ f (x), 0} and f (x) = max{ f (x), 0}. Note that and f d < . 15 Notation We write
A

f + d <

f (x) d (x) =

A (x) f (x) d (x)

if A F and

f (x) dx =

f (x) d (x), if the integrating measure is the Lebesgue-measure.

16 Theorem (Properties) The Lebesgue integral enjoys all the properties that we are used to see for the classical Riemann integral from calculus. (1) The integral is linear: (a f + b g) d = a f d + a g d . (2) If f g then f d
A

g d .
A

(3) If (A) = 0 then

f d = 0. If

f d = 0 and f 0 then (A) = 0.

If f : I , where I a closed interval, is Riemann integrable then it also Lebesgue integrable (w.r.t Lebesgue measure ) and the value of the integrals coincide. On the whole real line we have to be careful, e.g. the improper Riemann integral

sin(x) dx x
n

(dened as the limit of the proper integrals 0 sin(x)/x dx) exists and is nite. But since 0 | sin(x)/x| dx = the Lebesgue integral does not exist. However, in most cases the Lebesgue integral wins the contest. Especially function which are not very smooth can often be Lebesgue integrated but not Riemann integrated. A famous

1 Measure Theory and Probability example is


1 0

(x) dx, where the Riemann integral does not converge to a nite limit, but
(2)

the Lebesgue integral exists and is equal to zero. 17 Three convergence theorems The main advantage of Lebesgue integration is the fact that the integral behaves nicely with limits, as the next three cornerstones of integration theory show: 18 Theorem (Lemma of Fatou) If f 1 , f 2 , . . . 0 are measurable then lim inf
n

f n (x) d (x)

lim inf f n (x) d (x)


n

To remember Fatous lemma, think of the functions f n (x) = 1 + sin(2 x + n), for which the 1 limit inferior is f (x) = 0, but the integrals 0 f n (x) dx are equal to one (note that f n does not converge). 19 Theorem (Monotone convergence (Levi)) If f , f 1 , f 2 . . . are measurable with 0 f n f , then lim f n (x) d (x) = f (x) d (x).

20 Theorem (Dominated convergence (Lebesgue)) If f , g, f 1, f 2 , . . . are measurable functions with | f n | g and f n f and g is integrable then
n

lim

f n (x) d (x) =

f (x) d (x).

For some transformation rules see

(3)

1.4

L p spaces
p

21 Denition (L p space) A measurable function f : (, F ) is in L , p 1, if


| f (x)| p d (x) < . L is a complete normed vector space with norm || f || p = | f (x)| p d (x)
1/ p

We let L be the set of all f : (, F ) , such that

|| f || = inf{ y : (| f | > y) = 0} < .

1 Measure Theory and Probability If is a nite measure, i.e. () < then


L L p Lq L1

10

For f , f 1, f 2 . . . L p we say that f n converges to f in L p if || f n f || p 0.


p

for p > q (why isnt this true e.g. for the Lebesgue measure, which is certainly not nite?). 22 Special case p = 2 If p = 2 then L is a Hilbert space, i.e. a complete normed vector space with inner product. The inner product is simply dened by
f , g =

f (x)g(x) d (x).

23 Important Inequalities The following famous inequalities are heavily used in connection with L p spaces.

(1) Minkowski inequality:


|| f + g|| p || f || p + || g|| p

(2) Hlder inequality: 1/p + 1/q = 1, then


|| f g||1 || f || p || g|| q ,

(3) Special case (Cauchy-[Bunyakovsky]-Schwarz)


|| f g||1 || f ||2 || g||2

1.5

Product measure and Fubinis theorem

Given two measure spaces (1 , F1 , 1 ) and (1 , F1 , 1 ), there is a measure space (, F , ) such that = 1 2 and

(A B) = 1 (A)1 (B),

A F1 , B F2 .

This is the product space with the product measure = 1 2 (we dont say anything about F here, which is constructed from F1 and F2 in a certain way).

1 Measure Theory and Probability

11

24 Theorem (Fubini) If either f : (, F ) is integrable or then f (x, y) d (x, y) = f (x, y) d 1(x) d 2 (y) = f (x, y) d 1 (y) d 2 (x).
| f (x, y)| d 1 (x) d 2 (y) < and | f (x, y)| d 1 (y) d 2 (x) < ,

1.6

Probability spaces

25 Denition As we stated before, probability theory can be seen as a special discipline of measure theory with some important extra features (like independence). Here is the denition of a probability space. A probability space is a measure space (, F , ), where is a nite measure with () = 1. We interpret elements A F as events and P(A) as the probability that A happens. Measurable functions X : (, F ) (, B ()) are called random variables.

The denition of random variables makes sense, since measurability of X guarantees that sets of the form { : X () B, B B ()} are in F and hence -measurable. We shortly write

(X

has property A) = ({ : X () has property A }).

We say that a property holds with probability one (or almost surely) if

(property holds) = 1.
1.7
The expectation of X is dened as (X ) = X () dP() =

Expectation

x dF(x).

The properties of the expectation are of course inherited from the integral:

1 Measure Theory and Probability

12

26 Theorem (Properties of the integral) (1) (aX + bY ) = a (X ) + b (Y ), (X ) (Y ), ( A ) = (A),

(2) If X Y a.s. then (3)

Our three integration theorems now read: Fatou: If 0 X n then (lim inf X n ) lim inf (X n ). (X ). (Y ) < then lim (X n ) = (X ) < .

Monotone convergence: If 0 X n X then lim (X n ) =

Dominated convergence: If X n X almost surely and | X n | Y with

We can also dene the expectation of functions of X , (g(X )) = For example g(X ()) dP() =

g(x) dF(x). (X 2 ) exists, then

(X n ) is called the nth moment of X . If Var(X) =

(X2 ) (X)2 .

is the variance of X and (X ) =

Var(X) is the standard deviation of X .

The covariance of two random variables is dened as Cov(X, Y) = (XY) (X) (Y). (esX ), s 0. (e isX ), s .

The Laplace transform of X is the function s

The characteristic function of X is the function s

27 Distribution function The distribution function of X is given by F(x) = (X x). F is nondecreasing and right-continuous and it is possible to dene the Stieltjes integral (see the section on bounded variation in your favourite textbook on advanced calculus) (g(X )) = g(x) dF(x) = g(X ()) dP()).

X is a discrete random variable if there is a set C = { c 1, c 2, . . .} of elements in , such that (C) = 1. In that case F is a step function and (g(X )) = g(x) dF(x) =

i =1

c i (X = c i ).

1 Measure Theory and Probability


x

13

If F(x) = f (x) dx then F is called absolutely continous and f is the probability density function of X . In this case (g(X )) = g(x) dF(x) =

g(x) f (x) dx.

28 Independence Two events A, B are independent if (A B) = (A)(B). If C , D are two collections of events, then C and D are independent if (A B) = (A)(B) for all A C , B D. Two random variables X and Y are independent if H(x, y) = F(x)G(y) where H(x, y) = (X x, Y y) is the joint distribution function of X , Y and F and G are the distribution functions of X and Y respectively. 29 Theorem (Borel-Cantelli lemma) Let (A i ) i=1,2,... be a sequence of events and let B be the event that A i happens for innitely many i. Then (1)
i =1

(A i ) < implies (B) = 0.


i =1

(2) if the events are independent then

(A i ) = implies (B) = 1.
(4)

30 Inequalities Markovs inequality:

(X > )
Chebyshevs inequality:

(X ) ,

> 0.

(| X
1.8
31 Convergence

(X )| > )

Var(X) , 2

> 0.

Convergence of random variables

Let (X n )n be a sequence of random variables. We write X n X , if we want to say that X n converges to X as n tends to innity. Here is a formal denition of different types of convergence:

1 Measure Theory and Probability

14

Types of convergence: Sure (pointwise) convergence: X n () X () for all , almost sure convergence: X n () X () for all A, some A F with P(A) = 1, convergence in probability:

(| X n X | > ) 0 for all > 0, we write limp X n = X ,


(| X n X | p ) 0, we write limL X n = X ,
p

L p -convergence: X n , X L p and

convergence in distribution (weakly): x (X x) is continuous.


p>q

(X n x) (X

x) at all points x where

Lq in probability almost sure in distribution

32 Criterion for almost sure convergence X n converges to X almost surely iff for all >0

(sup | X k X n | ) 0,
k n

n .

1.9

Limit theorems

33 Strong law of large numbers If (X n )n=1,2,... is a sequence of i.i.d. random variables with (| X 1 |) < then
n i =1 X i

(X 1 )

almost surely as n .

1 Measure Theory and Probability

15

34 Central limit theorem If (X n )n=1,2,... is a sequence of i.i.d. random variables with = Var(X) < then
n i =1

X i n (X )
n

(x),

where is the normal distribution function with mean 0 and variance 1, i.e.

(x) =

1 2

eu

/2

du.

35 Conditional expectation We postpone the exact denition of conditional expectation to chapter 4. Let X 1 , X 2 , . . . be sequence of random variables (a discrete time stochastic process). Let (Y | X 1 , X 2 , . . ., X n ) denote the conditional expectation of Y , given X 1 , X 2 , . . .. We shortly write (|Fk ) instead of (| X 1 , . . ., X k ) and say that Y Fn if Y = f (X 1 , X 2 , . . ., X n ) for some measurable function f . Moreover we say A Fn if A Fn . We have the following rules: A) (Y |Fn ) = Y and (Y Z |Fn ) = Y (Z |Fn ) if Y Fn . (X 1 , X 2 , . . ., X n contains all information about Y ) (Y |Fn ) = (Y ) if Y is independent of X 1 , X 2 , . . ., X n . (X 1 , X 2 , . . ., X n contains no information about Y ) ( (Y |Fn )) = (Y ). This is the law of total probability.

B)

C)

(Compare with 61 )

2 Discrete Time Martingales

16

DISCRETE TIME MARTINGALES 2.1 Classic examples


0

10 10

15 15

20 20

20

40

60

80

100

120

140

200

400

600

800

1000

1200

Figure 1: Coin tossing. Sample path of the process M n .

36 Coin tossing Consider a coin-tossing game, where a player can wins a Euro if head is tossed and loses 1 Euro if not. Let X n denote the prot in the nth game (so X n {1, 1}) and let M n denote the players fortune after the nth coin toss. (3) Since this is a fair game, we have (1) M n is a function of X 1 , . . ., X n : M n = (X n ) =
n i =1 X i

(2)

(X n1 ). But more is true:

(| M n |)

n i =1

(| X i |) = n.

(M n |Fn1 ) = M n1 . (this will be shown in an exercise, see also the example below). 37 Martingales A sequence { M k }k of random variables is a martingale with respect to { X k }k if (1) M n Fn (2) (3) (| M n |) < , (M n |Fn1 ) = M n1 for all n 1.

Martingales have some very special properties as we will see. An obvious property is that martingales have a constant expectation (see exercise): (M n ) = (M1 ).

38 Example 1 Let { X k }k be a sequence of i.i.d. random variables (i.e. they are independent and have the same distribution) with (X 1 ) < and let S n = n=1 X i . Then i M n = S n n (X 1 ) is a martingale w.r.t. (X k ) (see exercise).

2 Discrete Time Martingales 39 Example 2 In the above situation, if is a martingale.

17

(X 1 ) = 0 and Var(X1 ) = 2 , then M n = S 2 n2 n

Proof. (1) Clearly M n is a function of X 1 , . . ., X n . (2) (3) We have (| M n |) < since (S 2 ) = n (


n i, j =1 X i X j ) = n i =1
2 (X i ) < .

(M n |Fn1 ) =

(S 2 n2 |Fn1 ) = n

(S 2 |Fn1 ) n2 . n

2 First (S 2 |Fn1 ) = (S 2 1 + 2S n1 X n + X n |Fn1 ). It is clear that, given M1 , . . ., M n1 , the n n conditional expectation of the rst term is (S 2 1 |Fn1 ) = S 2 1 . Moreover, using A), it n n follows from the independence of X n from X 1 , . . ., X n

(S n1 X n |Fn1 ) = S n1 (X n |Fn1 ). = S n1 (X n ) = 0 By the same reasoning we obtain


2 (X n |Fn1 ) = 2 (X n ) = 2 . Altogether

(M n |Fn1 ) = S 2 1 + 2 n2 = M n1 . n 40 Example 3 If { X k }k are non-negative and i.i.d. with is a martingale (see exercise) (X 1 ) = 1 then M n = X 1 X 2 X n

41 Denition 2.3 (Stopping times) A random variable with values in {0, 1, . . .} {} is a stopping time w.r.t. { X n }n if { n} Fn , 0 n Let M denote the value of the process { M n } at , or formally M =
k=0 M k = k .

42 Theorem 2.2 (Optional stopping theorem) Let { M n } be a martingale w.r.t. Fn . Let { M n } denote the stopped process, i.e. M n = Mk Mn , ,
n, = k n

Then { M n } is also a martingale w.r.t. Fn Proof. (1) M n Fn ? M n


=

n1

n 1 k =0

M k =k + n M n .
Fn1 Fn

Fn1

(2) We obviously have

(| M n |) < .

2 Discrete Time Martingales (3) For the third property we use the representation from above (M n |Fn1 ) =
=

18

(n1 (n1

n 1 k =0 n 1 k =0

M k =k + n M n |Fn1 ) M k =k |Fn1 ) + (n M n |Fn1 )

Using the martingale property of M n ,

= =

n1 <n1

n 1 k =0 n 2 k =0

M k =k + n M n1 M k =k + n1 M n1 = M(n1) .

43 Coin tossing revisited How long does it take until the players fortune is for the rst time either A or B, B < 0 < A. We dene
= min{ n : M n = X 1 + . . . + X n { A, B}}.

Then
{ n} = { X 1 { A, B} or X 2 { A, B} or ... or X n { A, B}} Fn ,

where is a stopping time w.r.t. { M n }. It follows that { M n } is also a martingale and hence (M n ) = (M1 ) = 0.

Since ( < ) = 1 (see exercise) it follows that M n M as n almost surely (actually M n = M for large enough n) and | M n | max{ A, B}, so by dominated convergence (M ) = 0. But (M ) = A (M = A) B(M = B) and hence A (M = A) = B(M = B) = B(1 (M = A)), leading to the nice formulas

(M = A) =

B , A +B

(M = B) =

A . A +B

(2.1)

2 Discrete Time Martingales

19

2.4

Submartingales

44 Submartingales A sequence { M k }k of random variables is a submartingale with respect to { X k }k if (1) M n Fn (2) (3) (| M n |) < , (M n |Fn1 ) M n1 for all n 1.

submartingale. If : is a convex function and M n is a martingale then (M n ) is a submartingale, 2 provided that (|(M n )|) < (see exercise 15). For example M n or | M n | are submartingales, if the integrability condition is fullled.

2.5

Doobs Inequalities

We present two important inequalities without proof. Note rst that if { M n } is a martingale and is a convex function then (M n ) is a submartinp gale if (|(M n )|) < for all n 0. For example M n and | M n | are submartingales. 45 Theorem 2.4 (Doobs Maximal inequality) Let { M n } be a non-negative submartingale and dene the maximal sequence
M n = sup M m .

0 m n

Then for > 0


(M n )
(M n Mn > )

(M n ).

Moreover for any p 1


p (M n )

(M n ).

46 Theorem 2.5 (Doobs L p inequality) Let { M n } be a non-negative submartingale. Then for p > 1 and n 0
|| M n || p

p || M n || p . p1

2 Discrete Time Martingales

20

2.6

Martingale convergence

Together with optional stopping the convergence property is what makes martingales so valuable. Under quite loose conditions it is possible to show that M n M as n for some random variable M . We rst show a convergence result that holds for (uniformly) L 2 -bounded martingales. 47 Theorem 2.6 (L 2 -bounded Martingale Convergence Theorem) If { M n } is a mar2 2 tingale and (M n ) B < then there is a random variable M L 2 with (M ) B < 2 and M n M almost surely and in L , i.e.
|| M n M ||2 0,

n . (X2 ) (|X|))2 0).

If X is L 2 -bounded then X is also L 1 -bounded (follows from Var(|X|) = The opposite is not true.

Proof. Without loss of generality we let M1 = 0. Otherwise take M k = M k M1 . We are going to show that

(sup | Mk Mm | ) 0
k m

as m . Then M k converges almost surely to a limit M and lemma:


2 ( lim M n ) lim

2 (M ) B by Fatous

2 (M n ) B.

Let d k = M k M k1 , then, if j < k (d k d j |Fk1 ) = So (d k d j ) = 0 for j = k and


2 (M n ) =

M j M k1 M k1 M j M k1 M j 1 + M j 1 M k1 = 0.

(M k M j M k1 M j M k M j 1 + M j 1 M k1 |Fk1 )

(
k =1

dk ) =

n k =1

(d 2 ). k

Since

2 (M n ) B for all n, the sum on the right converges to

k =1

(d 2 ) B k

and it follows that

(d 2 ) 0 as k . The process M n = (M k+m M m )2 is a submartingale k

3 Brownian Motion (exercise) and using Doobs inequality

21

(sup | Mm+k Mm | )
k 0

= =

(sup Mn 2 )
2 k+ m i = m +1 d i

(M n )
2

((M k+m M m )2 ) 2 di2

k+ m i = m +1 2

i = m +1 2

di2

which tends to zero as m . Moreover, M M n as n . Without proof we state the following important generalization of the last theorem for L 1 bounded martingales. 48 Theorem 2.8 (L 1 -bounded Martingale Convergence Theorem) If { M n } is a martingale and (| M n |) B < then there is a random variable M L 1 with (| M |) B < and M n M almost surely. It is not clear, whether under the stated conditions in fact
| M n M | 0,
2

dk

k = n +1

k = n +1

(d 2 ) 0 k

BROWNIAN MOTION

3.0 Multivariate Gaussians and Brownian motion


49 Denition (Multivariate Gaussians) A d-dimensional random vector V has a multivariate Gaussian distribution with mean vector = (1 , . . ., d ) and covariance matrix Var(V ) Cov(V , V ) ... Cov(V , V )

1 Cov(V1 , V2 ) . . . Cov(Vd , V2 )

1 2 Var(V2 ) . . . Cov(Vd , V2 )

...

. . . ...

1 d Cov(V2 , Vd ) . . . Var(Vd )

if the density of V is given by f (x) = f (x1 , x2 , . . ., xd ) = 1 exp (x )T 1 (x ) . 2 (2)d det 1

If V has a d-dimensional multivariate Gaussian distribution then each Vi , i = 1, 2, . . ., d has a normal distribution with mean i and standard deviation i .

3 Brownian Motion

22

In general the Vi are not independent. However, if V1 , V2 , . . ., Vd are uncorrelated then is a diagonal matrix and f (x) = 1 (2)d
d i =1 i

exp

d 1 d (x i i )2 /1 = 2 i =1 i =1

1 2 i

exp

1 (x i i )2 , 2 i

Hence V1 , . . ., Vd are independent normal random variables! Characteristic function of Gaussian r.v.: If V has a d-dimensional multivariate Gaussian distribution then (e i In particular if d = 1 (e iV ) = e i
2 2 2

) = e i

1 2 T

50 Denition 3.1 (Brownian motion) A standard Brownian motion on [0, T] is a stochastic process {B t }0 t<T , t [0, T], such that (1) B0 = 0 (2) Increments are independent, i.e. B t2 B t1 , . . ., B t n B t n are independent for any choice of 0 t 1 t 2 . . . t n < T. (3) B t B s has a normal distribution with mean 0 and variance t s for 0 s t < T. (4) t B t is a continuous function a.s.

3.3
51 The functions Let
t (t) = 1 t 0

Wavelets

, , ,

1 0t 2 1 2

otherwise

t1

(3.1)

and dene for n = 2 j + k, j 0, 0 k < 2 j ,

n (t) = 2 j /2 (2 j t k)

3 Brownian Motion

23

0.5

0.4

0.3

0.2

0.1

0.2

0.2

0.4

0.6

0.8

1.0

1.2

Figure 2: The functions 1 ,... , 63 .

52 Theorem (Parsevals identity) We have st =

n (t)n (s).

n =0

Now let Z0 , Z1 , . . . be a sequence of i.i.d. normal random variables with mean 0 and variance 1 and let for t [0, 1]
m

X t,m =

n (t)Z n
n =0

1.0 1.0 0.8 0.8 0.6 0.6

0.4

0.4

0.2

0.2

0.2

0.4

0.6

0.8

1.0 0.2

0.2

0.4

0.6

0.8

1.0

0.2 0.4

Figure 3: Approximation X t,m for m = 1, m = 3, m = 7 (left) and m = 7, m = 255 (right).

You can see in the above gures how the processes X t,m become more and more fractal as n increases. The question is, whether X t.m actually converges as n (and then, in which sense).

3 Brownian Motion

24

3.4

Wavelet representation of Brownian motion

53 Theorem 3.1a (Wavelet representation) The sum X t,m converges almost surely uniformly for t [0, 1] as m . We denote the limit by Xt =

n (t)Z n .

n =0

Proof. First note that

n= M

n (t)| Z n |

log(n)n (t),

(3.2)

n= M

where S = supi M

|Zi | . log( i)

We have for M > 2 J log(n)n (t)



2 j 1

n= M

j = J k =0

log(2 j + k)2 j +k (t),


2 j 1

j= J j= J

j +1

k =0

2 j +k (t),

j + 12 j /2 .

Since the rightmost sum in (3.2) tends to 0 uniformly as J , it follows that we only have to show that S is a.s. nite. We have

(| Z n |

2 log(n))
u=

= =
2 log w

2 2 2

eu 1

/2

du

2 log(n)

1 dw log(w) w2

n .

For all > 1 we have

n =1

(| Z n | (

2 log(n)) < and by Borel-Cantelli


|Zn |

log(n)

2 i.o.) = 0.

Consequently S is nite with probability one. 54 Theorem 3.1b (Brownian motion representation) The process X t = is a standard Brownian motion.
n=0 n (t)Z n

3 Brownian Motion

25

Proof. Clearly X 0 = 0 and X t is continuous, because it is the uniform limit of continuous functions. It remains to show that X t has independent increments and that the increments X t X s , t > s have a normal distribution with mean 0 and variance t s. We calculate the nite dimensional distributions

(X t1 x1 , . . ., X t n xn ).
Let 0 t 1 t n < T. We calculate the joint characteristic function
m m

(exp i
j =1

j Xtj )

= = =

(exp i

j
j =1

n (t j )Z n ) j n (t j ) )

n =0 m j =1 m

(exp iZ n

n =0

1 exp ( j n (t j ))2 2 j =1 n =0

= exp = exp

1 m m j i (t i t j ) , 2 i =1 j =1

1 m m j i n (t j )n (t i ) 2 i =1 j =1 n =0

which is the characteristic function of a multivariate Gaussian with mean 0 and covariance matrix = (t i t j ). We still have to show the independence of the increments. For i < j ((X t i X t i1 )(X t j X t j1 )) =
=

t i t i t i1 + t i1 = 0,

(X t i X t j ) (X t i X t j1 ) (X t i1 X t j ) + (X t i1 X t j1 )

so X t i X t i1 and X t j X t j1 are uncorrelated and hence (having a two-dimensional Gaussian distribution) independent. 55 Brownian motion on [0, ) We have dened a standard Brownian motion on [0, 1]. To obtain a standard Brownian motion on [0, ) we simply glue together Brownian motions on [n, n + 1). We write for the resulting process B t . To be more precise, for t [n, n + 1) we let
n

Bt =

k =1

( (k X 1k) + X t+1) , n

For other constructions using the i in particular the construction of a Brownian bridge see exercise 17.

( where { X tk) }k=1,2,... are independent copies of our process X t .

4 Continuous Time Martingales

26

4 CONTINUOUS TIME MARTINGALES 4.2 Conditional Expectation

Suppose that we know for any event A G whether A or not (A happened or not). Then there are still events in F \ G for which we dont know this information. In a sense G contains a certain amount of information and any larger -eld holds more information. The smallest -eld is { , }, giving us no information whether A for sets A = F . 57 Example (compare with Exercise 1). Consider = [0, 1] with F = B () and the Lebesgue-measure. Let X be a random variable dened by 1 , [0, 1/3] X () = 2 , (1/3, 2/3] . 3 , (2/3, 1] 3 0 , ,
[0, 1/3]

56 -elds and information Given a probability space (, F , ) let G be a be sub-elds of F (i.e. A G A F ).

2 6 The expectation of X is given by (X ) = 1+3+3 = 3 = 2. The smallest -eld such that X is measurable is (X ) = ({[0, 1/3], (1/3, 2/3]}). Now consider a second random variable Z

Z() =

(1/3, 1]

Given Z = 3 the conditional expectation (X | Z = 3) is obviously 1. Similarly 5/2, so we could dene a random variable Y () = 1 ,
[0, 1/3]

The smallest -eld such that Z is measurable is G = (Z) = ({[0, 1/3]).

(X | Z = 0) =

5/2 ,

(1/3, 1]

58 Conditional Expectation We observe the following two properties of Y : Y is G -measurable. (Y A ) = (X A ), A G .

59 Denition 4.1 (Conditional Expectation) Let X be an integrable random variable in the probability space (, F , ). If G is a sub--eld of F and Y is a random variable such that Y is G -measurable. (Y A ) = (X A ), for all A G . (X |G ).

Then Y is called a conditional expectation of X w.r.t. G . We write Y =

4 Continuous Time Martingales

27

60 Uniquenes and existence (X | Z) is not unique. If Y = (X | Z) a.s. then Y is also a conditional expectation, given Z. We say that Y and Y are versions of the conditional expectation. We dont prove that the conditional expectation (X | Z) actually exists (see ). (5)

61 Rules for conditional expectation (see Exercise 18) A) (X |G ) = X and (X Z |G ) = X (Z |G ) if X is G -measurable. (G contains all information about X ) (X |G ) = (X ) if (X ) is independent of G . (G contains no information about X ) ( (X |G )|H ) = (X |H ). In particular ( (X |G )) = (X ).

B)

C) If H G then

4.3

Uniform integrability

62 Denition 4.2 (Uniform Integrability) A collection C of random variables is uniformly integrable if sup (| Z || Z |> x ) 0
Z C

as x . (6) 63 Lemma 4.4 (Conditions for uniform integrability) If for all Z C ((| Z |)) B < for some constant B and some function with (x)/x as x then C is uniformly integrable. (7) 64 Why uniform integrability? 65 Lemma 4.1 (Uniform integrability and L 1 -convergence) If { Z n }n is a uniformly integrable sequence with Z n Z almost surely, then X n X also in L 1 (so Z L 1 ).
p>1

p
Uniform Integrability

L1 in probability in distribution almost sure

4 Continuous Time Martingales Proof. By Fatous lemma sup (| Z n || Zn |> x ) lim inf (| Z n || Zn |> x )
n 1 n

28

(| Z || Z |> x )

and thus, writing (x) = supn1 (| Z n || Zn |> x ), (| Z |) = and hence Z L 1 . Next


| Z n Z | = | Z n Z || Zn | x + | Z n Z || Zn |> x | Z n Z || Zn | x + | Z || Zn |> x + | Z n || Zn > x| .

(| Z || Z |> x ) + (| Z || Z |> x ) (x) + x.

Clearly | Z n Z || Zn | x x + | Z | L 1 so by dominated convergence (| Z n Z || Zn | x ) 0 as n . For the second term we obtain | Z || Zn |> x | Z | L 1 and hence (| Z || Zn |> x ) (| Z || Z |> x ) (x).

Finally (| Z n || Zn > x| ) (x). It follows that lim supn (| Z n Z |) 2 (x), but (x) can be made arbitrarily small if x is large enough.

4.4

Continuous Time Martingales

If for any t 0, X t is F t -measurable, then we say that the process { X t } t0 is adapted to the ltration F t . 67 Denition (Martingales) A adapted process { X t } t0 is a martingale if (1) (2) (| X t |) < , (X t |Fs ) = X s for all 0 s t < . (X t |Fs ) X s .

66 Filtration Let {F t } t0 be a collection -elds, such that Fs F t for s t, then {F t } t0 is called a ltration .

{ X t } t0 is a submartingale if instead of (2)

68 Example (Poisson process) Let { N t } t0 be a Poisson process and dene X t = N t t. We show that X t is a martingale (w.r.t. the ltration (Ns : 0 s t)). Clearly (| X t |) < and X t is adapted. Moreover (X t |Fs ) = (see also Exercise 19). (N t Ns + Ns |Fs ) t = (N t Ns ) + Ns t

= (t s) + Ns t = Ns s = X s .

4 Continuous Time Martingales

29

69 Stopping times A stopping time is a random variable with values in [0, ) {} such that
{ t } F t , t 0.

70 Standard Brownian ltration 71 Denition (Standard Brownian ltration) The standard Brownian ltration on [0, T] is given by the smallest -eld that contains (B s : s t) and the collection of subsets of null sets in (B s : s T). We extend so that it assigns measure 0 to all subsets of null-sets. The standard Brownian ltration fulls the so called standard conditions: 72 Denition (Usual conditions) A ltration F t fulls the usual conditions if (1) F0 contains all subsets of null sets in F . (2) F t is right-continuous, i.e. F t = F t+ :=
s: s> t F s .

73 Some theorems The following results are stated without proof. They are the continuous time counterparts of theorems we already saw for the discrete time martingales. 74 Theorem 4.1 (Doobs Optional Stopping Theorem) If { M t } t0 is a continuous martingale w.r.t. a ltration F t that satises the usual conditions and if is a stopping time w.r.t. F t then M t is also a continuous martingale w.r.t. F t . 75 Theorem 4.2 (Maximal inequality) Let M = sup0 tT M t . If { M t } t0 is a continuous t nonnegative submartingale and > 0 then
p (M > ) t

(MT ).

If MT L p for some p > 1 then || M || p t

p p1 || M T || p .

76 Theorem 4.3 (Martingale limit theorem) If a continuous martingale { M t } t0 satises (| M t | p ) B for some p 1 and all t 0, then there is a random variable M with (| M | p ) B such that M t M almost surely. If p > 1 then || M t M || p 0.

4.5
See exercise 20.

Classic Brownian Motion Martingales

6 The It integral

30

6 6.0

THE IT INTEGRAL

Bounded variation and Lebesgue-Stieltjes integral

77 Bounded variation A real valued function G : [0, T] is of bounded variation, if


n

sup
i =1

|G(t i ) G(t i1 )| < ,

where the supremum is taken over all partitions of the interval 0 = t 0 < t 1 < . . . < t n = T. A function of bounded variation is the difference of two nonnegative monotonically increasing functions: G(x) = G 1 (x) G 2 (x). 78 Lebesgue-Stieltjes integral It is possible to dene measures 1 , 2 on B ([0, T]) in a way such that
i ([x, y]) = G i (y) G i (x),

i = 1, 2,

The Lebesgue-Stieltjes integral of a bounded function f : [0, T] w.r.t. G is dened as the difference of two well dened Lebesgue-integrals:
T
0

for intervals [x, y] [0, T].

f (x) dG(x) =

f (x) d 1(x)

f (x) d 2(x).

If f is continuous then
T
0

f (x) dG(x) = lim

0 i=1

f (t ) G(t i ) G(t i1 ) < , i

with t [t i1 , t i ] and = max{ t i t i1 : i = 1, 2, . . ., n}. i If G(0) = 0 then one can show by using integration by parts that
t
0

G(u) dG(u) =

1 G(t)2 . 2

We will see that this no longer holds if we replace the Lebesgue-Stieltjes integral by a It integral and the function G(t) by Brownian motion B t .

6 The It integral

31

6.1

Denition of the It Integral for functions in H 2

79 It integral not a Lebesgue-Stieltjes integral We want to dene a stochastic integral


t
0

f (, s) dB s ,

t [0, T],

where B t is a Brownian motion and f (, t) is some (random) function. If we would interpret the integral as a Lebesgue-Stieltjes integral we would need B t to be of bounded variation. But B t is almost surely nowhere of bounded variation. We are going to dene the stochastic integral as a L 2 -limit of integrals of simple functions. 80 Good integrands We also need good integrands f (, t) to be able to dene a suitable stochastic integral. 81 Denition (The class H 2 ) A function f : [0, T] is in H 2 if (1) f is measurable: f is FT B ([0, T])-measurable, i.e. f 1 (C) FT B ([0, T]),
C B (),

where FT B ([0, T]) is the smallest -eld containg all sets of the form A B, A FT and B B ([0, T]). (2) f is adapted: f (, t) is F t -measurable for all t [0, T] (3) Note that
|| f ||L 2 (d t) =
T 0

f 2 (, t) dt < , i.e. f L 2 (d dt).

f 2 (, t) d(() dt)

1/2

f 2 (, t) dt

1/2

The construction of a stochastic integral for f H 2 follows ideas that are similar to the construction of the Lebesgue integral. Starting point are simple functions, for which a integral is easily dened. 82 Simple functions Let 0 = t 0 < t 1 < . . . < t n = T. Simple functions in H 2 are functions f H 2 such that f (, t) = with a i being F t i -measurable and 2 by H 0 .
n 1 i =0

a i () t i < t t i+1 ,

(a2 ) < . The class of simple functions in H 2 is denoted i

6 The It integral

32

Note the measurability condition on a i . Knowledge of {B s : s [0, t]} allows us to nd a i (w) for t i t. Hence f (, t) is F t i -measurable and a member of H 2 .
2 83 It integral for simple functions For f H 0 we dene

I( f )() =

n 1 i =0

a i () B t i+1 B t i

2 2 Suppose that f n H 0 and f n f H 2 . If the mapping I : H 0 L 2 (d ) would be continuous then we could dene I( f ) by the limit of I( f n ) as f n f . 2 2 84 Theorem 6.1 (It s Isometry on H 0 ) For f H 0 we have

|| I( f )||2 = || f ||L 2 (d t) .
2 Hence I : H 0 L 2 (d ) is an isometry and in particular continuous.

Proof. The proof is straightforward:


|| I( f )||2 = =

(I(F)2 ) =
n 1 i =0

n 1 n 1 i =0 j =0

a i ()a j () B t i+1 B t i B t j+1 B j i


2

(a2 ) i

B t i +1 B t i

n 1 i =0

(a2 ) t i+1 t i , i

where we used that a i F t i and hence independent from B t i+1 B t i . Similarly


|| f ||L 2 (d t) =
T
0

f 2 (, t) dt =

T
0

n 1 i =0

a2 t i < t t i+1 dt = i

n 1 i =0

(a2 ) t i+1 t i . i

As the next result shows, we are lucky: any f H 2 is actually a limit of simple functions.
2 85 Approximation For any function f H 2 there is a sequence of functions in H 0 such that

|| f n f ||L 2 (d t) 0,

n .

Almost nothing can stop us now to dene I( f ) as the limit of the I( f n ).

6 The It integral

33

86 It integral in H 2 The limit I( f ) = L 2 lim I( f n )


n

exists and denes the It integral for f H 2 (for any other sequence f n with f n f the limit is the same). Moreover, the It isometry holds also in H 2 , i.e || I( f )||2 = || f ||L 2 (d t) .
2 Sketch of proof: We know that there is a sequence f n in H 0 such that

|| f n f ||L 2 (d t) 0.

So f n is a Cauchy-sequence in L 2 (d dt) and hence I( f n ) is a Cauchy sequence in L 2 , because of the It isometry. L 2 is complete and so I( f n ) I( f )
2 in L 2 . Given another approximating sequence f n H 0 with || f n f ||L 2 (d t) 0 then

|| f n f n ||L 2 (d t) || f n f ||L 2 (d t) + || f n f ||L 2 (d t) 0

too, and again by the It isometry || I( f n ) I( f n )||2 0.

Moreover || f n f ||L 2 (d t) || f n ||L 2 (d t) || f ||L 2 (d t) , so


|| f n ||L 2 (d t) || f ||L 2 (d t) .

Similarly || I( f n )||2 || I( f )||2 , so || I( f )||2 = || f ||L 2 (d t) since || f n ||L 2 (d t) = || I( f n )||2 . 87 Integral symbol From now on we use the familiar integral symbol for the It integral for functions in H 2 ,
T
0

f (, t) dB t := I( f ).
t 0

88 It integral as a stochastic process How can we dene a process Clearly, for each t [0, T]
T

f (, s) dB s?

Jt =

[0,t](s) f (, s) dB s

is well-dened as a member of L 2 , but we can modify each Jt on some null-set A t and obtain still a version of the integral. But we dont know whether ( t[0,T ] A t ) = 0 so that the process { Jt } t[0,T ] could be ambigous on a set with positive probability. 89 Denition (Versions) Two stochastic processes X t and X t on t [0, T] are called versions of each other if X t = X t a.s. for each t [0, T].

6 The It integral

34

Among the versions of Jt there is one that is not only continuous, but is much more: 90 Theorem 6.2 (It integral) The process Jt has a version which is a continuous martingale w.r.t. standard Brownian ltration. For this version we write
t
0

f (, s) dB s,

t [0, T].

Hence stochastic integration is a martingale producing machine. Moreover we gain even more integrals if we utilize the Itisometry. It follows from Its isometry on H 2 that
t 0

f (, s) dB s

t 0

f 2 (, s) ds is a martingale. (8)

91 Preliminary summary We dened the It integral for functions f H 2 , i.e. functions f : [0, T] for which (1) f is measurable, (2) f is adapted, (3)
t T 0

f 2 (, t) dt < .

Then 0 f (, s) dB S is dened to be the continuous martingale version of the limit L 2 -limit T 2 of the integrals 0 f n (, s) dB S , where f n (, s) H 0 are simple functions such that f n (, s) s< t f (, s) in L 2 (d dt). 92 Properties of the It integral Let also g H 2 and let a, b . Then
t 0 (a f

+ b g) dB s = a

t 0

f dB s + b

t 0

g dB s (
t 2 0 f ds)

(
t 0

t 0

f dB s ) = 0 and Var(
t 0

t 0 f dBs ) =

f dB s is a continuous martingale. f dB s )2
t 0

The process (

f 2 ds is a martingale.

6 The It integral

35

6.4

An explicit Calculation
t 0 B s dB s .

We try to evaluate the stochastic integral 2 functions in H 0 . We chose f n (, t) =


i where t i = n T.

We have to approximate f (, t) = B t by

n 1 i =0

B t i t i < t t i +1 ,

k=2

0=t0

t1

t2

t3

t4

t5

0=t0

t1

t2

tk+1
t

tk+2

t5

2 Clearly f n H 0 and

|| f n f ||2 2 (d t) L

= =
n 1

Bt
t i +1

n 1 i =0

B t i t i < t t i +1

dt =

T n 1
0

i =0 t i

(t t i ) dt =

1 1 n 1 (t i+1 t i )2 = 2 i =0 2

i =0 n 1 T 2 i =0

Bt Bti
=

t i < t t i+1 dt

n2

T2 . 2n

As n this tends to zero, so f n f in L 2 (dP dt). Let k = max{ i : t i+1 t}, so that (k + 1)T (k + 2)T = t k +1 t < t k +2 = . n n Then
t
0

L2

L2

k i =0

B s dB s

B s dB s = lim

n 0

f n (, s) dB s = lim

B t i (B t i+1 B t i ) + B t k+1 (B t B k+1 ).

For the last term B t k+1 (B t B k+1 )


2

= t k+1 (t t k+1) tT/n.

Moreover, since 2a(b a) = b2 a2 (a b)2 B t i (B t i+1 B t i ) = 1 1 2 B t i+1 B2i B t i+1 B t i t 2 2


2

7 Localization and Its integral and hence


k i =0

36

B t i (B t i+1 B t i ) =
=

1 2

k i =0

B2i+1 B2i B t i+1 B t i t t

1 k 1 2 2 B t i +1 B t i . B k +1 2 2 i =0 1 2 1 L B lim Yk 2 T 2 n
2

It follows that
t
0

B s dB s
2

where we let Yk = 1 2

k i =0

B t i+1 B t i . Then
k

(Yn ) =

i =0

(t i+1 t i ) = t k+1 =

(k + 1)T , n (X 4 ) = 3Var(X)2 for a normal random T T2 2t , n n2

so (|Yn t|) T k/n 0 as n . Moreover, since variable X with mean 0,


k

Var(Yn ) =

i=0

Var((Bti+1 Bti )2 ) = 2

i=0

(ti+1 ti )2 = 2k

since t > k T . So Var(Yn ) 0 as n and n We proved 93 Our rst It integral We have


t
0

(Yn ) t, so Yn t in L 2 .

1 1 B s dB s = B2 t. 2 t 2

LOCALIZATION AND ITS INTEGRAL 7.1


2 It integral for functions in L loc

94 Problem We have dened the It integral for functions in H 2 . We could be happy with this, but the condition (3)
T 0

for the integrand is a considerable restriction: E.g. if f (, t) = exp(B a ), then t (exp(2B a )) = for a > 2 (can you show this?). t

f 2 (, t) dt <

( f (, t)2 ) =

95 Solution Remember from your probability course that for a random variable X it could happen that (X ) = even though X is nite with probability one (e.g. take a uniform random variable U in [0, 1] and let X = 1/U).

7 Localization and Its integral The idea is to replace (3) by the weaker (3 )
T 0

37

f 2 (, s), ds < almost surely,

2 2 and call the extended class of such functions L loc . Clearly H 2 L loc .

96 Denition 7.1 (Localizing sequence) An increasing sequence n of stopping times is 2 called a H 2 localizing sequence for f L loc , if (1) f n (, t) = f (, t) t<n is in H 2 , (2) n = T for some n a.s. Guess why f n (, t) should be a member of H 2 . The reason is of course that we want to integrate f n and we already know how to form integrals for functions in H 2 . 97 Denition 7.2 (Local martingale) An adapted stochastic process { M t } is a local martingale if there is a sequence n of stopping times, such that (1) n as n almost surely, (2) M (k) := M tk M0 is a martingale. t Every martingale is a local martingale, just let n = n.
2 98 Summary: Construction of the It integral for f L loc Let n be a localizing sequence and dene

X t,n =

s<n f (, s) dB s.

(i.e. we chose the continuous martingale dened in the previous section). Then (1) X t,n X t almost surely as n for some continuous process { X t } t[0,T ] . (2) X t is independent of the choice of the localizing sequence. (3) There is a version of X t which is a local martingale, we write 0 f (, s) dB s for this s version. One can take the localizing sequence n = T inf{ s : 0 f 2 (, t) dt n}.
2 (4) Let f , g L loc and be a stopping time. If f (s) = g(s) for s < then t 0 g(, s) dB s with probability one on { : t }.

t 0

f (, s) dB s =

99 notation Of course, we write 2 integral for functions in L loc .

t 0

f (, s) dB s for the local martingale version of the It

7 Localization and Its integral

38

7.2

Two special cases

100 Continuous functions of Brownian motion We have dened the It integral T t 2 2 0 g(, s) dB s for functions in L loc , i.e. functions which are measurable, adapted and 0 g(, s) ds < is a.s. nite. How can we decide measurability? 101 Proposition (Measurability) If f (, t) is adapted and t f (, t) is a.s. right- (or left-) continuous in t then f is measurable. For example if we let g(, t) = f (B t ()) where B t is a Brownian motion and f is continuous, 2 then clearly g is adapted and continuous in t and hence measurable. It follows that g L loc . Since B t is continuous a.s., f (B t )2 is continuous and bounded on [0, T], so
T
0

f (B s )2 ds <

almost surely. We may wonder how to calculate the stochastic integral of f w.r.t B t . 102 Theorem 7.1 (Continuous functions of Brownian motion) If f : is continuous then (letting t i = iT/n)
T
0 p

n i =1

f (B s ) dB s = lim

f (B t i1 ) B t i B t i1 ,

where limp denotes limit in probability. The proof is a demonstration on how localization can be used. But before, there is one question to be answered. Why is the limit on the right a limit in probability, not a stronger limit almost surely?
2 The construction of Its integral for f L loc promised a almost sure limit of integrals of functions in H 2 . But here we ask for a limit of integrals of simple functions that are not in H 2 , so we lose the L 2 convergence.

Proof. We show in an exercise that


M = T min{ t : |B t | M }

Main idea: there is a continuous function f M (x) with compact support (i.e. supp( f ) = closure{ x : f (x) = 0} is bounded, in other words: f M (x) vanishes for large | X |) and f (x) = f M (x),
n

2 is a localizing sequence for f (B s ()) L loc [0, T].

| x| M.

Then f M H 2 and we can construct the It integral of f M (B s ) by using the simple functions
n (, s) =
i =1

f M (B t i1 ) t i1 <s t i .

7 Localization and Its integral One can show that n is an approximating sequence to f M . Then
t
0
L2

39

L2

n i =1

f M (B s ) dB s

n 0

lim

n (, s) dB s = lim

f M (B t i1 ) B t i B t i1 .

Let
n T

A n () = : Then

i =1

f (B t i1 ) B t i B t i1

f (B s ) dB s

(A n ()) (M < T) + (A n (), M = T).


The rst term tends to zero as M . On { : M = T } we have a.s. that t 0 f M (B s ) dB s and
n i =1 n t 0

f (B s ) dB s =

f M (B t i1 ) B t i B t i1 =

i =1

f (B t i1 ) B t i B t i1 .

Hence by Chebyshevs inequality,

(A n (), M = T)
which tends to zero.

1 2

n i =1

f M (B t i1 ) B t i B t i1

f M (B s ) dB s

103 Non-random integrands What happens if the integrand f (, t) is non-random, i.e. f (, t) = f (t), independent of ? 104 Proposition 7.6 (Gaussian integrals) If f : [0, T] is continuous then
t

Xt =

f (s) dB s
t s 2 0 f (u) du.

is a mean zero Gaussian process with independent increments and Cov(Xt , Xs ) = Moreover
p

n i =1

X t = lim if t i = iT/n and t [t i1 , t i ]. i

f (t ) B t i B t i1 i

We will encounter a nice application in Exercise 25 (see also Exercise 24).

7 Localization and Its integral

40

7.4 Local martingales and honest ones


Recall Denition 7.2 of a local martingale. An adapted stochastic process { M t } is a local martingale if there is a sequence n of stopping times, such that (1) n as n almost surely, (2) M (k) := M tk M0 is a martingale. t
2 We have seen that the It integral of a function in L loc is a local martingale.

105 Local martingales and martingales Let M t be a local martingale. (1) If is a stopping time, then X t is a local martingale. (2) If M t is continuous and | M s | < B < for all t 0 then M t is a martingale. (3) If M t 0 and (| X 0 |) < , then M t is a submartingale. If additionally then { M t } t[0,T ] is a martingale. (X T ) = (X 0 )

106 Proposition 7.8 (Application of local martingales) If X t is a continuous local martingale with X 0 = 0 and if < a.s. for the stopping time
= inf{ t : X t { A, B}},

then

(X ) = 0 and (X = A) =

B A +B .

Proof. Let n be any localizing sequence for X t . Then Yt(k) = X tk


k) is a martingale for every k, so by optional stopping Yt( is a martingale and k) (Yt( ) = 0. k) k) ( Since is almost surely nite we have that Yt( Y k) as t . Since Yt( max{ A, B} we can apply dominated convergence and obtain
( (Y k) ) = 0. ( ( But Y k) = X k X as k (since k ). Again Y k) max{ A, B} and using dominated convergence we get

from which (X = A) =

(X ) = 0 = A (X = A) B(X = B)
B A +B

follows.

7 Localization and Its integral

41

Summary and Outlook


(1) We started with the desire to give a SDE
t t

X t = X0 +

(s, X s ) ds +

(s, X s ) dB s

some meaning. The troublemaker here is the stochastic integral term on the r.h.s. (2) We dened stochastic integrals w.r.t Brownian motion for suitable integrands. Suitable integrands are such that f (, t) is left- or right-continuous and adapted to the standard ltration and either or
2 f (, s)2 ds < a.s. (then f L loc ) and the stochastic integral is a continuous local martingale, dened as a almost sure limit of integrals functions in H 2 ,

( 0 f (, s)2 ds) < (then f H 2 ) and the stochastic integral is a continuous martingale, dened as a L 2 limit of integrals of simple functions,
T 0

(3) We still cant evaluate stochastic integrals without rst nding appropriate approximations and going through the whole procedure of proving convergence. In this way we found our rst integral
t
0

1 1 B s dB s = B2 t. t 2 2

This is contrary to the situation where $A_t$ is of bounded variation, where
$$\int_0^t A_s\, dA_s = \frac{1}{2}A_t^2.$$
It would be nice to have a formula similar to the fundamental theorem of calculus:
$$\int_0^t f'(A_s)\, dA_s = f(A_t) - f(A_0) \tag{7.1}$$
(if $f$ is continuously differentiable). Indeed, such a formula exists and is (we are not surprised at all) different from (7.1). This formula will be revealed in the second part of the course...
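To close the summary, here is the promised numerical confirmation of the integral in point (3), a sketch with arbitrary grid size and seed:

```python
import numpy as np

# Left-endpoint (Ito) sums for int_0^t B_s dB_s versus (B_t^2 - t)/2,
# matching path by path.
rng = np.random.default_rng(3)
t, N = 1.0, 100000
dB = rng.normal(0.0, np.sqrt(t / N), size=N)
B = np.concatenate(([0.0], np.cumsum(dB)))
print(np.sum(B[:-1] * np.diff(B)), 0.5 * B[-1] ** 2 - 0.5 * t)
```

Evaluating the integrand at midpoints instead would approximate the Stratonovich value $\frac{1}{2}B_t^2$, which is exactly the bounded variation answer; the choice of evaluation point matters here.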

A MORE EXPLANATIONS AND EXAMPLES


(1) p.6 Nonmeasurable sets

Look at the following terrifying construction. For each $x \in \mathbb{R}$ let
$$E_x = \{y : x - y \in \mathbb{Q}\},$$
so that $y \in E_x$ iff $x - y$ is a rational number (e.g. $\sqrt{2} + 17 \in E_{\sqrt{2}}$). Then $\mathbb{R} = \bigcup_x E_x$ and $E_x \cap E_y = \emptyset$ whenever $E_x \ne E_y$. Now we choose from each set $E_x$ exactly one element in $[0, 1]$ and collect these elements in a new set $E \subseteq [0, 1]$ (we need the so-called axiom of choice here, a legitimation that we can indeed choose one element from each of the uncountably many $E_x$). We have
$$\mathbb{R} = \bigcup_{y \in \mathbb{Q}} (y + E),$$
where $(y + E) = \{z : z = y + u,\ u \in E\}$. If $E$ were Lebesgue-measurable then
$$\infty = \lambda(\mathbb{R}) = \lambda\Big(\bigcup_{y \in \mathbb{Q}} (y + E)\Big) = \sum_{y \in \mathbb{Q}} \lambda(y + E).$$
But $\lambda(y + E) = \lambda(E)$ ($\lambda$ is translation invariant: the mass of $A$ is equal to the mass of the shifted set $A + a$), so $\lambda(E) > 0$ (otherwise the above sum would be zero). On the other hand we also have
$$\bigcup_{q \in \mathbb{Q} \cap [0,1]} (q + E) \subseteq [0, 2)$$
and hence
$$\sum_{q \in \mathbb{Q} \cap [0,1]} \lambda(q + E) \le 2 < \infty,$$
and it follows that $\lambda(E) = 0$. This is a contradiction and hence $E$ is not Lebesgue-measurable! Since any Borel set (a set from $\mathcal{B}(\mathbb{R})$) is Lebesgue-measurable, $E$ is also an example of a set that is not Borel-measurable. Btw. there are Borel sets $B \in \mathcal{B}(\mathbb{R}^2)$, such that the projection $B_1 = \{x : (x, y) \in B \text{ for some } y\}$ is not Borel measurable (not in $\mathcal{B}(\mathbb{R})$). Some nice members of $\mathcal{B}(\mathbb{R}^2)$ have very ugly shadows...

(2) p.9 Riemann integral

Indeed, if $f(x) = \mathbb{1}_{\mathbb{Q}}(x)$, then $f(x)$ is one if $x$ is a rational number and $f(x) = 0$ elsewhere. It follows that the upper sums in the definition of the Riemann integral are all 1, while the lower sums are all 0. It follows that the Riemann integral cannot exist. There is no problem with the Lebesgue integral, since $\mathbb{Q}$ is a measurable set and $f$ is a simple function and thus integrable (yielding the value zero).

(3) p.9 Two transformation rules Here are two more formulas that show how to transform integrals.

⟪107⟫ Theorem (Rule 1) Let $(\Omega, \mathcal{F}, \mu)$ be a measure space and let $(\Omega', \mathcal{F}')$ be a measurable space. Let $f : \Omega \to \Omega'$ and $g : \Omega' \to \mathbb{R}$ be measurable. Then (if one of the two sides exists)
$$\int g(f(x))\, d\mu(x) = \int g(y)\, d\nu(y),$$
where $\nu(B) = \mu(f^{-1}(B))$ for $B \in \mathcal{F}'$.

⟪108⟫ Theorem (Rule 2) Let $(\Omega, \mathcal{F}, \mu)$ be a measure space. For any measurable functions $f : \Omega \to \mathbb{R}_+$ and $g : \Omega \to \mathbb{R}$
$$\int f(x)\, g(x)\, d\mu(x) = \int g(x)\, d\nu(x),$$
where $\nu(A) = \int_A f(x)\, d\mu(x)$.
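A small numerical illustration of Rule 1 (my own sketch; the choices $\mu = N(0,1)$, $f(x) = x^2$, $g(y) = e^{-y}$ are arbitrary): sampling from the image measure $\nu = \mu \circ f^{-1}$ amounts to drawing $X \sim \mu$ and keeping $f(X)$, so both sides can be estimated without ever writing down a density for $\nu$.

```python
import numpy as np

# Rule 1: int g(f(x)) dmu(x) = int g(y) dnu(y), nu = mu o f^{-1}.
rng = np.random.default_rng(4)
X = rng.normal(size=1_000_000)       # sample from mu = N(0, 1)
Y = X ** 2                           # f(X): a sample from nu (chi^2_1)
print(np.mean(np.exp(-(X ** 2))),    # left-hand side
      np.mean(np.exp(-Y)),           # right-hand side
      1 / np.sqrt(3))                # exact value of E exp(-X^2)
```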

(4) p.13 The Borel-Cantelli Lemma We really need independence in (2). Let $X_n$ be an independent sequence of random variables, such that $\mathbb{P}(X_n = n) = 1/n$ and $\mathbb{P}(X_n = 0) = 1 - 1/n$ (see exercises). We showed that $X_n \to 0$ in probability and $X_n$ does not converge in $L^1$. Also, by using Borel-Cantelli, we showed that $X_n = n$ infinitely often with probability one and hence $X_n$ does not converge to zero almost surely. But if we define on $\Omega = [0, 1]$ the random sequence $Y_n(\omega) = n\, \mathbb{1}_{\omega < 1/n}$ then $\mathbb{P}(Y_n = n) = 1/n$ and $\mathbb{P}(Y_n = 0) = 1 - 1/n$ and $Y_n \to 0$ almost surely. The reason: the $Y_n$ are not independent, we cannot apply Borel-Cantelli here.

(5) p.27 Conditional expectation Here are two classic proofs for the existence of the conditional expectation $\mathbb{E}(X | \mathcal{G})$.

(1) First suppose that $X \in L^2$; then one can use the projection theorem for $L^2$ to see that the projection of $X$ onto the subspace $L^2(\mathcal{G})$ of $\mathcal{G}$-measurable square-integrable functions has the required properties of a conditional expectation. For general $X \in L^1$ one uses an approximation procedure to extend the definition of the conditional expectation. A very interesting property is the following: the conditional expectation $W = \mathbb{E}(X | \mathcal{G})$ has the smallest value of $\mathbb{E}((X - W)^2)$ among the random variables $W$ that are $\mathcal{G}$-measurable.

(2) More standard is the approach via the Radon-Nikodym theorem, which roughly says that if $\mu(A) = 0 \Rightarrow \nu(A) = 0$ holds for two measures $\mu, \nu$, then $\nu(A) = \int_A f\, d\mu$ for some measurable function $f$ (the Radon-Nikodym derivative). Let $\nu(A) = \mathbb{E}(X \mathbb{1}_A)$ and denote by $\mu$ the restriction of $\mathbb{P}$ to $\mathcal{G}$. $\mu(A) = 0$ implies $\nu(A) = 0$ for $A \in \mathcal{G}$ and hence by the Radon-Nikodym theorem there is a $\mathcal{G}$-measurable function $Y$, such that $\mathbb{E}(X \mathbb{1}_A) = \nu(A) = \int_A Y\, d\mu = \mathbb{E}(Y \mathbb{1}_A)$, $A \in \mathcal{G}$.

(6) p.27 Uniform integrability I Note that $\mathbb{E}(|Z|\, \mathbb{1}_{|Z| > x}) \to 0$ as $x \to \infty$ for any $L^1$ random variable $Z$. This follows from dominated convergence, since $\mathbb{E}(|Z|\, \mathbb{1}_{|Z| > x}) \le \mathbb{E}(|Z|) < \infty$ and $|Z|\, \mathbb{1}_{|Z| > x} \to 0$ almost surely. The difference with uniform integrability is that zero is a uniform limit for all members of $\mathcal{C}$.

(7) p.27 Uniform integrability II The following very interesting Lemma can be used in the proof of the subsequent Lemma 4.6. Although we don't prove 4.6, it is interesting to note that any $L^1$ random variable is also in an $L^p$-like space for some $p > 1$, in the sense that $\mathbb{E}(\phi(|Z|)) < \infty$ for some function $\phi(x)$ that increases faster than $x$.

⟪109⟫ Lemma 4.5 If $\mathbb{E}(|Z|) < \infty$ then there is a convex $\phi$, such that $\phi(x)/x \to \infty$ as $x \to \infty$ and $\mathbb{E}(\phi(|Z|)) < \infty$.

⟪110⟫ Lemma 4.6 If $Z$ is an $\mathcal{F}$-measurable random variable with $\mathbb{E}(|Z|) < \infty$ then the collection
$$\{\mathbb{E}(Z | \mathcal{G}) : \mathcal{G} \text{ a sub-}\sigma\text{-field of } \mathcal{F}\}$$
is uniformly integrable.

(8) p.34 Approximation of functions in $\mathcal{H}^2$ (recipe)

⟪111⟫ Theorem 6.5 (Approximation) Let the operator $A_n$ act on functions $f \in \mathcal{H}^2$ in the following way:
$$A_n(f) = \sum_{i=1}^{2^n - 1} \Big(\frac{1}{t_i - t_{i-1}} \int_{t_{i-1}}^{t_i} f(\omega, s)\, ds\Big)\, \mathbb{1}_{t_i < t \le t_{i+1}},$$
where $t_i = iT/2^n$ for $0 \le i \le 2^n - 1$. Then

(1) $A_n$ is a bounded linear operator from $\mathcal{H}^2$ to $\mathcal{H}^2_0$,
(2) Contraction properties: $\|A_n f\| \le \|f\|$ and $\|A_n f\|_{L^2(dt)} \le \|f\|_{L^2(dt)}$,
(3) $\|A_n f - f\|_{L^2(dt)} \to 0$ as $n \to \infty$. Then
$$\int_0^T f(\omega, s)\, dB_s = \lim_{n\to\infty} \sum_{i=1}^{2^n - 1} \frac{1}{t_i - t_{i-1}} \Big(\int_{t_{i-1}}^{t_i} f(\omega, s)\, ds\Big)\, \big(B_{t_{i+1}} - B_{t_i}\big).$$
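Here is the recipe in action, as a sketch (not from the notes; $f(\omega, s) = B_s$ and the grid sizes are my own choices). Note that each block average multiplies the increment over the following block, which is what keeps the approximating integrand adapted.

```python
import numpy as np

# Theorem 6.5 for f(omega, s) = B_s on [0, T]:
#   sum_i (avg of B over (t_{i-1}, t_i]) * (B_{t_{i+1}} - B_{t_i})
# should approach int_0^T B dB = B_T^2/2 - T/2.
rng = np.random.default_rng(5)
T, n, fine = 1.0, 10, 2 ** 15
N = 2 ** n                                # dyadic grid t_i = i T / 2^n
dB = rng.normal(0.0, np.sqrt(T / fine), size=fine)
B = np.concatenate(([0.0], np.cumsum(dB)))

avg = B[:-1].reshape(N, fine // N).mean(axis=1)  # block averages of B
Bgrid = B[:: fine // N]                          # B at the points t_i
incr = np.diff(Bgrid)                            # B_{t_{i+1}} - B_{t_i}
print(np.sum(avg[:-1] * incr[1:]),               # adapted A_n-type sum
      0.5 * Bgrid[-1] ** 2 - 0.5 * T)
```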

B EXERCISES

Measure theory and probability


Exercise 1: Suppose that $\mathcal{F}_1$ and $\mathcal{F}_2$ are two $\sigma$-fields in $\Omega$ with $\mathcal{F}_1 \subseteq \mathcal{F}_2$ (i.e. $A \in \mathcal{F}_1 \Rightarrow A \in \mathcal{F}_2$, $\mathcal{F}_1$ is a sub-field of $\mathcal{F}_2$). Let $f : \Omega \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ be some function. (1) Is it true that if $f$ is measurable w.r.t. $\mathcal{F}_1$ then $f$ is also measurable w.r.t. $\mathcal{F}_2$? (2) Is the converse true?
Exercise 2: Let $\Omega = [0, 1]$, $\mathcal{F}_2 = \sigma(\{[0, \frac{1}{2}], [0, \frac{1}{4}]\})$.

(1) Find all elements of $\mathcal{F}_2$. (2) Give an example of a $\sigma$-field $\mathcal{F}_1$ with $\mathcal{F}_1 \subseteq \mathcal{F}_2$, $\mathcal{F}_1 \ne \mathcal{F}_2$ and a function $f : [0, 1] \to \mathbb{R}$ that is measurable only w.r.t. one of the two $\sigma$-fields. (3) Find a $g \ne f$ such that $g$ is $\mathcal{F}_1$-measurable and
$$\int_A f(x)\, dx = \int_A g(x)\, dx, \quad A \in \mathcal{F}_1.$$

Exercise 3: Show that the following functions $f : \mathbb{R} \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ are measurable and decide which are Lebesgue integrable (w.r.t. the Lebesgue measure $\lambda$). Which of the integrals $\int f(x)\, dx$ exist in the Riemann sense?
$$f(x) = \mathbb{1}_{[-1,1]}(x), \quad f(x) = \mathbb{1}_{[3,\infty)}(x)\, e^{-x}, \quad f(x) = 2x\, e^{-x^2}, \quad f(x) = 1/(1 + x^2),$$
$$f(x) = x/(1 + x^2), \quad f(x) = \mathbb{1}_{(1,\infty)}(x)\, \tfrac{1}{x}, \quad f(x) = \sin(x)/x, \quad f(x) = \mathbb{1}_{\mathbb{Q}}(x).$$

Exercise 4: Find the expectation $\mathbb{E}(X)$ for the following discrete distributions of $X$:

(1) $\mathbb{P}(X = k) = p(1-p)^{k-1}$, $k = 1, 2, \ldots$ (geometric distribution).
(2) $\mathbb{P}(X = k) = 1/n$, $k = 1, 2, \ldots, n$ (discrete uniform distribution).
(3) $\mathbb{P}(X = k) = e^{-\lambda} \lambda^k / k!$, $k = 0, 1, 2, \ldots$ (Poisson distribution).

Find the expectation $\mathbb{E}(X)$ for the random variable $X$ with the following density functions:

(4) $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$, density of the exponential distribution.
(5) $f(x) = 1/(b - a)$, $x \in [a, b]$, density of the uniform distribution on $[a, b]$.
(6) $f(x) = e^{-(x - \mu)^2/(2\sigma^2)} / \sqrt{2\pi\sigma^2}$, density of the normal distribution.

Exercise 5: Let $X_n$ be an independent sequence of random variables, such that $\mathbb{P}(X_n = n) = 1/n$ and $\mathbb{P}(X_n = 0) = 1 - 1/n$.

(1) Show that $X_n \to 0$ in probability. (2) What is the probability that $X_n = n$ infinitely often? (3) Does the sequence converge almost surely? (4) What about convergence in distribution? (5) Find the limit $\lim_{n\to\infty} \mathbb{E}(X_n)$.

Exercise 6: Let $X, Y$ be random variables with $X, Y \in L^2$ (i.e. $\mathbb{E}(|X|^2) < \infty$ and $\mathbb{E}(|Y|^2) < \infty$). (1) Is it true that $X + Y \in L^2$? (2) Is $XY$ in $L^2$? For which $p$ is it in $L^p$? (3) Let $X = \mathbb{1}_A$ and $Y = \mathbb{1}_B$ and show that $\mathbb{P}(A \cap B)^2 \le \mathbb{P}(A)\, \mathbb{P}(B)$.

Exercise 7: Show using Fubini's theorem: if $X \ge 0$ a.s. and $\mathbb{E}(X) < \infty$ then
$$\mathbb{E}(X) = \int_0^\infty \mathbb{P}(X > u)\, du.$$
Can you find a similar formula for the higher moments $\mathbb{E}(X^n)$, $n \ge 2$?

Exercise 8: Let $X$ and $Y$ be independent random variables with distribution functions $F(x) = \mathbb{P}(X \le x)$ and $G(x) = \mathbb{P}(Y \le x)$. Find the distribution function of the sum $Z = X + Y$. Hint: use the fact that $\mathbb{P}(A) = \mathbb{E}(\mathbb{1}_A)$.

Exercise 9: Let $X_n$ be a sequence of random variables and $C$ a constant. Show that if $X_n \to C$ in distribution, then $X_n \to C$ also in probability.

Discrete Time Martingales

Exercise 10: Consider the coin tossing example and let $S = \{k : \mathbb{P}(M_n = k \text{ infinitely often}) > 0\}$. Show that $S = \emptyset$ or $S = \mathbb{Z}$ and conclude that $\mathbb{P}(\tau < \infty) = 1$, where $\tau$ is the first time the process reaches $-A$ or $B$.

Exercise 11: Given a martingale $\{M_n\}_{n \in \mathbb{N}}$. Show that $\mathbb{E}(M_n) = \mathbb{E}(M_1)$ for all $n \ge 1$.

Exercise 12: Let $\{X_k\}_{k \in \mathbb{N}}$ be i.i.d. (1) Suppose that $\mathbb{E}(|X_1|) < \infty$. Show that $M_n = X_1 + \ldots + X_n - n\, \mathbb{E}(X_1)$ is a martingale (w.r.t. $\{X_k\}_{k \in \mathbb{N}}$). (2) Suppose that $X_1, X_2, \ldots$ are non-negative with $\mathbb{E}(X_1) = 1$. Show that $M_n = X_1 X_2 \cdots X_n$ is a martingale (w.r.t. $\{X_k\}_{k \in \mathbb{N}}$).

Exercise 13: Let $\{M_n\}$ be a non-negative martingale with $M_1 = 0$. Show that $M_n = 0$ for all $n$ a.s.

Exercise 14: Show: if $\phi : \mathbb{R} \to \mathbb{R}$ is a convex function and $M_n$ is a martingale w.r.t. $X_1, X_2, \ldots$ then $\phi(M_n)$ is a submartingale w.r.t. $X_1, X_2, \ldots$, provided that $\mathbb{E}(|\phi(M_n)|) < \infty$. Use Jensen's inequality $\mathbb{E}(\phi(X) | \mathcal{F}) \ge \phi(\mathbb{E}(X | \mathcal{F}))$.

Exercise 15: If $\{M_n\}$ is a martingale w.r.t. $X_1, X_2, \ldots$, show that $M'_k = (M_{k+m} - M_m)^2$ is a submartingale for all $m$ w.r.t. $\mathcal{F}'_k = \sigma(X_1, \ldots, X_{m+k})$.

Exercise 16: Let $A_1, A_2, \ldots$ be a sequence of independent events and let $S_n = \sum_{i=1}^n \mathbb{1}_{A_i}$ and $\tau_k = \min\{n : S_n = k\}$, $k \ge 1$. Suppose that $a_n = \mathbb{E}(S_n) \to \infty$ as $n \to \infty$. (1) Show that $\mathbb{P}(\tau_k < \infty) = 1$. (2) Use optional stopping to show that $\mathbb{E}(a_{\tau_k}) = k$.
Brownian Motion
Exercise 17: We defined the functions $\Delta_n$ during the lecture and showed that $X_t = \sum_{n=0}^\infty \Delta_n(t) Z_n$ defines a Brownian motion on $[0, 1]$. Draw (as good as you can) a typical sample path of the process $Y_t = \sum_{n=1}^\infty \Delta_n(t) Z_n$ for $t \in [0, 1]$. (1) What is the distribution of $Y_t$? (2) Calculate the covariance $\mathrm{Cov}(Y_t, Y_s)$. (3) Show that $Y_t = B_t - t B_1$. (4) Make a figure showing a typical sample path of $W_t = \sum_{n=4}^\infty \Delta_n(t) Z_n$ for $t \in [0, 1]$ (a simulation sketch follows below).
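For (4), partial sums of the series are easy to simulate. The following sketch (not part of the notes) assumes the standard Schauder indexing: $\Delta_0(t) = t$ and, for $n = 2^j + k$, a tent of height $2^{-j/2-1}$ on $[k 2^{-j}, (k+1) 2^{-j}]$, which should match the lecture's construction up to notation.

```python
import numpy as np

def schauder(n, t):
    # Delta_n: Delta_0(t) = t; for n = 2^j + k a tent of height
    # 2^(-j/2 - 1) centred at (k + 1/2) 2^(-j)  (assumed indexing).
    if n == 0:
        return t.copy()
    j = int(np.log2(n))
    k = n - 2 ** j
    left, mid, right = k / 2 ** j, (k + 0.5) / 2 ** j, (k + 1) / 2 ** j
    h = 2.0 ** (-j / 2 - 1)
    out = np.zeros_like(t)
    up = (t >= left) & (t < mid)
    down = (t >= mid) & (t < right)
    out[up] = h * (t[up] - left) / (mid - left)
    out[down] = h * (right - t[down]) / (right - mid)
    return out

rng = np.random.default_rng(6)
t = np.linspace(0.0, 1.0, 1025)
Z = rng.normal(size=256)
W = sum(schauder(n, t) * Z[n] for n in range(4, 256))  # partial sum of W_t
print(W[t == 0.25], W[t == 0.5], W[t == 0.75])         # all (near) zero
```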

Continuous time martingales

Exercise 18: Let $\mathcal{H} \subseteq \mathcal{G}$ be sub-$\sigma$-fields of $\mathcal{F}$. Prove that:

(1) $\mathbb{E}(\mathbb{E}(X | \mathcal{G}) | \mathcal{H}) = \mathbb{E}(X | \mathcal{H})$. Hint: Let $Z = \mathbb{E}(X | \mathcal{G})$. Then we have to show that $U = \mathbb{E}(Z | \mathcal{H})$ is a version of the conditional expectation $\mathbb{E}(X | \mathcal{H})$, i.e. (i) $U$ is $\mathcal{H}$-measurable, (ii) $\mathbb{E}(U \mathbb{1}_A) = \mathbb{E}(X \mathbb{1}_A)$ for all $A \in \mathcal{H}$.

(2) $\mathbb{E}(X | \mathcal{G}) = X$ and $\mathbb{E}(X Z | \mathcal{G}) = X\, \mathbb{E}(Z | \mathcal{G})$ if $X$ is $\mathcal{G}$-measurable. Argue as in the above exercise.

Exercise 19: Show that the following processes are martingales w.r.t. the standard Brownian filtration: (1) $B_t$, (2) $B_t^2 - t$, (3) $\exp\big(\alpha B_t - \frac{\alpha^2}{2} t\big)$.

Exercise 20: Let $-B < 0 < A$ and define the random variable $\tau = \inf\{t : B_t \in \{-B, A\}\}$. Obviously $\tau$ represents the first time the Brownian motion leaves the open interval $(-B, A)$.

(1) Show that $\tau$ is a stopping time w.r.t. the standard Brownian filtration.
(2) Show that $\mathbb{P}(\tau < \infty) = 1$ (Hint: show $\mathbb{P}(\tau > n + 1) \le \mathbb{P}(|B_{i+1} - B_i| < A + B,\ i = 1, 2, \ldots, n)$).
(3) Employ the optional stopping theorem to show that $\mathbb{E}(B_\tau) = 0$. Use this to deduce that
$$\mathbb{P}(B_\tau = A) = \frac{B}{A + B}, \qquad \mathbb{P}(B_\tau = -B) = \frac{A}{A + B}.$$
(Compare with equation (2.1) on page 18.)
(4) Use the martingale $B_t^2 - t$ from Exercise 19 to show that $\mathbb{E}(B_\tau^2) = \mathbb{E}(\tau)$ and $\mathbb{E}(\tau) = AB$.

[Figure 4: Two samples of $B_t$ for $0 \le t \le \tau$ ($A = 8$, $B = 5$).]

(5) Let $\tau_A = \inf\{t : B_t = A\}$. Show that $\mathbb{P}(\tau_A < \infty) = 1$ by using the results from (3).
(6) Show that $\mathbb{E}(M_{\tau_A}) = 1$ where $M_t = \exp\big(\alpha B_t - \frac{\alpha^2}{2} t\big)$ is the martingale from Exercise 19(3).
(7) Calculate the Laplace transform $\mathbb{E}(e^{-s \tau_A}) = e^{-A\sqrt{2s}}$, $s \ge 0$, of the hitting time $\tau_A$.
(8) Is $\mathbb{E}(\tau_A)$ finite?
Itô integral

Exercise 21: Calculate the mean and variance of the (non-stochastic) integrals
(1) $\int_0^t B_s\, ds$,
(2) $\int_0^t B_s^2\, ds$.
Hint: to calculate the variance in (1) write $\big(\int_0^t B_s\, ds\big)^2 = \int_0^t \int_0^t B_s B_w\, dw\, ds$. In (2) use the same trick as in (1), and apply the martingale $B_t^2 - t$ to calculate the expectation $\mathbb{E}(B_s^2 B_w^2) = \mathbb{E}(\mathbb{E}(B_s^2 B_w^2 | \mathcal{F}_w))$.

Exercise 22: Calculate the variance of the following stochastic integrals:
(1) $\int_0^t \sqrt{|B_s|}\, dB_s$ (you can use that for a normal random variable $Z$ with mean 0 and variance $s$ we have $\mathbb{E}(|Z|) = \sqrt{2s/\pi}$),
(2) $\int_0^t (B_s + s)^2\, dB_s$.
Hint: use the Itô isometry:
$$\mathbb{E}\Big[\Big(\int_0^t f(\omega, s)\, dB_s\Big)^2\Big] = \mathbb{E}\Big[\int_0^t f^2(\omega, s)\, ds\Big].$$

Exercise 23: This exercise is needed in the proof of Theorem ⟪102⟫. Show that
$$\nu_n = T \wedge \min\{t : |B_t| \ge n\}$$
is a localizing sequence for $f(B_s(\omega)) \in L^2_{loc}[0, T]$. Here $f$ is a continuous function. Recall the definition: An increasing sequence $\nu_n$ of stopping times is a localizing sequence for $f \in L^2_{loc}[0, T]$, if (1) $f_n(\omega, t) = f(\omega, t)\, \mathbb{1}_{t < \nu_n}$ is in $\mathcal{H}^2$, (2) $\nu_n = T$ for some $n$ a.s.

Exercise 24: Construct a Gaussian process $X_t$ on $[0, 1]$ with independent increments and variance $\mathrm{Var}(X_t) = t/(1 + t)$. Recall ⟪104⟫: If $f : [0, T] \to \mathbb{R}$ is continuous then $Z_t = \int_0^t f(s)\, dB_s$ is a mean zero Gaussian process with independent increments and $\mathrm{Cov}(Z_t, Z_s) = \int_0^{t \wedge s} f^2(u)\, du$.

Exercise 25: Let $f : [0, \infty) \to (0, \infty)$ be continuous and such that $\int_0^t f^2(s)\, ds \to \infty$ as $t \to \infty$. Let $\tau(t)$ be a solution of the equation $\int_0^{\tau(t)} f^2(s)\, ds = t$. Show that $X_t = \int_0^{\tau(t)} f(s)\, dB_s$ is a standard Brownian motion.

C SOLUTIONS

Solution to 1: (1) If $f$ is measurable w.r.t. $\mathcal{F}_1$ then $f^{-1}(B) = \{x : f(x) \in B\}$ is an element of $\mathcal{F}_1$. Since $\mathcal{F}_1 \subseteq \mathcal{F}_2$ it follows that $f^{-1}(B)$ is also in $\mathcal{F}_2$, which makes $f$ also measurable w.r.t. $\mathcal{F}_2$.

(2) We guess that this is not the case. Let $\Omega = [0, 1]$, $\mathcal{F}_2 = \sigma(\{[0, \frac{1}{2}], [0, \frac{1}{4}]\})$ and $\mathcal{F}_1 = \sigma(\{[0, \frac{1}{2}]\})$, i.e.
$$\mathcal{F}_1 = \{\emptyset, [0, \tfrac{1}{2}], (\tfrac{1}{2}, 1], [0, 1]\}$$
and
$$\mathcal{F}_2 = \{\emptyset, [0, \tfrac{1}{2}], (\tfrac{1}{2}, 1], [0, 1], [0, \tfrac{1}{4}], (\tfrac{1}{4}, 1], (\tfrac{1}{4}, \tfrac{1}{2}], [0, \tfrac{1}{4}] \cup (\tfrac{1}{2}, 1]\}$$
(the smallest $\sigma$-fields that contain $\{[0, \frac{1}{2}]\}$ and $\{[0, \frac{1}{2}], [0, \frac{1}{4}]\}$ respectively, see figure below). Now let $f(x) = \mathbb{1}_{[0, 1/4]}(x)$. Then $f^{-1}(\{1\}) = [0, \frac{1}{4}] \notin \mathcal{F}_1$, so $f$ is not measurable w.r.t. $\mathcal{F}_1$ (but it is w.r.t. $\mathcal{F}_2$).

Solution to 2: (1), (2) See exercise 1. (3) Let
$$g(x) = \tfrac{1}{2}\, \mathbb{1}_{[0, 1/2]}(x).$$
Then $g$ is certainly $\mathcal{F}_1$-measurable and
$$\int_{[0, 1/2]} f(x)\, dx = \frac{1}{4} = \int_{[0, 1/2]} g(x)\, dx.$$
[Figure: another example of a function $f$ that is too fine to be measurable w.r.t. $\mathcal{F}_1$, together with the $\mathcal{F}_1$-measurable $g$.] On the interval $[0, \frac{1}{2}]$ the function $g$ is equal to the average of $f$. We will later see that this kind of averaging property is useful to define conditional expectation.

Solution to 3: The functions are all measurable.

$\mathbb{1}_{[-1,1]}(x)$: Lebesgue integrable, Riemann integrable (a simple function).
$\mathbb{1}_{[3,\infty)}(x)\, e^{-x}$: Lebesgue integrable, Riemann integrable (improper).
$2x\, e^{-x^2}$: Lebesgue integrable, Riemann integrable (improper).
$1/(1 + x^2)$: Lebesgue integrable, Riemann integrable (improper).
$x/(1 + x^2)$: not Lebesgue integrable, and the improper Riemann integral diverges (behaves like $1/x$ for large $x$).
$\mathbb{1}_{(1,\infty)}(x)\, \frac{1}{x}$: not Lebesgue integrable, not Riemann integrable ($= 1/x$ for large $x$).
$\sin(x)/x$: not Lebesgue integrable ($\int f^+ = \int f^- = \infty$), but improperly Riemann integrable.
$\mathbb{1}_{\mathbb{Q}}(x)$: Lebesgue integrable ($\mathbb{Q}$ is countable and measurable; the integral is zero), not Riemann integrable.

Solution to 5: (1) W.l.o.g. we assume that $\epsilon < 1$. Then $\mathbb{P}(|X_n| > \epsilon) = \mathbb{P}(X_n = n) = 1/n$, which tends to zero as $n \to \infty$. Hence $X_n \to 0$ in probability.

(2) Using Borel-Cantelli it follows that $\mathbb{P}(X_n = n \text{ i.o.}) = 1$, since
$$\sum_{n=1}^\infty \mathbb{P}(X_n = n) = \sum_{n=1}^\infty \frac{1}{n} = \infty.$$

(3) No, since it follows from (2) that the probability that $X_n = n$ infinitely often is one, so $X_n$ does not converge with probability one.

(4) It follows from (1) that $X_n \to 0$ in distribution. You can also argue as follows: for $x \ne 0$ we have $\mathbb{P}(X_n \le x) \to \mathbb{1}_{x \ge 0}$.

(5) Since $\mathbb{E}(X_n) = n\, \mathbb{P}(X_n = n) + 0 \cdot \mathbb{P}(X_n = 0) = 1$ we have $\mathbb{E}(X_n) \to 1$ as $n \to \infty$. The sequence $X_n$ does not converge to zero in $L^1$.

Solution to 6: (1) Short answer: Sure, because $L^2$ is a linear space! Or use Minkowski's inequality to see that $\|X + Y\|_2 \le \|X\|_2 + \|Y\|_2 < \infty$.

(2) For $p = 1$ the product $XY$ is in $L^p$ because of the Cauchy-Schwarz inequality. It is not clear whether this is true for $p = 2$ (all we know is that $L^2 \subseteq L^1$). Let $\Omega = \{1, 2, 3, \ldots\}$, $\mathbb{P}(\{k\}) = \frac{6}{\pi^2 k^2}$ (note that $\mathbb{P}(\Omega) = 1$ because $\sum_{k=1}^\infty 1/k^2 = \pi^2/6$). Define the random variables $X(\omega) = Y(\omega) = \sqrt[4]{\omega}$. Then
$$\int X^2\, d\mathbb{P} = \sum_{k=1}^\infty \sqrt{k}\, \frac{6}{\pi^2 k^2} < \infty, \quad\text{but}\quad \int (XY)^2\, d\mathbb{P} = \sum_{k=1}^\infty k\, \frac{6}{\pi^2 k^2} = \infty,$$
so $XY \notin L^2$.
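The dichotomy between (2) and almost sure convergence is visible in a quick simulation (a sketch, not from the notes; sizes and seed are arbitrary): the independent $X_n$ keep producing hits at all scales, while the coupled $Y_n(\omega) = n\, \mathbb{1}_{\omega < 1/n}$ from the appendix are nonzero only for finitely many indices.

```python
import numpy as np

# Second Borel-Cantelli in action: indices n <= N with X_n = n
# (independent case) versus Y_n = n (one fixed omega).
rng = np.random.default_rng(7)
N = 10000
n = np.arange(1, N + 1)
hits_X = n[rng.random(N) < 1.0 / n]   # independent: P(X_n = n) = 1/n
omega = rng.random()
hits_Y = n[omega < 1.0 / n]           # nonzero only for n < 1/omega
print(hits_X)                         # hits keep occurring at all scales
print(hits_Y)                         # finitely many, all at the start
```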

Solution to 7: Main idea: Use Fubini's theorem and the following transformation rule:
$$\mathbb{E}(g(X)) = \int_\Omega g(X(\omega))\, d\mathbb{P}(\omega) = \int_{\mathbb{R}} g(x)\, dF(x).$$
This is useful, since you can calculate expectations without using the integral w.r.t. the probability measure itself, but instead using the probability distribution function $F(x) = \mathbb{P}(X \le x)$. We then obtain
$$\mathbb{P}(X > u) = \mathbb{E}(\mathbb{1}_{X > u}) = \int_\Omega \mathbb{1}_{X(\omega) > u}(\omega)\, d\mathbb{P}(\omega) = \int_{\mathbb{R}} \mathbb{1}_{(u,\infty)}(x)\, dF(x),$$
where $F$ is the distribution function of $X$. Then
$$\int_0^\infty \mathbb{P}(X > u)\, du = \int_0^\infty \int_{\mathbb{R}} \mathbb{1}_{(u,\infty)}(x)\, dF(x)\, du.$$
It follows by Fubini's theorem that we can interchange the order of integration:
$$\int_0^\infty \int_{\mathbb{R}} \mathbb{1}_{(u,\infty)}(x)\, dF(x)\, du = \int_{\mathbb{R}} \int_0^\infty \mathbb{1}_{(u,\infty)}(x)\, du\, dF(x) = \int_{\mathbb{R}} \int_0^x du\, dF(x) = \int_{\mathbb{R}} x\, dF(x) = \mathbb{E}(X).$$
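A numerical sketch of the tail formula (my own illustration; $X$ exponential with $\mathbb{E}X = 1$ is an arbitrary choice). The same Fubini argument gives the higher-moment identity $\mathbb{E}(X^n) = \int_0^\infty n u^{n-1}\, \mathbb{P}(X > u)\, du$, which the code checks for $n = 2$:

```python
import numpy as np

# E X = int_0^inf P(X > u) du  and  E X^2 = int_0^inf 2u P(X > u) du,
# checked for X ~ Exp(1) by Monte Carlo plus a simple quadrature.
rng = np.random.default_rng(12)
X = rng.exponential(1.0, size=200000)
u = np.linspace(0.0, 15.0, 301)
du = u[1] - u[0]
tail = np.array([np.mean(X > ui) for ui in u])
print(np.sum(tail) * du, X.mean())                 # both near E X = 1
print(np.sum(2 * u * tail) * du, np.mean(X ** 2))  # both near E X^2 = 2
```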

Solution to 8: We want to determine the distribution function of the sum $Z = X + Y$ of two independent random variables $X$ and $Y$. We have $\mathbb{P}(Z \le z) = \mathbb{E}(\mathbb{1}_{Z \le z})$, the right side being just an abbreviation of $\mathbb{E}(\mathbb{1}_{\{\omega : Z(\omega) \le z\}})$. Then (using again the transformation of exercise 7):
$$\mathbb{E}(\mathbb{1}_{Z \le z}) = \int \mathbb{1}_{Z(\omega) \le z}\, d\mathbb{P}(\omega) = \int \mathbb{1}_{X(\omega) + Y(\omega) \le z}\, d\mathbb{P}(\omega) = \iint \mathbb{1}_{x + y \le z}\, dH(x, y).$$
Here $H$ is the joint distribution function of $X$ and $Y$. By independence we have $H(x, y) = F(x)\, G(y)$ with $F, G$ the distribution functions of $X$ and $Y$. Hence
$$\iint \mathbb{1}_{x + y \le z}\, dH(x, y) = \iint \mathbb{1}_{x \le z - y}\, dF(x)\, dG(y) = \int \Big(\int_{-\infty}^{z - y} dF(x)\Big)\, dG(y) = \int F(z - y)\, dG(y).$$
This is the so-called convolution of $F$ and $G$. Two remarks: i) of course it follows similarly that
$$\mathbb{P}(Z \le z) = \int G(z - x)\, dF(x);$$
ii) if $X$ and $Y$ are nonnegative random variables, then
$$\mathbb{P}(Z \le z) = \int_0^z F(z - y)\, dG(y).$$

Solution to 9: Suppose $X_n \to C$ in distribution, i.e. $\mathbb{P}(X_n \le x) \to 0$ for $x < C$ and $\mathbb{P}(X_n \le x) \to 1$ for $x > C$ (we don't know about convergence at $x = C$, but we don't need it). Then
$$\mathbb{P}(|X_n - C| > \epsilon) = \mathbb{P}(X_n > C + \epsilon) + \mathbb{P}(X_n < C - \epsilon) \to 1 - 1 + 0 = 0.$$

Solution to 10: Suppose $S \ne \emptyset$ and $S \ne \mathbb{Z}$. Then $S = \{a, a + 1, \ldots, b\}$ with $a, b \in \mathbb{Z}$. But since $\mathbb{P}(M_{k+1} = b + 1 \mid M_k = b) = 1/2 > 0$ we have $b + 1 \in S$, which is a contradiction. Hence $S = \emptyset$ or $S = \mathbb{Z}$. Since some integer must be visited infinitely often, $S = \mathbb{Z}$, so in particular $\mathbb{P}(M_k \in \{-A, B\} \text{ infinitely often}) > 0$, and it follows that $\mathbb{P}(\tau < \infty) = 1$.

Solution to 11: $\mathbb{E}(M_n) = \mathbb{E}(\mathbb{E}(M_n | \mathcal{F}_{n-1})) = \mathbb{E}(M_{n-1})$. It follows by induction that $\mathbb{E}(M_n)$ is constant.

Solution to 12: (1) Clearly $M_n \in \mathcal{F}_n$ and $\mathbb{E}(|M_n|) < \infty$. Then, writing as usual $S_n = X_1 + \ldots + X_n$,
$$\mathbb{E}(M_n | \mathcal{F}_{n-1}) = \mathbb{E}(S_n | \mathcal{F}_{n-1}) - n\, \mathbb{E}(X_1) = \mathbb{E}(S_{n-1} | \mathcal{F}_{n-1}) + \mathbb{E}(X_n | \mathcal{F}_{n-1}) - n\, \mathbb{E}(X_1)$$
$$= S_{n-1} + \mathbb{E}(X_n) - n\, \mathbb{E}(X_1) = S_{n-1} - (n - 1)\, \mathbb{E}(X_1) = M_{n-1}.$$

(2) Obviously $M_n \in \mathcal{F}_n$ and $\mathbb{E}(|M_n|) = \prod_{i=1}^n \mathbb{E}(X_i) < \infty$. We have
$$\mathbb{E}(M_n | \mathcal{F}_{n-1}) = \mathbb{E}\Big(\prod_{k=1}^n X_k \,\Big|\, \mathcal{F}_{n-1}\Big) = \prod_{k=1}^{n-1} X_k\; \mathbb{E}(X_n | \mathcal{F}_{n-1}) = \prod_{k=1}^{n-1} X_k\; \mathbb{E}(X_n) = \prod_{k=1}^{n-1} X_k = M_{n-1}.$$

Solution to 13: By exercise 11, $\mathbb{E}(M_n) = \mathbb{E}(M_1) = 0$. We use Doob's maximal inequality: $\lambda\, \mathbb{P}(\max_{k \le n} M_k \ge \lambda) \le \mathbb{E}(M_n) = 0$ for all $\lambda > 0$, so $\max_{k \le n} M_k = 0$ a.s. and then $M_n = 0$ a.s.

Solution to 14: (1) $\mathbb{E}(|\phi(M_n)|) < \infty$ is given in the exercise, (2) $\phi(M_n) = f(X_1, \ldots, X_n)$ is clear, (3) the submartingale property follows from Jensen's inequality: $\mathbb{E}(\phi(M_n) | \mathcal{F}_{n-1}) \ge \phi(\mathbb{E}(M_n | \mathcal{F}_{n-1})) = \phi(M_{n-1})$.

Solution to 15: First note that, taking the convex function $\phi(x) = x^2$, it follows that $M_{m+k}$ is a martingale and $M_{m+k}^2$ a submartingale (w.r.t. $\mathcal{F}'_k = \sigma(X_1, \ldots, X_{m+k})$, see exercise 14). Then we have
$$\mathbb{E}(M'_n | \mathcal{F}'_{n-1}) = \mathbb{E}\big((M_{n+m} - M_m)^2 \,\big|\, \mathcal{F}_{m+n-1}\big)$$
$$= \mathbb{E}(M_{n+m}^2 | \mathcal{F}_{m+n-1}) - 2\, \mathbb{E}(M_{n+m} M_m | \mathcal{F}_{m+n-1}) + \mathbb{E}(M_m^2 | \mathcal{F}_{m+n-1})$$
$$\ge M_{n+m-1}^2 - 2 M_m M_{n+m-1} + M_m^2 = (M_{n+m-1} - M_m)^2 = M'_{n-1}.$$

Solution to 16: (1) Since $\mathbb{E}(S_n) = \sum_{i=1}^n \mathbb{P}(A_i)$, it follows from $\sum_{i=1}^\infty \mathbb{P}(A_i) = \infty$ and Borel-Cantelli that $\mathbb{P}(\mathbb{1}_{A_i} = 1 \text{ infinitely often}) = 1$ and hence $\mathbb{P}(S_n \text{ reaches } k) = \mathbb{P}(\tau_k < \infty) = 1$.

(2) The process $M_n = S_n - a_n$ is a martingale w.r.t. $\mathcal{F}_n = \sigma(A_1, \ldots, A_n)$. Indeed, (1) $\mathbb{E}(|M_n|) < \infty$, (2) $M_n = f(\mathbb{1}_{A_1}, \ldots, \mathbb{1}_{A_n})$ and (3)
$$\mathbb{E}(M_n | \mathcal{F}_{n-1}) = \mathbb{E}(S_{n-1} + \mathbb{1}_{A_n} - a_n | \mathcal{F}_{n-1}) = S_{n-1} + \mathbb{P}(A_n) - a_n = M_{n-1}.$$
Obviously $\tau_k$ is a stopping time w.r.t. $\mathcal{F}_n$ and by optional stopping $M_{n \wedge \tau_k}$ is a martingale, and since $\mathbb{E}(M_1) = 0$ it follows that $\mathbb{E}(M_{n \wedge \tau_k}) = 0$. Clearly $|M_{n \wedge \tau_k}| \le 2k$ and $M_{n \wedge \tau_k} \to M_{\tau_k}$, and hence by dominated convergence $\mathbb{E}(M_{\tau_k}) = 0$. Since $\mathbb{P}(\tau_k < \infty) = 1$ we have $M_{\tau_k} = k - a_{\tau_k}$, so that indeed $\mathbb{E}(a_{\tau_k}) = \mathbb{E}(S_{\tau_k}) = k$.

Solution to 17: (1) $Y_t$ is a sum of normal random variables and has a normal distribution with mean 0 and variance
$$\mathrm{Var}(Y_t) = \sum_{n=1}^\infty \Delta_n^2(t) = \mathrm{Var}(X_t) - \Delta_0^2(t) = t - t^2.$$
Actually, this is a bit cheated, since we cannot just take the mean and the variance of an infinite sum to be the infinite sum of the means and variances. If you want a more rigorous proof, it follows along the lines of the proof for the Brownian motion representation (finite dimensional distributions via characteristic functions). $Y_t$ is a Brownian bridge, i.e. it behaves in principle like a Brownian motion, but has $Y_1 = 0$ (so the process is fixed at $t = 1$: we see that the variance has its maximal value at $t = 1/2$ and then goes back to 0).

(2) $\mathrm{Cov}(Y_t, Y_s) = \mathbb{E}\big(\sum_{n=1}^\infty \Delta_n(t) Z_n \sum_{n=1}^\infty \Delta_n(s) Z_n\big) = \sum_{n=1}^\infty \Delta_n(t) \Delta_n(s) = s \wedge t - ts$.

(3) We have $Y_t = B_t - t Z_0$ and $B_1 = \sum_{n=0}^\infty \Delta_n(1) Z_n = Z_0$, so $Y_t = B_t - t B_1$.

(4) The process $W_t = \sum_{n=4}^\infty \Delta_n(t) Z_n$ is a combination of Brownian bridges, such that $W_t = 0$ for $t \in \{0, 0.25, 0.5, 0.75, 1\}$.

[Figure 5: Exercise 17 (4): $W_{m,t} = \sum_{n=4}^m \Delta_n(t) Z_n$ for $m = 7$, $m = 15$ (left) and $m = 255$ (right).]

Solution to 18: (1) $U$ is $\mathcal{H}$-measurable by definition. Also $\mathbb{E}(Z \mathbb{1}_A) = \mathbb{E}(X \mathbb{1}_A)$ for all $A \in \mathcal{G}$ and $\mathbb{E}(U \mathbb{1}_A) = \mathbb{E}(Z \mathbb{1}_A)$ for all $A \in \mathcal{H}$ by definition. Hence
$$\mathbb{E}(U \mathbb{1}_A) = \mathbb{E}(Z \mathbb{1}_A) = \mathbb{E}(X \mathbb{1}_A), \quad A \in \mathcal{H}.$$
So $U$ is a version of the conditional expectation $\mathbb{E}(X | \mathcal{H})$.

(2) It is clear that $X$ is $\mathcal{G}$-measurable and that $\mathbb{E}(X \mathbb{1}_A) = \mathbb{E}(X \mathbb{1}_A)$ for all $A \in \mathcal{G}$, so $X$ is indeed a version of the conditional expectation of $X$ w.r.t. $\mathcal{G}$.

The second question is not so easy to prove. If $X = \mathbb{1}_B$ with $B \in \mathcal{G}$ then
$$\mathbb{E}(X\, \mathbb{E}(Z|\mathcal{G})\, \mathbb{1}_A) = \mathbb{E}(\mathbb{E}(Z|\mathcal{G})\, \mathbb{1}_{A \cap B}) = \mathbb{E}(Z \mathbb{1}_{A \cap B}) = \mathbb{E}(X Z \mathbb{1}_A), \quad \text{for all } A \in \mathcal{G}.$$
If $X$ is a simple random variable (c.f. (1.1)), $X = \sum_{i=1}^n c_i \mathbb{1}_{A_i}$ with $A_i \in \mathcal{G}$, then similarly (use the same reasoning as above) $\mathbb{E}(X\, \mathbb{E}(Z|\mathcal{G})\, \mathbb{1}_A) = \mathbb{E}(X Z \mathbb{1}_A)$. If $X$ is $\mathcal{G}$-measurable and nonnegative, then there is a sequence of simple nonnegative r.v. $X_n$ of the above form with $X_n \uparrow X$ and $\mathbb{E}(X_n\, \mathbb{E}(Z|\mathcal{G})\, \mathbb{1}_A) = \mathbb{E}(X_n Z \mathbb{1}_A)$. We have $X_n\, \mathbb{E}(Z|\mathcal{G})\, \mathbb{1}_A \uparrow X\, \mathbb{E}(Z|\mathcal{G})\, \mathbb{1}_A$ and $X_n Z \mathbb{1}_A \uparrow X Z \mathbb{1}_A$, so by monotone convergence
$$\mathbb{E}(X\, \mathbb{E}(Z|\mathcal{G})\, \mathbb{1}_A) = \mathbb{E}(X Z \mathbb{1}_A),$$
showing that $X\, \mathbb{E}(Z|\mathcal{G})$ is a version of the conditional expectation $\mathbb{E}(XZ|\mathcal{G})$ (we haven't shown measurability, but this is clear since $X$ and $\mathbb{E}(Z|\mathcal{G})$ are both $\mathcal{G}$-measurable).

Solution to 19: (1) This can be shown similarly to the Poisson process example ⟪68⟫. (2) $\mathbb{E}(|B_t^2 - t|) \le \mathbb{E}(B_t^2) + t = 2t < \infty$ and for $s \le t$,
$$\mathbb{E}(B_t^2 - t | \mathcal{F}_s) = \mathbb{E}((B_t - B_s)^2 | \mathcal{F}_s) + 2 B_s\, \mathbb{E}(B_t - B_s | \mathcal{F}_s) + B_s^2 - t = \mathbb{E}((B_t - B_s)^2) + B_s^2 - t = B_s^2 - s.$$
(3) Follows similarly if we use $\mathbb{E}(e^{\alpha B_t}) = e^{\alpha^2 t / 2}$.
Solution to 20: (1) $\{\tau > t\} = \{B_s \in (-B, A),\ 0 \le s \le t\} \in \mathcal{F}_t$, so $\tau$ is a stopping time.

(2) $\mathbb{P}(\tau > n + 1) \le \mathbb{P}(|B_{i+1} - B_i| < A + B,\ i = 1, 2, \ldots, n)$ is clear. Letting $p = \mathbb{P}(|B_{i+1} - B_i| < A + B) < 1$ we obtain $\mathbb{P}(\tau > n + 1) \le p^n$. It follows that $\mathbb{P}(\tau = \infty) = \lim_{n\to\infty} \mathbb{P}(\tau > n + 1) = 0$.

(3) $B_{t \wedge \tau}$ is a martingale, so $\mathbb{E}(B_{t \wedge \tau}) = \mathbb{E}(B_0) = 0$. As $t \to \infty$, $B_{t \wedge \tau} \to B_\tau$ almost surely. Moreover $|B_{t \wedge \tau}| \le A + B$, so by dominated convergence $\mathbb{E}(B_\tau) = 0$. But at the same time $\mathbb{E}(B_\tau) = A\, \mathbb{P}(B_\tau = A) - B\, \mathbb{P}(B_\tau = -B)$ and $\mathbb{P}(B_\tau = A) = 1 - \mathbb{P}(B_\tau = -B)$, leading to the formulas.

(4) We have $\mathbb{E}(B_{t \wedge \tau}^2) = \mathbb{E}(t \wedge \tau)$. $t \wedge \tau \to \tau$ almost surely and monotonically, so $\mathbb{E}(t \wedge \tau) \to \mathbb{E}(\tau)$. Also $B_{t \wedge \tau}^2 \le (A + B)^2$ and $B_{t \wedge \tau}^2 \to B_\tau^2$, yielding $\mathbb{E}(B_{t \wedge \tau}^2) \to \mathbb{E}(B_\tau^2)$ and hence
$$\mathbb{E}(\tau) = \mathbb{E}(B_\tau^2) = A^2\, \mathbb{P}(B_\tau = A) + B^2 (1 - \mathbb{P}(B_\tau = A)) = AB.$$

(5) $\mathbb{P}(\tau_A < \infty) \ge \mathbb{P}(B_\tau = A) = \frac{B}{A + B}$, which tends to 1 as $B \to \infty$.

(6) We have $\mathbb{E}(M_{t \wedge \tau_A}) = \mathbb{E}(M_0) = 1$. Again, $M_{t \wedge \tau_A} \le e^{\alpha A}$ and $t \wedge \tau_A \to \tau_A$, hence (by dominated convergence) $\mathbb{E}(M_{\tau_A}) = 1$.

(7) It follows that $\mathbb{E}(e^{\alpha A - \alpha^2 \tau_A / 2}) = 1$, so, letting $\alpha^2/2 = s$,
$$\mathbb{E}(e^{-s \tau_A}) = e^{-A\sqrt{2s}}.$$

(8) No, since
$$\mathbb{E}(\tau_A) = -\frac{d}{ds}\, e^{-A\sqrt{2s}}\Big|_{s=0} = \infty.$$
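Part (7) can also be checked by simulation (a sketch, not from the notes): we simulate the hitting time $\tau_A$ on a grid and truncate at a large horizon $T$; since $e^{-st}$ is negligible for large $t$, the truncation bias is small, while the grid introduces a slight overshoot bias.

```python
import numpy as np

# Monte Carlo for E exp(-s tau_A) = exp(-A sqrt(2 s)), here A = s = 1.
rng = np.random.default_rng(8)
A, s, dt, T, M = 1.0, 1.0, 2e-3, 40.0, 3000
n = int(T / dt)
vals = np.empty(M)
for m in range(M):
    B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))
    hit = np.argmax(B >= A)           # first index with B >= A (0 if never)
    if B[hit] >= A:
        vals[m] = np.exp(-s * (hit + 1) * dt)
    else:
        vals[m] = 0.0                 # not hit before T: contributes ~ 0
print(vals.mean(), np.exp(-A * np.sqrt(2 * s)))   # ~ exp(-sqrt(2)) = 0.243
```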

Solution to 21: (1) Clearly $\mathbb{E}\int_0^t B_s\, ds = \int_0^t \mathbb{E}(B_s)\, ds = 0$, and
$$\mathbb{E}\Big[\Big(\int_0^t B_s\, ds\Big)^2\Big] = \int_0^t \int_0^t \mathbb{E}(B_s B_w)\, dw\, ds = \int_0^t \Big(\int_0^s w\, dw + \int_s^t s\, dw\Big)\, ds = \frac{1}{2}\int_0^t s^2\, ds + \int_0^t (st - s^2)\, ds = \frac{1}{3} t^3.$$
Since the mean is zero, the variance equals $t^3/3$.

(2) We have
$$\mathbb{E}\int_0^t B_s^2\, ds = \int_0^t s\, ds = \frac{t^2}{2}.$$
Moreover,
$$\mathbb{E}\Big[\Big(\int_0^t B_s^2\, ds\Big)^2\Big] = \int_0^t \int_0^t \mathbb{E}(B_s^2 B_w^2)\, dw\, ds = 2 \int_0^t \int_0^s \mathbb{E}(B_s^2 B_w^2)\, dw\, ds$$
(this is not completely obvious. Check it!). It follows from the martingale property of $B_t^2 - t$ that $\mathbb{E}(B_s^2 - s | \mathcal{F}_w) = B_w^2 - w$ for $w \le s$ and hence
$$\mathbb{E}(B_s^2 B_w^2 | \mathcal{F}_w) = B_w^2\, \mathbb{E}(B_s^2 | \mathcal{F}_w) = B_w^2 (B_w^2 + s - w).$$
Then $\mathbb{E}(B_s^2 B_w^2) = \mathbb{E}(B_w^2 (B_w^2 + s - w)) = 3w^2 + (s - w)w = 2w^2 + sw$. Alternatively you may calculate
$$\mathbb{E}(B_s^2 B_w^2) = \mathbb{E}((B_s - B_w)^2)\, \mathbb{E}(B_w^2) + 2\, \mathbb{E}((B_s - B_w) B_w^3) + \mathbb{E}(B_w^4) = (s - w)w + 3w^2 = 2w^2 + sw.$$
It follows that
$$\mathbb{E}\Big[\Big(\int_0^t B_s^2\, ds\Big)^2\Big] = 2 \int_0^t \int_0^s (2w^2 + sw)\, dw\, ds = \frac{4}{3} \int_0^t s^3\, ds + \int_0^t s^3\, ds = \frac{1}{3} t^4 + \frac{1}{4} t^4 = \frac{7}{12} t^4.$$
Hence the variance is $\frac{7}{12} t^4 - \big(\frac{t^2}{2}\big)^2 = \frac{1}{3} t^4$.
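These moments are easy to confirm by Monte Carlo; a sketch (sizes, seed and $t = 1$ are arbitrary choices):

```python
import numpy as np

# Var(int_0^t B ds) = t^3/3,  E int_0^t B^2 ds = t^2/2,
# Var(int_0^t B^2 ds) = 7t^4/12 - t^4/4 = t^4/3, here with t = 1.
rng = np.random.default_rng(9)
t, N, M = 1.0, 1000, 5000
dB = rng.normal(0.0, np.sqrt(t / N), size=(M, N))
B = np.cumsum(dB, axis=1)
I1 = B.sum(axis=1) * (t / N)          # int B_s ds per path
I2 = (B ** 2).sum(axis=1) * (t / N)   # int B_s^2 ds per path
print(I1.var(), t ** 3 / 3)
print(I2.mean(), t ** 2 / 2)
print(I2.var(), t ** 4 / 3)
```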

Solution to 22: Note that both stochastic integrals define martingales and hence their expectation is zero (and hence the variance is equal to the second moment).

(1) We have
$$\mathrm{Var}\Big(\int_0^t \sqrt{|B_s|}\, dB_s\Big) = \mathbb{E}\int_0^t |B_s|\, ds = \int_0^t \mathbb{E}(|B_s|)\, ds.$$
For a normal random variable $Z$ with mean 0 and variance $s$, $\mathbb{E}(|Z|) = \sqrt{2s/\pi}$. Hence
$$\int_0^t \mathbb{E}(|B_s|)\, ds = \sqrt{\frac{2}{\pi}} \int_0^t \sqrt{s}\, ds = \frac{2}{3}\sqrt{\frac{2}{\pi}}\, t^{3/2}.$$

(2) Here
$$\mathrm{Var}\Big(\int_0^t (B_s + s)^2\, dB_s\Big) = \mathbb{E}\int_0^t (B_s + s)^4\, ds = \int_0^t \mathbb{E}((B_s + s)^4)\, ds.$$
We end up with
$$\mathbb{E}((B_s + s)^4) = \mathbb{E}(B_s^4) + 4s\, \mathbb{E}(B_s^3) + 6s^2\, \mathbb{E}(B_s^2) + 4s^3\, \mathbb{E}(B_s) + s^4 = 3s^2 + 6s^3 + s^4,$$
so the variance equals $\int_0^t (3s^2 + 6s^3 + s^4)\, ds = t^3 + \frac{3}{2} t^4 + \frac{1}{5} t^5$.
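A simulation sketch for (1), interpreting the integrand as $\sqrt{|B_s|}$ as in the statement above (sizes and seed arbitrary):

```python
import numpy as np

# Ito isometry check:
#   Var(int_0^t sqrt(|B_s|) dB_s) = int_0^t E|B_s| ds
#                                 = (2/3) sqrt(2/pi) t^{3/2}.
rng = np.random.default_rng(10)
t, N, M = 1.0, 1000, 5000
dB = rng.normal(0.0, np.sqrt(t / N), size=(M, N))
B = np.hstack([np.zeros((M, 1)), np.cumsum(dB, axis=1)])
I = np.sum(np.sqrt(np.abs(B[:, :-1])) * dB, axis=1)  # left-endpoint sums
print(I.var(), (2 / 3) * np.sqrt(2 / np.pi) * t ** 1.5)
```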

Solution to 23: $\nu_n$ is increasing a.s. Moreover $f_n(\omega, t) = f(B_t(\omega))\, \mathbb{1}_{t < \nu_n}$ is adapted and measurable, since $f_n(\omega, t)$ is a right-continuous process. Also (and this is the idea of localization)
$$\int_0^T \mathbb{E}\big[(f(B_t(\omega))\, \mathbb{1}_{t < \nu_n})^2\big]\, dt \le \int_0^T \Big(\sup_{|w| \le n} f(w)\Big)^2\, dt < \infty,$$
the sup being finite since $f$ is continuous. Finally, $B_t$ is a.s. continuous on $[0, T]$ and hence bounded by some (random) constant $C$ (with probability one). If $n > C$ then $\nu_n = T \wedge \min\{t : |B_t| \ge n\} = T$.
t t 0

Var(Xt ) = Hence

f2 (u) du =

1 t . du = 2 1+t (1 + u)

t
0

1 dB s (u). 1+u

Solution to 25: $X_0 = 0$ is clear. Clearly $X_t = Z_{\tau(t)}$ is continuous, because $t \mapsto \tau(t)$ is continuous. Also $\tau(t)$ is strictly increasing, so $X$ is a time change of $Z_t$. Using Proposition ⟪104⟫ it follows that $Z_t = \int_0^t f(s)\, dB_s$ is a mean zero Gaussian process with independent increments and $\mathrm{Cov}(Z_t, Z_s) = \int_0^{t \wedge s} f^2(u)\, du$. Hence for $s < t$
$$\mathrm{Cov}(X_t, X_s) = \int_0^{\tau(s)} f^2(u)\, du = s.$$
So $\mathrm{Cov}(X_t, X_s) = t \wedge s$. It follows that $X_t$ is a standard Brownian motion.
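A concrete instance makes the time change tangible (a sketch; the choice $f(s) = e^{s/2}$ is mine): then $\int_0^u f^2 = e^u - 1$, so $\tau(t) = \log(1 + t)$, and the variance of $X_t$ should grow linearly in $t$.

```python
import numpy as np

# X_t = int_0^{tau(t)} e^{s/2} dB_s with tau(t) = log(1 + t) should be
# a standard Brownian motion, i.e. Var(X_t) = t.
rng = np.random.default_rng(11)
T = np.log(3.0)                       # enough clock time for t up to 2
N, M = 2000, 5000
s = np.linspace(0.0, T, N + 1)[:-1]
dB = rng.normal(0.0, np.sqrt(T / N), size=(M, N))
Z = np.cumsum(np.exp(s / 2) * dB, axis=1)   # Z_u = int_0^u e^{s/2} dB_s
for t in (0.5, 1.0, 2.0):
    u = int(np.log(1 + t) / T * N) - 1      # grid index of tau(t)
    print(t, Z[:, u].var())                 # ~ t
```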

D LITERATURE

Measure theory
[L1] HALMOS: Measure theory, Van Nostrand (1950).

Probability theory
[L2] FELLER: Introduction to probability I & II, Wiley (1969). The classic book about probability theory.
[L3] KALLENBERG: Foundations of Modern Probability, Springer (2002). You will find almost everything about probability theory in this book. It is written on a more advanced level.

Stochastic processes & applied probability
[L4] KARLIN & TAYLOR: A First Course in Stochastic Processes, Academic Press (1975).
[L5] KARLIN & TAYLOR: A Second Course in Stochastic Processes, Academic Press (1981).
[L6] ROSS: Introduction to Probability Models, Elsevier/Academic Press (2006). More elementary approach to probability models. No measure theory involved.

Stochastic integration
[L7] STEELE: Stochastic calculus and financial applications, Springer (2001). Is used for this lecture.
[L8] PROTTER: Stochastic Integration and Differential Equations, Springer (2004). The standard reference to stochastic integration.
