
Advanced Probabilistic Numerical Methods in Finance

M2MO, Term 3, February-March 2020

Jean-François Chassagneux∗

- Course documents: on Moodle

∗ Université de Paris, U.F.R. de Mathématiques, chassagneux@lpsm.paris

References

[1] Frédéric Abergel and Rémi Tachet. A nonlinear partial integro-differential equation from mathematical finance. Discrete and Continuous Dynamical Systems - Series A, 27(3):907–917, 2010.

[2] Yacine Aït-Sahalia. Testing continuous-time models of the spot interest rate. The Review of Financial Studies, 9(2):385–426, 1996.

[3] Aurélien Alfonsi et al. Affine Diffusions and Related Processes: Simulation, Theory and Applications, volume 6. Springer, 2015.

[4] Aurélien Alfonsi, Benjamin Jourdain, and Arturo Kohatsu-Higa. Pathwise optimal transport bounds between a one-dimensional diffusion and its Euler scheme. The Annals of Applied Probability, 24(3):1049–1080, 2014.

[5] Vlad Bally and Gilles Pagès. Error analysis of the optimal quantization algorithm for obstacle problems. Stochastic Processes and their Applications, 106(1):1–40, 2003.

[6] Vlad Bally, Gilles Pagès, et al. A quantization algorithm for solving multidimensional discrete-time optimal stopping problems. Bernoulli, 9(6):1003–1049, 2003.

[7] Vlad Bally and Denis Talay. The law of the Euler scheme for stochastic differential equations. Probability Theory and Related Fields, 104(1):43–60, 1996.

[8] Denis Belomestny et al. Solving optimal stopping problems via empirical dual optimization. The Annals of Applied Probability, 23(5):1988–2019, 2013.

[9] Denis Belomestny and John Schoenmakers. Projected particle methods for solving McKean–Vlasov stochastic differential equations. SIAM Journal on Numerical Analysis, 56(6):3169–3195, 2018.

[10] Jean-Michel Bismut. Conjugate convex functions in optimal stochastic control. Journal of Mathematical Analysis and Applications, 44(2):384–404, 1973.

[11] Jean-Michel Bismut. Contrôle des systèmes linéaires quadratiques: applications de l'intégrale stochastique. In Séminaire de Probabilités XII, pages 180–264. Springer, 1978.

[12] François Bolley. Separability and completeness for the Wasserstein distance. In Séminaire de Probabilités XLI, pages 371–377. Springer, 2008.

[13] Mireille Bossy and Denis Talay. Convergence rate for the approximation of the limit law of weakly interacting particles: application to the Burgers equation. The Annals of Applied Probability, 6(3):818–861, 1996.

[14] Mireille Bossy and Denis Talay. A stochastic particle method for the McKean–Vlasov and the Burgers equation. Mathematics of Computation of the American Mathematical Society, 66(217):157–192, 1997.

[15] Bruno Bouchard, Jean-François Chassagneux, et al. Fundamentals and Advanced Techniques in Derivatives Hedging. Springer, 2016.

[16] Bruno Bouchard, Ivar Ekeland, and Nizar Touzi. On the Malliavin approach to Monte Carlo approximation of conditional expectations. Finance and Stochastics, 8(1):45–71, 2004.

[17] Bruno Bouchard and Nizar Touzi. Discrete-time approximation and Monte-Carlo simulation of backward stochastic differential equations. Stochastic Processes and their Applications, 111(2):175–206, 2004.

[18] Gerard Brunick and Steven Shreve. Mimicking an Itô process by a solution of a stochastic differential equation. The Annals of Applied Probability, 23(4):1584–1628, 2013.

[19] René Carmona, François Delarue, Gilles-Edouard Espinosa, and Nizar Touzi. Singular forward–backward stochastic differential equations and emissions derivatives. The Annals of Applied Probability, 23(3):1086–1128, 2013.

[20] René Carmona, François Delarue, et al. Probabilistic Theory of Mean Field Games with Applications I-II. Springer, 2018.

[21] René Carmona, Jean-Pierre Fouque, and Li-Hsien Sun. Mean field games and systemic risk. Communications in Mathematical Sciences, 13(4):911–933, 2015.

[22] Jean-François Chassagneux. Linear multistep schemes for BSDEs. SIAM Journal on Numerical Analysis, 52(6):2815–2836, 2014.

[23] Jean-François Chassagneux, Adrien Richou, et al. Numerical simulation of quadratic BSDEs. The Annals of Applied Probability, 26(1):262–304, 2016.

[24] Jean-François Chassagneux, Lukasz Szpruch, and Alvin Tse. Weak quantitative propagation of chaos via differential calculus on the space of measures. arXiv preprint arXiv:1901.02556, 2019.

[25] Emmanuelle Clément, Damien Lamberton, and Philip Protter. An analysis of a least squares regression method for American option pricing. Finance and Stochastics, 6(4):449–471, 2002.

[26] D. Crisan and K. Manolarakis. Solving backward stochastic differential equations using the cubature method: application to nonlinear pricing. In Progress in Analysis and its Applications, pages 389–397. World Scientific, 2010.

[27] Dan Crisan and Konstantinos Manolarakis. Solving backward stochastic differential equations using the cubature method: application to nonlinear pricing. SIAM Journal on Financial Mathematics, 3(1):534–571, 2012.

[28] Dan Crisan, Konstantinos Manolarakis, et al. Second order discretization of backward SDEs and simulation with the cubature method. The Annals of Applied Probability, 24(2):652–678, 2014.

[29] Dan Crisan, Konstantinos Manolarakis, and Nizar Touzi. On the Monte Carlo simulation of BSDEs: an improvement on the Malliavin weights. Stochastic Processes and their Applications, 120(7):1133–1158, 2010.

[30] P.E. Chaudru de Raynal and C.A. Garcia Trillos. A cubature based algorithm to solve decoupled McKean–Vlasov forward–backward stochastic differential equations. Stochastic Processes and their Applications, 125(6):2206–2255, 2015.

[31] Nicole El Karoui, Shige Peng, and Marie Claire Quenez. Backward stochastic differential equations in finance. Mathematical Finance, 7(1):1–71, 1997.

[32] Michael B. Giles. Multilevel Monte Carlo path simulation. Operations Research, 56(3):607–617, 2008.

[33] Paul Glasserman. Monte Carlo Methods in Financial Engineering, volume 53. Springer Science & Business Media, 2013.

[34] E. Gobet and P. Turkedjiev. Adaptive importance sampling in least-squares Monte Carlo algorithms for backward stochastic differential equations. Stochastic Processes and their Applications, 127(4):1171–1203, 2017.

[35] Emmanuel Gobet. Weak approximation of killed diffusion using Euler schemes. Stochastic Processes and their Applications, 87(2):167–197, 2000.

[36] Emmanuel Gobet. Monte-Carlo Methods and Stochastic Processes: From Linear to Non-linear. CRC Press, 2016.

[37] Emmanuel Gobet, Jean-Philippe Lemor, Xavier Warin, et al. A regression-based Monte Carlo method to solve backward stochastic differential equations. The Annals of Applied Probability, 15(3):2172–2202, 2005.

[38] Emmanuel Gobet, José G. López-Salas, Plamen Turkedjiev, and Carlos Vázquez. Stratified regression Monte-Carlo scheme for semilinear PDEs and BSDEs with large scale parallelization on GPUs. SIAM Journal on Scientific Computing, 38(6):C652–C677, 2016.

[39] Emmanuel Gobet and Plamen Turkedjiev. Linear regression MDP scheme for discrete backward stochastic differential equations under general conditions. Mathematics of Computation, 85(299):1359–1391, 2016.

[40] Emmanuel Gobet, Plamen Turkedjiev, et al. Approximation of backward stochastic differential equations using Malliavin weights and least-squares regression. Bernoulli, 22(1):530–562, 2016.

[41] S. Graf, H. Luschgy, et al. Asymptotics of the quantization errors for self-similar probabilities. Real Analysis Exchange, 26(2):795–810, 2000.

[42] Carl Graham, Thomas G. Kurtz, Sylvie Méléard, Philip Protter, and Mario Pulvirenti. Probabilistic Models for Nonlinear Partial Differential Equations: Lectures Given at the 1st Session of the Centro Internazionale Matematico Estivo (CIME) Held in Montecatini Terme, Italy, May 22–30, 1995. Springer, 2006.

[43] J. Guyon and P. Henry-Labordère. Non linear pricing. CRC Financial Mathematics, 2014.

[44] Julien Guyon and Pierre Henry-Labordère. The smile calibration problem solved. Available at SSRN 1885032, 2011.

[45] Julien Guyon and Pierre Henry-Labordère. Being particular about calibration. Risk, 25(1):88, 2012.

[46] István Gyöngy. Mimicking the one-dimensional marginal distributions of processes having an Itô differential. Probability Theory and Related Fields, 71(4):501–516, 1986.

[47] Martin B. Haugh and Leonid Kogan. Pricing American options: a duality approach. Operations Research, 52(2):258–270, 2004.

[48] Martin Hutzenthaler, Arnulf Jentzen, and Peter E. Kloeden. Strong and weak divergence in finite time of Euler's method for stochastic differential equations with non-globally Lipschitz continuous coefficients. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 467(2130):1563–1576, 2010.

[49] Benjamin Jourdain and Alexandre Zhou. Existence of a calibrated regime switching local volatility model and new fake Brownian motions. arXiv preprint arXiv:1607.00077, 2016.

[50] Rafail Khasminskii. Stochastic Stability of Differential Equations, volume 66. Springer Science & Business Media, 2011.

[51] P. Kloeden and E. Platen. Numerical Solution of Stochastic Differential Equations. Springer, 1992.

[52] Vassili N. Kolokoltsov. Nonlinear Markov Processes and Kinetic Equations, volume 182. Cambridge University Press, 2010.

[53] Hiroshi Kunita. Stochastic Flows and Stochastic Differential Equations, volume 24. Cambridge University Press, 1997.

[54] Daniel Lacker, Mykhaylo Shkolnikov, and Jiacheng Zhang. Inverting the Markovian projection, with an application to local stochastic volatility models. arXiv preprint arXiv:1905.06213, 2019.

[55] V. Lemaire and G. Pagès. Multilevel Richardson-Romberg extrapolation. ArXiv e-prints, January 2014.

[56] Jean-Philippe Lemor, Emmanuel Gobet, Xavier Warin, et al. Rate of convergence of an empirical regression method for solving generalized backward stochastic differential equations. Bernoulli, 12(5):889–916, 2006.

[57] A. Lipton. The vol-smile problem. Risk, 15:61–65, 2002.

[58] Yating Liu. Optimal Quantization: Limit Theorems, Clustering and Simulation of the McKean-Vlasov Equation. PhD thesis, Sorbonne Université, UPMC; Laboratoire de Probabilités, Statistique et ..., 2019.

[59] Francis A. Longstaff and Eduardo S. Schwartz. Valuing American options by simulation: a simple least-squares approach. The Review of Financial Studies, 14(1):113–147, 2001.

[60] Harald Luschgy, Gilles Pagès, et al. Functional quantization rate and mean regularity of processes with an application to Lévy processes. The Annals of Applied Probability, 18(2):427–469, 2008.

[61] Jin Ma and Jiongmin Yong. Forward-Backward Stochastic Differential Equations and their Applications. Number 1702. Springer Science & Business Media, 1999.

[62] Gilles Pagès. Quadratic optimal functional quantization of stochastic processes and numerical applications. In Monte Carlo and Quasi-Monte Carlo Methods 2006, pages 101–142. Springer, 2008.

[63] Gilles Pagès. Numerical Probability: An Introduction with Applications to Finance. Springer, 2018.

[64] Gilles Pagès and Abass Sagna. Improved error bounds for quantization based numerical schemes for BSDE and nonlinear filtering. Stochastic Processes and their Applications, 128(3):847–883, 2018.

[65] Étienne Pardoux and Shige Peng. Adapted solution of a backward stochastic differential equation. Systems & Control Letters, 14(1):55–61, 1990.

[66] Étienne Pardoux and Shige Peng. Backward stochastic differential equations and quasilinear parabolic partial differential equations. In Stochastic Partial Differential Equations and their Applications, pages 200–217. Springer, 1992.

[67] Étienne Pardoux and Aurel Răşcanu. Stochastic Differential Equations, Backward SDEs, Partial Differential Equations, volume 69. Springer, 2014.

[68] Huyên Pham. Continuous-time Stochastic Control and Optimization with Financial Applications, volume 61. Springer Science & Business Media, 2009.

[69] Vladimir Piterbarg. Markovian projection method for volatility calibration. Available at SSRN 906473, 2006.

[70] Daniel Revuz and Marc Yor. Continuous Martingales and Brownian Motion, volume 293. Springer Science & Business Media, 2013.

[71] Leonard C.G. Rogers. Monte Carlo valuation of American options. Mathematical Finance, 12(3):271–286, 2002.

[72] Alain-Sol Sznitman. Topics in propagation of chaos. In École d'été de probabilités de Saint-Flour XIX–1989, pages 165–251. Springer, 1991.

[73] Denis Talay and Luciano Tubaro. Expansion of the global error for numerical schemes solving stochastic differential equations. Stochastic Analysis and Applications, 8(4):483–509, 1990.

[74] John N. Tsitsiklis and Benjamin Van Roy. Regression methods for pricing complex American-style options. IEEE Transactions on Neural Networks, 12(4):694–703, 2001.

[75] Cédric Villani. Optimal Transport: Old and New, volume 338. Springer Science & Business Media, 2008.

[76] Jianfeng Zhang et al. A numerical scheme for BSDEs. The Annals of Applied Probability, 14(1):459–488, 2004.

[77] Alexandre Zhou. Étude théorique et numérique de problèmes non linéaires au sens de McKean en finance. PhD thesis, Paris Est, 2018.
Notations

$\mathcal{S}^2_c$: the space of stochastic processes $X$ satisfying

$$\mathbb{E}\Big[\sup_{t\in[0,T]} |X_t|^2\Big] < \infty$$

and whose sample paths are continuous.
Part I

Handouts

1 Introduction

These handouts have a non-empty intersection with some textbooks: [51, 36, 63, 33].

2 Review of the linear case

2.1 Mathematical & Financial Framework

• $(\Omega, \mathcal{A}, \mathbb{P})$ is a probability space supporting a $d$-dimensional Brownian motion $W$; $\mathbb{F}$ is the filtration generated by $W$.

• $T > 0$ is a terminal date.

2.1.1 SDEs

• We consider on $[0,T]$ the following SDE

$$dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dW_t, \tag{2.1}$$

where $b$ and $\sigma$ are measurable functions.

• (HL)(i): $\sigma$ and $b$ are Lipschitz continuous in time and space:

$$|b(t,x) - b(t',x')| + |\sigma(t,x) - \sigma(t',x')| \le L\big(|x - x'| + |t - t'|\big).$$

Theorem 2.1. (Strong existence and uniqueness for SDEs) Under (HL)(i), there exists a unique¹ continuous $\mathbb{F}$-adapted process $X$ taking its values in $\mathbb{R}^d$ such that

$$X_t^i = x_0^i + \int_0^t b^i(s, X_s)\,ds + \sum_{j=1}^d \int_0^t \sigma^{ij}(s, X_s)\,dW_s^j. \tag{2.2}$$

¹ Up to indistinguishability.
2.1.2 Useful estimates under (HL)

We use the Hölder and BDG inequalities and Gronwall's Lemma to obtain them.

• For all $p \ge 1$,

$$\mathbb{E}\Big[\sup_{t\in[0,T]} |X_t|^p\Big] \le C_T\big(1 + \mathbb{E}[|X_0|^p]\big). \tag{2.3}$$

• Time regularity:

$$\max_i\, \mathbb{E}\Big[\sup_{t\in[t_i,t_{i+1}]} |X_t - X_{t_i}|^2\Big] \le C\,|t_{i+1} - t_i|. \tag{2.4}$$

Stochastic flow. Observe that we can define, for all $(s,x) \in [0,T]\times\mathbb{R}^d$, the solution to

$$X_t^{s,x} = x + \int_s^t b(r, X_r^{s,x})\,dr + \int_s^t \sigma(r, X_r^{s,x})\,dW_r, \quad s \le t \le T,$$

and, by convention, $X_t^{s,x} = x$ for $t \in [0,s]$.

The flow of the SDE is the mapping $(s,x,t) \mapsto X_t^{s,x}$.

It has several important properties:

• Let $1 \le p < \infty$. There exists a constant $C_p$ such that

$$\mathbb{E}\Big[\sup_{t\in[0,T]} |X_t^{s,x}|^p\Big]^{\frac1p} \le C_p(1 + |x|). \tag{2.5}$$

• Let $2 \le p < \infty$. There exists a constant $C_p$ such that for all $(s,x,t)$ and $(s',x',t')$,

$$\mathbb{E}\big[|X_t^{s,x} - X_{t'}^{s',x'}|^p\big] \le C_p\Big(|x-x'|^p + (1 + |x|^p \vee |x'|^p)\big(|s-s'|^{\frac p2} + |t-t'|^{\frac p2}\big)\Big). \tag{2.6}$$
• The mapping $(s,x) \mapsto X^{s,x}$, with values in $\mathcal{S}^2_c$, is continuous.

• Let $x \in \mathbb{R}^d$; we observe, for all $0 \le r \le s \le t$,

$$X_t^{r,x} = X_t^{s, X_s^{r,x}}, \quad \mathbb{P}\text{-a.s.}$$

• Let $(s,x) \in [0,T]\times\mathbb{R}^d$. Then $(X_t^{s,x})_{0\le t\le T}$ is a Markov process and, if $f$ is a bounded measurable function, then for $s \le r \le t$

$$\mathbb{E}\big[f(X_t^{s,x})\,\big|\,\mathcal{F}_r\big] = \Lambda(r, X_r^{s,x}) \quad \mathbb{P}\text{-a.s.},$$

with $\Lambda(r,y) = \mathbb{E}[f(X_t^{r,y})]$.
2.1.3 Link with PDEs

• $\mathcal{L}^X$ is the Dynkin operator associated to $X$:

$$\mathcal{L}^X\varphi(t,x) = b(t,x)\,\partial_x\varphi(t,x) + \frac12\,\mathrm{Tr}\big[\partial^2_{xx}\varphi(t,x)\,a(t,x)\big] \tag{2.7}$$

for $\varphi \in C^2$, where $a := \sigma\sigma^\dagger$.

Theorem 2.2. (Feynman-Kac for parabolic PDEs - Cauchy condition) Assume that (HL)(i) holds true, $g \in C_p^0$, and that there is a function $u \in C_p^{1,2}([0,T)\times\mathbb{R}^d) \cap C_p^0([0,T]\times\mathbb{R}^d)$ which is a (classical) solution to the PDE:

$$\begin{cases} \partial_t u + \mathcal{L}^X u = 0 & \text{on } [0,T)\times\mathbb{R}^d, \\ u(T,\cdot) = g(\cdot). \end{cases} \tag{2.8}$$

Then

$$u(t,x) = \mathbb{E}\big[g(X_T^{t,x})\big].$$

↪ uniqueness result...

• In the sequel, we shall sometimes use the following assumption:

(HX∞): $b$ and $\sigma$ are $C_b^\infty$ and, moreover, $\sigma$ is uniformly elliptic: there exists $\varepsilon > 0$ such that

$$\upsilon^\dagger \sigma\sigma^\dagger(x)\,\upsilon \ge \varepsilon|\upsilon|^2 > 0 \tag{2.9}$$

for all $\upsilon \in \mathbb{R}^d \setminus \{0\}$.

• This assumption allows one to prove the smoothness of the function $u$ (for $t < T$) whatever the smoothness of the terminal condition.
2.1.4 Financial setting

...Pricing a European option in a "perfect" market!

• fixed interest rate $r$;

• price process $S$ solution to SDE (2.1) with $b^i(x) = r x^i$, i.e.

$$dS_t^i = r S_t^i\,dt + \sum_{j=1}^d \sigma^{ij}(S_t)\,dW_t^j$$

(we work directly under the risk-neutral probability).

• The super-replication price is defined as

$$p(G) := \inf\{p \in \mathbb{R} : \exists\,\phi \in \mathcal{A}_b \text{ s.t. } V_T^{p,\phi} \ge G\}. \tag{2.10}$$

$G$ is the random payoff, $\mathcal{A}_b$ is an admissible set of self-financing strategies (investment in the risky assets), and $V^{p,\phi}$ is the portfolio value with initial value $p$ following the strategy $\phi$.

• In a "perfect market",

$$p(G) = \mathbb{E}^{\mathbb{Q}}\big[e^{-rT}G\big] = V_0^{p(G),\phi^*} \tag{2.11}$$

for some optimal strategy $\phi^*$ (under some conditions on $G$ too).

↪ linear pricing rule in a complete market.

• When $G = g(S_T)$ (vanilla option), one can show that

$$p(G) = u(0, S_0),$$

where $u$ is the solution (in some sense) to the following parabolic PDE, see e.g. [15]:

$$\begin{cases} \partial_t u + \mathcal{L}^S u = ru & \text{on } [0,T)\times\mathbb{R}^d, \\ u(T,\cdot) = g(\cdot). \end{cases} \tag{2.12}$$

↪ Call: $g(x) = [x-K]^+$; Put: $g(x) = [K-x]^+$, etc.
2.2 Euler Scheme for SDEs

To compute the option price, one can solve the PDE numerically or try to evaluate the expectation, using e.g. Monte Carlo methods. We will indeed solve the PDE, but essentially using probabilistic methods, i.e. methods inspired by the representation of the value function as an expectation.

2.2.1 Definition and first properties

• Generally, it is not possible to simulate "exactly" the solution of an SDE.

• One then has to consider an approximation of $X$ given by a time discretisation of the SDE.

• The simplest time-discrete approximation of an SDE is the Euler, or Euler-Maruyama, approximation.

Definition of the Euler scheme

• We choose a discretisation/partition $\pi$ of $[0,T]$:

$$\pi = \{0 =: t_0 < t_1 < \dots < t_i < \dots < t_n := T\}.$$

We denote $h_i := t_{i+1} - t_i$ and $h = \max_i h_i$.

• Most of the time, we work with a constant time step $h = T/n$.

Definition 2.1. An Euler approximation of equation (2.1) associated with the partition $\pi$ is a discrete-time process $X^\pi = \{X_t^\pi,\ t \in \pi\}$ satisfying the iterative scheme

$$X_{t_{i+1}}^\pi = X_{t_i}^\pi + b\big(t_i, X_{t_i}^\pi\big)(t_{i+1} - t_i) + \sigma\big(t_i, X_{t_i}^\pi\big)\big(W_{t_{i+1}} - W_{t_i}\big) \tag{2.13}$$

for $i = 0, 1, \dots, n-1$, with initial value $X_0^\pi = X_0$.
Remark 2.1.

(i) If $\sigma \equiv 0$ then (2.13) reduces to the deterministic Euler scheme for ODEs.

(ii) $W_{t_{i+1}} - W_{t_i}$ is a Gaussian random variable with zero mean and variance $t_{i+1} - t_i = h$. Hence, in order to generate the increments $W_{t_{i+1}} - W_{t_i}$ of the Brownian motion $W$, we can use a sequence $(G_i)_{i\ge1}$ of independent Gaussian pseudo-random numbers $G_i \sim \mathcal{N}(0, h)$.
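As an illustration, here is a minimal NumPy sketch of this simulation; the helper name `euler_maruyama` and the Black-Scholes-type coefficients in the example are placeholders, not part of the course material:

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T, n, n_paths, rng=None):
    """Simulate n_paths of the Euler scheme (2.13) on a uniform grid t_i = i*T/n.

    b, sigma : functions (t, x) -> drift / diffusion coefficient (1-dimensional here).
    Returns an array of shape (n_paths, n+1) holding X^pi at the grid times.
    """
    rng = rng or np.random.default_rng()
    h = T / n
    X = np.empty((n_paths, n + 1))
    X[:, 0] = x0
    for i in range(n):
        t = i * h
        dW = rng.normal(0.0, np.sqrt(h), size=n_paths)  # G_i ~ N(0, h)
        X[:, i + 1] = X[:, i] + b(t, X[:, i]) * h + sigma(t, X[:, i]) * dW
    return X

# Example: Black-Scholes dynamics dS = r S dt + vol S dW (illustrative values)
paths = euler_maruyama(lambda t, x: 0.02 * x, lambda t, x: 0.3 * x,
                       x0=100.0, T=1.0, n=100, n_paths=10_000)
```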

For later use, we introduce a "continuous" version of the Euler scheme.

Definition 2.2. We denote by $\{X_t^\pi,\ t \in [0,T]\}$ the continuous Euler scheme:

- at times $t_i$ belonging to $\pi$, we have

$$X_{t_{i+1}}^\pi = X_{t_i}^\pi + b\big(t_i, X_{t_i}^\pi\big)(t_{i+1} - t_i) + \sigma\big(t_i, X_{t_i}^\pi\big)\big(W_{t_{i+1}} - W_{t_i}\big);$$

- for $t_i < t < t_{i+1}$, we set

$$X_t^\pi = X_{t_i}^\pi + b\big(t_i, X_{t_i}^\pi\big)(t - t_i) + \sigma\big(t_i, X_{t_i}^\pi\big)(W_t - W_{t_i}). \tag{2.14}$$

• Associated differential operator (for later use in proofs):

$$\bar{\mathcal{L}}^{(s,z)}\varphi(t,x) = b(s,z)\,\partial_x\varphi(t,x) + \frac12\,\mathrm{Tr}\big[\partial^2_{xx}\varphi(t,x)\,a(s,z)\big]. \tag{2.15}$$

Moment estimate

Proposition 2.1. Under the assumptions of Theorem 2.1, we have, for $p \ge 2$,

$$\mathbb{E}\Big[\sup_{0\le t\le T} |X_t^\pi|^p\Big] \le C_p(1 + |X_0|^p),$$

where $C_p$ does not depend on $\pi$.

Proof. Cf. Exercise II.4. □
2.2.2 Weak convergence for vanilla options

• The goal is to estimate the error

$$\varepsilon_w := \mathbb{E}[g(X_T^\pi)] - \mathbb{E}[g(X_T)]. \tag{2.16}$$

• (Hr): $g$, $b$, $\sigma$ and $u$ are $C^{1,2}$ with Lipschitz derivatives.

Theorem 2.3. Under (Hr) and (HL),

$$\big|\mathbb{E}[g(X_T^\pi)] - \mathbb{E}[g(X_T)]\big| \le C h. \tag{2.17}$$

Proof. First, we note that, for $t \in [t_i, t_{i+1}]$,

$$\mathbb{E}\big[|f(X_t^\pi) - f(X_{t_i}^\pi)|^q\big]^{\frac1q} \le L_f\sqrt{h}, \tag{2.18}$$

where $L_f$ is the Lipschitz constant of $f$. For later use we introduce

$$L^i := \sup_{t\in[t_i,t_{i+1}]}\big\{|\partial_x u(t,\cdot)|_\infty + |\partial^2_{xx}u(t,\cdot)|_\infty + |\partial^3_{xxx}u(t,\cdot)|_\infty\big\},$$

and we denote by $L$ the Lipschitz constant of $b$ and $a$ (to simplify, in this proof $\sigma$ is bounded).

We observe that

$$\varepsilon_w = \sum_{i=0}^{n-1}\mathbb{E}\big[u(t_{i+1}, X_{t_{i+1}}^\pi) - u(t_i, X_{t_i}^\pi)\big].$$

Applying Itô's formula, we have

$$\mathbb{E}\big[u(t_{i+1}, X_{t_{i+1}}^\pi) - u(t_i, X_{t_i}^\pi)\big] = \int_{t_i}^{t_{i+1}}\mathbb{E}\Big[\partial_t u(t, X_t^\pi) + b(X_{t_i}^\pi)\partial_x u(t, X_t^\pi) + \frac12\sigma^2(X_{t_i}^\pi)\partial^2_{xx}u(t, X_t^\pi)\Big]\,dt.$$

Using the PDE satisfied by $u$, we get

$$\mathbb{E}\big[u(t_{i+1}, X_{t_{i+1}}^\pi) - u(t_i, X_{t_i}^\pi)\big] = \int_{t_i}^{t_{i+1}}\mathbb{E}\Big[\{b(X_{t_i}^\pi) - b(X_t^\pi)\}\partial_x u(t, X_t^\pi) + \frac12\{a(X_{t_i}^\pi) - a(X_t^\pi)\}\partial^2_{xx}u(t, X_t^\pi)\Big]\,dt. \tag{2.19}$$

For the first term in the RHS we compute

$$\big|\mathbb{E}\big[\{b(X_{t_i}^\pi) - b(X_t^\pi)\}\partial_x u(t, X_t^\pi)\big]\big| = \big|\mathbb{E}\big[\{b(X_{t_i}^\pi) - b(X_t^\pi)\}\{\partial_x u(t, X_t^\pi) - \partial_x u(t, X_{t_i}^\pi)\} + \{b(X_{t_i}^\pi) - b(X_t^\pi)\}\partial_x u(t, X_{t_i}^\pi)\big]\big|$$

$$\le L L^i h_i + \big|\mathbb{E}\big[\{b(X_{t_i}^\pi) - b(X_t^\pi)\}\partial_x u(t, X_{t_i}^\pi)\big]\big|,$$

where we used (2.18) and the Cauchy-Schwarz inequality to get the upper bound for the first term in the RHS of the above inequality. For the second term, we apply Itô's formula:

$$\mathbb{E}\big[\{b(X_{t_i}^\pi) - b(X_t^\pi)\}\partial_x u(t, X_{t_i}^\pi)\big] = -\mathbb{E}\Big[\partial_x u(t, X_{t_i}^\pi)\Big(\int_{t_i}^t \bar{\mathcal{L}}^{(t_i, X_{t_i}^\pi)}b(X_s^\pi)\,ds + \int_{t_i}^t \partial_x b(X_s^\pi)\sigma(X_{t_i}^\pi)\,dW_s\Big)\Big].$$

Conditioning with respect to $\mathcal{F}_{t_i}$ cancels the stochastic integral term. Using the assumptions on $b$, we get

$$\big|\mathbb{E}\big[\{b(X_{t_i}^\pi) - b(X_t^\pi)\}\partial_x u(t, X_{t_i}^\pi)\big]\big| \le C L^i h_i.$$

We thus obtain

$$\big|\mathbb{E}\big[\{b(X_{t_i}^\pi) - b(X_t^\pi)\}\partial_x u(t, X_t^\pi)\big]\big| \le C L^i h_i.$$

For the second term in (2.19), we perform similar computations and obtain the same upper bound. This leads to

$$|\varepsilon_w| \le C\sum_{i=0}^{n-1} L^i h_i^2 \tag{2.20}$$

$$\le C h. \tag{2.21}$$
Extensions.

i) Error expansion under more smoothness.

Talay & Tubaro [73] have proved:

Theorem 2.4. If $u \in C^\infty$,

$$\mathbb{E}[g(X_T^\pi)] = \mathbb{E}[g(X_T)] + \sum_{i=1}^n C_i h^i + O(h^{n+1}).$$

↪ Generally one cannot beat order one with an Euler scheme.

↪ Possibility of a Romberg-Richardson extrapolation method: e.g. compute $2\,\mathbb{E}\big[g(X_T^h)\big] - \mathbb{E}\big[g(X_T^{2h})\big]$, which cancels the first-order term in the expansion above.
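A minimal sketch of this extrapolation, reusing the hypothetical `euler_maruyama` helper from Section 2.2.1 (this simple variant uses independent samples for the two step sizes, which keeps the bias cancellation but is not variance-optimal):

```python
import numpy as np

def richardson_price(g, b, sigma, x0, T, n, n_paths):
    """2 E[g(X_T^h)] - E[g(X_T^{2h})]: cancels the first-order term of the
    Talay-Tubaro expansion, leaving an O(h^2) bias (n is assumed even)."""
    X_h  = euler_maruyama(b, sigma, x0, T, n,      n_paths)   # step h = T/n
    X_2h = euler_maruyama(b, sigma, x0, T, n // 2, n_paths)   # step 2h
    return 2.0 * np.mean(g(X_h[:, -1])) - np.mean(g(X_2h[:, -1]))
```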

ii) Measurable terminal condition.

• For the Euler scheme with Brownian increments, and under a condition on the diffusion coefficient, Bally & Talay [7] have proved

$$\mathbb{E}[g(X_T^\pi)] = \mathbb{E}[g(X_T)] + O(h)$$

for $g$ merely measurable and bounded!
2.3 Implementation using Monte Carlo methods

2.3.1 Quick review of the case without bias

• Assuming we know how to sample from $S_T$, the price $p_0 = \mathbb{E}\big[e^{-rT}g(S_T)\big] < \infty$ is approximated by

$$\hat{p}_0^N = \frac1N\sum_{j=1}^N e^{-rT}g(S_T^j),$$

where the $S_T^j$ are i.i.d. random variables with the same law as $S_T$.

↪ empirical mean.

• We observe that $\hat{p}_0^N$ is an unbiased estimator of $p_0$ and, importantly, we have the following result.

Theorem 2.5. (LLN) Since $(g(S_T^j))_j$ is a sequence of integrable i.i.d. random variables,

$$\hat{p}_0^N \to p_0 \quad \text{a.s.}$$

• We need to assess the accuracy of the previous estimate. We can use the following $L^2$-estimate:

Theorem 2.6. Assume that $(g(S_T^j))_j$ is a sequence of square integrable i.i.d. random variables. We have that

$$\mathrm{Var}\big[\hat{p}_0^N\big] = \mathbb{E}\big[|\hat{p}_0^N - p_0|^2\big] = \frac{\mathrm{Var}\big(e^{-rT}g(S_T)\big)}{N}.$$
• Remark: If $g(S_T) \in L^2$, then

$$\hat{V}_0^N := \frac{1}{N-1}\sum_{j=1}^N\big(e^{-rT}g(S_T^j) - \hat{p}_0^N\big)^2$$

is an unbiased estimator of $\mathrm{Var}\big(e^{-rT}g(S_T)\big)$.
• We can also describe the distribution of $\hat{p}_0^N$, at least asymptotically.

Theorem 2.7. (CLT) Assume that $(g(S_T^j))_j$ is a sequence of square integrable i.i.d. random variables, with $\mathrm{Var}[g(S_T)] > 0$, and set $\hat\sigma_0^N := \sqrt{\hat{V}_0^N}$; then

$$\sqrt{N}\,\frac{\hat{p}_0^N - p_0}{\hat\sigma_0^N}\,\mathbf{1}_{\{\hat\sigma_0^N > 0\}} \to \mathcal{N}(0,1) \quad \text{in distribution.}$$

• From this, we can deduce an asymptotic confidence interval.

Corollary 2.1. Under the assumptions of Theorem 2.7, we have

$$\mathbb{P}\Big(\sqrt{N}\,\frac{|\hat{p}_0^N - p_0|}{\hat\sigma_0^N} < z_{\frac\alpha2}\Big) \to 1 - \alpha,$$

where for $\epsilon \in [0,1]$, $z_\epsilon$ denotes the $1-\epsilon$ quantile of the standard normal distribution, i.e. $1 - \epsilon := \mathbb{P}(G < z_\epsilon)$ with $G \sim \mathcal{N}(0,1)$.

Figure 1: Probability density function of N (0, 1) and quantile

↪ This tells us that, for $N$ large enough, the probability that

$$p_0 \in \Big[\hat{p}_0^N - z_{\frac\alpha2}\,\frac{\hat\sigma_0^N}{\sqrt N},\ \hat{p}_0^N + z_{\frac\alpha2}\,\frac{\hat\sigma_0^N}{\sqrt N}\Big]$$

is close to $1 - \alpha$.

• Using concentration inequalities, one can obtain non-asymptotic confidence intervals. A sketch of the asymptotic interval in practice is given below.
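A minimal sketch of the unbiased estimator together with its asymptotic 95% confidence interval; the sampler and payoff below are illustrative placeholders:

```python
import numpy as np

def mc_price_with_ci(discounted_payoffs):
    """Empirical mean with asymptotic 95% confidence half-width (Corollary 2.1)."""
    N = len(discounted_payoffs)
    p_hat = discounted_payoffs.mean()
    sigma_hat = discounted_payoffs.std(ddof=1)   # square root of the unbiased V_0^N
    half_width = 1.96 * sigma_hat / np.sqrt(N)   # z_{alpha/2} for alpha = 5%
    return p_hat, half_width

# Example: exact Black-Scholes sampling of S_T, so there is no bias here
rng = np.random.default_rng()
S0, r, vol, T, K, N = 100.0, 0.02, 0.3, 1.0, 100.0, 100_000
S_T = S0 * np.exp((r - 0.5 * vol**2) * T + vol * np.sqrt(T) * rng.standard_normal(N))
price, hw = mc_price_with_ci(np.exp(-r * T) * np.maximum(S_T - K, 0.0))
```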

2.3.2 The case with bias

• BUT in practice, we do not know how to sample from $S_T$, and we use a discretisation scheme, e.g. the Euler scheme $S_T^\pi$.

• The simulation of the Euler scheme is quite straightforward as soon as one knows how to simulate the Brownian increments!

• We then compute

$$\hat{p}_0^{\pi,N} := \frac1N\sum_{j=1}^N e^{-rT}g(S_T^{\pi,j}).$$

• The error is decomposed into

$$\hat{p}_0^{\pi,N} - p_0 = \varepsilon_{MC} + \varepsilon_w \tag{2.22}$$

with $\varepsilon_{MC} := \hat{p}_0^{\pi,N} - \mathbb{E}\big[e^{-rT}g(S_T^\pi)\big]$ and $\varepsilon_w := \mathbb{E}\big[e^{-rT}g(S_T^\pi)\big] - \mathbb{E}\big[e^{-rT}g(S_T)\big]$.

↪ There is a balance to find between these two errors (see below).

• The study of the MC error can be done as previously, as soon as $\mathbb{E}\big[|g(S_T^\pi)|^2\big] < \infty$.
2.3.3 Convergence of the mean square error for the MC method

• In estimating the quantity $p_0 = \mathbb{E}\big[e^{-rT}g(S_T)\big]$, one faces a tradeoff between reducing the bias ($\varepsilon_w$) and reducing the variance (the statistical error $\varepsilon_{MC}$).
• To take this into account, one tries to minimise the mean square error.

Definition 2.3. For a given quantity $\alpha$ estimated by $\hat\alpha$, we set

$$MSE := \mathbb{E}\big[|\alpha - \hat\alpha|^2\big].$$

We observe

$$MSE = |\alpha - \mathbb{E}[\hat\alpha]|^2 + \mathrm{Var}[\hat\alpha] = \text{bias}^2 + \text{variance}.$$

Indeed,

$$\mathbb{E}\big[|\alpha - \hat\alpha|^2\big] = \mathbb{E}\big[|\alpha - \mathbb{E}[\hat\alpha]|^2\big] + 2\,\mathbb{E}\big[(\alpha - \mathbb{E}[\hat\alpha])(\mathbb{E}[\hat\alpha] - \hat\alpha)\big] + \mathbb{E}\big[|\hat\alpha - \mathbb{E}[\hat\alpha]|^2\big] = |\alpha - \mathbb{E}[\hat\alpha]|^2 + \mathrm{Var}(\hat\alpha),$$

the cross term vanishing since $\alpha - \mathbb{E}[\hat\alpha]$ is deterministic.
Optimising the computational effort

• Assuming weak convergence of order 1, we have

$$MSE \sim_c h^2 + \frac1N.$$

• The computational effort $\mathcal{C}$:

1. for one path, the computational effort is $\sim_c 1/h$;

2. there are $N$ paths to simulate;

↪ overall, $\mathcal{C} \sim_c N/h$.

• The goal is to minimise the MSE taking into account the computational cost:

$$\min_{h,N}\Big(c_1 h^2 + \frac{c_2}{N}\Big) \quad \text{s.t.} \quad \frac{c_3 N}{h} = \mathcal{C}.$$

• This leads to, setting $h^2 \sim_c \frac1N$,

$$\sqrt{MSE} = O\big(\mathcal{C}^{-\frac13}\big).$$

• In other words, to reach a precision $\sqrt{MSE} = O(\epsilon)$, one has

$$\mathcal{C} = O(\epsilon^{-3}).$$

• Note: to give an (asymptotic) confidence interval for $p_0$, one needs to make the discretisation error negligible with respect to the statistical error, and then to choose $N \sim_c h^{-2-\eta}$ for $\eta$ small.
2.4 Implementation using quantisation of Brownian increments

Discretisation of the Brownian increment ($d = 1$ for ease of presentation).

• $\widehat{\Delta W}_i$ stands for a discrete approximation of $\Delta W_i := W_{t_{i+1}} - W_{t_i}$:

$$\begin{cases} \widehat{X}_{t_0}^\pi = X_0, \\ \widehat{X}_{t_{i+1}}^\pi = \widehat{X}_{t_i}^\pi + b(\widehat{X}_{t_i}^\pi)h_i + \sigma(\widehat{X}_{t_i}^\pi)\widehat{\Delta W}_i, \quad 0 \le i < n. \end{cases} \tag{2.23}$$

• Matching moment property up to order $M$: for all $k \le M$,

$$\mathbb{E}\big[(\widehat{\Delta W}_i)^k\big] = \mathbb{E}\big[(\Delta W_i)^k\big]. \tag{2.24}$$

Proposition 2.2. Assume that $g$ and $u(t,\cdot)$ are $C_b^4$ (with bounds uniform in time), that $\widehat{\Delta W}$ has the matching moment property up to order $M = 3$, and that $\mathbb{E}\big[(\widehat{\Delta W}_i)^4\big] = O(|\pi|^2)$; then

$$\hat\varepsilon_w := \big|\mathbb{E}\big[g(\widehat{X}_T^\pi)\big] - \mathbb{E}[g(X_T)]\big| \le C|\pi|. \tag{2.25}$$

Example 2.1. Two-point discretisation:

$$\mathbb{P}\big(\widehat{\Delta W}_i = \pm\sqrt{h_i}\big) = \frac12.$$

One observes that

$$\mathbb{E}\big[(\widehat{\Delta W}_i)^2\big] = \mathbb{E}\big[(\Delta W_i)^2\big] = h_i \quad \text{and} \quad \mathbb{E}\big[(\widehat{\Delta W}_i)^{2k+1}\big] = \mathbb{E}\big[(\Delta W_i)^{2k+1}\big] = 0, \tag{2.26}$$

for all $k \ge 0$, by symmetry.

Proof. 1. We drop the $\pi$ in the proof and denote by $\bar{X}^{t,\xi}$ the Euler scheme with Gaussian increments $\Delta W$ started from $\xi$ at time $t$. In particular, we shall use:

$$\bar{X}_{t_{i+1}}^{t_i,\widehat{X}_{t_i}} = \widehat{X}_{t_i} + \Delta\bar{X}_i \quad \text{with} \quad \Delta\bar{X}_i := \hat{b}_i h_i + \hat\sigma_i\Delta W_i,$$

where $\hat{b}_i := b(\widehat{X}_{t_i})$ and $\hat\sigma_i := \sigma(\widehat{X}_{t_i})$. We observe that

$$\hat\varepsilon_w = \sum_{i=0}^{n-1}\mathbb{E}\big[u(t_{i+1}, \widehat{X}_{t_{i+1}}) - u(t_i, \widehat{X}_{t_i})\big] = \sum_{i=0}^{n-1}\Big(\mathbb{E}\big[u(t_{i+1}, \widehat{X}_{t_{i+1}}) - u(t_{i+1}, \bar{X}_{t_{i+1}}^{t_i,\widehat{X}_{t_i}})\big] + \mathbb{E}\big[u(t_{i+1}, \bar{X}_{t_{i+1}}^{t_i,\widehat{X}_{t_i}}) - u(t_i, \widehat{X}_{t_i})\big]\Big).$$

The second term has already been studied in the proof of Theorem 2.3. We just have to study the first one. We proceed by expanding the smooth function $u(t_{i+1}, \cdot)$ in terms of the increments of both Euler schemes. In what follows, we perform this for a generic $C_b^4$ function $\varphi$.

Introducing $\lambda \mapsto \varphi(\widehat{X}_{t_i} + \lambda\Delta\widehat{X}_i)$, where $\Delta\widehat{X}_i := \hat{b}_i h_i + \hat\sigma_i\widehat{\Delta W}_i$, and performing a Taylor expansion, we compute

$$\varphi(\widehat{X}_{t_i} + \Delta\widehat{X}_i) = \sum_{k=0}^3 \varphi^{(k)}(\widehat{X}_{t_i})\frac{(\Delta\widehat{X}_i)^k}{k!} + \int_0^1 \varphi^{(4)}(\widehat{X}_{t_i} + \lambda\Delta\widehat{X}_i)(\Delta\widehat{X}_i)^4\frac{(1-\lambda)^3}{6}\,d\lambda,$$

and similarly

$$\varphi\big(\bar{X}_{t_{i+1}}^{t_i,\widehat{X}_{t_i}}\big) = \sum_{k=0}^3 \varphi^{(k)}(\widehat{X}_{t_i})\frac{(\Delta\bar{X}_i)^k}{k!} + \int_0^1 \varphi^{(4)}(\widehat{X}_{t_i} + \lambda\Delta\bar{X}_i)(\Delta\bar{X}_i)^4\frac{(1-\lambda)^3}{6}\,d\lambda.$$

Due to the matching moment property and the boundedness assumptions, we get

$$\mathbb{E}\big[\varphi(\widehat{X}_{t_{i+1}})\big] = \mathbb{E}\big[\varphi\big(\bar{X}_{t_{i+1}}^{t_i,\widehat{X}_{t_i}}\big)\big] + O(h_i^2).$$

The proof is then concluded observing that, from the previous expansion,

$$\mathbb{E}\big[u(t_{i+1}, \widehat{X}_{t_{i+1}}) - u(t_{i+1}, \bar{X}_{t_{i+1}}^{t_i,\widehat{X}_{t_i}})\big] = O(h_i^2). \qquad \square$$

Tree methods

• When using discrete random variables for the increments, it is possible to compute the approximation on a tree.

↪ The two-point approximation used in Example 2.1 leads to a binomial tree.

• There is no statistical error, but the computation rapidly becomes intractable (unless special properties such as recombination are exploited). A sketch of the recombining binomial case is given below.
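A minimal sketch of the recombining case ($b = 0$, $\sigma$ constant, as in the finite-difference link that follows): the two-point scheme lives on the lattice $x_0 + k\sigma\sqrt h$ and the price is obtained by backward induction. The function name and parameters are illustrative:

```python
import numpy as np

def binomial_two_point(g, x0, sigma, T, n):
    """Price E[g(X_T)] for dX = sigma dW via the two-point scheme of Example 2.1.

    After m steps the scheme sits on the recombining lattice
    x0 + (2j - m) * sigma * sqrt(h), j = 0..m, and the value follows the
    backward induction u_m(j) = 0.5 * (u_{m+1}(j+1) + u_{m+1}(j)).
    """
    h = T / n
    dx = sigma * np.sqrt(h)
    j = np.arange(n + 1)
    u = g(x0 + (2 * j - n) * dx)      # terminal layer, n+1 nodes
    for _ in range(n):
        u = 0.5 * (u[1:] + u[:-1])    # one backward step; the layer shrinks by one node
    return u[0]

# Example: a put payoff, illustrative parameters
price = binomial_two_point(lambda x: np.maximum(100.0 - x, 0.0),
                           x0=100.0, sigma=20.0, T=1.0, n=200)
```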

Link with finite difference schemes

• Say $b = 0$, $\sigma$ constant: (2.8) ↪ backward heat equation.

• The tree approximation gives, for time $t_i$ and starting at $x$,

$$\bar{u}_i(x) = \mathbb{E}\big[\bar{u}_{i+1}\big(\widehat{X}_{t_{i+1}}^{t_i,x}\big)\big] = \frac12\bar{u}_{i+1}(x + \sigma\sqrt h) + \frac12\bar{u}_{i+1}(x - \sigma\sqrt h),$$

where $\bar{u}_i(\cdot)$ (resp. $\bar{u}_{i+1}(\cdot)$) stands for the approximation of $u(t_i,\cdot)$ (resp. $u(t_{i+1},\cdot)$).

• Rearranging the terms, we obtain

$$\frac{\bar{u}_i(x) - \bar{u}_{i+1}(x)}{h} = \frac{1}{2h}\Big(\bar{u}_{i+1}(x + \sigma\sqrt h) + \bar{u}_{i+1}(x - \sigma\sqrt h) - 2\bar{u}_{i+1}(x)\Big),$$

which corresponds to the (backward explicit) finite difference scheme

$$\frac{\bar{u}_i(x) - \bar{u}_{i+1}(x)}{h} = \frac{\sigma^2}{2\delta^2}\Big(\bar{u}_{i+1}(x + \delta) + \bar{u}_{i+1}(x - \delta) - 2\bar{u}_{i+1}(x)\Big)$$

with space discretisation $\delta = \sigma\sqrt h$ (satisfying the stability condition $\frac{\sigma^2 h}{2\delta^2} \le \frac12$).
Figure 2: MC simulation for a put (Gaussian increments)

Figure 3: MC simulation for a digital (Gaussian increments)

Figure 4: MC simulation for a digital (discrete increments)

Figure 5: Bias for a digital (discrete increments) - no variance
2.5 Strong convergence

2.5.1 Lipschitz case

Proposition 2.3. Assume that $b$ and $\sigma$ are Lipschitz continuous in $t$ and $x$; then

$$\mathbb{E}\Big[\sup_{t\in[0,T]} |X_t - X_t^\pi|^2\Big] \le C|\pi|.$$

Proof. 1. Stability: Let $\delta X_t := X_t - X_t^\pi$. Writing $\bar s := t_i$ for $s \in [t_i, t_{i+1})$, we have, for $0 \le u \le T$,

$$\delta X_u = \int_0^u\big(b(X_{\bar s}) - b(X_{\bar s}^\pi)\big)\,ds + \int_0^u\big(\sigma(X_{\bar s}) - \sigma(X_{\bar s}^\pi)\big)\,dW_s + \mathcal{T}(u) \tag{2.27}$$

with the truncation error $\mathcal{T}(u)$ given by

$$\mathcal{T}(u) = \int_0^u\big(b(X_s) - b(X_{\bar s})\big)\,ds + \int_0^u\big(\sigma(X_s) - \sigma(X_{\bar s})\big)\,dW_s. \tag{2.28}$$

We then compute

$$\mathbb{E}\Big[\sup_{u\le t}|\delta X_u|^2\Big] \le C\,\mathbb{E}\Big[\int_0^t|b(X_{\bar s}) - b(X_{\bar s}^\pi)|^2\,ds + \sup_{u\le t}\Big|\int_0^u\big(\sigma(X_{\bar s}) - \sigma(X_{\bar s}^\pi)\big)\,dW_s\Big|^2 + \sup_{u\le t}|\mathcal{T}(u)|^2\Big].$$

Applying the BDG inequality to the stochastic integral term and then using the Lipschitz property of $b$ and $\sigma$, we get

$$\mathbb{E}\Big[\sup_{u\le t}|\delta X_u|^2\Big] \le C\,\mathbb{E}\Big[\int_0^t|\delta X_{\bar s}|^2\,ds + \sup_{u\le t}|\mathcal{T}(u)|^2\Big], \tag{2.29}$$

leading to

$$\mathbb{E}\Big[\sup_{u\le t}|\delta X_u|^2\Big] \le C\Big(\int_0^t\mathbb{E}\Big[\sup_{u\le s}|\delta X_u|^2\Big]\,ds + \mathbb{E}\Big[\sup_{u\le T}|\mathcal{T}(u)|^2\Big]\Big). \tag{2.30}$$

Applying Gronwall's Lemma, we obtain

$$\mathbb{E}\Big[\sup_{u\le T}|\delta X_u|^2\Big] \le C\,\mathbb{E}\Big[\sup_{u\le T}|\mathcal{T}(u)|^2\Big]. \tag{2.31}$$

This step shows that the global error is controlled by the (global) truncation error!

2. Study of the truncation error:

We compute, recalling (2.28),

$$\sup_{u\le T}|\mathcal{T}(u)|^2 \le C\Big(\int_0^T|b(X_s) - b(X_{\bar s})|^2\,ds + \sup_{u\le T}\Big|\int_0^u\big(\sigma(X_s) - \sigma(X_{\bar s})\big)\,dW_s\Big|^2\Big). \tag{2.32}$$

Taking the expectation and applying the BDG inequality, we obtain

$$\mathbb{E}\Big[\sup_{u\le T}|\mathcal{T}(u)|^2\Big] \le C\,\mathbb{E}\Big[\int_0^T\big(|b(X_s) - b(X_{\bar s})|^2 + |\sigma(X_s) - \sigma(X_{\bar s})|^2\big)\,ds\Big], \tag{2.33}$$

which leads, since $b$, $\sigma$ are Lipschitz, to

$$\mathbb{E}\Big[\sup_{u\le T}|\mathcal{T}(u)|^2\Big] \le C\,\mathbb{E}\Big[\int_0^T|X_s - X_{\bar s}|^2\,ds\Big].$$

The proof is concluded using (2.4). □

• The result above naturally yields a control on the strong approximation error at each time step, with the same order.

• Let us assume a uniform grid $\pi$ with $h = T/n$. We now report the strong error between the process $X$ and its piecewise constant Euler scheme:

$$\mathbb{E}\Big[\max_i\sup_{t\in[t_i,t_{i+1})}|X_t - X_{t_i}^\pi|^p\Big]^{\frac1p} \le C_p(1 + |X_0|)\sqrt{\frac{T(1 + \log(n))}{n}}. \tag{2.34}$$

For a proof see Theorem 7.2 in [63].

Remark 2.2. Obviously, the same control holds true with $X_{t_i}$ instead of $X_{t_i}^\pi$. In fact it is sharp for both, taking the specific case of $X = W$; see [63], Chapter 7.
2.5.2 Non globally Lipschitz case

• Existence and uniqueness results are more complicated to obtain when the coefficients are no longer Lipschitz continuous.

• For strong existence and uniqueness results for the SDEs considered below, see the book by Khasminskii [50] for a Lyapunov function approach, or the book by Alfonsi [3] for affine processes, in particular the CIR process.

Explicit Euler scheme diverges. The explicit Euler scheme for SDEs with superlinearly growing coefficients may diverge [48].

• Proof for a simple example:

$$dX_t = -X_t^3\,dt + dW_t \quad \text{and} \quad X_0 = 0.$$

Existence and uniqueness come from Theorem 2.4.1 in [50]. It is also the case that $\mathbb{E}[|X_t|^p] < \infty$ for all $p \in [1,\infty)$.

• The Euler scheme is, as usual, on a uniform time grid $\pi$ with $n$ time steps, $h = T/n$:

$$X_{t_{i+1}}^\pi = X_{t_i}^\pi - (X_{t_i}^\pi)^3 h + \Delta W_i \quad \text{and} \quad X_0^\pi = X_0 = 0. \tag{2.35}$$

• The main observation is that

$$\lim_{n\to\infty}\mathbb{E}\big[|X_T^\pi|\big] = +\infty. \tag{2.36}$$

Proof. The proof is based on identifying an event with exponentially small probability but carrying a doubly exponential weight. Let

$$\Omega_n := \Big\{\sup_{i\ge1}|\Delta W_i| \le 1 \ \text{and} \ |W_{t_1}| \ge r_n\Big\},$$

where $r_n = \max\big(\frac3h, 2\big)$.

1. By induction we show that $|X_{t_i}^\pi| \ge r_n^{2^{i-1}}$ on $\Omega_n$.

This is true for $i = 1$ by definition of $\Omega_n$ and since $X_0^\pi = 0$.

Now assume the property holds true for $i < n$. Observing that $|a - (b+c)| \ge |a| - |b+c| \ge |a| - |b| - |c|$, we get, using the scheme definition (2.35),

$$|X_{t_{i+1}}^\pi| \ge h|X_{t_i}^\pi|^3 - |X_{t_i}^\pi| - 1 \ge h|X_{t_i}^\pi|^3 - 2|X_{t_i}^\pi|^2,$$

where we used the fact that $|X_{t_i}^\pi| \ge 1$ for the last inequality. We then compute

$$|X_{t_{i+1}}^\pi| \ge |X_{t_i}^\pi|^2\big(h r_n - 2\big) \ge (r_n)^{2^i},$$

since $h|X_{t_i}^\pi| \ge h r_n \ge 3$.

2. We now compute, using the independence of $W_{t_1}$ and the increments $(\Delta W_i)_{i\ge1}$,

$$\mathbb{P}(\Omega_n) \ge \mathbb{P}\Big(\sup_{i\ge1}|\Delta W_i| \le 1\Big)\,\mathbb{P}(|W_h| \ge r_n).$$

Observing that $\sup_{t\in[0,T]}|W_t| \le \frac12$ implies that $|W_t - W_s| \le 1$ for any $(s,t) \in [0,T]^2$, we get

$$\mathbb{P}(\Omega_n) \ge \mathbb{P}\Big(\sup_{t\in[0,T]}|W_t| \le \frac12\Big)\,\mathbb{P}(|W_h| \ge r_n) \ge c\,\mathbb{P}\Big(|W_1| \ge \frac{r_n}{\sqrt h}\Big) \quad \text{for some } c > 0.$$

Using the inequality

$$\mathbb{P}(|W_1| \ge x) \ge \frac{x e^{-x^2}}{4}, \tag{2.37}$$

see Exercise II.5, we then obtain

$$\mathbb{P}(\Omega_n) \ge c\,e^{-\frac{(r_n)^2}{h}}.$$

3. Combining step 1 with the above inequality, we conclude

$$\mathbb{E}\big[|X_T^\pi|\big] \ge c\,2^{2^{n-1}}e^{-9\frac{n^3}{T^3}} \to +\infty \quad \text{when } n\to\infty. \qquad \square$$

• Note that we then have, for $p \ge 1$,

$$\lim_{n\to\infty}\mathbb{E}\big[|X_T^\pi - X_T|^p\big] = +\infty.$$

Proof. By Jensen's inequality, we have

$$\mathbb{E}\big[|X_T^\pi - X_T|^p\big] \ge \mathbb{E}\big[|X_T^\pi - X_T|\big]^p \ge \big(\mathbb{E}[|X_T^\pi|] - \mathbb{E}[|X_T|]\big)^p;$$

moreover $\mathbb{E}[|X_T|] < \infty$, so the result comes from a direct application of (2.36). □

• The proof above can be extended to more general situations, see Theorem 1 in [48]. Namely, let

$$dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t \quad \text{and} \quad X_0 = \xi,$$

which is assumed to have a strong solution. Denote by $X^\pi$ its classical (explicit!) Euler scheme.

Assume moreover that $\mathbb{P}(\sigma(\xi) \ne 0) > 0$ and that, for some constants $C \ge 1$ and $\beta > \alpha > 1$, for $|x| \ge C$,

$$\max(|\mu(x)|, |\sigma(x)|) \ge \frac{|x|^\beta}{C} \quad \text{and} \quad \min(|\mu(x)|, |\sigma(x)|) \le C|x|^\alpha;$$

then

$$\lim_{n\to\infty}\mathbb{E}\big[|X_T^\pi|\big] = +\infty.$$
• Depending on the SDE, various remedies have been proposed, among them the use of a tamed explicit scheme; a minimal sketch is given below.
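The sketch below applies one common choice of taming (in the spirit of the tamed schemes in the literature; the specific normalisation is illustrative, not the only possible one) to the cubic example $dX_t = -X_t^3\,dt + dW_t$:

```python
import numpy as np

def tamed_euler(x0, T, n, n_paths, rng=None):
    """Tamed Euler scheme for dX = -X^3 dt + dW: the drift increment is
    normalised as b(X)*h / (1 + h*|b(X)|), which bounds each drift step and
    prevents the explosion exhibited by the explicit scheme (2.35), while
    strong convergence is retained under suitable assumptions."""
    rng = rng or np.random.default_rng()
    h = T / n
    X = np.full(n_paths, x0)
    for _ in range(n):
        drift = -X**3
        X = X + drift * h / (1.0 + h * np.abs(drift)) \
              + rng.normal(0.0, np.sqrt(h), n_paths)
    return X
```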

• Typical examples of symptomatic financial models are short-rate models, such as the Aït-Sahalia models, see [2]:

$$dX_t = \Big(\frac{a_{-1}}{X_t} - a_0 + a_1 X_t - a_2 X_t^\varrho\Big)dt + \gamma X_t^\rho\,dW_t \quad \text{and} \quad X_0 > 0,$$

with $a_i \ge 0$ and $\rho, \varrho > 1$.
2.6 Path-dependent options

2.6.1 General considerations

• For some measurable functional $F : \mathcal{C} \to \mathbb{R}$ and some SDE solution $X$, we would like to evaluate

$$p_0 := \mathbb{E}\big[F(X_t,\ 0 \le t \le T)\big], \quad \text{assuming it is finite.}$$

This is done by introducing a discrete version $X^\pi$ of $X$ and possibly an approximation $F^\rho$ of $F$.

• In finance, typical functionals are:

1. Asian-option-like:

$$F\big((x_t)_{t\in[0,T]}\big) := f\Big(\int_0^T x_t\,dt\Big);$$

2. Lookback options in one dimension:

$$F\big((x_t)_{t\in[0,T]}\big) := f\Big(x_T,\ \sup_{t\in[0,T]}x_t,\ \inf_{t\in[0,T]}x_t\Big);$$

3. Barrier options:

$$F\big((x_t)_{t\in[0,T]}\big) := f(x_T)\,\mathbf{1}_{\{\tau_D(x) > T\}},$$

where $\tau_D$ is the exit time of the open domain $D$, namely

$$\tau_D(x) := \inf\{t \in [0,T] \,|\, x_t \notin D\}.$$

↪ In the sequel, we assume that $F$ is Lipschitz continuous for the sup-norm, namely

$$\forall x, y \in \mathcal{C}, \quad |F(x) - F(y)| \le L\max_{t\in[0,T]}|x_t - y_t|$$

for some positive constant $L$.

• We consider the piecewise linear version of the Euler scheme:

$$X_t^{\ell,\pi} = X_{t_i}^\pi + \frac{t - t_i}{t_{i+1} - t_i}\big(X_{t_{i+1}}^\pi - X_{t_i}^\pi\big), \quad t \in [t_i, t_{i+1}],$$

which is not adapted... but computable for any $t \in [0,T]$. Then,

$$\big|\mathbb{E}\big[F(X_\cdot) - F(X_\cdot^{\ell,\pi})\big]\big| \le C\sqrt{\frac{1 + \log(n)}{n}}.$$

• The continuous version of the Euler scheme $X^\pi$ can be used if $F$ depends only on quantities of the path that can be simulated (see e.g. the next section). Then, we have

$$\big|\mathbb{E}\big[F(X_\cdot) - F(X_\cdot^\pi)\big]\big| \le C_L\frac{1}{\sqrt n}.$$

• The lower bound in this case is given by the weak error analysis and is of order $\frac1n$. The best known upper bound [4] is $C\,n^{-\frac23+\epsilon}$ in dimension one, under a uniform ellipticity assumption.

Example: Weak convergence for barrier options (1d case). See [35] (or the 2019 lecture notes).
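As an illustration of the path-dependent case, here is a minimal sketch pricing an Asian-type payoff $f(\int_0^T X_t\,dt)$ with the piecewise constant Euler scheme and a Riemann-sum approximation of the time integral; the model, payoff and the hypothetical `euler_maruyama` helper from Section 2.2.1 are all placeholders:

```python
import numpy as np

def asian_price(f, b, sigma, x0, T, n, n_paths):
    """MC estimate of E[f(int_0^T X_t dt)]: the time integral is replaced by
    the left-point Riemann sum h * sum_i X_{t_i}, consistent with the
    piecewise constant reading of the Euler scheme."""
    X = euler_maruyama(b, sigma, x0, T, n, n_paths)
    h = T / n
    integral = h * X[:, :-1].sum(axis=1)      # left-endpoint Riemann sum
    return f(integral).mean()

# Example: fixed-strike call on the average (here T = 1, so the integral
# is the time average), illustrative parameters
price = asian_price(lambda a: np.maximum(a - 100.0, 0.0),
                    lambda t, x: 0.02 * x, lambda t, x: 0.3 * x,
                    x0=100.0, T=1.0, n=252, n_paths=50_000)
```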
2.7 Acceleration methods

• When there is no bias: one should implement variance reduction methods.

2.7.1 Reducing the bias

• If one uses a scheme of weak order $\alpha$, the complexity is then

$$\mathcal{C} = O\big(\epsilon^{-\frac{2\alpha+1}{\alpha}}\big)$$

for $\sqrt{MSE} = O(\epsilon)$.

• Recall that, if there is no bias, $\mathcal{C} \sim_c \epsilon^{-2}$...

• To obtain a smaller bias, one can:

1. use a high order scheme (for weak convergence), see below;

2. use a Romberg-Richardson extrapolation method.
2.7.2 Multilevel Monte Carlo

Introduced in [32].

We want to approximate

$$p = \mathbb{E}[G] \quad \text{where} \quad G = g(X_T),$$

for some Lipschitz function $g$, where $X$ is the solution to an SDE of type (2.1).

• We consider a collection of grids $\pi^\ell$ with step sizes

$$h_\ell = \frac{T}{2^\ell}, \quad \ell = 0, 1, \dots, L.$$

• We denote by

$$G_\ell = g\big(X_T^{\pi^\ell}\big)$$

the approximation of $G$ using (say) an Euler scheme for $X$ on each grid $\pi^\ell$.

↪ The classical method simply estimates $\mathbb{E}[G_L]$.

• We observe

$$\mathbb{E}[G_L] = \mathbb{E}[G_0] + \sum_{\ell=1}^L\mathbb{E}[G_\ell - G_{\ell-1}].$$

↪ The MLMC method independently estimates each of the expectations on the RHS in a way which minimises the overall variance for a given computational cost.

• $Y^0$ is an estimator for $\mathbb{E}[G_0]$ using $N_0$ sample paths, and $Y^\ell$ for $\mathbb{E}[G_\ell - G_{\ell-1}]$ using $N_\ell$ sample paths. We consider

$$Y^0 = \frac{1}{N_0}\sum_{j=1}^{N_0}G_0^j, \qquad Y^\ell = \frac{1}{N_\ell}\sum_{j=1}^{N_\ell}\big(G_\ell^j - G_{\ell-1}^j\big), \quad 1 \le \ell \le L.$$

We use the same Brownian motion to get $G_\ell$ and $G_{\ell-1}$!
• We observe, setting $Y = Y^0 + \sum_{\ell=1}^L Y^\ell$ (main estimator),

$$\mathrm{Var}\big[Y^\ell\big] = \frac{V_\ell}{N_\ell} \quad \text{and} \quad \mathrm{Var}[Y] = \sum_{\ell=0}^L N_\ell^{-1}V_\ell.$$

↪ Hopefully, $V_\ell$ will be smaller on finer levels...

• The total computational cost is proportional to

$$\sum_{\ell=0}^L N_\ell\,h_\ell^{-1}.$$

• The general result is as follows [32].

Theorem 2.8. In the setting presented above, assume that, for positive constants $c_1, c_2, c_3, \alpha, \beta, \gamma$ with $\alpha \ge \frac12\min(\gamma, \beta)$:

(i) $|\mathbb{E}[G_\ell - G]| \le c_1 2^{-\alpha\ell}$;

(ii) $\mathrm{Var}[Y^\ell] \le c_2 N_\ell^{-1}2^{-\beta\ell}$;

(iii) $\mathcal{C}_\ell \le c_3 N_\ell 2^{\gamma\ell}$, where $\mathcal{C}_\ell$ is the computational complexity of $Y^\ell$.

Then there exists $c_4 > 0$ such that there are values of $L$ and $N_\ell$ with $MSE := \mathbb{E}\big[|Y - \mathbb{E}[G]|^2\big] < \epsilon^2$ for $\epsilon < 1/e$, at a cost

$$\mathcal{C} \le \begin{cases} c_4\,\epsilon^{-2}, & \beta > \gamma, \\ c_4\,\epsilon^{-2}(\log(1/\epsilon))^2, & \beta = \gamma, \\ c_4\,\epsilon^{-2-(\gamma-\beta)/\alpha}, & 0 < \beta < \gamma. \end{cases}$$

• Precisely, assuming an Euler approximation of order 1 in the weak sense and $\frac12$ in the strong sense, we have:

Proposition 2.4. For $\epsilon > 0$, setting

$$N_\ell \sim_c \epsilon^{-2}L h_\ell \quad \text{and} \quad L \sim_c \log_2\epsilon^{-1},$$

we obtain

$$\mathbb{E}[G_L - G] = O(\epsilon), \quad \mathrm{Var}[Y] = O(\epsilon^2), \quad MSE = O(\epsilon^2), \quad \mathcal{C} = O\big(\epsilon^{-2}(\log\epsilon)^2\big).$$

↪ To be compared to a cost of $O(\epsilon^{-3})$ for the standard method!
Proof. We first observe that

$$MSE \le C\Big(\sum_{\ell=0}^L\frac{h_\ell}{N_\ell} + h_L^2\Big) \tag{2.38}$$

and we have the constraint on the overall computational effort ($\mathcal{C}$)

$$\sum_{\ell=0}^L\frac{N_\ell}{h_\ell} = \mathcal{C}.$$

We minimise the RHS of (2.38) with $L$ being fixed for now. Observing that the problem is similar to

$$\min\sum_{\ell=0}^L\frac{1}{x_\ell} \quad \text{s.t.} \quad \sum_{\ell=0}^L x_\ell = 1,$$

whose solution is simply $x_\ell = \frac{1}{L+1}$, we compute that

$$N_\ell = \frac{h_\ell\,\mathcal{C}}{L+1} \quad \text{and} \quad MSE = \frac{(L+1)^2}{\mathcal{C}} + h_L^2. \tag{2.39}$$

To reach $MSE \le C\epsilon^2$, we set

$$L = \frac{\log(1/\epsilon)}{\log(2)} \quad \text{and} \quad \mathcal{C} \sim_c \epsilon^{-2}(\log(1/\epsilon))^2. \tag{2.40}$$

Remark 2.3. It is possible to combine the Romberg-Richardson extrapolation method and the MLMC method, see Lemaire & Pagès [55].
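A minimal sketch of the MLMC estimator for a European payoff under Black-Scholes dynamics, with coupled Euler paths (the coarse path is driven by the sum of the two fine Brownian increments); the function names, parameters and level allocation are illustrative:

```python
import numpy as np

def mlmc_level(g, r, vol, x0, T, level, N, rng):
    """Coupled Euler estimate of E[G_l - G_{l-1}] (E[G_0] when level == 0)."""
    n_fine = 2 ** level
    h = T / n_fine
    Xf = np.full(N, x0)
    if level == 0:
        dW = rng.normal(0.0, np.sqrt(h), N)
        return g(Xf + r * Xf * h + vol * Xf * dW).mean()
    Xc = np.full(N, x0)
    for _ in range(n_fine // 2):
        dW1 = rng.normal(0.0, np.sqrt(h), N)
        dW2 = rng.normal(0.0, np.sqrt(h), N)
        Xf = Xf + r * Xf * h + vol * Xf * dW1
        Xf = Xf + r * Xf * h + vol * Xf * dW2
        Xc = Xc + r * Xc * (2 * h) + vol * Xc * (dW1 + dW2)  # same Brownian path
    return (g(Xf) - g(Xc)).mean()

def mlmc_price(g, r, vol, x0, T, L, eps, rng=None):
    """Y = Y^0 + sum_l Y^l with the allocation N_l ~ eps^-2 (L+1) h_l of Prop. 2.4."""
    rng = rng or np.random.default_rng()
    return sum(mlmc_level(g, r, vol, x0, T, l,
                          N=max(int(eps**-2 * (L + 1) * 2.0**(-l)), 2), rng=rng)
               for l in range(L + 1))

# Example: European call under GBM, illustrative parameters
price = mlmc_price(lambda s: np.maximum(s - 100.0, 0.0),
                   r=0.02, vol=0.3, x0=100.0, T=1.0, L=6, eps=1e-2)
```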

Figure 6: MLMC for a European option under geometric Brownian motion (M = 4), see [32]. (Panels reproduced from [32]: log-variance and log-mean of $P_\ell$ and $P_\ell - P_{\ell-1}$ against the level $\ell$; the number of samples $N_\ell$ per level for several accuracies $\epsilon$; and the normalised cost $\epsilon^2 C$ against $\epsilon$ for the standard and multilevel methods, with and without Richardson extrapolation.)
2.8 High order schemes

See the book [51] (and the 2019 lecture notes).

2.8.1 High order weak convergence

An example of an order 2 weak Taylor scheme. To get a higher order of weak convergence, one has to add more terms in the expansion of the coefficients.

Example 2.2. (autonomous 1-dimensional case) On a discrete time grid $\pi$, we define $X^\pi$ by:

• $X_0^\pi = X_0$;

• for $0 \le i \le n-1$,

$$\begin{aligned} X_{t_{i+1}}^\pi = X_{t_i}^\pi &+ b(X_{t_i}^\pi)h_i + \sigma(X_{t_i}^\pi)\Delta\hat W_i + \frac12\sigma(X_{t_i}^\pi)\sigma'(X_{t_i}^\pi)\big((\Delta\hat W_i)^2 - h_i\big) \\ &+ \frac12\Big(b'(X_{t_i}^\pi)\sigma(X_{t_i}^\pi) + b(X_{t_i}^\pi)\sigma'(X_{t_i}^\pi) + \frac12\sigma''(X_{t_i}^\pi)\sigma^2(X_{t_i}^\pi)\Big)\Delta\hat W_i\,h_i \\ &+ \frac12\Big(bb'(X_{t_i}^\pi) + \frac12 b''\sigma^2(X_{t_i}^\pi)\Big)h_i^2, \end{aligned}$$

where $\Delta\hat W_i$ is a three-point distributed random variable with distribution

$$\mathbb{P}\big(\Delta\hat W_i = \pm\sqrt{3h_i}\big) = \frac16, \qquad \mathbb{P}\big(\Delta\hat W_i = 0\big) = \frac23.$$
3 Computing sensitivities in the linear case

3.1 Finite-difference approximations

• We focus on the Delta, but this approach can be applied to other "Greeks" (Gamma, Vega, etc.).

• The price of the European option is given by $u(t,x) = \mathbb{E}\big[g(X_T^{t,x})\big]$, where $X$ is the solution to an SDE of type (2.1) (with $b = 0$, $d = 1$); its Delta is $\partial_x u(t,x)$.

• To approximate $\partial_x u(0,x)$, we use finite differences.

1. Forward difference: for $\epsilon > 0$,

$$\partial_x u(0,x) \approx \Delta_F(\epsilon) := \frac{u(0, x+\epsilon) - u(0,x)}{\epsilon}.$$

Remark: one can also consider the backward difference; it has the same properties here.

2. Central difference: for $\epsilon > 0$,

$$\partial_x u(0,x) \approx \Delta_C(\epsilon) := \frac{u(0, x+\epsilon) - u(0, x-\epsilon)}{2\epsilon}.$$

• Estimator (for the central difference): let $X_i^{x-\epsilon}$, $X_i^{x+\epsilon}$ be i.i.d. samples of $X_T^{0,x-\epsilon}$ and $X_T^{0,x+\epsilon}$; we then define

$$\hat\Delta_C^N(\epsilon) := \frac{1}{2N\epsilon}\sum_{i=1}^N\big(g(X_i^{x+\epsilon}) - g(X_i^{x-\epsilon})\big).$$
3.1.1 Bias

Proposition 3.1. Assuming smoothness of the value function, we have

$$\partial_x u(0,x) = \Delta_F(\epsilon) + O(\epsilon) \quad \text{and} \quad \partial_x u(0,x) = \Delta_C(\epsilon) + O(\epsilon^2).$$

Proof. For the central difference: if $u(0,\cdot)$ is $C^3$, we simply observe

$$u(0, x+\epsilon) = u(0,x) + \partial_x u(0,x)\epsilon + \frac12\partial^2_{xx}u(0,x)\epsilon^2 + O(\epsilon^3),$$

$$u(0, x-\epsilon) = u(0,x) - \partial_x u(0,x)\epsilon + \frac12\partial^2_{xx}u(0,x)\epsilon^2 + O(\epsilon^3),$$

and subtracting and dividing by $2\epsilon$ gives the result. □
3.1.2 Variance

We focus on $\hat\Delta_C^N$.

(i) Using independent sets of random numbers for $X_i^{x-\epsilon}$ and $X_i^{x+\epsilon}$:

$$\mathrm{Var}\big(\hat\Delta_C^N(\epsilon)\big) = \frac{1}{N\epsilon^2}\,\frac{\mathrm{Var}\big(g(X_T^{0,x+\epsilon})\big) + \mathrm{Var}\big(g(X_T^{0,x-\epsilon})\big)}{4}.$$

Simply observe that

$$\mathrm{Var}\big(\hat\Delta_C^N(\epsilon)\big) = \frac{1}{4\epsilon^2}\Big(\mathrm{Var}\big(\hat p^N(x+\epsilon)\big) + \mathrm{Var}\big(\hat p^N(x-\epsilon)\big)\Big).$$

(ii) Using the same set of random numbers for $X_i^{x-\epsilon}$ and $X_i^{x+\epsilon}$, and assuming $g$, $b$ and $\sigma$ are $C^1$:

$$\mathrm{Var}\big(\hat\Delta_C^N(\epsilon)\big) \simeq \frac1N\,\mathrm{Var}\big(g'(X_T^{0,x})\nabla X_T^x\big).$$

Simply observe that

$$\mathrm{Var}\big(\hat\Delta_C^N(\epsilon)\big) = \frac1N\,\mathrm{Var}\Big(\frac{g(X_T^{x+\epsilon}) - g(X_T^{x-\epsilon})}{2\epsilon}\Big).$$
56
3.2 Tangent process approach

3.2.1 Tangent process

We denote $X^x := X^{0,x}$, where $X$ is the solution to an SDE of type (2.1).

Theorem 3.1. If $b$ and $\sigma$ are $C_b^2$, then for all $t \in [0,T]$ the mapping $x \mapsto X_t^x$ is a.s. $C^1$. Its derivative, denoted $\nabla X^x$, is the solution to the following SDE:

$$\nabla X_t^x = I_d + \int_0^t\nabla b(X_s^x)\nabla X_s^x\,ds + \sum_{j=1}^d\int_0^t\nabla\sigma^{.j}(X_s^x)\nabla X_s^x\,dW_s^j.$$

Precisely, we have (up to a modification) that, a.s.,

$$\lim_{\epsilon\to0}\frac{X_t^{x+\epsilon} - X_t^x}{\epsilon} = \nabla X_t^x,$$

and $x \mapsto \nabla X^x$ is continuous.

Lemma 3.1. Under the assumptions of the previous theorem, we have

$$\mathbb{E}\Big[\sup_{t\le T}|\nabla X_t|^p\Big] \le C_p.$$

Proof of Theorem 3.1. We only prove ($d = 1$):

$$\lim_{\epsilon\to0}\mathbb{E}\Big[\sup_{u\le T}\Big|\nabla X_u^x - \frac{X_u^{x+\epsilon} - X_u^x}{\epsilon}\Big|^2 + |\nabla X_u^{x+\epsilon} - \nabla X_u^x|^2\Big] = 0. \tag{3.1}$$

To obtain the a.s. properties claimed in the statement of the theorem, one has to use the Kolmogorov continuity criterion, see e.g. Chapter 1, Theorem 1.8 in [70]. (Note: $b, \sigma \in C_b^1$ with $\alpha$-Hölder continuous derivatives, for some $\alpha > 0$, is actually sufficient, see [53].)

1. We define $\Delta_\epsilon X := X^{x+\epsilon} - X^x$, $\delta_\epsilon X := \frac{X^{x+\epsilon} - X^x}{\epsilon}$, and compute

$$\Delta_\epsilon X_t = \epsilon + \int_0^t\tilde b_u^\epsilon\,\Delta_\epsilon X_u\,du + \int_0^t\tilde\sigma_u^\epsilon\,\Delta_\epsilon X_u\,dW_u,$$

where we set $\tilde\sigma_u^\epsilon = \int_0^1\sigma'(X_u^x + \lambda\Delta_\epsilon X_u)\,d\lambda$ and $\tilde b_u^\epsilon = \int_0^1 b'(X_u^x + \lambda\Delta_\epsilon X_u)\,d\lambda$.

Since $\sigma$ and $b$ have bounded derivatives, we observe that $\tilde\sigma^\epsilon$ and $\tilde b^\epsilon$ are bounded. We compute, for $p \ge 2$,

$$\mathbb{E}\Big[\sup_{s\le t}|\Delta_\epsilon X_s|^p\Big] \le 3^{p-1}\Big(\epsilon^p + \mathbb{E}\Big[\sup_{s\le t}\Big|\int_0^s\tilde b_u^\epsilon\Delta_\epsilon X_u\,du\Big|^p\Big] + \mathbb{E}\Big[\sup_{s\le t}\Big|\int_0^s\tilde\sigma_u^\epsilon\Delta_\epsilon X_u\,dW_u\Big|^p\Big]\Big)$$

$$\le C\Big(\epsilon^p + \int_0^t\mathbb{E}\Big[\sup_{s\le u}|\Delta_\epsilon X_s|^p\Big]\,du + \mathbb{E}\Big[\sup_{s\le t}\Big|\int_0^s\tilde\sigma_u^\epsilon\Delta_\epsilon X_u\,dW_u\Big|^p\Big]\Big).$$

Then, using the BDG inequality, we have

$$\mathbb{E}\Big[\sup_{s\le t}|\Delta_\epsilon X_s|^p\Big] \le C\Big(\epsilon^p + \int_0^t\mathbb{E}\Big[\sup_{s\le u}|\Delta_\epsilon X_s|^p\Big]\,du\Big).$$

Applying Gronwall's Lemma, we obtain

$$\mathbb{E}\Big[\sup_{s\le t}|\Delta_\epsilon X_s|^p\Big] \le C_p\,\epsilon^p. \tag{3.2}$$

2. Next, we compute

$$\nabla X_t - \delta_\epsilon X_t = \int_0^t\tilde b_u^\epsilon(\nabla X_u - \delta_\epsilon X_u)\,du + \int_0^t\tilde\sigma_u^\epsilon(\nabla X_u - \delta_\epsilon X_u)\,dW_u + R_t^\epsilon$$

with

$$R_t^\epsilon = \int_0^t\nabla X_u\big(b'(X_u^x) - \tilde b_u^\epsilon\big)\,du + \int_0^t\nabla X_u\big(\sigma'(X_u^x) - \tilde\sigma_u^\epsilon\big)\,dW_u.$$

Using the same arguments as in the previous step, we obtain

$$\mathbb{E}\Big[\sup_{t\le T}|\nabla X_t - \delta_\epsilon X_t|^2\Big] \le C\,\mathbb{E}\Big[\sup_{t\le T}|R_t^\epsilon|^2\Big].$$

And we compute

$$\mathbb{E}\Big[\sup_{t\le T}|R_t^\epsilon|^2\Big] \le C\,\mathbb{E}\Big[\int_0^T\Big(|\nabla X_u(b'(X_u^x) - \tilde b_u^\epsilon)|^2 + |\nabla X_u(\sigma'(X_u^x) - \tilde\sigma_u^\epsilon)|^2\Big)\,du\Big].$$

Using the Cauchy-Schwarz inequality, we get

$$\mathbb{E}\Big[\int_0^T|\nabla X_u(b'(X_u^x) - \tilde b_u^\epsilon)|^2\,du\Big] \le C\,\mathbb{E}\Big[\int_0^T|\nabla X_u|^4\,du\Big]^{\frac12}\mathbb{E}\Big[\int_0^T|b'(X_u^x) - \tilde b_u^\epsilon|^4\,du\Big]^{\frac12}.$$

We observe that

$$|b'(X_u^x) - \tilde b_u^\epsilon| = \Big|\int_0^1\big(b'(X_u^x) - b'(X_u^x + \lambda\Delta_\epsilon X_u)\big)\,d\lambda\Big| \le C|\Delta_\epsilon X_u|,$$

leading to

$$\mathbb{E}\Big[\int_0^T|\nabla X_u(b'(X_u^x) - \tilde b_u^\epsilon)|^2\,du\Big] \le C\,\mathbb{E}\Big[\sup_{u\le T}|\Delta_\epsilon X_u|^4\Big]^{\frac12},$$

where we used Lemma 3.1. Similarly, we compute

$$\mathbb{E}\Big[\int_0^T|\nabla X_u(\sigma'(X_u^x) - \tilde\sigma_u^\epsilon)|^2\,du\Big] \le C\,\mathbb{E}\Big[\sup_{u\le T}|\Delta_\epsilon X_u|^4\Big]^{\frac12}.$$

We then get, using step 1, that

$$\mathbb{E}\Big[\sup_{t\le T}|R_t^\epsilon|^2\Big] \le C\epsilon^2,$$

and the first part of (3.1) follows letting $\epsilon \downarrow 0$.

3. For ease of notation, we set $b = 0$. We then have

$$\nabla X_t^x = 1 + \int_0^t\sigma'(X_s^x)\nabla X_s^x\,dW_s, \qquad \nabla X_t^{x+\epsilon} = 1 + \int_0^t\sigma'(X_s^{x+\epsilon})\nabla X_s^{x+\epsilon}\,dW_s.$$

Setting $\Gamma^\epsilon = \nabla X^{x+\epsilon} - \nabla X^x$, we get

$$\Gamma_u^\epsilon = \int_0^u\sigma'(X_s^x)\Gamma_s^\epsilon\,dW_s + \int_0^u\big(\sigma'(X_s^{x+\epsilon}) - \sigma'(X_s^x)\big)\nabla X_s^{x+\epsilon}\,dW_s.$$

Squaring the previous equality, we get

$$\sup_{u\le t}|\Gamma_u^\epsilon|^2 \le C\Big(\sup_{u\le t}\Big|\int_0^u\sigma'(X_s^x)\Gamma_s^\epsilon\,dW_s\Big|^2 + \sup_{u\le t}\Big|\int_0^u\big(\sigma'(X_s^{x+\epsilon}) - \sigma'(X_s^x)\big)\nabla X_s^{x+\epsilon}\,dW_s\Big|^2\Big).$$

Taking the expectation and using the BDG inequality, we compute

$$\mathbb{E}\Big[\sup_{u\le t}|\Gamma_u^\epsilon|^2\Big] \le C\Big(\mathbb{E}\Big[\int_0^t|\Gamma_s^\epsilon|^2\,ds\Big] + \mathbb{E}\Big[\int_0^t|X_s^x - X_s^{x+\epsilon}|^2|\nabla X_s^{x+\epsilon}|^2\,ds\Big]\Big).$$

Using the Cauchy-Schwarz inequality, we get

$$\mathbb{E}\Big[\sup_{u\le t}|\Gamma_u^\epsilon|^2\Big] \le C\Big(\int_0^t\mathbb{E}\Big[\sup_{u\le s}|\Gamma_u^\epsilon|^2\Big]\,ds + \mathbb{E}\Big[\sup_{s\in[0,T]}|X_s^x - X_s^{x+\epsilon}|^4\Big]^{\frac12}\Big).$$

Combining Gronwall's Lemma with (3.2), we obtain

$$\mathbb{E}\Big[\sup_{u\le T}|\nabla X_u^{x+\epsilon} - \nabla X_u^x|^2\Big] \le C\epsilon^2. \qquad \square \tag{3.3}$$

Example 3.1. Set $d = 1$. We have

$$\nabla X_t^x = 1 + \int_0^t b'(X_s^x)\nabla X_s^x\,ds + \int_0^t\sigma'(X_s^x)\nabla X_s^x\,dW_s,$$

leading to

$$\nabla X_t^x = \exp\Big(\int_0^t\Big(b'(X_s^x) - \frac{\sigma'(X_s^x)^2}{2}\Big)\,ds + \int_0^t\sigma'(X_s^x)\,dW_s\Big).$$

Example 3.2. Set $d = 1$. Give the expression of the tangent process in the BS model.
3.2.2 Computing the Delta (pathwise approach)

Proposition 3.2. Assume that $g$ is $C_b^1$ and $b$ and $\sigma$ are $C_b^2$; then $u$ is $C_b^1$ and

$$\partial_x u(0,x) = \mathbb{E}\big[\nabla g(X_T^x)\nabla X_T^x\big].$$

Proof. For the proof, we assume that $g$ is $C_b^2$. We are going to compute the limit of $\Delta_F(\epsilon)$ as $\epsilon \to 0$.

1. We have that

$$\Delta_F(\epsilon) = \frac1\epsilon\,\mathbb{E}\big[g(X_T^{x+\epsilon}) - g(X_T^x)\big] = \mathbb{E}\Big[\int_0^1 g'(X_T^x + \lambda\Delta_\epsilon X_T)\,d\lambda\,\frac{\Delta_\epsilon X_T}{\epsilon}\Big].$$

We then compute

$$\big|\mathbb{E}\big[\nabla g(X_T^x)\nabla X_T^x\big] - \Delta_F(\epsilon)\big| = \Big|\mathbb{E}\Big[g'(X_T^x)\Big(\nabla X_T^x - \frac{\Delta_\epsilon X_T}{\epsilon}\Big) - \int_0^1\big(g'(X_T^x + \lambda\Delta_\epsilon X_T) - g'(X_T^x)\big)\,d\lambda\,\frac{\Delta_\epsilon X_T}{\epsilon}\Big]\Big|$$

$$\le C\,\mathbb{E}\Big[\Big|\nabla X_T^x - \frac{\Delta_\epsilon X_T}{\epsilon}\Big|\Big] + C\,\mathbb{E}\Big[\frac{|\Delta_\epsilon X_T|^2}{\epsilon}\Big] \le C\Big(\mathbb{E}\Big[\Big|\nabla X_T^x - \frac{\Delta_\epsilon X_T}{\epsilon}\Big|\Big] + \epsilon\Big),$$

using (3.2). Using (3.1), we then obtain $\lim_{\epsilon\to0}\Delta_F(\epsilon) = \mathbb{E}[\nabla g(X_T^x)\nabla X_T^x]$, proving the differentiability of $u$.

2. We now prove that $\partial_x u$ is continuous. We compute

$$|\partial_x u(0, x+\epsilon) - \partial_x u(0,x)| = \big|\mathbb{E}\big[g'(X_T^{x+\epsilon})\nabla X_T^{x+\epsilon} - g'(X_T^x)\nabla X_T^x\big]\big|$$

$$\le \big|\mathbb{E}\big[g'(X_T^{x+\epsilon})(\nabla X_T^{x+\epsilon} - \nabla X_T^x)\big]\big| + \big|\mathbb{E}\big[(g'(X_T^{x+\epsilon}) - g'(X_T^x))\nabla X_T^x\big]\big|$$

$$\le C\,\mathbb{E}\big[|\nabla X_T^{x+\epsilon} - \nabla X_T^x|\big] + C\,\mathbb{E}\big[|X_T^{x+\epsilon} - X_T^x|^2\big]^{\frac12},$$

where for the last term we used the Cauchy-Schwarz inequality and Lemma 3.1. Now, using (3.3)-(3.2), we have

$$|\partial_x u(0, x+\epsilon) - \partial_x u(0,x)| \le C\epsilon.$$

Letting $\epsilon$ go to 0 concludes the proof. □

3.2.3 Practical implementation

It requires the discretisation of $(X^x, \nabla X^x)$. One may use the usual Euler scheme, setting:

- $(X_0^\pi, \nabla X_0^\pi) = (x, 1)$;

- and on $\pi$:

$$X_{t_{i+1}}^\pi = X_{t_i}^\pi + b(X_{t_i}^\pi)(t_{i+1} - t_i) + \sum_{j=1}^d\sigma^{.j}(X_{t_i}^\pi)\big(W_{t_{i+1}}^j - W_{t_i}^j\big),$$

$$\nabla X_{t_{i+1}}^\pi = \nabla X_{t_i}^\pi + \nabla b(X_{t_i}^\pi)\nabla X_{t_i}^\pi(t_{i+1} - t_i) + \sum_{j=1}^d\nabla\sigma^{.j}(X_{t_i}^\pi)\nabla X_{t_i}^\pi\big(W_{t_{i+1}}^j - W_{t_i}^j\big).$$

Remark: this simplifies if $d = 1$; a sketch is given below.
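A minimal sketch of the pathwise Delta in dimension one, discretising the pair $(X, \nabla X)$ with the Euler scheme above; the model functions and parameters are placeholders, and the payoff must be smooth enough (cf. Proposition 3.2):

```python
import numpy as np

def pathwise_delta(g_prime, b, db, sigma, dsigma, x0, T, n, n_paths, rng=None):
    """Estimate Delta = E[g'(X_T) * grad X_T] by jointly Euler-discretising
    the SDE and its tangent process (d = 1)."""
    rng = rng or np.random.default_rng()
    h = T / n
    X = np.full(n_paths, x0)
    gradX = np.ones(n_paths)                  # tangent process starts at 1
    for _ in range(n):
        dW = rng.normal(0.0, np.sqrt(h), n_paths)
        X_new = X + b(X) * h + sigma(X) * dW  # update uses the pre-step X
        gradX = gradX + db(X) * gradX * h + dsigma(X) * gradX * dW
        X = X_new
    return (g_prime(X) * gradX).mean()
```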

3.3 Greek weights

• We are interested in computing the sensitivity of

$$u(\theta) = \mathbb{E}\big[g(X_T^\theta)\big]$$

with respect to the parameter $\theta$, at $\theta = \theta_0$.

Example 3.3. (i) If $\theta = x$, the starting point of $X$, then we are computing the 'Delta'.

(ii) If $X$ follows a BS model and $\theta = \sigma$, the volatility of $X$, then we are computing the 'Vega'.

• We are interested in finding random variables $\Pi$ s.t.

$$\partial_\theta u(\theta_0) = \mathbb{E}\big[g(X_T^{\theta_0})\Pi\big]$$

for a large class of payoff functions $g$.

• Precisely, we consider

$$\mathcal{W} := \big\{\Pi \in L^2 \,\big|\, \partial_\theta u(\theta_0) = \mathbb{E}\big[g(X_T^{\theta_0})\Pi\big] \text{ for all bounded measurable functions } g\big\}.$$

• Often in practice, this will lead to

$$\partial_\theta u(\theta_0) = \mathbb{E}\big[g(X_T^{\theta_0})\Pi\big] + \mathcal{E},$$

where $\Pi$ is an approximation of such a Greek weight and $\mathcal{E}$ is the associated error.
3.3.1 Likelihood ratio method

• Assuming that $X_T^\theta$ has a differentiable positive probability density $\phi(\theta, x)$:

$$\partial_\theta\mathbb{E}\big[g(X_T^\theta)\big]\Big|_{\theta=\theta_0} = \mathbb{E}\big[g(X_T^{\theta_0})\,s(\theta_0, X_T^{\theta_0})\big] \quad \text{with} \quad s(\theta,x) = \partial_\theta\ln(\phi(\theta,x)).$$

$s$ is called the score function.

We compute

$$\partial_\theta\mathbb{E}\big[g(X_T^\theta)\big] = \partial_\theta\int g(x)\phi(\theta,x)\,dx = \int g(x)\partial_\theta\phi(\theta,x)\,dx = \int g(x)\frac{\partial_\theta\phi(\theta,x)}{\phi(\theta,x)}\,\phi(\theta,x)\,dx.$$

• If $S^0 := s(\theta_0, X_T^{\theta_0}) \in L^2$, then

$$\mathcal{W} = \big\{\Pi \in L^2 \,\big|\, \mathbb{E}\big[\Pi\,\big|\,X_T^{\theta_0}\big] = S^0\big\}.$$

• We also observe that $\mathrm{Var}\big[g(X_T^{\theta_0})\Pi\big] \ge \mathrm{Var}\big[g(X_T^{\theta_0})S^0\big]$:

$$\mathrm{Var}\big[g(X_T^{\theta_0})\Pi\big] = \mathbb{E}\big[(g(X_T^{\theta_0})\Pi)^2\big] - (\partial_\theta u)^2 = \mathbb{E}\big[\mathbb{E}\big[(g(X_T^{\theta_0})\Pi)^2\,\big|\,X_T^{\theta_0}\big]\big] - (\partial_\theta u)^2$$

$$\ge \mathbb{E}\big[\big(\mathbb{E}\big[g(X_T^{\theta_0})\Pi\,\big|\,X_T^{\theta_0}\big]\big)^2\big] - (\partial_\theta u)^2 = \mathrm{Var}\big[g(X_T^{\theta_0})S^0\big],$$

by Jensen's inequality for the conditional expectation.
Example: Black-Scholes Delta. In the Black-Scholes setting, we have

$$\partial_x u(0,x) = e^{-rT}\,\mathbb{E}\Big[g(X_T^x)\,\frac{W_T}{x\sigma T}\Big].$$

The density $h$ of $X_T^x$ is lognormal and given by

$$h(y) = \frac{1}{y\sigma\sqrt T}\,\varphi(\zeta(y)), \qquad \zeta(y) = \frac{\log(y/x) - (r - \frac12\sigma^2)T}{\sigma\sqrt T},$$

and we compute

$$\partial_x h(y)/h(y) = -\zeta(y)\,\partial_x\zeta(y) = \frac{\log(y/x) - (r - \frac12\sigma^2)T}{x\sigma^2 T}.$$

Similar computations lead to the following form for the Vega:

$$\partial_\sigma u(0,x) = e^{-rT}\,\mathbb{E}\Big[g(X_T^x)\Big(\frac{W_T^2}{\sigma T} - W_T - \frac1\sigma\Big)\Big].$$

Higher order derivatives. Assuming that $X_T^\theta$ has a twice differentiable positive probability density $\phi(\theta,x)$:

$$\partial^2_{\theta\theta}\mathbb{E}\big[g(X_T^\theta)\big] = \mathbb{E}\Big[g(X_T^\theta)\,\frac{\partial^2_{\theta\theta}\phi(\theta, X_T^\theta)}{\phi(\theta, X_T^\theta)}\Big].$$

Example 3.4. (Black-Scholes Gamma)

$$\partial^2_{xx}u(0,x) = e^{-rT}\,\frac{1}{x^2\sigma T}\,\mathbb{E}\Big[g(X_T^x)\Big(\frac{W_T^2}{\sigma T} - W_T - \frac1\sigma\Big)\Big].$$
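A minimal sketch of these weights in the Black-Scholes model (parameters illustrative); the digital payoff in the example shows that no smoothness of $g$ is needed:

```python
import numpy as np

def bs_lrm_greeks(g, x, r, vol, T, N, rng=None):
    """Delta, Vega and Gamma by the likelihood ratio method in Black-Scholes,
    with weights W_T/(x vol T), W_T^2/(vol T) - W_T - 1/vol, and the
    Gamma weight of Example 3.4, all from the same simulation."""
    rng = rng or np.random.default_rng()
    W_T = np.sqrt(T) * rng.standard_normal(N)
    X_T = x * np.exp((r - 0.5 * vol**2) * T + vol * W_T)
    G = np.exp(-r * T) * g(X_T)
    vega_weight = W_T**2 / (vol * T) - W_T - 1.0 / vol
    delta = (G * W_T / (x * vol * T)).mean()
    vega = (G * vega_weight).mean()
    gamma = (G * vega_weight).mean() / (x**2 * vol * T)
    return delta, vega, gamma

# Works for non-smooth payoffs too, e.g. a digital option:
greeks = bs_lrm_greeks(lambda s: (s > 100.0).astype(float),
                       x=100.0, r=0.02, vol=0.3, T=1.0, N=200_000)
```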
3.3.2 Integration by parts

• We consider the Delta in the Black-Scholes setting ($r = 0$); the payoff function is $C_b^1$.

• From the tangent process approach, we know that:

$$\partial_x u(0,x) = \frac1x\,\mathbb{E}\big[g'(X_T^x)X_T^x\big].$$

• Using an IBP argument, we compute

$$\partial_x u(0,x) = \frac{1}{x\sigma T}\,\mathbb{E}\big[g(X_T^x)W_T\big].$$

Indeed, we have

$$\partial_x u(0,x) = \int g'\big(xe^{-\sigma^2T/2 + \sigma\sqrt T y}\big)\,e^{-\sigma^2T/2 + \sigma\sqrt T y}\,\varphi(y)\,dy,$$

which rewrites

$$\partial_x u(0,x) = \frac{1}{x\sigma\sqrt T}\int\partial_y\big[g\big(xe^{-\sigma^2T/2 + \sigma\sqrt T y}\big)\big]\,\varphi(y)\,dy = \frac{1}{x\sigma\sqrt T}\int g\big(xe^{-\sigma^2T/2 + \sigma\sqrt T y}\big)\,y\,\varphi(y)\,dy,$$

using the Gaussian integration by parts $\int f'(y)\varphi(y)\,dy = \int f(y)\,y\,\varphi(y)\,dy$.
3.3.3 Bismut's formula

Theorem 3.2.

$$\partial_x u(0,x) = \mathbb{E}\Big[g(X_T^x)\,\frac1T\int_0^T\sigma(X_s^x)^{-1}\nabla X_s^x\,dW_s\Big]. \tag{3.4}$$

(Here we have $\Pi = \frac1T\int_0^T\sigma(X_s^x)^{-1}\nabla X_s^x\,dW_s$.)

Proof. We observe that, for $0 \le s \le T$,

$$\partial_x u(0,x) = \mathbb{E}\big[\partial_x u(s, X_s^x)\nabla X_s^x\big],$$

and then

$$\partial_x u(0,x) = \frac1T\int_0^T\mathbb{E}\big[\partial_x u(s, X_s^x)\nabla X_s^x\big]\,ds.$$

By the Itô isometry,

$$\mathbb{E}\Big[\int_0^T\partial_x u(s, X_s^x)\nabla X_s^x\,ds\Big] = \mathbb{E}\Big[\int_0^T\partial_x u(s, X_s^x)\sigma(X_s^x)\,dW_s\int_0^T\sigma(X_s^x)^{-1}\nabla X_s^x\,dW_s\Big].$$

And via the martingale representation theorem, since $u$ is a solution to the PDE,

$$\int_0^T\partial_x u(s, X_s^x)\sigma(X_s^x)\,dW_s = g(X_T^x) - u(0,x). \qquad \square$$

Remark 3.1. If $X_t := x + bt + \sigma W_t$ then $\nabla X_t = 1$ and we have

$$\partial_x\mathbb{E}[\phi(X_t)] = \mathbb{E}\Big[\phi(X_t)\,\frac{W_t}{\sigma t}\Big]. \tag{3.5}$$
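A minimal sketch of a Monte Carlo estimator based on (3.4), approximating the stochastic integral in the weight by a left-point Riemann sum along the Euler paths of $(X, \nabla X)$ ($d = 1$; the function names and arguments are placeholders):

```python
import numpy as np

def bismut_delta(g, b, db, sigma, dsigma, x0, T, n, n_paths, rng=None):
    """Delta via Bismut's formula (3.4): the weight (1/T) int_0^T
    sigma(X_s)^{-1} grad X_s dW_s is discretised along the Euler paths
    of the SDE and of its tangent process (d = 1, sigma > 0)."""
    rng = rng or np.random.default_rng()
    h = T / n
    X = np.full(n_paths, x0)
    gradX = np.ones(n_paths)
    weight = np.zeros(n_paths)
    for _ in range(n):
        dW = rng.normal(0.0, np.sqrt(h), n_paths)
        weight += gradX / sigma(X) * dW       # left-point discretisation of the integral
        X_new = X + b(X) * h + sigma(X) * dW
        gradX = gradX + db(X) * gradX * h + dsigma(X) * gradX * dW
        X = X_new
    return (g(X) * weight / T).mean()
```

Unlike the pathwise estimator, this weight-based estimator applies to merely measurable bounded payoffs $g$.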
4 U.S. options in complete markets

4.1 Definition and first properties

• An American option can be exercised at any time until its maturity $T > 0$.

• Setting: interest rate $r$ constant, $X$ solution on $[0,T]$ to

$$X_t = X_0 + \int_0^t rX_s\,ds + \int_0^t\sigma(s, X_s)\,dW_s,$$

where $W$ is a standard BM under the risk-neutral probability, and the market is complete ($\sigma$ is invertible).

Upon exercise, the option pays to the owner $g(X_t)$ ($g$ is the exercise payoff, Lipschitz continuous).

• Super-hedging price:

$$p_0^{us} := \inf\{y \in \mathbb{R} \,|\, \exists\,\phi \text{ admissible financial strategy s.t. } V_t^{y,\phi} \ge g(X_t)\ \forall\,t\in[0,T]\}.$$

• Dual representation of the price (optimal stopping problem):

↪ at time 0:

$$p_0^{us} = \sup_{\tau\in\mathcal{T}_{[0,T]}}\mathbb{E}\big[e^{-r\tau}g(X_\tau)\big] = \mathbb{E}\big[e^{-r\tau^\star}g(X_{\tau^\star})\big];$$

↪ at time $t \in [0,T]$:

$$p_t^{us} = \mathrm{esssup}_{\tau\in\mathcal{T}_{[t,T]}}\,\mathbb{E}\big[e^{-r(\tau-t)}g(X_\tau)\,\big|\,\mathcal{F}_t\big].$$

The discounted price is a super-martingale. One easily proves that $p_0^{us} \ge \sup_{\tau\in\mathcal{T}_{[0,T]}}\mathbb{E}\big[e^{-r\tau}g(X_\tau)\big]$.

• (Non-linear) PDE representation: $p^{us}(t,x) = \sup_{\tau\in\mathcal{T}_{[t,T]}}\mathbb{E}\big[e^{-r(\tau-t)}g(X_\tau^{t,x})\big]$ is a solution³ to

$$\min\big\{-\partial_t u - \mathcal{L}^X u + ru,\ u - g\big\} = 0 \quad \text{on } [0,T)\times\mathbb{R},$$

$$u(T,\cdot) = g(\cdot).$$

• This comes from the dynamic programming principle: for all stopping times $t \le \theta \le T$,

$$p^{us}(t,x) = \sup_{\tau\in\mathcal{T}_{[t,T]}}\mathbb{E}\big[e^{-r(\tau-t)}g(X_\tau^{t,x})\mathbf{1}_{\{\tau<\theta\}} + e^{-r(\theta-t)}p^{us}(\theta, X_\theta^{t,x})\mathbf{1}_{\{\theta\le\tau\}}\big]$$

(see e.g. Chapter 5 in [68]).

³ In some sense... e.g. the viscosity sense.
4.2 Bermudan options

• An option that can be exercised at a discrete set of times ($\mathcal{R}$) during its life:

$$\mathcal{R} = \{0 =: s_0, \dots, s_j, \dots, s_\kappa := T\}.$$

• Super-hedging price:

$$p_0^b := \inf\{y \in \mathbb{R} \,|\, \exists\,\phi \text{ admissible financial strategy s.t. } V_s^{y,\phi} \ge g(X_s)\ \forall\,s\in\mathcal{R}\}.$$

• Dual representation of the price:

↪ at time 0:

$$p_0^b = \sup_{\tau\in\mathcal{T}^{\mathcal{R}}_{[0,T]}}\mathbb{E}\big[e^{-r\tau}g(X_\tau)\big] = \mathbb{E}\big[e^{-r\tau^*}g(X_{\tau^*})\big];$$

↪ at time $t \in \mathcal{R}$:

$$p_t^b = \mathrm{esssup}_{\tau\in\mathcal{T}^{\mathcal{R}}_{[t,T]}}\,\mathbb{E}\big[e^{-r(\tau-t)}g(X_\tau)\,\big|\,\mathcal{F}_t\big].$$

• Backward programming algorithm (denoting $G := g(X)$):

Definition 4.1. (i) At $T$, set $Y_T = G_T$.

(ii) For $j < \kappa$, compute

$$Y_{s_j} = \max\big(C_{s_j}, G_{s_j}\big), \quad \text{where} \quad C_{s_j} := \mathbb{E}\big[e^{-r(s_{j+1}-s_j)}Y_{s_{j+1}}\,\big|\,\mathcal{F}_{s_j}\big] \quad \text{(continuation value)}.$$

We have $Y_t = p_t^b$, $t \in \mathcal{R}$.

• We now set $r = 0$ and prove the above results.
Proposition 4.1. (i) $Y$ is the smallest super-martingale above $G = g(X)$.

(ii) An optimal stopping time $\tau^\star$ is given by

$$\tau^\star = \inf\{t \in \mathcal{R} \,|\, Y_t = G_t\} \wedge T,$$

i.e. $Y_0 = \mathbb{E}[G_{\tau^\star}]$.

(iii) Martingale (Doob) decomposition: $Y_t = Y_0 + M_t^\star - A_t^\star$.

(iv) $Y$ is the super-hedging price.

Proof.

(i) By induction.

(ii) Since $Y$ is a super-martingale above $G$ on $\mathcal{R}$, we observe that $Y_0 \ge \mathbb{E}[Y_\tau] \ge \mathbb{E}[G_\tau]$ for all $\tau \in \mathcal{T}^{\mathcal{R}}_{[0,T]}$.

Let us define $\tau^\star = \inf\{t \in \mathcal{R} \,|\, Y_t = G_t\}$. We compute

$$\mathbb{E}[Y_{\tau^\star} - Y_0] = \mathbb{E}\Big[\sum_{j=0}^{\kappa-1}(Y_{s_{j+1}} - Y_{s_j})\mathbf{1}_{\{\tau^\star>s_j\}}\Big] = \mathbb{E}\Big[\sum_{j=0}^{\kappa-1}\mathbb{E}\big[(Y_{s_{j+1}} - Y_{s_j})\mathbf{1}_{\{\tau^\star>s_j\}}\,\big|\,\mathcal{F}_{s_j}\big]\Big]$$

$$= \mathbb{E}\Big[\sum_{j=0}^{\kappa-1}\big(\mathbb{E}\big[Y_{s_{j+1}}\,\big|\,\mathcal{F}_{s_j}\big] - Y_{s_j}\big)\mathbf{1}_{\{\tau^\star>s_j\}}\Big] = 0,$$

since $Y_{s_j} = C_{s_j} = \mathbb{E}[Y_{s_{j+1}}|\mathcal{F}_{s_j}]$ on $\{\tau^\star > s_j\}$. Together with $Y_{\tau^\star} = G_{\tau^\star}$, this proves that $Y_0 = \sup_{\tau\in\mathcal{T}^{\mathcal{R}}_{[0,T]}}\mathbb{E}[G_\tau]$.

(iii) For $t \in \mathcal{R}$, we have that $Y_t = Y_0 + M_t^\star - A_t^\star$, where $M^\star$ is a martingale and $A^\star$ is an increasing process, given recursively by $(M_0^\star, A_0^\star) = (0,0)$ and

$$M_{s_{j+1}}^\star = M_{s_j}^\star + Y_{s_{j+1}} - \mathbb{E}\big[Y_{s_{j+1}}\,\big|\,\mathcal{F}_{s_j}\big], \qquad A_{s_{j+1}}^\star = A_{s_j}^\star + Y_{s_j} - \mathbb{E}\big[Y_{s_{j+1}}\,\big|\,\mathcal{F}_{s_j}\big].$$

We notice that $Y_{s_j} - \mathbb{E}\big[Y_{s_{j+1}}\,\big|\,\mathcal{F}_{s_j}\big] = [G_{s_j} - C_{s_j}]_+$.

(iv) Denote

$$\Gamma := \{y \in \mathbb{R} \,|\, \exists\,\phi \text{ admissible financial strategy s.t. } V_s^{y,\phi} \ge g(X_s)\ \forall\,s\in\mathcal{R}\}.$$

a. Note that for $y \in \Gamma$, $V^{y,\phi}$ is a super-martingale above $G$, and thus in particular

$$y = V_0^{y,\phi} \ge Y_0,$$

which leads to $\inf\Gamma \ge Y_0$.

b. Denote $\eta^\star = Y_0 + M^\star$; we have that

$$Y_0 = \mathbb{E}[\eta_{\tau^\star}^\star] = \mathbb{E}[\eta_T^\star].$$

In our setting, this means that we can replicate $\eta_T^\star$ with an initial wealth of $Y_0$, i.e. there exists an admissible strategy $\phi$ s.t.

$$\eta_t^\star = V_t^{Y_0,\phi}, \quad \text{and then} \quad V_t^{Y_0,\phi} \ge Y_t \ge G_t, \quad t \in \mathcal{R}.$$

Thus $Y_0 \in \Gamma$. □
Proposition 4.2. Set $|\mathcal{R}| = \max_{0<j\le\kappa}(s_j - s_{j-1})$; then the following holds:

$$\sup_{t\in[0,T]}\mathbb{E}\big[|p_t^b - p_t^{us}|^2\big] \le C|\mathcal{R}|^\alpha,$$

where $\alpha = 1$ if $g$ is Lipschitz continuous and $\alpha = 2$ if $g \in C_b^2$.

Proof. We first observe that

$$0 \le p_t^{us} - p_t^b.$$

Let $\bar\tau$ be the projection on the grid $\mathcal{R}$ of $\tau^*$ (the optimal stopping time for the US option); we have

$$p_t^{us} - p_t^b = \mathbb{E}\big[g(X_{\tau^*})\,\big|\,\mathcal{F}_t\big] - p_t^b \le \mathbb{E}\big[g(X_{\tau^*})\,\big|\,\mathcal{F}_t\big] - \mathbb{E}\big[g(X_{\bar\tau})\,\big|\,\mathcal{F}_t\big] \le C\,\mathbb{E}\big[|X_{\tau^*} - X_{\bar\tau}|\,\big|\,\mathcal{F}_t\big] \le C\sqrt{|\mathcal{R}|}.$$

If $g \in C_b^2$, we apply Itô's formula to obtain

$$p_t^{us} - p_t^b \le C\,\mathbb{E}\Big[\Big|\int_{\tau^*}^{\bar\tau}\mathcal{L}^X g(X_s)\,ds\Big|\,\Big|\,\mathcal{F}_t\Big] \le C|\mathcal{R}|. \qquad \square$$
4.2.1 Discretisation of the forward process

• We now introduce a discretisation grid $\pi$:

$$\pi = \{0 =: t_0 < \dots < t_i < \dots < t_n := T\},$$

such that $\mathcal{R} \subset \pi$.

• We define the Bermudan option associated to the exercise payoff $g(X^\pi)$:

$$p_0^\pi = \sup_{\tau\in\mathcal{T}^{\mathcal{R}}_{[0,T]}}\mathbb{E}\big[e^{-r\tau}g(X_\tau^\pi)\big] = \mathbb{E}\big[e^{-r\tau^*}g(X_{\tau^*}^\pi)\big].$$

Proposition 4.3. In our setting, we have

$$|p_0^b - p_0^\pi| \le C|\pi|^{\frac12}$$

when $g$ is Lipschitz continuous.

Proof. See Exercise II.18. □
Extension to the continuous case

• We assume that $\mathcal{R} = \pi$.

• Combining Proposition 4.2 and Proposition 4.3, we have

$$|p_0^{us} - p_0^\pi| \le C\sqrt{|\pi|}.$$
4.2.2 Longstaff-Schwartz algorithm

• Observe that in Definition 4.1, the optimal stopping time $\hat\tau_0$ (from time 0) can be estimated by:

1. set $\hat\tau_\kappa = T$;

2. then set $\hat\tau_j = s_j\mathbf{1}_{A_j} + \hat\tau_{j+1}\mathbf{1}_{A_j^c}$ with $A_j = \{Y_{s_j} \le G_{s_j}\}$.

↪ Indeed, one has $Y_0 = \mathbb{E}[G_{\hat\tau_0}]$, namely $\tau^\star = \hat\tau_0$.

• The Longstaff-Schwartz algorithm [59] focuses on this sequence of optimal exercise times.

Definition 4.2. (Longstaff-Schwartz)

1. Set $\tilde\tau_\kappa = T$.

2. Then set $\tilde\tau_j = s_j\mathbf{1}_{E_j} + \tilde\tau_{j+1}\mathbf{1}_{E_j^c}$ with $E_j = \big\{\mathbb{E}\big[G_{\tilde\tau_{j+1}}\,\big|\,\mathcal{F}_{s_j}\big] \le G_{s_j}\big\}$.

↪ Then one has $Y_0 = \mathbb{E}[G_{\tilde\tau_0}]$.

• Note that the value process $Y$ is computed only through its representation in terms of optimal stopping times: indeed, $Y_{s_j} = \max\big(G_{s_j}, \mathbb{E}\big[G_{\tilde\tau_{j+1}}\,\big|\,\mathcal{F}_{s_j}\big]\big)$.
4.3 Dual approach

• We work in the Bermudan case on the discrete time grid <.

• Let M0 be the space of square integrable martingales M such that M0 = 0.

Theorem 4.1. The following holds:

Y0 = sup_τ E[Gτ] = inf_{M∈M0} E[ sup_{t∈ℜ} (Gt − Mt) ] . (4.1)

The inf is achieved for M*, the martingale part of the Doob-Meyer decomposition,
and

Y0 = sup_{t∈ℜ} (Gt − M*_t) .

• The previous representation has been introduced in [71] (for the continuous
case) and simultaneously in [47], where the formula involves supermartingales.

Proof. 1. First, we observe that, for any M ∈ M0,

Y0 = sup_τ E[Gτ − Mτ] ≤ E[ sup_{t∈ℜ} (Gt − Mt) ] .

2. Recall the Doob-Meyer decomposition theorem: Yt = Y0 + M*_t − A*_t, where A* is
an increasing predictable process with A*_0 = 0. Thus, we have

Gt ≤ Yt = Y0 + M*_t − A*_t ,

which leads to

(Gt − M*_t) ≤ Y0 − A*_t ≤ Y0 . 2

• In [71], sub-optimal martingales are constructed in an "ad-hoc" way; see Section
4.4.2 for possible systematic approaches.

• Numerical methods based on the dual formula (4.1) naturally lead to upper
bounds for the true price.

4.4 Implementation using regression techniques

• The regression part is required to estimate the conditional expectations (linear
or nonlinear regression [74, 47]).

• Approximation of the Bermudan option price on

ℜ = {0 =: s0 , . . . , sj , . . . , sκ := T } .

• Assume perfect simulation of the underlying and r = 0 (w.l.o.g.).

• Use the Backward Algorithm:

(i) at T , set Yκ = g(XT ).

(ii) For j < κ, compute

Yj = max( E[ Yj+1 | F_{s_j} ] , g(X_{s_j}) )

,→ need to compute the continuation value.

4.4.1 Linear Regression-based methods

• An easy backward induction proves Yj = uj(X_{s_j}), 1 ≤ j ≤ κ (X is Markovian).

• Denote Cj := cj(X_{s_j}) := E[ Yj+1 | X_{s_j} ] = E[ Yj+1 | F_{s_j} ] (continuation value);
we have

Yj = cj(X_{s_j}) ∨ g(X_{s_j}) .

How to approximate Cj? (assuming all quantities are square integrable)

• Observe that

Cj = argmin_{Z∈L²(F_{s_j})} E[ |Yj+1 − Z|² ] = argmin_{Z∈L²(σ(X_{s_j}))} E[ |Yj+1 − Z|² ] .

• Denoting L²_j the set of P_{X_{s_j}}-square integrable functions, we observe

cj = argmin_{f∈L²_j} E[ |Yj+1 − f(X_{s_j})|² ] .

• L²_j is too big! So one considers some basis functions (ψ_`)_{`≥1} and then picks a
finite number of them, say K, to get

f ≃ Σ_{`=1}^K β^` ψ_`

((β^`)_{1≤`≤K} depends on f ...)

• With this approximation, the minimisation problem can be rewritten

β̄_j = argmin_{β∈R^K} E[ | u_{j+1}(X_{s_{j+1}}) − Σ_{`=1}^K β^` ψ_`(X_{s_j}) |² ] (4.2)

and one sets

c̄_j := Σ_{`=1}^K β̄_j^` ψ_` , and cj = c̄_j + error

(Cj = C̄j + error, where C̄j = Σ_{`=1}^K β̄_j^` ψ_`(X_{s_j}).)

• The coefficient β̄_j is easily calculated, once it is observed that C̄j is the orthogonal
projection of Yj+1 on V = Vect(ψ1(Xj), . . . , ψK(Xj)):

We denote ψ := (ψ1, . . . , ψK) and set

B^j_ψ = E[ ψ(Xj)^⊤ ψ(Xj) ] , B^j_{ψu} = E[ ψ(Xj)^⊤ u_{j+1}(X_{j+1}) ] ;

we have that

β̄_j = (B^j_ψ)^{−1} B^j_{ψu} ,

assuming B^j_ψ non-singular.

By the property of the orthogonal projection on the vector space V, we have
E[ (Yj+1 − C̄j) Z ] = 0 for all Z ∈ V, leading to

E[ ( Yj+1 − Σ_{`=1}^K β̄_` ψ_`(Xj) ) ψ_r(Xj) ] = 0 , 1 ≤ r ≤ K .

• In practice, one has to consider estimated counterparts of B^j_ψ, B^j_{ψu}, e.g.

B̂^j_ψ = (1/N) Σ_{i=1}^N ψ(X^i_j)^⊤ ψ(X^i_j) ,

where (X^i_j)_{1≤i≤N} are i.i.d. copies of Xj.
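As an illustration, the empirical regression step can be written in a few lines; this is only a sketch, where the helper psi (the feature map returning the K basis functions) is a hypothetical name and a polynomial basis is used for concreteness.

```python
import numpy as np

def regression_step(X_j, U_next, psi):
    """Estimate beta_bar_j in (4.2) by least squares over N simulated paths."""
    Psi = psi(X_j)                                       # (N, K) design matrix
    beta, *_ = np.linalg.lstsq(Psi, U_next, rcond=None)  # solves B_psi beta = B_{psi u}
    return beta                                          # continuation value ~ Psi @ beta

psi_poly = lambda x: np.vander(np.asarray(x), 4, increasing=True)  # basis 1, x, x^2, x^3
```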

• Full approximation (Tsitsiklis-Van Roy): observe that in practice one has to
estimate ū_j = c̄_j ∨ g instead of u_j, leading to the following method (hats denote
the estimated counterparts of the barred quantities):

1. Simulate (X^i_j)_{1≤i≤N}, i.i.d. copies of Xj, for j ≤ κ.

2. At T, set Y^i_κ = g(X^i_κ) and û_κ = g.

3. For j < κ, compute β̂_j = (B̂^j_ψ)^{−1} B̂^j_{ψ û_{j+1}};

set ĉ_j := Σ_{`=1}^K β̂^`_j ψ_` and û_j = ĉ_j ∨ g.

• One can compute an approximated optimal policy τ̂* on each path and recompute
E[g(X_{τ̂*})].

Remark 4.1. 1. The choice of the basis functions is key, especially in high
dimension.

2. The linear-regression method is used for the Longstaff-Schwartz algorithm [59].

3. The error analysis (approximation/statistical) is involved, see [25, 39, 36].

4. This is a low-biased estimator of the American option price.

4.4.2 Implementation of the dual approach

• Based on Theorem 4.1, one should find a "good" martingale. Various approaches
beyond "guessing" [71] have been considered, see e.g. [8] and the references therein.

• We recall the method in [47], see also [33, Chapter 8].

• The martingale to use in (4.1) is M*, which is given by

ΔM*_j = Y_{s_{j+1}} − E[ Y_{s_{j+1}} | F_{s_j} ] . (4.3)

• To ensure the martingale property, Haugh & Kogan suggest to recompute the
previous martingale increment for each simulated path ⇒ Nested Simulation:

– An approximation û of the US option price has been computed.

– Simulate M paths of X: for each X^m_{s_j}, compute E[ û(X_{s_{j+1}}) | X_{s_j} ] by
resimulation and set

M̂_{j+1} = M̂_j + Δ̂_j with Δ̂_j = û(X_{s_{j+1}}) − E[ û(X_{s_{j+1}}) | X_{s_j} ] . (4.4)

– Compute (1/M) Σ_{m=1}^M max_{t∈ℜ} ( g(X^m_t) − M̂^m_t ) .
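A possible implementation of this nested-simulation procedure is sketched below for a Black-Scholes Bermudan put with r = 0; u_hat is a user-supplied approximation of the value function at each exercise date (e.g. obtained by regression), and all parameters are illustrative. Recall that any (approximate) martingale yields a valid upper bound, so even a crude u_hat is usable.

```python
import numpy as np

def dual_upper_bound(u_hat, S0=100., K=100., sigma=0.2, T=1., kappa=10,
                     n_outer=500, n_inner=500, seed=0):
    rng = np.random.default_rng(seed)
    h = T / kappa
    step = lambda s, z: s * np.exp(-0.5 * sigma**2 * h + sigma * np.sqrt(h) * z)
    payoff = lambda s: np.maximum(K - s, 0.0)
    S, M = np.full(n_outer, S0), np.zeros(n_outer)
    best = payoff(S) - M                              # running max of G_t - M_hat_t
    for j in range(kappa):
        inner = step(S[:, None], rng.standard_normal((n_outer, n_inner)))
        cond = u_hat(j + 1, inner).mean(axis=1)       # E[u_hat(S_{j+1}) | S_j] by resimulation
        S = step(S, rng.standard_normal(n_outer))
        M += u_hat(j + 1, S) - cond                   # martingale increment (4.4)
        best = np.maximum(best, payoff(S) - M)
    return best.mean()                                # high-biased price estimate

# crude illustrative choice: u_hat = payoff itself
print(dual_upper_bound(lambda j, s: np.maximum(100. - s, 0.0)))
```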

4.5 Quantization based methods

4.5.1 Introduction - cubature formula

See e.g. the review article [62] or Chapter 5 in the book [63]

• Let X be an Rd-valued random variable. Quantization → find the best approximation
of X by a discrete random variable taking at most M different values x :=
{x1 , . . . , xM }.

• The Voronoi tessellation of the M-quantizer x is a (Borel) partition C(x) := (Ci(x))_{1≤i≤M}
s.t.

Ci(x) ⊂ {ξ ∈ Rd | |ξ − xi| ≤ min_{j≠i} |ξ − xj|} .

• Nearest neighbour projection on x: Px : ξ ∈ Rd ↦ xi if ξ ∈ Ci(x).

• x-quantization of X: X̂^x = Px(X) (remark: if X is absolutely continuous, any
two x-quantizations are P-a.s. equal).

• Pointwise error: |X − X̂^x| := d(X, {x1, . . . , xM}) = min_{1≤i≤M} |X − xi|.

• Quadratic mean quantization error: E[ |X − X̂^x|² ]^{1/2} =: ‖X − X̂^x‖₂.

Optimal quantization (in this presentation: X has infinite support)

• Proposition 1 in [62] or Theorem 5.1 in [63]:

1. The quadratic distortion function x ↦ E[ |X − X̂^x|² ] reaches a minimum
at some quantizer x*.

2. The function M ↦ E[ |X − X̂^{x*}|² ] decreases to 0 as M → +∞.

• Upper bound on the convergence rate for X ∈ L^{2+ε}, for some ε > 0: there
exists C_{d,ε} s.t.

∀M ≥ 1, min_{x∈(Rd)^M} ‖X − X̂^x‖₂ ≤ C_{d,ε} ‖X‖_{2+ε} M^{−1/d} . (4.5)

This is Zador's Theorem, see e.g. [60].

• Any L²-optimal M-quantizer is the best approximation of X by L² r.v. taking
at most M values (least squares approximation), i.e.

‖X − X̂^x‖₂ := min{ ‖X − Y‖₂ | #Im(Y) ≤ M } . (4.6)

Proof. Set y := Im(Y) = {y1, . . . , yM} and observe that

min_i |X − yi| ≤ |X − Y| P-a.s. ,

so that ‖X − Y‖₂ ≥ ‖X − X̂^y‖₂ ≥ ‖X − X̂^x‖₂. 2

• Any L²-optimal M-quantizer x* ∈ (Rd)^M is stationary in the following sense:

E[ X | X̂^{x*} ] = X̂^{x*} . (4.7)

(In particular E[X] = E[X̂^x] as soon as the quantizer is stationary, which
generally does not correspond only to minima.)

Proof. We compute

E[ |X − X̂^{x*}|² ]
= E[ ( X − E[X | X̂^{x*}] )² + 2( X − E[X | X̂^{x*}] )( E[X | X̂^{x*}] − X̂^{x*} ) + ( E[X | X̂^{x*}] − X̂^{x*} )² ]
= E[ ( X − E[X | X̂^{x*}] )² ] + E[ ( E[X | X̂^{x*}] − X̂^{x*} )² ] ,

where the last equality is obtained by conditioning w.r.t. X̂^{x*}.

From the previous point, we also have, as E[X | X̂^{x*}] takes at most M different
values (it is a measurable function of X̂^{x*}),

E[ |X − X̂^{x*}|² ] ≤ E[ ( X − E[X | X̂^{x*}] )² ] .

This leads to

E[ ( E[X | X̂^{x*}] − X̂^{x*} )² ] = 0 .

• Two M-quantizers (M = 500) of N(0, I2), one of them being (close to being)
optimal... [62]

For a discussion of how to obtain optimal quantization grids, we refer to Section
5.3 in [63].

Cubature formulas

• Below, x is an L²-optimal M-quantizer for X.

• One naturally computes

E[g(X)] ≃ E[ g(X̂^x) ] = Σ_{i=1}^M g(xi) P(X ∈ Ci(x)) .

• If g is Lipschitz,

| E[g(X)] − E[ g(X̂^x) ] | ≤ [g]_Lip E[ |X − X̂^x| ] ≤ [g]_Lip ‖X − X̂^x‖₂ .

• If g is C¹ with Lipschitz derivative,

| E[g(X)] − E[ g(X̂^x) ] | ≤ [Dg]_Lip E[ |X − X̂^x|² ] , (4.8)

as X̂^x is stationary.

• Convergence when M → ∞ follows from (4.5).
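As an illustration, the following sketch builds a rough quadratic quantizer of N(0, 1) by a Monte Carlo Lloyd fixed point (each codepoint is replaced by the empirical mean of its Voronoi cell) and then applies the cubature formula; it is a toy version, not the algorithms discussed in [62, 63], and all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
M, n_samples, n_iter = 20, 200_000, 30
sample = rng.standard_normal(n_samples)        # Monte Carlo pool for N(0,1)
x = np.sort(rng.standard_normal(M))            # initial M-quantizer

for _ in range(n_iter):                        # Lloyd step: x_i <- E[X | X in C_i(x)]
    cells = np.argmin(np.abs(sample[:, None] - x[None, :]), axis=1)
    for i in range(M):
        if np.any(cells == i):
            x[i] = sample[cells == i].mean()

cells = np.argmin(np.abs(sample[:, None] - x[None, :]), axis=1)
weights = np.bincount(cells, minlength=M) / n_samples   # estimates P(X in C_i(x))
g = lambda t: np.maximum(t - 0.5, 0.0)         # an illustrative Lipschitz payoff
print(np.dot(g(x), weights), g(sample).mean()) # cubature formula vs. plain Monte Carlo
```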

Conditional expectation

• For X, Y, denote their quantized versions by X̂, Ŷ.

• Approximation of Θ = E[F(X)|Y] by Θ̂ = E[ F(X̂) | Ŷ ]...

• One needs to compute X̂, Ŷ and the law of (X̂, Ŷ).

• Say E[F(X)|Y] = ϕ_F(Y) with ϕ_F Lipschitz; then:

‖Θ − Θ̂‖₂ ≤ C( ‖X − X̂‖₂ + ‖Y − Ŷ‖₂ ) . (4.9)

Proof. See Exercise II.19. 2

4.5.2 Quantization tree for optimal stopping problem

This technique has been introduced in [6, 5], where a complete error analysis is done;
see also Section 11.3.2 in [63].

• We are given a discrete-time Markov chain (Xk)_{0≤k≤n} on a grid π (samples of
(Xt)_{0≤t≤T} or an associated scheme).

• We use the Backward Programming algorithm of Definition 4.1.

• In the Markovian setting, recall that

Yk = uk(Xk) where uk = ck ∨ g and ck(Xk) = E[ u_{k+1}(X_{k+1}) | Xk ] .

• For each k, we consider the (optimal) quantization X̂k of Xk on the grid
Ck := (C^i_k)_{1≤i≤Mk}.

• For any ϕ, we replace E[ϕ(X_{k+1})|Xk] by E[ ϕ(X̂_{k+1}) | X̂k ]; denote then

π^k_{ij} = P( X̂_{k+1} = x^j_{k+1} | X̂k = x^i_k ) .

• Define the pseudo-Snell envelope:

Ŷn = g(X̂n) ,
Ŷk = max{ E[ Ŷ_{k+1} | X̂k ] , g(X̂k) } . (4.10)

• A backward induction yields Ŷk = ŷk(X̂k), where

ŷn = g ,
ŷk(x^i_k) = max{ Σ_{j=1}^{M_{k+1}} π^k_{ij} ŷ_{k+1}(x^j_{k+1}) , g(x^i_k) } . (4.11)

• 'Online' computational cost: ∼ Σ_{k=0}^{n−1} Mk M_{k+1}.

• Note that the grids and the transition weights (π^k) are computed offline, e.g. by MC:

π^k_{ij} = lim_{N→∞} #{n | X̂^n_k = x^i_k & X̂^n_{k+1} = x^j_{k+1}} / #{n | X̂^n_k = x^i_k} ,

where (X^n_k)_{1≤n≤N, 1≤k≤κ} are MC simulations of the Markov chain (Xk)_{1≤k≤κ}.
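For illustration, the offline counting estimator and one step of the recursion (4.11) can be sketched as follows; the array names (Iq_k, Iq_k1, holding for each path the index of the quantization cell hit at dates k and k+1) are ours, not the course's.

```python
import numpy as np

def transition_matrix(Iq_k, Iq_k1, M_k, M_k1):
    P = np.zeros((M_k, M_k1))
    np.add.at(P, (Iq_k, Iq_k1), 1.0)              # joint cell-occupation counts
    row = P.sum(axis=1, keepdims=True)
    return np.divide(P, row, out=np.zeros_like(P), where=row > 0)

def backward_step(P_k, y_next, g_k):
    # y_hat_k(x_i) = max( sum_j pi^k_{ij} y_hat_{k+1}(x_j), g(x_i) ), cf. (4.11)
    return np.maximum(P_k @ y_next, g_k)
```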

4.5.3 Markovian quantization (grid method)

• Given δ > 0 and κ ∈ N*, we consider the bounded lattice grid:

Γ = {x ∈ δZ^d | |x^j| ≤ κδ, 1 ≤ j ≤ d} .

Observe that there are (2κ + 1)^d points in Γ.

• We introduce a projection operator Π on the grid Γ centered in X0, given by,
for x ∈ Rd,

(Π[x])^j = δ⌊δ^{−1}(x^j − X0^j) + 1/2⌋ + X0^j , if |x^j − X0^j| ≤ κδ ,
(Π[x])^j = κδ ,                                if x^j − X0^j > κδ ,
(Π[x])^j = −κδ ,                               if x^j − X0^j < −κδ .

• We use an optimal quantization of the Gaussian random variables (ΔWi):

ΔŴi := √hi GM( ΔWi / √hi ) ,

where GM denotes the projection operator on the optimal quantization grid for the
standard Gaussian distribution with M points in the support⁴.

Moreover, it is shown in [41] that

E[ |ΔWi − ΔŴi|^p ]^{1/p} ≤ C_{p,d} √h M^{−1/d} . (4.12)

• We introduce the following discrete/truncated version of the Euler scheme:

X̂^π_0 = X0 ,
X̂^π_{i+1} = Π[ X̂^π_i + hi b(X̂^π_i) + σ(X̂^π_i) ΔŴi ] . (4.13)

We observe that X̂^π is a Markovian process living on Γ and satisfying |X̂^π_i| ≤
C(|X0| + κδ), for all i ≤ n.


⁴ The grids can be downloaded from the website: http://www.quantize.maths-fi.com/ .
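A one-dimensional sketch of the scheme (4.13) could look as follows; the Gaussian quantizer used here is a crude quantile grid, NOT the optimal grid of footnote 4 (which one would load from the website above), and all parameters and function names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def project(x, x0, delta, kappa):
    z = np.floor((x - x0) / delta + 0.5)            # nearest lattice point, as in Pi
    return x0 + delta * np.clip(z, -kappa, kappa)   # truncation of the grid

def quantized_euler_paths(x0, b, sigma, h, n, delta, kappa, M, n_paths=10_000, seed=0):
    rng = np.random.default_rng(seed)
    grid = norm.ppf((np.arange(M) + 0.5) / M)       # crude M-point grid for N(0,1)
    X = np.full(n_paths, float(x0))
    for _ in range(n):
        z = rng.standard_normal(n_paths)
        g = grid[np.argmin(np.abs(z[:, None] - grid[None, :]), axis=1)]  # G_M(z)
        X = project(X + h * b(X) + np.sqrt(h) * sigma(X) * g, x0, delta, kappa)
    return X                                        # Markov chain living on the grid

X = quantized_euler_paths(1.0, b=lambda x: 0.0 * x,
                          sigma=lambda x: 0.3 * np.ones_like(x),
                          h=0.01, n=100, delta=0.02, kappa=200, M=20)
```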

Definition 4.3. We denote by (Ŷ^π_i)_{0≤i≤n} the solution of the backward scheme satisfying

(i) The terminal condition is Ŷ^π_n := g(X̂^π_n) ;

(ii) for i < n, the transition from step i + 1 to step i is given by

Ŷ^π_i = max( E_{ti}[ Ŷ^π_{i+1} ] , g(X̂^π_i) ) . (4.14)

Proposition 4.4. For all i ∈ {0, ..., n}, there exists a function u^π(ti, ·) : Γ → R
such that

Ŷ^π_i = u^π(ti, X̂^π_i) .

This function is computed on the grid by the following backward induction: for all
i ∈ {0, ..., n} and x ∈ Γ,

u^π(ti, x) = max{ E[ u^π( ti+1 , Π[ x + hi b(x) + √hi σ(x) GM(U) ] ) ] , g(x) } ,

with U ∼ N(0, 1).

The terminal condition is given by u^π(tn, x) = g(x).

5 Non-linear pricing methods

5.1 Backward Stochastic Differential Equation

• We work in a Lipschitz setting. We give the main definitions and properties for
solutions to BSDEs.

• BSDEs were first introduced by Bismut [10, 11] and then studied in a general way
by Pardoux and Peng [65]. See also [31, 61, 67].

5.1.1 Definition

• For a prescribed terminal time T > 0, the solution of a backward stochastic
differential equation is a couple (Y, Z) satisfying on [0, T]

dYt = −f(t, Yt, Zt)dt + Zt dWt ,
YT = ξ , (5.1)

for some progressively measurable random function f, called the driver, and a
terminal condition ξ which is an FT-measurable random variable.

• We denote by S²(R^k) the vector space of RCLL⁵ adapted processes Y, with values
in R^k, such that:

‖Y‖²_{S²} := E[ sup_{0≤t≤T} |Yt|² ] < ∞ ; (5.2)

S²_c(R^k) is the subspace of continuous processes.

• The set H²(R^{k×d}) is the set of progressively measurable processes Z, valued in
R^{k×d}, such that

‖Z‖²_{H²} := E[ ∫_0^T |Zt|² dt ] < ∞ ,
⁵ Right Continuous with Left Limits.

where, for z ∈ R^{k×d}, |z|² = Tr(zz†).

Assumptions. The random function f, defined on [0, T] × Ω × R^k × R^{k×d} and
valued in R^k, is such that for all (y, z) ∈ R^k × R^{k×d}, the process {f(t, y, z)}_{0≤t≤T} is
progressively measurable. We also assume that

(H1): There exists a positive constant L such that, P-a.s.,

1. Lipschitz continuity in (y, z): for all t, y, y', z, z',

|f(t, y, z) − f(t, y', z')| ≤ L( |y − y'| + ‖z − z'‖ ) ;

2. Integrability condition:

E[ |ξ|² + ∫_0^T |f(r, 0, 0)|² dr ] < ∞ .

Theorem 5.1. Under (H1), there exists a unique solution (Y, Z) ∈ S²_c × H² to

Yt = ξ + ∫_t^T f(s, Ys, Zs) ds − ∫_t^T Zs dWs , 0 ≤ t ≤ T . (5.3)

Proof. See the proof of Theorem 2.1 in [31]. 2

5.1.2 Some key properties

Linear BSDEs We first study linear BSDE for which we can give an almost

explicit solution. For this section, we set k = 1: Y is then real valued and Z a

d-dimensional row vector.

Proposition 5.1. Let {(at , bt )}t∈[0,T ] be progressively measurable and bounded pro-

cesses with value in R × Rd . Let {ct }t∈[0,T ] be an element of H2 (R) and ξ a random

variable, FT –measurable, square integrable.

The linear BSDE

Yt = ξ + ∫_t^T {ar Yr + Zr br + cr} dr − ∫_t^T Zr dWr (5.4)

has a unique solution given by:

∀t ∈ [0, T], Yt = Γt^{−1} E[ ξΓT + ∫_t^T cr Γr dr | Ft ] , (5.5)

where, for all t ∈ [0, T],

Γt = exp( ∫_0^t br dWr − (1/2)∫_0^t |br|² dr + ∫_0^t ar dr ) .

Proof. See exercise II.20 2

Remark 5.1. We observe that if ξ and c are non-negative then Y ≥ 0.

Comparison Theorem

Theorem 5.2. Let k = 1 and assume that (ξ, f) satisfies (H1); the solution to
the associated BSDE is denoted (Y, Z). Let (Y', Z') be a solution of a BSDE with
parameters (ξ', f') satisfying ∫_0^T f'(t, Y't, Z't)dt ∈ L²(FT). We also assume that,
P-a.s., ξ ≤ ξ' and f(t, Yt, Zt) ≤ f'(t, Yt, Zt) λ ⊗ P-a.e. (λ denoting the Lebesgue
measure). Then,

P-a.s., ∀t ∈ [0, T], Yt ≤ Y't .

If, moreover, Y0 = Y'0, then P-a.s., Yt = Y't, 0 ≤ t ≤ T, and f(t, Yt, Zt) = f'(t, Yt, Zt)
λ ⊗ P-a.e. In particular, as soon as P(ξ < ξ') > 0 or f(t, Yt, Zt) < f'(t, Yt, Zt) on
a set with positive λ ⊗ P-measure, then Y0 < Y'0.

Proof. See Exercise II.21 2

5.1.3 Application to non-linear pricing

• We consider a market with one risky asset, solution of the SDE

Xt = x + ∫_0^t b(s, Xs) ds + ∫_0^t σ(s, Xs) dW̃s , (5.6)

and assume different rates for lending (r) and borrowing (R, with R > r).

• To simplify the notation, we assume that there exists a Brownian motion W
under Q ∼ P s.t.

Xt = x + ∫_0^t rXs ds + ∫_0^t σ(s, Xs) dWs . (5.7)

We will now work under the probability Q.

• A portfolio is constituted of cash α and risky asset (quantity φ); its value at
time t is

V^{υ,φ}_t = αt + φt Xt . (5.8)

• Between t and t + dt, the variation of

- the cash account is: [αt]^+ r dt − [αt]^− R dt = (rαt + [αt]^−(r − R))dt ;

- the risky asset account is: φt dXt .

,→ Working with self-financing strategies, we obtain

dVt = (rVt + [Vt − φt Xt]^−(r − R))dt + φt σ(t, Xt) dWt . (5.9)

• The (super-)replication price for a European option with terminal payoff
g(XT) is defined as

p0 = inf{υ ∈ R | ∃ψ admissible strategy, V^{υ,ψ}_T = g(XT) (≥)} . (5.10)

Equivalently, one has to solve for (Y, Δ) the following BSDE

Yt = g(XT) + ∫_t^T ( rYs + [Ys − Δs Xs]^−(r − R) ) ds − ∫_t^T Δs σ(s, Xs) dWs (5.11)

and obtain that p0 = Y0 and that Δ is an admissible strategy.

Sketch of proof. From Theorem 5.1, there exists a unique solution to

Yt = g(XT) + ∫_t^T ( rYs + [Ys − (Zs/σ(s, Xs)) Xs]^−(r − R) ) ds − ∫_t^T Zs dWs . (5.12)

Setting

φs = Zs / σ(s, Xs) , (5.13)

we have that Yt = V^{Y0,φ}_t and then V^{Y0,φ}_T ≥ g(XT). Thus, Y0 ≥ p0.

Now let v be s.t. ∃ψ, V^{v,ψ}_T ≥ g(XT) = YT. We observe that V^{v,ψ} has the same
dynamics as a BSDE and thus we can use the comparison theorem to conclude that
v ≥ Y0. Taking the infimum over v leads to p0 ≥ Y0.

5.2 Main properties in the Markov setting

See [66].

• One can show that Yt = u(t, Xt) for some Lipschitz function u.

• u is a solution (in the viscosity sense) of

−u^{(0)}(t, x) − f( x, u(t, x), ∂x u(t, x) σ(t, x) ) = 0 , (t, x) ∈ [0, T) × Rd , (5.14)
and u(T, x) = g(x) . (5.15)

• Assuming that u is a smooth solution to the above non-linear PDE, one can
conversely prove that

( Yt = u(t, Xt) , Zt = ∂x u(t, Xt) σ(t, Xt) ) . (5.16)

Proof. Apply Ito's formula. 2

Notations (in the one-dimensional setting). For a function ϕ : [0, T] × R → R, we
define

ϕ^{(0)}(t, x) = L^{(0)}ϕ(t, x) := ∂t ϕ(t, x) + L^X ϕ(t, x) , (5.17)
ϕ^{(1)}(t, x) = L^{(1)}ϕ(t, x) := σ(t, x) ∂x ϕ(t, x) , (5.18)

where L^X is given in (2.7).

For a multi-index β ∈ ∪_{n≥1}{0, 1}^n, with β = (j1, . . . , jk) for some k ≥ 1, we denote
−β := (j2, . . . , jk) and

ϕ_β = L^β ϕ = L^{(j1)}[ L^{−β} ϕ ] .

5.3 Numerical analysis of backward methods

• We seek to approximate (Y, Z) in order to approximate u.

• To alleviate the presentation, we assume that f := f(y, z) only.

• We work under two different sets of assumptions:

1. (HL): the coefficients b, σ, f, g are L-Lipschitz continuous.

2. (Hr): the functions u, b, σ are C^{2,4}_b (up to T).

Backward Algorithm: [17, 76] on a discrete grid π

- at tn = T:

(Y^π_n, Z^π_n) = (g(XT), 0) ; (5.19)

- for i < n, compute

Z^π_i = E_{ti}[ Y^π_{i+1} (W_{ti+1} − W_{ti})/(ti+1 − ti) ] ,
Y^π_i = E_{ti}[ Y^π_{i+1} + (ti+1 − ti) f(Y^π_i, Z^π_i) ] , (5.20)

where the grid π satisfies n|π| ≤ C, with |π| := max_i (ti+1 − ti).

Remark 5.2. (i) This is an implicit Euler scheme; the error is O(|π|^{1/2}) in the Lipschitz case
and, assuming smoothness of the coefficients, at most O(|π|), generally (see below).

(ii) Explicit version of the scheme: in the smooth case, Z^π_n := σ(XT) g'(XT).

(iii) One has to estimate the conditional expectations!
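For concreteness, here is a hedged sketch of the scheme (5.20) in which the conditional expectations are estimated by least-squares regression on an illustrative polynomial basis (anticipating the Monte Carlo methods of Section 5.6.1); the implicit step is solved by a few Picard iterations, and the inputs (Euler paths of X, Brownian increments, driver f, terminal g) are user-supplied.

```python
import numpy as np

def bsde_backward(X, dW, h, f, g, degree=4, n_picard=3):
    """X: (N, n+1) forward paths; dW: (N, n) Brownian increments; h: time step."""
    N, n = dW.shape
    Y = g(X[:, -1])                                    # terminal condition (5.19)
    Z = np.zeros(N)
    for i in range(n - 1, -1, -1):
        basis = np.vander(X[:, i], degree + 1, increasing=True)
        proj = lambda v: basis @ np.linalg.lstsq(basis, v, rcond=None)[0]
        Z = proj(Y * dW[:, i]) / h                     # Z_i ~ E_i[Y_{i+1} dW_i] / h
        Ey = proj(Y)                                   # E_i[Y_{i+1}]
        Ynew = Ey.copy()
        for _ in range(n_picard):                      # implicit step: Y = E_i[Y+] + h f(Y, Z)
            Ynew = Ey + h * f(Ynew, Z)
        Y = Ynew
    return Y.mean(), Z                                 # Y.mean() ~ Y_0 when X_0 is deterministic
```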

5.3.1 L²-stability

• Perturbation approach: consider

Ỹi = E_{ti}[ Ỹ_{i+1} + (ti+1 − ti) f(Ỹi, Z̃i) ] + ζi , (5.21)
Z̃i = E_{ti}[ Ỹ_{i+1} (W_{ti+1} − W_{ti})/(ti+1 − ti) ] , (5.22)

where ζi ∈ L²(F_{ti}).

• Let us set δY := Y^π − Ỹ, δZ := Z^π − Z̃; we introduce

Definition 5.1 (L²-stability). The scheme given in (5.20) is L²-stable if there exists
a constant C > 0 s.t.

max_i E[ |δYi|² ] ≤ C E[ |δYn|² + Σ_{i=0}^{n−1} n ζi² ] (5.23)

for |π| small enough, for all perturbations ζ.

Proposition 5.2. If f is Lipschitz continuous, the scheme (5.20) is L²-stable.

Proof. Let us observe that the scheme can be rewritten

Y^π_i = Y^π_{i+1} + hi f(Y^π_i, Z^π_i) − hi Z^π_i Hi − ΔMi , (5.24)

where Hi = ΔWi/hi. Note that (5.24) defines ΔMi; moreover it satisfies

E_{ti}[ΔMi] = E_{ti}[ΔMi Hi] = 0 and E[ |ΔMi|² ] < ∞ . (5.25)

These properties are obtained by using the definition of the scheme given in (5.20).

For the perturbed scheme, we have similarly:

Ỹi = Ỹ_{i+1} + hi f(Ỹi, Z̃i) + ζi − hi Z̃i Hi − ΔM̃i . (5.26)

Denoting δfi = f(Y^π_i, Z^π_i) − f(Ỹi, Z̃i) and δΔMi = ΔMi − ΔM̃i, we observe that

δYi + hi δZi Hi + δΔMi = δY_{i+1} + hi δfi + ζi . (5.27)

Squaring both sides and taking conditional expectations, we compute, using Young's
inequality,

|δYi|² + hi |δZi|² ≤ (1 + Ch) E_{ti}[ (δY_{i+1} + hi δfi)² ] + (C/h) |ζi|² . (5.28)

Note that

(δY_{i+1} + hi δfi)² ≤ ( |δY_{i+1}| + Chi |δYi| + Chi |δZi| )² (5.29)
≤ (1 + h/ε)( |δY_{i+1}| + Chi |δYi| )² + C(1 + ε/h) h²_i |δZi|² . (5.30)

Choosing h and ε such that C(h + ε) ≤ 1/2, we obtain

(δY_{i+1} + hi δfi)² ≤ (1 + Ch)|δY_{i+1}|² + Ch|δYi|² + (1/2) h²_i |δZi|² . (5.31)

Inserting the previous inequality in (5.28), we get

|δYi|² ≤ (1 + Ch) E_{ti}[ |δY_{i+1}|² ] + (C/h)|ζi|² (5.32)
≤ e^{Ch} E_{ti}[ |δY_{i+1}|² ] + (C/h)|ζi|² . (5.33)

Taking expectations on both sides and iterating over i, we obtain (5.23).

5.4 Convergence analysis assuming no error on X

5.4.1 Truncation error

• Let us introduce

Ẑ_{ti} = E_{ti}[ Y_{ti+1} ΔWi/hi ] . (5.34)

• We define the local truncation error as

ζ̂i := E_{ti}[ ∫_{ti}^{ti+1} ( f(Yt, Zt) − f(Y_{ti}, Ẑ_{ti}) ) dt ] . (5.35)

• It measures how well the true solution satisfies the scheme. Indeed, we observe
that

Y_{ti} = E_{ti}[ Y_{ti+1} ] + hi f(Y_{ti}, Ẑ_{ti}) + ζ̂i ,

i.e. (Y_{ti}, Ẑ_{ti})_i is a perturbed scheme in the sense of (5.21)-(5.22).

• The global truncation error is then defined as

T(π) := Σ_i E[ n|ζ̂i|² ] . (5.36)

• Assume at this point that there is no error made on the forward process, thus
δYn := 0, and we have

max_i E[ |Y_{ti} − Y^π_i|² ] ≤ C T(π) . (5.37)

Proof. This comes directly from the L²-stability of the scheme with the
perturbation given by ζ̂. 2

• We now study the order of the truncation error under (HL) or (Hr).

5.4.2 Order of convergence in the smooth case

Theorem 5.3. Under (Hr), we have that

T(π) ≤ C|π|² (5.38)

and thus the scheme is of order 1.

Proof. (Proof in the one-dimensional case.) 1. We observe that

|ζ̂i| ≤ | E_{ti}[ ∫_{ti}^{ti+1} ( f(Yt, Zt) − f(Y_{ti}, Z_{ti}) ) dt ] | + Chi |Z_{ti} − Ẑ_{ti}| . (5.39)

Using the PDE satisfied by u, we get

| E_{ti}[ f(Yt, Zt) − f(Y_{ti}, Z_{ti}) ] | = | E_{ti}[ u^{(0)}(t, Xt) − u^{(0)}(ti, X_{ti}) ] | (5.40)
= | E_{ti}[ ∫_{ti}^t u^{(0,0)}(s, Xs) ds ] | (5.41)
≤ C|π| , (5.42)

where we used Ito's formula for the second equality. Now we compute, setting Hi :=
ΔWi/hi,

Ẑ_{ti} = E_{ti}[ u(ti+1, X_{ti+1}) Hi ] = E_{ti}[ Hi ∫_{ti}^{ti+1} u^{(0)}(t, Xt) dt + (1/hi) ∫_{ti}^{ti+1} u^{(1)}(t, Xt) dt ] . (5.43)

Observe that

| E_{ti}[ Hi ∫_{ti}^{ti+1} u^{(0)}(t, Xt) dt ] | = | E_{ti}[ Hi ∫_{ti}^{ti+1} { u^{(0)}(t, Xt) − u^{(0)}(ti, X_{ti}) } dt ] | (5.44)
≤ C|π| (5.45)

and that

E_{ti}[ u^{(1)}(t, Xt) ] = E_{ti}[ u^{(1)}(ti, X_{ti}) + ∫_{ti}^t u^{(0,1)}(s, Xs) ds ] . (5.46)

We thus get

| E_{ti}[ (1/hi) ∫_{ti}^{ti+1} u^{(1)}(t, Xt) dt ] − u^{(1)}(ti, X_{ti}) | ≤ C|π| , (5.47)

leading to |Z_{ti} − Ẑ_{ti}| ≤ C|π|. We then obtain that

|ζ̂i| ≤ C|π|² . (5.48)

And, by summing this local error estimate, we conclude that T(π) ≤ C|π|². 2

5.4.3 Order of convergence in the Lipschitz case

• The Lipschitz case is more involved.

Proposition 5.3. Under (HL), we have that

T(π) ≤ C( |π| + max_i sup_{t∈[ti,ti+1]} E[ |Yt − Y_{ti}|² ] + Σ_{i=0}^{n−1} ∫_{ti}^{ti+1} E[ |Zt − Z̄_{ti}|² ] dt ) , (5.49)

where

Z̄_{ti} = (1/hi) E_{ti}[ ∫_{ti}^{ti+1} Zt dt ] .

Proof. We first observe that

Ẑ_{ti} = E_{ti}[ ( ∫_{ti}^{ti+1} Zs dWs ) Hi + Hi ∫_{ti}^{ti+1} f(Yt, Zt) dt ] (5.50)

and then compute

Z̄_{ti} = E_{ti}[ ( ∫_{ti}^{ti+1} Zs dWs ) Hi ] , (5.51)
| E_{ti}[ Hi ∫_{ti}^{ti+1} f(Yt, Zt) dt ] |² ≤ E_{ti}[ ∫_{ti}^{ti+1} |f(Yt, Zt)|² dt ] , (5.52)

leading to

|Z̄_{ti} − Ẑ_{ti}|² ≤ C E_{ti}[ ∫_{ti}^{ti+1} |f(Yt, Zt)|² dt ] . (5.53)

Then

| E_{ti}[ ∫_{ti}^{ti+1} ( f(Yt, Zt) − f(Y_{ti}, Ẑ_{ti}) ) dt ] |² ≤ hi E_{ti}[ ∫_{ti}^{ti+1} C( |Yt − Y_{ti}|² + |Zt − Z̄_{ti}|² + |Z̄_{ti} − Ẑ_{ti}|² ) dt ] (5.54)
≤ hi C E_{ti}[ ∫_{ti}^{ti+1} ( |Yt − Y_{ti}|² + |Zt − Z̄_{ti}|² ) dt ] (5.55)
+ C h²_i E_{ti}[ ∫_{ti}^{ti+1} |f(Yt, Zt)|² dt ] . (5.56)

The proof is concluded by summing over i and recalling that

E[ ∫_0^T |f(Yt, Zt)|² dt ] ≤ C . (5.57)

• We give without proof the following result, due to Zhang [76]:

max_i E[ ∫_{ti}^{ti+1} |Zt|² dt ] + Σ_i E[ ∫_{ti}^{ti+1} |Zt − Z̄_{ti}|² dt ] ≤ C|π| . (5.58)

• It is based on a representation of Z obtained by means of Malliavin calculus
and requires some heavy computations.

Theorem 5.4. Under (HL), we have that

T(π) ≤ C|π| (5.59)

and thus the scheme is of order 1/2.

Proof. Observe that

E[ sup_{t∈[ti,ti+1]} |Yt − Y_{ti}|² ] ≤ C E[ ∫_{ti}^{ti+1} |f(Yt, Zt)|² dt + ∫_{ti}^{ti+1} |Zt|² dt ] (5.60)
≤ C|π| , (5.61)

where we used (5.58). Then, the proof is concluded by combining the previous
estimate and (5.58) with Proposition 5.3. 2

5.5 Full discrete-time error analysis

• We replace X by its Euler scheme X^π.

• The scheme for (Y, Z) is the same, but the terminal condition (5.19) is now

(Y^π_n, Z^π_n) = (g(X^π_T), 0) . (5.62)

Theorem 5.5. Under (Hr), the following holds:

|Y^π_0 − u(0, X0)| ≤ C|π| . (5.63)

5.5.1 Truncation error

• Define Ỹi = u(ti, X^π_{ti}) and

Z̃i = E_{ti}[ Ỹ_{i+1} ΔWi/(ti+1 − ti) ] .

Proposition 5.4. With the above definitions,

Ỹi = E_{ti}[ Ỹ_{i+1} + (ti+1 − ti) f(Ỹi, Z̃i) ] + ζi , (5.64)

where

ζi = E_{ti}[ ζ^e_i + ζ^f_i + ζ^z_i ] (5.65)

and ζ^e_i, ζ^f_i and ζ^z_i are defined in (5.67), (5.69) and (5.70) respectively.

Proof. (b = 0.) 1. Applying Ito's formula, we have, setting V^i_s = σ(X^π_{ti}) ∂x u(s, X^π_s),

u(ti+1, X^π_{ti+1}) = u(ti, X^π_{ti}) + ∫_{ti}^{ti+1} { ∂t u(s, X^π_s) + (1/2) σ²(X^π_{ti}) ∂²_{xx} u(s, X^π_s) } ds + ∫_{ti}^{ti+1} V^i_s dWs . (5.66)

Then, introducing

ζ^e_i := −(1/2) ∫_{ti}^{ti+1} ( σ²(X^π_{ti}) − σ²(X^π_s) ) ∂²_{xx} u(s, X^π_s) ds , (5.67)

the above equality rewrites, using the PDE satisfied by u,

Ỹ_{i+1} = Ỹi − h f(Ỹi, V^i_{ti}) + ∫_{ti}^{ti+1} ( u^{(0)}(s, X^π_s) − u^{(0)}(ti, X^π_{ti}) ) ds + ζ^e_i + ∫_{ti}^{ti+1} V^i_s dWs . (5.68)

Now we define

ζ^f_i = − ∫_{ti}^{ti+1} ( u^{(0)}(s, X^π_s) − u^{(0)}(ti, X^π_{ti}) ) ds , (5.69)
ζ^z_i = h { f(Ỹi, Z̃i) − f(Ỹi, V^i_{ti}) } , (5.70)

and observe that

Ỹi = E_{ti}[ Ỹ_{i+1} + h f(Ỹi, Z̃i) + ζ^e_i + ζ^f_i + ζ^z_i ] . (5.71)

5.5.2 Convergence analysis

• We prove below Theorem 5.5: this relies (as usual) on

1. the stability result for the BTZ scheme;

2. a control of the truncation error coming from the regularity of u.

Lemma 5.1. Under (Hr), the following holds:

|E_{ti}[ζ^e_i]|² = O(h⁴) , (5.72)
|E_{ti}[ζ^f_i]|² = O(h⁴) , (5.73)
|E_{ti}[ζ^z_i]|² = O(h⁴) . (5.74)

Proof. See Exercise II.22 2

Proof of Theorem 5.5

We simply observe that (Ỹi, Z̃i)_i defines a perturbed scheme as in Section 5.3.1. We
can then combine Proposition 5.2 with Lemma 5.1 to conclude the proof, recalling also
(5.37).

5.6 Numerical illustration and further consideration

5.6.1 Monte Carlo based methods

• Regression methods, like the ones used for US options, can be easily adapted to
the BSDE setting; they were introduced in [37, 56] and extensively studied in [39, 40,
38, 34].

• Any method that permits to estimate conditional expectations can be used,
e.g. the Malliavin method [29, 17, 16].

5.6.2 Tree based methods

• We illustrate Section 5.5 with Example 5.1.

• A more systematic approach is based on Cubature methods (it allows in par-

ticular to work with degenerate diffusion), see e.g. [27, 28, 26].

Example 5.1. In this example [22], we work with d = 3 and X = W. We consider
the following BSDE

Yt = ω(1, W1)/(1 + ω(1, W1)) + ∫_t^1 (Z¹_s + Z²_s + Z³_s)( 5/6 − Ys ) ds − ∫_t^1 Zs dWs , (5.75)

where ω(t, x) = exp(x1 + x2 + x3 + t). Applying Ito's formula, we verify that the
solution is given by

Yt = ω(t, Wt)/(1 + ω(t, Wt)) and Z^l_t = ω(t, Wt)/(1 + ω(t, Wt))² , l ∈ {1, 2, 3} , t ∈ [0, 1] . (5.76)

• In this very smooth setting, we can introduce higher order schemes too. We
consider the Crank-Nicolson scheme (CN) and a linear multi-step scheme
(AMB2) on the graph below.

• The number of points used to approximate ΔW depends on the theoretical rate
of convergence of the discrete-time error and on the number of moments to match,
recall Section 2.4.

• Since X = W, we can easily use a recombination procedure.

Figure 7: Empirical convergence, Example 5.1

5.6.3 (Markovian) quantisation

• We illustrate here the Markovian quantisation with Example [23] in the case

of quadratic BSDEs.

• "Classical" quantization can be used for BSDEs as well, see e.g. [64].

Example 5.2. [23] We consider the following quadratic Markovian BSDE:

X^`_t = X^`_0 + ∫_0^t ν X^`_s dW^`_s , ` ∈ {1, 2, 3} , 0 ≤ t ≤ 1 ,
Yt = g(X1) + ∫_t^1 (a/2) ‖Zs‖² ds − ∫_t^1 Zs dWs , (5.77)

where a, ν, and (X^`_0)_{`∈{1,2,3}} are given real positive parameters and g : Rd → R is a
bounded Lipschitz function.

Applying Ito's formula, one can show that the solution is given by

Yt = (1/a) log( Et[ exp( a g(X1) ) ] ) , t ≤ 1 . (5.78)

For any given g, ν and a, it is possible to estimate the solution Y0 at time 0 using an
approximation of the Gaussian distribution at time T = 1, since X^`_1 = X^`_0 e^{−ν²/2 + νW^`_1}.

• For our numerical illustration, d = 2 and g is given by

g : x ↦ 3 Σ_{`=1}^2 sin²(x^`) . (5.79)

• We use a Markovian quantisation scheme as introduced in Section 4.5.3.

• The non-Lipschitz setting may cause instability, and we use in the graph below
a scheme also taming this quadratic growth (called 'adaptive truncation').

Figure 8: Comparison of schemes’ convergence

5.7 An example of forward method

• We present here a machine learning method to approximate BSDEs.

• Consider, for y ∈ R and Z ∈ H²,

Y^{y,Z}_t = y − ∫_0^t f(Y^{y,Z}_s, Zs) ds + ∫_0^t Zs dWs (5.80)

• and the following optimisation problem

V := min_{(y,Z)∈R×H²} E[ |g(XT) − Y^{y,Z}_T|² ] . (5.81)

• In the Lipschitz setting, the minimum in the above optimisation problem is
attained at the solution of the BSDE (5.1), with V = 0!

• Main idea: solve numerically the optimisation problem (5.81) to get an
approximation of the BSDE.
5.7.1 Discretization of the optimisation problem (5.81)

• The dynamics (5.80) of Y^{y,Z} is discretised using a classical Euler scheme on π:

Y^π_{t_{n+1}} = Y^π_{t_n} − h f(Y^π_{t_n}, Z_{t_n}) + Z_{t_n}(W_{t_{n+1}} − W_{t_n}) and Y^π_0 = y . (5.82)

• The random variable Z must also be discretised in some sense: parametric
specification!

1. Non-linear specification: Z = Z^Θ for some Θ ∈ R^K, where Θ stands for
the coefficients of a neural network ϕ_NN, and Z^Θ_{t_n} = ϕ_NN(t_n, X_{t_n}).

2. Linear specification: Z^θ_{t_n} = ϕ_L(t_n, X_{t_n}), where

ϕ_L(t_n, ·) = Σ_{k=1}^K θ_k φ_k(t_n, ·) , θ ∈ R^{dK} ,

and (φ_k)_{1≤k≤K} are some basis functions.

• Often, in practice, X has to be approximated too, by an Euler scheme for
example, recall (2.13).

• The discrete optimisation problem is now given by

V^π := min_{υ∈R^{1+K̄}} E[ |g(X^π_T) − Y^{π,υ}_T|² ] , (5.83)

where υ = (y, Θ0, . . . , Θ_{N−1}) and K̄ = d + (N − 1)K for the non-linear
specification, or υ = (y, θ0, . . . , θ_{N−1}) and K̄ = d + (N − 1)dK for the linear
specification.

• To solve the discrete optimisation problem, one generally uses a (stochastic)
gradient descent algorithm, as in the sketch below.
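A minimal sketch of this procedure with the non-linear specification, written with PyTorch, is given below; the toy driver, payoff, model and hyperparameters are illustrative choices and not those of the course.

```python
import torch

d, N_steps, batch = 1, 20, 512
h = 1.0 / N_steps
f = lambda y, z: -0.05 * y                      # a toy linear driver
g = lambda x: torch.clamp(x - 1.0, min=0.0)     # an illustrative payoff

y0 = torch.zeros(1, requires_grad=True)         # the scalar y in (5.83)
net = torch.nn.Sequential(torch.nn.Linear(d + 1, 32), torch.nn.ReLU(),
                          torch.nn.Linear(32, d))          # Z_Theta(t, x)
opt = torch.optim.Adam([y0, *net.parameters()], lr=1e-2)

for step in range(2000):
    X = torch.ones(batch, d)                    # X_0 = 1; X is simulated below
    Y = y0.expand(batch, 1)
    for n in range(N_steps):
        t = torch.full((batch, 1), n * h)
        Z = net(torch.cat([t, X], dim=1))
        dW = torch.randn(batch, d) * h ** 0.5
        Y = Y - h * f(Y, Z) + (Z * dW).sum(dim=1, keepdim=True)   # Euler step (5.82)
        X = X + 0.2 * X * dW                    # toy martingale dynamics, sigma = 0.2
    loss = ((g(X[:, :1]) - Y) ** 2).mean()      # the criterion V^pi in (5.83)
    opt.zero_grad(); loss.backward(); opt.step()

print(float(y0))                                # approximation of Y_0
```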

6 An introduction to McKean-Vlasov SDEs in finance

See the seminal paper [72] and, among others, [42, 52], and [20] for the large population
control point of view (e.g. Mean Field Games).

6.1 Definition

6.1.1 Introduction

• Example of a systemic risk study [21]:

X^i is the log-monetary reserve of each bank; the 'particle system' is given by

dX^{i,N}_t = (1/N) Σ_{j=1}^N ( X^{j,N}_t − X^{i,N}_t ) dt + dB^i_t . (6.1)

Bank i borrows from bank j if its reserve is lower.

The LLN should lead to the following equation when the number of banks goes to infinity:

dX̄t = ( E[X̄t] − X̄t ) dt + dBt . (6.2)

,→ The solution is given by an Ornstein-Uhlenbeck process (note that E[X̄t] = E[X0]).

• Typical questions:

1. Does the solution of (6.2) describe well the system (6.1) when N → +∞?

2. Alternatively: is (6.1) a good approximation of (6.2)? (See the simulation sketch below.)
6.1.2 Notations

• For p ≥ 1, Pp(Rd) is the space of probability measures µ satisfying ∫ |x|^p µ(dx) < ∞.

• Wasserstein distance on Pp:

Wp(µ, ν) := inf_{X∼µ, Y∼ν} E[ |X − Y|^p ]^{1/p} . (6.3)

Note that (Pp, Wp) is a Polish space⁶, see e.g. [75] or [12] among others.

• For later use: denote µ^N = (1/N) Σ_{i=1}^N δ_{x_i} and ν^N = (1/N) Σ_{i=1}^N δ_{y_i}; then

W²_2(µ^N, ν^N) ≤ (1/N) Σ_{i=1}^N |x_i − y_i|² , (6.4)

where x ∈ (Rd)^N, y ∈ (Rd)^N.

⁶ A complete separable metric space.

6.1.3 Existence and uniqueness

• We consider, for ξ ∈ L²,

Xt = ξ + ∫_0^t b(Xs, L(Xs)) ds + ∫_0^t σ(Xs, L(Xs)) dWs , t ≤ T . (6.5)

• In this context, (HL) rewrites

|b(x, µ) − b(x', µ')| + |σ(x, µ) − σ(x', µ')| ≤ L( |x − x'| + W2(µ, µ') ) . (6.6)

Theorem 6.1. Under (HL), there exists a unique solution to (6.5).

Proposition 6.1. Denote µt := L(Xt); then (µt)_{0≤t≤T} is a weak solution to

∂t µt = −∂x{ b(x, µt) µt } + (1/2) ∂²_{xx}{ σ²(x, µt) µt } . (6.7)
Proof of Theorem 6.1. [b = 0 for the proof] Let Φ : S² → S² be given by

Xt = Φ(x)t := ξ + ∫_0^t σ(xs, L(xs)) dWs (6.8)

and denote ΔX := X − X' = Φ(x) − Φ(x'). We compute

sup_{r≤t} |ΔXr|² ≤ sup_{r≤t} | ∫_0^r δσs dWs |² , (6.9)

where δσs := σ(xs, L(xs)) − σ(x's, L(x's)). It satisfies

E[ |δσs|² ] ≤ 2L² E[ |xs − x's|² + W²_2( L(xs), L(x's) ) ]
≤ 4L² E[ |xs − x's|² ] . (6.10)

Using the BDG inequality, we obtain

E[ sup_{r≤t} |ΔXr|² ] ≤ C E[ ∫_0^t |δσs|² ds ] . (6.11)

And from (6.10), we deduce

E[ sup_{r≤t} |ΔXr|² ] ≤ C ∫_0^t E[ |xs − x's|² ] ds (6.12)

and a fortiori, setting Δx = x − x',

E[ sup_{r≤t} |ΔXr|² ] ≤ C ∫_0^t E[ sup_{r≤s} |Δxr|² ] ds . (6.13)

Iterating this inequality, one obtains, denoting Φ^{(k+1)} = Φ ∘ Φ^{(k)},

E[ sup_{r≤T} | Φ^{(k)}(x)_r − Φ^{(k)}(x')_r |² ] ≤ ( (CT)^k / k! ) E[ sup_{r≤T} |Δxr|² ] , (6.14)

so that Φ has a unique fixed point. 2

...

Proof of Proposition 6.1. Let φ ∈ Cc([0, T) × R) and apply Ito's formula to get

E[φ(T, XT)] = E[φ(0, X0)] + E[ ∫_0^T ( ∂t φ + b(Xs, µs) ∂x φ + (1/2) σ(Xs, µs)² ∂²_{xx} φ )(s, Xs) ds ] . (6.15)

Thus, φ having compact support in [0, T) × R,

0 = E[φ(0, X0)] + ∫_0^T ∫_R ( ∂t φ + b(x, µs) ∂x φ + (1/2) σ(x, µs)² ∂²_{xx} φ )(s, x) µs(dx) ds , (6.16)

which expresses the weak solution property.

Recall that if µt has a smooth density, which we denote mt, we can use the integration
by parts formula to get

E[ b(Xs, µs) ∂x φ(s, Xs) ] = ∫_R b(x, µs) ∂x φ(s, x) ms(x) dx (6.17)
= − ∫_R φ(s, x) ∂x{ b(x, µs) ms(x) } dx . (6.18)

6.2 Particle system approximation

• We consider the following example, denoting µt := L(Xt):

dXt = ( ∫ β(Xt, y) µt(dy) ) dt + dWt , (6.19)
X0 = ξ ∼ µ0 ∈ P2 , (6.20)

where β is a Lipschitz function. (Note that b(x, ν) = E[β(x, χ)], where χ ∼ ν.)

,→ There is a unique solution to the above equation, as (HL) is satisfied.

(This is the laboratory example of [72].)
Proof. We observe that

b(x, ν) − b(x, ν') = E[ β(x, χ) − β(x, χ') ] for any χ ∼ ν, χ' ∼ ν' . (6.21)

Then, since β is Lipschitz continuous, we get

|b(x, ν) − b(x, ν')| ≤ L E[ |χ − χ'| ] (6.22)
≤ L E[ |χ − χ'|² ]^{1/2} , (6.23)

where we used Cauchy-Schwarz for the last inequality. Note that χ and χ' are
arbitrary, so that

|b(x, ν) − b(x, ν')| ≤ L inf_{χ∼ν, χ'∼ν'} E[ |χ − χ'|² ]^{1/2} = L W2(ν, ν') . (6.24)
• Associated particle system:

dX^i_t = (1/N) Σ_{j=1}^N β(X^i_t, X^j_t) dt + dW^i_t , (6.25)
X^i_0 = ξ^i ,

where (ξ¹, . . . , ξ^N) are i.i.d. with law µ0 and the (W^i) are i.i.d. Brownian motions.

• Observe that

(1/N) Σ_{j=1}^N β(x, X^j_t) = b(x, µ^N_t) , where µ^N_t := (1/N) Σ_{j=1}^N δ_{X^j_t} (empirical measure of the particle system). (6.26)

• We consider N independent particles with the dynamics (6.19):

dX̄^i_t = b( X̄^i_t, L(X̄^i_t) ) dt + dW^i_t , (6.27)

with the same BM as in (6.25); observe that L(X̄^i_t) = µt, and we denote

µ̄^N_t := (1/N) Σ_{j=1}^N δ_{X̄^j_t} . (6.28)

Theorem 6.2. The following holds, for all T > 0:

ϕ(T) := max_i E[ sup_{t≤T} |X^i_t − X̄^i_t|² ] ≤ C_T / N . (6.29)

• From this, one deduces that the law of the first k particles (X^i) converges to the
law of the first k independent particles (X̄^i):

This phenomenon is called propagation of chaos (the particles become independent
in the limit N → ∞), see [72].

Proof. Let δX^i := X^i − X̄^i; for r ≤ T,

|δX^i_r|² = | ∫_0^r { b(X^i_s, µ^N_s) − b(X̄^i_s, µs) } ds |² ≤ T ∫_0^r |b(X^i_s, µ^N_s) − b(X̄^i_s, µs)|² ds , (6.30)

leading to

E[ sup_{r≤t} |δX^i_r|² ] ≤ C_T E[ ∫_0^t ( |b(X^i_s, µ^N_s) − b(X̄^i_s, µ̄^N_s)|² + |b(X̄^i_s, µ̄^N_s) − b(X̄^i_s, µs)|² ) ds ] . (6.31)

Now,

E[ |b(X^i_s, µ^N_s) − b(X̄^i_s, µ̄^N_s)|² ] ≤ C E[ |X^i_s − X̄^i_s|² + (1/N) Σ_{j=1}^N |X^j_s − X̄^j_s|² ]
≤ C ϕ(s) , (6.32)

since b satisfies (HL), recalling (6.4).

Moreover,

|b(X̄^i_s, µ̄^N_s) − b(X̄^i_s, µs)|² = | (1/N) Σ_{j=1}^N { β(X̄^i_s, X̄^j_s) − b(X̄^i_s, µs) } |² . (6.33)

Denote Y^j = β(X̄^i_s, X̄^j_s) − b(X̄^i_s, µs) and observe that E[Y^j | X̄^i] = 0 for j ≠ i and
that, for k ∉ {i, j}, Y^j and Y^k are independent.

We then compute

E[ |b(X̄^i_s, µ̄^N_s) − b(X̄^i_s, µs)|² ] = (1/N²) E[ | Σ_{j=1}^N Y^j |² ] (6.34)
= (1/N²) { E[ |Y^i|² ] + 2 Σ_{k≠i} E[ Y^i Y^k ] + Σ_{j≠i, k≠i} E[ Y^j Y^k ] }
= (1/N²) Σ_{j=1}^N E[ |Y^j|² ] ≤ C/N . (6.35)

Combining (6.31)-(6.32)-(6.35), we arrive at

ϕ(t) ≤ C ∫_0^t ϕ(s) ds + C/N , (6.36)

and the result follows from Gronwall's Lemma. 2

• It is also possible to obtain a weak error estimate, as follows:

| E[ Ψ(µ^N_T) ] − Ψ(µT) | = O(1/N) (6.37)

for Ψ : P2 → R, under some smoothness conditions on Ψ, see [24].

Time discretization and other approximation schemes

• A full discretization of the particle system requires a time discretization, see e.g.
[14].

• The theoretical time discretization is also studied in Chapter 5 of [58] in a
Lipschitz setting.

• Other types of approximation schemes are possible: via "projection" [9], via
optimal quantization (see Chapter 7 of [58]), or via the cubature method, see [30].

6.3 Singular interaction

6.3.1 Burgers Equation

• The Burgers equation is given by

∂t u + u ∂x u = (ν²/2) ∂²_{xx} u and u(0, ·) = g(·) , (6.38)

where ν is the viscosity coefficient (for ν = 0, this is the inviscid Burgers equation,
which is a scalar conservation law).

It is a simple model from fluid dynamics, but (a slight modification of it) also
represents the Carbon Allowance Price in some simple models, see e.g. [19].

• When ν > 0, there exists a unique solution, given by

u(t, x) = [ ∫ ((x − y)/t) e^{−G(y)/ν² − |x−y|²/(2ν²t)} dy ] / [ ∫ e^{−G(y)/ν² − |x−y|²/(2ν²t)} dy ] (6.39)

or alternatively

u(t, x) = −ν² E[ Ĝ'_ν(x + νWt) ] / E[ Ĝ_ν(x + νWt) ] , (6.40)

with Ĝ_ν(y) = e^{−G(y)/ν²}.

Proof. We assume smoothness of the functions in the following computations. Observe
that u = ∂x w, where w is a solution to the HJB equation:

∂t w + (1/2)(∂x w)² = (ν²/2) ∂²_{xx} w and w(0, ·) = G(·) := ∫_{−∞}^{·} g(z) dz . (6.41)

Define θ := e^{−w/η} and compute

∂t θ = −(∂t w/η) θ , ∂x θ = −(∂x w/η) θ , ∂²_{xx} θ = ( −∂²_{xx} w/η + (∂x w)²/η² ) θ . (6.42)

Setting η = ν², we get from (6.41):

∂t θ = (ν²/2) ∂²_{xx} θ and θ(0, ·) = e^{−G(·)/ν²} . (6.43)

This implies that

θ(t, x) = (1/(ν√(2πt))) ∫ e^{−G(y)/ν² − |x−y|²/(2ν²t)} dy (6.44)
= ∫ Ĝ_ν(x − y) φ_ν(t, y) dy , (6.45)

where φ_ν(t, ·) denotes the density of νWt. We then observe that

u = ∂x w = −ν² ∂x θ / θ . (6.46)

Differentiating (6.44) or (6.45) yields the result. 2

A particle system representation. We follow [13] and assume that g is a cdf.

• Setting v = ∂x u, we obtain that v is a weak solution of

∂t v + ∂x(uv) = (σ²/2) ∂²_{xx} v and v0 = ∂x g , (6.47)

with X0 ∼ v0 and

Xt = X0 + ∫_0^t u(s, Xs) ds + σWt and Xt ∼ vt . (6.48)

We have u(t, x) = ∫_{−∞}^x v(t, y) dy = ∫ H(x, y) v(t, y) dy , with H(x, y) = 1_{y≤x}.

• When ν = 0, there exists a unique "good physical solution" (a.k.a. the entropy
solution). For g(x) = 1_{[0,+∞)}(x), it is given by

u(t, x) = (x/t) 1_{0≤x≤t} + 1_{x>t} = (0 ∨ x ∧ t)/t . (6.49)

• Since u(t, x) = E[H(x, Xt)], we introduce the particle system:

dX^i_t = (1/M) Σ_{j=1}^M H(X^i_t, X^j_t) dt + σ dW^i_t . (6.50)

• Simulation of the particle system gives:

Legend. blue : σ = 1, orange: σ = 0.5, green: σ = 0.1, red: σ = 0

parameters: M = 100000, Time grid with n = 21 dates.

• The algorithm converges to the true value when σ → 0. Note that the sum is
obtained by sorting the system: denoting rk(X^i_t) the rank of the particle X^i_t in the
cloud, (6.50) is replaced in practice by

X^i_{t_{k+1}} = X^i_{t_k} + ( rk(X^i_{t_k}) / M ) h + σ ΔW^i_k , (6.51)

as in the sketch below.
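A compact sketch of this rank-based scheme (illustrative parameters): the interaction (1/M) Σ_j H(X^i, X^j) equals rk(X^i)/M, so one sort per date replaces the O(M²) double sum.

```python
import numpy as np

rng = np.random.default_rng(0)
M, n, T, sigma = 10_000, 21, 1.0, 0.1
h = T / (n - 1)
X = np.zeros(M)                                 # X_0 ~ delta_0, i.e. g = 1_[0,+inf)
for _ in range(n - 1):
    rank = np.empty(M)
    rank[np.argsort(X)] = np.arange(1, M + 1)   # rank of each particle in the cloud
    X = X + (rank / M) * h + sigma * np.sqrt(h) * rng.standard_normal(M)
# u(T, x) ~ (1/M) sum_i 1_{X_i <= x}: empirical cdf of the particle cloud
print(np.mean(X <= 0.5), 0.5 / T)               # compare with the entropy solution x/t
```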
M

• If u0 = 1 − g, with g a cdf, then the particle system is (Exercise)

dX^i_t = (1/M) Σ_{j=1}^M (1 − H)(X^i_t, X^j_t) dt + σ dW^i_t and X^i_0 ∼ g' . (6.52)

For g(x) = 1 ∧ (x − 1) ∨ 0, the algorithm behaves very well, as it captures the shock
appearing at t = 1 in the entropy solution:

Legend. blue: T = 0, orange: T = 0.5, green: T = 1, red: T = 2.
Parameters: σ = 0.1, M = 10000, time grid with n = 21 dates.

• See [14] for a generalisation and a study of the error.

6.3.2 Calibration of LSV models

[45, 43]

Local volatility model.

• For a volatility function σ:

dSt = rSt dt + σ(t, St) St dWt and S0 = x . (6.53)

• Calibration to market data imposes restrictions on σ; namely, Dupire's formula
gives

σD(T, K)² = 2 ( ∂T C(T, K) + rK ∂K C(T, K) ) / ( K² ∂²_{KK} C(T, K) ) , (6.54)

where C(T, K) is the price observed today (S0 = x) of the Call option with strike K
and maturity T.

,→ No extra volatility risk in this class of models...

Stochastic volatility model.

• Given a stochastic process a:

dSt = rSt dt + at St dWt , (6.55)

where e.g. a follows a Brownian SDE

dat = β(t, at) dt + α(t, at) dBt , (6.56)

with B a BM possibly correlated to W.

Local Stochastic volatility model [57, 69].

• It should incorporate volatility risk but also calibrate to European call options:

dSt = rSt dt + at σ(t, St) dWt (6.57)

for a given stochastic process a.

• Calibration to European call options imposes

σ(t, x) = σD(t, x) / √( E[ a²_t | St = x ] ) , (6.58)

so that S above has the following dynamics:

dSt = rSt dt + ( σD(t, St) at / √( E[ a²_t | St ] ) ) dWt . (6.59)

• This comes from Gyöngy's Theorem [46], see also [18]. Namely, for an Ito process

dZt = βt dt + αt dWt , (6.60)

there exists a diffusion Z^D,

dZ^D_t = b(t, Z^D_t) dt + Σ(t, Z^D_t) dWt , (6.61)

such that L(Zt) = L(Z^D_t) for all t ≥ 0, where

b(t, x) = E[ βt | Zt = x ] and Σ(t, x)² = E[ α²_t | Zt = x ] . (6.62)

Heuristics: 1. For Gyöngy's Theorem (β = 0). Let φ ∈ Cc((0, T) × R) and apply Ito's
formula to get

E[φ(T, ZT)] = E[φ(0, Z0)] + E[ ∫_0^T ( ∂t φ + (1/2) α²_t ∂²_{xx} φ )(s, Zs) ds ] . (6.63)

Since φ has compact support, the boundary terms vanish and

0 = E[ ∫_0^T ( ∂t φ + (1/2) E[ α²_s | Zs ] ∂²_{xx} φ )(s, Zs) ds ] (6.64)
= E[ ∫_0^T ( ∂t φ + (1/2) Σ(s, Zs)² ∂²_{xx} φ )(s, Zs) ds ] . (6.65)

Denoting µt = L(Zt), we have

0 = ∫_0^T ∫ ( ∂t φ + (1/2) Σ(s, x)² ∂²_{xx} φ )(s, x) µs(dx) ds , (6.66)

so that (µt)t is a weak solution to the Fokker-Planck equation

∂t m = (1/2) ∂²_{xx}( Σ(t, x)² m ) and m0 = L(Z0) , (6.67)

which is also satisfied by L(Z^D_t).

2. Combining Gyöngy's Theorem and Dupire's result, we must have, recalling (6.57),

E[ ( at σ(t, St) )² | St = x ] = σD(t, x)² , (6.68)

which yields (6.58).

• Existence and uniqueness for (6.59): mostly open, see nevertheless [1, 49] and [54].

Numerical simulation of (6.59) [77, 44].

• Following the MC approach, set at = f(Yt), where Y is a process that can be simulated.

,→ The main question is to compute E[ f²(Yt) | St ].

• Consider φ : R → R a smooth function s.t. ∫ φ(x) dx = 1 and set

φ_ε(x) := (1/ε) φ(x/ε) , ∀x ∈ R . (6.69)

- If (S^i_t)_{1≤i≤N} is a particle system associated to St (with or without MF
interaction), an estimator of ρt(x) (the density function of St) is

ρt(x) ≃ (1/N) Σ_{i=1}^N φ_ε(x − S^i_t) = ∫ φ_ε(x − y) µ^N_t(dy) . (6.70)

- Nadaraya-Watson estimator at x:

Θ(t, x) := E[ f²(Yt) | St = x ] ≃ Σ_{i=1}^N f²(Y^i_t) φ_ε(x − S^i_t) / Σ_{i=1}^N φ_ε(x − S^i_t) =: ΘN(t, x) . (6.71)

- (6.58) is thus approximated by

σ(t, x) ≃ σN(t, x) := σD(t, x) / √(ΘN(t, x)) . (6.72)
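A direct sketch of (6.71) with a Gaussian kernel; the kernel choice and the bandwidth eps are illustrative.

```python
import numpy as np

def nadaraya_watson(x, S, fY2, eps):
    """Estimate E[f^2(Y_t) | S_t = x] on a vector of query points x, cf. (6.71)."""
    w = np.exp(-0.5 * ((x[:, None] - S[None, :]) / eps) ** 2)   # kernel weights
    return (w * fY2[None, :]).sum(axis=1) / w.sum(axis=1)
```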

Heuristics.

1. Consider the kernel estimator of the density ρt(x, y) of L(St, Yt), for two
bandwidths ε, ε', namely

(1/N) Σ_{i=1}^N φ_ε(x − S^i_t) φ_{ε'}(y − Y^i_t) . (6.73)

Then one sets

ρ(y|x) ≃ [ (1/N) Σ_{i=1}^N φ_ε(x − S^i_t) φ_{ε'}(y − Y^i_t) ] / [ (1/N) Σ_{i=1}^N φ_ε(x − S^i_t) ] (6.74)

and

E[ f²(Yt) | St = x ] ≃ [ (1/N) Σ_{i=1}^N φ_ε(x − S^i_t) ∫ f²(y) φ_{ε'}(y − Y^i_t) dy ] / [ (1/N) Σ_{i=1}^N φ_ε(x − S^i_t) ] ; (6.75)

sending ε' → 0, we retrieve (6.71).

Remark: if f²(y) = y and the function φ is symmetric, then one has directly

∫ y φ_{ε'}(y − Y^i_t) dy = Y^i_t . (6.76)

2. Alternatively, one can see

E[ f²(Yt) | St = x ] ”=” E[ f²(Yt) δ0(x − St) ] / E[ δ0(x − St) ] . (6.77)

After smoothing of δ0 (the Dirac mass at 0), we get

E[ f²(Yt) | St = x ] ≃ E[ f²(Yt) φ_ε(x − St) ] / E[ φ_ε(x − St) ] . (6.78)

A particle approximation of the previous expression leads to (6.71).

• Assume we are given an approximation Ȳ of Y on the grid {tk, k = 0, . . . , κ}.

• Particle system for the time-discretized (6.59): for all j ≤ N,

S^j_{t_{k+1}} = S^j_{t_k} + r S^j_{t_k} h + σN(tk, S^j_{t_k}) f(Ȳ^j_{t_k}) S^j_{t_k} ΔW^j_k . (6.79)

• Acceleration in practice:

1. In σN(t, x), ΘN(t, x) is not computed for all x = S^j_{t_k} but on a grid of R. The
value at x = S^j_{t_k} is then obtained by interpolation.

2. The sums in (6.71) have to be optimised for the method not to be too
computationally expensive.

Remark 6.1. Note that the quantity E[ f²(Yt) | St ] can also be obtained by a non-
parametric regression estimate (see US options).

Numerical Example. We consider a Fake Brownian Motion, namely a model of the
type:

dXt = ( f(Yt) / √( E[ f²(Yt) | Xt ] ) ) dWt . (6.80)

,→ The Markovian projection is indeed a Brownian motion, and thus L(Xt) = L(Wt)
for all t.

• In the numerics, Y follows the SDE

dYt = −Yt dt + dBt and Y0 = 0 , (6.81)

and

f : x ↦ 0.1 + sin²(x) . (6.82)

• We implement the previous scheme (see the sketch below) and obtain, at T = 1, with
M = 100000 particles and N_time = 20 dates: (Legend: orange: true Gaussian density;
blue: estimation of the XT density function.)
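A self-contained sketch of the resulting particle scheme for (6.80), with a reduced number of particles (the naive Nadaraya-Watson evaluation below is O(M²) per date; in practice one uses the grid-plus-interpolation acceleration described above); the bandwidth is illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
M, n, T = 2_000, 20, 1.0
h, eps = T / n, M ** (-0.2)                  # illustrative kernel bandwidth
f2 = lambda y: (0.1 + np.sin(y) ** 2) ** 2   # f^2 with f from (6.82)
X, Y = np.zeros(M), np.zeros(M)
for _ in range(n):
    w = np.exp(-0.5 * ((X[:, None] - X[None, :]) / eps) ** 2)   # Gaussian kernel weights
    theta = (w * f2(Y)[None, :]).sum(axis=1) / w.sum(axis=1)    # Theta_N(t, X^i), cf. (6.71)
    dW, dB = np.sqrt(h) * rng.standard_normal((2, M))
    X = X + np.sqrt(f2(Y) / theta) * dW      # Euler step for (6.80)
    Y = Y - Y * h + dB                       # Euler step for (6.81)
print(X.var())                               # should be close to Var(W_T) = T = 1
```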

• "Call price" in this model:

1. E[[X]+ ]:

prix= 0.381 std estim= 0.0018

BM: prix= 0.390 std estim= 0.0018

True: 0.399

2. E[[X − 1]+ ]

X: prix= 0.076 std estim= 0.0008

BM: prix= 0.077 std estim= 0.0008

3. E[[X − 0.5]+ ]

X: prix= 0.183 std estim= 0.0013

BM: prix= 0.189 std estim= 0.0013

139
4. supremum:

E[maxt (Xt )]= 0.600 std estim = 0.0018

E[maxt (Wt )]= 0.658 std estim = 0.0019

Contents

I Handouts 15

1 Introduction 16

2 Review of the linear case 17

2.1 Mathematical & Financial Framework . . . . . . . . . . . . . . . . . 17

2.1.1 SDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.1.2 Useful estimates under (HL) . . . . . . . . . . . . . . . . . . 18

2.1.3 Link with PDEs . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.1.4 Financial setting . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.2 Euler Scheme for SDEs . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.2.1 Definition and first properties . . . . . . . . . . . . . . . . . . 23

2.2.2 Weak convergence for vanilla options . . . . . . . . . . . . . . 27

2.3 Implementation using Monte Carlo Methods . . . . . . . . . . . . . . 31

2.3.1 Quick review of the case without bias . . . . . . . . . . . . . 31

2.3.2 The case with bias . . . . . . . . . . . . . . . . . . . . . . . . 34

2.3.3 Convergence of Mean Square Error for MC method . . . . . . 35

2.4 Implementation using quantisation of Brownian increments . . . . . 37

2.5 Strong convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

2.5.1 Lipschitz case . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

2.5.2 Non globally Lipschitz case . . . . . . . . . . . . . . . . . . . 47

2.6 Path-dependent options . . . . . . . . . . . . . . . . . . . . . . . . . 51

2.6.1 General consideration . . . . . . . . . . . . . . . . . . . . . . 51

2.7 Acceleration methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

2.7.1 Reducing the bias . . . . . . . . . . . . . . . . . . . . . . . . 53

2.7.2 Multi-Level Monte Carlo . . . . . . . . . . . . . . . . . . . . . 54

2.8 High order schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

2.8.1 High order weak convergence . . . . . . . . . . . . . . . . . . 59

3 Computing sensitivities in the linear case 60

3.1 Finite-Difference Approximations . . . . . . . . . . . . . . . . . . . . 60

3.1.1 Bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

3.1.2 Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

3.2 Tangent process approach . . . . . . . . . . . . . . . . . . . . . . . . 63

3.2.1 Tangent process . . . . . . . . . . . . . . . . . . . . . . . . . . 63

3.2.2 Computing the delta (pathwise approach) . . . . . . . . . . . 67

3.2.3 Practical implementation . . . . . . . . . . . . . . . . . . . . 69

3.3 Greek weights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

3.3.1 Likelihood Ratio Method . . . . . . . . . . . . . . . . . . . . 71

3.3.2 Integration by part . . . . . . . . . . . . . . . . . . . . . . . . 73

3.3.3 Bismut’s formula . . . . . . . . . . . . . . . . . . . . . . . . . 74

4 U.S. options in complete market 75

4.1 Definition and first properties . . . . . . . . . . . . . . . . . . . . . . 75

4.2 Bermudan option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

4.2.1 Discretisation of the forward process . . . . . . . . . . . . . . 82

4.2.2 Longstaff-Schwartz algorithm . . . . . . . . . . . . . . . . . . . 84

4.3 Dual approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

4.4 Implementation using regression techniques . . . . . . . . . . . . . . 87

4.4.1 Linear Regression-based methods . . . . . . . . . . . . . . . . 87

4.4.2 Implementation of the dual approach . . . . . . . . . . . . . . 90

4.5 Quantization based methods . . . . . . . . . . . . . . . . . . . . . . . 92

4.5.1 Introduction - cubature formula . . . . . . . . . . . . . . . . . 92

4.5.2 Quantization tree for optimal stopping problem . . . . . . . . 97

4.5.3 Markovian quantization (grid method) . . . . . . . . . . . . . 99

5 Non-linear pricing methods 101

5.1 Backward Stochastic Differential Equation . . . . . . . . . . . . . . . 101

5.1.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

5.1.2 Some key properties . . . . . . . . . . . . . . . . . . . . . . . 103

5.1.3 Application to non-linear pricing . . . . . . . . . . . . . . . . 104

5.2 Main properties in the Markov setting . . . . . . . . . . . . . . . . . 107

5.3 Numerical analysis of backward Methods . . . . . . . . . . . . . . . . 108

5.3.1 L2 -stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

5.4 Convergence analysis assuming no error on X . . . . . . . . . . . . . 112

5.4.1 Truncation error . . . . . . . . . . . . . . . . . . . . . . . . . 112

5.4.2 Order of convergence in the smooth case . . . . . . . . . . . 113

5.4.3 Order of convergence in the Lipschitz case . . . . . . . . . . 115

5.5 Full discrete-time error analysis . . . . . . . . . . . . . . . . . . . . . 118

5.5.1 Truncation error . . . . . . . . . . . . . . . . . . . . . . . . . 118

5.5.2 Convergence analysis . . . . . . . . . . . . . . . . . . . . . . . 119

5.6 Numerical illustration and further consideration . . . . . . . . . . . . 121

5.6.1 Monte Carlo based methods . . . . . . . . . . . . . . . . . . . 121

5.6.2 Tree based methods . . . . . . . . . . . . . . . . . . . . . . . 121

5.6.3 (Markovian) quantisation . . . . . . . . . . . . . . . . . . . . 123

5.7 An example of forward method . . . . . . . . . . . . . . . . . . . . . 125

5.7.1 Discretization of the optimisation problem (5.81) . . . . . . . 126

6 An introduction to McKean-Vlasov SDEs in finance 128

6.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

6.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

6.1.2 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

6.1.3 Existence and uniqueness . . . . . . . . . . . . . . . . . . . . 130

6.2 Particle system approximation . . . . . . . . . . . . . . . . . . . . . . 133

6.3 Singular interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

6.3.1 Burgers Equation . . . . . . . . . . . . . . . . . . . . . . . . . 139

6.3.2 Calibration of LSV model . . . . . . . . . . . . . . . . . . . . 143

II Exercises 158

III Partial correction to Exercises 168

Part II

Exercises
Exercise II.1. Prove inequality (2.4).

Exercise II.2. Prove inequality (2.5).

Exercise II.3. Prove inequality (2.6).

Exercise II.4. Prove the statement of Proposition 2.1.

Exercise II.5. Prove inequality (2.37).

Exercise II.6. Let X^π denote the Milstein scheme for X (one-dimensional SDE)
given by

Xt = X0 + ∫_0^t b(Xs) ds + ∫_0^t σ(Xs) dWs , t ≤ T ,

where b and σ are C²_b and σ is bounded.

1. Recall the definition of X^π and of the (global) truncation error T.

2. Prove the stability of the scheme, i.e.

E[ sup_{u≤T} |Xu − X^π_u|^p ] ≤ Cp E[ sup_{u≤T} |T(u)|^p ] ,

for p ≥ 2.

Exercise II.7. 1. Prove Gronwall's Lemma.

2. Let φ be a non-decreasing non-negative function satisfying

φ(t) ≤ A ∫_0^t φ(s) ds + B ( ∫_0^t φ²(s) ds )^{1/2} + C , ∀t ≤ T ,

where A, B, C are positive constants.

Show that, for all t ≤ T,

φ(t) ≤ 2C e^{(2A+B²)t} .

Exercise II.8. Let X^π be given by

X^π_{i+1} = X^π_i + b(X^π_i) h + σ(X^π_i) Δ_iW + (1/2)[σσ'](X^π_i)( (Δ_iW)² − h )
+ b'σ(X^π_i) Δ_iZ + (1/2)[bb' + (1/2) b''σ²](X^π_i) h²
+ [bσ' + (1/2) σ''σ²](X^π_i)( Δ_iW h − Δ_iZ )
+ (1/2)[σ²σ'' + σ(σ')²](X^π_i)( (1/3)(Δ_iW)² − h ) Δ_iW ,

where Δ_iZ = ∫_{ti}^{ti+1} (Ws − W_{ti}) ds and Δ_iW = W_{ti+1} − W_{ti}.

This is an Ito-Taylor scheme with strong order 1.5 (no proof required!).

1. Derive heuristically the scheme from the original SDE.

2. What is the distribution of (Δ_iW, Δ_iZ)?

3. Explain how to simulate (Δ_iW, Δ_iZ).

Exercise II.9. (Asian Option) Let X be a one-dimensional SDE given by

Xt = X0 + ∫_0^t b(Xs) ds + ∫_0^t σ(Xs) dWs , t ≤ T ,

representing the price of some stock. We assume a deterministic interest rate r and
the existence of a unique pricing measure.

1. Recall the price of a European option with payoff g(XT) at maturity T, where
g is some Lipschitz function. Explain how to compute this price using MC
simulation; in particular, give the Euler scheme X^π for X.

2. We now consider options written on AT = ∫_0^T Xs ds. Using a grid π with
constant step size, we introduce an Euler scheme A^π to compute A, based on
X^π. Show that

E[ |g(AT) − g(A^π_T)|^p ]^{1/p} ≤ C/√n ,

where g is a Lipschitz continuous function.

3. The Black-Scholes model.

(a) Write down the Euler scheme for the Black-Scholes model. Explain one
drawback of this scheme and why, in fact, we don't need to use it.

(b) We now assume that X is perfectly approximated on π and we consider
the same approximation A^π as above (using X...). Prove that

E[ |g(AT) − g(A^π_T)|^p ]^{1/p} ≤ C/n .

Exercise II.10. Let X be a one-dimensional SDE given by

Xt = X0 + ∫_0^t b(Xs) ds + ∫_0^t σ(Xs) dWs , t ≤ T .

We denote by X^n its Euler scheme using a grid π with constant time step T/n. We
consider a bounded measurable function f and assume that the following expansion
holds true:

E[f(XT)] = E[f(X^n_T)] + c1/n + O(1/n²) . (6.83)

1. We assume that the Euler scheme is computed using MC simulation. Give an
expression of the MSE, if one uses the 'usual' estimator.

2. One would like to take advantage of the expansion (6.83). Suggest a new
approximation of E[f(XT)] using X^n and X^{2n} with precision O(1/n²).
3. One would like to implement this new approximation in practice.

(a) We first simulate X n and X 2n using two independent batches of Brownian

increments. Give the expression of the MSE and the asymptotic variance

of the estimator (when n → ∞).

(b) We now simulate X n and X 2n using the same Brownian increments.

i. Explain how one could do that in practice, implement the method...

ii. Give the expression of the MSE and the asymptotic variance of the

estimator (when n → ∞) and compare with the previous result.

Exercise II.11. (Lookback Option) Let X be the solution of the one-dimensional
SDE:

Xt = X0 + ∫_0^t σ(Xs) dWs ,

representing the price of some underlying asset. We assume that σ is Lipschitz and
positive. We would like to approximate the price of an option with the following
payoff at maturity: g(max_{t∈[0,T]} Xt). We introduce the (continuous) Euler scheme
X^π for X, on a grid with constant timestep h = T/n.

We assume that

E[ g( max_{t∈[0,T]} Xt ) ] ≃ E[ g( max_{t∈[0,T]} X^π_t ) ] .

We know how to simulate X^π_{ti}, ti ∈ π. The goal here is to understand how to simulate

max_{t∈[0,T]} X^π_t = max_i max_{t∈[ti,ti+1]} X^π_t ,

knowing that (no proof required!)

Law( max_{t∈[ti,ti+1]} X^π_t | (X^π_{tj})_{j≤n} ) = Law( max_{t∈[ti,ti+1]} X^π_t | X^π_{ti}, X^π_{ti+1} ) .

1. Using the reflection principle, one can prove that, for z ≥ y,

P( sup_{t∈[0,T]} Wt ≥ z , WT ≤ y ) = P( WT ≥ 2z − y ) .

Deduce that, for z ≥ y,

P( sup_{t∈[0,T]} Wt ≥ z | WT = y ) = e^{−(2/T) z(z−y)} .

2. Show that, for M ≥ xi ∨ xi+1,

P( max_{t∈[ti,ti+1]} X^π_t ≤ M | X^π_{ti} = xi , X^π_{ti+1} = xi+1 )
= 1 − e^{−2(M−xi)(M−xi+1)/(hσ²(xi))} =: Fh(M; xi, xi+1) .

3. Explain how to compute E[ g( max_{t∈[0,T]} X^π_t ) ] in practice.

 
3. Explain how to compute E g(maxt∈[0,T ] Xtπ ) in practice.

Exercise II.12. Let b and σ be C²_b functions from R to R, and let X be the solution of
the following SDE:

Xt = X0 + ∫_0^t b(Xs) ds + ∫_0^t σ(Xs) dWs , 0 ≤ t ≤ T .

1. Recall the definition of ∇X, the tangent process for X, and the equation (E) it
satisfies.

2. Prove that

E[ sup_{t≤T} |∇Xt|^p ] ≤ Cp .

3. Show uniqueness for (E).
149
Exercise II.13. Let X be given by

X^x_t = x + ∫_0^t b(X^x_s) ds + ∫_0^t σ(X^x_s) dWs ,

where b and σ are C²_b.

1. Recall the definition of the tangent process.

2. Write down the Euler scheme for X and for ∇X, on a discrete time grid π.

3. In the d = 1 case, suggest an approximation of ∇X which is positive.

Exercise II.14. Let X be given by

X^x_t = x + ∫_0^t b(X^x_s) ds + ∫_0^t σ(X^x_s) dWs ,

where b and σ are C²_b.

We consider the Euler approximation of Question 2 in Exercise II.13.

1. Recall the rate of strong convergence of the Euler scheme for X.

2. Show that

E[ sup_{t∈[ti,ti+1]} |∇Xt − ∇X_{ti}|² ] ≤ C|π| .

3. Show that

E[ sup_{t≤T} |∇Xt − ∇X^π_t|² ] ≤ C|π| .

4. We approximate E[g(XT)∇XT] by E[g(X^π_T)∇X^π_T]. Give an upper bound on the
error when using the previous approximation.

5. Explain how to compute the previous approximation in practice.

Exercise II.15. We work in the Black-Scholes setting with

dXt = rXt dt + σXt dWt .

Using the likelihood ratio method, show that

1. the delta is given by

∂x u(0, x) = e^{−rT} E[ g(X^x_T) WT/(xσT) ] ;

2. the vega is given by

∂σ u(0, x) = e^{−rT} E[ g(X^x_T) ( W²_T/(σT) − WT − 1/σ ) ] ;

3. the Gamma is given by

∂²_{xx} u(0, x) = e^{−rT} (1/x²) E[ g(X^x_T) (1/(σT)) ( W²_T/(σT) − WT − 1/σ ) ] .

Exercise II.16. Let X be given by

X^x_t = x + ∫_0^t b(X^x_s) ds + ∫_0^t σ(X^x_s) dWs ,

where b and σ are C²_b. For α > 0 and H a bounded progressively measurable
process, we introduce

X^x_t(α) = x + ∫_0^t ( b(X^x_s(α)) − αHs σ(X^x_s(α)) ) ds + ∫_0^t σ(X^x_s(α)) dWs .

The goal of this exercise is to prove that Ut := ∂α X(α)|_{α=0} satisfies

dUt = Ut ( b'(Xt) dt + σ'(Xt) dWt ) − Ht σ(Xt) dt and U0 = 0 . (6.84)

1. We define

∂α Xt(α) = lim_{ε→0} ( Xt(α + ε) − Xt(α) ) / ε ;

using a heuristic argument, show that

∂α X^x_t(α) = ∫_0^t ( b'(X^x_s(α)) ∂α X^x_s(α) − αHs σ'(X^x_s(α)) ∂α X^x_s(α) − Hs σ(X^x_s(α)) ) ds
+ ∫_0^t σ'(X^x_s(α)) ∂α X^x_s(α) dWs .

2. Setting α = 0 in the above equation, we retrieve (6.84). We now define
Γ^ε := (X^x(ε) − X^x)/ε and

b̃^ε_t = ∫_0^1 b'( Xt + λ(X^ε_t − Xt) ) dλ , σ̃^ε_t = ∫_0^1 σ'( Xt + λ(X^ε_t − Xt) ) dλ ,

where X^ε := X^x(ε).

(a) Write down the dynamics of Γ^ε using b̃^ε, σ̃^ε and show that, for ε ∈ [−1, 1],

E[ sup_{t≤T} |Γ^ε_t|^p ] ≤ Cp .

(b) Show that

lim_{ε→0} E[ sup_{t≤T} |∂α X^x_t(0) − Γ^ε_t|² ] = 0 .

3. Recall the definition of ∇X^x_t, the tangent process for X.

4. Compute the dynamics of Ut/∇Xt and give the expression of U in terms of (∇Xt).

Exercise II.17. 1. Prove that the estimator given in Proposition ?? has finite

variance.

Exercise II.18. Prove Proposition 4.3 (hint: introduce a dominating Bermudan option with exercise payoff g(X) \vee g(X^\pi)).

Exercise II.19. 1. Prove that any L2 -optimal quantizer is stationary, recall (4.7).

In particular, prove (4.6).

2. Prove (4.8) and (4.9).

Exercise II.20. Prove Proposition 5.1.

Exercise II.21. Prove Theorem 5.2.

Exercise II.22. Prove Lemma 5.1.

Part III

Partial corrections to the exercises


Correction to Exercise II.4

1. Localisation: Let us define, for M a positive integer,
\[
\tau_M := \inf\{t \ge 0 \,|\, |X^\pi_t| \ge M\} \wedge T,
\]
which is a stopping time. We then consider
\[
X^{M,\pi}_t := X^\pi_{t\wedge\tau_M}.
\]
We observe that
\[
X^{M,\pi}_t = x + \int_0^{t\wedge\tau_M} b(\bar{s}, X^\pi_{\bar{s}})\,ds + \int_0^{t\wedge\tau_M} \sigma(\bar{s}, X^\pi_{\bar{s}})\,dW_s
= x + \int_0^{t} b(\bar{s}, X^{M,\pi}_{\bar{s}})\,\mathbf{1}_{\{\tau_M>s\}}\,ds + \int_0^{t} \sigma(\bar{s}, X^{M,\pi}_{\bar{s}})\,\mathbf{1}_{\{\tau_M>s\}}\,dW_s.
\]

2. Estimate: Applying the inequality (a + b + c)^p \le 3^{p-1}(a^p + b^p + c^p) (Jensen's inequality) and using step 1, we compute
\[
|X^{M,\pi}_t|^p \le 3^{p-1}\Big(x^p + \Big|\int_0^t b(\bar{s}, X^{M,\pi}_{\bar{s}})\mathbf{1}_{\{\tau_M>s\}}\,ds\Big|^p + \Big|\int_0^t \sigma(\bar{s}, X^{M,\pi}_{\bar{s}})\mathbf{1}_{\{\tau_M>s\}}\,dW_s\Big|^p\Big).
\]
Using the BDG inequality, we obtain
\[
\mathbb{E}\Big[\sup_{0\le u\le t}|X^{M,\pi}_u|^p\Big] \le C_p\Big(x^p + \mathbb{E}\Big[\sup_{0\le u\le t}\Big|\int_0^u b(\bar{s}, X^{M,\pi}_{\bar{s}})\mathbf{1}_{\{\tau_M>s\}}\,ds\Big|^p\Big] + \mathbb{E}\Big[\Big|\int_0^t \sigma(\bar{s}, X^{M,\pi}_{\bar{s}})^2\mathbf{1}_{\{\tau_M>s\}}\,ds\Big|^{\frac{p}{2}}\Big]\Big).
\]
Using the Hölder inequality (recalling that p/2 \ge 1),
\[
\mathbb{E}\Big[\sup_{0\le u\le t}|X^{M,\pi}_u|^p\Big] \le C_p\Big(x^p + t^{p-1}\,\mathbb{E}\Big[\int_0^t |b(\bar{s}, X^{M,\pi}_{\bar{s}})|^p\,ds\Big] + t^{\frac{p}{2}-1}\,\mathbb{E}\Big[\int_0^t |\sigma(\bar{s}, X^{M,\pi}_{\bar{s}})|^p\,ds\Big]\Big).
\]
Using the linear growth property of b and \sigma, we compute
\[
\mathbb{E}\Big[\sup_{0\le u\le t}|X^{M,\pi}_u|^p\Big] \le C_p\Big(1 + \int_0^t \mathbb{E}\big[|X^{M,\pi}_{\bar{s}}|^p\big]\,ds\Big),
\]
which a fortiori leads to
\[
\mathbb{E}\Big[\sup_{0\le u\le t}|X^{M,\pi}_u|^p\Big] \le C_p\Big(1 + \int_0^t \mathbb{E}\Big[\sup_{0\le u\le s}|X^{M,\pi}_u|^p\Big]\,ds\Big).
\]
Applying Gronwall's Lemma, we obtain
\[
\mathbb{E}\Big[\sup_{0\le u\le T}|X^{M,\pi}_u|^p\Big] \le C_p.
\]

3. Conclusion: We observe that, as M grows to infinity,
\[
\sup_{0\le u\le T}|X^{M,\pi}_u|^p = \sup_{0\le u\le T\wedge\tau_M}|X^\pi_u|^p
\]
converges increasingly to \sup_{0\le u\le T}|X^\pi_u|^p. The proof is then concluded by applying the monotone convergence theorem. □

Correction to Exercise II.5

Correction to Exercise II.6

1. We have
\[
X^\pi_{i+1} = X^\pi_i + b(X^\pi_i)(t_{i+1}-t_i) + \sigma(X^\pi_i)(W_{t_{i+1}} - W_{t_i}) + \frac12 a(X^\pi_i)\big((W_{t_{i+1}} - W_{t_i})^2 - (t_{i+1}-t_i)\big)
\]
and
\[
T(u) = \int_0^u \big(b(X_s) - b(X_{\bar{s}})\big)\,ds + \int_0^u \big(\sigma(X_s) - \sigma(X_{\bar{s}}) - a(X_{\bar{s}})(W_s - W_{\bar{s}})\big)\,dW_s.
\]
2. We set \delta X_t = X_t - X^\pi_t and observe that
\[
\delta X_t = \int_0^t \big(b(X_{\bar{s}}) - b(X^\pi_{\bar{s}})\big)\,ds + \int_0^t \big(\sigma(X_{\bar{s}}) - \sigma(X^\pi_{\bar{s}})\big)\,dW_s + \int_0^t \big(a(X_{\bar{s}}) - a(X^\pi_{\bar{s}})\big)(W_s - W_{\bar{s}})\,dW_s.
\]
The only new term in the analysis, compared with the Euler scheme, is
\[
A_t := \int_0^t \big(a(X_{\bar{s}}) - a(X^\pi_{\bar{s}})\big)(W_s - W_{\bar{s}})\,dW_s.
\]
We then compute, using the BDG inequality,
\[
\mathbb{E}\Big[\sup_{0\le u\le t}|A_u|^p\Big] \le C_p\,\mathbb{E}\Big[\Big(\int_0^t \big|\big(a(X_{\bar{s}}) - a(X^\pi_{\bar{s}})\big)(W_s - W_{\bar{s}})\big|^2\,ds\Big)^{\frac{p}{2}}\Big]
\le C_p\,\mathbb{E}\Big[\int_0^t \big|\big(a(X_{\bar{s}}) - a(X^\pi_{\bar{s}})\big)(W_s - W_{\bar{s}})\big|^p\,ds\Big]
\le C_p\,\mathbb{E}\Big[\int_0^t \big|a(X_{\bar{s}}) - a(X^\pi_{\bar{s}})\big|^p\,\mathbb{E}_{\bar{s}}\big[|W_s - W_{\bar{s}}|^p\big]\,ds\Big]
\le C_p\,\mathbb{E}\Big[\int_0^t \big|a(X_{\bar{s}}) - a(X^\pi_{\bar{s}})\big|^p\,ds\Big],
\]
where we used \mathbb{E}_{\bar{s}}\big[|W_s - W_{\bar{s}}|^p\big] \le C_p|\pi|^{\frac{p}{2}}. Since \sigma is bounded and C^2_b, the function a is Lipschitz and we finally obtain
\[
\mathbb{E}\Big[\sup_{0\le u\le t}|A_u|^p\Big] \le C_p\,\mathbb{E}\Big[\sup_{u\le t}|\delta X_u|^p\Big].
\]
The proof is then concluded using the same arguments as the ones used for the Euler scheme.
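
As a complement (not part of the original correction), here is a minimal Python sketch of one path of the scheme of Question 1, with a = \sigma'\sigma; the coefficient functions below are illustrative choices, not taken from the exercise.

```python
import numpy as np

def milstein_terminal(x0, b, sigma, dsigma, T, n, rng):
    """One path of X_{i+1} = X_i + b h + sigma dW + (a/2)(dW^2 - h),
    with a = sigma' * sigma, simulated up to the terminal time T."""
    h = T / n
    x = x0
    for _ in range(n):
        dw = rng.normal(0.0, np.sqrt(h))
        a = dsigma(x) * sigma(x)                    # a = sigma' sigma
        x = x + b(x) * h + sigma(x) * dw + 0.5 * a * (dw**2 - h)
    return x

rng = np.random.default_rng(0)
b = lambda x: -x                                    # illustrative drift
sigma = lambda x: 0.4 * np.sqrt(1.0 + x**2)         # illustrative volatility
dsigma = lambda x: 0.4 * x / np.sqrt(1.0 + x**2)    # its derivative
print(milstein_terminal(1.0, b, sigma, dsigma, 1.0, 100, rng))
```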

Correction to Exercise II.7

1.

2. We observe that, \phi being non-decreasing and non-negative,
\[
\Big(\int_0^t \phi^2(s)\,ds\Big)^{\frac12} \le \Big(\phi(t)\int_0^t \phi(s)\,ds\Big)^{\frac12} \le \frac{\phi(t)}{2B} + \frac{B}{2}\int_0^t \phi(s)\,ds,
\]
where we used Young's inequality for the last step. Finally, we obtain
\[
\phi(t) \le (2A + B^2)\int_0^t \phi(s)\,ds + 2C.
\]
The proof is concluded using the standard Gronwall's Lemma.

Correction to Exercise II.8

1. Apply Itô's formula to b(X_s) and \sigma(X_s) and discretise the integrals. We sketch the proof for b = 0. Applying Itô's formula once, we get
\[
X_{t_{i+1}} \simeq X_{t_i} + \sigma(X_{t_i})\Delta_i W + \frac12\sigma''\sigma^2\,(\Delta_i W\,h - \Delta_i Z) + \int_{t_i}^{t_{i+1}}\!\int_{t_i}^{t} \sigma'\sigma(X_s)\,dW_s\,dW_t.
\]
Expanding the last integral (keeping only the dW terms), we get
\[
X_{t_{i+1}} \simeq X_{t_i} + \sigma(X_{t_i})\Delta_i W + \frac12\sigma''\sigma^2\,(\Delta_i W\,h - \Delta_i Z) + \sigma'\sigma(X_{t_i})\int_{t_i}^{t_{i+1}}\!\int_{t_i}^{t} dW_s\,dW_t + (\sigma'\sigma)'\sigma(X_{t_i})\int_{t_i}^{t_{i+1}}\!\int_{t_i}^{t}\!\int_{t_i}^{s} dW_r\,dW_s\,dW_t.
\]
The proof is completed by checking that
\[
\Delta_i W^3 = 3\int_{t_i}^{t_{i+1}} (W_s - W_{t_i})\,ds + 3\int_{t_i}^{t_{i+1}} (W_s - W_{t_i})^2\,dW_s
= 3\int_{t_i}^{t_{i+1}} (W_s - W_{t_i})\,ds + 6\int_{t_i}^{t_{i+1}}\!\int_{t_i}^{t} (W_s - W_{t_i})\,dW_s\,dW_t + 3\int_{t_i}^{t_{i+1}} (s - t_i)\,dW_s
= 3(t_{i+1} - t_i)\,\Delta_i W + 6\int_{t_i}^{t_{i+1}}\!\int_{t_i}^{t}\!\int_{t_i}^{s} dW_r\,dW_s\,dW_t.
\]

2. (\Delta_i W, \Delta_i Z) is a Gaussian vector such that \mathbb{E}[\Delta_i Z] = 0, \mathbb{E}\big[|\Delta_i Z|^2\big] = \frac13\Delta^3 and \mathbb{E}[\Delta_i Z\,\Delta_i W] = \frac12\Delta^2.

3. Set \Delta_i W = \sqrt{h}\,G_1 and \Delta_i Z = \frac12 h^{\frac32}\big(G_1 + \frac{1}{\sqrt{3}}G_2\big), with (G_1, G_2) \sim \mathcal{N}(0, I_2).
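
A short sketch of the sampling recipe of Questions 2-3 (function and variable names are ours); the empirical moments can be checked against \Delta^2/2 and \Delta^3/3.

```python
import numpy as np

def sample_dw_dz(h, size, rng):
    """Sample the Gaussian pair (Delta_i W, Delta_i Z) with
    Var(dW) = h, Var(dZ) = h^3/3 and Cov(dW, dZ) = h^2/2."""
    g1 = rng.normal(size=size)
    g2 = rng.normal(size=size)
    dw = np.sqrt(h) * g1
    dz = 0.5 * h**1.5 * (g1 + g2 / np.sqrt(3.0))
    return dw, dz

rng = np.random.default_rng(0)
h = 0.01
dw, dz = sample_dw_dz(h, 10**6, rng)
print(np.var(dz), h**3 / 3)          # both close to 3.33e-7
print(np.mean(dw * dz), h**2 / 2)    # both close to 5e-5
```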

Correction to Exercise II.9

 
1. \mathbb{E}^{\mathbb{Q}}\big[e^{-rT}g(X_T)\big], with X^\pi_{i+1} = X^\pi_i + rX^\pi_i h + \sigma(X^\pi_i)\Delta_i W^{\mathbb{Q}}.

2. The Euler scheme.

3. The Black-Scholes model.

(a) X^\pi_{i+1} = X^\pi_i(1 + rh + \sigma\Delta_i W^{\mathbb{Q}}) can take negative values.

(b) We first observe that
\[
\mathbb{E}\big[|g(A_T) - g(A^\pi_T)|^p\big]^{\frac1p} \le C\,\mathbb{E}\big[|A_T - A^\pi_T|^p\big]^{\frac1p}, \qquad A_T - A^\pi_T = \int_0^T (X_s - X_{\bar{s}})\,ds.
\]
We compute, using an integration-by-parts argument,
\[
\int_{t_i}^{t_{i+1}} (X_s - X_{t_i})\,ds = \int_{t_i}^{t_{i+1}}\Big(\int_{t_i}^{s} rX_u\,du + \int_{t_i}^{s}\sigma X_u\,dW_u\Big)ds = \int_{t_i}^{t_{i+1}} (t_{i+1}-s)\,rX_s\,ds + \int_{t_i}^{t_{i+1}} (t_{i+1}-s)\,\sigma X_s\,dW_s.
\]
We then have, denoting by s^* the right endpoint of the grid interval containing s,
\[
\mathbb{E}\big[|A_T - A^\pi_T|^p\big] \le 2^{p-1}\Big(\mathbb{E}\Big[\Big|\int_0^T (s^*-s)\,rX_s\,ds\Big|^p\Big] + \mathbb{E}\Big[\Big|\int_0^T (s^*-s)\,\sigma X_s\,dW_s\Big|^p\Big]\Big)
\le C_p\Big(|\pi|^p\int_0^T \mathbb{E}\Big[\sup_u |X_u|^p\Big]ds + \mathbb{E}\Big[\Big|\int_0^T |(s^*-s)\,\sigma X_s|^2\,ds\Big|^{\frac{p}{2}}\Big]\Big)
\le C_p\,|\pi|^p\int_0^T \mathbb{E}\Big[\sup_u |X_u|^p\Big]ds \le C_p|\pi|^p.
\]
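
For concreteness, a minimal sketch of the quantity analysed above: a Monte Carlo estimate of \mathbb{E}[e^{-rT}g(A^\pi_T)] with A^\pi_T = \sum_i X_{t_i}(t_{i+1}-t_i), the Black-Scholes path being sampled exactly on the grid; the payoff g below is an illustrative choice.

```python
import numpy as np

def asian_mc(g, x0, r, sigma, T, n, N, rng):
    """Monte Carlo for E[exp(-rT) g(A^pi_T)], A^pi_T being the left-point
    Riemann sum of an exactly-sampled Black-Scholes path on the grid."""
    h = T / n
    x = np.full(N, float(x0))
    a = np.zeros(N)
    for _ in range(n):
        a += x * h                                   # left-point Riemann sum
        z = rng.normal(0.0, np.sqrt(h), N)
        x = x * np.exp((r - 0.5 * sigma**2) * h + sigma * z)
    return np.exp(-r * T) * np.mean(g(a))

rng = np.random.default_rng(0)
g = lambda a: np.maximum(a - 1.0, 0.0)               # illustrative Asian call payoff
print(asian_mc(g, 1.0, 0.05, 0.2, 1.0, 100, 10**5, rng))
```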

Correction to Exercise II.10

1. The 'classical' estimator is \frac{1}{N}\sum_{j=1}^{N} f((X^n_T)^j) and then
\[
MSE = \big|\mathbb{E}\big[f(X_T) - f((X^n_T)^1)\big]\big|^2 + \frac{\mathrm{Var}(f((X^n_T)^1))}{N} = O\Big(\frac{1}{n^2}\Big) + \frac{\mathrm{Var}(f((X^n_T)^1))}{N}.
\]
2. We have that
\[
\mathbb{E}\big[2f(X^{2n}_T) - f(X^n_T)\big] = \mathbb{E}[f(X_T)] + O\Big(\frac{1}{n^2}\Big).
\]
3. (a) We have
\[
MSE = O\Big(\frac{1}{n^4}\Big) + \frac{\mathrm{Var}\big[2f(X^{2n}_T) - f(X^n_T)\big]}{N} = O\Big(\frac{1}{n^4}\Big) + \frac{4\,\mathrm{Var}\big[f(X^{2n}_T)\big] + \mathrm{Var}\big[f(X^n_T)\big]}{N}
\]
because of independence. Then we have that, for n large,
\[
\mathrm{Var}\big[2f(X^{2n}_T) - f(X^n_T)\big] \sim 5\,\mathrm{Var}[f(X_T)].
\]
(b) We have
\[
MSE = O\Big(\frac{1}{n^4}\Big) + \frac{\mathrm{Var}\big[2f(X^{2n}_T) - f(X^n_T)\big]}{N},
\]
and in the limit
\[
\mathrm{Var}\big[2f(X^{2n}_T) - f(X^n_T)\big] \sim \mathrm{Var}[f(X_T)].
\]
This estimator therefore has a smaller variance.
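
A sketch of the estimator of Question 3(b), where the coarse Euler scheme reuses the fine Brownian increments; it is this coupling that brings the limiting variance down to \mathrm{Var}[f(X_T)] (independent paths, as in 3(a), would give approximately 5\,\mathrm{Var}[f(X_T)]). The coefficients and payoff are illustrative.

```python
import numpy as np

def euler_terminal(x0, b, sig, dw, h):
    """Euler scheme driven by the given array of Brownian increments."""
    x = np.full(dw.shape[1], float(x0))
    for i in range(dw.shape[0]):
        x = x + b(x) * h + sig(x) * dw[i]
    return x

def extrapolated_estimator(f, x0, b, sig, T, n, N, rng):
    """Estimate E[f(X_T)] by the mean of 2 f(X^{2n}_T) - f(X^n_T), the coarse
    scheme being driven by the aggregated fine Brownian increments."""
    h_fine = T / (2 * n)
    dw_fine = rng.normal(0.0, np.sqrt(h_fine), size=(2 * n, N))
    dw_coarse = dw_fine[0::2] + dw_fine[1::2]   # pairwise sums: coarse increments
    x_fine = euler_terminal(x0, b, sig, dw_fine, h_fine)
    x_coarse = euler_terminal(x0, b, sig, dw_coarse, T / n)
    sample = 2.0 * f(x_fine) - f(x_coarse)
    return sample.mean(), sample.var() / N      # estimate and its statistical variance

rng = np.random.default_rng(0)
b = lambda x: 0.05 * x                          # illustrative drift
sig = lambda x: 0.2 * x                         # illustrative volatility
f = lambda x: np.maximum(x - 1.0, 0.0)
print(extrapolated_estimator(f, 1.0, b, sig, 1.0, 50, 10**5, rng))
```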

Correction to Exercise II.11

1. We compute
\[
\mathbb{P}\Big(\sup_{t\in[0,T]} W_t \ge z \,\Big|\, W_T = y\Big) = \lim_{\eta\downarrow 0}\,\mathbb{P}\Big(\sup_{t\in[0,T]} W_t \ge z \,\Big|\, y \le W_T \le y+\eta\Big)
= \lim_{\eta\downarrow 0}\,\frac{\big[\mathbb{P}(W_T \ge 2z-(y+\eta)) - \mathbb{P}(W_T \ge 2z-y)\big]/\eta}{\big[\mathbb{P}(W_T \le y+\eta) - \mathbb{P}(W_T \le y)\big]/\eta}
= \frac{\partial_y\,\mathbb{P}(W_T \ge 2z-y)}{\partial_y\,\mathbb{P}(W_T \le y)}
= \frac{-e^{-\frac{(2z-y)^2}{2T}}\,(-1)}{e^{-\frac{y^2}{2T}}} = e^{-2\frac{z(z-y)}{T}}.
\]
2. We observe that
\[
\max_{t\in[t_i,t_{i+1}]} X^\pi_t \le M \iff \max_{t\in[t_i,t_{i+1}]} (W_t - W_{t_i}) \le \frac{M - X^\pi_{t_i}}{\sigma(X^\pi_{t_i})},
\]
leading to (W^{t_i} denoting a Brownian motion starting from 0 at time t_i)
\[
\mathbb{P}\Big(\max_{t\in[t_i,t_{i+1}]} X^\pi_t \le M \,\Big|\, X^\pi_{t_i} = x_i,\, X^\pi_{t_{i+1}} = x_{i+1}\Big)
= \mathbb{P}\Big(\max_{t\in[t_i,t_{i+1}]} W^{t_i}_t \le \frac{M-x_i}{\sigma(x_i)} \,\Big|\, W^{t_i}_{t_{i+1}} = \frac{x_{i+1}-x_i}{\sigma(x_i)}\Big)
= 1 - \mathbb{P}\Big(\max_{t\in[t_i,t_{i+1}]} W^{t_i}_t \ge \frac{M-x_i}{\sigma(x_i)} \,\Big|\, W^{t_i}_{t_{i+1}} = \frac{x_{i+1}-x_i}{\sigma(x_i)}\Big)
= 1 - e^{-2\frac{(M-x_i)(M-x_{i+1})}{h\sigma^2(x_i)}},
\]
using the previous question.

3. We use the inverse transform method. Once the values X^\pi_{t_i} have been simulated, we simulate \max_{t\in[t_i,t_{i+1}]} X^\pi_t by computing
\[
\frac12\Big(X^\pi_{t_i} + X^\pi_{t_{i+1}} + \sqrt{(X^\pi_{t_{i+1}} - X^\pi_{t_i})^2 - 2h\sigma^2(X^\pi_{t_i})\ln(U^i)}\Big),
\]
where the U^i are i.i.d. \mathcal{U}(0,1) random variables.
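
A sketch of the full procedure of Question 3 (Euler path plus the conditional maximum on each interval via the inverse transform above); the lookback-type payoff at the end is an illustrative choice.

```python
import numpy as np

def euler_running_max(x0, b, sig, T, n, N, rng):
    """Simulate X^pi on the grid and, on each interval, the conditional
    maximum via the inverse-transform formula of Question 3."""
    h = T / n
    x = np.full(N, float(x0))
    running_max = np.full(N, float(x0))
    for _ in range(n):
        x_next = x + b(x) * h + sig(x) * rng.normal(0.0, np.sqrt(h), N)
        u = rng.uniform(size=N)
        m = 0.5 * (x + x_next +
                   np.sqrt((x_next - x)**2 - 2.0 * h * sig(x)**2 * np.log(u)))
        running_max = np.maximum(running_max, m)
        x = x_next
    return running_max

rng = np.random.default_rng(0)
b = lambda x: 0.05 * x                      # illustrative drift
sig = lambda x: 0.2 * x                     # illustrative volatility
M = euler_running_max(1.0, b, sig, 1.0, 100, 10**5, rng)
print(np.mean(np.maximum(M - 1.1, 0.0)))    # E[g(max X^pi)] for g(m) = (m - 1.1)^+
```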

Correction to Exercise II.12

1. We recall that \nabla X^x_t is the solution to
\[
Y_t = 1 + \int_0^t b'(X^x_s)Y_s\,ds + \int_0^t \sigma'(X^x_s)Y_s\,dW_s. \tag{6.85}
\]
2. To ease the notation, we set b = 0. We then compute
\[
\sup_{u\le t}|Y_u|^p \le C_p\Big(1 + \sup_{u\le t}\Big|\int_0^u \sigma'(X^x_s)Y_s\,dW_s\Big|^p\Big).
\]
Taking the expectation and using the BDG inequality, we get
\[
\mathbb{E}\Big[\sup_{u\le t}|Y_u|^p\Big] \le C_p\Big(1 + \mathbb{E}\Big[\Big(\int_0^t|\sigma'(X^x_s)Y_s|^2\,ds\Big)^{\frac{p}{2}}\Big]\Big).
\]
This leads to, for p \ge 2,
\[
\mathbb{E}\Big[\sup_{u\le t}|Y_u|^p\Big] \le C_p\Big(1 + \mathbb{E}\Big[\int_0^t|Y_s|^p\,ds\Big]\Big) \le C_p\Big(1 + \int_0^t\mathbb{E}\Big[\sup_{u\in[0,s]}|Y_u|^p\Big]ds\Big).
\]
We conclude using Gronwall's Lemma.

Remark (see also lecture notes, Section 2.7, proof of Proposition 2.3): the above proof should be applied to the localised version of Y, i.e. Y_{\cdot\wedge\tau_n}, with
\[
\tau_n = \inf\{t \ge 0 \,|\, |Y_t| \ge n\} \wedge T,
\]
for n a positive integer.

3. Let Y and Z be solutions of (6.85) (with b = 0) and set \Delta = Y - Z. We then observe that
\[
\Delta_t = \int_0^t \sigma'(X^x_s)\Delta_s\,dW_s.
\]
We then have
\[
\sup_{u\le t}|\Delta_u|^2 = \sup_{u\le t}\Big|\int_0^u \sigma'(X^x_s)\Delta_s\,dW_s\Big|^2,
\]
and taking the expectation and using the BDG inequality, we obtain
\[
\mathbb{E}\Big[\sup_{u\le t}|\Delta_u|^2\Big] \le 4\,\mathbb{E}\Big[\int_0^t|\sigma'(X^x_s)\Delta_s|^2\,ds\Big].
\]
Since \sigma' is bounded, we get
\[
\mathbb{E}\Big[\sup_{u\le t}|\Delta_u|^2\Big] \le C\int_0^t\mathbb{E}\Big[\sup_{u\le s}|\Delta_u|^2\Big]ds.
\]
Then, using Gronwall's Lemma, we have that \mathbb{E}\big[\sup_{u\le t}|\Delta_u|^2\big] = 0.

Correction to Exercise II.13

1. It is the derivative of X^x with respect to x, i.e. \nabla X^x_t = \lim_{\epsilon\to 0}\frac{X^{x+\epsilon}_t - X^x_t}{\epsilon}.

2. See lecture notes, Section 4.3.3.

3. We compute that
\[
\nabla X^x_{t_i} = e^{\int_0^{t_i}\big(b'(X_s) - \frac{\sigma'(X_s)^2}{2}\big)ds + \int_0^{t_i}\sigma'(X_s)\,dW_s}.
\]
One can discretise X and then the above integrals, to get
\[
\nabla X^\pi_{t_i} = e^{\sum_{j=0}^{i-1}\big(b'(X^\pi_{t_j}) - \frac{\sigma'(X^\pi_{t_j})^2}{2}\big)(t_{j+1}-t_j) + \sum_{j=0}^{i-1}\sigma'(X^\pi_{t_j})(W_{t_{j+1}}-W_{t_j})},
\]
which is positive by construction.
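
A minimal sketch (ours, with illustrative coefficients) of this positive approximation, computed alongside the Euler scheme for X on the same Brownian increments:

```python
import numpy as np

def tangent_positive(x0, b, db, sig, dsig, T, n, rng):
    """Euler scheme for X together with the positive (exponential)
    approximation of the tangent process from Question 3."""
    h = T / n
    x, log_grad = x0, 0.0
    for _ in range(n):
        dw = rng.normal(0.0, np.sqrt(h))
        # increment of the discretised integrals, evaluated at the left point
        log_grad += (db(x) - 0.5 * dsig(x)**2) * h + dsig(x) * dw
        x = x + b(x) * h + sig(x) * dw
    return x, np.exp(log_grad)          # exp(...) > 0 by construction

rng = np.random.default_rng(0)
b, db = (lambda x: -x), (lambda x: -1.0)
sig = lambda x: 0.4 * np.sqrt(1.0 + x**2)
dsig = lambda x: 0.4 * x / np.sqrt(1.0 + x**2)
print(tangent_positive(1.0, b, db, sig, dsig, 1.0, 100, rng))
```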

Correction to Exercise II.14

1. The strong convergence has a rate equal to 1/2.

2. We observe that (say b = 0)
\[
\sup_{t\in[t_i,t_{i+1}]}|\nabla X_t - \nabla X_{t_i}|^2 = \sup_{t\in[t_i,t_{i+1}]}\Big|\int_{t_i}^{t}\sigma'(X_s)\nabla X_s\,dW_s\Big|^2.
\]
Taking the expectation and applying the BDG inequality, we obtain
\[
\mathbb{E}\Big[\sup_{t\in[t_i,t_{i+1}]}|\nabla X_t - \nabla X_{t_i}|^2\Big] \le C\,\mathbb{E}\Big[\int_{t_i}^{t_{i+1}}|\sigma'(X_s)\nabla X_s|^2\,ds\Big] \le C|\pi|\,\mathbb{E}\Big[\sup_s|\nabla X_s|^2\Big].
\]
Using Lemma 4.1 in Section 4.3.1, we obtain
\[
\mathbb{E}\Big[\sup_{t\in[t_i,t_{i+1}]}|\nabla X_t - \nabla X_{t_i}|^2\Big] \le C|\pi|.
\]
3. We define \delta_t = \nabla X_t - \nabla X^\pi_t and observe
\[
\delta_t = \int_0^t\big(\sigma'(X_s) - \sigma'(X^\pi_s)\big)\nabla X_s\,dW_s + \int_0^t\sigma'(X^\pi_s)\big(\nabla X_s - \nabla X_{\bar{s}}\big)\,dW_s + \int_0^t\sigma'(X^\pi_s)\,\delta_{\bar{s}}\,dW_s.
\]
Squaring, taking the supremum and the expectation, and applying BDG, we get
\[
\mathbb{E}\Big[\sup_{s\le t}|\delta_s|^2\Big] \le C\,\mathbb{E}\Big[\int_0^T\Big(\big|(\sigma'(X_s) - \sigma'(X^\pi_s))\nabla X_s\big|^2 + \big|\sigma'(X^\pi_s)(\nabla X_s - \nabla X_{\bar{s}})\big|^2\Big)ds\Big] + C\int_0^t\mathbb{E}\Big[\sup_{u\le s}|\delta_u|^2\Big]ds,
\]
where we also used the boundedness of \sigma'. We then apply Gronwall's Lemma to get
\[
\mathbb{E}\Big[\sup_{t\le T}|\delta_t|^2\Big] \le C\,\mathbb{E}\Big[\int_0^T\Big(\big|(\sigma'(X_s) - \sigma'(X^\pi_s))\nabla X_s\big|^2 + \big|\sigma'(X^\pi_s)(\nabla X_s - \nabla X_{\bar{s}})\big|^2\Big)ds\Big],
\]
and then, using again the boundedness of \sigma',
\[
\mathbb{E}\Big[\sup_{t\le T}|\delta_t|^2\Big] \le C\,\mathbb{E}\Big[\int_0^T\Big(\big|(\sigma'(X_s) - \sigma'(X^\pi_s))\nabla X_s\big|^2 + \big|\nabla X_s - \nabla X_{\bar{s}}\big|^2\Big)ds\Big].
\]
Applying the Cauchy-Schwarz inequality,
\[
\mathbb{E}\Big[\sup_{t\le T}|\delta_t|^2\Big] \le C\,\mathbb{E}\Big[\sup_s|\nabla X_s|^4\Big]^{\frac12}\,\mathbb{E}\Big[\sup_s|\sigma'(X_s) - \sigma'(X^\pi_s)|^4\Big]^{\frac12} + C\,\mathbb{E}\Big[\int_0^T|\nabla X_s - \nabla X_{\bar{s}}|^2\,ds\Big].
\]
Using the previous estimates, we get
\[
\mathbb{E}\Big[\sup_{t\le T}|\delta_t|^2\Big] \le C|\pi|.
\]
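
For Questions 4-5 of Exercise II.14, which are not detailed above, a minimal Monte Carlo sketch of \mathbb{E}[g(X^\pi_T)\nabla X^\pi_T], with \nabla X^\pi the Euler scheme of the tangent process driven by the same increments; g and the coefficients are illustrative choices.

```python
import numpy as np

def mc_g_times_tangent(g, x0, b, db, sig, dsig, T, n, N, rng):
    """Monte Carlo estimate of E[g(X^pi_T) grad X^pi_T], where grad X^pi is
    the Euler scheme of the tangent process, driven by the same increments."""
    h = T / n
    x = np.full(N, float(x0))
    grad = np.ones(N)
    for _ in range(n):
        dw = rng.normal(0.0, np.sqrt(h), N)
        grad = grad * (1.0 + db(x) * h + dsig(x) * dw)   # tangent Euler step
        x = x + b(x) * h + sig(x) * dw                   # Euler step for X
    return np.mean(g(x) * grad)

rng = np.random.default_rng(0)
b, db = (lambda x: -x), (lambda x: -1.0)
sig = lambda x: 0.4 * np.sqrt(1.0 + x**2)
dsig = lambda x: 0.4 * x / np.sqrt(1.0 + x**2)
g = lambda x: np.maximum(x - 1.0, 0.0)                   # illustrative payoff
print(mc_g_times_tangent(g, 1.0, b, db, sig, dsig, 1.0, 100, 10**5, rng))
```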

Correction to Exercise II.15

1. The density h of X^x_T is lognormal and given by
\[
h(x, y) = \frac{1}{y\sigma\sqrt{T}}\,\varphi(\zeta(x,y)), \qquad \zeta(x,y) = \frac{\log(y/x) - (r - \frac12\sigma^2)T}{\sigma\sqrt{T}}.
\]
And we compute
\[
s(x, y) = \partial_x h(x,y)/h(x,y) = -\zeta(x,y)\,\partial_x\zeta(x,y) = \frac{\log(y/x) - (r - \frac12\sigma^2)T}{x\sigma^2 T}.
\]
2. We now see u and h as functions of \sigma and y. We know that
\[
\partial_\sigma u(\sigma, x) = \int g(y)\,s(\sigma, y)\,h(\sigma, y)\,dy,
\]
where s(\sigma, y) now denotes the score \partial_\sigma h(\sigma,y)/h(\sigma,y). We compute
\[
\partial_\sigma\zeta = -\frac{\log(y/x) - rT}{\sigma^2\sqrt{T}} + \frac{\sqrt{T}}{2}, \qquad \partial_\sigma\zeta(X^x_T) = \sqrt{T} - \frac{W_T}{\sigma\sqrt{T}}, \qquad \zeta(X^x_T) = \frac{W_T}{\sqrt{T}},
\]
\[
\partial_\sigma h = \Big(-\frac{1}{\sigma} - \zeta\,\partial_\sigma\zeta\Big)h,
\]
and then
\[
\partial_\sigma u = \mathbb{E}\Big[g(X^x_T)\Big(-\frac{1}{\sigma} - \zeta(\sigma, X^x_T)\,\partial_\sigma\zeta(\sigma, X^x_T)\Big)\Big] = \mathbb{E}\Big[g(X^x_T)\Big(-\frac{1}{\sigma} + \frac{W_T^2}{\sigma T} - W_T\Big)\Big].
\]
3. We have
\[
\partial^2_{xx}u = \int g(y)\,\partial^2_{xx}h(x,y)\,dy = \int g(y)\,\frac{\partial^2_{xx}h(x,y)}{h(x,y)}\,h(x,y)\,dy.
\]
Recall that
\[
\partial_x h = -\zeta\,\partial_x\zeta\,h, \qquad \partial^2_{xx}h = -\partial_x[\zeta\,\partial_x\zeta]\,h + (\zeta\,\partial_x\zeta)^2\,h.
\]
Making these derivatives explicit leads to the desired result.
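
A sketch of the three likelihood-ratio estimators of Exercise II.15, using exact simulation of X_T in the Black-Scholes model; the call payoff g is an illustrative choice (for such a payoff the output can be checked against the closed-form Greeks).

```python
import numpy as np

def lrm_greeks(g, x, r, sigma, T, N, rng):
    """Likelihood ratio Monte Carlo estimators of delta, vega and gamma
    in the Black-Scholes model, using the weights of Exercise II.15."""
    W = rng.normal(0.0, np.sqrt(T), N)
    XT = x * np.exp((r - 0.5 * sigma**2) * T + sigma * W)
    disc = np.exp(-r * T)
    payoff = g(XT)
    delta = disc * np.mean(payoff * W / (x * sigma * T))
    vega_w = W**2 / (sigma * T) - W - 1.0 / sigma        # common weight
    vega = disc * np.mean(payoff * vega_w)
    gamma = disc / x**2 * np.mean(payoff * vega_w / (sigma * T))
    return delta, vega, gamma

rng = np.random.default_rng(0)
g = lambda y: np.maximum(y - 1.0, 0.0)                   # illustrative call payoff
print(lrm_greeks(g, 1.0, 0.05, 0.2, 1.0, 10**6, rng))
```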

Correction to Exercise II.16

1. Heuristic: assume that \partial_\alpha X exists and then differentiate under the integrals (both for ds and dW_s)!

2. (say b = 0)

(a) The dynamics of \Gamma^\epsilon are
\[
\Gamma^\epsilon_t = -\int_0^t H_s\sigma(X^\epsilon_s)\,ds + \int_0^t \tilde{\sigma}^\epsilon_s\Gamma^\epsilon_s\,dW_s.
\]
We compute, recalling that H is bounded,
\[
\sup_{u\le t}|\Gamma^\epsilon_u|^p \le C_p\Big(\int_0^T|\sigma(X^\epsilon_s)|^p\,ds + \sup_{u\le t}\Big|\int_0^u \tilde{\sigma}^\epsilon_s\Gamma^\epsilon_s\,dW_s\Big|^p\Big).
\]
Using the BDG inequality, the boundedness of \tilde{\sigma}^\epsilon and Gronwall's Lemma, we obtain, for p \ge 2,
\[
\mathbb{E}\Big[\sup_{u\le t}|\Gamma^\epsilon_u|^p\Big] \le C_p\,\mathbb{E}\Big[\int_0^T|\sigma(X^\epsilon_s)|^p\,ds\Big].
\]
Using then classical estimates for the solution of an SDE and the fact that \epsilon \in [-1, 1], we get
\[
\mathbb{E}\Big[\sup_{u\le t}|\Gamma^\epsilon_u|^p\Big] \le C_p.
\]
(b) We observe that, setting \delta_s = \partial_\alpha X_s(0) - \Gamma^\epsilon_s,
\[
\delta_t = -\int_0^t H_s\big(\sigma(X_s) - \sigma(X^\epsilon_s)\big)\,ds + \int_0^t\big(\sigma'(X_s)\delta_s + (\sigma'(X_s) - \tilde{\sigma}^\epsilon_s)\Gamma^\epsilon_s\big)\,dW_s.
\]
Using the usual arguments (sup + expectation + BDG + Gronwall), and noting that \sigma(X_s) - \sigma(X^\epsilon_s) = -\epsilon\,\tilde{\sigma}^\epsilon_s\Gamma^\epsilon_s, we obtain
\[
\mathbb{E}\Big[\sup_{t\le T}|\delta_t|^2\Big] \le C\epsilon^2\,\mathbb{E}\Big[\Big(\int_0^T|H_s\tilde{\sigma}^\epsilon_s\Gamma^\epsilon_s|\,ds\Big)^2\Big] + C\,\mathbb{E}\Big[\int_0^T\big|(\sigma'(X_s) - \tilde{\sigma}^\epsilon_s)\Gamma^\epsilon_s\big|^2\,ds\Big].
\]
Combining the result of 2.(a) and the estimate on |\sigma'(X_s) - \tilde{\sigma}^\epsilon_s| (see lecture notes) with a Cauchy-Schwarz argument, we obtain
\[
\mathbb{E}\Big[\sup_{t\le T}|\delta_t|^2\Big] \le C\epsilon^2,
\]
proving the statement.

Correction to Exercise II.17

Correction to Exercise II.18

Proof. (r = 0) We introduce a Bermudan option with exercise payoff Z_t = g(X_t) \vee g(X^\pi_t) and we set
\[
p^a_0 = \sup_{\tau\in\mathcal{T}^{<}_{[0,T]}} \mathbb{E}[Z_\tau] = \mathbb{E}[Z_{\tau^a}].
\]
We then simply observe that
\[
|p^\pi_0 - p^b_0| \le |p^\pi_0 - p^a_0| + |p^a_0 - p^b_0|,
\]
and that
\[
p^a_0 - p^\pi_0 \le \mathbb{E}\big[Z_{\tau^a} - g(X^\pi_{\tau^a})\big] \le \mathbb{E}\big[g(X^\pi_{\tau^a}) \vee g(X_{\tau^a}) - g(X^\pi_{\tau^a})\big] \le \mathbb{E}\big[|g(X_{\tau^a}) - g(X^\pi_{\tau^a})|\big] \le C\sqrt{|\pi|}.
\]
Similarly, one computes
\[
p^\pi_0 - p^a_0 \le C\sqrt{|\pi|}.
\]
The term |p^a_0 - p^b_0| is treated using the same arguments. □

Correction to Exercise II.19

1.

2. Proof. We first compute
\[
\Theta - \hat{\Theta} = \mathbb{E}[F(X)|Y] - \mathbb{E}\big[\mathbb{E}[F(X)|Y]\,\big|\,\hat{Y}\big] + \mathbb{E}\big[F(X) - F(\hat{X})\,\big|\,\hat{Y}\big]
= \varphi_F(Y) - \mathbb{E}\big[\varphi_F(Y)\,\big|\,\hat{Y}\big] + \mathbb{E}\big[F(X) - F(\hat{X})\,\big|\,\hat{Y}\big].
\]
From the best approximation property of conditional expectation, we obtain
\[
\big\|\varphi_F(Y) - \mathbb{E}\big[\varphi_F(Y)|\hat{Y}\big]\big\|_2 \le \big\|\varphi_F(Y) - \varphi_F(\hat{Y})\big\|_2 \le [\varphi_F]\,\|Y - \hat{Y}\|_2,
\]
and from the contraction property,
\[
\big\|\mathbb{E}\big[F(X) - F(\hat{X})|\hat{Y}\big]\big\|_2 \le \big\|F(X) - F(\hat{X})\big\|_2 \le [F]\,\|X - \hat{X}\|_2.
\]

Correction to Exercise II.20

Proof. We first write down the dynamics of \Gamma:
\[
d\Gamma_t = \Gamma_t(a_t\,dt + b_t\cdot dW_t), \qquad \Gamma_0 = 1.
\]
Using Doob's inequality, we easily get that \Gamma \in \mathcal{S}^2, as b is bounded. It is also clear that there is a unique solution to (5.3): the driver f(t, y, z) = a_t y + z b_t + c_t obviously satisfies (H1), and we know that Y \in \mathcal{S}^2.

Using the product formula, we compute
\[
d(\Gamma_t Y_t) = \Gamma_t\,dY_t + Y_t\,d\Gamma_t + d\langle\Gamma, Y\rangle_t = -\Gamma_t c_t\,dt + \Gamma_t Z_t\,dW_t + \Gamma_t Y_t\,b_t\cdot dW_t,
\]
showing that \Gamma_t Y_t + \int_0^t c_r\Gamma_r\,dr is a local martingale, which is in fact a true martingale, as c \in \mathcal{H}^2 and \Gamma, Y are in \mathcal{S}^2. Then,
\[
\Gamma_t Y_t + \int_0^t c_r\Gamma_r\,dr = \mathbb{E}\Big[\Gamma_T Y_T + \int_0^T c_r\Gamma_r\,dr\,\Big|\,\mathcal{F}_t\Big],
\]
which concludes the proof. □

Correction to Exercise II.21

The proof is done using a linearisation argument. Denoting U = Y' - Y, V = Z' - Z and \zeta = \xi' - \xi, we have
\[
U_t = \zeta + \int_t^T \big(f'(r, Y'_r, Z'_r) - f(r, Y_r, Z_r)\big)\,dr - \int_t^T V_r\,dW_r.
\]
We observe that
\[
f'(r, Y'_r, Z'_r) - f(r, Y_r, Z_r) = f'(r, Y'_r, Z'_r) - f'(r, Y_r, Z'_r) + f'(r, Y_r, Z'_r) - f'(r, Y_r, Z_r) + f'(r, Y_r, Z_r) - f(r, Y_r, Z_r),
\]
where the last difference is non-negative. We introduce a and b: a is valued in \mathbb{R} and b is a d-dimensional vector. We set
\[
a_r := \frac{f'(r, Y'_r, Z'_r) - f'(r, Y_r, Z'_r)}{U_r}\,\mathbf{1}_{\{U_r \ne 0\}}.
\]
For 0 \le i \le d, we consider the vector Z^{(i)}_r whose last d - i components are those of Z'_r and whose first i components are those of Z_r. For 1 \le i \le d, we set
\[
b^i_r := \frac{f'\big(r, Y_r, Z^{(i-1)}_r\big) - f'\big(r, Y_r, Z^{(i)}_r\big)}{V^i_r}\,\mathbf{1}_{\{V^i_r \ne 0\}}.
\]
Importantly, as f' is Lipschitz, the two processes are bounded and progressively measurable. We then observe that
\[
U_t = \zeta + \int_t^T (a_r U_r + V_r\cdot b_r + c_r)\,dr - \int_t^T V_r\,dW_r,
\]
where c_r = f'(r, Y_r, Z_r) - f(r, Y_r, Z_r). By assumption, we have \zeta \ge 0 and c_r \ge 0. Using the formula given in Proposition ??, we have, for all t \in [0, T],
\[
U_t = \Gamma_t^{-1}\,\mathbb{E}\Big[\zeta\Gamma_T + \int_t^T c_r\Gamma_r\,dr\,\Big|\,\mathcal{F}_t\Big],
\]
with, for 0 \le r \le T,
\[
\Gamma_r = \exp\Big\{\int_0^r b_u\cdot dW_u - \frac12\int_0^r |b_u|^2\,du + \int_0^r a_u\,du\Big\}.
\]
Following Remark ??, we get that U_t \ge 0.

If moreover U_0 = 0, we have
\[
0 = \mathbb{E}\Big[\zeta\Gamma_T + \int_0^T c_r\Gamma_r\,dr\Big],
\]
and the random variable inside the expectation is non-negative. It is therefore equal to zero \mathbb{P}-a.s., which implies \zeta = 0 and c_r = 0, concluding the proof of the Theorem.

Correction to Exercise II.22

Proof. 1. Observe that
\[
\zeta^e_i = -\frac12\int_{t_i}^{t_{i+1}}\big(\sigma^2(X^\pi_{t_i}) - \sigma^2(X^\pi_s)\big)\,\partial^2_{xx}u(s, X^\pi_s)\,ds \tag{6.86}
\]
has already been studied in the proof of the weak error for the Euler scheme. The upper bound is obtained by observing that
\[
\big(\sigma^2(X^\pi_{t_i}) - \sigma^2(X^\pi_s)\big)\partial^2_{xx}u(s, X^\pi_s) = \underbrace{\big(\sigma^2(X^\pi_{t_i}) - \sigma^2(X^\pi_s)\big)\big(\partial^2_{xx}u(s, X^\pi_s) - \partial^2_{xx}u(t_i, X^\pi_{t_i})\big)}_{=:A_i} + \underbrace{\big(\sigma^2(X^\pi_{t_i}) - \sigma^2(X^\pi_s)\big)\partial^2_{xx}u(t_i, X^\pi_{t_i})}_{=:B_i}.
\]
We have \mathbb{E}_{t_i}[|A_i|] \le \beta|\pi| and, applying Itô's formula, we also obtain |\mathbb{E}_{t_i}[B_i]| \le \beta|\pi|, where \beta is a random variable independent of i and whose moments are bounded. This yields (5.72) straightforwardly.

2. Using Itô's formula, we obtain, for s \in [t_i, t_{i+1}],
\[
\mathbb{E}_{t_i}\big[u^{(0)}(s, X^\pi_s)\big] = u^{(0)}(t_i, X^\pi_{t_i}) + \mathbb{E}_{t_i}\Big[\int_{t_i}^{s}\Big\{\partial_t u^{(0)}(t, X^\pi_t) + \frac12\sigma^2(X^\pi_{t_i})\,\partial^2_{xx}u^{(0)}(t, X^\pi_t)\Big\}dt\Big] = u^{(0)}(t_i, X^\pi_{t_i}) + O(h).
\]
We easily deduce that (5.73) holds true.

3. We now observe that
\[
|\zeta^z_i| \le Lh\,\big|\tilde{Z}_i - V^i_{t_i}\big|, \tag{6.87}
\]
where
\[
\tilde{Z}_i = \mathbb{E}_{t_i}\Big[u^{(0)}(t_{i+1}, X^\pi_{t_{i+1}})\,\frac{\Delta W_i}{t_{i+1}-t_i}\Big].
\]
Now, setting H_i := \frac{\Delta W_i}{h_i}, we compute
\[
\mathbb{E}_{t_i}\big[u(t_{i+1}, X_{t_{i+1}})H_i\big] = \mathbb{E}_{t_i}\Big[H_i\int_{t_i}^{t_{i+1}}\big[\partial_t + \tfrac12\sigma^2(X^\pi_{t_i})\partial^2_{xx}\big]u(t, X^\pi_t)\,dt\Big] + \frac{\sigma(X^\pi_{t_i})}{h_i}\int_{t_i}^{t_{i+1}}\mathbb{E}_{t_i}\big[\partial_x u(t, X^\pi_t)\big]\,dt. \tag{6.88}
\]
Observe that
\[
\Big|\mathbb{E}_{t_i}\Big[H_i\int_{t_i}^{t_{i+1}}\big[\partial_t + \tfrac12\sigma^2(X^\pi_{t_i})\partial^2_{xx}\big]u(t, X^\pi_t)\,dt\Big]\Big| = \Big|\mathbb{E}_{t_i}\Big[H_i\int_{t_i}^{t_{i+1}}\Big\{\big[\partial_t + \tfrac12\sigma^2(X^\pi_{t_i})\partial^2_{xx}\big]u(t, X^\pi_t) - \big[\partial_t + \tfrac12\sigma^2(X^\pi_{t_i})\partial^2_{xx}\big]u(t_i, X^\pi_{t_i})\Big\}dt\Big]\Big| \le C|\pi| \tag{6.89–6.91}
\]
and that
\[
\mathbb{E}_{t_i}\big[\partial_x u(t, X^\pi_t)\big] = \mathbb{E}_{t_i}\Big[\partial_x u(t_i, X_{t_i}) + \int_{t_i}^{t}\big[\partial_t + \tfrac12\sigma^2(X_{t_i})\partial^2_{xx}\big]\partial_x u(s, X_s)\,ds\Big]. \tag{6.92}
\]
We thus get
\[
\Big|\frac{\sigma(X^\pi_{t_i})}{h_i}\int_{t_i}^{t_{i+1}}\mathbb{E}_{t_i}\big[\partial_x u(t, X^\pi_t)\big]\,dt - V^i_{t_i}\Big| \le C|\pi|. \tag{6.93}
\]
Combining the previous inequality with (6.87), we easily obtain (5.74). □
