Crépey - Computational Finance
stephane.crepey@univ-evry.fr
Contents
1 Outline
2 General Set-Up
4 Bibliographic Guidelines
II Market Models
6 BGM Model
6.1 Changes of Numeraire
6.5 Swaptions
7.2 Li Model
7.4 Hedging
10 Numerical Approximation
10.1 Finite differences methods
11.3 Theta-schemes
11.4.1 Localization
11.4.2 Discretization
IV Tree Methods
17 Random numbers
19 Low-discrepancy sequences
19.1 General remarks on low discrepancy sequences
21.3 Properties
26 Simulation of processes
26.1 Brownian Motion
26.2 BS model
26.4 Jump-Diffusions
28 Backtesting
28.1 Static versus dynamic hedging
VI Calibration Methods
A Arbitrage Theory
These notes partly rely on the public releases of the option pricing software and documentation
system PREMIA, developed since 1999 by the MATHFI project at INRIA (www.inria.fr) and CERMICS
(cermics.enpc.fr). The aim of the Premia project is threefold: first, to assist the R&D professional
teams in their day-to-day duty; second, to help the academics who wish to perform tests of a
new algorithm or pricing method without starting from scratch; and finally, to provide the graduate
students in the field of numerical methods for finance with open-source examples of implementation
of many of the algorithms found in the literature:
www-rocq.inria.fr/mathfi/Premia/index.html
Part I
Introduction and Preliminaries
1 Outline
These notes bear on computational finance (pricing, Greeking and calibration methods), with a
focus on algorithmic aspects. The related theoretical results (convergence analysis, etc.) are generally
stated without proof.
In parts III and IV, we discuss deterministic pricing methods (general finite differences methods in
part III, and more specific tree methods in part IV). Part V is about stochastic simulation pricing
methods (Monte Carlo methods).
Note that there is no hermetic frontier between deterministic and stochastic methods. In a sense,
Monte Carlo (MC) methods are special cases of tree methods. This is more clearly visible on the more
general problem of MC pricing an American option, which is beyond the scope of these notes. Yet
it is also true of a standard MC algorithm for pricing a European option. Indeed the latter may be
interpreted as a one-time-step multinomial tree, which provides an exact discretization for the option
price in the limit where the number of discretization points in space (tree branches) goes to infinity.
In essence (though rather caricaturally, needless to say), all these numerical schemes are based on
the idea of propagating the solution from a hypersurface of the time-space domain on which it is
known, along suitable (randomized) characteristics of the problem, as in Riemann's method
of characteristics for solving hyperbolic first-order equations (see e.g. [118, Chapter 4]). From a
more control-theoretical point of view, all these numerical schemes are based on Bellman's dynamic
programming principle [22].
Of course the difference between trees in the usual sense and MC methods is that the MC mesh is
stochastically generated and unstructured.
The prices of financial instruments are essentially given by the market, and made by offer and
demand (unless very exotic structures are considered). Market prices are in fact used (rather than
computed) by models, in the "reverse-engineering" mode that consists in calibrating the model to
market prices, so that the model is consistent with the market for suitable values of its parameters.
This calibration process is the object of part VI. Once calibrated to the market, the model is
effectively used for Greeking (and/or pricing exotic structures), that is, computing the risk sensitivities
of a position, in order to set up a related hedge, or complementary position required for offsetting
a given undesired risk.
Since the object of these notes is methods, not models, we present most methods on simple models,
like the Black-Scholes model (in general), the Merton model, the Heston model, etc. Of course the
methods themselves are always generic to some degree, hence applicable in a broad range of models
to a broad range of financial instruments. In the Appendix (part VII), we recall the basic facts of
financial theory necessary to understand how a generic contingent claim pricing equation is derived
(equation (150)), in a Markovian risk-neutral primary market model. And, to start with, we shall
review in part II the benchmark models on the main derivative markets (equity, interest rate and
credit), with the related closed pricing formulae (so no computational methods are required in these
models, as far as pricing vanillas is concerned).
2 General Set-Up
We assume throughout that the evolution of a financial market model can be modeled in terms of
stochastic processes defined on a continuous time stochastic basis $(\Omega, \mathbb{F}, \widehat{\mathbb{P}})$,
where $\widehat{\mathbb{P}}$ denotes the objective (or physical, statistical, ...) probability measure.
We can and do assume that the filtration $\mathbb{F}$ satisfies the usual completeness and
right-continuity conditions, and that all semimartingales are càdlàg. Finally, since we are always
in the context of pricing a contingent claim with maturity $T$, we further assume that
$\mathbb{F} = (\mathcal{F}_t)_{t\in[0,T]}$ with $\mathcal{F}_0$ trivial and $\mathcal{F}_T = \mathcal{F}$,
for simplicity. Moreover, we declare that a process on $[0,T]$ (resp. a random variable) has to be
$\mathbb{F}$-adapted (resp. $\mathcal{F}$-measurable), by definition.
We shall typically work under a risk-neutral (RN) measure $\mathbb{P} \sim \widehat{\mathbb{P}}$
(risk-neutral modeling), such that the prices of the primary assets, once discounted and adjusted for
any dividends, are $(\Omega, \mathbb{F}, \mathbb{P})$-local martingales. Recall that, under mild technical
conditions, existence of such a RN measure $\mathbb{P}$ is equivalent to a suitable notion of
no-arbitrage (NFLVR condition [61], see the Appendix). In practical applications, it is convenient to
think of $\mathbb{P}$ as the pricing measure chosen by the market to price a contingent claim. We denote
by $\mathcal{T}_t$ (or simply $\mathcal{T}$, in case $t = 0$) the set of $[t,T]$-valued $\mathbb{F}$-stopping
times, and by $\mathbb{E}$ (resp. $\mathbb{E}_t$) the $\mathbb{P}$-expectation (resp. $\mathbb{P}$-conditional
expectation given $\mathcal{F}_t$) operator.
Note that pricing theory can also be developed in discrete time (see e.g. [71, 102]). The tree
(including MC) computational methods presented in these notes are then directly applicable (in
Markovian set-ups), and they are then trivially exact in the time direction, since no approximation
in time is required in this case. In particular, these methods can be used in static (one-period)
set-ups, such as often encountered in multi-name credit (static copula models etc, see section 7).
As for single-name credit, namely the pricing of defaultable claims with terminal payoffs of the form
$\mathbf{1}_{T<\tau_d}\,\phi(S_T)$ (or $\mathbf{1}_{\tau<\tau_d}\,\phi(S_\tau)$ upon exercise at time $\tau$, in case of
American claims), where $\tau_d$ represents the default-time of a reference entity, it happens that such
defaultable claims can be handled in exactly the same way as default-free ones, provided a suitably
credit-risk adjusted discount factor is used, instead of the usual riskless discount factor (see the
introductory paragraph to the Appendix).
Up to this simple modification, defaultable contingent claims can be treated in exactly the same
way as default-free ones, both from the theoretical and from the practical point of view [27, 28].
This means in particular that all the numerical methods presented in these notes can be used for
pricing and Greeking defaultable claims (or calibrating credit-risk models), provided the credit-risk
adjusted discount factor is used instead of the original default-free discount factor.
Incidentally, note that the original default-free discount factor can itself be interpreted as a default
probability (or killing rate, in stochastic processes terminology, see e.g. [133]).
A typical benchmark of accuracy in computational finance (this varies of course with the application
at hand) is a 1bp ($= 10^{-4}$) error on normalized prices and Greeks, namely prices and Greeks for a
value of the spot scaled to unity.
As for computation times, the benchmark also greatly varies with the application at hand, but as
far as `real-time' option pricing is concerned, `instantaneous' pricing is the target, and more than
half-an-hour computation time (!!) is prohibitive.
Now a resolution within a 1bp normalized error by a finite differences ADI method (the industry
standard today as far as deterministic methods are concerned, see section 13), in space dimension $d$,
typically requires 300 grid points per space dimension (most used numerical schemes are of order 2 of
accuracy in space and 1 in time), hence a related computation time (and storage cost) in $O(300^d)$, i.e.
a computation time ranging from a few milliseconds for $d = 1$ to twenty minutes or so for $d = 3$. This
limits in practice the range of applicability of deterministic pricing methods to problems in space
dimension $\le 3$, unless sophisticated sparse grid or grid refinement techniques (mostly available with
finite element methods) are used to counter Bellman's `curse of dimensionality' (referring to the fact
that in space dimension $d$, the computational cost of numerical integration grows exponentially like
$m_1^d$, where $m_1$ is the number of discretization points in each space direction).
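To make the growth in $m_1^d$ concrete, the following minimal sketch tabulates the number of mesh points of a full tensor grid with 300 points per dimension. The function name and the 8 bytes per stored value (double precision) are illustrative assumptions, not part of the notes.

```python
def grid_points(points_per_dim: int, dim: int) -> int:
    """Total number of space mesh points m = m1**d of a full tensor grid."""
    return points_per_dim ** dim


if __name__ == "__main__":
    for d in range(1, 5):
        m = grid_points(300, d)
        # Assumption: one double-precision value (8 bytes) per mesh point.
        print(f"d={d}: {m:.3e} mesh points, about {8 * m / 1e9:.3g} GB per solution array")
```

The storage of a single solution array already reaches tens of gigabytes for $d = 4$, which is one way to read the practical limit $d \le 3$ stated above.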
Table 1 provides a (crude) comparison of the computational costs of typical MC and deterministic
methods (MC algorithm with time discretization of the underlying factor process, cf. section 26.5,
versus ADI PDE method). A rough conclusion (note that we don't detail the constants involved in
our computational cost estimates) is that deterministic methods are more efficient (but often harder
to implement!) in space dimension $\le 3$, otherwise MC methods (if applicable) are the best.
PDE    $O(nm)$    $O(n^{-1} + m^{-2/d})$    $O(m)$
Table 1: Compared computational costs of Stochastic methods ($m$ simulation runs) versus
Deterministic methods ($m_1$ mesh points per space dimension, i.e. $m = m_1^d$ space mesh points), in space
dimension $d$; $n$ is the number of discretization points in time.
This leads to the following dictionary (Table 2) of the method to use, depending on the space-
dimension and the nature of a pricing problem.
In the upper left corner of this Table, the choice between PDE and MC should depend on the
relative interest of performance level with respect to implementation cost and risk, generally higher
with deterministic methods.
In the lower right corner of the Table, FBSDE (Forward-Backward Stochastic Differential Equations)
methods refers to specific MC methods which are available for control problems, cf. the introductory
paragraph to part V. These methods are not treated in these notes.
Note that the recommendations for American Problems are in fact valid for more general Control
Problems, e.g. for pricing game contingent claims, like convertible bonds (see [25, 28]), or for pricing
problems in (classes of) model(s) in which the spot volatility is stochastic, but known to remain in
a range $(\underline{\sigma}, \overline{\sigma})$ for sure, where $\underline{\sigma}$ and $\overline{\sigma}$ are given
constants or functions of $(t, S)$ (Uncertain Volatility model [13]), etc.
An interesting subject hardly mentioned in these notes is parallelism. The reason why we do not
consider this issue (besides the vastness of this subject per se) is that parallelism is most efficient
on dedicated machines or networks (see Figure 5(b) and Table 3), and working with such machines
or networks is not the main trend in the financial industry today.
4 Bibliographic Guidelines
Bibliographic references are given along the text and gathered at the end of this document. Various
aspects of computational finance are introduced in [82] (see in particular Chapter 16, Numerical
procedures). Here are some more comprehensive references, among others:
• On market models (part II): [76, 40, 131, 119, 31, 138];
• On deterministic pricing methods (parts III and IV): [140, 146, 118, 1, 12, 100, 102, 147, 23];
• On stochastic simulation methods (part V): [78, 104, 33, 96];
• On calibration of financial models (part VI): [48, 67, 124];
• On financial theory (part VII): [61, 119].
www-rocq.inria.fr/mathfi/Premia/index.html
screpey.free.fr
www.mathfinance.de/frontoffice.html
quantlib.org
www.nr.com
www.gro.creditlyonnais.fr
www.iro.umontreal.ca/~lecuyer
www.optioncity.net
www.defaultrisk.com
Part II
Market Models
In this part we gather basic facts regarding the benchmark models on the main derivatives markets:
equity-derivatives markets (Black-Scholes model and simple extensions), interest-rate derivatives
markets (BGM model) and credit derivatives markets (one factor Gaussian copula model).
where:
• $\widehat{W}$ denotes a standard Brownian motion under the objective (or physical, statistical, ...)
probability $\widehat{\mathbb{P}}$,
• the volatility parameter, $\sigma$, is a positive constant, and
• the physical drift, $\mu$, is a process satisfying mild regularity and growth conditions.
We assume for simplicity that the riskless interest rate $r$ in the economy and the dividend yield $q$ on
$S$ are constant¹, and that the market composed of the spot $S$ and of the savings account $B_t = e^{rt}$
is arbitrage-free and frictionless.
Then one can show that European vanilla call options on $S$ have a unique arbitrage price process $\Pi$
in this model, given by, for $t \le T$:
with $\tau = T - t$, and where $\mathbb{E}$ refers to the expectation operator under the risk-neutral
probability $\mathbb{P}$, such that
with $\kappa = r - q$, $b = \kappa - \frac{\sigma^2}{2}$, for a standard $\mathbb{P}$-Brownian motion $W$. Note that the
second equality in (2) follows from the fact that $S$ is a $\mathbb{P}$-Markov process.
In particular the BS price does not depend on the physical drift $\mu$, even though $\mu$ may be responsible
for fat tails or skewness in the physical spot returns (under $\widehat{\mathbb{P}}$). This is because option risks are
entirely market diversifiable in this model. The BS model is complete.
Indeed one can show that at any time $t \in [0, T]$ the price $\Pi_t(T, K)$ in (2) is also the least initial wealth
of a dynamic replicating strategy for the option (self-financing strategy in cash and spot with wealth
process bounded from below and equal to the payoff of the option at maturity, see the Appendix).
Such a replicating strategy with initial wealth $\Pi^{bs}(T, K; 0, S_0, \sigma)$ (starting from time 0) consists in
holding a number of units of spot at time $t$ equal to
1 In fact the interpretation of r and q depends on the nature of the underlyer S: stock, interest rate, exchange rate,
etc.
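Direct computation shows that the BS call pricing function $\Pi^{bs}$ and delta function $\Delta^{bs} = \partial_S \Pi^{bs}$ are
given by the so-called Black-Scholes formulae:
$$d_\pm = \frac{\ln\left(\frac{S}{K}\right) + \kappa\tau}{\sigma\sqrt{\tau}} \pm \frac{1}{2}\sigma\sqrt{\tau} \tag{5}$$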
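As an illustration of (5), here is a minimal sketch of a Black-Scholes call pricer. Since the pricing formula (4) itself is not reproduced above, the sketch assumes the standard expressions with continuous dividend yield, $\Pi^{bs} = S e^{-q\tau} N(d_+) - K e^{-r\tau} N(d_-)$ and $\Delta^{bs} = e^{-q\tau} N(d_+)$; the function name `bs_call` is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard Gaussian cumulative distribution function


def bs_call(S, K, tau, r, q, sigma):
    """Return (price, delta) of a European call, with d_+/- as in (5).

    Assumption: standard Black-Scholes formula with dividend yield q.
    """
    kappa = r - q
    vol = sigma * sqrt(tau)
    d_plus = (log(S / K) + kappa * tau) / vol + 0.5 * vol
    d_minus = d_plus - vol
    price = S * exp(-q * tau) * N(d_plus) - K * exp(-r * tau) * N(d_minus)
    delta = exp(-q * tau) * N(d_plus)
    return price, delta


if __name__ == "__main__":
    print(bs_call(S=100.0, K=100.0, tau=1.0, r=0.05, q=0.0, sigma=0.2))
```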
These results admit straightforward extensions to the case where $r$, $q$ and $\sigma$ are Borel bounded
functions of time: simply replace $r\tau$, $q\tau$ and $\sigma\sqrt{\tau}$ by $\int_t^T r(u)\,du$, $\int_t^T q(u)\,du$ and $\Sigma$,
with $\Sigma^2 = \int_t^T \sigma^2(u)\,du$, in (4)-(5).
As is well known, the Black-Scholes model is strongly misspecified, which leads one to consider natural
extensions of Black-Scholes, adding stochastic volatility (Heston model) and/or jumps (Merton and
Bates models) to the picture.
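$$\begin{aligned} dv_t &= -\lambda(v_t - \bar{v})\,dt + \eta\sqrt{v_t}\,dZ_t \\ dS_t &= S_t\left(\kappa\,dt + \sqrt{v_t}\,dW_t\right), \end{aligned} \tag{6}$$
where:
• $d\langle W, Z\rangle_t = \rho\,dt$,
• $\lambda$ is the speed of mean-reversion of the instantaneous variance,
• $\bar{v}$ is the long-term variance mean,
• $\frac{\eta}{2\sqrt{v}}$ is the volatility of the (instantaneous) volatility.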
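$$\frac{dS_t}{S_{t-}} = \kappa\,dt + \sigma\,dW_t + d\Big(\sum_{l=1}^{N_t} J_l - \lambda\bar{J}t\Big) \tag{7}$$
$$\mathcal{A}_S v = (\kappa - \lambda\bar{J})\,S\,\partial_S v + \frac{1}{2}\sigma^2 S^2 \partial^2_{S^2} v + \lambda\left[\mathbb{E}v((1 + J_1)S) - v(S)\right], \tag{8}$$
where:
• $(N_t)_{t\ge 0}$ is a Poisson process with deterministic jump intensity $\lambda$;
• the $J_l$ are the jump size variables, such that $j_l := \ln(1 + J_l) \sim \mathcal{N}(\alpha, \beta)$, so
$$\bar{j} := \mathbb{E}j_1 = \alpha, \quad \mathbb{E}j_1^2 = \alpha^2 + \beta, \quad \bar{J} := \mathbb{E}J_1 = e^{\alpha + \frac{\beta}{2}} - 1 ;$$
Denoting further $a = b - \lambda\bar{J}$, we have in particular:
$$\mathcal{A}_S \ln(S) = \kappa - \lambda\bar{J} - \frac{1}{2}\sigma^2 + \lambda\bar{j} = a + \lambda\bar{j} . \tag{9}$$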
By application of the Itô-Lévy formula (special case when Y is trivial in the general formula (147)),
the model rewrites in terms of the log-spot $X_t = \ln(S_t)$ as follows:
$$dX_t = (a + \lambda\bar{j})\,dt + \sigma\,dW_t + d\Big(\sum_{l=1}^{N_t} j_l - \lambda\bar{j}t\Big) = a\,dt + \sigma\,dW_t + d\Big(\sum_{l=1}^{N_t} j_l\Big) , \tag{10}$$
whence $X_t = x_0 + at + \sigma W_t + \sum_{l=1}^{N_t} j_l$, and
$$S_t = S_0\, e^{at + \sigma W_t} \prod_{l=1}^{N_t} (1 + J_l) \tag{11}$$
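The explicit representation (11) allows exact sampling of the spot at a fixed date in the Merton model. The sketch below is a minimal illustration under the parametrization above ($\beta$ is the variance of the log-jumps, so their standard deviation is $\sqrt{\beta}$); the helper names and the Poisson sampler by counting exponential inter-arrival times are assumptions of this sketch, not a method taken from the notes.

```python
import math
import random


def sample_poisson(rng, mean):
    """Poisson draw: count unit-rate exponential arrivals falling in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t <= mean:
        n += 1
        t += rng.expovariate(1.0)
    return n


def merton_terminal_spot(rng, S0, T, kappa, sigma, lam, alpha, beta):
    """One exact draw of S_T from (11), with a = kappa - sigma^2/2 - lam * J_bar."""
    J_bar = math.exp(alpha + 0.5 * beta) - 1.0
    a = kappa - 0.5 * sigma ** 2 - lam * J_bar
    n_jumps = sample_poisson(rng, lam * T)
    log_jumps = sum(rng.gauss(alpha, math.sqrt(beta)) for _ in range(n_jumps))
    W_T = math.sqrt(T) * rng.gauss(0.0, 1.0)
    return S0 * math.exp(a * T + sigma * W_T + log_jumps)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [merton_terminal_spot(rng, 100.0, 1.0, 0.03, 0.2, 0.5, -0.1, 0.04)
             for _ in range(100_000)]
    # Sanity check: the compensated jumps make E[S_T] = S0 * exp(kappa * T).
    print(sum(draws) / len(draws), 100.0 * math.exp(0.03))
```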
Note that from the practical point of view, the Heston and the Merton model are still notoriously
misspecified, like Black-Scholes. The Bates model may often be considered as a flexible enough and
robust alternative.
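Proposition 5.1 We have in the risk-neutral BS, Heston, Merton and Bates models, respectively:
$$\begin{aligned}
\Phi^{bs}_T(u) &= \Phi^{d}_T(u) \exp\left(-\tfrac{1}{2}u(i + u)\sigma^2 T\right) \\
\Phi^{he}_T(u) &= \Phi^{d}_T(u) \exp\left[C(u, T)\bar{v} + D(u, T)v_0\right] \\
\Phi^{me}_T(u) &= \Phi^{bs}_T(u)\Phi^{j}_T(u) = \exp\left(\lambda T\big(e^{iu\alpha - \frac{u^2\beta}{2}} - 1\big) - \frac{u^2\sigma^2 T}{2} + iu\big(x_0 + [b - \lambda e^{\alpha + \frac{\beta}{2}} + \lambda]T\big)\right) \\
\Phi^{ba}_T(u) &= \Phi^{he}_T(u)\Phi^{j}_T(u) ,
\end{aligned}$$
with
$$\Phi^{d}_t(u) = \exp\left(iu(x_0 + \kappa t)\right) , \quad \Phi^{j}_t(u) = \exp\left(-\lambda t\left[iu\big(e^{\alpha + \frac{\beta}{2}} - 1\big) - \big(e^{iu\alpha - \frac{u^2\beta}{2}} - 1\big)\right]\right)$$
$$C(u, t) = \lambda\left[t\,y_- - \frac{2}{\eta^2}\ln\Big(\frac{1 - g e^{-pt}}{1 - g}\Big)\right] , \quad D(u, t) = \frac{1 - e^{-pt}}{1 - g e^{-pt}}\,y_- \tag{13}$$
$$p = \sqrt{y^2 - 4wz} , \quad y_\pm = \frac{y \pm p}{\eta^2} , \quad g = \frac{y_-}{y_+}$$
$$w = -\tfrac{1}{2}u(i + u) , \quad y = \lambda - \rho\eta iu , \quad z = \frac{\eta^2}{2}$$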
Remarks 5.1 For λ = ρ = 0 and η → 0+, we get y = 0, y± = ±p
η2 , p= −2wη, 1−g = 2, D(u, t) ∼
pt −p2 t 1 2
1−g y− = 2η 2 = wt, and C(u, t)v + D(u, t)v reduces to − u(i + u)σ t, so
2 Φhe
t (u) reduces to Φbs
t (u),
as expected.
Likewise, it is easy to check that Φme
t (u) reduces to Φbs
t (u) for α = β = 0.
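The formulas of Proposition 5.1 and the reductions of Remarks 5.1 can be checked numerically. The following is a minimal sketch using complex arithmetic from the standard library; the function names are hypothetical, `v0` denotes the initial variance and `v_bar` the long-term variance mean, and the principal branches of the complex square root and logarithm are assumed to be adequate for the parameters used.

```python
import cmath


def phi_d(u, T, x0, kappa):
    """Deterministic factor exp(iu (x0 + kappa T))."""
    return cmath.exp(1j * u * (x0 + kappa * T))


def phi_bs(u, T, x0, kappa, sigma):
    return phi_d(u, T, x0, kappa) * cmath.exp(-0.5 * u * (1j + u) * sigma ** 2 * T)


def phi_jump(u, T, lam, alpha, beta):
    comp = 1j * u * (cmath.exp(alpha + 0.5 * beta) - 1.0)
    return cmath.exp(-lam * T * (comp - (cmath.exp(1j * u * alpha - 0.5 * u * u * beta) - 1.0)))


def phi_merton(u, T, x0, kappa, sigma, lam, alpha, beta):
    return phi_bs(u, T, x0, kappa, sigma) * phi_jump(u, T, lam, alpha, beta)


def phi_heston(u, T, x0, kappa, v0, v_bar, lam, eta, rho):
    """Heston characteristic function with C and D as in (13)."""
    w = -0.5 * u * (1j + u)
    y = lam - rho * eta * 1j * u
    z = 0.5 * eta ** 2
    p = cmath.sqrt(y * y - 4.0 * w * z)
    y_plus, y_minus = (y + p) / eta ** 2, (y - p) / eta ** 2
    g = y_minus / y_plus
    e = cmath.exp(-p * T)
    C = lam * (T * y_minus - (2.0 / eta ** 2) * cmath.log((1.0 - g * e) / (1.0 - g)))
    D = (1.0 - e) / (1.0 - g * e) * y_minus
    return phi_d(u, T, x0, kappa) * cmath.exp(C * v_bar + D * v0)


if __name__ == "__main__":
    x0, kappa, sigma, T, u = cmath.log(100.0).real, 0.03, 0.2, 1.0, 1.3
    # Nearly deterministic variance (small eta, v0 = v_bar = sigma^2): close to BS.
    print(phi_heston(u, T, x0, kappa, sigma ** 2, sigma ** 2, 1.5, 1e-3, 0.0))
    print(phi_bs(u, T, x0, kappa, sigma))
```

A further consistency check is the martingale property: evaluating any of these functions at $u = -i$ should return $S_0 e^{\kappa T}$.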
Proof of Proposition 5.1 (see e.g. [76]). In the case of the Heston model, we introduce $F_t := S_t e^{-\kappa t}$,
so $F_0 = S_0$, and we compute $\Phi^{he}_T(u)$ as
with $\Phi(t, F) := \mathbb{E}_t F_T^{ui}$ on $\{F_t = F\}$. In particular $\Phi(T, F) = F^{ui}$, which leads us to seek an explicit
solution for $\Phi$ in the form
for suitable coefficients $C$ and $D$. Note that $\Phi(t, F_t)$ is a (Doob) martingale. We thus get by
application of the Itô formula (diffusive case, so no jumps and Y trivial in the general formula
(147)): $0 = (\partial_t + \mathcal{A}_F)\Phi$ on $\{t < T\}$, with
$$\mathcal{A}_F = \frac{1}{2}vF^2\partial^2_{F^2} + \frac{1}{2}\eta^2 v\,\partial^2_{v^2} + \eta v\rho F\,\partial^2_{Fv} - \lambda(v - \bar{v})\partial_v .$$
So
$$0 = \bar{v}\,\partial_t C + v\,\partial_t D + \frac{1}{2}v(ui)(ui - 1) + \frac{1}{2}\eta^2 vD^2 + \eta\rho v(ui)D - \lambda(v - \bar{v})D$$
on $\{t < T\}$. Since this must be true for any $(v, \bar{v})$, it implies that we have for $t < T$:
$$\begin{aligned} -\partial_t C &= \lambda D \\ -\partial_t D &= w + zD^2 - yD = z(D - y_+)(D - y_-) . \end{aligned} \tag{14}$$
We recognize a standard system of Riccati equations, with solution given by (13), as one can easily
check by inspection. This proves the result in the case of the Heston model.
To prove the result in the case of the Bates model, we introduce the process
$$L_t = \prod_{l=1}^{N_t} (1 + J_l)\,e^{-\lambda\bar{J}t} ,$$
$$dL_t = L_{t-}\,d\Big(\sum_{l=1}^{N_t} J_l - \lambda\bar{J}t\Big) , \quad \mathcal{A}_L L^{ui} = -\lambda L\bar{J}\,\partial_L L^{ui} + \lambda L^{ui}\,\mathbb{E}\left[(1 + J_1)^{ui} - 1\right] .$$
We thus have by the Itô formula, with $\doteq$ standing for equality up to a martingale term:
$$dL^{ui}_t \doteq L^{ui}_{t-}\left(-\bar{J}ui + \mathbb{E}\left[(1 + J_1)^{ui} - 1\right]\right)\lambda\,dt ,$$
whence
$$\mathbb{E}L^{ui}_t = L^{ui}_0 e^{\delta\lambda t} , \text{ with } \delta = -\bar{J}ui + \mathbb{E}\left[(1 + J_1)^{ui} - 1\right] = -ui\big(e^{\alpha + \frac{\beta}{2}} - 1\big) + e^{ui\alpha - \frac{u^2\beta}{2}} - 1 ,$$
that is $\mathbb{E}L^{ui}_t = \Phi^{j}_t(u)$.
Finally, the results in the case of the BS and of the Merton models follow by passage to the limit of
the previous ones, by Remark 5.1. □
6 BGM Model
As we will see in section 6.1.1, bond options can be handled by suitable extensions of the
Black-Scholes formulae, the so-called Black formulae.
Other major vanilla interest rate derivatives, namely caps/floors and swaptions, have long been quoted
in terms of Black implied volatility in the market. The rigorous justification for such market practice
was derived in 1995 by Brace, Gatarek and Musiela [36] for caps and floors (`BGM Model', also called
`Libor Market Model' LMM; see also Miltersen et al. [116]), and by Jamshidian [90] for swaptions
(Libor Swap Model LSM).
The BGM and the LSM are not consistent in theory but they do not differ much in practice [35].
Today the BGM model has emerged as the market standard for interest rate derivatives. There are
three reasons for this:
• the related computations are easier in the LMM than in the LSM,
• it is easier to express swap rates in terms of LIBOR rates than the other way round, and
• good approximate formulas are available for swaptions in the LMM.
We shall therefore focus on the BGM model (LMM) in the sequel.
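$$\nu_t := \frac{B_0\,\widetilde{B}_t}{\widetilde{B}_0\,B_t} , \quad t \in [0, T] ,$$
let $\widetilde{\mathbb{P}}$ be the corresponding valuation measure, defined on $(\Omega, \mathcal{F})$ (with $\mathcal{F} = \mathcal{F}_T$ as usual) by
$$\frac{d\widetilde{\mathbb{P}}}{d\mathbb{P}} = \nu_T , \quad \mathbb{P}\text{-a.s.} \tag{15}$$
From the abstract Bayes rule and the fact that $\nu$ is a $\mathbb{P}$-martingale, it follows, for $t \in [0, T]$:
$$\widetilde{B}_t\,\mathbb{E}_{\widetilde{\mathbb{P}}}\big(\widetilde{B}_T^{-1}\xi \mid \mathcal{F}_t\big) = \widetilde{B}_t\,\frac{\mathbb{E}_{\mathbb{P}}\big(\widetilde{B}_T^{-1}\xi\nu_T \mid \mathcal{F}_t\big)}{\mathbb{E}_{\mathbb{P}}(\nu_T \mid \mathcal{F}_t)} = B_t\,\mathbb{E}_{\mathbb{P}}\big(B_T^{-1}\xi \mid \mathcal{F}_t\big) = \Pi_t .$$
So
$$\Pi_t = \widetilde{B}_t\,\mathbb{E}_{\widetilde{\mathbb{P}}}\big(\widetilde{B}_T^{-1}\xi \mid \mathcal{F}_t\big) , \quad t \in [0, T] \tag{16}$$
and $\frac{\Pi}{\widetilde{B}}$ is a martingale under $\widetilde{\mathbb{P}}$.
Also note that $\nu_t$ is the $\mathcal{F}_t$-measurable Radon-Nikodym density $\frac{d\widetilde{\mathbb{P}}}{d\mathbb{P}}\big|_{\mathcal{F}_t}$ of $\widetilde{\mathbb{P}}$ with respect to $\mathbb{P}$ on $\mathcal{F}_t$,
for any $t \in [0, T]$. Indeed, for any $\mathcal{F}_t$-measurable and bounded random variable $X$, we have: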
Black formulae extend the Black-Scholes formulae (4)-(5) (general case with time-dependent
volatility $\sigma(t)$ and dividend yield $q(t)$) to the case of stochastic interest rates $r$. Assuming absence of
arbitrage, the primary asset $S$ (or the value of $S$ adjusted for interest rates and dividend yields, if
any) must be a $\mathbb{P}$-local martingale under any risk-neutral measure (see the Appendix).
In case where the information filtration is generated by a unidimensional Brownian motion, this
... $e^{-\int_t^T q(u)\,du}$ ...
with
$$d_\pm = \frac{\ln(F/K)}{\Sigma} \pm \frac{1}{2}\Sigma \quad\text{where}\quad \Sigma^2 = \int_t^T \sigma^2(u)\,du \tag{20}$$
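$$B_t(T) = \left[1 + \alpha(T, U)L_t(T, U)\right] B_t(U) , \quad\text{or}\quad L_t(T, U) = \frac{1}{\alpha(T, U)}\,\frac{B_t(T) - B_t(U)}{B_t(U)} ,$$
where $\alpha(T, U) = U - T$ and $B_t(T)$ is the price at time $t$ of a zero coupon bond which pays 1 at $T$.
We consider $n$ LIBOR rates
on consecutive time periods $[t_i, t_{i+1}]$. Let $\alpha_i = t_{i+1} - t_i$, $B^i_t = B_t(t_i)$, $\mathbb{E}_i = \mathbb{E}_{\mathbb{P}^i}$. So for $t \le t_i$:
$$\alpha_i L^i_t = \frac{B^i_t - B^{i+1}_t}{B^{i+1}_t} , \quad B^{i+1}_t = \frac{1}{1 + \alpha_i L^i_t}\,B^i_t , \tag{21}$$
$$\frac{B^{l+1}_t}{B^i_t} = \prod_{k=i}^{l} \frac{1}{1 + \alpha_k L^k_t} \tag{22}$$
$$dL^i_t = \sigma_i(t)L^i_t\,dW^{i+1}_t , \quad\text{or}\quad L^i_t = L^i_0 \exp\left(\int_0^t \sigma_i(s)\,dW^{i+1}_s - \frac{1}{2}\int_0^t \sigma_i^2(s)\,ds\right) \tag{23}$$
for some $\mathbb{P}^{i+1}$-Brownian motion $W^{i+1}$ and a deterministic volatility function $\sigma_i(t)$ (null after $t_i$).
The process $L^i$ is thus log-normal under $\mathbb{P}^{i+1}$, and the variance of $\log L^i_t$ is equal to $\int_0^t \sigma_i^2(s)\,ds$.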
For the specification of the model to be complete, we should specify the correlation structure of $L$.
This is deferred to section 6.4. Indeed, correlation does not affect the pricing of caps and floors to
be considered in the next section.
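$$\frac{C^i_t}{B^{i+1}_t} = \alpha_i\,\mathbb{E}_{i+1}\left[\max\{L^i_{t_i} - K, 0\} \mid \mathcal{F}_t\right] ,$$
with
$$d^i_1 = \frac{\ln\big(\frac{L^i_0}{K}\big) + \frac{\Sigma_i^2}{2}}{\Sigma_i} , \quad d^i_2 = d^i_1 - \Sigma_i , \quad\text{and}\quad \Sigma_i^2 = \int_0^{t_i} \sigma_i^2(s)\,ds$$
We thus get the price of the cap (portfolio of caplets) with maturity $t_1$ and tenor $t_1, \dots, t_{n+1}$ as
$$C_0 = \sum_{i=1}^{n} C^i_0 = \sum_{i=1}^{n} \alpha_i B^{i+1}_0 \left[L^i_0 N(d^i_1) - K N(d^i_2)\right]$$
and the analogous formula for floors. So because of the additively separable structure of the related
payoffs, the prices of caps and floors only depend on the marginals of the Libor rates $L^i$ at the $t_i$.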
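The cap formula above lends itself to a direct implementation. The sketch below is an illustration under two labeled assumptions: each volatility $\sigma_i$ is constant on $[0, t_i]$, so that $\Sigma_i = \sigma_i\sqrt{t_i}$, and the time-0 bond prices $B^{i+1}_0$ are rebuilt from the initial Libor rates via (21), starting from a given $B^1_0$ (argument `B_first`). All function and argument names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard Gaussian cumulative distribution function


def caplet_price(B_pay, L0, K, alpha, Sigma):
    """Black caplet price alpha * B_0^{i+1} * [L_0 N(d1) - K N(d2)]."""
    d1 = (log(L0 / K) + 0.5 * Sigma ** 2) / Sigma
    d2 = d1 - Sigma
    return alpha * B_pay * (L0 * N(d1) - K * N(d2))


def cap_price(tenor, L0, sigma, K, B_first=1.0):
    """Cap as a sum of caplets; tenor = [t_1, ..., t_{n+1}], one L0 and sigma per period."""
    price, bond = 0.0, B_first                # bond holds B_0^i, starting at B_0^1
    for i in range(len(L0)):
        alpha = tenor[i + 1] - tenor[i]
        bond /= 1.0 + alpha * L0[i]           # B_0^{i+1} from (21)
        Sigma = sigma[i] * sqrt(tenor[i])     # assumption: constant volatility on [0, t_i]
        price += caplet_price(bond, L0[i], K, alpha, Sigma)
    return price


if __name__ == "__main__":
    tenor = [0.5, 1.0, 1.5, 2.0, 2.5]
    print(cap_price(tenor, [0.05] * 4, [0.15] * 4, K=0.05))
```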
6.4 Adding Correlation
More complex derivatives, like swaptions, are also affected by the correlation structure of $L$.
We are thus going to define a correlation structure between the $L^i$ by expressing their dynamics
(until the $t_i$) under the terminal measure $\mathbb{P}^{n+1}$.
Note that by definition of the $\mathbb{P}^i$, we have for $t \le t_i$ (cf. (17)):
where $s_i(t) = (0, \dots, 0, \sigma_i(t), 0, \dots, 0)$ is a row volatility vector and $\mathbf{W}^{i+1} = (W^{j,i+1})_{1\le j\le n}$ is a
(correlated) $n$-dimensional $\mathbb{P}^{i+1}$-Brownian motion.
Let $\rho = (\rho_{i,l})_{1\le i,l\le n}$ denote the correlation matrix of $\mathbf{W}^{n+1}$, so
$$d\langle W^{i,n+1}, W^{l,n+1}\rangle_t = \rho_{i,l}\,dt .$$
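We proceed by backward induction. We first want to specify $\mathbf{W}^n_t$ as $\mathbf{W}^{n+1}_t - \rho\int_0^t h_u^T\,du$, for a properly
chosen row vector process $h$. In view of Girsanov theorem, it is enough, for this to be a $\mathbb{P}^n$-Brownian
$$\begin{aligned} dL^{n-1}_t &= s_{n-1}(t)L^{n-1}_t\left(d\mathbf{W}^{n+1}_t - \frac{\alpha_n L^n_t\,\rho\,s_n^T(t)}{1 + \alpha_n L^n_t}\,dt\right) \\ &= -\frac{\alpha_n L^n_t\,\sigma_n(t)\rho_{n,n-1}}{1 + \alpha_n L^n_t}\,\sigma_{n-1}(t)L^{n-1}_t\,dt + \sigma_{n-1}(t)L^{n-1}_t\,dW^{n-1,n+1}_t , \end{aligned}$$
and then likewise, for any $i = 1, \dots, n$:
$$dL^i_t = -\sum_{l=i+1}^{n} \frac{\alpha_l L^l_t\,\sigma_l(t)\rho_{l,i}}{1 + \alpha_l L^l_t}\,\sigma_i(t)L^i_t\,dt + \sigma_i(t)L^i_t\,dW^{i,n+1}_t \tag{26}$$
Note that only $L^n$ is a $\mathbb{P}^{n+1}$-martingale. For $i \le n - 1$, $L^i$ has a $\mathbb{P}^{n+1}$-drift depending on the $L^l$
for $l > i$.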
6.4.1 Correlation Structures
The simplest way to fix $\rho$ is to set it as a historical estimate of the correlation matrix of the Libor
rates (historical correlation structure). But such estimates are typically noisy and hardly usable in
practice.
Various parametric forms for $\rho$ may be used instead, and calibrated to market prices of
correlation-sensitive instruments, like the swaptions to be considered in the next section. For instance, one may
choose $\rho_{i,l} = \exp(-\beta|i - l|)$, or (Rebonato)
6.5 Swaptions
Recall that an interest-rate swap with tenor $t_1, \dots, t_{n+1}$ is a contract with cash flows $\alpha_i(L^i_{t_i} - K)$
at the $t_{i+1}$, for $i = 1, \dots, n$ (from the point of view of the party receiving the floating payments).
By (18), the value of the swap at time $t \le t_1$ is thus given by
$$\sum_{i=1}^{n} \alpha_i B^{i+1}_t\,\mathbb{E}_{i+1}\left[(L^i_{t_i} - K) \mid \mathcal{F}_t\right] = \sum_{i=1}^{n} \alpha_i B^{i+1}_t (L^i_t - K) ,$$
by the $\mathbb{P}^{i+1}$-martingale property of $L^i$, $i = 1, \dots, n$. A swaption (or option to enter the swap at the
maturity date $t_1$) with tenor $t_1, \dots, t_{n+1}$ and strike $K$ is thus tantamount to a product paying at $t_1$
the positive part of
$$\sum_{i=1}^{n} B^{i+1}_{t_1}\alpha_i (L^i_{t_1} - K) ,$$
that is
$$\sum_{i=1}^{n} B^{i+1}_{t_1}\alpha_i\,(S_{t_1} - K)^+ ,$$
where the swap rate $S$ is given by
$$S_t = \frac{B^1_t - B^{n+1}_t}{\sum_{i=1}^{n} \alpha_i B^{i+1}_t}$$
(hence $\sum_{i=1}^{n} B^{i+1}_{t_1}\alpha_i L^i_{t_1} = B^1_{t_1} - B^{n+1}_{t_1} = \sum_{i=1}^{n} B^{i+1}_{t_1}\alpha_i S_{t_1}$, by (21)). Note that $S$ is a martingale
under the pricing measure $\mathbb{P}^*$ corresponding to the numeraire $\sum_{i=1}^{n} \alpha_i B^{i+1}_t$.
The LSM model mentioned in the introductory paragraph to this section consists precisely in
modeling the swap rate $S$ as a $\mathbb{P}^*$-log-normal process with time-deterministic volatility, hence an
exact Black formula for swaptions in the LSM.
The swap rate $S$ is not $\mathbb{P}^*$-log-normal in the BGM model, yet it is not far from $\mathbb{P}^*$-log-normal with
integrated squared variance $\Sigma^2$ given by the so-called Rebonato's formula (see e.g. [40, p. 248]):
$$\Sigma^2 = \frac{1}{S_0^2} \sum_{i,l=1}^{n} w^i_0 w^l_0 L^i_0 L^l_0\,\rho_{i,l} \int_0^{t_1} \sigma_i(t)\sigma_l(t)\,dt , \tag{27}$$
with weights $w^l$ proportional to the $\alpha_l B^{l+1}_{t_1} = \alpha_l \prod_{k=1}^{l} \frac{1}{1 + \alpha_k L^k_{t_1}}$.
We thus have the following approximate Black formula for swaptions in the BGM model (at time 0):
$$P_0 = \sum_{i=1}^{n} \alpha_i B^{i+1}_0 \left[S_0 N(d_1) - K N(d_2)\right] ,$$
with
$$d_1 = \frac{\ln\big(\frac{S_0}{K}\big) + \frac{\Sigma^2}{2}}{\Sigma} , \quad d_2 = d_1 - \Sigma$$
where $\Sigma$ is given by (27).
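The approximate swaption formula can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the volatilities $\sigma_i$ are taken constant on $[0, t_1]$, the weights are frozen at time 0 and normalized as $w^i_0 = \alpha_i B^{i+1}_0 / \sum_l \alpha_l B^{l+1}_0$, and the bond prices are rebuilt from the initial Libor rates via (21) from a given $B^1_0$ (argument `B_first`). All names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard Gaussian cumulative distribution function


def rebonato_swaption(tenor, L0, sigma, rho, K, B_first=1.0):
    """Return (price, S0, Sigma) from Rebonato's formula (27) and the Black formula."""
    n = len(L0)
    alpha = [tenor[i + 1] - tenor[i] for i in range(n)]
    bonds, bond = [], B_first
    for i in range(n):
        bond /= 1.0 + alpha[i] * L0[i]
        bonds.append(bond)                               # B_0^{i+1}
    annuity = sum(a * b for a, b in zip(alpha, bonds))   # numeraire at time 0
    S0 = (B_first - bonds[-1]) / annuity                 # initial swap rate
    w = [a * b / annuity for a, b in zip(alpha, bonds)]  # assumption: frozen weights
    t1 = tenor[0]
    var = sum(w[i] * w[l] * L0[i] * L0[l] * rho[i][l] * sigma[i] * sigma[l] * t1
              for i in range(n) for l in range(n)) / S0 ** 2
    Sigma = sqrt(var)
    d1 = (log(S0 / K) + 0.5 * var) / Sigma
    d2 = d1 - Sigma
    return annuity * (S0 * N(d1) - K * N(d2)), S0, Sigma


if __name__ == "__main__":
    n = 3
    rho = [[1.0] * n for _ in range(n)]
    print(rebonato_swaption([1.0, 1.5, 2.0, 2.5], [0.04, 0.05, 0.06], [0.2] * n, rho, K=0.05))
```

With perfectly correlated rates and a common volatility $\sigma$, the sketch returns $\Sigma = \sigma\sqrt{t_1}$, since $S_0 = \sum_i w^i_0 L^i_0$ under the chosen normalization.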
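$$\begin{aligned} C_0 &= B^{n+1}_0\,\mathbb{E}_{n+1}\left[\sum_{i=1}^{n} \frac{\alpha_i (L^i_{t_i} - K)^+}{B^{n+1}_{t_{i+1}}}\right] \\ &= B^{n+1}_0\,\mathbb{E}_{n+1}\left[\sum_{i=1}^{n} \alpha_i (L^i_{t_i} - K)^+ \prod_{l=i+1}^{n} (1 + \alpha_l L^l_{t_{i+1}})\right] , \end{aligned}$$
by (22). We thus see that to properly discount each cash-flow $\alpha_i (L^i_{t_i} - K)^+$ under $\mathbb{P}^{n+1}$, we need
to know the values of
$$L^{i+1}_{t_{i+1}} , L^{i+2}_{t_{i+1}} , \dots , L^{n}_{t_{i+1}} ,$$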
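where $V = \Lambda G \sim \mathcal{N}_n(0, \rho)$, with $G = (\varepsilon_1, \dots, \varepsilon_n)^T$ standard Gaussian and $\rho = \Lambda\Lambda^T$. For example, in
case $n = 2$:
$$d\mathbf{W}^3 = \begin{pmatrix} dW^{1,3} \\ dW^{2,3} \end{pmatrix} , \quad d\left\langle \begin{pmatrix} W^{1,3} \\ W^{2,3} \end{pmatrix} \right\rangle_t = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} dt = \Lambda\Lambda^T dt , \quad\text{with}\quad \Lambda = \begin{pmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^2} \end{pmatrix} .$$
In general, the computation goes like this:
t      0        t_1          t_2          ...    t_n
L^1    L^1_0    L^1_{t_1}
L^2    L^2_0    L^2_{t_1}    L^2_{t_2}
L^3    L^3_0    L^3_{t_1}    L^3_{t_2}
...    ...      ...
L^n    L^n_0    L^n_{t_1}    L^n_{t_2}    ...    L^n_{t_n}
For instance, assuming for notational simplicity a correlation structure of rank one ($\rho_{i,l} = 1$ and
$\Lambda_{i,l} = \mathbf{1}_{l=1}$ for any $i, l$, so $\rho$ completely disappears from the picture and it is enough to make one
standard univariate Gaussian draw $\varepsilon$ per time step), and with $n = 4$, $\sigma_i = 15\%$, $\alpha_i = t_{i+1} - t_i = 0.5$,
and $L^i_0 = 5\%$ (initial term structure flat at level 5%):
t                             0     t_1          t_2        t_3          t_4
sqrt(t_{l+1} - t_l) ε_l             -0.371379    1.81768    -0.204069    0.512108
L^1                           5%    4.698%
L^2                           5%    4.699%       6.135%
L^3                           5%    4.701%       6.138%     5.918%
L^4                           5%    4.702%       6.142%     5.923%       6.36%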
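The rank-one example can be replayed with a short script. The sketch below is an assumption-laden illustration, not the scheme used to produce the table: it applies a log-Euler step to (26) with the drift frozen at the start of each time step, and feeds in the scaled Gaussian draws listed in the second row. With these choices the $L^4$ row should be matched to the displayed precision; the other rows depend on how the drift is discretized and may differ in the last digit. The function name is hypothetical.

```python
import math


def lmm_terminal_path(L0, sigma, alpha, dt, dW):
    """Log-Euler path of (26) under the terminal measure, rank-one correlation.

    Returns a list of rows; row k holds the rates still alive at t_{k+1},
    i.e. L^{k+1}, ..., L^n observed at t_{k+1}.
    """
    n = len(L0)
    L = list(L0)
    rows = []
    for k, dw in enumerate(dW):
        new_L = list(L)
        for i in range(k, n):  # 0-based index i stands for L^{i+1}
            # Drift of (26) with rho = 1, frozen at the start of the step.
            drift = -sigma * sum(alpha * L[l] * sigma / (1.0 + alpha * L[l])
                                 for l in range(i + 1, n))
            new_L[i] = L[i] * math.exp((drift - 0.5 * sigma ** 2) * dt + sigma * dw)
        L = new_L
        rows.append(L[k:])
    return rows


if __name__ == "__main__":
    draws = [-0.371379, 1.81768, -0.204069, 0.512108]  # sqrt(dt) * epsilon per step
    for k, row in enumerate(lmm_terminal_path([0.05] * 4, 0.15, 0.5, 0.5, draws), start=1):
        print(f"t_{k}:", ["%.3f%%" % (100.0 * x) for x in row])
```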
We thus consider $d$ firms with respective nominal $N_l$, recovery rate $R_l$, loss given default $M_l =
(1 - R_l)N_l$, and default time $\tau_l > 0$ with distribution function $F_l(t) = \mathbb{P}(\tau_l \le t)$ and indicator
process $H^l_t = \mathbf{1}_{\{\tau_l \le t\}}$, $l = 1, \dots, d$. The cumulative portfolio loss at time $t$ is thus given by
$$L_t = \sum_{l=1}^{d} M_l H^l_t$$
Finally we denote by
$$F(t_1, \dots, t_d) = \mathbb{P}(\tau_1 \le t_1, \dots, \tau_d \le t_d)$$
the joint cumulative distribution function of $(\tau_1, \dots, \tau_d)$.
7.1 Single Tranche CDOs
A single tranche CDO (Collateralized Debt Obligation) with lower attachment point $K$, upper
attachment point $\overline{K}$ and maturity $T$ is a contract with the following cumulative discounted cash
flow, from the point of view of the investor (which is typically the credit protection seller, in the
case of a CDO):
$$\int_0^T \beta_t \left[\Sigma(\overline{K} - K - \bar{L}_t)\,dt - d\bar{L}_t\right]$$
where:
• $\bar{L}_t = \min\big((L_t - K)^+, \overline{K} - K\big) = (L_t - K)^+ - (L_t - \overline{K})^+$ is the cumulative tranche loss,
• $\Sigma$ is the tranche spread at time 0, and
• $\beta_t = \exp(-\int_0^t r_u\,du)$ is the risk-neutral discount factor (inverse of the savings account, see the
Appendix).
Note that $\min\big((L_t - K)^+, \overline{K} - K\big) = (L_t - K)^+ - (L_t - \overline{K})^+$, so a CDO tranche can be interpreted
as a call-spread on the portfolio loss $L_t$, with strikes $K$, $\overline{K}$.
Equity (resp. senior) tranches refer to tranches with lower attachment point $K = 0$ (resp. $K > 0$).
For instance, on the DJ iTraxx market (a family of CDS indices for Europe and Asia), CDO tranches
are quoted for $(K, \overline{K})$ equal to (0%, 3%), (3%, 6%), (6%, 9%), (9%, 12%) and (12%, 22%).
By arbitrage, the value of a tranche at time 0 is thus given by
$$\mathbb{E}\int_0^T \beta_t \left[\Sigma(\overline{K} - K - \bar{L}_t)\,dt - d\bar{L}_t\right] ,$$
under a risk-neutral probability $\mathbb{P}$ on the primary market (see the Appendix). The tranche spread
$\Sigma$ at time 0 is typically set such that the tranche is entered at no cost at inception, so
$$\Sigma = \frac{\mathbb{E}\int_0^T \beta_t\,d\bar{L}_t}{\mathbb{E}\int_0^T \beta_t (\overline{K} - K - \bar{L}_t)\,dt}$$
Assuming deterministic interest rates $r$, the value of either leg of the CDO depends only on the
expected tranche losses $\mathbb{E}\bar{L}_t$, $t \in [0, T]$. Indeed, we have for the fees leg:
$$\mathbb{E}\int_0^T \beta_t (\overline{K} - K - \bar{L}_t)\,dt = \int_0^T \beta_t (\overline{K} - K - \mathbb{E}\bar{L}_t)\,dt \tag{28}$$
For the protection leg, we have $d(\beta_t \bar{L}_t) = \beta_t (d\bar{L}_t - r_t \bar{L}_t\,dt)$ (and $\bar{L}_0 = 0$), thus by Fubini's theorem:
$$\mathbb{E}\int_0^T \beta_t\,d\bar{L}_t = \beta_T\,\mathbb{E}\bar{L}_T + \int_0^T r_t \beta_t\,\mathbb{E}\bar{L}_t\,dt \tag{29}$$
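Formulas (28) and (29) reduce the computation of the tranche spread to two one-dimensional integrals of the expected tranche loss curve. The sketch below illustrates this with a trapezoidal rule, assuming a constant interest rate `r` and an expected tranche loss supplied as a Python function of time; both the function name and these inputs are assumptions made for the illustration.

```python
import math


def tranche_spread(expected_tranche_loss, width, T, r, steps=1000):
    """Fair spread = protection leg (29) / fee leg per unit spread (28).

    expected_tranche_loss: function t -> expected cumulative tranche loss at t
    width: tranche width (upper minus lower attachment point)
    r: constant interest rate (assumption), so beta_t = exp(-r t)
    """
    dt = T / steps
    fee = 0.0
    accrued = 0.0
    for k in range(steps):
        t0, t1 = k * dt, (k + 1) * dt
        b0, b1 = math.exp(-r * t0), math.exp(-r * t1)
        e0, e1 = expected_tranche_loss(t0), expected_tranche_loss(t1)
        fee += 0.5 * dt * (b0 * (width - e0) + b1 * (width - e1))   # integrand of (28)
        accrued += 0.5 * dt * r * (b0 * e0 + b1 * e1)               # integral term of (29)
    protection = math.exp(-r * T) * expected_tranche_loss(T) + accrued
    return protection / fee


if __name__ == "__main__":
    # Hypothetical expected loss curve on a 3% wide tranche.
    print(tranche_spread(lambda t: 0.002 * t, width=0.03, T=5.0, r=0.03))
```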
7.2 Li Model
A copula function $C$ is the joint cumulative distribution function of an $\mathbb{R}^d$-valued random vector
with uniform marginals on $[0, 1]$, so that in particular $C(1, \dots, 1, u_l, 1, \dots, 1) = u_l$, for any $u_l \in [0, 1]$.
Recall that for any joint multivariate cumulative distribution function F (x1 , . . . , xd ) with univariate
marginal cumulative distribution functions F1 (x1 ), . . . , Fd (xd ), there exists a copula function C such
that
F (x1 , . . . , xd ) = C [F1 (x1 ), . . . , Fd (xd )] ,
by Sklar's theorem (1973).
In the one-factor Gaussian copula model (market model for CDOs), it is postulated that the depen-
dence structure between the τl is dened by the Gaussian copula
where N and Nρ respectively stand for the standard (univariate) Gaussian cumulative distribution
function and the d-variate Gaussian cumulative distribution function with covariance (correlation,
in fact) matrix
1 ρ ... ρ
.. .
.
ρ 1 . .
. (30)
. ..
..
. 1 ρ
ρ ... ρ 1
So, by denition:
F (t1 , . . . , td ) = P(τ1 ≤ t1 , . . . , τd ≤ td )
= P N −1 (F1 (τ1 )) ≤ N −1 (F1 (t1 )), . . . , N −1 (Fd (τd )) ≤ N −1 (Fd (td ))
and X = (N −1 (F1 (τ1 )), . . . , N −1 (Fd (τd ))) is a Gaussian vector with covariance (correlation) matrix
given by (30). Equivalently,
τl = Fl−1 (N (Xl ))
where
√ √
Xl = ρV + 1 − ρ εl , l = 1, . . . , d (31)
for independent standard Gaussian random variables V (the common factor) and ε1 , . . . , εd . Dening,
for t≥0:
√
N −1 (Fl (t)) −
l|v ρv
pt = P(τl ≤ t | V = v) = P(Xl ≤ N −1 (Fl (t)) | V = v) = N √ ,
1−ρ
we thus have:
Z ∞Y d
l|v
= ptl g(v)dv ,
−∞ l=1
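by conditional independence, where $g$ denotes the standard Gaussian density. Likewise, the moment
generating function of the portfolio loss $L_t$ satisfies
$$\Psi_{L_t}(u) = \mathbb{E}\,e^{uL_t} = \int_{-\infty}^{\infty} \prod_{l=1}^{d} \left(1 - p_t^{l|v} + p_t^{l|v} e^{uM_l}\right) g(v)\,dv. \qquad (32)$$
In particular, in the homogeneous case $p_t^{l|v} = p_t^v$, $M_l = M$:
$$\Psi_{L_t}(u) = \sum_{l=0}^{d} \int_{-\infty}^{\infty} C_d^l\,(1 - p_t^v)^l\,(p_t^v)^{d-l}\,e^{u(d-l)M}\,g(v)\,dv. \qquad (33)$$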
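The conditional default probability $p_t^{l|v}$ and the homogeneous moment generating function (33) can be sketched as follows. This is a minimal illustration under stated assumptions: the integration against $g(v)\,dv$ uses a plain trapezoidal rule on a truncated interval, the binomial sum in (33) is written in its closed form $(1 - p + p\,e^{uM})^d$, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian law: cdf N, inverse cdf N^{-1}, density g

def conditional_default_prob(F_t, rho, v):
    """p_t^{l|v} = N((N^{-1}(F_l(t)) - sqrt(rho) v) / sqrt(1 - rho)), for 0 < rho < 1."""
    c = STD_NORMAL.inv_cdf(F_t)
    return STD_NORMAL.cdf((c - math.sqrt(rho) * v) / math.sqrt(1.0 - rho))

def mix_over_factor(func, v_max=8.0, n=4000):
    """Integrate func(v) g(v) dv over [-v_max, v_max] by the trapezoidal rule."""
    h = 2.0 * v_max / n
    total = 0.0
    for i in range(n + 1):
        v = -v_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * func(v) * STD_NORMAL.pdf(v)
    return total * h

def homogeneous_loss_mgf(u, F_t, rho, d, M):
    """Moment generating function (33); the sum over l equals (1 - p + p e^{uM})^d."""
    def integrand(v):
        p = conditional_default_prob(F_t, rho, v)
        return (1.0 - p + p * math.exp(u * M)) ** d
    return mix_over_factor(integrand)

# Mixing the conditional probability over the factor recovers the marginal F_l(t).
marginal = mix_over_factor(lambda v: conditional_default_prob(0.1, 0.3, v))
```

Two easy checks are that `marginal` is close to the input marginal probability, and that the derivative of the moment generating function at $u = 0$ is close to the expected loss $d\,M\,F(t)$.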
Of course, for a complete determination of $L$, the marginal distributions of the $\tau_l$ need to be specified.
This is typically done by assuming that the only information at hand is the one provided by
the observation of the defaults of the reference entities, in which case the cumulative distribution
functions $F_l(t)$ may be bootstrapped from the individual market CDS spread curves of the reference
entities. Indeed, we then have by arbitrage, for any $l = 1, \ldots, d$ (see e.g. [82, 30]):
$$\Sigma_l(T) \int_0^T (1 - F_l(t))\,dt - M_l \int_0^T dF_l(t) = 0, \quad T \ge 0 \qquad (34)$$
(and $F_l(0) = 0$), where $\Sigma_l(T)$ denotes the fair spread at time 0 of the CDS with maturity $T$ on the
$l$-th name.
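One possible way to carry out the bootstrap based on (34) is sketched below. It is only an illustration: it assumes a hazard rate that is constant between consecutive quoted maturities (an assumption not made in the notes), uses relation (34) exactly as written, and solves for each hazard rate by bisection. The function name and arguments are hypothetical.

```python
import math

def bootstrap_hazards(maturities, spreads, M):
    """Piecewise-constant hazard rates matching (34) at each quoted maturity.

    Relation (34) reads  spread(T) * int_0^T (1 - F(t)) dt = M * F(T),
    with 1 - F the survival function. Assumption: the hazard rate is constant
    on each interval (T_{i-1}, T_i], so that survival decays exponentially there.
    """
    hazards = []
    surv_start, annuity_start, t_prev = 1.0, 0.0, 0.0
    for T, s in zip(maturities, spreads):
        dt = T - t_prev

        def mismatch(h):
            surv_end = surv_start * math.exp(-h * dt)
            # Integral of the survival function over the current interval.
            seg = surv_start * dt if h < 1e-12 else (surv_start - surv_end) / h
            return s * (annuity_start + seg) - M * (1.0 - surv_end)

        lo, hi = 0.0, 1.0
        while mismatch(hi) > 0.0 and hi < 1e6:  # enlarge the bracket if needed
            hi *= 2.0
        for _ in range(200):                    # bisection: mismatch decreases in h
            mid = 0.5 * (lo + hi)
            if mismatch(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        h = 0.5 * (lo + hi)
        hazards.append(h)
        surv_end = surv_start * math.exp(-h * dt)
        annuity_start += (surv_start - surv_end) / h if h > 1e-12 else surv_start * dt
        surv_start, t_prev = surv_end, T
    return hazards

# Illustrative flat spread curve: the hazard rate should be spread / M on every bucket.
flat_hazards = bootstrap_hazards([1.0, 3.0, 5.0], [0.012, 0.012, 0.012], 0.6)
```

The cumulative distribution function then follows as $F_l(t) = 1 - \exp(-\int_0^t h_s\,ds)$. For a flat spread curve, (34) gives a constant hazard rate equal to the spread divided by $M_l$, which the example reproduces.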
Alternatively, the portfolio loss probability distribution $q_t$ can be computed by using the following
recursive relation on the conditional loss probabilities $q_t^{k|v}(i)$, which take into consideration the $i$ first
reference entities only (so $\mathbb{P}(L_t = k \,|\, v) =: q_t^{k|v} = q_t^{k|v}(d)$ and $q_t^{\cdot|v}(0) = \delta_0$):
$$q_t^{k|v}(i) = p_t^{i|v}\, q_t^{k-M_i|v}(i-1) + (1 - p_t^{i|v})\, q_t^{k|v}(i-1),$$
$$i = d, \ldots, 1, \quad k = 0, \ldots, M.$$
Indeed we have in obvious notation, for any $i = d, \ldots, 1$ and $k = 0, \ldots, M$:
$$q_t^{k|v}(i) = q_t^{k-M_i|v,\,H_t^i=1}(i)\, p_t^{i|v} + q_t^{k|v,\,H_t^i=0}(i)\,(1 - p_t^{i|v})$$
where by conditional independence:
$$q_t^{k-M_i|v,\,H_t^i=1}(i) = q_t^{k-M_i|v}(i-1), \qquad q_t^{k|v,\,H_t^i=0}(i) = q_t^{k|v}(i-1).$$
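The recursion can be sketched in a few lines, adding the names one at a time starting from the Dirac mass at zero. The sketch assumes commensurate losses expressed as positive integer multiples of a common unit; the function name is hypothetical.

```python
def conditional_loss_distribution(cond_probs, loss_units):
    """Conditional law of the portfolio loss given V = v, by the recursion on q(i).

    cond_probs : conditional default probabilities p_t^{i|v}, one per name
    loss_units : losses M_i as positive integers (multiples of a common unit)
    Returns q with q[k] = P(L_t = k | V = v), k = 0, ..., sum(loss_units).
    """
    max_loss = sum(loss_units)
    q = [0.0] * (max_loss + 1)
    q[0] = 1.0  # q(0) = delta_0: with no name in the pool, the loss is zero
    for p, m in zip(cond_probs, loss_units):
        new_q = [0.0] * (max_loss + 1)
        for k in range(max_loss + 1):
            survive = (1.0 - p) * q[k]                  # name i does not default
            default = p * q[k - m] if k >= m else 0.0   # name i defaults, adding M_i
            new_q[k] = survive + default
        q = new_q
    return q

# Homogeneous example: four names, unit losses, so the result is binomial(4, 0.2).
q_binomial = conditional_loss_distribution([0.2] * 4, [1] * 4)
```

The unconditional distribution is then obtained by integrating `q[k]` against $g(v)\,dv$, with `cond_probs` recomputed for each value of $v$.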
As opposed to the previous exact procedures, which assume that the losses $M_l$ are commensurate, fast
approximate methods may be used to compute the $q_t^{k|v}$ without the assumption of commensurate
losses. These are saddle point methods based on the following inverse Laplace transform representation
for the $q_t^{k|v}$, in the sense of distributions (cf. (113); recall that the portfolio loss distribution
has finite support), formally obtained by integration parallel to the imaginary axis in the complex
plane, for any $\eta > 0$:
$$q_t^{k|v} = \frac{1}{2\pi i}\int_{\eta - i\infty}^{\eta + i\infty} \Psi_{L_t}(u|V)\,\exp(-uk)\,du \qquad (35)$$
where
$$\Psi_{L_t}(u|V) := \mathbb{E}[e^{uL_t} \,|\, V] = \prod_{l=1}^{d} \left(1 - p_t^{l|v} + p_t^{l|v} e^{uM_l}\right).$$
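$$\mathbb{E}[\varphi(L_t) \,|\, V] = \frac{1}{2\pi i}\int_{u=\eta - i\infty}^{\eta + i\infty} \Psi_{L_t}(u|V) \int_{L=0}^{\infty} \exp(-uL)\,\varphi(L)\,dL\,du \qquad (36)$$
(identity in the strong sense, now), for any regular enough function $\varphi$ such that the conditional
expectation exists in (36).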
Saddle point methods are based on the approximation of $\Psi_{L_t}(u|V)\exp(-uk)$ in (35) by a suitable
Taylor expansion around a well chosen point $u^*$, so that the computation of the resulting integral
approximating that in (35) (approximate inverse Laplace transform) only involves Gaussian
integration. We thus get the density of a distribution (a.c. w.r.t. the Lebesgue measure on $\mathbb{R}_+$)
approximating $L$ in law (conditionally on $V$), much faster than by the previous exact procedures,
and without the assumption of commensurate losses (see [8, 91, 150] for every detail).
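Even better, we get by application of (36) with $\varphi(L_t) = (L_t - K)^+$ (note that $\partial_x[(ux + 1)e^{-ux}] = -u^2 x e^{-ux}$,
so $\int_0^\infty x e^{-ux}\,dx = \frac{1}{u^2}$):
$$\mathbb{E}[(L_t - K)^+ \,|\, V] = \frac{1}{2\pi i}\int_{u=c - i\infty}^{c + i\infty} \frac{\Psi_{L_t}(u|V)\,\exp(-uK)}{u^2}\,du \qquad (37)$$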
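The step from (36) to (37) rests on the Laplace transform of the call payoff. Spelled out, assuming $K \ge 0$ and $\mathrm{Re}(u) > 0$, and using the change of variable $x = L - K$:

```latex
\int_{L=0}^{\infty} e^{-uL}\,(L-K)^+\,dL
  = \int_{K}^{\infty} e^{-uL}\,(L-K)\,dL
  = e^{-uK}\int_{0}^{\infty} x\,e^{-ux}\,dx
  = \frac{e^{-uK}}{u^{2}} .
```

Substituting this inner integral into (36) gives the integrand of (37).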
Saddle point methods may then be applied to the integral in (37), namely directly to the pricing of
the tranche, without going through the derivation of the portfolio loss distribution. Depending
on the expansion point $u^*$, the order of the Taylor expansion, etc., we thus get a whole family of
approximate algorithms for pricing the tranche.
In the simplest case, we recover the so-called large portfolio approximation to the tranche price.
Refining the approximation a little, we recover the so-called normal approximation. Higher-order saddle point
methods for the tranche are both very accurate and much faster than
exact methods (unless the $M_l$ are commensurate and supported by a very coarse grid), see [8].
A last possibility to compute the portfolio loss probabilities $q_t^k$ or, directly, the values of the fees leg
(28) and of the protection leg (29) of the tranche, is of course to proceed by simulation (Monte
Carlo method, cf. part V). But simulation methods are much slower on these problems than any of
the previous procedures (approximate saddle point methods to estimate the tranche legs, or even,
assuming commensurate $M_l$, exact convolution FFT or recursive methods to compute the portfolio
loss distribution).
Note finally that the numerical integrations to be performed in these algorithms, which all involve
the Gaussian kernel $g(v)\,dv$, may be done very efficiently by Gaussian quadrature (Gauss-Hermite
quadrature, using for instance the function gauher of [129]; see also [60]).
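As an illustration of Gauss-Hermite integration against the Gaussian kernel, here is a minimal sketch. It uses NumPy's `hermegauss` routine in place of the `gauher` function cited above (a substitution, not the routine of [129]); the wrapper name `gaussian_expectation` is hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gaussian_expectation(func, n_nodes=32):
    """Approximate the integral of func(v) g(v) dv, g being the standard Gaussian density.

    hermegauss returns nodes and weights for the weight function exp(-v^2 / 2),
    so the weighted sum is divided by sqrt(2 pi) to integrate against g.
    func must accept a NumPy array of nodes.
    """
    nodes, weights = hermegauss(n_nodes)
    return float(np.sum(weights * func(nodes)) / math.sqrt(2.0 * math.pi))

# Checks on known Gaussian moments: E[V^2] = 1 and E[exp(V)] = exp(1/2).
second_moment = gaussian_expectation(lambda v: v ** 2)
mgf_at_one = gaussian_expectation(np.exp)
```

A few tens of nodes are typically enough for smooth integrands such as the conditional probabilities $p_t^{l|v}$, which is what makes the approach attractive compared with a fine uniform grid in $v$.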
7.4 Hedging
Since the dynamics of the losses is not specified in the one-factor Gaussian copula model, the
connection of prices (initial prices at time 0) with any form of hedging cannot be formally established.
Let us thus say a word about how people use the model for hedging in practice. In fact, the related
hedging methodology may be immediately transposed to any model, or at least, to any model with
information filtration equal to the filtration of default times, for the sake of (34). So we write this
section relatively to an arbitrary model endowed with this property, specializing the description to
the Li model where need be.
The most usual hedge is delta-neutral with respect to homogeneous bumps on homogeneous time
buckets of the underlying CDS curves. This consists in hedging a (credit index) CDO tranche with
maturity $T$ using CDS index contracts with various maturities $T_j$ ($T_{j-1} \le T$) as hedging instruments.
A CDS index contract is an insurance contract covering default risk on the pool of names in the
index. Index contracts differ slightly from single-name securities. In the case of a credit event,
the related entity is removed from the index and the contract continues (with a reduced notional
amount) until maturity. CDS indices may be considered as kinds of averages of individual CDSs,
and they can be priced essentially like the latter, cf. (34) (since we assume that the model filtration
is the filtration corresponding to the observation of default times).
We thus rebalance a complementary position in CDS index contracts at every time step $\tau$ (= 1 market
day, typically), in order to minimize the overall exposure to homogeneous bumps in the underlying
CDS curves. In this way we hedge (typical deformations of) the market credit spread curve. Similarly,
by using individual CDSs as hedging instruments, one could hedge pertinent deformation
scenarios of the related individual CDS curves. Note that the CDSs involved (whether they are CDS
index contracts or individual CDSs) are new CDSs freshly issued at each rebalancing time $t$.
The vector of CDS index hedging positions $\Lambda = (\lambda_j)_j$ is chosen so that the tranche (sold, i.e. tranche
protection bought) and its hedging portfolio are globally immune, locally in time, to homogeneous
bp-bumps on the time-buckets $[T_{i-1}, T_i]$ of the underlying CDS curves at the current time $t$, namely
$$\Delta^* = \Delta\Lambda \qquad (38)$$
where $\Delta^*$ and $\Delta$ respectively stand for the vector of the deltas (sensitivities) of the tranche, and
the matrix of the deltas (sensitivities) of the CDS index contracts with maturities $T_j$ such that
$T_{j-1} \le T$, with respect to homogeneous bp-bumps on the time-buckets $[T_{i-1}, T_i]$ of the underlying
CDS curves for $T_{i-1} \le T$. The matrix $\Delta$ is obviously upper triangular, so the linear system (38) is
trivial to solve.
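Since $\Delta$ is upper triangular, (38) can be solved by back substitution, starting from the last row. A minimal sketch follows, with rows indexed by bumped time-buckets and columns by hedging instruments; the sensitivities are made-up numbers for illustration and the function name is hypothetical.

```python
def solve_upper_triangular(delta, delta_star):
    """Solve delta @ lam = delta_star for lam by back substitution.

    delta      : upper triangular matrix (list of rows) with non-zero diagonal
    delta_star : right-hand side vector (tranche deltas)
    """
    n = len(delta_star)
    lam = [0.0] * n
    for i in range(n - 1, -1, -1):
        # Subtract the contribution of the positions already determined.
        residual = delta_star[i] - sum(delta[i][j] * lam[j] for j in range(i + 1, n))
        lam[i] = residual / delta[i][i]
    return lam

# Made-up sensitivities, for illustration only.
delta = [[2.0, 1.0, 0.5],
         [0.0, 3.0, 1.0],
         [0.0, 0.0, 4.0]]
delta_star = [5.5, 9.0, 12.0]
lam = solve_upper_triangular(delta, delta_star)
```

The cost is quadratic in the number of hedging instruments, against cubic for a general dense linear solve.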
The $\Delta^j_i$'s are computed by assessing the impact on the Protection and Fees Legs of the $j$-th CDS
index contract of the $i$-th bump on the underlying CDS curves. So they are independent of the
dependence structure (copula function) of $(\tau_1, \ldots, \tau_d)$ (provided the model information is the filtration
corresponding to the observation of default times, cf. (34)). Note that individual CDS deltas (in
case an individual CDS hedge is desired, see section 7.4.1) could be computed in the same
way.
The $\Delta^*_i$'s are computed by assessing the impact on the Protection and Fees Legs of the tranche of
the $i$-th bump on the underlying CDS curves. More precisely, for each $i$ such that $T_{i-1} \le T$:
(i) we bootstrap the marginal cumulative distribution functions $F_l$ from the underlying CDS spread
curves by (34), and we calibrate the other model parameters to relevant market data (using the base
or the compound implied correlation of the tranche for the $\rho$ parameter in the case of the Li model,
see section 8.2);
(ii) we compute the associated values $P$ and $Q$ of the Protection and Fees Legs of the tranche;
(iii) we bootstrap as in (i) the marginal cumulative distribution functions $\widetilde{F}_l$ from the underlying
CDS curves bumped by +1bp on the time-buckets $[T_{i-1}, T_i]$, $T_{i-1} \le T$;
(iv) we compute the values $\widetilde{P}$ and $\widetilde{Q}$ of the Protection and Fees Legs of the tranche corresponding
to the $\widetilde{F}_l$, with other model parameters unchanged (i.e. for the value of $\rho$ computed in (i), in the
case of the Li model).
We finally get
$$\Delta^*_i = \Sigma\,(\widetilde{Q} - Q) - (\widetilde{P} - P),$$
where $\Sigma$ is the spread of the tranche at inception date 0. Note that as opposed to the $\Delta^j_i$'s, the $\Delta^*_i$'s
are affected by the dependence structure between the $\tau_l$.
7.4.3 P&L Analysis
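Neglecting transaction costs, the P&L increment of the hedged position on the time interval $(t, t + \tau)$
writes, for a trader who is short one tranche and long $\lambda_j$ units in the CDS index contract with
maturity $T_j$:
$$\delta\,\mathrm{P\&L} = -\delta\,\mathrm{P\&L}^* + \sum_{j;\,T_{j-1} \le T} \lambda_j\,\delta\,\mathrm{P\&L}_j = \delta P - \Sigma\,\delta Q - \Sigma\tau + \sum_{j;\,T_{j-1} \le T} \lambda_j \left((1 - R)P_j - \Sigma_j Q_j - \Sigma_j \tau\right), \qquad (39)$$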
where:
• $\delta\,\mathrm{P\&L}^*$ (resp. $\delta\,\mathrm{P\&L}_j$) is the P&L increment on a unit position on the tranche (resp. on a unit
position on the CDS index contract with maturity $T_j$),
• $\Sigma$ is the spread of the tranche at inception date 0,
• $\delta P$ and $\delta Q$ denote the increments of the values of the Protection and Fees Legs of the tranche on
the time interval $(t, t + \tau)$,
• $\Sigma_j$ is the spread of the CDS index contract with maturity $T_j$ at the current time $t$,
• $P_j$ and $Q_j$ are the values of the Protection and Fees Legs of the CDS index contract with maturity
$T_j$ at time $t + \tau$, and
• $R$ is the recovery rate on the credit index.
In (39), the terms $\Sigma\tau$ and $\Sigma_j\tau$ account for the carry of the various instruments involved, while the
remaining terms account for the slide of the portfolio between $t$ and $t + \tau$. Note that the $\delta\,\mathrm{P\&L}_j$ only
depend on the value of the CDS index contracts at time $t + \tau$. This is due to the fact that the CDS
index contracts used for hedging at time $t$ are new CDSs freshly issued at $t$. So their value at time
$t$ is equal to zero, by definition of a spread.
Of course, to quantify (39), one would need to specify the model dynamics, beyond the definition of
the distribution of the portfolio losses at time 0.
The related deltas (Black-Scholes implied volatility delta or Li implied correlation delta) may also
be used for hedging (see e.g. section 7.4), yet the related hedges are only partial hedges, which do
not account for the volatility or correlation risk.
More precisely, given the values of $r$ and $q$, inferred from the related riskless bond market for $r$ and
by call-put parity for $q$, the Black-Scholes implied volatility of a (European vanilla) option is the
value of $\sigma$ such that:
$$\Pi^{bs}(T, K; t, S_t, r, q, \sigma) = \Pi^{ma}_t(T, K) \qquad (40)$$
where $\Pi^{ma}_t(T, K)$ denotes the market price of the option at time $t$. Since the function
$\sigma \mapsto \Pi^{bs}(T, K; t, S_t, r, q, \sigma)$ is monotone (in the case of a European vanilla), equation (40) is easy to
solve numerically, by dichotomy (bisection).
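Solving (40) by dichotomy can be sketched as follows for a European call. This is an illustrative implementation under stated assumptions: the standard Black-Scholes call formula with continuous dividend yield $q$, a time to maturity argument `T` standing for $T - t$, and a hypothetical search bracket `[lo, hi]` for $\sigma$.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard Gaussian cumulative distribution function

def bs_call(S, K, T, r, q, sigma):
    """Black-Scholes price of a European call; T is the time to maturity."""
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * math.exp(-q * T) * N(d1) - K * math.exp(-r * T) * N(d2)

def implied_vol(price, S, K, T, r, q, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_call(sigma) = price by bisection; the call price increases with sigma."""
    if not bs_call(S, K, T, r, q, lo) <= price <= bs_call(S, K, T, r, q, hi):
        raise ValueError("price outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, q, mid) < price:
            lo = mid   # model price too low: volatility must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip on made-up data: price at 20% volatility, then invert.
market_price = bs_call(100.0, 110.0, 1.0, 0.03, 0.01, 0.20)
sigma_hat = implied_vol(market_price, 100.0, 110.0, 1.0, 0.03, 0.01)
```

Each bisection step halves the bracket, so about 35 price evaluations reach the tolerance above from the initial bracket; Newton iterations on the vega would be faster but need a safeguard for deep in- or out-of-the-money options.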
Proceeding in this way for varying strike $K$, we get the so-called Black(-Scholes) implied volatility
smile (typical on foreign exchange derivatives markets), smirk (typical on interest rate derivatives
markets) or negative skew (typical on equity derivatives markets) (or positive skew on `negative beta'
asset derivatives markets, like options on gold futures, see Figure 2).
[Figure 2 (plot not reproduced): surface plotted against Strike and Maturity.]
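where $\Sigma^{ma}_t(T, \underline{K}, \overline{K})$ denotes the market tranche spread at time $t$;
• the base implied correlation of a tranche is the value of the correlation $\rho$ in a Li model such that
$$\Sigma^{li}_t(T, 0, \overline{K}; t, (F_l)_{1 \le l \le d}, \rho) = \Sigma^{ma}_t(T, 0, \overline{K}),$$
where $\Sigma^{ma}_t(T, 0, \overline{K})$ is a synthetic market spread computed from the observed market spreads for
the tranches with upper attachment point $\overline{K}$ and below (see e.g. [125]).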
Base implied correlation is more stable than compound implied correlation, because the function
$\rho \mapsto \Sigma^{li}_t(T, 0, \overline{K}; t, (F_l)_{1 \le l \le d}, \rho)$ is monotone [125].
Other models are assessed on their ability to reproduce the so-called market correlation skew, for
suitably calibrated values of their parameters (see e.g. Figure 3, in which the model correlation skew
was derived in a suitable credit migrations set-up [29]).
[Plot not reproduced; horizontal axis: Tranche Attachment Points, with tranches 0.0-0.03, 0.03-0.06, 0.06-0.09, 0.09-0.12, 0.12-0.22.]
Figure 3: Market and Model implied correlation skews for CDO tranches.
For a consistent risk-management of derivative products, the benchmark models are not good enough.
More realistic models are needed (like the Bates model of section 5.4 or beyond, in the case of
equity derivatives; see also the generic Markovian model (144)-(145) in the Appendix). In such
models, and/or for pricing and Greeking exotic products (in any model), numerical methods are
the only way. The remaining parts of these notes will be devoted to these methods. As already
mentioned in the Introduction (see section 1), we shall present most methods on simple models, like
the Black-Scholes model (in general), the Merton model or the Heston model, for simplicity of the
presentation, yet the methods themselves are always generic to some degree. For instance, it should
be clear from the previous developments that simulating correlated Gaussian vectors (see section
20.3) is a prerequisite for Monte Carlo pricing in the BGM or in the Li models, etc.
Part III
Finite Differences Methods
The deterministic pricing methods presented in these notes, namely Partial Differential Equation (PDE)
methods in this part and tree methods in the next one, can be used in any Markovian model,
irrespective of any early exercise features, but with the practical limitation that the space dimension of
the model be not too large (no more than 3, say; cf. section 3).
In a risk-neutral market model, the primary risky market price process is typically modeled as an
$\mathbb{R}^d$-valued locally bounded local martingale $X$ with respect to some stochastic basis $(\Omega, \mathbb{F}, \mathbb{P})$ (see
the Appendix). We assume that the riskless interest rate $r$ in the economy and the dividend yields $q_l$
on the $X^l$ are constant, for simplicity.² Let us further be given an $\mathbb{R}^m$-valued $(\mathbb{F}, \mathbb{P})$-Markov process
$\mathcal{X}$, with $\mathcal{X}^l = X^l$, $l = 1, \ldots, d$ (thus $d \le m$), such that the payoff process of a financial derivative is
given by a Borel function of $\mathcal{X}_t$. So the components $\mathcal{X}^l$, $l = d + 1, \ldots, m$ (in case $d < m$) represent
factors that may be required to `Markovianize' the market model and/or the payoff of the derivative.
Then under mild technical conditions (see the Appendix for a detailed presentation), the $\mathbb{P}$-arbitrage
price of the derivative, whether it be a European, an American, or even a Game Contingent Claim
on $X$, writes $\Pi_t = u(t, \mathcal{X}_t)$, for a Borel (deterministic) function $u$ of $t$ and $\mathcal{X}_t$. So the derivative price
$\Pi_t$ is given by a function of $\mathcal{X}_t$ (and $t$), and all the randomness in $\Pi$ is embedded in that of $\mathcal{X}$.
Now, the Markov process $\mathcal{X}$ typically admits an infinitesimal generator with a canonical structure in
three parts: a drift, a diffusion and a jump component. As for the drift, $\mathbb{P}$-arbitrage actually implies
that the (first $d$) drift coefficients are given by $(r - q_1)X^1, \ldots, (r - q_d)X^d$.
This canonical structure of the generator of the Markov process $\mathcal{X}$ implies that the $\mathbb{P}$-pricing function
$u$ solves a parabolic partial integro-differential equation (PIDE) problem of the following type on the
time-space domain $[0, T] \times \mathbb{R}^m$, where $T$ is the maturity of the product (cf. pricing equation (150) in the
Appendix³):
$$\begin{cases} F(t, x, u(t, x), \partial_t u(t, x), Iu(t, x), \nabla u(t, x), Hu(t, x)) = 0 & \text{on } [0, T) \times D \\ u = \phi(t, x) & \text{on } \left([0, T] \times \mathbb{R}^m\right) \setminus \left([0, T) \times D\right) \end{cases} \qquad (41)$$
where:
• $D$ is an open subset of $\mathbb{R}^m$. In particular the terminal condition at $T$, which is embedded in the
boundary condition $\phi$ in (41), is given by the derivative payoff at maturity;
• $Iu$ is a non-local term defined as
$$Iu(t, x) = g(t, x) \int_{\mathbb{R}^m} \left(u(t, x + f(t, x, y)) - u(t, x)\right) h(t, x, dy),$$
for a non-negative jump intensity function $g(t, x)$ and a jump size probability measure $h(t, x, dy)$;
• $\nabla u$ and $Hu$ denote the spatial gradient and Hessian matrix of $u$ (gradient and Hessian with respect
to $x$).
The precise definition of the space domain $D$ and of the operator $F$ depends on the specifications of
the derivative at hand. So, in the case of a European vanilla option, $D = \mathbb{R}$ (or $\mathbb{R}_+^*$), and $F$ is a
linear operator. In the case of American (or Game) options, there are further obstacles in $F$. In case
there are barriers involved, $D$ is a part of $\mathbb{R}^m$ delimited by the barriers, etc.
² The interpretation of $r$ and of the $q_l$ depends on the nature of the primary market: stock, interest rate, exchange
rate, etc.
³ In the Appendix a more general factor process $Z = (t, \mathcal{X}, \mathcal{Y})$ is considered, where $\mathcal{Y}$ is a further Continuous Time
Markov Chain interacting with $\mathcal{X}$.
But the structure of the generator of the Markov process X implies that F is always monotone in
this sense:
Proof. The proof essentially relies on the classic maximum principle, according to which
∇v(z) = 0, Hv(z) ≤ 0 ,
F (t, x, u(t, x), ∂t u(t, x), Iu(t, x), ∇u(t, x), Hu(t, x)) ≤ 0
≤ F (t, x, v(t, x), ∂t v(t, x), Iv(t, x), ∇v(t, x), Hv(t, x))
implies by monotonicity of F :
F (t, x, u(t, x), ∂t u(t, x), Iu(t, x), ∇u(t, x), Hu(t, x)) ≤
F (t, x, v(t, x), ∂t u(t, x), Iu(t, x), ∇u(t, x), Hu(t, x)) ,
which is in contradiction with $u(t, x) > v(t, x)$, given the strict monotonicity of $F$ in $r$. □
Interpreting the problem at hand as comparing the distribution of the temperature in two identical
glasses filled with hot liquid, our comparison principle says that the liquid in one glass is hotter than
that in the other at all corresponding points inside the glasses, provided it is hotter at all corresponding
points outside the glasses (or at all corresponding points on the surface of the glasses, in the purely
differential case with no jumps, $g = 0$).
Proposition 9.1 immediately implies that there is at most one classic solution of (41). In very specic
cases, assuming in particular that F is linear, there may be a chance that the pricing PIDE has a
classic solution. However, in general, there is no hope that a classic solution to the pricing PIDE
does exist, and one must resort to suitable notions of weak solutions of (41).
The theory of viscosity solutions [51, 70, 2, 16, 4, 3, 49, 87, 88] defines suitable notions of weak
(continuous, or even semi-continuous) solutions, subsolutions and supersolutions of non-linear monotone
PIDEs such as (41), such that related maximum and comparison principles still hold true. Thus
viscosity solutions of (41) (bounded, or satisfying suitable growth conditions) are unique.
Existence of a viscosity solution of (41) can be established by various means, such as Perron's
method, itself based on the related comparison principles.
The precise definition of a viscosity solution to (41) is outside the scope of these notes and, in fact,
irrelevant here. It will be enough for us to keep in mind that under mild technical assumptions, the
pricing equation (41) is well-posed (admits a unique solution continuously depending on its input
data) in a suitable space of viscosity solutions with growth conditions. This will enable us to analyze
convergence properties of naturally related finite differences approximation schemes (including
approximation trees) for (41).
As an alternative to working with viscosity solutions, it is possible to derive weak (variational)
formulations of the pricing PIDE problem (41) in suitable weighted Sobolev functional spaces $H$. In this
approach, the boundary condition $\phi$ is typically accounted for by a judicious choice of $H$.
Existence and uniqueness of a solution to the variational formulation of the problem typically follow
by application of the Lax-Milgram Theorem.
Various choices for $H$ are possible, all giving rise to well-posed variational problems (see e.g. Bally et
al. [15, 14], Barles-Lesigne [18], Achdou-Pironneau [1], Jaillet et al. [86], Ern et al. [68]). The choice
made in [18, 15, 14] has the advantage of giving rise to a clear connection (Feynman-Kac formula)
between the solution of the PDE problem and the option price process as probabilistically defined
by arbitrage (see the Appendix).
Again, the precise variational formulation of the pricing problem, including the definition of the
spaces, is outside the scope of these notes and irrelevant here. We shall simply keep in mind that
under mild technical assumptions, a weak (re)formulation of the pricing PIDE problem (41) is
well-posed in a suitable weighted Sobolev space $H$. This underlies the theory of naturally related finite
elements and finite volumes approximation schemes.
10 Numerical Approximation
In some models (models of the AJD class [63], SABR model [80], ...) and in the case of European and
more or less vanilla options, the pricing equation (41) can be solved analytically (or semi-analytically,
see e.g. section 30).
But in general, (41) must be solved numerically. In order to approximate (41), one can either use
finite differences methods [140], or resort to more general finite elements (or even finite volumes)
methods [1].
Note that there is in fact no hermetic frontier between these methods. Indeed, schematically:
(and also in a sense, for a complete picture: MC Methods ⊂ Tree Methods, cf. section 1).
is that finite element methods are potentially more powerful, but they are also heavier, than finite
differences methods. The related cost is justified in cases where the geometry of the domain makes it
necessary to use a sophisticated unstructured and adaptive discretization mesh. But pricing problems
in finance are typically posed on rectangular domains, for which a simple finite difference grid
is good enough (with the limitation, however, that a finite differences grid does not offer as many
possibilities as a finite elements mesh for refining the mesh in multi-dimensional problems).
The numerical resolution of the PIDE problem (41) by finite differences is a four-step process:
(1) Transformation(s) of the problem, whenever judged useful, such as:
• changes of variables (e.g. $x = \ln S$, in the BS case),
• changes of unknowns (e.g. solving the equation for $e^{-rt}u$ rather than $u$),
• changes of probabilities (see e.g. [73, Appendix A] or [55, III,3.1.3] for a PDE interpretation of
the Girsanov transform for diffusion processes);
(2) Localisation of the problem, that is, truncating the integration domain in the non-local term $Iu$,
and the problem domain:
• replacing the integration domain $\mathbb{R}^m$ by bounded domains $D_\varepsilon(t, x) \subset \mathbb{R}^m$ such that
$$\int_{\mathbb{R}^m \setminus D_\varepsilon(t, x)} h(t, x, dy) \le \varepsilon,$$
Note that to exploit the parabolic structure of pricing equations in nance, the time dimension
is typically treated separately in (3) (as in the so-called `theta-schemes'), in order to `save' one
dimension in terms of storage cost. The problem is then solved `linearly in time' (one time step after
the other) at step (4).
Convergence, convergence rate (and, of course, computational cost!) of the resulting approximation
scheme, are the main issues.
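Given a suitable mesh with time step $h$ and space step vector $k = (k_1, \ldots, k_m)$ on $[0, T] \times \mathbb{R}^m$, let
$$\begin{cases} F_h^k(u_h^k) = 0 & \text{on } [0, T) \times D \\ u_h^k = \phi & \text{on } \left([0, T] \times \mathbb{R}^m\right) \setminus \left([0, T) \times D\right) \end{cases} \qquad (43)$$
stand for a fully-discrete finite differences approximation scheme for (41) (or a localized version of
(41), in case the original domain $D$ or the support of $h$ are unbounded, cf. section 10.1.1).
By (43) we mean that the related equalities are satisfied at every mesh point in
$[0, T) \times D$ or $\left([0, T] \times \mathbb{R}^m\right) \setminus \left([0, T) \times D\right)$. We assume that the discrete problem (43) admits a unique solution $u_h^k$.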
Recall that under mild technical assumptions, the pricing equation (41) has a unique solution u in a
suitable space of viscosity solutions with growth conditions. Moreover we assume that the boundary
data φ and the solution u are bounded, for simplicity.
In the purely differential case (no jumps, $g = 0$) and in the case of a linear operator $F$, $u$ is typically
a classic solution of (41). Then the well-known Lax Equivalence Theorem states that any stable and
consistent numerical scheme $F_h^k(u_h^k) = 0$ is convergent, which loosely means that $u_h^k$ converges to $u$
at mesh points as $h, k \to 0+$, provided:
• $u_h^k$ is bounded, uniformly over $h, k$ (stability),
• $F_h^k(u) \to 0$ at mesh points as $h, k \to 0$ (consistency).
We refer the reader to [118] for the details of this theorem. In particular, the previous statements
are relative to the specification of a given norm in which $u_h^k$ is bounded and $F_h^k(u) \to 0$, $u_h^k \to u$,
hence the notions of stability/convergence in norm $l^\infty$, $l^2$, etc.
There is a further notion of order of consistency, which measures the speed at which $F_h^k(u) \to 0$ at
mesh points as $h, k \to 0$. Note however that the order of consistency of a numerical scheme has no
immediate implication in terms of convergence rates of $u_h^k$ to $u$. Convergence rates only follow under
additional assumptions, regarding in particular the regularity of the boundary data $\phi$, etc.
Now, a nice feature of the theory of viscosity solutions is that the Lax Equivalence Theorem can be
generalized to the case of a general non-linear monotone operator $F$ and to problems with jumps in
(41).
So, for a general monotone, yet still purely differential, operator $F$ in (41), Barles and Souganidis
[19] proved the convergence of any monotone, stable and consistent approximation scheme for (41),
provided (41) satisfies a suitable viscosity solutions comparison principle. We already observed that
such comparison principles can indeed be obtained under mild technical conditions on the data of
problem (41). The monotonicity condition on the scheme is a discrete version of that on $F$ (cf.
(42)), which is typically satisfied by natural finite differences approximation schemes for (41). So
the convergence conditions in the Barles-Souganidis theorem are essentially reduced to those in the
Lax Equivalence Theorem, namely stability and consistency of the scheme.
Finally, the latter results were extended to problems with jumps (including the case of unbounded
jump measures, hence beyond the case of probability measures $h$ postulated in (41)) by Briani, La
Chioma and Natalini [39]. Note however that in the presence of jumps, the monotonicity of the related
numerical schemes does not follow as universally as in the purely differential case (see e.g. [144]).
As already mentioned in section 10.1, finite element methods are heavier and harder to implement (finite
volume methods in particular) than finite differences methods. By heavier, we mean that they are computationally
intensive, particularly in terms of storage cost. Indeed, a prerequisite in a finite element method is the
construction of a suitable discretization mesh, which is typically unstructured and adaptive, and
which has to be handled by the computer program all along the numerical resolution. This also means
that finite element methods are harder to implement. One typically then has to use finite element
toolboxes (many of them are actually free, see e.g. http://www.inria.fr/valorisation/logiciels/calcul.en.html).
However, the related cost is justified in cases where the geometry of the domain makes it necessary
to use a sophisticated unstructured and adaptive discretization mesh, as is typically the
case in fluid dynamics (see Figure 5), but also occasionally in finance. This is for instance the case
for pricing barrier options with curved boundaries (see e.g. Crépey [55]; barrier options with curved
boundaries are common on interest rates), or for the precise approximation of exercise boundaries
in the case of American problems (see e.g. [42], also reported in Crépey [55]).
Figure 5: (a) Detail of viscous mesh for wind tunnel model in aircraft design, (b) Parallel computation
on an unstructured mesh showing the domain decomposition of 16 processors of a distributed memory
computer.
Another motivation for using finite elements is to counter the curse of dimensionality (see section
3), by refining the approximation mesh in a more clever way than simply taking the product of
unidimensional adaptive meshes, as is typically done with finite differences (see Figure 6 and section
10.2.2).
Practically speaking, the numerical resolution of problem (41) by finite elements is a five-step
process (cf. the analogous description of finite differences methods in section 10.1.1):
(1) Transformation(s) of the problem, whenever judged useful;
(2) Localisation of the problem, that is, truncating the integration domain and the problem domain,
and introducing a suitable Dirichlet boundary condition $\varphi$ which extends $\phi$ on a boundary layer
$\partial D_j$ around the localized domain;
(3) Derivation of a weak formulation of the localized problem in a suitable weighted Sobolev functional
space $H$ (the boundary condition $\varphi$ is typically accounted for by a judicious choice of $H$);
(4) Projection of the resulting problem onto a finite-dimensional sub-space of finite elements $H_h \subset H$;
(5) We thus get a high-dimensional, sparse linear system, in the coefficients of the approximate
solution on a finite element basis, to be programmed and solved numerically on a computer.
Existence and uniqueness of a solution to the weak form of the localized problem at step (3) typically
follow by application of the Lax-Milgram Theorem (see section 9.2.2).
With the same motivation as for finite differences methods, the time dimension is typically (yet not
always, see e.g. [42, 68]) treated separately at steps (3)-(4). So the problem may be solved `linearly
in time' at constant storage cost. As with finite differences methods, convergence, convergence
rate, and computational cost of the resulting approximation scheme are the main issues.
An interesting feature of finite elements methods is that a powerful error theory (a priori and a
posteriori error estimates) is available.
To solve the high-dimensional sparse linear system arising at step (5), an iterative solver is required.
The Generalized Minimal Residual (GMRES) algorithm [136], a Krylov subspace method related to the
conjugate gradient method but applicable to non-symmetric systems, which is both efficient and
relatively easy to code, is the industry standard in this regard.
Finite volumes methods can be seen informally as counterparts of finite element methods
in which the test functions used in the variational formulation of the problem are indicator
(discontinuous) functions, instead of regular test functions as usual. Finite volumes methods are
especially suited to convection (or convection-dominated) PDE problems with discontinuous data.
For instance, the finite volume discretization of the pricing function of a digital option is exact in a
Black-Scholes model with no volatility ($\sigma = 0$).
10.2.2 Sparse Grids
Sparse grids denote numerical techniques to represent, integrate or interpolate high dimensional
functions, relying on the seminal works of the Russian mathematician Smolyak, who found a clever
quadrature rule to (partially) escape the curse of dimensionality.
This direction of research underlies active developments in finite elements methods (or finite
differences methods; see Figure 7). But the related algorithms are difficult to implement, and we will not
go further in this direction at the level of these notes, referring the interested reader to [132, 75].
For simplicity, we shall focus on finite differences methods in the sequel. About finite element
methods in finance, we refer the reader to Achdou-Pironneau [1].
($\kappa = r - q$) for a standard $\mathbb{P}$-Brownian motion $W$, the price process of a European vanilla option
with (integrable) payoff $\phi(S_T)$ at $T$ is given by, for $t \in [0, T]$:
$$\Pi_t = e^{-r(T - t)}\,\mathbb{E}_t\,\phi(S_T). \qquad (44)$$
So, consistently with arbitrage requirements (cf. the Appendix), the discounted price $e^{-rt}\Pi_t$ is a
martingale under the risk-neutral measure $\mathbb{P}$.
By the Markov property of the risk-neutral BS stock $S$, we have $\mathbb{E}_t\phi(S_T) = \mathbb{E}(\phi(S_T) \mid S_t)$, and it is possible to show that the price $\Pi_t = v(t, S_t)$, for a suitable Borel function $v$. Then, assuming that the function $v$ lies in $C^{1,2}(\mathbb{R}_+ \times \mathbb{R})$ with bounded derivatives, we have by application of the Itô formula:
$$e^{rt}\,d\big(e^{-rt}v(t, S_t)\big) = (\partial_t v + Lv)(t, S_t)\,dt + \sigma S_t\,\partial_S v(t, S_t)\,dW_t$$
where $L = \kappa S\,\partial_S + \frac{\sigma^2 S^2}{2}\,\partial^2_{S^2} - r\,\mathrm{Id}$. Since $e^{-rt}v(t, S_t)$ is a martingale, the $dt$ term must vanish, thus
$$\partial_t v + Lv = 0.$$
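Accounting for the fact that $\Pi_T = \phi(S_T)$, we get that the price of a European vanilla option in the BS model solves the following PDE problem:
$$\begin{cases} \partial_t v + Lv = 0 & \text{in } [0, T) \times \mathbb{R}_+^* \\ v(T, S) = \phi(S), & S \in \mathbb{R}_+^* \end{cases} \tag{45}$$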
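After logarithmic transformation $X_t = \ln(S_t)$, the option price process is given by $\Pi_t = u(t, X_t)$, where $u$ solves the parabolic equation
$$\begin{cases} \partial_t u + b\,\partial_x u + \frac12\sigma^2\,\partial^2_{x^2} u - ru = 0 & \text{in } [0, T) \times \mathbb{R} \\ u(T, x) = \psi(x), & x \in \mathbb{R} \end{cases} \tag{46}$$
($b = \kappa - \frac{\sigma^2}{2}$), with $x = \ln S$ and $\psi(x) = \phi(e^x)$, the payoff in the $x$ variable.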
For smooth data, this equation is well-posed (i.e. has one and only one solution, which depends
continuously on the data of the problem), in a suitable space of classic solutions [74] (and in suitable
spaces of viscosity or weak solutions in weighted Sobolev spaces as well, cf. section 9.2).
for a related quantile f of the Gaussian law (like f = 4, see e.g. Crépey [55]). Once D is chosen,
one discretizes it in space, constructing a uniform grid $\{x_j\}$ with
$$x_j = x - \ell + \frac{2j\ell}{m+1}, \quad \text{for } 0 \le j \le m+1$$
(with m odd, so that x lies in the space grid; otherwise some kind of interpolation has to be used,
cf. Remark 11.1).
$$Au = \tfrac12\sigma^2\,\partial^2_{x^2}u + b\,\partial_x u - ru$$
by a discrete operator $A^k$ acting on $\mathbb{R}^m$-valued vectors $u^k = (u^k(t, x_1), \dots, u^k(t, x_m))$. The easiest and most natural is to take:
$$A^k u^k(x_j) = \tfrac12\sigma^2\,\delta^2_{x^2}u^k(x_j) + b\,\delta_x u^k(x_j) - r\,u^k(x_j) \tag{49}$$
with
$$\delta_x u^k(x_j) = \frac{1}{2k}\big(u^k(x_{j+1}) - u^k(x_{j-1})\big), \qquad \delta^2_{x^2}u^k(x_j) = \frac{1}{k^2}\big(u^k(x_{j+1}) - 2u^k(x_j) + u^k(x_{j-1})\big)$$
where $u^k(x_0) = u^k(x - \ell)$ and $u^k(x_{m+1}) = u^k(x + \ell)$ are notations for quantities to be defined below using the $u^k(x_j)$, $j = 1 \dots m$. By using well chosen Taylor expansions, it is possible to show that $\delta_x$ and $\delta^2_{x^2}$ are consistent approximations of order two of the spatial differential operators $\partial_x$ and $\partial^2_{x^2}$, respectively, meaning that for regular test functions $\varphi$:
$$\delta_x\varphi(x) = \partial_x\varphi(x) + O(k^2), \qquad \delta^2_{x^2}\varphi(x) = \partial^2_{x^2}\varphi(x) + O(k^2).$$
Remark 11.1 If $|\kappa|/\sigma^2$ is not small, a less precise but more stable finite difference approximation for $\partial_x$ is
$$\delta_x u^k(x_j) = \begin{cases} \frac1k\big(u^k(x_j) - u^k(x_{j-1})\big) & \text{if } b < 0 \\ \frac1k\big(u^k(x_{j+1}) - u^k(x_j)\big) & \text{if } b > 0. \end{cases}$$
This alternative discretization for $\partial_x$, called upwind discretization, is based on the consideration of the characteristics of the limiting hyperbolic transport equation with $\sigma = 0$.
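The consistency orders of the central and upwind differences can be checked numerically. The sketch below is only an illustration under assumptions of ours: the test function ($\sin$), the evaluation point and the mesh steps are arbitrary choices, and the function names are hypothetical. Halving $k$ should divide the error of the central differences by about 4 (order two) and that of the upwind difference by about 2 (order one).

```python
import numpy as np

def central_first(f, x, k):
    # Central difference delta_x, as in (49).
    return (f(x + k) - f(x - k)) / (2.0 * k)

def central_second(f, x, k):
    # Central second difference delta^2_{x^2}, as in (49).
    return (f(x + k) - 2.0 * f(x) + f(x - k)) / k**2

def upwind_first(f, x, k, b):
    # One-sided difference with the sign convention of Remark 11.1.
    if b < 0:
        return (f(x) - f(x - k)) / k
    return (f(x + k) - f(x)) / k

# Illustrative choices (assumptions): test function sin, point x0, steps k.
x0, steps = 1.0, (0.1, 0.05, 0.025)
err_central = [abs(central_first(np.sin, x0, k) - np.cos(x0)) for k in steps]
err_second = [abs(central_second(np.sin, x0, k) + np.sin(x0)) for k in steps]
err_upwind = [abs(upwind_first(np.sin, x0, k, b=1.0) - np.cos(x0)) for k in steps]

def ratios(errs):
    # Error reduction factors when the mesh step is halved.
    return [e0 / e1 for e0, e1 in zip(errs, errs[1:])]

print(ratios(err_central), ratios(err_second), ratios(err_upwind))
```

The loss of one order of accuracy is the price paid for the better stability of the upwind choice when the transport term dominates.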
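$$\begin{cases} u^k(T, x_j) = \psi(x_j) \\ u^k(t, x - \ell) = \psi(x - \ell), \quad u^k(t, x + \ell) = \psi(x + \ell) \\ \frac{d}{dt}u^k(t, x_j) + A^k u^k(t, x_j) = 0 \ \text{ for } 0 \le t < T, \ 1 \le j \le m, \end{cases} \tag{50}$$
$$\begin{cases} u^k(T, x_j) = \psi(x_j) \\ u^k(t, x_1) = u^k(t, x - \ell) + k\,\partial_x\psi(x - \ell) \\ u^k(t, x_m) = u^k(t, x + \ell) - k\,\partial_x\psi(x + \ell) \\ \frac{d}{dt}u^k(t, x_j) + A^k u^k(t, x_j) = 0 \ \text{ for } 0 \le t < T, \ 1 \le j \le m. \end{cases} \tag{51}$$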
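Set
$$\alpha = \frac{\sigma^2}{2k^2} - \frac{b}{2k}, \quad \beta = -\frac{\sigma^2}{k^2} - r, \quad \gamma = \frac{\sigma^2}{2k^2} + \frac{b}{2k} \tag{52}$$
$$A^k u^k(t) = A^k u^k(t) + v^k,$$
$$A^k = \begin{pmatrix} \beta & \gamma & 0 & \cdots & 0 \\ \alpha & \beta & \gamma & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & \alpha & \beta & \gamma \\ 0 & \cdots & 0 & \alpha & \beta \end{pmatrix}, \quad v^k = \begin{pmatrix} \alpha\,\psi(x - \ell) \\ 0 \\ \vdots \\ 0 \\ \gamma\,\psi(x + \ell) \end{pmatrix} \tag{53}$$
$$A^k = \begin{pmatrix} \beta + \alpha & \gamma & 0 & \cdots & 0 \\ \alpha & \beta & \gamma & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & \alpha & \beta & \gamma \\ 0 & \cdots & 0 & \alpha & \beta + \gamma \end{pmatrix}, \quad v^k = \begin{pmatrix} -\alpha k\,\frac{\partial\psi}{\partial x}(x - \ell) \\ 0 \\ \vdots \\ 0 \\ \gamma k\,\frac{\partial\psi}{\partial x}(x + \ell) \end{pmatrix} \tag{54}$$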
Remarks 11.1 If $m$ were even, $x$ would not belong to $\{x_j ;\ 0 \le j \le m + 1\}$. In this case one could use linear interpolation to compute the option value corresponding to the initial stock price at time 0. This price would thus be approximated by $\frac12\big(u^k(0, x_{m/2}) + u^k(0, x_{m/2+1})\big)$.
11.3 Theta-schemes
We now discuss discretization in time. The standard theta-scheme ($\theta \in [0, 1]$) for the parabolic equation (46) may be summarized as follows (see e.g. [102, 118]): Fix a time discretization step $h$ such that $T = nh$, and construct a fully discrete approximation $u^k_h(t_i, x_j) = u_i(x_j)$ where the $u_i$, $i = 0 \dots n$, are $\mathbb{R}^m$-valued vectors such that:
$$\begin{cases} u_n = \psi^k \\ \frac{u_{i+1} - u_i}{h} + A^k\big(u_{i+1} + \theta(u_i - u_{i+1})\big) = 0 \ \text{ for } 0 \le i \le n - 1 \end{cases} \tag{55}$$
For $\theta = 0$, we get the so-called Euler explicit scheme. Likewise, for $\theta = 1$ the scheme is the fully implicit Euler scheme, and for $\theta = \frac12$ it is the Crank-Nicolson scheme.
Once we have computed $u^k_h$, we recover the delta $\Delta = e^{-x}\partial_x u$ by its approximation given by
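First, let us discuss the case $\theta = 0$ (explicit scheme). Using the definition of $A^k$ (considering say Dirichlet conditions), the approximating scheme (55) reduces to (cf. (52)): $u_n = \psi^k$, and for $0 \le i \le n - 1$, $1 \le j \le m$:
$$u_i(x_j) = h\alpha\,u_{i+1}(x_{j-1}) + (1 + h\beta)\,u_{i+1}(x_j) + h\gamma\,u_{i+1}(x_{j+1}).$$
One can show that this scheme is:
• stable, provided $h \le \frac{k^2}{\sigma^2 + rk^2}$ (and $\sigma^2 > |b|k$, but this is always satisfied for $k$ small enough);
• consistent of order one in time and two in space.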
11.3.2 Implicit Methods
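When we choose $1 \ge \theta > 0$, we have to solve, at each time step, a linear system
$$A u_i = B u_{i+1} + h v^k$$
with a tridiagonal matrix
$$A = \begin{pmatrix} b_1 & c_1 & 0 & \cdots & 0 \\ a_2 & b_2 & c_2 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & a_{m-1} & b_{m-1} & c_{m-1} \\ 0 & \cdots & 0 & a_m & b_m \end{pmatrix}.$$
For example, in the case of natural Dirichlet boundary condition, $A$ is given by, for every $j$:
$$a_j = \theta h\Big(\frac{b}{2k} - \frac{\sigma^2}{2k^2}\Big), \quad b_j = 1 + \theta h\Big(r + \frac{\sigma^2}{k^2}\Big), \quad c_j = -\theta h\Big(\frac{b}{2k} + \frac{\sigma^2}{2k^2}\Big)$$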
The fully implicit and the Crank-Nicolson schemes correspond to $\theta = 1$ and $\theta = \frac12$, respectively.
One can show that the implicit ($\theta \ne 0$) approximation theta-schemes for the BS pricing equation (46) are:
• unconditionally stable for $\theta \ge \frac12$;
• consistent of order one in time and two in space, with the notable exception of the Crank-Nicolson scheme which is consistent of order two in time and space.
In Figure 8, we plot the relative error at time 0 as a function of the spot price $S_0$, obtained when pricing a European vanilla call option with the explicit, fully implicit and Crank-Nicolson theta-schemes, respectively (for $r = 10\%$, $\sigma = 20\%$, $T = 1$y, $K = 100$). The Crank-Nicolson scheme is more accurate (at least around the money), as expected.
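As an illustration, here is a minimal sketch of the theta-scheme (55) with the Dirichlet data of (50) and (53), applied to a European put and compared with the Black-Scholes closed formula. The parameter values, the localization width (six standard deviations), the use of dense matrices and all function names are assumptions of ours, chosen for readability rather than speed.

```python
import numpy as np
from math import log, sqrt, exp, erf

def bs_put(S, K, r, q, sigma, T):
    # Black-Scholes closed formula for a European put (reference value).
    d1 = (log(S / K) + (r - q + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
    return K * exp(-r * T) * N(-d2) - S * exp(-q * T) * N(-d1)

def theta_scheme_put(theta, S0=100.0, K=100.0, r=0.1, q=0.0, sigma=0.2,
                     T=1.0, m=199, n=200, width=6.0):
    b = r - q - 0.5 * sigma**2
    ell = width * sigma * sqrt(T)            # localization half-width (assumed)
    k = 2.0 * ell / (m + 1)                  # space step
    x = log(S0) - ell + k * np.arange(m + 2) # x_0, ..., x_{m+1}
    psi = np.maximum(K - np.exp(x), 0.0)     # put payoff in the x variable
    alpha = sigma**2 / (2 * k**2) - b / (2 * k)
    beta = -sigma**2 / k**2 - r
    gamma = sigma**2 / (2 * k**2) + b / (2 * k)
    Ak = (np.diag(np.full(m, beta)) + np.diag(np.full(m - 1, gamma), 1)
          + np.diag(np.full(m - 1, alpha), -1))
    vk = np.zeros(m)
    vk[0], vk[-1] = alpha * psi[0], gamma * psi[-1]   # Dirichlet data, cf. (53)
    h = T / n
    A = np.eye(m) - theta * h * Ak           # implicit part
    B = np.eye(m) + (1.0 - theta) * h * Ak   # explicit part
    u = psi[1:-1].copy()                     # u_n = psi^k
    for _ in range(n):                       # backward in time: A u_i = B u_{i+1} + h v^k
        u = np.linalg.solve(A, B @ u + h * vk)
    return u[(m - 1) // 2]                   # m odd: the middle node is x = ln(S0)

reference = bs_put(100.0, 100.0, 0.1, 0.0, 0.2, 1.0)
price_cn = theta_scheme_put(0.5)
price_implicit = theta_scheme_put(1.0)
print(reference, price_cn, price_implicit)
```

With $\theta = 0$ the same function runs the explicit scheme, but the number of time steps $n$ must then be large enough for the stability condition on $h$ to hold. In practice the dense solve would be replaced by a tridiagonal solver.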
All implicit theta-schemes require at each time step the resolution of a linear system Au = v, where
u and v are m-dimensional vectors. Let us describe two algorithms of resolution of such linear
systems.
Gauss Factorization This algorithm is based on the fact that a regular matrix can be factorized as $A = LU$, where $L$ is a lower triangular matrix, and $U$ is an upper triangular matrix with all ones on its diagonal. The linear system $LUz = v$ is decomposed into $Ly = v$, $Uz = y$. It is easy to see that $A$ tridiagonal implies that $L$, $U$ are also banded (in fact bidiagonal), so only the upper diagonal of $U$ and the two diagonals of $L$ need to be found. This results in the following procedure, known as Thomas' algorithm [118, 102]:
$$b'_m = b_m, \quad y_m = v_m$$
For $1 \le j \le m - 1$, $j$ decreasing:
$$b'_j = b_j - c_j a_{j+1}/b'_{j+1}, \qquad y_j = v_j - c_j y_{j+1}/b'_{j+1},$$
followed by
$$z_1 = y_1/b'_1$$
For $2 \le j \le m$, $j$ increasing:
$$z_j = (y_j - a_j z_{j-1})/b'_j.$$
Remark 11.2 Note that this method presupposes that all the $b'_j$ (called the pivots) are non zero.
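The two sweeps of Thomas' algorithm translate almost line by line into code. The following sketch uses 0-based indices; the function name and the small random test system are illustrative assumptions, and no pivoting is performed (cf. Remark 11.2).

```python
import numpy as np

def thomas_solve(a, b, c, v):
    """Solve a[j]*z[j-1] + b[j]*z[j] + c[j]*z[j+1] = v[j], j = 0..m-1.

    a[0] and c[m-1] are ignored. All pivots are assumed non zero.
    """
    m = len(b)
    bp = np.array(b, dtype=float)   # modified diagonal b'
    y = np.array(v, dtype=float)    # modified right-hand side
    # Backward sweep (j decreasing): eliminate the upper diagonal.
    for j in range(m - 2, -1, -1):
        w = c[j] / bp[j + 1]
        bp[j] = b[j] - w * a[j + 1]
        y[j] = v[j] - w * y[j + 1]
    # Forward sweep (j increasing): solve the remaining lower bidiagonal system.
    z = np.empty(m)
    z[0] = y[0] / bp[0]
    for j in range(1, m):
        z[j] = (y[j] - a[j] * z[j - 1]) / bp[j]
    return z

# Illustrative diagonally dominant system (an assumption, for the demo only).
rng = np.random.default_rng(0)
m_demo = 6
a_demo = rng.uniform(-1.0, 0.0, m_demo)
c_demo = rng.uniform(-1.0, 0.0, m_demo)
b_demo = 3.0 + rng.uniform(0.0, 1.0, m_demo)
v_demo = rng.uniform(size=m_demo)
z_demo = thomas_solve(a_demo, b_demo, c_demo, v_demo)
dense_demo = np.diag(b_demo) + np.diag(a_demo[1:], -1) + np.diag(c_demo[:-1], 1)
print(np.max(np.abs(dense_demo @ z_demo - v_demo)))
```

The cost is linear in $m$, against cubic for a dense factorization, which is why the tridiagonal structure is worth exploiting at each time step.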
[Figure 8: relative error as a function of the spot value, for the Crank-Nicolson, implicit and explicit schemes (NT = 50, NX = 101 in each case).]
SOR Iterative Methods An alternative, which in the case of tridiagonal systems is justified only by its programming simplicity, is to use the Successive Over-Relaxation iterative scheme. The idea is to decompose $A$ as $A = D + R$ where $D$ is the diagonal part of $A$. The linear system $Au = v$ is then rewritten $Du = v - Ru$. The solution is computed as the limit of a converging sequence
$$u^{p+1} = D^{-1}(v - Ru^p), \tag{56}$$
or, better:
$$u^{p+1} = u^p + \omega(\tilde u^{p+1} - u^p)$$
for some $1 < \omega < 2$, where $\tilde u^{p+1}$ stands for the r.h.s. of (56). More precisely, the algorithm writes as follows:
• Step 0 Choose $u^0 \ge 0$, $\varepsilon > 0$, $1 < \omega < 2$. Set $p = 0$.
• Step 1 (Jacobi iteration) Form an intermediate vector $h^{p+1} = (h^{p+1}_j)_{1 \le j \le m}$ by
$$h^{p+1}_j = \frac{1}{A_{jj}}\Big(v_j - \sum_{l<j} A_{jl}u^p_l - \sum_{l>j} A_{jl}u^p_l\Big), \quad 1 \le j \le m \tag{57}$$
• Step 2 (relaxation) Set $u^{p+1} = u^p + \omega(h^{p+1} - u^p)$ and set $p = p + 1$.
• Step 3 Repeat steps 1 and 2 until $|u^{p+1} - u^p| < \varepsilon$.
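A minimal sketch of an SOR solver follows. Note the assumption: it uses the usual Gauss-Seidel form of the sweep, in which the components $l < j$ already updated during the current sweep enter the sum, rather than the pure Jacobi form of (57). The test matrix (symmetric positive definite, tridiagonal), the value of $\omega$ and the names are illustrative choices of ours.

```python
import numpy as np

def sor_solve(A, v, omega=1.2, eps=1e-12, max_iter=10_000):
    """Successive Over-Relaxation for A u = v (Gauss-Seidel sweeps, assumed variant)."""
    m = len(v)
    u = np.zeros(m)
    for _ in range(max_iter):
        u_old = u.copy()
        for j in range(m):
            # Components l < j are already updated, components l > j are not.
            s = v[j] - A[j, :j] @ u[:j] - A[j, j + 1:] @ u[j + 1:]
            u[j] = u_old[j] + omega * (s / A[j, j] - u_old[j])
        if np.max(np.abs(u - u_old)) < eps:
            break
    return u

# Illustrative symmetric positive definite tridiagonal system (assumption).
A_demo = 2.5 * np.eye(8) - np.eye(8, k=1) - np.eye(8, k=-1)
v_demo = np.arange(1.0, 9.0)
u_demo = sor_solve(A_demo, v_demo)
print(np.max(np.abs(A_demo @ u_demo - v_demo)))
```

For a symmetric positive definite matrix this iteration converges for any $0 < \omega < 2$; the best $\omega$ depends on the matrix and is usually tuned empirically.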
$$\frac{dS_u}{S_{u-}} = (\kappa - \lambda\bar J)\,du + \sigma\,dW_u + d\Big(\sum_{l=1}^{N_u} J_l\Big), \quad S_{T-t} = s \tag{59}$$
where $N$ is a Poisson process with deterministic jump intensity $\lambda$, and the $J_l$ are positive, independent random variables, $\nu$ denoting the law of $j_1 := \ln(1 + J_1)$ ($\bar J = \mathbb{E}J_1$; the other data in (59) are defined as usual). Moreover, $N$, $W$, and the $J_l$ are independent. So in case $\nu = \mathcal{N}(\alpha, \beta)$ we recover the RN Merton model, cf. section 5.3.
One can show as in section 11.1 (see also the Appendix) that the price of a European option in the RN jump-diffusion (59) can be formulated in terms of the solution to a Partial Integro-Differential Equation (PIDE). So, $u$ denoting the pricing function in the returns variable $x = \ln S$, we have formally as in section 11.1, with $\doteq$ standing for equality up to a martingale term:
$$e^{rt}\,d\big(e^{-rt}u(t, X_t)\big) \doteq (\partial_t u + A_X u - ru)(t, X_t)\,dt,$$
$$A_X u = (a + \lambda\bar j - \lambda\bar j)\,\partial_x u + \frac12\sigma^2\,\partial^2_{x^2}u + \lambda\big[\mathbb{E}u(x + j_1) - u(x)\big]. \tag{60}$$
By the usual arbitrage argument, the price at time $t$ of the option is thus given by $\Pi_t = u(t, X_t)$ where $u$ solves the integro-parabolic equation
$$\begin{cases} u(T, x) = \psi(x), & x \in \mathbb{R} \\ \partial_t u + Au + Bu = 0 & \text{in } [0, T) \times \mathbb{R} \end{cases} \tag{61}$$
with
$$Au = \frac12\sigma^2\,\partial^2_{x^2}u + a\,\partial_x u - ru, \qquad Bu = \lambda\int_{\mathbb{R}}\big(u(t, x + z) - u(t, x)\big)\,\nu(dz)$$
(note that the operator $A$ of this section reduces to the operator $A$ of the previous section when $\lambda = 0$, since in this case $a = b - \lambda\bar J = b$).
11.4.1 Localization
Localization works essentially as in section 11.2, except for the fact that:
• $b$ is replaced by $a$ in (48),
• the boundary $\partial D = \{x - \ell\} \cup \{x + \ell\}$ is replaced by the boundary layer $\partial D_j = [x - \ell - z_{\min}^-, x - \ell] \cup [x + \ell, x + \ell + z_{\max}^+]$ (cf. step (2) in section 10.1.1), where $z_{\min}$ and $z_{\max}$ are such that
$$\int_{z_{\min}}^{z_{\max}}\nu(dz) \approx \int_{-\infty}^{\infty}\nu(dz) - \varepsilon = 1 - \varepsilon, \quad \varepsilon \ll 1;$$
where $\varphi$ is a suitable Dirichlet condition such that $\varphi(T, x) = \psi(x)$ (e.g., $\varphi(t, x) \equiv \psi(x)$), and $B_\ell$ is such that, for $x \in D$:
$$B_\ell u(x) = \lambda\int_{D - x}u(t, x + z)\,\nu(dz) - \lambda u(t, x) + \lambda\int_{\partial D_j - x}\varphi(t, x + z)\,\nu(dz) \tag{63}$$
11.4.2 Discretization
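We define:
$$B_\ell u(t, x) = \Big[\lambda\int_{D - x}u(t, x + z)\zeta(z)\,dz - \lambda u(t, x)\Big] + \Big[\lambda\int_{\partial D_j - x}\varphi(t, x + z)\zeta(z)\,dz\Big] = \bar B_\ell u(t, x) + \Phi(t, x)$$
$$\bar B_\ell u(\cdot, x_j) \approx B^k u(\cdot, x_j) = \lambda k\sum_{i = j_l - j}^{j_u - j}u(\cdot, x_{j+i})\,\zeta(\cdot, x_i) - \lambda u(\cdot, x_j)$$
$$\Phi(\cdot, x_j) \approx \Phi^k(\cdot, x_j) = \lambda k\sum_{\{i + j < j_l\} \cup \{i + j > j_u\}}\varphi(\cdot, x_{j+i})\,\zeta(\cdot, x_i)$$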
FT approximation Recall that the discrete correlation of two functions $g_j$, $h_j$, each periodic with period $m$, is defined by
$$\rho(g, h)_j = \sum_{l=0}^{m-1}g_{j+l}\,h_l.$$
The discrete correlation theorem says that the discrete Fourier transform of the discrete correlation of two real functions $g$ and $h$ is such that
$$\langle\rho(g, h)\rangle_j = \langle g\rangle_j\,\overline{\langle h\rangle_j}$$
where $\langle\cdot\rangle$ is the discrete Fourier transform operator, and the bar denotes complex conjugation.
Then we can compute the discrete correlation using the FFT as follows: FFT the two data sets, multiply one resulting transform by the complex conjugate of the other and inverse transform the product. The result will formally be a complex vector of length $m$. However, it will turn out to have all its imaginary parts equal to zero since the original data sets were both real.
We can apply this procedure to the $B^k$ and $\Phi^k$ operators (with the related vectors suitably extended by zero padding outside $j_l, \dots, j_u$, see [129]).
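The FFT recipe for the discrete correlation can be checked against the direct $O(m^2)$ sum. The sketch below relies on NumPy's FFT conventions; the vector length and the random data are arbitrary illustrative choices, and the function names are ours.

```python
import numpy as np

def correlation_direct(g, h):
    # rho(g, h)_j = sum_l g_{j+l} h_l, indices taken modulo the period m.
    m = len(g)
    return np.array([sum(g[(j + l) % m] * h[l] for l in range(m))
                     for j in range(m)])

def correlation_fft(g, h):
    # FFT both data sets, multiply one transform by the conjugate of the
    # other, and inverse transform the product.
    return np.fft.ifft(np.fft.fft(g) * np.conj(np.fft.fft(h)))

rng = np.random.default_rng(1)
g_demo = rng.normal(size=16)
h_demo = rng.normal(size=16)
rho_direct = correlation_direct(g_demo, h_demo)
rho_fft = correlation_fft(g_demo, h_demo)
print(np.max(np.abs(rho_fft.imag)), np.max(np.abs(rho_fft.real - rho_direct)))
```

The FFT route costs $O(m\log m)$ per evaluation of the integral term, which is what makes the explicit treatment of the jump part affordable on fine grids.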
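$$A^k u(\cdot, x_j) = \Big[a\,\delta_x + \frac{\sigma^2}{2}\,\delta_{xx} - r\Big]u(\cdot, x_j)$$
where
$$\delta^2_{x^2}u(\cdot, x_j) = \frac{u(\cdot, x_{j+1}) - 2u(\cdot, x_j) + u(\cdot, x_{j-1})}{k^2},$$
$$\alpha = \begin{cases} \frac12 & \text{if } ka \le \frac{\sigma^2}{2} \\ 0 & \text{if } ka > \frac{\sigma^2}{2}, \ a > 0 \\ 1 & \text{if } ka > \frac{\sigma^2}{2}, \ a < 0 \end{cases}$$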
Theta-schemes We dene the time grid {ti | ti = hi, i = 0, . . . , n}. Denoting uji = u(ti , xj ), we
consider the following discrete operator:
uji+1 −uji
h + Ak [θA uji + (1 − θA )uji+1 ]+
(65)
j
ui = ϕji , j = 0, . . . , jl − 1 and j = ju + 1, . . . , n
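$$u^j_i = p_1 u^{j-1}_{i+1} + p_2 u^j_{i+1} + p_3 u^{j+1}_{i+1} + h\lambda\Big(k\sum_{i = i_l}^{i_u}\zeta_i u^{j+i}_{i+1} - u^j_{i+1} + k\sum_{\{i + j < j_l\} \cup \{i + j > j_u\}}\zeta_i\varphi^{j+i}_{i+1}\Big), \quad j = j_l, \dots, j_u, \tag{66}$$
where
$$p_1 = h\Big(\frac{\sigma^2}{2k^2} + (1 - \alpha)\frac{a}{k}\Big), \quad p_2 = 1 - h\Big(\frac{\sigma^2}{k^2} + (1 - 2\alpha)\frac{a}{k} + r\Big), \quad p_3 = h\Big(\frac{\sigma^2}{2k^2} - \alpha\frac{a}{k}\Big).$$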
`Asymmetric' scheme $\theta_A = \frac12$, $\theta_B = 0$: Stable and efficient, but some accuracy is lost due to the asymmetric treatment of the continuous and jump parts.
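We have to solve, for $i = n - 1, \dots, 0$, the linear system $Au_i = B_{i+1}$, where $u_i = (u^0_i, \dots, u^{j_l}_i, \dots, u^{j_u}_i, \dots, u^m_i)^T$,
$$A = \begin{pmatrix} I_l & & 0 \\ & \tilde A & \\ 0 & & I_u \end{pmatrix}, \tag{67}$$
and finally
$$B_{i+1} = \big(\varphi^0_{i+1}, \dots, \varphi^{j_l - 1}_{i+1}, f^{j_l}_{i+1}, \dots, f^{j_u}_{i+1}, \varphi^{j_u + 1}_{i+1}, \dots, \varphi^m_{i+1}\big)^T, \tag{68}$$
where, for $j = j_l, \dots, j_u$,
$$f^j_{i+1} = -a_0 u^{j-1}_{i+1} + (2 - a_1)u^j_{i+1} - a_2 u^{j+1}_{i+1} + h\lambda\Big(k\sum_{i = i_l}^{i_u}\zeta_i u^{j+i}_{i+1} - u^j_{i+1} + k\sum_{\{i + j < j_l\} \cup \{i + j > j_u\}}\zeta_i\varphi^{j+i}_{i+1}\Big).$$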
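with $X_t = \ln(S_t)$, where $u$ is the unique viscosity solution satisfying suitable growth conditions to the following obstacle problem in the log-return space variable $x = \ln S$:
$$\begin{cases} u(T, \cdot) = \psi(T, \cdot) & \text{on } \mathbb{R} \\ \max(\partial_t u + Au, \ \psi - u) = 0 & \text{on } [0, T) \times \mathbb{R} \end{cases} \tag{69}$$
Equivalently, $u$ can be characterized as the unique solution to the following variational inequality problem:
$$\begin{cases} u(T, \cdot) = \psi(T, \cdot) \\ \langle\partial_t u + Au, \ v - u\rangle \le 0, \quad v \ge \psi \end{cases} \tag{70}$$
where $\langle\cdot, \cdot\rangle$ denotes the scalar product in a suitable weighted Sobolev space.
To support this equivalence, note that provided $u \ge \psi$, (69) implies that, for any $v \ge \psi$:
$$\langle\partial_t u + Au, \ v - u\rangle = \langle\partial_t u + Au, \ v - \psi\rangle + \langle\partial_t u + Au, \ \psi - u\rangle \le 0,$$
hence (70). Conversely, still assuming $u \ge \psi$, (70) implies that for any non-negative test-function $\varphi$, we have $\langle\partial_t u + Au, \varphi\rangle \le 0$ (take $v = u + \varphi$), hence $\partial_t u + Au \le 0$ and $\langle\partial_t u + Au, \psi - u\rangle \ge 0$. But taking $v = \psi$ in (70) gives $\langle\partial_t u + Au, \psi - u\rangle \le 0$, so that finally $\langle\partial_t u + Au, \psi - u\rangle = 0$, hence (69).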
12.2 Splitting methods
Let us first describe the computational treatment of the obstacle problem (69), using the so-called splitting method. The idea is to construct recursively, by dynamic programming, the approximate solution $u^j_i$, starting from $u_n = \psi^k(T)$, and computing $u_i$ from $u_{i+1}$, for $0 \le i \le n - 1$, in two steps as follows:
• Step 1 Solve the following Cauchy problem (in discrete space) on $[ih, (i+1)h] \times D^k$ (cf. section 11):
$$\begin{cases} u((i+1)h, \cdot) = u_{i+1} & \text{on } D^k \\ \partial_t u + A^k u = 0 & \text{on } [ih, (i+1)h) \times D^k; \end{cases}$$
denote by $u_{i+\frac12}$ the solution $u$ at time $ih$;
• Step 2 Set $u_i = \max\big(\psi^k(t_i), u_{i+\frac12}\big)$.
By Barles et al. [17, 19], this scheme converges to the unique viscosity solution of (69) satisfying suitable growth conditions, namely the pricing function $u$.
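The two steps of the splitting method can be sketched as follows for an American put. This is only an illustration under assumptions of ours: Step 1 is approximated by a single fully implicit Euler step, the boundary data are the Dirichlet conditions given by the payoff, and the parameters, localization width and names are arbitrary.

```python
import numpy as np
from math import log, sqrt

def american_put_splitting(S0=100.0, K=100.0, r=0.1, q=0.0, sigma=0.2,
                           T=1.0, m=199, n=200, width=6.0):
    b = r - q - 0.5 * sigma**2
    ell = width * sigma * sqrt(T)             # localization half-width (assumed)
    k = 2.0 * ell / (m + 1)
    x = log(S0) - ell + k * np.arange(m + 2)  # x_0, ..., x_{m+1}
    psi = np.maximum(K - np.exp(x), 0.0)      # obstacle = put payoff
    alpha = sigma**2 / (2 * k**2) - b / (2 * k)
    beta = -sigma**2 / k**2 - r
    gamma = sigma**2 / (2 * k**2) + b / (2 * k)
    Ak = (np.diag(np.full(m, beta)) + np.diag(np.full(m - 1, gamma), 1)
          + np.diag(np.full(m - 1, alpha), -1))
    vk = np.zeros(m)
    vk[0], vk[-1] = alpha * psi[0], gamma * psi[-1]
    h = T / n
    M = np.eye(m) - h * Ak
    obstacle = psi[1:-1]
    u = obstacle.copy()                       # u_n = psi^k(T)
    for _ in range(n):
        # Step 1: one implicit Euler step for the linear Cauchy problem.
        u_half = np.linalg.solve(M, u + h * vk)
        # Step 2: projection onto the obstacle.
        u = np.maximum(obstacle, u_half)
    return x[1:-1], u, obstacle

grid, price, payoff = american_put_splitting()
mid = (len(price) - 1) // 2                   # node x = ln(S0)
print(price[mid])
```

By construction the computed price dominates the payoff at every node, and at the money it exceeds the European put price for these parameters, reflecting the early exercise premium.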
12.3 Linear Complementarity Problem
Let us now describe the computational treatment of the variational inequalities (70). We refer to [86, 79] for a detailed presentation. It is well-known that after localization and discretization the variational inequality problem (70) can be expressed as a linear complementarity (LC) problem. At each time step $i$, we thus have to solve:
$$\begin{cases} AX \ge G \\ X \ge \Phi \\ (AX - G, \ X - \Phi) = 0 \end{cases} \tag{71}$$
$$A = \begin{pmatrix} b & c & & \cdots & 0 \\ a & b & c & \cdots & 0 \\ & \ddots & \ddots & \ddots & \\ & & a & b & c \\ 0 & \cdots & & a & b \end{pmatrix} \quad \text{with} \quad \begin{cases} a = \theta h\Big(-\frac{\sigma^2}{2k^2} + \frac{1}{2k}b\Big) \\ b = 1 + \theta h\Big(\frac{\sigma^2}{k^2} + r\Big) \\ c = -\theta h\Big(\frac{\sigma^2}{2k^2} + \frac{1}{2k}b\Big). \end{cases}$$
There exist three classic algorithms to solve the linear complementarity problem (71).
A method by Brennan and Schwarz [38] This method consists in introducing an auxiliary LC problem. A rigorous justification of the convergence of this algorithm, in the case of an American put option, was given in Jaillet et al. [86].
PSOR Method The LC problem (71) can be written as follows: find vectors $W = (w_j)_{1 \le j \le m}$ and $Z = (z_j)_{1 \le j \le m}$ in $\mathbb{R}^m$ such that
$$\begin{cases} W = AZ + V & (72.1) \\ W \ge 0, \ Z \ge 0 & (72.2) \\ (W, Z) = 0 & (72.3) \end{cases} \tag{72}$$
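For illustration, here is a sketch of a projected SOR iteration applied directly to the LC problem in the form (71). The iteration itself is not spelled out in the text above; the version below (a Gauss-Seidel sweep with relaxation, followed by projection onto the constraint $X \ge \Phi$) is the common textbook form and should be read as an assumption, as should the test data and the names.

```python
import numpy as np

def psor(A, G, Phi, omega=1.2, eps=1e-12, max_iter=10_000):
    """Projected SOR for: A X >= G, X >= Phi, (A X - G, X - Phi) = 0."""
    m = len(G)
    X = np.array(Phi, dtype=float)            # start from the obstacle
    for _ in range(max_iter):
        X_old = X.copy()
        for j in range(m):
            # Gauss-Seidel value for component j.
            gs = (G[j] - A[j, :j] @ X[:j] - A[j, j + 1:] @ X[j + 1:]) / A[j, j]
            # Relaxation, then projection onto {X_j >= Phi_j}.
            X[j] = max(Phi[j], X[j] + omega * (gs - X[j]))
        if np.max(np.abs(X - X_old)) < eps:
            break
    return X

# Illustrative data (assumptions): SPD tridiagonal matrix, zero obstacle.
A_lc = 2.5 * np.eye(8) - np.eye(8, k=1) - np.eye(8, k=-1)
G_lc = np.array([1.0, -2.0, 0.5, -1.0, 2.0, -0.5, 1.0, -3.0])
Phi_lc = np.zeros(8)
X_lc = psor(A_lc, G_lc, Phi_lc)
residual_lc = A_lc @ X_lc - G_lc
print(X_lc, residual_lc)
```

At convergence each component either sits on the obstacle (with a non-negative residual) or solves its linear equation exactly, which is precisely the complementarity condition.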
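In the RN BS2D model, the underlying stock prices satisfy the following stochastic differential equations:
$$\begin{cases} dS^1_t = S^1_t(\kappa_1\,dt + \sigma_{11}\,dW^1_t + \sigma_{12}\,dW^2_t), & \ln(S^1_0) = x^1_0 \\ dS^2_t = S^2_t(\kappa_2\,dt + \sigma_{21}\,dW^1_t + \sigma_{22}\,dW^2_t), & \ln(S^2_0) = x^2_0 \end{cases},$$
$$\begin{pmatrix} \kappa_1 \\ \kappa_2 \end{pmatrix} = \begin{pmatrix} r - q_1 \\ r - q_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} = \begin{pmatrix} \sigma_1 & 0 \\ \rho\sigma_2 & \sqrt{1 - \rho^2}\,\sigma_2 \end{pmatrix},$$
$$\Gamma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}.$$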
In order to apply the ADI method, we would rather work with the underlying decorrelated bidimensional Brownian motion.
Let us introduce some notation:
• $b$ is the vector with components $(\kappa_1 - \frac12\sigma_1^2, \ \kappa_2 - \frac12\sigma_2^2)$,
• $e^x = (e^{x_1}, e^{x_2})$, for $x = (x_1, x_2)$,
• $\varphi(t, w) = \psi(e^{x + bt + \sigma w})$, so that the payoff of the option writes $\varphi(t, W_t)$.
The price at time 0 of a European vanilla option on $(S^1_T, S^2_T)$ is then given by:
where $u$ is the unique viscosity solution with suitable growth conditions to the following two-dimensional PDE:
$$\begin{cases} \partial_t u(t, w_1, w_2) + \frac12\partial^2_{w_1^2}u(t, w_1, w_2) + \frac12\partial^2_{w_2^2}u(t, w_1, w_2) - ru(t, w_1, w_2) = 0 & \text{on } [0, T) \times \mathbb{R}^2 \\ u(T, w_1, w_2) = \varphi(T, w_1, w_2) & \text{on } \mathbb{R}^2. \end{cases} \tag{73}$$
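$$\begin{pmatrix} \ln S^1 \\ \ln S^2 \end{pmatrix} = A + \Sigma\begin{pmatrix} W^1 \\ W^2 \end{pmatrix}$$
$$\begin{pmatrix} S^1\partial_{S^1} \\ S^2\partial_{S^2} \end{pmatrix} = \Sigma^{-1}\begin{pmatrix} \partial_{W^1} \\ \partial_{W^2} \end{pmatrix} = \frac{1}{\sigma_1\sigma_2\sqrt{1 - \rho^2}}\begin{pmatrix} \sigma_2\sqrt{1 - \rho^2} & 0 \\ -\sigma_2\rho & \sigma_1 \end{pmatrix}\begin{pmatrix} \partial_{W^1} \\ \partial_{W^2} \end{pmatrix}.$$
The deltas are then retrieved by using suitable finite difference approximations for $\partial_{w_1}u$ and $\partial_{w_2}u$, e.g.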
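Alternating Direction Implicit methods were originally proposed by Peaceman and Rachford [127]. At each time step, one integrates `in each direction' by using the usual finite difference method for 1D problems in two steps, the first one in $w_1$ and the second one in $w_2$, as follows:
$$\begin{cases} \frac2h\big(u_{i+1} - u_{i+\frac12}\big) + \frac12\delta^2_{w_1^2}u_{i+\frac12} + \frac12\delta^2_{w_2^2}u_{i+1} - \frac12 ru_{i+\frac12} - \frac12 ru_{i+1} = 0 \\ \frac2h\big(u_{i+\frac12} - u_i\big) + \frac12\delta^2_{w_1^2}u_{i+\frac12} + \frac12\delta^2_{w_2^2}u_i - \frac12 ru_{i+\frac12} - \frac12 ru_i = 0, \end{cases}$$
that is:
$$\begin{cases} \big[(1 + \frac{hr}{4})\mathrm{Id} - \frac h4\delta^2_{w_1^2}\big]u_{i+\frac12} = \big[(1 - \frac{hr}{4})\mathrm{Id} + \frac h4\delta^2_{w_2^2}\big]u_{i+1} \\ \big[(1 + \frac{hr}{4})\mathrm{Id} - \frac h4\delta^2_{w_2^2}\big]u_i = \big[(1 - \frac{hr}{4})\mathrm{Id} + \frac h4\delta^2_{w_1^2}\big]u_{i+\frac12}, \end{cases} \tag{74}$$
with
$$\begin{cases} Au^{\cdot, j_2}_{i+\frac12} = Bu^{\cdot, j_2}_{i+1}, & \text{for any } j_2 \\ Cu^{j_1, \cdot}_i = Du^{j_1, \cdot}_{i+\frac12}, & \text{for any } j_1 \end{cases}$$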
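The two half-steps of the ADI scheme (74) can be illustrated on a toy problem with a known solution. The setting below is an assumption made for the sake of a checkable example, not an option pricing problem: the PDE (73) is posed on the unit square with zero Dirichlet boundary values and terminal condition $\sin(\pi w_1)\sin(\pi w_2)$, for which the exact solution is $e^{-(\pi^2 + r)(T - t)}\sin(\pi w_1)\sin(\pi w_2)$. Dense matrices and the names used are also illustrative choices.

```python
import numpy as np

def adi_toy(T=0.1, r=0.05, m=39, n=50):
    k = 1.0 / (m + 1)                         # space step in both directions
    h = T / n                                 # time step
    w = k * np.arange(1, m + 1)               # interior nodes
    # Second difference operator with zero Dirichlet boundary values.
    D2 = (np.diag(np.full(m, -2.0)) + np.diag(np.ones(m - 1), 1)
          + np.diag(np.ones(m - 1), -1)) / k**2
    I = np.eye(m)
    L = (1.0 + h * r / 4.0) * I - (h / 4.0) * D2   # implicit factor in (74)
    R = (1.0 - h * r / 4.0) * I + (h / 4.0) * D2   # explicit factor in (74)
    U = np.outer(np.sin(np.pi * w), np.sin(np.pi * w))  # U[j1, j2] at time T
    for _ in range(n):
        # First half-step: implicit in w1 (rows), explicit in w2 (columns).
        U_half = np.linalg.solve(L, U @ R.T)
        # Second half-step: implicit in w2, explicit in w1.
        U = np.linalg.solve(L, (R @ U_half).T).T
    return w, U

def exact_toy(w, T=0.1, r=0.05):
    # Exact solution at time 0 of the toy problem.
    return np.exp(-(np.pi**2 + r) * T) * np.outer(np.sin(np.pi * w),
                                                  np.sin(np.pi * w))

w_toy, U_toy = adi_toy()
adi_error = np.max(np.abs(U_toy - exact_toy(w_toy)))
print(adi_error)
```

Each half-step only requires the solution of one-dimensional (tridiagonal) systems, one per grid line, which is the whole point of the method.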
Problem
Again:
• we localize in space problem (75) on a domain $\bar D = [-\ell, \ell]^2$, introducing a suitable boundary condition $u = \psi$ on $(0, T) \times \partial D$;
• we introduce a grid of mesh points $(t, x, y) = (ih, j_1k_1, j_2k_2)$ on $\bar D$ and we discretize the localized problem on the grid;
• a suitable finite difference approximation scheme on the grid can be obtained by combining the previous ADI finite difference method with the splitting method of section 12.2 (see [143]).
This can be established by a formal application of the Itô semimartingale formula to the price process $u_t = u(t, S_t, M_t)$. Note that $M$ is a non-decreasing process, yet it is not absolutely continuous, thus the relevant Itô formula is not contained in (147). Here the relevant Itô formula is the general Itô formula for semimartingales (see e.g. [48, 85]), which gives:
$$du_t = \Big(\frac12\sigma^2S^2\,\partial^2_{S^2}u + \kappa S\,\partial_S u\Big)dt + \partial_M u\,dM_t + \sigma S\,\partial_S u\,dW_t$$
Remarks 14.1 (i) This can also be derived by letting $n$ tend to $\infty$ in the pricing PDE of the option with approximating payoff $\phi(S_T, M_T(n))$, where
$$M_t(n) = \Big(\int_0^t S_s^n\,ds\Big)^{\frac1n}, \qquad dM_t(n) = \frac1n\,\frac{S_t^n}{M_t(n)^{n-1}}\,dt.$$
(ii) When $S_t$ gets close to its current maximum $M_t$, then it becomes very likely that the related value of $M_t$ (current maximum at time $t$) will not remain the maximum up to expiry. It follows that the option value is insensitive to small changes in the value of $M_t$ in this case, which provides an intuitive insight into the fact that $\partial_M u = 0$ on $S = M$.
$$\tilde\phi(S_T, M_T) = \varphi(S_T)\mathbb{1}_{\{M_T < U\}} + R\,\mathbb{1}_{\{M_T \ge U\}}.$$
This payoff is paid at time $\tau \wedge T$, where $\tau$ is the first hitting time of the barrier level $U$ by the underlying $S$. So in case $\{\tau > T\}$ the amount $\varphi(S_T)$ is paid to the holder at time $T$, otherwise the amount $R$ is paid to the holder at time $\tau$. Equivalently, the following payoff is paid at time $T$:
The related price process is thus given by $e^{-r(T-t)}\,\mathbb{E}_t\,\phi(S_T, \tau)$. Note that until time $\tau \wedge T$, this is also equal to
Such options were created as a way to provide the insurance value of an option without charging as much premium. If you believe that IBM will go up this year, but you're willing to bet that it won't go above 100, then you can buy the barrier option and pay less premium than for the vanilla option.
Other forms of barrier options are up and in, down and out, down and in, double in, and double out options.
In some cases, the rebate may be paid at the maturity time $T$, rather than at the time of the barrier event. In the latter cases, barrier options are special cases of lookback options, so that they may be handled as in the previous section. Their pricing functions thus solve relevant two-dimensional Cauchy problems in the variables $t$, $S$, $M$.
However, in all cases, it is better to use the fact that, by representations such as (76), the BS pre-barrier event pricing functions of barrier options generically solve one-dimensional Cauchy-Dirichlet problems of the following kind:
$$\partial_t u + \frac12\sigma^2\,\partial^2_{x^2}u + b\,\partial_x u - ru = 0 \quad \text{on } [0, T) \times D,$$
Let us give some examples (in the case of rebates paid at the time of the barrier event):
• Down Out Barrier $L$: $D = (L, x + \ell)$, $R(t, L) = R$;
• Down In Barrier $L$: $D = (L, x + \ell)$, $\psi(x) = R$, $R(t, L) = C(t, L)$, where $C$ is the price of a vanilla
One uses linear interpolation to find the price value and delta value corresponding to the initial stock price. If the initial stock price is close to the barrier, a one-sided second-order finite difference approximation should be used for the delta.
Price at time t:
Rt RT !+
−r(T −t) t
Su du + t
Su du
Πt = e Et K− . (79)
T −t
2D PDE To markovianize the payo (78), we introduce an additional state variable (process) It =
R t
t
Su du. In this new variable, the payo writes:
+
IT
φ(IT ) = K − .
T −t
∂t u + AS,I u = ru, t ≤ t < T
(80)
u(T, S, I) = φ(I)
with
$$A_{S,I} = \frac12\sigma^2S^2\,\partial^2_{S^2} + \kappa S\,\partial_S + S\,\partial_I. \tag{81}$$
The following Proposition follows by application of El Karoui et al. [66, Théorème 4.2].
Proposition 14.1 Πt = u(t, St , It ), t ∈ [t, T ], where u is the unique bounded viscosity solution of (80).
Remarks 14.2 (i) Note that the form of equation (80) may be derived by a formal application of the Itô formula (formula (147), diffusive case) to the discounted price process $e^{-rt}u(t, S_t, I_t)$. The function $u$ is not of class $C^{1,2}$, however, so that this derivation is purely formal. Yet it can be turned into a rigorous argument by resorting to the notion of viscosity solutions of equation (80) (see section 9.2.1).
(ii) The operator (81) is degenerate in the $I$ variable, so that specific numerical treatments are necessary to solve (80) (`PDE in dimension $1\frac12$', see Zvan et al. [151]).
Scaling We dene
t
−ItT
Z
ItT = −K(T − t) + Su du , ηt =
t (T − t)St
K(T −t)− tT Su du
R
(thus in particular ηT = (T −t)ST
). So, by elementary Itô calculus:
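$$d\eta_t = -\frac{dt}{T - t} - \eta_t(\kappa - \sigma^2)\,dt - \eta_t\sigma\,dW_t$$
$$A_\eta = -\Big(\frac{1}{T - t} + (\kappa - \sigma^2)\eta\Big)\partial_\eta + \frac12\eta^2\sigma^2\,\partial^2_{\eta^2}. \tag{82}$$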
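Proposition 14.2 We have $\Pi_t = S_t\,v(t, \eta_t)$, where the reduced price function $v$ is the unique classic solution of the following 1D PDE:
$$\begin{cases} \partial_t v - \Big(\frac{1}{T - t} + \kappa\eta\Big)\partial_\eta v + \frac12\sigma^2\eta^2\,\partial^2_{\eta^2}v = qv \\ v(T, \eta) = \eta^+. \end{cases} \tag{83}$$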
Proof of Proposition 14.2 (see Rogers-Shi [135]; see also Barraquand [20]). First note that equation (83) has a unique classic solution $v$, by Friedman [74]. We thus have by elementary Itô calculus, with $\doteq$ standing for equality up to a martingale term:
$$e^{rt}\,d\big(e^{-rt}S_t\,v(t, \eta_t)\big) \doteq \big(-rS_t\,v(t, \eta_t) + v(t, \eta_t)\kappa S_t\big)dt + \big(S_t(\partial_t v + A_\eta v)(t, \eta_t) - \sigma S_t\,\eta\,\sigma\,\partial_\eta v(t, \eta_t)\big)dt.$$
This PDE is a non degenerate 1D PDE which can be solved numerically by finite difference theta-schemes. However care should be taken of the fact that the transport term is dominant in this equation (because of the coefficient $\frac{1}{T - t}$), but this can be handled by a further transformation of the problem (see [109]).
Price at time t: Rt Rτ !+
−r(τ −t) t
Su du + t
Su du
Πt = esssupτ ∈Tt Et e K− . (85)
τ −t
+
function
Rt Iτ
Let It stand for
t
Su du, so φ((St )t≤t≤τ ) = K− τ −t
, which is a of τ and Iτ . This function
is also denoted by φ in the sequel. We introduce the following variational inequality (obstacle problem) on
(t, T ] × R?+ 2 :
min(−∂t u − LS,I u + ru , u − φ(t, I)) = 0 , t<t<T
(86)
u(T, S, I) = φ(T, I)
where LS,I was dened in (81). Note that the obstacle function φ(t, I) is singular at t = t.
Proposition 14.3 (Crépey[55]) Πt = u(t, St , It ), t ∈ (t, T ], where u is the unique bounded viscosity solu-
tion of (86). Moreover, the price at time t is given by the radial limit:
Figure 9 provides a numerical illustration of the last point in Proposition 14.3 (radial convergence at time
t). Numerically it seems that convergence also occurs along radii α other than St , yet at a slower rate than
for α = St .
Again, the 2D PDEs in this section (PDE problems (80) and (86)) are degenerate in the $I$ variable, so that specific numerical treatments are necessary, particularly in the singular American case (see Zvan et al. [151]).
[Figure 9, two panels: Price (left) and Delta (right) plotted against Tau, for StartingAverage = 0, 100 and 200.]
Figure 9: Convergence of the price and delta of an American Asian put at (t, St , α(t − t)) as t → t,
for α = 0, St or 2St .
N
!+
T X
φ((St )t≤t≤T ) = K− St , (87)
N i=0 i
relative to a set of observation dates $t_1, \dots, t_n$. As shown in Wilmott et al. [147], this kind of discretely path-dependent payoff can be priced by PDE methods, after a suitable extension of the state space. In this section we shall exemplify this technique on the case of cliquet options. Proceeding along the same lines, it is possible to price by finite differences any discretely path-dependent payoff, like the discretely monitored Asian payoff (87), or volatility and variance swaps (see [62, 149]).
Note that the pricing of such products by MC methods is straightforward in BS, but it becomes very difficult in simple extensions of BS, such as the Uncertain Volatility model [13], whereas the PDE methods can easily be adapted to such frameworks (see e.g. Windcliff et al. [145, 148]).
Also note that in the case of discretely monitored Asian options, this approach is both closer to the clauses
of the product, and also easier from the numerical point of view. Indeed, instead of a 2D PDE degenerate
1
in one direction (problems in dimension `1 2 ', unless specic dimension reduction techniques are available,
2
see section 14.3), we get, in rough terms, m1 1D PDE problems to solve, where m1 is a generic number of
mesh points per space dimension.
the spot return on the period $[t_{i-1}, t_i]$. The payoff of a cliquet is defined by (up to a constant factor, the option's notional):
$$\phi = \max\Big(F_g, \ \min\Big(C_g, \ \sum_{i=1}^n\max\big(F_l, \min(C_l, R_i)\big)\Big)\Big)$$
Moreover, the factor process $(Z, P, S)$ is Markovian in the RN BS model for $S$. In the BS model we then have $\Pi_t = u(t, S_t, P_t, Z_t)$, for a Borel (deterministic) function $u$ of $t$, $S_t$, $P_t$ and $Z_t$. Moreover, one can show that the pricing function $u = u_i$ on $(t_{i-1}, t_i]$, $i = 1, \dots, n$, where $(u_i)_{1 \le i \le n}$ is the unique sequence of viscosity solutions with suitable growth conditions of the following PDE cascade:
$$\begin{cases} u_n(T, S, P, Z) = \max(F_g, \min(C_g, nZ)) & \text{on } \mathbb{R}_+^3 \\ \partial_t u_i + \frac12\sigma^2S^2\,\partial^2_{S^2}u_i + \kappa S\,\partial_S u_i - ru_i = 0 & \text{on } (t_{i-1}, t_i) \times \mathbb{R}^3 \\ u_{i-1}(t_{i-1}, S, P, Z) = u_i(t_{i-1}, S, P^+, Z^+) & \text{on } \mathbb{R}^3 \end{cases} \tag{88}$$
where $P^+$ and $Z^+$ are obtained via the following jump conditions at the monitoring dates $t_i$:
$$P^+ = S, \qquad Z^+ = Z + \frac{R - Z}{i} \tag{89}$$
with $R = \max\big(F_l, \min(C_l, \frac{S - P}{P})\big)$.
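The role of the auxiliary variables can be illustrated on a single path: updating $(P, Z)$ through the jump conditions (89) at each monitoring date and applying the terminal condition of (88) gives back the cliquet payoff computed directly from the capped and floored returns. In the sketch below, the initial values $P = S_{t_0}$ and $Z = 0$, the sample path, the caps and floors and the function names are assumptions made for the example.

```python
from math import isclose

def local_return(S, P, Fl, Cl):
    # Capped and floored return R of (89) over one monitoring period.
    return max(Fl, min(Cl, (S - P) / P))

def cliquet_payoff_recursive(spots, Fl, Cl, Fg, Cg):
    """Payoff via the state variables (P, Z); spots = [S_{t_0}, ..., S_{t_n}]."""
    P, Z = spots[0], 0.0                 # assumed initial state
    n = len(spots) - 1
    for i in range(1, n + 1):
        S = spots[i]
        R = local_return(S, P, Fl, Cl)
        Z = Z + (R - Z) / i              # running average of the returns
        P = S                            # previous observed spot
    return max(Fg, min(Cg, n * Z))       # terminal condition of (88)

def cliquet_payoff_direct(spots, Fl, Cl, Fg, Cg):
    total = sum(local_return(spots[i], spots[i - 1], Fl, Cl)
                for i in range(1, len(spots)))
    return max(Fg, min(Cg, total))

path = [100.0, 104.0, 97.0, 99.5, 108.0, 107.0]   # illustrative path
args = dict(Fl=-0.01, Cl=0.05, Fg=0.0, Cg=0.2)
payoff_recursive = cliquet_payoff_recursive(path, **args)
payoff_direct = cliquet_payoff_direct(path, **args)
print(payoff_recursive, payoff_direct)
```

Since $Z$ is the running average of the local returns, $nZ$ at maturity is their sum, which is why the terminal condition in (88) involves $nZ$.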
In order to solve (88), we localize in space the problem on a compact set
with:
$$S_{\min} = 0, \quad S_{\max} = S_0(1 + f\sigma\sqrt T), \quad Z_{\min} = \min(0, F_l), \quad Z_{\max} = C_l$$
where $f$ is a suitable factor, e.g. $f = 6$. Recommended boundary conditions are $\partial^2_{S^2}u = 0$ on $S = S_{\min}$ and $S = S_{\max}$.
We then put a finite difference mesh on $\bar D$. It is better to use a non uniform mesh in the $P$-direction, in order to concentrate the mesh points near the spot price $S_0$. In the $S$-direction, Windcliff et al. [148] suggest to have a fine node spacing near the diagonal of $(S, P)$. A uniform grid may be used in the $Z$-direction.
Note that linear interpolation is required to implement the jump conditions (89).
Between monitoring dates, the BS equation (88) can be discretized by a standard finite difference scheme (theta-scheme) in the $t$, $S$ variables, defined on the three-dimensional state space $(S, P, Z)$.
Finally, interpolation of degree 2 in space is used for computing the price, the delta and the gamma at $(0, S_0, P_0, Z_0)$.
Part IV
Tree Methods
Tree methods (historically starting with binomial trees) are but another point of view on explicit finite difference methods, namely the interpretation of an explicit finite difference scheme in terms of a related Markov chain.
Trees are natural in finance because such Markov chains can also be considered as `realistic' models of trading and hedging in discrete time, that can be used per se, without reference to any continuous-time model.
Moreover, trees compute prices and Greeks in a cone starting from $(t_0, S_0)$, rather than on the whole band $[t_0, T] \times \mathbb{R}_+^*$ as implicit finite difference methods do. In finance, the main concern is to get prices and Greeks at $(t_0, S_0)$. So the extra work furnished by implicit schemes (theta-schemes with $\theta \ne 0$) for computing the solution at any $(t_0, S)$ is in a sense wasted time (naturally the motivation for using implicit theta-schemes is not to compute the solution at any $(t_0, S)$, but rather to improve the stability of the schemes, and sometimes the accuracy as well, as with the Crank-Nicolson scheme).
Today it is fair to say that from a practical point of view, binomial trees are obsolete as compared with finite difference (or finite element) technologies. However, in a number of situations, trinomial trees remain a simple, flexible and good enough alternative. Trees are easy to customize due to the visualization of the paths of the underlying in the Markov chain approximation (see e.g. section 16). So it is often straightforward to design a tree algorithm for the pricing of a somewhat involved contingent claim.
From the theoretical point of view (convergence analysis), the Markov chain interpretation opens the way to alternative probabilistic convergence proofs based on Kushner's theorem [99, 98], which says that the so-called local consistency conditions, that is, the matching of the first and second conditional moments of the increments of the approximating chain with those of the continuous-time limit with accuracy $o(h)$, grant the convergence of the expectations of usual functionals.
In fact the setting in Kushner's theorem is a jump-diffusion stochastic control setting, and Kushner's result deals with the convergence of the optimal controlled chain to the optimal controlled process. It is by far the most powerful tool for dealing with the convergence of Markov chain based algorithms.
with $h = \frac Tn$, where $T$ is the time to maturity of an option (the current date is taken to be zero), $n$ is the number of time steps in the tree, and
$$u = e^{\sigma\sqrt h}, \quad d = e^{-\sigma\sqrt h}, \quad p = \frac{e^{\kappa h} - d}{u - d}$$
One can show that $p$ is the unique risk-neutral probability in the CRR tree, in the sense that the unique arbitrage price process of a European vanilla option with payoff $\phi(S^h_T)$ in the CRR tree is given by, for $i = 0, \dots, n$:
$$\Pi^h_i = \mathbb{E}\big(e^{-r(T - ih)}\phi(S^h_n) \mid S^h_i\big) \tag{91}$$
If the option is American, the (unique arbitrage) price is given by, for $i = 0, \dots, n$:
$$\Pi^h_i = \operatorname{esssup}_{\tau \in \mathcal{T}^h_i}\mathbb{E}\big(e^{-r(\tau - ih)}\phi(S^h_\tau) \mid S^h_i\big) \tag{92}$$
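Convergence of the marginal law of the spot at maturity Assume $s = 1$, without loss of generality. For $\lambda \in \mathbb{R}$, we have:
$$\begin{aligned} \mathbb{E}\big[\exp(i\lambda\ln S^h_n)\big] &= \mathbb{E}\Big[\exp\Big(i\lambda\ln\prod_{i=0}^{n-1}\frac{S^h_{i+1}}{S^h_i}\Big)\Big] \\ &= \Big(\mathbb{E}\big[\exp(i\lambda\ln S^h_1)\big]\Big)^n \\ &= \Big(p\exp\big(i\lambda\sigma\sqrt h\big) + (1 - p)\exp\big(-i\lambda\sigma\sqrt h\big)\Big)^n, \end{aligned}$$
and since $p = \frac{e^{\kappa h} - d}{u - d} \sim \frac12 + \frac{b}{2\sigma}\sqrt h + O(h)$, we have as $h \to 0+$:
$$\begin{aligned} \mathbb{E}\big[\exp(i\lambda\ln S^h_n)\big] &\sim \Big(1 + \Big(i\lambda b - \lambda^2\frac{\sigma^2}{2}\Big)\frac Tn\Big)^n \\ &\to \exp\Big(\Big(i\lambda b - \lambda^2\frac{\sigma^2}{2}\Big)T\Big) \\ &= \mathbb{E}\big[\exp(i\lambda(bT + \sigma W_T))\big] \\ &= \mathbb{E}\big[\exp(i\lambda\ln(S_T))\big] \end{aligned}$$
where $S$ denotes the risk-neutral BS spot process. We thus have convergence of the characteristic function $\Phi^h_n(\lambda) = \mathbb{E}\big[\exp(i\lambda\ln S^h_n)\big]$ of the RN CRR log-spot $\ln(S^h_n)$ to the characteristic function $\Phi_T(\lambda) = \mathbb{E}\big[\exp(i\lambda\ln(S_T))\big]$ of the RN BS log-spot $\ln(S_T)$, hence
$$S^h_n \xrightarrow{\ \mathcal{L}\ } S_T \quad \text{as } n \to \infty. \tag{93}$$
This grants the convergence of the price of European vanilla options with continuous and bounded payoffs, e.g. put options. The convergence of call prices follows by call-put parity, since the CRR scheme satisfies the call-put parity relationship.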
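The convergence of the characteristic functions can be observed numerically. The sketch below evaluates the CRR characteristic function in closed form (with $s = 1$, as above) and compares it with the Gaussian limit; the values of $\kappa$, $\sigma$, $T$ and $\lambda$ are arbitrary illustrative choices, and the function names are ours.

```python
import numpy as np

def crr_char_fn(lam, n, kappa=0.1, sigma=0.2, T=1.0):
    # Characteristic function of ln(S_n^h) in the CRR tree, for s = 1.
    h = T / n
    u, d = np.exp(sigma * np.sqrt(h)), np.exp(-sigma * np.sqrt(h))
    p = (np.exp(kappa * h) - d) / (u - d)
    one_step = (p * np.exp(1j * lam * sigma * np.sqrt(h))
                + (1.0 - p) * np.exp(-1j * lam * sigma * np.sqrt(h)))
    return one_step ** n

def bs_char_fn(lam, kappa=0.1, sigma=0.2, T=1.0):
    # Characteristic function of ln(S_T) = b T + sigma W_T in the BS model.
    b = kappa - 0.5 * sigma**2
    return np.exp((1j * lam * b - 0.5 * lam**2 * sigma**2) * T)

lam_demo = 1.5
gaps = [abs(crr_char_fn(lam_demo, n) - bs_char_fn(lam_demo))
        for n in (10, 100, 1000, 10000)]
print(gaps)
```

The gap shrinks as the number of time steps grows, in line with the convergence in law (93).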
Some remarks about the limit $n \to \infty$ The limit law depends on $p\,e^{i\lambda\ln(u)} + (1 - p)\,e^{i\lambda\ln(d)}$ only through its Taylor expansion up to $o(h)$. Thus $u$, $d$ and/or $p$ can be altered as long as the involved terms of the expansion are not modified.
The upper and lower values of the spot at maturity are $u^n = e^{\sigma\sqrt T\sqrt n}$ and $d^n = e^{-\sigma\sqrt T\sqrt n}$, whereas the ratio of two successive points is $\frac ud = e^{2\sigma\sqrt h} = e^{2\sigma\sqrt{T/n}}$. Thus at the same time, the span of the law of $S^h_n$ goes to $\mathbb{R}_+^*$ and the grid gets more and more dense. It is easy to show that the points visited by the process $S^h$ eventually become dense in $[0, T] \times \mathbb{R}_+^*$.
Convergence of the delta For bounded and sufficiently regular payoffs, one can show by elementary computations based on the mean value property, that the CRR delta of an option:
$$\Delta^h_0 = \frac{\Pi^h_1(us) - \Pi^h_1(ds)}{(u - d)s}$$
converges towards the BS delta $\Delta_0 = \partial_s\mathbb{E}\phi(S^s_T)$ as $h \to 0+$. This result can be extended to the put payoff by density, and then to the call payoff by call-put parity.
American options We still have convergence of the prices and deltas, though this cannot be proven by elementary computations as in the European case.
By (91), the CRR pricing function of a European option satisfies $\Pi^h_n(S^h) = \phi(S^h)$, and for $i = n-1, \dots, 0$:
$$\Pi^h_i(S^h) = e^{-rh}\left(p\,\Pi^h_{i+1}(uS^h) + (1-p)\,\Pi^h_{i+1}(dS^h)\right) \qquad (94)$$
whereas by (92), the pricing function of an American option satisfies $\Pi^h_n(S^h) = \phi(S^h)$, and for $i = n-1, \dots, 0$:
$$\Pi^h_i(S^h) = \max\left(\phi(S^h),\; e^{-rh}\left(p\,\Pi^h_{i+1}(uS^h) + (1-p)\,\Pi^h_{i+1}(dS^h)\right)\right) \qquad (95)$$
The CRR algorithm is a backward computation of the option price, based on the dynamic programming equations (94) or (95), after a forward computation of the $n + 1$ possible values of the underlying at maturity $S^h_n$. Since the CRR tree is a flat tree, it is easily seen that the value of the underlying at time $i$ and level $j$ (starting from below) is the same as that at time $i + 2$ and level $j + 1$. In particular there are only $2n + 1$ possible values of the underlying between time 0 and time $n$.
For computational purposes, it is clever, in the American case, to compute only once the corresponding values of the payoff of the option (and store them in an array), at the beginning of the algorithm.
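The backward induction (94)-(95) can be sketched as follows. This is a minimal illustration for a put, not the PREMIA implementation; it assumes the standard CRR parameters $u = e^{\sigma\sqrt{h}}$, $d = 1/u$, $\kappa = r - q$, and the function name `crr_put` is hypothetical.

```python
import math

def crr_put(s, strike, r, q, sigma, T, n, american=False):
    """Put price on a CRR tree (assumed: u = exp(sigma*sqrt(h)), d = 1/u)."""
    h = T / n
    u = math.exp(sigma * math.sqrt(h))
    d = 1.0 / u
    p = (math.exp((r - q) * h) - d) / (u - d)   # risk-neutral up probability
    disc = math.exp(-r * h)
    # Flat tree: only the 2n+1 values s*u**k, k = -n..n, ever occur.
    # The payoff is computed once and stored at index k + n.
    payoff = [max(strike - s * u ** k, 0.0) for k in range(-n, n + 1)]
    # Node j (from below) at step i carries the exponent k = 2j - i.
    values = [payoff[2 * j] for j in range(n + 1)]          # step n
    for i in range(n - 1, -1, -1):
        for j in range(i + 1):
            cont = disc * (p * values[j + 1] + (1 - p) * values[j])
            values[j] = max(payoff[2 * j - i + n], cont) if american else cont
    return values[0]
```

For instance, `crr_put(100, 100, 0.05, 0.0, 0.2, 1.0, 500)` approximates the BS put price, and passing `american=True` switches from (94) to (95).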
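$$u = e^{bh + \sigma\sqrt{h}} , \quad d = e^{bh - \sigma\sqrt{h}} , \quad p = \frac{1}{2} .$$
$$p u + (1-p) d = e^{\kappa h}$$
$$p u^2 + (1-p) d^2 - e^{2\kappa h} = e^{2\kappa h}\left(e^{\sigma^2 h} - 1\right) .$$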
Since one degree of freedom remains, a natural idea is to match also the third moment, which gives the equation
$$p u^3 + (1-p) d^3 = e^{3\kappa h} e^{3\sigma^2 h} .$$
The solution of this system is
$$u = \frac{e^{\kappa h} Q}{2}\left[1 + Q + \sqrt{Q^2 + 2Q - 3}\right]$$
$$d = \frac{e^{\kappa h} Q}{2}\left[1 + Q - \sqrt{Q^2 + 2Q - 3}\right]$$
$$p = \frac{e^{\kappa h} - d}{u - d}$$
with $Q = e^{\sigma^2 h}$.
Note that $ud = e^{2\kappa h} Q^2 > 1$: this tree is not symmetric any more.
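As a sanity check, the closed-form solution can be verified numerically against the three moment conditions. The sketch below only restates the formulas above; the function name `third_moment_tree` is hypothetical.

```python
import math

def third_moment_tree(kappa, sigma, h):
    """Return (u, d, p) of the binomial tree matching three moments."""
    m = math.exp(kappa * h)
    Q = math.exp(sigma ** 2 * h)
    root = math.sqrt(Q * Q + 2.0 * Q - 3.0)
    u = 0.5 * m * Q * (1.0 + Q + root)
    d = 0.5 * m * Q * (1.0 + Q - root)
    p = (m - d) / (u - d)
    return u, d, p

def tree_moment(u, d, p, k):
    """k-th moment of the one-step multiplicative factor."""
    return p * u ** k + (1.0 - p) * d ** k
```

With `u, d, p = third_moment_tree(0.05, 0.2, 0.01)`, the values `tree_moment(u, d, p, k)` for k = 1, 2, 3 should reproduce $e^{\kappa h}$, $e^{2\kappa h}Q$ and $e^{3\kappa h}Q^3$ up to rounding.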
like Lévy processes for instance). From the previous computations it is easy to see that the points and
probabilities of the chosen scheme should be constrained by:
$$\sum_j p_j \exp\left(i\lambda \ln u_j\right) = 1 + \left[i\lambda b - \lambda^2 \frac{\sigma^2}{2}\right] h + o(h)$$
in the sense that these conditions ensure convergence of the characteristic functions, hence convergence of
the marginal laws of S (cf. section 15.1).
We shall see in section 15.4.1 that these conditions are in fact tantamount to the local consistency conditions,
which ensure a convergence of a much more general type, by Kushner's theorem.
Note that from a computational point of view, the condition that uj+1 /uj be independent of j must be
imposed in order to get a recombining tree.
A feature common to all trinomial trees is that they give an accurate computation of the Greeks (delta, gamma, and theta), by finite differences.
$$k (p_u - p_d) = bh$$
$$k^2 (p_u + p_d) - k^2 (p_u - p_d)^2 = \sigma^2 h .$$
Kamrad and Ritchken further simplify the last equality, still maintaining an $o(h)$ matching of the variance (which is enough to ensure convergence of the characteristic functions), as
$$k^2 (p_u + p_d) = \sigma^2 h .$$
Denoting $k = \lambda\sigma\sqrt{h}$, this leads to
$$p_u = \frac{1}{2\lambda^2} + \frac{b\sqrt{h}}{2\lambda\sigma}$$
$$p_f = 1 - \frac{1}{\lambda^2}$$
$$p_d = \frac{1}{2\lambda^2} - \frac{b\sqrt{h}}{2\lambda\sigma}$$
The parameter $\lambda$ appears as a free parameter of the geometry of the tree. It is called the stretch parameter, and must satisfy $\lambda \ge 1$, to ensure stability of the tree algorithm. The value $\lambda = 1.22474$, which corresponds to $p_f = \frac{1}{3}$, is reported to be a good choice for an ATM call (or put).
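A minimal sketch of a KR trinomial tree built from these probabilities is given below. It assumes that the tree lives on the log-spot with moves $+k$, $0$, $-k$ and that $b = r - q - \sigma^2/2$; both function names are hypothetical.

```python
import math

def kr_probabilities(b, sigma, h, lam=1.22474):
    """Kamrad-Ritchken probabilities (p_u, p_f, p_d) for stretch lam."""
    shift = b * math.sqrt(h) / (2.0 * lam * sigma)
    pu = 1.0 / (2.0 * lam ** 2) + shift
    pd = 1.0 / (2.0 * lam ** 2) - shift
    return pu, 1.0 - 1.0 / lam ** 2, pd

def kr_put(s, strike, r, q, sigma, T, n, lam=1.22474, american=False):
    """Put price on a KR tree (assumption: b = r - q - sigma^2/2)."""
    h = T / n
    k = lam * sigma * math.sqrt(h)
    pu, pf, pd = kr_probabilities(r - q - 0.5 * sigma ** 2, sigma, h, lam)
    disc = math.exp(-r * h)
    # Flat tree: 2n+1 log-spot levels j*k, j = -n..n; payoff stored once.
    payoff = [max(strike - s * math.exp(j * k), 0.0) for j in range(-n, n + 1)]
    values = payoff[:]                       # terminal layer, index j + n
    for i in range(n - 1, -1, -1):
        new = [0.0] * (2 * n + 1)
        for j in range(-i, i + 1):
            c = disc * (pu * values[j + 1 + n] + pf * values[j + n]
                        + pd * values[j - 1 + n])
            new[j + n] = max(payoff[j + n], c) if american else c
        values = new
    return values[n]
```

With the default stretch, the middle probability is close to 1/3, in line with the remark above.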
$$p_u u + p_f f + p_d d = e^{\kappa h}$$
$$p_u u^2 + p_f f^2 + p_d d^2 = e^{2\kappa h} Q$$
where $Q = e^{\sigma^2 h}$.
Since $p_u + p_f + p_d = 1$, two unknowns remain. The solution corresponding to the additional constraint
$$p_u = p_f = p_d = \frac{1}{3}$$
is
$$u = V + \sqrt{V^2 - f^2}$$
$$d = V - \sqrt{V^2 - f^2}$$
$$f = \frac{e^{\kappa h}(3 - Q)}{2}$$
[Figure 10: European vanilla call priced by a CRR binomial tree, resp. a KR trinomial tree. Two panels: Price versus StepNumber and Delta versus StepNumber, for StepNumber between 50 and 500.]
with $V = \frac{e^{\kappa h}(3 + Q)}{4}$.
In view of Kushner's Theorem, the fact that local consistency implies convergence in law was expected;
what this proposition really says is that for multinomial trees of the form (98), convergence in law implies
a convergence of a much more general form.
The algorithm requires the computation of the intrinsic value at each node. A computational advantage of a flat tree (like the CRR or the KR trees) is that one can compute once and for all the intrinsic values across all the possible values of the underlying ($n + 1$ or $2n + 1$ values for the CRR or the KR trees) before performing the backward scheme. This significantly reduces the computational cost of the algorithm.
(where $\Pi^{bs}$ denotes the BS price of the option), whereas the convergence of the tree for European vanilla options is known to be of order $n^{-1}$.
An alternative method (Ritchken's algorithm [134]) is to feed the algorithm with the right value of the
barrier. The idea here is to choose the stretch parameter λ of a trinomial KR tree (see section 15.3.1)
such that the barrier is hit exactly. We know that λ should be greater than one. Intuitively among many
possibilities for λ, the closer λ is to one, the better.
The natural way to choose $\lambda$ is the following: compute the value of $l$ above, i.e. $l = \left[\frac{\ln(S_0/L)}{\sigma\sqrt{h}}\right]$, where $[t]$ denotes the greatest integer less than or equal to $t$. Then take $\lambda = \frac{1}{l}\,\frac{\ln(S_0/L)}{\sigma\sqrt{h}}$.
Proceeding in this way, convergence is reported to be like that for vanilla options.
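The choice of the stretch parameter can be sketched in a few lines. The function name `ritchken_stretch` is hypothetical, and taking the absolute value of the log-distance (so that barriers above the spot are also handled) is an assumption beyond the formula above.

```python
import math

def ritchken_stretch(s0, barrier, sigma, h):
    """Return (lam, l) such that l steps of size lam*sigma*sqrt(h) in
    log-spot lead exactly from s0 to the barrier, with lam >= 1 as
    close to one as possible."""
    eta = abs(math.log(s0 / barrier)) / (sigma * math.sqrt(h))
    l = math.floor(eta)                     # integer part [eta]
    if l == 0:
        raise ValueError("barrier closer than one tree step: reduce h")
    return eta / l, l
```

For example, with $S_0 = 100$, $L = 90$, $\sigma = 0.2$ and $h = 0.01$, this gives $l = 5$ and a stretch parameter slightly above one.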
In the case of Bermudan options, the American right is in force only at a set of prescribed periods, for instance between a fixed date $T_1$ and maturity $T$ ($T_1$ excluded and $T$ included, say). In this case a natural algorithm is to apply the backward formula (100) between step $n - 1$ and $i_1$ and the standard CRR scheme before $i_1$, where $(i_1 - 1) h \le T_1 < i_1 h$.
This first algorithm is very crude since it gives the same price, $n$ being fixed, for any value of $T_1$ between $(i_1 - 1) h$ and $i_1 h$. A way to feed the algorithm with the right value is to use a KR tree with a stretch parameter $\lambda_1$ and a number of steps $i_1$ between times 0 and $T_1$, and $\lambda_2$ and $i_2$ between $T_1$ and $T$. In order to get a recombining tree we impose the following pasting condition at time $T_1$:
$$\lambda_1 \sqrt{\frac{T_1}{i_1}} = \lambda_2 \sqrt{\frac{T - T_1}{i_2}} .$$
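Recall that $\lambda_1$ and $\lambda_2$ should be greater than one. A possible way to choose the parameters is: first fix $\lambda_1 \ge 1$ (for instance $\lambda_1 = 1.22474$), choose $i_1$, then set
$$i_2 = \left[\frac{i_1 (T - T_1)}{T_1}\right] + 1 ,$$
where $[\cdot]$ denotes the integer part.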
Part V
Monte Carlo Methods
This part is a short overview of Monte Carlo Methods in finance. Comprehensive references are given in section 4. Like deterministic methods, stochastic simulation methods (the terminology `Monte Carlo' was introduced in [115]) can be used in any Markovian model. In the case of European products, they consist of the classic Monte Carlo (MC) loops. For products with early exercise features, stochastic simulation methods are available too, as there are numerical methods, more generally so, for Forward-Backward Stochastic Differential Equations (methods based on dynamic programming and MC computation of conditional expectations, see e.g. [34] and references therein), but this is outside the scope of these notes.
MC methods are attractive by their genericity (genericity of the related theoretical properties, that are insensitive to the dimension of the problem or to the regularity of the payoff function, and genericity of implementation as well), and by the confidence interval they provide on the solution (at least for pseudo MC methods, as opposed to quasi MC methods, see below).
But (pseudo) MC methods are slow (convergence in $\sigma m^{-\frac{1}{2}}$, where $m$ is the sample size and $\sigma$ is the standard deviation of the sampled payoff), unless specific variance reduction techniques are applicable.
An alternative to variance reduction methods is to use quasi MC methods, which converge faster in practice than (pseudo) MC methods. Note however that quasi MC methods do not furnish confidence intervals (unless sophisticated randomized forms of quasi MC methods are used), and that the performances of quasi MC algorithms are strongly altered in high dimension (high effective dimension, as measured by the number of significant risk dimensions in a PCA of the sampled payoff, as opposed to nominal dimension; in financial problems, effective dimension is often much less than nominal dimension, which explains the popularity and success of quasi MC methods in finance).
17 Random numbers
To sample `random' vectors in $\mathbb{R}^d$ with specific distributions, one first draws sequences of `uniform' random vectors $u_j$ over $[0, 1]^d$. Then one transforms the $u_j$ into $x_j$ with the required distribution.
Of course the quality of a generator puts an upper bound on the quality of any simulation method using it.
Since $S$ is finite, the sequence of states is ultimately periodic. The period is the smallest positive integer $\nu$ such that for $j \ge$ some $j_0$, $s_{j+\nu} = s_j$.
$$u_j = \frac{U_j}{N} \in [0, 1]$$
where
$$U_j = (aU_{j-1} + b) \bmod N$$
for fixed integers $a > 0$, $b$ and $N > 0$.
Such generators produce a lot of regularity in sequences and a lattice structure.
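A linear congruential generator and its period can be illustrated as follows. The default constants ($a = 16807$, $b = 0$, $N = 2^{31} - 1$, the classical Park-Miller choice) are an illustrative assumption, not a recommendation from these notes, and the function names are hypothetical.

```python
def lcg(seed, a=16807, b=0, N=2 ** 31 - 1):
    """Yield u_j = U_j / N with U_j = (a * U_{j-1} + b) mod N."""
    U = seed
    while True:
        U = (a * U + b) % N
        yield U / N

def lcg_period(a, b, N, seed=0):
    """Length of the cycle eventually reached by the state sequence."""
    first_seen = {}
    U, j = seed, 0
    while U not in first_seen:
        first_seen[U] = j
        U = (a * U + b) % N
        j += 1
    return j - first_seen[U]
```

On a toy example, `lcg_period(5, 3, 16)` returns the full period 16, whereas `lcg_period(4, 0, 16, seed=1)` collapses to a cycle of length 1, which shows how strongly the quality depends on the constants.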
Random number generator of L'Ecuyer with Bayes & Durham shuffling procedure
This is a combination of two short-period LCGs, hence a longer period generator with good properties.
19 Low-discrepancy sequences
Quasi-random numbers, or successive values of low-discrepancy sequences [123, 122, 121, 117], are not independent. But they have good equidistribution properties on $[0, 1]^d$, hence good convergence properties of $\frac{1}{m}\sum_{j=1}^m \psi(\xi_j)$ to $\int_{[0,1]^d} \psi(u)\,du$ as $m \to \infty$. Let $[|0, x|]$ stand for $\{y \in [0, 1]^d , y \le x\}$, with $y \le x$ if and only if $y_j \le x_j$, $j = 1 \dots d$.
Definition 19.1 Given a $[0, 1]^d$-valued sequence $\xi = (\xi_j)_{j \ge 1}$, and $x \in [0, 1]^d$, we define:
is called the (star) discrepancy for the first $m$ terms of the $\xi$ sequence.
• A sequence $\xi$ is said to be a low-discrepancy sequence if its discrepancy satisfies
$$D_m = O\left(\frac{(\ln m)^d}{m}\right)$$
or if it is asymptotically better than $O\left(\left(\frac{\ln \ln m}{m}\right)^{\frac{1}{2}}\right)$, the discrepancy of a sequence of independent uniform
Quasi-random numbers combine the advantage of a lattice with the advantage that points can be added
incrementally (like points of a random sequence).
But for large dimension $d$, the theoretical bound $\frac{(\ln m)^d}{m}$ may be meaningful for extremely large values of $m$ only.
Low discrepancy sequences perform very well in low dimension. In high dimension $d$, a lattice can only be refined by increasing the number of points by a factor $2^d$.
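To make the notion concrete, here is a minimal sketch of a low-discrepancy sequence and of its use for integration over $[0,1]^2$. It uses the Halton sequence (van der Corput radical inverses in bases 2 and 3) purely because it fits in a few lines; this is a substitute for illustration, not the Sobol sequence shown in the figures of these notes. All function names are hypothetical.

```python
def radical_inverse(j, base):
    """Van der Corput radical inverse of the integer j >= 1."""
    x, f = 0.0, 1.0 / base
    while j > 0:
        j, digit = divmod(j, base)
        x += digit * f
        f /= base
    return x

def halton(j, bases=(2, 3)):
    """j-th point of the Halton sequence in dimension len(bases)."""
    return tuple(radical_inverse(j, b) for b in bases)

def qmc_integral(psi, m, bases=(2, 3)):
    """Approximate the integral of psi over the unit cube by the
    average of psi over the first m Halton points."""
    return sum(psi(halton(j, bases)) for j in range(1, m + 1)) / m
```

For instance, `qmc_integral(lambda u: u[0] * u[1], 4096)` approximates $\int_{[0,1]^2} u_1 u_2\,du = 1/4$. Points can be added incrementally by increasing `m`, without recomputing the earlier ones.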
Orthogonal projections If a $d$-dimensional sequence is uniformly distributed in $[0, 1]^d$, then two-dimensional sequences formed by pairing coordinates should also be uniformly distributed. The appearance
of non-uniformity in such projections (see Figure 11) is an indication of potential problems in using a
quasi-random sequence [117].
[Figure 11: Orthogonal projection on the first, resp. last, two coordinates of the first 10000 points of the Sobol sequence in dimension 160. Two panels: Sobol dim 1 and 2; Sobol dim 159 and 160.]
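Proof Let $\theta$ be uniform over $[0, 2\pi]$ and $\rho^2$ be exponential with parameter $\frac{1}{2}$, independent from $\theta$. Then the pair $(X, Y) = (\rho\cos\theta, \rho\sin\theta)$ is standard Gaussian. Indeed, for every measurable and bounded function $\phi$, we have:
$$E\phi(X, Y) = \int_0^\infty \int_0^{2\pi} \phi(\rho\cos\theta, \rho\sin\theta)\, \frac{1}{2} e^{-\frac{\rho^2}{2}}\, d(\rho^2)\, \frac{d\theta}{2\pi} = \int_0^\infty \int_0^{2\pi} \phi(\rho\cos\theta, \rho\sin\theta)\, \rho\, d\rho\, e^{-\frac{\rho^2}{2}}\, \frac{d\theta}{2\pi} = \int_{\mathbb{R}} \int_{\mathbb{R}} \phi(x, y)\, e^{-\frac{x^2 + y^2}{2}}\, \frac{dx\,dy}{2\pi} . \qquad \Box$$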
Remarks This method requires two independent random values to obtain two Gaussian variables. So it must not be used when the random numbers $u$ and $v$ are generated from two successive values of a one-dimensional low-discrepancy sequence, because $u$ and $v$ are not independent in this case.
To use the Box-Müller transformation with quasi-random numbers, one should thus take special care to generate $u$ and $v$ independently, e.g. from two different one-dimensional sequences, or from a two-dimensional sequence.
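The construction used in the proof translates directly into code. In this sketch, $\rho^2 = -2\ln(1 - u)$ is exponential with parameter $\frac{1}{2}$ and $\theta = 2\pi v$; using $1 - u$ rather than $u$ is only a guard against $\ln 0$ and is an implementation choice, not part of the original statement.

```python
import math

def box_muller(u, v):
    """Map two independent uniforms on [0, 1) to a standard Gaussian pair."""
    rho = math.sqrt(-2.0 * math.log(1.0 - u))   # rho^2 ~ Exp(1/2)
    theta = 2.0 * math.pi * v                    # uniform angle
    return rho * math.cos(theta), rho * math.sin(theta)
```

In a QMC setting, `u` and `v` would be the two coordinates of one point of a two-dimensional sequence, never two successive values of a one-dimensional sequence.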
Simulating Gaussian variables with the inverse method The inverse Gaussian cumulative distribution function $N^{-1}$ is not known in closed form. To use the inverse method for simulating Gaussian variables, we thus need an approximation of $N^{-1}$. Moro's algorithm furnishes a very good and quick approximation.
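$$\Gamma = \Sigma\Sigma^T$$
$$\Gamma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix} ,$$
then
$$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} = \begin{pmatrix} \sigma_1 & 0 \\ \rho\sigma_2 & \sqrt{1 - \rho^2}\,\sigma_2 \end{pmatrix}$$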
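The two-dimensional factorization can be checked and used as follows; a minimal sketch with hypothetical function names, restricted to dimension 2.

```python
import math

def cholesky_2d(sigma1, sigma2, rho):
    """Lower-triangular Sigma with Sigma * Sigma^T = Gamma in dimension 2."""
    return [[sigma1, 0.0],
            [rho * sigma2, math.sqrt(1.0 - rho ** 2) * sigma2]]

def correlated_gaussians(sigma1, sigma2, rho, e1, e2):
    """Turn independent standard Gaussians (e1, e2) into a centered
    Gaussian pair with covariance matrix Gamma."""
    S = cholesky_2d(sigma1, sigma2, rho)
    return S[0][0] * e1, S[1][0] * e1 + S[1][1] * e2
```

Multiplying `cholesky_2d(0.2, 0.3, 0.5)` by its transpose gives back the covariance matrix with entries $\sigma_1^2$, $\rho\sigma_1\sigma_2$ and $\sigma_2^2$.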
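$$\frac{1}{m}\sum_{j=1}^m \phi(x_j) \xrightarrow{a.s.} E[\phi(X)] \quad \text{as } m \to \infty$$
Central Limit Theorem If $\sigma^2 = \mathrm{Var}[\phi(X)] < \infty$, the normalized error converges in law to the Gaussian distribution:
$$\frac{\sqrt{m}}{\sigma}\left(\frac{1}{m}\sum_{j=1}^m \phi(x_j) - E[\phi(X)]\right) \xrightarrow{\mathcal{L}} N(0, 1) \quad \text{as } m \to \infty$$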
Θ = E[φ(X)]
where $\phi$ is some function on $D \subseteq \mathbb{R}^d$ and $X$ is a $D$-valued random vector. Note that $\Theta$ can be expressed as the integral
$$\Theta = \int_D \phi(x)\, dP_X(x) .$$
• An unbiased estimator with $m$ trials for $\Theta$ is given by
$$\Theta_m = \frac{1}{m}\sum_{j=1}^m \phi(x_j)$$
Variance decreases to 0 as $m \to \infty$: the greater $m$, the more accurate the estimator. The speed of convergence of $\Theta_m$ to $\Theta$ is $\frac{\sigma}{\sqrt{m}}$, which is independent of the dimension $d$.
• A confidence interval $IC = [A, B]$ at the threshold (confidence level) $1 - 2\alpha$ for $\Theta$ is built as follows:
$$IC = [\Theta_m - z_\alpha \sigma_m ;\; \Theta_m + z_\alpha \sigma_m]$$
with $z_\alpha = N^{-1}(1 - \alpha)$. So $P(A < \Theta < B) = 1 - 2\alpha$. For instance, if the threshold is set at 95%, then $\alpha = 2.5\%$ and $z_\alpha \approx 1.96$.
A natural stopping criterion consists in breaking the MC loop when $z_\alpha \sigma_m$ becomes less than some percentage (e.g. 1bp $= 10^{-2}\%$) of $\Theta_m$, so that one knows the price with a 1bp relative error, with confidence $1 - 2\alpha$.
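The estimator, its confidence interval and the stopping criterion can be sketched as one loop. The function name, the batch size and the use of the empirical standard deviation in place of the unknown $\sigma$ are assumptions made for this illustration.

```python
import math

def mc_with_confidence(draw_payoff, rel_tol=1e-2, z_alpha=1.96,
                       batch=1000, max_draws=10 ** 6):
    """Run MC until z_alpha * sigma_m < rel_tol * |Theta_m|.
    Returns (Theta_m, half-width of the confidence interval, m)."""
    total = total_sq = 0.0
    m = 0
    while m < max_draws:
        for _ in range(batch):
            y = draw_payoff()
            total += y
            total_sq += y * y
        m += batch
        theta = total / m
        # unbiased empirical variance of the payoff
        var = max(total_sq / m - theta * theta, 0.0) * m / (m - 1)
        half_width = z_alpha * math.sqrt(var / m)   # z_alpha * sigma_m
        if half_width < rel_tol * abs(theta):
            break
    return theta, half_width, m
```

The criterion is tested once per batch rather than at every draw, which keeps the overhead small; `draw_payoff` stands for any function returning one sample of $\phi(X)$.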
21.3 Properties
We briefly summarize some advantages and disadvantages of the Monte Carlo method.
• Advantages:
We can implement this method very easily if we are able to simulate the variable $X$.
It does not require regularity or differentiability properties of the function $\phi$.
The estimator is unbiased.
The error on the estimate can be controlled by the Central Limit Theorem, and we can build a confidence interval.
• Disadvantages: We have to run many simulations to obtain an accurate estimator. Therefore computing time can be very high.
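$$\bar\Theta_m = \frac{1}{2m}\sum_{j=1}^m \left(\phi(x_j) + \phi(\bar x_j)\right)$$
$$\bar\sigma^2_m = \frac{1}{2m}\left(\mathrm{Var}[\phi(X)] + \mathrm{Cov}(\phi(X), \phi(\bar X))\right) ,$$
with $\bar X = 1 - X$.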
The following theorem gives a simple condition ensuring variance reduction with this method.
Factor $\frac{1}{2}$ is due to the sample size for the antithetic method: indeed, the related estimator contains $2m$ terms.
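A minimal sketch of the antithetic estimator for a uniform $X$ on $[0, 1]$, with $\bar X = 1 - X$; the function name is hypothetical.

```python
import math

def antithetic_estimate(phi, m, rng):
    """Average of phi over m uniform draws and their antithetic
    counterparts 1 - x (2m terms in total)."""
    total = 0.0
    for _ in range(m):
        x = rng.random()
        total += phi(x) + phi(1.0 - x)
    return total / (2 * m)
```

For a monotone function such as `math.exp`, the covariance between $\phi(X)$ and $\phi(\bar X)$ is negative, so the estimator of $E[e^X] = e - 1$ has a much smaller variance than a standard estimator with the same number of evaluations.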
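$$\hat\Theta_m = \frac{1}{m}\sum_{j=1}^m \left(\phi(x_j) - \xi(x_j)\right) + c$$
$$\hat\sigma^2_m = \frac{1}{m}\mathrm{Var}[\phi(X) - \xi(X)] = \frac{1}{m}\left[\mathrm{Var}[\phi(X)] + \mathrm{Var}[\xi(X)] - 2\,\mathrm{Cov}(\phi(X), \xi(X))\right] .$$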
Variance reduction with regard to standard Monte Carlo simulation is not guaranteed by this method. To
decrease the variance, functions φ and ξ must have a large positive correlation. It implies a specic choice
for the control variate ξ.
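As an illustration, take $X$ uniform on $[0, 1]$, $\phi(x) = e^x$ and the control variate $\xi(x) = 1 + x$, whose mean $c = 1.5$ is known. This toy choice and the function name are assumptions made for the example.

```python
import math

def control_variate_estimate(phi, xi, c, m, rng):
    """Estimate E[phi(X)] as the mean of phi(x_j) - xi(x_j), plus the
    known mean c = E[xi(X)], for X uniform on [0, 1]."""
    total = 0.0
    for _ in range(m):
        x = rng.random()
        total += phi(x) - xi(x)
    return total / m + c
```

Here $\phi$ and $\xi$ are strongly positively correlated, so $\mathrm{Var}[\phi(X) - \xi(X)]$ is several times smaller than $\mathrm{Var}[\phi(X)]$.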
so
Θ = E[φ(X)] = E[ϕ(Y )]
We thus get the following estimator for Θ:
$$\breve\Theta_m = \frac{1}{m}\sum_{j=1}^m \varphi(y_j)$$
$$\breve\sigma^2_m = \frac{1}{m}\mathrm{Var}[\varphi(Y)] .$$
Variance reduction with regard to standard Monte Carlo simulation is not guaranteed by this method. It depends on the choice of the importance function $\rho$. The minimum of the variance is reached for the following importance density $\rho^*$:
$$\rho^*(x) = \frac{|\phi(x)|\,\mu(x)}{\int |\phi(y)|\,\mu(y)\,dy} \qquad (101)$$
Usually $\rho^*$ is unknown (note that $\Theta$ stands in the denominator of the r.h.s. of (101), for $\phi > 0$). In practice, one chooses $\rho$ by using formula (101) applied to an approximation of the payoff $\phi$ such that the corresponding density is computable.
So the efficiency criterion is essentially independent of the sample size. For instance, if $\varepsilon(p, q) = 3$, it means that method $p$ requires 9 times more computing time than method $q$ to obtain the same accuracy. In other words, with the same computing time, the standard error for method $q$ is 3 times smaller than that of method $p$.
by $\frac{1}{m}\sum_{j=1}^m \psi(\xi_j)$, where $\xi$ is a $d$-dimensional low-discrepancy sequence.
Choosing $\psi = \phi \circ F_X^{-1}$, then $\psi(U) \overset{(law)}{=} \phi(X)$, so $E[\psi(U)] = E[\phi(X)]$.
The definition of the (Hardy-Krause) variation of $\psi$ is rather technical and irrelevant for our purposes. Simply note that in dimension 1, the Hardy-Krause variation of a function coincides with the usual notion of variation, e.g. $\int_{[0,1]} |\partial_u \psi|\, du$, for $\psi$ of class $C^1$. Through the Koksma-Hlawka inequality, we understand the need to have sequences with discrepancy $D_m$ as small as possible.
The Koksma-Hlawka inequality gives an a priori deterministic bound for the error in the approximation of $\int_{[0,1]^d} \psi(x)\,dx$ by the sum $\frac{1}{m}\sum_{j=1}^m \psi(\xi_j)$. This error is expressed in terms of the discrepancy of the sequence and the variation of the function $\psi$.
Nevertheless it is often difficult to calculate or even to estimate the variation of $\psi$. Moreover, since for large dimension $d$ the asymptotic bound $\frac{(\ln m)^d}{m}$ of a low-discrepancy sequence may be meaningful for extremely large values of $m$ only, and because $\frac{(\ln m)^d}{m}$ increases exponentially with $d$, the bound in the Koksma-Hlawka inequality gives no relevant information until a very large number of points is used.
Unlike the Monte Carlo method, the Quasi-Monte Carlo approach does not provide a confidence interval for the estimator. The empirical variance of the sample is not meaningful because successive terms are not independent. This is due to the construction of low-discrepancy sequences.
Another difference in comparison with Monte Carlo is that the convergence rate for QMC simulation depends on the dimension $d$ of the considered model through the discrepancy.
It is worth stressing that low-discrepancy sequences (Sobol sequences in particular) are often more fruitful in finance than in other application areas. This is because the effective dimension of financial problems is often much lower than their nominal dimension. In such cases, one can benefit from all the power of low-discrepancy sequences by assigning the main risk factors of the problem, ordered by decreasing amount of explained variance, to the successive components of the points of a multi-dimensional low-discrepancy sequence. Thus, though we use a low-discrepancy sequence in the nominal dimension of the problem, which may be high, the fact that the first coordinates of the quasi-random points are assigned to the main risk factors of the problem avoids many of the drawbacks generally associated with high-dimensional low-discrepancy sequences.
Now, it happens that under quite general circumstances, Greeks are also given by expressions of the form $\Theta = E[\phi(X)]$, so that all the MC pricing techniques that we have seen so far can also be used for Greeking. In this section we illustrate this on the problem of (Q)MC computing the delta of an option in a Markovian model. We thus want to compute $\Delta_0 = \partial_s E[\phi(S^s_T)]$, where $S^s_T$ is the RN spot with initial condition $s$ at time 0 (we should in fact write $S^{s,\bar s}_T$, where $\bar s$ is the initial condition of the remaining factors in our Markovian model, see the Appendix; since $\bar s$ plays no role in the sequel, we omit it for simplicity).
One obvious method consists in perturbing the initial condition $s$, computing (Q)MC prices for the original and perturbed initial conditions, and deducing a finite differences estimator for $\Delta_0$. But this procedure is both costly (since it involves resimulation) and parameterized and biased (via the perturbation on $s$).
In many cases, direct (without resimulation) approaches are possible: by derivation of the payoff, provided the latter is smooth enough, or by derivation of the transition probability density $p_T(s, S)$ (as a shorthand for $p_T(s, \bar s; S)$) of $S$, provided the latter exists and is smooth enough.
Note that in the case of the BS model everything is elementary; in general, however, the latter two approaches ultimately rely on the theory of stochastic flows and on the Malliavin calculus, respectively (see e.g. [78, 41, 72, 73]).
The expectations in (102) are estimated by (Q)MC simulation. In order to optimize the algorithms, common random numbers should be used to estimate both expectations. Thus, in the BS model, $\Delta_0$ can be best estimated in these approaches by
$$\frac{1}{2\varepsilon s m}\sum_{j=1}^m \left(\phi\left[(1+\varepsilon)s\,e^{bT + \sigma\sqrt{T}\varepsilon_j}\right] - \phi\left[(1-\varepsilon)s\,e^{bT + \sigma\sqrt{T}\varepsilon_j}\right]\right) ,$$
provided $\phi$ is differentiable.
Note that $\phi'$ may be a weak derivative in the sense of distributions, yet it still needs to be a well defined function (unlike a Dirac mass for instance) for applicability of the method. So this method is applicable to the computation of the delta of a call option, not to that of a digital option.
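$$p_T(s, S) = \frac{1}{S\sqrt{2\pi\sigma^2 T}}\, e^{-\frac{\left(\ln\left(\frac{S}{s}\right) - bT\right)^2}{2\sigma^2 T}} , \qquad \partial_1 \ln(p_T(s, S^s_T)) = \frac{W_T}{s\sigma T} ,$$
so
$$\Delta_0 = E\left[\phi(S^s_T)\,\frac{W_T}{s\sigma T}\right]$$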
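The following sketch compares, on the same Gaussian draws, the finite differences estimator with common random numbers and the estimator based on the weight $\frac{W_T}{s\sigma T}$, for a call in the BS model. The discount factor $e^{-rT}$, the absence of dividends ($b = r - \sigma^2/2$) and the function name are assumptions made for this illustration.

```python
import math
import random

def bs_call_deltas(s, strike, r, sigma, T, m, eps=1e-2, seed=0):
    """Return (finite-difference delta, density-weight delta) of a call."""
    rng = random.Random(seed)
    b = r - 0.5 * sigma ** 2        # assumption: no dividends
    call = lambda x: max(x - strike, 0.0)
    fd_sum = lr_sum = 0.0
    for _ in range(m):
        g = rng.gauss(0.0, 1.0)
        growth = math.exp(b * T + sigma * math.sqrt(T) * g)
        # same draw g for the up and down perturbed spots
        fd_sum += (call((1 + eps) * s * growth)
                   - call((1 - eps) * s * growth)) / (2 * eps * s)
        # weight W_T / (s sigma T) with W_T = sqrt(T) * g
        lr_sum += call(s * growth) * math.sqrt(T) * g / (s * sigma * T)
    disc = math.exp(-r * T)
    return disc * fd_sum / m, disc * lr_sum / m
```

Both estimates can be compared with the closed-form BS delta $N(d_1)$; the density-weight estimator typically has the larger variance for a call, but it remains applicable to a digital payoff.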
ST = s exp(bt + σWt ) ,
where ST denotes the spot at maturity T, s is the initial spot, and t is time-to-maturity.
where $\phi$ denotes the payoff of the option, $K$ is the strike and $R$ the rebate (for digital option only),
$$\Delta_m = \frac{1}{m} e^{-rt}\sum_{j=1}^m \partial_s \pi(j) = \frac{1}{m} e^{-rt}\sum_{j=1}^m \delta(j)$$
The values for $\pi(j)$ and $\delta(j)$ are detailed for each option.
$$\pi(j) = (K - S_T(j))^+$$
$$C = P + s e^{-qt} - K e^{-rt}$$
$$\Delta_C = \Delta_P + e^{-qt} ,$$
where $C$ and $P$ respectively denote the call and the put prices. These relations may be used for the call simulation (in order to limit variance).
To have an estimation of the delta in the case of a digital option, we need to use the increment value $\varepsilon$ at each iteration $j$, so:
$$\delta(j) = \frac{1}{2\varepsilon s}\left[\phi\left((1+\varepsilon)S_T(j), K, R\right) - \phi\left((1-\varepsilon)S_T(j), K, R\right)\right] .$$
[Figure 12: European vanilla call priced by Pseudo MC simulation (L'Ecuyer's generator), Quasi MC simulation (Sobol sequence), and Pseudo MC simulation with antithetic variables. Two panels: Price versus N iterations and Delta versus N iterations, for N between 0 and 30000.]
where $N$ is a Poisson process with deterministic jump intensity $\lambda$ and $\ln(1 + J_l) \sim N(\alpha, \beta)$.
From the point of view of (Q)MC pricing European path-independent options with payoff $\phi(S_T)$ (like the European call, put or digital options of section 25.1), nothing changes, except for the fact that $S_T(j)$ in section 25.1 is now given by (cf. (11))
$$S_T(j) = s \exp\left[at + \sigma W_t(j)\right] \prod_{l=1}^{N_p} \exp\left(\alpha + \beta\varepsilon^l_j\right)$$
where:
• $N_p$ is a simulated Poisson variable with parameter $\lambda T$,
• the $\varepsilon^l_j$ are independent standard Gaussian draws, and
• $W_t(j) = \sqrt{t}\,\varepsilon_j$ for a further independent standard Gaussian draw $\varepsilon_j$.
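One draw of the terminal spot can be sketched as follows. The Poisson variable is sampled by the classical product-of-uniforms method; taking its parameter equal to $\lambda$ times the time to maturity `t`, as well as both function names, are assumptions made for this sketch.

```python
import math
import random

def poisson_draw(mean, rng):
    """Poisson variable via products of uniforms (suitable for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def merton_terminal_spot(s, a, sigma, t, lam, alpha, beta, rng):
    """One draw of S_T = s exp(a t + sigma W_t) prod_l exp(alpha + beta eps_l)."""
    n_jumps = poisson_draw(lam * t, rng)
    log_jumps = sum(alpha + beta * rng.gauss(0.0, 1.0) for _ in range(n_jumps))
    w = math.sqrt(t) * rng.gauss(0.0, 1.0)
    return s * math.exp(a * t + sigma * w + log_jumps)
```

A simple consistency check is that the sample mean of many draws is close to $s \exp\left(at + \frac{\sigma^2 t}{2} + \lambda t\,(e^{\alpha + \beta^2/2} - 1)\right)$.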
Output parameters Price $\Pi$, Error Price $\sigma_P$, Deltas $\Delta_1$, $\Delta_2$, Errors delta $\sigma_{\Delta_1}$, $\sigma_{\Delta_2}$, Price Confidence Interval $IC_\Pi$ = [Inf Price, Sup Price].
The underlying asset prices evolve according to the two-dimensional BS dynamics, that is (under $P$):
$$\begin{cases} dS^1_u = S^1_u\left(\kappa_1\,du + \sigma_1\,dW^1_u\right), & S^1_{T-t} = s_1 \\ dS^2_u = S^2_u\left(\kappa_2\,du + \sigma_2\,dW^2_u\right), & S^2_{T-t} = s_2 , \end{cases}$$
where:
• $\kappa_l = r - q_l$,
• $s_l$ is the initial spot and $S^l_T$ is the spot at maturity $T$,
• $W^1$ and $W^2$ denote two real-valued Brownian motions with instantaneous correlation $\rho$.
So
$$S^1_T = s_1 \exp\left(b_1 t + \sigma_{11} W^1_t\right)$$
and where $\widetilde W$ denotes a real-valued Brownian motion independent of $W^1$. The price of an option is
where $\phi$ denotes the payoff of the option, $K$ is the strike, and $t$ is time to maturity.
The deltas are given by
∆1 = ∂s1 E[e−rt φ(K, ST1 , ST2 )] , ∆2 = ∂s2 E[e−rt φ(K, ST1 , ST2 )] .
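The values for $\pi(j)$ and $\delta^l(j)$ are detailed for each option.
If $\pi(j) \ge 0$ then:
$$\delta^1(j) = \begin{cases} \exp\left(b_1 t + \sigma_{11} W^1_t\right) & \text{if } S^1_T(j) \ge S^2_T(j) \\ 0 & \text{otherwise} \end{cases}$$
$$\delta^2(j) = \begin{cases} \exp\left(b_2 t + \sigma_{21} W^1_t + \sigma_{22} W^2_t\right) & \text{if } S^1_T(j) \ge S^2_T(j) \\ 0 & \text{otherwise.} \end{cases}$$
Otherwise $\delta^1(j) = \delta^2(j) = 0$.
• Exchange Option: The payoff is $(S_1 - \text{ratio} \times S_2)^+$.
$$\pi(j) = \left(S^1_T - \text{ratio} \times S^2_T\right)^+$$
If $\pi(j) \ge 0$ then: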
26 Simulation of processes
Simulating random variables is enough to (Q)MC-compute prices of vanilla options, in cases where the value of the underlying at the maturity $T$ of the option can be simulated directly, without discretizing the associated SDE. However, to be able to deal with more complex models, or, even in simple models, with path-dependent options, one has no other choice than simulating the whole trajectory of the underlying between the time of pricing $t$ and the maturity $T$. Even for simple options in simple models, simulating spot trajectories is necessary for testing the hedging performances of a model (see section 28). In this case, the payoff we are interested in typically consists of the Profit and Loss at maturity of an option's seller, who rebalances a delta hedge in the underlying at regular time intervals, such as one day. This Profit and Loss at maturity is a path-dependent quantity, so that the simulation problem in this case is similar to that of MC-pricing a path-dependent option.
Simulation of $W_t$ Simulation of $W_t$ is an easy step because we have that $\mathcal{L}(W_t) = N(0, t)$:
• first generate a standard Gaussian variable $\varepsilon$,
• then compute $W_t$ as $\sqrt{t}\,\varepsilon$.
where $(\varepsilon_1, \dots, \varepsilon_n)$ are independent standard Gaussian variables. If we use a discretization with evenly spaced intervals of size $h = \frac{T}{n}$, we get:
$$W_0 = 0 , \qquad W_{t_{i+1}} = W_{t_i} + \sqrt{h}\,\varepsilon_i$$
Attention: Simulating independently each $W_{t_i}$ as $\sqrt{t_i}\,\varepsilon_i$ would be incorrect (e.g. wrong variance of $W_{t_{i+1}} - W_{t_i}$).
Backward simulation with Brownian Bridge An alternative method is based on the following Brownian Bridge property (see [94, 133]):
hence in particular
$$\mathcal{L}\left(W_{\frac{t+s}{2}} \,\middle|\, W_s = x, W_t = y\right) = N\left(\frac{x+y}{2}, \frac{t-s}{4}\right) .$$
$$W_0 = 0$$
$$W_T = \sqrt{T}\,\varepsilon_1$$
$$W_{\frac{T}{2}} = \frac{W_0 + W_T}{2} + \sqrt{\frac{T}{4}}\,\varepsilon_2$$
$$W_{\frac{T}{4}} = \frac{W_0 + W_{\frac{T}{2}}}{2} + \sqrt{\frac{T}{8}}\,\varepsilon_3$$
$$W_{\frac{3T}{4}} = \frac{W_{\frac{T}{2}} + W_T}{2} + \sqrt{\frac{T}{8}}\,\varepsilon_4$$
$$\dots$$
For this algorithm, one must choose $n$ as a power of 2. The first step goes directly from 0 to $T$. Intermediate steps are filled by taking successive subdivisions of the time intervals into halves. The algorithm can be adapted to non-uniform time subdivisions by considering the conditional law of the Brownian Bridge between $s$ and $t$.
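The backward construction can be sketched as follows for $n = 2^{\text{levels}}$ time steps. The function name and the convention that the Gaussian draws are consumed in order of decreasing conditional variance are assumptions of this illustration.

```python
import math

def brownian_bridge_path(T, levels, eps):
    """Build W on the uniform grid with n = 2**levels steps from the n
    standard Gaussians eps, filling midpoints by successive halving."""
    n = 2 ** levels
    W = [0.0] * (n + 1)
    W[n] = math.sqrt(T) * eps[0]           # first step: directly from 0 to T
    k, step = 1, n
    while step > 1:
        half = step // 2
        # conditional variance (t - s) / 4 with t - s = step * T / n
        std = math.sqrt(step * T / n / 4.0)
        for left in range(0, n, step):
            right = left + step
            W[left + half] = 0.5 * (W[left] + W[right]) + std * eps[k]
            k += 1
        step = half
    return W
```

With a QMC vector, `eps[0]` (the best coordinate) drives $W_T$, `eps[1]` drives $W_{T/2}$, and so on, which is the ordering by decreasing variance discussed in the remark that follows.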
Remark on these two schemes for MC and QMC simulations We need a vector of size $n$ of independent Gaussian variables. For MC, these $n$ variables can be simulated from the same pseudo random number generator. However, for a QMC simulation, we need to take care of the independence property.
In the case of backward simulation with QMC, a good point is that the values of $W$ successively determined on each trajectory are drawn by order of decreasing variance. Therefore the first components of the simulated QMC vectors (`the best ones', i.e. `the more uniform ones') are naturally assigned to the main directions of risk in the problem.
26.2 BS model
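In the RN (might be objective, for constant $\mu$) BS model: $S_t = S_0 \exp(bt + \sigma W_t)$, $S_t$ is a function of $W_t$. The simulation of price paths is then directly based on the simulation of Brownian motion described in section 26.1.
$$x_t = \ln(S_t) = x_0 + bt + \sigma W_t ,$$
we get:
$$x_T = x_0 + bT + \sigma\sqrt{T}\,\varepsilon_1$$
$$x_{\frac{T}{2}} = \frac{x_0 + x_T}{2} + \sigma\sqrt{\frac{T}{4}}\,\varepsilon_2$$
$$x_{\frac{3T}{4}} = \frac{x_{\frac{T}{2}} + x_T}{2} + \sigma\sqrt{\frac{T}{8}}\,\varepsilon_3$$
$$x_{\frac{T}{4}} = \frac{x_0 + x_{\frac{T}{2}}}{2} + \sigma\sqrt{\frac{T}{8}}\,\varepsilon_4$$
$$\dots$$
We finally set $S_{t_i} = \exp(x_{t_i})$.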
(under suitable Lipschitz conditions on the coefficients). Unless an explicit solution is known for $X$ (like in the BS model, cf. section 26.2), we have to approximate the process by discretization in time. The two best known schemes are the Euler and the Milshtein schemes, that we now present on a uniform mesh with time step $h$.
26.3.1 Euler approximation scheme
It is defined by
$$\hat X_{t_{i+1}} = \hat X_{t_i} + b(\hat X_{t_i})h + \sigma(\hat X_{t_i})(W_{t_{i+1}} - W_{t_i}) .$$
Simulation is obtained with a forward algorithm by:
$$\hat X_{t_{i+1}} = \hat X_{t_i} + b(\hat X_{t_i})h + \sigma(\hat X_{t_i})\sqrt{h}\,\varepsilon_i$$
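The forward algorithm can be sketched as follows in dimension one; `drift` and `vol` stand for the coefficient functions $b$ and $\sigma$, and the function name is hypothetical.

```python
import math

def euler_path(x0, drift, vol, T, n, eps):
    """Euler trajectory on the uniform mesh with n steps, driven by
    the n standard Gaussian draws eps."""
    h = T / n
    path = [x0]
    for i in range(n):
        x = path[-1]
        path.append(x + drift(x) * h + vol(x) * math.sqrt(h) * eps[i])
    return path
```

For example, `euler_path(100.0, lambda x: 0.05 * x, lambda x: 0.2 * x, 1.0, 250, eps)` simulates a discretized BS spot path from 250 independent standard Gaussian draws `eps`.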
$$\lim_{h \to 0^+} h^{-\alpha} \sup_{0 \le i \le n} |X_{ih} - \hat X_{ih}| = 0 \text{ a.s.} , \quad \alpha < \frac{1}{2} .$$
• Convergence in law For $b$ and $\sigma$ of class $C_b^4$, and $\phi$ of class $C_p^4$ ($p$ = polynomial), there exists a constant $C_T$ such that:
$$|E(\phi(X_T) - \phi(\hat X_T))| \le C_T h .$$
So $L^2$-convergence is of order $h^{\frac{1}{2}}$, and almost sure (trajectorial) convergence is of order $h^{\frac{1}{2} - \varepsilon}$, for any $\varepsilon > 0$. Convergence in law (the kind of convergence which is relevant for pricing and Greeking applications) is linear in $h$, in regular cases.
Continuous Euler scheme (d = 1) This is the continuous time approximation scheme $(\bar S_t)_{t \ge 0}$ defined by interpolation of the previous Euler scheme by a Brownian Bridge (see [94, 133]) between $(t_i, \hat X_{t_i})$ and $(t_{i+1}, \hat X_{t_{i+1}})$, for $i = 0, \dots, n-1$. So, on $[t_i, t_{i+1}]$:
where $\hat S$ is a (discrete) trajectory of the Euler scheme approximation for $S$, and $B_i$ is a Brownian Bridge on $[t_i, t_{i+1}]$, such that $\bar S = \hat S$ at $t_i$ and $t_{i+1}$.
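$$\hat X_{t_{i+1}} = \hat X_{t_i} + \left(b(\hat X_{t_i}) - \frac{1}{2}\sigma'(\hat X_{t_i})\sigma(\hat X_{t_i})\right)h + \sigma(\hat X_{t_i})\sqrt{h}\,\varepsilon_i + \frac{1}{2}\sigma'(\hat X_{t_i})\sigma(\hat X_{t_i})\,h\,\varepsilon_i^2$$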
where $(\varepsilon_1, \varepsilon_2)$ is a standard Gaussian pair. The interest of using the Milshtein scheme for the $v$ component is to benefit from a better trajectorial convergence than with the Euler scheme. So the process $\hat v$ obtained in this way is less prone to take negative values than would be the case with an Euler scheme (see also Andersen [5]).
26.4 Jump-Diffusions
Finally we consider the following jump-diffusion, driven by a multidimensional Brownian motion $W$ and a Poisson random measure $\chi$ (under suitable Lipschitz conditions on the coefficients):
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t + \int_{x \in \mathbb{R}^d} f(X_{t-}, x)\,\chi(dt, dx) \qquad (104)$$
where $\chi(dt, dx)$ is a Poisson random measure with compensator measure $\lambda(X_t)\nu(X_t, dx)\,dt$, for some intensity $\lambda$ and jump size probability measure $\nu$.
Depending on the application at hand, we may be interested in simulating $X$ at fixed times, or at the jump times of $X$ (see e.g. [48, 96]).
Euler scheme at fixed times To simulate (104) at fixed times $0 < t_1 < \dots < t_n$, set $\hat X_0 = X_0$, and for $i = 0, \dots, n-1$:
• simulate
$$\tilde X_{t_{i+1}} = \hat X_{t_i} + b(\hat X_{t_i})(t_{i+1} - t_i) + \sigma(\hat X_{t_i})(W_{t_{i+1}} - W_{t_i}) ,$$
where $W_{t_{i+1}} - W_{t_i} \sim \sqrt{t_{i+1} - t_i}\,N(0, 1)$, and
• compute $\hat X_{t_{i+1}}$ by adding to $\tilde X_{t_{i+1}}$, with probability $1 - e^{-\lambda(\hat X_{t_i})(t_{i+1} - t_i)}$ (as dictated by the position of an independent uniform draw $U_i$ with respect to $e^{-\lambda(\hat X_{t_i})(t_{i+1} - t_i)}$), a jump term of law $\nu(\hat X_{t_i}, dx)$.
Euler scheme at jump times To simulate (104) at the $n$ first jump times $0 < \tau_1 < \dots < \tau_n$ of $X$, set $\hat X_0 = X_0$, and for $i = 0, \dots, n-1$:
• simulate $\tau_{i+1}$ as $\tau_i$ plus an independent draw in an exponential law with parameter $\lambda(\hat X_{\tau_i})$,
• simulate
$$\tilde X_{\tau_{i+1}} = \hat X_{\tau_i} + b(\hat X_{\tau_i})(\tau_{i+1} - \tau_i) + \sigma(\hat X_{\tau_i})(W_{\tau_{i+1}} - W_{\tau_i}) ,$$
where $W_{\tau_{i+1}} - W_{\tau_i} \sim \sqrt{\tau_{i+1} - \tau_i}\,N(0, 1)$, and
• compute $\hat X_{\tau_{i+1}}$ by adding to $\tilde X_{\tau_{i+1}}$ a jump term of law $\nu(\tilde X_{\tau_{i+1}}, dx)$.
Continuous Euler scheme (d = 1) This is the continuous time approximation scheme defined by interpolation of the previous Euler scheme at jump times by a Brownian bridge between $(\tau_i, \hat X_{\tau_i})$ and $(\tau_{i+1}, \hat X_{\tau_{i+1}})$, for $i = 0, \dots, n-1$.
$$E[\phi(X)] - \frac{1}{m}\sum_{j=1}^m \phi(\hat X_j) = \left(E[\phi(X)] - E[\phi(\hat X)]\right) + \left(E[\phi(\hat X)] - \frac{1}{m}\sum_{j=1}^m \phi(\hat X_j)\right)$$
So the error is the sum of two terms, a discretization error and a Monte Carlo error (simulation error):
• for usual discretization schemes such as the Euler or the Milshtein scheme, the weak convergence rate is linear in $h$, so that the discretization error is $O(h)$;
• as for the Monte Carlo error, after scaling by $m^{-\frac{1}{2}}$, it is asymptotically distributed as $N(0, \mathrm{Var}[\phi(X^h)])$.
Therefore in order to balance the two terms in the error a natural choice is to take $m$ of the order of $n^2$.
Proof By the Cameron-Martin formula, the law of $M^\lambda_t$ conditional on $W^\lambda_t$ does not depend on $\lambda$. So it is enough to prove the Lemma in case $\lambda = 0$. Now it is well known that the law of the pair $(W_t, M_t)$ admits the following transition probability density between time 0 and $t$:
$$p(x, y) = 1_{y \ge 0}\, 1_{y \ge x}\, \frac{2(2y - x)}{\sqrt{2\pi t^3}} \exp\left(-\frac{(2y - x)^2}{2t}\right) \qquad (105)$$
(this can be established by using the reflection principle for Brownian Motion, see e.g. [94, 133]). Denoting $Z_t = (2M_t - W_t)^2 - (W_t)^2$, we have, formally:
$$P(Z_t \in dz \mid W_t = x) = P(M_t \in dy \mid W_t = x) ,$$
with $z = (2y - x)^2 - x^2$. Hence, by (105) and given the expression of the Gaussian density:
$$P(Z_t \in dz \mid W_t = x) = \frac{1}{2t}\exp\left(\frac{-z}{2t}\right) dz ,$$
which proves that, conditionally on $W_t$, $Z_t$ is $\frac{1}{2t}$-exponentially distributed. $\Box$
where $\hat S$ is a trajectory of the Euler scheme approximation for $S$, and $(U^i)_{0 \le i \le n-1}$ is a sequence of independent uniform random variables on $[0, 1]$.
where $E_{\frac{1}{2h}}$ is a $\frac{1}{2h}$-exponential variable. $\Box$
In the special case of the BS model, the Euler discretization is exact, provided one works in returns variable
x = ln S . In this case one can take n equal to one and use a 2-dimensional low-discrepancy sequence.
Let $S_T = s \exp(bt + \sigma W_t)$ denote the RN BS spot. We note $M_T = \max_{[T-t, T]} S$ the maximum reached before maturity.
The price and delta of a lookback option with payoff $\phi$ and strike $K$ write:
$$\Pi_m = \frac{1}{m} e^{-rt}\sum_{j=1}^m \pi(j)$$
$$\Delta_m = \frac{1}{m} e^{-rt}\sum_{j=1}^m \partial_s \pi(j) = \frac{1}{m} e^{-rt}\sum_{j=1}^m \delta(j)$$
Simulation of the maximum $M_T$. The conditional cumulative distribution function of the maximum m of x = ln S, given $x_{T-t} = x_1$ and $x_T = x_2$, writes:
$$F(x; x_1, x_2) = \begin{cases} 1 - \exp\Big[-\dfrac{2}{\sigma^2 t}(x - x_1)(x - x_2)\Big] & \text{if } x \ge \max(x_1, x_2) \\ 0 & \text{otherwise} \end{cases}$$
so
$$F^{-1}(y; x_1, x_2) = \frac{1}{2}\Big(x_1 + x_2 + \sqrt{(x_1 - x_2)^2 - 2\sigma^2 t \ln(1 - y)}\Big).$$
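The inverse distribution function above lends itself to a direct inverse-transform draw of the maximum. The following is only a minimal sketch: the function names `sample_log_max` and `cdf_log_max` and the numerical values are illustrative assumptions, not taken from the notes.

```python
import math
import random

def sample_log_max(x1, x2, sigma, t, u):
    """Inverse-transform draw of the maximum of x = ln S over an interval of
    length t, given the endpoint values x1 and x2, for u uniform on [0, 1)."""
    return 0.5 * (x1 + x2 + math.sqrt((x1 - x2) ** 2
                                      - 2.0 * sigma ** 2 * t * math.log(1.0 - u)))

def cdf_log_max(x, x1, x2, sigma, t):
    """Conditional distribution function F(x; x1, x2) of the maximum."""
    if x < max(x1, x2):
        return 0.0
    return 1.0 - math.exp(-2.0 * (x - x1) * (x - x2) / (sigma ** 2 * t))

# Hypothetical inputs: log-spot at both ends of a one-year window.
rng = random.Random(0)
x1, x2, sigma, t = math.log(100.0), math.log(105.0), 0.2, 1.0
draws = [sample_log_max(x1, x2, sigma, t, rng.random()) for _ in range(5)]
```

By construction every draw is at least max(x1, x2), and applying `cdf_log_max` to `sample_log_max(..., u)` gives back u.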
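where $\bar{M}_T = \max_{0 \le i \le n-1} \widehat{M}^i$ as before. Now, we have:
$$E\big[\varphi(\widehat{S}_T)\, 1_{\{\bar{M}_T \le L\}} \mid \widehat{S}_{ih},\ 0 \le i \le n\big] = \varphi(\widehat{S}_T) \prod_{i=0}^{n-1} E\big(1_{\{\widehat{M}^i \le L\}} \mid \widehat{S}_{ih}, \widehat{S}_{(i+1)h}\big) = \varphi(\widehat{S}_T) \prod_{i=0}^{n-1} \Phi(\widehat{S}_{ih}, \widehat{S}_{(i+1)h}),$$
with
$$\Phi(x, y) = \Big(1 - \exp\Big(-\frac{2}{\sigma(x)^2 h}(L - x)(L - y)\Big)\Big)\, 1_{L > x \vee y}.$$
Likewise,
$$E\big[R\, 1_{\{\bar{M}_T > L\}} \mid \widehat{S}_{ih},\ 0 \le i \le n\big] = R\Big(1 - \prod_{i=0}^{n-1} \Phi(\widehat{S}_{ih}, \widehat{S}_{(i+1)h})\Big).$$
Therefore
$$E\,\phi(\widehat{S}_T, \bar{M}_T) = R + E\Big[\big(\varphi(\widehat{S}_T) - R\big) \prod_{i=0}^{n-1} \Phi(\widehat{S}_{ih}, \widehat{S}_{(i+1)h})\Big].$$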
In this case, no random draws are needed other than those used for simulating $\widehat{S}_T$, namely n random draws per simulation run, where n can be taken equal to one in the special case of the BS model.
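As an illustration of the product formula, here is a hedged sketch for an up-and-out call with rebate R paid at maturity in the BS model, using a single step (n = 1) in the variable x = ln S with constant volatility. The function names, the choice of working in log-levels and all numerical inputs are assumptions made for this example only.

```python
import math
import random

def survival_prob(x, y, level, vol, h):
    """Probability that the bridge joining x to y over a step h stays below
    `level`; this plays the role of Phi(x, y), with 0 when an endpoint is
    already at or above the level."""
    if level <= max(x, y):
        return 0.0
    return 1.0 - math.exp(-2.0 * (level - x) * (level - y) / (vol ** 2 * h))

def up_and_out_call_bs(s0, strike, barrier, rebate, r, q, sigma, T, m, seed=0):
    """Monte Carlo estimate of e^{-rT} (R + E[(payoff - R) * Phi]) with n = 1."""
    rng = random.Random(seed)
    x0, lvl = math.log(s0), math.log(barrier)
    drift = (r - q - 0.5 * sigma ** 2) * T
    total = 0.0
    for _ in range(m):
        # One Gaussian draw per run: exact simulation of x_T in the BS model.
        xT = x0 + drift + sigma * math.sqrt(T) * rng.gauss(0.0, 1.0)
        payoff = max(math.exp(xT) - strike, 0.0)
        total += rebate + (payoff - rebate) * survival_prob(x0, xT, lvl, sigma, T)
    return math.exp(-r * T) * total / m

knock_out = up_and_out_call_bs(100.0, 100.0, 120.0, 0.0, 0.05, 0.0, 0.2, 1.0, 20000)
vanilla = up_and_out_call_bs(100.0, 100.0, 1e9, 0.0, 0.05, 0.0, 0.2, 1.0, 20000)
```

With a barrier pushed far away the weight is one on every path, so the estimator falls back on a plain Monte Carlo estimate of the vanilla call.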
$$A_T^h = \frac{h}{T}\big(\widehat{S}_0 + \widehat{S}_h + \ldots + \widehat{S}_{(n-1)h}\big).$$
But this discretization works poorly in practice. For better discretization schemes based on the continuous Euler scheme, see [103].
These discretization schemes can be used in conjunction with associated variance reduction techniques, as introduced in [95]. Thus note that
$$\frac{1}{T}\int_0^T S_t\, dt \ \text{ is close to } \ \exp\Big(\frac{1}{T}\int_0^T \ln(S_t)\, dt\Big),$$
which suggests using $e^{Z_T}$, where $Z_T = \frac{1}{T}\int_0^T \ln(S_t)\, dt$, as a control variable to price the payoff $\phi(x, y) = (y - K)^+$ (fixed strike Asian call) in the BS model at time 0. As $Z_T$ is normally distributed, $E e^{-rT}(e^{Z_T} - K)^+$ is known explicitly.
28 Backtesting
Before a model is used in production, it is backtested: the hedging performance of the model is assessed using both simulated trajectories in relevant market models and real data sets. In the course of this process, it is useful to be able to simulate spot trajectories with specific features, such as passing through a point of interest (e.g. a barrier level, if any in the product at hand) at some point in time. This is the object of this section.
In any case, the very possibility to trade options, at least in a safe or quite safe manner, is closely related to
the access to markets of hedging instruments.
Note also that hedging is model-dependent (with the notable exception of a model-independent static replication strategy which is known for European path-independent options, cf. Breeden-Litzenberger [37]), and even calibrated-model dependent. So the composition of the hedging portfolios is different in two models calibrated to the same set of hedging instruments.
Static replication is rarely feasible at affordable costs in practice, so that the most common form of hedging is dynamic hedging. In a complete market model, it is possible to determine a perfect dynamic hedging strategy such that in the model, the hedging portfolio has initial value equal to the option price, and is always worth at least as much as (with equality in the case of European options) the option's payoff at termination (see section 5.1 in the special case of a European option in the BS model, or Theorem B.4 in the Appendix for a general statement). But real markets are never complete. For instance, the BS analysis requires continuous hedging, which is possible in theory but impossible (and even undesirable, accounting for transaction costs) in practice. Thus dynamic hedging means, practically and in the sequel, discrete, imperfect dynamic hedging.
The simplest form of discrete hedging consists in rebalancing the hedge at fixed intervals of time h, a strategy commonly used with h ranging from one day to one week.
Let us consider a European option with maturity T. The picture is the following: we sell the option at time 0 at price $\Pi_0$, that is, we receive at time 0 the amount of money $\Pi_0$, but in turn we are obliged to pay at time T the payoff $(S_T - K)^+$, which may be very high depending on the movements of the underlying.
Our strategy consists in rebalancing at every time step h a self-financing hedge in the underlying and in the savings account. So we sell the option at time 0 at price $\Pi_0$, we hedge n times at regular time step h and we deliver the payoff at time T = nh.
and more generally, denoting $\widehat{S}_t = e^{(q-r)t} S_t$, it is straightforward to show by induction that for i = 0, ..., n:
$$V_{ih} = e^{rih}\Big(V_0 + \sum_{l=0}^{i-1} \Delta_{lh}\, e^{-qlh}\big(\widehat{S}_{(l+1)h} - \widehat{S}_{lh}\big)\Big).$$
Accounting also for the fact that $V_0 = \Pi_0$ and for the final payment by the trader of $\phi(S_T)$ at time T, the discounted P&L of the trader at T is therefore:
$$e^{-rT}\, P\&L = \Pi_0 + \sum_{i=0}^{n-1} e^{-qih}\, \Delta_{ih}(S_{ih})\big(\widehat{S}_{(i+1)h} - \widehat{S}_{ih}\big) - e^{-rT}\phi(S_T) \qquad (106)$$
or, equivalently:
$$e^{-rT}\, P\&L = \sum_{l=0}^{n-1} \delta_l P\&L \qquad (107)$$
with
To fix ideas, let us assume that the spot obeys the following BS dynamics (under the objective measure $\widehat{P}$): $dS_t = S_t(\mu\, dt + \sigma\, dW_t)$. If a trader were able to hedge continuously, she could perfectly replicate the option and her P&L would be equal to 0, provided she uses the BS delta of the option as control ∆ (see section 5.1). But in practice the trader hedges at discrete times, for example every day at the closing price. So the P&L deviates from 0 (typically $\sigma_{P\&L} \sim 1\%\, \Pi_0$ for the standard deviation of the P&L at maturity of a daily rebalanced delta-hedged option position in the BS model, see e.g. [126]).
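The discrete hedging P&L of formula (106) can be simulated along the following lines. This is a sketch under stated assumptions: a European call hedged with its BS delta, a spot simulated under the objective drift µ, and hypothetical parameter values; `bs_call`, `discounted_pnl` and `pnl_stats` are names introduced here for illustration.

```python
import math
import random
import statistics

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, q, sigma, tau):
    """BS call price and delta for a time to maturity tau > 0."""
    d1 = (math.log(s / k) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    price = s * math.exp(-q * tau) * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)
    return price, math.exp(-q * tau) * norm_cdf(d1)

def discounted_pnl(s0, k, r, q, mu, sigma, T, n, rng):
    """Discounted P&L of formula (106) along one simulated path."""
    h = T / n
    pnl, _ = bs_call(s0, k, r, q, sigma, T)      # premium Pi_0 received at time 0
    s = s0
    for i in range(n):
        t = i * h
        _, delta = bs_call(s, k, r, q, sigma, T - t)
        s_next = s * math.exp((mu - 0.5 * sigma ** 2) * h
                              + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0))
        s_hat = math.exp((q - r) * t) * s                  # \hat S_{ih}
        s_hat_next = math.exp((q - r) * (t + h)) * s_next  # \hat S_{(i+1)h}
        pnl += math.exp(-q * t) * delta * (s_hat_next - s_hat)
        s = s_next
    return pnl - math.exp(-r * T) * max(s - k, 0.0)

def pnl_stats(n, paths=300, seed=1):
    """Mean and standard deviation of the discounted P&L for n rebalancing dates."""
    rng = random.Random(seed)
    sample = [discounted_pnl(100.0, 100.0, 0.05, 0.0, 0.10, 0.2, 1.0, n, rng)
              for _ in range(paths)]
    return statistics.mean(sample), statistics.stdev(sample)

mean_10, std_10 = pnl_stats(10)
mean_250, std_250 = pnl_stats(250)
```

Comparing `std_10` with `std_250` shows how the dispersion of the P&L shrinks as the rebalancing frequency increases.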
Note that in the case of an American option exercised at the stopping time τ, assumed for simplicity to be of the form τ = νh, for a random integer ν ∈ {0, ..., n}, formulas (106) and (107) become:
$$e^{-r\tau}\, P\&L_\tau = \Pi_0 + \sum_{i=0}^{\nu-1} e^{-qih}\, \Delta_{ih}(S_{ih})\big(\widehat{S}_{(i+1)h} - \widehat{S}_{ih}\big) - e^{-r\tau}\Pi_\tau = \sum_{l=0}^{\nu-1} \delta_l P\&L \qquad (108)$$
In the BS model a trader can then super-hedge the option in continuous time.
$$\xi = \sum_{i=0}^{n-1} e^{-qih}\, \Delta_{ih}(S_{ih})\big(\widehat{S}_{(i+1)h} - \widehat{S}_{ih}\big)$$
for a suitably-chosen control function ∆ (see [46]). Note that under mild assumptions Sb must be a local
martingale under P, by arbitrage (see the Appendix), and Eξ = 0.
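$$S_{t+h} = S_t \exp\Big(\Big(\mu - \frac{\sigma^2}{2}\Big)h + \sigma\big(\widehat{W}_{t+h} - \widehat{W}_t\big)\Big) \qquad (109)$$
where $\widehat{W}_{t+h} - \widehat{W}_t = \sqrt{h}\,\varepsilon$ and $\varepsilon \sim N(0, 1)$. Thus we can simulate the value of the spot step by step, multiplying $S_t$ by $\exp\big((\mu - \frac{\sigma^2}{2})h + \sigma\sqrt{h}\,\varepsilon\big)$ with $\varepsilon \sim N(0, 1)$ to get $S_{t+h}$ (see also section 26.2).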
Moreover it is interesting to be able to filter some noteworthy paths of the spot, selecting for example spot trajectories passing through a target level at a target time. We can thus observe the behaviour of a pricing routine or a hedging scheme when the spot's trajectory passes through a critical point. For example, in the case of a (reverse) barrier option, we can force the spot to reach the barrier level at an interesting date like the maturity (recall that the delta of a reverse barrier option explodes at this point).
This can be done with the Brownian Bridge, which may be defined as a centered Gaussian process defined on [0, 1] with covariance Γ(s, t) = s(1 − t) on s ≤ t (see also sections 26.1 and 27.1). The easiest way to prove that Γ is a covariance is to observe that the process $B_t = \widehat{W}_t - t\widehat{W}_1$ satisfies $\widehat{E}[B_s B_t] = s(1 - t)$ for s ≤ t. This also gives us a continuous version of the Brownian Bridge. Observe that $B_1 = 0$ a.s., so all the paths a.s. go from 0 at time 0 to 0 at time 1, hence the name of this process. Naturally, the notion of Brownian Bridge may be extended to higher dimensions and to intervals other than [0, 1].
In our case, we want the spot trajectory to pass through $S_{T_1}$ at time $T_1$. For this we would like to replace the Brownian motion in (109) by a Brownian Bridge so that S passes through the desired target point. We thus set:
$$B_{0,x}^{T_1,y}(t) = x + (y - x)\frac{t}{T_1} + \widehat{W}_t - \frac{t}{T_1}\widehat{W}_{T_1},$$
where $T_1$ is the target time, and x and y are the Brownian Bridge's starting and target levels, respectively. This gives us a Brownian Bridge between (0, x) and $(T_1, y)$. Since we want:
$$S_{T_1} = S_0 \exp\Big(\Big(\mu - \frac{\sigma^2}{2}\Big)T_1 + \sigma B_{0,x}^{T_1,y}(T_1)\Big),$$
we take
$$B_{0,x}^{T_1,y}(0) = x = 0, \qquad B_{0,x}^{T_1,y}(T_1) = y = \frac{1}{\sigma}\Big(\ln\frac{S_{T_1}}{S_0} - \Big(\mu - \frac{\sigma^2}{2}\Big)T_1\Big).$$
We finally obtain
$$S_{t+h} = S_t \exp\Big(\Big(\mu - \frac{\sigma^2}{2}\Big)h + \sigma\big(B_{0,x}^{T_1,y}(t + h) - B_{0,x}^{T_1,y}(t)\big)\Big),$$
with x and y thus determined. All we need now is to simulate the quantity
$$\Gamma = B_{0,x}^{T_1,y}(t + h) - B_{0,x}^{T_1,y}(t).$$
This gives us the formula for the next value of the spot:
$$S_{t+h} = S_t \exp\Big[\Big(\mu - \frac{\sigma^2}{2}\Big)h + \sigma\Big(\frac{h}{T_1 - t}\big(y - B_{0,x}^{T_1,y}(t)\big) + \sqrt{\frac{h(T_1 - t - h)}{T_1 - t}}\;\varepsilon\Big)\Big], \qquad (110)$$
where $\varepsilon \sim N(0, 1)$.
Note that formula (110) is only applicable for $t \le T_1$; for $t > T_1$ we have to use formula (109).
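A possible implementation of this pinned simulation is sketched below, assuming that the target time $T_1$ lies on the simulation grid ($T_1 = n_1 h$). The function name `pinned_bs_path` and the parameter values are hypothetical.

```python
import math
import random

def pinned_bs_path(s0, mu, sigma, T, n, n1, target, rng):
    """Spot path on the grid ih, i = 0..n (h = T/n), forced through `target`
    at time T1 = n1*h with formula (110), then continued with formula (109)."""
    h = T / n
    T1 = n1 * h
    drift = (mu - 0.5 * sigma ** 2) * h
    # Target level y of the bridge, started from x = 0.
    y = (math.log(target / s0) - (mu - 0.5 * sigma ** 2) * T1) / sigma
    b = 0.0                        # current value of the bridge
    path = [s0]
    for i in range(n):
        eps = rng.gauss(0.0, 1.0)
        if i < n1:
            left = n1 - i          # (T1 - t) / h, number of steps left to T1
            incr = (y - b) / left + math.sqrt(h * (left - 1) / left) * eps
            b += incr
        else:
            incr = math.sqrt(h) * eps
        path.append(path[-1] * math.exp(drift + sigma * incr))
    return path

# Weekly grid over one year, spot forced to 90 at mid-year (hypothetical values).
path = pinned_bs_path(100.0, 0.10, 0.2, 1.0, 52, 26, 90.0, random.Random(0))
```

At the last bridge step the noise term vanishes, so the path hits the target level exactly, up to rounding.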
We may thus simulate a number of spot trajectories with these formulas at discrete times, and calculate for each path the corresponding P&L given by the discrete hedging formula (106). We may then compute related statistics, like the mean or the standard deviation of these P&L, etc. We may also identify some relevant spot trajectories, like those generating extremal or median P&L positions (see Figures 13 and 14).
Such dynamic tests allow risk managers or traders to assess the performance of a hedging scheme, and they may help developers detect problems in the behaviour of a pricing routine.
[Figure 13: three pairs of panels, each showing the stock price (Stock) and the hedging P&L (PL) against Time on [0, 1].]
Figure 13: Weekly hedging P&L simulations with Brownian Bridge at level 90 at t = 1/2.
[Figure 14: three pairs of panels, each showing the stock price (Stock) and the hedging P&L (PL) against Time on [0, 1].]
Figure 14: Daily hedging P&L simulations with Brownian Bridge at level 90 at t = 1/2.
Part VI
Calibration Methods
In financial modelling, calibrating a (class of) model(s) means finding numerical values of its parameters such that the related instance of the model is consistent with the market, in the sense that the prices of some financial instruments at the current time, or calibration input data, are the same in the market and in the model thus calibrated, or, practically speaking, that the model fits the currently observed market prices within the bid-ask spread.
So the calibration problem is the inverse of the pricing problem. Instead of computing prices in a model
with given values for its parameters, one wishes to compute the values of the model parameters that are
consistent with observed prices.
The simplest example of a calibration problem was encountered in section 8, when we discussed the notions
of implied volatility of an option or implied correlation of a CDO tranche. These problems can indeed be
interpreted as the calibration problem of a BS model or a Li model, using an observed option price or a
CDO tranche market spread as calibration input data. Of course in these cases the calibration problem is
very easy, since there is only one parameter to calibrate to exactly one market data, so the task is easily
done by dichotomy.
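For the implied volatility case, the dichotomy can be sketched as follows. The BS call formula with dividend yield q is standard; the bracketing interval, the tolerance and the function names are choices made for this illustration only.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, q, sigma, T):
    d1 = (math.log(s / k) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s * math.exp(-q * T) * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, s, k, r, q, T, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma; relies on the BS call price being increasing in sigma."""
    if not bs_call(s, k, r, q, lo, T) <= price <= bs_call(s, k, r, q, hi, T):
        raise ValueError("price is outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(s, k, r, q, mid, T) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip on a hypothetical quote: price at sigma = 25%, then invert.
quote = bs_call(100.0, 110.0, 0.03, 0.01, 0.25, 0.5)
sigma_implied = implied_vol(quote, 100.0, 110.0, 0.03, 0.01, 0.5)
```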
Since the calibration input data are typically derivative prices given by (risk-neutral) expectations of the related payoffs (see the Appendix), the calibration problem can thus be seen as a moment problem. Physicists would call this problem the estimation of the model. However, in finance, the term `estimation' specifically refers to statistical estimation of the model, namely the estimation of the model parameters using historical data, by Maximum Likelihood Estimation (MLE) or any other statistical procedure. So, in finance, estimation is a backward-looking process (using historical data), whereas calibration is said to be a forward-looking process, referring to the fact that derivative prices at the current time are based on the anticipations of the market regarding the dynamics of the underlying in the future.
It is generally acknowledged that whenever option data are available, it is better to use them to calibrate
the model, rather than to estimate the model on past data. Of course, in the absence of calibration data (no
observed prices), one has no other means than to estimate the model statistically, but this is another story.
Now, it is well known to physicists that inverse problems are typically ill-posed. Recall that a problem is well-posed (as defined by Hadamard) if its solution exists, is unique, and depends continuously on its input data. Thus there are three reasons for which a problem might be ill-posed:
• it admits no solution, or/and
• it admits more than one solution, or/and
• the solution(s) of the inverse problem do(es) not depend on the input data in a continuous way.
In the case of calibration problems in finance, except for trivial situations, there typically exists no instance of a given class of models which is exactly consistent with a full calibration data set, including a number of option prices, a zero-coupon curve, an expected dividend yield curve on the underlying, etc.
But there are often various instances of a given class of models that fit the data within the bid-ask spread. In this case, if one perturbs the data (e.g., if the observed prices move by some small amount between today and tomorrow), it is quite typical that a numerically determined best fit solution of the calibration problem switches from one `basin of attraction' to the other, thus the numerically determined solution is not stable either.
Recall that option prices (or pricing functions in Markovian models, more precisely) are solutions of PDEs with coefficients related to the model parameters (see the Appendix). Thus model parameters can be expressed in terms of partial derivatives of the option prices with respect to the model factors. Therefore, calibrating the model is tantamount to numerically differentiating derivatives pricing functions. But numerical differentiation is the canonical example of an ill-posed problem, see e.g. [67] (so two functions may be arbitrarily close to one another in sup norm, though their derivatives differ significantly).
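The following toy example makes this concrete: a perturbation of size δ in sup norm changes the derivative by an amount of order 1/δ. The particular functions f and g are chosen for illustration and do not come from the notes.

```python
import math

def f(x):
    return math.sin(x)

def f_prime(x):
    return math.cos(x)

def g(x, delta):
    """f plus a fast oscillation of amplitude delta (sup-norm distance <= delta)."""
    return math.sin(x) + delta * math.sin(x / delta ** 2)

def g_prime(x, delta):
    """Exact derivative of g: the oscillation contributes a term of size 1/delta."""
    return math.cos(x) + math.cos(x / delta ** 2) / delta

delta = 1e-3
grid = [i / 1000.0 for i in range(1001)]
sup_gap = max(abs(g(x, delta) - f(x)) for x in grid)      # at most delta
deriv_gap = abs(g_prime(0.0, delta) - f_prime(0.0))       # equals 1 / delta
```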
In order to get a well-posed problem, we need to introduce some regularization. The most widely known regularization method is the Tikhonov regularization method. Given a direct operator
$$\Pi : H \supseteq A \ni a \longmapsto \Pi(a) \in \mathbb{R}^q,$$
(so a corresponds to the set of model parameters), and noisy data (`observed prices') $\pi^\delta$, the Tikhonov regularization method for inverting Π at $\pi^\delta$ consists in:
• substituting, for the inverse problem proper, the nonlinear least squares problem:
$$\min_{a \in A} \big\|\Pi(a) - \pi^\delta\big\|^2$$
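Definition 29.1 (α-solution) By an α-solution to the inverse problem for Π at $\pi^\delta$, we mean any $a_\alpha^{\delta,\eta} \in A$ such that
$$J_\alpha^\delta\big(a_\alpha^{\delta,\eta}\big) \le J_\alpha^\delta(a) + \eta, \quad a \in A$$
with
$$2 J_\alpha^\delta(a) \equiv \big\|\Pi(a) - \pi^\delta\big\|^2 + \alpha \big\|a - a_0\big\|^2$$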
Under suitable assumptions, one can show that the problem of determining α-solutions of the inverse problem
is well-posed, in the following sense.
We postulate that the direct operator Π satisfies the following regularity assumption.
and
Under these assumptions, we define an $a_0$-Minimal Norm Solution (MNS) of the inverse problem for Π at π, as any $a \in \mathrm{Argmin}_{\{\Pi(a) = \pi\}} \|a - a_0\|$.
Then $\|a_\alpha^{\delta,\eta} - a\| = O(\delta^{1/2})$, for any $a_0$-MNS a of the inverse problem for Π at π such that
$$a - a_0 = d\Pi(a)^* \lambda \qquad (111)$$
An important issue in practice is the choice of the regularization parameter α, which determines the trade-off between accuracy and regularity in the method. To set α, the two main approaches are:
• a priori methods, in which the choice of α is based on the level of noise δ in the data (such as the width of the bid-ask spread, in the case of market prices data in finance);
• a posteriori methods, in which α may also depend on the data themselves.
The most commonly used method for choosing α is the a posteriori method based on the so-called discrepancy
principle, which consists in choosing the greatest level of α for which the `distance' between the solution of
the regularized inverse problem with regularization level α and the observations, does not exceed the level
of noise on the observations.
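To illustrate the discrepancy principle, consider a toy linear problem in which Π acts diagonally, Π(a) = (s_i a_i), so that the Tikhonov solution with prior $a_0$ is available in closed form and the residual is increasing in α. The sketch below is written under these simplifying assumptions; the operator, the data and the function names are all hypothetical.

```python
import math

def tikhonov_diag(s, data, prior, alpha):
    """Minimizer of sum (s_i a_i - data_i)^2 + alpha * sum (a_i - prior_i)^2."""
    return [(si * di + alpha * pi) / (si * si + alpha)
            for si, di, pi in zip(s, data, prior)]

def residual(s, data, a):
    return math.sqrt(sum((si * ai - di) ** 2 for si, ai, di in zip(s, a, data)))

def discrepancy_alpha(s, data, prior, delta, lo=1e-12, hi=1e6, iters=200):
    """Greatest alpha in [lo, hi] whose residual does not exceed the noise level
    delta, found by bisection on a log scale (the residual increases with alpha)."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if residual(s, data, tikhonov_diag(s, data, prior, mid)) <= delta:
            lo = mid
        else:
            hi = mid
    return lo

# Toy problem: decaying singular values make the naive inversion noise-sensitive.
s = [1.0, 0.1, 0.01]
noise = [1e-3, -1e-3, 1e-3]
data = [si * 1.0 + ei for si, ei in zip(s, noise)]    # true parameters all equal to 1
prior = [0.0, 0.0, 0.0]
delta = math.sqrt(sum(e * e for e in noise))
alpha = discrepancy_alpha(s, data, prior, delta)
a_reg = tikhonov_diag(s, data, prior, alpha)
```

In this toy setting the selected α saturates the constraint: the residual of `a_reg` equals the noise level δ up to the bisection accuracy.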
Recall that a real function J on A (the cost criterion J, which will typically be a regularized nonlinear least squares criterion as in Definition 29.1) is said to be lower semi-continuous at a ∈ A iff `it cannot exceed its limits at a', i.e. $J(a) \le \liminf_a J$.
The following Theorem extends to (infinite dimensional, presumably) Hilbert spaces, under an additional convexity assumption, the well-known fact that a lower semi-continuous function on a compact subset of $\mathbb{R}^d$ admits a global minimum.
Theorem 29.8 If J is lower semi-continuous, convex, and goes to infinity as a goes to infinity in A, then J admits a global minimum on A. Moreover, if J is strictly convex on A, this minimum is unique.
In the strictly convex case, and if, additionally, J is differentiable, one can prove the convergence to the minimum of various gradient descent algorithms. These consist in moving at each step by some amount (fixed step descent vs optimal step gradient descent) in a direction made of the gradient ∇J at the current step of the algorithm, in combination with, in some variants of the method (conjugate gradient method, quasi-Newton algorithms, etc.), the gradient(s) at the previous step(s).
In the non strictly convex case, or, more generally, in the non-convex case (as will typically be the case in the context of calibration problems), or if the cost criterion is only almost everywhere differentiable (as in the American calibration problem mentioned in Remark 31.1), such algorithms can still be used, in which case they will typically converge to a local minimum of J.
When the gradient of J is not computable in closed form, or not computable numerically with the required
accuracy, an alternative to gradient descent methods is to use the nonlinear simplex method (not to be
confused with the simplex algorithm for solving linear programming problems). As opposed to gradient
descent methods, the nonlinear simplex algorithm only uses the values (and not the gradient) of J, but the
convergence of the algorithm is not proved in general, and there are known counter-examples in which it
does not converge.
When there are no constraints (A = H or analogous discretized form), the minimization problem is, in practice, much easier, and many implementations of the related gradient descent algorithms are publicly available, see for instance [129] (C codes publicly available on www.nr.com). As for constrained problems, a state-of-the-art implementation of the quasi-Newton method for minimizing a function on a hyperrectangle commonly used in the financial industry, the lbfgs algorithm, is publicly available on www.ece.northwestern.edu/~nocedal/lbfgsb.html.
Note that the characteristic function of a random variable X with law of density p, defined by:
$$\Phi(u) = E[\exp(iuX)] = \int_{-\infty}^{\infty} e^{iux} p(x)\, dx, \quad u \in \mathbb{R}$$
for differentiable f.
For regular f, the inverse Fourier transform formula writes, for x ∈ R:
$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-iux}\, \mathcal{F}f(u)\, du \qquad (112)$$
Moreover, Fourier transforms may be extended to complex values of their argument u (complex Fourier transforms), for u in suitable strips of analyticity parallel to the real axis. One can then show that the law of any Fourier-integrable random variable X, with characteristic function denoted by Φ, admits a weak density p formally given by, for any η > 0:
$$p(x) = \frac{1}{2\pi} \int_{-\eta i - \infty}^{-\eta i + \infty} e^{-iux}\, \Phi(u)\, du \qquad (113)$$
meaning that for any regular enough function ϕ such that Eϕ(X) is well defined, we have (in the strong sense, now):
$$E\varphi(X) = \frac{1}{2\pi} \int_{u = -\eta i - \infty}^{-\eta i + \infty} \Phi(u) \int_{y = -\infty}^{\infty} e^{-iuy} \varphi(y)\, dy\, du \qquad (114)$$
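In particular, we have by (114) applied with $\varphi(y) = 1_{\underline{x} \le y \le x}$, for any $\underline{x} \in (-\infty, x]$:
$$F(x) - F(\underline{x}) = P(\underline{x} \le X \le x) = \frac{1}{2\pi} \int_{u = -\eta i - \infty}^{-\eta i + \infty} \Phi(u) \int_{y = \underline{x}}^{x} e^{-iuy}\, dy\, du = \frac{1}{2\pi} \int_{u = -\eta i - \infty}^{-\eta i + \infty} \Phi(u)\, \frac{e^{-iu\underline{x}} - e^{-iux}}{iu}\, du.$$
Now, by the Cauchy residue formula (see e.g. Titchmarsh [141]), in the limit $\underline{x} \to -\infty$: $\frac{1}{2\pi} \int_{u = -\eta i - \infty}^{-\eta i + \infty} \frac{e^{-iu\underline{x}}}{iu}\, \Phi(u)\, du = 1$. Therefore,
$$F(x) = \lim_{\underline{x} \to -\infty} \big(F(x) - F(\underline{x})\big) = 1 + \frac{1}{2\pi i} \int_{-\eta i + \infty}^{-\eta i - \infty} \frac{e^{-iux}\, \Phi(u)}{u}\, du. \qquad (115)$$
Moving the integration contour to the real axis, we get by application of the Cauchy residue formula (note that the integrals in (116) are only defined as principal values, see e.g. Lewis [110], Titchmarsh [141]):
$$F(x) = \frac{1}{2} - \frac{1}{2\pi i} \lim_{\varepsilon \to 0+} \int_{|u| > \varepsilon} \frac{e^{-iux}\, \Phi(u)}{u}\, du = \frac{1}{2} + \frac{1}{2\pi} \lim_{\varepsilon \to 0+} \int_{|u| > \varepsilon} \frac{i e^{-iux}\, \Phi(u)}{u}\, du. \qquad (116)$$
Since Φ(−u) is the complex conjugate of Φ(u), we have
$$\lim_{\varepsilon \to 0+} \int_{|u| > \varepsilon} \frac{i e^{-iux}\, \Phi(u)}{u}\, du = 2 \lim_{\varepsilon \to 0+} \mathrm{Re} \int_{\varepsilon}^{\infty} \frac{i e^{-iux}\, \Phi(u)}{u}\, du = 2 \int_{0}^{\infty} \mathrm{Re}\, \frac{i e^{-iux}\, \Phi(u)}{u}\, du,$$
whence
$$F(x) = \frac{1}{2} + \frac{1}{\pi} \int_{0}^{\infty} \mathrm{Re}\, \frac{i e^{-iux}\, \Phi(u)}{u}\, du \qquad (117)$$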
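Formula (117) can be checked numerically on a case where F is known. The sketch below applies a midpoint rule (which avoids the point u = 0) to the standard Gaussian characteristic function exp(−u²/2); the truncation level `u_max`, the number of nodes and the function names are choices made for this illustration.

```python
import cmath
import math

def cdf_from_char(phi, x, u_max=40.0, n=4000):
    """F(x) = 1/2 + (1/pi) * int_0^inf Re(i e^{-iux} phi(u) / u) du,
    truncated at u_max and discretized by the midpoint rule."""
    h = u_max / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h
        total += (1j * cmath.exp(-1j * u * x) * phi(u) / u).real
    return 0.5 + total * h / math.pi

def phi_normal(u):
    """Characteristic function of a standard Gaussian variable."""
    return cmath.exp(-0.5 * u * u)

f_at_1 = cdf_from_char(phi_normal, 1.0)
exact_at_1 = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))
```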
Theorem 30.1 The call value at time 0, $C_0 = E e^{-rT}(S_T - K)^+$, can be rewritten:
$$C_0 = S_0 e^{-qT}\, \Pi_1 - K e^{-rT}\, \Pi_2 \qquad (118)$$
where the pseudo-probabilities $\Pi_1$ and $\Pi_2$ are given in terms of the characteristic function $\Phi_T(u) = E[\exp(iux_T)]$ of $x_T = \ln(S_T)$, assumed to exist, as:
$$\Pi_1 = \frac{1}{2} + \frac{1}{\pi} \int_0^\infty \mathrm{Re}\, \frac{e^{-iuk}\, \Phi_T(u - i)}{iu\, \Phi_T(-i)}\, du, \qquad \Pi_2 = \frac{1}{2} + \frac{1}{\pi} \int_0^\infty \mathrm{Re}\, \frac{e^{-iuk}\, \Phi_T(u)}{iu}\, du. \qquad (119)$$
(with $\Phi_T(-i) = E S_T = S_0 e^{\kappa T}$ in (119), by arbitrage). Now in many models the characteristic function $\Phi_T$ is known and computable. This is for instance the case in all AJD models [63] (see e.g. section 5.5). Knowing $\Phi_T$, (118) enables one to compute $C_0$ numerically by quadrature (section 30.4).
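$$C_0 = E\big[e^{-rT} e^{x_T} 1_{\{x_T \ge k\}}\big] - K e^{-rT}\, P(x_T \ge k),$$
where k = ln K.
By (117),
$$\Pi_2 = P(x_T \ge k) = \frac{1}{2} + \frac{1}{\pi} \int_0^\infty \mathrm{Re}\, \frac{e^{-iuk}\, \Phi_T(u)}{iu}\, du.$$
Let us introduce the probability measure $\widetilde{P}$ equivalent to P defined by $\frac{d\widetilde{P}}{dP} = \frac{e^{x_T}}{E(e^{x_T})}$. So
$$E\big(e^{-rT} e^{x_T} 1_{\{x_T \ge k\}}\big) = e^{-rT} \int_{x \in \mathbb{R}} e^{x}\, 1_{\{x \ge k\}}\, dP\{x_T = x\}$$
$$\widetilde{E}\big(e^{iux_T}\big) = \int_{x \in \mathbb{R}} e^{iux}\, d\widetilde{P}\{x_T = x\} = \frac{E(e^{x_T} e^{iux_T})}{E(e^{x_T})} = \frac{\Phi_T(u - i)}{\Phi_T(-i)}.$$
$$\Pi_1 = \widetilde{P}(x_T \ge k) = \frac{1}{2} + \frac{1}{\pi} \int_0^\infty \mathrm{Re}\, \frac{e^{-iuk}\, \Phi_T(u - i)}{iu\, \Phi_T(-i)}\, du, \qquad (120)$$
or, equivalently: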
In a homogeneous model one easily deduces from (118), (117), (121) that
$$\Delta_0 = \partial_s C(T, K; 0, s) = e^{-qT}\, \Pi_1$$
For implementation details, including the choice of more efficient quadrature schemes than the rather rough trapezoid method above, as well as numerical issues related to the evaluation of the multi-valued complex functions involved in the formulas, we refer the reader to [92].
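As a sanity check of the pseudo-probabilities (119), one may evaluate them by quadrature in the BS model, where $x_T$ is Gaussian and the call price is known in closed form. The sketch below uses a midpoint rule and assembles the price as $e^{-rT}(\Phi_T(-i)\Pi_1 - K\Pi_2)$, which follows from the decomposition of $C_0$ used in the proof; the BS characteristic function, the quadrature parameters and the function names are assumptions of this example.

```python
import cmath
import math

def bs_char(s0, r, q, sigma, T):
    """Characteristic function z -> E[exp(i z x_T)] of x_T = ln S_T in the BS model."""
    m = math.log(s0) + (r - q - 0.5 * sigma ** 2) * T
    return lambda z: cmath.exp(1j * z * m - 0.5 * sigma ** 2 * T * z * z)

def pseudo_probabilities(phi, k, u_max=60.0, n=6000):
    """Midpoint-rule evaluation of Pi_1 and Pi_2 in (119) for the log-strike k."""
    h = u_max / n
    norm = phi(-1j)
    s1 = s2 = 0.0
    for j in range(n):
        u = (j + 0.5) * h
        e = cmath.exp(-1j * u * k)
        s1 += (e * phi(u - 1j) / (1j * u * norm)).real
        s2 += (e * phi(u) / (1j * u)).real
    return 0.5 + s1 * h / math.pi, 0.5 + s2 * h / math.pi

def call_price(phi, strike, r, T):
    p1, p2 = pseudo_probabilities(phi, math.log(strike))
    return math.exp(-r * T) * (phi(-1j).real * p1 - strike * p2)

phi = bs_char(100.0, 0.05, 0.0, 0.2, 1.0)
price = call_price(phi, 100.0, 0.05, 1.0)   # BS closed form gives about 10.4506
```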
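The idea is to compute the Fourier transform of the call price viewed as a function of the log-strike k = ln K. However, this function C(k) is not integrable (since $\lim_{k \to -\infty} C(k) = e^{-rT} E S_T = s e^{-qT} > 0$). We thus define the modified price $c(k) = e^{\alpha k} C(k)$, for some fixed α > 0. The modified price is integrable, and we have, for any u ∈ R:
$$\begin{aligned}
\mathcal{F}c(u) &= \int_{-\infty}^{\infty} e^{iku} c(k)\, dk = \int_{-\infty}^{\infty} e^{(\alpha + iu)k} C(k)\, dk \\
&= e^{-rT} \int_{-\infty}^{\infty} e^{(\alpha + iu)k} \int_{-\infty}^{\infty} \big(e^{x} - e^{k}\big)^{+} \rho_T(x)\, dx\, dk \\
&= e^{-rT} \int_{-\infty}^{\infty} e^{(\alpha + iu)k} \int_{k}^{\infty} \big(e^{x} - e^{k}\big) \rho_T(x)\, dx\, dk \\
&= e^{-rT} \int_{-\infty}^{\infty} \rho_T(x) \int_{-\infty}^{x} e^{(\alpha + iu)k} \big(e^{x} - e^{k}\big)\, dk\, dx \\
&= e^{-rT} \int_{-\infty}^{\infty} \rho_T(x) \Big(e^{x} \int_{-\infty}^{x} e^{(\alpha + iu)k}\, dk - \int_{-\infty}^{x} e^{(\alpha + iu + 1)k}\, dk\Big)\, dx \\
&= e^{-rT} \int_{-\infty}^{\infty} \rho_T(x) \Big(\frac{e^{x}}{\alpha + iu} \Big[e^{(\alpha + iu)k}\Big]_{-\infty}^{x} - \frac{1}{\alpha + iu + 1} \Big[e^{(\alpha + iu + 1)k}\Big]_{-\infty}^{x}\Big)\, dx \\
&= e^{-rT} \int_{-\infty}^{\infty} \rho_T(x) \Big(\frac{e^{x} e^{(\alpha + iu)x}}{\alpha + iu} - \frac{e^{(\alpha + iu + 1)x}}{\alpha + iu + 1}\Big)\, dx,
\end{aligned}$$
where the last equality follows from the fact that $\lim_{k \to -\infty} e^{(a + iu)k} = 0$, for any a > 0 and u ∈ R. Hence
$$\mathcal{F}c(u) = e^{-rT} \int_{-\infty}^{\infty} \rho_T(x)\, \frac{e^{(\alpha + iu + 1)x}}{(\alpha + iu)(\alpha + iu + 1)}\, dx = \frac{e^{-rT}}{(\alpha + iu)(\alpha + iu + 1)} \int_{-\infty}^{\infty} e^{i(u - (\alpha + 1)i)x} \rho_T(x)\, dx,$$
that is:
$$\mathcal{F}c(u) = \frac{e^{-rT}\, \Phi_T(u - (\alpha + 1)i)}{(\alpha + iu)(\alpha + iu + 1)},$$
in which $\Phi_T$ is the characteristic function of $x_T = \ln(S_T)$, assumed to be known.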
The modified call price function c(k), hence the function C(k) (call price for varying log-strike k = ln K), may then be retrieved numerically by discrete Fourier transform. Indeed, we have by the inverse Fourier transform formula (112):
$$C(k) = e^{-\alpha k} c(k) = \frac{e^{-\alpha k}}{2\pi} \int_{-\infty}^{\infty} e^{-iku}\, \mathcal{F}c(u)\, du \approx \frac{e^{-\alpha k}}{2\pi} \int_{-\frac{A}{2}}^{\frac{A}{2}} e^{-iku}\, \mathcal{F}c(u)\, du \approx \frac{e^{-\alpha k}}{2\pi}\, h \sum_{j=0}^{N-1} e^{-iku_j}\, \mathcal{F}c(u_j),$$
where $u_j = -\frac{A}{2} + jh$, that is,
$$C(k) \approx \frac{e^{ik\frac{A}{2} - \alpha k}}{2\pi}\, h \sum_{j=0}^{N-1} e^{-ijkh}\, \mathcal{F}c(u_j),$$
so for $k_n = \frac{2\pi n}{Nh}$ (recall that A = hN):
$$C(k_n) \approx \frac{e^{in\pi} e^{-\alpha k_n}}{2\pi}\, h \sum_{j=0}^{N-1} e^{-2\pi i \frac{jn}{N}}\, \mathcal{F}c(u_j),$$
In the last sum, we recognize the discrete Fourier transform of $(\mathcal{F}c(u_j))_j$ at the evaluation point n. Recall that the discrete Fourier transform $(\mathcal{F}f_n)_{0 \le n \le N-1}$ of a vector $(f_j)_{0 \le j \le N-1}$ writes:
$$\mathcal{F}f_n = \sum_{j=0}^{N-1} e^{-2i\pi \frac{jn}{N}} f_j, \quad 0 \le n \le N - 1.$$
It might seem at first sight that $N^2$ operations are required to compute the vector $\mathcal{F}f$. However, when N is a power of 2, the Fast Fourier Transform (FFT) algorithm allows one to compute $\mathcal{F}f$ at cost O(N ln N).
In summary, we can price a call for N values of the strike K by applying the discrete Fourier transform to an explicitly known function, which can be done at cost O(N ln N), provided N is a power of 2.
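The whole procedure can be sketched in a few lines, here with a textbook recursive radix-2 FFT written in pure Python and the BS characteristic function as test case. The damping parameter α = 1.5, the grid (N = 256, h = 0.25), the nodes $u_j = -A/2 + jh$ and the function names are assumptions of this illustration, not prescriptions of the notes.

```python
import cmath
import math

def fft(f):
    """Radix-2 FFT of a list whose length is a power of 2:
    returns [sum_j exp(-2i*pi*j*n/N) * f[j] for n in range(N)]."""
    n = len(f)
    if n == 1:
        return list(f)
    even, odd = fft(f[0::2]), fft(f[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + tw, even[k] - tw
    return out

def bs_char(z, s0, r, q, sigma, T):
    """E[exp(i z x_T)] for x_T = ln S_T in the BS model (z may be complex)."""
    m = math.log(s0) + (r - q - 0.5 * sigma ** 2) * T
    return cmath.exp(1j * z * m - 0.5 * sigma ** 2 * T * z * z)

def call_prices_fft(s0, r, q, sigma, T, alpha=1.5, N=256, h=0.25):
    """Call prices C(k_n) on the log-strike grid k_n = 2*pi*n/(N*h)."""
    A = N * h
    us = [-A / 2 + j * h for j in range(N)]
    Fc = [math.exp(-r * T) * bs_char(u - (alpha + 1) * 1j, s0, r, q, sigma, T)
          / ((alpha + 1j * u) * (alpha + 1j * u + 1)) for u in us]
    dft = fft(Fc)
    ks = [2 * math.pi * n / (N * h) for n in range(N)]
    # exp(i*n*pi) = (-1)^n; the real part is kept since C(k) is real.
    prices = [((-1) ** n * math.exp(-alpha * k) * h / (2 * math.pi) * dft[n]).real
              for n, k in enumerate(ks)]
    return ks, prices

ks, prices = call_prices_fft(1.0, 0.05, 0.0, 0.2, 1.0)   # spot normalized to 1
```

Here `prices[0]` is the at-the-money call (K = 1), to be compared with the BS closed form value of about 0.104506.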
for a standard P-Brownian motion W. Thus European vanilla call options on S have unique LV arbitrage
price processes given by, for t ∈ [0, T ] :
In a general LV model, we don't have closed pricing formulae. But the pricing function Π(T, K; t, S, σ) defined by (123) is the unique $W_{p,loc}^{1,2}$-solution (for p > 2; a.e. solutions with related partial derivatives in $L_{p,loc}$ in time-space) of the BS equation in the variables (t, S):
$$\begin{cases} -\partial_t \Pi - \kappa S\, \partial_S \Pi - \frac{1}{2}\sigma(t, S)^2 S^2\, \partial^2_{S^2} \Pi + r\Pi = 0, & t < T \\ \Pi|_T = (S - K)^+ \end{cases} \qquad (124)$$
$$\begin{cases} \partial_T \Pi + \kappa K\, \partial_K \Pi - \frac{1}{2}\sigma(T, K)^2 K^2\, \partial^2_{K^2} \Pi + q\Pi = 0, & T > t \\ \Pi|_t = (S - K)^+ \end{cases} \qquad (125)$$
These equations can be used to compute LV option prices and Greeks numerically. They also imply that whatever the market RN price process may be, there is always, at any date $t_0$, a LV model (dependent on $t_0$!) with the same spot marginals as the market RN price process at $t_0$. Assume $t_0 = 0$, without loss of generality. Given a suitable interpolation $\Pi_0 \in W_{p,loc}^{1,2}$ of the set $obs_0$ of observed European vanilla call (or possibly put) prices $\pi_0(T, K)$ at time 0, this `tangent diffusion process' of the market RN price process corresponds to the volatility function given by the following Dupire's formula [64], for any $(T, K) \in [0, \infty) \times \mathbb{R}_+$:
$$\frac{\widehat{\sigma}_0(T, K)^2}{2} = \frac{\partial_T \Pi_0 + \kappa K\, \partial_K \Pi_0 + q\Pi_0}{K^2\, \partial^2_{K^2} \Pi_0}(T, K) \qquad (126)$$
provided Dupire's ratio in the r.h.s. of (126) defines a positively bounded Borel function. But, practically speaking, it is virtually always possible to find an interpolation (and/or approximation within the bid-ask spread) $\Pi_0 \in W_{p,loc}^{1,2}$ of $\pi_0$ for which this is satisfied, and we shall refer to the related volatility function $\widehat{\sigma}_0(T, K)$ as the market (or market model, in case where prices π are in fact model prices) effective volatility (EV) function $\widehat{\sigma}_0$.
The practical problem of extracting effective volatility from market prices is tantamount to calibrating a LV model (see section 31.2). Once available, effective volatility is a useful tool in various tasks: hedging [52], calibrating stochastic volatility (SV) or SV+jumps models [24, 76], arbitraging basket options [9], etc.
But this calibration problem is under-determined (since the set of observed prices is finite whereas the nonparametric function σ has an infinity of degrees of freedom) and ill-posed. So numerical differentiation based on Dupire's formula (126) gives a local volatility which is both erratic and highly unstable, for instance when performing a day-to-day calibration (see Figure 15).
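The instability can be reproduced on synthetic data. Solving equation (125) for the volatility gives $\sigma(T,K)^2 = 2(\partial_T \Pi + \kappa K \partial_K \Pi + q\Pi) / (K^2 \partial^2_{K^2} \Pi)$; the sketch below evaluates this ratio by central finite differences on BS prices, assuming κ = r − q, with step sizes, price bump and function names chosen for illustration only.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, q, sigma, T):
    d1 = (math.log(s / k) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s * math.exp(-q * T) * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)

def dupire_local_variance(price, T, K, r, q, dT=0.01, dK=1.0):
    """Local variance sigma(T, K)^2 from a call price function price(T, K),
    with all derivatives replaced by central finite differences."""
    kappa = r - q                                  # assumed drift convention
    c = price(T, K)
    c_T = (price(T + dT, K) - price(T - dT, K)) / (2.0 * dT)
    c_K = (price(T, K + dK) - price(T, K - dK)) / (2.0 * dK)
    c_KK = (price(T, K + dK) - 2.0 * c + price(T, K - dK)) / dK ** 2
    return 2.0 * (c_T + kappa * K * c_K + q * c) / (K ** 2 * c_KK)

def clean(T, K):
    return bs_call(100.0, K, 0.05, 0.0, 0.2, T)

def noisy(T, K):
    """Same prices, except for a bump of 0.002 on a single quote."""
    return clean(T, K) + (0.002 if (T, K) == (1.0, 100.0) else 0.0)

var_clean = dupire_local_variance(clean, 1.0, 100.0, 0.05, 0.0)   # close to 0.2**2
var_noisy = dupire_local_variance(noisy, 1.0, 100.0, 0.05, 0.0)
```

A bump of 0.002 on a price of about 10.45 moves the local variance from about 0.04 to about 0.05 in this example.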
To solve this difficulty, the first idea that comes to mind is to look for σ within a parameterized family of functions. But finding classes of functions with all the flexibility required for fitting implied volatility surfaces with several hundred implied volatility points and a variety of shapes turns out to be a very challenging task (unless a large family of splines is considered [47], in which case the ill-posedness of the problem shows up again).
The best way to proceed is to stay non-parametric, and to use regularization methods with prior $\sigma^0$ to stabilize the calibration procedure. Since we use a non-parametric local volatility, the model contains a sufficient number of degrees of freedom to provide a perfect fit to virtually any market smile. And the regularization method guarantees that the local volatility thus calibrated is nice and smooth.
Among the various regularization methods at hand, the most popular one is the Tikhonov variational regu-
larization method with prior σ 0 (cf. section 29.1).
[Figure 15: surface plot of the local volatility against Date and Stock (Normalized).]
Figure 15: Local Volatility obtained by application of Dupire's formula on the DAX index, May 2, 2001.
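$$\min_\sigma\ \big\|\Pi|_{obs_0}(\sigma) - \pi_0\big\|^2 + \alpha\, d(\sigma, \sigma^0) \qquad (127)$$
where $d(\sigma, \sigma^0) = \|\sigma - \sigma^0\|^2_{H^1}$, with
$$\|u\|^2_{H^1} = \int_{T=0}^{\infty} \int_{K=0}^{\infty} \big(u^2 + (\partial_T u)^2 + (\partial_K u)^2\big)\, dT\, dK.$$
Crépey [54]: Trinomial tree implementation, using an extended Kamrad-Ritchken tree (like that of section 15.3.1), but with local volatility $\sigma(t_i, S_j)$.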
Example Calibration with Tikhonov regularization on the DAX index European options data set of May 2,
2001, with n = 75 time steps in the calibration tree (see Figure 16).
NIT NF CG F GNORM
0 1 0 3.300143167176308E-03 5.73346051E-02
1 2 4 2.628389922446460E-03 5.12570229E-02
...
65 67 546 1.874191365654949E-04 6.83920968E-07
[Figure 16: local volatility surface against Date and Stock (Normalized); implied volatility surface against Strike and Maturity; implied volatility mismatch against Moneyness for maturities 0.044, 0.121, 0.216, 0.389, 0.638 and 0.869.]
Figure 16: Local volatility (squared), implied volatility and calibration accuracy (DAX index).
N \ nproc    1        3        6
54           25s      9s       10s
101          4m30s    1m57s    1m36s
Table 3: CPU times on nproc processors on a fast Myrinet network; 352 DAX European options.
Remarks 31.1 This approach can be extended to the American local volatility calibration problem (see
Crépey [53]).
Example Calibration with Tikhonov regularization on the FTSE index American options data set of January
4, 1999 (see Figure 17).
NIT NF CG F GNORM
0 1 0 6.433226117966511E-03 2.84770204E-02
1 2 4 4.186829455917616E-03 1.50087510E-02
...
61 61 408 1.448329600171004E-03 6.62664338E-05
[Figure 17: local volatility surface against Date and Stock (Normalized); implied volatility surface against Strike and Maturity; implied volatility mismatch against Moneyness for maturities 0.128, 0.205, 0.282, 0.454 and 0.953.]
Figure 17: Local volatility (squared), implied volatility and calibration accuracy (SEI index).
$$\min_\sigma\ d(\sigma, \sigma^0) \quad \text{with } \big\|\Pi|_{obs_0}(\sigma) - \pi_0\big\| \le \delta,$$
where
$$d(\sigma, \sigma^0) = E_{0,S_0} \int_0^\infty \big(\sigma(t, S_t) - \sigma^0(t, S_t)\big)^2\, dt.$$
Using a dual formulation of the problem with Lagrange multipliers, the related optimization problem may be solved at cost O(|obs|), versus $O(n^2)$ in the case of Tikhonov regularization. The resolution is thus typically faster, but it also happens to be less robust, and the regularization is less efficient (since one only regularizes by the differences $\sigma - \sigma^0$, instead of $\nabla(\sigma - \sigma^0)$ in (127)) than with Tikhonov regularization.
Example Calibration with entropic regularization on the DAX European options data set of May 2, 2001
with n = 75 time steps in the calibration tree (see Figure 18).
entropy : -1.96329e-05
[Figure 18: local volatility surface against Date and Stock (Normalized); implied volatility surface against Strike and Maturity; implied volatility mismatch against Moneyness for maturities 0.044, 0.121, 0.216, 0.389, 0.638 and 0.869.]
Figure 18: Local volatility (squared), implied volatility and calibration accuracy.
Part VII
Appendix Financial Theory Essentials
The material in this part is contained in Bielecki et al. [25, 27], where a more general notion of defaultable option is considered. Here we only consider the case of default-free European or American options. Note however that defaultable claims with terminal payoffs like $1_{T < \tau_d}\, \phi(S_T)$ (or $1_{\tau < \tau_d}\, \phi(S_\tau)$ upon exercise at time τ, in case of American claims), where $\tau_d$ represents the default-time of a reference entity, can be handled in exactly the same way as below, provided the default-free discount factor process β below is replaced by a credit-risk adjusted discount factor α = βG, where $G_t$ represents a suitable survival probability beyond t (see [25, 26, 27, 28]).
A Arbitrage Theory
To model a derivative with maturity T, we consider, in the abstract set-up of section 2, a primary market
composed of the savings account B and of d risky assets such that on [0, T]:
• the discount factor process β, that is, the inverse of the savings account B, is a finite variation, continuous,
positive and bounded process, with $\beta_0 = 1$;
• the risky assets are locally bounded semimartingales.
The discount factor β is supposed to be absolutely continuous with respect to the Lebesgue measure, namely
\[ \beta_t = \exp\Big( -\int_0^t r_u \, du \Big) \]
for a bounded from below short-term interest rate process r.
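As an illustration of the formula for β, the discount factors can be computed on a time grid from sampled short rates. This is a minimal sketch: the trapezoidal quadrature and the function name `discount_factors` are choices made here, not part of the notes.

```python
import numpy as np

def discount_factors(times, short_rates):
    """Approximate beta_t = exp(-int_0^t r_u du) on a time grid.

    The integral is approximated by the trapezoidal rule (an assumption
    of this sketch); times[0] is taken as time 0, so beta[0] = 1.
    """
    times = np.asarray(times, dtype=float)
    r = np.asarray(short_rates, dtype=float)
    increments = 0.5 * (r[1:] + r[:-1]) * np.diff(times)
    integral = np.concatenate(([0.0], np.cumsum(increments)))
    return np.exp(-integral)

grid = np.linspace(0.0, 2.0, 5)
print(discount_factors(grid, np.full(5, 0.05)))  # flat 5% short rate
```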
The primary risky assets, with $\mathbb{R}^d$-valued price process X, may pay dividends, whose cumulative value
process, denoted by D, is assumed to be an $\mathbb{R}^d$-valued process of finite variation. Given the price process
X, we define the cumulative price $\widehat{X}$ of the asset as
\[ \widehat{X}_t = X_t + \beta_t^{-1} \int_{[0,t]} \beta_u \, dD_u . \tag{128} \]
In the financial interpretation, the last term in (128) represents the current value at time t of all dividend
payments of the asset over the period [0, t], under the assumption that all dividends are immediately
reinvested in the savings account B.
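For a single asset paying lump-sum dividends, the integral in (128) reduces to a finite sum. The sketch below assumes a constant short rate `r` and a dividend schedule given as `(date, amount)` pairs; both the function name and this representation of D are illustrative assumptions.

```python
import math

def cumulative_price(x_t, t, dividends, r):
    """Cumulative price of (128) for discrete dividends and a constant rate r.

    dividends is a list of (payment date u, amount dD_u); every dividend paid
    up to time t is carried forward to t at the savings account rate.
    """
    beta = lambda u: math.exp(-r * u)
    reinvested = sum(beta(u) * amount for u, amount in dividends if u <= t) / beta(t)
    return x_t + reinvested

# A dividend of 2 paid at u = 0.5, valued at t = 1 with r = 5%.
print(cumulative_price(100.0, 1.0, [(0.5, 2.0)], 0.05))
```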
A predictable trading strategy $(\zeta^0, \zeta)$ built on the primary market has the wealth process Y given as,
Moreover, if the wealth process Y is bounded from below, we say that the strategy $(\zeta^0, \zeta)$ is admissible.
We assume that the primary market model is free of arbitrage opportunities (though presumably incomplete),
in the sense that the so-called No Free Lunch with Vanishing Risk (NFLVR) condition is satisfied. This is a
specific no arbitrage condition involving wealth processes of admissible self-financing trading strategies, see
[61].
By the now classic arbitrage theory (see e.g. [61, 45, 25]), the NFLVR condition is equivalent to the existence
of a risk-neutral measure $\mathbb{P} \in \mathcal{M}$, where $\mathcal{M}$ denotes the set of probability measures $\mathbb{P} \sim \widehat{\mathbb{P}}$ such that $\beta \widehat{X}$ is
a $\mathbb{P}$-local martingale.
Definition A.1 An American option is a financial product with payoff process, as seen from the perspective
of the option holder, given by
\[ \mathbf{1}_{\{t<T\}} L_t + \mathbf{1}_{\{t=T\}} \xi , \tag{131} \]
where:
• the early exercise payment process $L = (L_t)_{t\in[0,T]}$ is a real-valued, bounded from below, càdlàg process;
• the payment at maturity T, ξ, is a bounded from below random variable.
A European option is a financial product with terminal payoff, as seen from the perspective of the option
holder, ξ at T, where ξ denotes a bounded from below real random variable.
In the simplest case, $\xi = (S_T - K)^{\pm}$, for a European vanilla call/put option with maturity T and strike K
on $S = X^1$, the first primary risky asset.
Recall that for t ∈ [0, T ], Et denotes the conditional expectation operator given Ft , and Tt (or simply T , in
case t = 0) stands for the set of all stopping times with values in [t, T ].
We know further (see e.g. Bielecki et al. [25]) that for any P ∈ M, the process $\Pi = (\Pi_t)_{t\in[0,T]}$ such that
\[ \beta_t \Pi_t = \operatorname{esssup}_{\tau \in \mathcal{T}_t} \mathbb{E}_t \Big[ \beta_\tau \big( \mathbf{1}_{\{\tau<T\}} L_\tau + \mathbf{1}_{\{\tau=T\}} \xi \big) \Big] , \quad t \in [0,T] \tag{132} \]
is an arbitrage price of the related American option as soon as it is a semimartingale. Moreover, any arbitrage
price of an American option is of this form provided
\[ \sup_{\mathbb{P}\in\mathcal{M}} \mathbb{E} \Big[ \sup_{t\in[0,T]} \big( \mathbf{1}_{\{t<T\}} L_t + \mathbf{1}_{\{t=T\}} \xi \big) \Big] < \infty . \tag{133} \]
In the case of a European option, for any P ∈ M, the process $\Pi = (\Pi_t)_{t\in[0,T]}$ such that
\[ \beta_t \Pi_t = \mathbb{E}_t \big[ \beta_T \xi \big] , \quad t \in [0,T], \tag{134} \]
is an arbitrage price of the option. Moreover, any arbitrage price of a European option is of this form
provided
Arbitrage prices like (132)–(134) are called P-arbitrage prices in the sequel.
In the case of a European option, we define a process L by βL = −(c + 1), where −c is a minorant of $\beta_T \xi$.
In view of the above results, this allows one to interpret a European option as a special case of an American
option. Henceforth, by default, `option' means American option, including European option, with L defined
in this way, as a special case.
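The representation (132) and the embedding of European options into American ones can be checked numerically in a simple discrete model. The following is a sketch under assumptions that are not in this appendix: a Cox-Ross-Rubinstein binomial tree with constant rate and volatility stands in for the risk-neutral model, and the function name `snell_price` and all parameter values are hypothetical. For a put, $\beta_T \xi \ge 0$, so one may take c = 0 and $L_t = -1/\beta_t = -e^{rt}$.

```python
import math

def snell_price(s0, r, sigma, maturity, n_steps, terminal, early):
    """Discrete analogue of (132) on a CRR binomial tree.

    terminal(s) plays the role of xi, early(t, s) that of L_t. The value at
    each node is the maximum of the discounted continuation value and the
    early exercise payment (dynamic programming form of the Snell envelope).
    """
    dt = maturity / n_steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    disc = math.exp(-r * dt)
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    values = [terminal(s0 * u**j * d**(n_steps - j)) for j in range(n_steps + 1)]
    for i in range(n_steps - 1, -1, -1):
        t = i * dt
        for j in range(i + 1):
            cont = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            values[j] = max(cont, early(t, s0 * u**j * d**(i - j)))
    return values[0]

put = lambda s: max(100.0 - s, 0.0)
# American put: L_t is the put payoff itself.
american = snell_price(100.0, 0.05, 0.2, 1.0, 200, put, lambda t, s: put(s))
# European put seen as an American option with beta_t * L_t = -(c + 1), c = 0.
european = snell_price(100.0, 0.05, 0.2, 1.0, 200, put, lambda t, s: -math.exp(0.05 * t))
print(american, european)
```

With this choice of L, early exercise is never optimal, so the value from (132) coincides with the discounted expectation (134), while the genuine American put is worth more.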
Definition B.1 By a primary strategy, we mean a triplet $(V_0, \zeta, \rho)$, such that:
• the number $V_0$ represents the initial wealth,
• ζ is a $\beta \widehat{X}$-integrable process representing holdings in primary risky assets,
• ρ is a real-valued semimartingale with $\rho_0 = 0$.
The wealth process V of a primary strategy $(V_0, \zeta, \rho)$ is given by
\[ d(\beta_t V_t) = \zeta_t \, d(\beta_t \widehat{X}_t) + \beta_t \, d\rho_t , \quad t \in [0,T] \tag{136} \]
Note that ρ can be interpreted as a financing cost. So the primary strategy introduced in Definition B.1 is
not self-financing in the standard sense, unless ρ = 0.
Definition B.2 (i) An (issuer) hedge with residual cost ρ for an option is a primary strategy $(V_0, \zeta, \rho)$ with
related wealth process V such that, for t ∈ [0, T],
\[ V_t - \big( \mathbf{1}_{\{t<T\}} L_t + \mathbf{1}_{\{t=T\}} \xi \big) \ge 0 . \tag{137} \]
Hedges with no residual cost (that is, with ρ = 0) are also called superhedges.
In the special case of European options, if equality holds in (137) at t = T, so $V_T = \xi$, we then deal with a
replicating strategy with residual cost ρ, or simply, in case ρ = 0, a replicating strategy.
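A discrete-time reading of the wealth dynamics (136) and of the replication condition may help. The sketch below is an illustrative assumption, not the construction of the notes: it uses an Euler-type recursion on a time grid, and a one-period binomial market with zero interest rate in which a call struck at 100 is replicated with ρ = 0. The function name `wealth_path` is hypothetical.

```python
def wealth_path(v0, zeta, beta, x_hat, rho):
    """Discrete analogue of (136) on a time grid.

    beta, x_hat and rho hold values on the grid (length n + 1); zeta holds
    the holdings chosen at the start of each of the n periods.
    """
    discounted = beta[0] * v0
    path = [v0]
    for k in range(len(zeta)):
        discounted += zeta[k] * (beta[k + 1] * x_hat[k + 1] - beta[k] * x_hat[k])
        discounted += beta[k] * (rho[k + 1] - rho[k])
        path.append(discounted / beta[k + 1])
    return path

# One period, zero rates: the stock moves from 100 to 120 or 80.
# Call with strike 100: payoff 20 or 0; holdings 0.5, initial wealth 10, rho = 0.
up = wealth_path(10.0, [0.5], [1.0, 1.0], [100.0, 120.0], [0.0, 0.0])
down = wealth_path(10.0, [0.5], [1.0, 1.0], [100.0, 80.0], [0.0, 0.0])
print(up[-1], down[-1])
```

In both scenarios the terminal wealth equals the payoff, so equality holds in (137) at t = T and the strategy is replicating in the sense of Definition B.2.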
Given a risk-neutral measure P ∈ M, we shall now postulate suitable integrability and regularity conditions
embedded in the standing assumption that a related reflected BSDE (see e.g. [66]) has a solution. So we
shall introduce a reflected BSDE (E) with respect to (Ω, F, P), with data defined in terms of those of a given
option. Assuming that (E) has a solution (for which various sets of sufficient regularity and integrability
conditions are known in the literature), we will deduce explicit hedging strategies with minimal initial wealth
for the related option.
Let $\mathcal{H}^2(\mathbb{P})$ denote the set of real-valued P-martingales vanishing at time 0, with integrable quadratic variation
over [0, T].
Definition B.3 By a (P-)solution to (E), we mean a triplet (Π, m, k) such that all conditions in (E) are
satisfied, where:
• the state process Π is a real-valued, càdlàg process,
• m lies in $\mathcal{H}^2(\mathbb{P})$, and
• k is a non-decreasing continuous process (null at time 0).
Note that if (Π, m, k) is a solution to (E), then Π is a semimartingale, and by a standard verification principle,
we have that
\[ \beta_t \Pi_t = \operatorname{esssup}_{\tau \in \mathcal{T}_t} \mathbb{E}_t \Big[ \beta_\tau \big( \mathbf{1}_{\{\tau<T\}} L_\tau + \mathbf{1}_{\{\tau=T\}} \xi \big) \Big] , \tag{138} \]
which is also equal to $\mathbb{E}_t [ \beta_T \xi ]$ in the case of a European option. So the state process Π of any P-solution to
(E) is an arbitrage P-price of the option.
Recall that an $\mathbb{R}^d$-valued semimartingale Y is called a sigma martingale (see e.g. [85, 130, 45]) if there
exists an $\mathbb{R}^d$-valued local martingale M and an $\mathbb{R}^d$-valued process H such that each $H^i$ is $M^i$-integrable,
i = 1, . . . , d (in particular, H must be predictable), and $Y^i = Y_0^i + \int H^i \, dM^i$ for i = 1, . . . , d. Any local
martingale is a sigma martingale, and any locally bounded sigma martingale is a local martingale.
Theorem B.4 (By application of Bielecki et al. [27]) Assume that the BSDE (E) has a solution (Π, m, k).
Then Π is an arbitrage P-price process of the option. Moreover:
(i) $\Pi_0$ is the minimal initial wealth of a hedge with P-sigma martingale residual cost.
(ii) For any $\mathbb{R}^{q \otimes d}$-valued, (row by row) $\beta \widehat{X}$-integrable process Λ such that N has components in $\mathcal{H}^2(\mathbb{P})$,
where
\[ \beta_t \, dN_t = \Lambda_t \, d(\beta_t \widehat{X}_t) , \quad t \in [0,T], \tag{139} \]
let us define ζ by
\[ \zeta_t = \big[ z_t , \; R_t - \bar{\Pi}_{t-} \big] \Lambda_t , \quad t \in [0,T] \tag{141} \]
with $\bar{\Pi}_t = \Pi_t + \beta_t^{-1} \int_{[0,t]} \beta_u \, dk_u$, and $\rho = m - \int_0^{\cdot} z \, dN$. Then $(\Pi_0, \zeta, \rho)$ is a hedge with initial wealth Π and
Remarks B.1 (i) The special case ρ=0 in the previous results corresponds to a suitable form of model
completeness (replicability of European options, cf. last point of Theorem B.4), in which the issuer of the
option may and wishes to hedge all the risks embedded in the option.
The case where ρ ≠ 0 corresponds to either model incompleteness, or a situation of model completeness in
which the issuer may but wishes not to hedge all the risks embedded in the option, for instance because she
wants to limit transaction costs, or because she wishes to take some bets in specific risk directions.
(ii) In cases where ρ may be taken equal to 0 in Theorem B.4, the minimality statements in this Theorem
may be used to prove uniqueness of the related arbitrage prices (see [26]).
Definition C.1 We say that the BSDE (E) is a decoupled Markovian Forward-Backward SDE (Markovian
FBSDE, for short), if the input data r, ξ and L of (E) are given by Borel functions of some (Ω, F, P)-Markov
factor process Z with values in a suitable state space (finite-dimensional state space with first component
given by time t), so
\[ r_t = r(Z_t) , \quad \xi = \xi(Z_T) , \quad L_t = L(Z_t) \tag{142} \]
In particular, the system made of the specification of a forward dynamics for Z, together with the BSDE
(E), constitutes a decoupled Markovian forward-backward system of equations in (Z, Π, m, k). The system
is decoupled in the sense that the forward component of the system serves as an input for the backward
component (Z is an input to (E), cf. (142)), but not the other way round.
From the point of view of interpretation, the components of Z are observable factors. The first component
of Z (indexed by 0) is $Z_t^0 = t$. As for the other components of Z:
• the first d following components of Z are given by the primary risky asset price processes $X^i$, i = 1, . . . , d;
• the remaining components of Z (if any) are to be understood as simple factors that may be required to
`Markovianize' the payoffs of the option (e.g. factors accounting for path dependence in the option's payoff,
and/or non-traded factors such as stochastic volatility in the dynamics of the assets underlying the option).
Note that due to the nature of our model, observability of the factor process Z in the mathematical sense
of F-adaptedness is not sufficient in practice. In order for the model to be usable in practice, a constructive
mapping from a collection of meaningful and directly observable economic variables to Z is really needed.
Otherwise, the model will be useless.
Since the model is defined under a risk-neutral probability P ∈ M, this mapping will typically be achieved
in practice by calibration of Z to a set of observed prices of traded derivatives. Of course this `calibration' is
obvious for the time component of Z or the components of Z which are represented in X: simply fix these
components equal to the current time or to the current market values of the related assets.
In this view, given an integer p and a finite set E of size l, we define the following linear operator A acting
on regular functions Π = Π(t, x, y), $(t, x, y) \in [0,T] \times \mathbb{R}^p \times E$:
\[
\begin{aligned}
A_t \Pi(t,x,y) ={}& \frac{1}{2} \sum_{i,j=1}^{p} a_{ij}(t,x,y) \, \partial^2_{x_i x_j} \Pi(t,x,y) \\
&+ \sum_{i=1}^{p} \Big( b_i(t,x,y) - g(t,x,y) \int_{\mathbb{R}^p} u_i(t,x,y,x') \, h(t,x,y,dx') \Big) \partial_{x_i} \Pi(t,x,y) \\
&+ g(t,x,y) \int_{\mathbb{R}^p} \big( \Pi(t, x + u(t,x,y,x'), y) - \Pi(t,x,y) \big) \, h(t,x,y,dx') \\
&+ \sum_{y' \in E} \lambda(t,x,y,y') \big( \Pi(t,x,y') - \Pi(t,x,y) \big)
\end{aligned}
\tag{143}
\]
where:
• the a(t, x, y) are symmetric non-negative matrices,
• the b(t, x, y) are drift vector coefficients,
• the jump intensity function g(t, x, y) is non-negative, the h(t, x, y, ·) are probability measures on $\mathbb{R}^p$, and
$u(t, x, y, x')$ is the jump size function,
• the intensity matrix function $[\lambda(t,x,y,y')]_{y,y' \in E}$ is such that $\lambda(t,x,y,y') \ge 0$ whenever $y \ne y'$, and
$\lambda(t,x,y,y) = - \sum_{y' \in E \setminus \{y\}} \lambda(t,x,y,y')$, by convention.
Under appropriate technical conditions on the coefficients, the existence and uniqueness in law of a stochastic
basis (Ω, F, P) and an (Ω, F, P)-Markov process Z = (t, X, Y) solving the martingale problem for A results
from Theorems 4.1 and 5.4 in Chapter 4 of Ethier and Kurtz [69].
Equivalently, under these conditions (see also [57]), there exists a stochastic basis (Ω, F, P) and an (Ω, F, P)-
Markov process Z = (t, X, Y), the latter unique in law, such that:
• The $\mathbb{R}^p$-valued process X (for some p ≥ d) satisfies, for t ≥ 0:
\[ dX_t = b(t, X_t, Y_t) \, dt + \sigma(t, X_t, Y_t) \, dW_t + \int_{\mathbb{R}^p} u(t, X_{t-}, Y_{t-}, x) \, \tilde{\chi}(dt, dx) \tag{144} \]
where:
$\tilde{\chi}(dt, dx)$ denotes the P-compensated martingale measure associated with a Poisson random measure
χ with P-intensity measure (or P-compensator measure) $g(t, X_t, Y_t) \, h(t, X_t, Y_t, dx) \, dt$, so
\[ \tilde{\chi}(dt, dx) = \chi(dt, dx) - g(t, X_t, Y_t) \, h(t, X_t, Y_t, dx) \, dt ; \]
the dispersion matrix function σ(t, x, y) satisfies the equality $\sigma(t,x,y) \sigma(t,x,y)^{\mathsf{T}} = a(t,x,y)$;
\[ dY_t = \sum_{y \in E} \lambda(t, X_t, Y_t, y)(y - Y_t) \, dt + \sum_{y \in E} (y - Y_{t-}) \, d\tilde{\nu}_t(y) \tag{145} \]
in which ν is the counting measure, such that $\nu_t(y)$ is equal to the number of transitions to state y since
time 0.
In particular, we then have the following Itô formula, in which $\partial_x$ denotes the row-gradient of Π(t, x, y) with
respect to x:
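The dynamics (144) and (145) can be simulated by a simple time-stepping scheme. The following sketch relies on several simplifying assumptions that are not in the notes: p = 1, two regimes E = {0, 1}, additive jumps $u(t, x, y, x') = x'$ with Gaussian marks, an Euler step for the diffusion part, and a first-order Bernoulli approximation of the regime transitions. The function name `simulate` and all parameter names are hypothetical.

```python
import numpy as np

def simulate(x0, y0, horizon, n_steps, rng,
             drift, vol, intensity, mark_mean, mark_std, switch_rate):
    """Euler path of a 1D regime-switching jump-diffusion with regimes {0, 1}.

    drift, vol, intensity and switch_rate are functions of (t, x, y);
    switch_rate gives the intensity of leaving the current regime.
    """
    dt = horizon / n_steps
    x = np.empty(n_steps + 1)
    y = np.empty(n_steps + 1, dtype=int)
    x[0], y[0] = x0, y0
    for k in range(n_steps):
        t, xk, yk = k * dt, x[k], y[k]
        g = intensity(t, xk, yk)
        # Drift net of the jump compensator g * E[mark], as in (144).
        dx = (drift(t, xk, yk) - g * mark_mean) * dt
        dx += vol(t, xk, yk) * np.sqrt(dt) * rng.standard_normal()
        n_jumps = rng.poisson(g * dt)
        if n_jumps:
            dx += rng.normal(mark_mean, mark_std, size=n_jumps).sum()
        x[k + 1] = xk + dx
        # Regime switch with probability lambda * dt over the step.
        y[k + 1] = 1 - yk if rng.random() < switch_rate(t, xk, yk) * dt else yk
    return x, y

rng = np.random.default_rng(0)
x_path, y_path = simulate(
    1.0, 0, 1.0, 250, rng,
    drift=lambda t, x, y: 0.0,
    vol=lambda t, x, y: 0.1 if y == 0 else 0.3,   # volatility depends on regime
    intensity=lambda t, x, y: 2.0,
    mark_mean=-0.05, mark_std=0.1,
    switch_rate=lambda t, x, y: 1.0,
)
print(x_path[-1], y_path[-1])
```

Because the jump integral in (144) is taken against the compensated measure, the drift of the simulated path is b alone; the scheme therefore subtracts the mean jump contribution before adding the sampled jumps.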
Let us now put a bit more structure on the primary dividend process D, which to this point was only assumed
to be an $\mathbb{R}^d$-valued càdlàg process of finite variation. So we further assume that D is P-integrable and of
the form $D = \int_0^{\cdot} c(Z_t) \, dt$, for some Borel function c.
Observe that under the following arbitrage P-consistency condition on the model coefficients:
then $\beta \widehat{X}$ is a P-local martingale.
Given such an (Ω, F, P)-model with related Markovian FBSDE (E) for an option, let us further define N
with components in $\mathcal{H}^2(\mathbb{P})$ by $N^{\mathsf{T}} = (W^{\mathsf{T}}, \tilde{\nu}^{\mathsf{T}})$, with q = p + l; in case when l = 1, that is in case when the
regime indicator process is constant, the one-dimensional process $\tilde{\nu}$ in (146) is trivially null and plays no
role whatsoever, and so, we may take q = p. Denote by $\mathcal{P}$ the F-predictable σ-algebra on Ω × [0, T] and by
$\mathcal{P}(E)$ the σ-algebra of all subsets of E. A solution (Π, m, k) to (E) is then typically sought for with m in the
form
\[ m_t = \int_0^t z_u \, dN_u + n_t = \int_0^t z_u \, dN_u + \int_0^t \int_{\mathbb{R}^p} v_u(x) \, \tilde{\chi}(du, dx) \tag{149} \]
Assuming further that N satisfies (139), then all assumptions are satisfied in Theorem B.4. We thus get a
decoupled forward-backward system of stochastic differential equations in the unknowns (Z, Π, z, v, k) (see
e.g. Ma–Yong [112]).
The issues of solving (E) under rather general assumptions covering in particular the cases corresponding
to typical financial applications (e.g. convertible bonds, see [25, 26]), and the related variational inequality
approach in the Markovian FBSDE case, are addressed in [56, 57] (see also [28]). In particular, we prove
in [56, 57] that, under mild regularity conditions, (E) has a unique solution (Π, m, k), with m in the form
(149), in suitable Hilbert spaces. In the Markovian FBSDE case, we establish the relation between this
solution and the unique solution in some sense (viscosity solution with growth conditions or weak solution
in weighted Sobolev spaces), Π(t, x, y), to the following PIDE obstacle problem (system of l coupled PIDEs
with obstacle in space-dimension p):
with terminal condition Π(T, x, y) = ξ(x, y). Then the above-mentioned relation reads:
\[ \Pi_t = \Pi(Z_t) , \quad t \in [0,T] \]
plus extra relations between the z-component of the solution of (E) and $\partial_x \Pi(t, X_t, Y_t)$, in regular cases. The
related hedging strategy with residual cost ρ in Theorem B.4 can also be expressed in terms of the solution
to the above PIDE problem.
In summary, in a generic Markovian market model, option prices and Greeks are given as solutions to related
risk-neutral pricing FBSDEs / PIDEs.
In some cases (models of the AJD class [63], or SABR model [80]), semi-analytic formulas are available for
the prices and Greeks of European vanillas. In any case, the pricing PIDEs can be solved numerically by
stochastic simulation methods, or, provided the dimension of the model is not too large, by deterministic
PIDE numerical schemes.
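As a sanity check of the simulation route, a Monte Carlo price can be compared with a closed-form price in the simplest model. The sketch below uses the Black-Scholes model with constant coefficients as an assumed special case (no jumps, no regimes); the function names `bs_call` and `mc_call` and the parameter values are illustrative only.

```python
import math
import numpy as np

def bs_call(s0, strike, r, sigma, maturity):
    """Black-Scholes price of a European call (closed form)."""
    norm_cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma**2) * maturity) / (sigma * math.sqrt(maturity))
    d2 = d1 - sigma * math.sqrt(maturity)
    return s0 * norm_cdf(d1) - strike * math.exp(-r * maturity) * norm_cdf(d2)

def mc_call(s0, strike, r, sigma, maturity, n_paths=200_000, seed=0):
    """Monte Carlo estimate of E[beta_T * xi] and its standard error."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    # Exact simulation of the terminal stock price under the risk-neutral measure.
    s_T = s0 * np.exp((r - 0.5 * sigma**2) * maturity + sigma * np.sqrt(maturity) * z)
    discounted = np.exp(-r * maturity) * np.maximum(s_T - strike, 0.0)
    return float(discounted.mean()), float(discounted.std(ddof=1) / np.sqrt(n_paths))

exact = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
estimate, std_err = mc_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(exact, estimate, std_err)
```

The standard error decreases like the inverse square root of the number of paths regardless of the dimension of Z, which is why simulation remains usable when the dimension is too large for deterministic schemes.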
Acknowledgements. These notes grew out of a graduate course of the Master Program in Financial
Engineering at Evry University (M2IF program, see www.univ-evry.fr/m2if ). I wish to express warm thanks
to the students, with a special mention to Arslan Bendimerad and Rémi Boyer, who produced a preliminary
version of the material of sections 30.5 and 6 in the context of their master project.
References
[1] Y. ACHDOU AND O. PIRONNEAU, Computational Methods for Option Pricing. Vol. 30 of Frontiers
in Applied Mathematics, SIAM, Philadelphia, PA, 2005.
[3] A. L. AMADORI. The obstacle problem for nonlinear integro-differential operators arising in option
pricing. Submitted.
[4] A. L. AMADORI. Nonlinear integro-differential evolution problems arising in option pricing: a viscosity
solutions approach. Journal of Differential and Integral Equations, 16(7):787–811, 2003.
[5] L. ANDERSEN. Efficient Simulation of the Heston Stochastic Volatility Model. Banc of America
Securities, January 23, 2007.
[7] L. ANDERSEN AND J. SIDENIUS. Extensions to the Gaussian Copula: Random Recovery and
Random Factor Loadings, Journal of Credit Risk, Vol. 1, No. 1 (Winter 2004), pp. 29–70.
[8] A. ANTONOV, S. MECHKOV, AND T. MISIRPASHAEV. Analytical techniques for synthetic CDOs
and credit default risk measures, NumeriX, 2005.
[10] AVELLANEDA M., BUFF R., FRIEDMAN C., GRANDCHAMP N., KRUK L. AND NEWMAN J.
Weighted Monte Carlo: a new technique for calibrating asset-pricing models, IJTAF, (2001) 4, 1–29.
[13] M. AVELLANEDA, A. LEVY, AND A. PARAS. Pricing and hedging derivative securities in markets
with uncertain volatilities, Applied Math. Finance, 1995.
[14] BALLY V., CABALLERO E., EL-KAROUI N. AND B. FERNANDEZ. Reflected BSDE's, PDE's and
Variational Inequalities. Bernoulli, To appear.
[15] BALLY, V. AND MATOUSSI, A. Weak solutions for SPDEs and Backward doubly stochastic differential
equations, Journal of Theoretical Probability, Vol. 14, No. 1, 125-164 (2001).
[16] BARLES, G., BUCKDAHN, R. AND PARDOUX, E. Backward Stochastic Differential Equations
and Integral-Partial Differential Equations, Stochastics and Stochastics Reports, Vol. 60, pp. 57-83
(1997).
[17] G. BARLES, C. DAHER AND M. ROMANO. Convergence of numerical schemes for parabolic equations
arising in finance theory, Math. Models Methods App. Sci., 5, no 1, pp. 125–143, 1995.
[18] BARLES, G. AND L. LESIGNE. SDE, BSDE and PDE. El Karoui, N. and Mazliak, L., (Eds.),
Backward Stochastic Differential Equations. Pitman Research Notes in Mathematics Series 364, pp.
47-80 (1997).
[19] G. BARLES AND P.E. SOUGANIDIS. Convergence of approximation schemes for fully nonlinear
second order equations, Asymptotic Anal., (4), pp. 271–283, 1991.
[20] J. BARRAQUAND AND T. PUDET. Pricing of American Path-Dependent Contingent Claims,
Mathematical Finance, 6, no 1, pp. 17–51, 1996.
[21] D. BATES. Jumps and SV: exchange rate processes implicit in deutsche mark options, Review of
Financial Studies, 9 (1) (1996), 69–107.
[23] A. BENSOUSSAN AND J.L. LIONS. Application des Inéquations Variationnelles en Contrôle
Stochastique, Dunod, 1978.
[24] BERESTYCKI, H., J. BUSCA, AND I. FLORENT. Asymptotics and calibration of local volatility
models, Quant. Finance 2 (1) (2002).
[25] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M. Arbitrage pricing of
defaultable game options with applications to convertible bonds, Quantitative Finance, Forthcoming.
[26] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M. Valuation and hedging
of defaultable game options in a hazard process model, Submitted, 2006.
[27] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M. Defaultable options in a
Markovian intensity model of credit risk, Submitted, 2006.
[28] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M. Convertible bonds in a
jump-diffusion model of credit risk, Work in preparation, 2006.
[29] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M., Valuation of basket
credit derivatives in the credit migrations environment. Handbook of Financial Engineering, To appear.
[30] BIELECKI, T.R., JEANBLANC, M. AND RUTKOWSKI, M. Pricing and trading credit default
swaps. Submitted, 2005.
[31] BIELECKI, T.R. AND RUTKOWSKI, M. Credit Risk: Modeling, Valuation and Hedging.
Springer-Verlag, Berlin, 2002.
[32] F. BLACK AND M. SCHOLES. The pricing of Options and Corporate Liabilities, Journal of Political
Economy, 81:635–654, 1973.
[33] N. BOULEAU AND D. LEPINGLE. Numerical methods for stochastic processes, Wiley, 1993.
[34] BOUCHARD B. AND ELIE R. Discrete time approximation of decoupled Forward-Backward SDE
with jumps, Submitted, 2006.
[35] BRACE, A., DUN, T. AND BARTON, G., Towards a Central Interest Rate Model, FMMA notes
working paper, 1998.
[36] A. BRACE, D. GATAREK AND M. MUSIELA. The market model of interest rate dynamics,
Mathematical Finance, 7(2) (1997), pp. 127–154.
[38] M.J. BRENNAN AND E.S. SCHWARTZ. The valuation of the American put option, J. of Finance,
32:449–462, 1977.
[39] BRIANI M, LA CHIOMA C AND NATALINI R. Convergence of numerical schemes for viscosity
solutions to integro-differential degenerate parabolic problems arising in financial theory, Numer. Math.,
2004, vol. 98, no 4, pp. 607-646.
[40] D. BRIGO AND F. MERCURIO. Interest Rate Models - Theory and Practice, Springer Finance,
2001.
[41] M. BROADIE AND P. GLASSERMAN. Pricing American-style securities using simulation, J. of
Economic Dynamics and Control, 21:1323–1352, 1997.
[42] J. BUSCA. A finite elements method for the valuation of American Options, Unpublished manuscript,
1998.
[43] P. CARR AND D. MADAN. Option Valuation using the fast Fourier transform, Journal of
Computational Finance, (1998), 2, 61-73.
[44] Y. ZHU, X. WU, I-L. CHERN. Derivative Securities and Difference Methods, Springer Finance, 2004.
[45] CHERNY, A. AND SHIRYAEV, A. Vector stochastic integrals and the fundamental theorems of asset
pricing, Proceedings of the Steklov Institute of mathematics, 237 (2002), pp. 6–49.
[46] L. CLEWLOW AND A. CARVERHILL. On the simulation of contingent claims, Journal of Derivatives,
66–73, Winter 1994.
[47] T. COLEMAN, Y. LI AND A. VERMA. Reconstructing the unknown volatility function. Journal of
Computational Finance, 2 (1999), 3, pp. 77–102.
[48] R. CONT AND P. TANKOV. Financial Modelling with Jump Processes, Chapman & Hall/CRC, 2003.
[49] R. CONT AND E. VOLTCHKOVA. Finite difference methods for option pricing in jump-diffusion
and exponential Lévy models, SIAM J. Numer. Anal. 43 (2005), 1596–1626.
[51] M. CRANDALL, H. ISHII AND P.-L. LIONS. User's guide to viscosity solutions of second order partial
differential equations, Bull. Amer. Math. Soc., 1992.
[52] S. CRÉPEY. Delta-hedging Vega Risk? Quantitative Finance, 4 (October 2004), pp. 559–579.
[53] S. CRÉPEY. Calibration of the Local Volatility in a trinomial tree using Tikhonov regularization.
Inverse Problems, 19 (2003), pp. 91–127.
[54] S. CRÉPEY. Calibration of the Local Volatility in a generalized Black–Scholes model using Tikhonov
regularization. SIAM Journal on Mathematical Analysis, Vol. 34 No 5 (2003), pp. 1183–1206.
[55] S. CRÉPEY. Contribution à des méthodes numériques appliquées à la Finance et aux Jeux
Différentiels. PhD Thesis, Ecole Polytechnique, France, January 2001.
[56] CRÉPEY, S., MATOUSSI, A.: Doubly Reflected BSDEs with Jumps: A Priori Estimates and
Comparison Principle, Work in preparation.
[57] CRÉPEY, S., JEANBLANC, M., AND MATOUSSI, A.: Markovian Doubly Reflected BSDEs in a
Jump-Diffusion Setting with Regimes. Work in preparation.
[58] C.W. CRYER. The solution of a quadratic programming problem using systematic overrelaxation,
SIAM J. Control, (9):385–392, 1971.
[59] C.W. CRYER. The efficient solution of linear complementarity problems for tridiagonal Minkowski
matrices, ACM Trans. Math. Software, (9):199–214, 1983.
[60] A. DEBUYSSCHER AND M. SZEGÖ. The Fourier Transform Method Technical Document, Moody's
Investors Service, Working paper, Jan 2003.
[61] F. DELBAEN AND W. SCHACHERMAYER. The Mathematics of Arbitrage, Springer Finance, 2005.
[62] K. DEMETERFI, E. DERMAN, M. KAMAL AND J. ZOU. More than you ever wanted to know
about volatility swaps, Quantitative Strategies Research Notes, March 1999. pp. 78–95.
[63] D. DUFFIE, J. PAN AND K. SINGLETON. Transform Analysis and Asset Pricing for Affine
Jump-Diffusions, Econometrica, 68 (2000), 6, pp. 1343–1376.
[65] B. DUMAS, J. FLEMING AND R. WHALEY. Implied volatility functions: empirical tests, J. of
Finance, 53 (1998), 6, pp. 2059–2106.
[66] EL KAROUI, N., PENG, S., AND QUENEZ, M.-C. Backward stochastic differential equations in
finance. Mathematical Finance 7 (1997), 1–71.
[67] H. W. ENGL, M. HANKE, AND A. NEUBAUER. Regularization of Inverse Problems. Kluwer,
Dordrecht, 1996.
[68] A. ERN, S. VILLENEUVE AND A. ZANETTE. Adaptive Finite element methods for local volatility
European option pricing. International Journal of Theoretical and Applied Finance 7(6), 2004.
[69] H.J. ETHIER AND T.G. KURTZ. Markov Processes. Characterization and Convergence. Wiley, 1986.
[70] W. FLEMING AND H. SONER. Controlled Markov processes and viscosity solutions, Second edition,
Springer, 2006.
[71] FOLLMER H. AND SCHIED A. Stochastic Finance, An Introduction in Discrete Time, De Gruyter,
2002.
[72] E. FOURNIÉ, J-M. LASRY, J. LEBUCHOUX, P-L. LIONS AND N. TOUZI. An application of
Malliavin calculus to Monte Carlo methods in Finance, Finance & Stochastics, vol. 4, no 3, 391–412,
1999.
[73] E. FOURNIÉ, J-M. LASRY, J. LEBUCHOUX AND P-L. LIONS. Applications of Malliavin calculus
to Monte-Carlo methods in finance. II, Finance & Stochastics, 5, 2001, p. 201-236.
[77] M. GIBSON. Understanding the Risk of Synthetic CDOs, FEDS Working Paper No. 2004-36, July
2004.
[79] R. GLOWINSKI, J-L. LIONS AND R. TREMOLIERES. Analyse Numérique des Inéquations
Variationnelles, Dunod, 1976.
[80] P. HAGAN, D. KUMAR, A. LESNIEWSKI AND D. WOODWARD. Managing Smile Risk. Wilmott
Magazine, 2002.
[81] S. HESTON. A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond
and Currency Options, Review of Financial Studies, 6(2) 327–43, 1993.
[82] J. HULL. Options, futures, & other derivatives, Prentice Hall, 5th edition, 2002.
[83] J. HULL AND A. WHITE. Forward rate volatilities, swap rate volatilities, and the implementation of
the libor market model, Journal of Fixed Income, 10(2) 46–62, 2000.
[84] P. JÄCKEL. Stochastic Volatility Models: Past, Present and Future, The Best of Wilmott 1, Chapter
23, 2005.
[85] JACOD, J. AND SHIRYAEV, A.N. Limit Theorems for Stochastic Processes. Springer, 2003.
[86] P. JAILLET, D. LAMBERTON AND B. LAPEYRE. Variational Inequalities and the Pricing of
American Options, Acta Applicandae Mathematicae 21, pp. 263–289, 1990.
[88] E. R. JAKOBSEN AND K. H. KARLSEN. Continuous dependence estimates for viscosity solutions
of integro-PDEs, J. Differential Equations 212(2): 278-318, 2005.
[89] F. JAMSHIDIAN. Valuation of credit default swap and swaptions. Finance and Stochastics 8, 343-371,
2004.
[90] F. JAMSHIDIAN. Libor and swap market models and measures, Working paper, Sakura Global Capital,
1996.
[91] Y. JIAO. Le risque de crédit: modélisation et simulation numérique. PhD Thesis (in English), École
polytechnique.
[92] C. KAHL, P. JÄCKEL. Not-so-complex logarithms in the Heston model. Working Paper, 2006.
[93] B. KAMRAD AND P. RITCHKEN. Multinomial approximating models for options with k state
variables, Management Science, 37:1640–1652, 1991.
[94] I. KARATZAS AND S. SHREVE. Brownian Motion and Stochastic Calculus, Springer, 1988.
[95] G.Z. KEMNA AND A.C.F. VORST. A pricing method for options based on average asset values, J.
Banking Finan., 113–129, March 1990.
[96] P. E. KLOEDEN AND E. PLATEN. The Numerical Solution of Stochastic Differential Equations,
Springer-Verlag, 1992.
[97] D.E. KNUTH. The Art of Computer programming, Seminumerical Algorithms, volume 2,
Addison-Wesley, 1981.
[98] H. KUSHNER. Probability Methods for Approximations in Stochastic Control and for Elliptic
Equations, Academic Press, 1977.
[99] H. KUSHNER AND P.G. DUPUIS. Numerical Methods for Stochastic Control Problems in Continuous
Time, Springer-Verlag, 1992.
[100] Y.W. KWOK. Mathematical models of financial derivatives, Springer Finance, 1998.
[101] R. LAGNADO AND S. OSHER. A Technique for Calibrating Derivative Security Pricing Models:
Numerical Solution of an Inverse Problem, Journal of Computational Finance, 1 (1997), 1, pp. 13–25.
[102] D. LAMBERTON AND B. LAPEYRE. Introduction to Stochastic Calculus Applied to Finance,
Chapman and Hall, 1996.
[103] B. LAPEYRE AND E. TEMAM. Competitive Monte Carlo methods for the pricing of Asian options,
Journal of Computational Finance, Volume 5 / Number 1, Fall 2001.
[104] B. LAPEYRE, E. PARDOUX AND R. SENTIS. Méthodes de Monte-Carlo pour les équations de
transport et de diffusion, Math. & Applic., SMAI, No 29, Springer, 1998.
[105] J.-P. LAURENT AND J. GREGORY. Basket Default Swaps, CDOs and Factor Copulas, Journal of
Risk, Forthcoming.
[106] P. L'ECUYER. Random numbers for simulation, Communications of the ACM, 33(10), October 1990.
[107] P. L'ECUYER. Uniform random number generation, The Annals of Operations Research, 53:77–120,
1994.
[109] F. DUBOIS, T. LELIEVRE. Efficient Pricing of Asian Options by the PDE Approach, Journal of
Computational Finance, Vol. 8, No. 2, pp. 55-64, Winter 2004/05.
[110] LEWIS, A. A simple option formula for general jump-diffusion and other exponential Lévy processes,
available from http://www.optioncity.net, 2001.
[111] D. LI. On default correlation: A copula function approach, In Journal of Fixed Income, 115–118, 2000.
[112] J. MA AND J. YONG. Forward-backward stochastic differential equations and their applications.
Springer, Lecture Notes in Math., vol. 1702 (1999).
[113] R.C. MERTON. Option pricing when underlying stock returns are discontinuous, J. Financial
Economics, 3 (1976), pp. 125–144.
[114] R.C. MERTON. The theory of rational option pricing, Bell J. Econ. Man. Sc., 4 (1973), pp. 141–183.
[115] METROPOLIS, N. AND ULAM, S. The Monte Carlo Method, J. Amer. Stat. Assoc., 44, 335-341
(1949).
[116] K. MILTERSEN, K. SANDMANN AND D. SONDERMANN, Closed form solutions for term structure
derivatives with log-normal interest rates,
[117] W.J. MOROKOFF AND R.E. CAFLISCH. Quasi-random sequences and their discrepancies, SIAM
Journal of Scientific Computing, 5(6):1251–1279, Nov 1994.
[118] K. MORTON AND D. MAYERS. Numerical Solution of Partial Differential Equations, Cambridge
University Press, 1994.
[119] M. MUSIELA AND M. RUTKOWSKI. Martingale Methods in Financial Modelling, Springer, 2004.
[120] H. NIEDERREITER. New developments in uniform pseudorandom number and vector generation, In
Springer, editor, In Lecture Notes in Statistics, 106, Monte Carlo and Quasi-Monte Carlo Methods in
Scientific Computing, volume 106, pages 87–120, 1994.
[121] H. NIEDERREITER, A.B. OWEN AND J. SHIUE, Editors. Randomly permuted (t,m,s)-Nets and
(t,s)-sequences, in Montecarlo and Quasi Montecarlo methods in Scientific Computing, Springer, New York,
1995.
[122] H. NIEDERREITER. Random Number Generation and Quasi Monte Carlo Methods, Society for
Industrial and Applied Mathematics, 1992.
[123] H. NIEDERREITER. Points sets and sequences with small discrepancy, Monatsh. Math, 104:273–337,
1987.
[125] D. O'KANE AND M. LIVESEY. Base Correlation Explained, Lehman Brothers Fixed Income
Quantitative Credit Research, Nov 2004.
[126] M. OVERHAUS ET AL. Equity Derivatives: Theory and Applications, Wiley Finance, 2002.
[127] D.W. PEACEMAN AND H.H. RACHFORD Jr. The numerical solution of parabolic and elliptic
differential equations, J. of SIAM, 3:28–42, 1955.
[129] W.H. PRESS, B.P. FLANNERY, S.A. TEUKOLSKY AND W.T. VETTERLING, Numerical Recipes
in C: The Art of Scientific Computing (Second Edition), Cambridge University Press, 1992.
[130] PROTTER, P. E., Stochastic Integration and Differential Equations, Second edition, Springer, 2004.
[131] R. REBONATO. Modern Pricing of Interest-Rate Derivatives: The LIBOR Market Model and Beyond,
Princeton University Press, 2002.
[132] REISINGER, C. Analysis of linear difference schemes in the sparse grid combination technique, SIAM
Journal on Numerical Analysis, 2006.
[133] D. REVUZ AND M. YOR. Continuous Martingales and Brownian Motion, 3rd edition. Springer, 2001.
[134] P. RITCHKEN. On pricing barrier options, Journal of Derivatives, pages 19–28, Winter 1995.
[135] L.C.G. ROGERS AND Z. SHI. The value of an asian option, J. Appl. Probab., 32(4):1077–1088, 1995.
[136] SAAD, Y. AND SCHULTZ, M.H. GMRES: A Generalized Minimal Residual Algorithm for Solving
Nonsymmetric Linear System, SIAM J. Sci. Statist. Comput., 7, 856-869, 1986.
[137] D. SAMPERI. Calibrating a Diffusion Pricing Model with Uncertain Volatility: Regularization and
Stability, Mathematical Finance, 12 (2002), 1, pp. 71–87.
[138] P.J. SCHÖNBUCHER. Credit Derivatives Pricing Models: Model, Pricing and Implementation, Wiley
Finance, 2003.
[139] I.M. SOBOL. The distribution of points in a cube and the approximate evaluation of integrals, U.S.S.R
Comput. Maths. Math. Phys, 16:236–242, 1976.
[140] D. TAVELLA AND C. RANDALL. Financial Instruments: The Finite Difference Method, Wiley
Financial Engineering, 2000.
[141] TITCHMARSH, E.C. The Theory of Functions, 2nd ed., Oxford 1997.
[142] O. VASICEK. Limiting loan loss probability distribution, Moody's KMV, 1991.
[143] S. VILLENEUVE AND A. ZANETTE. Parabolic ADI methods for pricing American option on two
stocks, Mathematics of Operations Research, Vol. 27, Issue 1, 121–149, 2002.
[145] P. WILMOTT, Cliquet Options and Volatility Models, Wilmott Magazine, 78-83, 2002.
[146] P. WILMOTT. Derivatives: the theory and practice of financial engineering, Wiley, 1998.
[147] P. WILMOTT, J. DEWYNNE AND S. HOWISON. Option Pricing, Oxford Financial Press (1993).
[148] H.A. WINDCLIFF, P.A. FORSYTH AND K.R. VETZAL, Numerical Methods for Valuing Cliquet
Options. Applied Mathematical Finance, Forthcoming.
[149] H. WINDCLIFF, P.A. FORSYTH AND K.R. VETZAL. Pricing methods and hedging strategies for
volatility derivatives, Journal of Banking and Finance, 30 (2006) 409–431.
[150] YANG, J., T. HURD AND X. ZHANG. Saddlepoint Approximation Method for Pricing CDOs, Journal
of Computational Finance, Vol. 10, No. 1, (Fall 2006).
[151] R. ZVAN, P.A. FORSYTH AND K.R. VETZAL. Robust numerical methods for pde models of asian
option, Journal of Computational Finance, 1:39–78, 1998.