Download as pdf or txt
Download as pdf or txt
You are on page 1of 110

COMPUTATIONAL FINANCE

Stéphane CRÉPEY, Évry University, France

stephane.crepey@univ-evry.fr

April 30, 2007

Figure 1: DAX Index Eective Volatility, June 1 2001

Contents

I Introduction and Preliminaries 7

1 Outline 7

2 General Set-Up 7

3 Accuracy Requirements and Computational Cost Considerations 8

4 Bibliographic Guidelines 9

II Market Models 11

5 BlackScholes and Beyond 11


5.1 BlackScholes Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

5.2 Heston model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

5.3 Merton model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12


5.4 Bates model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

5.5 Log-Spot Characteristic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

6 BGM Model 14
6.1 Changes of Numeraire . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

6.1.1 Application: Black Formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

6.2 LIBOR Rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

6.3 Caps and Floors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

6.4 Adding Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

6.4.1 Correlation Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

6.5 Swaptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

6.6 Model Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

7 One Factor Gaussian Copula Model 20


7.1 Single Tranche CDOs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

7.2 Li Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

7.3 Pricing Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

7.4 Hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

7.4.1 Hedging Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

7.4.2 Computation of the Deltas . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

7.4.3 P&L Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

8 Market Models in Practice 26


8.1 Black(Scholes) implied volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

8.2 Li implied correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

8.3 Towards Real-Life Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

III Finite Dierences Methods 29

9 Generic Pricing PIDE 29


9.1 Maximum Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

9.2 Weak Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

9.2.1 Viscosity Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

9.2.2 Weak Solutions in Weighted Sobolev Spaces . . . . . . . . . . . . . . . . . . . 31

10 Numerical Approximation 31
10.1 Finite dierences methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

10.1.1 Localization, Discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

10.1.2 Convergence and Convergence rates . . . . . . . . . . . . . . . . . . . . . . . 32

10.2 Finite Element Methods and Beyond . . . . . . . . . . . . . . . . . . . . . . . . . . . 33


10.2.1 Finite Volumes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

10.2.2 Sparse Grids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

11 Finite Dierences for European Vanilla Options 36


11.1 BS Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

11.2 Localization and Discretization in space . . . . . . . . . . . . . . . . . . . . . . . . . 36

11.3 Theta-schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

11.3.1 Explicit Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

11.3.2 Implicit Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

11.4 Adding Jumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

11.4.1 Localization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

11.4.2 Discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

12 Finite Dierences for American Vanilla Options 44


12.1 BS Variational inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

12.2 Splitting methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

12.3 Linear Complementarity Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

13 Finite Dierences for 2D Vanilla Options 45


13.1 Numerical integration by an ADI Method . . . . . . . . . . . . . . . . . . . . . . . . 46

13.2 American Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

14 Finite Dierences for Exotic Options 47


14.1 Lookback Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

14.2 Barrier Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

14.3 Asian options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

14.3.1 European Asian option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

14.3.2 American Asian put . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

14.4 Discretely Path-Dependent Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

14.4.1 Cliquet Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

IV Tree Methods 53

15 Trees for vanilla options 53


15.1 Cox-Ross-Rubinstein Binomial Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

15.1.1 CRR Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

15.2 Other Binomial Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

15.2.1 The Random Walk scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

15.2.2 The matching-3-moments scheme . . . . . . . . . . . . . . . . . . . . . . . . . 55

15.3 Trinomial trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55


15.3.1 The KamradRitchken tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

15.3.2 Trinomial schemes with matching rst two moments . . . . . . . . . . . . . . 56

15.4 Miscellaneous remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

15.4.1 Local consistency and convergence in law . . . . . . . . . . . . . . . . . . . . 58

15.4.2 Flat trees and American options . . . . . . . . . . . . . . . . . . . . . . . . . 59

16 Trees for exotic options 59


16.1 Barrier options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

16.2 Bermudean Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

V Monte Carlo Methods 61

17 Random numbers 61

18 Pseudo random generators 61


18.1 Properties required for a good pseudo-random numbers generator . . . . . . . . . . . 61

18.2 Constructing pseudo-random number generators . . . . . . . . . . . . . . . . . . . . 62

19 Low-discrepancy sequences 62
19.1 General remarks on low discrepancy sequences . . . . . . . . . . . . . . . . . . . . . 63

19.2 Sobol sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

20 Simulation of non-uniform random variables or vectors 63


20.1 Inverse method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

20.2 Simulation of Gaussian variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

20.3 Simulation of Gaussian vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

21 Principle of the Monte Carlo Simulation 65


21.1 Limit theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

21.2 Estimation principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

21.3 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

22 Variance Reduction Techniques 66


22.1 Antithetic Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

22.2 Control Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

22.3 Importance Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

22.4 Eciency of the Monte Carlo methods . . . . . . . . . . . . . . . . . . . . . . . . . . 67

23 Quasi Monte Carlo Simulation 67


23.1 Koksma-Hlawka inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

24 Greeking by (Quasi) Monte Carlo 68


24.1 Finite Dierences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

24.2 Derivation of the payo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

24.3 Derivation of the spot transition probability density . . . . . . . . . . . . . . . . . . 69

25 (Quasi) Monte Carlo algorithms for vanilla options 70


25.1 (Q)MC BS1D Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

25.1.1 Adding Jumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

25.2 (Q)MC BS2D Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

26 Simulation of processes 73
26.1 Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

26.2 BS model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

26.3 General diusions: Euler and Milshtein schemes . . . . . . . . . . . . . . . . . . . . . 75

26.3.1 Euler approximation scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

26.3.2 Milshtein approximation scheme (d = 1) . . . . . . . . . . . . . . . . . . . . . 76

26.3.3 Example: Heston model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

26.4 JumpDiusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

26.5 Monte Carlo Simulation for Processes . . . . . . . . . . . . . . . . . . . . . . . . . . 77

27 (Quasi) Monte Carlo methods for Exotic Options 77


27.1 Lookback options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

27.1.1 Andersen and Brotherton-Ratclie Algorithm . . . . . . . . . . . . . . . . . . 78

27.2 Barrier options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

27.3 Asian options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

27.4 Adding Jumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

28 Backtesting 80
28.1 Static versus dynamic hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

28.2 Discrete Time Delta-hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

28.2.1 Discrete Time Delta-hedging as a Control Variate Method . . . . . . . . . . . 82

28.3 Dynamic tests using the Brownian Bridge in the BS model . . . . . . . . . . . . . . 82

VI Calibration Methods 86

29 The ill-posed inverse calibration problem 86


29.1 Tikhonov regularization of non-linear inverse problems . . . . . . . . . . . . . . . . . 87

29.2 Nonlinear Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

30 A method using the characteristic function for European vanillas 89


30.1 Fourier Transform Miscellanea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
30.2 Option Pricing by Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

30.3 Derivation of the delta in the case of homogenous models . . . . . . . . . . . . . . . 91

30.4 Numerical Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

30.5 An alternative Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

31 Extracting Eective Volatility 93


31.1 Local versus Eective Volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

31.2 The Local Volatility Calibration problem . . . . . . . . . . . . . . . . . . . . . . . . 94

31.3 Approach by Tikhonov regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

31.4 Approach by entropic regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

VII Appendix  Financial Theory Essentials 99

A Arbitrage Theory 99

B Connection with Hedging 100

C Markovian FBSDE Approach 102

D Variational Inequality Approach in a JumpDiusion Setting with Regimes 102

These notes partly rely on the public releases of the option pricing software and documentation sys-
tem PREMIA developed since 1999 by the MATHFI project at INRIA www.inria.fr and CERMICS
cermics.enpc.fr. The aim of the Premia project is threefold: rst, to assist the R&D profes-
sional teams in their day-to-day duty, second, to help the academics who wish to perform tests of a
new algorithm or pricing method without starting from scratch, and nally, to provide the graduate
students in the eld of numerical methods for nance with open-source examples of implementation
of many of the algorithms found in the literature

www-rocq.inria.fr/mathfi/Premia/index.html
7

Part I
Introduction and Preliminaries
1 Outline
These notes bear on computational nance (pricing, Greeking and calibration methods), with a
focus on algorithmic aspects. The related theoretical results (convergence analysis, etc) are generally
stated without proof.

In parts III and IV, we discuss deterministic pricing methods (general nite dierences methods in
part III, and more specic tree methods in part IV). Part V is about stochastic simulation pricing
methods (Monte Carlo methods).
Note that there is no hermetic frontier between deterministic and stochastic methods. In a sense,
Monte Carlo (MC) methods are special cases of tree methods. This is more clearly visible on the more
general problem of MC pricing an American option, which is beyond the scope of these notes. Yet
it is also true of a standard MC algorithm for pricing an European option. Indeed the latter may be
interpreted as a one-time-step multinomial tree, which provides an exact discretization for the option
price in the limit where the number of discretization points in space (tree branches) goes to innity.
In essence (though rather caricaturally, needless to say), all these numerical schemes are based on
the idea of propagating the solution from a hypersurface of the time-space domain on which it is
known, along suitable (randomized) characteristics of the problem, like in the Riemann's method
of characteristics for solving hyperbolic rst-order equations (see e.g. [118, Chapter 4]). From a
more control-theoretical point of view, all these numerical schemes are based on Bellman's dynamic
programming principle [22].
Of course the dierence between trees in the usual sense and MC methods is that the MC mesh is
stochastically generated and unstructured.

The prices of nancial instruments are essentially given by the market, and made by oer-and-
demand (unless very exotic structures are considered). Market prices are in fact used (rather than
computed) by models, in the reverse-engineering" mode that consists in calibrating the model to
market prices, so that the model be consistent with the market for suitable values of its parameters.
This calibration process is the object of part VI. Once calibrated to the market, the model is eec-
tively used for Greeking (and/or pricing exotic structures), that is, computing the risk sensitivities
of a position, in order to set up a related hedge, or complementary position required for o-setting
such or such undesired risk.

Since the object of these notes is methods, not models, we present most methods on simple models,
like the BlackScholes model (in general), the Merton model, the Heston model, etc. Of course the
methods themselves are always generic to some degree, hence applicable in a broad range of models
to a broad range of nancial instruments. In the Appendix (part VII), we recall the basic facts of
nancial theory necessary to understand how a generic contingent claim pricing equation is derived
(equation (150)), in a Markovian risk-neutral primary market model. And, to start with, we shall
review in part II the benchmark models on the main derivative markets (equity, interest rate and
credit), with the related closed pricing formulae (so no computational methods are required in these
models, as far as pricing vanillas is concerned).

2 General Set-Up
We assume throughout that the evolution of a nancial market model can be modeled in terms of
stochastic processes dened on a continuous time stochastic basis (Ω, F, P),
b where P
b denotes the
objective (or physical, statistical..) probability measure. We can and do assume that the ltration
F satises the usual completeness and right-continuity conditions, and that all semimartingales are
càdlàg. Finally, since we are always in the context of pricing a contingent claim with maturity T,
8

we further assume that F = (Ft )t∈[0,T ] with F0 trivial and FT = F, for simplicity. Moreover, we
declare that a process on [0, T ] (resp. a random variable) has to be F-adapted (resp. F -measurable),
by denition.
We shall typically work under a risk-neutral (RN) measure P ∼ P
b (risk-neutral modeling), such
that the prices of the primary assets, once discounted and adjusted for any dividends, are (Ω, F, P)-
local martingales. Recall that, under mild technical conditions, existence of such a RN measure
P is equivalent to a suitable notion of no-arbitrage (NFLVR condition [61], see the Appendix). In
practical applications, it is convenient to think of P as the pricing measure chosen by the market
to price a contingent claim. We denote by Tt (or simply T, in caset = 0) the set of [t, T ]-valued
F-stopping times, and by E (resp. Et ) the P-expectation (resp. P-conditional expectation given Ft )
operator.

Note that pricing theory can also be developed in discrete time (see e.g. [71, 102]). The tree
(including MC) computational methods presented in these notes are then directly applicable (in
Markovian set-ups), and they are then trivially exact in the time direction, since no approximation
in time is required in this case. In particular, these methods can be used in static (one-period)
set-ups, such as often encountered in multi-name credit (static copula models etc, see section 7).

As for single-name credit, namely the pricing of defaultable claims with terminal payos of the form
1T <τd φ(ST ) (or 1τ <τd φ(Sτ ) upon exercice at time τ, in case of American claims), where τd represents
the default-time of a reference entity, it happens that such defaultable claims can be handled in
exactly the same way as default-free ones, provided a suitably credit-risk adjusted discount factor is
used, instead of the usual riskless discount factor (see the introductory paragraph to the Appendix).
Up to this simple modication, defaultable contingent claims can be treated in exactly the same
way as default-free ones, both from the theoretical and from the practical point of view [27, 28].
This means in particular that all the numerical methods presented in these notes can be used for
pricing and Greeking defaultable claims (or calibrating credit-risk models), provided the credit-risk
adjusted discount factor is used instead of the original default-free discount factor.
Incidentally, note that the original default-free discount factor can itself be interpreted as a default
probability (or killing rate, in stochastic processes terminology, see e.g. [133]).

3 Accuracy Requirements and Computational Cost Consider-


ations

A typical benchmark of accuracy in computational nance (this varies of course with the application
−4
at hand) is a 1bp (=10 ) error on normalized prices and Greeks, namely prices and Greeks for a
value of the spot scaled to unity.
As for computation times, the benchmark also greatly varies with the application at hand, but as
far as `real-time' option pricing is concerned, `instantaneous' pricing is the target, and more than
half-an-hour computation time (!!) is prohibitive.
Now a resolution within a 1bp normalized error by a nite dierences ADI method (the industry
standard today as far as deterministic methods are concerned, see section 13), in space dimension d,
typically requires 300 grid points per space dimension (most used numerical schemes are of order 2 of
d
accuracy in space and 1 in time), hence a related computation time (and storage cost) in O(300 ), i.e.
a computation time ranging from a few milliseconds for d = 1 to twenty minutes or so for d = 3. This
limits in practice the range of applicability of deterministic pricing methods to problems in space
dimension ≤ 3, unless sophisticated sparse grid or grid renement techniques (mostly available with
nite element methods) are used to counter Bellman's `curse of dimensionality' (referring to the fact
that in space dimension d, the computational cost of numerical integration grows exponentially like
md1 , where m1 is the number of discretization points in each space direction).

Table 1 provides a (crude) comparison of the computational costs of typical MC and deterministic
methods (MC algorithm with time discretization of the underlying factor process, cf. section 26.5,
versus ADI PDE method). A rough conclusion (note that we don't detail the constants involved in
our computational cost estimates) is that deterministic methods are more ecient (but often harder
9

to implement!) in space dimension ≤ 3, otherwise MC methods (if applicable) are the best.

Number of Operations Convergence rate Memory Cost


MC O(nm) O(n
−1 1
+ m− 2 ) O(1)

PDE
−2
−1
O(nm) O(n +m d ) O(m)

Table 1: Compared computational costs of Stochastic methods (m simulation runs) versus Deter-
ministic methods (m1 mesh points per space dimension, i.e. m = md1 space mesh points), in space
dimension d; n is the number of discretization points in time.

This leads to the following dictionary (Table 2) of the method to use, depending on the space-
dimension and the nature of a pricing problem.
In the upper left corner of this Table, the choice between PDE and MC should depend on the
relative interest of performance level with respect to implementation cost and risk, generally higher
with deterministic methods.
In the lower right corner of the Table, FBSDE (Forward-Backward Stochastic Dierential Equations)
methods refers to specic MC methods which are available for control problems, cf. the introductory
paragraph to part V. These methods are not treated in these notes.
Note that the recommendations for American Problems are in fact valid for more general Control
Problems, e.g. for pricing game contingent claims, like convertible bonds (see [25, 28]), or for pricing
problems in (classes of ) model(s) in which the spot volatility is stochastic, but known to remain in
a range (σ, σ) for sure, where σ and σ are given constants or functions of (t, S) (Uncertain Volatility
model [13]), etc.

European Problem American Problem


d≤3 PDE or MC PDE
d>3 MC FBSDE

Table 2: Which method to choose?

An interesting subject hardly mentioned is these notes is parallelism. The reason why we do not
consider this issue (besides the vastness of this subject per se) is that parallelism is most ecient
on dedicated machines or networks (see Figure 5(b) and Table 3), and working with such machines
or networks is not the main trend in the nancial industry today.

4 Bibliographic Guidelines
Bibliographic references are given along the text and gathered at the end of this document. Various
aspects of computational nance are introduced in [82] (see in particular Chapter 16 Numerical
procedures). Here are some more comprehensive references, among others:
• On market models (part II): [76, 40, 131, 119, 31, 138];
• On deterministic pricing methods: (parts III and IV) [140, 146, 118, 1, 12, 100, 102, 147, 23];
• On stochastic simulation methods (part V): [78, 104, 33, 96];
• On calibration of nancial models (part VI): [48, 67, 124];
• On nancial theory (part VII): [61, 119].

Finally, let us mention some related web resources:

www-rocq.inria.fr/mathfi/Premia/index.html
screpey.free.fr
www.mathfinance.de/frontoffice.html
quantlib.org
www.nr.com
10

www.gro.creditlyonnais.fr
www.iro.umontreal.ca/~lecuyer
www.optioncity.net
www.defaultrisk.com
11

Part II
Market Models
In this part we gather basic facts regarding the benchmark models on the main derivatives markets:
equity-derivatives markets (BlackScholes model and simple extensions), interest-rate derivatives
markets (BGM model) and credit derivatives markets (one factor Gaussian copula model).

5 BlackScholes and Beyond


5.1 BlackScholes Basics
Let us start by the undisputed star of mathematical nance: the BlackScholes formula.
The BlackScholes model [32, 1973] [114, 1973] (BS model for short in the sequel) postulates an
objective spot diusion of the following type, on the stochastic basis (Ω, F, P)
b :

dSt = St (µt dt + σdW


ct ) , t≥0 (1)

where:
•Wc denotes a standard Brownian motion under the objective (or physical, statistical..) probability
P,
b
• the volatility parameter, σ , is a positive constant (number), and
• the physical drift, µ, is a process satisfying mild regularity and growth conditions.
We assume for simplicity that the riskless interest rate r in the economy and the dividend yield q on
S 1
are constant , and that the market composed of the spot S and of the savings account Bt = ert
is arbitrage-free and frictionless.

Then one can show that European vanilla call options on S have a unique arbitrage price process Π
in this model, given by, for t≤T :

Πt (T, K) = e−rτ Et (ST − K)+ (2)

= e−rτ E (ST − K)+ | St = Πbs (T, K; t, St , σ) ,




with τ = T − t, and where E refers to the expectation operator under the risk-neutral probability P,
such that

dSt = St (κdt + σdWt ) , St = S0 exp(bt + σWt ) (3)

σ2
with κ = r − q, b = κ − 2 , for a standard P-Brownian motion W. Note that the second equality in
(2) follows from the fact that S is a P-Markov process.

In particular the BS price does not depend on the physical drift µ, even though µ may be responsible
for fat tails or skewness in the physical spot returns (under b). This is because options risks are
P
entirely market diversiable in this model. The BS model is complete.
Indeed one can show that at any time t ∈ [0, T ] the price Πt (T, K) in (2) is also the least initial wealth
of a dynamic replicating strategy for the option (self-nancing strategy in cash and spot with wealth
process bounded from below and equal to the payo of the option at maturity, see the Appendix).
Such a replicating strategy with initial wealth Πbs (T, K; 0, S0 , σ) (starting from time 0) consists in
holding a number of units of spot at time t equal to

∆t (T, K) = ∂S Πbs (T, K; t, St , σ),

the so-called delta of the option.

1 In fact the interpretation of r and q depends on the nature of the underlyer S: stock, interest rate, exchange rate,

etc.
12

Direct computation shows that the BS call pricing function Πbs and delta function ∆bs = ∂S Πbs are
given by the so-called BlackScholes formulae:

Πbs (T, K; t, S, σ) = Se−qτ N (d+ ) − Ke−rτ N (d− )


(4)
∆bs (T, K; t, S, σ) = e−qτ N (d+ )

where τ = T − t, N is the Gaussian cumulative distribution function and

S
ln( K )+κτ √
d± = √
σ τ
± 12 σ τ (5)

These results admit straightforward extensions to the case where r, q and σ are Borel bounded
√ RT RT
functions of time: simply replace rτ, qτ and σ τ by
t
r(u)du, t
q(u)du and Σ, with Σ2 =
RT
t
σ 2 (u)du, in (4)(5).

As is well known, the BlackScholes model is strongly misspecied, which leads to consider natural
extensions of BlackScholes, adding stochastic volatility (Heston model) or/and jumps (Merton and
Bates model) in the picture.

5.2 Heston model


The best known Stochastic Volatility model is the Heston model [81, 1993], which postulates, for
the instantaneous variance process v, a CIR Bessel process (or square root process, hence v ≥ 0),
namely, under P (i.e., in risk-neutral form):

 √
dvt = −λ(vt − v)dt + η vt dZt
√ (6)
dSt = St (κdt + vt dWt ) ,

where:
• dhW, Zi = ρdt,
• λ is the speed of mean-reversion of the instantaneous variance,
• v is the long-term variance mean,
η
• 2√ v
is the volatility of the (instantaneous) volatility.

5.3 Merton model


The Merton model [113, 1976] is obtained by adding an independent compound Poisson jump process
to the BS model as follows, under P:

dSt PNt ¯
St− = κdt + σdWt + d( l=1 Jl − λJt) (7)

with related generator (see (143)):

1
AS v = κ − λJ¯ S∂S v + σ 2 S 2 ∂S2 2 v + λ [Ev((1 + J1 )S) − v(S)] ,

(8)
2
where:
• (Nt )t≥0 is a Poisson process with deterministic jump intensity λ;
• the Jl are the jump size variables, such that jl := ln(1 + Jl ) ,→ N (α, β), so

β
j̄ := Ej1 = α, Ej12 = α2 + β, J¯ := EJ1 = eα+ 2 − 1 ;

• W, N and the Jl are independent.

Denoting further ¯
a = b − λJ, we have in particular:

1
AS ln(S) = κ − λJ¯ − σ 2 + λj̄ = a + λj̄ . (9)
2
13

By application of the Itô-Levy formula (special case when Y is trivial in the general formula (147)),
the model rewrites in terms of the log-spot Xt = ln(St ) as follows:

XNt XNt
dXt = (a + λj̄) dt + σdWt + d( jl − λj̄t) = adt + σdWt + d( jl ) , (10)
l=1 l=1

PNt
whence XT = x0 + at + σWt + l=1 jl , and

QNt
ST = S0 eat+σWt l=1 (1 + Jl ) (11)

5.4 Bates model


The Bates model [21, 1996] is a mixture of the Heston and Merton models (Heston model with
further independent Merton-style jumps in S ), so, under P:
( √
dvt = −λ(vt − v)dt + η vt dZt
dSt √ PNt (12)
S − = (κ − λEJ1 )dt +
t
vt dWt + d( l=1 Jl ) .

Note that from the practical point of view, the Heston and the Merton model are still notoriously
misspecied, like BlackScholes. The Bates model may often be considered as a exible enough and
robust alternative.

5.5 Log-Spot Characteristic Functions


The characteristic function Φt (u) = E[exp(iuxt )] of the log-spot xt = ln(St ) is known in closed-form
in the previous models (which are all AJD models [63]), so that the related vanilla option prices and
Greeks can be computed eciently by Fourier transform (see section 30).

Proposition 5.1 We have in the risk-neutral BS, Heston, Merton and Bates models, respectively:
 
ΦdT (u) = exp iu x0 + κT
 1 
Φbs d
T (u) = ΦT (u) exp − 2 u(i + u)σ T
2
h  β 2β
i
ΦjT (u) = exp −λT iu eα+ 2 − 1 − (eiuα−u 2 − 1)

  
me bs j iuα−u2 β u2 σ 2 T α+ β
ΦT (u) = ΦT (u)ΦT (u) = exp λT (e 2 − 1) − 2 + iu x0 + [b − λe 2 + λ]T

Φhe d
T (u) = ΦT (u) exp [C(u, T )v + D(u, T )v]
j
Φba he
T (u) = ΦT (u)ΦT (u) ,

with
h  β 2β i
Φt (u) = exp iu x0 + κt , Φjt (u) = exp −λt iu eα+ 2 − 1 − eiuα−u 2 − 1
  
h −pt
i
1−e−pt
C(u, t) = λ ty− − η22 ln( 1−ge
1−g ) , D(u, t) = 1−ge−pt y− (13)

y−
p = y 2 − 4wz , y± = y±p
p
η 2 , g = y+
η2
w = − 12 u(i + u) , y = λ − ρηiu , z = 2


Remarks 5.1 For λ = ρ = 0 and η → 0+, we get y = 0, y± = ±p
η2 , p= −2wη, 1−g = 2, D(u, t) ∼
pt −p2 t 1 2
1−g y− = 2η 2 = wt, and C(u, t)v + D(u, t)v reduces to − u(i + u)σ t, so
2 Φhe
t (u) reduces to Φbs
t (u),
as expected.
Likewise, it is easy to check that Φme
t (u) reduces to Φbs
t (u) for α = β = 0.
14

Proof of Proposition 5.1 (see e.g. [76]). In the case of the Heston model, we introduce Ft := St eκt ,
he
so F0 = S0 , and we compute ΦT (u) as

E0 STui = eκuiT E0 FTui = eκuiT Φ(0, F0 ) ,

with Φ(t, F ) := Et FTui on {Ft = F } . In particular Φ(T, F ) = F ui , which leads us to seek an explicit
solution for Φ in the form

Φ(t, F ) = F ui exp C(u, T − t)v + D(u, T − t)v ,


 

for suitable coecients C and D. Note that Φ(t, Ft ) is a (Doob) martingale. We thus get by
application of the Itô formula (diusive case, so no jumps and Y trivial in the general formula
(147)): 0 = (∂t + AF )Φ on {t < T }, with

1 2 2 1
AF = vF ∂F 2 + η 2 v∂v22 + ηvρ∂F2 v − λ(v − v)∂v .
2 2
So
1 1
0 = v ∂t C + v ∂t D + v(ui)(ui − 1) + η 2 vD2 + ηρv(ui)D − λ(v − v)D
2 2
on {t < T }. Since this must be true for any (v, v), it implies that we have for t < T :

−∂t C = λD
(14)
−∂t D = w + zD2 − yD = z(D − y+ )(D − y− ) .

We recognize a standard system of Riccati equations, with solution given by (13), as one can easily
check by inspection. This proves the result in the case of the Heston model.

To prove the result in the case of the Bates model. We introduce the process

Nt
¯
Y
Lt = (1 + Jl )e−λJt .
l=1

Thus, using an obvious notation: S ba = S he L, whence Φba he L


t (u) = Φt (u)Φt (u), by independence.
ui j
Therefore it merely remains to prove that ELt = Φt (u). In this view, note that

XNt
¯ , AL Lui = −λLJ∂
¯ L Lui + λLui E (1 + J1 )ui − 1 .
  
dLt = Lt− d( Jl − λJt)
l=1

·
We thus have by the Itô formula, with  = standing for equality up to a martingale term:

· ¯ + E (1 + J1 )ui − 1 λdt ,
dLui ui
  
t = Lt − Jui

whence

2
ELui ui δλt ¯ + E (1 + J1 )ui − 1 = −ui eα+ β2 − 1 + euiα− u2β − 1 ,
   
t = L0 e , with δ = −Jui
j
that is ELui
t = Φt (u).

Finally, the results in the case of the BS and of the Merton models follow by passage to the limit of
the previous ones, by Remark 5.1. 2

6 BGM Model
As we will see in section 6.1.1, bond options can be handled by suitable extensions of the Black
Scholes formulae, the so-called Black formulae.
15

Other major vanilla interest rate derivatives: caps/oors and swaptions, have been quoted in terms
of Black implied volatility for long in the market. The rigorous justication for such market practice
was derived in 1995 by Brace, Gaterek and Musiela [36] for caps and oors (`BGM Model', also called
`Libor Market Model' LMM; see also Miltersen et al [116]), and by Jamshidian [90] for swaptions
(Libor Swap Model LSM).

The BGM and the LSM are not consistent in theory but they do not dier much in practice [35].
Today the BGM model has emerged as the market standard for interest rate derivatives. There are
three reasons for this:
• the related computations are easier in the LMM than in the LSM,
• it is easier to express swap rates in terms of LIBOR rate than the other way round, and
• good approximate formulas are available for swaptions in the LMM.
We shall therefore focus on the BGM model (LMM) in the sequel.

6.1 Changes of Numeraire


Let us rst consider the general set-up of section A in the Appendix (see also Jamshidian [89] or
Bielecki et al [29]). Let in particular B denote the savings account. We are interested in providing
valuation formulae for some (European) nancial products with payo ξ at time T, so by arbitrage,
under mild conditions:
Πt = Bt EP (BT−1 ξ | Ft ) , t ∈ [0, T ] ,
for some risk-neutral probability measure P (see section A). Suppose it is convenient to express the
price relatively to a particular numeraire Be (that is, the price of a strictly positive claim such that
B
savings account B as above. Introducing the P-martingale
B is a P-martingale), instead of the
e

B0 B
et
νt := , t ∈ [0, T ] ,
B
e0 Bt

let P
e be the corresponding valuation measure, dened on (Ω, F) (with F = FT as usual) by

dP
e
= νT , P-a.s. (15)
dP
From the abstract Bayes rule and the fact that ν is a P-martingale, it follows, for t ∈ [0, T ] :

e −1
B
et Ee (B et EP (BT ξνT | Ft ) = Bt EP (B −1 ξ | Ft ) = Πt .
e −1 ξ | Ft ) = B
T T
P EP (νT | Ft )

So
et Ee (B
Πt = B e −1 ξ | Ft ) , t ∈ [0, T ] (16)
P T

Π
and e is a martingale under P.
e
B
de
Also note that νt is the Ft -measurable Radon-Nikodym density dPP
|Ft of P
e with respect to P on Ft ,
for any t ∈ [0, T ]. Indeed, for any Ft -measurable and bounded random variable X, we have:

EeP (X) = EP (XνT ) = EP [XEP (νT | Ft )] = EP (Xνt ) , (17)

by the P-martingale property of ν.


The previous results will be intensively used henceforth with e = PT ,
P the so-called T -forward-
neutral measure, which consists in choosing as numeraire B
e the zero coupon bond paying 1 at T,
with (observed) price process Bt (T ). Since BT (T ) = 1, (16) rewrites in this case:

Πt = Bt (T ) EPT (ξ | Ft ) , t ∈ [0, T ] (18)


16

6.1.1 Application: Black Formulae

Black formulae extend the BlackScholes formulae (4)(5) (general case with time-dependent volatil-
ity σ(t) and dividend yield q(t)) to the case of stochastic interest rates r. Assuming absence of
arbitrage, the primary asset S (or the value of S adjusted for interest rates and dividend yields, if
any) must be a P-local martingale under any risk-neutral measure (see the Appendix).
In case where the information ltration is generated by a unidimensional Brownian motion, this
− tT q(u)du
R

implies by change of numeraire (cf. (16)) that the T -forward Ft (T ) := St e Bt (T )


spot is a local,
T
Brownian martingale, under the related T -forward-neutral measure P , with volatility process σ.
This is specied in Black assumptions by postulating that the volatility process is given by a time-
deterministic, Borel function σ(t).
By (18) applied with ξ = (ST − K)+ = (FT (T ) − K)+ , and using the BlackScholes formulae (4)(5)
with r=q=0 to compute the expectation in the r.h.s. of (18), we thus get:

Πbl (T, K; t, F, σ) = Bt (T )(F N (d+ ) − KN (d− ))


(19)
∆ (T, K; t, F, σ) = ∂F Πbl (T, K; t, F, σ) = Bt (T )N (d+ )
bl

with
ln(F/K) RT
d± = Σ ± 12 Σ where Σ2 = t
σ 2 (u)du (20)

Black formulae are the standard way to quote bond options.

6.2 LIBOR Rates


The (forward) LIBOR rate Lt (T, U ) is the simple interest rate locked at time t for an investment on
the future time period [T, U ], so for t ≤ T :

1 Bt (T ) − Bt (U )
Bt (T ) = [1 + α(T, U )Lt (T, U )] Bt (U ), or Lt (T, U ) = ,
α(T, U ) Bt (U )

where α(T, U ) = U − T and Bt (T ) is the price at time t of a zero coupon bond which pays 1 at T.
We consider n LIBOR rates

Lit = Lt (ti , ti+1 ), t ≤ ti ; i = 1, . . . , n ,

on consecutive time periods [ti , ti+1 ]. Let αi = ti+1 − ti , Bti = Bt (ti ), Ei = EPT i . So for t ≤ ti :

Bti − Bti+1 1
αi Lit = , Bti+1 = Bi , (21)
Bti+1 1 + αi Lit t

hence by induction, for any 1≤i≤l+1≤n+1: (recall that Btii = 1):

Ql
Btl+1
i
= 1
k=i 1+αk Lk (22)
t i

In particular, αi Li B i+1 as a numeraire,


is a ratio of prices of traded assets, by (21). Thus, choosing
the process (αi )Li must be a martingale under the related pricing measure P , by arbitrage. i+1
i+1 i
Consistently with this requirement, the BGM model postulates the following P -dynamics for L :

R 
t Rt
dLit = σi (t)Lit dWti+1 , or Lit = Li (0) exp 0
σi (s)dWsi+1 − 1
2 0
σi2 (s)ds (23)

a for some Pi+1 -Brownian motion W i+1 and a deterministic volatility function σi (t) (null after ti ).
i i+1
Rt
The process L is thus log-normal under P , and the variance of log Lit is equal to 0 σi2 (s)ds.
17

For the specication of the model to be complete, we should specify the correlation structure of L.
This is deferred to section 6.4. Indeed, correlation does not aect the pricing of caps and oors to
be considered in the next section.

6.3 Caps and Floors


A caplet is a call option with arrear settlement on a LIBOR Rate. Considering the caplet with
maturity ti and strike K, the payo to the caplet holder at ti+1 is:

Ctii+1 = αi max{Liti − K, 0}.

Working in the B i+1 numeraire, we thus have by (18), for t ≤ ti :

Cti
= αi Ei+1 max{Liti − K, 0}|Ft ,
 
i+1
Bt

hence by log-normality postulated in (23):

C0i = αi B0i+1 Li0 N (di1 ) − KN (di2 )


 

with
Li
0 Σ2i R ti
ln( )+
di1 = K
Σi
2
, di2 = di1 − Σi , and Σ2i = 0
σi2 (s)ds

Likewise, the pricing formula for a oorlet is:

F0i = αi B0i+1 KN (−di2 ) − Li0 N (−di1 )


 

We thus get the price of the cap (portfolio of caplets) with maturity t1 and tenor t1 , ..., tn+1 as

Pn Pn
αi B0i+1 Li0 N (di1 ) − KN (di2 )
 
C0 = i=1 C0i = i=1

and the analog formula for oors. So because of the addtively separable structure of the related
payoofs, the prices of caps and oors only depend on the marginals of the Libor rates Li at the ti .
6.4 Adding Correlation
More complex derivatives, like swaptions, are also aected by the correlation structure of L.
We are thus going to dene a correlation structure between the Li s by expressing their dynamics
(until the ti s) under the terminal measure Pn+1 .
Note that by denition of the Pi , we have for t ≤ ti (cf. (17)):

dPi dPi B i+1 B i B i+1


 
νti := |F = Ei+1
| Ft = 0i i+1t = 0 i (1 + αi Lit ). (24)
dPi+1 t dPi+1 B0 Bt B0

We nd it convenient to denote

dLit = si (t)Lit dWi+1


t = σi (t)Lit dWti,i+1 (25)

where si (t) = (0, ..., 0, σi (t), 0, ..., 0) is a row volatility vector and Wi+1 = (W j,i+1 )1≤j≤n is a (corre-
lated) n-dimensional Pi+1 -Brownian motion.
Let ρ = (ρi,l )1≤i,l≤n denote the correlation matrix of Wn+1 ,
dhW i,n+1 , W l,n+1 it = ρi,l dt .
so

n n+1
Rt
We proceed by backward induction. We rst want to specify Wt as Wt −ρ 0 hT u du, for a properly
n
chosen row vector process h. In view of Girsanov theorem, it is enough, for this to be a P -Brownian
18

motion on [0, tn ], that h satises dνtn = ht νtn dWn+1


t , where νtn is the Ft -measurable Radon-Nikodym
n n+1
density of P with respect to P on Ft that was introduced in (24). Now, by (24)(25), we have
that:
αn Lnt sn (t)
dνtn = νtn dWn+1
t .
1 + αn Lnt
αn Ln
t sn (t)
We thus choose ht = 1+αn Ln , hence
t

αn Lnt ρsT
 
n (t)
dLn−1
t = sn−1 (t)Ln−1
t dWn+1
t − dt
1 + αn Lnt
αn Lnt σn (t)ρn,n−1
= − σn−1 (t)Ltn−1 dt + σn−1 (t)Ln−1
t dWtn−1,n+1 ,
1 + αn Lnt
and then likewise, for any i = 1, . . . , n :
Pn αl Llt σl (t)ρl,i
dLit = − l=i+1 1+αl Llt
σi (t)Lit dt + σi (t)Lit dWti,n+1 (26)

Note that Ln only is a Pn+1 -martingale. For i ≤ n − 1, Li has a Pn+1 -drift depending on the Ll s
for l > i.
6.4.1 Correlation Structures

The simplest way to x ρ is to set it as an historical estimate of the correlation matrix of the Libor
rates (historical correlation structure). But such estimates are typically noisy and hardly usable in
practice.

Various parametric forms for ρ may be used instead, and calibrated to market prices of correlation-
sensitive instruments, like the swaptions to be considered in the next section. For instance, one may
choose ρi,l = exp(−β|i − l|), or (Rebonato)

ρi,l = ρ∞ + (1 − ρ∞ ) exp [−|ti − tl |β(ti , tl )]


with β(ti , tl ) = d1 − d2 max(ti , tl ) (but all such matrices are not correlation matrices), or (Coeey
et Schoenmakers)
 
|i − l|
ρi,l = exp − (ln ν∞ + η1 ϕ(n, i, l) + η2 ψ(i, l, n)) ,
n−1
with

i2 + l2 + il − 3ni − 3nl + 3i + 3l + 2n2 − n − 4


ϕ(i, l, n) =
(n − 2)(n − 3)
i2 + l2 + il − ni − nl − 3i − 3l + 3n + 2
ψ(i, l, n) =
(n − 2)(n − 3)
(qualitatively acceptable and robust, always a correlation matrix provided 0 ≤ η2 ≤ 3η1 and 0≤
η1 + η2 ≤ − ln ν∞ ).
Note nally that some procedures are known to generate in a systematic way correlation matrices
of a given rank, like the method of angles of Rebonato.

6.5 Swaptions
Recall that an interest-rate swap with tenor t1 , . . . , tn+1 is a contract with cash ows αi (Liti − K)
at the ti+1 s, for i = 1, . . . , n (from the point of view of the party receiving the oating payments).
By (18), the value of the swap at time t ≤ t1 is thus given by

n
X n
 X
αi Bti+1 Ei+1 (Liti − K) | Ft = αi Bti+1 (Lit − K) ,

i=1 i=1
19

by thePi+1 -martingale property of Li , i = 1, . . . , n. A swaption (or option to enter the swap at the
maturity date t1 ) with tenor t1 , . . . , tn+1 and strike K is thus tantamount to a product paying at t1
the positive part of
n
X
Bti+1
1
αi (Lit1 − K) ,
i=1

that is
Pn
i=1 Bti+1
1
αi (St1 − K)+

where the (forward) swap rate process S is dened by, for t ≤ t1 :

B 1 −Btn+1
St = Pnt i i+1
i=1 α Bt

Pn i+1 i i n+1 Pn
(hence
1
i=1 Bt1 α Lt1 = Bt1 − Bt1 = i=1 Bti+11
αi St1 , by (21)). Note that S is a martingale

Pn i i+1
under the pricing measure P corresponding to the the numeraire i=1 α Bt .
The LSM model mentioned in the introductory paragraph to this section, consists precisely in
modeling the swap rate S as a P∗ -log-normal process with time-deterministic volatility, hence an
exact Black formula for swaptions in the LSM.
The swap rate S is not P∗ -log-normal in the BGM model, yet it is not far from P∗ -log-normal with
2
integrated squared variance Σ given by the so-called Rebonato's formula (see e.g. [40, p. 248]):

n Z t1
1 X i l i l
Σ2 = w w L L ρ
0 0 0 0 i,l σi (t)σl (t)dt , (27)
S02 0
i,l=1

Ql
with weights wl proportional to the αl Btl+1 1
= αl
k=1 1+αk Lk .
1
We thus have the following approximate Black formula for swaptions in the BGM model (at time 0):
n
X
P0 = αi B0i+1 [S0 N (d1 ) − KN (d2 )] ,
i=1

with
Σ2
ln( SK0 ) + 2
d1 = , d 2 = d1 − Σ
Σ
where Σ is given by (27).

6.6 Model Simulation


interest rate derivatives payo processes and discount factors (working under the terminal measure
Pn+1 as usual) are typically dened as suitable functions φ of the (Litj )1≤i≤j≤n .
So, for a cap, we have by (16):

n
" #
X αi (Liti − K)+
C0 = B0n+1 En+1
i=1
Btn+1
i+1
" n
#
X αi (Liti − K)+
= B0n+1 En+1
i=1
Btn+1
i+1
" n n
#
X Y
= B0n+1 En+1 αi (Liti − K)+ (1 + αl Llti+1 ) ,
i=1 l=i+1

by (22). We thus see that to properly discount each cash-ow αi (Liti − K)+ under Pn+1 , we need
to know the values of
Li+1 i+2 n
ti+1 , Lti+1 , ... , Lti+1 ,
20

beyond that of Liti (and L0 , to compute B0n+1 ).


For MC pricing and Greeking in the BGM model, it is thus enough to be able to Pn+1 -simulate the
Litj , 1 ≤ j ≤ i ≤ n. Now it has been shown in [83] that time-discretizing (the logarithmic form of )
(26) by a standard Euler scheme at tenor dates is accurate enough for many purposes (such as MC
pricing swaptions in the LMM), thus for l = 0, . . . , n − 1 i = l + 1, . . . , n :
 
Pn αk Lk tl σk (tl )ρk,i 1 2
Litl+1 = Litl exp − k=i+1 1+αk Lk
σ (t
i l ) − σ (t
2 i l ) (tl+1 − tl )
tl

+σi (tl ) tl+1 − tl Vi ]

where V = ΛG ∼ Nn (0, ρ), with G = (ε1 , ..., εn )T Gaussian standard and ρ = ΛΛT . For example, in
case n=2:
dW 1,3
   
1 ρ
dW3 = , dhW 1,3
, W 2,3
it = dt = ΛΛT dt ,
dW 2,3 ρ 1
 
1 p 0
with Λ= .
ρ 1 − ρ2
In general, the computation goes like this:

t 0 t1 t2 ... tn
L1 L10 L1t1
L2 L20 L2t1 L2t2
L3 L30 L3t1 L3t2
... ... ...
Ln Ln0 Lnt1 Lnt2 . . . Lntn
For instance, assuming for notational simplicity a correlation structure of rank one (ρi,l = 1 and
Λi,l = 1l=1 for any i, l, so ρ completely disappears from the picture and it is enough to make one
standard univariate Gaussian draw ε per time step), and with n = 4, σi = 15%, αi = ti+1 − ti = 0, 5,
i
and L0 = 5% (initial term structure at at level 5%):

t 0 t1 t2 t3 t4

tl+1 − tl εl −0, 371379 1, 81768 −0, 204069 0, 512108
L1 5% 4, 698%
L2 5% 4, 699% 6, 135%
L3 5% 4, 701% 6, 138% 5, 918%
L4 5% 4, 702% 6, 142% 5, 923% 6, 36%

7 One Factor Gaussian Copula Model


We now move to the context of Multi-Name Credit. The primary market consists of the saving
accounts and of individual CDSs on d reference entities (rms), with default times τl , l = 1, . . . , d.
Note that in this context, no satisfactory dynamic model has emerged yet. The benchmark model
is the one factor Gaussian copula model, or Li's model [111] (see also LaurentGregory [105]). In
this model, one postulates a particular form for the joint cumulative distribution function F of
(τ1 , . . . , τd ), which is enough for deriving semi-closed pricing formulas at time 0 for standard multi-
name credit derivatives like First-to-Default Swaps or CDOs. However the dynamics of the losses is
not specied in this model, thus the connection between prices and any form of hedging cannot be
established.

We thus consider d rms with respective nominal Nl , recovery rate Rl , loss given default Ml =
(1 − Rl )Nl , and default time τl > 0 with distribution function Fl (t) = P(τl ≤ t) and indicator
l
process Ht = 1{τl ≤t} , l = 1, . . . , d. The cumulative portfolio loss at time t is thus given by

Pd
Lt = l=1 Ml Htl
21

Finally we denote by
F (t1 , . . . , td ) = P(τ1 ≤ t1 , . . . , τd ≤ td )
the joint cumulative distribution function of (τ1 , . . . , τd ).
7.1 Single Tranche CDOs
A single tranche CDO (Collateralized Debt Obligation) with lower attachment point K, upper
attachment point K and maturity T is a contract with the following cumulative discounted cash
ow, from the point of view of the investor (which is typically the credit protection seller, in the
case of a CDO):
RT
0
βt [Σ(K − K − Lt )dt − dLt ]

where:
• Lt = min ((Lt − K)+ , K − K) = (Lt − K)+ − is the cumulative tranche loss,
• Σ is the tranche
R t spread at time 0, and
• βt = exp(− 0 ru du) is the risk-neutral discount factor (inverse of the savings account, see the
Appendix).
Note that min ((Lt − K)+ , K − K) = (Lt − K)+ − (Lt − K)+ , so a CDO tranche can be interpreted
as a call-spread on the porfolio loss Lt , with strikes K, K.

Equity (resp. senior) tranches refer to tranches with lower attachment point K=0 (resp. K > 0).
For instance, on the DJ iTraxx market (a family of CDS indices for Europe and Asia), CDO tranches
are quoted for (K, K) equal to (0%, 3%), (3%, 6%), (6%, 9%), (9%, 12%) and (12%, 22%).
By arbitrage, the value of a tranche at time 0 is thus given by

Z T
E βt [Σ(K − K − Lt )dt − dLt ] ,
0

under a risk-neutral probability P on the primary market (see the Appendix). The tranche spread
Σ at time 0 is typically set such that the tranche is entered at no cost at inception, so

E 0T βt dLt
R
Σ= E
RT
βt (K−K−Lt )dt
0

Assuming deterministic interest rates r, the values of either leg of the CDO only depends on the
expected tranche losses ELt , t ∈ [0, T ]. Indeed, we have for the fees leg:

RT RT
E 0
βt (K − K − Lt )dt = 0
βt (K − K − ELt )dt (28)

For the protection leg, we have d(βt Lt ) = βt (dLt − rt Lt dt) (and L0 = 0), thus by Fubini's theorem:

RT RT
E βt dLt = βT ELT + rt βt ELt dt (29)
0 0

7.2 Li Model
A copula function C is the joint cumulative distribution function of an Rd -valued random vector
with uniform marginals on [0, 1], so that in particular C(1, . . . , 1, ul , 1, . . . , 1) = ul , for any ul ∈ [0, 1].
Recall that for any joint multivariate cumulative distribution function F (x1 , . . . , xd ) with univariate
marginal cumulative distribution functions F1 (x1 ), . . . , Fd (xd ), there exists a copula function C such
that
F (x1 , . . . , xd ) = C [F1 (x1 ), . . . , Fd (xd )] ,
by Sklar's theorem (1973).
22

In the one-factor Gaussian copula model (market model for CDOs), it is postulated that the depen-
dence structure between the τl is dened by the Gaussian copula

Cρ (u1 , .., ud ) = Nρ N −1 (u1 ), .., N −1 (ud ) ,


 

where N and Nρ respectively stand for the standard (univariate) Gaussian cumulative distribution
function and the d-variate Gaussian cumulative distribution function with covariance (correlation,
in fact) matrix

 
1 ρ ... ρ
.. .
.
 
 ρ 1 . . 
  . (30)
 . ..
 ..

. 1 ρ 
ρ ... ρ 1

So, by denition:

F (t1 , . . . , td ) = P(τ1 ≤ t1 , . . . , τd ≤ td )
= P N −1 (F1 (τ1 )) ≤ N −1 (F1 (t1 )), . . . , N −1 (Fd (τd )) ≤ N −1 (Fd (td ))
 

= Nρ N −1 (F1 (t1 )), . . . , N −1 (Fd (td )) ,


 

and X = (N −1 (F1 (τ1 )), . . . , N −1 (Fd (τd ))) is a Gaussian vector with covariance (correlation) matrix
given by (30). Equivalently,

τl = Fl−1 (N (Xl ))

where
√ √
Xl = ρV + 1 − ρ εl , l = 1, . . . , d (31)

for independent standard Gaussian random variables V (the common factor) and ε1 , . . . , εd . Dening,
for t≥0:

N −1 (Fl (t)) −
 
l|v ρv
pt = P(τl ≤ t | V = v) = P(Xl ≤ N −1 (Fl (t)) | V = v) = N √ ,
1−ρ
we thus have:

F (t1 , . . . , td ) = E P X1 ≤ N −1 (F1 (t1 )), . . . , Xd ≤ N −1 (Fd (td )) | V


  

Z ∞Y d
l|v
= ptl g(v)dv ,
−∞ l=1

by conditional independence, where g denotes the standard Gaussian density. Likewise, the moment
generating function of the portfolio loss Lt satises

ΨLt (u) = E[euLt ] = E[E[euLt |V ]] (32)


d Z ∞Y d  
uMl Htl l|v l|v
Y
= E[ E[e |V ]] = 1 − pt + pt euMl g(v)dv .
l=1 −∞ l=1

l|v
In particular, in the homogenous case pt = pvt , Ml = M :
d Z
X ∞
ΨLt (u) = Cdl (1 − pvt )l (pvt )d−l eu(d−l)M g(v)dv . (33)
l=0 −∞

Of course, for a complete determination of L, the marginal distributions of the τl need to be spec-
ied. This is typically done by assuming that the only information at hand is the one provided by
23

the observation of the defaults of the reference entities, in which case the cumulative distribution
functions Fl (t) may be bootstrapped from the individual market CDS spread curves of the reference
entities. Indeed, we then have by arbitrage, for any l = 1, . . . , d (see e.g. [82, 30]):
Z T Z T
Σl (T ) (1 − Fl (t)) dt − Ml dFl (t) = 0 , T ≥ 0 (34)
0 0

(and Fl (0) = 0), where Σl (T ) denotes the fair spread at time 0 of the CDS with maturity T on the
th
l name.

7.3 Pricing Algorithms


We assume for a start that the losses Ml , l = 1, . . . , d, are commensurate. More precisely, P
we shall
d
assume that the Ml take integer values, without loss of generality. We thus have, with M = l=1 Ml
k
and qt = P(Lt = k), k = 0, . . . , M :
PK PM PM
ELt = k=K (k − K)qtk + (K − K) k=K+1 qtk , ΨLt (u) = k=0 qtk euk .

So the expected tranche loss ELt qtk , k =


is a simple function of the portfolio loss probabilities
0, . . . , M. Moreover, the latter can be computed by Fourier inversion from the values of ΨLt (u)
obtained by (32) for M + 1 well chosen values of the argument u, in time O(dM). Choosing M + 1
as a power of 2 (completing the involved vectors by zero padding if necessary, see [60, 129]), the
Fourier inversion may be realized in time O(M ln M), by FFT.

Alternatively, the portfolio loss probability distribution qt can be computed by using the following
k|v
recursive relation on the conditional loss probabilities qt (i) which take into consideration the i rst
k|v k|v ·|v
reference entities only (so P(Lt = k|v) =: qt = qt (d) and qt (0) = δ0 ):
k|v i|v k−M |v
i i|v k|v
qt (i) = pt qt (i − 1) + (1 − pt )qt (i − 1) ,
i = d, . . . , 1 , k = 0, . . . , M .
Indeed we have in obvious notation, for any i = d, . . . , 1 and k = 0, . . . , M :
k|v k−Mi |v,Hti =1 i|v k|v,Hti =0 i|v
qt (i) = qt (i)pt + qt (i)(1 − pt )
where by conditional independence:

k−Mi |v,Hti =1 k−Mi |v k|v,Hti =0 k|v


qt (i) = qt (i − 1) , qt (i) = qt (i − 1) ,
k|v k|v
hence (35). Once the conditional loss probabilities qt = qt (d) have been recursively computed
by (35), we recover the portfolio loss probabilities by expectation on V, as
Z ∞
k|v
qtk = qt g(v)dv.
−∞

As opposed to the previous exact procedures, which assume that the losses Ml are commensurate, fast
k|v
approximate methods may be used to compute the qt , without the assumption of commensurate
losses. These are saddle point methods based on the following inverse Laplace transform represen-
k|v
tation for the qt , in the sense of distributions (cf. (113); recall that the portfolio loss distribution
has nite support), formally obtained by integration parallel to the imaginary axis in the complex
plane, for any η > 0:
k|v 1
R η+i∞
qt = 2πi η−i∞
ΨLt (u|V ) exp(−uk) du (35)

where (cf. (32)):

d  
l|v l|v
Y
ΨLt (u|V ) := E[euLt |V ] = 1 − pt + pt euMl .
l=1
24

Identity (35) in the sense of distributions means that (cf. (114))

1
R η+i∞ R∞ 
E [ϕ(Lt )|V ] = 2πi u=η−i∞
ΨLt (u|V ) L=0
exp(−uL)ϕ(L)dL du (36)

(identity in the strong sense, now), for any regular enough function ϕ such that the conditional
expectation exists in (36).
Saddle point methods are based on the approximation of ΨLt (u|V ) exp(−uk) in (36) by a suitable
Taylor expansion around a well chosen point u? , so that the computation of the resulting inte-
gral approximating that in (35) (approximate inverse Laplace transform) only involves Gaussian
integration. We thus get the density of a distribution (a.c. w.r.t. the Lebesgue measure on R+ )
approximating L in law (conditionally on V ), much faster than by the previous exact procedures,
and without the assumption of commensurate losses (see [8, 91, 150] for every detail).
Even better, we get by application of (36) with
R∞ ϕ(Lt ) = (Lt − K)+ (note that ∂x [(ux + 1)e−ux ] =
2 −ux −ux 1
−u xe , so
0
xe dx = u2 ):

1
R c+i∞ ΨLt (u|V ) exp(−uK)
E [(Lt − K)+ |V ] = 2πi u=c−i∞ u2 du (37)

Saddle point methods may then be applied to the integral in (37), namely directly to the pricing of
the tranche, without passing by the derivation of the porfolio loss distribution anymore. Depending
on the expansion point u? , the order of the Taylor expansion, etc, we thus get a whole family of
approximate algorithms for pricing the tranche.
In the simplest case, we recover the so-called large portfolio approximation to the tranche price.
Rening the approximation a little bit, we recover the so-called normal approximation. Saddle point
methods of higher order for the tranche are both very accurate, and incomparably much faster than
exact methods (unless the Ml be commensurate and supported by a very coarse grid), see [8].

A last possibility to compute the portfolio loss probabilities qtk , or, directly, the values of the fees leg
(28) and of the protection leg (29) of the tranche), is, of course, to proceed by simulation (Monte
Carlo method, cf. part V). But simulation methods are much slower on these problems than any of
the previous procedures (approximate saddle point methods to estimate the tranche legs, or even,
assuming commensurate Ml , exact convolution FFT or recursive methods to compute the portfolio
loss distribution).

Note nally that the numerical integrations to be performed in these algorithms, which all involve
the Gaussian kernel g(v)dv , may be done very eciently by Gaussian quadrature (Gauss-Hermite
quadrature, using for instance the function gauher of [129]; see also [60]).

7.4 Hedging
Since the dynamics of the losses is not specied in the one factor Gaussian copula model, the
connection of prices (initial prices at time 0) with any form of hedging cannot be formally established.
Let us thus say a word about how people use the model for hedging in practice. In fact, the related
hedging methodology may be immediately transposed to any model, or at least, to any model with
information ltration equal to the ltration of default times, for the sake of (34). So we write this
section relatively to an arbitrary model endowed with this property, specifying the description to
the Li model where need be.

7.4.1 Hedging Scheme

The most usual hedge is delta-neutral with respect to homogenous bumps on homogenous time
buckets of the underlying CDS curves. This consists in hedging a (credit index) CDO tranche with
maturity T using CDS index contracts with various maturities Tj (Tj−1 ≤ T ) as hedging instruments.
A CDS index contract is an insurance contract covering default risk on the pool of names in the
index. Index contracts dier slightly from single-name securities. In the case of a credit event,
the related entity is removed from the index and the contract continues (with a reduced notional
amount) until maturity. CDS indices may be considered as kinds of averages of individual CDSs,
25

and they can be priced essentially like the latter, cf. (34) (since we assume that the model ltration
is the ltration corresponding to the observation of default times).

We thus rebalance a complementary position in CDS index contracts, every time step τ (=1 market
day, typically), in order to minimize the overall exposure to homogenous bumps in the underlying
CDS curves. In this way we hedge (typical deformations of ) the market credit spread curve. Sim-
ilarly, by using individual CDSs as hedging instruments, one could hedge pertaining deformations
scenarii of the related individual CDS curves. Note that the involved CDSs (whether they are CDS
index contracts or individual CDSs) are new CDSs freshly emitted at each rebalancing time t.
The vector of CDS index hedging positions Λ = (λj )j is chosen so that the tranche (sold, i.e. tranche
protection bought) and its hedging portfolio are globally immune, locally in time, to homogenous
bp-bumps on the time-buckets [Ti , Ti+1 ] of the underlying CDS curves at the current time t, namely

∆? = ∆Λ (38)

where ∆? and ∆ respectively stand for the vector of the deltas (sensitivities) of the tranche, and
the matrix of the deltas (sensitivities) of the CDS index contracts with maturities Tj such that
Tj−1 ≤ T , with respect to homogenous bp-bumps on the time-buckets [Ti−1 , Ti ] of the underlying
CDS curves for Ti−1 ≤ T . The matrix ∆ is obviously upper triangular, so the linear system (38) is
trivial.

7.4.2 Computation of the Deltas

The ∆ji s are computed by assessing the impact on the Protection and Fees Legs of the j th CDS
th
index contract, of the i bump on the underlying CDS curves. So they are independent of the de-
pendence structure (copula function) of (τ1 , . . . , τd ) (provided the model information is the ltration
corresponding to the observation of default times, cf. (34)). Note that individual CDS deltas (in
case where an individual CDSs hedge is wished, see section 7.4.1) could be computed in the same
way.

The ∆?i s are computed by assessing the impact on the Protection and Fees Legs of the tranche of
th
the i bump on the underlying CDS curves. More precisely, for each i such that Ti−1 ≤ T :
(i) we bootstrap the marginal cumulative distribution functions Fl from the underlying CDS spread
curves by (34), and we calibrate the other model parameters to relevant market data (using the base
or the compound implied correlation of the tranche for the ρ parameter in the case of the Li model,
see section 8.2);
(ii) we compute the associated values P and Q of the Protection and Fees Legs of the tranche;
(iii) we bootstrap as in (i) the marginal cumulative distribution functions Fel from the underlying
CDS curves bumped by +1bp on the time-buckets [Ti−1 , Ti ], Ti−1 ≤ T ;
(iv) we compute the values Pe and Q
e of the Protection and Fees Legs of the tranche corresponding
to the Fel , with other model parameters unchanged (i.e. for the value of ρ computed in (i), in the
case of the Li model).
We nally get

∆?i = Σ(Q
e − Q) − (Pe − P ) ,

where Σ is the spread of the tranche at inception date 0. Note that as opposed to the ∆ji s, the ∆?i s
are aected by the dependence structure between the τl .
7.4.3 P&L Analysis

Neglecting transaction costs, the P&L increment of the hedged position on the time interval (t, t + δ)
writes, for a trader who is short one tranche and long λj units in the CDS index contract with
26

maturity Tj (all nominals are taken equal to unity, by convention):

X
δ P&L = −δ P&L? + λj δ P&Lj
Tj ≤T
X
= δP − ΣδQ − Στ + λj ((1 − R)Pj − Σj Qj − Σj τ ) , (39)
j;Tj−1 ≤T

where:
• δ P&L? (resp. δ P&Lj ) is the P&L increment on a unit position on the tranche (resp. on a unit
position on the CDS index contract with maturity Tj ),
• Σ is the spread of the tranche at inception date 0,
• δP and δQ denote the increments of the values of the Protection and Fees Legs of the tranche on
the time interval (t, t + δ),
• Σj is the spread of the CDS index contract with maturity Tj at the current time t,
• Pj and Qj are the values of the Protection and Fees Legs of the CDS index contract with maturity
Tj at time t + τ , and
• R is the recovery rate on the credit index.
In (39), the terms Στ and Σj τ account for the carry of the various instruments involved, while the
remaining terms account for the slide of the portfolio between t and t + τ. Note that the δ P&Lj only
depend on the value of the CDS index contracts at time t + τ. This is due to the fact that the CDS
index contracts used for hedging at time t are new CDSs freshly emitted at t. So their value at time
t is equal to zero, by denition of a spread.

Of course, to quantify (39), one would need to specify the model dynamics, beyond the denition of
the distribution of the portfolio losses at time 0.

8 Market Models in Practice


The reader should not misunderstand the meaning of the previous benchmark models. In fact,
the Black(Scholes) closed formulae for single-name derivatives, or the Li semi-closed formulas for
multi-name credit, are essentially used by traders for conveying information on the relative value
of the various derivatives available in the market, in a unit of measurement (BlackScholes implied
volatility or Li implied correlation) less sensitive to the characteristics (maturity, strike, etc.) of the
product at hand than its money-value. So these formulas are but wrong formulas into which to put
a wrong number [the implied volatility of an option or the implied correlation of a tranche] to get
the right result [an option market price or a tranche market spread].

The related deltas (BlackScholes implied volatility delta or Li implied correlation delta) may also
be used for hedging (see e.g. section 7.4), yet the related hedges are only partial hedges, which do
not account for the volatility or correlation risk.

8.1 Black(Scholes) implied volatility


As one can see from sections 5.1 and 6, BlackScholes (or Black, more generally) formulae are
applicable to any European vanilla option on virtually any market (except for multi-name credit):
stock, index, future or exchange rate, forward interest or swap rate, etc. But these formulae are in
fact used in the reverse-engineering" mode that consists in determining, given an European vanilla
price observed on the market, the corresponding value of the volatility consistent with that option
price through the Black(Scholes) formula.

More precisely, given the values of r and q inferred from the related riskless bond market for r and
by call-put parity for q, the BlackScholes implied volatility of an (European vanilla) option is the
value of σ such that:
Πbs (T, K; t, St , r, q, σ) = Πma
t (T, K) (40)

where Πma
t (T, K) denotes the market price of the option at time t. Since the function σ 7→
27

Πbs (T, K; t, St , r, q, σ) is monotone (in the case of an European vanilla), equation (40) is easy to
solve numerically, by dichotomy.

Proceeding in this way for varying strike K, we get the so-called Black(Scholes) implied volatility
smile (typical on foreign exchange derivatives markets), smirk (typical on interest rates derivatives
markets) or negative skew (typical on equity derivatives markets) (or positive skew on `negative beta'
assets derivatives markets, like options on gold futures, see Figure 2).

0.27
0.26
0.27 0.25
0.26 0.24
0.25 0.23
0.24 0.22
0.23 0.21
0.22 0.2
0.21
0.19
0.2
0.19 0.18
0.18

0.13
0.12
0.11
0.1
340 0.09
360 0.08 Maturity
380 0.07
400 0.06
420 0.05
Strike 440
460 0.04

Figure 2: Golden smile, October 6 2003

8.2 Li implied correlation


The Li model (one factor Gaussian copula model) is the benchmark model for multi-name credit.
As the Black formula for single-name derivatives, it is also used in the reverse-engineering mode, for
quoting CDO tranches in terms of their Li implied correlations, given the cumulative distribution
functions Fl inferred from the marginal CDS markets.

More precisely, given the Fl :


• the compound implied correlation of a tranche is the value of the correlation ρ in a Li model such
that
Σli (T, K, K; t, (Fl )1≤l≤d , ρ) = Σma
t (T, K, K)

where Σma
t (T, K, K) denotes the market tranche spread at time t;
• the base implied correlation of a tranche is the value of the correlation ρ in a Li model such that

Σli (T, 0, K; t, (Fl )1≤l≤d , ρ) = Σma


t (T, 0, K)

where Σma
t (T, 0, K) is a synthetic market spread computed from the observed market spreads for
the tranches with upper attachment point K and below (see e.g. [125]).
Base implied correlation is more stable than compound implied correlation, because the function
ρ 7→ Σli
t (T, 0, K; t, (Fl )1≤l≤d , ρ) is monotone [125].

Other models are assessed on their ability to reproduce the so-called market correlation skew, for
suitably calibrated values of their parameters (see e.g. Figure 3, in which the model correlation skew
was derived in a suitable credit migrations set-up [29]).

8.3 Towards Real-Life Models


28

Implied Correlation Skews: Market (Nov 5, ’05) vs. Model


30
model
market

25

20

Implied Correlation (%)


15

10

0
0.0−0.03 0.03−0.06 0.06−0.09 0.09−0.12 0.12−0.22
Tranche Attachement Points

Figure 3: Market and Model implied correlation skews for CDO tranches.

For a consistent risk-management of derivative products, the benchmark models are not good enough.
More realistic models are needed (like the Bates model of section 5.4 or beyond, in the case of
equity derivatives; see also the generic Markovian model (144)(145) in the Appendix). In such
models, and/or for pricing and Greeking exotic products (in any model), numerical methods are
the only way. The remaining parts of these notes will be devoted to these methods. As already
mentioned in the Introduction (see section 1), we shall present most methods on simple models, like
the BlackScholes model (in general), the Merton model or the Heston model, for simplicity of the
presentation, yet the methods themselves are always generic to some degree. For instance, it should
be clear from the previous developements that simulating correlated Gaussian vectors (see section
20.3) is a prerequisite for Monte Carlo pricing in the BGM or in the Li models, etc.
29

Part III
Finite Dierences Methods
The deterministic pricing methods presented in these notes: Partial Dierential Equation (PDE)
methods in this part or tree methods in the next one, can be used in any Markovian model, irre-
spective of any early exercice features, but with the practical limitation that the space dimension of
the model be not too large (no more than 3, say; cf. section 3).

9 Generic Pricing PIDE


First of all, how comes that deterministic methods come into play to compute derivative prices that
are in essence stochastic processes?

In a risk-neutral market model, the primary risky market price process is typically modeled as an
Rd -valued locally bounded local martingale X with respect to some stochastic basis (Ω, F, P) (see
the Appendix). We assume that the riskless interest rate r ql
in the economy and the dividend yields
on the X l are constant, for simplicity2 . Let us further be given an Rm -valued (F, P)-Markov process
X , with X l = X l , l = 1, . . . , d (thus d ≤ m), such that the payo process of a nancial derivative be
l
given by a Borel function of Xt . So the components X , l = d + 1, . . . , m (in case d < m) represent
factors that may be required to `Markovianize' the market model and/or the payo of the derivative.

Then under mild technical conditions (see the Appendix for a detailed presentation), the P-arbitrage
price of the derivative, whether it be an European, an American, or even a Game Contingent Claim
on X, writes Πt = u(t, Xt ), for a Borel (deterministic) function u of t and Xt . So the derivative price
Πt is given by a function of Xt (and t), and all the randomness in Π is embedded in that of X .

Now, the Markov process X typically admits an innitesimal generator with a canonical structure in
three parts: a drift, a diusion and a jump component. As for the drift, P-arbitrage actually implies
that the (rst d) (r − q1 )X 1 , . . . , (r − qd )X d .
drift coecients are given by
This canonical structure of the generator of the Markov process X implies that the P-pricing function
u solves a parabolic partial integro-dierential problem (PIDE) of the following type on the time-
m
space domain [0, T ] × R , where T is the maturity of the product (cf. pricing equation (150) in the
Appendix ):
3

(
F (t, x, u(t, x), ∂t u(t, x), Iu(t,
 x),∇u(t, x), Hu(t, x)) = 0 on [0, T ) × D
(41)

m
u = φ(t, x) on [0, T ] × R \ [0, T ) × D

where:
•D is an open subset of Rm . In particular the terminal condition at T, which is embedded in the
boundary condition φ in (41), is given by the derivative payo at maturity;
• Iu is a non-local term dened as
Z
Iu(t, x) = g(t, x) (u(t, x + f (t, x, y)) − u(t, x))h(t, x, dy) ,
Rm

for a non-negative jump intensity functiong(t, x) and a jump size probability measure h(t, x, dy),
• ∇u and Hu denote the spatial gradient and Hessian matrix of u (gradient and Hessian with respect
to x).rots

The precise denition of the space domain D and of the operator F depend on the specications of
the derivative at hand. So, in the case of a European vanilla option, D =R (or R?+ ), and F is a

2 The interpretation of r and of the ql depends on the nature of the primary market: stock, interest rate, exchange

rate, etc.
3 In the Appendix a more general factor process Z = (t, X , Y) is considered, where Y is a further Continous Time
Markov Chain interacting with X.
30

linear operator. In the case of American (or Game) options, there are further obstacles in F. In case
there are barriers involved, D is a part of Rm delimited by the barriers, etc.

But the structure of the generator of the Markov process X implies that F is always monotone in
this sense:

F (t, x, r, I, a, p, M ) ≤ F (t, x, s, J, a, p, N ) whenever r ≤ s, J ≤ I, N ≤ M , (42)

where r, s, I, J, a ∈ R, (t, x) ∈ [0, T ] × D, p ∈ Rm and M, N are in the space of symmetric d⊗d


matrices equipped with its usual order, i.e. M ≤ N i N − M is a non-negative matrix (a matrix
with non-negative eigenvalues).

9.1 Maximum Principle


A classic solution to the pricing PIDE (41) is a bounded function in C([0, T ] × Rm ) ∩ C 1,2 ([0, T ) × D)
that satises (41) everywhere. Likewise, a classic subsolution (resp. supersolution) satises (41)
everywhere, with = replaced by ≤ (resp. ≥) therein.
We assume for simplicity that D and φ are bounded, and that the monotonicity of F is strict in its
third argument r. We then have the following comparison principle.

Proposition 9.1 u ≤ v, for any classic subsolution u (resp. supersolution v) of (41).

Proof. The proof essentially relies on the classic maximum principle, according to which

∇v(z) = 0, Hv(z) ≤ 0 ,

for any function v ∈ C 2 (Rd ) z.


locally maximum at
We proceed by contradiction. Assume that u ≤ v does not hold. Then w = u − v admits a positive
global maximum at a point (t, x) ∈ [0, T ) × D, so by the classic maximum principle and by denition
of Iu,

∂t u(t, x) = ∂t v(t, x) , Iu(t, x) ≤ Iv(t, x)


∇u(t, x) = ∇v(t, x) , Hu(t, x) ≤ Hv(t, x) .

Hence the fact that

F (t, x, u(t, x), ∂t u(t, x), Iu(t, x), ∇u(t, x), Hu(t, x)) ≤ 0
≤ F (t, x, v(t, x), ∂t v(t, x), Iv(t, x), ∇v(t, x), Hv(t, x))

implies by monotonicity of F :

F (t, x, u(t, x), ∂t u(t, x), Iu(t, x), ∇u(t, x), Hu(t, x)) ≤
F (t, x, v(t, x), ∂t u(t, x), Iu(t, x), ∇u(t, x), Hu(t, x)) ,

which is in contradiction with u(t, x) > v(t, x), given the strict monotonicity of F in r. 2

Interpreting the problem at hand as comparing the distribution of the temperature in two identical
glasses lled with hot liquid, our comparison principle says that the liquid in a glass is hotter than
the other at any corresponding points inside the glasses, should it be hotter at any corresponding
points outside the glasses (or at any corresponding points on the surface of the glasses, in the purely
dierential case with no jumps, g = 0).
Proposition 9.1 immediately implies that there is at most one classic solution of (41). In very specic
cases, assuming in particular that F is linear, there may be a chance that the pricing PIDE has a
classic solution. However, in general, there is no hope that a classic solution to the pricing PIDE
does exist, and one must resort to suitable notions of weak solutions of (41).
31

9.2 Weak Solutions


We thus extend the denition of a solution so that a solution does exist under rather general cir-
cumstances, while preserving uniqueness.

9.2.1 Viscosity Solutions

The theory of viscosity solutions [51, 70, 2, 16, 4, 3, 49, 87, 88] denes suitable notions of weak (con-
tinuous, or even semi-continuous) solutions, subsolutions and supersolutions of non-linear monotone
PIDEs such as (41), such that related maximum and comparison principles still hold true. Thus
viscosity solutions of (41) (bounded, or satisfying suitable growth conditions) are unique.
Existence of a viscosity solution of (41) can be established by various means, such as the Perron's
method, itself based on the related comparison principles.

The precise denition of a viscosity solution to (41) is outside the scope of these notes and, in fact,
irrelevant here. It will be enough for us to keep in mind that under mild technical assumptions, the
pricing equation (41) is well-posed (admits a unique solution continuously depending on its input
data) in a suitable space of viscosity solutions with growth conditions. This will enable us to ana-
lyze convergence properties of naturally related nite dierences approximation schemes (including
approximation trees) for (41).

9.2.2 Weak Solutions in Weighted Sobolev Spaces

Alternatively to working with viscosity solutions, it is possible to derive weak (variational) formu-
lations of the pricing PIDE problem (41) in suitable weighted Sobolev functional spaces H. In this
approach, the boundary condition φ is typically accounted for by a judicious choice of H.
Existence and uniqueness of a solution to the variational formulation of the problem typically results
by application of the LaxMilgram Theorem.
Various choices for H are possible, all giving rise to well-posed variational problems (see e.g. Bally et
al. [15, 14], BarlesLesigne [18], AchdouPironneau [1], Jaillet et al. [86], Ern et al. [68]). The choice
made in [18, 15, 14] has the advantage to give rise to a clear connection (Feynman-Kac formula)
between the solution of the PDE problem and the option price process as probabilistically dened
by arbitrage (see the Appendix).
Again, the precise variational formulation of the pricing problem, including the denition of the
spaces, is outside the scope of these notes and irrelevant here. We shall simply keep in mind that
under mild technical assumptions, a weak (re)formulation of the pricing PIDE problem (41) is well-
posed in a suitable weighted Sobolev space H. This underlies the theory of naturally related nite
elements and nite volumes approximation schemes.

10 Numerical Approximation
In some models (models of the AJD class [63], SABR model [80]..) and in the case of European and
more or less vanilla options, the pricing equation (41) can be solved analytically (or semi-analytically,
see e.g. section 30).
But in general, (41) must be solved numerically. In order to approximate (41), one can either use
nite dierences methods [140], or resort to more general nite elements (or even nite volumes)
methods [1].
Note that there is in fact no hermetic frontier between these methods. Indeed, schematically:

Tree Methods ⊂ FD Methods ⊂ FE Methods ⊂ FV Methods

(and also in a sense, for a complete picture: MC Methods ⊂ Tree Methods, cf. section 1).

10.1 Finite dierences methods


Finite dierence methods are naturally connected with viscosity solutions of monotone equations
expressing related maximum principles (cf. section 9.1). A practical reason to use nite dierences
32

is that nite element methods are potentially more powerful, but they are also heavier, than nite
dierences methods. The related cost is justied in cases where the geometry of the domain makes it
necessary to use a sophisticated unstructured and adaptative discretization mesh. But pricing prob-
lems in nance are typically posed on rectangular domains, for which a simple nite dierence grid
is good enough (with the limitation that a nite dierences grid does not oer as many possibilities
as a nite elements mesh for rening the mesh in multi-dimensional problems, however).

10.1.1 Localization, Discretization

The numerical resolution of the PIDE problem (41) by nite dierences is a four steps process:
(1) Transformation(s) of the problem, whenever judged useful, such as:
• changes of variables (e.g. as x = ln S, in the BS case),
• changes of unknowns (e.g. solving the equation for e−rt u rather than u),
• changes of probabilities (see e.g. [73, Appendix A] or [55, III,3.1.3] for a PDE interpretation of
the Girsanov transform for diusion processes);
(2) Localisation of the problem, that is, truncating the integration domain in the non-local term Iu,
and the problem domain:
• replacing the integration domain Rm by bounded domains Dε (t, x) ⊂ Rm such that
Z
h(t, x, dy) ≤ ε ,
Rm \Dε (t,x)

in the non-local term Iu;


• replacing D by an open bounded sub-domain in (41) (in case D was not bounded from the begin-
ning), still denoted by D henceforth, and introducing a suitable Dirichlet boundary condition ϕ on
a boundary layer ∂Dj around the new domain D (we also dene ϕ=φ at T );
(3) Discretizing the localized timespace domain and choosing a suitable nite dierences numerical
scheme for the localized problem;
(4) We thus get a problem of linear algebra (typically: a sparse linear system) in the values of the
approximate solution at mesh nodes, to be programmed and solved numerically on a computer.

Note that to exploit the parabolic structure of pricing equations in nance, the time dimension
is typically treated separately in (3) (as in the so-called `theta-schemes'), in order to `save' one
dimension in terms of storage cost. The problem is then solved `linearly in time' (one time step after
the other) at step (4).
Convergence, convergence rate (and, of course, computational cost!) of the resulting approximation
scheme, are the main issues.

10.1.2 Convergence and Convergence rates

Given a suitable mesh with time step h and space step vector k = (k1 , . . . , km ) on [0, T ] × Rm , let

Fhk (ukh ) = 0on [0, T ) × D


(
(43)
  
ukh = ϕ on [0, T ] × Rm \ [0, T ) × D

stand for a fully-discrete nite dierences approximation scheme for (41) (or a localized version of
(41), in case where the original domain D or the support of h are unbounded, cf. section 10.1.1).
By (43) we mean that the related equalities are satised at every mesh point in
    [0, T ) × D or

[0, T ] × Rm \ [0, T ) × D . We assume that the discrete problem (43) admits a unique solution

ukh dened at grid points.

Recall that under mild technical assumptions, the pricing equation (41) has a unique solution u in a
suitable space of viscosity solutions with growth conditions. Moreover we assume that the boundary
data φ and the solution u are bounded, for simplicity.

In the purely dierential case (no jumps, g = 0) and in the case of a linear operator F, u is typically
a classic solution of (41). Then the well-known Lax Equivalence Theorem states that any stable and
33

consistent numerical scheme Fhk (ukh ) = 0 is convergent, which loosely means that ukh converges to u
at mesh points as h, k → 0+, provided:
• ukh is bounded, uniformly over h, k (stability),
• Fhk (u) → 0 at mesh points as h, k → 0 (consistency).
We refer the reader to [118] for the detail of this theorem. In particular, the previous statements
are relative to the specication of a given norm in which ukh is bounded and Fhk (u) → 0, ukh → u,
hence the notions of stability/convergence in norm l∞ , l2 , etc.
There is a further notion of order of consistency, which measures the speed at which Fhk (u) → 0 at
mesh points as h, k → 0. Note however that the order of consistency of a numerical scheme has no
immediate implication in terms of convergence rates of ukh to u. Convergence rates only follow under
additional assumptions, regarding in particular the regularity of the boundary data φ, etc.
Now, a nice feature of the theory of viscosity solutions is that the Lax Equivalence Theorem can be
generalized to the case of a general non linear monotone operator F and to problems with jumps in
(41).
So, for a general monotone, yet still purely dierential, operator F in (41), Barles and Souganidis
[19] proved the convergence of any monotone, stable and consistent approximation scheme for (41),
provided (41) satises a suitable viscosity solutions comparison principle. We already observed that
such comparison principles can indeed be obtained under mild technical conditions on the data of
problem (41). The monotonicity condition on the scheme is a discrete version of that on F (cf.
(42)), which is typically satised by natural nite dierences approximation schemes for (41). So
the convergence conditions in the BarlesSouganidis theorem are essentially reduced to those in the
Lax Equivalence Theorem, namely stability and consistency of the scheme.
Finally, the latter results were extended to problems with jumps (including the case of unbounded
jump measures, hence beyond the case of probability measures h postulated in (41)) by Briani, La
Chioma and Natalini [39]. Note however that in presence of jumps, the monotonicity of the related
numerical schemes does not follow as universally as in the purely dierential case (see e.g. [144]).

10.2 Finite Element Methods and Beyond


Finite element and nite volume methods are based on energetic, variational (re-)formulations of the
pricing PIDE problem (41).These methods give approximate solutions dened on the whole state
space of the continuous (localized) problem (see Figure 4), as opposed to approximate solutions
at grid points by nite dierences methods. They are most naturally connected with equations

Figure 4: Convergence of nite elements approximations.

expressing (energy) conservation principles.

As already mentioned in section 10.1, they are heavier and harder to implement (nite volume meth-
ods in particular) than nite dierences methods. By heavier, we mean that they are computationally
intensive, particularly in terms of storage cost. Indeed, a prerequisite in a nite element method is the
construction of a suitable discretization mesh, which is typically unstructured and adaptative, and
which has to be handled by the computer program all along the numerical resolution. This also means
that nite element methods are harder to implement. One typically then has to use nite element
toolboxes (many of them are actually free, see e.g. http://www.inria.fr/valorisation/logiciels/calcul.en.html),
34

which may induce less exibility in programming.

However, the related cost is justied in cases where the geometry of the domain makes it necessary
to use a sophisticated unstructured and adaptative discretization mesh, such as it is typically the
case in uid dynamics (see Figure 5), but also occasionally so in nance. This is for instance the case

Figure 5: (a) Detail of viscous mesh for wind tunnel model in aircraft design, (b) Parallel computation
on an unstructured mesh showing the domain decomposition of 16 processors of a distributed memory
computer.

for pricing barrier options with curved boundaries (see e.g. Crépey [55]; barrier options with curved
boundaries are common on interest-rates), or for the precise approximation of exercice boundaries
in the case of American problems (see e.g. [42], also reported in Crépey [55]).
Another motivation for using nite elements is to counter the curse of dimensionality (see section
3), by rening the approximation mesh in a more clever way than simply taking the product of uni-
dimensional adaptive meshes, as is typically done with nite dierences (see Figure 6 and section
10.2.2).

Figure 6: Finite elements mesh renement.

Practically speaking, the numerical resolution of problem (41) by nite elements is a ve steps
process (cf. the analogous description of nite dierences methods in section 10.1.1):
(1) Transformation(s) of the problem, whenever judged useful;
(2) Localisation of the problem, that is, truncating the integration domain and the problem domain,
and introducing a suitable Dirichlet boundary condition ϕ which prolongates φ on a boundary layer
∂Dj around the localized domain;
(3) Derivation of a weak formulation of the localized problem in a suitable weighted Sobolev functional
space H (the boundary condition ϕ is typically accounted for by a judicious choice of H);
(4) Projection of the resulting problem onto a nite-dimensional sub-space of nite elements Hh ⊂
H;
35

(5) We thus get a high-dimensional, sparse linear system, in the coecients of the approximate
solution on a nite element basis, to be programmed and solved numerically on a computer.

Existence and uniqueness of a solution to the weak form of the localized problem at step (3) typically
follows by application of the LaxMilgram Theorem (see section 9.2.2).

With the same motivation as for nite dierences methods, the time dimension is typically (yet not
always, see e.g. [42, 68]) treated separately at steps (3)(4). So the problem may be solved `linearly
in time' at constant storage cost. Like with nite dierences methods, convergence, convergence
rate, and computational cost of the resulting approximation scheme, are the main issues.
An interesting feature of nite elements methods is that a powerful error theory (a priori and a
posteriori error estimates) is available.

To solve the high-dimensional sparse linear system arising at step (5), an iterative solver is required.
The Generalized Minimal Residual (GMRES) algorithm [136], a special form of conjugate gradient
method which is both ecient and relatively easy to code, is the industry standard in this regard.

10.2.1 Finite Volumes

Finite volumes methods can be seen in an informal way as counterparts of nite element meth-
ods in which the test functions used in the variational formulation of the problem are indicator
(discontinuous) functions, instead of regular test functions as usual. Finite volumes methods are
especially dedicated to PDE convection (or convection dominant) problems with discontinuous data.
For instance, the nite volume discretization of the pricing function of a digital option is exact in a
BlackScholes model with no volatility (σ = 0).
10.2.2 Sparse Grids

Sparse grids denote numerical techniques to represent, integrate or interpolate high dimensional
functions, relying on the seminal works of the Russian mathematician Smolyak, who found a clever
quadrature rule to (partially) escape the curse of dimensionality.
This direction of research underlies active developements in nite elements methods (or nite dier-
ences methods; see Figure 7). But the related algorithms are dicult to implement, and we will not
go further in this direction at the level of these notes, referring the interested reader to [132, 75].

Figure 7: Sparse Grids Renement.

For simplicity, we shall focus on nite dierences methods in the sequel. About nite element
methods in nance, we refer the reader to AchdouPironneau [1].
36

11 Finite Dierences for European Vanilla Options


11.1 BS Equation
Recall from section 5.1 that in the risk-neutral BS model relative to the stochastic basis (Ω, F, P) :
dSt
St = κdt + σdWt

(κ = r − q) for a standard P-Brownian motion W, the price process of an European vanilla option
with (integrable) payo φ(ST ) at T is given by, for t ∈ [0, T ] :
Πt = e−r(T −t) Et φ(ST ) . (44)

So, consistently with arbitrage requirements (cf. the Appendix), the discounted price e−rt Πt is a
martingale under the risk-neutral measure P.
By the Markov property of the risk-neutral BS stock S , we have that Et φ(ST ) = E (φ(ST ) | St ) , and
it is possible to show that the price Πt = v(t, St ), for a suitable Borel function v . Then, assuming
that the function v lies in C 1,2 (R+ × R) with bounded derivatives, we have by application of the Itô
formula:
ert d(e−rt v(t, St )) = (∂t v + Lv)(t, St )dt + σSt ∂S v(t, St )dWt
σ2 S 2 2
where L = κS∂S + 2 ∂S 2 − rId. Since e−rt v(t, St ) is a martingale, thus

∂t v + Lv = 0 .

Accounting for the fact that ΠT = φ, we get that the price of an European vanilla option in the BS
model solves the following PDE problem:

∂t v + κS∂S v + 12 σ 2 S 2 ∂S2 2 v − rv = 0 [0, T ) × R?+



in
(45)
v(T, S) = φ(S), S ∈ R?+

After logarithmic transformation Xt = ln(St ), the option price process is given by Πt = u(t, Xt ),
where u solves the parabolic equation

∂t u + b∂x u + 12 σ 2 ∂x22 u − ru = 0

in [0, T ) × R
(46)
u(T, x) = ψ(x), x ∈ R

σ2
(b =κ− 2 ), with x = ln S and ψ(x) = φ(ex ), the payo in the x variable.

For smooth data, this equation is well-posed (i.e. has one and only one solution, which depends
continuously on the data of the problem), in a suitable space of classic solutions [74] (and in suitable
spaces of viscosity or weak solutions in weighted Sobolev spaces as well, cf. section 9.2).

11.2 Localization and Discretization in space


Let x = ln(S0 ). To solve numerically the BS equation in log-returns variable (46), we start by
truncating the integration domain in space: the problem will be solved in a nite interval D̄ =
[x − `, x + `] . One chooses ` so that

P (∃s ∈ [0, T ], |Xsx − x| ≥ `) ≤ ε , (47)

which can be achieved by setting



` = |b|T + f σ T (48)

for a related quantile f of the Gaussian law (like f = 4, see e.g. Crépey [55]). Once D is chosen,
one discretizes it in space, constructing a uniform grid {xj } with
2j`
xj = x − ` + , for 0≤j ≤m+1
m+1
37

(with m odd, so that x lies in the space grid; otherwise some kind of interpolation has to be used,
cf. Remark 11.1).

One then approximates the dierential spatial operator

Au = 12 σ 2 ∂x22 u + b∂x u − ru

by a discrete operator Ak acting on Rm -valued vectors uk = (uk (t, x1 ), . . . , uk (t, xm )). The easiest
and most natural is to take:

1 2 2 k
Ak uk (xj ) = σ δx2 u (xj ) + bδx uk (xj ) − ruk (xj ) (49)
2

with

1
δx uk (xj ) = 2k (uk (xj+1 ) − uk (xj−1 ))
δx22 uk (xj ) = k12 (uk (xj+1 ) − 2uk (xj ) + uk (xj−1 ))

where uk (x0 ) = uk (x − `) and uk (xm+1 ) = uk (x + `) are notations for quantities to be dened below
k
using the u (xj ), j = 1 . . . m. By using well chosen Taylor expansions, it is possible to show that δx
2 2
and δx2 are consistent approximations of order two of the spatial dierential operators ∂x and ∂x2 ,
respectively, meaning that for regular test functions ϕ :

|δx ϕ(t, xj ) − ∂x ϕ(t, xj )| = O(k 2 ) , |δx22 ϕ(t, xj ) − ∂x22 ϕ(t, xj )| = O(k 2 ) .

Remark 11.1 If |κ| /σ 2 is not small, a less precise but more stable nite dierence approximation
for ∂x is
1 k k

k
δx u (xj ) = k (u (xj ) − u (xj−1 )) if b<0
1 k k
k (u (xj+1 ) − u (xj )) if b>0.
This alternative discretization for ∂x , called upwind discretization, is based on the consideration of
the characteristics of the limiting hyperbolic transport equation with σ = 0.

One then seeks an Rm -valued time functional uk (t) such that:


• in the case of Dirichlet boundary conditions,

 k
 u (T, xj ) = ψ(xj )
uk (t, x − `) = ψ(x − `) , uk (t, x + `) = ψ(x + `) (50)
 d k k k
dt u (t, xj ) + A u (t, xj ) = 0 for 0 ≤ t < T, 1 ≤ j ≤ m ,

• in the case of Neumann boundary conditions,

 k

 u (T, xj ) = ψ(xj )
 k
u (t, x1 ) = uk (t, x − `) + k∂x ψ(x − `)
(51)

 uk (t, xm ) = uk (t, x + `) − k∂x ψ(x + `)
 d k k k
dt u (t, xj ) + A u (t, xj ) = 0 for 0 ≤ t < T, 1 ≤ j ≤ m .

Set

σ2 b σ2 σ2 b
α= − , β = − − r , γ = + (52)
2k 2 2k k2 2k 2 2k

In this notation, the operator Ak applied to uk (t) writes:

Ak uk (t) = Ak uk (t) + v k ,
38

with in the case of Dirichlet boundary conditions:

 
β γ 0 ,..., 0 0  
ψ(x − `)α
 α β γ 0 ,...,  0
   0 
 0 α β γ ,...,  0 
.

k k .
A = .  , v =  (53)
   
. .. .. .. .
. . 

 0 . . . . .   

 0
 0 
0 ,..., α β γ 
ψ(x + `)γ
0 0 0 ,..., α β

and in the case of Neumann boundary conditions:

 
β+α γ 0 ,..., 0 0
−αk ∂ψ
 
 α β γ 0 ,..., 0  ∂x (x − `)
   0 
 0 α β γ ,..., 0  
.

k k .
A =  , v =  . (54)
   
. .. .. .. . .
. . . . .
 0 . .   
   0 
 0 0 ,..., α β γ 
γk ∂ψ
∂x (x + `)
0 0 0 ,..., α β+γ

Remarks 11.1 If m was even, x wuould not belong to {xj ; 0 ≤ j ≤ m + 1}. In this case one could
use linear interpolation to compute the option value corresponding to the initial stock price at time
1

0. This price would thus be approximated by
2 uk (0, xm/2 ) + uk (0, xm/2+1 ) .

11.3 Theta-schemes
We now discuss discretization in time. The standard theta-scheme (θ ∈ [0, 1]) for the parabolic
equation (46) may be summarized as follows (see e.g. [102, 118]): Fix a time discretization step h
such that T = nh, and construct a fully discrete approximation ukh (ti , xj ) = ui (xj ) where the ui ,
i = 0 . . . n are Rm -valued vectors such that:

un = ψ k

ui+1 −ui (55)
h + Ak (ui+1 + θ(ui − ui+1 )) = 0 for 0≤i≤n−1

For θ = 0, we get the so-called Euler explicit scheme. Likewise, for θ = 1 the scheme is the fully
1
implicit Euler scheme, and for θ = 2 it is the Crank-Nicholson scheme.
k −x
Once we have computed uh , we recover the delta ∆ = e ∂x u by its approximation given by

ui (xj +k)−ui (xj −k)


∆kh (ti , xj ) = e−xj 2k

11.3.1 Explicit Method

First, let us discuss the case θ =0 (explicit scheme). Using the denition of Ak (considering say
Dirichlet conditions), the approximating scheme (55) is reduced to (cf. (52))

un = ψ k , and for 0 ≤ i ≤ n − 1 , 1 ≤ j ≤ m :


ui (xj ) = hα ui+1 (xj−1 ) + (1 + hβ) ui+1 (xj ) + hγ ui+1 (xj+1 )

with ui+1 (x ± `) = ψ(x ± `).


Considering (52), one can then show that the explicit approximation scheme for the BS pricing
equation (46) is (cf. section 10.1.2):
39

2
k 2
• stable, provided h ≤ σ2 +rk 2 (and σ > |b|k, but this is always satised for k small enough);
• consistent of order one in time and two in space.
11.3.2 Implicit Methods

When we choose 1 ≥ θ > 0, we have to solve at each time step, a linear system

Aui = Bui+1 + hv k

where A and B are tridiagonal matrices of the type

 
b1 c1 0 ,..., 0 0

 a2 b2 c2 0 ,..., 0 

 0 a3 b3 c3 ,..., 0 
.
 
. .. .. .. .
. .

 0 . . . . . 
 
 0 0 , . . . , am−1 bm−1 cm−1 
0 0 0 ,..., am bm

For example, in the case of natural Dirichlet boundary condition, A is given by, for every j:

b σ2 σ2 b σ2
aj = θh( 2k − 2k2
), bj = 1 + θh(r + k2
), cj = −θh( 2k + 2k2
)

and B is given by, for every j:


2
σ b σ2 b σ2
aj = (1 − θ)h( 2k 2 − 2k
), bj = 1 − (1 − θ)h(r + k2
), cj = (1 − θ)h( 2k + 2k2
)

The fully implicit and the CrankNicholson schemes correspond to θ=1 and θ = 21 , respectively.
One can show that the implicit (θ 6= 0) approximation theta-schemes for the BS pricing equation
(46) are:
• stable inconditionally for θ ≥ 21 ;
• consistent of order one in time and two in space, with the notable exception of the CrankNicholson
scheme which is consistent of order two in time and space.

On Figure 8, we plotted the relative error at time 0 as a function of the spot price S0 , obtained
when pricing an European vanilla call option with the explicit, fully implicit and CrankNicholson
theta-schemes , respectively (for r = 10%, σ = 20%, T = 1y, K = 100). The CrankNicholson is
more accurate (at least around the money), as expected.

All implicit theta-schemes require at each time step the resolution of a linear system Au = v, where
u and v are m-dimensional vectors. Let us describe two algorithms of resolution of such linear
systems.

Gauss Factorization This algorithm is based on the fact that a regular matrix can be factorized
as A = LU, where L is a lower triangular matrix, and U is an upper triangular matrix with all ones
on its diagonal. The linear system LU z = v is decomposed into Ly = v, U z = y . It is easy to see
that A tridiagonal implies that L, U are also tridiagonal and so only the upper diagonal of U and
the two diagonals of L need to be found. This results in the following procedure, known as Thomas'
algorithm [118, 102]:
 0
b = bm , ym = vm
 z1 = y1 /b01

 m


For 1 ≤ j ≤ m − 1, j decreasing:
0 , followed by For 2 ≤ j ≤ m, j increasing:
b = bj − cj aj+1 /b0j+1 ,
 j zj = (yj − aj zj−1 )/b0j .
 
yj = vj − cj yj+1 /b0j+1 .

Remark 11.2 Note that this method presupposes that all the b0j (called the pivots) are non zero.
40

3
Crank-Nicholson Scheme; NT = 50 , NX= 101
implicit Scheme ; NT = 50 , NX = 101
Explicit scheme; NT = 50 , NX = 101
2

log_10 ( % err rel) 0

-1

-2

-3

-4
50 100 150 200
Spot value

Figure 8: Pricing of an European call option by theta-schemes for θ = 0, 21 , 1.

SOR Iterative Methods An alternative, which in the case of tridiagonal systems is justied only
by its programming simplicity, is to use the Successive Over-Relaxation iterative scheme. The idea
is to decompose A as A = D + R where D is the diagonal part of A. The linear system Au = v is
then rewritten Du = v − Ru. The solution is computed as the limit of a converging sequence
up+1 = D−1 (v − Rup ) , (56)

or, better:

up+1 = up + ω(e
up+1 − up )
for some 1 < ω < 2, where ep+1
u stands for the r.h.s. of (56). More precisely, the algorithm writes
as follows:
• Step 0 Choose u0 ≥ 0, ε > 0, 1 < ω < 2. Set p = 0.
• Step 1 (Jacobi iteration) Form an intermediate vector hp+1 = (hp+1 j )1≤j≤m by

1 X X
hp+1
j = (vj − Ajl upl − Ajl upl ) , 1 ≤ j ≤ m (57)
Ajj
l<j l>j

Here a possible renement (GaussSeidel iteration) is to use hp+1


l instead of upl in the rst sum in
the r.h.s. of (57).
• Step 2 (Over-relaxation) Dene up+1 by

up+1 = up + ω(hp+1 − up ) , (58)

and set p = p + 1.
• Step 3 Repeat steps 2 and 3 until |up+1 − up | < ε.

11.4 Adding Jumps


Let us now add jumps in S, assuming that the underlying asset price evolves according to the
following RN jump-diusion :

Nu
dSu ¯
X
= (κ − λJ)du + σdWu + d( Jl ), ST −t = s (59)
Su−
l=1
41

where N is a Poisson process with deterministic jump intensity λ, and the Jl are positive, independent
stochastic variables with law ν of j1 := ln(1 + J1 ) (J¯ = EJ1 ; the other data in (59) are dened as
usual). Moreover, N, W, and the Jl are independent. So in case ν = N (α, β) we recover the RN
Merton model, cf. section 5.3.

One can show as in section 11.1 (see also the Appendix) that the price of an European option in the
RN jump-diusion (59) can be formulated in terms of the solution to a Partial Integro-Dierential
Equation (PIDE). So, u denoting the pricing function in the returns variable x = ln S, we have
·
formally as in section (11.1), with  = standing for equality up to a martingale term:

·
ert d(e−rt u(t, Xt )) = (∂t u + AX u − ru)(t, Xt )dt ,

where by the rst equality in (10) (cf. (143), (144)):

1
AX u = (a + λj̄ − λj̄) ∂x u + σ 2 ∂x22 u + λ [Eu(x + j1 ) − u(x)] . (60)
2
By the usual arbitrage argument, the price at time t of the option is thus given by Πt = u(t, Xt )
where u solves the integro-parabolic equation


u(T, x) = ψ(x), x ∈ R
(61)
∂t u + Au + Bu = 0 in [0, T ) × R

with

Z
1
Au = σ 2 ∂x22 u + a∂x u − ru , Bu = λ (u(t, x + z) − u(t, x))ν(dz)
2 R

(note that the operator A of this section reduces to the operator A of the previous section when
λ = 0, since in this case a = b − λJ¯ = b ).
11.4.1 Localization

Localization works essentially as in sections 11.2, except for the fact that:
• b is replaced by a in (48),

• the boundary ∂D = {x − `} ∪ {x + `} is replaced by the boundary layer ∂Dj = [x − ` − zmin ,x −
+
`] ∪ [x + `, x + ` + zmax ] (cf. step (2) in section 10.1.1), where zmin and zmax are such that
Z zmax Z ∞
ν(dz) ≈ ν(dz) − ε = 1 − ε, ε  1 ;
zmin −∞

• we solve the following localized problem on D̄j := D∪ ∂Dj :



 u(T, x) = ϕ(T, x) = ψ(x) , x ∈ R
u(t, x) = ϕ(t, x) , (t, x) ∈ [0, T ) × ∂Dj (62)
∂t u + Au + B` u = 0 , (t, x) ∈ [0, T ) × D

where ϕ is a suitable Dirichlet condition such that ϕ(T, x) = ψ(x) (e.g., ϕ(t, x) ≡ ψ(x)), and B` is
such that, for x∈D:
R
B` u(x) = λ D−x
u(t, x + z)ν(dz) − λu(t, x)+
R (63)
λ ∂Dj −x
ϕ(t, x + z)ν(dz)

11.4.2 Discretization
42

We dene:

xmin = min D̄j , xmax = max D̄j


xmax −xmin
k= m , xj = xmin + jk (j = 0, . . . , m)
h − i h +
i
zmin zmax
jl = k , ju = n − k .

B` approximation In order to approximate B` , we set ν(dz) = ζ(z)dz, where ζ is the density of


ν , assumed to exist. Then the B` operator is decomposed as

h R i
B` u(t, x) = λ D−x u(t, x + z)ζ(z)dz − λu(t, x) +
h R i
λ ∂Dj −x ϕ(t, x + z)ζ(z)dz = B̄` u(t, x) + Φ(t, x)

Standard approximation For j = jl , . . . , ju , we approximate

Pju −j
B̄` u(·, xj ) ≈ B k u(·, xj ) = λk i=j l −j
u(·, xj+i )ζ(·, xi ) − λu(·, xj )
Φ(·, xj ) ≈ Φk (·, xj ) = λk {i+j<jl }∪{i+j>ju } ϕ(·, xj+i )ζ(·, xi )
P

FT approximation Recall that the discrete correlation of two functions gj , hj , each periodic with
period m, is dened by
m−1
X
ρ(g, h)j = gj+l hl .
l=0

The discrete correlation theorem says that the discrete Fourier transform of the discrete correlation
of two real functions g and h is such that

< ρ(g, h)j >=< gj >< h̄j > (64)

where <·> is the discrete Fourier transforms operator, and the bar denotes complex conjugation.
Then we can compute the discrete correlation using the FFT as follows: FFT the two data sets,
multiply one resulting transform by the complex conjugate of the other and inverse transform the
product. The result will formally be a complex vector of length m. However, it will turn out to have
all its imaginary parts equal to zero since the original data sets were both real.
We can apply this procedure to the B k and Φk operator (with the related vectors suitably prolongated
by zero padding outside jl , . . . , ju , see [129]).

Finite dierences in space For the dierential operator A, we write:

σ2 2
Ak u(·, xj ) = [aδx + 2 δxx − r]u(·, xj )

where
u(·,xj+1 )−2u(·,xj )+u(·,xj−1 )
δx22 u(·, xj ) = k2 ,

u(·,xj )−u(·,xj−1 ) u(·,xj+1 )−2u(·,xj )+u(·,xj−1 )


δx u(·, xj ) = k +α k ,
where α is chosen such that (for stability reasons):

σ2

1
 ka ≤ 2 α
= 2
σ2 α=0 a>0
 ka > 2 α=1 a<0
43

Theta-schemes We dene the time grid {ti | ti = hi, i = 0, . . . , n}. Denoting uji = u(ti , xj ), we
consider the following discrete operator:

uji+1 −uji
h + Ak [θA uji + (1 − θA )uji+1 ]+
(65)

+θB [B k uji + (Φk )ji ] + (1 − θB )[B k uji+1 + (Φk )ji ]

where θA , θB ∈ [0, 1]. We consider in particular two cases.


Explicit scheme θA = θB = 0 Computationally feasible but potentially unstable and suers from
the drawback that convergence in time is only of the rst order (in O(h)).
For every i = n − 1, . . . , 0 we solve

 j

 ui = ϕji , j = 0, . . . , jl − 1 and j = ju + 1, . . . , n


 Piu
uji = p1 uj−1 + p2 uji+1 + p3 uj+1
i+1 X i+1 + hλ(k
j+i j
i=il ζi ui+1 − ui+1 (66)

ζi ϕj+i

 +k i+1 ), j = jl , . . . , ju ,



{i+j<jl }∪{i+j>ju }

where

 σ2 a   σ2 a   σ2 a 
p1 = h + (1 − α) , p 2 = 1 − h + (1 − 2α) + r , p 3 = h − α .
2k 2 k k2 k 2k 2 k

1
`Asymmetric' scheme 2 , θB = 0 Stable and ecient but some accuracy is lost due to the
θA =
asymmetric treatment of the continuous and jump part.
0 jl ju m T
We have to solve, for i = n−1, . . . , 0, the linear system Aui = Bi+1 , where ui = (ui , . . . , ui , . . . , ui , . . . , ui ) ,

 
Il 0
A= A
e , (67)
0 Iu

in which Il and Iu are two identity matrices, jl × jl and (m − ju ) × (m − ju ) respectively, and A


e is
the (ju − jl + 1) × (ju − jl + 1) tridiagonal matrix such that
 
σ2
a0 = − h2 + ka (1 − α)
 
a1 a2 0 · · 0 2k2

 a0 a1 a2 0 · 0 

 0 a0 a1 a2 0 · 
 a1 = 1 + h σ22 + a (1 − 2α) + r
 
A
e=

 · 0 0 

2 k k
 · · 0 a0 a1 a2   2 
0 · · 0 a0 a1 a2 = − h2 2k
σ a
2 − kα

and nally
l −1
Bi+1 = (ϕ0i+1 , . . . , ϕji+1 jl
, fi+1 ju
, . . . , fi+1 , ϕji+1
u +1
, . . . , ϕm T
i+1 ) , (68)

where, for j = jl , . . . , ju ,

 Xiu
j
fi+1 = −a0 uj−1
i+1 + (2 − a1 )uji+1 − a2 uj+1
i+1 + hλ k ζi uj+i j
i+1 − ui+1
i=il
X 
+k ζi ϕj+i
i+1 .
{i+j<jl }∪{i+j>ju }
44

12 Finite Dierences for American Vanilla Options


12.1 BS Variational inequalities
Using the formalism of viscosity or variational solutions of PDEs, the arguments of section 11.1 can
be extended to American options. It can thus be shown (see the Appendix) that the price of an
American vanilla option in the BS model at time 0 is given by

sup Ee−rτ ψ(Xτ ) = u(0, 0) ,


τ ∈T

with Xt = ln(St ), where u is the unique viscosity solution satisfying suitable growth conditions to
the following obstacle problem in the log-return space variable x = ln S :

u(T, ·) = ψ(T, ·) on R
(69)
max(∂t u + Au, ψ − u) = 0 on [0, T ) × R

Equivalently, u can be characterized as the unique solution to the following variational inequality
problem:

u(T, ·) = ψ(T, ·)
(70)
h∂t u + Au, v − ui ≤ 0 , v ≥ ψ
where h·, ·i denotes the scalar product in a suitable weighted Sobolev space.

To support this equivalence, note that provided u ≥ ψ, (69) implies that, for any v≥ψ:

h∂t u + Au, v − ui = h∂t u + Au, v − ψi + h∂t u + Au, ψ − ui = h∂t u + Au, v − ψi ≤ 0,

hence (70). Conversely, still assuming u ≥ ψ, (70) implies that for any non-negative test-function
ϕ, we have h∂t u + Au, ϕi ≤ 0, hence ∂t u + Au ≤ 0 and h∂t u + Au, ψ − ui ≥ 0. But taking v = ψ in
(70) gives h∂t u + Au, ψ − ui ≤ 0, so that nally h∂t u + Au, ψ − ui = 0, hence (69).
12.2 Splitting methods
Let us rst describe the computational treatment of the obstacle problem (69), using the so-called
splitting method. The idea is to construct recursively, by dynamic programming, the approximate
solution uji , starting from un = ψ k (T ), and computing ui from ui+1 , for 0 ≤ i ≤ n − 1, in two steps
as follows:
• Step 1 Solve the following Cauchy problem (in discrete space) on [ih, (i + 1)h] × Dk (cf. section
11):
u((i + 1)h, ·) = ui+1 on Dk


∂t u + Ak u = 0 on [ih, (i + 1)h[×Dk ;
denote by ui+ 21 the solution u at time ih;
 
• Step 2 Do ui = max ψ k (ti ), ui+ 12 .
By Barles et al. [17, 19], this scheme converges to the unique viscosity solution of (69) satisfying
suitable growth conditions, namely the pricing function u.
12.3 Linear Complementarity Problem
Let us now describe the computational treatment of the variational inequalities (70). We refer to
[86, 79] for a detailed presentation. It is well-known that after localization and discretization the
variational inequalityproblem (70) can be expressed as a linear complementarity (LC) problem. At
each time step i, we thus have to solve:

 AX ≥ G
X≥Φ (71)
(AX − G, X − Φ) = 0

45

with (using Dirichlet boundary conditions)




 A = I − hθAk
X = ui

 G = (I + h(1 − θ)Ak )ui+1 + hv k

Φ = ψk .

So A is the following tridiagonal matrix:

 
b c ,..., ,..., 0   
 a b c ,..., 0  σ2 1
  
 a = θh − 2k 2 + 2k b
 .. .. .. .. . 
. 

A= . . . . . with σ2

 b = 1 + θh(
 k 2 + r)
 . .  c = −θh σ22 + 1 b .
 
. .
 
 . . a b c  2k 2k
0 ,..., ,..., a b

There exist three classic algorithms to solve the linear complementarity problem (71).

A method by Brennan and Schwarz [38] This method consists in introducing an auxiliary
LC problem. A rigorous justication of the convergence of this algorithm, in the case of an American
put option, was given in Jaillet et al. [86].

PSOR Method The LC problem (71) can be written as follows: nd vectors W = (wi )1≤j≤m
and Z = (zj )1≤j≤m in Rm such that

 W = AZ + V (72.1)
W ≥ 0, Z ≥ 0 (72.2) (72)
(W, Z) = 0 (72.3)

where we have set Z =X −Φ and V = AΦ − G. Such a LC problem can be solved by a Projected


SOR scheme (cf. section 11.3.2). Convergence was established in Cryer [58].

An Algorithm of Cryer This algorithm is based on a direct method, which is an adaptation to


LC problems of Saigal's linear programming algorithm. The basic idea of this kind of algorithm is:
Choose an initial value which satises both (72.1) and (72.2), maintain the two conditions during
all steps and have gradually satised the null condition (72.3). The solution of problem (72) is then
obtained.
Note that the matrix A is a Minkowski matrix, namely a matrix with positive principal minors,
positive diagonal entries and non-positive o-diagonal entries. This implies in particular that A−1 ≥
0. The Cryer algorithm is valid for all Minkowski matrix. In the particular case, such as ours, where
A is a tridiagonal Minkowski matrix, an implementation of this basic method which minimizes the
amount of computation can be found in [59].

13 Finite Dierences for 2D Vanilla Options


The purpose of this section is to describe various algorithms for pricing options in the BS2D setting,
relying upon the ADI (Alternate Direction Implicit) )method of Peaceman Rachford [127, 118]. The
ADI method can in fact be used in any multi-dimensional Markov model. It is actually the current
industry standard to solve multi-dimensional pricing problems by PDE methods.

In the RN BS2D model, the underlying stock-prices satisfy the following stochastic dierential
equations:
dSt1 = St1 (κ1 dt + σ11 dWt1 + σ12 dWt2 ), ln(S01 ) = x10

,
dSt2 = St2 (κ2 dt + σ21 dWt1 + σ22 dWt2 ), ln(S02 ) = x20
46

for independent Brownian motions W 1, W 2, with

       
κ1 r − q1 σ11 σ12 σ1 p 0
= , Σ= = ,
κ2 r − q2 σ21 σ22 ρσ2 1 − ρ2 σ 2

so that in particular ΣΣT = Γ, where Γ is the following covariance matrix:

σ12
 
ρσ1 σ2
Γ= .
ρσ1 σ2 σ22

In order to apply the ADI method, we'd rather work with the underlying decorrelated bidimensional
Brownian motion.
Let us introduce some notation:
• b is the vector with components (κ1 − 12 σ12 , κ2 − 12 σ22 ),
• ex = (ex1 , ex2 ), for x = (x1 , x2 ),
• ϕ(t, w) = ψ(ex+bt+σw ), so that the payo of the option writes ϕ(t, Wt ).
The price at time 0 of a European vanilla option on (ST1 , ST1 ) is then given by:

Ee−rT ϕ(T, WT1 , WT2 ) = u(0, 0, 0)

where u is the unique viscosity solution with suitable growth conditions to the following two di-
mensional PDE:

(
2 1 2
∂t u(t, w1 , w2 ) + 12 ∂w 2 u(t, w1 , w2 ) + 2 ∂w 2 u(t, w1 , w2 ) − ru(t, w1 , w2 ) = 0 on [0, T ) × R2
1
2
1 (73)
u(T, w1 , w2 ) = ϕ(T, w1 , w2 ) on R .

For the numerical resolution of problem (73) by nite dierences:


• we localize in space the problem on a domain D̄ = [−`, `]2 , introducing a suitable boundary condition on
(0, T ) × ∂D;
• we introduce a grid of mesh points (t, w1 , w2 ) = (ih, j1 k1 , j2 k2 ) on [0, T ] × D̄, for mesh steps (which can be
thought of as tending to zero) h, k1 , k2 , and we discretize the localized problem on the time-space grid by a
suitable nite dierences scheme, such as the ADI scheme described in the next section.
Note that an implicit 2D scheme would involve a high dimensional linear system with sparse matrix, costly to
solve by an exact method (the sparseness of the matrix may be fruitfully used in an iterative approximation
scheme, however).
j1 ,j2
Let ui denote the solution of the discrete problem. Note that

ln S 1 W1
   
=A+Σ
ln S 2 W2

for some (time-dependent) vector A, thus

S 1 ∂S 1
     p  
∂W 1 1 σ2 1 − ρ2 0 ∂W 1
= Σ−1 = .
S 2 ∂S 2 ∂W 2 −σ2 ρ ∂W 2
p
σ1 σ2 1 − ρ2 σ1

The deltas

e−x1 e−x2 −ρ∂w1 u ∂w2 u


∆1 = ∂w1 u , ∆2 = p ( + )
σ1 1 − ρ2 σ1 σ2

are then retrieved by using suitable nite dierences approximations for ∂w1 u and ∂w2 u, e.g.

uj01 +1,j2 − uj01 −1,j2 uj1 ,j2 +1 − uj01 ,j2 −1


∂w1 u ≈ , ∂w2 u ≈ 0 .
2k 2k

13.1 Numerical integration by an ADI Method


47

Alternate Direction Implicit methods were originally proposed by Peachman Rachford [127]. At each time
step, one integrates `in each direction' by using the usual nite dierence method for 1D problems in two
steps, the rst one in w1 and the second one in w2 , as follows:
( 2 2 1 2
h
(ui+1 − ui+ 1 ) + 12 δw 1 1
2 ui+ 1 + 2 δw 2 ui+1 − 2 rui+ 1 − 2 rui+1 =0
2 1 2 2 2
2 1 2 1 2 1 1
h
(ui+ 1 − ui ) + 2 δw2 ui+ 1 + 2 δw2 ui − 2 rui+ 1 − 2 rui = 0 ,
2 1 2 2 2

that is: ( 2 h 2
[(1 + hr
4
)Id − h4 δw hr
2 ]ui+ 1 = [(1 − 4 )Id + 4 δw 2 ]ui+1
2
2
1
h 2
2 (74)
[(1 + hr
4
)Id − h4 δw hr
2 ]ui = [(1 − 4 )Id + 4 δw 2 ]ui+ 1 ,
2 1 2

with

2 j1 ,j2 uji 1 −1,j2 − 2uij1 ,j2 + uij1 +1,j2


(δw 2 u)i =
1 k2
2 j1 ,j2 uij1 ,j2 −1 − 2uji 1 ,j2 + uj1 ,j2 +1
(δw 2 u)i = .
2 k2
The system (74) is of the form

Au·,j = Bu·,j
( 2 2
i+ 1 i+1 , for any j2
2
Cuij1 ,· = Duji+
1 ,·
1 , for any j1
2

for tridiagonal matrices A, B, C, D.


So each time step consists in the resolution of m1 + m2 implicit 1D problems, which can be solved using the
Gauss method. This is in general a far better alternative than having to solve the m1 m2 -dimensional linear
system that would arise from a 2D implicit scheme.

Problem

13.2 American Options


Likewise, the price of a 2D American vanilla option at time 0 is given by (see the Appendix)

sup Ee−rτ ϕ(τ, Wτ1 , Wτ2 ) = u(0, 0, 0) ,


τ ∈T

where u is the solution to the following variational inequality (obstacle problem):


(  
max ϕ − u, ∂t u + 21 ∂x22 u + 21 ∂y22 u − ru = 0 on [0, T ) × R2
(75)
u(T, x, y) = ϕ(T, x, y) on R2 .

Again:
• we localize in space problem (75) on a domain D̄ = [−`, `]2 , introducing a suitable boundary condition
u = ψ on (0, T ) × ∂D;
• we introduce a grid of mesh points (t, x, y) = (ih, j1 k1 , j2 k2 ) on D̄ and we discretize the localized problem
on the grid;
•a suitable nite dierences approximation scheme on the grid can be obtained by combining the previous
ADI nite dierence method with the splitting method of section 12.2 (see [143]).

14 Finite Dierences for Exotic Options


14.1 Lookback Options
The BS pricing function of a lookback option with payo φ(ST , MT ) at T, where Mt = sup0≤s≤t Ss , satises
the usual BS1D pricing PDE in the (t, S) variables, but in the subdomain S ≤ M of the three-dimensional
state space (t, S, M ).
Moreover, on the boundary S = M, the pricing function u satises an homogenous oblique Neumann
condition ∂M u = 0 (see [146]).

This can be established by a formal application of the Itô semimartingale formula to the price process
ut = u(t, St , Mt ). Note that M is a non-decreasing process, yet it is not absolutety continuous, thus the
48

relevant Itô formula is not contained in (147). Here the relevant Itô formula is the general Itô formula for
semimartingales (see e.g. [48, 85]), which gives:

1 2 2 2

dut = 2
σ S ∂S 2 u + κS∂S u dt + ∂M u dMt + σS∂S udWt

whence by arbitrage, on {t < T } :


1 2 2 2
2
σ S ∂S 2 u + κS∂S u − ru = 0 on {S < M }
∂M u = 0 on {S = M } .

Remarks 14.1 (i) This can also be derived by letting n tend to ∞ in the pricing PDE of the option with
approximating payo φ(ST , MT (n)), where

t
Stn
Z
1 1
Mt (n) = ( S n ds) n , dMt (n) = dt .
0 n Mt (n)n−1

(ii) When St gets close to its current maximum Mt , then it becomes very likely that the related value of Mt
(current maximum at time t) will not remain the maximum up to expiry. It follows that the option value is
insensitive to small changes in the value of Mt in this case, which provides an intuitive insight into the fact
that ∂M u = 0 on S = M.

14.2 Barrier Options


A barrier option is a type of nancial option where the option to exercise depends on the underlying crossing
or reaching a given barrier level. For instance, barrier up and out options correspond to the following payo,
involving in particular the rebate R :

φ(S
e T , MT ) = ϕ(ST )1{M <U } + R1{M ≥U } .
T T

This payo is paid at time τ ∧ T, where τ is the rst hitting time of the barrier level U by the underlying
S. So in case{τ > T } the amount ϕ(ST ) is paid to the holder at time T, otherwise the amount R is paid to
the holder at time τ. Equivalently, the following payo is paid at time T :

φ(ST , τ ) = ϕ(ST )1{τ >T } + er(T −τ ) R1{τ ≤T } .

The related price process is thus given by e−r(T −t) Et φ(ST , τ ). Note that until time τ ∧ T, this is also equal
to

Et e−r(τ ∧T −t) ψ(Sτ ∧T ) , with ψ(S) = ϕ(S)1{S<U } + R1{S=U } (76)

Such options were created as a way to provide the insurance value of an option without charging as much
premium. If you believe that IBM will go up this year, but you're willing to bet that it won't go above 100,
then you can buy the barrier and pay less premium than the vanilla option.

Other forms of barrier options are up and in, down and out, down and in, double in, and double out options.
In some cases, the rebate may be paid at the maturity time T, rather than at the time of the barrier event.
In the latter cases, barrier options are special cases of lookback options, so that they may handled as in
the previous section. Their pricing functions thus solve relevant two-dimensional Cauchy problems in the
variables t, S, M.
However, in all cases, it is better to use the fact, using representations such as (76), the BS pre-barrier
event pricing functions of barrier options generically solve uni-dimensional CauchyDirichlet problems of
the following kind:
2 2
 ∂t u + 21 σ ∂x2 u + b∂x u − ru = 0 on [0, T ) × D,

u(T, x) = ψ(x) on D, (77)


u(t, x) = R(t, x) on [0, T ] × ∂D .

Let us give some examples (in the case of rebates paid at the time of the barrier event):
• Down Out Barrier L : D = (L, x + `), R(t, L) = R;
• Down In Barrier L : D = (L, x + `), ψ(x) = R, R(t, L) = C(t, L), where C is the price of a vanilla
49

European call with maturity T;


• Double Out Barriers L, U : D = (L, U ), R(t, L) = R(t, U ) = R;
• Double In Barriers L, U : D = (L, U ), ψ(x) = R, R(t, L) = C(t, L), R(t, U ) = C(t, U ).
For computing the pricing function of a barrier option, one discretizes the related problem (77) in time-space
with a theta-scheme which is solved by the Gauss method. The use of implicit schemes is recommended, for
stability issues.
For the sake of accuracy of the resulting scheme, grid points should be put on the barrier. In the case of
curved barriers, articial Dirichlet boundary conditions may be imposed along the barrier (Dirichlet boundary
conditions on ctitious grid points along the barrier, see e.g. [118, 55]).

One uses linear interpolation to nd the price value and delta value corresponding to the initial stock price.
If the initial stock price is close to barrier, a one-sided second-order nite dierence approximation should
be used for the delta.

14.3 Asian options


14.3.1 European Asian option
Payo at T :
RT !+
t
St dt
φ((St )t≤t≤T ) = K− . (78)
T −t

Price at time t:
Rt RT !+
−r(T −t) t
Su du + t
Su du
Πt = e Et K− . (79)
T −t

2D PDE To markovianize the payo (78), we introduce an additional state variable (process) It =
R t
t
Su du. In this new variable, the payo writes:

 +
IT
φ(IT ) = K − .
T −t

We thus consider the following Cauchy problem on [t, T ] × R?2


+ :


∂t u + AS,I u = ru, t ≤ t < T
(80)
u(T, S, I) = φ(I)

with
1 2 2 2
AS,I = σ S ∂S 2 + κS∂S + S∂I . (81)
2
The following Proposition follows by application of El Karoui et al. [66, Théorème 4.2].

Proposition 14.1 Πt = u(t, St , It ), t ∈ [t, T ], where u is the unique bounded viscosity solution of (80).

Remarks 14.2 (i) Note that the form of equation (80) may be derived by a formal application of the Itô
−rt
formula (formula (147), diusive case) to the discounted price process e u(t, St , It ). The function u is not
1,2
of class C , however, so that this derivation is purely formal. Yet it can be turned into a rigorous argument
by resorting to the notion of viscosity solutions of equation (80) (see section (9.2.1)).
(ii) The operator (81) is degenerate in the I variable, so that specic numerical treatments are necessary to
1
solve (80) (`PDE in dimension 1 2 ', see Zvan et al. [151]).

Scaling We dene

t
−ItT
Z
ItT = −K(T − t) + Su du , ηt =
t (T − t)St
50

K(T −t)− tT Su du
R
(thus in particular ηT = (T −t)ST
). So, by elementary Itô calculus:

dt
dηt = − − ηt (κ − σ 2 )dt − ηt σdWt
T −t
 
1 1
Aη = − + (κ − σ )η ∂η + η 2 σ 2 ∂η22 .
2
(82)
T −t 2

Proposition 14.2 We have Πt = St v(t, ηt ), where the reduced price function v is the unique classic solution
of the following 1D PDE:
(  
∂t v − 1
T −t
+ κη ∂η v + 12 σ 2 η 2 ∂η22 v = qv
(83)
v(T, η) = η+ .

Proof of Proposition 14.2 (see Rogers-Shi [135]; see also Barraquand [20]). First note that equation (83) has
·
a unique classic solution v, by Frieman [74]. We thus have by elementary Itô calculus, with  = standing
for equality up to a martingale term:
 ·  
ert d e−rt St v(t, ηt ) = − rSt v(t, ηt ) + v(t, ηt )κSt dt + St (∂t v + Aη v) (t, ηt ) − σSt ησ∂η v(t, ηt ) dt .


Whence by arbitrage, given (82):


 
1 1
0 = −qv + ∂t v − + κη ∂η v + η 2 σ 2 ∂η22 v .
T −t 2
 +
T
ΠT = K − (T 1−t) t Su du = ST ηT+ , whence v(T, η) = η + . 2
R
Finally, we have

This PDE is a non degenerate 1D PDE which can be solved numerically by nite dierences theta-schemes.
However care should be taken to the fact that the transport term is dominant in this equation (because of
1
the coecient T −t ), but this can be handled by a further transformation of the problem (see [109]).

14.3.2 American Asian put


Payo upon exercice at τ : Rτ !+
t
St dt
φ((St )t≤t≤τ ) = K− . (84)
τ −t

Price at time t: Rt Rτ !+
−r(τ −t) t
Su du + t
Su du
Πt = esssupτ ∈Tt Et e K− . (85)
τ −t
 +
function
Rt Iτ
Let It stand for
t
Su du, so φ((St )t≤t≤τ ) = K− τ −t
, which is a of τ and Iτ . This function

is also denoted by φ in the sequel. We introduce the following variational inequality (obstacle problem) on
(t, T ] × R?+ 2 :

min(−∂t u − LS,I u + ru , u − φ(t, I)) = 0 , t<t<T
(86)
u(T, S, I) = φ(T, I)
where LS,I was dened in (81). Note that the obstacle function φ(t, I) is singular at t = t.
Proposition 14.3 (Crépey[55]) Πt = u(t, St , It ), t ∈ (t, T ], where u is the unique bounded viscosity solu-
tion of (86). Moreover, the price at time t is given by the radial limit:

Πt = limt→t+ u(t, St , (t − t)St )

Figure 9 provides a numerical illustration of the last point in Proposition 14.3 (radial convergence at time
t). Numerically it seems that convergence also occurs along radii α other than St , yet at a slower rate than
for α = St .
Again, the 2D PDEs in this section (PDE problems (80) and (86)) are degenerate in the I variable, so
that specic numerical treatments are necessary, particularly in the singular American case (see Zvan et al.
[151]).
51

7 -0.2
StartingAverage=100. StartingAverage=100.
StartingAverage=0. StartingAverage=0.
StartingAverage=200. StartingAverage=200.

6 -0.3

5 -0.4

Delta
Price

4 -0.5

3 -0.6

2 -0.7

1 -0.8
0 0.01 0.02 0.03 0.04 0.05 0 0.01 0.02 0.03 0.04 0.05
Tau Tau

Figure 9: Convergence of the price and delta of an American Asian put at (t, St , α(t − t)) as t → t,
for α = 0, St or 2St .

14.4 Discretely Path-Dependent Options


In real-life contracts, Asian options are in fact dened in terms of discretely sampled payos, like

N
!+
T X
φ((St )t≤t≤T ) = K− St , (87)
N i=0 i

relatively to a set of observation dates t1 , . . . , tn . As shown in Wilmott et al. [147], this kind of discretely
path-dependent payos can be priced by PDE methods, after a suitable extension of the state space. In this
section we shall exemplify this technique on the case of cliquet options. Proceeding along the same lines, it
is possible to price by nite dierences any discretely path-dependent payo, like the discretely monitored
Asian payo (87), or volatility and variance swaps (see [62, 149]).

Note that the pricing of such products by MC methods is obvious in BS, but it becomes very dicult in
simple extensions of BS, such as the Uncertain Volatility model [13], whereas the PDE methods can easily
be adapted to such framework (see e.g. Windcli et al. [145, 148]).

Also note that in the case of discretely monitored Asian options, this approach is both closer to the clauses
of the product, and also easier from the numerical point of view. Indeed, instead of a 2D PDE degenerate
1
in one direction (problems in dimension `1 2 ', unless specic dimension reduction techniques are available,
2
see section 14.3), we get, in rough terms, m1 1D PDE problems to solve, where m1 is a generic number of
mesh points per space dimension.

14.4.1 Cliquet Options


S(ti )−S(ti−1 )
Let t1 , . . . , tn = T denote a set of observation dates, or monitoring dates, and let Ri = S(ti−1 )
denote

the spot return on the period [ti−1 , ti ]. The payo of a cliquet is dened by (up to a constant factor, the
option's notional):
Pn
φ = max(Fg , min(Cg , i=1 max(Fl , min(Cl , Ri ))))

for given numbers Fl , Cl , Fg and Cg .


In order to markovianize this payo, we introduce two additional state variables: P, corresponding to the
asset price at the previous monitoring date, and Z, the current arithmetic average. So

P (ti < t ≤ ti+1 ) = S(ti )


Z(ti < t ≤ ti+1 ) = 1i ik=1 max(Fl , min(Cl , Rk ))
P

In this notation, we have

φ = max(Fg , min(Cg , nZT+ )) = φ(ZT )


52

Moreover, the factor process (Z, P, S) is Markovian in the RN BS model for S. In the BS model we then
have Πt = u(t, St , Pt , Zt ), for a Borel (deterministic) function u of t, St , Pt and Zt . Moreover, one can show
that the pricing function u = ui on (ti−1 , ti ], i = 1, . . . , n, where (ui )1≤i≤n is the unique sequence of viscosity
solutions with suitable growth conditions of the following PDE cascade:

+ 3

 un (T, S, P, Z) = max(Fg , min(Cg , nZ )) on R
∂t ui + 2 σ S ∂S 2 ui + κS∂S ui − rui = 0 on (ti−1 , ti ) × R3 ,
1 2 2 2
(88)
ui−1 (ti−1 , S, P, Z) = ui (ti−1 , S, P + , Z + ) on R3

where P+ and Z+ are obtained via the following jump conditions at the monitoring dates ti :

P + = S , Z+ = Z + R−Z
i
(89)

S−P
with R = max(Fl , min(Cl , P
)).
In order to solve (88), we localize in space the problem on a compact set

D = [Smin , Smax ] × [Smin , Smax ] × [Zmin , Zmax ] ,

with: √
Smin = 0 , Smax = S0 (1 + f σ T ) , Zmin = min(0, Fl ) , Zmax = Cl
where f is a suitable factor, e.g. f = 6. Recommended boundary conditions are ∂S2 2 u = 0 on S = Smin and
S = Smax .
We then put a nite dierences mesh on D̄. It is better to use a non uniform mesh in the P -direction, in
order to concentrate the mesh points near spot price S0 . In the S -direction, Windcli et al. [148] suggest to
have a ne node spacing near the diagonal of (S, P ). A uniform grid may be used in the Z -direction.
Note that linear interpolation is required to implement the jump conditions (89).

Between monitoring dates, the BS equation (88) can be discretized by a standard nite dierence scheme
(theta-scheme) in the t, S variables, dened on the three-dimensonal state space (S, P, Z).
Finally, interpolation of degree 2 in space is used for computing the price, the delta and the gamma at
0, S0 , P0 , Z0 .
53

Part IV
Tree Methods
Tree methods (historically starting with binomial trees) are but another point of view on explicit nite
dierences methods, namely the interpretation of an explicit nite dierence scheme in terms of a related
Markov chain.
Trees are natural in nance because such Markov chains can also be considered as `realistic' models of trading
and hedging in discrete time, that can be used per se, without reference to any continuous-time model.
Moreover, trees compute prices and Greeks in a cone starting from (t0 , S0 ), rather than on the whole band
[t0 , T ] × R?+ by implicit nite dierences methods. In nance, the main concern is to get prices and Greeks
at (t0 , S0 ). So the extra-work furnished by implicit schemes (theta-schemes withθ 6= 0) for computing the
(t0 , S), is in a sense wasted time (naturally the motivation for using implicit theta-schemes is
solution at any
not to compute the solution at any (t0 , S), but rather to improve the stability of the schemes, and sometimes
also, the accuracy as well, as with the CrankNicholson scheme).

Today it is fair to see that from a practical point of view, binomial trees are obsolete as compared with nite
dierences (or nite elements) technologies. However, in a number of situations, trinomial trees remain a
simple, exible and good enough alternative. Trees are easy to customize due to the visualization of the paths
of the underlying in the Markov chain approximation (see e.g. section 16). So it is often straightforward to
design a tree algorithm for the pricing of a somewhat involved contingent claim.

From the theoretical point of view (convergence analysis), the Markov chain interpretation opens the way
to alternative probabilistic convergence proofs based on Kushner's theorem [99, 98], which says that the so-
called local consistency conditions, that is, the matching of the rst and second conditional moments of the
increments of the approximating chain with those of the continuous-time limit with accuracy o(h), grants
the convergence of the expectations of usual functionals.
In fact the setting in Kushner's theorem is a jump-diusions stochastic control setting, and Kushner's result
deals with the convergence of the optimal controlled chain to the optimal controlled process. It is by far the
most powerful tool for dealing with the convergence of Markov chain based algorithms.

15 Trees for vanilla options


15.1 Cox-Ross-Rubinstein Binomial Tree
The Cox-Ross-Rubinstein tree [50] may be seen as the approximation to BS obtained by replacing the BS
RN dynamics by the following Markov chain:

u Sih with proba p



h
Si+1 = , S0 = s (90)
d Sih with proba 1 − p

T
with h=
n
, where T is the time to maturity of an option (the current date is taken to be zero), n is the
number of time steps in the tree, and

√ √
eκh −d
u = eσ h
, d = e−σ h
,p= u−d

One can show that p is the unique risk-neutral probability in the CRR tree, in the sense that the unique
φ STh in the CRR tree, is given by, for

arbitrage price process of an European vanilla option with payo
i = 0, . . . , n :
 
Πhi = E e−r(T −ih) φ(Snh ) | Sih (91)

If the option is American, the (unique arbitrage) price is given by, for i = 0, . . . , n :

 
Πhi = esssupτ ∈T h E e−r(τ −ih) φ(Sτh ) | Sih (92)
i
54

Convergence of the marginal law of the spot at maturity Assume s = 1, without loss of
generality. For λ ∈ R, we have:
" !#
n−1
h
h
 i Si+1Y
E exp iλ ln Snh = E exp iλ ln
i=0
Sih
 h  in
= E exp iλ ln S1h
  √   √ n
= p exp iλσ h + (1 − p) exp −iλσ h ,

eκh −d 1 b

and since p= u−d
∼ 2
+ 2σ h + O (h), we have as h → 0+ :
  2
 n
2σ T
h  i
E exp iλ ln Snh ∼ 1 + iλb − λ
2 n
σ2
  
→ exp iλb − λ2 T
2
= E [exp (iλ (bT + σWT ))]
= E [exp (iλ ln(ST ))]
where
 BS spot process. We thus have convergence of the characteristic func-
S denotes the risk-neutral
tion Φhn (λ) = E exp iλ ln Snh of the RN CRR log-spot ln(Snh ) to the characteristic function ΦT (λ) =
E [exp (iλ ln(ST ))] of the RN BS log-spot ln(ST ), hence
L
Snh −→ ST as n→∞. (93)

This grants the convergence of the price of European vanilla options with continuous and bounded payos,
e.g. put options. The convergence of call prices follows by callput parity, since the CRR scheme satises
the call-put parity relationship.

Some remarks about the limit n → ∞ The limit law depends only on peiλ ln(u) + (1 − p) eiλ ln(d)
through its Taylor expansion up to o(h). Thus u, d or/and p can be altered as long as the involved terms of
the development are not modied.
√ √ √ √
n σ T n n −σ T n
The upper and lower value of the spot at maturity are u = e and d = e whereas the ratio of
√ q
T
u
2 successive points is d = e
2σ h
= e2σ n . Thus at the same time, the scan of the law of Snh goes to R∗+ and
h
the grid gets more and more dense. It is easy to show that the points visited by the process S eventually

become dense in [0, T ] × R+ .

Convergence of the delta For bounded and suciently regular payos, one can show by elementary
computations based on the mean value property, that the CRR delta of an option:

Πh h
1 (us)−Π1 (ds)
∆h0 = (u−d)s

converges towards the BS delta ∆0 = ∂s Eφ(STs ) as h → 0+. This results can be extended to the put payo
by density, and then to the call payo by callput parity.

American options We still have convergence of the prices and deltas, though this cannot proven by
elementary computations as in the European case.

15.1.1 CRR Algorithm


Input parameters (BS parameters,) StepNumber n
Output parameters Price, delta

By (91), the CRR pricing function of an European option satises Πhn (S h ) = φ(S h ), and for i = n − 1, . . . , 0 :

Πhi (S h ) = e−rh pΠhi+1 (uS h ) + (1 − p)Πhi+1 (dS h )


 
(94)
55

whereas by (92), the pricing function of an American option satises Πhn (S h ) = φ(S h ), and for i = n−
1, . . . , 0 :
Πhi (S h ) = max φ(S h ), e−rh pΠhi+1 (uS h ) + (1 − p)Πhi+1 (dS h )
 
(95)

The CRR algorithm is a backward computation of the option price, based on the dynamic programming
equations (94) or (95), after a forward computation of the n + 1 possible values of the underlying at maturity
Snh . Since the CRR tree is a at tree, it is easily seen that the value of the underlying at time i and level j
(starting from below) is the same as that at time i+2 and level j + 1. In particular there are only 2n + 1
possible values of the underlying between time 0 and time n.
For computational purposes, it is clever, in the American case, to compute only once the corresponding
values of the payo of the option (and store them in an array), at the beginning of the algorithm.

15.2 Other Binomial Trees


To achieve the convergence in law (93), many other choices of u, d and p may be done, regardless of any
arbitrage or nancial consideration: the tree algorithm becomes a purely numerical approximation algorithm,
the only purpose being to get a good convergence to the limiting price and delta [100].
Note that a binomial tree is recombining as soon as u and d remain constant within the tree.

15.2.1 The Random Walk scheme


A natural choice is to approximate the Brownian motion W by the standard Random Walk in ST =
S0 exp (bT + σWT ) . This leads to

√ √
1
u = ebh+σ h
, d = ebh−σ h
,p= .
2

Convergence may be proved in the same way as before.


Note that the discretized process is not a martingale any more.

15.2.2 The matching-3-moments scheme


An alternative route to convergence is Kushner's Theorem (see introductory paragraph to this part). This
leads to the idea of matching the mean and variance of the conditional laws of the approximating chain with
those of the continuous RN BS process (local consistency conditions). The equations for u, d, p are:

κh
pu + (1 − p) d = e
 2 
2 2 2κh
pu + (1 − p) d − e = e2κh eσ h − 1 .

Since one degree of freedom remains, a natural idea is to match also the third moment, which gives the
equation
2
pu3 + (1 − p) d3 = e3κh e3σ h
.
The solution of this system is

eκh Q h p i
u = 1 + Q + Q2 + 2Q − 3
2
eκh Q h p i
d = 1 + Q − Q2 + 2Q − 3
2
eκh − d
p =
u−d
2
with Q = eσ h .
2κh 2
Note that ud = e Q >1: this tree is not symmetric any more.

15.3 Trinomial trees


Along this line there is no need to remain stuck with the discrete-time no-arbitrage one node-two sons
constraint. We may as well choose a 3-points scheme, or a l-points scheme, or even a scheme involving
a number of points depending on n (this is useful for other kinds of limiting continuous-time dynamics,
56

like Lévy processes for instance). From the previous computations it is easy to see that the points and
probabilities of the chosen scheme should be constrained by:

h 2
i
pj exp (iλ ln uj ) = 1 + iλb − λ2 σ2 h + o(h)
P

in the sense that these conditions ensure convergence of the characteristic functions, hence convergence of
the marginal laws of S (cf. section 15.1).
We shall see in section 15.4.1 that these conditions are in fact tantamount to the local consistency conditions,
which ensure a convergence of a much more general type, by Kushner's theorem.

Note that from a computational point of view, the condition that uj+1 /uj be independent of j must be
imposed in order to get a recombining tree.

A feature common to all trinomial trees is that they give an accurate computation of the Greeks (delta,
gamma, and theta), by nite dierences.

15.3.1 The KamradRitchken tree


The KamradRitchken (KR) tree [93] is the archetype of a trinomial tree. This is a at tree with 2n + 1
possible values of the underlying S throughout the option's life. Kamrad and Ritchken take a symmetric
 h
S
3-points approximation to ln Sh and match the related rst 2 moments to RN BS. More precisely, if k
0
denotes the upper state, this leads to:

k (pu − pd ) = bh
k (pu + pd ) − k2 (pu − pd )2 = σ 2 h
2
.

Kamrad and Ritchken further simplify the last equality, still maintaining an o (h) matching of the variance
(which is enough to ensure convergence of the characteristic functions), as

k2 (pu + pd ) = σ 2 h .

Denoting k = λσ h, this leads to

pu = 2λ12 + b2λσh
pf = 1 − λ12√
pd = 2λ12 − b2λσh

The parameter λ appears as a free parameter of the geometry of the tree. It is called the stretch parameter,
and must satisfy λ ≥ 1, to ensure stability of the tree algorithm. The value λ = 1.22474 which corresponds
1
to pf = 3 is reported to be a good choice for an ATM call (or put).

15.3.2 Trinomial schemes with matching rst two moments


h
Si+1
Let u>f >d be the possible values of
Sih
, with probabilities pu , pf , pd respectively. In order to get a
2
recombining tree we impose ud = f . The two rst moment matching conditions give

pu u + pf f + pd d = eκh
2 2 2
pu u + pf f + pd d = e2κh Q
2
where Q = eσ h .
Since pu + pf + pd = 1, thus two unknowns remain. The solution corresponding to the additional constraint

1
pu = pf = pd =
3
is
p
u = V + V 2 − f2
p
d = V − V 2 − f2
eκh (3 − Q)
f =
2
57

13.02
"Call0.dat" using 1:2
"CoxRossRubinstein1.dat" using 1:2
"KamradRitchken2.dat" using 1:2
13.01

13

12.99
Price

12.98

12.97

12.96

12.95
50 100 150 200 250 300 350 400 450 500
StepNumber

0.718
"Call0.dat" using 1:3
"CoxRossRubinstein1.dat" using 1:3
0.7178 "KamradRitchken2.dat" using 1:3

0.7176

0.7174

0.7172
Delta

0.717

0.7168

0.7166

0.7164

0.7162

0.716
50 100 150 200 250 300 350 400 450 500
StepNumber

Figure 10: European vanilla call priced by a CRR binomial tree, resp. a KR trinomial tree.
58

eκh (3+Q)
with V = 4
.

15.4 Miscellaneous remarks


15.4.1 Local consistency and convergence in law
In a general l-points scheme, let us come back to the equality ensuring convergence in law:
2
 
X σ
pj exp (iλ ln uj ) = 1 + iλb − λ2 h + o(h) , (96)
2
to be compared with the RN BS local consistency conditions:
 P
P pj uj2 = exp (κh) + o(h) (97)
pj uj = exp 2κ + σ 2 h + o(h)
 

Proposition 15.1 For a multinomial tree of the following form:


 √
uj = 1 + uj,1 h√+ uj,2 h + o(h)
(98)
pj = pj,0 + pj,1 h + pj,2 h + o(h) ,
convergence in law and local consistency are equivalent.

In view of Kushner's Theorem, the fact that local consistency implies convergence in law was expected;
what this proposition really says is that for multinomial trees of the form (98), convergence in law implies
a convergence of a much more general form.

Proof. Assuming (98), (96) obviously implies


X X X
pj,0 = 1, pj,1 = pj,2 = 0 .
We have
√ u2j,1
   
exp (iλ ln uj ) = exp iλ uj,1 h + uj,2 h − h + o(h)
2
√ 2 
λ2 2
  
uj,1
= 1 + iλuj,1 h + iλ uj,2 − − uj,1 h + o (h) ,
2 2
so that (96) is in fact equivalent to
X
pj,0 iλuj,1 = 0
X u2j,1 2
λ 2 X σ2
pj,0 (iλ(uj,2 − )− uj,1 ) + pj,1 iλuj,1 = [iλb − λ2 ],
2 2 2
that is, since p and u are real-valued:
X
pj,0 uj,1 = 0
 2 
X uj,1 X
pj,0 uj,2 − + pj,1 uj,1 = b
2
X
pj,0 u2j,1 = σ2 ,
or
 P
 P pj,0 uj,1 P = 0
pj,0 uj,2 + pj,1 uj,1 = κ (99)
pj,0 u2j,1 σ2 .
 P
=
But this may be rewritten:
X
pj,0 uj,1 = 0
X X
pj,0 uj,2 + pj,1 uj,1 = κ
X
2 pj,0 uj,1 = 0
X X X
pj,0 u2j,1 + 2 2κ + σ 2 ,

pj,0 uj,2 + 2 pj,1 uj,1 =
which is another form of the local consistency equations (97). 2
59

15.4.2 Flat trees and American options


The generic recombining tree algorithm for pricing American options is the natural backward scheme
Πhn (S h ) = φ(S h ), and for i = n − 1, . . . , 0 :
   X 
Πhi (S h ) = max φ S h , e−rh pj Πhi+1 (uj S h ) . (100)

The algorithm requires the computation of the intrinsic value at each node. A computational advantage of
a at tree (like the CRR or the KR trees) is that one can compute once for all the intrinsic values across all
the possible values of the underlying (n + 1 or 2n + 1 values for the CRR or the KR trees) before performing
the backward scheme. This reduces signicantly the computational cost of the algorithm.

16 Trees for exotic options


16.1 Barrier options
Let us consider the simple case of a Dow and Out call with a constant rebate R attached to the barrier L.
The rst idea to price this option within the CRR scheme is to apply the usual backward induction scheme,
with prices at or above the barrier constrained to R.
It is indeed possible to show that the resulting price converges to the right BS limit. Nevertheless, the
convergence is very bad compared with that for vanilla options. The reason is clear: let l denote the node
index such that
S0 dl ≥ L > S0 dl+1 .
Then the algorithm, n being xed, yields the same result for any value of the barrier between S0 dl and
S0 dl+1 . Therefore the convergence cannot be faster than
  1
∂L Πbs dl − dl+1 = O(n− 2 )

(where Πbs denotes the BS price of the option), whereas the convergence of the tree for European vanilla
−1
options is known to be of order n .
An alternative method (Ritchken's algorithm [134]) is to feed the algorithm with the right value of the
barrier. The idea here is to choose the stretch parameter λ of a trinomial KR tree (see section 15.3.1)
such that the barrier is hit exactly. We know that λ should be greater than one. Intuitively among many
possibilities for λ, the closer λ is to one, the better. 
S
 
ln L0
The natural way to choose λ is the following: compute the value of l above, i.e. l = √
σ h
, where [t]

S

ln L0
1
denotes the greater integer less or equal than t. Then take λ= l σ h
√ .
Proceeding in this way, convergence is reported to be like that for vanilla options.

16.2 Bermudean Options


For pricing time-dependent American options, the natural way to modify the basic American tree algorithm
is to replace φ(S h ) by φ(ih, S h ) in the backward formula (100), where φ (t, S) is the time-dependent payo
of the option.

In the case of Bermudean options, the American right is in force only at a set of prescribed periods, for
instance between a xed date T1 and maturity T (T1 excluded and T included, say). In this case a natural
algorithm is to apply the backward formula (100) between step n − 1 and i1 and the standard CRR scheme
before i1 , where (i1 − 1) h ≤ T1 < i1 h.

This rst algorithm is very crude since it gives the same price, n being xed, for any value of T1 between
(i1 − 1) h and i1 h. A way to feed the algorithm with the right value is to use a KR tree with a stretch
parameter λ1 and a number of steps i1 between times 0 and T1 and λ2 and i2 between T1 and T. In order to
get a recombining tree we get the following pasting condition at time T1 :
r r
T1 T − T1
λ1 = λ2 .
i1 i2
60

Recall that λ1 and λ2 should be greater than one. A possible way to choose the parameters is: rst x
λ1 ≥ 1 (for instance λ1 = 1.2274), choose i1 , then set
 
i1 (T − T1 )
i2 = +1.
T1

So i2 T1 ≥ i1 (T − T1 ) , thus λ22 = λ21 i1 (TT−T


1
1)
i2 ≥ λ21 ≥ 1.
61

Part V
Monte Carlo Methods
This part is a short overview of Monte Carlo Methods in nance. Comprehensive references are given in
section 4. As deterministic methods, stochastic simulation methods (the terminology `Monte Carlo' was
introduced in [115]) can also be used in any Markovian model. In the case of European products, they
consist of the classic Monte Carlo (MC) loops. For products with early exercice features, stochastic simu-
lation methods are available too, as there are numerical methods, more generally so, for Forward-Backward
Stochastic Dierential Equations (methods based on dynamic programming and MC computation of condi-
tional expectations, see e.g. [34] and references therein), but this is outside the scope of these notes.

MC methods are attractive by their genericity (genericity of the related theoretical properties, that are
insensitive to the dimension of the problem or to the regularity of the payo function, and genericity of
implementation as well), and by the condence interval they provide on the solution (at least for pseudo MC
methods, as opposed to quasi MC methods, see below).
−1
But (pseudo) MC methods are slow (convergence in σm 2 , where m is the sample size and σ is the standard
deviation of the sampled payo ), unless specic variance reduction techniques are applicable.
An alternative to variance reduction methods is to use quasi MC methods, which converge faster in practice
than (pseudo) MC methods. Note however that quasi MC methods don't furnish condence intervals (unless
sophisticated randomized forms of quasi MC methods are used), and that the performances of quasi MC
algorithms are strongly altered in high dimension (high eective dimension, as measured by the number of
signicant risk dimensions in a PCA of the sampled payo, as opposed to nominal dimension; in nancial
problems, eective dimension if often much less than nominal dimension, which explains the popularity and
success of quasi MC methods in nance).

17 Random numbers
d
To sample `random' vectors in R with specic distributions, one rst draws sequences of `uniform' random
d
vectors uj over [0, 1] . Then one transforms the uj into xj with the required distribution.

`Uniform' random vectors uj over [0, 1]d may be obtained:


• either by sampling `iid' random variables over [0, 1] with a (pseudo) random generator, and grouping them
in buckets of size d,
• or by using quasi random generators (low-discrepancy sequences) in dimension d.
The quotes are here to indicate that simulated pseudo- or quasi-random numbers aim at emulating as well
as possible the statistical properties of randomness and uniformity (and independence, in the case of pseudo-
random numbers). Note however that since the simulated sequences are generated deterministically on a
computer, we can always nd a statistical test of randomness, uniformity or independence that the sequence
will fail.

Of course the quality of a generator puts an upper bound on the quality of any simulation method using it.

18 Pseudo random generators


Pseudo random generators are used to simulate independent uniform variables over [0, 1] [108, 107, 106, 122,
120, 97].

Denition 18.1 A pseudo-random number generator is a structure G = (S, s0 , T, U, G) where S is a nite


set of states, s0 ∈ S is the initial state, the mapping T : S → S is the transition function, U is a nite set
of outputs symbols, and G : S → U is the output function (see L'Ecuyer [108, 107]).

Since S is nite, the sequence of states is ultimately periodic. The period is the smallest positive integer ν
such that for j≥ some j0 , sj+ν = sj .

18.1 Properties required for a good pseudo-random numbers generator


62

• Large period length At least 260 .


• Good equidistribution properties and statistical independence of successive pseudo-random draws The gen-
erator should pass statistical tests for uniformity and independence [97, 107]: general tests like chi-square or
Kolmogorov-Smirnov tests, and more specic tests like equidistribution test, serial test, gap test, partition
test, etc.
• Little intrinsic structure Successive values produced by some generators produce a lattice structure in any
given dimension.
• Eciency, fast generation algorithm, requiring not too much memory space Especially if we use many
generators together or in parallel.
• Repeatability (xing a given seed) Very useful for practical applications. Otherwise we can use the current
time (computer clock) to initialize the generators.
• Portability The generator should produce exactly the same sequence on dierent computers or with dier-
ent compilers.
• Unpredictability It means that we cannot predict the next generated value by the algorithm from the
previous ones (though this is less important in nance than for other applications like cryptography).

18.2 Constructing pseudo-random number generators


The simplest methods to construct random number generators are linear methods. Linear methods use a
linear recurrence relation to compute the next value from the previous ones

Linear Congruential Generators (LCG) The j th random number is given by

Uj
uj = ∈ [0, 1]
N
where
Uj = (aUj−1 + b) mod N
for xed integers a > 0, b and N > 0.
Such generators produce a lot of regularity in sequences and a lattice structure.

Random numbers generator of L'Ecuyer with Bayes & Durham shuing procedure
This is a combination of two short periods LCG, hence a longer period generator with good properties.

19 Low-discrepancy sequences
Quasi-random numbers, or successive values of low-discrepancy sequences [123, 122, 121, 117], are not inde-
d
pendent. But they have good equidistribution properties on [0, 1] , hence good convergence properties of
Pm
1
∈ [0, 1]d , y ≤ x}, with y ≤ x if and only
R
m j=1 ψ(ξ j ) to
[0,1]d
ψ(u)du as m → ∞. Let [|0, x|] stand for {y
if yj ≤ xj , j = 1 . . . d.

Denition 19.1 Given a [0, 1]d -valued sequence ξ = (ξj )j≥1 , and x ∈ [0, 1]d , we dene:

1[|0,x|] (ξj ) − Vol([|0, x|])


1
Pm
Dm (ξ, x) = m j=1

with Vol([|0, x|]) = d1 xl .


Q
• A sequence (ξj )j ≥ 1 is said to be equidistributed on [0, 1]d if

∀x ∈ [0, 1]d , lim Dm (ξ, x) = 0 .


m→∞

• The value Dm ξ) dened by


Dm (ξ) = sup |Dm (ξ, x)|
x∈[0,1]d

is called the (star) discrepancy for the rst m terms of the ξ sequence.
• A sequence ξ is said to be a low-discrepancy sequence if its discrepancy satises

d
Dm = O( (lnm
m)
)
63

1
or if it is asymptotically better than O(( ln m ) ), the discrepancy of a sequence of independent uniform
ln m 2

variables (by the law of the iterated logarithm).

19.1 General remarks on low discrepancy sequences


The discrepancy measures how a given set of points is distributed in [0, 1]d . It can be viewed as a quantitative
measure for the deviation from the uniform distribution.

Quasi-random numbers combine the advantage of a lattice with the advantage that points can be added
incrementally (like points of a random sequence).

(ln m)d
But for large dimension d, the theoretical bound m
may be meaningful for extremely large values of m
only.
Low discrepancy sequences perform very well in low dimension. In high dimension d, a lattice can only be
d
rened by increasing the number of points by a factor 2 .

Orthogonal projections If a d-dimensional sequence is uniformly distributed in [0, 1]d , then two-
dimensional sequences formed by pairing coordinates should also be uniformly distributed. The appearance
of non-uniformity in such projections (see Figure 11) is an indication of potential problems in using a
quasi-random sequence [117].

1 1
Sobol dim 1 and 2 Sobol dim 159 and 160

0.9 0.9

0.8 0.8

0.7 0.7

0.6 0.6

0.5 0.5

0.4 0.4

0.3 0.3

0.2 0.2

0.1 0.1

0 0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Figure 11: Orthogonal projection on the rst, resp. last two coordinates of the rst 10000 points of
the Sobol sequence in dimension 160.

19.2 Sobol sequences


The Sobol sequence is one of the most used sequences for Quasi-Monte Carlo simulation, and one of the most
successful for nancial applications. Its construction is based on primitive polynomials in the eld Z2 and
bitwise XOR (Exclusive Or) operations (see [78, 139]).

20 Simulation of non-uniform random variables or vectors


20.1 Inverse method
The inverse method allows one to simulate a random variable with known (computable) inverse cumulative
−1
distribution function F , as follows:
• First, simulate a variable u uniformly distributed on [0, 1], using a pseudo-random numbers generator or
a quasi-random number generator;
• Then set x = F −1 (u).
Example To simulate an exponential random variable with parameter λ, do X = − λ1 ln(1 − U ) (or
− λ1 ln U ), with U uniform.
64

20.2 Simulation of Gaussian variables


Box-Müller transformation If (U, V ) is uniformly distributed on [0, 1]2 , then the pair (X, Y ) dened
by

X = √ −2 ln U sin(2πV )
Y = −2 ln U cos(2πV )
is standard Gaussian.

Proof Let θ be uniform over [0, 2π] and ρ2 be exponential with parameter 12 , independent from θ. Then
the pair (X, Y ) = (ρ cos θ, ρ sin θ) is standard Gaussian. Indeed, for every measurable and bounded function
φ, we have:
∞ 2π
1 −ρ2
Z Z

Eφ(X, Y ) = φ(ρ cos θ, ρ sin θ) e 2 d(ρ2 )
0 0 2 2π
Z ∞ Z 2π −ρ2 dθ
= φ(ρ cos θ, ρ sin θ)ρdρe 2
0 0 2π
x2 +y 2
Z Z
dxdy
= φ(x, y)e− 2 .
R R 2π
2

Remarks This method requires two independent random values to obtain two Gaussian variables. So
it must not be used when the random numbers u and v are generated from two successive values of a one-
dimensional low-discrepancy sequence, because u and v are not independent in this case.
To use the Box-Müller transformation with quasi-random numbers, one should thus take special care to
generate u and v independently, e.g. from two dierent one-dimensional sequences, or from a two-dimensional
sequence.

Simulating Gaussian variables with the inverse method The inverse Gaussian cumulative
distribution function N −1
is not known in closed form. To use the inverse method for simulating Gaussian
. Moro's algorithm furnishes a very good and quick
−1
variables, we thus need an approximation of N
approximation.

20.3 Simulation of Gaussian vectors


Simulating a d-dimensional Gaussian vector V with zero mean and a covariance matrix Γ can be done as
follows:
• Compute the lower triangulary matrix Σ obtained by Cholesky decomposition of Γ, such that

Γ = ΣΣT

So, for p = 1...d : q Pp−1 2


Σpp = Γpp − r=1 Σpr
Pp−1
Γpq − r=1 Σpr Σqr
Σqp = Σpp
for q = p + 1, . . . , d ;
• Generate d independent standard Gaussian variables εi ;
• Compute V = ΣG, with G = (ε1 , . . . , εd );
Then V ,→ N (0, Γ).

Two-dimensional case Assuming

σ12
 
ρσ1 σ2
Γ= ,
ρσ1 σ2 σ22
then
   
σ11 σ12 σ1 p 0
Σ= =
σ21 σ22 ρσ2 1 − ρ2 σ2
65

21 Principle of the Monte Carlo Simulation


(Pseudo) Monte Carlo simulation is a general method for evaluating an integral as an expected value, based
on the Strong Law of Large Numbers and the Central Limit Theorem. It provides an unbiased estimator and
the error on the estimate is controlled within a condence interval.

21.1 Limit theorems


Strong Law of Large Numbers For xj , X iid with E[|φ(X)|] < ∞, then

1
Pm a.s.
m j=1 φ(xj ) −→ E[φ(X)] as m→∞

Central Limit Theorem If σ 2 = Var[φ(X)] < ∞, the normalized error converges in law to the Gaussian
distribution:
√  
m 1
Pm L
σ m j=1 φ(xj ) − E[φ(X)] −→ N (0, 1) as m→∞

21.2 Estimation principle


We want to estimate the following parameter Θ:

Θ = E[φ(X)]

where φ is some function on D ⊆ Rd and X is a D-valued random vector. Note that Θ can be expressed as
the integral Z
Θ= φ(x)dPX (x) .
D
• An unbiased estimator with m trials for Θ is given by

1
Pm
Θm = m j=1 φ(xj )

• The variance of this estimator is given by


2 σ2
σ
em = ,
m
with unbiased estimator
" m
#
2 1 1 X 2
σm = φ (xj ) − Θ2m
m−1 m j=1

Variance decreases to 0 as m → ∞. It means that the greater m, the more accurate the estimator. The
σ
speed of convergence of Θm to Θ is √ , which
m
is independent of the dimension d.

• A condence interval IC = [A, B] at the threshold (condence level) 1 − 2α for Θ is built as follows:

IC = [Θm − zα σm ; Θm + zα σm ]

with zα = N −1 (1 − α). So P(A < Θ < B) = 1 − 2α. For instance, if the threshold is set at 95%, then
α = 2.5% and zα ≈ 1.96.
A natural stopping criterion consists in breaking the MC loop when zα σm becomes less than some percentage
(e.g. 1bp = 10−2 %) of Θm , so that one knows the price with a 1bp relative error, with condence 1 − α.

21.3 Properties
We briey summarize some advantages and disadvantages of the Monte Carlo method.
• Advantages:
66

 We can implement this method very easily if we are able to simulate the variable X.
 It does not require regularity or dierentiability properties of the function φ.
 The estimator is unbiased.
 The error on the estimate can be controlled by the Central Limit Theorem, and we can build a condence
interval.
• Disadvantages: We have to realize a lot of simulations to obtain an accurate estimator. Therefore com-
puting time can be very high.

22 Variance Reduction Techniques


We saw that a disadvantage of the standard Monte Carlo Simulation is its required computing time, which
σ
is proportional to √ . So to increase the accuracy by a factor 10, one must increase the number m of
m
simulations by a factor 100.
An alternative (Variance reduction techniques, or Accelerated Monte Carlo) is to rewrite Θ by means of a
new random variable with smaller variance σ2 .

22.1 Antithetic Variables


The principle of antithetic variables is to introduce some correlation between the terms of the estimate.
When simulation is done by the inverse method, we use uniform numbers uj on [0, 1]. In the antithetic
variables method, we use each uj twice, as uj and 1 − uj . These two variables have the same law, but they
are not independent. Let us denote by xj and x̄j the variables respectively generated from uj and 1 − uj .
An unbiased estimator of Θ with m trials is dened by

m
1 X
Θ̄m = (φ(xj ) + φ(x̄j ))
2m j=1

with xj , X iid. The variance of the estimator is given by

2 1 
σ̄m = Var[φ(X)] + Cov(φ(X), φ(X̄)) ,
2m

with X̄ = 1 − X.
The following theorem gives a simple condition ensuring variance reduction with this method.

Theorem 22.1 If φ is a monotone function, then σ̄m


2
≤ 21 σ
em2
.

1
Factor 2 is due to the sample size for the antithetic method: indeed, the related estimator contains 2m
terms.

22.2 Control Variables


The principle of this method is to introduce another model for which we have an explicit solution and to
estimate the dierence between the original parameter Θ and the new one, writing

Θ = E[φ(X)] = E[φ(X) − ξ(X)] + E[ξ(X)]

where ξ is a function such that E[ξ(X)] = c is known.

An unbiased estimator with m trials of Θ is dened by

1
Pm
Θ
bm =
m j=1 (φ(xj ) − ξ(xj )) + c

with xj , X iid. The variance of the estimator is given by

2 1
σ
bm = m
Var[φ(X) − ξ(X)]
1
= m
[Var[φ(X)] + Var[ξ(X)] − 2Cov(φ(X), ξ(X))] .
67

Variance reduction with regard to standard Monte Carlo simulation is not guaranteed by this method. To
decrease the variance, functions φ and ξ must have a large positive correlation. It implies a specic choice
for the control variate ξ.

22.3 Importance Sampling


The basic idea of importance sampling is to concentrate the distribution of the sample points in the most
contributive parts of the space. We thus introduce a new D-valued random vector Y. Assuming that the
law of X and Y have densities µ and ρ, respectively, so

dPX (z) = µ(z)dz , dPY (z) = ρ(z)dz ,


Denoting ϕ = 1φµ6=0 φµ
ρ
(assuming Supp(φµ) ⊂ Supp(ρ)), we have:

φ(z)dPX (z) = D φ(z) µ(z) dPY (z) ,


R R
D ρ(z)

so
Θ = E[φ(X)] = E[ϕ(Y )]
We thus get the following estimator for Θ:
_ Pm
1
Θm = m j=1 ϕ(yj )

with Y, yj iid. The variance of the estimator is given by

_2 1
σm = Var [ϕ(Y )] .
m

Variance reduction with regard to standard Monte Carlo simulation is not guaranteed by this method. It
depends on the choice of the importance function ρ. The minimum of the variance is reached for the following

importance density ρ :
ρ∗ = R |φ(x)|µ(x) (101)
|φ(y)|µ(y)dy

Usually ρ∗ is unknown (note that Θ stands in the denominator of the r.h.s. of (101), for φ > 0). In practice,
one chooses ρ by using formula (101) applied to an approximation of the payo φ such that the corresponding
density is computable.

22.4 Eciency of the Monte Carlo methods


We now introduce a criterion to compare the eciency of the various simulation methods, standard simulation
or simulation with variance reduction techniques. This criterion takes into account the computing time
required by the simulation for each method.
Eciency of the method q with regard to the method p is dened by:
s
σmp (p) tmp (p)
ε(p, q) = lim .
mp ,mq →∞ σmq (q) tmq (q)

The method q is considered to be more ecient than the method p if ε(p, q) ≥ 1.


If we assume that the computing time is proportional to the sample size, that is tmp (p) = kp mp , where kp
is a factor which expresses the complexity of the algorithm for method p, then:
s
σp kp
ε(p, q) ≈ .
σq kq

So the eciency criterion is essentially independent on the sample size. For instance, if ε(p, q) = 3, it means
that method p requires 9 times more time to obtain the same accuracy than method q. In other words, with
the same computing time, standard error for method q is 3 times smaller than the one of method p.

23 Quasi Monte Carlo Simulation


Quasi Monte Carlo simulation consists in estimating
Z
E[ψ(U )] = ψ(u)du
[0,1]d
68

1
Pm
by m j=1 ψ(ξj ), where ξ is a d-dimensional low-discrepancy sequence.

−1 (law)
Choosing ψ = φ ◦ FX , then ψ(U ) = φ(X), so E[ψ(U )] = E[φ(X)].

23.1 Koksma-Hlawka inequality


Theorem 23.1 For any ξ1 , . . . , ξm ∈ [0, 1]d , we have:
m Z
1
X
ψ(ξj ) − ψ(x)dx ≤ V (ψ)Dm (ξ) , m ≥ 1

m

[0,1]d
j=1

where V (ψ) denotes the Hardy-Krause variation of ψ.

The denition of the (Hardy-Krause) variation of ψ is rather technical and irrelevant for our purposes.
Simply note that in dimension 1, the Hardy-Krause variation of a function coincides with the usual notion
| ∂u ψ | du, for ψ of class C 1 . Through the Koksma-Hlawka inequality, we understand
R
of variation, e.g.
[0,1]
the need to have sequences with discrepancy Dm as small as possible.

The Koksma-Hlawka inequality gives an a priori deterministic bound for the error in the approximation of
R 1
Pm
[0,1]d
ψ(x)dx by the sum m j=1 ψ(ξj ). This error is expressed in term of the discrepancy of the sequence
and the variation of the function ψ.
Nevertheless it is often dicult to calculate or even to estimate the
(ln m)d
variation of ψ . Moreover, since for large dimension d, the asymptotic bound m
of a low-discrepancy
(ln m)d
sequence may be meaningful for extremly large values of m only, and because m
increases exponentially
with d, then the bound in Koksma-Hlawka inequality gives no relevant information until a very large number
of points is used.

Unlike the Monte Carlo method, the Quasi-Monte Carlo approach doesn't provide an condence interval
for the estimator. The empirical variance of the sample is not meaningful because successive terms are not
independent. This is due to the construction of low-discrepancy sequences.

Another dierence in comparison with Monte Carlo is that the convergence rate for QMC simulation depends
on the dimension d of the considered model through the discrepancy.

It is worth stressing that low-discrepancy sequences (Sobol sequences in particular) are often more fruitful
in nance than in other application areas. This is because the eective dimension of nancial problems
is often much lower than their nominal dimension. In such cases, one can benet from all the power of
low-discrepancy sequences by assigning the main risk factors of the problem, ordered by decreasing amount
of explained variance, to the successive components of the points of a multi-dimensional low-discrepancy
sequence. Thus, though we use a low-discrepancy sequence in the nominal dimension of the problem,
which may be high, the fact that the rst coordinates of the quasi-random points are assigned to the main
risk factors of the problem avoids much of the drawbacks generally associated with high-dimensional low-
discrepancy sequences.

24 Greeking by (Quasi) Monte Carlo


Prices sensitivities, or Greeks, are actually the main issue in nancial modeling. Indeed, unless very exotic
products are considered, derivatives prices are made by oer-and-demand in the market (prices are eectively
used in the reverse-engineering mode to calibrate the models, see part VI). Greeks, on the contrary, must be
computed within models. They are used to determine the composition of a dynamic hedging portfolio (see
section 5.1 and the Appendix).

Now, it happens that under quite general circoumstances, Greeks are also given by expressions of the form
Θ = E[φ(X)], so that all the MC pricing techniques that we have seen so far, can also be used for Greeking.

In this section we illustrate this on the problem of (Q)MC computing the delta of an option in a Markovian
s s
model. We thus want to compute ∆0 = ∂s E[φ(ST )] , where ST is the RN spot with initial condition s at time
0 (we should in fact write STs,s̄ , where s̄ is the initial condition of the remaining factors in our Markovian
model, see the Appendix; since s̄ plays no role in the sequel, we omit it for simplicity).

One obvious method consists in perturbing the initial condition s, computing (Q)MC prices for the original
and perturbed initial condition, and deducing a nite dierences estimator for ∆0 . But this procedure is both
69

costly (since it involves resimulation) and parameterized and biased (via the perturbation on s).
In many cases, direct (without resimulation) approaches are possible: by derivation of the payo, provided
the latter is smooth enough, or by derivation of the transition probability density pT (s, S) (as a shorthand
for pT (s, s̄; S)) of S , provided the latter exists and is smooth enough.
Note that in the case of the BS model everything is elementary, however in general the latter two approaches
ultimately rely on the theory of stochastic ows and of the Malliavin calculus, respectively (see e.g. [78, 41,
72, 73]).

24.1 Finite Dierences


For xed h > 0, we may approach ∆0 by decentered nite dierences
1 
(1+ε)s

E[φ(ST )] − E[φ(STs )]
εs
or, preferably (better convergence properties), by centered nite dierences
 
1 (1+ε)s (1−ε)s
2εs
E[φ(ST )] − E[φ(ST )] (102)

The expectations in (102) are estimated by (Q)MC simulation. In order to optimize the algorithms, common
random numbers should be used to estimate both expectations. Thus, in the BS model, ∆0 can be best
estimated in these approaches by

m √ √
1 X 
φ[(1 + ε)sebT +σ T εj ] − φ[(1 − ε)sebT +σ T εj ] ,
2εsm j=1

where the εj are standard Gaussian (Q)MC draws.

24.2 Derivation of the payo


In case where φ is regular and we know how to derive STs with respect to s, ∆0 may alternatively be computed
as
∆0 = ∂s Eφ(STs ) = E∂s φ(STs ) = Eφ0 (STs )∂s STs .
s
ST
In multiplicative models (like the BS model, or more general homogenous models), we have ∂s STs = s
, so

∆0 = 1s E[φ0 (STs )STs ]

provided φ is dierentiable.
Note that φ0 may be a weak derivative in the sense of distributions, yet it still needs to be a well dened
function (unlike a Dirac mass for instance) for applicability of the method. So this method is applicable to
the computation of the delta of a call option, not to that of a digital option.

24.3 Derivation of the spot transition probability density


Assuming that S admits a transition probability density pT (s, S) between 0 and T, dierentiable in the rst
variable s, then under mild conditions:
Z Z
∂1 pT (s, S)
∆0 = φ(S)∂1 pT (s, S) dS = φ(S) pT (s, S) dS ,
R R pT (s, S)
so
∆0 = E [φ(STs )∂1 ln(pT (s, STs ))]

For instance, in the BS model:

S 2
1 −
(ln( s )−bT ) WT
pT (s, S) = √ e 2σ 2 T , ∂1 ln(pT (s, STs )) = ,
2
S 2πσ T sσT
so
h i
WT
∆0 = E φ(STs ) sσT
70

which can be easily computed by (Q)MC.

25 (Quasi) Monte Carlo algorithms for vanilla options


25.1 (Q)MC BS1D Algorithm
Description Computation, for an European call, put or digital option, of its price and delta by (Pseudo-
)MC or Quasi-MC simulation. In the case of Pseudo-MC simulation, the algorithm provides estimations for
price and delta with a condence interval. In the case of Quasi-MC simulation, the program just provides
estimations for price and delta. For a call, the implementation is based on the call-put Parity relationship.

Input parameters StepNumber m, Generator_Type, Increment ε, Condence Value.

Output parameters Price Π, Error Price σP , ∆, Error delta σ∆ , Price


Delta Condence Interval ICΠ =[Inf
Price, Sup Price], Delta Condence Interval: IC∆ =[Inf Delta, Sup Delta].
The underlying asset price evolves according to the BS model, so, under P:

ST = s exp(bt + σWt ) ,

where ST denotes the spot at maturity T, s is the initial spot, and t is time-to-maturity.

The price of an option at T −t is:


Π = E e−rt φ(K, ST , R)
 

where φ denotes the payo of the option, K is the strike and R the rebate (for digital option only),

The delta is given by:


∆ = ∂s E[e−rt φ(K, ST , R)] .

The estimators write:


m
1 −rt X
Πm = e π(j)
m j=1

with π(j) = φ(K, ST (j), K), and

m m
1 −rt X 1 X
∆m = e ∂s π(j) = e−rt δ(j)
m j=1
m j=1

The values for π(j) and δ(j) are detailed for each option.

• Put: The payo is (K − ST )+ . We have:

π(j) = (K − ST (j))+

−∂s ST (j) = − STs(j)



if π(j) ≥ 0
δ(j) =
0 otherwise

• Call: The payo is (ST − K)+ .


The call-put parity relations for price and delta write:

C = P + se−qt − Ke−rt
∆C = ∆P + e−qt ,

where C and P respectively denotes the call and the put prices. These relations may be used for the call
simulation (in order to limit variance).

• Digital option: The payo is R1{ST −K≥0} .


We have:
π(j) = R1{ST (j)−K≥0} .
71

To have an estimation of the delta in the case of a digital option, we need to use the increment value ε at
each iteration j, so:

1
δ(j) = [φ ((1 + ε)ST (j), K, R) − φ ((1 − ε)ST (j), K, R)] .
2εs

13.2
"Standard0.dat" using 1:2
"Standard1.dat" using 1:2
"Antithetic2.dat" using 1:2
13.15

13.1

13.05

13
Price

12.95

12.9

12.85

12.8

12.75
0 5000 10000 15000 20000 25000 30000
N iterations

0.725
"Standard0.dat" using 1:3
"Standard1.dat" using 1:3
"Antithetic2.dat" using 1:3

0.72

0.715
Delta

0.71

0.705

0.7

0.695
0 5000 10000 15000 20000 25000 30000
N iterations

Figure 12: European vanilla call priced by Pseudo MC simulation (L'Ecuyer's generator), Quasi MC
simulation (Sobol sequence), and Pseudo MC simulation with antithetic variables.

25.1.1 Adding Jumps


Let us now add jumps in S, assuming that the underlying asset price evolves according to the RN Merton
model, so, by (11):
QNt
ST = seat+σWt l=1 (Jl + 1)

where N is a Poisson process with deterministic jump intensity λ and ln(1 + Jl ) ,→ N (α, β).
From the point of view of (Q)MC pricing European path-independent options with payo φ(ST ) (like the
European call, put or digital options of section 25.1), nothing changes, except for the fact that ST (j) in
section 25.1 is now given by (cf. (11))

QNp √ l
ST (j) = s exp [at + σWt (j)] l=1 exp(α + βεj )
72

where:
• Np is a simulated Poisson variable with parameter λT,
• the εlj are√independent standard Gaussian draws, and
• Wt (j) = tεj for a further independent standard Gaussian draw εj .

25.2 (Q)MC BS2D Algorithm


Description Computation, for a call on maximum, put on minimum, Exchange or BestOf european
option, of its price and delta by (Pseudo-)MC or Quasi-MC simulation. In the case of Pseudo-MC simulation,
the algorithm also provides an estimation for the integration error and a condence interval.

Input parameters StepNumber m, Generator_Type, Increment ε, Condence Value.

Output parameters Price Π, Error Price σP , Deltas ∆1 , ∆2 , Errors delta σ∆1 , σ∆2 , Price Condence Interval
ICΠ =[Inf Price, Sup Price].
The underlying asset prices evolve according to the two-dimensional BS dynamics, that is, (under P:
dSu1 Su1 (κ1 du σ1 dWu1 ), ST1 −t
(
= + = s1
dSu2 = Su2 (κ2 du + σ2 dWu2 ), ST2 −t = s2 ,

where:
• κl = r − ql ,
l
• sl is the initial spot and ST is the spot at maturity T ,
• W 1 and W 2 denote two real-valued Brownian motions with instantaneous correlation ρ.
So
ST1 = s1 exp(b1 t + σ11 Wt1 )


ST2 = s2 exp(b2 t + σ21 Wt1 + σ22 W


ft2 ) ,
with
2 !
σ1
b1
     
σ11 σ12 σ1 p 0 κ1 − 2
= , = ,
σ21 σ22 ρσ2 1 − ρ2 σ2 b2 κ2 −
σ22

and where W
f denotes a real-valued Brownian motion independent of W 1. The price of an option is

Π = E e−rt φ(K, ST1 , ST2 ) ,


 

where φ denotes the payo of the option, K is the strike, and t is time to maturity.
The deltas are given by

∆1 = ∂s1 E[e−rt φ(K, ST1 , ST2 )] , ∆2 = ∂s2 E[e−rt φ(K, ST1 , ST2 )] .

The estimators write:


1 −rt
Pm
Πm = m e j=1 π(j) P
1 −rt m 1 −rt m
∆lm l
P
= m e j=1 ∂sl π(j) = m e j=1 δ (j)

The values for π(j) and δ l (j) are detailed for each option.

• Put on the Minimum: The payo is (K − min(S1 , S2 ))+ .


+
π(j) = K − min(ST1 (j), ST2 (j))
If π(j) ≥ 0 then:
− exp(b1 t + σ11 Wt1 ) if ST1 (j) ≤ ST2 (j)

δ 1 (j) =
0 otherwise
δ2 (j) =
− exp(b2 t + σ21 Wt1 + σ22 Wt2 ) if ST1 (j) ≤ ST2 (j)
0 otherwise .
Otherwise δ 1 (j) = δ 2 (j) = 0.
• Call on the Maximum: The payo is (max(S1 , S2 ) − K)+ .
+
π(j) = max(ST1 (j), ST2 (j)) − K
73

If π(j) ≥ 0 then:
exp(b1 t + σ11 Wt1 ) if ST1 (j) ≥ ST2 (j)

δ 1 (j) =
0 otherwise
δ2 (j) =
exp(b2 t + σ21 Wt1 + σ22 Wt2 ) if ST1 (j) ≥ ST2 (j)
0 otherwise .
Otherwise δ 1 (j) = δ 2 (j) = 0.
• Exchange Option: The payo is (S1 − ratio × S2 )+ .
+
π(j) = ST1 − ratio × ST2

exp(b1 t + σ11 Wt1 )



if π(j) ≥ 0
δ 1 (j) =
0 otherwise
δ2 (j) =
−ratio × exp(b2 t + σ21 Wt1 + σ22 Wt2 ) if π(j) ≥ 0
0 otherwise

• BestOf Option: The payo is [max(S1 − K1 , S2 − K2 )]+ .


+
π(j) = max(ST1 − K1 , ST2 − K2 )


If π(j) ≥ 0 then:

exp(b1 t + σ11 Wt1 ) if ST1 (j) − K1 ≥ ST2 (j) − K2



δ 1 (j) =
0 otherwise
δ2 (j) =
exp(b2 t + σ21 Wt1 + σ22 Wt2 ) if ST1 (j) − K1 ≥ ST2 (j) − K2
0 otherwise .

Otherwise δ 1 (j) = δ 2 (j) = 0.

26 Simulation of processes
Simulating random variables is enough to (Q)MC-compute prices of vanilla options, in cases where the value
of the underlying at the time of maturity T of the option can be simulated directly, without discretizing
the associated SDE. However, to be able to deal with more complex models, or, even in simple models,
with path dependent options, one has no other choice than simulating the whole trajectory of the underlying
between the time of pricing t T. Even for simple options in simple models, simulating spot
and the maturity
trajectories is necessary for testing the hedging performances of a model (see section 28). In this case, the
payo  we are interested in typically consists of the Prot and Loss at maturity of an option's seller, who
rebalances a delta hedge in the underlying at regular time intervals, such as one day. This Prot and Loss
at maturity is a path-dependent quantity, so that the simulation problem in this case is similar to that of
MC-pricing a path-dependent option.

26.1 Brownian Motion


We recall that a Brownian motion on the probability space (Ω, F, P) is a continuous process W with the
properties that W0 = 0 and for 0 ≤ s < t, the increment Wt − Ws is independent of Fs and is normally
distributed with mean zero and variance t − s.

Simulation of Wt Simulation of Wt is an easy step because we have that L(Wt ) = N (0, t):
• rst generate a standard
√ Gaussian variable ε,
• then compute Wt as tg.

Simulation (discretization) of a Brownian trajectory, 0 ≤ t ≤ T We now detail two approaches


for simulating a Brownian path. Typically, for path-dependent options, we have to simulate W over {ti ; i =
0, ..., n, t0 = 0, tn = T }.
74

Forward simulation of W is given by


W0 = 0

Wti+1 = Wti + ti+1 − ti εi

where (ε1 , ..., εn ) are independent standard Gaussian variables. If we use a discretization with evenly spaced
intervals of size h = Tn , we get:
W0 = 0 √
Wti+1 = Wti + hεi

Attention: Simulating independently each Wti as ti εi would be incorrect (e.g. wrong variance of Wti+1 −
Wti ).

Backward simulation with Brownian Bridge An alternative method is based on the following
Brownian Bridge property (see [94, 133]):

L(Wu , s < u < t|Ws = x, Wt = y) = N ( t−u


t−s
x+ u−s
t−s
y, (t−u)(u−s)
t−s
) ,

hence in particular
L(W t+s |Ws = x, Wt = y) = N ( x+y
2
, t−s
4
) .
2

The related scheme consists in simulating W as

W0 = 0√
WT = T ε1 q
W0 +WT T
WT = 2
+ 4
ε2
2
W0 +W T q
2 T
W 3T = 2
+ 8
ε3
4
W T +WT q
2 T
WT = 2
+ 8
ε4
4
...

where (ε1 , ..., εn ) are independent standard Gaussian variables.

For this algorithm, one must choose n as a power of 2. The rst step is directly for 0 to T. Intermediates steps
are lled by taking successive subdivisions of the time intervals into halves. The algorithm can be adapted
to non-uniform time subdivisions by considering the conditional law of the Brownian Bridge between s and
t.

Remark on these two schemes for MC and QMC simulations We need a vector of size n of
independent Gaussian variables. For MC, these n variables can be simulated from the same pseudo random
numbers generator. However, for a QMC simulation, we need to take care about the independence property.
In the case of backward simulation with QMC, a good point is that the values of W successively determined
on each trajectory, are drawn by order of decreasing variance. Therefore the rst components of the simulated
QMC vectors (`the best ones', i.e. `the more uniform ones') are naturally aected to the main directions of
risk in the problem.

26.2 BS model
In the RN (might be objective, for constant µ) BS model: St = S0 exp(bt + σWt ), St is a function of Wt .
The simulation of price paths is then directly based on the simulation of Brownian motion described in
section26.2.

Forward simulation We have

Sti+1 = Sti exp(b(ti+1 − ti ) + σ(Wti+1 − Wti )) ,

so for a discretization with evenly spaced intervals of size h:



Sti+1 = Sti exp(bh + σ hεi )
75

Backward simulation Denoting

xt = ln(St ) = x0 + bt + σWt ,

we get:

xT = x0 + bt + σqT ε1
x0 +xT T
xT = 2
+ σ 4
ε2
2
x T +xT q
x 3T = 2
2
+ σ T8 ε3
4
x0 +x T q
xT = 2
2
+ σ T8 ε4
4
,...,
We nally set Sti = exp(xti ).

26.3 General diusions: Euler and Milshtein schemes


We now consider the general, d-dimensional diusion process:

dXt = b(Xt )dt + σ(Xt )dWt

(under suitable Lipschitz conditions on the coecients). Unless an explicit solution is known for X (like in
the BS model, cf. section 26.3), we have to approximate the process by discretization in time. The two best
known schemes are the Euler and the Milshtein schemes, that we now present on a uniform mesh with time
step h.
26.3.1 Euler approximation scheme
It is dened by
X i+1 = Xti + b(Xti )h + σ(Xti )(Wti+1 − Wti ) .
bt b b b
Simulation is obtained with a forward algorithm by:


X
bt
i+1 = Xti + b(Xti )h + σ(Xti )
b b b hεi

with εi Gaussian independent, for i = 0, . . . , n − 1.


Theorem 26.1 •L 2
-Convergence and Trajectorial Convergence For b, σ of class Cb2 , we have:
 
E bih |2q
sup |Xih − X ≤ Chq , q ≥ 1
0≤i≤n

1
lim h−α sup |Xih − X
bih | = 0 a.s. , α < .
h→0+ 0≤i≤n 2

• Convergence in law For b and σ of class Cb4 , and φ of class Cp4 (p=polynomial), there exists a constant
CT such that:
|E(φ(XT ) − φ(X
bT ))| ≤ CT h .

1 1
So L2 -convergence is of order h 2 , and almost sure (trajectorial) convergence is of order h 2 − ε, for any
ε > 0. Convergence in law (the kind of convergence which is relevant for pricing and Greeking applications)
is linear in h, in regular cases.

Continuous Euler scheme (d = 1) This is the continuous time approximation scheme (S̄t )t≥0 dened
by interpolation of the previous Euler scheme by a Brownian Bridge (see [94, 133]) between (ti , X
bt ) and
i
(ti+1 , X
bt ),
i+1 for i = 0, . . . , n − 1. So, on [ti , ti+1 ]:

S̄t = Sbti + b(Sbti )(t − ti ) + σ(Sbti )Bti

where Sb is a (discrete) trajectory of the Euler scheme aproximation for S, and Bi is a Brownian Bridge on
[ti , ti+1 ], such that S̄ = Sb at ti and ti+1 .
76

26.3.2 Milshtein approximation scheme (d = 1)


It is dened by (assuming d = 1)
1 0 b
X i+1 = Xti + (b(Xti ) − σ (Xti )σ(X
bt b b bt ))h+
i
2
1 0 b 2
+σ(X i+1 − Wti ) + i+1 − Wti ) .
bt )(Wt σ (Xti )σ(X
bt )(Wt
i i
2
Simulation is obtained with a forward algorithm by

X bt ) − 1 σ 0 (X
i+1 = Xt√
bt b + (b(X bt )σ(X bt ))h+
i i 2 i i
+σ(Xbt ) hεi + 1 σ 0 (X
bt )σ(Xbt )hε2i
i 2 i i

with εi Gaussian independent, for i = 0, . . . , n − 1.


Theorem 26.2 •L 2
-Convergence and Trajectorial Convergence For b, σ of class Cb2 , we have:
 
E bih |q
sup |Xih − X ≤ Chq , q ≥ 1
0≤i≤n

lim h−α sup |Xih − X


bih | = 0 a.s. , α < 1 .
h→0+ 0≤i≤n

• Convergence in law Same result as for the Euler scheme.


As for pricing and Greeking, (weak) convergence is thus again linear in h in regular cases. But the rate of
trajectorial (or strong) convergence is improved with respect to the Euler scheme.

26.3.3 Example: Heston model


We consider the RN Heston model (cf. (6)):
 √
dvt = −λ(vt − v)dt + η vt dZt
√ (103)
dSt = St (κdt + vt dWt )
with dhW, Zi = ρdt.
A classic discretization for (103) consists in discretizing v by the Milshtein scheme and ln S by the Euler
scheme, as follows:

η2 p hη 2 ε21
vbt+h − vbt = −(λ(b
vt − v) + )h + η vbt hε1 +

4
 ln(Sb ) − ln(Sb ) = (κ − vbt )h √ p 4
t+h t + vbt h(ρε1 + 1 − ρ2 ε2 )
2

where (ε1 , ε2 ) is a standard Gaussian pair. The interest of using the Milshtein scheme for the v component
is to benet from a better trajectorial convergence than with the Euler scheme. So the process vb obtained
in this way is less prone to take negative values than it would be the case with an Euler scheme (see also
Andersen [5]).

26.4 JumpDiusions
Finally we consider the following jumpdiusion, driven by a multidimensional Brownian motion W and a
Poisson random measure J (under suitable Lipschitz conditions on the coecients):

R
dXt = b(Xt )dt + σ(Xt )dWt + x∈Rd
f (Xt− , x)χ(dt, dx) (104)

where χ(dt, dx) is a Poisson random measure with compensator measure λ(Xt )ν(Xt , dx)dt, for some intensity
λ and jump size probability measure ν .
Depending on the application at hand, we may be interested at simulating X at xed times, or at the jump
times of X (see e.g. [48, 96]).

Euler scheme at xed times To simulate (104) at xed times 0 < t1 <, . . . , < tn , set X
b0 = X0 , and
for i = 0, . . . , n − 1 :
• simulate
X i+1 = Xti + b(Xti )(ti+1 − ti ) + σ(Xti )(Wti+1 − Wti ) ,
et b b b
77


where Wti+1 − Wti ,→ ti+1 − ti N (0, 1), and
−λ(X
b t )(ti+1 −ti )
• compute Xbt
i+1 by adding to Xti+1 , with probability 1 − e
e i (as dictated by the position of

an independent uniform draw Ui with respect to e−λ(Xti )(ti+1 −ti ) ), a jump term of law ν(X
bt , dx).
b
i

Euler scheme at jump times To simulate (104) at the n rst jump times 0 < τ1 < . . . < τn of X, set
Xb0 = X0 , and for i = 0, . . . , n − 1 :
• simulate τi+1 as τi plus an independent draw in an exponential law with parameter λ(X bt ),
i
• simulate
X i+1 = Xτi + b(Xτi )(τi+1 − τi ) + σ(Xτi )(Wτi+1 − Wτi ) ,
eτ b b b

where Wτi+1 − Wτi ,→ τi+1 − τi N (0, 1), and
• compute X bτ
i+1 by adding to X

i+1 a jump term of law ν(Xτi+1 , dx).
e

Continuous Euler scheme (d = 1) This is the continuous time approximation scheme dened by
interpolation of the previous Euler scheme at jump times by a Brownian bridge between (τi , X
bτ )
i and
(τi+1 , X
bτ ),
i+1 for i = 0, . . . , n − 1.

26.5 Monte Carlo Simulation for Processes


In the case of Monte Carlo simulation for processes, we have

1
Pm 1
Pm
E[φ(X)] − m j=1
bj ) = (E[φ(X)] − E[φ(X)])
φ(X b −
b + (E[φ(X)]
m j=1 φ(X
bj ))

So the error is the sum of two terms, a discretization error and a Monte Carlo error (simulation error):
• for usual discretization schemes such as the Euler or the Milshtein scheme, the weak convergence rate is
linear in h, so that the discretization error is as O(h);
−1
• as for the Monte Carlo error, after scaling by 2 m , it is asymptotically distributed as N (0, Var[φ(X h )]).
Therefore in order to balance the two terms in the error a natural choice is to take m of the order of n2 .

27 (Quasi) Monte Carlo methods for Exotic Options


A nice feature of MC methods is that they can easily deal with path dependent payos. Note however that
specic treatments must be applied to guarantee the quality of the related schemes.

27.1 Lookback options


We consider a Lookback option with payo φ(ST , MT ), where S is given by the following one-dimensional
diusion:

dSt = b(St )dt + σ(St )dWt


(under suitable Lipschitz conditions on the coecients), and Mt = sup0≤s≤t Ss . The idea to eciently (Q)MC
price this payo is to estimate the maximum M along the trajectory of the continuous Euler scheme S̄ for
S (see section 26.3.1). The following results give a way of simulating the pair (S̄, M̄ ), where M̄t = sup[0,t] S̄.
Lemma 27.1 Denoting
Wtλ = Wt + λt , Mtλ = sup Wsλ ,
0≤s≤t

where λ is a real number (drift parameter), the random variable

Ztλ = (2Mtλ − Wtλ )2 − (Wtλ )2

is, conditionally on Wtλ , 1


2t
-exponentially distributed.
78

Proof By the CameronMartin formula, the law of Mtλ conditional on Wtλ does not depend on λ. So it
is enough to prove the Lemma in case λ = 0. Now it is well known that the law of the pair (Wt , Mt ) admits
the following transition probability density between time 0 and t:

(2y − x)2
 
2(2y − x)
p(x, y) = 1y≥0 1y≥x √ exp − (105)
2πt3 2t

(this can be established by using the reection principle for Brownian Motion, see e.g. [94, 133]). Denoting
Zt = (2Mt − Wt )2 − (Wt )2 , we have, formally:

P(Zt ∈ dz | Wt = x) = P(Mt ∈ dy | Wt = x) ,
2 2
with z = (2y − x) − x . Hence, by (105) and given the expression of the Gaussian density:

1 −z
P(Zt ∈ dz | Wt = x) = exp( )dz ,
2t 2t
which proves that, conditionally on Wt , Zt 1
is 2t -exponentially distributed. 2

Theorem 27.2 The law of M ci )0≤i≤n−1 , with M


c = (M ci = maxih≤t≤(i+1)h S̄t , i = 0, . . . , n − 1, can be
simulated by
 
1
q
max Sbih + Sb(i+1)h + (Sbih − Sb(i+1)h )2 − 2σ 2 (Sbih )h ln(U i ) , i = 0, . . . , n − 1
0≤i≤n−1 2

where Sb is a trajectory of the Euler scheme aproximation for S, and (U i )0≤i≤n−1 is a sequence of independent
uniform random variables on [0, 1].

Proof By application of the previous Lemma to


S̄t −S
σ(S
bih
bih ) , we have that conditionally on Sbih and Sb(i+1)h :

 S̄t − S̄ih Sb(i+1)h − Sbih 2 (law) Sb(i+1)h − Sbih 2


2 sup − = ( ) +E 1 ,
2h
ih≤t≤(i+1)h σ(Sbih ) σ(Sbih ) σ(Sbih )
so
(law) 1
 q 
supih≤t≤(i+1)h S̄t = 2
Sbih + Sb(i+1)h + (Sb(i+1)h − Sbih )2 + σ(Sbih )2 E 1
2h

where E 1
1
is an 2h -exponential variable. 2
2h

One then generates a pair (S̄T , M̄T ) as follows:


• rst simulate a (discrete) trajectory Sb of the Euler scheme aproximation for S, using n independent uniform
random draws;
• then simulate M ci )0≤i≤n−1
c = (M as in Theorem 27.2, using n further independent uniform random draws;
ci .
• put S̄T = SbT , M̄T = maxi M
Note that if quasi-random numbers are used, one must use a 2n-dimensional low-discrepancy sequence.
Recall that the use of high-dimensional low-discrepancy sequences should be considered with caution.

In the special case of the BS model, the Euler discretization is exact, provided one works in returns variable
x = ln S . In this case one can take n equal to one and use a 2-dimensional low-discrepancy sequence.

27.1.1 Andersen and Brotherton-Ratclie Algorithm


Description Computation, for a lookback option, of its price and delta by Monte Carlo Simulation (see
[6]).

Let ST = s exp (bt + σWt ) denote the RN BS spot. We note MT = max[T −t,T ] S the maximum reached
before maturity.
The price and delta of a lookback option with payo φ and strike K write:

Π = E e−rt φ(K, ST , MT ) , ∆ = ∂s E[e−rt φ(K, ST , MT )] ,


 
79

with related estimators:

1 −rt
Pm
Πm = m e j=1 π(j)
1 −rt
Pm 1 −rt
Pm
∆m = me j=1 ∂s π(j) = m e j=1 δ(j)

• Fixed Strike Lookback call The payo is (max[T −t,T ] S − K)+


π(j) = (MT (j) − K)+
MT (j)

∂s MT (j) = if π(j) ≥ 0
δ(j) = s
0 otherwise

• Floating Strike Lookback put The payo is (max[T −t,T ] S − ST )


π(j) = (MT (j) − ST (j))
MT (j) − ST (j) π(j)
δ(j) = ∂s MT (j) − ∂s ST (j) = =
s s

Simulation of the maximum MT The conditional cumulative distribution function of the maximum
m of x = ln S , given xT −t = x1 and xT = x2 , writes:

F (x; x1 , x2 ) =
 h i
 1 − exp − 22 (x − x1 )(x − x2 ) if x ≥ max(x1 , x2 )
σ t

0 otherwise

so
1 
F −1 (y; x1 , x2 ) =
p
x1 + x2 + (x1 − x2 )2 − 2σ 2 t ln(1 − y) .
2

At run j , MT (j) is simulated as follows:



• ST (j) is generated as s exp (bt + σWt (j)) , with Wt (j) = tεj , for a standard Gaussian variable εj ;
• y(j) is generated as a uniform variable on [0, 1];
−1
• mT (j) = Fmax (y(j); xT −t , xT (j)) with xT −t = ln s, xT (j) = ln[ST (j)];
• MT (j) = exp(mT (j)).

27.2 Barrier options


We now consider the special case of lookback option corresponding to a barrier up and out option with
payo function (in the case of a rebate R paid at the maturity time T ):

φ(ST , MT ) = ϕ(ST )1{MT <L} + R1{MT ≥L} .

An approximation of the option price is given by

e−r(T −t) E ϕ(S̄T )1{M̄T <L} + R1{M̄T ≥L} ,




where ci
M̄T = max0≤i≤n−1 M as before. Now, we have:
 
E ϕ(SbT )1{M̄T ≤L} |Sbih , 0 ≤ i ≤ n
n−1
Y
= ϕ(SbT ) ci ≤L} |Sih , S(i+1)h )
E(1{M b b
i=0
n−1
Y
= ϕ(SbT ) Φ(Sbih , Sb(i+1)h ) ,
i=0

with  
2
Φ(x, y) = 1 − exp(− (L − x)(L − y)) 1L>x∧y .
σ(x)2 h
80

Likewise,
n−1
!
  Y
E R1{M̄T >L} |Sih , 0 ≤ i ≤ n = R 1 −
b Φ(Sih , S(i+1)h ) .
b b
i=0
Therefore " #
    n−1
Y
E φ(S̄T , M̄T ) = R + E ϕ(ST ) − R
b Φ(Sih , S(i+1)h ) .
b b
i=0

In this case, no random draws are needed other than those used for simulating SbT , namely n random draws
by simulation run  where n can be taken equal to one, in the special case of the BS model.

Similar considerations apply to the other forms of (simple) barrier options.

27.3 Asian options


RT
Asian options correspond to payos of the form φ (ST , AT ) , with At = 1
T 0
Su du (e.g. φ(x, y) = (y − x)+ ,
in the case of the oating strike Asian put).
Straightforward application of the Euler scheme suggests to approximate AT by the Riemann sum

h b
AhT = (S0 + Sb1 + . . . + Sb(n−1)h ) .
T
But this discretization works poorly in practice. For better discretization schemes based on the continuous
Euler scheme, see [103].
These discretizations schemes can be used in conjunction with associated variance reduction techniques, as
introduced in [95]. Thus note that

Z T  Z T 
1 1
St dt is close to exp ln(St ) dt ,
T 0 T 0

for r and σ small. This suggests for instance to choose

e−rT (eZT − K)+ ,

ln(St ) dt, as a control variable to price the payo φ(x, y) = (y − K)+ (Fixed strike Asian
1
RT
where ZT = 0 T
call) in the BS model at time 0. As ZT is normally distributed, Ee−rT (eZ − K)+ is known explicitely.

27.4 Adding Jumps


The previous MC schemes for Lookback, Barrier and Asian Options in diusion models, can be extended to
more general jump-diusion models of form (104), using the related continuous time Euler approximation
scheme (see section 26.4).

28 Backtesting
Before a model be used in production, it is backtested, so the hedging performances of the model are assessed
using both simulated trajectories in relevant market models and real data sets. In the course of this process,
it is useful to be able so simulate spot trajectories with specic features, such as passing by a point of interest
(e.g. a barrier level, if any in the product at hand) at some point in time. This is the object of this section.

28.1 Static versus dynamic hedging


When banks sell options, they are left with the responsability to nd ways to be able to provide the option's
payo at termination. To this end, they immediately set up a hedging portfolio by combination of liquidly
traded instruments, such as the asset underlying the option, and/or vanilla options. There are two main
approaches to the problem of hedging options:
• static replication approach, one uses a static, buy-and-hold portfolio, which synthetically replicates
in the
the option's payo;
• in the dynamic hedging approach, one replicates the option by a dynamic self-nancing portfolio, with
the same initial value and the same risk sensitivities as the option. Since option payos are nonlinear with
respect to the value of the underlying assets at termination, these sensitivities change all the time, so that
the composition of the hedging portfolio has to be updated continuously.
81

In any case, the very possibility to trade options, at least in a safe or quite safe manner, is closely related to
the access to markets of hedging instruments.

Note also that hedging is model-dependent (with the notable exception of a model-independent static repli-
cation strategy which is known for European path-independent options, cf. BreedenLitzenberger [37]), and,
even, calibrated model dependent. So the composition of the hedging portfolios is dierent in two models
calibrated to the same set of hedging instruments.

Static replication is rarely feasible at aordable costs in practice, so that the most common form of hedging is
dynamic hedging. In a complete market model, it is possible to determine a perfect dynamic hedging strategy
such that in the model, the hedging portfolio has initial value equal to the option price, and is always worth
more (with equality in the case of European options) than the option's payo at termination (see section
5.1 in the special case of an European option in the BS model, or Theorem B.4 in the Appendix for a
general statement). But real markets are never complete. For instance, the BS analysis requires continuous
hedging, which is possible in theory but impossibleand even undesirable, accounting for transaction costs
in practice. Thus dynamic hedging means, practically and in the sequel, discrete, imperfect dynamic hedging.
The simplest form of discrete hedging consists in rebalancing the hedge at xed intervals of time h, a strategy
commonly used with h ranging from one day to one week.

28.2 Discrete Time Delta-hedging


Delta-hedging is the special case of dynamic hedging cash and spot are used as hedging instruments. It is
the most commonly used form of dynamic hedging. Delta-hedging can be seen as a technique of portfolio
insurance or position risk management in which an option-like return pattern is created by increasing or
reducing the position in the underlying to simulate the change in value of an option position.

Let us consider an European option with maturity T . The picture is the following: we sell the option at time
0 at price Π0 , that is we receive at time 0 the amount of money Π0 , but in turn it is mandatory for us to pay
+
at time T the payo (ST − K) , which may be very high depending on the movements of the underlying.
Our strategy consists in rebalancing at every time step h a self-nancing hedge in the underlying and in the
savings account. So we sell the option at time 0 at price Π0 , we hedge n times at regular time step h and
we deal the payo at time T = nh.

Let V denote the value process of the hedging portfolio. Thus:


• V0 = Π0 (prot of the sale of the option), which is immediately reinvested as V0 = ∆0 S0 + C0 (∆0 units
of the underlying and a cash amount equal to C0 );
qh
• At time h,
the number of units of the underlying held by the trader is ∆0 e , the position in the savings
rh qh rh
account is worth C0 e , whence Vh = ∆0 e Sh +C0 e , which is instantaneously reinvested in a self-nanced
way as Vh = ∆h Sh + Ch ;
• And so on, until the last rebalancement time (n − 1)h.
So  
Vh = erh V0 + ∆0 (e(q−r)h Sh − S0 ) ,

and more generally, denoting Sbt = e(q−r)t St , it is straightforward to show by recurrence that for i = 0...n :
 i−1
X 
Vih = erih V0 + ∆lh e−qlh (Sb(l+1)h − Sblh ) .
l=0

Accounting also for the facts that V0 = Π0 and for the nal payment by the trader of φ(ST ) at time T, the
discounted P&L of the trader at T is therefore:
Pn−1  
e−rT P&LT = Π0 + i=0 e−qih ∆ih (Sih ) Sb(i+1)h − Sbih − e−rT φ (ST ) (106)

or, equivalently:
Pn−1
e−rT P&LT = l=0 δl P&L (107)

with

δl P&L = −(e−r(l+1)h Π(l+1)h − e−rlh Πlh ) + ∆lh e−qlh (Sb(l+1)h − Sblh )

To x ideas, let us assume that the spot obeys the following BS dynamics (under the objective measure b):
P
dSt = St (µdt + σdWt ) . If a trader were able to hedge continuously, she could perfectly replicate the option
82

and her P&L would be equal to 0, provided she uses the BS delta of the option as control ∆ (see section
5.1). But in practice the trader hedges at discrete times, for example every day at closing price. So the
P&L deviates from 0 (typically σP &L ∼ 1%Π0 for the standard deviation of the P&L at maturity of a daily
rebalanced delta-hedged option position in the BS model, see e.g. [126]).

Note that in the case of an American option exercised at the stopping time τ, assumed for simplicity to be
of the form τ = νh, for a random integer ν ∈ {0, . . . , n}, formulas (106) and (108) become:

Pν−1   Pν−1
e−rτ P&Lτ = Π0 + i=0 e−qih ∆ih (Sih ) Sb(i+1)h − Sbih − e−rτ Πτ = l=0 δl P&L (108)

In the BS model a trader can then super-hedge the option in continuous time.

28.2.1 Discrete Time Delta-hedging as a Control Variate Method


Delta hedging (under a risk-neutral measure P) can also be used as a MC control variate method in which
the quantity to MC compute is the option price Π0 = Ee−rT φ (ST ) , and the control variate is

Pn−1  
ξ= i=0 e−qih ∆ih (Sih ) Sb(i+1)h − Sbih

for a suitably-chosen control function ∆ (see [46]). Note that under mild assumptions Sb must be a local
martingale under P, by arbitrage (see the Appendix), and Eξ = 0.

28.3 Dynamic tests using the Brownian Bridge in the BS model


Assuming constant µ, the solution of the objective BS SDE satises (see section 5.1) :

σ2
   
St+h = St exp µ− h+σ Wct+h − W
ct (109)
2

where ct+h − W
W ct = hε and ε ,→ N (0, 1). Thus we can simulate the value of the spot step by step,
 2
 √ 
multiplying St by exp µ − σ2 h + σ hε with ε ,→ N (0, 1) to get t+h (see also section 26.2).

Moreover it is interesting to be able to lter some noteworthy paths of the spot, selecting for example spot
trajectories passing by a target level at a target time. We can thus observe the behaviour of a pricing routine
or a hedging scheme when the spot's trajectory passes at a critical point. For example, in the case of a
(reverse) barrier option, we can impose the spot to reach the barrier level at an interesting date like the
maturity (recall that the delta of a reverse barrier option explodes at this point).

This can be done with the Brownian Bridge, wich may be dened as a centered Gaussian process dened on
[0, 1] Γ (s, t) = s (1 − t) on s ≤ t (see also sections 26.1 and 27.1). The easiest way to prove
with covariance
that Γ is a covariance is to observe that the process Bt = W ct − tWc1 satises Eb [Bs Bt ] = s (1 − t) for s ≤ t.
This also gives us a continuous version of the Brownian Bridge. Observe that B1 = 0 a.s., so all the paths
a.s. go from 0 at time 0 to 0 at time 1, hence the name of this process. Naturally, the notion of Brownian
Bridge may be extended to higher dimensions and to intervals other than [0, 1].
In our case, we want the spot trajectory to pass by ST1 at time T1 . For this we would like to replace the
Brownian motion in (109) by a Brownian Bridge so that S passes by the desired target point. We thus set:

T1 ,y t ct − t W
B0,x (t) = x + (y − x) + W cT1 ,
T1 T1
where T1 is the target time, and x and y are the Brownian Bridge's starting and target levels, respectively.
This gives us a Brownian Bridge between (0, x) and (T1 , y). Since we want:

σ2
  
T1 ,y
ST1 = S0 exp µ − T1 + σB0,x (T1 ) ,
2

we choose x and y such that

T1 ,y
B0,x (0) = x = 0
 S    
T1 ,y σ2
B0,x (T1 ) = y = σ1 ln ST01 − µ − 2
T1 .
83

We nally obtain
σ2
   
T1 ,y T1 ,y
St+h = St exp µ− h + σ B0,x (t + h) − B0,x (t) ,
2
with x and y thus determined. All we need now is to simulate the quantity

T1 ,y T1 ,y
Γ = B0,x (t + h) − B0,x (t) .

After some computations, we nd that


  
T1 ,y
 h h
Γ ,→ N y − B0,x (t) ;h 1 − .
T1 − t T1 − t

This gives us the formula for the next value of the spot:
" r #
σ2 
T1 ,y h h(T1 − t − h) 
St+h = St exp (µ − )h + σ (y − B0,x (t)) + ε , (110)
2 T1 − t T1 − t

where ε ,→ N (0, 1) .
Note that formula (110) is only applicable for t ≤ T1 ; for (t > T1 ) we have to use formula (109).

We may thus simulate a number of spot trajectories with these formulas at discrete times, and calculate
for each path the corresponding P&L given by the discrete hedging formula (106). We may then compute
related statistics, like the mean or the square deviation of these P&L, etc. We may also identify some
relevant spot's trajectories, like those generating extremal or median P&L positions (see Figures 13 and 14).
Such dynamic tests allow risk managers or traders to assess the performance of a hedging scheme, and they
may help developers in detecting some problems of a pricing routine behaviour.
84

Stock’s trajectory to obtain the minimal PL. PL’s trajectory.


105 0.5
Spot Target
exercise Time
0

100 -0.5

-1
95
-1.5

-2
Stock

PL
90
-2.5

-3
85
-3.5

80 -4

-4.5

75 -5
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Time Time

Stock’s trajectory to obtain the maximal PL. PL’s trajectory.


106 4
Spot Target
exercise Time
104 3.5

102
3

100
2.5

98
Stock

PL

2
96

1.5
94

1
92

90 0.5

88 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Time Time

Stock’s trajectory to obtain the mean PL. PL’s trajectory.


100 0.5
Spot Target
exercise Time
0.4
95

0.3
90

0.2
85
Stock

PL

0.1

80
0

75
-0.1

70
-0.2

65 -0.3
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Time Time

Figure 13: Weekly hedging P&L simulations with Brownian Bridge at level 90 at t = 12 .
85

Stock’s trajectory to obtain the minimal PL. PL’s trajectory.


115 0.5
Spot Target
exercise Time

110 0

105 -0.5
Stock

PL
100 -1

95 -1.5

90 -2

85 -2.5
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Time Time

Stock’s trajectory to obtain the maximal PL. PL’s trajectory.


104 2.5
Spot Target
exercise Time
102
2

100

1.5
98
Stock

PL

96 1

94
0.5

92

0
90

88 -0.5
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Time Time

Stock’s trajectory to obtain the mean PL. PL’s trajectory.


110 0.1
Spot Target
exercise Time

105 0

100 -0.1
Stock

PL

95 -0.2

90 -0.3

85 -0.4

80 -0.5
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Time Time

Figure 14: Daily hedging P&L simulations with Brownian Bridge at level 90 at t = 12 .
86

Part VI
Calibration Methods
In nancial modelling, calibrating a (class of ) model(s) means nding numerical values of its parameters
such that the related instance of the model is consistent with the market, in the sense that the prices of
some nancial instruments at the current time, or calibration input data, are the same in the market and in
the model thus calibrated, or, practically speaking, that the model ts the currently observed market prices
within the bid-ask spread.
So the calibration problem is the inverse of the pricing problem. Instead of computing prices in a model
with given values for its parameters, one wishes to compute the values of the model parameters that are
consistent with observed prices.
The simplest example of a calibration problem was encountered in section 8, when we discussed the notions
of implied volatility of an option or implied correlation of a CDO tranche. These problems can indeed be
interpreted as the calibration problem of a BS model or a Li model, using an observed option price or a
CDO tranche market spread as calibration input data. Of course in these cases the calibration problem is
very easy, since there is only one parameter to calibrate to exactly one market data, so the task is easily
done by dichotomy.

29 The ill-posed inverse calibration problem

Since the calibration input data are typically derivative prices given by (risk-neutral) expectations of the
related payos (see the Appendix), the calibration problem can thus be seen as a moment problem. Physicists
would call this problem, the estimation of the model. However, in nance, the term `estimation' specically
refers to statistical estimation of the model, namely the estimation of the model parameters using historical
data, by Maximum likelihood estimation (MLE) or any other statistical procedure. So, in nance, estimation
is a backward-looking process (using historical data), whereas calibration is said to be a forward-looking
process, referring to the fact that derivative prices at the current time are based on the anticipations of the
market regarding the dynamics of the underlying in the future.
It is generally acknowledged that whenever option data are available, it is better to use them to calibrate
the model, rather than to estimate the model on past data. Of course, in the absence of calibration data (no
observed prices), one has no other means than to estimate the model statistically, but this is another story.

Now, it is well-known by physicists that inverse problems are typically ill-posed. Recall that a problem is
well-posed (as dened by Hadamard) if its solution exists, is unique, and depends continuousy on its input
data. Thus there are three reasons for which a problem might be ill-posed:
• it admits no solution, or/and
• it admits more than one solution, or/and
• the solution(s) of the inverse problem do(es) not depend on the input data in a continuous way.

In the case of calibration problems in nance, except for trivial situations, there exists typically no instance
of a given class of models which is exactly consistent with a full calibration data set, including a number of
option prices, a zero-coupons curve, an expected dividend yield curve on the underlying, etc.
But there are often various instances of a given class of models that t the data within the bidask spread.
In this case, if one perturbs the data (e.g., if the observed prices move from some small amount between
today and tomorrow), it is quite typical that a numerically determined best t solution of the calibration
problem switches from one `basin of attraction' to the other, thus the numerically determined solution is not
stable either.

Recall that option prices (or pricing functions in Markovian models, more precisely) are solutions of PDEs
with model parameters related coecients (see the Appendix). Thus model parameters can be expressed in
terms of partial derivatives of the option prices with respect to the model factors. Therefore, calibrating the
model is tantamount to numerically dierentiating derivatives pricing functions. But numerical dierentia-
tion is the canonical example of an ill-posed problem, see e.g. [67] (so two functions may be arbitrarily close
to one another in sup norm, though their derivatives dier signicantly).

In order to get a well-posed problem, we need to introduce some regularization. The most widely known
87

and applicable regularization method is Tikhonov regularization method [67].

29.1 Tikhonov regularization of non-linear inverse problems


Given a Hilbert space H, a closed non-void convex subset A of H, a direct operator (`pricing functional')

Π
H ⊇ A 3 a −→ Π (a) ∈ Rq ,

(so a corresponds to the set of model parameters), and noisy data (`observed prices') πδ , the Tikhonov
regularization method for inverting Π at π δ consists in:
• substituting, to the inverse problem properly said, the nonlinear least squares problem:
2
mina∈A Π (a) − π δ

to ensure existence of a solution,


• selecting the solution of the previous nonlinear least squares problem that minimizes ka − a0 k2 over the
set of all solutions, where a is the so-called prior in the method, and
0

• introducing a trade-o between accuracy and regularity, parameterized by a level of regularization α, to


ensure stability.
More precisely, we dene an α-solution to the inverse problem as follows.

Denition 29.1 (α-solution) By an α-solution to the inverse problem for Π at π δ , we mean any aδ,η
α ∈ A
such that  
Jαδ aδ,η
α ≤ Jαδ (a) + η , a ∈ A

with 2 2
2Jαδ (a) ≡ Π (a) − π δ + α a − a0

(given a prior a0 ∈ H).

Under suitable assumptions, one can show that the problem of determining α-solutions of the inverse problem
is well-posed, in the following sense.

We postulate that the direct operator Π satises the following regularity assumption.

Assumption 29.2 (Compacity) Π (an ) −→ Π (a) if an ,→ a.

We then have the following stability result.

Theorem 29.3 (Stability) Let π δj → π δ , ηj → η = 0 when n → ∞. Then any sequence aα


δn ,ηn
admits a
subsequence which converges towards an α-solution aα δ,η=0
.

Let us make the following additional assumptions.

Assumption 29.4 (Twice dierentiability of the direct operator)


2
Π (a + εh) = Π (a) + εdΠ (a) · h + ε2 d2 Π (a) · (h, h) + o ε2

; a, a + h ∈ A
kdΠ (a) · hk ≤ C khk , d2 Π (a) · (h, h0 ) ≤ C khk kh0 k ; h, h0 ∈ H

a∈A,

and

Assumption 29.5 (Range property) π ∈ Π(A).

Under these assumptions, we dene an a0 -Minimal Norm Solution (MNS) of the inverse problem for Π at
π, as any a ∈ Argmin ka − a0 k.
{Π(a)=π}
88

Theorem 29.6 (Convergence) Under the previous assumptions, let



(n ∈ N) π − π δn ≤ δn ,

(n → ∞) αn ≡ α (δn , ηn ) , δn2 /αn , ηn /αn −→ 0.


Then any sequence admits a subsequence which converges towards an a -MNS a. So aα
δn ,ηn
aα n
δn ,ηn
n
→ a, if a 0

is the supposingly unique a0 -MNS of the inverse problem for Π at π.

Theorem 29.7 (Convergence Rates) Under the previous assumptions, let



δ 2
π − π ≤ δ , α ∼ δ , η = O δ .

1
Then kaδ,η
α − ak = O(δ 2 ), for any a -MNS a of the inverse problem for Π at π such that
0

a − a0 = dΠ (a)∗ λ (111)

for λ suciently small in R , where q ∗


stands for the adjonction operator (see [67]).

An important issue in practice is the choice of the regularization parameter α, that determines the trade-o
between accuracy and regularity in the method. To set α, the two main approaches are:
• a priori methods, in which the choice of α is based on the level of noise δ in the data (such as the width
of the bidask spread, in the case of market prices data in nance);
• a posteriori methods, in which α may also depend on the data themselves.

The most commonly used method for choosing α is the a posteriori method based on the so-called discrepancy
principle, which consists in choosing the greatest level of α for which the `distance' between the solution of
the regularized inverse problem with regularization level α and the observations, does not exceed the level
of noise on the observations.

29.2 Nonlinear Optimization


In the case of parametric models, namely models with a small number of scalar (as opposed to functional,
e.g. time-dependent) parameters, such as the Heston model, the Merton model, etc, the choice of a suitable
regularization term is generally not obvious. In this case, the industry standard rather consists in solving
a non linear least squares problem without regularization, starting from a `reasonable' set of parameters
(or a few of them, then selecting the `best' solution of the related optimization runs), based on physical
considerations or on the output of the calibration of the previous day. So Tikhonov regularization is rather
used for calibrating non parametric nancial models (see section 31 for a typical example).
Thus, in practice, calibration problems are essentially reduced to nonlinear minimization problems (non
non parametric
linear least squares problems, typically with Tikhonov regularization, at least for calibrating
models with functional coecients), on suitable closed non-void convex subsets A of related Hilbert spaces
H. Of course, when it comes to implementation, the related minimization problems are both localized and
discretized, thus becoming eectively nonlinear minimization problems on (subsets of ) R .
d

Recall that a real function J on A (the cost criterion J , that will typically be a regularized non linear least
squares criterion as in Denition 29.1) is said to be lower semi-continuous at a ∈ A i `it cannot exceed its
limits at a', i.e. J(a) ≤ lim inf a J .
The following Theorem extends to (innite dimensional, presumably) Hilbert spaces, under an additional
d
convexity assumption, the well-known fact that a lower semi-continuous function on a compact subset of R
admits a global minimum.

Theorem 29.8 If J is lower semi-continuous, convex, and goes to innity as a goes to innity in A, then
J admits a global minimum on A. Moreover, if J is strictly convex on A, this minimum is unique.

In the strictly convex case, and if, additionally, J is dierentiable, one can prove the convergence to the
minimum of various descent gradient algorithms. These consist in moving at each step from some amount
(xed step descent vs optimal step gradient descent) in a direction made of the gradient ∇J at the current
step of the algorithm, in combination with, in some variants of the method (conjugated gradient method,
quasi-Newton algorithms, etc), the gradient(s) at the previous step(s).
89

In the non strictly convex case, or, more generally, in the non-convex case (as it will typically be the case
in the context of calibration problems), or if the cost criterion is only almost everywhere dierentiable (as
in the American calibration problem mentioned in Remark 31.1), such algorithms can still be used, in which
case they will typically converge to a local minimum of J.
When the gradient of J is not computable in closed form, or not computable numerically with the required
accuracy, an alternative to gradient descent methods is to use the nonlinear simplex method (not to be
confused with the simplex algorithm for solving linear programming problems). As opposed to gradient
descent methods, the nonlinear simplex algorithm only uses the values (and not the gradient) of J, but the
convergence of the algorithm is not proved in general, and there are known counter-examples in which it
does not converge.

When there are no constraints (A = H or analogous discretized form), the minimization problem is, in
practice, much easier, and many implementations of the related gradient descent algorithms are publi-
cally available, see for instance [129] (C codes publically available on www.nr.com). As for constrained
problems, a state-of-the-art implementation of the quasi-Newton method for minimizing a function on
a hyperrectangle commonly used in the nancial industry, the lbfgs algorithm, is publically available on
www.ece.northwestern.edu/~nocedal/lbfgsb.html.

30 A method using the characteristic function for European


vanillas
Calibrating a model typically involves repeatedly computing vanilla option prices for various maturities,
strikes, and numerical sets of model parameters. Thus for calibration purposes it is essential to have e-
cient methods for pricing vanilla options. In many models, such methods are provided by suitable Fourier
transforms.

30.1 Fourier Transform Miscellanea


Recall that the Fourier transform F f of an absolutely integrable function f from R to itself (in particular,
lim∞ f = lim−∞ f = 0), is dened by, for u∈R:
R∞
F f (u) = −∞
eiux f (x)dx

Note that the characteristic function of a random variable X with law of density p, dened by:

R∞
Φ(u) = E[exp(iuX)] = −∞
eiux p(x)dx , u ∈ R

can be interpreted as the Fourier transform F p.


The dierentiation operator translates into multiplication by iu in the Fourier space:

F f (m) (u) = (−iu)m F f (u) , u ∈ R

for dierentiable f.
For regular f, the inverse Fourier transform formula writes, for x∈R:
R∞
f (x) = 1
2π −∞
e−iux F f (u)du (112)

Moreover, Fourier transforms may be extended to complex values of their argument u (complex Fourier
transforms), for u in suitable strips of analyticity parallel to the real axis. One can then show that the law of
any Fourier-integrable random variable X, with characteristic function denoted by Φ, admits a weak density
p formally given by, for any η > 0 :
R −ηi+∞
p(x) = 1
2π −ηi−∞
e−iux Φ(u)du (113)
90

meaning that for any regular enough function ϕ such that Eϕ(X) is well dened, we have (in the strong
sense, now):
R −ηi+∞ R 

Eϕ(X) = 1
2π u=−ηi−∞
Φ(u) y=−∞
e−iuy ϕ(y)dy du (114)

In particular, we have by (114) applied with ϕ(y) = 1y≤x , for any x ∈ (−∞, x] :
 
F (x) − F (x) = P x ≤ X ≤ x
−ηi+∞ −ηi+∞
x
e−iux − e−iux
Z Z  Z
1 1
= Φ(u) e−iuy dy du = Φ(u) du .
2π u=−ηi−∞ y=x 2π u=−ηi−∞ iu

e−iux
1
R −ηi+∞
Now, by the Cauchy residue formula (see e.g. Titchmarsh [141]): 2π
u=−ηi−∞ iu
Φ(u) du = 1. Therefore,

u=−ηi−∞
e−iux Φ(u)
Z
1
F (x) = lim (F (x) − F (x)) = 1 + du . (115)
x=−∞ 2πi −ηi+∞ u

Moving the integration contour to the real axis, we get by application of Cauchy residue formula (note that
the integrals in (116) are only dened as principal values, see e.g. Lewis [110], Titchmarsh [141]):

e−iux Φ(u)
Z
1 1
F (x) = − lim du
2 2πi ε→0+ u
|u|>ε
(116)
ie−iux Φ(u)
Z
1 1
= + lim du .
2 2π ε→0+ u
|u|>ε

ie−iux Φ(u) ieiux Φ(−u)


Since the conjugate of u
is −u
, one has:

Z∞
ie−iux Φ(u)
Z  −iux 
ie Φ(u)
lim du = 2 lim Re du
ε→0+ u ε→0+ u
|u|>ε ε

Z∞  −iux 
ie Φ(u)
=2 Re du ,
u
0

whence
R∞
 
−iux
F (x) = 1
2
+ 1
π
Re ie u Φ(u) (117)
0

30.2 Option Pricing by Fourier Transform


Let us thus consider the problem of valuing a European call of maturity T, written on the terminal spot
price ST of some underlying asset. Using (117), we are going to see that

Theorem 30.1 The call value at time 0, C0 = Ee−rT (ST − K)+ , can be rewritten:

C0 = Se−qT Π1 − Ke−rT Π2 (118)

where the pseudo-probabilities Π1 and Π2 are given in terms of the characteristic function ΦT (u) = E[exp(iuxT )]
of xT = ln(ST ), assumed to exist, as:

R∞
  
−iuk
 Π1 =

 1
2
+ 1
π
Re e iuΦΦTT(−i)
(u−i)
du
0

  (119)
−iuk
1
+ π1 Re e ΦT (u)
R
 Π2 = du .


2 iu
0
91

(with ΦT (−i) = EST = S0 eκT in (119) , by arbitrage). Now in many models the characteristic function
ΦT is known and computable. This is for instance the case in all AJD models [63] (see e.g. section 5.5).
Knowing ΦT , (118) enables one to compute C0 numerically by quadrature (section 30.4).

In order to establish (118), we rst decompose

 
C0 = E e−rT exT 1{xT ≥k} − Ke−rT P(xT ≥ k) ,

where k = ln K.
By (117),

  Z∞  −iuk 
1 1 e ΦT (u)
Π2 = P xT ≥ k = + Re du.
2 π iu
0

de
P exT
Let us introduce the probability measure P
e equivalent to P dened by dP = E(exT )
. So
Z
E(e−rT exT 1{xT ≥k} ) = e−rT ex 1{x≥k} )dP{xT = x}
x∈R

= E(e−rT exT )P(x


e T ≥ k) = se−qT P(x
e T ≥ k) .

The characteristic function of xT under P


e is given by:

E(exT eiuxT )
Z
e iuxT ) = ΦT (u − i)
E(e eiux dP{x
e T = x} = = .
x∈R E(exT ) ΦT (−i)

Hence, by (117) again,

  Z∞  −iuk 
e xT ≥ k = 1 + 1 e ΦT (u − i)
Π1 = P Re du , (120)
2 π iuΦT (−i)
0

which nishes to prove (118).

30.3 Derivation of the delta in the case of homogenous models


Homogenous models mean models in which the European vanilla option prices are degree one-homogenous
with respect to the pair (S0 , K) (for given values of the remaining parameters and risk factors in the model),
so:

C0 (T, αK; αS0 ) = αC0 (T, K; S0 ) , α > 0

or, equivalently:

C0 (T, K; S0 ) = S0 ∂S0 C0 (T, K; S0 ) + K∂K C0 (T, K; S0 ) (121)

In a homogenous model one easily deduces from (118), (117), (121) that

∆0 = ∂s C(T, K; 0, s) = e−qT Π1

30.4 Numerical Algorithm


If ΦT (u) is computable, both Π1 and Π2 can be computed by inserting the expression for ΦT (u) in (119), and
numerically discretizing the corresponding integrals by the trapezoid method. More precisely, we choose a
step h (e.g. h = 0.01) and a time mesh 0 = x0 < x1 <, . . . , < x M = M (e.g. M = 100) such that xj+1 −xj =
h
M −1
RM
 
hP
h
h. Denoting by f either integrand in (119), one approximates f (u) du by 2
f (x0 )+f (x M )+2 f (xj ) .
h j=1
0
92

For implementation details, including the choice of more ecient quadrature schemes than the rather rough
trapezoid method above, as well as numerical issues related to the evaluation of the multi-valued complex
functions involved in the formulas, we refer the reader to [92].

30.5 An alternative Formula


Unfortunately, FFT cannot be used to evaluate the integrals in (119), since the integrands are singular at the
required evaluation point u = 0. We now present an alternative formula amenable to evaluation by the FFT.
A further advantage of this alternative formula is that it allows one to compute the prices of a whole family
of options with various strikes and maturity T at time 0, which is precisely what is required for calibration
purposes (for other alternative approaches, see [43, 48]).

The idea is to compute the Fourier transform of the call price viewed as a function of the log-strike k = ln K.
−rT
However, this function C(k) is not integrable (since limk→−∞ C(k) = e EST = se−qT > 0). We thus
dene the modied price c(k) = e
αk
C(k), for some xed α > 0. The modied price is integrable, and we
have, for any u∈R:
Z ∞ Z ∞
F c(u) = eiku c(k)dk = e(α+iu)k C(k)dk
−∞ −∞
Z ∞  Z ∞ + 
= e−rT e(α+iu)k ex − ek ρT (x)dx dk
−∞ −∞
Z ∞ Z ∞   
−rT (α+iu)k
=e e ex − ek ρT (x)dx dk
−∞ k
Z ∞ Z x   
= e−rT ρT (x) e(α+iu)k ex − ek dk dx
−∞ −∞
Z ∞  Z x Z x 
−rT x
=e ρT (x) e e(α+iu)k dk − e(α+iu+1)k dk dx
−∞ −∞ −∞
Z ∞  x
e h ix
= e−rT ρT (x) e(α+iu)k
−∞ α + iu −∞

1 h i x
− e(α+iu+1)k dx
α + iu + 1 −∞
Z ∞  x (α+iu)x
e(α+iu+1)x

e e
= e−rT ρT (x) − dx ,
−∞ α + iu α + iu + 1
where the last equality follows from the fact that limk→−∞ e(a+iu)k = 0, for any a>0 and u ∈ R. Hence
Z ∞ (α+iu+1)x
e
F c(u) = e−rT ρT (x) dx
−∞ (α + iu)(α + iu + 1)

e−rT
Z
= ei(u−(α+1)i)x ρT (x)dx ,
(α + iu)(α + iu + 1) −∞
that is :
e−rT ΦT (u − (α + 1)i)
F c(u) = ,
(α + iu)(α + iu + 1)
in which ΦT is the characteristic function of xT = ln(ST ), assumed to be known.

The modied call price function c(k), hence the function C(k) (call price for varying log-strike k = ln K ),
may then be retrieved numerically by discrete Fourier transform. Indeed, we have by the inverse Fourier
transform formula (112):

e−αk
Z
C(k) = e−αk c(k) = e−iku F c(u)du
2π −∞
A N −1
e−αk e−αk X −ikuj
Z
2
≈ e−iku F c(u)du ≈ h e F c(uj ) ,
2π −A 2π j=0
2

by numerical integration with u0 = − A


2
, uj = u0 + jh (h = A
N
). Hence

A N −1
eik 2 −αk X −ijkh
C(k) ≈ h e F c(uj ) ,
2π j=0
93

2πn
so for kn = Nh
(recall that A = hN ):
N −1
einπ e−αkn X −2πi jn
C(kn ) ≈ h e N F c(u ) ,
j
2π j=0

that is, since eiπ = −1 :


N −1
(−1)n e−αkn X −2πi jn
C(kn ) ≈ h e N F c(u ) .
j
2π j=0

In the last sum, we recognize the discrete Fourier transform of (F c(uj ))j at the evaluation point n. Recall
that the discrete Fourier transform (F fn )0≤n≤N −1 of a vector (fj )0≤j≤N −1 writes:

N −1
X jn
F fn = e−2iπ N fj , 0 ≤ n ≤ N − 1 .
j=0
2
It might seem at rst sight that N operations are required to compute the vector F f. However, when N is
a power of 2, the Fast Fourier Transform (FFT) algorithm allows one to compute F f at cost O(N ln N ).
In summary, we can price a call for N values of the strike K by applying the discrete Fourier transform to
an explicitely known function, which can be made at cost O(N ln N ), provided N is a power of 2.

31 Extracting Eective Volatility


31.1 Local versus Eective Volatility
Local Volatility (LV) models [64] are the straightforward generalization of the BS model in which the volatility
parameter is not a constant anymore, but is given by a (positively bounded Borel) function σ = σ(t, St ) of t
and St . As the BS model, LV models are complete, and we have under the related RN probability measure
P:
dSt = St (κdt + σ(t, St )dWt ) , (122)

for a standard P-Brownian motion W. Thus European vanilla call options on S have unique LV arbitrage
price processes given by, for t ∈ [0, T ] :

Πt (T, K) = e−rτ E (ST − K)+ | St = Π(T, K; t, St , σ) .



(123)

In a general LV model, we don't have closed pricing formulae. But the pricing function Π(T, K; t, S, σ)
1,2
dened by (123) is the unique Wp,loc -solution (for p > 2; a.e. solutions with related partial derivatives in
Lp,loc in time-space) of the BS equation in the variables (t, S) :

1
(
−∂t Π − κS∂S Π − σ(t, S)2 S 2 ∂S2 2 Π + rΠ = 0, t<T
2 (124)
Π|T = (S − K)+

and of the dual Dupire equation in the variables (T, K) :

1
(
∂T Π + κK∂K Π − σ(T, K)2 K 2 ∂K
2
2 Π + qΠ = 0, T >t
2 (125)
Π|t = (S − K)+

(see [54, theorem 4.3]).

These equations can be used to compute LV option prices and Greeks numerically. They also imply that
whatever the market RN price process may be, there is always, at any date t0 , a LV model (dependent on t0 !)
with the same spot marginals as the market RN price process at t0 . Assume t0 = 0, without loss of generality.
1,2
Given a suitable interpolation Π0 ∈ Wp,loc of the set obs0 of observed European vanilla call (might be put)
prices π0 (T, K) at time 0, this `tangent diusion process' of the market RN price process corresponds to the
volatility function given by the following Dupire's formula [64], for any (T, K) ∈ [0, ∞) × R+ :
?

∂T Π0 (T,K)+κK∂K Π0 (T,K)+qΠ0 (T,K)


b0 (T, K)2 = 2
σ K 2 ∂ 2 2 Π0 (T,K) (126)
K
94

provided Dupire's ratio in the r.h.s. of (126) denes a positively bounded Borel function. But, practically
speaking, it is virtually always possible to nd an interpolation (and/or approximation within the bid ask
1,2
spread) Π0 ∈ Wp,loc of π0 for which this is satised, and we shall refer to the related volatility function
b0 (T, K), as the market (or market model, in case where prices π are in fact model prices) eective volatility
σ
(EV) function σ b0 .
The practical problem of extracting eective volatility from market prices is tantamount to calibrating a LV
model (see section 31.2). Once available, eective volatility is a useful tool in various tasks: hedging [52],
calibrating stochastic volatility (SV) or SV+jumps models [24, 76], arbitraging basket options [9], etc.

31.2 The Local Volatility Calibration problem


The LV calibration problem amounts to inferring a LV function σ from observed option prices, namely
European calls or puts with various strikes and maturities. The local volatility function thus inferred is then
used to price exotic options, and value hedge ratios or derivative exposure, consistently with the market.

But this calibration problem is under-determined (since the set of observed prices is nite whereas the
nonparametric function σ has an innity of degrees of freedom) and ill-posed. So numerical dierentiation
based on Dupire's formula (126) gives a local volatility which is both frenzy and highly unstable, for instance
when performing a day-to-day calibration (see Figure 15).

To solve this diculty, the rst idea that comes in mind is to look for σ within a parameterized family
of functions. But nding classes of functions with all the exibility required for tting implied volatility
surfaces with several hundred of implied volatility points and a variety of shapes, turns out to be a very
challenging task (unless a large family of splines is considered [47], in which case the ill-posedness of the
problem shows up again).

The best way to proceed is to stay non-parametric, and to use regularization methods with prior σ 0 to
stabilize the calibration procedure. Since we use a non-parametric local volatility, the model contains a
sucient number of degrees of freedom to provide a perfect t to virtually any market smile. And the
regularization method guarantees that the local volatility thus calibrated is nice and smooth.

Among the various regularization methods at hand, the most popular one is the Tikhonov variational regu-
larization method with prior σ 0 (cf. section 29.1).

1 0.8
0.9
0.8 0.6
0.7
0.6 0.4
0.5
0.4 0.2
0.3
0.2 0
0.1
0

1.15
1.1
1.05
0 1
0.1 0.2 0.95Stock (Normalized)
0.3 0.4 0.9
0.5 0.6 0.85
Date 0.7 0.8 0.9 0.8

Figure 15: Local Volatility obtained by application of Dupire's formula on the DAX index, May 2
2001.

31.3 Approach by Tikhonov regularization


95

Regularized least squares minimization One rewrites the calibration problem as

2
minσ kΠ|obs0 (σ) − π0 k2 + αd σ, σ 0 (127)

d σ, σ 0 = kσ − σ 0 kH 1 ,

where with

Z ∞ Z ∞
kuk2H 1 = u2 + (∂T u)2 + (∂K u)2 dT dK .
 
T =0 K=0

Lagnado and Osher [101] General gradient descent methodology

Crépey [54] 0 < σ ≤ σ ≤ σ


(α → ∞) (127) well posed in the sense of Hadamard

(α > 0) Stability of α-solutions


(α → 0) Convergence of α-solutions to α ≡ 0-solutions (if any); convergence rates

Crépey [54] Trinomial tree implementation, using a extended KamradRitchken tree like that of section
15.3.1), but with local volatility σ(ti , Sj ).

Example Calibration with Tikhonov regularization on the DAX index European options data set of May 2,
2001, with n = 75 time steps in the calibration tree (see Figure 16).

./calibr calibr.in 2_5_2001__filterdax.dat

NIT NF CG F GNORM
0 1 0 3.300143167176308E-03 5.73346051E-02
1 2 4 2.628389922446460E-03 5.12570229E-02
...
65 67 546 1.874191365654949E-04 6.83920968E-07

Calibration done. All right.

0.25 0.34
0.32
0.25 0.2 0.34 0.3
0.32 0.28
0.2 0.15 0.3 0.26
0.28 0.24
0.15 0.1 0.26 0.22
0.1 0.24 0.2
0.05 0.22
0.18
0.05 0.2
0 0.18 0.16
0 0.16

1.5 1
1.4 0.9
1.3 0.8
0.7
1.2 0.6
0 1.1 4500 0.5
0.1 5000
0.2 1 Stock (Normalized) 0.4 Maturity
0.3 5500 0.3
0.4 0.9 6000
0.5 0.2
0.6 0.8 6500 0.1
Date 0.7 Strike 7000
0.8
0.9 0.7 7500 0

0.04
maturity:
0.044
0.121
0.216
0.03 0.389
0.638
0.869
Implied Volatility mismatch

0.02

0.01

-0.01

-0.02
0.8 0.85 0.9 0.95 1 1.05 1.1 1.15 1.2
Moneyness

Figure 16: Local volatility (squared), implied volatility and calibration accuracy (DAX index).
96

N × nproc 1 3 6
54 25s 9s 10s
101 4m30s 1m57s 1m36s

Table 3: CPU times on nproc processors on a fast Myrinet network; 352 DAX European options.

Remarks 31.1 This approach can be extended to the American local volatility calibration problem (see
Crépey [53]).

Example Calibration with Tikhonov regularization on the FTSE index American options data set of January
4, 1999 (see Figure 17).

./calibr calibr.in 4_1_1999__sei.dat

NIT NF CG F GNORM
0 1 0 6.433226117966511E-03 2.84770204E-02
1 2 4 4.186829455917616E-03 1.50087510E-02
...
61 61 408 1.448329600171004E-03 6.62664338E-05

Calibration done. All right.

(a)

0.5
0.5
0.45 0.45
0.4 0.5
0.5 0.4
0.45 0.35 0.45
0.4 0.3 0.4 0.35
0.35 0.25 0.3
0.35
0.3 0.2
0.25 0.15 0.3 0.25
0.2 0.1 0.25
0.15 0.2
0.05
0.1 0
0.2
0.05
0 1
0.9
0.8
1.5 0.7
0.6
1.4 4500 0.5
1.3 5000 0.4 Maturity
5500 0.3
1.2 6000 0.2
0 0.1 0.2 1.1 Strike 6500 0.1
0.3 0.4 1 Stock (Normalized) 7000 0
0.5 0.6 0.9
Date 0.7 0.8 0.8
0.9 1 0.7

0.1
maturity:
0.128
0.205
0.282
0.08 0.454
0.953

0.06
Implied Volatility mismatch

0.04

0.02

-0.02

-0.04
0.9 0.92 0.94 0.96 0.98 1 1.02 1.04 1.06 1.08 1.1
Moneyness

Figure 17: Local volatility (squared), implied volatility and calibration accuracy (SEI index).
97

31.4 Approach by entropic regularization


Stochastic control approach An alternative approach is to use entropic regularization, rewriting the calibra-
tion problem as the following stochastic control problem (see Avellaneda et al. [11], Samperi [137]):

min d σ, σ 0

with kΠ|obs0 (σ) − π0 k ≤ δ ,
σ

where
2
Z ∞
d σ, σ 0 = E0,S0 (σ(t, St ) − σ 0 (t, St ))2 dt .
0

Using a dual formulation of the problem with Lagrange multipliers, the related optimization problem may be
2
solved at cost O(|obs|), versus O(n ) in the case of Tikhonov regularization. The resolution is thus typically
faster, but it also happens to be less robust, and the regularization is less ecient (since one only regularizes
0 0
by the dierences σ − σ , instead of ∇(σ − σ ) in (127)) than with Tikhonov regularization.

Example Calibration with entropic regularization on the DAX European options data set of May 2, 2001
with n = 75 time steps in the calibration tree (see Figure 18).

./calibr calibr.in 2_5_2001__filterdax.dat

I NFN FUNC GNORM STEPLENGTH

1 4 -9.089E-06 4.101E+00 9.915E-06


...
21 27 -3.450E-05 1.867E-02 2.854E-01
22 28 -3.450E-05 5.052E-03 1.000E+00

entropy : -1.96329e-05

Calibration done. All right.


98

0.3
0.25 0.3
0.3 0.34
0.2 0.32
0.25 0.3 0.25
0.2 0.15 0.28
0.26
0.15 0.1 0.24 0.2
0.22
0.1 0.05 0.2
0.05 0.18 0.15
0
0.16
0 0.14

1.4 1
1.3 0.9
0.8
1.2 0.7
1.1 0.6
0 4500 0.5
0.1 1 5000
0.2 Stock (Normalized) 0.4 Maturity
0.3 0.9 5500 0.3
0.4 6000
0.5 0.2
0.6 0.8 6500 0.1
Date 0.7 Strike 7000
0.8
0.9 0.7 7500 0

0.07
maturity:
0.044
0.06 0.121
0.216
0.389
0.05 0.638
0.869

0.04
Implied Volatility mismatch

0.03

0.02

0.01

-0.01

-0.02

-0.03
0.8 0.85 0.9 0.95 1 1.05 1.1 1.15 1.2
Moneyness

Figure 18: Local volatility (squared), implied volatility and calibration accuracy.
99

Part VII
Appendix  Financial Theory Essentials
The material in this part is contained in Bielecki et al. [25, 27], where a more general notion of defaultable
option is considered. Here we only consider the case of default-free European or American options. Note
however that defaultable claims with terminal payos like 1T <τd φ(ST ) (or 1τ <τd φ(Sτ ) upon exercice at time
τ, in case of American claims), where τd default-time of a reference entity, can be handled in
represents the
exactly the same way as below, provided the default-free discount factor process β below is replaced by a
credit-risk adjusted discount factor α = βG, where Gt represents a suitable survival probability beyond t (see
[25, 26, 27, 28]).

A Arbitrage Theory
To model a derivative with maturity T, we consider, in the abstract set-up of section 2, a primary market
composed of the savings account B and of d risky assets such that on [0, T ] :
• the discount factor process β , that is, the inverse of the savings account B , is a nite variation, continuous,
positive and bounded process, with β0 = 1;
• the risky assets are locally bounded semimartingales.

The discount factor β is supposed to be absolutely continuous with respect to the Lebesgue measure, namely
r du) for a bounded from below short-term interest rate process r.
Rt
βt = exp(− 0 u

The primary risky assets, with Rd -valued price process X, may pay dividends, whose cumulative value
process, denoted by D, is assumed to be an Rd -valued process of nite variation. Given the price process
X, we dene the cumulative price X
b of the asset as
Z
−1
Xt = Xt + βt
b βu dDu . (128)
[0,t]

In the nancial interpretation, the last term in (128) represents the current value at time t of all divi-
dend payments of the asset over the period [0, t], under the assumption that all dividends are immediately
reinvested in the savings account B.
A predictable trading strategy (ζ , ζ)
0
built on the primary market has the wealth process Y given as,

Yt = ζt0 βt−1 + ζt Xt t ∈ [0, T ] (129)

(where ζ is a row vector). Accounting for dividends, we say that a strategy


0
(ζ , ζ) is self-nancing whenever
ζ is b -integrable and if we have, for t ∈ [0, T ],
βX
d(βt Yt ) = ζt d(βt X
bt ). (130)

Moreover, if the wealth process Y is bounded from below, we say that the strategy (ζ 0 , ζ) is admissible.
We assume that the primary market model is free of arbitrage opportunities (though presumably incomplete),
in the sense that the so-called No Free Lunch with Vanishing Risk (NFLVR) condition is satised. This is a
specic no arbitrage condition involving wealth processes of admissible self-nancing trading strategies, see
[61].

By the now classic arbitrage theory (see e.g. [61, 45, 25]), the NFLVR condition is equivalent to the existence
of a risk-neutral measure P ∈ M, where M denotes the set of probability measures P∼P
b such that βX
b is
a P-local martingale.

Denition A.1 An American option is a nancial product with payo process, as seen from the perspective
of the option holder, given by
1{t<T } Lt + 1{t=T } ξ , (131)

where:
• the early exercise payment process L = (Lt )t∈[0,T ] is a real-valued, bounded from below, càdlàg process;
• the payment at maturity T, ξ, is a bounded from below random variable.
An European option is a nancial product with terminal payo, as seen from the perspective of the option
holder, ξ at T, where ξ denotes a bounded from below real random variable.
100

±
In the simplest case, ξ = (ST − K) , for an European vanilla call/put option with maturity T and strike K
1
on S = X , the rst primary risky asset.

Recall that for t ∈ [0, T ], Et denotes the conditional expectation operator given Ft , and Tt (or simply T , in
case t = 0) stands for the set of all stopping times with values in [t, T ].
We know further (see e.g. Bielecki et al. [25]) that for any P ∈ M, the process Π = (Πt )t∈[0,T ] such that

βt Πt = esssupτ ∈Tt Et βτ 1{τ <T } Lτ + 1{τ =T } ξ , t ∈ [0, T ] (132)

is an arbitrage price of the related American option as soon as it is a semimartingale. Moreover, any arbitrage
price of an American option is of this form provided


sup E sup 1{t<T } Lt + 1{t=T } ξ < ∞ . (133)
P∈M t∈[0,T ]

In the case of an European option, then for any P ∈ M, the process Π = (Πt )t∈[0,T ] such that

βt Πt = Et βT ξ, t ∈ [0, T ], (134)

is an arbitrage price of the option. Moreover, any arbitrage price of an European option is of this form
provided

sup Eξ < ∞ . (135)


P∈M

Arbitrage prices like (132)(134) are called P-arbitrage prices in the sequel.

In the case of an European option, we dene a process L by βL = −(c + 1), where −c is a minorant of βT ξ.
In view of the above results, this allows one to interpret an European option as a special case of an American
option. Henceforth, by default, `option' means American option, including European option, with L dened
in this way, as a special case.

B Connection with Hedging


We shall now see that under mild technical conditions, the notions of arbitrage P-prices dened in section
A can be connected with a suitable notion of hedging.

Denition B.1 By a primary strategy, we mean a triplet (V0 , ζ, ρ), such that:
• The number V0 represents the initial wealth,
• ζ is an β Xb -integrable process representing holdings in primary risky assets,
• ρ is a real-valued semimartingale with ρ0 = 0.
The wealth process V of a primary strategy (V0 , ζ, ρ) is given by

d(βt Vt ) = ζt d(βt X
bt ) + βt dρt , t ∈ [0, T ] (136)

with the initial condition V0 .

Note that ρ can be interpreted as a nancing cost. So the primary strategy introduced in Denition B.1 is
not self-nancing in the standard sense, unless ρ = 0.

Denition B.2 (i) An (issuer) hedge with residual cost ρ for an option is a primary strategy (V0 , ζ, ρ) with
related wealth process V such that, for t ∈ [0, T ],
 
Vt − 1{t<T } Lt + 1{t=T } ξ ≥ 0. (137)

Hedges with no residual cost (that is, with ρ = 0) are also called superhedges.
In the special case of European options, if equality holds in (137) at t = T, so VT = ξ, we then deal with a
replicating strategy with residual cost ρ, or simply, in case ρ = 0, a replicating strategy.
101

Given a risk-neutral measure P ∈ M, we shall now postulate suitable integrability and regularity conditions
embedded in the standing assumption that a related reected BSDE (see e.g. [66]) has a solution. So we
shall introduce a reected BSDE (E ) with respect to (Ω, F, P), with data dened in terms of those of a given
option. Assuming that (E ) has a solution (for which various sets of sucient regularity and integrability
conditions are known in the literature), we will deduce explicit hedging strategies with minimal initial wealth
for the related option.

So we consider the following reected BSDE (E ) with data ξ, L :


RT
Πt = ξ − t
ru Πu du + kT − kt − (mT − mt ), t ∈ [0, T ]
L ≤ Πt , t ∈ [0, T ] (E)
RT t
0
(Πu− − Lu− ) dku = 0

Let H2 (P) denote the set of real-valued P-martingales vanishing at time 0, with integrable quadratic variation
over [0, T ].

Denition B.3 By a (P-)solution to (E ), we mean a triplet (Π, m, k) such that all conditions in (E ) are
satised, where:
• the state process Π is a real valued, càdlàg process,
• m lies in H2 (P), and
• k is a non-decreasing continuous process (null at time 0).

Note that if (Π, m, k) is a solution to (E ), then Π is a semimartingale, and by a standard verication principle,
we have that

βt Πt = esssupτ ∈Tt Et βτ 1{τ <T } Lτ + 1{τ =T } ξ , (138)

which is also equal to Et βT ξ , in the case of an European option. So the state process Π of any P-solution to
(E ) is an arbitrage P-price of the option.

Recall that an R -valued semimartingale Y is called a sigma martingale (see e.g. [85, 130, 45]) if there
d
d d i i
exists an R -valued local martingale M and an R -valued process H such that each H is M -integrable,
i = 1, . . . , d (in particular, H must be predictable), and Y i = Y0i + H i dM i for i = 1, . . . , d. Any local
R

martingale is a sigma martingale, and any locally bounded sigma martingale is a local martingale.

Theorem B.4 (By application of Bielecki et al. [27]) Assume that the BSDE (E ) has a solution (Π, m, k).
Then Π is an arbitrage P-price process of the option. Moreover:
(i) Π0 is the minimal initial wealth of an hedge with P-sigma martingale residual cost.
(ii) For any R -valued, (row by row) β Xb -integrable process, Λ, such that N has components in H2 (P),
q⊗d

where

βt dNt = Λt d(βt X
bt ), t ∈ [0, T ], (139)

and for any R1⊗q -valued predictable process z such that


q Z T
X
E (ztj )2 d[N j , N j ]t < ∞ , (140)
j=1 0

let us dene ζ by
 
ζt = zt , Rt − Π̄t− Λt , t ∈ [0, T ] (141)

·
with Π̄t = Πt + βt−1 [0,t] βu dku , and ρ = m − 0 zdN. Then (Π0 , ζ, ρ), is an hedge with initial wealth Π and
R R

P-sigma martingale residual cost ρ.


In the special case of an European option, then k = 0, and (Π, ζ, ρ) is a replicating strategy with P-sigma
martingale residual cost ρ.

Remarks B.1 (i) The special case ρ=0 in the previous results corresponds to a suitable form of model
completeness (replicability of European options, cf. last point of Theorem B.4), in which the issuer of the
option may and wishes to hedge all the risks embedded in the option.
102

The case where ρ 6= 0 corresponds to either model incompleteness, or a situation of model completeness in
which the issuer may but wishes not to hedge all the risks embedded in the option, for instance because she
wants to limit transaction costs, or because she wishes to take some bets in specic risk directions.
(ii) In cases where ρ may be taken equal to 0 in Theorem B.4, the minimality statements in this Theorem
may be used to prove uniqueness of the related arbitrage prices (see [26]).

C Markovian FBSDE Approach


For being usable in practice, a (dynamic) pricing model needs to be constructive, or Markovian in some sense,
relatively to a given option. This will be achieved by assuming that the related BSDE (E ) is Markovian (see
e.g. section 4 of [66]).

Denition C.1 We say that the BSDE (E ) is a decoupled Markovian Forward-Backward SDE (Markovian
FBSDE, for short), if the input data r, ξ and L of (E ) are given by Borel functions of some (Ω, F, P)-Markov
factor process Z with values in a suitable state space (nite-dimensional state space with rst component
given by time t), so
rt = r(Zt ) , ξ = ξ(ZT ) , Lt = L(Zt ) (142)

for some Borel functions denoted as the related processes.

In particular, the system made of the specication of a forward dynamics for Z, together with the BSDE
(E ), constitutes a decoupled Markovian forward-backward system of equations in (Z, Π, m, k). The system
is decoupled in the sense that the forward component of the system serves as an input for the backward
component(Z is an input to (E ), cf. (142)), but not the other way round.

From the point of view of interpretation, the components of Z are observable factors. The rst component
ofZ (indexed by 0) is Zt0 = t. As for the other components of Z :
• the rst d following components of Z are given by the primary risky asset price processes X i , i = 1, . . . , d;
• the remaining components of Z (if any) are to be understood as simple factors that may be required to
`Markovianize' the payos of the option (e.g. factors accounting for path dependence in the option's payo,
and/or non-traded factors such as stochastic volatility in the dynamics of the assets underlying the option).

Note that due to the nature of our model, observability of the factor process Z in the mathematical sense
ofF-adaptedness is not sucient in practice. In order for the model to be usable in practice, a constructive
mapping from a collection of meaningful and directly observable economic variables to Z is really needed.
Otherwise, the model will be useless.

Since the model is dened under a risk-neutral probability P ∈ M, this mapping will typically be achieved
in practice by calibration of Z to a set of observed prices of traded derivatives. Of course this `calibration' is
obvious for the time component of Z or the components of Z which are represented in X: simply x these
components equal to the current time or to the current market values of the related assets.

D Variational Inequality Approach in a JumpDiusion Set-


ting with Regimes
Under a rather generic specication for the Markov factor process Z, we shall now derive a related variational
inequality approach for pricing and hedging the option.

In this view, given an integer p and a nite set E of size l, we dene the following linear operator A acting
on regular functions Π = Π(t, x, y), (t, x, y) ∈ [0, T ] × Rp × E :
p
1 X
At Π(t, x, y) = aij (t, x, y)∂x2i xj Π(t, x, y)
2 i,j=1
 
+ pi=1 bi (t, x, y) − g(t, x, y) Rp ui (t, x, y, x0 )h(t, x, y, dx0 ) ∂xi Π(t, x, y) (143)
P R
0 0
R 
P x, y) Rp Π(t,0x + u(t, x,0y, x ) − Π(t, x, y) h(t, x, y, dx )
+ g(t,
+ y0 ∈E λ(t, x, y, y )(Π(t, x, y ) − Π(t, x, y))
103

where:
• the a(t, x, y) are symmetric non-negative matrices,
• the b(t, x, y) are drift vector coecients,
• the jump intensity function g(t, x, y) is non-negative, the h(t, x, y, ·) are probability measures on Rp , and
u(t, x, y, x0 ) is the jump size function,
0 0 0
• the intensity P matrix function [λ(t, x, y, y )]y,y 0 ∈E is such that λ(t, x, y, y ) ≥ 0 whenever y 6= y , and
0
λ(t, x, y, y) = − y0 ∈E\{y} λ(t, x, y, y ), by convention. Under appropriate technical conditions on the coef-
cients, the existence and uniqueness in law of a stochastic basis (Ω, F, P) and an (Ω, F, P)-Markov process
Z = (t, X , Y) solving the martingale problem for A results from Theorems 4.1 and 5.4 in Chapter 4 of Ethier
and Kurtz [69].
Equivalently, under these conditions (see also [57]), there exists a stochastic basis (Ω, F, P) and an (Ω, F, P)-
Markov process Z = (t, X , Y), the latter unique in law, such that:
• The Rp -valued process X (for some p ≥ d) satises, for t ≥ 0 :
R
dXt = b(t, Xt , Yt ) dt + σ(t, Xt , Yt ) dWt + Rp
u(t, Xt− , Yt− , x) χ
e(dt, dx) (144)

where:

 χ
e(dt, dx) denotes the P-compensated martingale measure associated with a Poisson random measure
χ with P-intensity measure (or P-compensator measure) g(t, Xt , Yt )h(t, Xt , Yt , dx)dt, so

e(dt, dx) = χ(dt, dx) − g(t, Xt , Yt )h(t, Xt , Yt , dx)dt ,


χ

 the dispersion matrix function σ(t, x, y) satises the equality σ(t, x, y)σ(t, x, y)T = a(t, x, y);

• The continuous time E -valued Markov chain Y satises

P P
dYt = y∈E λ(t, Xt , Yt , y)(y − Yt ) dt + y∈E (y − Yt− ) de
νt (y) (145)

where νe is the compensated counting measure corresponding to Y, so

νt (y) = dνt (y) − 1{Yt 6=y} λ(t, Xt , Yt , y) dt,


de y∈E, (146)

in which ν is the counting measure, such that νt (y) is equal to the number of transitions to state y since
time 0.
In particular, we then have the following Itô formula, in which ∂x denotes the row-gradient of Π(t, x, y) with
respect to x:

dΠ(ZRt ) = (∂t + At )Π(t, Xt , Yt )dt + ∂x Π(t, Xt , Yt )σ(t, Xt , Yt)dWt


+P Rp
Π(t, Xt− + u(t, Xt− , Yt− , x), Yt− ) − Π(t, Xt− , Yt− ) χ e(dt, dx) (147)
+ y∈E (Π(t, Xt− y) − Π(t, Xt− , Yt− ))de νt (y) ,

for regular enough functions Π.


Note that if we suppose that the intensity matrix of Y does not depend on t, x, then Y is an homogenous
Markov chain with nite state space E. Alternatively, if we take g(t, x, y, x0 ) = x0 , and we suppose that
the coecients σ, b, u, g and h do not depend on t, x, y, then X is a Poisson-Lévy process. For simplicity
we do not consider the innite activity case, that is, the case when the Lévy jump measure gh is not a
nite measure. We thus dened a rather generic class (except for the exclusion of `small jumps' in X) of
Markovian factor processes Z = (t, X , Y) for an option, in the form of a Y -modulated Lévy-like component
X and an X -modulated Markov chain component Y.
From the point of view of interpretation, Y represents regimes that modulate the dynamics of the risk-neutral
pricing process. For the sake of calibrability of the model, this means in particular that the various model
regimes y∈E will have to be associated with non-overlapping intervals for the other model parameters.

Let us now put a bit more structure on the primary dividend process D, which to this point was only assumed
d
to be an R -valued càdlàg process of nite variation . So we further assume that D is P-integrable and of

the form D = c(Zt )dt, for some Borel function c.
0
Observe that under the following arbitrage P-consistency condition on the model coecients:

bi (t, x, y) = r(t, x, y) xi − ci (t, x, y) , 1 ≤ i ≤ d (148)


104

then βX
b is a P-local martingale.

Given such an (Ω, F, P)-model with related Markovian FBSDE (E ) for an option, let us further dene N
with components in H2 (P) by N T = (W T , νeT ), with q = p + l; in case when l = 1, that is in case when the
regime indicator process is constant, the one-dimensional process ν e in (146) is trivially null and plays no
role whatsoever, and so, we may take q = p. Denote by P the F-predictable σ -algebra on Ω × [0, T ] and by
P(E) the σ -algebra of all subsets of E. A solution (Π, m, k) to (E ) is then typically sought for with m in the
form
Rt Rt RtR
mt = 0
zu dNu + nt = 0
zu dNu + 0 Rp
vu (x) χ
e(du, dx) (149)

for an F-predictable process z and a P ⊗ P(E)-measurable random function v : Ω × [0, T ] × E 7→ R such


that, denoting E = {y 1 , . . . , y l } :
Pp RT RT
(ztj )2 dt + lj=1 E 0 (ztp+j )2 λ(t, Xt , Yt , y j )dt < ∞
P
j=1 E 0
RT R
E 0 E vt2 (x)f (t, Xt , Yt )g(t, Xt , Yt , dx)dt < ∞ .

Assuming further that N satises (139), then all assumptions are satised in Theorem B.4. We thus get a
decoupled forward-backward system of stochastic dierential equations in the unknowns (Z, Π, z, v, k) (see
e.g. MaYong [112]).

The issues of solving (E ) under rather general assumptions covering in particular the cases corresponding
to typical nancial applications (e.g. convertible bonds, see [25, 26]), and the related variational inequality
approach in the Markovian FBSDE case, are addressed in [56, 57] (see also [28]). In particular, we prove
in [56, 57] that, under mild regularity conditions, (E ) has a unique solution (Π, m, k), with m in the form
(149), in suitable Hilbert spaces. In the Markovian FBSDE case, we establish the relation between this
solution and the unique solution in some sense (viscosity solution with growth conditions or weak solution
in weighted Sobolev spaces), Π(t, x, y), to the following PIDE obstacle problem (system of l coupled PIDEs
with obstacle in space-dimension p):

min (−∂t Π(t, x, y) − At Π(t, x, y) + r(t, x, y)Π(t, x, y),


(150)
Π(t, x, y) − L(t, x, y)) = 0 , t < T , (x, y) ∈ Rd × E

with terminal condition Π(T, x, y) = ξ(x, y). Then the above-mentioned relation writes:

Πt = Π(Zt ), t ∈ [0, T ]

plus extra relations between the z -component of the solution of (E ) and ∂x Π(t, Xt , Yt ), in regular cases. The
related hedging strategy with residual cost ρ in Theorem B.4, can also be expressed in terms of the solution
to the above PIDE problem.

In summary, in a generic Markovian market model, option prices and Greeks are given as solutions to related
risk-neutral pricing FBSDEs / PIDEs.
In some cases (models of the AJD class [63], or SABR model [80]), semi-analytic formulas are available for
the prices and Greeks of European vanillas. In any case, the pricing PIDEs can be solved numerically by
stochastic simulation methods, or, provided the dimension of the model is not too large, by deterministic
PIDE numerical schemes.

Acknowledgements. These notes grew out of a graduate course of the Master Program in Financial
Engineering at Evry University (M2IF program, see www.univ-evry.fr/m2if ). I wish to express warm thanks
to the students, with a special mention to Arslan Bendimerad and Rémi Boyer, who produced a preliminary
version of the material of sections 30.5 and 6 in the context of their master project.

References
[1] Y. ACHDOU AND O. PIRONNEAU, Computational Methods for Option Pricing. Vol. 30 of Frontiers
in Applied Mathematics, SIAM, Philadelphia, PA, 2005.

[2] O. ALVAREZ AND A. TOURIN. Viscosity solutions of nonlinear integro-dierential equations.Ann.


Inst. H. Poincaré Anal. Non Lineaire, 13(3):293317, 1996.
105

[3] A. L. AMADORI. The obstacle problem for nonlinear integro-dierential operators arising in option
pricing. Submitted.
[4] A. L. AMADORI. Nonlinear integro-dierential evolution problems arising in option pricing: a viscos-
ity solutions approach. Journal of Dierential and Integral Equations, 16(7):787811, 2003.

[5] L. ANDERSEN. Ecient Simulation of the Heston Stochastic Volatility Model. Banc of America
Securities, January 23, 2007.

[6] L. ANDERSEN R. BROTHERTON-RATCLIFFE. Exact exotics, Risk, 9:8589, Oct 1996.

[7] L. ANDERSEN AND J. SIDENIUS. Extensions to the Gaussian Copula: Random Recovery and
Random Factor Loadings, Journal of Credit Risk, Vol. 1, No. 1 (Winter 2004), pp. 2970.

[8] A. ANTONOV, S. MECHKOV, AND T. MISIRPASHAEV. Analytical techniques for synthetic CDOs
and credit default risk measures, NumeriX, 2005.

[9] M. AVELLANEDA, D. BOYER-OLSON, J. BUSCA AND P. FRIZ. Reconstruction of Volatility:


Pricing index options using the steepest-descent approximation, RISK, October 2002.

[10] AVELLANEDA M., BU R., FRIEDMAN C., GRANDCHAMP N., KRUK L. AND NEWMAN J.
Weighted Monte Carlo: a new technique for calibrating asset-pricing models, IJTAF, (2001) 4, 129.

[11] M. AVELLANEDA, C. FRIEDMAN, R. HOLMES AND D. SAMPERI. Calibrating volatility surfaces


via relative-entropy minimization, Applied Math. Finance, 41 (1997), pp. 3764.

[12] M. AVELLANEDA, P. LAURENCE. Quantitative Modeling of Derivative Securities from Theory to


Practice, Chapman & Hall, 2000.

[13] M. AVELLANEDA, A. LEVY, AND A. PARAS. Pricing and hedging derivative securities in markets
with uncertain volatilities, Applied Math. Finance, 1995.

[14] BALLY V., CABALLERO E., EL-KAROUI N. AND B. FERNANDEZ. Reected BSDE's, PDE's and
Variational Inequalities. Bernouilli, To appear.

[15] BALLY, V. AND MATOUSSI, A. Weak solutions for SPDEs and Backward doubly stochastic dier-
ential equations, Journal of Theoretical Probability, Vol. 14, No. 1, 125-164 (2001).
[16] BARLES, G., BUCKDAHN, R. AND PARDOUX, E.. Backward Stochastic Dierential Equatiions
and Integral-Partial Dirential Equations, Stochastics and Stochastics Reports, Vol. 60, pp. 57-83
(1997).

[17] G.BARLES, C. DAHER AND M. ROMANO. Convergence of numerical schemes for parabolic equa-
tions arising in nance theory, Math. Models Methodes App. Sci., 5, n 1, pp. 125143, 1995.
o

[18] BARLES, G. AND L. LESIGNE. SDE, BSDE and PDE. El Karoui, N. and Mazliak, L., (Eds.),
Backward Stochastic dierential Equatons. Pitman Research Notes in Mathematics Series 364, pp.
47-80 (1997).

[19] G. G. BARLES AND P.E. SOUGANIDIS. Convergence of approximation schemes for fully nonlinear
second order equations, Asymptotic Anal., (4), pp. 271283, 1991.

[20] J. BARRAQUAND AND T. PUDET. Pricing of American Path-Dependent Contingent Claims, Math-
ematical Finance, 6, no 1, pp. 1751, 1996.
[21] D. BATES. Jumps and SV: exchange rate processes implicit in deutsche mark options, Review of
Financial Studies, 9 (1) (1996), 69107.

[22] R. BELLMAN. Dynamic Programming, Princeton Univ Press, 1957.

[23] A. BENSOUSSAN AND J.L. LIONS. Application des Inéquations Variationnelles en Contrôle Stochas-
tique, Dunod, 1978.

[24] BERESTYCKI, H., J. BUSCA, AND I. FLORENT. Asymptotics and calibration of local volatility
models, Quant. Finance 2 (1) (2002).

[25] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M. Arbitrage pricing of
defaultable game options with applications to convertible bonds, Quantitative Finance, Forthcoming.

[26] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M. Valuation and hedging
of defaultable game options in a hazard process model, Submitted, 2006.

[27] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M. Defaultable options in a
Markovian intensity model of credit risk, Submitted, 2006.
106

[28] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M. Convertible bonds in a
jump-diusion model of credit risk, Work in preparation, 2006.

[29] BIELECKI, T.R., CRÉPEY, S., JEANBLANC, M. AND RUTKOWSKI, M., Valuation of basket
credit derivatives in the credit migrations environment. Handbook of Financial Engineering, To appear.
[30] BIELECKI, T.R., JEANBLANC, M. AND RUTKOWSKI, M. Pricing and trading credit default
swaps. Submitted, 2005.

[31] BIELECKI, T.R. AND RUTKOWSKI, M. Credit Risk: Modeling, Valuation and Hedging. Springer-
Verlag, Berlin, 2002.

[32] F.BLACK M.SCHOLES. The pricing of Options and Corporate Liabilities, Journal of Political Econ-
omy, 81:635654, 1973.

[33] N. BOULEAU AND D. LEPINGLE. Numerical methods for stochastic processes, Wiley, 1993.

[34] BOUCHARD B. AND ELIE R. Discrete time approximation of decoupled Forward-Backward SDE
with jumps, Submitted, 2006.

[35] BRACE, A., DUN, T. AND BARTON, G., Towards a Central Interest Rate Model, FMMA notes
worhing paper, 1998.

[36] A. BRACE, D. GATAREK AND M. MUSIELA. The market model of interest rate dynamics, Math-
ematical Finance, 7(2) (1997), pp. 127154.

[37] BREEDEN, D. AND R. LITZENBERGER. Prices of State-contingent Claims Implicit in Options


Prices, Journal of Business, 51, 621651 (1978).

[38] M.J. BRENNAN E.S. SCHWARTZ. The valuation of the American put option, J. of Finance, 32:449
462, 1977.

[39] BRIANI M, LA CHIOMA C AND NATALINI R. Convergence of numerical schemes for viscosity so-
lutions to integro-dierential degenerate parabolic problems arising in nancial theory, Numer. Math.,
2004, vol. 98, no4, pp. 607-646.

[40] D. BRIGO AND F. MERCURIO. Interest Rate Models - Theory and Practice, Springer Finance,
2001.

[41] M. BROADIE AND P. GLASSERMANN. Pricing American-style securities using simulation, J.J.of
Economic Dynamics and Control, 21:13231352, 1997.

[42] J. BUSCA. A nite elements method for the valuation of American Options, Unpublished manuscript,
1998.

[43] P. CARR AND D.MADAN. Option Valuation using the fast Fourier transform,, Journal of Computa-
tional Finance, (1998), 2, 61-73.

[44] Y. ZHU, X. WU, I-L. CHERN. Derivative Securities and Dierence Methods, Springer Finance, 2004.
[45] CHERNY, A. AND SHIRYAEV, A. Vector stochastic integrals and the fundamental theorems of asset
pricing, Proceedings of the Steklov Institute of mathematics, 237 (2002), pp. 649.

[46] L. CLEWLOW AND A. CARVEHILL. On the simulation of contingent claims, Journal of Derivatives,
6673, Winter 1994.

[47] T. COLEMAN, Y. LI AND A. VERMA. Reconstructing the unknown volatility function. Journal of
Computational Finance, 2 (1999), 3, pp. 77102.

[48] R. CONT AND P. TANKOV. Financial Modelling with Jump Processes, Chapman & Hall/CRC, 2003.
[49] R. CONT AND E. VOLTCHKOVA. Finite dierence methods for option pricing in jump- diusion
and exponential Lévy models, SIAM J. Numer. Anal. 43 (2005), 15961626.

[50] J. COX, S. ROSS M. RUBINSTEIN. Option pricing: a simplied approach, J. of Economics,January


1978.

[51] M. CRANDALL, H. ISHII AND P.-L. LIONS. User's guide to viscosity solutions of second order partial
dierential equations, Bull. Amer. Math. Soc., 1992.

[52] S. CRÉPEY. Delta-hedging Vega Risk? Quantitative Finance, 4 (October 2004), pp. 559579.

[53] S. CRÉPEY. Calibration of the Local Volatility in a trinomial tree using Tikhonov regularization.
Inverse Problems, 19 (2003), pp. 91127.
107

[54] S. CRÉPEY. Calibration of the Local Volatility in a generalized BlackScholes model using Tikhonov
regularization. SIAM Journal on Mathematical Analysis, Vol. 34 No 5 (2003), pp. 11831206.

[55] S. CRÉPEY. Contribution à des méthodes numériques appliquées à la Finance et aux Jeux Diéren-
tiels. PhD Thesis, Ecole Polytechnique, France, January 2001.

[56] CRÉPEY, S., MATOUSSI, A.: Doubly Reected BSDEs with Jumps: A Priori Estimates and Com-
parison Principle, Work in preparation.
[57] CRÉPEY, S., JEANBLANC, M., AND MATOUSSI, A.: Markovian Doubly Reected BSDEs in a
JumpDiusion Setting with Regimes. Work in preparation.
[58] C.W. CRYER. The solution of a quadratic programming problem using systematic overrelaxation,
SIAM J. Control,(9):385392, 1971.

[59] C.W. CRYER. The ecient solution of linear complementarity problems for tridiagonal minkowski
matrices, ACM Trans. Math. Softwave, (9):199214, 1983.

[60] A. DEBUYSSCHER AND M. SZEGÖ. The Fourier Transform Method  Technical Document, Moody's
Investors Service, Working paper, Jan 2003.

[61] F. DELBAEN AND W. SCHACHERMAYER. The Mathematics of Arbitrage, Springer Finance, 2005.
[62] K. DEMETERFI, E. DERMAN, M. KAMAL AND J. ZOU. More than you ever wanted to know
about volatility swaps, Quantitative Strategies Research Notes, March 1999. pp. 7895.

[63] D. DUFFIE, J. PAN AND K. SINGLETON. Transform Analysis and Asset Pricing for Ane Jump-
Diusions, Econometrica, 68 (2000), 6, pp. 13431376.

[64] B. DUPIRE. Pricing with a smile, Risk, 7 (1994), pp. 1820.

[65] B. DUMAS, J. FLEMING AND R. WHALEY. Implied volatility functions: empirical tests, J. of
Finance, 53 (1998), 6, pp. 20592106.

[66] EL KAROUI, N., PENG, S., AND QUENEZ, M.-C. Backward stochastic dierential equations in
nance. Mathematical Finance 7 (1997), 171.

[67] H. W. ENGL, M. HANKE, AND A. NEUBAUER. Regularization of Inverse Problems. Kluwer, Dor-
drecht, 1996.

[68] A. ERN S. VILLENEUVE A. ZANETTE. Adaptive Finite element methods for local volatility euro-
pean option pricing. International Journal of Theoretical and Applied Finance 7(6), 2004.

[69] H.J. ETHIER AND T.G. KURTZ. Markov Processes. Characterization and Convergence. Wiley, 1986.
[70] W. FLEMING AND H. SONER. Controlled Markov processes and viscosity solutions, Second edition,
Springer, 2006.

[71] FOLLMER H. AND SCHIED A.. Stochastic Finance, An Introduction in Discrete Time, De Gruyter,
2002.

[72] E. FOURNIÉ, J-M. LASRY, J. LEBUCHOUX, P-L. LIONS AND N. TOUZI. An application of
Malliavin calculus to Monte Carlo methods in Finance, Finance & Stochastics, vol. 4, no 3, 391412,
1999.

[73] E. FOURNIÉ, J-M. LASRY, J. LEBUCHOUX AND P-L. LIONS. Applications of Malliavin calculus
to Monte-Carlo methods in nance. II, Finance & Stochastics, 5, 2001, p. 201-236.

[74] A. FRIEDMAN.Partial Dierential Equations of Parabolic Type, Prentice Hall, 1964.

[75] A. GARCKE. Sparse grid tutorial, 2006.

[76] J. GATHERAL. The Volatility Surface: A Practitioner's Guide. Wiley, 2006.

[77] M. GIBSON. Understanding the Risk of Synthetic CDOs, FEDS Working Paper No. 2004-36, July
2004.

[78] P. GLASSERMAN. Monte Carlo Methods in Financial Engineering, Springer, 2004.

[79] R. GLOWINSKI, J-L. LIONS AND R. TREMOLIERES. Analyse Numérique des Inéquations Varia-
tionnelles, Dunod , 1976.

[80] P. HAGAN, D. KUMAR, A. LESNIEWSKI AND D. WOODWARD. Managing Smile Risk. Wilmott
Magazine, 2002.

[81] S. HESTON. A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond
and Currency Options, Review of Financial Studies, 6(2) 32743, 1993.
108

[82] J. HULL. Options, futures, & other derivatives, Prentice Hall , 5th edition, 2002.

[83] J. HULL AND A. WHITE. forward rate volatilities, swap rate volatilities, and the implementation of
the libor market model, Journal of Fixed Income, 10(2) 4662, 2000.

[84] P. JÄCKEL. Stochastic Volatility Models: Past, Present and Future, The Best of Wilmott 1, Chapter
23, 2005.

[85] JACOD, J. AND SHIRYAEV, A.N. Limit Theorems for Stochastic Processes. Springer, 2003.

[86] P. JAILLET D. LAMBERTON AND B. LAPEYRE. Variational Inequalities and the Pricing of Amer-
ican Options, Acta Applicandae Mathematicae 21, pp. 263289, 1990.

[87] E. R. JAKOBSEN AND K. H. KARLSEN. A "maximum principle for semicontinuous functions"


applicable to integro-partial dierential equations, NoDEA Nonlinear Dierential Equations Appl.
13:137-165, 2006.

[88] E. R. JAKOBSEN AND K. H. KARLSEN. Continuous dependence estimates for viscosity solutions
of integro-PDEs, J. Dierential Equations 212(2): 278-318, 2005.

[89] F. JAMSHIDIAN. Valuation of credit default swap and swaptions. Finance and Stochastics 8, 343-371,
2004.

[90] F. JAMSHIDIAN. Libor and swap market models and measures, Working paper, Sakura Global Capital,
1996.

[91] Y. JIAO. Le risque de crédit: modélisation et simulation numérique. PhD Thesis (in English), École
polytechnique.

[92] C. KAHL, P. JACKEL. Not-so-complex logarithms in the Heston model. Working Paper, 2006.

[93] B. KAMRAD AND P. RITCHKEN. Multinomial approximating models for options with k state vari-
ables, Management Science, 37:16401652, 1991.

[94] I. KARATZAS AND S. SHREVE. Brownian Motion and Stochastic Calculus, Springer, 1988.

[95] G.Z. KEMNA AND A.C.F. VORST. A pricing method for options based on average asset values, J.
Banking Finan., 113129, March 1990.

[96] P. E. KLOEDEN AND E. PLATEN. The Numerical Solution of Stochastic Dierential Equations,
Springer-Verlag, 1992

[97] D.E. KNUTH. The Art of Computer programming, Seminumerical Algorithms, volume 2, Addison-
Wesley, 1981.

[98] H. KUSHNER. Probability Methods for Approximations in Stochastic Control and for Elliptic Equa-
tions, Academic Press, 1977.

[99] H. KUSHNER AND P.G. DUPUIS. Numerical Methods for Stochastic Control Problems in Continous
Time, Springer-Verlag, 1992.

[100] Y.W. KWOK. Mathematical models of nancial derivatives, Springer Finance, 1998.

[101] R. LAGNADO AND S. OSHER. A Technique for Calibrating Derivative Security Pricing Models:
Numerical Solution of an Inverse Problem, Journal of Computational Finance, 1 (1997), 1, pp. 1325.

[102] D. LAMBERTON AND B. LAPEYRE. Introduction to Stochastic Calculus Applied to Finance, Chap-
man and Hall, 1996.

[103] B. LAPEYRE AND E. TEMAM.. Competitive Monte Carlo methods for the pricing of Asian options
, Journal of Computational Finance, Volume 5 / Number 1, Fall 2001

[104] B. LAPEYRE, E. PARDOUX AND R. SENTIS. Méthodes de Monte-Carlo pour les équations de
transport et de diusion, Math. & Applic., SMAI, No 29, Springer, 1998.

[105] J.-P. LAURENT AND J. GREGORY. Basket Default Swaps, CDOs and Factor Copulas, Journal of
Risk, Forthcoming.

[106] P. L'ECUYER.. Random numbers for simulation, Communications of the ACM, 33(10), Octobre 1990.
[107] P. L'ECUYER. Uniform random number generation, The Annals of Operations Research, 53:77120,
1994.

[108] P. L'ECUYER. Random number generation, In The Hanbook of Simulation, 1998.

[109] F. DUBOIS, T. LELIEVRE. Ecient Pricing of Asian Options by the PDE Approach, Journal of
Computational Finance, Vol. 8, No. 2, pp. 55-64, Winter 2004/05.
109

[110] LEWIS, A.. A simple option, formula for general jump- diusion and other exponential Levy processes,
available from http://www.optioncity.net, 2001.

[111] D. LI. On default correlation: A copula function approach, In Journal of Fixed Income, 115118, 2000.
[112] J. MA AND J. YONG. Forward-backward stochastic dierential equations and their applications.
Springer, Lecture Notes in Math., vol. 1702 (1999).

[113] R.C. MERTON. Option pricing when underlying stock returns are discontinuous, J. Financial Eco-
nomics, 3 (1976), pp. 125144.

[114] R.C. MERTON. The theory of rational option pricing, Bell J. Econ. Man. Sc., 4 (1973), pp. 141183.

[115] METROPOLIS, N. AND ULAM, S. The Monte Carlo Method, J. Amer. Stat. Assoc., 44, 335-341
(1949).

[116] K. MILTERSEN, K. SANDMANN AND D. SONDERMANN, Closed form solutions for term structure
derivatives with log-normal interest rates,

[117] W.J. MOROKOFF AND R.E. CAFLISH. Quasi-random sequences and their discrepancies, SIAM,
Journal of Scientic Computin, 5(6):12511279, nov 1994.

[118] K. MORTON AND D. MAYERS. Numerical Solution of Partial Dierential Equations, Cambridge
university press, 1994.

[119] M. MUSIELA AND M. RUTKOWSKI. Martingale Methods in Financial Modelling, Springer, 2004.

[120] H. NIEDERREITER. New developments in uniform pseudorandom number and vector generation, In
Springer, editor, In Lecture Notes in Statistics, 106, Monte Carlo and Quasi-Monte Carlo Methods in
Scientic Computing, volume 106, pages 87120, 1994.

[121] H. NIEDERREITER A.B.OWEN AND J.SHIUE Editors. Randomly permuted (t,m,s)-Nets and (t,s)-
sequences, in Montecarlo and Quasi Montecarlo methods in Scientic Computing, Springer, New York,
1995.

[122] H. NIEDERREITER. Random Number Generation and Quasi Monte Carlo Methods, Society for
Industrial and Applied mathematics, 1992.

[123] H. NEIDERREITER. Points sets ans sequences with small discrepancy, Monatsh.Math, 104:273337,
1987.

[124] J. NOCEDAL, S. J. WRIGHT. Numerical Optimization, 2


nd
edition, Springer, 2006.

[125] D. O'KANE AND M. LIVESEY. Base Correlation Explained, Lehman Brothers Fixed Income Quan-
titative Credit Research, Nov 2004.

[126] M. OVERHAUS ET AL. Equity Derivatives: Theory and Applications , Wiley Finance, 2002.

[127] D.W. PEACEMAN AND H.H. RACHFORD Jr. The numerical solution of parabolic and elliptic
dierential equations, J.of Siam, 3:2842, 1955.

[128] D.M.POOLEY P.A.FORSYTH AND K.R.VETZAL, Numerical Convergences properties of option


pricing PDEs with uncertain volatility, IMA Journal of Numerical Analysis, 23, 241-267, 2003.

[129] W.H. PRESS, B.P. FLANNERY, S.A. TEUKOLSKY AND W.T. VETTERLING, Numerical Recipes
in C: The Art of Scientic Computing (Second Edition), Cambridge University Press, 1992.

[130] PROTTER, P. E., Stochastic Integration and Dierential Equations, Second edition, Springer, 2004.

[131] R. REBONATO. Modern Pricing of Interest-Rate Derivatives: The LIBOR Market Model and Beyond,
Princeton University Press, 2002.

[132] REISINGER, C. Analysis of linear dierence schemes in the sparse grid combination technique, SIAM
Journal on Numerical Analysis, 2006.

[133] D. REVUZ AND M. YOR. Continuous Martingales and Brownian Motion, 3rd edition. Springer, 2001.
[134] P. RITCHKEN. On pricing barrier options, Journal Of Derivatives, pages 1928, Winter 95 1995.

[135] L.C.G. ROGERS AND Z. SHI. The value of an asian option, J. Appl. Probab., 32(4):10771088, 1995.

[136] SAAD, Y. AND SCHULTZ, M.H. GMRES: A Generalized Minimal Residual Algorithm for Solving
Nonsymmetric Linear System, SIAM J. Sci. Statist. Comput., 7, 856-869, 1986.

[137] D. SAMPERI. Calibrating a Diusion Pricing Model with Uncertain Volatility: Regularization and
Stability, Mathematical Finance, 12 (2002), 1, pp. 7187.
110

[138] P.J. SCHÖNBUCHER. Credit Derivatives Pricing Models: Model, Pricing and Implementation, Wiley
Finance, 2003.

[139] I.M. SOBOL. The distribution of points in a cube and the approximate evaluation of integrals, U.S.S.R
Comput. Maths. Math. Phys, 16:236242, 1976.

[140] D TAVELLA AND C RANDALL. Financial Instruments: The Finite Dierence Method, Wiley Fi-
nancial Engineering, 2000.

[141] TITCHMARSH, E.C.. The Theory of Functions, 2nd ed., Oxford 1997.

[142] O. VASICEK. Limiting loan loss probability distribution, Moody's KMV, 1991.

[143] S. VILLENEUVE AND A. ZANETTE. Parabolic ADI methods for pricing American option on two
stocks, Mathematics of Operations Research, Vol. 27, Issue 1, 121149, 2002

[144] E. VOLTCHKOVA. Integro-dierential evolution equations: numerical methods and applications in


nance, PhD thesis, Ecole Polytechnique (in French), Paris, 2005.

[145] P. WILMOTT, Cliquet Options and Volatility Models, Wilmott Magazine, 78-83, 2002.

[146] P. WILMOTT. Derivatives: the theory and practice of nancial engineering, Wiley, 1998.

[147] P. WILMOTT, J. DEWYNNE AND S. HOWISON. Option Pricing, Oxford Financial Press (1993).

[148] H.A. WINDCLIFF P.A. FORSYTH AND K.R. VETZAL, Numerical Methods for Valuing Cliquet
Options. Applied Mathematical Finance, Forthcoming.

[149] H. WINDCLIFF, P.A. FORSYTH AND K.R.VETZAL. Pricing methods and hedging strategies for
volatility derivatives, Journal of Banking and Finance, 30 (2006) 409431.

[150] YANG, J., T. HURD AND X. ZHANG. Saddlepoint Approximation Method for Pricing CDOs, Journal
of Computational Finance, Vol. 10, No. 1, (Fall 2006).

[151] R. ZVAN, P.A. FORSYTH AND K.R. VETZAL. Robust numerical methods for pde models of asian
option, Journal of Computational Finance, 1:3978, 1998.

You might also like