3 - Partial Differential Equations


MATHGR5030: Numerical Methods in Finance Spring 2023

Part 3: Finite difference methods for PDEs


Lecturer: Luca Capriotti

Synopsis: In this lecture, we discuss finite-difference methods for the solution of Partial Differential Equations (PDEs). We start by refreshing the main concepts on PDEs and the Black-Scholes heat equation, which we use as a basis for introducing the main discretization schemes. We also use the heat equation to discuss the von Neumann stability analysis, which characterizes the stability and convergence of finite difference schemes using two different approaches, namely Fourier analysis and spectral (matrix) analysis. We then move on to study general second-order linear parabolic partial differential equations, such as those that arise in more complex modelling frameworks. After reviewing the general link between PDEs and the type of expectation values that are commonplace in derivatives pricing, via the so-called Feynman-Kac theorem, we discuss in this setting general θ-schemes. We cover different types of boundary conditions, and the application to Asian options and American-style exercise. We then cover two-dimensional PDEs and the so-called ADI method. Finally, we review the standard methods for solving linear systems that are at the core of PDE finite-difference solvers.

3.1 Reminder on PDEs and the Black-Scholes heat equation

3.1.1 Review of PDEs and their classification

A partial differential equation (PDE) is a functional equation that contains both a function and some of its derivatives. As opposed to an ordinary differential equation (ODE), in which the unknown function depends on one variable, the unknown function in a PDE depends on several variables. In mathematical finance, these variables are usually the time $t$ and a state variable $x$ that lies in some subset of $\mathbb{R}^n$ ($n \geq 1$). For a given function $f: \mathbb{R} \to \mathbb{R}$, we shall use interchangeably the notations $\frac{\partial f}{\partial x}$ and $\partial_x f$ to denote the derivative with respect to the (one-dimensional) variable $x$. Let now $\Omega$ be a subset of $\mathbb{R}^n$ and $u = (u_1, \ldots, u_m)$ a multidimensional function from $\Omega$ to $\mathbb{R}^m$. For an integer vector $\alpha = (\alpha_1, \ldots, \alpha_n)$, with $|\alpha| = \alpha_1 + \ldots + \alpha_n$, we denote by $D^\alpha u$ the partial derivative
\[
D^\alpha u \triangleq \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}} = \left( \frac{\partial^{|\alpha|} u_1}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}, \ldots, \frac{\partial^{|\alpha|} u_m}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}} \right),
\]
and for any $k \geq 1$, $D^k u \triangleq \{ D^\alpha u : |\alpha| = k \}$ the set of all partial derivatives of order $k$. For example, when $|\alpha| = 1$, $Du$ represents the gradient matrix
\[
Du = \begin{pmatrix} \partial_{x_1} u_1 & \ldots & \partial_{x_n} u_1 \\ \vdots & \ddots & \vdots \\ \partial_{x_1} u_m & \ldots & \partial_{x_n} u_m \end{pmatrix}.
\]
For a given integer $k \in \mathbb{N}$, $D^k u$ represents the tensor of all partial derivatives of order $k$, namely the collection of all partial derivatives $D^\alpha u$ such that $|\alpha| = k$.
A partial differential equation (PDE) in $u$ of order $k$ is an equation of the form
\[
F\left( D^k u, D^{k-1} u, \ldots, Du, u, x \right) = 0, \tag{3.1}
\]
for an appropriate function $F$, where the higher-order term in $D^k u$ is not null. A solution of the PDE is a $k$-times differentiable function $u: \Omega \to \mathbb{R}^m$ such that $F\left( D^k u(x), D^{k-1} u(x), \ldots, Du(x), u(x), x \right) = 0$ for all $x \in \Omega$.
There is in general no guarantee that a solution to a given PDE of the form (3.1) will exist. For instance, the PDE
\[
(\partial_x u)^2 + 1 = 0,
\]
with $m = n = 1$, has no real solution.
It is useful to introduce some nomenclature. The PDE in (3.1) is called:

1. linear if it can be written as $\sum_{i=0}^{k} \alpha_i(x) D^i u(x) = f(x)$ for some functions $\alpha_i$ ($i \leq k$) and some function $f$. It is further called homogeneous if $f \equiv 0$. For example, using the standard notation $u_x = \partial_x u$ and with $x = (t, x_1, \ldots, x_{n-1})$,

• $u_t + u_x = 0$ is homogeneous linear;
• $u_{xx} + u_{yy} = 0$ is homogeneous linear;
• $u_{xx} + u_{yy} = x^2 + y^2$ is inhomogeneous linear;
• $u_t + x^2 u_x = 0$ is homogeneous linear;
• $u_t + u_{xxx} + u u_x = 0$ is not linear;
• $u_x^2 + u_y^2 = 1$ is not linear.

2. semilinear if it can be written as $\alpha_k(x) D^k u(x) + \alpha_0\left( D^{k-1} u(x), \ldots, Du(x), u(x), x \right) = 0$. For example:

• $u_t + u_x + u^2 = 0$ is semilinear;
• $u_t + u_{xxx} + u u_x = 0$ is semilinear;
• $u_t + u u_x = 0$ is not semilinear.

3. quasilinear if it has the form
\[
\alpha_k\left( D^{k-1} u(x), \ldots, Du(x), u(x), x \right) D^k u(x) + \alpha_0\left( D^{k-1} u(x), \ldots, Du(x), u(x), x \right) = 0.
\]
For example:

• $u_t + u u_x = 0$ is quasilinear;
• $u_t + a(u) u_x = 0$ is quasilinear;
• $u_x^2 + u_y^2 = 1$ is not quasilinear.

4. non-linear if it is not quasilinear. For example:

• $u_x^2 + u_y^2 = 1$ is non-linear.

Among all PDEs, we shall be interested in inhomogeneous linear second-order PDEs in the case $m = 1$, namely equations of the form $L f = 0$, where the operator $L$ has the following form:
\[
L \triangleq a_{11} \partial_{xx} + 2 a_{12} \partial_{xy} + a_{22} \partial_{yy} + a_1 \partial_x + a_2 \partial_y + a_0. \tag{3.2}
\]
The operator $L$ in (3.2) can be reduced to one of the following three forms:

• Elliptic form: if $a_{12}^2 < a_{11} a_{22}$, then $L = \partial_{xx} + \partial_{yy} + L_1$;

• Hyperbolic form: if $a_{12}^2 > a_{11} a_{22}$, then $L = \partial_{xx} - \partial_{yy} + L_1$;

• Parabolic form: if $a_{12}^2 = a_{11} a_{22}$, then $L = \partial_{xx} + L_1$,

where $L_1$ is an operator of order at most one.


Let us focus on the second-order term. Assuming that $a_{11} \neq 0$ (and in that case normalising $a_{11} = 1$), and denoting $\tilde{a}_{12} \triangleq a_{12}/a_{11}$ and likewise for the other parameters, we can write
\[
L = \left( \partial_x + \tilde{a}_{12} \partial_y \right)^2 + \left( \tilde{a}_{22} - \tilde{a}_{12}^2 \right) \partial_{yy}.
\]
In the elliptic case $a_{12}^2 < a_{11} a_{22}$ (equivalently $\tilde{a}_{12}^2 < \tilde{a}_{22}$), the quantity $\lambda \triangleq \left( \tilde{a}_{22} - \tilde{a}_{12}^2 \right)^{1/2}$ is well defined and non-zero. With the new variables $\eta \triangleq (y - \tilde{a}_{12} x)/\lambda$ and $\xi \triangleq x$, the operator $L$ reads
\[
L = \partial_{\eta\eta} + \partial_{\xi\xi} \qquad \text{(Laplace operator on } \mathbb{R}^2\text{)}.
\]
In the transformed coordinate system the PDE assumes an elliptic canonical form. The other cases (hyperbolic and parabolic) are treated similarly. The examples below give the prototypical canonical second-order PDEs:

• Laplace equation on R2 , r2 f = 0 (where r2 is the Laplace operator), is a linear elliptic PDE: a11 =
a22 = 1 and a12 = 0;
• Heat equation on R, @t f @x2 f = 0, is a linear parabolic PDE: a11 = 0, a22 = 1 and a12 = 0;
• Wave equation on R, @tt f @x2 f = 0, is a linear hyperbolic PDE: a11 = a22 = 1;

Here we have assumed the coefficients of the operator L in (3.1.2) to be constant. We could make them
functions of (x, y), and the definitions would remain the same, namely the operator L is locally elliptic at the
point (x, y) if a12 (x, y)2 < a11 (x, y)a22 (x, y), and is elliptic everywhere if the inequality holds for all (x, y).
The classification above can be extended to second-order PDEs in higher dimension using the sign of the
eigenvalues of the matrix constructed with the coefficients of the second order derivatives.
As discussed above, a partial differential equation is an equation involving a function $u$ and its derivatives. Given a PDE, we want to know if there are any functions $u$ which satisfy the equation. As you may recall from studying ordinary differential equations, however, there may be no solutions or there may be many solutions for a given ODE. The same is true for partial differential equations. For example, consider the transport equation,
\[
u_t - u_x = 0.
\]
This equation models simple fluid motion, where $u(x, t)$ is the height of the wave at time $t$ and position $x$. Clearly, $u(x, t) = 0$ satisfies this equation, but so does $u(x, t) = c$ for any constant $c$. In fact, there are a lot of different solutions to this equation. This is called an ill-posed problem, because there is not a unique solution. However, if we impose an auxiliary condition, i.e. an initial condition, we can find a unique solution. In particular, the problem
\[
\begin{cases} u_t - u_x = 0 \\ u(x, 0) = \phi(x) \end{cases}
\]
is a well-posed problem, assuming the initial data $\phi(x)$ is a "nice" function. We are interested in studying so-called well-posed problems. Roughly, we say a problem is well-posed if there exists a unique solution which depends continuously on the initial or boundary data. We will discuss particular initial value problems and boundary value problems later in the course. When modelling a physical or financial problem with a partial differential equation, it is quite natural to ask that the solution $u$ not only satisfies the PDE, but also satisfies some additional conditions.
The classical types of boundary conditions are the Dirichlet boundary conditions, i.e. $u(x)$ is specified when $x$ lies on the boundary $\partial\Omega$ of the domain; the Neumann condition, when the derivative of $u$ is set on $\partial\Omega$; and mixed boundary conditions, which are a combination of Dirichlet and Neumann conditions.

3.1.2 The Black-Scholes-Merton heat equation

The celebrated Black-Scholes equation, which we recalled in the first lecture, is an example of a second-order parabolic PDE:
\[
\partial_t V_t + r S \partial_S V_t + \frac{\sigma^2}{2} S^2 \partial^2_{SS} V_t = r V_t, \tag{3.3}
\]
with boundary condition $V_T(S)$. For instance, for a European call option with maturity $T > 0$ and strike $K > 0$, we have $V_T(S) = (S - K)^+$. Let us define $\tau \triangleq T - t$ (this will invert the sign of the time axis) and the function $g_\tau(S) \triangleq V_t(S)$; then $\partial_t V_t(S) = -\partial_\tau g_\tau(S)$ and hence
\[
-\partial_\tau g_\tau + r S \partial_S g_\tau + \frac{\sigma^2}{2} S^2 \partial^2_{SS} g_\tau = r g_\tau, \tag{3.4}
\]
with boundary condition $g_0(S)$. Let us now define the function $f$ by $f_\tau(S) \triangleq e^{r\tau} g_\tau(S)$, and we obtain
\[
-\partial_\tau f_\tau + r S \partial_S f_\tau + \frac{\sigma^2}{2} S^2 \partial^2_{SS} f_\tau = 0, \tag{3.5}
\]
with boundary condition $f_0(S)$. Consider a further transformation $x \triangleq \log(S)$ and the function $\psi_\tau(x) \triangleq f_\tau(S)$. Since $S \partial_S f_\tau(S) = \partial_x \psi_\tau(x)$ and $S^2 \partial^2_{SS} f_\tau(S) = \partial^2_{xx} \psi_\tau(x) - \partial_x \psi_\tau(x)$, we obtain
\[
-\partial_\tau \psi_\tau + \left( r - \frac{\sigma^2}{2} \right) \partial_x \psi_\tau + \frac{\sigma^2}{2} \partial^2_{xx} \psi_\tau = 0, \tag{3.6}
\]
with boundary condition $\psi_0(x)$. Finally, define the function $\phi_\tau$ via $\psi_\tau(x) \triangleq e^{\alpha x + \beta \tau} \phi_\tau(x)$, so that
\[
\begin{aligned}
\partial_x \psi_\tau(x) &= \left( \alpha \phi_\tau(x) + \partial_x \phi_\tau(x) \right) e^{\alpha x + \beta \tau}, \\
\partial^2_{xx} \psi_\tau(x) &= \left( \alpha^2 \phi_\tau(x) + 2\alpha \partial_x \phi_\tau(x) + \partial^2_{xx} \phi_\tau(x) \right) e^{\alpha x + \beta \tau}, \\
\partial_\tau \psi_\tau(x) &= \left( \beta \phi_\tau(x) + \partial_\tau \phi_\tau(x) \right) e^{\alpha x + \beta \tau}.
\end{aligned}
\]
With the parameters
\[
\alpha \triangleq -\frac{1}{\sigma^2}\left( r - \frac{\sigma^2}{2} \right) \qquad \text{and} \qquad \beta \triangleq -\frac{1}{2\sigma^2}\left( r - \frac{\sigma^2}{2} \right)^2,
\]
Equation (3.6) becomes the so-called heat equation:
\[
\partial_\tau \phi_\tau(x) = \frac{\sigma^2}{2} \partial^2_{xx} \phi_\tau(x),
\]
for all real numbers $x$, with (Dirichlet) boundary condition $\phi_0(x) = e^{-\alpha x} \psi_0(x)$.
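As a sanity check on this chain of substitutions, one can verify symbolically that, with the values of $\alpha$ and $\beta$ above, any solution of the heat equation is mapped by $\psi_\tau(x) = e^{\alpha x + \beta\tau}\phi_\tau(x)$ into a solution of (3.6). A minimal sketch, assuming the sympy library is available (the symbol names are our own):

```python
# A minimal sympy sketch (symbol names are ours): check that if phi solves the
# heat equation, then psi = exp(alpha*x + beta*tau) * phi solves Eq. (3.6).
import sympy as sp

x, tau, r, sigma = sp.symbols("x tau r sigma", positive=True)
phi = sp.Function("phi")

alpha = -(r - sigma**2 / 2) / sigma**2
beta = -(r - sigma**2 / 2) ** 2 / (2 * sigma**2)
psi = sp.exp(alpha * x + beta * tau) * phi(tau, x)

# Residual of Eq. (3.6): -d_tau psi + (r - sigma^2/2) d_x psi + sigma^2/2 d_xx psi
res = (-sp.diff(psi, tau)
       + (r - sigma**2 / 2) * sp.diff(psi, x)
       + sigma**2 / 2 * sp.diff(psi, x, 2))

# Substitute the heat equation d_tau phi = sigma^2/2 d_xx phi and simplify.
res = res.subs(sp.Derivative(phi(tau, x), tau),
               sigma**2 / 2 * sp.Derivative(phi(tau, x), x, 2))
print(sp.simplify(res))  # expected: 0
```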

In this very particular case, though, one can determine an exact solution using Fourier transform methods. Define the Fourier transform $\hat{\phi}_\tau$ of the function $\phi_\tau$ by
\[
\hat{\phi}_\tau(z) \triangleq \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{izx} \phi_\tau(x)\, dx, \qquad \text{for any } z \in \mathbb{R}.
\]

A double integration by parts shows that
\[
\begin{aligned}
\widehat{\partial_{xx} \phi_\tau}(z) &= \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{izx} \partial_{xx} \phi_\tau(x)\, dx \\
&= \frac{1}{2\pi} \left[ e^{izx} \partial_x \phi_\tau(x) \right]_{-\infty}^{+\infty} - \frac{iz}{2\pi} \int_{-\infty}^{+\infty} e^{izx} \partial_x \phi_\tau(x)\, dx \\
&= \frac{1}{2\pi} \left[ e^{izx} \partial_x \phi_\tau(x) \right]_{-\infty}^{+\infty} - \frac{iz}{2\pi} \left[ e^{izx} \phi_\tau(x) \right]_{-\infty}^{+\infty} - \frac{z^2}{2\pi} \int_{-\infty}^{+\infty} e^{izx} \phi_\tau(x)\, dx \\
&= -z^2 \hat{\phi}_\tau(z),
\end{aligned}
\]
where we have made the standing assumption that the functions $\phi_\tau$ and $\partial_x \phi_\tau$ converge to zero at infinity. It is also immediate to see that $\widehat{\partial_\tau \phi_\tau}(z) = \partial_\tau \hat{\phi}_\tau(z)$. The heat equation therefore becomes
\[
\partial_\tau \hat{\phi}_\tau(z) + \frac{1}{2}\sigma^2 z^2 \hat{\phi}_\tau(z) = 0
\]
in the Fourier variable $z$, with boundary condition $\hat{\phi}_0(z)$. The solution of this equation can be immediately recognized as
\[
\hat{\phi}_\tau(z) = \hat{\phi}_0(z) \exp\left( -\frac{\sigma^2 z^2}{2} \tau \right),
\]
and hence, inverting the Fourier transform leads to
\[
\begin{aligned}
\phi_\tau(x) &= \int_{-\infty}^{+\infty} e^{-ixz} \hat{\phi}_\tau(z)\, dz = \int_{-\infty}^{+\infty} e^{-ixz} \hat{\phi}_0(z) e^{-\frac{1}{2}\sigma^2 z^2 \tau}\, dz \\
&= \int_{-\infty}^{+\infty} e^{-ixz} \left( \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{iz\xi} \phi_0(\xi)\, d\xi \right) e^{-\frac{1}{2}\sigma^2 z^2 \tau}\, dz \\
&= \frac{1}{2\pi} \int_{-\infty}^{+\infty} \phi_0(\xi) \left( \int_{-\infty}^{+\infty} e^{iz(\xi - x)} e^{-\frac{1}{2}\sigma^2 z^2 \tau}\, dz \right) d\xi \\
&= \frac{1}{\sigma\sqrt{2\pi\tau}} \int_{-\infty}^{+\infty} \phi_0(\xi) \exp\left( -\frac{(x - \xi)^2}{2\sigma^2 \tau} \right) d\xi,
\end{aligned}
\]
where the third line follows by Fubini's theorem, and the last line relies on the following equality:
\[
\widehat{e^{-\lambda z^2}}(\omega) \triangleq \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-\lambda z^2 + i\omega z}\, dz = \frac{1}{2\sqrt{\pi\lambda}} \exp\left( -\frac{\omega^2}{4\lambda} \right),
\]
with $\omega \triangleq x - \xi$ and $\lambda \triangleq \sigma^2 \tau / 2$. As a result, the solution of the Black-Scholes PDE (3.3) reads
\[
V_t = e^{-r(T-t)}\, e^{\alpha \log S + \beta (T-t)}\, \frac{1}{\sigma\sqrt{2\pi(T-t)}} \int_{-\infty}^{+\infty} \phi_0(\xi) \exp\left( -\frac{(\log S - \xi)^2}{2\sigma^2 (T-t)} \right) d\xi \tag{3.7}
\]
with
\[
\phi_0(\xi) = e^{-\alpha \xi} \left( e^{\xi} - K \right)^+.
\]

Exercise 1 (Optional) Use the equation above to recover the classical Black-Scholes formula for a call option.

3.2 Discretization Schemes

We now focus on building accurate numerical schemes to solve the partial differential equation
\[
\partial_\tau u(\tau, x) = \frac{\sigma^2}{2} \partial^2_{xx} u(\tau, x), \tag{3.8}
\]
for $\tau > 0$ and $x$ in some interval $[x_L, x_U] \subseteq \mathbb{R}$, with (Dirichlet) boundary conditions $u(0, x) = g(x)$ (payoff at maturity), $u(\tau, x_L) = f_L(\tau)$ and $u(\tau, x_U) = f_U(\tau)$. The last two boundary conditions allow one to compute, for instance, the prices of barrier options such as up-and-out or down-and-out options. The two state-boundary points $x_L$ and $x_U$ may be infinite. The idea of finite-difference methods is to approximate each derivative by its first or second-order approximation, and then run a recursive algorithm starting from the time-boundary point.
We shall study three types of numerical methods to solve the heat equation (3.8). Each of them relies on a different discretisation scheme for the approximation of the time derivative $\partial_\tau$, while the space derivative $\partial_{xx}$ (and $\partial_x$ whenever needed) is always approximated by central differences:

• the implicit method uses a backward difference scheme, leading to an error of order $\Delta_t$;

• the explicit method uses a forward difference scheme, leading to an error of order $\Delta_t$;

• the Crank-Nicolson method uses a central difference scheme, leading to an error of order $\Delta_t^2$ (see the numerical check below).
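These orders of accuracy are easy to observe numerically. The following is our own small illustration (assuming numpy is available), differentiating a smooth test function with the three difference quotients:

```python
# A small numerical check (our own illustration) of the truncation orders quoted
# above: one-sided differences are O(h), the central difference is O(h^2).
import numpy as np

f, df = np.sin, np.cos          # test function and its exact derivative
x0 = 0.7

for h in [1e-1, 1e-2, 1e-3]:
    fwd = (f(x0 + h) - f(x0)) / h              # forward difference, O(h)
    bwd = (f(x0) - f(x0 - h)) / h              # backward difference, O(h)
    ctr = (f(x0 + h) - f(x0 - h)) / (2 * h)    # central difference, O(h^2)
    print(f"h={h:.0e}  fwd err={abs(fwd - df(x0)):.2e}  "
          f"bwd err={abs(bwd - df(x0)):.2e}  ctr err={abs(ctr - df(x0)):.2e}")
```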

Let us first start by constructing the time-space grid on which we will build the approximation scheme. The time boundaries are $0$ and $T > 0$ (the maturity of the option) and the space boundaries are $x_L$ and $x_U$. Let $m$ and $n$ be two integers. We consider a uniform grid, i.e. we split the space axis into $m$ intervals and the time axis into $n$ intervals, and we denote $\mathcal{I} \triangleq \{0, 1, \ldots, n\}$ and $\mathcal{J} \triangleq \{0, 1, \ldots, m\}$. This means that each point on the grid has coordinates $(i \Delta_t, x_L + j \Delta_x)$ for $i \in \mathcal{I}$ and $j \in \mathcal{J}$, where $\Delta_t \triangleq \frac{T}{n}$ and $\Delta_x \triangleq \frac{x_U - x_L}{m}$. At each node, we let $u_{i,j} \triangleq u(i \Delta_t, x_L + j \Delta_x)$ denote the value of the function $u$. Note in particular that the boundary conditions imply
\[
u_{0,j} = g(x_L + j \Delta_x), \qquad u_{i,0} = f_L(i \Delta_t), \qquad u_{i,m} = f_U(i \Delta_t).
\]

From now on, for a tridiagonal matrix $T \in \mathbb{R}^{m \times m}$, i.e.
\[
T \triangleq \begin{pmatrix}
a_1 & c_1 & 0 & \cdots & 0 \\
b_2 & a_2 & c_2 & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & c_{m-1} \\
0 & \cdots & 0 & b_m & a_m
\end{pmatrix}, \tag{3.9}
\]
we shall use the short-hand notation $T = T_m(a, b, c)$ for some $\mathbb{R}^m$-valued vectors $a$, $b$ and $c$, or simply $T = T_m(a, b, c)$ with scalar arguments when the entries within each of the vectors are all the same.
The heat equation (3.8) is an example of a convection-diffusion equation $-\partial_\tau + \lambda \partial_{xx} + \mu \partial_x = 0$, where $\lambda > 0$ is the diffusion coefficient and $\mu$ the convection coefficient. The schemes we shall study below are efficient (up to some precision) for solving this parabolic partial differential equation. However, when $\lambda$ is very small, these schemes are usually not accurate at the boundary layer, and we speak of a singular perturbation problem. In fact, formally setting $\lambda$ to zero changes the nature of the PDE, which becomes first-order hyperbolic: $-\partial_\tau + \mu \partial_x = 0$. Other methods have been proposed in the literature, and we refer the interested reader to [1] for an overview of these.

3.2.1 Explicit scheme

In the explicit scheme, the time derivative $\partial_\tau$ is evaluated using the forward difference scheme, while the second space derivative $\partial_{xx}$ is approximated with a central difference scheme. More precisely, we consider the following approximations
\[
\begin{aligned}
\partial_\tau u(\tau, x) &= \frac{u(\tau + \Delta_t, x) - u(\tau, x)}{\Delta_t} + O(\Delta_t), \\
\partial_{xx} u(\tau, x) &= \frac{u(\tau, x + \Delta_x) - 2u(\tau, x) + u(\tau, x - \Delta_x)}{\Delta_x^2} + O(\Delta_x^2).
\end{aligned}
\]
Ignoring the terms of order $\Delta_t$ and $\Delta_x^2$, the heat equation (3.8) at the node $(i \Delta_t, x_L + j \Delta_x)$ becomes
\[
\frac{u_{i+1,j} - u_{i,j}}{\Delta_t} + O(\Delta_t) = \frac{\sigma^2}{2} \frac{u_{i,j+1} - 2u_{i,j} + u_{i,j-1}}{\Delta_x^2} + O(\Delta_x^2),
\]
which we can rewrite as
\[
u_{i+1,j} = \frac{\sigma^2 \Delta_t}{2 \Delta_x^2} u_{i,j+1} + \left( 1 - \frac{\sigma^2 \Delta_t}{\Delta_x^2} \right) u_{i,j} + \frac{\sigma^2 \Delta_t}{2 \Delta_x^2} u_{i,j-1} \tag{3.10}
\]
for all $i = 0, \ldots, n-1$ and $j = 1, \ldots, m-1$. Let us rewrite this in matrix form. To do so, define for each $i = 0, \ldots, n$ the vectors $u_i \in \mathbb{R}^{m-1}$ and $b_i \in \mathbb{R}^{m-1}$ as
\[
u_i \triangleq \left( u_{i,1}, \ldots, u_{i,m-1} \right)^T, \qquad b_i \triangleq \left( u_{i,0}, 0, \ldots, 0, u_{i,m} \right)^T,
\]
and the matrix $A \in \mathbb{R}^{(m-1) \times (m-1)}$ as
\[
A \triangleq T_{m-1}\left( 1 - \alpha\sigma^2, \frac{\alpha\sigma^2}{2}, \frac{\alpha\sigma^2}{2} \right) \equiv \begin{pmatrix}
1 - \alpha\sigma^2 & \frac{\alpha\sigma^2}{2} & 0 & \cdots & 0 \\
\frac{\alpha\sigma^2}{2} & 1 - \alpha\sigma^2 & \frac{\alpha\sigma^2}{2} & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & \frac{\alpha\sigma^2}{2} \\
0 & \cdots & 0 & \frac{\alpha\sigma^2}{2} & 1 - \alpha\sigma^2
\end{pmatrix}, \tag{3.11}
\]
where we introduced the quantity $\alpha \triangleq \Delta_t / \Delta_x^2$. The recursion (3.10) thus becomes
\[
u_{i+1} = A u_i + \frac{\alpha\sigma^2}{2} b_i, \qquad \text{for each } i = 0, \ldots, n-1,
\]
where the time boundary condition reads $u_0 = (u_{0,1}, \ldots, u_{0,m-1})^T = \left( g(x_L + \Delta_x), \ldots, g(x_L + (m-1)\Delta_x) \right)^T$.
It is easy to recognize that the recursion in (3.10) is identical to what we found in one of the parameterizations of the trinomial trees we considered.
Note that as soon as the inequality $\sigma^2 \Delta_t / \Delta_x^2 \leq 1$ is satisfied, Eq. (3.10) implies that $u_{i+1,j}$ is a convex combination of the neighbouring three nodes at the previous time $i\Delta_t$. If the initial datum $u_{0,\cdot}$ is bounded, say $\underline{u} \leq u_{0,j} \leq \bar{u}$ for all $j \in \mathcal{J}$ and for some constants $\underline{u}$ and $\bar{u}$, then the inequalities $\underline{u} \leq u_{i,j} \leq \bar{u}$ remain true for all $j \in \mathcal{J}$ and all $i \in \mathcal{I}$. This condition on $\Delta_t$ and $\Delta_x$ is called the CFL condition (after Richard Courant, Kurt Friedrichs, and Hans Lewy, who introduced it for finite difference schemes of some classes of partial differential equations in 1928, see [3]), and it clearly prevents the solution from developing unbounded oscillations. This was the same condition we found for the stability of trinomial trees. The stability of the discretisation scheme will be made rigorous in Section 3.2.5 below.
The explicit scheme, as well as the other schemes that will follow, computes the value of the function $u$ at some points on the grid. In the case of the Black-Scholes model, we have performed quite a few changes of variables from the stock price $S$ to the space variable $x$. Fix some time $t \geq 0$ (or remaining time $\tau$). If one wants to compute the option value at some point $S$, the grid will in general not match the corresponding $x$ value exactly. In that case one can perform some form of interpolation between the two grid points that are closest to $x$. A minimal implementation of the scheme is sketched below.
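The following is our own minimal sketch of the explicit recursion (3.10) with Dirichlet boundaries, assuming numpy is available; the function and variable names are our choices, not the course's reference implementation:

```python
# Explicit finite-difference scheme for the heat equation (3.8): a minimal
# sketch of the recursion (3.10) with Dirichlet boundary conditions.
import numpy as np

def explicit_heat(g, f_L, f_U, sigma, T, x_L, x_U, n, m):
    """March u_{i+1} = A u_i + (alpha sigma^2/2) b_i forward in tau."""
    dt, dx = T / n, (x_U - x_L) / m
    x = x_L + dx * np.arange(m + 1)
    c = sigma**2 * dt / (2 * dx**2)           # alpha * sigma^2 / 2
    assert 2 * c <= 1.0, "CFL condition sigma^2 dt/dx^2 <= 1 violated"
    u = g(x)                                  # time boundary (payoff), i = 0
    for i in range(n):
        tau = (i + 1) * dt
        new = u.copy()
        new[1:-1] = c * u[2:] + (1 - 2 * c) * u[1:-1] + c * u[:-2]
        new[0], new[-1] = f_L(tau), f_U(tau)  # Dirichlet boundaries
        u = new
    return x, u

# Example: the hat-squared initial condition on [0, 1] with absorbing boundaries.
x, u = explicit_heat(lambda x: np.minimum(x, 1 - x)**2,
                     lambda t: 0.0, lambda t: 0.0,
                     sigma=1.0, T=1.0, x_L=0.0, x_U=1.0, n=1000, m=10)
```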
3.2.2 Implicit scheme

In the implicit scheme, the time derivative $\partial_\tau$ is evaluated using the backward difference scheme, while the second space derivative $\partial_{xx}$ is approximated with a central difference scheme. Ignoring the errors of orders $\Delta_t$ and $\Delta_x^2$, the heat equation (3.8) therefore becomes
\[
\frac{u(\tau, x) - u(\tau - \Delta_t, x)}{\Delta_t} = \frac{\sigma^2}{2} \frac{u(\tau, x + \Delta_x) - 2u(\tau, x) + u(\tau, x - \Delta_x)}{\Delta_x^2},
\]
which, at the node $(i \Delta_t, x_L + j \Delta_x)$, reads
\[
\frac{u_{i,j} - u_{i-1,j}}{\Delta_t} = \frac{\sigma^2}{2} \frac{u_{i,j+1} - 2u_{i,j} + u_{i,j-1}}{\Delta_x^2}.
\]
Similarly as for the explicit scheme, we can reorganise the equality and we obtain
\[
u_{i-1,j} = -\frac{\alpha\sigma^2}{2} u_{i,j+1} + \left( 1 + \alpha\sigma^2 \right) u_{i,j} - \frac{\alpha\sigma^2}{2} u_{i,j-1}, \tag{3.12}
\]
where as before we set $\alpha \triangleq \Delta_t / \Delta_x^2$. As in the explicit scheme, define for each $i = 1, \ldots, n$ the vectors $u_i \in \mathbb{R}^{m-1}$, $b_i \in \mathbb{R}^{m-1}$ and the matrix $A \in \mathbb{R}^{(m-1) \times (m-1)}$ by
\[
u_i \triangleq \left( u_{i,1}, \ldots, u_{i,m-1} \right)^T, \qquad b_i \triangleq \left( u_{i,0}, 0, \ldots, 0, u_{i,m} \right)^T, \qquad A \triangleq T_{m-1}\left( 1 + \alpha\sigma^2, -\frac{\alpha\sigma^2}{2}, -\frac{\alpha\sigma^2}{2} \right).
\]
The recursion (3.12) becomes
\[
u_{i-1} = A u_i - \frac{\alpha\sigma^2}{2} b_i, \qquad \text{for each } i = 1, \ldots, n,
\]
so that, given $u_{i-1}$, the tridiagonal linear system has to be solved for $u_i$ at every step, with boundary condition $u_0 = (u_{0,1}, \ldots, u_{0,m-1})^T = \left( g(x_L + \Delta_x), \ldots, g(x_L + (m-1)\Delta_x) \right)^T$.
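In practice each implicit step is one tridiagonal solve. Here is our own sketch of such a step, assuming scipy is available (solve_banded is the standard banded solver in scipy.linalg; the variable names are ours):

```python
# One backward-Euler (implicit) step for the heat equation: solve the
# tridiagonal system A u_i = rhs, with A = T_{m-1}(1 + alpha*sigma^2, -c, -c).
import numpy as np
from scipy.linalg import solve_banded

def implicit_step(u_prev, c, bc_lo, bc_hi):
    """u_prev: interior values u_{i-1}; c = alpha*sigma^2/2; bc_*: u_{i,0}, u_{i,m}."""
    m1 = len(u_prev)                      # m-1 interior nodes
    ab = np.zeros((3, m1))                # banded storage: upper, main, lower
    ab[0, 1:] = -c                        # superdiagonal
    ab[1, :] = 1 + 2 * c                  # main diagonal (1 + alpha*sigma^2)
    ab[2, :-1] = -c                       # subdiagonal
    rhs = u_prev.copy()
    rhs[0] += c * bc_lo                   # boundary contributions (alpha sigma^2/2) b_i
    rhs[-1] += c * bc_hi
    return solve_banded((1, 1), ab, rhs)
```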

3.2.3 Crank-Nicolson scheme

The Crank-Nicolson scheme uses the central difference approximation for the first-order time derivative. It was first described in 1947 (see [4]), and was subsequently fully developed at the Los Alamos National Laboratory. Let us consider the point $\left( i\Delta_t + \frac{1}{2}\Delta_t, x_L + j\Delta_x \right)$, and perform a Taylor series expansion around it. The central difference approximation for the time derivative gives
\[
\partial_\tau u \Big|_{\left( i\Delta_t + \frac{1}{2}\Delta_t,\, x_L + j\Delta_x \right)} = \frac{u_{i+1,j} - u_{i,j}}{\Delta_t} + O(\Delta_t^2).
\]
For the space derivative, we average the central differences between the points $(i, j)$ and $(i+1, j)$:
\[
\partial_{xx} u \Big|_{\left( \left(i + \frac{1}{2}\right)\Delta_t,\, x_L + j\Delta_x \right)} = \frac{1}{2} \frac{u_{i+1,j+1} - 2u_{i+1,j} + u_{i+1,j-1}}{\Delta_x^2} + \frac{1}{2} \frac{u_{i,j+1} - 2u_{i,j} + u_{i,j-1}}{\Delta_x^2} + O(\Delta_x^2).
\]
The heat equation (3.8) thus becomes, after some reorganisation,
\[
-\frac{\alpha\sigma^2}{4} u_{i+1,j+1} + \left( 1 + \frac{\alpha\sigma^2}{2} \right) u_{i+1,j} - \frac{\alpha\sigma^2}{4} u_{i+1,j-1} = \frac{\alpha\sigma^2}{4} u_{i,j+1} + \left( 1 - \frac{\alpha\sigma^2}{2} \right) u_{i,j} + \frac{\alpha\sigma^2}{4} u_{i,j-1}.
\]
In matrix form, this reads
\[
C u_{i+1} = D u_i + \frac{1}{2}\alpha\sigma^2 b_i, \qquad \text{for } i = 0, \ldots, n-1,
\]
where
\[
C \triangleq T_{m-1}\left( 1 + \frac{\alpha\sigma^2}{2}, -\frac{\alpha\sigma^2}{4}, -\frac{\alpha\sigma^2}{4} \right), \qquad D \triangleq T_{m-1}\left( 1 - \frac{\alpha\sigma^2}{2}, \frac{\alpha\sigma^2}{4}, \frac{\alpha\sigma^2}{4} \right),
\]
\[
b_i \triangleq \left( \frac{u_{i,0} + u_{i+1,0}}{2}, 0, \ldots, 0, \frac{u_{i,m} + u_{i+1,m}}{2} \right)^T \in \mathbb{R}^{m-1}.
\]

3.2.4 Generalization to θ-schemes

One may wonder why these three schemes (implicit, explicit and Crank-Nicolson) lead to similar recurrence relations. Let us cast a new look at the heat equation (3.8) with $\sigma = \sqrt{2}$. A Taylor series expansion at some point $(\tau, x)$ in the $\tau$ direction gives $u(\tau + \Delta_t, x) = e^{\Delta_t \partial_\tau} u(\tau, x)$, where we write the operator $e^{\Delta_t \partial_\tau}$ as a compact version of $1 + \Delta_t \partial_\tau + \frac{1}{2}\Delta_t^2 \partial^2_{\tau\tau} + \ldots$. This implies that
\[
u(\tau + \Delta_t, x) - u(\tau, x) = \left( e^{\Delta_t \partial_\tau} - 1 \right) u(\tau, x) \triangleq \Delta_\tau u(\tau, x),
\]
and hence $\partial_\tau = \Delta_t^{-1} \log(1 + \Delta_\tau)$. $\Delta_\tau$ is therefore a one-sided difference operator in the time variable. A Taylor expansion leads to
\[
\partial_\tau = \frac{1}{\Delta_t} \Delta_\tau - \frac{1}{2\Delta_t} \Delta_\tau^2 + O(\Delta_\tau^3).
\]
For the central difference scheme, let $\delta_x$ be the central difference operator defined by $\delta_x u(\tau, x) \triangleq u\left( \tau, x + \frac{1}{2}\Delta_x \right) - u\left( \tau, x - \frac{1}{2}\Delta_x \right)$, which we can also write as
\[
\delta_x = \exp\left( \frac{\Delta_x}{2}\partial_x \right) - \exp\left( -\frac{\Delta_x}{2}\partial_x \right) = 2\sinh\left( \frac{\Delta_x \partial_x}{2} \right)
\]
in terms of the operator $\partial_x$. Therefore $\partial_x = \frac{2}{\Delta_x}\,\mathrm{asinh}\left( \frac{\delta_x}{2} \right)$, and Taylor expansions give
\[
\partial_x = \frac{1}{\Delta_x}\left( \delta_x - \frac{\delta_x^3}{24} \right) + O(\delta_x^5), \qquad \partial_{xx} = \frac{1}{\Delta_x^2}\left( \delta_x^2 - \frac{\delta_x^4}{12} \right) + O(\delta_x^6),
\]
where we recall that $\mathrm{asinh}(z) = z - \frac{1}{6}z^3 + O(z^5)$ for $z$ close to zero. Note further that $\delta_x^2 u(\tau, x) = u(\tau, x + \Delta_x) - 2u(\tau, x) + u(\tau, x - \Delta_x)$. Between two points $(i\Delta_t, x_L + j\Delta_x)$ and $((i+1)\Delta_t, x_L + j\Delta_x)$ on the grid, the heat equation then reads, in operator form:
\[
u_{i+1,j} = e^{\Delta_t \partial_\tau} u_{i,j} = e^{\frac{1}{2}\sigma^2 \Delta_t \partial^2_{xx}} u_{i,j},
\]
where the first equality follows by Taylor expansion (in time) and the second one holds since the function $u$ solves the heat equation. Consider for instance $\tau_\theta = \theta i \Delta_t + (1 - \theta)(i+1)\Delta_t$, where $\theta$ is some fixed real number in $[0, 1]$, and denote $u^\theta_{ij}$ the value of the function $u$ at this point. Assuming they exist, a forward Taylor expansion gives $u^\theta_{ij} = e^{(1-\theta)\Delta_t \partial_\tau} u_{i,j} = e^{\frac{1}{2}(1-\theta)\sigma^2 \Delta_t \partial^2_{xx}} u_{i,j}$ and a backward Taylor expansion implies $u^\theta_{ij} = e^{-\theta \Delta_t \partial_\tau} u_{i+1,j} = e^{-\frac{1}{2}\theta \sigma^2 \Delta_t \partial^2_{xx}} u_{i+1,j}$, so that we can write
\[
e^{-\frac{1}{2}\theta \sigma^2 \Delta_t \partial^2_{xx}} u_{i+1,j} = e^{\frac{1}{2}(1-\theta)\sigma^2 \Delta_t \partial^2_{xx}} u_{i,j}.
\]
From the Taylor expansions for the differential operators $\partial_\tau$ and $\partial^2_{xx}$ derived above, we obtain, after tedious yet straightforward algebra, the so-called θ-recurrence scheme:
\[
\left( 1 + \alpha\sigma^2\theta \right) u_{i+1,j} - \frac{\alpha\sigma^2\theta}{2}\left( u_{i+1,j+1} + u_{i+1,j-1} \right) = \left( 1 - \alpha\sigma^2(1-\theta) \right) u_{i,j} + \frac{\alpha\sigma^2}{2}(1-\theta)\left( u_{i,j+1} + u_{i,j-1} \right),
\]
where we recall that $\alpha \triangleq \Delta_t / \Delta_x^2$. In matrix form, this can be rewritten as
\[
\left( I - \frac{\alpha\sigma^2\theta}{2} A \right) u_{i+1} = \left( I + \frac{\alpha\sigma^2(1-\theta)}{2} A \right) u_i + b_i, \qquad \text{for } i = 0, \ldots, n-1,
\]
where
\[
A \triangleq \begin{pmatrix}
-2 & 1 & 0 & \cdots & 0 \\
1 & -2 & 1 & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & 1 & -2 & 1 \\
0 & \cdots & 0 & 1 & -2
\end{pmatrix},
\]
and where $b_i$ represents the vector of boundary conditions. The explicit, implicit and Crank-Nicolson schemes are fully recovered by taking $\theta = 0$, $\theta = 1$ and $\theta = 1/2$, respectively.
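As a summary, here is our own sketch of a single θ-scheme step (reusing the scipy banded solver, and assuming for simplicity boundary values constant over the step); θ = 0, 1 and 1/2 reproduce the explicit, implicit and Crank-Nicolson steps, respectively:

```python
# One theta-scheme step for the heat equation: solve
# (I - c*theta*A) u_{i+1} = (I + c*(1-theta)*A) u_i + b_i,  A = tridiag(1,-2,1),
# with c = alpha*sigma^2/2. Our own sketch; assumes scipy is available.
import numpy as np
from scipy.linalg import solve_banded

def theta_step(u, c, theta, bc_lo, bc_hi):
    """u: interior values at time i; c = alpha*sigma^2/2.
    bc_lo, bc_hi: Dirichlet boundary values, assumed constant over the step."""
    m1 = len(u)
    # Pad with boundary values so that A u + b is a plain second difference.
    up = np.concatenate(([bc_lo], u, [bc_hi]))
    rhs = u + c * (1 - theta) * (up[2:] - 2 * up[1:-1] + up[:-2])
    rhs[0] += c * theta * bc_lo           # boundary part of the implicit side
    rhs[-1] += c * theta * bc_hi
    if theta == 0.0:                      # explicit scheme: nothing to solve
        return rhs
    ab = np.zeros((3, m1))                # banded storage of I - c*theta*A
    ab[0, 1:] = -c * theta
    ab[1, :] = 1 + 2 * c * theta
    ab[2, :-1] = -c * theta
    return solve_banded((1, 1), ab, rhs)
```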

3.2.5 Stability and convergence: von Neumann analysis

We start this section with a simple example. Suppose we are interested in solving the heat equation (3.8) with an explicit difference scheme, as developed in Sec. 3.2.1. We also consider the following boundary conditions:
\[
u(0, x) = x^2 \mathbf{1}_{\{x \in [0, 1/2]\}} + (1 - x)^2 \mathbf{1}_{\{x \in [1/2, 1]\}} \qquad \text{and} \qquad u(\tau, 0) = u(\tau, 1) = 0, \quad \text{for all } \tau \geq 0.
\]

[Figure 3.1: two panels of plots; only the caption is recoverable.]

Figure 3.1: Explicit finite difference method for the heat equation with $\Delta_x = 0.1$ and $\Delta_t = 0.001$ (left) and $\Delta_t = 0.1$ (right). The upper plots correspond to the solution at time 0 (initial condition) and the lower plots to the solution at time 1.

As shown in Fig. 3.1, increasing the time step for a fixed $\Delta_x$ leads to numerical instability, as we have already remarked before. We now move on to a rigorous justification of this fact.
Exercise (Homework 2) Reproduce numerically Fig. 3.1. Verify that such instability is not present in the Crank-Nicolson or the fully implicit scheme.
Let us start with a few preliminary definitions and notations. We shall restrict ourselves here, mainly for notational reasons, to the one-dimensional case as before, even though most of the discussion below carries over to more general spaces. Let $\mathcal{D}$ denote a (differential) operator acting on the space of smooth real functions $C^\infty(\mathbb{R}^+, \mathbb{R})$. In the case of the heat equation (3.8), $\mathcal{D} \triangleq L - \partial_\tau$, where $L = \frac{1}{2}\sigma^2 \partial_{xx}$ is the rescaled Laplace operator. In the following, we shall denote $\phi^*$ the (unique up to boundary point specification) solution to $\mathcal{D}(\phi^*) = 0$. We define further
\[
\widetilde{\mathcal{D}} \triangleq \left( \widetilde{\mathcal{D}}_{ij} \right)_{i \in \mathcal{I}, j \in \mathcal{J}}
\]
as the finite difference approximation of $\mathcal{D}$ on the grid $\mathcal{I} \times \mathcal{J}$, where for any $(i, j) \in \mathcal{I} \times \mathcal{J}$,
\[
\widetilde{\mathcal{D}}_{ij}(\phi) = 0
\]
is a so-called difference equation.

For any smooth real function $\phi \in C^\infty(\mathbb{R}^+, \mathbb{R})$, the truncation error $E_{ij}$ at the node $(i, j)$ is defined as
\[
E_{ij}(\phi) \triangleq \widetilde{\mathcal{D}}_{ij}(\phi) - \mathcal{D}(\phi)\Big|_{(i \Delta_t,\, x_L + j \Delta_x)}.
\]
Since $\phi^*$ solves the PDE $\mathcal{D}(\phi) = 0$, we immediately see that $E_{ij}(\phi^*) = \widetilde{\mathcal{D}}_{ij}(\phi^*)$, and we shall call this quantity the local truncation error at the lattice point $(i, j)$.

Definition 3.1 The finite difference scheme $(\widetilde{\mathcal{D}}_{ij})_{i \in \mathcal{I}, j \in \mathcal{J}}$ is said to be consistent if, for any $(i, j) \in \mathcal{I} \times \mathcal{J}$, the truncation error $E_{ij}(\phi)$ converges to zero as $\Delta_t$ and $\Delta_x$ tend to zero, for any smooth real function $\phi$.

As an example, let us consider the equation defined by $\mathcal{D}(\phi) = (\partial_t + \partial_x)(\phi) \equiv 0$. It is easy to show that the forward-time forward-space discretisation scheme
\[
\widetilde{\mathcal{D}}_{i,j}(\phi) \triangleq \frac{\phi_{i+1,j} - \phi_{i,j}}{\Delta_t} + \frac{\phi_{i,j+1} - \phi_{i,j}}{\Delta_x} \tag{3.13}
\]
is a consistent scheme.
Indeed, a Taylor expansion gives
\[
\begin{aligned}
\phi_{i+1,j} &= \phi_{i,j} + \Delta_t \partial_t \phi_{i,j} + \frac{1}{2}\Delta_t^2 \partial^2_{tt} \phi_{i,j} + O(\Delta_t^3), \\
\phi_{i,j+1} &= \phi_{i,j} + \Delta_x \partial_x \phi_{i,j} + \frac{1}{2}\Delta_x^2 \partial^2_{xx} \phi_{i,j} + O(\Delta_x^3).
\end{aligned}
\]
Plugging these expansions into the definition of $\widetilde{\mathcal{D}}$, we can compute the truncation error for any $(i, j) \in \mathcal{I} \times \mathcal{J}$ as
\[
\begin{aligned}
E_{i,j}(\phi) &= \frac{\phi_{i+1,j} - \phi_{i,j}}{\Delta_t} + \frac{\phi_{i,j+1} - \phi_{i,j}}{\Delta_x} - (\partial_t + \partial_x)\phi_{ij} \\
&= \left( \partial_t \phi_{i,j} + \frac{1}{2}\Delta_t \partial^2_{tt} \phi_{i,j} + O(\Delta_t^2) \right) + \left( \partial_x \phi_{i,j} + \frac{1}{2}\Delta_x \partial^2_{xx} \phi_{i,j} + O(\Delta_x^2) \right) - (\partial_t + \partial_x)\phi_{ij} \\
&= \frac{1}{2}\Delta_t \partial^2_{tt} \phi_{i,j} + \frac{1}{2}\Delta_x \partial^2_{xx} \phi_{i,j} + O(\Delta_t^2) + O(\Delta_x^2),
\end{aligned}
\]
which clearly converges to zero as $\Delta_x$ and $\Delta_t$ tend to zero.

Definition 3.2 We define the orders of accuracy $p^*$ (in $t$) and $q^*$ (in $x$) as
\[
(p^*, q^*) \triangleq \sup \left\{ (p, q) \in \mathbb{N}^2 : \sup_{(i,j) \in \mathcal{I} \times \mathcal{J}} |E_{ij}(\phi)| \leq C\left( \Delta_t^p + \Delta_x^q \right), \text{ for some } C > 0 \right\}.
\]

It is easy to see that for the explicit scheme, $p^* = 1$ and $q^* = 2$, whereas for the Crank-Nicolson scheme, $p^* = q^* = 2$.

Definition 3.3 Let $\widetilde{\phi} \triangleq (\widetilde{\phi}_{ij})_{i \in \mathcal{I}, j \in \mathcal{J}}$ be the exact solution of the finite difference approximation scheme $\widetilde{\mathcal{D}}$ (i.e. $\widetilde{\mathcal{D}}(\widetilde{\phi}) = 0$), and define the discretisation error $\epsilon$ by
\[
\epsilon_{ij} \triangleq \widetilde{\phi}_{ij} - \phi^*_{ij}, \qquad \text{for each } i \in \mathcal{I}, j \in \mathcal{J},
\]
namely the difference, at the point $(i, j)$ on the grid, between the solution of the difference equation and that of the differential equation.

Definition 3.4 The scheme is said to converge in the $\|\cdot\|$ norm if
\[
\lim_{(\Delta_t, \Delta_x) \to (0,0)} \; \sup_{i \in \mathcal{I}, j \in \mathcal{J}} \|\epsilon_{ij}\| = 0.
\]
In particular, on a grid with spacing $\Delta$ and dimension $m$, it is useful to define the norm $\|\cdot\|_\Delta$ by
\[
\|v\|_\Delta \triangleq \left( \Delta \sum_{j=1}^{m} v_j^2 \right)^{1/2},
\]
for any $\mathbb{R}^m$-valued vector $v = (v_j)_{j=1,\ldots,m}$. For a matrix $(\phi_{ij})_{i \in \mathcal{I}, j \in \mathcal{J}}$ defined on the grid $\mathcal{I} \times \mathcal{J}$, we define $\phi_{i\cdot}$, for any $i \in \mathcal{I}$, as the vector corresponding to the $i$-th row (in our case corresponding to the $i$-th time slice), so that $\phi_{i\cdot} \in \mathbb{R}^m$, where $m \triangleq \dim(\mathcal{J})$.

Definition 3.5 The finite difference scheme $\widetilde{\mathcal{D}}$ is said to be stable in some stability region $S$ if there exists $n_0 \in \mathbb{N}$ such that for all $T > 0$, there is a strictly positive constant $C_T$ satisfying
\[
\left\| \widetilde{\phi}_{n\cdot} \right\|^2_{\Delta_x} \leq C_T \sum_{i=0}^{n_0} \left\| \widetilde{\phi}_{i\cdot} \right\|^2_{\Delta_x}, \qquad \text{for all } 0 \leq n\Delta_t \leq T \text{ and } (\Delta_x, \Delta_t) \in S.
\]

The stability inequality above expresses the idea that the norm of the solution vector, at any point in time, is bounded by the sum of the norms of the solution vectors up to time $n_0$.
We can illustrate this definition with a scheme for the equation introduced before in Eq. (3.13). Consider a general forward-time forward-space discretisation scheme of the form
\[
\phi_{i+1,j} = \alpha \phi_{i,j} + \beta \phi_{i,j+1},
\]
for some $\alpha$ and $\beta$. Then
\[
\begin{aligned}
\|\phi_{i+1,\cdot}\|^2_{\Delta_x} &= \Delta_x \sum_j |\phi_{i+1,j}|^2 = \Delta_x \sum_j |\alpha \phi_{i,j} + \beta \phi_{i,j+1}|^2 \\
&\leq \Delta_x \sum_j \left( \alpha^2 |\phi_{i,j}|^2 + \beta^2 |\phi_{i,j+1}|^2 + 2|\alpha||\beta|\, |\phi_{i,j}|\, |\phi_{i,j+1}| \right) \\
&\leq \Delta_x \sum_j \left( \alpha^2 |\phi_{i,j}|^2 + \beta^2 |\phi_{i,j+1}|^2 + |\alpha||\beta| \left( |\phi_{i,j}|^2 + |\phi_{i,j+1}|^2 \right) \right),
\end{aligned}
\]
since $2xy \leq x^2 + y^2$ for any $(x, y) \in \mathbb{R}^2$. Splitting the indices $j$ and $j+1$, we obtain
\[
\begin{aligned}
\|\phi_{i+1,\cdot}\|^2_{\Delta_x} &\leq \Delta_x \left( \alpha^2 + |\alpha||\beta| \right) \sum_j |\phi_{i,j}|^2 + \Delta_x \left( \beta^2 + |\alpha||\beta| \right) \sum_j |\phi_{i,j+1}|^2 \\
&\leq \Delta_x \left( \alpha^2 + 2|\alpha||\beta| + \beta^2 \right) \sum_j |\phi_{i,j}|^2 \\
&= \left( |\alpha| + |\beta| \right)^2 \Delta_x \sum_j |\phi_{i,j}|^2.
\end{aligned}
\]
Therefore $\|\phi_{i+1,\cdot}\|^2_{\Delta_x} \leq (|\alpha| + |\beta|)^2 \|\phi_{i\cdot}\|^2_{\Delta_x}$. Note that we have here omitted the boundary terms arising from the second sum in the first line. We shall assume here that they are not relevant. Repeating this, we obtain
\[
\|\phi_{i,\cdot}\|^2_{\Delta_x} \leq \left( |\alpha| + |\beta| \right)^{2i} \|\phi_{0\cdot}\|^2_{\Delta_x},
\]
and hence the scheme is stable if and only if $|\alpha| + |\beta| \leq 1$.

Definition 3.6 The partial differential equation $\mathcal{D}(\phi) = 0$ is said to be well-posed if for each $T > 0$ there exists $C_T > 0$ such that
\[
\|\phi(t, \cdot)\|_{L^2(\mathbb{R})} \leq C_T \|\phi(0, \cdot)\|_{L^2(\mathbb{R})},
\]
for all $t \in [0, T]$ and for any solution $\phi$, where the $L^2$-norm of a function $\phi: \mathbb{R} \to \mathbb{R}$ is $\|\phi\|_{L^2(\mathbb{R})} \triangleq \left( \int_{\mathbb{R}} \phi(x)^2\, dx \right)^{1/2}$.

We can now state the main result of this section, which provides a full characterisation of the convergence of a scheme.

Theorem 3.7 (Lax Equivalence Theorem) A consistent finite difference scheme for a well-posed linear initial-value problem converges if and only if it is stable.

One may wonder how this result changes when studying an inhomogeneous PDE such as $\mathcal{D}(\phi) = g$ for some function $g$. Duhamel's principle says that the solution to such a problem can be written as the superposition of solutions to the homogeneous PDE $\mathcal{D}(\phi) = 0$. Therefore the concepts of well-posedness and stability carry over from the homogeneous case.
The importance of this theorem stems from the fact that convergence is not easy to check directly from the definition. However, well-posedness and stability are much easier to check in practice.
By means of Fourier methods, we shall now give a precise and easy-to-check condition for the stability of a finite difference scheme. Let $v = (v_{i,j})$ be a function defined on a grid with space increment $\Delta_x$ and time increment $\Delta_t$. We assume for now that there is no restriction in the space domain (i.e. $x \in \mathbb{R}$), and we fix some time $n\Delta_t$. The discrete Fourier transform of the vector $v_n$ is defined as
\[
\widehat{v}_n(\xi) \triangleq \frac{1}{\sqrt{2\pi}} \sum_{m=-\infty}^{\infty} \Delta_x\, e^{-im\Delta_x \xi}\, v_{n,m}, \qquad \text{for } \xi \in \Pi_x \triangleq \left[ -\frac{\pi}{\Delta_x}, \frac{\pi}{\Delta_x} \right], \tag{3.14}
\]
and we have the inverse Fourier transform formula
\[
v_{n,m} = \frac{1}{\sqrt{2\pi}} \int_{-\pi/\Delta_x}^{\pi/\Delta_x} e^{im\Delta_x \xi}\, \widehat{v}_n(\xi)\, d\xi. \tag{3.15}
\]

Assume now that we have a (one-step in time) finite difference scheme, which we write as
\[
v_{n+1,m} = \sum_{j=-d}^{u} \alpha_j(\Delta_t, \Delta_x)\, v_{n,m+j}. \tag{3.16}
\]
This means that at each grid point $((n+1)\Delta_t, m\Delta_x)$ we can write the scheme using some of the grid points at time $n\Delta_t$ (just like in the explicit scheme). The positive integers $d$ and $u$ represent how far up and down we have to go along the grid at time $n\Delta_t$. Applying the inverse Fourier transform formula (3.15) to the finite difference scheme (3.16), we have
\[
\begin{aligned}
\frac{1}{\sqrt{2\pi}} \int_{-\pi/\Delta_x}^{\pi/\Delta_x} e^{im\Delta_x \xi}\, \widehat{v}_{n+1}(\xi)\, d\xi = v_{n+1,m} &= \sum_{j=-d}^{u} \alpha_j(\Delta_t, \Delta_x)\, v_{n,m+j} \\
&= \sum_{j=-d}^{u} \alpha_j(\Delta_t, \Delta_x) \frac{1}{\sqrt{2\pi}} \int_{-\pi/\Delta_x}^{\pi/\Delta_x} e^{i(m+j)\Delta_x \xi}\, \widehat{v}_n(\xi)\, d\xi \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\pi/\Delta_x}^{\pi/\Delta_x} e^{im\Delta_x \xi} \sum_{j=-d}^{u} \alpha_j(\Delta_t, \Delta_x)\, e^{ij\Delta_x \xi}\, \widehat{v}_n(\xi)\, d\xi,
\end{aligned}
\]
which implies, by unicity of the Fourier transform, that
\[
\widehat{v}_{n+1}(\xi) = \sum_{j=-d}^{u} \alpha_j(\Delta_t, \Delta_x)\, e^{ij\Delta_x \xi}\, \widehat{v}_n(\xi) \triangleq \zeta(\xi, \Delta_t, \Delta_x)\, \widehat{v}_n(\xi),
\]
for all $\xi \in \Pi_x$, where the function $\zeta$ is defined in an obvious way, and does not depend on $n$. Iterating this equality, we obtain
\[
\widehat{v}_n(\xi) = \zeta(\xi, \Delta_t, \Delta_x)^n\, \widehat{v}_0(\xi). \tag{3.17}
\]
Recalling Parseval's identity,
\[
\|\widehat{v}_n\|^2_{L^2(\Pi_x)} \triangleq \int_{-\pi/\Delta_x}^{\pi/\Delta_x} |\widehat{v}_n(\xi)|^2\, d\xi = \Delta_x \sum_{m=-\infty}^{\infty} |v_{n,m}|^2 = \|v_{n,\cdot}\|^2_{\Delta_x}, \tag{3.18}
\]
and recalling the definition of stability (Definition 3.5) of a finite difference scheme, we see that it should be possible to express stability simply in terms of the function $\zeta$. The following theorem makes this precise and provides us with a ready-to-use condition to check stability.

Theorem 3.8 A (one-step in time) finite difference scheme is stable if and only if there exist $K > 0$ (independent of $\xi$, $\Delta_x$, $\Delta_t$) and $\Delta_{t}^0$, $\Delta_{x}^0$ such that
\[
|\zeta(\xi, \Delta_t, \Delta_x)| \leq 1 + K\Delta_t,
\]
for all $\xi$, $\Delta_t \in (0, \Delta_t^0]$ and $\Delta_x \in (0, \Delta_x^0]$. In particular, when the function $\zeta$ does not depend on $\Delta_t$ and $\Delta_x$, the condition $|\zeta(\xi)| \leq 1$ is sufficient.

The factor $\zeta$ is called the amplification factor, and this analysis is called von Neumann analysis, in memory of its founder. Assuming the condition stated in Theorem 3.8 holds for the function $\zeta$ in (3.17), we now see that (3.18) gives
\[
\begin{aligned}
\|v_{n,\cdot}\|^2_{\Delta_x} = \|\widehat{v}_n\|^2_{L^2(\Pi_x)} &= \int_{-\pi/\Delta_x}^{\pi/\Delta_x} |\zeta(\xi, \Delta_t, \Delta_x)|^{2n}\, |\widehat{v}_0(\xi)|^2\, d\xi \\
&\leq \left( 1 + K\Delta_t \right)^{2n} \|\widehat{v}_0\|^2_{L^2(\Pi_x)} = \left( 1 + K\Delta_t \right)^{2n} \|v_{0,\cdot}\|^2_{\Delta_x},
\end{aligned}
\]
which indeed implies stability, according to Definition 3.5.


We now apply this result to the three finite di↵erence schemes developed above to the heat equation
2
@⌧ u = @xx u,

with > 0.

3.2.5.1 Application to θ-schemes

Let us first consider the von Neumann analysis of the explicit scheme. The explicit finite difference (3.10) can be rewritten as
\[
u_{n+1,m} = u_{n,m} + \alpha\lambda \left( u_{n,m+1} - 2u_{n,m} + u_{n,m-1} \right), \tag{3.19}
\]
where we recall that $\alpha \triangleq \Delta_t / \Delta_x^2$ and $\lambda = \sigma^2/2$. Writing $u_{n,m}$ in terms of its Fourier transform (3.15) and using the relation (3.17), the amplification factor reads
\[
\zeta(\xi, \Delta_t, \Delta_x) = 1 + \alpha\lambda \left( e^{i\xi\Delta_x} - 2 + e^{-i\xi\Delta_x} \right) = 1 + 2\alpha\lambda \left( \cos(\xi\Delta_x) - 1 \right) = 1 - 4\alpha\lambda \sin\left( \frac{\xi\Delta_x}{2} \right)^2.
\]
Hence $|\zeta(\xi, \Delta_t, \Delta_x)| \leq 1$ if and only if $\alpha\lambda \leq 1/2$, or
\[
\frac{\sigma^2 \Delta_t}{\Delta_x^2} \leq 1,
\]
which is the same condition we found in the analysis of trinomial trees. The scheme is hence conditionally stable. The implicit finite difference Eq. (3.12) can be rewritten as
\[
u_{n,m} = u_{n-1,m} + \alpha\lambda \left( u_{n,m+1} - 2u_{n,m} + u_{n,m-1} \right). \tag{3.20}
\]
The amplification factor is
\[
\zeta(\xi, \Delta_t, \Delta_x) = \left( 1 - \alpha\lambda \left( e^{i\xi\Delta_x} - 2 + e^{-i\xi\Delta_x} \right) \right)^{-1} = \left( 1 + 4\alpha\lambda \sin\left( \frac{\xi\Delta_x}{2} \right)^2 \right)^{-1},
\]
and the inequality $|\zeta(\xi, \Delta_t, \Delta_x)| \leq 1$ always holds. The scheme is therefore unconditionally stable.

Exercise 2 Prove that the amplification factor for the Crank-Nicolson scheme is
\[
\zeta(\xi, \Delta_t, \Delta_x) = \frac{1 - 2\alpha\lambda \sin\left( \frac{\xi\Delta_x}{2} \right)^2}{1 + 2\alpha\lambda \sin\left( \frac{\xi\Delta_x}{2} \right)^2},
\]
and conclude on the stability of the scheme.

Similarly, it is possible to show that the θ-scheme is unconditionally stable for $\theta \geq 1/2$.
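The three amplification factors can also be compared numerically. The following is our own small illustration, scanning the Fourier variable for a CFL-violating parameter choice:

```python
# Our own numerical illustration of the amplification factors derived above,
# scanned over the Fourier variable xi for a CFL-violating choice alpha*lam = 1.
import numpy as np

alpha_lam = 1.0                                    # alpha * lambda
s2 = np.sin(np.linspace(0, np.pi, 201) / 2) ** 2   # sin(xi*dx/2)^2
zeta_explicit = 1 - 4 * alpha_lam * s2
zeta_implicit = 1 / (1 + 4 * alpha_lam * s2)
zeta_cn = (1 - 2 * alpha_lam * s2) / (1 + 2 * alpha_lam * s2)

print("max |zeta| explicit:", np.abs(zeta_explicit).max())   # 3.0 > 1: unstable
print("max |zeta| implicit:", np.abs(zeta_implicit).max())   # <= 1: stable
print("max |zeta| CN      :", np.abs(zeta_cn).max())         # <= 1: stable
```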

3.2.6 Convergence analysis via matrices

We review here the convergence analysis from a different, albeit equivalent, point of view.

3.2.6.1 A crash course on matrix norms

We recall here some basic facts about vector and matrix norms, which shall be useful for a full understanding of the matrix approach to convergence of finite difference schemes. We let $x \triangleq (x_1, \ldots, x_n)$ be a vector in $\mathbb{C}^n$ for some fixed $n \in \mathbb{N}^*$. The norm $\|\cdot\|$ of a vector is a real non-negative number that gives a measure of its size. It has to satisfy the following properties:

• $\|x\| > 0$ if $x \neq 0$ and $\|x\| = 0$ if $x = 0$;

• $\|\alpha x\| = |\alpha|\, \|x\|$, for any $\alpha \in \mathbb{C}$;

• $\|x + y\| \leq \|x\| + \|y\|$, for any $x, y \in \mathbb{C}^n$.

The most common norms are

• the $L^1$-norm (or taxicab norm): $\|x\|_1 \triangleq \sum_{i=1}^n |x_i|$;

• the $L^p$-norm ($p \geq 1$): $\|x\|_p \triangleq \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}$;

• the infinity-norm: $\|x\|_\infty \triangleq \max_{i=1,\ldots,n} |x_i|$.

For a matrix $A = (a_{ij}) \in \mathbb{C}^{n \times n}$, we define its subordinate norm as follows:

Definition 3.9 Let $\|\cdot\|$ be a vector norm on $\mathbb{C}^n$. We then define the subordinate matrix norm (and, by a slight abuse of language, use the same notation $\|\cdot\|$) by
\[
\|A\| \triangleq \sup_{x \in \mathbb{C}^n \setminus \{0\}} \frac{\|Ax\|}{\|x\|}.
\]

Norms of matrices measure in some sense their sizes. It is hence natural that they will play a role in the behaviour of expressions such as $A^p$ as $p$ tends to infinity. The right tool to study this is the spectral radius of a matrix, which we define as follows.

Definition 3.10 For a matrix $A \in \mathbb{C}^{n \times n}$, the spectral radius $\rho(A)$ is defined as the maximum modulus of the eigenvalues of $A$.

For any subordinate norm the following bound always holds:

Lemma 3.11 Let $A \in M_n(\mathbb{C})$ and $\|\cdot\|$ a subordinate norm. For any $k \in \mathbb{N}^*$, $\rho(A) \leq \|A^k\|^{1/k}$.

Proof: Let $\lambda$ be an eigenvalue of $A$ with associated eigenvector $u \neq 0$; then
\[
|\lambda|^k \|u\| = \|\lambda^k u\| = \|A^k u\| \leq \|A^k\|\, \|u\|.
\]
Therefore $|\lambda|^k \leq \|A^k\|$ and the lemma follows.

The following theorem, proved by Gelfand, highlights the importance of the spectral radius, and its relationship with the asymptotic growth rate of the matrix norm.
Theorem 3.12 For any matrix norm $\|\cdot\|$ and any matrix $A \in \mathbb{C}^{n \times n}$, $\rho(A) = \lim_{k \to \infty} \|A^k\|^{1/k}$.

We shall pay special attention to the 2-norm of real symmetric matrices $A \in \mathbb{R}^{n \times n}$. Indeed, this is the only norm that can be represented via an inner product:
\[
\|Ax\|_2^2 = (Ax) \cdot (Ax) = x^T A^T A x,
\]
so that the subordinate matrix 2-norm can be defined as
\[
\|A\|_2^2 \triangleq \max \left\{ x^T A^T A x, \; x \in \mathbb{R}^n : \|x\| = 1 \right\}.
\]
Since the matrix $A^T A$ is real and symmetric, there exists an orthonormal basis of eigenvectors (with corresponding real eigenvalues $\lambda_1, \ldots, \lambda_n$), so that
\[
\|A\|_2^2 = \max_{z \in \mathbb{R}^n : \|z\|_2 = 1} \sum_{i=1}^n \lambda_i z_i^2,
\]
and hence (note that $A^T A$ is positive semidefinite and hence $|\lambda_i| = \lambda_i$)
\[
\|A\|_2^2 = \max_i \lambda_i = \rho\left( A^T A \right),
\]
where the last equality follows from the definition of the spectral radius. In the particular case where the matrix $A$ itself is symmetric,
\[
\sqrt{\rho\left( A^T A \right)} = \sqrt{\rho(A)^2} = \rho(A),
\]
and therefore $\|A\|_2 = \rho(A)$. So, in the case of real symmetric matrices the spectral radius is equal to the 2-norm of the matrix.
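This identity is easy to verify numerically; a quick sketch of ours, assuming numpy:

```python
# Our own quick check that for a real symmetric matrix the subordinate 2-norm
# equals the spectral radius (largest absolute eigenvalue).
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                              # symmetrize

two_norm = np.linalg.norm(A, 2)                # largest singular value
spectral_radius = np.abs(np.linalg.eigvalsh(A)).max()
print(two_norm, spectral_radius)               # the two numbers coincide
```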

3.2.6.2 Convergence analysis

We have seen above that θ-schemes have the general representation $C u_{n+1} = A u_n + b_n$, where $C$ is the identity matrix in the explicit scheme and $A$ is the identity matrix in the implicit scheme, and the vector $b$ represents boundary conditions. Using the notation (3.9), let us define the matrix $T \in \mathbb{R}^{(m-1) \times (m-1)}$ by $T \triangleq T_{m-1}(-2, 1, 1)$, and rewrite the recurrence matrix equation as
\[
u_{n+1} = M u_n + \widetilde{b}_n, \tag{3.21}
\]
where the matrix $M$ takes the following form:

• Explicit scheme: $M = I + \alpha\lambda T$;

• Implicit scheme: $M = (I - \alpha\lambda T)^{-1}$;

• Crank-Nicolson scheme: $M = \left( I - \frac{1}{2}\alpha\lambda T \right)^{-1} \left( I + \frac{1}{2}\alpha\lambda T \right)$;

• General θ-scheme: $M = (I - \alpha\lambda\theta T)^{-1} (I + \alpha\lambda(1-\theta) T)$,

where $I$ denotes the identity matrix in $\mathbb{R}^{(m-1) \times (m-1)}$ and where the vector $\widetilde{b}_n$ of modified boundary conditions is straightforward to compute. We used here $\lambda \triangleq \sigma^2/2$ for notational convenience. The following theorem is a matrix reformulation of the von Neumann analysis above:

Theorem 3.13 If $\|M\|_2 \leq 1$, then the scheme is convergent.

Here the norm $\|\cdot\|_2$ represents the spectral radius of a real symmetric matrix, i.e. its largest absolute eigenvalue. This theorem can be understood with the following argument: let the vector $e_0$ denote a small perturbation of the initial condition $u_0$, and define $\widetilde{u}_n$ as the perturbed solution. By (3.21), we can write
\[
\widetilde{u}_n = M \widetilde{u}_{n-1} + \widetilde{b}_{n-1} = M^2 \widetilde{u}_{n-2} + M \widetilde{b}_{n-2} + \widetilde{b}_{n-1} = \ldots = M^n \widetilde{u}_0 + \sum_{i=0}^{n-1} M^{n-1-i}\, \widetilde{b}_i.
\]
Therefore the error satisfies $e_n \triangleq u_n - \widetilde{u}_n = M^n e_0$, and hence $\|e_n\|_2 = \|M^n e_0\|_2 \leq \|M^n\|_2\, \|e_0\|_2$. Since we want the error to remain bounded, we need to find a constant $C > 0$ such that $\|e_n\|_2 \leq C \|e_0\|_2$. It is clear that this will be satisfied as soon as $\|M\|_2 \leq 1$.
It is easy to show that the tridiagonal matrix $T_p(a, b, c) \in \mathbb{R}^{p \times p}$, where $(a, b, c) \in \mathbb{R}^3$ with $bc > 0$, has $p$ eigenvalues $(\lambda_k)_{1 \leq k \leq p}$ of the form
\[
\lambda_k = a + 2\sqrt{bc} \cos\left( \frac{\pi k}{p+1} \right), \qquad \text{for } k = 1, \ldots, p.
\]
For the θ-schemes above, we can apply this result to compute the $m-1$ eigenvalues of the tridiagonal matrix $T_{m-1}(-2, 1, 1)$ as
\[
\lambda_k^T = -4 \sin\left( \frac{k\pi}{2m} \right)^2, \qquad \text{for } k = 1, \ldots, m-1.
\]
The eigenvalues of the transition matrix $M$ in the implicit scheme then follow directly, and we obtain
\[
\lambda_k^M = \left( 1 + 4\alpha\lambda \sin\left( \frac{k\pi}{2m} \right)^2 \right)^{-1}, \qquad \text{for } k = 1, \ldots, m-1.
\]
Since the $\|\cdot\|_2$ norm of a symmetric matrix is equal to its spectral radius, i.e. the largest absolute eigenvalue, we obtain
\[
\|M\|_2 = \max_{k=1,\ldots,m-1} \left( 1 + 4\alpha\lambda \sin\left( \frac{k\pi}{2m} \right)^2 \right)^{-1} < 1,
\]
for any $\alpha > 0$. The implicit scheme is thus unconditionally stable, consistent and hence (by Theorem 3.7) convergent.
In the Crank-Nicolson scheme, the eigenvalues of the matrix $M$ read
\[
\lambda_k^M = \frac{1 - 2\alpha\lambda \sin\left( \frac{k\pi}{2m} \right)^2}{1 + 2\alpha\lambda \sin\left( \frac{k\pi}{2m} \right)^2},
\]
for $k = 1, \ldots, m-1$. Therefore
\[
\|M\|_2 = \max_{k=1,\ldots,m-1} \left| \lambda_k^M \right| < 1, \qquad \text{for any } \alpha > 0.
\]
The Crank-Nicolson scheme is therefore unconditionally stable and hence convergent.
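The closed-form eigenvalues used above are easily checked against a direct numerical diagonalisation; our own sketch:

```python
# Our own check of the eigenvalue formula for tridiagonal Toeplitz matrices:
# T_p(a, b, c) has eigenvalues a + 2*sqrt(b*c)*cos(pi*k/(p+1)), k = 1..p.
import numpy as np

p, a, b, c = 6, -2.0, 1.0, 1.0
T = a * np.eye(p) + b * np.eye(p, k=-1) + c * np.eye(p, k=1)

numeric = np.sort(np.linalg.eigvalsh(T))
analytic = np.sort(a + 2 * np.sqrt(b * c)
                   * np.cos(np.pi * np.arange(1, p + 1) / (p + 1)))
print(np.allclose(numeric, analytic))   # True
```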

Exercise 3 Perform such an analysis for the explicit scheme and discuss its stability.
Remark 3.14 We could have worked from the beginning with general θ-schemes, with $\theta \in [0, 1]$. In that case, a similar analysis of the eigenvalues of the transition matrix $M$ shows that the scheme is unconditionally stable as soon as $\theta \in [1/2, 1]$.

Exercise 4 Which condition on the parameters $\alpha$ and $\lambda$ ensures that θ-schemes are convergent when $\theta \in [0, 1/2)$?

3.3 Solving general second-order linear parabolic partial differential equations

We have so far concentrated our efforts on solving the heat equation, which enabled us to solve the Black-Scholes equation via the set of transformations developed in Section 3.1.2. However, the latter are not always available. Here we present a more general treatment. We start by refreshing how PDEs naturally emerge in derivative pricing problems beyond the standard Black-Scholes framework.

3.3.1 Kolmogorov's Equations and the Feynman-Kac Theorem

We have seen that in the Black-Scholes model derivatives prices can be expressed as expectations under certain probability measures or as solutions to PDEs. To explore this connection a bit further, let us consider again the asset price SDE
\[
dX(t) = \mu(t, X(t))\,dt + \sigma(t, X(t))\,dW(t), \qquad X(0) = X_0, \tag{3.22}
\]
and define a functional
\[
u(t, x) = E^P\left[ g(X(T)) \mid X(t) = x \right], \tag{3.23}
\]
for a function $g: \mathbb{R} \to \mathbb{R}$. Under regularity conditions on $g$, it is easy to see that the process $u(t, X(t))$, being a conditional expectation, must be a martingale. Proceeding informally, applying Ito's lemma we get, for $t \in [0, T)$,
\[
du(t) = u_t(t)\,dt + \sum_{i=1}^p u_{x_i}(t)\mu_i(t)\,dt + \frac{1}{2}\sum_{i=1}^p \sum_{j=1}^p u_{x_i x_j}(t)\Sigma_{i,j}(t)\,dt + O(dW(t)), \tag{3.24}
\]
where, as before, $\Sigma_{i,j}$ is the $(i,j)$-th element of $\sigma\sigma^\top$. For $u(t, X(t))$ to be a martingale, the term multiplying $dt$ in the equation above must be zero. Defining the operator
\[
L = \sum_{i=1}^p \mu_i(t, x) \frac{\partial}{\partial x_i} + \frac{1}{2}\sum_{i=1}^p \sum_{j=1}^p \Sigma_{i,j}(t, x) \frac{\partial^2}{\partial x_i \partial x_j}, \tag{3.25}
\]
we deduce that $u(t, x)$ satisfies the PDE
\[
\frac{\partial u(t, x)}{\partial t} + L u(t, x) = 0 \tag{3.26}
\]
with terminal condition $u(T, x) = g(x)$. We have therefore seen that, in general, in our Markovian diffusive setup conditional expectations do satisfy PDEs. This result is also known as the Feynman-Kac theorem. Such equations are known as the Kolmogorov backward equation for the SDE (3.22).
A family of functions $g$ of particular importance to many of our applications is
\[
g(x) = e^{ik^\top x}, \qquad k \in \mathbb{R}^p, \tag{3.27}
\]
where $i$ is the imaginary unit, such that $i^2 = -1$. In this case, $u(t, x)$ becomes the characteristic function of $X(T)$, conditional on $X(t) = x$.
For the Markov process $X(t)$ in (3.22), let us now introduce a transition density, given by
\[
p(t, x; s, y)\,dy \triangleq P\left( X(s) \in [y, y + dy] \mid X(t) = x \right), \qquad 0 \leq t \leq s \leq T. \tag{3.28}
\]
We can loosely think of the transition density as a special case of the functional $u(t, x)$ above, namely
\[
p(t, x; s, y) = E\left[ \delta(X(s) - y) \mid X(t) = x \right], \tag{3.29}
\]
where $\delta(\cdot)$ is the Dirac delta function. Sometimes transition densities are also known as Green's functions or fundamental solutions. It is immediate to show that the following property holds:
\[
u(t, x) = \int_{\mathbb{R}^p} g(y)\, p(t, x; T, y)\,dy, \qquad t \in [0, T]. \tag{3.30}
\]

Exercise 5 Show it.
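The representation (3.23) can also be illustrated with a Monte Carlo experiment in a case where everything is known in closed form; a minimal sketch of ours for a geometric Brownian motion:

```python
# A minimal Monte Carlo illustration (our own) of Eq. (3.23) for a geometric
# Brownian motion dX = mu*X dt + sigma*X dW, where X(T) | X(t)=x is lognormal:
# for g(x) = x, u(t,x) = E[X(T) | X(t)=x] = x * exp(mu*(T-t)).
import numpy as np

mu, sigma, x, t, T = 0.05, 0.2, 100.0, 0.0, 1.0
rng = np.random.default_rng(42)

Z = rng.standard_normal(1_000_000)
XT = x * np.exp((mu - sigma**2 / 2) * (T - t) + sigma * np.sqrt(T - t) * Z)

print(XT.mean())                 # Monte Carlo estimate of u(t, x)
print(x * np.exp(mu * (T - t)))  # closed form, ~105.13
```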
Under appropriate regularity conditions discussed in [5], the transition density solves the Kolmogorov backward equation
\[
\frac{\partial p(t, x)}{\partial t} + L p(t, x) = 0, \qquad (s, y) \text{ fixed}, \tag{3.31}
\]
subject to the boundary condition $p(s, x; s, y) = \delta(x - y)$.
In many applications (e.g., in calibration problems), it is useful to have a result that produces transition densities at future times $s \geq t$ from a known state at time $t$, rather than vice versa. For this, we first define an operator $L^*$ by
\[
L^* f(s, y) = -\sum_{i=1}^p \frac{\partial \left[ \mu_i(s, y) f(s, y) \right]}{\partial y_i} + \frac{1}{2}\sum_{i=1}^p \sum_{j=1}^p \frac{\partial^2 \left[ \Sigma_{i,j}(s, y) f(s, y) \right]}{\partial y_i \partial y_j}. \tag{3.32}
\]
In the transition density $p(t, x; s, y)$, now consider $(t, x)$ fixed and let $L^*$ operate on the resulting function of $s$ and $y$. Under additional regularity conditions, we then have the forward Kolmogorov equation (or Fokker-Planck equation)
\[
-\frac{\partial p(s, y)}{\partial s} + L^* p(s, y) = 0, \qquad (t, x) \text{ fixed}, \tag{3.33}
\]
subject to the boundary condition $p(t, x; t, y) = \delta(x - y)$.
We conclude with a useful generalization of transition densities, called Arrow-Debreu prices (or densities). These are essentially discounted versions of transition densities:
\[
G(t, x; s, y) \triangleq E\left[ e^{-\int_t^s r(u, X(u))\,du}\, \delta(X(s) - y) \,\Big|\, X(t) = x \right]. \tag{3.34}
\]
It is possible to show that Arrow-Debreu prices satisfy the following generalizations of the Kolmogorov equations, namely:
\[
\frac{\partial G(t, x; s, y)}{\partial t} + L G(t, x; s, y) = r(t, x)\, G(t, x; s, y), \qquad (s, y) \text{ fixed}, \tag{3.35}
\]
\[
-\frac{\partial G(t, x; s, y)}{\partial s} + L^* G(t, x; s, y) = r(s, y)\, G(t, x; s, y), \qquad (t, x) \text{ fixed}, \tag{3.36}
\]
subject to the boundary conditions $G(s, x; s, y) = \delta(x - y)$ and $G(t, x; t, y) = \delta(x - y)$, respectively.
We conclude by providing a useful extension to the Kolmogorov backward equation. Let us consider the extension of the PDE Eq. (3.26)
\[
\frac{\partial u(t, x)}{\partial t} + L u(t, x) + h(t, x) = r(t, x)\, u(t, x), \tag{3.37}
\]
where $h, r: [0, T] \times \mathbb{R}^p \to \mathbb{R}$. Given the boundary condition $u(T, x) = g(x)$, the Feynman-Kac solution is
\[
u(t, x) = E^P\left[ \beta(t, T)\, g(X(T)) + \int_t^T \beta(t, s)\, h(s, X(s))\,ds \,\Big|\, X(t) = x \right], \tag{3.38}
\]
where
\[
\beta(t, T) = \exp\left( -\int_t^T r(s, X(s))\,ds \right), \qquad t \in [0, T]. \tag{3.39}
\]
The result is easily understood from an application of Ito's lemma, similar to the one used above to motivate the backward Kolmogorov equation.

Exercise 6 Prove it.

Exercise 7 Verify that the Feynman-Kac result above reconciles with the Black-Scholes PDEs Eqs. (3.4) and (3.5) in the case in which the asset price follows a geometric Brownian motion.

In the following section we will introduce general schemes that are applicable to the PDEs in this section. Although we will focus on the backward PDEs, the same techniques can be applied to forward PDEs.

3.3.2 General θ-scheme

Let us consider the generic pricing (Kolmogorov backward) PDE, Eq. (3.26), for a derivative $V = V(t, x)$:
\[
\frac{\partial V}{\partial t} + \mu_x(t, x) \frac{\partial V}{\partial x} + \frac{1}{2}\sigma_x^2(t, x) \frac{\partial^2 V}{\partial x^2} = r(x)\, V, \tag{3.40}
\]
or
\[
\frac{\partial V}{\partial t} + L V = 0, \tag{3.41}
\]
where $L$ is the operator
\[
L = \mu_x(t, x) \frac{\partial}{\partial x} + \frac{1}{2}\sigma_x^2(t, x) \frac{\partial^2}{\partial x^2} - r(x), \tag{3.42}
\]
and where $V = V(t, x)$ satisfies a terminal condition $V(T, x) = g(x)$.
In order to solve the PDE above numerically, we discretize it on the rectangular domain $(t, x) \in [0, T] \times [x_L, x_U]$, where $x_U$ and $x_L$ are finite constants determined to span the domain which the underlying process is likely to visit. We first introduce two equidistant grids $\{t_i\}_{i=0}^{n}$ and $\{x_j\}_{j=0}^{m+1}$, where $t_i = iT/n \triangleq i\Delta_t$, $i = 0, 1, \ldots, n$, and $x_j = x_L + j(x_U - x_L)/(m+1) \triangleq x_L + j\Delta_x$, $j = 0, 1, \ldots, m+1$. The terminal value $V(T, x) = g(x)$ is imposed at $t_n = T$, and spatial boundary conditions are imposed at $x_0$ and $x_{m+1}$.
3.3.2.1 Discretization in the Space Direction and Dirichlet Boundary Conditions

We first focus on the spatial operator $L$ and restrict $x$ to take values in the interior of the spatial grid, $x \in \{x_j\}_{j=1}^{m}$. As opposed to the heat equation, we do have here a term in $\partial_x$; in order to be consistent with the $O(\Delta_x^2)$ order of accuracy of the scheme, central finite differences for this term will be necessary.
We introduce the discrete operator
\[
\widehat{L} = \mu_x(t, x)\, \hat{\delta}_x + \frac{1}{2}\sigma_x^2(t, x)\, \hat{\delta}_{xx} - r(x, t), \tag{3.43}
\]
where
\[
\hat{\delta}_x V(t, x_j) \triangleq \frac{V(t, x_{j+1}) - V(t, x_{j-1})}{2\Delta_x}, \tag{3.44}
\]
\[
\hat{\delta}_{xx} V(t, x_j) \triangleq \frac{V(t, x_{j+1}) + V(t, x_{j-1}) - 2V(t, x_j)}{\Delta_x^2} \tag{3.45}
\]
are the first and second order difference operators, both accurate to second order in $\Delta_x$, so that
\[
L = \widehat{L} + O(\Delta_x^2). \tag{3.46}
\]

Exercise 8 Show it.

For the Dirichlet boundary condition, we assume for instance that
\[
V(x_0, t) = f(t, x_0), \qquad V(x_{m+1}, t) = \bar{f}(t, x_{m+1}) \tag{3.47}
\]
for given functions $f, \bar{f}: [0, T] \times \mathbb{R} \to \mathbb{R}$. These functions can in some cases be explicitly imposed as part of the option specification (as is the case for knock-out options). In other cases, they are determined via an asymptotic analysis of the solution of the PDE.
Defining $V(t) = \left( V(t, x_1), \ldots, V(t, x_m) \right)^\top$ and, for $j = 1, \ldots, m$,
\[
c_j(t) \triangleq -\sigma_x^2(t, x_j)\, \Delta_x^{-2} - r(t, x_j), \tag{3.48}
\]
\[
u_j(t) \triangleq \frac{1}{2}\mu_x(t, x_j)\, \Delta_x^{-1} + \frac{1}{2}\sigma_x^2(t, x_j)\, \Delta_x^{-2}, \tag{3.49}
\]
\[
l_j(t) \triangleq -\frac{1}{2}\mu_x(t, x_j)\, \Delta_x^{-1} + \frac{1}{2}\sigma_x^2(t, x_j)\, \Delta_x^{-2}, \tag{3.50}
\]
we can write
\[
\widehat{L} V(t) = A(t) V(t) + \Omega(t), \tag{3.51}
\]
where $A$ is a tridiagonal matrix
\[
A(t) = \begin{pmatrix}
c_1(t) & u_1(t) & 0 & 0 & 0 & \cdots & 0 \\
l_2(t) & c_2(t) & u_2(t) & 0 & 0 & \cdots & 0 \\
0 & l_3(t) & c_3(t) & u_3(t) & 0 & \cdots & 0 \\
0 & 0 & l_4(t) & c_4(t) & u_4(t) & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \ddots & \ddots & 0 \\
0 & 0 & 0 & 0 & l_{m-1}(t) & c_{m-1}(t) & u_{m-1}(t) \\
0 & 0 & 0 & 0 & 0 & l_m(t) & c_m(t)
\end{pmatrix} \tag{3.52}
\]
and $\Omega(t)$ is a vector containing boundary values
\[
\Omega(t) = \begin{pmatrix}
l_1(t)\, f(t, x_0) \\
0 \\
\vdots \\
0 \\
u_m(t)\, \bar{f}(t, x_{m+1})
\end{pmatrix}. \tag{3.53}
\]
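In code, assembling $A(t)$ and $\Omega(t)$ is a few lines. A sketch of ours (assuming numpy; the model functions $\mu_x$, $\sigma_x$, $r$ and the two boundary functions are passed as vectorized callables, and the names follow Eqs. (3.48)-(3.53)):

```python
# Our own sketch of assembling the tridiagonal matrix A(t) of Eq. (3.52) and
# the boundary vector Omega(t) of Eq. (3.53) from the model functions.
import numpy as np

def assemble(mu, sig, r, f_lo, f_hi, t, x):
    """x: full grid x_0..x_{m+1}; returns (A, Omega) on the interior nodes."""
    xi, dx = x[1:-1], x[1] - x[0]
    c = -sig(t, xi)**2 / dx**2 - r(t, xi)                     # Eq. (3.48)
    u = 0.5 * mu(t, xi) / dx + 0.5 * sig(t, xi)**2 / dx**2    # Eq. (3.49)
    l = -0.5 * mu(t, xi) / dx + 0.5 * sig(t, xi)**2 / dx**2   # Eq. (3.50)
    A = np.diag(c) + np.diag(u[:-1], k=1) + np.diag(l[1:], k=-1)
    Omega = np.zeros(len(xi))
    Omega[0] = l[0] * f_lo(t, x[0])                           # l_1(t) f(t, x_0)
    Omega[-1] = u[-1] * f_hi(t, x[-1])                        # u_m(t) f_bar(t, x_{m+1})
    return A, Omega
```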

3.3.2.2 Time Discretization

Focusing the attention on a particular bucket $[t_i, t_{i+1}]$, the choice for the finite difference approximation of $\partial V / \partial t$ is obvious:
\[
\frac{\partial V}{\partial t} \approx \frac{V(t_{i+1}) - V(t_i)}{\Delta_t}. \tag{3.54}
\]
Not so obvious, however, is to which time in the interval $[t_i, t_{i+1}]$ we should associate this derivative. To be general, consider picking a time $t_i^{i+1}(\theta) \in [t_i, t_{i+1}]$, given by
\[
t_i^{i+1}(\theta) = (1 - \theta)t_{i+1} + \theta t_i, \tag{3.55}
\]
where $\theta \in [0, 1]$ is a parameter. We then write
\[
\frac{\partial V\left( t_i^{i+1}(\theta) \right)}{\partial t} \approx \frac{V(t_{i+1}) - V(t_i)}{\Delta_t}. \tag{3.56}
\]
By a Taylor expansion, it is easy to see that this expression is first-order accurate in the time step when $\theta \neq \frac{1}{2}$, and second-order accurate when $\theta = \frac{1}{2}$. Written compactly,
\[
\frac{\partial V\left( t_i^{i+1}(\theta) \right)}{\partial t} = \frac{V(t_{i+1}) - V(t_i)}{\Delta_t} + \mathbf{1}_{\{\theta \neq \frac{1}{2}\}} O(\Delta_t) + O(\Delta_t^2). \tag{3.57}
\]
(Exercise: Show it.) This result is intuitive because only in the case $\theta = \frac{1}{2}$ is the difference coefficient precisely central; for all other cases, the difference coefficient is either predominantly backward in time or predominantly forward in time. As we have seen already, the special cases of $\theta = 1$, $\theta = 0$, and $\theta = \frac{1}{2}$ are known as the fully implicit scheme, the fully explicit scheme, and the Crank-Nicolson scheme, respectively.
3.3.2.3 Backward Induction

We are now ready to write the full finite difference scheme.

First, we evaluate the matrix system in Eq. (3.51) at time $t_i^{i+1}(\theta)$ and expand the vector function $V\left( t_i^{i+1}(\theta) \right)$ so that it can be evaluated in terms of $V$ exactly on the time grid, namely
\[
A\left( t_i^{i+1}(\theta) \right) V\left( t_i^{i+1}(\theta) \right) = \theta A\left( t_i^{i+1}(\theta) \right) V(t_i) + (1 - \theta) A\left( t_i^{i+1}(\theta) \right) V(t_{i+1}) + \mathbf{1}_{\{\theta \neq \frac{1}{2}\}} O(\Delta_t) + O(\Delta_t^2). \tag{3.58}
\]
(Exercise: Show it.) Similarly, for the vector $\Omega\left( t_i^{i+1}(\theta) \right)$,
\[
\Omega\left( t_i^{i+1}(\theta) \right) = \theta \Omega(t_i) + (1 - \theta) \Omega(t_{i+1}) + \mathbf{1}_{\{\theta \neq \frac{1}{2}\}} O(\Delta_t) + O(\Delta_t^2). \tag{3.59}
\]
As a result, the discretized version of the PDE (3.40) can be written as
\[
\begin{aligned}
\frac{V(t_{i+1}) - V(t_i)}{\Delta_t} + \mathbf{1}_{\{\theta \neq \frac{1}{2}\}} O(\Delta_t) + O(\Delta_t^2) &= -A\left( t_i^{i+1}(\theta) \right) V\left( t_i^{i+1}(\theta) \right) - \Omega\left( t_i^{i+1}(\theta) \right) + O(\Delta_x^2) \\
&= -\theta A\left( t_i^{i+1}(\theta) \right) V(t_i) - (1 - \theta) A\left( t_i^{i+1}(\theta) \right) V(t_{i+1}) - \theta \Omega(t_i) - (1 - \theta) \Omega(t_{i+1}) \\
&\quad + \mathbf{1}_{\{\theta \neq \frac{1}{2}\}} O(\Delta_t) + O(\Delta_t^2) + O(\Delta_x^2). \tag{3.60}
\end{aligned}
\]
Multiplying through by $\Delta_t$ gives rise to the complete finite difference representation of the PDE solution at times $t_i$ and $t_{i+1}$:
\[
\left( I - \theta \Delta_t A\left( t_i^{i+1}(\theta) \right) \right) V(t_i) = \left( I + (1 - \theta) \Delta_t A\left( t_i^{i+1}(\theta) \right) \right) V(t_{i+1}) + (1 - \theta) \Delta_t \Omega(t_{i+1}) + \theta \Delta_t \Omega(t_i) + e_i^{i+1}, \tag{3.61}
\]
where $I$ is the $m \times m$ identity matrix, and $e_i^{i+1}$ is an error term
\[
e_i^{i+1} = \Delta_t O(\Delta_x^2) + \mathbf{1}_{\{\theta \neq \frac{1}{2}\}} O(\Delta_t^2) + O(\Delta_t^3). \tag{3.62}
\]
Let $\widehat{V}(t_i, x_j)$ denote the approximation to the true solution $V(t_i, x_j)$ obtained by using Eq. (3.61) without the error term. Defining
\[
\widehat{V}(t) = \left( \widehat{V}(t, x_1), \ldots, \widehat{V}(t, x_m) \right)^\top, \tag{3.63}
\]
we have
\[
\left( I - \theta \Delta_t A\left( t_i^{i+1}(\theta) \right) \right) \widehat{V}(t_i) = \left( I + (1 - \theta) \Delta_t A\left( t_i^{i+1}(\theta) \right) \right) \widehat{V}(t_{i+1}) + (1 - \theta) \Delta_t \Omega(t_{i+1}) + \theta \Delta_t \Omega(t_i). \tag{3.64}
\]
For a known value of $\widehat{V}(t_{i+1})$, Eq. (3.64) defines a simple linear system of equations that can be solved for $\widehat{V}(t_i)$ by standard methods. Simplifying matters is the fact that the matrix $I - \theta \Delta_t A\left( t_i^{i+1}(\theta) \right)$ is tridiagonal, allowing us, as we shall see in the following, to solve (3.64) in only $O(m)$ operations; see also, e.g., [2]. Starting from the prescribed terminal condition $V(t_n, x_j) = g(x_j)$, $j = 1, \ldots, m$, we can now use (3.64) to iteratively step backward in time until we ultimately recover $\widehat{V}(0)$. This procedure is known as backward induction.
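The $O(m)$ solve anticipated above is typically done with the Thomas algorithm, i.e. Gaussian elimination specialized to tridiagonal matrices. The sketch below is our own illustration, not the course's reference code:

```python
# Our own sketch of the Thomas algorithm: solves a tridiagonal system in O(m),
# as needed for one backward-induction step of Eq. (3.64).
import numpy as np

def thomas(lower, diag, upper, rhs):
    """Solve T x = rhs with T = tridiag(lower, diag, upper); arrays of length m
    (lower[0] and upper[-1] are unused). No pivoting: assumes T is diagonally
    dominant, as is the case for theta-schemes with theta > 0."""
    m = len(diag)
    d, r = diag.astype(float).copy(), rhs.astype(float).copy()
    for j in range(1, m):                 # forward elimination
        w = lower[j] / d[j - 1]
        d[j] -= w * upper[j - 1]
        r[j] -= w * r[j - 1]
    x = np.empty(m)
    x[-1] = r[-1] / d[-1]
    for j in range(m - 2, -1, -1):        # back substitution
        x[j] = (r[j] - upper[j] * x[j + 1]) / d[j]
    return x
```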

Proposition 3.15 The θ-scheme (3.64) recovers $\widehat{V}(0)$ in $O(mn)$ operations. If the scheme converges, the error on $\widehat{V}(0)$ compared to the exact solution $V(0)$ is of order
\[
O(\Delta_x^2) + \mathbf{1}_{\{\theta \neq \frac{1}{2}\}} O(\Delta_t) + O(\Delta_t^2). \tag{3.65}
\]

Proof: The backward induction algorithm requires the solution of $n$ tridiagonal systems, one per time step, for a total computational cost of $O(mn)$. The local truncation error on $\widehat{V}(t_i)$ is $e_i^{i+1}$, making the global truncation error after $n$ time steps of order $n e_i^{i+1}$. Combining this with the fact that $n = T/\Delta_t = O(\Delta_t^{-1})$ gives the order result listed in the proposition.
It follows from the proposition above that the Crank-Nicolson scheme is second-order convergent in the time step, and all other theta schemes are first-order convergent in the time step. All theta-schemes are second-order convergent in the spatial step $\Delta_x$. One may wonder why anything other than the Crank-Nicolson scheme is ever used. The Crank-Nicolson method is, indeed, often the method of choice, but there are situations where a straight application of the Crank-Nicolson scheme can lead to oscillations in the numerical solution or its spatial derivatives. Judicious application of the fully implicit method (the so-called 'Rannacher stepping') can often alleviate these problems, as we shall discuss later. The fully explicit method should never be used, due to the poor convergence and stability properties we discussed before, but it has nevertheless managed to survive in a surprisingly large number of finance texts and papers.
As a final point, we stress that the finite difference scheme above ultimately yields a full vector of values $\widehat{V}(0)$ at time 0, with one element per value of $x_j$, $j = 1, \ldots, m$. In general, we are mainly interested in $V(0, x(0))$, where $x(0)$ is the known value of $x$ at time 0. There is no need to include $x(0)$ in the grid, as we can simply employ an interpolator (e.g., a cubic spline) on this vector $\widehat{V}(0)$ to compute $V(0, x(0))$. Clearly, such an interpolator should be at least second-order accurate to avoid interfering with the overall $O(\Delta_x^2)$ convergence of the finite difference scheme. Assuming the interpolator is sufficiently smooth, we can also use it to compute various partial derivatives with respect to $x$ that we may be interested in. The derivative $\partial V(0, x(0))/\partial t$ (the time decay) can be picked up from the grid in the same fashion.

3.3.2.4 Other Boundary Conditions

Deriving asymptotic Dirichlet conditions can be involved for complicated payouts. Rather than having to perform an asymptotic analysis for each and every type of option payout, it would be preferable to have a general-purpose mechanism for specifying the boundary condition. One common idea involves making assumptions on the form of the functional dependency between $V$ and $x$ at the grid boundaries, often from a specification of relationships between spatial derivatives. For instance, if we impose the condition that the second derivative of $V$ is zero at the upper boundary $(x_{m+1})$, that is, $V$ is a linear function of $x$, we can write
\[
\frac{V(t, x_{m+1}) + V(t, x_{m-1}) - 2V(t, x_m)}{\Delta_x^2} = 0, \tag{3.66}
\]
that is,
\[
V(t, x_{m+1}) = 2V(t, x_m) - V(t, x_{m-1}). \tag{3.67}
\]
Similarly, imposing the same at the lower spatial boundary gives
\[
V(t, x_0) = 2V(t, x_1) - V(t, x_2). \tag{3.68}
\]
For PDEs discretized in the logarithm of some asset, it may be more natural to assume that $V(t, x) \propto e^x$ at the boundaries; equivalently, we can assume that $\partial V / \partial x = \partial^2 V / \partial x^2$ at the boundary. When discretized in a downward fashion at the upper boundary $(x_{m+1})$, this implies that
\[
\frac{V(t, x_{m+1}) - V(t, x_m)}{\Delta_x} = \frac{V(t, x_{m+1}) + V(t, x_{m-1}) - 2V(t, x_m)}{\Delta_x^2}, \tag{3.69}
\]
or (assuming that $\Delta_x \neq 1$)
\[
V(t, x_{m+1}) = \frac{1}{\Delta_x - 1} V(t, x_{m-1}) + \frac{\Delta_x - 2}{\Delta_x - 1} V(t, x_m). \tag{3.70}
\]
Similarly,
\[
V(t, x_0) = \frac{2 + \Delta_x}{1 + \Delta_x} V(t, x_1) - \frac{1}{\Delta_x + 1} V(t, x_2). \tag{3.71}
\]
Common to both methods above is that they give rise to boundary specifications through simple linear relations of the general form
\[
V(t, x_{m+1}) = k_m(t) V(t, x_m) + k_{m-1}(t) V(t, x_{m-1}), \tag{3.72}
\]
\[
V(t, x_0) = k_1(t) V(t, x_1) + k_2(t) V(t, x_2). \tag{3.73}
\]
This boundary specification can be captured in the matrix system (3.51) by simply rewriting a few components of $A(t)$; specifically,
\[
c_m(t) = -\sigma_x^2(t, x_m)\, \Delta_x^{-2} - r(t, x_m) + k_m(t)\, u_m(t), \tag{3.74}
\]
\[
l_m(t) = -\frac{1}{2}\mu_x(t, x_m)\, \Delta_x^{-1} + \frac{1}{2}\sigma_x^2(t, x_m)\, \Delta_x^{-2} + k_{m-1}(t)\, u_m(t), \tag{3.75}
\]
\[
c_1(t) = -\sigma_x^2(t, x_1)\, \Delta_x^{-2} - r(t, x_1) + k_1(t)\, l_1(t), \tag{3.76}
\]
\[
u_1(t) = \frac{1}{2}\mu_x(t, x_1)\, \Delta_x^{-1} + \frac{1}{2}\sigma_x^2(t, x_1)\, \Delta_x^{-2} + k_2(t)\, l_1(t). \tag{3.77}
\]
All other components of $A$ remain as in (3.51); note that $A$ remains tridiagonal, which is an important property for solving the linear system efficiently.
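In code, folding the boundary relations into the matrix amounts to adjusting four corner entries; a sketch of ours, operating on a dense matrix $A$ as assembled earlier:

```python
# Our own sketch: fold the linear boundary relations (3.72)-(3.73) into the
# corner entries of A(t), as in Eqs. (3.74)-(3.77). A, l1, um as assembled above.
def apply_linear_boundaries(A, l1, um, k1, k2, km, km1):
    """Modify A in place; the k's come from (3.72)-(3.73),
    l1 = l_1(t) and um = u_m(t) from (3.48)-(3.50)."""
    A[0, 0] += k1 * l1      # c_1(t)  <- c_1(t) + k_1(t) l_1(t)
    A[0, 1] += k2 * l1      # u_1(t)  <- u_1(t) + k_2(t) l_1(t)
    A[-1, -1] += km * um    # c_m(t)  <- c_m(t) + k_m(t) u_m(t)
    A[-1, -2] += km1 * um   # l_m(t)  <- l_m(t) + k_{m-1}(t) u_m(t)
    return A
```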
An alternative approach to specification of boundary conditions in the x-domain involves using the PDE
itself to determine the boundary conditions, through replacement of all central di↵erence operators with
one-sided di↵erences at the boundaries, namely
V0 (t + ) V0 (t) V1 (t) V0 (t) V1 (t + ) V0 (t + )
+ ✓µ (t, x0 ) + (1 ✓)µ (t + , x0 )
x1 x0 x1 x0

✓ 2 V 2 (t) V 1 (t) V 1 (t) V 0 (t) 1
+ x (t) 1
2 x2 x1 x1 x0 2 (x2 x0 )

1 ✓ 2 V2 (t + ) V1 (t + ) V1 (t + ) V0 (t + ) 1
+ r (t + ) 1
2 x2 x1 x1 x0 2 (x2 x0 )
= ✓r(0)V0 (t) + (1 ✓r(0))V0 (t + ) . (3.78)

This equation can be rearranged to write $V_0(t)$ as
$$V_0(t) = k_1(t)\, V_1(t) + k_2(t)\, V_2(t) + g_0(t+\delta)\,, \qquad (3.79)$$
where $k_1(t)$ and $k_2(t)$ are easily computed functions of the process parameters, and where $g_0(t+\delta)$ is a function of $V_0(t+\delta)$, $V_1(t+\delta)$, and $V_2(t+\delta)$. In a similar way, we get
$$V_{m+1}(t) = k_{m-1}(t)\, V_{m-1}(t) + k_m(t)\, V_m(t) + g_{m+1}(t+\delta)\,. \qquad (3.80)$$

(Exercise: Work out the expressions for $k_1$, $k_2$, $k_m$, $k_{m-1}$, $g_0$, $g_{m+1}$.)
We can see that these boundary conditions can be incorporated into our usual tridiagonal system by simply interpreting $f(t, x_0) = g_0(t+\delta)$ and $\bar{f}(t, x_{m+1}) = g_{m+1}(t+\delta)$. As we are rolling back in time (from $t+\delta$ to $t$) when using the finite difference equations, both $g_0(t+\delta)$ and $g_{m+1}(t+\delta)$ are known at time $t$, so this interpretation involves no difficulties.

3.3.3 American Options

We recall that derivative securities with early exercise are characterized by an adapted payout process $U(t)$, payable to the option holder at a stopping time (or exercise policy) $\tau \le T$, chosen by the holder. Let the allowed (and deterministic) set of exercise dates larger than or equal to $t$ be denoted $D(t)$, and suppose that we are given at time 0 a particular exercise policy $\tau$ taking values in $D(0)$, as well as a pricing numeraire $N$ inducing a unique martingale measure $Q^N$. Let $V^\tau(0)$ be the time-0 value of a derivative security that pays $U(\tau)$. Under some technical conditions on $U(t)$, we can write for the value of the derivative security
$$V^\tau(0) = E^{Q^N}\!\left[ \frac{U(\tau)}{N(\tau)} \right], \qquad (3.81)$$

where we have assumed, with no loss of generality, that $N(0) = 1$. Let $\mathcal{T}(t)$ be the time-$t$ set of (future) stopping times taking values in $D(t)$. In the absence of arbitrage, the time-0 value of a security with early exercise into $U$ must then be given by the optimal stopping problem
$$V(0) = \sup_{\tau \in \mathcal{T}(0)} V^\tau(0) = \sup_{\tau \in \mathcal{T}(0)} E^{Q^N}\!\left[ \frac{U(\tau)}{N(\tau)} \right], \qquad (3.82)$$
reflecting the fact that a rational investor would choose an exercise policy to optimize the value of his claim. We can extend this relation to future times $t$ by
$$V(t) = N(t) \sup_{\tau \in \mathcal{T}(t)} E_t^{Q^N}\!\left[ \frac{U(\tau)}{N(\tau)} \right]. \qquad (3.83)$$

Let us assume that the set of exercise dates is discrete, as in a Bermudan option (we can recover the American case by taking the continuous-time limit), $D(0) = \{T_1, T_2, \dots, T_B\}$. For $t < T_{i+1}$, define $H_i(t)$ as the time-$t$ value of the Bermudan option when exercise is restricted to the dates $D(T_{i+1}) = \{T_{i+1}, T_{i+2}, \dots, T_B\}$. That is,
$$H_i(t) = N(t)\, E_t^{Q^N}\!\left[ V(T_{i+1}) / N(T_{i+1}) \right], \quad i = 1, \dots, B-1\,. \qquad (3.84)$$
At time $T_i$, $H_i(T_i)$ can be interpreted as the hold value of the Bermudan option, that is, the value of the Bermudan option if not exercised at time $T_i$. If an optimal exercise policy is followed, clearly we must have at time $T_i$
$$V(T_i) = \max\left( U(T_i),\, H_i(T_i) \right), \quad i = 1, \dots, B\,, \qquad (3.85)$$
such that
$$H_i(T_i) = N(T_i)\, E_{T_i}^{Q^N}\!\left[ \max\left( U(T_{i+1}),\, H_{i+1}(T_{i+1}) \right) / N(T_{i+1}) \right], \quad i = 1, \dots, B-1\,. \qquad (3.86)$$
Starting with the terminal condition $H_B(T_B) = 0$, Eq. (3.86) defines a useful iteration backwards in time for the value $V(0) = H_0(T_0)$. We note that the idea behind (3.85) is often known as dynamic programming or the Bellman principle. Loosely speaking, we here work 'from the back' to price the Bermudan option. This idea is particularly well-suited for numerical methods that proceed backwards in time, such as finite difference methods, provided that the set of discretization time steps is a superset of the early exercise dates. Indeed, at maturity the value of the option is equal to its payout. At each early exercise date $T_i$, the continuation value $H_i(T_i)$ is what is iteratively computed by the discretization scheme, provided that the value of the option $V(T_i)$ is set as
$$\max\left( U(T_i),\, H_i(T_i) \right),$$
where $U(T_i)$ is the value of the option if exercised at time $T_i$. For instance, in the case of an American put in the Black-Scholes framework, $N(t) = e^{rt}$ and
$$U(T_i) = \left( K - S(T_i) \right)^+.$$
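A minimal sketch of the rollback (3.85)-(3.86) on a finite difference grid follows; the one-step solver `roll_back_one_step` (one $\theta$-scheme step backwards in time) is assumed to exist, and its name is illustrative:

```python
# A sketch of the dynamic-programming rollback for a Bermudan put.
import numpy as np

def bermudan_put_rollback(S, K, exercise_steps, n_steps, roll_back_one_step):
    payoff = np.maximum(K - S, 0.0)         # U(T_i) = (K - S)^+ on the grid
    V = payoff.copy()                       # at maturity, V(T_B) = U(T_B)
    for n in reversed(range(n_steps)):      # march backwards in time
        V = roll_back_one_step(V, n)        # continuation (hold) value H_i
        if n in exercise_steps:             # early exercise date T_i
            V = np.maximum(V, payoff)       # (3.85): V(T_i) = max(U, H_i)
    return V                                # grid vector of V(0, x_j)
```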

Exercise 9 Modify the PDE scheme introduced before to handle early exercise and verify that a Bermudan put has more value than a European put. Verify that the value of the option increases with the number of exercise dates and converges in the American limit.

3.3.4 Asian Options

The payoff of an Asian option written on the underlying $S$, with maturity $T$, depends on the average of the asset price over the whole life of the product. This average can be either continuous or discrete, namely
$$\frac{1}{T} \int_0^T S_t\, \mathrm{d}t \qquad \text{or} \qquad \frac{1}{n} \sum_{i=1}^n S_{t_i}\,,$$

for some partition $0 < t_1 < \dots < t_n = T$. We assume for simplicity that the underlying stock price evolves according to the Black-Scholes dynamics:
$$\mathrm{d}S_t = S_t \left( r\, \mathrm{d}t + \sigma\, \mathrm{d}W_t \right), \qquad S_0 > 0,$$
and we shall be interested in deriving a PDE to evaluate a continuously monitored Asian call option with strike $K$, i.e. an option with the following payoff at maturity:
$$V_T \triangleq \left( \frac{1}{T} \int_0^T S_t\, \mathrm{d}t - K \right)^+.$$

By risk-neutral expectation, the price at time $t \in [0, T]$ is given by
$$V_t = E\!\left[ e^{-r(T-t)} V_T \,\middle|\, \mathcal{F}_t \right].$$

The discounted option price $(e^{-rt} V_t)_{t \in [0,T]}$ is clearly a martingale; however, it is not Markovian in $S$ alone, since the payoff depends not only on $S_T$ but on the whole trajectory. We therefore augment the state space with the process $I_t \triangleq \int_0^t S_u\, \mathrm{d}u$, which clearly satisfies the stochastic differential equation $\mathrm{d}I_t = S_t\, \mathrm{d}t$, with starting value $I_0 = 0$. Now the couple $(S, I)$ forms a Markov process, and the payoff of the option depends only on its terminal value, making the Feynman-Kac theorem applicable to such a problem: the function $u : [0, T] \times \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}$ defined by
$$u(t, x, y) \triangleq E\!\left[ e^{-r(T-t)} \left( \frac{1}{T} I_T - K \right)^+ \,\middle|\, S_t = x,\, I_t = y \right]$$
satisfies the following:

Theorem 3.16 The function $u$ satisfies the following partial differential equation:
$$\left( \partial_t + r x\, \partial_x + x\, \partial_y + \frac{1}{2} \sigma^2 x^2\, \partial_{xx} \right) u(t, x, y) = r\, u(t, x, y), \quad \text{for all } (t, x, y) \in [0, T) \times \mathbb{R}_+ \times \mathbb{R},$$
with boundary conditions
$$u(t, 0, y) = e^{-r(T-t)} \left( \frac{y}{T} - K \right)^+, \quad \text{for } (t, y) \in [0, T) \times \mathbb{R},$$
$$\lim_{y \to -\infty} u(t, x, y) = 0, \quad \text{for } (t, x) \in [0, T) \times \mathbb{R}_+,$$
$$u(T, x, y) = \left( \frac{y}{T} - K \right)^+, \quad \text{for } (x, y) \in \mathbb{R}_+ \times \mathbb{R}.$$


Proof: Exercise.
This is a first example of two-dimensional PDE which we discuss next.

3.4 Two-dimensional PDEs

The partial differential equations we have studied so far were one-dimensional (in space). This came from the fact that we were looking at financial derivatives written on a single stock price, as in the one-dimensional Black-Scholes model. Many financial derivatives are actually written on several assets (for instance basket options), and hence the methods above have to be extended to higher dimensions. Even in the case of a single asset, higher-dimensional PDEs can be needed, for instance in the case of stochastic volatility or stochastic interest rates. We shall focus here on the heat equation in two dimensions, which provides us with the canonical model to study such a feature. Let us consider the two-dimensional (in space) partial differential equation
$$\partial_\tau u = b_1 \partial_{xx} u + b_2 \partial_{yy} u, \qquad (3.87)$$
on a square. We know that this PDE is parabolic if $b_1 > 0$ and $b_2 > 0$, which we assume from now on. In particular, when $b_1 = b_2$, the PDE (3.87) precisely corresponds to the two-dimensional heat equation. Let us now see how this comes into the financial modelling picture. In Section 3.1.2, we saw how to reduce the Black-Scholes differential equation to the heat equation in one dimension. The two-dimensional Black-Scholes model for the pair $(S_1(t), S_2(t))_{t \ge 0}$ reads
$$S_1(t) = S_1(0) \exp\left( \left( r - \frac{1}{2}\sigma_1^2 \right) t + \sigma_1 \sqrt{t} \left( \rho Z + \sqrt{1 - \rho^2}\, W \right) \right),$$
$$S_2(t) = S_2(0) \exp\left( \left( r - \frac{1}{2}\sigma_2^2 \right) t + \sigma_2 \sqrt{t}\, Z \right),$$
where $W$ and $Z$ are two independent Gaussian random variables with zero mean and unit variance. The two volatilities $\sigma_1$ and $\sigma_2$ are strictly positive, the risk-free interest rate $r$ is non-negative and the correlation parameter $\rho$ lies in $(-1, 1)$. In two dimensions (in the space variable), one can show that the Black-Scholes differential equation reads
$$\partial_t V + \mathcal{L} V = 0, \qquad (3.88)$$
where
$$\mathcal{L} \triangleq r S_1 \partial_{S_1} + r S_2 \partial_{S_2} + \frac{1}{2}\sigma_1^2 S_1^2 \partial_{S_1 S_1} + \frac{1}{2}\sigma_2^2 S_2^2 \partial_{S_2 S_2} + \rho \sigma_1 \sigma_2 S_1 S_2 \partial_{S_1 S_2} - r.$$
For clarity, we do not mention the boundary conditions here, but it is clear that they are fundamental in establishing a unique solution consistent with the pricing problem. Let us consider this equation on a logarithmic scale, i.e. $x_1 \triangleq \log(S_1)$ and $x_2 \triangleq \log(S_2)$, and define $\nu_1 \triangleq r - \frac{1}{2}\sigma_1^2$ and $\nu_2 \triangleq r - \frac{1}{2}\sigma_2^2$. The PDE (3.88) reduces to
$$\partial_t V + \nu_1 \partial_{x_1} V + \nu_2 \partial_{x_2} V + \frac{1}{2}\sigma_1^2 \partial_{x_1 x_1} V + \frac{1}{2}\sigma_2^2 \partial_{x_2 x_2} V + \rho \sigma_1 \sigma_2 \partial_{x_1 x_2} V - r V = 0. \qquad (3.89)$$
In order to simplify this equation further, we need to remove the mixed second-order derivative. Let us first
recall some elementary facts from linear algebra.

Theorem 3.17 (Spectral theorem) Let $A \in \mathbb{R}^{n \times n}$. If the matrix $A$ has $n$ linearly independent eigenvectors $(u_1, \dots, u_n)$ (i.e. there exist $(\lambda_1, \dots, \lambda_n) \neq 0$ such that $A u_i = \lambda_i u_i$ for any $i = 1, \dots, n$), then the decomposition $A = U \Lambda U^{-1}$ holds, where each column of $U$ is an eigenvector and where the matrix $\Lambda$ is diagonal with $\Lambda_{ii} = \lambda_i$.

Remark 3.18 In particular, when the matrix $A$ is real and symmetric, the eigenvector matrix $U$ is orthogonal, i.e. $U^{-1} = U^T$, and hence $A = U \Lambda U^T$.

Proposition 3.19 Let $A \in \mathbb{R}^{2 \times 2}$. Then $A$ has at most two eigenvalues $\lambda_1$ and $\lambda_2$, and
$$\lambda_{1,2} = \frac{1}{2}\left( \mathrm{Tr}(A) \pm \left( \mathrm{Tr}(A)^2 - 4\, \mathrm{det}(A) \right)^{1/2} \right).$$

The proof is left as an exercise. Consider now the covariance matrix related to the PDE (3.89):
$$\Sigma \triangleq \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix}.$$

From the proposition above, we can compute explicitly its two eigenvalues $\lambda_1$ and $\lambda_2$. Denote by $u_1$ and $u_2$ the corresponding eigenvectors and $U \triangleq (u_1\ u_2)$ as in the spectral theorem. Consider finally the change of variables $(y_1, y_2)^T = U (x_1, x_2)^T$. The partial differential equation (3.89) becomes
$$\partial_t V + \alpha_1 \partial_{y_1} V + \alpha_2 \partial_{y_2} V + \frac{\lambda_1}{2} \partial_{y_1 y_1} V + \frac{\lambda_2}{2} \partial_{y_2 y_2} V - r V = 0,$$
where $(\alpha_1, \alpha_2)^T = U (\nu_1, \nu_2)^T$.


Exercise 10 Prove the result above.

As in the one-dimensional case, let us define the transformation
$$V(y_1, y_2, t) \triangleq e^{\gamma_1 y_1 + \gamma_2 y_2 + \gamma_3 t}\, \Phi(y_1, y_2, t).$$
Upon choosing $\gamma_1 \triangleq -\alpha_1/\lambda_1$, $\gamma_2 \triangleq -\alpha_2/\lambda_2$ and $\gamma_3 \triangleq \frac{\alpha_1^2}{2\lambda_1} + \frac{\alpha_2^2}{2\lambda_2} + r$, we finally obtain
$$\partial_t \Phi + \frac{\lambda_1}{2} \partial_{y_1 y_1} \Phi + \frac{\lambda_2}{2} \partial_{y_2 y_2} \Phi = 0.$$
We can make a last change of variables $(z_1, z_2) \triangleq (y_1/\sqrt{\lambda_1},\, y_2/\sqrt{\lambda_2})$ in order to obtain the standard heat equation in two dimensions:
$$\partial_t \Phi + \frac{1}{2}\left( \partial_{z_1 z_1} + \partial_{z_2 z_2} \right) \Phi = 0.$$
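For a numerical illustration of the diagonalisation (parameter values are illustrative), `numpy` can be used:

```python
# A sketch: diagonalising the covariance matrix of (3.89) numerically.
import numpy as np

sigma1, sigma2, rho = 0.2, 0.3, 0.5
Sigma = np.array([[sigma1**2,             rho * sigma1 * sigma2],
                  [rho * sigma1 * sigma2, sigma2**2]])

lam, evecs = np.linalg.eigh(Sigma)   # Sigma = evecs @ np.diag(lam) @ evecs.T
# Rotating the state by the eigenvector matrix decouples the second-order
# part of (3.89) into (lam[0]/2) d^2/dy1^2 + (lam[1]/2) d^2/dy2^2, so the
# mixed derivative disappears.
```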

3.4.1 $\theta$-schemes for the two-dimensional heat equation

By a change of time $t \mapsto T - t$, where $T > 0$ is the time boundary of the problem, the heat equation boils down to (modulo some constant factor)
$$\partial_t u = \sigma^2 \left( \partial_{xx} u + \partial_{yy} u \right), \qquad (3.90)$$

and we are interested in solving it for $(x, y, t) \in [x_L, x_U] \times [y_L, y_U] \times [0, T]$. We specify now the following boundary conditions:
$$u(x, y, 0) = u_0(x, y), \quad \text{for any } (x, y) \in [x_L, x_U] \times [y_L, y_U],$$
$$u(x_L, y, t) = f_{x_L}(y, t), \quad \text{for any } (y, t) \in [y_L, y_U] \times [0, T],$$
$$u(x_U, y, t) = f_{x_U}(y, t), \quad \text{for any } (y, t) \in [y_L, y_U] \times [0, T],$$
$$u(x, y_L, t) = f_{y_L}(x, t), \quad \text{for any } (x, t) \in [x_L, x_U] \times [0, T],$$
$$u(x, y_U, t) = f_{y_U}(x, t), \quad \text{for any } (x, t) \in [x_L, x_U] \times [0, T].$$

The first boundary condition corresponds to the payoff at maturity, whereas the other boundary conditions account for possible knock-out barriers. The functions $f_\cdot$ are assumed to be smooth. For $(n_x, n_y, n_T) \in \mathbb{N}^3$, consider the discretisation steps $\Delta_x > 0$, $\Delta_y > 0$ and $\Delta_t > 0$ defined by
$$\Delta_x \triangleq \frac{x_U - x_L}{n_x}, \qquad \Delta_y \triangleq \frac{y_U - y_L}{n_y}, \qquad \Delta_t \triangleq \frac{T}{n_T}.$$
At some node $(i, j, k) \in [0, n_x] \times [0, n_y] \times [0, n_T]$ (in Cartesian coordinates: $(x_L + i\Delta_x,\, y_L + j\Delta_y,\, k\Delta_t)$), we use the notation $u^k_{i,j}$ for the function $u$ evaluated at this point.

3.4.1.1 Explicit scheme

At the node $(i, j, k)$, approximating the time derivative of the function $u$ using a forward difference scheme, $\partial_t u|_{(i,j,k)} = \Delta_t^{-1}\left( u^{k+1}_{i,j} - u^k_{i,j} \right) + O(\Delta_t)$, the heat equation (3.90) is approximated by
$$u^{k+1}_{i,j} = \left( 1 - 2(\alpha_x + \alpha_y) \right) u^k_{i,j} + \alpha_x \left( u^k_{i+1,j} + u^k_{i-1,j} \right) + \alpha_y \left( u^k_{i,j+1} + u^k_{i,j-1} \right),$$
for any $i = 1, \dots, n_x - 1$, $j = 1, \dots, n_y - 1$, $k = 0, \dots, n_T - 1$, and where we define $\alpha_x \triangleq \sigma^2 \Delta_t / \Delta_x^2$ and $\alpha_y \triangleq \sigma^2 \Delta_t / \Delta_y^2$.
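A minimal sketch of one explicit update, vectorised with `numpy` slicing; the boundary rows and columns are assumed to be set separately from the boundary conditions, and the usual von Neumann analysis restricts the step sizes to $\alpha_x + \alpha_y \le 1/2$ for stability:

```python
# A sketch of one explicit time step for the 2D heat equation (3.90).
import numpy as np

def explicit_step(u, alpha_x, alpha_y):
    """u: (n_x + 1, n_y + 1) array of u^k; returns u^{k+1} on the interior."""
    v = u.copy()
    v[1:-1, 1:-1] = ((1.0 - 2.0 * (alpha_x + alpha_y)) * u[1:-1, 1:-1]
                     + alpha_x * (u[2:, 1:-1] + u[:-2, 1:-1])
                     + alpha_y * (u[1:-1, 2:] + u[1:-1, :-2]))
    return v
```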

3.4.1.2 Implicit scheme

At the node $(i, j, k)$, using a backward difference scheme $\partial_t u|_{(i,j,k)} = \Delta_t^{-1}\left( u^k_{i,j} - u^{k-1}_{i,j} \right) + O(\Delta_t)$ for the time derivative, the heat equation (3.90) is approximated by
$$\left( 1 + 2(\alpha_x + \alpha_y) \right) u^k_{i,j} - \alpha_x \left( u^k_{i+1,j} + u^k_{i-1,j} \right) - \alpha_y \left( u^k_{i,j+1} + u^k_{i,j-1} \right) = u^{k-1}_{i,j},$$
for any $i = 1, \dots, n_x - 1$, $j = 1, \dots, n_y - 1$, $k = 1, \dots, n_T$, and where $\alpha_x$ and $\alpha_y$ are defined as in the explicit scheme.

3.4.1.3 Crank-Nicolson

If we apply the Crank-Nicolson scheme to the two-dimensional heat equation, we obtain
$$\frac{u^{k+1}_{i,j} - u^k_{i,j}}{\Delta_t} = \frac{\sigma^2}{2\Delta_x^2}\left( u^k_{i+1,j} - 2u^k_{i,j} + u^k_{i-1,j} \right) + \frac{\sigma^2}{2\Delta_x^2}\left( u^{k+1}_{i+1,j} - 2u^{k+1}_{i,j} + u^{k+1}_{i-1,j} \right)$$
$$+\ \frac{\sigma^2}{2\Delta_y^2}\left( u^k_{i,j+1} - 2u^k_{i,j} + u^k_{i,j-1} \right) + \frac{\sigma^2}{2\Delta_y^2}\left( u^{k+1}_{i,j+1} - 2u^{k+1}_{i,j} + u^{k+1}_{i,j-1} \right).$$

Using a Fourier analysis similar to the one above, one can further show that the Crank-Nicolson scheme in two (space) dimensions remains unconditionally stable. Consider the following vector:
$$U^k \triangleq \left( u^k_{1,1}, \dots, u^k_{n_x-1,1},\, u^k_{1,2}, \dots, u^k_{n_x-1,2},\, \dots,\, u^k_{1,n_y-1}, \dots, u^k_{n_x-1,n_y-1} \right)^T \in \mathbb{R}^{(n_x-1)(n_y-1)}.$$
The Crank-Nicolson scheme can be written in matrix form as follows:
$$\left( I + \frac{1}{2} C \right) U^{k+1} = \left( I - \frac{1}{2} C \right) U^k + b^k,$$
where the vector $b^k$ represents the boundary conditions and where
$$D_x \triangleq T(a, -\alpha_x, -\alpha_x) \in \mathbb{R}^{(n_x-1)\times(n_x-1)},$$
$$D_y \triangleq -\alpha_y I \in \mathbb{R}^{(n_y-1)\times(n_y-1)},$$
$$C \triangleq T(D_x, D_y, D_y) \in \mathbb{R}^{(n_x-1)(n_y-1)\times(n_x-1)(n_y-1)} \quad \text{(block tridiagonal)},$$
$$a \triangleq 2\sigma^2 \Delta_t \left( \Delta_x^{-2} + \Delta_y^{-2} \right).$$

We may solve these matrix equations as in the one-dimensional case. However, the matrix to invert is now block tridiagonal, and its inversion is computationally intensive and often not tractable for practical purposes. We therefore search for an alternative method, tailored to this multidimensional problem, and in particular one preserving the tridiagonal structure of the matrix.

3.4.2 The ADI method

Let $\mathcal{L}_1$ and $\mathcal{L}_2$ be linear operators, and assume that we are able to solve the equations $\partial_t u = \mathcal{L}_1 u$ and $\partial_t u = \mathcal{L}_2 u$. Taking $\mathcal{L}_1 = b_1 \partial_{xx}$ and $\mathcal{L}_2 = b_2 \partial_{yy}$, the equation (3.87) becomes $\partial_t u = \mathcal{L}_1 u + \mathcal{L}_2 u$. We now use a central difference scheme around the point $(k + \frac{1}{2})\Delta_t$, i.e. perform a Taylor expansion around this point, and average the central differences in space, so that (3.87) becomes
$$\frac{u^{k+1} - u^k}{\Delta_t} = \frac{1}{2}\left( \mathcal{L}_1 u^{k+1} + \mathcal{L}_1 u^k \right) + \frac{1}{2}\left( \mathcal{L}_2 u^{k+1} + \mathcal{L}_2 u^k \right) + O(\Delta_t^2),$$
which we can rewrite as
$$\left( I - \frac{\Delta_t}{2}\left( \mathcal{L}_1 + \mathcal{L}_2 \right) \right) u^{k+1} = \left( I + \frac{\Delta_t}{2}\left( \mathcal{L}_1 + \mathcal{L}_2 \right) \right) u^k + O(\Delta_t^3). \qquad (3.91)$$

Applying a central-difference scheme for the space variables as in the Crank-Nicolson scheme will eventually lead to the computation of the inverse of the matrix version of the left-hand side, which is not an easy task. However, using the identities
$$(1 + z_1)(1 + z_2) = 1 + z_1 + z_2 + z_1 z_2,$$
$$(1 - z_1)(1 - z_2) = 1 - z_1 - z_2 + z_1 z_2,$$
we can turn (3.91) into (take $z_1 = \frac{1}{2}\Delta_t \mathcal{L}_1$ and $z_2 = \frac{1}{2}\Delta_t \mathcal{L}_2$)
$$\left( I - \frac{\Delta_t}{2}\mathcal{L}_1 \right)\left( I - \frac{\Delta_t}{2}\mathcal{L}_2 \right) u^{k+1} = \left( I + \frac{\Delta_t}{2}\mathcal{L}_1 \right)\left( I + \frac{\Delta_t}{2}\mathcal{L}_2 \right) u^k + \frac{\Delta_t^2}{4}\mathcal{L}_1 \mathcal{L}_2 \left( u^{k+1} - u^k \right) + O(\Delta_t^3).$$
Now, the last two terms on the right-hand side are of order $O(\Delta_t^3)$, since $u^{k+1} - u^k = O(\Delta_t)$, so that this simplifies to
$$\left( I - \frac{\Delta_t}{2}\mathcal{L}_1 \right)\left( I - \frac{\Delta_t}{2}\mathcal{L}_2 \right) u^{k+1} = \left( I + \frac{\Delta_t}{2}\mathcal{L}_1 \right)\left( I + \frac{\Delta_t}{2}\mathcal{L}_2 \right) u^k + O(\Delta_t^3).$$

Let us now use a Crank-Nicolson central difference scheme for the space variables, i.e. let the matrices $L_1$ and $L_2$ be the second-order approximations of the operators $\mathcal{L}_1$ and $\mathcal{L}_2$, and $u^k$ the vector of solutions at time $k\Delta_t$. We obtain
$$\left( I - \frac{\Delta_t}{2}L_1 \right)\left( I - \frac{\Delta_t}{2}L_2 \right) u^{k+1} = \left( I + \frac{\Delta_t}{2}L_1 \right)\left( I + \frac{\Delta_t}{2}L_2 \right) u^k, \qquad (3.92)$$

where we have ignored the terms of order $O(\Delta_t^3)$, $O(\Delta_t \Delta_x^2)$ and $O(\Delta_t \Delta_y^2)$. Several schemes now exist to solve such an equation. Peaceman and Rachford [6] split (3.92) as follows:
$$\left( I - \frac{1}{2}\Delta_t L_1 \right) \tilde{u}^{k+1/2} = \left( I + \frac{1}{2}\Delta_t L_2 \right) u^k,$$
$$\left( I - \frac{1}{2}\Delta_t L_2 \right) u^{k+1} = \left( I + \frac{1}{2}\Delta_t L_1 \right) \tilde{u}^{k+1/2}, \qquad (3.93)$$

where the notation $\tilde{u}$ expresses the fact that this is not an approximation of the function $u$ at the time $(k + 1/2)\Delta_t$ but only an auxiliary quantity. This set of two equations is furthermore equivalent to the original matrix equation (3.92), where we use the fact that, for $i = 1, 2$,
$$\left( I - \frac{1}{2}\Delta_t L_i \right)\left( I + \frac{1}{2}\Delta_t L_i \right) = \left( I + \frac{1}{2}\Delta_t L_i \right)\left( I - \frac{1}{2}\Delta_t L_i \right).$$

As usual, we need to specify (space) boundary conditions for this split scheme at the intermediate time point $(k + 1/2)\Delta_t$. These are simply obtained using the (space) boundary conditions at the times $k\Delta_t$ and $(k+1)\Delta_t$ and plugging them into (3.93).
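A sketch of one Peaceman-Rachford step follows, under the simplifying assumption of homogeneous boundary terms; the banded-storage matrices and the function names are illustrative:

```python
# A sketch of one Peaceman-Rachford step (3.93): each half step requires only
# tridiagonal solves. u holds the interior values with x along axis 0 and y
# along axis 1; A1b/A2b store (I - dt/2 L1) and (I - dt/2 L2) in scipy's
# banded format, and B1/B2 are dense (I + dt/2 L1) and (I + dt/2 L2).
import numpy as np
from scipy.linalg import solve_banded

def peaceman_rachford_step(u, A1b, B1, A2b, B2):
    rhs = u @ B2.T                                # (I + dt/2 L2) u^k, along y
    u_half = solve_banded((1, 1), A1b, rhs)       # implicit tridiagonal solve in x
    rhs2 = B1 @ u_half                            # (I + dt/2 L1) u~, along x
    u_next = solve_banded((1, 1), A2b, rhs2.T).T  # implicit tridiagonal solve in y
    return u_next
```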
Alternatively, one can use a backward-in-time scheme to discretize the time derivative in $\partial_t u = \mathcal{L}_1 u + \mathcal{L}_2 u$:
$$\left( I - \Delta_t \mathcal{L}_1 - \Delta_t \mathcal{L}_2 \right) u^{k+1} = u^k + O(\Delta_t^2),$$
where we use $I$ as the identity operator. We can rewrite this as
$$\left( I - \Delta_t \mathcal{L}_1 - \Delta_t \mathcal{L}_2 + \Delta_t^2 \mathcal{L}_1 \mathcal{L}_2 \right) u^{k+1} = \left( I + \Delta_t^2 \mathcal{L}_1 \mathcal{L}_2 \right) u^k + \Delta_t^2 \mathcal{L}_1 \mathcal{L}_2 \left( u^{k+1} - u^k \right) + O(\Delta_t^2),$$
which implies that
$$\left( I - \Delta_t L_1 \right)\left( I - \Delta_t L_2 \right) u^{k+1} = \left( I + \Delta_t^2 L_1 L_2 \right) u^k,$$
where the matrices $L_1$ and $L_2$ are defined as above, $I$ is the identity matrix, and we have again ignored the higher-order terms. The Douglas-Rachford method [7] reads
$$\left( I - \Delta_t L_1 \right) \tilde{u}^{k+1/2} = \left( I + \Delta_t L_2 \right) u^k,$$
$$\left( I - \Delta_t L_2 \right) u^{k+1} = \tilde{u}^{k+1/2} - \Delta_t L_2 u^k.$$

Both the Crank-Nicolson-based and the implicit-based discretization schemes above can be shown to be unconditionally stable.

3.5 Numerical solution of systems of linear equations

In Sections 3.2 and 3.3 above, we have explored different ways to approximate a partial differential equation. In particular, our analysis has boiled down to solving a matrix equation of the form $Ax = b$, where $A \in \mathbb{R}^{m \times m}$ and $x$ and $b$ are two $\mathbb{R}^m$-valued vectors. An interesting feature outlined above was that the matrix $A$ had a tridiagonal structure, i.e. it can be written as $A = T_m(a, b, c)$, where we use the tridiagonal notation (3.9). By construction, this matrix is invertible, and hence the solution to the matrix equation is simply $x = A^{-1} b$. Classical matrix inversion results, in particular the Gaussian elimination method, are of order $O(m^3)$ (in the number of operations). Since we may want to have a fine discretisation grid, the dimension $m$ may be very large, and these methods may be too time consuming for high-dimensional problems. We may however exploit the simplified structure of the matrix $A$.

3.5.1 Gaussian elimination

Gaussian elimination is a method devised to solve systems of linear equations, and hence to compute the
inverse of a matrix. Note that it was invented in the second century BC, but got its name in the 1950s based
on the fact that Gauss came up with standard notations. Consider the system
$$\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1m} \\ a_{21} & a_{22} & \dots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mm} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ \vdots \\ b_m \end{pmatrix}.$$

Assume for now that the coefficient $a_{11}$ is not null. Dividing the first row by it and modifying the last $m-1$ rows, we obtain
$$\begin{pmatrix} 1 & a_{12}/a_{11} & \dots & a_{1m}/a_{11} \\ 0 & a_{22} - a_{12}(a_{21}/a_{11}) & \dots & a_{2m} - a_{1m}(a_{21}/a_{11}) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & a_{m2} - a_{12}(a_{m1}/a_{11}) & \dots & a_{mm} - a_{1m}(a_{m1}/a_{11}) \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} b_1/a_{11} \\ b_2 - b_1(a_{21}/a_{11}) \\ \vdots \\ b_m - b_1(a_{m1}/a_{11}) \end{pmatrix}.$$

If we now repeat this process, we obtain
$$\begin{pmatrix} 1 & \tilde{a}_{12} & \tilde{a}_{13} & \dots & \tilde{a}_{1m} \\ 0 & 1 & \tilde{a}_{23} & \dots & \tilde{a}_{2m} \\ 0 & 0 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \tilde{a}_{m-1,m} \\ 0 & 0 & \dots & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ \vdots \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} \tilde{b}_1 \\ \tilde{b}_2 \\ \vdots \\ \vdots \\ \tilde{b}_m \end{pmatrix},$$

where all the coefficients $\tilde{a}_{ij}$ and $\tilde{b}_i$ are determined recursively. The matrix equation is then solved by backward substitution:
$$x_m = \tilde{b}_m\,, \qquad x_{m-k} = \tilde{b}_{m-k} - \sum_{j=m-k+1}^{m} \tilde{a}_{m-k,j}\, x_j\,, \quad \text{for any } k = 1, \dots, m-1.$$

This method has however several drawbacks. The first obvious one occurs as soon as one diagonal element becomes null, in which case we cannot proceed as above. From a numerical point of view, even if not null, a very small value of the diagonal element can lead to numerical (round-off) truncation errors, which can get amplified as the scheme goes on.

Exercise 11 Note that a zero on the diagonal does not mean that the matrix is singular! Consider for example the matrix equation:
$$\begin{pmatrix} 0 & 3 & 0 \\ 2 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}.$$
What happens when one applies Gaussian elimination? Is there however an (obvious) solution?

The way to bypass this issue is to use pivoting. The idea is that one can interchange rows and columns of
the matrix A (keeping track of the corresponding modified vectors x and b) without modifying the problem.
Interchanging two rows implies interchanging the corresponding two elements of the vector b. Interchanging
two columns implies interchanging the corresponding two elements of the vector x.
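A minimal sketch of Gaussian elimination with partial (row) pivoting: at each step the row with the largest pivot in absolute value is swapped in, which both handles zero pivots and tames round-off amplification.

```python
# A sketch of Gaussian elimination with partial pivoting.
import numpy as np

def gauss_solve(A, b):
    A, b = A.astype(float), b.astype(float)
    m = len(b)
    for k in range(m - 1):
        p = k + np.argmax(np.abs(A[k:, k]))   # pivot row for column k
        A[[k, p]], b[[k, p]] = A[[p, k]], b[[p, k]]
        for i in range(k + 1, m):             # eliminate below the pivot
            f = A[i, k] / A[k, k]
            A[i, k:] -= f * A[k, k:]
            b[i] -= f * b[k]
    x = np.zeros(m)
    for k in range(m - 1, -1, -1):            # backward substitution
        x[k] = (b[k] - A[k, k + 1:] @ x[k + 1:]) / A[k, k]
    return x
```

Applied to the system of Exercise 11, the first pivoting step swaps the first two rows, and the routine returns the obvious solution $(1, 1, 1)$.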

Exercise 12 Apply pivoting to the system in Exercise 11 to use Gaussian elimination.

The Gaussian elimination method therefore requires the existence of an invertible matrix B such that the
matrix BA is upper triangular. Once this is done, all that is left is (i) to compute the product Bb and (ii)
to solve the triangular system (BA)x = Bb by backward substitution. The existence of such a matrix B is
guaranteed by the following lemma, the proof of which is simply the construction of the Gaussian elimination
method itself.

Lemma 3.20 Let A be a square matrix. There exists at least one invertible matrix B such that BA is upper
triangular.

Some remarks are in order here:

• we never compute the matrix B;

• if the original matrix A is not invertible, then one of the diagonal elements of the matrix BA will be
null, and the backward substitution will be impossible;

• the total amount of operations is hence of order $O(m^3)$.

3.5.2 LU decomposition

In the Gaussian elimination method above, the vector $b$ is modified when solving the matrix equation. This makes the method rather cumbersome when one has to solve multiple equations that share the same matrix $A$. A better alternative is using the so-called LU decomposition of the matrix $A \in \mathbb{R}^{m \times m}$, i.e. determining a lower triangular matrix $L \in \mathbb{R}^{m \times m}$ and an upper triangular matrix $U \in \mathbb{R}^{m \times m}$ such that $A = LU$.

Proposition 3.21 Let $A = (a_{ij}) \in \mathbb{R}^{m \times m}$ be such that all the leading principal submatrices (whose determinants are the principal minors)
$$(a_{11}), \quad \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \quad \dots, \quad \begin{pmatrix} a_{11} & \dots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{k1} & \dots & a_{kk} \end{pmatrix}, \quad \dots, \quad \begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{m1} & \dots & a_{mm} \end{pmatrix}$$
are invertible. Then there exist a unique lower triangular matrix $L \in \mathbb{R}^{m \times m}$ with unit diagonal and a unique upper triangular matrix $U \in \mathbb{R}^{m \times m}$ such that $A = LU$.

In particular, if the matrix $A$ is positive definite, then the proposition holds. The proof of this proposition is again constructive, and the following practical computation of the decomposition is very similar. Assume that such a decomposition holds. For any $1 \le i, j \le m$ we have
$$a_{ij} = \sum_{k=1}^{m} l_{ik} u_{kj} = \sum_{k=1}^{i \wedge j} l_{ik} u_{kj} = \begin{cases} l_{i1} u_{1j} + \dots + l_{ii} u_{ij}, & \text{if } i < j, \\ l_{i1} u_{1j} + \dots + l_{ij} u_{jj}, & \text{if } i > j, \\ l_{i1} u_{1i} + \dots + l_{ii} u_{ii}, & \text{if } i = j, \end{cases}$$

and
$$\begin{pmatrix} l_{11} & 0 & \dots & 0 \\ l_{21} & l_{22} & 0 & \vdots \\ \vdots & \ddots & \ddots & 0 \\ l_{m1} & l_{m2} & \dots & l_{mm} \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & \dots & u_{1m} \\ 0 & u_{22} & \ddots & u_{2m} \\ \vdots & \ddots & \ddots & u_{m-1,m} \\ 0 & \dots & 0 & u_{mm} \end{pmatrix} = (a_{ij})_{1 \le i,j \le m}.$$

There are $m^2$ equations to solve and $m^2 + m$ variables to determine. We therefore have the freedom to choose $m$ of them arbitrarily. Following Proposition 3.21, Crout's algorithm proceeds as follows:

(i) for each $i = 1, \dots, m$, set $l_{ii} = 1$;



(ii) for each $j = 1, \dots, m$, let
$$u_{ij} = a_{ij} - \sum_{k=1}^{i-1} l_{ik} u_{kj}, \quad \text{for } i = 1, \dots, j, \qquad (3.94)$$
$$l_{ij} = \frac{1}{u_{jj}} \left( a_{ij} - \sum_{k=1}^{j-1} l_{ik} u_{kj} \right), \quad \text{for } i = j+1, \dots, m. \qquad (3.95)$$
Note that under the conditions of Proposition 3.21, the term $u_{jj}$ can never be null.
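A direct transcription of Crout's recursions (3.94)-(3.95), as a sketch:

```python
# A sketch of Crout's algorithm with unit diagonal on L.
import numpy as np

def crout_lu(A):
    m = A.shape[0]
    L, U = np.eye(m), np.zeros((m, m))    # (i): l_ii = 1
    for j in range(m):
        for i in range(j + 1):            # (3.94): u_ij for i = 1, ..., j
            U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
        for i in range(j + 1, m):         # (3.95): l_ij for i = j+1, ..., m
            L[i, j] = (A[i, j] - L[i, :j] @ U[:j, j]) / U[j, j]
    return L, U
```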

3.5.2.1 Solving the system

We are interested here in solving the matrix equation $Ax = b$, where $A \in \mathbb{R}^{m \times m}$ and $x$ and $b$ are two $\mathbb{R}^m$-valued vectors. If the matrix $A$ admits an LU decomposition, then there exist two matrices $L$ and $U$ (respectively lower triangular and upper triangular) such that $A = LU$. The system $Ax = b$ can therefore be written as $L(Ux) = b$. Set $z \triangleq Ux = (z_i)_{1 \le i \le m}$. The new system then reads
$$\begin{pmatrix} l_{11} & 0 & \dots & 0 \\ l_{21} & l_{22} & 0 & \vdots \\ \vdots & \ddots & \ddots & 0 \\ l_{m1} & l_{m2} & \dots & l_{mm} \end{pmatrix} \begin{pmatrix} z_1 \\ \vdots \\ \vdots \\ z_m \end{pmatrix} = \begin{pmatrix} l_{11} z_1 \\ l_{21} z_1 + l_{22} z_2 \\ \vdots \\ l_{m1} z_1 + \dots + l_{mm} z_m \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ \vdots \\ b_m \end{pmatrix}.$$

So that, starting from the first line, we can solve this equation by successive forward substitution:
$$z_1 = \frac{b_1}{l_{11}}\,, \qquad z_k = \frac{1}{l_{kk}} \left( b_k - \sum_{j=1}^{k-1} l_{kj} z_j \right), \quad \text{for } k = 2, \dots, m.$$

Once this is done, we can then solve the other part $Ux = z$, i.e.
$$\begin{pmatrix} u_{11} & u_{12} & \dots & u_{1m} \\ 0 & u_{22} & \ddots & \vdots \\ \vdots & \ddots & \ddots & u_{m-1,m} \\ 0 & 0 & 0 & u_{mm} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} u_{11} x_1 + \dots + u_{1m} x_m \\ \vdots \\ u_{m-1,m-1} x_{m-1} + u_{m-1,m} x_m \\ u_{mm} x_m \end{pmatrix} = \begin{pmatrix} z_1 \\ \vdots \\ \vdots \\ z_m \end{pmatrix}.$$

Backward substitution hence gives
$$x_m = \frac{z_m}{u_{mm}}\,, \qquad x_k = \frac{1}{u_{kk}} \left( z_k - \sum_{j=k+1}^{m} u_{kj} x_j \right), \quad \text{for } k = m-1, \dots, 1.$$

Note that the backward substitution step is exactly the same as in the Gaussian elimination method. The LU decomposition requires $O(m^3)$ operations and the forward-backward substitution requires $O(m^2)$ steps. The total amount of operations is hence of order $O(m^3)$.

Remark 3.22 In the particular case where the matrix $A$ is tridiagonal (as in the one-dimensional $\theta$-schemes), we have the obvious decomposition $A = T(a, b, c) = LU$, where $L$ is lower bidiagonal with unit diagonal and subdiagonal $l$, and $U$ is upper bidiagonal with diagonal $v$ and superdiagonal $c$ (recall the tridiagonal notation (3.9)). The vectors $v$ and $l$ are computed recursively as follows:
$$\begin{pmatrix} b_1 & c_1 & & & \\ a_2 & b_2 & c_2 & & \\ & \ddots & \ddots & \ddots & \\ & & a_{m-1} & b_{m-1} & c_{m-1} \\ & & & a_m & b_m \end{pmatrix} = \begin{pmatrix} 1 & & & \\ l_2 & 1 & & \\ & \ddots & \ddots & \\ & & l_m & 1 \end{pmatrix} \begin{pmatrix} v_1 & c_1 & & \\ & v_2 & \ddots & \\ & & \ddots & c_{m-1} \\ & & & v_m \end{pmatrix},$$
$$b_1 = v_1 \;\Rightarrow\; v_1 = b_1\,,$$
$$a_k = l_k v_{k-1} \;\Rightarrow\; l_k = a_k / v_{k-1}\,,$$
$$b_k = l_k c_{k-1} + v_k \;\Rightarrow\; v_k = b_k - l_k c_{k-1}\,, \qquad k = 2, \dots, m\,.$$
In this case, it is easy to realize that the decomposition requires $O(m)$ operations, and so does the substitution step. Indeed, to solve $Lx = f$ one has
$$x_1 = f_1\,, \qquad x_k = f_k - l_k x_{k-1}\,, \quad k = 2, \dots, m\,,$$
and to solve $Uu = x$:
$$u_m = x_m / v_m\,, \qquad u_k = \left( x_k - c_k u_{k+1} \right) / v_k\,, \quad k = m-1, \dots, 1\,,$$
so that solving $LUx = f$ requires $O(m)$ operations.
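A sketch of the resulting $O(m)$ tridiagonal solver (often called the Thomas algorithm), combining the factorisation with the forward and backward substitutions above:

```python
# A sketch of the O(m) tridiagonal solve of Remark 3.22. a is the subdiagonal
# (a[0] unused), b the diagonal, c the superdiagonal (c[-1] unused).
import numpy as np

def thomas_solve(a, b, c, f):
    m = len(b)
    v, l = np.zeros(m), np.zeros(m)
    v[0] = b[0]
    for k in range(1, m):              # bidiagonal factorisation, O(m)
        l[k] = a[k] / v[k - 1]
        v[k] = b[k] - l[k] * c[k - 1]
    x = np.zeros(m)
    x[0] = f[0]
    for k in range(1, m):              # forward substitution: L x = f
        x[k] = f[k] - l[k] * x[k - 1]
    u = np.zeros(m)
    u[-1] = x[-1] / v[-1]
    for k in range(m - 2, -1, -1):     # backward substitution: U u = x
        u[k] = (x[k] - c[k] * u[k + 1]) / v[k]
    return u
```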

3.5.3 Cholesky decomposition

Another important decomposition applies to real symmetric positive definite matrices.

Proposition 3.23 Let $A \in \mathbb{R}^{m \times m}$ be a real symmetric positive definite matrix. There exists a unique lower triangular $L \in \mathbb{R}^{m \times m}$ with strictly positive diagonal entries, such that $L L^T = A$.

Proof: Since $A$ is positive definite, all its leading principal submatrices are positive definite and hence invertible. As such, because of Proposition 3.21, $A$ has an LU factorization. Since $A$ is also symmetric, the decomposition can be written in the form
$$A = \tilde{L} D \tilde{L}^T,$$
where $\tilde{L}$ is a lower triangular matrix with a unit diagonal and $D$ is a diagonal matrix. This is the so-called $LDL^T$ factorization of a symmetric invertible matrix $A$. Since $A$ is positive definite and $\tilde{L}$ has unit determinant, $D$ is also positive definite, so all the entries of $D$ are positive. This means that
$$D^{1/2} = \mathrm{diag}\left( \sqrt{d_{11}}, \dots, \sqrt{d_{mm}} \right),$$
$$A = \left( \tilde{L} D^{1/2} \right) \left( \tilde{L} D^{1/2} \right)^T = \hat{L} \hat{L}^T$$
for the lower triangular matrix $\hat{L} = \tilde{L} D^{1/2}$.


We now give an algorithm to compute the square-root of a symmetric positive definite matrix A = (aij ).
Since A is symmetric we need only consider elements on or above the diagonal:
m
X k
X
akj = `ki `ji = `ki `ji , k j.
i=1 i=1

We solve for the entries of L, starting with the first column, then the second and so on. To get the k-th
column, for k = 1, · · · , m first solve for `kk using

k
!1/2
X1
`2kk = akk `2ki
i=1

and then for j = k + 1, · · · m : !


k
X1
1
`jk = ajk `ji `ki .
`kk i=1

So long as the computed `ii ’s are positive, the iteration will go to completion, yielding the Cholesky factoriza-
tion. It can be shown that the Cholesky factorization algorithm is numerically stable. From a computational
point of view—and this is left as a simple exercise—one can show that the number of operations is of order
m3 /6 as the size m of the matrix A becomes large, which is twice as fast as the LU decomposition.
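A sketch of the column-by-column recursion above:

```python
# A sketch of the Cholesky factorisation A = L L^T, column by column.
import numpy as np

def cholesky(A):
    m = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for k in range(m):
        L[k, k] = np.sqrt(A[k, k] - L[k, :k] @ L[k, :k])   # diagonal entry
        for j in range(k + 1, m):                          # rest of column k
            L[j, k] = (A[j, k] - L[j, :k] @ L[k, :k]) / L[k, k]
    return L   # A = L @ L.T
```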

3.5.4 Banded matrices

The methods we have presented so far apply in fairly general situations. In the case of the finite difference schemes, the matrix $A$ is tridiagonal and hence has many zeros. It is therefore natural to wonder whether the methods above are quicker for this special type of matrix.

Definition 3.24 A matrix $A = (a_{ij}) \in \mathbb{R}^{m \times m}$ is called a banded matrix with half-bandwidth $p \in \mathbb{N}$ (equivalently, with band size $2p$) if $a_{ij} = 0$ whenever $|i - j| > p$.

In the case of a tridiagonal matrix, for instance, p = 1. The following lemma—the proof of which is left as
an exercise—shows why this is important.

Lemma 3.25 For a matrix $A \in \mathbb{R}^{m \times m}$ with half-bandwidth $p$, the number of operations is of order $m p^2$ for the LU decomposition and of order $m p^2 / 2$ for the Cholesky decomposition.
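As an illustration, `scipy` exposes a banded solver that accepts only the $2p + 1$ non-zero diagonals; for a tridiagonal system ($p = 1$) one would use it as follows (matrix values illustrative):

```python
# A sketch: exploiting bandedness. The (3, m) array stores, as rows, the
# superdiagonal, the main diagonal and the subdiagonal of the matrix.
import numpy as np
from scipy.linalg import solve_banded

m = 5
ab = np.zeros((3, m))
ab[0, 1:] = -1.0           # superdiagonal c
ab[1, :] = 2.0             # main diagonal b
ab[2, :-1] = -1.0          # subdiagonal a
rhs = np.ones(m)
x = solve_banded((1, 1), ab, rhs)   # cost O(m p^2) rather than O(m^3)
```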

3.5.5 Iterative methods

When the problem under consideration requires a fine grid, the Gaussian elimination method above can become computationally intensive, and one may need to resort to more suitable methods, in particular the so-called iterative methods. As before, we are interested in solving the system $Ax = b$, where $A \in \mathbb{R}^{m \times m}$ and $b \in \mathbb{R}^m$. We shall assume that the matrix $A$ does not have any zero diagonal elements (if such is the case, we can always interchange some rows and columns in order to satisfy the assumption). In particular, it is clear that this system has a unique solution if and only if the matrix $A$ is invertible, which we shall assume from now on. The essence of iterative methods is to rewrite this matrix equation as a fixed-point iteration
$$x^{k+1} = \Phi(x^k, b), \quad \text{for any } k \in \mathbb{N}. \qquad (3.96)$$



A solution $x^*$ of the equation $x^* = \Phi(x^*, b)$ is called a fixed point.


Examples of fixed-point iteration algorithms are the Jacobi iteration, the Gauss-Seidel iteration and the Successive Over-Relaxation (SOR) method, which we will not cover in this course. The interested reader can refer to [8].
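For a flavour only, a minimal sketch of the Jacobi iteration, which is the simplest instance of (3.96): here $\Phi(x, b) = D^{-1}(b - Rx)$, where $D$ is the diagonal of $A$ and $R = A - D$; the iteration converges, for example, when $A$ is strictly diagonally dominant.

```python
# A sketch of the Jacobi fixed-point iteration for A x = b.
import numpy as np

def jacobi(A, b, n_iter=200):
    D = np.diag(A)                 # diagonal entries (assumed non-zero)
    R = A - np.diagflat(D)         # off-diagonal part
    x = np.zeros_like(b, dtype=float)
    for _ in range(n_iter):
        x = (b - R @ x) / D        # one application of Phi(., b)
    return x
```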

References
[1] D.J. Duffy, A Critique of the Crank-Nicolson Scheme: Strengths and Weaknesses for Financial Instrument Pricing, Wilmott Magazine, July-August, 2004.
[2] William Press et al., Numerical Recipes: The Art of Scientific Computing, Third Edition, Cambridge University Press, 2007.
[3] R. Courant, K. Friedrichs and H. Lewy, Über die partiellen Differenzengleichungen der mathematischen Physik, Mathematische Annalen, 100(1): 32-74, 1928.
[4] J. Crank and P. Nicolson, A Practical Method for Numerical Evaluation of Solutions of Partial Differential Equations of Heat Conduction Type, Proceedings of the Cambridge Philosophical Society, 43: 50-67, 1947.
[5] Ioannis Karatzas and Steven E. Shreve, Brownian Motion and Stochastic Calculus, New York: Springer-Verlag, 1988.
[6] D.W. Peaceman and H.H. Rachford, The numerical solution of parabolic and elliptic differential equations, Journal of SIAM, 3: 28-41, 1955.
[7] J. Douglas and H.H. Rachford, On the numerical solution of heat conduction problems in two and three space variables, Transactions of the AMS, 82: 421-439, 1956.
[8] Paul Wilmott, Sam Howison and Jeff Dewynne, The Mathematics of Financial Derivatives: A Student Introduction, Cambridge University Press, 1995.
