Introduction to Importance Sampling

dung cheng

2017. 08

Table of Contents

1 Basic knowledge

2 Variance Reduction

3 Change of Measure

4 Remarks for Option Pricing

5 Importance Sampling

6 Efficient Importance Sampling

7 Large Deviation Principle and Cramer’s Theorem

8 Reference

Rare Event

We call an event that occurs with low frequency a rare event.
Financial examples:
- Default rate: joint default probability
- Risk management: portfolio credit risk, systemic risk

Monte Carlo Simulation

(Law of Large Numbers)

$\frac{1}{n}\sum_{i=1}^{n} X_i \to E[X]$

(Central Limit Theorem)

$\frac{\frac{1}{n}\sum_{i=1}^{n} X_i - E[X]}{\sigma/\sqrt{n}} \to N(0, 1)$
Application: integration, probability estimation.

Now, we want to estimate the probability P(X < −5) for X ∼ N(0, 1)

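function estimate(N, C)
% Crude Monte Carlo estimate of P(X < c), X ~ N(0,1), for each c in C
for i = 1:length(C)
    % Crude Monte Carlo
    A = randn(1,N);
    hit = double(A < C(i));          % indicator of the event {X < c}
    CMC_sm = sum(hit)/N;             % sample mean
    CMC_se = std(hit)/sqrt(N);       % standard error
    % Exact answer
    mu = 0;
    sigma = 1;
    pd = makedist('Normal',mu,sigma);
    cdf_ans = cdf(pd,C(i));
    % One LaTeX table row: c, exact cdf, CMC estimate, standard error
    fprintf('%1.1f & %e & %e & %e \\\\\n', C(i), cdf_ans, CMC_sm, CMC_se);
end
end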

c       cdf (exact)              CMC estimate             s.e.

-1      0.158655253931457        0.162900000000000        0.003692928752667
-1.5    0.066807201268858        0.064900000000000        0.002463616435364
-2      0.022750131948179        0.021000000000000        0.001433912692796
-2.5    0.006209665325776        0.005400000000000        7.328967961256935e-04
-3      0.001349898031630        0.001100000000000        3.314965897243874e-04
-3.5    2.326290790355250e-04    2.000000000000000e-04    1.414142842854940e-04
-4      3.167124183311996e-05    0                        0
-4.5    3.397673124730062e-06    0                        0
-5      2.866515718791946e-07    0                        0

Table: Results of CMC for N = 10000
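
The zeros in the last rows follow from the relative error of the crude estimator. The short calculation below is a sketch using only the binomial variance of an indicator average; the 10% target is an illustrative assumption, not a figure from the slides.

```latex
\hat p = \frac{1}{N}\sum_{i=1}^{N} I(X_i < c), \qquad
\mathrm{Var}(\hat p) = \frac{p(1-p)}{N}, \qquad
\mathrm{RE}(\hat p) = \frac{\sqrt{\mathrm{Var}(\hat p)}}{p} = \sqrt{\frac{1-p}{Np}} \approx \frac{1}{\sqrt{Np}} .

\text{For } c = -5:\quad p \approx 2.87\times 10^{-7},\quad N = 10^{4}
\;\Rightarrow\; Np \approx 2.9\times 10^{-3},\quad \mathrm{RE} \approx 18.7 .

\text{A relative error of } 10\% \text{ would need } N \approx \frac{100}{p} \approx 3.5\times 10^{8} \text{ samples.}
```

With an expected number of hits Np far below one, a run of 10000 samples almost always returns 0 for both the estimate and its standard error.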

Variance Reduction

Estimate E[X] by E[X + λC],
where λ is a control parameter and C is a control variate with E[C] = 0.
- Asian option:
  $X = h\Big(\frac{1}{T}\int_0^T S_t\,dt\Big), \quad Y = h\Big(e^{\frac{1}{T}\int_0^T \ln S_t\,dt}\Big)$
  Let C = Y − E[Y]; then E[C] = 0.
- American option pricing by Monte Carlo simulation
  (ref. Longstaff-Schwartz, 2000)
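
The control-variate idea can be tried on a toy problem before moving to options. The sketch below is an assumption-laden illustration, not the Asian-option example: it estimates E[exp(U)] for U ∼ Uniform(0,1) with the control C = U − 1/2, and picks λ by the usual variance-minimizing rule λ = −Cov(X, C)/Var(C), estimated from the same sample. All variable names are hypothetical.

```matlab
% Toy control-variate sketch (assumed example, not from the slides):
% estimate E[exp(U)], U ~ Uniform(0,1); exact value is e - 1.
N = 1e4;
U = rand(1,N);
X = exp(U);                        % quantity of interest
C = U - 0.5;                       % control variate with E[C] = 0

% Variance-minimizing control parameter, estimated from the sample
lambda = -sum((X - mean(X)).*(C - mean(C))) / sum((C - mean(C)).^2);

Z = X + lambda*C;                  % controlled samples, same mean as X
cmc_sm = mean(X);  cmc_se = std(X)/sqrt(N);   % crude Monte Carlo
cv_sm  = mean(Z);  cv_se  = std(Z)/sqrt(N);   % with control variate
fprintf('CMC: %f (s.e. %e)\nCV : %f (s.e. %e)\n', cmc_sm, cmc_se, cv_sm, cv_se);
```

Because exp(U) and U are strongly correlated, the controlled estimator should show a visibly smaller standard error than the crude one.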

Variance Reduction

Estimate E[X] by Ẽ[X · Q(X)], where Q(X) is the importance function.
In option pricing, importance sampling plays the role of a change of
probability measure.
- X ∼ N(0, 1): evaluate p = P(X < c) = E[I(X < c)]

Exponential Twisting

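$E[I(X < c)] = \int I(x < c)\, f(x)\,dx$
$= \int I(x < c)\, \frac{f(x)}{\tilde f(x)} \cdot \tilde f(x)\,dx$
$= \tilde E\Big[ I(X < c)\, \underbrace{\frac{f(X)}{\tilde f(X)}}_{\text{likelihood ratio}} \Big]$

where $\tilde f(x) = \frac{e^{\mu x} f(x)}{M_X(\mu)}$ and $M_X(\mu) = E[e^{\mu X}]$ is the moment generating function.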
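
As a small worked instance of the twisted density, consider an exponential random variable. This choice of distribution is an assumption made only for illustration; the rate is written r here to avoid clashing with other symbols.

```latex
X \sim \mathrm{Exp}(r):\quad f(x) = r e^{-r x}\ (x \ge 0), \qquad
M_X(\mu) = \frac{r}{r-\mu}\quad (\mu < r),

\tilde f(x) = \frac{e^{\mu x} f(x)}{M_X(\mu)} = (r-\mu)\, e^{-(r-\mu)x}, \qquad
\frac{f(x)}{\tilde f(x)} = M_X(\mu)\, e^{-\mu x} = \frac{r}{r-\mu}\, e^{-\mu x}.
```

The twisted law is again exponential, with rate r − µ, and the likelihood ratio always has the form M_X(µ) e^{−µx}.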

Remark: Option Pricing by Esscher Transforms

In rare event simulation, µ is determined by threshold.

In option pricing, µ is determined by martingale condition.
- (Risk-neutral Esscher transform)
- S(t): the price of a non-dividend-paying stock (security) at time t, with
  $S(t) = S(0)e^{X(t)}$
- $(X(t))_{t \ge 0}$: stochastic process with stationary and independent
  increments and X(0) = 0
- $M(z, t) := E[e^{zX(t)}]$, and $M(z, t) = [M(z, 1)]^t$ provided M(z, t) is
  continuous at t = 0

Remark: Option Pricing by Esscher Transforms

$M(z, t) = \int_{-\infty}^{\infty} e^{zx} f(x, t)\,dx$
$f(x, t; h) := \frac{e^{hx} f(x, t)}{M(h, t)}$
$M(z, t; h) := \frac{M(z + h, t)}{M(h, t)}$, and note that $M(z, t; h) = [M(z, 1; h)]^t$

Remark: Option Pricing by Esscher Transforms

Goal: seek h* such that the discounted price $e^{-\delta t}S(t)$ is a martingale
under the new probability measure given by the Esscher transform.
- $S(0) = E^*[e^{-\delta t}S(t)] \Rightarrow S(0) = e^{-\delta t}S(0)E^*[e^{X(t)}]$
- $e^{\delta t} = E^*[e^{X(t)}] = M(1, t; h^*) = [M(1, 1; h^*)]^t \Rightarrow \delta = \log M(1, 1; h^*)$
To evaluate, at time t = 0, a European call option on the stock with strike K and
exercise date τ: $V_0 := E^*[e^{-\delta\tau}(S(\tau) - K)^+ \mid \mathcal{F}_0]$
- Define $\kappa := \log(K/S(0))$, so that S(τ) > K if and only if X(τ) > κ
- $V_0 = e^{-\delta\tau}S(0)\int_\kappa^\infty e^x f(x, \tau; h^*)\,dx - e^{-\delta\tau}K[1 - F(\kappa, \tau; h^*)]$,
  where $e^x f(x, \tau; h^*) = e^{\delta\tau} f(x, \tau; h^* + 1)$
- $V_0 = S(0)[1 - F(\kappa, \tau; h^* + 1)] - e^{-\delta\tau}K[1 - F(\kappa, \tau; h^*)]$

Remark: Option Pricing by Esscher Transforms

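{X(t)}: Wiener process with mean µ and variance σ² (per unit time)
- $F(x, t) := P(X(t) \le x) = N(x; \mu t, \sigma^2 t)$
- $M(z, t) = e^{(\mu z + \frac{1}{2}\sigma^2 z^2)t}$
- $M(z, t; h) = e^{((\mu + h\sigma^2)z + \frac{1}{2}\sigma^2 z^2)t}$
- $F(x, t; h) = N(x; (\mu + h\sigma^2)t, \sigma^2 t)$
Risk-neutral Esscher transform
$\delta = (\mu + h^*\sigma^2) + \sigma^2/2$
$\mu^* = \mu + h^*\sigma^2 = \delta - \sigma^2/2$
$V_0^{Wie} = S(0)\,\Phi\Big(\frac{-\kappa + (\delta + \sigma^2/2)\tau}{\sigma\sqrt{\tau}}\Big) - e^{-\delta\tau}K\,\Phi\Big(\frac{-\kappa + (\delta - \sigma^2/2)\tau}{\sigma\sqrt{\tau}}\Big)$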
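
The Wiener-case formula is the Black-Scholes price, which can be checked numerically. The sketch below is illustrative only: the parameter values are assumptions, Φ is built from `erfc` to avoid toolbox dependencies, and the Monte Carlo check simulates X(τ) directly under the risk-neutral drift µ* = δ − σ²/2.

```matlab
% Sketch: evaluate the Wiener-case call price and compare with simulation.
% All numerical values are assumed for illustration.
S0 = 100; K = 100; delta = 0.05; sigma = 0.2; tau = 1;

Phi   = @(x) 0.5*erfc(-x/sqrt(2));          % standard normal CDF
kappa = log(K/S0);
d1 = (-kappa + (delta + sigma^2/2)*tau)/(sigma*sqrt(tau));
d2 = (-kappa + (delta - sigma^2/2)*tau)/(sigma*sqrt(tau));
V0 = S0*Phi(d1) - exp(-delta*tau)*K*Phi(d2);

% Monte Carlo check under the risk-neutral measure:
% X(tau) ~ N((delta - sigma^2/2)*tau, sigma^2*tau)
N = 2e5;
X = (delta - sigma^2/2)*tau + sigma*sqrt(tau)*randn(1,N);
V0_mc = exp(-delta*tau)*mean(max(S0*exp(X) - K, 0));

fprintf('closed form: %.4f, Monte Carlo: %.4f\n', V0, V0_mc);
```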

Remark: Option Pricing by Esscher Transforms

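{X(t) = kN(t) − ct}: shifted Poisson process

- N(t): Poisson process with parameter λ; k, c > 0
- CDF: $\Lambda(x; \theta) = \sum_{0 \le j \le x} \frac{e^{-\theta}\theta^j}{j!}$
- $F(x, t) = \Lambda\Big(\frac{x + ct}{k}; \lambda t\Big)$
- $M(z, t) = e^{[\lambda(e^{zk} - 1) - cz]t}$
- $M(z, t; h) = e^{[\lambda e^{hk}(e^{zk} - 1) - cz]t}$
Risk-neutral Esscher transform

$\delta = \lambda e^{h^* k}(e^k - 1) - c$
$\lambda^* = \lambda e^{h^* k} = \frac{\delta + c}{e^k - 1}$
$V_0^{Poi} = S(0)\Big[1 - \Lambda\Big(\frac{\kappa + c\tau}{k}; \lambda^* e^k \tau\Big)\Big] - Ke^{-\delta\tau}\Big[1 - \Lambda\Big(\frac{\kappa + c\tau}{k}; \lambda^* \tau\Big)\Big]$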

Remark: Option Pricing by Esscher Transforms

$\{S(t) = S(0)e^{X(t)}\}$: discrete-time multiplicative binomial process

with $X(t) = X_1 + \cdots + X_t$
- Ω = {a, b} with P(b) = p, P(a) = 1 − p
- CDF: $B(x; n, \theta) = \sum_{0 \le j \le x} \binom{n}{j}\theta^j (1 - \theta)^{n-j}$
- $F(x, t) = B\Big(\frac{x - at}{b - a}; t, p\Big)$
- $M(z, t) = [(1 - p)e^{az} + pe^{bz}]^t$
- $M(z, t; h) = [(1 - \pi(h))e^{az} + \pi(h)e^{bz}]^t$,
  where $\pi(h) = \frac{pe^{bh}}{(1 - p)e^{ah} + pe^{bh}}$

Remark: Option Pricing by Esscher Transforms

Risk-neutral Esscher transform

$e^\delta = [1 - \pi(h^*)]e^a + \pi(h^*)e^b$
$\pi(h^*) = \frac{e^\delta - e^a}{e^b - e^a}$
$V_0^{Bin} = S(0)\Big[1 - B\Big(\frac{\kappa - a\tau}{b - a}; \tau, \pi(h^* + 1)\Big)\Big] - Ke^{-\delta\tau}\Big[1 - B\Big(\frac{\kappa - a\tau}{b - a}; \tau, \pi(h^*)\Big)\Big]$,
where $\pi(h^* + 1) = \pi(h^*)e^{b - \delta}$
Note that it is not necessary to know P(b) = p to price the option.
(We only need to know π(h*).)
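
The binomial pricing formula can be cross-checked against a direct risk-neutral expectation over the number of up-moves. The following is a sketch with assumed parameter values (chosen so that 0 < π(h*) < 1 and so that no terminal price equals K exactly); the variable names are hypothetical.

```matlab
% Sketch: binomial call price via the two-CDF formula vs. direct expectation.
% Parameter values are assumptions for illustration.
S0 = 100; K = 105; delta = 0.01; a = -0.1; b = 0.1; tau = 10;

piStar  = (exp(delta) - exp(a))/(exp(b) - exp(a));  % pi(h*)
piStar1 = piStar*exp(b - delta);                    % pi(h* + 1)
kappa   = log(K/S0);
x       = (kappa - a*tau)/(b - a);                  % threshold on number of b-moves

j     = 0:tau;                                      % possible numbers of b-moves
binom = arrayfun(@(m) nchoosek(tau, m), j);
pmf   = binom .* piStar.^j  .* (1 - piStar).^(tau - j);
pmf1  = binom .* piStar1.^j .* (1 - piStar1).^(tau - j);

% Formula: S(0)[1 - B(x; tau, pi(h*+1))] - K e^{-delta tau}[1 - B(x; tau, pi(h*))]
V0_formula = S0*sum(pmf1(j > x)) - K*exp(-delta*tau)*sum(pmf(j > x));

% Direct discounted expectation of the payoff under pi(h*)
payoff    = max(S0*exp(a*(tau - j) + b*j) - K, 0);
V0_direct = exp(-delta*tau)*sum(pmf .* payoff);

fprintf('formula: %.6f, direct: %.6f\n', V0_formula, V0_direct);
```

Note that p itself never appears in the code, in line with the remark that only π(h*) is needed.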

Remark: Option Pricing by Esscher Transforms

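{X(t) = Y(t) − ct}: shifted gamma process

- Y(t): gamma process with parameters α, β; c > 0
- $G(x; \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\int_0^x y^{\alpha - 1}e^{-\beta y}\,dy$, x ≥ 0
- $F(x, t) = G(x + ct; \alpha t, \beta)$
- $M(z, t) = \Big(\frac{\beta}{\beta - z}\Big)^{\alpha t}e^{-ctz}$, z < β
- $M(z, t; h) = \Big(\frac{\beta - h}{\beta - h - z}\Big)^{\alpha t}e^{-ctz}$, z < β − h
Risk-neutral Esscher transform

$e^\delta = \Big(\frac{\beta - h^*}{\beta - h^* - 1}\Big)^{\alpha}e^{-c}$
$\beta^* = \beta - h^* = \frac{1}{1 - e^{-(c + \delta)/\alpha}}$
$V_0^{Gam} = S(0)[1 - G(\kappa + c\tau; \alpha\tau, \beta^* - 1)] - Ke^{-\delta\tau}[1 - G(\kappa + c\tau; \alpha\tau, \beta^*)]$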

Remark: Option Pricing by Esscher Transforms

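{X(t) = Y(t) − ct}: shifted inverse Gaussian process

- {Y(t)}: inverse Gaussian process with parameters a, b
- $J(x; a, b) = \Phi\Big(\frac{-a}{\sqrt{2x}} + \sqrt{2bx}\Big) + e^{2a\sqrt{b}}\,\Phi\Big(\frac{-a}{\sqrt{2x}} - \sqrt{2bx}\Big)$, x > 0
- $F(x, t) = J(x + ct; at, b)$
- $M(z, t) = e^{at(\sqrt{b} - \sqrt{b - z}) - ctz}$, z < b
- $M(z, t; h) = e^{at(\sqrt{b - h} - \sqrt{b - h - z}) - ctz}$, z < b − h
Risk-neutral Esscher transform
$\delta = a(\sqrt{b - h^*} - \sqrt{b - h^* - 1}) - c$
$b^* = b - h^* \Rightarrow \sqrt{b^*} - \sqrt{b^* - 1} = \frac{c + \delta}{a} \Rightarrow b^* = \frac{1}{4}\Big(\frac{c + \delta}{a} + \frac{a}{c + \delta}\Big)^2$
$V_0^{InG} = S(0)[1 - J(\kappa + c\tau; a\tau, b^* - 1)] - Ke^{-\delta\tau}[1 - J(\kappa + c\tau; a\tau, b^*)]$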

Example: Ruin Probability

Consider an insurance firm earning premiums at a constant rate p per
unit of time and paying claims that arrive at the jumps of a Poisson
process with rate λ.
N(t): number of claims arriving in [0, t]
$Y_i$: size of the i-th claim
$\xi_i$: interarrival times of the Poisson process, $\xi_i \sim \exp(\lambda)$
x: reserve; assume $\lambda E[Y_i] < p$
Net payout over [0, t]: $\sum_{i=1}^{N(t)} Y_i - pt$
Net payout up to the n-th claim: $S_n = X_1 + \cdots + X_n$, where $X_i = Y_i - p\xi_i$

Example: Ruin Probability

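Define $\tau_x := \inf\{n : S_n > x\}$.
Estimate the eventual ruin probability $P(\tau_x < \infty)$:
(For the general setting, we assume the $X_i$ are i.i.d. with $0 < P(X_i > 0) < 1$ and
$E[X_i] < 0$, and drop the special form $Y_i - p\xi_i$.)

$P(\tau_x < \infty) = E[I(\tau_x < \infty)]$
$= E_\theta\big[e^{-\theta S_{\tau_x} + \psi_X(\theta)\tau_x} \cdot I(\tau_x < \infty)\big]$

where $\psi_X(\theta) := \log E[e^{\theta X_1}]$ and $E_\theta$ is the expectation under the exponentially twisted measure $P_\theta$.

If $0 < \psi_X'(\theta) < \infty$, then $E_\theta[X_n] = \psi_X'(\theta) \in (0, \infty)$, so the walk drifts upward under $P_\theta$.

Then $P_\theta(\tau_x < \infty) = 1 \implies P(\tau_x < \infty) = E_\theta\big[e^{-\theta S_{\tau_x} + \psi_X(\theta)\tau_x}\big]$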
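
A minimal simulation of this estimator is sketched below under two assumptions that are not in the slides: the increments are Gaussian, X_i ∼ N(−m, 1), and θ is taken as the positive root of ψ_X(θ) = −mθ + θ²/2 = 0, i.e. θ = 2m, so that the factor involving τ_x drops out. Under this twist the increments become N(m, 1), the walk crosses x with probability one, and each sample is exp(−θ S_τ), which is bounded above by exp(−θx).

```matlab
% Sketch: importance sampling for the ruin probability P(tau_x < inf).
% Assumptions (not from the slides): X_i ~ N(-m,1), theta = 2*m so psi(theta) = 0.
m = 0.5; x = 5; N = 2000;
theta = 2*m;

est = zeros(1,N);
for i = 1:N
    S = 0;
    while S <= x                 % simulate until first passage above x
        S = S + (m + randn);     % increment under P_theta: N(-m + theta, 1) = N(m, 1)
    end
    est(i) = exp(-theta*S);      % likelihood ratio at the hitting time
end
p_hat = mean(est);
p_se  = std(est)/sqrt(N);
fprintf('ruin probability estimate: %e (s.e. %e), bound exp(-theta*x) = %e\n', ...
        p_hat, p_se, exp(-theta*x));
```

Every replication contributes a nonzero value, in contrast with crude simulation, where most paths never reach x and the run would have to be truncated.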

Example: First Hitting Time

Optimal Measure

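$p = E_P[X] = E_Q\Big[X\frac{dP}{dQ}\Big]$
$\mathrm{Var}_Q\Big(X\frac{dP}{dQ}\Big) = E_Q\Big[\Big(X\frac{dP}{dQ}\Big)^2\Big] - \Big(E_Q\Big[X\frac{dP}{dQ}\Big]\Big)^2$
$= E_P\Big[X^2\frac{dP}{dQ}\Big] - p^2$
$\exists\, Q^*$ such that $\mathrm{Var}_{Q^*}\Big(X\frac{dP}{dQ^*}\Big) = 0$, by taking $\frac{dQ^*}{dP} = \frac{X}{E_P[X]}$ (for X ≥ 0)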
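
A one-line check of the zero-variance claim, written out as a sketch for nonnegative X (the indicator case is added as an illustration):

```latex
\frac{dQ^*}{dP} = \frac{X}{E_P[X]}
\;\Longrightarrow\;
X\,\frac{dP}{dQ^*} = E_P[X] = p \quad Q^*\text{-a.s.}
\;\Longrightarrow\;
\mathrm{Var}_{Q^*}\!\Big(X\,\frac{dP}{dQ^*}\Big) = 0 .

\text{If } X = I(A):\qquad Q^*(B) = \frac{P(A \cap B)}{P(A)} = P(B \mid A).
```

The optimal measure depends on p = E_P[X], the unknown being estimated, so it cannot be used directly; it serves as a target that practical choices of Q try to approximate.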

Asymptotically Optimal Importance Sampling

$P_1^Q$, $P_2^Q$: first and second moment, respectively.

$\mathrm{Var}_Q[X] = P_2^Q - (P_1^Q)^2$
We call the importance sampling efficient (or asymptotically optimal)
if $\lim_{n\to\infty}\frac{1}{n}\log P_2^Q = 2\lim_{n\to\infty}\frac{1}{n}\log P_1^Q$

Toy Model

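E[I(X < c)] for X ∼ N(0, 1)

$f(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}, \quad M_X(\mu) = e^{\mu^2/2}, \quad \tilde f(x) = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}(x - \mu)^2}$
Second moment:

$P_2^Q = E_Q\Big[I(X < c)\Big(\frac{f(X)}{\tilde f(X)}\Big)^2\Big] = E\big[I(X < c)\,e^{-\mu X + \frac{1}{2}\mu^2}\big]$
$\le E\big[e^{-c\mu + \frac{1}{2}\mu^2}\big] = e^{-c\mu + \frac{1}{2}\mu^2}$
(valid for µ ≤ 0, since then −µX ≤ −µc on {X < c})

By setting $\frac{d}{d\mu}\log\big(e^{-c\mu + \frac{1}{2}\mu^2}\big) = 0$, we have µ = c.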
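
As a rough check that the choice µ = c is efficient in the sense defined earlier, one can keep the indicator in the bound instead of dropping it. The derivation below is a sketch; it assumes that c → −∞ plays the role of the rarity parameter (in place of n) and uses the standard Gaussian tail asymptotic log P(X < c) ∼ −c²/2.

```latex
\begin{aligned}
P_2^Q &= E\big[I(X<c)\,e^{-cX+\frac12 c^2}\big]
       \;\le\; e^{-c^2+\frac12 c^2}\,P(X<c) \;=\; e^{-\frac12 c^2}\,p
       \qquad (c<0),\\
P_2^Q &\ge (P_1^Q)^2 = p^2 \qquad \text{(Jensen's inequality)},\\
\log p &\sim -\tfrac12 c^2 \quad (c\to-\infty)
\;\;\Longrightarrow\;\;
\lim_{c\to-\infty}\frac{\log P_2^Q}{2\log P_1^Q} = 1 .
\end{aligned}
```

Both bounds on log P_2^Q behave like −c², which is twice the decay rate of the first moment.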

for i=1:length(C)
%Crude Monte Carlo
A = randn(1,N);
CMC_sm = sum(A<C(i))/N;
CMC_se = std(A<C(i))/sqrt(N);
%Efficient I.S.
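
The listing above stops at the importance-sampling step. A minimal sketch of how that step could look is given below, assuming the twist derived in the toy model: sample from N(µ, 1) with µ = c and weight each hit by the likelihood ratio exp(−µx + µ²/2). The variable names `B`, `W`, `EIS_sm`, `EIS_se` and the grid of thresholds are hypothetical, and Φ is built from `erfc` so that no toolbox is needed.

```matlab
% Sketch of the efficient importance sampling step (assumed names and grid).
N = 10000;
C = [-1 -2 -3 -4 -5];
Phi = @(x) 0.5*erfc(-x/sqrt(2));            % exact N(0,1) CDF for comparison

EIS_sm = zeros(size(C));
EIS_se = zeros(size(C));
for i = 1:length(C)
    mu = C(i);                              % twist parameter mu = c
    B  = mu + randn(1,N);                   % samples from the twisted law N(mu,1)
    W  = (B < C(i)) .* exp(-mu*B + mu^2/2); % indicator times likelihood ratio
    EIS_sm(i) = mean(W);                    % importance sampling estimate
    EIS_se(i) = std(W)/sqrt(N);             % its standard error
    fprintf('%4.1f  %e  %e  %e\n', C(i), Phi(C(i)), EIS_sm(i), EIS_se(i));
end
```

About half of the twisted samples fall below c, so the estimator stays informative even at c = −5, where crude Monte Carlo with the same N returns zero.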

c       cdf (exact)              CMC estimate             CMC s.e.                 EIS estimate             EIS s.e.

-1      0.158655253931457        0.162900000000000        0.003692928752667        0.156958196898354        0.001895435259949
-1.5    0.066807201268858        0.064900000000000        0.002463616435364        0.065160610408277        9.033103757300151e-04
-2      0.022750131948179        0.021000000000000        0.001433912692796        0.023020966481626        3.494388612908102e-04
-2.5    0.006209665325776        0.005400000000000        7.328967961256935e-04    0.006138267097036        1.043373591892651e-04
-3      0.001349898031630        0.001100000000000        3.314965897243874e-04    0.001321438853177        2.459045897978202e-05
-3.5    2.326290790355250e-04    2.000000000000000e-04    1.414142842854940e-04    2.370773935282700e-04    4.671138444317568e-06
-4      3.167124183311996e-05    0                        0                        3.148668008326063e-05    6.698453759226119e-07
-4.5    3.397673124730062e-06    0                        0                        3.383089349109682e-06    7.653502125235220e-08
-5      2.866515718791946e-07    0                        0                        2.857501664657509e-07    6.762587035351070e-09

Table: Results of EIS for N = 10000

Useful Tools

Large deviation principle (LDP):

A sequence $\{Y_i\}$ obeys the LDP with rate function I(·) if
- for any closed set F, $\limsup_{n\to\infty}\frac{1}{n}\log P\Big(\frac{1}{n}\sum_{i=1}^{n} Y_i \in F\Big) \le -\inf_{a \in F} I(a)$
- for any open set G, $\liminf_{n\to\infty}\frac{1}{n}\log P\Big(\frac{1}{n}\sum_{i=1}^{n} Y_i \in G\Big) \ge -\inf_{a \in G} I(a)$

Useful Tools

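$\psi(\theta) := \log E[e^{\theta X}]$ and $\psi^*(x) := \sup_{\theta}[\theta x - \psi(\theta)]$

(Cramer’s theorem)
For all $x \ge E[X_1]$, we have $\lim_{n\to\infty}\frac{1}{n}\log P\Big(\frac{S_n}{n} \ge x\Big) = -\psi^*(x)$
For the multivariate normal distribution X ∼ N(µ, Σ),
the rate function is given by $I(a) = \frac{1}{2}(a - \mu)^T\Sigma^{-1}(a - \mu)$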
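
Cramer's theorem can be checked numerically in the one case where the tail is available in closed form. The sketch below assumes i.i.d. N(0, 1) summands, for which ψ(θ) = θ²/2, ψ*(x) = x²/2, and P(S_n/n ≥ x) = 1 − Φ(x√n) exactly; the values of x and n are illustrative.

```matlab
% Sketch: decay rate of P(S_n/n >= x) for standard normal summands.
x = 1;
n = [10 100 1000];
p    = 0.5*erfc(x*sqrt(n)/sqrt(2));   % exact tail 1 - Phi(x*sqrt(n))
rate = log(p)./n;                     % (1/n) log P(S_n/n >= x)
for k = 1:length(n)
    fprintf('n = %4d: (1/n) log P = %.4f, -psi*(x) = %.4f\n', n(k), rate(k), -x^2/2);
end
```

The computed rates increase toward −x²/2 as n grows; the maximizer in ψ*(x) is θ = x, which matches the twist µ = c found in the toy model.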

Importance Sampling for Diffusions via Girsanov’s Theorem

X: d-dimensional diffusion process, $dX_s = \mu(X_s)\,ds + \sigma(X_s)\,dW_s$

$(W_t)_{t \ge 0}$: n-dimensional Brownian motion on a filtered probability space
$(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{t \ge 0}, P)$
$X_s^{t,x}$: the solution started from x at time t
$v(t, x) := E\big[g(X_s^{t,x},\ t \le s \le T)\big]$, $(t, x) \in [0, T] \times \mathbb{R}^d$

Let $\phi = (\phi_t)_{0 \le t \le T}$ be an $\mathbb{R}^d$-valued adapted process such that
$M_t = \exp\Big(-\int_0^t \phi_u'\,dW_u - \frac{1}{2}\int_0^t |\phi_u|^2\,du\Big)$ is a martingale, i.e., $E[M_T] = 1$.
Let Q be the probability measure such that $\frac{dQ}{dP} = M_T$.

Importance Sampling for Diffusions via Girsanov’s Theorem

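$\tilde W_t = W_t + \int_0^t \phi_u\,du$: Brownian motion under Q
$dX_s = (\mu(X_s) - \sigma(X_s)\phi_s)\,ds + \sigma(X_s)\,d\tilde W_s$
Then $v(t, x) = E_Q\big[g(X_s^{t,x},\ t \le s \le T)\,L_T\big]$,
where $L_T = \frac{1}{M_T} = \exp\Big(\int_0^T \phi_u'\,d\tilde W_u - \frac{1}{2}\int_0^T |\phi_u|^2\,du\Big)$

Importance sampling: $I_{g,\phi}^N(t, x) = \frac{1}{N}\sum_{i=1}^{N} g(X^{i,t,x})\,L_T^i$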
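
A minimal sketch of this estimator for the simplest diffusion is given below. The setup is an assumption chosen for illustration: X is a one-dimensional Brownian motion (µ = 0, σ = 1, X_0 = 0), g is the indicator of {X_T > b}, and φ is the constant −b/T, which pushes the paths toward the level b under Q. Paths are generated with an Euler scheme in the Q-Brownian increments, and the result is compared with the exact tail probability.

```matlab
% Sketch: Girsanov-based importance sampling for P(X_T > b), X a Brownian motion.
% Assumed setup: mu = 0, sigma = 1, X_0 = 0, constant phi = -b/T.
b = 4; T = 1; nSteps = 50; N = 20000;
dt  = T/nSteps;
phi = -b/T;

dWt = sqrt(dt)*randn(N, nSteps);               % increments of W~ under Q
X_T = sum(-phi*dt + dWt, 2);                   % Euler scheme: dX = -phi ds + dW~
L_T = exp(phi*sum(dWt, 2) - 0.5*phi^2*T);      % L_T for constant phi

est     = (X_T > b) .* L_T;                    % g(X^i) * L_T^i
p_hat   = mean(est);                           % importance sampling estimator
p_se    = std(est)/sqrt(N);
p_exact = 0.5*erfc(b/sqrt(2*T));               % exact P(N(0,T) > b)
fprintf('IS estimate: %e (s.e. %e), exact: %e\n', p_hat, p_se, p_exact);
```

For state-dependent µ, σ, or φ, the same loop structure applies, with the drift, diffusion, and likelihood ratio accumulated step by step instead of summed in closed form.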

Reference

C.H.Han, Efficient Importance Sampling Estimation for Joint Default Probability: the First Passage Time Problem.
J.Bucklew, Introduction to rare event simulation.
P.Glasserman, Monte Carlo methods in financial engineering.
Pham, Large deviations in mathematical finance.
Hans U. Gerber, Elias S.W. Shiu, Option Pricing by Esscher

