Week 6 Handout

IS principles Example Options Path-dependent options

wi3425TU—Monte Carlo methods

L.E. Meester

Week 6


Week 6—Program for this week


Importance sampling is like a Formula 1 racing car:
if you know how to drive you can go very fast; otherwise,
you might end up in the hay (if you’re lucky).
1 Importance sampling principles
2 Continuing last week’s example
3 Importance sampling for options
How IS works with the normal distribution
Watch (out with) the weights
4 Path-dependent options
Multi-dimensional importance sampling
Steering the price path
A word of caution
Inserted at the start: points of attention for Monte Carlo
simulations that you “should” apply in your MC life, and definitely
on the exam. . .

Points of attention for Monte Carlo simulations I

1 Analyse your problem. Do this before you start programming.
Are there alternative ways to formulate it? This may lead to
alternative/better solutions and/or simulation possibilities.
2 Accuracy. Estimates should (where possible) always be
accompanied by an indication of their accuracy: standard
errors or confidence intervals (make sure the confidence level
is clear). The notation 3.12 ± 0.14 (s.e.) indicates an estimate
of 3.12 with a standard error of 0.14. If it is understood which
it is, 3.12 ± 0.14 or 3.12 (0.14) would suffice.


Points of attention for Monte Carlo simulations II

3 Significant digits. How many of the digits are significant
depends on the accuracy. For an estimate of 3.1237920388
with a standard error of 0.121298375, the 95% confidence
interval is about 3.1237920388 ± 0.2377448150. This says
that the estimate is accurate to about 0.24. Everything from
the third digit after the decimal point carries no additional
information and is therefore better omitted: the answer
3.12 ± 0.24 (95% CI) contains all the useful information.
A good rule of thumb: round the standard error (or the ±
part of the confidence interval) to two significant digits; then
state your estimate to the same precision as the rounded
standard error. So if your estimate is 9823.34 and your
standard error 327.89, you would write: 9820 ± 330 (s.e.).

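The rounding rule above is easy to automate. A small Python sketch (the course code is MATLAB; the helper name round_result is my own):

```python
import math

def round_result(estimate, se):
    """Round se to two significant digits; round estimate to the same precision."""
    magnitude = math.floor(math.log10(abs(se)))  # order of magnitude of the s.e.
    ndigits = 1 - magnitude                      # keep two significant digits
    return round(estimate, ndigits), round(se, ndigits)

print(round_result(9823.34, 327.89))            # (9820.0, 330.0)
print(round_result(3.1237920388, 0.121298375))  # (3.12, 0.12)
```

So 9823.34 with standard error 327.89 is reported as 9820 ± 330 (s.e.), exactly as in the rule of thumb above.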

Points of attention for Monte Carlo simulations III

4 Report relevant parameters with your results. If there are
parameters whose values can be set at your discretion, such as
the number of replications, the step size, or the control variate
parameter θ, one often experiments with them while
simulating. So even if the first line of the code is M=1e3, this
does not mean that this was the case for the reported results:
report the values actually used.
5 Consistency. In some situations a quantity may be estimated
in more than one way or using several methods. The resulting
estimates should then be consistent: differences between
them, expressed in the number of (appropriately computed)
standard errors, should not be too big. If they are, then
something is wrong and should be checked. . .


Points of attention for Monte Carlo simulations IV

6 Mistakes, checks. Everybody makes mistakes: errors of
reasoning, programming errors, et cetera. So, use every
opportunity you have to catch those mistakes. Never blindly
believe the final answer your program spits out, but scrutinize
intermediate results for errors. Usually, there are enough
things around that you might check. If necessary, create a
testing opportunity, for example a special case for which you
know the answer.
7 Random seeds. Initialise the random generator(s) so that
your results are reproducible. Set a seed chosen by yourself: if
we all mimic Higham and start our program with
rand(’state’,100) and randn(’state’,100), how
random is that?

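Point 7 in NumPy terms (the course uses MATLAB; this Python sketch just illustrates the idea): pick your own seed and the run becomes reproducible.

```python
import numpy as np

SEED = 271828  # a seed chosen by yourself, not one copied from a book

rng = np.random.default_rng(SEED)
first_run = rng.standard_normal(5)

rng = np.random.default_rng(SEED)   # re-seeding reproduces the same stream
second_run = rng.standard_normal(5)

print(np.array_equal(first_run, second_run))  # True
```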

Points of attention for Monte Carlo simulations V

8 Variance reduction. When applying variance reduction
methods, there are several possibilities, in order of
attractiveness: a) you know that the method will bring
reduction, and you can show this (before simulating); b) you
think that the method will bring some reduction, and you
have some arguments to support it (perhaps intuitive ones);
or c) you just try it out. Try to be aware of which of these
situations you are in, and if you “just tried something” and
afterwards realized “I could have known that this would
happen”, then try to make explicit why this is so and how you
could have known: this is how you learn to get better at this.
9 Bias. Some methods produce biased estimates: there is a
systematic deviation with respect to the unknown quantity
you are estimating. Make sure you are aware of this.


Points of attention for Monte Carlo simulations VI

10 Accuracy in the presence of bias. The Monte Carlo rule
“one hundred times as many replications gives me an
additional digit of accuracy” is no longer true when there is
bias, so blindly going for as large an M as possible is then
senseless. It is better to find a balance between the size of the
(remaining) bias and the standard error. This is a hard and
sometimes unsolvable problem; do what you can.


Importance sampling: the principle

For a function k : R → R one can determine

$$I = \int_{-\infty}^{\infty} k(x)\,f(x)\,dx$$

by Monte Carlo via I = E[k(T)], where T has density f.

Suppose: g is another probability density on R such that
g(x) > 0 whenever f(x) > 0; then

$$I = \int_{-\infty}^{\infty} k(x)\,\frac{f(x)}{g(x)}\,g(x)\,dx.$$

This we interpret as $I = E\!\left[k(X)\,\frac{f(X)}{g(X)}\right]$, where X has pdf g.

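A minimal numerical illustration of this identity (my own example, not from the slides): estimate I = P(T > 2) for T ∼ N(0, 1), once directly and once by sampling from the shifted density g = N(2, 1) with weights f/g.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 100_000

def k(x):
    # k(x) = 1 if x > 2 else 0, so I = E[k(T)] = P(T > 2) for T ~ N(0,1)
    return (x > 2.0).astype(float)

# Crude Monte Carlo: only about 2% of the samples hit the region x > 2.
crude = k(rng.standard_normal(M)).mean()

# Importance sampling: X ~ g = N(2,1) hits x > 2 about half the time;
# the ratio of the two normal densities is w(x) = f(x)/g(x) = exp(2 - 2x).
X = rng.standard_normal(M) + 2.0
is_est = (k(X) * np.exp(2.0 - 2.0 * X)).mean()

print(crude, is_est)   # both near P(T > 2) ≈ 0.0228
```

Both estimates agree, but the importance sampling one has a much smaller standard error: the “important” region is sampled far more often.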

The likelihood ratio (LR)

The ratios

$$w(x) = \frac{f(x)}{g(x)} \quad\text{and}\quad w(X) = \frac{f(X)}{g(X)}$$

are called the likelihood ratio (LR).

We write

$$I = \int_{-\infty}^{\infty} k(x)\,w(x)\,g(x)\,dx$$

and

$$I = E[k(X)\,w(X)], \qquad X \sim g.$$


Sampling from the “wrong” distribution can be made OK

It’s OK to sample from g instead of f if we reweigh the results:

Values of X sampled from the interval (x, x + dx)
occur with frequency g(x)dx;
are multiplied by f(x)/g(x);
so contribute the correct amount f(x)dx to the integral I.

Example:
Suppose g(x0) = 2 f(x0); then
under g, values near x0 happen twice as often as they should;
the weight w(x0) = f(x0)/g(x0) = 0.5 corrects this.


How can we use this to our advantage?

We could obtain a small variance for k(X)w(X) if

$$k(x)\,w(x) = \frac{k(x)\,f(x)}{g(x)} \approx \text{constant};$$

so we should choose g(x) ≈ constant · k(x)f(x);
i.e., g(x) should be large when k(x)f(x) is large.
This makes sense intuitively, because the corresponding x-values
contribute most to the integral

$$I = \int_{-\infty}^{\infty} k(x)\,f(x)\,dx.$$

Name of the method: importance sampling (IS). We look
for a g that samples the important values more often than f does.

The zero-variance distribution: a mirage? (not on exam)

The optimal g would be

$$g(x) = \text{constant} \cdot k(x)\,f(x),$$

which is only possible if the function k is nonnegative;
this g is the so-called zero-variance density.
If you plug it into E[k(X)w(X)] you find

$$k(X)\,w(X) = \text{constant},$$

which means that the variance is zero!
However, the constant equals (the unknown) I . . .
Even though this may look stupid now, it is a very useful guideline
(look up “approximate zero-variance”).


Last week: tried IS on determining π


Earlier we used Y = 4√(1 − U²) with U ∼ U(0, 1) to
estimate π: E[Y] = ∫₀¹ 4√(1 − x²) dx = π and Var(Y) ≈ 0.8.

This fits the framework: k(x) = 4√(1 − x²), and f(x) = 1 for
0 ≤ x ≤ 1, f(x) = 0 elsewhere, so that E[k(U)] = I.
Last week we tried for g:
g₁(x) = (3/2)(1 − x²): failed, we did not see how to simulate from it;
g₂(x) = (2/3)(2 − x): worked, variance ≈ 0.25; see SchatPi_IS2.m.

Two more g : [0, 1] → R are instructive to explore:
g(x) ∝ √(1 − x²);
g(x) ∝ 1 − x.


IS for π: g(x) ∝ √(1 − x²)

Plan: try g(x) ∝ √(1 − x²) (where ∝ means “proportional to”);
get the distribution function G; then we can simulate from g using
the inverse distribution function method: solve G(x) = u, etc.

Set g(x) = √(1 − x²). With Maple we find:

$$G(t) = \int_0^t g(x)\,dx = \frac{1}{2}\,t\sqrt{1-t^2} + \frac{1}{2}\arcsin(t).$$

We forgot to normalize G: divide by G(1), which equals ¼π.

Even if you could solve G(t) = u, the answer involves our
unknown π . . .
This happens if you try to get the zero-variance g: a vicious circle.


IS for π: g (x) ∝ 1 − x

If we take g₁(x) = 2(1 − x) for 0 ≤ x ≤ 1, we find
G₁(x) = 1 − (1 − x)² for 0 ≤ x ≤ 1, and

$$G_1(x) = u \iff 1 - u = (1 - x)^2 \iff x = 1 - \sqrt{1 - u};$$

so X = 1 − √(1 − U) has cdf G₁.
If we simulate (SchatPi_IS.m) we find:
Var(k(X)w(X)) ≈ 2.14, a deterioration from 0.797.
Explanation: for this X,

$$k(X)\,w(X) = 2\sqrt{\frac{1+X}{1-X}},$$

and this goes to ∞ as X approaches 1.

[Figure: plot of k(x)w(x) = 2√((1 + x)/(1 − x)) on [0, 1), blowing up near x = 1.]
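The deterioration is easy to reproduce. A Python sketch along the lines of SchatPi_IS.m (my reconstruction; the .m file itself is not shown in the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
M = 1_000_000
U = rng.random(M)

# Plain estimator: k(U) = 4*sqrt(1 - U^2), with E[k(U)] = pi, Var ≈ 0.797.
plain = 4.0 * np.sqrt(1.0 - U**2)

# IS with g(x) = 2(1 - x): X = 1 - sqrt(1 - U) has cdf G(x) = 1 - (1 - x)^2,
# and the weighted samples are k(X) * w(X) = 2*sqrt((1 + X)/(1 - X)).
X = 1.0 - np.sqrt(1.0 - U)
weighted = 2.0 * np.sqrt((1.0 + X) / (1.0 - X))

print(plain.mean(), plain.var())        # ≈ pi, ≈ 0.797
print(weighted.mean(), weighted.var())  # ≈ pi, ≈ 2.1: worse!
```

Both estimators are unbiased, but the IS version has a larger variance, confirming the slide's warning that a poorly chosen g makes things worse.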


A simple example of how importance sampling may help

For the price of a European call option we can write V = E[k(Z)],
with Z standard normal and

$$k(Z) = e^{-rT} \max\!\left(S_0\, e^{\left(r - \frac{\sigma^2}{2}\right)T + \sigma\sqrt{T}\,Z} - E,\; 0\right).$$

Suppose we want to determine the price by simulation, for several
high strikes. Parameters: S₀ = 10, σ = 0.1, r = 0.06, T = 1.
Strikes from E = 9 to E = 17; see EurCall_IS.m.

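A crude Monte Carlo version of this experiment, as a Python sketch (EurCall_IS.m is MATLAB and not shown; M = 10⁶ is my guess based on the reported standard errors):

```python
import numpy as np

rng = np.random.default_rng(3)
S0, sigma, r, T = 10.0, 0.1, 0.06, 1.0
M = 1_000_000   # number of replications (my choice)

def eur_call_crude(E):
    """Crude MC estimate of V = E[k(Z)], with its standard error."""
    Z = rng.standard_normal(M)
    ST = S0 * np.exp((r - sigma**2 / 2) * T + sigma * np.sqrt(T) * Z)
    disc_payoff = np.exp(-r * T) * np.maximum(ST - E, 0.0)
    return disc_payoff.mean(), disc_payoff.std(ddof=1) / np.sqrt(M)

for E in (9, 11, 13, 15, 17):
    V, se = eur_call_crude(E)
    print(E, V, se)
```

For high strikes the estimate rests on a handful of non-zero payoffs, which is exactly the problem analysed on the next slides.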

Output from EurCall_IS.m:

strike   V̂          s.e.        s.e. as fraction of V̂
  9      1.54269     0.00097     0.00063
 11      0.25080     0.00050     0.00199
 13      0.008774    0.000087    0.00986
 15      0.0000824   0.0000075   0.09072
 17      0           0           NaN

The price drops off and the relative standard error (s.e. divided by
estimate) increases: 9% at E = 15; at E = 17: error!


Analysis of the simulation

Payoff zero means we are out of the money: S(T) ≤ E.
Writing things out we get S(T) = 10 · exp(0.055 + 0.1 · Z), so

$$k(Z) = 0 \iff S(T) \le E \iff 0.055 + 0.1\,Z \le \ln(E/10).$$

For E = 15 this applies if Z ≤ 3.50.
Therefore, we are trying to estimate E[k(Z)] where

$$P(k(Z) = 0) = P(Z \le 3.5) \approx 0.9998.$$

The majority of the simulated values is ZERO! . . .
At E = 17, apparently all of them. . . .



Importance sampling for the normal, with a shift

In the preceding example the simulated prices were too low to
generate enough non-zero payoffs.
We want to generate higher values of S(T) more frequently.
Writing k(z) for the discounted payoff and $f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$,
our importance sampling equations are:

$$V = \int_{-\infty}^{\infty} k(z)\,f(z)\,dz = \int_{-\infty}^{\infty} k(y)\,\frac{f(y)}{g(y)}\,g(y)\,dy = E[k(Y)\,w(Y)],$$

where

$$w(Y) = \frac{f(Y)}{g(Y)} \quad\text{and } Y \text{ has density } g.$$


Let’s be simple: set Y ∼ N(µ, 1), so $g(y) = \frac{1}{\sqrt{2\pi}} e^{-(y-\mu)^2/2}$,
with µ to be determined. Then:

$$w(y) = \frac{f(y)}{g(y)} = \frac{\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}y^2}}{\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(y-\mu)^2}} = e^{-\frac{1}{2}y^2 + \frac{1}{2}(y-\mu)^2} = e^{\frac{1}{2}\mu^2 - \mu y}.$$

So

$$V = E\!\left[k(Y)\,e^{\frac{1}{2}\mu^2 - \mu Y}\right], \qquad Y \sim N(\mu, 1),$$

may serve as a basis for an importance sampling simulation.
See: EurCall_IS2.m (µ = 3.5).
Even better is a µ that depends on the strike: choose µ so that
P(k(Y) > 0) = 0.5 for each E (EurCall_IS3.m).

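In Python this looks as follows (a sketch of what EurCall_IS3.m presumably does; M = 10⁵ is my choice):

```python
import numpy as np

rng = np.random.default_rng(4)
S0, sigma, r, T = 10.0, 0.1, 0.06, 1.0
M = 100_000

def mu_for(E):
    """Choose mu so that P(k(Y) > 0) = 0.5, i.e. the median of S(T) equals E."""
    return (np.log(E / S0) - (r - sigma**2 / 2) * T) / (sigma * np.sqrt(T))

def eur_call_is(E, mu):
    Y = rng.standard_normal(M) + mu                  # Y ~ N(mu, 1)
    ST = S0 * np.exp((r - sigma**2 / 2) * T + sigma * np.sqrt(T) * Y)
    w = np.exp(mu**2 / 2 - mu * Y)                   # likelihood ratio f/g
    vals = np.exp(-r * T) * np.maximum(ST - E, 0.0) * w
    return vals.mean(), vals.std(ddof=1) / np.sqrt(M)

for E in (13, 15, 17):
    mu = mu_for(E)
    V, se = eur_call_is(E, mu)
    print(E, round(mu, 2), V, se, se / V)
```

Note that mu_for(15) ≈ 3.50 and mu_for(17) ≈ 4.76, matching the µ column in the table on the next slide.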

Results

EurCall_IS3.m: for each E, choose µ so that P(k(Y) > 0) = 0.5.

   µ      E    V̂          s.e.        rel. s.e.
-1.60     9    1.5370     0.0116      0.0075
 0.40    11    0.2510     3.18e-04    0.0013
 2.07    13    0.0088     1.02e-05    0.0012
 3.50    15    8.29e-05   1.14e-07    0.0014
 4.76    17    3.13e-07   4.96e-10    0.0016

Except for the first two strikes, the relative s.e. has improved.
Our criterion is somewhat arbitrary; the idea is to have
enough samples that end in the money, but not too many (try
this out to see for yourself).
There is another point of attention: the weights.



A few remarks about the weights

The expected value of the weights is 1:

$$E[w(Y)] = \int w(y)\,g(y)\,dy = \int \frac{f(y)}{g(y)}\,g(y)\,dy = \int f(y)\,dy = 1.$$

But the distribution can be so skewed (shown below for µ = 3.5)
that we can only look at it on a log scale.

[Figure: histogram of the 10-logs of the simulated weights.]

Note that weights between 10⁻⁶ and 10 are all quite common.

A warning: we need to watch the weight distribution

Some samples have a weight 10⁷ times that of others. . .
This potentially means trouble: imagine
1 observation with weight 10;
999999 observations with weight 10⁻⁶.
Recall that the sample mean and standard deviation are not very
robust to outliers: if the weight distribution is very extreme,
these estimators break down.
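Numerically (for µ = 3.5, as above): ln w(Y) = µ²/2 − µY is normal with mean −µ²/2 and standard deviation µ, so the weights are log-normal and spread over many decades. A Python sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
mu, M = 3.5, 100_000

Y = rng.standard_normal(M) + mu
w = np.exp(mu**2 / 2 - mu * Y)      # log-normal weights, E[w] = 1

log10w = np.log10(w)                 # ~ N(-mu^2/(2 ln 10), (mu/ln 10)^2)
print(np.median(log10w))             # ≈ -2.66: the typical weight is ~10^-2.7
print(log10w.min(), log10w.max())    # the sample spans many decades
print(w.mean())                      # E[w] = 1, but Var(w) = e^{mu^2} - 1 is
                                     # huge, so the sample mean is unstable
```

The last line shows the danger directly: although E[w(Y)] = 1, the sample mean of the weights fluctuates wildly because a few samples carry almost all the weight.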


Importance sampling: multi-dimensional

Suppose I = E[k(Z₁, . . . , Z_m)], where Z_i has density f_i and
Z₁, . . . , Z_m are independent. Then, just as before:

$$I = E[k(Z_1, \dots, Z_m)] = E[k(Y_1, \dots, Y_m)\,w(Y_1, \dots, Y_m)],$$

where Y_i has density g_i, the Y₁, . . . , Y_m are independent, and

$$w(y_1, \dots, y_m) = \frac{f(y_1, \dots, y_m)}{g(y_1, \dots, y_m)} = \frac{\prod_{i=1}^m f_i(y_i)}{\prod_{i=1}^m g_i(y_i)} = \prod_{i=1}^m \frac{f_i(y_i)}{g_i(y_i)}.$$

Requirement, again: the denominator must be positive for all
y₁, . . . , y_m for which the numerator is positive.


The choice of g1 , . . . , gm

A good choice for g₁, . . . , g_m puts larger weight on large
values of k · f (in the ideal situation the quotient is constant).
In practice this is not always easy.
Intuitive principle: make sure that important values occur
(more) often (compare with the earlier example).
For path-dependent options: make sure that paths for which
the option ends in the money have a large(r) probability of
occurring.


Application: Path dependent options

Based on the risk-neutral asset price model

$$S(t_{i+1}) = S(t_i)\, e^{(r - \sigma^2/2)(t_{i+1} - t_i) + \sigma\sqrt{t_{i+1} - t_i}\,Z_i}, \qquad i = 0, \dots, n-1, \quad (1)$$

for S(t₀), . . . , S(t_n), many option prices may be written as

V = E[k(Z₁, . . . , Z_m)] with Z₁, . . . , Z_m independent N(0, 1).

Suppose we take Y₁, . . . , Y_m independent, with Y_i ∼ N(µ_i, 1);
then

$$V = E[k(Y_1, \dots, Y_m)\,w(Y_1, \dots, Y_m)].$$

Weight function:

$$w(y_1, \dots, y_m) = \frac{f(y_1, \dots, y_m)}{g(y_1, \dots, y_m)} = \prod_{i=1}^m \frac{f_i(y_i)}{g_i(y_i)}.$$


The weight function is a product of factors we already saw:

$$w_i(y_i) = \frac{f_i(y_i)}{g_i(y_i)} = \exp\!\left(\tfrac{1}{2}\mu_i^2 - \mu_i y_i\right),$$

so

$$w(y_1, \dots, y_m) = e^{\frac{1}{2}\mu_1^2 - \mu_1 y_1} \cdots e^{\frac{1}{2}\mu_m^2 - \mu_m y_m}.$$

The value k(Y₁, . . . , Y_m) gets the weight

$$w(Y_1, \dots, Y_m) = \prod_{i=1}^m e^{\frac{1}{2}\mu_i^2 - \mu_i Y_i} = \exp\!\left(\tfrac{1}{2}\sum_{i=1}^m \mu_i^2 - \sum_{i=1}^m \mu_i Y_i\right).$$

This is just the product of what we get in the one-dimensional case.


The parameters µ_i should be selected to give “important” paths a
larger chance of occurring.
Looking at

$$S(t_{i+1}) = S(t_i)\, e^{(r - \sigma^2/2)(t_{i+1} - t_i) + \sigma\sqrt{t_{i+1} - t_i}\,Z_i},$$

we see: simulating with Y_i instead of Z_i boils down to
adding an extra drift term µ_i σ√(t_{i+1} − t_i) in the exponent.
This is so because Y_i has the same distribution as Z_i + µ_i, so
it is as if Z_i is replaced by Z_i + µ_i.
With this, one can get some idea of the effect (just as in the
call option example earlier):
with µ_i > 0 we add extra upward drift, and
with µ_i < 0 downward drift.


How do you “steer” the price path with IS?

Assume a fixed stepsize ∆t, T = N · ∆t.

Steering: suppose at time T (not necessarily “expiration”) we
want P(S(T) > E) ≈ 0.5. The current asset price model:

$$S(T) = S_0 \cdot \exp\!\Big(\big(r - \tfrac{1}{2}\sigma^2\big)\,T + \sigma\sqrt{\Delta t}\,\sum_{j=1}^{N} Z_j\Big).$$

Importance sampling: Z_j becomes Y_j = Z_j + µ_j, so the
exponent becomes:

$$\big(r - \tfrac{1}{2}\sigma^2\big)\,T + \sigma\sqrt{\Delta t}\,\sum_{j=1}^{N} Z_j + \sigma\sqrt{\Delta t}\,\sum_{j=1}^{N} \mu_j$$

(the last sum is the extra drift term).


The exponent:

$$\big(r - \tfrac{1}{2}\sigma^2\big)\,T + \sigma\sqrt{\Delta t}\,\sum_{j=1}^{N} Z_j + \sigma\sqrt{\Delta t}\,\sum_{j=1}^{N} \mu_j.$$

The rest is just computation: the exponent has a normal
distribution, so we should try to get its median (which
corresponds to Z_j = 0) to equal ln(E/S₀).
So, solve:

$$\big(r - \tfrac{1}{2}\sigma^2\big)\,T + \sigma\sqrt{\Delta t}\,\sum_{j=1}^{N} \mu_j = \ln(E/S_0).$$

For the common choice µ_j = µ this leads to

$$\mu = \frac{\ln(E/S_0) - \big(r - \tfrac{1}{2}\sigma^2\big)\,T}{\sigma N \sqrt{\Delta t}}.$$

A numerical example: S₀ = 5, E = 6, σ = 0.3, r = 0.05,
T = 1, ∆t = 10⁻³, µ_j = µ for all j; then

$$0.005 + 3\sqrt{10}\,\mu = \ln(6/5), \quad\text{or}\quad \mu \approx 0.0187.$$

See ch19_IS.m for an implementation example.

A variant: given S(t₀), how do we accomplish that at time t₁
the median of S(t₁) is at the barrier level B?
Assume the stepsize is ∆t and t₁ − t₀ = N₁∆t. This leads to
a slightly modified equation and solution:

$$\mu = \frac{\ln(B/S(t_0)) - \big(r - \tfrac{1}{2}\sigma^2\big)(t_1 - t_0)}{\sigma N_1 \sqrt{\Delta t}}.$$

For a down-and-in call with a barrier quite far below S₀, one might
want to first steer the path down to hit the barrier and then
up to be in the money at expiration.
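Putting the pieces together: a Python sketch of a steered path simulation (in the spirit of ch19_IS.m as I read the slides; here the payoff is a plain call on S(T), so the estimate can be checked against the Black–Scholes value of about 0.345):

```python
import numpy as np

rng = np.random.default_rng(6)
S0, E, sigma, r, T = 5.0, 6.0, 0.3, 0.05, 1.0
dt = 1e-3
N = round(T / dt)
M = 10_000

# Constant shift per step so that the median of S(T) equals the strike E.
mu = (np.log(E / S0) - (r - sigma**2 / 2) * T) / (sigma * N * np.sqrt(dt))
print(round(mu, 4))                       # ≈ 0.0187, as on the slide

Y = rng.standard_normal((M, N)) + mu      # Y_j = Z_j + mu, one row per path
sumY = Y.sum(axis=1)
ST = S0 * np.exp((r - sigma**2 / 2) * T + sigma * np.sqrt(dt) * sumY)

# Product of the per-step likelihood ratios: exp(N*mu^2/2 - mu * sum_j Y_j).
w = np.exp(N * mu**2 / 2 - mu * sumY)

vals = np.exp(-r * T) * np.maximum(ST - E, 0.0) * w
print(vals.mean(), vals.std(ddof=1) / np.sqrt(M))   # price and standard error
```

For a genuinely path-dependent payoff you would keep the cumulative sums (the whole path) instead of only sumY; the weight is unchanged.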


Overview importance sampling

Importance sampling: for E[k(X)], purposely simulating X
from a distribution different from the intended one.
To correct for this, simulated values k(X) get weight w(X).
Weights: realizations both smaller and larger than 1 occur.
w is the ratio of two probability densities and is therefore also
called the likelihood ratio or LR.
E[w(X)] = 1, and also E[w(Y₁, . . . , Y_m)] = 1: on average the
weight is 1.
The method is a bit more delicate than, for example,
antithetic variables.
If you push g too far away from f, the method may fail:


A warning and a rule of thumb

IS estimators are always unbiased; variance reduction is not
guaranteed.
Pushed too far, the distribution of the weights w(Y₁, . . . , Y_m)
becomes very skewed, and Var(w(Y₁, . . . , Y_m)) may become
so big that the variance of the IS estimator blows up.
How can this be avoided? Check the range of the
distribution of the weights, either theoretically (in the examples
they have a log-normal distribution) or by looking at a
histogram of (the 10-logs of) the simulated weights.
See EurCall_IS2.m and ch19_IS.m.
A (conservative) rule of thumb: with weights between 10⁻⁵
and 10² you are OK; (far) outside this range you are at risk.
(For some strikes EurCall_IS3.m is on the borderline.)
