
Section 2

CONTINUOUS DISTRIBUTIONS

2.1 Things to remember

The likely behaviour of a continuous random variable X is determined by its probability density function f_X and distribution function F_X:

$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(y)\,dy, \qquad f_X(x) = \frac{d}{dx}F_X(x).$$

The distribution function of a continuous random variable is a continuous function (unlike that of a discrete random variable).

The expectation of X, and of a function of X, is

$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx, \qquad E[h(X)] = \int_{-\infty}^{\infty} h(x) f_X(x)\,dx.$$

The variance of X is

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = E\big[(X - E(X))^2\big] \ge 0.$$


2.2 Some continuous families of distributions

There are some distributions which are particularly useful in modelling real-life quantities. They are grouped into families; we will be taking a look at some of the most important families.
We will be looking at:
• the continuous uniform distribution
• the exponential distribution
• the Gamma distribution
• the normal distribution
We will also think about distributions such as the chi-squared, lognormal and beta, which arise from transformations.

2.2.1 Continuous Uniform

Similar to the discrete uniform distribution, in the sense that the density function is constant over the whole range:

$$f_X(x) = \frac{1}{b-a}, \qquad (a < x < b).$$
Examples of use:
• An athlete is told that they will be given a drugs test at some
random time during the day
• A delivery van driver says he is going to arrive between one
o’clock and four o’clock

$$E(X) = \frac{b+a}{2}, \qquad \mathrm{Var}(X) = \frac{(b-a)^2}{12}.$$
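As a quick sanity check, here is a minimal R sketch (the values a = 1 and b = 4 are illustrative, not from the slides):

```r
# Simulate from U(a, b) and compare sample moments with the formulas
set.seed(1)
a <- 1; b <- 4
x <- runif(100000, min = a, max = b)
mean(x); (b + a) / 2       # both close to 2.5
var(x); (b - a)^2 / 12     # both close to 0.75
```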

2.2.2 Exponential

(cf Poisson). Events in a Poisson process are equally likely to occur at any time; the average rate is λ per unit time. The distribution of the time it takes until the first occurrence is exponential with rate λ:

$$f(x) = \lambda e^{-\lambda x}, \qquad (x > 0).$$

Examples of use:
• the time until the next claim arrives at an insurance office
• the time until the next bus arrives

$$E(X) = \lambda^{-1}, \qquad \mathrm{Var}(X) = \lambda^{-2}.$$
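A minimal R check of these formulas (the rate λ = 2 is an arbitrary illustrative choice):

```r
set.seed(2)
lambda <- 2
x <- rexp(100000, rate = lambda)
mean(x); 1 / lambda     # both close to 0.5
var(x); 1 / lambda^2    # both close to 0.25
```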

2.2.3 Gamma

A non-negative random variable X has a Gamma distribution with shape parameter α and rate parameter β if its density function is

$$f_X(x) = \frac{1}{\Gamma(\alpha)}\, x^{\alpha-1} \beta^{\alpha} e^{-\beta x}, \qquad (x > 0),$$

where Γ(α) is a standard mathematical function given by

$$\Gamma(\alpha) = \int_{0}^{\infty} y^{\alpha-1} e^{-y}\,dy.$$

There is no closed-form expression for F_X(x) unless α is an integer.

The Gamma function has the properties

$$\Gamma(n) = (n-1)! \quad \text{for integer } n \ge 1,$$
$$\Gamma(\alpha) = (\alpha-1)\Gamma(\alpha-1) \quad \text{for all } \alpha > 1.$$
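These identities are easy to verify numerically in R (α = 5 is an arbitrary choice):

```r
# Check the Gamma function identities numerically for alpha = 5
alpha <- 5
integrate(function(y) y^(alpha - 1) * exp(-y), 0, Inf)$value  # ~ 24
gamma(alpha)                    # 24
factorial(alpha - 1)            # 24: Gamma(n) = (n - 1)!
(alpha - 1) * gamma(alpha - 1)  # 24: the recurrence
```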

Properties of the Gamma distribution

The Gamma distribution has a scaling property:
if X ∼ Γ(α, β) and Y = βX, then Y ∼ Γ(α, 1).

If you put α = 1 in the density you will see that Γ(1, β) = Expo(β).

The mean and variance are given by

$$E(X) = \frac{\alpha}{\beta}, \qquad \mathrm{Var}(X) = \frac{\alpha}{\beta^2}.$$
Examples of use:
• If X₁, . . . , Xₙ are independent exponential random variables with rate λ, then Σᵢ Xᵢ ∼ Γ(n, λ).
• If X₁, . . . , Xₙ ∼ Γ(α, β) independently, then Σᵢ Xᵢ ∼ Γ(nα, β).
We will prove both of these results in Chapter 5.
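The first result can be checked empirically in the meantime; a minimal R sketch (n = 5 and λ = 2 are illustrative values):

```r
# Sum of n independent Exp(lambda) variables vs. Gamma(n, lambda)
set.seed(3)
n <- 5; lambda <- 2
sums <- replicate(10000, sum(rexp(n, rate = lambda)))
mean(sums); n / lambda                            # both close to 2.5
ks.test(sums, pgamma, shape = n, rate = lambda)   # large p-value expected
```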

2.2.4 Normal

Many continuous random quantities are assumed to be Normally distributed: it is the standard assumption if you have no reason to believe otherwise.

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \qquad (-\infty < x < \infty).$$

Mean and variance are:

$$E(X) = \mu, \qquad \mathrm{Var}(X) = \sigma^2.$$

Normal

The graph of the density function is a ‘bell-shaped’ curve.


[Figure: the N(2, 3²) density, dnorm(x, mean = 2, sd = 3), plotted for x from −5 to 10.]
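The curve in the figure can be reproduced with one line of R; the mean 2 and standard deviation 3 are read off the original axis label:

```r
# Bell-shaped density of a N(2, 3^2) random variable
curve(dnorm(x, mean = 2, sd = 3), from = -5, to = 10,
      ylab = "dnorm(x, mean = 2, sd = 3)")
```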

The Standard Normal distribution

When µ = 0 and σ = 1
• the distribution is called the standard Normal distribution
• we use the notation Φ(z) for the distribution function, φ(z) for
the density.
It has two key properties.
• The symmetry property
If Z ∼ N(0, 1) then
P(Z < z) = P(Z > −z)

in other words, Φ(z) = 1 − Φ(−z).


• The scaling property
If Y ∼ N(µ, σ 2 ) and Z = (Y − µ)/σ then Z ∼ N(0, 1).

Calculation of Normal probabilities

There are tables of Φ, at least over the range z > 0.


Example
If Y ∼ N(25, 10²) and we want to calculate P(22 < Y < 32), we let Z = (Y − 25)/10 and say
$$P(22 < Y < 32) = P(-0.3 < Z < 0.7) = \Phi(0.7) - \Phi(-0.3) = \Phi(0.7) + \Phi(0.3) - 1.$$

You will also find tables showing the percentage points of Φ, where you specify a value of p in the range (0, 1) and want to know what value of z gives P(Z > z) = p, i.e., we calculate
$$z_p = \Phi^{-1}(1-p).$$
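In R the tables are replaced by pnorm and qnorm; a short sketch reproducing the example above (p = 0.05 is an illustrative choice for the percentage point):

```r
# P(22 < Y < 32) for Y ~ N(25, 10^2), directly and in standardised form
pnorm(32, mean = 25, sd = 10) - pnorm(22, mean = 25, sd = 10)
pnorm(0.7) + pnorm(0.3) - 1   # same value, ~ 0.376

# Percentage point: the z with P(Z > z) = p, i.e. z_p = qnorm(1 - p)
p <- 0.05
qnorm(1 - p)                  # ~ 1.645
```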

Normal approximations to Binomial

As already seen,
• when k is large and θ small, the probability function of Bin(k, θ)
can be approximated by that of Pois(kθ).
• However, this is only really useful when kθ is not very large.
If kθ > 5 and k(1 − θ) > 5 it is more helpful to use a Normal
approximation to Binomial.
Suppose
• X ∼ Bin(k, θ) and
• Y ∼ N(µ, σ²), where µ = kθ, σ² = kθ(1 − θ).
Then
$$P(X \le x) \approx P\left(Y \le x + \tfrac{1}{2}\right),$$
obtained from tables as above.
The ½ added to (or in some cases subtracted from) x is called the continuity correction, and it arises because we are approximating P(X = x) by P(x − 0.5 < Y < x + 0.5).
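A quick R comparison shows how the correction helps (k = 40 and θ = 0.3 are illustrative, chosen so that kθ = 12 > 5 and k(1 − θ) = 28 > 5):

```r
k <- 40; theta <- 0.3
mu <- k * theta; sigma <- sqrt(k * theta * (1 - theta))
x <- 10
pbinom(x, size = k, prob = theta)       # exact Binomial probability
pnorm(x + 0.5, mean = mu, sd = sigma)   # Normal approx. with continuity correction
pnorm(x, mean = mu, sd = sigma)         # without the correction: noticeably worse
```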
Other Normal approximations

The Normal approximation to Poisson is also fine, as long as µ > 5:
• again we would use the continuity correction
• the mean and variance of the approximating Normal random variable would both be equal to µ
It is not just the Binomial and Poisson which can be so approximated.
As we shall see in Chapter 4, if a sample of size n is taken from any
distribution with finite mean and variance, the sample mean is
approximately normally distributed for large n.
When the underlying distribution is continuous we do not need to use
the continuity correction.
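The Poisson case works the same way; a minimal R sketch (µ = 9 and x = 7 are chosen for illustration):

```r
mu <- 9; x <- 7
ppois(x, lambda = mu)                      # exact Poisson probability
pnorm(x + 0.5, mean = mu, sd = sqrt(mu))   # continuity-corrected Normal approximation
```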

2.3 Distributions arising from transformations

Many distributions which are used in statistics and in probability theory are derived from simpler distributions by means of transformations.

2.3.1 The chi-squared distribution

If Z is a standard Normal variable, then Z² has a chi-squared distribution with one degree of freedom, Z² ∼ χ²₁.

If Z₁, . . . , Zₙ are independent N(0, 1) variables, then
$$\sum_{i=1}^{n} Z_i^2 \sim \chi^2_n.$$

Chi-squared is a special case of Gamma:

$$\chi^2_n = \Gamma\left(\tfrac{1}{2}n, \tfrac{1}{2}\right).$$

This enables you to write down the density,

$$f(x) = \frac{1}{\Gamma\left(\frac{n}{2}\right)} \left(\frac{1}{2}\right)^{n/2} x^{\frac{n}{2}-1} e^{-\frac{x}{2}}, \qquad x > 0.$$
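Both relationships can be checked in R (n = 3 degrees of freedom and x = 2.5 are illustrative):

```r
# Chi-squared and Gamma(n/2, 1/2) have the same cdf
n <- 3; x <- 2.5
pchisq(x, df = n)
pgamma(x, shape = n / 2, rate = 1 / 2)   # same value

# Sum of n squared standard Normals behaves like chi-squared on n df
set.seed(4)
z2 <- replicate(10000, sum(rnorm(n)^2))
mean(z2)   # close to n, the chi-squared mean
```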

2.3.2 The lognormal distribution

If
• X is Normal with mean µ and variance σ²,
• Y = e^X,
then Y is said to have the lognormal distribution with parameters µ and σ².
Note that e^µ is the median of Y, but not the expectation.
Lognormal variables are always positive, unlike Normal. They are right-skewed, again unlike Normal.
Examples of use:
• The value of a share index today is 1250. The value in one year's time might be modelled as a lognormal random variable.
• When valuing put and call options using the Black-Scholes formula, a fundamental assumption is that the value of the "underlying" (the asset on which the options are based) at the exercise time is lognormally distributed.
Lognormal mean and variance

We will show in Chapter 5 that

$$E[Y] = E[e^X] = e^{\mu + \frac{1}{2}\sigma^2} \qquad \text{and} \qquad E[Y^2] = E[e^{2X}] = e^{2\mu + 2\sigma^2},$$

so that

$$\mathrm{Var}[Y] = e^{2\mu+\sigma^2}\left(e^{\sigma^2} - 1\right).$$
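These formulas can be checked by simulation in R (µ = 0 and σ = 0.5 are illustrative values):

```r
set.seed(5)
mu <- 0; sigma <- 0.5
y <- rlnorm(100000, meanlog = mu, sdlog = sigma)
mean(y); exp(mu + sigma^2 / 2)                       # both ~ 1.13
var(y); exp(2 * mu + sigma^2) * (exp(sigma^2) - 1)   # both ~ 0.36
median(y); exp(mu)                                   # the median is e^mu
```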

2.3.3 The beta distribution

A beta-distributed random variable takes values only in the range (0, 1). We will find the distribution useful when we come to Bayesian statistics.

If X and Y are independent, with X ∼ Γ(α, 1) and Y ∼ Γ(β, 1), then
$$\frac{X}{X+Y} \sim \mathrm{Beta}(\alpha, \beta).$$
Beta(1, 1) is the same as U(0, 1).
$$f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)} \quad \text{for } 0 < x < 1, \qquad \text{where } B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}.$$

$$E[X] = \frac{\alpha}{\alpha+\beta}, \qquad \mathrm{Var}[X] = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}.$$
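The Gamma-ratio construction can be verified by simulation in R (α = 2 and β = 5 are illustrative; a and b in the code stand for α and β):

```r
set.seed(6)
a <- 2; b <- 5
x <- rgamma(10000, shape = a, rate = 1)
y <- rgamma(10000, shape = b, rate = 1)
w <- x / (x + y)
mean(w); a / (a + b)                        # both ~ 0.286
ks.test(w, pbeta, shape1 = a, shape2 = b)   # large p-value expected
```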

2.4 Transformations

Often we are interested in functions of random variables:
• Log-return X, asset price Y = y₀e^X
• Vulnerability of a building to peak wind speed, Y = cX³
More generally, for X with distribution function FX
• Set Y = g(X)
• What is the probability distribution of Y?

2.4.1 Linear transformations

The simplest transformation: Y = aX + b.

We already know that:
$$E(Y) = aE(X) + b, \qquad \mathrm{Var}(Y) = a^2\,\mathrm{Var}(X).$$

More generally, for a > 0:
$$F_Y(y) = F_X\left(\frac{y-b}{a}\right), \qquad f_Y(y) = \frac{1}{a}\, f_X\left(\frac{y-b}{a}\right).$$
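A short R check of the moment formulas, taking X ∼ N(0, 1) with a = 2 and b = 3 as illustrative choices:

```r
set.seed(7)
a <- 2; b <- 3
x <- rnorm(100000)   # X ~ N(0, 1)
y <- a * x + b
mean(y); b           # E(Y) = a * E(X) + b = b here
var(y); a^2          # Var(Y) = a^2 * Var(X) = a^2 here
```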

2.4.2 Monotone transformations

Finding the distribution of Y = g(X):

First note that
$$F_Y(y) = P(Y \le y) = P(g(X) \le y).$$

If F_X is continuous and g strictly monotone:

$$g \text{ strictly increasing} \;\Rightarrow\; F_Y(y) = F_X\big(g^{-1}(y)\big),$$
$$g \text{ strictly decreasing} \;\Rightarrow\; F_Y(y) = 1 - F_X\big(g^{-1}(y)\big).$$

The density of Y is derived by differentiation.


Note: it is possible to go straight from the density of X to the density
of Y, as long as you multiply by the correct Jacobian. If you don’t
know about Jacobians, don’t even attempt this: it is safer to work out
FY first.

Example of a monotone transformation

Example
Suppose that X ∼ U(0, 1) and Y = 1/X. Then
$$F_Y(y) = P[Y \le y] = P[1/X \le y] = P[X \ge 1/y] = \int_{1/y}^{1} dx = 1 - 1/y$$
as long as y > 1. Therefore, by differentiation,
$$f_Y(y) = y^{-2} \qquad (y > 1).$$
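A simulation check of this result in R (the sample size and the evaluation points are arbitrary):

```r
set.seed(8)
y <- 1 / runif(100000)
# The empirical cdf should match F_Y(y) = 1 - 1/y for y > 1
mean(y <= 2); 1 - 1/2   # both ~ 0.5
mean(y <= 4); 1 - 1/4   # both ~ 0.75
```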

2.5 Random numbers and simulation

Many applications require the user to come up with simulated quantities from specific distributions. This leads to the question:
• given a distribution function F(x), how do we simulate an
observation from the distribution?
Computers find it easy to simulate numbers which are uniformly
distributed between 0 and 1: these are pseudo-random numbers.
So if U is a pseudo-random number from the U(0, 1) distribution, we want to find a function H with the property that,
• if X = H(U), then X is a pseudo-random variable with distribution function F.

2.5.1 Principles of simulation

The function we want is H = F⁻¹, the inverse of F.

Why does this work?
Suppose U ∼ U(0, 1) and X is defined as F⁻¹(U). Then
$$P[X \le x] = P[F^{-1}(U) \le x] = P[U \le F(x)] = F(x).$$

Notice that this method works well if F is easily inverted. In other cases, where there is no closed-form expression for the inverse of F, the calculation of F⁻¹ needs a lot of numerical work; other methods are often used instead.
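As a sketch of the general recipe (the helper simulate_inverse is hypothetical, not from the notes; R's built-in q-functions are exactly such inverses F⁻¹):

```r
# Inversion sampling: X = F^{-1}(U) with U ~ U(0, 1)
simulate_inverse <- function(n, inv_cdf) {
  inv_cdf(runif(n))
}

# Example using the standard Normal inverse cdf, qnorm
z <- simulate_inverse(10000, qnorm)
mean(z); sd(z)   # approximately 0 and 1
```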

2.5.2 Examples of simulation

Example
If F(x) = 1 − e^{−λx} then
$$F^{-1}(u) = -\frac{1}{\lambda}\log(1-u),$$
so we set
$$X = -\frac{1}{\lambda}\log(1-U).$$
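A quick R check of this recipe against R's built-in exponential sampler (λ = 3 is an arbitrary choice):

```r
set.seed(9)
lambda <- 3
x <- -log(1 - runif(10000)) / lambda
mean(x); 1 / lambda                                   # both ~ 0.333
qqplot(x, rexp(10000, rate = lambda)); abline(0, 1)   # points near the line
```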
Example
If we want to simulate from N(µ, σ²), we observe that
$$F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right), \qquad \text{so} \qquad F^{-1}(u) = \mu + \sigma\,\Phi^{-1}(u),$$
and we can set
$$X = \mu + \sigma Z, \quad \text{where } Z = \Phi^{-1}(U).$$
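In R, Φ⁻¹ is qnorm, so the recipe is one line (µ = 25 and σ = 10 reuse the values from the earlier probability example):

```r
mu <- 25; sigma <- 10
x <- mu + sigma * qnorm(runif(10000))
mean(x); sd(x)   # approximately 25 and 10
```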
Examples of simulation

Example
Suppose F(x) = e^{λx}/(e^{λx} + e^{−λx}). If X = F⁻¹(U), then U = F(X), so
$$U = \frac{1}{1 + e^{-2\lambda X}}, \qquad e^{-2\lambda X} = \frac{1}{U} - 1, \qquad X = -\frac{1}{2\lambda}\log\left(\frac{1}{U} - 1\right).$$

So you simulate U from U(0, 1), then apply the formula to produce
the simulated value of X.
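A minimal R version of this (λ = 1 is an arbitrary choice; the empirical cdf is checked against F at one point):

```r
set.seed(10)
lambda <- 1
u <- runif(10000)
x <- -log(1 / u - 1) / (2 * lambda)
# Check against the target F(x) = 1 / (1 + exp(-2 * lambda * x))
mean(x <= 0.5); 1 / (1 + exp(-2 * lambda * 0.5))   # both ~ 0.73
```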

2.6 Summary

In this chapter we have encountered the continuous uniform, exponential, gamma, and normal distributions, as well as the χ², lognormal and beta.
We have also seen how to work out the distribution of a function of a
random variable and how to simulate random variables from
continuous distributions with invertible distribution functions.

