
Section 2

CONTINUOUS DISTRIBUTIONS

2.1 Things to remember

The likely behaviour of a continuous random variable X is determined by its probability density function f_X and distribution function F_X:

$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(y)\,dy, \qquad f_X(x) = \frac{d}{dx}F_X(x).$$

The distribution function of a continuous random variable is a continuous function (unlike that of a discrete random variable).

The expectation of X, and of a function of X, is

$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx, \qquad E[h(X)] = \int_{-\infty}^{\infty} h(x) f_X(x)\,dx.$$

The variance of X is

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = E\big[(X - E(X))^2\big] \ge 0.$$


2.2 Some continuous families of distributions

There are some distributions which are particularly useful in modelling real-life quantities. They are grouped into families; we will be taking a look at some of the most important families.
We will be looking at:
• the continuous uniform distribution
• the exponential distribution
• the Gamma distribution
• the normal distribution
We will also think about distributions such as the chi-squared, lognormal and beta, which arise from transformations.

2.2.1 Continuous Uniform

Similar to the discrete uniform distribution, in the sense that the density function is constant over the whole range:

$$f_X(x) = \frac{1}{b-a}, \qquad (a < x < b).$$
Examples of use:
• An athlete is told that they will be given a drugs test at some
random time during the day
• A delivery van driver says he is going to arrive between one
o’clock and four o’clock

$$E(X) = \frac{b+a}{2}, \qquad \mathrm{Var}(X) = \frac{(b-a)^2}{12}.$$
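As a quick sanity check, here is a minimal R sketch (the values a = 1 and b = 4 are illustrative, not from the slides):

```r
# Simulate from U(a, b) and compare sample moments with the formulas
set.seed(1)
a <- 1; b <- 4
x <- runif(100000, min = a, max = b)
mean(x); (b + a) / 2       # both close to 2.5
var(x); (b - a)^2 / 12     # both close to 0.75
```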

2.2.2 Exponential

(cf Poisson). Events in a Poisson process are equally likely to occur at any time; the average rate is λ per unit time. The distribution of the time it takes until the first occurrence is exponential with rate λ:

$$f(x) = \lambda e^{-\lambda x}, \qquad (x > 0).$$

Examples of use:
• the time until the next claim arrives at an insurance office
• the time until the next bus arrives

$$E(X) = \lambda^{-1}, \qquad \mathrm{Var}(X) = \lambda^{-2}.$$
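A minimal R check of these formulas (the rate λ = 2 is an arbitrary illustrative choice):

```r
set.seed(2)
lambda <- 2
x <- rexp(100000, rate = lambda)
mean(x); 1 / lambda     # both close to 0.5
var(x); 1 / lambda^2    # both close to 0.25
```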

2.2.3 Gamma

A non-negative random variable X has a Gamma distribution with shape parameter α and rate parameter β if its density function is

$$f_X(x) = \frac{1}{\Gamma(\alpha)}\, x^{\alpha-1} \beta^{\alpha} e^{-\beta x}, \qquad (x > 0),$$

where Γ(α) is a standard mathematical function given by

$$\Gamma(\alpha) = \int_{0}^{\infty} y^{\alpha-1} e^{-y}\,dy.$$

There is no closed-form expression for F_X(x) unless α is an integer.

The Gamma function has the properties

$$\Gamma(n) = (n-1)! \quad \text{for integer } n \ge 1,$$
$$\Gamma(\alpha) = (\alpha-1)\Gamma(\alpha-1) \quad \text{for all } \alpha > 1.$$
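These identities are easy to verify numerically in R (α = 5 is an arbitrary choice):

```r
# Check the Gamma function identities numerically for alpha = 5
alpha <- 5
integrate(function(y) y^(alpha - 1) * exp(-y), 0, Inf)$value  # ~ 24
gamma(alpha)                    # 24
factorial(alpha - 1)            # 24: Gamma(n) = (n - 1)!
(alpha - 1) * gamma(alpha - 1)  # 24: the recurrence
```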

Properties of the Gamma distribution

The Gamma distribution has a scaling property:
if X ∼ Γ(α, β) and Y = βX, then Y ∼ Γ(α, 1).

If you put α = 1 in the density you will see that Γ(1, β) = Expo(β).

The mean and variance are given by

$$E(X) = \frac{\alpha}{\beta}, \qquad \mathrm{Var}(X) = \frac{\alpha}{\beta^2}.$$
Examples of use:
• If X₁, . . . , Xₙ are independent exponential random variables with rate λ, then Σᵢ Xᵢ ∼ Γ(n, λ).
• If X₁, . . . , Xₙ ∼ Γ(α, β) independently, then Σᵢ Xᵢ ∼ Γ(nα, β).
We will prove both of these results in Chapter 5.
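The first result can be checked empirically in the meantime; a minimal R sketch (n = 5 and λ = 2 are illustrative values):

```r
# Sum of n independent Exp(lambda) variables vs. Gamma(n, lambda)
set.seed(3)
n <- 5; lambda <- 2
sums <- replicate(10000, sum(rexp(n, rate = lambda)))
mean(sums); n / lambda                            # both close to 2.5
ks.test(sums, pgamma, shape = n, rate = lambda)   # large p-value expected
```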

2.2.4 Normal

Many continuous random quantities are assumed to be Normally distributed: it is the standard assumption if you have no reason to believe otherwise.

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \qquad (-\infty < x < \infty).$$

Mean and variance are:

$$E(X) = \mu, \qquad \mathrm{Var}(X) = \sigma^2.$$

Normal

The graph of the density function is a ‘bell-shaped’ curve.


[Figure: the N(2, 3²) density, dnorm(x, mean = 2, sd = 3), plotted for x from −5 to 10.]
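The curve in the figure can be reproduced with one line of R; the mean 2 and standard deviation 3 are read off the original axis label:

```r
# Bell-shaped density of a N(2, 3^2) random variable
curve(dnorm(x, mean = 2, sd = 3), from = -5, to = 10,
      ylab = "dnorm(x, mean = 2, sd = 3)")
```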

The Standard Normal distribution

When µ = 0 and σ = 1
• the distribution is called the standard Normal distribution
• we use the notation Φ(z) for the distribution function, φ(z) for
the density.
It has two key properties.
• The symmetry property
If Z ∼ N(0, 1) then
P(Z < z) = P(Z > −z)

in other words, Φ(z) = 1 − Φ(−z).


• The scaling property
If Y ∼ N(µ, σ 2 ) and Z = (Y − µ)/σ then Z ∼ N(0, 1).

Calculation of Normal probabilities

There are tables of Φ, at least over the range z > 0.


Example
If Y ∼ N(25, 10²) and we want to calculate P(22 < Y < 32), we let Z = (Y − 25)/10 and say
$$P(22 < Y < 32) = P(-0.3 < Z < 0.7) = \Phi(0.7) - \Phi(-0.3) = \Phi(0.7) + \Phi(0.3) - 1.$$

You will also find tables showing the percentage points of Φ, where you specify a value of p in the range (0, 1) and want to know what value of z gives P(Z > z) = p, i.e., we calculate
$$z_p = \Phi^{-1}(1-p).$$
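In R the tables are replaced by pnorm and qnorm; a short sketch reproducing the example above (p = 0.05 is an illustrative choice for the percentage point):

```r
# P(22 < Y < 32) for Y ~ N(25, 10^2), directly and in standardised form
pnorm(32, mean = 25, sd = 10) - pnorm(22, mean = 25, sd = 10)
pnorm(0.7) + pnorm(0.3) - 1   # same value, ~ 0.376

# Percentage point: the z with P(Z > z) = p, i.e. z_p = qnorm(1 - p)
p <- 0.05
qnorm(1 - p)                  # ~ 1.645
```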

Normal approximations to Binomial

As already seen,
• when k is large and θ small, the probability function of Bin(k, θ)
can be approximated by that of Pois(kθ).
• However, this is only really useful when kθ is not very large.
If kθ > 5 and k(1 − θ) > 5 it is more helpful to use a Normal
approximation to Binomial.
Suppose
• X ∼ Bin(k, θ) and
• Y ∼ N(µ, σ²), where µ = kθ, σ² = kθ(1 − θ).
Then
$$P(X \le x) \approx P\left(Y \le x + \tfrac{1}{2}\right),$$
obtained from tables as above.
The ½ added to (or in some cases subtracted from) x is called the continuity correction, and it arises because we are approximating P(X = x) by P(x − 0.5 < Y < x + 0.5).
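A quick R comparison shows how the correction helps (k = 40 and θ = 0.3 are illustrative, chosen so that kθ = 12 > 5 and k(1 − θ) = 28 > 5):

```r
k <- 40; theta <- 0.3
mu <- k * theta; sigma <- sqrt(k * theta * (1 - theta))
x <- 10
pbinom(x, size = k, prob = theta)       # exact Binomial probability
pnorm(x + 0.5, mean = mu, sd = sigma)   # Normal approx. with continuity correction
pnorm(x, mean = mu, sd = sigma)         # without the correction: noticeably worse
```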
Other Normal approximations

The Normal approximation to Poisson is also fine, as long as µ > 5:
• again we would use the continuity correction
• the mean and variance of the approximating Normal random variable would both be equal to µ
It is not just the Binomial and Poisson which can be so approximated.
As we shall see in Chapter 4, if a sample of size n is taken from any
distribution with finite mean and variance, the sample mean is
approximately normally distributed for large n.
When the underlying distribution is continuous we do not need to use
the continuity correction.
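The Poisson case works the same way; a minimal R sketch (µ = 9 and x = 7 are chosen for illustration):

```r
mu <- 9; x <- 7
ppois(x, lambda = mu)                      # exact Poisson probability
pnorm(x + 0.5, mean = mu, sd = sqrt(mu))   # continuity-corrected Normal approximation
```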

2.3 Distributions arising from transformations

Many distributions which are used in statistics and in probability theory are derived from simpler distributions by means of transformations.

2.3.1 The chi-squared distribution

If Z is a standard Normal variable, then Z² has a chi-squared distribution with one degree of freedom, Z² ∼ χ²₁.

If Z₁, . . . , Zₙ are independent N(0, 1) variables, then
$$\sum_{i=1}^{n} Z_i^2 \sim \chi^2_n.$$

Chi-squared is a special case of Gamma:

$$\chi^2_n = \Gamma\left(\tfrac{1}{2}n, \tfrac{1}{2}\right).$$

This enables you to write down the density,

$$f(x) = \frac{1}{\Gamma\left(\frac{n}{2}\right)} \left(\frac{1}{2}\right)^{n/2} x^{\frac{n}{2}-1} e^{-\frac{x}{2}}, \qquad x > 0.$$
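Both relationships can be checked in R (n = 3 degrees of freedom and x = 2.5 are illustrative):

```r
# Chi-squared and Gamma(n/2, 1/2) have the same cdf
n <- 3; x <- 2.5
pchisq(x, df = n)
pgamma(x, shape = n / 2, rate = 1 / 2)   # same value

# Sum of n squared standard Normals behaves like chi-squared on n df
set.seed(4)
z2 <- replicate(10000, sum(rnorm(n)^2))
mean(z2)   # close to n, the chi-squared mean
```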

2.3.2 The lognormal distribution

If
• X is Normal with mean µ and variance σ²,
• Y = e^X,
then Y is said to have the lognormal distribution with parameters µ and σ².
Note that e^µ is the median of Y, but not the expectation.
Lognormal variables are always positive, unlike Normal. They are right-skewed, again unlike Normal.
Examples of use:
• The value of a share index today is 1250. The value in one year's time might be modelled as a lognormal random variable.
• When valuing put and call options using the Black-Scholes formula, a fundamental assumption is that the value of the "underlying" (the asset on which the options are based) at the exercise time is lognormally distributed.
Lognormal mean and variance

We will show in Chapter 5 that

$$E[Y] = E[e^X] = e^{\mu + \frac{1}{2}\sigma^2} \qquad \text{and} \qquad E[Y^2] = E[e^{2X}] = e^{2\mu + 2\sigma^2},$$

so that

$$\mathrm{Var}[Y] = e^{2\mu+\sigma^2}\left(e^{\sigma^2} - 1\right).$$
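These formulas can be checked by simulation in R (µ = 0 and σ = 0.5 are illustrative values):

```r
set.seed(5)
mu <- 0; sigma <- 0.5
y <- rlnorm(100000, meanlog = mu, sdlog = sigma)
mean(y); exp(mu + sigma^2 / 2)                       # both ~ 1.13
var(y); exp(2 * mu + sigma^2) * (exp(sigma^2) - 1)   # both ~ 0.36
median(y); exp(mu)                                   # the median is e^mu
```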

2.3.3 The beta distribution

A beta-distributed random variable takes values only in the range (0, 1). We will find the distribution useful when we come to Bayesian statistics.

If X and Y are independent, with X ∼ Γ(α, 1) and Y ∼ Γ(β, 1), then
$$\frac{X}{X+Y} \sim \mathrm{Beta}(\alpha, \beta).$$
Beta(1, 1) is the same as U(0, 1).
$$f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)} \quad \text{for } 0 < x < 1, \qquad \text{where } B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}.$$

$$E[X] = \frac{\alpha}{\alpha+\beta}, \qquad \mathrm{Var}[X] = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}.$$
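The Gamma-ratio construction can be verified by simulation in R (α = 2 and β = 5 are illustrative; a and b in the code stand for α and β):

```r
set.seed(6)
a <- 2; b <- 5
x <- rgamma(10000, shape = a, rate = 1)
y <- rgamma(10000, shape = b, rate = 1)
w <- x / (x + y)
mean(w); a / (a + b)                        # both ~ 0.286
ks.test(w, pbeta, shape1 = a, shape2 = b)   # large p-value expected
```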

2.4 Transformations

Often we are interested in functions of random variables:
• Log-return X, asset price Y = y₀e^X
• Vulnerability of a building to peak wind speed, Y = cX³
More generally, for X with distribution function FX
• Set Y = g(X)
• What is the probability distribution of Y?

2.4.1 Linear transformations

The simplest transformation: Y = aX + b.

We already know that:
$$E(Y) = aE(X) + b, \qquad \mathrm{Var}(Y) = a^2\,\mathrm{Var}(X).$$

More generally, for a > 0:
$$F_Y(y) = F_X\left(\frac{y-b}{a}\right), \qquad f_Y(y) = \frac{1}{a}\, f_X\left(\frac{y-b}{a}\right).$$
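A short R check of the moment formulas, taking X ∼ N(0, 1) with a = 2 and b = 3 as illustrative choices:

```r
set.seed(7)
a <- 2; b <- 3
x <- rnorm(100000)   # X ~ N(0, 1)
y <- a * x + b
mean(y); b           # E(Y) = a * E(X) + b = b here
var(y); a^2          # Var(Y) = a^2 * Var(X) = a^2 here
```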

2.4.2 Monotone transformations

Finding the distribution of Y = g(X):

First note that
$$F_Y(y) = P(Y \le y) = P(g(X) \le y).$$

If F_X is continuous and g strictly monotone:

$$g \text{ strictly increasing} \;\Rightarrow\; F_Y(y) = F_X\big(g^{-1}(y)\big),$$
$$g \text{ strictly decreasing} \;\Rightarrow\; F_Y(y) = 1 - F_X\big(g^{-1}(y)\big).$$

The density of Y is derived by differentiation.


Note: it is possible to go straight from the density of X to the density
of Y, as long as you multiply by the correct Jacobian. If you don’t
know about Jacobians, don’t even attempt this: it is safer to work out
FY first.

Example of a monotone transformation

Example
Suppose that X ∼ U(0, 1) and Y = 1/X. Then
$$F_Y(y) = P[Y \le y] = P[1/X \le y] = P[X \ge 1/y] = \int_{1/y}^{1} dx = 1 - 1/y$$
as long as y > 1. Therefore, by differentiation,
$$f_Y(y) = y^{-2} \qquad (y > 1).$$
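A simulation check of this result in R (the sample size and the evaluation points are arbitrary):

```r
set.seed(8)
y <- 1 / runif(100000)
# The empirical cdf should match F_Y(y) = 1 - 1/y for y > 1
mean(y <= 2); 1 - 1/2   # both ~ 0.5
mean(y <= 4); 1 - 1/4   # both ~ 0.75
```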

2.5 Random numbers and simulation

Many applications require the user to come up with simulated quantities from specific distributions. This leads to the question:
• given a distribution function F(x), how do we simulate an
observation from the distribution?
Computers find it easy to simulate numbers which are uniformly
distributed between 0 and 1: these are pseudo-random numbers.
So if U is a pseudo-random number from the U(0, 1) distribution, we want to find a function H with the property that,
• if X = H(U), then X is a pseudo-random variable with distribution function F.

2.5.1 Principles of simulation

The function we want is H = F⁻¹, the inverse of F.

Why does this work?
Suppose U ∼ U(0, 1) and X is defined as F⁻¹(U). Then
$$P[X \le x] = P[F^{-1}(U) \le x] = P[U \le F(x)] = F(x).$$

Notice that this method works well if F is easily inverted. In other cases, where there is no closed-form expression for the inverse of F, the calculation of F⁻¹ needs a lot of numerical work; other methods are often used instead.
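As a sketch of the general recipe (the helper simulate_inverse is hypothetical, not from the notes; R's built-in q-functions are exactly such inverses F⁻¹):

```r
# Inversion sampling: X = F^{-1}(U) with U ~ U(0, 1)
simulate_inverse <- function(n, inv_cdf) {
  inv_cdf(runif(n))
}

# Example using the standard Normal inverse cdf, qnorm
z <- simulate_inverse(10000, qnorm)
mean(z); sd(z)   # approximately 0 and 1
```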

2.5.2 Examples of simulation

Example
If F(x) = 1 − e^{−λx} then
$$F^{-1}(u) = -\frac{1}{\lambda}\log(1-u),$$
so we set
$$X = -\frac{1}{\lambda}\log(1-U).$$
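A quick R check of this recipe against R's built-in exponential sampler (λ = 3 is an arbitrary choice):

```r
set.seed(9)
lambda <- 3
x <- -log(1 - runif(10000)) / lambda
mean(x); 1 / lambda                                   # both ~ 0.333
qqplot(x, rexp(10000, rate = lambda)); abline(0, 1)   # points near the line
```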
Example
If we want to simulate from N(µ, σ²), we observe that
$$F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right), \qquad \text{so} \qquad F^{-1}(u) = \mu + \sigma\,\Phi^{-1}(u),$$
and we can set
$$X = \mu + \sigma Z, \quad \text{where } Z = \Phi^{-1}(U).$$
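In R, Φ⁻¹ is qnorm, so the recipe is one line (µ = 25 and σ = 10 reuse the values from the earlier probability example):

```r
mu <- 25; sigma <- 10
x <- mu + sigma * qnorm(runif(10000))
mean(x); sd(x)   # approximately 25 and 10
```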
Examples of simulation

Example
Suppose F(x) = e^{λx}/(e^{λx} + e^{−λx}). If X = F⁻¹(U), then U = F(X), so
$$U = \frac{1}{1 + e^{-2\lambda X}}, \qquad e^{-2\lambda X} = \frac{1}{U} - 1, \qquad X = -\frac{1}{2\lambda}\log\left(\frac{1}{U} - 1\right).$$

So you simulate U from U(0, 1), then apply the formula to produce
the simulated value of X.
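A minimal R version of this (λ = 1 is an arbitrary choice; the empirical cdf is checked against F at one point):

```r
set.seed(10)
lambda <- 1
u <- runif(10000)
x <- -log(1 / u - 1) / (2 * lambda)
# Check against the target F(x) = 1 / (1 + exp(-2 * lambda * x))
mean(x <= 0.5); 1 / (1 + exp(-2 * lambda * 0.5))   # both ~ 0.73
```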

2.6 Summary

In this chapter we have encountered the continuous uniform, exponential, gamma, and normal distributions, as well as the χ², lognormal and beta.
We have also seen how to work out the distribution of a function of a
random variable and how to simulate random variables from
continuous distributions with invertible distribution functions.

