
2. Continuous Random Variables

Ismaïla Ba

ismaila.ba@umanitoba.ca
STAT 3100 - Winter 2024

Course Outline

1 Introduction

2 Continuous random variables

3 Common continuous distributions

4 Moments, Expectation and Variance

5 Joint distributions

6 Transformation theorem

7 Some useful facts

Introduction

Discrete random variables take their values on countable sets.

Continuous random variables take their values on the "continuum" ℝ.

Probabilities such as P(X = x) are not appropriate for continuous random variables since ∑_{x∈ℝ} P(X = x) really has no meaning.

We need to move away from summations and turn to integration.

Continuous random variables

Definition 1
A random variable X with CDF F(x) is said to be a continuous random variable if there exists a real-valued function f such that

F(x) := P(X ≤ x) = ∫_{−∞}^{x} f(t) dt and f(t) ≥ 0 ∀t ∈ ℝ.

The function f is called the probability density function (pdf) of X, or simply the density of X.

Theorem 2
A function f(x) is a pdf of a continuous random variable X if and only if it satisfies the following conditions:
1 f(x) ≥ 0 for all x ∈ ℝ.
2 ∫_{−∞}^{∞} f(x) dx = 1.


Properties of the CDF F(x)

The CDF F(x) has the following properties:

1 lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1.
2 F(x) is a nondecreasing function of x; that is, for y ≤ x, F(y) ≤ F(x).
3 F(x) is right-continuous; that is, for every number x0, lim_{x↓x0} F(x) := lim_{x→x0, x>x0} F(x) = F(x0).

Example 1
Consider the following CDF: F(x) = 1/(1 + e^{−x}). F(x) satisfies:
lim_{x→−∞} F(x) = 0 since lim_{x→−∞} e^{−x} = ∞, and lim_{x→∞} F(x) = 1 since lim_{x→∞} e^{−x} = 0.
(d/dx) F(x) = e^{−x}/(1 + e^{−x})² > 0, so F(x) is increasing.
F(x) is not only right-continuous, but also continuous.
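This F(x) is the standard logistic CDF, which base R exposes as plogis (with density dlogis), so the claims above are easy to confirm numerically; a small sketch:

# Example 1 numerically: F(x) = 1/(1 + exp(-x)) is the standard logistic CDF
F <- function(x) 1 / (1 + exp(-x))
F(-50); F(50)                        # ~ 0 and ~ 1: the two tail limits
all.equal(F(1.3), plogis(1.3))       # TRUE: matches base R's logistic CDF
h <- 1e-6                            # central-difference derivative at x = 1.3
(F(1.3 + h) - F(1.3 - h)) / (2 * h)
dlogis(1.3)                          # closed form e^{-x}/(1 + e^{-x})^2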


Fundamental Theorem of Calculus

The Fundamental Theorem of Calculus tells us that the pdf f(x) can be obtained by differentiating F(x), so that

f(x) = (d/dx) F(x) = F′(x),

at any point x where F(x) is differentiable. F(x) may not be differentiable at all points x ∈ ℝ, but these points can be safely ignored.

For a continuous random variable X with CDF F and pdf f, we have

P(X = x) = ∫_{x}^{x} f(t) dt = 0, ∀x ∈ ℝ.

That is, the area under a curve at a single point is 0 (this is true even when f(x) > 0!). This has an important practical consequence:

P(a < X ≤ b) = P(a ≤ X ≤ b) = P(a < X < b) = P(a ≤ X < b) = ∫_{a}^{b} f(x) dx.
Common continuous distributions

The Standard Uniform Distribution

The standard uniform distribution is an equal probability model on the interval Ω = (0, 1) (or Ω = [0, 1], or Ω = (0, 1], or Ω = [0, 1)). The pdf is given by f(x) = 1_{(0,1)}(x). The CDF comes in 3 parts:

for x ≤ 0:      F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{x} 1_{(0,1)}(t) dt = ∫_{−∞}^{x} 0 dt = 0
for x ∈ (0, 1): F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{x} 1_{(0,1)}(t) dt = ∫_{0}^{x} 1 dt = x
for x ≥ 1:      F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{x} 1_{(0,1)}(t) dt = ∫_{0}^{1} 1 dt + ∫_{1}^{x} 0 dt = 1

A note on notation: For a random variable X with the standard uniform distribution, we write X ∼ U(0, 1), where we read the symbol "∼" as "is distributed as". We can similarly write X ∼ f(x) or X ∼ F(x), which is shorthand for "X has pdf/pmf f" or "X has CDF F", respectively.
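These three CDF pieces match base R's built-in uniform functions punif and dunif; a quick sketch:

# The three pieces of the U(0,1) CDF, checked against base R
punif(-0.5)              # x <= 0 region: 0
punif(0.3)               # x in (0,1): equals x itself, 0.3
punif(1.7)               # x >= 1 region: 1
dunif(0.3); dunif(1.7)   # pdf is 1 on (0,1) and 0 outside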

The General Uniform Distribution

The pdf of a general uniform distribution on an interval Ω = (a, b), with a < b, is given by

f(x) = f(x; a, b) = c · 1_{(a,b)}(x),

where c > 0 is a constant. The notation f(x; a, b) is a reminder that the pdf depends on the parameters a and b, and that the constant c does also. The CDF F(x) = F(x; a, b) comes in three parts, x ≤ a, x ∈ (a, b) and x ≥ b, with c = 1/(b − a). Here, we write X ∼ U(a, b).


The Normal Distribution

The standard normal distribution, denoted N(0, 1), has pdf

f(x) = (1/√(2π)) e^{−x²/2}, −∞ < x < ∞.

Remark: The standard normal random variable will be denoted by Z.

The normal distribution with mean µ and variance σ² > 0, denoted N(µ, σ²), has pdf

f(x) = f(x; µ, σ²) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)}, −∞ < x < ∞.

The CDF of the normal distribution has no closed form. Tables or computer software (R, for instance) can be used to calculate P(X ≤ x) when X ∼ N(µ, σ²).

The Normal Distribution

Note that if X ∼ N(µ, σ²), then

Z = (X − µ)/σ ∼ N(0, 1),

so only tabulated probabilities for N(0, 1) random variables are required.

If X ∼ N(µ, σ²), then X =ᵈ µ + σZ ∼ N(µ, σ²) (where =ᵈ denotes equality in distribution).

More generally, if X ∼ N(µ, σ²) and a, b ∈ ℝ, then

Y = a + bX ∼ N(a + bµ, b²σ²).

Remark: The normal distribution is the most important one in all of probability and statistics. The CDF of the standard normal distribution is commonly denoted by Φ(z).
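Standardization is exactly how normal probabilities are computed in software; a sketch in R (pnorm is Φ; the values µ = 5, σ = 2, x = 7 are illustrative only):

# P(X <= 7) for X ~ N(5, 4), three equivalent ways
mu <- 5; sigma <- 2
pnorm(7, mean = mu, sd = sigma)   # direct
pnorm((7 - mu) / sigma)           # standardized: P(Z <= (x - mu)/sigma)
pnorm(1)                          # same number: Phi(1)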

The Exponential Distribution

Let β > 0 and consider a continuous random variable X with pdf

f(x) = f(x; β) = (1/β) e^{−x/β} · 1_{(0,∞)}(x).

The indicator function tells us that the random variable X is positive (or at least non-negative).
The constant β is called the scale parameter and in this case, we write X ∼ Exp(β).
( ) Check that f(x) is a proper density.

The CDF of X is given by

F(x; β) = P(X ≤ x) = ∫_{0}^{x} f(t) dt = ∫_{0}^{x} (1/β) e^{−t/β} dt = ∫_{0}^{x/β} e^{−u} du = 1 − e^{−x/β}.
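Base R's pexp implements this CDF, but parameterized by the rate λ = 1/β introduced on the next slide; a hedged sketch with an illustrative β = 3:

beta <- 3; x <- 2             # illustrative values
1 - exp(-x / beta)            # closed-form CDF from above
pexp(x, rate = 1 / beta)      # base R uses the rate parameterization
# the density also integrates to 1, as a proper density must:
integrate(function(t) exp(-t / beta) / beta, 0, Inf)$value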

The Exponential Distribution


( ) Check that, as x → 0, F(x) → 0 and that, as x → ∞, F(x) → 1.

Another commonly used parameterization of the exponential distribution uses λ = 1/β, so that f(x; λ) = λ e^{−λx} · 1_{(0,∞)}(x).
This parameterization has its roots in stochastic processes.
λ is called the rate parameter and in this case, we will use the notation X ∼ Exp_R(λ).
In statistical inference, scale parameters are usually of greater interest, so we will normally use the Exp(β) parameterization.

Proposition 1 (Lack of memory property ( ))

Let X ∼ Exp(β) (or Exp_R(λ)). Then, for s, t ≥ 0,

P(X > s + t | X > s) = P(X > t).
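The property is easy to confirm numerically; a sketch (the values λ = 0.5, s = 1, t = 2 are arbitrary illustrative choices):

# Lack of memory: P(X > s + t | X > s) equals P(X > t)
lambda <- 0.5; s <- 1; t <- 2
surv <- function(x) 1 - pexp(x, rate = lambda)   # survival function P(X > x)
surv(s + t) / surv(s)    # conditional survival probability
surv(t)                  # same number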



The Gamma Distribution

Definition 3
For α > 0, the gamma function Γ(α) is defined by

Γ(α) = ∫_{0}^{∞} x^{α−1} e^{−x} dx.

Remark: The gamma function is analytic over the entire complex plane except at the non-positive integers, but that will not be of interest in this course; only that it is real-valued and continuous for real α ∈ (0, ∞).

(Primary) properties of the gamma function

1 For any α > 1, Γ(α) = (α − 1)Γ(α − 1).
2 For any positive integer n, Γ(n) = (n − 1)!.
3 Γ(1/2) = √π.
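Base R exposes the gamma function as gamma(); a quick check of all three properties (α = 5.5 is an arbitrary test point):

alpha <- 5.5
gamma(alpha); (alpha - 1) * gamma(alpha - 1)   # property 1: equal
gamma(6); factorial(5)                         # property 2: Gamma(n) = (n-1)!
gamma(0.5); sqrt(pi)                           # property 3: Gamma(1/2) = sqrt(pi)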


The Gamma Distribution

A continuous random variable X is said to have a gamma distribution with parameters α > 0 and β > 0, denoted Gamma(α, β), if the pdf of X is

f(x) = f(x; α, β) = (x^{α−1} e^{−x/β})/(β^α Γ(α)) · 1_{(0,∞)}(x).

( ) Use the primary properties of the gamma function to show that f(x; α, β) is a proper density function.
There is no closed form for the CDF F(x; α, β) except for certain values of the parameters.
Gamma(1, β) =ᵈ Exp(β).
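R's dgamma matches this parameterization through its scale argument, and the Gamma(1, β) =ᵈ Exp(β) identity is a one-line check; a sketch with illustrative α = 3, β = 2:

alpha <- 3; beta <- 2; x <- 1.7
x^(alpha - 1) * exp(-x / beta) / (beta^alpha * gamma(alpha))  # pdf above
dgamma(x, shape = alpha, scale = beta)                        # base R agrees
dgamma(x, shape = 1, scale = beta); dexp(x, rate = 1 / beta)  # Gamma(1, beta) = Exp(beta)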


The Gamma Distribution

Another parameterization commonly used is Gamma_R(α, λ), where λ is now a rate parameter. The density becomes

f(x; α, λ) = (λ^α x^{α−1} e^{−λx})/Γ(α) · 1_{(0,∞)}(x).

Gamma_R(1, λ) =ᵈ Exp_R(λ).


The Chi-Squared Distribution

The chi-squared distribution with parameter ν > 0 is a special case of the gamma distribution. The pdf is given by

f(x) = f(x; ν) = (1/(2^{ν/2} Γ(ν/2))) x^{ν/2−1} e^{−x/2} · 1_{(0,∞)}(x).

The parameter ν is called the number of degrees of freedom.
The notation is X ∼ χ²(ν).
χ²(ν) =ᵈ Gamma(α = ν/2, β = 2) =ᵈ Gamma_R(α = ν/2, λ = 1/2).
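The special-case identity can be confirmed directly in R (ν = 5 and x = 2.4 are illustrative):

nu <- 5; x <- 2.4
dchisq(x, df = nu)                       # chi-squared density
dgamma(x, shape = nu / 2, scale = 2)     # Gamma(nu/2, beta = 2): same value
dgamma(x, shape = nu / 2, rate = 1 / 2)  # rate parameterization: same value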


Other Continuous Distributions

See Chapter 4, Section 4.5 in Devore & Berk (2018) for more examples.

Example 2 ( )
Suppose that the continuous random variable X has density

f(x) = c(1 + x)^{−3} · 1_{(0,∞)}(x).

Find the constant c and the CDF F(x). Can you generalize this for

f(x; η) = c(η)(1 + x)^{−η} · 1_{(0,∞)}(x), η > 1,

where c(η) is a constant depending only on η? What happens if η = 1?

Moments, Expectation and Variance

Expectation

The expected or mean value of a continuous random variable X with pdf f is

E(X) = ∫_{−∞}^{∞} x f(x) dx.

This expected value will exist provided that ∫_{−∞}^{∞} |x| f(x) dx < ∞.
We will use the notation E(X) = µ = µ_X, or say that X has mean (expectation) µ.

Example 3
Suppose that X ∼ Exp(β). Then

E(X) = ∫_{−∞}^{∞} x (1/β) e^{−x/β} · 1_{(0,∞)}(x) dx = ∫_{0}^{∞} x (1/β) e^{−x/β} dx = β ∫_{0}^{∞} (x^{2−1} e^{−x/β})/(β² Γ(2)) dx = β,

where the last integral is 1 because its integrand is the Gamma(2, β) density.
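A numerical check of E(X) = β with R's integrate (β = 3 illustrative):

beta <- 3
integrate(function(x) x * exp(-x / beta) / beta, 0, Inf)$value  # ~ 3 = beta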

Expectation
Example 4
Suppose that X ∼ Gamma(α, β). Then

E(X) = ∫_{0}^{∞} x (x^{α−1} e^{−x/β})/(β^α Γ(α)) dx = β (Γ(α + 1)/Γ(α)) ∫_{0}^{∞} (x^{(α+1)−1} e^{−x/β})/(β^{α+1} Γ(α + 1)) dx,

where the last integral is 1 because its integrand is the Gamma(α + 1, β) density. Since Γ(α + 1) = αΓ(α), we obtain E(X) = αβ.

Exercise 1 ( )
Determine E(X) when
1 X ∼ U(a, b).
2 X ∼ Gamma_R(α, λ).
3 X ∼ χ²(ν).

Expectation

Proposition 2
If X is a continuous random variable with pdf f(x) and u(x) is a real-valued function whose domain includes the range of X, then E[u(X)] = ∫_{−∞}^{∞} u(x) f(x) dx, provided that ∫_{−∞}^{∞} |u(x)| f(x) dx < ∞.

Linearity
When u(x) = a + bx, we have

E(u(X)) = E(a + bX) = ∫_{−∞}^{∞} (a + bx) f(x) dx = a ∫_{−∞}^{∞} f(x) dx + b ∫_{−∞}^{∞} x f(x) dx = a + bE(X).

Suppose that g(x) and h(x) are real-valued functions such that E[g(X)] and E[h(X)] exist, and a, b ∈ ℝ. Then

E[a g(X) + b h(X)] = aE[g(X)] + bE[h(X)].



Variance/Standard Deviation

The variance of a random variable X with finite mean µ is

σ² = σ²_X = V(X) = E[(X − µ)²] = E[h(X)], with h(X) = (X − µ)²,

provided this expectation exists. The standard deviation of X is σ = σ_X = √V(X).

Computational formula for the variance

σ² = V(X) = E[(X − µ)²] = E[X² − 2µX + µ²] = E(X²) − 2µE(X) + µ² = E(X²) − 2µ² + µ² = E(X²) − µ².
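For instance, for X ∼ Exp(β) the formula gives V(X) = E(X²) − µ² = 2β² − β² = β² (using E(X^k) = β^k k!, derived in Example 5 below); a numerical sketch with illustrative β = 3:

beta <- 3
EX  <- integrate(function(x) x   * exp(-x / beta) / beta, 0, Inf)$value
EX2 <- integrate(function(x) x^2 * exp(-x / beta) / beta, 0, Inf)$value
EX2 - EX^2   # ~ 9
beta^2       # = 9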

Exercise 2 ( )
Find E(X²) and V(X) when 1. X ∼ Gamma(α, β) and 2. X ∼ χ²(ν).

Higher Moments

Definition 4 (Higher moments)

For a random variable X, provided they exist, the kth moment about the origin, or simply the kth moment, is defined as

µ′_k := E(X^k),

and the kth moment about the mean, or kth central moment, is defined as

µ_k := E[(X − E(X))^k] = E[(X − µ)^k].

E(X) = µ = µ′_1.
V(X) = σ² = µ_2.


Higher Moments

Example 5
Suppose that X ∼ Gamma(α, β). Then

E(X^k) = ∫_{0}^{∞} x^k (x^{α−1} e^{−x/β})/(β^α Γ(α)) dx = β^k (Γ(α + k)/Γ(α)) ∫_{0}^{∞} (x^{(α+k)−1} e^{−x/β})/(β^{α+k} Γ(α + k)) dx = β^k Γ(α + k)/Γ(α),

where the last integral is 1 because its integrand is the Gamma(α + k, β) density.

Remark: If Y ∼ Gamma(1, β) =ᵈ Exp(β), we have E(Y^k) = β^k k!.

Joint distributions

Joint distribution

Definition 5
Let X and Y be two continuous random variables defined on the same sample space Ω. Then f_{X,Y}(x, y), or simply f(x, y), is the joint probability density function for X and Y if, for A ⊆ ℝ²,

P[(X, Y) ∈ A] = ∬_A f(x, y) dx dy.

If A = {(x, y) : a ≤ x ≤ b, c ≤ y ≤ d} is a two-dimensional rectangle, then

P[(X, Y) ∈ A] = P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_{c}^{d} ∫_{a}^{b} f(x, y) dx dy = ∫_{a}^{b} ∫_{c}^{d} f(x, y) dy dx.


Joint distribution

Definition 6
If X = (X1, . . . , Xn)′ are defined on the same sample space Ω, the joint density of X is denoted by

f_X(x) = f_{X1,...,Xn}(x1, . . . , xn).

If A ⊆ ℝⁿ, then

P(X ∈ A) = ∫···∫_A f(x1, . . . , xn) dx1 · · · dxn.


Expected Values and Covariance

If X and Y are jointly distributed, then the covariance between X and Y is defined as

Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y),

provided these expectations exist.

If X = (X1, . . . , Xn)′ are jointly distributed random variables, the mean vector is defined as

µ = (µ1, . . . , µn)′, where µi = E(Xi), for i = 1, . . . , n.

The covariance matrix Λ = (λij) is a symmetric n × n matrix with entries

λij = λji = Cov(Xi, Xj) = Cov(Xj, Xi).


Marginal Distributions
Definition 7
The marginal densities of X and Y, denoted by f_X(x) and f_Y(y) respectively, are given by

f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy for −∞ < x < ∞,
f_Y(y) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dx for −∞ < y < ∞.

If X = (X1, . . . , Xn)′ and U = {i1, . . . , ik} ⊂ {1, . . . , n} (and U′ = {1, . . . , n} \ U), then the marginal (joint) distribution of (Xi1, Xi2, . . . , Xik)′ is given by

f_{Xi1,Xi2,...,Xik}(xi1, xi2, . . . , xik) = ∫···∫_{ℝ^{n−k}} f_{X1,...,Xn}(x1, . . . , xn) dx_{U′},

where dx_{U′} represents dxj for all j ∈ U′.

Conditional Distribution

Definition 8
Let X and Y be two continuous random variables with joint density f_{X,Y}(x, y) and marginal X density f_X(x). Then for any x value such that f_X(x) > 0, the conditional density of Y given X = x is

f_{Y|X=x}(y) = f_{X,Y}(x, y)/f_X(x).

This will possibly be a function of x.

With h(x) := f_{Y|X=x}(y), we can consider the random variable h(X) := f_{Y|X}(y) = f_{X,Y}(X, y)/f_X(X).
We have f_{X,Y}(x, y) = f_{Y|X=x}(y) f_X(x).
We can also consider f_{X|Y=y}(x) and extend this to higher dimensions.


Conditional Distribution

If X = (X1, X2, X3, X4)′, we can consider the distribution of (X1, X4) given (X2 = x2, X3 = x3). The density becomes in this case

f_{X1,X4|X2=x2,X3=x3}(x1, x4) = f_{X1,X2,X3,X4}(x1, x2, x3, x4)/f_{X2,X3}(x2, x3)
                             = f_{X1,X2,X3,X4}(x1, x2, x3, x4) / ∫_ℝ ∫_ℝ f_{X1,...,X4}(x1, . . . , x4) dx1 dx4.


Independence

Definition 9
Two random variables X and Y defined on the same sample space Ω are said to be independent if for every pair of x and y values

f_{X,Y}(x, y) = f_X(x) f_Y(y).

For higher dimensions: if X1, . . . , Xn are random variables defined on the same sample space, we say that X1, . . . , Xn are independent if and only if

f_{X1,...,Xn}(x1, . . . , xn) = ∏_{i=1}^{n} f_{Xi}(xi).

Remark: (X1, . . . , Xn) is a random sample from a distribution with CDF F (or pdf f) ⟺ X1, . . . , Xn are independent and identically distributed (iid) with CDF F (or density f).


Independence
If X = (X1, . . . , Xn)′ are iid random variables, each with CDF F and probability mass/density function f, then the joint density (mass) function of (X1, . . . , Xn)′ is given by

f(x) = f(x1, . . . , xn) = ∏_{i=1}^{n} f(xi).

Remark: If the Xi are discrete, then so is f(x). If the Xi are continuous, then so is f(x). It is not always the case that the Xi are all discrete or continuous; there may be a mixture of both.

Events such as {X ≤ x} are determined component-wise:

{X ≤ x} = {X1 ≤ x1, . . . , Xn ≤ xn} = {X1 ≤ x1} ∩ {X2 ≤ x2} ∩ · · · ∩ {Xn ≤ xn}.

(If the Xi are independent) The joint CDF is given by F(x) = F(x1, . . . , xn) = ∏_{i=1}^{n} F(xi).


Example 6 ( )

Suppose that X and Y are jointly distributed with density

f_{X,Y}(x, y) = c e^{−x−y} 1_{(0<y<x<∞)},

where c is a constant. Determine the constant c, the marginal distributions of X and Y, and the conditional distribution of Y | X = x.


Conditional Expectation

Definition 10
Let X and Y be two continuous random variables with conditional probability density function f_{Y|X}(y|x). Then

µ_{Y|X=x} = E(Y|X = x) = ∫_{−∞}^{∞} y f_{Y|X}(y|x) dy.

Example 7 ( )
Use Example 6 to determine E(Y|X = x).


Computation of the Covariance

Example 8

By definition, Cov(X, Y) = E(XY) − E(X)E(Y). Using Example 6 (where c = 2), we have

E(XY) = ∫_{0}^{∞} ∫_{0}^{x} xy · 2e^{−x−y} dy dx = 2 ∫_{0}^{∞} x e^{−x} ( ∫_{0}^{x} y e^{−y} dy ) dx.

The integral inside the parentheses is 1 − e^{−x} − x e^{−x}. Therefore, we have

E(XY) = 2 ∫_{0}^{∞} x e^{−x} (1 − e^{−x} − x e^{−x}) dx
      = 2 ∫_{0}^{∞} x e^{−x} dx − ∫_{0}^{∞} x · 2e^{−2x} dx − ∫_{0}^{∞} x² · 2e^{−2x} dx
      = 2A − B − C.


Computation of the Covariance

Example 8 continued
Now, A = 1 because it is the expectation of a standard Exp(1) random variable. B = 1/2 since it is the expectation of an Exp(1/2) random variable. C is equal to E(U²) = V(U) + E²(U), where U ∼ Exp(1/2). So, C = 1/4 + 1/4 = 1/2. Therefore, E(XY) = 2 − 1/2 − 1/2 = 1.

( ) Verify that E(X) = 3/2 and E(Y) = 1/2. The covariance becomes

Cov(X, Y) = E(XY) − E(X)E(Y) = 1 − (3/2) × (1/2) = 1/4.

Does this result make sense? Why?
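A Monte Carlo sketch confirming Cov(X, Y) ≈ 1/4. It samples from f_{X,Y} via the factorization worked out for this example: Y has marginal density 2e^{−2y} (see the next slides), and given Y = y, X − y is standard exponential:

set.seed(1)
n <- 1e6
y <- rexp(n, rate = 2)      # marginal of Y
x <- y + rexp(n, rate = 1)  # conditional: X | Y = y is y + Exp(1)
cov(x, y)                   # ~ 0.25 = 1/4
mean(x); mean(y)            # ~ 3/2 and ~ 1/2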


Another Example on Conditional Expectation

In Example 6, we find that f_Y(y) = 2e^{−2y} 1_{(0<y<∞)}. We have

f_{X|Y=y}(x) = f_{X,Y}(x, y)/f_Y(y) = 2e^{−x−y}/(2e^{−2y}) = e^{−(x−y)} 1_{(0<y<x<∞)}.

Note that y is fixed here, so, with the substitution u = x − y (with du = dx and x = u + y), we have

E(X|Y = y) = ∫_{y}^{∞} x e^{−(x−y)} dx = ∫_{0}^{∞} (y + u) e^{−u} du = y + 1.

You can also argue this using the lack of memory property of the exponential distribution!


Law of Total Expectation

Suppose that X and Y are jointly distributed (we will assume they are both continuous, but this also works for discrete random variables or a mixture of both types) with pdfs f_X(x) and f_Y(y). Let h(x) = E(Y|X = x) and define the random variable h(X) := E(Y|X). We have

E(E(Y|X)) = E(h(X)) = ∫_ℝ h(x) f_X(x) dx = ∫_ℝ ( ∫_ℝ y f_{Y|X=x}(y) dy ) f_X(x) dx = E(Y).

For jointly distributed random variables X and Y, we have E(Y) = E(E(Y|X)), provided these expectations exist. This is actually a very powerful tool and often simplifies complicated proofs. There is also a Law of Total Variance and a Law of Total Covariance, which we will discuss if and when needed.
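A quick Monte Carlo illustration with the running example, using E(X|Y = y) = y + 1 from the previous slide:

set.seed(2)
n <- 1e6
y <- rexp(n, rate = 2)      # Y from Example 6
mean(y + 1)                 # E(E(X|Y)) = E(Y + 1) ~ 3/2
x <- y + rexp(n, rate = 1)  # simulate X itself
mean(x)                     # E(X) ~ 3/2: the two agree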
Transformation theorem

Let X be an n-dimensional continuous random variable with density f_X(x), and suppose that ∫_S f_X(x) dx = 1, where S ⊂ ℝⁿ.

Let g(x) = (g1(x), g2(x), . . . , gn(x)) be a bijection from S to some T ⊂ ℝⁿ, with unique inverse mapping g⁻¹.

Assume that both g and g⁻¹ are continuously differentiable.

Let Y = g(X), i.e. Y1 = g1(X1, . . . , Xn), . . . , Yn = gn(X1, . . . , Xn).

Let h(y) = (h1(y), . . . , hn(y)) be the unique inverse of g, so that X1 = h1(Y1, . . . , Yn), . . . , Xn = hn(Y1, . . . , Yn).


Theorem 11
The (joint) density of Y is

f_Y(y) = f_X(h1(y), h2(y), . . . , hn(y)) |J| · I{y ∈ T},

where J is the Jacobian determinant

    | ∂x1/∂y1  ∂x1/∂y2  ∂x1/∂y3  . . .  ∂x1/∂yn |
    | ∂x2/∂y1  ∂x2/∂y2  ∂x2/∂y3  . . .  ∂x2/∂yn |
J = |    .        .        .     . . .     .    |
    | ∂xn/∂y1  ∂xn/∂y2  ∂xn/∂y3  . . .  ∂xn/∂yn |

and ∂xk/∂yi = ∂hk(y)/∂yi.

Proof. ( ) For any B ⊆ ℝⁿ, let

h(B) = g⁻¹(B) = {x : g(x) ∈ B}.

Lemma to complete the proof of Theorem 11

The following lemma, which we state without proof, completes the proof of Theorem 11.

Lemma 1
Let Z be an n-dimensional continuous random variable. If, for every B ⊂ ℝⁿ,

P(Z ∈ B) = ∫_B f_Z(x) dx,

then f_Z(x) is a density of Z.


Comment on Theorem 11

The joint density of Y can also be written as

f_Y(y) = f_X(h1(y), h2(y), . . . , hn(y)) / |J_y| · I{y ∈ T},

where J_y is the Jacobian determinant of the forward mapping g, evaluated at x = h(y):

      | ∂y1/∂x1  ∂y1/∂x2  ∂y1/∂x3  . . .  ∂y1/∂xn |
      | ∂y2/∂x1  ∂y2/∂x2  ∂y2/∂x3  . . .  ∂y2/∂xn |
J_y = |    .        .        .     . . .     .    |
      | ∂yn/∂x1  ∂yn/∂x2  ∂yn/∂x3  . . .  ∂yn/∂xn |

and ∂yk/∂xi = ∂gk(x)/∂xi.


Example 9
Suppose that X ∼ Gamma(α, β). Find the distribution of Y = 2X/β. We have

f_X(x) = (x^{α−1} e^{−x/β})/(β^α Γ(α)) · 1_{(0,∞)}(x).

Here, y = g(x) = 2x/β, which implies that x = h(y) = βy/2. The Jacobian is |dx/dy| = |β/2| = β/2. Then,

f_Y(y) = f_X(βy/2) · (β/2) = ((βy/2)^{α−1} e^{−(βy/2)/β})/(β^α Γ(α)) · (β/2) · 1_{(0,∞)}(y) = (y^{α−1} e^{−y/2})/(2^α Γ(α)) · 1_{(0,∞)}(y).

Thus, Y ∼ Gamma(α, 2) =ᵈ χ²(2α).
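A simulation sketch of the conclusion, transforming Gamma(α, β) draws and comparing them to the χ²(2α) CDF (α = 3 and β = 5 are illustrative):

set.seed(3)
alpha <- 3; beta <- 5
x <- rgamma(1e4, shape = alpha, scale = beta)
y <- 2 * x / beta                      # the transformation of Example 9
ks.test(y, "pchisq", df = 2 * alpha)   # large p-value: consistent with chi^2(2 alpha)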


Example 10

Let X and Y be independent with X ∼ Gamma(α1, β) and Y ∼ Gamma(α2, β). Let U = X + Y and V = X/(X + Y), with U ∼ Gamma(α1 + α2, β) and V ∼ Beta(α1, α2). We wish to show that U and V are independent. Here, u = g1(x, y) = x + y and v = g2(x, y) = x/(x + y) yield x = h1(u, v) = uv and y = h2(u, v) = u(1 − v). The Jacobian is

    | ∂x/∂u  ∂x/∂v |   |   v      u  |
J = | ∂y/∂u  ∂y/∂v | = | 1 − v   −u  | = −vu − u(1 − v) = −u.

The joint density of U and V is

f_{U,V}(u, v) = f_{X,Y}(uv, u(1 − v)) |J|
              = ((uv)^{α1−1} e^{−uv/β})/(β^{α1} Γ(α1)) × ((u(1 − v))^{α2−1} e^{−u(1−v)/β})/(β^{α2} Γ(α2)) × |−u|.


Example 10 continued
We can rewrite the joint density of U and V as follows:

f_{U,V}(u, v) = (u^{α1+α2−1} e^{−u/β})/(β^{α1+α2} Γ(α1 + α2)) × (Γ(α1 + α2)/(Γ(α1)Γ(α2))) v^{α1−1} (1 − v)^{α2−1},

where 0 < u < ∞ and 0 < v < 1. The first factor is the Gamma(α1 + α2, β) density and the second is the Beta(α1, α2) density. Therefore,

f_{U,V}(u, v) = f_U(u) × f_V(v),

which implies that the random variables U and V are independent.
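A short simulation sketch (α1 = 2, α2 = 5, β = 1 illustrative): the marginals match the claimed Gamma and Beta distributions, and U and V are uncorrelated, as independence predicts:

set.seed(4)
a1 <- 2; a2 <- 5; beta <- 1
x <- rgamma(1e4, shape = a1, scale = beta)
y <- rgamma(1e4, shape = a2, scale = beta)
u <- x + y; v <- x / (x + y)
ks.test(u, "pgamma", shape = a1 + a2, scale = beta)  # U ~ Gamma(a1 + a2, beta)
ks.test(v, "pbeta", shape1 = a1, shape2 = a2)        # V ~ Beta(a1, a2)
cor(u, v)                                            # ~ 0, consistent with independence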


Remark (from Examples 9 and 10): If X1, . . . , Xn are independent with Xi ∼ Gamma(αi, β), then
1 ∑_{i=1}^{n} Xi ∼ Gamma(∑_{i=1}^{n} αi, β).
2 2Xi/β ∼ χ²(2αi) and 2 ∑_{i=1}^{n} Xi/β ∼ χ²(2 ∑_{i=1}^{n} αi).

Exercise 3 ( )
1 Suppose that X1, . . . , Xn are iid Exp(β). What is the distribution of 2nX̄n/β?
2 Let m, n ∈ ℕ such that m < n and suppose that X1, . . . , Xn are iid Exp(β). What is the distribution of (∑_{i=1}^{m} Xi)/(∑_{i=1}^{n} Xi)?
3 Let X1, . . . , Xn be independent with Xi ∼ χ²(νi) for i = 1, . . . , n. Argue that ∑_{i=1}^{n} Xi ∼ χ²(∑_{i=1}^{n} νi).


Many-to-one transformations

The mapping g is many-to-1.

Find a partition S1, . . . , Sm of S (i.e. S = ∪_k Sk and Sk ∩ Si = ∅ for all k ≠ i) such that g : Sk → T is 1-1 on each Sk with Jacobian Jk; then

f_Y(y) = ∑_{k=1}^{m} f_X(h1k(y), . . . , hnk(y)) |Jk|,

where hk(y) = (h1k(y), . . . , hnk(y)) is the unique inverse of the mapping g : Sk → T on each Sk.


Many-to-one transformations

Example 11
Let Z ∼ N(0, 1) and let Y = Z². Find the distribution of Y. The function g(z) = z² is not one-to-one, since z and −z map to the same y, but g(z) is one-to-one and onto T = (0, ∞) on each of S1 = (−∞, 0) and S2 = (0, ∞), with ℝ = S1 ∪ S2 (up to the single point 0, which can be ignored). That is, g : (−∞, 0) → (0, ∞) and g : (0, ∞) → (0, ∞) are each one-to-one and onto (0, ∞). On S1, we have z = h11(y) = −√y and J1 = dz/dy = −1/(2√y); and on S2, we have z = h12(y) = √y and J2 = dz/dy = 1/(2√y). Then, the density of Y is

f_Y(y) = f_Z(−√y)|J1| + f_Z(√y)|J2|.

Note that f_Z(z) = f_Z(−z) and |J1| = |J2| = 1/(2√y). The density of Y (so Y ∼ χ²(1)) becomes

f_Y(y) = 2 f_Z(√y)|J1| = (1/√(2πy)) e^{−y/2} = (y^{1/2−1} e^{−y/2})/(2^{1/2} Γ(1/2)) · 1_{(0,∞)}(y).
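A simulation sketch of Example 11's conclusion, plus a spot check of the density formula at an arbitrary point y = 0.8:

set.seed(5)
z <- rnorm(1e4)
ks.test(z^2, "pchisq", df = 1)       # Z^2 consistent with chi^2(1)
y <- 0.8
2 * dnorm(sqrt(y)) / (2 * sqrt(y))   # 2 f_Z(sqrt(y)) |J1|
dchisq(y, df = 1)                    # same value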

Some useful facts

1 If X1, . . . , Xn are independent standard normal random variables, then ∑_{i=1}^{n} Xi ∼ N(0, n) and ∑_{i=1}^{n} Xi/√n ∼ N(0, 1). More generally, if X1, . . . , Xn are independent with Xi ∼ N(µi, σi²), then ∑_{i=1}^{n} Xi ∼ N(∑_{i=1}^{n} µi, ∑_{i=1}^{n} σi²) and ∑_{i=1}^{n} (Xi − µi)/σi ∼ N(0, n).
2 If Z ∼ N(0, 1), then Z² ∼ χ²(1).
3 If X1, . . . , Xn are independent with Xi ∼ Gamma(αi, β), then ∑_{i=1}^{n} Xi ∼ Gamma(∑_{i=1}^{n} αi, β).
4 If Z1, . . . , Zn are independent standard normal random variables, then ∑_{i=1}^{n} Zi² ∼ χ²(n).
5 If Z ∼ N(0, 1) and V ∼ χ²(n) are independent, then T = Z/√(V/n) ∼ t(n), where t(n) is the Student's t distribution with n degrees of freedom (see the sketch after this list).
6 If X ∼ χ²(n) is independent of Y ∼ χ²(m), then F = (X/n)/(Y/m) ∼ F(n, m), where F(n, m) is Snedecor's F distribution with n and m degrees of freedom.
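A simulation sketch of fact 5 (n = 7 illustrative):

set.seed(6)
n <- 7
z <- rnorm(1e4)
v <- rchisq(1e4, df = n)
t_stat <- z / sqrt(v / n)        # the construction in fact 5
ks.test(t_stat, "pt", df = n)    # consistent with Student's t(n)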
