
2. Continuous Random Variables

Ismaïla Ba

ismaila.ba@umanitoba.ca
STAT 3100 - Winter 2024

Course Outline

1 Introduction

2 Continuous random variables

3 Common continuous distributions

4 Moments, Expectation and Variance

5 Joint distributions

6 Transformation theorem

7 Some useful facts

Introduction

Discrete random variables take their values on countable sets.

Continuous random variables take their values on the "continuum" ℝ.

Probabilities such as P(X = x) are not appropriate for continuous random variables since ∑_{x∈ℝ} P(X = x) really has no meaning.

We need to move away from summations and turn to integration.

Continuous random variables

Definition 1
A random variable X with CDF F(x) is said to be a continuous random variable if there exists a real-valued function f such that

F(x) := P(X ≤ x) = ∫_{−∞}^{x} f(t) dt and f(t) ≥ 0 ∀t ∈ ℝ.

The function f is called the probability density function (pdf) of X, or simply the density of X.

Theorem 2
A function f(x) is a pdf of a continuous random variable X if and only if it satisfies the following conditions:
1 f(x) ≥ 0 for all x ∈ ℝ.
2 ∫_{−∞}^{∞} f(x) dx = 1.


Properties of the CDF F(x)

The CDF F(x) has the following properties:

1 lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1.
2 F(x) is a nondecreasing function of x; that is, for y ≤ x, F(y) ≤ F(x).
3 F(x) is right-continuous; that is, for every number x0, lim_{x↓x0} F(x) := lim_{x→x0, x>x0} F(x) = F(x0).

Example 1
Consider the following CDF: F(x) = 1/(1 + e^{−x}). F(x) satisfies:
lim_{x→−∞} F(x) = 0 since lim_{x→−∞} e^{−x} = ∞, and lim_{x→∞} F(x) = 1 since lim_{x→∞} e^{−x} = 0.
(d/dx) F(x) = e^{−x}/(1 + e^{−x})² > 0, so F(x) is increasing.
F(x) is not only right-continuous, but also continuous.
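This F(x) is the standard logistic CDF, which base R exposes as plogis (with density dlogis), so the claims above are easy to confirm numerically; a small sketch:

# Example 1 numerically: F(x) = 1/(1 + exp(-x)) is the standard logistic CDF
F <- function(x) 1 / (1 + exp(-x))
F(-50); F(50)                        # ~ 0 and ~ 1: the two tail limits
all.equal(F(1.3), plogis(1.3))       # TRUE: matches base R's logistic CDF
h <- 1e-6                            # central-difference derivative at x = 1.3
(F(1.3 + h) - F(1.3 - h)) / (2 * h)
dlogis(1.3)                          # closed form e^{-x}/(1 + e^{-x})^2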


Fundamental Theorem of Calculus

The Fundamental Theorem of Calculus tells us that the pdf f(x) can be obtained by differentiating F(x), so that

f(x) = (d/dx) F(x) = F′(x),

at any point x where F(x) is differentiable. F(x) may not be differentiable at all points x ∈ ℝ, but these points can be safely ignored.

For a continuous random variable X with CDF F and pdf f, we have

P(X = x) = ∫_{x}^{x} f(t) dt = 0, ∀x ∈ ℝ.

That is, the area under a curve at a single point is 0 (this is true even when f(x) > 0!). This has an important practical consequence:

P(a < X ≤ b) = P(a ≤ X ≤ b) = P(a < X < b) = P(a ≤ X < b) = ∫_{a}^{b} f(x) dx.
Common continuous distributions

The Standard Uniform Distribution

The standard uniform distribution is an equal probability model on the interval Ω = (0, 1) (or Ω = [0, 1], or Ω = (0, 1], or Ω = [0, 1)). The pdf is given by f(x) = 1_{(0,1)}(x). The CDF comes in 3 parts:

for x ≤ 0:      F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{x} 1_{(0,1)}(t) dt = ∫_{−∞}^{x} 0 dt = 0
for x ∈ (0, 1): F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{x} 1_{(0,1)}(t) dt = ∫_{0}^{x} 1 dt = x
for x ≥ 1:      F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{x} 1_{(0,1)}(t) dt = ∫_{0}^{1} 1 dt + ∫_{1}^{x} 0 dt = 1

A note on notation: For a random variable X with the standard uniform distribution, we write X ∼ U(0, 1), where we read the symbol "∼" as "is distributed as". We can similarly write X ∼ f(x) or X ∼ F(x), which is shorthand for "X has pdf/pmf f" or "X has CDF F", respectively.
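These three CDF pieces match base R's built-in uniform functions punif and dunif; a quick sketch:

# The three pieces of the U(0,1) CDF, checked against base R
punif(-0.5)              # x <= 0 region: 0
punif(0.3)               # x in (0,1): equals x itself, 0.3
punif(1.7)               # x >= 1 region: 1
dunif(0.3); dunif(1.7)   # pdf is 1 on (0,1) and 0 outside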

The General Uniform Distribution

The pdf of a general uniform distribution on an interval Ω = (a, b), with a < b, is given by

f(x) = f(x; a, b) = c · 1_{(a,b)}(x),

where c > 0 is a constant. The notation f(x; a, b) is a reminder that the pdf depends on the parameters a and b, and that the constant c does also. The CDF F(x) = F(x; a, b) comes in three parts, x ≤ a, x ∈ (a, b) and x ≥ b, with c = 1/(b − a). Here, we write X ∼ U(a, b).


The Normal Distribution

The standard normal distribution, denoted N(0, 1), has pdf

f(x) = (1/√(2π)) e^{−x²/2}, −∞ < x < ∞.

Remark: The standard normal random variable will be denoted by Z.

The normal distribution with mean µ and variance σ² > 0, denoted N(µ, σ²), has pdf

f(x) = f(x; µ, σ²) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)}, −∞ < x < ∞.

The CDF of the normal distribution has no closed form. Tables or computer software (R, for instance) can be used to calculate P(X ≤ x) when X ∼ N(µ, σ²).

The Normal Distribution

Note that if X ∼ N(µ, σ²), then

Z = (X − µ)/σ ∼ N(0, 1),

so only tabulated probabilities for N(0, 1) random variables are required.

If X ∼ N(µ, σ²), then X =ᵈ µ + σZ ∼ N(µ, σ²) (where =ᵈ denotes equality in distribution).

More generally, if X ∼ N(µ, σ²) and a, b ∈ ℝ, then

Y = a + bX ∼ N(a + bµ, b²σ²).

Remark: The normal distribution is the most important one in all of probability and statistics. The CDF of the standard normal distribution is commonly denoted by Φ(z).
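Standardization is exactly how normal probabilities are computed in software; a sketch in R (pnorm is Φ; the values µ = 5, σ = 2, x = 7 are illustrative only):

# P(X <= 7) for X ~ N(5, 4), three equivalent ways
mu <- 5; sigma <- 2
pnorm(7, mean = mu, sd = sigma)   # direct
pnorm((7 - mu) / sigma)           # standardized: P(Z <= (x - mu)/sigma)
pnorm(1)                          # same number: Phi(1)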

The Exponential Distribution

Let β > 0 and consider a continuous random variable X with pdf

f(x) = f(x; β) = (1/β) e^{−x/β} · 1_{(0,∞)}(x).

The indicator function tells us that the random variable X is positive (or at least non-negative).
The constant β is called the scale parameter and in this case, we write X ∼ Exp(β).
( ) Check that f(x) is a proper density.

The CDF of X is given by

F(x; β) = P(X ≤ x) = ∫_{0}^{x} f(t) dt = ∫_{0}^{x} (1/β) e^{−t/β} dt = ∫_{0}^{x/β} e^{−u} du = 1 − e^{−x/β}.
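Base R's pexp implements this CDF, but parameterized by the rate λ = 1/β introduced on the next slide; a hedged sketch with an illustrative β = 3:

beta <- 3; x <- 2             # illustrative values
1 - exp(-x / beta)            # closed-form CDF from above
pexp(x, rate = 1 / beta)      # base R uses the rate parameterization
# the density also integrates to 1, as a proper density must:
integrate(function(t) exp(-t / beta) / beta, 0, Inf)$value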

The Exponential Distribution


( ) Check that, as x → 0, F(x) → 0 and that, as x → ∞, F(x) → 1.

Another commonly used parameterization of the exponential distribution uses λ = 1/β, so that f(x; λ) = λ e^{−λx} · 1_{(0,∞)}(x).
This parameterization has its roots in stochastic processes.
λ is called the rate parameter and in this case, we will use the notation X ∼ Exp_R(λ).
In statistical inference, scale parameters are usually of greater interest, so we will normally use the Exp(β) parameterization.

Proposition 1 (Lack of memory property ( ))

Let X ∼ Exp(β) (or Exp_R(λ)). Then, for s, t ≥ 0,

P(X > s + t | X > s) = P(X > t).
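The property is easy to confirm numerically; a sketch (the values λ = 0.5, s = 1, t = 2 are arbitrary illustrative choices):

# Lack of memory: P(X > s + t | X > s) equals P(X > t)
lambda <- 0.5; s <- 1; t <- 2
surv <- function(x) 1 - pexp(x, rate = lambda)   # survival function P(X > x)
surv(s + t) / surv(s)    # conditional survival probability
surv(t)                  # same number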



The Gamma Distribution

Definition 3
For α > 0, the gamma function Γ(α) is defined by

Γ(α) = ∫_{0}^{∞} x^{α−1} e^{−x} dx.

Remark: The gamma function is analytic over the entire complex plane except at the non-positive integers, but that will not be of interest in this course; only that it is real-valued and continuous for real α ∈ (0, ∞).

(Primary) properties of the gamma function

1 For any α > 1, Γ(α) = (α − 1)Γ(α − 1).
2 For any positive integer n, Γ(n) = (n − 1)!.
3 Γ(1/2) = √π.
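Base R exposes the gamma function as gamma(); a quick check of all three properties (α = 5.5 is an arbitrary test point):

alpha <- 5.5
gamma(alpha); (alpha - 1) * gamma(alpha - 1)   # property 1: equal
gamma(6); factorial(5)                         # property 2: Gamma(n) = (n-1)!
gamma(0.5); sqrt(pi)                           # property 3: Gamma(1/2) = sqrt(pi)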


The Gamma Distribution

A continuous random variable X is said to have a gamma distribution with parameters α > 0 and β > 0, denoted Gamma(α, β), if the pdf of X is

f(x) = f(x; α, β) = (x^{α−1} e^{−x/β})/(β^α Γ(α)) · 1_{(0,∞)}(x).

( ) Use the primary properties of the gamma function to show that f(x; α, β) is a proper density function.
There is no closed form for the CDF F(x; α, β) except for certain values of the parameters.
Gamma(1, β) =ᵈ Exp(β).
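R's dgamma matches this parameterization through its scale argument, and the Gamma(1, β) =ᵈ Exp(β) identity is a one-line check; a sketch with illustrative α = 3, β = 2:

alpha <- 3; beta <- 2; x <- 1.7
x^(alpha - 1) * exp(-x / beta) / (beta^alpha * gamma(alpha))  # pdf above
dgamma(x, shape = alpha, scale = beta)                        # base R agrees
dgamma(x, shape = 1, scale = beta); dexp(x, rate = 1 / beta)  # Gamma(1, beta) = Exp(beta)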


The Gamma Distribution

Another parameterization commonly used is Gamma_R(α, λ), where λ is now a rate parameter. The density becomes

f(x; α, λ) = (λ^α x^{α−1} e^{−λx})/Γ(α) · 1_{(0,∞)}(x).

Gamma_R(1, λ) =ᵈ Exp_R(λ).


The Chi-Squared Distribution

The chi-squared distribution with parameter ν > 0 is a special case of the gamma distribution. The pdf is given by

f(x) = f(x; ν) = (1/(2^{ν/2} Γ(ν/2))) x^{ν/2−1} e^{−x/2} · 1_{(0,∞)}(x).

The parameter ν is called the number of degrees of freedom.
The notation is X ∼ χ²(ν).
χ²(ν) =ᵈ Gamma(α = ν/2, β = 2) =ᵈ Gamma_R(α = ν/2, λ = 1/2).
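The special-case identity can be confirmed directly in R (ν = 5 and x = 2.4 are illustrative):

nu <- 5; x <- 2.4
dchisq(x, df = nu)                       # chi-squared density
dgamma(x, shape = nu / 2, scale = 2)     # Gamma(nu/2, beta = 2): same value
dgamma(x, shape = nu / 2, rate = 1 / 2)  # rate parameterization: same value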


Other Continuous Distributions

See Chapter 4, Section 4.5 in Devore & Berk (2018) for more examples.

Example 2 ( )
Suppose that the continuous random variable X has density

f(x) = c(1 + x)^{−3} · 1_{(0,∞)}(x).

Find the constant c and the CDF F(x). Can you generalize this for

f(x; η) = c(η)(1 + x)^{−η} · 1_{(0,∞)}(x), η > 1,

where c(η) is a constant depending only on η? What happens if η = 1?

Moments, Expectation and Variance

Expectation

The expected or mean value of a continuous random variable X with pdf f is

E(X) = ∫_{−∞}^{∞} x f(x) dx.

This expected value will exist provided that ∫_{−∞}^{∞} |x| f(x) dx < ∞.
We will use the notation E(X) = µ = µ_X, or say that X has mean (expectation) µ.

Example 3
Suppose that X ∼ Exp(β). Then

E(X) = ∫_{−∞}^{∞} x (1/β) e^{−x/β} · 1_{(0,∞)}(x) dx = ∫_{0}^{∞} x (1/β) e^{−x/β} dx = β ∫_{0}^{∞} (x^{2−1} e^{−x/β})/(β² Γ(2)) dx = β,

where the last integral is 1 because its integrand is the Gamma(2, β) density.
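A numerical check of E(X) = β with R's integrate (β = 3 illustrative):

beta <- 3
integrate(function(x) x * exp(-x / beta) / beta, 0, Inf)$value  # ~ 3 = beta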

Expectation
Example 4
Suppose that X ∼ Gamma(α, β). Then

E(X) = ∫_{0}^{∞} x (x^{α−1} e^{−x/β})/(β^α Γ(α)) dx = β (Γ(α + 1)/Γ(α)) ∫_{0}^{∞} (x^{(α+1)−1} e^{−x/β})/(β^{α+1} Γ(α + 1)) dx,

where the last integral is 1 because its integrand is the Gamma(α + 1, β) density. Since Γ(α + 1) = αΓ(α), we obtain E(X) = αβ.

Exercise 1 ( )
Determine E(X) when
1 X ∼ U(a, b).
2 X ∼ Gamma_R(α, λ).
3 X ∼ χ²(ν).

Expectation

Proposition 2
If X is a continuous random variable with pdf f(x) and u(x) is a real-valued function whose domain includes the range of X, then E[u(X)] = ∫_{−∞}^{∞} u(x) f(x) dx, provided that ∫_{−∞}^{∞} |u(x)| f(x) dx < ∞.

Linearity
When u(x) = a + bx, we have

E(u(X)) = E(a + bX) = ∫_{−∞}^{∞} (a + bx) f(x) dx = a ∫_{−∞}^{∞} f(x) dx + b ∫_{−∞}^{∞} x f(x) dx = a + bE(X).

Suppose that g(x) and h(x) are real-valued functions such that E[g(X)] and E[h(X)] exist, and a, b ∈ ℝ. Then

E[a g(X) + b h(X)] = aE[g(X)] + bE[h(X)].



Variance/Standard Deviation

The variance of a random variable X with finite mean µ is

σ² = σ²_X = V(X) = E[(X − µ)²] = E[h(X)], with h(X) = (X − µ)²,

provided this expectation exists. The standard deviation of X is σ = σ_X = √V(X).

Computational formula for the variance

σ² = V(X) = E[(X − µ)²] = E[X² − 2µX + µ²] = E(X²) − 2µE(X) + µ² = E(X²) − 2µ² + µ² = E(X²) − µ².
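For instance, for X ∼ Exp(β) the formula gives V(X) = E(X²) − µ² = 2β² − β² = β² (using E(X^k) = β^k k!, derived in Example 5 below); a numerical sketch with illustrative β = 3:

beta <- 3
EX  <- integrate(function(x) x   * exp(-x / beta) / beta, 0, Inf)$value
EX2 <- integrate(function(x) x^2 * exp(-x / beta) / beta, 0, Inf)$value
EX2 - EX^2   # ~ 9
beta^2       # = 9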

Exercise 2 ( )
Find E(X²) and V(X) when 1. X ∼ Gamma(α, β) and 2. X ∼ χ²(ν).

Higher Moments

Definition 4 (Higher moments)

For a random variable X, provided they exist, the kth moment about the origin, or simply the kth moment, is defined as

µ′_k := E(X^k),

and the kth moment about the mean, or kth central moment, is defined as

µ_k := E[(X − E(X))^k] = E[(X − µ)^k].

E(X) = µ = µ′_1.
V(X) = σ² = µ_2.


Higher Moments

Example 5
Suppose that X ∼ Gamma(α, β). Then

E(X^k) = ∫_{0}^{∞} x^k (x^{α−1} e^{−x/β})/(β^α Γ(α)) dx = β^k (Γ(α + k)/Γ(α)) ∫_{0}^{∞} (x^{(α+k)−1} e^{−x/β})/(β^{α+k} Γ(α + k)) dx = β^k Γ(α + k)/Γ(α),

where the last integral is 1 because its integrand is the Gamma(α + k, β) density.

Remark: If Y ∼ Gamma(1, β) =ᵈ Exp(β), we have E(Y^k) = β^k k!.

Joint distributions

Joint distribution

Definition 5
Let X and Y be two continuous random variables defined on the same sample space Ω. Then f_{X,Y}(x, y), or simply f(x, y), is the joint probability density function for X and Y if, for A ⊆ ℝ²,

P[(X, Y) ∈ A] = ∬_A f(x, y) dx dy.

If A = {(x, y) : a ≤ x ≤ b, c ≤ y ≤ d} is a two-dimensional rectangle, then

P[(X, Y) ∈ A] = P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_{c}^{d} ∫_{a}^{b} f(x, y) dx dy = ∫_{a}^{b} ∫_{c}^{d} f(x, y) dy dx.


Joint distribution

Definition 6
If X = (X1, . . . , Xn)′ are defined on the same sample space Ω, the joint density of X is denoted by

f_X(x) = f_{X1,...,Xn}(x1, . . . , xn).

If A ⊆ ℝⁿ, then

P(X ∈ A) = ∫···∫_A f(x1, . . . , xn) dx1 · · · dxn.


Expected Values and Covariance

If X and Y are jointly distributed, then the covariance between X and Y is defined as

Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y),

provided these expectations exist.

If X = (X1, . . . , Xn)′ are jointly distributed random variables, the mean vector is defined as

µ = (µ1, . . . , µn)′, where µi = E(Xi), for i = 1, . . . , n.

The covariance matrix Λ = (λij) is a symmetric n × n matrix with entries

λij = λji = Cov(Xi, Xj) = Cov(Xj, Xi).


Marginal Distributions
Definition 7
The marginal densities of X and Y, denoted by f_X(x) and f_Y(y) respectively, are given by

f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy for −∞ < x < ∞,
f_Y(y) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dx for −∞ < y < ∞.

If X = (X1, . . . , Xn)′ and U = {i1, . . . , ik} ⊂ {1, . . . , n} (and U′ = {1, . . . , n} \ U), then the marginal (joint) distribution of (Xi1, Xi2, . . . , Xik)′ is given by

f_{Xi1,Xi2,...,Xik}(xi1, xi2, . . . , xik) = ∫···∫_{ℝ^{n−k}} f_{X1,...,Xn}(x1, . . . , xn) dx_{U′},

where dx_{U′} represents dxj for all j ∈ U′.

Conditional Distribution

Definition 8
Let X and Y be two continuous random variables with joint density f_{X,Y}(x, y) and marginal X density f_X(x). Then for any x value such that f_X(x) > 0, the conditional density of Y given X = x is

f_{Y|X=x}(y) = f_{X,Y}(x, y)/f_X(x).

This will possibly be a function of x.

With h(x) := f_{Y|X=x}(y), we can consider the random variable h(X) := f_{Y|X}(y) = f_{X,Y}(X, y)/f_X(X).
We have f_{X,Y}(x, y) = f_{Y|X=x}(y) f_X(x).
We can also consider f_{X|Y=y}(x) and extend this to higher dimensions.


Conditional Distribution

If X = (X1, X2, X3, X4)′, we can consider the distribution of (X1, X4) given (X2 = x2, X3 = x3). The density becomes in this case

f_{X1,X4|X2=x2,X3=x3}(x1, x4) = f_{X1,X2,X3,X4}(x1, x2, x3, x4)/f_{X2,X3}(x2, x3)
                             = f_{X1,X2,X3,X4}(x1, x2, x3, x4) / ∫_ℝ ∫_ℝ f_{X1,...,X4}(x1, . . . , x4) dx1 dx4.


Independence

Definition 9
Two random variables X and Y defined on the same sample space Ω are said to be independent if for every pair of x and y values

f_{X,Y}(x, y) = f_X(x) f_Y(y).

For higher dimensions: if X1, . . . , Xn are random variables defined on the same sample space, we say that X1, . . . , Xn are independent if and only if

f_{X1,...,Xn}(x1, . . . , xn) = ∏_{i=1}^{n} f_{Xi}(xi).

Remark: (X1, . . . , Xn) is a random sample from a distribution with CDF F (or pdf f) ⟺ X1, . . . , Xn are independent and identically distributed (iid) with CDF F (or density f).


Independence
If X = (X1, . . . , Xn)′ are iid random variables, each with CDF F and probability mass/density function f, then the joint density (mass) function of (X1, . . . , Xn)′ is given by

f(x) = f(x1, . . . , xn) = ∏_{i=1}^{n} f(xi).

Remark: If the Xi are discrete, then so is f(x). If the Xi are continuous, then so is f(x). It is not always the case that the Xi are all discrete or continuous; there may be a mixture of both.

Events such as {X ≤ x} are determined component-wise:

{X ≤ x} = {X1 ≤ x1, . . . , Xn ≤ xn} = {X1 ≤ x1} ∩ {X2 ≤ x2} ∩ · · · ∩ {Xn ≤ xn}.

(If the Xi are independent) The joint CDF is given by F(x) = F(x1, . . . , xn) = ∏_{i=1}^{n} F(xi).


Example 6 ( )

Suppose that X and Y are jointly distributed with density

f_{X,Y}(x, y) = c e^{−x−y} 1_{(0<y<x<∞)},

where c is a constant. Determine the constant c, the marginal distributions of X and Y, and the conditional distribution of Y | X = x.


Conditional Expectation

Definition 10
Let X and Y be two continuous random variables with conditional probability density function f_{Y|X}(y|x). Then

µ_{Y|X=x} = E(Y|X = x) = ∫_{−∞}^{∞} y f_{Y|X}(y|x) dy.

Example 7 ( )
Use Example 6 to determine E(Y|X = x).


Computation of the Covariance

Example 8

By definition, Cov(X, Y) = E(XY) − E(X)E(Y). Using Example 6 (where c = 2), we have

E(XY) = ∫_{0}^{∞} ∫_{0}^{x} xy · 2e^{−x−y} dy dx = 2 ∫_{0}^{∞} x e^{−x} ( ∫_{0}^{x} y e^{−y} dy ) dx.

The integral inside the parentheses is 1 − e^{−x} − x e^{−x}. Therefore, we have

E(XY) = 2 ∫_{0}^{∞} x e^{−x} (1 − e^{−x} − x e^{−x}) dx
      = 2 ∫_{0}^{∞} x e^{−x} dx − ∫_{0}^{∞} x · 2e^{−2x} dx − ∫_{0}^{∞} x² · 2e^{−2x} dx
      = 2A − B − C.


Computation of the Covariance

Example 8 continued
Now, A = 1 because it is the expectation of a standard Exp(1) random variable. B = 1/2 since it is the expectation of an Exp(1/2) random variable. C is equal to E(U²) = V(U) + E²(U), where U ∼ Exp(1/2). So, C = 1/4 + 1/4 = 1/2. Therefore, E(XY) = 2 − 1/2 − 1/2 = 1.

( ) Verify that E(X) = 3/2 and E(Y) = 1/2. The covariance becomes

Cov(X, Y) = E(XY) − E(X)E(Y) = 1 − (3/2) × (1/2) = 1/4.

Does this result make sense? Why?
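A Monte Carlo sketch confirming Cov(X, Y) ≈ 1/4. It samples from f_{X,Y} via the factorization worked out for this example: Y has marginal density 2e^{−2y} (see the next slides), and given Y = y, X − y is standard exponential:

set.seed(1)
n <- 1e6
y <- rexp(n, rate = 2)      # marginal of Y
x <- y + rexp(n, rate = 1)  # conditional: X | Y = y is y + Exp(1)
cov(x, y)                   # ~ 0.25 = 1/4
mean(x); mean(y)            # ~ 3/2 and ~ 1/2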


Another Example on Conditional Expectation

In Example 6, we find that f_Y(y) = 2e^{−2y} 1_{(0<y<∞)}. We have

f_{X|Y=y}(x) = f_{X,Y}(x, y)/f_Y(y) = 2e^{−x−y}/(2e^{−2y}) = e^{−(x−y)} 1_{(0<y<x<∞)}.

Note that y is fixed here, so, with the substitution u = x − y (with du = dx and x = u + y), we have

E(X|Y = y) = ∫_{y}^{∞} x e^{−(x−y)} dx = ∫_{0}^{∞} (y + u) e^{−u} du = y + 1.

You can also argue this using the lack of memory property of the exponential distribution!


Law of Total Expectation

Suppose that X and Y are jointly distributed (we will assume they are both continuous, but this also works for discrete random variables or a mixture of both types) with pdfs f_X(x) and f_Y(y). Let h(x) = E(Y|X = x) and define the random variable h(X) := E(Y|X). We have

E(E(Y|X)) = E(h(X)) = ∫_ℝ h(x) f_X(x) dx = ∫_ℝ ( ∫_ℝ y f_{Y|X=x}(y) dy ) f_X(x) dx = E(Y).

For jointly distributed random variables X and Y, we have E(Y) = E(E(Y|X)), provided these expectations exist. This is actually a very powerful tool and often simplifies complicated proofs. There is also a Law of Total Variance and a Law of Total Covariance, which we will discuss if and when needed.
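A quick Monte Carlo illustration with the running example, using E(X|Y = y) = y + 1 from the previous slide:

set.seed(2)
n <- 1e6
y <- rexp(n, rate = 2)      # Y from Example 6
mean(y + 1)                 # E(E(X|Y)) = E(Y + 1) ~ 3/2
x <- y + rexp(n, rate = 1)  # simulate X itself
mean(x)                     # E(X) ~ 3/2: the two agree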
Transformation theorem

Let X be an n-dimensional continuous random variable with density f_X(x), and suppose that ∫_S f_X(x) dx = 1, where S ⊂ ℝⁿ.

Let g(x) = (g1(x), g2(x), . . . , gn(x)) be a bijection from S to some T ⊂ ℝⁿ, with unique inverse mapping g⁻¹.

Assume that both g and g⁻¹ are continuously differentiable.

Let Y = g(X), i.e. Y1 = g1(X1, . . . , Xn), . . . , Yn = gn(X1, . . . , Xn).

Let h(y) = (h1(y), . . . , hn(y)) be the unique inverse of g, so that X1 = h1(Y1, . . . , Yn), . . . , Xn = hn(Y1, . . . , Yn).


Theorem 11
The (joint) density of Y is

f_Y(y) = f_X(h1(y), h2(y), . . . , hn(y)) |J| · I{y ∈ T},

where J is the Jacobian determinant

    | ∂x1/∂y1  ∂x1/∂y2  ∂x1/∂y3  . . .  ∂x1/∂yn |
    | ∂x2/∂y1  ∂x2/∂y2  ∂x2/∂y3  . . .  ∂x2/∂yn |
J = |    .        .        .     . . .     .    |
    | ∂xn/∂y1  ∂xn/∂y2  ∂xn/∂y3  . . .  ∂xn/∂yn |

and ∂xk/∂yi = ∂hk(y)/∂yi.

Proof. ( ) For any B ⊆ ℝⁿ, let

h(B) = g⁻¹(B) = {x : g(x) ∈ B}.

Lemma to complete the proof of Theorem 11

The following lemma, which we state without proof, completes the proof of Theorem 11.

Lemma 1
Let Z be an n-dimensional continuous random variable. If, for every B ⊂ ℝⁿ,

P(Z ∈ B) = ∫_B f_Z(x) dx,

then f_Z(x) is a density of Z.


Comment on Theorem 11

The joint density of Y can also be written as

f_Y(y) = f_X(h1(y), h2(y), . . . , hn(y)) / |J_y| · I{y ∈ T},

where J_y is the Jacobian determinant of the forward mapping g, evaluated at x = h(y):

      | ∂y1/∂x1  ∂y1/∂x2  ∂y1/∂x3  . . .  ∂y1/∂xn |
      | ∂y2/∂x1  ∂y2/∂x2  ∂y2/∂x3  . . .  ∂y2/∂xn |
J_y = |    .        .        .     . . .     .    |
      | ∂yn/∂x1  ∂yn/∂x2  ∂yn/∂x3  . . .  ∂yn/∂xn |

and ∂yk/∂xi = ∂gk(x)/∂xi.


Example 9
Suppose that X ∼ Gamma(α, β). Find the distribution of Y = 2X/β. We have

f_X(x) = (x^{α−1} e^{−x/β})/(β^α Γ(α)) · 1_{(0,∞)}(x).

Here, y = g(x) = 2x/β, which implies that x = h(y) = βy/2. The Jacobian is |dx/dy| = |β/2| = β/2. Then,

f_Y(y) = f_X(βy/2) · (β/2) = ((βy/2)^{α−1} e^{−(βy/2)/β})/(β^α Γ(α)) · (β/2) · 1_{(0,∞)}(y) = (y^{α−1} e^{−y/2})/(2^α Γ(α)) · 1_{(0,∞)}(y).

Thus, Y ∼ Gamma(α, 2) =ᵈ χ²(2α).
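A simulation sketch of the conclusion, transforming Gamma(α, β) draws and comparing them to the χ²(2α) CDF (α = 3 and β = 5 are illustrative):

set.seed(3)
alpha <- 3; beta <- 5
x <- rgamma(1e4, shape = alpha, scale = beta)
y <- 2 * x / beta                      # the transformation of Example 9
ks.test(y, "pchisq", df = 2 * alpha)   # large p-value: consistent with chi^2(2 alpha)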


Example 10

Let X and Y be independent with X ∼ Gamma(α1, β) and Y ∼ Gamma(α2, β). Let U = X + Y and V = X/(X + Y), with U ∼ Gamma(α1 + α2, β) and V ∼ Beta(α1, α2). We wish to show that U and V are independent. Here, u = g1(x, y) = x + y and v = g2(x, y) = x/(x + y) yield x = h1(u, v) = uv and y = h2(u, v) = u(1 − v). The Jacobian is

    | ∂x/∂u  ∂x/∂v |   |   v      u  |
J = | ∂y/∂u  ∂y/∂v | = | 1 − v   −u  | = −vu − u(1 − v) = −u.

The joint density of U and V is

f_{U,V}(u, v) = f_{X,Y}(uv, u(1 − v)) |J|
              = ((uv)^{α1−1} e^{−uv/β})/(β^{α1} Γ(α1)) × ((u(1 − v))^{α2−1} e^{−u(1−v)/β})/(β^{α2} Γ(α2)) × |−u|.


Example 10 continued
We can rewrite the joint density of U and V as follows:

f_{U,V}(u, v) = (u^{α1+α2−1} e^{−u/β})/(β^{α1+α2} Γ(α1 + α2)) × (Γ(α1 + α2)/(Γ(α1)Γ(α2))) v^{α1−1} (1 − v)^{α2−1},

where 0 < u < ∞ and 0 < v < 1. The first factor is the Gamma(α1 + α2, β) density and the second is the Beta(α1, α2) density. Therefore,

f_{U,V}(u, v) = f_U(u) × f_V(v),

which implies that the random variables U and V are independent.
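A short simulation sketch (α1 = 2, α2 = 5, β = 1 illustrative): the marginals match the claimed Gamma and Beta distributions, and U and V are uncorrelated, as independence predicts:

set.seed(4)
a1 <- 2; a2 <- 5; beta <- 1
x <- rgamma(1e4, shape = a1, scale = beta)
y <- rgamma(1e4, shape = a2, scale = beta)
u <- x + y; v <- x / (x + y)
ks.test(u, "pgamma", shape = a1 + a2, scale = beta)  # U ~ Gamma(a1 + a2, beta)
ks.test(v, "pbeta", shape1 = a1, shape2 = a2)        # V ~ Beta(a1, a2)
cor(u, v)                                            # ~ 0, consistent with independence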


Remark (from Examples 9 and 10): If X1, . . . , Xn are independent with Xi ∼ Gamma(αi, β), then
1 ∑_{i=1}^{n} Xi ∼ Gamma(∑_{i=1}^{n} αi, β).
2 2Xi/β ∼ χ²(2αi) and 2 ∑_{i=1}^{n} Xi/β ∼ χ²(2 ∑_{i=1}^{n} αi).

Exercise 3 ( )
1 Suppose that X1, . . . , Xn are iid Exp(β). What is the distribution of 2nX̄n/β?
2 Let m, n ∈ ℕ such that m < n and suppose that X1, . . . , Xn are iid Exp(β). What is the distribution of (∑_{i=1}^{m} Xi)/(∑_{i=1}^{n} Xi)?
3 Let X1, . . . , Xn be independent with Xi ∼ χ²(νi) for i = 1, . . . , n. Argue that ∑_{i=1}^{n} Xi ∼ χ²(∑_{i=1}^{n} νi).


Many-to-one transformations

The mapping g is many-to-1.

Find a partition S1, . . . , Sm of S (i.e. S = ∪_k Sk and Sk ∩ Si = ∅ for all k ≠ i) such that g : Sk → T is 1-1 on each Sk with Jacobian Jk; then

f_Y(y) = ∑_{k=1}^{m} f_X(h1k(y), . . . , hnk(y)) |Jk|,

where hk(y) = (h1k(y), . . . , hnk(y)) is the unique inverse of the mapping g : Sk → T on each Sk.


Many-to-one transformations

Example 11
Let Z ∼ N(0, 1) and let Y = Z². Find the distribution of Y. The function g(z) = z² is not one-to-one, since z and −z map to the same y, but g(z) is one-to-one and onto T = (0, ∞) on each of S1 = (−∞, 0) and S2 = (0, ∞), with ℝ = S1 ∪ S2 (up to the single point 0, which can be ignored). That is, g : (−∞, 0) → (0, ∞) and g : (0, ∞) → (0, ∞) are each one-to-one and onto (0, ∞). On S1, we have z = h11(y) = −√y and J1 = dz/dy = −1/(2√y); and on S2, we have z = h12(y) = √y and J2 = dz/dy = 1/(2√y). Then, the density of Y is

f_Y(y) = f_Z(−√y)|J1| + f_Z(√y)|J2|.

Note that f_Z(z) = f_Z(−z) and |J1| = |J2| = 1/(2√y). The density of Y (so Y ∼ χ²(1)) becomes

f_Y(y) = 2 f_Z(√y)|J1| = (1/√(2πy)) e^{−y/2} = (y^{1/2−1} e^{−y/2})/(2^{1/2} Γ(1/2)) · 1_{(0,∞)}(y).
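A simulation sketch of Example 11's conclusion, plus a spot check of the density formula at an arbitrary point y = 0.8:

set.seed(5)
z <- rnorm(1e4)
ks.test(z^2, "pchisq", df = 1)       # Z^2 consistent with chi^2(1)
y <- 0.8
2 * dnorm(sqrt(y)) / (2 * sqrt(y))   # 2 f_Z(sqrt(y)) |J1|
dchisq(y, df = 1)                    # same value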

Some useful facts

1 If X1, . . . , Xn are independent standard normal random variables, then ∑_{i=1}^{n} Xi ∼ N(0, n) and ∑_{i=1}^{n} Xi/√n ∼ N(0, 1). More generally, if X1, . . . , Xn are independent with Xi ∼ N(µi, σi²), then ∑_{i=1}^{n} Xi ∼ N(∑_{i=1}^{n} µi, ∑_{i=1}^{n} σi²) and ∑_{i=1}^{n} (Xi − µi)/σi ∼ N(0, n).
2 If Z ∼ N(0, 1), then Z² ∼ χ²(1).
3 If X1, . . . , Xn are independent with Xi ∼ Gamma(αi, β), then ∑_{i=1}^{n} Xi ∼ Gamma(∑_{i=1}^{n} αi, β).
4 If Z1, . . . , Zn are independent standard normal random variables, then ∑_{i=1}^{n} Zi² ∼ χ²(n).
5 If Z ∼ N(0, 1) and V ∼ χ²(n) are independent, then T = Z/√(V/n) ∼ t(n), where t(n) is the Student's t distribution with n degrees of freedom (see the sketch after this list).
6 If X ∼ χ²(n) is independent of Y ∼ χ²(m), then F = (X/n)/(Y/m) ∼ F(n, m), where F(n, m) is Snedecor's F distribution with n and m degrees of freedom.
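A simulation sketch of fact 5 (n = 7 illustrative):

set.seed(6)
n <- 7
z <- rnorm(1e4)
v <- rchisq(1e4, df = n)
t_stat <- z / sqrt(v / n)        # the construction in fact 5
ks.test(t_stat, "pt", df = n)    # consistent with Student's t(n)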
