MS Lectures 5
where $x = (x_1, x_2)^T$.
The determinant of $V$ is
\[
\det V = \det\begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix} = (1-\rho^2)\sigma_1^2\sigma_2^2.
\]
48 CHAPTER 1. ELEMENTS OF PROBABILITY DISTRIBUTION THEORY
Theorem 1.17. Let X and Y be jointly continuous random variables with joint
pdf fX,Y (x, y) which has support on S ⊆ R2 . Consider random variables U =
g(X, Y ) and V = h(X, Y ), where g(·, ·) and h(·, ·) form a one-to-one mapping
from S to D with inverses x = g −1 (u, v) and y = h−1 (u, v) which have continuous
partial derivatives. Then, the joint pdf of (U, V ) is
\[
f_{U,V}(u,v) = f_{X,Y}\big(g^{-1}(u,v),\, h^{-1}(u,v)\big)\,|J|,
\]
where $J$ is the Jacobian determinant of the inverse transformation.
1.10. TWO-DIMENSIONAL RANDOM VARIABLES 49
We will find the joint pdf for $(U, V)$, where $U = g(X, Y) = X + Y$ and $V = h(X, Y) = X/Y$. This transformation, together with the support of $(X, Y)$, gives the support of $(U, V)$: $\{(u, v) : u > 0, v > 0\}$.
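The Jacobian needed by Theorem 1.17 can be checked numerically. Below is a minimal sketch for this transformation: the inverse maps $x = uv/(1+v)$, $y = u/(1+v)$ are derived by hand from $U = X + Y$, $V = X/Y$, and a finite-difference Jacobian is compared with the analytic value $|J| = u/(1+v)^2$ (the test point is arbitrary).

```python
# Inverse of the transformation u = x + y, v = x / y.
def inverse(u, v):
    return u * v / (1 + v), u / (1 + v)

def jacobian_det(u, v, h=1e-6):
    """Central-difference approximation of det d(x, y)/d(u, v)."""
    xu = (inverse(u + h, v)[0] - inverse(u - h, v)[0]) / (2 * h)
    xv = (inverse(u, v + h)[0] - inverse(u, v - h)[0]) / (2 * h)
    yu = (inverse(u + h, v)[1] - inverse(u - h, v)[1]) / (2 * h)
    yv = (inverse(u, v + h)[1] - inverse(u, v - h)[1]) / (2 * h)
    return xu * yv - xv * yu

u, v = 2.0, 3.0                 # arbitrary point with u > 0, v > 0
approx = abs(jacobian_det(u, v))
exact = u / (1 + v) ** 2        # analytic |J| = u / (1 + v)^2
```

The two values agree to the accuracy of the finite-difference scheme, which is a convenient sanity check before integrating a transformed density.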
Theorem 1.18. Let $X$ and $Y$ be independent rvs and let $g(x)$ be a function of $x$ only and $h(y)$ be a function of $y$ only. Then the random variables $U = g(X)$ and $V = h(Y)$ are independent.
Proof. (Continuous case) For any u ∈ R and v ∈ R, define
Exercise 1.19. Let (X, Y ) be a two-dimensional random variable with joint pdf
\[
f_{X,Y}(x,y) = \begin{cases} 8xy, & \text{for } 0 \le x < y \le 1;\\ 0, & \text{otherwise.} \end{cases}
\]
Let U = X/Y and V = Y .
\[
U = \frac{X}{X+Y} \quad\text{and}\quad V = X + Y.
\]
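For the pdf $f_{X,Y}(x,y) = 8xy$ on $0 \le x < y \le 1$, the pair $U = X/Y$, $V = Y$ can be explored by simulation. The sketch below (sampling scheme derived by hand: $Y$ has marginal $4y^3$ and $X \mid Y = y$ has density $2x/y^2$) checks that $U$ and $V$ look uncorrelated, which is consistent with independence.

```python
import random

random.seed(42)

def sample_xy():
    # Marginal of Y is 4y^3 on (0,1), so Y = U^(1/4);
    # conditionally X | Y = y has density 2x/y^2, so X = y * sqrt(U').
    y = random.random() ** 0.25
    x = y * random.random() ** 0.5
    return x, y

us, vs = [], []
for _ in range(50_000):
    x, y = sample_xy()
    us.append(x / y)   # U = X / Y
    vs.append(y)       # V = Y

n = len(us)
mu_u = sum(us) / n
mu_v = sum(vs) / n
cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(us, vs)) / n
var_u = sum((a - mu_u) ** 2 for a in us) / n
var_v = sum((b - mu_v) ** 2 for b in vs) / n
corr = cov / (var_u * var_v) ** 0.5   # should be near 0
```

A near-zero sample correlation does not prove independence, of course; the exercise asks for the exact joint pdf via Theorem 1.17.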
Example 1.32. In a study on the relation between blood cholesterol level and the incidence of heart attack, 28 heart-attack patients had their cholesterol levels measured two days and four days after the attack. Cholesterol levels were also recorded for a control group of 30 people who had not had a heart attack. The data are available on the course web-page.
1. What is the population’s mean cholesterol level on the second day after a heart attack?
2. Is the difference in the mean cholesterol level on day 2 and on day 4 after
the attack statistically significant?
We say that such random variables are iid, that is, independent and identically distributed.
The joint pdf (pmf in the discrete case) can be written as a product of the marginal pdfs, i.e.,
\[
f_X(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_{X_i}(x_i).
\]
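As a quick numerical illustration of the factorization (a sketch; the sample values are made up and the marginals are taken to be standard normal), the joint density of an iid sample is just the product of the marginal densities, usually computed on the log scale in practice:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

sample = [0.3, -1.2, 0.8, 2.1]   # illustrative iid observations

# Joint pdf as the product of marginals.
joint = 1.0
for x in sample:
    joint *= normal_pdf(x)

# Equivalent, numerically safer form via log densities.
log_joint = sum(math.log(normal_pdf(x)) for x in sample)
```

The log form avoids underflow when $n$ is large, since a product of many small densities quickly rounds to zero in floating point.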
Note: In Example 1.32 it can be assumed that these rvs are mutually independent (the cholesterol level of one patient does not depend on the cholesterol level of another patient). If we can also assume that the distribution of the cholesterol level is the same for each patient, then the variables $X_1, \ldots, X_{28}$ constitute a random sample.
Y = T (X1 , . . . , Xn ).
Note that $Y$ itself is then a random variable. The distribution of $Y$ carries some information about the population and allows us to make inference regarding some parameters of the population, for example about the expected cholesterol level and its variability in the population of people who suffer a heart attack.
The distributions of many functions $T(X_1, \ldots, X_n)$ can be derived from the distributions of the variables $X_i$. Such distributions are called sampling distributions and the function $T$ is called a statistic.
The functions
\[
\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad\text{and}\quad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \overline{X})^2
\]
are the most common statistics used in data analysis.
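These two statistics can be computed directly from their definitions; a minimal sketch (the data values are illustrative only, and note the $n-1$ divisor in $S^2$):

```python
def sample_mean(xs):
    """X-bar: arithmetic mean of the sample."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """S^2: sum of squared deviations divided by n - 1."""
    n = len(xs)
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
xbar = sample_mean(data)       # 5.0
s2 = sample_variance(data)     # 32/7
```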
These three distributions can be derived from distributions of iid random variables.
They are commonly used in statistical hypothesis testing and in interval estimation
of unknown population parameters.
$\chi^2_\nu$ distribution
\[
Y \sim \chi^2_\nu,
\]
In the following example we will see that $\chi^2_\nu$ is the distribution of a function of iid standard normal rvs.
Example 1.33. In Example 1.17 we saw that the square of a standard normal rv has a $\chi^2_1$ distribution. Now assume that $Z_1 \sim N(0, 1)$ and $Z_2 \sim N(0, 1)$ independently. What is the distribution of $Y = Z_1^2 + Z_2^2$? Denote $Y_1 = Z_1^2$ and $Y_2 = Z_2^2$.
To answer this question we can use the properties of the mgf as follows.
\[
M_Y(t) = E\big(e^{tY}\big) = E\big(e^{t(Y_1+Y_2)}\big) = E\big(e^{tY_1}\big)E\big(e^{tY_2}\big) = M_{Y_1}(t)\,M_{Y_2}(t),
\]
as, by Theorem 1.18, $Y_1$ and $Y_2$ are independent. Also, $Y_1 \sim \chi^2_1$ and $Y_2 \sim \chi^2_1$, each with mgf
\[
M_{Y_i}(t) = \frac{1}{(1-2t)^{1/2}}, \qquad i = 1, 2.
\]
Hence,
\[
M_Y(t) = M_{Y_1}(t)\,M_{Y_2}(t) = \frac{1}{(1-2t)^{1/2}}\cdot\frac{1}{(1-2t)^{1/2}} = \frac{1}{1-2t}.
\]
This is the mgf of the $\chi^2_2$ distribution. Hence, by the uniqueness of the mgf, we can conclude that $Y = Z_1^2 + Z_2^2 \sim \chi^2_2$.
Note: This result can easily be extended to $n$ independent standard normal rvs. That is, if $Z_i \sim N(0, 1)$ for $i = 1, \ldots, n$ independently, then
\[
Y = \sum_{i=1}^{n} Z_i^2 \sim \chi^2_n.
\]
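This note can be illustrated by simulation. A $\chi^2_n$ rv has mean $n$ and variance $2n$, so a minimal sketch (parameters chosen arbitrarily) is to sum $n$ squared standard normal draws and compare the empirical moments:

```python
import random

random.seed(0)
n, reps = 5, 20_000

# Each replication: Y = Z_1^2 + ... + Z_n^2 with Z_i iid N(0, 1).
ys = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))
      for _ in range(reps)]

mean_y = sum(ys) / reps                            # expect about n = 5
var_y = sum((y - mean_y) ** 2 for y in ys) / reps  # expect about 2n = 10
```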
1.11. RANDOM SAMPLE AND SAMPLING DISTRIBUTIONS 55
Note that T in the above corollary can be written as a sum of squared iid standard
normal rvs, hence it must have a chi-squared distribution.
Example 1.34. Let $X_1, \ldots, X_n$ be iid random variables such that
\[
X_i \sim N(\mu, \sigma^2), \qquad i = 1, \ldots, n.
\]
Then
\[
Z_i = \frac{X_i - \mu}{\sigma} \sim N(0, 1)
\]
and so
\[
\sum_{i=1}^{n} Z_i^2 = \sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2} \sim \chi^2_n.
\]
This is a useful result, but it depends on two parameters whose values are usually unknown (when we analyze data). Here we will see what happens when we replace $\mu$ with $\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. We can write
\[
\begin{aligned}
\sum_{i=1}^{n} (X_i - \mu)^2 &= \sum_{i=1}^{n} \big[(X_i - \overline{X}) + (\overline{X} - \mu)\big]^2 \\
&= \sum_{i=1}^{n} (X_i - \overline{X})^2 + \sum_{i=1}^{n} (\overline{X} - \mu)^2 + 2(\overline{X} - \mu)\underbrace{\sum_{i=1}^{n} (X_i - \overline{X})}_{=0} \\
&= \sum_{i=1}^{n} (X_i - \overline{X})^2 + n(\overline{X} - \mu)^2.
\end{aligned}
\]
Dividing both sides by $\sigma^2$ gives
\[
\sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2} = \frac{(n-1)S^2}{\sigma^2} + \frac{(\overline{X} - \mu)^2}{\sigma^2/n},
\]
where $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \overline{X})^2$.
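The decomposition above is an algebraic identity and can be checked numerically for any sample and any $\mu$; a minimal sketch (the values are illustrative only):

```python
data = [1.2, 3.4, 2.2, 5.0, 4.1]   # arbitrary sample
mu = 2.5                           # arbitrary value of mu
n = len(data)
xbar = sum(data) / n

# Left side: sum of squared deviations from mu.
lhs = sum((x - mu) ** 2 for x in data)
# Right side: within-sample part plus the shift term n*(xbar - mu)^2.
rhs = sum((x - xbar) ** 2 for x in data) + n * (xbar - mu) ** 2
```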
Hence,
\[
\left(\frac{\overline{X} - \mu}{\sigma/\sqrt{n}}\right)^2 \sim \chi^2_1.
\]
Also,
\[
\sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2} \sim \chi^2_n.
\]
That is,
\[
M_U(t) = \frac{M_W(t)}{M_V(t)} = \frac{(1-2t)^{-n/2}}{(1-2t)^{-1/2}} = \frac{1}{(1-2t)^{(n-1)/2}}.
\]
The last expression is the mgf of a random variable with a $\chi^2_{n-1}$ distribution. That is,
\[
\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}.
\]
Equivalently,
\[
\sum_{i=1}^{n} \frac{(X_i - \overline{X})^2}{\sigma^2} \sim \chi^2_{n-1}.
\]
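The loss of one degree of freedom can be seen by simulation: $(n-1)S^2/\sigma^2$ should average about $n-1$, not $n$. A minimal sketch (all parameters are chosen arbitrarily):

```python
import random

random.seed(1)
mu, sigma, n, reps = 10.0, 2.0, 8, 20_000

stats = []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)   # sample variance
    stats.append((n - 1) * s2 / sigma ** 2)           # ~ chi^2_{n-1}

mean_stat = sum(stats) / reps   # expect about n - 1 = 7
```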
Student $t_\nu$ distribution
We write
\[
Y \sim t_\nu,
\]
Proof. Here we will apply Theorem 1.17. We will find the joint distribution of $(Y, W)$, where $Y = \frac{Z}{\sqrt{X/\nu}}$ and $W = X$, and then we will find the marginal density function for $Y$, as required.
The range of $(Z, X)$ is $\{(z, x) : -\infty < z < \infty,\ 0 < x < \infty\}$ and so the range of $(Y, W)$ is $\{(y, w) : -\infty < y < \infty,\ 0 < w < \infty\}$.
To obtain the marginal pdf for $Y$ we need to integrate the joint pdf for $(Y, W)$ over the range of $W$. This is an easy task once we notice the similarity of this function to the pdf of a Gamma$(\alpha, \lambda)$ random variable and use the fact that a pdf integrates to 1 over its whole range.
\[
f_V(v) = \frac{v^{\alpha-1}\lambda^{\alpha} e^{-\lambda v}}{\Gamma(\alpha)}.
\]
Denote
\[
\alpha = \frac{\nu+1}{2}, \qquad \lambda = \frac{1}{2}\left(1 + \frac{y^2}{\nu}\right).
\]
Then, multiplying and dividing the joint pdf for $(Y, W)$ by $\lambda^{\alpha}$ and by $\Gamma(\alpha)$, we get
\[
\begin{aligned}
f_{Y,W}(y,w) &= \frac{w^{\alpha-1} e^{-\lambda w}}{\sqrt{\nu\pi}\, 2^{\alpha}\, \Gamma\!\left(\frac{\nu}{2}\right)} \times \frac{\lambda^{\alpha}\,\Gamma(\alpha)}{\lambda^{\alpha}\,\Gamma(\alpha)} \\
&= \frac{w^{\alpha-1} \lambda^{\alpha} e^{-\lambda w}}{\Gamma(\alpha)} \times \frac{\Gamma(\alpha)}{\sqrt{\nu\pi}\, 2^{\alpha}\, \Gamma\!\left(\frac{\nu}{2}\right) \lambda^{\alpha}}.
\end{aligned}
\]
Then, the marginal pdf for $Y$ is
\[
\begin{aligned}
f_Y(y) &= \int_0^{\infty} \frac{w^{\alpha-1} \lambda^{\alpha} e^{-\lambda w}}{\Gamma(\alpha)} \times \frac{\Gamma(\alpha)}{\sqrt{\nu\pi}\, 2^{\alpha}\, \Gamma\!\left(\frac{\nu}{2}\right) \lambda^{\alpha}}\, dw \\
&= \frac{\Gamma(\alpha)}{\sqrt{\nu\pi}\, 2^{\alpha}\, \Gamma\!\left(\frac{\nu}{2}\right) \lambda^{\alpha}} \int_0^{\infty} \frac{w^{\alpha-1} \lambda^{\alpha} e^{-\lambda w}}{\Gamma(\alpha)}\, dw = \frac{\Gamma(\alpha)}{\sqrt{\nu\pi}\, 2^{\alpha}\, \Gamma\!\left(\frac{\nu}{2}\right) \lambda^{\alpha}},
\end{aligned}
\]
since the remaining integrand is the Gamma$(\alpha, \lambda)$ pdf, which integrates to 1.
Then
\[
\overline{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right), \quad\text{so}\quad U = \frac{\overline{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).
\]
Note: The function $T$ in Example 1.35 is used for testing hypotheses about the population mean $\mu$ when the variance of the population is unknown. We will use this function later in this course.
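Computing this function from data is straightforward; a minimal sketch, where both the sample values and the hypothesized mean $\mu_0$ are made up for illustration:

```python
import math

data = [251.0, 262.5, 240.0, 255.5, 248.0, 259.0]  # illustrative sample
mu0 = 250.0                                        # hypothesized mean

n = len(data)
xbar = sum(data) / n
# Sample standard deviation (n - 1 divisor).
s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))
# T = (xbar - mu0) / (s / sqrt(n)); under H0 this has a t_{n-1} distribution.
t_stat = (xbar - mu0) / (s / math.sqrt(n))
```

The observed value of $T$ would then be compared with quantiles of the $t_{n-1}$ distribution.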
\[
f_Y(y) = \frac{\Gamma\!\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \frac{y^{\frac{\nu_1-2}{2}}}{\left(1 + \frac{\nu_1}{\nu_2}\, y\right)^{\frac{\nu_1+\nu_2}{2}}}, \qquad y \in \mathbb{R}_+.
\]
Proof. This result can be shown in a similar way to the result of Theorem 1.19. Take the transformation
\[
F = \frac{X/\nu_1}{Y/\nu_2} \quad\text{and}\quad W = Y
\]
and use Theorem 1.17.
Example 1.36. Let $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ be two independent random samples such that
\[
X_i \sim N(\mu_1, \sigma_1^2), \quad i = 1, \ldots, n, \qquad Y_j \sim N(\mu_2, \sigma_2^2), \quad j = 1, \ldots, m,
\]
and let $S_1^2$ and $S_2^2$ be the sample variances of the random samples $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$, respectively. Then (see Example 1.34),
\[
\frac{(n-1)S_1^2}{\sigma_1^2} \sim \chi^2_{n-1}.
\]
Similarly,
\[
\frac{(m-1)S_2^2}{\sigma_2^2} \sim \chi^2_{m-1}.
\]
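These two chi-squared results suggest forming the ratio $(S_1^2/\sigma_1^2)/(S_2^2/\sigma_2^2)$, which has an $F_{n-1,\,m-1}$ distribution. A simulation sketch (all parameters chosen arbitrarily) compares the empirical mean of the ratio with the $F_{n-1,m-1}$ mean $(m-1)/(m-3)$:

```python
import random

random.seed(2)
n, m, s1, s2 = 10, 12, 1.5, 3.0   # sample sizes and true sigmas
reps = 20_000

def samp_var(xs):
    """Sample variance with the n - 1 divisor."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

fs = []
for _ in range(reps):
    xs = [random.gauss(0.0, s1) for _ in range(n)]
    ys = [random.gauss(0.0, s2) for _ in range(m)]
    fs.append((samp_var(xs) / s1 ** 2) / (samp_var(ys) / s2 ** 2))

mean_f = sum(fs) / reps   # F_{9,11} mean is 11/9
```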
(a) Show that for any $a \in \mathbb{R}$ such that $a + \alpha > 0$ the expectation of $Y^a$ is
\[
E(Y^a) = \frac{\Gamma(a+\alpha)}{\lambda^{a}\,\Gamma(\alpha)}.
\]
(b) Obtain $E(Y)$ and $\operatorname{var}(Y)$. Hint: Use the result given in part (a) and the recurrence property of the gamma function, i.e., $\Gamma(z) = (z-1)\Gamma(z-1)$.
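The formula in part (a) can be checked numerically for part (b): taking $a = 1$ and $a = 2$ should give $E(Y) = \alpha/\lambda$ and $\operatorname{var}(Y) = \alpha/\lambda^2$. A minimal sketch, with $\alpha$ and $\lambda$ chosen arbitrarily:

```python
import math

alpha, lam = 3.5, 2.0   # arbitrary Gamma(alpha, lambda) parameters

def moment(a):
    """E(Y^a) = Gamma(a + alpha) / (lam^a * Gamma(alpha)), from part (a)."""
    return math.gamma(a + alpha) / (lam ** a * math.gamma(alpha))

mean = moment(1)                    # should equal alpha / lam
var = moment(2) - moment(1) ** 2    # should equal alpha / lam^2
```

The agreement follows from the recurrence $\Gamma(z) = (z-1)\Gamma(z-1)$, exactly as the hint suggests.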
(a) $W = \sum_{i=1}^{5} Y_i^2$?
(b) $U = \sum_{i=1}^{5} (Y_i - \overline{Y})^2$?
(c) $U + Y_6^2$?
(d) $\sqrt{5}\, Y_6 / \sqrt{W}$?