CH 5 Slides
Kuan Xu
University of Toronto
kuan.xu@utoronto.ca
1 Introduction
2 Bivariate and Multivariate Probability Distributions
3 Marginal and Conditional Probability Distributions
4 Independent Random Variables
5 The Expected Value of a Function of Random Variables
6 Special Theorems
7 The Covariance of Two Random Variables
8 The Expected Value and Variance of Linear Functions of Random Variables
9 The Bivariate Normal Distribution
10 Conditional Expectations
Example: Toss a pair of dice. The mn rule tells us there are 36 sample
points.
We note
1. p(y1, y2) ≥ 0.
2. Σ_{i,j} p(yi, yj) = 1.
            Y1
            2        3
Y2   1   p(2, 1)  p(3, 1)
     2   p(2, 2)  p(3, 2)
Table
{1, 1}, {1, 2}, {1, 3}, . . . , {3, 1}, {3, 2}, {3, 3}
             y1
             0                 1                 2
y2   0   {3, 3}            {1, 3} or {3, 1}  {1, 1}
     1   {3, 2} or {2, 3}  {1, 2} or {2, 1}  NA
     2   {2, 2}            NA                NA
Table: Events for Y1 and Y2
           y1
           0     1     2
y2   0   1/9   2/9   1/9
     1   2/9   2/9   0
     2   1/9   0     0
Table: Joint Probability Function
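As a quick check, the joint table above satisfies both defining properties of a joint probability function (a sketch using Python's fractions module; the cell values are copied from the table):

```python
from fractions import Fraction

# Joint probability function from the table: p[(y1, y2)]
p = {
    (0, 0): Fraction(1, 9), (1, 0): Fraction(2, 9), (2, 0): Fraction(1, 9),
    (0, 1): Fraction(2, 9), (1, 1): Fraction(2, 9), (2, 1): Fraction(0),
    (0, 2): Fraction(1, 9), (1, 2): Fraction(0),    (2, 2): Fraction(0),
}

# Property 1: every p(y1, y2) is nonnegative.
assert all(v >= 0 for v in p.values())

# Property 2: the probabilities sum to 1.
assert sum(p.values()) == 1

# F(2, 3) = P(Y1 <= 2, Y2 <= 3) covers every cell of the table, so it equals 1.
print(sum(v for (y1, y2), v in p.items() if y1 <= 2 and y2 <= 3))  # 1
```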
F (2, 3) = P(Y1 ≤ 2, Y2 ≤ 3)
Remarks on part 3:
Figure: f (y1 , y2 ) [figure omitted: the (y1, y2) plane with axes from −∞, marking F(y1, y2) at (y1, y2) and F(y1*, y2*) at (y1*, y2*)]
Kuan Xu (UofT) ECO 227 January 22, 2024 15 / 92
Bivariate and Multivariate Probability Distributions (13)
b.
F(.2, .4) = ∫_0^{.4} ∫_0^{.2} (1) dy1 dy2
          = ∫_0^{.4} [y1]_0^{.2} dy2
          = ∫_0^{.4} .2 dy2
          = .08.
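The double integral can be confirmed numerically; the sketch below uses a simple midpoint rule for the uniform density f(y1, y2) = 1 on the unit square (the helper name dbl_midpoint is ours, not from the slides):

```python
# Numerically verify F(.2, .4) = P(Y1 <= .2, Y2 <= .4) = .08
# for the uniform density f(y1, y2) = 1 on the unit square.

def dbl_midpoint(f, a, b, c, d, n=200):
    """Midpoint-rule approximation of the double integral of f over [a,b] x [c,d]."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
    return total * hx * hy

F_02_04 = dbl_midpoint(lambda y1, y2: 1.0, 0, .2, 0, .4)
print(round(F_02_04, 6))  # 0.08
```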
Solution (continued):
c.
P(.1 ≤ Y1 ≤ .3, 0 ≤ Y2 ≤ .5) = ∫_0^{.5} ∫_{.1}^{.3} (1) dy1 dy2
                             = ∫_0^{.5} [y1]_{.1}^{.3} dy2
                             = ∫_0^{.5} (.2) dy2
                             = [.2 y2]_0^{.5}
                             = .1.
Figure: f (y1 , y2 )
2 Find P(0 ≤ Y1 ≤ .5, Y2 > .25). Note (0 ≤ Y2 ≤ Y1 ≤ 1), (Y2 > .25), and (0 ≤ Y1 ≤ .5) ⇒ (.25 ≤ Y1 ≤ .5)
and (.25 ≤ Y2 ≤ Y1 ).
2 (continued)
P(0 ≤ Y1 ≤ .5, Y2 > .25) = ∫_{1/4}^{1/2} ∫_{1/4}^{y1} 3y1 dy2 dy1
                         = ∫_{1/4}^{1/2} 3y1 [y2]_{1/4}^{y1} dy1
                         = ∫_{1/4}^{1/2} 3y1 (y1 − 1/4) dy1
                         = ∫_{1/4}^{1/2} (3y1^2 − (3/4)y1) dy1
                         = [y1^3 − (3/8)y1^2]_{1/4}^{1/2}
                         = [(1/8) − (3/8)(1/4)] − [(1/64) − (3/8)(1/16)]
                         = [1/32] − [2/128 − 3/128]
                         = [4/128] + [1/128]
                         = 5/128.
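The answer 5/128 = .0390625 can be verified numerically with a midpoint rule over the support of the density (a sketch; the grid size n is an arbitrary accuracy choice):

```python
# Numerically verify P(0 <= Y1 <= .5, Y2 > .25) = 5/128 = .0390625
# for the density f(y1, y2) = 3*y1 on 0 <= y2 <= y1 <= 1.

def prob(n=1000):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            y1, y2 = (i + 0.5) * h, (j + 0.5) * h
            # indicator of the event intersected with the support
            if y2 <= y1 <= 0.5 and y2 > 0.25:
                total += 3 * y1
    return total * h * h

p_est = prob()
print(p_est)  # approximately 0.0390625
```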
and
P(Y2 = y2) = p2(y2) = Σ_{y1=1}^{6} p(y1, y2).
           y1
           0      1      2     Total
y2   0     0    3/15   3/15    6/15
P(Y1 = 0, Y2 = 0) = p(0, 0) = 0.
Let us look at P(Y1 = 1, Y2 = 0). We use the hypergeometric probability distribution here.
P(Y1 = 1, Y2 = 0) = p(1, 0)
                  = [C(3, 1) C(2, 0) C(1, 1)] / C(6, 2)
                  = 3/15.
Here 0! = 1. Similarly,
P(Y1 = 2, Y2 = 0) = p(2, 0)
                  = [C(3, 2) C(2, 0) C(1, 0)] / C(6, 2)
                  = 3/15.
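These hypergeometric cell probabilities can be reproduced with Python's math.comb (a sketch; it assumes, per the coefficients above, a sample of 2 from 6 balls of three types with counts 3, 2, and 1):

```python
from fractions import Fraction
from math import comb

def p(y1, y2):
    """Joint probability of y1 type-1 balls and y2 type-2 balls
    (hence 2 - y1 - y2 type-3 balls) in a sample of 2 from 3 + 2 + 1 = 6."""
    y3 = 2 - y1 - y2
    if y3 < 0:
        return Fraction(0)
    return Fraction(comb(3, y1) * comb(2, y2) * comb(1, y3), comb(6, 2))

print(p(1, 0))  # 1/5 (i.e., 3/15)
print(p(2, 0))  # 1/5 (i.e., 3/15)
```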
We could fill other cells of the table. Note the last row of the table offers the marginal distribution of Y1 . Similarly, the last
column of the table offers the marginal distribution of Y2 .
Figure: f (y1 , y2 )
P(A ∩ B) = P(A)P(B|A).
Examples: Please find P(Y1 = 0|Y2 = 1) and P(Y1 = 1|Y2 = 1).
Solution:
P(Y1 = 0|Y2 = 1) = (2/15) / (8/15) = 1/4.
P(Y1 = 1|Y2 = 1) = (6/15) / (8/15) = 3/4.
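The arithmetic can be confirmed with exact fractions (a sketch; the cell probabilities 2/15 and 6/15 and the marginal 8/15 are those quoted in the slide):

```python
from fractions import Fraction as F

# Conditional probability: P(Y1 = y1 | Y2 = y2) = p(y1, y2) / p2(y2)
p_0_1 = F(2, 15)   # p(0, 1) from the joint table
p_1_1 = F(6, 15)   # p(1, 1)
p2_1 = F(8, 15)    # marginal p2(1) = p(0, 1) + p(1, 1)

assert p_0_1 + p_1_1 == p2_1  # the marginal is the row sum
print(p_0_1 / p2_1)  # 1/4
print(p_1_1 / p2_1)  # 3/4
```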
Marginal and Conditional Probability Distributions (11)
In addition, by definition,
F(y1) = ∫_{−∞}^{y1} f1(t1) dt1 = ∫_{−∞}^{y1} [∫_{−∞}^{∞} f(t1, y2) dy2] dt1.
These imply
F(y1|y2) f2(y2) = ∫_{−∞}^{y1} f(t1, y2) dt1.
⇒ F(y1|y2) = ∫_{−∞}^{y1} [f(t1, y2)/f2(y2)] dt1 = ∫_{−∞}^{y1} f(t1|y2) dt1, f2(y2) > 0,
where the last equality defines the link between the conditional distribution
function F(y1|y2) and the conditional density function f(y1|y2).
Marginal and Conditional Probability Distributions (13)
Please note that f(y1|y2) [or f(y2|y1)] is undefined for all y2 [or y1] such
that f2(y2) = 0 [or f1(y1) = 0]. In other words, f(y1|y2) [or f(y2|y1)] exists if
f2(y2) ̸= 0 [or f1(y1) ̸= 0].
(The points (y1, y2) are uniformly distributed over the triangle with the
given boundaries.) Find the conditional density function of Y1 given
Y2 = y2, f(y1|y2), and P(Y1 ≤ 1/2|Y2 = 1.5).
Solution: Find f2(y2).
f2(y2) = { ∫_0^{y2} (1/2) dy1 = (1/2)y2,   0 ≤ y2 ≤ 2,
         { ∫_{−∞}^{∞} 0 dy1 = 0,           elsewhere.
Remarks: f2(y2) > 0 if and only if 0 < y2 ≤ 2. Thus, for any 0 < y2 ≤ 2,
f(y1|y2) = f(y1, y2)/f2(y2) = (1/2)/((1/2)y2) = 1/y2,   0 ≤ y1 ≤ y2.
Therefore,
P(Y1 ≤ 1/2|Y2 = 1.5) = ∫_0^{1/2} (1/1.5) dy1 = (2/3)(1/2) = 1/3.
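The densities in this example can be checked by numerical integration (a sketch; the midpoint helper is ours, not from the slides):

```python
def midpoint(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# f2(y2) = (1/2)y2 is a valid density on [0, 2]:
area_f2 = midpoint(lambda y2: 0.5 * y2, 0, 2)
# f(y1 | y2) = 1/y2 integrates to 1 over [0, y2], here with y2 = 1.5:
area_cond = midpoint(lambda y1: 1 / 1.5, 0, 1.5)
# the requested probability P(Y1 <= 1/2 | Y2 = 1.5):
prob = midpoint(lambda y1: 1 / 1.5, 0, 0.5)
print(round(area_f2, 6), round(area_cond, 6), round(prob, 6))  # 1.0 1.0 0.333333
```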
for every pair of real numbers (y1 , y2 ). If Y1 and Y2 are not independent,
they are said to be dependent.
and
f2(y2) = { ∫_0^1 6y1 y2^2 dy1 = 3y2^2,   0 ≤ y2 ≤ 1,
         { ∫_{−∞}^{∞} 0 dy1 = 0,         elsewhere,
for all real numbers (y1 , y2 ), and, therefore, Y1 and Y2 are independent.
Independent Random Variables (6)
Example (continuous r. v.; dependent case): Given
f(y1, y2) = { 2,   0 ≤ y2 ≤ y1 ≤ 1,
            { 0,   elsewhere.
Are Y1 and Y2 dependent?
Solution:
We see f (y1 , y2 ) = 2 over the shaded region as shown.
Solution (continued):
Therefore,
f1(y1) = { ∫_0^{y1} 2 dy2 = [2y2]_0^{y1} = 2y1,   0 ≤ y1 ≤ 1 (since 0 ≤ y2 ≤ y1 ≤ 1),
         { 0,                                      elsewhere.
Similarly,
f2(y2) = { ∫_{y2}^1 2 dy1 = [2y1]_{y2}^1 = 2(1 − y2),   0 ≤ y2 ≤ 1,
         { 0,                                            elsewhere.
Note the lower limit in ∫_{y2}^1 2 dy1 because 0 ≤ y2 ≤ y1 ≤ 1.
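The dependence in this example can be seen numerically: the product of the marginals is 4y1(1 − y2), which differs from the joint density 2 on the support (a sketch; the midpoint helper is ours):

```python
def midpoint(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f1 = lambda y1: 2 * y1        # marginal density of Y1 on [0, 1]
f2 = lambda y2: 2 * (1 - y2)  # marginal density of Y2 on [0, 1]

# Both marginals integrate to 1:
a1, a2 = midpoint(f1, 0, 1), midpoint(f2, 0, 1)
print(round(a1, 6), round(a2, 6))  # 1.0 1.0

# At (0.5, 0.25), inside the support, f1*f2 = 1.5 while f = 2;
# since f != f1*f2, Y1 and Y2 are dependent.
print(f1(0.5) * f2(0.25))  # 1.5
```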
E[g(Y1, Y2, . . . , Yk)]
Let g(Y1, Y2, . . . , Yk) be a function of the discrete random variables
Y1, Y2, . . . , Yk, which have joint probability function p(y1, y2, . . . , yk). Then
E[g(Y1, Y2, . . . , Yk)] = Σ_{∀yk} · · · Σ_{∀y2} Σ_{∀y1} g(y1, y2, . . . , yk) p(y1, y2, . . . , yk).
Find V(Y1).
Solution: Note
f1(y1) = { ∫_0^1 2y1 dy2 = [2y1 y2]_0^1 = 2y1,   0 ≤ y1 ≤ 1,
         { 0,                                     elsewhere.
Solution:
E(Y1 Y2) = ∫_0^1 ∫_0^1 y1 y2 [2(1 − y1)] dy1 dy2 = 2 ∫_0^1 y1 (1 − y1) [∫_0^1 y2 dy2] dy1
The Expected Value of a Function of Random Variables (5)
Solution (continued):
= 2 ∫_0^1 y1 (1 − y1)(1/2) dy1 = ∫_0^1 (y1 − y1^2) dy1
= [y1^2/2 − y1^3/3]_0^1 = 1/2 − 1/3 = 1/6.
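The value 1/6 can be checked by numerical double integration (a sketch; as the computation above indicates, the joint density is taken to be f(y1, y2) = 2(1 − y1) on the unit square):

```python
# Numerically verify E(Y1*Y2) = 1/6 for f(y1, y2) = 2(1 - y1) on the unit square.
def dbl_midpoint(g, n=400):
    h = 1.0 / n
    return sum(
        g((i + 0.5) * h, (j + 0.5) * h)
        for i in range(n) for j in range(n)
    ) * h * h

e_y1y2 = dbl_midpoint(lambda y1, y2: y1 * y2 * 2 * (1 - y1))
print(round(e_y1y2, 4))  # 0.1667
```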
Find E(Y1 − Y2).
Solution: Apply the above theorem to get
Solution (continued):
E(Y2) = ∫_0^1 ∫_0^{y1} y2 (3y1) dy2 dy1 = ∫_0^1 3y1 [y2^2/2]_0^{y1} dy1
      = ∫_0^1 (3/2) y1^3 dy1 = (3/2) [y1^4/4]_0^1 = 3/8.
Therefore,
E(Y1 − Y2) = 3/4 − 3/8 = 3/8.
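A numerical check of E(Y1 − Y2) = 3/8 for the density f(y1, y2) = 3y1 on the triangle 0 ≤ y2 ≤ y1 ≤ 1 (a sketch using a midpoint rule with an indicator for the support):

```python
def expect(g, n=1000):
    """E[g(Y1, Y2)] under f(y1, y2) = 3*y1 on 0 <= y2 <= y1 <= 1 (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            y1, y2 = (i + 0.5) * h, (j + 0.5) * h
            if y2 <= y1:  # support of the density
                total += g(y1, y2) * 3 * y1
    return total * h * h

e_diff = expect(lambda y1, y2: y1 - y2)
print(round(e_diff, 3))  # approximately 0.375 = 3/8
```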
= [∫_{−∞}^{∞} g(y1) f1(y1) dy1] [∫_{−∞}^{∞} h(y2) f2(y2) dy2]
= E(g(Y1)) E(h(Y2)).
Solution (continued):
Now we find
E(Y1) = ∫_0^1 y1 [2(1 − y1)] dy1 = 2 ∫_0^1 (y1 − y1^2) dy1
      = 2 [y1^2/2 − y1^3/3]_0^1 = 2(1/2 − 1/3) = 2(3/6 − 2/6) = 1/3.
E(Y2) = ∫_0^1 y2 dy2 = [y2^2/2]_0^1 = 1/2.
Remarks: Y2 is uniformly distributed over (0, 1). Therefore,
Remarks:
The larger |Cov(Y1, Y2)| is, the greater the linear dependence. A positive
(negative) Cov(Y1, Y2) indicates a positive (negative) linear dependence.
Sometimes, Cov(Y1, Y2) is written as σ12.
Correlation
The correlation coefficient between Y1 and Y2 is defined as
ρ = σ12 / (σ1 σ2),
where σi is the standard deviation of Yi (i = 1, 2) and −1 ≤ ρ ≤ 1.
The Covariance of Two Random Variables (4)
Remarks:
Correlation is a “standardized” covariance.
Correlation is scale-free.
ρ = 1 indicates a perfect positive correlation.
ρ = −1 indicates a perfect negative correlation.
ρ = 0 indicates a zero correlation.
= ∫_0^1 3y1^2 [y2^2/2]_0^{y1} dy1 = (3/2) ∫_0^1 y1^4 dy1 = (3/2) [y1^5/5]_0^1 = 3/10.
Therefore,
Cov (Y1 , Y2 ) = 0.
Remarks:
The theorem can be established using the fact that if Y1 and Y2 are
independent, then E(Y1 Y2) = E(Y1)E(Y2).
The converse of this theorem is not generally true (it does hold for jointly
normally distributed random variables). That is, uncorrelated Y1 and
Y2 may not be independent random variables.
Example (continued):
f1(y1) = ∫_0^{y1} 3y1 dy2 = 3y1 [y2]_0^{y1} = 3y1^2,   0 ≤ y1 ≤ 1.
f2(y2) = ∫_{y2}^1 3y1 dy1 = (3/2) [y1^2]_{y2}^1 = (3/2)(1 − y2^2),   0 ≤ y2 ≤ 1.
It follows that
E(Y1^2) = ∫_0^1 3y1^4 dy1 = 3 [y1^5/5]_0^1 = 3/5.
E(Y2^2) = ∫_0^1 y2^2 (3/2)(1 − y2^2) dy2 = (3/2) [y2^3/3 − y2^5/5]_0^1 = (3/2)(1/3 − 1/5) = 1/5.
Now V(Y1) = 3/5 − (3/4)^2 = 3/5 − 9/16 = .6 − .5625 = .0375 and
V(Y2) = 1/5 − (3/8)^2 = .20 − .140625 = .059375. Therefore,
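The variance arithmetic can be double-checked with exact fractions (a sketch; E(Y1) = 3/4 and E(Y2) = 3/8 are the means found earlier for this density):

```python
from fractions import Fraction as F

e_y1, e_y2 = F(3, 4), F(3, 8)      # first moments, computed earlier
e_y1sq, e_y2sq = F(3, 5), F(1, 5)  # second moments from this slide

v_y1 = e_y1sq - e_y1 ** 2
v_y2 = e_y2sq - e_y2 ** 2
print(v_y1)  # 3/80 (= .0375)
print(v_y2)  # 19/320 (= .059375)
```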
Example:
Let Y be a random variable that follows the binomial distribution with
probability of success p and number of trials n. Assume that we get a
sample {Y1, Y2, . . . , Yn} with n = 10 and that we have the estimator
p̂ = Y/n. Please find the expected value and variance of p̂.
Solution: Recall that the expected value and variance of a binomially
distributed random variable are np and npq, where q = 1 − p, respectively.
E(p̂) = E(Y/n) = (1/n) E(Y) = np/n = p,
and
V(p̂) = (1/n^2) V(Y) = (1/n^2)(npq) = pq/n.
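E(p̂) = p and V(p̂) = pq/n can be verified by enumerating the binomial distribution exactly (a sketch; n = 10 comes from the example, while p = 0.3 is an arbitrary illustrative choice):

```python
from math import comb

n, p = 10, 0.3  # p = 0.3 is an arbitrary illustrative value
q = 1 - p

# Binomial pmf: P(Y = y) = C(n, y) * p**y * q**(n - y)
pmf = [comb(n, y) * p**y * q**(n - y) for y in range(n + 1)]

e_phat = sum((y / n) * pmf[y] for y in range(n + 1))
v_phat = sum((y / n - e_phat) ** 2 * pmf[y] for y in range(n + 1))

print(abs(e_phat - p) < 1e-12)          # True: E(p_hat) = p
print(abs(v_phat - p * q / n) < 1e-12)  # True: V(p_hat) = pq/n
```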
The Expected Value and Variance of Linear Functions of
Random Variables (7)
Example: An urn contains N balls, of which r are red and N − r are
black. A random sample of n balls is drawn without replacement. Let Y be
the number of red balls observed in the sample. Clearly, Y follows a
hypergeometric probability distribution; that is,
p(y) = [C(r, y) C(N − r, n − y)] / C(N, n).
Find the mean and variance of Y.
Solution:
Let
Xi = { 1, if the ith ball is red,
     { 0, otherwise.
Let
Y = Σ_{i=1}^{n} Xi.
The Expected Value and Variance of Linear Functions of
Random Variables (8)
Solution (continued):
To understand E (Y ) and V (Y ), we need to know E (Xi ), E (Xi2 ), E (Xi Xj )
for i ̸= j, and Cov (Xi , Xj ) for i ̸= j.
Clearly, P(X1 = 1) = r /N. Show P(X2 = 1) = r /N:
Solution (continued):
Cov(Xi, Xj) = E(Xi Xj) − E(Xi)E(Xj)
            = r(r − 1)/(N(N − 1)) − (r/N)^2 = (r/N)[(r − 1)/(N − 1) − r/N]
            = (r/N)[(Nr − N − r(N − 1))/(N(N − 1))] = −(r/N)[(N − r)/(N(N − 1))]
            = −(r/N)(1 − r/N)(1/(N − 1)).
The second equality in the above uses the following idea, where n = 3:
      1  2  3
  1   ◦  ·  ·
  2   ·  ◦  ·
  3   ·  ·  ◦
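The covariance formula can be checked by brute-force enumeration of ordered draws without replacement (a sketch with illustrative values N = 6, r = 3):

```python
from fractions import Fraction as F
from itertools import permutations

N, r = 6, 3                      # illustrative urn: 6 balls, 3 red
balls = [1] * r + [0] * (N - r)  # 1 = red, 0 = black

# All ordered pairs of distinct balls = all outcomes of draws i and j (i != j)
pairs = list(permutations(balls, 2))
e_xi = F(sum(x1 for x1, _ in pairs), len(pairs))          # E(Xi) = r/N
e_xixj = F(sum(x1 * x2 for x1, x2 in pairs), len(pairs))  # E(Xi Xj)

cov = e_xixj - e_xi ** 2
formula = -F(r, N) * (1 - F(r, N)) * F(1, N - 1)
assert cov == formula
print(cov)  # -1/20
```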
This is perhaps the most important distribution that you will use. We only
introduce the bivariate (k = 2) normal distribution. The idea can be
extended to the multivariate (k > 2) normal distribution.
Consider k = 2 continuous random variables Y1 and Y2 with the bivariate
(joint) density function:
f(y1, y2) = e^{−Q/2} / (2π σ1 σ2 √(1 − ρ^2)),   −∞ < y1 < ∞, −∞ < y2 < ∞,
where
Q = [1/(1 − ρ^2)] [ (y1 − µ1)^2/σ1^2 − 2ρ (y1 − µ1)(y2 − µ2)/(σ1 σ2) + (y2 − µ2)^2/σ2^2 ].
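A sketch implementing this density in Python; when ρ = 0 it factors into the product of two univariate normal densities, which the check below confirms at one point (all parameter values are made up for illustration):

```python
from math import exp, pi, sqrt

def bivariate_normal_pdf(y1, y2, mu1, mu2, s1, s2, rho):
    """Bivariate normal density as defined above."""
    z1, z2 = (y1 - mu1) / s1, (y2 - mu2) / s2
    q = (z1**2 - 2 * rho * z1 * z2 + z2**2) / (1 - rho**2)
    return exp(-q / 2) / (2 * pi * s1 * s2 * sqrt(1 - rho**2))

def normal_pdf(y, mu, s):
    """Univariate normal density."""
    return exp(-((y - mu) / s) ** 2 / 2) / (s * sqrt(2 * pi))

# With rho = 0 the joint density factors: f(y1, y2) = f1(y1) * f2(y2).
f_joint = bivariate_normal_pdf(0.3, -1.2, 0.0, -1.0, 1.0, 2.0, 0.0)
f_prod = normal_pdf(0.3, 0.0, 1.0) * normal_pdf(-1.2, -1.0, 2.0)
assert abs(f_joint - f_prod) < 1e-12
print(f_joint)
```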
Conditional Expectation
If Y1 and Y2 are any two random variables, the conditional expectation of
g(Y1), given that Y2 = y2, is defined to be
E(g(Y1)|Y2 = y2) = ∫_{−∞}^{∞} g(y1) f(y1|y2) dy1.
Therefore,
E(Y1|Y2 = y2) = ∫_{−∞}^{∞} y1 f(y1|y2) dy1 = ∫_0^{y2} y1 (1/y2) dy1
              = (1/y2) [y1^2/2]_0^{y2} = y2/2.
Remarks:
E(Y1) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y1 f(y1, y2) dy1 dy2
      = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y1 f(y1|y2) f2(y2) dy1 dy2
      = ∫_{−∞}^{∞} [∫_{−∞}^{∞} y1 f(y1|y2) dy1] f2(y2) dy2
      = ∫_{−∞}^{∞} E(Y1|Y2 = y2) f2(y2) dy2 = E[E(Y1|Y2)].
Example: Let n = 10 be the sample for quality control per day, Y be the
number of defectives, and p be the probability of observing a defective.
Y ∼ Bin(n, p), which has mean np and variance npq, where q = 1 − p. It
is known that E (Y ) = E [E (Y |p)]. But p is random and has a uniform
(U) distribution on the interval from 0 to 1/4. Find E (Y ).
Solution:
For p ∼ U(0, 1/4), E(p) = (1/4 − 0)/2 = 1/8 and V(p) = (1/4 − 0)^2/12 = 1/192.
Applying the theorem for E(Y1) = E[E(Y1|Y2)], we get
E(Y) = E[E(Y|p)] = E(np) = nE(p) = n[(1/4 − 0)/2] = n/8.
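A small simulation sketch of this two-stage model (the seed and number of trials are arbitrary choices; the agreement is only approximate because this is Monte Carlo):

```python
import random

random.seed(0)  # fixed seed for reproducibility
n, trials = 10, 200_000

total = 0
for _ in range(trials):
    p = random.uniform(0, 0.25)                     # p ~ U(0, 1/4)
    y = sum(random.random() < p for _ in range(n))  # Y | p ~ Bin(10, p)
    total += y

print(total / trials)  # approximately n/8 = 1.25
```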
Remarks: The justification for the above theorem uses the theorem for
E(Y1) = E[E(Y1|Y2)]. Let's see how. Using the concept of variance, we
write
V(Y1|Y2) = E(Y1^2|Y2) − [E(Y1|Y2)]^2.
Taking the expectation of the above, we get
E[V(Y1|Y2)] = E[E(Y1^2|Y2)] − E{[E(Y1|Y2)]^2}.
The variance of Y1 is
V(Y1) = E(Y1^2) − [E(Y1)]^2
      = E[E(Y1^2|Y2)] − {E[E(Y1|Y2)]}^2
      = E[E(Y1^2|Y2)] − E{[E(Y1|Y2)]^2} + E{[E(Y1|Y2)]^2} − {E[E(Y1|Y2)]}^2   (add and subtract E{[E(Y1|Y2)]^2})
      = E[V(Y1|Y2)] + V[E(Y1|Y2)],
where the first pair of terms is E[V(Y1|Y2)] and the second pair is V[E(Y1|Y2)].
Example: Let n = 10 be the sample for quality control per day, Y be the
number of defectives, and p be the probability of observing a defective.
Y ∼ Bin(n, p), which has mean np and variance npq, where q = 1 − p. It
is known that E (Y ) = E [E (Y |p)]. But p is random and has a uniform
(U) distribution on the interval from 0 to 1/4. Find V (Y ).
Solution: Apply the theorem for V(Y1) = E[V(Y1|Y2)] + V[E(Y1|Y2)],
where Y1 = Y and Y2 = p. We have
V(Y) = E(V(Y|p)) + V(E(Y|p)) = E(npq) + V(np) = nE[p(1 − p)] + n^2 V(p).
E(p) = 1/8, V(p) = 1/192, E(p^2) = V(p) + [E(p)]^2 = 1/192 + (1/8)^2 = 1/192 + 1/64 = 1/48.
Solution (continued):
V(Y) = n(1/8 − 1/48) + n^2 (1/192) = 5n/48 + n^2/192.
For n = 10,
V(Y) = 50/48 + 100/192 = 1.5625.
SD(Y) = √V(Y) = √1.5625 = 1.25.
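The final numbers can be confirmed exactly (a sketch using the law of total variance with the moments of p derived above):

```python
from fractions import Fraction as F

n = 10
e_p, e_p2 = F(1, 8), F(1, 48)  # E(p), E(p^2) for p ~ U(0, 1/4)
v_p = e_p2 - e_p ** 2          # = 1/192

# V(Y) = n * E[p(1 - p)] + n^2 * V(p)
v_y = n * (e_p - e_p2) + n**2 * v_p
print(v_y)                # 25/16 = 1.5625
print(float(v_y) ** 0.5)  # 1.25
```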