CH 5 Slides
Kuan Xu
University of Toronto
kuan.xu@utoronto.ca
1 Introduction
2 Bivariate and Multivariate Probability Distributions
3 Marginal and Conditional Probability Distributions
4 Independent Random Variables
5 The Expected Value of a Function of Random Variables
6 Special Theorems
7 The Covariance of Two Random Variables
8 The Expected Value and Variance of Linear Functions of Random Variables
9 The Bivariate Normal Distribution
10 Conditional Expectations
Example: Toss a pair of dice. The mn rule tells us there are 36 sample
points.
We note
1. p(y1, y2) ≥ 0.
2. Σ_{i,j} p(yi, yj) = 1.
            Y1
            2        3
Y2   1   p(2, 1)  p(3, 1)
     2   p(2, 2)  p(3, 2)
Table
{1, 1}, {1, 2}, {1, 3}, . . . , {3, 1}, {3, 2}, {3, 3}
             y1
             0                 1                 2
y2   0   {3, 3}            {1, 3} or {3, 1}  {1, 1}
     1   {3, 2} or {2, 3}  {1, 2} or {2, 1}  NA
     2   {2, 2}            NA                NA
Table: Events for Y1 and Y2
           y1
           0     1     2
y2   0   1/9   2/9   1/9
     1   2/9   2/9   0
     2   1/9   0     0
Table: Joint Probability Function
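As a quick check, the joint table above satisfies both defining properties of a joint probability function (a sketch using Python's fractions module; the cell values are copied from the table):

```python
from fractions import Fraction

# Joint probability function from the table: p[(y1, y2)]
p = {
    (0, 0): Fraction(1, 9), (1, 0): Fraction(2, 9), (2, 0): Fraction(1, 9),
    (0, 1): Fraction(2, 9), (1, 1): Fraction(2, 9), (2, 1): Fraction(0),
    (0, 2): Fraction(1, 9), (1, 2): Fraction(0),    (2, 2): Fraction(0),
}

# Property 1: every p(y1, y2) is nonnegative.
assert all(v >= 0 for v in p.values())

# Property 2: the probabilities sum to 1.
assert sum(p.values()) == 1

# F(2, 3) = P(Y1 <= 2, Y2 <= 3) covers every cell of the table, so it equals 1.
print(sum(v for (y1, y2), v in p.items() if y1 <= 2 and y2 <= 3))  # 1
```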
F (2, 3) = P(Y1 ≤ 2, Y2 ≤ 3)
Remarks on part 3:
Figure: f (y1 , y2 ) [figure omitted: the (y1, y2) plane with axes from −∞, marking F(y1, y2) at (y1, y2) and F(y1*, y2*) at (y1*, y2*)]
Kuan Xu (UofT) ECO 227 January 22, 2024 15 / 92
Bivariate and Multivariate Probability Distributions (13)
b.
F(.2, .4) = ∫_0^{.4} ∫_0^{.2} (1) dy1 dy2
          = ∫_0^{.4} [y1]_0^{.2} dy2
          = ∫_0^{.4} .2 dy2
          = .08.
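The double integral can be confirmed numerically; the sketch below uses a simple midpoint rule for the uniform density f(y1, y2) = 1 on the unit square (the helper name dbl_midpoint is ours, not from the slides):

```python
# Numerically verify F(.2, .4) = P(Y1 <= .2, Y2 <= .4) = .08
# for the uniform density f(y1, y2) = 1 on the unit square.

def dbl_midpoint(f, a, b, c, d, n=200):
    """Midpoint-rule approximation of the double integral of f over [a,b] x [c,d]."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
    return total * hx * hy

F_02_04 = dbl_midpoint(lambda y1, y2: 1.0, 0, .2, 0, .4)
print(round(F_02_04, 6))  # 0.08
```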
Solution (continued):
c.
P(.1 ≤ Y1 ≤ .3, 0 ≤ Y2 ≤ .5) = ∫_0^{.5} ∫_{.1}^{.3} (1) dy1 dy2
                             = ∫_0^{.5} [y1]_{.1}^{.3} dy2
                             = ∫_0^{.5} (.2) dy2
                             = [.2 y2]_0^{.5}
                             = .1.
Figure: f (y1 , y2 )
2 Find P(0 ≤ Y1 ≤ .5, Y2 > .25). Note (0 ≤ Y2 ≤ Y1 ≤ 1), (Y2 > .25), and (0 ≤ Y1 ≤ .5) ⇒ (.25 ≤ Y1 ≤ .5)
and (.25 ≤ Y2 ≤ Y1 ).
2 (continued)
P(0 ≤ Y1 ≤ .5, Y2 > .25) = ∫_{1/4}^{1/2} ∫_{1/4}^{y1} 3y1 dy2 dy1
                         = ∫_{1/4}^{1/2} 3y1 [y2]_{1/4}^{y1} dy1
                         = ∫_{1/4}^{1/2} 3y1 (y1 − 1/4) dy1
                         = ∫_{1/4}^{1/2} (3y1^2 − (3/4)y1) dy1
                         = [y1^3 − (3/8)y1^2]_{1/4}^{1/2}
                         = [(1/8) − (3/8)(1/4)] − [(1/64) − (3/8)(1/16)]
                         = [1/32] − [2/128 − 3/128]
                         = [4/128] + [1/128]
                         = 5/128.
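The answer 5/128 = .0390625 can be verified numerically with a midpoint rule over the support of the density (a sketch; the grid size n is an arbitrary accuracy choice):

```python
# Numerically verify P(0 <= Y1 <= .5, Y2 > .25) = 5/128 = .0390625
# for the density f(y1, y2) = 3*y1 on 0 <= y2 <= y1 <= 1.

def prob(n=1000):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            y1, y2 = (i + 0.5) * h, (j + 0.5) * h
            # indicator of the event intersected with the support
            if y2 <= y1 <= 0.5 and y2 > 0.25:
                total += 3 * y1
    return total * h * h

p_est = prob()
print(p_est)  # approximately 0.0390625
```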
and
P(Y2 = y2) = p2(y2) = Σ_{y1=1}^{6} p(y1, y2).
           y1
           0      1      2     Total
y2   0     0    3/15   3/15    6/15
P(Y1 = 0, Y2 = 0) = p(0, 0) = 0.
Let us look at P(Y1 = 1, Y2 = 0). We use the hypergeometric probability distribution here.
P(Y1 = 1, Y2 = 0) = p(1, 0)
                  = [C(3, 1) C(2, 0) C(1, 1)] / C(6, 2)
                  = 3/15.
Here 0! = 1. Similarly,
P(Y1 = 2, Y2 = 0) = p(2, 0)
                  = [C(3, 2) C(2, 0) C(1, 0)] / C(6, 2)
                  = 3/15.
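These hypergeometric cell probabilities can be reproduced with Python's math.comb (a sketch; it assumes, per the coefficients above, a sample of 2 from 6 balls of three types with counts 3, 2, and 1):

```python
from fractions import Fraction
from math import comb

def p(y1, y2):
    """Joint probability of y1 type-1 balls and y2 type-2 balls
    (hence 2 - y1 - y2 type-3 balls) in a sample of 2 from 3 + 2 + 1 = 6."""
    y3 = 2 - y1 - y2
    if y3 < 0:
        return Fraction(0)
    return Fraction(comb(3, y1) * comb(2, y2) * comb(1, y3), comb(6, 2))

print(p(1, 0))  # 1/5 (i.e., 3/15)
print(p(2, 0))  # 1/5 (i.e., 3/15)
```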
We could fill other cells of the table. Note the last row of the table offers the marginal distribution of Y1 . Similarly, the last
column of the table offers the marginal distribution of Y2 .
Figure: f (y1 , y2 )
P(A ∩ B) = P(A)P(B|A).
Examples: Please find P(Y1 = 0|Y2 = 1) and P(Y1 = 1|Y2 = 1).
Solution:
P(Y1 = 0|Y2 = 1) = (2/15) / (8/15) = 1/4.
P(Y1 = 1|Y2 = 1) = (6/15) / (8/15) = 3/4.
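The arithmetic can be confirmed with exact fractions (a sketch; the cell probabilities 2/15 and 6/15 and the marginal 8/15 are those quoted in the slide):

```python
from fractions import Fraction as F

# Conditional probability: P(Y1 = y1 | Y2 = y2) = p(y1, y2) / p2(y2)
p_0_1 = F(2, 15)   # p(0, 1) from the joint table
p_1_1 = F(6, 15)   # p(1, 1)
p2_1 = F(8, 15)    # marginal p2(1) = p(0, 1) + p(1, 1)

assert p_0_1 + p_1_1 == p2_1  # the marginal is the row sum
print(p_0_1 / p2_1)  # 1/4
print(p_1_1 / p2_1)  # 3/4
```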
Marginal and Conditional Probability Distributions (11)
In addition, by definition,
F(y1) = ∫_{−∞}^{y1} f1(t1) dt1 = ∫_{−∞}^{y1} [∫_{−∞}^{∞} f(t1, y2) dy2] dt1.
These imply
F(y1|y2) f2(y2) = ∫_{−∞}^{y1} f(t1, y2) dt1.
⇒ F(y1|y2) = ∫_{−∞}^{y1} [f(t1, y2)/f2(y2)] dt1 = ∫_{−∞}^{y1} f(t1|y2) dt1, f2(y2) > 0,
where the last equality defines the link between the conditional distribution
function F(y1|y2) and the conditional density function f(y1|y2).
Marginal and Conditional Probability Distributions (13)
Please note that f(y1|y2) [or f(y2|y1)] is undefined for all y2 [or y1] such
that f2(y2) = 0 [or f1(y1) = 0]. In other words, f(y1|y2) [or f(y2|y1)] exists if
f2(y2) ̸= 0 [or f1(y1) ̸= 0].
(The points (y1, y2) are uniformly distributed over the triangle with the
given boundaries.) Find the conditional density function of Y1 given
Y2 = y2, f(y1|y2), and P(Y1 ≤ 1/2|Y2 = 1.5).
Solution: Find f2(y2).
f2(y2) = { ∫_0^{y2} (1/2) dy1 = (1/2)y2,   0 ≤ y2 ≤ 2,
         { ∫_{−∞}^{∞} 0 dy1 = 0,           elsewhere.
Remarks: f2(y2) > 0 if and only if 0 < y2 ≤ 2. Thus, for any 0 < y2 ≤ 2,
f(y1|y2) = f(y1, y2)/f2(y2) = (1/2)/((1/2)y2) = 1/y2,   0 ≤ y1 ≤ y2.
Therefore,
P(Y1 ≤ 1/2|Y2 = 1.5) = ∫_0^{1/2} (1/1.5) dy1 = (2/3)(1/2) = 1/3.
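The densities in this example can be checked by numerical integration (a sketch; the midpoint helper is ours, not from the slides):

```python
def midpoint(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# f2(y2) = (1/2)y2 is a valid density on [0, 2]:
area_f2 = midpoint(lambda y2: 0.5 * y2, 0, 2)
# f(y1 | y2) = 1/y2 integrates to 1 over [0, y2], here with y2 = 1.5:
area_cond = midpoint(lambda y1: 1 / 1.5, 0, 1.5)
# the requested probability P(Y1 <= 1/2 | Y2 = 1.5):
prob = midpoint(lambda y1: 1 / 1.5, 0, 0.5)
print(round(area_f2, 6), round(area_cond, 6), round(prob, 6))  # 1.0 1.0 0.333333
```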
for every pair of real numbers (y1 , y2 ). If Y1 and Y2 are not independent,
they are said to be dependent.
and
f2(y2) = { ∫_0^1 6y1 y2^2 dy1 = 3y2^2,   0 ≤ y2 ≤ 1,
         { ∫_{−∞}^{∞} 0 dy1 = 0,         elsewhere,
for all real numbers (y1 , y2 ), and, therefore, Y1 and Y2 are independent.
Independent Random Variables (6)
Example (continuous r. v.; dependent case): Given
f(y1, y2) = { 2,   0 ≤ y2 ≤ y1 ≤ 1,
            { 0,   elsewhere.
Are Y1 and Y2 dependent?
Solution:
We see f (y1 , y2 ) = 2 over the shaded region as shown.
Solution (continued):
Therefore,
f1(y1) = { ∫_0^{y1} 2 dy2 = [2y2]_0^{y1} = 2y1,   0 ≤ y1 ≤ 1 (since 0 ≤ y2 ≤ y1 ≤ 1),
         { 0,                                      elsewhere.
Similarly,
f2(y2) = { ∫_{y2}^1 2 dy1 = [2y1]_{y2}^1 = 2(1 − y2),   0 ≤ y2 ≤ 1,
         { 0,                                            elsewhere.
Note the lower limit in ∫_{y2}^1 2 dy1 because 0 ≤ y2 ≤ y1 ≤ 1.
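The dependence in this example can be seen numerically: the product of the marginals is 4y1(1 − y2), which differs from the joint density 2 on the support (a sketch; the midpoint helper is ours):

```python
def midpoint(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f1 = lambda y1: 2 * y1        # marginal density of Y1 on [0, 1]
f2 = lambda y2: 2 * (1 - y2)  # marginal density of Y2 on [0, 1]

# Both marginals integrate to 1:
a1, a2 = midpoint(f1, 0, 1), midpoint(f2, 0, 1)
print(round(a1, 6), round(a2, 6))  # 1.0 1.0

# At (0.5, 0.25), inside the support, f1*f2 = 1.5 while f = 2;
# since f != f1*f2, Y1 and Y2 are dependent.
print(f1(0.5) * f2(0.25))  # 1.5
```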
E[g(Y1, Y2, . . . , Yk)]
Let g(Y1, Y2, . . . , Yk) be a function of the discrete random variables
Y1, Y2, . . . , Yk, which have joint probability function p(y1, y2, . . . , yk). Then
E[g(Y1, Y2, . . . , Yk)] = Σ_{∀yk} · · · Σ_{∀y2} Σ_{∀y1} g(y1, y2, . . . , yk) p(y1, y2, . . . , yk).
Find V(Y1).
Solution: Note
f1(y1) = { ∫_0^1 2y1 dy2 = [2y1 y2]_0^1 = 2y1,   0 ≤ y1 ≤ 1,
         { 0,                                     elsewhere.
Solution:
E(Y1 Y2) = ∫_0^1 ∫_0^1 y1 y2 [2(1 − y1)] dy1 dy2 = 2 ∫_0^1 y1 (1 − y1) [∫_0^1 y2 dy2] dy1
The Expected Value of a Function of Random Variables (5)
Solution (continued):
= 2 ∫_0^1 y1 (1 − y1)(1/2) dy1 = ∫_0^1 (y1 − y1^2) dy1
= [y1^2/2 − y1^3/3]_0^1 = 1/2 − 1/3 = 1/6.
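The value 1/6 can be checked by numerical double integration (a sketch; as the computation above indicates, the joint density is taken to be f(y1, y2) = 2(1 − y1) on the unit square):

```python
# Numerically verify E(Y1*Y2) = 1/6 for f(y1, y2) = 2(1 - y1) on the unit square.
def dbl_midpoint(g, n=400):
    h = 1.0 / n
    return sum(
        g((i + 0.5) * h, (j + 0.5) * h)
        for i in range(n) for j in range(n)
    ) * h * h

e_y1y2 = dbl_midpoint(lambda y1, y2: y1 * y2 * 2 * (1 - y1))
print(round(e_y1y2, 4))  # 0.1667
```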
Find E(Y1 − Y2).
Solution: Apply the above theorem to get
Solution (continued):
E(Y2) = ∫_0^1 ∫_0^{y1} y2 (3y1) dy2 dy1 = ∫_0^1 3y1 [y2^2/2]_0^{y1} dy1
      = ∫_0^1 (3/2) y1^3 dy1 = (3/2) [y1^4/4]_0^1 = 3/8.
Therefore,
E(Y1 − Y2) = 3/4 − 3/8 = 3/8.
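A numerical check of E(Y1 − Y2) = 3/8 for the density f(y1, y2) = 3y1 on the triangle 0 ≤ y2 ≤ y1 ≤ 1 (a sketch using a midpoint rule with an indicator for the support):

```python
def expect(g, n=1000):
    """E[g(Y1, Y2)] under f(y1, y2) = 3*y1 on 0 <= y2 <= y1 <= 1 (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            y1, y2 = (i + 0.5) * h, (j + 0.5) * h
            if y2 <= y1:  # support of the density
                total += g(y1, y2) * 3 * y1
    return total * h * h

e_diff = expect(lambda y1, y2: y1 - y2)
print(round(e_diff, 3))  # approximately 0.375 = 3/8
```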
= [∫_{−∞}^{∞} g(y1) f1(y1) dy1] [∫_{−∞}^{∞} h(y2) f2(y2) dy2]
= E(g(Y1)) E(h(Y2)).
Solution (continued):
Now we find
E(Y1) = ∫_0^1 y1 [2(1 − y1)] dy1 = 2 ∫_0^1 (y1 − y1^2) dy1
      = 2 [y1^2/2 − y1^3/3]_0^1 = 2(1/2 − 1/3) = 2(3/6 − 2/6) = 1/3.
E(Y2) = ∫_0^1 y2 dy2 = [y2^2/2]_0^1 = 1/2.
Remarks: Y2 is uniformly distributed over (0, 1). Therefore,
Remarks:
The larger |Cov(Y1, Y2)| is, the greater the linear dependence. A positive
(negative) Cov(Y1, Y2) indicates a positive (negative) linear dependence.
Sometimes, Cov(Y1, Y2) is written as σ12.
Correlation
The correlation coefficient between Y1 and Y2 is defined as
ρ = σ12 / (σ1 σ2),
where σi is the standard deviation of Yi (i = 1, 2) and −1 ≤ ρ ≤ 1.
The Covariance of Two Random Variables (4)
Remarks:
Correlation is a “standardized” covariance.
Correlation is scale-free.
ρ = 1 indicates a perfect positive correlation.
ρ = −1 indicates a perfect negative correlation.
ρ = 0 indicates a zero correlation.
= ∫_0^1 3y1^2 [y2^2/2]_0^{y1} dy1 = (3/2) ∫_0^1 y1^4 dy1 = (3/2) [y1^5/5]_0^1 = 3/10.
Therefore,
Cov (Y1 , Y2 ) = 0.
Remarks:
The theorem can be established using the fact that if Y1 and Y2 are
independent, then E(Y1 Y2) = E(Y1)E(Y2).
The converse of this theorem is not generally true (it does hold for jointly
normally distributed random variables). That is, uncorrelated Y1 and
Y2 may not be independent random variables.
Example (continued):
f1(y1) = ∫_0^{y1} 3y1 dy2 = 3y1 [y2]_0^{y1} = 3y1^2,   0 ≤ y1 ≤ 1.
f2(y2) = ∫_{y2}^1 3y1 dy1 = (3/2) [y1^2]_{y2}^1 = (3/2)(1 − y2^2),   0 ≤ y2 ≤ 1.
It follows that
E(Y1^2) = ∫_0^1 3y1^4 dy1 = 3 [y1^5/5]_0^1 = 3/5.
E(Y2^2) = ∫_0^1 y2^2 (3/2)(1 − y2^2) dy2 = (3/2) [y2^3/3 − y2^5/5]_0^1 = (3/2)(1/3 − 1/5) = 1/5.
Now V(Y1) = 3/5 − (3/4)^2 = 3/5 − 9/16 = .6 − .5625 = .0375 and
V(Y2) = 1/5 − (3/8)^2 = .20 − .140625 = .059375. Therefore,
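The variance arithmetic can be double-checked with exact fractions (a sketch; E(Y1) = 3/4 and E(Y2) = 3/8 are the means found earlier for this density):

```python
from fractions import Fraction as F

e_y1, e_y2 = F(3, 4), F(3, 8)      # first moments, computed earlier
e_y1sq, e_y2sq = F(3, 5), F(1, 5)  # second moments from this slide

v_y1 = e_y1sq - e_y1 ** 2
v_y2 = e_y2sq - e_y2 ** 2
print(v_y1)  # 3/80 (= .0375)
print(v_y2)  # 19/320 (= .059375)
```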
Example:
Let Y be a random variable that follows the binomial distribution with
probability of success p and number of trials n. Assume that we get a
sample {Y1, Y2, . . . , Yn} with n = 10 and that we have the estimator
p̂ = Y/n. Please find the expected value and variance of p̂.
Solution: Recall that the expected value and variance of a binomially
distributed random variable are np and npq, where q = 1 − p, respectively.
E(p̂) = E(Y/n) = (1/n) E(Y) = np/n = p,
and
V(p̂) = (1/n^2) V(Y) = (1/n^2)(npq) = pq/n.
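E(p̂) = p and V(p̂) = pq/n can be verified by enumerating the binomial distribution exactly (a sketch; n = 10 comes from the example, while p = 0.3 is an arbitrary illustrative choice):

```python
from math import comb

n, p = 10, 0.3  # p = 0.3 is an arbitrary illustrative value
q = 1 - p

# Binomial pmf: P(Y = y) = C(n, y) * p**y * q**(n - y)
pmf = [comb(n, y) * p**y * q**(n - y) for y in range(n + 1)]

e_phat = sum((y / n) * pmf[y] for y in range(n + 1))
v_phat = sum((y / n - e_phat) ** 2 * pmf[y] for y in range(n + 1))

print(abs(e_phat - p) < 1e-12)          # True: E(p_hat) = p
print(abs(v_phat - p * q / n) < 1e-12)  # True: V(p_hat) = pq/n
```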
The Expected Value and Variance of Linear Functions of
Random Variables (7)
Example: An urn contains N balls, of which r are red and N − r are
black. A random sample of n balls is drawn without replacement. Let Y be
the number of red balls observed in the sample. Clearly, Y follows a
hypergeometric probability distribution; that is,
p(y) = [C(r, y) C(N − r, n − y)] / C(N, n).
Find the mean and variance of Y.
Solution:
Let
Xi = { 1, if the ith ball is red,
     { 0, otherwise.
Let
Y = Σ_{i=1}^{n} Xi.
The Expected Value and Variance of Linear Functions of
Random Variables (8)
Solution (continued):
To understand E (Y ) and V (Y ), we need to know E (Xi ), E (Xi2 ), E (Xi Xj )
for i ̸= j, and Cov (Xi , Xj ) for i ̸= j.
Clearly, P(X1 = 1) = r /N. Show P(X2 = 1) = r /N:
Solution (continued):
Cov(Xi, Xj) = E(Xi Xj) − E(Xi)E(Xj)
            = r(r − 1)/(N(N − 1)) − (r/N)^2 = (r/N)[(r − 1)/(N − 1) − r/N]
            = (r/N)[(Nr − N − r(N − 1))/(N(N − 1))] = −(r/N)[(N − r)/(N(N − 1))]
            = −(r/N)(1 − r/N)(1/(N − 1)).
The second equality in the above uses the following idea, where n = 3:
      1  2  3
  1   ◦  ·  ·
  2   ·  ◦  ·
  3   ·  ·  ◦
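The covariance formula can be checked by brute-force enumeration of ordered draws without replacement (a sketch with illustrative values N = 6, r = 3):

```python
from fractions import Fraction as F
from itertools import permutations

N, r = 6, 3                      # illustrative urn: 6 balls, 3 red
balls = [1] * r + [0] * (N - r)  # 1 = red, 0 = black

# All ordered pairs of distinct balls = all outcomes of draws i and j (i != j)
pairs = list(permutations(balls, 2))
e_xi = F(sum(x1 for x1, _ in pairs), len(pairs))          # E(Xi) = r/N
e_xixj = F(sum(x1 * x2 for x1, x2 in pairs), len(pairs))  # E(Xi Xj)

cov = e_xixj - e_xi ** 2
formula = -F(r, N) * (1 - F(r, N)) * F(1, N - 1)
assert cov == formula
print(cov)  # -1/20
```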
This is perhaps the most important distribution that you will use. We only
introduce the bivariate (k = 2) normal distribution. The idea can be
extended to the multivariate (k > 2) normal distribution.
Consider k = 2 continuous random variables Y1 and Y2 with the bivariate
(joint) density function:
f(y1, y2) = e^{−Q/2} / (2π σ1 σ2 √(1 − ρ^2)),   −∞ < y1 < ∞, −∞ < y2 < ∞,
where
Q = [1/(1 − ρ^2)] [ (y1 − µ1)^2/σ1^2 − 2ρ (y1 − µ1)(y2 − µ2)/(σ1 σ2) + (y2 − µ2)^2/σ2^2 ].
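A sketch implementing this density in Python; when ρ = 0 it factors into the product of two univariate normal densities, which the check below confirms at one point (all parameter values are made up for illustration):

```python
from math import exp, pi, sqrt

def bivariate_normal_pdf(y1, y2, mu1, mu2, s1, s2, rho):
    """Bivariate normal density as defined above."""
    z1, z2 = (y1 - mu1) / s1, (y2 - mu2) / s2
    q = (z1**2 - 2 * rho * z1 * z2 + z2**2) / (1 - rho**2)
    return exp(-q / 2) / (2 * pi * s1 * s2 * sqrt(1 - rho**2))

def normal_pdf(y, mu, s):
    """Univariate normal density."""
    return exp(-((y - mu) / s) ** 2 / 2) / (s * sqrt(2 * pi))

# With rho = 0 the joint density factors: f(y1, y2) = f1(y1) * f2(y2).
f_joint = bivariate_normal_pdf(0.3, -1.2, 0.0, -1.0, 1.0, 2.0, 0.0)
f_prod = normal_pdf(0.3, 0.0, 1.0) * normal_pdf(-1.2, -1.0, 2.0)
assert abs(f_joint - f_prod) < 1e-12
print(f_joint)
```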
Conditional Expectation
If Y1 and Y2 are any two random variables, the conditional expectation of
g(Y1), given that Y2 = y2, is defined to be
E(g(Y1)|Y2 = y2) = ∫_{−∞}^{∞} g(y1) f(y1|y2) dy1.
Therefore,
E(Y1|Y2 = y2) = ∫_{−∞}^{∞} y1 f(y1|y2) dy1 = ∫_0^{y2} y1 (1/y2) dy1
              = (1/y2) [y1^2/2]_0^{y2} = y2/2.
Remarks:
E(Y1) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y1 f(y1, y2) dy1 dy2
      = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y1 f(y1|y2) f2(y2) dy1 dy2
      = ∫_{−∞}^{∞} [∫_{−∞}^{∞} y1 f(y1|y2) dy1] f2(y2) dy2
      = ∫_{−∞}^{∞} E(Y1|Y2 = y2) f2(y2) dy2 = E[E(Y1|Y2)].
Example: Let n = 10 be the sample for quality control per day, Y be the
number of defectives, and p be the probability of observing a defective.
Y ∼ Bin(n, p), which has mean np and variance npq, where q = 1 − p. It
is known that E (Y ) = E [E (Y |p)]. But p is random and has a uniform
(U) distribution on the interval from 0 to 1/4. Find E (Y ).
Solution:
For p ∼ U(0, 1/4), E(p) = (1/4 − 0)/2 = 1/8 and V(p) = (1/4 − 0)^2/12 = 1/192.
Applying the theorem for E(Y1) = E[E(Y1|Y2)], we get
E(Y) = E[E(Y|p)] = E(np) = nE(p) = n[(1/4 − 0)/2] = n/8.
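A small simulation sketch of this two-stage model (the seed and number of trials are arbitrary choices; the agreement is only approximate because this is Monte Carlo):

```python
import random

random.seed(0)  # fixed seed for reproducibility
n, trials = 10, 200_000

total = 0
for _ in range(trials):
    p = random.uniform(0, 0.25)                     # p ~ U(0, 1/4)
    y = sum(random.random() < p for _ in range(n))  # Y | p ~ Bin(10, p)
    total += y

print(total / trials)  # approximately n/8 = 1.25
```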
Remarks: The justification for the above theorem uses the theorem for
E(Y1) = E[E(Y1|Y2)]. Let's see how. Using the concept of variance, we
write
V(Y1|Y2) = E(Y1^2|Y2) − [E(Y1|Y2)]^2.
Taking the expectation of the above, we get
E[V(Y1|Y2)] = E[E(Y1^2|Y2)] − E{[E(Y1|Y2)]^2}.
The variance of Y1 is
V(Y1) = E(Y1^2) − [E(Y1)]^2
      = E[E(Y1^2|Y2)] − {E[E(Y1|Y2)]}^2
      = E[E(Y1^2|Y2)] − E{[E(Y1|Y2)]^2} + E{[E(Y1|Y2)]^2} − {E[E(Y1|Y2)]}^2   (add and subtract E{[E(Y1|Y2)]^2})
      = E[V(Y1|Y2)] + V[E(Y1|Y2)],
where the first pair of terms is E[V(Y1|Y2)] and the second pair is V[E(Y1|Y2)].
Example: Let n = 10 be the sample for quality control per day, Y be the
number of defectives, and p be the probability of observing a defective.
Y ∼ Bin(n, p), which has mean np and variance npq, where q = 1 − p. It
is known that E (Y ) = E [E (Y |p)]. But p is random and has a uniform
(U) distribution on the interval from 0 to 1/4. Find V (Y ).
Solution: Apply the theorem for V(Y1) = E[V(Y1|Y2)] + V[E(Y1|Y2)],
where Y1 = Y and Y2 = p. We have
V(Y) = E(V(Y|p)) + V(E(Y|p)) = E(npq) + V(np) = nE[p(1 − p)] + n^2 V(p).
E(p) = 1/8, V(p) = 1/192, E(p^2) = V(p) + [E(p)]^2 = 1/192 + (1/8)^2 = 1/192 + 1/64 = 1/48.
Solution (continued):
V(Y) = n(1/8 − 1/48) + n^2 (1/192) = 5n/48 + n^2/192.
For n = 10,
V(Y) = 50/48 + 100/192 = 1.5625.
SD(Y) = √V(Y) = √1.5625 = 1.25.
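The final numbers can be confirmed exactly (a sketch using the law of total variance with the moments of p derived above):

```python
from fractions import Fraction as F

n = 10
e_p, e_p2 = F(1, 8), F(1, 48)  # E(p), E(p^2) for p ~ U(0, 1/4)
v_p = e_p2 - e_p ** 2          # = 1/192

# V(Y) = n * E[p(1 - p)] + n^2 * V(p)
v_y = n * (e_p - e_p2) + n**2 * v_p
print(v_y)                # 25/16 = 1.5625
print(float(v_y) ** 0.5)  # 1.25
```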