Joint Distributions: The Joint Distribution of Two Discrete Random Variables

LECTURE NOTES NO. 5 (M235)

Joint Distributions
The Joint Distribution of Two Discrete
Random Variables:

Example: An experiment consists of three tosses of a fair coin. The sample space is
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Let the random variable X be the number of heads.

A random variable is a function that maps outcomes to numbers:
X : S (the sample space) → R (the set of real numbers)

X(HHH)=3, X(HHT)=2, X(HTH)=2, X(HTT)=1,
X(THH)=2, X(THT)=1, X(TTH)=1, X(TTT)=0
Then the possible values of X are 0, 1, 2, and 3.
The probability distribution of X is given by
Table 1
x 0 1 2 3
P(X=x) 1/8 3/8 3/8 1/8
Let the random variable Y be the number of tails that precede the first head. (If no heads come up, we let Y = 3.)

Y(HHH)=0,Y(HHT)=0,Y(HTH)=0,
Y(HTT)=0,Y(THH)=1,Y(THT)=1
Y(TTT)=3,Y(TTH)=2
Then the possible values of Y are 0, 1, 2, and 3.
The probability distribution of Y is given by
Table 2
y 0 1 2 3
P(Y=y) 4/8 2/8 1/8 1/8
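
Tables 1 and 2 can be checked by brute force: list the eight equally likely outcomes and tabulate X and Y. A minimal Python sketch, using only the standard library (the helper names num_heads and tails_before_first_head are just illustrative):

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=3)]  # the 8 outcomes in S
p_outcome = Fraction(1, 8)                                # fair coin: all outcomes equally likely

def num_heads(w):                  # X = number of heads
    return w.count("H")

def tails_before_first_head(w):    # Y = tails preceding the first head (3 if no head appears)
    return w.index("H") if "H" in w else 3

pmf_X, pmf_Y = defaultdict(Fraction), defaultdict(Fraction)
for w in outcomes:
    pmf_X[num_heads(w)] += p_outcome
    pmf_Y[tails_before_first_head(w)] += p_outcome

print(dict(pmf_X))  # Table 1: X takes 0, 1, 2, 3 with probabilities 1/8, 3/8, 3/8, 1/8
print(dict(pmf_Y))  # Table 2: Y takes 0, 1, 2, 3 with probabilities 4/8, 2/8, 1/8, 1/8
```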

Now look at events involving both X and Y; for example, what is P(X = 1 and Y = 1)?

P(X=1 and Y = 1) = P(THT) = 1/8,

because the event “X=1 and Y=1” occurs only if the outcome of the experiment is THT. But notice that this answer, 1/8, cannot be found from Tables 1 and 2.
Notation:
P(X = x and Y = y)= P(X = x, Y = y)
Tables 1 and 2 are fine for computing probabilities for X and Y alone, but from these tables there is no way to know that P(X = 1 and Y = 1) = 1/8.
What do we need?
It seems clear that in this example, the
probabilities in the following table will give
us enough information:

Table 3
                      y
p(x, y)      0      1      2      3
       0     0      0      0     1/8
x      1    1/8    1/8    1/8     0
       2    2/8    1/8     0      0
       3    1/8     0      0      0
The notation “p(x, y)” refers to P(X = x and
Y = y); for example, p(2, 0) = P(X=2 and
Y=0)= 2/8. Notice that the 16 probabilities
in the table add to 1.
The entries in Table 3 amount to the
specification of a function of x and y, p(x,
y)= P(X = x and Y = y). It is called the joint
probability mass function of X and Y.

Table 4
                      y
p(x, y)      0      1      2      3     P(X=x)
       0     0      0      0     1/8     1/8
x      1    1/8    1/8    1/8     0      3/8
       2    2/8    1/8     0      0      3/8
       3    1/8     0      0      0      1/8
P(Y=y)      4/8    2/8    1/8    1/8

Notice that from Table 4 we can recover Tables 1 and 2 (the individual mass functions of X and Y) by adding across the rows and down the columns. In this context the probability mass functions of X and Y are called marginal probability mass functions and are denoted by pX(x) and pY(y).
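
The row and column sums in Table 4 can be reproduced directly from the joint table. A small sketch, using exact fractions so the eighths do not turn into rounding error:

```python
from fractions import Fraction as F

# Joint pmf from Table 3: joint[x][y] = P(X = x, Y = y)
joint = {
    0: {0: F(0),    1: F(0),    2: F(0),    3: F(1, 8)},
    1: {0: F(1, 8), 1: F(1, 8), 2: F(1, 8), 3: F(0)},
    2: {0: F(2, 8), 1: F(1, 8), 2: F(0),    3: F(0)},
    3: {0: F(1, 8), 1: F(0),    2: F(0),    3: F(0)},
}

# Marginal of X: sum each row over y; marginal of Y: sum each column over x
p_X = {x: sum(row.values()) for x, row in joint.items()}
p_Y = {y: sum(joint[x][y] for x in joint) for y in range(4)}

print(p_X)  # marginal of X: 1/8, 3/8, 3/8, 1/8 on x = 0..3 (Table 1)
print(p_Y)  # marginal of Y: 4/8, 2/8, 1/8, 1/8 on y = 0..3 (Table 2)
assert sum(p_X.values()) == 1 and sum(p_Y.values()) == 1
```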

Definition. The joint probability mass function of two discrete random variables X and Y is the function p(x, y), defined for all pairs of real numbers x and y by
p(x, y) = P(X = x and Y = y),
and it satisfies the following conditions:
1. p(x, y) ≥ 0, for all x and y
2. Σ_x Σ_y p(x, y) = 1
3. P(a ≤ X ≤ b, c ≤ Y ≤ d) = Σ_{a ≤ x ≤ b} Σ_{c ≤ y ≤ d} p(x, y)

Theorem: The marginal probability mass functions can be obtained from the joint probability mass function:
P(X = x) = Σ_y P(X = x, Y = y)
P(Y = y) = Σ_x P(X = x, Y = y)

Independence of random variables:
Recall: events A and B are independent ⇔ P(A ∩ B) = P(A) P(B).
Two r.v.'s X and Y will be called independent when every event concerning X and every event concerning Y are independent.
General definition:
X and Y are independent ⇔ P(X ∈ A and Y ∈ B) = P(X ∈ A) P(Y ∈ B),
for any combination A ⊂ R and B ⊂ R.

For discrete variables this is equivalent to:
X and Y are independent ⇔ P(X = x, Y = y) = P(X = x) P(Y = y) for all x and y.
X and Y are dependent if they are not independent.
Theorem:
E[g(X, Y)] = Σ_x Σ_y g(x, y) p(x, y)
e.g.
E(XY) = Σ_x Σ_y x y p(x, y)
It follows that E(X + Y) = E(X) + E(Y).
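
The double sum is easy to evaluate by looping over a joint table. The sketch below applies it to the coin-toss joint pmf (Table 3) and confirms that E(X + Y) = E(X) + E(Y); the helper name expect is illustrative, not standard notation:

```python
from fractions import Fraction as F

# Joint pmf of (X, Y) for the coin-toss example (Table 3); omitted pairs have probability 0
joint = {(0, 3): F(1, 8),
         (1, 0): F(1, 8), (1, 1): F(1, 8), (1, 2): F(1, 8),
         (2, 0): F(2, 8), (2, 1): F(1, 8),
         (3, 0): F(1, 8)}

def expect(g):
    """E[g(X, Y)] = sum over all (x, y) of g(x, y) * p(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

E_X = expect(lambda x, y: x)        # 3/2
E_Y = expect(lambda x, y: y)        # 7/8
E_sum = expect(lambda x, y: x + y)  # 19/8
print(E_sum == E_X + E_Y)           # True: E(X + Y) = E(X) + E(Y)
```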


Definition: The conditional probability mass function of X given Y = y, denoted by pX(x|y), is defined by
pX(x|y) = P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)
for fixed y, provided P(Y = y) > 0.

The conditional expectation is:
E(X | Y = y) = Σ_x x P(X = x | Y = y)

Example: Consider the following joint probability mass function:
                      y
p(x, y)      0      1      2      3
       0    1/8    1/8     0      0
x      1     0     2/8    2/8     0
       2     0      0     1/8    1/8

a. Find P(X + Y = 2).
P(X + Y = 2) = p(0, 2) + p(1, 1) + p(2, 0) = 0 + 2/8 + 0 = 2/8
b. Find P(X > Y).
P(X > Y) = p(1, 0) + p(2, 0) + p(2, 1) = 0 + 0 + 0 = 0
c. Find the marginal probability mass function of X.
P(X = x) = Σ_{y=0}^{3} P(X = x, Y = y)

P(X = 0) = Σ_{y=0}^{3} P(X = 0, Y = y) = p(0,0) + p(0,1) + p(0,2) + p(0,3) = 1/8 + 1/8 + 0 + 0 = 2/8
P(X = 1) = Σ_{y=0}^{3} P(X = 1, Y = y) = p(1,0) + p(1,1) + p(1,2) + p(1,3) = 0 + 2/8 + 2/8 + 0 = 4/8
P(X = 2) = Σ_{y=0}^{3} P(X = 2, Y = y) = p(2,0) + p(2,1) + p(2,2) + p(2,3) = 0 + 0 + 1/8 + 1/8 = 2/8

d. Find the conditional p.m.f. of X given Y = 1.
pX|Y(x | 1) = P(X = x | Y = 1) = P(X = x, Y = 1) / P(Y = 1)

But pY(1) = 3/8. Therefore

pX|Y(x | 1) = p(x, 1) / (3/8),   x = 0, 1, 2

Thus
pX|Y(0 | 1) = p(0, 1) / P(Y = 1) = (1/8) / (3/8) = 1/3
pX|Y(1 | 1) = p(1, 1) / P(Y = 1) = (2/8) / (3/8) = 2/3
pX|Y(2 | 1) = p(2, 1) / P(Y = 1) = 0 / (3/8) = 0

These probabilities can be represented in


the following table:
x 0 1 2
P(X=x|Y=1) 1/3 2/3 0
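
The same numbers come out mechanically: take the y = 1 column of the joint table and divide by its total, P(Y = 1). A minimal sketch:

```python
from fractions import Fraction as F

# Joint pmf of the example; omitted pairs have probability 0
joint = {(0, 0): F(1, 8), (0, 1): F(1, 8),
         (1, 1): F(2, 8), (1, 2): F(2, 8),
         (2, 2): F(1, 8), (2, 3): F(1, 8)}

y0 = 1
p_y = sum(p for (x, y), p in joint.items() if y == y0)         # P(Y = 1) = 3/8
cond = {x: joint.get((x, y0), F(0)) / p_y for x in (0, 1, 2)}  # pX|Y(x | 1)

print(cond)                                   # x = 0, 1, 2 get 1/3, 2/3, 0
print(sum(x * px for x, px in cond.items()))  # E(X | Y = 1) = 2/3
```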

e. Find P(0 ≤ X ≤ 1|Y=1) ?

P(0 ≤ X ≤ 1 | Y = 1) = Σ_{x=0,1} P(X = x | Y = 1)
= p(0 | 1) + p(1 | 1) = 1/3 + 2/3 = 1
f. Are X and Y independent?
X and Y are independent if
P(X = x, Y = y) = pX(x) pY(y) for all x and y.
Check first whether p(0, 0) = pX(0) pY(0):
1/8 ≠ (2/8)(1/8)
So X and Y are not independent.
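
The check can be run over every cell, not just (0, 0). A small sketch that recomputes both marginals and tests the factorization:

```python
from fractions import Fraction as F

# Joint pmf of the example; zero cells omitted
joint = {(0, 0): F(1, 8), (0, 1): F(1, 8),
         (1, 1): F(2, 8), (1, 2): F(2, 8),
         (2, 2): F(1, 8), (2, 3): F(1, 8)}

xs, ys = (0, 1, 2), (0, 1, 2, 3)
p_X = {x: sum(joint.get((x, y), F(0)) for y in ys) for x in xs}
p_Y = {y: sum(joint.get((x, y), F(0)) for x in xs) for y in ys}

independent = all(joint.get((x, y), F(0)) == p_X[x] * p_Y[y] for x in xs for y in ys)
print(independent)  # False: e.g. p(0, 0) = 1/8 but pX(0) * pY(0) = (2/8)(1/8) = 2/64
```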
Joint probability mass functions (more than two variables):
Let X1, …, Xn be discrete random variables. The joint probability mass function is
p(x1, …, xn) = P(X1 = x1, …, Xn = xn),
where x1, …, xn are possible values of X1, …, Xn, respectively. p(x1, …, xn) satisfies the following conditions:
1. p(x1, …, xn) ≥ 0 for all x1, …, xn;
2. Σ_{x1} … Σ_{xn} p(x1, …, xn) = 1.

Independence of Multiple Random Variables:
Let X1, …, Xn be random variables. X1, …, Xn are independent if and only if
P(X1 = x1, …, Xn = xn) = P(X1 = x1) ⋯ P(Xn = xn),
for all (x1, …, xn).
Covariance Function
Measure of Dependency Between Two
Variables
Definition:
Suppose that X and Y are random variables with joint p.m.f. p(x, y) or p.d.f. fX,Y(x, y). The covariance of X and Y is defined by
Cov(X, Y) = E([X − E(X)][Y − E(Y)])
Covariance is a measure of how much two variables change together: it indicates how one variable tends to behave when the other changes. If an increase in one variable tends to come with an increase in the other variable (and a decrease with a decrease), the two variables are said to have positive covariance; they move together in the same direction. The concept of covariance is commonly used when discussing the relationship between two variables.

Example 1:
Suppose first that X and Y have the joint p.m.f. p(x, y) given in the following table:
                      y
p(x, y)      1       2       3      pX(x)
       0    0.02    0.05    0.15    0.22
x      1    0.08    0.24    0.05    0.37
       2    0.23    0.15    0.03    0.41
pY(y)       0.33    0.44    0.23
E(X) = 0(0.22) + 1(0.37) + 2(0.41) = 1.19
E(Y) = 1(0.33) + 2(0.44) + 3(0.23) = 1.90
E(XY) = 0(1)(0.02) + 0(2)(0.05) + 0(3)(0.15) + 1(1)(0.08) + 1(2)(0.24) + 1(3)(0.05) + 2(1)(0.23) + 2(2)(0.15) + 2(3)(0.03) = 1.95
Consequently,
Cov(X, Y) = 1.95 − (1.19)(1.90) = −0.311
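
The same arithmetic can be done by looping over the table. A short sketch (plain floating point, since the table uses decimal probabilities):

```python
# Joint pmf of Example 1: table[x][y] = p(x, y), with x = 0, 1, 2 and y = 1, 2, 3
table = {0: {1: 0.02, 2: 0.05, 3: 0.15},
         1: {1: 0.08, 2: 0.24, 3: 0.05},
         2: {1: 0.23, 2: 0.15, 3: 0.03}}
cells = [(x, y, p) for x, row in table.items() for y, p in row.items()]

E_X = sum(x * p for x, y, p in cells)       # 1.19
E_Y = sum(y * p for x, y, p in cells)       # 1.90
E_XY = sum(x * y * p for x, y, p in cells)  # 1.95

print(round(E_XY - E_X * E_Y, 4))           # Cov(X, Y) = -0.311
```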

Example 2:
The same joint p.m.f. again:
                      y
p(x, y)      1       2       3      pX(x)
       0    0.02    0.05    0.15    0.22
x      1    0.08    0.24    0.05    0.37
       2    0.23    0.15    0.03    0.41
pY(y)       0.33    0.44    0.23

Note: if X = 0, Y is much more likely to be 3 than 1; if X = 2, Y is much more likely to be 1 than 3. We therefore expect negative correlation.
E(X)=1.19
E(Y)=1.9
E(XY)=1.95
Cov(X,Y)=E(XY)-E(X) E(Y)
Cov(X,Y)=1.95-1.19(1.9)=-0.311
Example 3:
                      y
p(x, y)      1       2       3      pX(x)
       0    0.22    0.05    0.05    0.32
x      1    0.18    0.14    0.05    0.37
       2    0.07    0.10    0.14    0.31
pY(y)       0.47    0.29    0.24

Note: if X = 0, Y is much more likely to be 1 than 3; if X = 2, Y is more likely to be 3 than 1. We therefore expect positive correlation.
E(X) = 0.99
E(Y) = 1.77
E(XY) = 1.99
Cov(X, Y) = E(XY) − E(X) E(Y)
Cov(X, Y) = 1.99 − 0.99(1.77) = 0.2377

The covariance indicates the sign of the correlation between the variables:
If cov(X, Y) > 0, the correlation is positive.
If cov(X, Y) < 0, the correlation is negative.

Covariance is a measure of dependence between X and Y; it is zero when they are independent.
We say that X and Y are uncorrelated if cov(X, Y) = 0.
Result:
cov(X, Y) = E(XY) − E(X) E(Y)
Properties of covariance:
1. cov(X, Y) = cov(Y, X)
2. cov(X, X) = var(X)
3. cov(X + a, Y) = cov(X, Y)
4. cov(aX, Y) = a cov(X, Y)
5. cov(X, Y + Z) = cov(X, Y) + cov(X, Z)
6. var(X+Y) = var(X)+var(Y)+2cov(X,Y)
Property 6 generalizes to X1, …, Xn:
var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} var(Xi) + Σ_{i≠j} cov(Xi, Xj)

Properties for independent X and Y:
1. E(XY) = E(X) E(Y)
2. cov(X, Y) = 0
3. var(X + Y) = var(X) + var(Y)
Note 1: E(X + Y) = E(X) + E(Y) always holds, but in general E(XY) ≠ E(X) E(Y).
Note 2: independence implies no correlation, but not vice versa!
The value of the covariance:
- is positive if large values of X in general coincide with large values of Y, and vice versa;
- depends on the unit of measurement.
The correlation coefficient ρ(X, Y) does not depend on the unit of measurement:
Corr(X, Y) = ρ(X, Y) = Cov(X, Y) / √(Var(X) Var(Y)) = Cov(X, Y) / (σX σY)

Properties of ρ(X, Y):
1. −1 ≤ ρ(X, Y) ≤ 1
So ρ(X, Y) is a standardized measure of the linear dependence between X and Y.
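
Continuing with the Example 1 table, a small sketch that standardizes the covariance into ρ(X, Y) and confirms the value lies in [−1, 1]:

```python
import math

# Joint pmf of Example 1: table[x][y] = p(x, y)
table = {0: {1: 0.02, 2: 0.05, 3: 0.15},
         1: {1: 0.08, 2: 0.24, 3: 0.05},
         2: {1: 0.23, 2: 0.15, 3: 0.03}}
cells = [(x, y, p) for x, row in table.items() for y, p in row.items()]

E_X = sum(x * p for x, y, p in cells)
E_Y = sum(y * p for x, y, p in cells)
var_X = sum(x * x * p for x, y, p in cells) - E_X ** 2
var_Y = sum(y * y * p for x, y, p in cells) - E_Y ** 2
cov = sum(x * y * p for x, y, p in cells) - E_X * E_Y

rho = cov / math.sqrt(var_X * var_Y)  # unit-free correlation coefficient
print(round(rho, 3))                  # about -0.544: negative, and within [-1, 1]
```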

Joint distribution of two continuous random variables

If there is a function fX,Y(x, y) such that for every B ⊂ R²
P((X, Y) ∈ B) = ∬_{(x,y) ∈ B} fX,Y(x, y) dx dy,
then X and Y are jointly continuous random variables, and fX,Y(x, y) is called the joint probability density function.
As X and Y are continuous, the probability that (X, Y) falls in a rectangle is
P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_c^d ∫_a^b f(x, y) dx dy

Requirements for a joint density function f(x, y):
1. f(x, y) ≥ 0, for all x and y
2. ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1
Notation: f(x, y) = fX,Y(x, y)

Example: Let X and Y have joint probability density function
f(x, y) = 4xy for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and f(x, y) = 0 otherwise.

Marginal probability density functions:
Let X and Y be jointly continuous random variables. The marginal probability density functions of X and Y are
fX(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy
and
fY(y) = ∫_{−∞}^{∞} fX,Y(x, y) dx

Conditional probability density function:
Let X and Y be continuous random variables. The conditional probability density function of X given Y is
fX|Y(x | y) = fX,Y(x, y) / fY(y),
wherever fY(y) > 0.
Independence of Two Random Variables:
Let X and Y be jointly continuous random variables. X and Y are independent if and only if
fX,Y(x, y) = fX(x) fY(y), for all x ∈ R and y ∈ R.

Example: Let X and Y have joint probability density function
f(x, y) = 4xy for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and f(x, y) = 0 otherwise.

(a) Find P(X ≤ 1/2, Y ≤ 1/4).
P(X ≤ 1/2, Y ≤ 1/4) = ∫_0^{1/4} ∫_0^{1/2} 4xy dx dy
= ∫_0^{1/4} 4y [x²/2]_0^{1/2} dy
= ∫_0^{1/4} (y/2) dy
= (1/2) [y²/2]_0^{1/4} = 1/64
(b) Find the marginal probability density functions of X and Y.
fX(x) = ∫_0^1 4xy dy = 4x ∫_0^1 y dy = 4x [y²/2]_0^1 = 2x,  0 ≤ x ≤ 1
fY(y) = ∫_0^1 4xy dx = 4y ∫_0^1 x dx = 4y [x²/2]_0^1 = 2y,  0 ≤ y ≤ 1

(c) Find the conditional probability density function of X given Y = y, fX|Y(x | y).
fX|Y(x | y) = fX,Y(x, y) / fY(y) = 4xy / 2y = 2x,  0 ≤ x ≤ 1
(d) Are X and Y independent?
Check whether fX,Y(x, y) = fX(x) fY(y):
4xy = (2x)(2y) = 4xy
Then X and Y are independent.


For the expectation we can derive:
E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy
In particular:
E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x y f(x, y) dx dy
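
The integrals in this example can be checked numerically. A sketch assuming SciPy is installed; note that scipy.integrate.dblquad expects the integrand as a function of (y, x) and integrates the inner (y) variable first:

```python
from scipy.integrate import dblquad, quad

def f(x, y):
    return 4 * x * y  # joint density on the unit square

# (a) P(X <= 1/2, Y <= 1/4): x over [0, 1/2] (outer), y over [0, 1/4] (inner)
p, _ = dblquad(lambda y, x: f(x, y), 0, 0.5, lambda x: 0, lambda x: 0.25)
print(p, 1 / 64)  # both about 0.015625

# (b) marginal of X at a few points: fX(x) = integral of 4xy over y in [0, 1] = 2x
for x0 in (0.2, 0.5, 0.9):
    fx, _ = quad(lambda y: f(x0, y), 0, 1)
    print(fx, 2 * x0)

# E(XY) = 4/9, which equals E(X) E(Y) = (2/3)(2/3) since X and Y are independent
exy, _ = dblquad(lambda y, x: x * y * f(x, y), 0, 1, lambda x: 0, lambda x: 1)
print(exy, 4 / 9)
```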

Joint probability distribution of more than two random variables
Let X1, …, Xn be continuous random variables. The joint probability density function satisfies the following conditions:
1. f(x1, …, xn) ≥ 0 for −∞ < x1, …, xn < ∞;
2. ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} f(x1, …, xn) dx1 ⋯ dxn = 1.

The properties for two variables can be easily extended to 3 or more variables. For example, for n = 3 the marginal density of X1 is obtained by integrating out the other variables:
fX1(x1) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x1, x2, x3) dx2 dx3

Independence of Multiple Random Variables:
Let X1, …, Xn be random variables. X1, …, Xn are independent if and only if
f(x1, …, xn) = f1(x1) ⋯ fn(xn),
for all (x1, …, xn), where fj is the marginal probability distribution (or density) function of Xj.

Result
If X ~ N(µ1, σ12) and Y ~ N(µ2, σ22) and X and Y
are independent, then
X +Y ~ N(µ1+ µ2, σ12 + σ22)
Results:
- E(X1 + … + Xn) = E(X1) + … + E(Xn)
If X1, …, Xn are independent, then
- var(X1 + … + Xn) = var(X1) + … + var(Xn)
- E(∏_{i=1}^{n} Xi) = ∏_{i=1}^{n} E(Xi)
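
A quick Monte Carlo sanity check of these results, assuming NumPy is available; the particular parameter values are chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
mu1, sigma1 = 2.0, 3.0   # X ~ N(2, 9)
mu2, sigma2 = -1.0, 4.0  # Y ~ N(-1, 16), independent of X

n = 1_000_000
x = rng.normal(mu1, sigma1, n)
y = rng.normal(mu2, sigma2, n)
s = x + y

print(s.mean(), mu1 + mu2)                  # sample mean of X + Y vs mu1 + mu2 = 1
print(s.var(), sigma1**2 + sigma2**2)       # sample variance vs 9 + 16 = 25
print((x * y).mean(), x.mean() * y.mean())  # E(XY) close to E(X) E(Y) for independent X and Y
```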
