Random Vectors

1 Joint distribution of a random vector

2 Marginal and conditional distributions

3 Independent random variables

4 Characteristics of random vectors

Mean
Variance, Covariance, Correlation

5 Transformation of random vectors

6 Multivariate Normal distribution


Estadística. Profesora: María Durbán
1 Joint distribution of a random vector

Previously, we studied probability distributions of a single random variable. However, on many occasions we are interested in studying more than one variable in a random experiment.

For example, signals (sent or received) can be classified according to their quality as low, medium, or high. We define X = number of signals of low quality, and Y = number of signals of high quality.

In general, if X and Y are random variables, the probability distribution that describes them simultaneously is called the joint probability distribution.

Discrete variables

Given two discrete r.v. X, Y, we define their joint probability function as:

\[ p(x, y) = \Pr(X = x, Y = y) \]

As in the univariate case, this function must satisfy:

\[ p(x, y) \ge 0, \qquad \sum_x \sum_y p(x, y) = 1 \]

The joint distribution function is:

\[ F(x_0, y_0) = \Pr(X \le x_0, Y \le y_0) = \sum_{x \le x_0} \sum_{y \le y_0} \Pr(X = x, Y = y) \]

Example

A new receptor for the transmission of digital information receives bits that are classified as acceptable, suspicious, or non-acceptable, depending on the quality of the received signal. 4 bits are transmitted, and two r.v. are defined:

X = number of acceptable bits
Y = number of suspicious bits

The joint probability function p(x, y) is (empty cells are impossible, since X + Y ≤ 4):

Y = 4   4.1×10⁻⁵
Y = 3   4.1×10⁻⁵   1.84×10⁻³
Y = 2   1.54×10⁻⁵  1.38×10⁻³   3.11×10⁻²
Y = 1   2.56×10⁻⁶  3.46×10⁻⁴   1.56×10⁻²   0.2333
Y = 0   1.6×10⁻⁷   2.88×10⁻⁵   1.94×10⁻³   5.83×10⁻²   0.6561
        X = 0      X = 1       X = 2       X = 3       X = 4

A probability such as Pr(X ≤ 1, Y ≤ 2) is obtained by summing p(x, y) over the corresponding cells of the table.

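The table values are consistent with a trinomial model in which each bit is independently acceptable, suspicious, or non-acceptable. A minimal sketch; the per-bit probabilities 0.9, 0.08 and 0.02 below are an assumption inferred from the table cells, not figures stated on the slide:

```python
from math import comb

# Assumed per-bit probabilities (inferred from the table, not given on the slide):
p_a, p_s, p_n = 0.9, 0.08, 0.02   # acceptable, suspicious, non-acceptable
n = 4                             # bits transmitted

def p(x, y):
    """Joint pmf p(x, y) = Pr(X = x acceptable, Y = y suspicious) among n bits."""
    if x < 0 or y < 0 or x + y > n:
        return 0.0
    # trinomial: n!/(x! y! (n-x-y)!) * p_a^x * p_s^y * p_n^(n-x-y)
    return comb(n, x) * comb(n - x, y) * p_a**x * p_s**y * p_n**(n - x - y)

total = sum(p(x, y) for x in range(n + 1) for y in range(n + 1))
print(round(total, 10))   # 1.0: the pmf sums to one
print(round(p(4, 0), 4))  # 0.6561, matching the table
print(round(p(3, 1), 4))  # 0.2333
```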
Continuous variables

Given two continuous r.v. X, Y, their joint density function is defined as f(x, y). As in the univariate case, this function must satisfy:

\[ f(x, y) \ge 0, \qquad \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x, y)\,dx\,dy = 1 \]

The joint distribution function is:

\[ F(x_0, y_0) = \Pr(X \le x_0, Y \le y_0) = \int_{-\infty}^{y_0} \int_{-\infty}^{x_0} f(x, y)\,dx\,dy, \qquad f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\, \partial y} \]

The probability is now a volume:

\[ \Pr(a \le X \le b,\; c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx \]

For example, \(\Pr(-1 \le X \le 1,\; -1.5 \le Y \le 1.5)\) is the volume under the density surface over the rectangle \([-1, 1] \times [-1.5, 1.5]\).

Example

Let X be a random variable which represents the time until your PC connects to a server, and Y the time until the server recognizes you as a user. The joint density function is given by:

\[ f(x, y) = 6 \times 10^{-6} \exp(-0.001x - 0.002y), \qquad 0 < x < y \]

What is \(\Pr(X \le 1000, Y \le 2000)\)?

The density is non-zero only on the region \(0 < x < y\), i.e. above the line \(x = y\) in the \((x, y)\) plane.

The probability is computed over the part of that region with \(x \le 1000\) and \(y \le 2000\). Since the density is zero for \(x \ge y\), the region splits at \(y = 1000\):

\[ \Pr(X \le 1000, Y \le 2000) = \int_0^{1000} \int_0^{y} f(x, y)\,dx\,dy + \int_{1000}^{2000} \int_0^{1000} f(x, y)\,dx\,dy = 0.915 \]

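A numeric check of this result, approximating the two iterated integrals with midpoint-rule sums (the grid sizes below are arbitrary choices, fine enough for three-figure accuracy):

```python
from math import exp

def f(x, y):
    """Joint density from the example; zero outside 0 < x < y."""
    return 6e-6 * exp(-0.001 * x - 0.002 * y) if 0 < x < y else 0.0

def inner(y, n=200):
    """Integral of f(x, y) over x in (0, min(y, 1000)) by the midpoint rule."""
    b = min(y, 1000.0)
    h = b / n
    return sum(f((i + 0.5) * h, y) for i in range(n)) * h

# Outer midpoint rule over y in (0, 2000)
hy = 2000.0 / 400
prob = sum(inner((j + 0.5) * hy) * hy for j in range(400))
print(round(prob, 3))  # ≈ 0.915, matching the slide
```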
2 Marginal distributions

If we observe more than one r.v. in an experiment, it is important to distinguish between the joint probability distribution and the probability distribution of each variable separately. The distribution of each variable is called its marginal distribution.

Discrete variables

Given two discrete r.v. X, Y with joint probability function p(x, y), the marginal probability functions are given by:

\[ p_X(x) = \Pr(X = x) = \sum_y \Pr(X = x, Y = y), \qquad p_Y(y) = \Pr(Y = y) = \sum_x \Pr(X = x, Y = y) \]

They are probability functions, so we can compute their mean, variance, etc.

Example

For the digital-transmission example (4 bits transmitted; X = number of acceptable bits, Y = number of suspicious bits), the marginal probability functions are obtained by adding the joint probabilities in both directions of the table.

Adding each column gives the marginal of X:

x        0        1        2        3        4
p_X(x)   0.0001   0.0036   0.0486   0.2916   0.6561

Adding each row gives the marginal of Y:

y        0         1         2         3         4
p_Y(y)   0.71637   0.24925   0.03250   0.00188   0.00004

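As a sketch, the marginals can be recovered by summing a joint pmf along each axis; here the joint pmf is the trinomial with assumed per-bit probabilities 0.9 / 0.08 / 0.02 (inferred from the table, not stated on the slide):

```python
from math import comb

p_a, p_s, p_n, n = 0.9, 0.08, 0.02, 4  # assumed per-bit probabilities

def joint(x, y):
    """Trinomial joint pmf of (X, Y); zero outside x + y <= n."""
    if x < 0 or y < 0 or x + y > n:
        return 0.0
    return comb(n, x) * comb(n - x, y) * p_a**x * p_s**y * p_n**(n - x - y)

# Marginals: sum over the other variable
p_X = [sum(joint(x, y) for y in range(n + 1)) for x in range(n + 1)]
p_Y = [sum(joint(x, y) for x in range(n + 1)) for y in range(n + 1)]
print([round(v, 4) for v in p_X])  # matches the X row above: 0.0001 ... 0.6561
print([round(v, 4) for v in p_Y])  # close to the Y row (the slide sums rounded cells)
```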
Continuous variables

Given two continuous r.v. X, Y with joint density function f(x, y), the marginal density functions are given by:

\[ f_X(x) = \int_{-\infty}^{+\infty} f(x, y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{+\infty} f(x, y)\,dx \]

They are density functions, so we can compute their mean, variance, etc.

Example

For the server example, with joint density

\[ f(x, y) = 6 \times 10^{-6} \exp(-0.001x - 0.002y), \qquad 0 < x < y, \]

what is \(\Pr(Y > 2000)\)? We can solve it in two ways.

First way: integrate the joint density over the appropriate region:

\[ \Pr(Y > 2000) = \int_{2000}^{+\infty} \int_0^{y} f(x, y)\,dx\,dy = 0.05 \]

Second way: compute the marginal density of Y and use it to compute the probability:

\[ f_Y(y) = \int_0^{y} f(x, y)\,dx = 6 \times 10^{-3}\, e^{-0.002y}\,(1 - e^{-0.001y}), \qquad y > 0 \]
\[ \Pr(Y > 2000) = \int_{2000}^{+\infty} f_Y(y)\,dy = 0.05 \]

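The marginal density and the 0.05 tail probability can be sanity-checked numerically (midpoint-rule sums; the 20 000 cut-off is an arbitrary truncation of the infinite upper limit, harmless here because the exponential tail is negligible):

```python
from math import exp

def f_Y(y):
    """Marginal density of Y from the example (zero for y <= 0)."""
    return 6e-3 * exp(-0.002 * y) * (1 - exp(-0.001 * y)) if y > 0 else 0.0

h = 1.0
total = sum(f_Y((k + 0.5) * h) * h for k in range(20000))       # integral over y > 0
tail = sum(f_Y((k + 0.5) * h) * h for k in range(2000, 20000))  # integral over y > 2000
print(round(total, 3), round(tail, 3))  # ≈ 1.0 and ≈ 0.05
```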
2 Conditional distributions

When we observe more than one r.v. in an experiment, one variable may affect the probabilities associated with the other.

Remember from previous lectures (Probability):

\[ \Pr(B \mid A) = \frac{\Pr(A \cap B)}{\Pr(A)} \]

It measures the size of one event with respect to the other.

Discrete variables

Given two discrete r.v. X, Y with joint probability function p(x, y), the conditional probability function of Y given X = x₀ (requiring p_X(x₀) > 0) is:

\[ p(y \mid x_0) = \frac{p(y, x_0)}{p_X(x_0)} = \frac{\Pr(Y = y, X = x_0)}{\Pr(X = x_0)} \]

For any given x:

\[ p(y \mid x) = \frac{p(y, x)}{p_X(x)} \quad \Longrightarrow \quad p(y, x) = p(y \mid x)\, p_X(x) \]

It is a probability function, so we can compute its mean, variance, etc.

Example

For the digital-transmission example (4 bits transmitted; X = number of acceptable bits, Y = number of suspicious bits):

Only 4 bits are transmitted, so if X = 4 then Y = 0, and if X = 3 then Y = 0 or 1. Knowing the value of X changes the probabilities associated with the values of Y.

For X = 3:

\[ \Pr(Y = 0 \mid X = 3) = \frac{\Pr(Y = 0, X = 3)}{\Pr(X = 3)} = \frac{0.05832}{0.2916} = 0.2 \]
\[ \Pr(Y = 1 \mid X = 3) = \frac{\Pr(Y = 1, X = 3)}{\Pr(X = 3)} = \frac{0.2333}{0.2916} = 0.8 \]

These probabilities sum to 1, and

\[ E[Y \mid X = 3] = 0 \times 0.2 + 1 \times 0.8 = 0.8 \]

is the expected number of suspicious bits when the number of acceptable bits is 3.

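Under the same assumed per-bit probabilities as before (0.9 / 0.08 / 0.02, inferred from the joint table rather than stated on the slide), the conditional distribution of Y given X = 3 and its mean can be computed directly:

```python
from math import comb

p_a, p_s, p_n, n = 0.9, 0.08, 0.02, 4  # assumed per-bit probabilities

def joint(x, y):
    """Trinomial joint pmf of (X, Y); zero outside x + y <= n."""
    if x < 0 or y < 0 or x + y > n:
        return 0.0
    return comb(n, x) * comb(n - x, y) * p_a**x * p_s**y * p_n**(n - x - y)

pX3 = sum(joint(3, y) for y in range(n + 1))        # Pr(X = 3)
cond = [joint(3, y) / pX3 for y in range(n + 1)]    # Pr(Y = y | X = 3)
print([round(c, 2) for c in cond])                  # [0.2, 0.8, 0.0, 0.0, 0.0]
print(round(sum(y * c for y, c in enumerate(cond)), 2))  # E[Y | X = 3] = 0.8
```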
Continuous variables

Given two continuous r.v. X, Y with joint density function f(x, y), the conditional density function of Y given X = x is:

\[ f(y \mid x) = \frac{f(x, y)}{f_X(x)} \quad \Longrightarrow \quad f(x, y) = f(y \mid x)\, f_X(x) \]

It is a density function, so we can compute its mean, variance, etc.

Example

For the server example, with joint density

\[ f(x, y) = 6 \times 10^{-6} \exp(-0.001x - 0.002y), \qquad 0 < x < y, \]

what is the probability that the time until the server recognizes you as a user is more than 2000, if your PC has taken 1500 to connect to the server? That is, \(\Pr(Y > 2000 \mid X = 1500)\).

First compute the marginal density of X and the conditional density of Y given x:

\[ f_X(x) = \int_x^{+\infty} f(x, y)\,dy = 0.003\, e^{-0.003x}, \qquad x > 0 \]
\[ f(y \mid x) = \frac{f(x, y)}{f_X(x)} = 0.002\, e^{0.002x - 0.002y}, \qquad 0 < x < y \]

Then:

\[ \Pr(Y > 2000 \mid X = 1500) = \int_{2000}^{+\infty} f(y \mid X = 1500)\,dy = \int_{2000}^{+\infty} 0.002\, e^{3 - 0.002y}\,dy = 0.368 \]

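A numeric check of the conditional tail probability (the integral has closed form e^{0.002·1500 − 0.002·2000} = e^{−1} ≈ 0.368); the truncation at y = 22 000 is an arbitrary cut-off for the infinite upper limit:

```python
from math import exp

def f_cond(y, x=1500.0):
    """Conditional density f(y | x) from the example; zero unless y > x."""
    return 0.002 * exp(0.002 * x - 0.002 * y) if y > x else 0.0

h = 1.0
tail = sum(f_cond(2000 + (k + 0.5) * h) * h for k in range(20000))
print(round(tail, 3))  # ≈ 0.368 = e^{-1}
```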
3 Independent random variables

When we observe more than one r.v. in an experiment, one variable may not affect the probabilities associated with the other.

Remember from previous lectures (Probability): for independent events,

\[ \Pr(A \cap B) = \Pr(A)\Pr(B), \qquad \Pr(A \mid B) = \Pr(A), \qquad \Pr(B \mid A) = \Pr(B) \]

Discrete variables

Two variables X, Y are independent if:

\[ p(y \mid x) = p_Y(y), \qquad p(x \mid y) = p_X(x) \]
\[ p(x, y) = p(x \mid y)\, p_Y(y) = p_X(x)\, p_Y(y) \quad \forall x, y \]

Continuous variables

Two variables X, Y are independent if:

\[ f(y \mid x) = f_Y(y), \qquad f(x \mid y) = f_X(x) \]
\[ f(x, y) = f(x \mid y)\, f_Y(y) = f_X(x)\, f_Y(y) \quad \forall x, y \]

Example

For the server example:

\[ f(x, y) = 6 \times 10^{-6} \exp(-0.001x - 0.002y), \qquad 0 < x < y \]
\[ f(y \mid x) = 0.002\, e^{0.002x - 0.002y}, \quad 0 < x < y \qquad \text{vs.} \qquad f_Y(y) = 6 \times 10^{-3}\, e^{-0.002y}(1 - e^{-0.001y}), \quad y > 0 \]

Since \( f(y \mid x) \ne f_Y(y) \) for all values of x, the variables X and Y are not independent.

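Independence would require f(x, y) = f_X(x) f_Y(y) at every point; a sketch checking a single (arbitrarily chosen) point, using the marginal densities derived earlier in the section:

```python
from math import exp

def f(x, y):   # joint density from the example
    return 6e-6 * exp(-0.001 * x - 0.002 * y) if 0 < x < y else 0.0

def f_X(x):    # marginal density of X
    return 0.003 * exp(-0.003 * x) if x > 0 else 0.0

def f_Y(y):    # marginal density of Y
    return 6e-3 * exp(-0.002 * y) * (1 - exp(-0.001 * y)) if y > 0 else 0.0

x, y = 1000.0, 2000.0
print(f(x, y) > f_X(x) * f_Y(y))  # True: the product differs, so X, Y are not independent
```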
4 Characteristics of random vectors

Given n r.v. \( X_1, X_2, \ldots, X_n \), an n-dimensional random vector is

\[ \mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix} \]

The probability/density function of the vector is the joint probability/density function of the vector components.

Mean

We define the vector of means as the vector whose components are the means of each component:

\[ \boldsymbol{\mu} = E[\mathbf{X}] = \begin{pmatrix} E[X_1] \\ E[X_2] \\ \vdots \\ E[X_n] \end{pmatrix} \]

Covariance

We start by defining the covariance between two variables. It is a measure of the linear relationship between them:

\[ \operatorname{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y] \]

Properties

If X, Y are independent, then Cov(X, Y) = 0, since E[XY] = E[X]E[Y].

The converse does not hold: Cov(X, Y) = 0 does not imply that X, Y are independent.

If we change scale and origin, Z = aX + b and W = cY + d, then Cov(Z, W) = ac·Cov(X, Y).

How do we compute the covariance? We need the mean of a function of two random variables:

\[ E[h(X, Y)] = \sum_x \sum_y h(x, y)\, p(x, y) \quad \text{(discrete case)} \]
\[ E[h(X, Y)] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} h(x, y)\, f(x, y)\,dx\,dy \quad \text{(continuous case)} \]

[Figure: scatterplots illustrating positive covariance, negative covariance, and two zero-covariance cases; in one of the zero-covariance cases the variables are related, but not linearly.]

Example

For the digital-transmission example (X = number of acceptable bits, Y = number of suspicious bits): is the covariance between X and Y positive or negative?

We know that X + Y ≤ 4, so when Y is close to 4, X is close to 0. Therefore, the covariance is negative.

Correlation

The correlation between two variables is also a measure of the linear relationship between them:

\[ \rho(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}[X]\operatorname{Var}[Y]}} \]

If X, Y are independent, then ρ(X, Y) = 0, since Cov(X, Y) = 0.

\[ |\rho(X, Y)| \le 1 \]

If Y = aX + b, then |ρ(X, Y)| = 1.

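For the digital-transmission example (again under the assumed per-bit probabilities 0.9 / 0.08 / 0.02 inferred from its table), the covariance and correlation can be computed from the joint pmf via E[h(X, Y)] = ΣΣ h(x, y) p(x, y):

```python
from math import comb, sqrt

p_a, p_s, p_n, n = 0.9, 0.08, 0.02, 4  # assumed per-bit probabilities

def joint(x, y):
    """Trinomial joint pmf of (X, Y); zero outside x + y <= n."""
    if x < 0 or y < 0 or x + y > n:
        return 0.0
    return comb(n, x) * comb(n - x, y) * p_a**x * p_s**y * p_n**(n - x - y)

pts = [(x, y, joint(x, y)) for x in range(n + 1) for y in range(n + 1)]
E_X = sum(x * p for x, y, p in pts)
E_Y = sum(y * p for x, y, p in pts)
cov = sum(x * y * p for x, y, p in pts) - E_X * E_Y
var_x = sum(x * x * p for x, y, p in pts) - E_X**2
var_y = sum(y * y * p for x, y, p in pts) - E_Y**2
rho = cov / sqrt(var_x * var_y)
print(round(cov, 4), round(rho, 3))  # -0.288 and ≈ -0.885: negative, as argued above
```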
Variance-Covariance Matrix

Given n r.v. \( X_1, X_2, \ldots, X_n \), the variance-covariance matrix of the vector X is the n × n matrix:

\[ M_X = E[(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})'] = \begin{pmatrix} \operatorname{Var}[X_1] & \operatorname{Cov}[X_1, X_2] & \cdots & \operatorname{Cov}[X_1, X_n] \\ \operatorname{Cov}[X_1, X_2] & \operatorname{Var}[X_2] & & \vdots \\ \vdots & & \ddots & \vdots \\ \operatorname{Cov}[X_1, X_n] & \cdots & \cdots & \operatorname{Var}[X_n] \end{pmatrix} \]

Properties

It is symmetric and positive semi-definite.

5 Transformation of random vectors

As in the univariate case, sometimes we will need to obtain the probability distribution of a function of two or more r.v.

Given a random vector X with joint density function f(X), transformed into another random vector Y of the same dimension by a function g,

\[ y_1 = g_1(x_1, \ldots, x_n), \quad y_2 = g_2(x_1, \ldots, x_n), \quad \ldots, \quad y_n = g_n(x_1, \ldots, x_n), \]

and assuming the inverse transformations exist,

\[ f(\mathbf{Y}) = f(g^{-1}(\mathbf{Y})) \left| \frac{d\mathbf{X}}{d\mathbf{Y}} \right|, \qquad \frac{d\mathbf{X}}{d\mathbf{Y}} = \begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \cdots & \dfrac{\partial x_1}{\partial y_n} \\ \vdots & & \vdots \\ \dfrac{\partial x_n}{\partial y_1} & \cdots & \dfrac{\partial x_n}{\partial y_n} \end{pmatrix} \]

If Y has lower dimension than X, we complete Y with elements of X until both have the same dimension.

Example

\[ f_{X_1 X_2}(x_1, x_2) = \begin{cases} 4x_1x_2 & 0 \le x_1, x_2 \le 1 \\ 0 & \text{elsewhere} \end{cases} \]

Calculate the density function of \( Y_1 = X_1 + X_2 \):

1. Define \( Y_2 = X_2 \)
2. Find the joint density of \( \mathbf{Y} = (Y_1, Y_2) \)
3. Find the marginal density of \( Y_1 \)

Find the joint density of \( \mathbf{Y} = (Y_1, Y_2) \):

\[ g(\mathbf{X}) = (X_1 + X_2,\, X_2) \;\to\; g^{-1}(\mathbf{Y}) = (Y_1 - Y_2,\, Y_2) \]
\[ \left| \frac{d\mathbf{X}}{d\mathbf{Y}} \right| = \begin{vmatrix} 1 & -1 \\ 0 & 1 \end{vmatrix} = 1, \qquad f(g^{-1}(\mathbf{Y})) = 4(y_1 - y_2)\, y_2 \]

so \( f(\mathbf{Y}) = 4(y_1 - y_2)\, y_2 \). In which region is it defined?

Since \( 0 \le x_1, x_2 \le 1 \):

\[ Y_1 = X_1 + X_2 \;\to\; 0 \le y_1 \le 2, \qquad Y_2 = X_2 \;\to\; 0 \le y_2 \le 1, \qquad Y_1 - Y_2 = X_1 \;\to\; 0 \le y_1 - y_2 \le 1 \]

The density is therefore non-zero on the band between the lines \( y_1 - y_2 = 0 \) and \( y_1 - y_2 = 1 \) (with \( 0 \le y_2 \le 1 \)), which can be described as:

\[ 0 \le y_1 \le 1: \;\; 0 \le y_2 \le y_1, \qquad 1 \le y_1 \le 2: \;\; y_1 - 1 \le y_2 \le 1 \]

Find the marginal density of \( Y_1 \):

\[ f_{Y_1}(y_1) = \begin{cases} \displaystyle\int_0^{y_1} 4(y_1 - y_2)\, y_2\, dy_2 = \frac{2}{3}\, y_1^3 & 0 \le y_1 \le 1 \\[2mm] \displaystyle\int_{y_1 - 1}^{1} 4(y_1 - y_2)\, y_2\, dy_2 = -\frac{8}{3} + 4y_1 - \frac{2}{3}\, y_1^3 & 1 \le y_1 \le 2 \end{cases} \]

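A quick numeric check that the piecewise expression above is a valid density, integrating it over (0, 2) with a midpoint rule:

```python
def f_Y1(y):
    """Piecewise marginal density of Y1 = X1 + X2 derived above."""
    if 0 <= y <= 1:
        return (2 / 3) * y**3
    if 1 < y <= 2:
        return 4 * y - 8 / 3 - (2 / 3) * y**3
    return 0.0

h = 1e-3
total = sum(f_Y1((k + 0.5) * h) * h for k in range(2000))
print(round(total, 4))  # 1.0: it integrates to one
```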
Convolution of X₁ and X₂

If X₁ and X₂ are independent random variables with density functions \( f_{X_1}(x_1) \) and \( f_{X_2}(x_2) \), the density function of \( Y = X_1 + X_2 \) is:

\[ (f_{X_1} * f_{X_2})(y) = f_Y(y) = \int_{-\infty}^{+\infty} f_{X_1}(y - x)\, f_{X_2}(x)\, dx \]

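For the example above, the joint density 4x₁x₂ on the unit square factorizes as (2x₁)(2x₂), so X₁ and X₂ are independent with common density f(x) = 2x on (0, 1), and the convolution formula must reproduce the piecewise density of Y₁ = X₁ + X₂ found earlier. A midpoint-rule sketch:

```python
def f(x):
    """Common density of X1 and X2: f(x) = 2x on (0, 1)."""
    return 2 * x if 0 < x < 1 else 0.0

def conv(y, h=1e-3):
    """(f * f)(y) by the midpoint rule."""
    return sum(f(y - (k + 0.5) * h) * f((k + 0.5) * h) * h for k in range(int(2 / h)))

print(round(conv(0.5), 3), round((2 / 3) * 0.5**3, 3))                    # both ≈ 0.083
print(round(conv(1.5), 3), round(4 * 1.5 - 8 / 3 - (2 / 3) * 1.5**3, 3))  # both ≈ 1.083
```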
A special case is the computation of the mean and variance of a linear transformation:

\[ \mathbf{Y}_{m \times 1} = \mathbf{A}_{m \times n}\, \mathbf{X}_{n \times 1}, \quad m \le n \]
\[ E[\mathbf{Y}] = \mathbf{A}\, E[\mathbf{X}], \qquad \operatorname{Var}[\mathbf{Y}] = \mathbf{A}\, M_X\, \mathbf{A}' \]

Example

\[ Y = X_1 + X_2 \;\Rightarrow\; Y = (1 \;\; 1) \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \]
\[ E[Y] = E[X_1] + E[X_2] \]
\[ \operatorname{Var}[Y] = (1 \;\; 1) \begin{pmatrix} \operatorname{Var}[X_1] & \operatorname{Cov}(X_1, X_2) \\ \operatorname{Cov}(X_1, X_2) & \operatorname{Var}[X_2] \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \operatorname{Var}[X_1] + \operatorname{Var}[X_2] + 2\operatorname{Cov}(X_1, X_2) \]

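The formula can be checked numerically on the digital-transmission example (again under the assumed per-bit probabilities 0.9 / 0.08 / 0.02): Var[X + Y] computed directly from the joint pmf must equal Var[X] + Var[Y] + 2 Cov(X, Y):

```python
from math import comb

p_a, p_s, p_n, n = 0.9, 0.08, 0.02, 4  # assumed per-bit probabilities

def joint(x, y):
    """Trinomial joint pmf of (X, Y); zero outside x + y <= n."""
    if x < 0 or y < 0 or x + y > n:
        return 0.0
    return comb(n, x) * comb(n - x, y) * p_a**x * p_s**y * p_n**(n - x - y)

pts = [(x, y, joint(x, y)) for x in range(n + 1) for y in range(n + 1)]
E = lambda h: sum(h(x, y) * p for x, y, p in pts)  # E[h(X, Y)]

var_x = E(lambda x, y: x * x) - E(lambda x, y: x) ** 2
var_y = E(lambda x, y: y * y) - E(lambda x, y: y) ** 2
cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

via_formula = var_x + var_y + 2 * cov
direct = E(lambda x, y: (x + y) ** 2) - E(lambda x, y: x + y) ** 2
print(round(via_formula, 4), round(direct, 4))  # the two agree: 0.0784
```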
Example

\[ Y = X_1 - X_2 \;\Rightarrow\; Y = (1 \;\; -1) \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \]
\[ E[Y] = E[X_1] - E[X_2] \]
\[ \operatorname{Var}[Y] = (1 \;\; -1) \begin{pmatrix} \operatorname{Var}[X_1] & \operatorname{Cov}(X_1, X_2) \\ \operatorname{Cov}(X_1, X_2) & \operatorname{Var}[X_2] \end{pmatrix} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \operatorname{Var}[X_1] + \operatorname{Var}[X_2] - 2\operatorname{Cov}(X_1, X_2) \]

Normal Distribution

If \( X_i \sim N(\mu_i, \sigma_i) \), \( i = 1, \ldots, n \), are independent, then the linear combination

\[ Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n \]

is Normal, with

\[ E[Y] = \sum_{i=1}^{n} a_i \mu_i, \qquad \operatorname{Var}[Y] = \sum_{i=1}^{n} a_i^2 \sigma_i^2 \]

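A Monte Carlo sketch of this property, with arbitrarily chosen coefficients and parameters (Y = 2X₁ − X₂ with X₁ ~ N(1, 2) and X₂ ~ N(3, 1), so E[Y] = 2·1 − 3 = −1 and Var[Y] = 2²·2² + 1²·1² = 17):

```python
import random

random.seed(0)
N = 200_000
# Y = 2*X1 - X2; random.gauss takes (mean, standard deviation)
ys = [2 * random.gauss(1, 2) - random.gauss(3, 1) for _ in range(N)]
mean = sum(ys) / N
var = sum((y - mean) ** 2 for y in ys) / N
print(round(mean, 1), round(var, 1))  # close to the theoretical -1 and 17
```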
6 Multivariate Normal distribution

If a random vector \( \mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \) follows a bivariate Normal distribution with vector of means \( \boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \) and variance-covariance matrix

\[ \Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}, \]

it has density function:

\[ f(\mathbf{X}) = \frac{1}{2\pi\, |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{X} - \boldsymbol{\mu})'\, \Sigma^{-1} (\mathbf{X} - \boldsymbol{\mu}) \right\} \]

Expanding, with

\[ |\Sigma| = \sigma_1^2 \sigma_2^2 (1 - \rho^2), \qquad \Sigma^{-1} = \frac{1}{1 - \rho^2} \begin{pmatrix} \dfrac{1}{\sigma_1^2} & \dfrac{-\rho}{\sigma_1\sigma_2} \\ \dfrac{-\rho}{\sigma_1\sigma_2} & \dfrac{1}{\sigma_2^2} \end{pmatrix}, \]

the density becomes:

\[ f(x_1, x_2) = \frac{1}{2\pi\, \sigma_1\sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 - 2\rho \left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) \right] \right\} \]

[Figure: the bivariate Normal density function and a scatterplot of a sample from it.]

Properties

\( \rho = 0 \iff X_1, X_2 \) are independent (for the Normal distribution, zero correlation implies independence).

The marginals are Normal: \( X_1 \sim N(\mu_1, \sigma_1) \) and \( X_2 \sim N(\mu_2, \sigma_2) \).

The conditionals \( X_1 \mid X_2 \) and \( X_2 \mid X_1 \) are Normal.

Estadística. Profesora: María Durbán
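The bivariate density formula can be coded directly, and the first property checked at a point: with ρ = 0 the joint density factorizes into the product of the two univariate Normal densities (the evaluation point and parameters below are arbitrary):

```python
from math import exp, pi, sqrt

def dmvnorm2(x1, x2, m1, m2, s1, s2, rho):
    """Bivariate Normal density, written directly from the expanded formula."""
    z1, z2 = (x1 - m1) / s1, (x2 - m2) / s2
    q = z1**2 + z2**2 - 2 * rho * z1 * z2
    return exp(-q / (2 * (1 - rho**2))) / (2 * pi * s1 * s2 * sqrt(1 - rho**2))

def dnorm(x, m, s):
    """Univariate Normal density."""
    return exp(-((x - m) / s) ** 2 / 2) / (s * sqrt(2 * pi))

# With rho = 0, the joint density equals the product of the marginals:
a = dmvnorm2(0.5, -0.2, 0, 0, 1, 1, 0.0)
b = dnorm(0.5, 0, 1) * dnorm(-0.2, 0, 1)
print(round(a, 6), round(b, 6))  # identical values
```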
