Random Vectors: An Overview: Master INVESTMAT 2018-2019 Unit 2
In this session we will review the main definitions, concepts, properties and results related to multivariate random variables or, equivalently, random vectors. The exposition deliberately follows, in broad outline, the same structure as for univariate r.v.'s, since most of the ideas are extensions of their one-dimensional counterparts. In this way, we pursue clarity rather than appeal.
A good background in random vectors is crucial when dealing with systems of random differential equations (R.D.E.'s), since their inputs, such as initial and/or boundary conditions, source terms and coefficients, can be random vectors and/or matrices rather than deterministic ones.
Random Vectors
\[ \lim_{(x_1,x_2)\to(-\infty,-\infty)} F_{X_1,X_2}(x_1,x_2) = \lim_{x_1\to-\infty} F_{X_1,X_2}(x_1,x_2) = \lim_{x_2\to-\infty} F_{X_1,X_2}(x_1,x_2) = 0. \]
Discrete r.v.'s: if $X_1$ takes the values $x_{1,1},\ldots,x_{1,M}$ and $X_2$ takes the values $x_{2,1},\ldots,x_{2,N}$, the marginal m.p.f.'s are obtained by summing out the other variable:
\[ p_{X_1}(x_{1,i}) = \sum_{j=1}^{N} p_{X_1,X_2}(x_{1,i},x_{2,j}), \qquad p_{X_2}(x_{2,j}) = \sum_{i=1}^{M} p_{X_1,X_2}(x_{1,i},x_{2,j}). \]
Continuous r.v.'s: the marginal p.d.f.'s are obtained by integrating out the other variable:
\[ f_{X_1}(x_1) = \int_{-\infty}^{\infty} f_{X_1,X_2}(x_1,x_2)\,dx_2, \qquad f_{X_2}(x_2) = \int_{-\infty}^{\infty} f_{X_1,X_2}(x_1,x_2)\,dx_1. \]
The marginal d.f.'s follow from the joint d.f. by taking limits:
\[ \lim_{x_2\to\infty} F_{X_1,X_2}(x_1,x_2) = F_{X_1}(x_1), \qquad \lim_{x_1\to\infty} F_{X_1,X_2}(x_1,x_2) = F_{X_2}(x_2). \]
In the n-dimensional case, assuming that $X = (X_1,\ldots,X_n)$ is a continuous random vector, the i-th marginal p.d.f. is given by:
\[ f_{X_i}(x_i) = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f_{X_1,\ldots,X_n}(x_1,\ldots,x_n)\,dx_1\cdots dx_{i-1}\,dx_{i+1}\cdots dx_n, \quad 1\le i\le n. \]
It satisfies that:
\[ F_{X_1,X_2}(x_1,x_2) = \sum_{x_{1,i}\le x_1}\,\sum_{x_{2,j}\le x_2} p_{X_1,X_2}(x_{1,i},x_{2,j}). \]
If $X_1$ and $X_2$ are continuous r.v.'s, the joint probability density function (p.d.f.) is a function $f_{X_1,X_2}(x_1,x_2)$ such that:
\[ f_{X_1,X_2}(x_1,x_2) \ge 0, \quad \forall (x_1,x_2)\in\mathbb{R}^2, \]
\[ F_{X_1,X_2}(x_1,x_2) = \int_{-\infty}^{x_1}\int_{-\infty}^{x_2} f_{X_1,X_2}(x_1,x_2)\,dx_1\,dx_2, \qquad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X_1,X_2}(x_1,x_2)\,dx_1\,dx_2 = 1. \]
In addition, the following relationships between the marginal p.d.f.'s and the joint p.d.f. hold:
\[ \int_{-\infty}^{\infty} f_{X_1,X_2}(x_1,x_2)\,dx_2 = f_{X_1}(x_1), \qquad \int_{-\infty}^{\infty} f_{X_1,X_2}(x_1,x_2)\,dx_1 = f_{X_2}(x_2). \]
In addition, the following relationship between the joint p.d.f. and the joint d.f. is fulfilled:
\[ f_{X_1,X_2}(x_1,x_2) = \frac{\partial^2 F_{X_1,X_2}(x_1,x_2)}{\partial x_1\,\partial x_2}. \]
We do not rewrite the above relationships in the multi–dimensional case.
Example 1: Computing some significant distributions from the joint p.d.f.
Let $(X, Y)$ be a two-dimensional r.v. with joint p.d.f.
\[ f(x,y) = 6\left(x - \frac{y}{2}\right)^2, \quad 0\le x\le 1,\ 0\le y\le 1. \]
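As a quick numerical sanity check, here is a minimal Python sketch (assuming NumPy and SciPy are available; the closed-form marginals quoted in the comments are the ones underlying the conditional p.d.f.'s derived below):

```python
import numpy as np
from scipy import integrate

# Joint p.d.f. of Example 1: f(x, y) = 6*(x - y/2)^2 on the unit square.
f = lambda x, y: 6.0 * (x - y / 2.0) ** 2

# Total probability mass; dblquad integrates func(y, x) dy dx, so swap arguments.
total, _ = integrate.dblquad(lambda y, x: f(x, y), 0, 1, lambda x: 0, lambda x: 1)
print(total)  # ~ 1.0

# Marginal p.d.f.'s by integrating out the other variable.
f_X = lambda x: integrate.quad(lambda y: f(x, y), 0, 1)[0]  # = (1 + 6x(2x-1))/2
f_Y = lambda y: integrate.quad(lambda x: f(x, y), 0, 1)[0]  # = (4 + 3y(y-2))/2
print(f_X(0.5), f_Y(0.5))  # 0.5 and 0.875
```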
\[ P[E_1] \overset{\text{indep.}}{=} P[E_1\,|\,E_2] = \frac{P[E_1\cap E_2]}{P[E_2]} \ \Rightarrow\ P[E_1\cap E_2] = P[E_1]\,P[E_2]. \]
\[ F_{X_1,X_2}(x_1,x_2) = P[A_1\cap A_2] = P[A_1]\,P[A_2] = F_{X_1}(x_1)\,F_{X_2}(x_2), \qquad A_i = \{X_i \le x_i\},\ i = 1, 2. \]
\[ X_1, X_2 \text{ are independent r.v.'s} \iff F_{X_1,X_2}(x_1,x_2) = F_{X_1}(x_1)\,F_{X_2}(x_2). \]
The definition can also be given for discrete r.v.'s with joint m.p.f. $p_{X_1,X_2}(x_1,x_2)$.
As a consequence one gets the following characterization (or definition) of
independence of two r.v.’s:
\[ X_1, X_2 \text{ are independent r.v.'s} \iff p_{X_1|X_2}(x_1|x_2) = p_{X_1}(x_1) \ \text{and}\ p_{X_2|X_1}(x_2|x_1) = p_{X_2}(x_2). \]
\[ f_{X_1,X_2,X_3}(x_1,x_2,x_3) = f_{X_1|X_2,X_3}(x_1|x_2,x_3)\,f_{X_2|X_3}(x_2|x_3)\,f_{X_3}(x_3). \]
\[ X_1, X_2, X_3 \text{ are mutually independent r.v.'s} \iff f_{X_1,X_2,X_3}(x_1,x_2,x_3) = f_{X_1}(x_1)\,f_{X_2}(x_2)\,f_{X_3}(x_3). \]
In the n-dimensional case,
\[ f_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = f_{X_1|X_2,\ldots,X_n}(x_1|x_2,\ldots,x_n)\,f_{X_2|X_3,\ldots,X_n}(x_2|x_3,\ldots,x_n)\cdots f_{X_{n-1}|X_n}(x_{n-1}|x_n)\,f_{X_n}(x_n). \]
\[ f_{X|Y}(x|y) = \frac{3(y-2x)^2}{4+3y(y-2)}, \quad 0\le x\le 1,\ 0\le y\le 1, \]
\[ f_{Y|X}(y|x) = \frac{3(y-2x)^2}{1+6x(2x-1)}, \quad 0\le x\le 1,\ 0\le y\le 1. \]
Check that both are p.d.f.'s and plot them.
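A minimal Python sketch for this check (assuming NumPy, SciPy and Matplotlib; the conditioning values 0.3 and 0.7 are arbitrary choices):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import integrate

# Conditional p.d.f.'s computed above (x, y in [0, 1]).
f_x_given_y = lambda x, y: 3 * (y - 2 * x) ** 2 / (4 + 3 * y * (y - 2))
f_y_given_x = lambda y, x: 3 * (y - 2 * x) ** 2 / (1 + 6 * x * (2 * x - 1))

# Each conditional p.d.f. must integrate to 1 in its first argument.
print(integrate.quad(f_x_given_y, 0, 1, args=(0.3,))[0])  # ~ 1.0
print(integrate.quad(f_y_given_x, 0, 1, args=(0.7,))[0])  # ~ 1.0

t = np.linspace(0, 1, 200)
plt.plot(t, f_x_given_y(t, 0.3), label="f(x | y = 0.3)")
plt.plot(t, f_y_given_x(t, 0.7), label="f(y | x = 0.7)")
plt.xlabel("x or y"); plt.legend(); plt.show()
```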
\[ P[A\cap B] = P[A]\,P[B], \quad P[A\cap C] = P[A]\,P[C], \quad P[B\cap C] = P[B]\,P[C], \tag{1} \]
and
\[ P[A\cap B\cap C] = P[A]\,P[B]\,P[C]. \tag{2} \]
It is possible that relationship (2) holds while some of the relationships in (1) fail. Conversely, the next example shows that (1) may hold true while (2) fails.
\[ \Omega = \{s_1, s_2, s_3, s_4\}, \qquad P[s_i] = \frac{1}{4}, \quad i = 1,\ldots,4. \]
Take, for instance, the events $A = \{s_1, s_2\}$, $B = \{s_1, s_3\}$ and $C = \{s_1, s_4\}$. Then
\[ A\cap B = A\cap C = B\cap C = A\cap B\cap C = \{s_1\}, \]
\[ P[A] = P[B] = P[C] = \frac{1}{2}, \qquad P[A\cap B] = P[A\cap C] = P[B\cap C] = P[A\cap B\cap C] = \frac{1}{4}. \]
Hence the relationships in (1) hold, but $P[A\cap B\cap C] = \frac{1}{4} \ne \frac{1}{8} = P[A]\,P[B]\,P[C]$, so (2) fails.
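The computations are easy to check by brute-force enumeration; a minimal Python sketch, using the concrete events assumed above:

```python
# Equiprobable sample space and the three events of the example.
omega = {"s1", "s2", "s3", "s4"}
P = lambda E: len(E) / len(omega)

A, B, C = {"s1", "s2"}, {"s1", "s3"}, {"s1", "s4"}

# Pairwise independence (1) holds:
assert P(A & B) == P(A) * P(B) == 0.25
assert P(A & C) == P(A) * P(C) == 0.25
assert P(B & C) == P(B) * P(C) == 0.25

# ...but mutual independence (2) fails: 1/4 != 1/8.
assert P(A & B & C) == 0.25 and P(A) * P(B) * P(C) == 0.125
```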
For instance, for the joint p.d.f. $f_{X_1,X_2}(x_1,x_2) = 8x_1x_2$ defined on the triangle $D$ below, whose marginals are $f_{X_1}(x_1) = 4x_1(1-x_1^2)$ and $f_{X_2}(x_2) = 4x_2^3$, one gets
\[ f_{X_1,X_2}(0.5, 0.5) = 2 \ne \frac{3}{4} = \frac{3}{2}\times\frac{1}{2} = f_{X_1}(0.5)\,f_{X_2}(0.5) \ \Rightarrow\ f_{X_1,X_2}(x_1,x_2) \ne f_{X_1}(x_1)\,f_{X_2}(x_2), \]
and
\[ D = \{(x_1,x_2)\in\mathbb{R}^2 : 0\le x_1\le x_2\le 1\} \ne D_1\times D_2, \]
i.e., the domain where the joint p.d.f. $f_{X_1,X_2}(x_1,x_2)$ is defined is not a product space (plot the domain!).
Remark: A necessary condition for independence is that the domain $D$ be a product of spaces. However, this condition is not sufficient (see Exercise 2 (3)).
From the concept of conditional distribution one can introduce the definition of truncated distribution, which is very useful in practice, where the support of the r.v.'s is often bounded.
Truncated distributions
Let $X$ be a real r.v. defined on a probability space $(\Omega, \mathcal{F}_\Omega, P)$ and let $T \in \mathcal{B}_{\mathbb{R}}$ be such that $0 < P[\{\omega\in\Omega : X(\omega)\in T\}] < 1$. Then:
Discrete r.v.: If $X$ is a discrete r.v. with m.p.f. $p_X(x) = P[X = x]$, the truncated m.p.f. over the set $T$ is given by:
\[ P[X = x \mid X\in T] = \frac{P[X = x,\ X\in T]}{P[X\in T]} = \begin{cases} \dfrac{p_X(x)}{\sum_{t\in T} p_X(t)}, & \text{if } x\in T, \\[6pt] 0, & \text{otherwise.} \end{cases} \]
Continuous r.v.: If $X$ is a continuous r.v. with p.d.f. $f_X(x)$, the truncated p.d.f. over the set $T$ is given by:
\[ f_{X|T}(x|X\in T) = f_{X|T}(x) = \begin{cases} \dfrac{f_X(x)}{\int_T f_X(y)\,dy}, & \text{if } x\in T, \\[6pt] 0, & \text{otherwise.} \end{cases} \]
For example, a binomial r.v. with 10 trials and success probability $p$ truncated to $T = \{0, 1, \ldots, 6\}$ has m.p.f.
\[ p_X(x) = \frac{\binom{10}{x}\,p^x (1-p)^{10-x}}{\sum_{x=0}^{6}\binom{10}{x}\,p^x (1-p)^{10-x}}, \quad x = 0, 1, \ldots, 6, \]
and an exponential r.v. with rate $\lambda$ truncated to $[0, 20]$ has p.d.f.
\[ f_X(x) = \frac{e^{-\lambda x}}{\int_0^{20} e^{-\lambda y}\,dy}, \quad 0\le x\le 20. \]
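Both truncated examples are easy to evaluate numerically; a minimal sketch, assuming SciPy and the illustrative parameter values $p = 0.4$ and $\lambda = 0.1$ (not fixed by the slides):

```python
import numpy as np
from scipy import stats, integrate

# Binomial Bin(10, p) truncated to T = {0, ..., 6}.
p = 0.4                                   # assumed value
base = stats.binom(10, p)
x = np.arange(0, 7)
p_trunc = base.pmf(x) / base.cdf(6)       # p_X(x) / sum_{t in T} p_X(t)
print(p_trunc.sum())                      # 1.0

# Exponential with rate lam truncated to [0, 20].
lam = 0.1                                 # assumed value
f_trunc = lambda t: lam * np.exp(-lam * t) / (1 - np.exp(-lam * 20))
print(integrate.quad(f_trunc, 0, 20)[0])  # ~ 1.0

# SciPy's built-in truncated exponential agrees (standard form, rescaled):
print(stats.truncexpon(b=20 * lam, scale=1 / lam).pdf(5.0), f_trunc(5.0))
```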
In particular:
Moment w.r.t. the origin: Taking $g(X_1,X_2) = (X_1)^m (X_2)^n$, $m, n \ge 1$, one gets:
\[ \alpha_{m,n} = E[(X_1)^m (X_2)^n] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x_1)^m (x_2)^n\, f_{X_1,X_2}(x_1,x_2)\,dx_1\,dx_2. \]
Moment w.r.t. the mean: Taking $g(X_1,X_2) = (X_1-\mu_{X_1})^m (X_2-\mu_{X_2})^n$, $m, n \ge 1$, one gets:
\[ \mu_{m,n} = E[(X_1-\mu_{X_1})^m (X_2-\mu_{X_2})^n] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x_1-\mu_{X_1})^m (x_2-\mu_{X_2})^n\, f_{X_1,X_2}(x_1,x_2)\,dx_1\,dx_2. \]
Remarks:
1. The above definitions can also be stated for discrete r.v.'s $X_1$ and $X_2$, replacing integrals by sums and the joint p.d.f. by the joint m.p.f.
2. In the same way as we did for univariate r.v.'s, one can define the absolute moments w.r.t. the origin and the mean.
3. These definitions can also be given for n-dimensional random vectors.
Correlation ($\alpha_{1,1}$):
\[ \alpha_{1,1} \triangleq E[X_1 X_2]. \]
Covariance ($\mu_{1,1}$):
\[ \mu_{1,1} \triangleq E[(X_1-\mu_{X_1})(X_2-\mu_{X_2})] = \alpha_{1,1} - \alpha_{1,0}\,\alpha_{0,1} = E[X_1X_2] - E[X_1]\,E[X_2]. \]
Correlation coefficient ($\rho_{X_1,X_2}$):
\[ \rho_{X_1,X_2} \triangleq \frac{\mu_{1,1}}{\sqrt{\mu_{2,0}\,\mu_{0,2}}} \in [-1,1] \quad \text{(by the Schwarz inequality).} \]
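These quantities are straightforward to estimate by Monte Carlo; a minimal sketch with an arbitrary dependent pair (the linear-plus-noise model is only an illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)        # dependent by construction

alpha_11 = np.mean(x1 * x2)               # correlation  E[X1 X2]
mu_11 = alpha_11 - x1.mean() * x2.mean()  # covariance   E[X1 X2] - E[X1]E[X2]
rho = mu_11 / (x1.std() * x2.std())       # correlation coefficient

# Cross-check against NumPy's estimator; theoretical rho = 0.8/sqrt(1.64) ~ 0.625.
print(alpha_11, mu_11, rho, np.corrcoef(x1, x2)[0, 1])
```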
\[ X_1, X_2 \text{ are independent r.v.'s} \ \Rightarrow\ E[f(X_1)\,g(X_2)] = E[f(X_1)]\,E[g(X_2)], \]
but the converse implication does not hold in general.
The coefficients of the best linear approximation $X_2 \approx aX_1 + b$ in the mean square (m.s.) sense, and the corresponding minimal m.s. error $e$, are:
\[ a = \rho_{X_1,X_2}\,\frac{\sigma_{X_2}}{\sigma_{X_1}}, \qquad b = \mu_{X_2} - a\,\mu_{X_1}, \qquad e = \sigma_{X_2}^2\left(1 - (\rho_{X_1,X_2})^2\right). \]
As a consequence:
\[ e = 0 \iff |\rho_{X_1,X_2}| = 1, \]
and then:
\[ \rho_{X_1,X_2} = 1 \ \Rightarrow\ X_2 = aX_1 + b, \quad a = \frac{\sigma_{X_2}}{\sigma_{X_1}} > 0, \quad b = \mu_{X_2} - \frac{\sigma_{X_2}}{\sigma_{X_1}}\,\mu_{X_1}, \]
\[ \rho_{X_1,X_2} = -1 \ \Rightarrow\ X_2 = aX_1 + b, \quad a = -\frac{\sigma_{X_2}}{\sigma_{X_1}} < 0, \quad b = \mu_{X_2} + \frac{\sigma_{X_2}}{\sigma_{X_1}}\,\mu_{X_1}. \]
This is the case where the linear approximation is best in the m.s. sense. The opposite case is when the error is maximum, i.e., $\rho_{X_1,X_2} = 0$, for which $e = \sigma_{X_2}^2$.
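A minimal sketch illustrating that $a = \rho\,\sigma_{X_2}/\sigma_{X_1}$ and $b = \mu_{X_2} - a\mu_{X_1}$ reproduce an ordinary least-squares fit (the simulated pair is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(size=50_000)
x2 = 2.0 * x1 + 0.3 * rng.normal(size=50_000)

rho = np.corrcoef(x1, x2)[0, 1]
a = rho * x2.std() / x1.std()              # best m.s. slope
b = x2.mean() - a * x1.mean()              # best m.s. intercept
e = x2.var() * (1 - rho ** 2)              # minimal m.s. error

a_ls, b_ls = np.polyfit(x1, x2, 1)         # least-squares fit on the sample
print(a, a_ls)                             # ~ equal
print(b, b_ls)                             # ~ equal
print(e, np.mean((x2 - a * x1 - b) ** 2))  # ~ equal
```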
Setting $U = X^2$ and $V = Y^2$, where $X$ and $Y$ are independent nonnegative r.v.'s:
\begin{align*}
F_{U,V}(u,v) &= P[U \le u,\, V \le v] = P[X^2 \le u,\, Y^2 \le v]\\
&= P[X \le \sqrt{u},\, Y \le \sqrt{v}]\\
&\overset{\text{indep.}}{=} P[X \le \sqrt{u}]\,P[Y \le \sqrt{v}]\\
&= P[X^2 \le u]\,P[Y^2 \le v]\\
&= P[U \le u]\,P[V \le v]\\
&= F_U(u)\,F_V(v).
\end{align*}
The first part of the previous exercise shows that two r.v.’s which are functionally
dependent can be statistically independent.
If $X_1, X_2, \ldots, X_n$ are mutually independent r.v.'s (or, more mildly, uncorrelated), then $C[X_i, X_j] = 0$ if $i \ne j$, and the previous expression yields
\[ V\!\left[\sum_{i=1}^{n} a_i X_i\right] = \sum_{i=1}^{n} a_i^2\, V[X_i]. \]
\[ \Sigma_{X,X} = \begin{pmatrix} C[X_1,X_1] & C[X_1,X_2] \\ C[X_2,X_1] & C[X_2,X_2] \end{pmatrix} = \begin{pmatrix} \sigma_{X_1}^2 & \rho_{X_1,X_2}\,\sigma_{X_1}\sigma_{X_2} \\ \rho_{X_1,X_2}\,\sigma_{X_1}\sigma_{X_2} & \sigma_{X_2}^2 \end{pmatrix}, \]
\[ \rho_{X_1,X_2} \triangleq \frac{\mu_{1,1}}{\sqrt{\mu_{2,0}\,\mu_{0,2}}} = \frac{C[X_1,X_2]}{\sigma_{X_1}\sigma_{X_2}}. \]
For instance, the joint p.d.f. of a bivariate Gaussian random vector, with $\rho = \rho_{X_1,X_2}$, is
\[ f_{X_1,X_2}(x_1,x_2) = \frac{1}{2\pi\sigma_{X_1}\sigma_{X_2}\sqrt{1-\rho^2}}\,\exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \left(\frac{x_1-\mu_{X_1}}{\sigma_{X_1}}\right)^2 - 2\rho\,\frac{x_1-\mu_{X_1}}{\sigma_{X_1}}\,\frac{x_2-\mu_{X_2}}{\sigma_{X_2}} + \left(\frac{x_2-\mu_{X_2}}{\sigma_{X_2}}\right)^2 \right] \right\}. \]
H: Let $X = (X_1,\ldots,X_n) \sim N(\mu_X;\, \Sigma_X)$, $A \in \mathbb{R}^{m\times n}$ and $b \in \mathbb{R}^m$.
T: Then $AX + b \sim N(A\mu_X + b;\, A\Sigma_X A^T)$.
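This theorem is easy to verify by simulation; a minimal sketch with arbitrary choices of $\mu_X$, $\Sigma_X$, $A$ and $b$:

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, -1.0]])          # A in R^{2x3}
b = np.array([3.0, -1.0])

X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ A.T + b                           # samples of AX + b

print(Y.mean(axis=0), A @ mu + b)         # empirical vs exact mean
print(np.cov(Y.T), A @ Sigma @ A.T)       # empirical vs exact covariance
```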
We showed that
\[ X_1, X_2 \text{ are independent r.v.'s} \ \Rightarrow\ X_1, X_2 \text{ are uncorrelated r.v.'s}; \]
however, the converse implication does not hold in general.
The definition in the case that both r.v.'s are discrete can be made in an analogous way. This complex function always exists. Recall that
\[ \varphi_{X_1,X_2}(u_1,u_2) = E\!\left[e^{i(u_1X_1+u_2X_2)}\right], \]
so $\varphi_{X_1,X_2}(u_1,u_2)$ is just the two-dimensional Fourier transform of the joint p.d.f. $f_{X_1,X_2}(x_1,x_2)$. By the Fourier inversion theorem, one can recover the joint p.d.f.:
\[ f_{X_1,X_2}(x_1,x_2) = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i(u_1x_1+u_2x_2)}\,\varphi_{X_1,X_2}(u_1,u_2)\,du_1\,du_2. \]
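As an illustration, a Monte Carlo estimate of the joint characteristic function of two independent standard normals matches the known closed form $e^{-(u_1^2+u_2^2)/2}$ (a minimal sketch; the evaluation point is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
x1, x2 = rng.normal(size=(2, 200_000))    # independent N(0, 1) samples

def phi_mc(u1, u2):
    """Monte Carlo estimate of E[exp(i(u1 X1 + u2 X2))]."""
    return np.mean(np.exp(1j * (u1 * x1 + u2 * x2)))

u1, u2 = 0.7, -1.2
print(phi_mc(u1, u2))                     # ~ exp(-(u1^2 + u2^2)/2)
print(np.exp(-(u1 ** 2 + u2 ** 2) / 2))
```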
H: Let $X = (X_1, X_2)$ be a two-dimensional r.v. with joint p.d.f. $f_{X_1,X_2}(x_1,x_2)$. Let $r : \mathbb{R}^2 \to \mathbb{R}^2$ be a one-to-one deterministic map and $s : \mathbb{R}^2 \to \mathbb{R}^2$ its inverse:
\[ y_1 = r_1(x_1,x_2), \qquad x_1 = s_1(y_1,y_2), \]
\[ y_2 = r_2(x_1,x_2), \qquad x_2 = s_2(y_1,y_2). \]
Let us assume that both maps are differentiable, with their four partial derivatives continuous. Let us also assume that the Jacobian $J_2$ of the inverse map satisfies:
\[ J_2 = \det\begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \dfrac{\partial x_2}{\partial y_1} \\[8pt] \dfrac{\partial x_1}{\partial y_2} & \dfrac{\partial x_2}{\partial y_2} \end{pmatrix} \ne 0. \]
T: Then the joint p.d.f. $f_{Y_1,Y_2}(y_1,y_2)$ of the two-dimensional r.v. $Y = (Y_1,Y_2) = (r_1(X_1,X_2), r_2(X_1,X_2))$ is given by
\[ f_{Y_1,Y_2}(y_1,y_2) = f_{X_1,X_2}\big(s_1(y_1,y_2),\, s_2(y_1,y_2)\big)\,|J_2|. \]
In the n-dimensional case, $f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}(\mathbf{s}(\mathbf{y}))\,|J_n|$, where $\mathbf{s}(\mathbf{y})$ is the inverse transformation of $\mathbf{r}(\mathbf{x})$: $\mathbf{x} = \mathbf{r}^{-1}(\mathbf{y}) = \mathbf{s}(\mathbf{y})$, and $J_n$ is the Jacobian of the transformation, i.e.,
\[ J_n = \det\frac{\partial \mathbf{x}}{\partial \mathbf{y}} = \det\begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \cdots & \dfrac{\partial x_n}{\partial y_1} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial x_1}{\partial y_n} & \cdots & \dfrac{\partial x_n}{\partial y_n} \end{pmatrix}. \]
\[ Y_1 = r_1(X_1,X_2) = X_1 + X_2, \qquad Y_2 = r_2(X_1,X_2) = X_1. \]
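For this map the Jacobian can be computed symbolically; a minimal sketch with SymPy (the inverse map $x_1 = y_2$, $x_2 = y_1 - y_2$ follows directly from the definition):

```python
import sympy as sp

y1, y2 = sp.symbols("y1 y2")
# Inverse of Y1 = X1 + X2, Y2 = X1:  x1 = s1 = y2,  x2 = s2 = y1 - y2.
s1, s2 = y2, y1 - y2

# Jacobian of the inverse map, rows d/dy1 and d/dy2, columns x1 and x2.
J2 = sp.Matrix([[sp.diff(s1, y1), sp.diff(s2, y1)],
                [sp.diff(s1, y2), sp.diff(s2, y2)]]).det()
print(J2)  # -1, so |J2| = 1 and f_{Y1,Y2}(y1, y2) = f_{X1,X2}(y2, y1 - y2)
```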
H: Let $(X_1, X_2)$ be a continuous random vector with joint p.d.f. $f_{X_1,X_2}(x_1,x_2)$ and respective domains $D_{X_1} = \{x_1 : x_{1,1} \le x_1 \le x_{1,2}\}$ and $D_{X_2} = \{x_2 : x_{2,1} \le x_2 \le x_{2,2}\}$.
T: Then the p.d.f. $f_{Y_1}(y_1)$ of their sum $Y_1 = X_1 + X_2$ is given by:
\[ f_{Y_1}(y_1) = \int_{x_{1,1}}^{x_{1,2}} f_{X_1,X_2}(x_1,\, y_1 - x_1)\,dx_1, \qquad y_{1,1} = x_{1,1} + x_{2,1} \le y_1 \le x_{1,2} + x_{2,2} = y_{1,2}, \]
or, equivalently, by
\[ f_{Y_1}(y_1) = \int_{x_{2,1}}^{x_{2,2}} f_{X_1,X_2}(y_1 - x_2,\, x_2)\,dx_2, \qquad y_{1,1} = x_{1,1} + x_{2,1} \le y_1 \le x_{1,2} + x_{2,2} = y_{1,2}. \]
If $X_1$ and $X_2$ are independent r.v.'s, the p.d.f. of the sum of two independent r.v.'s is just the convolution of their respective p.d.f.'s:
\[ f_{Y_1}(y_1) = \int_{x_{1,1}}^{x_{1,2}} f_{X_1}(x_1)\,f_{X_2}(y_1 - x_1)\,dx_1, \quad \text{or} \quad f_{Y_1}(y_1) = \int_{x_{2,1}}^{x_{2,2}} f_{X_1}(y_1 - x_2)\,f_{X_2}(x_2)\,dx_2. \]
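A minimal sketch, assuming two independent $U(0,1)$ r.v.'s, whose convolution is the triangular density on $[0, 2]$; the histogram of simulated sums is compared with that density:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
y = rng.uniform(size=200_000) + rng.uniform(size=200_000)  # Y1 = X1 + X2

# Convolution of two U(0, 1) p.d.f.'s: triangular density on [0, 2].
grid = np.linspace(0, 2, 400)
f_y = np.where(grid <= 1, grid, 2 - grid)

plt.hist(y, bins=100, density=True, alpha=0.5, label="histogram of X1 + X2")
plt.plot(grid, f_y, label="convolution p.d.f.")
plt.legend(); plt.show()
```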
Copula
A copula is a d.f.
\[ C : [0,1]^n \longrightarrow [0,1] \]
of n dependent uniform r.v.'s $U_1,\ldots,U_n \sim U([0,1])$:
\[ C(u_1,\ldots,u_n) = P[U_1 \le u_1, \ldots, U_n \le u_n]. \]
The t-copula is defined by
\[ C(u_1,\ldots,u_n) = T_{\nu,\Sigma}\big(T_\nu^{-1}(u_1),\ldots,T_\nu^{-1}(u_n)\big), \]
where $T_{\nu,\Sigma}$ is the d.f. of the multivariate $t_{0,\Sigma}$ distribution, with $\nu$ degrees of freedom, mean vector $\mathbf{0}$ and correlation matrix $\Sigma$ (i.e., $\Sigma$ is a covariance matrix with ones in the main diagonal), and $T_\nu^{-1}$ is the inverse of the d.f. of the univariate $t_\nu$ distribution. This includes the special case ($\nu = \infty$) of the Gaussian copula model. In this setting the dependency structure in the random vector $X$ is determined by the correlation matrix $\Sigma$.
For independent r.v.'s one obtains the product (independence) copula:
\[ C(u_1,\ldots,u_n) = \prod_{i=1}^{n} u_i. \]
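A minimal sketch of sampling from the Gaussian copula (the $\nu = \infty$ case mentioned above), assuming SciPy and an arbitrary correlation $\rho = 0.7$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
rho = 0.7
Sigma = np.array([[1.0, rho],
                  [rho, 1.0]])            # correlation matrix

# Push a correlated Gaussian sample through the standard normal d.f.:
z = rng.multivariate_normal([0.0, 0.0], Sigma, size=100_000)
u = stats.norm.cdf(z)                     # (U1, U2): uniform marginals, dependent

print(u.min(), u.max(), u.mean(axis=0))   # marginals look U(0, 1), means ~ 0.5
print(np.corrcoef(u.T)[0, 1])             # dependence inherited from rho
```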
Let $X_i \in L_{RV} = L_2$, $1 \le i \le n$, and consider the set $L_2^n$ whose elements are n-dimensional random vectors $X = (X_1,\ldots,X_n)$. This set has the structure of a linear space and, endowed with the norm
\[ \|X\|_{L_2^n} = \max_{1\le i\le n} \|X_i\|_{RV}, \]
it is a Banach space. Therefore the concept of convergence in this space is the one induced by the above norm.
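A minimal sketch estimating this norm from samples of an illustrative 3-dimensional random vector (the component distributions are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
# X = (X1, X2, X3): arbitrary illustrative components.
X = np.stack([rng.normal(0.0, 1.0, n),
              rng.normal(1.0, 2.0, n),
              rng.uniform(0.0, 1.0, n)])

l2 = np.sqrt(np.mean(X ** 2, axis=1))  # ||Xi||_{L2} = sqrt(E[Xi^2])
print(l2, l2.max())                    # ||X||_{L2^n} = max_i ||Xi||_{L2} ~ sqrt(5)
```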