
Multivariate Statistical Methods

Abiyot Negash (Assistant Professor)

Department of Statistics



1. Introduction

This course is concerned with statistical methods designed to elicit information from data sets with many different variables. Because the data include simultaneous measurements on many variables, this body of methodology is called multivariate analysis.

The need to understand the relationships between many variables makes multivariate analysis an inherently difficult subject.

Many multivariate methods are based on an underlying probability model known as the multivariate normal distribution.

Multivariate analysis is a "mixed bag": it is difficult to establish a classification scheme for multivariate techniques that is both widely accepted and indicative of the appropriateness of each technique.


Multivariate statistical analysis consists of a collection of methods that can be used when several measurements are made on each individual (subject or experimental unit). We will refer to the measurements as variables and to the individuals or objects as units (research units, experimental units, or observations).

Using multivariate analysis, the variables can be examined simultaneously in order to assess the key features of the process. It enables us to
- explore the joint performance of the variables, and
- determine the effect of each variable in the presence of the others.


It is assumed that a random sample of multi-component observations has been collected from different individuals. The data consist of simultaneous measurements on many response variables.

The common source of each individual observation will generally lead to dependence or correlation among the dimensions (components), and this is the feature that distinguishes multivariate data and techniques from their univariate counterparts.


1.2 Some Basic Matrix and Vector Algebra


1.2.1 The Organization of Data
A matrix is a rectangular or square array of numbers or variables arranged in rows and columns.

X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1k} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2k} & \cdots & x_{2p} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{j1} & x_{j2} & \cdots & x_{jk} & \cdots & x_{jp} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk} & \cdots & x_{np}
\end{pmatrix}
= \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_j' \\ \vdots \\ x_n' \end{pmatrix}

Here p variables are observed on n subjects:
- the j-th row contains the measurements from the j-th subject;
- the k-th column contains the measurements on the k-th variable.

Matrix representation allows calculations to be done via matrix algebra.

A vector is a matrix with a single column or row:

x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \quad \text{or} \quad x' = (x_1, x_2, \ldots, x_n)

where the prime denotes the operation of transposing a column to a row.

Figure: The vector x = [1, 3, 2]



A vector has both magnitude (length) and direction. The length of a vector x' = (x_1, x_2, \ldots, x_n) is defined by

L_x = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \sqrt{x'x}
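As a quick numeric check, a minimal NumPy sketch (Python/NumPy is used here and in the sketches below purely for illustration; the values are the vector from the figure above):

    import numpy as np

    x = np.array([1.0, 3.0, 2.0])      # the vector x = [1, 3, 2]
    length = np.sqrt(x @ x)            # L_x = sqrt(x'x)
    print(length)                      # 3.7417... = sqrt(14)
    print(np.linalg.norm(x))           # same result via the built-in norm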

The length of a vector can be expanded or contracted by multiplying it by a constant c:

cx = \begin{pmatrix} cx_1 \\ cx_2 \\ \vdots \\ cx_n \end{pmatrix}

Such multiplication of a vector x by a scalar c changes the length as

L_{cx} = \sqrt{c^2 x_1^2 + c^2 x_2^2 + \cdots + c^2 x_n^2} = |c| \sqrt{x'x} = |c| L_x


When |c| > 1, the vector x is expanded; when |c| < 1, it is contracted; when |c| = 1, the length is unchanged. If c < 0, the direction of x is reversed.

Choosing c = L_x^{-1}, we obtain the unit vector L_x^{-1} x, which has length 1 and lies in the direction of x.

Example: If n = 2, consider the vector x = (x_1, x_2)'. The length of x is L_x = \sqrt{x_1^2 + x_2^2}. Geometrically, the length of a vector in two dimensions can be viewed as the hypotenuse of a right triangle.
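A minimal sketch of scaling and normalizing, with illustrative values:

    import numpy as np

    x = np.array([3.0, 4.0])           # L_x = 5 (a 3-4-5 right triangle)
    c = -2.0
    print(np.linalg.norm(c * x))       # |c| * L_x = 10.0
    u = x / np.linalg.norm(x)          # unit vector in the direction of x
    print(np.linalg.norm(u))           # 1.0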


1.2.2 Matrix Characteristics

Rank: The rank of a matrix A is the maximum number of linearly independent rows (columns).
- A set of k vectors x_1, x_2, \ldots, x_k is said to be linearly independent if \sum_{i=1}^{k} \alpha_i x_i = 0 only if \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0.
- Linear independence implies that no vector can be written as a linear combination of the other vectors.

Example:

x_1 = \begin{pmatrix} 3 \\ 4 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}

\alpha_1 x_1 + \alpha_2 x_2 = 0 \Rightarrow
3\alpha_1 + 2\alpha_2 = 0
4\alpha_1 + \alpha_2 = 0

which holds only if \alpha_1 = \alpha_2 = 0. This confirms that x_1 and x_2 are linearly independent.
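The same conclusion can be reached numerically through the rank; a small NumPy sketch:

    import numpy as np

    X = np.column_stack(([3, 4], [2, 1]))   # columns are x1 and x2
    print(np.linalg.matrix_rank(X))         # 2: the columns are linearly independent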

The row rank and column rank of a matrix are equal.
* Rank(A) ≥ 0
* Rank(A) ≤ min(n, p)
* Rank(A) = Rank(A')
* Rank(A) = Rank(A'A) = Rank(AA')

Trace: The trace of a square matrix A_{k×k} is the sum of its diagonal elements: tr(A) = \sum_{i=1}^{k} a_{ii}
- tr(A ± B) = tr(A) ± tr(B)
- tr(cA) = c tr(A)
- tr(A_{n×p} B_{p×n}) = tr(BA)
- tr(A_{n×p} B_{p×q} C_{q×n}) = tr(CAB) = tr(BCA)
- tr(A'A) = tr(AA') = \sum_{i=1}^{n} \sum_{j=1}^{p} a_{ij}^2
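A short NumPy sketch verifying two of these identities on random matrices (the seed and shapes are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 5))
    B = rng.standard_normal((5, 3))
    print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # tr(AB) = tr(BA)
    print(np.isclose(np.trace(A.T @ A), np.sum(A**2)))    # tr(A'A) = sum of squared entries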


Determinants: det(A) = |A|
- |cA_{n×n}| = c^n |A|
- |A_{n×n} B_{n×n}| = |AB| = |BA| = |A||B|

Inverse: If a matrix A is square and of full rank, then A is said to be nonsingular, and A has a unique inverse, denoted by A^{-1}, with the property that

AA^{-1} = A^{-1}A = I

A^{-1} exists if and only if the determinant of A is non-zero, and hence

|A^{-1}| = |A|^{-1}

If A and B are the same size and nonsingular, then the inverse of their product is the product of their inverses in reverse order,

(AB)^{-1} = B^{-1}A^{-1}
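A quick NumPy check of the reverse-order rule (random matrices of this kind are nonsingular with probability one):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 3))
    B = rng.standard_normal((3, 3))
    lhs = np.linalg.inv(A @ B)
    rhs = np.linalg.inv(B) @ np.linalg.inv(A)   # inverses in reverse order
    print(np.allclose(lhs, rhs))                # True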


Kronecker product

Let A = (a_{ij}) be a p × m matrix and B = (b_{kl}) be a q × n matrix. The (right) Kronecker product of A and B is the pq × mn block matrix

A ⊗ B = (a_{ij} B) = \begin{pmatrix}
a_{11}B & a_{12}B & \cdots & a_{1m}B \\
a_{21}B & a_{22}B & \cdots & a_{2m}B \\
\vdots  & \vdots  &        & \vdots  \\
a_{p1}B & a_{p2}B & \cdots & a_{pm}B
\end{pmatrix}

Properties:
- (A ⊗ B) ⊗ C = A ⊗ (B ⊗ C)
- (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD)
- (A + B) ⊗ C = (A ⊗ C) + (B ⊗ C)
- tr(A ⊗ B) = tr(A) tr(B)
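These properties are easy to verify with NumPy's np.kron; a sketch with shapes chosen so that the mixed-product rule applies:

    import numpy as np

    rng = np.random.default_rng(2)
    A, C = rng.standard_normal((2, 3)), rng.standard_normal((3, 2))
    B, D = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
    print(np.kron(A, B).shape)                  # (4, 6), i.e. pq x mn
    print(np.allclose(np.kron(A, B) @ np.kron(C, D),
                      np.kron(A @ C, B @ D)))   # mixed-product rule: True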


Positive definite matrix: A symmetric matrix A is said to be positive definite if Q(x) = x'Ax > 0 for all x ≠ 0, where x = (x_1, x_2, \ldots, x_n)'. A symmetric matrix A is said to be positive semi-definite if x'Ax ≥ 0 for all x ≠ 0.

A positive definite matrix A can be factored as

A = T'T

where T is a nonsingular upper triangular matrix.
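A sketch of this factorization via NumPy's Cholesky routine, which returns a lower triangular factor L with A = LL'; taking T = L' gives the upper triangular factor above (the matrix A is an illustrative positive definite choice):

    import numpy as np

    A = np.array([[4.0, 2.0],
                  [2.0, 3.0]])         # symmetric positive definite
    L = np.linalg.cholesky(A)          # lower triangular, A = L L'
    T = L.T                            # upper triangular
    print(np.allclose(A, T.T @ T))     # A = T'T: True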


Eigenvalues and eigenvectors: For every square matrix A, a scalar λ and a nonzero vector x can be found such that

Ax = λx

where λ is called an eigenvalue of A, and x is an eigenvector of A corresponding to λ.


To find λ and x, we use the polynomial equation

(A − λI)x = 0 ⇒ |A − λI| = 0

The equation |A − λI| = 0, viewed as a function of λ, is called the characteristic equation.
- If A is an n × n matrix, the characteristic equation has n roots (counting multiplicity); that is, A has n eigenvalues λ_1, λ_2, \ldots, λ_n.
- The eigenvalues of a positive definite matrix are all positive; those of a positive semi-definite matrix are positive or zero, with the number of positive eigenvalues equal to the rank of the matrix.
- The eigenvalues of a diagonal matrix are the diagonal elements themselves, and those of an idempotent matrix are 1 and 0.

Associated with every eigenvalue λ_i of a square matrix A, there is an eigenvector x_i whose elements satisfy the homogeneous system of equations

(A − λ_i I)x_i = 0 ⇔ Ax_i = λ_i x_i

The eigenvectors are unique only up to multiplication by a scalar, and hence we can adjust the length of x by normalizing it to have unit length.

The normalized eigenvector e_i of x_i is:

e_i = \frac{1}{L_{x_i}} x_i = \frac{x_i}{\sqrt{x_i' x_i}}

The normalized eigenvectors are chosen to satisfy e_1'e_1 = e_2'e_2 = \cdots = e_n'e_n = 1 and to be mutually perpendicular, e_i'e_j = 0 for i ≠ j.
Example: Find the eigenvalues and eigenvectors of

A = \begin{pmatrix} 1 & 2 \\ -1 & 4 \end{pmatrix}



|A − λI| = \begin{vmatrix} 1-λ & 2 \\ -1 & 4-λ \end{vmatrix} = (1 − λ)(4 − λ) + 2 = 0

λ^2 − 5λ + 6 = (λ − 3)(λ − 2) = 0

from which λ_1 = 3 and λ_2 = 2. To find the eigenvector corresponding to λ_1 = 3 we use the equation (A − λI)x = 0:

\begin{pmatrix} 1-3 & 2 \\ -1 & 4-3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}

−2x_1 + 2x_2 = 0
−x_1 + x_2 = 0

The two equations are redundant and reduce to a single equation in two unknowns, x_1 = x_2. The solution vector can be written with an arbitrary constant c:

\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} = c \begin{pmatrix} 1 \\ 1 \end{pmatrix}

If c is set equal to 1/\sqrt{2} to normalize the eigenvector, we obtain

 

1/ 2
x1 = 
 √ 

1/ 2

Similarly, corresponding to λ2 = 2, we have


 

2/ 5
x2 =  √ 

1/ 5
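A minimal NumPy cross-check of this example (np.linalg.eig may return the eigenpairs in a different order and with different signs):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [-1.0, 4.0]])
    vals, vecs = np.linalg.eig(A)
    print(vals)    # [3. 2.] (order may differ)
    print(vecs)    # columns proportional to (1, 1)/sqrt(2) and (2, 1)/sqrt(5)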

For any square matrix A with eigenvalues λ_1, λ_2, \ldots, λ_n, we have

tr(A) = \sum_{i=1}^{n} λ_i

|A| = \prod_{i=1}^{n} λ_i

In the example above, tr(A) = 1 + 4 = 5 = 3 + 2 and |A| = (1)(4) − (2)(−1) = 6 = (3)(2).

Spectral Decomposition of a Symmetric Matrix

Any symmetric square matrix can be constructed from its eigenvalues and eigenvectors.

Let A be an n × n symmetric matrix with eigenvalues λ_1, λ_2, \ldots, λ_n and normalized eigenvectors e_1, e_2, \ldots, e_n. Then the spectral decomposition of A is given by:

A = λ_1 e_1 e_1' + λ_2 e_2 e_2' + \cdots + λ_n e_n e_n' = \sum_{j=1}^{n} λ_j e_j e_j' = PΛP'

Example: Consider the symmetric matrix

A = \begin{pmatrix} 1 & 2 \\ 2 & -2 \end{pmatrix}

The eigenvalues obtained from the characteristic equation |A − λI| = 0 are λ_1 = 2 and λ_2 = −3, and the eigenvectors are found as follows. For λ_1 = 2:

    
1 2  x11  x11 
Ax1 = λ1 x1 ⇔ 

  = 2 
   
2 −2 x21 x21
 
1 2
⇒ x21 = x11 ⇒ x1 =  
2  
1
 

2/ 5
The normalized eigenvector corresponding to λ1 = 2 is e1 = 
 √  For

1/ 5
 

 1/ 5 
λ2 = 3 the corresponding normalized eigenvector is e2 =  ,
 √ 
−2/ 5
We need to show A = λ1 e1 e01 + λ2 e2 e02


\begin{pmatrix} 1 & 2 \\ 2 & -2 \end{pmatrix} = 2 \begin{pmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{pmatrix} \begin{pmatrix} 2/\sqrt{5} & 1/\sqrt{5} \end{pmatrix} - 3 \begin{pmatrix} 1/\sqrt{5} \\ -2/\sqrt{5} \end{pmatrix} \begin{pmatrix} 1/\sqrt{5} & -2/\sqrt{5} \end{pmatrix}
The matrix is thus written as a function of its eigenvalues and normalized eigenvectors.

In matrix form, the spectral decomposition of A is

A = PΛP'

where P = (e_1, e_2, \ldots, e_n) and Λ = diag(λ_1, λ_2, \ldots, λ_n). Note here that P'P = PP' = I_{n×n} (P is orthogonal, so P^{-1} = P'). In the above example,

P = (e_1, e_2) = \begin{pmatrix} 2/\sqrt{5} & 1/\sqrt{5} \\ 1/\sqrt{5} & -2/\sqrt{5} \end{pmatrix}, \quad Λ = \begin{pmatrix} 2 & 0 \\ 0 & -3 \end{pmatrix}

⇒ A = PΛP'

Again, using the spectral decomposition, for a positive definite matrix A,

A^{-1} = PΛ^{-1}P'
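A NumPy sketch of the same reconstruction using np.linalg.eigh, which is designed for symmetric matrices and returns the eigenvalues in ascending order:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [2.0, -2.0]])
    lam, P = np.linalg.eigh(A)                      # lam = [-3., 2.]; columns of P are the e_i
    print(np.allclose(A, P @ np.diag(lam) @ P.T))   # A = P Lambda P': True
    print(np.allclose(P.T @ P, np.eye(2)))          # P is orthogonal: True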

Powers of a Symmetric Matrix

Let A_{k×k} be a symmetric matrix with spectral decomposition A = PΛP'.
Define Λ^y = diag(λ_1^y, \ldots, λ_k^y). Then:
- A^n = A × \cdots × A = PΛ^n P'
- A^{-1} = PΛ^{-1}P', assuming λ_i ≠ 0 for all i = 1, 2, \ldots, k
- A^{1/2} = PΛ^{1/2}P' is called the symmetric square root of A

The eigenvalues of A^n, A^{-1}, and A^{1/2} are easily determined from λ_1, \ldots, λ_k.
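A minimal sketch of these formulas, assuming A is positive definite so that Λ^{1/2} and Λ^{-1} are real and well defined:

    import numpy as np

    A = np.array([[4.0, 2.0],
                  [2.0, 3.0]])                   # symmetric positive definite
    lam, P = np.linalg.eigh(A)
    A_half = P @ np.diag(np.sqrt(lam)) @ P.T     # A^{1/2} = P Lambda^{1/2} P'
    print(np.allclose(A_half @ A_half, A))       # True
    A_inv = P @ np.diag(1.0 / lam) @ P.T         # A^{-1} = P Lambda^{-1} P'
    print(np.allclose(A_inv, np.linalg.inv(A)))  # True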


Singular Value Decomposition

Let A be an n × p matrix of rank k. Then the singular value decomposition of A can be expressed as

A = UDV'

where U is n × k, D is k × k, and V is p × k.
- The diagonal elements of the nonsingular diagonal matrix D = diag(λ_1, λ_2, \ldots, λ_k) are the positive square roots of λ_1^2, λ_2^2, \ldots, λ_k^2, which are the nonzero eigenvalues of A'A or of AA'.
- The values λ_1, λ_2, \ldots, λ_k are called the singular values of A.
- The k columns of U are the normalized eigenvectors of AA' corresponding to the eigenvalues λ_1^2, λ_2^2, \ldots, λ_k^2.
- The k columns of V are the normalized eigenvectors of A'A corresponding to the eigenvalues λ_1^2, λ_2^2, \ldots, λ_k^2.


Example:

A = \begin{pmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{pmatrix}

AA' = \begin{pmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{pmatrix} \begin{pmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ 1 & 11 \end{pmatrix}

and

A'A = \begin{pmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 10 & 0 & 2 \\ 0 & 10 & 4 \\ 2 & 4 & 2 \end{pmatrix}

Eigenvalues and eigenvectors of AA':

|AA' − λI| = 0 ⇒ the eigenvalues are 12 and 10

The nonzero eigenvalues of AA' (or of A'A) are 12 and 10, which implies that the singular values of A are λ_1 = \sqrt{12} and λ_2 = \sqrt{10}.

Eigenvector corresponding to the eigenvalue 12:

AA'x_1 = 12x_1 ⇒ \begin{pmatrix} 11 & 1 \\ 1 & 11 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{21} \end{pmatrix} = 12 \begin{pmatrix} x_{11} \\ x_{21} \end{pmatrix} ⇒ x_{11} = x_{21}

Let x_{11} = 1 ⇒ x_{21} = 1 ⇒ u_1' = (1/\sqrt{2}, 1/\sqrt{2})

Similarly, the eigenvector corresponding to the eigenvalue 10 is u_2' = (1/\sqrt{2}, -1/\sqrt{2}).

Eigenvalues and eigenvectors of A'A:

|A'A − λI| = \begin{vmatrix} 10-λ & 0 & 2 \\ 0 & 10-λ & 4 \\ 2 & 4 & 2-λ \end{vmatrix} = 0 ⇒ λ = 12, 10, or 0

Eigenvector corresponding to the eigenvalue 12:

A'Ax_1 = 12x_1 ⇒ \begin{pmatrix} 10 & 0 & 2 \\ 0 & 10 & 4 \\ 2 & 4 & 2 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{21} \\ x_{31} \end{pmatrix} = 12 \begin{pmatrix} x_{11} \\ x_{21} \\ x_{31} \end{pmatrix}

⇒ x_{11} = x_{31} and x_{21} = 2x_{31}

Let x_{31} = 1 ⇒ x_{11} = 1 and x_{21} = 2 ⇒ v_1' = (1/\sqrt{6}, 2/\sqrt{6}, 1/\sqrt{6})

Similarly, the eigenvector corresponding to the eigenvalue 10 is v_2' = (2/\sqrt{5}, -1/\sqrt{5}, 0).

Then

A = UDV' = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix} \begin{pmatrix} \sqrt{12} & 0 \\ 0 & \sqrt{10} \end{pmatrix} \begin{pmatrix} 1/\sqrt{6} & 2/\sqrt{6} & 1/\sqrt{6} \\ 2/\sqrt{5} & -1/\sqrt{5} & 0 \end{pmatrix} = \begin{pmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{pmatrix}
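A NumPy confirmation of this example; np.linalg.svd returns V' directly, and the signs of the singular vectors may differ from the hand computation:

    import numpy as np

    A = np.array([[3.0, 1.0, 1.0],
                  [-1.0, 3.0, 1.0]])
    U, d, Vt = np.linalg.svd(A, full_matrices=False)
    print(d**2)                                 # [12. 10.] -- the eigenvalues of AA'
    print(np.allclose(U @ np.diag(d) @ Vt, A))  # A = UDV': True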


A geometric interpretation can be based on the eigenvalues and eigenvectors of the matrix A. For example, suppose p = 2. Then the points x = (x_1, x_2)' at constant distance c from the origin satisfy

x'Ax = a_{11}x_1^2 + a_{22}x_2^2 + 2a_{12}x_1x_2 = c^2

By the spectral decomposition,

A = λ_1 e_1 e_1' + λ_2 e_2 e_2'

Hence

x'Ax = λ_1(x'e_1)^2 + λ_2(x'e_2)^2

For λ_1, λ_2 > 0, these points lie on an ellipse whose axes lie along the eigenvectors e_1 and e_2, as the figure below illustrates.


Figure: Points a constant distance from the origin (p = 2, 1 ≤ λ_1 ≤ λ_2)


1.2.3 Some other matrix properties

LU (LR) decomposition:

A_{p×p} = L_{p×p} U_{p×p}

where L is a lower triangular matrix and U is an upper triangular matrix.

QR decomposition:

A_{p×p} = Q_{p×p} R_{p×p}

where Q is an orthogonal matrix (Q'Q = I) and R is an upper triangular matrix.

Cholesky decomposition: For A > 0 (positive definite), there exists a unique lower triangular matrix T (t_{ij} = 0 for i < j) with positive diagonal elements such that A = TT'.
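A minimal sketch of all three decompositions; the LU routine is assumed from SciPy (scipy.linalg.lu, which includes row pivoting), while NumPy covers QR and Cholesky:

    import numpy as np
    from scipy.linalg import lu

    A = np.array([[4.0, 2.0],
                  [2.0, 3.0]])          # an illustrative positive definite matrix
    P, L, U = lu(A)                     # LU with pivoting: A = P L U
    print(np.allclose(P @ L @ U, A))    # True
    Q, R = np.linalg.qr(A)              # QR: Q orthogonal, R upper triangular
    print(np.allclose(Q @ R, A))        # True
    T = np.linalg.cholesky(A)           # Cholesky: lower triangular T, A = TT'
    print(np.allclose(T @ T.T, A))      # True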
