
Positive Definite Matrix

Chia-Ping Chen

Professor
Department of Computer Science and Engineering
National Sun Yat-sen University

Linear Algebra

Outline and Notation

$x^T A x$: quadratic form
$f(x)$: multivariate function
$\nabla f(x)$: gradient vector
$H$: Hessian matrix
$\sigma_i$: singular value
$\Sigma$: singular value matrix
$A = U \Sigma V^T$: singular value decomposition of $A$
$A^+$: pseudo-inverse of $A$

Quadratic Function and Matrix

Quadratic Function of Multiple Variables

A quadratic function of the variables $x_1, \ldots, x_n$ is a linear combination of the second-order terms $x_i^2$ and $x_i x_j$.

Details. Let $c_{ij}$ be the coefficient of the term $x_i x_j$ in a quadratic function of $n$ variables $f(x_1, \ldots, x_n)$. Then
$$ f(x_1, \ldots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_i x_j $$

Representation with a Symmetric Matrix
A quadratic function of $n$ variables can be represented by a symmetric matrix of order $n \times n$.

Construction of matrix. For $f(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} c_{ij} x_i x_j$, define the matrix $A$ with $a_{ij} = \frac{1}{2}(c_{ij} + c_{ji})$. Note $a_{ij} = a_{ji}$, so $A$ is symmetric. Furthermore, $a_{ij} + a_{ji} = c_{ij} + c_{ji}$, so
$$ f(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} c_{ij} x_i x_j = \sum_{i,j=1}^{n} a_{ij} x_i x_j = x^T A x $$

Example. $f(x, y) = ax^2 + 2bxy + cy^2$ can be represented by $f(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ where
$$ \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad A = \begin{bmatrix} a & b \\ b & c \end{bmatrix} $$
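A minimal NumPy sketch of this construction (NumPy assumed available; the coefficient matrix `C` is a made-up example that places the whole cross term on $c_{12}$):

```python
import numpy as np

# Coefficients c_ij of f(x1, x2) = 2 x1^2 + 4 x1 x2 + x2^2,
# with the whole cross term placed on c_12 (an arbitrary choice).
C = np.array([[2., 4.],
              [0., 1.]])

# Symmetrize: a_ij = (c_ij + c_ji) / 2.
A = (C + C.T) / 2                       # [[2., 2.], [2., 1.]]

# Both matrices produce the same quadratic form.
x = np.array([1., 3.])
print(x @ C @ x, x @ A @ x)             # 23.0 23.0
```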
Positive Definite Matrix

The function $x^T A x$ is called the quadratic form of $A$.

Positive definite. Matrix $A$ is said to be positive definite if its quadratic form $x^T A x$ is positive for any $x \neq 0$.

Positivity of Eigenvalues

Every eigenvalue of a positive definite matrix is positive.

Proof. Suppose $A$ is a positive definite matrix. Let $\lambda$ be an eigenvalue of $A$, and $s$ an eigenvector of $A$ corresponding to $\lambda$. We have $As = \lambda s$. It follows that $s^T A s = \lambda (s^T s)$. Hence
$$ \lambda = \frac{s^T A s}{s^T s} > 0 $$

Positive Eigenvalues

A matrix is positive definite if every eigenvalue of the matrix is positive.

Proof. Suppose every eigenvalue of $A$ is positive. By the spectral theorem, $A$ has an eigenvalue decomposition $A = Q \Lambda Q^T$. Writing $y = Q^T x$, it follows that
$$ x^T A x = x^T Q \Lambda Q^T x = y^T \Lambda y = \sum_i \lambda_i y_i^2 $$
Since $Q$ is orthogonal, $y \neq 0$ whenever $x \neq 0$. Hence the quadratic form $x^T A x$ is positive for any $x \neq 0$, and $A$ is positive definite.
Positivity of Determinants

If a matrix is positive definite, then the determinant of every leading principal sub-matrix is positive.

Proof. Suppose $A$ is positive definite. For every $k$, consider $x = \begin{bmatrix} x_k^T & 0^T \end{bmatrix}^T$ with $x_k \in \mathbb{R}^k$. For a non-zero $x_k$, we have $x \neq 0$, and
$$ x^T A x = \begin{bmatrix} x_k^T & 0^T \end{bmatrix} \begin{bmatrix} A_k & B \\ B^T & C \end{bmatrix} \begin{bmatrix} x_k \\ 0 \end{bmatrix} = x_k^T A_k x_k > 0 $$
So $A_k$, the leading principal sub-matrix of $A$ of order $k \times k$, is positive definite. Since the determinant of a matrix is the product of its eigenvalues, and every eigenvalue of $A_k$ is positive, $|A_k|$ must be positive.
Positivity of Pivots
If the determinant of every leading principal sub-matrix
of a matrix is positive, then the matrix has full positive
pivots.

Proof. By assumption $|A| > 0$, so $A$ is non-singular. Let $A = LDU$ be the LDU decomposition of $A$. Explicitly,
$$ \begin{bmatrix} A_k & B \\ B^T & C \end{bmatrix} = \begin{bmatrix} L_k & 0 \\ * & * \end{bmatrix} \begin{bmatrix} D_k & 0 \\ 0 & * \end{bmatrix} \begin{bmatrix} U_k & * \\ 0 & * \end{bmatrix} = \begin{bmatrix} L_k D_k U_k & * \\ * & * \end{bmatrix} $$
So $A_k = L_k D_k U_k$ and $|A_k| = |D_k| = d_1 \cdots d_k$, where $d_i$ is a pivot (note $|L_k| = |U_k| = 1$). Thus
$$ d_k = \frac{|A_k|}{|A_{k-1}|} > 0, \quad k = 1, \ldots, n, \quad \text{with } |A_0| = 1 $$
Positive Pivots

If a matrix has full positive pivots, then the matrix is positive definite.

Proof. By assumption, $A$ has full pivots, so it is non-singular. Let $A = LDU$ be the LDU decomposition of $A$. Since $A$ is symmetric, $A = A^T$, i.e. $LDU = U^T D L^T$, and by uniqueness of the decomposition $U = L^T$. Thus
$$ A = LDL^T = LD^{1/2} D^{1/2} L^T = R^T R $$
where $R = D^{1/2} L^T$ is non-singular. The quadratic form of $A$ is
$$ x^T A x = x^T R^T R x = (Rx)^T (Rx) = \|Rx\|^2 $$
which is positive for $x \neq 0$, since $Rx \neq 0$ whenever $x \neq 0$. Hence $A$ is positive definite.
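The factorization $A = R^T R$ with triangular $R$ is, up to the upper/lower convention, the Cholesky factorization. A minimal NumPy check (NumPy assumed available) on the matrix used in the example two slides below:

```python
import numpy as np

A = np.array([[2., -1., 0.],
              [-1., 2., -1.],
              [0., -1., 2.]])

# NumPy's Cholesky returns lower-triangular L with A = L @ L.T; set R = L.T.
L = np.linalg.cholesky(A)
R = L.T
print(np.allclose(R.T @ R, A))          # True

# The quadratic form equals ||Rx||^2, positive for any nonzero x.
x = np.array([1., -1., 2.])
print(np.isclose(x @ A @ x, np.linalg.norm(R @ x)**2))   # True
```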
Equivalent Statements for PDM

There are many ways to say a matrix is positive definite.

1. $A$ is positive definite.
2. Every eigenvalue of $A$ is positive.
3. The determinant of every leading principal sub-matrix of $A$ is positive.
4. $A$ has full positive pivots.

What we have shown in the previous slides is
$$ 1 \Leftrightarrow 2 \qquad \text{and} \qquad 1 \Rightarrow 3 \Rightarrow 4 \Rightarrow 1 $$
Example
 
$$ A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} $$

The quadratic form of $A$ is
$$ x^T A x = 2x_1^2 + 2x_2^2 + 2x_3^2 - 2x_1 x_2 - 2x_2 x_3 = 2\left(x_1 - \tfrac{1}{2}x_2\right)^2 + \tfrac{3}{2}\left(x_2 - \tfrac{2}{3}x_3\right)^2 + \tfrac{4}{3}x_3^2 $$

The eigenvalues, the determinants, and the pivots are
$$ \mathrm{spectrum}(A) = \{2,\; 2 \pm \sqrt{2}\}, \qquad |A_1| = 2, \quad |A_2| = 3, \quad |A_3| = 4 $$
$$ A = \begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & -\frac{2}{3} & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 0 & \frac{3}{2} & 0 \\ 0 & 0 & \frac{4}{3} \end{bmatrix} \begin{bmatrix} 1 & -\frac{1}{2} & 0 \\ 0 & 1 & -\frac{2}{3} \\ 0 & 0 & 1 \end{bmatrix} $$
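All four equivalent conditions can be verified numerically. A minimal NumPy sketch (NumPy assumed available) on this matrix, with expected outputs in comments:

```python
import numpy as np

A = np.array([[2., -1., 0.],
              [-1., 2., -1.],
              [0., -1., 2.]])

# 1. Eigenvalues are all positive.
print(np.linalg.eigvalsh(A))             # [0.5857, 2.0, 3.4142] = 2 - sqrt(2), 2, 2 + sqrt(2)

# 2. Leading principal minors are all positive.
minors = [np.linalg.det(A[:k, :k]) for k in (1, 2, 3)]
print(minors)                            # [2.0, 3.0, 4.0]

# 3. Pivots d_k = |A_k| / |A_{k-1}| are all positive.
print([minors[0]] + [minors[k] / minors[k-1] for k in (1, 2)])   # [2.0, 1.5, 1.333...]

# 4. The quadratic form is positive for a (random) nonzero x.
x = np.random.randn(3)
print(x @ A @ x > 0)                     # True
```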
Ellipsoid

Let $A$ be a positive definite matrix. Then the equation $x^T A x = 1$ describes an ellipsoid.

Explanation. By the spectral theorem, $A = Q \Lambda Q^T$. Note that $\{q_1, \ldots, q_n\}$ is an orthonormal basis, and the representation of $x$ in this basis is $y = Q^T x$. By this change of basis, $x^T A x = 1$ becomes
$$ x^T Q \Lambda Q^T x = y^T \Lambda y = \sum_i \lambda_i y_i^2 = 1 $$
This is an ellipsoid whose axes of symmetry lie along the $q_i$'s, with intercepts
$$ y_i = \pm \frac{1}{\sqrt{\lambda_i}} $$
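A minimal NumPy sketch (NumPy assumed available) computing the intercepts for the matrix from the previous example:

```python
import numpy as np

A = np.array([[2., -1., 0.],
              [-1., 2., -1.],
              [0., -1., 2.]])

# Eigen-decomposition A = Q Lambda Q^T; columns of Q are the axes q_i.
lam, Q = np.linalg.eigh(A)
print(1 / np.sqrt(lam))   # semi-axis lengths 1/sqrt(lambda_i) along each q_i
```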
Extension

Negative definite. Matrix $A$ is said to be negative definite if its quadratic form $x^T A x$ is negative for any $x \neq 0$.

Semi-definite. Matrix $A$ is said to be positive semi-definite (resp. negative semi-definite) if its quadratic form $x^T A x$ is non-negative (resp. non-positive) for any $x$.

Approximation and Extremal Points

First-order Approximation

The first-order approximation to a multivariate function $f(x)$ near $x_0$ is
$$ f(x) \approx f(x_0) + \nabla f(x_0)^T (x - x_0) $$
where
$$ \nabla f(x) = \begin{bmatrix} \frac{\partial f(x)}{\partial x_1} \\ \vdots \\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix} $$

Gradient. $\nabla f(x)$ is called the gradient of $f(x)$.

Second-order Approximation

The second-order approximation to $f(x)$ near $x_0$ is
$$ f(x) \approx f(x_0) + \nabla f(x_0)^T (x - x_0) + \frac{1}{2}(x - x_0)^T H(x_0)(x - x_0) $$
where $H(x)$ has $(i, j)$ entry $\frac{\partial^2 f(x)}{\partial x_i \partial x_j}$:
$$ H(x) = \begin{bmatrix} \frac{\partial^2 f(x)}{\partial x_1^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f(x)}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_n^2} \end{bmatrix} $$

Hessian. $H(x)$ is called the Hessian of $f(x)$.

Stationary Point

$x_0$ is called a stationary point of $f(x)$ if $\nabla f(x_0) = 0$.

Near a stationary point. Suppose $x_0$ is a stationary point of $f(x)$. Near $x_0$, the second-order approximation to $f(x)$ reduces to
$$ f(x) \approx f(x_0) + \frac{1}{2}(x - x_0)^T H(x_0)(x - x_0) $$

Example
Find the second-order approximation near $x_0 = 0$ to
$$ f(x) = 2x^2 + 4xy + y^2 $$

$$ \nabla f(x) = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} = \begin{bmatrix} 4x + 4y \\ 4x + 2y \end{bmatrix}, \qquad H(x) = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix} $$

$$ f(0) = 0, \qquad \nabla f(0) = 0, \qquad H(0) = \begin{bmatrix} 4 & 4 \\ 4 & 2 \end{bmatrix} $$

$$ f(x) \approx f(0) + \frac{1}{2}(x - 0)^T H(0)(x - 0) = 2x^2 + 4xy + y^2 $$
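As a numerical cross-check, the Hessian can be approximated by central finite differences; a minimal sketch (NumPy assumed available; the step size `h = 1e-5` is an arbitrary choice):

```python
import numpy as np

def f(v):
    x, y = v
    return 2*x**2 + 4*x*y + y**2

def hessian_fd(func, v0, h=1e-5):
    """Central-difference approximation of the Hessian of func at v0."""
    n = len(v0)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i], np.eye(n)[j]
            H[i, j] = (func(v0 + h*e_i + h*e_j) - func(v0 + h*e_i - h*e_j)
                       - func(v0 - h*e_i + h*e_j) + func(v0 - h*e_i - h*e_j)) / (4*h*h)
    return H

print(hessian_fd(f, np.zeros(2)))        # approx [[4, 4], [4, 2]]
```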
Example
Find the second-order approximation near $x_0 = 0$ to
$$ F(x) = 7 + 2(x + y)^2 - y \sin y - x^3 $$

$$ \nabla F(x) = \begin{bmatrix} \frac{\partial F}{\partial x} \\ \frac{\partial F}{\partial y} \end{bmatrix} = \begin{bmatrix} 4(x + y) - 3x^2 \\ 4(x + y) - \sin y - y \cos y \end{bmatrix} $$

$$ H(x) = \begin{bmatrix} 4 - 6x & 4 \\ 4 & 4 - 2\cos y + y \sin y \end{bmatrix} $$

$$ F(0) = 7, \qquad \nabla F(0) = 0, \qquad H(0) = \begin{bmatrix} 4 & 4 \\ 4 & 2 \end{bmatrix} $$

$$ F(x) \approx F(0) + \frac{1}{2}(x - 0)^T H(0)(x - 0) = 7 + 2x^2 + 4xy + y^2 $$
Local Minimum and Local Maximum

A point $x_0$ is called a local minimum of $f(x)$ if $f(x) \ge f(x_0)$ for every point $x$ in a small neighborhood of $x_0$.

Similarly, $x_0$ is called a local maximum if $f(x) \le f(x_0)$ in a small neighborhood of $x_0$.

Optimality of a Stationary Point

Let $x_0$ be a stationary point of $f(x)$.

$x_0$ is a local minimum if $H(x_0)$ is positive definite.
$x_0$ is a local maximum if $H(x_0)$ is negative definite.
$x_0$ is a saddle point if it is neither a local minimum nor a local maximum.

For example, $0$ is a stationary point of $F(x)$ from the previous slide, and it is a saddle point because $2x^2 + 4xy + y^2$ can be positive or negative as $x$ and $y$ vary.
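The same conclusion follows from the eigenvalues of $H(0)$; a minimal NumPy check (NumPy assumed available):

```python
import numpy as np

# Hessian of F at the stationary point 0 (from the example above).
H0 = np.array([[4., 4.],
               [4., 2.]])

eigenvalues = np.linalg.eigvalsh(H0)    # H0 is symmetric
print(eigenvalues)                      # approx [-1.123, 7.123] = 3 -/+ sqrt(17)

# One negative and one positive eigenvalue: H(0) is indefinite,
# so the stationary point 0 is a saddle point.
```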

Singular Value Decomposition

Singular Value and Singular Vector

A singular value of a real matrix $A$ is the square root of a non-zero eigenvalue of $A^T A$.

This means that to find the singular values of $A$, one finds the non-zero eigenvalues of $A^T A$.

Singular vector. If $\sigma$ is a singular value of $A$, then there exists $v \neq 0$ such that
$$ (A^T A) v = \sigma^2 v $$
Such a $v$ is called a right singular vector of $A$ with singular value $\sigma$. It is an eigenvector of $A^T A$ with eigenvalue $\sigma^2$.

Positivity of Singular Value

A singular value is always positive.

The matrix $A^T A$ is positive semi-definite:
$$ x^T (A^T A) x = (Ax)^T (Ax) = \|Ax\|^2 \ge 0 $$
so the eigenvalues of $A^T A$ must be non-negative, and the non-zero eigenvalues must be positive. Hence a singular value is positive.

Number of Singular Values

A matrix of rank $r$ has exactly $r$ singular values.

Proof. Note $(A^T A)x = 0 \Leftrightarrow x^T A^T A x = 0 \Leftrightarrow Ax = 0$, so $N(A^T A) = N(A)$. Let $A$ be of order $m \times n$ with rank $r$. Then $\dim N(A) = n - r = \dim N(A^T A)$. The matrix $A^T A$ is non-defective, so the algebraic multiplicity of the eigenvalue $0$ is $n - r$. It follows that the total algebraic multiplicity of the non-zero eigenvalues of $A^T A$ is
$$ n - (n - r) = r $$

Notation. Singular values are denoted by $\sigma_1, \ldots, \sigma_r$.
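This gives a practical way to compute rank: count the non-zero singular values. A minimal NumPy sketch (NumPy assumed available; the tolerance `1e-10` is an arbitrary choice):

```python
import numpy as np

A = np.array([[-1., 1., 0.],
              [0., -1., 1.]])            # a rank-2 matrix

s = np.linalg.svd(A, compute_uv=False)   # all singular values of A
print(np.sum(s > 1e-10))                 # 2 = rank(A)
```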


Singular Value Decomposition

A real matrix can be decomposed using its singular values and singular vectors. This is called singular value decomposition.

A matrix of order $m \times n$ has an SVD
$$ A = U \Sigma V^T $$
where $\Sigma$ is an $m \times n$ "diagonal" matrix with the singular values of $A$ as its leading diagonal elements, $U$ is an $m \times m$ orthogonal matrix with eigenvectors of $AA^T$ as columns, and $V$ is an $n \times n$ orthogonal matrix with eigenvectors of $A^T A$ as columns.
Proof of SVD 1
Let $r$ be the rank of $A$, and let $\sigma_1, \ldots, \sigma_r$ be the singular values of $A$. Let $v_1, \ldots, v_r$ be orthonormal eigenvectors of $A^T A$ with positive eigenvalues $\sigma_i^2$, and define $u_i = \frac{Av_i}{\sigma_i}$. Note
$$ (AA^T) u_i = \frac{AA^T A v_i}{\sigma_i} = \frac{A \sigma_i^2 v_i}{\sigma_i} = \sigma_i^2 u_i, \quad i = 1, \ldots, r $$
So $u_i$ is an eigenvector of $AA^T$ with the same eigenvalue $\sigma_i^2$. Let $v_{r+1}, \ldots, v_n$ be orthonormal eigenvectors of $A^T A$ with eigenvalue $0$, and $u_{r+1}, \ldots, u_m$ be orthonormal eigenvectors of $AA^T$ with eigenvalue $0$. Construct the matrices
$$ U = \begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix}, \qquad V = \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} $$
Proof of SVD 2
We show $U^T A V = \Sigma$, which leads to the SVD $A = U \Sigma V^T$. For $j = 1, \ldots, r$, we have $u_j = \frac{Av_j}{\sigma_j}$, so $Av_j = \sigma_j u_j$ and
$$ (U^T A V)_{ij} = u_i^T A v_j = u_i^T (\sigma_j u_j) = \sigma_j \delta_{ij}, \quad i = 1, \ldots, m $$
For $j = r+1, \ldots, n$, we have $(A^T A) v_j = 0$, so $Av_j = 0$ and
$$ (U^T A V)_{ij} = u_i^T A v_j = 0, \quad i = 1, \ldots, m $$
Combining the results, we get $U^T A V = \Sigma$. Hence
$$ A = U \Sigma V^T $$
Matrices in SVD

For a matrix of order $m \times n$ with SVD $A = U \Sigma V^T$, the column vectors of $U$ (resp. $V$) form an orthonormal basis of $\mathbb{R}^m$ (resp. $\mathbb{R}^n$).

$U$ must be an eigenvector matrix of $AA^T$:
$$ AA^T = (U \Sigma V^T)(V \Sigma^T U^T) = U \underbrace{\Sigma \Sigma^T}_{\text{diagonal}} U^T $$
Similarly, $V$ must be an eigenvector matrix of $A^T A$:
$$ A^T A = (V \Sigma^T U^T)(U \Sigma V^T) = V \underbrace{\Sigma^T \Sigma}_{\text{diagonal}} V^T $$
Singular Vectors and Fundamental Spaces

The right (resp. left) singular vectors in an SVD of a matrix form an orthonormal basis of the row space (resp. column space) of the matrix.

Row space. $\{v_{r+1}, \ldots, v_n\}$ consists of eigenvectors of $A^T A$ with eigenvalue $0$, so it is a basis of $N(A^T A) = N(A)$. This implies $\{v_1, \ldots, v_r\}$ is a basis of the orthogonal complement of $N(A)$, i.e. the row space of $A$.

Column space. We have $AV = U\Sigma$. The first $r$ columns are
$$ Av_i = \sigma_i u_i, \quad i = 1, \ldots, r $$
So $\{u_1, \ldots, u_r\}$ is a linearly independent set of $r$ vectors in the column space of $A$. Hence, it is a basis of $C(A)$.
Example

Find an SVD of
$$ A = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} $$
The eigenvalues of
$$ A^T A = \begin{bmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix} $$
are $\lambda_1 = 3$, $\lambda_2 = 1$, $\lambda_3 = 0$. Hence the singular values of $A$ are
$$ \sigma_1 = \sqrt{3}, \qquad \sigma_2 = 1 $$

 
Orthonormal eigenvectors of $A^T A$ are
$$ v_1 = \frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}, \quad v_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \quad v_3 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} $$
The corresponding left singular vectors of $A$ are
$$ u_1 = \frac{Av_1}{\sigma_1} = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \end{bmatrix}, \qquad u_2 = \frac{Av_2}{\sigma_2} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix} $$
So
$$ A = U \Sigma V^T = \begin{bmatrix} -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \sqrt{3} & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{6}} & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{6}} \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{bmatrix} $$
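`np.linalg.svd` computes the same factorization (possibly with different sign choices for the singular vectors); a minimal check, NumPy assumed available:

```python
import numpy as np

A = np.array([[-1., 1., 0.],
              [0., -1., 1.]])

U, s, Vt = np.linalg.svd(A)             # NumPy returns V^T directly
print(s)                                # [1.7320..., 1.0] = [sqrt(3), 1]

# Rebuild the 2 x 3 "diagonal" Sigma and verify A = U Sigma V^T.
Sigma = np.zeros((2, 3))
Sigma[:2, :2] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, A))   # True
```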

SVD as a Sum of Rank-1 Matrices

Every real matrix of rank $r$ is the sum of $r$ real rank-$1$ matrices built from its singular values and singular vectors.

By SVD,
$$ A = U \Sigma V^T = \sigma_1 u_1 v_1^T + \cdots + \sigma_r u_r v_r^T = A_1 + \cdots + A_r $$

Image approximation. For an image of size $1000 \times 1000$, keeping $50$ terms stores $50 \times (1000 + 1000 + 1) \approx 10^5$ numbers instead of $10^6$, a compression rate of about $90\%$.
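A sketch of the rank-$50$ approximation (NumPy assumed available; the array here is random noise standing in for an image, and real images have much faster singular-value decay, which is what makes the truncation visually acceptable):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((1000, 1000))          # noise standing in for a grayscale image

U, s, Vt = np.linalg.svd(img, full_matrices=False)

k = 50
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # sum of the first k rank-1 terms

# Storage: k*(m + n + 1) numbers instead of m*n.
m, n = img.shape
print(k * (m + n + 1) / (m * n))        # 0.10005 -> about 90% compression
```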

Data Compression with SVD



SVD and Pseudo-Inverse

Let $A = U \Sigma V^T$ be an SVD of $A$. For a rectangular system of linear equations $Ax = b$, the least-squares solution with the minimum length is $x^+ = V \Sigma^+ U^T b$.

Pseudo-inverse. The minimum-length least-squares solution can be written as $x^+ = A^+ b$, where $A^+ = V \Sigma^+ U^T$ is called the pseudo-inverse of $A$. Here $\Sigma^+$ is the $n \times m$ matrix obtained by transposing $\Sigma$ and inverting its non-zero singular values.
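A minimal sketch on the matrix from the SVD example (NumPy assumed available; `b` is a made-up right-hand side), showing that `np.linalg.pinv` and the minimum-norm least-squares solution agree:

```python
import numpy as np

A = np.array([[-1., 1., 0.],
              [0., -1., 1.]])
b = np.array([1., 2.])                   # a made-up right-hand side

# Pseudo-inverse A+ = V Sigma+ U^T, computed by NumPy from the SVD.
x_plus = np.linalg.pinv(A) @ b

# lstsq also returns the minimum-norm least-squares solution.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_plus, x_lstsq))      # True
```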

