
Positive Definite Matrix

Chia-Ping Chen

Professor
Department of Computer Science and Engineering
National Sun Yat-sen University

Linear Algebra

Outline and Notation

$x^T A x$: quadratic form
$f(x)$: multivariate function
$\nabla f(x)$: gradient vector
$H$: Hessian matrix
$\sigma_i$: singular value
$\Sigma$: singular value matrix
$A = U \Sigma V^T$: singular value decomposition of $A$
$A^+$: pseudo-inverse of $A$

Quadratic Function and Matrix

Quadratic Function of Multiple Variables

A quadratic function of the variables $x_1, \ldots, x_n$ is a linear combination of the second-order terms $x_i^2$ and $x_i x_j$.

Details. Let $c_{ij}$ be the coefficient of the term $x_i x_j$ in a quadratic function of $n$ variables $f(x_1, \ldots, x_n)$. Then
$$ f(x_1, \ldots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_i x_j $$

Representation with a Symmetric Matrix
A quadratic function of $n$ variables can be represented by a symmetric matrix of order $n \times n$.

Construction of matrix. For $f(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} c_{ij} x_i x_j$, define the matrix $A$ with $a_{ij} = \frac{1}{2}(c_{ij} + c_{ji})$. Note $a_{ij} = a_{ji}$, so $A$ is symmetric. Furthermore, $a_{ij} + a_{ji} = c_{ij} + c_{ji}$, so
$$ f(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} c_{ij} x_i x_j = \sum_{i,j=1}^{n} a_{ij} x_i x_j = x^T A x $$

Example. $f(x, y) = ax^2 + 2bxy + cy^2$ can be represented by $f(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ where
$$ \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad A = \begin{bmatrix} a & b \\ b & c \end{bmatrix} $$
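A minimal NumPy sketch of this construction (NumPy assumed available; the coefficient matrix `C` is a made-up example that places the whole cross term on $c_{12}$):

```python
import numpy as np

# Coefficients c_ij of f(x1, x2) = 2 x1^2 + 4 x1 x2 + x2^2,
# with the whole cross term placed on c_12 (an arbitrary choice).
C = np.array([[2., 4.],
              [0., 1.]])

# Symmetrize: a_ij = (c_ij + c_ji) / 2.
A = (C + C.T) / 2                       # [[2., 2.], [2., 1.]]

# Both matrices produce the same quadratic form.
x = np.array([1., 3.])
print(x @ C @ x, x @ A @ x)             # 23.0 23.0
```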
Positive Definite Matrix

The function $x^T A x$ is called the quadratic form of $A$.

Positive definite. Matrix $A$ is said to be positive definite if its quadratic form $x^T A x$ is positive for any $x \neq 0$.

Positivity of Eigenvalues

Every eigenvalue of a positive definite matrix is positive.

Proof. Suppose $A$ is a positive definite matrix. Let $\lambda$ be an eigenvalue of $A$, and $s$ an eigenvector of $A$ corresponding to $\lambda$. We have $As = \lambda s$. It follows that $s^T A s = \lambda (s^T s)$. Hence
$$ \lambda = \frac{s^T A s}{s^T s} > 0 $$

Positive Eigenvalues

A matrix is positive definite if every eigenvalue of the matrix is positive.

Proof. Suppose every eigenvalue of $A$ is positive. By the spectral theorem, $A$ has an eigenvalue decomposition $A = Q \Lambda Q^T$. Writing $y = Q^T x$, it follows that
$$ x^T A x = x^T Q \Lambda Q^T x = y^T \Lambda y = \sum_i \lambda_i y_i^2 $$
Since $Q$ is orthogonal, $y \neq 0$ whenever $x \neq 0$. Hence the quadratic form $x^T A x$ is positive for any $x \neq 0$, and $A$ is positive definite.
Positivity of Determinants

If a matrix is positive definite, then the determinant of every leading principal sub-matrix is positive.

Proof. Suppose $A$ is positive definite. For every $k$, consider $x = \begin{bmatrix} x_k^T & 0^T \end{bmatrix}^T$ with $x_k \in \mathbb{R}^k$. For a non-zero $x_k$, we have $x \neq 0$, and
$$ x^T A x = \begin{bmatrix} x_k^T & 0^T \end{bmatrix} \begin{bmatrix} A_k & B \\ B^T & C \end{bmatrix} \begin{bmatrix} x_k \\ 0 \end{bmatrix} = x_k^T A_k x_k > 0 $$
So $A_k$, the leading principal sub-matrix of $A$ of order $k \times k$, is positive definite. Since the determinant of a matrix is the product of its eigenvalues, and every eigenvalue of $A_k$ is positive, $|A_k|$ must be positive.
Positivity of Pivots
If the determinant of every leading principal sub-matrix
of a matrix is positive, then the matrix has full positive
pivots.

Proof. By assumption $|A| > 0$, so $A$ is non-singular. Let $A = LDU$ be the LDU decomposition of $A$. Explicitly,
$$ \begin{bmatrix} A_k & B \\ B^T & C \end{bmatrix} = \begin{bmatrix} L_k & 0 \\ * & * \end{bmatrix} \begin{bmatrix} D_k & 0 \\ 0 & * \end{bmatrix} \begin{bmatrix} U_k & * \\ 0 & * \end{bmatrix} = \begin{bmatrix} L_k D_k U_k & * \\ * & * \end{bmatrix} $$
So $A_k = L_k D_k U_k$ and $|A_k| = |D_k| = d_1 \cdots d_k$, where $d_i$ is a pivot (note $|L_k| = |U_k| = 1$). Thus
$$ d_k = \frac{|A_k|}{|A_{k-1}|} > 0, \quad k = 1, \ldots, n, \quad \text{with } |A_0| = 1 $$
Positive Pivots

If a matrix has full positive pivots, then the matrix is positive definite.

Proof. By assumption, $A$ has full pivots, so it is non-singular. Let $A = LDU$ be the LDU decomposition of $A$. Since $A$ is symmetric, $A = A^T$, i.e. $LDU = U^T D L^T$, and by uniqueness of the decomposition $U = L^T$. Thus
$$ A = LDL^T = LD^{1/2} D^{1/2} L^T = R^T R $$
where $R = D^{1/2} L^T$ is non-singular. The quadratic form of $A$ is
$$ x^T A x = x^T R^T R x = (Rx)^T (Rx) = \|Rx\|^2 $$
which is positive for $x \neq 0$, since $Rx \neq 0$ whenever $x \neq 0$. Hence $A$ is positive definite.
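The factorization $A = R^T R$ with triangular $R$ is, up to the upper/lower convention, the Cholesky factorization. A minimal NumPy check (NumPy assumed available) on the matrix used in the example two slides below:

```python
import numpy as np

A = np.array([[2., -1., 0.],
              [-1., 2., -1.],
              [0., -1., 2.]])

# NumPy's Cholesky returns lower-triangular L with A = L @ L.T; set R = L.T.
L = np.linalg.cholesky(A)
R = L.T
print(np.allclose(R.T @ R, A))          # True

# The quadratic form equals ||Rx||^2, positive for any nonzero x.
x = np.array([1., -1., 2.])
print(np.isclose(x @ A @ x, np.linalg.norm(R @ x)**2))   # True
```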
Equivalent Statements for PDM

There are many ways to say a matrix is positive definite.

1. $A$ is positive definite.
2. Every eigenvalue of $A$ is positive.
3. The determinant of every leading principal sub-matrix of $A$ is positive.
4. $A$ has full positive pivots.

What we have shown in the previous slides is
$$ 1 \Leftrightarrow 2 \qquad \text{and} \qquad 1 \Rightarrow 3 \Rightarrow 4 \Rightarrow 1 $$
Example
 
$$ A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} $$

The quadratic form of $A$ is
$$ x^T A x = 2x_1^2 + 2x_2^2 + 2x_3^2 - 2x_1 x_2 - 2x_2 x_3 = 2\left(x_1 - \tfrac{1}{2}x_2\right)^2 + \tfrac{3}{2}\left(x_2 - \tfrac{2}{3}x_3\right)^2 + \tfrac{4}{3}x_3^2 $$

The eigenvalues, the determinants, and the pivots are
$$ \mathrm{spectrum}(A) = \{2,\; 2 \pm \sqrt{2}\}, \qquad |A_1| = 2, \quad |A_2| = 3, \quad |A_3| = 4 $$
$$ A = \begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & -\frac{2}{3} & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 0 & \frac{3}{2} & 0 \\ 0 & 0 & \frac{4}{3} \end{bmatrix} \begin{bmatrix} 1 & -\frac{1}{2} & 0 \\ 0 & 1 & -\frac{2}{3} \\ 0 & 0 & 1 \end{bmatrix} $$
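All four equivalent conditions can be verified numerically. A minimal NumPy sketch (NumPy assumed available) on this matrix, with expected outputs in comments:

```python
import numpy as np

A = np.array([[2., -1., 0.],
              [-1., 2., -1.],
              [0., -1., 2.]])

# 1. Eigenvalues are all positive.
print(np.linalg.eigvalsh(A))             # [0.5857, 2.0, 3.4142] = 2 - sqrt(2), 2, 2 + sqrt(2)

# 2. Leading principal minors are all positive.
minors = [np.linalg.det(A[:k, :k]) for k in (1, 2, 3)]
print(minors)                            # [2.0, 3.0, 4.0]

# 3. Pivots d_k = |A_k| / |A_{k-1}| are all positive.
print([minors[0]] + [minors[k] / minors[k-1] for k in (1, 2)])   # [2.0, 1.5, 1.333...]

# 4. The quadratic form is positive for a (random) nonzero x.
x = np.random.randn(3)
print(x @ A @ x > 0)                     # True
```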
Ellipsoid

Let $A$ be a positive definite matrix. Then the equation $x^T A x = 1$ describes an ellipsoid.

Explanation. By the spectral theorem, $A = Q \Lambda Q^T$. Note that $\{q_1, \ldots, q_n\}$ is an orthonormal basis, and the representation of $x$ in this basis is $y = Q^T x$. By this change of basis, $x^T A x = 1$ becomes
$$ x^T Q \Lambda Q^T x = y^T \Lambda y = \sum_i \lambda_i y_i^2 = 1 $$
This is an ellipsoid whose axes of symmetry lie along the $q_i$'s, with intercepts
$$ y_i = \pm \frac{1}{\sqrt{\lambda_i}} $$
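A minimal NumPy sketch (NumPy assumed available) computing the intercepts for the matrix from the previous example:

```python
import numpy as np

A = np.array([[2., -1., 0.],
              [-1., 2., -1.],
              [0., -1., 2.]])

# Eigen-decomposition A = Q Lambda Q^T; columns of Q are the axes q_i.
lam, Q = np.linalg.eigh(A)
print(1 / np.sqrt(lam))   # semi-axis lengths 1/sqrt(lambda_i) along each q_i
```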
Extension

Negative definite. Matrix $A$ is said to be negative definite if its quadratic form $x^T A x$ is negative for any $x \neq 0$.

Semi-definite. Matrix $A$ is said to be positive semi-definite (resp. negative semi-definite) if its quadratic form $x^T A x$ is non-negative (resp. non-positive) for any $x$.

Approximation and Extremal Points

First-order Approximation

The first-order approximation to a multivariate function $f(x)$ near $x_0$ is
$$ f(x) \approx f(x_0) + \nabla f(x_0)^T (x - x_0) $$
where
$$ \nabla f(x) = \begin{bmatrix} \frac{\partial f(x)}{\partial x_1} \\ \vdots \\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix} $$

Gradient. $\nabla f(x)$ is called the gradient of $f(x)$.

Second-order Approximation

The second-order approximation to $f(x)$ near $x_0$ is
$$ f(x) \approx f(x_0) + \nabla f(x_0)^T (x - x_0) + \frac{1}{2}(x - x_0)^T H(x_0)(x - x_0) $$
where $H(x)$ has $(i, j)$ entry $\frac{\partial^2 f(x)}{\partial x_i \partial x_j}$:
$$ H(x) = \begin{bmatrix} \frac{\partial^2 f(x)}{\partial x_1^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f(x)}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_n^2} \end{bmatrix} $$

Hessian. $H(x)$ is called the Hessian of $f(x)$.

Stationary Point

$x_0$ is called a stationary point of $f(x)$ if $\nabla f(x_0) = 0$.

Near a stationary point. Suppose $x_0$ is a stationary point of $f(x)$. Near $x_0$, the second-order approximation to $f(x)$ reduces to
$$ f(x) \approx f(x_0) + \frac{1}{2}(x - x_0)^T H(x_0)(x - x_0) $$

Example
Find the second-order approximation near $x_0 = 0$ to
$$ f(x) = 2x^2 + 4xy + y^2 $$

$$ \nabla f(x) = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} = \begin{bmatrix} 4x + 4y \\ 4x + 2y \end{bmatrix}, \qquad H(x) = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix} $$

$$ f(0) = 0, \qquad \nabla f(0) = 0, \qquad H(0) = \begin{bmatrix} 4 & 4 \\ 4 & 2 \end{bmatrix} $$

$$ f(x) \approx f(0) + \frac{1}{2}(x - 0)^T H(0)(x - 0) = 2x^2 + 4xy + y^2 $$
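As a numerical cross-check, the Hessian can be approximated by central finite differences; a minimal sketch (NumPy assumed available; the step size `h = 1e-5` is an arbitrary choice):

```python
import numpy as np

def f(v):
    x, y = v
    return 2*x**2 + 4*x*y + y**2

def hessian_fd(func, v0, h=1e-5):
    """Central-difference approximation of the Hessian of func at v0."""
    n = len(v0)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i], np.eye(n)[j]
            H[i, j] = (func(v0 + h*e_i + h*e_j) - func(v0 + h*e_i - h*e_j)
                       - func(v0 - h*e_i + h*e_j) + func(v0 - h*e_i - h*e_j)) / (4*h*h)
    return H

print(hessian_fd(f, np.zeros(2)))        # approx [[4, 4], [4, 2]]
```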
Example
Find the second-order approximation near $x_0 = 0$ to
$$ F(x) = 7 + 2(x + y)^2 - y \sin y - x^3 $$

$$ \nabla F(x) = \begin{bmatrix} \frac{\partial F}{\partial x} \\ \frac{\partial F}{\partial y} \end{bmatrix} = \begin{bmatrix} 4(x + y) - 3x^2 \\ 4(x + y) - \sin y - y \cos y \end{bmatrix} $$

$$ H(x) = \begin{bmatrix} 4 - 6x & 4 \\ 4 & 4 - 2\cos y + y \sin y \end{bmatrix} $$

$$ F(0) = 7, \qquad \nabla F(0) = 0, \qquad H(0) = \begin{bmatrix} 4 & 4 \\ 4 & 2 \end{bmatrix} $$

$$ F(x) \approx F(0) + \frac{1}{2}(x - 0)^T H(0)(x - 0) = 7 + 2x^2 + 4xy + y^2 $$
Local Minimum and Local Maximum

A point $x_0$ is called a local minimum of $f(x)$ if $f(x) \ge f(x_0)$ for every point $x$ in a small neighborhood of $x_0$.

Similarly, $x_0$ is called a local maximum if $f(x) \le f(x_0)$ in a small neighborhood of $x_0$.

Optimality of a Stationary Point

Let $x_0$ be a stationary point of $f(x)$.

$x_0$ is a local minimum if $H(x_0)$ is positive definite.
$x_0$ is a local maximum if $H(x_0)$ is negative definite.
$x_0$ is a saddle point if it is neither a local minimum nor a local maximum.

For example, $0$ is a stationary point of $F(x)$ from the previous slide, and it is a saddle point because $2x^2 + 4xy + y^2$ can be positive or negative as $x$ and $y$ vary.
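The same conclusion follows from the eigenvalues of $H(0)$; a minimal NumPy check (NumPy assumed available):

```python
import numpy as np

# Hessian of F at the stationary point 0 (from the example above).
H0 = np.array([[4., 4.],
               [4., 2.]])

eigenvalues = np.linalg.eigvalsh(H0)    # H0 is symmetric
print(eigenvalues)                      # approx [-1.123, 7.123] = 3 -/+ sqrt(17)

# One negative and one positive eigenvalue: H(0) is indefinite,
# so the stationary point 0 is a saddle point.
```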

Singular Value Decomposition

Singular Value and Singular Vector

A singular value of a real matrix $A$ is the square root of a non-zero eigenvalue of $A^T A$.

This means that to find the singular values of $A$, one finds the non-zero eigenvalues of $A^T A$.

Singular vector. If $\sigma$ is a singular value of $A$, then there exists $v \neq 0$ such that
$$ (A^T A) v = \sigma^2 v $$
Such a $v$ is called a right singular vector of $A$ with singular value $\sigma$. It is an eigenvector of $A^T A$ with eigenvalue $\sigma^2$.

Positivity of Singular Value

A singular value is always positive.

The matrix $A^T A$ is positive semi-definite:
$$ x^T (A^T A) x = (Ax)^T (Ax) = \|Ax\|^2 \ge 0 $$
so the eigenvalues of $A^T A$ must be non-negative, and the non-zero eigenvalues must be positive. Hence a singular value is positive.

Number of Singular Values

A matrix of rank $r$ has exactly $r$ singular values.

Proof. Note $(A^T A)x = 0 \Leftrightarrow x^T A^T A x = 0 \Leftrightarrow Ax = 0$, so $N(A^T A) = N(A)$. Let $A$ be of order $m \times n$ with rank $r$. Then $\dim N(A) = n - r = \dim N(A^T A)$. The matrix $A^T A$ is non-defective, so the algebraic multiplicity of the eigenvalue $0$ is $n - r$. It follows that the total algebraic multiplicity of the non-zero eigenvalues of $A^T A$ is
$$ n - (n - r) = r $$

Notation. Singular values are denoted by $\sigma_1, \ldots, \sigma_r$.
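This gives a practical way to compute rank: count the non-zero singular values. A minimal NumPy sketch (NumPy assumed available; the tolerance `1e-10` is an arbitrary choice):

```python
import numpy as np

A = np.array([[-1., 1., 0.],
              [0., -1., 1.]])            # a rank-2 matrix

s = np.linalg.svd(A, compute_uv=False)   # all singular values of A
print(np.sum(s > 1e-10))                 # 2 = rank(A)
```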


Singular Value Decomposition

A real matrix can be decomposed using its singular values and singular vectors. This is called singular value decomposition.

A matrix of order $m \times n$ has an SVD
$$ A = U \Sigma V^T $$
where $\Sigma$ is an $m \times n$ "diagonal" matrix with the singular values of $A$ as its leading diagonal elements, $U$ is an $m \times m$ orthogonal matrix with eigenvectors of $AA^T$ as columns, and $V$ is an $n \times n$ orthogonal matrix with eigenvectors of $A^T A$ as columns.
Proof of SVD 1
Let $r$ be the rank of $A$, and let $\sigma_1, \ldots, \sigma_r$ be the singular values of $A$. Let $v_1, \ldots, v_r$ be orthonormal eigenvectors of $A^T A$ with positive eigenvalues $\sigma_i^2$, and define $u_i = \frac{Av_i}{\sigma_i}$. Note
$$ (AA^T) u_i = \frac{AA^T A v_i}{\sigma_i} = \frac{A \sigma_i^2 v_i}{\sigma_i} = \sigma_i^2 u_i, \quad i = 1, \ldots, r $$
So $u_i$ is an eigenvector of $AA^T$ with the same eigenvalue $\sigma_i^2$. Let $v_{r+1}, \ldots, v_n$ be orthonormal eigenvectors of $A^T A$ with eigenvalue $0$, and $u_{r+1}, \ldots, u_m$ be orthonormal eigenvectors of $AA^T$ with eigenvalue $0$. Construct the matrices
$$ U = \begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix}, \qquad V = \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} $$
Proof of SVD 2
We show $U^T A V = \Sigma$, which leads to the SVD $A = U \Sigma V^T$. For $j = 1, \ldots, r$, we have $u_j = \frac{Av_j}{\sigma_j}$, so $Av_j = \sigma_j u_j$ and
$$ (U^T A V)_{ij} = u_i^T A v_j = u_i^T (\sigma_j u_j) = \sigma_j \delta_{ij}, \quad i = 1, \ldots, m $$
For $j = r+1, \ldots, n$, we have $(A^T A) v_j = 0$, so $Av_j = 0$ and
$$ (U^T A V)_{ij} = u_i^T A v_j = 0, \quad i = 1, \ldots, m $$
Combining the results, we get $U^T A V = \Sigma$. Hence
$$ A = U \Sigma V^T $$
Matrices in SVD

For a matrix of order $m \times n$ with SVD $A = U \Sigma V^T$, the column vectors of $U$ (resp. $V$) form an orthonormal basis of $\mathbb{R}^m$ (resp. $\mathbb{R}^n$).

$U$ must be an eigenvector matrix of $AA^T$:
$$ AA^T = (U \Sigma V^T)(V \Sigma^T U^T) = U \underbrace{\Sigma \Sigma^T}_{\text{diagonal}} U^T $$
Similarly, $V$ must be an eigenvector matrix of $A^T A$:
$$ A^T A = (V \Sigma^T U^T)(U \Sigma V^T) = V \underbrace{\Sigma^T \Sigma}_{\text{diagonal}} V^T $$
Singular Vectors and Fundamental Spaces

The right (resp. left) singular vectors in an SVD of a matrix form an orthonormal basis of the row space (resp. column space) of the matrix.

Row space. $\{v_{r+1}, \ldots, v_n\}$ consists of eigenvectors of $A^T A$ with eigenvalue $0$, so it is a basis of $N(A^T A) = N(A)$. This implies $\{v_1, \ldots, v_r\}$ is a basis of the orthogonal complement of $N(A)$, i.e. the row space of $A$.

Column space. We have $AV = U\Sigma$. The first $r$ columns are
$$ Av_i = \sigma_i u_i, \quad i = 1, \ldots, r $$
So $\{u_1, \ldots, u_r\}$ is a linearly independent set of $r$ vectors in the column space of $A$. Hence, it is a basis of $C(A)$.
Example

Find an SVD of
$$ A = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} $$
The eigenvalues of
$$ A^T A = \begin{bmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix} $$
are $\lambda_1 = 3$, $\lambda_2 = 1$, $\lambda_3 = 0$. Hence the singular values of $A$ are
$$ \sigma_1 = \sqrt{3}, \qquad \sigma_2 = 1 $$

 
Orthonormal eigenvectors of $A^T A$ are
$$ v_1 = \frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}, \quad v_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \quad v_3 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} $$
The corresponding left singular vectors of $A$ are
$$ u_1 = \frac{Av_1}{\sigma_1} = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \end{bmatrix}, \qquad u_2 = \frac{Av_2}{\sigma_2} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix} $$
So
$$ A = U \Sigma V^T = \begin{bmatrix} -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \sqrt{3} & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{6}} & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{6}} \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{bmatrix} $$
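`np.linalg.svd` computes the same factorization (possibly with different sign choices for the singular vectors); a minimal check, NumPy assumed available:

```python
import numpy as np

A = np.array([[-1., 1., 0.],
              [0., -1., 1.]])

U, s, Vt = np.linalg.svd(A)             # NumPy returns V^T directly
print(s)                                # [1.7320..., 1.0] = [sqrt(3), 1]

# Rebuild the 2 x 3 "diagonal" Sigma and verify A = U Sigma V^T.
Sigma = np.zeros((2, 3))
Sigma[:2, :2] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, A))   # True
```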

SVD as a Sum of Rank-1 Matrices

Every real matrix of rank $r$ is the sum of $r$ real rank-$1$ matrices built from its singular values and singular vectors.

By SVD,
$$ A = U \Sigma V^T = \sigma_1 u_1 v_1^T + \cdots + \sigma_r u_r v_r^T = A_1 + \cdots + A_r $$

Image approximation. For an image of size $1000 \times 1000$, keeping $50$ terms stores $50 \times (1000 + 1000 + 1) \approx 10^5$ numbers instead of $10^6$, a compression rate of about $90\%$.
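A sketch of the rank-$50$ approximation (NumPy assumed available; the array here is random noise standing in for an image, and real images have much faster singular-value decay, which is what makes the truncation visually acceptable):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((1000, 1000))          # noise standing in for a grayscale image

U, s, Vt = np.linalg.svd(img, full_matrices=False)

k = 50
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # sum of the first k rank-1 terms

# Storage: k*(m + n + 1) numbers instead of m*n.
m, n = img.shape
print(k * (m + n + 1) / (m * n))        # 0.10005 -> about 90% compression
```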

Data Compression with SVD



SVD and Pseudo-Inverse

Let $A = U \Sigma V^T$ be an SVD of $A$. For a rectangular system of linear equations $Ax = b$, the least-squares solution with the minimum length is $x^+ = V \Sigma^+ U^T b$.

Pseudo-inverse. The minimum-length least-squares solution can be written as $x^+ = A^+ b$, where $A^+ = V \Sigma^+ U^T$ is called the pseudo-inverse of $A$. Here $\Sigma^+$ is the $n \times m$ matrix obtained by transposing $\Sigma$ and inverting its non-zero singular values.
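A minimal sketch on the matrix from the SVD example (NumPy assumed available; `b` is a made-up right-hand side), showing that `np.linalg.pinv` and the minimum-norm least-squares solution agree:

```python
import numpy as np

A = np.array([[-1., 1., 0.],
              [0., -1., 1.]])
b = np.array([1., 2.])                   # a made-up right-hand side

# Pseudo-inverse A+ = V Sigma+ U^T, computed by NumPy from the SVD.
x_plus = np.linalg.pinv(A) @ b

# lstsq also returns the minimum-norm least-squares solution.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_plus, x_lstsq))      # True
```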

