Lect 9

MA423 Matrix Computations
Lecture 9: Cholesky Factorization
Rafikul Alam
Department of Mathematics
IIT Guwahati
R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

Outline
• Characterization of positive definite matrices

• Cholesky factorization

Quadratic forms
A pure quadratic f (x, y ) comes directly from a symmetric 2 by 2 matrix!

> 2 2 2
a b x
x Ax in R ax + 2bxy + cy = x y .
b c y
For any symmetric matrix A, the product x> Ax is a pure quadratic form f (x1 , . . . , xn ) :
  
a11 · · · a1n xn n n
> n
 . .. ..   ..  = X X a x x .
x Ax in R x1 · · · xn  .. . .   .  ij i j
an1 ··· ann xn i=1 j=1

Quadratic forms
Let f : R2 −→ R a smooth function and p ∈ R2 . Then

1
f (p + h) = f (p) + ∇f (p) • h + h> Hf (p)h + O(khk32 ),
2
where the symmetric matrix
fxx (p) fxy (p)
Hf (p) :=
fxy (p) fyy (p)
is the Hessian of f at p.
Thus, if p is a critical point then f has a local minimum or maximum at p according as the
quadratic form x> Hf (p)x is positive or negative in a neighbourhood of p.
On the other hand, f has a saddle point at p if x> Hf (p)x takes positive and negative values in
a neighbourhood of p.

Positive definite matrices
A symmetric matrix A ∈ Rn×n is said to be
• positive semidefinte if x > Ax ≥ 0 for all x ∈ Rn (written as A 0)
• positive definite if x > Ax > 0 for all nonzero x ∈ Rn (written as A 0)
A matrix A ∈ Cn×n is said to be

• positive semidefinte if x ∗ Ax ≥ 0 for all x ∈ Cn (written as A 0)
• positive definite if x ∗ Ax > 0 for all nonzero x ∈ Cn (written as A 0)
A real positive definite matrix is also referred to as a symmetric positive definite (SPD) matrix.
Remark: Let A ∈ Cn×n . Then x ∗ Ax ∈ R for all x ∈ Cn ⇐⇒ A = A∗ .

But A ∈ Rn×n and x > Ax ∈ R for all x ∈ Rn 6=⇒ A = A> .

1 2
Indeed, if A = then x > Ax = (x1 + x2 )2 ≥ 0 for all x ∈ R2 but A 6= A> .
0 1

If A ∈ Rn×n is partitioned in the form

Am B
A= , Am ∈ Rm×m ,
C D
then Am is called a principal submatrix of A. Note that
A> = A ⇐⇒ A>
m = Am , C = B >, D > = D.
It follows that if A is SPD then so is Am . Indeed, for any nonzero x ∈ Rm , we have

>
x Am B x
x > Am x = > 0.
0 C D 0
In particular, if A is SPD then ajj = ej> Aej > 0 for j = 1 : n. Also, A is nonsingular (why?).

Facts: Let A ∈ Rn×n be an SPD matrix. Then the following results hold:
1 If X ∈ Rn×p with rank(X ) = p then X > AX is SPD. Indeed, for all nonzero y ∈ Rp ,
Xy 6= 0 (why?) and y > (X > AX )y = (Xy )> A(Xy ) > 0 =⇒ X > AX is SPD.
2 Leading principal submatrices of A are SPD, that is, A(1 : j, 1 : j) is SPD for j = 1 : n.
Am B >

3 Let A = . Then S := D − BA−1
m B
>
is the Schur complement of Am . Now
B D
" #" #" #>
B>

Am I 0 Am 0 I 0
=
B D BA−1
m I 0 D − BA−1
m B
>
BA−1
m I
shows that
A is SPD ⇐⇒ Am and S := D − BA−1
m B
>
are SPD.

LDV factorization
Theorem: Suppose that all leading principal submatrices A ∈ Rn×n are nonsingular. Then
A = LDV is a unique decomposition of A, where L is unit lower triangular, D is diagonal, and
V is unit upper triangular.
Proof: By assumption, A has a unique LU factorization A = LU. Let D := diag(u11 , . . . , unn ),

where u11 , . . . , unn are diagonal entries of U. Then V := D −1 U is unit upper triangular and
A = LDV .
Corollary: If A ∈ Rn×n is symmetric and all leading principal submatrices of A are nonsingular
then A = LDL> is a unique factorization of A, where L is unit lower triangular and D is a
diagonal matrix.
Corollary: If A is SPD then A = LDL> is a unique factorization of A, where L is unit lower

triangular and D is a diagonal SPD matrix.

Cholesky factorization
Theorem: Let A ∈ Rn×n be nonsingular. Then A is SPD ⇐⇒ A = GG > , where G is a unique
lower triangular matrix with positive diagonal entries.
Proof: A = GG > ⇒ x > Ax = x > GG > x = (G > x)> G > x = kG > xk2 > 0 for x 6= 0 ⇒ A is SPD.
A is SPD ⇒ A = LDL> is a unique factorization, where L is unit lower triangular and D is

diagonal
√ SPD matrix. Let D be given by D √ = diag(d11 , . . . √
, dnn ). √
Since djj > 0, define
√ √
D := diag( d11 , . . . , dnn ) and G := L D. Then A = L D(L D)> = GG > .
Definition: If A is SPD then A = GG > , where G lower triangular with positive diagonals, is
called the Cholesky factorization of A and G is called the Cholesky factor of A.
Example:
    >
1 1 1 1 0 0 1 0 0
1 2 2 = 1 1 0 1 1 0 .
1 2 3 1 1 1 1 1 1

Algorithm (inner product)

a a21 g
Let A := 11 and G := 11 . Then A = GG > yields
a21 a22 g21 g22
2
a11 a21 g g11 g21 g11 g11 g21
= 11 = 2 2 .
a21 a22 g21 g22 g22 g11 g21 g21 + g22
Equating the columns, we have

2 √
a11 = g11 g11 = a11
a21 = g11 g21 =⇒ g21 = ap21 /g11
2 2 2
a22 = g21 + g22 g22 = a22 − g21
2
Remark: The factorization is possible if a11 > 0 and a22 − g21 > 0.

More generally, equating columns on both sides of A = GG > , we have
         
a11 g11 a22 g21 g22
 ..   ..   ..   ..   .. 
 .  = g11  .  ,  .  = g21  .  + g22  . 
an1 gn1 an2 gn1 gn2
       
ajj gj1 gj2 gjj
 ..   ..   ..   .. 
 .  = gj1  .  + gj2  .  + · · · + gjj  .  , j = 1 : n
anj gn1 gn2 gnj
Algorithm (Inner product):

For j = 1 : n
q
ajj − j−1 2
P
gjj = k=1 gjk

gij = aij − j−1
P
k=1 gik gjk /gjj , i = j + 1 : n
end
Cost: n3 /3 flops - half the cost of GE.
 
16 −16 0
Example: Consider  −16 41 −5  . Then
0 −5 5
√ √ a21 −16 a31 0

g11 = a11 = 16 = 4, g21 = = = −4, g31 = = =0
g11 4 g11 4
q
2 =
√ a32 − g g
31 21 −5 − 0 × (−4)
g22 = a22 − g21 41 − 16 = 5, g32 = = = −1
g22 5
q
2 − g2 =
√
g33 = a33 − g31 32 5 − 0 − 1 = 2.
Hence     >
16 −16 0 4 4
 −16 41 −5  =  −4 5   −4 5  .
0 −5 5 0 −1 2 0 −1 2

Algorithm (outer product)
Let A ∈ Rn×n be SPD. Then A = GG > can be written as
a11 h> g>

g11 0 g11
= b> .
h Ab g G b 0 G
Equating the blocks, we have

2 √
a11 = g11 =⇒ g11 = a11
h = g11 g =⇒ g = h/g11
b>> b − gg > = G b>
A
b = gg + G
bG =⇒ A bG
For k = 1:n
A(k,k) = sqrt(A(k,k));
g = A(k+1:n,k)/A(k,k); A(k+1:n,k) = g;
A(k+1:n, k+1:n) = A(k+1:n, k+1:n)- g*g’;
end
Cost: n3 /3 flops - half the cost of GE.
Example
    
25 15 −5 g11 0 0 g11 g21 g31
 15 18 0  =  g21 g22 0  0 g22 g32 
−5 0 11 g31 g32 g33 0 0 g33
  
5 0 0 5 3 −1
=  3 g22 0  0 g22 g32 
−1 g32 g33 0 0 g33
Equating (2, 2) blocks, we have

18 0 3 9 3 g22 0 g22 g32
− 3 −1 = =
0 11 −1 3 10 g32 g33 0 g33

3 0 3 1
=
1 g33 0 g33
2
Equating (2, 2) entry, we have 10 − 1 = g33 =⇒ g33 = 3.

Solving SPD system
Let A ∈ Rn×n be SPD and b ∈ Rn . Then the system Ax = b can be solved using Cholesky
factorization as follows.
• Compute Cholesky factorization A = GG > . Cost: n3 /3 flops.

• Solve the lower triangular system Gy = b. Cost: n2 flops.
• Solve the upper triangular system G > x = y . Cost: n2 flops.
The matlab command chol computes Cholesky factorization of a positive definite matrix A.
More specifically, the commands
R = chol(A) and L = chol(A,0 lower0 )

compute an upper triangular matrix R and a lower triangular matrix L such that
A = R> R and A = LL>

A direct proof of Cholesky factorization
a11 h>

Problem: Let A = , where h ∈ Rn−1 , be SPD. Then the Schur complement
h D
S := D − hh> /a11 is SPD. Now use
" #
a11 h> h>

1 0 a11
=
h D h/a11 In−1 0 D − hh> /a11
" # >
1 0 a11 0 1 0
=
h/a11 In−1 0 D − hh> /a11 h/a11 In−1
√ " # √ >
a 0 1 0 a 0
= √11 √11
h/ a11 In−1 0 D − hh> /a11 h/ a11 In−1
and induction on n to prove that Cholesky factorization A = GG > exists and is unique.
***

Lect 9

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Lect 9

Uploaded by

Copyright:

Available Formats

MA423 Matrix Computations

Lecture 9: Cholesky Factorization

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

• Characterization of positive definite matrices

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

Let f : R2 −→ R a smooth function and p ∈ R2 . Then

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

A matrix A ∈ Cn×n is said to be

Remark: Let A ∈ Cn×n . Then x ∗ Ax ∈ R for all x ∈ Cn ⇐⇒ A = A∗ .

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

then Am is called a principal submatrix of A. Note that

It follows that if A is SPD then so is Am . Indeed, for any nonzero x ∈ Rm , we have

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

Proof: By assumption, A has a unique LU factorization A = LU. Let D := diag(u11 , . . . , unn ),

Corollary: If A is SPD then A = LDL> is a unique factorization of A, where L is unit lower

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

A is SPD ⇒ A = LDL> is a unique factorization, where L is unit lower triangular and D is

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

Equating the columns, we have

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

Algorithm (Inner product):

√ √ a21 −16 a31 0

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

Equating the blocks, we have

Equating (2, 2) blocks, we have

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

• Compute Cholesky factorization A = GG > . Cost: n3 /3 flops.

R = chol(A) and L = chol(A,0 lower0 )

A = R> R and A = LL>

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC

You might also like