
MAT 461/561: 3.2 LU Decomposition

James V. Lambers

January 29, 2020

Announcements
Homework submission:

• please do not use subfolders


• for MATLAB functions, please use exactly the same interface as in the problem (functions
may be graded by script)
• make it easy to see which problem your work is for, and the final result
• zip files preferred
• If a problem involves both MATLAB code and written questions, the easiest way is to include the answers as comments in the relevant MATLAB file
• Feedback on problems will be entered as a comment in the Canvas gradebook

Introduction
Current procedure for solving Ax = b (MAT 326-style):
1. Form the augmented matrix [A b]
2. Use row operations (Gaussian elimination) to reduce the augmented matrix to upper triangular form U (row-echelon form)
3. Use back substitution on U and the modified b
The choice of row operations performed on A and b is determined solely by A.
Ideally, we should be able to re-use this information when solving Ax_2 = b_2.
To do this, we need a “record” of the row operations performed on A, to use later with a new b.
Gaussian elimination on A (not b) is by far the most expensive part of solving Ax = b, so it should only have to be performed once; but then we need to “store” that sequence of row operations.
To do this, we will examine Gaussian elimination without involving b.
This is useful when solving many systems with the same A but different b. When does this happen? For example, when using an implicit time-stepping method to solve a linear system of ODEs with a time-independent Jacobian (y′ = Jy), as will be seen in Chapter 12.

Derivation of the LU Decomposition
Elementary Row Matrices
Recall elementary row operations: 1) scaling rows, 2) interchanging rows, 3) subtracting a multiple
of one row from another
All of these can be described by multiplying A on the left by elementary row matrices:
• Interchanging rows 1 and 3:
    
EA = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
     \begin{bmatrix} 1 & 2 & -1 & 3 \\ 4 & -7 & 6 & -5 \\ 10 & 3 & -4 & 2 \\ 1 & 0 & -6 & 8 \end{bmatrix}
   = \begin{bmatrix} 10 & 3 & -4 & 2 \\ 4 & -7 & 6 & -5 \\ 1 & 2 & -1 & 3 \\ 1 & 0 & -6 & 8 \end{bmatrix}

• Subtracting 2 times row 1 from row 4:


    
EA = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -2 & 0 & 0 & 1 \end{bmatrix}
     \begin{bmatrix} 1 & 2 & -1 & 3 \\ 4 & -7 & 6 & -5 \\ 10 & 3 & -4 & 2 \\ 1 & 0 & -6 & 8 \end{bmatrix}
   = \begin{bmatrix} 1 & 2 & -1 & 3 \\ 4 & -7 & 6 & -5 \\ 10 & 3 & -4 & 2 \\ -1 & -4 & -4 & 2 \end{bmatrix}

If we can easily describe each individual row operation, can we efficiently (in terms of time and
space) represent ALL of the operations needed to carry out Gaussian elimination?
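These two operations are easy to check numerically. Below is a short MATLAB illustration (my own sketch, not part of the original notes), where E1 and E2 are the two elementary row matrices from the examples above:

% Elementary row matrices acting on the example matrix A
A = [1 2 -1 3; 4 -7 6 -5; 10 3 -4 2; 1 0 -6 8];

E1 = eye(4); E1([1 3],:) = E1([3 1],:);  % interchange rows 1 and 3
E2 = eye(4); E2(4,1) = -2;               % subtract 2*(row 1) from row 4

E1*A  % rows 1 and 3 of A are interchanged
E2*A  % row 4 becomes (row 4) - 2*(row 1)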

Unit Lower Triangular Matrices


Note that the second matrix E, for subtracting a multiple of one row from another, is a unit lower
triangular matrix. That is, it is lower triangular, AND all diagonal entries are equal to 1.
It is lower triangular, instead of upper, because it subtracts a multiple of row j from row i, where
i > j (as Gaussian elimination does).
Properties of unit lower triangular matrices:
• The product of two (unit) lower triangular matrices is (unit) lower triangular
• The inverse of a nonsingular (invertible) (unit) lower triangular matrix is (unit) lower trian-
gular. A unit lower triangular matrix is always nonsingular.
These can be proven using the general formula for matrix-matrix multiplication (see Appendix B). Note that for a lower triangular matrix L, ℓ_ij = 0 if i < j.
What happens when we apply multiple elementary row matrices to A? Let E_ij be the elementary row matrix that subtracts m_ij times row j from row i, where i > j, and m_ij = a_ij/a_jj (the “multiplier” from Gaussian elimination). Then

E_{n,n-1} \cdots E_{n2} \cdots E_{32} E_{n1} \cdots E_{41} E_{31} E_{21} A = U.

More concisely,

M^{(n-1)} \cdots M^{(2)} M^{(1)} A = U

where M^{(j)} is the product of the elementary row matrices that eliminate the subdiagonal entries in column j. That is,

M^{(j)} = E_{nj} \cdots E_{j+2,j} E_{j+1,j}

We know that E_ij is the identity matrix I, with −m_ij in row i, column j. What does M^{(j)} look like? It is the identity matrix I, with −m_ij in the (i, j) position, for i = j + 1, . . . , n.
Example:
     
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -2 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ -3 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 & 0 \\ -3 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ -2 & 0 & 0 & 1 \end{bmatrix}

and

\begin{bmatrix} 1 & 0 & 0 & 0 \\ -3 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ -2 & 0 & 0 & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 3 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 2 & 0 & 0 & 1 \end{bmatrix}
From

M^{(n-1)} \cdots M^{(2)} M^{(1)} A = L^{-1} A = U

we have

A = [M^{(1)}]^{-1} [M^{(2)}]^{-1} \cdots [M^{(n-2)}]^{-1} [M^{(n-1)}]^{-1} U

or

A = L^{(1)} L^{(2)} \cdots L^{(n-1)} U

where

L^{(j)} = [M^{(j)}]^{-1}

is the identity matrix I with m_ij in the (i, j) position for i > j. But what happens when we multiply all of the L^{(j)}? We obtain

L = L^{(1)} L^{(2)} \cdots L^{(n-1)}

where L is the identity matrix I with m_ij in the (i, j) position for i > j. Thus we have

A = LU

which is the LU Decomposition or LU Factorization of A. To efficiently obtain this factorization, all we have to do is store each m_ij in the (i, j) entry of A. That is, we can store the “interesting” part of both L and U within A.
Then, when we are given a b, we can perform all of the same row operations on b by looking up the multipliers from L.
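As a concrete illustration, here is a minimal MATLAB sketch of this idea (my own, not from the notes): Gaussian elimination without pivoting, overwriting A with the multipliers below the diagonal and U on and above it. It assumes no zero pivots are encountered.

function A = lu_nopivot(A)
% LU_NOPIVOT  Overwrite A with its LU factors (no pivoting).
% On return, triu(A) is U, and L = eye(n) + tril(A,-1)
% holds the multipliers m_ij below the diagonal.
n = size(A,1);
for j = 1:n-1
    for i = j+1:n
        A(i,j) = A(i,j)/A(j,j);                       % multiplier m_ij, stored in place
        A(i,j+1:n) = A(i,j+1:n) - A(i,j)*A(j,j+1:n);  % row i <- row i - m_ij * row j
    end
end
end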

Solution of Ax = b
How we solve the system: Ax = b means LU x = b. Let y = U x. Then Ly = b.
The algorithm (a MATLAB sketch of steps 2 and 3 follows the list):
1. Perform Gaussian elimination on A: A = LU ((2/3)n^3 flops)
2. Solve Ly = b using forward substitution (n^2 − n flops). This implicitly applies the row operations previously performed on A to b. Compare the b_i = b_i − m_ij b_j step from Gaussian elimination to the y_i = y_i − ℓ_ij y_j step of forward substitution.
3. Solve U x = y using back substitution to obtain x (n^2 flops).
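A minimal MATLAB sketch of steps 2 and 3 (my own illustration), assuming A already holds the multipliers below its diagonal and U on and above it (as produced by the hypothetical lu_nopivot sketch earlier), and that b is a column vector:

% Step 2: forward substitution, Ly = b (L is unit lower triangular)
n = length(b);
y = b;
for j = 1:n-1
    for i = j+1:n
        y(i) = y(i) - A(i,j)*y(j);  % same update Gaussian elimination applied to b
    end
end

% Step 3: back substitution, Ux = y
x = y;
for i = n:-1:1
    x(i) = (x(i) - A(i,i+1:n)*x(i+1:n))/A(i,i);  % empty product is 0 when i = n
end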

Existence and Uniqueness
Not every matrix, even if invertible, HAS an LU Decomposition! For example, the matrices

A = \begin{bmatrix} 0 & -1 & 2 \\ 3 & 9 & -4 \\ 5 & 0 & 1 \end{bmatrix}, \qquad
A = \begin{bmatrix} 1 & 2 & -7 \\ 2 & 4 & 1 \\ 3 & 0 & -5 \end{bmatrix}

What goes wrong? In the first A, a_11 = 0, so the multipliers for column j = 1 are undefined.
For the second A, we can proceed with the first column, using multipliers m_21 = 2/1 = 2 and m_31 = 3/1 = 3. But then a_22 = 0 in the updated matrix!
The LU Decomposition A = LU exists if and only if the leading principal minors of A are nonzero.
The leading principal submatrices of A, denoted A_1, A_2, . . . , A_n, are defined as follows:

A_1 = \begin{bmatrix} a_{11} \end{bmatrix}, \quad
A_2 = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad
A_k = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & & a_{2k} \\ \vdots & & \ddots & \vdots \\ a_{k1} & \cdots & \cdots & a_{kk} \end{bmatrix}

The leading principal minors of A are det(A_1), det(A_2), . . . , det(A_n). All must be nonzero for the LU Decomposition to exist.
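For illustration only (one would never check existence this way in practice), the leading principal minors can be inspected in MATLAB with det applied to submatrices. Applied to the second example matrix above, det(A_2) = 0, consistent with the elimination breaking down at a_22:

% Leading principal minors of the second example matrix
A = [1 2 -7; 2 4 1; 3 0 -5];
for k = 1:size(A,1)
    fprintf('det(A_%d) = %g\n', k, det(A(1:k,1:k)));
end
% det(A_2) = 1*4 - 2*2 = 0, so this A has no LU Decomposition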
If it exists, is it unique? Assume A = L_1 U_1 = L_2 U_2. Because A is nonsingular (invertible), the L_i, U_i are also nonsingular. Therefore

U_1 U_2^{-1} = L_1^{-1} L_2.

Recall that multiplying and inverting preserve triangularity! Therefore U_1 U_2^{-1} is upper triangular, and L_1^{-1} L_2 is unit lower triangular. What can we conclude if a lower triangular and an upper triangular matrix are equal? They must be diagonal, because all entries below the main diagonal must be zero (upper) and all entries above the main diagonal must also be zero (lower).
Furthermore, all the diagonal entries must be 1, because L_1^{-1} L_2 is unit lower triangular. That is,

L_1^{-1} L_2 = I

and therefore L_1 = L_2 and U_1 = U_2, so the LU Decomposition is unique.

Practical Computation of Determinants


Assuming the LU Decomposition exists, it can be used to easily obtain det(A), because

det(A) = det(LU) = det(L) det(U).

Moreover, det(L) = 1 and det(U) = u_11 u_22 · · · u_nn. Therefore

det(A) = u_11 u_22 · · · u_nn.

Note that computing the determinant from the definition (cofactor expansion) requires O(n!) flops, whereas using the LU Decomposition requires “only” (2/3)n^3.
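In MATLAB, continuing with the hypothetical lu_nopivot sketch from earlier (and assuming the factorization exists):

F = lu_nopivot(A);     % triu(F) is U
detA = prod(diag(F));  % det(A) = u_11 u_22 ... u_nn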
Why are determinants not as important in numerical linear algebra? We know A is nonsingular iff det(A) ≠ 0. However, it is possible for a matrix to be “near” a singular matrix while det(A) is nowhere near zero. Similarly, a matrix could have a tiny determinant but not be nearly singular. A small change in A can cause a large change in det(A).

Pivoting
If the LU Decomposition of A does not exist, then we must certainly perform row interchanges to
ensure that it does, BUT due to floating-point arithmetic error, we have another reason to perform
row interchanges.
Consider the main step in Gaussian elimination:

a_{ik}^{(j+1)} = a_{ik}^{(j)} - m_{ij} a_{jk}^{(j)}

where A^{(1)} is the original matrix. Note that all entries a_{ij}^{(k)} will be “contaminated” with roundoff error. That is, a_{jk}^{(j)} = ã_{jk}^{(j)}(1 + ε_{jk}), where ã_{jk}^{(j)} is the result of performing Gaussian elimination in exact arithmetic and ε_{jk} is the (relative) roundoff error.
Suppose the multiplier m_ij is large. Then the roundoff error in a_{jk}^{(j)} is amplified by m_ij, causing a_{ik}^{(j+1)} to be “more erroneous”! Therefore, even if there is no danger of dividing by zero, we should interchange rows (pivot) to reduce the size of m_ij.
Using the fact that m_ij = a_ij/a_jj, we want to perform an interchange among rows j, j + 1, . . . , n so that |a_jj| is as large as possible.

Partial Pivoting
In partial pivoting, we examine a_jj, a_{j+1,j}, a_{j+2,j}, . . . , a_nj and find the largest (in absolute value). Suppose that entry is a_qj. Then we interchange row j and row q. This ensures the smallest possible multipliers from the entries in column j; in particular, it guarantees |m_ij| ≤ 1. It is also efficient: it adds only O(n^2) comparisons to the O(n^3) flops of Gaussian elimination. The “backslash” operator in MATLAB (\) performs Gaussian elimination with partial pivoting. HOWEVER, even with small multipliers, the entries of U can still get very large (see homework).
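One step of this pivot search is easy to express in MATLAB (a sketch, assuming j is the current column and n = size(A,1)):

[~, q] = max(abs(A(j:n,j)));  % largest of |a_jj|, |a_{j+1,j}|, ..., |a_nj|
q = q + j - 1;                % convert to a row index of A
A([j q],:) = A([q j],:);      % interchange rows j and q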

Complete Pivoting
Before eliminating subdiagonal entries in column j, we examine rows and columns j, j + 1, . . . , n to find the largest entry in the entire remaining submatrix. Suppose that entry is a_pq, where p, q ≥ j. We then swap rows j and p, and columns j and q. Therefore A is replaced by P_1 A P_2, where P_1 and P_2 are permutation matrices. Example:

P_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}

swaps rows 2 and 4 if multiplied on the left, or columns 2 and 4 on the right.
Therefore Ax = b becomes

P_1 A P_2 P_2^{-1} x = P_1 b

or

Ã x̃ = b̃

where Ã = P_1 A P_2, b̃ = P_1 b, and x̃ = P_2^{-1} x. Therefore we can carry out Ã = LU, then Ly = b̃ (forward substitution) and U x̃ = y (back substitution). Then x = P_2 x̃.
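In MATLAB terms, a sketch of this bookkeeping, assuming the factorization returned index vectors p1 and p2 (hypothetical names, not from the notes) such that A(p1,p2) = L*U:

btilde = b(p1);           % b~ = P1*b
y = L \ btilde;           % forward substitution (L is lower triangular)
xtilde = U \ y;           % back substitution gives x~ = P2^{-1}*x
x = zeros(size(xtilde));
x(p2) = xtilde;           % undo the column permutation: x = P2*x~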

Complete pivoting: why or why not?

• pro: smaller multipliers than partial pivoting, so less propagation of rounding error; also, Gaussian elimination with complete pivoting is proven to be “backward stable” (Sec. 2.1), meaning that the computed output of Gaussian elimination is the exact LU decomposition of a “nearby” matrix. Recall that the forward error (error in x) is the backward error (error in A) times the condition number of A. Note that Gaussian elimination was not initially known to be stable, until Jim Wilkinson performed a careful roundoff error analysis.

• con: complete pivoting requires O(n^3) comparisons, compared to O(n^2) for partial pivoting.

The LU Decomposition with Partial Pivoting


Recall the algorithm for solving Ax = b:

• A = LU (Gaussian elimination)

• Ly = b (forward substitution)

• U x = y (back substitution)

How does this change when rows are interchanged? Recall

M^{(n-1)} \cdots M^{(2)} M^{(1)} A = U

Now, with pivoting:

M^{(n-1)} P_{n-1} \cdots P_3 M^{(2)} P_2 M^{(1)} P_1 A = U

But

L = [M^{(1)}]^{-1} \cdots [M^{(n-1)}]^{-1}

no longer yields A = LU, because the permutations are interleaved between the M^{(j)}. How to still obtain an LU Decomposition:

M^{(n-1)} P_{n-1} \cdots P_3 M^{(2)} [P_2 M^{(1)} P_2^{-1}] P_2 P_1 A = U

First, a fact about permutation matrices: a permutation matrix P is an example of an orthogonal matrix, for which P^T P = I, or P^{-1} = P^T. The key observation is that M̃^{(1)} = P_2 M^{(1)} P_2^T is still unit lower triangular. Now we have

M^{(n-1)} P_{n-1} \cdots [P_3 M^{(2)} M̃^{(1)} P_3^T] P_3 P_2 P_1 A = U

Effectively, we have

M̃^{(n-1)} \cdots M̃^{(1)} P_{n-1} \cdots P_2 P_1 A = U

where M̃^{(j)} = P_{n-1} \cdots P_{j+1} M^{(j)} P_{j+1}^T \cdots P_{n-1}^T is unit lower triangular. This means

P A = LU

where P carries out ALL row interchanges during Gaussian elimination, and L contains the multipliers. BUT, to ensure the multipliers are in the correct place, do the following:

1. Store m_ij in a_ij, where i > j

2. When interchanging rows, apply the interchange to the ENTIRE rows j and q

This ensures the correct L in P A = LU .
The process now:

• P A = LU (Gaussian elimination with partial pivoting)

• Ly = P b (forward substitution)

• U x = y (back substitution)

Storage: do NOT store P explicitly. Recommended: start with a vector of indices 1, . . . , n. When swapping rows of A, swap the corresponding entries of this vector. Then use these indices to rearrange b. MATLAB: use b(P), where P is the vector of indices.
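Putting the pieces together, a minimal MATLAB sketch (my own; the function name and interface are not from the notes) of Gaussian elimination with partial pivoting, storing the multipliers in place and the permutation as an index vector:

function [A, p] = lu_partial_pivot(A)
% LU_PARTIAL_PIVOT  In-place LU with partial pivoting.
% On return, Ain(p,:) = L*U for the input matrix Ain, where triu(A) is U
% and the strictly lower part of A holds the multipliers.
n = size(A,1);
p = 1:n;                          % permutation stored as an index vector
for j = 1:n-1
    [~, q] = max(abs(A(j:n,j)));
    q = q + j - 1;
    A([j q],:) = A([q j],:);      % swap ENTIRE rows, multipliers included
    p([j q]) = p([q j]);          % record the interchange
    for i = j+1:n
        A(i,j) = A(i,j)/A(j,j);
        A(i,j+1:n) = A(i,j+1:n) - A(i,j)*A(j,j+1:n);
    end
end
end

Forward and back substitution are then applied to b(p) exactly as before.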

Practical Computation of Determinants, Revisited


To obtain det(A) after Gaussian elimination with pivoting, we note that

det(A) = det(P^T L U) = det(P^T) det(L) det(U) = det(P) \prod_{i=1}^{n} u_{ii}

Since P is a real orthogonal matrix, we have P^T P = I, so

1 = det(I) = det(P^T P) = det(P)^2

so det(P) = ±1. We use the fact that interchanging two rows of a matrix negates the determinant, so

det(A) = (-1)^s u_{11} \cdots u_{nn}

where s is the number of row interchanges performed.
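In the hypothetical lu_partial_pivot sketch above, this just means counting the interchanges that actually occur; a sketch:

s = 0;                            % number of row interchanges
% ... inside the elimination loop, after finding pivot row q:
%     if q ~= j
%         A([j q],:) = A([q j],:);  s = s + 1;
%     end
% ... after elimination, with triu(A) = U:
detA = (-1)^s * prod(diag(A));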
