Solving Linear Systems: Iterative Methods and Sparse Systems


COS 323
Direct vs. Iterative Methods

• So far, we have looked at direct methods for solving linear systems
– Predictable number of steps
– No answer until the very end
• Alternative: iterative methods
– Start with an approximate answer
– Each iteration improves its accuracy
– Stop once the estimated error is below a tolerance
Benefits of Iterative Algorithms

• Some iterative algorithms are designed for accuracy:
– Direct methods are subject to roundoff error
– Iterate to reduce the error to O(machine ε)
• Some algorithms produce an answer faster
– Most important class: sparse matrix solvers
– Speed depends on the # of nonzero elements, not the total # of elements
• Today: iterative improvement of accuracy, and solving sparse systems (not necessarily iteratively)
Iterative Improvement

• Suppose you’ve solved (or think you’ve solved) some system Ax = b
• Can check the answer by computing the residual:
  r = b − Ax_computed
• If r is small (compared to b), x is accurate
• What if it’s not?
Iterative Improvement

• A large residual is caused by error in x:
  e = x_correct − x_computed
• If we knew the error, we could improve x:
  x_correct = x_computed + e
• Solve for the error:
  Ax_computed = A(x_correct − e) = b − r
  Ax_correct − Ae = b − r
  Ae = r
Iterative Improvement

• So, compute the residual, solve for e, and apply the correction to the estimate of x (see the sketch below)
• If the original system was solved using LU, this is relatively fast (relative to O(n^3), that is):
– O(n^2) matrix/vector multiplication + O(n) vector subtraction to compute r
– O(n^2) forward/backsubstitution to solve for e
– O(n) vector addition to correct the estimate of x
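• A minimal C sketch of one refinement step. The helpers matvec (multiply the original A by a vector) and lu_solve (forward/backsubstitution with a previously computed factorization of A) are assumed to exist; they are not defined in these slides.

#include <stdlib.h>

/* Assumed helpers (not from the slides): */
void matvec(int n, const double *x, double *out);     /* out = A x, O(n^2)         */
void lu_solve(int n, const double *rhs, double *out);  /* solve A out = rhs via the */
                                                        /* stored LU factors, O(n^2) */

/* One step of iterative improvement of a computed solution x of A x = b. */
void improve_once(int n, const double *b, double *x)
{
    double *r = malloc(n * sizeof *r);   /* residual r = b - A x           */
    double *e = malloc(n * sizeof *e);   /* estimated error, from A e = r  */
    int i;

    matvec(n, x, r);                     /* r = A x                O(n^2)  */
    for (i = 0; i < n; i++)
        r[i] = b[i] - r[i];              /* r = b - A x            O(n)    */

    lu_solve(n, r, e);                   /* solve A e = r          O(n^2)  */

    for (i = 0; i < n; i++)
        x[i] += e[i];                    /* corrected estimate     O(n)    */

    free(r);
    free(e);
}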
Sparse Systems

• Many applications require the solution of large linear systems (n = thousands to millions)
– Local constraints or interactions: most entries are 0
– Wasteful to store all n^2 entries
– Difficult or impossible to use O(n^3) algorithms
• Goal: solve the system with:
– Storage proportional to the # of nonzero elements
– Running time << n^3
Special Case: Band Diagonal

• Last time: tridiagonal (or band diagonal) systems
– Storage O(n): only the relevant diagonals
– Time O(n): Gauss-Jordan with bookkeeping (see the sketch below)
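• As a concrete example, here is a C sketch of a standard O(n) tridiagonal solve (the Thomas algorithm, i.e. elimination specialized to three diagonals). The array layout and the no-pivoting assumption are choices made here, not taken from the slides.

#include <stdlib.h>

/* Solve a tridiagonal system: a = subdiagonal (a[0] unused), d = diagonal,
   c = superdiagonal (c[n-1] unused), b = right-hand side; result goes in x.
   No pivoting, so this assumes a well-behaved (e.g. diagonally dominant) system. */
void tridiag_solve(int n, const double *a, const double *d, const double *c,
                   const double *b, double *x)
{
    double *cp = malloc(n * sizeof *cp);  /* modified superdiagonal          */
    double *bp = malloc(n * sizeof *bp);  /* modified right-hand side        */
    int i;

    cp[0] = c[0] / d[0];
    bp[0] = b[0] / d[0];
    for (i = 1; i < n; i++) {             /* forward sweep: O(n) elimination */
        double m = d[i] - a[i] * cp[i - 1];
        cp[i] = (i < n - 1) ? c[i] / m : 0.0;
        bp[i] = (b[i] - a[i] * bp[i - 1]) / m;
    }

    x[n - 1] = bp[n - 1];                 /* back substitution: O(n)         */
    for (i = n - 2; i >= 0; i--)
        x[i] = bp[i] - cp[i] * x[i + 1];

    free(cp);
    free(bp);
}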
Cyclic Tridiagonal

• Interesting extension: cyclic tridiagonal

$$\begin{pmatrix}
a_{11} & a_{12} &        &        &        & a_{16} \\
a_{21} & a_{22} & a_{23} &        &        &        \\
       & a_{32} & a_{33} & a_{34} &        &        \\
       &        & a_{43} & a_{44} & a_{45} &        \\
       &        &        & a_{54} & a_{55} & a_{56} \\
a_{61} &        &        &        & a_{65} & a_{66}
\end{pmatrix} x = b$$

• Could derive yet another special-case algorithm, but there’s a better way
Updating Inverse

• Suppose we have some fast way of finding A^-1 for some matrix A
• Now A changes in a special way:
  A* = A + uv^T
  for some n×1 vectors u and v
• Goal: find a fast way of computing (A*)^-1
– Eventually, a fast way of solving (A*) x = b
Sherman-Morrison Formula

$$A^* = A + uv^T = A(I + A^{-1}uv^T)$$
$$(A^*)^{-1} = (I + A^{-1}uv^T)^{-1}A^{-1}$$

Let $x = A^{-1}uv^T$.
Note that $x^2 = A^{-1}u\,(v^T A^{-1}u)\,v^T$; the factor $v^T A^{-1}u$ is a scalar! Call it $\lambda$.
So $x^2 = \lambda\,A^{-1}uv^T = \lambda x$.
Sherman-Morrison Formula

$$x^2 = \lambda x$$
$$x(I + x) = x + x^2 = (1 + \lambda)\,x \;\;\Rightarrow\;\; \frac{x}{1+\lambda}(I + x) = x$$
$$\left(I - \frac{x}{1+\lambda}\right)(I + x) = I + x - \frac{(1+\lambda)\,x}{1+\lambda} = I$$
$$(I + x)^{-1} = I - \frac{x}{1+\lambda}$$
$$(A^*)^{-1} = A^{-1} - \frac{A^{-1}uv^T A^{-1}}{1 + v^T A^{-1}u}$$
Sherman-Morrison Formula

1 1
 
T
* 1 A u v A b
x A b  A 1b 
1  v T A 1u
 
So, to solve A* x  b,
z vT y
solve Ay  b, Az  u , x  y 
1  vT z
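• A minimal C sketch of that recipe. Here solve_A stands for whatever fast solver is available for the unmodified matrix A (e.g. the tridiagonal solve above); it is an assumed helper, not defined in the slides.

#include <stdlib.h>

void solve_A(int n, const double *rhs, double *out);   /* assumed fast solver for A */

/* Solve A* x = b, where A* = A + u v^T, using two solves with A. */
void sherman_morrison_solve(int n, const double *b, const double *u,
                            const double *v, double *x)
{
    double *y = malloc(n * sizeof *y);
    double *z = malloc(n * sizeof *z);
    double vy = 0.0, vz = 0.0;
    int i;

    solve_A(n, b, y);                    /* A y = b */
    solve_A(n, u, z);                    /* A z = u */

    for (i = 0; i < n; i++) {            /* two O(n) dot products */
        vy += v[i] * y[i];
        vz += v[i] * z[i];
    }

    for (i = 0; i < n; i++)              /* x = y - z (v^T y) / (1 + v^T z) */
        x[i] = y[i] - z[i] * vy / (1.0 + vz);

    free(y);
    free(z);
}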
Applying Sherman-Morrison

• Let’s consider cyclic tridiagonal again:

$$\begin{pmatrix}
a_{11} & a_{12} &        &        &        & a_{16} \\
a_{21} & a_{22} & a_{23} &        &        &        \\
       & a_{32} & a_{33} & a_{34} &        &        \\
       &        & a_{43} & a_{44} & a_{45} &        \\
       &        &        & a_{54} & a_{55} & a_{56} \\
a_{61} &        &        &        & a_{65} & a_{66}
\end{pmatrix} x = b$$

• Take
$$A = \begin{pmatrix}
a_{11}-1 & a_{12} &        &        &        &        \\
a_{21}   & a_{22} & a_{23} &        &        &        \\
         & a_{32} & a_{33} & a_{34} &        &        \\
         &        & a_{43} & a_{44} & a_{45} &        \\
         &        &        & a_{54} & a_{55} & a_{56} \\
         &        &        &        & a_{65} & a_{66} - a_{61}a_{16}
\end{pmatrix},\quad
u = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ a_{61} \end{pmatrix},\quad
v = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ a_{16} \end{pmatrix}$$
so that $A + uv^T$ is the cyclic matrix above.
Applying Sherman-Morrison

• Solve Ay = b, Az = u using the special fast algorithm
• Applying Sherman-Morrison takes a couple of dot products
• Total: O(n) time
• Generalization for several corrections: Woodbury
$$A^* = A + UV^T$$
$$(A^*)^{-1} = A^{-1} - A^{-1}U\left(I + V^T A^{-1}U\right)^{-1}V^T A^{-1}$$
More General Sparse Matrices

• More generally, we can represent sparse matrices by noting which elements are nonzero
• Critical for Ax and A^T x to be efficient: proportional to the # of nonzero elements
– We’ll see an algorithm for solving Ax = b using only these two operations!
Compressed Sparse Row Format

• Three arrays
– Values: actual numbers in the matrix
– Cols: column of corresponding entry in values
– Rows: index of first entry in each row
– Example (zero-based):

  0 3 2 3        values: 3 2 3 2 5 1 2 3
  2 0 0 5        cols:   1 2 3 0 3 1 2 3
  0 0 0 0        rows:   0 3 5 5 8
  0 1 2 3
Compressed Sparse Row Format

0 3 2 3
  values 32325123
2 0 0 5
0 0 0 0 cols 12303123
 
0 1 2 3 rows 03558
• Multiplying Ax:

for (i = 0; i < n; i++) {                     /* loop over rows              */
    out[i] = 0;
    for (j = rows[i]; j < rows[i+1]; j++)     /* loop over nonzeros in row i */
        out[i] += values[j] * x[ cols[j] ];   /* gather from column cols[j]  */
}
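• Multiplying A^T x uses the same three arrays: each nonzero contributes to out[cols[j]] instead of out[i]. A sketch in the same style (this variant is an addition here, not on the original slide):

for (i = 0; i < n; i++)
    out[i] = 0;                               /* clear the output vector      */
for (i = 0; i < n; i++)                       /* walk the nonzeros row by row */
    for (j = rows[i]; j < rows[i+1]; j++)
        out[ cols[j] ] += values[j] * x[i];   /* scatter to column position   */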
Solving Sparse Systems

• Transform the problem into a function minimization!
  Solve Ax = b  ⇔  Minimize f(x) = x^T A x − 2 b^T x
• To motivate this, consider 1D:
  f(x) = ax^2 − 2bx
  df/dx = 2ax − 2b = 0
  ⇒ ax = b
Solving Sparse Systems

• Preferred method: conjugate gradients
• Recall: plain gradient descent has a problem…
Solving Sparse Systems

• … that’s solved by conjugate gradients
• Walk along direction
$$d_{k+1} = -g_{k+1} + \beta_k d_k$$
• Polak-Ribière formula:
$$\beta_k = \frac{g_{k+1}^T\,(g_{k+1} - g_k)}{g_k^T\,g_k}$$
Solving Sparse Systems

• Easiest to think about when A is symmetric
• First ingredient: need to evaluate the gradient
$$f(x) = x^T A x - 2b^T x$$
$$\nabla f(x) = 2(Ax - b)$$
• As advertised, this only involves A multiplied by a vector
Solving Sparse Systems

• Second ingredient: given point x_i and direction d_i, minimize the function along that direction

Define $m_i(t) = f(x_i + t\,d_i)$ and minimize it: set $\frac{d}{dt}m_i(t) = 0$
$$\frac{dm_i(t)}{dt} = 2\,d_i^T(Ax_i - b) + 2t\,d_i^T A\,d_i \stackrel{\text{want}}{=} 0$$
$$t_{\min} = -\frac{d_i^T(Ax_i - b)}{d_i^T A\,d_i}$$
$$x_{i+1} = x_i + t_{\min}\,d_i$$
Solving Sparse Systems

• So, each iteration just requires a few sparse matrix–vector multiplies (plus some dot products, etc.)
• If the matrix is n×n and has m nonzero entries, each iteration is O(max(m, n))
• Conjugate gradients may need n iterations for “perfect” convergence, but often gives a decent answer well before then (see the sketch below)
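• A compact C sketch of linear conjugate gradients on a CSR matrix, assuming A is symmetric positive definite. It uses the standard linear-CG coefficient β = (r_new·r_new)/(r_old·r_old), which for an exact quadratic is equivalent to the Polak-Ribière recipe above; the function names and tolerance handling are choices made here, not taken from the slides.

#include <math.h>
#include <stdlib.h>

/* out = A x for a CSR matrix (same loop as on the earlier slide). */
void csr_matvec(int n, const double *values, const int *cols, const int *rows,
                const double *x, double *out)
{
    for (int i = 0; i < n; i++) {
        out[i] = 0.0;
        for (int j = rows[i]; j < rows[i+1]; j++)
            out[i] += values[j] * x[cols[j]];
    }
}

static double dot(int n, const double *a, const double *b)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}

/* Conjugate gradients for A x = b: x holds an initial guess on entry and the
   answer on exit.  One sparse matvec plus a few dot products per iteration. */
void cg_solve(int n, const double *values, const int *cols, const int *rows,
              const double *b, double *x, double tol, int max_iter)
{
    double *r  = malloc(n * sizeof *r);   /* residual b - A x               */
    double *d  = malloc(n * sizeof *d);   /* search direction               */
    double *Ad = malloc(n * sizeof *Ad);

    csr_matvec(n, values, cols, rows, x, Ad);
    for (int i = 0; i < n; i++) {
        r[i] = b[i] - Ad[i];
        d[i] = r[i];
    }
    double rr = dot(n, r, r);

    for (int k = 0; k < max_iter && sqrt(rr) > tol; k++) {
        csr_matvec(n, values, cols, rows, d, Ad);  /* one sparse matvec/iter  */
        double alpha = rr / dot(n, d, Ad);         /* exact line minimization */
        for (int i = 0; i < n; i++) {
            x[i] += alpha * d[i];
            r[i] -= alpha * Ad[i];
        }
        double rr_new = dot(n, r, r);
        double beta = rr_new / rr;                 /* new conjugate direction */
        for (int i = 0; i < n; i++)
            d[i] = r[i] + beta * d[i];
        rr = rr_new;
    }

    free(r); free(d); free(Ad);
}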
