
Power Methods for Eigenvalues

MATH2071: Numerical Methods in Scientific Computing II


http://people.sc.fsu.edu/~jburkardt/classes/math2071_2020/power/power.pdf

Power methods apply the same linear transformation over and over.

The Power Method


Given a matrix A, the power method is a simple iterative procedure that can give an excellent approximation to the largest eigenvalue and its associated eigenvector. The inverse power method and the shifted inverse power method allow us to search for specific eigenvalues.

1 An experiment
Consider the matrix
\[
A = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 5 & 6 \\ 3 & 5 & 2 & 7 \\ 4 & 6 & 7 & 1 \end{pmatrix}
\]
Let’s pick a starting vector x0 of ones, and then repeatedly multiply by A:
\[
x_0 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \quad
x_1 = A*x_0 = \begin{pmatrix} 10 \\ 16 \\ 17 \\ 18 \end{pmatrix}, \quad
x_2 = A*x_1 = \begin{pmatrix} 165 \\ 261 \\ 270 \\ 273 \end{pmatrix}, \quad
x_3 = A*x_2 = \begin{pmatrix} 2589 \\ 4101 \\ 4251 \\ 4389 \end{pmatrix}, \quad
x_4 = A*x_3 = \begin{pmatrix} 41100 \\ 65070 \\ 67497 \\ 69108 \end{pmatrix}
\]
Each multiplication seems to magnify the vector entries. By the time we get to x4, it almost looks like the same factor is applied to each vector entry. We can check this by computing the pairwise ratios r_i = x_4(i)/x_3(i):
\[
r = \frac{x_4(i)}{x_3(i)} = \begin{pmatrix} 41100/2589 \\ 65070/4101 \\ 67497/4251 \\ 69108/4389 \end{pmatrix} = \begin{pmatrix} 15.8749 \\ 15.8669 \\ 15.8779 \\ 15.7457 \end{pmatrix}
\]
In other words, the vector x3 is behaving like an eigenvector of A, associated with an eigenvalue of approximately 15.8. So it is approximately true that A*x3 ≈ 15.8*x3.
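This experiment is easy to reproduce. Here is a minimal MATLAB sketch, using the matrix A and starting vector of ones defined above:

A = [ 1, 2, 3, 4;
      2, 3, 5, 6;
      3, 5, 2, 7;
      4, 6, 7, 1 ];

x = ones ( 4, 1 );      % x0, the starting vector of ones
for k = 1 : 3           % compute x1, x2, x3
  x = A * x;
end
xold = x;               % save x3
x = A * x;              % x4
r = x ./ xold           % pairwise ratios x4(i)/x3(i), each near 15.8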

2 Compare with MATLAB’s eig() function
To get some more clues, we can use the [V,L]=eig(A) command to ask MATLAB directly for the eigenvectors
(columns of V ) and eigenvalues (diagonal entries of L):
[ V, L ] = eig ( A )

V =

    0.1852   -0.3795    0.8433    0.3325
    0.2192   -0.6224   -0.5359    0.5266
    0.5087    0.6650   -0.0278    0.5460
   -0.8117    0.1621    0.0303    0.5604

L =

   -5.9210         0         0         0
         0   -2.6854         0         0
         0         0   -0.2264         0
         0         0         0   15.8328

We see that A does indeed have an eigenvalue very close to our estimate of 15.8. In order to check the eigenvector, we need to normalize x3 with respect to the ℓ2 norm. The norm of x3 is 7801.0. Dividing through, we get
\[
\begin{pmatrix} 2589/7801 \\ 4101/7801 \\ 4251/7801 \\ 4389/7801 \end{pmatrix} = \begin{pmatrix} 0.3319 \\ 0.5257 \\ 0.5449 \\ 0.5626 \end{pmatrix}
\]
and this is very close to the fourth column of V , the eigenvector associated with the eigenvalue 15.8328.
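This comparison is quick to make in MATLAB (a small check, reusing the x3 values from the experiment above):

x3 = [ 2589; 4101; 4251; 4389 ];
x3 / norm ( x3 )         % compare with V(:,4), the fourth column of V above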
Thus, a few steps of the power method seem to have given us an excellent approximation to the largest
eigenvalue of A and its corresponding eigenvector.

3 Why does the power method work?


To simplify the discussion, we will assume that the matrix A has n distinct real eigenvalues. This guarantees that A is diagonalizable: A = V*L*V^{-1}, where each column V(:,j) is the eigenvector associated with the eigenvalue L(j,j). In fact, the factors V and L are the same information that the MATLAB eig() command just returned to us.
We'd like to look at the individual pairs of eigenvector (column of V) and eigenvalue (diagonal value of L), so we will write λ_j = L(j,j) and v_j = V(:,j).
Because the eigenvalues are distinct, the eigenvectors v_j are linearly independent, and so they form a basis; we can therefore decompose any vector x into its components along the v_j's:
\[
x = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n
\]
and now it follows that
\begin{align*}
A*x &= c_1 A*v_1 + c_2 A*v_2 + \cdots + c_n A*v_n \\
    &= c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2 + \cdots + c_n \lambda_n v_n
\end{align*}
and if we multiply k times:
\[
A^k*x = c_1 \lambda_1^k v_1 + c_2 \lambda_2^k v_2 + \cdots + c_n \lambda_n^k v_n
\]

Now suppose that λn is the eigenvalue of maximum magnitude. (Is it safe to assume there is exactly one
such value? We will come back to this question.) Then

\[
A^k*x = \lambda_n^k \left( c_1 \Bigl(\frac{\lambda_1}{\lambda_n}\Bigr)^k v_1 + c_2 \Bigl(\frac{\lambda_2}{\lambda_n}\Bigr)^k v_2 + \cdots + c_n v_n \right)
\]
As k → ∞, the magnitude of the factors multiplying eigenvectors v_1 through v_{n-1} must drop monotonically towards zero, allowing the component in the v_n direction to become ever more dominant. (Something could go wrong here, too, but we will come back to this question as well.) Thus, with each iteration of the power method, we expect to see a better and better approximation to the dominant eigenvector.
It should be clear that the speed of convergence depends mainly on the ratio between the eigenvalues of
largest and second-largest magnitude.
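Both the expansion and its consequences are easy to check numerically. In MATLAB, the coefficients c_j are the solution of V*c = x; a hypothetical check on the matrix from Section 1:

A = [ 1, 2, 3, 4; 2, 3, 5, 6; 3, 5, 2, 7; 4, 6, 7, 1 ];
[ V, L ] = eig ( A );
c = V \ ones ( 4, 1 );         % coefficients of x0 in the eigenvector basis
V * ( diag ( L ).^4 .* c )     % recovers x4 = A^4 * x0 = [41100; 65070; 67497; 69108]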
During the iteration, the eigenvector estimates can grow very large (or very small!), and so it is usual to normalize the vector before each multiplication.
To produce an estimate of the eigenvalue itself, one approach might be a norm ratio:
\[
\lambda_{max} \approx \frac{\|A*x\|}{\|x\|}
\]
but the preferred method is to use the Rayleigh quotient:
\[
\lambda_{max} \approx \frac{x'*A*x}{x'*x}
\]
It turns out that the error in the norm ratio estimate decreases like |λ_{n-1}/λ_n|^k, while for a symmetric matrix the error in the Rayleigh quotient estimate decreases like |λ_{n-1}/λ_n|^{2k}; in that sense the Rayleigh quotient converges twice as fast.
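To see the difference, here is a hypothetical side-by-side comparison on the symmetric matrix A from Section 1; at each step we print both estimates so their convergence can be compared:

A = [ 1, 2, 3, 4; 2, 3, 5, 6; 3, 5, 2, 7; 4, 6, 7, 1 ];
x = ones ( 4, 1 );
for k = 1 : 8
  x = x / norm ( x );            % now ||x|| = 1 and x'*x = 1
  ax = A * x;
  fprintf ( '%2d  norm ratio %10.6f  Rayleigh %10.6f\n', ...
    k, norm ( ax ), x' * ax );   % ||Ax||/||x|| and x'Ax/x'x
  x = ax;
end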

4 Pseudocode for a power method function


Since the power method is an iteration, we need to impose a maximum number of steps, so that we can deal
with errors and hard cases. On the other hand, we can also use a tolerance to stop early if we find that the
error norm ||Ax − λx|| is small enough.

Algorithm 1 [v,lambda,it] = power_method(A,v,itmax,tol): power method for dominant eigenvalue.

for 1 ≤ it ≤ itmax do
    v ← v / ||v||
    av ← A * v
    λ ← (v' * av) / (v' * v)
    if ||av − λ*v|| < tol then
        return v, λ, it
    end if
    v ← av
end for
return v, λ, it
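A direct MATLAB transcription of this pseudocode might look like the following (saved as power_method.m, the name assumed in the examples below):

function [ v, lambda, it ] = power_method ( A, v, itmax, tol )
  % POWER_METHOD: power method iteration for the dominant eigenvalue of A.
  for it = 1 : itmax
    v = v / norm ( v );                   % normalize to avoid over/underflow
    av = A * v;
    lambda = ( v' * av ) / ( v' * v );    % Rayleigh quotient estimate
    if ( norm ( av - lambda * v ) < tol ) % error norm small enough: stop early
      return
    end
    v = av;
  end
end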

5 Example: the power method applied to the ILL3 matrix


>> A = [ -149  -50 -154; ...
          537  180  546; ...
          -27   -9  -25 ];
>> v = [ 1; 1; 1 ];
>> itmax = 100;
>> tol = 0.000001;

>> [ v, lambda, it ] = power_method ( A, v, itmax, tol );

v = [ -0.1391; 0.9740; -0.1789 ];
lambda = 3.0000
it = 34

Again, use MATLAB as a check:


>> [ V, L ] = eig ( A )

V =
    0.3162   -0.4041   -0.1391
   -0.9487    0.9091    0.9740
   -0.0000    0.1010   -0.1789

L =
    1.0000         0         0
         0    2.0000         0
         0         0    3.0000

6 Exercise: Several tests


Apply the power method to the following matrices. Compare your results to what MATLAB’s eig() function
reports. Try to explain any cases where the power method does not seem to agree with MATLAB.
1. A=upshift_matrix(3), v=[1;1;1];
2. A=frank_matrix(3), v=[1;1;1];
3. A=waldo_matrix(), v=waldo_vector();

7 Issues
Obviously, the most frustrating fact about this simple method is that it doesn’t tell us how to get the other
eigenvalues.
Even if the eigenvalues are guaranteed to be real, we could have two distinct eigenvalues of the same
magnitude.
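For example (a hypothetical two-by-two illustration, using the power_method function from Section 4): the matrix below has eigenvalues +1 and −1, of equal magnitude, and the iterates simply swap their entries back and forth, so the iteration never settles:

A = [ 0, 1; 1, 0 ];                          % eigenvalues +1 and -1
[ v, lambda, it ] = power_method ( A, [ 1; 2 ], 100, 0.000001 );
it                                            % hits itmax = 100: no convergence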
Even if the eigenvalues have unique magnitudes, if the second largest magnitude is close to the largest
magnitude, then the rate of convergence can be very slow.
Even if the eigenvalue of maximum magnitude is "isolated", that is, far larger in magnitude than the other eigenvalues, our convergence proof will fail if the initial guess vector has a zero projection onto the corresponding eigenvector. Luckily, such an occurrence is extremely unlikely, and any roundoff during the computation is likely to give us at least a tiny nonzero projection, which will quickly grow and fix the problem!
If the dominant eigenvalue is real but repeated, the method will converge, returning an eigenvector in the eigenspace associated with λmax. But the method will not be able to tell us that this eigenvalue actually has a multidimensional space of associated eigenvectors.
If the dominant eigenvalue is a complex pair, then there are variations on the power method that are capable
of producing the pair of eigenvalues and eigenvectors.

Nonetheless, the power method is simple to program; it allows us to improve an approximate result by restarting from the old output; and for the common case of SPD matrices, where all the eigenvalues are real and positive, we can expect good behavior.

8 Deflation for a symmetric matrix


The following technique works in cases where the matrix A is symmetric, in which case the eigenvectors can
be assumed to be orthogonal.
Suppose that we have used the power method on the matrix A, and obtained the results λn and vn, with vn normalized so that vn'*vn = 1.
Consider the following "deflated" matrix: B = A − λn*vn*vn'. Then B has the same eigenvectors as A, but vn will be associated with the eigenvalue 0:

\begin{align*}
B*v_j &= (A - \lambda_n v_n v_n') v_j \\
      &= A v_j - \lambda_n v_n (v_n' v_j) \\
      &= \lambda_j v_j - \lambda_n v_n \delta_{n,j} \\
      &= \begin{cases} \lambda_j v_j & \text{if } j \ne n \\ 0 * v_n & \text{if } j = n \end{cases}
\end{align*}

So this means that we could now apply the power method to the matrix B to seek the eigenvalue of next
largest magnitude, and presumably we could repeatedly deflate the matrix as many times as we wish. Of
course, the errors in previously computed eigenvalues and eigenvectors are now built into our new model,
and so with each deflation step we may expect a reduced accuracy.
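A sketch of one deflation step in MATLAB (assuming the power_method function from Section 4, a symmetric matrix A of order n, and values for itmax and tol):

[ vn, lambdan, it ] = power_method ( A, ones ( n, 1 ), itmax, tol );
vn = vn / norm ( vn );             % the formula needs a unit eigenvector
B = A - lambdan * ( vn * vn' );    % deflate: vn now has eigenvalue 0 in B
[ v2, lambda2, it ] = power_method ( B, ones ( n, 1 ), itmax, tol );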

9 The inverse power method


The power method is good at finding λmax, the eigenvalue of A of maximum magnitude. But what if we wanted λmin instead? It's easy to see that the eigenvalue of A^{-1} of maximum magnitude is 1/λmin. That suggests we might try using the power method with A^{-1}. But it's usually true that computing the inverse matrix is the wrong thing to do, being rather expensive and inaccurate. Instead of computing v_{k+1} = A^{-1}*v_k, we can do the equivalent operation by solving A*v_{k+1} = v_k. This defines the inverse power method, also known as inverse iteration.
The inverse power method returns a value σ which is the largest eigenvalue of A^{-1}. But we want the smallest eigenvalue of A, which is λmin = 1/σ. Although we have to modify the eigenvalue that is output from the inverse power method, the eigenvector does not need to be changed.
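One way to organize this as code (a sketch only; the next section asks you to build and test your own version). Since A does not change between iterations, we factor it once with lu(), so each iteration costs only two triangular solves:

function [ v, lambda, it ] = power_method_inverse ( A, v, itmax, tol )
  % POWER_METHOD_INVERSE: inverse iteration for the smallest eigenvalue of A.
  [ L, U, P ] = lu ( A );             % factor once: P*A = L*U
  for it = 1 : itmax
    v = v / norm ( v );
    av = U \ ( L \ ( P * v ) );       % av = inv(A)*v, without forming inv(A)
    sigma = ( v' * av ) / ( v' * v ); % estimate of largest eigenvalue of inv(A)
    if ( norm ( av - sigma * v ) < tol )
      break
    end
    v = av;
  end
  lambda = 1 / sigma;                 % the smallest eigenvalue of A
end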

10 Exercise: Create and test an inverse iteration code


Using your power_method.m file as a starting point, create a new file:

[ v, lambda, it ] = power_method_inverse ( A, v, itmax, tol )

Listing 1: Specification for inverse power method code.

and use it to determine the smallest eigenvalue and corresponding eigenvector of the following matrices.
Check your results using MATLAB’s eig() function:
1. A=dif2_matrix(10), v=ones(10,1);
2. A=frank_matrix(10), v=ones(10,1);

11 Homework #11: The shifted inverse power method
Now we have methods to find the largest and smallest eigenvalues, but what about the ones in between?
Another way to think about the inverse power method is that it chases the eigenvalue that is closest to zero. Suppose we suspect that the matrix A has an eigenvalue λ* and we estimate that it is near 5. Then the matrix A − 5I has an eigenvalue λ* − 5 that is close to zero. If we use the inverse power method on A − 5I, we can approximate σ, the largest eigenvalue of (A − 5I)^{-1}. The eigenvalue of A that is closest to 5 will be λ* = 5 + 1/σ.
Thus, in general, if we seek the eigenvalue of A closest to some value s (a code sketch follows this list):
1. Set A = A − s*I;
2. Use the inverse power method to compute σ;
3. Set λ = s + 1/σ.
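In MATLAB, the recipe takes only a few lines (a hypothetical sketch; note that power_method_inverse, as specified in Section 10, already returns 1/σ, so the shift is simply added back):

s = 5;                                % the target value
n = size ( A, 1 );
[ v, mu, it ] = power_method_inverse ( A - s * eye ( n ), v, itmax, tol );
lambda = s + mu;                      % mu = 1/sigma, so lambda = s + 1/sigma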

Using this idea, create a program hw11.m which works with the 10 × 10 Frank matrix, and uses the shifted
inverse power method to approximate the eigenvalues nearest to 13, and to 4. Check your answers by calling
[V,L]=eig(A).
Send your program hw11.m to me at jvb25@pitt.edu. I would like to see your work by Friday, April 3.
