
5. A square matrix A is unitarily similar to the square matrix B if and only if there is a unitary
matrix Q of the same size as A and B such that

A = QBQ^{-1}.

A square matrix A is unitarily diagonalizable if and only if it is unitarily similar to a diagonal matrix.
Theorem 2.22 A square matrix A is unitarily diagonalizable if and only if there is a unitary
matrix Q of the same size whose columns are eigenvectors of A. •

Definition 2.16 (Normal Matrix)

A square matrix A is normal if and only if it commutes with its conjugate transpose, that is,

AA∗ = A∗ A.

For example, if a and b are real numbers, then the matrix

A = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}

is normal because

AA^* = \begin{pmatrix} a^2+b^2 & 0 \\ 0 & a^2+b^2 \end{pmatrix} = A^*A.

However, its eigenvalues are a ± ib, which are complex when b ≠ 0. Note that all Hermitian, skew-Hermitian, and unitary matrices are normal matrices. •

2.2.5 Basic Properties of Eigenvalue Problems


Now we will discuss some of the important properties of the eigenvalue problems which will help
us in the coming topics.
1. A square matrix A is singular if and only if at least one of its eigenvalues is zero. This is easily
proved, since for λ = 0, (2.4) takes the form

|A − λI| = |A| = 0.

Example 2.19 Consider the following matrix

A = \begin{pmatrix} 3 & 1 & 0 \\ 2 & -1 & -1 \\ 4 & 3 & 1 \end{pmatrix}.

Then the characteristic equation of A takes the form

−λ3 + 3λ2 = 0.

By solving this cubic equation, the eigenvalues of A are 0, 0, and 3. Hence the given matrix
is singular because at least one of its eigenvalues (here, two of them) is zero. •

2. The eigenvalues of a matrix A and its transpose A^T are identical.

It is well known that a matrix and its transpose have the same determinant. Since (A − λI)^T = A^T − λI, the matrices A and A^T therefore have the same characteristic equation and hence the same eigenvalues.

Example 2.20 Consider a matrix A and its transpose matrix A^T as

A = \begin{pmatrix} 1 & 1 & 0 \\ 3 & 0 & 3 \\ 2 & -1 & 3 \end{pmatrix} \quad \text{and} \quad A^T = \begin{pmatrix} 1 & 3 & 2 \\ 1 & 0 & -1 \\ 0 & 3 & 3 \end{pmatrix}.

The characteristic equations of A and A^T are the same, namely

−λ^3 + 4λ^2 − 3λ = 0.

Solving this cubic polynomial equation, we obtain the eigenvalues 0, 1, 3 of both the matrix A and its transpose A^T. •

3. The eigenvalues of an inverse matrix A^{-1}, provided that A^{-1} exists, are the reciprocals of the
eigenvalues of A.
To prove this, let λ be an eigenvalue of A. Using (2.4) gives

|A − λI| = |A − λAA^{-1}| = |A(I − λA^{-1})| = |A|\, λ^n \left| \frac{1}{λ}I − A^{-1} \right| = 0.

Since the matrix A is nonsingular, |A| ≠ 0, and also λ ≠ 0. Hence

\left| \frac{1}{λ}I − A^{-1} \right| = 0,

which shows that \frac{1}{λ} is an eigenvalue of the matrix A^{-1}.
Example 2.21 Consider a matrix A and its inverse matrix A^{-1} as

A = \begin{pmatrix} 3 & 0 & 1 \\ 0 & -3 & 0 \\ 1 & 0 & 3 \end{pmatrix} \quad \text{and} \quad A^{-1} = \begin{pmatrix} \frac{3}{8} & 0 & -\frac{1}{8} \\ 0 & -\frac{1}{3} & 0 \\ -\frac{1}{8} & 0 & \frac{3}{8} \end{pmatrix}.

Then the characteristic equation of A has the form

λ^3 − 3λ^2 − 10λ + 24 = 0,

which gives the eigenvalues 4, −3, and 2 of A. Also, the characteristic equation of A^{-1} is

λ^3 − \frac{5}{12}λ^2 − \frac{1}{8}λ + \frac{1}{24} = 0,

and it gives the eigenvalues

\frac{1}{4}, \quad −\frac{1}{3}, \quad \text{and} \quad \frac{1}{2},

which are the reciprocals of the eigenvalues 4, −3, 2 of the matrix A. •

4. The eigenvalues of A^k (k a positive integer) are the eigenvalues of A raised to the kth power.

To prove this for k = 2, consider the characteristic equation of a matrix A,

|A − λI| = 0.

Multiplying by |A + λI| gives

0 = |A − λI||A + λI| = |(A − λI)(A + λI)| = |A^2 − λ^2 I|,

so λ^2 is an eigenvalue of A^2; repeating the argument gives the result for higher powers.

Example 2.22 Consider the following matrix

A = \begin{pmatrix} 1 & 1 & 0 \\ 3 & 0 & 3 \\ 2 & -1 & 3 \end{pmatrix},

which has eigenvalues 0, 1, and 3. Now


 
AA = A^2 = \begin{pmatrix} 4 & 1 & 3 \\ 9 & 0 & 9 \\ 5 & -1 & 6 \end{pmatrix}

has the characteristic equation of the form

λ^3 − 10λ^2 + 9λ = 0.

Solving this cubic equation, the eigenvalues of A^2 are 0, 1, and 9, which are the squares of the
eigenvalues 0, 1, 3 of A. •

5. The eigenvalues of a diagonal matrix or a triangular (upper or lower) matrix are their diagonal
elements.

Example 2.23 Consider the following matrices

A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}, \qquad B = \begin{pmatrix} 2 & 2 & 3 \\ 0 & 3 & 3 \\ 0 & 0 & 4 \end{pmatrix}.

The characteristic equation of A is

λ^3 − 6λ^2 + 11λ − 6 = (λ − 1)(λ − 2)(λ − 3) = 0,

and it gives the eigenvalues 1, 2, 3, which are the diagonal elements of the given matrix A. Similarly, the characteristic equation of B is

λ^3 − 9λ^2 + 26λ − 24 = (λ − 2)(λ − 3)(λ − 4) = 0,

and it gives the eigenvalues 2, 3, 4, which are the diagonal elements of the given matrix B. Hence the eigenvalues of the diagonal matrix A and the upper-triangular matrix B are their diagonal elements. •

6. Every square matrix satisfies its own characteristic equation; that is, if p(λ) is the characteristic polynomial of A, then p(A) = 0.
This well-known result is called the Cayley-Hamilton Theorem. It can be used to calculate powers and the inverse of a matrix. For example, if A is a 2 × 2 matrix with characteristic polynomial

p(λ) = λ^2 + aλ + b = 0,

then

p(A) = A^2 + aA + bI = 0,

so

A^2 = −aA − bI,

and

A^3 = AA^2 = A(−aA − bI) = (a^2 − b)A + abI.

It is easy to see that by continuing in this fashion we can express any positive power of A as a linear combination of A and I. From A^2 + aA + bI = 0, we also obtain

A(A + aI) = −bI,

and so the inverse of the matrix can be found as A^{-1} = −\frac{1}{b}A − \frac{a}{b}I, provided b ≠ 0.

Example 2.24 Use the Cayley-Hamilton Theorem to compute A^3 and A^{-1} of the following square matrix

A = \begin{pmatrix} 2 & -1 \\ -2 & 4 \end{pmatrix}.

Solution. The characteristic equation of A is

p(λ) = |A − λI| = λ^2 − 6λ + 6 = 0,

which gives a = −6 and b = 6. Thus

A^3 = (36 − 6)A − 36I = 30\begin{pmatrix} 2 & -1 \\ -2 & 4 \end{pmatrix} − 36\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 24 & -30 \\ -60 & 84 \end{pmatrix}.

The inverse of A can be obtained as follows:

A^{-1} = −\frac{1}{6}A + I = −\frac{1}{6}\begin{pmatrix} 2 & -1 \\ -2 & 4 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2/3 & 1/6 \\ 1/3 & 1/3 \end{pmatrix},

which is the required inverse of the given matrix. •

If the characteristic equation of an n × n matrix A is

λ^n + α_{n−1}λ^{n−1} + α_{n−2}λ^{n−2} + · · · + α_1 λ + α_0 = 0,

then the matrix itself satisfies the same equation, namely,

A^n + α_{n−1}A^{n−1} + α_{n−2}A^{n−2} + · · · + α_1 A + α_0 I = 0. (2.36)



Multiplying each term in (2.36) by A^{-1}, when A^{-1} exists (and thus α_0 ≠ 0), gives an important relationship for the inverse of a matrix,

A^{n−1} + α_{n−1}A^{n−2} + α_{n−2}A^{n−3} + · · · + α_1 I + α_0 A^{-1} = 0,

or

A^{-1} = −\frac{1}{α_0}\left[ A^{n−1} + α_{n−1}A^{n−2} + α_{n−2}A^{n−3} + · · · + α_1 I \right].
Example 2.25 Consider the following square matrix

A = \begin{pmatrix} 2 & 1 & 2 \\ 0 & 2 & 3 \\ 0 & 0 & 5 \end{pmatrix},

which has a characteristic equation of the form

p(λ) = λ^3 − 9λ^2 + 24λ − 20 = 0,

and one can write

p(A) = A^3 − 9A^2 + 24A − 20I = 0.

Then the inverse of A can be obtained as follows:

A^2 − 9A + 24I − 20A^{-1} = 0,

which gives

A^{-1} = \frac{1}{20}\left[ A^2 − 9A + 24I \right] = \frac{1}{20}\begin{pmatrix} 10 & -5 & -1 \\ 0 & 10 & -6 \\ 0 & 0 & 4 \end{pmatrix}.

Similarly, one can also find higher powers of the given matrix A. For example, one can compute the matrix A^5 from the expression

A^5 = 9A^4 − 24A^3 + 20A^2 = \begin{pmatrix} 32 & 80 & 3013 \\ 0 & 32 & 3093 \\ 0 & 0 & 3125 \end{pmatrix},

the fifth power of the matrix. •

To find the coefficients of a characteristic equation and inverse of a matrix A by the Cayley-
Hamilton theorem, we use the author-defined function CHIM and the following MATLAB
commands as follows:

>> A = [2 1 2; 0 2 3; 0 0 5]; c = CHIM (A)

Example 2.26 (a) Verify the Cayley-Hamilton theorem for the following square matrix

A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix}.

(b) Find the matrix represented by

A^9 − 6A^8 + 9A^7 − 3A^6 + A^5 − 6A^4 + 9A^3 + 4A^2 + 2A + I.

(c) Find the inverse of the matrix A.

Solution. (a) The characteristic equation of the matrix can be obtained by evaluating the determinant

|A − λI| = \begin{vmatrix} 1−λ & 0 & 1 \\ 0 & 2−λ & 1 \\ 1 & 1 & 3−λ \end{vmatrix} = −λ^3 + 6λ^2 − 9λ + 3 = 0.

To verify the Cayley-Hamilton theorem, we have to show that

p(A) = A^3 − 6A^2 + 9A − 3I = 0.

First we find the matrix A^2 as follows:

A^2 = A × A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix}\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 4 \\ 1 & 5 & 5 \\ 4 & 5 & 11 \end{pmatrix}.

Now we find the matrix A^3 as follows:

A^3 = A^2 × A = \begin{pmatrix} 2 & 1 & 4 \\ 1 & 5 & 5 \\ 4 & 5 & 11 \end{pmatrix}\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix} = \begin{pmatrix} 6 & 6 & 15 \\ 6 & 15 & 21 \\ 15 & 21 & 42 \end{pmatrix}.

Then

A^3 − 6A^2 + 9A − 3I = \begin{pmatrix} 6 & 6 & 15 \\ 6 & 15 & 21 \\ 15 & 21 & 42 \end{pmatrix} − \begin{pmatrix} 12 & 6 & 24 \\ 6 & 30 & 30 \\ 24 & 30 & 66 \end{pmatrix} + \begin{pmatrix} 9 & 0 & 9 \\ 0 & 18 & 9 \\ 9 & 9 & 27 \end{pmatrix} − \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
(b) To find the matrix represented by the given expression, we write

A^9 − 6A^8 + 9A^7 − 3A^6 + A^5 − 6A^4 + 9A^3 + 4A^2 + 2A + I
= A^6(A^3 − 6A^2 + 9A − 3I) + A^2(A^3 − 6A^2 + 9A − 3I) + 7A^2 + 2A + I
= 7A^2 + 2A + I
= \begin{pmatrix} 17 & 7 & 30 \\ 7 & 40 & 37 \\ 30 & 37 & 84 \end{pmatrix}.

(c) The inverse of A can be obtained by multiplying both sides of the equation p(A) = 0 by A^{-1}, which gives

A^{-1}(A^3 − 6A^2 + 9A − 3I) = 0, \qquad A^2 − 6A + 9I − 3A^{-1} = 0,

so that

A^{-1} = \frac{1}{3}\left[ A^2 − 6A + 9I \right] = \frac{1}{3}\begin{pmatrix} 5 & 1 & -2 \\ 1 & 2 & -1 \\ -2 & -1 & 2 \end{pmatrix},

the required inverse of the given matrix. •

7. The eigenvectors of A^{-1} are the same as the eigenvectors of A.

Let x be an eigenvector of A, so that it satisfies

Ax = λx.

Multiplying by A^{-1} gives

A^{-1}Ax = λA^{-1}x, \quad \text{and so} \quad A^{-1}x = \frac{1}{λ}x,

which shows that x is also an eigenvector of A^{-1} (with eigenvalue 1/λ).
8. The eigenvectors of the matrix (kA) are identical to the eigenvectors of A, for any scalar k.
Since the eigenvalues of (kA) are k times the eigenvalues of A, if Ax = λx, then
(kA)x = (kλ)x.

9. A symmetric matrix A is positive definite if and only if all the eigenvalues of A are positive.

Example 2.27 Consider the following matrix

A = \begin{pmatrix} 2 & 0 & 1 \\ 0 & 2 & 0 \\ 1 & 0 & 2 \end{pmatrix},
which has the characteristic equation of the form
λ3 − 6λ2 + 11λ − 6 = 0,
and it gives the eigenvalues 3, 2 and 1 of A. Since all the eigenvalues of the matrix A are
positive, therefore A is positive definite. •

10. For any n × n matrix A, define

B_0 = I, \qquad A_k = AB_{k−1}, \qquad c_k = −\frac{1}{k}\,\mathrm{tr}(A_k), \qquad B_k = A_k + c_k I, \qquad k = 1, 2, \ldots, n,

where tr(A) denotes the trace of the matrix A. Then the characteristic polynomial of A is

p(λ) = \det(λI − A) = λ^n + c_1 λ^{n−1} + · · · + c_{n−1}λ + c_n.

If c_n ≠ 0, then the inverse of A can be obtained as follows:

A^{-1} = −\frac{1}{c_n}B_{n−1}.

This is called the Souriau-Frame Theorem.
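As a brief illustration of this recursion, the following MATLAB sketch computes the coefficients c_1, ..., c_n and, when c_n ≠ 0, the inverse of A. The function name souriau_frame is only an illustrative assumption (it is not the author-defined CHIM function used earlier), but the steps follow the formulas above.

function [c, Ainv] = souriau_frame(A)
% Souriau-Frame (Faddeev-LeVerrier) recursion: a minimal sketch.
n = size(A,1);
B = eye(n);                    % B0 = I
c = zeros(1,n);
for k = 1:n
    Bprev = B;                 % B_{k-1}, needed for the inverse when k = n
    Ak = A*B;                  % A_k = A*B_{k-1}
    c(k) = -trace(Ak)/k;       % c_k = -(1/k) tr(A_k)
    B = Ak + c(k)*eye(n);      % B_k = A_k + c_k*I
end
if c(n) ~= 0
    Ainv = -Bprev/c(n);        % A^{-1} = -(1/c_n) B_{n-1}
else
    Ainv = [];                 % c_n = 0, so A is singular
end
end

For the matrix of Example 2.25 this returns c = [−9, 24, −20], in agreement with the characteristic equation found there.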

>> cond(A)
ans =
524.0568

From the above result, we can easily confirm that the condition numbers of Hilbert matrices
increase rapidly as the size of the matrices increases. Large Hilbert matrices are therefore
considered to be extremely ill-conditioned.

Example 2.37 Find the conditioning of the following matrix

A = \begin{pmatrix} 0 & \sqrt{3} \\ \sqrt{3} & 2 \end{pmatrix}.

Solution. Since

\det(A − λI) = \begin{vmatrix} −λ & \sqrt{3} \\ \sqrt{3} & 2−λ \end{vmatrix} = λ^2 − 2λ − 3 = 0.

Solving the above equation gives the solutions 3 and −1, which are the eigenvalues of the matrix A. Thus the largest eigenvalue of A in magnitude is 3, and the smallest is 1. Hence the condition number of the matrix A is

cond A = \frac{3}{1} = 3.

Since 3 is of the order of magnitude of 1, A is well conditioned. •

4. Let A be a nonsymmetric matrix, with \|\cdot\| = \|\cdot\|_2. Then

cond A = \left( \frac{\text{largest eigenvalue } |λ_i| \text{ of } A^T A}{\text{smallest eigenvalue } |λ_i| \text{ of } A^T A} \right)^{1/2}.

Example 2.38 Find the conditioning of the following matrix

A = \begin{pmatrix} 4 & 5 \\ 3 & 2 \end{pmatrix}.

Solution. Since

\det(A^T A − λI) = \begin{vmatrix} 25−λ & 26 \\ 26 & 29−λ \end{vmatrix} = λ^2 − 54λ + 49 = 0.

The solutions 53.08 and 0.92 of the above equation are the eigenvalues of the matrix A^T A. Thus the conditioning of the given matrix can be obtained as

cond A = \left( \frac{53.08}{0.92} \right)^{1/2} ≈ 7.6,

which shows that the given matrix A is not ill-conditioned. •
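As a small sketch (not from the original text), the 2-norm conditioning of Example 2.38 can be reproduced in MATLAB directly from the eigenvalues of A^T A:

>> A = [4 5; 3 2];
>> mu = eig(A'*A);                 % eigenvalues of A^T*A: about 53.08 and 0.92
>> condA = sqrt(max(mu)/min(mu))   % about 7.6, matching the value above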



2.3 Numerical Computation of Eigenvalues


The importance of the eigenvalues of a square matrix in a broad range of applications has been
amply demonstrated in the previous chapter and its predecessor. However, finding the eigenvalues
and associated eigenvectors is not such an easy task. At this point, the only method we have
for computing the eigenvalues of a matrix is to solve the characteristic equation (discussed in the
previous sections). However, there are several problems with this method that render it imprac-
tical in all but small examples. The first problem is that it depends on the computation of a
determinant, which is a very time-consuming process for large matrices. The second problem is
that the characteristic equation is a polynomial equation, and there are no formulas for solving
polynomial equations of degree higher than 4 (polynomials of degree 2, 3, and 4 can be solved using
the quadratic formula and its analogues). Thus, we are forced to approximate eigenvalues in most
practical problems. We are in need of a completely new idea if we have any hopes of designing effi-
cient numerical techniques. Unfortunately, techniques for approximating the roots of a polynomial
are quite sensitive to roundoff error and are therefore unreliable. Here, we will discuss a few of the
most basic numerical techniques for computing eigenvalues and eigenvectors.
One class of techniques, called iterative methods, can be used to find some or all of the eigenvalues and eigenvectors of a given matrix. They start with an arbitrary approximation to one of the eigenvectors and successively improve it until the required accuracy is obtained. Among them is the method of inverse iteration, which can be used to find the eigenvectors of a matrix from known approximations to its eigenvalues.
The other class of techniques can only be applied to symmetric matrices; it includes the methods of Jacobi, Givens, and Householder, which reduce a given symmetric matrix to a special form whose eigenvalues are readily computed. For general matrices (symmetric or nonsymmetric), the QR method and the LR method are the most widely used techniques for solving eigenvalue problems. Most of these procedures make use of a series of similarity transformations.

2.4 Vector Iterative Methods for Eigenvalues


So far we have discussed classical methods for evaluating the eigenvalues and eigenvectors of different matrices. It is evident that these methods become impractical as the matrices involved become large. Consequently, iterative methods, such as the power methods, are used for this purpose. These methods are an easy means of computing the eigenvalues and eigenvectors of a given matrix.
The power methods include three versions. The first is the regular power method, or simple iteration, based on the powers of a matrix. The second is the inverse power method, which is based on the powers of the inverse of a matrix. The third is the shifted inverse power method, in which the given matrix A is replaced by (A − µI) for a given scalar µ. In the following we discuss all of these methods in some detail.

2.4.1 Power Method


The basic power method can be used to compute the eigenvalue of largest modulus and the corresponding eigenvector of a general matrix. The eigenvalue of largest magnitude is often called the dominant eigenvalue. The idea of the power method is that if we take a vector x_k, then a new vector x_{k+1} can be calculated. The new vector is normalized by factoring out its largest coefficient. This coefficient is then taken as a first approximation to the largest eigenvalue, and the resulting vector represents the first approximation to the corresponding eigenvector. This process is continued by substituting the new eigenvector and determining a second approximation, and so on, until the desired accuracy is achieved.
Consider an n × n matrix A, then the eigenvalues and eigenvectors satisfy

Avi = λi vi , (2.38)

where λi is the ith eigenvalue and vi is the corresponding ith eigenvector of A. The power method
can be used on both symmetric and nonsymmetric matrices. If A is a symmetric matrix, then all the eigenvalues are real. If A is nonsymmetric, then there is a possibility that there is not a single real dominant eigenvalue but a complex conjugate pair; under these conditions the power method does not converge. We assume that the largest eigenvalue is real and not repeated and that the eigenvalues are numbered in decreasing order of magnitude, that is,

|λ1 | > |λ2 | ≥ |λ3 | · · · ≥ |λn−1 | ≥ |λn |. (2.39)

The power method starts with an initial guess for the eigenvector x0 , which can be any nonzero
vector. The power method is defined by the iteration

xk+1 = Axk , for k = 0, 1, 2, . . . , (2.40)

and it gives

x_1 = Ax_0,
x_2 = Ax_1 = A^2 x_0,
x_3 = Ax_2 = A^3 x_0,

and so on. Thus

x_k = A^k x_0, \qquad k = 1, 2, \ldots.
The vector x0 is an unknown linear combination of all the eigenvectors of the system provided they
are linearly independent. Thus

x0 = α1 v1 + α2 v2 + · · · + αn vn .

Let
x1 = Ax0 = A(α1 v1 + α2 v2 + · · · + αn vn )
= α1 Av1 + α2 Av2 + · · · + αn Avn
= α1 λ1 v1 + α2 λ2 v2 + · · · + αn λn vn ,
since, from the definition of an eigenvector, Avi = λi vi . Similarly,

x_2 = Ax_1 = A(α_1 λ_1 v_1 + α_2 λ_2 v_2 + · · · + α_n λ_n v_n) = α_1 λ_1 Av_1 + α_2 λ_2 Av_2 + · · · + α_n λ_n Av_n = α_1 λ_1^2 v_1 + α_2 λ_2^2 v_2 + · · · + α_n λ_n^2 v_n.

Continuing in this way gives

x_k = α_1 λ_1^k v_1 + α_2 λ_2^k v_2 + · · · + α_n λ_n^k v_n, (2.41)



which may be written as

x_k = A^k x_0 = λ_1^k \left[ α_1 v_1 + α_2 \left(\frac{λ_2}{λ_1}\right)^k v_2 + α_3 \left(\frac{λ_3}{λ_1}\right)^k v_3 + · · · + α_n \left(\frac{λ_n}{λ_1}\right)^k v_n \right].

All of the terms except the first in the relation (2.41) converge to the zero vector as k → ∞, since |λ_1| > |λ_i| for i ≠ 1. Hence

x_k ≈ (λ_1^k α_1)v_1, for large k, provided that α_1 ≠ 0.

Since λ_1^k α_1 v_1 is a scalar multiple of v_1, x_k = A^k x_0 will approach an eigenvector for the dominant eigenvalue λ_1, that is,

Ax_k ≈ λ_1 x_k,

so if x_k is scaled so that its dominant component is 1, then

(dominant component of Ax_k) ≈ λ_1 × (dominant component of x_k) = λ_1 × 1 = λ_1.
The rate of convergence of the power method depends primarily on the distribution of the eigenvalues: the smaller the ratios |λ_i/λ_1| (for i = 2, 3, . . . , n), the faster the convergence; in particular, the rate is governed by the ratio |λ_2/λ_1|. The number of iterations required to achieve a desired degree of convergence depends both on this rate of convergence and on the choice of the initial approximation x_0.
Example 2.39 Find the first five iterations obtained by the power method applied to the following matrix, using the initial approximation x_0 = [1, 1, 1]^T:

A = \begin{pmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \\ 1 & 1 & 0 \end{pmatrix}.

Solution. Starting with the initial vector x_0 = [1, 1, 1]^T, we have

Ax_0 = \begin{pmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \\ 1 & 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 4.0000 \\ 4.0000 \\ 2.0000 \end{pmatrix},

which gives

Ax_0 = λ_1 x_1 = 4.0000\begin{pmatrix} 1.0000 \\ 1.0000 \\ 0.5000 \end{pmatrix}.

Similarly, the other possible iterations are as follows:

Ax_1 = \begin{pmatrix} 3.0000 \\ 3.5000 \\ 2.0000 \end{pmatrix} = 3.5000\begin{pmatrix} 0.8571 \\ 1.0000 \\ 0.5714 \end{pmatrix} = λ_2 x_2,

Ax_2 = \begin{pmatrix} 3.0000 \\ 3.4286 \\ 1.8571 \end{pmatrix} = 3.4286\begin{pmatrix} 0.8750 \\ 1.0000 \\ 0.5417 \end{pmatrix} = λ_3 x_3,

Ax_3 = \begin{pmatrix} 2.9583 \\ 3.4167 \\ 1.8750 \end{pmatrix} = 3.4167\begin{pmatrix} 0.8659 \\ 1.0000 \\ 0.5488 \end{pmatrix} = λ_4 x_4,

Ax_4 = \begin{pmatrix} 2.9634 \\ 3.4146 \\ 1.8659 \end{pmatrix} = 3.4146\begin{pmatrix} 0.8679 \\ 1.0000 \\ 0.5464 \end{pmatrix} = λ_5 x_5.

The eigenvalues of the given matrix A are 3.4142, 0.5858, and −1.0000. Hence the approximation of the dominant eigenvalue after five iterations is λ_5 = 3.4146, and the corresponding eigenvector is [0.8679, 1.0000, 0.5464]^T. •

We use the author-defined function PowerM and the following MATLAB commands to get the above results:

>> A = [1 1 2; 1 2 1; 1 1 0];
>> X = [1 1 1]';
>> maxI = 5;
>> sol = PowerM(A, X, maxI);
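The author-defined PowerM is not listed here; the following minimal sketch (with an assumed name powm, not the author's function) shows one way such an iteration could be written, scaling each new vector by its entry of largest magnitude as in (2.40) and the example above.

function [lambda, x] = powm(A, x, maxI)
% Basic power method: a hedged sketch, not the author's PowerM.
for k = 1:maxI
    y = A*x;
    [~, i] = max(abs(y));   % locate the dominant component
    lambda = y(i);          % current estimate of the dominant eigenvalue
    x = y/lambda;           % scale so the dominant component is 1
end
end

Calling [lambda, x] = powm([1 1 2; 1 2 1; 1 1 0], [1 1 1]', 5) reproduces λ_5 ≈ 3.4146 and x_5 ≈ [0.8679, 1.0000, 0.5464]^T.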

The power method has the disadvantage that it is unknown at the outset whether or not a matrix has a single dominant eigenvalue. Nor is it known how an initial vector x_0 should be chosen so as to ensure that its representation in terms of the eigenvectors of the matrix will contain a nonzero contribution from the eigenvector associated with the dominant eigenvalue, should it exist.
Note that the dominant eigenvalue λ of a matrix can also be obtained from two successive iterates, by dividing the corresponding elements of the vectors x_n and x_{n−1}.

Example 2.40 Find the dominant eigenvalue of the following matrix

A = \begin{pmatrix} 4 & 1 & 3 \\ 9 & 0 & 9 \\ 5 & -1 & 6 \end{pmatrix}.

Solution. Let us consider an arbitrary vector x_0 = [1, 0, 0]^T. Then

x_1 = Ax_0 = \begin{pmatrix} 4 \\ 9 \\ 5 \end{pmatrix}, \quad x_2 = Ax_1 = \begin{pmatrix} 40 \\ 81 \\ 41 \end{pmatrix}, \quad x_3 = Ax_2 = \begin{pmatrix} 364 \\ 729 \\ 365 \end{pmatrix}.

Then the dominant eigenvalue can be obtained as

λ_1 ≈ \frac{x_3}{x_2} ≈ \frac{364}{40} ≈ \frac{729}{81} ≈ \frac{365}{41} ≈ 9.

2.4.2 Power Method and Symmetric Matrices


The power method will converge if the given n × n matrix A has n linearly independent eigenvectors, and a symmetric matrix satisfies this property. Now we will discuss the power method for finding the dominant eigenvalue of a symmetric matrix only.

Theorem 2.23 (Power Method with Euclidean Scaling)

Let A be a symmetric n × n matrix with a positive dominant eigenvalue λ. If x_0 is a unit vector in R^n that is not orthogonal to the eigenspace corresponding to λ, then the normalized power sequence

x_0, \quad x_1 = \frac{Ax_0}{\|Ax_0\|}, \quad x_2 = \frac{Ax_1}{\|Ax_1\|}, \quad \ldots, \quad x_k = \frac{Ax_{k−1}}{\|Ax_{k−1}\|}, \quad \ldots \qquad (2.42)

converges to a unit dominant eigenvector, and the sequence

Ax_1 \cdot x_1, \quad Ax_2 \cdot x_2, \quad \ldots, \quad Ax_k \cdot x_k, \quad \ldots \qquad (2.43)

converges to the dominant eigenvalue λ.

The basic steps for the power method with Euclidean scaling are:

1. Choose an arbitrary nonzero vector and normalize it to obtain a unit vector x_0.

2. Compute Ax_0 and normalize it to obtain the first approximation x_1 to a dominant unit eigenvector. Compute Ax_1 · x_1 to obtain the first approximation to the dominant eigenvalue.

3. Compute Ax_1 and normalize it to obtain the second approximation x_2 to a dominant unit eigenvector. Compute Ax_2 · x_2 to obtain the second approximation to the dominant eigenvalue.

4. Continuing in this way, we create a sequence of better and better approximations to the dominant eigenvalue and a corresponding eigenvector.

Example 2.41 Apply the power method with Euclidean scaling to the following matrix

A = \begin{pmatrix} 2 & 2 \\ 2 & 5 \end{pmatrix},

with x_0 = [0, 1]^T to get the first four approximations to the dominant unit eigenvector and the dominant eigenvalue.

Solution. Starting with the unit vector x_0 = [0, 1]^T, we get the first approximation of the dominant unit eigenvector as follows:

Ax_0 = \begin{pmatrix} 2 & 2 \\ 2 & 5 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 5 \end{pmatrix}, \qquad x_1 = \frac{Ax_0}{\|Ax_0\|} = \frac{1}{\sqrt{29}}\begin{pmatrix} 2 \\ 5 \end{pmatrix} ≈ \begin{pmatrix} 0.3714 \\ 0.9285 \end{pmatrix}.

Similarly, for the second, third, and fourth approximations of the dominant unit eigenvector, we find

Ax_1 = \begin{pmatrix} 2.5997 \\ 5.3852 \end{pmatrix}, \qquad x_2 = \frac{Ax_1}{\|Ax_1\|} ≈ \frac{1}{5.9799}\begin{pmatrix} 2.5997 \\ 5.3852 \end{pmatrix} ≈ \begin{pmatrix} 0.4347 \\ 0.9006 \end{pmatrix},

Ax_2 = \begin{pmatrix} 2.6706 \\ 5.3723 \end{pmatrix}, \qquad x_3 = \frac{Ax_2}{\|Ax_2\|} ≈ \frac{1}{5.9994}\begin{pmatrix} 2.6706 \\ 5.3723 \end{pmatrix} ≈ \begin{pmatrix} 0.4451 \\ 0.8955 \end{pmatrix},

Ax_3 = \begin{pmatrix} 2.6812 \\ 5.3676 \end{pmatrix}, \qquad x_4 = \frac{Ax_3}{\|Ax_3\|} ≈ \frac{1}{6}\begin{pmatrix} 2.6812 \\ 5.3676 \end{pmatrix} ≈ \begin{pmatrix} 0.4469 \\ 0.8946 \end{pmatrix}.
Now we find the approximations of the dominant eigenvalue of the given matrix as follows:

λ_1 = \frac{Ax_1 \cdot x_1}{x_1 \cdot x_1} = \frac{(Ax_1)^T x_1}{x_1^T x_1} ≈ 5.9655,

λ_2 = \frac{Ax_2 \cdot x_2}{x_2 \cdot x_2} = \frac{(Ax_2)^T x_2}{x_2^T x_2} ≈ 5.9992,

λ_3 = \frac{Ax_3 \cdot x_3}{x_3 \cdot x_3} = \frac{(Ax_3)^T x_3}{x_3^T x_3} ≈ 6.0001,

λ_4 = \frac{Ax_4 \cdot x_4}{x_4 \cdot x_4} = \frac{(Ax_4)^T x_4}{x_4^T x_4} ≈ 6.0002;

these are the required approximations of the dominant eigenvalue of A. Notice that the exact
dominant eigenvalue of the given matrix is λ = 6 with corresponding dominant unit eigenvector
x = [0.4472, 0.8945]T . •

Now we will consider the power method using a symmetric matrix in such a way that each iterate
is scaled to make its largest entry a 1, rather than being normalized.
Theorem 2.24 (Power Method with Maximum Entry Scaling)

Let A be a symmetric n × n matrix with a positive dominant eigenvalue λ. If x_0 is a nonzero vector in R^n that is not orthogonal to the eigenspace corresponding to λ, then the normalized power sequence

x_0, \quad x_1 = \frac{Ax_0}{\max(Ax_0)}, \quad x_2 = \frac{Ax_1}{\max(Ax_1)}, \quad \ldots, \quad x_k = \frac{Ax_{k−1}}{\max(Ax_{k−1})}, \quad \ldots \qquad (2.44)

converges to an eigenvector corresponding to the eigenvalue λ, and the sequence

\frac{Ax_1 \cdot x_1}{x_1 \cdot x_1}, \quad \frac{Ax_2 \cdot x_2}{x_2 \cdot x_2}, \quad \ldots, \quad \frac{Ax_k \cdot x_k}{x_k \cdot x_k}, \quad \ldots \qquad (2.45)

converges to λ.
In using the power method with maximum entry scaling, we carry out the following steps:

1. Choose an arbitrary nonzero vector x_0.

2. Compute Ax_0 and divide it by the factor max(Ax_0) to obtain the first approximation x_1 to a dominant eigenvector. Compute \frac{Ax_1 \cdot x_1}{x_1 \cdot x_1} to obtain the first approximation to the dominant eigenvalue.

3. Compute Ax_1 and divide it by the factor max(Ax_1) to obtain the second approximation x_2 to a dominant eigenvector. Compute \frac{Ax_2 \cdot x_2}{x_2 \cdot x_2} to obtain the second approximation to the dominant eigenvalue.

4. Continuing in this way, we create a sequence of better and better approximations to the dominant eigenvalue and a corresponding eigenvector. •

Example 2.42 Apply the power method with maximum entry scaling to the following matrix

A = \begin{pmatrix} 2 & 2 \\ 2 & 5 \end{pmatrix},

with x_0 = [0, 1]^T to get the first four approximations to the dominant eigenvector and the dominant eigenvalue.

Solution. Starting with x_0 = [0, 1]^T, we get the first approximation of the dominant eigenvector as follows:

Ax_0 = \begin{pmatrix} 2 & 2 \\ 2 & 5 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 5 \end{pmatrix}, \qquad x_1 = \frac{Ax_0}{\max(Ax_0)} = \frac{1}{5}\begin{pmatrix} 2 \\ 5 \end{pmatrix} = \begin{pmatrix} 0.4000 \\ 1.0000 \end{pmatrix}.

Similarly, for the second, third, and fourth approximations of the dominant eigenvector, we find

Ax_1 = \begin{pmatrix} 2.8000 \\ 5.8000 \end{pmatrix}, \qquad x_2 = \frac{Ax_1}{\max(Ax_1)} = \frac{1}{5.8000}\begin{pmatrix} 2.8000 \\ 5.8000 \end{pmatrix} ≈ \begin{pmatrix} 0.4828 \\ 1.0000 \end{pmatrix},

Ax_2 ≈ \begin{pmatrix} 2.9655 \\ 5.9655 \end{pmatrix}, \qquad x_3 = \frac{Ax_2}{\max(Ax_2)} ≈ \frac{1}{5.9655}\begin{pmatrix} 2.9655 \\ 5.9655 \end{pmatrix} ≈ \begin{pmatrix} 0.4971 \\ 1.0000 \end{pmatrix},

Ax_3 ≈ \begin{pmatrix} 2.9942 \\ 5.9942 \end{pmatrix}, \qquad x_4 = \frac{Ax_3}{\max(Ax_3)} ≈ \frac{1}{5.9942}\begin{pmatrix} 2.9942 \\ 5.9942 \end{pmatrix} ≈ \begin{pmatrix} 0.4995 \\ 1.0000 \end{pmatrix},

which are the required first four approximations of the dominant eigenvector.
Now we find the approximations of the dominant eigenvalue of the given matrix as follows:

λ_1 = \frac{Ax_1 \cdot x_1}{x_1 \cdot x_1} = \frac{(Ax_1)^T x_1}{x_1^T x_1} ≈ \frac{6.9200}{1.1600} ≈ 5.9655,

λ_2 = \frac{Ax_2 \cdot x_2}{x_2 \cdot x_2} = \frac{(Ax_2)^T x_2}{x_2^T x_2} ≈ \frac{7.3972}{1.2331} ≈ 5.9989,

λ_3 = \frac{Ax_3 \cdot x_3}{x_3 \cdot x_3} = \frac{(Ax_3)^T x_3}{x_3^T x_3} ≈ \frac{7.4826}{1.2471} ≈ 6.0000,

λ_4 = \frac{Ax_4 \cdot x_4}{x_4 \cdot x_4} = \frac{(Ax_4)^T x_4}{x_4^T x_4} ≈ \frac{7.4970}{1.2495} ≈ 6.0000;

these are the required approximations of the dominant eigenvalue of A. Notice that the exact dominant eigenvalue of the given matrix is λ = 6 with corresponding dominant eigenvector x = [0.5, 1]^T.

Notice that the main difference between the power method with Euclidean scaling and the power
method with maximum entry scaling is that the Euclidean scaling gives a sequence that approaches
a unit dominant eigenvector, whereas maximum entry scaling gives a sequence that approaches a
dominant eigenvector whose largest component is 1.
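A short MATLAB sketch of the maximum entry scaling iteration of Theorem 2.24, applied to the matrix of Example 2.42, is given below; the variable names and loop length are illustrative assumptions only, and the eigenvalue estimate is the Rayleigh quotient used above.

A = [2 2; 2 5];  x = [0; 1];
for k = 1:4
    y = A*x;
    x = y/max(y);                % scale so the largest entry is 1
    lambda = (x'*A*x)/(x'*x);    % Rayleigh quotient estimate of lambda
end
% after four passes: x is approximately [0.4995; 1], lambda is approximately 6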

2.4.3 Inverse Power Method


The power method can be modified by replacing the given matrix A by its inverse A^{-1}; the resulting scheme is called the inverse power method. Since the eigenvalues of A^{-1} are the reciprocals of those of A, the power method applied to A^{-1} finds the eigenvalue of A of smallest magnitude: the smallest (in magnitude) eigenvalue of A becomes the dominant eigenvalue of A^{-1}. Of course, we must assume that this smallest eigenvalue of A is real and not repeated; otherwise, the method does not work.
In this method the solution procedure is a little more involved than that discussed for finding the largest eigenvalue of the given matrix, but it is still straightforward. Consider

Ax = λx. (2.46)

Multiplying by A^{-1}, we have

A^{-1}Ax = λA^{-1}x,

or

A^{-1}x = \frac{1}{λ}x. (2.47)

The solution procedure is initiated by starting with an initial guess for the vector x_i and improving the solution by computing a new vector x_{i+1}, and so on, until the vector x_{i+1} is approximately equal to x_i.

Example 2.43 Use the inverse power method to find the first seven approximations of the least dominant eigenvalue and corresponding eigenvector of the following matrix, using the initial approximation x_0 = [0, 1, 2]^T:

A = \begin{pmatrix} 3 & 0 & 1 \\ 0 & -3 & 0 \\ 1 & 0 & 3 \end{pmatrix}.

Solution. The inverse of the given matrix A is

A^{-1} = \begin{pmatrix} \frac{3}{8} & 0 & -\frac{1}{8} \\ 0 & -\frac{1}{3} & 0 \\ -\frac{1}{8} & 0 & \frac{3}{8} \end{pmatrix}.

Starting with the given initial vector x_0 = [0, 1, 2]^T, we have

A^{-1}x_0 = \begin{pmatrix} -0.2500 \\ -0.3333 \\ 0.7500 \end{pmatrix} = 0.75\begin{pmatrix} -0.3333 \\ -0.4444 \\ 1.0000 \end{pmatrix} = λ_1 x_1.

Similarly, the other possible iterations are as follows:

A^{-1}x_1 = \begin{pmatrix} -0.2500 \\ 0.1481 \\ 0.4167 \end{pmatrix} = 0.4167\begin{pmatrix} -0.6000 \\ 0.3558 \\ 1.0000 \end{pmatrix} = λ_2 x_2,

A^{-1}x_2 = \begin{pmatrix} -0.3500 \\ -0.1185 \\ 0.4500 \end{pmatrix} = 0.4500\begin{pmatrix} -0.7778 \\ -0.2634 \\ 1.0000 \end{pmatrix} = λ_3 x_3,

A^{-1}x_3 = \begin{pmatrix} -0.4167 \\ 0.0878 \\ 0.4722 \end{pmatrix} = 0.4722\begin{pmatrix} -0.8824 \\ 0.1859 \\ 1.0000 \end{pmatrix} = λ_4 x_4,

A^{-1}x_4 = \begin{pmatrix} -0.4559 \\ -0.0620 \\ 0.4853 \end{pmatrix} = 0.4853\begin{pmatrix} -0.9394 \\ -0.1277 \\ 1.0000 \end{pmatrix} = λ_5 x_5,

A^{-1}x_5 = \begin{pmatrix} -0.4773 \\ 0.0426 \\ 0.4924 \end{pmatrix} = 0.4924\begin{pmatrix} -0.9692 \\ 0.0864 \\ 1.0000 \end{pmatrix} = λ_6 x_6,

A^{-1}x_6 = \begin{pmatrix} -0.4885 \\ -0.0288 \\ 0.4962 \end{pmatrix} = 0.4962\begin{pmatrix} -0.9845 \\ -0.0581 \\ 1.0000 \end{pmatrix} = λ_7 x_7.

The eigenvalues of the given matrix A are −3.0000, 2.0000, and 4.0000. The dominant eigenvalue of A^{-1} after seven iterations is λ_7 = 0.4962, which is converging to \frac{1}{2}; hence the smallest (in magnitude) eigenvalue of the given matrix A is the reciprocal of the dominant eigenvalue \frac{1}{2} of the matrix A^{-1}, that is, 2, and the corresponding eigenvector is [−0.9845, −0.0581, 1.0000]^T.

We use the author-defined function IPowerM and the following MATLAB commands to get the above results:

>> A = [3 0 1; 0 -3 0; 1 0 3];
>> X = [0 1 2]';
>> maxI = 7;
>> sol = IPowerM(A, X, maxI);
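The following sketch (with the assumed name invpowm, not the author-defined IPowerM) implements the same idea; it solves Ay = x with the backslash operator instead of forming A^{-1} explicitly, which gives the same iterates.

function [lambdaA, x] = invpowm(A, x, maxI)
% Inverse power method: a minimal sketch, not the author's IPowerM.
for k = 1:maxI
    y = A\x;                % y = A^{-1} x, computed without forming inv(A)
    [~, i] = max(abs(y));
    mu = y(i);              % estimate of the dominant eigenvalue of A^{-1}
    x = y/mu;
end
lambdaA = 1/mu;             % smallest (in magnitude) eigenvalue of A
end

For the matrix and starting vector of Example 2.43, seven iterations give mu ≈ 0.4962, so lambdaA ≈ 2.02, converging to the smallest eigenvalue 2.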

2.4.4 Shifted Inverse Power Method


Another modification of the power method consists of replacing the given matrix A by (A − µI), for any scalar µ; that is,

(A − µI)x = (λ − µ)x. (2.48)

It follows that the eigenvalues of (A − µI) are the same as those of A except that they have all been shifted by an amount µ. The eigenvectors remain unaffected by the shift.
The shifted inverse power method applies the power method to the system

(A − µI)^{-1}x = \frac{1}{λ − µ}x. (2.49)

Thus iteration with (A − µI)^{-1} leads to the largest value of \frac{1}{λ − µ}, that is, to the smallest value of (λ − µ). The smallest value of (λ − µ) implies that the value of λ determined will be the one closest to µ. Thus, by a suitable choice of µ, we have a procedure for finding subdominant eigensolutions. So (A − µI)^{-1} has the same eigenvectors as A but with eigenvalues \frac{1}{λ − µ}.
In practice the inverse of (A − µI) is never actually computed, especially if the given matrix A is a large sparse matrix. It is computationally more efficient if (A − µI) is decomposed into the product of a lower-triangular matrix L and an upper-triangular matrix U. If u_s is an initial vector for the solution of (2.49), then

(A − µI)^{-1}u_s = v_s, (2.50)

and

u_{s+1} = v_s / \max(v_s). (2.51)

By rearranging (2.50), we obtain

u_s = (A − µI)v_s = LUv_s.

Let

Uv_s = z, (2.52)

then

Lz = u_s. (2.53)

Given u_s, we can find z from (2.53) by forward substitution, and knowing z we can find v_s from (2.52) by backward substitution. The new estimate u_{s+1} can then be found from (2.51). The iteration is terminated when u_{s+1} is sufficiently close to u_s, at which point convergence is complete.
If λ_µ is the eigenvalue of A nearest µ, then

λ_µ = \frac{1}{\text{dominant eigenvalue of } (A − µI)^{-1}} + µ. (2.54)
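A sketch of this LU-based shifted inverse iteration is given below. The factorization of (A − µI) is done once, each step needs only a forward and a backward substitution as in (2.52)-(2.53), and the row permutation returned by MATLAB's lu is included; the function name and fixed iteration count are illustrative assumptions, not the author's SIPowerM.

function [lambda, u] = shiftinvpow(A, mu, u, maxI)
% Shifted inverse power method via LU factorization: a minimal sketch.
n = size(A,1);
[L, U, P] = lu(A - mu*eye(n));   % factor once: P*(A - mu*I) = L*U
for s = 1:maxI
    z = L\(P*u);                 % forward substitution,  L z = P u
    v = U\z;                     % backward substitution, U v = z
    [~, i] = max(abs(v));
    theta = v(i);                % dominant eigenvalue of (A - mu*I)^{-1}
    u = v/theta;                 % scaling step (2.51)
end
lambda = 1/theta + mu;           % eigenvalue of A nearest mu, by (2.54)
end

For Example 2.44 below, shiftinvpow([4 2; 3 5], 6, [1 1]', 5) gives lambda ≈ 7.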
The shifted inverse power method uses the power method as a basis but gives faster convergence. Convergence is to the eigenvalue λ which is closest to µ, and if this eigenvalue is extremely close to µ, the rate of convergence will be very rapid. Inverse iteration therefore provides a means of determining an eigenvector of a matrix for which the corresponding eigenvalue has already been determined to moderate accuracy by an alternative method, such as the QR method or the Sturm sequence iteration, which we will discuss later in the chapter.
When inverse iteration is used to determine eigenvectors corresponding to known eigenvalues, the
matrix to be inverted, even if symmetric, will not normally be positive definite, and if nonsymmetric,
will not normally be diagonally dominant. The computation of an eigenvector corresponding to a
complex conjugate eigenvalue by inverse iteration is more difficult than for a real eigenvalue.

Example 2.44 Use the shifted inverse power method to find the first five approximations of the eigenvalue nearest µ = 6 of the following matrix, using the initial approximation x_0 = [1, 1]^T:

A = \begin{pmatrix} 4 & 2 \\ 3 & 5 \end{pmatrix}.

Solution. Consider

B = (A − 6I) = \begin{pmatrix} -2 & 2 \\ 3 & -1 \end{pmatrix}.

The inverse of B is

B^{-1} = (A − 6I)^{-1} = \begin{pmatrix} \frac{1}{4} & \frac{1}{2} \\ \frac{3}{4} & \frac{1}{2} \end{pmatrix}.

Now applying the power method, we obtain the following iterations:

B^{-1}x_0 = \begin{pmatrix} \frac{1}{4} & \frac{1}{2} \\ \frac{3}{4} & \frac{1}{2} \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0.7500 \\ 1.2500 \end{pmatrix} = 1.2500\begin{pmatrix} 0.6000 \\ 1.0000 \end{pmatrix} = λ_1 x_1.

Similarly, the other approximations can be computed as

B^{-1}x_1 = \begin{pmatrix} 0.6500 \\ 0.9500 \end{pmatrix} = 0.9500\begin{pmatrix} 0.6842 \\ 1.0000 \end{pmatrix} = λ_2 x_2,

B^{-1}x_2 = \begin{pmatrix} 0.6711 \\ 1.0132 \end{pmatrix} = 1.0132\begin{pmatrix} 0.6623 \\ 1.0000 \end{pmatrix} = λ_3 x_3,

B^{-1}x_3 = \begin{pmatrix} 0.6656 \\ 0.9968 \end{pmatrix} = 0.9968\begin{pmatrix} 0.6678 \\ 1.0000 \end{pmatrix} = λ_4 x_4,

B^{-1}x_4 = \begin{pmatrix} 0.6669 \\ 1.0008 \end{pmatrix} = 1.0008\begin{pmatrix} 0.6664 \\ 1.0000 \end{pmatrix} = λ_5 x_5.

Thus the fifth approximation of the dominant eigenvalue of B^{-1} = (A − 6I)^{-1} is λ_5 = 1.0008, and it is converging to 1, with the eigenvector converging to [0.6667, 1.0000]^T. Hence the eigenvalue λ_µ of A nearest to µ = 6 is

λ_µ = \frac{1}{1} + 6 = 7.
We use the author-defined function SIPowerM and the following MATLAB commands to get the above results:

>> A = [4 2; 3 5];
>> mu = 6;
>> X = [1 1]';
>> maxI = 5;
>> sol = SIPowerM(A, X, mu, maxI);

2.5 Location of the Eigenvalues


Here we discuss two well-known theorems which are among the more important of the many theorems dealing with the location of the eigenvalues of both symmetric and nonsymmetric matrices, that is, the location of the zeros of the characteristic polynomial. The eigenvalues of a nonsymmetric matrix could, of course, be complex, in which case the theorems give us a means of locating these numbers in the complex plane. The theorems can also be used to estimate the largest and smallest eigenvalues in magnitude, and thus to estimate the spectral radius ρ(A) of A and the condition number of A. Such estimates can be used to generate initial approximations for iterative methods for determining eigenvalues.

2.5.1 Gerschgorin Circles Theorem


Let A be an n × n matrix and let R_i denote the circle in the complex plane C with center a_{ii} and radius \sum_{j=1, j≠i}^{n} |a_{ij}|, that is,

R_i = \left\{ z ∈ C : |z − a_{ii}| ≤ \sum_{\substack{j=1 \\ j≠i}}^{n} |a_{ij}| \right\}, \qquad i = 1, 2, \ldots, n, (2.55)

where the variable z is complex valued.

The eigenvalues of A are contained within R = \cup_{i=1}^{n} R_i, and the union of any k of these circles that do not intersect the remaining (n − k) must contain precisely k (counting multiplicity) of the eigenvalues. •
Theorem 2.25 (Gerschgorin Circles Theorem)

Every eigenvalue of A lies within at least one of the Gerschgorin circles. •


Note the following:
• If a circle is disjoint from all other circles, then it contains exactly one eigenvalue of A.
• If a circle overlaps other circles, then it need not contain any eigenvalues, although the union
of the overlapping circles contains the appropriate number.
• If A is real, A^T has the same eigenvalues as A, and the theorem can also be applied to A^T (or, equivalently, the circle radii can be defined by summing the magnitudes of the off-diagonal elements of the columns of A rather than the rows).
• If A is irreducible, a stronger version of the theorem states that an eigenvalue cannot lie on the boundary of a circle unless it lies on the boundary of every circle.
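In MATLAB, the circle centers and radii of (2.55) can be computed directly from the rows of a square matrix A; this small sketch uses illustrative variable names only.

>> centers = diag(A);                        % a_ii
>> radii   = sum(abs(A), 2) - abs(centers);  % row sums of |a_ij|, j ~= i

Every eigenvalue then lies in at least one disc |z − centers(i)| ≤ radii(i).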

Example 2.45 Consider the following matrix

A = \begin{pmatrix} 10 & 1 & 1 & 2 \\ 1 & 5 & 1 & 0 \\ 1 & 1 & -5 & 0 \\ 2 & 0 & 0 & -10 \end{pmatrix},
which is symmetric and has only real eigenvalues. The Gerschgorin circles associated with A are
given by
R_1 = \{ z ∈ C : |z − 10| ≤ 4 \},
R_2 = \{ z ∈ C : |z − 5| ≤ 2 \},
R_3 = \{ z ∈ C : |z + 5| ≤ 2 \},
R_4 = \{ z ∈ C : |z + 10| ≤ 2 \}.

These circles are illustrated in Figure 2.2, and Gerschgorin's theorem indicates that the eigenvalues of A lie inside the circles. The circles about −10 and −5 must each contain an eigenvalue. The other two eigenvalues must lie in the interval [3, 14]. Using the shifted inverse power method with a tolerance of 0.000005 and initial approximations of 10, 5, −5, and −10 leads to the approximations

λ_1 = 10.4698, \quad λ_2 = 4.8803, \quad λ_3 = −5.1497, \quad λ_4 = −10.2004,

respectively. The number of iterations required ranged from 9 to 13. •

Example 2.46 Consider the following matrix

A = \begin{pmatrix} 1 & 2 & -1 \\ 2 & 7 & 0 \\ -1 & 0 & -5 \end{pmatrix},

Figure 2.2: Circles for Example 2.45.

Figure 2.3: Circles for Example 2.46.

which is symmetric and so has only real eigenvalues. The Gerschgorin circles are

C_1 : |z − 1| ≤ 3,
C_2 : |z − 7| ≤ 2,
C_3 : |z + 5| ≤ 1.

These circles are illustrated in Figure 2.3, and Gerschgorin's theorem indicates that the eigenvalues of A lie inside the circles. Thus, by the Gerschgorin theorem, the eigenvalues of A must lie in one of the three intervals [−2, 4], [5, 9], and [−6, −4]. The eigenvalues of A are −5.1712, 0.5589, and 7.6123. Hence λ_1 = −5.1712 lies in circle C_3, λ_2 = 0.5589 lies in circle C_1, and λ_3 = 7.6123 lies in circle C_2. •

2.5.2 The Rayleigh Quotient


The shifted inverse power method requires the input of an initial approximation µ for the eigenvalue λ of a matrix A. It can be obtained from the Rayleigh quotient as

µ = \frac{x^T Ax}{x^T x}. (2.56)

The maximum eigenvalue λ_1 can be obtained when x is the corresponding eigenvector, as

λ_1 = \max_{x ≠ 0} \frac{x^T Ax}{x^T x}. (2.57)

In the case where λ_1 is the dominant eigenvalue of a matrix A and x is the corresponding eigenvector, the Rayleigh quotient is

\frac{x^T Ax}{x^T x} = \frac{x^T λ_1 x}{x^T x} = \frac{λ_1 (x^T x)}{x^T x} = λ_1.

Thus, if x_k converges to a dominant eigenvector x, then it seems reasonable that

\frac{x_k^T A x_k}{x_k^T x_k}

converges to

\frac{x^T Ax}{x^T x} = λ_1,

which is the dominant eigenvalue.

Theorem 2.26 (Rayleigh Quotient Theorem)

If the eigenvalues of a real symmetric matrix A are

λ_1 ≥ λ_2 ≥ λ_3 ≥ · · · ≥ λ_n, (2.58)

and if x is any nonzero vector, then

λ_n ≤ \frac{x^T Ax}{x^T x} ≤ λ_1. (2.59)

Example 2.47 Consider the symmetric matrix

A = \begin{pmatrix} 2 & -1 & 3 \\ -1 & 1 & -2 \\ 3 & -2 & 1 \end{pmatrix},

and the vector x = [1, 1, 1]^T. Then

x^T x = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = 3,

and

x^T Ax = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 2 & -1 & 3 \\ -1 & 1 & -2 \\ 3 & -2 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 4 \\ -2 \\ 2 \end{pmatrix} = 4.

Thus

λ_3 ≤ \frac{4}{3} ≤ λ_1.
If µ is close to an eigenvalue λ1 , then convergence will be quite rapid. •
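The Rayleigh quotient of Example 2.47 can be checked with a one-line computation (a sketch, not part of the original text):

>> A = [2 -1 3; -1 1 -2; 3 -2 1];  x = [1; 1; 1];
>> mu = (x'*A*x)/(x'*x)     % = 4/3, so lambda_3 <= 4/3 <= lambda_1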

2.5.3 Intermediate Eigenvalues


Once the largest eigenvalue is determined, there is a method to obtain approximations to the other eigenvalues of a matrix. This method is called matrix deflation, and it is applicable to both symmetric and nonsymmetric coefficient matrices. The deflation method involves forming a new matrix B whose eigenvalues are the same as those of A, except that the dominant eigenvalue of A is removed.
It is evident that this process can be continued until all of the eigenvalues have been extracted. Although this method shows promise, it does have a significant drawback: at each step performed in deflating the original matrix, any errors in the computed eigenvalues and eigenvectors are passed on to the next eigenvectors. This could result in serious inaccuracy, especially when dealing with large eigenvalue problems, which is precisely why this method is generally used only for small eigenvalue problems.
The following preliminary results are essential in using this technique.

Theorem 2.27 If a matrix A has eigenvalues λi corresponding to eigenvectors xi , then Q−1 AQ


has the same eigenvalues as A but with eigenvectors Q−1 xi for any nonsingular matrix Q. •

Theorem 2.28 Let

B = \begin{pmatrix} λ_1 & a_{12} & a_{13} & \cdots & a_{1n} \\ 0 & c_{22} & c_{23} & \cdots & c_{2n} \\ 0 & c_{32} & c_{33} & \cdots & c_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & c_{n2} & c_{n3} & \cdots & c_{nn} \end{pmatrix}, (2.60)

and let C be the (n − 1) × (n − 1) matrix obtained by deleting the first row and first column of the matrix B. The matrix B has the eigenvalue λ_1 together with the (n − 1) eigenvalues of C. Moreover, if (β_2, β_3, \ldots, β_n)^T is an eigenvector of C with eigenvalue µ ≠ λ_1, then the corresponding eigenvector of B is (β_1, β_2, \ldots, β_n)^T with

β_1 = \frac{\sum_{j=2}^{n} a_{1j} β_j}{µ − λ_1}. (2.61)

Note that eigenvectors xi of A can be recovered by pre-multiplication by Q. •

Example 2.48 Consider the following matrix

A = \begin{pmatrix} 10 & -6 & -4 \\ -6 & 11 & 2 \\ -4 & 2 & 6 \end{pmatrix},

which has the dominant eigenvalue λ_1 = 18 with the corresponding eigenvector x_1 = [1, −1, −\frac{1}{2}]^T. Use the deflation method to find the other eigenvalues and eigenvectors of A.

Solution. The transformation matrix is given as

Q = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -\frac{1}{2} & 0 & 1 \end{pmatrix}.

Then

B = Q^{-1}AQ = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ \frac{1}{2} & 0 & 1 \end{pmatrix}\begin{pmatrix} 10 & -6 & -4 \\ -6 & 11 & 2 \\ -4 & 2 & 6 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -\frac{1}{2} & 0 & 1 \end{pmatrix}.

After simplifying, we get

B = \begin{pmatrix} 18 & -6 & -4 \\ 0 & 5 & -2 \\ 0 & -1 & 4 \end{pmatrix},

so the deflated matrix is

C = \begin{pmatrix} 5 & -2 \\ -1 & 4 \end{pmatrix}.
Now we can easily find the eigenvalues of C, which are 6 and 3, with corresponding eigenvectors [1, −\frac{1}{2}]^T and [1, 1]^T, respectively. Thus the other two eigenvalues of A are 6 and 3. Next we calculate the eigenvectors of A corresponding to these two eigenvalues. First we calculate the eigenvector of B corresponding to λ = 6 from the system

\begin{pmatrix} 18 & -6 & -4 \\ 0 & 5 & -2 \\ 0 & -1 & 4 \end{pmatrix}\begin{pmatrix} β_1 \\ 1 \\ -\frac{1}{2} \end{pmatrix} = 6\begin{pmatrix} β_1 \\ 1 \\ -\frac{1}{2} \end{pmatrix}.

Solving this system, we have

18β_1 − 4 = 6β_1,

which gives β_1 = \frac{1}{3}. Similarly, we can find the value of β_1 corresponding to λ = 3 from the system

\begin{pmatrix} 18 & -6 & -4 \\ 0 & 5 & -2 \\ 0 & -1 & 4 \end{pmatrix}\begin{pmatrix} β_1 \\ 1 \\ 1 \end{pmatrix} = 3\begin{pmatrix} β_1 \\ 1 \\ 1 \end{pmatrix},

which gives β_1 = \frac{2}{3}. Thus the eigenvectors of B are v_1 = [\frac{1}{3}, 1, -\frac{1}{2}]^T and v_2 = [\frac{2}{3}, 1, 1]^T.
Now we find the eigenvectors of the original matrix A, which can be obtained by pre-multiplying the eigenvectors of B by the nonsingular matrix Q. First, the second eigenvector of A is

x_2 = Qv_1 = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -\frac{1}{2} & 0 & 1 \end{pmatrix}\begin{pmatrix} \frac{1}{3} \\ 1 \\ -\frac{1}{2} \end{pmatrix} = \begin{pmatrix} \frac{1}{3} \\ \frac{2}{3} \\ -\frac{2}{3} \end{pmatrix},

or, equivalently, x_2 = [\frac{1}{2}, 1, -1]^T. Similarly, the third eigenvector of the given matrix A can be computed as

x_3 = Qv_2 = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -\frac{1}{2} & 0 & 1 \end{pmatrix}\begin{pmatrix} \frac{2}{3} \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{2}{3} \\ \frac{1}{3} \\ \frac{2}{3} \end{pmatrix},

or, equivalently, x_3 = [1, \frac{1}{2}, 1]^T. •

Note that in this example the deflated matrix C is nonsymmetric even though the original matrix A is symmetric; we deduce that the property of symmetry is not preserved in the deflation process. Also, note that the method of deflation fails whenever the first element of the given vector x_1 is zero, since x_1 cannot then be scaled so that this element is one.

We use the author-defined function DEFLATED and the following MATLAB commands to get the above results:

>> A = [10 -6 -4; -6 11 2; -4 2 6];
>> Lamda = 18;
>> XA = [1 -1 -0.5]';
>> [Lamda, X] = DEFLATED(A, Lamda, XA);
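The deflation step of Theorems 2.27-2.28 can be sketched in a few MATLAB lines; the names below are illustrative assumptions, and this is not the author-defined DEFLATED function.

A  = [10 -6 -4; -6 11 2; -4 2 6];
x1 = [1; -1; -0.5];          % dominant eigenvector, scaled so its first entry is 1
n  = size(A,1);
Q  = eye(n);  Q(:,1) = x1;   % transformation matrix with x1 as its first column
B  = Q\A*Q;                  % B = Q^{-1}*A*Q; its first column is (18, 0, 0)^T
C  = B(2:n, 2:n);            % deflated matrix; eig(C) returns 6 and 3

The eigenvectors of B found from (2.61) can then be mapped back to eigenvectors of A by pre-multiplying with Q, as in the example above.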

2.6 Eigenvalues of Symmetric Matrices


In the previous sections we discussed the power methods for finding individual eigenvalues: the regular power method can be used to find the eigenvalue of largest magnitude (the dominant eigenvalue), the inverse power method can find the smallest eigenvalue, and the shifted inverse power method can find subdominant eigenvalues. In this section we develop some methods to find all eigenvalues of a given matrix. The basic approach is to find a sequence of similarity transformations that transform the original matrix into a simpler form. Clearly, the best form for the transformed matrix would be a diagonal one, but this is not always possible, so in some cases the transformed matrix will be a tridiagonal one. Furthermore, these techniques are generally limited to symmetric matrices with real coefficients.
Before we discuss these methods, we define some special matrices which are very useful in discussing
these methods.

Definition 2.17 (Orthogonally Similar Matrix)

A matrix A is said to be orthogonally similar to a matrix B if there is an orthogonal matrix Q for which

A = QBQ^T. (2.62)

If A is symmetric and B = Q^T AQ, then

B^T = (Q^T AQ)^T = Q^T AQ = B.

Thus similarity transformations on symmetric matrices that use orthogonal matrices produce ma-
trices which are again symmetric. •

Definition 2.18 (Rotation Matrix)

A rotation matrix Q is an orthogonal matrix that differs from the identity matrix in at most four elements. These four elements, at the vertices of a rectangle, are cos θ, − sin θ, sin θ, and cos θ in the positions (p, p), (p, q), (q, p), (q, q), respectively. For example, the matrix

B = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & \cos θ & 0 & -\sin θ & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & \sin θ & 0 & \cos θ & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} (2.63)

is a rotation matrix with p = 2 and q = 4. Note that a rotation matrix is also an orthogonal matrix, that is, B^T B = I. •

2.6.1 Jacobi Method


This method can be used to find all the eigenvalues and eigenvectors of a symmetric matrix by performing a series of similarity transformations. The Jacobi method permits the transformation of a symmetric matrix into a diagonal one having the same eigenvalues as the original matrix. This is done by eliminating each off-diagonal element in a systematic way. The method requires an infinite number of iterations to produce the exact diagonal form, because the reduction of a given element to zero will most likely introduce a nonzero element into a previously zero coefficient. Hence the method can be viewed as an iterative procedure that approaches a diagonal form in a finite number of steps; the implication is that the off-diagonal coefficients will be close to zero rather than exactly equal to zero.
Consider the eigenvalue problem
Av = λv, (2.64)

where A is a symmetric matrix of order n × n; the solution of (2.64) gives the eigenvalues λ_1, . . . , λ_n and the corresponding eigenvectors v_1, . . . , v_n of A. Since the eigenvectors of a symmetric matrix can be chosen orthonormal, the matrix v whose columns are these eigenvectors satisfies

v^T = v^{-1}. (2.65)

So by using (2.65), we can write (2.64) as

v^T Av = λ, (2.66)

where λ denotes the diagonal matrix of eigenvalues.

The basic procedure for the Jacobi method is as follows. Assume that

A_1 = Q_1^T A Q_1,
A_2 = Q_2^T A_1 Q_2 = Q_2^T Q_1^T A Q_1 Q_2,
A_3 = Q_3^T A_2 Q_3 = Q_3^T Q_2^T Q_1^T A Q_1 Q_2 Q_3,
\vdots
A_k = Q_k^T \cdots Q_1^T A Q_1 \cdots Q_k.

We see that as k → ∞,

A_k → λ \quad \text{and} \quad Q_1 Q_2 \cdots Q_k → v. (2.67)

Each matrix Q_i (i = 1, 2, . . . , k) is a rotation matrix constructed in such a way that an off-diagonal coefficient of A_k is reduced to zero. In other words, in a rotation matrix

Q_k = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \cos θ & \cdots & -\sin θ & \\ & & \vdots & \ddots & \vdots & \\ & & \sin θ & \cdots & \cos θ & \\ & & & & & \ddots \\ & & & & & & 1 \end{pmatrix},

the value of θ is selected in such a way that the a_{pq} coefficient of A_k is reduced to zero, that is,

\tan 2θ = \frac{2a_{pq}}{a_{pp} − a_{qq}}. (2.68)

Theoretically there are an infinite number of θ values corresponding to the infinite sequence of matrices A_k. However, as θ approaches zero, the rotation matrices Q_k become the identity matrix and no further transformations are required.
There are three strategies for annihilating off-diagonals. The first is called the serial method, which selects the elements in row order, that is, in the positions (1, 2), . . . , (1, n); (2, 3), . . . , (2, n); . . .; (n − 1, n) in turn, and this cycle is then repeated. The second is called the natural method, which searches through all of the off-diagonals and annihilates the element of largest modulus at each stage. Although this method converges faster than the serial method, it is not recommended for large values of n, since the search procedure itself can be extremely time consuming. The third is known as the threshold serial method, in which the off-diagonals are cycled in row order as in the serial method, omitting transformations on any element whose magnitude is below some threshold value; this value is usually decreased after each cycle. The advantage of this approach is that zeros are only created in positions where it is worthwhile to do so, without the need for a lengthy search. Here, we shall use only the natural method for annihilating the off-diagonal elements.
Theorem 2.29 Consider a matrix A and a rotation matrix Q of the form

A = \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix} \quad \text{and} \quad Q = \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix}.

Then there exists θ such that:

1. Q^T Q = I

2. Q^T AQ = D

where I is the identity matrix and D is a diagonal matrix whose diagonal elements, λ_1 and λ_2, are the eigenvalues of A.

Proof. To convert the given matrix A into a diagonal matrix D, we have to reduce the off-diagonal element a_{12} of A to zero, that is, p = 1 and q = 2. Taking p_{11} = cos θ = p_{22}, p_{12} = −sin θ, and p_{21} = sin θ, the matrix Q has the form

Q = \begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix}.
The corresponding matrix A_1 can be constructed as

A_1 = Q_1^T A Q_1,

or

\begin{pmatrix} a^*_{11} & a^*_{12} \\ a^*_{12} & a^*_{22} \end{pmatrix} = \begin{pmatrix} \cos θ & \sin θ \\ -\sin θ & \cos θ \end{pmatrix}\begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix}\begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix}.

Since our task is to reduce a^*_{12} to zero, carrying out the multiplication on the right-hand side and using matrix equality gives

a^*_{12} = 0 = −(\sin θ \cos θ)a_{11} + (\cos^2 θ)a_{12} − (\sin^2 θ)a_{12} + (\cos θ \sin θ)a_{22}.

Simplifying and rearranging gives

\frac{\sin θ \cos θ}{\cos^2 θ − \sin^2 θ} = \frac{a_{12}}{a_{11} − a_{22}}, \qquad \frac{\sin 2θ}{2\cos 2θ} = \frac{a_{12}}{a_{11} − a_{22}}, \qquad \frac{\sin 2θ}{\cos 2θ} = \frac{2a_{12}}{a_{11} − a_{22}},

or, more simply,

\tan 2θ = \frac{2a_{12}}{a_{11} − a_{22}}, \qquad a_{11} ≠ a_{22}.

Note that if a_{11} = a_{22}, we take θ = \frac{π}{4}. We find that for a 2 × 2 matrix, only one iteration is required to convert the given matrix A to a diagonal matrix D.

Example 2.49 For the matrix

A = \begin{pmatrix} 5 & -2 \\ -2 & 8 \end{pmatrix},

find an orthogonal matrix Q and the diagonal matrix D such that Q^T AQ = D by using the Jacobi method.

Solution. The off-diagonal entry of the given matrix A is a_{12} = −2, so we have to reduce the element a_{12} to zero. Since p = 1 and q = 2, the orthogonal transformation matrix Q_1 has the form

Q_1 = \begin{pmatrix} c & -s \\ s & c \end{pmatrix}.

The values of c = cos θ and s = sin θ can be obtained as follows:

θ = \frac{1}{2}\arctan\left(\frac{2a_{12}}{a_{11} − a_{22}}\right) = \frac{1}{2}\arctan\left(\frac{2(−2)}{5 − 8}\right) ≈ 26.5651°,

\cos θ ≈ 0.8944 \quad \text{and} \quad \sin θ ≈ 0.4472.

Then

Q_1 = \begin{pmatrix} 0.8944 & -0.4472 \\ 0.4472 & 0.8944 \end{pmatrix} \quad \text{and} \quad Q_1^T = \begin{pmatrix} 0.8944 & 0.4472 \\ -0.4472 & 0.8944 \end{pmatrix},

and so the new similar matrix A_1 has the form

A_1 = Q_1^T AQ_1 = \begin{pmatrix} 3.9997 & 0.0 \\ 0.0 & 8.9995 \end{pmatrix}.

Thus

Q = Q_1 = \begin{pmatrix} 0.8944 & -0.4472 \\ 0.4472 & 0.8944 \end{pmatrix} \quad \text{and} \quad D = A_1 = \begin{pmatrix} 3.9997 & 0.0 \\ 0.0 & 8.9995 \end{pmatrix}

are the required orthogonal matrix and diagonal matrix, respectively. •

Similarly, for higher order matrix, a diagonal matrix D can be obtained by number of such multi-
plications, that is
QTk QTk−1 · · · QT1 AQ1 · · · Qk−1 Qk = D.
The diagonal elements of D are all the eigenvalues λ of A and the corresponding eigenvectors v of
A can be obtained as
Q1 Q2 · · · Qk = v.

Example 2.50 Use the Jacobi method to find the eigenvalues and the eigenvectors of the following matrix

A = \begin{pmatrix} 3.0 & 0.01 & 0.02 \\ 0.01 & 2.0 & 0.1 \\ 0.02 & 0.1 & 1.0 \end{pmatrix}.

Solution. The largest off-diagonal entry of the given matrix A is a_{23} = 0.1, so we begin by reducing the element a_{23} to zero. Since p = 2 and q = 3, the first orthogonal transformation matrix has the form

Q_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c & -s \\ 0 & s & c \end{pmatrix}.

The values of c = cos θ and s = sin θ can be obtained as follows:

θ = \frac{1}{2}\arctan\left(\frac{2a_{23}}{a_{22} − a_{33}}\right) = \frac{1}{2}\arctan\left(\frac{2(0.1)}{2 − 1}\right) ≈ 5.655°,

\cos θ ≈ 0.9951 \quad \text{and} \quad \sin θ ≈ 0.0985.

Then

Q_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.9951 & -0.0985 \\ 0 & 0.0985 & 0.9951 \end{pmatrix} \quad \text{and} \quad Q_1^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.9951 & 0.0985 \\ 0 & -0.0985 & 0.9951 \end{pmatrix},

and

A_1 = Q_1^T AQ_1 = \begin{pmatrix} 3.0 & 0.0119 & 0.0189 \\ 0.0119 & 2.0099 & 0 \\ 0.0189 & 0 & 0.9901 \end{pmatrix}.

Note that the rotation makes a_{32} and a_{23} zero, slightly increasing a_{21} and a_{12} and decreasing the next largest off-diagonal entries a_{13} and a_{31}.
Now the largest off-diagonal element of the matrix A_1 is a_{13} = 0.0189, so to make this position zero, we consider the second orthogonal matrix of the form

Q_2 = \begin{pmatrix} c & 0 & -s \\ 0 & 1 & 0 \\ s & 0 & c \end{pmatrix},

and the values of c and s can be obtained as follows:

θ = \frac{1}{2}\arctan\left(\frac{2a_{13}}{a_{11} − a_{33}}\right) = \frac{1}{2}\arctan\left(\frac{2(0.0189)}{3 − 0.9901}\right) ≈ 0.539°,

\cos θ ≈ 0.9999 \quad \text{and} \quad \sin θ ≈ 0.0094.

Then

Q_2 = \begin{pmatrix} 0.9999 & 0 & -0.0094 \\ 0 & 1 & 0 \\ 0.0094 & 0 & 0.9999 \end{pmatrix} \quad \text{and} \quad Q_2^T = \begin{pmatrix} 0.9999 & 0 & 0.0094 \\ 0 & 1 & 0 \\ -0.0094 & 0 & 0.9999 \end{pmatrix}.

Hence

A_2 = Q_2^T A_1 Q_2 = Q_2^T Q_1^T AQ_1 Q_2 = \begin{pmatrix} 3.0002 & 0.0119 & 0 \\ 0.0119 & 2.0099 & -0.0001 \\ 0 & -0.0001 & 0.9899 \end{pmatrix}.

Similarly, to make the off-diagonal element a_{12} = 0.0119 of the matrix A_2 zero, we consider the third orthogonal matrix of the form

Q_3 = \begin{pmatrix} c & -s & 0 \\ s & c & 0 \\ 0 & 0 & 1 \end{pmatrix},

and

θ = \frac{1}{2}\arctan\left(\frac{2a_{12}}{a_{11} − a_{22}}\right) = \frac{1}{2}\arctan\left(\frac{2(0.0119)}{3.0002 − 2.0099}\right) ≈ 0.688°,

\cos θ ≈ 0.9999 \quad \text{and} \quad \sin θ ≈ 0.0120.

Then

Q_3 = \begin{pmatrix} 0.9999 & -0.0120 & 0 \\ 0.0120 & 0.9999 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad Q_3^T = \begin{pmatrix} 0.9999 & 0.0120 & 0 \\ -0.0120 & 0.9999 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

Hence

A_3 = Q_3^T Q_2^T Q_1^T AQ_1 Q_2 Q_3 = \begin{pmatrix} 3.0003 & 0 & -1.35 \times 10^{-6} \\ 0 & 2.00 & -1.122 \times 10^{-4} \\ -1.35 \times 10^{-6} & -1.122 \times 10^{-4} & 0.9899 \end{pmatrix},

which is converging to the diagonal matrix D; its diagonal elements are converging to 3, 2, and 1, which are the eigenvalues of the original matrix A. The corresponding eigenvectors can be computed as follows:

v = Q_1 Q_2 Q_3 = \begin{pmatrix} 0.9998 & -0.0121 & -0.0094 \\ 0.0111 & 0.9951 & -0.0985 \\ 0.0106 & 0.0984 & 0.9951 \end{pmatrix}.

To reproduce the above results using the Jacobi method, we use the author-defined function JOBM and the following MATLAB commands:

>> A = [3 0.01 0.02; 0.01 2 0.1; 0.02 0.1 1];
>> sol = JOBM(A);
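A compact sketch of the Jacobi iteration with the natural strategy (annihilating the largest off-diagonal entry at each pass) is given below; the function name, tolerance, and iteration cap are illustrative assumptions, and this is not the author-defined JOBM.

function [D, V] = jacobi_eig(A, tol, maxI)
% Jacobi method for a symmetric matrix: a minimal sketch.
n = size(A,1);  V = eye(n);
for it = 1:maxI
    Off = abs(A - diag(diag(A)));        % magnitudes of off-diagonal entries
    [m, idx] = max(Off(:));
    if m < tol, break; end               % off-diagonals small enough: stop
    [p, q] = ind2sub([n n], idx);        % position of the largest entry
    if A(p,p) == A(q,q)
        theta = pi/4;                    % special case of (2.68)
    else
        theta = 0.5*atan(2*A(p,q)/(A(p,p) - A(q,q)));
    end
    Q = eye(n);
    Q(p,p) = cos(theta);  Q(q,q) = cos(theta);
    Q(p,q) = -sin(theta); Q(q,p) = sin(theta);
    A = Q'*A*Q;                          % similarity transformation
    V = V*Q;                             % accumulate the eigenvector matrix
end
D = A;       % approximately diagonal; diag(D) approximates the eigenvalues
end

Calling [D, V] = jacobi_eig([3 0.01 0.02; 0.01 2 0.1; 0.02 0.1 1], 1e-8, 50) drives the off-diagonal entries toward zero, with diag(D) approaching 3, 2, and 1 as in Example 2.50.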

2.6.2 Sturm Sequence Iteration


When a symmetric matrix is tridiagonal, its eigenvalues can be computed to any specified precision using a simple method called the Sturm sequence iteration. In the coming sections we will discuss two methods which convert a given symmetric matrix into symmetric tridiagonal form by using similarity transformations. The Sturm sequence iteration below can therefore be used in the calculation of the eigenvalues of any symmetric tridiagonal matrix. Consider a symmetric tridiagonal matrix of order 4 × 4,

A = \begin{pmatrix} a_1 & b_2 & 0 & 0 \\ b_2 & a_2 & b_3 & 0 \\ 0 & b_3 & a_3 & b_4 \\ 0 & 0 & b_4 & a_4 \end{pmatrix},
and assume that b_i ≠ 0 for each i = 2, 3, 4. Then one can define the characteristic polynomial of the given matrix A as

f_4(λ) = \det(A − λI), (2.69)

which is equivalent to

f_4(λ) = \begin{vmatrix} a_1−λ & b_2 & 0 & 0 \\ b_2 & a_2−λ & b_3 & 0 \\ 0 & b_3 & a_3−λ & b_4 \\ 0 & 0 & b_4 & a_4−λ \end{vmatrix}.

Expanding by minors along the last row gives

f_4(λ) = (a_4 − λ)\begin{vmatrix} a_1−λ & b_2 & 0 \\ b_2 & a_2−λ & b_3 \\ 0 & b_3 & a_3−λ \end{vmatrix} − b_4\begin{vmatrix} a_1−λ & b_2 & 0 \\ b_2 & a_2−λ & 0 \\ 0 & b_3 & b_4 \end{vmatrix},

or

f_4(λ) = (a_4 − λ)f_3(λ) − b_4^2 f_2(λ). (2.70)
The recurrence relation (2.70) is true for a matrix of any order say r × r, that is

fr (λ) = (ar − λ)fr−1 (λ) − b2r fr−2 (λ), (2.71)

provided that we define f0 (λ) = 1 and evaluate f1 (λ) = a1 − λ.


The sequence {f0 , f1 , . . . , fr , . . .} is known as the Sturm sequence. So starting with f0 (λ) = 1, we
can eventually find a characteristic polynomial of A by using

fn (λ) = 0. (2.72)

Example 2.51 Use the Sturm sequence iteration to find the eigenvalues of the following symmetric tridiagonal matrix

A = \begin{pmatrix} 1 & 2 & 0 & 0 \\ 2 & 4 & 1 & 0 \\ 0 & 1 & 5 & -1 \\ 0 & 0 & -1 & 3 \end{pmatrix}.
Solution. We compute the Sturm sequences as follows:

f_0(λ) = 1,
f_1(λ) = (a_1 − λ) = 1 − λ.

The second sequence is

f_2(λ) = (a_2 − λ)f_1(λ) − b_2^2 f_0(λ) = (4 − λ)(1 − λ) − 4(1) = λ^2 − 5λ,

and the third sequence is

f3 (λ) = (a3 − λ)f2 (λ) − b23 f1 (λ)


= (5 − λ)(λ2 − 5λ) − (1)2 (1 − λ)
= −λ3 + 10λ2 − 24λ − 1.

Finally, the fourth sequence is

f4(λ) = (a4 − λ) f3(λ) − b4^2 f2(λ)
      = (3 − λ)(−λ^3 + 10λ^2 − 24λ − 1) − (−1)^2 (λ^2 − 5λ)
      = λ^4 − 13λ^3 + 53λ^2 − 66λ − 3.

Thus
f4(λ) = λ^4 − 13λ^3 + 53λ^2 − 66λ − 3 = 0.

Solving the above equation, we have the eigenvalues 6.11, 4.41, 2.54, and −0.04, of the given sym-
metric tridiagonal matrix. •

To get the above results in the MATLAB Command Window, we use the author-defined function SturmS as follows:

>> A = [1 2 0 0; 2 4 1 0; 0 1 5 -1; 0 0 -1 3];
>> sol = SturmS(A);

Theorem 2.30 For any real number λ∗ , the number of agreements in sign of successive terms of the
Sturm sequence {f0 (λ∗ ), f1 (λ∗ ), . . . , fn (λ∗ )} is equal to the number of eigenvalues of the tridiagonal
matrix A greater than λ∗ . The sign of a zero is taken to be opposite to that of the previous term.•

Example 2.52 Find the number of eigenvalues of the following matrix


 
3 −1 0
A =  −1 2 −1  ,
 
0 −1 3

lying in the interval (0, 4).

Solution. Since the given matrix is of size 3 × 3, we have to compute the Sturm sequences f3 (0) and f3 (4). Firstly, for λ∗ = 0, we have

f0 (0) = 1 and f1 (0) = (a1 − 0) = (3 − 0) = 3.

Also
f2 (0) = (a2 − 0)f1 (0) − b22 f0 (0) = (2)(3) − (−1)2 (1) = 5.
Finally, we have

f3 (0) = (a3 − 0)f2 (0) − b23 f1 (0) = (3)(5) − (−1)2 (3) = 12,

which have signs + + ++, with three agreements. So all three eigenvalues are greater than λ∗ = 0.

Similarly, we can calculate for λ∗ = 4. The Sturm sequences are as follows:

f0 (4) = 1 and f1 (4) = (a1 − 4) = (3 − 4) = −1.

Also
f2 (4) = (a2 − 4)f1 (4) − b22 f0 (4) = (2 − 4)(−1) − (−1)2 (1) = 1.
In the last, we have

f3 (4) = (a3 − 4)f2 (4) − b23 f1 (4) = (3 − 4)(1) − (−1)2 (−1) = 0.

which has signs + − + −, with no agreements. So no eigenvalues are greater than λ∗ = 4. Hence there are exactly three eigenvalues in [0, 4]. Furthermore, since f3(0) ≠ 0 and f3(4) = 0, we deduce that no eigenvalue is exactly equal to 0 but one eigenvalue is exactly equal to 4, because f3(λ∗) = det(A − λ∗ I), the characteristic polynomial of A. Therefore, there are three eigenvalues in the half-open interval (0, 4] and two eigenvalues in the open interval (0, 4). Since the given matrix A is positive definite, by a well-known result all of its eigenvalues must be strictly positive. Note that the eigenvalues of the given matrix A are 1, 3, and 4. •

Note that if the sign pattern is + + + − − for a 4 × 4 matrix at λ = c, then there are three eigenvalues greater than λ = c. If the sign pattern is + − + − −, then there is one eigenvalue greater than λ = c. If the sign pattern is + − 0 + +, then there are two eigenvalues greater than λ = c (the zero takes the sign opposite to the preceding term).
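Under the same assumptions, Theorem 2.30 can be turned into a small counting routine; the sketch below reuses the hypothetical sturm_seq function given earlier and is not the author's code.

function count = eigs_greater_than(a, b, c)
% Number of eigenvalues of the tridiagonal matrix greater than c,
% counted as sign agreements in the Sturm sequence (Theorem 2.30).
f = sturm_seq(a, b, c);
s = sign(f);
for k = 2:numel(s)
    if s(k) == 0, s(k) = -s(k-1); end   % a zero takes the opposite sign
end
count = sum(s(1:end-1) == s(2:end));    % agreements of successive terms
end

For the matrix of Example 2.52, eigs_greater_than([3 2 3], [0 -1 -1], 0) returns 3 and eigs_greater_than([3 2 3], [0 -1 -1], 4) returns 0, so their difference counts the eigenvalues in (0, 4].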

2.6.3 Given’s Method


This method is also based on similarity transformations of the same type as used for the Jacobi method. The zeros created are retained, and the symmetric matrix is reduced to a symmetric tridiagonal matrix C rather than to diagonal form, using a finite number of orthogonal similarity transformations. The eigenvalues of the original matrix A are the same as those of the symmetric tridiagonal matrix C. The Given's method is generally preferable to the Jacobi method in that it requires only a finite number of iterations.
For the Given's method the angle θ is chosen to create zeros, not in the (p, q) and (q, p) positions as in the Jacobi method, but in the (p − 1, q) and (q, p − 1) positions. This is because zeros can then be created in row order without destroying those previously obtained.
In the first stage of the Given's method we annihilate elements along the first row (and, by symmetry, down the first column) in the positions (1, 3), . . . , (1, n) using the rotation matrices Q23 , . . . , Q2n in turn. Once a zero has been created in position (1, j), subsequent transformations use matrices Qpq with p, q ≠ 1, j, and so the zeros are not destroyed. In the second stage we annihilate elements in the positions (2, 4), . . . , (2, n) using Q34 , . . . , Q3n . Again, any zeros produced by these transformations are not destroyed as subsequent zeros are created along the second row. Furthermore, zeros previously obtained in the first row are also preserved. The process continues until a zero is

created in the position (n − 2, n) using Qn−1n . The original matrix can therefore, be converted into
a symmetric tridiagonal matrix C in exactly
(n − 2) + (n − 3) + · · · + 1 ≡ (1/2)(n − 1)(n − 2),
steps. This method also uses rotation matrices, as the Jacobi method does, but now with cos θ in the (p, p) and (q, q) positions, sin θ in the (p, q) position, and − sin θ in the (q, p) position, and

θ = − arctan( ap−1,q / ap−1,p ).
We can also find the values of cos θ and sin θ by using

cos θ = |ap−1,p| / R and sin θ = |ap−1,q| / R,

where

R = sqrt( (ap−1,p)^2 + (ap−1,q)^2 ).
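As a hedged sketch of one such rotation (givens_step is an illustrative name, not the author's GIVENS routine), the following applies Qpq^T A Qpq so that the (p − 1, q) and (q, p − 1) entries become zero, using the angle formula above.

function A = givens_step(A, p, q)
% One Given's rotation: annihilate the (p-1,q) and (q,p-1) entries of
% the symmetric matrix A by a similarity transformation.
n = size(A,1);
theta = -atan(A(p-1,q) / A(p-1,p));
c = cos(theta);  s = sin(theta);
Q = eye(n);
Q(p,p) = c;  Q(q,q) = c;  Q(p,q) = s;  Q(q,p) = -s;
A = Q' * A * Q;
end

Applying givens_step(A, 2, 3), then givens_step(A, 2, 4), then givens_step(A, 3, 4) to the matrix of Example 2.53 below reproduces (up to rounding) the tridiagonal matrix C obtained there.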
Example 2.53 Use the Given’s method to reduce the following matrix
 
2 −1 1 4
 −1 3 1 2 
A= ,
 
 1 1 5 −3 
4 2 −3 6

to a symmetric tridiagonal form and then find the eigenvalues of A.

Solution.
Step I. Create a zero in the (1, 3) position by using the first orthogonal transformation matrix as
 
1 0 0 0
 0 c s 0 
Q23 =  .
 
 0 −s c 0 
0 0 0 1

To find the values of cos θ and sin θ, we have

θ = − arctan( a13 / a12 ) = − arctan( 1 / (−1) ) ≈ 50.0000,

cos θ = 0.7071 and sin θ = 0.7071.

Then
   
1.0 0 0 0 1 0 0 0
 0 0.7071 0.7071 0   0 0.7071 −0.7071 0 
Q23 =  , QT23 =  ,
   
 0 −0.7071 0.7071 0   0 0.7071 0.7071 0 
0 0 0 1.0 0 0 0 1

which gives  
2.0 −1.4142 0 4.0
 −1.4142 3.0 −1.0 3.535 
A1 = QT23 AQ23 =  .
 
 0 −1.0 5.0 −0.7071 
4.0 3.535 −0.7071 6.0
Note that because the matrix is symmetric, the lower part of A1 is the same as the upper part.
Step II. Create a zero in the (1, 4) position by using the second orthogonal transformation matrix
as  
1 0 0 0
 0 c 0 s 
Q24 =  ,
 
 0 0 1 0 
0 −s 0 c
and

θ = − arctan( a14 / a12 ) = − arctan( 4 / (−1.4142) ) ≈ 78.3658,

cos θ = 0.3333 and sin θ = 0.9428.

Then    
1 0 0 0 1 0 0 0
 0 0.3333 0 0.9428   0 0.3333 0 −0.9428 
Q24 =  , QT24 =  ,
   
 0 0 1 0   0 0 1 0 
0 −0.9428 0 0.3333 0 0.9428 0 0.3333
which gives  
2.0 −4.2426 0 0
 −4.2426 3.4444 0.3333 −3.6927 
A2 = QT24 A1 Q24 =  .
 
 0 0.3333 5.0 −1.1785 
0 −3.6927 −1.1785 5.5556
Step III. Create a zero in the (2, 4) position by using the third orthogonal transformation matrix
as  
1 0 0 0
 0 1 0 0 
Q34 =  ,
 
 0 0 c s 
0 0 −s c
and

θ = − arctan( a24 / a23 ) = − arctan( (−3.6927) / 0.3333 ) ≈ 94.2695,

cos θ = 0.0899 and sin θ = 0.9960.

Then    
1 0 0 0 1 0 0 0
 0 1 0 0   0 1 0 0 
Q34 =  , QT34 =  ,
   
 0 0 0.0899 0.9960   0 0 0.0899 −0.9960 
0 0 −0.9960 0.0899 0 0 0.9960 0.0899

which gives  
2.0 −4.2426 0 0
 −4.2426 3.4444 3.7077 0 
A3 = QT34 A2 Q34 =   = C.
 
 0 3.7077 5.7621 1.1097 
0 0 1.1097 4.7934
By using the Sturm sequence iteration, the eigenvalues of the symmetric tridiagonal matrix C are,
9.621, 5.204, 3.560 and −2.385, which are also the eigenvalues of A. •

We use the author-defined function GIVENS and the following MATLAB commands to get the above results as follows:

>> A = [2 -1 1 4; -1 3 1 2; 1 1 5 -3; 4 2 -3 6];
>> sol = GIVENS(A);

2.6.4 Householder’s Method


This method is a variation of the Given's method and enables us to reduce a symmetric matrix A to a symmetric tridiagonal matrix C having the same eigenvalues. It reduces a given matrix to symmetric tridiagonal form with about half as much computation as the Given's method requires. The idea is to reduce a whole row and column (except for the tridiagonal elements) to zero at each step. Note that the symmetric tridiagonal matrices produced by the Given's method and the Householder's method may be different, but the eigenvalues will be the same.

Definition 2.19 (Householder Matrix)

A Householder matrix Hw is a matrix of the form


Hw = I − 2ww^T = I − (2 / (w^T w)) ww^T,
where I is an n × n identity matrix and w is some n × 1 vector satisfying
w^T w = w1^2 + w2^2 + · · · + wn^2 = 1,

that is, the vector w has unit length. •

It is easy to verify that a Householder matrix Hw is symmetric, that is


$$H_w = \begin{pmatrix} 1-2w_1^2 & -2w_1w_2 & \cdots & -2w_1w_n \\ -2w_1w_2 & 1-2w_2^2 & \cdots & -2w_2w_n \\ \vdots & \vdots & \ddots & \vdots \\ -2w_1w_n & -2w_2w_n & \cdots & 1-2w_n^2 \end{pmatrix} = H_w^T,$$
and is orthogonal, that is
Hw2 = (I − 2wwT )(I − 2wwT )
= I − 4wwT + 4wwT wwT
= I − 4wwT + 4wwT = I, (since wT w = 1).

Thus
Hw = Hw^{−1} = Hw^T,
which shows that Hw is orthogonal. Note that the determinant of a Householder matrix Hw is always equal to −1.

Example 2.54 Consider a vector w = [1, 2]T , then


Hw = I − (2/5) ww^T,
so
$$H_w = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \frac{2}{5}\begin{pmatrix} 1 \\ 2 \end{pmatrix}\begin{pmatrix} 1 & 2 \end{pmatrix} = \begin{pmatrix} 3/5 & -4/5 \\ -4/5 & -3/5 \end{pmatrix},$$
which shows that the given Householder matrix Hw is symmetric and orthogonal and the determi-
nant of Hw is -1. •

A Householder matrix Hw corresponding to a given w may be generated using MATLAB command


window as follows:

>> w = [1 2]';
>> w = w/norm(w);
>> Hw = eye(2) - 2*w*w';

The basic steps of the Householder’s method which required to convert the symmetric matrix into
symmetric tridiagonal matrix are as follows:
A1 = A

A2 = QT1 A1 Q1

A3 = QT2 A2 Q2
.. ..
. .
Ak+1 = QTk Ak Qk
where the Qk matrices are Householder transformation matrices and can be constructed as

Qk = I − sk wk wk^T ,   with   sk = 2 / (wk^T wk).
The coefficients of the vector wk are defined in terms of the matrix A as

wik = 0, for i = 1, 2, . . . , k,
wik = aik , for i = k + 2, k + 3, . . . , n,

and

wk+1,k = ak+1,k ± sqrt( a(k+1,k)^2 + a(k+2,k)^2 + · · · + a(n,k)^2 ).

The positive or negative sign of wk+1,k is taken according to the sign of the coefficient ak+1,k of the given matrix A.
The Householder’s method transforms a given n × n symmetric matrix to a symmetric tridiagonal
matrix in exactly (n − 2) steps. Each step of the method creates zeros in a complete row and column.
The first step annihilates elements in the position (1, 3), (1, 4), . . . , (1, n) simultaneously. Similarly,
step r annihilates elements in the positions (r, r + 2), (r, r + 3), . . . , (r, n) simultaneously. Once
a symmetric tridiagonal form has been achieved, then the eigenvalues of a given matrix can be
calculated by using the Sturm sequence iteration. After calculating the eigenvalues, the shifted
inverse power method can be used to find the eigenvectors of a symmetric tridiagonal matrix and
then the eigenvectors of original matrix A can be found by pre-multiplying these eigenvectors (of
symmetric tridiagonal matrix) by the product of successive transformation matrices.

Example 2.55 Reduce the following matrix


 
30 6 5
A =  6 30 9  ,
 
5 9 30

to symmetric tridiagonal form using the Householder’s method.

Solution. Since the given matrix is of size 3 × 3, only one iteration is required in order to reduce the given symmetric matrix into the symmetric tridiagonal form. Thus, for k = 1, we construct the
elements of the vector w1 as follows:

w11 = 0,
w31 = a31 = 5,
w21 = a21 ± sqrt( a21^2 + a31^2 ) = 6 ± sqrt( 6^2 + 5^2 ) = 6 ± 7.81.

Since the coefficient a21 is positive, the positive sign must be used for w21 , that is

w21 = 13.81.

Therefore, the vector w1 is now determined to be

w1 = [0, 13.81, 5]T ,

and
s1 = 2 / ( (0)^2 + (13.81)^2 + (5)^2 ) = 0.0093.
Thus the first transformation matrix Q1 for the first iteration is
   
1 0 0 0  
Q1 =  0 1 0  − 0.009  13.81  0 13.81 5 ,
   
0 0 1 5

and it gives  
1 0 0
Q1 =  0 −0.7682 −0.6402  .
 
0 −0.6402 0.7682
Therefore  
30.0 −7.810 0
A2 = QT1 A1 Q1 =  −7.810 38.85 −1.622  ,
 
0 −1.622 21.15
which is the symmetric tridiagonal form. •

We use the author-defined function HouseHM and the following MATLAB commands to get the
above results as follows:

>> A = [30 6 5; 6 30 9; 5 9 30];


>> sol = HouseHM (A);

Example 2.56 Reduce the following matrix


 
7 1 2 1
 1 8 1 −1 
A= ,
 
 2 1 3 1 
1 −1 1 2

to symmetric tridiagonal form using the Householder's method, and then find approximations to the eigenvalues of A using the Sturm sequence iteration.

Solution. As the size of A is 4 × 4, we need two iterations to convert the given symmetric matrix into the symmetric tridiagonal form. For the first iteration, take k = 1, and construct the elements
of the vector w1 as follows:

w11 = 0,
w31 = a31 = 2,
w41 = a41 = 1,
w21 = a21 ± sqrt( a21^2 + a31^2 + a41^2 ) = 1 ± sqrt( 1^2 + 2^2 + 1^2 ) = 1 ± √6.

Since the coefficient a21 > 0, the positive sign must be used for w21 , which gives

w21 = 1 + 2.4495 = 3.4495.

Thus the vector w1 takes the form

w1 = [0, 3.4495, 2, 1]T ,

and
s1 = 2 / (w1^T w1) = 2 / ( (0)^2 + (3.4495)^2 + (2)^2 + (1)^2 ) = 0.1183.

Thus the first transformation matrix Q1 for the first iteration is
$$Q_1 = I - s_1 w_1 w_1^T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} - 0.1183 \begin{pmatrix} 0 \\ 3.4495 \\ 2 \\ 1 \end{pmatrix} \begin{pmatrix} 0 & 3.4495 & 2 & 1 \end{pmatrix},$$

and it gives  
1.0000 0 0 0
 0 −0.4082 −0.8165 −0.4082 
Q1 =  .
 
 0 −0.8165 0.5266 −0.2367 
0 −0.4082 −0.2367 0.8816
Therefore  
7.0000 −2.4495 0.0000 0
 −2.4495 4.6667 1.5700 1.1933 
A2 = QT1 A1 Q1 =  .
 
 0.0000 1.5700 4.7816 2.9972 
0 1.1933 2.9972 3.5518
Now for k = 2, we construct the elements of the vector w2 as follows:

w12 = 0,
w22 = 0,
w42 = a42 = 1.1933,
w32 = a32 ± sqrt( (1.5700)^2 + (1.1933)^2 ) = 1.5700 ± sqrt(3.8889) = 1.5700 ± 1.9721.

Since the coefficient a32 > 0, the positive sign must be used for w32 , which gives

w32 = 1.5700 + 1.9721 = 3.5421.

Thus the vector w2 takes the form

w2 = [0, 0, 3.5421, 1.1933]T ,

and
s2 = 2 / (w2^T w2) = 2 / 13.9704 = 0.1432.
Thus the second transformation matrix Q2 for the second iteration is
   
1 0 0 0 0
 0 1 0 0   0  
Q2 = I − s2 w2 w2T =   − 0.1432   0 0 3.5421 1.1933 ,
   
 0 0 1 0   3.5421 
0 0 0 1 1.1933

and it gives  
1.0000 0 0 0
 0 −0.4082 0.8971 0.1690 
Q2 =  .
 
 0 −0.8165 −0.2760 −0.5071 
0 −0.4082 −0.3450 0.8452

Therefore  
7.0000 −2.4495 0.0000 0.0000
 −2.4495 4.6667 −1.9720 0.0000 
A3 = QT2 A2 Q2 =   = T,
 
 0.0000 −1.9720 7.2190 −0.2100 
0.0000 0.0000 −0.2100 1.1143
which is the symmetric tridiagonal form.
To find the eigenvalues of this symmetric tridiagonal matrix we use the Sturm sequence iteration
f4 (λ) = (a4 − λ)f3 (λ) − b24 f2 (λ),
where
f3 (λ) = (a3 − λ)f2 (λ) − b23 f1 (λ),
and
f2 (λ) = (a2 − λ)f1 (λ) − b22 f0 (λ),
with
f1 (λ) = (a1 − λ) and f0 (λ) = 1.
Since
a1 = 7.0000, a2 = 4.6667, a3 = 7.2190, a4 = 1.1143,
and
b2 = −2.4495, b3 = −1.9720, b4 = −0.2100.
Thus
f4 (λ) = λ4 − 20λ3 + 128.0002λ2 − 284.0021λ + 183.0027 = 0,
and solving this characteristic equation, we get
λ1 = 9.2510, λ2 = 7.1342, λ3 = 2.5100, λ4 = 1.1047,
which are the eigenvalues of symmetric tridiagonal matrix T and are also the eigenvalues of the
given matrix A. Once the eigenvalues of A are obtained, the corresponding eigenvectors of A can be obtained by using the shifted inverse power method. •

2.7 Matrix Decomposition Methods


In the following we will discuss two matrix decomposition methods, called the QR method and the LR method, which help us to find the eigenvalues of a given general matrix.

2.7.1 QR Method
We know that the Jacobi, the Given's and the Householder's methods are applicable only to symmetric matrices for finding all the eigenvalues of a matrix A. Now we describe the QR method, which can find all the eigenvalues of a general matrix. In this method we decompose an arbitrary real matrix A into a product QR, where Q is an orthogonal matrix and R is an upper-triangular matrix with nonnegative diagonal elements. Note that when A is nonsingular, this decomposition is unique.
Starting with A1 = A, the QR method iteratively computes similar matrices Ai , i = 2, 3, . . ., in two stages:

(1) Factor Ai into Qi Ri , that is, Ai = Qi Ri .

(2) Define Ai+1 = Ri Qi .

Note that from stage (1) we have

Ri = Qi^{−1} Ai ;

using this, stage (2) can be written as

Ai+1 = Qi^{−1} Ai Qi = Qi^T Ai Qi ,

where all Ai are similar to A and thus have the same eigenvalues. It turns out that in the case
where the eigenvalues of A all have different magnitude,

|λ1 | > |λ2 | > · · · > |λn |,

then the QR iterates Ai approach an upper-triangular matrix, and thus the elements of the main diagonal approach the eigenvalues of the given matrix A. When there are distinct eigenvalues of the same size, the iterates Ai may not approach an upper-triangular matrix; however, they do approach a matrix that is near enough to an upper-triangular matrix to allow us to find the
eigenvalues of A.
If a given matrix A is symmetric and tridiagonal, since the QR transformation preserves symmetry,
all subsequent matrices Ai will be symmetric and hence tridiagonal. Thus the combined method
of first reducing a symmetric matrix to a symmetric tridiagonal form by the Householder trans-
formations and then applying the QR method is probably the most effective way to evaluate all the eigenvalues of a symmetric matrix.
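As a rough sketch of these two stages only, one may iterate with MATLAB's built-in qr (the author's QRM routine builds Q and R from plane rotations instead, so the signs of the intermediate factors may differ, although the iterates are still similar to A):

function A = qr_iteration(A, maxit)
% Basic (unshifted) QR iteration: the diagonal of the returned matrix
% approaches the eigenvalues of the input matrix.
for i = 1:maxit
    [Q, R] = qr(A);     % stage (1): factor A_i = Q_i R_i
    A = R*Q;            % stage (2): A_{i+1} = R_i Q_i
end
end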
The simplest way of calculating the QR decomposition of an n × n matrix A is to pre-multiply A by a series of rotation matrices, where the values of p, q and θ are chosen to annihilate one of the lower-triangular elements. The value of θ chosen to create a zero in the (q, p) position is defined as

θ = − arctan( aqp / app ).

The first stage of the decomposition annihilates the element in the position (2, 1) using the rotation matrix Q12^T. The next two stages annihilate elements in the positions (3, 1) and (3, 2) using the rotation matrices Q13^T and Q23^T respectively. The process continues in this way, creating zeros in row order, until the rotation matrix Qn−1,n^T is used to annihilate the element in the position (n, n − 1). The zeros created are retained in a similar way as in the Given's method, and an upper-triangular matrix R is produced after n(n − 1)/2 pre-multiplications, that is

Qn−1,n^T · · · Q13^T Q12^T A = R,

which can be rearranged as

A = (Q12 Q13 · · · Qn−1,n) R = QR,

since Qpq^T = Qpq^{−1}.

Example 2.57 Find the first QR iteration for the following matrix
 
1 4 3
A =  2 3 1 .
 
2 6 5

Solution. Step I. Create a zero in the (2, 1) position by using the first orthogonal transformation
matrix as  
c s 0
Q12 =  −s c 0 ,
 
0 0 1
and

θ = − arctan( a21 / a11 ) = − arctan(2) = −70.4833,

cos θ ≈ 0.4472 and sin θ ≈ −0.8944.

Then
$$Q_{12} = \begin{pmatrix} 0.4472 & -0.8944 & 0 \\ 0.8944 & 0.4472 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad Q_{12}^T = \begin{pmatrix} 0.4472 & 0.8944 & 0 \\ -0.8944 & 0.4472 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$
which gives  
2.2360 4.4720 2.2360
QT12 A =  0 −2.2360 −2.2360  .
 
2.0000 6.0000 5.0000
Step II. Create a zero in the (3, 1) position by using the second orthogonal transformation matrix
as  
c 0 s
Q13 =  0 1 0 ,
 
−s 0 c
with

θ = − arctan( a31 / a11 ) = − arctan( 2 / 2.2360 ) ≈ −46.4585,

cos θ ≈ 0.7453 and sin θ ≈ −0.6667.

Then    
0.7453 0 −0.6667 0.7453 0 0.6667
Q13 = 0 1 0 , QT13 = 0 1 0 ,
   
0.6667 0 0.7453 −0.6667 0 0.7453
which gives  
3.0001 7.3336 5.0002
QT13 (QT12 A) =  0 −2.2360 −2.2360  .
 
0.0001 1.4909 2.2363

Step III. Create a zero in the (3, 2) position by using the third orthogonal transformation matrix
as  
1 0 0
Q23 =  0 c s ,
 
0 −s c
with

θ = − arctan( a32 / a22 ) = − arctan( 1.4909 / (−2.2360) ) ≈ 37.4393,

cos θ ≈ 0.8320 and sin θ ≈ 0.5548.


Then    
1 0 0 1 0 0
Q23 = 0 0.8320 0.5548  , QT23 =  0 0.8320 −0.5548  ,
   
0 −0.5548 0.8320 0 0.5548 0.8320
which gives
$$R_1 = Q_{23}^T ( Q_{13}^T Q_{12}^T A ) = \begin{pmatrix} 3 & 7.3333 & 5 \\ 0 & -2.6874 & -3.1009 \\ 0 & 0 & 0.6202 \end{pmatrix},$$
which is the required upper-triangular matrix R1 . The matrix Q1 can be computed as
 
0.3333 −0.5788 −0.7442
Q1 = Q12 Q13 Q23 =  0.6667 0.7029 −0.2481  .
 
0.6667 −0.4134 0.6202
Hence the original matrix A can be decomposed as
 
0.9999 3.9997 2.9997
A1 = Q1 R1 =  2.0001 3.0001 1.0000  ,
 
2.0001 6.0001 5.0001
and the new matrix can be obtained as
 
9.2222 1.3506 −0.9509
A2 = R1 Q1 =  −3.8589 −0.6068 −1.2564  ,
 
0.4134 −0.2564 0.3846
which is the required first QR iteration for the given matrix. •

Note that if we continue in the same way for 21 iterations, the new matrix A21 becomes the upper-triangular matrix
 
8.5826 −4.9070 −2.1450
A21 = R20 Q20 = 0 1 −1.1491  ,
 
0 0 −0.5825
and its diagonal elements are the eigenvalues, λ = 8.5826, 1, −0.5825, of the given matrix A. Once
the eigenvalues have been determined, the corresponding eigenvectors can be computed by the shifted
inverse power method.

We use the author-defined function QRM and the following MATLAB commands to get the above
results as follows:

>> A = [1 4 3; 2 3 1; 2 6 5];
>> sol = QRM (A);

Example 2.58 Find the first QR iteration for the following matrix
!
5 −2
A= ,
−2 8

and if (Q1 R1)x = b and R1 x = c, with c = Q1^T b, then find the solution of the linear system Ax = b, where b = [7, 8]^T .

Solution. First create a zero in the (2, 1) position with the help of the orthogonal transformation
matrix !
c s
Q12 = ,
−s c
and then to find the values of θ, c, and s, we calculate

θ = − arctan( a21 / a11 ) = − arctan(−0.4) = 0.3805,

cos θ ≈ 0.9285 and sin θ ≈ 0.3714.

So !
0.9285 0.3714
Q1 = Q12 = ,
−0.3714 0.9285
and !
5.3853 −4.8282
R1 = QT12 A = .
0 6.6852
Since ! ! !
0.9285 −0.3714 7 3.5283
c= QT1 b = = ,
0.3714 0.9285 8 10.0278
therefore, solving the following system
! ! !
5.3853 −4.8282 x1 3.5283
R1 x = = = c.
0 6.6852 x2 10.0278

Thus, we get ! !
x1 2.0000
= ,
x2 1.5000
which is the required solution of the given system. •

2.7.2 LR Method
Another method, which is very similar to the QR method, is Rutishauser's LR method. This method is based upon the decomposition of a matrix A into the product of a lower-triangular matrix L (with unit diagonal elements) and an upper-triangular matrix R. Starting with A1 = A, the LR method iteratively computes similar matrices Ai , i = 2, 3, . . . , in two stages.

(1) Factor Ai into Li Ri , that is, Ai = Li Ri .

(2) Define Ai+1 = Ri Li .

Each complete step is a similarity transformation because

Ai+1 = Ri Li = Li^{−1} Ai Li ,

and so all of the matrices Ai have the same eigenvalues. This triangular-decomposition-based method enables us to reduce a given nonsymmetric matrix to an upper-triangular matrix whose diagonal elements are the possible eigenvalues of the given matrix A, in decreasing order of magnitude. The rate at which the lower-triangular elements ajk^(i) of Ai converge to zero is of order (λj / λk)^i , j > k. This implies, in particular, that the order of convergence of the elements along the first subdiagonal is (λj+1 / λj)^i , and so convergence will be slow whenever two or more real eigenvalues are close together. The situation is rather more complicated if any of the eigenvalues are complex.
Since the triangular decomposition is not always possible, we will use the decomposition with partial pivoting. With it, we start from

Pi Ai = Li Ri ,

where Pi represents the row permutations used in the decomposition. In order to preserve the eigenvalues it is necessary to calculate Ai+1 from

Ai+1 = (Ri Pi) Li .

It is easy to see that this is a similarity transformation because

Ai+1 = (Ri Pi) Li = Li^{−1} Pi Ai Pi Li ,

and Pi^{−1} = Pi .
The matrix Pi does not have to be computed explicitly; Ri Pi is just a column permutation of Ri using interchanges corresponding to the row interchanges used in the decomposition of Ai .
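A minimal sketch of one such step using MATLAB's built-in lu is given below (lr_step is an illustrative name, not the author's LRM routine); for a permutation matrix P we have inv(P) = P', which is what the code uses.

function A = lr_step(A)
% One LR step with partial pivoting.
[L, R, P] = lu(A);      % P*A = L*R, with L unit lower-triangular
A = (R*P')*L;           % similarity transformation, so eigenvalues are preserved
end

Repeated calls, for instance for i = 1:8, A = lr_step(A); end, applied to the matrix of Example 2.59 below reproduce the sequence A2 , . . . , A9 shown there (no row interchanges occur for that matrix, so P is the identity).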

Example 2.59 Use the LR method to find the eigenvalues of the following matrix
 
2 −2 3
A= 0 3 −2  .
 
0 −1 2

Solution. The exact eigenvalues of the given matrix A are λ = 1, 2, 4. The first triangular
decomposition of A = A1 produce
   
1.0000 0 0 2.0000 −2.0000 3.0000
L1 =  0 1.0000 0 , R1 =  0 3.0000 −2.0000  ,
   
0 −0.3333 1.0000 0 0 1.3333
and no rows are interchanged. Then
 
2.0000 −3.0000 3.0000
A2 = R1 L1 =  0 3.6667 −2.0000  .
 
0 −0.4444 1.3333
The second triangular decomposition of A2 produce
   
1.0000 0 0 2.0000 −3.0000 3.0000
L2 =  0 1.0000 0 , R2 =  0 3.6667 −2.0000  ,
   
0 −0.1212 1.0000 0 0 1.0909
and again no rows are interchanged. Then
 
2.0000 −3.3636 3.0000
A3 = R2 L2 =  0 3.9091 −2.0000  .
 
0 −0.1322 1.0909
In a similar way, the next matrices in the sequence are
    
2 −3.3636 3.0000 1 0 0 2 −3.4651 3.0000
A4 =  0 3.9091 −2.0000   0 1.0000 0  =  0 3.9767 −2.0000 
    
0 0 1.0233 0 −0.0338 1 0 −0.0346 1.0233
    
2 −3.4651 3.0000 1 0 0 2 −3.4912 3.0000
A5 =  0 3.9767 −2.0000   0 1.0000 0  =  0 3.9942 −2.0000 
    
0 0 1.0058 0 −0.0087 1 0 −0.0088 1.0058
    
2 −3.4912 3.0000 1 0 0 2 −3.4978 3.0000
A6 =  0 3.9942 −2.0000   0 1.0000 0  =  0 3.9985 −2.0000 
    
0 0 1.0015 0 −0.0022 1 0 −0.0022 1.0015
    
2 −3.4978 3 1 0 0 2 −3.4995 3.0000
A7 =  0 3.9985 −2   0 1.0000 0  =  0 3.9996 −2.0000 
    
0 0 1 0 −0.0005 1 0 −0.0005 1.0004
    
2 −3.4995 3 1 0 0 2 −3.4999 3.0000
A8 =  0 3.9996 −2   0 1.0000 0  =  0 3.9999 −2.0000 
    
0 0 1 0 −0.0001 1 0 −0.0001 1.0001
    
2 −3.4999 3 1 0 0 2 −3.5 3
A9 =  0 3.9999 −2   0 1 0  =  0 4 −2  .
    
0 0 1 0 0 1 0 0 1

The diagonal elements of A9 have converged to 2, 4, and 1, which are the eigenvalues of the given matrix A. •

We use the author-defined function LRM and the following MATLAB commands to get the above
results as follows:

>> A = [2 -2 3; 0 3 -2; 0 -1 2];
>> sol = LRM (A);

2.7.3 Upper Hessenberg Form


In employing the QR method or the LR method to find the eigenvalues of a nonsymmetric matrix A, it is preferable to first use similarity transformations to convert A to upper Hessenberg form; we now describe this reduction and then go on to demonstrate its usefulness in the QR and the LR methods.

Definition 2.20 A matrix A is in upper Hessenberg form if

aij = 0, for all i, j such that i − j > 1.

For example, in the 4 × 4 case, the pattern of nonzero elements is
 
3 2 1 5
 4 6 7 3 
A= .
 
 0 8 9 5 
0 0 7 8

Note that one way to characterize upper Hessenberg form is that it is almost triangular. This is
important, since the eigenvalues of the triangular matrix are the diagonal elements. The upper
Hessenberg form of a matrix A can be achieved by a sequence of the Householder transformations
or the Gaussian elimination procedure. Here, we will use the Gaussian elimination procedure since
it is about a factor of 2 more efficient than the Householder method. It is possible to construct
matrices for which the Householder reduction, being orthogonal, is stable and elimination is not,
but such matrices are extremely rare in practice.
A general n × n matrix A can be reduced to upper Hessenberg form in exactly n − 2 steps.
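For comparison, MATLAB's built-in function hess returns an upper Hessenberg matrix similar to A; it uses orthogonal Householder transformations rather than the elimination procedure described here. The matrix below is only an illustration and is not taken from the text.

A = [3 2 1 5; 4 6 7 3; 1 8 9 5; 2 1 7 8];   % arbitrary 4 x 4 test matrix
[P, H] = hess(A);                  % A = P*H*P' with H in upper Hessenberg form
disp(H)                            % zeros below the first subdiagonal
disp(sort(eig(H)) - sort(eig(A)))  % same spectrum (differences ~ 0)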
Consider an 5 × 5 matrix  
a11 a12 a13 a14 a15
 a
 21 a22 a23 a24 a25 

A =  a31 a32 a33 a34 a35  .
 
 
 a41 a42 a43 a44 a45 
a51 a52 a53 a54 a55
The first step in reducing the given matrix A = A1 to upper Hessenberg form is to eliminate the elements in the (3, 1), (4, 1) and (5, 1) positions. This can be done by subtracting multiples m31 = a31/a21, m41 = a41/a21 and m51 = a51/a21 of row 2 from rows 3, 4 and 5, respectively, and considering the matrix
the matrix  
1 0 0 0 0
 0 1 0 0 0 
 
M1 =  0 m31 1 0 0  .
 
 
 0 m41 0 1 0 
0 m51 0 0 1

2.7.4 Singular Value Decomposition


We have considered two principal methods for the decomposition of the matrix, the QR decomposi-
tion and the LR decomposition. There is another important method for the matrix decomposition,
called, the singular value decomposition (SVD).
Here, we show that every rectangular real matrix A can be decomposed into a product U DV T
of two orthogonal matrices U and V and a generalized diagonal matrix D. The construction of
U DV T is based on the fact that for all real matrices A a matrix AT A is symmetric and therefore
there exists an orthogonal matrix Q and a diagonal matrix D for which

AT A = QDQT .

As we know the diagonal entries of D are the eigenvalues of AT A. Now we show that they are
nonnegative in all cases and that their square roots, called the singular values of A, can be used
to construct U DV T .

Singular Values of a Matrix

For any m × n matrix A, an n × n matrix AT A is symmetric and hence can be orthogonally


diagonalized. Not only are the eigenvalues of AT A all real, they are all nonnegative. To show this,
let λ be an eigenvalue of AT A with corresponding unit vector v. Then

0 ≤ ‖Av‖^2 = (Av)·(Av) = (Av)^T Av = v^T A^T A v = v^T λ v = λ(v·v) = λ‖v‖^2 = λ.

It therefore makes sense to take (positive) square roots of these eigenvalues.

Definition 2.21 Singular Values of a Matrix

If A is an m × n matrix, the singular values of A are the square roots of the eigenvalues of A^T A and are denoted by σ1 , . . . , σn . It is conventional to arrange the singular values so that σ1 ≥ σ2 ≥ · · · ≥ σn . •

Example 2.63 Find the singular values of


!
1 0 1
A= .
1 1 0

Solution. Since the singular values of A are the square roots of the eigenvalues of AT A, so we
compute    
1 1 ! 2 1 1
1 0 1
AT A =  0 1  =  1 1 0 .
   
1 1 0
1 0 1 0 1

The matrix A^T A has eigenvalues λ1 = 3, λ2 = 1 and λ3 = 0. Consequently, the singular values of A are σ1 = √3 = 1.7321, σ2 = √1 = 1, and σ3 = √0 = 0. •

Note that the singular values of A are not the same as its eigenvalues, but there is a connection between them if A is a symmetric matrix.

Theorem 2.31 If A = AT is a symmetric matrix, then its singular values are the absolute values
of its nonzero eigenvalues, that is
σi = |λi | > 0. •
Theorem 2.32 The condition number of a nonsingular matrix is the ratio between its largest
singular value σ1 (or dominant singular value) and smallest singular value σn , that is
K(A) = σ1 / σn . •
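For instance (a small illustration using the matrix of Example 2.65 later in this section), the ratio of the extreme singular values returned by svd agrees with MATLAB's cond:

A = [-4 -6; 3 -8];      % matrix of Example 2.65
s = svd(A);             % singular values in decreasing order: 10 and 5
K = s(1)/s(end)         % K(A) = sigma_1/sigma_n = 2, the same value as cond(A)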
Singular Value Decomposition

The following are some of the properties that make singular value decompositions useful:

1. All real matrices have singular value decompositions.


2. A real square matrix is invertible if and only if all its singular values are nonzero.
3. For any m × n real rectangular matrix A, the number of nonzero singular values of A is equal
to the rank of A.
4. If A = U DV T is a singular value decomposition of a invertible matrix A, then A−1 = V D−1 U T .
5. For positive definite symmetric matrices, the orthogonal decomposition QDQT and the singular
value decomposition U DV T coincide.
Theorem 2.33 (Singular Value Decomposition Theorem)

Every m × n matrix A can be factored into the product of an m × m matrix U with orthonormal columns, so U^T U = I, the m × n generalized diagonal matrix D = diag(σ1 , . . . , σr) that has the singular values of A as its diagonal entries, and an n × n orthogonal matrix V (so V^T V = I); that is,

$$A = UDV^T = \begin{pmatrix} u_1 & u_2 & \cdots & u_r & u_{r+1} & \cdots & u_m \end{pmatrix} \begin{pmatrix} \sigma_1 & & & \\ & \ddots & & 0 \\ & & \sigma_r & \\ & 0 & & 0 \end{pmatrix} \begin{pmatrix} v_1^T \\ v_2^T \\ \vdots \\ v_r^T \\ v_{r+1}^T \\ \vdots \\ v_n^T \end{pmatrix}.$$
Note that the columns of U , u1 , u2 , . . . , ur , are called left singular vectors of A, and the columns
of V , v1 , v2 , . . . , vr , are called right singular vectors of A. The matrices U and V are not uniquely
determined by A, but the matrix D must contain the singular values, σ1 , σ2 , . . . , σr , of A.
To construct the orthogonal matrix V , we must find an orthonormal basis {v1 , v2 , . . . , vn } for Rn
consisting of eigenvectors of an n × n symmetric matrix AT A. Then
V = [v1 v2 · · · vn ],
is an orthogonal n × n matrix.
For the orthogonal matrix U , we first note that {Av1 , Av2 , . . . , Avn } is an orthogonal set of vectors
in Rm . To see this, suppose that vi is an eigenvector of A^T A corresponding to an eigenvalue λi ; then, for i ≠ j, we have

(Avi)·(Avj) = (Avi)^T Avj = vi^T A^T A vj = vi^T λj vj = λj (vi · vj) = 0,

since the eigenvectors vi are orthogonal. Now recall that the singular values satisfy σi = ‖Avi‖ and that the first r of these are nonzero. Therefore, we can normalize Av1 , . . . , Avr by setting

ui = (1/σi) Avi , for i = 1, 2, . . . , r.
This guarantees that {u1 , u2 , . . . , ur } is an orthonormal set in Rm , but if r < m it will not be a basis
for Rm . In this case, we extend the set {u1 , u2 , . . . , ur } to an orthonormal basis {u1 , u2 , . . . , um }
for Rm . •

Example 2.64 Find the singular value decomposition of the following matrix
!
1 0 1
A= .
1 1 0

Solution. We compute
   
1 1 ! 2 1 1
T 1 0 1
A A= 0 1  =  1 1 0 ,
   
1 1 0
1 0 1 0 1

and find that its eigenvalues are

λ1 = 3, λ2 = 1, λ3 = 0,

with corresponding eigenvectors


     
2 0 −1
 1 ,  −1  ,  1 .
     
1 1 1

These vectors are orthogonal, so we normalize them to obtain


 √     √ 
2/√6 0
√  −1/√3 
v1 =  1/√6  , v2 =  −1/√2  , v3 =  1/√3  .
   
1/ 6 1/ 2 1/ 3

The singular values of A are


σ1 = √λ1 = √3, σ2 = √λ2 = 1, and σ3 = √λ3 = 0.

Thus
$$V = \begin{pmatrix} 2/\sqrt{6} & 0 & -1/\sqrt{3} \\ 1/\sqrt{6} & -1/\sqrt{2} & 1/\sqrt{3} \\ 1/\sqrt{6} & 1/\sqrt{2} & 1/\sqrt{3} \end{pmatrix}, \qquad D = \begin{pmatrix} \sqrt{3} & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$
To find U , we compute
√ 
√ !

! 2/√6
1 1 1 0 1 1/√2
u1 = Av1 = √  1/√6  = ,
 
σ1 3 1 1 0 1/ 2
1/ 6

√ !
 
√0 
!
1 1 1 0 1 1/√2
u2 = Av2 = −1/√2  = .

σ2 1 1 1 0 −1/ 2

1/ 2
These vectors already form an orthonormal basis for R2 , so we have
√ √ !
1/√2 1/√2
U= .
1/ 2 −1/ 2

Thus √ √ 
√ √ ! √

! 2/√6 √0 −1/√3
1/√2 1/√2 3 0 0
A=  1/√6 −1/√2 1/√3  ,
 
1/ 2 −1/ 2 0 1 0
1/ 6 1/ 2 1/ 3
which is the required singular value decomposition of A. •

The MATLAB built-in function svd performs the SVD of a matrix. Thus, to reproduce the above results using the MATLAB Command Window, we do the following:

>> A = [1 0 1; 1 1 0];
>> [U, D, V ] = svd(A);

The singular value decomposition occurs in many applications. For example, if we can compute the SVD accurately, then we can solve a linear system very efficiently. Recall that the nonzero singular values of A are the square roots of the nonzero eigenvalues of the matrix AA^T, which are the same as the nonzero eigenvalues of A^T A; there are exactly r = rank(A) positive singular values. Suppose that A is square and has full rank. Then if Ax = b, we have

U D V^T x = b and hence x = V D^{−1} U^T b

(since U^T U = I and V V^T = I by orthogonality).
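As a short sketch of this use of the SVD (compare Example 2.65 below):

A = [-4 -6; 3 -8];  b = [1; 4];
[U, D, V] = svd(A);
x = V*(D\(U'*b))        % x = V*inv(D)*U'*b = [0.32; -0.38], and A*x recovers b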

Example 2.65 Find the solution of the linear system Ax = b using singular value decomposition,
where
$$A = \begin{pmatrix} -4 & -6 \\ 3 & -8 \end{pmatrix} \qquad \text{and} \qquad b = \begin{pmatrix} 1 \\ 4 \end{pmatrix}.$$
Solution. First we have to compute the singular value decomposition of A, for this we have to
compute ! ! !
T −4 3 −4 −6 25 0
A A= = .
−6 −8 3 −8 0 100

The characteristic polynomial of AT A is

λ2 − 125λ + 2500 = (λ − 100)(λ − 25) = 0,

and gives the eigenvalues of A^T A

λ1 = 100 and λ2 = 25.



Corresponding to the eigenvalues λ1 and λ2 , we have the eigenvectors


! !
0 1
and ,
1 0

respectively. These vectors are orthogonal, so we normalize them to obtain


! !
0 1
v1 = and v2 = .
1 0

The singular values of A are


p √ p √
σ1 = λ1 = 100 = 10 and σ2 = λ2 = 25 = 5.

Thus ! !
0 1 10 0
V = and D= .
1 0 0 5
To find U , we compute
! ! !
1 1 −4 −6 0 −0.6
u1 = Av1 = = ,
σ1 10 3 −8 1 −0.8
! ! !
1 1 −4 −6 1 −0.8
u2 = Av2 = = .
σ2 5 3 −8 0 0.6

These vectors already form an orthonormal basis for R2 , so we have


!
−0.6 −0.8
U= .
−0.8 0.6

This yields the SVD ! ! !


−0.6 −0.8 10 0 0 1
A= .
−0.8 0.6 0 5 1 0
Now to find the solution of the given linear system, we solve

x = V D^{−1} U^T b,

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0.1 & 0 \\ 0 & 0.2 \end{pmatrix}\begin{pmatrix} -0.6 & -0.8 \\ -0.8 & 0.6 \end{pmatrix}\begin{pmatrix} 1 \\ 4 \end{pmatrix} = \begin{pmatrix} 0.32 \\ -0.38 \end{pmatrix},$$

which is the solution of the given linear system. •
