Eigenvalues and eigenvectors of a matrix

Definition: If A is an n × n matrix and there exists a real number λ and a
non-zero column vector V such that

AV = λV

then λ is called an eigenvalue of A and V is called an eigenvector
corresponding to the eigenvalue λ.
 
Example: Consider the matrix

A = \begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix}

Note that

\begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = 4 \begin{pmatrix} 1 \\ 2 \end{pmatrix}

So the number 4 is an eigenvalue of A and the column vector (1, 2)ᵀ is an
eigenvector corresponding to 4.
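For reference, this check is easy to reproduce with Python and numpy (a small sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [2.0, 3.0]])
v = np.array([1.0, 2.0])

# A v should equal 4 v, confirming that 4 is an eigenvalue
# with eigenvector (1, 2)^T.
print(A @ v)                       # [4. 8.]
print(np.allclose(A @ v, 4 * v))   # True
```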
Finding eigenvalues and eigenvectors for a given matrix A

1. The eigenvalues of A are calculated by solving the characteristic
equation of A:

det(A − λI) = 0

2. Once the eigenvalues of A have been found, the eigenvectors
corresponding to each eigenvalue λ can be determined by solving the
matrix equation

AV = λV


Example: Find the eigenvalues of

A = \begin{pmatrix} 4 & 0 & 1 \\ -2 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix}
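Hand computations like this can be checked numerically (a sketch; np.linalg.eig finds the roots of the characteristic equation numerically rather than symbolically):

```python
import numpy as np

A = np.array([[ 4.0, 0.0, 1.0],
              [-2.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])

# Numerical roots of the characteristic equation det(A - lambda*I) = 0.
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)
```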



Example:

The eigenvalues of A = \begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix} are 1 and 4, which is seen by solving
the quadratic equation

\begin{vmatrix} 2 - \lambda & 1 \\ 2 & 3 - \lambda \end{vmatrix} = λ² − 5λ + 4 = 0

To find the eigenvectors corresponding to 1 we write V = (V1, V2)ᵀ and solve

\begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} V_1 \\ V_2 \end{pmatrix} = 1 · \begin{pmatrix} V_1 \\ V_2 \end{pmatrix}   so   2V1 + V2 = V1, 2V1 + 3V2 = V2

This gives a single independent equation, V1 + V2 = 0, so any vector of the
form V = α (1, −1)ᵀ is an eigenvector corresponding to the eigenvalue 1.
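The null-space computation can also be mirrored numerically; one hedged sketch is to take the right-singular vector of A − λI corresponding to its zero singular value, which spans the null space here:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [2.0, 3.0]])
lam = 1.0

# Eigenvectors for lam span the null space of (A - lam*I); the
# right-singular vector for the smallest singular value spans it.
_, s, Vt = np.linalg.svd(A - lam * np.eye(2))
v = Vt[-1]
print(v / v[0])   # proportional to (1, -1)
```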



Matrix diagonalisation
The eigenvalues and eigenvectors of a matrix have the following important
property:

If a square n × n matrix A has n linearly independent eigenvectors then it
is diagonalisable, that is, it can be factorised as follows:

A = PDP⁻¹

where D is the diagonal matrix containing the eigenvalues of A along the
diagonal, also written as

D = diag[λ1, λ2, . . . , λn]

and P is a matrix formed with the corresponding eigenvectors as its
columns.

For example, matrices whose eigenvalues are distinct numbers are
diagonalisable. Symmetric matrices are also diagonalisable.
This property has many important applications. For example, it can be
used to simplify algebraic operations involving the matrix A, such as
calculating powers:

Aᵐ = (PDP⁻¹)(PDP⁻¹) · · · (PDP⁻¹) = PDᵐP⁻¹

Note that if D = diag[λ1, λ2, . . . , λn] then Dᵐ = diag[λ1ᵐ, λ2ᵐ, . . . , λnᵐ],
which is very easy to calculate.
 
Example: The matrix A = \begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix} has eigenvalues 1 and 4 with
corresponding eigenvectors (1, −1)ᵀ and (1, 2)ᵀ. Then it can be diagonalised
as follows:

A = \begin{pmatrix} 1 & 1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix} \begin{pmatrix} 2/3 & -1/3 \\ 1/3 & 1/3 \end{pmatrix}
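A short numpy sketch verifying this factorisation and the power formula Aᵐ = PDᵐP⁻¹ (the matrices are the ones from the example; the exponent m = 5 is an arbitrary choice):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [2.0, 3.0]])
P = np.array([[1.0, 1.0],
              [-1.0, 2.0]])    # eigenvectors as columns
D = np.diag([1.0, 4.0])        # eigenvalues on the diagonal

# A = P D P^(-1)
print(np.allclose(A, P @ D @ np.linalg.inv(P)))       # True

# A^5 via diagonalisation: D^5 just powers the diagonal entries.
A5 = P @ np.diag(np.diag(D) ** 5) @ np.linalg.inv(P)
print(np.allclose(A5, np.linalg.matrix_power(A, 5)))  # True
```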



The power and inverse power method

Overview:

The Gerschgorin circle theorem is used for locating the eigenvalues of
a matrix.
The power method is used for approximating the dominant eigenvalue
(that is, the eigenvalue of largest absolute value) of a matrix and its
associated eigenvector.
The inverse power method is used for approximating the smallest
eigenvalue of a matrix or for approximating the eigenvalue nearest to
a given value, together with the corresponding eigenvector.



The Gerschgorin Circle Theorem
Let A be an n × n matrix and define

Ri = ∑_{j=1, j≠i}^{n} |aij|

for each i = 1, 2, 3, . . . , n. Also consider the circles

Ci = {z ∈ ℂ : |z − aii| ≤ Ri}

1. If λ is an eigenvalue of A then λ lies in one of the circles Ci.
2. If k of the circles Ci form a connected region R in the complex plane,
disjoint from the remaining n − k circles, then the region contains
exactly k eigenvalues.



Example
Consider the matrix

A = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 5 & 1 \\ -2 & -1 & 9 \end{pmatrix}

The radii of the Gerschgorin circles are

R1 = |−1| + |0| = 1;  R2 = |1| + |1| = 2;  R3 = |−2| + |−1| = 3

and the circles are

C1 = {z ∈ ℂ : |z − 1| ≤ 1};  C2 = {z ∈ ℂ : |z − 5| ≤ 2};  C3 = {z ∈ ℂ : |z − 9| ≤ 3}

Since C1 is disjoint from the other circles, it must contain exactly one of
the eigenvalues, and this eigenvalue is real: a complex eigenvalue of the
real matrix A would be accompanied by its conjugate, which lies in the same
(conjugate-symmetric) circle, giving C1 two eigenvalues. As C2 and C3
overlap, their union must contain the other two eigenvalues.
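The centres and radii are easy to tabulate programmatically (a minimal sketch reproducing the discs above):

```python
import numpy as np

A = np.array([[ 1.0, -1.0, 0.0],
              [ 1.0,  5.0, 1.0],
              [-2.0, -1.0, 9.0]])

for i in range(A.shape[0]):
    centre = A[i, i]
    radius = np.sum(np.abs(A[i])) - abs(A[i, i])   # off-diagonal row sum
    print(f"C{i + 1}: |z - {centre:g}| <= {radius:g}")
```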



The power method
The power method is an iterative technique for approximating the
dominant eigenvalue of a matrix together with an associated eigenvector.

Let A be an n × n matrix with eigenvalues satisfying

|λ1| > |λ2| ≥ |λ3| ≥ . . . ≥ |λn|

The eigenvalue with the largest absolute value, λ1, is called the dominant
eigenvalue. Any eigenvector corresponding to λ1 is called a dominant
eigenvector.

Let x(0) be an n-vector. Assuming the eigenvectors of A form a basis, it
can be written as

x(0) = α1 v1 + · · · + αn vn

where v1, . . . , vn are the eigenvectors of A.



Now construct the sequence x(m) = Ax(m−1), for m ≥ 1:

x(1) = Ax(0) = α1 λ1 v1 + · · · + αn λn vn
x(2) = Ax(1) = α1 λ1² v1 + · · · + αn λn² vn
...
x(m) = Ax(m−1) = α1 λ1ᵐ v1 + · · · + αn λnᵐ vn

and hence

x(m)/λ1ᵐ = α1 v1 + α2 (λ2/λ1)ᵐ v2 + · · · + αn (λn/λ1)ᵐ vn

Since |λi/λ1| < 1 for i ≥ 2, this gives

lim_{m→∞} x(m)/λ1ᵐ = α1 v1

Hence, provided α1 ≠ 0, the sequence x(m)/λ1ᵐ converges to an eigenvector
associated with the dominant eigenvalue.



The power method implementation
Choose the initial guess x(0) such that max_i |x(0)_i| = 1.

For m ≥ 1, let

y(m) = Ax(m−1),   x(m) = y(m) / y(m)_{p_m}

where p_m is the index of the component of y(m) which has maximum
absolute value.

Note that

y(m) = Ax(m−1) ≈ λ1 x(m−1)

and since x(m−1)_{p_{m−1}} = 1 (by construction), it follows that

y(m)_{p_{m−1}} → λ1



Note: Since the power method is an iterative scheme, a stopping
condition could be given as

|λ(m) − λ(m−1)| < required error

where λ(m) is the dominant eigenvalue approximation during the mth
iteration, or

max_i |x(m)_i − x(m−1)_i| < required error

Note: If the ratio |λ2/λ1| is small, the convergence rate is fast. If
|λ1| = |λ2| then, in general, the power method does not converge.

Example: Consider the matrix

A = \begin{pmatrix} -2 & -2 & 3 \\ -10 & -1 & 6 \\ 10 & -2 & -9 \end{pmatrix}

Show that its eigenvalues are λ1 = −12, λ2 = −3 and λ3 = 3 and
calculate the associated eigenvectors. Use the power method to
approximate the dominant eigenvalue and a corresponding eigenvector.
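A direct transcription of the scaled power method above, applied to this example (a sketch under the stated assumptions; note the initial guess must have a nonzero component α1 along the dominant eigenvector, and for this particular A the vector (1, 1, 1)ᵀ happens to violate that, so (1, 0, 0)ᵀ is used instead):

```python
import numpy as np

def power_method(A, x0, tol=1e-8, max_iter=100):
    """Scaled power iteration; returns estimates of (lambda_1, v_1)."""
    x = x0 / np.max(np.abs(x0))       # scale so the largest |component| is 1
    p = np.argmax(np.abs(x))          # pivot: index of largest |component|
    lam_old = 0.0
    for _ in range(max_iter):
        y = A @ x
        lam = y[p]                    # y at the previous pivot -> lambda_1
        p = np.argmax(np.abs(y))      # new pivot index
        x = y / y[p]                  # rescale so x[p] = 1
        if abs(lam - lam_old) < tol:  # stopping condition from the notes
            break
        lam_old = lam
    return lam, x

A = np.array([[ -2.0, -2.0,  3.0],
              [-10.0, -1.0,  6.0],
              [ 10.0, -2.0, -9.0]])

# (1, 1, 1) has alpha_1 = 0 for this A, so a different start is needed.
lam, v = power_method(A, np.array([1.0, 0.0, 0.0]))
print(lam, v)   # lam should approach -12, v a multiple of (1, 2, -2)
```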
Matrix polynomials and inverses

Let A be an n × n matrix with eigenvalues λ1, λ2, . . . , λn and associated
eigenvectors v1, v2, . . . , vn.

1. Let

p(x) = a0 + a1 x + · · · + am xᵐ

be a polynomial of degree m. Then the eigenvalues of the matrix

B = p(A) = a0 I + a1 A + · · · + am Aᵐ

are p(λ1), p(λ2), . . . , p(λn), with associated eigenvectors v1, v2, . . . , vn.

2. If det(A) ≠ 0 then the inverse matrix A⁻¹ has eigenvalues
1/λ1, 1/λ2, . . . , 1/λn, with associated eigenvectors v1, v2, . . . , vn.
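Both facts are easy to spot-check numerically (a sketch; the polynomial p(x) = 2 + 3x + x² is an arbitrary choice, not from the notes):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [2.0, 3.0]])    # eigenvalues 1 and 4

# Eigenvalues of p(A) = 2I + 3A + A^2 should be p(1) = 6 and p(4) = 30.
B = 2 * np.eye(2) + 3 * A + A @ A
print(np.sort(np.linalg.eigvals(B)))                  # [ 6. 30.]

# Eigenvalues of A^(-1) should be 1/1 = 1 and 1/4 = 0.25.
print(np.sort(np.linalg.eigvals(np.linalg.inv(A))))   # [0.25 1.  ]
```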



The inverse power method
Let A be an n × n matrix with eigenvalues λ1, λ2, . . . , λn and associated
eigenvectors v1, v2, . . . , vn.

Let q be a number for which

det(A − qI) ≠ 0

(so q is not an eigenvalue of A). Let B = (A − qI)⁻¹, so the eigenvalues of
B are given by

µ1 = 1/(λ1 − q),  µ2 = 1/(λ2 − q),  . . . ,  µn = 1/(λn − q)

If we know that λk is the eigenvalue that is closest to the number q, then
µk is the dominant eigenvalue of B and so it can be determined using the
power method.



Example

Consider the matrix

A = \begin{pmatrix} -4 & 1 & 1 \\ 0 & 3 & 1 \\ -2 & 0 & 15 \end{pmatrix}
Use Gerschgorin circles to obtain an estimate for one of the eigenvalues of
A and then get a more accurate approximation using the inverse power
method.

(Note: The eigenvalues are, approximately, -3.9095, 3.0243 and 14.8852.)



The Gerschgorin circles are

|z + 4| ≤ 2;  |z − 3| ≤ 1  and  |z − 15| ≤ 2

so they are all disjoint! There is one eigenvalue in each of the three
circles, so the eigenvalues lie close to −4, 3 and 15.

Let λ be the eigenvalue closest to q = 3 and let B = (A − 3I)⁻¹. Then
µ = 1/(λ − 3) is the dominant eigenvalue of B and can be determined using
the power method, together with an associated eigenvector v.

Note that, once the desired approximation µ(n), v(n) has been obtained,
we must calculate

λ(n) = 3 + 1/µ(n)

The associated eigenvector for λ is the same as the one calculated for µ.
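A sketch of this shift-invert iteration (one assumption: instead of forming B = (A − 3I)⁻¹ explicitly, each step solves a linear system with A − 3I, which is equivalent and numerically preferable):

```python
import numpy as np

A = np.array([[-4.0, 1.0,  1.0],
              [ 0.0, 3.0,  1.0],
              [-2.0, 0.0, 15.0]])
q = 3.0
shifted = A - q * np.eye(3)

x = np.array([1.0, 1.0, 1.0])
mu = 0.0
for _ in range(50):
    y = np.linalg.solve(shifted, x)   # y = (A - 3I)^(-1) x
    p = np.argmax(np.abs(y))
    mu = y[p]                         # dominant eigenvalue estimate for B
    x = y / y[p]

print(q + 1.0 / mu)   # should approach 3.0243, as in the note above
```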



Smallest eigenvalue

Finding the eigenvalue of A that is smallest in magnitude is equivalent to
finding the dominant eigenvalue of the matrix B = A⁻¹. (This is the inverse
power method with q = 0.)

Example: Consider the matrix

A = \begin{pmatrix} 2 & 2 & -1 \\ -5 & 9 & -3 \\ -4 & 4 & 1 \end{pmatrix}

Use the inverse power method to find an approximation for the smallest
eigenvalue of A.

(Note: The eigenvalues are 3, 4 and 5.)
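The same loop with q = 0 (a sketch; solving Ay = x at each step stands in for multiplying by A⁻¹):

```python
import numpy as np

A = np.array([[ 2.0, 2.0, -1.0],
              [-5.0, 9.0, -3.0],
              [-4.0, 4.0,  1.0]])

x = np.array([1.0, 1.0, 1.0])
mu = 0.0
for _ in range(100):
    y = np.linalg.solve(A, x)   # y = A^(-1) x, without forming A^(-1)
    p = np.argmax(np.abs(y))
    mu = y[p]                   # dominant eigenvalue estimate for A^(-1)
    x = y / y[p]

print(1.0 / mu)   # should approach 3, the smallest eigenvalue of A
```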



The QR method for finding eigenvalues
We say that two n × n matrices, A and B, are orthogonally similar if
there exists an orthogonal matrix Q such that

AQ = QB,  or  A = QBQᵀ

If A and B are orthogonally similar matrices then they have the
same eigenvalues.

Proof: If λ is an eigenvalue of A with eigenvector v, that is,

Av = λv

then, since B = QᵀAQ and QQᵀ = I,

B(Qᵀv) = QᵀAQQᵀv = QᵀAv = λ(Qᵀv)

which means λ is an eigenvalue of B with eigenvector Qᵀv.



To find the eigenvalues of a matrix A using the QR algorithm, we generate
a sequence of matrices A(m) which are orthogonally similar to A (and so
have the same eigenvalues), and which converge to a matrix whose
eigenvalues are easily found.

If the matrix A is symmetric and tridiagonal then the sequence of QR
iterations converges to a diagonal matrix, so its eigenvalues can easily be
read from the main diagonal.



The basic QR algorithm

We find the QR factorisation of the matrix A and then take the
reverse-order product RQ to construct the first matrix in the sequence:

A = QR  =⇒  R = QᵀA
A(1) = RQ  =⇒  A(1) = QᵀAQ

It is easy to see that A and A(1) are orthogonally similar and so have the
same eigenvalues. We then find the QR factorisation of A(1):

A(1) = Q(1) R(1)  =⇒  A(2) = R(1) Q(1)

This procedure is then continued to construct A(2), A(3), etc. and (if the
original matrix is symmetric and tridiagonal) this sequence converges to a
diagonal matrix. The diagonal values of each of the iteration matrices
A(m) can be considered as approximations of the eigenvalues of A.



Example
Consider the matrix

A = \begin{pmatrix} 4 & 3 & 0 \\ 3 & 1 & -1 \\ 0 & -1 & 3 \end{pmatrix}

whose eigenvalues are −1.035225260, 3.083970257 and 5.951255004. Some of
the iterations of the QR algorithm are shown below:

A(2) = \begin{pmatrix} 5.923 & -0.276 & 0 \\ -0.276 & 2.227 & -1.692 \\ 0 & -1.692 & -0.155 \end{pmatrix}

A(10) = \begin{pmatrix} 5.951 & -0.00128 & 0 \\ -0.00128 & 3.084 & -0.000345 \\ 0 & -0.000345 & -1.035 \end{pmatrix}

Note that the diagonal elements approximate the eigenvalues of A to 3
decimal places while the off-diagonal elements are converging to zero.
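The basic QR iteration is only a few lines with numpy's qr factorisation (a sketch applied to the matrix above; after 10 iterations the diagonal should be close to the A(10) values shown):

```python
import numpy as np

A = np.array([[4.0,  3.0,  0.0],
              [3.0,  1.0, -1.0],
              [0.0, -1.0,  3.0]])

Ak = A.copy()
for _ in range(10):
    Q, R = np.linalg.qr(Ak)   # factorise A^(k) = QR
    Ak = R @ Q                # reverse-order product gives A^(k+1)

print(np.diag(Ak))            # diagonal approximates the eigenvalues of A
```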



Remarks

1. Since the rate of convergence of the QR algorithm is quite slow
(especially when the eigenvalues are closely spaced in magnitude),
there are a number of variations of this algorithm (not discussed here)
which can accelerate convergence.
2. The QR algorithm can be used for finding eigenvectors as well, but we
will not cover this technique now.

Exercise: Calculate the first iteration of the QR algorithm for the matrix
on the previous slide.



More examples

1. Let

A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 6 & -2 \\ 0 & -2 & 4 \end{pmatrix}

Calculate the eigenvalues of A (using the definition) and perform two
iterations of the QR algorithm to approximate them.

2. Let

A = \begin{pmatrix} 12 & 1 & 0 \\ 1 & 3 & -2 \\ 0 & -2 & 5 \end{pmatrix}

whose eigenvalues are 12.11689276, 1.693307642 and 6.189799598. Perform
one iteration of the QR algorithm.



Applications of eigenvalues and eigenvectors
1. Systems of differential equations

Recall that a square matrix A is diagonalisable if it can be factorised as
A = PDP⁻¹, where D is the diagonal matrix containing the eigenvalues of
A along the diagonal, and P (called the modal matrix) has the
corresponding eigenvectors as columns.

Consider the system of 1st order linear differential equations

x′(t) = a11 x(t) + a12 y(t)
y′(t) = a21 x(t) + a22 y(t)

written in matrix form as X′(t) = AX(t), where X = (x, y)ᵀ and
A = (aij)_{i,j=1,2}.



If A is diagonalisable then A = PDP⁻¹, where D = diag(λ1, λ2), so the
system becomes

P⁻¹X′(t) = DP⁻¹X(t)

If we let X̃(t) = P⁻¹X(t) then X̃′(t) = DX̃(t), which can be written as

x̃′(t) = λ1 x̃(t),  ỹ′(t) = λ2 ỹ(t)

which is easy to solve:

x̃(t) = C1 e^{λ1 t},  ỹ(t) = C2 e^{λ2 t}

where C1 and C2 are arbitrary constants. The final solution is then
recovered from X(t) = PX̃(t).



Example: Consider the following linear predator-prey model, where x(t)
represents a population of rabbits at time t and y(t) a population of foxes:

x′(t) = x(t) − 2y(t)
y′(t) = 3x(t) − 4y(t)

Solve the system using eigenvalues and determine how the two populations
are going to evolve. Assume x(0) = 4, y(0) = 1.
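A numerical sketch of the eigenvalue solution for this example (the formula X(t) = P diag(e^{λ1 t}, e^{λ2 t}) P⁻¹ X(0) follows from the previous slide; numpy carries out the arithmetic):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0, -4.0]])
X0 = np.array([4.0, 1.0])

lams, P = np.linalg.eig(A)   # D = diag(lams); columns of P are eigenvectors
C = np.linalg.solve(P, X0)   # constants C = P^(-1) X(0)

def X(t):
    # X(t) = P @ diag(e^{lambda_i t}) @ C
    return P @ (np.exp(lams * t) * C)

for t in (0.0, 1.0, 5.0):
    print(t, X(t))   # both eigenvalues are negative, so both populations decay
```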



Applications of eigenvalues and eigenvectors
2. Discrete age-structured population model

The Leslie model describes the growth of the female portion of a human or
animal population. The females are divided into n age classes of equal
duration L/n, where L is the population age limit.

The initial number of females in each age group is given by

x(0) = (x1(0), x2(0), . . . , xn(0))ᵀ

where x1(0) is the number of females aged 0 to L/n years, x2(0) is the
number of females aged L/n to 2L/n, etc. This is called the initial age
distribution vector.



The birth and death parameters which describe the future evolution of the
population are given by

ai = average number of daughters born to each female in age class i
bi = fraction of females in age class i which survive to the next age class

Note that ai ≥ 0 for i = 1, 2, . . . , n and 0 < bi ≤ 1 for i = 1, 2, . . . , n − 1.

Let x(k) be the age distribution vector at time tk (where k = 1, 2, . . . and
tk+1 − tk = L/n). The Leslie model states that the distribution at time
tk+1 is given by

x(k+1) = Lx(k)

where L, the Leslie matrix, is defined as



 
L = \begin{pmatrix} a_1 & a_2 & a_3 & \cdots & a_n \\ b_1 & 0 & 0 & \cdots & 0 \\ 0 & b_2 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & b_{n-1} & 0 \end{pmatrix}

This matrix equation is equivalent to

x1(k+1) = a1 x1(k) + a2 x2(k) + · · · + an xn(k)
xi+1(k+1) = bi xi(k),  i = 1, 2, . . . , n − 1

The Leslie matrix has the following properties:

1. It has a unique positive eigenvalue λ1, for which the corresponding
eigenvector has only positive components;
2. This unique positive eigenvalue is the dominant eigenvalue;
3. The dominant eigenvalue λ1 represents the population growth rate,
while the corresponding eigenvector gives the limiting age distribution.



Example

Suppose the age limit of a certain animal population is 15 years and divide
the female population into 3 age groups. The Leslie matrix is given by

L = \begin{pmatrix} 0 & 4 & 3 \\ 1/2 & 0 & 0 \\ 0 & 1/4 & 0 \end{pmatrix}

If there are initially 1000 females in each age class, find the age
distribution after 15 years. Knowing that the dominant eigenvalue is
λ1 = 3/2, find the long-term age distribution.
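A sketch of the computation (each time step is L/n = 5 years, so 15 years corresponds to three applications of L; the long-term distribution is approximated by power iteration, consistent with the properties listed above):

```python
import numpy as np

L = np.array([[0.0, 4.0,  3.0],
              [0.5, 0.0,  0.0],
              [0.0, 0.25, 0.0]])

# Three steps of 5 years each = 15 years.
x = np.array([1000.0, 1000.0, 1000.0])
for _ in range(3):
    x = L @ x
print(x)             # age distribution after 15 years

# Long-term: power iteration on L (dominant eigenvalue 3/2).
v = np.array([1.0, 1.0, 1.0])
for _ in range(100):
    v = L @ v
    v /= np.max(np.abs(v))
print(v)             # proportional to the limiting age distribution
```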

