
LECTURE 4

The Singular Value Decomposition

Objective: The SVD is a matrix factorization with many
applications, e.g., information retrieval, least-squares
problems, and image processing. It is so important that
we introduce it at this early stage.

4-1
Geometric observation
The SVD is motivated by the following geometric fact:
The image of the unit sphere under any m × n matrix
is a hyperellipse.
(The unit sphere in m dimensions is more accurately
called a hypersphere!)
The sphere maps into the hyperellipse by stretching the
sphere by (possibly zero) factors

σ1, σ2, . . . , σm ∈ R

in orthogonal directions

u1, u2, . . . , um ∈ Rm.

It is convenient to take the ui to be unit vectors.


Then, the vectors σ1u1, σ2u2, . . . , σmum can be viewed as
the principal semi-axes of the hyperellipse, with lengths

σ1, σ2, . . . , σm.

If A has rank r, then exactly r of the lengths σi will be
nonzero. In particular, if m ≥ n, then at most n of the
σi will be nonzero.
These facts are not obvious, but for now just assume they
are true!
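The geometric fact can be checked numerically. The following sketch (a hypothetical 2 × 2 matrix, values chosen only for illustration) maps points of the unit circle through A and verifies that, in the basis of the directions ui, every image point satisfies the ellipse equation with semi-axis lengths σi:

```python
import numpy as np

# Hypothetical 2x2 example: verify that A maps the unit circle
# onto an ellipse whose principal semi-axes are sigma_i * u_i.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

U, s, Vt = np.linalg.svd(A)

# Sample points on the unit circle and push them through A.
theta = np.linspace(0.0, 2.0 * np.pi, 400)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # 2 x 400
image = A @ circle

# Coordinates of the image points in the (u1, u2) basis.
c = U.T @ image

# Every image point satisfies (c1/sigma1)^2 + (c2/sigma2)^2 = 1,
# the equation of an ellipse with semi-axis lengths sigma1, sigma2.
ellipse_eq = (c[0] / s[0])**2 + (c[1] / s[1])**2
assert np.allclose(ellipse_eq, 1.0)
```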

4-2
Here is the basic picture:
Let S be the unit sphere in Rn.
Choose any A ∈ Rm×n with m ≥ n.
For simplicity, let rank(A) = n (full rank).
The image AS is a hyperellipse in Rm with the following
properties:

1. The n singular values σ1, σ2, . . . , σn of A are the
lengths of the principal semi-axes of AS. By convention,
we assume the σi are ordered from largest to smallest,
i.e., σ1 ≥ σ2 ≥ · · · ≥ σn > 0
(why σi > 0?)
2. The set of unit vectors {u1, u2, . . . , un} are the n
left singular vectors of A. They are the directions
of the principal semi-axes of AS corresponding to
the n singular values.
∴ σiui is the ith largest principal semi-axis of AS.
3. The set of unit vectors {v1, v2, . . . , vn} are the n
right singular vectors of A. They are the “pre-
images” of the ui numbered accordingly.
i.e., Avj = σj uj
(Why they are called “left” and “right” will be explained below.)
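The defining relation Avj = σjuj can be verified column by column. A minimal NumPy check, using a small full-rank matrix with values chosen only for illustration:

```python
import numpy as np

# Verify A v_j = sigma_j u_j for each j, for a small full-rank
# matrix (entries are arbitrary illustrative values).
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [4.0, 0.0]])          # m = 3, n = 2, rank 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T                            # columns of V are the v_j

for j in range(A.shape[1]):
    # The right singular vector v_j is the pre-image of
    # the semi-axis direction u_j, scaled by sigma_j.
    assert np.allclose(A @ V[:, j], s[j] * U[:, j])
```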

4-3
Reduced SVD
The equations relating the right singular vectors {vj } and
the left singular vectors {uj } are
Avj = σj uj j = 1, 2, . . . , n
i.e.,
A [v1 v2 · · · vn] = [u1 u2 · · · un] diag(σ1, σ2, . . . , σn)
or
AV = Û Σ̂

Note
Σ̂ ∈ Rn×n is diagonal with positive entries.
Û ∈ Rm×n with orthonormal columns.
V ∈ Rn×n with orthonormal columns.
(⇒ V is orthogonal so V −1 = V T )

∴ A = Û Σ̂V T
This factorization is called the reduced SVD of A
(We’ll see why it is called “reduced” in a minute.)
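In NumPy, the reduced SVD is obtained with `full_matrices=False`. A quick sketch (random matrix, sizes chosen only for illustration) confirming the shapes, the orthonormality of the columns, and the factorization A = Û Σ̂V T:

```python
import numpy as np

# Reduced SVD: U_hat is m x n, Sigma_hat is n x n, V is n x n.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))     # m = 5, n = 3

U_hat, s, Vt = np.linalg.svd(A, full_matrices=False)
Sigma_hat = np.diag(s)

assert U_hat.shape == (5, 3) and Sigma_hat.shape == (3, 3)

# Columns of U_hat are orthonormal; V is square, hence orthogonal.
assert np.allclose(U_hat.T @ U_hat, np.eye(3))
assert np.allclose(Vt @ Vt.T, np.eye(3))

# The factorization A = U_hat Sigma_hat V^T holds.
assert np.allclose(A, U_hat @ Sigma_hat @ Vt)
```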
4-4
Full SVD
In practice, SVD is mostly used in its reduced form. But
there is a “full SVD” that is more standard.
The idea of the full SVD is as follows:
Û is made up of n orthonormal vectors in Rm.
So, unless m = n, this cannot be a basis for Rm, and Û
cannot be orthogonal.
But, we can make Û orthogonal by adding m − n addi-
tional orthonormal columns.

U := [Û un+1 · · · um]

We then also have to augment Σ̂:

→ for the factorization not to change, the last m − n
columns of U must be multiplied by zero, i.e., Σ̂ is
extended with m − n rows of zeros to form Σ.
This leads to the full SVD

A = U ΣV T

where U ∈ Rm×m, V ∈ Rn×n are both orthogonal, and
Σ ∈ Rm×n.
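The full SVD is NumPy's default (`full_matrices=True`). A brief sketch, again with an arbitrary random matrix, showing that U is square and orthogonal and that the last m − n columns of U are indeed multiplied by the zero rows appended to Σ̂:

```python
import numpy as np

# Full SVD: U is m x m, Sigma is m x n (same shape as A), V is n x n.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))     # m = 5, n = 3

U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Build Sigma by padding diag(s) with m - n rows of zeros,
# so the extra columns of U contribute nothing to the product.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

assert U.shape == (5, 5) and Sigma.shape == (5, 3) and Vt.shape == (3, 3)
assert np.allclose(U @ U.T, np.eye(5))     # U is now orthogonal
assert np.allclose(A, U @ Sigma @ Vt)
```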

4-5
In this framework, we no longer need to assume A has
full rank.
If rank(A) = r < n, only r singular vectors of A are
determined by the geometry of the hyperellipse and we
will have to add m − r orthonormal columns to U and
n − r orthonormal columns to V . Σ will have r positive
diagonal entries.
Alternatively, in this case it is possible to further reduce
the reduced SVD of A by letting Û ∈ Rm×r , Σ̂ ∈ Rr×r ,
and V̂ T ∈ Rr×n.
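The rank-deficient case can be illustrated with a rank-1 matrix (built here as an outer product, values arbitrary): exactly r = 1 singular value is nonzero, and the compact rank-r factors already reproduce A.

```python
import numpy as np

# A rank-1 matrix (outer product) has exactly one nonzero
# singular value, and the compact SVD with r = 1 recovers A.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.0, 0.0, -1.0])
A = np.outer(a, b)                    # 4 x 3, rank(A) = 1

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))            # count nonzero singular values
assert r == 1

# Compact SVD: keep only the r columns/rows carrying
# nonzero singular values.
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]
assert np.allclose(A, A_r)
```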

4-6
Formal definition of SVD
Given A ∈ Rm×n, the SVD of A is a factorization of the
form
A = U ΣV T

where U ∈ Rm×m, V ∈ Rn×n are orthogonal, and Σ ∈
Rm×n is “diagonal”.

Note
1. There is no assumption that m ≥ n or that A
has full rank.
2. All diagonal elements of Σ are non-negative and
in non-increasing order:

σ1 ≥ σ2 ≥ . . . ≥ σp ≥ 0

where p = min (m, n).


3. Σ has the same shape as A, but U , V are square.

4-7
Existence and uniqueness of the SVD

Theorem
Every matrix A ∈ Rm×n has a singular value decompo-
sition A = U ΣV T .
The singular values {σj } are uniquely determined.
If A is square and σi ≠ σj for all i ≠ j, the left singular
vectors {uj } and the right singular vectors {vj } are
uniquely determined to within a factor of ±1.
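Both halves of the theorem can be checked numerically. The singular values equal the square roots of the eigenvalues of ATA however the SVD is computed, while the singular vectors are only determined up to sign: flipping uj and vj together leaves the product unchanged. A sketch with an arbitrary random square matrix:

```python
import numpy as np

# Uniqueness of singular values: they are the square roots of
# the eigenvalues of A^T A, independent of how the SVD is formed.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))

U, s, Vt = np.linalg.svd(A)
eigs = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(s, np.sqrt(np.maximum(eigs, 0.0)))

# Non-uniqueness of singular vectors: negating u_j and v_j in
# pairs gives another valid SVD of the same matrix.
U2, Vt2 = -U, -Vt
assert np.allclose(A, U2 @ np.diag(s) @ Vt2)
```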

4-8
