
Linear Algebra (3)

Tim Cootes

Singular Value Decomposition

• Any m x n matrix A with m ≥ n can be decomposed as follows:

    A = U W V^T

  where
  – U is m x n with orthogonal columns: U^T U = I_n
  – W is an n x n diagonal matrix, W = diag(w_1, w_2, ..., w_n)
  – V is n x n and orthonormal (a pure rotation): V V^T = V^T V = I_n

• The diagonal elements w_i of W are called the 'Singular Values' of A

Singular Value Decomposition

      A    =    U      W     V^T
    (m×n)     (m×n)  (n×n)  (n×n)

SVD and Eigenvectors

• Eigenvector decomposition is a special case of SVD for square, symmetric matrices:

    A = U W V^T

  If A = A^T then U = V and A = U W U^T

  – Columns of U are eigenvectors
  – Elements of W are eigenvalues
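As an illustrative sketch (not from the original slides; the filename and matrix are invented), the following script checks this correspondence for a symmetric positive-definite matrix. For symmetric matrices with negative eigenvalues, the singular values are the absolute values of the eigenvalues, with the sign absorbed into U or V.

demo_svd_eig.m

A = [4 1; 1 3];                    % symmetric, positive definite
[U, W, V] = svd(A);                % A = U*W*V'
[Q, D] = eig(A);                   % A = Q*D*Q'
disp(diag(W)');                    % singular values (descending)
disp(sort(diag(D), 'descend')');   % eigenvalues: the same values
disp(norm(A - U*W*V'));            % ~0: the decomposition reconstructs A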

Solving Regular LEs using SVD

Consider the case in which m = n and |A| = (∏ w_i) ≠ 0:

    A x = b
    U W V^T x = b
    (V W^-1 U^T) U W V^T x = (V W^-1 U^T) b
    V W^-1 (U^T U) W V^T x = V W^-1 U^T b
    V (W^-1 W) V^T x = V W^-1 U^T b
    V V^T x = V W^-1 U^T b
    x = V W^-1 U^T b

so A^-1 = V W^-1 U^T.

Solving LEs with SVD

    x = V W^-1 U^T b

• Can only fail if one of the singular values, w_i, is zero or very small
  – If so, the matrix is called 'singular'
• The condition number of a matrix is

    C = w_largest / w_smallest

• If C > 10^12 (when using doubles), then A is ill-conditioned, and care must be taken
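As a minimal sketch (not from the original slides; the filename and values are invented), this is the recipe in MATLAB, including the condition-number check:

solve_svd_demo.m

A = [3 1; 1 2];
b = [9; 8];
[U, W, V] = svd(A);
w = diag(W);
C = max(w) / min(w);               % condition number w_largest / w_smallest
if C > 1e12
    warning('A is ill-conditioned; results may be inaccurate');
end
x = V * diag(1 ./ w) * (U' * b);   % x = V W^-1 U^T b
disp(norm(A*x - b));               % residual ~0; x agrees with A \ b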

Geometric Interpretation

• For the m = n case, A x = U W V^T x applies three transformations in turn:

    x → V^T x    (pure rotation)
    x → W x      (scale along each axis)
    x → U x      (pure rotation)

[Figure: the unit vectors (1,0) and (0,1) rotated through θ by V^T, scaled by W, then rotated again by U]
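A minimal sketch (not from the original slides; the matrix and filename are invented) applying the three factors one at a time:

svd_geometry_demo.m

A = [2 1; 1 2];
[U, W, V] = svd(A);
x  = [1; 0];
x1 = V' * x;            % rotation
x2 = W  * x1;           % scale along each axis
x3 = U  * x2;           % rotation
disp(norm(A*x - x3));   % ~0: same as applying A directly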

Geometric Interpretation

• A diagonal matrix scales along the axes:

    x → diag(w_1, w_2) x

[Figure: (1,0) maps to (w_1,0) and (0,1) maps to (0,w_2)]

• What if one element is zero (or near-zero)?

Geometric Interpretation

• If one element is zero (or near-zero), volume is flattened in that direction:

    x → diag(0, w_2) x

[Figure: (1,0) maps to the origin and (0,1) maps to (0,w_2); the plane collapses onto one axis]

Geometric Interpretation

• If one singular value is near zero, the scaling step flattens the plane:

    x → V^T x           (rotation)
    x → diag(0, w_2) x  (scaling: one direction collapses)
    x → U x             (rotation)

  so A x = U W V^T x maps the whole plane onto a line.

Zero Singular Values

If one or more singular values is near zero:

• The transformation Ax flattens space onto a 'linear subspace'
  – A line or point in 2D
  – A plane, line or point in 3D
  – A hyperplane in n-D
• The columns of A are not linearly independent

2D Case

[Figure: two lines in the (x_1, x_2) plane, one plot per case]

• |Singular values| >> 0 and |determinant| >> 0: the lines intersect cleanly and the solution point is well defined
• Singular value near zero and determinant near zero: the lines are nearly parallel and the solution point becomes unstable

Rounding Errors

test_rounding.m

a = 1.0;
d = 1.0;
while abs(d - 1.0) < 1e-6      % loop until (a+1)-a stops being 1
    a = 10 * a;
    d = (a + 1.0) - a;         % exact arithmetic would always give 1
    disp(['a=' num2str(a) ' (a+1)-a=' num2str(d)]);
end

Output

a=10 (a+1)-a=1
a=100 (a+1)-a=1
a=1000 (a+1)-a=1
a=10000 (a+1)-a=1
a=100000 (a+1)-a=1
a=1000000 (a+1)-a=1
a=10000000 (a+1)-a=1
a=100000000 (a+1)-a=1
a=1000000000 (a+1)-a=1
a=1.000000e+010 (a+1)-a=1
a=1.000000e+011 (a+1)-a=1
a=1.000000e+012 (a+1)-a=1
a=1.000000e+013 (a+1)-a=1
a=1.000000e+014 (a+1)-a=1
a=1.000000e+015 (a+1)-a=1
a=1.000000e+016 (a+1)-a=0

Precision

test_precision.m

d = 1.0;
while (1.0 + d) > 1.0          % loop until d drops below machine precision
    d = d / 10.0;
    disp(['d=' num2str(d, 18) ' 1+d=' num2str(1 + d, 18)]);
end

Output

d=0.10000000000000001 1+d=1.1000000000000001
d=0.01 1+d=1.01
d=0.001 1+d=1.0009999999999999
d=0.0001 1+d=1.0001
d=1.0000000000000001e-005 1+d=1.0000100000000001
d=1.0000000000000002e-006 1+d=1.0000009999999999
d=1.0000000000000002e-007 1+d=1.0000001000000001
d=1.0000000000000002e-008 1+d=1.0000000099999999
d=1.0000000000000003e-009 1+d=1.0000000010000001
d=1.0000000000000003e-010 1+d=1.0000000001
d=1.0000000000000003e-011 1+d=1.00000000001
d=1.0000000000000002e-012 1+d=1.0000000000010001
d=1.0000000000000002e-013 1+d=1.0000000000000999
d=1.0000000000000002e-014 1+d=1.00000000000001
d=1.0000000000000001e-015 1+d=1.0000000000000011
d=1.0000000000000001e-016 1+d=1

Rank

• The rank of a matrix A is the number of linearly independent columns of A
  = the number of non-zero singular values of A
• An n x n matrix A is full rank if rank(A) = n
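A minimal sketch (not from the original slides; filename and matrix invented) counting singular values above a tolerance, in the same style as MATLAB's built-in rank():

rank_demo.m

A = [1 2 3; 2 4 6; 1 0 1];          % second row is twice the first
w = svd(A);
tol = max(size(A)) * eps(max(w));   % tolerance for 'non-zero'
r = sum(w > tol);
disp(r);                            % 2
disp(rank(A));                      % 2: agrees with the built-in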

Underdetermined Systems

• If there are fewer equations than unknowns (m < n), then the system is called underdetermined
  – Usually an infinite number of possible solutions

Underdetermined Systems

• Example: m = 2 equations, n = 3 unknowns, so

    A x = x_1 a_1 + x_2 a_2 + x_3 a_3 = b

  where a_1, a_2, a_3 and b are 2D vectors

[Figure: the three column vectors a_1, a_2, a_3 and the target b in the plane]

• Many possible solutions for x

Underdetermined Systems

• When m < n, Ax = b defines a linear subspace of solutions

[Figure: a line in the (x_1, x_2) plane; any point on the line satisfies the equations]

• We're often interested in the point that minimises |x|

Solving Linear Equations using SVD

For m ≥ n, solve Ax = b as follows:

    Apply SVD to obtain A = U W V^T
    Find the largest absolute singular value, w_max = max_{i=1..n} |w_i|
    Set a threshold, t = 10^-12 w_max
    Define z_i = 1/w_i if w_i > t, 0 otherwise
    Let Z = diag(z_1, z_2, ..., z_n)

The solution is then x = V Z U^T b
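A minimal sketch of this m ≥ n recipe (not from the original slides; filename and values invented):

svd_solve_mn_demo.m

A = [1 1; 1 2; 1 3];          % m = 3 equations, n = 2 unknowns
b = [2; 3; 5];
[U, W, V] = svd(A, 'econ');   % economy SVD: U is m x n, W and V are n x n
w = diag(W);
t = 1e-12 * max(abs(w));      % threshold t = 10^-12 * w_max
z = zeros(size(w));
z(w > t) = 1 ./ w(w > t);     % z_i = 1/w_i if w_i > t, else 0
x = V * diag(z) * (U' * b);   % x = V Z U^T b
disp(x);                      % agrees with pinv(A)*b and the least-squares A \ b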

Solving Linear Equations using SVD

For m < n, find a solution to Ax = b as follows:

    Apply SVD to A^T to obtain A^T = U W V^T, so A = V W U^T
    Find the largest absolute singular value, w_max = max_{i=1..n} |w_i|
    Set a threshold, t = 10^-12 w_max
    Define z_i = 1/w_i if w_i > t, 0 otherwise
    Let Z = diag(z_1, z_2, ..., z_n)

The solution is then x = U Z V^T b

Overdetermined Case

• If there are more equations than unknowns (m > n), then the system is called overdetermined

[Figure: in the ideal case the constraint lines meet at a single point in the (x_1, x_2) plane; in the real-world case they only nearly meet, and the solution point is somewhere around there]
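A minimal sketch of the m < n recipe (not from the original slides; filename and values invented); it returns the minimum-|x| solution:

svd_solve_underdet_demo.m

A = [1 2 3];                  % m = 1 equation, n = 3 unknowns
b = 6;
[U, W, V] = svd(A', 'econ');  % A^T = U W V^T, so A = V W U^T
w = diag(W);
t = 1e-12 * max(abs(w));
z = zeros(size(w));
z(w > t) = 1 ./ w(w > t);
x = U * diag(z) * (V' * b);   % x = U Z V^T b
disp(x');                     % [0.4286 0.8571 1.2857], and A*x = 6
disp(norm(x));                % no other solution has a smaller |x|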

Overdetermined Case

    A x = x_1 a_1 + ... + x_n a_n = b

• The a_i are the columns of A (m-D vectors)
• In general, b cannot be reached exactly if n < m
• So find the x which gets as close as possible

[Figure: m = 2 equations, n = 1 unknown; b lies off the line spanned by a_1, and x_1 a_1 is the closest point on that line to b]

Overdetermined Case

• The vector between Ax and b is Ax − b
• The square of the distance between Ax and b is

    d^2 = |Ax − b|^2 = (Ax − b)^T (Ax − b)

• The solution is usually defined as the x which minimises this distance

Finding the Minimum

    d^2 = (Ax − b)^T (Ax − b) = x^T A^T A x − 2 x^T A^T b + b^T b

Differentiating w.r.t. x and equating to zero gives

    A^T A x − A^T b = 0
    (A^T A) x = A^T b

A^T A is n x n and symmetric (but not necessarily positive definite or full rank).
Thus we could solve this square linear equation to find x.
However, in practice it is usually solved using the SVD of A.

Pseudo-inverse

• If A is an m x n matrix, then there exists an n x m matrix known as the Moore-Penrose pseudo-inverse, A^+
• A^+ b is the least-squares solution to Ax = b
• If the inverse of A^T A exists, then A^+ = (A^T A)^-1 A^T
• A^+ satisfies:

    A A^+ A = A          (A A^+)^T = A A^+
    A^+ A A^+ = A^+      (A^+ A)^T = A^+ A
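A minimal sketch (not from the original slides; filename and data invented) showing that the normal equations and the pseudo-inverse give the same least-squares answer:

lsq_demo.m

A = [1 1; 1 2; 1 3; 1 4];        % m = 4 equations, n = 2 unknowns
b = [1.1; 1.9; 3.2; 3.9];
x_normal = (A' * A) \ (A' * b);  % solve (A^T A) x = A^T b
x_pinv   = pinv(A) * b;          % x = A^+ b (pinv is computed via SVD)
disp([x_normal x_pinv]);         % the two columns match up to rounding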

Implementation Issues

• Never solve linear equations by computing the inverse
• Perform a decomposition, then use backsubstitution
  – More numerically stable

Solving Ax=b

• If m ≠ n: solve using SVD
• If m = n and A is symmetric (A = A^T): use a Cholesky decomposition, A = L L^T
  – Successful and |A| > 0: solve using Cholesky backsubstitution
  – Failed or |A| near 0: solve using SVD
• If m = n and A is not symmetric (A ≠ A^T): use an LU decomposition, A = L U
  – Successful and |A| not 0: solve using LU backsubstitution
  – Failed or |A| near 0: solve using SVD
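A minimal sketch of decompose-then-backsubstitute (not from the original slides; filename and matrices invented):

decomp_solve_demo.m

A = [4 1; 1 3];              % symmetric positive definite
b = [1; 2];
R = chol(A);                 % Cholesky: A = R'*R, R upper triangular
x = R \ (R' \ b);            % two triangular backsubstitutions
disp(norm(A*x - b));         % ~0

B = [0 2; 3 1];              % square but not symmetric
[L, T, P] = lu(B);           % LU with pivoting: P*B = L*T, T upper triangular
y = T \ (L \ (P * b));
disp(norm(B*y - b));         % ~0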

Iterative Improvement

• The solution to Ax = b can contain rounding errors
• Suppose x_i is the current estimate of the solution to Ax = b
• Then a better solution is

    x_{i+1} = x_i − dx_i

  where dx_i is the solution to A (dx_i) = (A x_i − b)
  (A x_i − b is the current estimate of the error)

QR Decomposition

• Any matrix A can be decomposed as

      A    =    Q      R
    (m×n)     (m×n)  (n×n)

  where Q has orthogonal columns (Q^T Q = I_n) and R is upper triangular
• To solve Ax = b:

    A x = Q R x = b
    R x = Q^T b

  R is triangular, so simple to solve
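A minimal sketch of both ideas (not from the original slides; filename and matrix invented). Note that iterative improvement pays off most when the residual A x_i − b is computed in higher precision than the solve:

refine_qr_demo.m

A = hilb(8);                 % a notoriously ill-conditioned square matrix
b = A * ones(8, 1);          % exact solution is all ones
x = A \ b;                   % initial solve, contaminated by rounding error
dx = A \ (A*x - b);          % solve A(dx) = Ax - b for the error estimate
x = x - dx;                  % x_{i+1} = x_i - dx_i
disp(norm(x - ones(8, 1)));  % error after one refinement step

[Q, R] = qr(A);              % QR-based solve: Rx = Q'b
xq = R \ (Q' * b);
disp(norm(A*xq - b));        % small residual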

Linear Subspaces

• Suppose we have a set of k linearly independent n-D vectors {a_i}, i = 1..k, with k < n
• These span a k-D linear subspace
• Any point in this subspace can be written

    p = x_1 a_1 + ... + x_k a_k = A x

• Any point in the space can be uniquely defined by k parameters

1D Linear Subspace

• For instance, in 2D space, {a_1 = (1, 1)^T} with k = 1
  – spans a 1-D subspace of 2D space
  – any point in it can be written p = x_1 a_1

(Manifolds)

• Linear subspaces are special cases of manifolds
• A manifold is a possibly curved structure of lower intrinsic dimensionality than that of the space in which it is embedded
  – Points on a circle: 1D manifold in 2D space
  – Surface of a sphere: 2D manifold in 3D space
• Intrinsic dimensionality: the minimum number of parameters required to uniquely define any point in the space

Linear Subspaces

• The k linearly independent n-D vectors {a_i}, i = 1..k, are said to define a basis for the subspace (they can be thought of as a set of axes)
• If the vectors are of unit length and mutually orthogonal, they are said to define an orthonormal basis for the subspace

Orthonormal Basis

• Given an arbitrary set of basis vectors, we can generate an orthonormal basis using SVD:

    (a_1 ... a_k) = A = U W V^T = (u_1 ... u_k) W V^T

• The columns of U give an orthonormal basis for the subspace

Orthonormal Basis

• If p is a point in the subspace, then

    p = x_1 u_1 + ... + x_k u_k = U x
    x = U^T p

• The projection of p along the line through u is u.p = u^T p

[Figure: p at angle θ to the unit vector u; the projection onto u has length u.p]
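A minimal sketch (not from the original slides; filename and vectors invented) building an orthonormal basis for the span of two 3-D vectors:

basis_demo.m

a1 = [1; 1; 0];
a2 = [1; 2; 0];               % both lie in the z = 0 plane
A = [a1 a2];
[U, W, V] = svd(A, 'econ');   % columns of U: orthonormal basis for the span
disp(U' * U);                 % identity, so the basis is orthonormal
p = 2*a1 - 3*a2;              % a point in the subspace
x = U' * p;                   % its coordinates in the new basis
disp(norm(U*x - p));          % ~0: p is exactly representable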

Distance to Subspace

• The nearest point in the subspace to any n-D point p is

    p' = U x = U U^T p

[Figure: in the 1D case, p' = x u = (u^T p) u, with d the distance from p to p']

Distance to Subspace

• The distance from any n-D point p to the nearest point in the subspace is given by

    d^2 = |p − p'|^2 = |p|^2 − |x|^2   where x = U^T p

  since |p|^2 = |p'|^2 + d^2 = |x|^2 + d^2 (by Pythagoras, and |p'| = |Ux| = |x| because the columns of U are orthonormal)
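A minimal sketch (not from the original slides; filename and values invented):

distance_demo.m

U = [1 0; 0 1; 0 0];          % orthonormal basis for the z = 0 plane in 3-D
p = [1; 2; 3];
x = U' * p;                   % coordinates of the projection
pp = U * x;                   % nearest point p' = U U^T p
d2 = norm(p)^2 - norm(x)^2;   % d^2 = |p|^2 - |x|^2
disp(sqrt(d2));               % 3
disp(norm(p - pp));           % 3: matches the direct distance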
