
Deterministic Optimization
Nonlinear Optimization Modeling – Approximation and Fitting

Andy Sun
Assistant Professor
Stewart School of Industrial and Systems Engineering

Normal Equation and Singular Value Decomposition
Nonlinear Optimization Modeling
Learning Objectives for this lesson

• Derive the optimality condition for the least squares problem
• Discover a very important matrix factorization technique, called the Singular Value Decomposition (SVD)
• Use the SVD to solve the least squares optimality condition
• Least squares minimization problem:

  minₓ ‖𝐴𝑥 − 𝑏‖₂²

• Equivalently: minₓ (𝐴𝑥 − 𝑏)ᵀ(𝐴𝑥 − 𝑏) = 𝑥ᵀ𝐴ᵀ𝐴𝑥 − 2(𝐴ᵀ𝑏)ᵀ𝑥 + 𝑏ᵀ𝑏

• Optimality condition:
  • ∇(𝐴𝑥 − 𝑏)ᵀ(𝐴𝑥 − 𝑏) = 0 ⇒ 𝐴ᵀ𝐴𝑥 = 𝐴ᵀ𝑏
  • Normal equation: 𝐴ᵀ𝐴𝑥 = 𝐴ᵀ𝑏

• Solving the least squares problem = solving the normal equation
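
A minimal numerical sketch of the normal equation in Python with NumPy (the matrix 𝐴 and vector 𝑏 below are made-up illustration data, not from the lecture):

    import numpy as np

    # An overdetermined system: 5 equations, 2 unknowns (made-up data).
    A = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0],
                  [1.0, 4.0],
                  [1.0, 5.0]])
    b = np.array([1.1, 1.9, 3.2, 3.9, 5.1])

    # Form and solve the normal equation A^T A x = A^T b.
    x = np.linalg.solve(A.T @ A, A.T @ b)
    print(x)  # the least squares solution

Note that np.linalg.solve factors 𝐴ᵀ𝐴 rather than inverting it explicitly, which is the standard way to solve a small normal equation.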


Solution to Normal Equation
• We wanted to “solve” 𝐴𝑥 = 𝑏, but we couldn’t: 𝐴 has more rows than columns, so in general no exact solution exists

• Instead of solving 𝐴𝑥 = 𝑏, we solve 𝐴ᵀ𝐴𝑥 = 𝐴ᵀ𝑏

• Since 𝐴 has full column rank, 𝐴ᵀ𝐴 is invertible

• So the optimal solution to the normal equation, and thus to the least squares problem, is:
  • 𝒙 = (𝑨ᵀ𝑨)⁻¹𝑨ᵀ𝒃
Moore-Penrose Pseudoinverse
• For a matrix 𝐴 ∈ ℝᵐˣⁿ with 𝑚 ≥ 𝑛 and full rank, the Moore-Penrose pseudoinverse of 𝐴 is defined as
  • 𝐴† = (𝐴ᵀ𝐴)⁻¹𝐴ᵀ
• The solution of the least squares problem minₓ ‖𝐴𝑥 − 𝑏‖₂ can be written as
  • 𝒙∗ = 𝑨†𝒃
• If 𝐴 is an invertible square matrix, then 𝐴† = 𝐴⁻¹(𝐴ᵀ)⁻¹𝐴ᵀ = 𝐴⁻¹
• The naïve way to compute 𝐴† is to first invert 𝐴ᵀ𝐴 to get (𝐴ᵀ𝐴)⁻¹, then multiply by 𝐴ᵀ
• However, matrix inversion can be time-consuming
• There is a more efficient way: the singular value decomposition
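
As a sanity check (a sketch assuming the same made-up 𝐴 as above), NumPy’s np.linalg.pinv, which computes the pseudoinverse via the SVD internally, agrees with the (𝐴ᵀ𝐴)⁻¹𝐴ᵀ formula whenever 𝐴 has full column rank:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0],
                  [1.0, 4.0],
                  [1.0, 5.0]])

    # Naive formula: invert A^T A, then multiply by A^T.
    pinv_naive = np.linalg.inv(A.T @ A) @ A.T

    # Library routine (uses the SVD internally).
    pinv_svd = np.linalg.pinv(A)

    print(np.allclose(pinv_naive, pinv_svd))  # True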
Singular Value Decomposition
• Theorem: Let 𝐴 be an 𝑚-by-𝑛 matrix with 𝑚 ≥ 𝑛. Then we can write 𝐴 as 𝐴 = 𝑈Σ𝑉ᵀ = ∑ᵢ₌₁ⁿ 𝜎ᵢ𝑢ᵢ𝑣ᵢᵀ, where
  • 𝑈 is an 𝑚-by-𝑛 matrix satisfying 𝑈ᵀ𝑈 = 𝐼,
  • 𝑉 is an 𝑛-by-𝑛 matrix satisfying 𝑉ᵀ𝑉 = 𝐼,
  • Σ = diag(𝜎₁, …, 𝜎ₙ) with 𝜎₁ ≥ 𝜎₂ ≥ ⋯ ≥ 𝜎ₙ ≥ 0

• Columns 𝑢₁, …, 𝑢ₙ of 𝑈 are called the left singular vectors of 𝐴
• Columns 𝑣₁, …, 𝑣ₙ of 𝑉 are called the right singular vectors of 𝐴
• 𝜎₁, …, 𝜎ₙ are called the singular values of 𝐴
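
A short sketch of this “thin” SVD in NumPy (same made-up 𝐴; full_matrices=False yields exactly the 𝑚-by-𝑛 𝑈 and 𝑛-by-𝑛 𝑉 of the theorem):

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0],
                  [1.0, 4.0],
                  [1.0, 5.0]])

    # Thin SVD: U is m-by-n, s holds the singular values (descending), Vt = V^T.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    print(s)                                         # sigma_1 >= ... >= sigma_n >= 0
    print(np.allclose(U.T @ U, np.eye(A.shape[1])))  # U^T U = I
    print(np.allclose(A, U @ np.diag(s) @ Vt))       # A = U Sigma V^T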
Relation to Eigendecomposition
Suppose the SVD of 𝐴 is 𝐴 = 𝑈Σ𝑉ᵀ
• If 𝐴 is a symmetric matrix with eigenvalues 𝜆₁, …, 𝜆ₙ and orthonormal eigenvectors 𝑢₁, …, 𝑢ₙ, i.e. 𝐴 = 𝑈Λ𝑈ᵀ, where 𝑈 = [𝑢₁, …, 𝑢ₙ], then 𝐴 = 𝑈Σ𝑉ᵀ, where 𝜎ᵢ = |𝜆ᵢ| and 𝑣ᵢ = sign(𝜆ᵢ)𝑢ᵢ

• Eigenvalues of 𝐴ᵀ𝐴 are 𝜎₁², …, 𝜎ₙ², and its eigenvectors are 𝑣₁, …, 𝑣ₙ
  • 𝐴ᵀ𝐴 = 𝑉Σ𝑈ᵀ𝑈Σ𝑉ᵀ = 𝑉Σ²𝑉ᵀ
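
A quick numerical check of this relation (a sketch, still using the made-up 𝐴): the eigenvalues of 𝐴ᵀ𝐴 should equal the squared singular values of 𝐴:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0],
                  [1.0, 4.0],
                  [1.0, 5.0]])

    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # eigh returns the eigenvalues of the symmetric matrix A^T A in ascending order.
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)

    print(np.allclose(np.sort(s**2), eigvals))  # eigenvalues of A^T A are sigma_i^2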
Use SVD to Solve Least Squares
• Solution to least squares minₓ ‖𝐴𝑥 − 𝑏‖₂:
  • 𝑥∗ = 𝐴†𝑏 = (𝐴ᵀ𝐴)⁻¹𝐴ᵀ𝑏

• Since 𝐴ᵀ𝐴 = 𝑉Σ²𝑉ᵀ, we have (𝐴ᵀ𝐴)⁻¹ = 𝑉Σ⁻²𝑉ᵀ

• (𝐴ᵀ𝐴)⁻¹𝐴ᵀ = 𝑉Σ⁻²𝑉ᵀ𝑉Σ𝑈ᵀ = 𝑉Σ⁻¹𝑈ᵀ

• This can be computed efficiently once the SVD is available

• Therefore, 𝑥∗ = 𝑉Σ⁻¹𝑈ᵀ𝑏
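
Putting the pieces together (a sketch with the same made-up 𝐴 and 𝑏): the SVD-based formula 𝑥∗ = 𝑉Σ⁻¹𝑈ᵀ𝑏 matches NumPy’s built-in least squares solver:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0],
                  [1.0, 4.0],
                  [1.0, 5.0]])
    b = np.array([1.1, 1.9, 3.2, 3.9, 5.1])

    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # x* = V Sigma^{-1} U^T b; applying Sigma^{-1} is just elementwise division by s.
    x_svd = Vt.T @ ((U.T @ b) / s)

    # Reference solution from NumPy's least squares routine.
    x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)

    print(np.allclose(x_svd, x_ref))  # True

Applying Σ⁻¹ only requires dividing by each singular value, which is why the SVD route avoids any explicit matrix inversion.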
Summary

• We learned:
  – Optimality condition and normal equation for the least squares problem
  – An important matrix decomposition: the singular value decomposition
  – How the SVD is related to the eigenvalue decomposition
  – How the SVD is used to solve least squares
