Advanced Engineering Mathematics


Lecture 3

EE 506: Engineering Mathematics

UET, Lahore

November 2, 2023
Outline

Least Squares

Duality

Nonlinear Problems with Equality Constraints

Tangent and Normal Spaces

Lagrange Condition

Karush-Kuhn-Tucker Condition

Least Squares 2
Least Squares Analysis

Consider a system of linear equations Ax = b.

▶ Here A ∈ R^(m×n), b ∈ R^m, m ≥ n, and rank A = n.
▶ If b ∉ R(A), then it is an overdetermined system with no solution.
▶ The goal is to find the x that minimizes ∥Ax − b∥₂², known as the least squares solution.
▶ Let f(x) = ∥Ax − b∥₂² = ½ x⊤(2A⊤A)x − x⊤(2A⊤b) + b⊤b
▶ The unique minimizer of f is obtained by solving the FONC

    ∇f(x) = 2A⊤Ax − 2A⊤b = 0

▶ The solution is x∗ = (A⊤A)⁻¹A⊤b
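As a quick numerical check (a minimal NumPy sketch; the matrix A and vector b below are made-up illustrative data), the closed-form solution (A⊤A)⁻¹A⊤b agrees with a library least-squares solver, and the FONC gradient vanishes at it:

```python
import numpy as np

# Overdetermined system: m = 4 equations, n = 2 unknowns (illustrative data)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])

# Closed-form least-squares solution x* = (A^T A)^{-1} A^T b
x_star = np.linalg.solve(A.T @ A, A.T @ b)

# Cross-check against NumPy's least-squares routine
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_star, x_lstsq))                      # same minimizer
print(np.allclose(2 * A.T @ A @ x_star - 2 * A.T @ b, 0))  # FONC holds
```

Solving the normal equations with `np.linalg.solve` (rather than forming the inverse explicitly) is the standard numerically safer way to evaluate the closed form.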

Geometric Interpretation of LS

▶ Columns of A span the range space R(A), which is an n-dimensional subspace of R^m.
▶ Ax = b has a solution iff b lies in R(A).
▶ If m > n, we wish to find the point h ∈ R(A) that is "closest" to b.
▶ Geometrically, the vector e = h − b is orthogonal to R(A).
▶ The h ∈ R(A) minimizing ∥b − h∥ is exactly the orthogonal projection of b onto R(A). It means that ⟨e, x1a1 + ... + xnan⟩ = 0.
▶ If h ∈ R(A) is such that h − b is orthogonal to R(A), then h = Ax∗ = A(A⊤A)⁻¹A⊤b.

Orthogonal Projectors

▶ Let V ⊂ R^n be a subspace. Then, given x ∈ R^n, its orthogonal decomposition is

    x = x_V + x_V⊥

  where x_V is the orthogonal projection of x onto V.
▶ We can write x_V = Px, where P is called the orthogonal projector.
▶ If V = R(A), then x_V = A(A⊤A)⁻¹A⊤x, so P = A(A⊤A)⁻¹A⊤. We can also write

    x_V = arg min_{y ∈ V} ∥y − x∥

▶ What will P be if V = N(A)?
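The projector P = A(A⊤A)⁻¹A⊤ has two defining properties, idempotence (P² = P) and symmetry (P⊤ = P), and leaves a residual orthogonal to R(A). A minimal NumPy sketch, with an arbitrary full-column-rank A as illustrative data:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])  # full column rank, so A^T A is invertible

# Orthogonal projector onto R(A)
P = A @ np.linalg.solve(A.T @ A, A.T)

print(np.allclose(P @ P, P))  # idempotent: projecting twice changes nothing
print(np.allclose(P.T, P))    # symmetric

# The residual x - Px is orthogonal to every column of A
x = np.array([1.0, -2.0, 0.5])
print(np.allclose(A.T @ (x - P @ x), 0))
```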


Duality 6
Dual Linear Programs

▶ Associated with every linear programming problem is a corresponding dual linear programming problem.
▶ The dual is constructed from the cost and constraints of the primal problem.
▶ Solving an LP problem via its dual may be simpler.
▶ Consider an LP (primal):

    minimize   c⊤x
    subject to Ax ≥ b
               x ≥ 0

  Its corresponding dual problem is

    maximize   λ⊤b
    subject to λ⊤A ≤ c⊤
               λ ≥ 0

  where λ ∈ R^m is the dual vector. The roles of b and c are reversed.
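A small numerical illustration of this primal-dual pair (a sketch assuming SciPy is available; the data c, A, b are made up): solving both problems shows the primal and dual optimal objective values coincide.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data: minimize 2*x1 + 3*x2  s.t.  x1 + x2 >= 4, x >= 0
c = np.array([2.0, 3.0])
A = np.array([[1.0, 1.0]])
b = np.array([4.0])

# Primal: linprog minimizes with A_ub @ x <= b_ub, so flip the sign of Ax >= b
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)

# Dual: maximize b^T lam  s.t.  A^T lam <= c, lam >= 0
# (maximize by minimizing the negated objective)
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)])

print(primal.fun, -dual.fun)  # both objective values agree at the optimum
```

Here the primal optimum is x = (4, 0) with value 8, and the dual optimum is λ = 2 with the same value, as duality predicts.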


Converting an LP to its Dual Form

Consider an LP (primal) in standard form:

    minimize   c⊤x
    subject to Ax = b
               x ≥ 0

▶ Ax = b is equivalent to the pair of inequalities

    Ax ≥ b
    −Ax ≥ −b

  Thus, the original program becomes

    minimize   c⊤x
    subject to [A; −A] x ≥ [b; −b]
               x ≥ 0

  where [A; −A] denotes the two blocks stacked vertically.
▶ This is a primal problem in the symmetric form of duality.
Converting an LP to its Dual Form

The corresponding dual is

    maximize   [u⊤ v⊤] [b; −b]
    subject to [u⊤ v⊤] [A; −A] ≤ c⊤
               u, v ≥ 0

or, after simple manipulations,

    maximize   (u − v)⊤b
    subject to (u − v)⊤A ≤ c⊤
               u, v ≥ 0

Letting λ = u − v,

    maximize   λ⊤b
    subject to λ⊤A ≤ c⊤


Nonlinear Problems with Equality Constraints 10


Problem Formulation

We consider a nonlinear optimization problem with only equality constraints:

    minimize   f(x)
    subject to h(x) = 0

where x ∈ R^n, f : R^n → R, h : R^n → R^m, and m ≤ n.



Regular Point

▶ Regular Point: A point x∗ satisfying h1(x∗) = 0, ..., hm(x∗) = 0 is said to be a regular point of the constraints if the gradient vectors ∇h1(x∗), ..., ∇hm(x∗) are linearly independent.
▶ The Jacobian matrix of h at x∗ is

    Dh(x∗) = [∇h1(x∗)⊤; ...; ∇hm(x∗)⊤]

  (the gradients stacked as rows). Then x∗ is regular iff rank Dh(x∗) = m (full rank).



Geometry of Equality Constraints

▶ The set of equality constraints hi(x) = 0, i = 1, ..., m describes a surface

    S = {x ∈ R^n : h1(x) = 0, ..., hm(x) = 0}

  of dimension n − m if the points in S are regular.
▶ Example: Let n = 3 and m = 1 (operating in R³). The set S is a two-dimensional surface.
  – Let h1(x) = x2 − x3² = 0, with ∇h1(x) = [0, 1, −2x3]⊤.
  – Hence, for any x ∈ R³, ∇h1(x) ≠ 0.
  – So dim S = dim{x : h1(x) = 0} = n − m = 2.



Geometry of Equality Constraints

▶ Example: Let n = 3 and m = 2 (operating in R³). The feasible set S is a one-dimensional object (a curve in R³).
  – Let h1(x) = x1, h2(x) = x2 − x3² = 0, with ∇h1(x) = [1, 0, 0]⊤ and ∇h2(x) = [0, 1, −2x3]⊤. The vectors ∇h1(x) and ∇h2(x) are linearly independent in R³.
  – So dim S = dim{x : h1(x) = 0, h2(x) = 0} = n − m = 1.
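The regularity check in the m = 2 example can be automated: stack the constraint gradients into the Jacobian and test its rank. A minimal NumPy sketch (the feasible point chosen below is illustrative):

```python
import numpy as np

def jacobian(x):
    """Dh(x) for h1(x) = x1, h2(x) = x2 - x3**2 (the example above)."""
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, -2.0 * x[2]]])

n, m = 3, 2
x = np.array([0.0, 4.0, 2.0])  # a feasible point: x1 = 0, x2 = x3**2

J = jacobian(x)
print(np.linalg.matrix_rank(J) == m)  # full rank => x is a regular point
print(n - m)                          # dimension of the feasible curve S
```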




Tangent and Normal Spaces 15


Curve on a Surface

Definition: A curve C on a surface S is a set of points {x(t) ∈ S : t ∈ (a, b)}, continuously parameterised by t ∈ (a, b); that is, x : (a, b) → S is a continuous function.

▶ The curve C passes through a point x∗ if there exists t∗ ∈ (a, b) such that x(t∗) = x∗.
▶ C is the path traversed by a point x traveling on the surface S.
▶ The position of the point at time t is given by x(t).



Differentiable Curve

Definition: The curve C = {x(t) : t ∈ (a, b)} is differentiable if

    ẋ(t) = dx(t)/dt = [ẋ1(t), ..., ẋn(t)]⊤

exists for all t ∈ (a, b).

▶ Both ẋ(t) and ẍ(t) are n-dimensional vectors.
▶ ẋ(t) points in the direction of the instantaneous motion of x(t).
▶ The vector ẋ(t∗) is tangent to the curve C at x∗.



Tangent Space

Definition: The tangent space at a point x∗ on the surface S = {x ∈ R^n : h(x) = 0} is the set T(x∗) = {y : Dh(x∗)y = 0}.

▶ T(x∗) = N(Dh(x∗)), and is therefore a subspace of R^n.
▶ Assuming that x∗ is a regular point, the dimension of the tangent space is n − m, where m is the number of equality constraints.



Tangent Space

Example: Let S = {x ∈ R³ : h1(x) = x1 = 0, h2(x) = x1 − x2 = 0}.

▶ We have

    Dh(x) = [∇h1(x)⊤; ∇h2(x)⊤] = [1 0 0; 1 −1 0]

▶ Because ∇h1 and ∇h2 are linearly independent, all points in S are regular.
▶ The tangent space is

    T(x) = {y : ∇h1(x)⊤y = 0, ∇h2(x)⊤y = 0}
         = {y : [1 0 0; 1 −1 0] y = 0}
         = {[0, 0, α]⊤ : α ∈ R}
         = the x3-axis in R³
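Numerically, T(x) = N(Dh(x)) can be computed as the null space of the Jacobian, e.g. via the SVD. A minimal NumPy sketch of the example above:

```python
import numpy as np

Dh = np.array([[1.0, 0.0, 0.0],
               [1.0, -1.0, 0.0]])

# SVD: the rows of Vt beyond rank(Dh) span the null space N(Dh) = T(x)
_, s, Vt = np.linalg.svd(Dh)
rank = int(np.sum(s > 1e-12))
basis = Vt[rank:]                    # here a single vector, proportional to e3

print(Dh.shape[1] - rank)            # dim T(x) = n - m = 1
print(np.allclose(Dh @ basis.T, 0))  # basis vectors satisfy Dh y = 0
print(np.abs(basis[0]))              # up to sign, [0, 0, 1]: the x3-axis
```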



Tangent Space

▶ The tangent space at a point on a surface is the collection of all tangent vectors to the surface at that point.
▶ The derivative of a curve on a surface at a point is a tangent vector to the curve, and hence to the surface.
▶ Theorem: Suppose that x∗ ∈ S is a regular point and T(x∗) is the tangent space at x∗. Then, y ∈ T(x∗) if and only if there exists a differentiable curve in S passing through x∗ with derivative y at x∗.
▶ Proof idea: let C = {x(t) : t ∈ (a, b)} be a curve in S such that x(t∗) = x∗ and ẋ(t∗) = y. Then h(x(t)) = 0 for all t.
▶ Using the chain rule,

    (d/dt) h(x(t)) = Dh(x(t)) ẋ(t) = 0

  for all t ∈ (a, b). Therefore, at t∗ we get Dh(x∗)y = 0 and hence y ∈ T(x∗).
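This chain-rule identity can be checked on a concrete curve (a minimal NumPy sketch; the curve is made up): take the surface h(x) = x2 − x3² = 0 from the earlier example and the curve x(t) = [t, t², t]⊤, which lies in S since h(x(t)) = t² − t² = 0.

```python
import numpy as np

def grad_h(x):
    # h(x) = x2 - x3**2, so grad h(x) = [0, 1, -2*x3]
    return np.array([0.0, 1.0, -2.0 * x[2]])

def x_of_t(t):
    return np.array([t, t**2, t])      # lies on S for every t

def xdot_of_t(t):
    return np.array([1.0, 2.0 * t, 1.0])

# d/dt h(x(t)) = grad_h(x(t))^T xdot(t) vanishes along the curve,
# so xdot(t) is always in the tangent space of S
for t in [-1.0, 0.0, 0.5, 2.0]:
    print(abs(grad_h(x_of_t(t)) @ xdot_of_t(t)) < 1e-12)
```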
Normal Space

Definition: The normal space N(x∗) at a point x∗ on the surface S = {x ∈ R^n : h(x) = 0} is the set

    N(x∗) = {x ∈ R^n : x = Dh(x∗)⊤z, z ∈ R^m}

▶ N(x∗) is the range space of Dh(x∗)⊤.
▶ It is a subspace of R^n spanned by the vectors ∇h1(x∗), ..., ∇hm(x∗):

    N(x∗) = span[∇h1(x∗), ..., ∇hm(x∗)]
          = {x ∈ R^n : x = z1∇h1(x∗) + ··· + zm∇hm(x∗), z ∈ R^m}



Normal Space

▶ The tangent space and normal space are orthogonal complements of each other: T(x∗) = N(x∗)⊥.
▶ Equivalently, T(x∗) = {y ∈ R^n : x⊤y = 0 for all x ∈ N(x∗)}.
▶ For any given vector v ∈ R^n, there are unique vectors w ∈ N(x∗) and y ∈ T(x∗) such that v = w + y.
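The decomposition v = w + y can be computed with the orthogonal projector onto N(x∗) = R(Dh(x∗)⊤). A minimal NumPy sketch, reusing the Jacobian from the earlier tangent-space example (the vector v is made up):

```python
import numpy as np

Dh = np.array([[1.0, 0.0, 0.0],
               [1.0, -1.0, 0.0]])   # Jacobian at x*, full row rank

v = np.array([3.0, -1.0, 2.0])      # arbitrary vector to decompose

# Projector onto the normal space N(x*) = R(Dh^T)
P_normal = Dh.T @ np.linalg.solve(Dh @ Dh.T, Dh)
w = P_normal @ v                    # component in N(x*)
y = v - w                           # component in T(x*)

print(np.allclose(w + y, v))        # v = w + y
print(np.allclose(Dh @ y, 0))       # y lies in the tangent space
print(abs(w @ y) < 1e-12)           # the two components are orthogonal
```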




Lagrange Condition 23
Conditions for Optimality

▶ Let h : R² → R be the constraint function.
▶ At each point, ∇h(x) is orthogonal to the level set {x : h(x) = 0}.
▶ Choose a point x∗ = [x1∗, x2∗]⊤ such that h(x∗) = 0 and ∇h(x∗) ≠ 0.
▶ Parameterize the level set by a curve {x(t)}:

    x(t) = [x1(t), x2(t)]⊤,  t ∈ (a, b),  x∗ = x(t∗),  ẋ(t∗) ≠ 0,  t∗ ∈ (a, b)

Lagrange Theorem

▶ ∇h(x∗) is orthogonal to ẋ(t∗):
  – h is constant on the curve {x(t) : t ∈ (a, b)}; that is, h(x(t)) = 0.
  – Hence, for all t ∈ (a, b), we have (d/dt) h(x(t)) = 0.
  – Applying the chain rule,

      (d/dt) h(x(t)) = ∇h(x(t))⊤ẋ(t) = 0

▶ Now suppose that x∗ is a minimizer of f : R² → R on the set {x : h(x) = 0}. Then ∇f(x∗) is also orthogonal to ẋ(t∗):
  – The composite function ϕ(t) = f(x(t)) achieves its minimum at t∗.
  – FONC and the chain rule give 0 = ϕ′(t∗) = ∇f(x∗)⊤ẋ(t∗).

Lagrange Condition

▶ ẋ(t∗) is tangent to the curve {x(t)} at x∗, which means that ∇f(x∗) is orthogonal to the curve at x∗.
▶ Recall that ∇h(x∗) is also orthogonal to ẋ(t∗).
▶ Therefore, the vectors ∇f(x∗) and ∇h(x∗) are parallel.
▶ That is, ∇f(x∗) is a scalar multiple of ∇h(x∗).

Lagrange Theorem for n = 2, m = 1

Theorem: Let the point x∗ be a minimizer of f : R² → R subject to the constraint h(x) = 0, h : R² → R. Then ∇f(x∗) and ∇h(x∗) are parallel. That is, there exists a scalar λ∗ such that

    ∇f(x∗) + λ∗∇h(x∗) = 0

▶ λ∗ is called the Lagrange multiplier.
▶ Lagrange's theorem provides a first-order necessary condition for a point to be a local minimizer.
▶ The Lagrange conditions:

    ∇f(x∗) + λ∗∇h(x∗) = 0
    h(x∗) = 0
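For a concrete instance (a made-up example): minimize f(x) = x1² + x2² subject to h(x) = x1 + x2 − 1 = 0. The Lagrange conditions are linear here, so the stationary point and multiplier come from a single linear solve:

```python
import numpy as np

# Lagrange conditions for f(x) = x1**2 + x2**2, h(x) = x1 + x2 - 1:
#   2*x1 + lam = 0
#   2*x2 + lam = 0
#   x1 + x2    = 1
K = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
rhs = np.array([0.0, 0.0, 1.0])

x1, x2, lam = np.linalg.solve(K, rhs)
print(x1, x2, lam)  # minimizer (1/2, 1/2) with multiplier lam = -1
```

The solution x∗ = (1/2, 1/2) is the point of the line x1 + x2 = 1 closest to the origin, as the geometry of the problem suggests.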

Lagrange Theorem

Figure: Four examples where the Lagrange condition is satisfied: (a) maximizer, (b) minimizer, (c) minimizer, (d) not an extremizer
Lagrange Theorem

Theorem: Let x∗ be a local minimizer of f : R^n → R subject to the constraint h(x) = 0, h : R^n → R^m, m ≤ n. Assume that x∗ is a regular point. Then, there exists λ∗ ∈ R^m such that

    Df(x∗) + λ∗⊤Dh(x∗) = 0⊤

▶ If x∗ is an extremizer, then the gradient of the objective function f can be expressed as a linear combination of the gradients of the constraints.
▶ Compactly, the necessary condition is ∇f(x∗) ∈ N(x∗).


Karush-Kuhn-Tucker Condition 30
KKT Condition

Consider the problem

    minimize   f(x)
    subject to h(x) = 0
               g(x) ≤ 0

where f : R^n → R, h : R^n → R^m, and g : R^n → R^p.

▶ An inequality constraint gj(x) ≤ 0 is said to be active at x∗ if gj(x∗) = 0.
▶ Let x∗ satisfy h(x∗) = 0, g(x∗) ≤ 0, and let J(x∗) be the index set of active inequality constraints:

    J(x∗) = {j : gj(x∗) = 0}

  Then, we say that x∗ is a regular point if the vectors

    ∇hi(x∗), ∇gj(x∗),  1 ≤ i ≤ m, j ∈ J(x∗)

  are linearly independent.


Karush-Kuhn-Tucker (KKT) Condition

Theorem: Let f, g, h ∈ C¹. Let x∗ be a regular point and a local minimizer for the problem of minimizing f subject to h(x) = 0, g(x) ≤ 0. Then, there exist λ∗ ∈ R^m and μ∗ ∈ R^p such that
1. μ∗ ≥ 0
2. Df(x∗) + λ∗⊤Dh(x∗) + μ∗⊤Dg(x∗) = 0⊤
3. μ∗⊤g(x∗) = 0
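A small numerical check of the three conditions (a sketch assuming SciPy is available; the problem data are made up): minimize f(x) = (x1 − 2)² + (x2 − 2)² subject to g(x) = x1 + x2 − 2 ≤ 0, with no equality constraints. The unconstrained minimum (2, 2) is infeasible, so the inequality is active at the solution x∗ = (1, 1) with multiplier μ∗ = 2.

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 2)**2 + (x[1] - 2)**2
g = lambda x: x[0] + x[1] - 2                   # constraint g(x) <= 0

# SciPy's 'ineq' convention is fun(x) >= 0, so pass -g
res = minimize(f, x0=[0.0, 0.0], method="SLSQP",
               constraints={"type": "ineq", "fun": lambda x: -g(x)})
x = res.x                                        # approximately [1, 1]

grad_f = np.array([2 * (x[0] - 2), 2 * (x[1] - 2)])  # about [-2, -2]
grad_g = np.array([1.0, 1.0])
mu = 2.0                                         # from Df + mu * Dg = 0

print(mu >= 0)                                           # condition 1
print(np.allclose(grad_f + mu * grad_g, 0, atol=1e-4))   # condition 2
print(abs(mu * g(x)) < 1e-4)                             # condition 3
```

Condition 3 (complementary slackness) holds here because the constraint is active; for an inactive constraint it would instead force μ∗ = 0.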

KKT Condition

Figure: Four examples where the Lagrange condition is satisfied: (a) maximizer, (b) minimizer, (c) minimizer, (d) not an extremizer

