Basics of Convex Optimization: Shusen Wang

Convex Sets
Convex Set
Definition (Convex Set).
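A set $\mathcal{C}$ is convex if and only if, for any $\mathbf{x}, \mathbf{y} \in \mathcal{C}$ and any $\eta \in (0, 1)$, the point $\eta \mathbf{x} + (1 - \eta) \mathbf{y}$ is also in $\mathcal{C}$.
By definition, the line segment between $\mathbf{x}$ and $\mathbf{y}$ is in $\mathcal{C}$.
[Figure: a convex set $\mathcal{C}$ (left) and a non-convex set (right).]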

Convex Set: Examples
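Example: The $\ell_1$-norm ball $\{\mathbf{x} : \|\mathbf{x}\|_1 \le 1\}$.
Example: The $\ell_2$-norm ball $\{\mathbf{x} : \|\mathbf{x}\|_2 \le 1\}$.
[Figure: the $\ell_1$-norm ball and the $\ell_2$-norm ball in the $(x_1, x_2)$ plane, each spanning $-1$ to $1$ on both axes.]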
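The definition can be spot-checked numerically for the two norm balls. The following is a minimal sketch, assuming two dimensions and random sampling; the helper names (`sample_in_ball`, `segment_stays_inside`) are illustrative and not from the slides. A passing check is only evidence, not a proof of convexity.

```python
import random

def l1_norm(x):
    return sum(abs(v) for v in x)

def l2_norm(x):
    return sum(v * v for v in x) ** 0.5

def sample_in_ball(norm, dim, rng):
    """Rejection-sample a point with norm(x) <= 1 from the cube [-1, 1]^dim."""
    while True:
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if norm(x) <= 1.0:
            return x

def segment_stays_inside(norm, dim=2, trials=2000, seed=0):
    """Check that eta*x + (1 - eta)*y stays in the unit ball for sampled x, y, eta."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = sample_in_ball(norm, dim, rng)
        y = sample_in_ball(norm, dim, rng)
        eta = rng.random()
        z = [eta * a + (1.0 - eta) * b for a, b in zip(x, y)]
        if norm(z) > 1.0 + 1e-12:  # small tolerance for rounding error
            return False
    return True

l1_ok = segment_stays_inside(l1_norm)
l2_ok = segment_stays_inside(l2_norm)
print(l1_ok, l2_ok)
```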
Convex Functions
Convex Function
Definition (Convex Function).
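• Let $\mathcal{C}$ be a convex set and $f : \mathcal{C} \to \mathbb{R}$ be a function.
• $f$ is convex if, for any $\mathbf{w}_1, \mathbf{w}_2 \in \mathcal{C}$ and any $\eta \in (0, 1)$,
$f(\eta \mathbf{w}_1 + (1 - \eta) \mathbf{w}_2) \le \eta f(\mathbf{w}_1) + (1 - \eta) f(\mathbf{w}_2)$.
[Figure: graph of a convex function, with the function values $f(w_1)$ and $f(w_2)$ marked at $w_1$ and $w_2$ on the $w$ axis.]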
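The defining inequality can be tested by sampling pairs of points and a mixing weight. This is a rough sketch, assuming scalar functions on an interval; the two test functions ($w^2$ and $\sin w$) are illustrative choices, not taken from the slides. Finding no violation does not prove convexity, whereas a single violation does disprove it.

```python
import math
import random

def find_convexity_violation(f, lo, hi, trials=5000, seed=0, tol=1e-9):
    """Return (w1, w2, eta) violating the convexity inequality, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        w1 = rng.uniform(lo, hi)
        w2 = rng.uniform(lo, hi)
        eta = rng.uniform(0.0, 1.0)
        lhs = f(eta * w1 + (1.0 - eta) * w2)
        rhs = eta * f(w1) + (1.0 - eta) * f(w2)
        if lhs > rhs + tol:
            return (w1, w2, eta)
    return None

def square(w):
    return w * w

# Illustrative functions: w^2 is convex, sin(w) is not convex on [-10, 10].
square_violation = find_convexity_violation(square, -10.0, 10.0)
sine_violation = find_convexity_violation(math.sin, -10.0, 10.0)
print(square_violation, sine_violation)
```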
Convex Function: Properties
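Properties of a convex function:
1. $f(\mathbf{w}_0) + \nabla f(\mathbf{w}_0)^T (\mathbf{w} - \mathbf{w}_0) \le f(\mathbf{w})$ for any $\mathbf{w}$ and $\mathbf{w}_0$. (Assume $f$ is differentiable.)
[Figure: graph of a convex function (function value against $w$) lying on or above its tangent line $f(w_0) + f'(w_0) \cdot (w - w_0)$ at $w_0$.]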
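2. The Hessian matrix is everywhere positive semi-definite: $\nabla^2 f(\mathbf{w}) \succeq \mathbf{0}$.
• Assume $f$ is twice differentiable.
• $\mathbf{H} \in \mathbb{R}^{d \times d}$ is positive semi-definite if $\mathbf{x}^T \mathbf{H} \mathbf{x} \ge 0$ for all $\mathbf{x} \in \mathbb{R}^d$.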
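Both properties can be checked on the least-squares function $f(\mathbf{w}) = \frac{1}{2}\|\mathbf{X}\mathbf{w} - \mathbf{y}\|_2^2$, whose gradient is $\mathbf{X}^T(\mathbf{X}\mathbf{w} - \mathbf{y})$ and whose Hessian is $\mathbf{X}^T\mathbf{X}$. The sketch below assumes a small made-up data matrix `X` and target vector `y`; all names are illustrative.

```python
import random

# Hypothetical 3x2 data matrix and targets, chosen only for illustration.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [1.0, 0.0, 2.0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual(w):
    return [dot(row, w) - t for row, t in zip(X, y)]

def f(w):
    r = residual(w)
    return 0.5 * dot(r, r)

def grad_f(w):
    # Gradient of f is X^T (Xw - y).
    r = residual(w)
    return [sum(X[i][j] * r[i] for i in range(len(X))) for j in range(len(w))]

def hessian_quadratic_form(v):
    # The Hessian of f is X^T X, so v^T H v equals ||Xv||^2.
    Xv = [dot(row, v) for row in X]
    return dot(Xv, Xv)

rng = random.Random(0)
first_order_holds = True
hessian_psd = True
for _ in range(1000):
    w0 = [rng.uniform(-5.0, 5.0) for _ in range(2)]
    w = [rng.uniform(-5.0, 5.0) for _ in range(2)]
    v = [rng.uniform(-5.0, 5.0) for _ in range(2)]
    # Property 1: the tangent plane at w0 never exceeds f.
    tangent = f(w0) + dot(grad_f(w0), [a - b for a, b in zip(w, w0)])
    if tangent > f(w) + 1e-8:
        first_order_holds = False
    # Property 2: the quadratic form of the Hessian is nonnegative.
    if hessian_quadratic_form(v) < 0.0:
        hessian_psd = False

print(first_order_holds, hessian_psd)
```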
Convex Functions
Question: Are they convex functions?
• $f(w) = w^2 + w - 1$, for $w \in \mathbb{R}$.
• $f(w) = w^3$, for $w \in \mathbb{R}$.
• $f(w) = \log_e w$, for $w > 0$.
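• $f(\mathbf{w}) = \frac{1}{2} \|\mathbf{w}\|_2^2$, for $\mathbf{w} \in \mathbb{R}^d$.
• $f(\mathbf{w}) = \frac{1}{2} \|\mathbf{X}\mathbf{w} - \mathbf{y}\|_2^2$, for $\mathbf{w} \in \mathbb{R}^d$.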
Convex Function: Property
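Property: A nonnegative combination of convex functions is a convex function.
• Let $f_1, \cdots, f_k$ be convex functions and $\lambda_1, \cdots, \lambda_k \ge 0$.
• Then $f(\mathbf{w}) = \lambda_1 f_1(\mathbf{w}) + \cdots + \lambda_k f_k(\mathbf{w})$ is a convex function.
Example:
• $f_1(\mathbf{w}) = \|\mathbf{X}\mathbf{w} - \mathbf{y}\|_2^2$ is a convex function.
• $f_2(\mathbf{w}) = \|\mathbf{w}\|_2^2$ is a convex function.
• $\Rightarrow$ $f_1(\mathbf{w}) + \lambda f_2(\mathbf{w}) = \|\mathbf{X}\mathbf{w} - \mathbf{y}\|_2^2 + \lambda \|\mathbf{w}\|_2^2$ is a convex function for any $\lambda \ge 0$.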
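The combination property relies on the weights being nonnegative. As a small illustration, the sketch below combines two convex scalar functions, $w^2$ and $|w|$ (illustrative choices, not the slide's example), and applies a midpoint test on a grid. With weights $(1, 2)$ the test passes; with weights $(1, -2)$ it fails, because $w^2 - 2|w|$ is not convex.

```python
def combine(weights, functions):
    """Return the weighted sum of the given functions as a new function."""
    return lambda w: sum(lam * g(w) for lam, g in zip(weights, functions))

def passes_midpoint_test(f, points, tol=1e-9):
    """Check f((a + b) / 2) <= (f(a) + f(b)) / 2 for every pair of grid points."""
    for a in points:
        for b in points:
            if f(0.5 * (a + b)) > 0.5 * f(a) + 0.5 * f(b) + tol:
                return False
    return True

def square(w):
    return w * w

functions = [square, abs]                   # both are convex
points = [i / 4.0 for i in range(-20, 21)]  # grid on [-5, 5]

nonnegative_mix = combine([1.0, 2.0], functions)   # w^2 + 2|w|
negative_mix = combine([1.0, -2.0], functions)     # w^2 - 2|w|

nonnegative_ok = passes_midpoint_test(nonnegative_mix, points)
negative_ok = passes_midpoint_test(negative_mix, points)
print(nonnegative_ok, negative_ok)
```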
Convex Optimization
Convex Optimization
Definition (Convex Optimization).

• Optimization: $\min_{\mathbf{w}} f(\mathbf{w})$; s.t. $\mathbf{w} \in \mathcal{C}$.
• It is a convex optimization problem if it has two properties:
1. $\mathcal{C}$ (the feasible set) is a convex set,
2. $f$ (the objective function) is a convex function.

Convex Optimization: Examples
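• Least squares regression: $\min_{\mathbf{w}} \|\mathbf{X}\mathbf{w} - \mathbf{y}\|_2^2$.
• Logistic regression: $\min_{\mathbf{w}} \sum_i \log\left(1 + \exp(-y_i \mathbf{w}^T \mathbf{x}_i)\right)$.
• SVM: $\min_{\mathbf{w}, b} \|\mathbf{w}\|_2^2 + \lambda \sum_i \left[1 - y_i (\mathbf{w}^T \mathbf{x}_i + b)\right]_+$.
• LASSO: $\min_{\mathbf{w}} \|\mathbf{X}\mathbf{w} - \mathbf{y}\|_2^2$; s.t. $\|\mathbf{w}\|_1 \le t$.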
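The four objectives above can be written out directly. This is a plain-Python sketch, assuming `X` is a list of feature vectors, labels are in $\{-1, +1\}$ for logistic regression and SVM, and $[z]_+$ means $\max(0, z)$; the function names and the tiny dataset are illustrative only. It evaluates the objectives and does not solve the optimization problems.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def least_squares(w, X, y):
    # ||Xw - y||_2^2
    return sum((dot(x, w) - t) ** 2 for x, t in zip(X, y))

def logistic_loss(w, X, y):
    # sum_i log(1 + exp(-y_i w^T x_i)); may overflow for very negative margins.
    return sum(math.log(1.0 + math.exp(-t * dot(x, w))) for x, t in zip(X, y))

def svm_objective(w, b, X, y, lam):
    # ||w||_2^2 + lam * sum_i [1 - y_i (w^T x_i + b)]_+
    hinge = sum(max(0.0, 1.0 - t * (dot(x, w) + b)) for x, t in zip(X, y))
    return dot(w, w) + lam * hinge

def lasso_feasible(w, t):
    # LASSO constraint: ||w||_1 <= t
    return sum(abs(v) for v in w) <= t

# Tiny made-up dataset for illustration.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, -1.0]
w = [1.0, -1.0]

print(least_squares(w, X, y))
print(logistic_loss(w, X, y))
print(svm_objective(w, 0.0, X, y, lam=1.0))
print(lasso_feasible(w, t=2.0))
```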
Local and Global Optima
Convex Optimization: Properties
Property: For convex optimization, every local minimum is a global minimum.
[Figure: graph of a convex function (function value against $w$), with the optimal solution $w^\star$ and its value $f(w^\star)$ marked.]
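First-order optimality condition (necessary condition):
• Consider the unconstrained optimization: $\min_{\mathbf{w}} f(\mathbf{w})$, where $f$ is differentiable.
• If $\mathbf{w}^\star$ is a local minimum, then the gradient $\frac{\partial f(\mathbf{w})}{\partial \mathbf{w}}$ at $\mathbf{w}^\star$ is zero.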
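Property of convex optimization (sufficient condition):
• Let $\min_{\mathbf{w}} f(\mathbf{w})$ be an unconstrained convex optimization problem with differentiable $f$.
• If the gradient $\frac{\partial f(\mathbf{w})}{\partial \mathbf{w}}$ at $\mathbf{w}^\star$ is zero, then $\mathbf{w}^\star$ is a global minimum.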
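The gap between the necessary and the sufficient condition shows up on two scalar functions. In this sketch, assuming the illustrative choices $f(w) = w^2 + w - 1$ (convex) and $g(w) = w^3$ (not convex), a zero gradient identifies the global minimum of $f$, while $g$ has a zero gradient at $w = 0$ that is not a minimum.

```python
def f(w):
    return w * w + w - 1.0

def grad_f(w):
    return 2.0 * w + 1.0

def g(w):
    return w ** 3

def grad_g(w):
    return 3.0 * w * w

# Convex case: the gradient of f vanishes at w = -0.5.
w_star = -0.5
grid = [i / 100.0 for i in range(-1000, 1001)]  # grid on [-10, 10]
gradient_is_zero = grad_f(w_star) == 0.0
is_global_min_on_grid = all(f(w) >= f(w_star) for w in grid)

# Non-convex case: the gradient of g vanishes at 0, yet g(-1) < g(0).
stationary_but_not_min = grad_g(0.0) == 0.0 and g(-1.0) < g(0.0)

print(gradient_is_zero, is_global_min_on_grid, stationary_but_not_min)
```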
Subgradient and Subdifferential
Non-Differentiable Functions
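• Example of a non-differentiable function: $f(w) = |w|$.
$\frac{\partial f}{\partial w} = \begin{cases} +1, & \text{if } w > 0; \\ \text{undefined}, & \text{if } w = 0; \\ -1, & \text{if } w < 0. \end{cases}$
[Figure: graph of $f(w) = |w|$ against $w$, with a kink at $w = 0$.]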
Subgradient of Convex Function
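Definition (Subgradient). A vector $\mathbf{v}$ is called a subgradient of $f$ at $\mathbf{w}_0$ if, for any $\mathbf{w}$, $f(\mathbf{w}) \ge f(\mathbf{w}_0) + \mathbf{v}^T (\mathbf{w} - \mathbf{w}_0)$.
[Figure: graph of $f(w) = |w|$ together with a line $f(w_0) + v (w - w_0)$ that lies on or below it.]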
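The subgradient inequality can be tested on a grid of points for $f(w) = |w|$. This is a sketch only: the helper `is_subgradient_on_grid` is an illustrative name, and a finite grid can reject a candidate $v$ but cannot strictly prove that $v$ is a subgradient.

```python
def is_subgradient_on_grid(f, v, w0, points, tol=1e-12):
    """Check f(w) >= f(w0) + v * (w - w0) at every grid point w."""
    return all(f(w) >= f(w0) + v * (w - w0) - tol for w in points)

points = [i / 10.0 for i in range(-50, 51)]  # grid on [-5, 5]

# At the kink w0 = 0, any slope in [-1, 1] passes; a steeper slope fails.
half_at_zero = is_subgradient_on_grid(abs, 0.5, 0.0, points)
steep_at_zero = is_subgradient_on_grid(abs, 1.5, 0.0, points)

# At w0 = 2, where |w| is differentiable, only the derivative +1 passes.
one_at_two = is_subgradient_on_grid(abs, 1.0, 2.0, points)
half_at_two = is_subgradient_on_grid(abs, 0.5, 2.0, points)

print(half_at_zero, steep_at_zero, one_at_two, half_at_two)
```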
Subdifferential of Convex Function
Definition (Subdifferential). The set containing all the subgradients of $f$ at $\mathbf{w}_0$ is called the subdifferential. Denote the set by $\partial f(\mathbf{w}_0)$.
Example: $f(w) = |w|$
• $\partial f(3) = \{1\}$.
• $\partial f(-0.001) = \{-1\}$.
• $\partial f(0) = [-1, 1]$.
[Figure: graph of $f(w) = |w|$ against $w$, with a kink at $w = 0$.]
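The example can be written as a small function that returns the subdifferential of $f(w) = |w|$ as a closed interval. Representing the set as a `(low, high)` tuple is an assumption of this sketch, with a single-element set encoded as an interval whose endpoints coincide.

```python
def subdifferential_abs(w):
    """Return the subdifferential of f(w) = |w| at w as a closed interval (low, high)."""
    if w > 0:
        return (1.0, 1.0)    # the single subgradient +1
    if w < 0:
        return (-1.0, -1.0)  # the single subgradient -1
    return (-1.0, 1.0)       # every slope between -1 and 1 at the kink

for w in (3.0, -0.001, 0.0):
    print(w, subdifferential_abs(w))
```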
A Property of Convex Optimization
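Let $f$ be a convex function.
Property: $\mathbf{w}^\star = \mathop{\mathrm{argmin}}_{\mathbf{w}} f(\mathbf{w}) \iff \mathbf{0} \in \partial f(\mathbf{w}^\star)$.
Example: $\min_w f(w) = |w + 5|$
• $\partial f(-5) = [-1, 1]$.
• Obviously $0 \in \partial f(-5)$.
• $w^\star = -5$ minimizes $f$.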
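The optimality test $0 \in \partial f(w^\star)$ for $f(w) = |w + 5|$ can be sketched in a few lines. The interval encoding of the subdifferential and the helper names are assumptions made for this illustration.

```python
def f(w):
    return abs(w + 5.0)

def subdifferential_f(w):
    """Subdifferential of f(w) = |w + 5| as a closed interval (low, high)."""
    u = w + 5.0
    if u > 0:
        return (1.0, 1.0)
    if u < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)

def contains_zero(interval):
    low, high = interval
    return low <= 0.0 <= high

optimal_at_minus_5 = contains_zero(subdifferential_f(-5.0))
optimal_at_zero = contains_zero(subdifferential_f(0.0))

# Cross-check against a direct comparison of function values on a grid.
grid = [i / 10.0 for i in range(-200, 201)]  # grid on [-20, 20]
minus_5_is_grid_min = all(f(w) >= f(-5.0) for w in grid)

print(optimal_at_minus_5, optimal_at_zero, minus_5_is_grid_min)
```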
