Download as pdf or txt
Download as pdf or txt
You are on page 1of 30

Basics of Convex Optimization

Shusen Wang
Convex Sets
Convex Set
Definition (Convex Set).
A set 𝒞 is convex if and only if for any 𝐱, 𝐲 ∈ 𝒞 and any 𝜂 ∈ (0, 1), the
point 𝜂𝐱 + 1 − 𝜂 𝐲 is also in 𝒞.

By definition, the line segment between


𝐱 and 𝐲 is in 𝒞.

A convex set 𝒞.
Convex Set
Definition (Convex Set).
A set 𝒞 is convex if and only if for any 𝐱, 𝐲 ∈ 𝒞 and any 𝜂 ∈ (0, 1), the
point 𝜂𝐱 + 1 − 𝜂 𝐲 is also in 𝒞.

A convex set 𝒞. A non-convex set.


Convex Set: Examples
Definition (Convex Set).
A set 𝒞 is convex if and only if for any 𝐱, 𝐲 ∈ 𝒞 and any 𝜂 ∈ (0, 1), the
point 𝜂𝐱 + 1 − 𝜂 𝐲 is also in 𝒞.
Convex Set: Examples
Example: The ℓ. -norm ball 𝐱: 𝐱 .
≤1 .

𝑥3

−1 1
𝑥.

−1
Convex Set: Examples
Example: The ℓ3 -norm ball 𝐱: 𝐱 3
≤1 .

𝑥3 𝑥3

1 1

−1 1 −1 1
𝑥. 𝑥.

−1 −1
Convex Functions
Convex Function
Definition (Convex Function).
• Let 𝒞 be a convex set and 𝑓: 𝒞 ↦ ℝ be a function.
• 𝑓 is convex if for any 𝐰. , 𝐰3 ∈ 𝒞 and any 𝜂 ∈ (0, 1),
𝑓 𝜂𝐰. + 1 − 𝜂 𝐰3 ≤ 𝜂𝑓 𝐰. + 1 − 𝜂 𝑓 𝐰3 .

𝑓 𝑤3
function value 𝑓 𝑤.

𝑤. 𝑤3 𝑤
Convex Function: Properties
Properties of convex function:
1. 𝑓 𝐰9 + 𝛻𝑓 𝐰9 ; 𝐰 − 𝐰9 ≤ 𝑓 𝐰 . (Assume 𝑓 is differentiable).

𝑓 𝑤9 + 𝑓 < 𝑤9 ⋅ (𝑤 − 𝑤9)
function
𝑓 𝑤9
value

𝑤9
𝑤
Convex Function: Properties
Properties of convex function:
1. 𝑓 𝐰9 + 𝛻𝑓 𝐰9 ; 𝐰 − 𝐰9 ≤ 𝑓 𝐰 . (Assume 𝑓 is differentiable).
2. The Hessian matrix is everywhere positive semi-definite: 𝛻 3𝑓 𝐰 ≽ 𝟎.
• Assume 𝑓 is twice differentiable.
• 𝐇 ∈ ℝA×A is positive semi-definite for all 𝐱 ∈ ℝA , 𝐱 ; 𝐇𝐱 ≥ 0.
Convex Functions

Question: Are they convex functions?

• 𝑓 𝑤 = 𝑤 3 + 𝑤 − 1, for 𝑤 ∈ ℝ.
• 𝑓 𝑤 = 𝑤 E , for 𝑤 ∈ ℝ.
• 𝑓 𝑤 = log I 𝑤, for 𝑤 > 0.
. 3 A
•𝑓 𝐰 = 𝐰 3
, for 𝐰 ∈ ℝ .
3
. 3 A
•𝑓 𝐰 = 𝐗𝐰 − 𝐲 3
, for 𝐰 ∈ ℝ .
3
Convex Function: Property
Property: Combination of convex functions is convex function.
• Let 𝑓. , ⋯ , 𝑓M be convex functions.
• Then 𝑓 𝐰 = 𝜆. 𝑓. 𝐰 + ⋯ + 𝜆M 𝑓M 𝐰 is convex function.

Example:
3
• 𝑓. 𝐰 = 𝐗𝐰 − 𝐲 3
is convex function.
3
• 𝑓3 𝐰 = 𝐰 3
is convex function.
3 3
• è 𝑓. 𝐰 + 𝜆 𝑓3 𝐰 = 𝐗𝐰 − 𝐲 3
+𝜆 𝐰 3
is convex function.
Convex Optimization
Convex Optimization

Definition (Convex Optimization).


• Optimization: min 𝑓(𝐰) ; s. t. 𝐰 ∈ 𝒞.
𝐰
• It is convex optimization if it has two properties:
1. 𝒞 (feasible set) is convex set,

2. 𝑓 (objective function) is convex function.


Convex Optimization: Examples

3
• Least squares regression: min 𝐗𝐰 − 𝐲 3
.
𝐰

• Logistic regression: min ∑[ log 1 + exp −𝑦[ 𝐰 ; 𝐱[ .


𝐰
3
• SVM: min 𝐰 + 𝜆 ∑[ 1 − 𝑦[ 𝐰 ; 𝐱[ + 𝑏 .
𝐰,] 3 _
3
• LASSO: min 𝐗𝐰 − 𝐲 3
; 𝑠. 𝑡. 𝐰 .
≤ 𝑡.
𝐰
Local and Global Optima
Convex Optimization: Properties
Property: For convex optimization, every local minimum is
global minimum.

function
value 𝑓 𝑤⋆

𝑤
optimal solution 𝑤 ⋆
Convex Optimization: Properties
First-order optimality condition (necessary condition):
• Consider the unconstrained optimization: min𝑓 𝐰 .
c
d e 𝐰
• If 𝐰 ⋆ is local minimum, then the gradient at 𝐰 ⋆ is zero.
d 𝐰
Convex Optimization: Properties
First-order optimality condition (necessary condition):
• Consider the unconstrained optimization: min𝑓 𝐰 .
c
d e 𝐰
• If 𝐰 ⋆ is local minimum, then the gradient at 𝐰 ⋆ is zero.
d 𝐰

Property of convex optimization (sufficient condition):

• Let min𝑓 𝐰 be convex optimization.


c
d e 𝐰
• If at 𝐰 ⋆ is zero, then 𝐰 ⋆ is global minimum.
d 𝐰
Subgradient and Subdifferential
Non-Differentiable Functions
• Example of non-differentiable functions: 𝑓 𝑤 = |𝑤|
𝜕𝑓 +1, if 𝑤 > 0;
= h undefined, if 𝑤 = 0;
𝜕𝑤
−1, if 𝑤 < 0.

𝑓 𝑤 = |𝑤|

𝑤
0
Subgradient of Convex Function
Definition (Subgradient). A vector 𝐯 is called a subgradient of 𝑓 at 𝐰9 if
for any 𝐰, 𝑓 𝐰 ≥ 𝑓 𝐰9 + 𝐯 ; 𝐰 − 𝐰9 .

𝑓 𝑤 = |𝑤|

𝑓 𝑤9 + 𝑣 𝑤 − 𝑤9

𝑤
Subgradient of Convex Function
Definition (Subgradient). A vector 𝐯 is called a subgradient of 𝑓 at 𝐰9 if
for any 𝐰, 𝑓 𝐰 ≥ 𝑓 𝐰9 + 𝐯 ; 𝐰 − 𝐰9 .

𝑓 𝑤 = |𝑤|

𝑓 𝑤9 + 𝑣 𝑤 − 𝑤9

𝑤
Subgradient of Convex Function
Definition (Subgradient). A vector 𝐯 is called a subgradient of 𝑓 at 𝐰9 if
for any 𝐰, 𝑓 𝐰 ≥ 𝑓 𝐰9 + 𝐯 ; 𝐰 − 𝐰9 .

𝑓 𝑤 = |𝑤|

𝑤
𝑓 𝑤9 + 𝑣 𝑤 − 𝑤9
Subdifferential of Convex Function
Definition (Subgradient). A vector 𝐯 is called a subgradient of 𝑓 at 𝐰9 if
for any 𝐰, 𝑓 𝐰 ≥ 𝑓 𝐰9 + 𝐯 ; 𝐰 − 𝐰9 .

Definition (Subdifferential). The set containing all the subgradients of


𝑓 at 𝐰9 is called the subdifferential. Denote the set by 𝜕𝑓 𝐰9 .
Subdifferential of Convex Function
Definition (Subgradient). A vector 𝐯 is called a subgradient of 𝑓 at 𝐰9 if
for any 𝐰, 𝑓 𝐰 ≥ 𝑓 𝐰9 + 𝐯 ; 𝐰 − 𝐰9 .

Definition (Subdifferential). The set containing all the subgradients of


𝑓 at 𝐰9 is called the subdifferential. Denote the set by 𝜕𝑓 𝐰9 .

Example: 𝑓 𝑤 = |𝑤| 𝑓 𝑤 = |𝑤|


• 𝜕𝑓 3 = 1 .
• 𝜕𝑓 −0.001 = −1 .
• 𝜕𝑓 0 = [−1, 1].

𝑤
0
Subdifferential of Convex Function
Definition (Subgradient). A vector 𝐯 is called a subgradient of 𝑓 at 𝐰9 if
for any 𝐰, 𝑓 𝐰 ≥ 𝑓 𝐰9 + 𝐯 ; 𝐰 − 𝐰9 .

Definition (Subdifferential). The set containing all the subgradients of


𝑓 at 𝐰9 is called the subdifferential. Denote the set by 𝜕𝑓 𝐰9 .

Example: 𝑓 𝑤 = |𝑤| 𝑓 𝑤 = |𝑤|


• 𝜕𝑓 3 = 1 .
• 𝜕𝑓 −0.1 = −1 .
• 𝜕𝑓 0 = [−1, 1].

𝑤
0
Subdifferential of Convex Function
Definition (Subgradient). A vector 𝐯 is called a subgradient of 𝑓 at 𝐰9 if
for any 𝐰, 𝑓 𝐰 ≥ 𝑓 𝐰9 + 𝐯 ; 𝐰 − 𝐰9 .

Definition (Subdifferential). The set containing all the subgradients of


𝑓 at 𝐰9 is called the subdifferential. Denote the set by 𝜕𝑓 𝐰9 .

Example: 𝑓 𝑤 = |𝑤| 𝑓 𝑤 = |𝑤|


• 𝜕𝑓 3 = 1 .
• 𝜕𝑓 −0.1 = −1 .
• 𝜕𝑓 0 = [−1, 1].

𝑤
0
A Property of Convex Optimization
Let 𝑓 be a convex function.
Property: 𝐰 ⋆ = min 𝑓 𝐰 0 ∈ 𝜕𝑓 𝐰 ⋆ .
𝐰

Example: min 𝑓 𝑤 = |𝑤 + 5|
r
• 𝜕𝑓 −5 = −1, 1 .
• Obviously 0 ∈ 𝜕𝑓 −5 .
• 𝑤 ⋆ = −5 minimizes 𝑓.

You might also like