
Optimization Techniques and Applications
Convex sets & convex functions

Courtesy of K. Ma at HKCU
CONVEX SETS
Convex Sets
A set C ⊆ Rn is said to be convex if, for any x, y ∈ C,

θx + (1 − θ)y ∈ C

for any 0 ≤ θ ≤ 1.

[Figure: a convex set and a non-convex set; the point θx + (1 − θ)y lies on the segment starting at x.]

• For C to be convex, the line segment between any two points in C has to lie in C.
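The definition can be probed numerically by sampling pairs of points and looking for a convex combination that leaves the set. The sketch below is illustrative only: the two membership tests (a unit ball and a ring) and the helper name `find_violation` are assumptions made for this example, and a failed search is evidence of convexity, not a proof.

```python
import math
import random

def in_ball(p, center=(0.0, 0.0), radius=1.0):
    """Membership test for the Euclidean ball B(center, radius)."""
    return math.dist(p, center) <= radius

def in_annulus(p, inner=0.5, outer=1.0):
    """Membership test for a ring, a standard non-convex set."""
    return inner <= math.hypot(*p) <= outer

def find_violation(member, trials=20000, seed=0):
    """Search for x, y in the set and theta in [0, 1] whose combination leaves the set."""
    rng = random.Random(seed)

    def sample():
        while True:  # rejection sampling from the square [-1, 1]^2
            p = (rng.uniform(-1, 1), rng.uniform(-1, 1))
            if member(p):
                return p

    for _ in range(trials):
        x, y, theta = sample(), sample(), rng.random()
        z = tuple(theta * xi + (1 - theta) * yi for xi, yi in zip(x, y))
        if not member(z):
            return x, y, theta
    return None

ball_violation = find_violation(in_ball)        # expected: no violation found
ring_violation = find_violation(in_annulus)     # expected: some (x, y, theta)
```

For the ring, two points on opposite sides of the hole are enough to produce a combination that falls inside the hole.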


Examples of Convex Sets
Hyperplane:
C = {x | aT x = b}
where a ∈ Rn, & b ∈ R.

Halfspace:
C = {x | aT x ≤ b}

[Figure: the hyperplane aT x = b and the halfspace aT x ≤ b on one side of it.]
Polyhedron:

C = {x | aTi x ≤ bi, i = 1, . . . , m, cTi x = di, i = 1, . . . , p}

For convenience we use matrix notations to represent a polyhedron:

C = {x | Ax ≼ b, Cx = d}

where A ∈ Rm×n, C ∈ Rp×n, b ∈ Rm, d ∈ Rp, & ≼ denotes elementwise inequality.
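As a small illustration of the matrix description, a membership test for {x | Ax ≼ b, Cx = d} can be sketched as follows. The function name `in_polyhedron`, the tolerance `tol`, and the unit-square data are assumptions chosen for this example.

```python
def in_polyhedron(x, A, b, C=(), d=(), tol=1e-9):
    """Return True if A x <= b elementwise and C x = d (both up to tol)."""
    dot = lambda row: sum(r * xi for r, xi in zip(row, x))
    ineq_ok = all(dot(a_i) <= b_i + tol for a_i, b_i in zip(A, b))
    eq_ok = all(abs(dot(c_i) - d_i) <= tol for c_i, d_i in zip(C, d))
    return ineq_ok and eq_ok

# The unit square 0 <= x1, x2 <= 1 written as four halfspaces a_i^T x <= b_i.
A = [(1, 0), (-1, 0), (0, 1), (0, -1)]
b = [1, 0, 1, 0]

inside = in_polyhedron((0.5, 0.5), A, b)
outside = in_polyhedron((1.5, 0.5), A, b)
# Adding the equality x1 - x2 = 0 keeps only the diagonal of the square.
on_diagonal = in_polyhedron((0.3, 0.3), A, b, C=[(1, -1)], d=[0])
```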
[Figure: a polyhedron C formed by five halfspaces with normal vectors a1, . . . , a5.]

W.-K. Ma 3
Convex hull of a set of points {x1 , x2, . . . , xk }:

C = conv{x1, x2, . . . , xk } = {x = θ1x1+. . .+θk xk | θ1+. . .+θk = 1, θ1, . . . , θk ≥ 0}

• The set of all convex combinations of {x1, x2, . . . , xk }

• conv{x1, . . . , xk} is a polyhedron. (Conversely, every bounded polyhedron is the convex hull of finitely many points.)

[Figure: (a) Convex hull where only some of the x1, x2, . . . , xk are vertices. (b) Convex hull where all x1, . . . , xk are vertices.]

• x ∈ C is an extreme point or vertex of C if $x \neq \sum_{i=1}^{k} \theta_i x_i$ for any θ1 + . . . + θk = 1, θ1, . . . , θk ≥ 0, θi ≠ 1 for any i.
Euclidean ball:

B(xc, r) = {x | ∥x − xc∥2 ≤ r}
$$= \Big\{ x \;\Big|\; \sqrt{\textstyle\sum_{i=1}^{n} (x_i - x_{c,i})^2} \le r \Big\}$$

[Figure: a Euclidean ball centered at xc.]
Ellipsoid:

E(xc, P ) = {x | (x − xc)T P −1(x − xc) ≤ 1}

where P ∈ Sn (Sn = the set of n × n symmetric matrices), and P is positive definite (so that P −1 exists).

[Figure: an ellipsoid centered at xc, with semi-axes determined by λ1 and λ2.]

Symmetric eigendecomposition of P :

P = QΛQT

• The eigenvector matrix Q (Q ∈ Rn×n, QT Q = I) controls the rotation;

• The eigenvalue matrix Λ = diag(λ1, . . . , λn) controls the lengths of the semi-axes: the i-th semi-axis has length √λi.
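For n = 2 the eigendecomposition has a closed form, so the geometry can be computed by hand. The sketch below assumes a symmetric positive definite 2 × 2 matrix P given as nested tuples; the function name `ellipse_axes` is made up for this example.

```python
import math

def ellipse_axes(P):
    """Semi-axis lengths and rotation of {x | x^T P^{-1} x <= 1} for a 2x2 symmetric PD P."""
    (a, b), (_, c) = P
    mean, diff = (a + c) / 2, (a - c) / 2
    root = math.hypot(diff, b)
    lam1, lam2 = mean + root, mean - root      # eigenvalues of P, lam1 >= lam2
    if lam2 <= 0:
        raise ValueError("P must be positive definite")
    angle = 0.5 * math.atan2(2 * b, a - c)     # direction of the lam1 eigenvector
    return math.sqrt(lam1), math.sqrt(lam2), angle

# Axis-aligned case: semi-axes 2 and 1, no rotation.
r1, r2, angle = ellipse_axes(((4.0, 0.0), (0.0, 1.0)))
# Rotated case: eigenvalues 3 and 1, long axis along (1, 1).
s1, s2, rot = ellipse_axes(((2.0, 1.0), (1.0, 2.0)))
```

The square roots appear because the ellipsoid is defined through P −1, so an eigenvalue λi of P stretches the unit ball by √λi along the corresponding eigenvector.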
Convex Cones

A set C ⊆ Rn is said to be a convex cone if, for any x, y ∈ C,

θ1 x + θ2 y ∈ C

for any θ1, θ2 ≥ 0.

[Figure: a convex cone with apex at 0, containing a point y.]

• A convex cone is a convex set.


Examples of Convex Cones

Second-order cone (SOC) (aka Lorentz cone, or ice-cream cone):

K = {(x, t) | ∥x∥2 ≤ t}

[Figure: the second-order cone in R3, plotted over (x1, x2) with vertical axis t.]
Positive semidefinite (PSD) cone:

Sn+ = {X ∈ Sn | X ≽ 0}

where X ≽ 0 means that X is PSD; i.e.,

zT Xz ≥ 0, for all z ∈ Rn

Positive definite (PD) cone:

Sn++ = {X ∈ Sn | X ≻ 0}

where X ≻ 0 means that X is PD; i.e.,

zT Xz > 0, for all z ∈ Rn \ {0}


Example: n = 2

$$\begin{bmatrix} x_1 & x_2 \\ x_2 & x_3 \end{bmatrix} \succeq 0 \iff x_1 x_3 - x_2^2 \ge 0,\; x_1 \ge 0,\; x_3 \ge 0$$

[Figure: boundary of the PSD cone for n = 2, plotted over (x1, x2) with vertical axis x3.]
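The n = 2 characterization can be compared against the definition zT Xz ≥ 0 directly. The sketch below is only a numerical spot check: it evaluates the quadratic form on a grid of unit vectors, and both function names are chosen for this example.

```python
import math

def is_psd_2x2(x1, x2, x3, tol=1e-12):
    """Closed-form test for X = [[x1, x2], [x2, x3]]."""
    return x1 >= -tol and x3 >= -tol and x1 * x3 - x2 * x2 >= -tol

def min_quadratic_form(x1, x2, x3, samples=3600):
    """Smallest z^T X z over unit vectors z = (cos a, sin a) on a grid of angles."""
    best = math.inf
    for k in range(samples):
        a = math.pi * k / samples
        c, s = math.cos(a), math.sin(a)
        best = min(best, x1 * c * c + 2 * x2 * c * s + x3 * s * s)
    return best

psd_case = is_psd_2x2(1.0, 0.5, 1.0)          # eigenvalues 1.5 and 0.5
indefinite_case = is_psd_2x2(1.0, 2.0, 1.0)   # eigenvalues 3 and -1
negative_case = is_psd_2x2(-1.0, 0.0, -1.0)   # determinant is +1, yet X = -I
```

The last case shows why the sign conditions x1 ≥ 0, x3 ≥ 0 cannot be dropped: the determinant condition alone also admits negative semidefinite matrices.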
[Figure caption, beginning lost in extraction: ". . . are smaller (e.g., E3). E3 is not minimal for the same reason. The ellipsoid E2 is minimal, since no other ellipsoid (centered at the origin) contains the points and is contained in E2."]
Key properties of convex sets

• Separating hyperplane theorem: two disjoint convex sets have a separating hyperplane between them

[Figure 2.19: The hyperplane {x | aT x = b} separates the disjoint convex sets C and D. The affine function aT x − b is nonpositive on C and nonnegative on D.]

Formally: if C, D are nonempty convex sets with C ∩ D = ∅, then there exist a ≠ 0 and b such that

C ⊆ {x : aT x ≤ b}
D ⊆ {x : aT x ≥ b}
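For two disjoint Euclidean balls a separating hyperplane can be written down explicitly, which makes the theorem concrete. This is a sketch for that special case only (the general theorem is an existence statement); the function name `separate_balls` and the test data are assumptions.

```python
import math

def separate_balls(c1, r1, c2, r2):
    """Return (a, b) with a^T x <= b on B(c1, r1) and a^T x >= b on B(c2, r2)."""
    gap = math.dist(c1, c2)
    if gap <= r1 + r2:
        raise ValueError("balls intersect; this construction needs disjoint sets")
    a = [(q - p) / gap for p, q in zip(c1, c2)]         # unit vector from c1 to c2
    hi_on_C = sum(ai * p for ai, p in zip(a, c1)) + r1  # max of a^T x over the first ball
    lo_on_D = sum(ai * q for ai, q in zip(a, c2)) - r2  # min of a^T x over the second ball
    return a, 0.5 * (hi_on_C + lo_on_D)

a, b = separate_balls((0.0, 0.0), 1.0, (4.0, 0.0), 1.0)
# Here a = [1, 0] and b = 2: the vertical line x1 = 2 lies between the two balls.
```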
• Supporting hyperplane theorem: a boundary point of a convex set has a supporting hyperplane passing through it

Formally: if C is a nonempty convex set, and x0 ∈ bd(C), then there exists a ≠ 0 such that

C ⊆ {x : aT x ≤ aT x0}
Operations preserving convexity
• Intersection: the intersection of convex sets is convex

• Scaling and translation: if C is convex, then

aC + b = {ax + b : x ∈ C}

is convex for any a, b

• Affine images and preimages: if f(x) = Ax + b and C is convex then

f(C) = {f(x) : x ∈ C}

is convex, and if D is convex then

f −1(D) = {x : f(x) ∈ D}

is convex
Example: linear matrix inequality solution set
Given A1, . . . , Ak, B ∈ Sn, a linear matrix inequality is of the form

x1 A1 + x2 A2 + . . . + xk Ak ≼ B

for a variable x ∈ Rk, where ≼ here is the PSD ordering: B − (x1 A1 + . . . + xk Ak) ≽ 0. Let's prove the set C of points x that satisfy the above inequality is convex

Approach 1: directly verify that x, y ∈ C ⇒ tx + (1 − t)y ∈ C. This follows by checking that, for any v,

$$v^T \Big( B - \sum_{i=1}^{k} \big(t x_i + (1 - t) y_i\big) A_i \Big) v \ge 0$$

Approach 2: let f : Rk → Sn, $f(x) = B - \sum_{i=1}^{k} x_i A_i$. Note that C = f −1(Sn+), affine preimage of convex set
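The check in Approach 1 reduces to one identity. Written out for t ∈ [0, 1] and x, y ∈ C (using B = tB + (1 − t)B), the quadratic form splits into two terms that are each nonnegative:

```latex
\begin{aligned}
v^T\Big(B - \sum_{i=1}^{k} \big(t x_i + (1-t) y_i\big) A_i\Big) v
&= t\, v^T\Big(B - \sum_{i=1}^{k} x_i A_i\Big) v
 + (1-t)\, v^T\Big(B - \sum_{i=1}^{k} y_i A_i\Big) v \\
&\ge 0 .
\end{aligned}
```

The first term is nonnegative because x ∈ C and t ≥ 0, the second because y ∈ C and 1 − t ≥ 0.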
More operations preserving convexity
• Perspective images and preimages: the perspective function is P : Rn × R++ → Rn (where R++ denotes positive reals),

P(x, z) = x/z

for z > 0. If C ⊆ dom(P) is convex then so is P(C), and if D is convex then so is P −1(D)

• Linear-fractional images and preimages: the perspective map composed with an affine function,

f(x) = (Ax + b) / (cT x + d)

is called a linear-fractional function, defined on cT x + d > 0. If C ⊆ dom(f) is convex then so is f(C), and if D is convex then so is f −1(D)
Example: Filter mask
• Let {h−n, . . . , h−1, h0, h1, . . . , hn} be a set of FIR filter coefficients. Assume h−i = hi (linear phase).

• The frequency response

$$H(\omega) = \sum_{i=-n}^{n} h_i e^{-j\omega i} = h_0 + 2 \sum_{i=1}^{n} h_i \cos(\omega i)$$

[Figure: H(ω) (dB) plotted against ω ∈ [0, 3], together with the mask U(ω) over the band ω1 ≤ ω ≤ ω2.]

• The set

$$\mathcal{H} = \{ (h_0, \ldots, h_n) \in \mathbb{R}^{n+1} \mid |H(\omega)| \le U(\omega),\ \omega_1 \le \omega \le \omega_2 \}$$

where U(ω) ≥ 0, is convex because

$$\mathcal{H} = \bigcap_{\omega_1 \le \omega \le \omega_2} \underbrace{\{ (h_0, \ldots, h_n) \in \mathbb{R}^{n+1} \mid -U(\omega) \le H(\omega) \le U(\omega) \}}_{\text{polyhedral for each } \omega}$$
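A quick numerical illustration of this example: since H(ω) is linear in (h0, . . . , hn), mixing two filters that satisfy the mask gives a filter that also satisfies it. The sketch below checks the mask on a finite frequency grid only; the flat mask `U`, the band [2, 3], and the two short filters are hypothetical values picked for the example.

```python
import math

def freq_response(h, w):
    """H(w) = h[0] + 2 * sum_{i>=1} h[i] cos(w i) for a symmetric (linear-phase) FIR filter."""
    return h[0] + 2 * sum(h[i] * math.cos(w * i) for i in range(1, len(h)))

def within_mask(h, U, w1, w2, grid=200):
    """Check |H(w)| <= U(w) on a uniform grid over [w1, w2] (a finite surrogate for all w)."""
    ws = [w1 + (w2 - w1) * k / (grid - 1) for k in range(grid)]
    return all(abs(freq_response(h, w)) <= U(w) + 1e-12 for w in ws)

U = lambda w: 0.5                      # hypothetical flat mask
h_a, h_b = [0.5, 0.25], [0.0, 0.2]     # two filters that satisfy the mask on [2, 3]
mix = [0.3 * a + 0.7 * b for a, b in zip(h_a, h_b)]

all_feasible = all(within_mask(h, U, 2.0, 3.0) for h in (h_a, h_b, mix))
```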
CONVEX FUNCTIONS
Convex Functions
A function f : Rn → R is said to be convex if

i) domf is convex; and

ii) for any x, y ∈ domf and θ ∈ [0, 1],

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

[Figure: graph of a convex function with the points (x, f(x)) and (y, f(y)) marked.]

• f is strictly convex if f(θx + (1 − θ)y) < θf(x) + (1 − θ)f(y) for all 0 < θ < 1 and for all x ≠ y.

• f is concave if −f is convex.
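The defining inequality suggests a simple randomized test for functions of one variable: sample x, y, θ and look for a violation. The helper name `convexity_counterexample`, the intervals, and the tolerance below are assumptions; finding no violation is evidence of convexity on the interval, not a proof.

```python
import math
import random

def convexity_counterexample(f, lo, hi, trials=10000, seed=0, tol=1e-9):
    """Look for x, y in [lo, hi] and theta in [0, 1] violating the convexity inequality."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, theta = rng.uniform(lo, hi), rng.uniform(lo, hi), rng.random()
        lhs = f(theta * x + (1 - theta) * y)
        rhs = theta * f(x) + (1 - theta) * f(y)
        if lhs > rhs + tol:
            return x, y, theta
    return None  # no violation found

square_ok = convexity_counterexample(lambda x: x * x, -5.0, 5.0)
neg_log_ok = convexity_counterexample(lambda x: -math.log(x), 0.1, 10.0)  # log is concave
sine_bad = convexity_counterexample(math.sin, 0.0, 2 * math.pi)
```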

[Figure: three example functions: convex; strictly convex (and of course convex); non-convex.]
1st and 2nd Order Conditions

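• Gradient (for differentiable f)

$$\nabla f(x) = \left[ \frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_n} \right]^T \in \mathbb{R}^n$$

• Hessian (for twice differentiable f): A matrix function ∇2f(x) ∈ Sn in which

$$[\nabla^2 f(x)]_{ij} = \frac{\partial^2 f(x)}{\partial x_i \partial x_j}$$

• Taylor series:

$$f(x + \nu) = f(x) + \nabla f(x)^T \nu + \tfrac{1}{2} \nu^T \nabla^2 f(x) \nu + \ldots$$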


• First-order condition: A differentiable f is convex iff given any x0 ∈ domf ,

f (x) ≥ f (x0) + ∇f (x0)T (x − x0), ∀x ∈ domf

[Figure: f(x) and the first-order approximation f(x0) + ∇f(x0)T (x − x0) at x0.]

• Second-order condition: A twice differentiable f is convex if and only if

∇2 f (x) ≽ 0, ∀x ∈ domf
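Both conditions can be spot-checked on a one-variable example. The sketch below uses f(x) = x log x on R++ with hand-derived derivatives f′(x) = log x + 1 and f″(x) = 1/x; the grid and tolerance are arbitrary choices for the illustration.

```python
import math

f = lambda x: x * math.log(x)        # convex on R++
df = lambda x: math.log(x) + 1.0     # f'(x)
d2f = lambda x: 1.0 / x              # f''(x), positive on R++

grid = [0.1 * k for k in range(1, 51)]   # sample points in (0, 5]

# First-order condition: every tangent line is a global underestimator.
first_order_ok = all(
    f(x) >= f(x0) + df(x0) * (x - x0) - 1e-12 for x0 in grid for x in grid
)
# Second-order condition: the (1 x 1) Hessian is nonnegative on the domain.
second_order_ok = all(d2f(x) >= 0 for x in grid)
```

For this f the gap f(x) − f(x0) − f′(x0)(x − x0) equals x log(x/x0) − (x − x0), which is nonnegative because log u ≥ 1 − 1/u.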
Restriction of a Convex Function to a Line

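• f is convex if and only if g(t) = f(x + tv), with dom g = {t | x + tv ∈ dom f}, is convex in t for any x ∈ dom f and v ∈ Rn.

• So we can check convexity of f by checking convexity of functions of one variable!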
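A minimal sketch of the idea: build g(t) = f(x + tv) for a chosen point x and direction v, then test g on a grid. The helper names and the two test functions are assumptions for this example. The bilinear function x1 x2 shows why every line matters: it is linear along the coordinate directions but concave along (1, −1).

```python
def restrict_to_line(f, x, v):
    """Return g(t) = f(x + t v), the restriction of f to the line through x with direction v."""
    return lambda t: f([xi + t * vi for xi, vi in zip(x, v)])

def midpoint_convex_on_grid(g, ts, tol=1e-12):
    """Check g((s + t)/2) <= (g(s) + g(t))/2 for all pairs of grid points."""
    return all(g(0.5 * (s + t)) <= 0.5 * (g(s) + g(t)) + tol for s in ts for t in ts)

sq_norm = lambda x: sum(xi * xi for xi in x)   # convex on R^n
bilinear = lambda x: x[0] * x[1]               # not convex on R^2

ts = [-2 + 0.25 * k for k in range(17)]        # t in [-2, 2]

convex_line = midpoint_convex_on_grid(restrict_to_line(sq_norm, [1.0, -1.0], [0.3, 2.0]), ts)
axis_line = midpoint_convex_on_grid(restrict_to_line(bilinear, [0.0, 1.0], [1.0, 0.0]), ts)
diag_line = midpoint_convex_on_grid(restrict_to_line(bilinear, [0.0, 0.0], [1.0, -1.0]), ts)
```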


Examples on R

• ax + b is convex. It is also concave.

• $x^2$ is convex (on R).

• |x| is convex.

• $e^{\alpha x}$ is convex.

• log x is concave on R++.

• x log x is convex on R+.

• $\log \int_{-\infty}^{x} e^{-t^2/2}\, dt$ is concave.
Examples on Rn
Affine function
$$f(x) = a^T x + b = \sum_{i=1}^{n} a_i x_i + b$$
is both convex and concave.

[Figure: surface plot of an affine f(x) over (x1, x2).]

Quadratic function
$$f(x) = x^T P x + 2 q^T x + r = \sum_{i=1}^{n} \sum_{j=1}^{n} P_{ij} x_i x_j + 2 \sum_{i=1}^{n} q_i x_i + r$$

is convex if and only if P ≽ 0.

[Figure: surface plots of f(x) over (x1, x2). (a) P ≽ 0. (b) P ⋡ 0.]
p-norm
$$f(x) = \|x\|_p = \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p}$$
is convex for p ≥ 1.
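The role of the condition p ≥ 1 shows up already with two points in R2. The sketch below evaluates the same formula at the midpoint of x = (1, 0) and y = (0, 1), where f(x) = f(y) = 1 for every p; the helper name `p_norm` is chosen for this example.

```python
def p_norm(x, p):
    """(sum_i |x_i|^p)^(1/p); a norm only when p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x, y = (1.0, 0.0), (0.0, 1.0)
mid = (0.5, 0.5)

# Convexity requires f(mid) <= (f(x) + f(y)) / 2, which equals 1 here.
ok_p1 = p_norm(mid, 1) <= 1.0        # holds with equality
ok_p2 = p_norm(mid, 2) <= 1.0        # about 0.707
ok_half = p_norm(mid, 0.5) <= 1.0    # value is about 2, so the inequality fails
```

For p = 0.5 the midpoint value is roughly twice the average of the endpoint values, so the function is not convex.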

[Figure: (a) Region of ∥x∥p = 1, p ≥ 1, shown for p = 1, 2, ∞. (b) Region of ∥x∥p = 1, p ≤ 1, shown for p = 0.1, 0.3, 0.5.]


Geometric mean
$$f(x) = \Big( \prod_{i=1}^{n} x_i \Big)^{1/n}$$
is concave on Rn++.

Log-sum-exp.
$$f(x) = \log \Big( \sum_{i=1}^{n} e^{x_i} \Big)$$
is convex on Rn. (Log-sum-exp. can be used as an approx. to $\max_{i=1,\ldots,n} x_i$.)
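How good the approximation to the max is can be made precise: max xi ≤ f(x) ≤ max xi + log n. The sketch below computes f in the usual overflow-safe way (subtracting the maximum first) and checks the bound on an arbitrary example vector; the function name is an assumption.

```python
import math

def log_sum_exp(x):
    """log(sum_i exp(x_i)), computed by shifting by max(x) to avoid overflow."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))

x = [1.0, 3.0, 2.5]
gap = log_sum_exp(x) - max(x)            # always between 0 and log(len(x))

# Scaling the input by k and dividing the result by k tightens the approximation.
k = 100.0
sharp = log_sum_exp([k * xi for xi in x]) / k
```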
16/10/19 Lecture 3: Convex Functions 14
Epigraph
The epigraph of f is

epif = {(x, t) | x ∈ domf, f (x) ≤ t}

A powerful property:
f convex ⇐⇒ epif convex
e.g., some convexity-preserving properties can be proven quite easily via the epigraph.

[Figure: the graph of f(x) against x, with epi f the region on and above it (vertical axis t).]
Sublevel Sets
The α-sublevel set of f

Sα = {x ∈ domf | f (x) ≤ α}

f convex =⇒ Sα convex for every α, but Sα convex for every α ⇏ f convex

[Figure: two functions f(x) with the level α and the sublevel set Sα marked: convex f and convex Sα; non-convex f but convex Sα.]


Epigraph and First-order Condition
• Consider the first-order condition for convexity:

f(y) ≥ f(x) + ∇f(x)T (y − x)

where f is convex and x, y ∈ dom f.

• We can interpret this basic inequality geometrically in terms of epi f. If (y, t) ∈ epi f, then

t ≥ f(y) ≥ f(x) + ∇f(x)T (y − x)

• This means that the hyperplane defined by (∇f(x), −1) supports epi f at the boundary point (x, f(x)). (See Fig. 3.6)

[Figure 3.6: the supporting hyperplane to epi f at (x, f(x)).]
Jensen’s Inequality
• The basic inequality for convex f

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

is also called Jensen’s inequality.

• It can be extended to

$$f\Big( \sum_{i=1}^{k} \theta_i x_i \Big) \le \sum_{i=1}^{k} \theta_i f(x_i)$$

where θ1, . . . , θk ≥ 0, and $\sum_{i=1}^{k} \theta_i = 1$; and to

$$f\Big( \int_S p(x)\, x\, dx \Big) \le \int_S p(x) f(x)\, dx$$

where p(x) ≥ 0 on S ⊆ domf, and $\int_S p(x)\, dx = 1$.
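The finite-sum form is easy to evaluate directly. In the sketch below, `jensen_gap` (a name chosen for this example) returns the right-hand side minus the left-hand side, which is nonnegative for convex f and nonpositive for concave f; the points and weights are arbitrary.

```python
import math

def jensen_gap(f, points, weights):
    """Return sum_i w_i f(x_i) - f(sum_i w_i x_i); nonnegative when f is convex."""
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-12
    mean = sum(w * x for w, x in zip(weights, points))
    return sum(w * f(x) for w, x in zip(weights, points)) - f(mean)

points, weights = [0.5, 1.0, 4.0], [0.2, 0.5, 0.3]
gap_exp = jensen_gap(math.exp, points, weights)   # exp is convex, so the gap is >= 0
gap_log = jensen_gap(math.log, points, weights)   # log is concave, so the gap is <= 0
```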
Inequalities derived from Jensen’s inequality:

• Arithmetic-geometric inequality: √(ab) ≤ (a + b)/2 for a, b ≥ 0

• Hadamard inequality: $\det X \le \prod_{i=1}^{n} X_{ii}$ for X ∈ Sn+

• Kullback-Leibler divergence: Let p(x), q(x) be PDFs on S,

$$\int_S p(x) \log\Big( \frac{p(x)}{q(x)} \Big) dx \ge 0$$

• Hölder inequality:
$$x^T y \le \|x\|_p \|y\|_q$$
where 1/p + 1/q = 1, p > 1.
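As one worked instance, the arithmetic-geometric inequality follows from Jensen's inequality applied to the convex function −log on R++ with θ = 1/2 (the case a = 0 or b = 0 is immediate):

```latex
\begin{aligned}
-\log\Big(\frac{a+b}{2}\Big) &\le \frac{-\log a - \log b}{2}
  && \text{(Jensen with } f = -\log,\ \theta = \tfrac12\text{)} \\
\log\Big(\frac{a+b}{2}\Big) &\ge \log\sqrt{ab} \\
\sqrt{ab} &\le \frac{a+b}{2}
  && \text{(log is increasing)}
\end{aligned}
```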
Operations that Preserve Convexity
• Practical methods for establishing convexity of a function
Convexity Preserving Operations
Affine transformation of the domain:

f convex =⇒ f (Ax + b) convex

Example: (Least squares function)

f (x) = ∥y − Ax∥2

is convex, since ∥ · ∥2 is convex and y − Ax is affine in x.

Example: (MIMO capacity)

f (X) = log det(HXH T + I)

is concave on Sn+, since f (X) = log det(X) is concave on Sn++.


Composition: Let g : Rn → R and h : R → R.

g convex, h convex, extended h nondecreasing =⇒ f (x) = h(g(x)) convex

g concave, h convex, extended h nonincreasing =⇒ f (x) = h(g(x)) convex

Example:
f (x) = ∥y − Ax∥22
is convex by composition, where g(x) = ∥y − Ax∥2 is convex and h(x) = max{0, x}² is convex and nondecreasing.
Non-negative weighted sum:
f1, . . . , fm convex, w1, . . . , wm ≥ 0 =⇒ $\sum_{i=1}^{m} w_i f_i$ convex

Example: Regularized least squares function

f (x) = ∥y − Ax∥22 + γ∥x∥22

is convex for γ ≥ 0.

Extension of non-negative weighted sum:

f(x, y) convex in x for each y ∈ A, w(y) ≥ 0 for each y ∈ A =⇒ $\int_A w(y) f(x, y)\, dy$ convex

Example: Ergodic MIMO capacity


$$f(X) = E_H\{\log\det(H X H^T + I)\} = \int p(H) \log\det(H X H^T + I)\, dH$$

is concave on Sn+.
Pointwise maximum:

f1, . . . , fm convex =⇒ f (x) = max{f1(x), . . . , fm(x)} convex

Example: Infinity norm

$$f(x) = \|x\|_\infty = \max_{i=1,\ldots,n} |x_i|$$

is convex because f(x) = max{x1, −x1, x2, −x2, . . . , xn, −xn}.

Pointwise supremum:

f(x, y) convex in x for each y ∈ A =⇒ $\sup_{y \in A} f(x, y)$ convex

Example: Worst-case least squares function

$$f(x) = \max_{E \in \mathcal{E}} \|y - (A + E)x\|_2^2$$

is convex. (The set ℰ does not even need to be convex!)


Perspective
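• The perspective of a function f : Rn → R is the function g(x, t) = t f(x/t), with dom g = {(x, t) | x/t ∈ dom f, t > 0}. If f is convex, then so is g.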

• Examples:
Approaches to prove convexity of functions
• First, the domain of the function f is proved to be convex
• 1. Check whether the definition is satisfied, i.e.,
f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y)
• 2. The restriction of the function to any line is convex (necessary and sufficient condition):
g(t) = f(x + tv), dom g = {t | x + tv ∈ dom f}, is convex (in t)
– For any x ∈ dom f, v ∈ Rn

• 3. Epigraph is convex.
• 4. Hessian is positive semidefinite.
