Lecture3 ConvexSetsFuns PDF

Optimization
Techniques and
Applications
Convex sets & convex functions
Courtesy of K. Ma at HKCU
CONVEX SETS
Convex Sets
A set C ⊆ Rn is said to be convex if, for any x, y ∈ C,
θx + (1 − θ)y ∈ C
for any 0 ≤ θ ≤ 1.
[Figure: a convex set and a non-convex set, with a point θx + (1 − θ)y on the segment between x and y.]
• For C to be convex, the line segment between any two points in C must lie entirely in C.

Examples of Convex Sets
Hyperplane:
C = {x | aT x = b}
where a ∈ Rn, & b ∈ R.
Halfspace:
C = {x | aT x ≤ b}
[Figure: the hyperplane aT x = b and the halfspace aT x ≤ b.]
Polyhedron:
C = {x | aTi x ≤ bi, i = 1, . . . , m, cTi x = di, i = 1, . . . , p}
For convenience we use matrix notations to represent a polyhedron:
C = {x | Ax ≼ b, Cx = d}
where A ∈ Rm×n, C ∈ Rp×n, b ∈ Rm, d ∈ Rp, and ≼ denotes elementwise inequality.
[Figure: a polyhedron C bounded by five halfspaces with normal vectors a1, . . . , a5.]
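The matrix description above translates directly into a membership test. The following is a minimal sketch in plain Python; the function names and the example data (a triangle in R3) are illustrative assumptions, not part of the lecture.

```python
def in_polyhedron(x, A, b, C, d, tol=1e-9):
    """Return True if A x <= b (elementwise) and C x = d, up to tol."""
    dot = lambda row: sum(r * xi for r, xi in zip(row, x))
    ineq_ok = all(dot(row) <= bi + tol for row, bi in zip(A, b))
    eq_ok = all(abs(dot(row) - di) <= tol for row, di in zip(C, d))
    return ineq_ok and eq_ok

def convex_combination(x, y, theta):
    """theta * x + (1 - theta) * y for vectors given as lists."""
    return [theta * xi + (1 - theta) * yi for xi, yi in zip(x, y)]

# Hypothetical example: {x | x >= 0, x1 + x2 + x3 = 1} in R^3.
A = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]   # -x <= 0
b = [0, 0, 0]
C = [[1, 1, 1]]
d = [1]
x = [1.0, 0.0, 0.0]
y = [0.0, 0.5, 0.5]
z = convex_combination(x, y, 0.3)          # stays in the polyhedron
```

Since both x and y satisfy the constraints, so does z, as the definition of a convex set requires.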
W.-K. Ma 3
Convex hull of a set of points {x1 , x2, . . . , xk }:
C = conv{x1, x2, . . . , xk } = {x = θ1x1+. . .+θk xk | θ1+. . .+θk = 1, θ1, . . . , θk ≥ 0}
• The set of all convex combinations of {x1, x2, . . . , xk }
• conv{x1, . . . , xk} is a polyhedron. (Conversely, every bounded polyhedron is the convex hull of a finite set of points.)
[Figure: (a) convex hull where only some of the x1, x2, . . . , xk are vertices; (b) convex hull where all x1, . . . , xk are vertices.]
• x ∈ C is an extreme point or vertex of C if
\[ x \neq \sum_{i=1}^{k} \theta_i x_i \]
for any θ1 + . . . + θk = 1, θ1, . . . , θk ≥ 0, θi ≠ 1 for any i.
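To see which of the points x1, . . . , xk are vertices of their convex hull, here is a small sketch for points in the plane. It uses Andrew's monotone chain algorithm, which is a choice made for this illustration and not something the lecture prescribes; the sample points are hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_vertices(points):
    """Vertices of conv(points) in counter-clockwise order (2-D points as tuples)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                      # build the lower chain
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # build the upper chain
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]     # endpoints are shared, drop duplicates

# Four corners of a square plus an interior point and an edge midpoint.
points = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5), (0.5, 0)]
vertices = hull_vertices(points)
```

Only the four corners are returned: the interior point and the edge midpoint are convex combinations of other points, so they are not vertices.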
Euclidean ball:
\[ B(x_c, r) = \{ x \mid \|x - x_c\|_2 \le r \} = \Big\{ x \;\Big|\; \sqrt{\textstyle\sum_{i=1}^{n} (x_i - x_{c,i})^2} \le r \Big\} \]
[Figure: a Euclidean ball centered at xc.]
Ellipsoid:
E(xc, P ) = {x | (x − xc)T P −1(x − xc) ≤ 1}
where P ∈ Sn (Sn = the set of n × n symmetric matrices), and P is positive definite (so that P −1 exists).
[Figure: an ellipsoid centered at xc, with its semi-axes labeled by λ1 and λ2.]
Symmetric eigendecomposition of P :
P = QΛQT
• The eigenvector matrix Q (Q ∈ Rn×n, QT Q = I) controls the rotation;

• The eigenvalue matrix Λ = diag(λ1, . . . , λn) controls the lengths of the semi-axes (the i-th semi-axis has length √λi).
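For n = 2 the eigenvalues of P have a closed form, so the semi-axis lengths √λ1 and √λ2 can be computed by hand. The sketch below assumes P = [[a, b], [b, c]] is positive definite; the function name is hypothetical.

```python
import math

def semi_axes_2x2(a, b, c):
    """Semi-axis lengths of {x | x^T P^{-1} x <= 1} for P = [[a, b], [b, c]].

    Assumes P is positive definite (a > 0 and a*c - b*b > 0).
    """
    if not (a > 0 and a * c - b * b > 0):
        raise ValueError("P must be positive definite")
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    lam1, lam2 = mean + radius, mean - radius   # eigenvalues of P
    return math.sqrt(lam1), math.sqrt(lam2)

long_axis, short_axis = semi_axes_2x2(4.0, 0.0, 1.0)
```

With P = diag(4, 1) the ellipsoid is x1²/4 + x2² ≤ 1, whose semi-axes have lengths 2 and 1.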
Convex Cones
A set C ⊆ Rn is said to be a convex cone if, for any x, y ∈ C,
θ1 x + θ2 y ∈ C
for any θ1, θ2 ≥ 0.
[Figure: a convex cone with apex at 0.]
• A convex cone is a convex set.

Examples of Convex Cones
Second-order cone (SOC) (aka Lorentz cone, or ice-cream cone):
K = {(x, t) | ∥x∥2 ≤ t}
[Figure: the second-order cone in R3, plotted over axes x1, x2, and t.]
Positive semidefinite (PSD) cone:
Sn+ = {X ∈ Sn | X ≽ 0}
where X ≽ 0 means that X is PSD; i.e.,
zT Xz ≥ 0, for all z ∈ Rn
Positive definite (PD) cone:
Sn++ = {X ∈ Sn | X ≻ 0}
where X ≻ 0 means that X is PD; i.e.,
zT Xz > 0, for all z ∈ Rn \ {0}

Example: n = 2
\[ \begin{bmatrix} x_1 & x_2 \\ x_2 & x_3 \end{bmatrix} \succeq 0 \iff x_1 x_3 - x_2^2 \ge 0,\ x_1 \ge 0,\ x_3 \ge 0 \]
[Figure: the PSD cone for n = 2, plotted over axes x1, x2, and x3.]
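The n = 2 characterization can be compared against the definition zT Xz ≥ 0 numerically. This is a rough sketch: the sampling routine only evaluates the quadratic form at random points z, so it can expose a non-PSD matrix but cannot prove that a matrix is PSD. Both function names are illustrative.

```python
import random

def is_psd_2x2(x1, x2, x3, tol=1e-12):
    """PSD test for [[x1, x2], [x2, x3]] using the three conditions above."""
    return x1 * x3 - x2 * x2 >= -tol and x1 >= -tol and x3 >= -tol

def min_quadratic_form(x1, x2, x3, samples=2000, seed=0):
    """Smallest sampled value of z^T X z over random z in [-1, 1]^2."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(samples):
        z1, z2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        best = min(best, x1 * z1 * z1 + 2 * x2 * z1 * z2 + x3 * z2 * z2)
    return best

psd_example = (1.0, 0.0, 1.0)       # identity matrix
non_psd_example = (1.0, 2.0, 1.0)   # x1*x3 - x2^2 = -3 < 0
```

For the second matrix, z = (1, −1) gives zT Xz = −2, so sampled values of the quadratic form go negative, in agreement with the determinant condition.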
Key properties of convex sets
• Separating hyperplane theorem: two disjoint convex sets have a separating hyperplane between them
Formally: if C, D are nonempty convex sets with C ∩ D = ∅, then there exist a ≠ 0 and b such that
C ⊆ {x : aT x ≤ b}
D ⊆ {x : aT x ≥ b}
Figure 2.19: The hyperplane {x | aT x = b} separates the disjoint convex sets C and D. The affine function aT x − b is nonpositive on C and nonnegative on D.
• Supporting hyperplane theorem: a boundary point of a convex set has a supporting hyperplane passing through it
Formally: if C is a nonempty convex set, and x0 ∈ bd(C), then there exists a such that
C ⊆ {x : aT x ≤ aT x0}
Operations preserving convexity
• Intersection: the intersection of convex sets is convex
• Scaling and translation: if C is convex, then
aC + b = {ax + b : x ∈ C}
is convex for any a, b
• Affine images and preimages: if f(x) = Ax + b and C is convex then
f(C) = {f(x) : x ∈ C}
is convex, and if D is convex then
f −1(D) = {x : f(x) ∈ D}
is convex
Example: linear matrix inequality solution set
Given A1, . . . , Ak, B ∈ Sn, a linear matrix inequality is of the form
x1 A1 + x2 A2 + . . . + xk Ak ⪯ B
for a variable x ∈ Rk. Let's prove that the set C of points x that satisfy the above inequality is convex.
Approach 1: directly verify that x, y ∈ C ⇒ tx + (1 − t)y ∈ C. This follows by checking that, for any v,
\[ v^T \Big( B - \sum_{i=1}^{k} \big( t x_i + (1 - t) y_i \big) A_i \Big) v \ge 0 \]
Approach 2: let f : Rk → Sn,
\[ f(x) = B - \sum_{i=1}^{k} x_i A_i \]
Note that C = f −1(Sn+), an affine preimage of a convex set.
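The check in Approach 1 can be written out in one line. The step below is a sketch of the omitted algebra, assuming t ∈ [0, 1] and using B = tB + (1 − t)B to split the quadratic form:

```latex
\[
v^T \Big( B - \sum_{i=1}^{k} \big( t x_i + (1-t) y_i \big) A_i \Big) v
= t \, v^T \Big( B - \sum_{i=1}^{k} x_i A_i \Big) v
+ (1-t) \, v^T \Big( B - \sum_{i=1}^{k} y_i A_i \Big) v
\;\ge\; 0 .
\]
```

Both quadratic forms on the right are nonnegative because x, y ∈ C, and the weights t and 1 − t are nonnegative.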
More operations preserving convexity
• Perspective images and preimages: the perspective function is P : Rn × R++ → Rn (where R++ denotes positive reals),
P(x, z) = x/z
for z > 0. If C ⊆ dom(P) is convex then so is P(C), and if D is convex then so is P −1(D)
• Linear-fractional images and preimages: the perspective map composed with an affine function,
f(x) = (Ax + b)/(cT x + d)
is called a linear-fractional function, defined on cT x + d > 0. If C ⊆ dom(f) is convex then so is f(C), and if D is convex then so is f −1(D)
Example: Filter mask
• Let {h−n, . . . , h−1, h0, h1, . . . , hn} be a set of FIR filter coefficients. Assume h−i = hi (linear phase).
• The frequency response
\[ H(\omega) = \sum_{i=-n}^{n} h_i e^{-j\omega i} = h_0 + 2 \sum_{i=1}^{n} h_i \cos(\omega i) \]
[Figure: H(ω) (dB) versus ω, with the mask U(ω) over ω1 ≤ ω ≤ ω2.]
• The set
\[ \mathcal{H} = \{ (h_0, \ldots, h_n) \in \mathbb{R}^{n+1} \mid |H(\omega)| \le U(\omega),\ \omega_1 \le \omega \le \omega_2 \} \]
where U(ω) ≥ 0, is convex because
\[ \mathcal{H} = \bigcap_{\omega_1 \le \omega \le \omega_2} \{ (h_0, \ldots, h_n) \in \mathbb{R}^{n+1} \mid -U(\omega) \le H(\omega) \le U(\omega) \} \]
where the set inside the intersection is polyhedral for each ω.
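The argument can be tried out numerically: for each fixed ω the constraint is linear in (h0, . . . , hn), so a convex combination of two feasible coefficient vectors stays feasible. The sketch below samples ω on a finite grid, which only approximates the constraint over the whole band; the mask, band, and coefficient values are hypothetical.

```python
import math

def freq_response(h, w):
    """H(w) = h[0] + 2 * sum_{i>=1} h[i] * cos(w * i) for a linear-phase filter."""
    return h[0] + 2 * sum(h[i] * math.cos(w * i) for i in range(1, len(h)))

def in_mask(h, upper, w1, w2, grid=200, tol=1e-12):
    """Check |H(w)| <= U(w) on a finite grid of [w1, w2] (a sampled approximation)."""
    for k in range(grid + 1):
        w = w1 + (w2 - w1) * k / grid
        if abs(freq_response(h, w)) > upper(w) + tol:
            return False
    return True

# Hypothetical data: a constant mask U(w) = 0.5 on the band [2, 3].
U = lambda w: 0.5
ha = [0.2, 0.1, 0.0]
hb = [0.1, -0.1, 0.05]
hc = [0.5 * a + 0.5 * b for a, b in zip(ha, hb)]   # midpoint of ha and hb
```

Here ha and hb both satisfy the mask, and so does their midpoint hc.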
CONVEX FUNCTIONS
Convex Functions
A function f : Rn → R is said to be convex if
i) domf is convex; and
ii) for any x, y ∈ domf and θ ∈ [0, 1],
f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)
[Figure: graph of a convex f with the points (x, f(x)) and (y, f(y)) marked.]
• f is strictly convex if f(θx + (1 − θ)y) < θf(x) + (1 − θ)f(y) for all 0 < θ < 1 and for all x ≠ y.
• f is concave if −f is convex.
[Figure: three example functions: strictly convex (and of course convex), convex, and non-convex.]
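The defining inequality can be tested on a finite grid of points and weights. This is only a sketch of a falsification test for functions on R: finding a violating triple disproves convexity, while finding none proves nothing. The grid choices and names are assumptions made for the example.

```python
def find_violation(f, points, thetas, tol=1e-12):
    """Search a finite grid for x, y, theta violating the convexity inequality.

    Returns a violating triple (x, y, theta), or None if none is found.
    """
    for x in points:
        for y in points:
            for t in thetas:
                lhs = f(t * x + (1 - t) * y)
                rhs = t * f(x) + (1 - t) * f(y)
                if lhs > rhs + tol:
                    return (x, y, t)
    return None

points = [i / 4 for i in range(-8, 9)]   # grid on [-2, 2]
thetas = [i / 10 for i in range(11)]     # grid on [0, 1]

square = lambda x: x * x                 # convex
neg_square = lambda x: -x * x            # concave, not convex
```

On this grid no violation is found for x² or |x|, while −x² fails immediately (for example, between x = −2 and y = 2).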
1st and 2nd Order Conditions
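• Gradient (for differentiable f)
\[ \nabla f(x) = \left[ \frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_n} \right]^T \in \mathbb{R}^n \]
• Hessian (for twice differentiable f): A matrix function ∇2f(x) ∈ Sn in which
\[ [\nabla^2 f(x)]_{ij} = \frac{\partial^2 f(x)}{\partial x_i \partial x_j} \]
• Taylor series:
\[ f(x + \nu) = f(x) + \nabla f(x)^T \nu + \tfrac{1}{2} \nu^T \nabla^2 f(x) \nu + \ldots \]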

• First-order condition: A differentiable f is convex iff given any x0 ∈ domf ,
f (x) ≥ f (x0) + ∇f (x0)T (x − x0), ∀x ∈ domf
[Figure: the graph of f(x) and its first-order approximation f(x0) + ∇f(x0)T (x − x0) at x0.]
• Second-order condition: A twice differentiable f is convex if and only if
∇2 f (x) ≽ 0, ∀x ∈ domf
Restriction of a Convex Function to a Line
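• f is convex if and only if g(t) = f(x + tv), with dom g = {t | x + tv ∈ dom f}, is convex in t for any x ∈ dom f and v ∈ Rn.
• So convexity of f can be checked by checking convexity of functions of one variable!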
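A short sketch of the restriction g(t) = f(x + tv) in plain Python. The midpoint test on a grid is only a necessary condition for convexity of g, and the two example functions are assumptions chosen for illustration.

```python
def restrict_to_line(f, x, v):
    """Return g(t) = f(x + t*v) for vectors x, v given as lists."""
    return lambda t: f([xi + t * vi for xi, vi in zip(x, v)])

def midpoint_convex_on_grid(g, ts, tol=1e-12):
    """Check g((s+t)/2) <= (g(s)+g(t))/2 for all grid pairs (a necessary condition)."""
    return all(g((s + t) / 2) <= (g(s) + g(t)) / 2 + tol for s in ts for t in ts)

sq_norm = lambda x: sum(xi * xi for xi in x)   # convex on R^n
saddle = lambda x: x[0] * x[1]                 # not convex on R^2
ts = [i / 2 for i in range(-6, 7)]

g_good = restrict_to_line(sq_norm, [1.0, -2.0], [0.5, 3.0])
g_bad = restrict_to_line(saddle, [0.0, 0.0], [1.0, -1.0])    # g(t) = -t^2
g_lucky = restrict_to_line(saddle, [0.0, 0.0], [1.0, 1.0])   # g(t) = t^2
```

The function x1·x2 looks convex along the direction (1, 1) but not along (1, −1), which is why the condition must hold for every line.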

Examples on R
• ax + b is convex. It is also concave.
• x² is convex (on R).
• |x| is convex.
• e^{αx} is convex.
• log x is concave on R++.
• x log x is convex on R+.
• log ∫_{−∞}^{x} e^{−t²/2} dt is concave.
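Several of these examples follow from the sign of the second derivative. As a worked check (a sketch covering only the twice differentiable cases, on the interior of each domain):

```latex
\[
\frac{d^2}{dx^2}\, x^2 = 2 \ge 0, \qquad
\frac{d^2}{dx^2}\, e^{\alpha x} = \alpha^2 e^{\alpha x} \ge 0, \qquad
\frac{d^2}{dx^2}\, \log x = -\frac{1}{x^2} < 0 \ \ (x > 0), \qquad
\frac{d^2}{dx^2}\, (x \log x) = \frac{1}{x} > 0 \ \ (x > 0).
\]
```

The function |x| is not differentiable at 0, so its convexity is checked from the definition instead, using the triangle inequality.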
Examples on Rn
Affine function
\[ f(x) = a^T x + b = \sum_{i=1}^{n} a_i x_i + b \]
is both convex and concave.
[Figure: surface plot of an affine f(x) over (x1, x2).]
Quadratic function
\[ f(x) = x^T P x + 2 q^T x + r = \sum_{i=1}^{n} \sum_{j=1}^{n} P_{ij} x_i x_j + 2 \sum_{i=1}^{n} q_i x_i + r \]
is convex if and only if P ≽ 0.
[Figure: surface plots of f(x) over (x1, x2) for (a) P ≽ 0 and (b) P ⋡ 0.]
p-norm
\[ f(x) = \|x\|_p = \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p} \]
is convex for p ≥ 1.
[Figure: (a) region of ∥x∥p = 1, p ≥ 1, with curves for p = 1, 2, ∞; (b) region of ∥x∥p = 1, p ≤ 1, with curves for p = 0.5, 0.3, 0.1.]
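A quick numerical illustration of why p ≥ 1 is needed: the sketch below evaluates the same formula at the midpoint of two unit vectors for several values of p. The test points are chosen for illustration.

```python
def p_norm(x, p):
    """(sum_i |x_i|^p)^(1/p); a norm only for p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x, y = [1.0, 0.0], [0.0, 1.0]
mid = [0.5 * a + 0.5 * b for a, b in zip(x, y)]   # midpoint of x and y

for p in (0.5, 1, 2):
    lhs = p_norm(mid, p)                               # f(midpoint)
    rhs = 0.5 * p_norm(x, p) + 0.5 * p_norm(y, p)      # average of f(x), f(y)
    print(p, lhs <= rhs + 1e-12)
```

For p = 0.5 the midpoint has value 2 while both endpoints have value 1, so the convexity inequality fails; for p = 1 and p = 2 it holds at these points.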

Geometric mean
\[ f(x) = \Big( \prod_{i=1}^{n} x_i \Big)^{1/n} \]
is concave on Rn++.
Log-sum-exp.
\[ f(x) = \log \Big( \sum_{i=1}^{n} e^{x_i} \Big) \]
is convex on Rn. (Log-sum-exp. can be used as an approximation to max_{i=1,...,n} xi.)
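The approximation to the maximum can be made precise: max xi ≤ f(x) ≤ max xi + log n. The sketch below computes log-sum-exp with the usual shift by the maximum to avoid overflow; the sample vector is hypothetical.

```python
import math

def log_sum_exp(x):
    """Numerically stable log(sum_i exp(x_i)), computed by shifting by max(x)."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))

x = [1.0, 3.0, 2.5]
gap = log_sum_exp(x) - max(x)   # always between 0 and log(n)

# Scaling by t and dividing by t tightens the approximation to max(x).
t = 100.0
scaled_gap = log_sum_exp([t * xi for xi in x]) / t - max(x)
```

The gap to the maximum is at most log(n), and at most log(n)/t after scaling by t.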
16/10/19 Lecture 3: Convex Functions 14
Epigraph
The epigraph of f is
epif = {(x, t) | x ∈ domf, f (x) ≤ t}
A powerful property:
f convex ⇐⇒ epif convex
For example, some convexity-preserving properties can be proven quite easily via the epigraph.
[Figure: the graph of f(x) and the region epif above it, plotted over axes x and t.]
Sublevel Sets
The α-sublevel set of f
Sα = {x ∈ domf | f (x) ≤ α}
f convex =⇒ Sα convex for every α, but Sα convex for every α ⇏ f convex
[Figure: two plots of f(x) with the level α and the sublevel set Sα marked: a convex f with convex Sα, and a non-convex f with convex Sα.]
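A concrete instance of the second case is f(x) = √|x| on R, chosen here as an illustrative assumption. Every sublevel set is an interval, yet the function is not convex, as this sketch shows.

```python
import math

f = lambda x: math.sqrt(abs(x))   # hypothetical example function on R

def sublevel_interval(alpha):
    """S_alpha = {x | sqrt(|x|) <= alpha} = [-alpha^2, alpha^2] (empty if alpha < 0)."""
    return (-alpha ** 2, alpha ** 2) if alpha >= 0 else None

# The convexity inequality fails between x = 0 and y = 4 at theta = 1/2:
lhs = f(0.5 * 0 + 0.5 * 4)        # f(2) = sqrt(2)
rhs = 0.5 * f(0) + 0.5 * f(4)     # 0.5 * 0 + 0.5 * 2 = 1
```

Here lhs > rhs, so f is not convex, even though each sublevel set is an interval and therefore convex.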

Epigraph and First-order Condition
• Consider the first-order condition for convexity:
f(y) ≥ f(x) + ∇f(x)T (y − x)
where f is convex and x, y ∈ dom f.
• We can interpret this basic inequality geometrically in terms of epi f. If (y, t) ∈ epi f, then
t ≥ f(y) ≥ f(x) + ∇f(x)T (y − x)
• This means that the hyperplane defined by (∇f(x), −1) supports epi f at the boundary point (x, f(x)). (See Fig. 3.6)
[Figure 3.6: the supporting hyperplane to epi f at (x, f(x)).]
Jensen’s Inequality
• The basic inequality for convex f
f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)
is also called Jensen’s inequality.
• It can be extended to
\[ f\Big( \sum_{i=1}^{k} \theta_i x_i \Big) \le \sum_{i=1}^{k} \theta_i f(x_i) \]
where θ1, . . . , θk ≥ 0, and θ1 + . . . + θk = 1; and to
\[ f\Big( \int_S p(x)\, x \, dx \Big) \le \int_S p(x) f(x) \, dx \]
where p(x) ≥ 0 on S ⊆ domf, and ∫_S p(x) dx = 1.
Inequalities derived from Jensen’s inequality:
• Arithmetic-geometric inequality: √(ab) ≤ (a + b)/2 for a, b ≥ 0
• Hadamard inequality: det X ≤ ∏_{i=1}^{n} Xii for X ∈ Sn+
• Kullback-Leibler divergence: Let p(x), q(x) be PDFs on S,
\[ \int_S p(x) \log \frac{p(x)}{q(x)} \, dx \ge 0 \]
• Hölder inequality:
xT y ≤ ∥x∥p ∥y∥q
where 1/p + 1/q = 1, p > 1.
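As a worked example of how such inequalities follow from Jensen's inequality, here is a sketch of the arithmetic-geometric case for a, b > 0 (the case where a or b is 0 is immediate). It uses the convexity of −log x with θ = 1/2:

```latex
\[
-\log\Big( \frac{a + b}{2} \Big) \le \frac{-\log a - \log b}{2}
\;\Longrightarrow\;
\log\Big( \frac{a + b}{2} \Big) \ge \log \sqrt{ab}
\;\Longrightarrow\;
\frac{a + b}{2} \ge \sqrt{ab} .
\]
```

The last step uses the fact that log is increasing.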
Operations that Preserve Convexity
• Practical methods for establishing convexity of a function
Convexity Preserving Operations
Affine transformation of the domain:
f convex =⇒ f (Ax + b) convex
Example: (Least squares function)
f(x) = ∥y − Ax∥2
is convex, since ∥ · ∥2 is convex.
Example: (MIMO capacity)
f(X) = log det(HXHT + I)
is concave on Sn+, since log det(X) is concave on Sn++.

Composition: Let g : Rn → R and h : R → R, and let h̃ denote the extended-value extension of h.
g convex, h convex, h̃ nondecreasing =⇒ f(x) = h(g(x)) convex
g concave, h convex, h̃ nonincreasing =⇒ f(x) = h(g(x)) convex
Example:
f(x) = ∥y − Ax∥_2^2
is convex by composition, where g(x) = ∥y − Ax∥2, h(x) = (max{0, x})².
Non-negative weighted sum:
\[ f_1, \ldots, f_m \text{ convex}, \quad w_1, \ldots, w_m \ge 0 \;\Longrightarrow\; \sum_{i=1}^{m} w_i f_i \text{ convex} \]
Example: Regularized least squares function
f(x) = ∥y − Ax∥_2^2 + γ∥x∥_2^2
is convex for γ ≥ 0.
Extension of non-negative weighted sum:
\[ f(x, y) \text{ convex in } x \text{ for each } y \in A, \quad w(y) \ge 0 \text{ for each } y \in A \;\Longrightarrow\; \int_A w(y) f(x, y) \, dy \text{ convex} \]
Example: Ergodic MIMO capacity
\[ f(X) = E_H\{ \log\det(H X H^T + I) \} = \int p(H) \log\det(H X H^T + I) \, dH \]
is concave on Sn+.
Pointwise maximum:
f1, . . . , fm convex =⇒ f (x) = max{f1(x), . . . , fm(x)} convex
Example: Infinity norm
f(x) = ∥x∥∞ = max_{i=1,...,n} |xi|
is convex because f(x) = max{x1, −x1, x2, −x2, . . . , xn, −xn}.
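The representation as a pointwise maximum of 2n linear functions is easy to spell out in code. This is a minimal sketch; the function name and test vectors are illustrative assumptions.

```python
def inf_norm_as_max(x):
    """||x||_inf written as the pointwise maximum of the 2n linear functions +/- x_i."""
    pieces = [xi for xi in x] + [-xi for xi in x]
    return max(pieces)

x = [0.5, -3.0, 2.0]
y = [1.0, 1.0, -4.0]
theta = 0.25
z = [theta * a + (1 - theta) * b for a, b in zip(x, y)]

lhs = inf_norm_as_max(z)
rhs = theta * inf_norm_as_max(x) + (1 - theta) * inf_norm_as_max(y)
```

Each piece is linear (hence convex) in x, so the maximum is convex; at these points lhs = 2.5 and rhs = 3.75.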
Pointwise supremum:
f(x, y) convex in x for each y ∈ A =⇒ sup_{y∈A} f(x, y) convex
Example: Worst-case least squares function
f(x) = max_{E∈ℰ} ∥y − (A + E)x∥_2^2
is convex. (The set ℰ does not even need to be convex!)

Perspective
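• The perspective of a function f : Rn → R is g(x, t) = t f(x/t), with dom g = {(x, t) | x/t ∈ dom f, t > 0}; g is convex if f is convex.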
• Examples:
Approaches to prove convexity of functions
• First, show that the domain of f is convex. Then use one of the following:
• 1. Verify the definition, i.e.,
f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y)
• 2. Show that the restriction of f to any line is convex (necessary and sufficient condition):
g(t) = f(x + tv), dom g = {t | x + tv ∈ dom f}, is convex (in t)
– for any x ∈ dom f, v ∈ Rn
• 3. Show that the epigraph is convex.
• 4. Show that the Hessian is positive semidefinite (for twice differentiable f).
