Convexity II: Optimization Basics
Ryan Tibshirani
Convex Optimization 10-725
Last time: convexity

• Key properties (e.g., first- and second-order characterizations for functions)
• Operations that preserve convexity (e.g., affine composition)
• E.g., is max{log(1/((aᵀx + b)⁷)), ‖Ax + b‖₁⁵} convex?

[Figures from the B & V book: the graph of a convex function, where the chord (line segment) between any two points on the graph lies above the graph; some simple convex and nonconvex sets (a hexagon including its boundary is convex; a kidney-shaped set is not, since the segment between two of its points leaves the set; a square containing some boundary points but not others is not convex); the convex hulls of two sets in R².]
Outline
Today:
• Optimization terminology
• Properties and first-order optimality
• Equivalent transformations
Optimization terminology

Reminder: a convex optimization problem (or program) is

    min_{x∈D} f(x)
    subject to gᵢ(x) ≤ 0, i = 1, …, m
               Ax = b

where f and g₁, …, g_m are all convex, the equality constraints are affine, and D = dom(f) ∩ ⋂ᵢ₌₁ᵐ dom(gᵢ) is the common domain of all the functions. The optimal value is denoted f⋆.
• If x is feasible and f(x) = f⋆, then x is called optimal; also called a solution, or a minimizer¹
• If x is feasible and f(x) ≤ f⋆ + ε, then x is called ε-suboptimal
• If x is feasible and gᵢ(x) = 0, then we say gᵢ is active at x
• Convex minimization can be reposed as concave maximization

¹ Note: a convex optimization problem need not have solutions, i.e., need not attain its minimum, but we will not be careful about this
Solution set

Let Xopt be the set of all solutions of a convex problem, written

    Xopt = argmin f(x)
           subject to gᵢ(x) ≤ 0, i = 1, …, m
                      Ax = b

Key properties: Xopt is a convex set, and if f is strictly convex, then the solution is unique, i.e., Xopt contains at most one element.
Example: lasso

Given y ∈ Rⁿ, X ∈ R^{n×p}, consider the lasso problem:

    min_β ‖y − Xβ‖₂²
    subject to ‖β‖₁ ≤ s

Is this convex? Yes: the criterion is a convex quadratic in β, and the constraint set is a norm ball, hence convex.
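The constrained lasso can be solved numerically by projected gradient descent on the ℓ₁ ball. The sketch below (a minimal illustration, not part of the slides, using the standard sort-based ℓ₁-ball projection and randomly generated data) shows the idea:

```python
import numpy as np

def project_l1(v, s):
    """Euclidean projection of v onto the l1 ball {w : ||w||_1 <= s}."""
    if np.abs(v).sum() <= s:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                 # magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u * (np.arange(len(u)) + 1) > css - s)[0][-1]
    theta = (css[rho] - s) / (rho + 1.0)         # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

rng = np.random.default_rng(0)
n, p, s = 50, 10, 1.0
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

beta = np.zeros(p)
step = 0.5 / np.linalg.norm(X, 2) ** 2           # 1/L, with L = 2||X||_2^2
for _ in range(2000):
    grad = 2 * X.T @ (X @ beta - y)              # gradient of ||y - X beta||_2^2
    beta = project_l1(beta - step * grad, s)     # gradient step, then project
```

Each iterate stays feasible (‖β‖₁ ≤ s), and with step size 1/L the objective decreases monotonically.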
Example: support vector machines

Given y ∈ {−1, 1}ⁿ, X ∈ R^{n×p} with rows x₁, …, xₙ, consider the support vector machine or SVM problem:

    min_{β,β₀,ξ} (1/2)‖β‖₂² + C Σᵢ₌₁ⁿ ξᵢ
    subject to ξᵢ ≥ 0, i = 1, …, n
               yᵢ(xᵢᵀβ + β₀) ≥ 1 − ξᵢ, i = 1, …, n
Local minima are global minima

For a convex problem, a feasible point x is called locally optimal if there is some R > 0 such that

    f(x) ≤ f(y) for all feasible y with ‖x − y‖₂ ≤ R

For convex problems: locally optimal ⟹ globally optimal. The proof simply follows from the definitions: if some feasible y had f(y) < f(x), then the points tx + (1 − t)y, for t close to 1, would be feasible, arbitrarily close to x, and have objective value below f(x), contradicting local optimality.

[Figure: a convex function, whose local minima are global, vs. a nonconvex function with local minima that are not global.]
Rewriting constraints

The optimization problem

    min_x f(x)
    subject to gᵢ(x) ≤ 0, i = 1, …, m
               Ax = b

can be rewritten as

    min_x f(x) subject to x ∈ C

where C = {x : gᵢ(x) ≤ 0, i = 1, …, m, Ax = b} is the feasible set.
First-order optimality condition

For a convex problem

    min_x f(x) subject to x ∈ C

with differentiable f, a feasible point x is optimal if and only if

    ∇f(x)ᵀ(y − x) ≥ 0 for all y ∈ C

This is called the first-order condition for optimality. Important special case: if C = Rⁿ (unconstrained optimization), then the condition reduces to ∇f(x) = 0.
Example: quadratic minimization

Consider minimizing the quadratic function

    f(x) = (1/2) xᵀQx + bᵀx + c

where Q ⪰ 0. The first-order condition says that the solution satisfies

    ∇f(x) = Qx + b = 0

• If Q ≻ 0, then there is a unique solution: x = −Q⁻¹b
• If Q is singular and b ∉ col(Q), then there is no solution (f is unbounded below)
• If Q is singular and b ∈ col(Q), then there are infinitely many solutions
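A quick numerical check of the stationarity condition (an illustrative sketch, not from the slides, with a randomly generated positive definite Q):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
Q = A.T @ A + np.eye(5)              # Q is positive definite, so solution is unique
b = rng.standard_normal(5)

def f(x):
    return 0.5 * x @ Q @ x + b @ x   # the constant c does not affect the minimizer

x_star = np.linalg.solve(Q, -b)      # solve the stationarity equation Qx + b = 0
grad_at_star = Q @ x_star + b        # should be (numerically) zero
```

Perturbing x_star in any direction can only increase f, confirming it is the global minimizer.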
Example: equality-constrained minimization

Consider the equality-constrained convex problem:

    min_x f(x) subject to Ax = b

with f differentiable. The first-order condition says that a feasible x is optimal if and only if ∇f(x)ᵀ(y − x) ≥ 0 for all y with Ay = b. Writing y = x + v for v ∈ null(A), this is equivalent to

    ∇f(x)ᵀv = 0 for all v ∈ null(A)

i.e., ∇f(x) ⊥ null(A). Since null(A)^⊥ = row(A), this in turn is equivalent to

    ∇f(x) + Aᵀu = 0 for some u

which is the classical Lagrange multiplier optimality condition.
Example: projection onto a convex set

Consider projection onto a convex set C:

    min_x ‖a − x‖₂² subject to x ∈ C

The first-order condition says that x is optimal if and only if −2(a − x)ᵀ(y − x) ≥ 0 for all y ∈ C, i.e.,

    (a − x)ᵀ(y − x) ≤ 0 for all y ∈ C

Equivalently, a − x ∈ N_C(x), the normal cone to C at x.
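For a box C = [0, 1]ⁿ the projection is just coordinatewise clipping, and the normal cone condition can be verified directly. A minimal sketch (not from the slides, with a hypothetical point a):

```python
import numpy as np

a = np.array([1.7, -0.3, 0.5])
x = np.clip(a, 0.0, 1.0)                    # projection of a onto the box [0,1]^3

rng = np.random.default_rng(2)
ys = rng.uniform(0.0, 1.0, size=(200, 3))   # random feasible points y in C
inner = (ys - x) @ (a - x)                  # (a - x)^T (y - x), one value per y
```

Every inner product is nonpositive, as the first-order condition requires: a − x points "outward" from C at x.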
Partial optimization

Reminder: if f is convex in (x, y), and C is a convex set, then

    g(x) = min_{y∈C} f(x, y)

is convex in x. Therefore we can always partially optimize a convex problem over a block of variables and retain convexity.
Example: hinge form of SVMs

Recall the SVM problem

    min_{β,β₀,ξ} (1/2)‖β‖₂² + C Σᵢ₌₁ⁿ ξᵢ
    subject to ξᵢ ≥ 0, yᵢ(xᵢᵀβ + β₀) ≥ 1 − ξᵢ, i = 1, …, n

For fixed β, β₀, the smallest feasible slack is ξᵢ = max{0, 1 − yᵢ(xᵢᵀβ + β₀)}. Minimizing over ξ therefore gives the equivalent hinge form of the SVM problem:

    min_{β,β₀} (1/2)‖β‖₂² + C Σᵢ₌₁ⁿ [1 − yᵢ(xᵢᵀβ + β₀)]₊

where a₊ = max{0, a} denotes the positive part.
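The partial optimization over the slacks is easy to check numerically: a sketch (not from the slides, with random data and an arbitrary fixed (β, β₀)):

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, C = 20, 3, 1.0
X = rng.standard_normal((n, p))
y = rng.choice([-1.0, 1.0], size=n)
beta, beta0 = rng.standard_normal(p), 0.1

margins = y * (X @ beta + beta0)
xi = np.maximum(0.0, 1.0 - margins)   # smallest slacks allowed by the constraints

# slack-form objective at the optimal xi equals the hinge-form objective
slack_obj = 0.5 * beta @ beta + C * xi.sum()
hinge_obj = 0.5 * beta @ beta + C * np.maximum(0.0, 1.0 - margins).sum()
```

Any other feasible slack vector only increases the objective, so minimizing out ξ is exact.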
Transformations and change of variables

If h : R → R is a monotone increasing transformation, then

    min_x f(x) subject to x ∈ C  ⟺  min_x h(f(x)) subject to x ∈ C

Similarly, the constraint functions can be transformed by monotone increasing maps to yield equivalent problems. Such transformations can reveal the hidden convexity of a problem.
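A tiny numerical illustration of why the transformation is harmless (a sketch, not from the slides, using a hypothetical quadratic f and h = exp on a grid):

```python
import numpy as np

def f(x):
    return (x - 2.0) ** 2 + 1.0          # minimized at x = 2

xs = np.linspace(-5.0, 5.0, 10001)
i_f = int(np.argmin(f(xs)))              # grid minimizer of f
i_hf = int(np.argmin(np.exp(f(xs))))     # grid minimizer of h(f), h = exp increasing
```

The two minimizers coincide: a monotone increasing h changes objective values but not their ordering.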
Example: geometric programming

A monomial is a function f : Rⁿ₊₊ → R of the form

    f(x) = γ x₁^{a₁} x₂^{a₂} ⋯ xₙ^{aₙ}

with γ > 0 and a₁, …, aₙ ∈ R. A posynomial is a sum of monomials. A geometric program is of the form

    min f(x)
    subject to gᵢ(x) ≤ 1, i = 1, …, m
               hⱼ(x) = 1, j = 1, …, r

where f, g₁, …, g_m are posynomials and h₁, …, h_r are monomials. This is nonconvex as stated, but the change of variables yₖ = log xₖ (together with taking logs of the criterion and constraints) turns it into a convex problem.
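The convexification can be checked numerically: under y = log x, a posynomial f(x) = Σₖ γₖ Πᵢ xᵢ^{aₖᵢ} becomes log f(eʸ) = logsumexp(log γ + Ay), a convex function of y. A sketch (not from the slides; the exponent matrix and coefficients are hypothetical random values):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))      # exponent matrix (a_ki) of a posynomial
log_gamma = rng.standard_normal(4)   # logs of the positive coefficients gamma_k

def F(y):
    z = log_gamma + A @ y
    m = z.max()
    return m + np.log(np.exp(z - m).sum())   # numerically stable log-sum-exp

# midpoint convexity of F along random segments
gaps = []
for _ in range(50):
    y1, y2 = rng.standard_normal(3), rng.standard_normal(3)
    gaps.append(0.5 * (F(y1) + F(y2)) - F(0.5 * (y1 + y2)))
```

Every gap is nonnegative, consistent with F being convex (log-sum-exp composed with an affine map).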
Several interesting problems are geometric programs, e.g., floor planning: non-overlapping rectangular cells are placed in a rectangle with width W, height H, and lower left corner at (0, 0); each cell is specified by its width wᵢ, height hᵢ, and the coordinates of its lower left corner, (xᵢ, yᵢ). We also require that the cells do not overlap, except possibly on their boundaries.

[Figure 8.18 of the B & V book: the floor planning setup just described.]

See Boyd et al. (2007), "A tutorial on geometric programming", and also Chapter 8.8 of the B & V book.
Eliminating equality constraints

Important special case of change of variables: eliminating equality constraints. Given the problem

    min_x f(x)
    subject to gᵢ(x) ≤ 0, i = 1, …, m
               Ax = b

we can express any point satisfying Ax = b as x = My + x₀, where x₀ is a particular solution and the columns of M span null(A). Hence the above is equivalent to

    min_y f(My + x₀)
    subject to gᵢ(My + x₀) ≤ 0, i = 1, …, m

Note: this is fully general but not always a good idea (practically)
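The parameterization x = My + x₀ is straightforward to compute with linear algebra. A sketch (not from the slides, with a random full-row-rank A): x₀ from least squares, and M from the trailing right singular vectors of A:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 5))      # full row rank with probability 1
b = rng.standard_normal(2)

x0 = np.linalg.lstsq(A, b, rcond=None)[0]   # a particular solution of Ax = b
_, _, Vt = np.linalg.svd(A)
M = Vt[2:].T                                # columns span null(A), since rank(A) = 2

# every choice of the free variable y gives a feasible x = M y + x0
ys = rng.standard_normal((20, M.shape[1]))
residuals = [np.linalg.norm(A @ (M @ y + x0) - b) for y in ys]
```

The free variable y has dimension n − rank(A) = 3 here, so the equality constraint has been eliminated entirely.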
Introducing slack variables

Essentially the opposite of eliminating equality constraints: introducing slack variables. Given the problem

    min_x f(x)
    subject to gᵢ(x) ≤ 0, i = 1, …, m
               Ax = b

we can transform each inequality into an equality plus a nonnegativity constraint:

    min_{x,s} f(x)
    subject to sᵢ ≥ 0, i = 1, …, m
               gᵢ(x) + sᵢ = 0, i = 1, …, m
               Ax = b

Note: the new problem is no longer convex unless g₁, …, g_m are all affine, since the equality constraints gᵢ(x) + sᵢ = 0 are affine only in that case.
Relaxing nonaffine equalities

Given an optimization problem

    min_x f(x) subject to x ∈ C

we can always consider an enlarged constraint set C̃ ⊇ C and the relaxed problem min_x f(x) subject to x ∈ C̃; its optimal value is always smaller than or equal to that of the original problem. Important special case: relaxing nonaffine equality constraints

    hⱼ(x) = 0, j = 1, …, r

where the hⱼ are convex but nonaffine, to the inequalities

    hⱼ(x) ≤ 0, j = 1, …, r
Example: maximum utility problem

The maximum utility problem models investment/consumption:

    max_{x,b} Σₜ₌₁ᵀ αₜ u(xₜ)
    subject to bₜ₊₁ = bₜ + f(bₜ) − xₜ, t = 1, …, T
               0 ≤ xₜ ≤ bₜ, t = 1, …, T

Here bₜ is the budget and xₜ is the amount consumed at time t; f is an investment return function and u is a utility function, both concave and increasing. The equality constraints are nonaffine, so the problem is not convex as stated. Can we instead use the relaxed constraints

    bₜ₊₁ ≤ bₜ + f(bₜ) − xₜ, t = 1, …, T ?
Example: principal components analysis

Given X ∈ R^{n×p}, consider the PCA problem:

    min_R ‖X − R‖_F² subject to rank(R) = k

Here ‖A‖_F² = Σᵢ₌₁ⁿ Σⱼ₌₁ᵖ Aᵢⱼ², the entrywise squared ℓ₂ norm (squared Frobenius norm), and rank(A) denotes the rank of A. The solution is given by the truncated singular value decomposition: if X = UDVᵀ, then

    R = U_k D_k V_kᵀ

where U_k, V_k are the first k columns of U, V, and D_k contains the first k diagonal entries of D.
The PCA problem is not convex. Let's recast it. First rewrite as

    max_{Z∈C_k} tr(SZ)

where S = XᵀX and C_k = {Z ∈ Sᵖ : Z is a projection matrix of rank k} (take R = XZ; for a projection Z, ‖X − XZ‖_F² = tr(S) − tr(SZ)).

Now consider relaxing the constraint set to F_k = conv(C_k), its convex hull. Note that the relaxed problem

    max_{Z∈F_k} tr(SZ)

is convex: the criterion is linear in Z, and F_k is a convex set. Moreover, the relaxation is tight, since a linear criterion attains its maximum over a convex hull at one of the original points, i.e., at some Z ∈ C_k.
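The recast problem is easy to verify numerically: the maximizer over C_k is the projection onto the span of the top-k eigenvectors of S, and its value is the sum of the top-k eigenvalues. A sketch (not from the slides, with random data):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((30, 6))
S = X.T @ X
k = 2

evals, evecs = np.linalg.eigh(S)     # eigenvalues in ascending order
Vk = evecs[:, -k:]                   # top-k eigenvectors of S
Z_star = Vk @ Vk.T                   # rank-k projection onto their span
best = np.trace(S @ Z_star)

# compare against random rank-k projections Z = Q Q^T, Q with orthonormal columns
others = []
for _ in range(20):
    Q, _ = np.linalg.qr(rng.standard_normal((6, k)))
    others.append(np.trace(S @ Q @ Q.T))
```

No random projection beats Z_star, consistent with tr(SZ) ≤ Σ (top-k eigenvalues of S) over C_k.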
References and further reading

• S. Boyd and L. Vandenberghe (2004), "Convex Optimization", Chapters 4 and 8
• S. Boyd, S. J. Kim, L. Vandenberghe, and A. Hassibi (2007), "A tutorial on geometric programming"