A Tutorial on Convex Optimization

Haitham Hindi
Palo Alto Research Center (PARC), Palo Alto, California
email: hhindi@parc.com

Abstract— In recent years, convex optimization has become a computational tool of central importance in engineering, thanks to its ability to solve very large, practical engineering problems reliably and efficiently. The goal of this tutorial is to give an overview of the basic concepts of convex sets, functions and convex optimization problems, so that the reader can more readily recognize and formulate engineering problems using modern convex optimization. This tutorial coincides with the publication of the new book on convex optimization by Boyd and Vandenberghe [7], who have made available a large amount of free course material and links to freely available code. These can be downloaded and used immediately by the audience both for self-study and to solve real problems.

I. INTRODUCTION

Convex optimization can be described as a fusion of three disciplines: optimization [22], [20], [1], [3], [4], convex analysis [19], [24], [27], [16], [13], and numerical computation [26], [12], [10], [17]. It has recently become a tool of central importance in engineering, enabling the solution of very large, practical engineering problems reliably and efficiently. In some sense, convex optimization provides new, indispensable computational tools which naturally extend our ability to solve problems such as least squares and linear programming to a much larger and richer class of problems.

Our ability to solve these new types of problems comes from recent breakthroughs in algorithms for solving convex optimization problems [18], [23], [29], [30], coupled with dramatic improvements in computing power, both of which have happened only in the past decade or so. Today, new applications of convex optimization are constantly being reported from almost every area of engineering, including control, signal processing, networks, circuit design, communication, information theory, computer science, operations research, economics, statistics, and structural design. See [7], [2], [5], [6], [9], [11], [15], [8], [21], [14], [28] and the references therein.

The objectives of this tutorial are:
1) to show that there are straightforward, systematic rules and facts, which when mastered, allow one to quickly deduce the convexity (and hence tractability) of many problems, often by inspection;
2) to review and introduce some canonical optimization problems, which can be used to model problems and for which reliable optimization code can be readily obtained;
3) to emphasize modeling and formulation; we do not discuss topics like duality or writing custom codes.

We assume that the reader has a working knowledge of linear algebra and vector calculus, and some (minimal) exposure to optimization.

Our presentation is quite informal. Rather than provide details for all the facts and claims presented, our goal is instead to give the reader a flavor for what is possible with convex optimization. Complete details can be found in [7], from which all the material presented here is taken. Thus we encourage the reader to skip sections that might not seem clear and continue reading; the topics are not all interdependent.

Also, in order to keep the paper quite general, we have tried not to bias our presentation toward any particular audience. Hence, the examples used in the paper are simple and intended merely to clarify the optimization ideas and concepts. For detailed examples and applications, the reader is referred to [7], [2], and the references therein.

We now briefly outline the paper. Sections II and III, respectively, describe convex sets and convex functions along with their calculus and properties. In Section IV, we define convex optimization problems, at a rather abstract level, and we describe their general form and desirable properties. Section V presents some specific canonical optimization problems which have been found to be extremely useful in practice, and for which efficient codes are freely available. Section VI comments briefly on the use of convex optimization for solving nonstandard or nonconvex problems. Finally, Section VII concludes the paper.
Motivation

A vast number of design problems in engineering can be posed as constrained optimization problems of the form:

    minimize   f0(x)
    subject to fi(x) ≤ 0,  i = 1, ..., m        (1)
               hi(x) = 0,  i = 1, ..., p,

where x is a vector of decision variables, and the functions f0, fi and hi, respectively, are the cost, inequality constraints, and equality constraints. However, such problems can be very hard to solve in general, especially when the number of decision variables in x is large. There are several reasons for this difficulty. First, the problem "terrain" may be riddled with local optima. Second, it might be very hard to find a feasible point (i.e., an x which satisfies all the equalities and inequalities); in fact the feasible set, which needn't even be fully connected, could be empty. Third, stopping criteria used in general optimization algorithms are often arbitrary. Fourth, optimization algorithms might have very poor convergence rates. Fifth, numerical problems could cause the minimization algorithm to stop altogether or wander.

It has been known for a long time [19], [3], [16], [13] that if the fi are all convex and the hi are affine, then the first three problems disappear: any local optimum is, in fact, a global optimum; feasibility of convex optimization problems can be determined unambiguously, at least in principle; and very precise stopping criteria are available using duality. However, convergence rate and numerical sensitivity issues still remained a potential problem.

It was not until the late '80s and '90s that researchers in the former Soviet Union and United States discovered that if, in addition to convexity, the fi satisfied a property known as self-concordance, then issues of convergence and numerical sensitivity could be avoided using interior point methods [18], [23], [29], [30], [25]. The self-concordance property is satisfied by a very large set of important functions used in engineering. Hence, it is now possible to solve a large class of convex optimization problems in engineering with great efficiency.
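To make the general form (1) concrete, here is a minimal sketch of a small convex instance, stated and solved with the open-source cvxpy modeling package (an assumed third-party tool, not something the paper prescribes); the data, dimensions and constraint functions are invented purely for illustration.

    # A toy instance of (1): convex f0 and f1, affine h. All data is made up.
    import numpy as np
    import cvxpy as cp

    np.random.seed(0)
    A = np.random.randn(3, 5)               # equality-constraint data
    x_feas = 0.1 * np.ones(5)
    b = A @ x_feas                           # chosen so that a feasible point exists
    x = cp.Variable(5)

    objective = cp.Minimize(cp.sum_squares(x))         # f0(x) = ||x||_2^2, convex
    constraints = [cp.norm(x, 1) - 2 <= 0,             # f1(x) = ||x||_1 - 2 <= 0, convex
                   A @ x - b == 0]                     # h(x) = Ax - b = 0, affine
    prob = cp.Problem(objective, constraints)
    prob.solve()
    print(prob.status, prob.value)           # a convex problem: the reported optimum is global

Because this instance is convex, the difficulties listed above do not arise: the solver either returns a global optimum or certifies infeasibility, with a principled stopping criterion.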
II. CONVEX SETS

In this section we list some important convex sets and operations. It is important to note that some of these sets have different representations. Picking the right representation can make the difference between a tractable problem and an intractable one.

We will be concerned only with optimization problems whose decision variables are vectors in R^n or matrices in R^(m×n). Throughout the paper, we will make frequent use of informal sketches to help the reader develop an intuition for the geometry of convex optimization.

A function f : R^n → R^m is affine if it has the form linear plus constant: f(x) = Ax + b. If F is a matrix valued function, i.e., F : R^n → R^(p×q), then F is affine if it has the form

    F(x) = A0 + x1 A1 + ··· + xn An,

where Ai ∈ R^(p×q). Affine functions are sometimes loosely referred to as linear.

Recall that S ⊆ R^n is a subspace if it contains the plane through any two of its points and the origin, i.e.,

    x, y ∈ S, λ, µ ∈ R  ⟹  λx + µy ∈ S.

Two common representations of a subspace are as the range of a matrix,

    range(A) = {Aw | w ∈ R^q} = {w1 a1 + ··· + wq aq | wi ∈ R},

where A = [a1 ··· aq], or as the nullspace of a matrix,

    nullspace(B) = {x | Bx = 0} = {x | b1^T x = 0, ..., bp^T x = 0},

where B = [b1 ··· bp]^T. As an example, let S^n = {X ∈ R^(n×n) | X = X^T} denote the set of symmetric matrices. Then S^n is a subspace, since symmetric matrices are closed under addition and scaling. Another way to see this is to note that S^n can be written as {X ∈ R^(n×n) | Xij = Xji, ∀ i, j}, which is the nullspace of the linear function X − X^T.

A set S ⊆ R^n is affine if it contains the line through any two points in it, i.e.,

    x, y ∈ S, λ, µ ∈ R, λ + µ = 1  ⟹  λx + µy ∈ S.

[Figure: the points λx + (1 − λ)y for λ = 1.5, 0.6, −0.5 all lie on the line through x and y.]

Geometrically, an affine set is simply a subspace which is not necessarily centered at the origin. Two common representations of an affine set are the range of an affine function,

    S = {Az + b | z ∈ R^q},

or the solution set of a system of linear equalities,

    S = {x | b1^T x = d1, ..., bp^T x = dp} = {x | Bx = d}.
A set S ⊆ R^n is a convex set if it contains the line segment joining any two of its points, i.e.,

    x, y ∈ S, λ, µ ≥ 0, λ + µ = 1  ⟹  λx + µy ∈ S.

[Figure: a set containing every segment between its points is convex; a set with a dent, where the segment from x to y leaves the set, is not.]

Geometrically, we can think of convex sets as always bulging outward, with no dents or kinks in them. Clearly subspaces and affine sets are convex, since their definitions subsume convexity.

A set S ⊆ R^n is a convex cone if it contains all rays passing through its points which emanate from the origin, as well as all line segments joining any points on those rays, i.e.,

    x, y ∈ S, λ, µ ≥ 0  ⟹  λx + µy ∈ S.

Geometrically, x, y ∈ S means that S contains the entire 'pie slice' between x and y.

The nonnegative orthant R^n_+ is a convex cone. The set S^n_+ = {X ∈ S^n | X ⪰ 0} of symmetric positive semidefinite (PSD) matrices is also a convex cone, since any positive combination of semidefinite matrices is semidefinite. Hence we call S^n_+ the positive semidefinite cone.

A convex cone K ⊆ R^n is said to be proper if it is closed, has nonempty interior, and is pointed, i.e., there is no line in K. A proper cone K defines a generalized inequality ⪯_K in R^n:

    x ⪯_K y  ⟺  y − x ∈ K

(strict version: x ≺_K y ⟺ y − x ∈ int K). This formalizes our use of the ⪯ symbol:
  • K = R^n_+ : x ⪯_K y means xi ≤ yi (componentwise vector inequality);
  • K = S^n_+ : X ⪯_K Y means Y − X is PSD.
These are so common that we drop the K.

Given points xi ∈ R^n and θi ∈ R, the point y = θ1 x1 + ··· + θk xk is said to be a
  • linear combination for any real θi;
  • affine combination if Σi θi = 1;
  • convex combination if Σi θi = 1, θi ≥ 0;
  • conic combination if θi ≥ 0.
The linear (resp. affine, convex, conic) hull of a set S is the set of all linear (resp. affine, convex, conic) combinations of points from S, and is denoted by span(S) (resp. Aff(S), Co(S), Cone(S)). It can be shown that this is the smallest such set containing S.

As an example, consider the set S = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. Then span(S) is R^3; Aff(S) is the hyperplane passing through the three points; Co(S) is the unit simplex, which is the triangle joining the vectors along with all the points inside it; and Cone(S) is the nonnegative orthant R^3_+.

Recall that a hyperplane, represented as {x | a^T x = b} (a ≠ 0), is in general an affine set, and is a subspace if b = 0. Another useful representation of a hyperplane is {x | a^T (x − x0) = 0}, where a is a normal vector and x0 lies on the hyperplane. Hyperplanes are convex, since they contain all lines (and hence segments) joining any of their points.

A halfspace, described as {x | a^T x ≤ b} (a ≠ 0), is generally convex and is a convex cone if b = 0. Another useful representation is {x | a^T (x − x0) ≤ 0}, where a is an (outward) normal vector and x0 lies on the boundary.

[Figure: the hyperplane a^T x = b, with normal a at the boundary point x0, divides R^n into the halfspaces a^T x ≤ b and a^T x ≥ b.]

We now come to a very important fact about properties which are preserved under intersection. Let A be an arbitrary index set (possibly uncountably infinite) and {Sα | α ∈ A} a collection of sets; then:

    if each Sα is a subspace (respectively an affine set, a convex set, a convex cone),
    then ∩_{α∈A} Sα is a subspace (respectively an affine set, a convex set, a convex cone).

In fact, every closed convex set S is the (usually infinite) intersection of the halfspaces which contain it, i.e., S = ∩ {H | H halfspace, S ⊆ H}. For example, another way to see that S^n_+ is a convex cone is to recall that a matrix X ∈ S^n is positive semidefinite if z^T X z ≥ 0 for all z ∈ R^n. Thus we can write

    S^n_+ = ∩_{z ∈ R^n} { X ∈ S^n | z^T X z = Σ_{i,j=1}^n zi zj Xij ≥ 0 }.

Now observe that the summation above is actually linear in the components of X, so S^n_+ is the infinite intersection of halfspaces containing the origin (which are convex cones) in S^n.
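This description of S^n_+ as an intersection of halfspaces is easy to probe numerically. The sketch below (numpy only; the test matrix is randomly generated for illustration) compares membership in S^n_+ via the smallest eigenvalue with the halfspace tests z^T X z ≥ 0 over a batch of sampled directions z.

    import numpy as np

    np.random.seed(1)
    n = 4
    G = np.random.randn(n, n)
    X = G + G.T                          # a symmetric test matrix (not necessarily PSD)

    psd_by_eig = np.linalg.eigvalsh(X).min() >= -1e-9      # X in S^n_+ ?

    Z = np.random.randn(100000, n)                          # sampled directions z
    quad_forms = np.einsum('ij,jk,ik->i', Z, X, Z)          # z^T X z for each sampled z
    halfspace_ok = (quad_forms >= -1e-9).all()

    # If X is PSD, every sampled halfspace test passes; if it is not,
    # the sampling typically exposes a violated halfspace z^T X z < 0.
    print(psd_by_eig, halfspace_ok)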
We continue with our listing of useful convex sets. A polyhedron is the intersection of a finite number of halfspaces:

    P = {x | ai^T x ≤ bi, i = 1, ..., k} = {x | Ax ⪯ b},

where ⪯ above means componentwise inequality.

[Figure: a polyhedron P bounded by five halfspaces with outward normals a1, ..., a5.]

A bounded polyhedron is called a polytope, which also has the alternative representation P = Co{v1, ..., vN}, where {v1, ..., vN} are its vertices. For example, the nonnegative orthant R^n_+ = {x ∈ R^n | x ⪰ 0} is a polyhedron, while the probability simplex {x ∈ R^n | x ⪰ 0, Σi xi = 1} is a polytope. It is an instructive exercise to convert among representations.

If f is a norm, then the norm ball B = {x | f(x − xc) ≤ 1} is convex, and the norm cone C = {(x, t) | f(x) ≤ t} is a convex cone. Perhaps the most familiar norms are the ℓp norms on R^n:

    ‖x‖p = (Σi |xi|^p)^(1/p) for p ≥ 1,    ‖x‖∞ = maxi |xi|.

[Figure: the corresponding norm balls in R^2, for p = 1, 1.5, 2, 3 and ∞, range from the diamond with vertices (±1, 0), (0, ±1) to the square with vertices (±1, ±1).]

Another workhorse of convex optimization is the ellipsoid. In addition to their computational convenience, ellipsoids can be used as crude models or approximations of other convex sets. In fact, it can be shown that any bounded convex set in R^n can be approximated to within a factor of n by an ellipsoid.

There are many representations for ellipsoids. For example, we can use

    E = {x | (x − xc)^T A^(−1) (x − xc) ≤ 1},

where the parameters are the shape matrix A = A^T ≻ 0 and the center xc ∈ R^n. In this case, the semiaxis lengths are √λi, where the λi are the eigenvalues of A; the semiaxis directions are the eigenvectors of A; and the volume is proportional to (det A)^(1/2). However, depending on the problem, other descriptions might be more appropriate, such as (here ‖u‖ = √(u^T u)):
  • image of the unit ball under an affine transformation: E = {Bu + xc | ‖u‖ ≤ 1}; vol. ∝ det B;
  • preimage of the unit ball under an affine transformation: E = {x | ‖Ax − b‖ ≤ 1}; vol. ∝ det A^(−1);
  • sublevel set of a nondegenerate quadratic: E = {x | f(x) ≤ 0} where f(x) = x^T C x + 2 d^T x + e with C = C^T ≻ 0 and e − d^T C^(−1) d < 0; vol. ∝ (det C)^(−1/2).
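The quantities just described are straightforward to compute. The sketch below (numpy; the shape matrix is invented for the example) builds a random A = A^T ≻ 0 and reads off the semiaxis lengths √λi, the semiaxis directions, and the (det A)^(1/2) volume factor from its eigendecomposition.

    import numpy as np

    np.random.seed(2)
    n = 3
    G = np.random.randn(n, n)
    A = G @ G.T + n * np.eye(n)       # shape matrix A = A^T > 0 of E = {x | (x-xc)^T A^{-1} (x-xc) <= 1}
    xc = np.zeros(n)                  # center (arbitrary here)

    lam, V = np.linalg.eigh(A)        # eigenvalues (ascending) and orthonormal eigenvectors of A
    semiaxis_lengths = np.sqrt(lam)   # sqrt(lambda_i)
    semiaxis_dirs = V                 # columns are the semiaxis directions
    volume_factor = np.sqrt(np.linalg.det(A))    # vol(E) is proportional to (det A)^{1/2}
    print(semiaxis_lengths, volume_factor)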

The second-order cone is the norm cone associated with the Euclidean norm:

    S = {(x, t) | √(x^T x) ≤ t}
      = {(x, t) | [x; t]^T [I 0; 0 −1] [x; t] ≤ 0, t ≥ 0}.

[Figure: the second-order cone in R^2 × R, an "ice-cream cone" opening upward in t.]

The image and the preimage of a convex set under an affine transformation are convex, i.e., if S and T are convex, then so are the sets

    f^(−1)(S) = {x | Ax + b ∈ S},
    f(T) = {Ax + b | x ∈ T}.

An example is coordinate projection: {x | (x, y) ∈ S for some y}. As another example, a constraint of the form

    ‖Ax + b‖2 ≤ c^T x + d,

where A ∈ R^(k×n), is a second-order cone constraint, since it is the same as requiring the affine expression (Ax + b, c^T x + d) to lie in the second-order cone in R^(k+1). Similarly, if A0, A1, ..., Am ∈ S^n, the solution set of the linear matrix inequality (LMI)

    F(x) = A0 + x1 A1 + ··· + xm Am ⪰ 0

is convex (it is the preimage of the semidefinite cone under an affine function).

A linear-fractional (or projective) function f : R^m → R^n has the form

    f(x) = (Ax + b) / (c^T x + d)

and domain dom f = H = {x | c^T x + d > 0}. If C is a convex set, then its linear-fractional transformation f(C) is also convex. This is because linear-fractional transformations preserve line segments: for x, y ∈ H,

    f([x, y]) = [f(x), f(y)].

[Figure: a linear-fractional map sends the polygon through x1, x2, x3, x4 to the polygon through f(x1), f(x2), f(x3), f(x4), preserving segments.]

Two further properties are helpful in visualizing the geometry of convex sets. The first is the separating hyperplane theorem, which states that if S, T ⊆ R^n are convex and disjoint (S ∩ T = ∅), then there exists a hyperplane {x | a^T x − b = 0} which separates them.

The second property is the supporting hyperplane theorem, which states that there exists a supporting hyperplane at every point on the boundary of a convex set, where a supporting hyperplane {x | a^T x = a^T x0} supports S at x0 ∈ ∂S if

    x ∈ S  ⟹  a^T x ≤ a^T x0.

III. CONVEX FUNCTIONS

In this section, we introduce the reader to some important convex functions and techniques for verifying convexity. The objective is to sharpen the reader's ability to recognize convexity.

A. Convex functions

A function f : R^n → R is convex if its domain dom f is convex and for all x, y ∈ dom f and θ ∈ [0, 1],

    f(θx + (1 − θ)y) ≤ θ f(x) + (1 − θ) f(y);

f is concave if −f is convex.

[Figure: graphs of a convex function, a concave function, and a function that is neither.]

Here are some simple examples on R: x² is convex (dom f = R); log x is concave (dom f = R++); and f(x) = 1/x is convex (dom f = R++).
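A quick, and of course inconclusive, numerical sanity check of this definition is to sample pairs of points and values of θ and test the inequality; the sketch below does this for two arbitrarily chosen test functions. Finding a violation certifies non-convexity, while finding none merely fails to produce a counterexample.

    import numpy as np

    def convexity_counterexample(f, dim, trials=20000, seed=3):
        """Search for x, y, theta violating f(tx + (1-t)y) <= t f(x) + (1-t) f(y)."""
        rng = np.random.default_rng(seed)
        for _ in range(trials):
            x, y = rng.standard_normal(dim), rng.standard_normal(dim)
            t = rng.uniform()
            if f(t * x + (1 - t) * y) > t * f(x) + (1 - t) * f(y) + 1e-9:
                return x, y, t            # counterexample found: f is not convex
        return None                       # no violation found (consistent with convexity)

    print(convexity_counterexample(lambda x: float(x @ x), 3))            # x^T x: convex, prints None
    print(convexity_counterexample(lambda x: float(np.sin(x).sum()), 3))  # not convex: finds a violation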

It is convenient to define the extension of a convex function f:

    f̃(x) = f(x) if x ∈ dom f,   +∞ if x ∉ dom f.

Note that f̃ still satisfies the basic definition for all x, y ∈ R^n, 0 ≤ θ ≤ 1 (as an inequality in R ∪ {+∞}). We will use the same symbol for f and its extension, i.e., we will implicitly assume convex functions are extended.

The epigraph of a function f is

    epi f = {(x, t) | x ∈ dom f, f(x) ≤ t}.

[Figure: the epigraph of f is the region lying on or above its graph.]

The (α-)sublevel set of a function f is

    Sα ≜ {x ∈ dom f | f(x) ≤ α}.

From the basic definition of convexity, it follows that f is a convex function if and only if its epigraph epi f is a convex set. It also follows that if f is convex, then its sublevel sets Sα are convex (the converse is false; see quasiconvex functions later).

The convexity of a differentiable function f : R^n → R can also be characterized by conditions on its gradient ∇f and Hessian ∇²f. Recall that, in general, the gradient yields a first order Taylor approximation at x0:

    f(x) ≃ f(x0) + ∇f(x0)^T (x − x0).

We have the following first-order condition: f is convex if and only if for all x, x0 ∈ dom f,

    f(x) ≥ f(x0) + ∇f(x0)^T (x − x0),

i.e., the first order approximation of f is a global underestimator. To see why this is so, we can rewrite the condition above in terms of the epigraph of f as: for all (x, t) ∈ epi f,

    [∇f(x0); −1]^T [x − x0; t − f(x0)] ≤ 0,

i.e., (∇f(x0), −1) defines a supporting hyperplane to epi f at (x0, f(x0)).

Recall that the Hessian of f, ∇²f, yields a second order Taylor series expansion around x0:

    f(x) ≃ f(x0) + ∇f(x0)^T (x − x0) + (1/2)(x − x0)^T ∇²f(x0) (x − x0).

We have the following necessary and sufficient second order condition: a twice differentiable function f is convex if and only if for all x ∈ dom f, ∇²f(x) ⪰ 0, i.e., its Hessian is positive semidefinite on its domain.

The convexity of the following functions is easy to verify, using the first and second order characterizations of convexity:
  • x^α is convex on R++ for α ≥ 1;
  • x log x is convex on R+;
  • the log-sum-exp function f(x) = log Σi e^(xi) (tricky!);
  • affine functions f(x) = a^T x + b, where a ∈ R^n, b ∈ R, are both convex and concave since ∇²f ≡ 0;
  • quadratic functions f(x) = x^T P x + 2 q^T x + r (P = P^T), whose Hessian is ∇²f(x) = 2P, are convex ⟺ P ⪰ 0 and concave ⟺ P ⪯ 0.
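For the log-sum-exp entry, which the list flags as tricky, the second order condition can at least be spot-checked numerically. The Hessian of f(x) = log Σi e^(xi) is diag(p) − p p^T, where p is the softmax of x, and the sketch below (numpy) confirms that its smallest eigenvalue is nonnegative at randomly sampled points; this is a sanity check, not a proof.

    import numpy as np

    def lse_hessian(x):
        """Hessian of f(x) = log(sum_i exp(x_i)): diag(p) - p p^T with p = softmax(x)."""
        z = np.exp(x - x.max())          # shift for numerical stability
        p = z / z.sum()
        return np.diag(p) - np.outer(p, p)

    rng = np.random.default_rng(4)
    min_eigs = [np.linalg.eigvalsh(lse_hessian(rng.standard_normal(5))).min()
                for _ in range(1000)]
    print(min(min_eigs) >= -1e-12)       # True: the Hessian is PSD at every sampled point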

Further elementary properties are extremely helpful in verifying convexity. We list several:
  • f : R^n → R is convex iff it is convex on all lines: f̃(t) ≜ f(x0 + th) is convex in t ∈ R, for all x0, h ∈ R^n;
  • nonnegative sums of convex functions are convex: α1, α2 ≥ 0 and f1, f2 convex ⟹ α1 f1 + α2 f2 convex;
  • nonnegative infinite sums and integrals: p(y) ≥ 0 and g(x, y) convex in x ⟹ ∫ p(y) g(x, y) dy convex in x;
  • pointwise supremum (maximum): fα convex ⟹ sup_{α∈A} fα convex (this corresponds to intersection of epigraphs);
  • affine transformation of the domain: f convex ⟹ f(Ax + b) convex.

The following are important examples of how these further properties can be applied:
  • piecewise-linear functions: f(x) = maxi {ai^T x + bi} is convex in x (epi f is a polyhedron);
  • the maximum distance to any set, sup_{s∈S} ‖x − s‖, is convex in x [(x − s) is affine in x and ‖·‖ is a norm, so fs(x) = ‖x − s‖ is convex];
  • expected value: f(x, u) convex in x ⟹ g(x) = E_u f(x, u) convex;
  • f(x) = x[1] + x[2] + x[3] is convex on R^n, where x[i] is the ith largest xj. [To see this, note that f(x) is the sum of the largest triple of components of x. So f can be written as f(x) = maxi ci^T x, where the ci are the set of all vectors with components zero except for three 1's.];
  • f(x) = Σ_{i=1}^m log(bi − ai^T x)^(−1) is convex (dom f = {x | ai^T x < bi, i = 1, ..., m}).

Another convexity preserving operation is that of minimizing over some variables. Specifically, if h(x, y) is convex in x and y, then

    f(x) = inf_y h(x, y)

is convex in x. This is because the operation above corresponds to projection of the epigraph, (x, y, t) → (x, t). An important example of this is the minimum distance function to a set S ⊆ R^n, defined as

    dist(x, S) = inf_{y∈S} ‖x − y‖ = inf_y ( ‖x − y‖ + δ(y|S) ),

where δ(y|S) is the indicator function of the set S, which is infinite everywhere except on S, where it is zero. Note that, in contrast to the maximum distance function defined earlier, dist(x, S) is not convex in general. However, if S is convex, then dist(x, S) is convex, since ‖x − y‖ + δ(y|S) is convex in (x, y).

The technique of composition can also be used to deduce the convexity of differentiable functions, by means of the chain rule. For example, one can show results like: f(x) = log Σi exp gi(x) is convex if each gi is; see [7].

Jensen's inequality, a classical result which is essentially a restatement of convexity, is used extensively in various forms. If f : R^n → R is convex, then:
  • two points: θ1 + θ2 = 1, θi ≥ 0 ⟹ f(θ1 x1 + θ2 x2) ≤ θ1 f(x1) + θ2 f(x2);
  • more than two points: Σi θi = 1, θi ≥ 0 ⟹ f(Σi θi xi) ≤ Σi θi f(xi);
  • continuous version: p(x) ≥ 0, ∫ p(x) dx = 1 ⟹ f(∫ x p(x) dx) ≤ ∫ f(x) p(x) dx;
or, more generally, f(E x) ≤ E f(x).

The simple techniques above can also be used to verify the convexity of functions of matrices, which arise in many important problems.
  • Tr A^T X = Σ_{i,j} Aij Xij is linear in X on R^(n×n).
  • log det X^(−1) is convex on {X ∈ S^n | X ≻ 0}. Proof: verify convexity along lines. Let λi be the eigenvalues of X0^(−1/2) H X0^(−1/2); then

        f(t) = log det(X0 + tH)^(−1)
             = log det X0^(−1) + log det(I + t X0^(−1/2) H X0^(−1/2))^(−1)
             = log det X0^(−1) − Σi log(1 + t λi),

    which is a convex function of t.
  • (det X)^(1/n) is concave on {X ∈ S^n | X ⪰ 0}.
  • λmax(X) is convex on S^n since, for X symmetric, λmax(X) = sup_{‖y‖2=1} y^T X y, which is a supremum of linear functions of X.
  • ‖X‖2 = σ1(X) = (λmax(X^T X))^(1/2) is convex on R^(m×n) since, by definition, ‖X‖2 = sup_{‖u‖2=1} ‖Xu‖2.

The Schur complement technique is an indispensable tool for modeling and transforming nonlinear constraints into convex LMI constraints. Suppose a symmetric matrix X can be decomposed as

    X = [A B; B^T C]

with A invertible. The matrix S ≜ C − B^T A^(−1) B is called the Schur complement of A in X, and has the following important attributes, which are not too hard to prove:
  • X ≻ 0 if and only if A ≻ 0 and S = C − B^T A^(−1) B ≻ 0;
  • if A ≻ 0, then X ⪰ 0 if and only if S ⪰ 0.

For example, the second-order cone constraint on x

    ‖Ax + b‖ ≤ e^T x + d

is equivalent to the LMI

    [ (e^T x + d) I    Ax + b
      (Ax + b)^T       e^T x + d ] ⪰ 0.

This shows that all sets that can be modeled by an SOCP constraint can be modeled as LMI's; hence LMI's are more "expressive". As another nontrivial example, the quadratic-over-linear function

    f(x, y) = x²/y,   dom f = R × R++,

is convex because its epigraph {(x, y, t) | y > 0, x²/y ≤ t} is convex: by Schur complements, it is equivalent to the solution set of the LMI constraints

    [y x; x t] ⪰ 0,   y > 0.
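These attributes are easy to confirm on a random instance. The sketch below (numpy, with invented data) assembles a block matrix X = [A B; B^T C] with A ≻ 0 and checks that positive semidefiniteness of X coincides with that of the Schur complement S.

    import numpy as np

    rng = np.random.default_rng(5)
    n, m = 3, 2
    G = rng.standard_normal((n, n))
    A = G @ G.T + np.eye(n)                      # A = A^T > 0
    B = rng.standard_normal((n, m))
    C = rng.standard_normal((m, m)); C = C + C.T

    X = np.block([[A, B], [B.T, C]])
    S = C - B.T @ np.linalg.solve(A, B)          # Schur complement of A in X

    x_psd = np.linalg.eigvalsh(X).min() >= -1e-9
    s_psd = np.linalg.eigvalsh(S).min() >= -1e-9
    print(x_psd == s_psd)                        # True: with A > 0, X >= 0 exactly when S >= 0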

The concept of K-convexity generalizes convexity to vector- or matrix-valued functions. As mentioned earlier, a pointed convex cone K ⊆ R^m induces a generalized inequality ⪯_K. A function f : R^n → R^m is K-convex if for all θ ∈ [0, 1],

    f(θx + (1 − θ)y) ⪯_K θ f(x) + (1 − θ) f(y).

In many problems, especially in statistics, log-concave functions are frequently encountered. A function f : R^n → R+ is log-concave (log-convex) if log f is concave (convex). Hence one can transform problems involving log-convex functions into convex optimization problems. As an example, the normal density f(x) = const · e^(−(1/2)(x−x0)^T Σ^(−1) (x−x0)) is log-concave.

B. Quasiconvex functions

The next best thing to a convex function in optimization is a quasiconvex function, since it can be minimized in a few iterations of convex optimization problems. A function f : R^n → R is quasiconvex if every sublevel set Sα = {x ∈ dom f | f(x) ≤ α} is convex. Note that if f is convex, then it is automatically quasiconvex. Quasiconvex functions can have 'locally flat' regions.

[Figure: a quasiconvex function on R with a locally flat portion; every sublevel set Sα is an interval.]

We say f is quasiconcave if −f is quasiconvex, i.e., if the superlevel sets {x | f(x) ≥ α} are convex. A function which is both quasiconvex and quasiconcave is called quasilinear.

We list some examples:
  • f(x) = √|x| is quasiconvex on R;
  • f(x) = log x is quasilinear on R+;
  • the linear-fractional function f(x) = (a^T x + b)/(c^T x + d) is quasilinear on the halfspace c^T x + d > 0;
  • f(x) = ‖x − a‖2 / ‖x − b‖2 is quasiconvex on the halfspace {x | ‖x − a‖2 ≤ ‖x − b‖2}.

Quasiconvex functions share many features with convex functions, but there are also some important differences. The following quasiconvex properties are helpful in verifying quasiconvexity:
  • f is quasiconvex if and only if it is quasiconvex on lines, i.e., f(x0 + th) is quasiconvex in t for all x0, h;
  • modified Jensen's inequality: f is quasiconvex iff for all x, y ∈ dom f and θ ∈ [0, 1],

        f(θx + (1 − θ)y) ≤ max{f(x), f(y)};

  • for f differentiable, f is quasiconvex ⟺ for all x, y ∈ dom f,

        f(y) ≤ f(x)  ⟹  (y − x)^T ∇f(x) ≤ 0;

    [Figure: nested sublevel sets Sα1 ⊆ Sα2 ⊆ Sα3 for α1 < α2 < α3, with ∇f(x) an outward normal at x.]
  • positive multiples: f quasiconvex, α ≥ 0 ⟹ αf quasiconvex;
  • pointwise supremum: f1, f2 quasiconvex ⟹ max{f1, f2} quasiconvex (this extends to the supremum over an arbitrary set);
  • affine transformation of the domain: f quasiconvex ⟹ f(Ax + b) quasiconvex;
  • linear-fractional transformation of the domain: f quasiconvex ⟹ f((Ax + b)/(c^T x + d)) quasiconvex on c^T x + d > 0;
  • composition with a monotone increasing function: f quasiconvex, g monotone increasing ⟹ g(f(x)) quasiconvex;
  • sums of quasiconvex functions are not quasiconvex in general;
  • f quasiconvex in x, y ⟹ g(x) = inf_y f(x, y) quasiconvex in x (projection of the epigraph remains quasiconvex).
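As noted at the start of this subsection, a quasiconvex function can be minimized through a short sequence of convex problems. The paper does not spell out the algorithm; one common scheme, sketched below under that assumption, is bisection on the optimal value, solving one convex feasibility problem per step over the sublevel set {x | f(x) ≤ α}. The sketch (cvxpy, invented data) specializes this to a linear-fractional objective over a box, for which the α-sublevel set is the polyhedron {x | a^T x + b ≤ α (c^T x + d)}.

    import numpy as np
    import cvxpy as cp

    # Hypothetical data: minimize (a^T x + b)/(c^T x + d) over {x | Gx <= h}, with c^T x + d > 0 there.
    rng = np.random.default_rng(6)
    n = 4
    a, c = rng.standard_normal(n), rng.standard_normal(n)
    b, d = 1.0, 10.0
    G, h = np.vstack([np.eye(n), -np.eye(n)]), np.ones(2 * n)    # the box |x_i| <= 1

    x = cp.Variable(n)
    lo, hi = -10.0, 10.0                       # initial bracket on the optimal value
    for _ in range(40):                        # each bisection step is one convex feasibility problem
        alpha = 0.5 * (lo + hi)
        feas = cp.Problem(cp.Minimize(0),
                          [G @ x <= h,
                           a @ x + b <= alpha * (c @ x + d)])    # the alpha-sublevel set (a polyhedron)
        feas.solve()
        if feas.status in ("optimal", "optimal_inaccurate"):
            hi = alpha                         # sublevel set nonempty: optimal value <= alpha
        else:
            lo = alpha                         # empty: optimal value > alpha
    print("optimal value lies in", (lo, hi))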

IV. CONVEX OPTIMIZATION PROBLEMS

In this section, we will discuss the formulation of optimization problems at a general level. In the next section we will focus on useful optimization problems with specific structure.

Consider the following optimization problem in standard form:

    minimize   f0(x)
    subject to fi(x) ≤ 0,  i = 1, ..., m
               hi(x) = 0,  i = 1, ..., p,

where fi, hi : R^n → R; x is the optimization variable; f0 is the objective or cost function; fi(x) ≤ 0 are the inequality constraints; and hi(x) = 0 are the equality constraints. Geometrically, this problem corresponds to the minimization of f0 over a set described as the intersection of the 0-sublevel sets of the fi's with the surfaces described by the 0-solution sets of the hi's.

A point x is feasible if it satisfies the constraints; the feasible set C is the set of all feasible points; and the problem is feasible if there are feasible points. The problem is said to be unconstrained if m = p = 0. The optimal value is denoted by f* = inf_{x∈C} f0(x), and we adopt the convention that f* = +∞ if the problem is infeasible. A point x ∈ C is an optimal point if f0(x) = f*, and the optimal set is Xopt = {x ∈ C | f0(x) = f*}.

As an example, consider the problem

    minimize   x1 + x2
    subject to −x1 ≤ 0
               −x2 ≤ 0
               1 − √(x1 x2) ≤ 0.

[Figure: the feasible set C is the region on or above the branch of the hyperbola x1 x2 = 1 in the nonnegative quadrant.]

The objective function is f0(x) = [1 1]^T x; the feasible set C is a half-hyperboloid; the optimal value is f* = 2; and the only optimal point is x* = (1, 1).
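This small example is easy to reproduce with a modeling tool. The sketch below (cvxpy again, an assumed package) expresses the constraint 1 − √(x1 x2) ≤ 0 through the concave geometric mean and recovers f* = 2 at x* = (1, 1).

    import cvxpy as cp

    x = cp.Variable(2)
    prob = cp.Problem(cp.Minimize(cp.sum(x)),
                      [x >= 0,                    # -x1 <= 0 and -x2 <= 0
                       cp.geo_mean(x) >= 1])      # equivalent to 1 - sqrt(x1 x2) <= 0
    prob.solve()
    print(prob.value, x.value)                    # approximately 2.0 and (1.0, 1.0)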
In the standard problem above, the explicit constraints are given by fi(x) ≤ 0 and hi(x) = 0. However, there are also the implicit constraints x ∈ dom fi, x ∈ dom hi; i.e., x must lie in the set

    D = dom f0 ∩ ··· ∩ dom fm ∩ dom h1 ∩ ··· ∩ dom hp,

which is called the domain of the problem. For example,

    minimize   −log x1 − log x2
    subject to x1 + x2 − 1 ≤ 0

has the implicit constraint x ∈ D = {x ∈ R² | x1 > 0, x2 > 0}.

A feasibility problem is a special case of the standard problem, where we are interested merely in finding any feasible point. Thus, the problem is really to
  • either find x ∈ C,
  • or determine that C = ∅.
Equivalently, the feasibility problem requires that we either solve the inequality/equality system

    fi(x) ≤ 0,  i = 1, ..., m
    hi(x) = 0,  i = 1, ..., p

or determine that it is inconsistent.

An optimization problem in standard form is a convex optimization problem if f0, f1, ..., fm are all convex and the hi are all affine:

    minimize   f0(x)
    subject to fi(x) ≤ 0,  i = 1, ..., m
               ai^T x − bi = 0,  i = 1, ..., p.

This is often written as

    minimize   f0(x)
    subject to fi(x) ≤ 0,  i = 1, ..., m
               Ax = b,

where A ∈ R^(p×n) and b ∈ R^p. As mentioned in the introduction, convex optimization problems have three crucial properties that make them fundamentally more tractable than generic nonconvex optimization problems:
  1) no local minima: any local optimum is necessarily a global optimum;
  2) exact infeasibility detection: using duality theory (which is not covered here), so algorithms are easy to initialize;
  3) efficient numerical solution methods that can handle very large problems.

Note that often seemingly 'slight' modifications of a convex problem can be very hard. Examples include:
  • convex maximization or concave minimization, e.g.,

        maximize   ‖x‖
        subject to Ax ⪯ b;

  • nonlinear equality constraints, e.g.,

        minimize   c^T x
        subject to x^T Pi x + qi^T x + ri = 0,  i = 1, ..., K;

  • minimizing over nonconvex sets, e.g., Boolean variables:

        find x such that Ax ⪯ b, xi ∈ {0, 1}.

To understand global optimality in convex problems, recall that x ∈ C is locally optimal if it satisfies

    y ∈ C, ‖y − x‖ ≤ R  ⟹  f0(y) ≥ f0(x)

for some R > 0, while x ∈ C is globally optimal if

    y ∈ C  ⟹  f0(y) ≥ f0(x).

For convex optimization problems, any local solution is also global. [Proof sketch: suppose x is locally optimal, but that there is a y ∈ C with f0(y) < f0(x). Then we may take a small step from x towards y, i.e., z = λy + (1 − λ)x with λ > 0 small. Then z is near x, with f0(z) < f0(x), which contradicts local optimality.]

There is also a first order condition that characterizes optimality in convex optimization problems. Suppose f0 is differentiable; then x ∈ C is optimal iff

    y ∈ C  ⟹  ∇f0(x)^T (y − x) ≥ 0.

So −∇f0(x) defines a supporting hyperplane for C at x. This means that if we move from x towards any other feasible y, f0 does not decrease.
[Figure: contour lines of f0 and the feasible set C; at the optimum x, the vector −∇f0(x) defines a supporting hyperplane of C.]

Many standard convex optimization algorithms assume a linear objective function, so it is important to realize that any convex optimization problem can be converted to one with a linear objective function. This is called putting the problem into epigraph form. It involves rewriting the problem as

    minimize   t
    subject to f0(x) − t ≤ 0
               fi(x) ≤ 0,  i = 1, ..., m
               hi(x) = 0,  i = 1, ..., p,

where the variables are now (x, t) and the objective has essentially been moved into the inequality constraints. Observe that the new objective is linear: t = e_{n+1}^T (x, t), where e_{n+1} is the vector of all zeros except for a 1 in its (n+1)th component. Minimizing t results in minimizing f0 with respect to x, so any optimal x* of the new problem is optimal for the original problem, and vice versa. Hence, the linear objective is 'universal' for convex optimization.

The above trick of introducing the extra variable t is known as the method of slack variables. It is very helpful in transforming problems into canonical form. We will see another example in the next section.

A convex optimization problem in standard form with generalized inequalities is written as

    minimize   f0(x)
    subject to fi(x) ⪯_{Ki} 0,  i = 1, ..., L
               Ax = b,

where f0 : R^n → R is convex, the ⪯_{Ki} are generalized inequalities on R^{mi}, and the fi : R^n → R^{mi} are Ki-convex.

A quasiconvex optimization problem is exactly the same as a convex optimization problem, except for one difference: the objective function f0 is quasiconvex. We will see an important example of quasiconvex optimization in the next section: linear-fractional programming.

V. CANONICAL OPTIMIZATION PROBLEMS

In this section, we present several canonical optimization problem formulations which have been found to be extremely useful in practice, and for which extremely efficient solution codes are available (often for free!). Thus, if a real problem can be cast into one of these forms, it can be considered essentially solved.

A. Conic programming

We will present three canonical problems in this section. They are called conic problems because the inequalities are specified in terms of affine functions and generalized inequalities. Geometrically, the inequalities are feasible if the range of the affine mapping intersects the cone of the inequality.

The problems are of increasing expressivity and modeling power. However, roughly speaking, each added level of modeling capability comes at the cost of longer computation time. Thus one should use the least complex form that expresses the problem at hand.

A general linear program (LP) has the form

    minimize   c^T x + d
    subject to Gx ⪯ h
               Ax = b,

where G ∈ R^(m×n) and A ∈ R^(p×n).

A problem that subsumes both linear and quadratic programming is the second-order cone program (SOCP):

    minimize   f^T x
    subject to ‖Ai x + bi‖2 ≤ ci^T x + di,  i = 1, ..., m
               Fx = g,

where x ∈ R^n is the optimization variable, Ai ∈ R^(ni×n), and F ∈ R^(p×n). When ci = 0, i = 1, ..., m, the SOCP is equivalent to a quadratically constrained quadratic program (QCQP), which is obtained by squaring each of the constraints. Similarly, if Ai = 0, i = 1, ..., m, then the SOCP reduces to a (general) LP. Second-order cone programs are, however, more general than QCQPs (and of course LPs).

A problem which subsumes linear, quadratic and second-order cone programming is the semidefinite program (SDP), which has the form

    minimize   c^T x
    subject to x1 F1 + ··· + xn Fn + G ⪯ 0
               Ax = b,

where G, F1, ..., Fn ∈ S^k and A ∈ R^(p×n). The inequality here is a linear matrix inequality. As shown earlier, since SOCP constraints can be written as LMI's, SDP's subsume SOCP's, and hence LP's as well. (If there are multiple LMI's, they can be stacked into one large block-diagonal LMI, where the blocks are the individual LMIs.)
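In a modeling package the conic forms can be stated almost verbatim. The sketch below (cvxpy, small made-up data) shows an LP and an SOCP whose expressions map one-to-one onto the forms above; an SDP is written analogously, with a matrix expression constrained to be negative semidefinite (the operator << in cvxpy).

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(7)
    n, m, p = 4, 3, 2
    c = rng.standard_normal(n)
    G, h = np.vstack([np.eye(n), -np.eye(n)]), np.ones(2 * n)    # box constraints Gx <= h
    A = rng.standard_normal((p, n)); b = A @ (0.1 * np.ones(n))  # equality data with a known feasible point
    x = cp.Variable(n)

    # LP: minimize c^T x  subject to  Gx <= h (componentwise), Ax = b
    lp = cp.Problem(cp.Minimize(c @ x), [G @ x <= h, A @ x == b])
    lp.solve()

    # SOCP: minimize c^T x  subject to  ||A_i x + b_i||_2 <= d_i  (the c_i = 0 case), plus the same box
    Ai = [rng.standard_normal((2, n)) for _ in range(m)]
    bi = [rng.standard_normal(2) for _ in range(m)]
    di = 5.0                                                     # generous right-hand side keeps it feasible
    socp = cp.Problem(cp.Minimize(c @ x),
                      [G @ x <= h] +
                      [cp.norm(Ai[i] @ x + bi[i], 2) <= di for i in range(m)])
    socp.solve()
    print(lp.value, socp.value)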

We will now consider some examples, which are themselves very instructive demonstrations of modeling and the power of convex optimization.

First, we show how slack variables can be used to convert a given problem into canonical form. Consider the constrained ℓ∞ (Chebychev) approximation problem

    minimize   ‖Ax − b‖∞
    subject to Fx ⪯ g.

By introducing the extra variable t, we can rewrite the problem as

    minimize   t
    subject to Ax − b ⪯ t·1
               Ax − b ⪰ −t·1
               Fx ⪯ g,

which is an LP in the variables x ∈ R^n, t ∈ R (1 is the vector with all components equal to 1).
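The equivalence of the two forms is easy to check numerically. The sketch below (cvxpy, random data) solves the ℓ∞ problem directly and in its slack-variable LP form, and compares the optimal values.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(8)
    m, n = 20, 5
    A, b = rng.standard_normal((m, n)), rng.standard_normal(m)
    F, g = np.vstack([np.eye(n), -np.eye(n)]), np.ones(2 * n)    # box constraints Fx <= g

    # Direct form: minimize ||Ax - b||_inf subject to Fx <= g
    x = cp.Variable(n)
    direct = cp.Problem(cp.Minimize(cp.norm(A @ x - b, "inf")), [F @ x <= g])
    direct.solve()

    # Slack-variable LP: minimize t subject to -t1 <= Ax - b <= t1, Fx <= g
    x2, t = cp.Variable(n), cp.Variable()
    ones = np.ones(m)
    lp = cp.Problem(cp.Minimize(t),
                    [A @ x2 - b <= t * ones,
                     A @ x2 - b >= -t * ones,
                     F @ x2 <= g])
    lp.solve()
    print(direct.value, lp.value)     # the two optimal values agree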
Second, as a (more extensive) example of how an SOCP might arise, consider linear programming with random constraints,

    minimize   c^T x
    subject to Prob(ai^T x ≤ bi) ≥ η,  i = 1, ..., m.

Here we suppose that the parameters ai are independent Gaussian random vectors, with mean āi and covariance Σi. We require that each constraint ai^T x ≤ bi should hold with a probability (or confidence) exceeding η, where η ≥ 0.5, i.e., Prob(ai^T x ≤ bi) ≥ η.

We will show that this probability constraint can be expressed as a second-order cone constraint. Letting u = ai^T x, with σ² denoting its variance, this constraint can be written as

    Prob( (u − ū)/σ ≤ (bi − ū)/σ ) ≥ η.

Since (u − ū)/σ is a zero mean, unit variance Gaussian variable, the probability above is simply Φ((bi − ū)/σ), where

    Φ(z) = (1/√(2π)) ∫_{−∞}^{z} e^{−t²/2} dt

is the cumulative distribution function of a zero mean, unit variance Gaussian random variable. Thus the probability constraint can be expressed as

    (bi − ū)/σ ≥ Φ^{−1}(η),

or, equivalently, ū + Φ^{−1}(η) σ ≤ bi. From ū = āi^T x and σ = (x^T Σi x)^{1/2} we obtain

    āi^T x + Φ^{−1}(η) ‖Σi^{1/2} x‖2 ≤ bi.

By our assumption that η ≥ 1/2, we have Φ^{−1}(η) ≥ 0, so this constraint is a second-order cone constraint. In summary, the stochastic LP can be expressed as the SOCP

    minimize   c^T x
    subject to āi^T x + Φ^{−1}(η) ‖Σi^{1/2} x‖2 ≤ bi,  i = 1, ..., m.

We will see examples of using LMI constraints below.

B. Extensions of conic programming

We now present two more canonical convex optimization formulations, which can be viewed as extensions of the above conic formulations.

A generalized linear-fractional program (GLFP) has the form

    minimize   f0(x)
    subject to Gx ⪯ h
               Ax = b,

where the objective function is given by

    f0(x) = max_{i=1,...,r} (ci^T x + di)/(ei^T x + fi),

with dom f0 = {x | ei^T x + fi > 0, i = 1, ..., r}. The objective function is the pointwise maximum of r quasiconvex functions, and therefore quasiconvex, so this problem is quasiconvex.

A determinant maximization program (maxdet) has the form

    minimize   c^T x − log det G(x)
    subject to G(x) ⪰ 0
               F(x) ⪰ 0
               Ax = b,

where F and G are linear (affine) matrix functions of x, so the inequality constraints are LMI's. By flipping signs, one can pose this as a maximization problem, which is the historic reason for the name. Since G is affine and, as shown earlier, the function −log det X is convex on the positive definite cone, the maxdet program is a convex problem.
We now consider an example of each type. First, we consider an optimal transmitter power allocation problem, where the variables are the transmitter powers pk, k = 1, ..., m. We are given m transmitters and mn receivers, all at the same frequency. Transmitter i wants to transmit to its n receivers, labeled (i, j), j = 1, ..., n (transmitter and receiver locations are fixed).

Let Aijk ∈ R denote the path gain from transmitter k to receiver (i, j), and Nij ∈ R the (self) noise power of receiver (i, j). At receiver (i, j), the signal power is Sij = Aiji pi and the noise plus interference power is Iij = Σ_{k≠i} Aijk pk + Nij. Therefore the signal to interference/noise ratio (SINR) is Sij / Iij.

The optimal transmitter power allocation problem is then to choose the pi to maximize the smallest SINR:

    maximize   min_{i,j}  Aiji pi / ( Σ_{k≠i} Aijk pk + Nij )
    subject to 0 ≤ pi ≤ pmax.

This is a GLFP.

Next we consider two related ellipsoid approximation problems. In addition to the use of LMI constraints, this example illustrates techniques for computing with ellipsoids, and also how the choice of representation of the ellipsoids and polyhedra can make the difference between a tractable convex problem and a hard, intractable one.

First consider computing the minimum volume ellipsoid around points v1, ..., vK ∈ R^n. This is equivalent to finding the minimum volume ellipsoid around the polytope defined by the convex hull of those points, represented in vertex form. For this problem, we choose the representation of the ellipsoid as E = {x | ‖Ax − b‖ ≤ 1}, which is parametrized by its center A^{−1}b and the shape matrix A = A^T ≻ 0, and whose volume is proportional to det A^{−1}. The problem immediately reduces to

    minimize   log det A^{−1}
    subject to A = A^T ≻ 0
               ‖A vi − b‖ ≤ 1,  i = 1, ..., K,

which is a convex optimization problem in A, b (n + n(n+1)/2 variables), and can be written as a maxdet problem.

Now consider finding the maximum volume ellipsoid inside a polytope given in inequality form,

    P = {x | ai^T x ≤ bi, i = 1, ..., L}.

For this problem, we represent the ellipsoid as E = {By + d | ‖y‖ ≤ 1}, which is parametrized by its center d and shape matrix B = B^T ≻ 0, and whose volume is proportional to det B. The containment condition means

    E ⊆ P  ⟺  ai^T (By + d) ≤ bi for all ‖y‖ ≤ 1
           ⟺  sup_{‖y‖≤1} (ai^T B y + ai^T d) ≤ bi
           ⟺  ‖B ai‖ + ai^T d ≤ bi,  i = 1, ..., L,

which is a convex constraint in B and d. Hence finding the maximum volume E ⊆ P is a convex problem in the variables B, d:

    maximize   log det B
    subject to B = B^T ≻ 0
               ‖B ai‖ + ai^T d ≤ bi,  i = 1, ..., L.
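This problem can be stated essentially verbatim in a modeling tool: cp.log_det handles the objective, and the symmetry and semidefiniteness of B are imposed by declaring it a PSD variable. The sketch below (cvxpy, with a small made-up polytope, the unit box in R²) is one way to write it; the minimum volume covering ellipsoid above is handled the same way, with variables A, b, objective log det A and constraints ‖A vi − b‖ ≤ 1.

    import numpy as np
    import cvxpy as cp

    # Hypothetical polytope P = {x | a_i^T x <= b_i}: the unit box in R^2.
    a = np.vstack([np.eye(2), -np.eye(2)])
    b = np.ones(4)

    B = cp.Variable((2, 2), PSD=True)       # shape matrix B = B^T >= 0 (log det drives it positive definite)
    d = cp.Variable(2)                      # center
    constraints = [cp.norm(B @ a[i], 2) + a[i] @ d <= b[i] for i in range(4)]
    prob = cp.Problem(cp.Maximize(cp.log_det(B)), constraints)
    prob.solve()
    print(B.value, d.value)                 # for the unit box: B close to the identity, d close to 0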
Note, however, that minor variations on these two problems, which are convex and hence easy, result in problems that are very difficult:
  • compute the maximum volume ellipsoid inside a polyhedron given in vertex form Co{v1, ..., vK};
  • compute the minimum volume ellipsoid containing a polyhedron given in inequality form Ax ⪯ b.
In fact, just checking whether a given ellipsoid E covers a polytope described in inequality form is extremely difficult (it is equivalent to maximizing a convex quadratic subject to linear inequalities).
C. Geometric programming

In this section we describe a family of optimization problems that are not convex in their natural form. However, they are an excellent example of how problems can sometimes be transformed to convex optimization problems, by a change of variables and a transformation of the objective and constraint functions. In addition, they have been found tremendously useful for modeling real problems, e.g., circuit design and communication networks.

A function f : R^n → R with dom f = R^n_{++}, defined as

    f(x) = c x1^{a1} x2^{a2} ··· xn^{an},

where c > 0 and ai ∈ R, is called a monomial function, or simply, a monomial. The exponents ai of a monomial can be any real numbers, including fractional or negative, but the coefficient c must be positive.

A sum of monomials, i.e., a function of the form

    f(x) = Σ_{k=1}^{K} ck x1^{a1k} x2^{a2k} ··· xn^{ank},

where ck > 0, is called a posynomial function (with K terms), or simply, a posynomial. Posynomials are closed under addition, multiplication, and nonnegative scaling. Monomials are closed under multiplication and division. If a posynomial is multiplied by a monomial, the result is a posynomial; similarly, a posynomial can be divided by a monomial, with the result a posynomial.

An optimization problem of the form

    minimize   f0(x)
    subject to fi(x) ≤ 1,  i = 1, ..., m
               hi(x) = 1,  i = 1, ..., p,

where f0, ..., fm are posynomials and h1, ..., hp are monomials, is called a geometric program (GP). The domain of this problem is D = R^n_{++}; the constraint x ≻ 0 is implicit.

Several extensions are readily handled. If f is a posynomial and h is a monomial, then the constraint f(x) ≤ h(x) can be handled by expressing it as f(x)/h(x) ≤ 1 (since f/h is a posynomial). This includes as a special case a constraint of the form f(x) ≤ a, where f is posynomial and a > 0. In a similar way, if h1 and h2 are both nonzero monomial functions, then we can handle the equality constraint h1(x) = h2(x) by expressing it as h1(x)/h2(x) = 1 (since h1/h2 is a monomial). We can maximize a nonzero monomial objective function by minimizing its inverse (which is also a monomial).

We will now transform geometric programs to convex problems by a change of variables and a transformation of the objective and constraint functions. We will use the variables defined as yi = log xi, so xi = e^{yi}. If f is a monomial function of x, i.e.,

    f(x) = c x1^{a1} x2^{a2} ··· xn^{an},

then

    f(x) = f(e^{y1}, ..., e^{yn}) = c (e^{y1})^{a1} ··· (e^{yn})^{an} = e^{a^T y + b},

where b = log c. The change of variables yi = log xi turns a monomial function into the exponential of an affine function.

Similarly, if f is a posynomial, i.e.,

    f(x) = Σ_{k=1}^{K} ck x1^{a1k} x2^{a2k} ··· xn^{ank},

then

    f(x) = Σ_{k=1}^{K} e^{ak^T y + bk},

where ak = (a1k, ..., ank) and bk = log ck. After the change of variables, a posynomial becomes a sum of exponentials of affine functions.

The geometric program can be expressed in terms of the new variable y as

    minimize   Σ_{k=1}^{K0} e^{a0k^T y + b0k}
    subject to Σ_{k=1}^{Ki} e^{aik^T y + bik} ≤ 1,  i = 1, ..., m
               e^{gi^T y + hi} = 1,  i = 1, ..., p,

where the aik ∈ R^n, i = 0, ..., m, contain the exponents of the posynomial inequality constraints, and the gi ∈ R^n, i = 1, ..., p, contain the exponents of the monomial equality constraints of the original geometric program.

Now we transform the objective and constraint functions by taking logarithms. This results in the problem

    minimize   f̃0(y) = log( Σ_{k=1}^{K0} e^{a0k^T y + b0k} )
    subject to f̃i(y) = log( Σ_{k=1}^{Ki} e^{aik^T y + bik} ) ≤ 0,  i = 1, ..., m
               h̃i(y) = gi^T y + hi = 0,  i = 1, ..., p.

Since the functions f̃i are convex and the h̃i are affine, this problem is a convex optimization problem. We refer to it as a geometric program in convex form. Note that the transformation between the posynomial form geometric program and the convex form geometric program does not involve any computation; the problem data for the two problems are the same.
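The convex form maps directly onto log-sum-exp expressions. The sketch below (cvxpy, with invented exponent data) solves a tiny geometric program in its convex form in the variables y = log x and then recovers x = e^y. (cvxpy can also accept posynomial problems directly and perform this transformation itself via prob.solve(gp=True), but the formulation shown here follows the derivation above.)

    import numpy as np
    import cvxpy as cp

    # Tiny invented GP: minimize x1 + x2  s.t.  0.25/(x1 x2) + 0.25 x1/x2 <= 1  and  x1 x2 / 4 = 1.
    A0 = np.array([[1.0, 0.0], [0.0, 1.0]]);    b0 = np.log(np.array([1.0, 1.0]))
    A1 = np.array([[-1.0, -1.0], [1.0, -1.0]]); b1 = np.log(np.array([0.25, 0.25]))
    g  = np.array([1.0, 1.0]);                  h  = -np.log(4.0)

    y = cp.Variable(2)                                  # y = log x
    prob = cp.Problem(cp.Minimize(cp.log_sum_exp(A0 @ y + b0)),
                      [cp.log_sum_exp(A1 @ y + b1) <= 0,
                       g @ y + h == 0])
    prob.solve()
    print(np.exp(y.value))                              # the optimal x of the original GP, here (2, 2)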

VI. NONSTANDARD AND NONCONVEX PROBLEMS

In practice, one might encounter convex problems which do not fall into any of the canonical forms above. In this case, it is necessary to develop custom code for the problem. Developing such codes requires gradient and, for optimal performance, Hessian information. If only gradient information is available, the ellipsoid, subgradient or cutting plane methods can be used. These are reliable methods with exact stopping criteria. The same is true for interior point methods; however, these also require Hessian information. The payoff for having Hessian information is much faster convergence: in practice, such methods solve most problems at the cost of a dozen least squares problems of about the size of the decision variables. These methods are described in the references.

Nonconvex problems are, of course, more common than convex problems. However, convex optimization can still often be helpful in solving these as well: it can often be used to compute lower bounds, useful initial starting points, etc.; see for example [4], [13], [23], [30].
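None of these methods is developed in the paper. As a flavor of the gradient-only case, the sketch below implements a plain subgradient step loop for the nondifferentiable convex function ‖Ax − b‖1 with made-up data, using the classic diminishing step size; it is an illustrative toy, not one of the exact-stopping-criterion methods cited above.

    import numpy as np

    rng = np.random.default_rng(9)
    m, n = 30, 10
    A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

    f = lambda x: np.abs(A @ x - b).sum()        # convex but nondifferentiable
    x = np.zeros(n)
    best = f(x)
    for k in range(1, 2001):
        g = A.T @ np.sign(A @ x - b)             # a subgradient of f at x
        x = x - (1.0 / k) * g                    # diminishing step size 1/k
        best = min(best, f(x))                   # not a descent method, so track the best point seen
    print(best)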
Wesley, second edition, 1984.
VII. C ONCLUSION [21] Z.-Q. Luo. Applications of convex optimization in signal pro-
cessing and digital communication. Mathematical Programming
In this paper we have reviewed some essentials of Series B, 97:177–207, 2003.
convex sets, functions and optimization problems. We [22] O. Mangasarian. Nonlinear Programming. Society for Industrial
and Applied Mathematics, 1994. First published in 1969 by
hope that the reader is convinced that convex optimiza- McGraw-Hill.
tion dramatically increases the range of problems in [23] Y. Nesterov and A. Nemirovskii. Interior-Point Polynomial
engineering that can be modeled and solved exactly. Methods in Convex Programming. Society for Industrial and
Applied Mathematics, 1994.
ACKNOWLEDGEMENT [24] R. T. Rockafellar. Convex Analysis. Princeton University Press,
1970.
I am deeply indebted to Stephen Boyd and Lieven [25] C. Roos, T. Terlaky, and J.-Ph. Vial. Theory and Algorithms for
Linear Optimization. An Interior Point Approach. John Wiley &
Vandenberghe for generously allowing me to use mate- Sons, 1997.
rial from [7] in preparing this tutorial. [26] G. Strang. Linear Algebra and its Applications. Academic Press,
1980.
R EFERENCES [27] J. van Tiel. Convex Analysis. An Introductory Text. John Wiley
& Sons, 1984.
[1] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear [28] H. Wolkowicz, R. Saigal, and L. Vandenberghe, editors. Hand-
Programming. Theory and Algorithms. John Wiley & Sons, book of Semidefinite Programming. Kluwer Academic Publish-
second edition, 1993. ers, 2000.
[2] A. Ben-Tal and A. Nemirovski. Lectures on Modern Convex Op- [29] S. J. Wright. Primal-Dual Interior-Point Methods. Society for
timization. Analysis, Algorithms, and Engineering Applications. Industrial and Applied Mathematics, 1997.
Society for Industrial and Applied Mathematics, 2001. [30] Y. Ye. Interior Point Algorithms. Theory and Analysis. John
[3] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Wiley & Sons, 1997.
1999.
[4] D. Bertsimas and J. N. Tsitsiklis. Introduction to Linear
Optimization. Athena Scientific, 1997.
[5] S. Boyd and C. Barratt. Linear Controller Design: Limits of
Performance. Prentice-Hall, 1991.
[6] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan. Linear
Matrix Inequalities in System and Control Theory. Society for
Industrial and Applied Mathematics, 1994.
[7] S.P. Boyd and L. Vandenberghe. Convex Optimization. Cam-
bridge University Press, 2003. In Press. Material available at
www.stanford.edu/˜boyd.
[8] D. Colleran, C. Portmann, A. Hassibi, C. Crusius, S. Mohan,
T. Lee S. Boyd, and M. Hershenson. Optimization of phase-
locked loop circuits via geometric programming. In Custom
Integrated Circuits Conference (CICC), San Jose, California,
September 2003.
