Hoang Tuy

Convex Analysis and Global Optimization

Second Edition
Springer Optimization and Its Applications
VOLUME 110
Managing Editor
Panos M. Pardalos (University of Florida)
Editor-Combinatorial Optimization
Ding-Zhu Du (University of Texas at Dallas)
Advisory Board
J. Birge (University of Chicago)
C.A. Floudas (Princeton University)
F. Giannessi (University of Pisa)
H.D. Sherali (Virginia Polytechnic Institute and State University)
T. Terlaky (McMaster University)
Y. Ye (Stanford University)
Hoang Tuy
Vietnam Academy of Science
and Technology
Institute of Mathematics
Hanoi, Vietnam
Preface to the First Edition

This book grew out of lecture notes from a course given at Graz University of
Technology (autumn 1990), the Institute of Technology at Linköping University
(autumn 1991), and the École Polytechnique in Montréal (autumn 1993). Originally
conceived of as a Ph.D. course, it attempts to present a coherent and mathematically
rigorous theory of deterministic global optimization. At the same time, aiming at
providing a concise account of the methods and algorithms, it focuses on the main
ideas and concepts, leaving aside too technical details which might affect the clarity
of presentation.
Global optimization is concerned with finding global solutions to nonconvex
optimization problems. Although until recently convex analysis has been developed
mainly under the impulse of convex and local optimization, it has become a basic
tool for global optimization. The reason is that the general mathematical structure
underlying virtually every nonconvex optimization problem can be described in
terms of functions representable as differences of convex functions (dc functions)
and sets which are differences of convex sets (dc sets). Due to this fact, many
concepts and results from convex analysis play an essential role in the investigation
of important classes of global optimization problems. Since, however, convexity
in nonconvex optimization problems is present only partially or in a reverse
sense, new concepts have to be introduced and new questions have to be answered.
Therefore, a new chapter on dc functions and dc sets is added to the traditional
material of convex analysis. Part I of this book is an introduction to convex analysis
interpreted in this broad sense as an indispensable tool for global optimization.
Part II presents a theory of deterministic global optimization which heavily relies
on the dc structure of nonconvex optimization problems. The key subproblem in
this approach is to transcend an incumbent, i.e., given a solution of an optimization
problem (the best so far obtained), check its global optimality, and find a better
solution, if there is one. As it turns out, this subproblem can always be reduced to
solving a dc inclusion of the form x ∈ D \ C, where D, C are two convex sets.
Chapters 4-6 are devoted to general methods for solving concave and dc programs
through dc inclusions of this form. These methods include successive partitioning
and cutting, outer approximation and polyhedral annexation, or combinations of
these concepts. The last two chapters discuss methods for exploiting special structures in
global optimization. Two aspects of nonconvexity deserve particular attention: the
rank of nonconvexity, i.e., roughly speaking the number of nonconvex variables,
and the degree of nonconvexity, i.e., the extent to which a problem fails to be
convex. Decomposition methods for handling low rank nonconvex problems are
presented in Chap. 7, while nonconvex quadratic problems, i.e., problems involving
only nonconvex functions which in a sense have lowest degree of nonconvexity, are
discussed in the last Chap. 8.
I have made no attempt to cover all the developments to date because this would
be simply impossible and unreasonable. With regret I have to omit many interesting
results which do not fit in with the mainstream of the book. On the other hand, much
new material is offered, even in the parts where the traditional approach is more or
less standard.
I would like to express my sincere thanks to many colleagues and friends,
especially R.E. Burkard (Graz Technical University), A. Migdalas and P. Värbrand
(Linköping University), and B. Jaumard and P. Hansen (University of Montréal),
for many fruitful discussions we had during my enjoyable visits to their departments,
and P.M. Pardalos for his encouragement in publishing this book. I am particularly
grateful to P.T. Thach and F.A. Al-Khayyal for many useful remarks and suggestions
and also to P.H. Dien for his efficient help during the preparation of the manuscript.

Preface to the Second Edition
Over the 17 years since the publication of the first edition of this book, tremendous
progress has been achieved in global optimization. The present revised and enlarged
edition is to give an up-to-date account of the field, in response to the needs of
research, teaching, and applications in the years to come.
In addition to the correction of misprints and errors, much new material is
offered. In particular, several recent important advances are included: a modern
approach to minimax, fixed point and equilibrium, a robust approach to optimization
under nonconvex constraints, and also a thorough study of quadratic programming
with a single quadratic constraint.
Moreover, three new chapters are added: monotonic optimization (Chap. 11),
polynomial optimization (Chap. 12), and optimization under equilibrium constraints
(Chap. 13). These topics have received increased attention in recent years due to
many important engineering and economic applications.
Hopefully, as the main reference in deterministic global optimization, this book
will replace both its first edition and the 1996 (last) edition of the book Global
Optimization: Deterministic Approaches by Reiner Horst and Hoang Tuy.
1 Convex Sets

1.1 Affine Sets

Given two points a, b ∈ ℝⁿ, the set of all points x = (1 − λ)a + λb, λ ∈ ℝ,
is called the line through a and b. A subset M of ℝⁿ is called an affine set (or affine
manifold) if it contains every line through two points of it, i.e., if (1 − λ)a + λb ∈ M
for every a ∈ M, b ∈ M, and every λ ∈ ℝ. An affine set which contains the origin is
a subspace.
Proposition 1.1 A nonempty set M is an affine set if and only if M = a + L, where
a ∈ M and L is a subspace.

Proof Let M be an affine set and a ∈ M. Then M = a + L, where the set L = M − a
contains 0 and is an affine set, hence is a subspace. Conversely, let M = a + L,
with a ∈ M and L a subspace. For any x, y ∈ M, λ ∈ ℝ we have (1 − λ)x + λy =
a + (1 − λ)(x − a) + λ(y − a). Since x − a, y − a ∈ L and L is a subspace, it follows
that (1 − λ)(x − a) + λ(y − a) ∈ L, hence (1 − λ)x + λy ∈ M. Therefore, M is an
affine set. ∎
The subspace L is said to be parallel to the affine set M. For a given nonempty
affine set M the parallel subspace L is uniquely defined. Indeed, if M = a′ + L′,
then L′ = M − a′ = L + (a − a′); but a′ = a′ + 0 ∈ M = a + L, hence a′ − a ∈ L,
and so L′ = L + (a − a′) = L.
The dimension of the subspace L parallel to an affine set M is called the
dimension of M. A point a ∈ ℝⁿ is an affine set of dimension 0 because the subspace
parallel to M = {a} is L = {0}. A line through two points a, b is an affine set of
dimension 1, because the subspace parallel to it is the one-dimensional subspace
spanned by b − a.

Proposition 1.2 Every r-dimensional affine set M in ℝⁿ can be represented as

  M = {x | Ax = b},

where b ∈ ℝᵐ and A is an m × n matrix such that rank A = n − r. Conversely, any set of the above
form is an r-dimensional affine set.
Proof Let M be an r-dimensional affine set. For any x⁰ ∈ M, the set L = M − x⁰
is an r-dimensional subspace, so L = {x | Ax = 0}, where A is some m × n matrix
of rank n − r. Hence M = {x | A(x − x⁰) = 0} = {x | Ax = b}, with b = Ax⁰.
Conversely, if M is a set of this form then, for any x⁰ ∈ M, we have Ax⁰ = b, hence
M = {x | A(x − x⁰) = 0} = x⁰ + L, with L = {x | Ax = 0}. Since rank A = n − r, L
is an r-dimensional subspace. ∎
Corollary 1.1 Any hyperplane is a set of the form {x | ⟨a, x⟩ = α}, where a ∈ ℝⁿ \ {0} and α ∈ ℝ.
The intersection of a family of affine sets is again an affine set. Given a set
E ⊂ ℝⁿ, there exists at least one affine set containing E, namely ℝⁿ. Therefore,
the intersection of all affine sets containing E is the smallest affine set containing E.
This set is called the affine hull of E and denoted by affE.
Proposition 1.3 The affine hull of a set E is the set consisting of all points of
the form

  x = λ₁x¹ + … + λₖxᵏ,  λ₁ + … + λₖ = 1,  xⁱ ∈ E,

i.e., of all affine combinations of finitely many elements of E.

Proof Denote by M the set of all such combinations. If x = Σᵢ₌₁ᵏ λᵢxⁱ and
y = Σⱼ₌₁ʰ μⱼyʲ with xⁱ, yʲ ∈ E and Σᵢ₌₁ᵏ λᵢ = Σⱼ₌₁ʰ μⱼ = 1, then

  (1 − λ)x + λy = Σᵢ₌₁ᵏ (1 − λ)λᵢxⁱ + Σⱼ₌₁ʰ λμⱼyʲ

with Σᵢ₌₁ᵏ (1 − λ)λᵢ + Σⱼ₌₁ʰ λμⱼ = (1 − λ) + λ = 1, hence (1 − λ)x + λy ∈ M.
Thus M is affine, and hence M ⊃ affE. On the other hand, it is easily seen that if
x = Σᵢ₌₁ᵏ λᵢxⁱ with xⁱ ∈ E, Σᵢ₌₁ᵏ λᵢ = 1, then x ∈ affE (for k = 2 this follows
from the definition of affine sets, for k ≥ 3 it follows by induction). So M ⊂ affE,
and therefore M = affE. ∎
Proposition 1.4 The affine hull of a set of k points x¹, …, xᵏ (k > r) in ℝⁿ is of
dimension r if and only if the (n + 1) × k matrix

  [ x¹ x² ⋯ xᵏ ]
  [ 1  1  ⋯  1 ]   (1.3)

is of rank r + 1.

Proof Let M = aff{x¹, …, xᵏ}. Then M = xᵏ + L, where L is the smallest subspace
containing x¹ − xᵏ, …, xᵏ⁻¹ − xᵏ. So M is of dimension r if and only if L is of
dimension r, i.e., if and only if the matrix [x¹ − xᵏ, …, xᵏ⁻¹ − xᵏ] is of rank r. This
in turn is equivalent to requiring that the matrix (1.3) be of rank r + 1. ∎

We say that k points x¹, …, xᵏ are affinely independent if aff{x¹, …, xᵏ} has
dimension k − 1, i.e., if the vectors x¹ − xᵏ, …, xᵏ⁻¹ − xᵏ are linearly independent
or, equivalently, the matrix (1.3) is of rank k.
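The rank test in Proposition 1.4 is straightforward to carry out numerically. The sketch below (illustrative code, not from the book; the function names are my own) checks affine independence by computing the rank of matrix (1.3), using exact rational arithmetic from Python's fractions module so that the rank is not corrupted by floating-point error:

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def affinely_independent(points):
    """x^1..x^k in R^n are affinely independent iff the (n+1) x k
    matrix (1.3), with the points as columns over a row of 1's, has rank k."""
    cols = [list(p) + [1] for p in points]   # columns of (1.3), stored as rows
    return rank(cols) == len(points)         # rank is transpose-invariant

# three collinear points in R^2 are affinely dependent
print(affinely_independent([(0, 0), (1, 1), (2, 2)]))  # False
# vertices of a triangle are affinely independent
print(affinely_independent([(0, 0), (1, 0), (0, 1)]))  # True
```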
Corollary 1.2 The affine hull S of a set of k affinely independent points {x¹, …, xᵏ}
in ℝⁿ is a (k − 1)-dimensional affine set. Every point x ∈ S can be uniquely
represented in the form

  x = Σᵢ₌₁ᵏ λᵢxⁱ,  Σᵢ₌₁ᵏ λᵢ = 1.  (1.4)

Proof By Proposition 1.3 any point x ∈ S has the form (1.4). If x = Σᵢ₌₁ᵏ μᵢxⁱ
is another such representation, then Σᵢ₌₁ᵏ⁻¹ (λᵢ − μᵢ)(xⁱ − xᵏ) = 0, hence λᵢ = μᵢ
∀i = 1, …, k. ∎
Corollary 1.3 There is a unique hyperplane passing through n affinely indepen-
dent points x¹, x², …, xⁿ in ℝⁿ. Every point of this hyperplane can be uniquely
represented in the form

  x = Σᵢ₌₁ⁿ λᵢxⁱ,  Σᵢ₌₁ⁿ λᵢ = 1.
1.2 Convex Sets

Given two points a, b ∈ ℝⁿ, the set of all points x = (1 − λ)a + λb such that
0 ≤ λ ≤ 1 is called the (closed) line segment between a and b and denoted by [a, b].
A set C ⊂ ℝⁿ is called convex if it contains any line segment between two points of
it; in other words, if (1 − λ)a + λb ∈ C whenever a, b ∈ C and 0 ≤ λ ≤ 1.
Affine sets are obviously convex. Other particular examples of convex sets are
halfspaces, which are sets of the form {x | ⟨a, x⟩ ≤ α} (closed halfspaces), or
{x | ⟨a, x⟩ < α} (open halfspaces), with a ∈ ℝⁿ \ {0} and α ∈ ℝ.

Proposition 1.5 A set C is convex if and only if it contains all convex combinations
of its elements, i.e.,

  aⁱ ∈ C, i = 1, …, k, λᵢ > 0, Σᵢ₌₁ᵏ λᵢ = 1 ⟹ Σᵢ₌₁ᵏ λᵢaⁱ ∈ C.
Proof Since the sufficiency is obvious (take k = 2), it remains to prove by induction
on k that the above property holds for any convex set C. For k = 2 it is just the
definition of convexity. Assuming it holds for k = h − 1, let aⁱ ∈ C, λᵢ > 0
(i = 1, …, h), Σᵢ₌₁ʰ λᵢ = 1, and set μ = Σᵢ₌₁ʰ⁻¹ λᵢ. Then

  Σᵢ₌₁ʰ λᵢaⁱ = Σᵢ₌₁ʰ⁻¹ λᵢaⁱ + λₕaʰ = μ Σᵢ₌₁ʰ⁻¹ (λᵢ/μ)aⁱ + λₕaʰ.

Since Σᵢ₌₁ʰ⁻¹ (λᵢ/μ) = 1, by the induction hypothesis y = Σᵢ₌₁ʰ⁻¹ (λᵢ/μ)aⁱ ∈ C. Next,
since μ + λₕ = 1, we have μy + λₕaʰ ∈ C. Thus the above property holds for every
natural k. ∎
Proposition 1.6 The intersection of any collection of convex sets is convex. If C, D
are convex sets then C + D := {x + y | x ∈ C, y ∈ D} and αC := {αx | x ∈ C} (and
hence, also C − D := C + (−1)D) are convex sets.

Proof If {Cᵢ} is a collection of convex sets and a, b ∈ ∩ᵢ Cᵢ, then for each i we
have a ∈ Cᵢ, b ∈ Cᵢ, therefore [a, b] ⊂ Cᵢ and hence [a, b] ⊂ ∩ᵢ Cᵢ.
If C, D are convex sets and a = x + y, b = u + v with x, u ∈ C, y, v ∈ D, then
(1 − λ)a + λb = [(1 − λ)x + λu] + [(1 − λ)y + λv] ∈ C + D for any λ ∈ [0, 1],
hence C + D is convex. The convexity of αC can be proved analogously. ∎
A subset M of ℝⁿ is called a cone if

  x ∈ M, λ > 0 ⟹ λx ∈ M.

The origin 0 itself may or may not belong to M. A set a + M, which is the translate of
a cone M by a ∈ ℝⁿ, is also called a cone with apex at a. A cone M which contains
no line is said to be pointed; in this case, 0 is also called the vertex of M, and a + M
is a cone with vertex at a.

Proposition 1.7 A subset M of ℝⁿ is a convex cone if and only if

  λM ⊂ M  ∀λ > 0,  (1.5)
  M + M ⊂ M.  (1.6)
1.3 Relative Interior and Closure

The l₂-norm, which can also be defined via the inner product by means of
‖x‖ = √⟨x, x⟩, is also called the Euclidean norm. The l∞-norm is also called the
Tchebycheff norm.

Given a norm ‖·‖ in ℝⁿ, the distance between two points x, y ∈ ℝⁿ is the
nonnegative number ‖x − y‖; the ball of center a and radius r is the set
{x ∈ ℝⁿ | ‖x − a‖ ≤ r}.

Topological concepts in ℝⁿ such as open set, closed set, interior, closure,
neighborhood, convergent sequence, etc., are defined in ℝⁿ with respect to a given
norm. However, the following proposition shows that the topologies in ℝⁿ defined
by different norms are all equivalent.
Proposition 1.8 For any two norms ‖·‖, ‖·‖′ in ℝⁿ there exist constants c₂ ≥ c₁ > 0
such that

  c₁‖x‖ ≤ ‖x‖′ ≤ c₂‖x‖  ∀x ∈ ℝⁿ.

We omit the proof, which can be found, e.g., in Ortega and Rheinboldt (1970).
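For the particular pair ‖·‖₂, ‖·‖₁ the constants of Proposition 1.8 can be written down explicitly: c₁ = 1 and c₂ = √n, since ‖x‖₂ ≤ ‖x‖₁ ≤ √n ‖x‖₂ (the right inequality by the Cauchy-Schwarz inequality). The following short check is an illustrative sketch, not part of the book:

```python
import math
import random

def norm1(x):
    """l1-norm."""
    return sum(abs(t) for t in x)

def norm2(x):
    """Euclidean (l2) norm."""
    return math.sqrt(sum(t * t for t in x))

# For ||.|| = ||.||_2 and ||.||' = ||.||_1 in R^n, Proposition 1.8 holds with
# c1 = 1 and c2 = sqrt(n):  ||x||_2 <= ||x||_1 <= sqrt(n) * ||x||_2.
n = 5
random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    assert norm2(x) <= norm1(x) + 1e-12
    assert norm1(x) <= math.sqrt(n) * norm2(x) + 1e-12
print("c1 = 1, c2 = sqrt(n) verified on 1000 random vectors")
```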
Denote the closure and the interior of a set C by clC and intC, respectively.
Recall that a ∈ clC if and only if every ball around a contains at least one point
of C or, equivalently, a is the limit of a sequence of points of C; a ∈ intC if and
only if there is a ball around a entirely included in C. An immediate consequence
of Proposition 1.8 is the following alternative characterization of interior points of
convex sets in ℝⁿ:
Corollary 1.6 A point a of a convex set C ⊂ ℝⁿ is an interior point of C if for every
x ∈ ℝⁿ there exists λ > 0 such that a + λ(x − a) ∈ C.

Proof Without loss of generality we can assume that a = 0. Let eⁱ be the i-th unit
vector of ℝⁿ. The condition implies that for each i = 1, …, n there is αᵢ > 0 such
that αᵢeⁱ ∈ C and βᵢ > 0 such that −βᵢeⁱ ∈ C. Now let δ = min{αᵢ, βᵢ | i = 1, …, n}
and consider the set B = {x | ‖x‖₁ ≤ δ}. For any x ∈ B we have x = Σᵢ₌₁ⁿ xᵢeⁱ with
Σᵢ₌₁ⁿ |xᵢ|/δ ≤ 1; if we define uⁱ = δeⁱ for xᵢ > 0 and uⁱ = −δeⁱ for xᵢ ≤ 0, then
u¹, …, uⁿ ∈ C and we can write

  x = Σᵢ₌₁ⁿ (|xᵢ|/δ)uⁱ + (1 − Σᵢ₌₁ⁿ |xᵢ|/δ)·0.

From the convexity of C (and 0 ∈ C) it follows that x ∈ C. Thus B ⊂ C, and since
B is a ball in the l₁-norm, 0 is an interior point of C. ∎
A point a belongs to the relative interior of a convex set C ⊂ ℝⁿ (notation:
a ∈ riC) if it is an interior point of C relative to affC. The set difference (clC) \ (riC)
is called the relative boundary of C and denoted by ∂C.

Proposition 1.9 Any nonempty convex set C ⊂ ℝⁿ has a nonempty relative interior.
A vector y ≠ 0 is said to be a direction of recession of C if

  {x + λy | λ ≥ 0} ⊂ C  ∀x ∈ C.  (1.8)

The convex cone formed by all directions of recession and the vector zero is
called the recession cone of C and is denoted by recC.
Proposition 1.12 Let C ⊂ ℝⁿ be a convex set containing a point a in its interior.

(i) For every x ≠ a the halfline Γ(a, x) = {a + λ(x − a) | λ > 0} either entirely lies
in C or it cuts the boundary ∂C at a unique point π(x), such that every point in
the line segment [a, π(x)], except π(x), is an interior point of C.
(ii) The mapping π(x) is defined and continuous on the set ℝⁿ \ (a + M), where
M = rec(intC) = rec(clC) is a closed convex cone.
Proof
(i) Assuming, without loss of generality, that 0 ∈ intC, it suffices to prove the
proposition for a = 0. Let p_C(x) be the gauge of C. For every x ≠ a, if p_C(x) = 0
then x ∈ μC for all arbitrarily small μ > 0, i.e., λx ∈ C for all λ > 0, hence
the halfline Γ(a, x) entirely lies in C. If p_C(x) > 0 then p_C(λx) = λp_C(x) = 1
just for λ = 1/p_C(x). So Γ(a, x) cuts ∂C at a unique point π(x) = x/p_C(x),
such that for all 0 ≤ λ < 1/p_C(x) we have p_C(λx) < 1 and hence λx ∈ intC.
(ii) The domain of definition of π is ℝⁿ \ M, with M = {x | p_C(x) = 0}. The set M
is a closed convex cone (Fig. 1.1) because p_C(·) is continuous and p_C(x) = 0,
p_C(y) = 0 implies 0 ≤ p_C(λx + μy) ≤ λp_C(x) + μp_C(y) = 0 for all λ > 0
and μ > 0. Furthermore, the function π(x) = x/p_C(x) is continuous at any x
where p_C(x) > 0. It remains to prove that M is the recession cone of both intC
and clC. It is plain that rec(intC) ⊂ M. Conversely, if y ∈ M and x ∈ intC then
x = λa + (1 − λ)c for some c ∈ C and 0 < λ < 1. Since a + μy ∈ C for all
μ > 0, it follows that x + λμy = λ(a + μy) + (1 − λ)c ∈ C
for all μ > 0, i.e., y ∈ rec(intC). Thus, M ⊂ rec(intC). Finally, any x ∈ ∂C
is the limit of some sequence {xᵏ} ⊂ intC. If y ∈ M then for every μ > 0:
xᵏ + μy ∈ C ∀k, hence by letting k → ∞, x + μy ∈ clC. This means
that M ⊂ rec(clC), and since the converse inclusion is obvious we must have
M = rec(clC). ∎
Note that if C is neither closed nor open then recC may be a proper subset of
rec(intC) = rec(clC). An example is the set C = {(x₁, x₂) | x₁ > 0, x₂ > 0} ∪
{(0, 0)}, for which recC = C, while rec(intC) = rec(clC) = clC.

Corollary 1.8 A nonempty closed convex set C ⊂ ℝⁿ is bounded if and only if its
recession cone is the singleton {0}.
Proof By considering C in affC, we can assume that C has an interior point
(Proposition 1.9) and that 0 ∈ intC. If C is bounded then by Proposition 1.12, π(x)
is defined on ℝⁿ \ {0}, i.e., rec(intC) = {0}. Conversely, if rec(intC) = {0}, then
p_C(x) > 0 ∀x ≠ 0. Letting K = min{p_C(x) | ‖x‖ = 1} > 0, we then have, for every
x ∈ C: 1 ≥ p_C(x) = ‖x‖ p_C(x/‖x‖) ≥ K‖x‖, hence ‖x‖ ≤ 1/K. ∎
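For a polyhedron C = {x | ⟨aⁱ, x⟩ ≤ 1, i = 1, …, m} containing 0, the gauge has the closed form p_C(x) = max(0, maxᵢ ⟨aⁱ, x⟩), which makes the quantities in the proof above directly computable. A minimal sketch (the constraint list A below is an illustrative example of my own, not from the book):

```python
# Gauge of the polyhedron C = {x : <a_i, x> <= 1, i = 1..m}, which contains 0:
# p_C(x) = max(0, max_i <a_i, x>).  C is bounded exactly when p_C(x) > 0 for
# every x != 0, and then ||x|| <= 1/K for x in C, K = min over the unit sphere.
A = [(1, 0), (-1, 0), (0, 1), (0, -1)]      # here C is the square [-1, 1]^2

def gauge(x):
    """p_C(x) for the polyhedron defined by the constraint vectors in A."""
    return max(0.0, max(a[0] * x[0] + a[1] * x[1] for a in A))

print(gauge((0.5, 0.0)))   # 0.5: the point sits halfway to the boundary
print(gauge((2.0, 2.0)))   # 2.0: p_C > 1, so the point lies outside C
```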
Given any set E ⊂ ℝⁿ, there exists a convex set containing E, namely ℝⁿ. The
intersection of all convex sets containing E is called the convex hull of E and denoted
by convE. This is in fact the smallest convex set containing E.
Proposition 1.13 The convex hull of a set E ⊂ ℝⁿ consists of all convex
combinations of its elements.

Proof Let C be the set of all convex combinations of elements of E. Clearly
C ⊂ convE. To prove the converse inclusion, it suffices to show that C is convex
(since obviously C ⊃ E). If x = Σ_{i∈I} λᵢaⁱ and y = Σ_{j∈J} μⱼbʲ with aⁱ, bʲ ∈ E
and 0 ≤ λ ≤ 1, then

  (1 − λ)x + λy = Σ_{i∈I} (1 − λ)λᵢaⁱ + Σ_{j∈J} λμⱼbʲ.

Since

  Σ_{i∈I} (1 − λ)λᵢ + Σ_{j∈J} λμⱼ = (1 − λ) Σ_{i∈I} λᵢ + λ Σ_{j∈J} μⱼ = (1 − λ) + λ = 1,

the point (1 − λ)x + λy is again a convex combination of elements of E, i.e., belongs
to C. Hence C is convex. ∎
Theorem 1.1 (Carathéodory's Theorem) If a set E ⊂ ℝⁿ is contained in an affine
set of dimension k, then every point of convE is a convex combination of at most
k + 1 points of E.

Proof By Proposition 1.13, for any x ∈ convE there exists a finite subset of E, say
a¹, …, aᵐ ∈ E, such that

  Σᵢ₌₁ᵐ λᵢaⁱ = x,  Σᵢ₌₁ᵐ λᵢ = 1,  λᵢ ≥ 0.

If λ = (λ₁, …, λₘ) is a basic solution (in the sense of linear programming theory)
of the above linear system in λ₁, …, λₘ, then at most k + 1 components of λ are
positive, say λᵢ = 0 for all i > h, where h ≤ k + 1; i.e., x can be represented
as a convex combination of a¹, …, aʰ with h ≤ k + 1. ∎
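The basic-solution argument in the proof is constructive: given any convex combination, one can repeatedly use an affine dependence among the active points to drive one weight to zero, until at most n + 1 points remain. A sketch of this reduction (illustrative code with names of my own; exact rational arithmetic keeps the combination valid at every step):

```python
from fractions import Fraction

def nullspace_vector(rows, m):
    """A nonzero rational solution mu of rows . mu = 0 (m unknowns,
    fewer independent equations than unknowns), by Gaussian elimination."""
    a = [[Fraction(v) for v in row] for row in rows]
    pivots = {}                       # pivot column -> row index
    r = 0
    for col in range(m):
        piv = next((i for i in range(r, len(a)) if a[i][col] != 0), None)
        if piv is None:
            continue
        a[r], a[piv] = a[piv], a[r]
        for i in range(len(a)):
            if i != r and a[i][col] != 0:
                f = a[i][col] / a[r][col]
                a[i] = [p - f * q for p, q in zip(a[i], a[r])]
        pivots[col] = r
        r += 1
    free = next(c for c in range(m) if c not in pivots)  # exists: m > rank
    mu = [Fraction(0)] * m
    mu[free] = Fraction(1)
    for col, row in pivots.items():
        mu[col] = -a[row][free] / a[row][col]
    return mu

def caratheodory(points, weights):
    """Reduce a convex combination of points in R^n to one using at most
    n + 1 of them, following the basic-solution argument of the proof."""
    pts = [list(map(Fraction, p)) for p in points]
    lam = list(map(Fraction, weights))
    n = len(pts[0])
    while True:
        active = [i for i in range(len(pts)) if lam[i] > 0]
        if len(active) <= n + 1:
            return [(pts[i], lam[i]) for i in active]
        # affine dependence: sum mu_j a^{i_j} = 0 and sum mu_j = 0
        rows = [[pts[i][d] for i in active] for d in range(n)] + [[1] * len(active)]
        mu = nullspace_vector(rows, len(active))
        if not any(m_ > 0 for m_ in mu):
            mu = [-m_ for m_ in mu]
        t = min(lam[active[j]] / mu[j] for j in range(len(active)) if mu[j] > 0)
        for j, i in enumerate(active):     # shift weights; one of them hits 0
            lam[i] -= t * mu[j]

pts = [(0, 0), (2, 0), (0, 2), (2, 2), (1, 1)]
w = [Fraction(1, 5)] * 5                   # their barycenter is (1, 1)
rep = caratheodory(pts, w)
x = [sum(l * p[d] for p, l in rep) for d in range(2)]
print(len(rep), x)                         # at most 3 points; x is still (1, 1)
```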
Corollary 1.9 The convex hull of a compact set is compact.

Proof Let E be a compact set and {xᵏ} ⊂ convE. By the above theorem,
xᵏ = Σᵢ₌₁ⁿ⁺¹ λ_{i,k} a^{i,k}, with a^{i,k} ∈ E, λ_{i,k} ≥ 0, Σᵢ₌₁ⁿ⁺¹ λ_{i,k} = 1. Since E × [0, 1] is compact,
there exists a subsequence {k_ν} such that a^{i,k_ν} → aⁱ ∈ E and λ_{i,k_ν} → λᵢ for every
i = 1, …, n + 1. Then λᵢ ≥ 0, Σᵢ₌₁ⁿ⁺¹ λᵢ = 1, and x^{k_ν} → Σᵢ₌₁ⁿ⁺¹ λᵢaⁱ ∈ convE.
Hence convE is compact. ∎
Remark 1.1 The convex hull of a closed set E may fail to be closed, as shown by the
example of a set E ⊂ ℝ² consisting of a straight line L and a point a ∉ L (Fig. 1.2).
Theorem 1.2 (Shapley-Folkman Theorem) Let {Sᵢ, i = 1, …, m} be any
finite family of sets in ℝⁿ. For every x ∈ conv(Σᵢ₌₁ᵐ Sᵢ) there exists a subfamily
{Sᵢ, i ∈ I}, with |I| ≤ n, such that

  x ∈ Σ_{i∉I} Sᵢ + Σ_{i∈I} conv(Sᵢ).

Roughly speaking, if m is very large and every Sᵢ is sufficiently small, then the sum
Σᵢ₌₁ᵐ Sᵢ differs little from its convex hull.
Proof Observe that conv(Σᵢ₌₁ᵐ Sᵢ) = Σᵢ₌₁ᵐ convSᵢ. Indeed, define the linear
mapping φ: ℝⁿᵐ → ℝⁿ such that φ((x¹, …, xᵐ)) = Σᵢ₌₁ᵐ xⁱ. Then Σᵢ₌₁ᵐ Sᵢ =
φ(Πᵢ₌₁ᵐ Sᵢ), and by linearity of φ:

  conv φ(Πᵢ₌₁ᵐ Sᵢ) = φ(conv Πᵢ₌₁ᵐ Sᵢ) = φ(Πᵢ₌₁ᵐ convSᵢ),

proving our assertion. So any x ∈ conv(Σᵢ₌₁ᵐ Sᵢ) is of the form x = Σᵢ₌₁ᵐ xⁱ with
xⁱ ∈ convSᵢ, i.e., xⁱ = Σ_{j∈Jᵢ} λᵢⱼxⁱʲ with xⁱʲ ∈ Sᵢ and λ = (λᵢⱼ) satisfying

  Σᵢ₌₁ᵐ Σ_{j∈Jᵢ} λᵢⱼxⁱʲ = x,
  Σ_{j∈Jᵢ} λᵢⱼ = 1,  i = 1, …, m,
  λᵢⱼ ≥ 0,  i = 1, …, m, j ∈ Jᵢ.

Let λ* = (λᵢⱼ*) be a basic solution of this linear system in λᵢⱼ, so that at most n + m
components of λ* are positive. Denote by I the set of all i such that λᵢⱼ* > 0 for
at least two j ∈ Jᵢ, and let |I| = k. Then λ* has at least 2k + (m − k) positive
components, therefore n + m ≥ 2k + (m − k), and hence k ≤ n. Since for every i ∉ I
we have λᵢⱼᵢ* = 1 for some jᵢ ∈ Jᵢ and λᵢⱼ* = 0 for all other j ∈ Jᵢ, it follows that

  x = Σ_{i∉I} xⁱʲᵢ + Σ_{i∈I} Σ_{j∈Jᵢ} λᵢⱼ* xⁱʲ ∈ Σ_{i∉I} Sᵢ + Σ_{i∈I} convSᵢ. ∎
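A one-dimensional illustration of the theorem: with Sᵢ = {0, 1} the sum Σᵢ₌₁ᵐ Sᵢ is the integer set {0, 1, …, m}, while its convex hull is the whole interval [0, m]; the theorem guarantees that any point of the hull can be reached with at most n = 1 summand replaced by its convex hull. A minimal sketch (the helper function is hypothetical, not from the book):

```python
import math

def shapley_folkman_decompose(x, m):
    """Write x in [0, m] as a sum of m terms, each from S_i = {0, 1} except
    for at most one term taken from conv(S_i) = [0, 1]."""
    k = min(int(math.floor(x)), m)            # k sets contribute the point 1
    frac = x - k                              # at most one "convexified" term
    terms = [1] * k + ([frac] if frac > 0 else [])
    terms += [0] * (m - len(terms))           # the remaining sets contribute 0
    return terms

terms = shapley_folkman_decompose(7.25, 10)
print(terms)        # [1, 1, 1, 1, 1, 1, 1, 0.25, 0, 0]
print(sum(terms))   # 7.25
```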
A case of interest for the applications is when B is the unit ball and ρ takes its
values in ℝ₊, so that E = ∪_{x∈X} B(x, ρ(x)), where B(x, r) denotes the ball of center x
and radius r. Since a trivial Carathéodory core of any set E is X = E, with B being
the unit ball in ℝⁿ and ρ ≡ 0, Carathéodory's Theorem appears to be a special case
of the following general proposition (Thach and Konno 1996):
Proposition 1.14 Let X be a Carathéodory core of E, with affX of dimension k.
Then every point y ∈ convE is a convex combination of at most k + 1 elements
of E.

Proof Let y ∈ convE, so that y = Σᵢ₌₁ᵐ λᵢyⁱ with yⁱ ∈ xⁱ + ρᵢB, xⁱ ∈ X, and

  (x, ρ) = Σᵢ₌₁ᵐ λᵢ(xⁱ, ρᵢ).

Consider the linear program of maximizing Σᵢ₌₁ᵐ tᵢρᵢ subject to

  Σᵢ₌₁ᵐ tᵢxⁱ = x,  Σᵢ₌₁ᵐ tᵢ = 1,  (1.10)
  tᵢ ≥ 0,  i = 1, …, m,  (1.11)

and let t* := (t₁*, …, tₘ*) be a basic optimal solution of it. Since X is contained in
an affine set of dimension k, the matrix of the system (1.10),

  [ x¹ x² ⋯ xᵐ ]
  [ 1  1  ⋯  1 ]

is of rank at most k + 1 (Proposition 1.4). Therefore, t* must satisfy m − (k + 1)
constraints (1.11) as equalities; that is, tᵢ* = 0 for at least m − (k + 1) indices i.
Without loss of generality we can assume that tᵢ* = 0 for i = k + 2, …, m, so by
setting ρ* = Σᵢ₌₁ᵏ⁺¹ tᵢ*ρᵢ we have

  (x, ρ*) = Σᵢ₌₁ᵏ⁺¹ tᵢ*(xⁱ, ρᵢ),  tᵢ* ≥ 0,  Σᵢ₌₁ᵏ⁺¹ tᵢ* = 1.

Here ρ* ≥ ρ (because t* is optimal to the above program while tᵢ = λᵢ, i = 1, …, m,
is a feasible solution). Hence ρB ⊂ ρ*B and we can write

  y = Σᵢ₌₁ᵐ λᵢyⁱ ∈ Σᵢ₌₁ᵐ λᵢ(xⁱ + ρᵢB) = Σᵢ₌₁ᵐ λᵢxⁱ + Σᵢ₌₁ᵐ λᵢρᵢB
    = x + ρB ⊂ x + ρ*B = Σᵢ₌₁ᵏ⁺¹ tᵢ*xⁱ + Σᵢ₌₁ᵏ⁺¹ tᵢ*ρᵢB = Σᵢ₌₁ᵏ⁺¹ tᵢ*(xⁱ + ρᵢB),

i.e., y = x + ρ*b for some b ∈ B. We have thus found k + 1 vectors
uⁱ := xⁱ + ρᵢb ∈ E, i = 1, …, k + 1, such that y = Σᵢ₌₁ᵏ⁺¹ tᵢ*uⁱ with tᵢ* ≥ 0,
Σᵢ₌₁ᵏ⁺¹ tᵢ* = 1. ∎
Note that in the above proposition, the dimension k of affX can be much smaller
than the dimension of affE. For instance, if X = [a, b] is a line segment in ℝⁿ and
B is the unit ball while ρ(a) = ρ(b) = 1, ρ(x) = 0 ∀x ∉ {a, b} (so that E is the
union of a line segment and two balls centered at the endpoints of this segment), then
affE = ℝⁿ while affX is just a line. Thus in this case, by Proposition 1.14, any point
of convE can be represented as the convex combination of only two points of E (and
not n + 1 points as by applying Carathéodory's Theorem).
1.5 Separation Theorems

Separation is one of the most useful concepts of convexity theory. The first and
fundamental fact is the following:

Lemma 1.2 Let C be a nonempty convex subset of ℝⁿ. If 0 ∉ C then there exists a
vector t ∈ ℝⁿ such that

  sup_{x∈C} ⟨t, x⟩ ≤ 0,  inf_{x∈C} ⟨t, x⟩ < 0.  (1.12)
Proof We argue by induction on the dimension n. For n = 1 the lemma is obvious.
Assuming it true in dimension n − 1, consider C ⊂ ℝⁿ with 0 ∉ C and set

  C₀ = {x ∈ C | xₙ = 0},  C₊ = {x ∈ C | xₙ > 0},  C₋ = {x ∈ C | xₙ < 0}.

If xₙ = 0 for all x ∈ C then C can be identified with a set in ℝⁿ⁻¹, so the Lemma
is true by the induction assumption. If C₋ = ∅ but C₊ ≠ ∅, then the vector t =
(0, …, 0, −1) satisfies (1.12). Analogously, if C₊ = ∅ and C₋ ≠ ∅ we can take
t = (0, …, 0, 1). Therefore, we may assume that both C₋ and C₊ are nonempty.
For any x ∈ C₊, x′ ∈ C₋ we have

  λx + μx′ ∈ C₀  (1.13)

for λ = −x′ₙ/(xₙ − x′ₙ) > 0 and μ = xₙ/(xₙ − x′ₙ) > 0, so that λ + μ = 1. Thus, the set

  C′ = {(x₁, …, xₙ₋₁) | (x₁, …, xₙ₋₁, 0) ∈ C}

is a nonempty convex set in ℝⁿ⁻¹ which does not contain the origin. By the induction
hypothesis there exists a vector (t₁, …, tₙ₋₁) ∈ ℝⁿ⁻¹ and a point (x̂₁, …, x̂ₙ₋₁) ∈ C′ satisfying

  Σᵢ₌₁ⁿ⁻¹ tᵢx̂ᵢ < 0,  sup{Σᵢ₌₁ⁿ⁻¹ tᵢxᵢ | (x₁, …, xₙ₋₁) ∈ C′} ≤ 0.  (1.14)

Define p(x) = Σᵢ₌₁ⁿ⁻¹ tᵢxᵢ/xₙ for x ∈ C₊ and q(x) = Σᵢ₌₁ⁿ⁻¹ tᵢxᵢ/xₙ for x ∈ C₋. Setting

  p̄ = sup_{x∈C₊} p(x),  q̄ = inf_{x∈C₋} q(x),

it follows from (1.13) and (1.14) that p̄ ≤ q̄, so any tₙ with −q̄ ≤ tₙ ≤ −p̄ makes
t = (t₁, …, tₙ₋₁, tₙ) satisfy sup_{x∈C} ⟨t, x⟩ ≤ 0,
completing the proof since ⟨t, x̂⟩ < 0 for x̂ = (x̂₁, …, x̂ₙ₋₁, 0). ∎
We say that two nonempty subsets C, D of ℝⁿ are separated by the hyperplane
⟨t, x⟩ = α (t ∈ ℝⁿ \ {0}) if

  sup_{x∈C} ⟨t, x⟩ ≤ α ≤ inf_{y∈D} ⟨t, y⟩.  (1.15)

Theorem 1.3 (First Separation Theorem) Two disjoint nonempty convex sets
C, D in ℝⁿ can be separated by a hyperplane.

Proof The set C − D is convex and satisfies 0 ∉ C − D, so by the previous Lemma
there exists t ∈ ℝⁿ \ {0} such that ⟨t, x − y⟩ ≤ 0 ∀x ∈ C, y ∈ D. Then (1.15) holds
for α = sup_{x∈C} ⟨t, x⟩. ∎
Remark 1.2 Before drawing some consequences of the above result, it is useful to
recall a simple property of linear functions:

A linear function l(x) := ⟨t, x⟩ with t ∈ ℝⁿ \ {0} never achieves its minimum
(or maximum) over a set D at an interior point of D; if it is bounded above
(or below) on an affine set then it is constant on this affine set.

It follows from this fact that if the set D in Theorem 1.3 is open then the inequality
on the side of D in (1.15) is strict for every y ∈ D. Furthermore:
Corollary 1.10 If an affine set C does not meet an open convex set D then there
exists a hyperplane containing C and not meeting D.

Proof By Theorem 1.3 there is a hyperplane ⟨t, x⟩ = α separating D and C, i.e.,
sup_{y∈D} ⟨t, y⟩ ≤ α ≤ inf_{x∈C} ⟨t, x⟩. Since the linear function ⟨t, x⟩ is bounded
below on the affine set C, by the Remark above it is constant on C, i.e., C is contained
in the hyperplane ⟨t, x − x⁰⟩ = 0, where x⁰ is any point of C. Also by the Remark,
⟨t, y⟩ < α ∀y ∈ D (hence D does not meet this hyperplane), since if ⟨t, y⁰⟩ = α for some
y⁰ ∈ D then α would be the maximum of the linear function ⟨t, x⟩ over
the open set D. ∎
Lemma 1.3 Let C be a nonempty convex set in ℝⁿ. If C is closed and 0 ∉ C then
there exists a vector t ∈ ℝⁿ \ {0} and a number α < 0 such that

  ⟨t, x⟩ ≤ α < 0  ∀x ∈ C.

Proof Since C is closed and 0 ∉ C, there is a ball D around 0 which does not
intersect C. By the above theorem, there are t ∈ ℝⁿ \ {0} and α satisfying (1.15).
Then α < 0 because inf_{y∈D} ⟨t, y⟩ < ⟨t, 0⟩ = 0, as 0 is an interior point of D. ∎
We say that two nonempty subsets C, D of ℝⁿ are strongly separated by a
hyperplane ⟨t, x⟩ = α (t ∈ ℝⁿ \ {0}, α ∈ ℝ) if

  sup_{x∈C} ⟨t, x⟩ < α < inf_{y∈D} ⟨t, y⟩.  (1.16)

Theorem 1.4 (Second Separation Theorem) Two disjoint nonempty closed con-
vex sets C, D in ℝⁿ such that either C or D is compact can be strongly separated by
a hyperplane.

Proof Assume, e.g., that C is compact and consider the convex set C − D. If zᵏ =
xᵏ − yᵏ → z, with xᵏ ∈ C, yᵏ ∈ D, then by compactness of C there exists a
subsequence x^{k_ν} → x ∈ C, and by closedness of D, y^{k_ν} = x^{k_ν} − z^{k_ν} → x − z ∈ D;
hence z = x − y with x ∈ C, y ∈ D. So the set C − D is closed. Since 0 ∉ C − D, there
exists by the above Lemma a vector t ∈ ℝⁿ \ {0} and a number α₀ < 0 such that
⟨t, x − y⟩ ≤ α₀ < 0 ∀x ∈ C, y ∈ D. Then sup_{x∈C} ⟨t, x⟩ ≤ α₀ + inf_{y∈D} ⟨t, y⟩.
Setting α = inf_{y∈D} ⟨t, y⟩ + α₀/2 yields (1.16). ∎
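When C = convA and D = convB are generated by finite point sets with disjoint compact hulls, a vector t as in Theorem 1.4 can be found by an elementary iterative scheme: the classical perceptron rule applied to the differences a − b. The sketch below is illustrative only (the sets A, B and the function name are my own), and its convergence is guaranteed only under the strong separability hypothesis of the theorem:

```python
def separate(A, B, max_iter=100000):
    """Perceptron-style search for t with <t, a> < <t, b> for all a in A,
    b in B.  Terminates when conv(A) and conv(B) are strongly separated;
    otherwise gives up after max_iter updates and returns None."""
    n = len(A[0])
    t = [0.0] * n
    for _ in range(max_iter):
        # find a violated pair, i.e. <t, a - b> >= 0
        bad = next(((a, b) for a in A for b in B
                    if sum(ti * (ai - bi) for ti, ai, bi in zip(t, a, b)) >= 0),
                   None)
        if bad is None:
            return t                      # <t, a - b> < 0 for all pairs
        a, b = bad
        t = [ti - (ai - bi) for ti, ai, bi in zip(t, a, b)]
    return None

A = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # a triangle near the origin
B = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]   # a triangle well away from it
t = separate(A, B)
print(t is not None)                       # True: a separating t was found
```

Since ⟨t, a − b⟩ < 0 holds for all generator pairs, it extends by linearity to the hulls, and any α strictly between max over A and min over B of ⟨t, ·⟩ gives a strongly separating hyperplane.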
Remark 1.3 If the set C is a cone then one can take α = 0 in (1.15). Indeed, for
any given x ∈ C we then have λx ∈ C ∀λ > 0, so λ⟨t, x⟩ ≤ α ∀λ > 0, hence
⟨t, x⟩ ≤ 0 ∀x ∈ C. Similarly, one can take α = 0 in (1.16).
1.6 Geometric Structure

There are several important concepts related to the geometric structure of convex
sets: supporting hyperplane and normal cone, face and extreme point.
Of course, the empty set and C itself are faces of C. A proper face of C is a face
which is neither the empty set nor C itself. A face of a convex cone is itself a convex
cone.
Proposition 1.16 The intersection of a convex set C with a supporting hyperplane
H is a face of C.
Proof The convexity of F := H ∩ C is obvious. If a segment [a, b] ⊂ C had a relative
interior point x in F but were not contained in F, then H would strictly separate a
and b, contrary to the definition of a supporting hyperplane. ∎
A zero-dimensional face of C is called an extreme point of C. In other words, an
extreme point of C is a point x ∈ C which is not a relative interior point of any line
segment with two distinct endpoints in C.

If a convex set C has a halfline face, the direction of this halfline is called an
extreme direction of C. The set of extreme directions of C is the same as the set of
extreme directions of its recession cone.

It can easily be verified that a face of a face of C is also a face of C. In particular:
an extreme point (or direction) of a face of a convex set C is also an extreme
point (or direction, resp.) of C.
Proposition 1.17 Let E be an arbitrary set in ℝⁿ. Every extreme point of C =
convE belongs to E.

Proof Let x be an extreme point of C. By Proposition 1.13, x = Σᵢ₌₁ᵏ λᵢxⁱ with
xⁱ ∈ E, λᵢ > 0, Σᵢ₌₁ᵏ λᵢ = 1. It is easy to see that λᵢ = 1 (i.e., x = xⁱ) for some i. In fact,
otherwise there is h with 0 < λₕ < 1, so

  x = λₕxʰ + (1 − λₕ)yʰ,  with yʰ = Σ_{i≠h} (λᵢ/(1 − λₕ)) xⁱ ∈ C

(because Σ_{i≠h} λᵢ/(1 − λₕ) = 1), i.e., x is a relative interior point of the line segment
[xʰ, yʰ] ⊂ C, conflicting with x being an extreme point. ∎
Given an arbitrary point a of a convex set C, there always exists at least one
face of C containing a, namely C itself. The intersection of all faces of C containing a is
obviously a face. We refer to this face as the smallest face containing a and denote
it by Fₐ.
Proposition 1.18 Fₐ is the union of a and all x ∈ C for which there is a line segment
[x, y] ⊂ C with a as a relative interior point.

Proof Denote the mentioned union by F. If x ∈ F, i.e., x ∈ C and there is y ∈ C
such that a = (1 − λ)x + λy with 0 < λ < 1, then x, y ∈ Fₐ because Fₐ is a
face. Thus, F ⊂ Fₐ. It suffices, therefore, to show that F is a face. By translating if
necessary, we may assume that a = 0. It is easy to see that F is a convex subset of C.
Indeed, if x¹, x² ∈ F, i.e., x¹ = −μ₁y¹, x² = −μ₂y² with y¹, y² ∈ C and μ₁, μ₂ > 0,
then for any x = (1 − λ)x¹ + λx² (0 < λ < 1) we have x = −[(1 − λ)μ₁ + λμ₂] y,
with

  y = [(1 − λ)μ₁ y¹ + λμ₂ y²] / [(1 − λ)μ₁ + λμ₂] ∈ C,

so that 0 is a relative interior point of [x, y] ⊂ C, i.e., x ∈ F. Now let x ∈ F be a
relative interior point of a segment [u, v] ⊂ C, say x = (1 − θ)u + θv with 0 < θ < 1
(the case x = 0 being covered by the definition of F), and let x = −μy with y ∈ C,
μ > 0, as above. Then

  u = (x − θv)/(1 − θ) = −[μy + θv]/(1 − θ),

and we have

  w := [θ/(μ + θ)] v + [μ/(μ + θ)] y ∈ C,

with u = −[(μ + θ)/(1 − θ)] w, so that 0 is a relative interior point of [u, w] ⊂ C,
i.e., u ∈ F; similarly v ∈ F. Hence F is a face of C. ∎
Given a nonempty convex set C, the set (recC) ∩ (−recC) is called the lineality
space of C. It is actually the largest subspace contained in recC and consists of
the zero vector and all the nonzero vectors y such that, for every x ∈ C, the line
through x in the direction of y is contained in C. The dimension of the lineality
space of C is called the lineality of C. A convex set C of lineality 0, i.e., such that
(recC) ∩ (−recC) = {0}, contains no line (and so is sometimes said to be line-free).
Proposition 1.20 A nonempty closed convex set C has an extreme point if and only
if it contains no line.

Proof If C has an extreme point a then a is also an extreme point of a + recC, hence
0 is an extreme point of recC, i.e., (recC) ∩ (−recC) = {0}. Conversely, suppose
C is line-free and let F be a face of smallest dimension of C. Then F must also be
line-free, and if F contained two distinct points x, y then the intersection of C with
the line through x, y would have an endpoint a. Of course Fₐ cannot contain x, y,
hence by Corollary 1.11, dim Fₐ < dim F, a contradiction. Therefore, F consists of
a single point, which is an extreme point of C. ∎
If a nonempty convex set C has a nontrivial lineality space L then for any x ∈ C
we have x = u + v, with u ∈ L, v ∈ L⊥, where L⊥ is the orthogonal complement
of L. Clearly v = x − u ∈ C − L = C + L ⊂ C, hence C can be decomposed into
the following sum:

  C = L + (C ∩ L⊥),  (1.19)

in which the convex set C ∩ L⊥ contains no line, hence has an extreme point by the
above proposition.

We are now in a position to state the fundamental representation theorem for
convex sets. In view of the above decomposition (1.19) it suffices to consider closed
convex sets C with lineality zero. Denote the set of extreme points of C by V(C)
and the set of extreme directions of C by U(C).
Theorem 1.7 Let C ⊂ ℝⁿ be a closed convex set containing no line. Then every
point x ∈ C can be represented in the form

  x = Σ_{i∈I} λᵢvⁱ + Σ_{j∈J} μⱼuʲ

with I, J finite index sets, vⁱ ∈ V(C), uʲ ∈ U(C), and λᵢ ≥ 0, μⱼ ≥ 0, Σ_{i∈I} λᵢ = 1.
Proof The theorem is obvious if dim C = 0. Arguing by induction, assume that the
theorem is true for dim C ≤ n − 1 and consider the case when dim C = n. Let x be
an arbitrary point of C and Δ be any line containing x. Since C is line-free, the set
Δ ∩ C is either a closed halfline or a closed line segment. In the former case, let
Δ ∩ C = {a + λu | λ ≥ 0}. Of course a ∈ ∂C, and there is a supporting hyperplane to C
at a by Theorem 1.5. The intersection F of this hyperplane with C is a face of C of
dimension less than n. By the induction hypothesis, a ∈ convV(F) + coneU(F) ⊂
convV(C) + coneU(C); since moreover u belongs to the pointed closed convex cone
recC, which is generated by its extreme directions, x = a + λu also belongs to
convV(C) + coneU(C).
In the case when Δ ∩ C is a closed line segment [a, b], both a and b belong to ∂C, and by
the same argument as above, each of these points belongs to coneU(C) + convV(C).
Since this set is convex and x ∈ [a, b], x belongs to this set too. ∎
An immediate consequence of Theorem 1.7 is the following important Minkowski Theorem, which corresponds to the special case $U(C) = \{0\}$:

Corollary 1.13 A nonempty compact convex set $C$ is equal to the convex hull of its extreme points. $\square$

In view of Carathéodory's Theorem one can state more precisely: every point of a nonempty compact convex set $C$ of dimension $k$ is a convex combination of $k+1$ or fewer extreme points of $C$.
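As a small numeric illustration of this refined statement (illustrative data, not from the text): for a 2-dimensional compact convex set such as a triangle, every point is a convex combination of at most 3 extreme points, and the coefficients can be found by solving a linear system, here by Cramer's rule.

```python
# Barycentric coordinates: express a point x of a triangle conv{v1, v2, v3}
# as x = l1*v1 + l2*v2 + l3*v3 with l_i >= 0 and l1 + l2 + l3 = 1.
# (Illustrative data; any affinely independent v1, v2, v3 work.)

def barycentric(v1, v2, v3, x):
    # Solve the 3x3 system [[v1x,v2x,v3x],[v1y,v2y,v3y],[1,1,1]] l = (x, 1)
    # by Cramer's rule.
    def det3(a):
        return (a[0][0]*(a[1][1]*a[2][2] - a[1][2]*a[2][1])
              - a[0][1]*(a[1][0]*a[2][2] - a[1][2]*a[2][0])
              + a[0][2]*(a[1][0]*a[2][1] - a[1][1]*a[2][0]))
    A = [[v1[0], v2[0], v3[0]], [v1[1], v2[1], v3[1]], [1.0, 1.0, 1.0]]
    b = [x[0], x[1], 1.0]
    d = det3(A)
    lams = []
    for j in range(3):
        Aj = [row[:] for row in A]
        for i in range(3):
            Aj[i][j] = b[i]
        lams.append(det3(Aj) / d)
    return lams

v1, v2, v3 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
lams = barycentric(v1, v2, v3, (0.25, 0.25))
```

The returned coefficients are nonnegative exactly when the point lies inside the triangle.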
Corollary 1.14 A nonempty compact convex set $C$ has at least one extreme point, and every supporting hyperplane $H$ to $C$ contains at least one extreme point of $C$.
Proof The set $C \cap H$ is nonempty, compact, convex, hence has an extreme point. Since by Proposition 1.16 this set is a face of $C$, any extreme point of it is also an extreme point of $C$. $\square$
1.8 Polars of Convex Sets
By Theorem 1.6 a closed convex set which is neither empty nor the whole space is fully determined by the set of its supporting hyperplanes. It turns out that a duality correspondence can be established between convex sets and their sets of supporting hyperplanes.
For any set $E$ in $\mathbb{R}^n$ the set
$$E^\circ = \{y \in \mathbb{R}^n \mid \langle y, x\rangle \le 1 \ \forall x \in E\}$$
is called the polar of $E$. When $M$ is a cone, multiplying any $x \in M$ by arbitrarily large $\lambda > 0$ shows that the polar reduces to
$$M^\circ = \{y \in \mathbb{R}^n \mid \langle y, x\rangle \le 0 \ \forall x \in M\}.$$
For closed convex cones $M_1, M_2$ one has
$$(M_1 + M_2)^\circ = M_1^\circ \cap M_2^\circ; \qquad (M_1 \cap M_2)^\circ = M_1^\circ + M_2^\circ.$$
Proof If $y \in (M_1 + M_2)^\circ$ then for any $x \in M_1$, writing $x = x + 0 \in M_1 + M_2$, we have $\langle x, y\rangle \le 0$, hence $y \in M_1^\circ$; analogously, $y \in M_2^\circ$. Therefore $(M_1 + M_2)^\circ \subset M_1^\circ \cap M_2^\circ$. Conversely, if $y \in M_1^\circ \cap M_2^\circ$ then for all $x^1 \in M_1$, $x^2 \in M_2$ we have $\langle x^1, y\rangle \le 0$, $\langle x^2, y\rangle \le 0$, hence $\langle x^1 + x^2, y\rangle \le 0$, implying that $y \in (M_1 + M_2)^\circ$. Therefore, the first equality holds. The second equality follows, because by (ii) of the previous proposition, $M_1^\circ + M_2^\circ = (M_1^\circ + M_2^\circ)^{\circ\circ} = (M_1^{\circ\circ} \cap M_2^{\circ\circ})^\circ = (M_1 \cap M_2)^\circ$. $\square$
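The identity $(M_1 + M_2)^\circ = M_1^\circ \cap M_2^\circ$ can be checked numerically for finitely generated cones. The sketch below (hypothetical generators in $\mathbb{R}^2$) tests membership in the polar of the sum by sampling points $\lambda_1 g^1 + \lambda_2 g^2$ of $M_1 + M_2$, and compares with the intersection of the two polars on a grid.

```python
# Numeric check of (M1 + M2)^o = M1^o ∩ M2^o for two cones in R^2,
# each generated by a single vector (illustrative generators).

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1]

M1 = [(1.0, 0.0)]            # cone generated by (1, 0)
M2 = [(1.0, 1.0)]            # cone generated by (1, 1)

def in_polar(y, gens):
    # y lies in the polar cone iff <y, g> <= 0 for every generator g
    return all(dot(y, g) <= 1e-9 for g in gens)

def in_polar_of_sum(y):
    # sample points lam1*g1 + lam2*g2 of M1 + M2 and test <y, x> <= 0
    for l1 in range(5):
        for l2 in range(5):
            x = (l1*M1[0][0] + l2*M2[0][0], l1*M1[0][1] + l2*M2[0][1])
            if dot(y, x) > 1e-9:
                return False
    return True

grid = [(i/4.0, j/4.0) for i in range(-8, 9) for j in range(-8, 9)]
lhs = {y for y in grid if in_polar_of_sum(y)}                  # (M1+M2)^o
rhs = {y for y in grid if in_polar(y, M1) and in_polar(y, M2)} # M1^o ∩ M2^o
```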
Given a set $C \subset \mathbb{R}^n$, the function $x \in \mathbb{R}^n \mapsto s_C(x) = \sup_{y\in C}\langle x, y\rangle$ is called the support function of $C$.
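For a polytope $C = \operatorname{conv}\{v^1, \ldots, v^k\}$, the supremum defining $s_C$ is attained at a vertex, so the support function reduces to a finite maximum. A minimal sketch, with the unit square as an assumed example:

```python
# Support function of a polytope C = conv V: a linear function attains its
# supremum over conv V at some vertex, so s_C(x) = max over vertices.
# Illustrative vertices of the unit square:

V = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def s_C(x):
    return max(x[0]*v[0] + x[1]*v[1] for v in V)

# s_C is convex (a pointwise max of linear functions): midpoint check
x, y = (2.0, -1.0), (-3.0, 0.5)
mid = ((x[0] + y[0]) / 2, (x[1] + y[1]) / 2)
```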
Proposition 1.23 Let $C$ be a closed convex set containing 0.
(i) The gauge of $C$ is the support function of $C^\circ$.
(ii) The recession cone of $C^\circ$ and the closure of the cone generated by $C$ are polar to each other.
(iii) The lineality space of $C^\circ$ and the subspace generated by $C$ are orthogonally complementary to each other.
Dually, the same statements hold with $C$ and $C^\circ$ interchanged.
Proof
(i) Since $(C^\circ)^\circ = C$ by Proposition 1.21, we can write
$$p_C(x) = \inf\Big\{\lambda > 0 \,\Big|\, \frac{x}{\lambda} \in C\Big\} = \inf\Big\{\lambda > 0 \,\Big|\, \Big\langle \frac{x}{\lambda}, y\Big\rangle \le 1 \ \forall y \in C^\circ\Big\} = \inf\{\lambda > 0 \mid \langle x, y\rangle \le \lambda \ \forall y \in C^\circ\} = \sup_{y\in C^\circ}\langle x, y\rangle.$$
(ii) Since $0 \in C^\circ$, the recession cone of $C^\circ$ is the largest closed convex cone contained in $C^\circ$. Its polar must then be the smallest closed convex cone containing $(C^\circ)^\circ = C$, i.e., the closure of the convex cone generated by $C$.
(iii) Similarly, the lineality space of $C^\circ$ is the largest subspace contained in $C^\circ$. Therefore, its orthogonal complement is the smallest subspace containing $C$. $\square$
Corollary 1.15 For any closed convex set $C \subset \mathbb{R}^n$ containing 0 we have
$$\dim C^\circ = n - \operatorname{lineality} C, \qquad \operatorname{lineality} C^\circ = n - \dim C.$$
Proof The latter relation follows from (iii) of the previous proposition, while the former is obtained from the latter by noting that $(C^\circ)^\circ = C$. $\square$
The number $\dim C - \operatorname{lineality} C$, which can be considered as a measure of the nonlinearity of a convex set $C$, is called the rank of $C$. From (1.19) we have $\operatorname{rank} C = \dim(C \cap L^\perp)$, where $L$ is the lineality space of $C$. The previous corollary also shows that for any closed convex set $C$ containing 0:
$$\operatorname{rank} C = \operatorname{rank} C^\circ.$$
1.9 Polyhedral Convex Sets
$$\langle a^i, x\rangle \le b_i, \quad i = 1, \ldots, m, \qquad (1.21)$$
or in matrix form:
$$Ax \le b, \qquad (1.22)$$
where $A$ is the matrix with rows $a^1, \ldots, a^m$. Some of the constraints may also be given as equalities:
$$\langle a^i, x\rangle = b_i, \ i = 1, \ldots, m_1; \qquad \langle a^i, x\rangle \le b_i, \ i = m_1 + 1, \ldots, m.$$
The rank of a system of linear inequalities of the form (1.22) is by definition the rank of the matrix $A$. When this rank is equal to the number of inequalities, we say that the inequalities are linearly independent.
Proposition 1.24 A polyhedron $D \ne \emptyset$ defined by the system (1.22) has dimension $r$ if and only if the subsystem of (1.22) formed by the inequalities which are satisfied as equalities by all points of $D$ has rank $n - r$.
Proof Let $I_0 = \{i \mid \langle a^i, x\rangle = b_i \ \forall x \in D\}$ and $M = \{x \mid \langle a^i, x\rangle = b_i, \ i \in I_0\}$. Since $D \subset M$, it suffices to show that $M \subset \operatorname{aff} D$. We can assume that $I_1 := \{1, \ldots, m\}\setminus I_0 \ne \emptyset$, for otherwise $D = M$. For each $i \in I_1$ take $x^i \in D$ such that $\langle a^i, x^i\rangle < b_i$ and let $x^0 = \frac{1}{q}\sum_{i\in I_1} x^i$, where $q = |I_1|$. Then clearly
$$\langle a^i, x^0\rangle < b_i \ \forall i \in I_1, \qquad \langle a^i, x^0\rangle = b_i \ \forall i \in I_0. \qquad (1.23)$$
Consider now any $x \in M$. Using (1.23) it is easy to see that for $\varepsilon > 0$ sufficiently small, we have $x^0 + \varepsilon(x - x^0) \in D$, hence $x \in \operatorname{aff} D$. $\square$
$$F = \{x \in D \mid \langle a^k, x\rangle = b_k\}. \qquad (1.25)$$
However, not every face of this form is a facet. An inequality $\langle a^k, x\rangle \le b_k$ is said to be redundant if the removal of this inequality from (1.21) does not affect the polyhedron $D$, i.e., if the system (1.21) is equivalent to the system obtained by deleting this inequality.
Then the line segment $[x^0, y]$ meets the hyperplane $\langle a^k, x\rangle = b_k$ at a point $z$ satisfying
is of rank $r+1$, hence $k \le r+1$. $\square$
A polytope which is the convex hull of $r+1$ affinely independent points $x^1, x^2, \ldots, x^{r+1}$ is called an $r$-simplex and is denoted by $[x^1, x^2, \ldots, x^{r+1}]$.
By Proposition 1.17 any vertex of such a polytope must then be one of the points $x^1, x^2, \ldots, x^{r+1}$, and since there must be at least $r+1$ vertices, these are precisely $x^1, x^2, \ldots, x^{r+1}$.
If $D$ is an $r$-simplex then any set of $k \le r+1$ of its vertices determines a $(k-1)$-simplex which is a $(k-1)$-dimensional face of $D$. Conversely, any face of $D$ is of this form.
Proposition 1.27 The recession cone of a polyhedron (1.22) is the cone $M := \{x \mid Ax \le 0\}$.
Proof By definition $y \in \operatorname{rec} D$ if and only if for any $x \in D$ and $\lambda > 0$ we have $x + \lambda y \in D$, i.e., $A(x + \lambda y) = Ax + \lambda Ay \le b$, i.e., $Ay \le 0$. Hence the conclusion. $\square$
Thus, the extreme directions of a polyhedron (1.22) are the same as the extreme directions of the cone $Ax \le 0$.
Corollary 1.18 A nonempty polyhedron (1.22) has at least one vertex if and only if $\operatorname{rank} A = n$. It is a polytope if and only if the cone $Ax \le 0$ is the singleton $\{0\}$.
Proof By Proposition 1.20, $D$ has a vertex if and only if it contains no line, i.e., the recession cone $Ax \le 0$ contains no line, which in turn is equivalent to saying that $\operatorname{rank} A = n$. The second assertion follows from Corollary 1.8 and the preceding proposition. $\square$
Proposition 1.28 Let $P$ be the polyhedron defined by the system (1.27). Then the polar of $P$ is the polyhedron
$$Q = \Big\{y = \sum_{i=1}^{m+p}\lambda_i a^i \ \Big|\ \lambda_i \ge 0,\ \sum_{i=1}^{m}\lambda_i \le 1\Big\}.$$
Proof If $y \in Q$, i.e.,
$$y = \sum_{i=1}^{m+p}\lambda_i a^i, \quad \text{with } \sum_{i=1}^{m}\lambda_i \le 1,\ \lambda_i \ge 0,$$
then
$$\langle y, x\rangle = \sum_{i=1}^{m+p}\lambda_i\langle a^i, x\rangle \le 1 \quad \forall x \in P,$$
(Figure: the polyhedron $P$ defined by $\langle a^1, x\rangle \le 1$, $\langle a^2, x\rangle \le 0$, $\langle a^3, x\rangle \le 0$, and its polar $Q$.)
From Theorem 1.8 it follows that a polyhedron has finitely many faces, in particular finitely many extreme points and extreme directions. It turns out that this property completely characterizes polyhedra in the class of closed convex sets.
Theorem 1.10 For every polyhedron $D$ there exist two finite sets $V = \{v^i, i \in I\}$ and $U = \{u^j, j \in J\}$ such that
$$D = \Big\{\sum_{i\in I}\lambda_i v^i + \sum_{j\in J}\mu_j u^j \ \Big|\ \lambda_i \ge 0,\ \mu_j \ge 0,\ \sum_{i\in I}\lambda_i = 1\Big\}.$$
Proof Decomposing $D$ by means of its lineality space $L$ as in (1.19),
$$D = L + C, \qquad (1.30)$$
where $V$ is the set of vertices and $U_1$ the set of extreme directions of $C$. Now let $U_0$ be a basis of $L$, so that $L = \operatorname{cone}\{U_0 \cup (-U_0)\}$. From (1.30) and (1.31) we conclude, noting that $M^{\circ\circ} = M$. $\square$
Theorem 1.11 Given any two finite sets $V$ and $U$ in $\mathbb{R}^n$, there exist an $m \times n$ matrix $A$ and a vector $b \in \mathbb{R}^m$ such that
$$\operatorname{conv} V + \operatorname{cone} U = \{x \mid Ax \le b\}.$$
Proof Let $V = \{v^i, i \in I\}$, $U = \{u^j, j \in J\}$. We can assume $V \ne \emptyset$, since the case $V = \emptyset$ has just been treated. In the space $\mathbb{R}^{n+1} = \mathbb{R}^n \times \mathbb{R}$, consider the set
$$K = \Big\{(x, t) = \sum_{i\in I}\lambda_i(v^i, 1) + \sum_{j\in J}\mu_j(u^j, 0) \ \Big|\ \lambda_i \ge 0,\ \mu_j \ge 0\Big\}.$$
Thus,
$$D = \{x \mid Ax \le b\}. \qquad \square$$
Given a system of closed convex sets we would like to know whether they have a nonempty intersection.
Lemma 1.5 If $m$ closed convex proper sets $C_1, \ldots, C_m$ of $\mathbb{R}^n$ have no common point in a compact convex set $C$ then there exist closed halfspaces $H_i \supset C_i$ $(i = 1, \ldots, m)$ with no common point in $C$.
Proof The closed convex set $C_m$ does not intersect the compact set $C^0 = C \cap (\cap_{i=1}^{m-1} C_i)$, so by Theorem 1.4 there exists a closed halfspace $H_m \supset C_m$ disjoint from $C^0$ (i.e., such that $C_1, \ldots, C_{m-1}, H_m$ have no common point in $C$). Repeating this argument for the collection $C_1, \ldots, C_{m-1}, H_m$, we find a closed halfspace $H_{m-1} \supset C_{m-1}$ such that $C_1, \ldots, C_{m-2}, H_{m-1}, H_m$ have no common point in $C$, etc. Eventually we obtain the desired closed halfspaces $H_1, \ldots, H_m$. $\square$
Theorem 1.12 If the closed convex sets $C_1, \ldots, C_m$ in $\mathbb{R}^n$ have no common point in a convex set $C$ but any $m-1$ sets of the collection have a common point in $C$, then there exists a point of $C$ not belonging to any $C_i$, $i = 1, \ldots, m$.
Proof First, it can be assumed that the set $C$ is compact. In fact, if it were not so, we could pick, for each $i = 1, \ldots, m$, a point $a^i$ common to $C$ and the sets $C_j$, $j \ne i$. Then the set $C^0 = \operatorname{conv}\{a^1, \ldots, a^m\}$ is compact and the sets $C_1, \ldots, C_m$ have no common point in $C^0$ (because $C^0 \subset C$); moreover, any $m-1$ sets among these have a common point in $C^0$ (which is one of the points $a^1, \ldots, a^m$). Consequently, without any harm we can replace $C$ with $C^0$.
1.10 Systems of Convex Sets
For $i \in J$ we have $a^i \in \cap_{j\notin J} C_j$, so $x = \sum_{i\in J}\lambda_i a^i \in \cap_{j\notin J} C_j$; for $i \notin J$ we have $a^i \in \cap_{j\in J} C_j$, so $x = \sum_{i\notin J}\lambda_i a^i \in \cap_{j\in J} C_j$. Therefore, $x \in \cap_{j=1}^m C_j$, proving the theorem.
1.11 Exercises
are called the effective domain and the epigraph of $f(x)$, respectively. If $\operatorname{dom} f \ne \emptyset$ and $f(x) > -\infty$ for all $x \in S$, then we say that the function $f(x)$ is proper.
A function $f: S \to [-\infty, +\infty]$ is called convex if its epigraph is a convex set in $\mathbb{R}^n \times \mathbb{R}$. This is equivalent to saying that $S$ is a convex set in $\mathbb{R}^n$ and for any $x^1, x^2 \in S$ and $\lambda \in [0,1]$, we have
$$f(\lambda x^1 + (1-\lambda)x^2) \le \lambda f(x^1) + (1-\lambda)f(x^2) \qquad (2.1)$$
whenever the right-hand side is defined. In other words, (2.1) must always hold unless $f(x^1) = -f(x^2) = \pm\infty$. By induction it can be proved that if $f(x)$ is convex then for any finite set $x^1, \ldots, x^k \in S$ and any nonnegative numbers $\lambda_1, \ldots, \lambda_k$ summing up to 1, we have
$$f\Big(\sum_{i=1}^k \lambda_i x^i\Big) \le \sum_{i=1}^k \lambda_i f(x^i).$$
For a given nonempty convex set $C \subset \mathbb{R}^n$ we can define the convex functions:
the indicator function of $C$: $\delta_C(x) = 0$ if $x \in C$, $\delta_C(x) = +\infty$ if $x \notin C$;
the support function (see Sect. 1.8): $s_C(x) = \sup_{y\in C}\langle y, x\rangle$;
the distance function: $d_C(x) = \inf_{y\in C}\|x - y\|$.
The convexity of $\delta_C(x)$ is obvious; that of the two other functions can be verified directly or derived from Propositions 2.5 and 2.9 below.
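A hedged numeric sketch of the distance function: for a box $C$ (an assumed example) the infimum defining $d_C$ is attained at the componentwise clamping of $x$ onto $C$, and convexity can be spot-checked along a segment.

```python
# Distance function d_C for a box C = [l1,u1] x [l2,u2]: the nearest point
# of C is the componentwise clamping of x (illustrative box and points).

import math

lo, hi = (0.0, 0.0), (1.0, 2.0)

def proj(x):
    return tuple(min(max(x[i], lo[i]), hi[i]) for i in range(2))

def d_C(x):
    return math.dist(x, proj(x))

# convexity of d_C along the segment [x, y]
x, y = (3.0, -1.0), (-2.0, 4.0)
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    z = tuple((1 - t)*x[i] + t*y[i] for i in range(2))
    assert d_C(z) <= (1 - t)*d_C(x) + t*d_C(y) + 1e-9
```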
Proposition 2.1 If $f(x)$ is an improper convex function on $\mathbb{R}^n$ then $f(x) = -\infty$ at every relative interior point $x$ of its effective domain.
Proof By the definition of an improper convex function, $f(x^0) = -\infty$ for at least some $x^0 \in \operatorname{dom} f$ (unless $\operatorname{dom} f = \emptyset$). If $x \in \operatorname{ri}(\operatorname{dom} f)$ then there is a point $x' \in \operatorname{dom} f$ such that $x$ is a relative interior point of the line segment $[x^0, x']$. Since $f(x') < +\infty$, it follows from $x = \lambda x^0 + (1-\lambda)x'$ with $\lambda \in (0,1)$ that $f(x) \le \lambda f(x^0) + (1-\lambda)f(x') = -\infty$. $\square$
From the definition it is straightforward that a function $f(x)$ on $\mathbb{R}^n$ is convex if and only if its restriction to every straight line in $\mathbb{R}^n$ is convex. Therefore, convex functions on $\mathbb{R}^n$ can be characterized via properties of convex functions on the real line.
Theorem 2.1 A real-valued function $f(x)$ on an open interval $(a,b) \subset \mathbb{R}$ is convex if and only if it is continuous and possesses at every $x \in (a,b)$ finite left and right derivatives
$$f'_-(x) = \lim_{t\uparrow 0}\frac{f(x+t) - f(x)}{t}, \qquad f'_+(x) = \lim_{t\downarrow 0}\frac{f(x+t) - f(x)}{t}$$
satisfying
$$f'_-(x) \le f'_+(x), \qquad f'_+(x_1) \le f'_-(x_2) \ \text{ for } x_1 < x_2. \qquad (2.2)$$
Proof
(i) Let $f(x)$ be convex. If $0 < s < t$ and $x + t < b$ then the point $(x+s, f(x+s))$ lies below the segment joining $(x, f(x))$ and $(x+t, f(x+t))$, so
$$\frac{f(x+s) - f(x)}{s} \le \frac{f(x+t) - f(x)}{t}; \qquad (2.3)$$
hence the difference quotient $[f(x+t) - f(x)]/t$ is nondecreasing in $t$ and $f'_+(x)$ exists. Similarly, $f'_-(x)$ exists, and for $x < y$ and $s, r > 0$ small enough,
$$\frac{f(x+s) - f(x)}{s} \le \frac{f(y+r) - f(y)}{r}, \qquad (2.4)$$
which implies $f'_+(x) \le f'_+(y)$ for $x < y$, i.e., $f'_+(x)$ is nondecreasing. Finally, writing (2.4) as
$$\frac{f(y-s) - f(y)}{-s} \le \frac{f(y+r) - f(y)}{r}$$
and letting $s \downarrow 0$, $r \downarrow 0$ yields $f'_-(y) \le f'_+(y)$, proving the left part of (2.2) and at the same time the finiteness of these derivatives. The continuity of $f(x)$ at every $x \in (a,b)$ then follows from the existence of finite $f'_-(x)$ and $f'_+(x)$. Furthermore, setting $x = x_1$, $y + r = x_2$ in (2.4) and letting $s, r \to 0$ yields the right part of (2.2).
(ii) Now suppose that $f(x)$ has all the properties mentioned in the theorem and let $a < c < d < b$. Consider the function
$$g(x) = f(x) - f(c) - \frac{f(d) - f(c)}{d - c}(x - c).$$
Since for any $x = (1-\lambda)c + \lambda d$ we have $g(x) = f(x) - f(c) - \lambda[f(d) - f(c)] = f(x) - [(1-\lambda)f(c) + \lambda f(d)]$, to prove the convexity of $f(x)$ it suffices to show that $g(x) \le 0$ for any $x \in [c,d]$. Suppose the contrary, that the maximum of $g(x)$ over the segment $[c,d]$ is positive (this maximum exists because $f(x)$ is continuous). Let $e \in [c,d]$ be the point where this maximum is attained. Note that $g(c) = g(d) = 0$ (hence $c < e < d$) and, from its expression, $g(x)$ has the same properties as $f(x)$, namely: $g'_-(x), g'_+(x)$ exist at every $x \in (c,d)$; $g'_-(x) \le g'_+(x)$; $g'_+(x)$ is nondecreasing; and $g'_+(x_1) \le g'_-(x_2)$ for $x_1 < x_2$. Since $g(e) \ge g(x)\ \forall x \in [c,d]$, we must have $g'_-(e) \ge 0 \ge g'_+(e)$; consequently $g'_-(e) = g'_+(e) = 0$, and hence, since $g'_+(x)$ is nondecreasing, $g'_+(x) \ge 0\ \forall x \in [e,d)$. If $g'_-(y) \le 0$ for some $y \in (e,d]$ then $g'_+(x) \le g'_-(y) \le 0$, hence $g'_+(x) = 0$ for all $x \in [e,y)$, from which it follows that $g(y) = g(e) > 0$. Since $g(d) = 0$, there must exist $y \in (e,d)$ with $g'_-(y) > 0$. Let $x_1 \in [y,d)$ be the point where $g(x)$ attains its maximum over the segment $[y,d]$. Then $g'_+(x_1) \le 0$, contradicting $g'_+(x_1) \ge g'_-(y) > 0$. Therefore $g(x) \le 0$ for all $x \in [c,d]$, as was to be proved. $\square$
Corollary 2.1 A differentiable real-valued function $f(x)$ on an open interval is convex if and only if its derivative $f'$ is a nondecreasing function. A twice differentiable real-valued function $f(x)$ on an open interval is convex if and only if its second derivative $f''$ is nonnegative throughout this interval. $\square$
Proposition 2.2 A twice differentiable real-valued function $f(x)$ on an open convex set $C$ in $\mathbb{R}^n$ is convex if and only if for every $x \in C$ its Hessian matrix
$$Q_x = (q_{ij}(x)), \qquad q_{ij}(x) = \frac{\partial^2 f}{\partial x_i \partial x_j}(x_1, \ldots, x_n)$$
is positive semidefinite, i.e.,
$$\langle u, Q_x u\rangle \ge 0 \quad \forall u \in \mathbb{R}^n.$$
Proof The function $f$ is convex on $C$ if and only if for each $a \in C$ and $u \in \mathbb{R}^n$ the function $\varphi_{a,u}(t) = f(a + tu)$ is convex on the open real interval $\{t \mid a + tu \in C\}$. The proposition then follows from the preceding corollary, since an easy computation yields $\varphi''(t) = \langle u, Q_x u\rangle$ with $x = a + tu$. $\square$
In particular, a quadratic function
$$f(x) = \frac{1}{2}\langle x, Qx\rangle + \langle a, x\rangle + \alpha,$$
where $Q$ is a symmetric $n \times n$ matrix, is convex on $\mathbb{R}^n$ if and only if $Q$ is positive semidefinite. It is concave on $\mathbb{R}^n$ if and only if its matrix $Q$ is negative semidefinite.
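For the quadratic case this criterion is easy to test in code: a symmetric $2\times 2$ matrix is positive semidefinite iff its trace and determinant are both nonnegative. A minimal sketch with assumed data $Q$, $a$, $\alpha$:

```python
# Convexity of f(x) = (1/2)<x,Qx> + <a,x> + alpha for a symmetric 2x2 Q:
# Q is PSD iff trace(Q) >= 0 and det(Q) >= 0 (both eigenvalues nonnegative).
# Illustrative data:

Q = [[2.0, 1.0], [1.0, 3.0]]       # trace 5 > 0, det 5 > 0  ->  PSD
a = (1.0, -1.0)
alpha = 0.5

def f(x):
    quad = sum(x[i] * Q[i][j] * x[j] for i in range(2) for j in range(2))
    return 0.5 * quad + a[0]*x[0] + a[1]*x[1] + alpha

def is_psd_2x2(Q):
    tr = Q[0][0] + Q[1][1]
    det = Q[0][0]*Q[1][1] - Q[0][1]*Q[1][0]
    return tr >= 0 and det >= 0

# midpoint convexity spot-check
x, y = (1.0, -2.0), (-3.0, 0.5)
mid = ((x[0] + y[0]) / 2, (x[1] + y[1]) / 2)
```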
Proposition 2.3 A proper convex function $f$ on $\mathbb{R}^n$ is continuous at every interior point of its effective domain.
Proof Let $x^0 \in \operatorname{int}(\operatorname{dom} f)$. Without loss of generality one can assume $x^0 = 0$. By Theorem 2.1, for each $i = 1, \ldots, n$ the restriction of $f$ to the open interval $\{t \mid x^0 + te^i \in \operatorname{int}(\operatorname{dom} f)\}$ is continuous relative to this interval. Hence for any given $\varepsilon > 0$ and for each $i = 1, \ldots, n$, we can select $\delta_i > 0$ so small that $|f(x) - f(x^0)| \le \varepsilon$ for all $x \in [-\delta_i e^i, \delta_i e^i]$. Let $\delta = \min\{\delta_i \mid i = 1, \ldots, n\}$ and $B = \{x \mid \|x\|_1 \le \delta\}$. Denote $u^i = \delta e^i$, $u^{i+n} = -\delta e^i$, $i = 1, \ldots, n$. Then, as seen in the proof of Corollary 1.6, any $x \in B$ is of the form $x = \sum_{i=1}^{2n}\lambda_i u^i$, with $\sum_{i=1}^{2n}\lambda_i = 1$, $0 \le \lambda_i \le 1$; hence $f(x) \le \sum_{i=1}^{2n}\lambda_i f(u^i)$, and consequently,
$$f(x) - f(x^0) \le \sum_{i=1}^{2n}\lambda_i\big[f(u^i) - f(x^0)\big] \le \varepsilon.$$
Since $-x \in B$ as well, the inequality $f(x^0) \le \frac{1}{2}f(x) + \frac{1}{2}f(-x)$ yields $f(x) - f(x^0) \ge -(f(-x) - f(x^0)) \ge -\varepsilon$. Therefore,
$$|f(x) - f(x^0)| \le \varepsilon \quad \forall x \in B,$$
proving the continuity of $f$ at $x^0$. $\square$
For example, let $f: \mathbb{R} \to \mathbb{R}$ be a piecewise convex function on the real line, i.e., a function such that the real line can be partitioned into a finite number of intervals $\Delta_i$, $i = 1, \ldots, N$, in each of which $f(x)$ is convex. Then $f(x)$ is convex if and only if it is convex in the neighborhood of each breakpoint (endpoint of some interval $\Delta_i$). In particular, a piecewise affine function on the real line is convex if and only if at each breakpoint the left slope is at most equal to the right slope (in other words, the sequence of slopes is nondecreasing) (see Fig. 2.1).
is convex on $C$.
By convexity of $\varphi$, the conclusion follows. $\square$
Proposition 2.7 If $g_i(x)$, $i = 1, \ldots, m$, are concave positive functions on a convex set $C \subset \mathbb{R}^n$ then their geometric mean
$$f(x) = \Big[\prod_{i=1}^m g_i(x)\Big]^{1/m} \qquad (2.5)$$
is a concave function on $C$.
Proof Let $T = \{t = (t_1, \ldots, t_m) \mid t_i > 0 \ \forall i,\ \prod_{i=1}^m t_i \ge 1\}$. We first show that
$$\Big[\prod_{i=1}^m g_i(x)\Big]^{1/m} = \min_{t\in T}\Big\{\frac{1}{m}\sum_{i=1}^m t_i g_i(x)\Big\}; \qquad (2.6)$$
indeed, if $\prod_{i=1}^m t_i > 1$ then by replacing $t_i$ with $t_i' \le t_i$ such that $\prod_{i=1}^m t_i' = 1$, we can only decrease the value of $\sum_{i=1}^m t_i g_i(x)$. Therefore, it can be assumed that $\prod_{i=1}^m t_i = 1$ and hence
$$\prod_{i=1}^m t_i g_i(x) = \prod_{i=1}^m g_i(x). \qquad (2.7)$$
Now, a sum of positive numbers with given product is smallest when all these numbers are equal (by the theorem on arithmetic and geometric means). That is, taking account of (2.7), the minimum of $\sum_{i=1}^m t_i g_i(x)$ is achieved when $t_i g_i(x) = \big[\prod_{i=1}^m g_i(x)\big]^{1/m}$ $\forall i = 1, \ldots, m$, hence (2.6). Since for fixed $t \in T$ the function $x \mapsto \varphi_t(x) := \frac{1}{m}\sum_{i=1}^m t_i g_i(x)$ is concave by Proposition 2.5, the lower envelope $\inf_{t\in T}\varphi_t(x) = \big[\prod_{i=1}^m g_i(x)\big]^{1/m}$ is concave by the same proposition. $\square$
An interesting concave function of the class (2.5) is the geometric mean:
$$f(x) = \begin{cases} (x_1 x_2 \cdots x_n)^{1/n} & \text{if } x_1 \ge 0, \ldots, x_n \ge 0 \\ -\infty & \text{otherwise.}\end{cases}$$
hence the conclusion. $\square$
For example, by this proposition the function $f(x) = \sum_{i=1}^m c_i e^{g_i(x)}$ is convex if $c_i > 0$ and each $g_i(x)$ is convex proper.
Given the epigraph $E$ of a convex function $f(x)$, one can restore $f(x)$ by the formula
$$f(x) = \inf\{t \mid (x,t) \in E\}. \qquad (2.8)$$
Conversely, given a convex set $E \subset \mathbb{R}^{n+1}$, the function $f(x)$ defined by (2.8) is a convex function on $\mathbb{R}^n$ by Proposition 2.6. Therefore, if $f_1, \ldots, f_m$ are $m$ given convex functions, and $E \subset \mathbb{R}^{n+1}$ is a convex set resulting from some operation on their epigraphs $E_1, \ldots, E_m$, then one can use (2.8) to define a corresponding new convex function $f(x)$.
Proposition 2.9 Let $f_1, \ldots, f_m$ be proper convex functions on $\mathbb{R}^n$. Then
$$f(x) = \inf\Big\{\sum_{i=1}^m f_i(x^i) \ \Big|\ x^i \in \mathbb{R}^n,\ \sum_{i=1}^m x^i = x\Big\}$$
is a convex function on $\mathbb{R}^n$.
But clearly for every $(x,t) \in \operatorname{epi} g$ there is $\tau \le t$ such that $(x,\tau) \in E$, so for every $(x,t) \in \operatorname{conv}(\operatorname{epi} g)$ there is $\tau \le t$ such that $(x,\tau) \in \operatorname{conv} E$. Therefore,
$$(\operatorname{conv} g)(x) = \inf\{t \mid (x,t) \in \operatorname{conv}(\operatorname{epi} g)\} = \inf\{t \mid (x,t) \in \operatorname{conv} E\} = \inf\Big\{\sum_{i=1}^{k+1}\lambda_i g(x^i) \ \Big|\ x^i \in C,\ \lambda_i \ge 0,\ \sum_{i=1}^{k+1}\lambda_i x^i = x,\ \sum_{i=1}^{k+1}\lambda_i = 1\Big\}. \qquad \square$$
Proof By the above proposition, $(\operatorname{conv} g)(x) \le f(x)$; but any $x \in D$ is of the form $x = \sum_{i=1}^{n+1}\lambda_i v^i$, with $v^i \in V$, $\lambda_i \ge 0$, $\sum_{i=1}^{n+1}\lambda_i = 1$; hence $f(x) \le \sum_{i=1}^{n+1}\lambda_i g(v^i)$ (by definition of $f(x)$), while $\sum_{i=1}^{n+1}\lambda_i g(v^i) \le g(x)$ by the concavity of $g$. Therefore, $f(x) \le g(x)\ \forall x \in D$, and since $f(x)$ is convex, we must have $f(x) \le (\operatorname{conv} g)(x)$, and hence $f(x) = (\operatorname{conv} g)(x)$. $\square$
The sets $\{x \mid f(x) \le \alpha\}$ and $\{x \mid f(x) \ge \alpha\}$, where $\alpha \in [-\infty, +\infty]$, are called lower and upper level sets, respectively, of $f$.
Proposition 2.11 The lower (upper, resp.) level sets of a convex (concave, resp.) function $f(x)$ are convex.
Proof This property is equivalent to
Proof By Proposition 2.13 every lower level set $C_\alpha := \{x \mid f(x) \le \alpha\}$ is closed. Let $\Gamma = \{\lambda u \mid \lambda \ge 0\}$. If $f$ is bounded above on a halfline $\Gamma_a = a + \Gamma$, then $f(a) \in \mathbb{R}$ (because $f$ is proper) and by Proposition 2.12 $f(x) \le f(a)\ \forall x \in \Gamma_a$. For any nonempty $C_\alpha$, $\alpha \in \mathbb{R}$, consider a point $b \in C_\alpha$ and let $\beta = \max\{f(a), \alpha\}$, so that
2.3 Lower Semi-Continuity
Proposition 2.15 Let $f$ be any proper convex function on $\mathbb{R}^n$. For any $y \in \mathbb{R}^n$, there exists $t \in \mathbb{R}$ such that $(y,t)$ belongs to the lineality space of $\operatorname{epi} f$ if and only if
$$f(x + \lambda y) = f(x) + \lambda t \quad \forall x \in \operatorname{dom} f,\ \forall \lambda \in \mathbb{R}.$$
When $f$ is l.s.c., this condition is satisfied provided for some $x \in \operatorname{dom} f$ the function $\lambda \mapsto f(x + \lambda y)$ is affine.
Proof $(y,t)$ belongs to the lineality space of $\operatorname{epi} f$ if and only if for any $x \in \operatorname{dom} f$: $(x, f(x)) + \lambda(y,t) \in \operatorname{epi} f\ \forall \lambda \in \mathbb{R}$, i.e., if and only if
$$f(x + \lambda y) - \lambda t \le f(x) \quad \forall \lambda \in \mathbb{R}.$$
By Proposition 2.12 applied to the proper convex function $\varphi(\lambda) = f(x + \lambda y) - \lambda t$, this is equivalent to saying that $\varphi(\lambda) \equiv$ constant, i.e., $f(x + \lambda y) - \lambda t = f(x)\ \forall \lambda \in \mathbb{R}$. This proves the first part of the proposition. If $f$ is l.s.c., i.e., $\operatorname{epi} f$ is closed, then $(y,t)$ belongs to the lineality space of $\operatorname{epi} f$ provided for some $x \in \operatorname{dom} f$ the line $\{(x, f(x)) + \lambda(y,t) \mid \lambda \in \mathbb{R}\}$ is contained in $\operatorname{epi} f$ (Lemma 1.1). $\square$
The projection of the lineality space of $\operatorname{epi} f$ on $\mathbb{R}^n$, i.e., the set of vectors $y$ for which there exists $t$ such that $(y,t)$ belongs to the lineality space of $\operatorname{epi} f$, is called the lineality space of $f$. The directions of these vectors $y$ are called directions in which $f$ is affine. The dimension of the lineality space of $f$ is called the lineality of $f$.
Corollary 2.5 If a proper convex function $f$ of full dimension on $\mathbb{R}^n$ has rank $k$ then there exists a $k \times n$ matrix $B$ with $\operatorname{rank} B = k$ such that for any $b \in B(\operatorname{dom} f)$ the restriction of $f$ to the affine set $\{x \mid Bx = b\}$ is affine.
Proof Let $L$ be the lineality space of $f$. Since $\dim L = n - k$, there exists a $k \times n$ matrix $B$ of rank $k$ such that $L = \{u \in \mathbb{R}^n \mid Bu = 0\}$. If $b = Bx^0$ for some $x^0 \in \operatorname{dom} f$, then for any $x$ such that $Bx = b$, we have $B(x - x^0) = 0$; hence, denoting by $u^1, \ldots, u^h$ a basis of $L$, $x = x^0 + \sum_{i=1}^h \lambda_i u^i$. By Proposition 2.15 there exist $t_i \in \mathbb{R}$ such that $f\big(x^0 + \sum_{i=1}^h \lambda_i u^i\big) = f(x^0) + \sum_{i=1}^h \lambda_i t_i$ for all $\lambda \in \mathbb{R}^h$. Therefore, $f(x)$ is affine on the affine set $\{x \mid Bx = b\}$. $\square$
Given any proper function $f$ on $\mathbb{R}^n$, the function whose epigraph is the closure of $\operatorname{epi} f$ is the largest l.s.c. minorant of $f$. It is called the l.s.c. hull of $f$ or the closure of $f$, and is denoted by $\operatorname{cl} f$. Thus, for a proper function $f$,
$$\operatorname{epi}(\operatorname{cl} f) = \operatorname{cl}(\operatorname{epi} f). \qquad (2.12)$$
A proper convex function $f$ is said to be closed if $\operatorname{cl} f = f$ (so for proper convex functions, closed is synonymous with l.s.c.). An improper convex function is said to be closed only if $f \equiv +\infty$ or $f \equiv -\infty$ (so if $f(x) = -\infty$ for some $x$ then its closure is the constant function $-\infty$).
Proposition 2.16 The closure of a proper convex function $f$ is a proper convex function which agrees with $f$ except perhaps at the relative boundary points of $\operatorname{dom} f$.
Proof Since $\operatorname{epi} f$ is convex, $\operatorname{cl}(\operatorname{epi} f)$, i.e., $\operatorname{epi}(\operatorname{cl} f)$, is also convex (Proposition 1.10). Hence by Proposition 2.13, $\operatorname{cl} f$ is a closed convex function. Now the condition (2.12) is equivalent to
$$\operatorname{cl} f(x) = \liminf_{y\to x} f(y). \qquad (2.13)$$
If $x \in \operatorname{ri}(\operatorname{dom} f)$ then by Proposition 2.3 $f(x) = \lim_{y\to x} f(y)$; hence, by (2.13), $f(x) = \operatorname{cl} f(x)$. Furthermore, if $x \notin \operatorname{cl}(\operatorname{dom} f)$ then $f(y) = +\infty$ for all $y$ in a neighborhood of $x$, and the same formula (2.13) shows that $\operatorname{cl} f(x) = +\infty$. Thus the second half of the proposition is true. It remains to prove that $\operatorname{cl} f$ is proper. For every $x \in \operatorname{ri}(\operatorname{dom} f)$, since $f$ is proper, $-\infty < \operatorname{cl} f(x) = f(x) < +\infty$. On the other hand, if $\operatorname{cl} f(x) = -\infty$ at some relative boundary point $x$ of $\operatorname{dom} f = \operatorname{dom}(\operatorname{cl} f)$ then for an arbitrary $y \in \operatorname{ri}(\operatorname{dom} f)$, we have $\operatorname{cl} f\big(\frac{x+y}{2}\big) \le \frac12\operatorname{cl} f(x) + \frac12\operatorname{cl} f(y) = -\infty$. Noting that $\frac{x+y}{2} \in \operatorname{ri}(\operatorname{dom} f)$, this contradicts what has just been proved and thereby completes the proof. $\square$
2.4 Lipschitz Continuity
be any sequence such that $x^k \to 0$, $f(x^k) \to \alpha$. Then at least one $D_i$, say $D_1$, contains an infinite subsequence of $\{x^k\}$. For convenience we also denote this subsequence by $\{x^k\}$. If $V_1$ is the set of vertices of $D_1$ other than 0, then each $x^k$ is a convex combination of 0 and elements of $V_1$: $x^k = \big(1 - \sum_{v\in V_1}\lambda^k_v\big)0 + \sum_{v\in V_1}\lambda^k_v v$, with $\lambda^k_v \ge 0$, $\sum_{v\in V_1}\lambda^k_v \le 1$. By convexity of $f$ we can write
$$f(x^k) \le \Big(1 - \sum_{v\in V_1}\lambda^k_v\Big)f(0) + \sum_{v\in V_1}\lambda^k_v f(v).$$
compact set $C \subset \operatorname{int}(\operatorname{dom} f)$ and let $B$ be the Euclidean unit ball. For every $r > 0$ the set $C + rB$ is compact, and the family of closed sets $\{(C + rB)\setminus\operatorname{int}(\operatorname{dom} f),\ r > 0\}$ has an empty intersection. In view of the compactness of $C + rB$, some finite subfamily of this family must have an empty intersection; hence for some $r > 0$, we must have $(C + rB)\setminus\operatorname{int}(\operatorname{dom} f) = \emptyset$, i.e., $C + rB \subset \operatorname{int}(\operatorname{dom} f)$. By Proposition 2.3 the function $f$ is continuous on $\operatorname{int}(\operatorname{dom} f)$. Denote by $\mu_1$ and $\mu_2$ the maximum and the minimum of $f$ over $C + rB$. Let $x, x'$ be two distinct points in $C$ and let $z = x + \frac{r(x - x')}{\|x - x'\|}$. Then $z \in C + rB \subset \operatorname{int}(\operatorname{dom} f)$. But
$$x = (1-\lambda)x' + \lambda z, \qquad \lambda = \frac{\|x - x'\|}{r + \|x - x'\|},$$
and $z, x' \in \operatorname{dom} f$; hence
$$f(x) \le (1-\lambda)f(x') + \lambda f(z) = f(x') + \lambda(f(z) - f(x'))$$
and consequently
$$f(x) - f(x') \le \lambda(f(z) - f(x')) \le \lambda(\mu_1 - \mu_2) \le K\|x - x'\|, \qquad K = \frac{\mu_1 - \mu_2}{r}.$$
By symmetry, we also have $f(x') - f(x) \le K\|x - x'\|$. Hence, for all $x, x'$ such that $x \in C$, $x' \in C$:
$$|f(x) - f(x')| \le K\|x - x'\|,$$
proving the Lipschitz property of $f$ over $C$.
(iv) $\Rightarrow$ (v) and (v) $\Rightarrow$ (i): obvious. $\square$
$$\lambda_0 f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) \ge 0 \quad \forall x \in D. \qquad (2.15)$$
If in addition condition (2.16) is satisfied, then $\lambda_0 > 0$.
$$\sum_{i=0}^m \lambda_i y_i \ge 0 \quad \forall y = (y_0, y_1, \ldots, y_m) \in C. \qquad (2.18)$$
If $x \in D$ then for every $\varepsilon > 0$, we have $f_i(x) < f_i(x) + \varepsilon$ for $i = 0, 1, \ldots, m$, so $(f_0(x) + \varepsilon, \ldots, f_m(x) + \varepsilon) \in C$ and hence,
$$\sum_{i=0}^m \lambda_i(f_i(x) + \varepsilon) \ge 0.$$
Since $\varepsilon > 0$ can be arbitrarily small, this implies (2.15). Furthermore, $\lambda_i \ge 0$, $i = 0, 1, \ldots, m$, because if $\lambda_j < 0$ for some $j$, then by fixing, for an arbitrary $x \in D$, all $y_i > f_i(x)$, $i \ne j$, while letting $y_j \to +\infty$, we would have $\sum_{i=0}^m \lambda_i y_i \to -\infty$, contrary to (2.18). Finally, under (2.16), if $\lambda_0 = 0$ then $\sum_{i=1}^m \lambda_i > 0$; hence by (2.16) $\sum_{i=1}^m \lambda_i f_i(x^0) < 0$, contradicting the inequality $\sum_{i=1}^m \lambda_i f_i(x^0) \ge 0$ from (2.15). Therefore, $\lambda_0 > 0$, as was to be proved. $\square$
Corollary 2.6 Let $D$ be a convex set in $\mathbb{R}^n$, and $g, f$ two convex functions finite on $D$. If $g(x^0) < 0$ for some $x^0 \in D$, while $f(x) \ge 0$ for all $x \in D$ satisfying $g(x) \le 0$, then there exists a real number $\eta \ge 0$ such that $f(x) + \eta g(x) \ge 0 \ \forall x \in D$.
Proof Apply the above proposition for $f_0 = f$, $f_1 = g$. $\square$
A more general result about inconsistent systems of convex inequalities is the
following:
$$f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) \ge 0 \quad \forall x \in D. \qquad (2.21)$$
Proof By replacing $I_1$ with $\{i \mid f_i(x^0) = 0\}$ we can assume $f_i(x^0) = 0$ for every $i \in I_1$. Arguing by induction on $m$, observe first that the theorem is true for $m = 1$. Indeed, in this case, if $I_1 = \emptyset$ the theorem follows from Proposition 2.18, so we can assume that $I_1 = \{1\}$, i.e., $f_1(x)$ is affine. In view of the fact that $f_1(x^0) = 0$ for $x^0 \in \operatorname{ri} D$, if $f_1(x) \ge 0 \ \forall x \in D$, then $f_1(x) = 0 \ \forall x \in D$ and (2.21) holds with $\lambda_1 = 0$. On the other hand, if there exists $\bar{x} \in D$ satisfying $f_1(\bar{x}) < 0$ then, since the system $x \in D$, $f_1(x) \le 0$, $f_0(x) < 0$ is inconsistent, again the theorem follows from Proposition 2.18. Thus, in any event the theorem is true when $m = 1$. Assuming now that the theorem is true for $m = k - 1 \ge 1$, consider the case $m = k$. The hypotheses of the theorem imply that the system
$$x \in D, \quad f_i(x) \le 0,\ i = 1, \ldots, k-1, \quad \max\{f_k(x), f_0(x)\} < 0$$
is inconsistent, and since the function $\max\{f_k(x), f_0(x)\}$ is convex, there exists, by the induction hypothesis, $t_i \ge 0$, $i = 1, \ldots, k-1$, such that
$$\max\{f_k(x), f_0(x)\} + \sum_{i=1}^{k-1} t_i f_i(x) \ge 0 \quad \forall x \in D. \qquad (2.23)$$
We claim that the system
$$x \in D, \quad \sum_{i=1}^{k-1} t_i f_i(x) \le 0, \quad f_k(x) \le 0, \quad f_0(x) < 0 \qquad (2.24)$$
is inconsistent. Indeed, from (2.23) it follows that no solution of (2.24) exists with $f_k(x) < 0$. Furthermore, if there existed $x \in D$ satisfying (2.24) with $f_k(x) = 0$ then, setting $x' = \lambda x^0 + (1-\lambda)x$ with $\lambda \in (0,1)$, we would have $x' \in D$, $\sum_{i=1}^{k-1} t_i f_i(x') \le 0$, $f_k(x') < 0$ and $f_0(x') \le \lambda f_0(x^0) + (1-\lambda)f_0(x) < 0$ for sufficiently small $\lambda > 0$.
Since this contradicts (2.23), the system (2.24) is inconsistent and, again by the induction hypothesis, there exist $\eta \ge 0$ and $\lambda_k \ge 0$ such that
$$f_0(x) + \eta\sum_{i=1}^{k-1} t_i f_i(x) + \lambda_k f_k(x) \ge 0 \quad \forall x \in D. \qquad (2.25)$$
$$\langle t, Ax - b\rangle + \sum_{i=1}^m \lambda_i f_i(x) \ge 0 \quad \forall x \in D. \qquad (2.27)$$
$$x \in E, \quad f_i(x) < 0, \ i = 1, \ldots, m$$
$$\sum_{i=1}^m \lambda_i f_i(x) \ge 0 \quad \forall x \in E. \qquad (2.28)$$
By dividing by $\sum_{i=1}^m \lambda_i$, we may assume $\sum_{i=1}^m \lambda_i = 1$. Obviously, the convex function $f(x) := \sum_{i=1}^m \lambda_i f_i(x)$ is finite on $D$. Consider the set $C$ of all $(y, y_0) \in \mathbb{R}^k \times \mathbb{R}$ for which there exists an $x \in D$ satisfying
$$Ax - b = y, \qquad f(x) < y_0.$$
Since by (2.28) $0 \notin C$, and since $C$ is convex, there exists, again by Lemma 1.2, a vector $(t, t_0) \in \mathbb{R}^k \times \mathbb{R}$ such that
$$\langle t, y\rangle \ge 0 \quad \forall y \in A(D) - b.$$
Indeed, since $\operatorname{epi} f$ is a closed convex set, there exists by Theorem 1.3 a hyperplane strongly separating $(x^0, t_0)$ from $\operatorname{epi} f$, i.e., an affine function $\langle a, x\rangle - \alpha$ such that (2.30) holds. On the other hand, if $x^0 \notin \operatorname{dom} f$ and $\alpha = 0$, we can consider an $x^1 \in \operatorname{dom} f$ and $t_1 < f(x^1)$, so that $(x^1, t_1) \notin \operatorname{epi} f$ and, by what has just been proved, there exists $(b, \beta) \in \mathbb{R}^n \times \mathbb{R}$ satisfying the corresponding inequalities; i.e., $(a', \alpha')$ satisfies (2.30). Note that (2.30) implies $\langle a, x\rangle - \alpha \le f(x) \ \forall x$, i.e., the affine function $h(x) = \langle a, x\rangle - \alpha$ minorizes $f(x)$. Now let $Q$ be the family of all affine functions $h$ minorizing $f$. We contend that
$$f(x) = \sup\{h(x) \mid h \in Q\} \quad \forall x. \qquad (2.31)$$
Suppose the contrary, that $f(x^0) > \gamma := \sup\{h(x^0) \mid h \in Q\}$ for some $x^0$. Then $(x^0, \gamma) \notin \operatorname{epi} f$ and by the above there exists $(a, \alpha) \in \mathbb{R}^n \times \mathbb{R}$ satisfying (2.30) for $t_0 = \gamma$. Hence $h(x) = \langle a, x\rangle - \alpha \in Q$ and $\gamma < h(x^0)$, a contradiction. Thus (2.31) holds, as was to be proved. $\square$
Corollary 2.8 For any function $f: \mathbb{R}^n \to [-\infty, +\infty]$ the closure of the convex hull of $f$ is equal to the upper envelope of all affine functions minorizing $f$.
Proof An affine function $h$ minorizes $f$ if and only if it minorizes $\operatorname{cl}(\operatorname{conv} f)$; hence the conclusion. $\square$
Proposition 2.19 Any proper convex function $f$ has an affine minorant. If $x^0 \in \operatorname{int}(\operatorname{dom} f)$ then an affine minorant $h$ exists which is exact at $x^0$, i.e., such that $h(x^0) = f(x^0)$.
Proof Indeed, the collection $Q$ in Theorem 2.5 for $\operatorname{cl} f$ is nonempty. If $x^0 \in \operatorname{int}(\operatorname{dom} f)$ then $(x^0, f(x^0))$ is a boundary point of the convex set $\operatorname{epi} f$. Hence by Theorem 1.5 there exists a supporting hyperplane to $\operatorname{epi} f$ at this point, i.e., there exists $(a, \alpha) \in \mathbb{R}^n \times \mathbb{R}$, with $a \ne 0$ or $\alpha \ne 0$, such that
$$\langle a, x\rangle + \alpha t \le \langle a, x^0\rangle + \alpha f(x^0) \quad \forall (x,t) \in \operatorname{epi} f.$$
Letting $t \to +\infty$ shows that $\alpha \le 0$, and $\alpha = 0$ is impossible because $x^0 \in \operatorname{int}(\operatorname{dom} f)$. Therefore $\alpha < 0$, so we can take $\alpha = -1$. Then $\langle a, x\rangle - t \le \langle a, x^0\rangle - f(x^0)$ for all $(x,t) \in \operatorname{epi} f$, and the affine function $h(x) = f(x^0) + \langle a, x - x^0\rangle$ minorizes $f$ and satisfies $h(x^0) = f(x^0)$.
2.7 Subdifferential
(Figure: the graph of $f$ and the affine function $f(x^0) + \langle p, x - x^0\rangle$ supporting it at $x^0$.)
$x \in C$ and $p \in \partial f(x)$, we have $\langle p, y - x\rangle + f(x) \le f(y) \ \forall y$; but by Theorem 2.2, there exists $K > 0$ such that $|f(x) - f(y)| \le K\|y - x\|$ for all $y \in C + rB$. Hence $\langle p, y - x\rangle \le K\|y - x\|$ for all $y \in C + rB$, i.e., $\langle p, u\rangle \le K\|u\|$ for all $u \in rB$. By taking $u = rp/\|p\|$ this implies $\|p\| \le K$, so the set $\bigcup_{x\in C}\partial f(x)$ is bounded. $\square$
Corollary 2.9 Let $f$ be a proper convex function on $\mathbb{R}^n$. For any bounded convex subset $C$ of $\operatorname{int}(\operatorname{dom} f)$ there exists a positive constant $K$ such that
Proof If $p \in \partial f(x^0)$ then $\langle p, x - x^0\rangle + f(x^0) \le f(x) \ \forall x$. Setting $x = 2x^0$ yields $\langle p, x^0\rangle + f(x^0) \le 2f(x^0)$, i.e., $\langle p, x^0\rangle \le f(x^0)$; then setting $x = 0$ yields $\langle p, x^0\rangle \ge f(x^0)$; hence $\langle p, x^0\rangle = f(x^0)$. (Note that this condition is trivial and can be omitted if $x^0 = 0$.) Furthermore, $\langle p, x - x^0\rangle = \langle p, x\rangle - \langle p, x^0\rangle = \langle p, x\rangle - f(x^0)$, hence $\langle p, x\rangle \le f(x) \ \forall x$. Conversely, if $p$ belongs to the set on the right-hand side of (2.34) then obviously $\langle p, x - x^0\rangle \le f(x) - f(x^0)$, so $p \in \partial f(x^0)$. $\square$
If, in addition, $f(-x) = f(x) \ge 0 \ \forall x$ then the condition $\langle p, x\rangle \le f(x) \ \forall x$ is equivalent to $|\langle p, x\rangle| \le f(x) \ \forall x$. In particular:
1. If $f(x) = \|x\|$ (Euclidean norm) then
$$\partial f(x^0) = \begin{cases} \{p \mid \|p\| \le 1\} \ (\text{unit ball}) & \text{if } x^0 = 0 \\ \{x^0/\|x^0\|\} & \text{if } x^0 \ne 0. \end{cases} \qquad (2.35)$$
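Formula (2.35) can be spot-checked through the subgradient inequality $\langle p, x - x^0\rangle \le f(x) - f(x^0)$; a minimal sketch for the Euclidean norm in $\mathbb{R}^2$ with assumed test points:

```python
# Subgradients of f(x) = ||x||: at x0 != 0 the subgradient is x0/||x0||;
# at x0 = 0 any p with ||p|| <= 1 works, and nothing with ||p|| > 1 does.

import math

def norm(x):
    return math.hypot(x[0], x[1])

def is_subgradient(p, x0, f, pts):
    # test <p, x - x0> <= f(x) - f(x0) on a finite set of probe points
    f0 = f(x0)
    return all(p[0]*(x[0]-x0[0]) + p[1]*(x[1]-x0[1]) <= f(x) - f0 + 1e-9
               for x in pts)

pts = [(math.cos(k), math.sin(k)) for k in range(12)] + [(2.0, -3.0), (0.0, 0.0)]

x0 = (3.0, 4.0)
p = (x0[0] / norm(x0), x0[1] / norm(x0))      # the unique subgradient (0.6, 0.8)
```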
Example 2.2 (Distance Function) Let $C$ be a closed convex set in $\mathbb{R}^n$, and $f(x) = \min\{\|y - x\| \mid y \in C\}$. Denote by $\pi_C(x)$ the projection of $x$ on $C$, so that $\|\pi_C(x) - x\| = \min\{\|y - x\| \mid y \in C\}$ and $\langle x - \pi_C(x), y - \pi_C(x)\rangle \le 0 \ \forall y \in C$ (see Proposition 1.15). Then
$$\partial f(x^0) = \begin{cases} N_C(x^0) \cap B(0,1) & \text{if } x^0 \in C \\ \Big\{\dfrac{x^0 - \pi_C(x^0)}{\|x^0 - \pi_C(x^0)\|}\Big\} & \text{if } x^0 \notin C, \end{cases} \qquad (2.38)$$
where $N_C(x^0)$ denotes the outward normal cone of $C$ at $x^0$ and $B(0,1)$ the Euclidean unit ball.
Proof Let $x^0 \in C$, so that $f(x^0) = 0$. Then $p \in \partial f(x^0)$ implies $\langle p, x - x^0\rangle \le f(x) \ \forall x$; hence, in particular, $\langle p, x - x^0\rangle \le 0 \ \forall x \in C$, i.e., $p \in N_C(x^0)$; furthermore, $\langle p, x - x^0\rangle \le f(x) \le \|x - x^0\| \ \forall x$, hence $\|p\| \le 1$, i.e., $p \in B(0,1)$. Conversely, if $p \in N_C(x^0) \cap B(0,1)$ then $\langle p, x - \pi_C(x)\rangle \le \|x - \pi_C(x)\| = f(x)$, and $\langle p, \pi_C(x) - x^0\rangle \le 0$; consequently $\langle p, x - x^0\rangle = \langle p, x - \pi_C(x)\rangle + \langle p, \pi_C(x) - x^0\rangle \le f(x) = f(x) - f(x^0)$ for all $x$, and so $p \in \partial f(x^0)$.
Turning to the case $x^0 \notin C$, observe that $p \in \partial f(x^0)$ implies $\langle p, x - x^0\rangle + f(x^0) \le f(x) \ \forall x$; hence, setting $x = \pi_C(x^0)$ yields $\langle p, \pi_C(x^0) - x^0\rangle + \|\pi_C(x^0) - x^0\| \le 0$, i.e., $\langle p, x^0 - \pi_C(x^0)\rangle \ge \|x^0 - \pi_C(x^0)\|$. On the other hand, setting $x = 2x^0 - \pi_C(x^0)$ yields $\langle p, x^0 - \pi_C(x^0)\rangle + \|\pi_C(x^0) - x^0\| \le 2\|\pi_C(x^0) - x^0\|$, i.e., $\langle p, x^0 - \pi_C(x^0)\rangle \le \|\pi_C(x^0) - x^0\|$. Thus, $\langle p, x^0 - \pi_C(x^0)\rangle = \|\pi_C(x^0) - x^0\|$ and consequently $p = \frac{x^0 - \pi_C(x^0)}{\|x^0 - \pi_C(x^0)\|}$. Conversely, the last equality implies $\langle p, x^0 - \pi_C(x^0)\rangle = \|x^0 - \pi_C(x^0)\| = f(x^0)$ and $\langle p, x - \pi_C(x)\rangle \le \|x - \pi_C(x)\| = f(x)$; hence $\langle p, x - x^0\rangle + f(x^0) = \langle p, x - x^0\rangle + \langle p, x^0 - \pi_C(x^0)\rangle = \langle p, x - \pi_C(x^0)\rangle = \langle p, x - \pi_C(x)\rangle + \langle p, \pi_C(x) - \pi_C(x^0)\rangle \le \|x - \pi_C(x)\| = f(x)$ for all $x$ (note that $\langle p, \pi_C(x) - \pi_C(x^0)\rangle \le 0$ because $p \in N_C(\pi_C(x^0))$). Therefore, $\langle p, x - x^0\rangle + f(x^0) \le f(x)$ for all $x$, proving that $p \in \partial f(x^0)$. $\square$
Observe from the above examples that there is a unique subgradient (which is
just the gradient) at every point where f is differentiable. This is actually a general
fact which we are now going to establish.
Let $f: \mathbb{R}^n \to [-\infty, +\infty]$ be any function and let $x^0$ be a point where $f$ is finite. If for some $u \ne 0$ the limit (finite or infinite)
Proof
(i) For any given $u \ne 0$ the function $\varphi(\lambda) = f(x^0 + \lambda u)$ is proper convex on the real line, and $0 \in \operatorname{dom}\varphi$. Therefore, its right derivative $\varphi'_+(0) = f'(x^0; u)$ exists (but may equal $+\infty$ if 0 is an endpoint of $\operatorname{dom}\varphi$). The relation (2.39) follows from the fact that $[\varphi(\lambda) - \varphi(0)]/\lambda$ is nonincreasing as $\lambda \downarrow 0$.
(ii) The homogeneity of $f'(x^0; u)$ in $u$ is obvious. The convexity then follows from the relations
which is equivalent to $\langle p, u\rangle \le \inf_{\lambda>0}\,[f(x^0 + \lambda u) - f(x^0)]/\lambda \ \forall u$, i.e., by (i), $\langle p, u\rangle \le f'(x^0; u) \ \forall u$.
This is equivalent to
If $g$ is continuous at some point in $\operatorname{Im}(A)$ (the range of $A$), then the above inclusion is in fact an equality for every $x \in \mathbb{R}^n$.
Proof The first part is straightforward. To prove the second part, consider any $p \in \partial(g\circ A)(x^0)$. Then the system
increasing, i.e., such that φ(t) ≥ φ(t′) whenever tᵢ ≥ t′ᵢ, i = 1,…,m. Then the function f = φ∘g is convex and

∂f(x) = { Σᵢ₌₁ᵐ sᵢ pⁱ | pⁱ ∈ ∂gᵢ(x), (s₁,…,s_m) ∈ ∂φ(g(x)) }.   (2.42)

Proof The convexity of f(x) follows from an obvious extension of Proposition 2.8 which corresponds to the case m = 1. To prove (2.42), let p = Σᵢ₌₁ᵐ sᵢpⁱ with pⁱ ∈ ∂gᵢ(x⁰), s ∈ ∂φ(g(x⁰)). First observe that ⟨s, y − g(x⁰)⟩ ≤ φ(y) − φ(g(x⁰)) ∀y ∈ Rᵐ implies, for all y = g(x⁰) + u with u ≤ 0: ⟨s, u⟩ ≤ φ(g(x⁰) + u) − φ(g(x⁰)) ≤ 0 ∀u ≤ 0; hence s ≥ 0. Now ⟨p, x − x⁰⟩ = Σᵢ₌₁ᵐ sᵢ⟨pⁱ, x − x⁰⟩ ≤ Σᵢ₌₁ᵐ sᵢ[gᵢ(x) − gᵢ(x⁰)] = ⟨s, g(x) − g(x⁰)⟩ ≤ φ(g(x)) − φ(g(x⁰)) = f(x) − f(x⁰) for all x ∈ Rⁿ. Therefore p ∈ ∂f(x⁰), i.e., the right-hand side of (2.42) is contained in the left-hand side. To prove the converse, let p ∈ ∂f(x⁰), so that the system
is inconsistent, while the system (2.43) has a solution. By Proposition 2.18 there exists s ∈ Rᵐ₊ such that
Note that when φ(y) is differentiable at g(x) the above formula (2.42) is similar to the classical chain rule, namely:

∂(φ∘g)(x) = Σᵢ₌₁ᵐ (∂φ/∂yᵢ)(g(x)) ∂gᵢ(x).
Proposition 2.25 Let f(x) = max{g₁(x),…,g_m(x)}, where each gᵢ is a convex function from Rⁿ to R. Then, with I(x⁰) := {i | gᵢ(x⁰) = f(x⁰)},

∂f(x⁰) = conv{ ∪ ∂gᵢ(x⁰) | i ∈ I(x⁰) }.

Proof Let p ∈ ∂f(x⁰); then the system

gᵢ(x) − f(x⁰) − ⟨p, x − x⁰⟩ < 0, i = 1,…,m,

is inconsistent. By Proposition 2.18, there exist λᵢ ≥ 0 such that Σᵢ₌₁ᵐ λᵢ = 1 and Σᵢ₌₁ᵐ λᵢ[gᵢ(x) − f(x⁰) − ⟨p, x − x⁰⟩] ≥ 0. Setting x = x⁰, we have

Σ_{i∉I(x⁰)} λᵢ[gᵢ(x⁰) − f(x⁰)] ≥ 0,

with gᵢ(x⁰) − f(x⁰) < 0 for every i ∉ I(x⁰). This implies that λᵢ = 0 for all i ∉ I(x⁰). Hence

Σ_{i∈I(x⁰)} λᵢ[gᵢ(x) − gᵢ(x⁰)] − ⟨p, x − x⁰⟩ ≥ 0

for all x ∈ Rⁿ, and so p ∈ ∂(Σ_{i∈I(x⁰)} λᵢgᵢ)(x⁰). By Proposition 2.22, p = Σ_{i∈I(x⁰)} λᵢpⁱ with pⁱ ∈ ∂gᵢ(x⁰). Thus ∂f(x⁰) ⊂ conv{∪ ∂gᵢ(x⁰) | i ∈ I(x⁰)}. The converse inclusion can be verified in a straightforward manner. ∎
Proposition 2.26 For any proper closed convex function f on Rⁿ and any given ε > 0, the ε-subdifferential of f at any point x⁰ ∈ dom f is nonempty. If C is a bounded subset of int(dom f) then the set ∪_{x∈C} ∂_ε f(x) is bounded.
Proof Since the point (x⁰, f(x⁰) − ε) ∉ epi f, there exists p ∈ Rⁿ such that ⟨p, x − x⁰⟩ < f(x) − (f(x⁰) − ε) ∀x (see (2.30) in the proof of Theorem 2.5). Thus, ∂_ε f(x⁰) ≠ ∅. The proof of the second part of the proposition is analogous to that of Theorem 2.6. ∎
Note that ∂_ε f(x⁰) is unbounded when x⁰ is a boundary point of dom f.
Proposition 2.27 Let x⁰, x¹ ∈ dom f. If p ∈ ∂_ε f(x⁰) (ε ≥ 0) then p ∈ ∂_{ε′} f(x¹) for
for all x¹, x² ∈ C and all λ ∈ [0,1]. The number r > 0 is then called the modulus of strong convexity of f(x). Using the identity

‖λx¹ + (1−λ)x²‖² = λ‖x¹‖² + (1−λ)‖x²‖² − λ(1−λ)‖x¹ − x²‖²

for all x¹, x² ∈ Rⁿ and all λ ∈ [0,1], it is easily verified that a convex function f(x) is strongly convex with modulus of strong convexity r if and only if the function f(x) − r‖x‖² is convex.
Proposition 2.28 If f(x) is a strongly convex function on Rⁿ with modulus of strong convexity r, then for any x⁰ ∈ Rⁿ and ε ≥ 0:

∂f(x⁰) + B(0, 2√(rε)) ⊂ ∂_ε f(x⁰),   (2.48)

Qx + a + u ∈ ∂_ε f(x)

for any u ∈ Rⁿ such that ‖u‖ ≤ 2√(rε).
Proof Clearly f(x) − r‖x‖² is convex (as the smallest eigenvalue of its Hessian is nonnegative), so f(x) is a strongly convex function to which the above proposition applies. ∎
f(p) =  0            if p = 0,
        +∞           if p < 0,
        p log p − p  if p > 0.

The conjugate of f(p) is in turn the function f*(x) = sup_p {px − f(p)} = sup{px − p log p + p | p > 0} = eˣ.
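This conjugacy can be checked numerically. The sketch below is our own illustration (grid size and test points are ad hoc choices): it approximates the supremum defining f*(x) over a dense grid of the effective domain p ≥ 0 and compares the result with eˣ.

```python
import numpy as np

# Numerical check: the conjugate of f(p) = p*log(p) - p (f(0) = 0,
# f(p) = +inf for p < 0) should be f*(x) = sup_{p>=0} {p*x - f(p)} = e^x.
p = np.linspace(1e-12, 50.0, 200001)     # grid over the effective domain
fp = p * np.log(p) - p
conj = lambda x: float(np.max(p * x - fp))

for x in (-1.0, 0.0, 0.5, 1.5):
    assert abs(conj(x) - np.exp(x)) < 1e-3
print("sup_p {p*x - f(p)} matches e^x at the test points")
```

The supremum is attained at p = eˣ, well inside the grid for the chosen test points; for much larger x the grid would have to be extended.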
Example 2.4 The conjugate of the function f(x) = (1/α) Σᵢ₌₁ⁿ |xᵢ|^α, 1 < α < +∞, is

f*(p) = (1/β) Σᵢ₌₁ⁿ |pᵢ|^β, 1 < β < +∞,

where 1/α + 1/β = 1. Indeed,

f*(p) = sup_x { Σᵢ₌₁ⁿ pᵢxᵢ − (1/α) Σᵢ₌₁ⁿ |xᵢ|^α }.

The supremum is separable; maximizing in each xᵢ yields |x̄ᵢ|^{α−1} = |pᵢ| at the maximizer x̄, hence

f*(p) = Σᵢ₌₁ⁿ (1 − 1/α)|pᵢ|^{α/(α−1)} = (1/β) Σᵢ₌₁ⁿ |pᵢ|^β.
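The Hölder-conjugate pair in this example can also be verified numerically in one dimension (the choice α = 3, β = 3/2 and the grid below are ours):

```python
import numpy as np

# One-dimensional check of Example 2.4: with a = 3 and b = 3/2
# (so 1/a + 1/b = 1), the conjugate of |x|^a / a is |p|^b / b.
a, b = 3.0, 1.5
x = np.linspace(-20.0, 20.0, 400001)
fx = np.abs(x) ** a / a

def conj_num(p):                      # numerical sup_x {p*x - f(x)}
    return float(np.max(p * x - fx))

for p in (-2.0, -0.5, 0.0, 1.0, 3.0):
    assert abs(conj_num(p) - np.abs(p) ** b / b) < 1e-4
print("numerical conjugate agrees with |p|^b / b")
```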
If f is convex and closed then, since it is proper, by Theorem 2.5 it is just equal to the function on the left-hand side of (2.51), and since f** ≤ f, it follows that f** = f. Conversely, if f = f** then f is the conjugate of f*, hence is convex and closed.
Turning to (iii), observe that if Q = ∅ then f*(p) = sup_x {⟨p, x⟩ − f(x)} = +∞ for all p and consequently f** ≡ −∞. But in this case, sup{h | h ∈ Q} = −∞, too, hence f** = sup{h | h ∈ Q}. On the other hand, if there is h ∈ Q then from (2.51) h ≤ f**; conversely, if h ≤ f** then h ≤ f (because f** ≤ f). In this case, since f**(x) ≤ f(x) < +∞ at least for some x, and f**(x) > −∞ ∀x (because there is an affine function minorizing f), it follows that f** is proper. By Theorem 2.5, then f** = sup{h | h affine, h ≤ f**} = sup{h | h ∈ Q}. This proves the equality in (iii), and so, by Corollary 2.8, f** = cl(conv f). ∎
The smallest and the largest values of a convex function on a given convex set are
often of particular interest.
Let f: Rⁿ → [−∞, +∞] be an arbitrary function, and C an arbitrary set in Rⁿ. A point x⁰ ∈ C ∩ dom f is called a global minimizer of f(x) on C if −∞ < f(x⁰) ≤ f(x) for all x ∈ C. It is called a local minimizer of f(x) on C if there exists a neighborhood U(x⁰) of x⁰ such that −∞ < f(x⁰) ≤ f(x) for all x ∈ C ∩ U(x⁰). The concepts of global maximizer and local maximizer are defined analogously. For an arbitrary function f on a set C we denote the set of all global minimizers (maximizers) of f on C by argmin_{x∈C} f(x) (argmax_{x∈C} f(x), resp.).
Since min_{x∈C} f(x) = −max_{x∈C} (−f(x)), the theory of the minimum (maximum) of a convex function is the same as the theory of the maximum (minimum, resp.) of a concave function.
2.11.1 Minimum
for any two distinct points x¹, x² ∈ C and 0 < λ < 1. For such a function f the set argmin_{x∈C} f(x), if nonempty, is a singleton, i.e., a strictly convex function f(x) on C has at most one minimizer over C. In fact, if there were two distinct minimizers x¹, x², then by strict convexity f((x¹ + x²)/2) < f(x¹), which is impossible.
where N_C(x⁰) denotes the (outward) normal cone of C at x⁰ (cf. Sect. 1.6).
Proof If (2.52) holds there is p ∈ ∂f(x⁰) ∩ (−N_C(x⁰)). For every x ∈ C, since p ∈ ∂f(x⁰), we have ⟨p, x − x⁰⟩ ≤ f(x) − f(x⁰), i.e., f(x⁰) ≤ f(x) − ⟨p, x − x⁰⟩; on the other hand, since −p ∈ N_C(x⁰), we have ⟨p, x − x⁰⟩ ≥ 0, hence f(x⁰) ≤ f(x), i.e., x⁰ is a minimizer. Conversely, if x⁰ ∈ argmin_{x∈C} f(x) then the system
2.11.2 Maximum
In contrast with the minimum, a local maximum of a convex function may not be
global. Generally speaking, local information is not sufficient to identify a global
maximizer of a convex function.
Proof Suppose that f attains its maximum on C at a point x⁰ ∈ ri C and let x be an arbitrary point of C. Since x⁰ ∈ ri C there is y ∈ C such that x⁰ = λx + (1−λ)y for some λ ∈ (0,1). Then f(x⁰) ≤ λf(x) + (1−λ)f(y), hence λf(x) ≥ f(x⁰) − (1−λ)f(y) ≥ f(x⁰) − (1−λ)f(x⁰) = λf(x⁰). Thus f(x) ≥ f(x⁰), hence f(x) = f(x⁰), proving the first part of the proposition. The second part follows because for any maximizer x⁰ there is a face F of C such that x⁰ ∈ ri F; then, by the previous argument, any point of this face is a global maximizer. ∎
Proposition 2.34 Let C be a closed convex set, and f: C → R a convex function. If C contains no line and f(x) is bounded above on every halfline of C, then

sup_{x∈C} f(x) = sup_{x∈V(C)} f(x),

where V(C) is the set of extreme points of C. If the maximum of f(x) is attained at all on C, it is attained on V(C).
Proof By Theorem 1.7, C = conv V(C) + K, where K is the convex cone generated by the extreme directions of C. Any point of C which is actually not an extreme point belongs to a halfline emanating from some v ∈ V(C) in the direction of a ray of K. Since f(x) is finite and bounded above on this halfline, its maximum on the halfline is attained at v (Proposition 2.12, (ii)). Therefore, the supremum of f(x) on C reduces to the supremum on conv V(C). The conclusion then follows from the fact that any x ∈ conv V(C) is of the form x = Σ_{i∈I} λᵢvⁱ, with |I| < +∞, vⁱ ∈ V(C), λᵢ ≥ 0, Σ_{i∈I} λᵢ = 1; hence f(x) ≤ Σ_{i∈I} λᵢ f(vⁱ) ≤ max_{i∈I} f(vⁱ). ∎
Corollary 2.13 A real-valued convex function f .x/ on a polyhedron C containing
no line is either unbounded above on some unbounded edge or attains its maximum
at an extreme point of C:
Corollary 2.14 A real-valued convex function f .x/ on a compact convex set C
attains its maximum at an extreme point of C:
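Corollary 2.14 can be illustrated numerically: for a convex function on the square [−1,1]², no sampled interior point beats the best vertex. The particular function and sampling scheme below are our own choices.

```python
import numpy as np

# Max of a convex function on a compact convex set is attained at an
# extreme point: check on the square [-1,1]^2 with 4 vertices.
f = lambda z: np.exp(z[0]) + (z[1] - 0.3) ** 2        # convex on R^2
vertices = [np.array(v) for v in
            [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]]
best_vertex_val = max(f(v) for v in vertices)

rng = np.random.default_rng(1)
samples = rng.uniform(-1.0, 1.0, size=(5000, 2))
assert all(f(s) <= best_vertex_val + 1e-12 for s in samples)
print("max over the square is attained at a vertex")
```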
The latter result is in fact true for a wider class of functions, namely for quasiconvex functions. As was defined in Sect. 2.3, these are functions f: Rⁿ → [−∞, +∞] such that for any real number α the level set L_α := {x ∈ Rⁿ | f(x) ≤ α} is convex, or equivalently, such that

f(λx¹ + (1−λ)x²) ≤ max{f(x¹), f(x²)} ∀x¹, x², ∀λ ∈ [0,1].

Indeed, any x ∈ C can be represented as x = Σ_{i∈I} λᵢvⁱ, where the vⁱ are extreme points, λᵢ ≥ 0 and Σ_{i∈I} λᵢ = 1. If f(x) is a quasiconvex function finite on C, and α = max_{i∈I} f(vⁱ), then vⁱ ∈ C ∩ L_α ∀i ∈ I, hence x ∈ C ∩ L_α because of the convexity of the set C ∩ L_α. Therefore, f(x) ≤ max_{i∈I} f(vⁱ), i.e., the maximum of f on C is attained at some extreme point.
A function f(x) is said to be quasiconcave if −f(x) is quasiconvex. A convex (concave) function is of course quasiconvex (quasiconcave), but the converse may not be true, as can be demonstrated by a monotone nonconvex function of one variable. Also, it is easily seen that the upper envelope of a family of quasiconvex functions is quasiconvex, but the sum of two quasiconvex functions may not be quasiconvex.
2.12.1 Minimax
Investigations on this question date back to von Neumann (1928). A classical result of his states that if C, D are compact and f(x, y) is convex in x, concave in y, and continuous in each variable, then the minimax equality holds. Since minimax theorems have found important applications, there has been a great deal of work on the extension of von Neumann's theorem. Almost all these extensions are based either on the separation theorem for convex sets or on a fixed point argument. The most important result in this direction is due to Sion (1958). Later a more general minimax theorem was established by Tuy (1974) without any appeal to separation or fixed point arguments. We present here a simplified version of the latter result which is also a refinement of Sion's theorem.
Theorem 2.7 Let C, D be two closed convex sets in Rⁿ, Rᵐ, respectively, and let f(x, y): C×D → R be a function quasiconvex, lower semi-continuous in x and quasiconcave, upper semi-continuous in y. Assume that:
(*) There exists a finite set N ⊂ D such that sup_{y∈N} f(x, y) → +∞ as x ∈ C, ‖x‖ → +∞.
Proof Since the inequality inf_{x∈C} sup_{y∈D} f(x, y) ≥ sup_{y∈D} inf_{x∈C} f(x, y) is trivial, it suffices to show the reverse inequality:
Let γ := sup_{y∈D} inf_{x∈C} f(x, y). If γ = +∞ then (2.55) is obvious, so we can assume γ < +∞. For α > γ and y ∈ D define

C_α(y) = {x ∈ C | f(x, y) ≤ α}.

Since C_α(y_λ) is convex, it follows from (2.58) that C_α(y_λ) cannot meet both sets C_α(a) and C_α(b), which are disjoint by assumption (2.57). Consequently, for every λ ∈ [0,1], one and only one of the following alternatives occurs:
Denote by M_a (M_b, resp.) the set of all λ ∈ [0,1] satisfying (a) (satisfying (b), resp.). Clearly 0 ∈ M_a, 1 ∈ M_b, M_a ∪ M_b = [0,1] and, analogously to (2.58):
This means
so in a neighborhood of the point (x̄, ȳ, f(x̄, ȳ)) the set epi f := {(x, y, t) | (x, y) ∈ C×D, t ∈ R, f(x, y) ≤ t} resembles the image of a saddle (or a mountain pass).
Proposition 2.35 A point (x̄, ȳ) ∈ C×D is a saddle point of f(x, y): C×D → R if and only if
Proof We have
Since always inf_{x∈C} sup_{y∈D} f(x, y) ≥ sup_{y∈D} inf_{x∈C} f(x, y), we must have equality everywhere in the above sequence of inequalities. Therefore, (2.60) must be equivalent to (2.61). ∎
As a consequence of Theorem 2.7 and Remark 2.2 we can now state
Proposition 2.36 Assume that C ⊂ Rⁿ, D ⊂ Rᵐ are nonempty closed convex sets and the function f(x, y): C×D → R is quasiconvex, l.s.c. in x and quasiconcave, u.s.c. in y. If both of the following conditions hold:
(i) There exists a finite set N ⊂ D such that sup_{y∈N} f(x, y) → +∞ as x ∈ C, ‖x‖ → +∞;
(ii) There exists a finite set M ⊂ C such that inf_{x∈M} f(x, y) → −∞ as y ∈ D, ‖y‖ → +∞;
then there exists a saddle point (x̄, ȳ) for the function f(x, y).
Proof If (i) holds, we have min_{x∈C} sup_{y∈D} f(x, y) = sup_{y∈D} inf_{x∈C} f(x, y) by Theorem 2.7. If (ii) holds, we have inf_{x∈C} sup_{y∈D} f(x, y) = max_{y∈D} inf_{x∈C} f(x, y) by Remark 2.2. Hence the condition (2.61) holds. ∎
We assume that the reader is familiar with convex optimization problems, i.e.,
optimization problems of the form
x ⪯_K y ⇔ y − x ∈ K.

b. Dual Cone

K* = {y ∈ Rⁿ | ⟨x, y⟩ ≥ 0 ∀x ∈ K}.

Note that K* = −K°, where K° is the polar of K (Sect. 1.8). As can easily be seen:
(i) K* is a closed convex cone;
(ii) K* is also the dual cone of the closure of K;
(iii) (K*)* = cl K;
(iv) If x ∈ int K then ⟨x, y⟩ > 0 ∀y ∈ K* \ {0}.
A cone K is said to be self-dual if K* = K. Clearly the orthant Rⁿ₊ is a self-dual cone.
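Self-duality of the orthant can be probed numerically by testing ⟨x, y⟩ ≥ 0 against random samples of K = R³₊. This is only a sketch: the sampling test is a necessary-condition check for membership in K*, not a proof of the set equality.

```python
import numpy as np

# The orthant R^n_+ is self-dual: y lies in the dual cone K* iff
# <x, y> >= 0 for every x in K = R^n_+, i.e., iff all y_i >= 0.
rng = np.random.default_rng(2)
K_samples = rng.uniform(0.0, 1.0, size=(2000, 3))   # points of R^3_+

def in_dual(y, samples, tol=1e-12):
    # necessary condition for y in K*, tested on finitely many x in K
    return bool(np.all(samples @ y >= -tol))

assert in_dual(np.array([0.2, 0.0, 1.5]), K_samples)       # y in R^3_+
assert not in_dual(np.array([0.2, -1.0, 0.1]), K_samples)  # y_2 < 0
print("orthant self-duality checked on random samples")
```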
Lemma 2.1 If K is a closed convex cone then

y ∉ K ⇔ ∃λ ∈ K*: ⟨λ, y⟩ < 0.
c. K-Convex Functions
g(αx¹ + (1−α)x²) ⪯_K αg(x¹) + (1−α)g(x²).   (2.63)

For every λ ∈ K* the function ⟨λ, g(x)⟩ = Σᵢ₌₁ᵐ λᵢgᵢ(x) is convex in the usual sense and the set {x ∈ Rⁿ | g(x) ⪯_K 0} is convex.
Proof For every x¹, x² ∈ Rⁿ, we have by (2.63)

αg(x¹) + (1−α)g(x²) − g(αx¹ + (1−α)x²) ∈ K,

hence α⟨λ, g(x¹)⟩ + (1−α)⟨λ, g(x²)⟩ = ⟨λ, αg(x¹) + (1−α)g(x²)⟩ ≥ ⟨λ, g(αx¹ + (1−α)x²)⟩, proving that the function ⟨λ, g(x)⟩ is convex.
Further, by (2.63), if g(x¹) ⪯_K 0, g(x²) ⪯_K 0, then

g(αx¹ + (1−α)x²) ⪯_K αg(x¹) + (1−α)g(x²) ⪯_K 0,

i.e., the set {x ∈ Rⁿ | g(x) ⪯_K 0} is convex. ∎
L(x, λ, μ) := f(x) + Σᵢ₌₁ᵐ ⟨λᵢ, gᵢ(x)⟩ + ⟨μ, h(x)⟩,

φ(λ, μ) = inf_{x∈Ω} L(x, λ, μ) = inf_{x∈Ω} { f(x) + Σᵢ₌₁ᵐ ⟨λᵢ, gᵢ(x)⟩ + ⟨μ, h(x)⟩ }.

Since φ(λ, μ) is the lower envelope of a family of affine functions in (λ, μ), it is a concave function. Setting K = K₁ × ⋯ × K_m, λ = (λ₁,…,λ_m), we can show that

sup_{λ∈K*, μ∈Rᵖ} L(x, λ, μ) = { f(x)  if gᵢ(x) ⪯_{Kᵢ} 0 (i = 1,…,m), h(x) = 0;
                                +∞    otherwise.

In fact, if gᵢ(x) ⪯_{Kᵢ} 0 (i = 1,…,m), h(x) = 0, then the supremum is attained for λ = 0, μ = 0 and equals f(x). On the other hand, if h(x) ≠ 0 there is μ ∈ Rᵖ satisfying ⟨μ, h(x)⟩ > 0, and for λ = 0 the supremum is equal to sup_{t>0} {f(x) + t⟨μ, h(x)⟩} = +∞. If there is an i ∈ {1,…,m} with gᵢ(x) ⋠_{Kᵢ} 0, then by Lemma 2.1 there is λᵢ ∈ Kᵢ* satisfying ⟨λᵢ, gᵢ(x)⟩ > 0; hence for μ = 0, λⱼ = 0 ∀j ≠ i, the supremum equals sup_{t>0} {f(x) + t⟨λᵢ, gᵢ(x)⟩} = +∞.
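The case analysis above can be illustrated numerically for the special case Kᵢ = R₊ (ordinary scalar constraints). The data f, g, h below and the truncation of the multiplier range are our own choices; the truncated supremum equals f(x) at a feasible point and keeps growing with the truncation bound at an infeasible one.

```python
import numpy as np

# Sketch for K_i = R_+: L(x, lam, mu) = f(x) + lam*g(x) + mu*h(x).
f = lambda x: x[0] ** 2 + x[1] ** 2
g = lambda x: x[0] + x[1] - 1.0        # inequality constraint g(x) <= 0
h = lambda x: x[0] - x[1]              # equality constraint  h(x)  = 0

def sup_L(x, t_max):
    """sup of L over the truncated box 0 <= lam <= t_max, |mu| <= t_max."""
    best = -np.inf
    for lam in np.linspace(0.0, t_max, 201):
        for mu in np.linspace(-t_max, t_max, 401):
            best = max(best, f(x) + lam * g(x) + mu * h(x))
    return best

x_feas = np.array([0.3, 0.3])          # g = -0.4 <= 0, h = 0: feasible
x_infeas = np.array([1.0, 0.5])        # g = 0.5 > 0: infeasible

assert abs(sup_L(x_feas, 100.0) - f(x_feas)) < 1e-9    # sup = f(x)
assert sup_L(x_infeas, 100.0) > sup_L(x_infeas, 10.0)  # grows unboundedly
print("sup over multipliers: f(x) when feasible, unbounded otherwise")
```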
Thus, the problem (2.64) can be written as
that is,
Theorem 2.8
(i) (weak duality) The optimal value in the dual problem never exceeds the optimal
value in the primal problem (2.64).
(ii) (strong duality) The optimal values in the two problems are equal if the Slater
condition holds, i.e., if
Proof (i) is straightforward; we need only prove (ii). By Lemma 2.2, for every λᵢ ∈ Kᵢ* the function ⟨λᵢ, gᵢ(x)⟩ is convex (in the usual sense), finite on Rⁿ and hence continuous. So L(x, λ, μ) is a convex continuous function in x ∈ Ω for every fixed (λ, μ) ∈ D := K₁* × ⋯ × K_m* × Rᵖ, and affine in (λ, μ) ∈ D for every fixed x ∈ Ω.
Since h: Rⁿ → Rᵖ is an affine map, without loss of generality it can be assumed that h(Rⁿ) = Rᵖ. Since h(x̄) = 0 and x̄ ∈ int Ω, we have 0 ∈ int h(Ω); so for every j = 1,…,p there is an aʲ ∈ Ω such that h(aʲ) has its j-th component equal to 1, and all other components equal to 0. Then for sufficiently small ε > 0, we have x̃ʲ := x̄ + ε(aʲ − x̄) ∈ Ω, x̂ʲ := x̄ − ε(aʲ − x̄) ∈ Ω, and so

min_{x∈M} L(x, (λ, μ)) ≤ max_{x∈M} f(x) + min_{x∈M} { Σᵢ₌₁ᵐ ⟨λᵢ, gᵢ(x)⟩ + Σⱼ₌₁ᵖ μⱼhⱼ(x) } → −∞

as (λ, μ) ∈ D, ‖λ‖ + ‖μ‖ → +∞. So the function L(x, (λ, μ)) satisfies condition (!) in Remark 2.2 with C = Ω, y = (λ, μ). By virtue of this Remark,

inf_{x∈Ω} sup_{(λ,μ)∈D} L(x, (λ, μ)) = max_{(λ,μ)∈D} inf_{x∈Ω} L(x, (λ, μ)),
Let K ⊂ Rᵐ be a closed, solid, and pointed convex cone, let c ∈ Rⁿ, let A be an m×n matrix, and b ∈ Rᵐ. Then the following optimization problem:

min{⟨c, x⟩ | Ax ⪰_K b}   (2.66)

∃x: Ax ≻_K b.
Tr(A) := Σᵢ₌₁ⁿ aᵢᵢ,

where vec(A) denotes the n·n column vector whose elements are the elements of the matrix A in the order a₁₁,…,a₁ₙ, a₂₁,…,a₂ₙ,…,aₙ₁,…,aₙₙ. The norm of a matrix A associated with this inner product¹ is the Frobenius norm given by

‖A‖_F = (⟨A, A⟩)^{1/2} = (Tr(AᵀA))^{1/2} = ( Σᵢ,ⱼ aᵢⱼ² )^{1/2}.

¹ Sometimes also written as A·B and called dot product.
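The three expressions for ‖A‖_F can be checked to agree on a random matrix (a small sketch, the matrix being our choice):

```python
import numpy as np

# The Frobenius norm three ways: inner product, trace, entrywise sum.
rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
inner = np.sqrt(np.sum(A * A))                 # <A, A>^(1/2)
trace = np.sqrt(np.trace(A.T @ A))             # (Tr(A^T A))^(1/2)
entry = np.sqrt(sum(A[i, j] ** 2 for i in range(4) for j in range(4)))
assert abs(inner - trace) < 1e-12 and abs(trace - entry) < 1e-12
print("Frobenius norm:", float(inner))
```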
The set of positive semidefinite matrices X ∈ Sⁿ forms a convex cone Sⁿ₊ called the SDP cone (semidefinite positive cone). For any two matrices A, B ∈ Sⁿ we write A ⪯ B to mean that B − A is positive semidefinite. In particular, A ⪰ 0 means A is a positive semidefinite matrix.
Lemma 2.3 The cone Sⁿ₊ is closed, convex, solid, pointed, and self-dual, i.e., Sⁿ₊ = (Sⁿ₊)*.
where the last inequality holds because the matrix X^{1/2}YX^{1/2} is positive semidefinite. Therefore, Sⁿ₊ ⊂ (Sⁿ₊)*. Conversely, let Y ∈ (Sⁿ₊)*. For any u ∈ Rⁿ, we have

Tr(Yuuᵀ) = Σᵢ₌₁ⁿ y₁ᵢuᵢu₁ + ⋯ + Σᵢ₌₁ⁿ yₙᵢuᵢuₙ = Σᵢ,ⱼ₌₁ⁿ yᵢⱼuᵢuⱼ = uᵀYu.

The matrix uuᵀ is obviously positive semidefinite, while the trace of Yuuᵀ is nonnegative by assumption, so the product uᵀYu is nonnegative. Since u is arbitrary, this means Y ∈ Sⁿ₊. Hence, (Sⁿ₊)* ⊂ Sⁿ₊. ∎
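Both inclusions of Lemma 2.3 can be probed numerically: Tr(XY) ≥ 0 for randomly generated positive semidefinite X, Y, and a non-PSD Y is exposed by a rank-one witness uuᵀ, exactly as in the proof. (Random sampling illustrates, but of course does not prove, the inclusion Sⁿ₊ ⊂ (Sⁿ₊)*.)

```python
import numpy as np

# Self-duality of the PSD cone, numerically.
rng = np.random.default_rng(4)

def random_psd(n):
    B = rng.normal(size=(n, n))
    return B @ B.T                              # PSD by construction

for _ in range(200):
    X, Y = random_psd(3), random_psd(3)
    assert np.trace(X @ Y) >= -1e-10            # Tr(XY) >= 0 for PSD X, Y

Y_bad = np.diag([1.0, -2.0, 1.0])               # not PSD
u = np.array([0.0, 1.0, 0.0])
X = np.outer(u, u)                              # rank-one PSD "witness"
assert np.trace(X @ Y_bad) < 0                  # = u^T Y_bad u = -2
print("PSD cone self-duality checked on samples")
```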
A map F: Rⁿ → Sᵐ is said to be convex, or more precisely, convex with respect to matrix inequalities, if it is Sᵐ₊-convex, i.e., such that for every X, Y ∈ Rⁿ and 0 ≤ t ≤ 1:

F(tX + (1−t)Y) ⪯ tF(X) + (1−t)F(Y).

For instance, the map X ↦ X² is convex because for every u ∈ Rᵐ the function uᵀX²u = ‖Xu‖² is convex quadratic with respect to the components of X, and so
A₀ + x₁A₁ + ⋯ + xₙAₙ ⪰ 0

A(x) = [aᵢⱼ(x)], aᵢⱼ(x) = a⁰ᵢⱼ + Σₖ₌₁ⁿ aᵏᵢⱼxₖ.

A(x) ⪰ 0,

where A(x) is a square symmetric matrix whose every element is an affine function of x.
By definition,

A(x) ⪰ 0 ⇔ ⟨y, A(x)y⟩ ≥ 0 ∀y ∈ Rᵖ,

so setting C := {x ∈ Rⁿ | A(x) ⪰ 0}, we have
Since for every fixed y the set {x ∈ Rⁿ | ⟨y, A(x)y⟩ ≥ 0} is a halfspace, we see that the solution set C of an LMI is a closed convex set. In other words, an LMI is nothing but a specific convex inequality which is equivalent to an infinite system of linear inequalities.
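Convexity of the solution set can be observed directly: since A(x) is affine in x, A of a midpoint is the average of two PSD matrices, hence PSD. The concrete 2×2 matrix pencil below is our own example.

```python
import numpy as np

# LMI A(x) = A0 + x1*A1 + x2*A2 >= 0: its solution set is convex.
A0 = np.diag([2.0, 2.0])
A1 = np.array([[1.0, 0.0], [0.0, -1.0]])
A2 = np.array([[0.0, 1.0], [1.0, 0.0]])

def A(x):
    return A0 + x[0] * A1 + x[1] * A2

def is_psd(M, tol=1e-10):
    return bool(np.min(np.linalg.eigvalsh(M)) >= -tol)

rng = np.random.default_rng(5)
pts = [x for x in rng.uniform(-3.0, 3.0, size=(400, 2)) if is_psd(A(x))]
for i in range(0, len(pts) - 1, 2):
    mid = 0.5 * (pts[i] + pts[i + 1])
    assert is_psd(A(mid))        # A(mid) = (A(pts[i]) + A(pts[i+1])) / 2
print(f"{len(pts)} feasible samples; all midpoints feasible")
```

For this pencil the feasible set is in fact the disk x₁² + x₂² ≤ 4, which makes the convexity visible by hand as well.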
Obviously, the strict inequality A(x) ≻ 0 is also an LMI, determining a convex constraint for x. Furthermore, a finite system of LMIs of the form

A⁽¹⁾(x) ⪰ 0, …, A⁽ᵐ⁾(x) ⪰ 0

∃x: A₀ + x₁A₁ + ⋯ + xₙAₙ ≻ 0.   (2.69)
2.15 Exercises
1 Let f(x) be a convex function and C = dom f ⊂ Rⁿ. Show that for all x¹, x² ∈ C and λ ∉ [0,1] such that λx¹ + (1−λ)x² ∈ C:
2 A real-valued function f(t), −∞ < t < +∞, is strictly convex (cf. Remark 2.1) if it has a strictly monotonically increasing derivative f′(t). Apply this result to f(t) = eᵗ and show that for any positive numbers α₁,…,α_k:

(∏ᵢ₌₁ᵏ αᵢ)^{1/k} ≤ (1/k) Σᵢ₌₁ᵏ αᵢ.
5 Notations being the same as in Exercise 4, show that for any x¹, x² ∈ Rⁿ and p¹ ∈ ∂f(x¹), p² ∈ ∂f(x²):

⟨p¹ − p², x¹ − x²⟩ ≥ r‖x¹ − x²‖².

6 If f(x) is strongly convex on a convex set C, then for any x⁰ ∈ C the level set {x ∈ C | f(x) ≤ f(x⁰)} is bounded.
7 Let C be a nonempty convex set in Rⁿ, f: C → R a convex function, Lipschitzian with constant L on C. The function
is convex, Lipschitzian with constant L on the whole space, and satisfies F(x) = f(x) ∀x ∈ C.
8 Show that if ∂_ε f(x⁰) is a singleton for some x⁰ ∈ dom f and ε > 0, then f(x) is an affine function.
9 For a proper convex function f: p ∈ ∂f(x⁰) if and only if (p, −1) is an outward normal to the set epi f at the point (x⁰, f(x⁰)).
10 Let M be a nonempty set in Rⁿ, h: M → R an arbitrary function, E an n×n matrix. The function
is convex and for every x⁰ ∈ dom φ, if y⁰ ∈ argmax_{y∈M} {⟨x⁰, Ey⟩ − h(y)} then Ey⁰ ∈ ∂φ(x⁰).
11 Let f: Rⁿ → R be a convex function, X a closed convex subset of Rⁿ, A ∈ R^{m×n}, B ∈ R^{m×p}, c ∈ Rᵐ. Show that the function φ(y) = min{f(x) | Ax + By ≤ c, x ∈ X} is convex and, for every y⁰ ∈ dom φ, if λ is a Kuhn–Tucker vector for the problem min{f(x) | Ax + By⁰ ≤ c, x ∈ X}, then the vector Bᵀλ is a subgradient of φ at y⁰.
Chapter 3
Fixed Point and Equilibrium
Proof We follow the elegant proof in Hiriart-Urruty (1983). Clearly the function g(x) = f(x) + (ε/δ)‖x − x_ε‖ is l.s.c. and coercive on X, so it has a minimizer x* on X, i.e., a point x* ∈ X such that

f(x*) + (ε/δ)‖x* − x_ε‖ ≤ f(x) + (ε/δ)‖x − x_ε‖ ∀x ∈ X.   (3.5)

Letting x = x_ε yields

f(x*) + (ε/δ)‖x* − x_ε‖ ≤ f(x_ε),   (3.6)

hence f(x*) ≤ f(x_ε). Further, inf_{x∈X} f(x) + (ε/δ)‖x* − x_ε‖ ≤ f(x*) + (ε/δ)‖x* − x_ε‖ ≤ f(x_ε) ≤ inf_{x∈X} f(x) + ε by the definition of x_ε; hence ‖x* − x_ε‖ ≤ δ. Finally, from (3.5) it follows that

f(x*) ≤ f(x) + (ε/δ){‖x − x_ε‖ − ‖x* − x_ε‖} ≤ f(x) + (ε/δ)‖x − x*‖ ∀x ∈ X. ∎

Note that by (3.3) x* is also an ε-approximate minimizer which is not too distant from x_ε; moreover, by (3.4), x* is an exact minimizer of the function f(x) + (ε/δ)‖x − x*‖, which differs from f(x) just by the perturbation quantity (ε/δ)‖x − x*‖. The larger δ, the smaller this perturbation quantity and the closer x* to an exact minimizer, but x* may then be farther from x_ε. In practice, the choice of δ depends on the circumstances. Usually the value δ = √ε is suitable.
Corollary 3.1 Let f: Rⁿ → R be a differentiable function which is bounded below on Rⁿ. For every ε-approximate minimizer x_ε of f there exists an ε-approximate minimizer x* such that

‖x* − x_ε‖ ≤ √ε,  ‖∇f(x*)‖ ≤ √ε.

Proof From (3.4), for x = x* + tu, u ∈ Rⁿ, t > 0, and δ = √ε we have f(x*) ≤ f(x* + tu) + √ε‖tu‖. Therefore [f(x* + tu) − f(x*)]/t ≥ −√ε‖u‖, and letting t ↓ 0 yields ⟨∇f(x*), u⟩ ≥ −√ε‖u‖ ∀u ∈ Rⁿ; hence ‖∇f(x*)‖ ≤ √ε. ∎
Let X, Y be two sets in Rⁿ, Rᵐ, respectively. A set-valued map f from X to Y is a map which associates with each point x ∈ X a set f(x) ⊂ Y called the value of f at x. Theorem 3.1 implies the following fixed point theorem due to Caristi (1976):
Theorem 3.2 Let P be a set-valued map from a closed set X ⊂ Rⁿ into itself and let f: X → [0, +∞] be a l.s.c. function finite at at least one point. If
is called the Hausdorff distance between A and B. From this definition it readily follows that if d(A, B) ≤ α then d(x, B) ≤ α ∀x ∈ A (because A ⊂ Bᵅ) and d(y, A) ≤ α ∀y ∈ B (because B ⊂ Aᵅ).
A set-valued map P from a set X ⊂ Rⁿ to Rⁿ is said to be a contraction if there exists θ, 0 < θ < 1, such that
Let x ∈ X. Since θ < 1, by definition of d(x, P(x)) there exists y ∈ P(x) satisfying
So y ∈ E and all the conditions in Caristi's Theorem 3.2 are satisfied for the set E, the map x ↦ P(x) ∩ E and the function f(x) = d(x, P(x)). Therefore, there exists x* ∈ P(x*) ∩ E, i.e., x* ∈ P(x*) and ‖x* − a‖ ≤ (f(a) − f(x*))/(1−θ) = f(a)/(1−θ), by noting that f(x*) = 0 as x* ∈ P(x*). ∎
S₁, S₂, …, S_m.   (3.12)
then ∩ⁿᵢ₌₀ Lᵢ ≠ ∅.
Proof First assume, as in the classical formulation of this proposition, that a⁰, a¹,…,aⁿ are affinely independent. The set S = [a⁰, a¹,…,aⁿ] is then an n-simplex and for each I ⊂ {0,1,…,n} the set conv{aⁱ, i ∈ I} is a face of S. Consider a simplicial subdivision of degree k of S. Let x be a subdivision point, i.e., a vertex of some subsimplex in the subdivision, and conv{aⁱ | i ∈ I} the smallest face of S containing x. By assumption x ∈ ∪_{i∈I} Lᵢ, so there exists i ∈ I such that x ∈ Lᵢ. We then set ℓ(x) = i. Clearly this labeling ℓ of the subdivision points satisfies the condition of Lemma 3.2. Therefore, there exists a subsimplex [x^{0,k}, x^{1,k},…,x^{n,k}] with ℓ(x^{i,k}) = i, i = 0,1,…,n. By compactness of S there exists a subsequence (still indexed by k) such that

x^{i,k} → x*, i = 0,1,…,n.
x̄ ∈ C, F(x̄, y) ≥ 0 ∀y ∈ D.

x̄ ∈ C, F(x̄, y) ≥ 0 ∀y ∈ D.   (3.14)

Therefore,
Since this holds for every set I ⊂ {1,…,k}, by the KKM Lemma we have ∩ᵏᵢ₌₁ L(aⁱ) ≠ ∅. So there exists ȳ ∈ conv{a¹,…,aᵏ} such that φ(ȳ) ∩ C(aⁱ) ≠ ∅ ∀i = 1,…,k.
Now for each i = 1,…,k take an xⁱ ∈ φ(ȳ) ∩ C(aⁱ). It is not hard to see that for every set I ⊂ {1,…,k}, we have
Or equivalently,

max_{x∈C} inf_{y∈D} F(x, y) ≥ inf_{y∈D} inf_{x∈φ(y)} F(x, y).   (3.21)
Proof Clearly, for every α < β, the assumption φ(y) ⊂ C_α(y) ∀y ∈ D means that inf_{y∈D} inf_{x∈φ(y)} F(x, y) ≥ α, while by Theorem 3.4 the latter inequality implies max_{x∈C} inf_{y∈D} F(x, y) ≥ α. Therefore,

max_{x∈C} inf_{y∈D} F(x, y) ≥ β := inf_{y∈D} sup_{x∈C} F(x, y),
whence (3.22), because the converse inequality is trivial. Thus Theorem 3.4 implies Theorem 3.5. The converse can be proved in the same manner. ∎
Corollary 3.2 Let C be a nonempty compact convex set in Rⁿ, D a nonempty closed convex set in Rᵐ, and F(x, y): C×D → R a u.s.c. function which is quasiconcave in x and quasiconvex in y. Then the minimax equality (3.22) holds.
Proof For every α < β define φ(y) := {x ∈ C | F(x, y) ≥ α}. Then φ(y) is a nonempty closed convex set. If (xᵏ, yᵏ) ∈ C×D, (xᵏ, yᵏ) → (x, y), and xᵏ ∈ φ(yᵏ), i.e., F(xᵏ, yᵏ) ≥ α, then by the upper semi-continuity of F(x, y) we must have F(x, y) ≥ α, i.e., x ∈ φ(y). So the set-valued map φ is closed and φ(y) = C_α(y) ∀y ∈ D. The conclusion follows by Theorem 3.5. ∎
Note that Corollary 3.2 differs from Sion's minimax theorem (Theorem 2.7) only by the continuity properties imposed on F(x, y). Actually, with a little bit more skill, Sion's theorem can also be derived directly from the General Equilibrium Theorem (Exercise 6).
inf_{y∈C} F(x̄, y) ≥ 0.

Theorem 3.7 (Kakutani 1941) Let C be a compact convex subset of Rⁿ and φ a closed set-valued map from C to C with nonempty convex values φ(x) ⊂ C ∀x ∈ C. Then φ has a fixed point, i.e., a point x̄ ∈ φ(x̄).
Proof The function F(x, y): C×C → R defined by F(x, y) = ‖x − y‖ is continuous on C×C and convex in y for every fixed x. By Theorem 3.4 (Remark 3.2, (3.21)) there exists x̄ ∈ C such that
Consider an n-person game in which player i has a strategy set Xᵢ ⊂ R^{mᵢ}. When player i chooses a strategy xᵢ ∈ Xᵢ, the situation of the game is described by the vector x = (x₁,…,xₙ) ∈ ∏ᵢ₌₁ⁿ Xᵢ. In that situation player i obtains a payoff fᵢ(x).
Assume that each player does not know which strategy is taken by the other players. A vector x̄ ∈ X := ∏ᵢ₌₁ⁿ Xᵢ is called a Nash equilibrium if for every i = 1,…,n:

fᵢ(x̄) ≥ fᵢ(x̄|ᵢ z) ∀z ∈ Xᵢ,

where for any z ∈ Xᵢ, x|ᵢ z denotes the vector x ∈ ∏ᵢ₌₁ⁿ Xᵢ in which xᵢ has been replaced by z.
Theorem 3.8 (Nash Equilibrium Theorem (Nash 1950)) Assume that for each i = 1,…,n the set Xᵢ is convex compact and the function fᵢ(x) is continuous on X and concave in xᵢ. Then there exists a Nash equilibrium.
Proof The set X := ∏ᵢ₌₁ⁿ Xᵢ is convex and compact as the Cartesian product of n convex compact sets. Consider the function

F(x, y) := Σᵢ₌₁ⁿ [fᵢ(x) − fᵢ(x|ᵢ yᵢ)], x ∈ X, y ∈ X.   (3.24)

This is a function continuous in x and convex in y. The latter is due to the assumption that for each i = 1,…,n the function fᵢ(x) is concave in xᵢ, which means that the function z ↦ fᵢ(x|ᵢ z) is concave. Since y|ᵢ yᵢ = y we also have F(y, y) = 0 ∀y ∈ X. Therefore, by the Ky Fan inequality (Theorem 3.6), there exists x̄ ∈ X satisfying

F(x̄, y) = Σᵢ₌₁ⁿ [fᵢ(x̄) − fᵢ(x̄|ᵢ yᵢ)] ≥ 0 ∀y ∈ X.

Fix an arbitrary i and take y ∈ X such that yⱼ = x̄ⱼ for j ≠ i. Writing the above inequality as

fᵢ(x̄) − fᵢ(x̄|ᵢ yᵢ) + Σ_{j≠i} [fⱼ(x̄) − fⱼ(x̄|ⱼ x̄ⱼ)] ≥ 0,

and noting that x̄|ⱼ x̄ⱼ = x̄, we deduce the inequality fᵢ(x̄) ≥ fᵢ(x̄|ᵢ yᵢ) ∀yᵢ ∈ Xᵢ. Since this holds for every i = 1,…,n, x̄ is a Nash equilibrium. ∎
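For a concrete instance where Theorem 3.8 applies, take the mixed extension of matching pennies (our example): each Xᵢ is the probability simplex in R², hence convex and compact, and each payoff is linear, so in particular concave, in the player's own strategy. The sketch below checks the Nash inequalities for the well-known equilibrium ((1/2, 1/2), (1/2, 1/2)).

```python
import numpy as np

# Mixed extension of matching pennies: payoffs x^T P_i y, zero-sum.
P1 = np.array([[1.0, -1.0], [-1.0, 1.0]])   # payoff matrix of player 1
P2 = -P1                                    # payoff matrix of player 2

x_eq = np.array([0.5, 0.5])
y_eq = np.array([0.5, 0.5])

# No unilateral deviation on a grid of the simplex improves either payoff.
for t in np.linspace(0.0, 1.0, 101):
    dev = np.array([t, 1.0 - t])
    assert dev @ P1 @ y_eq <= x_eq @ P1 @ y_eq + 1e-12
    assert x_eq @ P2 @ dev <= x_eq @ P2 @ y_eq + 1e-12
print("((1/2,1/2), (1/2,1/2)) is a Nash equilibrium of matching pennies")
```

Since the deviation payoff is linear in the deviation, checking it on a grid of the simplex is equivalent to checking it at the two pure strategies.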
∃ȳ ∈ ∂f(x̄): ⟨ȳ, x − x̄⟩ ≥ 0 ∀x ∈ C.

⟨ȳ, x − x̄⟩ ≥ 0 ∀x ∈ C.

Proof By Proposition 3.2, Φ is a closed map and its range D = Φ(C) is compact. Define F(x, y) = ⟨x, y⟩ − min_{z∈C} ⟨z, y⟩. This function is continuous in y and convex in x. By Theorem 3.4, (3.21), there exists ȳ ∈ D such that
Let (x̄, ȳ) be a cluster point of (xᵏ, yᵏ). Then x̄ ∈ C, ȳ ∈ Φ(x̄) because the map Φ is closed, and ⟨x̄, ȳ⟩ − min_{z∈C} ⟨z, ȳ⟩ ≤ 0, i.e., ⟨ȳ, x − x̄⟩ ≥ 0 ∀x ∈ C, as was to be proved. ∎
3.4 Exercises
1 Derive the Ekeland variational principle from Theorem 3.2. (Hint: Let x_ε be an ε-approximate minimizer of f(x) on X and δ > 0. Define P(x) = {y ∈ X | (ε/δ)‖y − x_ε‖ ≤ f(x_ε) − f(y)} and show that if Ekeland's theorem is false there exists for every x ∈ X a y ∈ P(x) satisfying (ε/δ)‖x − y‖ < f(x) − f(y). Then apply Caristi's theorem to derive a contradiction.)
2 Let S be an n-simplex spanned by a⁰, a¹,…,aⁿ. For each i = 0,1,…,n let Fᵢ be the facet of S opposite the vertex aⁱ (i.e., the (n−1)-simplex spanned by the n points aʲ, j ≠ i), and Lᵢ a given closed subset of S. Show that the KKM Lemma is equivalent to saying that if the following condition is satisfied:

S ⊂ ∪ⁿᵢ₌₀ Lᵢ, Fᵢ ⊂ Lᵢ ∀i,

then ∩ⁿᵢ₌₀ Lᵢ ≠ ∅.
3 Let S, Fᵢ, Lᵢ, i = 0,1,…,n be as in the previous exercise. Show that the KKM Lemma is equivalent to saying that if the following condition is satisfied:

S ⊂ ∪ⁿᵢ₌₀ Lᵢ, aⁱ ∈ Lᵢ, Fᵢ ∩ Lᵢ = ∅,

then ∩ⁿᵢ₌₀ Lᵢ ≠ ∅.
4 Derive the KKM Lemma from the Brouwer fixed point theorem. (Hint: Let S, L₀, L₁,…,Lₙ be as in the previous exercise, ρ(x, Lᵢ) = min{‖x − y‖ | y ∈ Lᵢ}. For k > 0 set Hᵢ = {x ∈ S | ρ(x, Lᵢ) ≥ 1/k} and define a continuous map f: S → S by

fᵢ(x) = xᵢ₋₁ρ(x, Hᵢ) / Σⁿᵢ₌₁ xᵢρ(x, Hᵢ),

with the convention x₀ = xₙ. Show that if x = f(x) then fᵢ(x) > 0 ∀i, hence ρ(x, Hᵢ) > 0, i.e., ρ(x, Lᵢ) < 1/k for every i = 0,1,…,n. Since k can be arbitrary, deduce that ∩ⁿᵢ₌₀ Lᵢ ≠ ∅.)
5 Let C₁,…,Cₙ be n closed convex sets in Rᵏ such that their union is a convex set C. Show that if the intersection of any n − 1 sets among them is nonempty, then the intersection of all n sets is nonempty. (Hint: Let aⁱ ∈ Lᵢ := ∩_{j≠i} Cⱼ and apply the KKM Lemma to the set {a¹,…,aⁿ} and the collection of closed sets {L₁,…,Lₙ}.)
6 Let C be a compact convex subset of Rⁿ, D a closed convex subset of Rᵐ, f(x, y): C×D → R a function which is quasiconvex, l.s.c. in x and quasiconcave, u.s.c. in y. Show that for every β > γ := sup_{y∈D} inf_{x∈C} f(x, y) the set-valued map φ from D to C such that φ(y) := {x ∈ C | f(x, y) ≤ β} ∀y ∈ D is closed, and using this fact deduce Sion's theorem from Theorem 3.4. (Hint: show that the set-valued map φ is u.s.c., hence closed because C is compact.)
Chapter 4
DC Functions and DC Sets
4.1 DC Functions
Actually most functions encountered in practice are dc, as will be shortly demon-
strated.
Functions which can be represented as differences of convex functions were
considered some 40 years ago by Alexandrov (1949, 1950) and Landis (1951), and
some time later by Hartman (1959) who proved a number of important properties
to be discussed below. For our purpose, the first fundamental property of these
functions is their stability relative to operations frequently used in optimization
theory.
where the first is a convex inequality, and the second is a reverse convex inequality.
Thus, any finite system of dc inequalities can be rewritten as a system of one convex
inequality and one reverse convex inequality. As will be seen later, this property
allows a compact description of a wide class of nonconvex optimization problems
and provides the basis for a unified theory of global optimization.
The next property shows that virtually all of the most frequently encountered functions in practice are dc. As usual, Cᵏ(Rⁿ) (or simply Cᵏ) denotes the class of functions on Rⁿ continuously differentiable up to order k.
Proposition 4.2 Every function f ∈ C²(Rⁿ) is dc on any compact convex set Ω ⊂ Rⁿ.
Proof We show that, given any compact convex set Ω ⊂ Rⁿ, the function g(x) = f(x) + ρ‖x‖² becomes convex on Ω when ρ is sufficiently large (then f(x) = g(x) − ρ‖x‖² yields a dc representation of f on Ω). Indeed, since ⟨u, ∇²g(x)u⟩ = ⟨u, ∇²f(x)u⟩ + 2ρ‖u‖², if ρ is so large that
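The construction in this proof can be made concrete on a small example of our own choosing: for f(x₁, x₂) = x₁x₂ the Hessian is constant with eigenvalues ±1, so already ρ = 1/2 makes g(x) = f(x) + ρ‖x‖² convex.

```python
import numpy as np

# dc decomposition of f(x1, x2) = x1*x2: Hessian [[0,1],[1,0]] is
# indefinite, but adding rho*||x||^2 shifts its spectrum by 2*rho.
rho = 0.5
H_f = np.array([[0.0, 1.0], [1.0, 0.0]])       # Hessian of f (constant)
H_g = H_f + 2.0 * rho * np.eye(2)              # Hessian of g = f + rho||x||^2

assert np.min(np.linalg.eigvalsh(H_f)) < 0          # f is not convex
assert np.min(np.linalg.eigvalsh(H_g)) >= -1e-12    # g is convex
# hence f(x) = g(x) - rho*||x||^2 is a difference of convex functions
print("f = g - rho*||x||^2 with g convex: dc decomposition verified")
```

Here the Hessian is constant, so a single eigenvalue check suffices; for a general f ∈ C², ρ must dominate the most negative Hessian eigenvalue over the compact set Ω.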
Proof By the Weierstrass theorem there exists a polynomial g(x) satisfying the required condition, and obviously g(x) ∈ C². ∎
Thus DC(Ω) is dense in C(Ω), the Banach space of continuous functions on Ω equipped with the sup-norm. Though dc functions occur frequently in practice, they often appear in a hidden, not directly recognizable, form. To help identify dc functions in various situations we prove some further properties of these functions.
A function f: D → R defined on a convex set D ⊂ Rⁿ is said to be locally dc if for every x ∈ D there exist a convex open neighborhood U of x and a pair of convex functions g, h on U such that f|_U = g|_U − h|_U. Note that since U can be assumed bounded, the sets ∪{∂g(y) | y ∈ U} and ∪{∂h(y) | y ∈ U} are bounded by Theorem 2.6, so by Corollary 2.10 the functions g, h can be assumed convex finite on all of Rⁿ.
ℓₜ(f₁,…,f_m) = a₀ₜ + Σᵢ₌₁ᵐ aᵢₜfᵢ⁺ − Σᵢ₌₁ᵐ aᵢₜfᵢ⁻

            = [ a₀ₜ + Σᵢ₌₁ᵐ (M + aᵢₜ)fᵢ⁺ + Σᵢ₌₁ᵐ (M − aᵢₜ)fᵢ⁻ ]

              − M Σᵢ₌₁ᵐ (fᵢ⁺ + fᵢ⁻) = pₜ − q,
the product of two dc functions is dc; if f(x) is dc on an open (or closed) convex set Ω and f(x) ≠ 0 for all x ∈ Ω, then 1/f(x) and |f(x)|^{1/m} are dc on Ω.
Proof Indeed, C²-smooth functions are dc. ∎
The above properties explain why an overwhelming majority of functions of
practical interest are dc.
Remark 4.1 Following McCormick (1972) a function f .x/ is called a factorable
function if it results from a finite sequence of compositions of transformed sums and
products of simple functions of one variable. In other words, a factorable function
is a function which can be obtained as the last in a sequence of functions f1 ; f2 ; : : :,
built up as follows:
$$f_i(x) = x_i \quad (i = 1,\dots,n) \qquad (4.4)$$
where $g(x)$ is a convex function and $K$ is a positive constant satisfying $K \ge |q'_+(0)|$. Using Proposition 4.5 one can see, for example, that the function $we^{-\theta\|x-a\|}$ (with $w > 0$, $\theta > 0$) is dc, since it equals $q(\|x-a\|)$ where $q(t) = we^{-\theta t}$ is a convex decreasing function with $q'_+(0) = -w\theta > -\infty$. A generalization of Proposition 4.5 is the following:
Proposition 4.6 Let $h(x) = u(x) - v(x)$, where $u, v : M \to \mathbb{R}_+$ are convex functions on a compact convex set $M \subset \mathbb{R}^m$ such that $h(x) \ge 0$ $\forall x \in M$. If $q : \mathbb{R}_+ \to \mathbb{R}$ is a convex nonincreasing function such that $q'_+(0) > -\infty$, then $q(h(x))$ is a dc function on $M$.
and consequently, with $K \ge |q'_+(0)|$,
$$q(u(x)-v(x)) = \sup_{\tau\in\mathbb{R}_+}\{q(\tau) - \tau q'_+(\tau) + (K + q'_+(\tau))u(x) + (K - q'_+(\tau))v(x)\} - K(u(x)+v(x)).$$
We contend that
$$g(x) = \sup_{\tau\in\mathbb{R}_+}\{q(\tau) - \tau q'_+(\tau) + (K + q'_+(\tau))u(x) + (K - q'_+(\tau))v(x)\}$$
is convex. Indeed, since $q(t)$ is convex, $q'_+(\tau) \ge q'_+(0)$ and hence $K + q'_+(\tau) \ge K + q'_+(0) \ge 0$ for all $\tau \ge 0$; furthermore, since $q(t)$ is nonincreasing, $q'_+(\tau) \le 0$ and hence $K - q'_+(\tau) \ge K > 0$ for all $\tau \ge 0$. It follows that for each fixed $\tau \in \mathbb{R}_+$ the function $x \mapsto q(\tau) - \tau q'_+(\tau) + (K + q'_+(\tau))u(x) + (K - q'_+(\tau))v(x)$ is convex, and $g(x)$, as the pointwise supremum of a family of convex functions, is itself convex. Thus $q(u(x)-v(x)) = g(x) - K(u(x)+v(x))$ is dc. $\square$
Analogously: Let $h(x) = u(x) - v(x)$, where $u, v : M \to \mathbb{R}_+$ are convex functions on a compact convex set $M \subset \mathbb{R}^m$ such that $h(x) \ge 0$ $\forall x \in M$. If $q : \mathbb{R}_+ \to \mathbb{R}$ is a concave nondecreasing function such that $q'_+(0) < +\infty$, then $q(h(x))$ is a dc function on $M$.
Proof Clearly $q(t)$ is convex. To show that $f(t) + q(t)$ is convex, observe that for every $i \in I$:
[Fig. 4.1: graphs of $p(t)$ and $q(t)$]
$$(f+q)\big|_{U_i} = (f + \varphi_i)\big|_{U_i} + \Big(\sum_{j \in I\setminus\{i\}}\varphi_j\Big)\Big|_{U_i} = \Big(\sum_{j \in I\setminus\{i\}}\varphi_j\Big)\Big|_{U_i}.$$
Since every $\varphi_j$ with $j \in I\setminus\{i\}$ is affine on $U_i$, it follows that $f(t) + q(t)$ is affine on every $U_i$, $i \in I$. On the other hand, for $i \notin I$:
$$(f+q)\big|_{U_i} = \Big(f + \sum_{j\in I}\varphi_j\Big)\Big|_{U_i},$$
and since on such $U_i$, $f(t)$ is convex while every $\varphi_j$ with $j \in I$ is affine, it follows that $f(t) + q(t)$ is also convex on every $U_i$, $i \notin I$. Thus, every $t \in [0,a]$ has a neighborhood in which $f(t)+q(t)$ is convex. Hence, $f(t)+q(t)$ is convex on $[0,a]$, as was to be proved (Fig. 4.1). $\square$
Proposition 4.9 can be extended to continuous functions $f(t) : [0,a] \to \mathbb{R}$ that are piecewise convex–concave in the following sense: there exists a sequence $e_0 := 0 < e_1 < e_2 < \dots < e_k < a =: e_{k+1}$ such that for each interval $V_i = [e_i, e_{i+1}]$, $i = 0,\dots,k$, the function $f_i(t) := f(t)|_{V_i}$ is convex or concave, with finite $f'_+(e_i)$, $f'_-(e_{i+1})$, and moreover, $f_i(t)$ is convex whenever $f_{i-1}(t)$ is concave and conversely. Denote $I = \{i \mid f_i(t) \text{ is concave}\}$. For each $i \in I$ let $\varphi_i(t) : [0,a] \to \mathbb{R}$ be the continuous convex function which agrees with $-f(t)$ for $t \in V_i$ and is affine outside $V_i$. Then, by an argument analogous to the proof of Proposition 4.9, it is easily proved that $f(t) = p(t) - q(t)$, where
$$q(t) = \sum_{i\in I}\varphi_i(t)$$
which is obviously dc. A useful consequence of the above result is the following (see, e.g., Thach 1992):

Corollary 4.3 Any system of quadratic inequalities
$$f_i(x) := \langle x, Q_ix\rangle + \langle b_i, x\rangle + d_i \le 0, \quad i = 1,\dots,m,$$
where the $Q_i$ are symmetric matrices, $b_i \in \mathbb{R}^n$, $d_i \in \mathbb{R}$, is equivalent to a single dc inequality
$$g(x) - \rho\|x\|^2 \le 0,$$
where $g(x)$ is a convex function and $\rho$ is a constant large enough that each $x \mapsto f_i(x) + \rho\|x\|^2$ is convex, e.g., $\rho \ge \max_{i=1,\dots,m}\lambda_{\max}(-Q_i)$.
Proof Since the $i$th inequality can be rewritten as $g_i(x) - \rho\|x\|^2 \le 0$, with $g_i(x) := f_i(x) + \rho\|x\|^2$ convex, the system is equivalent to the single inequality $g(x) - \rho\|x\|^2 \le 0$, where $g(x) := \max_{i=1,\dots,m}g_i(x)$ is convex. $\square$

For nonnegative convex functions $f_1(x), f_2(x)$, a dc representation of the product is
$$f_1(x)f_2(x) = \frac{1}{2}[f_1(x)+f_2(x)]^2 - \frac{1}{2}[f_1^2(x)+f_2^2(x)].$$
Proof Clearly $f_1(x)+f_2(x)$ is convex and nonnegative-valued, and since the univariate function $q(t) = t^2$ is convex increasing on $\mathbb{R}_+$, by Proposition 2.8 each term in the above difference is convex. $\square$
Using this proposition and the fact that for every natural $m$ the univariate function $t \mapsto t^m$ is convex on $\mathbb{R}_+$, one can easily obtain a dc representation for any monomial of the form $x_1^{p_1}x_2^{p_2}\cdots x_n^{p_n}$, and hence a dc representation for $P(x)$ on $\mathbb{R}^n_+$.
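The product identity above is easy to verify numerically. A minimal sketch, with two sample nonnegative convex functions of our own choosing:

```python
import numpy as np

# The dc split of a product of two nonnegative convex functions, per the identity
# f1*f2 = ((f1+f2)^2 - (f1^2 + f2^2)) / 2. Both halves are convex: each is half the
# square of a nonnegative convex function, and t -> t^2 is convex increasing on R+.
# f1, f2 below are our own sample functions.

f1 = lambda x: x**2           # convex, >= 0 on R
f2 = lambda x: abs(x - 1.0)   # convex, >= 0 on R

p = lambda x: 0.5 * (f1(x) + f2(x))**2     # first convex part
q = lambda x: 0.5 * (f1(x)**2 + f2(x)**2)  # second convex part

xs = np.linspace(-3, 3, 601)
identity_ok = np.allclose(f1(xs) * f2(xs), p(xs) - q(xs))

# Midpoint convexity spot-check of both parts.
rng = np.random.default_rng(1)
a, b = rng.uniform(-3, 3, 400), rng.uniform(-3, 3, 400)
convex_ok = (np.all(p((a + b) / 2) <= (p(a) + p(b)) / 2 + 1e-9)
             and np.all(q((a + b) / 2) <= (q(a) + q(b)) / 2 + 1e-9))
print(identity_ok, convex_ok)
```

Applying the same identity coordinate by coordinate gives the dc representation of a monomial $x_1^{p_1}\cdots x_n^{p_n}$ on $\mathbb{R}^n_+$ mentioned in the text.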
4.6 DC Sets
In this and the next sections, by a dc set we mean a set $M \subset \mathbb{R}^n$ for which there exist convex functions $g, h : \mathbb{R}^n \to \mathbb{R}$ such that $M = \{x \mid g(x) \le 0,\ h(x) \ge 0\}$ (so $M = D\setminus C$, where $D = \{x \mid g(x) \le 0\}$ and $C = \{x \mid h(x) < 0\}$). Since $M = \{x \mid \max[g(x), -h(x)] \le 0\}$, it follows that a dc set can also be defined by a dc inequality. Conversely, if a set $S$ is defined by a dc inequality, $S = \{x \mid g(x) - h(x) \le 0\}$ with $g(x)$ and $h(x)$ convex, then clearly $S = \{x \mid (x,t) \in M \text{ for some } t\}$, where $M = \{(x,t) \in \mathbb{R}^n\times\mathbb{R} \mid g(x) - t \le 0,\ t - h(x) \le 0\}$; therefore, $S$ can be obtained as the projection on $\mathbb{R}^n$ of the dc set $M \subset \mathbb{R}^{n+1}$.
Surprisingly, dc sets are not so different from arbitrary closed sets. This can be seen from the observation of Asplund (1973) that for any closed set $S \subset \mathbb{R}^n$, if $d(x,S) = \inf\{\|x-y\| \mid y \in S\}$ denotes the distance from $x$ to $S$, then the function $x \mapsto \|x\|^2 - d^2(x,S) = \sup\{2\langle x, y\rangle - \|y\|^2 \mid y \in S\}$ is convex. Thus, $S = \{x \mid d^2(x,S) \le 0\}$, where $d^2(x,S)$ is a dc function.
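Asplund's observation can be tested directly: for a finite (hence closed) set $S$ the supremum is a maximum of affine functions of $x$, so the difference must pass any midpoint convexity check. The point set below is our own example:

```python
import numpy as np

# Asplund's observation: for any closed S, x -> ||x||^2 - d^2(x,S) equals
# sup_{y in S} (2<x,y> - ||y||^2), a supremum of affine functions of x, hence convex.
# We take a deliberately nonconvex finite S in R^2 and spot-check midpoint convexity.

S = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0], [-1.5, 1.0]])  # sample closed set

def F(x):
    d2 = np.min(np.sum((S - x)**2, axis=1))  # squared distance from x to S
    return np.dot(x, x) - d2

rng = np.random.default_rng(2)
A = rng.uniform(-4, 4, (500, 2))
B = rng.uniform(-4, 4, (500, 2))
ok = all(F((a + b) / 2) <= 0.5 * F(a) + 0.5 * F(b) + 1e-9 for a, b in zip(A, B))
print(ok)
```

Note that $d^2(x,S)$ itself is neither convex nor concave here; it is the combination $\|x\|^2 - d^2(x,S)$ that is convex, which is exactly what makes $d^2(x,S)$ a dc function.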
Clearly the function $y \mapsto \tilde h(y) := h(y) - h(x) - \langle p, y - x\rangle$ satisfies $\tilde h(x) = 0 \le \tilde h(y)$ $\forall y$. It follows that $\tilde h(y) > 0$ $\forall y \neq x$ for, by the strict convexity of $\tilde h$, if $\tilde h(y) = 0$ for some $y \neq x$ then $\tilde h\big(\frac{x+y}{2}\big) < 0$, a contradiction. Thus the level set $C_0 = \{y \mid \tilde h(y) \le 0\} = \{x\}$ is bounded, and consequently, so is the level set $C_1 = \{y \mid \tilde h(y) \le 1\}$. But in view of (4.11) we may assume $\tilde h(y^k) \le 1$ $\forall k$. Therefore, the sequence $\{y^k\}$ is bounded. For any cluster point $\bar y$ of this sequence we have $\tilde h(\bar y) = 0$, hence $\bar y = x$, and since $S$ is closed and $y^k \in S$, it follows that $x = \bar y \in S$. $\square$
Now let $\varepsilon$ be any positive number and $r : \mathbb{R}^n \to \mathbb{R}_+$ any function such that:
$$r(x) = 0 \quad \forall x \in S; \qquad (4.12)$$
$$0 < r(y) \le \min\{\varepsilon, d(y)\} \quad \forall y \notin S. \qquad (4.13)$$
Define
Proposition 4.13 Let h W Rn ! R be any given strictly convex function. For every
closed set S Rn the function gS .x/ defined by (4.14) is closed, convex, finite
everywhere and satisfies
Proof We will assume that $S$ is neither empty nor the whole space (otherwise the proposition is trivial). The function $g_S(x)$ is closed and convex as the pointwise supremum of a family of affine functions $x \mapsto h(y) + \langle p, x - y\rangle + r^2(y)$. It is finite everywhere because for every $x \in \mathbb{R}^n$:
Later, in Sect. 7.5, we will see how this representation can be used for solving
continuous optimization problems.
Corollary 4.4 (Thach 1993a) Any closed set in Rn is the projection on Rn of a dc
set in RnC1 (Fig. 4.2).
Proof From (4.15), $S = \{x \mid \exists(x,t) \in \mathbb{R}^{n+1},\ g_S(x) \le t,\ t \le h(x)\}$, so $S$ is the projection of $D\setminus C \subset \mathbb{R}^{n+1}$ on $\mathbb{R}^n$, for $D = \{(x,t) \mid g_S(x) \le t\}$, $C = \{(x,t) \mid h(x) < t\}$. $\square$
[Fig. 4.2: the sets $C$, $D$, $E_1$, $E_2$]
Corollary 4.5 For any lower semi-continuous function $f(x)$ there exists a convex function $g(x)$ such that the inequality $f(x) \le 0$ is equivalent to the dc inequality
$$g(x) - \|x\|^2 \le 0.$$
Proof (i) Choose a function $r : \mathbb{R}^n \to \mathbb{R}_+$ satisfying (4.12), (4.13) and such that
By Lemma 4.1 the function $r(y) = d(y)$ satisfies this condition. Let $p \in \partial h(y)$, $l(x) = h(y) + \langle p, x - y\rangle + r^2(y)$. Clearly, $l(y) = h(y) + r^2(y) > h(y)$, while from (4.14): $l(x) \le g_S(x) \le h(x)$ $\forall x \in S$. We will say that the reverse convex set $D(y) = \{x \mid l(x) - h(x) \le 0\}$ separates $S$ from $y$.
(ii) Consider a grid of points $\{y^i \mid i = 1,2,\dots\} \subset \mathbb{R}^n\setminus S$ which is dense in $\mathbb{R}^n\setminus S$ (e.g., the grid of all points with rational coordinates in $\mathbb{R}^n\setminus S$). For each $y^i$ let $p^i \in \partial h(y^i)$, $l_i(x) = h(y^i) + \langle p^i, x - y^i\rangle + r^2(y^i)$, so that the set $D_i = \{x \mid l_i(x) - h(x) \le 0\}$ separates $S$ from $y^i$. We contend that
$$S = \bigcap_{i=1}^{\infty}D_i.$$
Most often, however, f .x/ is given as a factorable function (see Sect. 3.3) whose
explicit dc representation is not readily available and generally very hard to
compute. In such cases, using the factorable structure the computation of a convex
minorant (or concave majorant) of f .x/ can be reduced to solving the following two
problems (McCormick 1982):
PROBLEM 4.1 Given a function $F(t)$ of one variable, and a function $t(x)$ defined on a convex set $S \subset \mathbb{R}^n$, find a convex minorant (concave majorant) of $F(t(x))$ on the set $\{x \in S \mid a \le t(x) \le b\}$.

Assume that a convex minorant $p(x)$ and a concave majorant $q(x)$ of $t(x)$ are available. Denote the convex envelope and the concave envelope of $F(t)$ on the segment $a \le t \le b$ by $\varphi(t)$ and $\psi(t)$, respectively. Let $\alpha := \inf\{F(t) \mid a \le t \le b\} = F(t_{\min})$, $\beta := \sup\{F(t) \mid a \le t \le b\} = F(t_{\max})$. Also, for any three real numbers $u, v, w$ denote by $\mathrm{mid}\{u,v,w\}$ the middle number, i.e., the number lying between the two others.
Proposition 4.15 The function $x \mapsto \varphi(\mathrm{mid}\{p(x), q(x), t_{\min}\})$ is a convex minorant of $F(t(x))$ on the set $\{x \in S \mid a \le t(x) \le b\}$.

Proof Write $\varphi^D(t) := \varphi(\min\{t, t_{\min}\}) - \alpha$ and $\varphi^I(t) := \varphi(\max\{t, t_{\min}\}) - \alpha$, so that $\varphi^D$ is convex decreasing and $\varphi^I$ is convex increasing. Then $\varphi(t) = \varphi^D(t) + \varphi^I(t) + \alpha$. Observe that since $\varphi^D(t)$ is decreasing and $t(x) \le q(x)$, we have $\varphi^D(t(x)) \ge \varphi^D(q(x)) = \varphi(\min\{q(x), t_{\min}\}) - \alpha$. Similarly, $\varphi^I(t(x)) \ge \varphi(\max\{p(x), t_{\min}\}) - \alpha$. Hence,
$$F(t(x)) \ge \varphi(t(x)) = \varphi^D(t(x)) + \varphi^I(t(x)) + \alpha \ge \varphi(\min\{q(x), t_{\min}\}) + \varphi(\max\{p(x), t_{\min}\}) - \alpha \qquad (4.18)$$
$$= \varphi(\mathrm{mid}\{p(x), q(x), t_{\min}\}).$$
The convexity of this function follows from the expression (4.18), because $\varphi(\min\{q(x), t_{\min}\})$ is convex as a convex decreasing function of the concave function $\min\{q(x), t_{\min}\}$, and $\varphi(\max\{p(x), t_{\min}\})$ is convex as a convex increasing function of the convex function $\max\{p(x), t_{\min}\}$. $\square$
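The "mid" rule of Proposition 4.15 is easy to exercise on a small univariate example. In the sketch below all concrete choices ($F$, $t$, $p$, $q$) are our own illustration, not taken from the text; since $F$ is already convex, its convex envelope $\varphi$ is $F$ itself:

```python
import numpy as np

# Proposition 4.15 in action: with phi the convex envelope of F on [a,b],
# p(x) a convex minorant and q(x) a concave majorant of t(x), the composite
# phi(mid{p(x), q(x), t_min}) is a convex underestimator of F(t(x)).

t = lambda x: 4.0 * x * (1.0 - x)      # concave on [0,1], range [0,1]
p = lambda x: 0.0 * x                  # convex minorant (secant through endpoints)
q = t                                  # concave majorant (t itself is concave)
phi = lambda s: (s - 0.6)**2           # F convex => phi = F; minimized at t_min
t_min = 0.6

def mid(u, v, w):
    # elementwise middle value of three arrays
    return np.median(np.stack([u, v, w]), axis=0)

xs = np.linspace(0.0, 1.0, 501)
minorant = phi(mid(p(xs), q(xs), np.full_like(xs, t_min)))

under_ok = np.all(minorant <= phi(t(xs)) + 1e-12)          # it underestimates
convex_ok = np.all(np.diff(minorant, 2) >= -1e-9)          # second differences >= 0
print(under_ok, convex_ok)
```

On a uniform grid, nonnegative second differences are the discrete counterpart of convexity, which is why the second check is a fair proxy here.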
PROBLEM 4.2 Given two functions $u(x)$, $v(y)$ on $\mathbb{R}^n$, $\mathbb{R}^m$ resp., find a convex minorant (concave majorant) of the product $f(x,y) = u(x)v(y)$ on the set $\{(x,y) \mid p \le u(x) \le q,\ r \le v(y) \le s\}$.
Lemma 4.2 The system of inequalities
$$p \le u \le q, \quad r \le v \le s \qquad (4.19)$$
implies
$$\max\{ru + pv - pr,\ su + qv - qs\} \le uv \le \min\{su + pv - ps,\ ru + qv - qr\}.$$

Proposition 4.16 If $u(x), v(y)$ are convex, then $\max\{ru(x) + pv(y) - pr,\ su(x) + qv(y) - qs\}$ is a convex minorant of $u(x)v(y)$ on the set (4.19); if $u(x), v(y)$ are concave, then $\min\{su(x) + pv(y) - ps,\ ru(x) + qv(y) - qr\}$ is a concave majorant of $u(x)v(y)$ on this set.

Proof If $u(x), v(y)$ are convex, the function $\max\{ru(x) + pv(y) - pr,\ su(x) + qv(y) - qs\}$ is convex and by the above lemma it is a minorant of $u(x)v(y)$ for $p \le u(x) \le q$, $r \le v(y) \le s$. Similarly, if $u(x), v(y)$ are concave, the function $\min\{su(x) + pv(y) - ps,\ ru(x) + qv(y) - qr\}$ is a concave majorant of $u(x)v(y)$. $\square$
In particular, on a rectangle $p \le x \le q$, $0 < r \le y \le s$, the quotient $x/y$ admits the concave majorant
$$\min\Big\{\frac{x}{r} + \frac{p}{y} - \frac{p}{r},\ \frac{x}{s} + \frac{q}{y} - \frac{q}{s}\Big\}.$$

Proof This follows from Proposition 4.16 when $u(x) = x$, $v(y) = y$ and when $u(x) = x$, $v(y) = \frac{1}{y}$ (note that the function $\frac{1}{y}$ is concave for $y > 0$). $\square$
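The four planes of Lemma 4.2 (often called McCormick bounds in the later literature) and the quotient majorant can both be checked on a grid. The bound values below are our own example:

```python
import numpy as np

# Spot-check of Lemma 4.2 / Proposition 4.16 on the rectangle
# p <= u <= q, r <= v <= s: two planes lie below the product uv and two above it.
# The quotient x/y is handled via u = x, v = 1/y (concave for y > 0), whose range
# on [r,s] is [1/s, 1/r].

p_, q_, r_, s_ = 0.5, 2.0, 1.0, 3.0   # sample bounds, our own choice
U, V = np.meshgrid(np.linspace(p_, q_, 60), np.linspace(r_, s_, 60))

under = np.maximum(r_ * U + p_ * V - p_ * r_, s_ * U + q_ * V - q_ * s_)
over = np.minimum(s_ * U + p_ * V - p_ * s_, r_ * U + q_ * V - q_ * r_)
prod_ok = np.all(under <= U * V + 1e-12) and np.all(U * V <= over + 1e-12)

# Concave majorant of x/y from Proposition 4.16 with v = 1/y.
major = np.minimum(U / r_ + p_ / V - p_ / r_, U / s_ + q_ / V - q_ / s_)
quot_ok = np.all(U / V <= major + 1e-12)
print(prod_ok, quot_ok)
```

Each inequality in the lemma comes from expanding a product of signed slacks, e.g., $(u - p)(v - r) \ge 0$ gives $uv \ge ru + pv - pr$, and similarly for the other three.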
4.8 The Minimum of a DC Function

As we saw in Sect. 2.11, any local minimum of a convex function on a convex set $C$ is also a global minimum. This important property is no longer true for general dc functions, but it partially extends to quadratic functions, i.e., to functions of the form $f(x) = \frac{1}{2}\langle x, Qx\rangle + \langle a, x\rangle + \alpha$, where $Q$ is a symmetric $n\times n$ matrix.

Proposition 4.18 Any local minimum (or maximum, resp.) of a quadratic function $f(x)$ on an affine set $E \subset \mathbb{R}^n$ is a global minimum (maximum, resp.).
Proof If $x^0$ is a local minimizer of $f(x)$ on $E$, there exists a ball $B$ centered at $x^0$ such that $f(x^0) \le f(x)$ $\forall x \in B\cap E$. For any $z \in E\setminus\{x^0\}$ let $L$ be the line through $x^0$ and $z$, i.e., $L = \{x \in \mathbb{R}^n \mid x = x^0 + t(z - x^0),\ t \in (-\infty, +\infty)\}$. Then the restriction of $f$ to $L$ is the function $\varphi(t) = f(x^0 + t(z - x^0))$, $t \in \mathbb{R}$. Since $f(x)$ is a quadratic function on $\mathbb{R}^n$, $\varphi(t)$ is a quadratic function of the real variable $t$. This function attains a local minimum at $t = 0$, which must then be a global minimum, being the minimum of a quadratic function of a real variable. So $\varphi(0) \le \varphi(t)$ $\forall t \in \mathbb{R}$; in particular $f(z) = \varphi(1) \ge \varphi(0) = f(x^0)$, i.e., $f(z) \ge f(x^0)$ $\forall z \in E$. $\square$
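Proposition 4.18 can be probed numerically: restrict an indefinite quadratic to an affine set via a null-space parametrization, locate a stationary point of the restriction, and confirm that no sampled point of the affine set does better. The matrix and affine set below are our own example:

```python
import numpy as np

# Proposition 4.18, checked numerically: restrict a nonconvex quadratic to an
# affine set E = {x_part + N t}; a local minimum of the restriction is global on E.

rng = np.random.default_rng(3)
Q = np.diag([2.0, -1.0, 3.0])           # indefinite symmetric matrix
a = np.array([1.0, 0.0, -2.0])
f = lambda x: 0.5 * x @ Q @ x + a @ x

# Affine set E: x1 + x2 + x3 = 1, with null-space basis N (columns).
x_part = np.array([1.0, 0.0, 0.0])
N = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])

# Reduced quadratic in t: 0.5 t'(N'QN)t + (N'(Q x_part + a))'t + const.
Qr = N.T @ Q @ N
br = N.T @ (Q @ x_part + a)
assert np.all(np.linalg.eigvalsh(Qr) > 0)  # reduced Hessian PD: a local min exists
t0 = np.linalg.solve(Qr, -br)
x0 = x_part + N @ t0                       # stationary (local) minimizer on E

# Sample many points of E: none beats x0.
T = rng.uniform(-50, 50, (2000, 2))
vals = np.array([f(x_part + N @ t) for t in T])
ok = np.all(f(x0) <= vals + 1e-8)
print(ok)
```

The experiment also illustrates the mechanism behind the proposition: on any line of $E$ the restriction is a univariate quadratic, which cannot have a non-global local minimum.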
Proposition 4.19 A quadratic function which is bounded below on Rn must be
convex. The minimum (or maximum) of an indefinite quadratic function on a
compact convex set C is always attained at a boundary point (which may not be
an extreme point).
Proof If a quadratic function $f(x)$ is not convex, there exist two distinct points $a, b \in \mathbb{R}^n$ and a number $t \in (0,1)$ such that $f((1-t)a + tb) > (1-t)f(a) + tf(b)$. So the function $\varphi(t) := f((1-t)a + tb)$, $t \in \mathbb{R}$, which is the restriction of $f$ to the line through $a, b$, is not convex. Then $\varphi(t)$ must be concave, because a quadratic function on $\mathbb{R}$ is either convex or concave. Hence, $\varphi(t)$ is unbounded below on $\mathbb{R}$, conflicting with $f(x)$ being bounded below on $\mathbb{R}^n$. This proves the first part of the proposition. Turning to the second part, without loss of generality we can assume that the affine hull of $C$ is the whole space $\mathbb{R}^n$. Since $f(x)$ is continuous and $C$ is compact convex, $f(x)$ attains a minimum on $C$. If this minimum were attained at an interior point of $C$, it would be a local minimum of $f(x)$ on $\mathbb{R}^n$, and hence, by Proposition 4.18, a global minimum. So $f(x)$ would be bounded below on $\mathbb{R}^n$, and hence convex, contrary to its indefiniteness. Analogously for the maximum. $\square$
By Proposition 2.31, a point $x^0$ achieves the minimum of a proper convex function $f(x)$ on $\mathbb{R}^n$ if and only if $0 \in \partial f(x^0)$. For general dc functions, a criterion for the global minimum is the following (Hiriart-Urruty 1989a,b; see also Tuy and Oettli 1994):

Proposition 4.20 Let $g : \mathbb{R}^n \to \mathbb{R}\cup\{+\infty\}$ be an arbitrary proper function and $h : \mathbb{R}^n \to \mathbb{R}\cup\{+\infty\}$ a convex proper l.s.c. function. A point $x^0 \in \mathrm{dom}\,g\cap\mathrm{dom}\,h$ is a global minimizer of $g(x) - h(x)$ on $\mathbb{R}^n$ if and only if
$$\partial_\varepsilon h(x^0) \subset \partial_\varepsilon g(x^0) \quad \forall\varepsilon > 0. \qquad (4.28)$$
Proof Without loss of generality we may assume that $g(x^0) = h(x^0) = 0$. Then $x^0$ is a global minimizer of $g(x) - h(x)$ on $\mathbb{R}^n$ if and only if $h(x) \le g(x)$ $\forall x \in \mathbb{R}^n$. It is easy to see that the latter condition is equivalent to (4.28). Indeed, observe that an affine function $\varphi(x) = \langle a, x - x^0\rangle + \varphi(x^0)$ is a minorant of a proper function $f(x)$ if and only if $a \in \partial_\varepsilon f(x^0)$ with $\varepsilon = f(x^0) - \varphi(x^0)$. Now, since $h(x)$ is a proper convex l.s.c. function on $\mathbb{R}^n$, it is known that $h(x) = \sup\{\varphi(x) \mid \varphi \in Q\}$, where $Q$ denotes the collection of all affine minorants of $h(x)$ (Theorem 2.5). Hence, the condition $h(x) \le g(x)$ $\forall x \in \mathbb{R}^n$ is equivalent to saying that every $\varphi \in Q$ is a minorant of $g$, i.e., by the above observation, that every $a \in \partial_\varepsilon h(x^0)$ belongs to $\partial_\varepsilon g(x^0)$, for every $\varepsilon > 0$. $\square$
Remark 4.4 Geometrically, the condition $h(x) \le g(x)$ $\forall x \in \mathbb{R}^n$, which expresses the global minimality of $x^0$, means
$$G \subset H, \qquad (4.29)$$
where $G, H$ are the epigraphs of the functions $g, h$, respectively. Since $h$ is proper convex and l.s.c., $H$ is a nonempty closed convex set in $\mathbb{R}^n\times\mathbb{R}$, and it is obvious that (4.29) holds if and only if every closed halfspace in $\mathbb{R}^n\times\mathbb{R}$ that contains $H$ also contains $G$ (see Strekalovski 1987, 1990, 1993). The latter condition, if restricted to nonvertical halfspaces, is nothing but a geometric expression of (4.28).
The next result exhibits an important duality relationship in dc minimization
problems which has found many applications (Toland 1978).
4.9 Exercises
for $t \ge 0$ ($\alpha$, $\beta$, $\gamma$ are positive numbers, and $\alpha < \beta$).
4
$$f(x) = \frac{\sum_{i=1}^n r_i(x_i)}{\beta_0 + \sum_{j=1}^k \beta_j x_j}, \quad x \ge 0;$$
5 $f(x) = x_1^{1/2} + \dfrac{1}{x_2} + \dfrac{4}{(x_1 - x_2)^2}$, $x \in \mathbb{R}^2_+$.
6 A function $f : C \to \mathbb{R}$ (where $C$ is an open convex subset of $\mathbb{R}^n$) is said to be weakly convex if there exists a constant $r > 0$ such that for all $x^1, x^2 \in C$ and all $\lambda \in [0,1]$:
$$f(\lambda x^1 + (1-\lambda)x^2) \le \lambda f(x^1) + (1-\lambda)f(x^2) + r\lambda(1-\lambda)\|x^1 - x^2\|^2.$$
Show that $f$ is weakly convex if and only if it is a dc function of the form $f(x) = g(x) - r\|x\|^2$, where $r > 0$ and $g$ is convex.
7 Show that a function $f : \mathbb{R}^n \to \mathbb{R}$ is weakly convex if and only if there exists a constant $r > 0$ such that for every $x^0 \in \mathbb{R}^n$ one can find a vector $y \in \mathbb{R}^n$ satisfying $f(x) \ge f(x^0) + \langle y, x - x^0\rangle - r\|x - x^0\|^2$ $\forall x \in \mathbb{R}^n$.
8 Find a convex minorant for the function $f(x) = e^{(x_1 + x_2 - x_3)^2} + x_1x_2x_3$ on $-1 \le x_i \le +1$, $i = 1,2,3$.
9 Same question for the function $f(x) = \log(x_1 + x_2^2) - \min\{x_1, x_2\}$ on $1 \le x_i \le 3$, $i = 1,2$.
10 (1) If $\bar x$ is a local minimizer of the function $f(x) = g(x) - h(x)$ over $\mathbb{R}^n$, then $0 \in \partial g(\bar x)\ominus\partial h(\bar x)$, where $A\ominus B = \{x \mid x + B \subset A\}$.
(2) If $\bar x$ is a global minimizer of the dc function $f = g - h$ over $\mathbb{R}^n$, then any point $y \in \partial h(\bar x)$ achieves a minimum of the function $h^* - g^*$ over $\mathrm{dom}\,g^*$.
Part II
Global Optimization
Chapter 5
Motivation and Overview
$$D = \{x \in X \mid g_i(x) \le 0,\ i = 1,\dots,m\}$$
$$f(x^*) \le f(x) \quad \forall x \in D,$$
$$f(x^0) \le f(x) \quad \forall x \in D\cap W$$
is called a local optimal solution (local minimizer). When the problem is convex,
i.e., when all the functions f .x/ and gi .x/; i D 1; : : : ; m, are convex, any local
minimizer is global (Proposition 2.30), but this is no longer true in the general
case. A problem is said to be multiextremal when it has many local minimizers
with different objective function values (so that a local minimizer may fail to be
global; Fig. 5.1). Throughout the sequel, unless otherwise stated it will always be
understood that the minimum required in problem .P/ is a global one, and we will
be mostly interested in the case when the problem is multiextremal. Often we will
assume X compact and f .x/; gi .x/; i D 1; : : : ; m, continuous on X, (so that if D is
nonempty then the existence of a global minimum is guaranteed); but occasionally
the case where f .x/ is discontinuous at certain boundary points of D will also be
discussed, especially when this turns out to be important for the applications.
By its very nature, global optimization requires quite different methods from
those used in conventional nonlinear optimization, which so far has been mainly
concerned with finding local optima. Strictly speaking, local methods of nonlinear
programming which rely on such concepts as derivatives, gradients, subdifferentials
and the like, may even fail to locate a local minimizer. Usually these methods
compute a Karush–Kuhn–Tucker (KKT) point, but in many instances most KKT points are not even local minimizers. For example, the problem
$$\text{minimize } \sum_{i=1}^n (c_ix_i - x_i^2) \quad \text{subject to } -1 \le x_i \le 1\ (i = 1,\dots,n),$$
where the $c_i$ are sufficiently small positive numbers, has up to $3^n$ KKT points, of which only $2^n$ are local minimizers, the others being saddle points.

Of course the above example reduces to $n$ univariate problems $\min\{c_ix_i - x_i^2 \mid -1 \le x_i \le 1\}$, $i = 1,\dots,n$, and since the strictly quasiconcave function $c_ix_i - x_i^2$ attains its unique minimum over the segment $[-1,1]$ at $x_i = -1$, it is clear that $(-1,-1,\dots,-1)$ is the unique global minimizer of the problem. This situation is typical:
although certain multiextremal problems may be easy (among them instances
of problems which have been shown to be NP-hard, see Chap. 7), their optimal
solution is most often found on the basis of exploiting favorable global properties
5.2 Multiextremality in the Real World 129
of the functions involved rather than by conventional local methods. In any case,
easy multiextremal problems are exceptions. Most nonconvex global optimization
problems are difficult, and their main difficulty is often due to multiextremality.
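The box example above can be enumerated exhaustively for small $n$. The sketch below (for $n = 2$, with $c_i$ values of our own choosing) lists the candidate KKT points, where each coordinate is either at a bound or at its interior stationary value, and classifies them:

```python
import itertools
import numpy as np

# The box example for n = 2: f(x) = sum(c_i x_i - x_i^2) on [-1,1]^n.
# Candidate KKT points take each coordinate in {-1, c_i/2, +1} (3^n points);
# only the 2^n vertices are local minimizers, and (-1,...,-1) is the global one.

c = np.array([0.1, 0.2])  # small positive c_i, our own sample values
f = lambda x: np.sum(c * x - x**2)

cands = list(itertools.product(*[(-1.0, ci / 2, 1.0) for ci in c]))
assert len(cands) == 3**len(c)

def is_local_min(x, eps=1e-3):
    # crude test: no feasible single-coordinate perturbation improves f
    for i in range(len(x)):
        for d in (-eps, eps):
            y = np.array(x)
            y[i] = np.clip(y[i] + d, -1, 1)
            if f(y) < f(np.array(x)) - 1e-12:
                return False
    return True

local_mins = [x for x in cands if is_local_min(x)]
global_min = min(cands, key=lambda x: f(np.array(x)))
print(len(local_mins), global_min)
```

Because the objective is separable, the coordinate-wise perturbation test is a valid local-minimality check here; the four surviving points are exactly the vertices of the box.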
From the computational complexity viewpoint, even the problem which only
differs from the above mentioned example in that the objective function is a
general quadratic concave function is NP-hard (Sahni 1974). It should be noted,
however, that, as shown in Murty and Kabadi (1987), to verify that a given point
is an unconstrained local minimizer of a nonconvex function is already NP-hard.
Therefore, even if we were to restrict our goal to obtaining a local minimizer only,
this would not significantly lessen the complexity of the problem. On the other
hand, as optimization models become widely used in engineering, economics, and
other sciences, an increasing number of global optimization problems arise from
applications that cannot be solved successfully by standard methods of linear and
nonlinear programming and require more suitable methods. Often, it may be of
considerable interest to know at least whether there exists any better solution to
a practical problem than a given solution that has been obtained, for example, by
some local method. As will be shown in Sect. 5.5, this is precisely the central issue
of global optimization.
$$\text{minimize } \sum_{i=1}^r\sum_{j=1}^m c_{ij}x_{ij} + \sum_{i=1}^r g_i(y_i) + \sum_{j=1}^m h_j(z_j)$$
$$\text{s.t. } \sum_{j=1}^m x_{ij} = y_i \quad (i = 1,\dots,r)$$
$$\sum_{i=1}^r x_{ij} = z_j \quad (j = 1,\dots,m)$$
$$x_{ij},\ y_i,\ z_j \ge 0 \quad \forall i,j.$$
where wij is an unknown weight representing the flow from point aj to facility i, S is
a rectangular domain in R2 where the facilities must be located.
With this traditional formulation the problem is exceedingly complicated as it
involves a large number of variables and a highly nonconvex objective function.
However, observing that to minimize the total cost each user must be served by the closest facility, and that the distance from a user $j$ to the closest facility is obviously
$$d_j(x_1,\dots,x_p) = \min_{i=1,\dots,p}\|a_j - x_i\|,$$
one can restate the problem as
$$\text{minimize } \sum_{j=1}^n w_jd_j(x_1,\dots,x_p) \quad \text{subject to } x_i \in \mathbb{R}^2,\ i = 1,\dots,p,$$
where $w_j > 0$ is the weight assigned to user $j$. Since each $d_j(\cdot)$ is a dc function (pointwise minimum of a set of convex functions $\|a_j - x_i\|$, see Sect. 3.1), the problem amounts to minimizing a dc function over $(\mathbb{R}^2)^p$.
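The dc structure of $d_j$ comes from a general identity for pointwise minima of convex functions, $\min_i g_i = \sum_i g_i - \max_i\sum_{k\neq i}g_k$, a difference of two convex functions. The data below are made up for illustration:

```python
import numpy as np

# The nearest-facility distance d_j(x_1,...,x_p) = min_i ||a_j - x_i|| is dc:
# min_i g_i = sum_i g_i - max_i sum_{k != i} g_k, where each g_i = ||a_j - x_i||
# is convex, so the sum and the max of partial sums are both convex.

a_j = np.array([1.0, 2.0])  # sample user location

def d_min(X):               # X: p x 2 array of facility locations
    return np.min(np.linalg.norm(a_j - X, axis=1))

def dc_parts(X):
    g = np.linalg.norm(a_j - X, axis=1)
    s = np.sum(g)           # convex part
    h = np.max(s - g)       # convex part: max over i of sum_{k != i} g_k
    return s, h

rng = np.random.default_rng(4)
for _ in range(200):
    X = rng.uniform(-5, 5, (3, 2))
    s, h = dc_parts(X)
    assert abs(d_min(X) - (s - h)) < 1e-12
print("dc identity holds")
```

The identity is purely algebraic: $\max_i\sum_{k\neq i}g_k = \sum_k g_k - \min_i g_i$, so the difference of the two convex parts recovers the minimum exactly.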
Example 5.3 (Engineering Design) Random fluctuations inherent in any fabri-
cation process (e.g., integrated circuit manufacturing) may result in a very low
production yield. To help the designer to minimize the influence of these random
fluctuations, a method consists in maximizing the yield by centering the nominal
value of design parameters in the region of acceptability. In more precise terms,
this design centering problem (Groch et al. 1985; Vidigal and Director 1982) can
be formulated as follows. Suppose the quality of a manufactured item can be
characterized by an n-dimensional parameter and an item is accepted when the
parameter is contained in a given region M Rn . If x is the nominal value of
the parameter, and y its actual value, then usually y will deviate from x and for every
fixed x the probability that the deviation kx yk does not exceed a given level r
monotonically increases with r. Under these conditions, for a given nominal value
of x the production yield can be measured by the maximal value of r D r.x/ such
that
$$B(x,r) := \{y : \|x - y\| \le r\} \subset M$$
and to maximize the production yield, one has to solve the problem
Often the norm $\|\cdot\|$ is ellipsoidal, i.e., there exists a positive definite matrix $Q$ such that $\|x - y\|^2 = (x - y)^TQ(x - y)$. Denoting $D = \mathbb{R}^n\setminus M$, we then have
to pool l, ylj the amount going from pool l to product j, zij the amount of component
i going directly to product j, plk the level of quality k (relative sulfur content level)
in pool l, and Cik the level of quality k in component i, then these variables must
satisfy (see, e.g., Ben-Tal et al. 1994; Floudas and Aggarwal 1990; Foulds et al.
1992) (Fig. 5.2):
$$\sum_l x_{il} + \sum_j z_{ij} \le A_i, \qquad \sum_i x_{il} - \sum_j y_{lj} = 0 \qquad \text{(component balance)}$$
$$\sum_i x_{il} \le S_l \qquad \text{(pool balance)}$$
$$\sum_l (p_{lk} - P_{jk})y_{lj} + \sum_i (C_{ik} - P_{jk})z_{ij} \le 0 \qquad \text{(pool quality)}$$
$$\sum_l y_{lj} + \sum_i z_{ij} \le D_j \qquad \text{(product demand constraints)}$$
$$x_{il},\ y_{lj},\ z_{ij},\ p_{lk} \ge 0,$$
where $A_i, S_l, P_{jk}, D_j$ are the upper bounds for component availabilities, pool sizes, product qualities, and product demands, respectively, and $C_{ik}$ is the level of quality $k$ in component $i$. Given the unit prices $c_i, d_j$ of component $i$ and product $j$, the problem is to determine the variables $x_{il}, y_{lj}, z_{ij}, p_{lk}$ satisfying the above constraints with maximum profit
$$-\sum_i\sum_l c_ix_{il} + \sum_l\sum_j d_jy_{lj} + \sum_i\sum_j (d_j - c_i)z_{ij}.$$
5.3.1 DC Optimization
where D is a convex set in Rn , and each of the functions f .x/; h1 .x/, : : : ; hm .x/ is a
dc function on Rn . Most optimization problems of practical interest are dc programs
(see the examples in Sect. 4.2).
The class of dc optimization problems includes several important subclasses: concave minimization, reverse convex programming, and nonconvex quadratic optimization, among others.
where $f(x), h(x)$ are convex finite functions on $\mathbb{R}^n$ and $D$ is a closed convex set in $\mathbb{R}^n$. Setting
$$C = \{x \mid h(x) \le 0\},$$
the problem can be rewritten as
$$\text{minimize } f(x) \quad \text{subject to } x \in D\setminus\mathrm{int}\,C.$$
This differs from a conventional convex program only by the presence of the reverse convex constraint $h(x) \ge 0$. Writing the problem as $\min\{t \mid f(x) \le t,\ x \in D,\ h(x) \ge 0\}$, we see that it amounts to minimizing $t$ over the set $G\setminus\mathrm{int}\,H$, with $G = \{(x,t) \mid f(x) \le t,\ x \in D\}$, $H = C\times\mathbb{R}$.
Just as concave minimization can be extended to quasiconcave minimization, so
can problem (5.5) be extended to the case when f .x/ is quasiconvex. As indicated
above, concave minimization can be converted to reverse convex programming by
introducing an additional variable. For many years, it was commonly believed (but
never established) that conversely, a reverse convex program should be equivalent
to a concave minimization problem via some simple transformation. It was not until
1991 that a duality theory for nonconvex optimization was developed (Thach 1991),
within which framework the two problems: quasiconcave minimization over convex
sets and quasiconvex minimization over complements of convex sets are dual to each
other, so that solving one can be reduced to solving the other and vice versa.
c. Quadratic Optimization
where $f(x), g(x), h(x)$ are increasing functions on $\mathbb{R}^n_+$. A function $f : \mathbb{R}^n_+ \to \mathbb{R}$ is said to be increasing if $f(x) \le f(x')$ whenever $x, x' \in \mathbb{R}^n_+$, $x \le x'$.

Monotonic optimization is an important new chapter of global optimization which has been developed over the last 15 years. It is easily seen that if $F$ denotes the family of increasing functions on $\mathbb{R}^n_+$ then
In the deterministic, much more than in the stochastic and other approaches to global
optimization, it is essential to understand the mathematical structure of the problem
being investigated. Only a deep understanding of this structure can provide insight
into the most relevant properties of the problem and suggest efficient methods for
handling it.
Despite the great variety of nonconvex optimization problems, most of them
share a common mathematical structure, which is the complementary convex
structure as described in (5.5). Consider, for example, the general dc programming
problem (5.3). As we saw in Sect. 4.1, the system of dc constraints $h_i(x) \le 0$, $i = 1,\dots,m$, is equivalent to the single dc constraint:
If $h(x) = p(x) - q(x)$, then the constraint (5.7) in turn splits into two constraints: one convex constraint $p(x) \le t$ and one reverse convex constraint $q(x) \ge t$. Thus, assuming $f(x)$ linear and setting $z = (x,t)$, the dc program (5.3) can be reformulated as the reverse convex program:
proceed further. We must then stop the procedure even though there is no guarantee
that x0 is a global optimal solution. It is this ability to become trapped at a stationary
point which causes the failure of classical methods of nonlinear programming and
motivates the need to develop global search procedures. Therefore, any global
search procedure must be able to address the fundamental question of how to
transcend a given feasible solution (the current incumbent, which may be a
stationary point). In other words, the core of global optimization is the following
issue:
Given a solution x0 (the incumbent), check whether it is a global optimal solution,
and if it is not, find a better feasible solution.
As in classical optimization theory, one way to deal with this issue is by
devising criteria for recognizing a global optimum. For unconstrained dc optimiza-
tion problems, we already know a global optimality criterion by Hiriart-Urruty
(Proposition 4.20), namely: given a proper convex l.s.c. function $h(x)$ on $\mathbb{R}^n$ and an arbitrary proper function $g : \mathbb{R}^n \to \mathbb{R}\cup\{+\infty\}$, a point $x^0$ such that $g(x^0) < +\infty$ is an unconstrained global minimizer of $g(x) - h(x)$ if and only if
$$\partial_\varepsilon h(x^0) \subset \partial_\varepsilon g(x^0) \quad \forall\varepsilon > 0.$$
We now present two global optimality criteria for the general problem
where the set $S \subset \mathbb{R}^n$ is assumed nonempty, compact, and robust, i.e., $S = \mathrm{cl}(\mathrm{int}\,S)$, and $f(x)$ is a continuous function on $S$. Since $f(x)$ is then bounded from below on $S$, without loss of generality we can also assume $f(x) > 0$ $\forall x \in S$.

Proposition 5.1 A point $x^0 \in S$ is a global minimizer of $f(x)$ over $S$ if and only if $\alpha := f(x^0)$ satisfies either of the following conditions:
(i) The sequence $r(\alpha, k)$, $k = 1,2,\dots$, is bounded (Falk 1973a), where
$$r(\alpha, k) = \int_S\Big[\frac{\alpha}{f(x)}\Big]^k dx.$$
(ii) The set $E := \{x \in S \mid f(x) < \alpha\}$ has null Lebesgue measure (Zheng 1985).
Proof Suppose $x^0$ is not a global minimizer, so that $f(\bar x) < \alpha$ for some $\bar x \in S$. By continuity we will have $f(x') < \alpha$ for all $x'$ sufficiently close to $\bar x$. On the other hand, since $S$ is robust, in any neighborhood of $\bar x$ there is an interior point of $S$. Therefore, by taking a point $x' \in \mathrm{int}\,S$ sufficiently close to $\bar x$ we will have $f(x') < \alpha$. Let $W$ be a neighborhood of $x'$ contained in $S$ such that $f(y) < \alpha$ for all $y \in W$. Then $W \subset E$, hence $\mathrm{mes}\,E \ge \mathrm{mes}\,W > 0$. Conversely, if $\mathrm{mes}\,E > 0$, then $E \neq \emptyset$, hence $x^0$ is not a global minimizer. This proves that (ii) is a necessary and sufficient condition for the optimality of $x^0$. We now show that (i) is equivalent to (ii). If $\mathrm{mes}\,E > 0$, then since $E = \bigcup_{i=1}^{+\infty}\{x \in S \mid \frac{\alpha}{f(x)} \ge 1 + \frac{1}{i}\}$ there is $i_0$ such that $E_0 = \{x \in S \mid \frac{\alpha}{f(x)} \ge 1 + \frac{1}{i_0}\}$ has
For any fixed $\varepsilon > 0$ let $W_\varepsilon = \{x \in S \mid \|x - x^0\| < \varepsilon\}$ and write
$$\varphi_i(k) = \frac{\int_{W_\varepsilon}|x_i - x_i^0|e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx} + \frac{\int_{S\setminus W_\varepsilon}|x_i - x_i^0|e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx}.$$
Since $x^0$ is the unique minimizer of $f(x)$ over $S$, clearly $\min\{f(x) \mid x \in S\setminus W_\varepsilon\} > f(x^0)$, and there is $\eta > 0$ such that $\max\{f(x^0) - f(x) \mid x \in S\setminus W_\varepsilon\} < -\eta$. Then
$$\frac{\int_{W_\varepsilon}|x_i - x_i^0|e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx} \le \varepsilon\,\frac{\int_{W_\varepsilon}e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx} \le \varepsilon, \qquad (5.14)$$
while the latter ratio tends to 0 as $k \to +\infty$ because $\int_S e^{k[f(x^0)-f(x)]}\,dx$ is bounded by Proposition 5.1 (Falk's criterion for $x^0$ to be the global minimizer of $e^{f(x)}$). Since $\varepsilon > 0$ is arbitrary, this together with (5.14) implies (5.13). $\square$
Remark 5.2 Since $f(x) > 0$ $\forall x \in S$, a minimizer of $f(x)$ over $S$ is also a maximizer of $\frac{1}{f(x)}$ and conversely. Therefore, from the above propositions we can state, under the same assumptions as previously ($S$ compact and robust and $f(x)$ continuous positive on $S$):
(i) A point $x^0 \in S$ with $f(x^0) = \beta$ is a global maximizer of $f(x)$ over $S$ if and only if the sequence
$$s(\beta, k) := \int_S\Big[\frac{f(x)}{\beta}\Big]^k dx, \quad k = 1,2,\dots,$$
is bounded.
(ii) If $f(x)$ has a unique global maximizer $x^0$ over $S$ then
$$x_i^0 = \lim_{k\to+\infty}\frac{\int_S x_ie^{kf(x)}\,dx}{\int_S e^{kf(x)}\,dx}, \quad i = 1,\dots,n.$$
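The limit formula in (ii) can be demonstrated in one dimension with numerical quadrature; the function and domain below are our own choice, with a unique maximizer at $x^0 = 0.3$:

```python
import numpy as np

# Remark 5.2(ii) in one dimension: for a continuous positive f with a unique
# global maximizer x0 on S, the weighted means int x e^{kf} / int e^{kf}
# concentrate at x0 as k grows. Here S = [0,1], f(x) = 2 - (x - 0.3)^2.

xs = np.linspace(0.0, 1.0, 20001)
f = 2.0 - (xs - 0.3)**2

def weighted_mean(k):
    # shift the exponent by its maximum for numerical stability
    w = np.exp(k * (f - f.max()))
    return float(np.sum(xs * w) / np.sum(w))  # Riemann sums; uniform dx cancels

est = [weighted_mean(k) for k in (1, 10, 100, 1000)]
assert abs(est[-1] - 0.3) < 1e-3
print(est)
```

As $k$ grows, the weight $e^{kf(x)}$ behaves like a narrowing Gaussian around the maximizer, which is exactly the mechanism behind the formula; the same concentration also explains why criteria based on such integrals are impractical in higher dimensions.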
In view of the difficulties with computing multiple integrals the above global
optimality criteria are of very limited applicability. On the other hand, using the
complementary convex structure common to most optimization problems, it is
possible to formulate global optimality criteria in such a way that in most cases
transcending an incumbent solution x0 reduces to checking a dc inclusion of the
form
$$D \subset C, \qquad (5.17)$$
general form to which one can reduce virtually any global optimization problem considered in this book. Assume that the feasible set $D = G\setminus\mathrm{int}\,H$ is robust, i.e., satisfies $D = \mathrm{cl}(\mathrm{int}\,D)$, and that the problem is genuinely nonconvex, which means that
$$\min\{l(x) \mid x \in G\} < \min\{l(x) \mid x \in G\setminus\mathrm{int}\,H\}. \qquad (5.19)$$
Proof Suppose there exists $\bar x \in G\cap\{x \mid l(x) \le l(x^0)\}\setminus H$. From (5.19) we have $l(z) < l(\bar x)$ for at least some $z \in G\cap\mathrm{int}\,H$. Then the line segment $[z, \bar x]$ meets the boundary $\partial H$ at some point $x' \in G\setminus\mathrm{int}\,H$, i.e., at some feasible point $x'$. Since $\bar x \notin H$, we must have $x' = \lambda\bar x + (1-\lambda)z$ with $0 < \lambda < 1$, hence $l(x') = \lambda l(\bar x) + (1-\lambda)l(z) < l(\bar x) \le l(x^0)$. Thus, if (5.20) does not hold, then we can find a feasible solution $x'$ better than $x^0$. Conversely, suppose there exists a feasible solution $\bar x$ better than $x^0$. Let $W$ be a ball around $\bar x$ such that $l(x') < l(x^0)$ for all $x' \in W$. Since $D$ is robust, one can find a point $x' \in W\cap\mathrm{int}\,D$. Then clearly $x' \in G\setminus H$ and $l(x') < l(x^0)$, contrary to (5.20). Therefore, if (5.20) holds, $x^0$ must be global optimal. $\square$
Thus, for problem (5.18) transcending $x^0$ amounts to solving an inclusion (5.17) where $D = \{x \in G \mid l(x) \le l(x^0)\}$ and $C = H$. If $\bar x \in G\setminus H$ satisfies $l(\bar x) \le l(x^0)$, then the point $x' \in \partial H\cap[z, \bar x]$ yields a better feasible solution than $x^0$, where $z$ is any point of $D\cap\mathrm{int}\,H$ with $l(z) < l(\bar x)$ [this point exists by virtue of (5.19)].

We shall refer to the problem of checking an inclusion $D \subset C$ or, equivalently, solving the inclusion $x \in D\setminus C$, as the DC feasibility problem. Note that for the concave minimization problem the set $C = \{x \mid f(x) \ge f(x^0)\}$ depends on $x^0$, whereas for the canonical dc program the set $D = \{x \in G \mid l(x) \le l(x^0)\}$ depends on $x^0$.
In practice, the computation of an exact global optimal solution may be too expensive, while most often we only need the global optimum within some prescribed tolerance $\varepsilon \ge 0$. So the problem
$$(P)\qquad \min\{f(x) \mid x \in D\}$$
can be considered solved if a feasible solution x has been found such that
Since global optimization methods are generally expensive, they should be used
only when they are really motivated. Before applying a sophisticated global
optimization algorithm to a given problem, it is therefore necessary to perform
elementary operations (such as fixing certain variables to eliminate them, tightening
bounds, and changing the variables) for improving or simplifying the formulation
wherever possible. In some simple cases, an obvious transformation may reduce a
seemingly nonconvex problem to a convex or even linear program. For instance, a
problem of the form

  min{√(1 + (l(x))²) | x ∈ D},

where D is a polyhedron and l(x) a linear function nonnegative on D, is merely a disguised form of the linear program min{l(x) | x ∈ D}. More generally, problems such as min{φ(l(x)) | x ∈ D}, where D is a polyhedron, l(x) is a linear function, and φ(u) is a nondecreasing function of u on the set l(D), should be recognized as linear programs and solved as such.¹
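To make the last point concrete, here is a minimal sketch (hypothetical data; vertex enumeration stands in for an LP solver, since a linear program attains its minimum at a vertex of the polyhedron): minimizing φ(l(x)) for a nondecreasing φ picks out the same vertex as minimizing l(x) itself.

```python
import math

# Hypothetical polytope D given by its vertex list; an LP attains its
# optimum at a vertex, so comparing vertices suffices for this demo.
vertices = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0)]

l = lambda x: 2.0 * x[0] + x[1]              # linear, nonnegative on D
phi = lambda u: math.sqrt(1.0 + u * u)       # nondecreasing transform on l(D)

v1 = min(vertices, key=lambda x: phi(l(x)))  # "nonconvex-looking" problem
v2 = min(vertices, key=l)                    # the underlying linear program
assert v1 == v2                              # same minimizer
print(v1, phi(l(v1)))                        # -> (0.0, 0.0) 1.0
```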
There are, however, circumstances when more or less sophisticated transfor-
mations may be required to remove the apparent nonconvexities and reformulate
the problem in a form amenable to known efficient solution methods. Therefore,
preprocessing is often a necessary preliminary phase before solving any global
optimization problem.
Example 5.5 Consider the following problem encountered in the theoretical evaluation of certain fault-tolerant shortest path distributed algorithms (Bui 1993):

  (Q)   min Σ_{i=1}^n [(α − 2xᵢ)yᵢ + βxᵢ] / [(γ − yᵢ)xᵢ]   (5.22)

subject to

  Σ_{i=1}^n xᵢ ≤ a,   Σ_{i=1}^n yᵢ ≤ b,   (5.23)
  xᵢ ≥ δ,   yᵢ ≥ σ,   (5.24)

where 0 < δ, 0 < σ < 1 < γ, and 0 < β < 1. Assume that nδ ≤ a and nσ ≤ b (otherwise the problem is infeasible). The objective function looks rather complicated, and one might be tempted to study the problem as a difficult nonconvex optimization problem. Close scrutiny reveals, however, that certain variables can be fixed and eliminated. Specifically, the objective function can be rewritten as

  Σ_{i=1}^n [ (α/xᵢ − 2) · yᵢ/(γ − yᵢ) + β/(γ − yᵢ) ].   (5.25)
¹ Unfortunately, such linear programs disguised as nonlinear global optimization problems have been, and still are, sometimes used for testing global optimization algorithms.
  Σ_{i=1}^n yᵢ ≤ b,   yᵢ ≥ σ,   (5.27)

with a constant μ > 0 determined by the parameters of the problem. Since each term μyᵢ⁻¹ is a convex function of yᵢ, it follows that (Q) is a convex program. Furthermore, due to
the special structure of this program, an optimal solution of it can be computed
in closed form by standard convex mathematical programming methods. Note,
however, that if the inequalities (5.23) are replaced by equalities then the problem
becomes a truly nonconvex optimization problem.
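The closed-form solution of such a reduced convex problem is easy to see. Assuming the reduced problem has the form min Σᵢ μ/yᵢ subject to Σᵢ yᵢ ≤ b, yᵢ ≥ σ (this reading, and all the data below, are illustrative assumptions): each term decreases in yᵢ, so the budget b is used in full, and symmetry (or the KKT conditions) gives yᵢ = b/n whenever b/n ≥ σ. A numerical sanity check:

```python
import random

n, mu, b, sigma = 5, 2.0, 10.0, 0.5          # made-up data with b/n >= sigma

# Closed-form candidate: use the whole budget and spread it evenly.
y_star = [b / n] * n
f_star = sum(mu / y for y in y_star)

# Compare against random feasible points.
random.seed(0)
for _ in range(1000):
    y = [sigma + 2.0 * random.random() for _ in range(n)]
    s = sum(y)
    if s > b:   # pull the point back onto the budget, keeping y_i >= sigma
        y = [sigma + (v - sigma) * (b - n * sigma) / (s - n * sigma) for v in y]
    assert sum(mu / v for v in y) >= f_star - 1e-9
print(f_star)   # -> 5.0
```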
Example 5.6 (Geometric Programming) An important class of optimization problems called geometric programming problems, which have met with many engineering applications (see, e.g., Duffin et al. 1967), have the following general formulation:

  gᵢ(x) = Σ_{j=1}^{Tᵢ} cᵢⱼ ∏_{k=1}^n (x_k)^{a_{ijk}},   i = 0, 1, …, m.
Under the change of variables x_k = e^{t_k}, each gᵢ(x) becomes a function hᵢ(t). It is now not hard to see that each function hᵢ(t) is convex on Rⁿ. Indeed, with the vectors aᵢⱼ = (a_{ij1}, …, a_{ijn}) ∈ Rⁿ,

  hᵢ(t) = Σ_{j=1}^{Tᵢ} cᵢⱼ e^{⟨aᵢⱼ, t⟩},

and, since cᵢⱼ > 0, to prove the convexity of hᵢ(t) it suffices to show that for any vector a ∈ Rⁿ the function e^{⟨a,t⟩} is convex. But the function φ(y) = e^y is convex and increasing on the real line −∞ < y < +∞, while l(t) = ⟨a, t⟩ is linear in t; therefore, by Proposition 2.8, the composite function φ(l(t)) = e^{⟨a,t⟩} is convex.
Thus a geometric program is actually a convex program of a special form. Very
efficient specialized methods exist for solving this problem (Duffin et al. 1967).
Note, however, that if certain cij are negative, then the problem becomes a difficult
dc optimization problem (see, e.g., McCormick 1982).
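The convexity argument above is easy to check numerically. The sketch below builds h(t) = Σⱼ cⱼ e^{⟨aⱼ,t⟩} for made-up positive coefficients cⱼ and exponent vectors aⱼ, and tests midpoint convexity at random pairs of points:

```python
import math, random

# A posynomial g(x) = sum_j c_j * prod_k x_k**a_jk with c_j > 0 (made-up
# data), and its transform h(t) = g(e**t_1, ..., e**t_n)
#                              = sum_j c_j * exp(<a_j, t>).
c = [1.5, 0.7, 2.0]
a = [(1.0, -2.0), (0.5, 0.5), (-1.0, 3.0)]

def h(t):
    return sum(cj * math.exp(sum(ak * tk for ak, tk in zip(aj, t)))
               for cj, aj in zip(c, a))

# Midpoint convexity check at random pairs of points.
random.seed(1)
for _ in range(100):
    t1 = (random.uniform(-2, 2), random.uniform(-2, 2))
    t2 = (random.uniform(-2, 2), random.uniform(-2, 2))
    mid = tuple((u + v) / 2 for u, v in zip(t1, t2))
    assert h(mid) <= 0.5 * h(t1) + 0.5 * h(t2) + 1e-12
```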
Example 5.7 (Generalized Linear Programming) A generalized linear program is a problem of the form:

  (GLP)   min cx
          s.t.  Σ_{j=1}^n yʲxⱼ = b,   (5.28)
                yʲ ∈ Yⱼ,  j = 1, …, n,
                x ≥ 0,
  (SP)   min Σ_{i=1}^n cᵢxᵢ   (5.29)
         s.t.  x ∈ D ⊂ Rⁿ,   (5.30)
               xᵢ = uᵢvᵢ,  i = 1, …, p ≤ n,   (5.31)
               u ∈ [r, s] ⊂ R^p₊₊,  v ∈ S ⊂ R^p₊.   (5.32)

Rewriting the constraints in the form

  Σ_{j=1}^p γᵢⱼ xⱼ/uⱼ = eᵢ,   i = 1, …, m,
  Σ_{j=1}^n δ_{kj} xⱼ = d_k,   k = 1, …, l,
  rⱼ ≤ uⱼ ≤ sⱼ,   j = 1, …, p,
  xⱼ ≥ 0,   j = 1, …, n,

we obtain a (GLP) where b = (e, d) ∈ R^{m+l} and

  yʲ ∈ R^{m+l},   yʲ_h = γ_{hj}/uⱼ   (h = 1, …, m;  j = 1, …, p),
                  yʲ_h = 0           (h = 1, …, m;  j = p + 1, …, n),
                  yʲ_h = δ_{h−m,j}   (h = m + 1, …, m + l;  j = 1, …, n),

so that

  γ_{hj}/sⱼ ≤ yʲ_h ≤ γ_{hj}/rⱼ,   h ∈ Iⱼ⁺,   (5.35)
  γ_{hj}/rⱼ ≤ yʲ_h ≤ γ_{hj}/sⱼ,   h ∈ Iⱼ⁻,   (5.36)
  Iⱼ⁺ = {h | γ_{hj} > 0},   Iⱼ⁻ = {h | γ_{hj} < 0}.   (5.37)
An interesting feature worth noticing in this problem is that, despite the bilinear constraints, the set of all x satisfying (5.31), (5.32) is a polytope, so that (SP) is actually a linear program (Thieu 1992). To see this, for every fixed u ∈ [r, s] denote by ũ : R^p → R^p the affine map that carries v ∈ R^p to the point ũ(v) ∈ R^p with coordinates uᵢvᵢ. Since uᵢʲ > 0 and vᵢʲ ≥ 0, one can have Σ_{j∈J} λⱼvᵢʲ = 0 only if xᵢ = 0, so one can write xᵢ = uᵢvᵢ (i = 1, …, p), with v = Σ_{j∈J} λⱼvʲ ∈ S and

  uᵢ = xᵢ/vᵢ  if vᵢ > 0,   uᵢ = rᵢ  otherwise.
  Q(x) := Q₀ + Σ_{i=1}^n xᵢQᵢ ⪯ 0,   (5.39)

where Qᵢ, i = 0, 1, …, n, are real symmetric m × m matrices, and for any two real symmetric m × m matrices A, B the notation A ⪯ B (A ≺ B, resp.) means that the matrix A − B is negative semidefinite (negative definite, resp.), i.e., the largest eigenvalue of A − B, λ_max(A − B), is nonpositive (negative, resp.). A semidefinite program is a linear program under LMI constraints, i.e., a problem of the form

  min{cx | Q(x) ⪯ 0}.   (5.40)

The LMI constraint Q(x) ⪯ 0 can be written equivalently as

  sup{uᵀQ(x)u | u ∈ Rᵐ} ≤ 0.

Since for fixed u the function φᵤ(x) := uᵀQ(x)u is affine in x, it follows that the set G = {x ∈ Rⁿ | φᵤ(x) ≤ 0 ∀u ∈ Rᵐ} (the feasible set of (5.40)) is convex. Therefore, problem (5.40) is a convex program which can be solved by specialized interior point methods (see, e.g., Boyd and Vandenberghe 2004). Note that the problem min{cx | Q(x) ⪰ 0} is also a semidefinite program, since Q(x) ⪰ 0 if and only if −Q(x) ⪯ 0.
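The convexity of the LMI feasible set can be illustrated directly in the 2 × 2 case, where λ_max has a closed form. The matrices Q₀, Q₁, Q₂ below are made-up data; the check verifies that midpoints of feasible points remain feasible:

```python
import random

def lam_max(a, b, c):
    # Largest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]].
    return (a + c) / 2 + (((a - c) / 2) ** 2 + b ** 2) ** 0.5

# Q(x) = Q0 + x1*Q1 + x2*Q2, each symmetric 2x2 matrix stored as (a, b, c)
# (made-up data).
Q0, Q1, Q2 = (-3.0, 0.5, -2.0), (1.0, 0.2, 0.0), (0.0, -0.3, 1.0)

def Qx(x):
    return tuple(q0 + x[0] * q1 + x[1] * q2 for q0, q1, q2 in zip(Q0, Q1, Q2))

# The feasible set {x : Q(x) negative semidefinite} is convex: midpoints of
# feasible points stay feasible.
random.seed(2)
pts = [(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(500)]
feas = [p for p in pts if lam_max(*Qx(p)) <= 0]
for i in range(0, len(feas) - 1, 2):
    p, q = feas[i], feas[i + 1]
    mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    assert lam_max(*Qx(mid)) <= 1e-9
```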
Many problems from control theory reduce to nonconvex global optimization problems under LMI constraints. For instance, the robust performance problem (Apkarian and Tuan 1999) can be formulated as that of minimizing λ subject to

  Σ_{i=1}^N zᵢA⁰ᵢ + Σ_{i=1}^m (uᵢAᵢ + vᵢA_{i+m}) + B ⪯ 0,   (5.41)
  λuᵢ = xᵢ,  xᵢvᵢ = λ,   i = 1, …, m,   (5.42)
  x ∈ [a, b],  z ∈ [d, d̄],   (5.43)

where A⁰ᵢ, Aᵢ, B are real symmetric matrices (the constraint (5.41) is convex, while (5.42) is bilinear).
5.9 Exercises
1 Let C, D be two compact convex sets in R² such that D ⊂ int C. Show that the distance d(x) = inf{‖x − y‖ : y ∉ C} from a point x ∈ D to the boundary of C is a concave function. Suppose C is a lake and D an island in the lake. Show that finding the shortest bridge connecting D with the mainland (i.e., R² \ int C) is a concave minimization problem. If C, D are merely compact (not necessarily convex), show that this is a dc optimization problem.
2 (Weber problem with attraction and repulsion)
1. A single facility is to be constructed to serve n users located at points aⱼ ∈ D on the plane (where D is a rectangle in R²). If the facility is located at x ∈ D, then the attraction of the facility to user j is qⱼ(hⱼ(x)), where hⱼ(x) = ‖x − aⱼ‖ is the distance from x to aⱼ and qⱼ : R → R is a convex decreasing function (the farther x is from aⱼ, the less attractive it looks to user j). Show that the problem of finding the location of the facility with maximal total attraction Σ_{j=1}^n qⱼ(hⱼ(x)) is a dc optimization problem.
2. In practice, some of the points aⱼ may be repulsion rather than attraction points for the facility. For example, there may be in the area D a garbage dump, or a sewage plant, or a nuclear plant, and one may wish the facility to be located as far away from these points as possible. If J₁ is the set of attraction points and J₂ the set of repulsion points, then in typical situations one may have

  qⱼ(hⱼ(x)) = αⱼ − wⱼ‖x − aⱼ‖   for j ∈ J₁,
  qⱼ(hⱼ(x)) = wⱼ e^{−βⱼ‖x − aⱼ‖}   for j ∈ J₂,

where αⱼ > 0 is the maximal attraction of point j ∈ J₁, βⱼ > 0 is the rate of decay of the repulsion of point j ∈ J₂, and wⱼ > 0 for all j. Show that in this case the problem of maximizing the function

  Σ_{j∈J₁} qⱼ(hⱼ(x)) − Σ_{j∈J₂} qⱼ(hⱼ(x))

is again a dc optimization problem.
(for user j the new facility is attractive as long as it is closer than any existing facility, but this attraction quickly decreases when the new facility is farther than some of the existing facilities, and becomes zero when it is farther than the farthest existing facility). Denoting by hⱼ(x) the distance from the unknown location x of the new facility to user j, show that to maximize the total attraction one again has to solve a dc optimization problem.
5 In Example 5.3 on engineering design prove that r(x) = max{r | B(x, r) ⊂ M} = inf{‖x − y‖ : y ∈ ∂M} (the distance from x to ∂M). Derive that r(x) is a dc function (cf. Exercise 1, Chap. 4), and hence that finding the maximal ball contained in a given compact set M ⊂ Rⁿ reduces to maximizing a dc function under a dc constraint.
6 Consider a system of disjunctive convex inequalities x ∈ ∪_{i=1}^n Cᵢ, where Cᵢ = {x : gᵢ(x) ≤ 0}. Show that this system is equivalent to a single dc inequality p(x) − q(x) ≤ 0 and find the two convex functions p(x), q(x).
7 In Example 5.4 introduce new variables qᵢₗ such that xᵢₗ = qᵢₗ Σⱼ yₗⱼ, and transform the pooling problem into a problem in qᵢₗ, yₗⱼ, zᵢⱼ (eliminating the variables xᵢₗ).
8 (Optimal design of a water distribution network) Assuming that a water distribution network has a single source, a single demand pattern, and a new pumping facility at the source node, the problem is to determine the pump pressure H, the flow rates qᵢ, and the head losses Jᵢ along the arcs i = 1, …, s, so as to minimize the cost (see, e.g., Fujiwara and Khang 1990):

  f(q, H, J) := b₁ Σᵢ Lᵢ (KLᵢ/Jᵢ)^{γ/β} (qᵢ/c)^{αγ/β} + b₂|a₁|^{e₁}H^{e₂} + b₃|a₁|H

subject to the bounds

  qᵢ^{max} ≥ qᵢ ≥ qᵢ^{min} ≥ 0,   i = 1, …, s,
  H^{max} ≥ H ≥ H^{min} ≥ 0

(Lᵢ is the length of arc i; K and c are the Hazen–Williams coefficients; a₁ is the supply water rate at the source; and b₁, b₂, b₃, e₁, e₂, γ are constants such that 0 < e₁, e₂ < 1 and 1 < γ < 2.5; α = 1.85 and β = 4.87). By the change of variables pᵢ = log qᵢ, σᵢ = log Jᵢ, transform the first term of the sum defining f(q, H, J) into a convex function of (p, σ) and reduce the problem to a dc optimization problem.
9 Solve the problem in Example 5.5.
10 Solve the problem
Methods for solving global optimization problems combine in different ways some
basic techniques: partitioning, bounding, cutting, and approximation by relaxation
or restriction. In this chapter we will discuss two general methods: branch and bound
(BB) and outer approximation (OA).
A basic operation involved in a BB procedure is partitioning, more precisely,
successive partitioning.
Let I = {i | λᵢ > 0}. It is a simple matter to check that for every i ∈ I, the points u¹, …, u^{i−1}, v, u^{i+1}, …, u^p are vertices of a simplex Sᵢ ⊂ S and that:
  (ri Sᵢ) ∩ (ri Sⱼ) = ∅  ∀j ≠ i;   ∪_{i∈I} Sᵢ = S.
Proof Denote δ(Sₖ) = δₖ. Since δₖ₊₁ ≤ δₖ, it follows that δₖ tends to some limit δ as k → ∞. Arguing by contradiction, suppose δ > 0. By assumption (ii) we can then choose t such that εₕ < δ for all h ≥ t, where εₕ := max{‖vʰ − uʰᵢ‖ : i = 1, …, n}.
Let us color every vertex of Sₜ black, and for any k > t let us color white every vertex of Sₖ which is not black. Clearly, every white vertex of Sₖ must be a point vʰ for some h ∈ {t, …, k − 1}; hence, by (6.1), max{‖vʰ − uʰᵢ‖ : i = 1, …, n} = εₕ < δ ≤ δₖ. So any edge of Sₖ incident to a white vertex must have a length less than δₖ and cannot be a longest edge of Sₖ. In other words, for k ≥ t a longest edge of Sₖ must have two black endpoints. Now denote K = {k ≥ t | Sₖ is bisected}. Then for k ∈ K, vᵏ belongs to a longest edge of Sₖ, hence, according to what has just been proved, to an edge of Sₖ joining two black vertices. Since vᵏ becomes a white vertex of Sₖ₊₁, it follows that for k ∈ K, Sₖ₊₁ has one white vertex more than Sₖ. On the other hand, in the whole subdivision process the number of white vertices never decreases. That means, after each bisection the number of white vertices increases by at least 1. Therefore, after a certain number of bisections we come up with a simplex Sₖ having only white vertices. Then, according to (6.1), every edge of Sₖ has length less than δ, conflicting with δₖ = δ(Sₖ) ≥ δ. Therefore δ = 0, as was to be proved. ∎
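For the special case of pure longest-edge bisection of a triangle, the exhaustiveness asserted here is easy to observe numerically (made-up triangle; one random nested chain of children is followed):

```python
import math, random

def diam(S):
    return max(math.dist(p, q) for i, p in enumerate(S) for q in S[i + 1:])

def bisect(S):
    # Split a longest edge [S[i], S[j]] at its midpoint v; each child
    # replaces one endpoint of that edge by v.
    pairs = [(i, j) for i in range(len(S)) for j in range(i + 1, len(S))]
    i, j = max(pairs, key=lambda e: math.dist(S[e[0]], S[e[1]]))
    v = tuple((p + q) / 2 for p, q in zip(S[i], S[j]))
    child1, child2 = list(S), list(S)
    child1[i], child2[j] = v, v
    return child1, child2

random.seed(3)
S = [(0.0, 0.0), (1.0, 0.0), (0.3, 0.9)]     # a made-up triangle
d0 = diam(S)
for _ in range(200):                          # follow one nested chain
    S = random.choice(bisect(S))
assert diam(S) < 1e-3 * d0                    # diameters shrink toward 0
```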
It should be emphasized that condition (ii) is essential for the exhaustiveness of the filter (see Exercise 1 of this chapter). This condition implies in particular that the bisections mentioned in (i) are bisections of ratio no less than some constant ρ > 0. A filter of simplices satisfying (i) but not necessarily (ii) enjoys the following weaker property of semi-exhaustiveness which, however, may sometimes be more useful.
Theorem 6.1 (Basic Simplicial Subdivision Theorem) Let Sₖ = [u^{k1}, …, u^{kn}], k = 1, 2, …, be a filter of simplices such that Sₖ₊₁ is a child of Sₖ in a subdivision via a point vᵏ ∈ Sₖ. Then:
(i) Each sequence {u^{ki}, k = 1, 2, …} has an accumulation point uⁱ such that S∞ := ∩_{k=1}^∞ Sₖ is the convex hull of u¹, …, uⁿ;
(ii) The sequence {vᵏ, k = 1, 2, …} has an accumulation point belonging to {u¹, …, uⁿ};
(iii) If for infinitely many k the subdivision of Sₖ is a bisection of ratio no less than some ρ > 0, then an accumulation point of the sequence {vᵏ, k = 1, 2, …} is a vertex of S∞ = [u¹, …, uⁿ].
with λᵢˢ ≥ 0 satisfying Σ_{i=1}^n λᵢˢ = 1, we have x = Σ_{i=1}^n λᵢˢ u^{kₛ,i} → Σ_{i=1}^n λᵢuⁱ as s → +∞, hence x ∈ [u¹, …, uⁿ] = S∞. Therefore, S∞ = ∩_{k=1}^∞ Sₖ = [u¹, …, uⁿ].
(ii) Sₖ is a child of Sₖ₋₁ in a subdivision via vᵏ⁻¹, so a vertex of Sₖ, say u^{k,iₖ}, must be vᵏ⁻¹. Since iₖ ∈ {1, …, n}, there exists a subsequence {kₛ} ⊂ {1, 2, …} such that i_{kₛ} is the same for all s, for instance i_{kₛ} = 1 for all s. Therefore v^{kₛ−1} = u^{kₛ,1} for all s. Since u¹ is an accumulation point of the sequence {u^{k,1}}, we can assume that u^{kₛ,1} → u¹ as s → +∞. Hence v^{kₛ−1} → u¹ as s → +∞.
(iii) This is obviously true if δ(S∞) = 0, i.e., if S∞ is a singleton. Consider the case δ(S∞) > 0, which by Lemma 6.1 can occur only if condition (ii) of this Lemma does not hold, i.e., if for every s = 1, 2, … there is a kₛ such that
  M⁻ = {x | rⱼ ≤ xⱼ ≤ vⱼ, rᵢ ≤ xᵢ ≤ sᵢ (i ≠ j)};
  M⁺ = {x | vⱼ ≤ xⱼ ≤ sⱼ, rᵢ ≤ xᵢ ≤ sᵢ (i ≠ j)}.
We will refer to this partition as a partition via (v, j) (Fig. 6.3). A nice property of this partition method is expressed in the following:
Lemma 6.3 Let {Mₖ := [rᵏ, sᵏ], k ∈ K} be a filter of rectangles such that Mₖ is partitioned via (vᵏ, jₖ). Then there exists a subsequence K₁ ⊂ K such that jₖ = j₀ for all k ∈ K₁ and, as k → +∞, k ∈ K₁,
  jₖ ∈ argmax{Δₖⱼ | j = 1, …, n},   Δₖⱼ = min{vⱼᵏ − rⱼᵏ, sⱼᵏ − vⱼᵏ}.   (6.3)
Proof Let K₁ be the subsequence mentioned in Lemma 6.3. One can assume that rᵏ → r, sᵏ → s, vᵏ → v as k → +∞, k ∈ K₁. From (6.2) and (6.3) it follows that for all j = 1, …, n:

  Δₖⱼ ≤ Δₖⱼ₀ → 0   (k → +∞, k ∈ K₁),
Consider a problem

  (P)   min{f(x) | x ∈ D},

where f(x) is a real-valued continuous function on Rⁿ and D is a closed bounded subset of Rⁿ.
In general it is not possible to compute an exact global optimal solution of this problem by a finite procedure. Therefore, in many cases one must be content with a feasible solution whose objective function value is sufficiently near to the global optimal value. Given a tolerance ε > 0, a feasible solution x̄ such that

  f(x̄) ≤ min{f(x) | x ∈ D} + ε

is called a global ε-optimal solution of the problem.
A BB method for solving problem (P) can be outlined as follows:
– Start with a set (simplex or rectangle) M₀ ⊃ D and perform a (simplicial or rectangular, resp.) subdivision of M₀, obtaining the partition sets Mᵢ, i ∈ I. If a feasible solution is available, let x̄ be the best feasible solution available and γ = f(x̄); otherwise, let γ = +∞.
– For each partition set Mᵢ compute a lower bound β(Mᵢ) for f(Mᵢ ∩ D), i.e., a number with β(Mᵢ) = +∞ if Mᵢ ∩ D = ∅ and β(Mᵢ) ≤ min f(Mᵢ ∩ D) otherwise. Update the current best feasible solution x̄ and the current best feasible value γ.
– Among all the current partition sets select M* ∈ argmin_{i∈I} β(Mᵢ). If γ − β(M*) ≤ ε, stop: x̄ is a global ε-optimal solution. Otherwise, further subdivide M* and add the new partition sets to the old ones to form a new refined partition of M₀.
– For each new partition set M compute a lower bound β(M). Update x̄, γ and repeat the process.
More precisely we can state the following
If some feasible solutions are known at this stage, let xᵏ be the best among them and set γₖ = f(xᵏ); otherwise, set γₖ = +∞.
Step 2. (Pruning) Delete every M ∈ Sₖ for which β(M) ≥ γₖ − ε and let Rₖ be the collection of remaining sets of Sₖ.
Step 3. (Termination Criterion) If Rₖ = ∅, terminate: xᵏ is a global ε-optimal solution. Otherwise, select Mₖ ∈ argmin{β(M) | M ∈ Rₖ}.
Step 4. (Branching) Subdivide Mₖ according to the chosen subdivision rule. Let Pₖ be the resulting partition of Mₖ. Reset Sₖ ← (Rₖ \ {Mₖ}) ∪ Pₖ. If any new feasible solution is available, update the current best feasible solution xᵏ and γₖ = f(xᵏ). Set k ← k + 1 and return to Step 1.
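The scheme above can be sketched in a few lines. The one-dimensional prototype below uses interval partition sets and a Lipschitz-constant lower bound β(M) = f(midpoint) − L·width/2 (the bounding rule and the test function are illustrative assumptions, not the book's construction):

```python
import heapq

def branch_and_bound(f, lo, hi, L, eps=1e-4):
    """Prototype BB on [lo, hi]: Lipschitz lower bound, interval bisection."""
    x_best, gamma = lo, f(lo)                              # incumbent
    beta = lambda a, b: f((a + b) / 2) - L * (b - a) / 2   # bound on M = [a, b]
    heap = [(beta(lo, hi), lo, hi)]
    while heap:
        bound, a, b = heapq.heappop(heap)   # most promising partition set
        if bound >= gamma - eps:            # pruning == termination here
            break
        m = (a + b) / 2
        for c, d in ((a, m), (m, b)):       # branching: bisect [a, b]
            y = (c + d) / 2
            if f(y) < gamma:                # update incumbent
                gamma, x_best = f(y), y
            heapq.heappush(heap, (beta(c, d), c, d))
    return x_best, gamma

# Nonconvex test function with global minima at x = 1 and x = 3 (value 0);
# |f'| <= 24 on [0, 4], so L = 30 is a valid Lipschitz constant.
f = lambda x: (x - 1) ** 2 * (x - 3) ** 2
x, v = branch_and_bound(f, 0.0, 4.0, L=30.0)
print(x, v)
```

Since the bound is valid, on termination the incumbent value is within eps of the global minimum.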
We say that bounding is consistent with branching (briefly, consistent) if
stops (case ε > 0) it only provides an approximate value of v(P), namely a value β(Mₖ) with β(Mₖ) ≤ v(P) < β(Mₖ) + ε; when the algorithm is infinite (case ε = 0) it only provides a sequence of infeasible solutions converging to a global optimal solution.
All the above difficulties of the BB algorithm disappear when the problem has a nice feasible set D. By this we mean that for any simplex or rectangle M a point x ∈ M ∩ D can be computed at cheap cost whenever one exists. More importantly, in most cases when the feasible set is nice, a rectangular adaptive subdivision process can be used which ensures faster convergence than an exhaustive subdivision process does.
Specifically, suppose that for each rectangle Mₖ two points xᵏ, yᵏ in Mₖ can be determined such that

  xᵏ ∈ Mₖ ∩ D,   yᵏ ∈ Mₖ,   (6.6)
  f(yᵏ) − β(Mₖ) = o(‖xᵏ − yᵏ‖).   (6.7)

From (6.6) and (6.7) we deduce that β(Mₖ) → f(x*) as k → +∞, where x* is an accumulation point of {xᵏ}. But, since x* ∈ D, we have β(Mₖ) ≤ v(P) ≤ f(x*). Therefore, β(Mₖ) → v(P) (k → +∞), proving the convergence of the algorithm. ∎
  P₁ ⊃ P₂ ⊃ ⋯ ⊃ Pₖ ⊃ ⋯ ⊃ G,

approximating G more and more closely from outside, hence the name given to the method (Fig. 6.4). The approximation is refined at each iteration, but only around the current distinguished point (considered the most promising at this stage).
An OA procedure as described above may be infinite (for instance, if x(P) exists and x(P) ∉ G for every P ∈ 𝒫). An infinite OA procedure is said to be convergent if it generates a sequence of distinguished points xᵏ every accumulation point x̄ of which belongs to G [hence solves the problem, by (A1)]. In practice, the procedure is stopped when xᵏ has become sufficiently close to G, which necessarily occurs for sufficiently large k if the procedure is convergent.
The study of the convergence of an OA procedure relies on properties of the cuts lₖ(x) ≤ 0 satisfying condition (6.10). Let

  lₖ(x) = ⟨pᵏ, x⟩ + αₖ,   l̃ₖ(x) = lₖ(x)/‖pᵏ‖ = ⟨p̃ᵏ, x⟩ + α̃ₖ,

where p̃ᵏ = pᵏ/‖pᵏ‖ and α̃ₖ = αₖ/‖pᵏ‖.
Lemma 6.4 (Outer Approximation Lemma) If the sequence {xᵏ} is bounded, then l̃ₖ(xᵏ) → 0 (k → +∞).
But for every fixed k, lₖ(x^{k_q}) ≤ 0 for all k_q > k, and letting q → +∞ yields lₖ(x̄) ≤ 0. Therefore, 0 < l̃_{k_q}(x^{k_q}) ≤ ⟨p̃^{k_q}, x^{k_q} − x̄⟩ → 0, a contradiction. ∎
Theorem 6.5 (Basic Outer Approximation Theorem) Let G be an arbitrary closed subset of Rⁿ with nonempty interior, let {xᵏ} be an infinite sequence in Rⁿ, and for each k let lₖ(x) = ⟨pᵏ, x⟩ + αₖ be an affine function satisfying (6.10). If the sequence {xᵏ} is bounded and for every k there exist wᵏ ∈ G and yᵏ ∈ [wᵏ, xᵏ] \ int G such that lₖ(yᵏ) ≥ 0, {wᵏ} is bounded and, furthermore, every accumulation point of {wᵏ} belongs to the interior of G, then

  xᵏ − yᵏ → 0   (k → +∞).   (6.11)

Since w ∈ int G and l ≢ 0, we must have l(w) < 0. From the hypothesis lₖ(yᵏ) ≥ 0 it follows that ⟨pᵏ, yᵏ⟩ + αₖ ≥ 0, hence l(ȳ) ≥ 0. But ȳ = λw + (1 − λ)x̄ for some λ ∈ [0, 1], hence l(ȳ) = λl(w) + (1 − λ)l(x̄) = λl(w), and since l(ȳ) ≥ 0 and l(w) < 0, this implies that λ = 0. Therefore, ȳ = x̄. ∎
An important special case is the following:
Corollary 6.3 Let G = {x | g(x) ≤ 0}, where g : Rⁿ → R is a convex function, and let {xᵏ} ⊂ Rⁿ \ G be a bounded sequence. If the affine functions lₖ(x) satisfying (6.10) are such that

  lₖ(x) = ⟨pᵏ, x − yᵏ⟩ + βₖ,

where {wᵏ} is a bounded sequence every accumulation point of which belongs to the interior of G, and g(yᵏ) − βₖ → 0 (k → +∞), then every accumulation point x̄ of {xᵏ} belongs to G.
Proof Since {xᵏ} and {wᵏ} are bounded, {yᵏ} is bounded, too. Therefore, {pᵏ} is bounded (Theorem 2.6), and by Lemma 6.4, lₖ(xᵏ) → 0. Furthermore, by Theorem 6.5, xᵏ − yᵏ → 0, hence βₖ = lₖ(xᵏ) − ⟨pᵏ, xᵏ − yᵏ⟩ → 0. But by hypothesis g(yᵏ) − βₖ → 0; consequently, g(yᵏ) → 0. If x^{k_q} → x̄, then by Theorem 6.5 y^{k_q} → ȳ = x̄, so g(x̄) = lim g(y^{k_q}) = 0, i.e., x̄ ∈ G. ∎
In a standard outer approximation procedure wᵏ is fixed and βₖ = g(yᵏ) for all k. But, as will be illustrated in subsequent sections, by allowing more flexibility in the choice of wᵏ and βₖ the above theorems can be used to establish convergence under much weaker assumptions than usual.
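In one dimension the standard choices reduce to a familiar cutting scheme. The sketch below minimizes −x over G = {x : g(x) ≤ 0} with g(x) = x² − 2 (made-up data): the distinguished point xᵏ is the minimizer over the current relaxation, and the cut is the linearization of g at yᵏ = xᵏ, so each cut shrinks the relaxation toward G:

```python
# 1-D OA sketch: minimize -x over G = {x : g(x) <= 0}, g(x) = x**2 - 2,
# starting from the relaxation P0 = [-2, 2].  The distinguished point x_k is
# the minimizer over the current relaxation (its right endpoint); the cut is
# the linearization l_k(x) = g(x_k) + g'(x_k) * (x - x_k) <= 0 at y_k = x_k.
g = lambda x: x * x - 2.0
dg = lambda x: 2.0 * x

hi = 2.0                          # right endpoint of the current relaxation
for _ in range(50):
    xk = hi                       # distinguished point x_k = x(P_k)
    if g(xk) <= 1e-12:            # x_k (almost) in G: stop
        break
    hi = xk - g(xk) / dg(xk)      # adding the cut shrinks the relaxation

print(hi)                         # -> about 1.41421 (sqrt(2))
```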
6.4 Exercises
where P is a pregiven set containing D. Does the algorithm work with this
modification?
Chapter 7
DC Optimization Problems
  (1/λ₁, …, 1/λₙ)(t₁, …, tₙ)ᵀ = Σ_{i=1}^n tᵢ/λᵢ,

  Σ_{i=1}^n tᵢ/λᵢ ≥ 1,   t = U⁻¹(x − x⁰),   (7.12)

infinite, but x⁰ + λᵢuⁱ ∈ C(γ) for all λᵢ > 0, so the inequality (7.12) still defines a γ-valid cut for arbitrarily large λᵢ (i ∈ I). Letting λᵢ → +∞ (i ∈ I) in (7.12), we then obtain the cut

  Σ_{i∉I} tᵢ/λᵢ ≥ 1,   t = U⁻¹(x − x⁰).   (7.13)
We will refer to this cut (which exists regardless of whether C(γ) is bounded or not) as the γ-valid concavity cut for (f, D) at x⁰ (or simply the concavity cut, when f, D, γ, x⁰ are clear from the context).

  x = x⁰ + Ut,
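A small numerical illustration of the γ-valid concavity cut (made-up concave quadratic; x⁰ = 0 is the cone vertex and U = I, so t = x): the γ-extensions λᵢ are found by bisection along the edge directions, and the cut Σᵢ tᵢ/λᵢ ≥ 1 is checked to remove no point with f < γ:

```python
import random

f = lambda x: -((x[0] - 2) ** 2 + (x[1] - 2) ** 2)   # made-up concave function
gamma = -9.0      # level gamma with f(x0) >= gamma at the vertex x0 = (0, 0)

def gamma_extension(u, hi=100.0):
    # Largest lam with f(lam * u) >= gamma, located by bisection on the ray.
    lo, up = 0.0, hi
    for _ in range(80):
        mid = (lo + up) / 2
        if f((mid * u[0], mid * u[1])) >= gamma:
            lo = mid
        else:
            up = mid
    return lo

lam = [gamma_extension((1.0, 0.0)), gamma_extension((0.0, 1.0))]

# Validity of the cut sum(t_i / lam_i) >= 1: any point of the cone t >= 0
# violating the cut must satisfy f(t) >= gamma, i.e. the cut removes no
# point better than the current level gamma.
random.seed(4)
for _ in range(2000):
    t = (random.uniform(0.0, 5.0), random.uniform(0.0, 5.0))
    if t[0] / lam[0] + t[1] / lam[1] < 1.0:
        assert f(t) >= gamma - 1e-9
```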
7.1.3 Procedure DC
With the preceding background, let us now consider the fundamental DC feasibility problem that was formulated in Sect. 7.1 as the key to the solution of (BCP):
(*) Given two closed convex sets C, D in Rⁿ, either find a point

  x ∈ D \ C,   (7.16)
i.e.,

  LP(D, M)   max{ Σ_{i=1}^n tᵢ/λᵢ | ÃUt ≤ b̃, t ≥ 0 }   (7.20)

with

  Ã = (A; −I),   b̃ = (b − Ax⁰; x⁰).   (7.21)

Denote the optimal value and a basic optimal solution of this linear program in t by μ(M) and t(M), respectively, and let ω(M) = x⁰ + Ut(M).
Proposition 7.3 If μ(M) ≤ 1, then D ∩ M ⊂ C_ε, i.e., f(x) ≥ f(x̄) − ε for all points x ∈ D ∩ M. If μ(M) > 1 and f(ω(M)) ≥ f(x̄) − ε, then ω(M) does not lie on any edge of M.
Proof Clearly μ(M) and ω(M) are the optimal value and a basic optimal solution of the linear program
7.1.3.1 Procedure DC

  LP(D, M)   max{ Σ_{i=1}^n tᵢ/λᵢ | ÃUt ≤ b̃, t ≥ 0 }

to obtain the optimal value μ(M) and a basic optimal solution t(M). Let ω(M) = x⁰ + Ut(M).
Step 2. If for some M ∈ P, ω(M) ∉ C, i.e., f(ω(M)) < f(x̄), then terminate: ω(M) yields a point z ∈ D \ C.
Step 3. Delete every cone M ∈ N for which μ(M) ≤ 1 and let R be the collection of all remaining cones.
Step 4. If R = ∅, then terminate: D ⊂ C_ε (x̄ is a global ε-optimal solution of (BCP)).
Step 5. Let M* ∈ argmax{μ(M) | M ∈ R}, with base S*. Select a point v* ∈ S* which does not coincide with any vertex of S*. Split S* via v*, obtaining a partition P* of M*.
Step 6. Set N ← (R \ {M*}) ∪ P*, P ← P* and return to Step 1.
Incorporating the above procedure into the two-phase scheme for solving (BCP) (see Sect. 5.7), we obtain the following:
CS (Cone Splitting) Algorithm for (BCP)
Let z⁰ be an initial feasible solution.
Phase 1. Compute a vertex x̄ which is a local minimizer such that f(x̄) ≤ f(z⁰).
Phase 2. Call Procedure DC with C = {x | f(x) ≥ f(x̄)}, C_ε = {x | f(x) ≥ f(x̄) − ε}.
(a) If the procedure returns a point z ∈ D such that f(z) < f(x̄), then set z⁰ ← z and go back to Phase 1. (Optionally: with γ = f(z) and x⁰ being the vertex of D where Procedure DC was started, construct a γ-valid cut π(x − x⁰) ≥ 1 for (f, D) at x⁰ and reset D ← D ∩ {x | π(x − x⁰) ≥ 1}.)
(b) If the procedure establishes that D ⊂ C_ε, then x̄ is a global ε-optimal solution.
With regard to convergence and efficiency, a key point in Procedure DC is the subdivision strategy, i.e., the rule by which the cone M* in Step 5 is subdivided. Let Mₖ denote the cone M* at iteration k. Noting that ω(Mₖ) does not lie on any edge of Mₖ, a natural strategy that comes to mind is to split the cone Mₖ along the halfline from x⁰ through ω(Mₖ), i.e., to subdivide its base Sₖ via the intersection point vᵏ of Sₖ with this halfline. A subdivision of this kind is referred to as an ω-subdivision, and the strategy which prescribes ω-subdivision in every iteration is called the ω-strategy. This natural strategy was used in the earliest works on concave minimization (Tuy 1964; Zwart 1974) and, although it seemed to work quite well in practice (Hamami and Jacobsen 1988), its convergence was never rigorously established until the works of Jaumard and Meyer (2001), Locatelli (1999), and most recently Kuno and Ishihama (2015). Now we can state the following:
Theorem 7.1 The CS Algorithm with ω-subdivisions terminates after finitely many steps, yielding a global ε-optimal solution of (BCP).
Proof Thanks to Theorem 6.2, a new proof for this result can be given which is much shorter than all previously published ones in the above-cited works. In fact, consider iteration k of Procedure DC with ω-subdivisions. The cone Mₖ (i.e., M* in Step 5) is subdivided via the halfline from x⁰ through ωᵏ := ω(Mₖ). Let z^{k1}, …, z^{kn} be the points where the edges of Mₖ meet the boundary ∂C_ε of C_ε, and let qᵏ be the point where the halfline from x⁰ through ωᵏ meets the simplex [z^{k1}, …, z^{kn}]. By Theorem 6.2, if the procedure is infinite there is an infinite subsequence {kₛ} such that q^{kₛ} → q ∈ ∂C_ε as s → ∞. By continuity of f(x) it follows that f(q^{kₛ}) → f(q) = f(x̄) − ε, so f(ω^{kₛ}) < f(x̄) for sufficiently large kₛ. Since this is the stopping criterion in Step 2 of the procedure, the algorithm terminates after finitely many iterations by the stopping criterion in Step 4. ∎
Remark 7.3 It is not hard to show that ωᵏ = μ(Mₖ)qᵏ. From the above proof there is an infinite subsequence {kₛ} such that ‖ω^{kₛ} − q^{kₛ}‖ → 0, i.e., (μ(M_{kₛ}) − 1)‖q^{kₛ}‖ → 0, hence
Later we will see that this condition is sufficient to ensure convergence of the CS
Algorithm viewed as a Branch and Bound Algorithm.
Remark 7.4 The CS Algorithm for solving (BCP) consists of a finite number
of cycles each of which involves a Procedure DC for transcending the current
incumbent. Any new cycle is actually a restart of the algorithm, whence the name
CS restart given sometimes to the algorithm. Due to restart, the set Nk (the collection
of cones to be stored) can be kept within reasonable size to avoid storage problems.
Furthermore, the vertex x⁰ used to initialize Procedure DC in each cycle need not be fixed throughout the whole algorithm, but may vary from cycle to cycle. When
started from x0 Procedure DC concentrates the search over the part of the boundary
of D contained in the canonical cone associated with x0 , so a change of x0 means
a change of search directions. This multistart effect may sometimes drastically
improve the convergence speed. On the other hand, further flexibility can be given
to the algorithm by modifying Step 2 of Procedure DC as follows:
Step 2. (Modified) If for some M ∈ N, ω(M) ∉ C, i.e., f(ω(M)) < f(x̄), then, optionally: (a) either set z⁰ ← ω(M) and return to Phase 1, or (b) compute a vertex x̂ of D such that f(x̂) ≤ f(ω(M)), reset x̄ ← x̂, and go to Step 3.
With this modification, as soon as a feasible solution better than the incumbent is found in Step 2 of Procedure DC, the algorithm either returns to Phase 1 (with z⁰ ← ω(M)) or goes to Step 3 (with an updated incumbent) to continue the current Procedure DC. In general, the former option (return to Phase 1) should be chosen when the set N has reached a critical size. The efficiency of the multirestart strategy is illustrated in the following example:
Example 7.1 Consider the problem

subject to

  x₁ + 2x₂ + x₃ + x₄ + x₅ + x₆ + x₇ + x₈ ≤ 8
  2x₁ + x₂ + x₃ ≤ 9
  x₃ + x₄ + x₅ ≤ 5
  0.5x₅ + 0.5x₆ + x₇ + 2x₈ ≤ 3
  2x₂ − x₃ − 0.5x₄ ≤ 5
  x₁ ≤ 6,   xᵢ ≥ 0,   i = 1, 2, …, 8.
7.1.4 ω-Bisection
As said above, for some time the convergence status of conical algorithms using ω-subdivisions was unsettled. It was then natural to look for other subdivisions which could secure convergence, even at some cost in efficiency. The simplest subdivision that came to mind was bisection: at each iteration the cone M* is split along the halfline from x⁰ through the midpoint of a longest edge of the simplex S*. In fact, a first conical algorithm using bisections was developed (Thoai and Tuy 1980) whose convergence was proved rigorously. Subsequently, however, it was observed that the convergence achieved with bisections is too slow. This motivated the Normal Subdivision (Tuy 1991a), a hybrid strategy using ω-subdivisions in most iterations and bisections occasionally, just enough to prevent jamming and eventually ensure convergence.
At the time when the pure ω-subdivision was efficient yet uncertain to converge, Normal Subdivision was the best hybrid subdivision ensuring both convergence and efficiency (Horst and Tuy 1996). However, now that the convergence of pure ω-subdivision has been settled, bisections are no longer needed to prevent jamming in conical algorithms. On the other hand, an advantage of bisections over other kinds of subdivision is that each bisection creates only two new partition sets, so using bisections may help keep the growth of the collection N of accumulated partition sets within more manageable limits. Moreover, recent investigations by Kuno and Ishihama (2015) show that bisections can be designed to take account of current problem structure, just like ω-subdivisions.
  λ̃ᵏᵢ = (λᵏᵢ + 1)/(1 + n),   i = 1, …, n,   ṽᵏ = Σ_{i=1}^n λ̃ᵏᵢ uᵏⁱ.

Clearly λ̃ᵏᵢ > 0 for all k and i = 1, …, n, and Σ_{i=1}^n λ̃ᵏᵢ = 1, so ṽᵏ ∈ Sₖ. Let [u^{k,rₖ}, u^{k,sₖ}] be a longest edge of the simplex Sₖ and
while qᵏ, q̂ᵏ are the points where the halfline from x⁰ through vᵏ meets the simplex [z^{k1}, …, z^{kn}] and ∂C_ε, respectively, then ‖qᵏ − q̂ᵏ‖ → 0 as k → ∞. Therefore, there is q ∈ ∂C_ε such that qᵏ, q̂ᵏ → q as k → ∞. Since ωᵏ := ω(Mₖ) ∈ [qᵏ, q̂ᵏ], it follows that ωᵏ → q ∈ ∂C_ε. By reasoning just as in the last part of the proof of Theorem 6.1, we conclude that the CS Algorithm with ω-bisections terminates after finitely many iterations, yielding a global ε-optimal solution. ∎
From the computational experiments reported in Kuno and Ishihama (2015), the CS Algorithm with ω-bisection seems to compete favorably with the CS Algorithm using ω-subdivisions.
and by μ = μ(M) denote the optimal value of the linear program LP(D, M) given by (7.20). If yⁱ = x⁰ + λᵢuⁱ, i = 1, …, n, then clearly D ∩ M ⊂ [x⁰, y¹, …, yⁿ], so by concavity of f(x) a lower bound of f over D ∩ M can be taken to be the value

  β(M) = γ − ε   if μ(M) ≤ 1,
  β(M) = min{f(y¹), …, f(yⁿ)}   otherwise   (7.25)

(see Fig. 7.3). Using this lower bounding M ↦ β(M) together with the conical ω-subdivision rule, we can then formulate the following BB (branch and bound) version of the CS Algorithm discussed in Sect. 5.4.
Conical BB Algorithm for (BCP)
Initialization. Compute a feasible solution z⁰.
Phase 1. Compute a vertex x̄ of D which is a local minimizer such that f(x̄) ≤ f(z⁰).
Phase 2. Take a vertex x⁰ of D (local minimizer) such that f(x⁰) > f(x̄) − ε. Let M₀ = {x | x = x⁰ + U₀t, t ≥ 0} be the canonical cone associated with x⁰, S₀ = [x⁰ + u⁰¹, …, x⁰ + u⁰ⁿ], where u⁰ⁱ, i = 1, …, n, are the columns of U₀. Set P = {M₀}, N = {M₀}, γ = f(x̄).
to obtain the optimal value μ(M) and a basic optimal solution t(M). Let ω(M) = x⁰ + Ut(M). Compute β(M) as in (7.25).
Step 2. (Options) If for some M ∈ P, f(ω(M)) < γ, then do either of the following:
(a) with γ ← f(ω(M)) construct a (γ − ε)-valid cut for (f, D) at x⁰: ⟨π, x − x⁰⟩ ≥ 1, reset D ← D ∩ {x | ⟨π, x − x⁰⟩ ≥ 1}, z⁰ ← ω(M), and go back to Phase 1;
(b) compute a vertex x̂ of D such that f(x̂) ≤ f(ω(M)), reset x̄ ← x̂, γ ← f(x̂), and go to Step 3.
Step 3. (Pruning) Delete every cone M ∈ N for which β(M) ≥ γ − ε and let R be the collection of all remaining cones.
Step 4. (Termination Criterion) If R = ∅, then terminate: x̄ is a global ε-optimal solution of (BCP).
Step 5. (Branching) Let M* ∈ argmin{β(M) | M ∈ R}, with base S*. Perform an ω-subdivision of M* and let P* be the partition of M*.
Step 6. (New Net) Set N ← (R \ {M*}) ∪ P*, P ← P* and return to Step 1.
Theorem 7.3 The above algorithm can be infinite only if ε = 0, and then any accumulation point of the sequence {x^k} is a global optimal solution of (BCP).
Proof Since D has finitely many vertices, if the algorithm is infinite, a Phase 2 must be infinite and generate a filter {M_k, k ∈ K}. Let γ_k be the current value of γ, and let μ_k = μ(M_k), ω^k = ω(M_k), γ = lim γ_k. By formula (7.22) (which extends easily to the case when γ_k ↘ γ), there exists K_1 ⊂ K such that μ_k → 1, ω^k → ω, with ω satisfying f(ω) = γ − ε as k → ∞, k ∈ K_1. But in view of Step 2, f(ω^k) ≥ γ_k, hence f(ω) ≥ γ, which is possible only if ε = 0. Moreover, without loss of generality we can assume β(M_k) = f(x^0 + μ_k λ_{1k} u^{1k}), and so, in that case, γ_k − β(M_k) ≤ f(x^0 + λ_{1k} u^{1k}) − f(x^0 + μ_k λ_{1k} u^{1k}) → 0 as k → ∞, k ∈ K_1. Therefore, by Proposition 6.1 the algorithm converges. □
Remark 7.5 Option (a) in Step 2 allows the algorithm to be restarted when storage problems arise in connection with the growth of the current set of cones. On the other hand, with option (b) the algorithm becomes a unified procedure. This flexibility sometimes enhances the practicability of the algorithm. Note, however, that the BB algorithm requires more function evaluations than the CS Algorithm (for each cone M the CS Algorithm only requires the value of f(x) at ω(M), while the BB Algorithm requires the values of f(x) at the n points y^1, …, y^n). This should be taken into account when function evaluations are costly, as occurs in problems
where the objective function f(x) is implicitly defined via some auxiliary problem (for instance, f(x) = min{⟨Qx, y⟩ | y ∈ G}, where Q ∈ ℝ^{p×n} and G is a polytope in ℝ^p).
Aside from simplicial and conical subdivisions, rectangular subdivisions are also used in branch and bound algorithms for (BCP). Rectangular subdivisions are often preferred when the objective function is such that a lower bound of its values over a rectangle can be obtained in a straightforward manner. The latter occurs, in particular, when the function is separable, i.e., has the form f(x) = Σ_{i=1}^n f_i(x_i).
Consider now the Separable Basic Concave Program

    (SBCP)    min{ f(x) := Σ_{j=1}^n f_j(x_j) | x ∈ D },        (7.27)

where, for every j, φ_{M,j}(·) is the affine function of one variable that agrees with f_j(·) at the endpoints of the interval [r_j, s_j], i.e.,

    φ_{M,j}(t) = f_j(r_j) + [(f_j(s_j) − f_j(r_j)) / (s_j − r_j)] (t − r_j).        (7.28)
Proof That φ_M(x) is affine and minorizes f(x) over M is obvious. On the other hand, if x is a vertex of M then, for every j, we have either x_j = r_j or x_j = s_j, hence φ_{M,j}(x_j) = f_j(x_j) for all j, and therefore f(x) = φ_M(x). □
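The chord construction (7.28) and the two properties just proved (agreement at the endpoints, minorization for concave f_j) can be checked numerically; a small sketch, where the concrete concave f_j is an assumed example:

```python
def chord(fj, r, s):
    """Affine function agreeing with fj at the endpoints of [r, s],
    as in (7.28); for concave fj it minorizes fj on [r, s]."""
    slope = (fj(s) - fj(r)) / (s - r)
    return lambda t: fj(r) + slope * (t - r)

fj = lambda t: -(t - 1.0) ** 2          # an assumed concave example
phi = chord(fj, 0.0, 3.0)
assert abs(phi(0.0) - fj(0.0)) < 1e-12  # agreement at r
assert abs(phi(3.0) - fj(3.0)) < 1e-12  # agreement at s
assert all(phi(t) <= fj(t) + 1e-12 for t in (0.5, 1.0, 2.5))  # minorant
```

Summing such chords over j gives the affine minorant φ_M used in LP(D, M) below.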
On the basis of this proposition a lower bound for f(x) over a rectangle M = [r, s] can be computed by solving the linear program

    LP(D, M)    min{ φ_M(x) | x ∈ D ∩ M }.        (7.29)
7.1 Concave Minimization Under Linear Constraints 183
If ω(M) and β(M) denote a basic optimal solution and the optimal value of this linear program, then β(M) ≤ min f(D ∩ M), with equality holding when f(ω(M)) = φ_M(ω(M)). It is also easily seen that if a rectangle M = [r, s] is entirely outside D then D ∩ M is empty and M can be discarded from consideration (β(M) = +∞). On the other hand, if M is entirely contained in D then D ∩ M = M, hence the minimum of f(x) over D ∩ M is achieved at a vertex of M and, since φ_M(x) agrees with f(x) at the vertices of M, it follows that β(M) = min f(D ∩ M). Therefore, in a branch and bound rectangular algorithm for (SBCP) only those rectangles intersecting the boundary of D need to be investigated, so the search is essentially concentrated on this boundary.
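The three-way classification just described (discard rectangles outside D, fathom rectangles inside D where the bound is exact, refine only boundary rectangles) can be sketched for the simplest case of a half-space D = {x | a·x ≤ b}; the helper name and the half-space form of D are assumptions for illustration:

```python
from itertools import product

def classify_box(r, s, a, b):
    """Classify rectangle M = [r, s] against D = {x | a.x <= b} by the
    sign of a.x - b at the vertices of M; since a.x is linear, its min
    and max over M are attained at vertices, so the test is exact here."""
    vals = [sum(ai * xi for ai, xi in zip(a, v)) - b
            for v in product(*zip(r, s))]
    if all(v <= 0 for v in vals):
        return "inside"    # D ∩ M = M: bound beta(M) is exact, M fathomed
    if all(v > 0 for v in vals):
        return "outside"   # D ∩ M empty (beta(M) = +inf): discard M
    return "boundary"      # only these rectangles need further splitting

print(classify_box((0, 0), (1, 1), (1, 1), 3))   # inside
print(classify_box((2, 2), (3, 3), (1, 1), 3))   # outside
print(classify_box((1, 1), (2, 2), (1, 1), 3))   # boundary
```

For a general polyhedron D the "inside"/"outside" tests would instead use the individual linear constraints, but the branching logic is the same.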
Proposition 7.5 Let {M_k, k ∈ K} be a filter of rectangles as in Lemma 6.3. If

    f(ω^k) − φ_{M_k}(ω^k) → 0    (k → ∞, k ∈ K_1).        (7.31)

Theorem 7.4 The above algorithm can be infinite only if ε = 0 and in this case every accumulation point of the sequence {x^k} is a global optimal solution of (SBCP).
Proof Proposition 7.5 ensures that every filter {M_k, k ∈ K} contains an infinite subsequence {M_k, k ∈ K_1} such that

    f(ω^k) − β(M_k) → 0    (k → ∞, k ∈ K_1).
Thus, to apply the outer approximation method to (GCP) it only remains to define a distinguished point x^k := x(P_k) for every approximating polytope P_k so as to fulfill Condition A1.
In the simple outer approximation method for solving (GCP), x^k = x(P_k) is chosen to be a vertex of P_k minimizing f(x) over P_k. In other words, denoting the vertex set of P_k by V_k, one defines
Since f(x^k) ≤ min f(D), it is obvious that x̄ = lim_{k→∞} x^k ∈ D implies that x̄ solves (GCP), so Condition A1 is immediate.
A practical implementation of the simple OA method thus requires an efficient procedure for computing the vertex set V_k of P_k. At the beginning, P_1 can be taken to be a simple polytope (most often a simplex) with a readily available vertex set V_1. Since P_k is obtained from P_{k−1} by adding a single linear constraint, all we need is a subroutine for solving the following subproblem of on-line vertex enumeration:
Given the vertex set V_k of a polytope P_k and a linear inequality l_k(x) ≤ 0, compute the vertex set V_{k+1} of the polytope
Denote

    V_k^− = {v ∈ V_k | l_k(v) < 0},    V_k^+ = {v ∈ V_k | l_k(v) > 0}.
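In two dimensions the update from V_k to V_{k+1} can be sketched directly: vertices with l_k(v) ≤ 0 are kept, and each edge joining V_k^− to V_k^+ contributes one new vertex on the hyperplane l_k(x) = 0. A toy sketch (assuming the vertices are listed in cyclic boundary order, which is an assumption not made in the text):

```python
def cut_polygon(vertices, a, b):
    """Vertices of P ∩ {x | a.x <= b} for a convex polygon P given by
    its vertices in cyclic order (on-line vertex update in 2-D)."""
    lk = lambda v: a[0] * v[0] + a[1] * v[1] - b
    out = []
    n = len(vertices)
    for i in range(n):
        v, w = vertices[i], vertices[(i + 1) % n]
        if lk(v) <= 0:                  # v survives the cut
            out.append(v)
        if lk(v) * lk(w) < 0:           # edge crosses l_k(x) = 0
            t = lk(v) / (lk(v) - lk(w))
            out.append((v[0] + t * (w[0] - v[0]),
                        v[1] + t * (w[1] - v[1])))
    return out

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(cut_polygon(square, (1, 0), 1))   # keeps the part with x1 <= 1
```

In higher dimensions the same idea needs the adjacency structure of the vertices, which is what dedicated on-line vertex enumeration subroutines maintain.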
By replacing f(x), if necessary, with min{α, f(x)} where α > 0 is sufficiently large, we may assume that the upper level sets of the concave function f(x) are bounded. According to (7.35), the distinguished point x^k of P_k in the above OA scheme is a global optimal solution of the relaxed problem
to obtain the optimal value μ(M) and a basic optimal solution t(M). Let ω(M) = x^0 + Ut(M). If μ(M) > 1, compute z^i = x^0 + μ(M) λ_i u^i, i = 1, …, n, and
Step 7. (Updating) Let x^{k+1} be the best of the feasible points x^k, ω^k. Let γ_{k+1} = f(x^{k+1}), N_{k+1} = (R_k \ {M_k}) ∪ P_{k+1}. Set k ← k + 1 and go back to Step 1.
Theorem 7.5 The Combined Algorithm can be infinite only if ε = 0, and in that case every accumulation point of the sequence {x^k} is a global optimal solution of (GCP).
7.3 Reverse Convex Programming 189
Proof First observe that for any k we have β(M_k) ≤ min f(D) while f(ω^k) ≥ min{f(z^{k1}), …, f(z^{kn})} = β(M_k). Therefore, when Step 4 occurs, ω^k is effectively a global optimal solution. We now show that for ε > 0, if Step 5 never occurs, the current best value γ_k decreases by at least a quantity ε after finitely many iterations. In fact, suppose that γ_k = γ_{k_0} for all k ≥ k_0. Let C = {x ∈ ℝ^n | f(x) ≥ γ_{k_0}} and by z^{ki} denote the intersection point of the ith edge of M_k with ∂C. Let μ_k = μ(M_k) and by q^k denote the intersection point of the halfline from x^0 through ω^k with the simplex [z^{k1}, …, z^{kn}]. By Theorem 6.2 there exists a subsequence {k_s} ⊂ {1, 2, …} and a point q ∈ ∂C such that q^{k_s} → q as s → +∞. Moreover, μ_{k_s} → 1 as s → +∞ [see (7.22) in Remark 7.3]. Since ω^{k_s} − x^0 = μ_{k_s}(q^{k_s} − x^0) it follows that ω^{k_s} → q ∈ ∂C, hence f(ω^{k_s}) → f(q) = γ_{k_0} as s → +∞. But f(ω^k) ≥ β(M_k) for all k, hence β(M_{k_s}) ≤ f(ω^{k_s}) → γ_{k_0} as s → +∞. Therefore, for s sufficiently large we will have β(M_{k_s}) ≥ γ_{k_0} − ε = γ_{k_s} − ε, and then the algorithm stops by the criterion in Step 3, yielding γ_{k_0} as the global ε-optimal value and x^{k_s} as a global ε-optimal solution. Thus if γ_{k_0} is not yet the global ε-optimal value, we must have γ_k ≤ γ_{k_0} − ε for some k > k_0. Since min f(D) < +∞, this proves the finiteness of the algorithm. □
Remark 7.9 If for solving (SP_k) the CS Algorithm is used instead of the BB algorithm, then we have a Combined OA/CS Conical Algorithm for (GCP). In Step 1 of the latter algorithm, the values λ_i are computed from the condition f(x^0 + λ_i u^i) = γ_k − ε, and there is no need to compute β(M), while in Step 2 a cone M is deleted if μ(M) ≤ 1. The OA/CS Algorithm involves fewer function evaluations than the OA/BB Algorithm.
Combined algorithms of the above type were introduced for general global optimization problems in Tuy and Horst (1988) under the name of restart BB Algorithms. A related technique of combining BB with OA developed in Horst et al. (1991) leads essentially to the same algorithm in the case of (GCP).
    h(x) ≥ 0,        (7.37)
¹ In the above formulation the function h(x) is assumed to be finite throughout ℝ^n; if this condition may fail to be satisfied (as in the last example) certain results below must be applied with caution.
Here D is a polyhedron and C is a closed convex set. For the sake of simplicity we will assume D nonempty and C bounded. Furthermore, it is natural to assume that
which simply means that the reverse convex constraint h(x) ≥ 0 is essential, so that (LRC) does not reduce to the trivial underlying linear program. Thus, if w is a basic optimal solution of the linear program then
Proposition 7.7 For any z ∈ D \ C there exists a better feasible solution y lying on the intersection of the boundary ∂C of C with an edge E of the polyhedron D.
Proof We show how to compute y satisfying the desired conditions. If p ∈ ∂h(z) then the polyhedron {x ∈ D | ⟨p, x − z⟩ + h(z) = 0} is contained in the feasible set, and a basic optimal solution z^1 of the linear program min{cx | x ∈ D, ⟨p, x − z⟩ + h(z) = 0} will be a vertex of the polyhedron P = {x ∈ D | cx ≤ cz^1}. But in view of (7.43) z^1 is not optimal to the linear program min{cx | x ∈ P}. Therefore, starting from z^1 and using simplex pivots for solving the latter program we can move from a vertex of D to a better adjacent vertex until we reach a vertex u with an adjacent vertex v such that h(u) ≥ 0 but h(v) < 0. The intersection of ∂C with the line segment [u, v] will be the desired y. □
Corollary 7.3 (Edge Property) If (LRC) is solvable then at least one optimal solution lies on the intersection of the boundary of C with an edge of D.
If ℰ denotes the collection of all edges of D and, for every E ∈ ℰ, π(E) is the set of extreme points of the line segment E ∩ C, then it follows from Corollary 7.3 (Hillestad and Jacobsen 1980b) that an optimal solution must be sought only among the finite set

    X = ∪{π(E) | E ∈ ℰ}.        (7.44)

    {x ∈ D | cx ≤ cx̄} ⊂ C.        (7.45)
192 7 DC Optimization Problems
    max{h(x) | x ∈ D(γ)} ≤ 0,        (7.47)

where the maximization problem on the left-hand side is a (BCP), since h(x) is convex. In the general case when regularity is not assumed, condition (7.45) [i.e., (7.47)] may hold without x̄ being a global optimal solution, as illustrated in Fig. 7.6.
Fortunately, a slight perturbation of (LRC) can always make it regular, as shown in the following:
Proposition 7.8 For ε > 0 sufficiently small, the problem
is regular and its optimal value tends to the optimal value of (LRC) as ε → 0.
Proof Since the vertex set V of D is finite, there exists ε_0 > 0 so small that ε ∈ (0, ε_0) implies h_ε(x) := h(x) + ε(‖x‖² + 1) ≠ 0 for all x ∈ V. Indeed, if V_1 = {x ∈ V | h(x) < 0} and ε_0 satisfies h(x) + ε_0(‖x‖² + 1) < 0 for all x ∈ V_1, then whenever 0 < ε < ε_0 we have h(x) + ε(‖x‖² + 1) < 0 for all x ∈ V_1, while h(x) + ε(‖x‖² + 1) ≥ 0 + ε > 0 for all x ∈ V \ V_1. Now consider problem (7.48) for 0 < ε < ε_0 and let x ∈ D be such that h_ε(x) ≥ 0. If x ∉ V then x is the midpoint of some line segment Δ ⊂ D, and since the function h_ε(x) is strictly convex, any neighborhood of x must contain a point x′ of Δ such that h_ε(x′) > 0. On the other hand, if x ∈ V, then h_ε(x) > 0, and any point x′ of D sufficiently near to x will satisfy h_ε(x′) > 0. Thus, given any
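The perturbation used in Proposition 7.8 is easy to experiment with numerically; a sketch in which the concrete h and the sample points are assumed for illustration:

```python
def h_eps(h, eps):
    """Perturbed constraint h_eps(x) = h(x) + eps*(||x||^2 + 1).
    The added quadratic term makes h_eps strictly convex, which is
    what rules out h_eps vanishing on a whole segment and hence
    removes isolated feasibility, as the proof above exploits."""
    return lambda x: h(x) + eps * (sum(xi * xi for xi in x) + 1.0)

h = lambda x: x[0] * x[0] - 1.0        # an assumed convex h
he = h_eps(h, 1e-3)

# Strict convexity along a segment where plain h is flat: the midpoint
# value lies strictly below the average of the endpoint values.
u, v = (0.0, -2.0), (0.0, 2.0)
mid = tuple((a + b) / 2 for a, b in zip(u, v))
assert he(mid) < 0.5 * (he(u) + he(v))
```

Note that plain h above is constant along the chosen segment, so the strict inequality comes entirely from the ε‖x‖² term.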
is called an ε-approximate optimal solution to (LRC). The next result shows that for ε > 0 sufficiently small an ε-approximate optimal solution is as close to an optimal solution as desired.
Proposition 7.9 If for each ε > 0, x_ε is an ε-approximate optimal solution of (LRC), then any accumulation point x̄ of the sequence {x_ε} solves (LRC).
Proof Clearly h(x̄) ≥ 0, hence x̄ is feasible to (LRC). For any feasible solution x of (LRC), the condition (7.50) implies that cx̄ ≤ min{cx | x ∈ D, h(x) ≥ 0}, hence x̄ is an optimal solution of (LRC). □
Observe that if C_ε = {x | h(x) ≤ −ε} then a feasible solution x_ε to the problem (7.49) such that
Thus the search for an ε-approximate optimal solution of (LRC) reduces to solving the parametric dc feasibility problem
analogous to (7.5) for (BCP). This leads to a method for solving (LRC) which is essentially the same as that used for solving (BCP).
In the following, X_ε denotes the set analogous to the set X in (7.44) but defined for problem (LRC_ε) instead of (LRC).
CS (Cone Splitting) Restart Algorithm for (LRC)
Initialization Solve the linear program min{cx | x ∈ D} to obtain a basic optimal solution w. If h(w) ≥ −ε, terminate: w is an ε-approximate optimal solution. Otherwise:
(a) If a vertex z^0 of D is available such that h(z^0) > −ε, go to Phase 1.
(b) Otherwise, set γ = +∞, D(γ) = D, and go to Phase 2.
Phase 1. Starting from z^0 search for a feasible point x̄ ∈ X_ε [see (7.44)]. Let γ = cx̄, D(γ) = {x ∈ D | cx ≤ γ}. Go to Phase 2.
Phase 2. Call Procedure DC to solve problem (∗) (i.e., apply Procedure DC in Sect. 6.1 with D(γ), C_ε and C in place of D, C, C_ε, resp.).
(1) If a point z ∈ D(γ) \ C_ε is obtained, set z^0 ← z and go to Phase 1.
(2) If D(γ) ⊂ C then terminate: x̄ is an ε-approximate optimal solution (when γ < +∞), or the problem is infeasible (when γ = +∞).
the algorithm does not assume regularity of the problem. However, if the problem is regular, then the algorithm with ε = 0 will find an exact global optimal solution after finitely many steps, though it may require infinitely many steps to identify it as such.
Since the current feasible solution moves in Phase 1 forward from the region h(x) > 0 to the frontier h(x) = 0, and in Phase 2 backward from this frontier to the region h(x) > 0, the algorithm is also called the forward–backward (FW/BW) algorithm (Fig. 7.7).
Example 7.2

    Minimize −2x_1 + x_2
    subject to   x_1 + x_2 ≤ 10
                −x_1 + 2x_2 ≤ 8
                 x_1 − x_2 ≤ 4
                 2x_1 − 3x_2 ≤ 6
                 x_1, x_2 ≥ 0
                 x_1² − x_1 x_2 + x_2² − 6x_1 ≥ 0.
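The extraction lost the signs in Example 7.2, so the sketch below uses one plausible reading of the data; it merely shows how a candidate point is tested against the linear constraints and the reverse convex constraint h(x) = x_1² − x_1 x_2 + x_2² − 6x_1 ≥ 0:

```python
def is_feasible(x1, x2, tol=1e-9):
    """Feasibility test for Example 7.2 (signs as reconstructed here,
    not guaranteed to match the printed book)."""
    linear = [
        x1 + x2 <= 10 + tol,
        -x1 + 2 * x2 <= 8 + tol,
        x1 - x2 <= 4 + tol,
        2 * x1 - 3 * x2 <= 6 + tol,
        x1 >= -tol,
        x2 >= -tol,
    ]
    # reverse convex constraint: h(x) >= 0, with h convex
    reverse_convex = x1 * x1 - x1 * x2 + x2 * x2 - 6 * x1 >= -tol
    return all(linear) and reverse_convex

print(is_feasible(0.0, 0.0))   # h(0,0) = 0: on the reverse convex boundary
print(is_feasible(3.0, 3.0))   # h(3,3) = 9 - 9 + 9 - 18 < 0: infeasible
```

The second point satisfies every linear constraint but falls in the interior of C = {x | h(x) ≤ 0}, which is exactly the region the reverse convex constraint excludes.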
the closed convex set C = {x | h(x) ≤ 0}. Consider a subcone M of M_0, of base
Furthermore, if a basic optimal solution t(M) of this linear program is such that ω(M) = x^0 + Ut(M) lies on some edge of M, then ω(M) ∈ S, and hence β(M) = min{cx | x ∈ M ∩ S}, where S = D \ int C is the feasible set of (LRC).
Proof Clearly x ∈ M \ int C if and only if x = x^0 + Ut, t ≥ 0, Σ_{i=1}^n t_i/λ_i ≥ 1. Since M ∩ S ⊂ (M ∩ D) \ int C, a lower bound of min{cx | x ∈ M ∩ S} is given by the optimal value β(M) of the linear program

    min{ ⟨c, x^0 + Ut⟩ | x^0 + Ut ∈ D, t ≥ 0, Σ_{i=1}^n t_i/λ_i ≥ 1 }        (7.54)

which is nothing but (7.53). To prove the second part of the proposition, observe that if ω(M) lies on the ith edge it must satisfy ω(M) = x^0 + θ(y^i − x^0) for some θ ≥ 1 and, since x^0 ∈ D and ω(M) ∈ D, it follows by convexity of D that y^i ∈ D, hence y^i ∈ S. In view of assumption (7.42) this implies that cx^0 < cy^i, i.e., c(y^i − x^0) > 0, and hence ⟨c, ω(M) − y^i⟩ = (θ − 1)⟨c, y^i − x^0⟩ > 0 if θ > 1. Since cω(M) ≤ cy^i from the definition of ω(M), we must have θ = 1, i.e., ω(M) ∈ D \ int C. □
u
As a consequence of this proposition, when ω(M) is not feasible to (LRC) it does not lie on any edge of M, so the subdivision of M upon the halfline from x^0 through ω(M) is proper. Also, if all y^i, i = 1, …, n, belong to D, then ω(M) is certainly feasible (and is one of the y^i), hence β(M) is the exact minimum of cx over the feasible portion of M, and M will be fathomed in the branch and bound process.
Thus, in a conical procedure using the above lower bounding, only those cones which contain boundary points of the feasible set will actually remain for consideration. In other words, the search will concentrate on regions which are potentially of interest.
In the next algorithm, let ε ≥ 0 be the tolerance.
Conical BB Algorithm for (LRC)
Initialization Solve the linear program min{cx | x ∈ D} to obtain a basic optimal solution w. If h(w) ≥ −ε, terminate: w is an ε-approximate optimal solution. Otherwise:
(a) If a vertex z^0 of D is available such that h(z^0) > −ε, go to Phase 1.
(b) Otherwise, set γ_1 = +∞, D(γ_1) = D, and go to Phase 2.
Phase 1. Starting from z^0 search for a point x^1 ∈ X_ε [see (7.44)]. Let γ_1 = cx^1, D(γ_1) = {x ∈ D | cx ≤ γ_1}. Go to Phase 2.
Phase 2. Let M_0 be the canonical cone associated with the vertex x^0 = w of D, let N_1 = {M_0}, P_1 = {M_0}. Set k = 1.
Step 1. (Bounding) For each cone M ∈ P_k whose base is [x^0 + u^1, …, x^0 + u^n] and whose edges meet ∂C at points y^i = x^0 + λ_i u^i, i = 1, …, n, solve the linear program (7.53), where U denotes the matrix of columns u^i = y^i − x^0, i = 1, …, n. Let β(M), t(M) be the optimal value and a basic optimal solution of this linear program, and let ω(M) = x^0 + Ut(M).
Step 2. (Options) If some ω(M), M ∈ P_k, are feasible to (LRC_ε) (7.49) and satisfy ⟨c, ω(M)⟩ < γ_k, then reset γ_k equal to the smallest among these values and let x^k be the corresponding point (i.e., the x such that ⟨c, x⟩ = γ_k). Go to Step 3 or (optionally) return to Phase 1.
Step 3. (Pruning) Delete every cone M ∈ N_k such that β(M) ≥ γ_k and let R_k be the collection of remaining cones.
Step 4. (Termination Criterion) If R_k = ∅, then terminate:
(a) If γ_k < +∞, then x^k is an ε-approximate optimal solution.
(b) Otherwise, (LRC) is infeasible.
Step 5. (Branching) Select M_k ∈ argmin{β(M) | M ∈ R_k}. Subdivide M_k along the halfline from x^0 through ω^k = ω(M_k). Let P_{k+1} be the partition of M_k.
Step 6. (New Net) Set N_{k+1} = (R_k \ {M_k}) ∪ P_{k+1}, k ← k + 1 and return to Step 1.
Theorem 7.7 The above algorithm can be infinite only if ε = 0, and then any accumulation point of each of the sequences {ω^k}, {x^k} is a global optimal solution of (LRC).
Proof Since the set X_ε [see (7.44)] is finite, it suffices to show that Phase 2 is finite, unless ε = 0. If Phase 2 is infinite it generates at least a filter {M_k, k ∈ K}. Denote by y^{ki}, i = 1, …, n, the intersections of the edges of M_k with ∂C, and by q^k the intersection of the halfline from x^0 through ω^k := ω(M_k) with the hyperplane through y^{k1}, …, y^{kn}. By Theorem 6.2, there exists at least one accumulation point of {q^k}, say q = lim q^k (k → ∞, k ∈ K_1 ⊂ K), such that q ∈ ∂C. We may assume that ω^k → ω (k → ∞, k ∈ K_1). But, as a result of Steps 2 and 3, ω^k ∈ int C_ε, hence ω ∈ C_ε. On the other hand, q^k ∈ [x^0, ω^k], hence q ∈ [x^0, ω]. This is possible only if ε = 0 and ω = q ∈ ∂C, so that ω is feasible. Since, furthermore, cω^k ≤ min{cx | x ∈ D \ int C} for all k, it follows that ω, and hence any accumulation point of the sequence {x^k}, is a global optimal solution. □
Remark 7.11 It should be emphasized that the above algorithm does not require regularity of the problem, even when ε = 0. However, in the latter case, since every ω^k is infeasible, the procedure may approach a global optimal solution through a sequence of infeasible points only. This situation is illustrated in Fig. 7.9, where
In its general formulation (cf. Sect. 5.3) the Canonical DC Programming problem is to find

    (CDC)    min{ f(x) | g(x) ≤ 0 ≤ h(x) },        (7.55)

where the objective function is convex (in x, z) and g(x) = max_{i=1,…,m} [g_{i,1}(x) − g_{i,2}(x)] is a dc function (cf. Proposition 4.1). Next, the dc inequality g(x) = p(x) − q(x) ≤ 0 with p(x), q(x) convex can be split into two inequalities:
where the first is a convex inequality and the second is a reverse convex inequality. By changing the notation if necessary, one obtains the desired canonical form. □
Whereas the application of the OA technique to (GCP) is almost straightforward, it is much less obvious for (CDC) and requires special care to ensure convergence. By hypothesis the sets
are closed and convex. For simplicity we will assume D bounded, although with minor modifications the results can be extended to the unbounded case.
If an optimal solution w of the convex program min{f(x) | g(x) ≤ 0} satisfies h(w) ≥ 0, then the problem is solved (w is a global optimal solution). Therefore, without loss of generality we may assume that there exists a point w satisfying

    G = D(γ̄),    Ω = D(γ) \ int C_ε.
By (7.63) and Proposition 7.12, it is easily seen that Ω coincides with the set of ε-approximate optimal solutions of (CDC), so the problem amounts to searching for a point x ∈ Ω. Denote by 𝒫 the family of polytopes P in ℝ^n for which there exists γ ∈ [γ̄, +∞] satisfying G ⊂ D(γ) ⊂ P. For every P ∈ 𝒫 define
where V is the vertex set of P. Let us verify that Conditions A1 and A2 stated in Sect. 6.3 are satisfied. Condition A1 is obvious because x(P) exists and can be computed provided V ≠ ∅; moreover, since P ⊃ G, we must have h(x(P)) ≥ −ε, so any accumulation point x̄ of a sequence x(P_k), k = 1, 2, …, satisfies h(x̄) ≥ −ε, and hence x̄ ∈ Ω whenever x̄ ∈ G.
To verify A2, consider any z = x(P) associated to a polytope P such that D(γ) ⊂ P for some γ ∈ [γ̄, +∞]. As we just saw, h(z) ≥ −ε. If h(z) < 0 then max{h(x) | x ∈ P} < 0, hence D(γ) ⊂ {x | h(x) < 0}, which implies that γ is an ε-approximate optimal value (if γ < +∞) or that the problem is infeasible (if γ = +∞). If h(z) ≥ 0 then max{g(z), h(z) + ε} > 0 and, since max{g(w), h(w) + ε} < 0, we can compute a point y such that
We are led to the following OA Procedure (Fig. 7.10) for finding an ε-approximate optimal solution of (CDC):
OA Algorithm for (CDC)
Step 0. Let x^1 be the best feasible solution available, γ_1 = f(x^1) (if no feasible solution is known, set x^1 = ∅, γ_1 = +∞). Take a polytope P_0 ⊃ D and let P_1 = {x ∈ P_0 | f(x) ≤ γ_1}. Determine the vertex set V_1 of P_1. Set k = 1.
Step 1. Compute x^k ∈ argmax{h(x) | x ∈ V_k}. If h(x^k) < 0, then terminate:
(a) If γ_k < +∞, x^k is an ε-approximate optimal solution;
(b) If γ_k = +∞, the problem is infeasible.
Step 3. If h(y^k) = −ε, then set γ_{k+1} = min{γ_k, f(y^k)}, with x^{k+1} = y^k if f(y^k) ≤ γ_k and x^{k+1} = x^k otherwise. Let
Step 4. Compute the vertex set V_{k+1} of P_{k+1} = P_k ∩ {x | l_k(x) ≤ 0}, set k ← k + 1 and go back to Step 1.
Theorem 7.9 The OA Algorithm for (CDC) can be infinite only if ε = 0 and in this case, if the problem is regular, any accumulation point of the sequence {x^k} will solve (CDC).
Proof Suppose that the algorithm is infinite and let x̄ = lim_{q→+∞} x^{k_q}. It can easily be checked that all conditions of Theorem 6.5 are fulfilled (w ∈ int G by (7.58); y^k ∈ [w, x^k] \ int G, l_k(y^k) = 0). By this theorem, x̄ = lim y^{k_q}, therefore h(x̄) = lim h(y^{k_q}) ≤ −ε. On the other hand, h(x^k) ≥ 0, hence h(x̄) ≥ 0. This is a contradiction unless ε = 0. Since g(y^k) ≤ 0 for all k, it follows that x̄ ∈ D. Thus x̄ is a feasible solution. Now let γ̃ = lim γ_k. Since for every k, h(x^k) = max{h(x) | x ∈ P_k} and D(γ_k) ⊂ P_k, it follows that 0 = h(x̄) = max{h(x) | x ∈ D(γ̃)}. Hence D(γ̃) ⊂ C. By the optimality criterion (7.60) this implies, under the regularity assumption, that x̄ is a global optimal solution and γ̃ is the optimal value. Furthermore, since γ_k = f(x^k), any accumulation point of the sequence {x^k} is also a global optimal solution. □
u
Remark 7.12 No regularity condition is required when ε > 0. It can easily be checked that the convergence proof remains valid if in Step 3 of certain iterations the cut is taken with p^k ∈ ∂h(y^k) instead of p^k ∈ ∂f(y^k). This flexibility can be used to control the speed of convergence.
Remark 7.13 The above algorithm presupposes the availability of a point w ∈ int D ∩ int C, i.e., satisfying max{g(w), h(w)} < 0. Such a point, if not readily available, can be computed by methods of convex programming. It is also possible to modify the algorithm so as to dispense with the computation of w (Tuy 1995b).
An earlier version of the above algorithm with ε = 0 was developed in Tuy (1987a). Several modified versions of this algorithm have also been proposed (e.g., Thoai 1988) which, however, were later shown to be nonconvergent. In fact, the convergence of cutting methods for (CDC) is a more delicate question than it seems.
7.4 Canonical DC Programming 203
By translating if necessary, assume that (7.58) holds with w = 0 and that the feasible set D \ int C lies entirely in the orthant M_0 = ℝ^n_+. It is easily verified that the above OA algorithm still works if, in the case when max{h(x) | x ∈ P_k} > 0, x(P_k) is taken to be any point x^k ∈ P_k satisfying h(x^k) > −ε rather than a maximizer of h(x) over P_k as in (7.64). As shown in Sect. 6.3, such a point x^k can be computed by applying the Procedure DC to the inclusion

    (SP_k)    x ∈ P_k \ C_ε.

For ε > 0, after finitely many steps this procedure either returns a point x^k such that h(x^k) > −ε, or establishes that max{h(x) | x ∈ P_k} < 0 (the latter implies that the current incumbent is an ε-approximate optimal solution if γ_k < +∞, or that the problem is infeasible if γ_k = +∞). Furthermore, since P_{k+1} differs from P_k by only one additional linear constraint, the last conical partition in the DC procedure for (SP_k) can be used to start the procedure for (SP_{k+1}). We can thus formulate the following combined algorithm (recall that w = 0):
Combined OA/CS Conical Algorithm for (CDC)
Initialization Let γ_1 = cx^1, where x^1 is the best feasible solution available (if no feasible solution is known, set x^1 = ∅, γ_1 = +∞). Take a polytope P_0 ⊃ D and let P_1 = {x ∈ P_0 | cx ≤ γ_1}. Set N_1 = 𝒫_1 = {ℝ^n_+} (any subcone of ℝ^n_+ will be assumed to have its base in the simplex spanned by the unit vectors e^i, i = 1, …, n). Set k = 1.
Step 1. (Evaluation) For each cone M ∈ 𝒫_k of base [u^1, …, u^n], compute the point λ_i u^i where the ith edge of M meets ∂C. Let U be the matrix of columns u^1, …, u^n. Solve the linear program

    max{ Σ_{i=1}^n t_i/λ_i | Ut ∈ P_k, t ≥ 0 }

to obtain the optimal value μ(M) and a basic optimal solution t(M). Let ω(M) = Ut(M).
Step 2. (Screening) Delete every cone M ∈ N_k such that μ(M) < 1, and let R_k be the set of remaining cones.
Step 3. (Termination Criterion 1) If R_k = ∅, then terminate: x^k is an ε-approximate optimal solution of (CDC) (if γ_k < +∞), or the problem is infeasible (if γ_k = +∞).
Step 4. (Termination Criterion 2) Select M_k ∈ argmax{μ(M) | M ∈ R_k}. If ω^k = ω(M_k) ∈ D(γ_k) := {x ∈ D | f(x) ≤ γ_k}, then terminate: ω^k is a global optimal solution.
Step 5. (Branching) Subdivide M_k along the ray through ω^k = ω(M_k). Let 𝒫_{k+1} be the partition of M_k.
Step 6. (Outer Approximation) If h(ω(M)) > −ε for some M ∈ R_k then let x^k = ω(M) and y^k ∈ [0, x^k] such that max{g(y^k), h(y^k) + ε} = 0.
(a) If g(y^k) = 0 let l_k(x) ≤ 0 be the cut (7.65), x^{k+1} = x^k, γ_{k+1} = γ_k.
(b) Otherwise, let l_k(x) ≤ 0 be the cut (7.66), x^{k+1} = y^k, γ_{k+1} = cy^k.
Step 7. (Updating) Let N_{k+1} = (R_k \ {M_k}) ∪ 𝒫_{k+1}, P_{k+1} = {x ∈ P_k | l_k(x) ≤ 0}. Set k ← k + 1 and go back to Step 1.
Note that in Step 2 a cone M is deleted if μ(M) < 1 (strict inequality), so R_k = ∅ implies that max{h(x) | x ∈ P_k} < 0. It is easy to prove that, analogously to the Combined Algorithm for (GCP) (Theorem 7.5), the above Combined Algorithm for (CDC) can be infinite only if ε = 0 and, in this case, if the problem is regular, every accumulation point of the sequence {x^k} is a global optimal solution of (CDC).
Consider, for example, the quadratic program depicted in Fig. 7.11, where the objective function f(x) is linear and the nonconvex constraint set is defined by the inequalities a ≤ x ≤ b, g_i(x) ≤ 0 (i = 1, 2, 3). The optimal solution is x*, while the point x̃ is infeasible, but almost feasible:
for some small δ > 0. If ε > δ then x̃ is feasible with tolerance ε and will be accepted as an ε-approximate optimal solution, though it is quite far off the optimal solution x*. But if ε < δ then x̃ is no longer feasible with tolerance ε and the ε-approximate solution will come close to x*.
Thus, even for regular problems (problems with no isolated feasible solution), the ε-approximation approach may give an incorrect optimal solution if ε is not sufficiently small. The trouble is that in practice we often do not know exactly what "sufficiently small" means, i.e., we do not know how small the tolerance should be to guarantee a correct approximate optimal solution.
Things are much worse when the problem happens to have a global optimal solution which is an isolated feasible solution. In that case a slight perturbation of the data may cause a drastic change of the global optimal solution; consequently, a small change of the tolerances ε, δ may cause a drastic change of the (ε, δ)-approximate optimal solution. Due to this instability an isolated global optimal solution is very difficult to compute and very difficult to implement when computable. To avoid such annoying situations a common practice is to restrict consideration to problems with a robust feasible set, i.e., a feasible set D satisfying

    D = cl(int D),        (7.67)

where cl and int denote the closure and the interior, respectively. Unfortunately, this condition (7.67), which implies the absence of isolated feasible solutions, is generally very hard to check for a given nonconvex problem. In practice we often have to deal with feasible sets which are not known a priori to contain isolated points or not.
Due to the fact that f(x) is convex, a key property of problem (Q_γ) is that its feasible set is a convex set, so (Q_γ) has no isolated feasible solution and computing a feasible solution of it can be done at cheap cost. Therefore, as shown in Sect. 7.5, an adaptive BB Algorithm can be devised that, for any given tolerance δ > 0, can compute a δ-optimal solution of (Q_γ) in finitely many steps (Proposition 5.2). Furthermore, such a δ-optimal solution is stable under small changes of δ.
For our purpose of solving problem (P), this property of (Q_γ), together with the following relationship between (P_ε) and (Q_γ), turns out to be crucial. Let min(P_ε), min(Q_γ) denote the optimal values of problems (P_ε), (Q_γ), respectively.
Proposition 7.13 For every ε > 0, if min(Q_γ) > −ε then min(P_ε) > γ.
Proof If min(Q_γ) > −ε, then any x ∈ [a, b] such that g(x) ≤ −ε must satisfy f(x) > γ, hence, by compactness of the feasible set of (P_ε), min{f(x) | g(x) ≤ −ε, x ∈ [a, b]} > γ. □
This almost trivial proposition simply expresses the interchangeability between objective and constraint in many practical situations (Tuy 1987a, 2003). Exploiting this interchangeability, the basic idea of the robust approach is to replace the original problem (P), possibly very difficult, by a sequence of easier, stable problems (Q_γ), where the parameter γ can be iteratively adjusted until a stable (robust) solution to (P) is obtained.
(SP_γ) Given a real number γ and ε > 0, check whether problem (P) has a nonisolated feasible solution x̄ satisfying f(x̄) ≤ γ, or else establish that no ε-essential feasible solution x exists such that f(x) ≤ γ.
where β(M_k) ≤ v(Q_γ) (note that the objective function is g(x), while the constraints are f(x) ≤ γ, x ∈ [a, b]).
Proposition 7.14 Let ε > 0 be given. Either g(x^k) < 0 for some k or β(M_k) > −ε for some k. In the former case, x^k is a nonisolated feasible solution of (P) satisfying f(x^k) ≤ γ. In the latter case, no ε-essential feasible solution x of (P) exists such that f(x) ≤ γ (so, if γ = f(x̄) − δ for a given δ > 0 and a nonisolated feasible solution x̄, then x̄ is an essential (ε, δ)-optimal solution).
Proof If g(x^k) < 0, then x^k is an essential feasible solution of (P) satisfying f(x^k) ≤ γ, because by continuity g(x) < 0 for all x in a neighborhood of x^k. If β(M_k) > −ε, then v(Q_γ) > −ε, hence, by Proposition 7.13, v(P_ε) > γ.
It remains to show that either g(x^k) < 0 for some k, or β(M_k) > −ε for some k. Suppose the contrary, that
Since by (7.68) g(x^k) ≥ 0 for all k, it follows that g(x*) ≥ 0. On the other hand, by (7.69) and (7.68) we have g(x*) = lim_{k→+∞} β(M_k) ≤ −ε. This contradiction shows that the event (7.68) is impossible, completing the proof. □
Thus, with the stopping criterion g(x^k) < 0 or β(M_k) > −ε, an adaptive BB procedure for solving (Q_γ) will help to solve the subproblem (SP_γ). Note that β(M_k) = min{β(M) | M ∈ R_k} (R_k being the collection of partition sets remaining for exploration at iteration k). So if the deletion (pruning) criterion in each iteration of this procedure is β(M) > −ε, then the stopping criterion β(M_k) > −ε is nothing but the usual criterion R_k = ∅.
For brevity an adaptive BB Algorithm for solving (Q_γ) with the deletion criterion β(M) > −ε and stopping criterion g(x^k) < 0 will be referred to as Procedure (SP_γ). Using this procedure, problem (P) can be solved according to the following scheme:
Successive Incumbent Transcending (SIT) Scheme
Start with $\gamma = \gamma_0$, where $\gamma_0 = f(a) - \eta$.
Call Procedure $(SP_\gamma)$. If a nonisolated feasible solution $\bar{x}$ of (P) is obtained with $f(\bar{x}) \le \gamma$, reset $\gamma \leftarrow f(\bar{x}) - \eta$ and repeat. Otherwise, stop: if $\gamma = f(\bar{x}) - \eta$ for a nonisolated feasible solution $\bar{x}$, then $\bar{x}$ is an essential $(\varepsilon, \eta)$-optimal solution; if $\gamma = \gamma_0$, then problem (P) has no $\varepsilon$-essential feasible solution.
Since $f(D)$ is compact and $\eta > 0$, this scheme is necessarily finite.
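A minimal numerical sketch of this scheme may help fix ideas. It is our illustration only: the call to Procedure $(SP_\gamma)$ is replaced by a grid search over a one-dimensional interval, and the functions `f`, `g` and the tolerances `eps`, `eta` are assumed toy data, not from the book.

```python
import numpy as np

# Toy SIT loop: a grid search stands in for the adaptive BB
# Procedure (SP_gamma); for real problems that inner solver is
# the branch-and-bound procedure described in the text.
def sit_minimize(f, g, a, b, eps=1e-3, eta=1e-3, grid=2001):
    xs = np.linspace(a, b, grid)
    fx, gx = f(xs), g(xs)
    gamma = fx[0] - eta                   # gamma_0 = f(a) - eta
    incumbent = None
    while True:
        # eps-essential points transcending the incumbent:
        # g(x) <= -eps (nonisolated feasibility) and f(x) <= gamma
        mask = (gx <= -eps) & (fx <= gamma)
        if not mask.any():
            # no eps-essential point with f <= gamma remains: stop
            return incumbent, gamma + eta
        i = np.argmin(np.where(mask, fx, np.inf))
        incumbent = xs[i]
        gamma = fx[i] - eta               # transcend the incumbent and repeat
```

For instance, minimizing $f(x) = (x-3)^2$ subject to $g(x) = 1 - x \le 0$ on $[0, 4]$ returns the nonisolated solution $x \approx 3$. Each pass decreases $\gamma$ by at least $\eta$, so the loop is finite, mirroring the finiteness argument above.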
For the sake of convenience let us rewrite problem (P) in the form:
The points $x^k = x^{M_k}$, $y^k = y^{M_k}$ are such that $x^k = y^k$ implies $\min\{g_1(x) - g_2(x) \mid x \in M_k\} = 0$; hence it is easy to verify that condition (7.71) is satisfied.
Special Case: When $f, g$ are polynomials or signomials, one can define at each iteration $k$ a convex program (e.g., an SDP)
and, taking an optimal solution $(x^k, u^k)$ of $(L_k)$, one has a couple $x^k, y^k$ satisfying (7.73) and
So far we have assumed that all nonconvex constraints are of inequality type: $g_i(x) \le 0$, $i = 1, \ldots, m$. If there are equality constraints such as
$$h_j(x) = 0, \quad j = 1, \ldots, s,$$
with all functions $h_1, \ldots, h_s$ nonlinear, the SIT algorithm for whatever $\varepsilon > 0$ will conclude that there is no $\varepsilon$-essential feasible solution. This does not mean, however, that the problem has no nonisolated feasible solution. In that case, since an exact solution to a nonlinear system of equations cannot be expected to be computable in finitely many steps, one should be content with replacing every equality constraint $h_j(x) = 0$ by an approximate one
$$|h_j(x)| \le \delta,$$
where $\delta > 0$ is the tolerance. The above approach can then be applied to the resulting set of inequality constraints.
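A small helper can carry out this substitution mechanically; the function below is an illustrative sketch (not from the book), turning each equality into the pair of inequalities $h_j(x) - \delta \le 0$ and $-h_j(x) - \delta \le 0$.

```python
# Replace each equality h_j(x) = 0 by |h_j(x)| <= delta, encoded as two
# inequality constraints of the standard form g(x) <= 0.
def relax_equalities(h_list, delta):
    out = []
    for h in h_list:
        out.append(lambda x, h=h: h(x) - delta)    # h_j(x) <= delta
        out.append(lambda x, h=h: -h(x) - delta)   # h_j(x) >= -delta
    return out
```

The resulting list can be appended to the original inequality constraints $g_i(x) \le 0$ before running the SIT algorithm.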
212 7 DC Optimization Problems
This problem has one isolated feasible point $(1, 3, 5)$, which is also the optimal solution. With $\varepsilon = \eta = 0.01$, the essential $(\varepsilon, \eta)$-optimal solution $(3.747692, 7.171420, 2.362317)$ with objective function value 3.747692 is found by the SIT algorithm after 15,736 iterations.
Example 7.4 ($n = 4$) Minimize $(3 + x_1 x_3)(x_1 x_2 x_3 x_4 + 2 x_1 x_3 + 2)^{2/3}$ subject to
with objective function value 5.906278 is found by the SIT Algorithm at iteration
667, and confirmed as such at iteration 866.
7.6 DC Optimization in Design Calculations 213
Global optimization problems with nonconvex constraint sets are difficult in two
major aspects: (1) a feasible solution may be very hard to determine; (2) the optimal
solution may be an isolated feasible solution, in which case a correct solution cannot
be computed by a finite procedure and implemented successfully.
To cope with these difficulties, the robust approach consists basically of finding a nonisolated feasible solution and improving it step by step. The resulting algorithm works its way to the best nonisolated optimal solution through a number of cycles of incumbent transcending. A major advantage of this approach, apart from stability, is that even when stopped prematurely it may still provide a good nonisolated feasible solution, in contrast to other current methods, which are almost useless in that case.
Except in special cases, both of these subproblems are nonconvex. Even in the relatively simple case when $S$ is convex nonpolyhedral and $p(x) = \|x\|$ (the Euclidean norm), the subproblem (7.76), for fixed $x$, is known to be NP-hard (Shiau
1984). In view of these difficulties, for some time most of the papers published on
this subject either concentrated on some particular aspects of the problem or were
confined to the search for a local optimum, under rather restrictive conditions.
If no further structure is given on the data, the general problem of design
centering is essentially intractable by deterministic methods. Fortunately, in many
practically important cases (see, e.g., Vidigal and Director (1982)) it can be assumed
that
1. $p(x) = (x^T A x)^{1/2}$, where $A$ is a symmetric positive definite matrix;
2. $S = C \cap D_1 \cap \cdots \cap D_m$, where $C$ is a closed convex set in $\mathbb{R}^n$ with $\mathrm{int}\,C \ne \emptyset$, and $D_i = \mathbb{R}^n \setminus \mathrm{int}\,C_i$ $(i = 1, \ldots, m)$, where each $C_i$ is a closed convex subset of $\mathbb{R}^n$.
Below we present a solution method proposed by Thach (1988) for problem .P/
under these assumptions.
7.6.1 DC Reformulation
Proposition 7.15 (Thach 1988) The function $g(x)$ is convex and finite on $\mathbb{R}^n$, and the general design centering problem is equivalent to the unconstrained dc maximization problem
$$\max\{x^T A x - g(x) \mid x \in \mathbb{R}^n\}. \qquad (7.78)$$
Proof We have
$$x^T A x - g(x) = \inf_{y \notin \mathrm{int}\,S} p^2(y - x). \qquad (7.79)$$
But for $x \in S$ it is easily seen from (7.76) that $r(x) = \inf_{y \notin \mathrm{int}\,S} p(y - x)$. On the other hand, for $x \notin S$, taking $y = x$ in (7.79), we obtain $x^T A x - g(x) = 0$, since $p^2(y - x) \ge 0$. Hence,
$$x^T A x - g(x) = \begin{cases} r^2(x) & x \in S, \\ 0 & x \notin S. \end{cases} \qquad (7.80)$$
This shows that problem (7.77) is equivalent to (7.78). The finiteness of $g(x)$ follows from (7.80) and that of $r(x)$. Finally, since for every fixed $y$ the function $x \mapsto 2x^T A y - y^T A y$ is affine, $g(x)$ is convex as the pointwise supremum of a family of affine functions. $\square$
Remark 7.14 (i) In view of (7.80), problem (7.78) is the same as the constrained optimization problem
(ii) For $x \in S$ we can also write $r(x) = \min_{y \in \partial S} p(y - x)$, where $\partial S := S \setminus \mathrm{int}\,S$ is the boundary of $S$. Therefore, from (7.79),
A peculiar feature of the dc optimization problem (7.78) is that the convex function $g(x)$ is not given explicitly but is defined as the optimal value of the optimization problem
with
Thus, the subproblem (7.81) splits into $m+1$ simpler ones. Every subproblem (7.83) is a convex program (maximizing the concave function $y \mapsto 2x^T A y - y^T A y$ over a closed convex set $C_i$), so it can be solved by standard algorithms. The subproblem (7.82), though harder (maximizing a concave function over a closed reverse convex set), can be solved by the reverse convex programming methods developed in Sect. 7.3. In this way the value of $g(x)$ at every point $x$ can be computed.
The next proposition, in turn, helps to find a subgradient of $g(x)$ at every $x$.
This implies that $2A\bar{y} \in \partial g(x)$ for every $x \in \mathbb{R}^n$, where $\bar{y}$ is an optimal solution of (7.81). In particular, when $x \notin S$ we have $g(x) = x^T A x$, so $y = x$ is an optimal solution of (7.81), hence $2Ax \in \partial g(x)$. $\square$
We are now in a position to derive a solution method for the dc optimization
problem (7.78) equivalent to the general design centering problem .P/.
Let $f(x) := x^T A x - g(x)$. Starting with an $n$-simplex $Z_0$ containing the compact set $S$, we shall construct a sequence of functions $f_k(\cdot)$, $k = 0, 1, \ldots$, such that
(i) $f_k(\cdot)$ overestimates $f(\cdot)$, i.e., $f_k(x) \ge f(x)$ for all $x \in S$;
(ii) each problem $\max\{f_k(x) \mid x \in Z_0\}$ can be solved by available efficient algorithms;
(iii) $\max\{f_k(x) \mid x \in Z_0\} \to \max\{f(x) \mid x \in S\}$ as $k \to +\infty$.
Specifically, we can state the following:
Algorithm
Select an $n$-simplex $Z_0 \supset S$, a point $x^0 \in S$, and $y^0 = x^0$. Set $k = 1$.
Step 1. Let $f_k(x) = x^T A x - \max\{\langle 2Ay^i, x - x^i\rangle + g(x^i) \mid i = 0, 1, \ldots, k-1\}$. Solve the $k$th approximating problem
noting that $f_k(x^i) = f(x^i)$, $i = 0, 1, \ldots, k-1$.
$$(x^k)^T A x^k - \langle 2Ay^i, x^k - x^i\rangle - g(x^i) \ge f_k(x^k) \ge f_k(x) \ge f(x).$$
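The overestimate-and-refine idea can be tried on a one-dimensional toy instance. In the sketch below, $A = 1$ and the convex function $g(x) = x^4$ are illustrative stand-ins (not the $g$ of the design centering problem), and the approximating problems are solved by brute force on a grid.

```python
import numpy as np

# Replace the convex g by the max of its tangents at visited points,
# giving overestimators f_k of f(x) = A x^2 - g(x) whose maximizers
# accumulate at maximizers of f.
A = 1.0
g = lambda x: x ** 4
dg = lambda x: 4 * x ** 3            # g'(x); plays the role of 2Ay in the text

Z0 = np.linspace(-1.0, 1.0, 4001)    # the simplex Z0 becomes an interval here
pts = [0.3]                          # x^0
for _ in range(30):
    # f_k(x) = A x^2 - max_i [ g(x^i) + g'(x^i)(x - x^i) ]
    cuts = np.max([g(xi) + dg(xi) * (Z0 - xi) for xi in pts], axis=0)
    fk = A * Z0 ** 2 - cuts
    pts.append(Z0[np.argmax(fk)])    # maximizer of the k-th approximating problem

best = max(A * x ** 2 - g(x) for x in pts)
# true maximum of x^2 - x^4 on [-1, 1] is 1/4, attained at x = +-1/sqrt(2)
```

Since each $f_k$ overestimates $f$ and each new tangent cut tightens the overestimator at the current maximizer, the best visited value approaches the true maximum from below.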
It is not hard to see that the above algorithm can also be interpreted as an outer approximation method for solving this convex maximization (concave minimization) problem under convex constraints (see Sect. 7.2). In fact, the $k$th approximating problem $(P_k)$ is equivalent to the problem
The classical Weber problem is to locate a single facility in the plane that minimizes the sum of weighted Euclidean distances to the locations of $N$ known users. This problem can be generalized to the case of $p$ facilities as follows.
Suppose $p$ facilities (providing the same service) are designed to serve $N$ users located at points $a^1, \ldots, a^N$ in the plane. We wish to determine the locations $x^1, \ldots, x^p \in \mathbb{R}^2$ of the facilities so as to minimize the total cost (travel time, transport cost for customers, etc.), or to maximize the total attraction (utility, number of customers, etc.). The cost or attraction is a given function of the distances from the users to the facilities. Traditionally, the standard mathematical formulation of this problem is (Brimberg and Love 1994):
$$\min_{x,w}\ \sum_{i=1}^{p} \sum_{j=1}^{N} w_{ij}\, d(x^i, a^j)$$
$$\text{s.t.}\quad \sum_{i=1}^{p} w_{ij} = r_j, \quad j = 1, \ldots, N,$$
$$w_{ij} \ge 0, \quad i = 1, \ldots, p,\ j = 1, \ldots, N,$$
$$x^i \in S, \quad i = 1, \ldots, p,$$
where $w_{ij}$ is an unknown weight representing the flow from facility $i$ to the fixed point (warehouse) $a^j$, $j = 1, \ldots, N$; $d(x, a)$ is the distance function which estimates the travel distance between any two points $x, a \in \mathbb{R}^2$; and $S$ is a rectangular domain in $\mathbb{R}^2$ where the facilities must be located.
A serious disadvantage of the above formulation is that it involves a huge number of variables ($pN$ variables $w_{ij}$ and $2p$ variables $x^i_1, x^i_2$) and a highly nonconvex objective function. Because of that, the problem is very difficult to handle, even for small values of $p$ and $N$. In fact, while nonlinear programming methods can be trapped at a KKT point which may not even be a local optimum, the difficulties increase considerably when the model is further extended to deal with complicated but more realistic objective functions (e.g., when the cost is measured by a monotone rather than linear function of the distances, or when the facilities are attractive only for some users and repelling for others).
Using the dc approach it is possible to significantly reduce the number of
variables, and express the objective function in a much simpler form. Noticing that
in an optimal solution each user must be served by the closest facility and setting
Since clearly
$$h_j(x) = \sum_{i=1}^{p} \|x^i - a^j\| - \max_{k} \sum_{i \ne k} \|x^i - a^j\|,$$
where the functions $\sum_{i=1}^{p} \|x^i - a^j\|$ and $\max_k \sum_{i \ne k} \|x^i - a^j\|$ are convex, it follows
that (7.88) is a dc optimization problem. One obvious advantage of this formulation
is that it uses only $2p$ variables $x^i_1, x^i_2$, $i = 1, \ldots, p$, instead of $pN + 2p$ variables as in the standard formulation. This allows large-scale problems, usually out of reach of standard nonlinear methods, to be solved successfully by dc optimization methods. For instance, in Tuy et al. (1995a,b,c) and Al-Khayyal et al. (1997, 2002), good computational results have been reported for problems with $N = 100{,}000$, $p = 1$; $N = 10{,}000$, $p = 2$; and $N = 1000$, $p = 3$. The traditional approach would have difficulty even in solving problems with $N = 500$, $p = 2$, which would require more than 1000 variables to be introduced.
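The dc decomposition of the closest-facility distance used above can be checked numerically; the points below are random illustrative data, not from any cited experiment.

```python
import numpy as np

# Check the identity
#   min_i ||x^i - a|| = sum_i ||x^i - a|| - max_k sum_{i != k} ||x^i - a||,
# i.e. the pointwise-min distance is a difference of two convex functions.
rng = np.random.default_rng(0)
p = 4
for _ in range(100):
    X = rng.normal(size=(p, 2))                 # facility locations x^1..x^p
    a = rng.normal(size=2)                      # one user location a^j
    d = np.linalg.norm(X - a, axis=1)
    lhs = d.min()
    rhs = d.sum() - max(d.sum() - d[k] for k in range(p))
    assert abs(lhs - rhs) < 1e-12
```

The identity holds because $\max_k \sum_{i \ne k} d_i = \sum_i d_i - \min_k d_k$.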
Facility with Attraction and Repulsion
In more realistic models, the attraction of facility $i$ on a user $j$ at distance $t$ away is measured by a convex decreasing function $q_j(t)$. Furthermore, for some users the attraction effect is positive, for others it is negative (which amounts to a repulsion). Let $J_+$ be the set of attraction points (attractors) and $J_-$ the set of repulsion points (repellers). The problem is to find the locations of the facilities so as to maximize the total effect, i.e.,
$$\max\Big\{\sum_{j \in J_+} q_j(h_j(X)) - \sum_{j \in J_-} q_j(h_j(X)) \ \Big|\ X = (x^1, \ldots, x^p) \in S^p \subset (\mathbb{R}^2)^p\Big\},$$
where $h_j(X)$ is given by (7.87). Using Proposition 4.6 it is easy to find an explicit dc representation of the objective function. Hence, the problem is a dc optimization problem just as (7.88). For details the interested reader is referred to Chen et al. (1992a,b), Tuy et al. (1992), and also Maranas and Floudas (1994) and Al-Khayyal et al. (1997, 2002), where results in solving problems with up to 100,000 attractors and repellers have been reported. A local approach to this problem under more restrictive assumptions ($J_- = \emptyset$) can be found in Idrissi et al. (1988).
Maximin Location
The maximin location problem is to determine the locations of $p$ facilities so as to maximize the minimum distance from a user to the closest facility. Examples are obnoxious facilities such as nuclear plants, garbage dumps, sewage plants, etc. With $h_j(x^1, \ldots, x^p)$ defined by (7.87), the mathematical formulation of the problem is
$$\max\Big\{\min_{j=1,\ldots,N} h_j(x^1, \ldots, x^p) \ \Big|\ x^i \in S,\ i = 1, \ldots, p\Big\}. \qquad (7.89)$$
is smallest, where $\Delta = (\delta_{ij})$, $W = (w_{ij})$ are symmetric matrices of order $p$ such that
or, alternatively, as
$$\min\ \sum_{i,j} w_{ij}\, t_{ij}^2 \quad \text{s.t.}\ \ t_{ij} \ge \big|\, \|x^i - x^j\| - \delta_{ij}\,\big|\ \ (\forall i < j), \quad x^i \in \mathbb{R}^n,\ i = 1, \ldots, p,$$
A method of dc optimization called DCA has been developed by Tao (2003) for solving the multidimensional scaling problem on the basis of the dc formulation (7.92) (see also An (2003), where a combined DCA-smoothing technique is applied).
The continuous $p$-center problem differs from the $p$-center problem only in that the set of users is a rectangle $S \subset \mathbb{R}^2$ rather than a finite set of points. It consists of finding the locations $x^1, \ldots, x^p$ of $p$ facilities in $S$ so as to minimize the maximal distance from a point $y \in S$ to the nearest facility, i.e.,
$$\min\ \max_{y \in S} h_y(x^1, \ldots, x^p), \qquad h_y(x^1, \ldots, x^p) = \min_{i=1,\ldots,p} \|x^i - y\|, \qquad x^i \in S,\ i = 1, \ldots, p. \qquad (7.93)$$
$$D_j := \{x \mid \|x - x^j\| \le \|x - x^i\|\ \ \forall i = 1, \ldots, p\}.$$
The difficulty of problem (7.94) is that there are many local minima that are not global; in addition, the local minimizers have objective function values very close to the global minimum, which makes it very hard to determine the exact value of the global minimum.
Observe that $\|x^i - x^j\|^{-1} = q(\|x^i - x^j\|)$, with $q(t) = 1/t$ a convex decreasing function for $t > 0$. Hence, by Proposition 4.5, problem (7.94) can be rewritten as
$$\min\Big\{\sum_{i<j} \big(g(x^i, x^j) - K\|x^i - x^j\|\big) \ \Big|\ \|x^i\| = 1,\ i = 1, \ldots, p\Big\},$$
or equivalently,
$$\max\Big\{\sum_{i<j} \log \|x^i - x^j\| \ \Big|\ \|x^i\| = 1,\ i = 1, \ldots, p\Big\}$$
(for a suitable constant $K > 0$).
The problem of cluster analysis can be formulated as follows. Given a set of points $a^i$, $i = 1, \ldots, m$, in $\mathbb{R}^n$, find cluster centers $x^l$, $l = 1, \ldots, p$, in $\mathbb{R}^n$ such that the sum over $i$ of the minimum over $l \in \{1, \ldots, p\}$ of the distance between each point $a^i$ and the cluster centers is minimized. Using the one-norm for measuring distances, the problem is
$$\text{minimize}\ \sum_{i=1}^{m} \min_{l=1,\ldots,p} \langle e, y^{il}\rangle \quad \text{subject to}\ -y^{il} \le a^i - x^l \le y^{il},\ \ i = 1, \ldots, m;\ l = 1, \ldots, p. \qquad (7.96)$$
Since at optimality each point $a^i$ must be assigned to the cluster center closest to it, the clustering problem can be formulated as
$$\min\ \sum_{i=1}^{m} \min_{l=1,\ldots,p} \|x^l - a^i\| \quad \text{s.t.}\ x^l \in S \subset \mathbb{R}^n,\ l = 1, \ldots, p. \qquad (7.98)$$
$$\min_{l=1,\ldots,p} \|x^l - a^i\| = \sum_{l=1}^{p} \|x^l - a^i\| - \max_{r} \sum_{l \ne r} \|x^l - a^i\|,$$
where $\sum_{l=1}^{p} \|x^l - a^i\|$ and $\max_r \sum_{l \ne r} \|x^l - a^i\|$ are convex functions. Therefore, (7.98) is actually the dc optimization problem
$$\min\ \sum_{i=1}^{m}\Big(\sum_{l=1}^{p} \|x^l - a^i\| - \max_{r} \sum_{l \ne r} \|x^l - a^i\|\Big) \quad \text{s.t.}\ x^l \in S,\ l = 1, \ldots, p. \qquad (7.99)$$
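A tiny instance of (7.98)/(7.99) can be solved by brute force to make the formulation concrete; the data and the grid resolution below are our own illustrative choices.

```python
import numpy as np
from itertools import product

# p = 2 centers in S = [0, 6] for m = 4 one-dimensional points.
A = np.array([0.0, 0.2, 5.0, 5.3])

def obj(x1, x2):
    # sum_i min_l |x^l - a^i|, i.e. the clustering objective (7.98)
    return sum(min(abs(x1 - a), abs(x2 - a)) for a in A)

grid = np.linspace(0.0, 6.0, 121)               # step 0.05
best = min(product(grid, grid), key=lambda c: obj(*c))
```

On this instance the optimal centers separate the two groups $\{0, 0.2\}$ and $\{5, 5.3\}$, with objective value $0.2 + 0.3 = 0.5$; real instances require the BB method described next rather than exhaustive search.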
A common method for solving this problem is branch and bound. There are $p \cdot n$ variables $x^l_j$, $j = 1, \ldots, n$, $l = 1, \ldots, p$. For the sake of clarity, any vector in $X = \prod_{l=1}^{p} X_l$, $X_l = \mathbb{R}^n$, will be denoted by a boldface lowercase letter, e.g., $\mathbf{x} = (x^1, \ldots, x^p)$, where $x^l \in X_l$. We also agree that $x_l = x^l$. The problem then takes the form
where
$$g(\mathbf{x}) = \sum_{i=1}^{m} \sum_{l=1}^{p} g_{il}(\mathbf{x}), \qquad g_{il}(\mathbf{x}) = \|x^l - a^i\|,$$
$$h(\mathbf{x}) = \sum_{i=1}^{m} h_i(\mathbf{x}), \qquad h_i(\mathbf{x}) = \max_{r} \sum_{l \ne r} \|x^l - a^i\|,$$
and $S$ is a simplex in $X = \prod_{l=1}^{p} X^l$, $X^l = \mathbb{R}^n$.
a. Bounding
Let $M$ be a subsimplex of $S$, with vertex set $V(M)$ (note that $|V(M)| = pn + 1$). To compute a lower bound $\beta(M)$ for the value of $f(\mathbf{x})$ over $M$, we replace the problem by a linear relaxation constructed as follows.
For each $v \in V(M)$ let $p_{il}(v) \in \partial g_{il}(v)$, i.e.,
$$p_{il}(v) = \begin{cases} \dfrac{v^l - a^i}{\|v^l - a^i\|} & \text{if } v^l \ne a^i, \\[4pt] 0 & \text{otherwise.} \end{cases}$$
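That this formula indeed gives a subgradient of the norm can be verified against the subgradient inequality; the random test points below are illustrative only.

```python
import numpy as np

# Check the subgradient inequality for g_il(x) = ||x - a||:
#   ||x - a|| >= ||v - a|| + <p, x - v|  for all x,
# where p = (v - a)/||v - a|| (or 0 when v = a).
rng = np.random.default_rng(1)
a = rng.normal(size=3)
for _ in range(200):
    v, x = rng.normal(size=3), rng.normal(size=3)
    gv, gx = np.linalg.norm(v - a), np.linalg.norm(x - a)
    p = (v - a) / gv if gv > 0 else np.zeros(3)
    assert gx >= gv + p @ (x - v) - 1e-12
```

The inequality follows from the Cauchy-Schwarz inequality, since $\|p\| \le 1$.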
7.8 Clustering via dc Optimization 225
$$\varphi(\mathbf{x}) = \max_{v \in V(M)} \sum_{i=1}^{m} \sum_{l=1}^{p} \big( \langle p_{il}(v),\, x^l - v^l\rangle + \|v^l - a^i\| \big).$$
Also let $\psi_i(\mathbf{x})$ be the majorant of $h_i(\mathbf{x})$ provided by the affine function which matches $h_i(\mathbf{x})$ at every $v \in V(M)$ (an overestimator on $M$, by convexity of $h_i$), i.e.,
$$\psi_i(\mathbf{x}) = \sum_{v \in V(M)} \lambda_v \max_{r} \sum_{l \ne r} \|v^l - a^i\| \quad \text{for every}$$
$$\mathbf{x} = \sum_{v \in V(M)} \lambda_v v, \qquad \sum_{v \in V(M)} \lambda_v = 1, \qquad \lambda_v \ge 0.$$
$$\min\Big\{\varphi(\mathbf{x}) - \sum_{i=1}^{m} \psi_i(\mathbf{x}) \ \Big|\ \mathbf{x} \in M\Big\} \qquad (7.102)$$
$$\min\Big\{t - \sum_{i=1}^{m} \psi_i(\mathbf{x}) \ \Big|\ \varphi(\mathbf{x}) \le t,\ \mathbf{x} \in M\Big\}$$
b. Branching
Branching is performed using $\omega$-subdivision. Specifically, let $\mathbf{x}(M) = \sum_{v \in V(M)} \lambda_v v$ be an optimal solution of the relaxed problem (7.102), and let $J(M) = \{v \in V(M) \mid \lambda_v > 0\}$. Then subdivide $M$ via the point $\mathbf{x}(M)$.
Theorem 7.11 The BB algorithm with the above bounding and the $\omega$-subdivision for branching is convergent. To be precise, if $M_k$ is the partition set at iteration $k$ and $\mathbf{x}^k$ is the current best solution, then any cluster point $\bar{\mathbf{x}}$ of the sequence $\{\mathbf{x}^k\}$ yields an optimal solution of the problem.
Proof The algorithm works exactly the same way as that for solving the equivalent
concave minimization under convex constraints
The convergence of the algorithm then follows from Theorem 7.4 on the convergence of the BB algorithm for (SBCP). $\square$
Computational testing of the above algorithm has been carried out on the Wisconsin Diagnostic Breast Cancer Database created by Wolberg et al. (1994, 1995a,b). This database consists of 569 vectors in $\mathbb{R}^{30}$, which should be arranged into two clusters (malignant/benign). The original number of feature parameters ($n = 30$) is too large for a direct application of global optimization methods, but it is argued that one can restrict consideration to the four most informative parameters. So the clustering problem is solved for $m = 569$, $n = 4$, $p = 2$. The effectiveness of the method is confirmed by the computational experiments performed: in each of these, the two clusters obtained turned out to be correct with 96% accuracy. For details, see Tuy et al. (2001).
7.9 Exercises
1 Consider the problem $\min\{\sum_{j=1}^{n} f_j(x_j) : x \in D\}$, where $D \subset \mathbb{R}^n_+$ is a polytope and
$$f_j(t) = \begin{cases} 0 & \text{if } t = 0, \\ d_j + c_j t & \text{if } t > 0 \end{cases}$$
($d_j$ is a fixed cost). Show that there exists an $\varepsilon > 0$ such that this problem is equivalent to a (BCP) with $f(x) = \sum_{j=1}^{n} \tilde{f}_j(x_j)$, where $\tilde{f}_j : \mathbb{R} \to \mathbb{R}$ is a concave function which is linear in the interval $(-\infty, \varepsilon)$ and agrees with $f_j(t)$ at the point $t = 0$ and on $(\varepsilon, +\infty)$.
maximize k .x xk / subject to x 2 Dk
1 2
maximize .x1 /2 C .x2 /2 subject to
2 3
5 3 5 1 1
x1 C x2 1I x1 C x2 I x1 x2 I x1 x2 I
2 2 4 2 4
3
x1 0I 0 x2 :
2
5 Solve the problem
$$\max\{h(x) \mid Ax \le b,\ x \ge 0,\ cx - \gamma \ge 0\}.$$
maximize $x_2$ subject to
$$x_1 + 2x_2 \le 2, \quad 0 \le x_1, \quad 0 \le x_2 \le 2,$$
$$4(x_1 - 2)^2 + 4(x_2 - 2)^2 \ge 9, \quad (x_1 + 3)^2 - x_2^2 \ge \tfrac{81}{4}.$$
8 Consider the bilevel linear programming problem:
Many problems have a special structure which should be exploited appropriately for
their efficient solution. This chapter discusses special methods adapted to special
problems of dc optimization and extensions.
In Chap. 7 (Sect. 7.1), we saw how the core of many global optimization problems is a DC feasibility problem of the following form:
(DC) Given two closed convex sets $C$ and $D$, find a point $x \in D \setminus C$ or else prove that $D \subset C$.
$$0 \in D \cap (\mathrm{int}\,C). \qquad (8.1)$$
By Proposition 1.21, condition (8.1), along with the convexity and closedness of $D$ and $C$, implies that $D = D^{\circ\circ}$, $C = C^{\circ\circ}$, so this polarity correspondence gives rise to a dual problem which will be referred to as the dual DC feasibility problem:
$(C^\circ D^\circ)$ Find a point $v \in C^\circ \setminus D^\circ$ or else prove that $C^\circ \subset D^\circ$.
Theorem 8.1 The two problems (DC) and $(C^\circ D^\circ)$ are equivalent in the following sense:
(i) $D \subset C$ if and only if $C^\circ \subset D^\circ$;
(ii) For each $v \in C^\circ \setminus D^\circ$ there exists a point $z \in D \setminus C$; conversely, for each $z \in D \setminus C$ there exists a point $v \in C^\circ \setminus D^\circ$.
Proof Since both $D$ and $C$ are closed convex sets, (i) follows immediately from Proposition 1.21. If $v \in C^\circ \setminus D^\circ$, then $\langle v, x\rangle \le 1\ \forall x \in C$, while $\langle v, z\rangle > 1$ for at least one $z \in D$; hence $z \in D \setminus C$. Conversely, if $z \in D \setminus C$, then $z \in D^{\circ\circ} \setminus C^{\circ\circ}$, i.e., $\langle u, z\rangle \le 1\ \forall u \in D^\circ$, while $\langle v, z\rangle > 1$ for at least one $v \in C^\circ$; hence $v \in C^\circ \setminus D^\circ$. $\square$
Thus, the DC feasibility problem can be solved by solving the dual DC feasibility problem and vice versa. As often happens in optimization theory, the dual problem may turn out to be easier than the original one. In any case, the joint study of a dual pair of problems may give more insight into both and may suggest more efficient methods for handling each of them. Note that since $0 \in \mathrm{int}\,C$, the set $C^\circ$ is compact by Proposition 1.21. Furthermore, if $D$ is compact then $0 \in \mathrm{int}\,D^\circ$, i.e., $0 \in C^\circ \cap \mathrm{int}\,D^\circ$, so there is a full symmetry in the above duality correspondence.
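The duality of Theorem 8.1(i) can be seen on a concrete toy pair of sets; the choice of sets and the sampling check below are our own illustration.

```python
import numpy as np

# D = box [-1/2, 1/2]^2, C = closed unit disk, 0 interior to both.
# Then D is contained in C, and dually the polar C deg (again the unit
# disk, which is self-polar) is contained in the polar of the box,
# D deg = { y : (|y1| + |y2|)/2 <= 1 }.
theta = np.linspace(0.0, 2.0 * np.pi, 720)
disk_boundary = np.c_[np.cos(theta), np.sin(theta)]

in_C = lambda x: np.hypot(x[0], x[1]) <= 1 + 1e-12
in_D_polar = lambda y: 0.5 * (abs(y[0]) + abs(y[1])) <= 1 + 1e-12

# D subset of C: the four vertices of the box lie in the disk
assert all(in_C(v) for v in [(0.5, 0.5), (0.5, -0.5), (-0.5, 0.5), (-0.5, -0.5)])
# C deg subset of D deg: every boundary point of the disk (hence, by
# convexity, the whole disk) satisfies the polar inequality of the box
assert all(in_D_polar(y) for y in disk_boundary)
```

Both inclusions hold simultaneously, as statement (i) of the theorem predicts.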
Setting $G = C^\circ$, $\Omega = C^\circ \setminus D^\circ$, we see that the dual DC feasibility problem is to find an element of $\Omega$, where $G$ is a closed convex set. Let us examine how the OA method can be applied to solve this problem. Assume that $D = \{x \mid Ax \le b,\ x \ge 0\}$ is a polyhedron and $x^0 = 0$ is a vertex of $D$ satisfying (8.1). Let $M$ be the canonical cone associated with $x^0 = 0$ (see Sect. 7.1). For every $i = 1, \ldots, n$, the $i$th edge of $M$ either meets $\partial C$ at some point $z^i$ or is entirely contained in $C$. In the latter case, let $z^i$ be any point on this edge, arbitrarily far away from 0. Define the convex function
$$C^\circ \subset P_1 := S_1^\circ. \qquad (8.3)$$
$$\mu(v) \le 1\ \ \forall v \in V\ \Longrightarrow\ C^\circ \subset D^\circ. \qquad (8.4)$$
8.1 Polyhedral Annexation 231
$$v^1 = eU^{-1}, \qquad U = (z^1, \ldots, z^n). \qquad (8.5)$$
$$v^k \in \operatorname{argmax}\{\mu(v) \mid v \in V_k\}. \qquad (8.6)$$
$$\langle \hat{x}^k, y\rangle \le 1 \qquad (8.7)$$
holds for all $y \in C^\circ$ but not for $y = v^k$. Therefore, the cut (8.7) determines a polytope $P_{k+1} = \{y \in P_k \mid \langle \hat{x}^k, y\rangle \le 1\}$ such that $v^k \notin P_{k+1}$.
232 8 Special Methods
We can thus formulate the following OA procedure for solving the dual DC
feasibility problem.
Procedure DC* (D Polyhedron, C Closed Convex Set)
Initialization. Select a vertex $x^0$ of $D$ such that $x^0 \in \mathrm{int}\,C$ and let $M$ be the associated canonical cone. Set $D \leftarrow D - x^0$, $C \leftarrow C - x^0$, $M \leftarrow M - x^0$. For each $i = 1, \ldots, n$, let $z^i$ be the point where the $i$th edge of $M$ meets $\partial C$ (or any point of this edge, arbitrarily far away from 0, if the edge is entirely contained in $C$).
Step 1. For each new vertex $v$ solve the linear program LP$(v, D)$ to obtain its optimal value $\mu(v)$ and a basic optimal solution $x(v)$.
Step 2. Let $v^k \in \operatorname{argmax}\{\mu(v) \mid v \in V_k\}$. If $\mu(v^k) \le 1$, then terminate: $C^\circ \subset D^\circ$, hence $D \subset C$.
Step 3. If $\mu(v^k) > 1$ and $x^k := x(v^k) \notin C$, then terminate: $x^k \in D \setminus C$.
Step 4. If $\mu(v^k) > 1$ and $x^k \in C$, compute $\lambda_k \ge 1$ such that $\hat{x}^k = \lambda_k x^k \in \partial C$ and define
Step 5. Compute the vertex set $V_{k+1}$ of $P_{k+1}$ and let $V'_{k+1} = V_{k+1} \setminus V_k$. Set $k \leftarrow k + 1$ and go back to Step 1.
Theorem 8.2 Procedure DC* terminates after finitely many steps, either yielding a point of $D \setminus C$ or proving that $D \subset C$.
Proof Each iteration generates an $x^k$ which is a vertex of $D$ (a basic optimal solution of LP$(v^k, D)$). Moreover, $\hat{x}^k$ is distinct from all $\hat{x}^1, \ldots, \hat{x}^{k-1}$. Indeed, $v^k$ satisfies all the constraints $\langle \hat{x}^i, y\rangle - 1 \le 0$, $i = 1, \ldots, k-1$, because $v^k \in P_k$, but violates the constraint $\langle \hat{x}^k, y\rangle - 1 \le 0$ because $\langle v^k, \hat{x}^k\rangle \ge \mu(v^k) > 1$. Therefore, $x^k$ is distinct from all $x^i$, $i = 1, \ldots, k-1$. This implies that the number of iterations cannot exceed the number of vertices of $D$. $\square$
Remark 8.1 All the linear programs LP$(v, D)$ have the same constraints. This should make the solution of these programs an easy task. If 0 is not a vertex but an interior point of $D$ (so that $0 \in \mathrm{int}\,D \cap \mathrm{int}\,C$), then one can take the initial polytope to be $P_1 = \{y \mid \langle z^i, y\rangle \le 1,\ i = 1, \ldots, n+1\}$, where $z^1, \ldots, z^{n+1}$ are $n+1$ points on $\partial C$ determining an $n$-simplex containing 0 in its interior. In this case $V_1$ consists of the $n+1$ vertices of $P_1$.
Remark 8.2 If we denote by $S_k$ the polyhedron such that $S_k^\circ = P_k$ (i.e., $S_k = P_k^\circ$), then, assuming $x^0 = 0$ and using Propositions 1.21 and 1.28, it can be verified that
$$S_{k+1} = \mathrm{conv}\{0, z^1, \ldots, z^n, \hat{x}^1, \ldots, \hat{x}^k\},$$
$$\langle u^k, y\rangle \le 0.$$
When $C$ has a nontrivial lineality space $L$, then $r := \dim C^\circ = n - \dim L < n$, so the dual DC feasibility problem is actually a problem in dimension $r < n$ and should be easier to solve than the original problem. We shall examine in Chap. 9 how the PA Algorithm can be streamlined to exploit this structure.
To obtain an algorithm for (BCP), it only remains to incorporate Procedure DC* into the usual two-phase scheme. Furthermore, to enhance efficiency, each time a feasible solution better than the incumbent is found, we can introduce a concavity cut to reduce the feasible polyhedron to be considered in the next cycle. This leads to the following:
PA (Polyhedral Annexation) Algorithm for (BCP)
Theorem 8.3 The PA Algorithm for (BCP) terminates after finitely many steps at a
global minimizer.
Proof Straightforward from the finiteness of Procedure DC*. $\square$
Remark 8.4 Each new cycle is essentially a restart. Just as with the CS Restart Algorithm in Chap. 6, restart is a device to avoid an excessive growth of $V_k$. Moreover, since a new vertex $x^0$ is used to initialize Phase 2 and the region that remains to be explored is reduced by a cut, restart can also increase the chance of transcending the incumbent. In practical implementations, restart with a new cut and a new $x^0$ is recommended when Phase 2 seems to drag on and $|V_k|$ approaches a critical limit.
Remark 8.5 In the above statement of the PA Algorithm we implicitly assumed
that D is bounded. To extend the procedure to the case when D may be unbounded,
it suffices to make some minor modifications indicated in Remark 8.3. It should be
emphasized that in contrast to OA methods, the PA Algorithm does not require the
computation of subgradients. Nor does it require the function f .x/ to be finite on
an open set containing D (though it would be more efficient in this case). Another
advantage of the PA Algorithm which will be demonstrated in Chap. 9 is that it can
be used as a technique for reducing the dimension of the problem when the lineality
of C is positive.
Example 8.1
Since Procedure DC* solves the DC feasibility problem, it can also be incorporated
into a two phase scheme for solving linear programs with an additional reverse
convex constraint:
where $\mathscr{E}$ is the collection of all edges of $D$ and, for every $E \in \mathscr{E}$, the associated set is the set of extreme points of the line segment $E \cap C_\varepsilon$.
PA Algorithm for (LRC)
Initialization. Solve the linear program $\min\{cx \mid x \in D\}$ to obtain a basic optimal solution $w$. If $h(w) \ge -\varepsilon$, terminate: $w$ is an $\varepsilon$-approximate optimal solution. Otherwise:
(a) If a vertex $z^0$ of $D$ is available such that $h(z^0) > \varepsilon$, go to Phase 1.
(b) Otherwise, set $\gamma = +\infty$, $D(\gamma) = D$, and go to Phase 2.
Phase 1. Starting from $z^0$, search for a feasible point $\bar{x} \in X_\varepsilon$. Let $\gamma = c\bar{x}$, $D(\gamma) = \{x \in D \mid cx \le \gamma\}$. Go to Phase 2.
Phase 2. Call Procedure DC* with $D(\gamma)$, $C_\varepsilon$ in place of $D$, $C$, respectively.
(a) If a point $z \in D(\gamma) \setminus C_\varepsilon$ is obtained, set $z^0 \leftarrow z$ and go to Phase 1.
(b) If $D(\gamma) \subset C_\varepsilon$, then terminate: $\bar{x}$ is an $\varepsilon$-approximate optimal solution (when $\gamma < +\infty$), or the problem is infeasible (when $\gamma = +\infty$).
Proposition 8.1 The PA Algorithm for (LRC) is finite. $\square$
so the optimal value $\beta(M)$ of (DP(M)) yields a lower bound for the optimal value $\alpha(M)$ of the subproblem (P(M)). The problem (DP(M)) is called a Lagrange relaxation of (P(M)), and the bound $\beta(M)$ is often referred to as a duality bound or a Lagrangian bound.
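The fact that every dual value is a valid lower bound (weak duality) can be seen numerically on a toy problem; the functions and grids below are illustrative assumptions.

```python
import numpy as np

# For min f(x) s.t. g(x) <= 0, every value of the dual function
#   theta(lam) = min_x [ f(x) + lam * g(x) ],  lam >= 0,
# is a lower (duality) bound for the constrained minimum.
xs = np.linspace(-3.0, 3.0, 6001)
f = xs ** 2                    # objective
g = 1.0 - xs                   # constraint 1 - x <= 0, i.e. x >= 1

primal = f[g <= 0].min()       # constrained minimum (= 1, at x = 1)
lams = np.linspace(0.0, 5.0, 51)
for lam in lams:
    assert (f + lam * g).min() <= primal + 1e-9
best_bound = max((f + lam * g).min() for lam in lams)
```

Here the best bound equals the primal value because this toy instance is convex; for the nonconvex subproblems (P(M)) the duality gap is generally positive, as discussed next.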
8.2 Partly Convex Problems 237
In general the inequality (8.9) is strict, i.e., the difference $\alpha(M) - \beta(M)$, referred to as the duality gap, or simply the gap, is positive and most often rather large. However, it has been observed that, through a suitable BB scheme, one can generate a nested sequence of boxes $\{M_k\}$ shrinking to a point $x^* \in C$, such that the gap $\alpha(M_k) - \beta(M_k)$ decreases as $k \to +\infty$. Under suitable conditions it is possible to eventually close the gap, i.e., to arrange things so that
Step 2. Update the incumbent CBS by setting .xk ; yk / equal to the best among all
feasible solutions available at the completion of the previous step.
Step 3. Delete every $M \in \mathscr{S}_k$ such that $\beta(M) \ge F(x^k, y^k)$ (with the convention $F(x^k, y^k) = +\infty$ if $(x^k, y^k)$ is not defined). Let $\mathscr{R}_k$ be the set of remaining members of $\mathscr{S}_k$.
Step 4. If $\mathscr{R}_k = \emptyset$, then terminate: CBS $= (x^k, y^k)$ yields a global optimal solution, or else the problem is infeasible (if CBS is not defined).
Step 5. Choose $M_k \in \operatorname{argmin}\{\beta(M) \mid M \in \mathscr{R}_k\}$. Subdivide $M_k$ according to the chosen subdivision rule. Let $\mathscr{P}_{k+1}$ be the resulting partition of $M_k$.
Step 6. Let $\mathscr{S}_{k+1} := (\mathscr{R}_k \setminus \{M_k\}) \cup \mathscr{P}_{k+1}$. Increment $k$ and go back to Step 1.
A key operation in this algorithm is the computation of a bound $\beta(M)$ for every partition set $M$ (Step 1). According to the prototype BB algorithm described in Sect. 6.2, this bound must be valid, which means that it must satisfy not only condition (8.10) but also
$$\{(x, y) \mid G(x, y) \le 0,\ x \in M \cap C,\ y \in D\} = \emptyset\ \Longrightarrow\ \beta(M) = +\infty. \qquad (8.11)$$
Furthermore, it is convenient to assume that $\beta(M') \ge \beta(M)$ for $M' \subset M$, as we can always replace $\beta(M')$ with $\beta(M)$ if $\beta(M') < \beta(M)$.
However, since it is very difficult to decide whether the nonconvex set on the left-hand side of the implication (8.11) is empty or not, we replace this possibly impractical condition by a simpler one, namely
$$M \cap C = \emptyset\ \Longrightarrow\ \beta(M) = +\infty. \qquad (8.12)$$
Certainly, with this change the convergence criterion (6.5) for the prototype BB
Algorithm may no longer be valid. Fortunately, however, as will be seen shortly, it
is not difficult to find a suitable alternative convergence criterion.
Usually $\beta(M)$ is taken to be the optimal value of some relaxation of the subproblem (P(M)) in (8.10). The most popular relaxations are:
linear relaxation: $F(x, y)$ and $G(x, y)$ are replaced by their respective affine minorants, and $C$, $D$ by outer approximating polyhedra;
convex relaxation, in particular SDP (semidefinite programming) relaxation: a suitable convex program (in particular an SDP) is derived whose optimal value is an underestimator of $\alpha(M) := \inf\{F(x, y) \mid G(x, y) \le 0,\ x \in M \cap C,\ y \in D\}$;
Lagrange relaxation: the subproblem (P(M)) is replaced by its Lagrange dual (DP(M)); in other words, $\beta(M)$ is taken to be the optimal value of (DP(M)):
Whatever bounds are used, when infinite the above BB procedure generates a filter (an infinite nested sequence of partition sets) $M_k$, $k \in I \subset \{1, 2, \ldots\}$, such that
$$\beta(M_k) \le \min(\mathrm{P})\ \ \forall k \in I, \qquad M_k \cap C \ne \emptyset\ \ \forall k \in I, \qquad \bigcap_{k \in I} M_k = \{x^*\}. \qquad (8.14)$$
so that any optimal solution $y^*$ of this problem yields an optimal solution $(x^*, y^*)$ of (P).
Theorem 8.4 If $\beta(M_k) = +\infty$ for some $k$, then (P) is infeasible and the algorithm terminates. If $\beta(M_k) < +\infty\ \forall k$, then there is an infinite subsequence $M_k$, $k \in I \subset \{1, 2, \ldots\}$, satisfying (8.14) and such that $x^* \in C$. If, in addition,
Theorem 8.5 Assume the set D is compact. Then the BB decomposition algorithm
using Lagrangian bounds throughout is convergent.
Proof Let us fix $x \in C$. The function $y \mapsto F(x, y) + \langle\lambda, G(x, y)\rangle$ is convex in $y \in D$ and linear in $\lambda \in \mathbb{R}^m_+$. Since this function is lower semi-continuous in $y$ while $D$ is compact, we have the minimax equality (Theorem 2.7):
$$\inf_{y \in D} \sup_{\lambda \in \mathbb{R}^m_+} \{F(x, y) + \langle\lambda, G(x, y)\rangle\} = \sup_{\lambda \in \mathbb{R}^m_+} \min_{y \in D} \{F(x, y) + \langle\lambda, G(x, y)\rangle\}. \qquad (8.18)$$
Since clearly
$$\sup_{\lambda \in \mathbb{R}^m_+} \{F(x, y) + \langle\lambda, G(x, y)\rangle\} = \begin{cases} F(x, y) & \text{if } G(x, y) \le 0, \\ +\infty & \text{otherwise,} \end{cases} \qquad (8.19)$$
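The dichotomy (8.19) is easy to see numerically; the scalar values below are illustrative only.

```python
import numpy as np

# The inner sup over lam >= 0 of F + lam * G reproduces F when the
# constraint holds (G <= 0) and grows without bound otherwise.
def inner_sup(F, G, lam_grid):
    return max(F + lam * G for lam in lam_grid)

lams = np.linspace(0.0, 1e4, 11)
assert abs(inner_sup(2.0, -0.5, lams) - 2.0) < 1e-12   # G <= 0: sup = F, at lam = 0
assert inner_sup(2.0, 0.5, lams) > 1e3                 # G > 0: unbounded in lam
```

This is the standard mechanism by which the Lagrangian encodes the constraint $G(x, y) \le 0$.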
Consider now a filter $\{M_k\}$ collapsing to a point $x^*$ (to simplify the notation we write $M_k$ instead of $M_{k_\nu}$). As we saw in the proof of Theorem 8.4, $x^* \in C$, while by virtue of (8.20):
it follows that
$$\min_{y \in D}\big\{F(x^*, y) + \langle\tilde{\lambda},\, G(x^*, y)\rangle\big\}$$
strictly exceeds the bound in (8.20).
Using the lower semi-continuity of the function $(x, y) \mapsto F(x, y) + \langle\tilde{\lambda}, G(x, y)\rangle$, we can then find, for every fixed $y \in D$, an open ball $U_y$ in $\mathbb{R}^n$ around $x^*$ and an open ball $V_y$ around $y$ such that $F(x', y') + \langle\tilde{\lambda}, G(x', y')\rangle$ stays above the same bound for all $x' \in U_y \cap C$ and all $y' \in V_y$.
Since the balls $V_y$, $y \in D$, form a covering of the compact set $D$, there is a finite set $E \subset D$ such that the balls $V_y$, $y \in E$, still form a covering of $D$. If $U = \bigcap_{y \in E} U_y$, then for every $y \in D$ we have $y \in V_{y_0}$ for some $y_0 \in E$, hence
But $M_k \subset U$ for all sufficiently large $k$, because $\bigcap_k M_k = \{x^*\}$. Then the just established inequality implies that
$$(\forall x \in C)\,(\forall \lambda \in \mathbb{R}^m_+) \qquad D_0 \cap \operatorname{argmin}_{y \in D}\{F(x, y) + \langle\lambda, G(x, y)\rangle\} \ne \emptyset.$$
However, a counter-example given there also pointed out that Theorem 8.5 would be false if the above condition were required only for $x \in C$, $\lambda \in \mathbb{R}^m_+$ such that $\operatorname{argmin}_{y \in D}\{F(x, y) + \langle\lambda, G(x, y)\rangle\} \ne \emptyset$.
For certain applications the assumption of compactness of $D$ turns out to be too restrictive. In the next theorem this assumption is replaced by a weaker one, at the expense, however, of requiring the continuity, and not merely the lower semi-continuity, of the functions $F(x, y)$ and $G_i(x, y)$, $i = 1, \ldots, m$.
$$F(x^*, y) + \langle\tilde{\lambda}, G(x^*, y)\rangle \to +\infty \quad \text{as } y \in D,\ \|y\| \to +\infty. \qquad (8.25)$$
By the minimax theorem (Sect. 2.12, Remark 2.2), this ensures that
< : (8.27)
Obviously $\theta\lambda + (1-\theta)\tilde{\lambda} \in \mathbb{R}^m_+$ for every $(\lambda, \theta)$ with $\lambda \in \mathbb{R}^m_+$, $0 < \theta \le 1$. Let
$$\Lambda := \{\lambda \in \mathbb{R}^m_+ \mid \inf_{y \in D} F(x^*, y) + \langle\lambda, G(x^*, y)\rangle > -\infty\}. \qquad (8.28)$$
Hence, if " denotes an arbitrary number satisfying 0 < " < , then for every
.; / 2 0; 1/ and every k there exists x.k;; / 2 C \ Mk ; y.k;; / 2 D satisfying
We contend that not for every .; / 2 0; 1/ the sequence fy.k;; / ; k D 1; 2; : : :g
is bounded. Indeed, otherwise we could assume that, for every .; / 2 0; 1/ W
y.k;; / ! y.; / 2 D, as k ! C1. Since x.k;; / 2 Mk while \C1
kD1 Mk D fx g, we
.k;; /
would have x ! x . Then letting k ! C1 in (8.29) would yield
F.x.k;; / ; y / C h C .1 /;
Q G.x.k;; / ; y /i < C ": (8.30)
For simplicity of notation, let us write xk ; yk for x.k;; / ; y.k;; / . For any l > ky k
define
l l
ykl D k yk C 1 k y :
ky k ky k
From the inequalities (8.29) and (8.30) and the convexity of the function y 7!
Q G.xk ; y/i, we then deduce, for every fixed l ky k,
F.xk ; y/ C h C .1 /;
Theorem 8.7 The Lagrangian dual of (GPL) with respect to the nonlinear con-
straints, i.e.,

φ_{r,s} = sup_{λ≥0} inf { ⟨y, c(x)⟩ + ⟨c⁰, x⟩ + ⟨λ, A(x)y + Bx − b⟩ |  x ∈ [r,s], y ≥ 0 },   (8.31)

= −⟨b, λ⟩ + h(λ),

where

h(λ) = { inf_{x∈[r,s]} ⟨Bx, λ⟩  if c(x) + (A(x))^T λ ≥ 0 ∀x ∈ [r,s];  −∞  otherwise. }   (8.34)

Now observe that for any q ∈ R^n:
Since for fixed x the function λ ↦ ⟨A_j(x), λ⟩ + c_j(x) is affine, g(λ) is a concave
function and the problem (8.31) reduces to (8.32)–(8.33), which is a convex
program. □
Corollary 8.2 Assume that

(*) For every j the functions c_j(x), a_ij(x), i = 1,…,m, are either all increasing
on [r,s] or all decreasing on [r,s].

Then the Lagrangian dual of (GPL) is the dual of its LP relaxation:

min { Σ_{j∈J₊} c_j(r) y_j + Σ_{j∈J₋} c_j(s) y_j + ⟨c⁰, x⟩ }   (8.36)

s.t.  Σ_{j∈J₊} a_ij(r) y_j + Σ_{j∈J₋} a_ij(s) y_j + ⟨B_i, x⟩ ≤ b_i,  i = 1,…,m,   (8.37)

y ≥ 0,  r ≤ x ≤ s,   (8.38)

where J₊ is the set of all j such that all c_j(x), a_ij(x), i = 1,…,m, are increasing,
and J₋ is the set of all j such that all c_j(x), a_ij(x), i = 1,…,m, are decreasing.
Proof By hypothesis J₊ ∪ J₋ = {1,…,p}, so for every j = 1,…,p we have
In view of (8.32), (8.33), (8.35) the problem (8.31) thus reduces to the linear
program

⟨c⁰, r⟩ + max { ⟨r − s, t⟩ + Σ_{i=1}^m λ_i (⟨B_i, r⟩ − b_i) }

s.t.  Σ_{i=1}^m λ_i B_i − t ≤ c⁰,
   Σ_{i=1}^m λ_i a_ij(r) ≤ c_j(r),  j ∈ J₊,
   Σ_{i=1}^m λ_i a_ij(s) ≤ c_j(s),  j ∈ J₋,
   λ ≥ 0,  t ≥ 0,

whose dual is

⟨c⁰, r⟩ + min { Σ_{j∈J₊} c_j(r) y_j + Σ_{j∈J₋} c_j(s) y_j }

s.t.  Σ_{j∈J₊} a_ij(r) y_j + Σ_{j∈J₋} a_ij(s) y_j + ⟨B_i, r + z⟩ ≤ b_i,  i = 1,…,m,
   y ≥ 0,  0 ≤ z ≤ s − r.
8.2.2 Applications
satisfying condition (#) of Corollary 8.3, the Lagrangian bound can be computed
by solving a linear program. Furthermore, for this (GPL) condition (S) in
Theorem 8.6 now reads

(∀x ∈ X)(∃λ ∈ R^m_+)   c^T y + ⟨λ, A(x)y − b⟩ → +∞  as ‖y‖ → +∞,

where the last condition holds if and only if (A(x))^T λ + c > 0. Therefore,
as shown in Corollary 8.1, this problem can be solved by a convergent BB
decomposition algorithm using Lagrangian bounds throughout.
II. The Bilinear Matrix Inequalities Problem. Many challenging problems of con-
trol theory can be reduced to the so-called bilinear matrix inequality feasibility
problem (BMI problem for short) with the following general formulation (Tuan
et al. 2000a,b):

L₀ + Σ_{i=1}^n x_i L_{i0} + Σ_{j=1}^m y_j L_{0j} + Σ_{i=1}^n Σ_{j=1}^m x_i y_j L_{ij} ⪯ 0,   (8.42)

x ∈ X = [p, q] ⊂ R^n,  y ∈ R^m_+,   (8.43)

where x, y are the decision variables, G₀, G_j, L₀, L_{0i}, L_{j0}, L_{ij} are symmetric
matrices of appropriate sizes, and the notation G ⪯ 0 (L ≺ 0) means that G
is a negative semidefinite (L a negative definite) matrix.
For ease of notation we write diag(A, B) for the block-diagonal matrix with
diagonal blocks A and B, and define

A₀₀(x) = diag( G₀,  L₀ + Σ_{i=1}^n x_i L_{i0},  ⟨x, c⟩ − d ),

A_{j0}(x) = diag( G_j,  L_{0j} + Σ_{i=1}^n x_i L_{ij},  d_j ),   Q₀₀ = diag(0, 0, 1).

Then, as shown in Tuan et al. (2000a), this problem can be converted to the form

min { t |  A₀(x, p, q) + Σ_{j=1}^m y_j A_j(x, p, q) ⪯ tQ,  y ≥ 0,  x ∈ X },

where

A_j(x, p, q) = diag( A_{j0}(x), A_{j1}(x, p, q) ),   Q = diag( Q₀₀, Q₀₁ ),   Q₀₁ = 0,

A_{j1}(x, p, q) = diag( (x₁ − p₁)G_j, (q₁ − x₁)G_j, …, (xₙ − pₙ)G_j, (qₙ − xₙ)G_j ),
  j = 0, 1, …, n.
In this form the BMI problem appears as a problem (GPL). Condition (S) in
Theorem 8.6 can be formulated as
Therefore, by Theorem 8.6, under this assumption the BMI problem can be solved
by a convergent branch and bound algorithm using Lagrangian bounds, as proposed
in Tuan et al. (2000a). Note that the Lagrangian dual of the problem is

max_{Z⪰0} min_{t∈R, y≥0, x∈M} { t + Tr[ Z( A₀(x, p, q) + Σ_{j=1}^m y_j A_j(x, p, q) − tQ ) ] }.
As we saw in Chap. 6 and the preceding sections, Lagrange duality is very useful in
global optimization, especially for computing bounds when solving a nonconvex
problem by a BB algorithm. However, since the dual problem of a nonconvex
optimization problem is also a nonconvex problem, these Lagrange bounds are often
not easily computable. In this section we present a different duality theory due to
Thach (1991, 1994, 2012) (see also Thach and Thang 2011) which has the attractive
feature of being symmetric and may turn certain difficult nonconvex problems into
more tractable, sometimes even convex or linear, ones.
Basic to this nonconvex duality theory is the concept of quasiconjugacy, which
plays a role similar to that of conjugacy in convex duality theory (Sect. 2.10).
Given any function f: R^n_+ → R_+, f ≢ 0, the function f♯: R^n_+ → [0, +∞) such
that

f♯(p) = 1 / sup{ f(x) |  p^T x ≤ 1, x ≥ 0 }   ∀p ∈ R^n_+

(where 1/(+∞) = 0 by the usual convention) is called the quasiconjugate of f(x).
Example 8.2 Let f be a Cobb–Douglas function on R^n_+, i.e.,

f(x) = ∏_{i=1}^n x_i^{α_i},  α_i > 0.

Then

f♯(p) = σ^σ ∏_{i=1}^n (p_i/α_i)^{α_i},   p ∈ R^n_+,

where σ = Σ_{i=1}^n α_i.

Proof If p = (p₁, p₂, …, pₙ) and p_i = 0 for some i = 1,…,n, then p^T x^k ≤ 1
for every x^k = (x₁, x₂, …, k·x_i, …, xₙ), where x > 0 satisfies p^T x ≤ 1. Since
f(x) > 0, f(x^k) = k^{α_i} f(x) → +∞ as k → +∞, it follows that f♯(p) = 0.
If p > 0 then, since f(x) is continuous, the problem max{ f(x) | p^T x ≤ 1, x ≥ 0 }
has a solution x̄. Noting that f(x) = 0 ∀x ∈ R^n_+ \ R^n_{++}, we have x̄ > 0. By the KKT
condition there exists μ > 0 such that

∇f(x̄) − μ ∇(p^T x̄ − 1) = 0,   μ (p^T x̄ − 1) = 0.

Solving this system yields x̄_i = α_i/(σ p_i), i = 1,…,n. Hence,

f♯(p) = 1/f(x̄) = σ^σ ∏_{i=1}^n (p_i/α_i)^{α_i}  for all p > 0. □
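The closed form above is easy to sanity-check numerically. The following sketch (a verification under the formulas as reconstructed here, with made-up data α and p) confirms that f♯(p) = σ^σ ∏(p_i/α_i)^{α_i} equals 1/f(x̄) at the KKT point x̄_i = α_i/(σ p_i), and that no random feasible point beats x̄:

```python
import math
import random

def f(x, alpha):
    # Cobb-Douglas value  f(x) = prod x_i^alpha_i
    return math.prod(xi ** ai for xi, ai in zip(x, alpha))

def quasiconjugate(p, alpha):
    # closed form  f#(p) = sigma^sigma * prod (p_i/alpha_i)^alpha_i
    sigma = sum(alpha)
    return sigma ** sigma * math.prod((pi / ai) ** ai for pi, ai in zip(p, alpha))

alpha = [0.5, 1.5, 1.0]          # made-up exponents
p = [2.0, 0.7, 1.3]              # made-up price vector, p > 0
sigma = sum(alpha)

# KKT maximizer of f over {p^T x <= 1, x >= 0}: x_i = alpha_i / (sigma * p_i)
xbar = [ai / (sigma * pi) for ai, pi in zip(alpha, p)]
assert abs(sum(pi * xi for pi, xi in zip(p, xbar)) - 1.0) < 1e-12  # constraint binds

# f#(p) = 1 / f(xbar)
assert abs(quasiconjugate(p, alpha) - 1.0 / f(xbar, alpha)) < 1e-9

# no random point on the constraint hyperplane beats the KKT point
random.seed(0)
for _ in range(1000):
    y = [random.random() for _ in alpha]
    s = sum(pi * yi for pi, yi in zip(p, y))
    y = [yi / s for yi in y]      # scale onto p^T y = 1
    assert f(y, alpha) <= f(xbar, alpha) + 1e-12
print("quasiconjugate formula verified")
```

The check exploits that log f is concave, so the KKT point is the global maximizer over the simplex-like feasible set.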
f(x) = 1 / sup{ f♯(p) |  p^T x ≤ 1, p ≥ 0 }.   (8.45)
Proof (i) We need only prove the quasiconcavity of f♯. Let p, p′ ∈ R^n_+. For every
t ∈ [0, 1] and x ≥ 0 we can write
hence
which implies
as was to be proved.
(ii) Define

f̃(x) = 1 / sup{ f♯(p) : p^T x ≤ 1, p ≥ 0 }.

For λ > 0 let F_λ, F♯_λ denote the upper level sets of f̃ and f♯, respectively. Then

f̃(x) < λ
 ⇔ 1 / sup{ f♯(p) : p^T x ≤ 1, p ≥ 0 } < λ
 ⇔ f♯(p) > 1/λ  for some p ≥ 0 with p^T x ≤ 1
 ⇔ 1 / sup{ f(x′) : p^T x′ ≤ 1, x′ ≥ 0 } > 1/λ  for some such p
 ⇔ sup{ f(x′) : p^T x′ ≤ 1, x′ ≥ 0 } < λ
 ⇔ f(x′) < λ  ∀x′ ≥ 0 such that p^T x′ ≤ 1.
f̃(x) ≥ λ
 ⇔ 1 / sup{ f♯(p) : p^T x ≤ 1, p ≥ 0 } ≥ λ
 ⇔ f♯(p) ≤ 1/λ  ∀p ≥ 0 such that p^T x ≤ 1
 ⇔ 1 / sup{ f(x′) : p^T x′ ≤ 1, x′ ≥ 0 } ≤ 1/λ  ∀p ≥ 0 such that p^T x ≤ 1
 ⇔ sup{ f(x′) : p^T x′ ≤ 1, x′ ≥ 0 } ≥ λ  ∀p ≥ 0 such that p^T x ≤ 1.   (8.46)

q^T x′ > β  ∀x′ ∈ F_λ;   (8.47)
q^T x′ < β  ∀x′ ∈ [0, x].   (8.48)

From (8.47) we have q ≥ 0 by Lemma 1. Further, from (8.48) it follows that β > 0
and there exists t > 0 sufficiently small such that (q + te)^T x ≤ β, where e =
(1, 1, …, 1). Setting p = (1/β)(q + te), we have p > 0 and

p^T x′ ≥ (1/β) q^T x′ > 1  ∀x′ ∈ F_λ;   (8.49)
p^T x ≤ 1.   (8.50)

Consequently,

sup{ f(x′) : p^T x′ ≤ 1, x′ ≥ 0 } ≤ λ.
8.3.2 Examples

f(x) = ∏_{j=1}^k (f_j(x))^{α_j},  α_j > 0;

f(x) = (a₁x₁ + a₂x₂ + ⋯ + aₙxₙ)^α;

f(x) = Σ_{j=1}^m c_j ∏_{i=1}^n (x_i)^{a_ij}  with c_j > 0 and a_ij ≥ 0.
∇̃f(x) := (1/⟨∇f(x), x⟩) ∇f(x) ∈ ∂♯f(x),   (8.53)

⟨∇f(x), u⟩ = f′(x, u) := lim_{t→0⁺} [f(x + tu) − f(x)] / t.   (8.54)

Since f(x + tu) ≥ f(x), this implies ⟨∇f(x), u⟩ ≥ 0 ∀u ∈ R^n_+, hence ∇f(x) ≥ 0. For
every x ∈ C_x̄, if t ∈ (0, 1) then x̄ + t(x − x̄) ∈ C_x̄, so f(x̄ + t(x − x̄)) ≥ f(x̄); hence,
by (8.54) for u = x − x̄, ⟨∇f(x̄), x − x̄⟩ ≥ 0. Thus ⟨∇f(x̄), x⟩ ≥ ⟨∇f(x̄), x̄⟩ ∀x ∈ C_x̄.
Setting p := (1/⟨∇f(x̄), x̄⟩) ∇f(x̄), we can write

⟨p, x⟩ ≥ 1  ∀x ∈ C_x̄
 ⇔ f(x) < f(x̄)  ∀x ≥ 0 s.t. ⟨p, x⟩ < 1
 ⇒ f(x) ≤ f(x̄)  ∀x ≥ 0 s.t. ⟨p, x⟩ ≤ 1
 ⇔ sup{ f(x) | ⟨p, x⟩ ≤ 1, x ≥ 0 } ≤ f(x̄)
 ⇔ 1 / sup{ f(x) | ⟨p, x⟩ ≤ 1, x ≥ 0 } ≥ 1/f(x̄)
 ⇔ f(x̄) f♯(p) ≥ 1.

This together with the obvious equality ⟨p, x̄⟩ = 1 shows that p ∈ ∂♯f(x̄). □
Theorem 8.9 If f is quasiconcave, continuous, and strictly increasing on R^n_+, then ∂♯f(x̄)
is a nonempty convex set for any x̄ > 0. Moreover,

Proof Since f is strictly increasing on R^n_+, it is easily seen that x̄ is a boundary point
of the closed convex set C_x̄ := {x′ ∈ R^n_+ | f(x′) ≥ f(x̄)}. Indeed, if there were an
open ball B of center x̄ such that B ⊂ C_x̄, there would exist x̂ ∈ B such that x̂ < x̄,
which would imply f(x̂) < f(x̄), conflicting with x̂ ∈ C_x̄. Consequently, there exist
a vector q ≠ 0 and a number β ∈ R such that

⟨q, z⟩ ≥ β  ∀z ∈ C_x̄,   (8.56)
⟨q, x̄⟩ = β.   (8.57)

From (8.58) we have f(z) < f(x̄) for all z ≥ 0 such that p^T z < 1, hence f(z) ≤ f(x̄)
for all z ≥ 0 such that p^T z ≤ 1. Thus sup{ f(z) | p^T z ≤ 1, z ≥ 0 } ≤ f(x̄), and therefore
f♯(p) f(x̄) ≥ 1. This together with (8.59) shows that p ∈ ∂♯f(x̄). The convexity of
∂♯f(x̄) then follows from the definition of quasi-supgradient.
The equivalence (8.55) follows from the equality (f♯)♯ = f and the definition of
quasi-supgradient. □
∀x ∈ X:  0 ≤ x′ ≤ x  ⇒  x′ ∈ X.   (8.60)

In economics, when X represents the production set (i.e., the set of all feasible
production programs) this condition is referred to as a free disposal condition.
Obviously, it is equivalent to saying that

∀x ∈ X:  [0, x] ⊂ X.

In monotonic optimization (see Chap. 11) a set X ⊂ R^n_+ satisfying this condition is
said to be normal.
Consider the optimization problem:
Since f is continuous and X is compact, this problem has a solution; moreover, its
optimal value is positive.
For every vector x ∈ X let N_X(x) be the normal cone of X at x (cf. Sect. 1.6):

Proof Suppose x̄ ∈ X and (8.61) holds. Then there exists a vector p ∈ ∂♯f(x̄) such
that p ∈ N_X(x̄). Since p ∈ ∂♯f(x̄) we have p^T x̄ = 1 and f♯(p) f(x̄) = 1, while
p ∈ N_X(x̄) implies that p^T(x − x̄) ≤ 0, hence p^T x ≤ 1 ∀x ∈ X. From the definition
of f♯ it is immediate that
Therefore,
and consequently,

p^T x ≥ 1  ∀x ∈ C_x̄,   p^T x̄ = 1.
X♯ = { p ≥ 0 |  p^T x ≤ 1 ∀x ∈ X }.   (8.62)

Since X♯ = X° ∩ R^n_+, where X° is the polar of X defined in Sect. 1.8, it follows from
the properties of X that X♯ is a compact convex set with nonempty interior in R^n_+
and, moreover, that X is the lower conjugate of X♯:

X = { x ≥ 0 :  p^T x ≤ 1 ∀p ∈ X♯ }.   (8.63)
x^k_i = { (p^k_i / p_i) x_i  if p_i > 0;   x_i  if p_i = 0, }

hence,

0 ∈ ∂♯f(x̄) − N_X(x̄),

q^T x̄ = 1,   (8.65)
f(x̄) f♯(q) = 1,   (8.66)
q^T x ≤ 1  ∀x ∈ X.   (8.67)

0 ∈ ∂♯f♯(q) − N_{X♯}(q).

By Theorem 8.10, it follows that q is an optimal solution of the problem (DP). Since p
also solves (DP), we have f♯(q) = f♯(p). Therefore, f♯(p) f(x̄) = 1. □

Corollary 8.4 Let x̄ ∈ X and p̄ ∈ X♯. Then x̄ solves (P) and p̄ solves (DP) if and
only if

p̄ ∈ ∂♯f(x̄)  or  x̄ ∈ ∂♯f♯(p̄).

Proof The "if" part is obvious. To prove the "only if" part, suppose x̄ solves (P)
and p̄ solves (DP). By Theorem 8.11 we have

f♯(p̄) f(x̄) = 1.
8.4 Application
consider the problem of finding a couple (x̄, p̄) ∈ R^n_+ × R^n_+ satisfying the following
system of equalities and inequalities:

x = Σ_{i=1}^m λ_i aⁱ,  λ_i ≥ 0, i = 1, 2, …, m,  Σ_{i=1}^m λ_i = 1,   (8.70)

p_j ≥ 0, j = 1, 2, …, n,   p^T aⁱ ≤ 1, i = 1, 2, …, m,   (8.71)

p_j x_j = α_j,  j = 1, 2, …, n.   (8.72)

This problem is encountered, e.g., in the activity planning of a company that has m
factories for producing n different goods. For the production of these goods a given
resource of total amount 1 has to be allocated to the factories. Let aⁱ be the capacity
vector of the i-th factory, so the i-th factory can produce a_ij units of the j-th good,
j = 1,…,n, when running at full capacity. Let p_j denote the amount of resource to
be allocated to the production of a unit of the j-th good and x_j the quantity of the
j-th good to be produced. Then p_j x_j is the total amount of resource allocated to the
production of the j-th good. The technology and/or the market requires that
p_j x_j = α_j, j = 1,…,n, where α₁,…,αₙ are given numbers. The problem is to select
a feasible activity program, i.e., a vector (x̄, p̄) ∈ R^n_+ × R^n_+, satisfying the system
(8.70)–(8.72).
Let X denote the set of all x ∈ R^n_+ satisfying (8.70) and P the set of all p ∈ R^n_+
satisfying (8.71). It is obvious that both X and P are convex polytopes with nonempty
interior. Furthermore, by formulas (8.62)–(8.63),

P = X♯,   X = P♯.   (8.73)

A solution to the system (8.70)–(8.72) is a vector (x̄, p̄) ∈ X × P satisfying
the equalities (8.72). Despite the nonconvexity of these equalities, we will show
that by quasiconjugate duality the system (8.70)–(8.72) can be reduced to a convex
minimization problem. More specifically, this system has a unique solution (x̄, p̄),
which can be obtained by solving either of two concave maximization (i.e., convex
minimization) problems.
Define the functions

f(x) := ∏_{j=1}^n x_j^{α_j},  x ∈ X,

g(p) := ∏_{j=1}^n p_j^{α_j},  p ∈ P.
g(p) = f♯(p) · ∏_{j=1}^n α_j^{α_j},   f(x) = g♯(x) · ∏_{j=1}^n α_j^{α_j}.

Since X and P contain positive vectors, the optimal values in (8.74) and (8.75)
are positive. Moreover, since the functions log f(x) = Σ_{j=1}^n α_j log x_j and
log g(p) = Σ_{j=1}^n α_j log p_j are strictly concave on the positive orthant, each of the
problems (8.74) and (8.75) has a unique optimal solution.
Noting that P = X♯ while g(p) = f♯(p) · ∏_{j=1}^n α_j^{α_j}, we see that problem (8.75) is
the dual of problem (8.74). In fact, as f(x) = g♯(x) · ∏_{j=1}^n α_j^{α_j} and X = P♯, the two
problems are duals of each other.

Theorem 8.12 If x̄ solves problem (8.74), then p̄ = ∇̃f(x̄) solves problem (8.75).
Conversely, if p̄ solves problem (8.75), then x̄ = ∇̃g(p̄) solves problem (8.74).

Proof This follows from Corollary 8.4 and Proposition 8.2. □
p̄ = ∇f(x̄) / ⟨∇f(x̄), x̄⟩ = ∇̃f(x̄).

By Corollary 8.4, x̄ solves (8.74) and p̄ solves (8.75).
Conversely, if x̄ solves (8.74) and p̄ = ∇̃f(x̄), then obviously p̄ satisfies (8.72), so
(x̄, p̄) is a solution of the system (8.70)–(8.72). Analogously, if p̄ solves (8.75) and
x̄ = ∇̃g(p̄), then x̄ satisfies (8.72), so (x̄, p̄) is a solution of the system (8.70)–(8.72). □

The above theorem shows that the nonconvex system (8.70)–(8.72) has a unique
solution (x̄, p̄), which can be obtained either by solving the concave maximization
problem (8.74) (then x̄ is the unique solution of (8.74) and p̄ = ∇̃f(x̄)) or by solving
the concave maximization problem (8.75) (then p̄ is the unique solution of (8.75)
and x̄ = ∇̃g(p̄)).
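This reduction can be illustrated with a small numeric sketch. The capacities aⁱ and weights α_j (with Σ_j α_j = 1) below are made-up data, and the exponentiated-gradient solver is an implementation choice, not from the text: it maximizes Σ_j α_j log x_j over X = conv{a¹,…,a^m}, then sets p̄ = ∇̃f(x̄) = (α_j/x̄_j)_j and checks that (8.71)–(8.72) hold.

```python
import math

# made-up data: m = 3 factory capacity vectors in R^2, weights on the simplex
A = [[4.0, 1.0], [2.0, 3.0], [1.0, 5.0]]
alpha = [0.4, 0.6]

def x_of(lam):                     # x = sum_i lam_i a^i
    return [sum(l * a[j] for l, a in zip(lam, A)) for j in range(len(alpha))]

# exponentiated-gradient ascent of sum_j alpha_j log x_j over the simplex of lam
lam = [1.0 / len(A)] * len(A)
for _ in range(5000):
    x = x_of(lam)
    grad = [sum(alpha[j] / x[j] * A[i][j] for j in range(len(alpha)))
            for i in range(len(A))]          # d(log f)/d(lam_i)
    w = [l * math.exp(0.05 * g) for l, g in zip(lam, grad)]
    s = sum(w)
    lam = [wi / s for wi in w]

xbar = x_of(lam)
# p = grad f(xbar)/<grad f(xbar), xbar>; for f = prod x^alpha this is alpha_j/x_j
p = [alpha[j] / xbar[j] for j in range(len(alpha))]

# (8.72): p_j x_j = alpha_j holds by construction of p
assert all(abs(p[j] * xbar[j] - alpha[j]) < 1e-12 for j in range(len(alpha)))
# (8.71): p^T a^i <= 1 is the first-order optimality condition at xbar
assert all(sum(p[j] * A[i][j] for j in range(len(alpha))) <= 1.0 + 1e-3
           for i in range(len(A)))
print("equilibrium x =", xbar, " p =", p)
```

For this data the maximizer lies on the edge between a¹ and a³ (x̄ = (1.9, 3.8)), and p̄^T aⁱ = 1 holds exactly for the two supporting factories.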
A vector α = (α₁,…,αₙ) ∈ R^n_+ with α_j = Σ_{r=1}^k μ_r α_j^r (the cost of all resources to be
used for producing the j-th good), j = 1,…,n, will then be referred to as a resource cost
allocation vector. It indicates that an amount of resources of total cost α_j is allocated
to the production of the j-th good, j = 1,…,n. Let Λ be the set of all resource cost
allocation vectors, i.e.,

Λ = { α ∈ R^n_+ |  α = Σ_{r=1}^k μ_r α^r,  Σ_{r=1}^k μ_r = 1,  μ_r ≥ 0,  r = 1, 2, …, k }.   (8.77)
x = Σ_{i=1}^m λ_i aⁱ,  λ_i ≥ 0, i = 1, 2, …, m,  Σ_{i=1}^m λ_i = 1,   (8.78)

p_j ≥ 0, j = 1, 2, …, n,   p^T aⁱ ≤ 1, i = 1, 2, …, m,   (8.79)

p_j x_j = α_j,  j = 1, 2, …, n,   (8.80)

α = (α₁,…,αₙ) ∈ Λ.   (8.81)

Clearly the system (8.70)–(8.72) is a special case of the system (8.78)–(8.81) when
k = 1, i.e., when Λ is the standard simplex in R^n_+ (which is obvious from (8.76)).
We now show that solving the system (8.78)–(8.81) reduces to solving either of two
suitable concave maximization problems.
Define

f_r(x) = ∏_{j=1}^n x_j^{α_j^r},  r = 1, 2, …, k,

g_r(p) = ∏_{j=1}^n p_j^{α_j^r},  r = 1, 2, …, k.

As before let X be the set of all x satisfying (8.78) and P the set of all p
satisfying (8.79). For each μ ∈ R^k_+ consider the two following problems:

max { ∏_{r=1}^k f_r(x)^{μ_r} |  x ∈ X },   (8.82)

max { ∏_{r=1}^k g_r(p)^{μ_r} |  p ∈ P }.   (8.83)
α = Σ_{r=1}^k μ_r α^r.   (8.84)

By Theorem 8.13, x̄ is the unique optimal solution of problem (8.74) and p̄ the unique
optimal solution of problem (8.75), where α_j = Σ_{r=1}^k μ_r α_j^r. However, from (8.84) it
follows that

f(x) = ∏_{j=1}^n x_j^{α_j}
  = ∏_{j=1}^n x_j^{Σ_{r=1}^k μ_r α_j^r}
  = ∏_{j=1}^n ∏_{r=1}^k x_j^{μ_r α_j^r}
  = ∏_{r=1}^k ∏_{j=1}^n x_j^{μ_r α_j^r}
  = ∏_{r=1}^k ( ∏_{j=1}^n x_j^{α_j^r} )^{μ_r}
  = ∏_{r=1}^k f_r(x)^{μ_r}.

Similarly,

g(p) = ∏_{r=1}^k g_r(p)^{μ_r},
p_j ≥ 0, j = 1, 2, …, n,   p^T aⁱ ≤ 1, i = 1, 2, …, m,   (8.87)

p_j x_j = α_j,  j = 1, 2, …, n,   (8.88)

α = (α₁,…,αₙ) ∈ Λ.   (8.89)

By rescaling if necessary we can assume without loss of generality that min_{j=1,…,n} a_ij
> 2 for every i = 1,…,m, and consequently, that

min_{j=1,…,n} x_j > 2  ∀x ∈ X.   (8.92)
Define now

M = { μ = (μ₁, μ₂, …, μ_k) ∈ R^k_+ |  max_{x∈X} ∏_{r=1}^k f_r^{μ_r}(x) ≤ 3 },

h(μ) = max { q(x) |  ∏_{r=1}^k f_r^{μ_r}(x) ≥ 3,  x ∈ X }   ∀μ ∈ R^k_+.

Note that for each μ ∈ R^k_+ the function log ∏_{r=1}^k f_r^{μ_r}(x) = Σ_{r=1}^k μ_r log f_r(x) =
Σ_{r=1}^k μ_r Σ_{j=1}^n α_j^r log x_j is concave in x, so the feasible set of the subproblem that
defines h(μ) is a convex subset of the polytope X. Therefore, the value of h(μ)
is obtained by solving a convex problem (maximizing a concave function over a
compact convex set).
Lemma 8.2 M is a nonempty compact convex set in R^k_+.

Proof For any μ ∈ R^k_+ we have

max_{x∈X} ∏_{r=1}^k f_r^{μ_r}(x) ≤ 3  ⇔  max_{x∈X} Σ_{r=1}^k μ_r log f_r(x) ≤ log 3.

Since max_{x∈X} Σ_{r=1}^k μ_r log f_r(x) ≤ ( max_{x∈X} max_r log f_r(x) ) Σ_{r=1}^k μ_r, it is easily seen
that M ≠ ∅. Moreover, by (8.92),

μ ∈ M ⇔ μ ≥ 0,  h₀(μ) ≤ log 3
 ⇒ μ ≥ 0,  Σ_{r=1}^k μ_r log f_r(x) ≤ log 3  ∀x ∈ X
 ⇒ μ ≥ 0,  Σ_{r=1}^k μ_r log 2 ≤ log 3.

So M is bounded. □
Lemma 8.3 The function h is continuous and convex on R^k_+.

Proof The constraint ∏_{r=1}^k f_r^{μ_r}(x) ≥ 3 can be written as Σ_{r=1}^k μ_r log f_r(x) ≥ log 3.
Since q(x) and log f_r(x) are continuous and concave, the subproblem that defines h(μ)
is a convex program. Since furthermore log f_r(x) is strictly concave, it is easily seen
that there must exist x ∈ X satisfying Σ_{r=1}^k μ_r log f_r(x) > log 3. Therefore, by the
strong Lagrange duality theorem, there exists η ∈ R_+ such that

h(μ) = max_{x∈X} { q(x) + η [ Σ_{r=1}^k μ_r log f_r(x) − log 3 ] }.

Thus h(μ) is the upper envelope of a family of affine functions of μ. The convexity and
continuity of h(μ) on R^k_+ follow. □
Consider now the problem

max { h(μ) |  μ ∈ M }.   (8.93)

∏_{r=1}^k f_r^{μ_r}(x*) = 3.   (8.94)

Indeed, let

ω = ∏_{r=1}^k f_r^{μ̄_r}(x*).

Then

θ = log 3 / log ω

yields θ log ω = log 3, i.e., log ∏_{r=1}^k f_r^{θμ̄_r}(x*) = log 3, hence (8.94). Now let
μ = θμ̄. Clearly

max_{x∈X} ∏_{r=1}^k f_r^{μ_r}(x) = ( max_{x∈X} ∏_{r=1}^k f_r^{μ̄_r}(x) )^θ = ( ∏_{r=1}^k f_r^{μ̄_r}(x*) )^θ = 3,

so μ ∈ M. Hence,

q* = q(x*) ≤ sup { q(x) |  ∏_{r=1}^k f_r^{μ_r}(x) ≥ 3, x ∈ X } = h(μ) ≤ h*.

∏_{r=1}^k f_r^{μ_r}(x*) ≤ sup_{x∈X} ∏_{r=1}^k f_r^{μ_r}(x) ≤ 3.

But if

∏_{r=1}^k f_r^{μ_r}(x*) < 3,

then

h(μ) = sup { q(x) |  ∏_{r=1}^k f_r^{μ_r}(x) ≥ 3, x ∈ X } = sup ∅ = −∞ < h*,

hence

∏_{r=1}^k f_r^{μ_r}(x*) = 3,

and therefore,

h* = h(μ) = sup { q(x) |  ∏_{r=1}^k f_r^{μ_r}(x) ≥ 3, x ∈ X } = q(x*) ≤ q*.
Proof It suffices to check that if g(x) belongs to one of the mentioned types and
g(w) > 0, then for any given z ≠ w one can easily determine whether g(w + λ(z −
w)) ≤ 0 for some λ > 0 and, if so, compute the smallest such λ. In particular, if g(x)
is quadratic (definite or indefinite), then g(w + λ(z − w)) is a quadratic function of
λ, so computing min{λ > 0 | g(w + λ(z − w)) ≤ 0} is an elementary problem. □
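As an illustration of this elementary computation, the smallest such λ for a quadratic g can be read off from the roots of the quadratic in λ. The sketch below (a made-up quadratic g(x) = xᵀQx + bᵀx + c with symmetric Q, under the standing assumption g(w) > 0) does exactly this:

```python
import math

def smallest_lambda(Q, b, c, w, z):
    """Smallest lam > 0 with g(w + lam*(z - w)) <= 0, for g(x) = x'Qx + b'x + c
    (Q symmetric) under the assumption g(w) > 0; None if g stays positive."""
    d = [zi - wi for zi, wi in zip(z, w)]
    quad = lambda u, v: sum(u[i] * Q[i][j] * v[j]
                            for i in range(len(u)) for j in range(len(v)))
    A = quad(d, d)                                 # lam^2 coefficient
    B = 2 * quad(w, d) + sum(bi * di for bi, di in zip(b, d))
    C = quad(w, w) + sum(bi * wi for bi, wi in zip(b, w)) + c   # = g(w) > 0
    if abs(A) < 1e-12:                             # g is affine along the ray
        return -C / B if B < 0 else None
    disc = B * B - 4 * A * C
    if A < 0:                                      # concave: g <= 0 beyond larger root
        return (-B - math.sqrt(disc)) / (2 * A)    # A < 0 makes this the larger root
    if disc < 0:
        return None                                # convex and always positive
    r1 = (-B - math.sqrt(disc)) / (2 * A)          # convex: g <= 0 on [r1, r2]
    return r1 if r1 > 0 else None

# example: S = {x : ||x||^2 <= 4}; w = (3, 0) lies outside S, z = (0, 0) inside
Q = [[1.0, 0.0], [0.0, 1.0]]
lam = smallest_lambda(Q, [0.0, 0.0], -4.0, [3.0, 0.0], [0.0, 0.0])
assert abs(lam - 1.0 / 3.0) < 1e-12               # the ray enters S at (2, 0)
print("entry parameter lambda =", lam)
```

The sign analysis uses only that C = g(w) > 0: in the concave case the two roots then have opposite signs, so the larger root is the entry parameter.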
Since f(x) is convex, one can easily compute w ∈ argmin{f(x) | x ∈ Rⁿ}. If it so
happens that w ∈ S, we are done: w solves the problem. Otherwise, w ∉ S and we
have

f(w) < min{ f(x) |  x ∈ S }.   (8.96)
From the convexity of f(x) it then follows that f(w + λ(z − w)) < f(z) for any
z ∈ S, λ ∈ (0, 1), so the visible feasible point in [w, z] achieves the minimum of
f(x) over the line segment [w, z]. Therefore, assuming the availability of a point w
satisfying (8.96), we can reformulate (8.95) as the following:

Optimal Visible Point Problem: Find the feasible point visible from w with
minimal function value f(x).

Aside from the basic visibility condition and (8.96), we will further assume
that:
(a) the set S is robust, i.e., S = cl(int S);
(b) f(x) is strictly convex and has bounded level sets;
(c) a number γ ∈ R is known such that f(x) < γ ∀x ∈ S.
Let P₀ be a polytope containing S. If we denote γ* = min{f(x) | x ∈ S} and
G = {x ∈ P₀ | f(x) ≤ γ*}, then G is a closed convex set and the problem is to find a
point of Ω := S ∩ G. Applying the outer approximation method to this problem,
we construct a sequence of polytopes P₁ ⊃ P₂ ⊃ ⋯ ⊃ G. At iteration k let x̄^k
be the incumbent feasible point and z^k ∈ argmax{f(x) | x ∈ P_k}. The distinguished
point associated with P_k is defined to be x^k = π(z^k). Clearly, if x^k ∈ G, then x^k
is an optimal solution. If f(x^k) < f(x̄^k), then x̄^{k+1} = x^k; otherwise x̄^{k+1} = x̄^k.
Furthermore, P_{k+1} = P_k ∩ {x | l(x) ≤ 0} where l(x) ≤ 0 is a cut which separates
x^k from the closed convex set G. We can thus state the following OA algorithm for
solving the problem (8.95):
Optimal Visible Point Algorithm
Initialization. Take a polytope (simplex or rectangle) P₁ known to contain one
optimal solution. Let x̄¹ be the best feasible solution available, γ₁ = f(x̄¹) (if no
feasible solution is known, set x̄¹ = ∅, γ₁ = γ). Let V₁ be the vertex set of P₁. Set
k = 1.
Step 1. If x̄^k ∈ argmin{f(x) | x ∈ P_k}, then terminate: x̄^k is an optimal solution.
Step 2. Compute z^k ∈ argmax{f(x) | x ∈ V_k}.
Step 3. Find π(z^k). If π(z^k) exists and f(π(z^k)) < f(x̄^k), set x̄^{k+1} = π(z^k) and
γ_{k+1} = f(x̄^{k+1}); otherwise set x̄^{k+1} = x̄^k, γ_{k+1} = γ_k.
Step 4. Let y^k ∈ [w, z^k] ∩ {x | f(x) = γ_{k+1}}. Compute p^k ∈ ∂f(y^k) and let
P_{k+1} = P_k ∩ {x | ⟨p^k, x − y^k⟩ ≤ 0}.
Step 5. Compute the vertex set V_{k+1} of P_{k+1}. Set k ← k + 1 and go back to Step 1.
Figure 8.3 illustrates the algorithm on a simple example in R² where the feasible
domain consists of two disjoint parts. Starting from a rectangle P₁ we arrive after
two iterations at a polygon P₃. The feasible solution x² obtained at the second
iteration is already an optimal solution.
Theorem 8.12 Either the above algorithm terminates with an optimal solution, or it
generates an infinite sequence {x̄^k} every accumulation point of which is an optimal
solution.
(Fig. 8.3: the rectangle P₁ is cut twice to produce the polygon P₃; marked are the
points w, z¹, z², π(z¹), and π(z²).)
(Of course this involves minimizing the concave functions g̃_i(x) over P_k, but not
much extra computational effort is needed, since we have to compute the vertex set
V_k of P_k anyway.) Let g(x) = max_{i=1,…,m} g_i(x). Clearly,
implying that there is no x ∈ P₁ such that g(x) < −ε_k, f(x) ≤ f(x̄^k). In other words,
The most challenging of all nonconvex optimization problems is the one in which
no convexity or reverse convexity is explicitly present, namely:

¹If f, g₁,…,g_m are convex, condition (8.99) is implied by int S ≠ ∅, and by writing (8.101) in the
form 0 ∈ ∂F(γ̄, x̄) we recover the classical optimality criterion in convex programming.
However, finding the global minimum of F(γ, x) may be as hard as solving the
original problem. Therefore, we look for some function φ(γ, x) which could do
essentially the same job as F(γ, x) while being easier to handle. The key to this is
the representation of closed sets by dc inequalities, as discussed in Chap. 3.
Define
To make the problem tractable, assume that for every γ ∈ (−∞, +∞] a function
r(γ, x) (a separator for (f, S)) is available which is l.s.c. in x and satisfies:
(a) r(γ, x) = 0 ∀x ∈ S̃_γ;
(b) 0 < r(γ, y) ≤ d(y, S̃_γ) := min{‖x − y‖ | x ∈ S̃_γ} ∀y ∉ S̃_γ;
(c) for fixed x, r(γ₁, x) ≥ r(γ₂, x) if γ₁ < γ₂.
This assumption is fulfilled in many cases of interest, as shown in the following
examples:

Example 8.3 If all functions f, g₁, …, g_m are Lipschitzian with constants
L₀, L₁, …, L_m, then a separator is

r(γ, x) = max { 0,  (f(x) − γ)/L₀,  g_i(x)/L_i,  i = 1,…,m }.

Indeed, for any x ∈ S̃_γ:

f(y) − γ ≤ f(y) − f(x) ≤ L₀ ‖x − y‖,
g_i(y) ≤ g_i(y) − g_i(x) ≤ L_i ‖x − y‖,  i = 1,…,m,

hence r(γ, y) ≤ ‖x − y‖; from this it is easy to verify (a)–(c).
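The separator properties can be probed numerically. The following sketch uses made-up one-dimensional data (f(x) = sin x with L₀ = 1 and a single constraint g₁(x) = |x| − 2 with L₁ = 1) and verifies on sample points that r(γ, y) vanishes on S̃_γ and never exceeds the distance to S̃_γ:

```python
import math

L0, L1 = 1.0, 1.0                  # Lipschitz constants of f and g1
f = math.sin
g1 = lambda x: abs(x) - 2.0

def r(gamma, x):                   # separator of Example 8.3
    return max(0.0, (f(x) - gamma) / L0, g1(x) / L1)

gamma = 0.3
# fine grid approximation of S~_gamma = {x : f(x) <= gamma, g1(x) <= 0}
grid = [-2.0 + 4.0 * i / 4000 for i in range(4001)]
S_tilde = [x for x in grid if f(x) <= gamma and g1(x) <= 1e-9]

assert r(gamma, 0.0) == 0.0        # property (a): r vanishes on S~_gamma
for y in (-3.5, -0.2, 0.7, 1.4, 2.9):
    dist = min(abs(x - y) for x in S_tilde)
    assert r(gamma, y) <= dist + 1e-3   # property (b): r underestimates the distance
print("separator properties (a)-(b) verified on sample points")
```

At y = −3.5 the bound is tight (r = d = 1.5), showing the separator can attain the distance.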
Example 8.4 If f is twice continuously differentiable, with ‖f″(x)‖ ≤ K for every
x, and S = {x | ‖x‖ ≤ c}, then a separator is

r(γ, x) = max{ ρ(γ, x),  ‖x‖ − c },

where ρ(γ, y) denotes the smallest t ≥ 0 satisfying (K/2)t² + ‖f′(y)‖ t ≥ f(y) − γ.
Indeed, for any x ∈ S̃_γ, by Taylor's theorem,

f(y) − γ ≤ f(y) − f(x) ≤ (K/2)‖x − y‖² + ‖f′(y)‖ ‖x − y‖,

and it follows from the definition of ρ(γ, y) that ρ(γ, y) ≤ ‖x − y‖. On the other hand,
since ‖x‖ ≤ c, we have ‖y‖ − c ≤ ‖y‖ − ‖x‖ ≤ ‖x − y‖. Hence r(γ, y) ≤ ‖x − y‖
∀x ∈ S̃_γ, proving (b).
By Proposition 4.13, given a separator r(γ, x), if we define
where
and to solve the latter by the simple OA algorithm (Sect. 6.2). Since γ_{k+1} < γ_k,
we have h(γ_k, x) ≤ h(γ_{k+1}, x), so if G_k denotes the constraint set of (8.109), then
G_{k+1} ⊂ G_k ∀k. This allows the procedures of solving (8.109) for successive k =
1, 2, … to be integrated into a unified OA procedure for solving the single concave
minimization problem

min{ t − ‖x‖² |  (x, t) ∈ P_k }.

Proof Since t_k < ‖x^k‖², we have l_k(x^k) − t_k > l_k(x^k) − ‖x^k‖² = r²(γ_{k+1}, x^k) ≥ 0. On
the other hand, (x, t) ∈ G_{k+1} implies h(γ_{k+1}, x) − t ≤ 0, but since x^k ∉ S_{γ_{k+1}} it follows
from (8.104) that l_k(x) ≤ h(γ_{k+1}, x), and hence l_k(x) − t ≤ h(γ_{k+1}, x) − t ≤ 0. □

In view of this Lemma, if we define P_{k+1} = P_k ∩ {(x, t) | l_k(x) ≤ t}, then (x^k, t_k) ∉ P_{k+1}
while P_{k+1} ⊃ G_{k+1} ⊃ G, so the sequence P₁ ⊃ P₂ ⊃ ⋯ ⊃ G realizes an OA
scheme for solving (8.110). We are thus led to the following:
Relief Indicator OA Algorithm for (COP)
Initialization. Take a simplex or rectangle P₁ ⊂ Rⁿ known to contain an optimal
solution of the problem. Let x¹ be a point in P₁ and let γ₁ = f(x¹) if x¹ ∈ S, or
γ₁ = +∞ otherwise. Set k = 1.
Step 1. If x^k ∈ S, then improve x^k by local search methods (provided this can be
done at reasonable cost). Denote by x̄^k the best feasible solution so far
obtained. If x̄^k exists, let γ_{k+1} = f(x̄^k); otherwise let γ_{k+1} = +∞. Define
Step 2. Solve
For any q > s we have l̃_{k_s}(x^{k_q}) = l_{k_s}(x^{k_q}) − ‖x^{k_q}‖² ≤ t_{k_q} − ‖x^{k_q}‖² < 0, hence, by
making q → +∞: l̃_{k_s}(x̄) ≤ 0. But clearly, l̃_{k_s}(x^{k_s}) − l̃_{k_s}(x̄) ≤ ‖x^{k_s} − x̄‖² → 0 as
s → +∞, while l̃_k(x^k) = r²(γ_{k+1}, x^k) > 0. Therefore, l̃_{k_s}(x^{k_s}) → 0 and also
t_{k_s} − ‖x^{k_s}‖² → 0 as s → +∞. If t̄ = lim t_{k_s}, then t̄ − ‖x̄‖² = 0. Since
t_k − ‖x^k‖² = min{ t − ‖x‖² | (x, t) ∈ P₁, l_i(x) ≤ t, i = 1, 2, …, k−1 } ≤ min{ t − ‖x‖² |
(x, t) ∈ P₁, h(γ_k, x) ≤ t } ∀k, it follows that t̄ − ‖x̄‖² = min{ t − ‖x‖² | (x, t) ∈ P₁,
h(γ̄, x) ≤ t } for γ̄ = lim γ_k. That is, φ(γ̄, x̄) = 0, and hence x̄ solves (COP). □
Remark 8.11 Each subproblem (SP_k) is a linearly constrained concave mini-
mization problem which can be solved by vertex enumeration as in simple outer
approximation algorithms. Since (SP_{k+1}) differs from (SP_k) by just one additional
linear constraint, if V_k denotes the vertex set of the constraint polytope of (SP_k),
then V₁ can be considered known, while V_{k+1} can be derived from V_k using an
on-line vertex enumeration subroutine (Sect. 6.2). If the algorithm is stopped when
t_k − ‖x^k‖² > −ε², then x̄^k yields an ε-approximate optimal solution.
r(γ, x) = max{ 0,  (f(x) − γ)/28.3,  g(x) }.

With tolerance ε = 0.01 the algorithm terminates after 16 iterations, yielding an
approximate optimal solution x̄ = (0, 1) with objective function value 0.6603168.
where h(γ, x) is defined as in (8.104) and γ̄ is the (unknown) optimal value of f(x).
Since for every fixed x this is a convex problem, the most convenient way is to
branch upon x and compute bounds by using relaxed subproblems of the form
where M is the partition set, φ_M(x) is the affine function that agrees with ‖x‖² at
the vertices of M, {(x, t) | l_i(x) ≤ t (i ∈ I_k)} is the outer approximating polytope of the
convex set C := {(x, t) | h(γ, x) ≤ t}, and {(x, y) | L_j(x, y) ≤ 0 (j ∈ J_k)} is the outer
approximating polytope of the convex set D̃ = {(x, y) | G(x, y) ≤ 0}, at iteration k.
8.7 Exercises
1 α₁ = min{ cx |  x ∈ D }.

η_k = max { h(z) |  z ∈ D,  cz ≤ (α_k + β_k)/2 }

to obtain z^k.
(3) If η_k < 0 and β_k = +∞, terminate: the problem is infeasible.
(4) If η_k < 0 and β_k < +∞, set β_{k+1} = β_k, α_{k+1} = (α_k + β_k)/2 and go to iteration
k + 1.
(5) If η_k ≥ 0, then starting from z^k perform a number of simplex pivots to move to
a point x^k such that cx^k ≤ cz^k and x^k is on the intersection of an edge of D with
the surface h(x) = 0 (see Phase 1 of the FW/BW Procedure). Set x̄ = x^k, β_{k+1} =
cx^k, α_{k+1} = α_k and go to iteration k + 1.

4 Solve min{ y³ + 4.5y² − 6y |  0 ≤ y ≤ 3 }.

5 Solve

8 Solve

9 Solve

min { (1/2)x₁ + (2/3)x₂ + (1/2)x₂² − x₁² |  2x₁ − x₂ ≤ 3;  x₁, x₂ ≥ 0 }.

Show that 1) (x̄, r̄) solves (P) if and only if (x̄/r̄, 1/r̄) solves (Q); 2) when D is a
polytope, problem (P) merely reduces to a linear program; 3) when C is a polytope,
problem (Q) reduces to minimizing a dc function.

12 Solve min{ e^{−x²} sin x − |y| |  0 ≤ x ≤ 10,  0 ≤ y ≤ 10 }. (Hint: Lipschitz
constant = 2.)
Chapter 9
Parametric Decomposition
A partly convex problem as formulated in Sect. 8.2 of the preceding chapter can also
be viewed as a convex optimization problem depending upon a parameter x ∈ Rⁿ:
The dimension n of the parameter space is referred to as the nonconvexity rank of
the problem. Roughly speaking, this is the number of nonconvex variables in the
problem. As we saw in Sect. 8.2, a partly convex problem of rank n can be efficiently
solved by a BB decomposition algorithm with branching performed in Rⁿ. This
chapter discusses decomposition methods for partly convex problems with small
nonconvexity rank n. It turns out that these problems can be solved by streamlined
decomposition methods based on parametric programming.
This definition is correct in view of (9.2). Obviously, f(x) = F(Qx). For any y, y′ ∈
Q(X) such that y′ − y ∈ Q(K) we have y = Qx, y′ = Qx′, with Q(x′ − x) = Qu, u ∈
K, i.e., Qx′ = Q(x + u), u ∈ K. Here x + u ∈ X because u ∈ rec X; hence by (9.1),
f(x′) = f(x + u) ≥ f(x), i.e., F(y′) ≥ F(y). Conversely, if f(x) = F(Qx) and
F: Q(X) → R satisfies (9.4), then for any x, x′ ∈ X such that x′ − x ∈ K we have
y = Qx, y′ = Qx′, with y′ − y ∈ Q(K), hence F(y′) ≥ F(y), i.e., f(x′) ≥ f(x). □
The above representation of K-monotonic functions explains their role in global
optimization. Indeed, using this representation, any optimization problem in Rⁿ of
the form

x⁰ + K ⊂ X(γ) := { x ∈ X |  f(x) ≥ γ }.

Proposition 9.1 provides the foundation for reducing the dimension of prob-
lem (9.6), whereas Proposition 9.2 suggests methods for restricting the global search
domain. Indeed, since local information is generally insufficient to verify the global
optimality of a solution, the search for a global optimal solution must be carried
out, in principle, over the entire feasible set. If, however, the objective function f(x)
in (9.6) is K-monotonic, then, once a solution x⁰ ∈ D is known, one can ignore
the whole set D ∩ (x⁰ + K) ⊂ { x ∈ D | f(x) ≥ f(x⁰) }, because no better feasible
solution than x⁰ can be found in this set. Such information is often very
helpful and may drastically simplify the problem by limiting the global search to a
restricted region of the feasible domain. In the next sections, we shall discuss how
efficient algorithms for K-monotonic problems can be developed based on these
observations.
Below are some important classes of quasiconcave monotonic functions encoun-
tered in the applications.

Example 9.1 f(x) = φ(x₁,…,x_p) + d^T x,
with φ(y₁,…,y_p) a continuous concave function of y = (y₁,…,y_p).
Here F(y) = φ(y₁,…,y_p) + y_{p+1}, Qx = (x₁,…,x_p, d^T x)^T. Monotonicity holds
with respect to the cone K = { u | u_i = 0 (i = 1,…,p), d^T u ≥ 0 }. It has long
been recognized that concave minimization problems with objective functions of
this form can be efficiently solved only by taking advantage of the presence of the
linear part in the objective function (Rosen 1983; Tuy 1984).
Example 9.2 f(x) = −Σ_{i=1}^p ⟨cⁱ, x⟩² + ⟨c^{p+1}, x⟩.
Here F(y) = −Σ_{i=1}^p y_i² + y_{p+1}, Qx = (⟨c¹, x⟩, …, ⟨c^{p+1}, x⟩)^T. Monotonicity
holds with respect to K = { u | ⟨cⁱ, u⟩ = 0 (i = 1,…,p), ⟨c^{p+1}, u⟩ ≥ 0 }. For
p = 1 such a function f(x) cannot in general be written as a product of two
affine functions, and finding its minimum over a polytope is a problem known to
be NP-hard (Pardalos and Vavasis 1991); it is, however, quite practically solvable by a
parametric algorithm (see Sect. 9.5).
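The K-monotonicity claimed in Example 9.2 can be probed numerically. The sketch below (with made-up vectors c¹, c² in R³ and p = 1, as reconstructed here) samples directions u ∈ K = {u : ⟨c¹,u⟩ = 0, ⟨c²,u⟩ ≥ 0} and checks f(x + tu) ≥ f(x) for f(x) = −⟨c¹,x⟩² + ⟨c²,x⟩:

```python
import random

c1 = [1.0, -2.0, 0.5]          # made-up data
c2 = [0.5, 1.0, 3.0]
dot = lambda a, b: sum(x * y for x, y in zip(a, b))

def f(x):                      # Example 9.2 with p = 1
    return -dot(c1, x) ** 2 + dot(c2, x)

random.seed(1)
for _ in range(500):
    x = [random.uniform(-5, 5) for _ in range(3)]
    v = [random.uniform(-1, 1) for _ in range(3)]
    # project v onto the orthogonal complement of c1, so that <c1, u> = 0
    coef = dot(c1, v) / dot(c1, c1)
    u = [vi - coef * ci for vi, ci in zip(v, c1)]
    if dot(c2, u) < 0:         # flip sign to enforce <c2, u> >= 0
        u = [-ui for ui in u]
    t = random.uniform(0, 3)
    xu = [xi + t * ui for xi, ui in zip(x, u)]
    # f(x + tu) - f(x) = t <c2, u> >= 0 because <c1, u> = 0
    assert f(xu) >= f(x) - 1e-9
print("K-monotonicity verified on random samples")
```

The assertion reflects the identity f(x + tu) − f(x) = t⟨c², u⟩ along directions with ⟨c¹, u⟩ = 0, which is exactly why K-monotonicity holds.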
Example 9.3 $f(x) = \prod_{i=1}^{r} (\langle c^i,x\rangle + d_i)^{\alpha_i}$ with $\alpha_i > 0$, $i = 1,\dots,r$.
This class includes functions which are products of affine functions. Since $\log f(x) = \sum_{i=1}^{r} \alpha_i \log(\langle c^i,x\rangle + d_i)$ for every $x$ such that $\langle c^i,x\rangle + d_i > 0$, it is easily seen that $f(x)$ is quasiconcave on the set $X = \{x \mid \langle c^i,x\rangle + d_i \ge 0,\ i = 1,\dots,r\}$. Furthermore, $f(x)$ is monotonic on $X$ with respect to the cone $K = \{u \mid \langle c^i,u\rangle \ge 0,\ i = 1,\dots,r\}$. Here $f(x)$ has the form (9.3) with $F(y) = \prod_{i=1}^{r}(y_i + d_i)^{\alpha_i}$. Problems of minimizing functions $f(x)$ of this form (with $\alpha_i = 1$, $i = 1,\dots,r$) under linear constraints are termed linear multiplicative programs and will be discussed later in this chapter (Sect. 9.5).
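The quasiconcavity argument above is easy to check numerically. The sketch below (my own illustration, with made-up data, not code from the text) evaluates $f(x) = \prod_i(\langle c^i,x\rangle + d_i)^{\alpha_i}$ and verifies on a sample segment that $\log f$ behaves concavely, which is what makes $f$ quasiconcave:

```python
import math

def f(x, cs, ds, alphas):
    """f(x) = prod_i (<c^i, x> + d_i) ** alpha_i on the set where every factor is positive."""
    val = 1.0
    for c, d, a in zip(cs, ds, alphas):
        t = sum(ci * xi for ci, xi in zip(c, x)) + d
        if t <= 0:
            raise ValueError("x outside the domain X")
        val *= t ** a
    return val

def log_f(x, cs, ds, alphas):
    """log f: a nonnegative combination of logs of affine functions, hence concave on X."""
    return sum(a * math.log(sum(ci * xi for ci, xi in zip(c, x)) + d)
               for c, d, a in zip(cs, ds, alphas))
```

Since $\log f$ is concave, $f$ attains on any segment a value at the midpoint no smaller than the smaller endpoint value, which is exactly quasiconcavity.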
Example 9.4 $f(x) = -\sum_{i=1}^{r} \alpha_i e^{\langle c^i,x\rangle}$ with $\alpha_i > 0$, $i = 1,\dots,r$.
Here $F(y) = -\sum_{i=1}^{r} \alpha_i e^{y_i}$ is concave in $y$, so $f(x)$ is concave in $x$. Monotonicity holds with respect to the cone $K = \{x \mid \langle c^i,x\rangle \le 0,\ i = 1,\dots,r\}$. Functions of this type appear when dealing with geometric programs with negative coefficients (signomial programs, cf. Example 5.6).
286 9 Parametric Decomposition
\[ \nabla f(x) = \sum_{i=1}^{r} p_i(x)\,c^i, \qquad c^i \in \mathbb{R}^n,\ i = 1,\dots,r. \]
Here $y = Q(x+u) \le Qx$; hence $f(x+u) \ge f(x)$, and so $f(x)$ is monotonic with respect to the cone $K = \{u \mid Qu \le 0\}$.
Example 9.7 $f(x) = \sup\{d^Ty \mid Ax + By \le q\}$, where $A \in \mathbb{R}^{p\times n}$ with $\operatorname{rank}A = r$, $B \in \mathbb{R}^{p\times m}$, $y \in \mathbb{R}^m$, and $q \in \mathbb{R}^p$.
This function appears in bilevel linear programming, for instance in the max-min problem (Falk 1973b):
In this and the next two sections, we shall discuss decomposition methods for solving this problem, under the additional assumption that D is a polytope. The idea of decomposition is to transform the original problem into a sequence of subproblems, each involving a reduced number of nonconvex variables. This can be done either by projection, or by dualization (polyhedral annexation), or by parametrization (for $r \le 3$). We first present decomposition methods by projection.
To solve (9.10) by outer approximation, the key point is: given a point $y \in Q(\mathbb{R}^n) \simeq \mathbb{R}^r$, determine whether $y \in G = Q(D)$, and if $y \notin G$ construct a linear inequality (cut) $l(y) \le 0$ to exclude $y$ without excluding any point of $G$.
Observe that $y \notin Q(D)$ if and only if the linear system $Qx = y$ has no solution in $D$, so the existence of $l(y)$ is ensured by the separation theorem or any of its equivalent forms (such as the Farkas-Minkowski Lemma). However, since we are not so much interested in the existence as in the effective construction of $l(y)$, the best way is to use the duality theorem of linear programming (which is another form of the separation theorem). Specifically, assuming $D = \{x \mid Ax \le b,\ x \ge 0\}$ with $A \in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$, consider the dual pair of linear programs
where $h$ is any vector chosen so that (9.14) is feasible (for example, $h = 0$) and can be solved with least effort.
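For intuition only, here is a toy sketch of the membership test and the resulting cut in the simplest case $r = 1$, where $D$ is given by its vertex list so that $Q(D)$ is just an interval (this construction is mine; the text's general method uses the dual LP above):

```python
def membership_and_cut(vertices, q, y):
    """Decide whether y lies in Q(D) for the 1-dimensional projection Qx = <q, x>,
    with the polytope D given by its finite vertex list.
    Returns (True, None) if y is in Q(D); otherwise (False, cut) where the affine
    function cut satisfies cut(z) <= 0 for all z in Q(D) while cut(y) > 0."""
    projs = [sum(qi * vi for qi, vi in zip(q, v)) for v in vertices]
    lo, hi = min(projs), max(projs)          # Q(D) = [lo, hi] by convexity
    if lo <= y <= hi:
        return True, None
    if y > hi:
        return False, (lambda z, b=hi: z - b)   # positive at y, nonpositive on Q(D)
    return False, (lambda z, b=lo: b - z)
```

In the general case $r > 1$ the interval test is replaced by solving the dual pair (9.13)/(9.14); the cut coefficients come from the dual solution, as Proposition 9.3 states.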
Proposition 9.3 Solving (9.14) either yields a finite optimal solution or an extreme direction $(v, w)$ of the cone $\{(v,w) \mid Q^Tv - A^Tw \le 0,\ w \ge 0\}$ such that $\langle y, v\rangle - \langle b, w\rangle > 0$. In the former case, $y \in G = Q(D)$; in the latter case, the affine function
satisfies
where
Remark 9.1 A typical case of interest is the concave program with few nonconvex variables (Example 7.1):
\[ \min\{\varphi(y) + t \mid dz \le t,\ Ay + Bz \le b,\ y \ge 0,\ z \ge 0\} \]
we see that for checking the feasibility of a vector $(y, t) \in \mathbb{R}^p_+ \times \mathbb{R}$ the best choice of $h$ in (9.13) is $h = d$. This leads to the pair of dual linear programs
According to condition (6.4) in Sect. 6.2 this can serve as a valid lower bound, provided, in addition, that $\beta(M) < +\infty$ whenever $Q(D) \cap M \ne \emptyset$, which is the case when $M$ is bounded as, e.g., when using simplicial or rectangular partitioning.
Simplicial Algorithm
If $M = [u^1,\dots,u^{r+1}]$ is an $r$-simplex in $Q(\mathbb{R}^n)$, then $l_M(\cdot)$ can be taken to be the affine function that agrees with $F(y)$ at $u^1,\dots,u^{r+1}$. Since $l_M(y) = \sum_{i=1}^{r+1} t_iF(u^i)$ for $y = \sum_{i=1}^{r+1} t_iu^i$ with $t = (t_1,\dots,t_{r+1}) \ge 0$, $\sum_{i=1}^{r+1} t_i = 1$, the above linear program can also be written as
\[ \min\Big\{\sum_{i=1}^{r+1} t_iF(u^i) \;\Big|\; x \in D,\ Qx = \sum_{i=1}^{r+1} t_iu^i,\ \sum_{i=1}^{r+1} t_i = 1,\ t \ge 0\Big\}. \]
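To illustrate the lower bounding rule behind this linear program, the following sketch (illustrative only, not from the text) computes $l_M(y)$ from the barycentric coordinates of $y$ in the simplex and checks the underestimation property $l_M(y) \le F(y)$ for a concave $F$:

```python
def solve(rows, b):
    """Tiny Gauss-Jordan elimination with partial pivoting for small square systems."""
    n = len(rows)
    M = [list(row) + [b[i]] for i, row in enumerate(rows)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                fac = M[r][col] / M[col][col]
                M[r] = [a - fac * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def barycentric(simplex, y):
    """Coordinates t with t >= 0, sum t = 1 and y = sum_i t_i u^i
    (simplex = list of r+1 affinely independent points in R^r)."""
    r = len(y)
    rows = [[u[i] for u in simplex] for i in range(r)] + [[1.0] * len(simplex)]
    return solve(rows, list(y) + [1.0])

def l_M(simplex, F, y):
    """Affine function agreeing with F at the vertices; a minorant of F on M when F is concave."""
    return sum(ti * F(u) for ti, u in zip(barycentric(simplex, y), simplex))
```

For a concave $F$, Jensen's inequality applied to the barycentric weights gives $l_M(y) \le F(y)$ on all of $M$, which is what makes the LP above a valid lower bound.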
Simplicial subdivision is advisable when $f(x) = f_0(x) + dx$, where $f_0(x)$ is concave and monotonic with respect to the cone $\{x \mid Qx \le 0\}$. Although $f(x)$ is then actually monotonic with respect to the cone $\{x \mid Qx \le 0,\ dx \le 0\}$, by writing (QCM) in the form
one can see that it can be solved by a simplicial algorithm operating in $Q(\mathbb{R}^n)$ rather than $Q(\mathbb{R}^n) \times \mathbb{R}$. As initial simplex $M_1$ one can take the simplex (9.17).
Rectangular Algorithm
When $F(y)$ is concave separable, i.e.,
\[ F(y) = \sum_{i=1}^{r} F_i(y_i), \]
Conical Algorithm
In the general case, since the global minimum of F.y/ is achieved on the boundary
of Q.D/; conical subdivision is most appropriate. Nevertheless, instead of bounding,
we use a selection operation, so that the algorithm is a branch and select rather than
branch and bound procedure.
The key point in this algorithm is to decide, at each iteration, which cones in the current partition are nonpromising and which one is the most promising (selection rule).
Let $\bar y$ be the current best solution (CBS) at a given iteration and $\gamma := F(\bar y)$. Assume that all the cones are vertexed at 0 and $F(0) > \gamma$, so that
\[ 0 \in \operatorname{int} C, \qquad \text{where } C = \{y \mid F(y) \ge \gamma\}. \]
it generates is a global optimal solution. Note that no bound is needed in this branch and select algorithm. Although a bound for $F(y)$ over the feasible points in $M$ can easily be derived from the constructions used in Proposition 9.4, it would require the computation of the values of $F(y)$ at $r$ additional points (the intersections of the edges of $M$ with the hyperplane $l_M(y) = \gamma(M)$), which may entail a nonnegligible extra cost.
(As usual, $\frac{1}{+\infty} = 0$.)
9.3 Decomposition by Polyhedral Annexation

to obtain its optimal value $\mu(t)$ and a basic optimal solution $x(t)$.
Step 2. Let $t^k \in \operatorname{argmax}\{\mu(t) \mid t \in V_k\}$. If $\mu(t^k) \le 1$ then terminate: $\bar x$ is an optimal solution of (QCM).
Step 3. If $\mu(t^k) > 1$ and $x^k = x(t^k) \notin C$, then update the current best feasible solution and the set $C$ by resetting $\bar x = x^k$.
Step 4. Compute
\[ \theta_k = \sup\{\theta \mid f(\theta x^k) \ge f(\bar x)\} \]
and define
\[ \tilde P_{k+1} = \tilde P_k \cap \Big\{ t \;\Big|\; \sum_{i=1}^{r} t_i\langle x^k, c^i\rangle \le \frac{1}{\theta_k} \Big\}. \]
Step 5. From $V_k$ derive the vertex set $V_{k+1}$ of $\tilde P_{k+1}$. Set $k \leftarrow k+1$ and go back to Step 1.
Remark 9.2 Reported computational experiments (Tuy and Tam 1995) seem to indicate that the PA method is quite practical for problems with fairly large $n$ provided $r$ is small (typically $r \le 6$). For $r \le 3$, however, specialized parametric methods can be developed which are more efficient. These will be discussed in the next section.
A case when the initial simplex $P_1$ can be constructed so that its vertex set is readily available is when $K = \{u \mid Qu \le 0\}$, with an $r \times n$ matrix $Q$ of rank $r$. Let $c^1,\dots,c^r$ be the rows of $Q$. Using formula (9.11) one can compute a point $w$ satisfying $Qw = e$, where $e = (1,\dots,1)^T$, and a value $\theta > 0$ such that $\hat w = \theta w \in C$ (for the efficiency of the algorithm, $\theta$ should be taken as large as possible).
Lemma 9.2 The polar $P_1$ of the set $S_1 = \hat w + K$ is an $r$-simplex with vertex set $\{0, c^1/\theta,\dots,c^r/\theta\}$ satisfying condition (ii) in Lemma 9.1.
Proof Since $K \subset \operatorname{rec}C$ and $\hat w \in C$ it follows that $S_1 = \hat w + K \subset C$, and hence $P_1 \supset C^\circ$. Furthermore, clearly $S_1 = \{x \mid \langle c^i, x\rangle \le \theta,\ i = 1,\dots,r\}$, because $x \in S_1$ if and only if $x - \hat w \in K$, i.e., $\langle c^i, x - \hat w\rangle \le 0$, $i = 1,\dots,r$, i.e., $\langle c^i, x\rangle \le \langle c^i, \hat w\rangle = \theta$, $i = 1,\dots,r$. From Proposition 1.28 we then deduce that $S_1^\circ = \operatorname{conv}\{0, c^1/\theta,\dots,c^r/\theta\}$.
In the previous sections, it was shown how a quasiconcave monotonic problem can
be transformed into an equivalent quasiconcave problem of reduced dimension.
In this section an alternative approach will be presented in which a quasiconcave
monotonic problem is solved through the analysis of an associated linear program
depending on a parameter (of the same dimension as the reduced problem in the
former approach).
Consider the general problem formulated in Sect. 8.2:
Corollary 9.1 For each $(\lambda, \alpha) \in \Lambda \times W$ let $x_{\lambda,\alpha}$ be an arbitrary basic optimal solution of the linear program
\[ \mathrm{LP}(\lambda,\alpha) \qquad \begin{array}{ll} \min & \sum_{i=1}^{p-1} \lambda_i \langle c^i, x\rangle + \langle c^p, x\rangle \\[2pt] \text{s.t.} & x \in D,\quad \langle c^i, x\rangle = \alpha_i,\ i = p+1,\dots,r. \end{array} \]
Then
Proof Clearly
Theorem 9.2 If, in addition to the already stated assumptions, $D \subset \operatorname{int}X$, while the function $f(x)$ is continuous and strictly quasiconcave on $X$, then for any optimal solution $\bar x$ of (QCM) there exists a $v \in K^* \setminus \{0\}$ such that $\bar x$ is an optimal solution of LP($v$), and any optimal solution to LP($v$) will solve (QCM).
(A function $f(x)$ is said to be strictly quasiconcave on $X$ if for any two $x, x' \in X$, $f(x') < f(x)$ always implies $f(x') < f(\lambda x + (1-\lambda)x')$ for all $\lambda \in (0,1)$.)
Proof It suffices to consider the case when $f(x)$ is not constant on $D$. Let $\bar x$ be an optimal solution of (QCM) and $X(\bar x) = \{x \in X \mid f(x) \ge f(\bar x)\}$. By quasiconcavity and continuity of $f(x)$, this set $X(\bar x)$ is closed and convex. By optimality of $\bar x$, $D \subset X(\bar x)$, and again by continuity of $f(x)$, $\{x \in X \mid f(x) > f(\bar x)\} \subset \operatorname{int}X(\bar x)$. Furthermore, since $f(x)$ is not constant on $D$ one can find $x^0 \in D$ such that $f(x^0) > f(\bar x)$, so if $\bar x \in \operatorname{int}X(\bar x)$ then for $\varepsilon > 0$ small enough, $x^\varepsilon = \bar x - \varepsilon(x^0 - \bar x) \in X(\bar x)$ and $f(x^\varepsilon) < f(x^0)$; hence, by strict quasiconcavity of $f(x)$, $f(x^\varepsilon) < f(\bar x)$, conflicting with the definition of $X(\bar x)$. Therefore, $\bar x$ lies on the boundary of $X(\bar x)$. By Theorem 1.5 on supporting hyperplanes there exists then a vector $v \ne 0$ such that
will solve (9.37). For fixed $t > 0$ the problem (9.38) splits into $n$ one-dimensional subproblems.
In the particular case of the problem of optimal ordering policy for jointly replenished products treated in the mentioned paper of Tanaka et al., $f_{1,i}(x_i) = a/n + a_i/x_i$, $f_{2,i}(x_i) = b_ix_i/2$ ($a, a_i, b_i > 0$) and $X_i$ is the set of positive integers, so each of the above subproblems can easily be solved.
In this section we discuss parametric methods for solving problem (QCM) under assumptions (9.30). Let $Q$ be the matrix with rows $c^1,\dots,c^r$. In view of Proposition 9.1, the problem can be reformulated as
\[ \left.\begin{array}{l} y_i \ge y_i'\ (i = 1,\dots,p) \\ y_i = y_i'\ (i = p+1,\dots,r) \end{array}\right\} \;\Rightarrow\; F(y) \ge F(y'). \tag{9.40} \]
which corresponds to the case $F(y) = \prod_{i=1}^{r} y_i$. Aside from (LMP), a wide variety of other problems can be cast in the form (9.39), as shown by the many examples in Sect. 7.1. In spite of that, it turns out that solving (9.39) with different functions $F(y)$ reduces to comparing the objective function values on a finite set of extreme points of $D$ that depends upon the vectors $c^1,\dots,c^r$ but not on the specific form of $F(y)$.
Due to their relevance to applications in various fields (multiobjective programming, bond portfolio optimization, VLSI design, etc.), the above class of problems, including (LMP) and its extensions, has been the subject of a good deal of research. Parametric methods for solving (LMP) date back to Swarup (1966a), Swarup (1966b), Swarup (1966c), Forgo (1975), and Gabasov and Kirillova (1980).
However, intensive development in the framework of global optimization began only with the works of Konno and Kuno (1990, 1992). The basic idea of the parametric method can be described as follows. By Corollary 9.1 one can solve (LMP) by solving the problem
where, in the notation of Corollary 9.1, $x_{\lambda,\alpha}$ is an arbitrary optimal solution of the linear program
\[ \mathrm{LP}(\lambda,\alpha) \qquad \begin{array}{ll} \min & \sum_{i=1}^{p} \lambda_i\langle c^i, x\rangle \\[2pt] \text{s.t.} & x \in D,\quad \langle c^j, x\rangle = \alpha_j,\ j = p+1,\dots,r. \end{array} \]
For solving (9.42) the parametric method exploits the property of LP($\lambda,\alpha$) that the parameter space is partitioned into finitely many polyhedrons, over each of which the minimum of the function $\varphi(\lambda,\alpha) = f(x_{\lambda,\alpha})$ can be computed relatively easily. Specifically, writing the constraints of this linear program in the form
\[ Ax = b + S\alpha, \qquad x \ge 0, \tag{9.43} \]
where $S$ is a suitable diagonal matrix, denote the collection of all basis matrices of this system by $\mathcal{B}$.
Lemma 9.3 Each basis $B$ of the system (9.43) determines a set $\Lambda_B \times W_B \subset \Lambda \times W$, which is a polyhedron whenever nonempty, and an affine map $x_B : W_B \to \mathbb{R}^n$, such that $x_B(\alpha)$ is a basic optimal solution of LP($\lambda,\alpha$) for all $(\lambda,\alpha) \in \Lambda_B \times W_B$. The collection of polyhedrons $\Lambda_B \times W_B$ corresponding to all bases $B \in \mathcal{B}$ covers all $(\lambda,\alpha)$ such that LP($\lambda,\alpha$) has an optimal solution.
Proof Let $B$ be any basis of this system. If $A = (B, N)$, $x = \binom{x_B}{x_N}$, so that $Bx_B + Nx_N = b + S\alpha$, then
This basic solution is feasible for all $\alpha \in W$ such that $B^{-1}(b + S\alpha) \ge 0$ (nonnegativity condition), and is dual feasible for all $\lambda \in \Lambda$ such that $\sum_{i=1}^{p} \lambda_i\big[(c^i_N)^T - (c^i_B)^TB^{-1}N\big] \ge 0$ (optimality criterion), where $c^i = \binom{c^i_B}{c^i_N}$. If we denote the vector (9.45) by $x_B(\alpha)$ and define the polyhedrons
\[ \Lambda_B = \Big\{\lambda \in \Lambda \;\Big|\; \sum_{i=1}^{p} \lambda_i\big[(c^i_N)^T - (c^i_B)^TB^{-1}N\big] \ge 0\Big\}, \qquad W_B = \{\alpha \in W \mid B^{-1}(b + S\alpha) \ge 0\} \]
where $D$ is a polyhedron contained in the set $\{x \mid \langle c^1,x\rangle \ge 0,\ \langle c^2,x\rangle \ge 0\}$, and $F(y)$ is a quasiconcave function on the orthant $\mathbb{R}^2_+$ such that
As seen earlier, aside from the special case $F(y) = y_1y_2$ (linear multiplicative program), the class (9.46) includes problems with quite different objective functions, such as
\[ (\langle c^1,x\rangle)^{q_1}(\langle c^2,x\rangle)^{q_2}; \qquad -q_1e^{\langle c^1,x\rangle} - q_2e^{\langle c^2,x\rangle} \quad (q_1 > 0,\ q_2 > 0). \]
Since the parameter domain is the nonnegative real line, the cells of this parametric program are intervals and can easily be determined by the standard parametric objective simplex method (see, e.g., Murty 1983), which consists in the following.
Suppose a basis $B_k$ is already available such that the corresponding basic solution $x^k$ is optimal to LP($\lambda$) for all $\lambda \in \Delta_k := [\lambda_{k-1}, \lambda_k]$ but not for $\lambda$ outside this interval. The latter means that if $\lambda$ is slightly greater than $\lambda_k$ then at least one of the inequalities in the optimality criterion
$y = y'$. Therefore, without assuming (9.47), Corollary 9.1 and hence the above method still apply. However, in this case $\Lambda = \mathbb{R}$, i.e., the linear program LP($\lambda$) should be considered for all $\lambda \in (-\infty, +\infty)$.
Consider now problem (9.39) under somewhat weaker assumptions than previously, namely: $F(y)$ is not required to be quasiconcave, while condition (9.47) is replaced by the following weaker one:
The latter condition implies that for fixed $y_2$ the univariate function $F(\cdot, y_2)$ is monotonic on the real line. It is then not hard to verify that Corollary 9.1 still holds in this case; in fact, this is also easy to prove directly.
Lemma 9.4 Let $\alpha = \min\{\langle c^2, x\rangle \mid x \in D\}$, $\beta = \max\{\langle c^2, x\rangle \mid x \in D\}$, and for every $\xi \in [\alpha, \beta]$ let $x^\xi$ be a basic optimal solution of the linear program
Then
\[ Ax = b + \xi b^0, \qquad x \ge 0, \]
Then the basic solution defined by this basis, i.e., the vector $x^k = u^k + \xi v^k$ such that $x^k_{B_k} = B_k^{-1}(b + \xi b^0)$, $x^k_{N_k} = 0$ [see (9.45)], is dual feasible. This basic solution $x^k$ is optimal to LP($\xi$) if and only if it is feasible, i.e., if and only if it satisfies the nonnegativity condition
\[ B_k^{-1}(b + \xi b^0) \ge 0. \]
Let $\Delta_k := [\xi_{k-1}, \xi_k]$ be the interval formed by all values $\xi$ satisfying the above inequalities. Then $x^k$ is optimal to LP($\xi$) for all $\xi$ in this interval, but for any value $\xi$ slightly greater than $\xi_k$ at least one of the above inequalities becomes violated, i.e., at least one of the components of $x^k$ becomes negative. By then performing a number of dual simplex pivots, we can pass to a new basis $B_{k+1}$ with a basic solution $x^{k+1}$ which will be optimal for all $\xi$ in some interval $\Delta_{k+1} = [\xi_k, \xi_{k+1}]$. Thus, starting from the value $\xi_0 = \alpha$, one can determine an initial interval $\Delta_1 = [\alpha, \xi_1]$, then pass to the next interval $\Delta_2 = [\xi_1, \xi_2]$, and so on, until the whole interval $[\alpha, \beta]$ is covered. The generated points $\xi_0 = \alpha, \xi_1, \dots, \xi_h = \beta$ are the breakpoints of the optimal objective value function $\psi(\xi)$ of LP($\xi$), which is a convex piecewise affine function. By Lemma 9.4 an optimal solution to (9.39) is then $x^{k^*}$, where
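The breakpoint structure can be seen on a tiny instance of my own construction (not from the text): take $D = [0,1]^2$ and minimize $x_1 - x_2$ subject to $x_1 + x_2 = \xi$. The optimal value $\psi(\xi)$ is convex piecewise affine with a single breakpoint at $\xi = 1$, which a grid scan of second differences detects:

```python
def psi(xi):
    """Optimal value of min x1 - x2 s.t. x in [0,1]^2, x1 + x2 = xi (0 <= xi <= 2).
    The objective equals 2*x1 - xi, increasing in x1, so the minimum is attained
    at the smallest feasible x1, namely max(0, xi - 1)."""
    return 2.0 * max(0.0, xi - 1.0) - xi

def breakpoints(f, lo, hi, n=2000, tol=1e-9):
    """Grid points where the discrete second difference of a piecewise affine
    convex f jumps, i.e., where its slope increases."""
    h = (hi - lo) / n
    return [lo + k * h for k in range(1, n)
            if f(lo + (k - 1) * h) - 2.0 * f(lo + k * h) + f(lo + (k + 1) * h) > tol]
```

In the actual method the breakpoints are of course obtained exactly by the pivoting scheme just described, not by scanning; the toy function only visualizes the convex piecewise affine shape of $\psi$.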
Remark 9.5 In both the objective and the right-hand-side methods the sequence of points $x^k$ at which the objective function values have to be compared depends upon the vectors $c^1, c^2$ but not on the specific function $F(y)$. Computational experiments (Konno and Kuno 1992; also Tuy and Tam 1992) indicate that solving a problem (9.39) requires no more effort than solving just a few linear programs of the same size. Two particular problems of this class are worth mentioning:
which may not belong to the class (QCM) because we cannot prove the quasiconcavity of the function $F(y) = y_1 + y_2y_3$. It is easily seen, however, that this problem is equivalent to
where $\alpha = \min_{x\in D}\langle c^2, x\rangle$ and $\beta = \max_{x\in D}\langle c^2, x\rangle$. By writing the constraints $x \in D$, $\langle c^2, x\rangle = \xi$ in the form
\[ Ax = b + \xi b^0. \tag{9.56} \]
for all $\xi \in \Delta_B$. The collection of all intervals $\Delta_B$ corresponding to all bases $B$ such that $\Delta_B \ne \emptyset$ covers all $\xi$ for which P($\xi$) has an optimal solution.
Proof Let $B$ be a basis, $A = (B, N)$, $x = \binom{x_B}{x_N}$, so that the associated basic solution is
\[ x_B = B^{-1}(b + \xi b^0), \qquad x_N = 0. \tag{9.57} \]
This solution is feasible for all $\xi$ satisfying the nonnegativity condition
\[ B^{-1}(b + \xi b^0) \ge 0, \tag{9.58} \]
and is dual feasible for all $\xi$ satisfying the dual feasibility condition (optimality criterion)
Conditions (9.58) and (9.59) each determine an interval. If the intersection $\Delta_B$ of the two intervals is nonempty, then the basic solution (9.57) is optimal for all $\xi \in \Delta_B$. The rest is obvious. $\square$
Based on this lemma, one can generate all the intervals $\Delta_B$ corresponding to different bases $B$ by a procedure combining primal simplex pivots with dual simplex pivots (see, e.g., Murty 1983). Specifically, for any given basis $B_k$, denote by $u^k + \xi v^k$ the vector (9.57), which is the basic solution of P($\xi$) corresponding to this basis, and by $\Delta_k = [\xi_{k-1}, \xi_k]$ the interval of optimality of this basic solution. When $\xi$ is slightly greater than $\xi_k$, either the nonnegativity condition [see (9.58)] or the dual feasibility condition [see (9.59)] becomes violated. In the latter case we perform a primal simplex pivot step, in the former case a dual simplex pivot step, and repeat as long as necessary, until a new basis $B_{k+1}$ with a new interval $\Delta_{k+1}$ is obtained. Thus, starting from an initial interval $\Delta_1 = [\alpha, \xi_1]$, we can pass from one interval to the next, until we reach $\beta$. Let then $\Delta_k$, $k = 1,\dots,l$, be the collection of all intervals. For each interval $\Delta_k$ let $u^k + \xi v^k$ be the associated basic solution which is optimal to P($\xi$) for all $\xi \in \Delta_k$, and let $x^k = u^k + \xi_k^*v^k$, where $\xi_k^*$ is a minimizer of the quadratic function $\varphi(\xi) := \langle c^0 + \xi c^1, u^k + \xi v^k\rangle$ over $\Delta_k$. Then from (9.55) and by Lemma 9.5, an optimal solution of (9.54) is $x^{k^*}$ where
Note that problem (9.54) is also NP-hard (Pardalos and Vavasis 1991).

9.6 Convex Constraints

Consider again problem (QCM) formulated in Sect. 9.2 [see (9.8)], where the constraint set $D$ is compact convex but not necessarily polyhedral. Recall that by Proposition 9.1, where $Qx = (c^1x,\dots,c^rx)^T$, there exists a continuous quasiconcave function $F : \mathbb{R}^r \to \mathbb{R}$ on a closed convex set $Y \supset Q(D)$ satisfying condition (9.40) and such that
It is easy to extend the branch and bound algorithms in Sect. 8.2 and the polyhedral annexation algorithm in Sect. 8.3 to (QCM) with a compact convex constraint set $D$, and in fact to the following more general problem:
where $D$ and $F(y)$ are as previously (with $g(D)$ replacing $Q(D)$), while $g = (g_1,\dots,g_r)^T : \mathbb{R}^n \to \mathbb{R}^r$ is a map such that $g_i(x)$, $i = 1,\dots,p$, are convex on a closed convex set $X \supset D$ and $g_i(x) = \langle c^i,x\rangle + d_i$, $i = p+1,\dots,r$.
For the sake of simplicity we shall focus on the case $p = r$, i.e., we shall assume that $F(y) \ge F(y')$ whenever $y, y' \in Y$, $y \ge y'$. This occurs, for instance, for the convex multiplicative programming problem
\[ \min\Big\{ \prod_{i=1}^{r} g_i(x) \;\Big|\; x \in D \Big\} \tag{9.63} \]
where $F(y) = \prod_{i=1}^{r} y_i$ and $D \subset \{x \mid g_i(x) \ge 0,\ i = 1,\dots,r\}$. Clearly problem (GQCM) can be rewritten as
which is to minimize the quasiconcave $\mathbb{R}^r_+$-monotonic function $F(y)$ on the closed convex set
If the function $F(y)$ is concave (and not just quasiconcave) then simplicial subdivision can be used. In this case, a lower bound for $F(y)$ over the feasible points $y$ in a simplex $M = [u^1,\dots,u^{r+1}]$ can be obtained by solving the convex program
where $l_M(y)$ is the affine function that agrees with $F(y)$ at the vertices of $M$ (this function satisfies $l_M(y) \le F(y)$ for all $y \in M$ in view of the concavity of $F(y)$). Using this lower bounding rule, a simplicial algorithm for (GQCM) can be formulated following the standard branch and bound scheme. It is also possible to combine branch and bound with outer approximation of the convex set $E$ by a sequence of nested polyhedrons $P_k \supset E$.
If the function $F(y)$ is concave separable, i.e., $F(y) = \sum_{j=1}^{r} F_j(y_j)$, where each $F_j(\cdot)$ is a concave function of one variable, then rectangular subdivision can be more convenient. However, in the general case, when $F(y)$ is quasiconcave but not concave, an affine minorant of a quasiconcave function over a simplex (or a rectangle) is in general not easy to compute. In that case, one should use conical rather than simplicial or rectangular subdivision.
Since $F(y)$ is monotonic with respect to the orthant $\mathbb{R}^r_+$, we can apply the PA Algorithm, using Lemma 9.2 for constructing the initial polytope $P_1$.
Let $\bar y \in E$ be an initial feasible point of (9.64) and $C = \{y \mid F(y) \ge F(\bar y)\}$. By translating, we can assume that $0 \in E$ and $0 \in \operatorname{int}C$, i.e., $F(0) > F(\bar y)$. Note that instead of a linear program like (9.25) we now have to solve a convex program of the form
Step 1. For every new $t = (t_1,\dots,t_r)^T \in V_k \setminus \{0\}$ solve the convex program (9.66), obtaining the optimal value $\mu(t)$ and an optimal solution $y(t)$.
Step 2. Let $t^k \in \operatorname{argmax}\{\mu(t) \mid t \in V_k\}$. If $\mu(t^k) \le 1$, then terminate: $\bar y$ is an optimal solution of (GQCM).
Step 3. If $\mu(t^k) > 1$ and $y^k := y(t^k) \notin C$ (i.e., $F(y^k) < F(\bar y)$), then update the current best feasible solution and the set $C$ by resetting $\bar y = y^k$.
Step 4. Compute $\theta_k = \sup\{\theta \mid \theta y^k \in C\}$ and define
\[ P_{k+1} = P_k \cap \Big\{ t \;\Big|\; \langle t, y^k\rangle \le \frac{1}{\theta_k} \Big\}. \]
Step 5. From $V_k$ derive the vertex set $V_{k+1}$ of $P_{k+1}$. Set $k \leftarrow k+1$ and go back to Step 1.
To establish the convergence of this algorithm we need two lemmas.
Lemma 9.6 $\mu(t^k) \searrow 1$ as $k \to +\infty$.
Proof Observe that $\mu(t) = \max\{\langle t, y\rangle \mid y \in g(D)\}$ is a convex function and that $y^k \in \partial\mu(t^k)$, because $\mu(t) - \mu(t^k) \ge \langle t, y^k\rangle - \langle t^k, y^k\rangle = \langle t - t^k, y^k\rangle$ for all $t$. Denote $l_k(t) = \langle t, y^k\rangle - 1$. We have $l_k(t^k) = \langle t^k, y^k\rangle - 1 = \mu(t^k) - 1 > 0$, and for $t \in g(D)^\circ$: $l_k(t) = \langle t, y^k\rangle - 1 \le \mu(t) - 1 \le 0$. Since $g(D)^\circ$ is compact, there exists $t^0 \in \operatorname{int}g(D)^\circ$ (Proposition 1.21), i.e., $t^0$ such that $l_k(t^0) \le \mu(t^0) - 1 < 0$, and $s^k \in [t^0, t^k) \setminus \operatorname{int}g(D)^\circ$ such that $l_k(s^k) = 0$. By Theorem 6.1 applied to the set $g(D)^\circ$, the sequence $\{t^k\}$, and the cuts $l_k(t)$, we conclude that $t^k - s^k \to 0$. Hence every accumulation point $t^*$ of $\{t^k\}$ satisfies $\mu(t^*) = 1$, i.e., $\mu(t^k) \searrow \mu(t^*) = 1$. $\square$
Denote by $C_k$ the set $C$ at iteration $k$.
Lemma 9.7 $C_k^\circ \subset P_k$ for every $k$.
where x.t/ is an arbitrary optimal solution of (9.66), i.e., of the convex program
\[ \text{minimize } \sum_{i=1}^{p} \prod_{j=1}^{q_i} g_{ij}(x) \quad \text{s.t. } x \in D, \tag{9.69} \]
\[ \min\Big\{ g(x) + \prod_{j=1}^{q} g_j(x) \;\Big|\; x \in D \Big\}, \tag{9.70} \]
or, alternatively, as
\[ \min\Big\{ g(x) + \sum_{i=1}^{p} g_i(x)h_i(x) \;\Big|\; x \in D \Big\}, \tag{9.71} \]
where all the functions $g(x), g_i(x), h_i(x)$ are convex positive-valued on $D$. Another special case worth mentioning is the problem of minimizing the scalar product of two vectors:
\[ \min\Big\{ \sum_{i=1}^{n} x_iy_i \;\Big|\; (x, y) \in D \subset \mathbb{R}^{2n}_{++} \Big\}. \tag{9.72} \]
Let us show that any problem (9.69) can be reduced to concave minimization under convex constraints. For this we rewrite (9.69) as
\[ \begin{array}{ll} \text{minimize} & \sum_{i=1}^{p} \big(\prod_{j=1}^{q_i} y_{ij}\big)^{1/q_i} \\[3pt] \text{s.t.} & g_{ij}(x)^{q_i} \le y_{ij}, \quad i = 1,\dots,p;\ j = 1,\dots,q_i, \\ & y_{ij} \ge 0, \quad x \in D. \end{array} \tag{9.73} \]
Since $\tilde\varphi_j(y) = y_{ij}$, $j = 1,\dots,q_i$, are linear, it follows from Proposition 2.7 that each term $\big(\prod_{j=1}^{q_i} y_{ij}\big)^{1/q_i}$, $i = 1,\dots,p$, is a concave function of $y = (y_{ij}) \in \mathbb{R}^{q_1+\dots+q_p}$; hence their sum, i.e., the objective function $F(y)$ in (9.73), is also a concave function of $y$. Furthermore, $y \ge y'$ obviously implies $F(y) \ge F(y')$, so $F(y)$ is monotonic with respect to the cone $\{y \mid y \ge 0\}$. Finally, since every function $g_{ij}(x)$ is convex, so is $g_{ij}(x)^{q_i}$. Consequently, (9.73) is a concave programming problem. If $q = q_1 + \dots + q_p$ is relatively small this problem can be solved by methods discussed in Chaps. 5 and 6.
An alternative way to convert (9.69) into a concave program is to use the following relation, already established in the proof of Proposition 2.7 [see (2.6)]:
\[ \prod_{j=1}^{q_i} g_{ij}(x) = \min_{\gamma_i\in T_i}\ \frac{1}{q_i}\sum_{j=1}^{q_i}\gamma_{ij}\,g_{ij}^{q_i}(x), \]
with $\gamma_i = (\gamma_{i1},\dots,\gamma_{iq_i})$ and $T_i = \{\gamma_i \ge 0 \mid \prod_{j=1}^{q_i}\gamma_{ij} \ge 1\}$. Hence, setting $\varphi_i(\gamma_i, x) = \frac{1}{q_i}\sum_{j=1}^{q_i}\gamma_{ij}\,g_{ij}^{q_i}(x)$, the problem (9.69) is equivalent to
\[ \begin{aligned} \min_{x\in D}\ \sum_{i=1}^{p}\ \min_{\gamma_i\in T_i} \varphi_i(\gamma_i, x) &= \min_{x\in D}\ \min\Big\{\sum_{i=1}^{p}\varphi_i(\gamma_i, x) \;:\; \gamma_i \in T_i,\ i = 1,\dots,p\Big\} \\ &= \min\Big\{\min_{x\in D}\ \sum_{i=1}^{p}\varphi_i(\gamma_i, x) \;:\; \gamma_i \in T_i,\ i = 1,\dots,p\Big\} \\ &= \min\{\varphi(\gamma_1,\dots,\gamma_p) \mid \gamma_i \in T_i,\ i = 1,\dots,p\} \end{aligned} \tag{9.74} \]
where
\[ \varphi(\gamma_1,\dots,\gamma_p) = \min_{x\in D}\ \sum_{i=1}^{p}\varphi_i(\gamma_i, x) = \min\Big\{\sum_{i=1}^{p}\frac{1}{q_i}\sum_{j=1}^{q_i}\gamma_{ij}\,g_{ij}^{q_i}(x) \;\Big|\; x \in D\Big\}. \]
Since each function $g_{ij}^{q_i}(x)$ is convex, $\varphi(\gamma_1,\dots,\gamma_p)$ is the optimal value of a convex program.
Thus, problem (9.69) is equivalent to (9.74), which seeks to minimize the concave function $\varphi(\gamma_1,\dots,\gamma_p)$ over the convex set $\prod_{i=1}^{p} T_i$.
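For $q_i = 2$ the relation (2.6) underlying this conversion is just a parametrized AM-GM inequality, and can be checked numerically; the sketch below (my own toy check, with sample values) scans the boundary $\gamma = (t, 1/t)$ of $T = \{\gamma \ge 0 \mid \gamma_1\gamma_2 \ge 1\}$:

```python
import math

def min_form(a1, a2, grid=400):
    """Approximate min over gamma = (t, 1/t), t > 0, of
    (1/2) * (gamma_1 * a1**2 + gamma_2 * a2**2).
    By the AM-GM inequality this minimum equals a1 * a2,
    attained at t = a2 / a1 (a1, a2 > 0 assumed)."""
    best = float("inf")
    for k in range(-grid, grid + 1):
        t = math.exp(k / 40.0)          # t ranges over e^{-10} .. e^{10}
        best = min(best, 0.5 * (t * a1 ** 2 + a2 ** 2 / t))
    return best
```

The grid minimum always stays at or slightly above the product $a_1a_2$, never below it, which is exactly the "min over $T_i$" representation of the product used in (9.74).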
Example 9.9 As an illustration, consider the problem of determining a rectangle of minimal area enclosing the projection of a given compact convex set $D \subset \mathbb{R}^n$ $(n > 2)$ onto the plane $\mathbb{R}^2$. This problem arises from applications in packing and optimal layout (Gale 1981; Haims and Freeman 1970; Maling et al. 1982), especially when two-dimensional layouts are restricted by $n - 2$ factors. When the vertices of $D$ are known, there exists an efficient algorithm based on computational geometry (Bentley and Shamos 1978; Graham 1972; Toussaint 1978). In the general case, however, the problem is more complicated.
Assume that $D$ has full dimension in $\mathbb{R}^n = \mathbb{R}^2 \times \mathbb{R}^{n-2}$. The projection of $D$ onto $\mathbb{R}^2$ is the compact convex set $\operatorname{pr}D = \{y \in \mathbb{R}^2 \mid \exists z \in \mathbb{R}^{n-2},\ (y, z) \in D\}$. For any vector $x \in \mathbb{R}^2$ such that $\|x\| = 1$ define
On the other hand, since $Hx = (-x_2, x_1)^T$ is the vector orthogonal to $x$ such that $\|Hx\| = 1$ if $\|x\| = 1$, the width of $\operatorname{pr}D$ in the direction orthogonal to $x$ is $g(Hx) = g(-x_2, x_1)$. Hence the product $g(x)\cdot g(Hx)$ measures the area of the smallest rectangle containing $\operatorname{pr}D$ and having a side parallel to $x$. The problem can then be formulated as
Lemma 9.8 The function $g(x)$ is convex and satisfies $g(\lambda x) = \lambda g(x)$ for any $\lambda \ge 0$.
Proof Clearly $g_1(x)$ is convex as the pointwise maximum of a family of affine functions, while $g_2(x)$ is concave as the pointwise minimum of a family of affine functions. It is also obvious that $g_i(\lambda x) = \lambda g_i(x)$, $i = 1, 2$. Hence the conclusion. $\square$
Since $H : \mathbb{R}^2 \to \mathbb{R}^2$ is a linear map, it follows that $g(Hx)$ is also convex and that $g(H(\lambda x)) = \lambda g(Hx)$ for every $\lambda \ge 0$. Exploiting this property, a successive underestimation method for solving problem (9.75) has been proposed in Kuno (1993). In view of the above established property, problem (9.75) is actually equivalent to the convex multiplicative program
or, alternatively,
where
where $\pi(x) = (\alpha, 1-\alpha)$ is the point at which the halfline from 0 through $x$ intersects the line segment joining $(1, 0)$ and $(0, 1)$. Since obviously $\|\pi(x)\|^2 = \alpha^2 + (1-\alpha)^2$, using Lemma 9.8 problem (9.76) becomes
where
\[ F(\alpha) = \frac{g(\alpha, 1-\alpha)\cdot g(\alpha-1, \alpha)}{\alpha^2 + (1-\alpha)^2}. \]
\[ 0 = \alpha_0 < \alpha_1 < \alpha_2 < \dots < \alpha_{N-1} < \alpha_N = 1 \tag{9.80} \]
be the union of all these four sequences rearranged in increasing order.
Proposition 9.6 A global minimum of the function $F(\alpha)$ over $[0, 1]$ exists among the points $\alpha_s$, $s = 0,\dots,N$.
Proof For any interval $[\alpha_s, \alpha_{s+1}]$ there exist two vectors $u = x^1 - x^2$, $v = x^3 - x^4$, where the $x^i$, $i = 1,\dots,4$, are vertices of $D$, such that for every $x \in \mathbb{R}^2_+$ of length $\|x\| = 1$ with $\pi(x) = (\alpha, 1-\alpha)$, $\alpha \in (\alpha_s, \alpha_{s+1})$, we have $g(x) = \langle x, u\rangle$, $g(Hx) = \langle Hx, v\rangle$. Then
where $\xi = \operatorname{angle}(x, u)$ (the angle between the vectors $x$ and $u$) and $\eta = \operatorname{angle}(Hx, v)$. Noting that $\operatorname{angle}(x, u) + \operatorname{angle}(u, v) + \operatorname{angle}(v, y) = \operatorname{angle}(x, y)$, we deduce
\[ \xi = \eta + \omega - \pi/2, \tag{9.81} \]
and at an interior minimum of $F$ one would get
\[ \xi + \eta = \pi, \]
contradicting the fact that $0 < |\xi| < \pi/2$, $0 < |\eta| < \pi/2$ (because $\langle x, u\rangle > 0$, $\langle Hx, v\rangle > 0$). Therefore, the minimum of $F(\alpha)$ over an interval $[\alpha_s, \alpha_{s+1}]$ cannot be attained at any $\alpha$ such that $\alpha_s < \alpha < \alpha_{s+1}$, but must be attained at either $\alpha_s$ or $\alpha_{s+1}$. Hence the global minimum over the entire segment $[0, 1]$ is attained at one point of the sequence $\alpha_0, \alpha_1,\dots,\alpha_N$. $\square$
Corollary 9.3 An optimal solution of (9.79) is
\[ x^* = \frac{(\alpha^*, 1-\alpha^*)}{\sqrt{(\alpha^*)^2 + (1-\alpha^*)^2}}, \qquad \alpha^* \in \operatorname{argmin}\{F(\alpha_s) \mid s = 0, 1,\dots,N\}. \]
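In the polytope case mentioned at the start of Example 9.9 (vertices known), the fact behind Proposition 9.6 has a well-known computational-geometry counterpart: a minimal-area enclosing rectangle has a side parallel to an edge of the convex hull. A rough sketch (my own, brute-forcing all point pairs instead of true hull edges, which suffices for small examples):

```python
import math

def min_area_rectangle(points):
    """Minimum area of a rectangle enclosing a finite planar point set, found by
    trying, as the rectangle's side direction, the direction of every pair of
    points (a superset of the convex hull edge directions)."""
    best = float("inf")
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[j][0] - points[i][0]
            dy = points[j][1] - points[i][1]
            norm = math.hypot(dx, dy)
            if norm == 0.0:
                continue
            ux, uy = dx / norm, dy / norm    # candidate side direction (unit vector)
            # coordinates of every point along the direction and its normal
            proj = [(p[0] * ux + p[1] * uy, -p[0] * uy + p[1] * ux) for p in points]
            w = max(s for s, _ in proj) - min(s for s, _ in proj)
            h = max(t for _, t in proj) - min(t for _, t in proj)
            best = min(best, w * h)
    return best
```

The width along a direction $x$ and along its rotation $Hx$ are exactly the quantities $g(x)$ and $g(Hx)$ of the text, so each candidate area is $g(x)\cdot g(Hx)$ for one of finitely many directions, mirroring the finite candidate list $\alpha_0,\dots,\alpha_N$.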
9.7 Reverse Convex Constraints

In the previous sections we were concerned with low-rank nonconvex problems having the nonconvex variables in the objective function. We now turn to low-rank nonconvex problems with nonconvex variables in the constraints. Consider the following monotonic reverse convex problem
we define a function $\varphi(y)$ such that $\varphi(y) = h(x)$ for all $x \in X$ satisfying $y = Qx$. The latter also implies that $\varphi(y)$ is quasiconcave on $Q(X)$.
Denote by $\psi(y)$ the optimal value of the linear program
Since it is well known that $\psi(y)$ is a convex piecewise affine function, (9.83) is a reverse convex program in the $r$-dimensional space $Q(\mathbb{R}^n)$. If $r$ is not too large this program can be solved in reasonable time by outer approximation or branch and bound.
\[ \Omega = \{y \in Q(D) \mid \psi(y) \le \gamma\}, \]
\[ y^k \in \operatorname{argmin}\{\varphi(y) \mid y \in P_k\} \]
\[ \varphi(0) > 0, \qquad \psi(0) < \min\{\psi(y) \mid y \in Q(D),\ \varphi(y) \le 0\}. \tag{9.85} \]
Now, since $\varphi(y^k) < 0 < \varphi(0)$ and $\psi_k(y^k) \le 0 < \psi_k(0)$, one can compute $z^k \in [0, y^k]$ satisfying $\min\{\varphi(z^k), \psi_k(z^k)\} = 0$. Consider the linear program
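Computing $z^k$ on the segment $[0, y^k]$ is a one-dimensional root search; a generic bisection sketch (my own, with a made-up test function $g$, standing in for $\min\{\varphi, \psi_k\}$) under the assumption that the function is positive at $0$ and nonpositive at $y^k$:

```python
def ray_bisect(g, y, iters=80):
    """Return z = s*y with g(z) ~ 0, assuming g is continuous along [0, y]
    with g(0) > 0 and g(y) <= 0 (bisection on the scalar parameter s)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        z = [mid * yi for yi in y]
        if g(z) > 0:
            lo = mid            # still on the positive side: move outward
        else:
            hi = mid            # crossed zero: move inward
    s = 0.5 * (lo + hi)
    return [s * yi for yi in y]
```

Eighty halvings locate the crossing to machine-level accuracy, which is more than enough for constructing the cut of Proposition 9.7 at $z^k$.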
Proposition 9.7
(i) If SP($z^k$) has a (finite) optimal solution $(v^k, w^k)$, then $z^k \in Q(D)$ and (9.84) is satisfied by the affine function
\[ l(y) := \langle v^k, y - z^k\rangle. \tag{9.86} \]
is feasible, hence $z^k \in Q(D)$. It is easy to see that $v^k \in \partial\psi(z^k)$: indeed, for any $y$, since $\psi(y)$ is the optimal value in SP$^*$($y$), it must also be the optimal value in SP($y$); i.e., $\psi(y) \ge \langle y, v^k\rangle - \langle b, w^k\rangle$, hence $\psi(y) - \psi(z^k) \ge \langle y, v^k\rangle - \langle b, w^k\rangle - \langle z^k, v^k\rangle + \langle b, w^k\rangle = \langle v^k, y - z^k\rangle$, proving the claim. For every $y \in \Omega$ we have $\psi(y) \le \psi(z^k)$, so $l(y) := \langle v^k, y - z^k\rangle \le \psi(y) - \psi(z^k) \le 0$. On the other hand, $y^k = \theta z^k = \theta z^k + (1-\theta)\cdot 0$ for some $\theta \ge 1$; hence $l(y^k) = \theta l(z^k) + (1-\theta)l(0) > 0$, because $l(0) \le \psi(0) - \psi(z^k) < 0$ while $l(z^k) = 0$.
To prove (ii), observe that if SP($z^k$) is unbounded, then SP$^*$($z^k$) is infeasible, i.e., $z^k \notin Q(D)$. Since the recession cone of the feasible set of SP($z^k$) is the cone $\{(v, w) \mid Q^Tv - A^Tw \le 0,\ w \ge 0\}$, SP($z^k$) may be unbounded only if an extreme direction $(v^k, w^k)$ of this cone satisfies $\langle z^k, v^k\rangle - \langle b, w^k\rangle > 0$. Then the affine function (9.87) satisfies $l(z^k) > 0$, while for any $y \in Q(D)$, SP$^*$($y$) (hence SP($y$)) must have a finite optimal value, and consequently $\langle y, v^k\rangle - \langle b, w^k\rangle \le 0$, i.e., $l(y) \le 0$. $\square$
The convergence of an OA method for (MRP) based on the above proposition follows from general theorems established in Sect. 6.3.
Remark 9.7 In the above approach we only used the constancy of $h(x)$ on every manifold $\{x \mid Qx = y\}$. In fact, since $\varphi(y') \ge \varphi(y)$ for all $y' \ge y$, problem (MRP) can also be written as
A conical algorithm for solving (9.83) starts with an initial cone $M_0$ contained in $Q(X)$ and containing the feasible set $Q(D) \cap \{y \mid \varphi(y) \le 0\}$.
Given any cone $M \subset Q(\mathbb{R}^n)$ with base $[u^1,\dots,u^k]$, let $\theta_iu^i$ be the point where the $i$-th edge of $M$ intersects the surface $\varphi(y) = 0$. If $U$ denotes the matrix of columns $u^1,\dots,u^k$, then $M = \{Ut \mid t \in \mathbb{R}^k,\ t \ge 0\}$ and $\sum_{i=1}^{k} t_i/\theta_i = 1$ is the equation of the hyperplane passing through the intersections of the edges of $M$ with the surface $\varphi(y) = 0$. Denote by $\beta(M)$ and $t(M)$ the optimal value and a basic optimal solution of the linear program
\[ \mathrm{LP}(M) \qquad \min\Big\{ \langle c, x\rangle \;\Big|\; x \in D,\ Qx = \sum_{i=1}^{k} t_iu^i,\ \sum_{i=1}^{k} \frac{t_i}{\theta_i} \ge 1,\ t \ge 0 \Big\}. \]
Proposition 9.8 $\beta(M)$ gives a lower bound for the objective value over the set of feasible points in $M$. If $\omega(M) = Ut(M)$ lies on an edge of $M$ then
Here the function $h(x)$ is concave and monotonic with respect to the cone $K = \{y \mid c^1y = 0,\ c^2y = 0\}$, where $c^1 = (3, 6, 8, 0)$, $c^2 = (4, 5, 0, 1)$ (cf. Example 7.2). So $Q$ is the matrix with the two rows $c^1, c^2$, and $Q(D) = \{y = Qx,\ x \in D\}$ is contained in the cone $M_0 = \{y \mid y \ge 0\}$ ($D$ denotes the feasible set). Since the optimal solution $x^0 \in \mathbb{R}^4$ of the underlying linear program satisfies $h(x^0) > 0$, conditions (9.89) hold, and $M_0 = \operatorname{cone}\{e^1, e^2\}$ can be chosen as an initial cone. The conical algorithm terminates after 4 iterations, yielding an optimal solution $(0, 0, 1.54, 2.0)$ and the optimal value 0.76. Note that branching is performed in $\mathbb{R}^2$ ($t$-space) rather than in the original space $\mathbb{R}^4$.
C K L? ; (9.91)
D n intC ;; D C: (9.92)
Following the PA Method for (LRC) (Sect. 8.1), to solve (MRP) we construct a nested sequence of polyhedrons $P_1 \supset P_2 \supset \dots$ together with a nonincreasing
\[ \sum_{i=1}^{r} t_i\langle c^i, c^j\rangle \le \frac{1}{\theta_j}, \qquad j = 0, 1, \dots, r, \]
where $c^0 = \sum_{i=1}^{r} c^i$ and $\theta_j = \sup\{\theta \mid \theta c^j \in C\}$. In the algorithm below, when a feasible solution $x^k$ better than the incumbent has been found, we can compute a feasible point $\hat x^k$ at least as good as $x^k$ and lying on an edge of $D$ (see Sect. 5.7). This can be done by carrying out a few steps of the simplex algorithm. We shall refer to $\hat x^k$ as a point obtained by local improvement from $x^k$.
PA Algorithm for (MRP)
Initialization. Choose a vertex $x^0$ of $D$ such that $h(x^0) > 0$ and set $D \leftarrow D - x^0$, $C \leftarrow C - x^0$. Let $\bar x^1$ = best feasible solution available, $\gamma_1 = \langle c, \bar x^1\rangle$ ($\bar x^1 = \emptyset$, $\gamma_1 = +\infty$ if no feasible solution is available). Let $P_1$ be the simplex constructed as in Lemma 9.1, $V_1$ the vertex set of $\tilde P_1$. Set $k = 1$.
Step 1. For every $t = (t_1,\dots,t_r)^T \in V_k$ solve the linear program
\[ \max\Big\{ \sum_{i=1}^{r} t_i\langle c^i, x\rangle \;\Big|\; x \in D,\ \langle c, x\rangle \le \gamma_k \Big\} \]
to obtain its optimal value $\mu(t)$. (Only the new $t \in V_k$ need be considered if this step is entered from Step 3.) If $\mu(t) \le 1$ for all $t \in V_k$, then terminate: if $\gamma_k < +\infty$, $\bar x^k$ is an optimal solution; otherwise (MRP) is infeasible.
Step 2. If $\mu(t^k) > 1$ for some $t^k \in V_k$, then let $x^k$ be a basic optimal solution of the corresponding linear program. If $x^k \notin C$, then let $\hat x^k$ be the solution obtained by local improvement from $x^k$; reset $\bar x^k = \hat x^k$, $\gamma_k = \langle c, \hat x^k\rangle$, and return to Step 1.
Step 3. If $x^k \in C$ then set $\bar x^{k+1} = \bar x^k$, $\gamma_{k+1} = \gamma_k$, compute $\theta_k = \sup\{\theta \mid \theta x^k \in C\}$ and define
\[ \tilde P_{k+1} = \tilde P_k \cap \Big\{ t \;\Big|\; \sum_{i=1}^{r} t_i\langle x^k, c^i\rangle \le \frac{1}{\theta_k} \Big\}. \]
From $V_k$ derive the vertex set $V_{k+1}$ of $\tilde P_{k+1}$. Set $k \leftarrow k+1$ and return to Step 1.
9.8 Network Constraints 319
where y_i is the production level to be determined for factory i and x_{ij} the amount to be sent from factory i to warehouse j in order to meet the demands d₁, …, d_m of the warehouses. Another variant of (SCP) is the minimum concave cost network flow problem, which can be formulated as
(MCCNFP)  min Σ_{i=1}^{r} g_i(x_i) + Σ_{i=r+1}^{n} c_i x_i
          s.t. Ax = d,  x ≥ 0,

where A is the node-arc incidence matrix of a given directed graph with n arcs, d the vector of node demands (with supplies interpreted as negative demands), and x_i the flow value on arc i. Nodes j with d_j > 0 are the sinks, nodes j with d_j < 0 are the sources, and it is assumed that Σ_j d_j = 0 (balance condition). The cost of shipping t units along arc i is a nonnegative concave nondecreasing function g_i(t) for i = 1, …, r, and a linear function c_i t (with c_i ≥ 0) for i = r + 1, …, n. To cast (MCCNFP) into the form (SCP) it suffices to rewrite it as

min Σ_{i=1}^{r} g_i(y_i) + Σ_{i=r+1}^{n} c_i x_i   (9.95)
s.t. x_i = y_i (i = 1, …, r),  Ax = d,  x ≥ 0.
min g(y) + t
s.t. ⟨c, x⟩ ≤ t,  Qx = y,  Bx = d,  x ≥ 0,  y ∈ Y.   (9.96)

Following Remark 7.1, to check whether (y^k, t_k) is feasible to (9.96) we solve the pair of dual programs.

Assume that in problem (SCP) the function g(y) is concave separable, i.e., g(y) = Σ_{i=1}^{r} g_i(y_i). For each rectangle M = [a, b] let
l_M(y) = Σ_{i=1}^{r} l_{i,M}(y_i),

where l_{i,M}(t) is the affine function that agrees with g_i(t) at the endpoints a_i, b_i of the segment [a_i, b_i], i.e.,

l_{i,M}(t) = g_i(a_i) + ((g_i(b_i) − g_i(a_i))/(b_i − a_i)) (t − a_i)   (9.100)
(see Proposition 7.4). Then a lower bound β(M) for the objective function in (SCP) over M is provided by the optimal value in the linear program

(LP(M))  min { Σ_{i=1}^{r} l_{i,M}(y_i) + ⟨c, x⟩ | Qx = y, Ax = d, x ≥ 0 }.

Using this lower bound and a rectangular ω-subdivision rule, a convergent rectangular algorithm has been developed in Horst and Tuy (1996); it is a refined version of an earlier algorithm by Soland (1974) and should be able to handle even large-scale (PTP) provided r is relatively small. Also note that if the data are all integral, then it is well known that a basic optimal solution of the linear transportation problem LP(M) always exists with integral values. Consequently, in this case, starting from an initial rectangle with integral vertices the algorithm will generate only subrectangles with integral vertices, and so it will necessarily be finite.
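Because each g_i is concave, the chord in (9.100) automatically underestimates g_i on [a_i, b_i]; no extra shift is needed. The sketch below (Python, with an illustrative concave cost g(t) = √t; the helper names are ours, not the book's) checks this numerically:

```python
# Sketch: the affine function (9.100) agreeing with a concave g at the
# endpoints of [a, b] underestimates g on the whole segment.

def chord(g, a, b):
    """Return the affine function l with l(a) = g(a), l(b) = g(b)."""
    slope = (g(b) - g(a)) / (b - a)
    return lambda t: g(a) + slope * (t - a)

def max_gap(g, a, b, samples=100):
    """Largest value of l(t) - g(t) on a grid of [a, b]; <= 0 means l <= g."""
    l = chord(g, a, b)
    pts = [a + (b - a) * i / samples for i in range(samples + 1)]
    return max(l(t) - g(t) for t in pts)

g = lambda t: t ** 0.5          # illustrative concave arc cost
a, b = 1.0, 9.0
l = chord(g, a, b)
assert abs(l(a) - g(a)) < 1e-12 and abs(l(b) - g(b)) < 1e-12
print(max_gap(g, a, b) <= 1e-12)   # True: the chord never exceeds g on [a, b]
```

This is exactly why LP(M) yields a valid lower bound: replacing each g_i by its chord can only decrease the objective over M.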
In practice, aside from production and transportation costs there may also be shortage/holding costs due to a failure to meet the demands exactly. Then we have the following problem (cf. Example 5.1), which includes a formulation of the stochastic transportation-location problem as a special case:

(GPTP)  min ⟨c, x⟩ + Σ_{i=1}^{r} g_i(y_i) + Σ_{j=1}^{m} h_j(z_j)
        s.t. Σ_{j=1}^{m} x_{ij} = y_i  (i = 1, …, r),
             Σ_{i=1}^{r} x_{ij} = z_j  (j = 1, …, m),
             0 ≤ y_i ≤ s_i ∀i,  x_{ij} ≥ 0 ∀i, j,

where g_i(y_i) is the cost of producing y_i units at factory i and h_j(z_j) is the shortage/holding cost to be incurred if warehouse j receives z_j ≠ d_j (the demand of warehouse j). It is assumed that g_i is a concave nondecreasing function satisfying g_i(0) = 0 ≤ g_i(0⁺) (so possibly g_i(0⁺) > 0, as, e.g., for a fixed charge), while h_j : [0, s] → R₊ is a convex function (s = Σ_{i=1}^{r} s_i) such that h_j(d_j) = 0, and the right derivative of h_j at 0 has a finite absolute value Δ_j. The objective function is then a d.c. function, which is a new source of difficulty. However, since the problem becomes convex when y is fixed, it can be handled by a branch and bound algorithm in which branching is performed upon y (Holmberg and Tuy 1995).
For any rectangle M = [a, b] ⊂ Y := {y ∈ R^r₊ | 0 ≤ y_i ≤ s_i, i = 1, …, r} let, as previously, l_{i,M}(t) be defined by (9.100) and l_M(y) = Σ_{i=1}^{r} l_{i,M}(y_i). Then a lower bound of the objective function over the feasible points in M is given by the optimal value β(M) in the convex program

CP(M)  min ⟨c, x⟩ + l_M(y) + Σ_{j=1}^{m} h_j(z_j)
       s.t. Σ_{j=1}^{m} x_{ij} = y_i  (i = 1, …, r),
            Σ_{i=1}^{r} x_{ij} = z_j  (j = 1, …, m),
            a_i ≤ y_i ≤ b_i ∀i,  x_{ij} ≥ 0 ∀i, j.

Note that if an optimal solution (x(M), y(M), z(M)) of this convex program satisfies l_M(y(M)) = g(y(M)) then β(M) equals the minimum of the objective function over the feasible solutions in M. When y is fixed, (GPTP) is a convex program in x, z. If φ(y) denotes the optimal value of this program then the problem amounts to minimizing φ(y) over all feasible (x, y, z).
Step 5. If g_{i_k}(y^k_{i_k}) − l_{i_k,M_k}(y^k_{i_k}) = 0, then terminate: y^k is a global optimal solution.
Step 6. Divide M_k via (y^k, i_k) (see Sect. 5.6). Let P_{k+1} be the partition of M_k and S_{k+1} = (R_k \ {M_k}) ∪ P_{k+1}. Set k ← k + 1 and return to Step 1.
To establish the convergence of this algorithm, recall that y^k = y(M_k). Similarly denote x^k = x(M_k), z^k = z(M_k), so (x^k, y^k, z^k) is an optimal solution of CP(M_k). Suppose the algorithm is infinite and let ȳ be any accumulation point of {y^k}, say ȳ = lim_{q→∞} y^{k_q}. Without loss of generality we may assume x^{k_q} → x̄, z^{k_q} → z̄, and also i_{k_q} = 1 ∀q, so that

Clearly if for some q we have y₁^{k_q} = 0, then g₁(y₁^{k_q}) = l_{1,M_{k_q}}(y₁^{k_q}), and the algorithm terminates in Step 5. Therefore, since the algorithm is infinite, we must have y₁^{k_q} > 0 ∀q.
Lemma 9.9 If g₁(0⁺) > 0 then

ȳ₁ = lim_{q→∞} y₁^{k_q} > 0.   (9.103)

Proof Let M_{k_q,1} = [a₁^q, b₁^q], so that y₁^{k_q} ∈ [a₁^q, b₁^q] and [a₁^q, b₁^q], q = 1, 2, …, is a sequence of nested intervals. If 0 ∉ [a₁^{q₀}, b₁^{q₀}] for some q₀, then 0 ∉ [a₁^q, b₁^q] ∀q ≥ q₀, and y₁^{k_q} ≥ a₁^{q₀} > 0 ∀q ≥ q₀; hence ȳ₁ ≥ a₁^{q₀} > 0, i.e., (9.103) holds. Thus, it suffices to consider the case

a₁^q = 0,  y₁^{k_q} > 0  ∀q.   (9.104)
Recall that Δ_j denotes the absolute value of the right derivative of h_j(t) at 0, and let Δ := max_{1≤j≤m} (Δ_j − c_{1j}).

Since Σ_j x_{1j}^{k_q} = y₁^{k_q} > 0, there exists j₀ such that x_{1j₀}^{k_q} > 0. Let (x̃^{k_q}, ỹ^{k_q}, z̃^{k_q}) be a feasible solution to CP(M_{k_q}) obtained from (x^{k_q}, y^{k_q}, z^{k_q}) by subtracting a very small positive amount θ < x_{1j₀}^{k_q} from each of the components x_{1j₀}^{k_q}, y₁^{k_q}, z_{j₀}^{k_q}, leaving all the other components unchanged. Then the production cost at factory 1 decreases by l_{1,M_{k_q}}(y₁^{k_q}) − l_{1,M_{k_q}}(ỹ₁^{k_q}) = g₁(b₁^{k_q})θ/b₁^{k_q} > 0, the transportation cost on the arc (1, j₀) decreases by c_{1j₀}θ, while the penalty incurred at warehouse j₀ either decreases (if z_{j₀}^{k_q} > d_{j₀}) or increases by h_{j₀}(z_{j₀}^{k_q} − θ) − h_{j₀}(z_{j₀}^{k_q}) ≤ Δ_{j₀}θ. Thus the total cost decreases by at least

δθ  with  δ := g₁(b₁^{k_q})/b₁^{k_q} + c_{1j₀} − Δ_{j₀}.   (9.106)

If Δ ≤ 0 then c_{1j₀} − Δ_{j₀} ≥ 0, hence δ > 0 and (x̃^{k_q}, ỹ^{k_q}, z̃^{k_q}) would be a better feasible solution than (x^{k_q}, y^{k_q}, z^{k_q}), contradicting the optimality of the latter for CP(M_{k_q}). Therefore, Δ > 0. Now suppose g₁(0⁺) > 0. Since g₁(t)/t → +∞ as t → 0⁺ (in view of the fact that g₁(0⁺) > 0), there exists η₁ > 0 satisfying

g₁(η₁)/η₁ > Δ.
Observe that since M_{k_q,1} = [0, b₁^{k_q}] is divided via y₁^{k_q}, we must have [0, b₁^{k_{q'}}] ⊂ [0, y₁^{k_q}] for all q' > q, while [0, y₁^{k_q}] ⊂ [0, b₁^{k_q}] for all q. With this in mind we will show that

b₁^{k_q} ≥ η₁  ∀q.   (9.107)

By the above observation this will imply that y₁^{k_q} ≥ η₁ ∀q, thereby completing the proof. Suppose (9.107) does not hold, i.e., b₁^{k_q} < η₁ for some q. Then y₁^{k_q} ≤ b₁^{k_q} < η₁, and since g₁(t) is concave it is easily seen from Fig. 9.1 that g₁(b₁^{k_q}) ≥ b₁^{k_q} g₁(η₁)/η₁, i.e.,

g₁(b₁^{k_q})/b₁^{k_q} ≥ g₁(η₁)/η₁ > Δ ≥ Δ_{j₀} − c_{1j₀};

hence δ > 0 and (x̃^{k_q}, ỹ^{k_q}, z̃^{k_q}) is a better feasible solution than (x^{k_q}, y^{k_q}, z^{k_q}). This contradiction proves (9.107). ∎
Theorem 9.3 The above algorithm for (GPTP) can be infinite only if ε = 0, and in this case it generates an infinite sequence y^k = y(M_k), k = 1, 2, …, every accumulation point of which is an optimal solution.

Proof For ε = 0 let ȳ = lim_{q→∞} y^{k_q} be an accumulation point of {y^k} with, as previously, i_{k_q} = 1 ∀q and M_{k_q,1} = [a₁^q, b₁^q]. Since {a₁^q} is nondecreasing and {b₁^q} is nonincreasing, we have a₁^q → ā₁, b₁^q → b̄₁. For each q either of the following situations occurs:

a₁^q ≤ a₁^{q+1} ≤ b₁^{q+1} ≤ y₁^{k_q} ≤ b₁^q   (9.108)
a₁^q ≤ y₁^{k_q} ≤ a₁^{q+1} ≤ b₁^{q+1} ≤ b₁^q   (9.109)

If (9.108) occurs for infinitely many q then lim_{q→∞} y₁^{k_q} = lim_{q→∞} b₁^q = b̄₁. If (9.109) occurs for infinitely many q then lim_{q→∞} y₁^{k_q} = lim_{q→∞} a₁^q = ā₁. Thus, ȳ₁ ∈ {ā₁, b̄₁}. By Lemma 9.9, the concave function g₁(t) is continuous at ȳ₁. Suppose, for instance, that ȳ₁ = ā₁ (the other case is similar). Since y₁^{k_q} → ā₁ and a₁^q → ā₁, it follows from the continuity of g₁(t) at ā₁ that g₁(y₁^{k_q}) − l_{1,M_{k_q}}(y₁^{k_q}) → 0.
Thus, all is reduced to solving LP(y) parametrically in y ∈ [0, s]. But this is very easy because LP(y) is a continuous knapsack problem. In fact, by reindexing we can assume that the c_j are ordered by increasing values:

c₁ ≤ c₂ ≤ … ≤ c_n.

Let k be an index such that Σ_{j=1}^{k−1} d_j ≤ y ≤ Σ_{j=1}^{k} d_j.

Lemma 9.10 An optimal solution of LP(y) is x̄ with

x̄_j = d_j  ∀j = 1, …, k − 1,
x̄_k = y − Σ_{j=1}^{k−1} d_j,
x̄_j = 0  ∀j = k + 1, …, n.
9.9 Some Polynomial-Time Solvable Problems 327
Proof This well-known result (see, e.g., Dantzig 1963) is rather intuitive because for any feasible solution x of LP(y), by noting that y = Σ_{j=1}^{n} x_j we can write

⟨c, x̄⟩ = Σ_{j=1}^{k−1} c_j d_j + c_k Σ_{j=1}^{n} x_j − c_k Σ_{j=1}^{k−1} d_j
       = Σ_{j=1}^{k−1} c_j d_j − Σ_{j=1}^{k−1} c_k (d_j − x_j) + Σ_{j=k}^{n} c_k x_j
       ≤ Σ_{j=1}^{k−1} c_j d_j − Σ_{j=1}^{k−1} c_j (d_j − x_j) + Σ_{j=k}^{n} c_k x_j ≤ ⟨c, x⟩. ∎
Proposition 9.10

(i) The breakpoints of the optimal value function of LP(y), 0 ≤ y ≤ s, are among the points

y⁰ = 0,  y^k = y^{k−1} + d_k,  k = 1, …, n.

Proof From Lemma 9.10, for any y ∈ [y^k, y^{k+1}] a basic optimal solution of LP(y) is

x^y = x^k + θ(x^{k+1} − x^k)  with  θ = (y − y^k)/d_{k+1}.

Hence each segment [y^k, y^{k+1}] is a linear piece of the optimal value function φ(y) of LP(y), which proves (i). Part (ii) is immediate. ∎
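Lemma 9.10 and Proposition 9.10(i) translate directly into code: with the c_j sorted increasingly, the optimal x̄ is filled greedily, and the breakpoints of φ(y) are the partial sums of the d_j. A minimal sketch on made-up data (Python; function names are ours):

```python
# Greedy solution of the continuous knapsack LP(y) of Lemma 9.10:
#   min <c, x>  s.t.  sum(x) = y,  0 <= x_j <= d_j,
# assuming c is already sorted increasingly (the reindexing of the text).

def knapsack_opt(c, d, y):
    """Return an optimal x for LP(y): fill cheapest items first."""
    x, rest = [], y
    for cj, dj in zip(c, d):
        take = min(dj, rest)
        x.append(take)
        rest -= take
    assert rest <= 1e-12, "y exceeds total capacity"
    return x

def breakpoints(d):
    """Candidate breakpoints y^0 = 0, y^k = y^{k-1} + d_k of phi(y)."""
    ys = [0.0]
    for dj in d:
        ys.append(ys[-1] + dj)
    return ys

c, d = [1.0, 2.0, 5.0], [2.0, 3.0, 4.0]
x = knapsack_opt(c, d, 4.0)           # y = 4 lies between y^1 = 2 and y^2 = 5
print(x)                              # [2.0, 2.0, 0.0]
print(sum(cj * xj for cj, xj in zip(c, x)))   # value 6.0
print(breakpoints(d))                 # [0.0, 2.0, 5.0, 9.0]
```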
As a result, the concave program CPL(1) can be solved by the following:

Parametric Algorithm for CPL(1)

Initialization. Order the coefficients c_j by increasing values:

c₁ ≤ c₂ ≤ … ≤ c_n.
Step 3. Find f_k = min{f₀, f₁, …, f_n}. Then x^k is an optimal solution.

Proposition 9.11 The above algorithm requires O(n log₂ n) elementary operations and n evaluations of g(·).

Proof The complexity-dominant part of the algorithm is Step 1, which requires O(n log₂ n) operations, using, e.g., the procedure Heapsort (Ahuja et al. 1993). Also an evaluation of g(·) is needed for each number f_k. ∎

Thus, assuming the values of the nonlinear function g(·) are provided by an oracle, the above algorithm for CPL(1) runs in strongly polynomial time.
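The scheme can be sketched end to end: on each segment [y^k, y^{k+1}] the LP value φ is linear, so g(y) + φ(y) is concave there and is minimized at an endpoint; hence it suffices to sort the c_j, walk the breakpoints, and keep the smallest f_k. Below is an illustrative Python sketch (made-up data; the problem form and names are assumptions in the spirit of Lemma 9.10, not the book's code):

```python
# Parametric scheme for CPL(1): evaluate f_k = g(y^k) + phi(y^k) at the
# n+1 breakpoints of the piecewise linear knapsack value phi and take the
# minimum.  Data below are illustrative only.
import math

def solve_cpl1(g, c, d):
    # Step 1: sort the linear costs (O(n log n)).
    order = sorted(range(len(c)), key=lambda j: c[j])
    cs, ds = [c[j] for j in order], [d[j] for j in order]
    # Step 2: walk breakpoints, accumulating phi(y^k) greedily.
    best, y, phi = None, 0.0, 0.0
    for k in range(len(cs) + 1):
        fk = g(y) + phi                       # f_k = g(y^k) + phi(y^k)
        if best is None or fk < best[0]:
            best = (fk, y)
        if k < len(cs):                       # advance to the next breakpoint
            phi += cs[k] * ds[k]
            y += ds[k]
    # Step 3: the best breakpoint and its objective value.
    return best

g = lambda y: -5.0 * math.sqrt(y)             # illustrative concave part
f_min, y_min = solve_cpl1(g, c=[1.5, 1.0], d=[1.0, 1.0])
print(y_min)   # 2.0: the concave saving outweighs the larger linear cost
```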
9.10 Applications
Substituting d_j − x_{1j} for x_{2j} and setting x_j = x_{1j}, c_j = c_{1j} − c_{2j}, we can convert this problem into a CPL(1). Therefore:

Proposition 9.12 PTP(2) can be solved by an algorithm requiring O(m log₂ m) elementary operations and m evaluations of g(·).
We shall refer to Problem (MCCNFP) with a single source and r nonlinear arc costs as FP(r). So FP(1) is the problem of finding a feasible flow with minimum cost in a given network G with m nodes (indexed by j = 1, …, m) and n arcs (indexed by i = 1, …, n), where node j has a demand d_j (with d_m < 0, d_j ≥ 0 for all other j, Σ_{j=1}^{m} d_j = 0), arc 1 has a concave nonlinear cost function g₁(t), while each arc i = 2, …, n has a linear cost g_i(t) = c_i t (with c_i ≥ 0). If A_j⁺ and A_j⁻ denote the sets of arcs entering and leaving node j, respectively, then the problem is

FP(1)  min g₁(x₁) + Σ_{i=2}^{n} c_i x_i
       s.t. Σ_{i∈A_j⁺} x_i − Σ_{i∈A_j⁻} x_i = d_j,  j = 1, 2, …, m,
            x ≥ 0.
Proof To simplify the language we call the single arc with nonlinear cost the black arc. All the other arcs are called white, and the unit cost c_i attached to a white arc is called its length. A directed path through white arcs only is called a white path; its length is then the sum of the lengths of its arcs. A white path from a node j to a node j' is called shortest if its length is smallest among all white paths from j to j'. Denote by Q_{0j} the shortest white path from the source to sink j, by Q_{1j} the shortest white path from the head of the black arc to sink j, and by P the shortest white path from the source to the tail of the black arc. Let c_{0j}, c_{1j}, p be the lengths of these paths, respectively (the length being +∞, or an arbitrarily large positive number, if the path does not exist). Assuming {j | d_j > 0} = {1, …, m₁}, let s = Σ_{j=1}^{m₁} d_j. Consider now a problem PTP(2) defined for two factories F₀, F₁ and m₁ warehouses W₁, …, W_{m₁} with demands d₁, …, d_{m₁}, where the cost of producing y units at factory F₀ and s − y units at factory F₁ is g₁(y) + py, while the unit transportation cost from F_i to W_j is c_{ij} (i = 0, 1), i.e.,

min g₁(y) + py + Σ_{i=0}^{1} Σ_{j=1}^{m₁} c_{ij} x_{ij}
s.t. Σ_{j=1}^{m₁} x_{0j} = y,
     x_{0j} + x_{1j} = d_j,  j = 1, …, m₁,   (9.112)
     0 ≤ x_{0j} ≤ d_j,  j = 1, …, m₁

(the corresponding reduced network is depicted in Fig. 9.2). Clearly from an optimal solution x_{ij} of (9.112) we can derive an optimal solution of FP(1) by setting, in FP(1),

x₁ = Σ_{j=1}^{m₁} x_{0j}
[Fig. 9.2: reduced network with factories F₀, F₁ and warehouses W₁, W₂, …, W_{m₁}]
and solving the corresponding linear min-cost network flow problem. Thus, solving FP(1) is reduced to solving the just defined PTP(2). To complete the proof, it remains to observe that the computation of c_{ij}, i = 0, 1, j = 1, …, m₁, and p needed for the above transformation requires solving two single-source multiple-sink shortest path problems with nonnegative arc costs, hence a time of O(m log₂ m + n) (using Fredman and Tarjan's (1984) implementation of Dijkstra's algorithm).

It follows from Propositions 9.11 and 9.13 that FP(1) can be solved by a strongly polynomial algorithm requiring O(m log₂ m + m log₂ m + n) = O(m log₂ m + n) elementary operations and m evaluations of the function g₁(·).
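The quantities c_{0j}, c_{1j}, p are ordinary single-source shortest-path lengths over the white arcs. The O(m log m + n) bound cited uses Fredman and Tarjan's Fibonacci heaps; the binary-heap variant sketched below (Python, toy data of our own) runs in O((m + n) log m) and is what one typically implements:

```python
import heapq

def dijkstra(adj, src):
    """Shortest-path lengths from src; adj[u] = [(v, length), ...].
    Unreachable nodes get float('inf'), playing the role of the
    'arbitrarily large positive number' in the text."""
    dist = {u: float("inf") for u in adj}
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

# Toy white-arc network: 0 = source, 3 = tail, 4 = head of the black arc.
white = {0: [(1, 1.0), (3, 4.0)], 1: [(2, 2.0), (3, 1.0)],
         2: [], 3: [], 4: [(2, 1.0)]}
c0 = dijkstra(white, 0)      # c_{0j}: source -> sink j
c1 = dijkstra(white, 4)      # c_{1j}: head of black arc -> sink j
p = dijkstra(white, 0)[3]    # p: source -> tail of black arc
print(c0[2], c1[2], p)       # 3.0 1.0 2.0
```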
Remark 9.9 A polynomial (but not strongly polynomial) algorithm for FP(1) was first given by Guisewite and Pardalos (1992). Later a strongly polynomial algorithm based on solving linear min-cost network flow problems with a parametric objective function was obtained by Klinz and Tuy (1993). The algorithms presented in this section for PTP(1) and FP(1) were developed by Tuy et al. (1993a) and subsequently extended in a series of papers by Tuy et al. (1993b, 1994a, 1995b, 1996a) to prove the strongly polynomial solvability of a class of network flow problems including PTP(r) and FP(r) with fixed r.
Following the approach in Sect. 8.5 we associate with PTP(r) the parametric problem

P(t)  min Σ_{i=1}^{r} Σ_{j=1}^{m} t_i x_{ij} + Σ_{i=1}^{r} Σ_{j=1}^{m} c_{ij} x_{ij}
      s.t. Σ_{i=1}^{r} x_{ij} = d_j,  j = 1, …, m,
           x_{ij} ≥ 0,  i = 1, …, r, j = 1, …, m,

where t ∈ R^r₊. The parametric domain R^r₊ is partitioned into a finite collection P of polyhedrons (cells) such that for each cell Π ∈ P there is a basic solution x^Π which is optimal to P(t) for all t ∈ Π. Then an optimal solution of PTP(r) is x^{Π*}, where

Π* ∈ argmin { g( Σ_{j=1}^{m} x^Π_{1j}, …, Σ_{j=1}^{m} x^Π_{rj} ) + Σ_{i,j} c_{ij} x^Π_{ij} | Π ∈ P }.   (9.113)

Following Tuy (2000b) we now show that the collection P can be constructed and that its cardinality is bounded by a polynomial in m.
Observe that the dual of P(t) is

P*(t)  max Σ_{j=1}^{m} d_j u_j
       s.t. u_j ≤ t_i + c_{ij},  i = 1, …, r, j = 1, …, m.

Also for any fixed t ∈ R^r₊ a basic solution of P(t) is a vector x^t such that for every j = 1, …, m there is an i_j satisfying

x^t_{ij} = d_j if i = i_j,  x^t_{ij} = 0 if i ≠ i_j,   (9.114)

where i_j minimizes t_i + c_{ij} over i = 1, …, r.
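For fixed t the program P(t) decouples over the warehouses, so a basic optimal solution of the form (9.114) simply assigns each demand d_j wholly to a factory minimizing t_i + c_{ij}. A minimal sketch on made-up data (Python; the function name is ours):

```python
# Basic optimal solution x^t of P(t) as in (9.114): warehouse j is served
# entirely by a factory i_j minimizing t_i + c_ij (ties broken arbitrarily).

def basic_solution(t, c, d):
    """c[i][j] = unit cost factory i -> warehouse j; returns x[i][j]."""
    r, m = len(t), len(d)
    x = [[0.0] * m for _ in range(r)]
    for j in range(m):
        ij = min(range(r), key=lambda i: t[i] + c[i][j])
        x[ij][j] = d[j]
    return x

t = [0.0, 1.0]
c = [[3.0, 1.0], [1.0, 1.0]]
d = [5.0, 2.0]
print(basic_solution(t, c, d))   # [[0.0, 2.0], [5.0, 0.0]]
```

Warehouse 0 goes to factory 1 (1 + 1 = 2 < 0 + 3), warehouse 1 to factory 0 (0 + 1 = 1 < 1 + 1).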
Now let I² be the set of all pairs (i₁, i₂) with i₁ < i₂, i₁, i₂ ∈ {1, …, r}. Define a cell to be a polyhedron Π ⊂ R^r₊ which is the solution set of a linear system formed by taking, for every pair (i₁, i₂) ∈ I² and every j = 1, …, m, one of the following inequalities:

t_{i₁} + c_{i₁j} ≤ t_{i₂} + c_{i₂j}  or  t_{i₁} + c_{i₁j} ≥ t_{i₂} + c_{i₂j}.   (9.116)

The index i_j satisfying (9.114) then remains the same for all t ∈ Π; in other words, x^t (basic optimal solution of P(t)) equals a constant vector x^Π for all t ∈ Π. Let P be the collection of all cells defined that way. Since every t ∈ R^r₊ satisfies one of the inequalities (9.116) for every (i₁, i₂) ∈ I² and every j = 1, …, m, the collection P covers all of R^r₊. Let us estimate an upper bound for the number of cells in P.

Observe that for any fixed pair (i₁, i₂) ∈ I² we have t_{i₁} + c_{i₁j} ≤ t_{i₂} + c_{i₂j} if and only if t_{i₁} − t_{i₂} ≤ c_{i₂j} − c_{i₁j}. Let us sort the numbers c_{i₂j} − c_{i₁j}, j = 1, …, m, in increasing order, and let σ_{i₁,i₂}(j) be the position of c_{i₂j} − c_{i₁j} in this ordered sequence.
Proposition 9.14 A cell Π is characterized by a map k_Π : I² → {1, …, m, m + 1} such that Π is the solution set of the linear system

t_{i₁} + c_{i₁j} ≤ t_{i₂} + c_{i₂j}  ∀(i₁, i₂) ∈ I² s.t. σ_{i₁,i₂}(j) ≥ k_Π(i₁, i₂),   (9.118)
t_{i₁} + c_{i₁j} ≥ t_{i₂} + c_{i₂j}  ∀(i₁, i₂) ∈ I² s.t. σ_{i₁,i₂}(j) < k_Π(i₁, i₂).   (9.119)

Proof Let Π ⊂ R^r₊ be a cell. For every pair (i₁, i₂) with i₁ < i₂ denote by J^{i₁,i₂} the set of all j = 1, …, m such that the left inequality (9.116) holds for all t ∈ Π, and define

k_Π(i₁, i₂) = min{ σ_{i₁,i₂}(j) | j ∈ J^{i₁,i₂} }  if J^{i₁,i₂} ≠ ∅,  and  k_Π(i₁, i₂) = m + 1 otherwise.

It is easy to see that Π is then the solution set of the system (9.118) and (9.119). Indeed, let t ∈ Π. If σ_{i₁,i₂}(j) ≥ k_Π(i₁, i₂), then k_Π(i₁, i₂) ≠ m + 1, so k_Π(i₁, i₂) = σ_{i₁,i₂}(l) for some l ∈ J^{i₁,i₂}. Then t_{i₁} + c_{i₁l} ≤ t_{i₂} + c_{i₂l}, hence t_{i₁} − t_{i₂} ≤ c_{i₂l} − c_{i₁l}, and since the relation σ_{i₁,i₂}(j) ≥ σ_{i₁,i₂}(l) means that c_{i₂j} − c_{i₁j} ≥ c_{i₂l} − c_{i₁l}, it follows that t_{i₁} − t_{i₂} ≤ c_{i₂j} − c_{i₁j}, i.e., t_{i₁} + c_{i₁j} ≤ t_{i₂} + c_{i₂j}. Therefore (9.118) holds. On the other hand, if σ_{i₁,i₂}(j) < k_Π(i₁, i₂) then by definition of k_Π, j ∉ J^{i₁,i₂}, hence (9.119)
holds, too [since from the definition of a cell, any t ∈ Π must satisfy one of the inequalities (9.116)]. Thus every t ∈ Π is a solution of the system (9.118) and (9.119). Conversely, if t satisfies (9.118) and (9.119), then for every (i₁, i₂) ∈ I², t satisfies the left inequality (9.116) for j ∈ J^{i₁,i₂} and the right inequality for j ∉ J^{i₁,i₂}; hence t ∈ Π. Therefore, each cell Π is determined by a map k_Π : I² → {1, …, m + 1}. Furthermore, it is easily seen that k_Π ≠ k_{Π'} for two different cells Π, Π'. Indeed, if Π ≠ Π' then at least for some (i₁, i₂) ∈ I² and some j = 1, …, m, one has j ∈ J^{i₁,i₂}_Π \ J^{i₁,i₂}_{Π'}. Then k_Π(i₁, i₂) ≤ σ_{i₁,i₂}(j) but k_{Π'}(i₁, i₂) > σ_{i₁,i₂}(j). ∎

Corollary 9.4 The total number of cells is bounded above by (m + 1)^{r(r−1)/2}.

Proof The number of cells does not exceed the number of different maps k : I² → {1, …, m + 1}, and there are (m + 1)^{r(r−1)/2} such maps. ∎
u
To sum up, the proposed algorithm for solving PTP.r/ involves the following
steps:
(1) Ordering the sequence ci2 j ci1 j ; j D 1; : : : ; m for every pair .i1 ; i2 / 2 I2 so as
to determine i1 ;i2 .j/; j D 1; : : : ; m; .i1 ; i2 / 2 I2 :
(2) Computing the vector x for every cell 2 P (P is the collection of cells
determinedly the maps k W I2 ! f1; : : : ; m C 1g:
(3) Computing the values f .x / and select according to (9.113).
Steps (1) and (2) require obviously a number of elementary operations
bounded by a polynomial in m; while step (3) requires mr.r1/=2 evaluations
of f .x/:
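For r = 2 the three steps are easy to spell out: I² contains the single pair (1, 2), there are at most m + 1 cells, and cell number k sends the warehouses at sorted positions ≥ k to factory 1. A sketch over hypothetical data (Python; the function and the way the concave part g is applied to the production totals follow (9.113), but the code itself is only illustrative):

```python
# Cell enumeration for PTP(2) in the spirit of Proposition 9.14: sort
# c_2j - c_1j; cell k serves the warehouses at positions >= k (in that
# order) from factory 1, the rest from factory 2, and is scored by (9.113).
import math

def solve_ptp2(c1, c2, d, g):
    """Enumerate the <= m+1 cells and return (best value, warehouses of
    factory 1).  g(y1, y2) is the concave production cost."""
    m = len(d)
    order = sorted(range(m), key=lambda j: c2[j] - c1[j])
    best = None
    for k in range(m + 1):
        to1 = set(order[k:])                      # served by factory 1
        y1 = sum(d[j] for j in to1)
        y2 = sum(d) - y1
        lin = sum((c1[j] if j in to1 else c2[j]) * d[j] for j in range(m))
        val = g(y1, y2) + lin
        if best is None or val < best[0]:
            best = (val, sorted(to1))
    return best

g = lambda y1, y2: 2.0 * math.sqrt(y1) + 2.0 * math.sqrt(y2)  # concave costs
val, to1 = solve_ptp2(c1=[1.0, 3.0], c2=[2.0, 1.0], d=[4.0, 4.0], g=g)
print(to1)   # [0]: only warehouse 0 is served by factory 1
```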
9.11 Exercises
9.1 Show that a differentiable function f(x) is K-monotonic on an open convex set X ⊂ R^n if and only if ∇f(x) ∈ K* ∀x ∈ X.

9.2 Consider the problem min{f(x) | x ∈ D} where D is a polyhedron in R^n with a vertex at 0, and f : R^n → R is a concave function such that for any real number γ ≤ f(0) the level set C_γ = {x | f(x) ≥ γ} has a polar contained in the cone generated by two linearly independent vectors c¹, c² ∈ R^n. Show that this problem reduces to minimizing a concave function of two variables on the projection of D on R². Develop an OA method for solving the problem.
9.3 Consider the problem

where t ∈ R, y ∈ R^n, a ∈ R^m, B ∈ R^{m×n}, c ∈ R^m₊. Develop a parametric algorithm for solving the problem and compare it with the polyhedral annexation method.
min{F(l(x)) | x ∈ D}

min Σ_{i=1}^{3} f_i(x_i) + Σ_{i=1}^{3} Σ_{j=1}^{5} d_{ij} y_{ij}
s.t. x₁ + x₂ + x₃ = 203,
     y_{1j} + y_{2j} + y_{3j} = b_j,  j = 1, …, 5,
     y_{i1} + ⋯ + y_{i5} = x_i,  i = 1, 2, 3,
     x_i ≥ 0;  y_{ij} ≥ 0,  i = 1, 2, 3; j = 1, …, 5.
9.6 Suppose that details of n different kinds must be manufactured. The production cost of t items of the i-th kind is a concave function f_i(t). The number of items of the i-th kind required is a_i > 0, but one item of the i-th kind can be replaced by one item of any j-th kind with j > i. Determine the number x_i of items of the i-th kind, i = 1, …, n, to be manufactured so as to satisfy the demands with minimum cost. By setting α_i = Σ_{j=1}^{i} a_j, this problem can be formulated as:

min Σ_{i=1}^{n} f_i(x_i)  s.t.
x₁ ≥ α₁,  x₁ + x₂ ≥ α₂,  …,  x₁ + ⋯ + x_n ≥ α_n,
x₁, …, x_n ≥ 0.
where D ⊂ R^n₊ is a polytope, c¹, c² ∈ R^n₊, c⁰ ∈ R^n, and cx denotes the inner product of c and x. Apply the algorithm to a small numerical example with n = 2.

9.8 Solve the numerical Example 9.10 (page 317).
9.9 Solve by polyhedral annexation:

min Σ_{i=1}^{n} x_i/(y_i + θ)  s.t.
Σ_{i=1}^{n} x_i = a,  Σ_{i=1}^{n} y_i = b,
0 ≤ x_i,  0 ≤ y_i ≤ 1,  i = 1, …, n,

where 0 < θ, 0 < a ≤ n, 0 < b ≤ n.

Hint: Reduce the problem to the concave minimization problem

min { h(x) | Σ_{i=1}^{n} x_i = a,  0 ≤ x_i }
where

h(x) = min { Σ_{i=1}^{n} x_i/(y_i + θ) | Σ_{i=1}^{n} y_i = b,  0 ≤ y_i ≤ 1,  i = 1, …, n }.
Chapter 10
Nonconvex Quadratic Programming
Q = UΛUᵀ,  Λ = diag(λ₁, …, λ_n),

f_i(x) = ½⟨x, Qⁱx⟩ + ⟨cⁱ, x⟩ + d_i ≤ 0,  i = 1, …, m,   (10.1)

is equivalent to a single inequality

g(x) − r‖x‖² ≤ 0

(Corollary 4.6).
5. A convex minorant of a quadratic function f(x) = ½⟨x, Qx⟩ + ⟨c, x⟩ on a rectangle M = {x | p ≤ x ≤ q} is

φ_M(x) = f(x) + r Σ_{i=1}^{n} (x_i − p_i)(x_i − q_i),
Q = Σ_{i=1}^{k} aⁱ(bⁱ)ᵀ,  i.e.,  ⟨x, Qy⟩ = Σ_{i=1}^{k} ⟨aⁱ, x⟩⟨bⁱ, y⟩,

f(x) = ½ Σ_{i=1}^{k} ⟨aⁱ, x⟩⟨bⁱ, x⟩ + ⟨c, x⟩. ∎
inf_{x∈C} L(x, y*) = max_{y∈D} inf_{x∈C} L(x, y)  and  L(x*, y*) = min_{x∈C} L(x, y*)   (10.7)

for a unique x* ∈ C. Then we have the minimax equality.

∅ ≠ C*(y*) ∩ C*(y) ⊂ C*(y*) = {x*}. Thus,

L(x*, y) ≤ L(x*, y*)  ∀y ∈ D,

and hence,

min_{x∈C} sup_{y∈D} L(x, y) = max_{y∈D} min_{x∈C} L(x, y).
Proof If f(x) and g(x) are both concave on H and one of them is not affine then clearly inf{f(x) | g(x) ≤ 0, x ∈ H} = −∞; hence inf_{x∈H} sup_{y∈R₊} L(x, y) = inf{f(x) | g(x) ≤ 0, x ∈ H} = −∞, and (10.9) is trivial. If f(x) and g(x) are both affine on H, (10.9) is a consequence of linear programming duality. Otherwise, either f(x) or g(x) is strictly convex on H. Then there exists ȳ ∈ R₊ such that f(x) + ȳg(x) is strictly convex on H, hence such that L(x, ȳ) → +∞ as x ∈ H, ‖x‖ → +∞. Since by Proposition 10.2 condition (ii) in Theorem 10.1 is satisfied, (10.9) follows from Theorem 10.1 with C = H, D = R₊.

If g(x) is not affine on H, there always is ȳ ∈ R such that f(x) + ȳg(x) is strictly convex on H (ȳ > 0 when g(x) is strictly convex on H and ȳ < 0 when g(x) is strictly concave on H). Therefore all conditions of Theorem 10.1 are satisfied again for C = H and D = R, and (10.10) follows. ∎
We can now state a generalized version of the S-Lemma, which is in fact valid for arbitrary lower semi-continuous functions f(x) and g(x), not necessarily quadratic.

Theorem 10.2 (Generalized S-Lemma (Tuy and Tuan 2013)) Let f(x) and g(x) be quadratic functions on R^n, W a closed subset of R^n, D a closed subset of R, and L(x, y) = f(x) + yg(x) for y ∈ R. Assume γ := inf_{x∈W} sup_{y∈D} L(x, y) > −∞ and that the following conditions are satisfied:

(i) Either: D = R₊ and there exists x* ∈ W such that g(x*) < 0;
    Or: D = R and there exist a, b ∈ W such that g(a) < 0 < g(b).
(ii) For any two points a, b ∈ W such that g(a) < 0 < g(b) we have

sup_{y∈D} inf_{x∈{a,b}} L(x, y) ≥ γ.
hence

L(x, y) ≥ γ ∀x ∈ E₂,  L(x, y) ≥ γ ∀x ∈ E₁.

Thus in any case for any finite set E ⊂ W there exists y ∈ D satisfying (10.12). Setting D(x) := {y ∈ D | L(x, y) ≥ γ} we then have ∩_{x∈E} D(x) ≠ ∅ for any finite set E ⊂ W. In other words, the collection of closed sets D(x), x ∈ W, has the finite intersection property. If assumption (i) holds then

either D = R₊ and f(x*) + yg(x*) → −∞ as y → +∞, so the set D(x*) = {y ∈ R₊ | L(x*, y) ≥ γ} is compact;
or D = R and f(a) + yg(a) → −∞ as y → +∞, f(b) + yg(b) → −∞ as y → −∞.

So the set D(a) = {y ∈ R | L(a, y) ≥ γ} is bounded above, while D(b) is bounded below; hence the set D(a) ∩ D(b) is compact.

Consequently, ∩_{x∈W} D(x) ≠ ∅. For every ε_k = 1/k, k = 1, 2, …, there exists y^k ∈ ∩_{x∈W} {y ∈ D | L(x, y) ≥ γ − ε_k}, i.e., y^k ∈ D such that L(x, y^k) ≥ γ − ε_k ∀x ∈ W. Obviously every y^k is contained in the compact set ∩_{x∈W} {y ∈ D | L(x, y) ≥ γ − 1}, so there exists ȳ ∈ D such that, up to a subsequence, y^k → ȳ (k → +∞). Then L(x, ȳ) ≥ γ ∀x ∈ W; hence inf_{x∈W} L(x, ȳ) = γ. This proves (10.11). ∎

Theorem 10.2 yields the most general version of the S-Lemma. In fact, if W is an affine manifold in R^n and D = R₊, condition (ii) is automatically fulfilled because for every line Δ connecting a and b in W we have, by Corollary 10.2,

sup_{y∈R₊} inf_{x∈{a,b}} L(x, y) ≥ sup_{y∈R₊} inf_{x∈Δ} L(x, y) = inf_{x∈Δ} sup_{y∈R₊} L(x, y) ≥ inf_{x∈W} sup_{y∈R₊} L(x, y) = γ.
10.2 The S-Lemma 345
(iii) is the S-Lemma with nonstrict inequalities. So Theorem 10.2 is indeed the most general version of the S-Lemma. Condition (i) is a kind of generalized Slater condition, and the proof given above is a simple elementary proof of the S-Lemma in its full generality.

Remark 10.2 By Proposition 10.2 the function f(x) + ȳg(x) in assertions (i) and (ii) of the above corollary is convex on H.
f(x) := ½⟨x, Q₀x⟩ + ⟨c₀, x⟩,  g(x) = ½⟨x, Q₁x⟩ + ⟨c₁, x⟩ + d₁,   (10.13)

v(QPA) = inf_{x∈H} sup_{y∈R₊} L(x, y),  v(QPB) = inf_{x∈H} sup_{y∈R} L(x, y).
The problem on the right-hand side of (10.14) or (10.15) is called the dual problem of (QPA) or (QPB), respectively. By Proposition 10.2, if inf_{x∈H} L(x, y) > −∞ then the function L(·, y) is convex on H. Therefore, with the usual convention inf ∅ = +∞, the supremum in the dual problem can be restricted to those y ∈ D for which the function L(·, y) is convex on H. As will be seen shortly (Remark 10.4 below), in view of this fact many properties are obvious which otherwise have sometimes been proved by unnecessarily involved arguments. Actually, it is this hidden convexity that is responsible for the striking analogy between certain properties of quadratic inequalities and corresponding ones of convex inequalities.

10.3 Single Constraint Quadratic Optimization 347

To investigate conditions guaranteeing strong duality we introduce the following assumptions:

(S) There exists x* ∈ H satisfying g(x*) < 0.
(T) There exists y* ∈ D such that L(x, y*) = f(x) + y*g(x) is strictly convex on H.
Theorem 10.3 (Duality Theorem)

(i) If assumption (S) is satisfied then (10.14) holds, and if, furthermore, v(QPA) > −∞ (which is the case if assumption (T), too, is satisfied with D = R₊), then there exists ȳ ∈ R₊ such that L(x, ȳ) is convex on H and

(iii) If assumption (T) is satisfied then (10.15) holds, where the supremum can be restricted to those y ∈ D for which L(·, y) is convex on H. If, furthermore, the dual problem has an optimal solution ȳ ∈ D, then x̄ ∈ H is an optimal solution of the primal problem if and only if

Proof (i) The existence of ȳ follows from Corollary 10.3, (i). Then x̄ is an optimal solution of (QPA) if and only if

hence, if and only if conditions (10.17) hold, by noting that L(·, ȳ) is convex on H.
(ii) When g(x) is homogeneous and takes both positive and negative values on the subspace H, the existence of ȳ follows from Corollary 10.3, (ii). When both f(x) and g(x) are homogeneous, if g(x) ≥ 0 ∀x ∈ H then inf{f(x) | g(x) ≥ 0, x ∈ H} = v(QPB), so by Corollary 10.3, (iii), v(QPB) > −∞ implies the existence of (y₁, y₂) ∈ R²₊ \ {0} such that y₁f(x) + y₂g(x) > 0 ∀x ∈ H \ {0}. If y₁ = 0 then y₂g(x) > 0 ∀x ∈ H \ {0}, hence g(x) ≠ 0 ∀x ∈ H \ {0}, a contradiction. Therefore, y₁ > 0 and (10.18) holds with ȳ = y₂/y₁ ∈ R₊. Analogously, if g(x) ≤ 0 ∀x ∈ H, then inf{f(x) | g(x) ≤ 0, x ∈ H} = v(QPB), so (10.18) holds for some ȳ ∈ R₋. Since L(x, ȳ) is convex on H, conditions (10.19) are necessary and sufficient for x̄ ∈ H to be an optimal solution.

(iii) If assumption (T) is satisfied the conclusion follows from Theorem 10.1 with C = H and D = R for (QPB). If, in addition to (T), the dual problem has an optimal solution ȳ ∈ D, then x̄ ∈ H solves the primal problem if and only if it solves the convex optimization problem min_{x∈H} L(x, ȳ), and hence, if and only if it satisfies (10.20). ∎
Remark 10.3 In the usual case H = R^n it is well known that by setting Q(y) = Q₀ + yQ₁, c(y) = c₀ + yc₁, d(y) = yd₁, the dual problem of (QPA) can be written as the SDP

max { t |  [ Q(y)  c(y) ; c(y)ᵀ  2(d(y) − t) ] ⪰ 0,  y ∈ D }.   (10.21)

is exactly described by SDP constraints (Hoang et al. 2008), so for any univariate polynomial P_n(x₁) of order n the optimization problem min_{x₁∈[a,b]} P_n(x₁) can be fully characterized by means of SDPs.
Corollary 10.5 In (QPA) where H = R^n assume condition (S). Then x̄ ∈ R^n solves (QPA) if and only if there exists ȳ ∈ R₊ satisfying

Q₀ + ȳQ₁ ⪰ 0,   (10.24)
(Q₀ + ȳQ₁)x̄ + c₀ + ȳc₁ = 0,   (10.25)
⟨Q₁x̄ + 2c₁, x̄⟩ + 2d₁ ≤ 0.   (10.26)

Proof By Theorem 10.3, (i), x̄ is an optimal solution of (QPA) if and only if there exists ȳ ∈ R₊ such that L(x, ȳ) = f(x) + ȳg(x) is convex and x̄ is a minimizer of this convex function. That is, if and only if there exists ȳ ∈ R₊ satisfying (10.24) and (10.17), i.e., (10.24)-(10.26).

If the inequality (10.26) holds strictly, i.e.,

⟨Q₁x̄ + 2c₁, x̄⟩ + 2d₁ < 0,

then g(x̄) < 0; hence g(x) ≤ 0 for all x in some neighborhood of x̄. Then x̄ is a local minimizer, hence a global minimizer, of f(x) over R^n, which happens only if Q₀ is positive definite. ∎
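Conditions (10.24)-(10.26) are easy to check numerically. For the assumed one-dimensional instance f(x) = −x² (Q₀ = −2, c₀ = 0) and g(x) = x² − 1 (Q₁ = 2, c₁ = 0, d₁ = −1), Slater holds (g(0) = −1 < 0), and the sketch below (Python; our own toy instance, not from the book) certifies x̄ = 1 with multiplier ȳ = 1:

```python
# KKT-type certificate (10.24)-(10.26) for a 1-D instance of (QPA):
#   f(x) = -x^2,  g(x) = x^2 - 1  (so the feasible set is [-1, 1]).

Q0, c0 = -2.0, 0.0
Q1, c1, d1 = 2.0, 0.0, -1.0
x, y = 1.0, 1.0

cond_24 = Q0 + y * Q1 >= 0.0                             # L(., y) convex
cond_25 = abs((Q0 + y * Q1) * x + c0 + y * c1) < 1e-12   # stationarity
cond_26 = Q1 * x * x + 2 * c1 * x + 2 * d1 <= 1e-12      # g(x) <= 0
print(cond_24, cond_25, cond_26)   # all True: x = 1 solves the problem
value = 0.5 * Q0 * x * x + c0 * x
print(value)                       # optimal value -1.0
```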
Corollary 10.6 With f, g, H defined as in (10.13) above, assume that g(x) is homogeneous and takes both positive and negative values on H. Then

y₁f(x) + y₂g(x) ≥ 0  ∀x ∈ H.

Proof Clearly (i) and (ii) cannot hold simultaneously. If (i) does not hold, two cases are possible: (1) f(x) ≥ 0 for every x ∈ H such that g(x) < 0; (2) g(x) ≥ 0 for every x ∈ H such that f(x) < 0. In case (1), if 0 is a local minimum of g(x) on H, then g(x) ≥ 0 ∀x ∈ H, so y₁f(x) + y₂g(x) ≥ 0 ∀x ∈ H with y₁ = 0, y₂ = 1; if, on
By Corollary 10.8 there exists (y₁, y₂) ∈ R²₊ \ {(0, 0)} satisfying y₁(f(x) − t₁⁰) + y₂(g(x) − t₂⁰) ≥ 0 ∀x ∈ [a, b]. Setting L_t := {s ∈ R² | y₁s₁ + y₂s₂ ≥ y₁t₁⁰ + y₂t₂⁰}, it is easily checked that C_ab ⊂ L_t while t ∉ L_t. So for every t ∉ C_ab there exists a halfplane L_t separating t from C_ab. Consequently, C_ab = ∩_{t∉C_ab} L_t, proving the convexity of C_ab.¹

¹This proof corrects a flaw in the original proof given in Tuy and Tuan (2013).
Now if t, s are any two points of C then there exist a, b ∈ H such that

r²g(x) = g(rx); hence a ∈ G because rx ∈ H. Therefore, if a ∉ G then min{a₁f(x) | a₁g(x) − a₂f(x) = 0, x ∈ H} = 0. By Corollary 10.3, (ii), there exists y ∈ R satisfying a₁f(x) + y(a₁g(x) − a₂f(x)) ≥ 0 ∀x ∈ H. Then G ⊂ L_a := {t ∈ R² | a₁t₁ + y(a₁t₂ − a₂t₁) ≥ 0}, while a ∉ L_a. Thus, every a ∉ G can be separated from G by a halfspace, proving the convexity of the set G. ∎
Assertion (i) of Corollary 10.10 is analogous to a well-known proposition for convex functions. It has been derived from Corollary 10.3, (i), but the converse is also true. In fact, suppose for given f(x), g(x), H the set C := {t ∈ ℝ² | ∃x ∈ H s.t. f(x) ≤ t₁, g(x) ≤ t₂} is convex. Clearly if the system x ∈ H, f(x) ≤ 0, g(x) ≤ 0 is inconsistent then (0, 0) ∉ C; hence, by the separation theorem, there exists (u, y) ∈ ℝ²₊ such that ut₁ + yt₂ ≥ 0 ∀(t₁, t₂) ∈ C, i.e., uf(x) + yg(x) ≥ 0 ∀x ∈ H. If, furthermore, there exists x* ∈ H such that g(x*) < 0 then (f(x*), g(x*)) ∈ C, so uf(x*) + yg(x*) ≥ 0. We cannot have u = 0, for this would imply yg(x*) ≥ 0, hence y = 0, conflicting with (u, y) ≠ 0. Therefore u > 0, and setting ȳ = y/u yields f(x) + ȳg(x) ≥ 0 ∀x ∈ H. So the fact that C is convex is equivalent to Corollary 10.3, (i).
Assertion (ii) of Corollary 10.10 (which still holds with H replaced by a regular cone) includes a result of Dines (1941) which was used by Yakubovich (1977) in his original proof of the S-Lemma. Since this result has been derived from Corollary 10.3, (ii), i.e., the S-Lemma with equality constraints, Dines' theorem is actually equivalent to the latter.
It has long been known that the general linearly constrained quadratic programming problem is NP-hard (Sahni 1974). Recently, two special variants of this problem have also been proved to be NP-hard:
1. The linearly constrained quadratic program with just one negative eigenvalue (Pardalos and Vavasis 1991):

min{(⟨c, x⟩ + γ)(⟨d, x⟩ + δ) | Ax ≤ b, x ≥ 0}
Nevertheless, quite practical algorithms exist which can solve each of the two mentioned problems in a time usually no longer than the time needed for solving a few linear programs of the same size (Konno et al. 1997). In this section we are going to discuss another special quadratic minimization problem which can also be solved by efficient algorithms (Karmarkar 1990; Ye 1992). This problem has become widely known due to several interesting features. Referring to its role in nonlinear programming methods, it is sometimes called the Trust Region Subproblem. It consists of minimizing a nonconvex quadratic function over a Euclidean ball, i.e.,

(TRS)   min{ (1/2)⟨x, Qx⟩ + ⟨c, x⟩ | ⟨x, x⟩ ≤ r² },   (10.31)

where Q is an n×n symmetric matrix and c ∈ ℝⁿ, r ∈ ℝ. A seemingly more general formulation, in which the constraint set is an ellipsoid ⟨x, Hx⟩ ≤ r² (with H a positive definite matrix), can be reduced to the above one by means of a nonsingular linear transformation.
Let u¹, …, uⁿ be a full set of orthonormal eigenvectors of Q, and let λ₁, …, λₙ be the corresponding eigenvalues. Without loss of generality one can assume that λ₁ = … = λ_k < λ_{k+1} ≤ … ≤ λₙ, and λ₁ < 0. (The latter inequality simply means that the problem is nonconvex.) By Proposition 4.18, the global minimum is achieved on the sphere ⟨x, x⟩ = r², so the problem is equivalent to

min{ (1/2)⟨x, Qx⟩ + ⟨c, x⟩ | ⟨x, x⟩ = r² }.

Also note that by a result of Martinez (1994), (TRS) has at most one local nonglobal optimal solution. No wonder that good local methods such as the DCA method (Tao 1997; Tao and Hoai-An 1998, and the references therein) can often actually give a global optimal solution.
In view of Proposition 10.4, solving (TRS) reduces to finding λ ≥ 0 satisfying (10.32)–(10.34). Condition (10.33) implies that −λ₁ ≤ λ < +∞. For λ > −λ₁, since Q + λI is positive definite, the equation Qx + λx + c = 0 has a unique solution given by

x(λ) = −(Q + λI)⁻¹c.

Clearly

lim_{λ→+∞} ‖x(λ)‖ = 0,

lim_{λ↓−λ₁} ‖x(λ)‖ = ‖x̄‖ if ⟨c, uⁱ⟩ = 0 ∀i = 1, …, k, and +∞ otherwise,

where

x̄ = −∑_{i=k+1}^n ⟨c, uⁱ⟩uⁱ / (λᵢ − λ₁).

Indeed, ⟨x(λ), uⁱ⟩ = −⟨c, uⁱ⟩/(λᵢ + λ), so

x(λ) = −∑_{i=1}^n ⟨c, uⁱ⟩uⁱ / (λᵢ + λ),   (10.35)

and consequently

‖x(λ)‖² = ∑_{i=1}^n ⟨c, uⁱ⟩² / (λᵢ + λ)².   (10.36)
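In the easy case ⟨c, u¹⟩ ≠ 0, (10.36) shows that ‖x(λ)‖² decreases strictly from +∞ to 0 as λ runs over (−λ₁, +∞), so the multiplier with ‖x(λ)‖ = r can be located by simple bisection on this one scalar equation. A minimal sketch (the function name, tolerances, and iteration counts are ours; the hard case ⟨c, uⁱ⟩ = 0, i = 1, …, k, is not handled):

```python
import numpy as np

def trs_easy_case(Q, c, r):
    """Sketch: solve min (1/2)<x,Qx> + <c,x> s.t. ||x|| <= r when lambda_1 < 0
    and <c,u1> != 0, by bisecting the secular equation ||x(lam)||^2 = r^2 with
    x(lam) = -(Q + lam*I)^{-1} c, strictly decreasing for lam > -lambda_1."""
    lams, U = np.linalg.eigh(Q)            # eigenvalues in ascending order
    b = U.T @ c                            # coordinates of c in the eigenbasis
    norm2 = lambda lam: float(np.sum(b**2 / (lams + lam) ** 2))  # (10.36)
    lo = max(0.0, -lams[0])
    hi = lo + 1.0
    while norm2(hi) > r * r:               # bracket the root: ||x(hi)|| <= r
        hi = lo + 2.0 * (hi - lo)
    for _ in range(200):                   # bisection on (lo, hi]
        mid = 0.5 * (lo + hi)
        if norm2(mid) > r * r:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    x = -np.linalg.solve(Q + lam * np.eye(len(c)), c)
    return x, lam
```

The returned pair satisfies the stationarity condition (Q + λI)x = −c with λ ≥ max(0, −λ₁) and ‖x‖ = r up to the bisection tolerance.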
10.4 Quadratic Problems with Linear Constraints 355
Of course, the algorithms presented in Chaps. 5–7 for (BCP) can be adapted to handle (CQP). In doing this one should note two points where substantial simplifications and improvements are possible due to the specific properties of quadratic functions.
The first point concerns the computation of the γ-extension (Remark 5.2). Since f(x) is concave quadratic, for any point x⁰ ∈ ℝⁿ and any direction u ∈ ℝⁿ, the function α ↦ f(x⁰ + αu) is concave quadratic in α. Therefore, for any x⁰, x ∈ ℝⁿ and any real number γ such that f(x⁰) > γ, the γ-extension of x with respect to x⁰, i.e., the point x⁰ + θ(x − x⁰) corresponding to the value

∑_{i=1}^n xᵢ/tᵢ ≥ 1.   (10.37)
So if we define M(t) = {x ∈ D | ∑_{i=1}^n xᵢ/tᵢ ≥ 1}, then {x ∈ D | f(x) < γ} ⊂ M(t). Note that by construction f(tᵢeⁱ) ≥ γ, i = 1, …, n, where as usual eⁱ is the ith unit vector. We now construct a new concave function φ_t(x) such that a γ-valid concavity cut for (φ_t, D) will also be γ-valid for (f, D) but in general deeper than the above cut. To this end define the bilinear function
Observe that

F(x, y) − F(x, x) + F(x, y) − F(y, y) = −⟨x − y, Q(x − y)⟩ ≥ 0

because Q is negative semidefinite. Thus at least one of the differences F(x, y) − F(x, x), F(x, y) − F(y, y) must be nonnegative, whence the inequality in (10.38). The argument also shows that if Q is negative definite then the inequality in (10.38) is strict for x ≠ y, because then ⟨x − y, Q(x − y)⟩ < 0.
Lemma 10.1 The function φ_t(x) := min{F(x, y) | y ∈ M(t)} is concave and satisfies (10.39).

Proof The concavity of φ_t(x) is obvious from the fact that it is the lower envelope of the family of affine functions x ↦ F(x, y), y ∈ M(t). Since φ_t(x) ≤ F(x, x) = f(x) ∀x ∈ M(t), (10.39) holds. Furthermore, min{φ_t(x) | x ∈ M(t)} ≤ min{F(x, y) | y ∈ M(t), y = x} = min{F(x, x) | x ∈ M(t)}, hence

∑_{i=1}^n xᵢ/tᵢ ≥ 1,   (10.42)
A problem that has attracted considerable interest due to its many applications is the bilinear program, whose general form is

where f(x), g(y) are convex functions on ℝⁿ and G is a compact convex subset of ℝⁿ × ℝⁿ. Since ⟨x, y⟩ = (‖x + y‖² − ‖x − y‖²)/4, the problem can be solved by rewriting it as the reverse convex program:
An alternative branch and bound method was proposed in Al-Khayyal and Falk (1983), where branching is performed by rectangular partition of ℝⁿ × ℝⁿ. To compute a lower bound for F(x, y) := f(x) + ⟨x, y⟩ + g(y) over any rectangle M = {(x, y) ∈ ℝⁿ × ℝⁿ : r ≤ x ≤ s, u ≤ y ≤ v}, observe the following relations for every i = 1, …, n:

xᵢyᵢ ≥ uᵢxᵢ + rᵢyᵢ − rᵢuᵢ   ∀xᵢ ≥ rᵢ, yᵢ ≥ uᵢ,
xᵢyᵢ ≥ vᵢxᵢ + sᵢyᵢ − sᵢvᵢ   ∀xᵢ ≤ sᵢ, yᵢ ≤ vᵢ
(the latter problem is a convex program and can be solved by standard methods). At iteration k of the branch and bound procedure, a rectangle Mₖ in the current partition is selected which has smallest β(Mₖ). Let (xᵏ, yᵏ) be the optimal solution of the bounding subproblem (10.45) for Mₖ, i.e., the point (xᵏ, yᵏ) ∈ G ∩ Mₖ such that

Using this and the basic rectangular subdivision theorem, it can then be proved that the above branch and bound procedure converges to a global optimal solution of (JCBP).
Another generalization of the bilinear programming problem is the convex–concave programming problem studied in Muu and Oettli (1991):

In particular, β(M) yields a lower bound for F(x, y) on G ∩ (X × M), which becomes the exact minimum if y_M = u_M. This suggests a branch and bound procedure in which branching is by rectangular subdivision of Y and β(M) is used as a lower bound for every subrectangle M of Y. At iteration k of this procedure, a rectangle Mₖ in the current partition is selected which has smallest β(Mₖ). Let (xᵏ, yᵏ, uᵏ) = (x^{Mₖ}, y^{Mₖ}, u^{Mₖ}). If yᵏ = uᵏ then the procedure terminates: (xᵏ, yᵏ) solves (CCP). Otherwise, select

is equivalent to

which reduces to solving a standard convex program and computing the maximum of the convex function h(y) at the 2^q corners of the rectangle M.
By bilinear programming, one usually means a problem of the form

Since for each fixed y the function x ↦ ⟨x, Qy⟩ + ⟨d, y⟩ is affine, it follows that φ(x) is concave. Therefore, the converted problem is a concave program.
Aside from the bilinear program, a number of other nonconvex problems can also be reduced to concave or quasiconcave minimization. Some of these problems (such as minimizing a sum of products of convex positive-valued functions) have been discussed in Chap. 9 (Subsect. 9.6.3).
where x ∈ ℝ^{n₁}, y ∈ ℝ^{n₂}. Earlier methods for dealing with this problem (Kough 1979; Tuy 1987b) were essentially based on Geoffrion–Benders decomposition ideas and used the representation of the objective function as a difference of two convex quadratic separable functions. Kough's method, which closely follows Geoffrion's scheme, requires the solution, at each iteration, of a relaxed master problem which is equivalent to a number of linearly constrained concave programs and, consequently, may not be easy to handle. Another approach is to convert (IQP) to a single concave minimization under convex constraints and to use for solving it a decomposition technique which extends to convex constraints the decomposition
it is easily seen that the function φ(y) is convex (see Proposition 2.6) and that dom φ ⊃ Y. Then (IQP) can be rewritten as

Here the objective function t − ⟨y, y⟩ is concave, while the constraint set D := {(y, t) ∈ ℝ^{n₂} × ℝ | φ(y) ≤ t, y ∈ Y} is convex. So (10.47) is a concave program under convex constraints, which can be solved, e.g., by the outer approximation method. Since, however, the convex feasible set in this program is defined implicitly via a convex program, the following key question must be resolved for carrying out this method (Chap. 7, Sect. 7.1): given a point (ȳ, t̄), determine whether it belongs to D and, if it does not, construct a cutting plane (in the space ℝ^{n₂} × ℝ) to separate it from D.
Clearly, (ȳ, t̄) ∈ D if and only if ȳ ∈ Y and φ(ȳ) ≤ t̄.
One has ȳ ∈ Y if and only if the linear program min{0 | Ax ≤ c − Bȳ} is feasible, hence, by duality, if and only if the linear program
Lemma 10.2 The convex program (10.50) has at least one Karush–Kuhn–Tucker vector λ, and any such vector satisfies

Bᵀλ ∈ ∂φ(ȳ).
10.5 Quadratic Problems with Quadratic Constraints 363
Proof Since ⟨x, x⟩ ≥ 0 one has φ(ȳ) > −∞, so the existence of λ follows from a classical theorem of convex programming (see, e.g., Rockafellar 1970, Corollary 28.2.2). Then

Thus, given any point (ȳ, t̄) one can check whether (ȳ, t̄) ∈ D and, if (ȳ, t̄) ∉ D, construct a cut separating (ȳ, t̄) from D (this cut is (10.49) if ȳ ∉ Y, or (10.51) if φ(ȳ) > t̄).
With the above background, a standard outer approximation procedure can be developed for solving the indefinite quadratic problem (IQP) (Tuy 1987b). At iteration k of this procedure a polytope Pₖ is at hand which is an outer approximation of the feasible set D. Let V(Pₖ) be the vertex set of Pₖ. By solving the subproblem min{t − ⟨y, y⟩ | (y, t) ∈ V(Pₖ)} a solution (yᵏ, tᵏ) is obtained. If (yᵏ, tᵏ) ∈ D the procedure terminates. Otherwise a hyperplane is constructed which cuts off this solution and determines a new polytope P_{k+1}. Under mild assumptions this procedure converges to a global optimal solution.
In recent years, BB methods have been proposed for general quadratic problems which are applicable to (IQP) as well. These methods will be discussed in the next section and also in Chap. 11, devoted to monotonic optimization.
fᵢ(x) = (1/2)⟨x, Qᵢx⟩ + ⟨cᵢ, x⟩ + dᵢ,   i = 0, 1, …, m.
For low rank nonconvex problems, i.e., for problems which are convex when a
relatively small number of variables are kept fixed, and also for problems whose
quadratic structure is implicit, outer approximation (or combined OA/BB methods)
may in many cases be quite practical. Two such procedures have been discussed
earlier: the Relief Indicator Method (Chap. 7, Sect. 7.5), designed for continuous
optimization and the Visible Point Method (Chap. 7, Sect. 7.4) which seems to work
well for problems in low dimension with many nonconvex quadratic or piecewise
affine constraints (such as those encountered in continuous location theory (Tuy
et al. 1995a)).
For solving large-scale quadratic problems with little additional special structure, branch and bound methods seem to be the most suitable. Two fundamental operations
in a BB algorithm are bounding and branching. As we saw in Chap. 5, these two
operations must be consistent in order to ensure convergence of the procedure.
Roughly speaking, the latter means that in an infinite subdivision process, the
current lower bound must tend to the exact global minimum.
A standard bounding method consists in considering for each partition set a
relaxed problem obtained from the original one by replacing the objective function
with an affine or convex minorant and enclosing the constraint set in a polyhedron
or a convex set.
a. Lower Linearization
I. Simplicial Subdivision
The function φ_M(x) will be called a lower linearization of f(x) on M. Most continuous functions are locally lower linearizable in the above sense. For example, if f(x) is concave, then φ_M(x) can be taken to be the affine function that agrees with f(x) at the vertices of M; if f(x) is Lipschitz, with Lipschitz constant L, then φ_M(x) can be taken to be the affine function that agrees with f̃(x) := f(x_M) − L‖x − x_M‖ at the vertices of M, where x_M is an arbitrary point in M (in fact, f̃(x) is concave, so φ_M(x) is a lower linearization of f̃(x), hence also a lower linearization of f(x), because by the Lipschitz condition f̃(x) ≤ f(x) and |f(x) − f̃(x)| ≤ 2L‖x − x_M‖ ∀x ∈ M). Thus special cases of problem (SNP) with locally lower linearizable gᵢ, i = 0, 1, …, m, include reverse convex programs (g₀(x) linear, gᵢ(x), i = 1, …, m, concave), as well as Lipschitz optimization problems (g₀(x), gᵢ(x), i = 1, …, m, Lipschitz).
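For the concave case just described, the agreeing affine function is obtained from a single (n+1) × (n+1) linear system; a small sketch (names are ours, not from the text):

```python
import numpy as np

def affine_interpolant(vertices, f):
    """Affine l(x) = <a,x> + b agreeing with f at the n+1 vertices of an
    n-simplex; for concave f it is a lower linearization on that simplex."""
    V = np.asarray(vertices, dtype=float)       # (n+1) x n matrix of vertices
    A = np.hstack([V, np.ones((len(V), 1))])    # rows [v_i, 1]
    coef = np.linalg.solve(A, np.array([f(v) for v in V]))
    a, b = coef[:-1], coef[-1]
    return lambda x: float(a @ x + b)
```

On conv{(0,0), (1,0), (0,1)} with the concave f(x) = −‖x‖², this yields l(x) = −x₁ − x₂, which matches f at the three vertices and underestimates it everywhere inside.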
Now assume that every function gᵢ is lower linearizable and let ℓ_{M,i}(x) be a lower linearization of gᵢ(x) on a given simplex M. Then the following linear program is a relaxation of (SNP) on M:

SP(M)   min ℓ_{M,0}(x) + h₀(y)
        s.t. ℓ_{M,i}(x) + hᵢ(y) ≤ 0, i = 1, …, m,
             (x, y) ∈ Ω, x ∈ M.

Therefore, the optimal value β(M) of this linear program is a lower bound for the objective function value on the feasible solutions (x, y) such that x ∈ M. Incorporating this lower bounding operation into an exhaustive simplicial subdivision process of the x-space yields the following algorithm:
BB Algorithm for (SNP)
Initialization. Take an n-simplex M₁ ⊂ ℝⁿ containing the projection of Ω on ℝⁿ. Let (x̄, ȳ) be the best feasible solution available, γ = g₀(x̄) (γ = +∞ if no feasible solution is known). Set R = S = {M₁}, k = 1.
Step 1. For each M ∈ S compute the optimal value β(M) of the linear program SP(M).
Step 2. Update (x̄, ȳ) and γ. Delete every M ∈ R such that β(M) ≥ γ. Let R′ be the remaining collection of simplices.
Step 3. If R′ = ∅, then terminate: (x̄, ȳ) solves (SNP) (if γ < +∞) or (SNP) is infeasible (if γ = +∞).
Step 4. Let Mₖ ∈ argmin{β(M) : M ∈ R′}, and let (xᵏ, yᵏ) be a basic optimal solution of SP(Mₖ).
Step 5. Perform a bisection or a radial subdivision of Mₖ, following an exhaustive subdivision rule. Let S′ be the partition of Mₖ.
Step 6. Set S ← S′, R ← S′ ∪ (R′ \ {Mₖ}); increase k by 1 and return to Step 1.
The convergence of this algorithm is easy to establish. Indeed, if the algorithm is infinite, it generates a filter {Mₖ} collapsing to a point x*. Since β(Mₖ) ≤ min(SNP) and (10.53) implies that ℓ_{Mₖ,i}(x*) = gᵢ(x*), i = 0, 1, …, m, it follows that if β(Mₖ) = g₀(xᵏ) + h₀(yᵏ), then xᵏ → x*, yᵏ → y* such that (x*, y*) solves (SNP).
Remark 10.6 If the lower linearizations are such that for any simplex M each function ℓ_{M,i}(x) matches gᵢ(x) at every corner of M, then as soon as xᵏ in Step 4 coincides with a corner of Mₖ the solution (xᵏ, yᵏ) will be feasible, hence optimal, to the problem. Therefore, in this case an adaptive subdivision rule to drive xᵏ eventually to a corner of Mₖ should ensure much faster convergence (Theorem 6.4).
If f(x) is continuous on M, then its convex envelope φ(x) := conv f(x) over M must be a tight convex minorant, for if min_{x∈M} (f(x) − φ(x)) = δ > 0 then the function φ(x) + δ would give a larger convex minorant.
Example 10.1 The function max{py + rx − pr, qy + sx − qs} is a tight convex minorant of xy on [p, q] × [r, s]. It matches xy for x = p or x = q (and y arbitrary in the interval r ≤ y ≤ s).
Proof This function is the convex envelope of the product xy on the rectangle [p, q] × [r, s] (Corollary 4.6). The second assertion follows from the inequalities xy − (py + rx − pr) = (x − p)(y − r) ≥ 0, xy − (qy + sx − qs) = (q − x)(s − y) ≥ 0, which hold for every (x, y) ∈ [p, q] × [r, s]. □
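Example 10.1 is easy to sanity-check numerically (variable names are ours): the maximum of the two affine pieces never exceeds xy on the box and coincides with it on the faces x = p and x = q.

```python
import numpy as np

def mccormick_under(x, y, p, q, r, s):
    """Convex minorant of xy on [p,q] x [r,s] from Example 10.1:
    the pointwise maximum of the two supporting affine functions."""
    return max(p * y + r * x - p * r, q * y + s * x - q * s)
```

The same two inequalities are exactly the Al-Khayyal and Falk (1983) bounding relations used earlier for the bilinear term ⟨x, y⟩, applied coordinatewise.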
Example 10.2 Let f(x) = ⟨x, Qx⟩. The function

φ_M(x) = f(x) + r ∑_{i=1}^n (xᵢ − pᵢ)(xᵢ − qᵢ),

Proof We can write f(x) = ∑_{i=1}^n fᵢ(x) with fᵢ(x) = xᵢyᵢ, yᵢ = Qᵢx, so the result simply follows from the fact that φ_{M,i}(x) is a convex minorant of fᵢ(x) and φ_{M,i}(x) = fᵢ(x) when xᵢ ∈ {pᵢ, qᵢ} (Example 10.1). □
Consider now the nonconvex optimization problem

By similar arguments to the above, it is easily seen that a concave majorant of ⟨x, Q^h x⟩ is ψ_M^h(x) = ∑_{i=1}^n ψᵢ^h(x) with ψᵢ^h(x) = min{pᵢQᵢ^h x + Kᵢ^h xᵢ − pᵢKᵢ^h, qᵢQᵢ^h x + kᵢ^h xᵢ − qᵢkᵢ^h}.

min  φ_M^0(x) + ⟨c⁰, x⟩
s.t. t_h + ⟨c^h, x⟩ + d_h ≤ 0,  h = 1, …, m,
     φ_M^h(x) ≤ t_h ≤ ψ_M^h(x),  h = 1, …, m,
     x ∈ M ∩ D.
Tight convex minorants of the above type are also used in Androulakis et al. (1995) for the general nonconvex problem (10.59). By writing each function f_h(x), h = 0, 1, …, m, in (10.59) as a sum of bilinear terms of the form b_{ij}xᵢxⱼ and a nonconvex function with no special structure, a convex minorant φ_{M,h}(x) of f_h(x) on a rectangle M can easily be computed from Examples 10.1 and 10.2. Such a convex minorant is tight because it matches f_h(x) at any corner of M; furthermore, as is easily seen, φ_{M,h}(x), h = 0, 1, …, m, satisfy the conditions formulated in Theorem 10.4. The convergence of the BB algorithm in Androulakis et al. (1995) then follows.
The reader is referred to the cited papers (Al-Khayyal et al. 1995; Androulakis et al. 1995) for details and computational results concerning the above algorithms. In fact the convex minorants in Examples 10.1–10.3 could be obtained from the general rules for computing convex minorants of factorable functions (see Chap. 4, Sect. 4.7). In other words, the above algorithms could be viewed as specific realizations of the general scheme of McCormick (1982) for factorable programming.
Remark 10.7 (i) If the convex minorants used are of the types given in Examples 10.1–10.3, then for any rectangle M each function φ_{M,i} matches fᵢ at the corners of M. In that case, the convergence of the algorithm would be faster by using ω-subdivision, viz: at iteration k divide Mₖ = [pᵏ, qᵏ] via (ωᵏ, jₖ), where ωᵏ is an optimal solution of the relaxed problem at iteration k and jₖ is a properly chosen index (jₖ ∈ argmax{δ_{k,j} | j = 1, …, n} where δ_{k,j} = min{ωⱼᵏ − pⱼᵏ, qⱼᵏ − ωⱼᵏ}); see Theorem 6.4.
(ii) A function f(x) is said to be locally convexifiable on a compact set C ⊂ ℝⁿ with respect to rectangular subdivision if there exists for each rectangle M ⊂ C a convex minorant φ_M(x) satisfying (10.53) (this convex minorant is then said to be asymptotically tight). By an argument analogous to the proof of Theorem 10.4 it is easy to show that if the functions fᵢ(x), i = 0, 1, …, m, in (10.59) are locally convexifiable and the convex minorants φ_{M,i} used for bounding are asymptotically tight, while the subdivision is exhaustive, then the BB algorithm converges. For the applications, note that if two functions f₁(x), f₂(x) are locally convexifiable then so is their sum f₁(x) + f₂(x).
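The subdivision rule of (i) takes only a few lines; a sketch (names ours) that picks jₖ ∈ argmaxⱼ min{ωⱼ − pⱼ, qⱼ − ωⱼ} and cuts [p, q] at ω_{jₖ}:

```python
def omega_bisect(p, q, omega):
    """w-subdivision step from Remark 10.7(i): split the rectangle [p,q] at
    the coordinate j where omega is deepest inside, i.e.
    j = argmax_j min{omega_j - p_j, q_j - omega_j}, cutting at x_j = omega_j.
    Returns the two subrectangles as (p, q) pairs."""
    depths = [min(w - lo, hi - w) for lo, hi, w in zip(p, q, omega)]
    j = max(range(len(p)), key=lambda i: depths[i])
    q_left = list(q); q_left[j] = omega[j]
    p_right = list(p); p_right[j] = omega[j]
    return (list(p), q_left), (p_right, list(q))
```

The same index selection reappears below for the reformulation–linearization algorithm, with ω replaced by the current relaxed solution xᵏ.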
c. Decoupling Relaxation
omitted, the problem becomes linear, or convex, or easily solvable. The methods proposed in Muu and Oettli (1991), Muu (1993), and Muu (1996) for convex–concave problems, and also the reformulation–linearization method of Sherali and Tuncbilek (1992, 1995), can be interpreted as different variants of variable decoupling relaxations.
d. Convex–Concave Approach
where X ⊂ ℝ^{n₁}, Y ⊂ ℝ^{n₂} are polytopes, C is a compact convex set in ℝ^{n₁} × ℝ^{n₂}, and the function f(x, y) is convex–concave on X × Y. Clearly, linearly (or convexly) constrained quadratic programs and, more generally, d.c. programming problems with convex constraints are of this form. Since by fixing y the problem is convex, a branch and bound method for solving (CCAP) should branch upon y. Using, e.g., rectangular partition of the y-space, one has to compute, for a given rectangle M ⊂ ℝ^{n₂}, a lower bound β(M) for

The difficulty of this subproblem is due to the fact that convexity and concavity are coupled in the constraints. To get round this difficulty the idea is to decouple x and y by reformulating the problem as

and omitting the constraint u = y. Then, obviously, the optimal value in the relaxed problem

is a lower bound of the optimal value in (CCAP). Let V(M) be the vertex set of M. By concavity of f(x, y) in y we have

For each fixed y the problem min{f(x, y) | (x, u) ∈ C ∩ (X × Y)} is convex (or linear if f(x, y) is affine in x and C is a polytope), so (10.63) can be solved by standard algorithms. If (x^M, y^M, u^M) is an optimal solution of RP(M), then along with the lower bound f(x^M, y^M) = β(M) we also obtain a feasible solution (x^M, u^M) which can be used to update the current incumbent. Furthermore, when y^M = u^M, β(M) equals the exact minimum in (10.63).
At iteration k of the branch and bound procedure let Mₖ be the rectangle such that β(Mₖ) is smallest among all partition sets still of interest. Denote (xᵏ, yᵏ, uᵏ) = (x^{Mₖ}, y^{Mₖ}, u^{Mₖ}). If uᵏ = yᵏ then, as mentioned above, β(Mₖ) = f(xᵏ, yᵏ) while (xᵏ, yᵏ) is a feasible solution. Since β(Mₖ) ≤ min(CCAP), it follows that (xᵏ, yᵏ) is an optimal solution of (CCAP). Therefore, if uᵏ = yᵏ the procedure terminates. Otherwise, Mₖ must be further subdivided. Based on the above observation, one should choose a subdivision method which, intuitively, could drive ‖yᵏ − uᵏ‖ to zero more rapidly than an exhaustive subdivision process. A good choice, confirmed by computational experience, is an adaptive subdivision via (yᵏ, uᵏ), i.e., to subdivide Mₖ via the hyperplane y_{iₖ} = (y^k_{iₖ} + u^k_{iₖ})/2, where
Proposition 10.7 The rectangular algorithm with the above branching and bounding operations is convergent.
Proof If the algorithm is infinite, it generates at least a filter of rectangles, say M_{k_q}, q = 1, 2, …. To simplify the notation we write M_q for M_{k_q}. Let M_q = [a^q, b^q], so that by a basic property of rectangular subdivision (Lemma 5.4) one can suppose, without loss of generality, that i_q = 1, [a^q, b^q] → [a, b], (y^q₁ + u^q₁)/2 → ȳ₁ ∈ {a₁, b₁}. The latter relation implies that |y^q₁ − u^q₁| → 0, hence y^q − u^q → 0, because |y^q₁ − u^q₁| = max_{i=1,…,n₂} |y^qᵢ − u^qᵢ| by the definition of i_q = 1. Let x̄ = lim x^q, ȳ = lim y^q (= lim u^q). Then β(M_q) = f(x^q, y^q) → f(x̄, ȳ), and since β(M_q) ≤ min(CCAP) for all q, while (x̄, ȳ) is feasible, it follows that (x̄, ȳ) is an optimal solution. □
This method is practicable if n₂ is relatively small. Often, instead of rectangular subdivision, it may be more convenient to use simplicial subdivision. In that case, a simplex Mₖ is subdivided via ωᵏ = (yᵏ + uᵏ)/2 according to the ω-subdivision rule. The method can also be extended to problems in which convex and nonconvex variables are coupled in certain joint constraints, e.g., to problems of the form
e. Reformulation–Linearization
and let
Assume that the linear inequalities defining X and Y are such that both systems

⟨aⁱ, x⟩ − αᵢ ≥ 0,  i = 1, …, h₁,   (10.70)
⟨bʲ, y⟩ − βⱼ ≥ 0,  j = 1, …, h₂,   (10.71)

Proof It suffices to show that (10.75) implies (x, y) ∈ X × Y. Let (x, y) satisfy (10.75). By (10.72) there exists for this x an index j₀ ∈ {1, …, h₂} such that ⟨b^{j₀}, y⟩ − β_{j₀} < 0. Then for any i = 1, …, h₁, since

(⟨aⁱ, x⟩ − αᵢ)(⟨b^{j₀}, y⟩ − β_{j₀}) ≥ 0,

(⟨a, x⟩ − α)(⟨b, y⟩ − β)_L = ∑_{l=1}^{n₁} ∑_{l′=1}^{n₂} a_l b_{l′} w_{ll′} − α⟨b, y⟩ − β⟨a, x⟩ + αβ.
Let fᵢ(x, y)_L = φᵢ(x, y, w), i = 0, 1, …, m, and for every (i, j) let (⟨aⁱ, x⟩ − αᵢ)(⟨bʲ, y⟩ − βⱼ)_L = L_{ij}(x, y, w). Then the bilinear program (10.69) can be written as

min φ₀(x, y, w)
s.t. φᵢ(x, y, w) ≤ 0,  i = 1, …, m,
     L_{ij}(x, y, w) ≥ 0,  i = 1, …, h₁, j = 1, …, h₂,   (10.76)
     w_{ll′} = x_l y_{l′},  l = 1, …, n₁, l′ = 1, …, n₂.

A new feature of the latter problem is that the only nonlinear elements in it are the coupling constraints w_{ll′} = x_l y_{l′}. Let (LP) be the linear program obtained from (10.76) by omitting these coupling constraints:

(LP)  min φ₀(x, y, w)
      s.t. φᵢ(x, y, w) ≤ 0,  i = 1, …, m,
           L_{ij}(x, y, w) ≥ 0,  i = 1, …, h₁, j = 1, …, h₂.
Proposition 10.8 The optimal value in the linear program (LP) is a lower bound of the optimal value in (10.69). If an optimal solution (x̄, ȳ, w̄) of (LP) satisfies w̄_{ll′} = x̄_l ȳ_{l′} ∀l, l′, then it solves (10.69).
Notice that trivial bounds −∞ < w < +∞ would have been obtained were the original linear constraints used instead of the constraints L_{ij}(x, y, w) ≥ 0. Thus, although it is more natural to linearize nonlinear constraints, it may sometimes be useful to transform linear constraints into formally nonlinear constraints.
From the above results we now derive a BB algorithm for solving (10.69). For simplicity, consider the special case of (10.69) when x = y, i.e., the general nonconvex quadratic program

where

f₀(x) = (1/2)⟨x, Q₀x⟩ + ⟨c⁰, x⟩,  fᵢ(x) = (1/2)⟨x, Qᵢx⟩ + ⟨cⁱ, x⟩ + dᵢ.   (10.78)

Given any rectangle M in ℝⁿ, the set X ∩ M can always be described by a system

⟨aⁱ, x⟩ − αᵢ ≥ 0,  i = 1, …, h,   (10.79)

Denote by L_{ij}(x, w) the function that results from (⟨aⁱ, x⟩ − αᵢ)(⟨aʲ, x⟩ − αⱼ) by the substitution w_{ll′} = x_l x_{l′}, 1 ≤ l ≤ l′ ≤ n. Also let
φ₀(x, w) = ((1/2)⟨x, Q₀x⟩ + ⟨c⁰, x⟩)_L,
φᵢ(x, w) = ((1/2)⟨x, Qᵢx⟩ + ⟨cⁱ, x⟩ + dᵢ)_L,  i = 1, …, m.

Then a lower bound of f₀(x) over all feasible points x in X ∩ M is given by the optimal value β(M) of the linear program

LP(M)  min φ₀(x, w)
       s.t. φᵢ(x, w) ≤ 0,  i = 1, …, m,
            L_{ij}(x, w) ≥ 0,  1 ≤ i ≤ j ≤ h.
To use this lower bounding for algorithmic purposes, one has to construct a subdivision process consistent with it, i.e., such that the resulting BB algorithm is convergent. The next result gives the foundation for such a subdivision process.
Proposition 10.9 Let M = [p, q]. If (x, w) is a feasible point of LP(M) such that x is a corner of the rectangle M, then (x, w) satisfies the coupling constraints w_{ll′} = x_l x_{l′} for all l, l′.
Proof Since x is a corner of M = [p, q] one has, for some I ⊂ {1, …, n}: xᵢ = pᵢ for i ∈ I and xⱼ = qⱼ for j ∉ I. The feasibility of (x, w) implies that ((x_l − p_l)(q_{l′} − x_{l′}))_L ≥ 0, ((x_l − p_l)(x_{l′} − p_{l′}))_L ≥ 0 for any l, l′, i.e.,

For l ∈ I we have x_l = p_l, so

hence w_{ll′} = x_l x_{l′}. On the other hand, ((q_l − x_l)(q_{l′} − x_{l′}))_L ≥ 0, ((q_l − x_l)(x_{l′} − p_{l′}))_L ≥ 0, i.e.,

For l ∉ I we have x_l = q_l, so
solution of the bounding problem, so that β(Mₖ) = φ₀(xᵏ, wᵏ). If wᵏ_{ll′} = xᵏ_l xᵏ_{l′} for all l, l′ (in particular, if xᵏ is a corner of Mₖ), then xᵏ is an optimal solution of (10.77) and the procedure terminates. Otherwise, bisect Mₖ via (xᵏ, iₖ), where

iₖ ∈ argmax{δᵏᵢ | i = 1, …, n},  δᵏᵢ = min{xᵏᵢ − pᵏᵢ, qᵏᵢ − xᵏᵢ}.
Setting w₁₁ = x₁², w₁₂ = x₁x₂, w₂₂ = x₂², the first bounding problem is to minimize the linear function −w₁₁ + 24x₁ − w₂₂ − 144 subject to the (5·6)/2 = 15 linear constraints corresponding to the different pairwise products of the factors

The solution to this linear program is (x₁, x₂, w₁₁, w₁₂, w₂₂) = (8, 6, 192, 48, 72), with objective function value −216. By bisecting the feasible set X into M₁⁻ = {x ∈ X | x₁ ≤ 8} and M₁⁺ = {x ∈ X | x₁ ≥ 8} and solving the corresponding bounding problems, one obtains a lower bound of −180 for both. Furthermore, the solution (x₁, x₂, w₁₁, w₁₂, w₂₂) = (0, 6, 0, 0, 36) to the bounding problem on M₁⁻ is such that (x₁, x₂) = (0, 6) is a vertex of M₁⁻. Hence, by Proposition 10.9, (0, 6) is an optimal solution of the original problem and −180 is the optimal value. This result could also be checked by solving the problem as a concave minimization over a polytope (the vertices of this polytope are (0, 0), (0, 6), (8, 12), (24, 6), and (24, 0)).
Remark 10.9 Since the number of constraints L_{ij}(x, w) ≥ 0 is often quite large, it may sometimes be more practical to use only a selected subset of these constraints, even if this may result in a poorer bound. On the other hand, it is always possible to add some convex constraints of the form xᵢ² ≤ wᵢᵢ to obtain a convex rather than linear relaxation of the original problem. This may give, of course, a tighter bound at the expense of more computational effort.
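The concave-minimization check at the end of the example can be reproduced directly. Reading the quadratic objective off its linearization −w₁₁ + 24x₁ − w₂₂ − 144 gives f(x) = −x₁² − x₂² + 24x₁ − 144 (our reconstruction, not restated explicitly in this passage), and a concave function attains its minimum over the polytope at one of the listed vertices:

```python
# Objective reconstructed from the linearized form -w11 + 24*x1 - w22 - 144
f = lambda x: -x[0] ** 2 - x[1] ** 2 + 24 * x[0] - 144

# Vertices of the feasible polytope X, as listed in the text
vertices = [(0, 0), (0, 6), (8, 12), (24, 6), (24, 0)]
values = {v: f(v) for v in vertices}
best = min(values, key=values.get)   # a vertex attaining the concave minimum
```

Evaluating f at the five vertices reproduces the optimal value −180, attained at (0, 6) (and also at (24, 6)), in agreement with the bound returned by LP relaxation on M₁⁻.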
where X is a convex set in ℝⁿ and f₀(x), fᵢ(x) are arbitrary real-valued functions. The function

L(x, u) = f₀(x) + ∑_{i=1}^m uᵢfᵢ(x),  u ∈ ℝ^m₊,

the Lagrange dual problem. Clearly θ(u) ≤ f*_X ∀u ∈ ℝ^m₊, hence

θ*_X ≤ f*_X,   (10.82)

i.e., θ*_X always provides a lower bound for f*_X. The difference f*_X − θ*_X ≥ 0 is referred to as the Lagrange duality gap. When the gap is zero, i.e., strong duality holds, then θ*_X is the exact optimal value of (10.80).
Note that the function θ(u) is concave on every convex subset of its domain, as it is the lower envelope of a family of affine functions in u. Furthermore:
Proposition 10.11 Assume X is compact while f₀(x) and fᵢ(x) are continuous on X. Then θ(u) is defined and concave on ℝ^m₊, and if θ(ū) = L(x̄, ū) then the vector (fᵢ(x̄), i = 1, …, m) is a subgradient of θ(u) at ū.
Proof It is enough to prove the last part of the proposition. For any u ∈ ℝ^m₊ we have

θ(u) − θ(ū) ≤ L(x̄, u) − L(x̄, ū) = ∑_{i=1}^m (uᵢ − ūᵢ)fᵢ(x̄).
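Proposition 10.11 is exactly what a supergradient-type ascent on the dual needs: at the current u, minimize L(·, u) over X and step along the vector (fᵢ(x̄)). A toy one-dimensional instance of (10.80) (problem data, grid, and step sizes are ours; note the positive duality gap in this instance):

```python
import numpy as np

# Toy instance of (10.80): min f0(x) s.t. f1(x) <= 0, x in X = [-1, 2].
X = np.linspace(-1.0, 2.0, 3001)       # compact X, discretized
f0 = lambda x: -x**2                   # nonconvex objective
f1 = lambda x: x - 1.0                 # single constraint

def theta(u):
    """Dual function theta(u) = min_x L(x,u); by Proposition 10.11 the value
    of f1 at a minimizer is a (super)gradient of the concave theta at u."""
    L = f0(X) + u * f1(X)
    j = int(np.argmin(L))
    return float(L[j]), float(f1(X[j]))

u, best = 0.0, -np.inf                 # projected ascent on u >= 0
for k in range(1, 2001):
    val, g = theta(u)
    best = max(best, val)              # best dual lower bound found so far
    u = max(0.0, u + g / k)            # diminishing steps 1/k
```

Here best climbs to the dual optimum θ* = −3, a valid lower bound for the primal optimum f* = −1; the gap of 2 illustrates a nonzero Lagrange duality gap.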
θ*_X = f*_X.

Proof This is a classical result from convex programming theory, which can also be directly derived from Theorem 2.3 on convex inequalities. □
Proposition 10.13 In (10.80) where X = ℝⁿ, assume that the concave function u ↦ inf_{x∈ℝⁿ} L(x, u) attains its maximum at a point ū ∈ ℝ^m₊ such that L(x, ū) is strictly convex. Then strong duality holds, so

θ*_X = f*_X.
A = [ 2 6 1 0 3 3 2 6 2 2
      6 5 8 3 0 1 3 8 9 3
      5 6 5 3 8 8 9 2 0 9
      9 5 0 9 1 8 3 9 9 3
      8 7 4 5 9 1 7 1 3 2 ]

c = (48, 42, 48, 45, 44, 41, 47, 42, 45, 46)ᵀ,  b = (4, 22, 3, 23, 12)ᵀ,

x̄ = (1, 1, 0, 1, 1, 1, 0, 1, 1, 1)ᵀ,

which is feasible to (10.83). The optimal value is 47, with zero duality gap by Proposition 10.13.
Proposition 10.14 In (10.80) where X = ℝⁿ, assume there exists ȳ ∈ ℝ^m₊ such that the function L(x, ȳ) := f(x) + ∑_{i=1}^m ȳᵢgᵢ(x) is convex and has a minimizer x̄ satisfying ȳᵢgᵢ(x̄) = 0, gᵢ(x̄) ≤ 0 ∀i = 1, …, m. Then strong duality holds, so

θ*_X = f*_X.
Example 10.6 Consider the following nonconvex quadratic problem, which arises from the so-called optimal beamforming problem in wireless communication (Bengtsson and Ottersen 2001):

min_{xᵢ∈ℝ^{Nᵢ}, i=1,2,…,d}  ∑_{i=1}^d ‖xᵢ‖²  s.t.  −xᵢᵀRᵢᵢxᵢ + ∑_{j≠i} xⱼᵀRᵢⱼxⱼ + σᵢ ≤ 0,  i = 1, 2, …, d,   (10.84)

where

0 < σᵢ,  0 ⪯ Rᵢⱼ ∈ ℝ^{Nⱼ×Nⱼ},  i, j = 1, 2, …, d,   (10.85)

where L(x, y) = ∑_{i=1}^d xᵢᵀ(I + ∑_{j≠i} yⱼRⱼᵢ − yᵢRᵢᵢ)xᵢ + ∑_{i=1}^d σᵢyᵢ.
Then the inverse matrix R⁻¹ is positive, i.e., all its entries are positive.
Proof Define the positive diagonal matrix D := diag[rᵢᵢ]_{i=1,2,…,d} and the nonnegative matrix G = D − R, so R = D − G. Then (10.87) means σ := Dy − Gᵀy > 0 and 0 < D⁻¹Gᵀy = y − D⁻¹σ < y. Due to (10.87), without loss of generality we can assume rᵢⱼ < 0 for i ≠ j, so GD⁻¹ is irreducible. Let λ_max[·] stand for the maximal eigenvalue of a matrix. As D⁻¹Gᵀ is obviously nonnegative, by the Perron–Frobenius theorem (Gantmacher 1960, p. 69) we can write

λ_max[GD⁻¹] = λ_max[D⁻¹Gᵀ] = min_{y>0} max_{i=1,2,…,d} (D⁻¹Gᵀy)ᵢ/yᵢ ≤ max_{i=1,2,…,d} (D⁻¹Gᵀy)ᵢ/yᵢ < 1,
which means

sup_{y∈ℝ^d₊} { ∑_{i=1}^d σᵢyᵢ : I + ∑_{j≠i} yⱼRⱼᵢ − yᵢRᵢᵢ ⪰ 0, i = 1, 2, …, d },   (10.88)
Then setting

ỹⱼ = yⱼ (j ≠ i),  ỹⱼ = ỹᵢ (j = i)

yields a feasible solution ỹ ∈ ℝ^d₊ to (10.88) satisfying ∑_{i=1}^d σᵢyᵢ < ∑_{i=1}^d σᵢỹᵢ, so y cannot be an optimal solution of (10.88).
Thus, at optimality each matrix

I + ∑_{j≠i} yⱼRⱼᵢ − yᵢRᵢᵢ
hence, by setting R = [rᵢⱼ]_{i,j=1,2,…,d} with rᵢᵢ := x̃ᵢᵀRᵢᵢx̃ᵢ and rᵢⱼ := −x̃ⱼᵀRᵢⱼx̃ⱼ, i ≠ j,

By Lemma 10.4, R⁻¹ is positive, so p = R⁻¹σ > 0, which is the solution of the linear equations

pᵢ(x̃ᵢᵀRᵢᵢx̃ᵢ) − ∑_{j≠i} pⱼ(x̃ⱼᵀRᵢⱼx̃ⱼ) = σᵢ,  i = 1, 2, …, d.   (10.92)

With such p it is obvious that {xᵢ := √pᵢ x̃ᵢ, i = 1, 2, …, d} is feasible to (10.84) and satisfies (10.89), completing the proof of (10.86). □
Remark 10.10 The main result in Bengtsson and Ottersen (2001) states that the
convex relaxation of (10.84), i.e., the convex program
X
d
min Trace.Xi / W Trace.Xi Rii /
0 Xi 2RNi Ni ; iD1;2;:::;d
iD1
X
Trace.Xj Rij / C i ; i D 1; 2; : : : ; d; (10.93)
ji
problem (10.93) is precisely max min L.x; y/: In other words the above result of
y2RdC xi 2RNi
Bengtsson-Ottersen is just equivalent to Corollary 10.12 which, besides, has been
obtained here by a more straightforward and rigorous argument.
Also note that the computation of the optimal solution of the nonconvex program (10.84), which is based on the optimal solution $\bar y$ of the convex Lagrangian dual (10.88) and on $x_i = \sqrt{p_i}\,\tilde x_i$, $i = 1, 2, \ldots, d$, by (10.91) and (10.92), is much more efficient than the one using the optimal solutions of the convex program (10.93). This is because the convex program (10.93) has multiple optimal solutions, so its rank-one optimal solution is not easily computed.
We have shown above several examples where strong duality holds, so that the
optimal value of the dual equals exactly the optimal value of the original problem.
In the general case, a gap exists which in a sense reflects the extent to which the
given problem fails to be convex and regular. Quite often, however, the gap can be
reduced by gradually restricting the set X. More specifically, it may happen that:
(*) Whenever a nested sequence of rectangles $M_1 \supset M_2 \supset \cdots$ collapses to a singleton, $\bigcap_{k=1}^{\infty} M_k = \{x\}$, then $f^*_{M_k} - \beta_{M_k} \to 0$.
Under these circumstances, a BB algorithm for (10.80) can be developed in the standard way, such that:

- Branching follows a selected exhaustive rectangular subdivision rule.
- Bounding is by Lagrangian relaxation: for each rectangle $M$ the value $\beta_M$ is taken to be a lower bound for $f^*_M$.
- The candidate for further subdivision at iteration $k$ is $M_k \in \mathrm{argmin}\{\beta_M \mid M \in \mathcal{R}_k\}$, where $\mathcal{R}_k$ denotes the collection of all partition sets currently still of interest.
- Termination criterion: at each iteration $k$ a new feasible solution is computed and the incumbent $x^k$ is set equal to the best among all feasible solutions so far obtained. If $f(x^k) = \beta_{M_k}$, then terminate ($x^k$ is an optimal solution).
Theorem 10.5 Under assumption (*) the above BB algorithm converges. In other words, whenever infinite, the algorithm generates a sequence $\{M_k\}$ such that $\beta_{M_k} \nearrow f^* := f^*_X$.

Proof By the selection of $M_k$ we have $\beta_{M_k} \le f^*$ for all $k$. By exhaustiveness of the subdivision, if the procedure is infinite it generates at least one filter $\{M_k\}$ collapsing to a singleton $\{x\}$. Since $\beta_{M_k} \le f^* \le f^*_{M_k}$ and since obviously $\beta_{M_{k+1}} \ge \beta_{M_k}$ for all $k$, it follows from Assumption (*) that $\beta_{M_k} \nearrow f^*$. $\square$
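The scheme above can be sketched in code. The following is a minimal illustration, not from the text: the Lagrangian bound is replaced by a simple interval-arithmetic lower bound on a toy one-dimensional problem, and all function names are ours.

```python
import heapq, itertools

def minimize_bb(f, lower_bound, box, tol=1e-6):
    """Branch and bound over a box ((l1,h1),...): bisect the longest side,
    keep the partition sets in a heap keyed by their lower bound, and stop
    when the incumbent value meets the smallest bound (within tol)."""
    center = tuple((l + h) / 2 for l, h in box)
    best_x, best_val = center, f(center)
    tie = itertools.count()                      # tie-breaker for the heap
    heap = [(lower_bound(box), next(tie), box)]
    while heap:
        bound, _, M = heapq.heappop(heap)        # M_k in argmin of the bounds
        if best_val - bound <= tol:              # f(x^k) ~ beta_{M_k}: stop
            break
        i = max(range(len(M)), key=lambda j: M[j][1] - M[j][0])
        lo, hi = M[i]
        mid = (lo + hi) / 2                      # exhaustive bisection
        for edge in ((lo, mid), (mid, hi)):
            child = M[:i] + (edge,) + M[i + 1:]
            c = tuple((l + h) / 2 for l, h in child)
            if f(c) < best_val:                  # update the incumbent
                best_x, best_val = c, f(c)
            heapq.heappush(heap, (lower_bound(child), next(tie), child))
    return best_x, best_val

# toy instance: min (x^2 - 1)^2 over [-2, 2], bounded by interval arithmetic
def f(x):
    return (x[0] ** 2 - 1.0) ** 2

def lower_bound(box):
    l, h = box[0]
    sq_lo = 0.0 if l <= 0.0 <= h else min(l * l, h * h)  # range of x^2
    t_lo, t_hi = sq_lo - 1.0, max(l * l, h * h) - 1.0    # range of x^2 - 1
    return 0.0 if t_lo <= 0.0 <= t_hi else min(t_lo ** 2, t_hi ** 2)

x_opt, val = minimize_bb(f, lower_bound, ((-2.0, 2.0),))
```

Here the interval bound becomes exact as the boxes shrink to a point, which plays the role of assumption (*).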
10.5.3 Application
a. Bounds by Lagrange relaxation
Following Ben-Tal et al. (1994) let us apply the above method to the problem:
$$(P_X) \qquad f^*_X = \min\{c^T y \mid A(x)y \ge b,\; y \ge 0,\; x \in X\}, \tag{10.94}$$
$$\ldots = \langle b, u\rangle + h(u),$$
where
$$h(u) = \begin{cases} 0, & \text{if } -\langle A(x), u\rangle + c \ge 0 \quad \forall x \in X, \\ -\infty, & \text{otherwise.} \end{cases}$$
The last equality is due to the fact that the infimum of a linear function of $y$ over the orthant $y \ge 0$ is $0$ or $-\infty$ depending on whether the coefficients of this linear function are all nonnegative or not. The conclusion follows. $\square$
Denote the column $j$ of $A(x)$ by $A_j(x)$ and for any given rectangle $M \subset X$ define
$$g_M(u) = \min_{j=1,\ldots,p}\, \min_{x \in M}\, \big( -\langle A_j(x), u\rangle + c_j \big).$$
Since for each fixed $x$ the function $u \mapsto -\langle A_j(x), u\rangle + c_j$ is affine, $g_M(u)$ is a concave function and the optimal value of the dual $(DP_M)$ is
$$\beta_M = \max\{\langle b, u\rangle \mid g_M(u) \ge 0,\; u \ge 0\}.$$
Consider now a filter $\{M_k\}$ of subrectangles of $X$ such that $\bigcap_{k=1}^{\infty} M_k = \{x\}$. For brevity write $f^*_k, \beta_k, \beta_x$ for $f^*_{M_k}, \beta_{M_k}, \beta_{\{x\}}$, respectively. We have
$$f^*_k \ge \min\{\langle c, y\rangle \mid A(x)y \ge b,\; y \ge 0\} \quad (\text{since } x \in M_k\ \forall k),$$
$$\min\{\langle c, y\rangle \mid A(x)y \ge b,\; y \ge 0\} = \max\{\langle b, u\rangle \mid -\langle A(x), u\rangle + c \ge 0,\; u \ge 0\} \quad (\text{dual linear programs}),$$
$$\beta_x = \max\{\langle b, u\rangle \mid -\langle A(x), u\rangle + c \ge 0,\; u \ge 0\} \quad (\text{definition of } \beta_{\{x\}});$$
hence,
$$f^*_k \ge \beta_x. \tag{10.97}$$
Also, setting
$$L = \{u \in \mathbb{R}^m_+ \mid -\langle A(x), u\rangle + c \ge 0\}, \qquad L_k = \{u \in \mathbb{R}^m_+ \mid -\langle A(x), u\rangle + c \ge 0 \;\; \forall x \in M_k\},$$
we can write
$$\beta_k = \max\{\langle b, u\rangle \mid u \in L_k\}, \qquad \beta_x = \max\{\langle b, u\rangle \mid u \in L\},$$
$$L_1 \subset \cdots \subset L_k \subset \cdots \subset L, \qquad \beta_1 \le \cdots \le \beta_k \le \cdots \le \beta_x.$$
(B1) $(\forall x \in X)\ (\exists u \in \mathbb{R}^m_+)\colon\ -\langle A(x), u\rangle + c > 0.$

Then
$$f^*_k - \beta_k \to 0.$$
Proof For all $k$ we have, by (10.97), $\beta_k \le \beta_x \le f^*_k$, so it suffices to show that
$$\beta_k \nearrow \beta_x \quad (k \to +\infty).$$
to obtain $\beta_M$.

Step 2. Delete every rectangle $M \in \mathcal{S}_k$ such that $\beta_M \ge \gamma_k - \varepsilon$. Let $\mathcal{R}_k$ be the collection of remaining rectangles.
Step 3. If $\mathcal{R}_k = \emptyset$ then terminate: $(x^k, y^k)$ is an $\varepsilon$-optimal solution.
Step 4. Let $M_k \in \mathrm{argmin}\{\beta_M \mid M \in \mathcal{R}_k\}$. Compute a new feasible solution by local search from the center of $M_k$ (or by solving a linear program, see Remark 10.11 below). Let $(x^{k+1}, y^{k+1})$ be the new incumbent, $\gamma_{k+1} = \langle c, y^{k+1}\rangle$.
Step 5. Bisect $M_k$ upon its longest side. Let $\mathcal{P}_{k+1}$ be the partition of $M_k$.
Step 6. Set $\mathcal{S}_{k+1} = (\mathcal{R}_k \setminus \{M_k\}) \cup \mathcal{P}_{k+1}$, $k \leftarrow k + 1$, and return to Step 1.
Proposition 10.16 The above algorithm can be infinite only if $\varepsilon = 0$. In the latter case $\beta_{M_k} \nearrow \min(P_X)$, and any filter of rectangles $\{M_k\}$ determines an optimal solution $(x, y)$ such that $\{x\} = \bigcap_{k=1}^{\infty} M_k$ and $y$ is an optimal solution of $P_{\{x\}}$.
In this form it appears as a problem of type $(P_X)$ since for fixed $q$ it is linear in the remaining variables, while the set $X = \{q \mid q_{il} \ge 0,\ \sum_i q_{il} = 1\}$ is a simplex. It is readily verified that conditions B1 and B2 are satisfied, so the above method applies.
Lagrange relaxation has also been used to derive good and cheaply computable lower bounds for the optimal value of problem (10.80) where $X$ is a polyhedron and
$$f_0(x) = \frac12\langle x, Q_0 x\rangle + \langle c_0, x\rangle + d_0, \qquad f_i(x) = \frac12\langle x, Q_i x\rangle + \langle c_i, x\rangle + d_i.$$
The Lagrangian is
$$L(x, u) = \frac12\langle x, Q(u)x\rangle + \langle c(u), x\rangle + d(u)$$
with
$$Q(u) = Q_0 + \sum_{i=1}^{m} u_i Q_i, \qquad c(u) = c_0 + \sum_{i=1}^{m} u_i c_i, \qquad d(u) = d_0 + \sum_{i=1}^{m} u_i d_i.$$
Define
$$D = \{u \in \mathbb{R}^m_+ \mid Q(u) \succeq 0\}, \qquad D^\circ = \{u \in \mathbb{R}^m_+ \mid Q(u) \succ 0\}.$$
Note that in the special case $X = \mathbb{R}^n$ the supremum may be restricted to $D$, because then $u \notin D$ implies $\varphi(u) = -\infty$. For simplicity below we assume $X = \mathbb{R}^n$.

Proposition 10.17 If $\lambda = \sup_{u \in D} \varphi(u)$ is attained on $D^\circ$ then $\lambda = f^*$.

Proof Let $\varphi(\bar u) = L(\bar x, \bar u) = \sup_{u \in D} \varphi(u)$. By Proposition 10.11 the vector $(f_i(\bar x),\ i = 1, \ldots, m)$ is a subgradient of $\varphi(u)$ at $\bar u$. If the concave function $\varphi(u)$ attains a maximum at a point $\bar u \in D^\circ$, then since $\bar u$ is an interior point of $D$ one must have $f_i(\bar x) \le 0$, $i = 1, \ldots, m$, and $\bar u_i f_i(\bar x) = 0$. That is, $\bar x$ is feasible and $\lambda = L(\bar x, \bar u) = f_0(\bar x) \ge f^*$; since always $\lambda \le f^*$, one must have $\lambda = f^*$. $\square$
Thus, to compute a lower bound, one way is to solve the problem (10.99). Since
$$\frac12\langle x, Q(u)x\rangle + \langle c(u), x\rangle + d(u) - t \ge 0 \quad \forall x \in \mathbb{R}^n,$$
it is easy to check that (10.99) is equivalent to
$$\max\Big\{ t \;\Big|\; \begin{pmatrix} Q(u) & c(u) \\ (c(u))^T & 2(d(u) - t) \end{pmatrix} \succeq 0,\; u = (u_1, \ldots, u_m) \ge 0 \Big\}.$$
Thus (10.99) is actually a linear program under LMI constraints. For this problem several efficient algorithms are currently available (see, e.g., Vandenberghe and Boyd (1996) and the references therein). However, it is not known whether the bounds
computed that way are consistent with a rectangular or simplicial subdivision
process and can be used to produce convergent BB algorithms.
10.6 Exercises
use (2) of Exercise 11, Chap. 1, to show that the convex hull of $a^1, \ldots, a^m$ contains 0 in its interior; then for every $i = 1, \ldots, n$, the $i$th unit vector $e^i$ is a nonnegative combination of $a^1, \ldots, a^m$).
2. Deduce that for any quadratic function $f(x)$ there exists $\rho$ such that for all $r > \rho$ the quadratic function $f(x) + r\sum_{i,j=1}^{m} (\langle a^i, x\rangle - \alpha_i)(\langle a^j, x\rangle - \alpha_j)$ is convex.
1. Using the Lagrangian relaxation method show that an upper bound for $\alpha(G)$ is given by
$$\mu = \inf\{\mu(\theta) \mid Q(\theta) \succ 0\}, \qquad \mu(\theta) = \max_x \Big[ 2\sum_{i=1}^{n} x_i - \langle x, Q(\theta)x\rangle \Big]$$
(Shor and Stetsenko 1989). Hint: show that for any positive definite matrix $A$ and vector $b \in \mathbb{R}^n$:
$$\max_x \big[ 2\langle b, x\rangle - \langle x, Ax\rangle \big] = \big( \min\{\langle x, Ax\rangle \mid \langle b, x\rangle = 1\} \big)^{-1}.$$
10. Let $A_0, A_1, A_2, B$ be symmetric matrices of order $n$, and $a, b$ two positive numbers such that $a < b$. Consider the problem
Show that $\eta \ge \frac{4ab}{(a+b)^2}\,\eta^*$, where $\eta^*$ is the optimal value of the problem
Hint: Observe that in the plane $(x, v)$ the line segment $\{(x, v) \mid 4x + (a+b)^2 v = 4(a+b),\; a \le x \le b\}$ is tangent to the arc $\{(x, v) \mid xv = 1,\; a \le x \le b\}$ at the point $\big(\frac{a+b}{2}, \frac{2}{a+b}\big)$, and is the chord of the arc $\{(x, v) \mid xv = \frac{4ab}{(a+b)^2},\; a \le x \le b\}$.
$$f(q^M, H^M, J) \le f(q, H, J), \qquad KL_i(q^M_i/c) \le KL_i(q_i/c), \quad i = 1, \ldots, s,$$
for all $(q, H) \in M$. Therefore, a lower bound for the objective function value over all feasible solutions $(q, H, J)$ with $(q, H) \in M$ can be computed by solving a convex program. Derive a BB method for solving the problem based on this lower bounding.
Chapter 11
Monotonic Optimization
For any two vectors $x, y \in \mathbb{R}^n$ we write $x \le y$ ($x < y$, resp.) and say that $y$ dominates $x$ ($y$ strictly dominates $x$, resp.) if $x_i \le y_i$ ($x_i < y_i$, resp.) for all $i = 1, \ldots, n$. If $a \le b$ then the box $[a, b]$ is the set of all $x$ such that $a \le x \le b$. We also write $(a, b] := \{x \mid a < x \le b\}$, $[a, b) := \{x \mid a \le x < b\}$. As usual $e$ is the vector of all ones and $e^i$ the $i$-th unit vector of the space under consideration, i.e., the vector such that $e^i_i = 1$, $e^i_j = 0\ \forall j \neq i$.
A function $f \colon \mathbb{R}^n_+ \to \mathbb{R}$ is said to be increasing if $f(x) \le f(x')$ whenever $0 \le x \le x'$; strictly increasing if, in addition, $f(x) < f(x')$ whenever $x < x'$. A function $f$ is said to be decreasing if $-f$ is increasing. Many functions encountered in various applications are increasing in this sense. Outstanding examples are the production functions and the utility functions in mathematical economics (under the assumptions that all goods are useful), polynomials (in particular quadratic functions) with nonnegative coefficients, and posynomials
$$\sum_{j=1}^{m} c_j \prod_{i=1}^{n} x_i^{a_{ij}} \quad \text{with } c_j \ge 0 \text{ and } a_{ij} \ge 0$$
(in particular, Cobb–Douglas functions $f(x) = \prod_i x_i^{a_i}$, $a_i \ge 0$).
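As a quick illustration (ours, not from the text), a posynomial is easy to evaluate and its monotonicity on $\mathbb{R}^n_+$ can be observed directly: raising any coordinate cannot decrease any term.

```python
from math import prod

def posynomial(c, a):
    """c[j] >= 0 and exponents a[j][i] >= 0 give an increasing function
    on R^n_+: x |-> sum_j c[j] * prod_i x[i] ** a[j][i]."""
    def f(x):
        return sum(cj * prod(xi ** aji for xi, aji in zip(x, aj))
                   for cj, aj in zip(c, a))
    return f

# Cobb-Douglas f(x) = x1^0.3 * x2^0.7 is the single-term case
f = posynomial([1.0], [[0.3, 0.7]])
g = posynomial([2.0, 0.5], [[1.0, 2.0], [0.0, 3.0]])

x, y = (1.0, 2.0), (1.5, 2.0)         # x <= y componentwise
assert f(x) <= f(y) and g(x) <= g(y)  # increasing in the sense above
```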
The following obvious proposition shows that the class of increasing functions
includes a large variety of functions:
Proposition 11.1 (i) If $f_1, f_2$ are increasing functions, then for any nonnegative numbers $\lambda_1, \lambda_2$ the function $\lambda_1 f_1 + \lambda_2 f_2$ is increasing.
(ii) The pointwise supremum of a bounded above family $(f_\alpha)_{\alpha \in A}$ of increasing functions and the pointwise infimum of a bounded below family $(f_\alpha)_{\alpha \in A}$ of increasing functions are increasing.

Nontrivial examples of increasing functions given by this proposition are functions of the form $f(x) = \sup_{y \in a(x)} g(y)$, where $g \colon \mathbb{R}^n_+ \to \mathbb{R}$ is an arbitrary function and $a \colon \mathbb{R}^n_+ \to \mathbb{R}^n_+$ is a set-valued map with bounded images such that $a(x') \supset a(x)$ for $x' \ge x$.
A set $G \subset \mathbb{R}^n_+$ is called normal if $[0, x] \subset G$ whenever $x \in G$; in other words, if $0 \le x' \le x \in G \Rightarrow x' \in G$. The empty set, the singleton $\{0\}$, and $\mathbb{R}^n_+$ are special normal sets which we shall refer to as trivial normal sets in $\mathbb{R}^n_+$. If $G$ is a normal set then $G \cup \{x \in \mathbb{R}^n_+ \mid x_i = 0\}$ for some $i = 1, \ldots, n$ is still normal.

Proposition 11.2 For any increasing function $g(x)$ on $\mathbb{R}^n_+$ and any $\alpha \in \mathbb{R}$ the set $G = \{x \in \mathbb{R}^n_+ \mid g(x) \le \alpha\}$ is normal, and it is closed if $g(x)$ is lower semi-continuous. Conversely, for any closed normal set $G \subset \mathbb{R}^n_+$ with nonempty interior there exists a lower semi-continuous increasing function $g \colon \mathbb{R}^n_+ \to \mathbb{R}$ and $\alpha \in \mathbb{R}$ such that $G = \{x \in \mathbb{R}^n_+ \mid g(x) \le \alpha\}$.
Proof We need only prove the second assertion. Let $G$ be a closed normal set with nonempty interior. For every $x \in \mathbb{R}^n_+$ define $g(x) = \inf\{\lambda > 0 \mid x \in \lambda G\}$. Since $\mathrm{int}G \neq \emptyset$, there is $u > 0$ such that $[0, u] \subset G$. The halfline $\{\lambda x \mid \lambda \ge 0\}$ intersects $[0, u] \subset G$; hence $0 \le g(x) < +\infty$. Since for every $\lambda > 0$ the set $\lambda G$ is normal, if $0 \le x \le x' \in \lambda G$, then $x \in \lambda G$, too, so $g(x) \le g(x')$, i.e., $g(x)$ is increasing. We show that $G = \{x \in \mathbb{R}^n_+ \mid g(x) \le 1\}$. In fact, if $x \in G$, then obviously $g(x) \le 1$. Conversely, if $x \notin G$, then since $G$ is closed there exists $\lambda > 1$ such that $x \notin \lambda G$; hence, since $\lambda G$ is normal, $x \notin \mu G\ \forall \mu \le \lambda$, which means that $g(x) > 1$. Consequently, $G = \{x \in \mathbb{R}^n_+ \mid g(x) \le 1\}$. It remains to prove that $g(x)$ is lower semi-continuous. Let $\{x^k\} \subset \mathbb{R}^n_+$ be a sequence such that $x^k \to x^0$ and $g(x^k) \le \alpha\ \forall k$. Then for any given $\alpha' > \alpha$ we have $\inf\{\lambda \mid x^k \in \lambda G\} < \alpha'\ \forall k$, i.e., $x^k \in \alpha' G\ \forall k$; hence $x^0 \in \alpha' G$ in view of the closedness of the set $\alpha' G$. This implies $g(x^0) \le \alpha'$ and since $\alpha'$ can be taken arbitrarily close to $\alpha$, we must have $g(x^0) \le \alpha$, as was to be proved. $\square$
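The function constructed in this proof can be evaluated numerically. Below is a small sketch (the choice of $G$ and all names are ours): $g(x) = \inf\{\lambda > 0 \mid x \in \lambda G\}$ is computed by bisection, using the fact that $x \in \lambda G$ iff $x/\lambda \in G$ and that membership is monotone in $\lambda$ for a normal set.

```python
def gauge(in_G, x, hi=1e6, iters=80):
    """g(x) = inf{lambda > 0 : x in lambda*G}, by bisection; in_G is a
    membership test for a closed normal set G with [0,u] in G, u > 0."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # x in mid*G  <=>  x/mid in G
        if in_G(tuple(xi / mid for xi in x)):
            hi = mid
        else:
            lo = mid
    return hi

# toy normal set: G = {x in R^2_+ : x1^2 + x2^2 <= 1}
in_G = lambda x: x[0] ** 2 + x[1] ** 2 <= 1.0
g1 = gauge(in_G, (3.0, 4.0))   # inf{lambda : ||x|| <= lambda} = 5
g2 = gauge(in_G, (0.3, 0.4))   # = 0.5 <= 1, consistent with x in G
```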
A point $y \in \mathbb{R}^n$ is called an upper boundary point of a normal set $G$ if $y \in \mathrm{cl}G$ and there is no point $x \in G$ such that $x = a + \lambda(y - a)$ with $\lambda > 1$. The set of all upper boundary points of $G$ is called its upper boundary and is denoted $\partial^+ G$.
Given a set $A \subset \mathbb{R}^n_+$, the whole orthant $\mathbb{R}^n_+$ is a normal set containing $A$. The intersection of all normal sets containing $A$, i.e., the smallest normal set containing $A$, is called the normal hull of $A$ and denoted $A^e$. The conormal hull of $A$, denoted $^bA$, is the smallest conormal set containing $A$.

Proposition 11.4 (i) The normal hull of a set $A \subset [a, b]$ is the set $A^e = \bigcup_{z \in A} [a, z]$. If $A$ is compact then so is $A^e$.
(ii) The conormal hull of a set $A \subset [a, b]$ is the set $^bA = \bigcup_{z \in A} [z, b]$. If $A$ is compact then so is $^bA$.
Proof It suffices to prove (i), because the proof of (ii) is similar. Let $P = \bigcup_{z \in A} [a, z]$. Clearly $P$ is normal and $P \supset A$; hence $A^e \subset P$. Conversely, if $x \in P$ then $x \in [a, z]$ for some $z \in A \subset A^e$; hence $x \in A^e$ by normality of $A^e$, so $P \subset A^e$, proving the equality $P = A^e$. If $A$ is compact then for any sequence $\{x^k\} \subset A^e$ we have $x^k \in [a, z^k]$ with $z^k \in A$; hence, up to a subsequence, $z^k \to z^0 \in A$ and $x^k \to x^0 \in [a, z^0] \subset A^e$, proving the compactness of $A^e$. $\square$
Proposition 11.5 A compact normal set $G \subset [a, b]$ has at least one upper extreme point and is equal to the normal hull of the set $V$ of its upper extreme points.

Proof For each $i = 1, \ldots, n$ an extreme point of the line segment $\{x \in G \mid x = a + \lambda e^i,\ \lambda \ge 0\}$ is an upper extreme point of $G$, so $V \neq \emptyset$. We show that $G = V^e$. Since $V \subset \partial^+ G$, we have $V^e \subset (\partial^+ G)^e \subset G$, so it suffices to show that $G \subset V^e$. If $y \in G$ then $\pi_G(y) \in \partial^+ G$, so $y \in (\partial^+ G)^e$. Define $x^1 \in \mathrm{argmax}\{x_1 \mid x \in G,\ x \ge y\}$, and $x^i \in \mathrm{argmax}\{x_i \mid x \in G,\ x \ge x^{i-1}\}$ for $i = 2, \ldots, n$. Then $v := x^n \in G$ and $v \ge y$; moreover, $x \in G,\ x \ge v \Rightarrow x = v$. This means that $y \le v \in V$, hence $y \in V^e$. Consequently, $G \subset V^e$, as was to be proved. $\square$
A polyblock $P$ is the normal hull of a finite set $V \subset [a, b]$, called its vertex set. By the above proposition $P = \bigcup_{z \in V} [a, z]$. A point $z \in V$ is called a proper vertex of $P$ if there is no $z' \in V \setminus \{z\}$ such that $z' \ge z$. It is easily seen that a proper vertex of a polyblock is nothing but a Pareto point of it. The vertex set of a polyblock $P$ is denoted by $\mathrm{vert}P$, the set of proper vertices by $\mathrm{pvert}P$. A vertex which is not proper is called improper. Clearly a polyblock is fully determined by its proper vertex set; more precisely, $P = (\mathrm{pvert}P)^e$, i.e., a polyblock is the normal hull of its proper vertex set (Fig. 11.1).

Similarly, a copolyblock (or reverse polyblock) $Q$ is the conormal hull of a finite set $T \subset [a, b]$, i.e., $Q = \bigcup_{z \in T} [z, b]$. The set $T$ is called the vertex set of the copolyblock. A proper vertex of a copolyblock $Q$ is a vertex $z$ for which there is no vertex $z' \neq z$ such that $z' \le z$. A vertex which is not proper is said to be improper. A copolyblock is fully determined by its proper vertex set, i.e., a copolyblock is the conormal hull of its proper vertex set.
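In code, extracting the proper vertices of a polyblock is just Pareto filtering of its vertex set; a small sketch (names ours):

```python
def dominates(y, z):
    """y dominates z if z <= y componentwise."""
    return all(zi <= yi for zi, yi in zip(z, y))

def proper_vertices(V):
    """Vertices not dominated by another (distinct) vertex."""
    return [z for z in V
            if not any(other != z and dominates(other, z) for other in V)]

# (1, 2) is dominated by (2, 2) and by (1, 3), hence improper
V = [(3, 1), (1, 3), (2, 2), (1, 2), (3, 1)]
assert sorted(set(proper_vertices(V))) == [(1, 3), (2, 2), (3, 1)]
```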
$$= \bigcup_{(z, y) \in V_1 \times V_2} [a, z \wedge y].$$
(ii) Since the set $\bigcup_{a < x < b} \partial f(x)$ is nonempty and bounded (Theorem 2.6), we can take $M > K := \sup\{\|p\| \mid p \in \partial f(x),\ a < x < b\}$. Then the function $g(x) = f(x) + M\sum_{i=1}^{n} x_i$ satisfies, for $a < x \le x' < b$:
$$g(x') - g(x) = f(x') - f(x) + M\sum_{i=1}^{n}(x'_i - x_i) \ge \langle p, x' - x\rangle + M\sum_{i=1}^{n}(x'_i - x_i),$$
where $p \in \partial f(x)$. Hence $g(x') - g(x) \ge (M - K)\sum_{i=1}^{n}(x'_i - x_i) \ge 0$, so $g(x)$ is increasing.
(iii) The function $g(x) = f(x) + M\sum_{i=1}^{n} x_i$ where $M > K$ is increasing on $[a, b]$ because one has, for $a \le x \le x' \le b$:
$$g(x') - g(x) = M\sum_{i=1}^{n}(x'_i - x_i) + f(x') - f(x) \ge M\sum_{i=1}^{n}(x'_i - x_i) - K|x' - x| \ge 0.$$
(iv) For $M > K := \sup_{x \in [a, b]} \|\nabla f(x)\|$ the function $g(x) = f(x) + M\sum_{i=1}^{n} x_i$ is increasing. In fact, by the mean-value theorem, for $a \le x \le x' \le b$ there exists $\theta \in (0, 1)$ such that
Suppose now that $f(x)$ is locally dm on $[a, b]$. By compactness the box $[a, b]$ can be covered by a finite number of boxes $[p^1, q^1], \ldots, [p^m, q^m]$ such that $f(x) = u^i(x) - v^i(x)\ \forall x \in [p^i, q^i]$, with $u^i, v^i$ increasing functions on $[p^i, q^i]$. For $x \in [p^i, q^i] \cap [p^j, q^j]$ we have $u^i(x) - v^i(x) = u^j(x) - v^j(x) = f(x)$, so we can define $u(x) = u^i(x) + \sum_{j \neq i} \max\{0, u^j(q^j) - u^i(q^i)\}$, $v(x) = v^i(x) + \sum_{j \neq i} \max\{0, u^j(q^j) - u^i(q^i)\}$ for every $x \in [p^i, q^i]$. These two functions are increasing and obviously $f(x) = u(x) - v(x)$. $\square$
where the objective function $f(x)$ and the constraint functions $g_i(x)$ are dm functions on the box $[a, b] \subset \mathbb{R}^n$. For the purpose of optimization an important property of the class of dm functions is that, like the class of dc functions, it is closed under linear operations and under the operations of taking pointwise maximum and pointwise minimum. Exploiting this property we have the following result:

Theorem 11.1 Any monotonic optimization problem can be reduced to the canonical form

For simplicity we will assume in the following that $f(x), h(x)$ are upper semi-continuous, while $g(x)$ is lower semi-continuous. Then the set $G := \{x \in [a, b] \mid g(x) \le 0\}$ is a compact normal set, $H := \{x \in [a, b] \mid h(x) \ge 0\}$ is a compact conormal set, and the problem (MO) can be rewritten as
On the basis of these properties, one method for solving problem (MO) is to generate inductively a nested sequence of polyblocks outer approximating the feasible set:
$$[a, b] =: P_0 \supset P_1 \supset \cdots \supset G \cap H.$$
This can be done as follows. Start with the polyblock $P_1 = [a, b]$. At iteration $k$ a polyblock $P_k \supset G$ has been constructed with proper vertex set $T_k$. If $T'_k = T_k \cap H = \emptyset$, stop: the problem is infeasible. Otherwise let $z^k \in \mathrm{argmax}\{f(x) \mid x \in T'_k\}$. Since $f(x)$ is an increasing function, $f(z^k)$ is the maximum of $f(x)$ over $P_k \cap H \supset G \cap H$. If $z^k \in G$, then stop: $z^k \in G \cap H$, i.e., $z^k$ is feasible, so $f(z^k)$ is also the maximum of $f(x)$ over $G \cap H$, i.e., $z^k$ solves the problem. If, on the other hand, $z^k \notin G$, then compute $x^k = \pi_G(z^k)$ and let $P_{k+1} = ([a, b] \setminus (x^k, b]) \cap P_k$. By Proposition 11.7, $[a, b] \setminus (x^k, b]$ is a polyblock, so the polyblock $P_{k+1}$ satisfies $G \subset P_{k+1} \subset P_k \setminus \{z^k\}$. The procedure can then be repeated at the next iteration.
This method is analogous to the outer approximation method for minimizing a convex function over a convex set. Although it can be proved that, under mild conditions, as $k \to +\infty$ the sequence $\{x^k\}$ generated by this method converges to a global optimal solution of (MO), the convergence is generally slow. To speed up convergence the method should be improved in two points: (1) bounding: using valid cuts, reduce the current box $[p, q]$ in order to have tighter bounds; (2) computing the set $T_{k+1}$ (the proper vertex set of polyblock $P_{k+1}$) not from scratch, but by deriving it from $T_k$.
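The basic scheme just described can be sketched compactly for the special case $H = [a, b]$ (only the normal constraint $g(x) \le 0$), with $\pi_G$ computed by bisection along the segment $[a, z]$. The stopping rule, tolerances, and the toy instance below are ours, not from the text.

```python
def pi_G(g, a, z, iters=60):
    """Last point of G = {x : g(x) <= 0} on the segment from a to z
    (assumes g(a) <= 0 < g(z)); found by Bolzano bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        t = 0.5 * (lo + hi)
        x = tuple(ai + t * (zi - ai) for ai, zi in zip(a, z))
        lo, hi = (t, hi) if g(x) <= 0 else (lo, t)
    return tuple(ai + lo * (zi - ai) for ai, zi in zip(a, z))

def polyblock_maximize(f, g, a, b, eps=1e-3, max_iter=2000):
    """max f(x) s.t. g(x) <= 0, x in [a, b], with f and g increasing."""
    n = len(a)
    T = {b}                                  # proper vertex set of P_1 = [a,b]
    best_x, best_val = None, float("-inf")
    for _ in range(max_iter):
        if not T:
            break
        z = max(T, key=f)                    # z^k: best vertex, an upper bound
        if g(z) <= 0:
            return z, f(z)                   # z^k feasible, hence optimal
        x = pi_G(g, a, z)                    # x^k = pi_G(z^k) is feasible
        if f(x) > best_val:
            best_x, best_val = x, f(x)
        if f(z) - best_val <= eps:           # upper bound meets incumbent
            break
        # cut (x, b]: replace each v > x by the n vertices v + (x_i - v_i)e^i
        T_star = {v for v in T if all(xi < vi for xi, vi in zip(x, v))}
        T = (T - T_star) | {v[:i] + (x[i],) + v[i + 1:]
                            for v in T_star for i in range(n)}
        # keep only the Pareto-maximal (proper) vertices
        T = {v for v in T
             if not any(w != v and all(vi <= wi for vi, wi in zip(v, w))
                        for w in T)}
    return best_x, best_val

# toy instance: max x1 + x2 s.t. x1^2 + x2^2 <= 1 on [0, 1]^2
x_opt, val = polyblock_maximize(lambda x: x[0] + x[1],
                                lambda x: x[0] ** 2 + x[1] ** 2 - 1.0,
                                (0.0, 0.0), (1.0, 1.0))
```

For this instance the scheme closes the gap between the vertex bound and the incumbent to within `eps`, certifying a value near $\sqrt 2$ at $x \approx (\tfrac{\sqrt2}{2}, \tfrac{\sqrt2}{2})$.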
At any stage of the solving process, some information may have been gathered about
the global optimal solution. To economize the computational effort this information
should be used to reduce the search domain by removing certain unfit portions where
the global optimal solution cannot be expected to be found. The subproblem which
arises is the following.
Given a number $\gamma \in f(G \cap H)$ (the current best feasible objective function value) and a box $[p, q] \subset [a, b]$, check whether the box $[p, q]$ contains at least one solution to the system
$$g(x) \le 0, \qquad h(x) \ge 0, \qquad f(x) \ge \gamma, \qquad x \in [p, q].$$
Given any continuous increasing function $\varphi \colon [0, 1] \to \mathbb{R}$ such that $\varphi(0) < 0 < \varphi(1)$, let $N(\varphi)$ denote the largest value $\alpha \in (0, 1)$ satisfying $\varphi(\alpha) = 0$. For every $i = 1, \ldots, n$ let $q^i := p + (q_i - p_i)e^i$.

Proposition 11.13 A $\gamma$-valid reduction of a box $[p, q]$ is given by the formula
$$\mathrm{red}_\gamma[p, q] = \begin{cases} \emptyset, & \text{if } g(p) > 0 \text{ or } h_\gamma(q') < 0, \\ [p', q'], & \text{if } g(p) \le 0 \ \&\ h_\gamma(q') \ge 0, \end{cases} \tag{11.12}$$
where
$$q' = p + \sum_{i=1}^{n} \alpha_i(q_i - p_i)e^i, \qquad p' = q' - \sum_{i=1}^{n} \beta_i(q'_i - p_i)e^i, \tag{11.13}$$
$$\alpha_i = \begin{cases} 1, & \text{if } g(q^i) \le 0, \\ N(\varphi_i), & \text{if } g(q^i) > 0, \end{cases} \qquad \varphi_i(t) = g(p + t(q_i - p_i)e^i);$$
$$\beta_i = \begin{cases} 1, & \text{if } h_\gamma(q'^i) \ge 0, \\ N(\psi_i), & \text{if } h_\gamma(q'^i) < 0, \end{cases} \qquad \psi_i(t) = -h_\gamma(q' - t(q'_i - p_i)e^i).$$
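Since $g$ is increasing, each $\varphi_i$ is increasing along its ray, so $N(\varphi_i)$ is the unique zero and a Bolzano bisection computes it; a sketch with an illustrative $\varphi$ (ours, not from the text):

```python
def N(phi, iters=60):
    """Largest alpha in (0, 1) with phi(alpha) = 0, for continuous
    increasing phi with phi(0) < 0 < phi(1), by Bolzano bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phi(mid) <= 0.0:
            lo = mid        # phi(lo) <= 0: still left of (or at) the root
        else:
            hi = mid
    return lo

# e.g. g(x) = x1 + x2 - 1 on [p, q] with p = (0,0), q = (2,2):
# along e^1, phi_1(t) = g(p + t*(q1 - p1)*e^1) = 2t - 1, so N(phi_1) = 0.5
phi1 = lambda t: 2.0 * t - 1.0
assert abs(N(phi1) - 0.5) < 1e-9
```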
$$\{x \in G \cap H \mid f(x) \ge \gamma\} \subset P' \subset P.$$
The next question after reducing the search domain is how to derive the new polyblock $P_{k+1}$ that should better approximate $G \cap H$ than the current polyblock $P_k$. The general subproblem to be solved is the following.

Given a polyblock $P \subset [a, b]$ with $V = \mathrm{pvert}P$, a vertex $v \in V$ and a point $x \in [a, v)$ such that $G \cap H \subset P \setminus (x, b]$, determine a new polyblock $P'$ satisfying $P \setminus (x, b] \subset P' \subset P \setminus \{v\}$.
For any two $z, y$ let $J(z, y) = \{j \mid z_j > y_j\}$, and if $a \le x \le z \le b$ then define
$$z^i = z + (x_i - z_i)e^i, \quad i = 1, \ldots, n.$$

Proposition 11.15 Let $P$ be a polyblock with proper vertex set $V \subset [a, b]$ and let $x \in [a, b]$ satisfy $V_* := \{z \in V \mid x < z\} \neq \emptyset$.
(i) $P' := P \setminus (x, b]$ is a polyblock with vertex set
$$T' = (V \setminus V_*) \cup \{z^i \mid z \in V_*,\ i = 1, \ldots, n\}. \tag{11.14}$$
(ii) The proper vertex set of $P'$ is obtained from $T'$ by removing improper elements according to the rule: for every pair $z \in V_*$, $y \in V_+ := \{y \in V \mid x \le y\}$ compute $J(z, y) = \{j \mid z_j > y_j\}$, and if $J(z, y) = \{i\}$ then $z^i$ is an improper element of $T'$.
Proof Since $[a, z] \cap (x, b] = \emptyset$ for every $z \in V \setminus V_*$, it follows that $P \setminus (x, b] = P_1 \cup P_2$, where $P_1$ is the polyblock with vertex set $V \setminus V_*$ and $P_2 = (\bigcup_{z \in V_*} [a, z]) \setminus (x, b] = \bigcup_{z \in V_*} ([a, z] \setminus (x, b])$. Noting that by Proposition 11.7, $[a, b] \setminus (x, b]$ is a polyblock with vertices $u^i = b + (x_i - b_i)e^i$, $i = 1, \ldots, n$, we can then write $[a, z] \setminus (x, b] = [a, z] \cap ([a, b] \setminus (x, b]) = [a, z] \cap (\bigcup_{i=1}^{n} [a, u^i]) = \bigcup_{i=1}^{n} [a, z] \cap [a, u^i] = \bigcup_{i=1}^{n} [a, z \wedge u^i]$, where $z \wedge u^i$ denotes the vector whose every $j$-th component equals $\min\{z_j, u^i_j\}$. Hence, $P_2 = \bigcup\{[a, z \wedge u^i] \mid z \in V_*,\ i = 1, \ldots, n\}$. Since $z \wedge u^i = z^i$ for $z \in V_*$, this shows that the vertex set of $P \setminus (x, b]$ is the set $T'$ given by (11.14).
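Proposition 11.15 translates directly into a vertex-update routine (our transcription; improper elements are removed here by a general dominance check, which subsumes rule (ii)):

```python
def cut_polyblock(V, x):
    """Proper vertex set of P \\ (x, b], given the proper vertex set V of P."""
    n = len(x)
    V_star = [z for z in V if all(xi < zi for xi, zi in zip(x, z))]
    T = [z for z in V if z not in V_star]
    # each z in V_* is replaced by z^i = z + (x_i - z_i)e^i, i = 1..n
    T += [z[:i] + (x[i],) + z[i + 1:] for z in V_star for i in range(n)]
    # drop dominated (improper) elements
    return [v for v in T
            if not any(w != v and all(vi <= wi for vi, wi in zip(v, w))
                       for w in T)]

# cutting (x, b] with x = (0.5, 0.5) out of the box [0,1]^2 (vertex (1,1))
new_V = cut_polyblock([(1.0, 1.0)], (0.5, 0.5))
assert sorted(new_V) == [(0.5, 1.0), (1.0, 0.5)]
```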
$$x \ge a + \delta e \quad \forall x \in H. \tag{11.15}$$
Step 4. Compute $x^k = \pi_G(z^k)$, the last point of $G$ in the line segment from $a$ to $z^k$. Determine the new current best feasible solution $\bar x^{k+1}$ and the new current best value CBV. Compute the proper vertex set $V_{k+1}$ of $P_{k+1} = P_k \setminus (x^k, b]$ according to Proposition 11.15.
Step 5. Increment $k$ and return to Step 1.
Proposition 11.16 When infinite, Algorithm POA generates an infinite sequence $\{x^k\}$ every cluster point of which is a global optimal solution of (MO).

Proof First note that by replacing $P_k$ with $\bar P_k = \mathrm{red}\,P_k$ in Step 1, all $z \in V_k$ such that $f(z) < \gamma$ are deleted, so that at any iteration $k$ all feasible solutions $z$ with $f(z) \ge \mathrm{CBV}$ are contained in the polyblock $P_k$. This justifies the conclusions when the algorithm terminates at Step 2 or Step 3. Therefore it suffices to consider the case when the algorithm is infinite.

Condition (11.15) implies that
$$\min_i (z_i - a_i) \ge \delta \ge \frac{\delta}{\|b - a\|}\,\|z - a\| = \rho\|z - a\| \quad \forall z \in H, \tag{11.17}$$
where $\rho = \frac{\delta}{\|b - a\|}$. We contend that $z^k - x^k \to 0$ as $k \to +\infty$. Suppose the contrary, that there exist $\varepsilon > 0$ and an infinite sequence $\{k_l\}$ such that $\|z^{k_l} - x^{k_l}\| \ge \varepsilon > 0\ \forall l$. For $m > l$ we have $z^{k_m} \notin (x^{k_l}, z^{k_l}]$ because $P_{k_m} \subset P_{k_l} \setminus (x^{k_l}, b]$. Hence, $\|z^{k_m} - z^{k_l}\| \ge \min_{i=1,\ldots,n} |z^{k_l}_i - x^{k_l}_i|$. On the other hand, $\min_{i=1,\ldots,n}(z^{k_l}_i - a_i) \ge \rho\|z^{k_l} - a\|$ by (11.17) because $z^{k_l} \in H$, while $x^{k_l}$ lies on the line segment joining $a$ to $z^{k_l}$, so
$$z^{k_l}_i - x^{k_l}_i = \frac{z^{k_l}_i - a_i}{\|z^{k_l} - a\|}\,\|z^{k_l} - x^{k_l}\| \ge \rho\,\|z^{k_l} - x^{k_l}\| \quad \forall i,$$
i.e., $\min_{i=1,\ldots,n} |z^{k_l}_i - x^{k_l}_i| \ge \rho\|z^{k_l} - x^{k_l}\| \ge \rho\varepsilon$. Therefore, $\|z^{k_m} - z^{k_l}\| \ge \min_{i=1,\ldots,n} |z^{k_l}_i - x^{k_l}_i| \ge \rho\varepsilon$, conflicting with the fact that $\{z^{k_l}\}$, being contained in the compact box $[a, b]$, must have a Cauchy subsequence.
then $x^k$ is always feasible, so when $f(z^k) - f(x^k) < \varepsilon$, then $f(x^k) + \varepsilon > f(z^k) \ge \max\{f(x) \mid g(x) \le 0,\ x \in [a, b]\}$, i.e., $x^k$ will be an $\varepsilon$-optimal solution of the problem. In other words, the Polyblock Algorithm applied to problem (MO1) provides an $\varepsilon$-optimal solution in finitely many steps.
In the general case, if both normal and conormal constraints are present, $x^k$ may not be feasible, and the Polyblock Algorithm may not provide an $\varepsilon$-optimal solution in finitely many steps. The most that can be said is that, since $g(z^k) - g(x^k) \to 0$ and $g(x^k) \le 0$, while $h(z^k) \ge 0\ \forall k$ and $f(z^k) \to \max(\mathrm{MO})$ as $k \to +\infty$, we will have, for $k$ sufficiently large, $g(z^k) \le \varepsilon$, $h(z^k) \ge 0$, $f(z^k) \ge \max(\mathrm{MO}) - \varepsilon$.
[Figure: a point $x \in [a, b]$ satisfying $g(x) \le 0 \le h(x)$; shown are the curve $g(x) = 0$, the regions $g(x) \le 0$ and $h(x) \ge 0$, and the level set $f(x) = \mathrm{opt}$ through the optimal point $x$.]
When the problem has an isolated optimal solution, this optimal solution is generally difficult to compute, and any method for computing it is difficult to implement.
To get round the difficulty a common practice is to assume that the feasible set $D$ of the problem under consideration is robust, i.e., it satisfies
$$D = \mathrm{cl}(\mathrm{int}D), \tag{11.18}$$
where cl and int denote the closure and the interior, respectively. Condition (11.18) rules out the existence of an isolated feasible solution and ensures that for $\varepsilon > 0$ sufficiently small an $\varepsilon$-approximate optimal solution will be close enough to the exact optimum. However, even in this case the trouble is that in practice we often do not know precisely what "sufficiently small $\varepsilon > 0$" means. Moreover, condition (11.18) is generally very hard to check. Quite often we have to deal with feasible sets which are not known a priori to contain isolated points or not.
Therefore, from a practical point of view a method is desirable which could
help to discard isolated feasible solutions from consideration without having to
preliminarily check their presence. This method should employ a more adequate
concept of approximate optimal solution than the commonly used concept of
"-approximate optimal solution.
A point $x \in \mathbb{R}^n$ is called an essential feasible solution of (MO) if it satisfies
Given a tolerance $\eta > 0$, an essential feasible solution $\bar x$ is said to be an essential
A key step towards finding a global optimal solution of problem (MO) is to deal with the following subproblem of incumbent transcending:

$(SP_\gamma)$ Given a real number $\gamma$, check whether (MO) has a nonisolated feasible solution $x$ satisfying $f(x) > \gamma$, or else establish that no nonisolated feasible solution $x$ of (MO) exists such that $f(x) > \gamma$.

Since $f, g$ are increasing, if $f(b) \le \gamma$, then $f(x) \le \gamma\ \forall x \in [a, b]$, so there is no feasible solution with objective function value exceeding $\gamma$; if $g(a) > 0$, then $g(x) > 0\ \forall x \in [a, b]$, so the problem is infeasible. If $f(b) > \gamma$ and $g(b) < 0$ then $b$ is an essential feasible solution with objective function value exceeding $\gamma$. Therefore, barring these trivial cases, we can assume that
For our purpose an important feature of this problem is that it can be solved
efficiently, while solving it furnishes an answer to question (IT), as will shortly
be seen.
where $k_\gamma(x) := \min\{h(x), f(x) - \gamma\}$ is an increasing function. Since the feasible set of $(Q_\gamma)$ is robust (has no isolated point) and easy to handle (a feasible solution can be computed at very cheap cost), this problem can be solved efficiently by a rectangular RBB (reduce-bound-and-branch) algorithm characterized by three basic operations (see Chap. 6, Sect. 6.2):

- Reducing: using valid cuts, reduce each partition set $M = [p, q] \subset [a, b]$ without losing any feasible solution currently still of interest. The resulting box $[p', q']$ is referred to as a valid reduction of $M$.
- Bounding: for each partition set $M = [p, q]$ compute a lower bound $\beta(M)$ for $g(x)$ over the feasible solutions of $(Q_\gamma)$ contained in the valid reduction of $M$.
- Branching: use an adaptive subdivision rule to further subdivide a selected box in the current partitioning of the original box $[a, b]$.
b. Valid Reduction
At a given stage of the RBB algorithm for $(Q_\gamma)$, let $[p, q]$ be a box generated during the partitioning process and still of interest. The search for a nonisolated feasible solution $x$ of (MO) in $[p, q]$ such that $f(x) \ge \gamma$ can then be restricted to the set
where
$$q' = p + \sum_{i=1}^{n} \alpha_i(q_i - p_i)e^i, \qquad p' = q' - \sum_{i=1}^{n} \beta_i(q'_i - p_i)e^i, \tag{11.21}$$
$$\alpha_i = \begin{cases} 1, & \text{if } g(q^i) \le 0, \\ N(\varphi_i), & \text{if } g(q^i) > 0, \end{cases} \qquad \varphi_i(t) = g(p + t(q_i - p_i)e^i);$$
$$\beta_i = \begin{cases} 1, & \text{if } k_\gamma(q'^i) \ge 0, \\ N(\psi_i), & \text{if } k_\gamma(q'^i) < 0, \end{cases} \qquad \psi_i(t) = -k_\gamma(q' - t(q'_i - p_i)e^i).$$
In the context of the RBB algorithm for $(Q_\gamma)$, a smaller $\mathrm{red}_\gamma[p, q]$ allows a tighter bound to be obtained for $g(x)$ over $[p, q]$ but requires more computational effort. Therefore, as already mentioned in Remark 11.2, in practical implementation of the above formulas for $\alpha_i, \beta_i$ it is enough to compute approximate values of $N(\varphi_i)$ or $N(\psi_i)$ by using, e.g., just a few iterations of a Bolzano bisection procedure.
c. Bounding

As simple as it is, this bound combined with an adaptive branching rule suffices to ensure convergence of the algorithm. However, to enhance efficiency, better bounds are often needed. For example, applying two or more iterations of the Polyblock procedure for computing $\min\{g(x) \mid x \in [p, q]\}$ may yield a reasonably good lower bound. Alternatively, based on the observation that $\min\{g(x) + t k_\gamma(x) \mid x \in [p, q]\} \le \min\{g(x) \mid x \in [p, q],\ k_\gamma(x) \ge 0\}$ $\forall t \in \mathbb{R}_+$, one may take $\beta(M) = g(p) + t k_\gamma(p)$ for some $t > 1$. As shown in Tuy (2010), if either of the functions $f(x)$, $g(x) + t k_\gamma(x)$ is coercive, i.e., tends to $+\infty$ as $\|x\| \to +\infty$, then this bound approaches the exact minimum of $g(x)$ over $[p, q]$ as $t \to +\infty$.
Also one may try to exploit any partial convexity present in the problem. For instance, if a convex function $u(x)$ and a concave function $v(x)$ are available such that $u(x) \le g(x)$, $v(x) \ge k_\gamma(x)\ \forall x \in [p, q]$, then a lower bound is given by the optimal value of the convex minimization problem
d. Branching

In other words, subdivide $M_k$ according to the adaptive subdivision rule via $(z^k, y^k)$.

Proposition 11.18 Whenever infinite, the above adaptive RBB algorithm for $(Q_\gamma)$ generates a sequence of feasible solutions $y^k$ at least a cluster point $\bar y$ of which is an optimal solution of $(Q_\gamma)$.

Proof Since the algorithm uses an adaptive subdivision rule via $(z^k, y^k)$, by Theorem 6.4 there exists an infinite sequence $K \subset \{1, 2, \ldots\}$ such that $\|z^k - y^k\| \to 0$ as $k \in K$, $k \to +\infty$. From the fact
to $q^k$ meets the surface $k_\gamma(x) = 0$. If $g(y^k) < -\varepsilon$ go to Step 5. If $g(y^k) \ge -\varepsilon$ go to Step 6.
Step 5. $y^k$ is an essential feasible solution of (MO) with $f(y^k) \ge \gamma$. Reset $\bar x \leftarrow y^k$ and go back to Step 0.
Step 6. Divide $M_k$ into two subboxes according to an adaptive branching via $(z^k, y^k)$. Let $\mathcal{P}_{k+1}$ be the collection of these two subboxes of $M_k$. Let $\mathcal{R}_{k+1} = (\mathcal{R}'_k \setminus \{M_k\}) \cup \mathcal{P}_{k+1}$. Increment $k$ and return to Step 1.
Proposition 11.19 The above algorithm terminates after finitely many steps, yielding either an essential $(\varepsilon, \eta)$-optimal solution of (MO), or evidence that the problem is essentially infeasible.

Proof Since Step 1 deletes every box $M$ with $\beta(M) > -\varepsilon$, the fact $\mathcal{R}_k = \emptyset$ in Step 3 implies that $g(x) + \varepsilon > 0\ \forall x \in [p, q] \cap \{x \mid k_\gamma(x) \ge 0\}$; hence there is no $\varepsilon$-essential feasible solution $x$ of (MO) with $f(x) \ge \gamma$, justifying the conclusion in Step 3. It remains to show that Step 3 occurs for sufficiently large $k$. Suppose the contrary, that the algorithm is infinite. Since each occurrence of Step 5 increases the current value of $f(\bar x)$ by at least $\eta > 0$ while $f(x)$ is bounded above on the feasible set, Step 5 cannot occur infinitely often. Therefore, for all $k$ sufficiently large Step 6 occurs, i.e., $g(y^k) \ge -\varepsilon$. By Proposition 11.18 there exists a subsequence $\{k_s\}$ such that $\|z^{k_s} - y^{k_s}\| \to 0$ as $s \to +\infty$. Also, by compactness, one can assume that $y^{k_s} \to \bar y$, $z^{k_s} \to \bar y$. Then $g(\bar y) \ge -\varepsilon$, while $g(z^{k_s}) = \beta(M_{k_s}) < -\varepsilon$, hence $g(\bar y) \le -\varepsilon$, a contradiction. $\square$
Remark 11.4 So far we have assumed that the problem involves only inequality constraints. In the case when there are also equality constraints, e.g.,
$$c_j(x) = 0, \quad j = 1, \ldots, s, \tag{11.23}$$
observe that the linear constraints can be used to eliminate certain variables, so we can always assume that all the constraints (11.23) are nonlinear. Since, however, in the most general case one cannot expect to compute a solution to a system of nonlinear equations in finitely many steps, one should be content with replacing the system (11.23) by the approximate system
$$|c_j(x)| \le \delta, \quad j = 1, \ldots, s,$$
where $\delta > 0$ is the tolerance. The method presented in the previous sections can then be applied to the resulting approximate problem. An essential $\varepsilon$-optimal solution to the above defined approximate problem is called an essential $(\delta, \varepsilon)$-optimal solution of (MO).
where $f(x), g(x), h(x)$ are quadratic functions with positive coefficients (hence, increasing functions). The SIT method applied to the monotonic optimization problem $(P)$ then yields a robust algorithm for solving (QQP). For details of implementation, we refer the reader to the article Tuy and Hoai-Phuong (2007). Sometimes it may be more convenient to convert (QQP) to the form

where $g(x) = \min_{k=1,\ldots,m}\{u_k(x) - v_k(x)\}$ and $f, u_k, v_k$ are quadratic functions with positive coefficients. Then the SIT Algorithm can be applied, using the master problem
$$\max\{f(x) \mid x \in \tilde G \cap H\}. \tag{11.24}$$
Proof Since $G \cap S \subset \tilde G$, we need only show that for each point $x \in \tilde G$ there exists $z \in G \cap S$ satisfying $f(z) \ge f(x)$. By Proposition 11.4, $\tilde G = \bigcup_{z \in G \cap S} [a, z]$, so $x \in [a, z]$ for some $z \in G \cap S$. Since $x \le z$ and $f(x)$ is increasing, we have $f(x) \le f(z)$. $\square$

As a consequence, solving problem (DMO) reduces to solving (11.24), which is a monotonic problem without explicit discrete constraint. The difficulty, now, is how to handle the polyblock $\tilde G$, which is defined only implicitly as the normal hull of $G \cap S$.
Given a box [p, q] ⊂ [a, b] and a point x ∈ [p, q], we define the lower S-adjustment of x to be the point x̃ = ⌊x⌋_S with
  x̃_i = max{α | α ∈ S_i ∪ {p_i}, α ≤ x_i}, i = 1, …, n,
and the upper S-adjustment of x to be the point x̂ = ⌈x⌉_S with
  x̂_i = min{α | α ∈ S_i ∪ {q_i}, α ≥ x_i}, i = 1, …, n.
For example, if each S_i is the set of integers and p_i, q_i ∈ S_i, then x̃_i is the largest integer no larger than x_i, while x̂_i is the smallest integer no less than x_i.
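The two adjustments can be computed coordinatewise. A minimal sketch in Python, assuming each S_i is given as an explicit finite list (the function names are illustrative, not from the text):

```python
def lower_adjust(x, S, p):
    # x~_i = max{ s in S_i ∪ {p_i} : s <= x_i }
    return [max(s for s in Si + [pi] if s <= xi)
            for xi, Si, pi in zip(x, S, p)]

def upper_adjust(x, S, q):
    # x^_i = min{ s in S_i ∪ {q_i} : s >= x_i }
    return [min(s for s in Si + [qi] if s >= xi)
            for xi, Si, qi in zip(x, S, q)]
```

With S_i = {0, 1, …, 5} this reproduces the integer floor/ceiling behavior described above.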
The box obtained from red[p, q] upon the above S-adjustment is referred to as the S-adjusted γ-valid reduction of [p, q] and denoted red_S[p, q]. However, rather than first computing red[p, q] and then S-adjusting it, one should compute red_S[p, q] directly, using the formulas red_S[p, q] = [p̃, q̃] with
With the above modifications to accommodate the discrete constraint, the Polyblock Algorithm and the SIT Algorithm for (MO) can be extended to solve (DMO).
For every box [p, q] let red_S[p, q] denote the S-adjusted γ-valid reduction of [p, q], i.e., the box obtained from [p', q'] := red[p, q] by replacing p' with ⌈p'⌉_S and q' with ⌊q'⌋_S.
Discrete Polyblock Algorithm
Initialization. Take an initial polyblock P_0 ⊃ G ∩ H, with proper vertex set T_0. Let x̂^0 be the best feasible solution available (the current best feasible solution), γ_0 = f(x̂^0), S_0 = {x ∈ S | f(x) > γ_0}. If no feasible solution is available, let γ_0 = −∞. Set k := 0.
Step 1. Delete every v ∈ T_k such that red_{S_k}[a, v] = ∅. For every v ∈ T_k such that red_{S_k}[a, v] ≠ ∅, denote the highest vertex of the latter box again by v, and if f(v) ≤ γ_k then drop v. Let T̃_k be the set that results from T_k after performing this operation for every v ∈ T_k (so T̃_k ⊂ H). Reset T_k := T̃_k.
Step 2. If T_k = ∅, terminate: if γ_k = −∞ the problem is infeasible; if γ_k = f(x̂^k), then x̂^k is an optimal solution.
Step 3. If T_k ≠ ∅, select v^k ∈ argmax{f(v) | v ∈ T_k}. If v^k ∈ G ∩ S, terminate: v^k is an optimal solution.
Step 4. If v^k ∈ G \ S, compute ṽ^k = ⌊v^k⌋_{S_k} (using formula (11.25) for S = S_k). If v^k ∉ G, compute x^k = π_G(v^k) and define ṽ^k = x^k if x^k ∈ S_k, ṽ^k = ⌊x^k⌋_{S_k} if x^k ∉ S_k.
Step 5. Let T_k* = {z ∈ T_k | z > ṽ^k}. Compute
  T'_k = (T_k \ T_k*) ∪ {z^{k,i} = z + (ṽ^k_i − z_i)e^i | z ∈ T_k*, i = 1, …, n}.
Let T_{k+1} be the set obtained from T'_k by removing every z^i such that {j | z_j > y_j} = {i} for some z ∈ T_k* and y ∈ T_k.
Step 6. Determine the new current best feasible solution x̂^{k+1}. Set γ_{k+1} = f(x̂^{k+1}), S_{k+1} = {x ∈ S | f(x) > γ_{k+1}}. Increment k and return to Step 1.
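The vertex update in Step 5 can be sketched as follows. This is a simplified illustration: improper (dominated) vertices are filtered by pairwise comparison rather than by the sharper removal rule of Step 5, and all names are hypothetical.

```python
def polyblock_update(T, v):
    # T: list of proper vertices (tuples); v: the cutting point (v~ in Step 5)
    Tstar = [z for z in T if all(zi > vi for zi, vi in zip(z, v))]
    kept = [z for z in T if z not in Tstar]
    new = []
    for z in Tstar:
        for i in range(len(v)):
            # z^{k,i}: replace the i-th coordinate of z by v_i
            zi = list(z)
            zi[i] = v[i]
            new.append(tuple(zi))
    cand = kept + new
    # keep only non-dominated (proper) vertices
    return [z for z in cand
            if not any(y != z and all(a <= b for a, b in zip(z, y)) for y in cand)]
```

For a single vertex (4, 4) cut at (2, 2), this produces the two new proper vertices (2, 4) and (4, 2).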
Proposition 11.21 If s = n, the DPA (Discrete Polyblock) algorithm is finite. If s < n and there exists a constant δ > 0 such that
  v_i^{k_l} − a_i ≥ δ
This class of problems includes, aside from linear multiplicative programs (see Chap. 8, Sect. 8.5), diverse generalized linear fractional programs, namely:
(φ(y) = min{y_1, …, y_m})
  min{ max( f_1(x)/g_1(x), …, f_m(x)/g_m(x) ) | x ∈ D }  (11.32)
(φ(y) = max{y_1, …, y_m})
First note that in the study of problems (11.31)–(11.34) it is usually assumed that
In fact, in view of (11.29), g_i(x) does not vanish on D, hence has a constant sign on D, and by replacing f_i, g_i with their negatives if necessary, we can assume g_i(x) > 0 ∀x ∈ D, i = 1, …, m. Then, setting f̃_i(x) = f_i(x) − a_i g_i(x), we have, by (11.29), f̃_i(x) > 0 ∀x ∈ D, i = 1, …, m, and since f̃_i(x)/g_i(x) + a_i = f_i(x)/g_i(x), the problem (P) is equivalent to
  max{ φ( f̃_1(x)/g_1(x), …, f̃_m(x)/g_m(x) ) | x ∈ D },
Clearly, G is a normal set, H a conormal set, and both are contained in the box [0, b] with
  b_i = max_{x∈D} f_i(x)/g_i(x), i = 1, …, m.
Proposition 11.23 Problems (P) and (Q) are equivalent, respectively, to the following problems:
  (MP) max{φ(y) | y ∈ G}
  (MQ) min{φ(y) | y ∈ H}.
More precisely, if x̄ solves (P) ((Q), resp.) then ȳ with ȳ_i = f_i(x̄)/g_i(x̄) solves (MP) ((MQ), resp.). Conversely, if ȳ solves (MP) ((MQ), resp.) and x̄ satisfies ȳ_i ≤ f_i(x̄)/g_i(x̄) (ȳ_i ≥ f_i(x̄)/g_i(x̄), resp.), then x̄ solves (P) ((Q), resp.).
Proof Suppose x̄ solves (P). Then ȳ = (f_1(x̄)/g_1(x̄), …, f_m(x̄)/g_m(x̄)) satisfies φ(y) ≤ φ(ȳ) for all y satisfying y_i = f_i(x)/g_i(x), i = 1, …, m, with x ∈ D. Since φ is increasing, this implies φ(y) ≤ φ(ȳ) ∀y ∈ G, hence ȳ solves (MP). Conversely, if ȳ solves (MP) then ȳ_i ≤ f_i(x̄)/g_i(x̄), i = 1, …, m, for some x̄ ∈ D, and since φ is increasing, φ(f_1(x̄)/g_1(x̄), …, f_m(x̄)/g_m(x̄)) ≥ φ(ȳ) ≥ φ(y) for every y ∈ G, i.e., for every y satisfying y_i = f_i(x)/g_i(x), i = 1, …, m, with x ∈ D. This means x̄ solves (P). ∎
Thus generalized linear fractional problems (P), (Q) in R^n can be converted into monotonic optimization problems (MP), (MQ), respectively, in R^m_+, with m usually much smaller than n.
b. Solution Method
We only briefly discuss a solution method for problem (MP) (a similar method works for (MQ)):
  (MP) max{φ(y) | y ∈ G},
where (see (11.39)) G = {y ∈ R^m_+ | y_i ≤ f_i(x)/g_i(x), i = 1, …, m, for some x ∈ D} is contained in the box [0, b] with b satisfying
  b_i = max_{x∈D} f_i(x)/g_i(x), i = 1, …, m,  (11.41)
  a_i = min_{x∈D} f_i(x)/g_i(x) > 0, i = 1, …, m,  (11.42)
Let γ be the optimal value of (MP) (γ > 0 since φ: R^n_+ → R_++). Given a tolerance ε > 0, we say that a vector ȳ ∈ G is an ε-optimal solution of (MP) if γ ≤ (1 + ε)φ(ȳ), or equivalently, if φ(y) ≤ (1 + ε)φ(ȳ) ∀y ∈ G. The vector x̄ ∈ D satisfying ȳ_i = f_i(x̄)/g_i(x̄) is then called an ε-optimal solution of (P).
11.5 Problems with Hidden Monotonicity
ỹ_i^k = f_i(x^k)/g_i(x^k), i = 1, …, m (so ỹ^k is a feasible solution). Determine the new current best feasible solution.
c. Implementation Issues
A basic operation in Step 1 is to compute the point π_G(z) where the halfline from a through z meets the upper boundary of the normal set G. Since G ⊂ [0, b] we have π_G(z) = λz, where
  λ = max{ λ | λ ≤ min_{i=1,…,m} f_i(x)/(z_i g_i(x)), x ∈ D }
    = max_{x∈D} min_{i=1,…,m} f_i(x)/(z_i g_i(x)).  (11.44)
λ̂ with λ̂ ≤ λ ≤ (1 + δ)λ̂,
  ȳ = (1 + δ)λ̂ z,  (11.45)
where δ > 0 is a tolerance. Of course, to guarantee termination of the above algorithm, the tolerance δ must be suitably chosen. This is achieved by choosing δ such that
  φ((1 + δ)λ̂ z) < (1 + ε/2) φ(λ̂ z),  (11.46)
ε/2 · φ(π_G(z^k)). Then, noting that φ(ỹ^k) ≥ φ(π_G(z^k)) because φ(y) is increasing, we will have
The monotonic structure in (11.47) is easily disclosed. Since the set C(D) := {y = C(x), x ∈ D} is compact, it is contained in some box [a, b] ⊂ R^m_+. For each y ∈ [a, b] let φ(y) be the corresponding value function; the problem
  min{ f(x) | x ∈ D, Σ_{i=1}^m u_i(x)/v_i(x) ≤ 1 },
with C_i(x) = u_i(x)/v_i(x) and g(y) = Σ_{i=1}^m y_i − 1, then becomes
  min{ φ(y) | Σ_{i=1}^m y_i ≤ 1, y ∈ [a, b] }.
11.6 Applications
If p = (f_1, …, f_n), Γ(p) = (Γ_1(p), …, Γ_n(p)), and U(Γ(p)) is the system utility, then the problem we are concerned with can be formulated as
  max U(Γ(p))
  s.t. Γ_i(p) ≥ γ_{i,min}, i = 1, …, n;  (11.51)
    0 ≤ f_i ≤ f_i^max, i = 1, …, n.
  max U(Γ(w))
  s.t. Γ_i(w_1, …, w_n) ≥ γ_{i,min}, i = 1, …, n;  (11.52)
    0 ≤ ‖w_i‖_2^2 ≤ P_i^max, i = 1, …, n.
  max{U(y) | y ∈ G ∩ H},
where G is a closed normal set, H a closed conormal set.
Remark 11.5 In Bjornson and Jorswieck (2012) and Zhang et al. (2012) the monotonic optimization problem was solved by the Polyblock Algorithm or the BRB Algorithm. Certainly the performance could have been much better with the SIT Algorithm, had it been available at the time. Once again, it should be emphasized that not only is the SIT algorithm more efficient, but it also best suits the user's purpose when one is mostly interested in quickly finding a feasible solution or, possibly, a feasible solution better than a given one.
[Figure: a field of sensors (marked ×) surrounding a target]
(SCEP)  min Σ_{i=1}^n E_i(r_i)
  s.t. max_{1≤i≤n} (r_i − ‖s_i − t_j‖) ≥ 0, j = 1, …, m,
    l ≤ r ≤ u.
  H := {x ∈ [a, b] | g_j(x) ≥ 0, j = 1, …, m}
is a conormal set (see Sect. 11.1), and (SCEP) is actually a standard monotonic optimization problem: minimizing the increasing function f(x) = Σ_{i=1}^n x_i over the conormal set H.
428 11 Monotonic Optimization
  H = ∩_{j∈J} H_j,  G := ∪_{j∈J} C_j = ∪_{j∈J} [a, a^j],
it follows that
where h(x) = max_{i=1,…,m} (2⟨a^i, x⟩ + ρ_i^2 − ‖a^i‖^2). Since both ‖x‖^2 and h(x) are obviously increasing functions, this is a discrete d.m. optimization problem that can be solved by a branch-reduce-and-bound method as discussed for solving the Master Problem in Sect. 11.3. Clearly, if r̄ is the maximal value of r such that (Q(r)) is feasible, then x̄ = x(r̄) will solve (DL). Noting that r̄ ≥ 0 and that for any r > 0 one has r ≤ r̄ or r > r̄ according to whether (Q(r)) is feasible or not, the value r̄ can be found by a Bolzano binary search scheme: starting from an interval [0, s] containing r̄, one halves it at each step by solving a (Q(r)) with a suitable r. Quite encouraging computational results with this preliminary version of the SIT approach to d.m. optimization have been reported in Tuy et al. (2003). However, it turns out that more complete and much better results can be obtained by a direct application of the enhanced Discrete SIT Algorithm adapted to problem (11.54). Next we describe this method.
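The Bolzano scheme just described can be sketched as follows, assuming a feasibility oracle for (Q(r)) is available (here modeled as a black-box predicate that is monotone in r; the names are illustrative):

```python
def bolzano_search(feasible, s, tol=1e-6):
    """Approximate r* = sup{r in [0, s] : Q(r) feasible} by bisection,
    assuming feasible(r) is True for r <= r* and False for r > r*."""
    lo, hi = 0.0, s
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid   # r* lies in [mid, hi]
        else:
            hi = mid   # r* lies in [lo, mid]
    return lo
```

Each step halves the interval containing r̄, so roughly log2(s/tol) oracle calls suffice.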
Observe that
  ‖x − a^i‖ ≥ ρ_i + z, i = 1, …, m
  ⇔ ‖x‖^2 + ‖a^i‖^2 − 2⟨a^i, x⟩ ≥ (z + ρ_i)^2, i = 1, …, m
  ⇔ max_{i=1,…,m} {(z + ρ_i)^2 + 2⟨a^i, x⟩ − ‖a^i‖^2} ≤ ‖x‖^2.
Therefore, by setting
where φ(x, z) and ‖x‖^2 are increasing functions and [a, b] is a box containing S.
To apply the Discrete SIT Algorithm for solving this problem, observe that the value of z is determined once x is fixed. Therefore, branching should be performed upon the variables x only.
Another key operation in the algorithm is bounding: given a box [p, q] ⊂ [a, b], compute an upper bound β(M) for the optimal value of the subproblem
If CBV = r̄ > 0 is the largest value of z known so far at a feasible solution (x, z), then only feasible points (x, z) with z > r̄ are still of interest. Therefore, to obtain a tighter value of β(M) one should consider, instead of (11.57), the problem
Because φ(x, z) and ‖x‖^2 are both increasing, the constraints of DL(M, r̄) can be relaxed to φ(p, z) ≤ ‖q‖^2, so an obvious upper bound is
Although this bound is easy to compute (it is the zero of the increasing function z ↦ φ(p, z) − ‖q‖^2), it is often not very efficient. A better bound can be computed by solving a linear relaxation of DL(M, r̄), obtained by omitting the discrete constraint x ∈ S and replacing φ(x, z) with a linear underestimator. Since
  ‖x‖^2 ≤ Σ_{j=1}^n [(p_j + q_j)x_j − p_j q_j]  for all x ∈ [p, q],
the constraint φ(x, z) ≤ ‖x‖^2 can be relaxed to
  max_{i=1,…,m} {2(r̄ + ρ_i)(z − r̄) + (r̄ + ρ_i)^2 + 2⟨a^i, x⟩ − ‖a^i‖^2} ≤ Σ_{j=1}^n [(p_j + q_j)x_j − p_j q_j].
Therefore, an upper bound can be taken to be the optimal value β(M) of the linear programming problem
(LP(M, r̄))  max z
  s.t. 2(r̄ + ρ_i)(z − r̄) + (r̄ + ρ_i)^2 + 2⟨a^i, x⟩ − ‖a^i‖^2 ≤ Σ_{j=1}^n [(p_j + q_j)x_j − p_j q_j], i = 1, …, m,
    z ≥ r̄,  p ≤ x ≤ q.
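The linear relaxation above rests on the chord (secant) inequality x_j^2 ≤ (p_j + q_j)x_j − p_j q_j, valid on [p_j, q_j] because t ↦ t^2 is convex. A small numerical check of the resulting overestimator of ‖x‖^2 (illustrative code, not from the text):

```python
def secant_overestimate(x, p, q):
    # sum over j of the chord of t^2 over [p_j, q_j], evaluated at x_j;
    # dominates ||x||^2 whenever p <= x <= q, with equality at box corners
    return sum((pj + qj) * xj - pj * qj for xj, pj, qj in zip(x, p, q))
```

At a corner of the box the chord touches the parabola, so the bound is tight there.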
The bounds can be further improved by using valid reduction operations. For this, observe that, since φ(x, r̄) ≤ φ(x, z) for r̄ ≤ z, the feasible set of (DL(M, r̄)) is contained in the set
and p̃ = ⌈p'⌉_S, q̃ = ⌊q'⌋_S; then the box [p̃, q̃] ⊂ [p, q] is a valid reduction of [p, q].
Thus, the Discrete SIT Algorithm as specialized to problem (DL) becomes the following algorithm:
11.7 Exercises
In branch and bound methods for solving optimization problems with a nonconvex constraint set, an important operation is bounding over a current partition set M. A common bounding practice consists in replacing the original problem (P) by a more easily solvable relaxed problem. Solving this relaxed problem restricted to M yields a lower bound for the objective function value over M. Under suitable conditions this relaxation can be proved to lead to a convergent algorithm. Below we present the RLT method (Sherali and Adams 1999), which is an extension of the Reformulation-Linearization Technique for quadratic programming (Chap. 6, Sect. 9.5), and the SDP relaxation method (Lasserre 2001, 2002), which uses a specific convex relaxation technique based on semidefinite programming.
Letting l_i(x) := c_{i0} + ⟨c^i, x⟩ be the linear part of f_i(x), the polynomial (12.1) can be written as
  f_i(x) = l_i(x) + Σ_{α ∈ A_i} c_{iα} x^α,
where A_i = {α ∈ A_i | Σ_{j=1}^n α_j ≥ 2}. Setting A = ∪_{i=0}^m A_i and c_{iα} = 0 for α ∈ A \ A_i, we can further write (12.1) as
  f_i(x) = l_i(x) + Σ_{α ∈ A} c_{iα} x^α.  (12.2)
12.2 Linear and Convex Relaxation Approaches
The RLT method is based on the simple observation that, using the substitution y_α = x^α, each polynomial f_i(x) can be transformed into the function L_i(x, y) = l_i(x) + Σ_{α∈A} c_{iα} y_α, which is linear in (x, y) ∈ R^n × R^{|A|}. Consequently, the polynomial optimization problem (P) can be rewritten as
(Q)  min l_0(x) + Σ_{α∈A} c_{0α} y_α
  s.t. l_i(x) + Σ_{α∈A} c_{iα} y_α ≤ 0 (i = 1, …, m),
    a ≤ x ≤ b,
    y_α = x^α, α ∈ A.
Dropping the nonconvex constraints
  y_α = x^α, α ∈ A  (12.3)
yields a linear relaxation LR(P) of (P). For any box M = [p, q] ⊂ [a, b] let (P_M) denote the problem (P) restricted to the box M = [p, q]; then the optimal value of LR(P_M) yields a lower bound β(M) for min(P_M), with β(M) = +∞ whenever the feasible set of (P_M) is empty.
The bounds computed this way may be rather poor. To obtain a tighter relaxation, and hence tighter bounds, one way is to augment the original constraint set with implied constraints formed by taking mixed products of the original bound constraints. For example, for a partition set [p, q], such an implied constraint is (x_1 − p_1)(x_3 − p_3)(q_2 − x_2) ≥ 0, i.e.,
  −x_1x_2x_3 + p_1x_2x_3 + p_3x_1x_2 + q_2x_1x_3 − p_1p_3x_2 − q_2p_3x_1 − p_1q_2x_3 + p_1p_3q_2 ≥ 0.
Of course, the more implied constraints are added, the tighter the relaxation, but the larger the problem becomes. In practice it is enough to add just a few implied constraints.
With bounds computed by RLT relaxation it is not difficult to devise a convergent
rectangular branch and bound algorithm for polynomial optimization.
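The basic RLT substitution y_α = x^α can be illustrated as follows: a polynomial stored as a dictionary of multi-indices becomes a linear function of the lifted variables y_α, and the relaxation consists precisely in forgetting that y must equal the lifting of x. A minimal sketch with illustrative names:

```python
def lift(x, alphas):
    """Exact lifting y_alpha = x^alpha -- the nonconvex link (12.3)
    that the RLT relaxation drops."""
    out = {}
    for a in alphas:
        v = 1.0
        for xi, ai in zip(x, a):
            v *= xi ** ai
        out[a] = v
    return out

def linearize(poly, y):
    """poly: dict multi-index tuple -> coefficient c_alpha.
    Returns sum of c_alpha * y_alpha, linear in the y variables."""
    return sum(c * y[a] for a, c in poly.items())
```

For f(x) = 3 x_1^2 x_2 − 2 x_2 at x = (2, 1), lifting and then linearizing recovers the exact value 3·4·1 − 2·1 = 10.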
To ease the reasoning in the sequel, it is convenient to write a vector α = (α_1, α_2, …, α_n) ∈ A as a sequence of nonnegative integers between 0 and n as follows: α_1 numbers 1, then α_2 numbers 2, …, then α_n numbers n. For example, for n = 4 the vector α = (2, 3, 1, 2) is written as J(α) = 11222344. By Ā denote the set of all J(α), α ∈ A.
Similarly, if x_l = q_l then
Proof Consider first the case x_l = p_l. For |J| = 1, consider any j ∈ {0, 1, …, n}. If (x, y) is feasible to LR(P_M) then (x, y) satisfies
By the induction hypothesis, y_{J∪{l}\{j}} = p_l y_{J\{j}}, x_l x_j f(x)_L = p_l x_j f(x)_L, and x_l f(x)_L = p_l f(x)_L, so the right-hand sides of both the above inequalities are zero. Hence y_{J∪{l}} = p_l y_J. ∎
Corollary 12.1 Let M = [p, q] ⊂ [a, b]. If (x, y) is a feasible solution of LR(P_M) such that x is a corner of the box M, then x is a feasible solution of (P_M), i.e., is a feasible solution of problem (P) in the box M.
  i_k ∈ argmax{δ_i^k | i = 1, …, n},
  δ_i^k = min{x_i^k − p_i^k, q_i^k − x_i^k}.
This method is due essentially to Lasserre (2001, 2002). By p* denote the optimal value of the polynomial optimization problem (P).
We show that an SDP relaxation of (P) can be constructed whose optimal value approximates p* as closely as desired. Consider the following matrices, called moment matrices:
  M_1(x) = (1, x_1, …, x_n)^T (1, x_1, …, x_n)  (12.7)
    = [ 1    x_1    x_2    …  x_n
        x_1  x_1^2  x_1x_2 …  x_1x_n
        x_2  x_1x_2 x_2^2  …  x_2x_n
        …    …      …      …  …
        x_n  x_1x_n x_2x_n …  x_n^2 ]  (12.8)
  M_2(x) = (1, x_1, …, x_n, x_1^2, x_1x_2, …, x_1x_n, x_2^2, x_2x_3, …, x_n^2)^T × (1, x_1, …, x_n, x_1^2, x_1x_2, …, x_1x_n, x_2^2, x_2x_3, …, x_n^2)  (12.9)
    = [ 1      x_1      …  x_n    x_1^2      …  x_n^2
        x_1    x_1^2    …  x_1x_n x_1^3      …  x_1x_n^2
        …      …        …  …      …          …  …
        x_n^2  x_1x_n^2 …  x_n^3  x_1^2x_n^2 …  x_n^4 ]  (12.10)
and so on. We also set M_0(x) = 1. Clearly, each moment matrix M_k(x) is positive semidefinite. Setting M_{ik}(x) := f_i(x) M_k(x) for any given natural number k, we then have
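The rank-one structure of the first moment matrix, and its positive semidefiniteness, are easy to verify numerically. A minimal sketch using NumPy (illustrative, not from the text):

```python
import numpy as np

def moment_matrix_1(x):
    # basis vector (1, x_1, ..., x_n); M_1(x) = v v^T is rank-one and PSD
    v = np.concatenate(([1.0], x))
    return np.outer(v, v)
```

For n = 2 this returns the 3 × 3 matrix of (12.8), whose eigenvalues are all nonnegative.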
Although very ingenious, the RLT and SDP relaxation approaches suffer from two
drawbacks.
First, they are very sensitive to the highest degree of the polynomials involved.
Even for problems of small size but with polynomials of high degree they require
introducing a huge number of additional variables.
Second, they only provide an approximate optimal value rather than an approximate optimal solution of the problem. In fact, in these methods the problem (P) is relaxed to a linear program or an SDP problem in the variables x ∈ R^n, y = {y_α, α ∈ A}, which becomes equivalent to (P) only when the nonconvex constraints
  y_α = x^α ∀α ∈ A  (12.12)
are added. So if (x̂, ŷ) is an optimal solution to this relaxed problem, then x̂ will solve problem (P) if ŷ_α = x̂^α ∀α ∈ A. The idea, then, is to use a suitable BB procedure with bounding based on this kind of relaxation to generate a sequence (x̂^k, ŷ^k) such that ‖ŷ^k − x̂^k‖ → 0 as k → +∞. Under these conditions, given tolerances ε > 0, η > 0, it is expected that for large enough k one would have ‖ŷ^k − x̂^k‖ ≤ ε, while f_0(x̂^k) ≤ p* + η, where p* is the optimal value of problem (P). Such an x̂^k is accepted as an approximate optimal solution, though it is infeasible and may sometimes be quite far off the true optimal solution.
Because of these difficulties, if we are not so much interested in finding an optimal solution as in quickly finding a feasible solution, or possibly a feasible solution better than a given one, then neither the RLT nor the SDP approach can properly serve our purpose. By contrast, with the monotonic optimization approach presented below we will be able to get around these limitations.
  f_i(x) ≤ 0, i = 1, …, m, x ∈ [a, b].
If f_0(x) = f_{01}(x) − f_{02}(x), where f_{01}(x), f_{02}(x) are polynomials with positive coefficients, then the problem (P) can be written as
where the function (x, t) ↦ f_{01}(x) + t f_{02}(b) is a polynomial with positive coefficients.
It then follows that, by introducing an additional variable if necessary and changing the notation, we can convert any polynomial programming problem (P) into the form
From now on we assume that the original polynomial programming problem (P) has been converted to the form (Q).
In general the feasible solution set D of (Q) is not robust: it may contain isolated points. Therefore, as argued in Sect. 11.3, to avoid possible numerical instability, instead of trying to compute a global optimal solution of (P) one should look for the best nonisolated feasible solution. This motivates the following definitions:
A nonisolated feasible solution x* of (Q) is called an essential optimal solution if f(x*) ≤ f(x) for all nonisolated feasible solutions x of (Q), i.e., if
where S* denotes the set of all nonisolated feasible solutions of (Q). Assume
If g(a) > 0 and γ = f(a), then a solves (Q_γ). Therefore, barring this trivial case, we can assume that
Since g and f are both increasing functions, problem (Q_γ) can be solved very efficiently, while, as it turns out, solving (Q_γ) furnishes an answer to question (SP_γ). More precisely,
Proposition 12.1 Under assumption (12.16):
(i) Any feasible solution x^0 of (Q_γ) such that g(x^0) > 0 is a nonisolated feasible solution of (P) with f(x^0) ≤ γ. In particular, if max(Q_γ) > 0 then the optimal solution x̂ of (Q_γ) is a nonisolated feasible solution of (P) with f(x̂) ≤ γ.
(ii) If max(Q_γ) < ε for γ = f(x̄) − ε, and x̄ is a nonisolated feasible solution of (P), then x̄ is an essential ε-optimal solution of (P). If max(Q_γ) < ε for γ = f(b), then the problem (P) is ε-essentially infeasible.
Proof (i) In view of assumption (12.16), x^0 ≠ a. Since f(a) < γ and x^0 ≥ a, every x in the line segment joining a to x^0 and sufficiently close to x^0 satisfies f(x) ≤ f(x^0) ≤ γ, g(x) > 0, i.e., is a feasible solution of (P). Therefore, x^0 is a nonisolated feasible solution of (P) satisfying f(x^0) ≤ γ.
so for every x ∈ [a, b] satisfying g(x) ≥ ε we must have f(x) > γ = f(x̄) − ε; in other words,
and repeat.
– Otherwise, max(Q_γ) ≤ 0: no essential feasible solution x of (Q) exists such that f(x) ≤ γ. So if γ = f(x̂) − η for some essential feasible solution x̂ of (Q), then x̂ is an essential η-optimal solution; if γ = γ_0, problem (Q) is essentially infeasible.
The key to implementing this conceptual scheme is an algorithm for solving the master problem.
a. Valid Reduction
Since g(x) = min_{j=1,…,m} {u_j(x) − v_j(x)} with u_j(x), v_j(x) increasing polynomials (see (12.13)), we can also write
The reduction operation aims at replacing the box [p, q] with a smaller box [p', q'] ⊂ [p, q] without losing any point x ∈ B_γ ∩ [p, q], i.e., such that
  B_γ ∩ [p', q'] = B_γ ∩ [p, q],
where
  q' = p + Σ_{i=1}^n α_i (q_i − p_i) e^i,  p' = q' − Σ_{i=1}^n β_i (q'_i − p_i) e^i,
with φ_i(t) = f(p + t(q_i − p_i)e^i) and
  α_i = 1 if f(q^i) ≤ 0, α_i = N(φ_i) if f(q^i) > 0;
  β_i = 1 if min_j {u_j(q'^i) − v_j(p)} ≥ 0, β_i = N(ψ_i) if min_j {u_j(q'^i) − v_j(p)} < 0.
Remark 12.2 It is easily verified that the box [p', q'] = red[p, q] still satisfies
b. Valid Bounds
Since g(x) = min_{j=1,…,m} {u_j(x) − v_j(x)} and u_j(x), v_j(x) are increasing, an obvious upper bound is
Though very simple, this bound ensures convergence of the algorithm, as will be seen shortly. However, for a better performance of the procedure, tighter bounds can be computed using, for instance, the following:
Lemma 12.2 (i) If g(p) > 0 and f(p) < γ, then p is a nonisolated feasible solution with f(p) < γ.
(ii) If f(q) > γ and x(M) = p + λ(q − p) with λ = max{λ | f(p + λ(q − p)) ≤ γ}, z^i = q + (x_i(M) − q_i)e^i, i = 1, …, n, then an upper bound of g(x) over all x ∈ [p, q] satisfying f(x) ≤ γ is
(see Sect. 10.1). The above procedure thus amounts to constructing a polyblock Z ⊃ {x ∈ [p, q] | f(x) ≤ γ}, which is possible because f(x) is increasing. To obtain a tighter bound, one can even construct a sequence of polyblocks Z_1 ⊃ Z_2 ⊃ ⋯ approximating the set B_γ ∩ [p, q] more and more closely. For the details of this construction the interested reader is referred to Tuy (2000a). By using polyblock approximation one could compute a bound as tight as one wishes; since, however, the computation cost increases rapidly with the quality of the bound, a trade-off must be made, so in practice just one approximating polyblock, as in the above lemma, is used.
Remark 12.4 In many cases, efficient bounds can also be computed by exploiting additional structure, such as partial convexity, aside from monotonicity properties (Tuy 2007b).
For example, assuming, without loss of generality, that p ∈ R^n_{++}, i.e., p_i > 0, i = 1, …, n, and using the transformation t_i = log x_i, i = 1, …, n, one can write each monomial x^α as exp⟨t, α⟩, and each polynomial f_k(x) = Σ_{α∈A_k} c_α x^α as a function φ_k(t) = Σ_{α∈A_k} c_α exp⟨t, α⟩. Since, as can easily be seen, exp⟨t, α⟩ with α ∈ R^n_+ is an increasing convex function of t, each polynomial in x becomes a d.c. function of t, and the subproblem over [p, q] becomes a d.c. optimization problem under d.c. constraints. Here each d.c. function is a difference of two increasing convex functions of t on the box [log p, log q], where log p denotes the vector (log p_1, …, log p_n). A bound over [p, q] can then be computed by exploiting simultaneously the d.c. and d.m. structures.
c. A Robust Algorithm
Incorporating the above RBB procedure for solving (Q_γ) into the SIT scheme yields the following robust algorithm for solving (Q):
  x^k = p^k + λ_k(q^k − p^k).
6.a) If g(x^k) > 0, then x^k is a new nonisolated feasible solution of (Q) with f(x^k) ≤ γ: if g(p^k) < 0, compute the point x̄^k where the line segment joining p^k to x^k meets the surface g(x) = 0, and reset x̄ ← x̄^k; otherwise, reset x̄ ← p^k. Go to Step 7.
it follows that
Remark 12.5 The SIT Algorithm works its way to the optimum through a sequence of better and better nonisolated feasible solutions. If for some reason the algorithm has to be stopped prematurely, some reasonably good feasible solution may already have been obtained. This is a major advantage over most existing algorithms, which may be useless when stopped prematurely.
So far we have assumed (12.14), so that the feasible set has a nonempty interior. Consider now the case when there are equality constraints, e.g.,
  h_l(x) = 0, l = 1, …, s.  (12.19)
First, if some of these constraints are linear, they can be used to eliminate certain variables. Therefore, without loss of generality we can assume that all the constraints (12.19) are nonlinear. Since, however, in the most general case one cannot expect to compute a solution of a nonlinear system of equations in finitely many steps, one should be content with an approximate system
  |h_l(x)| ≤ η, l = 1, …, s,
where η > 0 is the tolerance. In other words, a set of constraints of the form
  g_j(x) ≤ 0, j = 1, …, m,
  h_l(x) = 0, l = 1, …, s
The above approach can be extended without difficulty to the class of discrete (in
particular, integral) polynomial optimization problems. This can be done along
the same lines as the extension of monotonic optimization methods to discrete
monotonic optimization.
12.4 An Illustrative Example
This problem would be difficult to solve by the RLT or Lasserre's method, since a very large number of variables would have to be introduced.
For ε = 0.01, after 53 cycles of incumbent transcending, the SIT Algorithm found the essential ε-optimal solution
with essential ε-optimal value 5.906278 at iteration 667, and confirmed its essential ε-optimality at iteration 866.
12.5 Exercises
With tolerance ε = 0.01, solve this problem by the SIT monotonic optimization algorithm.
Hint: the essential ε-optimal solution is
  z ∈ C, F(z, y) ≥ 0 ∀y ∈ D.
(MPEC)  minimize f(x, z)
  subject to (x, z) ∈ U, z ∈ C(x),
    F(z, y) ≥ 0 ∀y ∈ D(x).
  z ∈ S(x).  (13.1)
then C(x) = D(x) = R^n_+, F(z, y) = ⟨F(z), y − z⟩. Although this case can be reduced to the previous one, it deserves special interest, as it can be shown to include the important case when S is generated by a Nash equilibrium problem.
There is no need to dwell on the difficulty of MPEC. Finding just an equilibrium
point is already a hard problem, let alone solving an optimization problem with
an equilibrium constraint. In spite of that, the problem has attracted a great
deal of research, due to its numerous important applications in natural sciences,
engineering, and economics.
Quite understandably, in the early period, aside from heuristic algorithms for solving certain classes of MPECs (Friesz et al. 1992), many of the solution methods proposed for MPECs were essentially local optimization methods. Few papers have been concerned with finding a global optimal solution to MPEC.
Among the most important works devoted to MPECs we should mention the
monographs by Luo et al. (1996) and Outrata et al. (1998). Both monographs
describe a large number of MPECs arising from applications in diverse fields of
natural sciences, engineering, and economics.
13.2 Bilevel Programming
In its general form the Bilevel Programming Problem can be formulated as follows
If all the functions F(x, y), g_1(x, y), g_2(C(x), y), d(y) are convex, the problem is called a Convex Bilevel Program (CBP); if all these functions are affine, the problem is called a Linear Bilevel Program (LBP).
Specifically, the Linear Bilevel Programming problem can be formulated as
where L(x, y) is a linear function, d ∈ R^{n_2}, A^i ∈ R^{m_i × n_1}, B^i ∈ R^{m_i × n_2}, c^i ∈ R^{m_i}, i = 1, 2. In this case g_1(x, y) = max_{i=1,…,m_1} (c^1 − A^1x − B^1y)_i ≤ 0, g_2(u, y) = max_{i=1,…,m_2} (c^2 − u − B^2y)_i ≤ 0, while C(x) = A^2x.
Clearly all assumptions (A1), (A2), (A3) hold for (LBP) provided the set D := {(x, y) ∈ R^{n_1}_+ × R^{n_2}_+ | A^1x + B^1y ≥ c^1, A^2x + B^2y ≥ c^2} is a nonempty polytope. Also, for every x ≥ 0 satisfying A^1x + B^1y ≥ c^1, the set {y ∈ R^{n_2}_+ | A^2x + B^2y ≥ c^2} is compact, so R(x) is then solvable.
Originally formulated and studied as a mathematical program by Bracken and McGill (1973, 1974), bilevel and, more generally, multilevel optimization has become a subject of extensive research, owing to numerous applications in diverse fields such as: economic development policy, agriculture economics, road network
fields such as: economic development policy, agriculture economics, road network
design, oil industry regulation, international water systems and flood control,
energy policy, traffic assignment, structural analysis in mechanics, etc. Multilevel
optimization is also useful for the study of hierarchical designs in many complex
systems, particularly in biological systems (Baer 1992; Vincent 1990).
Even (LBP) is an NP-hard nonconvex global optimization problem (Hansen et al.
1992). Despite the linearity of all functions involved, this is a difficult problem
fraught with pitfalls. Actually, several early published methods for its solution turned out to be nonconvergent, incorrect, or convergent to a local optimum quite far from the true optimum. Some of these errors have been reported and analyzed,
e.g., in Ben-Ayed (1993). A popular approach to (LBP) at that time was to use,
directly or indirectly, a reformulation of it as a single mathematical program which,
in most cases, was a linear program with an additional nonconvex constraint. It was
this peculiar additional nonconvex constraint that constituted the major source of
difficulty and was actually the cause of most common errors.
Whereas (LBP) has been extensively studied, the general bilevel programming
problem has received much less attention. Until two decades ago, most research in
(GBP) was limited to convex or quadratic bilevel programming problems, and/or
was mainly concerned with finding stationary points or local solutions (Bard 1988,
1998; Bialas and Karwan 1982; Candler and Townsley 1982; Migdalas et al. 1998;
Shimizu et al. 1997; Vicentee et al. 1994; White and Annandalingam 1993). Global
optimization methods for (GBP) appeared much later, based on converting (GBP) into a monotonic optimization problem (Tuy et al. 1992, 1994b; see also Tuy and Ghannadan 1998).
We show that under assumptions (A1)–(A3), (GBP) can be converted into a single-level optimization problem which is a monotonic optimization problem.
In view of Assumptions (A1) and (A2), the set {(C(x), d(y)) | (x, y) ∈ D} is compact, so without loss of generality we can assume that this set is contained in some box [a, b] ⊂ R^{m+1}_+.
Setting
we define
Proposition 13.1 (i) The function ψ(u) is finite for every u ∈ U and satisfies ψ(u') ≤ ψ(u) whenever u' ≥ u ∈ U. In other words, ψ(u) is a decreasing function on U.
(ii) The function f(u, t) is finite on the set W. If (u', t') ≥ (u, t) ∈ W then (u', t') ∈ W and f(u', t') ≤ f(u, t). In other words, f(u, t) is a decreasing function on W.
(iii) u ∈ U for every (u, t) ∈ W.
Proof (i) If u ∈ U then there exists (x, y) ∈ D such that C(x) ≤ u; hence ψ(u) < +∞. It is also obvious that u' ≥ u ∈ U implies ψ(u') ≤ ψ(u).
(ii) By assumption (A2) the set D of all (x, y) ∈ R^{n_1}_+ × R^{n_2}_+ satisfying g_1(x, y) ≤ 0, g_2(C(x), y) ≤ 0 is nonempty and compact. Then for every (u, t) ∈ W the feasible set of the problem determining f(u, t) is nonempty and compact, hence f(u, t) < +∞ because F(x, y) is continuous by (A1). Furthermore, if (u', t') ≥ (u, t) ∈ W, then obviously (u', t') ∈ W and f(u', t') ≤ f(u, t).
(iii) Obvious. ∎
As we just saw, the assumption (A2) serves only to ensure the solvability of problem (13.9), and hence the finiteness of the function f(u, t). Therefore, without any harm it can be replaced by the weaker assumption
(A2*) The set W = {(u, t) ∈ R^m × R | ∃(x, y) ∈ D: u ≥ C(x), t ≥ d(y)} is compact.
In addition to (A1), (A2)/(A2*), and (A3) we now make a further assumption:
(A4) The function ψ(u) is upper semicontinuous on int U, while f(u, t) is lower semicontinuous on int W.
Proposition 13.2 If all the functions F(x, y), g_1(x, y), g_2(C(x), y), d(y) are convex, then Assumption (A4) holds, with ψ(u), f(u, t) being convex functions.
Proof Observe that if ψ(u) = d(y) for x satisfying g_2(C(x), y) ≤ 0, C(x) ≤ u, and ψ(u') = d(y') for x' satisfying g_2(C(x'), y') ≤ 0, C(x') ≤ u', then g_2(C(λx + (1−λ)x'), λy + (1−λ)y') ≤ λ g_2(C(x), y) + (1−λ) g_2(C(x'), y'), and C(λx + (1−λ)x') ≤ λC(x) + (1−λ)C(x') ≤ λu + (1−λ)u', so that (λx + (1−λ)x', λy + (1−λ)y') is feasible to the problem determining ψ(λu + (1−λ)u'); hence ψ(λu + (1−λ)u') ≤ d(λy + (1−λ)y') ≤ λ d(y) + (1−λ) d(y') = λ ψ(u) + (1−λ) ψ(u'). Therefore the function ψ(u) is convex, and hence continuous, on int U. The convexity, and hence the continuity, of f(u, t) on int W is proved similarly. ∎
Proposition 13.3 (GBP) is equivalent to the monotonic optimization problem
in the sense that min(GBP) = min(Q), and if (ū, t̄) solves (Q) then any optimal solution (x̄, ȳ) of the problem
solves (GBP).
Proof By Proposition 13.1 the function f(u, t) is decreasing, while t − ψ(u) is increasing, so (Q) is indeed a monotonic optimization problem. If (x, y) is feasible to (GBP), then g_1(x, y) ≤ 0, g_2(C(x), y) ≤ 0, and taking u = C(x), t = d(y) = ψ(C(x)), we have t ≥ ψ(u), hence
where f(z) is a decreasing function and h(z) a continuous increasing function. The feasible set of this monotonic optimization problem (Q) is a normal set, hence robust, and can be handled without difficulty. As shown in Chap. 11, such a problem (Q) can be solved either by the polyblock outer approximation method or by the BRB (branch-reduce-and-bound) method. The polyblock method is practical only for problems (Q) with small values of m (typically m ≤ 5). The BRB method we are going to present is more suitable for problems with fairly large values of m.
As its name indicates, the BRB method proceeds according to the standard
branch and bound scheme, with three basic operations: branching, reducing (the
partition sets), and bounding.
1. Branching consists in a successive rectangular partition of the initial box M₀ =
[a, b] ⊂ ℝᵐ × ℝ, following a chosen subdivision rule. As will be explained
shortly, branching should be performed upon the variables u ∈ ℝᵐ, so the
subdivision rule is induced by a standard bisection upon these variables.
2. Reducing consists in applying valid cuts to reduce the size of the current partition
set M = [p, q] ⊂ [a, b]. The box [p′, q′] that results from the cuts is referred to as
a valid reduction of M.
3. Bounding consists in estimating a valid lower bound β(M) for the objective
function value f(z) over the feasible portion contained in the valid reduction
[p′, q′] of a given partition set M = [p, q].
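The interplay of the three operations can be sketched on the simplified canonical model min{f(z) | h(z) ≤ 0, z ∈ [a, b]} with f decreasing and h increasing. The code below is an illustrative sketch only: the function names, the trivial bound f(q), and the crude pruning rules are our simplifying assumptions, not the exact reduction and bounding rules developed in this section.

```python
import heapq

def brb_minimize(f, h, a, b, tol=1e-3, max_iter=200000):
    """Branch-and-bound sketch for min f(z) s.t. h(z) <= 0, z in [a, b],
    with f decreasing and h increasing in every coordinate.
    Bound over a box [p, q]: f(q), valid because f is decreasing.
    Pruning: h(p) > 0 makes the whole box infeasible, h being increasing."""
    n = len(a)
    best_val, best_z = float("inf"), None
    heap = [(f(b), tuple(a), tuple(b))]          # (lower bound, p, q)
    for _ in range(max_iter):
        if not heap:
            break
        lb, p, q = heapq.heappop(heap)           # best-first selection
        if lb >= best_val - tol:
            break                                # incumbent is tol-optimal
        if h(p) > 0:
            continue                             # reduce: box entirely infeasible
        if f(p) < best_val:
            best_val, best_z = f(p), p           # p is feasible since h(p) <= 0
        if h(q) <= 0:                            # whole box feasible:
            if f(q) < best_val:                  # f(q) is the exact min over [p, q]
                best_val, best_z = f(q), q
            continue
        j = max(range(n), key=lambda i: q[i] - p[i])   # branch: bisect longest edge
        m = 0.5 * (p[j] + q[j])
        heapq.heappush(heap, (f(q[:j] + (m,) + q[j + 1:]), p, q[:j] + (m,) + q[j + 1:]))
        heapq.heappush(heap, (f(q), p[:j] + (m,) + p[j + 1:], q))
    return best_val, best_z
```

On min −(z₁ + z₂) over the quarter disc z₁² + z₂² ≤ 1 inside [0, 1]², for instance, this returns a value within the tolerance of −√2.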
Throughout the sequel, for any vector z = (u, t) ∈ ℝᵐ × ℝ we define ẑ = u.
a. Branching
Since an optimal solution (u, t) of the problem (Q) must satisfy t = ψ(u), it suffices
to determine the values of the variables u in an optimal solution. This suggests
that instead of branching upon z = (u, t) as usual, we should branch upon u,
according to a chosen subdivision rule, which may be the standard bisection or
the adaptive subdivision rule. As shown by Corollary 6.2, the standard subdivision
method is exhaustive, i.e., for any infinite nested sequence Mₖ, k = 1, 2, …, such
that each Mₖ₊₁ is a child of Mₖ via a bisection, we have diam Mₖ → 0. This property
guarantees convergence, but a rather slow one, because the method does not take account of the structure of the problem.
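The exhaustiveness of standard bisection is easy to verify numerically: splitting a box through the midpoint of its longest edge halves every edge once per n splits, so the diameter of any nested sequence of children tends to 0. A small illustration (the helper names are ours):

```python
def bisect_longest(p, q):
    """Standard bisection: split [p, q] perpendicular to its longest edge at the midpoint."""
    j = max(range(len(p)), key=lambda i: q[i] - p[i])
    m = 0.5 * (p[j] + q[j])
    return (p, q[:j] + (m,) + q[j + 1:]), (p[:j] + (m,) + p[j + 1:], q)

def diam(p, q):
    """Euclidean diameter of the box [p, q]."""
    return sum((qi - pi) ** 2 for pi, qi in zip(p, q)) ** 0.5

# any infinite nested sequence of children has diameter -> 0
p, q = (0.0, 0.0), (1.0, 1.0)
for _ in range(40):
    (p, q), _ = bisect_longest(p, q)   # always descend into the first child
print(diam(p, q))                      # each edge has been halved 20 times
```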
b. Valid Reduction
At a given stage of the BRB algorithm for (Q) a feasible solution z̄ is available which
is the best so far known. Let γ = f(z̄) and let [p, q] ⊂ [a, b] be a box generated
during the partitioning procedure which is still of interest. Since an optimal solution
of (Q) is attained at a point satisfying h(z) := t − ψ(u) = 0, the search for a feasible
solution z of (Q) in [p, q] such that f(z) ≤ γ can be restricted to the set B ∩ [p, q],
where
B := {z | f(z) ≤ γ, h(z) = 0}.   (13.11)
B ∩ [p′, q′] = B ∩ [p, q],
where
q′ = p + Σᵢ₌₁ⁿ αᵢ(qᵢ − pᵢ)eⁱ,   p′ = q′ − Σᵢ₌₁ⁿ βᵢ(q′ᵢ − pᵢ)eⁱ,   (13.13)
with
αᵢ = 1 if h(qⁱ) ≥ 0, and αᵢ = N(φᵢ), the root of φᵢ(α) := h(p + α(qᵢ − pᵢ)eⁱ), if h(qⁱ) < 0;
βᵢ = 1 if f(q′ⁱ) ≤ γ, and βᵢ = N(ψᵢ), the root of ψᵢ(β) := f(q′ − β(q′ᵢ − pᵢ)eⁱ) − γ, if f(q′ⁱ) > γ.
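In practice the cut coefficients αᵢ, βᵢ are roots of one-dimensional monotone functions (under the standing assumptions both φᵢ and ψᵢ are increasing on [0, 1]), which a plain bisection locates reliably. A sketch, with names of our choosing:

```python
def monotone_root(phi, lo=0.0, hi=1.0, tol=1e-9):
    """Root of an increasing function phi on [lo, hi], assuming phi(lo) <= 0 <= phi(hi).
    This plays the role of N(phi) in computing the cut coefficients alpha_i, beta_i."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, monotone_root(lambda t: t * t - 0.25) locates the root 0.5 of t² − 0.25 on [0, 1].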
c. Valid Bounds
Since f(z) is decreasing, an obvious lower bound over M = [p, q] is f(q). We will shortly see that to
ensure convergence of the BRB algorithm, it suffices that the lower bounds satisfy
β(M) ≥ f(q).
Obviously, h(π(z)) = 0.
Lemma 13.1 If s = π(q), vⁱ = q − (qᵢ − sᵢ)eⁱ, i = 1, …, m+1, and I = {i | vⁱ ∈ H},
then a valid lower bound over M = [p, q] is
β(M) = min{f(vⁱ) | i ∈ I}
     = minᵢ∈I min{F(x, y) | g₁(x, y) ≤ 0, g₂(v̂ⁱ, y) ≤ 0, (C(x), d(y)) ≤ vⁱ, x ≥ 0, y ≥ 0}.
Proof This follows since the polyblock with vertices vⁱ, i ∈ I, contains all feasible
solutions z = (u, t) still of interest in [p, q]. □
In the case of (CBP) (convex bilevel programming) the function ψ(u) is convex
by Proposition 13.2, so the function h(u, t) = t − ψ(u) is concave, and a lower
bound can be computed which is rather tight. In fact, since h(z) is concave and since
the box [p, q] has been α-reduced, so that h(qⁱ) = 0, i = 1, …, n [see (13.13)], the set
{z ∈ [p, q] | h(z) = 0} is contained in the simplex S := [q¹, …, qⁿ]. Hence a lower
bound of f(u, t) over M = [p, q] is given by
Step 2. Let P′ₖ be the collection of boxes that results from Pₖ after completion
of Step 1. Update CBS and CBV. From Rₖ remove all M ∈ Rₖ such
that β(M) ≥ CBV, and let R′ₖ be the resulting collection. Let Sₖ =
R′ₖ ∪ P′ₖ.
Step 3. If Sₖ = ∅, terminate: CBV is the optimal value and CBS (the feasible
solution z̄ with f(z̄) = CBV) is an optimal solution.
Step 4. If Sₖ ≠ ∅, select Mₖ ∈ argmin{β(M) | M ∈ Sₖ}. Subdivide Mₖ = [pᵏ, qᵏ]
according to the adaptive subdivision rule: if β(Mₖ) = f(vᵏ) for some
vᵏ ∈ Mₖ and zᵏ = π(vᵏ) is the first point where the line segment from
vᵏ to pᵏ meets the surface h(z) = 0, then bisect Mₖ via (wᵏ, jₖ), where
wᵏ = ½(vᵏ + zᵏ) and jₖ ∈ argmaxⱼ |zᵏⱼ − vᵏⱼ|. Let Pₖ₊₁
be the collection of newly generated boxes.
Step 5. Set Rₖ₊₁ = Sₖ \ {Mₖ}, increment k, and return to Step 1.
Proposition 13.5 Whenever infinite, the above algorithm generates an infinite
sequence {zᵏ}, every cluster point of which yields a global optimal solution of (GBP).
Proof Follows from Proposition 6.2. □
In that case the dimension of the problem (Q) can be reduced. For this, define
It is easily seen that all these functions are decreasing. Indeed, v′ ≥ v implies
g₂(C(v′), y) ≤ g₂(C(v), y); hence, whenever y is feasible to (13.18) for v and
v′ ≥ v, then y is also feasible for v′. This shows that ψ(v′) ≤ ψ(v). Similarly, whenever
(v, t) is feasible to (13.19) and (v′, t′) ≥ (v, t), then (v′, t′) is also feasible, and hence
f(v′, t′) ≤ f(v, t).
With ψ(v) and f(v, t) defined in this way, we still have
Proposition 13.6 Problem (GBP) is equivalent to the monotonic optimization
problem
(Q̃)   min{f(v, t) | t − ψ(v) ≥ 0}.
solves (GBP).
Proof If (x, y) is feasible to (GBP), then g₁(x, y) ≤ 0, g₂(C(x), y) ≤ 0, and taking
v = x, t = d(y) = ψ(x), we have t − ψ(v) = 0; hence, setting G := {(v, t) | t −
ψ(v) ≥ 0} yields
Hence, for suitable Z and z, we obtain
x = Zv + z with Ez = 0.
(so that, in particular, t̄ = ψ(v̄), and (x̄, ȳ) solves the problem (13.22) where v =
v̄, t = t̄ = ψ(v̄)); then, since v̄ = Ex̄, we have ψ(Ex̄) = ψ(v̄) ≤ d(ȳ). Noting
that Zv̄ = ZEx̄ = x̄, this implies that ȳ solves min{d(y) | g₂(Cx̄, y) ≤ 0}, and
consequently, (x̄, ȳ) solves (GBP).
In the special case when all functions involved in (GBP) are linear we have the
linear bilevel programming problem
(LBP)   min L(x, y)  s.t.  A¹x + B¹y ≤ c¹, x ∈ ℝⁿ¹₊, y solves
        min{⟨d, y⟩ | A²x + B²y ≤ c², y ∈ ℝⁿ²₊},
where
A = (A¹; A²),  B = (B¹; B²),  c = (c¹; c²)  (blocks stacked vertically),
A¹ = (A₁,₁; …; A₁,ₘ₁),  A² = (A₂,₁; …; A₂,ₘ₁),
D = {(x, y) | Ax + By ≤ c, x ≥ 0, y ≥ 0}.
q′ = p + Σᵢ₌₁ⁿ αᵢ(qᵢ − pᵢ)eⁱ,   p′ = q′ − Σᵢ₌₁ⁿ βᵢ(q′ᵢ − pᵢ)eⁱ,
where
αᵢ = 1 if h(qⁱ) ≥ 0, and αᵢ = N(φᵢ), the root of φᵢ(α) := h(p + α(qᵢ − pᵢ)eⁱ), if h(qⁱ) < 0;
βᵢ = 1 if f(q′ⁱ) ≤ γ, and βᵢ = N(ψᵢ), the root of ψᵢ(β) := f(q′ − β(q′ᵢ − pᵢ)eⁱ) − γ, if f(q′ⁱ) > γ.
13.2 Bilevel Programming 467
Lower bounding: For any box M = [p, q] with red(M) = [p′, q′], a lower bound
for f(z) over [p, q] is
min (x² + y²)  s.t.  x ≥ 0, y ≥ 0, y solves
    min y  s.t.  3x + y ≥ 15,
                 x + y ≥ 7,
                 x + 3y ≥ 15.
Assume X is compact and C is continuous. Then the set C(X) is compact and there
exists a box [a, b] ⊂ ℝᵐ₊ such that C(X) ⊂ [a, b].
For every y ∈ [a, b] define
x ∈ X_WE ⟺ h(C(x)) ≥ 0.
⟺ h(C(x)) ≥ 0. □
Since the set C(X) is compact, it is contained in some box [a, b] ⊂ ℝᵐ. For each
y ∈ [a, b] let
φ(y) := min{f(x) | x ∈ X, g(x) ≤ 0, C(x) ≥ y}.
Clearly for y′ ≤ y we have φ(y′) ≤ φ(y), so this is an increasing function.
Proposition 13.8 Problem (OWE) is equivalent to the monotonic optimization
problem
min{φ(y) | h(y) ≥ 0, y ∈ [a, b]}.   (13.25)
This is a problem with hidden monotonicity in the constraints, of the kind studied
in Chap. 11, Sect. 11.5. By Proposition 11.24, it is equivalent to problem (13.25).
For small values of m the monotonic optimization problem (13.25) can be solved
fairly efficiently by the copolyblock approximation algorithm, analogous to the
polyblock algorithm described in Sect. 11.2, Chap. 11. For larger values of m, the
SIT algorithm presented in Sect. 11.3 is more suitable. The latter is a rectangular
BRB algorithm involving two specific operations: valid reduction and lower
bounding.
a. Valid Reduction
At a given stage of the BRB algorithm for solving (13.25), a feasible solution ȳ is
known or has been produced which is the best so far available. Let γ = φ(ȳ) and
let [p, q] ⊂ [a, b] be a box currently still of interest among all boxes generated in the
partitioning process. Since both φ(y) and h(y) are increasing, an optimal solution
of (13.25) must be attained at a point y ∈ [p, q] satisfying h(y) = 0 (i.e., a lower
boundary point of the conormal set Ω := {y ∈ [a, b] | h(y) ≥ 0}). The search for a
better feasible solution than ȳ can therefore be restricted to the set
where
q′ = p + Σᵢ₌₁ⁿ αᵢ(qᵢ − pᵢ)eⁱ,   p′ = q′ − Σᵢ₌₁ⁿ βᵢ(q′ᵢ − pᵢ)eⁱ,
with
αᵢ = 1 if h(qⁱ) ≤ 0, and αᵢ = N(φᵢ), the root of φᵢ(t) := h(p + t(qᵢ − pᵢ)eⁱ), if h(qⁱ) > 0;
βᵢ = 1 if h_γ(q′ⁱ) ≥ 0, and βᵢ = N(ψᵢ), the root of ψᵢ(t) := h_γ(q′ − t(q′ᵢ − pᵢ)eⁱ), if h_γ(q′ⁱ) < 0.
13.3 Optimization Over the Efficient Set 471
Remark 13.3 It is easily verified that the box [p′, q′] = red[p, q] still satisfies
φ(p′) ≤ γ, h(p′) ≤ 0, h(q′) ≥ 0.
b. Lower Bounding
We note that, since φ(y) is increasing, an obvious bound is φ(p). It turns out that, as
will shortly be seen, the convergence of the BRB algorithm for solving (13.25)
will be ensured provided that
β(M) ≥ φ(p).   (13.28)
A lower bound satisfying this condition will be referred to as a valid lower bound.
Define λ(y) = min{φ(y) − γ, h(y)}, Λ = {y ∈ [p, q] | λ(y) ≥ 0}.
If λ(p) ≥ 0, then obviously p is an exact minimizer of φ(y) over the
feasible points in [p, q] at least as good as the current best, and can be used to update
the current best solution. Suppose, therefore, that λ(p) < 0, h(p) < 0, i.e., p ∈ [a, b] \ Ω.
For each y ∈ [p, q] such that h(y) < 0 let π(y) be the first point where the line
segment from y to q meets the lower boundary of Ω, i.e.,
Obviously, h(π(y)) = 0.
Lemma 13.2 If z̄ = π(p), zⁱ = p + (z̄ᵢ − pᵢ)eⁱ, i = 1, …, m, and I = {i | zⁱ ∈ Ω},
then a valid lower bound over M = [p, q] is
β(M) = min{φ(zⁱ) | i ∈ I}
     = minᵢ∈I min{f(x) | x ∈ D, C(x) ≥ zⁱ}.
Proof Let Mᵢ = [zⁱ, q]. From the definition of z̄ it is easily seen that h(u) < 0 for all
u ∈ [p, z̄), i.e., Ω ∩ [p, q] ⊂ [p, q] \ [p, z̄). Noting that {u | p ≤ u < z̄} = ∩ᵢ₌₁ᵐ {u | pᵢ ≤
uᵢ < z̄ᵢ}, we can write [p, q] \ [p, z̄) = [p, q] \ ∩ᵢ₌₁ᵐ {u | uᵢ < z̄ᵢ} = ∪ᵢ₌₁ᵐ {u ∈ [p, q] | z̄ᵢ ≤
uᵢ ≤ qᵢ} = ∪ᵢ₌₁ᵐ Mᵢ. Thus, if I = {i | zⁱ ∈ Ω}, then [p, q] ∩ Ω ⊂ ∪ᵢ₌₁ᵐ Mᵢ. Since
φ is increasing, φ(u) ≥ φ(zⁱ) for every u ∈ Mᵢ, whence β(M) = min{φ(zⁱ) | i ∈ I} is
a valid lower bound. □
whence lim_{l→∞} β(M_{k_l}) = φ(y*). On the other hand, since β(M_{k_l}) is minimal
among the current set S_{k_l}, we have β(M_{k_l}) ≤ min{φ(y) | y ∈ [a, b] ∩ Ω} and,
consequently,
φ(y*) ≤ min{φ(y) | y ∈ [a, b] ∩ Ω}.
Clearly h(yⁱ) = 0.
Proposition 13.11 If the functions Cᵢ(x), i = 1, …, m, are concave, then a lower
bound for φ(y) over the feasible portion in M = [p, q] is
β(M) = min{φ(y) | y ∈ S}
      ≤ min{φ(y) | y ∈ [p, q], h(y) = 0},   (13.31)
we can write
min_{y∈S} φ(y)
We now turn to the constrained optimization problem over the efficient set.
Moreover, it is compact. Recall from Sect. 11.1 (Chap. 11) that a point z ∈ Ω is
called an upper boundary point of Ω, written z ∈ ∂⁺Ω, if (z, b] ⊂ [a, b] \ Ω. A point
z ∈ ∂⁺Ω is called an upper extreme point of Ω, written z ∈ ext⁺Ω, if x ∈ Ω, x ≥ z
implies x = z. By Proposition 11.5 a compact normal set such as Ω always has at
least one upper extreme point.
Proposition 13.12 x ∈ X_E ⟺ C(x) ∈ ext⁺Ω.
Proof If x⁰ ∈ X_E, then y⁰ = C(x⁰) ∈ Ω and there is no x ∈ X such that C(x) ≥
C(x⁰), C(x) ≠ C(x⁰), i.e., no y = C(x), x ∈ X, such that y ≥ y⁰, y ≠ y⁰; hence
y⁰ ∈ ext⁺Ω. Conversely, if y⁰ = C(x⁰) ∈ ext⁺Ω with x⁰ ∈ X, then there is no
y = C(x), x ∈ X, such that y ≥ y⁰, y ≠ y⁰, i.e., no x ∈ X with C(x) ≥ C(x⁰), C(x) ≠
C(x⁰); hence x⁰ ∈ X_E. □
For every y ∈ ℝᵐ₊ define
μ(y) = min{ Σᵢ₌₁ᵐ (yᵢ − zᵢ) | z ≥ y, z ∈ Ω }.   (13.32)
which has the same form as (13.25), with the difference, however, that h̃(y), though
increasing like h(y), is not usc. Upon the same transformation as previously, this
problem can be rewritten as
Clearly, problem (13.35) differs from problem (13.25) only by the additional
constraint μ(y) = 0, whose presence is what makes (OE) a bit more complicated
than (OWE).
There are two possible approaches for solving (OE). In the first approach,
proposed by Thach and discussed in Tuy et al. (1996a), the problem (OE) is
approximated by an (OWE) differing from (OE) only in that the criterion mapping
C is slightly perturbed and replaced by C^ε with
C_j^ε(x) = C_j(x) + ε Σᵢ₌₁ᵐ Cᵢ(x),   j = 1, …, m,
where ε > 0 is sufficiently small. Denote by X_WE^ε and X_E^ε, respectively, the weakly efficient
set and the efficient set of X w.r.t. C^ε(x), and consider the problem
(OWE_ε)   γ(ε) := min{f(x) | g(x) ≤ 0, x ∈ X_WE^ε}.
Proposition 13.14 (i) For any ε > 0 we have X_WE^ε ⊂ X_E.
(ii) γ(ε) → γ as ε ↘ 0.
(iii) If X is a polytope and C is linear, then there is ε₀ > 0 such that X_WE^ε = X_E for
all ε ∈ (0, ε₀).
Proof (i) Let x ∈ X \ X_E. Then there is x̄ ∈ X satisfying C(x̄) ≥ C(x),
Cᵢ(x̄) > Cᵢ(x) for at least one i. This implies Σᵢ₌₁ᵐ Cᵢ(x̄) > Σᵢ₌₁ᵐ Cᵢ(x),
hence C^ε(x̄) > C^ε(x), so that x ∉ X_WE^ε. Thus, if x ∈ X \ X_E then
x ∈ X \ X_WE^ε, i.e., X_WE^ε ⊂ X_E.
(ii) Property (i) implies that γ(ε) ≥ γ. Define
x⁰ ∈ X_E we have (x⁰ + K) ∩ F = ∅, and hence there exists ε_F > 0 such that
(x⁰ + K^ε) ∩ F = ∅ for all ε ∈ (0, ε_F). Then we have (x + K^ε) ∩ F = ∅ for all x ∈ F
whenever 0 < ε ≤ ε_F. If 𝓕 denotes the set of facets F of X such that F ⊄ X_E,
then 𝓕 is finite and ε₀ = min{ε_F | F ∈ 𝓕} > 0. Hence (x + K^ε) ∩ F = ∅
for all x ∈ F, all F ∈ 𝓕, i.e., X_E ⊂ X_WE^ε whenever 0 < ε < ε₀. □
As a consequence, an approximate optimal solution of (OE) can be obtained by
solving (OWE_ε) for any ε such that 0 < ε < ε₀.
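Part (i) of Proposition 13.14 is easy to check on a finite sample of criterion vectors: perturbing the criteria by C_j^ε = C_j + ε Σᵢ Cᵢ turns every dominated point into a strictly dominated one, so the weakly efficient set of the perturbed criteria sits inside the efficient set of the original ones. A small sketch in the maximization convention used here (the helper names are ours):

```python
import random

def perturbed(C, eps):
    """C^eps_j(x) = C_j(x) + eps * sum_i C_i(x), for a list of criterion vectors."""
    return [tuple(cj + eps * sum(c) for cj in c) for c in C]

def efficient(C):
    """Indices of efficient vectors: no other vector is >= everywhere and > somewhere."""
    dom = lambda a, b: (all(x >= y for x, y in zip(a, b))
                        and any(x > y for x, y in zip(a, b)))
    return {i for i, c in enumerate(C) if not any(dom(d, c) for d in C)}

def weakly_efficient(C):
    """Indices of weakly efficient vectors: no vector strictly larger in every coordinate."""
    return {i for i, c in enumerate(C)
            if not any(all(x > y for x, y in zip(d, c)) for d in C)}

random.seed(0)
sample = [tuple(random.random() for _ in range(3)) for _ in range(100)]
# X_WE w.r.t. the perturbed criteria is contained in X_E w.r.t. the original ones
assert weakly_efficient(perturbed(sample, 1e-3)) <= efficient(sample)
```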
An alternative approach to (OE) consists in solving the equivalent monotonic
optimization problem (13.25) by a BRB algorithm analogous to the one for solving
problem (OWE). To take account of the peculiar constraint μ(y) = 0, some
precautions and special manipulations are needed:
(i) A feasible solution is a y ∈ [a, b] satisfying μ(y) = 0, i.e., a point of ext⁺Ω.
The current best solution is the best among all so far known feasible solutions.
(ii) In Step 1, after completion of the reduction operation described above, a special
supplementary reduction-deletion operation is needed to identify and fathom
any box [p, q] that contains no point of ext⁺Ω. This supplementary operation
will ensure that every filter of partition sets [pᵏ, qᵏ] collapses to a point
corresponding to an efficient point.
Define h_ε(y) = min{t | x ∈ X, yᵢ − Cᵢ^ε(x) ≤ t, i = 1, …, m}, Ω_ε = {y ∈
[a, b] | y ≤ C^ε(x), x ∈ X}, where C_j^ε(x) = C_j(x) + ε Σᵢ₌₁ᵐ Cᵢ(x), j = 1, …, m.
Since C(x) ≥ a ∈ ℝᵐ₊ for all x ∈ X, we have C(x) ≤ C^ε(x) for all x ∈ X; hence Ω ⊂ Ω_ε.
Let [p, q] be a box which has been reduced as described in Proposition 13.9, so
that p ∈ Ω, q ∉ Ω. Since p ∈ Ω we have μ(p) ≤ 0.
Supplementary Reduction-Deletion Rule: Fathom [p, q] in either of the
following events:
(a) μ(q) ≥ 0 (so that μ(y) ≥ 0 for all y ∈ Ω ∩ [p, q] ⊂ ∂⁺Ω ∩ [p, q]); if μ(q) = 0, then
q gives an efficient point and can be used to update the current best.
(b) h_ε(q) ≤ 0 for some small enough ε > 0 (then q ∈ Ω_ε, so the box [p, q] contains
no point of ∂⁺(Ω_ε), and hence no point of ext⁺Ω).
Proposition 13.15 The BRB algorithm applied to problem (13.35), with precaution
(i) and the supplementary reduction-deletion rule, either yields an optimal solution
to (OE) in finitely many iterations or generates an infinite filter of boxes whose
intersection is a global optimal solution y* of (13.35), corresponding to an optimal
solution x* of (OE).
Proof Because of precautions (i) and (ii), the box selected for further partitioning
in each iteration contains either a point of ext⁺Ω or a point of ext⁺Ω_ε. Therefore,
any infinite filter of boxes generated by the algorithm collapses either to a point
y ∈ ext⁺Ω (yielding then an x ∈ X_E) or to a point y ∈ ext⁺Ω_ε (yielding then an
x ∈ X_WE^ε ⊂ X_E). □
f(x) = ⟨c, x⟩,  c = (4, 8, 3, 1, 7, 0, 6, 2, 9, 3),
X = {x ∈ ℝ¹⁰₊ | A₁x ≤ b₁, A₂x ≤ b₂},
where
A₁ = [ 7 1 3 7 0 9 2 1 5 1
       5 8 1 7 5 0 1 3 4 0
       2 1 0 2 3 2 2 5 1 3
       1 4 2 9 4 3 3 4 0 2
       3 1 0 8 3 1 2 2 5 5 ],
b₁ = (66, 150, 81, 79, 53),
A₂ = [ 2 9 1 2 2 1 4 1 5 2
       2 3 2 4 5 4 1 9 2 1
       4 8 1 1 5 3 2 0 2 9
       2 7 1 2 5 9 4 1 2 0
       2 3 1 1 4 3 1 0 0 6
       6 0 0 0 4 3 2 2 4 6
       2 2 3 5 6 4 0 0 1 4
       1 4 4 6 0 3 4 2 4 1 ],
b₂ = (126, 12, 52, 23, 2, 23, 28, 90),
g(x) := Gx − d,
where
G = [ 1 1 2 4 4 4 1 4 6 3
      4 9 0 1 2 1 6 5 0 0 ],
d = (14, 89),
C = [ 5 1 7 1 4 9 0 4 3 7
      1 2 5 4 1 6 4 0 3 0
      3 3 0 4 0 1 2 1 4 0 ].
Computational results: Optimal solution (tolerance 0.01):
x = (0, 8.741312, 8.953411, 9.600323, 3.248449, 3.916472,
     6.214436, 9.402384, 10, 4.033163)
(found at iteration 78 and confirmed at iteration 316).
13.4 Equilibrium Problem 479
Computational results: Optimal solution (tolerance 0.01):
x = (1.991962, 2.991962)
(found at iteration 16 and confirmed at iteration 16),
Optimal value: 4.983923.
By Theorem 3.4 we already know that this problem has a solution, provided the
following assumption holds:
(A) There exists a closed set-valued map φ from D to C with nonempty compact
values φ(y) ⊂ C for all y ∈ D, such that
The next question that arises is how to find the equilibrium point effectively
when condition (13.40) is fulfilled.
As it turns out, this is a question of fundamental importance both for the theory
and the applications of mathematics.
Historically, about 40 years ago the so-called combinatorial pivotal methods
were developed for computing fixed points and economic equilibria, which was
at the time a very popular subject. Subsequently, interest switched to variational
inequalities, a new class of equilibrium problems which have come to play a central
role in many theoretical and practical questions of modern nonlinear analysis.
With the rapid progress in nonlinear optimization, including global optimization,
nowadays equilibrium problems and, in particular, variational inequalities can be
approached and solved as equivalent optimization problems.
For x, y ∈ C define
Writing min_{z∈C}⟨x, z⟩ = ⟨x, z̄⟩ for some z̄ ∈ C, we see that the function F(x, y) is continuous
in x and convex in y. By Theorem 3.4 there exists x̄ ∈ C such that
where
y ∈ C, g(y) = 0,   (13.46)
The existence theorem for variational inequalities and the conversion of variational
inequalities into equivalent equations date back to Hartman and Stampacchia
(1966). The use of gap functions to convert variational inequalities into optimization
problems (13.47) appeared subsequently (Auslender 1976). However, in the earlier
period, when global optimization methods were underdeveloped, the converted
optimization problem was solved mainly by local optimization methods, i.e., by
methods seeking a local minimizer, or even a stationary point satisfying necessary
optimality conditions. Since the function g(y) may not be convex, such points do not
always satisfy the required inequality in (VI) for all y ∈ C, so strictly speaking they
do not solve the variational inequality. Therefore, additional ad hoc manipulations
have often to be used to guarantee the global optimality of the solution found.
Aside from the function (13.45) introduced by Auslender, many other gap
functions have been developed and analyzed in the last two decades: Fukushima
(1992), Zhu and Marcotte (1994), Mastroeni (2003), among others. For a gap function to be
practically useful it should possess some desirable properties making it amenable to
currently available optimization methods. In particular, it should be differentiable,
if differentiable optimization methods are to be used. From this point of view, the
following function, discovered independently by Auchmuty (1989) and Fukushima
(1992), is of particular interest:
g(y) = max_{z∈C} { ⟨φ(y), y − z⟩ − ½⟨z − y, M(z − y)⟩ },   (13.48)
h(z) = ⟨φ(y), y − z⟩ − ½⟨z − y, M(z − y)⟩.   (13.49)
For every fixed y ∈ C, since h(z) is a strictly concave quadratic function, it has a
unique maximizer u = u(y) on the compact convex set C, which is also the minimizer
of the strictly convex quadratic function −h(z) on C and hence is completely
determined by the condition 0 ∈ −∇h(u) + N_C(u) (see Proposition 2.31), i.e.,
Proposition 13.16 (Fukushima 1992) The function (13.48) is a gap function for
the variational inequality (VI). Furthermore, a vector y ∈ C solves (VI) if and only
if y = u(y), i.e., if and only if y is a fixed point of the mapping u : C → C that
carries each point y ∈ C to the maximizer u(y) of the function h(z) over the set C.
Proof Clearly h(y) = 0, so g(y) = max_{z∈C} h(z) ≥ h(y) = 0 for all y ∈ C. We show
that y ∈ C solves (VI) if and only if g(y) = 0. Since g(y) = max_{z∈C} h(z), we have
g(y) = ⟨φ(y), y − u(y)⟩ − ½⟨u(y) − y, M(u(y) − y)⟩. It follows that g(y) = 0 for
y ∈ C if and only if y = u(y), hence [see (13.50)] if and only if ⟨φ(y), z − y⟩ ≥ 0 for
all z ∈ C; consequently, if and only if y is a solution of (VI). □
It has also been proved in Fukushima (1992) that, under mild assumptions on the
continuity and differentiability of φ(y), the function (13.48) is also continuous and
differentiable, and that any stationary point of it on C solves (VI). Consequently,
with the special gap function (13.48), and under appropriate assumptions, (VI) can
be solved by using conventional differentiable local optimization methods to find a
stationary point of the corresponding problem (13.47).
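For instance, with M = I and C a box, the maximizer in (13.48) reduces to the Euclidean projection u(y) = P_C(y − φ(y)), and the fixed-point characterization of Proposition 13.16 suggests the iteration y ← u(y). The sketch below is illustrative only: the names are ours, and convergence of the plain iteration needs extra assumptions (e.g., φ strongly monotone with a suitable step).

```python
def proj_box(y, lo, hi):
    """Euclidean projection onto the box [lo, hi]."""
    return tuple(min(max(v, l), h) for v, l, h in zip(y, lo, hi))

def vi_fixed_point(phi, lo, hi, y0, iters=100):
    """Iterate y <- u(y) = P_C(y - phi(y)); fixed points solve the VI
    <phi(y), z - y> >= 0 for all z in C (Proposition 13.16 with M = I)."""
    y = tuple(y0)
    for _ in range(iters):
        y = proj_box(tuple(v - f for v, f in zip(y, phi(y))), lo, hi)
    return y

# phi(y) = y - a is strongly monotone; the VI solution is the projection of a onto C
a = (2.0, -1.0, 0.5)
y = vi_fixed_point(lambda y: tuple(v - ai for v, ai in zip(y, a)),
                   (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.3, 0.3, 0.3))
print(y)  # (1.0, 0.0, 0.5)
```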
The next proposition indicates two important cases when the problem (13.44)
using the gap function (13.45) can be solved by well-developed global optimization
methods.
13.5 Optimization with Variational Inequality Constraint 483
Proposition 13.17 (i) If the map φ : ℝᵐ → ℝᵐ is affine, then the gap
function (13.45) is a dc function and the corresponding optimization problem
equivalent to (VI) is a dc optimization problem.
(ii) If C ⊂ ℝᵐ₊ and each function φᵢ : ℝᵐ₊ → ℝ₊, i = 1, …, m, is an increasing
function, then the gap function (13.45) is a dm function and the corresponding
optimization problem equivalent to (VI) is a dm optimization problem.
Proof (i) If φ(y) is affine, then ⟨φ(y), y⟩ is a quadratic function, while
min_{z∈C}⟨φ(y), z⟩ is a concave function, so (13.45) is a dc function.
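The dc decomposition in part (i) can be written out explicitly. Assuming the Auslender gap function has the form g(y) = max_{z∈C}⟨φ(y), y − z⟩ (the sign convention of (13.45) as we read it), the split is:

```latex
g(y) \;=\; \max_{z\in C}\,\langle \varphi(y),\, y - z\rangle
     \;=\; \underbrace{\langle \varphi(y),\, y\rangle}_{\text{quadratic when }\varphi(y)=Ay+b}
     \;-\; \underbrace{\min_{z\in C}\,\langle \varphi(y),\, z\rangle}_{\text{concave in }y\ \text{(pointwise min of affine functions)}}
```

A quadratic form splits as a difference of two convex quadratics, and subtracting the concave minimum adds a convex term, so g is indeed a difference of convex functions.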
(ii) If every function φᵢ(y) is increasing, then for any y, y′ ∈ ℝᵐ₊ such that y′ ≥ y, by
writing
A class of MPECs which has attracted much research in recent years includes
optimization problems with a variational inequality constraint. These are problems
of the form
(OVI)   minimize f(x, y)  subject to  (x, y) ∈ U, y ∈ C,
        ⟨φ(x, y), z − y⟩ ≥ 0  ∀z ∈ C,
(MPAEC)   minimize f(x, y)  subject to  (x, y) ∈ U, y ∈ C,
          ⟨Ax + By + c, z − y⟩ ≥ 0  ∀z ∈ C,
g(x, y) = max_{z∈C} { ⟨φ(x, y), y − z⟩ − ½⟨z − y, M(z − y)⟩ },
Since the feasible set of problem (13.51) is nonconvex, an efficient method for
solving it, as we saw in Chap. 7, Sect. 7.4, is to proceed according to the following
SIT scheme.
For any real number γ denote by (SP_γ) the auxiliary problem of finding a feasible
solution (x, y) to (MPAEC) such that f(x, y) ≤ γ. This problem is a dc optimization
problem under convex constraints, since it can be formulated as
Consequently, as argued in Sect. 6.2, it should be much easier to handle than (13.51),
whose feasible set is nonconvex. In fact this is the rationale for trying to reduce
(MPAEC) to a sequence of subproblems (SP_γ).
Given a tolerance η > 0, a feasible solution (x̄, ȳ) of (MPAEC) is said to be an
η-optimal solution if f(x̄, ȳ) does not exceed the optimal value of (MPAEC) by more than η.
Formally, we can state the following algorithm.
By translating if necessary, it can be assumed that U is a compact convex subset
of a hyperrectangle [a, b] ⊂ ℝⁿ₊ × ℝᵐ₊ and C is a compact convex subset of ℝᵐ₊.
13.6 Exercises
where U ⊂ ℝⁿ⁺ᵐ, C ⊂ ℝⁿ are two nonempty closed convex sets, f(x, y) is a convex
function on ℝⁿ⁺ᵐ, while A, B are given real matrices of appropriate dimensions and c ∈ ℝⁿ.
Show that if A is a symmetric positive semidefinite matrix, then this problem
reduces to a convex bilevel program.
6. Let g(x, y) = max_{z∈C} {⟨Ax + By + c, y − z⟩ − ½⟨z − y, G(z − y)⟩}, where G is an
arbitrary n × n symmetric positive definite matrix. Show that:
(i) g(x, y) ≥ 0 for all (x, y) with y ∈ C;
(ii) (x, y) ∈ U, y ∈ C, g(x, y) = 0 if and only if (x, y) satisfies the conditions
(2) and (3) in Exercise 5.
In other words, the MPAEC in Exercise 5 is equivalent to the dc optimization
problem
min f(x, y)  s.t.  (x, y) ∈ U, y ∈ C, g(x, y) ≤ 0.
References
Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows: Theory, Algorithms and Applications. Prentice Hall, Englewood Cliffs, NJ (1993)
Alexandrov, A.D.: On surfaces which may be represented by a difference of convex functions. Izvestiya Akademii Nauk Kazakhskoj SSR, Seria Fiziko-Matematicheskikh 3, 3–20 (1949, Russian)
Alexandrov, A.D.: On surfaces which may be represented by differences of convex functions. Dokl. Akad. Nauk SSSR 72, 613–616 (1950, Russian)
Al-Khayyal, F.A., Falk, J.E.: Jointly constrained biconvex programming. Math. Oper. Res. 8, 273–286 (1983)
Al-Khayyal, F.A., Larsen, C., Van Voorhis, T.: A relaxation method for nonconvex quadratically constrained quadratic programs. J. Glob. Optim., 215–230 (1995)
Al-Khayyal, F.A., Tuy, H., Zhou, F.: DC optimization methods for multisource location problems. School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta (1997). Preprint
Al-Khayyal, F.A., Tuy, H., Zhou, F.: Large-scale single facility continuous location by DC optimization. Optimization 51, 271–292 (2002)
An, L.T.H.: Solving large scale molecular distance geometry problems by a smoothing technique via the Gaussian transform and D.C. programming. J. Glob. Optim. 27, 375–397 (2003)
An, L.T.H., Tao, P.D.: The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann. Oper. Res. 133, 23–46 (2005)
An, L.T.H., Tao, P.D., Muu, L.D.: Simplicially-constrained DC optimization over the efficient and weakly efficient set. J. Optim. Theory Appl. 117, 503–531 (2003)
An, L.T.H., Belghiti, M.T., Tao, P.D.: A new efficient algorithm based on DC programming and DCA for clustering. J. Glob. Optim. 37, 557–569 (2007)
An, L.T.H., Tao, P.D., Nam, N.C., Muu, L.D.: Methods for optimization over the efficient and weakly efficient sets of an affine fractional vector optimization problem. Optimization 59, 77–93 (2010)
Androulakis, I.P., Maranas, C.D., Floudas, C.A.: αBB: a global optimization method for general constrained nonconvex problems. J. Glob. Optim. 7, 337–363 (1995)
Apkarian, P., Tuan, H.D.: Concave programming in control theory. J. Glob. Optim. 15, 343–370 (1999)
Asplund, E.: Differentiability of the metric projection in finite dimensional Euclidean space. Proc. AMS 38, 218–219 (1973)
Astorino, A., Miglionico, G.: Optimizing sensor cover energy via DC programming. Optim. Lett. (2014). doi:10.1007/s11590-014-0778-y
Aubin, J.-P., Ekeland, I.: Applied Nonlinear Analysis. John Wiley & Sons, New York (1984)
Auchmuty, G.: Variational principles for variational inequalities. Numer. Funct. Anal. Optim. 10, 863–874 (1989)
Auslender, A.: Optimisation, Méthodes Numériques. Masson, Paris (1976)
Bach-Kim, N.T.: An algorithm for optimization over the efficient set. Vietnam J. Math. 28, 329–340 (2000)
Baer, E., et al.: Biological and synthetic hierarchical composites. Phys. Today 46, 60–67 (1992)
Balas, E.: Intersection cuts - a new type of cutting planes for integer programming. Oper. Res. 19, 19–39 (1971)
Balas, E.: Integer programming and convex analysis: intersection cuts and outer polars. Math. Program. 2, 330–382 (1972)
Balas, E., Burdet, C.A.: Maximizing a convex quadratic function subject to linear constraints. Management Science Research Report No. 299, Carnegie-Mellon University (1973)
Bard, J.F.: Practical Bilevel Optimization: Algorithms and Applications. Kluwer, Dordrecht (1998)
Bard, J.F.: Convex two-level optimization. Math. Program. 40, 15–27 (1988)
Ben-Ayed, O.: Bilevel linear programming. Comput. Oper. Res. 20, 485–501 (1993)
Bengtsson, M., Ottersten, B.: Optimum and suboptimum transmit beamforming. In: Handbook of Antennas in Wireless Communications, chap. 18. CRC, Boca Raton (2001)
Benson, H.P.: Optimization over the efficient set. J. Math. Anal. Appl. 98, 562–580 (1984)
Benson, H.P.: An algorithm for optimizing over the weakly-efficient set. Eur. J. Oper. Res. 25, 192–199 (1986)
Benson, H.P.: An all-linear programming relaxation algorithm for optimization over the efficient set. J. Glob. Optim. 1, 83–104 (1991)
Benson, H.P.: A bisection-extreme point search algorithm for optimizing over the efficient set in the linear dependence case. J. Glob. Optim. 3, 95–111 (1993)
Benson, H.P.: Concave minimization: theory, applications and algorithms. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 43–148. Kluwer Academic Publishers, Dordrecht (1995)
Benson, H.P.: Deterministic algorithms for constrained concave minimization: a unified critical survey. Nav. Res. Logist. 43, 765–795 (1996)
Benson, H.P.: An outer approximation algorithm for generating all efficient extreme points in the outcome set of a multiple objective linear programming problem. J. Glob. Optim. 13, 1–24 (1998)
Benson, H.P.: An outcome space branch and bound outer approximation algorithm for convex multiplicative programming. J. Glob. Optim. 15, 315–342 (1999)
Benson, H.P.: A global optimization approach for generating efficient points for multiplicative concave fractional programming. J. Multicrit. Decis. Anal. 13, 15–28 (2006)
Ben-Tal, A., Eiger, G., Gershovitz, V.: Global minimization by reducing the duality gap. Math. Program. 63, 193–212 (1994)
Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization. MPS-SIAM Series on
Optimization. SIAM, Philadelphia (2001)
Bentley, J.L., Shamos, M.I.: Divide and conquer for linear expected time. Inf. Proc. Lett. 7, 87–91 (1978)
Bialas, W.F., Karwan, C.H.: On two-level optimization. IEEE Trans. Autom. Control AC-27, 211–214 (1982)
Bjornson, E., Jorswieck, E.: Optimal resource allocation in coordinated multi-cell systems. Foundations and Trends in Communications and Information Theory 9(2-3), 113–381 (2012). doi:10.1561/0100000069
Bjornson, E., Zheng, G., Bengtsson, M., Ottersten, B.: Robust monotonic optimization framework for multicell MISO systems. IEEE Trans. Signal Process. 60(5), 2508–2521 (2012)
Blum, E., Oettli, W.: From optimization and variational inequalities to equilibrium problems. Math. Student 63, 127–149 (1994)
Bolintineanu, S.: Minimization of a quasi-concave function over an efficient set. Math. Program. 61, 83–120 (1993a)
References 491
Bolintineanu, S.: Optimality conditions for minimization over the (weakly or properly) efficient set. J. Math. Anal. Appl. 173, 523–541 (1993b)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge, United Kingdom (2004)
Bracken, J., McGill, J.: Mathematical programs with optimization problems in the constraints. Oper. Res. 21, 37–44 (1973)
Bracken, J., McGill, J.: A method for solving mathematical programs with nonlinear programs in the constraints. Oper. Res. 22, 1097–1101 (1974)
Bradley, P.S., Mangasarian, O.L., Street, W.N.: Clustering via concave minimization. Technical Report 96-03, Computer Science Department, University of Wisconsin (1997)
Brimberg, J., Love, R.F.: A location problem with economies of scale. Stud. Locat. Anal. 7, 9–19 (1994)
Bui, A.: Étude analytique d'algorithmes distribués de routage. Thèse de doctorat, Université Paris VII (1993)
Bulatov, V.P.: Embedding Methods in Optimization Problems. Nauka, Novosibirsk (1977, Russian)
Bulatov, V.P.: Methods for solving multiextremal problems (global search). In: Beltiukov, B.A., Bulatov, V.P. (eds.) Methods of Numerical Analysis and Optimization. Nauka, Novosibirsk (1987, Russian)
Candler, W., Townsley, R.: A linear two-level programming problem. Comput. Oper. Res. 9, 59–76 (1982)
Caristi, J.: Fixed point theorems for mappings satisfying inwardness conditions. Trans. Am. Math. Soc. 215, 241–251 (1976)
Charnes, A., Cooper, W.W.: Programming with linear fractional functionals. Nav. Res. Logist. 9, 181–186 (1962)
Chen, P.C., Hansen, P., Jaumard, B.: On-line and off-line vertex enumeration by adjacency lists. Oper. Res. Lett. 10, 403–409 (1991)
Chen, P.-C., Hansen, P., Jaumard, B., Tuy, H.: Weber's problem with attraction and repulsion. J. Regional Sci. 32, 467–486 (1992)
Chen, P.-C., Hansen, P., Jaumard, B., Tuy, H.: Solution of the multifacility Weber and conditional Weber problems by D.C. programming. Cahier du GERAD G-92-35, École Polytechnique, Montréal (1992b)
Chen, P.C., Hansen, P., Jaumard, B., Tuy, H.: Solution of the multifacility Weber and conditional Weber problems by DC programming. Oper. Res. 46, 548–562 (1998)
Cheon, M.-S., Ahmed, S., Al-Khayyal, F.: A branch-reduce-cut algorithm for the global optimization of probabilistically constrained linear programs. Math. Program. Ser. B 108, 617–634 (2006)
Cheney, E.W., Goldstein, A.A.: Newton's method for convex programming and Tchebycheff approximation. Numer. Math. 1, 253–268 (1959)
Cooper, L.: Location-allocation problems. Oper. Res. 11, 331–343 (1963)
Dantzig, G.B.: Linear Programming and Extensions. Princeton University Press, Princeton, NJ
(1963)
Dines, L.L.: On the mapping of quadratic forms. Bull. Am. Math. Soc. 47, 494498 (1941)
Dixon, L.C.W., Szego, G.P. (eds.): Towards Global Optimization, vol. I. North-Holland, Amster-
dam (1975)
Dixon, L.C.W., Szego, G.P. (eds.): Towards Global Optimization, vol. II. North-Holland, Amster-
dam (1978)
Duffin, R.J., Peterson, E.L., Zener, C.: Geometric ProgrammingTheory and Application. Wiley,
New York (1967)
Eaves, B.C., Zangwill, W.I.: Generalized cutting plane algorithms. SIAM J. Control 9, 529542
(1971)
Ecker, J.G., Song, H.G.: Optimizing a linear function over an efficient set. J. Optim. Theory Appl.
83, 541563 (1994)
Ekeland, I.: On the variational principle. J. Math. Anal. Appl. 47, 324353 (1974)
Ekeland, I., Temam, R.: Convex Analysis and Variational Problems. North-Holland, Amsterdam (1976)
Ellaia, R.: Contribution à l'analyse et à l'optimisation de différences de fonctions convexes. Thèse de troisième cycle, Université de Toulouse (1984)
Falk, J.E.: Lagrange multipliers and nonconvex problems. SIAM J. Control 7, 534–545 (1969)
Falk, J.E.: Conditions for global optimality in nonlinear programming. Oper. Res. 21, 337–340 (1973a)
Falk, J.E.: A linear max-min problem. Math. Program. 5, 169–188 (1973b)
Falk, J.E., Hoffman, K.L.: A successive underestimation method for concave minimization problems. Math. Oper. Res. 1, 251–259 (1976)
Falk, J.E., Hoffman, K.L.: Concave minimization via collapsing polytopes. Oper. Res. 34, 919–929 (1986)
Falk, J.E., Palocsay, S.W.: Optimizing the sum of linear fractional functions. In: Floudas, C., Pardalos, P. (eds.) Global Optimization, pp. 221–258. Princeton University Press, Princeton (1992)
Falk, J.E., Soland, R.M.: An algorithm for separable nonconvex programming problems. Manag. Sci. 15, 550–569 (1969)
Fan, K.: A minimax inequality and applications. In: Shisha, O. (ed.) Inequalities, vol. 3, pp. 103–113. Academic, New York (1972)
Finsler, P.: Über das Vorkommen definiter und semidefiniter Formen in Scharen quadratischer Formen. Comment. Math. Helv. 9, 188–192 (1937)
Floudas, C.A., Aggarwal, A.: A decomposition strategy for global optimum search in the pooling problem. ORSA J. Comput. 2(3), 225–235 (1990)
Floudas, C.A.: Deterministic Global Optimization: Theory, Methods and Applications. Kluwer, Dordrecht, The Netherlands (2000)
Forgo, F.: The solution of a special quadratic problem (in Hungarian). Szigma, 53–59 (1975)
Foulds, L.R., Haugland, D., Jørnsten, K.: A bilinear approach to the pooling problem. Optimization 24, 165–180 (1992)
Fredman, M.L., Tarjan, R.E.: Fibonacci heaps and their uses in improved network optimization algorithms. J. Assoc. Comput. Mach. 34, 596–615 (1984)
Friesz, T.L., et al.: A simulated annealing approach to the network design problem with variational inequality constraints. Transp. Sci. 28, 18–26 (1992)
Fujiwara, O., Khang, D.B.: A two-phase decomposition method for optimal design of looped water distribution networks. Water Resour. Res. 23, 977–982 (1990)
Fukushima, M.: Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems. Math. Program. 53, 99–110 (1992)
Gabasov, R., Kirillova, F.M.: Linear Programming Methods. Part 3 (Special Problems). Minsk (1980, Russian)
Gale, P.: An algorithm for exhaustive generation of building floor plans. Commun. ACM 24, 813–825 (1981)
Gantmacher, F.R.: The Theory of Matrices, vol. 2. Chelsea Publishing Company, New York (1960)
Ge, R.P., Qin, Y.F.: A class of filled functions for finding global minimizers of a function of several variables. J. Optim. Theory Appl. 54, 241–252 (1987)
Glover, F.: Cut search methods in integer programming. Math. Program. 3, 86–100 (1972)
Glover, F.: Convexity cuts and cut search. Oper. Res. 21, 123–134 (1973a)
Glover, F.: Concave programming applied to a special class of 0–1 integer programs. Oper. Res. 21, 135–140 (1973b)
Graham, R.L.: An efficient algorithm for determining the convex hull of a finite planar set. Inf. Proc. Lett. 1, 132–133 (1972)
Groch, A., Vidigal, L., Director, S.: A new global optimization method for electronic circuit design. IEEE Trans. Circuits Syst. 32, 160–179 (1985)
Guisewite, G.M., Pardalos, P.M.: A polynomial time solvable concave network flow problem. Networks 23, 143–148 (1992)
Haims, M.J., Freeman, H.: A multistage solution of the template-layout problem. IEEE Trans. Syst. Sci. Cyber. 6, 145–151 (1970)
Hamami, M., Jacobsen, S.E.: Exhaustive non-degenerate conical processes for concave minimization on convex polytopes. Math. Oper. Res. 13, 479–487 (1988)
Hansen, P., Jaumard, B., Savard, G.: New branch and bound rules for linear bilevel programming. SIAM J. Sci. Stat. Comput. 13, 1194–1217 (1992)
Hansen, P., Jaumard, B., Tuy, H.: Global optimization in location. In: Drezner, Z. (ed.) Facility Location, pp. 43–68. Springer, New York (1995)
Hartman, P.: On functions representable as a difference of convex functions. Pac. J. Math. 9, 707–713 (1959)
Hartman, P., Stampacchia, G.: On some nonlinear elliptic differential functional equations. Acta Math. 115, 153–188 (1966)
Hillestad, R.J., Jacobsen, S.E.: Linear programs with an additional reverse convex constraint. Appl. Math. Optim. 6, 257–269 (1980b)
Hiriart-Urruty, J.-B.: A short proof of the variational principle for approximate solutions of a minimization problem. Am. Math. Mon. 90, 206–207 (1983)
Hiriart-Urruty, J.-B.: From convex optimization to nonconvex optimization, Part I: necessary and sufficient conditions for global optimality. In: Clarke, F.H., Demyanov, V.F., Giannessi, F. (eds.) Nonsmooth Optimization and Related Topics. Plenum, New York (1989a)
Hiriart-Urruty, J.-B.: Conditions nécessaires et suffisantes d'optimalité globale en optimisation de différences de deux fonctions convexes. C.R. Acad. Sci. Paris 309, Série I, 459–462 (1989b)
Hoai, P.T., Tuy, H.: Monotonic Optimization for Sensor Cover Energy Problem. Institute of Mathematics, Hanoi (2016). Preprint
Hoai-Phuong, N.T., Tuy, H.: A unified monotonic approach to generalized linear fractional programming. J. Glob. Optim. 26, 229–259 (2003)
Hoang, H.G., Tuan, H.D., Apkarian, P.: A Lyapunov variable-free KYP lemma for SISO continuous systems. IEEE Trans. Autom. Control 53, 2669–2673 (2008)
Hoffman, K.L.: A method for globally minimizing concave functions over convex sets. Math. Program. 20, 22–32 (1981)
Holmberg, K., Tuy, H.: A production-transportation problem with stochastic demands and concave production cost. Math. Program. 85, 157–179 (1995)
Horst, R., Tuy, H.: Global Optimization (Deterministic Approaches), 3rd edn. Springer, Berlin/Heidelberg/New York (1996)
Horst, R., Thoai, N.V., Benson, H.P.: Concave minimization via conical partitions and polyhedral outer approximation. Math. Program. 50, 259–274 (1991)
Horst, R., Thoai, N.V., de Vries, J.: On finding new vertices and redundant constraints in cutting plane algorithms for global optimization. Oper. Res. Lett. 7, 85–90 (1988)
Idrissi, H., Loridan, P., Michelot, C.: Approximation for location problems. J. Optim. Theory Appl. 56, 127–143 (1988)
Jaumard, B., Meyer, C.: On the convergence of cone splitting algorithms with ω-subdivisions. J. Optim. Theory Appl. 110, 119–144 (2001)
Jeyakumar, V., Lee, G.M., Li, G.Y.: Alternative theorems for quadratic inequality systems and global quadratic optimization. SIAM J. Optim. 20, 983–1001 (2009)
Jorswieck, E.A., Larsson, E.G.: Monotonic optimization framework for the two-user MISO interference channel. IEEE Trans. Commun. 58, 2159–2168 (2010)
Kakutani, S.: A generalization of Brouwer's fixed point theorem. Duke Math. J. 8, 457–459 (1941)
Kalantari, B., Rosen, J.B.: An algorithm for global minimization of linearly constrained concave quadratic problems. Math. Oper. Res. 12, 544–561 (1987)
Karmarkar, N.: An interior-point approach for NP-complete problems. Contemp. Math. 114, 297–308 (1990)
Kelley, J.E.: The cutting plane method for solving convex programs. J. SIAM 8, 703–712 (1960)
Klinz, B., Tuy, H.: Minimum concave-cost network flow problems with a single nonlinear arc cost. In: Pardalos, P., Du, D. (eds.) Network Optimization Problems, pp. 125–143. World Scientific, Singapore (1993)
Knaster, B., Kuratowski, C., Mazurkiewicz, S.: Ein Beweis des Fixpunktsatzes für n-dimensionale Simplexe. Fund. Math. 14, 132–137 (1929)
Konno, H.: A cutting plane for solving bilinear programs. Math. Program. 11, 14–27 (1976a)
Konno, H.: Maximization of a convex function subject to linear constraints. Math. Program. 11, 117–127 (1976b)
Konno, H., Abe, N.: Minimization of the sum of three linear fractions. J. Glob. Optim. 15, 419–432 (1999)
Konno, H., Kuno, T.: Generalized linear and fractional programming. Ann. Oper. Res. 25, 147–162 (1990)
Konno, H., Kuno, T.: Linear multiplicative programming. Math. Program. 56, 51–64 (1992)
Konno, H., Kuno, T.: Multiplicative programming problems. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 369–405. Kluwer, Dordrecht, The Netherlands (1995)
Konno, H., Yajima, Y.: Minimizing and maximizing the product of linear fractional functions. In: Floudas, C., Pardalos, P. (eds.) Recent Advances in Global Optimization, pp. 259–273. Princeton University Press, Princeton (1992)
Konno, H., Yamashita, Y.: Minimization of the sum and the product of several linear fractional functions. Nav. Res. Logist. 44, 473–497 (1997)
Konno, H., Yajima, Y., Matsui, T.: Parametric simplex algorithms for solving a special class of nonconvex minimization problems. J. Glob. Optim. 1, 65–82 (1991)
Konno, H., Kuno, T., Yajima, Y.: Global minimization of a generalized convex multiplicative function. J. Glob. Optim. 4, 47–62 (1994)
Konno, H., Thach, P.T., Tuy, H.: Optimization on Low Rank Nonconvex Structures. Kluwer, Dordrecht, The Netherlands (1997)
Kough, P.L.: The indefinite quadratic programming problem. Oper. Res. 27, 516–533 (1979)
Kuno, T.: Globally determining a minimum-area rectangle enclosing the projection of a higher dimensional set. Oper. Res. Lett. 13, 295–305 (1993)
Kuno, T., Ishihama, T.: A convergent conical algorithm with ω-bisection for concave minimization. J. Glob. Optim. 61, 203–220 (2015)
Landis, E.M.: On functions representable as the difference of two convex functions. Dokl. Akad. Nauk SSSR 80, 9–11 (1951)
Lasserre, J.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11, 796–817 (2001)
Lasserre, J.: Semidefinite programming vs. LP relaxations for polynomial programming. Math. Oper. Res. 27, 347–360 (2002)
Levy, A.L., Montalvo, A.: The tunneling algorithm for the global minimization of functions. SIAM J. Sci. Stat. Comput. 6, 15–29 (1988)
Locatelli, M.: Finiteness of conical algorithm with ω-subdivisions. Math. Program. 85, 593–616 (1999)
Luc, L.T.: Reverse polyblock approximation for optimization over the weakly efficient set and the efficient set. Acta Math. Vietnam. 26, 65–80 (2001)
Luo, Z.Q., Pang, J.S., Ralph, D.: Mathematical Programs with Equilibrium Constraints. Cambridge University Press, Cambridge, United Kingdom (1996)
Maling, K., Mueller, S.H., Heller, W.R.: On finding optimal rectangle package plans. In: Proceedings 19th Design Automation Conference, pp. 663–670 (1982)
Mangasarian, O.L.: Nonlinear Programming. McGraw-Hill, New York (1969)
Mangasarian, O.L.: Mathematical programming in data mining. Data Min. Knowl. Discov. 1, 183–201 (1997)
Maranas, C.D., Floudas, C.A.: A global optimization method for Weber's problem with attraction and repulsion. In: Hager, W.W., Hearn, D.W., Pardalos, P.M. (eds.) Large Scale Optimization, pp. 259–293. Kluwer, Dordrecht (1994)
Martinez, J.M.: Local minimizers of quadratic functions on Euclidean balls and spheres. SIAM J. Optim. 4, 159–176 (1994)
Mastroeni, G.: Gap functions for equilibrium problems. J. Glob. Optim. 27, 411–426 (2003)
Matsui, T.: NP-hardness of linear multiplicative programming and related problems. J. Glob. Optim. 9, 113–119 (1996)
Mayne, D.Q., Polak, E.: Outer approximation algorithms for non-differentiable optimization. J. Optim. Theory Appl. 42, 19–30 (1984)
McCormick, G.P.: Attempts to calculate global solutions of problems that may have local minima. In: Lootsma, F. (ed.) Numerical Methods for Nonlinear Optimization, pp. 209–221. Academic, London/New York (1972)
McCormick, G.P.: Nonlinear Programming: Theory, Algorithms and Applications. Wiley, New York (1982)
Megiddo, N., Supowit, K.J.: On the complexity of some common geometric location problems. SIAM J. Comput. 13, 182–196 (1984)
Migdalas, A., Pardalos, P.M., Värbrand, P. (eds.): Multilevel Optimization: Algorithms and Applications. Kluwer, Dordrecht (1998)
Murty, K.G.: Linear Programming. Wiley, New York (1983)
Murty, K.G., Kabadi, S.N.: Some NP-complete problems in quadratic and nonlinear programming. Math. Program. 39, 117–130 (1987)
Muu, L.D.: A convergent algorithm for solving linear programs with an additional reverse convex constraint. Kybernetika 21, 428–435 (1985)
Muu, L.D.: An algorithm for solving convex programs with an additional convex-concave constraint. Math. Program. 61, 75–87 (1993)
Muu, L.D.: Models and methods for solving convex-concave programs. Habilitation thesis, Institute of Mathematics, Hanoi (1996)
Muu, L.D., Oettli, W.: Method for minimizing a convex-concave function over a convex set. J. Optim. Theory Appl. 70, 377–384 (1991)
Muu, L.D., Tuyen, H.Q.: Bilinear programming approach to optimization over the efficient set of a vector affine fractional problem. Acta Math. Vietnam. 27, 119–140 (2002)
Muu, L.D., Quoc, T.D., An, L.T., Tao, P.D.: A new decomposition algorithm for globally solving mathematical programs with equilibrium constraints. Acta Math. Vietnam. 37, 201–217 (2012)
Nadler, S.B.: Multivalued contraction mappings. Pac. J. Math. 30, 475–488 (1969)
Nash, J.: Equilibrium points in n-person games. Proc. Natl. Acad. Sci. U.S.A. 36, 48–49 (1950)
Nghia, N.D., Hieu, N.D.: A method for solving reverse convex programming problems. Acta Math. Vietnam. 11, 241–252 (1986)
Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Academic, New York (1970)
Outrata, J.V., Kocvara, M., Zowe, J.: Nonsmooth Approach to Optimization Problems with Equilibrium Constraints. Kluwer, Cambridge (1998)
Pardalos, P.M., Vavasis, S.A.: Quadratic programming with one negative eigenvalue is NP-hard. J. Glob. Optim. 1, 15–22 (1991)
Philip, J.: Algorithms for the vector maximization problem. Math. Program. 2, 207–229 (1972)
Phillips, A.T., Rosen, J.B.: A parallel algorithm for constrained concave quadratic global minimization. Math. Program. 42, 421–448 (1988)
Pincus, M.: A closed form solution of certain programming problems. Oper. Res. 16, 690–694 (1968)
Plastria, F.: Continuous location problems. In: Drezner, Z., Hamacher, H.W. (eds.) Facility Location, pp. 225–262. Springer, Berlin (1995)
Pólik, I., Terlaky, T.: A survey of the S-lemma. SIAM Rev. 49, 371–418 (2007)
Prékopa, A.: Contributions to the theory of stochastic programming. Math. Program. 4, 202–221 (1973)
Prékopa, A.: Stochastic Programming. Kluwer, Dordrecht (1995)
Putinar, M.: Positive polynomials on compact semi-algebraic sets. Indiana Univ. Math. J. 42, 969–984 (1993)
Qian, L.P., Jun Zhang, Y.: S-MAPEL: monotonic optimization for non-convex joint power control and scheduling problems. IEEE Trans. Wirel. Commun. 9, 1708–1719 (2010)
Qian, L.P., Jun Zhang, Y., Huang, J.W.: MAPEL: achieving global optimality for a nonconvex wireless power control problem. IEEE Trans. Wirel. Commun. 8, 1553–1563 (2009)
Qian, L.P., Jun Zhang, Y., Huang, J.: Monotonic optimization in communications and networking systems. Found. Trends Netw. 7, 1–75 (2012)
Roberts, A.W., Varberg, D.E.: Convex Functions. Academic, New York (1973)
Robinson, S.: Generalized equations and their solutions, part 1: basic theory. Math. Program. Stud. 10, 128–141 (1979)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Rosen, J.B.: Global minimization of a linearly constrained concave function by partition of feasible domain. Math. Oper. Res. 8, 215–230 (1983)
Rosen, J.B., Pardalos, P.M.: Global minimization of large-scale constrained concave quadratic problems by separable programming. Math. Program. 34, 163–174 (1986)
Rubinov, A., Tuy, H., Mays, H.: Algorithm for a monotonic global optimization problem. Optimization 49, 205–221 (2001)
Saff, E.B., Kuijlaars, A.B.J.: Distributing many points on a sphere. Math. Intell. 10, 5–11 (1997)
Sahni, S.: Computationally related problems. SIAM J. Comput. 3, 262–279 (1974)
Schaible, S.: Fractional programming. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 495–608. Kluwer, Dordrecht (1992)
Sherali, H.D., Adams, W.P.: A Reformulation-Linearization Technique (RLT) for Solving Discrete and Continuous Nonconvex Programming Problems. Kluwer, Dordrecht (1999)
Sherali, H.D., Tuncbilek, C.H.: A global optimization algorithm for polynomial programming problems using a reformulation-linearization technique. J. Glob. Optim. 2, 101–112 (1992)
Sherali, H.D., Tuncbilek, C.H.: A reformulation-convexification approach to solving nonconvex quadratic programming problems. J. Glob. Optim. 7, 1–31 (1995)
Shi, J.S., Yamamoto, Y.: A D.C. approach to the largest empty sphere problem in higher dimension. In: Floudas, C., Pardalos, P. (eds.) State of the Art in Global Optimization, pp. 395–411. Kluwer, Dordrecht (1996)
Shiau, T.-H.: Finding the largest lp-ball in a polyhedral set. Technical Summary Report No. 2788. University of Wisconsin, Mathematics Research Center, Madison (1984)
Shimizu, K., Ishizuka, Y., Bard, J.F.: Nondifferentiable and Two-Level Mathematical Programming. Kluwer, Cambridge (1997)
Shor, N.Z.: Quadratic optimization problems. Technicheskaya Kibernetica 1, 128–139 (1987, Russian)
Shor, N.Z.: Dual estimates in multiextremal problems. J. Glob. Optim. 2, 411–418 (1991)
Shor, N.Z., Stetsenko, S.I.: Quadratic Extremal Problems and Nondifferentiable Optimization. Naukova Dumka, Kiev (1989, Russian)
Sion, M.: On general minimax theorems. Pac. J. Math. 8, 171–176 (1958)
Sniedovich, M.: C-programming and the minimization of pseudolinear and additive concave functions. Oper. Res. Lett. 5, 185–189 (1986)
Soland, R.M.: Optimal facility location with concave costs. Oper. Res. 22, 373–382 (1974)
Sperner, E.: Neuer Beweis für die Invarianz der Dimensionszahl und des Gebietes. Abh. Math. Seminar Hamburg VI, 265–272 (1928)
Stancu-Minasian, I.M.: Fractional Programming: Theory, Methods and Applications. Kluwer, Dordrecht (1997)
Strekalovski, A.C.: On the global extremum problem. Sov. Dokl. 292, 1062–1066 (1987, Russian)
Strekalovski, A.C.: On problems of global extremum in nonconvex extremal problems. Izvestya Vuzov, Ser. Matematika 8, 74–80 (1990, Russian)
Strekalovski, A.C.: On search for global maximum of convex functions on a constraint set. J. Comput. Math. Math. Phys. 33, 349–363 (1993, Russian)
Suzuki, A., Okabe, A.: Using Voronoi diagrams. In: Drezner, Z. (ed.) Facility Location, pp. 103–118. Springer, New York (1995)
Swarup, K.: Programming with indefinite quadratic function with linear constraints. Cahiers du Centre d'Etude de Recherche Opérationnelle 8, 132–136 (1966a)
Swarup, K.: Indefinite quadratic programming. Cahiers du Centre d'Etude de Recherche Opérationnelle 8, 217–222 (1966b)
Swarup, K.: Quadratic programming. Cahiers du Centre d'Etude de Recherche Opérationnelle 8, 223–234 (1966c)
Tanaka, T., Thach, P.T., Suzuki, S.: An optimal ordering policy for jointly replenished products by nonconvex minimization techniques. J. Oper. Res. Soc. Jpn. 34, 109–124 (1991)
Tao, P.D.: Convex analysis approach to dc programming: theory, algorithm and applications. Acta Math. Vietnam. 22, 289–355 (1997)
Tao, P.D.: Large-scale molecular optimization from distance matrices by a dc optimization approach. SIAM J. Optim. 14, 77–116 (2003)
Tao, P.D., Hoai-An, L.T.: DC optimization algorithms for solving the trust region subproblem. SIAM J. Optim. 8, 476–505 (1998)
Thach, P.T.: The design centering problem as a d.c. programming problem. Math. Program. 41, 229–248 (1988)
Thach, P.T.: Quasiconjugates of functions, duality relationship between quasiconvex minimization under a reverse convex constraint and quasiconvex maximization under a convex constraint, and applications. J. Math. Anal. Appl. 159, 299–322 (1991)
Thach, P.T.: New partitioning method for a class of nonconvex optimization problems. Math. Oper. Res. 17, 43–69 (1992)
Thach, P.T.: D.c. sets, d.c. functions and nonlinear equations. Math. Program. 58, 415–428 (1993a)
Thach, P.T.: Global optimality criteria and duality with zero gap in nonconvex optimization problems. SIAM J. Math. Anal. 24, 1537–1556 (1993b)
Thach, P.T.: A nonconvex duality with zero gap and applications. SIAM J. Optim. 4, 44–64 (1994)
Thach, P.T.: Symmetric duality for homogeneous multiple-objective functions. J. Optim. Theory Appl. (2012). doi:10.1007/s10957-011-9822-6
Thach, P.T., Konno, H.: On the degree and separability of nonconvexity and applications to optimization problems. Math. Program. 77, 23–47 (1996)
Thach, P.T., Thang, T.V.: Conjugate duality for vector-maximization problems. J. Math. Anal. Appl. 376, 94–102 (2011)
Thach, P.T., Thang, T.V.: Problems with resource allocation constraints and optimization over the efficient set. J. Glob. Optim. 58, 481–495 (2014)
Thach, P.T., Tuy, H.: The relief indicator method for constrained global optimization. Naval Res. Logist. 37, 473–497 (1990)
Thach, P.T., Konno, H., Yokota, D.: A dual approach to a minimization on the set of Pareto-optimal solutions. J. Optim. Theory Appl. 88, 689–701 (1996)
Thakur, L.S.: Domain contraction in nonlinear programming: minimizing a quadratic concave function over a polyhedron. Math. Oper. Res. 16, 390–407 (1991)
Thieu, T.V.: Improvement and implementation of some algorithms for nonconvex optimization problems. In: Optimization – Fifth French-German Conference, Castel Novel 1988. Lecture Notes in Mathematics, vol. 1405, pp. 159–170. Springer, Berlin (1989)
Thieu, T.V.: A linear programming approach to solving a jointly constrained bilinear programming problem with special structure. In: Van tru va He thong, vol. 44, pp. 113–121. Institute of Mathematics, Hanoi (1992)
Thieu, T.V., Tam, B.T., Ban, V.T.: An outer approximation method for globally minimizing a concave function over a compact convex set. Acta Math. Vietnam. 8, 21–40 (1983)
Thoai, N.V.: A modified version of Tuy's method for solving d.c. programming problems. Optimization 19, 665–674 (1988)
Thoai, N.V.: Convergence of duality bound method in partly convex programming. J. Glob. Optim. 22, 263–270 (2002)
Thoai, N.V., Tuy, H.: Convergent algorithms for minimizing a concave function. Math. Oper. Res. 5, 556–566 (1980)
Toland, J.F.: Duality in nonconvex optimization. J. Math. Anal. Appl. 66, 399–415 (1978)
Toussaint, G.T.: Solving geometric problems with the rotating calipers. In: Proc. IEEE MELECON '83 (1983)
Tuan, H.D., Apkarian, P., Hosoe, S., Tuy, H.: D.C. optimization approach to robust control: feasibility problems. Int. J. Control 73(2), 89–104 (2000a)
Tuan, H.D., Hosoe, S., Tuy, H.: D.C. optimization approach to robust controls: the optimal scaling value problems. IEEE Trans. Autom. Control 45(10), 1903–1909 (2000b)
Tuy, H.: Concave programming under linear constraints. Sov. Math. Dokl. 5, 1437–1440 (1964)
Tuy, H.: On a general minimax theorem. Sov. Math. Dokl. 15, 1689–1693 (1974)
Tuy, H.: A note on the equivalence between Walras' excess demand theorem and Brouwer's fixed point theorem. In: Los, J., Los, M. (eds.) Equilibria: How and Why?, pp. 61–64. North-Holland, Amsterdam (1976)
Tuy, H.: On outer approximation methods for solving concave minimization problems. Acta Math. Vietnam. 8, 3–34 (1983)
Tuy, H.: Global minimization of a difference of two convex functions. In: Lecture Notes in Economics and Mathematical Systems, vol. 226, pp. 98–108. Springer, Berlin (1984)
Tuy, H.: Convex programs with an additional reverse convex constraint. J. Optim. Theory Appl. 52, 463–486 (1987a)
Tuy, H.: Global minimization of a difference of two convex functions. Math. Program. Study 30, 150–182 (1987b)
Tuy, H.: Convex Analysis and Global Optimization. Kluwer, Dordrecht (1998)
Tuy, H.: On polyhedral annexation method for concave minimization. In: Leifman, L.J., Rosen, J.B. (eds.) Functional Analysis, Optimization and Mathematical Economics, pp. 248–260. Oxford University Press, Oxford (1990)
Tuy, H.: Normal conical algorithm for concave minimization over polytopes. Math. Program. 51, 229–245 (1991)
Tuy, H.: On nonconvex optimization problems with separated nonconvex variables. J. Glob. Optim. 2, 133–144 (1992)
Tuy, H.: D.C. optimization: theory, methods and algorithms. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 149–216. Kluwer, Dordrecht (1995a)
Tuy, H.: Canonical d.c. programming: outer approximation methods revisited. Oper. Res. Lett. 18, 99–106 (1995b)
Tuy, H.: A general d.c. approach to location problems. In: Floudas, C., Pardalos, P. (eds.) State of the Art in Global Optimization: Computational Methods and Applications, pp. 413–432. Kluwer, Dordrecht (1996)
Tuy, H.: Monotonic optimization: problems and solution approaches. SIAM J. Optim. 11, 464–494 (2000a)
Tuy, H.: Strong polynomial time solvability of a minimum concave cost network flow problem. Acta Math. Vietnam. 25, 209–217 (2000b)
Tuy, H.: On global optimality conditions and cutting plane algorithms. J. Optim. Theory Appl. 118, 201–216 (2003)
Tuy, H.: On a decomposition method for nonconvex global optimization. Optim. Lett. 1, 245–258 (2007a)
Tuy, H.: On dual bound methods for nonconvex global optimization. J. Glob. Optim. 37, 321–323 (2007b)
Tuy, H.: D(C)-optimization and robust global optimization. J. Glob. Optim. 47, 485–501 (2010)
Tuy, H., Ghannadan, S.: A new branch and bound method for bilevel linear programs. In: Migdalas, A., Pardalos, P.M., Värbrand, P. (eds.) Multilevel Optimization: Algorithms and Applications, pp. 231–249. Kluwer, Dordrecht (1998)
Tuy, H., Hoai-Phuong, N.T.: Optimization under composite monotonic constraints and constrained optimization over the efficient set. In: Liberti, L., Maculan, N. (eds.) Global Optimization: From Theory to Implementation, pp. 3–32. Springer, Berlin/New York (2006)
Tuy, H., Hoai-Phuong, N.T.: A robust algorithm for quadratic optimization under quadratic constraints. J. Glob. Optim. 37, 557–569 (2007)
Tuy, H., Horst, R.: Convergence and restart in branch and bound algorithms for global optimization. Application to concave minimization and d.c. optimization problems. Math. Program. 42, 161–184 (1988)
Tuy, H., Luc, L.T.: A new approach to optimization under monotonic constraints. J. Glob. Optim. 18, 1–15 (2000)
Tuy, H., Oettli, W.: On necessary and sufficient conditions for global optimization. Matemáticas Aplicadas 15, 39–41 (1994)
Tuy, H., Tam, B.T.: An efficient solution method for rank two quasiconcave minimization problems. Optimization 24, 3–56 (1992)
Tuy, H., Tam, B.T.: Polyhedral annexation vs outer approximation methods for decomposition of monotonic quasiconcave minimization. Acta Math. Vietnam. 20, 99–114 (1995)
Tuy, H., Tuan, H.D.: Generalized S-lemma and strong duality in nonconvex quadratic programming. J. Glob. Optim. 56, 1045–1072 (2013)
Tuy, H., Migdalas, A., Värbrand, P.: A global optimization approach for the linear two-level program. J. Glob. Optim. 3, 1–23 (1992)
Tuy, H., Dan, N.D., Ghannadan, S.: Strongly polynomial time algorithm for certain concave minimization problems on networks. Oper. Res. Lett. 14, 99–109 (1993a)
Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: Strongly polynomial algorithm for a production-transportation problem with concave production cost. Optimization 27, 205–227 (1993b)
Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: Strongly polynomial algorithm for two special minimum concave cost network flow problems. Optimization 32, 23–44 (1994a)
Tuy, H., Migdalas, A., Värbrand, P.: A quasiconcave minimization method for solving linear two-level programs. J. Glob. Optim. 4, 243–263 (1994b)
Tuy, H., Al-Khayyal, F., Zhou, F.: A d.c. optimization method for single facility location problems. J. Glob. Optim. 7, 209–227 (1995a)
Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: The minimum concave cost flow problem with fixed numbers of nonlinear arc costs and sources. J. Glob. Optim. 6, 135–151 (1995b)
Tuy, H., Al-Khayyal, F., Zhou, P.C.: DC optimization method for single facility location problems. J. Glob. Optim. 7, 209–227 (1995c)
Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: Strongly polynomial algorithm for a concave production-transportation problem with a fixed number of nonlinear variables. Math. Program. 72, 229–258 (1996a)
Tuy, H., Thach, P.T., Konno, H., Yokota, D.: Dual approach to optimization on the set of Pareto-optimal solutions. J. Optim. Theory Appl. 88, 689–707 (1996b)
Tuy, H., Bagirov, A., Rubinov, A.M.: Clustering via D.C. optimization. In: Hadjisavvas, N., Pardalos, P.M. (eds.) Advances in Convex Analysis and Optimization, pp. 221–235. Kluwer, Dordrecht (2001)
Tuy, H., Nghia, N.D., Vinh, L.S.: A discrete location problem. Acta Math. Vietnam. 28, 185–199 (2003)
Tuy, H., Al-Khayyal, F., Thach, P.T.: Monotonic optimization: branch and cut methods. In: Audet, C., Hansen, P., Savard, G. (eds.) Essays and Surveys on Global Optimization, pp. 39–78. GERAD/Springer, New York (2005)
Tuy, H., Minoux, M., Hoai-Phuong, N.T.: Discrete monotonic optimization with application to a discrete location problem. SIAM J. Optim. 17, 78–97 (2006)
Utschick, W., Brehmer, J.: Monotonic optimization framework for coordinated beamforming in multi-cell networks. IEEE Trans. Signal Process. 60, 2496–2507 (2012)
Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996)
Van Voorhis, T.: The quadratically constrained quadratic programs. Ph.D. thesis, Georgia Institute of Technology, Atlanta (1997)
Veinott, A.F.: The supporting hyperplane method for unimodal programming. Oper. Res. 15, 147–152 (1967)
Vidigal, L., Director, S.: A design centering algorithm for nonconvex regions of acceptability. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 14, 13–24 (1982)
Vincent, J.F.: Structural Biomaterials. Princeton University Press, Princeton (1990)
Vicente, L., Savard, G., Judice, J.: Descent approaches for quadratic bilevel programming. J. Optim. Theory Appl. 81, 379–388 (1994)
von Neumann, J.: Zur Theorie der Gesellschaftsspiele. Math. Ann. 100, 293–320 (1928)
White, D.J., Annandalingam, G.: A penalty function approach for solving bilevel linear programs. J. Glob. Optim. 3, 397–419 (1993)
Wolberg, W.H., Street, W.N., Mangasarian, O.L.: Machine learning techniques to diagnose breast cancer from fine-needle aspirates. Cancer Lett. 77, 163–171 (1994)
Wolberg, W.H., Street, W.N., Mangasarian, O.L.: Image analysis and machine learning applied to breast cancer diagnosis and prognosis. Anal. Quant. Cytol. Histol. 17(2), 77–87 (1995)
Wolberg, W.H., Street, W.N., Heisey, D.M., Mangasarian, O.L.: Computerized breast cancer diagnosis and prognosis from fine-needle aspirates. Arch. Surg. 130, 511–516 (1995a)
Wolberg, W.H., Street, W.N., Heisey, D.M., Mangasarian, O.L.: Computer-derived nuclear features distinguish malignant from benign breast cytology. Hum. Pathol. 26, 792–796 (1995b)
Yakubovich, V.A.: The S-procedure in nonlinear control theory. Vestnik Leningrad Univ. 4, 73–93 (1977)
Ye, Y.: On affine scaling algorithms for nonconvex quadratic programming. Math. Program. 56, 285–300 (1992)
Yick, J., Mukherjee, B., Ghosal, D.: Wireless sensor network survey. Comput. Netw. 52, 2292–2320 (2008)
Zadeh, N.: On building minimum cost communication networks over time. Networks 4, 19–34 (1974)
Zhang, Y.J., Qian, L.P., Huang, J.W.: Monotonic optimization in communication and networking systems. Found. Trends Commun. Inf. Theory 7(1), 1–75 (2012)
Zheng, Q.: Optimality conditions for global optimization (I) and (II). Acta Math. Appl. Sinica, English Ser. 2, 66–78, 118–134 (1985)
Zhu, D.L., Marcotte, P.: An extended descent framework for variational inequalities. J. Optim. Theory Appl. 80, 349–366 (1994)
Zukhovitzki, S.I., Polyak, R.A., Primak, M.E.: On a class of concave programming problems. Ekonomika i Matematitcheskie Methody 4, 431–443 (1968, Russian)
Zwart, P.B.: Global maximization of a convex function with linear inequality constraints. Oper. Res. 22, 602–609 (1974)
Index
H
halfspaces, 6
Hausdorff distance, 89
hyperplane, plane, 4

I
increasing function, 391
indefinite quadratic minimization, 361
indicator function, 40
infimal convolution, 46
inner approximation, 229
interior of a set, 8

K
K-convex function, 77
K-monotonic function, 283
Kakutani fixed point theorem, 98
KKM Lemma, 91, 92
Ky Fan inequality, 97

L
l.s.c. hull, 50
Lagrange dual problem, 377
Lagrangian, 377
Lagrangian bound, 236, 244
line segment, 5
lineality of a convex function, 49
lineality of a convex set, 23
lineality space, 23
lineality space of a convex function, 49
linear bilevel program, 456
linear matrix inequality, 146
linear multiplicative program, 299
linear multiplicative programming, 298
linear program under LMI constraints, 387
Lipschitz continuity, 51
LMI (linear matrix inequality), 83

M
master problem, 408
Mathematical Programming problem with Equilibrium Constraints, 453
maximin location, 219
minimax
  equality, 72
  theorem, 72
minimum concave cost network flow problem, 319
minimum of a dc function, 120
Minkowski functional, 9
Minkowski's Theorem, 24
modulus of strong convexity, 66
moment matrices, 440
Monotonic Optimization, 391
monotonic reverse convex problem, 313
multidimensional scaling, 220
multiextremal, 127

N
Nadler contraction theorem, 90
Nash equilibrium, 98
Nash equilibrium theorem, 99
network constraints, 319
non-cooperative game, 98
noncanonical dc problems, 268
nonconvex quadratic programming, 337
nonconvexity rank, 283
nondegenerate vertex, 30
nonregular problems, 193
norm
  lp-norm, 7
  equivalent norms, 8
  Euclidean norm, 8
  Tchebycheff norm, 8
normal cone, 20
  inward normal cone, 20
  outward normal cone, 20
O
on-line vertex enumeration, 185
optimal visible point algorithm, 269
optimal visible point problem, 269
outer approximation, 161

P
Pareto point, 393
partition via (v, j), 156
partly convex problems, 236
piecewise convex function, 43
pointed cone, 7
Polar
  of a polyhedron, 31
polar, 25
polyblock, 394
polyblock algorithm, 403
polyhedral annexation, 229
polyhedral annexation method, 233
polyhedral annexation algorithm for BCP, 233
polyhedral convex set, 27
polyhedron, 27
Polynomial Optimization, 435
polytope, 30
pooling and blending problem, 131, 247
positively homogeneous, 59
posynomial, 391
preprocessing, 142
problem PTP(2), 328
procedure DC, 175
procedure DC*, 229
projection, 20
proper function, 39
proper partition, 152
proper vertex, 394
prototype BB algorithm, 158
prototype BB decomposition algorithm, 237
prototype OA procedure, 162

Q
quadratic function, 42
quasi-subdifferential, 253
quasi-subgradient, 253
quasi-supdifferential, 254
quasi-supgradient, 254

R
radial partition (subdivision), 152
radius of a compact set, 37
rank of a convex function, 50
rank of a convex set, 26
rank of a quadratic form, 338
recession cone, 10
recession cone of a convex function, 49
rectangular subdivision, 156
redundant inequality, 29
reformulation-linearization, 372
regular (stable) problem, 191
regularity assumption, 192
relative boundary, 8
relative interior, 8
relief indicator, 272, 275
  OA algorithm, 276
relief indicator method, 275
Representation of polyhedrons, 32
representation theorem, 23
reverse convex, 52
reverse convex constraint, 134
reverse convex inequality, 104
reverse convex programming, 189
reverse convex set, 116
reverse normal set, 393
reverse polyblock, 394
robust
  approach, 204
  constraint set, 272
  feasible set, 140, 406

S
S-shaped functions, 112
saddle point, 75
SDP (semidefinite program), 84
Second Separation Theorem, 18
semi-exhaustiveness, 153
semidefinite program, 147
sensor cover energy problem, 426
Separable Basic Concave Program, 182
separator, 115
set-valued map, 88
Shapley-Folkman's Theorem, 13
signomial programming, 451
simple OA for CDC, 201