
Springer Optimization and Its Applications 110

Hoang Tuy

Convex Analysis and Global Optimization

Second Edition
Springer Optimization and Its Applications

VOLUME 110

Managing Editor
Panos M. Pardalos (University of Florida)

Editor, Combinatorial Optimization
Ding-Zhu Du (University of Texas at Dallas)

Advisory Board
J. Birge (University of Chicago)
C.A. Floudas (Princeton University)
F. Giannessi (University of Pisa)
H.D. Sherali (Virginia Polytechnic Institute and State University)
T. Terlaky (McMaster University)
Y. Ye (Stanford University)

Aims and Scope


Optimization has been expanding in all directions at an astonishing rate
during the last few decades. New algorithmic and theoretical techniques
have been developed, the diffusion into other disciplines has proceeded at a
rapid pace, and our knowledge of all aspects of the field has grown even more
profound. At the same time, one of the most striking trends in optimization
is the constantly increasing emphasis on the interdisciplinary nature of the
field. Optimization has been a basic tool in all areas of applied mathematics,
engineering, medicine, economics, and other sciences.
The series Springer Optimization and Its Applications publishes undergraduate
and graduate textbooks, monographs and state-of-the-art expository work that
focus on algorithms for solving optimization problems and also study applications
involving such problems. Some of the topics covered include nonlinear optimization
(convex and nonconvex), network flow problems, stochastic optimization, optimal
control, discrete optimization, multi-objective programming, description of software
packages, approximation techniques and heuristic approaches.

More information about this series at http://www.springer.com/series/7393


Hoang Tuy

Convex Analysis and Global Optimization

Second Edition
Hoang Tuy
Vietnam Academy of Science and Technology
Institute of Mathematics
Hanoi, Vietnam

ISSN 1931-6828 ISSN 1931-6836 (electronic)


Springer Optimization and Its Applications
ISBN 978-3-319-31482-2 ISBN 978-3-319-31484-6 (eBook)
DOI 10.1007/978-3-319-31484-6

Library of Congress Control Number: 2016934409

Mathematics Subject Classification (2010): 90-02, 49-02, 65K10

© Springer International Publishing AG 1998, 2016


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, express or implied, with respect to the material contained herein or for any
errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG Switzerland
Preface to the First Edition

This book grew out of lecture notes from a course given at Graz University of
Technology (autumn 1990), the Institute of Technology at Linköping University
(autumn 1991), and the École Polytechnique in Montréal (autumn 1993). Originally
conceived of as a Ph.D. course, it attempts to present a coherent and mathematically
rigorous theory of deterministic global optimization. At the same time, aiming at
providing a concise account of the methods and algorithms, it focuses on the main
ideas and concepts, leaving aside overly technical details which might affect the
clarity of presentation.
Global optimization is concerned with finding global solutions to nonconvex
optimization problems. Although until recently convex analysis has been developed
mainly under the impulse of convex and local optimization, it has become a basic
tool for global optimization. The reason is that the general mathematical structure
underlying virtually every nonconvex optimization problem can be described in
terms of functions representable as differences of convex functions (dc functions)
and sets which are differences of convex sets (dc sets). Due to this fact, many
concepts and results from convex analysis play an essential role in the investigation
of important classes of global optimization problems. Since, however, convexity
in nonconvex optimization problems is present only partially or in a reverse sense,
new concepts have to be introduced and new questions have to be answered.
Therefore, a new chapter on dc functions and dc sets is added to the traditional
material of convex analysis. Part I of this book is an introduction to convex analysis
interpreted in this broad sense as an indispensable tool for global optimization.
Part II presents a theory of deterministic global optimization which heavily relies
on the dc structure of nonconvex optimization problems. The key subproblem in
this approach is to transcend an incumbent, i.e., given a solution of an optimization
problem (the best so far obtained), check its global optimality, and find a better
solution, if there is one. As it turns out, this subproblem can always be reduced to
solving a dc inclusion of the form x ∈ D \ C, where D, C are two convex sets.
Chapters 4–6 are devoted to general methods for solving concave and dc programs
through dc inclusions of this form. These methods include successive partitioning
and cutting, outer approximation and polyhedral annexation, or combinations of these
concepts. The last two chapters discuss methods for exploiting special structures in
global optimization. Two aspects of nonconvexity deserve particular attention: the
rank of nonconvexity, i.e., roughly speaking the number of nonconvex variables,
and the degree of nonconvexity, i.e., the extent to which a problem fails to be
convex. Decomposition methods for handling low rank nonconvex problems are
presented in Chap. 7, while nonconvex quadratic problems, i.e., problems involving
only nonconvex functions which in a sense have the lowest degree of nonconvexity,
are discussed in the last chapter, Chap. 8.
I have made no attempt to cover all the developments to date because this would
be simply impossible and unreasonable. With regret I have to omit many interesting
results which do not fit in with the mainstream of the book. On the other hand, much
new material is offered, even in the parts where the traditional approach is more or
less standard.
I would like to express my sincere thanks to many colleagues and friends,
especially R.E. Burkard (Graz Technical University), A. Migdalas and P. Värbrand
(University of Linköping), and B. Jaumard and P. Hansen (University of Montréal),
for many fruitful discussions we had during my enjoyable visits to their departments,
and P.M. Pardalos for his encouragement in publishing this book. I am particularly
grateful to P.T. Thach and F.A. Al-Khayyal for many useful remarks and suggestions
and also to P.H. Dien for his efficient help during the preparation of the manuscript.

Hanoi, Vietnam Hoang Tuy


October 1997
Preface to the Second Edition

Over the 17 years since the publication of the first edition of this book, tremendous
progress has been achieved in global optimization. The present revised and enlarged
edition is to give an up-to-date account of the field, in response to the needs of
research, teaching, and applications in the years to come.
In addition to the correction of misprints and errors, much new material is
offered. In particular, several recent important advances are included: a modern
approach to minimax, fixed point and equilibrium, a robust approach to optimization
under nonconvex constraints, and also a thorough study of quadratic programming
with a single quadratic constraint.
Moreover, three new chapters are added: monotonic optimization (Chap. 11),
polynomial optimization (Chap. 12), and optimization under equilibrium constraints
(Chap. 13). These topics have received increased attention in recent years due to
many important engineering and economic applications.
Hopefully, as a main reference in deterministic global optimization, this book
will replace both its first edition and the 1996 (last) edition of the book Global
Optimization: Deterministic Approaches by Reiner Horst and Hoang Tuy.

Hanoi, Vietnam Hoang Tuy


September 2015

Contents

Part I Convex Analysis

1 Convex Sets  3
  1.1 Affine Sets  3
  1.2 Convex Sets  5
  1.3 Relative Interior and Closure  7
  1.4 Convex Hull  12
  1.5 Separation Theorems  16
  1.6 Geometric Structure  19
    1.6.1 Supporting Hyperplane  19
    1.6.2 Face and Extreme Point  21
  1.7 Representation of Convex Sets  23
  1.8 Polars of Convex Sets  25
  1.9 Polyhedral Convex Sets  27
    1.9.1 Facets of a Polyhedron  28
    1.9.2 Vertices and Edges of a Polyhedron  29
    1.9.3 Polar of a Polyhedron  31
    1.9.4 Representation of Polyhedrons  32
  1.10 Systems of Convex Sets  34
  1.11 Exercises  36
2 Convex Functions  39
  2.1 Definition and Basic Properties  39
  2.2 Operations That Preserve Convexity  43
  2.3 Lower Semi-Continuity  47
  2.4 Lipschitz Continuity  51
  2.5 Convex Inequalities  52
  2.6 Approximation by Affine Functions  56
  2.7 Subdifferential  58
  2.8 Subdifferential Calculus  62
  2.9 Approximate Subdifferential  65
  2.10 Conjugate Functions  67
  2.11 Extremal Values of Convex Functions  69
    2.11.1 Minimum  69
    2.11.2 Maximum  70
  2.12 Minimax and Saddle Points  72
    2.12.1 Minimax  72
    2.12.2 Saddle Point of a Function  75
  2.13 Convex Optimization  76
    2.13.1 Generalized Inequalities  76
    2.13.2 Generalized Convex Optimization  78
    2.13.3 Conic Programming  80
  2.14 Semidefinite Programming  81
    2.14.1 The SDP Cone  81
    2.14.2 Linear Matrix Inequality  83
    2.14.3 SDP Programming  84
  2.15 Exercises  84
3 Fixed Point and Equilibrium  87
  3.1 Approximate Minimum and Variational Principle  87
  3.2 Contraction Principle  89
  3.3 Fixed Point and Equilibrium  91
    3.3.1 The KKM Lemma  91
    3.3.2 General Equilibrium Theorem  93
    3.3.3 Equivalent Minimax Theorem  96
    3.3.4 Ky Fan Inequality  97
    3.3.5 Kakutani Fixed Point  98
    3.3.6 Equilibrium in Non-cooperative Games  98
    3.3.7 Variational Inequalities  99
  3.4 Exercises  101
4 DC Functions and DC Sets  103
  4.1 DC Functions  103
  4.2 Universality of DC Functions  105
  4.3 DC Representation of Basic Composite Functions  107
  4.4 DC Representation of Separable Functions  110
  4.5 DC Representation of Polynomials  112
  4.6 DC Sets  113
  4.7 Convex Minorants of a DC Function  117
  4.8 The Minimum of a DC Function  120
  4.9 Exercises  122

Part II Global Optimization

5 Motivation and Overview  127
  5.1 Global vs. Local Optimum  127
  5.2 Multiextremality in the Real World  129
  5.3 Basic Classes of Nonconvex Problems  132
    5.3.1 DC Optimization  133
    5.3.2 Monotonic Optimization  135
  5.4 General Nonconvex Structure  135
  5.5 Global Optimality Criteria  136
  5.6 DC Inclusion Associated with a Feasible Solution  139
  5.7 Approximate Solution  140
  5.8 When is Global Optimization Motivated?  142
  5.9 Exercises  147
6 General Methods  151
  6.1 Successive Partitioning  151
    6.1.1 Simplicial Subdivision  151
    6.1.2 Conical Subdivision  155
    6.1.3 Rectangular Subdivision  156
  6.2 Branch and Bound  158
    6.2.1 Prototype BB Algorithm  158
    6.2.2 Advantage of a Nice Feasible Set  160
  6.3 Outer Approximation  161
    6.3.1 Prototype OA Procedure  162
  6.4 Exercises  164
7 DC Optimization Problems  167
  7.1 Concave Minimization Under Linear Constraints  167
    7.1.1 Concavity Cuts  169
    7.1.2 Canonical Cone  172
    7.1.3 Procedure DC  173
    7.1.4 ω-Bisection  178
    7.1.5 Branch and Bound  180
    7.1.6 Rectangular Algorithms  182
  7.2 Concave Minimization Under Convex Constraints  184
    7.2.1 Simple Outer Approximation  185
    7.2.2 Combined Algorithm  187
  7.3 Reverse Convex Programming  189
    7.3.1 General Properties  190
    7.3.2 Solution Methods  194
  7.4 Canonical DC Programming  198
    7.4.1 Simple Outer Approximation  201
    7.4.2 Combined Algorithm  203
  7.5 Robust Approach  204
    7.5.1 Successive Incumbent Transcending  207
    7.5.2 The SIT Algorithm for DC Optimization  208
    7.5.3 Equality Constraints  211
    7.5.4 Illustrative Examples  212
    7.5.5 Concluding Remark  213
  7.6 DC Optimization in Design Calculations  213
    7.6.1 DC Reformulation  214
    7.6.2 Solution Method  215
  7.7 DC Optimization in Location and Distance Geometry  218
    7.7.1 Generalized Weber's Problem  218
    7.7.2 Distance Geometry Problem  220
    7.7.3 Continuous p-Center Problem  221
    7.7.4 Distribution of Points on a Sphere  221
  7.8 Clustering via dc Optimization  222
    7.8.1 Clustering as a dc Optimization  223
  7.9 Exercises  226
8 Special Methods  229
  8.1 Polyhedral Annexation  229
    8.1.1 Procedure DC  229
    8.1.2 Polyhedral Annexation Method for (BCP)  233
    8.1.3 Polyhedral Annexation Method for (LRC)  234
  8.2 Partly Convex Problems  236
    8.2.1 Computation of Lagrangian Bounds  244
    8.2.2 Applications  247
  8.3 Duality in Global Optimization  249
    8.3.1 Quasiconjugate Function  250
    8.3.2 Examples  253
    8.3.3 Dual Problems of Global Optimization  255
  8.4 Application  258
    8.4.1 A Feasibility Problem with Resource Allocation Constraint  259
    8.4.2 Problems with Multiple Resources Allocation  261
    8.4.3 Optimization Under Resources Allocation Constraints  264
  8.5 Noncanonical DC Problems  268
  8.6 Continuous Optimization Problems  271
    8.6.1 Outer Approximation Algorithm  275
    8.6.2 Combined OA/BB Algorithm  278
  8.7 Exercises  279
9 Parametric Decomposition  283
  9.1 Functions Monotonic w.r.t. a Convex Cone  283
  9.2 Decomposition by Projection  286
    9.2.1 Outer Approximation  288
    9.2.2 Branch and Bound  289
  9.3 Decomposition by Polyhedral Annexation  292
  9.4 The Parametric Programming Approach  294
  9.5 Linear Multiplicative Programming  298
    9.5.1 Parametric Objective Simplex Method  300
    9.5.2 Parametric Right-Hand Side Method  302
    9.5.3 Parametric Combined Method  303
  9.6 Convex Constraints  305
    9.6.1 Branch and Bound  306
    9.6.2 Polyhedral Annexation  306
    9.6.3 Reduction to Quasiconcave Minimization  308
  9.7 Reverse Convex Constraints  313
    9.7.1 Decomposition by Projection  314
    9.7.2 Outer Approximation  314
    9.7.3 Branch and Bound  316
    9.7.4 Decomposition by Polyhedral Annexation  317
  9.8 Network Constraints  319
    9.8.1 Outer Approximation  320
    9.8.2 Branch and Bound Method  321
  9.9 Some Polynomial-Time Solvable Problems  326
    9.9.1 The Special Concave Program CPL(1)  326
  9.10 Applications  328
    9.10.1 Problem PTP(2)  328
    9.10.2 Problem FP(1)  329
    9.10.3 Strong Polynomial Solvability of PTP(r)  330
  9.11 Exercises  333
10 Nonconvex Quadratic Programming  337
  10.1 Some Basic Properties of Quadratic Functions  337
  10.2 The S-Lemma  339
  10.3 Single Constraint Quadratic Optimization  346
  10.4 Quadratic Problems with Linear Constraints  355
    10.4.1 Concave Quadratic Programming  355
    10.4.2 Biconvex and Convex–Concave Programming  358
    10.4.3 Indefinite Quadratic Programs  361
  10.5 Quadratic Problems with Quadratic Constraints  363
    10.5.1 Linear and Convex Relaxation  364
    10.5.2 Lagrangian Relaxation  377
    10.5.3 Application  382
  10.6 Exercises  388
11 Monotonic Optimization  391
  11.1 Basic Concepts  391
    11.1.1 Increasing Functions, Normal Sets, and Polyblocks  391
    11.1.2 DM Functions and DM Constraints  395
    11.1.3 Canonical Monotonic Optimization Problem  397
  11.2 The Polyblock Algorithm  399
    11.2.1 Valid Reduction  399
    11.2.2 Generating the New Polyblock  402
    11.2.3 Polyblock Algorithm  403
  11.3 Successive Incumbent Transcending Algorithm  405
    11.3.1 Essential Optimal Solution  406
    11.3.2 Successive Incumbent Transcending Strategy  407
  11.4 Discrete Monotonic Optimization  413
  11.5 Problems with Hidden Monotonicity  417
    11.5.1 Generalized Linear Fractional Programming  417
    11.5.2 Problems with Hidden Monotonicity in Constraints  422
  11.6 Applications  424
    11.6.1 Reliability and Risk Management  424
    11.6.2 Power Control in Wireless Communication Networks  425
    11.6.3 Sensor Cover Energy Problem  426
    11.6.4 Discrete Location  429
  11.7 Exercises  433
12 Polynomial Optimization  435
  12.1 The Problem  435
  12.2 Linear and Convex Relaxation Approaches  436
    12.2.1 The RLT Relaxation Method  436
    12.2.2 The SDP Relaxation Method  439
  12.3 Monotonic Optimization Approach  441
    12.3.1 Monotonic Reformulation  442
    12.3.2 Essential Optimal Solution  443
    12.3.3 Solving the Master Problem  445
    12.3.4 Equality Constraints  450
    12.3.5 Discrete Polynomial Optimization  450
    12.3.6 Signomial Programming  451
  12.4 An Illustrative Example  451
  12.5 Exercises  452
13 Optimization Under Equilibrium Constraints  453
  13.1 Problem Formulation  453
  13.2 Bilevel Programming  455
    13.2.1 Conversion to Monotonic Optimization  457
    13.2.2 Solution Method  459
    13.2.3 Cases When the Dimension Can Be Reduced  463
    13.2.4 Specialization to (LBP)  465
    13.2.5 Illustrative Examples  467
  13.3 Optimization Over the Efficient Set  468
    13.3.1 Monotonic Optimization Reformulation  469
    13.3.2 Solution Method for (OWE)  469
    13.3.3 Solution Method for (OE)  474
    13.3.4 Illustrative Examples  478
  13.4 Equilibrium Problem  479
    13.4.1 Variational Inequality  480
  13.5 Optimization with Variational Inequality Constraint  483
  13.6 Exercises  486

References  489

Index  501
Glossary of Notation

R  The real line
R^n  Real n-dimensional space
aff E  Affine hull of the set E
conv E  Convex hull of the set E
cl C  Closure of the set C
ri C  Relative interior of the set C
∂C  Relative boundary of a convex set C
rec(C)  Recession cone of a convex set C
N_C(x^0)  (Outward) normal cone of a convex set C at point x^0 ∈ C
E°  Polar of set E
X^♯  Lower conjugate of set X
dom f  Effective domain of function f(x)
epi f  Epigraph of f(x)
f*(p)  Conjugate of function f(x)
f^♯(p)  Quasiconjugate of function f(x)
δ_C(x)  Indicator function of C
s_C(x)  Support function of C
d_C(x)  Distance function of C
∂f(x^0)  Subdifferential of f at x^0
∂_ε f(x^0)  ε-subdifferential of f at x^0
∂^♯ f(x^0)  Quasi-supdifferential of f at x^0
LMI  Linear matrix inequality
SDP  Semidefinite program
DC(Ω)  Linear space formed by dc functions on Ω
BB  Branch and Bound
BRB  Branch Reduce and Bound
OA  Outer Approximation
PA  Polyhedral Annexation
CS  Cone Splitting
SIT  Successive Incumbent Transcending
BCP  Basic Concave Program
SBCP  Separable Basic Concave Program
GCP  General Concave Program
LRC  Linear Reverse Convex Program
CDC  Canonical DC Program
COP  Continuous Optimization Problem
MO  Canonical Monotonic Optimization Problem
MPEC  Mathematical Programming with Equilibrium Constraints
MPAEC  MPEC with an Affine Variational Inequality
VI  Variational Inequality
OVI  Optimization with Variational Inequality constraint
GBP  General Bilevel Programming
CBP  Convex Bilevel Programming
LBP  Linear Bilevel Programming
OE  Optimization over Efficient Set
Part I
Convex Analysis
Chapter 1
Convex Sets

1.1 Affine Sets

Let a, b be two points of R^n. The set of all x ∈ R^n of the form

    x = (1 − λ)a + λb = a + λ(b − a),   λ ∈ R                    (1.1)

is called the line through a and b. A subset M of R^n is called an affine set (or affine
manifold) if it contains every line through two points of it, i.e., if (1 − λ)a + λb ∈ M
for every a ∈ M, b ∈ M, and every λ ∈ R. An affine set which contains the origin is
a subspace.

Proposition 1.1 A nonempty set M is an affine set if and only if M = a + L, where
a ∈ M and L is a subspace.

Proof Let M be an affine set and a ∈ M. Then M = a + L, where the set L = M − a
contains 0 and is an affine set, hence is a subspace. Conversely, let M = a + L, with
a ∈ M and L a subspace. For any x, y ∈ M, λ ∈ R we have (1 − λ)x + λy =
a + (1 − λ)(x − a) + λ(y − a). Since x − a, y − a ∈ L and L is a subspace, it follows
that (1 − λ)(x − a) + λ(y − a) ∈ L, hence (1 − λ)x + λy ∈ M. Therefore, M is an
affine set. □

The subspace L is said to be parallel to the affine set M. For a given nonempty
affine set M the parallel subspace L is uniquely defined. Indeed, if M = a′ + L′,
then L′ = M − a′ = L + (a − a′); but a′ = a′ + 0 ∈ M = a + L, hence a′ − a ∈ L,
and so L′ = L + (a − a′) = L.

The dimension of the subspace L parallel to an affine set M is called the
dimension of M. A point a ∈ R^n is an affine set of dimension 0 because the subspace
parallel to M = {a} is L = {0}. A line through two points a, b is an affine set of
dimension 1, because the subspace parallel to it is the one-dimensional subspace
{x = λ(b − a) | λ ∈ R}. An (n − 1)-dimensional affine set is called a hyperplane, or
a plane for short.
Proposition 1.2 Any r-dimensional affine set is of the form

    M = {x | Ax = b},

where b ∈ R^m, A ∈ R^{m×n} such that rank A = n − r. Conversely, any set of the
above form is an r-dimensional affine set.

Proof Let M be an r-dimensional affine set. For any x^0 ∈ M, the set L = M − x^0
is an r-dimensional subspace, so L = {x | Ax = 0}, where A is some m × n matrix
of rank n − r. Hence M = {x | A(x − x^0) = 0} = {x | Ax = b}, with b = Ax^0.
Conversely, if M is a set of this form then, for any x^0 ∈ M, we have Ax^0 = b, hence
M = {x | A(x − x^0) = 0} = x^0 + L, with L = {x | Ax = 0}. Since rank A = n − r, L
is an r-dimensional subspace. □
Corollary 1.1 Any hyperplane is a set of the form

    H = {x | ⟨a, x⟩ = α},                                        (1.2)

where a ∈ R^n \ {0}, α ∈ R. Conversely, any set of the above form is a hyperplane.

The vector a in (1.2) is called the normal to the hyperplane H. Of course a and α
are defined up to a nonzero common multiple, for a given hyperplane H. The line
{λa | λ ∈ R} meets H at a point λa such that λ⟨a, a⟩ = α, hence λ = α/‖a‖². Thus,
the distance from 0 to the hyperplane ⟨a, x⟩ = α is equal to |λ|‖a‖ = |α|/‖a‖.

The intersection of a family of affine sets is again an affine set. Given a set
E ⊂ R^n, there exists at least one affine set containing E, namely R^n. Therefore,
the intersection of all affine sets containing E is the smallest affine set containing E.
This set is called the affine hull of E and denoted by aff E.

Proposition 1.3 The affine hull of a set E is the set consisting of all points of
the form

    x = λ_1 x^1 + ⋯ + λ_k x^k,

such that x^i ∈ E, λ_1 + ⋯ + λ_k = 1, and k a natural number.


Proof Let M be the set of all points of the above form. If x, y ∈ M, for example,
x = Σ_{i=1}^k λ_i x^i, y = Σ_{j=1}^h μ_j y^j, with x^i, y^j ∈ E, Σ_{i=1}^k λ_i = Σ_{j=1}^h μ_j = 1,
then for any λ ∈ R we can write

    (1 − λ)x + λy = Σ_{i=1}^k (1 − λ)λ_i x^i + Σ_{j=1}^h λμ_j y^j

with Σ_{i=1}^k (1 − λ)λ_i + Σ_{j=1}^h λμ_j = (1 − λ) + λ = 1, hence (1 − λ)x + λy ∈ M.
Thus M is affine, and hence M ⊃ aff E. On the other hand, it is easily seen that if
x = Σ_{i=1}^k λ_i x^i with x^i ∈ E, Σ_{i=1}^k λ_i = 1, then x ∈ aff E (for k = 2 this follows
from the definition of affine sets, for k ≥ 3 it follows by induction). So M ⊂ aff E,
and therefore M = aff E. □
Proposition 1.4 The affine hull of a set of k points x^1, …, x^k (k > r) in R^n is of
dimension r if and only if the (n + 1) × k matrix

    [ x^1  x^2  ⋯  x^k ]
    [  1    1   ⋯   1  ]                                         (1.3)

is of rank r + 1.

Proof Let M = aff{x^1, …, x^k}. Then M = x^k + L, where L is the smallest subspace
containing x^1 − x^k, …, x^{k−1} − x^k. So M is of dimension r if and only if L is of
dimension r, i.e., if and only if the matrix [x^1 − x^k, …, x^{k−1} − x^k] is of rank r. This
in turn is equivalent to requiring that the matrix (1.3) be of rank r + 1. □

We say that k points x^1, …, x^k are affinely independent if aff{x^1, …, x^k} has
dimension k − 1, i.e., if the vectors x^1 − x^k, …, x^{k−1} − x^k are linearly independent,
or equivalently, the matrix (1.3) is of rank k.
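Proposition 1.4 reduces a geometric question, the affine dimension of a finite point set, to a rank computation, so it is easy to check numerically. A minimal sketch in Python with numpy (the function name is our own, not the book's): it forms the matrix (1.3) and returns its rank minus one.

    # Rank test of Proposition 1.4: stack the points as columns, append a
    # row of ones (the matrix (1.3)); its rank is 1 + dim aff{x^1,...,x^k}.
    import numpy as np

    def affine_dimension(points):
        X = np.asarray(points, dtype=float)          # k points as rows
        M = np.vstack([X.T, np.ones(X.shape[0])])    # the matrix (1.3)
        return np.linalg.matrix_rank(M) - 1

    print(affine_dimension([[0, 0], [1, 1], [2, 2]]))   # collinear -> 1
    print(affine_dimension([[0, 0], [1, 0], [0, 1]]))   # general position -> 2

The same call tests affine independence of k points by checking whether the result equals k − 1.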
Corollary 1.2 The affine hull S of a set of k affinely independent points {x^1, …, x^k}
in R^n is a (k − 1)-dimensional affine set. Every point x ∈ S can be uniquely
represented in the form

    x = Σ_{i=1}^k λ_i x^i,   Σ_{i=1}^k λ_i = 1.                   (1.4)

Proof By Proposition 1.3 any point x ∈ S has the form (1.4). If x = Σ_{i=1}^k μ_i x^i
is another representation, then Σ_{i=1}^{k−1} (λ_i − μ_i)(x^i − x^k) = 0, hence λ_i = μ_i
∀i = 1, …, k. □
Corollary 1.3 There is a unique hyperplane passing through n affinely independent
points x^1, x^2, …, x^n in R^n. Every point of this hyperplane can be uniquely
represented in the form

    x = Σ_{i=1}^n λ_i x^i,   Σ_{i=1}^n λ_i = 1.
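Corollary 1.3 also has a direct computational reading: the normal a and level α of the hyperplane through n affinely independent points span the one-dimensional null space of the n × (n + 1) linear system ⟨a, x^i⟩ − α = 0, i = 1, …, n. A minimal numpy sketch under this assumption (the function name is ours); the null vector is read off from the singular value decomposition.

    import numpy as np

    def hyperplane_through(points):
        X = np.asarray(points, dtype=float)             # n points of R^n, as rows
        A = np.hstack([X, -np.ones((X.shape[0], 1))])   # rows (x^i, -1)
        # For affinely independent points the null space of A is a line;
        # its direction (a, alpha) is the last right-singular vector.
        _, _, Vt = np.linalg.svd(A)
        return Vt[-1, :-1], Vt[-1, -1]                  # a, alpha up to a multiple

    a, alpha = hyperplane_through([[1.0, 0.0], [0.0, 1.0]])
    print(a / alpha)   # [1. 1.]: the line x_1 + x_2 = 1 through (1,0) and (0,1)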

1.2 Convex Sets

Given two points a, b ∈ R^n, the set of all points x = (1 − λ)a + λb such that
0 ≤ λ ≤ 1 is called the (closed) line segment between a and b and denoted by [a, b].
A set C ⊂ R^n is called convex if it contains any line segment between two points of
it; in other words, if (1 − λ)a + λb ∈ C whenever a, b ∈ C, 0 ≤ λ ≤ 1.

Affine sets are obviously convex. Other particular examples of convex sets are
halfspaces, which are sets of the form

    {x | ⟨a, x⟩ ≤ α},   {x | ⟨a, x⟩ ≥ α}

(closed halfspaces), or

    {x | ⟨a, x⟩ < α},   {x | ⟨a, x⟩ > α}

(open halfspaces), where a ∈ R^n, α ∈ R.

A point x such that x = Σ_{i=1}^k λ_i a^i with a^i ∈ R^n, λ_i ≥ 0, Σ_{i=1}^k λ_i = 1, is
called a convex combination of a^1, …, a^k ∈ R^n.
Proposition 1.5 A set C ⊂ R^n is convex if and only if it contains all the convex
combinations of its elements.

Proof It suffices to show that if C is a convex set then

    a^i ∈ C, i = 1, …, k,  λ_i > 0,  Σ_{i=1}^k λ_i = 1  ⇒  Σ_{i=1}^k λ_i a^i ∈ C.

Indeed, this is obvious for k = 2. Assuming this is true for k = h − 1 ≥ 2 and
letting λ = Σ_{i=1}^{h−1} λ_i > 0 we can write

    Σ_{i=1}^h λ_i a^i = Σ_{i=1}^{h−1} λ_i a^i + λ_h a^h = λ Σ_{i=1}^{h−1} (λ_i/λ) a^i + λ_h a^h.

Since Σ_{i=1}^{h−1} (λ_i/λ) = 1, by the induction hypothesis y = Σ_{i=1}^{h−1} (λ_i/λ) a^i ∈ C.
Next, since λ + λ_h = 1, we have λy + λ_h a^h ∈ C. Thus the above property holds
for every natural k. □
Proposition 1.6 The intersection of any collection of convex sets is convex. If C, D
are convex sets then C + D := {x + y | x ∈ C, y ∈ D} and αC := {αx | x ∈ C} (and
hence, also C − D := C + (−1)D) are convex sets.

Proof If {C_α} is a collection of convex sets and a, b ∈ ∩_α C_α, then for each α we
have a ∈ C_α, b ∈ C_α, therefore [a, b] ⊂ C_α and hence [a, b] ⊂ ∩_α C_α.
If C, D are convex sets and a = x + y, b = u + v with x, u ∈ C, y, v ∈ D, then
(1 − λ)a + λb = [(1 − λ)x + λu] + [(1 − λ)y + λv] ∈ C + D for any λ ∈ [0, 1];
hence C + D is convex. The convexity of αC can be proved analogously. □

A subset M of R^n is called a cone if

    x ∈ M, λ > 0  ⇒  λx ∈ M.

The origin 0 itself may or may not belong to M. A set a + M, which is the translate
of a cone M by a ∈ R^n, is also called a cone with apex at a. A cone M which
contains no line is said to be pointed; in this case, 0 is also called the vertex of M,
and a + M is a cone with vertex at a.
Proposition 1.7 A subset M of R^n is a convex cone if and only if

    λM ⊂ M  ∀λ > 0,                                              (1.5)
    M + M ⊂ M.                                                   (1.6)

Proof If M is a convex cone then (1.5) holds by definition of a cone; furthermore,
from x, y ∈ M we have (1/2)(x + y) ∈ M, hence by (1.5), x + y ∈ M. Conversely,
if (1.5) and (1.6) hold, then M is a cone, and for any x, y ∈ M, λ ∈ (0, 1) we have
(1 − λ)x ∈ M, λy ∈ M, hence (1 − λ)x + λy ∈ M, so M is convex. □

Corollary 1.4 A subset M of R^n is a convex cone if and only if it contains all the
positive linear combinations of its elements.

Proof Properties (1.6) and (1.5) mean that M is closed under addition and positive
scalar multiplication, i.e., M contains all elements of the form Σ_{i=1}^k λ_i a^i, with
a^i ∈ M, λ_i > 0. □

Corollary 1.5 Let E be a convex set. The set {λx | x ∈ E, λ > 0} is the smallest
convex cone which contains E.

Proof Since any cone containing E must contain this set, it suffices to show that
this set is a convex cone. But this follows from the preceding corollary and the fact
that Σ_i λ_i x^i = λ Σ_i (λ_i/λ) x^i, with λ = Σ_i λ_i. □

The convex cone obtained by adjoining 0 to the cone mentioned in Corollary 1.5
is often referred to as the cone generated by E and is denoted by cone E.

1.3 Relative Interior and Closure

A norm in R^n is by definition a mapping ‖·‖ from R^n to R which satisfies

    ‖x‖ ≥ 0  ∀x ∈ R^n,  with ‖x‖ = 0 only if x = 0;
    ‖λx‖ = |λ| ‖x‖  ∀x ∈ R^n, λ ∈ R;                             (1.7)
    ‖x + y‖ ≤ ‖x‖ + ‖y‖  ∀x, y ∈ R^n.

Familiar examples of norms in R^n are the l_p-norms

    ‖x‖_p = (Σ_{i=1}^n |x_i|^p)^{1/p},   1 ≤ p < ∞,

and the l_∞-norm

    ‖x‖_∞ = max_{1≤i≤n} |x_i|.

The l_2-norm, which can also be defined via the inner product by means of
‖x‖ = √⟨x, x⟩, is also called the Euclidean norm. The l_∞-norm is also called the
Tchebycheff norm.
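As a quick numerical illustration of these norms, and of the norm equivalence asserted in Proposition 1.8 below, here is a small Python/numpy check of the classical chain ‖x‖_∞ ≤ ‖x‖_2 ≤ ‖x‖_1 ≤ n‖x‖_∞ in R^n (the sample vector is our own):

    import numpy as np

    x = np.array([3.0, -4.0, 1.0])
    l1   = np.linalg.norm(x, 1)       # |3| + |-4| + |1|  = 8.0
    l2   = np.linalg.norm(x, 2)       # sqrt(9 + 16 + 1) ~= 5.099 (Euclidean)
    linf = np.linalg.norm(x, np.inf)  # max(3, 4, 1)      = 4.0 (Tchebycheff)
    assert linf <= l2 <= l1 <= len(x) * linf   # equivalence constants c1, c2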
Given a norm ‖·‖ in R^n, the distance between two points x, y ∈ R^n is the
nonnegative number ‖x − y‖; the ball of center a and radius r is the set
{x ∈ R^n | ‖x − a‖ ≤ r}.

Topological concepts in R^n such as open set, closed set, interior, closure,
neighborhood, convergent sequence, etc., are defined in R^n with respect to a given
norm. However, the following proposition shows that the topologies in R^n defined
by different norms are all equivalent.

Proposition 1.8 For any two norms ‖·‖, ‖·‖′ in R^n there exist constants c_2 ≥ c_1 > 0
such that

    c_1‖x‖ ≤ ‖x‖′ ≤ c_2‖x‖  ∀x ∈ R^n.

We omit the proof, which can be found, e.g., in Ortega and Rheinboldt (1970).
Denote the closure and the interior of a set C by cl C and int C, respectively.
Recall that a ∈ cl C if and only if every ball around a contains at least one point
of C, or equivalently, a is the limit of a sequence of points of C; a ∈ int C if and
only if there is a ball around a entirely included in C. An immediate consequence
of Proposition 1.8 is the following alternative characterization of interior points of
convex sets in R^n.

Corollary 1.6 A point a of a convex set C ⊂ R^n is an interior point of C if for every
x ∈ R^n there exists λ > 0 such that a + λ(x − a) ∈ C.

Proof Without loss of generality we can assume that a = 0. Let e^i be the i-th unit
vector of R^n. The condition implies that for each i = 1, …, n there is α_i > 0 such
that α_i e^i ∈ C and β_i > 0 such that −β_i e^i ∈ C. Now let δ = min{α_i, β_i | i = 1, …, n}
and consider the set B = {x | ‖x‖_1 ≤ δ}. For any x ∈ B, since x = Σ_{i=1}^n x_i e^i with
Σ_{i=1}^n |x_i|/δ ≤ 1, if we define u^i = δe^i for x_i > 0 and u^i = −δe^i for x_i ≤ 0, then
u^1, …, u^n ∈ C and we can write x = Σ_{i=1}^n (|x_i|/δ)u^i + (1 − Σ_{i=1}^n |x_i|/δ)·0. From
the convexity of C it follows that x ∈ C. Thus B ⊂ C, and since B is a ball in the
l_1-norm, 0 is an interior point of C. □

A point a belongs to the relative interior of a convex set C ⊂ R^n, a ∈ ri C, if
it is an interior point of C relative to aff C. The set difference (cl C) \ (ri C) is called
the relative boundary of C and denoted by ∂C.
Proposition 1.9 Any nonempty convex set C ⊂ R^n has a nonempty relative interior.

Proof Let M = aff C with dim M = r, so there exist r + 1 affinely independent
elements x^1, …, x^{r+1} of C. We show that a = (1/(r+1)) Σ_{i=1}^{r+1} x^i is a relative
interior point of C. Indeed, any x ∈ M is of the form x = Σ_{i=1}^{r+1} λ_i x^i, with
Σ_{i=1}^{r+1} λ_i = 1, so a + λ(x − a) = Σ_{i=1}^{r+1} μ_i x^i with

    μ_i = (1 − λ)/(r + 1) + λλ_i,   Σ_{i=1}^{r+1} μ_i = 1.

For λ > 0 sufficiently small, we have μ_i > 0, i = 1, …, r + 1, hence
a + λ(x − a) ∈ C. Therefore, by Corollary 1.6, a ∈ ri C. □

The dimension of a convex set is by definition the dimension of its affine hull.
A convex set C in R^n is said to have full dimension if dim C = n. It follows from
the above proposition that a convex set C in R^n has a nonempty interior if and only
if it has full dimension.

Proposition 1.10 The closure and the relative interior of a convex set are convex.

Proof Let C be a convex set and a, b ∈ cl C, say a = lim_{ν→∞} x^ν, b = lim_{ν→∞} y^ν,
where x^ν, y^ν ∈ C for every ν. For every λ ∈ [0, 1] we have (1 − λ)x^ν + λy^ν ∈ C,
hence (1 − λ)a + λb = lim_ν [(1 − λ)x^ν + λy^ν] ∈ cl C. Thus, if a, b ∈ cl C then
[a, b] ⊂ cl C, proving the convexity of cl C. Now let a, b ∈ ri C, so that there is
a ball B around 0 such that (a + B) ∩ aff C and (b + B) ∩ aff C are contained in
C. For any x = (1 − λ)a + λb, with λ ∈ (0, 1), we have (x + B) ∩ aff C =
(1 − λ)[(a + B) ∩ aff C] + λ[(b + B) ∩ aff C] ⊂ C, so x ∈ ri C. This proves the
convexity of ri C. □
Let C ⊂ R^n be a convex set containing the origin. The function p_C : R^n → R
such that

    p_C(x) = inf{λ > 0 | x ∈ λC}

is called the gauge or Minkowski functional of C.

Proposition 1.11 The gauge p_C(x) of a convex set C ⊂ R^n containing 0 in its
interior satisfies:
(i) p_C(λx) = λ p_C(x) ∀x ∈ R^n, λ ≥ 0 (positive homogeneity);
(ii) p_C(x + y) ≤ p_C(x) + p_C(y) ∀x, y ∈ C (subadditivity);
(iii) p_C(x) is continuous;
(iv) int C = {x | p_C(x) < 1} ⊂ C ⊂ cl C = {x | p_C(x) ≤ 1}.

Proof
(i) Since 0 ∈ int C it is easy to see that p_C(x) is finite everywhere. For every λ ≥ 0
we have p_C(λx) = inf{μ > 0 | λx ∈ μC} = inf{λν > 0 | x ∈ νC} = λ p_C(x).
(ii) If x ∈ λC, y ∈ μC (λ, μ > 0), i.e., x = λx′, y = μy′ with x′, y′ ∈ C, then

    x + y = (λ + μ) [ (λ/(λ + μ)) x′ + (μ/(λ + μ)) y′ ] ∈ (λ + μ)C.

Thus, p_C(x + y) ≤ λ + μ for all λ, μ > 0 such that x ∈ λC, y ∈ μC. Hence
p_C(x + y) ≤ p_C(x) + p_C(y).
(iii) Consider any x^0 ∈ R^n and let x − x^0 ∈ B_δ := {u | ‖u‖ ≤ δ}, where δ > 0 is
so small that the ball B_δ is contained in εC. Then x − x^0 ∈ εC, consequently
p_C(x − x^0) ≤ ε, hence p_C(x) ≤ p_C(x^0) + p_C(x − x^0) ≤ p_C(x^0) + ε; on the
other hand, x^0 − x ∈ εC implies similarly p_C(x^0) ≤ p_C(x) + ε. Thus, |p_C(x) −
p_C(x^0)| ≤ ε whenever x − x^0 ∈ B_δ, proving the continuity of p_C at any x^0 ∈ R^n.
(iv) This follows from the continuity of p_C(x). □
Corollary 1.7 If a ∈ int C, b ∈ cl C, then x = (1 − λ)a + λb ∈ int C for all
λ ∈ [0, 1). If a convex set C has nonempty interior then cl(int C) = cl C and
int(cl C) = int C.

Proof By translating we can assume that 0 ∈ int C, so int C = {x | p_C(x) < 1}.
Then p_C(x) ≤ (1 − λ)p_C(a) + λp_C(b) < 1 whenever p_C(a) < 1, p_C(b) ≤ 1, and
0 ≤ λ < 1. The second assertion is a straightforward consequence of (iv). □

Let C be a convex set in R^n. A vector y ≠ 0 is called a direction of recession
of C if

    {x + λy | λ ≥ 0} ⊂ C  ∀x ∈ C.                                (1.8)

Lemma 1.1 The set of all directions of recession of C is a convex cone. If C is
closed then (1.8) holds provided {x + λy | λ ≥ 0} ⊂ C for one x ∈ C.

Proof The first assertion can easily be verified. If C is closed, and {x + λy | λ ≥ 0} ⊂ C
for one x ∈ C, then for any x′ ∈ C, λ ≥ 0 we have x′ + λy ∈ C because
x′ + λy = lim_{μ→+∞} z_μ, where z_μ = (1 − λ/(λ + μ))x′ + (λ/(λ + μ))(x + μy) ∈ C. □

The convex cone formed by all directions of recession and the vector zero is
called the recession cone of C and is denoted by rec C.
Proposition 1.12 Let C ⊂ R^n be a convex set containing a point a in its interior.
(i) For every x ≠ a the halfline Γ(a, x) = {a + λ(x − a) | λ > 0} either entirely lies
in C or it cuts the boundary ∂C at a unique point π(x), such that every point in
the line segment [a, π(x)], except π(x), is an interior point of C.
(ii) The mapping π(x) is defined and continuous on the set R^n \ (a + M), where
M = rec(int C) = rec(cl C) is a closed convex cone.

[Fig. 1.1 The mapping π(x)]

Proof
(i) Assuming, without loss of generality, that 0 ∈ int C, it suffices to prove the
proposition for a = 0. Let p_C(x) be the gauge of C. For every x ≠ a, if p_C(x) = 0
then x ∈ λC for arbitrarily small λ > 0, i.e., λx ∈ C for all λ > 0, hence
the halfline Γ(a, x) entirely lies in C. If p_C(x) > 0 then p_C(λx) = λ p_C(x) = 1
just for λ = 1/p_C(x). So Γ(a, x) cuts ∂C at the unique point π(x) = x/p_C(x),
such that for all 0 ≤ λ < 1/p_C(x) we have p_C(λx) < 1 and hence λx ∈ int C.
(ii) The domain of definition of π is R^n \ M, with M = {x | p_C(x) = 0}. The set M
is a closed convex cone (Fig. 1.1) because p_C(x) is continuous and p_C(x) = 0,
p_C(y) = 0 implies 0 ≤ p_C(λx + μy) ≤ λ p_C(x) + μ p_C(y) = 0 for all λ > 0
and μ > 0. Furthermore, the function π(x) = x/p_C(x) is continuous at any x
where p_C(x) > 0. It remains to prove that M is the recession cone of both int C
and cl C. It is plain that rec(int C) ⊂ M. Conversely, if y ∈ M and x ∈ int C then
x = αa + (1 − α)c, for some c ∈ C and 0 < α < 1. Since a + λy ∈ C for all
λ > 0, it follows that x + αλy = αa + (1 − α)c + αλy = α(a + λy) + (1 − α)c ∈ C
for all λ > 0, i.e., y ∈ rec(int C). Thus, M ⊂ rec(int C). Finally, any x ∈ ∂C
is the limit of some sequence {x^k} ⊂ int C. If y ∈ M then for every λ > 0:
x^k + λy ∈ C ∀k, hence by letting k → ∞, x + λy ∈ cl C. This means
that M ⊂ rec(cl C), and since the converse inclusion is obvious we must have
M = rec(cl C). □

Note that if C is neither closed nor open then rec C may be a proper subset of
rec(int C) = rec(cl C). An example is the set C = {(x_1, x_2) | x_1 > 0, x_2 > 0} ∪
{(0, 0)}, for which rec C = C, rec(int C) = cl C.

Corollary 1.8 A nonempty closed convex set C ⊂ R^n is bounded if and only if its
recession cone is the singleton {0}.

Proof By considering C in aff C, we can assume that C has an interior point
(Proposition 1.9) and that 0 ∈ int C. If C is bounded then by Proposition 1.12, π(x)
is defined on R^n \ {0}, i.e., rec(int C) = {0}. Conversely, if rec(int C) = {0}, then
p_C(x) > 0 ∀x ≠ 0. Letting K = min{p_C(x) | ‖x‖ = 1}, we then have, for every
x ∈ C: p_C(x) = ‖x‖ p_C(x/‖x‖) ≤ 1, hence ‖x‖ ≤ 1/p_C(x/‖x‖) ≤ 1/K. □
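Corollary 1.8 also yields a practical boundedness test. For a nonempty polyhedron C = {x | Ax ≤ b} (polyhedral sets are studied in Sect. 1.9) the recession cone is {y | Ay ≤ 0}, a standard fact we take for granted in this illustration; C is then bounded iff this cone reduces to {0}, which can be checked with 2n small linear programs. A sketch assuming scipy is available (the helper name is ours):

    import numpy as np
    from scipy.optimize import linprog

    def polyhedron_is_bounded(A):
        # C = {x : Ax <= b} nonempty has rec C = {y : Ay <= 0}; probe that
        # cone by maximizing +/- y_j over it within the box [-1, 1]^n.
        m, n = A.shape
        for j in range(n):
            for sign in (1.0, -1.0):
                c = np.zeros(n); c[j] = -sign          # linprog minimizes
                res = linprog(c, A_ub=A, b_ub=np.zeros(m),
                              bounds=[(-1, 1)] * n)
                if res.fun < -1e-9:    # nonzero recession direction found
                    return False
        return True

    square = np.array([[1.0, 0], [-1, 0], [0, 1], [0, -1]])  # |x_i| <= b_i
    print(polyhedron_is_bounded(square))    # True: rec cone = {0}
    orthant = np.array([[-1.0, 0], [0, -1]])                 # x >= bound
    print(polyhedron_is_bounded(orthant))   # False: rec cone = R^2_+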

1.4 Convex Hull

Given any set E ⊂ R^n, there exists a convex set containing E, namely R^n. The
intersection of all convex sets containing E is called the convex hull of E and denoted
by conv E. This is in fact the smallest convex set containing E.

Proposition 1.13 The convex hull of a set E ⊂ R^n consists of all convex
combinations of its elements.

Proof Let C be the set of all convex combinations of elements of E. Clearly
C ⊂ conv E. To prove the converse inclusion, it suffices to show that C is convex
(since obviously C ⊃ E). If x = Σ_{i∈I} λ_i a^i, y = Σ_{j∈J} μ_j b^j with a^i, b^j ∈ E and
0 ≤ λ ≤ 1 then

    (1 − λ)x + λy = Σ_{i∈I} (1 − λ)λ_i a^i + Σ_{j∈J} λμ_j b^j.

Since

    Σ_{i∈I} (1 − λ)λ_i + Σ_{j∈J} λμ_j = (1 − λ) Σ_{i∈I} λ_i + λ Σ_{j∈J} μ_j = (1 − λ) + λ = 1,

it follows that (1 − λ)x + λy ∈ C. Therefore, C is convex. □
The convex hull S of a set of k affinely independent points a^1, …, a^k in R^n,
written S = [a^1, …, a^k], is called a (k − 1)-simplex spanned by a^1, …, a^k. By
Corollary 1.2 this is a (k − 1)-dimensional set every point of which is uniquely
represented as a convex combination of a^1, …, a^k:

    x = Σ_{i=1}^k λ_i a^i,  with  Σ_{i=1}^k λ_i = 1,  λ_i ≥ 0, i = 1, …, k.

The numbers λ_i = λ_i(x), i = 1, …, k, are called the barycentric coordinates of x
in the simplex S. For any two points x, x′ in S we have x − x′ =
Σ_{i=1}^k [λ_i(x) − λ_i(x′)]a^i with Σ_{i=1}^k [λ_i(x) − λ_i(x′)] = 0; since the a^i are
affinely independent, this, together with the equivalence of norms (Proposition 1.8),
yields a constant c > 0 with ‖λ(x) − λ(x′)‖ ≤ c‖x − x′‖, so the 1-1 map x ↦ λ(x)
is continuous.

Proposition 1.13 shows that any point x ∈ conv E can be expressed as a convex
combination of a finite number of points of E. The next proposition, which gives an
upper bound on the minimal number of points in such expressions, is a fundamental
dimensionality result in convexity theory.
Theorem 1.1 (Caratheodorys Theorem) Let E be a set contained in an affine
set of dimension k: Then any vector x 2 convE can be represented as a convex
combination of k C 1 or fewer elements of E:
1.4 Convex Hull 13

Proof By Proposition 1.13, for any x 2 convE there exists a finite subset of E; say
a1 ; : : : ; am 2 E; such that

X
m X
m
i a D x;
i
i D 1; i  0:
iD1 iD1

If  D .1 ; : : : ; m / is a basic solution (in the sense of linear programming theory)
of the above linear system in 1 ; : : : ; m ; then at most k C 1 components of  are
positive. For example, i D 0 for all i > h where h  kC1; i.e., x can be represented
as a convex combination of a1 ; : : : ; ah ; with h  k C 1: t
u
Corollary 1.9 The convex hull of a compact set is compact.

Proof Let $E$ be a compact set and $\{x^k\} \subset \mathrm{conv}\,E$. By the above theorem, $x^k = \sum_{i=1}^{n+1} \lambda_{i,k} a^{i,k}$, with $a^{i,k} \in E$, $\lambda_{i,k} \ge 0$, $\sum_{i=1}^{n+1} \lambda_{i,k} = 1$. Since $E \times [0,1]$ is compact, there exists a subsequence $\{k_\nu\}$ such that
$$\lambda_{i,k_\nu} \to \lambda_i, \quad a^{i,k_\nu} \to a^i, \quad a^i \in E,\ \lambda_i \in [0,1].$$
We then have $x^{k_\nu} \to \sum_{i=1}^{n+1} \lambda_i a^i \in \mathrm{conv}\,E$, proving the corollary. □

Remark 1.1 The convex hull of a closed set $E$ may not be closed, as shown by the example of a set $E \subset \mathbb{R}^2$ consisting of a straight line $L$ and a point $a \notin L$ (Fig. 1.2).
Theorem 1.2 (Shapley–Folkman's Theorem) Let $\{S_i,\ i = 1,\dots,m\}$ be any finite family of sets in $\mathbb{R}^n$. For every $x \in \mathrm{conv}\Big(\sum_{i=1}^m S_i\Big)$ there exists a subfamily $\{S_i,\ i \in I\}$, with $|I| \le n$, such that
$$x \in \sum_{i\notin I} S_i + \sum_{i\in I} \mathrm{conv}(S_i).$$

Fig. 1.2 Convex hull of a closed set

Roughly speaking, if $m$ is very large and every $S_i$ is sufficiently small then the sum $\sum_{i=1}^m S_i$ differs little from its convex hull.

Proof Observe that $\mathrm{conv}\big(\sum_{i=1}^m S_i\big) = \sum_{i=1}^m \mathrm{conv}\,S_i$. Indeed, define the linear mapping $\varphi: \mathbb{R}^{nm} \to \mathbb{R}^n$ such that $\varphi((x^1,\dots,x^m)) = \sum_{i=1}^m x^i$. Then $\sum_{i=1}^m S_i = \varphi\big(\prod_{i=1}^m S_i\big)$ and by linearity of $\varphi$:
$$\mathrm{conv}\,\varphi\Big(\prod_{i=1}^m S_i\Big) = \varphi\Big(\mathrm{conv}\prod_{i=1}^m S_i\Big) = \varphi\Big(\prod_{i=1}^m \mathrm{conv}\,S_i\Big),$$
proving our assertion. So any $x \in \mathrm{conv}\big(\sum_{i=1}^m S_i\big)$ is of the form $x = \sum_{i=1}^m x^i$, with $x^i \in \mathrm{conv}\,S_i$. Let $x^i$ be a convex combination of points $x^{ij} \in S_i$, $j \in J_i$. Then there exist $\lambda_{ij}$ satisfying
$$\sum_{i=1}^m \sum_{j\in J_i} \lambda_{ij} x^{ij} = x$$
$$\sum_{j\in J_i} \lambda_{ij} = 1 \quad i = 1,\dots,m$$
$$\lambda_{ij} \ge 0 \quad i = 1,\dots,m,\ j \in J_i.$$
Let $\lambda = (\lambda_{ij})$ be a basic solution of this linear system in $\lambda_{ij}$, so that at most $n + m$ components of $\lambda$ are positive. Denote by $I$ the set of all $i$ such that $\lambda_{ij} > 0$ for at least two $j \in J_i$, and let $|I| = k$. Then $\lambda$ has at least $2k + (m-k)$ positive components, therefore $n + m \ge 2k + (m-k)$, and hence $k \le n$. Since for every $i \notin I$ we have $\lambda_{ij_i} = 1$ for some $j_i \in J_i$ and $\lambda_{ij} = 0$ for all other $j \in J_i$, it follows that
$$x = \sum_{i\notin I} x^{ij_i} + \sum_{i\in I}\sum_{j\in J_i} \lambda_{ij} x^{ij} \in \sum_{i\notin I} S_i + \sum_{i\in I} \mathrm{conv}\,S_i.$$
This is the desired expression. □
We close this section with a proposition generalizing Carathéodory's Theorem. A set $X \subset \mathbb{R}^n$ is called a Carathéodory core of a set $E \subset \mathbb{R}^n$ if there exist a set $B \subset \mathbb{R}^n$ and an interval $\Delta \subset \mathbb{R}$, along with a mapping $\rho: X \to \Delta$ such that $tB \subset t'B$ for $t, t' \in \Delta$, $t < t'$, and
$$E = \bigcup_{x\in X} (x + \rho(x)B).$$
A case of interest for the applications is when $B$ is the unit ball and $\Delta = \mathbb{R}_+$, so that $E = \bigcup_{x\in X} B(x, \rho(x))$, where $B(x,r)$ denotes the ball of center $x$ and radius $r$. Since a trivial Carathéodory core of any set $E$ is $X = E$ with $B$ being the unit ball in $\mathbb{R}^n$ and $\Delta = \{0\}$, Carathéodory's Theorem appears to be a special case of the following general proposition (Thach and Konno 1996):
Proposition 1.14 If a set $E$ in $\mathbb{R}^n$ has a Carathéodory core $X$ contained in a $k$-dimensional affine set then any vector of $\mathrm{conv}\,E$ is a convex combination of $k+1$ or fewer elements of $E$.

Proof For any $y \in \mathrm{conv}(E)$ there exists a finite set of vectors $\{y^1,\dots,y^m\} \subset E$ such that
$$y = \sum_{i=1}^m \lambda_i y^i, \quad \lambda_i \ge 0, \quad \sum_{i=1}^m \lambda_i = 1. \qquad (1.9)$$
Since $y^i \in E$, there is $x^i \in X$ such that $y^i \in x^i + \rho_i B$, $\rho_i := \rho(x^i)$. Setting $x = \sum_{i=1}^m \lambda_i x^i$, $\rho = \sum_{i=1}^m \lambda_i \rho_i$, we have
$$(x, \rho) = \sum_{i=1}^m \lambda_i (x^i, \rho_i).$$
Consider the linear program
$$\max \sum_{i=1}^m t_i \rho_i \quad \text{s.t.}$$
$$\sum_{i=1}^m t_i x^i = x, \quad \sum_{i=1}^m t_i = 1 \qquad (1.10)$$
$$t_i \ge 0, \quad i = 1,\dots,m, \qquad (1.11)$$
and let $t^* = (t_1^*,\dots,t_m^*)$ be a basic optimal solution of it. Since $X$ is contained in an affine set of dimension $k$, the matrix of the system (1.10)
$$\begin{pmatrix} x^1 & x^2 & \cdots & x^m \\ 1 & 1 & \cdots & 1 \end{pmatrix}$$
is of rank at most $k+1$ (Proposition 1.4). Therefore, $t^*$ must satisfy $m - (k+1)$ constraints (1.11) as equalities. That is, $t_i^* = 0$ for at least $m - (k+1)$ indices $i$. Without loss of generality we can assume that $t_i^* = 0$ for $i = k+2,\dots,m$, so by setting $\rho^* = \sum_{i=1}^{k+1} t_i^* \rho_i$, we have
$$(x, \rho^*) = \sum_{i=1}^{k+1} t_i^* (x^i, \rho_i), \quad t_i^* \ge 0, \quad \sum_{i=1}^{k+1} t_i^* = 1.$$
Here $\rho \le \rho^*$ (because $t^*$ is optimal to the above program while $t_i = \lambda_i$, $i = 1,\dots,m$, is a feasible solution). Hence $\rho B \subset \rho^* B$ and we can write
$$y = \sum_{i=1}^m \lambda_i y^i \in \sum_{i=1}^m \lambda_i (x^i + \rho_i B) = \sum_{i=1}^m \lambda_i x^i + \sum_{i=1}^m \lambda_i \rho_i B = x + \rho B \subset x + \rho^* B = \sum_{i=1}^{k+1} t_i^* x^i + \sum_{i=1}^{k+1} t_i^* \rho_i B = \sum_{i=1}^{k+1} t_i^* (x^i + \rho_i B).$$
We have thus found $k+1$ vectors $u^i \in x^i + \rho_i B \subset E$, $i = 1,\dots,k+1$, such that $y = \sum_{i=1}^{k+1} t_i^* u^i$ with $t_i^* \ge 0$, $\sum_{i=1}^{k+1} t_i^* = 1$. □
Note that in the above proposition the dimension $k$ of $\mathrm{aff}\,X$ can be much smaller than the dimension of $\mathrm{aff}\,E$. For instance, if $X = [a,b]$ is a line segment in $\mathbb{R}^n$ and $B$ is the unit ball while $\rho(a) = \rho(b) = 1$, $\rho(x) = 0$ $\forall x \notin \{a,b\}$ (so that $E$ is the union of a line segment and two balls centered at the endpoints of this segment), then $\mathrm{aff}\,E = \mathbb{R}^n$ but $\mathrm{aff}\,X$ is a line. Thus in this case, by Proposition 1.14, any point of $\mathrm{conv}\,E$ can be represented as a convex combination of only two points of $E$ (and not $n+1$ points, as by applying Carathéodory's Theorem).

1.5 Separation Theorems

Separation is one of the most useful concepts of convexity theory. The first and fundamental fact is the following:

Lemma 1.2 Let $C$ be a nonempty convex subset of $\mathbb{R}^n$. If $0 \notin C$ then there exists a vector $t \in \mathbb{R}^n$ such that
$$\sup_{x\in C} \langle t, x\rangle \le 0, \quad \inf_{x\in C} \langle t, x\rangle < 0. \qquad (1.12)$$

Proof The proposition is obvious for $n = 1$. Arguing by induction, assume it is true for $n = k-1$ and prove it for $n = k$. Define
$$C^0 = \{x \in C \mid x_n = 0\}, \quad C^+ = \{x \in C \mid x_n > 0\}, \quad C^- = \{x \in C \mid x_n < 0\}.$$
If $x_n = 0$ for all $x \in C$ then $C$ can be identified with a set in $\mathbb{R}^{n-1}$, so the Lemma is true by the induction assumption. If $C^- = \emptyset$ but $C^+ \neq \emptyset$, then the vector $t = (0,\dots,0,-1)$ satisfies (1.12). Analogously, if $C^+ = \emptyset$ and $C^- \neq \emptyset$ we can take $t = (0,\dots,0,1)$. Therefore, we may assume that both $C^-$ and $C^+$ are nonempty. For any $x \in C^+$, $x' \in C^-$ we have
$$\lambda\Big(\frac{x}{x_n} - \frac{x'}{x'_n}\Big) \in C^0 \qquad (1.13)$$
for the $\lambda > 0$ such that $\lambda(1/x_n - 1/x'_n) = 1$. Thus, the set
$$\bar{C} = \{(x_1,\dots,x_{n-1}) \mid (x_1,\dots,x_{n-1},0) \in C\}$$
is a nonempty convex set in $\mathbb{R}^{n-1}$ which does not contain the origin. By the induction hypothesis there exist a vector $t \in \mathbb{R}^{n-1}$ and a point $(\hat{x}_1,\dots,\hat{x}_{n-1}) \in \bar{C}$ satisfying
$$\sum_{i=1}^{n-1} t_i \hat{x}_i < 0, \quad \sup\Big\{\sum_{i=1}^{n-1} t_i x_i \ \Big|\ (x_1,\dots,x_{n-1}) \in \bar{C}\Big\} \le 0. \qquad (1.14)$$
From (1.13) and (1.14), for all $x \in C^+$, $x' \in C^-$ we deduce
$$\sum_{i=1}^{n-1} t_i\Big(\frac{x_i}{x_n} - \frac{x'_i}{x'_n}\Big) \le 0,$$
i.e., $p(x) \le q(x')$, where
$$p(x) = \sum_{i=1}^{n-1} t_i \frac{x_i}{x_n}\ (x \in C^+), \quad q(x') = \sum_{i=1}^{n-1} t_i \frac{x'_i}{x'_n}\ (x' \in C^-).$$
Setting
$$p^* = \sup_{x\in C^+} p(x), \quad q^* = \inf_{x\in C^-} q(x),$$
we then conclude $p^* \le q^*$. Therefore, if we now take $t = (t_1,\dots,t_{n-1},t_n)$, with $t_n \in [-q^*, -p^*]$, then $p(x) + t_n \le 0$ $\forall x \in C^+$ and $t_n + q(x) \ge 0$ $\forall x \in C^-$. This implies
$$\sum_{i=1}^n t_i x_i \le 0 \quad \forall x \in C,$$
completing the proof since $\langle t, \hat{x}\rangle < 0$ for $\hat{x} = (\hat{x}_1,\dots,\hat{x}_{n-1},0)$. □
We say that two nonempty subsets $C, D$ of $\mathbb{R}^n$ are separated by the hyperplane $\langle t, x\rangle = \alpha$ $(t \in \mathbb{R}^n \setminus \{0\})$ if
$$\sup_{x\in C} \langle t, x\rangle \le \alpha \le \inf_{y\in D} \langle t, y\rangle. \qquad (1.15)$$

Theorem 1.3 (First Separation Theorem) Two disjoint nonempty convex sets $C, D$ in $\mathbb{R}^n$ can be separated by a hyperplane.

Proof The set $C - D$ is convex and satisfies $0 \notin C - D$, so by the previous Lemma there exists $t \in \mathbb{R}^n \setminus \{0\}$ such that $\langle t, x - y\rangle \le 0$ $\forall x \in C$, $y \in D$. Then (1.15) holds for $\alpha = \sup_{x\in C} \langle t, x\rangle$. □
Remark 1.2 Before drawing some consequences of the above result, it is useful to recall a simple property of linear functions:

A linear function $l(x) := \langle t, x\rangle$ with $t \in \mathbb{R}^n \setminus \{0\}$ never achieves its minimum (or maximum) over a set $D$ at an interior point of $D$; if it is bounded above (or below) over an affine set then it is constant on this affine set.

It follows from this fact that if the set $D$ in Theorem 1.3 is open then (1.15) implies
$$\langle t, x\rangle \le \alpha < \langle t, y\rangle \quad \forall x \in C,\ y \in D.$$

Furthermore:

Corollary 1.10 If an affine set $C$ does not meet an open convex set $D$ then there exists a hyperplane containing $C$ and not meeting $D$.

Proof Let $\langle t, x\rangle = \alpha$ be the hyperplane satisfying (1.15). By the second part of the Remark above, we must have $\langle t, x\rangle = \mathrm{const}$ $\forall x \in C$, i.e., $C$ is contained in the hyperplane $\langle t, x - x^0\rangle = 0$, where $x^0$ is any point of $C$. By the first part of the Remark, $\langle t, y\rangle > \alpha$ $\forall y \in D$ (hence $D \cap C = \emptyset$), since if $\langle t, y^0\rangle = \alpha$ for some $y^0 \in D$ then this would mean that $\alpha$ is the minimum of the linear function $\langle t, x\rangle$ over the open set $D$, attained at an interior point. □
Lemma 1.3 Let $C$ be a nonempty convex set in $\mathbb{R}^n$. If $C$ is closed and $0 \notin C$ then there exist a vector $t \in \mathbb{R}^n \setminus \{0\}$ and a number $\alpha < 0$ such that
$$\langle t, x\rangle \le \alpha < 0 \quad \forall x \in C.$$

Proof Since $C$ is closed and $0 \notin C$, there is a ball $D$ around $0$ which does not intersect $C$. By the above theorem, there is $t \in \mathbb{R}^n \setminus \{0\}$ satisfying (1.15). Then $\alpha < 0$ because $0 \in \mathrm{int}\,D$. □

We say that two nonempty subsets $C, D$ of $\mathbb{R}^n$ are strongly separated by a hyperplane $\langle t, x\rangle = \alpha$ $(t \in \mathbb{R}^n \setminus \{0\},\ \alpha \in \mathbb{R})$ if
$$\sup_{x\in C} \langle t, x\rangle < \alpha < \inf_{y\in D} \langle t, y\rangle. \qquad (1.16)$$

Theorem 1.4 (Second Separation Theorem) Two disjoint nonempty closed convex sets $C, D$ in $\mathbb{R}^n$ such that either $C$ or $D$ is compact can be strongly separated by a hyperplane.

Proof Assume, e.g., that $C$ is compact and consider the convex set $C - D$. If $z^k = x^k - y^k \to z$, with $x^k \in C$, $y^k \in D$, then by compactness of $C$ there exists a subsequence $x^{k_\nu} \to x \in C$, and by closedness of $D$, $y^{k_\nu} = x^{k_\nu} - z^{k_\nu} \to x - z \in D$; hence $z = x - y$ with $x \in C$, $y \in D$. So the set $C - D$ is closed. Since $0 \notin C - D$, there exist by the above Lemma a vector $t \in \mathbb{R}^n \setminus \{0\}$ and a number $\gamma < 0$ such that $\langle t, x - y\rangle \le \gamma < 0$ $\forall x \in C$, $y \in D$. Then $\sup_{x\in C}\langle t, x\rangle - \gamma/2 \le \inf_{y\in D}\langle t, y\rangle + \gamma/2$. Setting $\alpha = \sup_{x\in C}\langle t, x\rangle - \gamma/2$ yields (1.16). □
Remark 1.3 If the set $C$ is a cone then one can take $\alpha = 0$ in (1.15). Indeed, for any given $x \in C$ we then have $\lambda x \in C$ $\forall \lambda > 0$, so $\langle t, \lambda x\rangle \le \alpha$ $\forall \lambda > 0$, hence $\langle t, x\rangle \le 0$ $\forall x \in C$. Similarly, one can take $\alpha = 0$ in (1.16).

1.6 Geometric Structure

There are several important concepts related to the geometric structure of convex sets: supporting hyperplane and normal cone, face and extreme point.

1.6.1 Supporting Hyperplane

A hyperplane $H = \{x \mid \langle t, x\rangle = \alpha\}$ is said to be a supporting hyperplane to a convex set $C \subset \mathbb{R}^n$ if at least one point $x^0$ of $C$ lies in $H$: $\langle t, x^0\rangle = \alpha$, and all points of $C$ lie in one halfspace determined by $H$, say $\langle t, x\rangle \le \alpha$ $\forall x \in C$ (Fig. 1.3). The halfspace $\langle t, x\rangle \le \alpha$ is called a supporting halfspace to $C$.
Theorem 1.5 Through every boundary point $x^0$ of a convex set $C \subset \mathbb{R}^n$ there exists at least one supporting hyperplane to $C$.

Proof Since $x^0 \notin \mathrm{ri}\,C$, there exists by Theorem 1.3 a hyperplane $\langle t, x\rangle = \alpha$ such that
$$\langle t, x\rangle \le \alpha \le \langle t, x^0\rangle \quad \forall x \in \mathrm{ri}\,C,$$
hence
$$\langle t, x - x^0\rangle \le 0 \quad \forall x \in C.$$
Thus, the hyperplane $H = \{x \mid \langle t, x - x^0\rangle = 0\}$ is a supporting hyperplane to $C$ at $x^0$. □

Fig. 1.3 Supporting hyperplane
The normal $t$ to a supporting hyperplane to $C$ at $x^0$ is characterized by the condition
$$\langle t, x - x^0\rangle \le 0 \quad \forall x \in C. \qquad (1.17)$$
A vector $t$ satisfying this condition does not make an acute angle with any line segment in $C$ having $x^0$ as endpoint. It is also called an outward normal, or simply a normal, to $C$ at the point $x^0$ ($-t$ is called an inward normal). The set of all vectors $t$ normal to $C$ at $x^0$ is called the normal cone to $C$ at $x^0$ and is denoted by $N_C(x^0)$. It is readily verified that $N_C(x^0)$ is a convex cone which reduces to the singleton $\{0\}$ when $x^0$ is an interior point of $C$.
Proposition 1.15 Let $C$ be a closed convex set. For any $y^0 \notin C$ there exists $x^0 \in \partial C$ such that $y^0 - x^0 \in N_C(x^0)$.

Proof Let $\{x^k\} \subset C$ be a sequence such that
$$\|y^0 - x^k\| \to \inf_{x\in C} \|y^0 - x\| \quad (k \to \infty).$$
The sequence $\{\|y^0 - x^k\|\}$ is bounded (because convergent), and hence so is the sequence $\{\|x^k\|\}$, because $\|x^k\| \le \|x^k - y^0\| + \|y^0\|$. Therefore, there is a subsequence $x^{k_\nu} \to x^0$. Since $C$ is closed, we have $x^0 \in C$ and
$$\|y^0 - x^0\| = \min_{x\in C} \|y^0 - x\|.$$
It is easy to see that $y^0 - x^0 \in N_C(x^0)$. Indeed, for any $x \in C$ and $\lambda \in (0,1)$ we have $x_\lambda := \lambda x + (1-\lambda)x^0 \in C$, so $\|y^0 - x_\lambda\|^2 \ge \|y^0 - x^0\|^2$, i.e.,
$$\|(y^0 - x^0) - \lambda(x - x^0)\|^2 \ge \|y^0 - x^0\|^2.$$
Upon easy computation this yields
$$\lambda^2\|x - x^0\|^2 - 2\lambda\langle y^0 - x^0, x - x^0\rangle \ge 0$$
for all $\lambda \in (0,1)$, hence $\langle y^0 - x^0, x - x^0\rangle \le 0$, as desired. □
Note that the condition $y^0 - x^0 \in N_C(x^0)$ uniquely defines $x^0$. Indeed, if $y^0 - x^1 \in N_C(x^1)$ then from the inequalities
$$\langle y^0 - x^0, x^1 - x^0\rangle \le 0, \quad \langle y^0 - x^1, x^0 - x^1\rangle \le 0,$$
it follows that $\langle x^0 - x^1, x^1 - x^0\rangle \ge 0$, hence $x^1 - x^0 = 0$. The point $x^0$ is called the projection of $y^0$ on the convex set $C$. As seen from the above, it is the point of $C$ nearest to $y^0$.

Theorem 1.6 A closed convex set $C$ which is neither empty nor the whole space is equal to the intersection of all its supporting halfspaces.

Proof Let $E$ be this intersection. Obviously, $C \subset E$. Suppose $E \setminus C \neq \emptyset$ and let $y^0 \in E \setminus C$. By the above proposition, there exists a supporting halfspace to $C$ which does not contain $y^0$, conflicting with $y^0 \in E$. Therefore, $C = E$. □

1.6.2 Face and Extreme Point

A convex subset $F$ of a convex set $C$ is called a face of $C$ if any line segment in $C$ with a relative interior point in $F$ lies entirely in $F$, i.e., if
$$x, y \in C,\ (1-\lambda)x + \lambda y \in F,\ 0 < \lambda < 1 \ \Rightarrow\ [x,y] \subset F. \qquad (1.18)$$
Of course, the empty set and $C$ itself are faces of $C$. A proper face of $C$ is a face which is neither the empty set nor $C$ itself. A face of a convex cone is itself a convex cone.

Proposition 1.16 The intersection of a convex set $C$ with a supporting hyperplane $H$ is a face of $C$.

Proof The convexity of $F := H \cap C$ is obvious. If a segment $[a,b] \subset C$ had a relative interior point $x$ in $F$ but were not contained in $F$, then $H$ would strictly separate $a$ and $b$, contrary to the definition of a supporting hyperplane. □

A zero-dimensional face of $C$ is called an extreme point of $C$. In other words, an extreme point of $C$ is a point $x \in C$ which is not a relative interior point of any line segment with two distinct endpoints in $C$.

If a convex set $C$ has a halfline face, the direction of this halfline is called an extreme direction of $C$. The set of extreme directions of $C$ is the same as the set of extreme directions of its recession cone.

It can easily be verified that a face of a face of $C$ is also a face of $C$. In particular: an extreme point (or direction) of a face of a convex set $C$ is also an extreme point (or direction, resp.) of $C$.
Proposition 1.17 Let $E$ be an arbitrary set in $\mathbb{R}^n$. Every extreme point of $C = \mathrm{conv}\,E$ belongs to $E$.

Proof Let $x$ be an extreme point of $C$. By Proposition 1.13, $x = \sum_{i=1}^k \lambda_i x^i$, $x^i \in E$, $\lambda_i > 0$, $\sum_{i=1}^k \lambda_i = 1$. It is easy to see that $\lambda_i = 1$ (i.e., $x = x^i$) for some $i$. In fact, otherwise there is $h$ with $0 < \lambda_h < 1$, so
$$x = \lambda_h x^h + (1-\lambda_h)y^h, \quad \text{with } y^h = \sum_{i\neq h} \frac{\lambda_i}{1-\lambda_h}\, x^i \in C$$
(because $\sum_{i\neq h} \frac{\lambda_i}{1-\lambda_h} = 1$), i.e., $x$ is a relative interior point of the line segment $[x^h, y^h] \subset C$, conflicting with $x$ being an extreme point. □
Given an arbitrary point $a$ of a convex set $C$, there always exists at least one face of $C$ containing $a$, namely $C$. The intersection of all faces of $C$ containing $a$ is obviously a face. We refer to this face as the smallest face containing $a$ and denote it by $F_a$.
Proposition 1.18 $F_a$ is the union of $\{a\}$ and all $x \in C$ for which there is a line segment $[x,y] \subset C$ with $a$ as a relative interior point.

Proof Denote the mentioned union by $F$. If $x \in F$, i.e., $x \in C$ and there is $y \in C$ such that $a = (1-\lambda)x + \lambda y$ with $0 < \lambda < 1$, then $x, y \in F_a$ because $F_a$ is a face. Thus, $F \subset F_a$. It suffices, therefore, to show that $F$ is a face. By translating if necessary, we may assume that $a = 0$. It is easy to see that $F$ is a convex subset of $C$. Indeed, if $x^1, x^2 \in F$, i.e., $x^1 = -\lambda_1 y^1$, $x^2 = -\lambda_2 y^2$, with $y^1, y^2 \in C$ and $\lambda_1, \lambda_2 > 0$, then for any $x = (1-\mu)x^1 + \mu x^2$ $(0 < \mu < 1)$ we have $x = -(1-\mu)\lambda_1 y^1 - \mu\lambda_2 y^2$, with
$$-\frac{1}{(1-\mu)\lambda_1 + \mu\lambda_2}\, x = \frac{(1-\mu)\lambda_1}{(1-\mu)\lambda_1 + \mu\lambda_2}\, y^1 + \frac{\mu\lambda_2}{(1-\mu)\lambda_1 + \mu\lambda_2}\, y^2 \in C,$$
hence $x \in F$. Thus, it remains to prove (1.18). Let $x, y \in C$, $u = (1-\lambda)x + \lambda y \in F$, $0 < \lambda < 1$. Since $u \in F$, there are $v \in C$ and $\mu > 0$ such that $u = -\mu v$, so
$$x = \frac{u}{1-\lambda} - \frac{\lambda y}{1-\lambda} = -\frac{\mu v}{1-\lambda} - \frac{\lambda y}{1-\lambda}.$$
We have
$$-\frac{(1-\lambda)x}{\mu + \lambda} = \frac{\mu}{\mu + \lambda}\, v + \frac{\lambda}{\mu + \lambda}\, y \in C,$$
so $x \in F$, and similarly, $y \in F$. Therefore, $F$ is a face, completing the proof. □
An equivalent formulation of the above proposition is the following:

Proposition 1.19 Let $C$ be a convex set and $a \in C$. A face $F$ of $C$ is the smallest face containing $a$ if and only if $a \in \mathrm{ri}\,F$.

Proof Indeed, for any subset $F$ of $C$ the condition in Proposition 1.18 is equivalent to saying that $F$ is a face and $a \in \mathrm{ri}\,F$. □

Corollary 1.11 If a face $F_1$ of a convex set $C$ is a proper subset of a face $F_2$ then $\dim F_1 < \dim F_2$.

Proof Let $a$ be any relative interior point of $F_1$. If $\dim F_1 = \dim F_2$, then $a \in \mathrm{ri}\,F_2$, and so both $F_1$ and $F_2$ coincide with the smallest face containing $a$. □

Corollary 1.12 A face $F$ of a closed convex set $C$ is a closed set.

Proof We may assume that $0 \in \mathrm{ri}\,F$, so that $F = F_0$. Let $M$ be the smallest subspace containing $F$. Clearly, every $x \in M$ is of the form $x = \lambda y$ for some $y \in F$ and $\lambda > 0$. Therefore, by Proposition 1.18, $F = M \cap C$, and since $M$ and $C$ are closed, so must be $F$. □

1.7 Representation of Convex Sets

Given a nonempty convex set $C$, the set $(\mathrm{rec}\,C) \cap (-\mathrm{rec}\,C)$ is called the lineality space of $C$. It is actually the largest subspace contained in $\mathrm{rec}\,C$ and consists of the zero vector and all the nonzero vectors $y$ such that, for every $x \in C$, the line through $x$ in the direction of $y$ is contained in $C$. The dimension of the lineality space of $C$ is called the lineality of $C$. A convex set $C$ of lineality 0, i.e., such that $(\mathrm{rec}\,C) \cap (-\mathrm{rec}\,C) = \{0\}$, contains no line (and so is sometimes said to be line-free).

Proposition 1.20 A nonempty closed convex set $C$ has an extreme point if and only if it contains no line.

Proof If $C$ has an extreme point $a$ then $a$ is also an extreme point of $a + \mathrm{rec}\,C$, hence $0$ is an extreme point of $\mathrm{rec}\,C$, i.e., $(\mathrm{rec}\,C) \cap (-\mathrm{rec}\,C) = \{0\}$. Conversely, suppose $C$ is line-free and let $F$ be a face of smallest dimension of $C$. Then $F$ must also be line-free, and if $F$ contained two distinct points $x, y$ then the intersection of $C$ with the line through $x, y$ would have an endpoint $a$. Of course $F_a$ cannot contain $x, y$; hence by Corollary 1.11, $\dim F_a < \dim F$, a contradiction. Therefore, $F$ consists of a single point, which is an extreme point of $C$. □

If a nonempty convex set $C$ has a nontrivial lineality space $L$ then for any $x \in C$ we have $x = u + v$, with $u \in L$, $v \in L^\perp$, where $L^\perp$ is the orthogonal complement of $L$. Clearly $v = x - u \in C - L = C + L \subset C$, hence $C$ can be decomposed into the following sum:
$$C = L + (C \cap L^\perp), \qquad (1.19)$$
in which the convex set $C \cap L^\perp$ contains no line, hence has an extreme point, by the above proposition.

We are now in a position to state the fundamental representation theorem for convex sets. In view of the above decomposition (1.19) it suffices to consider closed convex sets $C$ with lineality zero. Denote the set of extreme points of $C$ by $V(C)$ and the set of extreme directions of $C$ by $U(C)$.

Theorem 1.7 Let $C \subset \mathbb{R}^n$ be a closed convex set containing no line. Then
$$C = \mathrm{conv}\,V(C) + \mathrm{cone}\,U(C). \qquad (1.20)$$
In other words, any $x \in C$ can be represented in the form
$$x = \sum_{i\in I} \lambda_i v^i + \sum_{j\in J} \mu_j u^j,$$
with $I, J$ finite index sets, $v^i \in V(C)$, $u^j \in U(C)$, and $\lambda_i \ge 0$, $\mu_j \ge 0$, $\sum_i \lambda_i = 1$.

Proof The theorem is obvious if $\dim C = 0$. Arguing by induction, assume that the theorem is true for $\dim C \le n-1$ and consider the case when $\dim C = n$. Let $x$ be an arbitrary point of $C$ and $\Delta$ be any line containing $x$. Since $C$ is line-free, the set $\Delta \cap C$ is either a closed halfline or a closed line segment. In the former case, let $\Delta \cap C = \{a + \theta u \mid \theta \ge 0\}$. Of course $a \in \partial C$, and there is a supporting hyperplane to $C$ at $a$ by Theorem 1.5. The intersection $F$ of this hyperplane with $C$ is a face of $C$ of dimension less than $n$. By the induction hypothesis,
$$a \in \mathrm{cone}\,U(F) + \mathrm{conv}\,V(F) \subset \mathrm{cone}\,U(C) + \mathrm{conv}\,V(C).$$
Since $u \in \mathrm{cone}\,U(C)$ and $x = a + \theta u$ for some $\theta \ge 0$, it follows that
$$x = a + \theta u \in (\mathrm{cone}\,U(C) + \mathrm{conv}\,V(C)) + \mathrm{cone}\,U(C) \subset \mathrm{cone}\,U(C) + \mathrm{conv}\,V(C).$$
In the case when $\Delta \cap C$ is a closed line segment $[a,b]$, both $a, b$ belong to $\partial C$, and by the same argument as above each of these points belongs to $\mathrm{cone}\,U(C) + \mathrm{conv}\,V(C)$. Since this set is convex and $x \in [a,b]$, $x$ belongs to this set too. □
An immediate consequence of Theorem 1.7 is the following important Minkowski's Theorem, which corresponds to the special case $\mathrm{cone}\,U(C) = \{0\}$:

Corollary 1.13 A nonempty compact convex set $C$ is equal to the convex hull of its extreme points. □

In view of Carathéodory's Theorem one can state more precisely: every point of a nonempty compact convex set $C$ of dimension $k$ is a convex combination of $k+1$ or fewer extreme points of $C$.

Corollary 1.14 A nonempty compact convex set $C$ has at least one extreme point, and every supporting hyperplane $H$ to $C$ contains at least one extreme point of $C$.

Proof The set $C \cap H$ is nonempty, compact, convex, hence has an extreme point. Since by Proposition 1.16 this set is a face of $C$, any extreme point of it is also an extreme point of $C$. □
1.8 Polars of Convex Sets

By Theorem 1.6 a closed convex set which is neither empty nor the whole space is fully determined by the set of its supporting hyperplanes. It turns out that a duality correspondence can be established between convex sets and their sets of supporting hyperplanes.

For any set $E$ in $\mathbb{R}^n$ the set
$$E^\circ = \{y \in \mathbb{R}^n \mid \langle y, x\rangle \le 1\ \forall x \in E\}$$
is called the polar of $E$. If $E$ is a closed convex cone, so that $\lambda E \subset E$ for all $\lambda \ge 0$, then the condition $\langle y, x\rangle \le 1\ \forall x \in E$ is equivalent to $\langle y, x\rangle \le 0\ \forall x \in E$. Therefore, the polar of a cone $M$ is the cone
$$M^\circ = \{y \in \mathbb{R}^n \mid \langle y, x\rangle \le 0\ \forall x \in M\}.$$
The polar of a subspace is the same as its orthogonal complement.


Proposition 1.21
(i) The polar $E^\circ$ of any set $E$ is a closed convex set containing the origin; if $E \subset F$ then $F^\circ \subset E^\circ$.
(ii) $E \subset E^{\circ\circ}$; if $E$ is closed, convex, and contains the origin then $E^{\circ\circ} = E$.
(iii) $0 \in \mathrm{int}\,E$ if and only if $E^\circ$ is bounded.

Proof
(i) Straightforward.
(ii) If $x \in E$ then $\langle x, y\rangle \le 1$ $\forall y \in E^\circ$, hence $x \in E^{\circ\circ}$. Suppose $E$ is closed, convex, and contains $0$, but there is a point $u \in E^{\circ\circ} \setminus E$. Then by the Separation Theorem 1.4 there exists a vector $y$ such that $\langle y, x\rangle \le 1$ $\forall x \in E$, while $\langle y, u\rangle > 1$. The former inequality (for all $x \in E$) means that $y \in E^\circ$; the latter inequality then implies that $u \notin E^{\circ\circ}$, a contradiction.
(iii) First note that the polar of the ball $B_r$ of radius $r$ around $0$ is the ball $B_{1/r}$. Indeed, the distance from $0$ to a hyperplane $\langle y, x\rangle = 1$ is $\frac{1}{\|y\|}$. Therefore, if $y \in (B_r)^\circ$, i.e., $\langle y, x\rangle \le 1$ $\forall x \in B_r$, then $\frac{1}{\|y\|} \ge r$, hence $\|y\| \le 1/r$, i.e., $y \in B_{1/r}$, and conversely. This proves that $(B_r)^\circ = B_{1/r}$. Now, $0 \in \mathrm{int}\,E$ if and only if $B_r \subset E$ for some $r > 0$, hence if and only if $E^\circ \subset (B_r)^\circ = B_{1/r}$, i.e., $E^\circ$ is bounded. □
Proposition 1.22 If $M_1, M_2 \subset \mathbb{R}^n$ are closed convex cones then
$$(M_1 + M_2)^\circ = M_1^\circ \cap M_2^\circ; \quad (M_1 \cap M_2)^\circ = M_1^\circ + M_2^\circ.$$

Proof If $y \in (M_1 + M_2)^\circ$ then for any $x \in M_1$, writing $x = x + 0 \in M_1 + M_2$, we have $\langle x, y\rangle \le 0$, hence $y \in M_1^\circ$; analogously, $y \in M_2^\circ$. Therefore $(M_1 + M_2)^\circ \subset M_1^\circ \cap M_2^\circ$. Conversely, if $y \in M_1^\circ \cap M_2^\circ$ then for all $x^1 \in M_1$, $x^2 \in M_2$ we have $\langle x^1, y\rangle \le 0$, $\langle x^2, y\rangle \le 0$, hence $\langle x^1 + x^2, y\rangle \le 0$, implying that $y \in (M_1 + M_2)^\circ$. Therefore, the first equality holds. The second equality follows, because by (ii) of the previous proposition,
$$M_1^\circ + M_2^\circ = ((M_1^\circ + M_2^\circ)^\circ)^\circ = ((M_1^\circ)^\circ \cap (M_2^\circ)^\circ)^\circ = (M_1 \cap M_2)^\circ. \qquad \Box$$
Given a set $C \subset \mathbb{R}^n$, the function $x \in \mathbb{R}^n \mapsto s_C(x) = \sup_{y\in C} \langle x, y\rangle$ is called the support function of $C$.

Proposition 1.23 Let $C$ be a closed convex set containing $0$.
(i) The gauge of $C$ is the support function of $C^\circ$.
(ii) The recession cone of $C$ and the closure of the cone generated by $C^\circ$ are polar to each other.
(iii) The lineality space of $C$ and the subspace generated by $C^\circ$ are orthogonally complementary to each other.
Dually, also, with $C$ and $C^\circ$ interchanged.

Proof
(i) Since $(C^\circ)^\circ = C$ by Proposition 1.21, we can write
$$p_C(x) = \inf\{\lambda > 0 \mid \tfrac{x}{\lambda} \in C\} = \inf\{\lambda > 0 \mid \langle\tfrac{x}{\lambda}, y\rangle \le 1\ \forall y \in C^\circ\} = \inf\{\lambda > 0 \mid \langle x, y\rangle \le \lambda\ \forall y \in C^\circ\} = \sup_{y\in C^\circ} \langle x, y\rangle.$$
(ii) Since $0 \in C$, the recession cone of $C$ is the largest closed convex cone contained in $C$. Its polar must then be the smallest closed convex cone containing $C^\circ$, i.e., the closure of the convex cone generated by $C^\circ$.
(iii) Similarly, the lineality space of $C$ is the largest subspace contained in $C$. Therefore, its orthogonal complement is the smallest subspace containing $C^\circ$. □
Corollary 1.15 For any closed convex set $C \subset \mathbb{R}^n$ containing $0$ we have
$$\dim C^\circ = n - \mathrm{lineality}\,C, \qquad \mathrm{lineality}\,C^\circ = n - \dim C.$$

Proof The latter relation follows from (iii) of the previous proposition, while the former is obtained from the latter by noting that $(C^\circ)^\circ = C$. □

The number $\dim C - \mathrm{lineality}\,C$, which can be considered as a measure of the nonlinearity of a convex set $C$, is called the rank of $C$. From (1.19) we have $\mathrm{rank}\,C = \dim(C \cap L^\perp)$, where $L$ is the lineality space of $C$. The previous corollary also shows that for any closed convex set $C$:
$$\mathrm{rank}\,C = \mathrm{rank}\,C^\circ.$$
1.9 Polyhedral Convex Sets

A convex set is said to be polyhedral if it is the intersection of a finite family of closed halfspaces. A polyhedral convex set is also called a polyhedron. In other words, a polyhedron is the solution set of a finite system of linear inequalities of the form
$$\langle a^i, x\rangle \le b_i, \quad i = 1,\dots,m, \qquad (1.21)$$
or in matrix form:
$$Ax \le b, \qquad (1.22)$$
where $A$ is the $m \times n$ matrix with rows $a^i$ and $b \in \mathbb{R}^m$. Since a linear equality can be expressed as two linear inequalities, a polyhedron is also the solution set of a system of linear equalities and inequalities of the form
$$\langle a^i, x\rangle = b_i,\ i = 1,\dots,m_1; \qquad \langle a^i, x\rangle \le b_i,\ i = m_1+1,\dots,m.$$
The rank of a system of linear inequalities of the form (1.22) is by definition the rank of the matrix $A$. When this rank is equal to the number of inequalities, we say that the inequalities are linearly independent.
Proposition 1.24 A polyhedron $D \neq \emptyset$ defined by the system (1.22) has dimension $r$ if and only if the subsystem of (1.22) formed by the inequalities which are satisfied as equalities by all points of $D$ has rank $n - r$.

Proof Let $I_0 = \{i \mid \langle a^i, x\rangle = b_i\ \forall x \in D\}$ and $M = \{x \mid \langle a^i, x\rangle = b_i,\ i \in I_0\}$. Since $\dim M = n - \mathrm{rank}\{a^i \mid i \in I_0\}$ and $D \subset M$, it suffices to show that $M \subset \mathrm{aff}\,D$, i.e., $M = \mathrm{aff}\,D$. We can assume that $I_1 := \{1,\dots,m\} \setminus I_0 \neq \emptyset$, for otherwise $D = M$. For each $i \in I_1$ take $x^i \in D$ such that $\langle a^i, x^i\rangle < b_i$ and let $x^0 = \frac{1}{q}\sum_{i\in I_1} x^i$, where $q = |I_1|$. Then clearly
$$x^0 \in M, \quad \langle a^i, x^0\rangle < b_i \quad \forall i \in I_1. \qquad (1.23)$$
Consider now any $x \in M$. Using (1.23) it is easy to see that for $\varepsilon > 0$ sufficiently small, we have
$$x^0 + \varepsilon(x - x^0) \in M, \quad \langle a^i, x^0 + \varepsilon(x - x^0)\rangle \le b_i \quad \forall i \in I_1,$$
hence $y := x^0 + \varepsilon(x - x^0) \in D$. Thus, both points $x^0$ and $y$ belong to $D$. Hence the line through $x^0$ and $y$ lies in $\mathrm{aff}\,D$, and in particular, $x \in \mathrm{aff}\,D$. Since $x$ is an arbitrary point of $M$, this means that $M \subset \mathrm{aff}\,D$, and hence $M = \mathrm{aff}\,D$. □
Corollary 1.16 A polyhedron (1.22) is full-dimensional if and only if there is $x^0$ satisfying $\langle a^i, x^0\rangle < b_i$ $\forall i = 1,\dots,m$.

Proof This corresponds to the case $I_0 = \emptyset$, i.e., $I_1 = \{1,\dots,m\}$. Note that if for every $i \in I_1$ there exists $x^i \in D$ satisfying $\langle a^i, x^i\rangle < b_i$ then any $x^0 \in \mathrm{ri}\,D$ (or $x^0$ constructed as in the above proof) satisfies $\langle a^i, x^0\rangle < b_i$ $\forall i \in I_1$. □

1.9.1 Facets of a Polyhedron

As previously, denote by $D$ the polyhedron (1.22) and let
$$I_0 = \{i \mid \langle a^i, x\rangle = b_i\ \forall x \in D\}.$$

Theorem 1.8 A nonempty subset $F$ of $D$ is a face of $D$ if and only if
$$F = \{x \mid \langle a^i, x\rangle = b_i,\ i \in I;\ \langle a^i, x\rangle \le b_i,\ i \notin I\}, \qquad (1.24)$$
for some index set $I$ such that $I_0 \subset I \subset \{1,\dots,m\}$.

Proof
(i) Any set $F$ of the form (1.24) is a face. Indeed, $F$ is obviously convex. Suppose there is a line segment $[y,z] \subset D$ with a relative interior point $x$ in $F$. We have $z = (1-\lambda)x + \lambda y$ for some $\lambda < 0$, so if $\langle a^i, y\rangle < b_i$ for some $i \in I$ then
$$\langle a^i, z\rangle = (1-\lambda)\langle a^i, x\rangle + \lambda\langle a^i, y\rangle > (1-\lambda)b_i + \lambda b_i = b_i,$$
a contradiction. Therefore, $\langle a^i, y\rangle = b_i$ $\forall i \in I$, and analogously, $\langle a^i, z\rangle = b_i$ $\forall i \in I$. Hence $[y,z] \subset F$, proving that $F$ is a face.
(ii) Any face $F$ has the form (1.24). Indeed, by Proposition 1.19, $F = F_{x^0}$, where $x^0$ is any relative interior point of $F$. Denote by $F'$ the set on the right-hand side of (1.24) for $I = \{i \mid \langle a^i, x^0\rangle = b_i\}$. According to what has just been proved, $F'$ is a face containing $x^0$, so $F \subset F'$. To prove the converse inclusion, consider any $x \in F'$ and let $y = x^0 + \varepsilon(x^0 - x)$. Since
$$\langle a^i, x^0\rangle = \langle a^i, x\rangle = b_i\ \forall i \in I; \qquad \langle a^i, x^0\rangle < b_i\ \forall i \notin I,$$
one has $\langle a^i, y\rangle = b_i$ $\forall i \in I$, $\langle a^i, y\rangle \le b_i$ $\forall i \notin I$, provided $\varepsilon > 0$ is sufficiently small. Then $y \in F'$ and $[x,y] \subset F' \subset D$. Since $x^0 \in F$ and $x^0$ is a relative interior point of $[x,y]$, while $F$ is a face, this implies that $[x,y] \subset F$, and in particular, $x \in F$. Thus, any $x \in F'$ belongs to $F$. Therefore $F = F'$, completing the proof. □

A proper face of maximal dimension of a polyhedron is called a facet. By the above theorem, if $F$ is a facet of $D$ then there is a $k \notin I_0$ such that
$$F = \{x \in D \mid \langle a^k, x\rangle = b_k\}. \qquad (1.25)$$
However, not every face of this form is a facet. An inequality $\langle a^k, x\rangle \le b_k$ is said to be redundant if the removal of this inequality from (1.21) does not affect the polyhedron $D$, i.e., if the system (1.21) is equivalent to
$$\langle a^i, x\rangle \le b_i, \quad i \in \{1,\dots,m\} \setminus \{k\}.$$

Proposition 1.25 If the inequality $\langle a^k, x\rangle \le b_k$ with $k \notin I_0$ is not redundant then (1.25) is a facet.

Proof The non-redundancy of the inequality $\langle a^k, x\rangle \le b_k$ means that there is a $y$ such that
$$\langle a^i, y\rangle = b_i,\ i \in I_0; \quad \langle a^i, y\rangle \le b_i,\ i \notin (I_0 \cup \{k\}); \quad \langle a^k, y\rangle > b_k.$$
Let $x^0$ be a relative interior point of $D$, i.e., such that
$$\langle a^i, x^0\rangle = b_i,\ i \in I_0; \quad \langle a^i, x^0\rangle < b_i,\ i \notin I_0.$$
Then the line segment $[x^0, y]$ meets the hyperplane $\langle a^k, x\rangle = b_k$ at a point $z$ satisfying
$$\langle a^i, z\rangle = b_i,\ i \in (I_0 \cup \{k\}); \quad \langle a^i, z\rangle < b_i,\ i \notin (I_0 \cup \{k\}).$$
This shows that $F$ defined by (1.25) is of dimension $\dim D - 1$, hence is a facet. □

Thus, if a face (1.25) is not a facet, then the inequality $\langle a^k, x\rangle \le b_k$ can be removed without changing the polyhedron. In other words, the set of facet-defining inequalities, together with the system $\langle a^i, x\rangle = b_i$, $i \in I_0$, completely determines the polyhedron.

1.9.2 Vertices and Edges of a Polyhedron

An extreme point (0-dimensional face) of a polyhedron is also called a vertex, and a 1-dimensional face is also called an edge. The following characterizations of vertices and edges are immediate consequences of Theorem 1.8:

Corollary 1.17
(i) A point $x \in D$ is a vertex if and only if it satisfies as equalities $n$ linearly independent inequalities from (1.21).
(ii) A line segment (or halfline, or line) $\Gamma \subset D$ is an edge of $D$ if and only if it is the set of points of $D$ satisfying as equalities $n-1$ linearly independent inequalities from (1.21).

Proof A set $F$ of the form (1.24) is a $k$-dimensional face if and only if the system $\langle a^i, x\rangle = b_i$, $i \in I$, is of rank $n - k$. □
A vertex of a polyhedron $D \subset \mathbb{R}^n$ [defined by (1.21)] is said to be nondegenerate if it satisfies as equalities exactly $n$ inequalities (which must then be linearly independent) from (1.21). Two vertices $x^1, x^0$ are said to be neighboring (or adjacent) if the line segment between them is an edge.

Theorem 1.9 Let $x^0$ be a nondegenerate vertex of a full-dimensional polyhedron $D$ defined by a system (1.21). Then there are exactly $n$ edges of $D$ emanating from $x^0$. If $I$ is the set of indices of inequalities that are satisfied by $x^0$ as equalities, then for each $k \in I$ there is an edge emanating from $x^0$ whose direction $z$ is defined by the system
$$\langle a^k, z\rangle = -1, \quad \langle a^i, z\rangle = 0,\ i \in I \setminus \{k\}. \qquad (1.26)$$

Proof By the preceding corollary a vector $z$ defines an edge emanating from $x^0$ if and only if for all $\theta > 0$ small enough $x^0 + \theta z$ belongs to $D$ and satisfies exactly $n-1$ equalities from the system $\langle a^i, x\rangle = b_i$, $i \in I$. Since $\langle a^i, x^0\rangle < b_i$ $(i \notin I)$, we have $\langle a^i, x^0 + \theta z\rangle \le b_i$ $(i \notin I)$ for all sufficiently small $\theta > 0$. Therefore, it suffices that for some $k \in I$ we have $\langle a^i, x^0 + \theta z\rangle = b_i$ $(i \in I \setminus \{k\})$, $\langle a^k, x^0 + \theta z\rangle < b_k$, i.e., that $z$ (or a positive multiple of $z$) satisfies (1.26). Since the latter system is of rank $n$ it fully determines the direction of $z$. We thus have exactly $n$ edges emanating from $x^0$. □
A bounded polyhedron is called a polytope.

Proposition 1.26 An $r$-dimensional polytope has at least $r+1$ vertices.

Proof A polytope is a compact convex set, so by Corollary 1.13 it is the convex hull of its vertex set $\{x^1, x^2,\dots,x^k\}$. The affine hull of the polytope is then the affine hull of this vertex set, and by Proposition 1.4, if it is of dimension $r$ then the matrix
$$\begin{pmatrix} x^1 & x^2 & \cdots & x^k \\ 1 & 1 & \cdots & 1 \end{pmatrix}$$
is of rank $r+1$, hence $k \ge r+1$. □
A polytope which is the convex hull of $r+1$ affinely independent points $x^1, x^2,\dots,x^{r+1}$ is called an $r$-simplex and is denoted by $[x^1, x^2,\dots,x^{r+1}]$. By Proposition 1.17 any vertex of such a polytope must be one of the points $x^1, x^2,\dots,x^{r+1}$, and since there must be at least $r+1$ vertices, the vertices are precisely $x^1, x^2,\dots,x^{r+1}$.

If $D$ is an $r$-simplex then any set of $k \le r+1$ vertices of $D$ determines a $(k-1)$-simplex which is a $(k-1)$-dimensional face of $D$. Conversely, any $(k-1)$-dimensional face of $D$ is of this form.
Proposition 1.27 The recession cone of a polyhedron (1.22) is the cone $M := \{x \mid Ax \le 0\}$.
Proof By definition $y \in \mathrm{rec}\,D$ if and only if for any $x \in D$ and $\lambda > 0$ we have $x + \lambda y \in D$, i.e., $A(x + \lambda y) = Ax + \lambda Ay \le b$, i.e., $Ay \le 0$. Hence the conclusion. □

Thus, the extreme directions of a polyhedron (1.22) are the same as the extreme directions of the cone $Ax \le 0$.

Corollary 1.18 A nonempty polyhedron (1.22) has at least one vertex if and only if $\mathrm{rank}\,A = n$. It is a polytope if and only if the cone $Ax \le 0$ is the singleton $\{0\}$.

Proof By Proposition 1.20, $D$ has a vertex if and only if it contains no line, i.e., the recession cone $Ax \le 0$ contains no line, which in turn is equivalent to saying that $\mathrm{rank}\,A = n$. The second assertion follows from Corollary 1.8 and the preceding proposition. □

1.9.3 Polar of a Polyhedron

Let $x^0$ be a solution of the system (1.21), so that $Ax^0 \le b$. Then $A(x - x^0) \le b - Ax^0$, with $b - Ax^0 \ge 0$. Therefore, by shifting the origin to $x^0$ and dividing, the system (1.21) can be put into the form
$$\langle a^i, x\rangle \le 1,\ i = 1,\dots,p; \qquad \langle a^i, x\rangle \le 0,\ i = p+1,\dots,m. \qquad (1.27)$$

Proposition 1.28 Let $P$ be the polyhedron defined by the system (1.27). Then the polar of $P$ is the polyhedron
$$Q = \mathrm{conv}\{0, a^1,\dots,a^p\} + \mathrm{cone}\{a^{p+1},\dots,a^m\}, \qquad (1.28)$$
and conversely, the polar of $Q$ is the polyhedron $P$.

Proof If $y \in Q$, i.e.,
$$y = \sum_{i=1}^m \lambda_i a^i, \quad \text{with } \sum_{i=1}^p \lambda_i \le 1,\ \lambda_i \ge 0,\ i = 1,\dots,m,$$
then
$$\langle y, x\rangle = \sum_{i=1}^m \lambda_i \langle a^i, x\rangle \le 1 \quad \forall x \in P,$$
hence $y \in P^\circ$ (the polar of $P$). Therefore, $Q \subset P^\circ$. To prove the converse inclusion, observe that for any $y \in Q^\circ$ we must have $\langle y, x\rangle \le 1$ $\forall x \in Q$. In particular, for $i \le p$, taking $x = a^i$ yields $\langle a^i, y\rangle \le 1$, $i = 1,\dots,p$; while for $i \ge p+1$, taking $x = \lambda a^i$ with arbitrarily large $\lambda > 0$ yields $\langle a^i, y\rangle \le 0$, $i = p+1,\dots,m$. Thus, $Q^\circ \subset P$ and hence, by Proposition 1.21, $P^\circ \subset (Q^\circ)^\circ = Q$. This proves that $Q = P^\circ$. The equality $Q^\circ = P^{\circ\circ} = P$ then follows. □
Note the presence of the vector $0$ in (1.28). For instance, if $P = \{x \in \mathbb{R}^2 \mid x_1 \le 1,\ x_2 \le 1\}$ then its polar is $P^\circ = \mathrm{conv}\{0, e^1, e^2\}$ and not $\mathrm{conv}\{e^1, e^2\}$. However, if $P$ is bounded then $0 \in \mathrm{int}\,Q$ (Proposition 1.21), and $0$ can be omitted in (1.28). As an example, it is easily verified that the polar of the unit $l_\infty$-ball $\{x \mid \max_i |x_i| \le 1\} = \{x \mid -1 \le x_i \le 1,\ i = 1,\dots,n\}$ is the set $\mathrm{conv}\{e^i, -e^i,\ i = 1,\dots,n\}$, i.e., the unit $l_1$-ball $\{x \mid \sum_{i=1}^n |x_i| \le 1\}$. This fact is often expressed by saying that the $l_\infty$-norm and the $l_1$-norm are dual to each other.
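This duality is easy to check numerically: the support function of the unit ball of one norm is the other norm. A quick spot check (my illustration, assuming NumPy):

import numpy as np

rng = np.random.default_rng(2)
for _ in range(5):
    y = rng.normal(size=4)
    # sup{ <y,x> : ||x||_inf <= 1 } is attained at x = sign(y), equals ||y||_1
    assert np.isclose(np.sign(y) @ y, np.abs(y).sum())
    # sup{ <y,x> : ||x||_1 <= 1 } is attained at a signed unit coordinate
    # vector and equals ||y||_inf
    assert np.isclose(np.max(np.abs(y)), np.linalg.norm(y, ord=np.inf))
print("l_inf and l_1 are dual norms (spot-checked)")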
Corollary 1.19 If two full-dimensional polyhedrons $P, Q$ containing the origin are polar to each other then there is a 1-1 correspondence between the set of facets of $P$ not containing the origin and the set of nonzero vertices of $Q$; likewise, there is a 1-1 correspondence between the set of facets of $P$ containing the origin and the set of extreme directions of $Q$.

Proof One can suppose that $P, Q$ are defined as in Proposition 1.28. Then an inequality $\langle a^k, x\rangle \le 1$, $k \in \{1,\dots,p\}$, is redundant in (1.27) if and only if the vector $a^k$ can be removed from (1.28) without changing $Q$. Therefore, the equality $\langle a^k, x\rangle = 1$ defines a facet of $P$ not containing $0$ if and only if $a^k$ is a vertex of $Q$. Likewise for the facets passing through $0$ (Fig. 1.4). □

Fig. 1.4 Polyhedrons polar to each other
1.9.4 Representation of Polyhedrons

From Theorem 1.8 it follows that a polyhedron has finitely many faces, in particular finitely many extreme points and extreme directions. It turns out that this property completely characterizes polyhedrons in the class of closed convex sets.

Theorem 1.10 For every polyhedron $D$ there exist two finite sets $V = \{v^i,\ i \in I\}$ and $U = \{u^j,\ j \in J\}$ such that
$$D = \mathrm{conv}\,V + \mathrm{cone}\,U. \qquad (1.29)$$
In other words, $D$ consists of points of the form
$$x = \sum_{i\in I} \lambda_i v^i + \sum_{j\in J} \mu_j u^j, \quad \text{with } \sum_{i\in I} \lambda_i = 1,\ \lambda_i \ge 0,\ \mu_j \ge 0,\ i \in I,\ j \in J.$$

Proof If $L$ is the lineality space of $D$ then, according to (1.19):
$$D = L + C, \qquad (1.30)$$
where $C := D \cap L^\perp$ is a polyhedron containing no line. By Theorem 1.7,
$$C = \mathrm{conv}\,V + \mathrm{cone}\,U_1, \qquad (1.31)$$
where $V$ is the set of vertices and $U_1$ the set of extreme directions of $C$. Now let $U_0$ be a basis of $L$, so that $L = \mathrm{cone}\{U_0 \cup (-U_0)\}$. From (1.30) and (1.31) we conclude
$$D = \mathrm{conv}\,V + \mathrm{cone}\,U_1 + \mathrm{cone}\{U_0 \cup (-U_0)\},$$
whence (1.29) by setting $U = U_0 \cup (-U_0) \cup U_1$. □
Lemma 1.4 The convex hull of a finite set of halflines emanating from the origin is a polyhedral convex cone. In other words, given any finite set of vectors $U \subset \mathbb{R}^n$, there exists an $m \times n$ matrix $A$ such that
$$\mathrm{cone}\,U = \{x \mid Ax \le 0\}. \qquad (1.32)$$

Proof Let $U = \{u^j,\ j \in J\}$, $M = \mathrm{cone}\{u^j,\ j \in J\}$. By Proposition 1.28, $M^\circ = \{x \mid \langle u^j, x\rangle \le 0,\ j \in J\}$, and by Theorem 1.10 there exist $a^1,\dots,a^m$ such that $M^\circ = \{x = \sum_{i=1}^m \lambda_i a^i \mid \lambda_i \ge 0,\ i = 1,\dots,m\}$. Finally, again by Proposition 1.28, $(M^\circ)^\circ = \{x \mid \langle a^i, x\rangle \le 0,\ i = 1,\dots,m\}$, which yields the desired representation by noting that $M^{\circ\circ} = M$. □
Theorem 1.11 Given any two finite sets $V$ and $U$ in $\mathbb{R}^n$, there exist an $m \times n$ matrix $A$ and a vector $b \in \mathbb{R}^m$ such that
$$\mathrm{conv}\,V + \mathrm{cone}\,U = \{x \mid Ax \le b\}.$$

Proof Let $V = \{v^i,\ i \in I\}$, $U = \{u^j,\ j \in J\}$. We can assume $V \neq \emptyset$, since the case $V = \emptyset$ has just been treated. In the space $\mathbb{R}^{n+1} = \mathbb{R}^n \times \mathbb{R}$, consider the set
$$K = \Big\{(x,t) = \sum_{i\in I} \lambda_i (v^i, 1) + \sum_{j\in J} \mu_j (u^j, 0)\ \Big|\ \lambda_i \ge 0,\ \mu_j \ge 0\Big\}.$$
This set is contained in the halfspace $\{(x,t) \in \mathbb{R}^n \times \mathbb{R} \mid t \ge 0\}$ of $\mathbb{R}^{n+1}$, and if we set $D = \mathrm{conv}\,V + \mathrm{cone}\,U$ then: $x \in D \Leftrightarrow (x,1) \in K$. By Lemma 1.4, $K$ is a convex polyhedral cone, i.e., there exist $(A,b)$ such that
$$K = \{(x,t) \mid Ax - tb \le 0\}.$$
Thus,
$$D = \{x \mid Ax \le b\},$$
completing the proof. □
Corollary 1.20 A closed convex set $D$ which has only finitely many faces is a polyhedron.

Proof Indeed, in view of (1.19), we can assume that $D$ contains no line. Then $D$ has a finite set $V$ of extreme points and a finite set $U$ of extreme directions, so by Theorem 1.7 it can be represented in the form $\mathrm{conv}\,V + \mathrm{cone}\,U$, and hence is a polyhedron by the above theorem. □

1.10 Systems of Convex Sets

Given a system of closed convex sets we would like to know whether they have a nonempty intersection.

Lemma 1.5 If $m$ closed convex proper sets $C_1,\dots,C_m$ of $\mathbb{R}^n$ have no common point in a compact convex set $C$ then there exist closed halfspaces $H_i \supset C_i$ $(i = 1,\dots,m)$ with no common point in $C$.

Proof The closed convex set $C_m$ does not intersect the compact set $C^0 = C \cap (\cap_{i=1}^{m-1} C_i)$, so by Theorem 1.4 there exists a closed halfspace $H_m \supset C_m$ disjoint from $C^0$ (i.e., such that $C_1,\dots,C_{m-1},H_m$ have no common point in $C$). Repeating this argument for the collection $C_1,\dots,C_{m-1},H_m$, we find a closed halfspace $H_{m-1} \supset C_{m-1}$ such that $C_1,\dots,C_{m-2},H_{m-1},H_m$ have no common point in $C$, etc. Eventually we obtain the desired closed halfspaces $H_1,\dots,H_m$. □
Theorem 1.12 If the closed convex sets $C_1,\dots,C_m$ in $\mathbb{R}^n$ have no common point in a convex set $C$ but any $m-1$ sets of the collection have a common point in $C$, then there exists a point of $C$ not belonging to any $C_i$, $i = 1,\dots,m$.

Proof First, it can be assumed that the set $C$ is compact. In fact, if it were not so, we could pick, for each $i = 1,\dots,m$, a point $a^i$ common to $C$ and the sets $C_j$, $j \neq i$. Then the set $C' = \mathrm{conv}\{a^1,\dots,a^m\}$ is compact and the sets $C_1,\dots,C_m$ have no common point in $C'$ (because $C' \subset C$); moreover, any $m-1$ sets among these have a common point in $C'$ (which is one of the points $a^1,\dots,a^m$). Consequently, without any harm we can replace $C$ with $C'$.
Second, each $C_i$, $i = 1,\dots,m$, must be a proper subset of $\mathbb{R}^n$, because if some $C_{i_0} = \mathbb{R}^n$ then a common point $a^{i_0}$ of the $C_i$, $i \neq i_0$, in $C$ would be a common point of $C_1,\dots,C_m$ in $C$. Therefore, by Lemma 1.5 there exist closed halfspaces $H_i \supset C_i$, $i = 1,\dots,m$, such that $H_1,\dots,H_m$ have no common point in $C$. Let $H_i = \{x \mid h_i(x) \le 0\}$, where $h_i(x)$ is an affine function. We contend that for every set $I \subset \{1,\dots,m\}$ and every $q \notin I$ there exists $x^{qI} \in C$ satisfying
$$h_i(x^{qI}) \begin{cases} > 0 & \text{if } i = q & (1) \\ = 0 & \text{if } i \in I & (2) \\ \le 0 & \text{if } i \notin I \cup \{q\} & (3) \end{cases}$$
In fact, for $I = \emptyset$ this is clear: since $H_i \supset C_i$, $i = 1,\dots,m$, any $m-1$ sets among the $H_i$, $i = 1,\dots,m$, have a common point in $C$, and at such a point $h_q > 0$ must hold for the remaining index $q$, because all $m$ halfspaces have no common point in $C$. Supposing this is true for $|I| = s$, consider the case $|I| = s+1$. For $p \in I$, $J = I \setminus \{p\}$, by the induction hypothesis there exist $x^{pJ}$ and $x^{qJ}$. The convex set $\{x \in C \mid h_i(x) = 0\ (i \in J),\ h_i(x) \le 0\ (i \notin J \cup \{p,q\})\}$ contains the two points $x^{pJ}$ and $x^{qJ}$ satisfying $h_p(x^{pJ}) > 0$ and $h_p(x^{qJ}) \le 0$, hence it must contain a point $x^{qI}$ satisfying $h_p(x^{qI}) = 0$. Clearly $x^{qI}$ satisfies (2) and (3), and hence also (1), since the $H_i$ have no common point in $C$.

For each $q$ denote by $x^q$ the point $x^{qI}$ associated with $I = \{1,\dots,m\} \setminus \{q\}$. It is easily seen that the point $x^* := \frac{1}{m}\sum_{q=1}^m x^q \in C$ satisfies, for all $i$ (the $h_i$ being affine), $h_i(x^*) = \frac{1}{m}\sum_{q=1}^m h_i(x^q) = \frac{1}{m}h_i(x^i) > 0$. This means $x^*$ does not belong to any $H_i$, hence neither to any $C_i$. □
u
Corollary 1.21 Let $C_1, C_2,\dots,C_m$ be closed convex sets whose union is a convex set $C$. If any $m-1$ sets of the collection have a nonempty intersection, then the entire collection has a nonempty intersection.

Proof If the entire collection had an empty intersection then by Theorem 1.12 there would exist a point of $C$ not belonging to any $C_i$, a contradiction. □
Theorem 1.13 (Helly) Let $C_1,\dots,C_m$ be $m > n+1$ closed convex subsets of $\mathbb{R}^n$. If any $n+1$ sets of the collection have a nonempty intersection then the entire collection has a nonempty intersection.

Proof Suppose $m = n+2$, so that $m-1 = n+1$. For each $i$ take an $a^i \in \cap_{j\neq i} C_j$. Since $m-1 > n$, the vectors $a^i - a^1$, $i = 2,\dots,m$, must be linearly dependent, i.e., there exist numbers $\lambda_i$, $i = 2,\dots,m$, not all zero, such that $\sum_{i=2}^m \lambda_i(a^i - a^1) = 0$. Setting $\lambda_1 = -\sum_{i=2}^m \lambda_i$ yields $\sum_{i=1}^m \lambda_i a^i = 0$. With $J = \{i \mid \lambda_i \ge 0\}$, $\lambda = \sum_{i\in J} \lambda_i = -\sum_{i\notin J} \lambda_i > 0$, the latter equality can be written as
$$\frac{1}{\lambda}\sum_{i\in J} \lambda_i a^i = \frac{1}{\lambda}\sum_{i\notin J} (-\lambda_i) a^i.$$
For $i \in J$ we have $a^i \in \cap_{j\notin J} C_j$, so $x = \frac{1}{\lambda}\sum_{i\in J} \lambda_i a^i \in \cap_{j\notin J} C_j$; for $i \notin J$ we have $a^i \in \cap_{j\in J} C_j$, so $x = \frac{1}{\lambda}\sum_{i\notin J} (-\lambda_i) a^i \in \cap_{j\in J} C_j$. Therefore, $x \in \cap_{j=1}^m C_j$, proving the theorem for $m = n+2$. The general case follows by induction on $m$. □
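In dimension $n = 1$ Helly's theorem reduces to a familiar fact about intervals: if closed intervals intersect pairwise, they all share a common point. A quick numerical illustration (mine, assuming NumPy):

import numpy as np

rng = np.random.default_rng(3)
lo = rng.uniform(0.0, 1.0, size=8)
hi = lo + rng.uniform(0.5, 1.0, size=8)       # eight wide closed intervals

pairwise = all(max(lo[i], lo[j]) <= min(hi[i], hi[j])
               for i in range(8) for j in range(i))
common = lo.max() <= hi.min()                 # a common point exists
assert pairwise == common                     # Helly with n = 1, n + 1 = 2
print("pairwise overlap:", pairwise, "-> common point:", common)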

1.11 Exercises

1 Show that any affine set in $\mathbb{R}^n$ is the intersection of a finite collection of hyperplanes.

2 Let $A$ be an $m \times n$ matrix. The kernel (null-space) of $A$ is by definition the subspace $L = \{x \in \mathbb{R}^n \mid Ax = 0\}$. The orthogonal complement of $L$ is the set $L^\perp = \{x \in \mathbb{R}^n \mid \langle x, y\rangle = 0\ \forall y \in L\}$. Show that $L^\perp$ is a subspace and that $L^\perp = \{x \in \mathbb{R}^n \mid x = A^T u,\ u \in \mathbb{R}^m\}$.

3 Let $M$ be a convex cone containing $0$. Show that the affine hull of $M$ is the set $M - M = \{x - y \mid x \in M,\ y \in M\}$ and that the largest subspace contained in $M$ is the set $M \cap (-M)$.

4 Let $D$ be a nonempty polyhedron defined by a system of linear inequalities $Ax \le b$ where $\mathrm{rank}\,A = r < n$. Show that the lineality space of $D$ is the subspace $E = \{x \mid Ax = 0\}$, of dimension $n-r$, and that $D = E + D'$ where $D'$ is a polyhedron having at least one extreme point.

5 Let $\varphi: \mathbb{R}^n \to \mathbb{R}^m$ be a linear mapping. If $D$ is a (bounded) polyhedron in $\mathbb{R}^n$ then $\varphi(D)$ is a (bounded) polyhedron in $\mathbb{R}^m$. (Use Theorems 1.10 and 1.11.)

6 If $C$ and $D$ are polyhedrons in $\mathbb{R}^n$ then $C + D$ is also a polyhedron. ($C + D$ is the image of $E = \{(x,y) \mid x \in C,\ y \in D\}$ under the linear mapping $(x,y) \mapsto x + y$ from $\mathbb{R}^{2n}$ to $\mathbb{R}^n$.)

7 Let $\{C_i,\ i \in I\}$ be an arbitrary collection of nonempty convex sets in $\mathbb{R}^n$. Then a vector $x \in \mathbb{R}^n$ belongs to the convex hull of the union of the collection if and only if $x$ is a convex combination of a finite number of vectors $y^i$, $i \in J$, such that $y^i \in C_i$ and $J \subset I$.

8 Let $C_1, C_2$ be two disjoint convex sets in $\mathbb{R}^n$, and $a \in \mathbb{R}^n$. Then at least one of the two sets $C_2 \cap \mathrm{conv}(C_1 \cup \{a\})$, $C_1 \cap \mathrm{conv}(C_2 \cup \{a\})$ is empty. (Argue by contradiction.)

9 Let $C_1,\dots,C_m$ be open convex proper subsets of $\mathbb{R}^n$ with no common point in a convex proper subset $D$ of $\mathbb{R}^n$. Then there exist open halfspaces $L_i \supset C_i$, $i = 1,\dots,m$, and a closed halfspace $H \supset D$ such that $L_1,\dots,L_m$ have no common point in $H$.

10 Let $\{a^1,\dots,a^m\}$ be a set of $m$ vectors in $\mathbb{R}^n$. Show that the convex hull of this set contains $0$ in its interior if and only if either of the following conditions holds:
1. For any nonzero vector $x \in \mathbb{R}^n$ one has $\max\{\langle a^i, x\rangle \mid i = 1,\dots,m\} > 0$.
2. Any polyhedron $\langle a^i, x\rangle \le \alpha_i$, $i = 1,\dots,m$, is bounded.

11 Let $C$ be a closed convex set and let $y^0 \notin C$. If $x^0 \in C$ is the point of $C$ nearest to $y^0$, i.e., such that $\|x^0 - y^0\| = \min_{x\in C} \|x - y^0\|$, then $\|x - x^0\| < \|x - y^0\|$ $\forall x \in C$. (Hint: from the proof of Proposition 1.15 it can be seen that $x^0$ is characterized by the condition $y^0 - x^0 \in N_C(x^0)$, i.e., $\langle y^0 - x^0, x - x^0\rangle \le 0$ $\forall x \in C$.)

12 (Fejér) For any set $E$ in $\mathbb{R}^n$, a point $y^0$ belongs to the closure of $\mathrm{conv}\,E$ if and only if for every $y \in \mathbb{R}^n$ there is $x \in E$ such that $\|x - y^0\| \le \|x - y\|$. (Use Exercise 11 to prove the "only if" part.)

13 The radius of a compact set $S$ is defined to be the number $\mathrm{rad}(S) = \min_x \max_{y\in S} \|x - y\|$. Show that for any $x \in \mathrm{conv}\,S$ there exists $y \in S$ such that $\|x - y\| \le \mathrm{rad}(S)$.

14 Show by a counter-example that Theorem 1.7 (Sect. 1.7) is not true, even for polyhedrons, if the condition that $C$ be line-free is omitted.

15 Show by a counter-example in $\mathbb{R}^3$ that Corollary 1.19 is not true if $P$ is not full-dimensional.
Chapter 2
Convex Functions

2.1 Definition and Basic Properties

Given a function $f: S \to [-\infty, +\infty]$ on a nonempty set $S \subset \mathbb{R}^n$, the sets
$$\mathrm{dom}\,f = \{x \in S \mid f(x) < +\infty\}$$
$$\mathrm{epi}\,f = \{(x,\alpha) \in S \times \mathbb{R} \mid f(x) \le \alpha\}$$
are called the effective domain and the epigraph of $f(x)$, respectively. If $\mathrm{dom}\,f \neq \emptyset$ and $f(x) > -\infty$ for all $x \in S$, then we say that the function $f(x)$ is proper.

A function $f: S \to [-\infty, +\infty]$ is called convex if its epigraph is a convex set in $\mathbb{R}^n \times \mathbb{R}$. This is equivalent to saying that $S$ is a convex set in $\mathbb{R}^n$ and for any $x^1, x^2 \in S$ and $\lambda \in [0,1]$, we have
$$f((1-\lambda)x^1 + \lambda x^2) \le (1-\lambda)f(x^1) + \lambda f(x^2) \qquad (2.1)$$
whenever the right-hand side is defined. In other words, (2.1) must always hold unless $f(x^1) = -f(x^2) = \pm\infty$. By induction it can be proved that if $f(x)$ is convex then for any finite set $x^1,\dots,x^k \in S$ and any nonnegative numbers $\lambda_1,\dots,\lambda_k$ summing up to 1, we have
$$f\Big(\sum_{i=1}^k \lambda_i x^i\Big) \le \sum_{i=1}^k \lambda_i f(x^i)$$
whenever the right-hand side is defined. A function $f(x)$ is said to be concave on $S$ if $-f(x)$ is convex; affine on $S$ if $f(x)$ is finite and both convex and concave. An affine function on $\mathbb{R}^n$ has the form $f(x) = \langle a, x\rangle + \alpha$, with $a \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, because its epigraph is a halfspace in $\mathbb{R}^n \times \mathbb{R}$ containing no vertical line.


For a given nonempty convex set $C \subset \mathbb{R}^n$ we can define the convex functions:

the indicator function of $C$: $\delta_C(x) = 0$ if $x \in C$, $\delta_C(x) = +\infty$ if $x \notin C$;
the support function (see Sect. 1.8): $s_C(x) = \sup_{y\in C} \langle y, x\rangle$;
the distance function: $d_C(x) = \inf_{y\in C} \|x - y\|$.

The convexity of $\delta_C(x)$ is obvious; that of the two other functions can be verified directly or derived from Propositions 2.5 and 2.9 below.
Proposition 2.1 If $f(x)$ is an improper convex function on $\mathbb{R}^n$ then $f(x) = -\infty$ at every relative interior point $x$ of its effective domain.

Proof By the definition of an improper convex function, $f(x^0) = -\infty$ for at least some $x^0 \in \mathrm{dom}\,f$ (unless $\mathrm{dom}\,f = \emptyset$). If $x \in \mathrm{ri}(\mathrm{dom}\,f)$ then there is a point $x' \in \mathrm{dom}\,f$ such that $x$ is a relative interior point of the line segment $[x^0, x']$. Since $f(x') < +\infty$, it follows from $x = \lambda x^0 + (1-\lambda)x'$ with $\lambda \in (0,1)$ that $f(x) \le \lambda f(x^0) + (1-\lambda)f(x') = -\infty$. □

From the definition it is straightforward that a function $f(x)$ on $\mathbb{R}^n$ is convex if and only if its restriction to every straight line in $\mathbb{R}^n$ is convex. Therefore, convex functions on $\mathbb{R}^n$ can be characterized via properties of convex functions on the real line.
Theorem 2.1 A real-valued function $f(x)$ on an open interval $(a,b) \subset \mathbb{R}$ is convex if and only if it is continuous and possesses at every $x \in (a,b)$ finite left and right derivatives
$$f'_-(x) = \lim_{t\uparrow 0} \frac{f(x+t) - f(x)}{t}, \quad f'_+(x) = \lim_{t\downarrow 0} \frac{f(x+t) - f(x)}{t}$$
such that $f'_+(x)$ is nondecreasing and
$$f'_-(x) \le f'_+(x), \quad f'_+(x_1) \le f'_-(x_2) \ \text{ for } x_1 < x_2. \qquad (2.2)$$

Proof
(i) Let $f(x)$ be convex. If $0 < s < t$ and $x + t < b$ then the point $(x+s, f(x+s))$ is below the segment joining $(x, f(x))$ and $(x+t, f(x+t))$, so
$$\frac{f(x+s) - f(x)}{s} \le \frac{f(x+t) - f(x)}{t}. \qquad (2.3)$$
This shows that the function $t \mapsto [f(x+t) - f(x)]/t$ is nonincreasing as $t \downarrow 0$. Hence it has a limit $f'_+(x)$ (finite or $= -\infty$). Analogously, $f'_-(x)$ exists (finite or $= +\infty$). Furthermore, setting $y = x+s$, $t = s+r$, we also have
$$\frac{f(x+s) - f(x)}{s} \le \frac{f(y+r) - f(y)}{r}, \qquad (2.4)$$
which implies $f'_+(x) \le f'_+(y)$ for $x < y$, i.e., $f'_+(x)$ is nondecreasing. Finally, writing (2.4) as
$$\frac{f(y-s) - f(y)}{-s} \le \frac{f(y+r) - f(y)}{r}$$
and letting $s \downarrow 0$, $r \downarrow 0$ yields $f'_-(y) \le f'_+(y)$, proving the left part of (2.2) and at the same time the finiteness of these derivatives. The continuity of $f(x)$ at every $x \in (a,b)$ then follows from the existence of finite $f'_-(x)$ and $f'_+(x)$. Furthermore, setting $x = x_1$, $y + r = x_2$ in (2.4) and letting $s, r \to 0$ yields the right part of (2.2).
(ii) Now suppose that $f(x)$ has all the properties mentioned in the theorem and let $a < c < d < b$. Consider the function
$$g(x) = f(x) - f(c) - (x-c)\,\frac{f(d) - f(c)}{d-c}.$$
Since for any $x = (1-\lambda)c + \lambda d$ we have $g(x) = f(x) - f(c) - \lambda[f(d) - f(c)] = f(x) - [(1-\lambda)f(c) + \lambda f(d)]$, to prove the convexity of $f(x)$ it suffices to show that $g(x) \le 0$ for any $x \in [c,d]$. Suppose the contrary, that the maximum of $g(x)$ over the segment $[c,d]$ is positive (this maximum exists because $f(x)$ is continuous). Let $e \in [c,d]$ be the point where this maximum is attained. Note that $g(c) = g(d) = 0$ (hence $c < e < d$), and from its expression $g(x)$ has the same properties as $f(x)$, namely: $g'_-(x), g'_+(x)$ exist at every $x \in (c,d)$, $g'_-(x) \le g'_+(x)$, $g'_+(x)$ is nondecreasing and $g'_+(x_1) \le g'_-(x_2)$ for $x_1 \le x_2$. Since $g(e) \ge g(x)$ $\forall x \in [c,d]$, we must have $g'_-(e) \ge 0 \ge g'_+(e)$; consequently $g'_-(e) = g'_+(e) = 0$, and hence, since $g'_+(x)$ is nondecreasing, $g'_+(x) \ge 0$ $\forall x \in [e,d)$. If $g'_-(y) \le 0$ for some $y \in (e,d]$ then $g'_+(x) \le g'_-(y) \le 0$, hence $g'_+(x) = 0$ for all $x \in [e,y)$, from which it follows that $g(y) = g(e) > 0$. Since $g(d) = 0$, there must exist $y \in (e,d)$ with $g'_-(y) > 0$. Let $x_1 \in [y,d)$ be the point where $g(x)$ attains its maximum over the segment $[y,d]$. Then $g'_+(x_1) \le 0$, contradicting $g'_+(x_1) \ge g'_+(y) \ge g'_-(y) > 0$. Therefore $g(x) \le 0$ for all $x \in [c,d]$, as was to be proved. □
Corollary 2.1 A differentiable real-valued function $f(x)$ on an open interval is convex if and only if its derivative $f'$ is a nondecreasing function. A twice differentiable real-valued function $f(x)$ on an open interval is convex if and only if its second derivative $f''$ is nonnegative throughout this interval. □
Proposition 2.2 A twice differentiable real-valued function $f(x)$ on an open convex set $C$ in $\mathbb{R}^n$ is convex if and only if for every $x \in C$ its Hessian matrix
$$Q_x = (q_{ij}(x)), \quad q_{ij}(x) = \frac{\partial^2 f}{\partial x_i \partial x_j}(x_1,\dots,x_n)$$
is positive semidefinite, i.e.,
$$\langle u, Q_x u\rangle \ge 0 \quad \forall u \in \mathbb{R}^n.$$

Proof The function $f$ is convex on $C$ if and only if for each $a \in C$ and $u \in \mathbb{R}^n$ the function $\varphi_{a,u}(t) = f(a + tu)$ is convex on the open real interval $\{t \mid a + tu \in C\}$. The proposition then follows from the preceding corollary, since an easy computation yields $\varphi''(t) = \langle u, Q_x u\rangle$ with $x = a + tu$. □
In particular, a quadratic function
$$f(x) = \frac{1}{2}\langle x, Qx\rangle + \langle x, a\rangle + \alpha,$$
where $Q$ is a symmetric $n \times n$ matrix, is convex on $\mathbb{R}^n$ if and only if $Q$ is positive semidefinite. It is concave on $\mathbb{R}^n$ if and only if its matrix $Q$ is negative semidefinite.
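For a quadratic the criterion is a one-line computation: $Q$ is positive semidefinite iff its smallest eigenvalue is nonnegative. A sketch of this test (mine, assuming NumPy):

import numpy as np

def is_convex_quadratic(Q, tol=1e-10):
    """Convexity test for f(x) = 0.5 <x,Qx> + <a,x> + alpha via Prop. 2.2."""
    Q = (Q + Q.T) / 2                          # symmetrize, just in case
    return bool(np.all(np.linalg.eigvalsh(Q) >= -tol))

print(is_convex_quadratic(np.array([[2., 0.], [0., 1.]])))  # True
print(is_convex_quadratic(np.array([[1., 2.], [2., 1.]])))  # False (eigs 3, -1)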
Proposition 2.3 A proper convex function $f$ on $\mathbb{R}^n$ is continuous at every interior point of its effective domain.

Proof Let $x^0 \in \mathrm{int}(\mathrm{dom}\,f)$. Without loss of generality one can assume $x^0 = 0$. By Theorem 2.1, for each $i = 1,\dots,n$ the restriction of $f$ to the open interval $\{t \mid x^0 + te^i \in \mathrm{int}(\mathrm{dom}\,f)\}$ is continuous relative to this interval. Hence for any given $\varepsilon > 0$ and for each $i = 1,\dots,n$, we can select $\delta_i > 0$ so small that $|f(x) - f(x^0)| \le \varepsilon$ for all $x \in [-\delta_i e^i, +\delta_i e^i]$. Let $\delta = \min\{\delta_i \mid i = 1,\dots,n\}$ and $B = \{x \mid \|x\|_1 \le \delta\}$. Denote $u^i = e^i$, $u^{i+n} = -e^i$, $i = 1,\dots,n$. Then, as seen in the proof of Corollary 1.6, any $x \in B$ is of the form $x = \sum_{i=1}^{2n} \lambda_i \delta u^i$, with $\sum_{i=1}^{2n} \lambda_i = 1$, $0 \le \lambda_i \le 1$; hence $f(x) \le \sum_{i=1}^{2n} \lambda_i f(\delta u^i)$, and consequently $f(x) - f(x^0) \le \sum_{i=1}^{2n} \lambda_i [f(\delta u^i) - f(x^0)] \le \varepsilon$. Since $-x \in B$ as well and $f(x^0) \le \frac{1}{2}f(x) + \frac{1}{2}f(-x)$, we also have $f(x) - f(x^0) \ge -[f(-x) - f(x^0)] \ge -\varepsilon$. Therefore,
$$|f(x) - f(x^0)| \le \varepsilon$$
for all $x \in B$, proving the continuity of $f(x)$ at $x^0$. □
Proposition 2.4 Let $f$ be a real-valued function on a convex set $C \subset \mathbb{R}^n$. If for every $x \in C$ there exists a convex open neighborhood $U_x$ of $x$ such that $f$ is convex on $U_x \cap C$ then $f$ is convex on $C$.

Proof It suffices to show that for every $a \in C$, $u \in \mathbb{R}^n$, the function $\varphi(t) = f(a + tu)$ is convex on the interval $\Delta := \{t \mid a + tu \in C\}$. But from the hypothesis, this function is convex in a neighborhood of every $t \in \Delta$, hence it is continuous and has left and right derivatives $\varphi'_-(t) \le \varphi'_+(t)$ which are nondecreasing in a neighborhood of every $t \in \Delta$. These derivatives thus exist and satisfy the conditions described in Theorem 2.1 on the whole interval $\Delta$. Hence, $\varphi(t)$ is convex. □
Fig. 2.1 Convex piecewise affine function

For example, let $f(x): \mathbb{R} \to \mathbb{R}$ be a piecewise convex function on the real line, i.e., a function such that the real line can be partitioned into a finite number of intervals $\Delta_i$, $i = 1,\dots,N$, in each of which $f(x)$ is convex. Then $f(x)$ is convex if and only if it is convex in a neighborhood of each breakpoint (endpoint of some interval $\Delta_i$). In particular, a piecewise affine function on the real line is convex if and only if at each breakpoint the left slope is at most equal to the right slope (in other words, the sequence of slopes is nondecreasing) (see Fig. 2.1).
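The slope criterion just stated translates directly into code. A small sketch (mine) for a piecewise affine function given by its breakpoints:

def is_convex_piecewise_affine(xs, ys):
    """xs: strictly increasing breakpoints; ys: function values at them."""
    slopes = [(y1 - y0) / (x1 - x0)
              for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:])]
    # convex iff the sequence of slopes is nondecreasing
    return all(s0 <= s1 for s0, s1 in zip(slopes, slopes[1:]))

print(is_convex_piecewise_affine([0, 1, 2, 3], [0, -1, 0, 3]))  # True
print(is_convex_piecewise_affine([0, 1, 2], [0, 2, 1]))         # False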

2.2 Operations That Preserve Convexity

A function with a complicated expression may be built up from a number of simpler ingredient functions via certain standard operations. The convexity of such a function can often be established indirectly, by proving that the ingredient functions are known convex functions, whereas the operations involved in the composition of the ingredient functions preserve convexity. It is therefore useful to be familiar with some of the most important operations which preserve convexity.

Proposition 2.5 A positive combination of finitely many proper convex functions on $\mathbb{R}^n$ is convex. The upper envelope (pointwise supremum) of an arbitrary family of convex functions is convex.

Proof If $f(x)$ is convex and $\lambda \ge 0$, then $\lambda f(x)$ is obviously convex. If $f_1$ and $f_2$ are proper convex functions on $\mathbb{R}^n$, then it is also evident that $f_1 + f_2$ is convex. This proves the first part of the proposition. The second part follows from the facts that if $f(x) = \sup\{f_i(x) \mid i \in I\}$, then $\mathrm{epi}\,f = \cap_{i\in I}\,\mathrm{epi}\,f_i$, and the intersection of a family of convex sets is a convex set. □
Proposition 2.6 Let $\Omega$ be a convex set in $\mathbb{R}^n$, $G$ a convex set in $\mathbb{R}^m$, and $\varphi(x,y)$ a real-valued convex function on $\Omega \times G$. Then the function
$$f(x) = \inf_{y\in G} \varphi(x,y)$$
is convex on $\Omega$.

Proof Let $x^1, x^2 \in \Omega$ and $x = \lambda x^1 + (1-\lambda)x^2$ with $\lambda \in [0,1]$. For each $i = 1, 2$ select a sequence $\{y^{i,k}\} \subset G$ such that
$$\varphi(x^i, y^{i,k}) \to \inf_{y\in G} \varphi(x^i, y).$$
By convexity of $\varphi$,
$$f(x) \le \varphi(x, \lambda y^{1,k} + (1-\lambda)y^{2,k}) \le \lambda\varphi(x^1, y^{1,k}) + (1-\lambda)\varphi(x^2, y^{2,k}),$$
hence, letting $k \to \infty$ yields
$$f(x) \le \lambda f(x^1) + (1-\lambda)f(x^2). \qquad \Box$$
Proposition 2.7 If $g_i(x)$, $i = 1, \ldots, m$, are concave positive functions on a convex set $C \subset \mathbb{R}^n$ then their geometric mean
\[ f(x) = \left[\prod_{i=1}^m g_i(x)\right]^{1/m} \tag{2.5} \]
is a concave function (so $-f(x)$ is a convex function) on $C$.

Proof Let $T = \{t \in \mathbb{R}^m_+ \mid \prod_{i=1}^m t_i \ge 1\}$. We show that for any fixed $x \in C$:
\[ \left[\prod_{i=1}^m g_i(x)\right]^{1/m} = \min_{t \in T}\left\{\frac{1}{m}\sum_{i=1}^m t_i g_i(x)\right\}. \tag{2.6} \]
Indeed, observe that the right-hand side of (2.6) is equal to
\[ \min\left\{\frac{1}{m}\sum_{i=1}^m t_i g_i(x) \;\Big|\; \prod_{i=1}^m t_i = 1,\ t_i > 0,\ i = 1, \ldots, m\right\} \]
since if $\prod_{i=1}^m t_i > 1$ then by replacing each $t_i$ with $t_i' \le t_i$ such that $\prod_{i=1}^m t_i' = 1$ we can only decrease the value of $\sum_{i=1}^m t_i g_i(x)$. Therefore, it can be assumed that $\prod_{i=1}^m t_i = 1$ and hence
\[ \prod_{i=1}^m t_i g_i(x) = \prod_{i=1}^m g_i(x). \tag{2.7} \]
Since $t_i g_i(x) > 0$, $i = 1, \ldots, m$, and for fixed $x$ the product of these positive numbers is constant $[= \prod_{i=1}^m g_i(x)$ by (2.7)$]$, their sum is minimal when these numbers are all equal (by the theorem on arithmetic and geometric means). That is, taking account of (2.7), the minimum of $\sum_{i=1}^m t_i g_i(x)$ is achieved when $t_i g_i(x) = [\prod_{i=1}^m g_i(x)]^{1/m}$ for all $i = 1, \ldots, m$; hence (2.6). Since for fixed $t \in T$ the function $x \mapsto \varphi_t(x) := \frac{1}{m}\sum_{i=1}^m t_i g_i(x)$ is concave by Proposition 2.5, the lower envelope $\inf_{t \in T} \varphi_t(x) = [\prod_{i=1}^m g_i(x)]^{1/m}$ is concave by the same proposition. $\square$

An interesting concave function of the class (2.5) is the geometric mean:
\[ f(x) = \begin{cases} (x_1 x_2 \cdots x_n)^{1/n} & \text{if } x_1 \ge 0, \ldots, x_n \ge 0 \\ -\infty & \text{otherwise,} \end{cases} \]
which corresponds to the case when $g_i(x) = x_i$.
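The concavity asserted by Proposition 2.7 can be spot-checked numerically. The following sketch (illustrative only; dimensions, sample ranges, and tolerance are our assumptions) verifies the midpoint concavity inequality for the geometric mean on random points of the positive orthant:

```python
import numpy as np

# Illustrative check of Proposition 2.7 with g_i(x) = x_i: the geometric mean
# f(x) = (x_1 ... x_n)^(1/n) satisfies f((x+y)/2) >= (f(x)+f(y))/2 on R^n_+.
def geo_mean(x):
    return np.prod(x) ** (1.0 / len(x))

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.uniform(0.1, 10.0, 5), rng.uniform(0.1, 10.0, 5)
    assert geo_mean((x + y) / 2) >= (geo_mean(x) + geo_mean(y)) / 2 - 1e-12
print("midpoint concavity verified on 1000 random pairs")
```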
Proposition 2.8 Let $g(x): \mathbb{R}^n \to (-\infty, +\infty)$ be a convex function and let $\varphi(t): \mathbb{R} \to (-\infty, +\infty)$ be a nondecreasing convex function. Then $f(x) = \varphi(g(x))$ is convex on $\mathbb{R}^n$.

Proof The proof is straightforward. For any $x_1, x_2 \in \mathbb{R}^n$ and $\lambda \in [0, 1]$ we have
\[ g((1-\lambda)x_1 + \lambda x_2) \le (1-\lambda)g(x_1) + \lambda g(x_2), \]
hence
\[ \varphi(g((1-\lambda)x_1 + \lambda x_2)) \le (1-\lambda)\varphi(g(x_1)) + \lambda\varphi(g(x_2)). \qquad \square \]

For example, by this proposition the function $f(x) = \sum_{i=1}^m c_i e^{g_i(x)}$ is convex if $c_i > 0$ and each $g_i(x)$ is convex proper.
Given the epigraph $E$ of a convex function $f(x)$, one can restore $f(x)$ by the formula
\[ f(x) = \inf\{t \mid (x, t) \in E\}. \tag{2.8} \]
Conversely, given a convex set $E \subset \mathbb{R}^{n+1}$, the function $f(x)$ defined by (2.8) is a convex function on $\mathbb{R}^n$ by Proposition 2.6. Therefore, if $f_1, \ldots, f_m$ are $m$ given convex functions, and $E \subset \mathbb{R}^{n+1}$ is a convex set resulting from some operation on their epigraphs $E_1, \ldots, E_m$, then one can use (2.8) to define a corresponding new convex function $f(x)$.
Proposition 2.9 Let $f_1, \ldots, f_m$ be proper convex functions on $\mathbb{R}^n$. Then
\[ f(x) = \inf\left\{\sum_{i=1}^m f_i(x^i) \;\Big|\; x^i \in \mathbb{R}^n,\ \sum_{i=1}^m x^i = x\right\} \]
is a convex function on $\mathbb{R}^n$.

Proof Indeed, $f(x)$ is defined by (2.8), where $E = E_1 + \cdots + E_m$ and $E_i = \mathrm{epi} f_i$, $i = 1, \ldots, m$. $\square$

The above constructed function $f(x)$ is called the infimal convolution of the functions $f_1, \ldots, f_m$. For example, the convexity of the distance function $d_C(x) = \inf\{\|x - y\| \mid y \in C\}$ associated with a convex set $C$ follows from the above proposition because $d_C(x) = \inf_y\{\|x - y\| + \delta_C(y)\} = \inf\{\|x^1\| + \delta_C(x^2) \mid x^1 + x^2 = x\}$, where $\delta_C$ denotes the indicator function of $C$.
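On a discretized real line this infimal convolution can be computed directly. The sketch below (an assumed discretization, not from the text) forms the infimal convolution of $f_1(x) = |x|$ with the indicator of $C = [-1, 1]$ on a grid and confirms that it reproduces the distance function $d_C$:

```python
import numpy as np

# Sketch (assumed grid): the infimal convolution of f1(x) = |x| with the
# indicator of C = [-1, 1] equals the distance function d_C, as noted above.
grid = np.linspace(-3.0, 3.0, 121)
ind_C = np.where(np.abs(grid) <= 1.0, 0.0, np.inf)   # indicator of C
inf_conv = np.array([np.min(np.abs(x - grid) + ind_C) for x in grid])
dist_C = np.maximum(np.abs(grid) - 1.0, 0.0)         # exact distance to [-1, 1]
print(np.max(np.abs(inf_conv - dist_C)))             # 0.0 up to grid resolution
```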
Now let $g(x)$ be a nonconvex function, so that its epigraph is a nonconvex set. The relation (2.8) with $E = \mathrm{conv}(\mathrm{epi}\, g)$ defines a function $f(x)$ called the convex envelope or convex hull of $g(x)$ and denoted by $\mathrm{conv}\, g$. Since $E$ is the smallest convex set containing the epigraph of $g$, it is easily seen that $\mathrm{conv}\, g$ is the largest convex function majorized by $g$.
When $C$ is a subset of $\mathbb{R}^n$, the convex envelope of the function
\[ g|_C = \begin{cases} g(x) & \text{if } x \in C \\ +\infty & \text{if } x \notin C \end{cases} \]
is called the convex envelope of $g$ over $C$.
Proposition 2.10 Let $g: \mathbb{R}^n \to \mathbb{R}$. The convex envelope of $g$ over a set $C \subset \mathbb{R}^n$ such that $\dim(\mathrm{aff}\, C) = k$ is given by
\[ f(x) = \inf\left\{\sum_{i=1}^{k+1} \lambda_i g(x^i) \;\Big|\; x^i \in C,\ \lambda_i \ge 0,\ \sum_{i=1}^{k+1} \lambda_i = 1,\ \sum_{i=1}^{k+1} \lambda_i x^i = x\right\}. \]

Proof Let $X = C \times \{0\} \subset \mathbb{R}^n \times \mathbb{R}$, $B = \{(0, t) \mid 0 \le t \le 1\} \subset \mathbb{R}^n \times \mathbb{R}$, and define $E = \{(x, g(x)) \mid x \in C\} = \bigcup_{(x,0) \in X}((x, 0) + g(x)B)$. Then $X$ is a Carathéodory core of $E$ (cf. Sect. 1.4) and since $\dim(\mathrm{aff}\, X) = k$, by Proposition 1.14 we have
\[ \mathrm{conv}\, E = \left\{(x, t) = \sum_{i=1}^{k+1} \lambda_i (x^i, g(x^i)) \;\Big|\; x^i \in C,\ \lambda_i \ge 0,\ \sum_{i=1}^{k+1} \lambda_i = 1\right\}. \]
But clearly for every $(x, t) \in \mathrm{epi}\, g$ there is $\tau \le t$ such that $(x, \tau) \in E$, so for every $(x, t) \in \mathrm{conv}(\mathrm{epi}\, g)$ there is $\tau \le t$ such that $(x, \tau) \in \mathrm{conv}\, E$. Therefore,
\[ (\mathrm{conv}\, g)(x) = \inf\{t \mid (x, t) \in \mathrm{conv}(\mathrm{epi}\, g)\} = \inf\{t \mid (x, t) \in \mathrm{conv}\, E\} = \inf\left\{\sum_{i=1}^{k+1} \lambda_i g(x^i) \;\Big|\; x^i \in C,\ \lambda_i \ge 0,\ \sum_{i=1}^{k+1} \lambda_i x^i = x,\ \sum_{i=1}^{k+1} \lambda_i = 1\right\}. \qquad \square \]
Corollary 2.2 The convex envelope of a concave function $g: \mathbb{R}^n \to \mathbb{R}$ over a polytope $D$ in $\mathbb{R}^n$ with vertex set $V$ is the function
\[ f(x) = \min\left\{\sum_{i=1}^{n+1} \lambda_i g(v^i) \;\Big|\; v^i \in V,\ \lambda_i \ge 0,\ \sum_{i=1}^{n+1} \lambda_i = 1,\ \sum_{i=1}^{n+1} \lambda_i v^i = x\right\}. \]

Proof By the above proposition, $(\mathrm{conv}\, g)(x) \le f(x)$; but any $x \in D$ is of the form $x = \sum_{i=1}^{n+1} \lambda_i v^i$ with $v^i \in V$, $\lambda_i \ge 0$, $\sum_{i=1}^{n+1} \lambda_i = 1$, hence $f(x) \le \sum_{i=1}^{n+1} \lambda_i g(v^i)$ (by definition of $f(x)$), while $\sum_{i=1}^{n+1} \lambda_i g(v^i) \le g(x)$ by the concavity of $g$. Therefore, $f(x) \le g(x)$ for all $x \in D$, and since $f(x)$ is convex, we must have $f(x) \le (\mathrm{conv}\, g)(x)$, and hence $f(x) = (\mathrm{conv}\, g)(x)$. $\square$
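Corollary 2.2 expresses the envelope value at $x$ as a linear program over the vertex weights. The following sketch (our own setup, assuming SciPy is available; the function and data are hypothetical) evaluates it for a concave $g$ over the unit square:

```python
import numpy as np
from scipy.optimize import linprog

# Sketch of Corollary 2.2 as a linear program: (conv g)(x) = min sum(l_i g(v_i))
# subject to l >= 0, sum l_i = 1, sum l_i v_i = x, with v_i the vertices.
def convex_envelope(g, V, x):
    V = np.asarray(V, dtype=float)            # rows are the vertices v_i
    c = np.array([g(v) for v in V])           # objective: sum l_i g(v_i)
    A_eq = np.vstack([V.T, np.ones(len(V))])  # sum l_i v_i = x and sum l_i = 1
    b_eq = np.append(np.asarray(x, float), 1.0)
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun

g = lambda v: -np.dot(v, v)                   # concave: -||v||^2
V = [(0, 0), (1, 0), (0, 1), (1, 1)]          # vertices of the unit square
print(convex_envelope(g, V, (0.5, 0.5)))      # -1.0 at the center
```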
2.3 Lower Semi-Continuity

Given a function $f: \mathbb{R}^n \to [-\infty, +\infty]$, the sets
\[ \{x \mid f(x) \le \alpha\}, \qquad \{x \mid f(x) \ge \alpha\}, \]
where $\alpha \in [-\infty, +\infty]$, are called lower and upper level sets, respectively, of $f$.

Proposition 2.11 The lower (upper, resp.) level sets of a convex (concave, resp.) function $f(x)$ are convex.
Proof This property is equivalent to
\[ f((1-\lambda)x_1 + \lambda x_2) \le \max\{f(x_1), f(x_2)\} \quad \forall \lambda \in (0, 1) \tag{2.9} \]
for all $x_1, x_2 \in \mathbb{R}^n$, which is an immediate consequence of the definition of convex functions. $\square$

Note that the converse of this proposition is not true. For example, a real-valued function on the real line which is nondecreasing has all its lower level sets convex, but may not be convex. A function $f(x)$ whose every nonempty lower level set is convex (or, equivalently, which satisfies (2.9) for all $x_1, x_2 \in \mathbb{R}^n$) is said to be quasiconvex. If every nonempty upper level set is convex, $f(x)$ is said to be quasiconcave.
Proposition 2.12 For any proper convex function $f$:
(i) The maximum of $f$ over any line segment is attained at one endpoint.
(ii) If $f(x)$ is finite and bounded above on a halfline, then its maximum over the halfline is attained at the origin of the halfline.
(iii) If $f(x)$ is finite and bounded above on an affine set, then it is constant on this set.

Proof
(i) Immediate from Proposition 2.11.
(ii) If $f(b) > f(a)$ then for any $x = b + \lambda(b - a)$ with $\lambda \ge 0$ we have $b = \frac{1}{1+\lambda}x + \frac{\lambda}{1+\lambda}a$, hence $(1+\lambda)f(b) \le f(x) + \lambda f(a)$ (whenever $f(x) < +\infty$), i.e., $f(x) \ge f(b) + \lambda(f(b) - f(a))$, which implies $f(x) \to +\infty$ as $\lambda \to +\infty$. Therefore, if $f(x)$ is finite and bounded above on a halfline of origin $a$, one must have $f(b) \le f(a)$ for every $b$ on this halfline.
(iii) Let $M$ be an affine set on which $f(x)$ is finite. If $f(b) > f(a)$ for some $a, b \in M$, then by (ii), $f(x)$ is unbounded above on the halfline in $M$ from $a$ through $b$. Therefore, if $f(x)$ is bounded above on $M$, it must be constant on $M$. $\square$
A function $f$ from a set $S \subset \mathbb{R}^n$ to $[-\infty, +\infty]$ is said to be lower semi-continuous (l.s.c.) at a point $x \in S$ if
\[ \liminf_{y \to x,\ y \in S} f(y) \ge f(x). \]
It is said to be upper semi-continuous (u.s.c.) at $x \in S$ if
\[ \limsup_{y \to x,\ y \in S} f(y) \le f(x). \]
A function which is both lower and upper semi-continuous at $x$ is continuous at $x$ in the ordinary sense.
Proposition 2.13 Let $S$ be a closed set in $\mathbb{R}^n$. For an arbitrary function $f: S \to [-\infty, +\infty]$ the following conditions are equivalent:
(i) The epigraph of $f$ is a closed set in $\mathbb{R}^{n+1}$;
(ii) For every $\alpha \in \mathbb{R}$ the set $\{x \in S \mid f(x) \le \alpha\}$ is closed;
(iii) $f$ is lower semi-continuous throughout $S$.

Proof (i) $\Rightarrow$ (ii). Let $x^\nu \in S$, $x^\nu \to x$, $f(x^\nu) \le \alpha$. Since $(x^\nu, \alpha) \in \mathrm{epi} f$, it follows from the closedness of $\mathrm{epi} f$ that $(x, \alpha) \in \mathrm{epi} f$, i.e., $f(x) \le \alpha$, proving (ii).
(ii) $\Rightarrow$ (iii). Let $x^\nu \in S$, $x^\nu \to x$. If $\lim_{\nu \to \infty} f(x^\nu) < f(x)$ then there is $\alpha < f(x)$ such that $f(x^\nu) \le \alpha$ for all sufficiently large $\nu$. From (ii) it would then follow that $f(x) \le \alpha$, a contradiction. Therefore $\lim_{\nu \to \infty} f(x^\nu) \ge f(x)$, proving (iii).
(iii) $\Rightarrow$ (i). Let $(x^\nu, t^\nu) \in \mathrm{epi} f$ (i.e., $f(x^\nu) \le t^\nu$) and $(x^\nu, t^\nu) \to (x, t)$. Then from (iii) we have $\liminf f(x^\nu) \ge f(x)$, hence $t \ge f(x)$, i.e., $(x, t) \in \mathrm{epi} f$. $\square$
Proposition 2.14 Let $f$ be a l.s.c. proper convex function. Then all the nonempty lower level sets $\{x \mid f(x) \le \alpha\}$, $\alpha \in \mathbb{R}$, have the same recession cone and the same lineality space. The recession cone is made up of 0 and the directions of halflines over which $f$ is bounded above, while the lineality space is the space parallel to the affine set on which $f$ is constant.

Proof By Proposition 2.13 every lower level set $C_\alpha := \{x \mid f(x) \le \alpha\}$ is closed. Let $\Gamma = \{\lambda u \mid \lambda \ge 0\}$. If $f$ is bounded above on a halfline $\Gamma_a = a + \Gamma$, then $f(a) \in \mathbb{R}$ (because $f$ is proper) and by Proposition 2.12, $f(x) \le f(a)$ for all $x \in \Gamma_a$. For any nonempty $C_\alpha$, $\alpha \in \mathbb{R}$, consider a point $b \in C_\alpha$ and let $\beta = \max\{f(a), \alpha\}$, so that $\beta \in \mathbb{R}$ and $\Gamma_a \subset C_\beta$. Since $C_\beta$ is closed and $b \in C_\beta$, it follows from Lemma 1.1 that $b + \Gamma \subset C_\beta$, i.e., $f(x)$ is finite and bounded above on $b + \Gamma$. Then by Proposition 2.12, $f(x) \le f(b) \le \alpha$ for all $x \in b + \Gamma$, hence $b + \Gamma \subset C_\alpha$. Thus, if $f$ is bounded above on a halfline of direction $u$, then $u$ is a direction of recession for every nonempty $C_\alpha$, $\alpha \in \mathbb{R}$. The converse is obvious. Therefore, the recession cone of $C_\alpha$ is the same for all $\alpha$ and is made up of 0 and all directions of halflines over which $f$ is bounded above. The rest of the proposition is straightforward. $\square$
The recession cone and the lineality space common to all lower level sets of $f$ are also called the recession cone and the constancy space of $f$, respectively.

Corollary 2.3 If the lower level set $\{x \mid f(x) \le \alpha\}$ of a l.s.c. proper convex function $f$ is nonempty and bounded for one $\alpha$, then it is bounded for every $\alpha$.

Proof Any lower level set of $f$ is a closed convex set. Therefore, it is bounded if and only if its recession cone is the singleton $\{0\}$ (Corollary 1.8). $\square$

Corollary 2.4 If a l.s.c. proper convex function $f$ is bounded above on a halfline, then it is bounded above on every parallel halfline emanating from a point of $\mathrm{dom} f$. If it is constant on a line, then it is constant on every parallel line passing through a point of $\mathrm{dom} f$.

Proof Immediate. $\square$
Proposition 2.15 Let $f$ be any proper convex function on $\mathbb{R}^n$. For any $y \in \mathbb{R}^n$, there exists $t \in \mathbb{R}$ such that $(y, t)$ belongs to the lineality space of $\mathrm{epi} f$ if and only if
\[ f(x + \lambda y) = f(x) + \lambda t \quad \forall x \in \mathrm{dom} f,\ \forall \lambda \in \mathbb{R}. \tag{2.10} \]
When $f$ is l.s.c., this condition is satisfied provided for some $x \in \mathrm{dom} f$ the function $\lambda \mapsto f(x + \lambda y)$ is affine.

Proof $(y, t)$ belongs to the lineality space of $\mathrm{epi} f$ if and only if for any $x \in \mathrm{dom} f$: $(x, f(x)) + \lambda(y, t) \in \mathrm{epi} f$ for all $\lambda \in \mathbb{R}$, i.e., if and only if
\[ f(x + \lambda y) - \lambda t \le f(x) \quad \forall \lambda \in \mathbb{R}. \]
By Proposition 2.12 applied to the proper convex function $\varphi(\lambda) = f(x + \lambda y) - \lambda t$, this is equivalent to saying that $\varphi(\lambda) = \text{constant}$, i.e., $f(x + \lambda y) - \lambda t = f(x)$ for all $\lambda \in \mathbb{R}$. This proves the first part of the proposition. If $f$ is l.s.c., i.e., $\mathrm{epi} f$ is closed, then $(y, t)$ belongs to the lineality space of $\mathrm{epi} f$ provided for some $x \in \mathrm{dom} f$ the line $\{(x, f(x)) + \lambda(y, t) \mid \lambda \in \mathbb{R}\}$ is contained in $\mathrm{epi} f$ (Lemma 1.1). $\square$

The projection of the lineality space of $\mathrm{epi} f$ on $\mathbb{R}^n$, i.e., the set of vectors $y$ for which there exists $t$ such that $(y, t)$ belongs to the lineality space of $\mathrm{epi} f$, is called the lineality space of $f$. The directions of these vectors $y$ are called directions in which $f$ is affine. The dimension of the lineality space of $f$ is called the lineality of $f$.
By definition, the dimension of a convex function $f$ is the dimension of its domain. The number $\dim f - \mathrm{lineality} f$, which is a measure of the nonlinearity of $f$, is then called the rank of $f$:
\[ \mathrm{rank} f = \dim f - \mathrm{lineality} f. \tag{2.11} \]

Corollary 2.5 If a proper convex function $f$ of full dimension on $\mathbb{R}^n$ has rank $k$, then there exists a $k \times n$ matrix $B$ with $\mathrm{rank}\, B = k$ such that for any $b \in B(\mathrm{dom} f)$ the restriction of $f$ to the affine set $\{x \mid Bx = b\}$ is affine.

Proof Let $L$ be the lineality space of $f$. Since $\dim L = n - k$, there exists a $k \times n$ matrix $B$ of rank $k$ such that $L = \{u \in \mathbb{R}^n \mid Bu = 0\}$. If $b = Bx^0$ for some $x^0 \in \mathrm{dom} f$, then for any $x$ such that $Bx = b$ we have $B(x - x^0) = 0$; hence, denoting by $u^1, \ldots, u^h$ (with $h = n - k$) a basis of $L$, $x = x^0 + \sum_{i=1}^h \lambda_i u^i$. By Proposition 2.15 there exist $t_i \in \mathbb{R}$ such that $f(x^0 + \sum_{i=1}^h \lambda_i u^i) = f(x^0) + \sum_{i=1}^h \lambda_i t_i$ for all $\lambda \in \mathbb{R}^h$. Therefore, $f(x)$ is affine on the affine set $\{x \mid Bx = b\}$. $\square$
Given any proper function $f$ on $\mathbb{R}^n$, the function whose epigraph is the closure of $\mathrm{epi} f$ is the largest l.s.c. minorant of $f$. It is called the l.s.c. hull of $f$ or the closure of $f$, and is denoted by $\mathrm{cl} f$. Thus, for a proper function $f$,
\[ \mathrm{epi}(\mathrm{cl} f) = \mathrm{cl}(\mathrm{epi} f). \tag{2.12} \]
A proper convex function $f$ is said to be closed if $\mathrm{cl} f = f$ (so for proper convex functions, closed is synonymous with l.s.c.). An improper convex function is said to be closed only if $f \equiv +\infty$ or $f \equiv -\infty$ (so if $f(x) = -\infty$ for some $x$, then its closure is the constant function $-\infty$).

Proposition 2.16 The closure of a proper convex function $f$ is a proper convex function which agrees with $f$ except perhaps at the relative boundary points of $\mathrm{dom} f$.

Proof Since $\mathrm{epi} f$ is convex, $\mathrm{cl}(\mathrm{epi} f)$, i.e., $\mathrm{epi}(\mathrm{cl} f)$, is also convex (Proposition 1.10). Hence by Proposition 2.13, $\mathrm{cl} f$ is a closed convex function. Now the condition (2.12) is equivalent to
\[ \mathrm{cl} f(x) = \liminf_{y \to x} f(y) \quad \forall x \in \mathbb{R}^n. \tag{2.13} \]
If $x \in \mathrm{ri}(\mathrm{dom} f)$ then by Proposition 2.3, $f(x) = \lim_{y \to x} f(y)$; hence, by (2.13), $f(x) = \mathrm{cl} f(x)$. Furthermore, if $x \notin \mathrm{cl}(\mathrm{dom} f)$ then $f(y) = +\infty$ for all $y$ in a neighborhood of $x$, and the same formula (2.13) shows that $\mathrm{cl} f(x) = +\infty$. Thus the second half of the proposition is true. It remains to prove that $\mathrm{cl} f$ is proper. For every $x \in \mathrm{ri}(\mathrm{dom} f)$, since $f$ is proper, $-\infty < \mathrm{cl} f(x) = f(x) < +\infty$. On the other hand, if $\mathrm{cl} f(x) = -\infty$ at some relative boundary point $x$ of $\mathrm{dom} f = \mathrm{dom}(\mathrm{cl} f)$, then for an arbitrary $y \in \mathrm{ri}(\mathrm{dom} f)$ we have $\mathrm{cl} f\left(\frac{x+y}{2}\right) \le \frac{1}{2}\mathrm{cl} f(x) + \frac{1}{2}\mathrm{cl} f(y) = -\infty$. Noting that $\frac{x+y}{2} \in \mathrm{ri}(\mathrm{dom} f)$, this contradicts what has just been proved and thereby completes the proof. $\square$
2.4 Lipschitz Continuity

By Proposition 2.3 a convex function is continuous relative to the relative interior of its domain. In this section we present further continuity properties of convex functions.

Proposition 2.17 Let $f$ be a convex function on $\mathbb{R}^n$ and $D$ a polyhedron contained in $\mathrm{dom} f$. Then $f$ is u.s.c. relative to $D$, so that if $f$ is l.s.c., $f$ is actually continuous relative to $D$.
Proof Consider any $x \in D$. By translating if necessary, we can assume that $x = 0$. Let $e^1, \ldots, e^n$ be the unit vectors of $\mathbb{R}^n$ and $C$ the convex hull of the set $\{e^1, \ldots, e^n, -e^1, \ldots, -e^n\}$. This set $C$ is partitioned by the coordinate hyperplanes into simplices $S_i$, $i = 1, \ldots, 2^n$, with a common vertex at 0. Since $D$ is a polyhedron, each $D_i = S_i \cap D$ is a polytope and obviously $D \cap C = \bigcup_{i=1}^{2^n} D_i$. Now let $\{x^k\} \subset D \cap C$ be any sequence such that $x^k \to 0$, $f(x^k) \to \beta$. Then at least one $D_i$, say $D_1$, contains an infinite subsequence of $\{x^k\}$. For convenience we also denote this subsequence by $\{x^k\}$. If $V_1$ is the set of vertices of $D_1$ other than 0, then each $x^k$ is a convex combination of 0 and elements of $V_1$: $x^k = (1 - \sum_{v \in V_1} \lambda^k_v)\,0 + \sum_{v \in V_1} \lambda^k_v v$, with $\lambda^k_v \ge 0$, $\sum_{v \in V_1} \lambda^k_v \le 1$. By convexity of $f$ we can write
\[ f(x^k) \le \Big(1 - \sum_{v \in V_1} \lambda^k_v\Big) f(0) + \sum_{v \in V_1} \lambda^k_v f(v). \]
As $k \to +\infty$, since $x^k \to 0$, it follows that $\lambda^k_v \to 0$ for all $v$, hence $\beta \le f(0)$, i.e.,
\[ \limsup f(x^k) \le f(0). \]
This proves the upper semi-continuity of $f$ at 0 relative to $D$. $\square$
Theorem 2.2 For a proper convex function $f$ on $\mathbb{R}^n$ the following assertions are equivalent:
(i) $f$ is continuous at some point;
(ii) $f$ is bounded above on some open set;
(iii) $\mathrm{int}(\mathrm{epi} f) \neq \emptyset$;
(iv) $\mathrm{int}(\mathrm{dom} f) \neq \emptyset$ and $f$ is Lipschitzian on every bounded set contained in $\mathrm{int}(\mathrm{dom} f)$;
(v) $\mathrm{int}(\mathrm{dom} f) \neq \emptyset$ and $f$ is continuous there.

Proof (i) $\Rightarrow$ (ii): If $f$ is continuous at a point $x^0$, then there exists an open neighborhood $U$ of $x^0$ such that $f(x) < f(x^0) + 1$ for all $x \in U$.
(ii) $\Rightarrow$ (iii): If $f(x) \le c$ for all $x$ in an open set $U$, then $U \times (c, +\infty) \subset \mathrm{epi} f$, hence $\mathrm{int}(\mathrm{epi} f) \neq \emptyset$.
(iii) $\Rightarrow$ (iv): If $\mathrm{int}(\mathrm{epi} f) \neq \emptyset$, then there exist an open set $U$ and an open interval $I \subset \mathbb{R}$ such that $U \times I \subset \mathrm{epi} f$, hence $U \subset \mathrm{dom} f$, i.e., $\mathrm{int}(\mathrm{dom} f) \neq \emptyset$. Consider any compact set $C \subset \mathrm{int}(\mathrm{dom} f)$ and let $B$ be the Euclidean unit ball. For every $r > 0$ the set $C + rB$ is compact, and the family of closed sets $\{(C + rB) \setminus \mathrm{int}(\mathrm{dom} f),\ r > 0\}$ has an empty intersection. In view of the compactness of $C + rB$, some finite subfamily of this family must have an empty intersection; hence for some $r > 0$ we must have $(C + rB) \setminus \mathrm{int}(\mathrm{dom} f) = \emptyset$, i.e., $C + rB \subset \mathrm{int}(\mathrm{dom} f)$. By Proposition 2.3 the function $f$ is continuous on $\mathrm{int}(\mathrm{dom} f)$. Denote by $\mu_1$ and $\mu_2$ the maximum and the minimum of $f$ over $C + rB$. Let $x, x'$ be two distinct points in $C$ and let $z = x + r\frac{x - x'}{\|x - x'\|}$. Then $z \in C + rB \subset \mathrm{int}(\mathrm{dom} f)$. But
\[ x = (1-\lambda)x' + \lambda z, \qquad \lambda = \frac{\|x - x'\|}{r + \|x - x'\|}, \]
and $z, x' \in \mathrm{dom} f$, hence
\[ f(x) \le (1-\lambda)f(x') + \lambda f(z) = f(x') + \lambda(f(z) - f(x')) \]
and consequently
\[ f(x) - f(x') \le \lambda(f(z) - f(x')) \le \lambda(\mu_1 - \mu_2) \le K\|x - x'\|, \qquad K = \frac{\mu_1 - \mu_2}{r}. \]
By symmetry, we also have $f(x') - f(x) \le K\|x - x'\|$. Hence, for all $x, x'$ such that $x \in C$, $x' \in C$:
\[ |f(x) - f(x')| \le K\|x - x'\|, \]
proving the Lipschitz property of $f$ over $C$.
(iv) $\Rightarrow$ (v) and (v) $\Rightarrow$ (i): obvious. $\square$
2.5 Convex Inequalities

A convex inequality in $x$ is an inequality of the form $f(x) \le 0$ or $f(x) < 0$, where $f$ is a convex function. Note that an inequality like $f(x) \ge 0$ or $f(x) > 0$, with $f(x)$ convex, is not convex but reverse convex, because it becomes convex only when reversed. A system of inequalities is said to be consistent if it has a solution, i.e., if there exists at least one value of $x$ satisfying all the inequalities; it is inconsistent otherwise. Many mathematical problems reduce to investigating the consistency (or inconsistency) of a system of inequalities.

Proposition 2.18 Let $f_0, f_1, \ldots, f_m$ be convex functions, finite on some nonempty convex set $D \subset \mathbb{R}^n$. If the system
\[ x \in D, \quad f_i(x) < 0, \quad i = 0, 1, \ldots, m \tag{2.14} \]
is inconsistent, then there exist multipliers $\lambda_i \ge 0$, $i = 0, 1, \ldots, m$, such that $\sum_{i=0}^m \lambda_i > 0$ and
\[ \lambda_0 f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) \ge 0 \quad \forall x \in D. \tag{2.15} \]
If in addition
\[ \exists x^0 \in D:\quad f_i(x^0) < 0, \quad i = 1, \ldots, m, \tag{2.16} \]
then $\lambda_0 > 0$, so that one can take $\lambda_0 = 1$.

Proof Consider the set $C$ of all vectors $y = (y_0, y_1, \ldots, y_m) \in \mathbb{R}^{m+1}$ for each of which there exists an $x \in D$ satisfying
\[ f_i(x) < y_i, \quad i = 0, 1, \ldots, m. \tag{2.17} \]
As can readily be verified, $C$ is a nonempty convex set and the inconsistency of the system (2.14) means that $0 \notin C$. By Lemma 1.2 there is a hyperplane in $\mathbb{R}^{m+1}$ properly separating 0 from $C$, i.e., a vector $\lambda = (\lambda_0, \lambda_1, \ldots, \lambda_m) \neq 0$ such that
\[ \sum_{i=0}^m \lambda_i y_i \ge 0 \quad \forall y = (y_0, y_1, \ldots, y_m) \in C. \tag{2.18} \]
If $x \in D$ then for every $\varepsilon > 0$ we have $f_i(x) < f_i(x) + \varepsilon$ for $i = 0, 1, \ldots, m$, so $(f_0(x) + \varepsilon, \ldots, f_m(x) + \varepsilon) \in C$ and hence
\[ \sum_{i=0}^m \lambda_i (f_i(x) + \varepsilon) \ge 0. \]
Since $\varepsilon > 0$ can be arbitrarily small, this implies (2.15). Furthermore, $\lambda_i \ge 0$ for $i = 0, 1, \ldots, m$, because if $\lambda_j < 0$ for some $j$, then by fixing, for an arbitrary $x \in D$, all $y_i > f_i(x)$, $i \neq j$, while letting $y_j \to +\infty$, we would have $\sum_{i=0}^m \lambda_i y_i \to -\infty$, contrary to (2.18). Finally, under (2.16), if $\lambda_0 = 0$ then $\sum_{i=1}^m \lambda_i > 0$, hence by (2.16), $\sum_{i=1}^m \lambda_i f_i(x^0) < 0$, contradicting the inequality $\sum_{i=1}^m \lambda_i f_i(x^0) \ge 0$ from (2.15). Therefore, $\lambda_0 > 0$, as was to be proved. $\square$
Corollary 2.6 Let $D$ be a convex set in $\mathbb{R}^n$, and $g, f$ two convex functions finite on $D$. If $g(x^0) < 0$ for some $x^0 \in D$, while $f(x) \ge 0$ for all $x \in D$ satisfying $g(x) \le 0$, then there exists a real number $\lambda \ge 0$ such that $f(x) + \lambda g(x) \ge 0$ for all $x \in D$.

Proof Apply the above proposition for $f_0 = f$, $f_1 = g$. $\square$

A more general result about inconsistent systems of convex inequalities is the following:
Theorem 2.3 (Generalized Farkas-Minkowski Theorem) Let $f_i$, $i \in I_1 \subset \{1, \ldots, m\}$, be affine functions on $\mathbb{R}^n$, and let $f_0$ and $f_i$, $i \in I_2 := \{1, \ldots, m\} \setminus I_1$, be convex functions finite on some convex set $D \subset \mathbb{R}^n$. If there exists $x^0$ satisfying
\[ x^0 \in \mathrm{ri}\, D, \quad f_i(x^0) \le 0\ (i \in I_1), \quad f_i(x^0) < 0\ (i \in I_2), \tag{2.19} \]
while the system
\[ x \in D, \quad f_i(x) \le 0\ (i = 1, \ldots, m), \quad f_0(x) < 0 \tag{2.20} \]
is inconsistent, then there exist numbers $\lambda_i \ge 0$, $i = 1, \ldots, m$, such that
\[ f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) \ge 0 \quad \forall x \in D. \tag{2.21} \]

Proof By replacing $I_1$ with $\{i \mid f_i(x^0) = 0\}$ we can assume $f_i(x^0) = 0$ for every $i \in I_1$. Arguing by induction on $m$, observe first that the theorem is true for $m = 1$. Indeed, in this case, if $I_1 = \emptyset$ the theorem follows from Proposition 2.18, so we can assume that $I_1 = \{1\}$, i.e., $f_1(x)$ is affine. In view of the fact $f_1(x^0) = 0$ for $x^0 \in \mathrm{ri}\, D$, if $f_1(x) \ge 0$ for all $x \in D$, then $f_1(x) = 0$ for all $x \in D$ and (2.21) holds with $\lambda = 0$. On the other hand, if there exists $\bar{x} \in D$ satisfying $f_1(\bar{x}) < 0$ then, since the system $x \in D$, $f_1(x) < 0$, $f_0(x) < 0$ is inconsistent, again the theorem follows from Proposition 2.18. Thus, in any event the theorem is true when $m = 1$. Assuming now that the theorem is true for $m = k - 1 \ge 1$, consider the case $m = k$. The hypotheses of the theorem imply that the system
\[ x \in D, \quad f_i(x) \le 0\ (i = 1, \ldots, k-1), \quad \max\{f_k(x), f_0(x)\} < 0 \tag{2.22} \]
is inconsistent, and since the function $\max\{f_k(x), f_0(x)\}$ is convex, there exist, by the induction hypothesis, $t_i \ge 0$, $i = 1, \ldots, k-1$, such that
\[ \max\{f_k(x), f_0(x)\} + \sum_{i=1}^{k-1} t_i f_i(x) \ge 0 \quad \forall x \in D. \tag{2.23} \]
We show that this implies the inconsistency of the system
\[ x \in D, \quad \sum_{i=1}^{k-1} t_i f_i(x) \le 0, \quad f_k(x) \le 0, \quad f_0(x) < 0. \tag{2.24} \]
Indeed, from (2.23) it follows that no solution of (2.24) exists with $f_k(x) < 0$. Furthermore, if there were $\bar{x} \in D$ satisfying (2.24) with $f_k(\bar{x}) = 0$ then, setting $x' = \alpha x^0 + (1-\alpha)\bar{x}$ with $\alpha \in (0, 1)$, we would have $x' \in D$, $\sum_{i=1}^{k-1} t_i f_i(x') \le 0$, $f_k(x') < 0$ and $f_0(x') \le \alpha f_0(x^0) + (1-\alpha)f_0(\bar{x}) < 0$ for sufficiently small $\alpha > 0$. Since this contradicts (2.23), the system (2.24) is inconsistent and, again by the induction hypothesis, there exist $\mu \ge 0$ and $\lambda_k \ge 0$ such that
\[ f_0(x) + \mu\sum_{i=1}^{k-1} t_i f_i(x) + \lambda_k f_k(x) \ge 0 \quad \forall x \in D. \tag{2.25} \]
This is the desired conclusion with $\lambda_i = \mu t_i \ge 0$ for $i = 1, \ldots, k-1$. $\square$
Corollary 2.7 (Farkas' Lemma) Let $A$ be an $m \times n$ matrix, and let $p \in \mathbb{R}^n$. If $\langle p, x\rangle \le 0$ for all $x \in \mathbb{R}^n$ satisfying $Ax \le 0$, then $p = A^T\lambda$ for some $\lambda = (\lambda_1, \ldots, \lambda_m) \ge 0$.

Proof Apply the above theorem for $D = \mathbb{R}^n$, $f_0(x) = -\langle p, x\rangle$, and $f_i(x) = \langle a^i, x\rangle$, $i = 1, \ldots, m$, where $a^i$ is the $i$-th row of $A$. Then there exist nonnegative $\lambda_1, \ldots, \lambda_m$ such that $-\langle p, x\rangle + \sum_{i=1}^m \lambda_i\langle a^i, x\rangle \ge 0$ for all $x \in \mathbb{R}^n$; hence $\langle p, x\rangle - \sum_{i=1}^m \lambda_i\langle a^i, x\rangle = 0$ for all $x \in \mathbb{R}^n$, i.e., $p = \sum_{i=1}^m \lambda_i a^i$. $\square$
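A numerical companion to Farkas' lemma: one can search for the multiplier vector $\lambda \ge 0$ with $A^T\lambda = p$ by a feasibility linear program. The sketch below is our own illustration (assuming SciPy is available; the data are hypothetical):

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative search for lambda >= 0 with A^T lambda = p (Farkas' lemma):
# if the LP is feasible, p lies in the cone generated by the rows of A.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # rows a^i
p = np.array([2.0, 3.0])                            # p = 2*a^1 + 3*a^2

res = linprog(c=np.zeros(A.shape[0]),               # pure feasibility problem
              A_eq=A.T, b_eq=p, bounds=(0, None))
print(res.status == 0, res.x)                       # True, e.g. [2. 3. 0.]
```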
Theorem 2.4 Let $f_1, \ldots, f_m$ be convex functions finite on some convex set $D$ in $\mathbb{R}^n$, and let $A$ be a $k \times n$ matrix, $b \in \mathrm{ri}\, A(D)$. If the system
\[ x \in D, \quad Ax = b, \quad f_i(x) < 0, \quad i = 1, \ldots, m \tag{2.26} \]
is inconsistent, then there exist a vector $t \in \mathbb{R}^k$ and nonnegative numbers $\lambda_1, \ldots, \lambda_m$ summing up to 1 such that
\[ \langle t, Ax - b\rangle + \sum_{i=1}^m \lambda_i f_i(x) \ge 0 \quad \forall x \in D. \tag{2.27} \]

Proof Define $E = \{x \in D \mid Ax = b\}$. Since the system
\[ x \in E, \quad f_i(x) < 0, \quad i = 1, \ldots, m \]
is inconsistent, there exist, by Proposition 2.18, nonnegative numbers $\lambda_1, \ldots, \lambda_m$, not all zero, such that
\[ \sum_{i=1}^m \lambda_i f_i(x) \ge 0 \quad \forall x \in E. \tag{2.28} \]
By dividing by $\sum_{i=1}^m \lambda_i$, we may assume $\sum_{i=1}^m \lambda_i = 1$. Obviously, the convex function $f(x) := \sum_{i=1}^m \lambda_i f_i(x)$ is finite on $D$. Consider the set $C$ of all $(y, y_0) \in \mathbb{R}^k \times \mathbb{R}$ for which there exists an $x \in D$ satisfying
\[ Ax - b = y, \qquad f(x) < y_0. \]
Since by (2.28) $0 \notin C$, and since $C$ is convex, there exists, again by Lemma 1.2, a vector $(t, t_0) \in \mathbb{R}^k \times \mathbb{R}$ such that
\[ \inf_{(y, y_0) \in C}\left[\langle t, y\rangle + t_0 y_0\right] \ge 0, \qquad \sup_{(y, y_0) \in C}\left[\langle t, y\rangle + t_0 y_0\right] > 0. \tag{2.29} \]
If $t_0 < 0$ then by fixing $x \in D$, $y = Ax - b$, and letting $y_0 \to +\infty$ we would have $\langle t, y\rangle + t_0 y_0 \to -\infty$, contradicting (2.29). Consequently $t_0 \ge 0$. We now contend that $t_0 > 0$. Suppose the contrary, that $t_0 = 0$, so that $\langle t, y\rangle \ge 0$ for all $(y, y_0) \in C$, and hence
\[ \langle t, y\rangle \ge 0 \quad \forall y \in A(D) - b. \]
Since by hypothesis $b \in \mathrm{ri}\, A(D)$, i.e., $0 \in \mathrm{ri}(A(D) - b)$, this implies $\langle t, y\rangle = 0$ for all $y \in A(D) - b$, hence $\langle t, y\rangle + t_0 y_0 = 0$ for all $(y, y_0) \in C$, contradicting (2.29). Therefore, $t_0 > 0$ and we can take $t_0 = 1$. Then the left inequality in (2.29), with $y = Ax - b$, $y_0 = f(x) + \varepsilon$ for $x \in D$, yields the desired relation (2.27) by letting $\varepsilon \downarrow 0$. $\square$
2.6 Approximation by Affine Functions

A general method of nonlinear analysis is linearization, i.e., the approximation of convex functions by affine functions. The basis for this approximation is provided by the next result on the structure of closed convex functions, which is merely the analytical equivalent of the corresponding theorem on the structure of closed convex sets (Theorem 1.5).
Theorem 2.5 A proper closed convex function $f$ on $\mathbb{R}^n$ is the upper envelope (pointwise supremum) of the family of all affine functions $h$ on $\mathbb{R}^n$ minorizing $f$.

Proof We first show that for any $(x^0, t_0) \notin \mathrm{epi} f$ there exists $(a, \alpha) \in \mathbb{R}^n \times \mathbb{R}$ such that
\[ \langle a, x\rangle - t < \alpha < \langle a, x^0\rangle - t_0 \quad \forall (x, t) \in \mathrm{epi} f. \tag{2.30} \]
Indeed, since $\mathrm{epi} f$ is a closed convex set, there exists by Theorem 1.3 a hyperplane strongly separating $(x^0, t_0)$ from $\mathrm{epi} f$, i.e., an affine function $\langle a, x\rangle + \beta t$ such that
\[ \langle a, x\rangle + \beta t < \alpha < \langle a, x^0\rangle + \beta t_0 \quad \forall (x, t) \in \mathrm{epi} f. \]
It is easily seen that $\beta \le 0$, because if $\beta > 0$ then by taking a point $x \in \mathrm{dom} f$ and an arbitrary $t \ge f(x)$, we would have $(x, t) \in \mathrm{epi} f$, hence $\langle a, x\rangle + \beta t < \alpha$ for all $t \ge f(x)$, which would lead to a contradiction as $t \to +\infty$. Furthermore, if $x^0 \in \mathrm{dom} f$ then $\beta = 0$ would imply $\langle a, x^0\rangle < \langle a, x^0\rangle$, which is absurd. Hence, in this case, $\beta < 0$, and by dividing $a$ and $\alpha$ by $-\beta$, we can assume $\beta = -1$, so that (2.30) holds. On the other hand, if $x^0 \notin \mathrm{dom} f$ and $\beta = 0$, we can consider an $x^1 \in \mathrm{dom} f$ and $t_1 < f(x^1)$, so that $(x^1, t_1) \notin \mathrm{epi} f$ and, by what has just been proved, there exists $(b, \gamma) \in \mathbb{R}^n \times \mathbb{R}$ satisfying
\[ \langle b, x\rangle - t < \gamma < \langle b, x^1\rangle - t_1 \quad \forall (x, t) \in \mathrm{epi} f. \]
For any $\rho > 0$ we then have, for all $(x, t) \in \mathrm{epi} f$:
\[ \langle b + \rho a, x\rangle - t = (\langle b, x\rangle - t) + \rho\langle a, x\rangle < \gamma + \rho\alpha, \]
while $\langle b + \rho a, x^0\rangle - t_0 = (\langle b, x^0\rangle - t_0) + \rho\langle a, x^0\rangle > \gamma + \rho\alpha$ for sufficiently large $\rho$, because $\alpha < \langle a, x^0\rangle$. Thus, for $\rho > 0$ large enough, setting $a' = b + \rho a$, $\alpha' = \gamma + \rho\alpha$, we have
\[ \langle a', x\rangle - t < \alpha' < \langle a', x^0\rangle - t_0 \quad \forall (x, t) \in \mathrm{epi} f, \]
i.e., $(a', \alpha')$ satisfies (2.30). Note that (2.30) implies $\langle a, x\rangle - \alpha \le f(x)$ for all $x$, i.e., the affine function $h(x) = \langle a, x\rangle - \alpha$ minorizes $f(x)$. Now let $Q$ be the family of all affine functions $h$ minorizing $f$. We contend that
\[ f(x) = \sup\{h(x) \mid h \in Q\}. \tag{2.31} \]
Suppose the contrary, that $f(x^0) > \eta = \sup\{h(x^0) \mid h \in Q\}$ for some $x^0$. Then $(x^0, \eta) \notin \mathrm{epi} f$ and by the above there exists $(a, \alpha) \in \mathbb{R}^n \times \mathbb{R}$ satisfying (2.30) for $t_0 = \eta$. Hence, $h(x) = \langle a, x\rangle - \alpha \in Q$ and $\alpha < \langle a, x^0\rangle - \eta$, i.e., $h(x^0) = \langle a, x^0\rangle - \alpha > \eta$, a contradiction. Thus (2.31) holds, as was to be proved. $\square$
Corollary 2.8 For any function $f: \mathbb{R}^n \to [-\infty, +\infty]$, the closure of the convex hull of $f$ is equal to the upper envelope of all affine functions minorizing $f$.

Proof An affine function $h$ minorizes $f$ if and only if it minorizes $\mathrm{cl}(\mathrm{conv} f)$; hence the conclusion. $\square$
Proposition 2.19 Any proper convex function $f$ has an affine minorant. If $x^0 \in \mathrm{int}(\mathrm{dom} f)$, then an affine minorant $h$ exists which is exact at $x^0$, i.e., such that $h(x^0) = f(x^0)$.

Proof Indeed, the collection $Q$ in Theorem 2.5 for $\mathrm{cl} f$ is nonempty. If $x^0 \in \mathrm{int}(\mathrm{dom} f)$ then $(x^0, f(x^0))$ is a boundary point of the convex set $\mathrm{epi} f$. Hence by Theorem 1.5 there exists a supporting hyperplane to $\mathrm{epi} f$ at this point, i.e., there exists $(a, \beta) \in \mathbb{R}^n \times \mathbb{R}$ such that either $a \neq 0$ or $\beta \neq 0$ and
\[ \langle a, x\rangle + \beta t \le \langle a, x^0\rangle + \beta f(x^0) \quad \forall (x, t) \in \mathrm{epi} f. \]
As in the proof of Theorem 2.5, it is readily seen that $\beta \le 0$. Furthermore, if $\beta = 0$ then the above relation implies that $\langle a, x\rangle \le \langle a, x^0\rangle$ for all $x$ in some open neighborhood $U$ of $x^0$ contained in $\mathrm{dom} f$, and hence that $a = 0$, a contradiction. Therefore, $\beta < 0$, so we can take $\beta = -1$. Then $\langle a, x\rangle - t \le \langle a, x^0\rangle - f(x^0)$ for all $(x, t) \in \mathrm{epi} f$ and the affine function
\[ h(x) = \langle a, x - x^0\rangle + f(x^0) \]
satisfies $h(x) \le f(x)$ for all $x$ and $h(x^0) = f(x^0)$. $\square$
2.7 Subdifferential

Given a proper function $f$ on $\mathbb{R}^n$, a vector $p \in \mathbb{R}^n$ is called a subgradient of $f$ at a point $x^0$ if
\[ \langle p, x - x^0\rangle + f(x^0) \le f(x) \quad \forall x. \tag{2.32} \]
The set of all subgradients of $f$ at $x^0$ is called the subdifferential of $f$ at $x^0$ and is denoted by $\partial f(x^0)$ (Fig. 2.2). The function $f$ is said to be subdifferentiable at $x^0$ if $\partial f(x^0) \neq \emptyset$.
Theorem 2.6 Let $f$ be a proper convex function on $\mathbb{R}^n$. For any bounded set $C \subset \mathrm{int}(\mathrm{dom} f)$ the set $\bigcup_{x \in C} \partial f(x)$ is nonempty and bounded. In particular, $\partial f(x^0)$ is nonempty and bounded at every $x^0 \in \mathrm{int}(\mathrm{dom} f)$.

Fig. 2.2 The set $\partial f(x^0)$

Proof By Proposition 2.19, if $x^0 \in \mathrm{int}(\mathrm{dom} f)$ then $f$ has an affine minorant $h(x)$ such that $h(x^0) = f(x^0)$, i.e., $h(x) = \langle p, x - x^0\rangle + f(x^0)$ for some $p \in \partial f(x^0)$. Thus, $\partial f(x^0) \neq \emptyset$ for every $x^0 \in \mathrm{int}(\mathrm{dom} f)$. Consider now any bounded set $C \subset \mathrm{int}(\mathrm{dom} f)$. As we saw in the proof of Theorem 2.2, there is $r > 0$ such that $C + rB \subset \mathrm{int}(\mathrm{dom} f)$, where $B$ denotes the Euclidean unit ball. By definition, for any $x \in C$ and $p \in \partial f(x)$ we have $\langle p, y - x\rangle + f(x) \le f(y)$ for all $y$; but by Theorem 2.2, there exists $K > 0$ such that $|f(x) - f(y)| \le K\|y - x\|$ for all $y \in C + rB$. Hence $|\langle p, y - x\rangle| \le K\|y - x\|$ for all $y \in C + rB$, i.e., $|\langle p, u\rangle| \le K\|u\|$ for all $u \in B$. By taking $u = p/\|p\|$ this implies $\|p\| \le K$, so the set $\bigcup_{x \in C} \partial f(x)$ is bounded. $\square$
Corollary 2.9 Let $f$ be a proper convex function on $\mathbb{R}^n$. For any bounded convex subset $C$ of $\mathrm{int}(\mathrm{dom} f)$ there exists a positive constant $K$ such that
\[ f(x) = \sup\{h(x) \mid h \in Q_0\} \quad \forall x \in C, \tag{2.33} \]
where every $h \in Q_0$ has the form $h(x) = \langle a, x\rangle - \alpha$ with $\|a\| \le K$.

Proof It suffices to take as $Q_0$ the family of all affine functions $h(x) = \langle a, x - y\rangle + f(y)$ with $y \in C$, $a \in \partial f(y)$. $\square$
Corollary 2.10 Let $f: D \to \mathbb{R}$ be a convex function defined and continuous on a convex set $D$ with nonempty interior. If the set $\bigcup\{\partial f(x) \mid x \in \mathrm{int} D\}$ is bounded, then $f$ can be extended to a finite convex function on $\mathbb{R}^n$.

Proof For each point $y \in \mathrm{int} D$ take a vector $p_y \in \partial f(y)$ and consider the affine function $h_y(x) = f(y) + \langle p_y, x - y\rangle$. The function $\tilde{f}(x) = \sup\{h_y(x) \mid y \in \mathrm{int} D\}$ is convex on $\mathbb{R}^n$ as the upper envelope of a family of affine functions. If $a$ is any fixed point of $D$, then $h_y(x) = f(y) + \langle p_y, a - y\rangle + \langle p_y, x - a\rangle \le f(a) + \langle p_y, x - a\rangle \le f(a) + \|p_y\|\cdot\|x - a\|$. Since $\|p_y\|$ is bounded on $\mathrm{int} D$, the latter inequality shows that $-\infty < \tilde{f}(x) < +\infty$ for all $x \in \mathbb{R}^n$. Thus, $\tilde{f}(x)$ is a finite convex function on $\mathbb{R}^n$. Finally, since obviously $\tilde{f}(x) = f(x)$ for every $x \in \mathrm{int} D$, it follows from the continuity of both $f(x)$ and $\tilde{f}(x)$ on $D$ that $\tilde{f}(x) = f(x)$ for all $x \in D$. $\square$
Example 2.1 (Positively Homogeneous Convex Function) Let $f: \mathbb{R}^n \to \mathbb{R}$ be a positively homogeneous convex function, i.e., a convex function $f: \mathbb{R}^n \to \mathbb{R}$ such that $f(\lambda x) = \lambda f(x)$ for all $\lambda > 0$. Then
\[ \partial f(x^0) = \{p \in \mathbb{R}^n \mid \langle p, x^0\rangle = f(x^0),\ \langle p, x\rangle \le f(x)\ \forall x\}. \tag{2.34} \]

Proof If $p \in \partial f(x^0)$ then $\langle p, x - x^0\rangle + f(x^0) \le f(x)$ for all $x$. Setting $x = 2x^0$ yields $\langle p, x^0\rangle + f(x^0) \le 2f(x^0)$, i.e., $\langle p, x^0\rangle \le f(x^0)$; then setting $x = 0$ yields $\langle p, x^0\rangle \ge f(x^0)$; hence $\langle p, x^0\rangle = f(x^0)$. (Note that this condition is trivial and can be omitted if $x^0 = 0$.) Furthermore, $\langle p, x - x^0\rangle = \langle p, x\rangle - \langle p, x^0\rangle = \langle p, x\rangle - f(x^0)$, hence $\langle p, x\rangle \le f(x)$ for all $x$. Conversely, if $p$ belongs to the set on the right-hand side of (2.34), then obviously $\langle p, x - x^0\rangle \le f(x) - f(x^0)$, so $p \in \partial f(x^0)$. $\square$
If, in addition, $f(-x) = f(x) \ge 0$ for all $x$, then the condition $\langle p, x\rangle \le f(x)\ \forall x$ is equivalent to $|\langle p, x\rangle| \le f(x)\ \forall x$. In particular:
1. If $f(x) = \|x\|$ (Euclidean norm), then
\[ \partial f(x^0) = \begin{cases} \{p \mid \|p\| \le 1\} \text{ (unit ball)} & \text{if } x^0 = 0 \\ \{x^0/\|x^0\|\} & \text{if } x^0 \neq 0. \end{cases} \tag{2.35} \]
2. If $f(x) = \max\{|x_i| \mid i = 1, \ldots, n\}$ (Tchebycheff norm), then
\[ \partial f(x^0) = \begin{cases} \mathrm{conv}\{\pm e^1, \ldots, \pm e^n\} & \text{if } x^0 = 0 \\ \mathrm{conv}\{(\mathrm{sign}\, x^0_i)e^i \mid i \in I_{x^0}\} & \text{if } x^0 \neq 0, \end{cases} \tag{2.36} \]
where $I_{x^0} = \{i \mid |x^0_i| = f(x^0)\}$ and $e^i$ denotes the $i$-th unit vector.
3. If $Q$ is a symmetric positive semidefinite matrix and $f(x) = \sqrt{\langle x, Qx\rangle}$ (elliptic norm), then
\[ \partial f(x^0) = \begin{cases} \{p \mid \langle p, x\rangle \le \sqrt{\langle x, Qx\rangle}\ \forall x\} & \text{if } x^0 \in \mathrm{Ker}\, Q \\ \{(Qx^0)/\sqrt{\langle x^0, Qx^0\rangle}\} & \text{if } x^0 \notin \mathrm{Ker}\, Q. \end{cases} \tag{2.37} \]
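Formulas (2.35) and (2.36) can be sanity-checked against the defining inequality (2.32). The following sketch is our own illustration (assumed data; NumPy assumed available):

```python
import numpy as np

# Illustrative check of (2.35)-(2.36): a subgradient of the Euclidean and
# Tchebycheff norms at x0 != 0, tested against <p, x - x0> + f(x0) <= f(x).
def subgrad_euclid(x0):
    return x0 / np.linalg.norm(x0)

def subgrad_cheby(x0):
    i = int(np.argmax(np.abs(x0)))     # one active index from I_{x0}
    p = np.zeros_like(x0)
    p[i] = np.sign(x0[i])              # a vertex of conv{(sign x0_i) e^i}
    return p

rng = np.random.default_rng(1)
x0 = rng.normal(size=4)
for f, sg in [(np.linalg.norm, subgrad_euclid),
              (lambda x: np.max(np.abs(x)), subgrad_cheby)]:
    p = sg(x0)
    for _ in range(1000):
        x = rng.normal(size=4)
        assert p @ (x - x0) + f(x0) <= f(x) + 1e-12
print("subgradient inequality verified for both norms")
```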

Example 2.2 (Distance Function) Let $C$ be a closed convex set in $\mathbb{R}^n$, and $f(x) = \min\{\|y - x\| \mid y \in C\}$. Denote by $\pi_C(x)$ the projection of $x$ on $C$, so that $\|\pi_C(x) - x\| = \min\{\|y - x\| \mid y \in C\}$ and $\langle x - \pi_C(x), y - \pi_C(x)\rangle \le 0$ for all $y \in C$ (see Proposition 1.15). Then
\[ \partial f(x^0) = \begin{cases} N_C(x^0) \cap B(0, 1) & \text{if } x^0 \in C \\ \left\{\dfrac{x^0 - \pi_C(x^0)}{\|x^0 - \pi_C(x^0)\|}\right\} & \text{if } x^0 \notin C, \end{cases} \tag{2.38} \]
where $N_C(x^0)$ denotes the outward normal cone of $C$ at $x^0$ and $B(0, 1)$ the Euclidean unit ball.

Proof Let $x^0 \in C$, so that $f(x^0) = 0$. Then $p \in \partial f(x^0)$ implies $\langle p, x - x^0\rangle \le f(x)$ for all $x$; hence, in particular, $\langle p, x - x^0\rangle \le 0$ for all $x \in C$, i.e., $p \in N_C(x^0)$; furthermore, $\langle p, x - x^0\rangle \le f(x) \le \|x - x^0\|$ for all $x$, hence $\|p\| \le 1$, i.e., $p \in B(0, 1)$. Conversely, if $p \in N_C(x^0) \cap B(0, 1)$, then $\langle p, x - \pi_C(x)\rangle \le \|x - \pi_C(x)\| = f(x)$ and $\langle p, \pi_C(x) - x^0\rangle \le 0$; consequently $\langle p, x - x^0\rangle = \langle p, x - \pi_C(x)\rangle + \langle p, \pi_C(x) - x^0\rangle \le f(x) = f(x) - f(x^0)$ for all $x$, and so $p \in \partial f(x^0)$.
Turning to the case $x^0 \notin C$, observe that $p \in \partial f(x^0)$ implies $\langle p, x - x^0\rangle + f(x^0) \le f(x)$ for all $x$; hence, setting $x = \pi_C(x^0)$ yields $\langle p, \pi_C(x^0) - x^0\rangle + \|\pi_C(x^0) - x^0\| \le 0$, i.e., $\langle p, x^0 - \pi_C(x^0)\rangle \ge \|x^0 - \pi_C(x^0)\|$. On the other hand, setting $x = 2x^0 - \pi_C(x^0)$ yields $\langle p, x^0 - \pi_C(x^0)\rangle + \|\pi_C(x^0) - x^0\| \le 2\|\pi_C(x^0) - x^0\|$, i.e., $\langle p, x^0 - \pi_C(x^0)\rangle \le \|\pi_C(x^0) - x^0\|$. Thus, $\langle p, x^0 - \pi_C(x^0)\rangle = \|\pi_C(x^0) - x^0\|$ and consequently $p = \frac{x^0 - \pi_C(x^0)}{\|x^0 - \pi_C(x^0)\|}$. Conversely, the last equality implies $\langle p, x^0 - \pi_C(x^0)\rangle = \|x^0 - \pi_C(x^0)\| = f(x^0)$ and $\langle p, x - \pi_C(x)\rangle \le \|x - \pi_C(x)\| = f(x)$; hence $\langle p, x - x^0\rangle + f(x^0) = \langle p, x - x^0\rangle + \langle p, x^0 - \pi_C(x^0)\rangle = \langle p, x - \pi_C(x^0)\rangle = \langle p, x - \pi_C(x)\rangle + \langle p, \pi_C(x) - \pi_C(x^0)\rangle \le \|x - \pi_C(x)\| = f(x)$ for all $x$ (note that $\langle p, \pi_C(x) - \pi_C(x^0)\rangle \le 0$ because $p \in N_C(\pi_C(x^0))$). Therefore, $\langle p, x - x^0\rangle + f(x^0) \le f(x)$ for all $x$, proving that $p \in \partial f(x^0)$. $\square$
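For a set with an explicit projection, such as a box, formula (2.38) is easy to test numerically. The sketch below is our own assumed example (a box $C = [l, u]$, projection by clipping):

```python
import numpy as np

# Sketch of (2.38) for a box C = [l, u] (assumed example): outside C the
# distance function has the single subgradient (x0 - pi_C(x0)) / dist(x0).
l, u = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
proj = lambda x: np.clip(x, l, u)                 # projection onto the box
dist = lambda x: np.linalg.norm(x - proj(x))

x0 = np.array([3.0, 0.5])                         # a point outside C
p = (x0 - proj(x0)) / dist(x0)

rng = np.random.default_rng(2)
xs = rng.uniform(-4.0, 4.0, size=(1000, 2))
assert all(p @ (x - x0) + dist(x0) <= dist(x) + 1e-12 for x in xs)
print("distance-function subgradient verified:", p)
```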
Observe from the above examples that there is a unique subgradient (which is then just the gradient) at every point where $f$ is differentiable. This is actually a general fact which we are now going to establish.
Let $f: \mathbb{R}^n \to [-\infty, +\infty]$ be any function and let $x^0$ be a point where $f$ is finite. If for some $u \neq 0$ the limit (finite or infinite)
\[ \lim_{\lambda \downarrow 0} \frac{f(x^0 + \lambda u) - f(x^0)}{\lambda} \]
exists, then it is called the directional derivative of $f$ at $x^0$ in the direction $u$ and is denoted by $f'(x^0; u)$.
Proposition 2.20 Let $f$ be a proper convex function and $x^0 \in \mathrm{dom} f$. Then:
(i) $f'(x^0; u)$ exists for every direction $u$ and satisfies
\[ f'(x^0; u) = \inf_{\lambda > 0} \frac{f(x^0 + \lambda u) - f(x^0)}{\lambda}; \tag{2.39} \]
(ii) The function $u \mapsto f'(x^0; u)$ is convex and positively homogeneous, and $p \in \partial f(x^0)$ if and only if
\[ \langle p, u\rangle \le f'(x^0; u) \quad \forall u. \tag{2.40} \]
(iii) If $f$ is continuous at $x^0$, then $f'(x^0; u)$ is finite and continuous at every $u \in \mathbb{R}^n$, the subdifferential $\partial f(x^0)$ is compact, and
\[ f'(x^0; u) = \max\{\langle p, u\rangle \mid p \in \partial f(x^0)\}. \tag{2.41} \]

Proof
(i) For any given $u \neq 0$ the function $\varphi(\lambda) = f(x^0 + \lambda u)$ is proper convex on the real line, and $0 \in \mathrm{dom}\,\varphi$. Therefore, its right derivative $\varphi'_+(0) = f'(x^0; u)$ exists (but may equal $+\infty$ if 0 is an endpoint of $\mathrm{dom}\,\varphi$). The relation (2.39) follows from the fact that $[\varphi(\lambda) - \varphi(0)]/\lambda$ is nonincreasing as $\lambda \downarrow 0$.
(ii) The positive homogeneity of $f'(x^0; u)$ is obvious. The convexity then follows from the relations
\[ f'(x^0; u + v) = \inf_{\lambda > 0} \frac{f\big(x^0 + \frac{\lambda}{2}(u + v)\big) - f(x^0)}{\lambda/2} \le \inf_{\lambda > 0} \frac{[f(x^0 + \lambda u) - f(x^0)] + [f(x^0 + \lambda v) - f(x^0)]}{\lambda} = f'(x^0; u) + f'(x^0; v). \]
Setting $x = x^0 + \lambda u$, we can turn the subgradient inequality (2.32) into the condition
\[ \langle p, u\rangle \le [f(x^0 + \lambda u) - f(x^0)]/\lambda \quad \forall u,\ \forall \lambda > 0, \]
which is equivalent to $\langle p, u\rangle \le \inf_{\lambda > 0}[f(x^0 + \lambda u) - f(x^0)]/\lambda$ for all $u$, i.e., by (i), $\langle p, u\rangle \le f'(x^0; u)$ for all $u$.
(iii) If $f$ is continuous at $x^0$, then there is a neighborhood $U$ of 0 such that $f(x^0 + u)$ is bounded above on $U$. Since by (i) $f'(x^0; u) \le f(x^0 + u) - f(x^0)$, it follows that $f'(x^0; u)$ is also bounded above on $U$, and hence is finite and continuous on $\mathbb{R}^n$ (Theorem 2.2). The condition (2.40) then implies that $\partial f(x^0)$ is closed, and hence compact because it is bounded by Theorem 2.6. In view of the positive homogeneity of $f'(x^0; u)$, an affine minorant of it which is exact at some point must be of the form $\langle p, u\rangle$, with $\langle p, u\rangle \le f'(x^0; u)$ for all $u$, i.e., by (ii), $p \in \partial f(x^0)$. By Corollary 2.9, we then have $f'(x^0; u) = \max\{\langle p, u\rangle \mid p \in \partial f(x^0)\}$. $\square$
According to the usual definition, a function $f$ is differentiable at a point $x^0$ if there exists a vector $\nabla f(x^0)$ (the gradient of $f$ at $x^0$) such that
\[ f(x^0 + u) = f(x^0) + \langle \nabla f(x^0), u\rangle + o(\|u\|). \]
This is equivalent to
\[ \lim_{\lambda \downarrow 0} \frac{f(x^0 + \lambda u) - f(x^0)}{\lambda} = \langle \nabla f(x^0), u\rangle \quad \forall u \neq 0, \]
so the directional derivative $f'(x^0; u)$ exists and is a linear function of $u$.

Proposition 2.21 Let $f$ be a proper convex function and $x^0 \in \mathrm{dom} f$. If $f$ is differentiable at $x^0$, then $\nabla f(x^0)$ is its unique subgradient at $x^0$.

Proof If $f$ is differentiable at $x^0$, then $f'(x^0; u) = \langle \nabla f(x^0), u\rangle$, so by (ii) of Proposition 2.20, a vector $p$ is a subgradient of $f$ at $x^0$ if and only if $\langle p, u\rangle \le \langle \nabla f(x^0), u\rangle$ for all $u$, i.e., if and only if $p = \nabla f(x^0)$. $\square$

One can prove conversely that if $f$ has a unique subgradient at $x^0$, then $f$ is differentiable at $x^0$ (see, e.g., Rockafellar 1970).
2.8 Subdifferential Calculus

A convex function $f$ may result from some operations on convex functions $f_i$, $i \in I$ (cf. Sect. 2.2). It is important to know how the subdifferential of $f$ can be computed in terms of the subdifferentials of the $f_i$'s.

Proposition 2.22 Let $f_i$, $i = 1, \ldots, m$, be proper convex functions on $\mathbb{R}^n$. Then for every $x \in \mathbb{R}^n$:
\[ \partial\Big(\sum_{i=1}^m f_i\Big)(x) \supset \sum_{i=1}^m \partial f_i(x). \]
If there exists a point $a \in \bigcap_{i=1}^m \mathrm{dom} f_i$ at which every function $f_i$, except perhaps one, is continuous, then the above inclusion is in fact an equality for every $x \in \mathbb{R}^n$.
Proof It suffices to prove the proposition for $m = 2$ because the general case then follows by induction. Furthermore, the first part is straightforward, so we only need to prove the second part. If $p \in \partial(f_1 + f_2)(x^0)$, then the system
\[ x - y = 0, \qquad f_1(x) + f_2(y) - f_1(x^0) - f_2(x^0) - \langle p, x - x^0\rangle < 0 \]
is inconsistent. Define $D = \mathrm{dom} f_1 \times \mathrm{dom} f_2$ and $A(x, y) := x - y$. By hypothesis, $f_1$ is continuous at $a \in \mathrm{dom} f_1 \cap \mathrm{dom} f_2$, so there is a ball $U$ around 0 such that $a + U \subset \mathrm{dom} f_1$, hence $U = (a + U) - a \subset \mathrm{dom} f_1 - \mathrm{dom} f_2 = A(D)$, i.e., $0 \in \mathrm{int}\, A(D)$. Therefore, by Theorem 2.4 there exists $t \in \mathbb{R}^n$ such that
\[ \langle t, x - y\rangle + f_1(x) + f_2(y) - f_1(x^0) - f_2(x^0) - \langle p, x - x^0\rangle \ge 0 \]
for all $x \in \mathbb{R}^n$ and all $y \in \mathbb{R}^n$. Setting $y = x^0$ yields $\langle p - t, x - x^0\rangle \le f_1(x) - f_1(x^0)$ for all $x \in \mathbb{R}^n$, i.e., $p - t \in \partial f_1(x^0)$. Then setting $x = x^0$ yields $\langle t, y - x^0\rangle \le f_2(y) - f_2(x^0)$ for all $y \in \mathbb{R}^n$, i.e., $t \in \partial f_2(x^0)$. Thus, $p = (p - t) + t \in \partial f_1(x^0) + \partial f_2(x^0)$, as was to be proved. $\square$
Proposition 2.23 Let $A: \mathbb{R}^n \to \mathbb{R}^m$ be a linear mapping and $g$ a proper convex function on $\mathbb{R}^m$. Then for every $x \in \mathbb{R}^n$:
\[ A^T\partial g(Ax) \subset \partial(g \circ A)(x). \]
If $g$ is continuous at some point in $\mathrm{Im}(A)$ (the range of $A$), then the above inclusion is in fact an equality for every $x \in \mathbb{R}^n$.

Proof The first part is straightforward. To prove the second part, consider any $p \in \partial(g \circ A)(x^0)$. Then the system
\[ Ax - y = 0, \quad g(y) - g(Ax^0) - \langle p, x - x^0\rangle < 0, \quad x \in \mathbb{R}^n,\ y \in \mathbb{R}^m \]
is inconsistent. Define $D = \mathbb{R}^n \times \mathrm{dom}\, g$, $B(x, y) = Ax - y$. Since there is a point $b \in \mathrm{Im} A \cap \mathrm{int}(\mathrm{dom}\, g)$, we have $b \in \mathrm{int}\, B(D)$, so by Theorem 2.4 there exists $t \in \mathbb{R}^m$ such that
\[ \langle t, Ax - y\rangle + g(y) - g(Ax^0) - \langle p, x - x^0\rangle \ge 0 \]
for all $x \in \mathbb{R}^n$ and all $y \in \mathbb{R}^m$. Setting $y = Ax^0$ then yields $\langle A^T t - p, x - x^0\rangle \ge 0$ for all $x \in \mathbb{R}^n$, hence $p = A^T t$; while setting $x = x^0$ yields $\langle t, y - Ax^0\rangle \le g(y) - g(Ax^0)$, i.e., $t \in \partial g(Ax^0)$. Therefore, $p \in A^T\partial g(Ax^0)$. $\square$
Proposition 2.24 Let $g(x) = (g_1(x), \ldots, g_m(x))$, where each $g_i$ is a convex function from $\mathbb{R}^n$ to $\mathbb{R}$, and let $\varphi: \mathbb{R}^m \to \mathbb{R}$ be a convex function which is component-wise increasing, i.e., such that $\varphi(t) \le \varphi(t')$ whenever $t_i \le t_i'$, $i = 1, \ldots, m$. Then the function $f = \varphi \circ g$ is convex and
\[ \partial f(x) = \left\{\sum_{i=1}^m s_i p^i \;\Big|\; p^i \in \partial g_i(x),\ (s_1, \ldots, s_m) \in \partial\varphi(g(x))\right\}. \tag{2.42} \]

Proof The convexity of $f(x)$ follows from an obvious extension of Proposition 2.8, which corresponds to the case $m = 1$. To prove (2.42), let $p = \sum_{i=1}^m s_i p^i$ with $p^i \in \partial g_i(x^0)$, $s \in \partial\varphi(g(x^0))$. First observe that $\langle s, y - g(x^0)\rangle \le \varphi(y) - \varphi(g(x^0))$ for all $y \in \mathbb{R}^m$ implies, for all $y = g(x^0) + u$ with $u \le 0$: $\langle s, u\rangle \le \varphi(g(x^0) + u) - \varphi(g(x^0)) \le 0$ for all $u \le 0$, hence $s \ge 0$. Now $\langle p, x - x^0\rangle = \sum_{i=1}^m s_i\langle p^i, x - x^0\rangle \le \sum_{i=1}^m s_i[g_i(x) - g_i(x^0)] = \langle s, g(x) - g(x^0)\rangle \le \varphi(g(x)) - \varphi(g(x^0)) = f(x) - f(x^0)$ for all $x \in \mathbb{R}^n$. Therefore $p \in \partial f(x^0)$, i.e., the right-hand side of (2.42) is contained in the left-hand side. To prove the converse, let $p \in \partial f(x^0)$, so that the system
\[ x \in \mathbb{R}^n,\ y \in \mathbb{R}^m, \quad g_i(x) < y_i, \quad i = 1, \ldots, m \tag{2.43} \]
\[ \varphi(y) - \varphi(g(x^0)) - \langle p, x - x^0\rangle < 0 \tag{2.44} \]
is inconsistent, while the system (2.43) has a solution. By Proposition 2.18 there exists $s \in \mathbb{R}^m_+$ such that
\[ \varphi(y) - \varphi(g(x^0)) - \langle p, x - x^0\rangle + \langle s, g(x) - y\rangle \ge 0 \]
for all $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$. Setting $x = x^0$ yields $\varphi(y) - \varphi(g(x^0)) \ge \langle s, y - g(x^0)\rangle$ for all $y \in \mathbb{R}^m$, which means that $s \in \partial\varphi(g(x^0))$. On the other hand, setting $y = g(x^0)$ yields $\langle p, x - x^0\rangle \le \sum_{i=1}^m s_i[g_i(x) - g_i(x^0)]$ for all $x \in \mathbb{R}^n$, which means that $p \in \partial(\sum_{i=1}^m s_i g_i)(x^0)$; hence by Proposition 2.22, $p = \sum_{i=1}^m s_i p^i$ with $p^i \in \partial g_i(x^0)$. $\square$

Note that when $\varphi(y)$ is differentiable at $g(x)$ the above formula (2.42) is similar to the classical chain rule, namely:
\[ \partial(\varphi \circ g)(x) = \sum_{i=1}^m \frac{\partial\varphi}{\partial y_i}(g(x))\,\partial g_i(x). \]
Proposition 2.25 Let $f(x) = \max\{g_1(x), \ldots, g_m(x)\}$, where each $g_i$ is a convex function from $\mathbb{R}^n$ to $\mathbb{R}$. Then
\[ \partial f(x) = \mathrm{conv}\{\cup\partial g_i(x) \mid i \in I(x)\}, \tag{2.45} \]
where $I(x) = \{i \mid f(x) = g_i(x)\}$.

Proof If $p \in \partial f(x^0)$ then the system
\[ g_i(x) - f(x^0) - \langle p, x - x^0\rangle < 0, \quad i = 1, \ldots, m \]
is inconsistent. By Proposition 2.18, there exist $\lambda_i \ge 0$ such that $\sum_{i=1}^m \lambda_i = 1$ and $\sum_{i=1}^m \lambda_i[g_i(x) - f(x^0) - \langle p, x - x^0\rangle] \ge 0$ for all $x$. Setting $x = x^0$, we have
\[ \sum_{i \notin I(x^0)} \lambda_i[g_i(x^0) - f(x^0)] \ge 0, \]
with $g_i(x^0) - f(x^0) < 0$ for every $i \notin I(x^0)$. This implies that $\lambda_i = 0$ for all $i \notin I(x^0)$. Hence
\[ \sum_{i \in I(x^0)} \lambda_i[g_i(x) - g_i(x^0)] - \langle p, x - x^0\rangle \ge 0 \]
for all $x \in \mathbb{R}^n$, and so $p \in \partial(\sum_{i \in I(x^0)} \lambda_i g_i)(x^0)$. By Proposition 2.22, $p = \sum_{i \in I(x^0)} \lambda_i p^i$ with $p^i \in \partial g_i(x^0)$. Thus $\partial f(x^0) \subset \mathrm{conv}\{\cup\partial g_i(x^0) \mid i \in I(x^0)\}$. The converse inclusion can be verified in a straightforward manner. $\square$
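For a pointwise maximum of affine pieces, formula (2.45) says that any convex combination of the active gradients is a subgradient. The sketch below is our own assumed example:

```python
import numpy as np

# Illustrative check of Proposition 2.25 for affine pieces (assumed data):
# f(x) = max_i <a_i, x> + b_i; any convex combination of the gradients a_i
# of the active pieces i in I(x0) is a subgradient of f at x0.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.0, -0.5])
f = lambda x: np.max(A @ x + b)

x0 = np.zeros(2)
active = np.isclose(A @ x0 + b, f(x0))   # pieces 1 and 2 are active at x0
p = np.array([0.3, 0.7]) @ A[active]     # convex combination of active gradients

rng = np.random.default_rng(3)
assert all(p @ (x - x0) + f(x0) <= f(x) + 1e-12
           for x in rng.normal(size=(1000, 2)))
print("max-function subgradient verified:", p)
```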
2.9 Approximate Subdifferential

A proper convex function $f$ on $\mathbb{R}^n$ may have an empty subdifferential at certain points. In practice, however, one often needs only a concept of approximate subdifferential.
Given a positive number $\varepsilon > 0$, a vector $p \in \mathbb{R}^n$ is called an $\varepsilon$-subgradient of $f$ at a point $x^0$ if
\[ \langle p, x - x^0\rangle + f(x^0) \le f(x) + \varepsilon \quad \forall x. \tag{2.46} \]
The set of all $\varepsilon$-subgradients of $f$ at $x^0$ is called the $\varepsilon$-subdifferential of $f$ at $x^0$ and is denoted by $\partial_\varepsilon f(x^0)$ (Fig. 2.3).

Fig. 2.3 The set $\partial_\varepsilon f(x^0)$
Proposition 2.26 For any proper closed convex function $f$ on $\mathbb{R}^n$ and any given $\varepsilon > 0$, the $\varepsilon$-subdifferential of $f$ at any point $x^0 \in \mathrm{dom} f$ is nonempty. If $C$ is a bounded subset of $\mathrm{int}(\mathrm{dom} f)$, then the set $\bigcup_{x \in C} \partial_\varepsilon f(x)$ is bounded.

Proof Since the point $(x^0, f(x^0) - \varepsilon) \notin \mathrm{epi} f$, there exists $p \in \mathbb{R}^n$ such that $\langle p, x - x^0\rangle < f(x) - (f(x^0) - \varepsilon)$ for all $x$ (see (2.30) in the proof of Theorem 2.5). Thus, $\partial_\varepsilon f(x^0) \neq \emptyset$. The proof of the second part of the proposition is analogous to that of Theorem 2.6. $\square$

Note that $\partial_\varepsilon f(x^0)$ is unbounded when $x^0$ is a boundary point of $\mathrm{dom} f$.

Proposition 2.27 Let $x^0, x^1 \in \mathrm{dom} f$. If $p \in \partial_\varepsilon f(x^0)$ ($\varepsilon \ge 0$), then $p \in \partial_\eta f(x^1)$ for
\[ \eta = f(x^1) - f(x^0) - \langle p, x^1 - x^0\rangle + \varepsilon \ge 0. \]

Proof If $\langle p, x - x^0\rangle \le f(x) - f(x^0) + \varepsilon$ for all $x$, then $\langle p, x - x^1\rangle = \langle p, x - x^0\rangle + \langle p, x^0 - x^1\rangle \le f(x) - f(x^0) + \varepsilon - \langle p, x^1 - x^0\rangle = f(x) - f(x^1) + \eta$ for all $x$. $\square$
A function $f(x)$ is said to be strongly convex on a convex set $C$ if there exists $r > 0$ such that
\[ f((1-\lambda)x^1 + \lambda x^2) \le (1-\lambda)f(x^1) + \lambda f(x^2) - \lambda(1-\lambda)r\|x^1 - x^2\|^2 \tag{2.47} \]
for all $x^1, x^2 \in C$ and all $\lambda \in [0, 1]$. The number $r > 0$ is then called the modulus of strong convexity of $f(x)$. Using the identity
\[ \lambda(1-\lambda)\|x^1 - x^2\|^2 = (1-\lambda)\|x^1\|^2 + \lambda\|x^2\|^2 - \|(1-\lambda)x^1 + \lambda x^2\|^2 \]
for all $x^1, x^2 \in \mathbb{R}^n$ and all $\lambda \in [0, 1]$, it is easily verified that a convex function $f(x)$ is strongly convex with modulus of strong convexity $r$ if and only if the function $f(x) - r\|x\|^2$ is convex.

Proposition 2.28 If $f(x)$ is a strongly convex function on $\mathbb{R}^n$ with modulus of strong convexity $r$, then for any $x^0 \in \mathbb{R}^n$ and $\varepsilon \ge 0$:
\[ \partial f(x^0) + B(0, 2\sqrt{r\varepsilon}) \subset \partial_\varepsilon f(x^0), \tag{2.48} \]
where $B(0, \rho)$ denotes the ball of radius $\rho$ around 0.

Proof Let $p \in \partial f(x^0)$. Since $F(x) := f(x) - r\|x\|^2$ is convex and $p - 2rx^0 \in \partial F(x^0)$, we can write
\[ f(x) - f(x^0) - r(\|x\|^2 - \|x^0\|^2) \ge \langle p - 2rx^0, x - x^0\rangle \]
for all $x$, hence
\[ f(x) - f(x^0) \ge \langle p, x - x^0\rangle + r\|x - x^0\|^2. \]
Let us determine a vector $u$ such that
\[ r\|x - x^0\|^2 - \langle u, x - x^0\rangle \ge -\varepsilon \quad \forall x \in \mathbb{R}^n. \tag{2.49} \]
The convex quadratic function $r\|x - x^0\|^2 - \langle u, x - x^0\rangle$ achieves its minimum at the point $x$ such that $2r(x - x^0) - u = 0$, i.e., $x - x^0 = \frac{u}{2r}$. This minimum is equal to $r\left\|\frac{u}{2r}\right\|^2 - \frac{\|u\|^2}{2r} = -\frac{\|u\|^2}{4r}$. Thus, by choosing $u$ such that $\|u\|^2 \le 4r\varepsilon$, i.e., $u \in B(0, 2\sqrt{r\varepsilon})$, we will have (2.49), hence $p + u \in \partial_\varepsilon f(x^0)$. $\square$
Corollary 2.11 Let $f(x) = \frac{1}{2}\langle x, Qx\rangle + \langle x, a\rangle$, where $a \in \mathbb{R}^n$ and $Q$ is an $n \times n$ symmetric positive definite matrix. Let $2r > 0$ be the smallest eigenvalue of $Q$. Then
\[ Qx + a + u \in \partial_\varepsilon f(x) \]
for any $u \in \mathbb{R}^n$ such that $\|u\| \le 2\sqrt{r\varepsilon}$.

Proof Clearly $f(x) - r\|x\|^2$ is convex (since the smallest eigenvalue of $Q - 2rI$ is nonnegative), so $f(x)$ is a strongly convex function with modulus $r$, to which the above proposition applies. $\square$
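Corollary 2.11 lends itself to a direct numerical check. The sketch below is our own illustration (assumed data; we take the strong convexity modulus $r$ as half the smallest eigenvalue of $Q$, consistent with the corollary):

```python
import numpy as np

# Illustrative check of Corollary 2.11 (assumed data): for f(x) = 0.5<x,Qx> + <a,x>,
# any Q x0 + a + u with ||u|| <= 2 sqrt(r eps), where 2r = lambda_min(Q),
# is an eps-subgradient of f at x0.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
a = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ Q @ x + a @ x
r = np.linalg.eigvalsh(Q)[0] / 2     # modulus: f - r||x||^2 is convex
eps = 0.1

rng = np.random.default_rng(4)
x0 = np.array([0.3, -0.2])
u = rng.normal(size=2)
u *= 2 * np.sqrt(r * eps) / np.linalg.norm(u)   # on the boundary of the ball
p = Q @ x0 + a + u
assert all(p @ (x - x0) + f(x0) <= f(x) + eps + 1e-12
           for x in rng.normal(size=(2000, 2)))
print("eps-subgradient verified")
```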
2.10 Conjugate Functions

Given an arbitrary function $f: \mathbb{R}^n \to [-\infty, +\infty]$, we consider the set of all affine functions $h$ minorizing $f$. It is natural to restrict ourselves to proper functions, because an improper function either has no affine minorant (if $f(x) = -\infty$ for some $x$) or is minorized by every affine function (if $f(x)$ is identically $+\infty$). Observe that if $\langle p, x\rangle - \alpha \le f(x)$ for all $x$, then $\alpha \ge \langle p, x\rangle - f(x)$ for all $x$. The function
\[ f^*(p) = \sup_{x \in \mathbb{R}^n}\{\langle p, x\rangle - f(x)\}, \tag{2.50} \]
which is clearly closed and convex, is called the conjugate of $f$.
For instance, the conjugate of the function $f(x) = \delta_C(x)$ (indicator function of a set $C$, see Sect. 2.1) is the function $f^*(p) = \sup_{x \in C}\langle p, x\rangle = s_C(p)$ (support function of $C$). The conjugate of an affine function $f(x) = \langle c, x\rangle - \alpha$ is the function
\[ f^*(p) = \sup_x\{\langle p, x\rangle - \langle c, x\rangle + \alpha\} = \begin{cases} +\infty, & p \neq c \\ \alpha, & p = c. \end{cases} \]
Two less trivial examples are the following:
Example 2.3 The conjugate of the proper convex function $f(x) = e^x$, $x \in \mathbb{R}$, is by definition $f^*(p) = \sup_x\{px - e^x\}$. Obviously, $f^*(p) = 0$ for $p = 0$ and $f^*(p) = +\infty$ for $p < 0$. For $p > 0$, the function $px - e^x$ achieves a maximum at $x = \xi$ satisfying $p = e^\xi$, so $f^*(p) = p\log p - p$. Thus,
\[ f^*(p) = \begin{cases} 0, & p = 0 \\ +\infty, & p < 0 \\ p\log p - p, & p > 0. \end{cases} \]
The conjugate of $f^*(p)$ is in turn the function $f^{**}(x) = \sup_p\{px - f^*(p)\} = \sup\{px - p\log p + p \mid p > 0\} = e^x$.
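The supremum in (2.50) can also be approximated on a grid, which gives a quick numerical confirmation of the closed form above. The following is our own sketch (grid range and step are assumptions):

```python
import numpy as np

# A small numerical Legendre transform (illustrative): f*(p) = sup_x (p x - e^x)
# approximated on a grid and compared with the closed form p log p - p, p > 0.
xs = np.linspace(-20.0, 5.0, 200001)
conj = lambda p: np.max(p * xs - np.exp(xs))

for p in [0.5, 1.0, 2.0, 3.0]:
    print(p, conj(p), p * np.log(p) - p)   # grid value vs. exact conjugate
```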
Example 2.4 The conjugate of the function $f(x) = \frac{1}{\alpha}\sum_{i=1}^n |x_i|^\alpha$, $1 < \alpha < +\infty$, is
\[ f^*(p) = \frac{1}{\beta}\sum_{i=1}^n |p_i|^\beta, \quad 1 < \beta < +\infty, \]
where $1/\alpha + 1/\beta = 1$. Indeed,
\[ f^*(p) = \sup_x\left\{\sum_{i=1}^n p_i x_i - \frac{1}{\alpha}\sum_{i=1}^n |x_i|^\alpha\right\}. \]
By differentiation, we find that the supremum on the right-hand side is achieved at $x = \xi$ satisfying $p_i = |\xi_i|^{\alpha-1}\mathrm{sign}\,\xi_i$, hence
\[ f^*(p) = \sum_{i=1}^n |\xi_i|^\alpha\left(1 - \frac{1}{\alpha}\right) = \frac{1}{\beta}\sum_{i=1}^n |p_i|^\beta. \]
Proposition 2.29 Let $f: \mathbb{R}^n \to [-\infty, +\infty]$ be an arbitrary proper function. Then:
(i) $f(x) + f^*(p) \ge \langle p, x\rangle$ for all $x \in \mathbb{R}^n$, $p \in \mathbb{R}^n$;
(ii) $f^{**}(x) \le f(x)$ for all $x$, and $f^{**} = f$ if and only if $f$ is convex and closed;
(iii) $f^{**}(x) = \sup\{h(x) \mid h \text{ affine},\ h \le f\}$, i.e., $f^{**}(x)$ is the largest closed convex function minorizing $f(x)$: $f^{**} = \mathrm{cl}(\mathrm{conv})f$.

Proof (i) is obvious, and from (i), $f^{**}(x) = \sup_p\{\langle p, x\rangle - f^*(p)\} \le f(x)$. Let $Q$ be the set of all affine functions majorized by $f$. For every $h \in Q$, say $h(x) = \langle p, x\rangle - \alpha$, we have $\langle p, x\rangle - \alpha \le f(x)$ for all $x$, hence $\alpha \ge \sup_x\{\langle p, x\rangle - f(x)\} = f^*(p)$, and consequently $h(x) \le \langle p, x\rangle - f^*(p)$ for all $x$. Thus
\[ \sup\{h \mid h \in Q\} \le \sup_p\{\langle p, x\rangle - f^*(p)\} = f^{**}. \tag{2.51} \]
If $f$ is convex and closed then, since it is proper, by Theorem 2.5 it is just equal to the function on the left-hand side of (2.51), and since $f \ge f^{**}$, it follows that $f = f^{**}$. Conversely, if $f = f^{**}$ then $f$ is the conjugate of $f^*$, hence is convex and closed. Turning to (iii), observe that if $Q = \emptyset$ then $f^*(p) = \sup_x\{\langle p, x\rangle - f(x)\} = +\infty$ for all $p$ and consequently $f^{**} \equiv -\infty$. But in this case $\sup\{h \mid h \in Q\} = -\infty$, too; hence $f^{**} = \sup\{h \mid h \in Q\}$. On the other hand, if there is $h \in Q$ then from (2.51), $h \le f^{**}$; conversely, if $h \le f^{**}$ then $h \le f$ (because $f^{**} \le f$). In this case, since $f^{**}(x) \le f(x) < +\infty$ at least for some $x$, and $f^{**}(x) > -\infty$ for all $x$ (because there is an affine function minorizing $f^{**}$), it follows that $f^{**}$ is proper. By Theorem 2.5, then, $f^{**} = \sup\{h \mid h \text{ affine},\ h \le f^{**}\} = \sup\{h \mid h \in Q\}$. This proves the equality in (iii), and so, by Corollary 2.8, $f^{**} = \mathrm{cl}(\mathrm{conv})f$. $\square$

The smallest and the largest values of a convex function on a given convex set are
often of particular interest.
Let f W Rn ! 1; C1 be an arbitrary function, and C an arbitrary set
in Rn : A point x0 2 C \ domf is called a global minimizer of f .x/ on C if
1 < f .x0 /  f .x/ for all x 2 C: It is called a local minimizer of f .x/ on
C if there exists a neighborhood U.x0 / of x0 such that 1 < f .x0 /  f .x/ for
all x 2 C \ U.x0 /: The concepts of global maximizer and local maximizer are
defined analogously. For an arbitrary function f on a set C we denote the set of all
global minimizers (maximizers) of f on C by argminx2Cf .x/ (argmaxx2Cf .x/; resp.).
Since minx2C f .x/ D  maxx2C .f .x// the theory of the minimum (maximum) of
a convex function is the same as the theory of the maximum (minimum, resp.) of a
concave function.

2.11.1 Minimum

Proposition 2.30 Let $C$ be a nonempty convex set in $\mathbb{R}^n$ and $f: \mathbb{R}^n \to \mathbb{R}$ a convex function. Any local minimizer of $f$ on $C$ is also global. The set $\mathrm{argmin}_{x \in C} f(x)$ is a convex subset of $C$.

Proof Let $x^0 \in C$ be a local minimizer of $f$ and $U(x^0)$ a neighborhood such that $f(x^0) \le f(x)$ for all $x \in C \cap U(x^0)$. For any $x \in C$ we have $x_\lambda := (1-\lambda)x^0 + \lambda x \in C \cap U(x^0)$ for sufficiently small $\lambda > 0$. Then $f(x^0) \le f(x_\lambda) \le (1-\lambda)f(x^0) + \lambda f(x)$, hence $f(x^0) \le f(x)$, proving the first part of the proposition. If $\gamma = \min f(C)$, then $\mathrm{argmin}_{x \in C} f(x)$ coincides with the set $C \cap \{x \mid f(x) \le \gamma\}$, which is a convex set by the convexity of $f(x)$ (Proposition 2.11). $\square$

Remark 2.1 A real-valued function $f$ on a convex set $C$ is said to be strictly convex on $C$ if
\[ f((1-\lambda)x^1 + \lambda x^2) < (1-\lambda)f(x^1) + \lambda f(x^2) \]
for any two distinct points $x^1, x^2 \in C$ and $0 < \lambda < 1$. For such a function $f$ the set $\mathrm{argmin}_{x \in C} f(x)$, if nonempty, is a singleton, i.e., a strictly convex function $f(x)$ on $C$ has at most one minimizer over $C$. In fact, if there were two distinct minimizers $x^1, x^2$, then by strict convexity $f\left(\frac{x^1 + x^2}{2}\right) < f(x^1)$, which is impossible.
Proposition 2.31 Let $C$ be a convex set in $\mathbb{R}^n$ and $f: \mathbb{R}^n \to (-\infty, +\infty]$ a convex function which is finite on $C$. For a point $x^0 \in C$ to be a minimizer of $f$ on $C$ it is necessary and sufficient that
\[ 0 \in \partial f(x^0) + N_C(x^0), \tag{2.52} \]
where $N_C(x^0)$ denotes the (outward) normal cone of $C$ at $x^0$ (cf. Sect. 1.6).

Proof If (2.52) holds there is $p \in \partial f(x^0)$ with $-p \in N_C(x^0)$. For every $x \in C$, since $p \in \partial f(x^0)$, we have $\langle p, x - x^0\rangle \le f(x) - f(x^0)$, i.e., $f(x^0) \le f(x) - \langle p, x - x^0\rangle$; on the other hand, since $-p \in N_C(x^0)$, we have $\langle p, x - x^0\rangle \ge 0$, hence $f(x^0) \le f(x)$, i.e., $x^0$ is a minimizer. Conversely, if $x^0 \in \mathrm{argmin}_{x \in C} f(x)$, then the system
\[ (x, y) \in C \times \mathbb{R}^n, \quad x - y = 0, \quad f(y) - f(x^0) < 0 \]
is inconsistent. Define $D := C \times \mathbb{R}^n$ and $A(x, y) := x - y$, so that $A(D) = C - \mathbb{R}^n = \mathbb{R}^n$; in particular $0 \in \mathrm{int}\, A(D)$. Therefore, by Theorem 2.4, there exists a vector $p \in \mathbb{R}^n$ such that
\[ \langle p, x - y\rangle + f(y) - f(x^0) \ge 0 \quad \forall (x, y) \in C \times \mathbb{R}^n. \]
Letting $y = x^0$ yields $\langle p, x - x^0\rangle \ge 0$ for all $x \in C$, i.e., $-p \in N_C(x^0)$; then letting $x = x^0$ yields $f(y) - f(x^0) \ge \langle p, y - x^0\rangle$ for all $y \in \mathbb{R}^n$, i.e., $p \in \partial f(x^0)$. Thus $0 = p + (-p) \in \partial f(x^0) + N_C(x^0)$, completing the proof. $\square$

Corollary 2.12 Under the assumptions of the above proposition, an interior point $x^0$ of $C$ is a minimizer if and only if $0 \in \partial f(x^0)$.

Proof Indeed, $N_C(x^0) = \{0\}$ if $x^0 \in \mathrm{int} C$. $\square$
Proposition 2.32 Let $C$ be a nonempty compact set in $\mathbb{R}^n$, $f: C \to \mathbb{R}$ an arbitrary continuous function, and $f^c$ the convex envelope of $f$ over $C$. Then any global minimizer of $f(x)$ on $C$ is also a global minimizer of $f^c(x)$ on $\mathrm{conv}\, C$.

Proof Let $x^0 \in C$ be a global minimizer of $f(x)$ on $C$. Since $f^c$ is a minorant of $f$, we have $f^c(x^0) \le f(x^0)$. If $f^c(x^0) < f(x^0)$, then the convex function $h(x) = \max\{f(x^0), f^c(x)\}$ would be a convex minorant of $f$ larger than $f^c$, which is impossible. Thus, $f^c(x^0) = f(x^0)$ and $f^c(x) = h(x)$ for all $x \in \mathrm{conv}\, C$. Hence, $f^c(x^0) = f(x^0) \le f^c(x)$ for all $x \in \mathrm{conv}\, C$, i.e., $x^0$ is also a global minimizer of $f^c(x)$ on $\mathrm{conv}\, C$. $\square$
2.11.2 Maximum

In contrast with the minimum, a local maximum of a convex function may fail to be global. Generally speaking, local information is not sufficient to identify a global maximizer of a convex function.

Proposition 2.33 Let $C$ be a convex set in $\mathbb{R}^n$ and $f: C \to \mathbb{R}$ a convex function. If $f(x)$ attains its maximum on $C$ at a point $x^0 \in \mathrm{ri} C$, then $f(x)$ is constant on $C$. The set $\mathrm{argmax}_{x \in C} f(x)$ is a union of faces of $C$.

Proof Suppose that $f$ attains its maximum on $C$ at a point $x^0 \in \mathrm{ri} C$ and let $x$ be an arbitrary point of $C$. Since $x^0 \in \mathrm{ri} C$ there is $y \in C$ such that $x^0 = \lambda x + (1-\lambda)y$ for some $\lambda \in (0, 1)$. Then $f(x^0) \le \lambda f(x) + (1-\lambda)f(y)$, hence $\lambda f(x) \ge f(x^0) - (1-\lambda)f(y) \ge f(x^0) - (1-\lambda)f(x^0) = \lambda f(x^0)$. Thus $f(x) \ge f(x^0)$, hence $f(x) = f(x^0)$, proving the first part of the proposition. The second part follows because for any maximizer $x^0$ there is a face $F$ of $C$ such that $x^0 \in \mathrm{ri} F$; then by the previous argument, every point of this face is a global maximizer. $\square$
Proposition 2.34 Let $C$ be a closed convex set and $f: C \to \mathbb{R}$ a convex function. If $C$ contains no line and $f(x)$ is bounded above on every halfline of $C$, then
\[ \sup\{f(x) \mid x \in C\} = \sup\{f(x) \mid x \in V(C)\}, \]
where $V(C)$ is the set of extreme points of $C$. If the maximum of $f(x)$ is attained at all on $C$, it is attained on $V(C)$.

Proof By Theorem 1.7, $C = \mathrm{conv}\, V(C) + K$, where $K$ is the convex cone generated by the extreme directions of $C$. Any point of $C$ which is actually not in $\mathrm{conv}\, V(C)$ belongs to a halfline emanating from some $v \in \mathrm{conv}\, V(C)$ in the direction of a ray of $K$. Since $f(x)$ is finite and bounded above on this halfline, its maximum on the halfline is attained at $v$ (Proposition 2.12(ii)). Therefore, the supremum of $f(x)$ on $C$ reduces to the supremum on $\mathrm{conv}\, V(C)$. The conclusion then follows from the fact that any $x \in \mathrm{conv}\, V(C)$ is of the form $x = \sum_{i \in I} \lambda_i v^i$, with $|I| < +\infty$, $v^i \in V(C)$, $\lambda_i \ge 0$, $\sum_{i \in I} \lambda_i = 1$, hence $f(x) \le \sum_{i \in I} \lambda_i f(v^i) \le \max_{i \in I} f(v^i)$. $\square$

Corollary 2.13 A real-valued convex function $f(x)$ on a polyhedron $C$ containing no line is either unbounded above on some unbounded edge or attains its maximum at an extreme point of $C$.

Corollary 2.14 A real-valued convex function $f(x)$ on a compact convex set $C$ attains its maximum at an extreme point of $C$.
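Corollary 2.14 is the basis of a simple computational strategy: to maximize a convex function over a polytope, it suffices to enumerate the vertices. The following sketch is our own assumed example for a box in $\mathbb{R}^2$:

```python
import numpy as np
from itertools import product

# Sketch of Corollary 2.14 (assumed example): a convex function on a compact
# polytope attains its maximum at a vertex, so for a box it suffices to
# inspect the 2^n corners.
f = lambda x: np.linalg.norm(x - np.array([0.2, -0.3]))**2   # convex
corners = [np.array(c) for c in product([-1.0, 1.0], repeat=2)]
best = max(corners, key=f)
print(best, f(best))                     # the maximizing corner of the box

rng = np.random.default_rng(5)
samples = rng.uniform(-1.0, 1.0, size=(20000, 2))
assert max(f(x) for x in samples) <= f(best) + 1e-12
print("no sampled interior point beats the best vertex")
```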
The latter result is in fact true for a wider class of functions, namely for quasiconvex functions. As was defined in Sect. 2.3, these are functions $f: \mathbb{R}^n \to [-\infty, +\infty]$ such that for any real number $\alpha$, the level set $L_\alpha := \{x \in \mathbb{R}^n \mid f(x) \le \alpha\}$ is convex, or equivalently, such that
\[ f((1-\lambda)x^1 + \lambda x^2) \le \max\{f(x^1), f(x^2)\} \tag{2.53} \]
for any $x^1, x^2 \in C$ and any $\lambda \in [0, 1]$.
To see that Corollary 2.14 extends to quasiconvex functions, just note that a compact convex set $C$ is the convex hull of its extreme points (Corollary 1.13), so any $x \in C$ can be represented as $x = \sum_{i \in I} \lambda_i v^i$, where the $v^i$ are extreme points, $\lambda_i \ge 0$ and $\sum_{i \in I} \lambda_i = 1$. If $f(x)$ is a quasiconvex function finite on $C$, and $\alpha = \max_{i \in I} f(v^i)$, then $v^i \in C \cap L_\alpha$ for all $i \in I$, hence $x \in C \cap L_\alpha$ because of the convexity of the set $C \cap L_\alpha$. Therefore, $f(x) \le \alpha = \max_{i \in I} f(v^i)$, i.e., the maximum of $f$ on $C$ is attained at some extreme point.
A function $f(x)$ is said to be quasiconcave if $-f(x)$ is quasiconvex. A convex (concave) function is of course quasiconvex (quasiconcave), but the converse may not be true, as can be demonstrated by a monotone nonconvex function of one variable. Also it is easily seen that an upper envelope of a family of quasiconvex functions is quasiconvex, but the sum of two quasiconvex functions may not be quasiconvex.
2.12 Minimax and Saddle Points

2.12.1 Minimax

Given a function $f(x, y): C \times D \to \mathbb{R}$, we can compute $\inf_{x \in C} f(x, y)$ and $\sup_{y \in D} f(x, y)$. It is easy to see that there always holds
\[ \sup_{y \in D}\inf_{x \in C} f(x, y) \le \inf_{x \in C}\sup_{y \in D} f(x, y). \]
Indeed, $\inf_{x \in C} f(x, y) \le f(z, y)$ for all $z \in C$, $y \in D$, so $\sup_{y \in D}\inf_{x \in C} f(x, y) \le \sup_{y \in D} f(z, y)$ for all $z \in C$, hence $\sup_{y \in D}\inf_{x \in C} f(x, y) \le \inf_{z \in C}\sup_{y \in D} f(z, y)$.
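The inequality just derived can be strict when the convexity-concavity structure of the next theorem fails. The following grid computation is our own assumed example, using $f(x, y) = (x - y)^2$ on $[0, 1]^2$, which is convex in $x$ but not concave in $y$:

```python
import numpy as np

# Illustrative grid computation: for f(x, y) = (x - y)^2 on [0,1]^2 the
# minimax inequality is strict: sup_y inf_x f = 0 < 1/4 = inf_x sup_y f.
grid = np.linspace(0.0, 1.0, 1001)
F = (grid[:, None] - grid[None, :])**2     # F[i, j] = f(x_i, y_j)
sup_inf = F.min(axis=0).max()              # sup over y of inf over x
inf_sup = F.max(axis=1).min()              # inf over x of sup over y
print(sup_inf, inf_sup)                    # 0.0  0.25
```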
We would like to know when the reverse inequality is also true, i.e., when there holds the minimax equality
\[ \sup_{y \in D}\inf_{x \in C} f(x, y) = \inf_{x \in C}\sup_{y \in D} f(x, y). \]
Investigations of this question date back to von Neumann (1928). A classical result of his states that if $C, D$ are compact and $f(x, y)$ is convex in $x$, concave in $y$ and continuous in each variable, then the minimax equality holds. Since minimax theorems have found important applications, there has been a great deal of work on the extension of von Neumann's theorem. Almost all these extensions are based either on the separation theorem for convex sets or on a fixed point argument. The most important result in this direction is due to Sion (1958). Later a more general minimax theorem was established by Tuy (1974) without any appeal to separation or fixed point arguments. We present here a simplified version of the latter result which is also a refinement of Sion's theorem.
Theorem 2.7 Let $C, D$ be two closed convex sets in $\mathbb{R}^n$, $\mathbb{R}^m$, respectively, and let $f(x, y): C \times D \to \mathbb{R}$ be a function quasiconvex and lower semi-continuous in $x$, quasiconcave and upper semi-continuous in $y$. Assume that:
(*) There exists a finite set $N \subset D$ such that $\sup_{y \in N} f(x, y) \to +\infty$ as $x \in C$, $\|x\| \to +\infty$.
Then there holds the minimax equality
\[ \inf_{x \in C}\sup_{y \in D} f(x, y) = \sup_{y \in D}\inf_{x \in C} f(x, y). \tag{2.54} \]
Proof Since the inequality infx2C supy2D f .x; y/  sup inf f .x; y/ is trivial, it suffices
y2D x2C
to show the reverse inequality:

inf sup f .x; y/  sup inf f .x; y/: (2.55)


x2C y2D y2D x2C

Let
WD supy2D infx2C f .x; y/: If
D C1 then (2.55) is obvious, so we can assume

< C1: For an arbitrary >


define

C .y/ D fx 2 Cj f .x; y/  g:

Since supy2D infx2C < ; we have C .y/ ; 8y 2 D: If we can show that


\
C .y/ ;; (2.56)
y2D

i.e., there is x 2 C satisfying f .x; y/  for all y 2 D; then infx2C supy2D 


and since this is true for every >
it will follow that infx2C supy2D 
;
proving (2.55). Thus, all is reduced to establishing (2.56). This will be done in three
stages. To simplify the notation, from now on we shall omit the subscript and
write simply C.a/; C.b/; etc. . .
I. Let us first show that for every pair a; b 2 D the two sets C.a/ and C.b/
intersect. Assume the contrary, that

C.a/ \ C.b/ D ;: (2.57)

Consider an arbitrary  2 0; 1 and let y D .1  /a C b: If x 2 C.y /;


i.e., f .x; y /  then minff .x; a/; f .x; b/g  f .x; y /  by quasiconcavity of
f .x; :/; hence, either f .x; a/  or f .x; b/  : Therefore,

C.y /  C.a/ [ C.b/: (2.58)

Since C.y / is convex it follows from (2.58) that C.y / cannot meet both sets
C.a/ and C.b/ which are disjoint by assumption (2.57). Consequently, for every
 2 0; 1; one and only one of the following alternatives occurs:

.a/ C.y /  C.a/I .b/ C.y /  C.b/:


74 2 Convex Functions

Denote by Ma .Mb ; resp.) the set of all  2 0; 1 satisfying (a) (satisfying (b),
resp.). Clearly 0 2 Ma ; 1 2 Mb ; Ma [ Mb D 0; 1 and, analogously to (2.58):

C.y /  C.y1 / [ C.y2 / 8 2 1 ; 2 : (2.59)

Therefore,  2 Ma implies 0;   Ma ; and  2 Mb implies ; 1  Mb : Let


s D sup Ma D inf Mb and assume, for instance, that s 2 Ma (the argument is
similar if s 2 Mb /: We show that (2.57) leads to a contradiction.
Since >
 infx2C f .x; ys /; we have f .x; ys / < for some x 2 C: By
upper semi-continuity of f .x; :/ there is " > 0 such that f .x; ysC" / < and so
x 2 C.ysC" /: But x 2 C.ys /  C.a/; hence C.ysC" /  C.a/; i.e., s C " 2 Ma ;
contradicting the definition of s: Thus (2.57) cannot occur and we must have
C.a/ \ C.b/ ; for all a; b 2 C; < :
II. We now show that any finite collection C.y1 /; : : : ; C.yk / with y1 ; : : : ; yk 2 D
has a nonempty intersection. By (I) this is true for k D 2: Assuming this is true
for k D h1 let us consider the case k D h: Set C0 D C.yh /; C0 .y/ D C0 \C.y/:
From part I we know that C0 .y/ ; for every y 2 D: This means that for all
>
W 8y 2 D 9x 2 C0 f .x; y/  ; so that supy2D infx2C0 f .x; y/  :
Since
D supy2D infx2C f .x; y/  supy2D infx2C0 f .x; y/; it follows that
D
supy2D infx2C0 f .x; y/: So all the hypotheses of the theorem still hold when C
is replaced by C0 : It then follows from the induction assumption that the sets
C0 .y1 /; : : : ; C0 .yh1 / have a nonempty intersection. Thus the family fC .y/; y 2
Dg has the finite intersection property.
III. Finally, for every y 2 D let C .y/ D fx 2 Cj f .x; y/  ; supz2N f .x; z/  g:
Then C .y/  CN WD fx 2 Cj supz2N f .x; z/  g and the set CN is compact
because if it were not so there would exist a sequence xk 2 CN such that kxk j !
C1; contradicting assumption (*). On the other hand, for any finite set E  D
clearly \y2E C .y/ D \y2E[N C.y/; so by part II \y2E C .y/ ;; i.e., the family
fC .y/; y 2 Dg has the finite intersection property. Since every C .y/ is a subset
of the compact set CN it follows that \y2D C .y/ D \y2D C .y/ ;; i.e. (2.56)
must hold. t
u
Remark 2.2 In view of the symmetry in the roles of x; y Theorem 2.7 still holds if
instead of condition (*) one assumes that
(!) There exists a finite set M  C such that infx2M f .x; y/ ! 1 as y 2
D; kyk ! C1:
The proof is analogous, using D .x/ WD fy 2 Dj f .x; y/  g with > WD
infx2C supy2D f .x; y/ (instead of C .y/ with <
WD supy2D infx2C f .x; y/) and
proving that \x2C D .x/ ;:
2.12 Minimax and Saddle Points 75

2.12.2 Saddle Point of a Function

A pair .x; y/ 2 C  D is called a saddle point of the function f .x; y/ W C  D ! R if

f .x; y/  f .x; y/  f .x; y/ 8x 2 C; 8y 2 D: (2.60)

This means

min.x; y/ D f .x; y/ D max f .x; y/;


x2C y2D

so in a neighborhood of the point .x; y; f .x; y// the set epif WD f.x; y; t/j .x; y/ 2
C  D; t 2 R; f .x; y/  tg reminds the image of a saddle (or a mountain pass).
Proposition 2.35 A point .x; y/ 2 C  D is a saddle point of f .x; y/ W C  D ! R
if and only if

max inf f .x; y/ D min sup f .x; y/: (2.61)


y2D x2C x2C y2D

Proof We have

.2.60/ , supy2D f .x; y/  f .x; y/  infy2C f .x; y/


, infx2C supy2D f .x; y/  supy2D f .x; y/  f .x; y/
 infx2C f .x; y/  supy2D infx2C f .x; y/:

Since always infx2C supy2D f .x; y/  supy2D infx2C f .x; y/ we must have equality
everywhere in the above sequence of inequalities. Therefore, (2.60) must be
equivalent to (2.61). t
u
As a consequence of Theorem 2.7 and Remark 2.2 we can now state
Proposition 2.36 Assume that C  Rn ; D  Rm are nonempty closed convex sets
and the function f .x; y/ W C  D ! R is quasiconvex l.s.c in x and quasiconcave
u.s.c. in y: If both the following conditions hold:
(i) There exists a finite set N  D such that supy2N f .x; y/ ! C1 as x 2
C; kxk ! C1I
(ii) There exists a finite set M  C such that infx2M f .x; y/ ! C1 as y 2
D; kyk ! C1I
then there exists a saddle point .x; y/ for the function f .x; y/:
Proof If (i) holds, we have minx2C supy2D f .x; y/ D supy2D infx2C f .x; y/ by The-
orem 2.7. If (ii) holds, we have infx2C supy2D f .x; y/ D maxy2D infx2C f .x; y/ by
Remark 2.2. Hence the condition (2.61). t
u
76 2 Convex Functions

2.13 Convex Optimization

We assume that the reader is familiar with convex optimization problems, i.e.,
optimization problems of the form

minff .x/j gi .x/  0 .i D 1; : : : ; m/; hj .x/ D 0 .j D 1; : : : ; p/; x 2 g; (2.62)

where  Rn is a closed convex set, f ; g1 ; : : : ; gm are convex functions finite on


an open domain containing ; h1 ; : : : ; hm are affine functions.
In this section we study a generalization of problem (2.62), where the convex
inequality constraints are understood in a generalized sense.

2.13.1 Generalized Inequalities


a. Ordering Induced by a Cone

A cone K  Rn induces a partial ordering


K on Rn such that

x
K y , y  x 2 K:

We also write x K y to mean y


K x: In the case K D RnC this is the usual ordering
x  y , xi  yi ; i D 1; : : : ; m:
The following properties of the ordering
K are straightforward:
(i) transitivity: x
K y; y
K z ) x
K zI
(ii) reflexivity: x
K x 8x 2 Rn I
(iii) preservation under addition: x
K y; x0
K y0 ) x C x0
K y C y0 I
(iv) preservation under nonnegative scaling: x
K y; > 0 ) x
K y:
The ordering induced by a cone K is of particular interest when K is closed, solid
(i.e., has nonempty interior), and pointed (i.e., contains no line: x 2 K ) x K/:
In that case a relation x
K y is called a generalized inequality. We also write x K y
to mean that y  x 2 intK and call such a relation a strict generalized inequality.
Generalized inequalities enjoy the following important properties:
(i) x
K y; y
K x ) x D yI
(ii) x 6 K xI
(iii) xk
K yk ; xk ! x; yk ! y ) x
K yI
(iv) If x K y then for u; v small enough x C u K y C vI
(v) x K y; u
K ) x C u K y C vI
(vi) The set fzj x
K z
K yg is bounded.
For instance, (ii) is true because x K x implies 0 D x  x 2 intK; which
is impossible as K is pointed; (vi) holds because the set E D fzj x
K z
K yg
2.13 Convex Optimization 77

is closed, so if there exists fzk g  E; kzk k ! C1 then, up to a subsequence,


k k
zk =kzk k ! u with kuk D 1 and since kzzk k 2 kzxk k C K; kzzk k 2 kzxk k  K; letting
k ! C1 yields u 2 K; u 2 K; hence by (i) u D 0 conflicting with kuk D 1:

b. Dual Cone

The dual cone of a cone K is by definition the cone

K  D fy 2 Rn j hx; yi  0 8x 2 Kg:

Note that K  D K o ; where K o is the polar of K (Sect. 1.8). As can easily be seen:
(i) K  is a closed convex cone;
(ii) K  is also the dual cone of the closure of KI
(iii) .K  / D clKI
(iv) If x 2 intK then hx; yi > 0 8y 2 K  n f0g:
A cone K is said to be self-dual if K  D K: Clearly the orthant RnC is a self-dual
cone.
Lemma 2.1 If K is a closed convex cone then

y K , 9 2 K  h; yi < 0:

Proof By (iii) above K D .K  / so y K if and only if y .K  / ; hence if and


only if there exists  2 K  satisfying h; yi < 0: t
u

c. K-Convex Functions

Given a convex cone K  Rm inducing an ordering


K on Rm a map g W Rn ! Rm
is called K-convex if for every x1 ; x2 2 Rn and 0   1 we have the generalized
inequality

g.x1 C .1  /x2 /
K g.x1 / C .1  /g.x2 /: (2.63)

For instance, the map g W Rn ! Rm is Rm C -convex (or component-wise convex) if


each function gi .x/; i D 1 : : : ; m; is convex.

Lemma 2.2 If g W Rn ! PmR is a K-convex map then for every  2 K
m

the function h; g.x/i D iD1 i gi .x/ is convex in the usual sense and the set
fx 2 Rn j g.x/
K 0g is convex.
Proof For every x1 ; x2 2 Rn , we have by (2.63)

g.x1 / C .1  /g.x2 /  g.x1 C .1  /x2 / 2 K:


78 2 Convex Functions

Since  2 K  it follows that

h; g.x1 / C .1  /g.x2 /  g.x1 / C .1  /g.x2 /i  0;

hence h; g.x1 /i C .1  /h; g.x1 /i  h; g.x1 / C .1  /g.x2 /i; proving that
the function h; g.x/i is convex.
Further, by (2.2), if g.x1 /
K 0; g.x2 /
K 0 then

g.x1 / C .1  /x2 /
K g.x1 / C .1  /g.x2 /
K 0;

proving that the set fxj g.x/


K 0g is convex. t
u

2.13.2 Generalized Convex Optimization

A generalized convex optimization problem is a problem of the form

minff .x/j gi .x/


Ki 0 .i D 1; : : : ; m/; h.x/ D 0; x 2 g; (2.64)

where f W Rn ! R is a convex function, Ki is a closed, solid, pointed convex cone


in Rsi ; gi W Rn ! Rsi ; i D 1; : : : ; m; is Ki -convex, finite on the whole Rn ; h W
Rn ! Rp is an affine map, and  Rn is a closed convex set. By Lemma 2.2 the
constraint set of this problem is convex, so this is also a problem of minimizing a
convex function over a convex set.
Let i 2 Ki be the Lagrange multiplier associated with the generalized inequality
gi .x/
Ki 0 and  2 Rp the Lagrange multiplier associated with the equality
h.x/ D 0: So the Lagrangian of the problem is the function

X
m
L.x; ; / WD f .x/ C hi ; gi .x/i C h; h.x/i;
iD1

where i 2 Ki ; i D 1; : : : ; m; and  2 Rp : The dual Lagrange function is

X
m
'.; / D inf L.x; ; / D inf ff .x/ C hi ; gi .x/i C h; h.x/ig:
x2 x2
ID1

Since '.; / is the lower envelope of a family of affine functions in .; /; it is a
concave function. Setting K  D K1      Km ;  D .1 ; : : : ; m / we can show that

sup L.x; ; /
2K  ;2Rp

f .x/ if gi .x/
Ki 0 .i D 1; : : : ; m/; h.x/ D 0I
D
C1 otherwise
2.13 Convex Optimization 79

In fact, if gi .x/
Ki 0 .i D 1; : : : ; m/; h.x/ D 0 then the supremum is attained for
 D 0;  D 0 and equals f .x/: On the other hand, if h.x/ 0 there is  2 Rp
satisfying h; h.x/i > 0 and for  D 0 the supremum is equal to sup >0 ff .x/ C
h ; h.x/ig D C1: If there is an i D 1; : : : ; m with gi .x/ 6
Ki 0 then by Lemma 2.1
there is i 2 Ki satisfying hi ; gi .x/i > 0; hence for  D 0; j D 0 8j i; the
supremum equals sup >0 ff .x/ C hi ; gi .x/ig D C1:
Thus, the problem (2.64) can be written as

inf sup L.x; ; /:


x2 2K  ;2Rp

The dual problem is

sup inf L.x; ; /;


2K  ;2Rp x2

that is,

sup '.; /: (2.65)


2K  ;2Rp

Theorem 2.8
(i) (weak duality) The optimal value in the dual problem never exceeds the optimal
value in the primal problem (2.64).
(ii) (strong duality) The optimal values in the two problems are equal if the Slater
condition holds, i.e., if

9x 2 int h.x/ D 0; gi .x/ Ki 0; i D 1; : : : ; m:

Proof (i) is straightforward, we need only prove (ii). By Lemma 2.2 for every i 2
Ki the function hi ; gi .x/i is convex (in the usual sense), finite on Rn and hence
continuous. So L.x; ; / is a convex continuous function in x 2 for every fixed
.; / 2 D WD K1   Km Rp and affine in .; / 2 D for every fixed x 2 h./:
Since h W Rn ! Rp is an affine map, without loss of generality it can be assumed
that h./ D Rp : Since h.x/ D 0 and x 2 int, we have 0 2 int; so for every
j D 1; : : : ; p there is an aj 2 such that h.aj / has its j-th component equal to 1,
and all other components equal to 0. Then for sufficiently small " > 0, we have
xi WD x C ".aj  x/ 2 ; xO j WD x  ".aj  x/ 2 ; and so

gi .xj / < 0 .i D 1; : : : ; m/; hj .xj / > 0; hj .xj / D 0 8i j


gi .Oxj / < 0 .i D 1; : : : ; m/; hj .Oxj / < 0; hi .Oxj / D 0 8i j:

Let M D fxj ; xO j ; j D 1; : : : ; pg: For every .; / 2 D n f0g we can write


Xm X
p
.; / WD min i gi .x/ C j hj .x/ < 0
x2M
iD1 jD1
80 2 Convex Functions

hence, WD maxf.; /j  2 Rm


C ;  2 R ; kk C kk D 1g < 0: Consequently,
p

X
m X
p
min L.x; .; //  max f .x/ C minf i gi .x/ C j hj .x/g
x2M x2M x2M
iD1 jD1

 max f .x/ C .kk C kk/ ! 1


x2M

as .; / 2 D; kk C kk ! C1: So the function L.x; .; // satisfies condition
(!) in Remark 2.2 with C D ; y D .; /: By virtue of this Remark,

inf sup L.x; .; // D max inf L.x; .; //;
x2 .;/2D .;/2D x2

completing the proof of (ii). t


u
The difference between the optimal values in the primal and dual problems is
called the duality gap. Theorem 2.8 says that for problem (2.64) the duality gap is
never negative and is zero if Slater condition is satisfied.

2.13.3 Conic Programming

Let K  Rm be a closed, solid, and pointed convex cone, let c 2 Rn and let A be an
m  n matrix, and b 2 Rm : Then the following optimization problem:

minfhc; xij Ax
K bg (2.66)

is called a conic programming problem.


Since A.x1 C .1  /x2 /  Ax1 C .1  /Ax2  D 0 2 K; i.e.,

A.x1 C .1  /x2 / K Ax1 C .1  /Ax2 ;

for all x1 ; x2 2 Rn ; 0   1; the map x 7! b  Ax is K-convex. So a conic


program is nothing but a special case of the above considered problem (2.64) when
m D 1; K1 D K; f .x/ D hc; xi; g1 .x/ D b  Ax; h 0; D Rn :
The Lagrangian of problem (2.66) is

L.x; y/ D hc; xi C hy; b  Axi D hc  AT y; xi C hb; yi .y K 0/:

But, as can easily be seen,



hb; yi if AT y D x
infn L.x; y/ D
x2R 1 otherwise
2.14 Semidefinite Programming 81

so the dual of the conic programming problem (2.66) is the problem

maxfhb; yij AT y D c; y K 0g: (2.67)

Clearly linear programming is a special case of conic programming when K D RnC .


However, while strong duality always holds for linear programming (except only
when both the primal and the dual problems are infeasible), it is not so for conic
programming. By Theorem 2.8 a sufficient condition for strong duality in conic
programming is

9x Ax K b:

2.14 Semidefinite Programming

2.14.1 The SDP Cone

Given a symmetric n  n matrix A D aij  (i.e., a matrix A such that AT D A) the


trace of A; written Tr(A), is the sum of its diagonal elements:

X
n
Tr.A/ WD aii :
iD1

From the definition it is readily seen that

Tr.A C B/ D Tr.A/ C Tr.B/; Tr.AB/ D Tr.BA/


Tr.A/ D 1 C    C n

where i ; i D 1; : : : ; n; are the eigenvalues of A; the latter equality being derived


from the development of the characteristic polynomial det.In  A/.
Consider now the space Sn of all n  n symmetric matrices. Using the concept of
trace we can introduce an interior product1 in Sn defined as follows:
X
hA; Bi D Tr.AB/ D aij bij D vec.A/T vec.B/;
i;j

where vec.A/ denotes the n  n column vector whose elements are elements of the
matrix A in the order a11 ;    ; a1n ; a21 ;    ; a2n ;    ; an1 ; ann : The norm of a matrix
A associated with this inner product is the Frobenius norm given by

1
Sometimes also written as A  B and called dot product.
82 2 Convex Functions

0 11=2
X
kAkF D .hA; Ai/1=2 D .Tr.AT A/1=2 D @ a2ij A :
i;j

The set of semidefinite positive matrices X 2 Sn forms a convex cone SnC called the
SDP cone (semidefinite positive cone). For any two matrices A; B 2 Sn we write
A
B to mean that B  A is semidefinite positive. In particular, A 0 means A is a
semidefinite positive matrix.
Lemma 2.3 The cone SnC is closed, convex, solid, pointed, and self-dual, i.e.,

SnC D .SnC / :

Proof We prove only the self-dual property. By definition

.SnC / D fY 2 SnC j Tr.YX/  0 8X 2 SnC g:

Let Y 2 SnC . For every X 2 SnC , we have

Tr.YX/ D Tr.YX 1=2 X 1=2 / D Tr.X 1=2 YX 1=2 /  0;

where the last inequality holds because the matrix X 1=2 YX 1=2 is semidefinite
positive. Therefore, SnC  .SnC / : Conversely, let Y 2 .SnC / : For any u 2 Rn ,
we have

X
n X
n X
n
Tr.YuuT / D y1i ui u1 C    C yni ui un D yij ui uj D uT Yu:
iD1 iD1 i;jD1

The matrix uuT is obviously semidefinite positive, while the trace of YuuT is
nonnegative by assumption, so the product uT Yu is nonnegative. Since u is arbitrary,
this means Y 2 SnC : Hence, .SnC /  SnC : t
u
A map F W Rn ! Sm is said to be convex, or more precisely, convex with respect
to matrix inequalities if it is Sm n
C -convex, i.e., such that for every X; Y 2 S and
0t1

F.tX C .1  t/Y/
tF.X/ C .1  t/F.Y/:

For instance, the map X 7! X 2 is convex because for every u 2 Rm the function
uT X 2 u D kXuk2 is convex quadratic with respect to the components of X and so

uT .X C .1  /X/2 u  uT X 2 u C .1  /uT X 2 u;

which implies .X C .1  /X/2


X 2 C .1  /X 2 : Analogously, the function
X 7! XX T is convex.
2.14 Semidefinite Programming 83

2.14.2 Linear Matrix Inequality

A linear matrix inequality (LMI) is a generalized inequality

A 0 C x1 A 1 C    C xn A n
0

where x 2 Rn is the variable and Ai 2 Sp ; i D 0; 1; : : : ; n; are given p  p symmetric


p
matrices. The inequality sign
is understood with respect to the cone SC W the
P P
0 means
notation
p
the matrix P is semidefinite negative. Obviously, A.x/ WD
A0 C nkD1 xk Ak 2 SC and each element of this matrix is an affine function of x W

X
n
A.x/ D aij .x/; aij .x/ D a0ij C akij xk :
kD1

Therefore an LMI can also be defined as an inequality of the form

A.x/
0;

where A.x/ is a square symmetric matrix whose every element is an affine function
of x:
By definition

A.x/
0 , hy; A.x/yi  0 8y 2 Rp ;

so setting C WD fx 2 Rn j A.x/
0g, we have

C D \y2Rp fx 2 Rn j hy; A.x/yi  0g:

Since for every fixed y the set fx 2 Rn j hy; A.x/yi  0g is a halfspace, we see that the
solution set C of an LMI is a closed convex set. In other words, an LMI is nothing
but a specific convex inequality which is equivalent to an infinite system of linear
inequalities.
Obviously, the inequality A.x/ 0 is also an LMI, determining a convex
constraint for x: Furthermore, a finite system of LMIs of the form

A.1/ .x/
0; : : : ; A.m/ .x/
0

can be equivalently written as the single LMI

Diag.A.1/ .x/;    ; A.m/ .x//


0:
84 2 Convex Functions

2.14.3 SDP Programming

A semidefinite program .SDP/ is a problem of minimizing a linear function under


an LMI constraint, i.e., a problem of the form

.SDP/ minfhc; xij A0 C x1 A1 C : : : C xn An


0g:

Clearly this is a special case of generalized convex optimization. Specifically, .SDP/


can be rewritten in the form (2.64), with m D 1; f .x/ D hc; xi; g1 .x/ D A0 C x1 A1 C
p
: : : C xn An ; K1 D SC :
To form the dual problem to .SDP/ we associate with the LMI constraint a dual
variable Y 2 .SC / D SC (see Lemma 2.3), so the Lagrangian is
p p

L.x; Y/ D cT x C Tr.Y.A0 C x1 A1 C    C xn An //:

The dual function is

'.Y/ D inffL.x; Y/j x 2 Rn g:

Since L.x; Y/ is affine in x it is unbounded below, except if it is identically zero, i.e.,


if ci C Tr.YAi / D 0; i D 1; : : : ; n; in which case L.x; Y/ D Tr.A0 Y/: Therefore,

Tr.A0 Y if Tr.Ai Y/ C ci D 0; i D 1; : : : ; n
'.Y/ D
1 otherwise

Consequently, the dual of .SDP/ is

.SDD/ maxfTr.A0 Y/j Tr.Ai Y/ C ci D 0; i D 1; : : : ; n; Y D Y T 0g:

Writing this problem in the form

maxfhA0 ; Yij hAi ; Yi D ci ; i D 1; : : : ; ; Y 0g (2.68)

we see that .SDP/ reminds a linear program in standard form.


By Theorem 2.8 strong duality holds for .SDP/ if Slater condition is satisfied

9x A 0 C x1 A 1 C    C xn A n : (2.69)

2.15 Exercises

1 Let f .x/ be a convex function and C D domf  Rn : Show that for all x1 ; x2 2 C
and  0; 1 such that x1 C .1  /x2 2 C W

f .x1 C .1  /x2 /  f .x1 / C .1  /f .x2 /:


2.15 Exercises 85

2 A real-valued function f .t/; 1 < t < C1; is strictly convex (cf Remark 2.1)
if it has a strictly monotonically increasing derivative f 0 .t/: Apply this result to
f .t/ D et and show that for any positive numbers 1 ; : : : k I
!1=k
Y
k
1X
k
i  i
iD1
k iD1

with equality holding only when 1 D : : : D k :


3 A function f .x/ is strongly convex (cf Sect. 2.9) on a convex set C if and only if
there exists r > 0 (modulus of strong convexity) such that f .x/  rkxk2 is convex.
Show that:

f ..1  /x1 C x2 /  .1  /f .x1 / C f .x2 /  .1  /rkx1  x2 k2

for all x1 ; x2 2 C and  0; 1 such that x1 C .1  /x2 2 C:


4 Show that if f .x/ is strongly convex on Rn (with modulus of strong convexity r/
then for any x0 2 Rn and p 2 @f .x0 / W

f .x/  f .x0 /  hp; x  x0 i C rkx  x0 k2 8x 2 Rn :

5 Notations being the same as in Exercise 4, show that for any x1 ; x2 2 Rn and
p1 2 @f .x1 /; p2 2 @f .x2 / W

hp1  p2 ; x1  x2 i  rkx1  x2 k2 :

6 If f .x/ is strongly convex on a convex set C then for any x0 2 C the level set
fx 2 Cj f .x/  f .x0 /g is bounded.
7 Let C be a nonempty convex set in Rn ; f W C ! R a convex function,
Lipschitzian with constant L on C: The function

F.x/ D infff .y/ C Lkx  ykj y 2 Cg

is convex, Lipschitzian with constant L on the whole space and satisfies F.x/ D
f .x/ 8x 2 C:
8 Show that if @" f .x0 / is a singleton for some x0 2 domf and " > 0; then f .x/ is an
affine function.
9 For a proper convex function f W p 2 @f .x0 / if and only if .p; 1/ is an outward
normal to the set epif at point .x0 ; f .x0 //:
10 Let M be a nonempty set in Rn ; h W M ! R an arbitrary function, E an n  n
matrix. The function
86 2 Convex Functions

'.x/ D maxfhx; Eyi  h.y/j y 2 Mg

is convex and for every x0 2 dom'; if y0 2 argmaxy2M fhx0 ; Eyi  h.y/g then Ey0 2
@'.x0 /:
11 Let f W Rn ! R be a convex function, X a closed convex subset of Rn ; A 2
Rmn ; B 2 Rmp ; c 2 Rm : Show that the function '.y/ D minff .x/j Ax C By 
c; x 2 Xg is convex and for every y0 2 dom'; if  is a KuhnTucker vector for the
problem minff .x/j Ax C By0  c; x 2 Xg then the vector BT  is a subgradient of '
at y0 :
Chapter 3
Fixed Point and Equilibrium

3.1 Approximate Minimum and Variational Principle

It is well known that a real-valued function f W Rn ! R which is l.s.c. on a nonempty


closed set X  Rn attains a minimum over X if X is compact. This is true, more
generally, if f is coercive on X; i.e., if

f .x/ ! C1 as x 2 X; kxk ! C1: (3.1)

In fact, for an arbitrary a 2 X the set C D fx 2 Xj f .x/  f .a/g is closed and


if it were unbounded there would exist a sequence fxk g  X such that f .xk / 
f .a/; kxk k ! C1 while by (3.1) this would imply f .xk / ! C1; a contradiction.
So C is compact and the minimum of f .x/ over C; i.e., the minimum of f .x/ on X;
exists.
If condition (3.1) fails, f .x/ may not have a minimum over X: In that case one
must be content with the following concept of approximate minimum: Given " > 0
a point x" is called an "-approximate minimizer of f .x/ on X if

inf f .x/  f .x" /  inf f .x/ C ": (3.2)


x2X x2X

An "-approximate minimizer on X always exists if f .x/ is bounded below on X:


For differentiable functions f W Rn ! R a minimizer x of f .x/ on Rn satisfies
rf .x/ D 0: Is there some similar condition for an "-approximate minimizer? The
next propositions give an answer to this question.
Theorem 3.1 (Ekeland Variational Principle (Ekeland 1974)) Let f W X ! R [
fC1g be a function on a closed set X  Rn which is finite at at least one point,
l.s.c. and bounded below on X: Let " > 0 be given. Then for every "-approximate
minimizer x" of f and every  > 0 there exists x satisfying

Springer International Publishing AG 2016 87


H. Tuy, Convex Analysis and Global Optimization, Springer Optimization
and Its Applications 110, DOI 10.1007/978-3-319-31484-6_3
88 3 Fixed Point and Equilibrium

f .x /  f .x" /I kx  x" k  I (3.3)


 " 
f .x /  f .x/ C  kx  x k 8x 2 X: (3.4)

Proof We follow the elegant proof in Hiriart-Urruty (1983). Clearly the function
g.x/ D f .x/ C " kx  x" k is l.s.c. and coercive on X; so it has a minimizer x on X;
i.e., a point x 2 X such that
"  "
f .x / C kx  x" k  f .x/ C kx  x" k 8x 2 X: (3.5)
 
Letting x D x" yields
" 
f .x / C kx  x" k  f .x" /; (3.6)


hence f .x /  f .x" /: Further, infx2X f .x/ C " kx  x" k  f .x / C " kx  x" k 
f .x" /  infx2X f .x/ C " by the definition of x" , hence kx  x" k  : Finally,
from (3.5) it follows that
"
f .x /  f .x/ C fkx  x" k  kx  x" kg

"
 f .x/ C kx  x k 8x 2 X: t
u


Note that by (3.3) x is also an "-approximate minimizer which is not too distant
from x" ; moreover by (3.4) x is an exact minimizer of the function f .x/C " kxx k
which differs from f .x/ just by a perturbation quantity " kx  x k: The larger  the
smaller this perturbation quantity and the closer x to an exact minimizer, but x
may be farther from x"p : In practice, the choice of  depends on the circumstances.
Usually the value  D " is suitable.
Corollary 3.1 Let f W Rn ! R be a differentiable function which is bounded below
on Rn : For every "-approximate minimizer x" of f there exists an "-approximate
minimizer x such that
p p
kx  x" k  "; krf .x /k  ";
p
Proof From (3.4) for x D x C tu; u 2 Rn ; t < 0 and  D " we have f .x / 
p  
f .x Ctu/f .x / p
f .x C tu/ C "ktuk: p Therefore, n t  "kuk;
p and letting t ! C1
yields hrf .x /; ui  "kuk 8u 2 R ; hence krf .x /k  ":

t
u
Let X; Y be two sets in Rn ; Rm , respectively. A set-valued map f from X to Y is
a map which associates with each point x 2 X a set f .x/  Y called the value of f
at x: Theorem 3.1 implies the following fixed point theorem due to Caristi (1976):
3.2 Contraction Principle 89

Theorem 3.2 Let P be a set-valued map from a closed set X  Rn into itself and
let f W X ! 0; C1 be a l.s.c. function finite at at least one point. If

.8x 2 X/.9y 2 P.x// kx  yk  f .x/  f .y/ (3.7)

then P has a fixed point, i.e., a point x 2 P.x /:


Proof Let x be the point that exists according to Theorem 3.1 for " D 1 < : We
k
have f .x /  f .x/C kxx  8x 2 X; so f .x / < f .x/Ckxx k 8x 2 Xnfx g: On the
other hand, by hypothesis there exists y 2 P.x / satisfying kx  yk  f .x /  f .y/;
i.e., f .x /  f .y/ C ky  x k: This implies y D x ; so x 2 P.x /: t
u
It can be shown that Theorem 3.2 in turn implies Theorem 3.1 (Exercise 1), so
these two theorems are in fact equivalent.
Heuristically, a set-valued map P with values P.x/ ; can be conceived of
as a dynamic system whose potential function value at position x is f .x/: The
assumption (3.7) means that when the system moves from position x to position
y its potential diminishes at least by a quantity equal to the distance from x to y: The
fixed point x corresponds to an equilibriuma position from where no move to a
position with lower potential is possible.

3.2 Contraction Principle

For any set A  Rn and > 0 define d.x; A/ WD infy2A kx  yk; A WD fx 2


Rn j d.x; A/  g: If A; B are two subsets of Rn the number

d.A; B/ WD inff > 0j A  B ; B  A g

is called the Hausdorff distance between A and B: From this definition it readily
follows that: if d.A; B/  then d.x; B/  8x 2 A (because A  B ) and
d.y; A/  8y 2 B (because B  A ).
A set-valued map P from a set X  Rn to Rn is said to be a contraction if there
exists ; 0 < < 1; such that

.8x; x0 2 X/ d.P.x/; P.x0 //  kx  x0 k: (3.8)

Lemma 3.1 If a set-valued map P from a closed set X  Rn to X is a contraction


then the function f .x/ D d.x; P.x/ is l.s.c. on X:
Proof We have to show that if d.xk ; P.xk //  ; kxk x0 j ! 0 then d.x0 ; P.x0 //  :
By definition of d.xk ; P.xk // we can select yk 2 P.xk / such that kxk  yk k 
d.xk ; P.xk // C 1=k: Then d.yk ; P.x0 //  d.P.xk /; P.x0 //  kxk  x0 k and therefore

d.x0 ; P.x0 //  kx0  xk k C kxk  yk k C d.yk ; P.x0 //


90 3 Fixed Point and Equilibrium

 kx0  xk k C d.xk ; P.xk // C 1=k C kxk  x0 k


 kx0  xk k C C 1=k C kxk  x0 k !

as k ! C1: Consequently, d.x0 ; P.x0 //  : t


u
The next theorem extends the well-known Banach contraction principle to set-
valued maps.
Theorem 3.3 (Nadler 1969) Let P be a set-valued map from a closed set X  Rn
into itself, such that P.x/ is a nonempty closed set for every x 2 X: If there exists a
number ; 0 < < 1; satisfying (3.8) then for every a 2 X and every 2 . ; 1/
there exists a point x 2 P.x / such that kx  ak  kaP.a/k

:
Proof By Lemma 3.1 the function f .x/ D d.x; P.x// is l.s.c., so the set E WD fx 2
Xj .  /kx  ak  f .a/  f .x/g is closed and nonempty (as a 2 E). We show that

.8x 2 E/.9y 2 P.x/ \ E/ .  /kx  yk  f .x/  f .y/: (3.9)

Let x 2 X: Since < 1 by definition of d.x; P.x// there exists y 2 P.x/ satisfying

kx  yk  d.x; P.x//: (3.10)

On the other hand, since y 2 P.x/ it follows from (3.8) that

d.y; P.y//  d.P.x/; P.y//  kx  yk: (3.11)

Adding (3.10) and (3.11) yields

kx  yk C d.y; P.y//  d.x; P.x// C kx  yk;

hence .  /kx  yk  f .x/  f .y/; proving (3.9).


Finally, since x 2 E means .  /kx  ak  f .a/  f .x/; we have

.  /ky  ak  .  /.kx  yk C kx  ak/


 f .x/  f .y/ C f .a/  f .x/ D f .a/  f .y/:

So y 2 E and all the conditions in Caristi Theorem 3.2 are satisfied for the set
E; the map P.x/ \ E and the function f .x/ D d.x; P.x//: Therefore, there exists
.x /
x 2 P.x / \ E;, i.e., x 2 P.x / and kx  ak  f .a/f

f .a/
D  by noting that
  
f .x / D 0 as x 2 P.x /: t
u
3.3 Fixed Point and Equilibrium 91

3.3 Fixed Point and Equilibrium

Solving an equation f .x/ D 0 amounts to finding a fixed point of the mapping


F.x/ D x  f .x/: This explains the important role of fixed point theorems in
analysis and optimization. In the preceding section we have already encountered
two fixed point theorems: Caristi theorem which is equivalent to Ekeland variational
principle and Nadler contraction theorem which is an extension of Banach fixed
point principle. In this section we will study topological fixed points and their
connection with the theory of equilibrium.
The foundation of topological fixed point theorems lies in a famous propo-
sition due to the three Polish mathematicians B. Knaster, C. Kuratowski, and
S. Mazurkiewicz (Knaster et al. 1929) and for this reason called the KKM Lemma.

3.3.1 The KKM Lemma

Let S WD a0 ; a1 ; : : : ; an  be an n-simplex spanned by the vertices a0 ; a1 ; : : : ; an : For


every set I  f0; 1; : : : ; ng the set convfaij i 2 Ig is a face of S; more precisely
the face spanned by the vertices ai ; i 2 I; or the face opposite the vertices aj ; j I:
A face opposite a vertex ai is also called a facet. For example, a1 ; : : : ; an  is the
facet opposite a0 : A simplicial subdivision of S is a decomposition of it into a finite
number of n-simplices such that any two simplices either do not intersect or intersect
along a common face.
A simplicial subdivision (briefly, a subdivision) of S can be constructed induc-
tively as follows.
For a 1-simplex a0 ; a1  the subdivision is formed by two simplices a0 ; 12 .a0 C
a1 / and 12 .a0 C a1 /; a1 : Suppose subdivisions have been defined for k-simplices
with k < n: The subdivision of the n-simplex a0 ; a1 ; : : : ; an  consists of all n-
simplices spanned each by a set of the form  [ fcg; where  is the vertex set of an
1
.n  1/-simplex in the subdivision of a facet of S and c D nC1 .a0 C a1 C : : : C an /:
It can easily be checked that the required condition of a subdivision is satisfied,
namely: any two simplices in the subdivision either do not intersect or intersect
along a common face.
The just constructed subdivision is referred to as subdivision of degree 1. By
applying the subdivision operation to each simplex in a subdivision of degree 1 a
more refined subdivision is obtained which is referred to as subdivision of degree 2,
and so on. By continuing the process, subdivisions of larger and larger degrees are
obtained. It is not hard to see that the maximal diameter of simplices in a subdivision
of degree k tends to zero as k ! C1:
Consider now an n-simplex S D a0 ; a1 ; : : : ; an  and a subdivision of S consisting
of m simplices

S1 ; S2 ; : : : ; Sm : (3.12)
92 3 Fixed Point and Equilibrium

To avoid confusion we will refer to the simplices in (3.12) as subsimplices in the


subdivision. Let k be the vertex set of Sk : The set V D [m kD1 k is actually the
set of subdivision points of the subdivision. A map ` W V ! f0; 1; 2; : : : ; ng is
called a labeling of the subdivision and a subsimplex Sk is said to be complete if
`.k / D f0; 1; 2; : : : ; ng:
Lemma 3.2 Let ` W V ! f0; 1; 2; : : : ; ng be a labeling such that if z 2 V belongs
to a facet ai0 ; ai1 ; : : : ; aik  of S then `.z/ 2 fi0 ; i1 ; : : : ; ik g: Then there exists an odd
number of complete subsimplices.
Proof The proposition is trivial if S is 0-dimensional, i.e., if S is a single point.
Arguing by induction, suppose the proposition is true for any .n  1/-dimensional S
and check it for an n-dimensional S D a0 ; a1 ; : : : ; an : Consider any subsimplex Sk :
A facet of Sk will be called a fair facet if its vertex set U satisfies `.U/ D f1; : : : ; ng:
Let k denote the number of fair facets of the subsimplex Sk : It can easily be seen that
if Sk is complete then k D 1; otherwise k D 2. ThereforeP the number of complete
0
subsimplices will be odd if so is the number  D m kD1 k : Let  be the number
1
of fair facets of subsimplices that lie entirely in the facet a ; : : : ; an  of S: Clearly
each of these fair facets is in fact a complete subsimplex in the subdivision of the
.n  1/-simplex a1 ; : : : ; an  (with respect to the labeling of a1 ; : : : ; an  induced
by `/: By the induction hypothesis 0 is an odd number. The number of remaining
fair facets of subsimplices (3.12) is then   0 : But each of these fair facets is
shared by two subsimplices in (3.12), so it appears in two subsimplices in (3.12)
and is counted twice in the sum : Therefore,   0 is even, and hence,  is odd.
t
u
The above proposition, due to Sperner (1928), is often referred to as Sperner
Lemma. We are now in a position to state and prove the KKM Lemma.
Proposition 3.1 (KKM Lemma) Let fa0 ; a1 ; : : : ; an g be any finite subset of Rm : If
L0 ; L1 ; : : : ; Ln are closed subsets of Rm such that for every set I  f0; 1; : : : ; ng

convfai ; i 2 Ig  [i2I Li ; (3.13)

then \niD0 Li ;:
Proof First assume, as in the classical formulation of this proposition, that
a0 ; a1 ; : : : ; an are affinely independent. The set S D a0 ; a1 ; : : : ; an  is then an
n-simplex and for each I  f0; 1; : : : ; ng the set convfai; i 2 Ig is a face of S:
Consider a simplicial subdivision of degree k of S: Let x be a subdivision point,
i.e., a vertex of some subsimplex in the subdivision, and convfaij i 2 Ig be the
smallest face of S containing x: By assumption x 2 [i2I Li ; so there exists i 2 I
such that x 2 Li : We then set `.x/ D i: Clearly this labeling ` of the subdivision
points satisfies the condition of Lemma 3.2. Therefore, there exists a subsimplex
x0;k ; x1;k ; : : : ; xn;k  with `.xi;k / D i; i D 0; 1; : : : ; n: By compactness of S there exists
3.3 Fixed Point and Equilibrium 93

x 2 S such that, up to a subsequence, x0;k ! x : As k ! C1; the diameter of the


simplex x0;k ; x1;k ; : : : ; xn;k  tends to zero, hence kx0;k  xi;k k ! 0; i.e.,

xi;k ! x ; i D 0; 1; : : : ; n:

Since xi;k 2 Li 8k and the set Li is closed, we conclude x 2 Li ; i D 0; 1; : : : ; n:


Turning to the general case when a0 ; a1 ; : : : ; an may not be affinely independent
let K WD convfa0 ; a1 ; : : : ; an g and S WD e0 ; e1 ; : : : ; en  where ei is the i-th unit vector
of Rn ; i.e., the vector such that eii D 1; eij D 0 8j i: Noting that for every x 2 S
P Pn
we have x D niD0 xi with Pxn i  0;i iD0 xi D 1; we can define the continuous map
 W S ! K by .x/ D iD0 xi a : For each i let Mi WD fx 2 Sj .x/ 2 Li \ Kg:
By continuity of  each Mi is a closed set and from the assumption (3.13) it can
easily be deduced that for every I  f0; 1; : : : ; ng; convfei ; i 2 Ig  [i2I Mi : Since
e0 ; e1 ; : : : ; en are affinely independent, by the above there exists x 2 Mi 8i D 0;
1; : : : ; n; hence .x/ 2 Li 8i D 0; 1; : : : ; n: t
u
Remark 3.1 To get insight into condition (3.13) consider the case when
ai Dei ; iD0; 1; : : : ; n: Imagine that a sum of value 1 is to be distributed among
n C 1 persons i D 0; 1; : : : ; n: A sharing program in which person i receives a
share equal to xi can be represented by a point x D .x0 ; x1 ; : : : ; xn / of the simplex
S D e0 ; e1 ; : : : ; en : It may happen that for each person not every sharing program
is acceptable. Let Li denote the set of sharing programs x acceptable for person
i (each Li may naturally be assumed to be a closed set). Then condition (3.13),
i.e., convfei; i 2 Ig  [i2I Li simply states that for any set I  f0; 1; : : : ; n C 1g
every sharing program in which only people in the group I have positive shares is
acceptable for at least someone of them. The KKM Lemma says that under these
natural assumptions there exists a sharing program acceptable for everybody. Thus
a very simple common sense underlies this proposition whose proof, as we saw, is
quite sophisticated.

3.3.2 General Equilibrium Theorem

Given a nonempty subset C  Rn ; a nonempty subset D  Rm and a real-valued


function F.x; y/ W C  D ! R; the general equilibrium problem is to find a point x
such that

x 2 C; F.x; y/  0 8y 2 D:

Such a point x is referred to as an equilibrium point (shortly, equilibrium) for the


system .C; D; F/:
In the particular case when C D D is a nonempty closed convex set this problem
was first introduced by Blum and Oettli (1994) and since then has been extensively
studied.
94 3 Fixed Point and Equilibrium

The KKM Lemma can be viewed as a proposition asserting the existence of


equilibrium in a basic particular case. In fact given a finite set  Rm ; and for
each y 2 a closed set L.y/  Rm ; the KKM Lemma gives conditions for the
existence of an x 2 \y2 L.y/: If we define F.x; y/ D y .x/  1 where y .x/ is the
characteristic function of the closed set L.y/ (so that F.x; y/ D 0 for x 2 L.y/ and
F.x; y/ D 1 otherwise), then x 2 \y2 L.y/ means x 2 ; F.x; y/  0 8y 2 :
So x is an equilibrium point for the system .; ; F/:
Based on the KKM Lemma we now prove a general existence condition for
equilibrium in a system .C; D; F/. Recall that given a set X  Rn and a set Y  Rm ;
a set-valued map f from X to Y is a map which associates with each point x 2 X a
set f .x/  Y; called the value of f at x: A set-valued map f from X to Y is said to be
closed if its graph

f.x; y/j x 2 X; y 2 f .x/g

is closed in Rn  Rm ; i.e., if x 2 X; y 2 f .x / and x ! x; y ! y always imply


y 2 f .x/:
Theorem 3.4 (General Equilibrium Theorem) Let C be a nonempty compact
subset of Rn ; D a nonempty closed convex subset of Rm ; and F.x; y/ W C  D ! R
a function which is upper semi-continuous in x and quasiconvex in y: If there exists
a closed set-valued map ' from D to C with nonempty convex compact values such
that infy2D infx2'.y/ F.x; y/  0 then the system .C; D; F/ has an equilibrium point,
i.e., there exists an x such that

x 2 C; F.x; y/  0 8y 2 D: (3.14)

Proof For every y 2 D define C.y/ WD fx 2 Cj F.x; y/  0g: By assumption


F.x; y/  0 8x 2 '.y/; 8y 2 D; so '.y/  C.y/ 8y 2 D: Since '.y/ ;; this
implies C.y/ ;I furthermore, C.y/ is compact because C is compact and F.:; y/
is u.s.c. So to prove (3.14), i.e., that \y2D C.y/ ;; it will suffice to show that
the family fC.y/; y 2 Dg has the finite intersection property, i.e., for any finite set
fa1 ; : : : ; ak g  D, we have

\kiD1 C.ai / ;: (3.15)

Let WD fai ; i D 1; : : : ; kg and for each i D 1; : : : ; k define L.ai / WD fy 2


convj '.y/ \ C.ai / ;g (note that conv  D because D is convex). It is
easily seen that L.ai / is a closed subset of D: Indeed, let fy g  L.ai / be a sequence
such that y ! y: Then y 2 conv; so y 2 conv because the set conv is closed.
Further, for each  there is x 2 '.y / \ C.ai /: Since C.ai / is compact we have,
up to a subsequence, x ! x for some x 2 C.ai / and since the map ' is closed,
x 2 '.y/: Consequently, x 2 '.y/ \ C.ai /; hence '.y/ \ C.ai / ;; i.e., y 2 L.ai /:
3.3 Fixed Point and Equilibrium 95

For any nonempty set I  f1; : : : ; kg; let DI WD fy 2 convj C.y/ 


[i2I C.ai /g D fy 2 convj F.x; y/ < 0 8x [i2I C.ai /g: In view of the
quasiconvexity of F.x; :/ the set DI is convex and since ai 2 DI 8i 2 I it follows
that

convfai ; i 2 Ig  DI  fy 2 convj '.y/  [i2I C.ai /g: (3.16)

Therefore,

convfai ; i 2 Ig  [i2I fy 2 convj '.y/ \ C.ai / ;g  [i2I L.ai /: (3.17)

Since this holds for every set I  f1; : : : ; kg; by KKM Lemma, we have
\kiD1 L.ai / ;: So there exists y 2 conv such that '.y/ \ C.ai / ; 8i D
1; : : : ; k:
Now for each i D 1; : : : ; k take an xi 2 '.y/ \ C.ai /: It is not hard to see that for
every set I  f1; : : : ; kg, we have

M WD convfxi ; i 2 Ig  [i2I C.ai /: (3.18)

In fact, assume the contrary, that S WD fx 2 Mj x [i2I C.ai /g ;: Since y 2


conv; by (3.16) '.y/  [kiD1 C.ai /; and since '.y/ is convex it follows that

M  '.y/  [kiD1 C.ai /: (3.19)

Therefore, S D fx 2 Mj x 2 [kiD1 C.ai / n [i2I C.ai /g D fx 2 Mj x 2 [iI C.ai /g D


M \ .[iI C.ai //: Since each set C.ai /; i D 1; : : : ; k; is compact, so is the set
[iI C.ai /; hence S is a closed subset of M: On the other hand, M n S D fx 2
Mj x 2 [i2I C.ai /g D M \ .[i2I C.ai //; so M n S; too, is a closed subset of M: But
M n S ; because xi 2 C.ai / 8i 2 I: So M is the union of two disjoint nonempty
closed sets S and M n S; conflicting with the convexity of M: Therefore S D ;,
proving (3.18).
Since (3.18) holds for every set I  f1; : : : ; kg; again by KKM Lemma (this time
with D fx1 ; : : : ; xk g; L.xi / D C.ai /), we have \kiD1 C.ai / ;; i.e., (3.15) and
hence, \y2D C.a/ ;; as was to be proved. t
u
Remark 3.2 (i) Theorem 3.4 still holds if instead of the compactness of C we
only assume the existence of a y 2 D such that the set fx 2 Cj F.x; y/  0g
is compact (or, equivalently, such that F.x; y/ ! 1 as x 2 C; kxk ! C1).
Indeed, for any finite set N  D we can write \y2N[fyg C.y/ D \y2D C0 .y/;
where C0 .y/ D C.y/ \ C.y/; so the family fC0 .y/; y 2 Dg has the finite
intersection property; since C0 .y/ is compact it follows that \y2D C0 .y/ ;;
hence \y2D C.y/ ; and the above proof goes through.
(ii) By replacing F.x; y/ with F.x; y/  ; where is any real number Theorem 3.4
can also be reformulated as follows:
96 3 Fixed Point and Equilibrium

Let C  Rn be a nonempty compact set, D  Rm a nonempty closed convex


set, and F.x; y/ W C  D ! R a function upper semi-continuous in x and
quasiconvex in yI let ' be a closed set-valued map from D to C with nonempty
convex compact values. Then for any real number we have the implication
inf inf F.x; y/  ) 9x 2 C inf F.x; y/  : (3.20)
y2D x2'.y/ y2D

Or equivalently,
max inf F.x; y/  inf inf F.x; y/: (3.21)
x2C y2D y2D x2'.y/

Remark 3.3 For a heuristic interpretation of Theorem 3.4, consider a company (a


person, a community) with an utility function f .x; y/ that depends upon a variable
x 2 C under its control and a variable y 2 D outside its control. For each given
y 2 D; the best for the company is certainly to select an xy 2 C maximizing F.xy ; y/
over C; i.e., such that f .xy ; y/  f .x; y/  0 8x 2 C: Theorem 3.4 tells under which
conditions there exists an x 2 C such that f .x; y/  f .x; y/  0 8x 2 C 8y 2 D; i.e.,
an x D xy for any whatever y 2 D:
As it turns out, such kind of situation underlies the equilibrium mechanism in
most cases of interest. For instance, a weaker variant of Theorem 3.4 with both
C; D compact and F.x; y/ continuous on C  D was shown many years ago to be
a generalization of the Walras Excess Demand Theorem (Tuy 1976). Below we
show how various important existence propositions pertaining to equilibrium can be
obtained as immediate consequences of Theorem 3.4.

3.3.3 Equivalent Minimax Theorem

First an alternative equivalent form of Theorem 3.4 is the following minimax


theorem:
Theorem 3.5 Let C be a nonempty compact subset of Rn ; D a nonempty closed
convex subset of Rm ; and F.x; y/ W C  D ! R a function which is u.s.c. in x
and quasiconvex in y: If there exists a closed set-valued map ' from D to C with
nonempty compact convex values such that for every < WD infy2D supx2C F.x; y/
we have '.y/  C .y/ WD fx 2 Cj F.x; y/  g then the following minimax equality
holds:
max inf F.x; y/ D inf sup F.x; y/: (3.22)
x2C y2D y2D x2C

Proof Clearly, for every < the assumption '.y/  C .y/ 8y 2 D means
that infy2D infx2'.y/ F.x; y/  while by Theorem 3.4 the latter inequality implies
maxx2C infy2D F.x; y/  : Therefore
max inf F.x; y/  WD inf sup F.x; y/;
x2C y2D y2D x2C
3.3 Fixed Point and Equilibrium 97

whence (3.22) because the converse inequality is trivial. Thus Theorem 3.4 implies
Theorem 3.5. The converse can be proved in the same manner. t
u
Corollary 3.2 Let C be a nonempty compact convex set in Rn ; D a nonempty closed
convex set in Rm ; and F.x; y/ W C  D ! R a u.s.c. function which is quasiconcave
in x and quasiconvex in y: Then there holds the minimax equality (3.22).
Proof For every < define '.y/ WD fx 2 Cj F.x; y/  g: Then '.y/ is a
nonempty closed convex set. If .x ; y / 2 C  D; .x ; y / ! .x; y/; and x 2 '.y /;
i.e., F.x ; y /  then by upper semi-continuity of F.x; y/, we must have F.x; y/ 
; i.e., x 2 '.y/: So the set-valued map ' is closed and '.y/ D C .y/ 8y 2 D: The
conclusion follows by Theorem 3.5. t
u
Note that Corollary 3.2 differs from Sions minimax theorem (Theorem 2.7) only
by the continuity properties imposed on F.x; y/: Actually, with a little bit more skill
Sions theorem can also be derived directly from the General Equilibrium Theorem
(Exercise 6).

3.3.4 Ky Fan Inequality

When D D C and '.y/ is single-valued Theorem 3.4 yields


Theorem 3.6 (Ky Fan 1972) Let C be a nonempty compact convex set in Rn ; and
F.x; y/ W C  C ! R a function which is u.s.c in x and quasiconvex in y: If '.y/ is a
continuous map from C into C there exists x 2 C such that

inf F.x; y/  inf F.'.y/; y/: (3.23)


y2C y2C

In the special case '.y/ D y 8y 2 C this inequality becomes

inf F.x; y/  inf F.y; y/:


y2C y2C

If, in addition, F.y; y/ D 0 8y 2 C then

inf F.x; y/  0:
y2C

Proof The inequality (3.21) reduces to (3.23) when '.y/ is a singleton. t


u
In the special case F.x; y/ D kx  yk we have from (3.23) miny2C kx  yk 
miny2C k'.y/  yk; hence 0 D miny2C k'.y/  yk; which means there exists y 2 C
such that y D '.y/: Thus, as a consequence of Ky Fan inequality we get Brouwer
fixed point theorem: a continuous map ' from a compact convex set C into itself
has a fixed point. Actually, the following more general fixed point theorem is also a
direct consequence of Theorem 3.4:
98 3 Fixed Point and Equilibrium

3.3.5 Kakutani Fixed Point

Theorem 3.7 (Kakutani 1941) Let C be a compact convex subset of Rn and ' a
closed set-valued map from C to C with nonempty convex values '.x/  C 8x 2 C:
Then ' has a fixed point, i.e., a point x 2 '.x/:
Proof The function F.x; y/ W C  C ! R defined by F.x; y/ D kx  yk is continuous
on C  C and convex in y for every fixed x: By Theorem 3.4 (Remark 3.2, (3.21))
there exists x 2 C such that

inf kx  yk  inf inf kx  yk:


y2C y2C x2'.y/

Since infy2C kx  yk D 0; for every natural k we have infy2C infx2'.y/ kx  yk  1=k;


hence there exist yk 2 C; xk 2 '.yk /  C such that kyk xk k  1=k: By compactness
of C there exist x; y such that, up to a subsequence, xk ! x 2 C; yk ! x 2 C: Since
the map ' is closed it follows that x 2 '.x/: t
u
Historically, Brouwer Theorem was established in 1912. Kakutani Theorem was
derived from Brouwer Theorem almost three decades later, via a quite involved
machinery. The above proof of Kakutani Theorem is perhaps the shortest one.

3.3.6 Equilibrium in Non-cooperative Games

Consider an n-person game in which the player i has a strategy set Xi  Rmi : When
the player i chooses a strategyQ xi 2 Xi ; the situation of the game is described by the
vector x D .x1 ; : : : ; xn / 2 niD1 Xi : In that situation the player i obtains a payoff
fi .x/:
Assume that each player does Qn not know which strategy is taken by the other
players. A vector x 2 X WD iD1 Xi is called a Nash equilibrium if for every
i D 1; : : : ; n W

fi .x1 ; x2 ; : : : ; xn / D max fi .x1 ; : : : ; xi1 ; xi ; xiC1 ; : : : ; xn /:


xi 2Xi

In other words, x 2 X is a Nash equilibrium if

fi .x/ D max fi .xji xi /;


xi 2Xi

Qn
where for any z 2 Xi ; xji z denotes the vector x 2 iD1 Xi in which xi has been
replaced by z:
3.3 Fixed Point and Equilibrium 99

Theorem 3.8 (Nash Equilibrium Theorem (Nash 1950)) Assume that for each
i D 1; : : : ; n the set Xi is convex compact and the function fi .x/ is continuous on X
and concave in xi : Then there exists a Nash equilibrium.
Qn
Proof The set X WD iD1 Xi is convex, compact as the Cartesian product of n
convex compact sets. Consider the function

X
n
F.x; y/ WD fi .x/  fi .xji yi /; x 2 X; y 2 X: (3.24)
iD1

This is a function continuous in x and convex in y: The latter is due to the assumption
that for each i D 1; : : : ; n the function fi .x/ is concave in xi ; which means that the
function z 7! fi .xji z/ is concave. Since yji yi D y we also have F.y; y/ D 0 8y 2 X:
Therefore, by Ky Fan inequality Theorem 3.6, there exists x 2 X satisfying

X
n
F.x; y/ D fi .x/  fi .xji yi /  0 8y 2 X:
iD1

Fix an arbitrary i and take y 2 X such that yj D xj for j i: Writing the above
inequality as
X
fi .x/  fi .xji yi / C fj .x/  fj .xjj xj /  0 8y 2 X;
ji

and noting that xjj xj D x we deduce the inequality fi .x/  fi .xjyi / 8yi 2 Xi : Since
this holds for every i D 1; : : : ; n; x is a Nash equilibrium. t
u

3.3.7 Variational Inequalities

As we saw in Chap. 2, Proposition 2.31, a minimizer x of a convex function f W


Rn ! R over a convex set C  Rn must satisfy the condition

0 2 @f .x/ C NC .x/; (3.25)

where @f .x/ is the subdifferential of f at x and NC .x/Dfy 2 Rn j hy; xxi  0


8x 2 Cg is the outward normal cone to C at x: The above condition can be
equivalently written as

9y 2 @f .x/ hy; x  xi  0 8x 2 C:

A relation like (3.25) is called a variational inequality. Replacing the map x 7!


@f .x/ with a given arbitrary set-valued map from a nonempty closed convex set
100 3 Fixed Point and Equilibrium

C to Rn , we consider the variational inequality for (C; ):


.VI.C; // Find x 2 C satisfying 0 2 .x/ C NC .x/:
Alternatively, following Robinson (1979) we can consider the generalized
equation
0 2 .x/ C NC .x/: (3.26)
Although only a condensed form of writing VI.C; /; the generalized equation
(3.26) is often a more efficient tool for studying many questions of nonlinear
analysis and optimization.
Based on Theorem 3.4 we can now derive a solvability condition for the
variational inequality VI.C; / or the generalized equation (3.26). For that we need
a property of set-valued maps.
A set-valued map f .x/ from a set X  Rn to a set Y  Rm is said to be upper
semi-continuous if for every x 2 X and every open set U  f .x/ there exists a
neighborhood W of x such that f .z/  U 8z 2 W \ X:
Proposition 3.2 (i) An upper semi-continuous set-valued map f from a set X to a
set Y which has closed values is a closed map. Conversely, a closed set-valued
map f from X to a compact set Y is upper semi-continuous.
(ii) If f .x/ is an upper semi-continuous map from a compact set X to Y with
nonempty compact values then f .X/ is compact.
Proof (i) Let f be an upper semi-continuous map with closed values. If .x ; y / 2
X  Y is a sequence such that .x ; y / ! .x; y/ 2 X  Y; then for any
neighborhood U of f .x/ we will have y 2 U for all large enough ; so
y 2 f .x/: Conversely, let f be a closed set-valued map from X to a compact
set Y: If f is not upper semi-continuous. There would exist x0 2 X and an
open set U  f .x0 / such that for every k there exists xk ; yk 2 f .xk / satisfying
kxk  x0 k  1=k; yk U: Since Y is compact, by taking a subsequence if
necessary, yk ! y 2 Y and since f .x0 /  U; clearly y f .x0 /; conflicting with
the map being closed.
(ii) Let Wt ; t 2 T; be an open covering of f .X/: For each fixed x 2 X since f .x/
is compact there exists a finite covering Wi ; i 2 I.x/  T: In view of the
upper semi-continuity of the map f there exists an open ball V.y/ around x
such that f .z/  [i2I.x/ Wi 8z 2 V.x/: Then by compactness of Y there exists
a finite set E  Y such that Y is entirely covered by [x2E V.x/: So the finite
family Wt ; t 2 [fWi ; i 2 I.x/; x 2 Eg covers f .X/; proving the compactness of
this set. t
u
Theorem 3.9 Let C be a compact convex set in Rn ; an upper semi-continuous
set-valued map from C to Rn with nonempty compact convex values. Then the
variational inequality VI.C; ) [the generalized equation (3.26)] has a solution,
i.e., there exists x 2 C and y 2 .x/ such that

hy; x  xi  0 8x 2 C:
3.4 Exercises 101

Proof By Proposition 3.2 is a closed map and its range D D .C/ is compact.
Define F.x; y/ D hx; yi  minz2C hz; yi: This function is continuous in y and convex
in x: By Theorems3.4, (3.21), there exists y 2 D such that

inf F.x; y/  inf inf F.x; y/:


x2C x2C y2.x/

But infx2C F.x; y/ D minx2C hx; yi  minz2C hz; yi D 0; hence

inf inf F.x; y/  0:


x2C y2.x/

Since C is compact, for every k D 1; 2; : : : ; there exists xk 2 C satisfying

inf F.xk ; y/  1=k;


y2.xk /

hence there exists xk 2 C; yk 2 .xk / such that F.xk ; yk /  1=k; i.e.,

hxk ; yk i  minhz; yk i  1=k:


z2C

Let .x; y/ be a cluster point of .xk ; yk /: Then x 2 C; y 2 .x/ because the map is
closed, and hx; yi  minz2C hz; yi  0; i.e., hy; x  xi  0 8x 2 C as to be proved. u
t

3.4 Exercises

1 Derive Ekeland variational principle from Theorem 3.2. (Hint: Let x" be an
"-approximate minimizer of f .x/ on X and  > 0: Define P.x/ D fy 2 Xj " ky 
x" /k  f .x" /  f .y/g and show that if Ekelands theorem is false there exists for
every x 2 X an y 2 P.x/ satisfying " kx  yk < f .x/  f .y/: Then apply Caristi
theorem to derive a contradiction.)
2 Let S be an n-simplex spanned by a0 ; a1 ; : : : ; an : For each i D 0; 1; : : : ; n let
Fi be the facet of S opposite the vertex ai (i.e., the .n  1/-simplex spanned by n
points aj ; j i/; and Li a given closed subset of S: Show that the KKM Lemma is
equivalent to saying that if the following condition is satisfied:
S  [niD0 Li ; Fi  Li ; 8i;
then \niD0 Li ;:
3 Let S; Fi ; Li ; i D 0; 1; : : : ; n be as in the previous exercise. Show that the KKM
Lemma is equivalent to saying that if the following condition is satisfied:
S  [niD0 Li ; ai 2 Li ; Fi \ Li D ;;
then \niD0 Li ;:
102 3 Fixed Point and Equilibrium

4 Derive the KKM Lemma from Brouwer fixed point theorem. (Hint: Let
S; L0 ; L1 ; : : : ; Ln be as in the previous exercise, .x; Li / D minfkx  ykj y 2 Li g:
For k > 0 set Hi D fx 2 Sj .x; Li /  1=k and define a continuous map f W S ! S
by fi .x/ D Pxni1 .x;H i/
with the convention x1 D xn : Show that if x D f .x/ then
iD1 xi .x;Hi /
fi .x/ > 0 8i; hence .x; Hi / > 0; i.e., .x; Li /  1=k for every i D 0; 1; : : : ; n: Since
k can be arbitrary, deduce that \niD0 Li ;:)
5 Let C1 ; : : : ; Cn be n closed convex sets in Rk such that their union is a convex set
C: Show that if the intersection of n  1 arbitrary sets among them is nonempty then
the intersection of the n sets is nonempty. (Hint: Let ai 2 Li WD \ji Cj and apply
KKM to the set fa1 ; : : : ; an g and the collection of closed sets fL1 ; : : : ; Ln g.)
6 Let C be a compact convex subset of Rn ; D a closed convex subset of
Rm ; f .x; y/ W C  D ! R a function which is quasi convex, l.s.c. in x and
quasiconcave u.s.c. in y: Show that for every >
WD supy2D infx2C f .x; y/ the
set-valued map ' from D to C such that '.y/ WD fx 2 Cj f .x; y/  g 8y 2 D is
closed and using this fact deduce Sions Theorem from Theorem 3.4. (Hint: show
that the set-valued map ' is u.s.c., hence closed because C is compact.)
Chapter 4
DC Functions and DC Sets

4.1 DC Functions

Convexity is a nice property of functions but, unfortunately, it is not preserved under


even such simple algebraic operations as scalar multiplication or lower envelope. In
this chapter we introduce the dc structure (also called the complementary convex
structure ) which is the common underlying mathematical structure of virtually all
nonconvex optimization problems.
Let be a convex set in Rn . We say that a function is dc on if it can be
expressed as the difference of two convex functions on , i.e., if f .x/ D f1 .x/f2 .x/,
where f1 ; f2 are convex functions on .
Of course, convex and concave functions are particular examples of dc functions. A quadratic function $f(x) = \langle x, Qx \rangle$, where $Q$ is a symmetric matrix, is a dc function which may be neither convex nor concave. Indeed, setting $x = Uy$, where $U = [u^1, \ldots, u^n]$ is the matrix of normalized eigenvectors of $Q$, we have $U^T QU = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, hence $f(x) = f(Uy) = \langle Uy, QUy \rangle = \langle y, U^T QUy \rangle$, so that $f(x) = f_1(x) - f_2(x)$ with

$$f_1(x) = \sum_{\lambda_i \ge 0} \lambda_i y_i^2, \qquad f_2(x) = -\sum_{\lambda_i < 0} \lambda_i y_i^2, \qquad y = U^{-1}x.$$
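This eigenvalue splitting is easy to carry out numerically. The following small sketch (our own illustration in Python with NumPy; the matrix $Q$ is arbitrary example data) computes $f_1, f_2$ and checks that $f = f_1 - f_2$:

```python
import numpy as np

Q = np.array([[1.0, 3.0],
              [3.0, -2.0]])        # a symmetric indefinite matrix (example data)

lam, U = np.linalg.eigh(Q)         # Q = U diag(lam) U^T, U orthogonal

def f(x):
    return x @ Q @ x               # f(x) = <x, Qx>

def f1(x):                         # sum over nonnegative eigenvalues: convex
    y = U.T @ x                    # y = U^{-1} x since U is orthogonal
    return np.sum(lam[lam >= 0] * y[lam >= 0] ** 2)

def f2(x):                         # minus the sum over negative eigenvalues: convex
    y = U.T @ x
    return -np.sum(lam[lam < 0] * y[lam < 0] ** 2)

x = np.array([0.7, -1.3])
assert np.isclose(f(x), f1(x) - f2(x))   # dc representation f = f1 - f2
```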

Actually most functions encountered in practice are dc, as will shortly be demonstrated.

Functions which can be represented as differences of convex functions were considered some 40 years ago by Alexandrov (1949, 1950) and Landis (1951), and some time later by Hartman (1959), who proved a number of important properties to be discussed below. For our purpose, the first fundamental property of these functions is their stability relative to operations frequently used in optimization theory.


Proposition 4.1 If $f_i(x)$, $i = 1, \ldots, m$, are dc functions on $\Omega$ then the following functions are also dc on $\Omega$:

(i) $\sum_{i=1}^m \alpha_i f_i(x)$, for any real numbers $\alpha_i$;
(ii) $g(x) = \max\{f_1(x), \ldots, f_m(x)\}$;
(iii) $h(x) = \min\{f_1(x), \ldots, f_m(x)\}$.

Proof We only prove (ii) because (i) is trivial and the proof of (iii) is similar. Let $f_i(x) = p_i(x) - q_i(x)$ with $p_i, q_i$ convex on $\Omega$. From the obvious equality $f_i = [p_i + \sum_{j \ne i} q_j] - \sum_j q_j$ it follows that

$$\max\{f_1, \ldots, f_m\} = \max_i \Big[p_i + \sum_{j \ne i} q_j\Big] - \sum_j q_j,$$

which is a difference of two convex functions, since the upper envelope and the sum of finitely many convex functions are convex. □
Thus, the set of dc functions on $\Omega$ forms the smallest linear space containing all convex functions on $\Omega$. This linear space, denoted by $DC(\Omega)$, is stable under the operations of upper and lower envelope of finitely many functions.
An inequality of the form $f(x) \le 0$, where the function $f(x)$ is convex, is called a convex inequality (because the set of all $x$ satisfying this inequality is a convex set). If $f(x)$ is concave, then the inequality $f(x) \le 0$ is called complementary convex or reverse convex, because its solution set is the complement of a convex set. Thus, a reverse convex inequality is also an inequality of the form $f(x) \ge 0$, where $f(x)$ is convex. If $f(x)$ is a dc function then the inequality $f(x) \le 0$ is called a dc inequality. Of course the inequality $f(x) \ge 0$ is then also a dc inequality, because $-f(x)$ is still dc.

An important consequence of Proposition 4.1 is that any finite system of dc inequalities, whether conjunctive or disjunctive, is equivalent to a single dc inequality. Indeed, a conjunctive system

$$g_i(x) \le 0, \quad i = 1, \ldots, m$$

can be written as

$$g(x) := \max\{g_1(x), \ldots, g_m(x)\} \le 0 \tag{4.1}$$

while a disjunctive system

$$g_i(x) \le 0 \ \text{ for at least one } i = 1, \ldots, m$$

is equivalent to

$$g(x) := \min\{g_1(x), \ldots, g_m(x)\} \le 0. \tag{4.2}$$

Furthermore, if $g(x) = p(x) - q(x)$ with $p(x), q(x)$ convex, then, introducing an additional variable $t$, the dc inequality (4.1) [or (4.2)] can be split into two inequalities:

$$p(x) - t \le 0, \qquad t - q(x) \le 0, \tag{4.3}$$

where the first is a convex inequality, and the second is a reverse convex inequality.
Thus, any finite system of dc inequalities can be rewritten as a system of one convex
inequality and one reverse convex inequality. As will be seen later, this property
allows a compact description of a wide class of nonconvex optimization problems
and provides the basis for a unified theory of global optimization.
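As a small illustration of this splitting (a toy example of ours, not from the text), consider the conjunctive system $x_1^2 - x_2^2 \le 0$, $x_2^2 - x_1^2 \le 0$ with dc representations $g_1 = x_1^2 - x_2^2$, $g_2 = x_2^2 - x_1^2$. Following the proof of Proposition 4.1,

$$g(x) = \max\{g_1(x), g_2(x)\} = \max\{2x_1^2,\, 2x_2^2\} - (x_1^2 + x_2^2),$$

so the system is equivalent to the pair $\max\{2x_1^2, 2x_2^2\} - t \le 0$ (convex) and $t - (x_1^2 + x_2^2) \le 0$ (reverse convex) in the variables $(x, t)$.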

4.2 Universality of DC Functions

The next property shows that virtually all of the most frequently encountered functions in practice are dc. As usual, $C^k(R^n)$ (or simply $C^k$) denotes the class of functions on $R^n$ continuously differentiable up to order $k$.
Proposition 4.2 Every function $f \in C^2(R^n)$ is dc on any compact convex set $\Omega \subset R^n$.

Proof We show that, given any compact convex set $\Omega \subset R^n$, the function $g(x) = f(x) + \rho\|x\|^2$ becomes convex on $\Omega$ when $\rho$ is sufficiently large (then $f(x) = g(x) - \rho\|x\|^2$ yields a dc representation of $f$). Indeed, since $\langle u, \nabla^2 g(x)u \rangle = \langle u, \nabla^2 f(x)u \rangle + 2\rho\|u\|^2$, if $\rho$ is so large that

$$-\min\{\langle u, \nabla^2 f(x)u \rangle \mid x \in \Omega, \ \|u\| = 1\} \le 2\rho$$

then $\langle u, \nabla^2 g(x)u \rangle \ge 0$ for all $u$, hence $g(x)$ is convex by Proposition 2.2. □
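In practice one can estimate a workable $\rho$ for a given smooth $f$ on a box numerically. The sketch below (our own heuristic illustration, not from the text: sampling the Hessian on a finite grid only approximates the true minimum appearing in the proof) picks $\rho$ from the smallest sampled Hessian eigenvalue:

```python
import numpy as np

def f(x):                                  # an example nonconvex C^2 function
    return np.sin(x[0]) * x[1] ** 2

def hessian(x, h=1e-4):                    # central finite-difference Hessian
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * h, np.eye(n)[j] * h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H

# Sample the box [-1, 1]^2 and estimate the smallest Hessian eigenvalue there.
grid = [np.array([a, b]) for a in np.linspace(-1, 1, 15)
                         for b in np.linspace(-1, 1, 15)]
lam_min = min(np.linalg.eigvalsh(hessian(x)).min() for x in grid)
rho = max(0.0, -lam_min / 2)               # then g(x) = f(x) + rho * ||x||^2
print(rho)                                 # is (approximately) convex on the box
```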
Corollary 4.1 For any continuous function $f(x)$ on a compact convex set $\Omega$ and for any $\varepsilon > 0$ there exists a dc function $g(x)$ such that

$$\max_{x \in \Omega} |f(x) - g(x)| \le \varepsilon.$$

Proof By the Weierstrass approximation theorem there exists a polynomial $g(x)$ satisfying the required condition, and obviously $g(x) \in C^2$. □
Thus $DC(\Omega)$ is dense in $C(\Omega)$, the Banach space of continuous functions on $\Omega$ equipped with the sup norm. Though dc functions occur frequently in practice, they often appear in a hidden, not directly recognizable, form. To help identify dc functions in various situations we prove some further properties of these functions.

A function $f: D \to R$ defined on a convex set $D \subset R^n$ is said to be locally dc if for every $x \in D$ there exist a convex open neighborhood $U$ of $x$ and a pair of convex functions $g, h$ on $U$ such that $f|_U = g|_U - h|_U$. Note that since $U$ can be assumed bounded, the sets $\cup\{\partial g(y) \mid y \in U\}$ and $\cup\{\partial h(y) \mid y \in U\}$ are bounded by Theorem 2.6, so by Corollary 2.10 the functions $g, h$ can be assumed convex finite on all of $R^n$.

Proposition 4.3 A locally dc function on a convex, open, or closed, set $D$ is dc on $D$.

Proof We shall restrict ourselves to the case when $D$ is a compact convex set (a proof for the general case can be found in Hartman (1959); see also Ellaia (1984)). From the hypothesis and the compactness of $D$ one can find a finite set $\{x^1, \ldots, x^k\} \subset D$ together with open convex neighborhoods $U_1, \ldots, U_k$ of these points covering $D$, and convex functions $h_i: R^n \to R$ ($i = 1, \ldots, k$) such that $(f + h_i)|_{U_i}$ is convex. Let $h = \sum_{i=1}^k h_i$ and consider the function $g = f + h$. For each $i$, $(f + h_i)|_{U_i}$ is convex, hence $g|_{U_i} = (f + h_i)|_{U_i} + (\sum_{j \ne i} h_j)|_{U_i}$ is convex. By Proposition 2.4, $g$ is convex on $D$, i.e., $f = g - h$ with $g, h$ convex. □
A mapping $F = (f_1, f_2, \ldots, f_m): \Omega \to R^m$ defined on a convex set $\Omega \subset R^n$ is said to be dc on $\Omega$ (written $F \in DC(\Omega)$) if $f_i \in DC(\Omega)$ for every $i = 1, \ldots, m$.
Proposition 4.4 Let $\Omega_1 \subset R^n$, $\Omega_2 \subset R^m$ be convex sets such that $\Omega_1$ is open or closed and $\Omega_2$ is open. If $F_1: \Omega_1 \to \Omega_2$ and $F_2: \Omega_2 \to R^k$ are dc mappings then $F_2 \circ F_1: \Omega_1 \to R^k$ is also a dc mapping.

Proof It suffices to show that if $F = (f_1, \ldots, f_m): \Omega_1 \to \Omega_2$ is dc and $g: \Omega_2 \to R$ is convex then $g(f_1, \ldots, f_m) \in DC(\Omega_1)$. Let $x \in \Omega_1$ and $y = F(x) \in \Omega_2$. The convex function $g(y)$ can be represented in a neighborhood $U_2$ of $y$ as the pointwise supremum of a family of affine functions: $g(y) = \sup_t \ell_t$, where $\ell_t = a_{0t} + a_{1t}y_1 + \cdots + a_{mt}y_m$ ($y_1, \ldots, y_m$ are the coordinates of $y$ in $R^m$) and $M = \sup_{i,t} |a_{it}| < +\infty$ (cf. Corollary 2.9). Let $f_i(x) = f_i^+(x) - f_i^-(x)$ in a neighborhood $U_1$ of $x$ such that $F(U_1) \subset U_2$. Then

$$\ell_t(f_1, \ldots, f_m) = a_{0t} + \sum_{i=1}^m a_{it}f_i^+ - \sum_{i=1}^m a_{it}f_i^- = a_{0t} + \left[\sum_{i=1}^m (M + a_{it})f_i^+ + \sum_{i=1}^m (M - a_{it})f_i^-\right] - M\sum_{i=1}^m (f_i^+ + f_i^-) = p_t - q$$

with $p_t$ and $q$ convex and $q$ independent of $t$. Then $g(f_1, \ldots, f_m) = \sup_t \ell_t(f_1, \ldots, f_m) = \sup_t (p_t - q) = \sup_t p_t - q = p - q$, i.e., $g(f_1, \ldots, f_m)$ is locally dc on $\Omega_1$. Hence, by Proposition 4.3, $g \circ F \in DC(\Omega_1)$. □
Corollary 4.2 Let $\Omega_1, \Omega_2$ be as in Proposition 4.4. If $F_1: \Omega_1 \to \Omega_2$ is dc on $\Omega_1 \subset R^n$ and $F_2: \Omega_2 \to R^k$ is $C^2$-smooth, then $F_2 \circ F_1$ is dc on $\Omega_1$. In particular, the product of two dc functions is dc; if $f(x)$ is dc on an open (or closed) convex set $\Omega$ and $f(x) \ne 0$ for all $x \in \Omega$ then $\frac{1}{f(x)}$ and $|f(x)|^{1/m}$ are dc on $\Omega$.

Proof Indeed, $C^2$-smooth functions are dc. □
The above properties explain why an overwhelming majority of functions of
practical interest are dc.
Remark 4.1 Following McCormick (1972) a function $f(x)$ is called a factorable function if it results from a finite sequence of compositions of transformed sums and products of simple functions of one variable. In other words, a factorable function is a function which can be obtained as the last in a sequence of functions $f_1, f_2, \ldots$, built up as follows:

$$f_i(x) = x_i \quad (i = 1, \ldots, n) \tag{4.4}$$

and for $k > n$, $f_k$ has one of the forms:

$$f_k(x) = f_l(x) + f_j(x) \ \text{ for some } l, j < k \tag{4.5}$$
$$f_k(x) = f_l(x) \cdot f_j(x) \ \text{ for some } l, j < k \tag{4.6}$$
$$f_k(x) = F(f_j(x)) \ \text{ for some } j < k, \tag{4.7}$$

where $F: R \to R$ is a simple dc function of one variable, such as $F(t) = t^p$, $F(t) = e^t$, $F(t) = \log|t|$, $F(t) = \sin t$, etc.

From the above corollary it also easily follows that a factorable function is a dc function.
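To make the definition concrete, here is a toy factorable decomposition (our own example) of $f(x) = e^{x_1x_2} + \sin x_1$ as a sequence of steps of the forms (4.4)-(4.7):

```python
import math

def evaluate(x):
    """Factorable evaluation of f(x) = exp(x1*x2) + sin(x1)."""
    f1 = x[0]            # rule (4.4): coordinate x1
    f2 = x[1]            # rule (4.4): coordinate x2
    f3 = f1 * f2         # rule (4.6): product
    f4 = math.exp(f3)    # rule (4.7): F = exp, a dc function of one variable
    f5 = math.sin(f1)    # rule (4.7): F = sin, a dc function of one variable
    f6 = f4 + f5         # rule (4.5): sum
    return f6

assert abs(evaluate([1.0, 2.0]) - (math.exp(2.0) + math.sin(1.0))) < 1e-12
```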
Remark 4.2 When a function $f$ is dc and $f = g - h$ is a dc representation of $f$, then for any convex function $u$, $f = (g + u) - (h + u)$ gives another dc representation of $f$. By taking $u$ to be any strictly convex function, we see that a dc function is always expressible as a difference of two strictly convex functions.

4.3 DC Representation of Basic Composite Functions

In many questions it is important to know how to write a dc function effectively as a difference of convex functions. This problem of (effective) dc representation, which is far from simple, will be discussed in this and the next sections for some particular classes of functions.

To begin with, observe that many functions expressing cost, utility, attraction, repulsion, etc., in applied fields (e.g., location theory, molecular conformation, and quadratic assignment problems) are compositions of convex or concave functions with convex or concave monotone functions. These functions are dc by Proposition 4.4. Let us show how to find their dc representations (Tuy 1996). First recall

from Proposition 2.8 that if $h: M \to R_+$ is a convex (concave, resp.) function on a convex subset $M$ of $R^m$ and if $q: R_+ \to R$ is a convex (concave, resp.) nondecreasing function, then $q(h(x))$ is a convex (concave, resp.) function on $M$.
Proposition 4.5 Let $h: M \to R_+$ be a convex function on a convex compact subset $M$ of $R^m$. If $q: R_+ \to R$ is a convex nonincreasing function such that $q'_+(0) > -\infty$, then $q(h(x))$ is a dc function on $M$:

$$q(h(x)) = g(x) - Kh(x),$$

where $g(x)$ is a convex function and $K$ is a positive constant satisfying $K \ge |q'_+(0)|$ ($q'_+(t)$ denotes the right derivative of $q(t)$ at the point $t$).

Proof We have $q'_+(0) \le q'_+(t) \le 0$ for all $t \ge 0$, therefore $\tilde q(t) = q(t) + Kt$ satisfies $\tilde q'_+(t) = q'_+(t) + K \ge q'_+(0) + K \ge 0$ for all $t \ge 0$. So $\tilde q$ is a convex nondecreasing function and, by Proposition 2.8 just recalled above, $\tilde q(h(x))$ is a convex function on $M$. Since $\tilde q(h(x)) = q(h(x)) + Kh(x)$, the result follows. □
Analogously: let $h: M \to R_+$ be a convex function as in Proposition 4.5. If $q: R_+ \to R$ is a concave nondecreasing function such that $q'_+(0) < +\infty$, then $q(h(x))$ is a dc function on $M$:

$$q(h(x)) = Kh(x) - g(x),$$

where $g(x)$ is a convex function and $K$ is a positive constant satisfying $K \ge |q'_+(0)|$.
Using Proposition 4.5 one can see, for example, that the function $we^{-\gamma\|x-a\|}$ (with $w > 0$, $\gamma > 0$) is dc, since it is equal to $q(\|x - a\|)$ where $q(t) = we^{-\gamma t}$ is a convex decreasing function with $q'_+(0) = -\gamma w > -\infty$.
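Written out (our own elaboration of this example), Proposition 4.5 with $h(x) = \|x - a\|$ and $K = \gamma w$ gives the explicit dc representation

$$we^{-\gamma\|x-a\|} = \big(we^{-\gamma\|x-a\|} + \gamma w\|x-a\|\big) - \gamma w\|x-a\|,$$

where the first bracket is the convex function $g(x)$: indeed $\tilde q(t) = we^{-\gamma t} + \gamma wt$ is convex and nondecreasing on $R_+$, so $g = \tilde q(h)$ is convex.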
A generalization of Proposition 4.5 is the following:

Proposition 4.6 Let $h(x) = u(x) - v(x)$ where $u, v: M \to R_+$ are convex functions on a compact convex set $M \subset R^m$ such that $h(x) \ge 0$ for all $x \in M$. If $q: R_+ \to R$ is a convex nonincreasing function such that $q'_+(0) > -\infty$ then $q(h(x))$ is a dc function on $M$:

$$q(h(x)) = g(x) - K[u(x) + v(x)],$$

where $g(x) = q(h(x)) + K[u(x) + v(x)]$ is a convex function and $K$ is a constant satisfying $K \ge |q'_+(0)|$.

Proof By convexity of $q(t)$, for any $\tau \in R_+$ we have $q(t) \ge q(\tau) + q'_+(\tau)(t - \tau)$, with equality holding for $\tau = t$. Therefore,

$$q(t) = \sup_{\tau \in R_+} \{q(\tau) + (t - \tau)q'_+(\tau)\} = \sup_{\tau \in R_+} \{q(\tau) - \tau q'_+(\tau) + tq'_+(\tau)\},$$

and consequently,

$$q(u(x) - v(x)) = \sup_{\tau \in R_+} \{q(\tau) - \tau q'_+(\tau) + (K + q'_+(\tau))u(x) + (K - q'_+(\tau))v(x)\} - K[u(x) + v(x)] = g(x) - K[u(x) + v(x)].$$

We contend that $g(x) = \sup_{\tau \in R_+} \{q(\tau) - \tau q'_+(\tau) + (K + q'_+(\tau))u(x) + (K - q'_+(\tau))v(x)\}$ is convex. Indeed, since $q(t)$ is convex, $q'_+(\tau) \ge q'_+(0)$ and hence $K + q'_+(\tau) \ge K + q'_+(0) \ge 0$ for all $\tau \ge 0$; furthermore, since $q(t)$ is nonincreasing, $q'_+(\tau) \le 0$ and hence $K - q'_+(\tau) \ge K > 0$ for all $\tau \ge 0$. It follows that for each fixed $\tau \in R_+$ the function $x \mapsto q(\tau) - \tau q'_+(\tau) + (K + q'_+(\tau))u(x) + (K - q'_+(\tau))v(x)$ is convex, and $g(x)$, as the pointwise supremum of a family of convex functions, is itself convex. □
Analogously: let $h(x) = u(x) - v(x)$ where $u, v: M \to R_+$ are convex functions on a compact convex set $M \subset R^m$ such that $h(x) \ge 0$ for all $x \in M$. If $q: R_+ \to R$ is a concave nondecreasing function such that $q'_+(0) < +\infty$, then $q(h(x))$ is a dc function on $M$:

$$q(h(x)) = K[u(x) + v(x)] - g(x),$$

where $g(x) = K[u(x) + v(x)] - q(h(x))$ is a convex function and $K$ is a constant satisfying $K \ge |q'_+(0)|$.
Proposition 4.7 Let $h(x) = u(x) - v(x)$ where $u, v: M \to R_+$ are convex functions on a compact convex set $M \subset R^m$ such that $0 \le h(x) \le a$ for all $x \in M$. If $q: [0, a] \to R$ is a convex nondecreasing function such that $q'_-(a) < +\infty$ then $q(h(x))$ is a dc function on $M$:

$$q(h(x)) = g(x) - K[a + u(x) + v(x)],$$

where $g(x) = q(h(x)) + K[a + u(x) + v(x)]$ is a convex function and $K$ is a constant satisfying $K \ge q'_-(a)$.

Proof Define $p: [0, a] \to R$ by $p(t) = q(a - t)$. Clearly, $p(t)$ is a convex nonincreasing function and $q(t) = p(a - t)$, so that $q(h(x)) = p(a - h(x))$, where $a - h(x) = (a + v(x)) - u(x)$ is a difference of nonnegative convex functions. The conclusion follows from Proposition 4.6. □
Thus, under mild conditions, a convex (or concave) monotone function of a dc function over a compact convex set is a dc function, whose dc representation can easily be obtained. In the next section we shall see that a rather wide class of functions of a real variable can be represented in the form $f(t) = p(t) - q(t)$, where $p(t), q(t)$ are convex nondecreasing functions. The above results then provide a dc representation of composite functions of the form $f(h(x))$, where $h(x)$ is a dc function on $R^n$.

4.4 DC Representation of Separable Functions


P
Obviously, a separable function f .x/ D niD1 fi .xi / is dc if each univariate function
fi .t/; t 2 R is dc, so the question of dc representation for separable functions reduces
to that of univariate functions. A classical result in this area is the following:
Proposition 4.8 A function f W 0; a ! R is representable in the form f D p  q,
where p and q are convex and p0C .0/; p0 .a/; q0C .0/, q0 .a/ are all finite, if and only if
Z t
f .t/ D f .0/ C r. /d ; (4.8)
0

for some function of bounded variation r W 0; a ! R.


Proof See, e.g., Roberts and Varberg (1973). t
u
As is well known from real analysis, a function $r: [0, a] \to R$ of bounded variation can always be decomposed into a difference of two monotone nondecreasing functions: $r(t) = \mu(t) - \nu(t)$, where $\mu(t) = V_0^t(r)$ is the total variation of $r$ in the interval $[0, t]$. Hence the dc representation of a function of the class (4.8) is as follows:

$$f(t) = \left[f(0) + \int_0^t \mu(\tau)\,d\tau\right] - \int_0^t \nu(\tau)\,d\tau. \tag{4.9}$$

This classical representation, however, is not always convenient. Therefore, it is useful to know alternative representations for some classes of frequently encountered univariate functions.
One such class consists of the continuous piecewise affine functions $f(t): [0, a] \to R$ (such as those encountered, for instance, in the Telepak problem (Zadeh 1974)). Let $0 < c_1 < c_2 < \ldots < c_k < a$ be the sequence of breakpoints. Since in the neighborhood of each $t \in [0, a]$ the function $f(t)$ is either affine (if $t$ is not a breakpoint), or concave, or convex, it follows that $f(t)$ is locally dc, and hence dc on $[0, a]$. Denote $U_i = [c_{i-1}, c_{i+1}]$, with $c_0 = 0$, $c_{k+1} = a$, and let $I = \{i \mid f(t) \text{ is concave at } c_i\}$. For each $i \in I$ take the convex function $\varphi_i(t): [0, a] \to R$ that agrees with $-f(t)$ at every $t \in U_i$ and is affine on each of the intervals $[0, c_i]$ and $[c_i, a]$.

Proposition 4.9 A dc representation of a piecewise affine function $f(t)$ is $f(t) = p(t) - q(t)$, where $p(t)$ is convex piecewise affine and

$$q(t) = \sum_{i \in I} \varphi_i(t).$$

Proof Clearly $q(t)$ is convex. To show that $p(t) = f(t) + q(t)$ is convex, observe that for every $i \in I$:

$$(f + q)|_{U_i} = (f + \varphi_i)|_{U_i} + \Big(\sum_{j \in I \setminus \{i\}} \varphi_j\Big)\Big|_{U_i} = \Big(\sum_{j \in I \setminus \{i\}} \varphi_j\Big)\Big|_{U_i}.$$

Since every $\varphi_j$ with $j \in I \setminus \{i\}$ is affine on $U_i$, it follows that $f(t) + q(t)$ is affine on every $U_i$, $i \in I$. On the other hand, for $i \notin I$:

$$(f + q)|_{U_i} = \Big(f + \sum_{j \in I} \varphi_j\Big)\Big|_{U_i},$$

and since on such $U_i$ the function $f(t)$ is convex while every $\varphi_j$ with $j \in I$ is affine, it follows that $f(t) + q(t)$ is also convex on every $U_i$, $i \notin I$. Thus, every $t \in [0, a]$ has a neighborhood in which $f(t) + q(t)$ is convex. Hence $f(t) + q(t)$ is convex on $[0, a]$, as was to be proved (Fig. 4.1). □

Fig. 4.1 DC decomposition of a piecewise affine function: $f(t) = p(t) - q(t)$
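Computationally, the same decomposition can be produced from breakpoint data alone: at a concave breakpoint the slope of $f$ drops, and adding back each drop via a hinge term makes all slope changes nonnegative. The sketch below is our own illustration; the $q$ built here differs from $\sum_{i \in I}\varphi_i$ only by affine terms, which do not affect convexity:

```python
import numpy as np

# Piecewise affine f on [0, a]: breakpoints c, slope s[i] on the i-th piece.
a, c, s = 6.0, [1.0, 2.5, 4.0], [1.0, -2.0, 0.5, -1.0]   # example data

def f(t):
    val, lo = 0.0, 0.0
    for i, hi in enumerate(list(c) + [a]):
        val += s[i] * (min(t, hi) - lo)
        if t <= hi:
            return val
        lo = hi
    return val

def q(t):   # convex: one hinge per concave breakpoint, weight = slope drop
    return sum((s[i] - s[i + 1]) * max(0.0, t - ci)
               for i, ci in enumerate(c) if s[i + 1] < s[i])

def p(t):   # f + q has nondecreasing slopes, hence convex piecewise affine
    return f(t) + q(t)

ts = np.linspace(0.0, a, 25)
assert np.allclose([f(t) for t in ts], [p(t) - q(t) for t in ts])
```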
Proposition 4.9 can be extended to continuous functions $f(t): [0, a] \to R$ that are piecewise convex-concave in the following sense: there exists a sequence $e_0 := 0 < e_1 < e_2 < \ldots < e_k < a =: e_{k+1}$ such that for each interval $V_i = [e_i, e_{i+1}]$, $i = 0, \ldots, k$, the function $f_i(t) := f(t)|_{V_i}$ is convex or concave, with finite $f'_+(e_i), f'_-(e_{i+1})$, and moreover $f_i(t)$ is convex whenever $f_{i-1}(t)$ is concave and conversely. Denote $I = \{i \mid f_i(t) \text{ is concave}\}$. For each $i \in I$ let $\varphi_i(t): [0, a] \to R$ be the continuous convex function which agrees with $-f(t)$ for $t \in V_i$ and is affine outside $V_i$. Then, by an argument analogous to the proof of Proposition 4.9, it is easily proved that $f(t) = p(t) - q(t)$, where

$$q(t) = \sum_{i \in I} \varphi_i(t)$$

and $p(t) = f(t) + q(t)$ is convex on each interval $U_i = [e_{i-1}, e_{i+1}]$, $i = 1, \ldots, k$, and hence convex on the whole segment $[0, a]$.

In particular this representation applies to S-shaped functions, i.e., continuous functions $f(t): [0, a] \to R$ which are convex (or concave, resp.) in a certain interval $[0, c]$, $0 < c < a$, and concave (convex, resp.) in the interval $[c, a]$. For example, if $f(t)$ is convex in $[0, c]$ and concave in $[c, a]$ then one can take $q(t)$ to be a convex function which agrees with $-f(t)$ for $t \ge c$ and is affine for $t < c$.

4.5 DC Representation of Polynomials

Since a polynomial has continuous derivatives of any order, it follows from Proposition 4.2 that:

Proposition 4.10 Any polynomial in $x \in R^n$ is a dc function on $R^n$. □
A nontrivial problem, however, is how to represent a polynomial effectively as a difference of two convex polynomials. For quadratic functions (polynomials of degree 2), the answer is provided by the following well-known result:

Proposition 4.11 Let $f(x) = \langle x, Qx \rangle$ be an indefinite quadratic form, where $Q$ is a symmetric indefinite matrix with spectral radius $\rho(Q)$. Then for any $\eta \ge \rho(Q)$ the function $g(x) = f(x) + \eta\|x\|^2$ is convex.

Proof Since the matrix $Q + \eta I$ is obviously positive semidefinite, the function $g(x) = \langle x, (Q + \eta I)x \rangle$ is convex. □
It follows that any indefinite quadratic function $f(x)$ can be represented as $f(x) = g(x) - \eta\|x\|^2$, where $g(x) = f(x) + \eta\|x\|^2$ is convex. If we know the matrix $U$ of normalized eigenvectors of $Q$, so that $U^TQU = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, then setting $x = Uy$, $F(y) = f(Uy)$, we can convert $f(x)$ into a separable function:

$$F(y) = \sum_{i=1}^n \lambda_iy_i^2$$

which is obviously dc. A useful consequence of the above result is the following (see, e.g., Thach 1992):
Corollary 4.3 Any system of quadratic inequalities

$$f_i(x) := \langle x, Q_ix \rangle + \langle b^i, x \rangle + d_i \le 0, \quad i = 1, \ldots, m,$$

where the $Q_i$ are symmetric matrices, $b^i \in R^n$, $d_i \in R$, is equivalent to a single dc inequality

$$g(x) - \eta\|x\|^2 \le 0,$$

where $g(x)$ is a convex function and $\eta \ge \max_{i=1,\ldots,m} \rho(Q_i)$.

Proof Since the $i$th inequality can be rewritten as $g_i(x) - \eta\|x\|^2 \le 0$, with $g_i(x) = f_i(x) + \eta\|x\|^2$ convex, the system is equivalent to

$$\max_{i=1,\ldots,m} g_i(x) - \eta\|x\|^2 \le 0,$$

which is the desired result for $g(x) = \max_{i=1,\ldots,m} g_i(x)$. □
Thus, the question of dc representation of quadratic polynomials is solved rather satisfactorily. For polynomials of degree higher than 2 the results are much less simple. It is well known that for any polynomial $P(x)$ and any compact convex set $\Omega \subset R^n$ the function $P(x) + \rho\|x\|^2$ is convex on $\Omega$ if $2\rho \ge -\min\{\langle u, \nabla^2P(x)u \rangle \mid x \in \Omega, \ \|u\| = 1\}$. However, the estimation of such a lower bound for $\rho$ is often computationally expensive. In practice, we are interested in the dc representation of a polynomial $P(x)$ over a set $x \ge a$, so by translating if necessary, we may consider $P(x)$ only in the orthant $x \ge 0$.
Proposition 4.12 For any two convex functions $f_1, f_2: R^n_+ \to R_+$ a dc representation of their product is the following:

$$f_1(x)f_2(x) = \frac12[f_1(x) + f_2(x)]^2 - \frac12[f_1^2(x) + f_2^2(x)].$$

Proof The identity itself follows from expanding $(f_1 + f_2)^2 = f_1^2 + 2f_1f_2 + f_2^2$. Clearly $f_1(x) + f_2(x)$ is convex and nonnegative-valued, and since the univariate function $q(t) = t^2$ is convex increasing on $R_+$, by Proposition 2.8 each term in the above difference is convex. □
Using this Proposition and the fact that for every natural number $m$ the univariate function $t \mapsto t^m$ is convex on $R_+$, one can easily obtain a dc representation for any monomial of the form $x_1^{p_1}x_2^{p_2}\cdots x_n^{p_n}$, and hence a dc representation for $P(x)$ on $R^n_+$.
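For instance (a worked example of ours along these lines), on $R^2_+$ the monomial $x_1^2x_2$ is the product of the convex nonnegative functions $f_1(x) = x_1^2$ and $f_2(x) = x_2$, so Proposition 4.12 gives

$$x_1^2x_2 = \frac12(x_1^2 + x_2)^2 - \frac12(x_1^4 + x_2^2),$$

where both terms are convex on $R^2_+$ (the first as the square of a nonnegative convex function, the second as a sum of even powers).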

4.6 DC Sets

In this and the next sections, by a dc set we mean a set $M \subset R^n$ for which there exist convex functions $g, h: R^n \to R$ such that $M = \{x \mid g(x) \le 0, \ h(x) \ge 0\}$ (so $M = D \setminus C$, where $D = \{x \mid g(x) \le 0\}$ and $C = \{x \mid h(x) < 0\}$). Since $M = \{x \mid \max[g(x), -h(x)] \le 0\}$, it follows that a dc set can also be defined by a dc inequality. Conversely, if a set $S$ is defined by a dc inequality, $S = \{x \mid g(x) - h(x) \le 0\}$ with $g(x)$ and $h(x)$ convex, then clearly $S = \{x \mid (x, t) \in M \text{ for some } t\}$, where $M = \{(x, t) \in R^n \times R \mid g(x) - t \le 0, \ t - h(x) \le 0\}$; therefore, $S$ can be obtained as the projection on $R^n$ of the dc set $M \subset R^{n+1}$.

Surprisingly, dc sets are not so different from arbitrary closed sets. This can be seen from the observation by Asplund (1973) that for any closed set $S \subset R^n$, if $d(x, S) = \inf\{\|x - y\| \mid y \in S\}$ denotes the distance from $x$ to $S$, then the function $x \mapsto \|x\|^2 - d^2(x, S) = \sup\{2\langle x, y \rangle - \|y\|^2 \mid y \in S\}$ is convex. Thus, $S = \{x \mid d^2(x, S) \le 0\}$, where $d^2(x, S)$ is a dc function.

For the applications, however, a drawback of the function $d^2(x, S)$ is that it is often too difficult to compute; furthermore, $d^2(x, S) = 0$ for any $x \in S$, so the representation $S = \{x \mid d^2(x, S) = 0\}$ does not make any difference between interior and boundary points of $S$. The following results by Thach (1993a) attempt to remove this drawback in a way that allows computational developments (see Part II, Sect. 7.5).

Let $h: R^n \to R$ be a strictly convex function (cf. Sect. 2.11), i.e., a function such that $h((1 - \lambda)x^1 + \lambda x^2) < (1 - \lambda)h(x^1) + \lambda h(x^2)$ whenever $x^1 \ne x^2$, $0 < \lambda < 1$ (for instance, one can take $h(x) = \|x\|^2$). Given a nonempty closed set $S$ in $R^n$, define

$$d^2(x) = \inf_{y \in S,\ p \in \partial h(x)} \{h(y) - h(x) - \langle p, y - x \rangle\}. \tag{4.10}$$

Lemma 4.1 We have:

(i) $d(x) = 0$ for all $x \in S$;
(ii) $d(x) > 0$ for all $x \notin S$;
(iii) if $x^k \to x$ and $d(x^k) \to 0$ as $k \to \infty$ then $x \in S$.
Proof (i) is obvious because $h(y) - h(x) - \langle p, y - x \rangle \ge 0$ for all $y$ and all $p \in \partial h(x)$.

(ii) follows from (iii). Indeed, if $x \notin S$ and $d(x) = 0$ then, taking $x^k = x$ for all $k$, we see that $d(x^k) \to 0$, hence by (iii) $x \in S$, a contradiction. Thus, we need only prove (iii). Let $x^k \to x$ and $d(x^k) \to 0$, so that there exist $y^k \in S$ and $p^k \in \partial h(x^k)$ satisfying

$$h(y^k) - h(x^k) - \langle p^k, y^k - x^k \rangle \to 0.$$

By a boundedness property of subdifferentials of convex functions (Theorem 2.6), we may assume, by taking a subsequence if necessary, that $p^k \to p \in \partial h(x)$. Then

$$h(y^k) - h(x) - \langle p, y^k - x \rangle \to 0 \quad (k \to \infty). \tag{4.11}$$

Clearly the function $y \mapsto \tilde h(y) = h(y) - h(x) - \langle p, y - x \rangle$ satisfies $\tilde h(x) = 0 \le \tilde h(y)$ for all $y$. It follows that $\tilde h(y) > 0$ for all $y \ne x$; for, by the strict convexity of $\tilde h$, if $\tilde h(y) = 0$ for some $y \ne x$ then $\tilde h(\frac{x+y}{2}) < 0$, a contradiction. Thus the level set $C_0 = \{y \mid \tilde h(y) \le 0\} = \{x\}$ is bounded, and consequently so is the level set $C_1 = \{y \mid \tilde h(y) \le 1\}$. But in view of (4.11) we may assume $\tilde h(y^k) \le 1$ for all $k$. Therefore, the sequence $\{y^k\}$ is bounded. For any cluster point $y$ of this sequence we have $\tilde h(y) = 0$, hence $y = x$, and since $S$ is closed and $y^k \in S$, it follows that $x = y \in S$. □
Now let $\delta$ be any positive number and $r: R^n \to R_+$ any function such that:

$$r(x) = 0 \quad \forall x \in S; \tag{4.12}$$
$$0 < r(y) \le \min\{\delta, d(y)\} \quad \forall y \notin S. \tag{4.13}$$

Define

$$g_S(x) = \sup_{y \notin S,\ p \in \partial h(y)} \{h(y) + \langle p, x - y \rangle + r^2(y)\}. \tag{4.14}$$

Proposition 4.13 Let $h: R^n \to R$ be any given strictly convex function. For every closed set $S \subset R^n$ the function $g_S(x)$ defined by (4.14) is closed, convex, finite everywhere, and satisfies

$$S = \{x \in R^n \mid g_S(x) - h(x) \le 0\}. \tag{4.15}$$

Proof We will assume that $S$ is neither empty nor the whole space (otherwise the proposition is trivial). The function $g_S(x)$ is closed and convex as the pointwise supremum of a family of affine functions $x \mapsto h(y) + \langle p, x - y \rangle + r^2(y)$. It is finite everywhere because for every $x \in R^n$:

$$g_S(x) \le \sup_{y \notin S,\ p \in \partial h(y)} \{h(y) + \langle p, x - y \rangle + \delta^2\} \le h(x) + \delta^2 < +\infty.$$

If $y \notin S$ then, since $r(y) > 0$, it follows that

$$g_S(y) \ge h(y) + \langle p, y - y \rangle + r^2(y) > h(y).$$

On the other hand, if $x \in S$ then for all $y \notin S$, $p \in \partial h(y)$, we have

$$h(y) + \langle p, x - y \rangle + r^2(y) \le h(y) + \langle p, x - y \rangle + h(x) - h(y) - \langle p, x - y \rangle \le h(x),$$

hence $g_S(x) \le h(x)$ for all $x \in S$. □
Remark 4.3 If we take $h(x) = \|x\|^2$ then $\partial h(y) = \{2y\}$, hence $d(y) = \inf\{\|x - y\| \mid x \in S\}$ is the distance from $y$ to $S$. Following Eaves and Zangwill (1971), a function $r: R^n \to R_+$ satisfying (4.12) and (4.13) can be called a separator for the set $S$. Thus, given any closed set $S \subset R^n$ and a separator $r(y)$ for $S$, we can describe $S$ as the solution set of the dc inequality $g_S(x) - \|x\|^2 \le 0$, where

$$g_S(x) = \sup_{y \notin S}\{r^2(y) + 2\langle x, y \rangle - \|y\|^2\}.$$

Later, in Sect. 7.5, we will see how this representation can be used for solving continuous optimization problems.
Corollary 4.4 (Thach 1993a) Any closed set in $R^n$ is the projection on $R^n$ of a dc set in $R^{n+1}$ (Fig. 4.2).

Proof From (4.15), $S = \{x \mid \exists (x, t) \in R^{n+1}, \ g_S(x) \le t, \ t \le h(x)\}$, so $S$ is the projection of $D \setminus C \subset R^{n+1}$ on $R^n$, for $D = \{(x, t) \mid g_S(x) \le t\}$ and $C = \{(x, t) \mid h(x) < t\}$. □

Fig. 4.2 $S := E_1 \cup E_2 = \mathrm{pr}(D \setminus \mathrm{int}\,C)$, with $D$ the shaded area

Corollary 4.5 For any lower semi-continuous function $f(x)$ there exists a convex function $g(x)$ such that the inequality $f(x) \le 0$ is equivalent to the dc inequality $g(x) - \|x\|^2 \le 0$.

Proof Indeed, $S = \{x \mid f(x) \le 0\}$ is a closed set. □
A fundamental property of convex sets, which is the cornerstone of the whole convex duality theory, is the possibility of defining a nonempty closed convex proper subset of $R^n$ as the intersection of a family of halfspaces (Theorem 1.6). An analogous property holds, with, however, reverse convex sets replacing halfspaces, for an arbitrary nonempty closed proper subset of $R^n$.

A set $M$ is called reverse convex or complementary convex if it is defined by a reverse convex inequality, i.e., $M = \{x \mid g(x) \ge 0\}$, where $g(x)$ is a convex function.
Proposition 4.14 Let $h: R^n \to R$ be any given strictly convex function and let $S$ be a nonempty closed proper subset of $R^n$.

(i) For every $y \in R^n \setminus S$ there exists an affine function $l(x)$ on $R^n$ such that

$$l(y) - h(y) > 0, \quad l(x) - h(x) \le 0 \ \forall x \in S. \tag{4.16}$$

(ii) There exists a family of affine functions $l_i(x)$, $i = 1, 2, \ldots$, such that

$$S = \bigcap_{i=1}^{\infty} \{x \mid l_i(x) - h(x) \le 0\}. \tag{4.17}$$

Proof (i) Choose a function $r: R^n \to R_+$ satisfying (4.12), (4.13) and such that

whenever $y^k \to y$ and $r(y^k) \to 0$ as $k \to \infty$, then $y \in S$.

By Lemma 4.1 the function $r(y) = d(y)$ satisfies this condition. Let $p \in \partial h(y)$ and $l(x) = h(y) + \langle p, x - y \rangle + r^2(y)$. Clearly, $l(y) = h(y) + r^2(y) > h(y)$, while from (4.14): $l(x) \le g_S(x) \le h(x)$ for all $x \in S$. We will say that the reverse convex set $D(y) = \{x \mid l(x) - h(x) \le 0\}$ separates $S$ from $y$.

(ii) Consider a grid of points $\{y^i \mid i = 1, 2, \ldots\} \subset R^n \setminus S$ which is dense in $R^n \setminus S$ (e.g., the grid of all points with rational coordinates in $R^n \setminus S$). For each $y^i$ let $p^i \in \partial h(y^i)$ and $l_i(x) = h(y^i) + \langle p^i, x - y^i \rangle + r^2(y^i)$, so that the set $D_i = \{x \mid l_i(x) - h(x) \le 0\}$ separates $S$ from $y^i$. We contend that

$$S = \bigcap_{i=1}^{\infty} D_i.$$

Indeed, denote by $D$ the set on the right-hand side. It is plain that $S \subset D$, so we need only prove the converse containment. Suppose $y \in D$ but $y \notin S$. Because the grid $\{y^i \mid i = 1, 2, \ldots\}$ is dense in $R^n \setminus S$, there is a sequence in this grid (which for convenience we also denote by $y^i$, $i = 1, 2, \ldots$) such that $y^i \to y$ as $i \to \infty$. We have $y \in D_i$ for all $i$, i.e.,

$$h(y^i) + \langle p^i, y - y^i \rangle + r^2(y^i) \le h(y) \quad \forall i.$$

Since $y^i \to y$ and $h(y^i) \to h(y)$, it follows that $r^2(y^i) \to 0$, which by Lemma 4.1 conflicts with $y \notin S$. Therefore $D \subset S$, and hence $S = D$. □
Thus, an arbitrary closed set which is neither empty nor the whole space can be
defined by means of a strictly convex function and a family of affine functions.
Later (in Part II) we will see that using this property certain approximation
procedures commonly used in convex optimization (such as linearization and outer
approximation of convex sets by polyhedrons) can be extended to continuous
nonconvex optimization.

4.7 Convex Minorants of a DC Function

A nonconvex inequality $f(x) \le 0$, $x \in D$, where $D$ is a convex set in $R^n$, can often be handled by replacing it with a convex inequality $c(x) \le 0$, where $c(x)$ is a convex minorant of $f(x)$ on $D$. The latter inequality is then called a convex relaxation of the former. Of course, the tightest relaxation is obtained when $c(x) = \mathrm{conv}f(x)$, the convex envelope, i.e., the largest convex minorant, of $f(x)$.

If $f(x)$ is a dc function for which a dc representation $f(x) = g(x) - h(x)$ is available, then a convex minorant of $f(x)$ on a polytope $D \subset R^n$ with vertex set $V$ is provided by $g(x) + \mathrm{conv}_D(-h)(x)$, i.e., according to Corollary 2.2,

$$g(x) - \sup\left\{\sum_{i=1}^{n+1} \lambda_ih(v^i) \ \Big|\ v^i \in V, \ \lambda_i \ge 0, \ \sum_{i=1}^{n+1} \lambda_i = 1, \ \sum_{i=1}^{n+1} \lambda_iv^i = x\right\}.$$

Most often, however, $f(x)$ is given as a factorable function (see Sect. 3.3) whose explicit dc representation is not readily available and generally very hard to compute. In such cases, using the factorable structure, the computation of a convex minorant (or concave majorant) of $f(x)$ can be reduced to solving the following two problems (McCormick 1982):

PROBLEM 4.1 Given a function $F(t)$ of one variable and a function $t(x)$ defined on a convex set $S \subset R^n$, find a convex minorant (concave majorant) of $F(t(x))$ on the set $\{x \in S \mid a \le t(x) \le b\}$.

Assume that there are available a convex minorant $p(x)$ and a concave majorant $q(x)$ of $t(x)$. Denote the convex envelope and the concave envelope of $F(t)$ on the segment $a \le t \le b$ by $\varphi(t)$ and $\psi(t)$, respectively. Let $\alpha := \inf\{F(t) \mid a \le t \le b\} = F(t_{\min})$ and $\beta := \sup\{F(t) \mid a \le t \le b\} = F(t_{\max})$. Also, for any three real numbers $u, v, w$ denote by $\mathrm{mid}[u, v, w]$ the middle number, i.e., the number which lies between the two others.
Proposition 4.15 The functions

$$\varphi(\mathrm{mid}[p(x), q(x), t_{\min}]), \qquad \psi(\mathrm{mid}[p(x), q(x), t_{\max}])$$

provide a convex minorant and a concave majorant, respectively, of $F(t(x))$ on $\{x \in S \mid a \le t(x) \le b\}$.
Proof It suffices to give the proof for the convex minorant, since the proof for the concave majorant is analogous. Clearly the convex envelope $\varphi(t)$ of $F(t)$ can be broken up into the sum of three terms: the convex decreasing part, the convex increasing part, and the minimum value. The convex decreasing part is

$$\varphi^D(t) = \varphi(\min(t, t_{\min})) - \alpha,$$

and the convex increasing part is

$$\varphi^I(t) = \varphi(\max(t, t_{\min})) - \alpha.$$

Then $\varphi(t) = \varphi^D(t) + \varphi^I(t) + \alpha$. Observe that since $\varphi^D(t)$ is decreasing and $t(x) \le q(x)$ we have $\varphi^D(t(x)) \ge \varphi(\min(q(x), t_{\min})) - \alpha$. Similarly, $\varphi^I(t(x)) \ge \varphi(\max(p(x), t_{\min})) - \alpha$. Hence,

$$F(t(x)) \ge \varphi(t(x)) = \varphi^D(t(x)) + \varphi^I(t(x)) + \alpha \ge \varphi(\min(q(x), t_{\min})) + \varphi(\max(p(x), t_{\min})) - \alpha \tag{4.18}$$
$$= \varphi(\mathrm{mid}[p(x), q(x), t_{\min}]).$$

The convexity of this function follows from the expression (4.18), because $\varphi(\min(q(x), t_{\min}))$ is convex as a convex decreasing function of the concave function $\min(q(x), t_{\min})$, and $\varphi(\max(p(x), t_{\min}))$ is convex as a convex increasing function of the convex function $\max(p(x), t_{\min})$. □
PROBLEM 4.2 Given two functions $u(x)$, $v(y)$ on $R^n$, $R^m$ resp., find a convex minorant (concave majorant) of the product $f(x, y) = u(x)v(y)$ on the set $\{(x, y) \mid p \le u(x) \le q, \ r \le v(y) \le s\}$.
Lemma 4.2 The system of inequalities

$$p \le u \le q, \quad r \le v \le s, \tag{4.19}$$

where $p < q$, $r < s$, is equivalent to

$$\min\{su + pv - ps, \ ru + qv - qr\} \ge uv \tag{4.20}$$
$$uv \ge \max\{ru + pv - pr, \ su + qv - qs\}. \tag{4.21}$$

Proof The inequalities (4.19) imply

$$(u - p)(v - r) \ge 0, \quad (q - u)(s - v) \ge 0 \tag{4.22}$$
$$(u - p)(v - s) \le 0, \quad (u - q)(v - r) \le 0. \tag{4.23}$$

Conversely, from (4.22)-(4.23) it follows that $(u - p)[(v - r) - (v - s)] = (u - p)(s - r) \ge 0$, hence $p \le u$ because $r < s$; similarly, $u \le q$ and $r \le v \le s$, i.e., (4.19). Thus, (4.19) is equivalent to (4.22)-(4.23). Expanding the latter yields precisely (4.20)-(4.21). □
u
Proposition 4.16 Let $p \ge 0$, $r \ge 0$. If $u(x)$, $v(y)$ are convex (concave, resp.) and finite on $R^n$, $R^m$ resp., then a convex minorant (concave majorant, resp.) of the product $u(x)v(y)$ on the set $\{(x, y) \mid p \le u(x) \le q, \ r \le v(y) \le s\}$ is given by the function

$$\max\{ru(x) + pv(y) - pr, \ su(x) + qv(y) - qs\} \tag{4.24}$$
$$(\min\{su(x) + pv(y) - ps, \ ru(x) + qv(y) - qr\}, \text{ resp.}). \tag{4.25}$$

Proof If $u(x)$, $v(y)$ are convex, the function $\max\{ru(x) + pv(y) - pr, \ su(x) + qv(y) - qs\}$ is convex, and by the above Lemma it is a minorant of $u(x)v(y)$ for $p \le u(x) \le q$, $r \le v(y) \le s$. Similarly, if $u(x)$, $v(y)$ are concave, the function $\min\{su(x) + pv(y) - ps, \ ru(x) + qv(y) - qr\}$ is a concave majorant of $u(x)v(y)$. □

Proposition 4.17 Let $p \ge 0$, $r \ge 0$, and suppose there exist convex functions $\varphi(x)$, $\psi(y)$ and concave functions $\mu(x)$, $\nu(y)$ such that $p \le \varphi(x) \le u(x) \le \mu(x) \le q$ and $r \le \psi(y) \le v(y) \le \nu(y) \le s$ for all $(x, y)$ in a convex set $D \subset R^n \times R^m$. Then a convex minorant and a concave majorant of the product $u(x)v(y)$ on the set $D$ are provided by the functions

$$\max\{r\varphi(x) + p\psi(y) - pr, \ s\varphi(x) + q\psi(y) - qs\},$$
$$\min\{s\mu(x) + p\nu(y) - ps, \ r\mu(x) + q\nu(y) - qr\}.$$

Proof Immediate from the above. □
Corollary 4.6 Let $p \ge 0$, $r \ge 0$. The convex and concave envelopes of the function $f(x, y) = xy$ on the rectangle $\{(x, y) \in R^2 \mid p \le x \le q, \ r \le y \le s\}$ are, respectively, the functions

$$\max\{rx + py - pr, \ sx + qy - qs\}, \tag{4.26}$$
$$\min\{sx + py - ps, \ rx + qy - qr\}. \tag{4.27}$$

A majorant of the function $x/y$ on the rectangle $\{(x, y) \in R^2 \mid p \le x \le q, \ r \le y \le s\}$ (with $r > 0$) is the function

$$\min\left\{\frac{x}{r} + \frac{p}{y} - \frac{p}{r}, \ \frac{x}{s} + \frac{q}{y} - \frac{q}{s}\right\}.$$

Proof This follows from Proposition 4.16 when $u(x) = x$, $v(y) = y$, and from Lemma 4.2 when $u(x) = x$, $v(y) = 1/y$ (with the bounds $1/s \le 1/y \le 1/r$). □
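The bilinear envelopes (4.26)-(4.27) (often called McCormick bounds) are trivial to evaluate. The sketch below (our own numerical check in Python) verifies on random points of a sample rectangle that (4.26) underestimates and (4.27) overestimates $xy$:

```python
import numpy as np

def mccormick(x, y, p, q, r, s):
    """Convex envelope (4.26) and concave envelope (4.27) of x*y on
    the rectangle p <= x <= q, r <= y <= s (with p >= 0, r >= 0)."""
    lower = np.maximum(r * x + p * y - p * r, s * x + q * y - q * s)
    upper = np.minimum(s * x + p * y - p * s, r * x + q * y - q * r)
    return lower, upper

p, q, r, s = 0.5, 2.0, 1.0, 3.0
x = np.random.uniform(p, q, 1000)
y = np.random.uniform(r, s, 1000)
lo, up = mccormick(x, y, p, q, r, s)
assert np.all(lo <= x * y + 1e-12) and np.all(x * y <= up + 1e-12)
```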

4.8 The Minimum of a DC Function

As we saw in Sect. 2.11, any local minimum of a convex function on a convex set $C$ is also a global minimum. This important property is no longer true for general dc functions, but it partially extends to quadratic functions, i.e., to functions of the form $f(x) = \frac12\langle x, Qx \rangle + \langle a, x \rangle + \alpha$, where $Q$ is a symmetric $n \times n$ matrix.

Proposition 4.18 Any local minimum (or maximum, resp.) of a quadratic function $f(x)$ on an affine set $E \subset R^n$ is a global minimum (maximum, resp.).

Proof If $x^0$ is a local minimizer of $f(x)$ on $E$, there exists a ball $B$ centered at $x^0$ such that $f(x^0) \le f(x)$ for all $x \in B \cap E$. For any $z \in E \setminus \{x^0\}$ let $L$ be the line through $x^0$ and $z$, i.e., $L = \{x \in R^n \mid x = x^0 + t(z - x^0), \ t \in (-\infty, +\infty)\}$. The restriction of $f$ to $L$ is the function $\varphi(t) = f(x^0 + t(z - x^0))$, $t \in R$. Since $f(x)$ is a quadratic function on $R^n$, $\varphi(t)$ is a quadratic function of the real variable $t$. This function attains a local minimum at $t = 0$, which must then be a global minimum, being the minimum of a quadratic function of a real variable. So $\varphi(0) \le \varphi(t)$ for all $t \in R$; in particular, $f(z) = \varphi(1) \ge \varphi(0) = f(x^0)$, i.e., $f(z) \ge f(x^0)$ for all $z \in E$. □

Proposition 4.19 A quadratic function which is bounded below on $R^n$ must be convex. The minimum (or maximum) of an indefinite quadratic function on a compact convex set $C$ is always attained at a boundary point (which may not be an extreme point).

Proof If a quadratic function $f(x)$ is not convex, there exist two distinct points $a, b \in R^n$ and a number $t \in (0, 1)$ such that $f((1-t)a + tb) > (1-t)f(a) + tf(b)$. So the function $\varphi(t) := f((1-t)a + tb)$, $t \in R$, which is the restriction of $f$ to the line through $a, b$, is not convex. Then $\varphi(t)$ must be concave, because a quadratic function on $R$ is either convex or concave. Hence $\varphi(t)$ is unbounded below on $R$, conflicting with $f(x)$ being bounded below on $R^n$. This proves the first part of the proposition. Turning to the second part, without loss of generality we can assume that the affine hull of $C$ is the whole space $R^n$. Since $f(x)$ is continuous and $C$ is compact convex, $f(x)$ attains its minimum on $C$. If this minimum were attained at an interior point of $C$, it would be a local minimum of $f(x)$ on $R^n$, and hence, by Proposition 4.18, a global minimum. So $f(x)$ would be bounded below on $R^n$, and hence convex, contrary to its indefiniteness. Analogously for the maximum. □
By Proposition 2.31, a point $x^0$ achieves the minimum of a proper convex function $f(x)$ on $R^n$ if and only if $0 \in \partial f(x^0)$. For general dc functions, a criterion for the global minimum is the following (Hiriart-Urruty 1989a,b; see also Tuy and Oettli 1994):

Proposition 4.20 Let $g: R^n \to R \cup \{+\infty\}$ be an arbitrary proper function and $h: R^n \to R \cup \{+\infty\}$ a convex proper l.s.c. function. A point $x^0 \in \mathrm{dom}\,g \cap \mathrm{dom}\,h$ is a global minimizer of $g(x) - h(x)$ on $R^n$ if and only if

$$\partial_\varepsilon h(x^0) \subset \partial_\varepsilon g(x^0) \quad \forall \varepsilon > 0. \tag{4.28}$$

Proof Without loss of generality we may assume that $g(x^0) = h(x^0) = 0$. Then $x^0$ is a global minimizer of $g(x) - h(x)$ on $R^n$ if and only if $h(x) \le g(x)$ for all $x \in R^n$. It is easy to see that the latter condition is equivalent to (4.28). Indeed, observe that an affine function $\varphi(x) = \langle a, x - x^0 \rangle + \varphi(x^0)$ is a minorant of a proper function $f(x)$ if and only if $a \in \partial_\varepsilon f(x^0)$ with $\varepsilon = f(x^0) - \varphi(x^0)$. Now, since $h(x)$ is a proper convex l.s.c. function on $R^n$, it is known that $h(x) = \sup\{\varphi(x) \mid \varphi \in Q\}$, where $Q$ denotes the collection of all affine minorants of $h(x)$ (Theorem 2.5). Hence the condition $h(x) \le g(x)$ for all $x \in R^n$ is equivalent to saying that every $\varphi \in Q$ is a minorant of $g$, i.e., by the above observation, that every $a \in \partial_\varepsilon h(x^0)$ belongs to $\partial_\varepsilon g(x^0)$, for every $\varepsilon > 0$. □
Remark 4.4 Geometrically, the condition $h(x) \le g(x)$ for all $x \in R^n$, which expresses the global minimality of $x^0$, means

$$G \subset H, \tag{4.29}$$

where $G, H$ are the epigraphs of the functions $g, h$, respectively. Since $h$ is proper convex and l.s.c., $H$ is a nonempty closed convex set in $R^n \times R$, and it is obvious that (4.29) holds if and only if every closed halfspace in $R^n \times R$ that contains $H$ also contains $G$ (see Strekalovski 1987, 1990, 1993). The latter condition, restricted to nonvertical halfspaces, is nothing but a geometric expression of (4.28).
The next result exhibits an important duality relationship in dc minimization problems which has found many applications (Toland 1978).

Proposition 4.21 Let $g, h$ be as in Proposition 4.20. Then

$$\inf_{x \in \mathrm{dom}\,g} \{g(x) - h(x)\} = \inf_{u \in \mathrm{dom}\,h^*} \{h^*(u) - g^*(u)\}. \tag{4.30}$$

If $u^0$ is a minimizer of $h^*(u) - g^*(u)$ then any $x^0 \in \partial g^*(u^0)$ is a minimizer of $g(x) - h(x)$.

Proof Suppose $g(x) - h(x) \ge \gamma$, i.e., $g(x) \ge h(x) + \gamma$, for all $x \in \mathrm{dom}\,g$. Then $g^*(u) = \sup_{x \in \mathrm{dom}\,g}\{\langle u, x \rangle - g(x)\} \le \sup_{x \in \mathrm{dom}\,g}\{\langle u, x \rangle - h(x)\} - \gamma \le h^*(u) - \gamma$, i.e., $h^*(u) - g^*(u) \ge \gamma$. Conversely, the latter inequality implies (in an analogous manner) $g^{**}(x) - h^{**}(x) \ge \gamma$, hence $g(x) - h(x) \ge \gamma$. This proves (4.30). Suppose now that $u^0$ is a minimizer of $h^*(u) - g^*(u)$ and let $x^0 \in \partial g^*(u^0)$, hence $u^0 \in \partial g(x^0)$. Since $h^*(u^0) + h(x^0) \ge \langle u^0, x^0 \rangle$, we have

$$g(x^0) - h(x^0) \le g(x^0) - \langle u^0, x^0 \rangle + h^*(u^0) \le g(x) - \langle u^0, x \rangle + h^*(u^0) = g(x) - [\langle u^0, x \rangle - h^*(u^0)] \le g(x) - h(x)$$

for all $x \in \mathrm{dom}\,g$. □
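As a quick sanity check of (4.30), consider a standard example of ours (not from the text): $g(x) = \frac12\|x\|^2$ and $h(x) = \|x\|$ on $R^n$. Then $g^*(u) = \frac12\|u\|^2$, while $h^*(u) = 0$ for $\|u\| \le 1$ and $+\infty$ otherwise, so

$$\inf_x \Big\{\frac12\|x\|^2 - \|x\|\Big\} = -\frac12 = \inf_{\|u\| \le 1}\Big\{0 - \frac12\|u\|^2\Big\},$$

both infima being attained on the unit sphere. Moreover $\partial g^*(u^0) = \{u^0\}$, and indeed every $x^0 = u^0$ with $\|u^0\| = 1$ minimizes $g - h$, as Proposition 4.21 asserts.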

4.9 Exercises

1 Let $C$ be a compact set in $R^n$. For every $x \in R^n$ denote $d(x) = \inf\{\|x - y\| \mid y \in \partial C\}$. Show that $d(x)$ is a dc function such that if $C$ is a convex set then $d(x)$ is concave on $C$ but convex on any convex subset of $R^n \setminus C$.
Find a dc representation of the functions:

2 $P(x, y) = \sum_{j=1}^n [a_jx_j + b_jy_j + c_jx_j^2 + d_jy_j^2 + p_jx_jy_j + q_jx_jy_j^2 + r_jx_j^2y_j]$ for $(x, y) \in R^n_+ \times R^n_+$.
3

$$f(t) = \begin{cases} \alpha - wt, & 0 \le t \le \beta \\ (\alpha - w\beta)\Big(1 - \dfrac{t - \beta}{\gamma - \beta}\Big), & \beta \le t \le \gamma \\ 0, & t \ge \gamma \end{cases}$$

for $t \ge 0$ ($\alpha, \beta, \gamma, w$ are positive numbers and $\beta < \gamma$).
4

$$f(x) = \frac{\sum_{i=1}^n r_i(x_i)}{\beta_0 + \sum_{j=1}^k \beta_jx_j}, \quad x \ge 0,$$

where each $r_i(t)$ is a strictly increasing S-shaped function (convex for $t \in [0, \tau_i]$, concave for $t \ge \tau_i$, with $\tau_i > 0$), and $\beta_i$, $i = 0, 1, \ldots, n$, are positive numbers.

5 $f(x) = x_1^{1/2} + \dfrac{1}{x_2} + \dfrac{4}{(x_1 - x_2)^2}$, $x \in R^2_+$.
6 A function $f: C \to R$ (where $C$ is an open convex subset of $R^n$) is said to be weakly convex if there exists a constant $r > 0$ such that for all $x^1, x^2 \in C$ and all $\lambda \in [0, 1]$:

$$f((1 - \lambda)x^1 + \lambda x^2) \le (1 - \lambda)f(x^1) + \lambda f(x^2) + \lambda(1 - \lambda)r\|x^1 - x^2\|^2.$$

Show that $f$ is weakly convex if and only if it is a dc function of the form $f(x) = g(x) - r\|x\|^2$, where $r > 0$.
7 Show that a function $f: R^n \to R$ is weakly convex if and only if there exists a constant $r > 0$ such that for every $x^0 \in R^n$ one can find a vector $y \in R^n$ satisfying $f(x) \ge f(x^0) + \langle y, x - x^0 \rangle - r\|x - x^0\|^2$ for all $x \in R^n$.
8 Find a convex minorant for the function $f(x) = e^{(x_1 + x_2 - x_3)^2} + x_1x_2x_3$ on $-1 \le x_i \le +1$, $i = 1, 2, 3$.
9 Same question for the function $f(x) = \log(x_1 + x_2^2)\min\{x_1, x_2\}$ on $1 \le x_i \le 3$, $i = 1, 2$.
10 (1) If $\bar x$ is a local minimizer of the function $f(x) = g(x) - h(x)$ over $R^n$ then $0 \in \partial g(\bar x) \ominus \partial h(\bar x)$, where $A \ominus B = \{x \mid x + B \subset A\}$.
(2) If $\bar x$ is a global minimizer of the dc function $f = g - h$ over $R^n$ then any point $\bar u \in \partial h(\bar x)$ achieves a minimum of the function $h^* - g^*$ over $\mathrm{dom}\,g^*$.
Part II
Global Optimization
Chapter 5
Motivation and Overview

5.1 Global vs. Local Optimum

We will be concerned with optimization problems of the general form

$$(P) \quad \min\{f(x) \mid g_i(x) \le 0 \ (i = 1, \ldots, m), \ x \in X\},$$

where $X$ is a closed convex set in $R^n$ and $f: \Omega \to R$, $g_i: \Omega \to R$, $i = 1, \ldots, m$, are functions defined on some set $\Omega$ in $R^n$ containing $X$. Denoting the feasible set by $D$:

$$D = \{x \in X \mid g_i(x) \le 0, \ i = 1, \ldots, m\}$$

we can also write the problem as

$$\min\{f(x) \mid x \in D\}.$$
A point $\bar x \in D$ such that

$$f(\bar x) \le f(x) \quad \forall x \in D$$

is called a global optimal solution (global minimizer) of the problem (Fig. 5.1). A point $x^0 \in D$ such that there exists a neighborhood $W$ of $x^0$ satisfying

$$f(x^0) \le f(x) \quad \forall x \in D \cap W$$

is called a local optimal solution (local minimizer). When the problem is convex, i.e., when all the functions $f(x)$ and $g_i(x)$, $i = 1, \ldots, m$, are convex, any local minimizer is global (Proposition 2.30), but this is no longer true in the general case. A problem is said to be multiextremal when it has many local minimizers


Fig. 5.1 Local and global minimizers (nonconvex function; nonconvex constraints)

with different objective function values (so that a local minimizer may fail to be global; Fig. 5.1). Throughout the sequel, unless otherwise stated, it will always be understood that the minimum required in problem (P) is a global one, and we will be mostly interested in the case when the problem is multiextremal. Often we will assume $X$ compact and $f(x), g_i(x)$, $i = 1, \ldots, m$, continuous on $X$ (so that if $D$ is nonempty then the existence of a global minimum is guaranteed); but occasionally the case where $f(x)$ is discontinuous at certain boundary points of $D$ will also be discussed, especially when this turns out to be important for the applications.

By its very nature, global optimization requires quite different methods from those used in conventional nonlinear optimization, which so far has been mainly concerned with finding local optima. Strictly speaking, local methods of nonlinear programming, which rely on such concepts as derivatives, gradients, subdifferentials, and the like, may even fail to locate a local minimizer. Usually these methods compute a Karush-Kuhn-Tucker (KKT) point, but in many instances most KKT points are not even local minimizers. For example, the problem

$$\text{minimize } -\sum_{i=1}^n (c_ix_i + x_i^2) \quad \text{subject to } -1 \le x_i \le 1 \ (i = 1, \ldots, n),$$

where the $c_i$ are sufficiently small positive numbers, has up to $3^n$ KKT points, of which only $2^n$ are local minimizers, the others being saddle points.

Of course the above example reduces to $n$ univariate problems $\min\{-c_ix_i - x_i^2 \mid -1 \le x_i \le 1\}$, $i = 1, \ldots, n$, and since the strictly quasiconcave function $-c_ix_i - x_i^2$ attains its unique minimum over the segment $[-1, 1]$ at $x_i = 1$, it is clear that $(1, 1, \ldots, 1)$ is the unique global minimizer of the problem. This situation is typical: although certain multiextremal problems may be easy (among them instances of problems which have been shown to be NP-hard, see Chap. 7), their optimal solution is most often found on the basis of exploiting favorable global properties

of the functions involved rather than by conventional local methods. In any case,
easy multiextremal problems are exceptions. Most nonconvex global optimization
problems are difficult, and their main difficulty is often due to multiextremality.
From the computational complexity viewpoint, even the problem which differs from the above-mentioned example only in that the objective function is a general concave quadratic function is NP-hard (Sahni 1974).
however, that, as shown in Murty and Kabadi (1987), to verify that a given point
is an unconstrained local minimizer of a nonconvex function is already NP-hard.
Therefore, even if we were to restrict our goal to obtaining a local minimizer only,
this would not significantly lessen the complexity of the problem. On the other
hand, as optimization models become widely used in engineering, economics, and
other sciences, an increasing number of global optimization problems arise from
applications that cannot be solved successfully by standard methods of linear and
nonlinear programming and require more suitable methods. Often, it may be of
considerable interest to know at least whether there exists any better solution to
a practical problem than a given solution that has been obtained, for example, by
some local method. As will be shown in Sect. 5.5, this is precisely the central issue
of global optimization.

5.2 Multiextremality in the Real World

The following examples illustrate how multiextremal nonconvex problems arise in different economic, engineering, and management activities.
Example 5.1 (Production-Transportation Planning) Consider $r$ factories producing a certain good to satisfy the demands $d_j$, $j = 1, \ldots, m$, of $m$ destination points. The cost of producing $t$ units of the good at factory $i$ is a concave function $g_i(t)$ (economy of scale); the transportation cost is linear, with unit cost $c_{ij}$ from factory $i$ to destination point $j$. In addition, if destination point $j$ receives $t < d_j$ units, then a shortage penalty $h_j(t)$ must be paid, which is a decreasing nonnegative convex function of $t$ in the interval $[0, d_j)$. Under these conditions, the problem that arises is the following one, where a dc function has to be minimized over a polyhedron:

$$\begin{aligned}
\text{minimize } \ & \sum_{i=1}^r\sum_{j=1}^m c_{ij}x_{ij} + \sum_{i=1}^r g_i(y_i) + \sum_{j=1}^m h_j(z_j)\\
\text{s.t. } \ & \sum_{j=1}^m x_{ij} = y_i \quad (i = 1, \ldots, r)\\
& \sum_{i=1}^r x_{ij} = z_j \quad (j = 1, \ldots, m)\\
& x_{ij}, y_i, z_j \ge 0 \quad \forall i, j.
\end{aligned}$$

In a general manner, nonconvexities in economic processes arise from economies of scale (concave cost functions) or increasing returns (convex utility functions).
Example 5.2 (Location-Allocation) A number $p$ of facilities providing the same service have to be constructed to serve $n$ users located at points $a^j$, $j = 1, \ldots, n$, on the plane. Each user will be served by one of the facilities, and the transportation cost is a linear function of the distance. The problem is to determine the locations $x^i \in R^2$, $i = 1, \ldots, p$, of the facilities so as to minimize the weighted sum of all transportation costs from each user to the facility that serves this user. Introduced by Cooper (1963), this problem was shown to be NP-hard by Megiddo and Supowwit (1984). Originally it was formulated as consisting simultaneously of a nontrivial combinatorial part (allocation, i.e., partitioning the set of users into groups to be served by each facility) and a nonlinear part (location, i.e., finding the optimal location of the facility serving each group). The corresponding mathematical formulation of the problem is (Brimberg and Love 1994):

$$\begin{aligned}
\min_{x,w} \ & \sum_{i=1}^p\sum_{j=1}^N w_{ij}\|a^j - x^i\|\\
\text{s.t. } \ & \sum_{i=1}^p w_{ij} = r_j, \quad j = 1, \ldots, N\\
& w_{ij} \ge 0, \quad i = 1, \ldots, p, \ j = 1, \ldots, N\\
& x^i \in S, \quad i = 1, \ldots, p,
\end{aligned}$$

where $w_{ij}$ is an unknown weight representing the flow from point $a^j$ to facility $i$ and $S$ is a rectangular domain in $R^2$ where the facilities must be located.

With this traditional formulation the problem is exceedingly complicated, as it involves a large number of variables and a highly nonconvex objective function. However, observing that to minimize the total cost each user must be served by the closest facility, and that the distance from a user $j$ to the closest facility is obviously

$$d_j(x^1, \ldots, x^p) = \min_{i=1,\ldots,p} \|a^j - x^i\| \tag{5.1}$$

the problem can be formulated more simply as

$$\text{minimize } \sum_{j=1}^n w_jd_j(x^1, \ldots, x^p) \quad \text{subject to } x^i \in R^2, \ i = 1, \ldots, p,$$

where $w_j > 0$ is the weight assigned to user $j$. Since each $d_j(\cdot)$ is a dc function (a pointwise minimum of the convex functions $\|a^j - x^i\|$, see Sect. 3.1), the problem amounts to minimizing a dc function over $(R^2)^p$.
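Evaluating this dc objective is straightforward; the sketch below (our own illustration with made-up data) computes $\sum_j w_jd_j(x^1, \ldots, x^p)$:

```python
import numpy as np

def location_cost(X, A, w):
    """Sum_j w_j * min_i ||a^j - x^i||: X is (p,2) facilities,
    A is (n,2) users, w is (n,) positive weights."""
    dists = np.linalg.norm(A[:, None, :] - X[None, :, :], axis=2)  # (n, p)
    return float(w @ dists.min(axis=1))  # each user served by closest facility

A = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0]])   # example user points
w = np.array([1.0, 2.0, 1.0])
X = np.array([[0.0, 0.0], [4.0, 1.0]])               # candidate facilities
print(location_cost(X, A, w))                        # 0 + 2*1 + 1*2 = 4.0
```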
Example 5.3 (Engineering Design) Random fluctuations inherent in any fabrication process (e.g., integrated circuit manufacturing) may result in a very low production yield. To help the designer minimize the influence of these random fluctuations, one method consists in maximizing the yield by centering the nominal value of the design parameters in the region of acceptability. In more precise terms, this design centering problem (Groch et al. 1985; Vidigal and Director 1982) can be formulated as follows. Suppose the quality of a manufactured item can be characterized by an $n$-dimensional parameter and an item is accepted when the parameter is contained in a given region $M \subset R^n$. If $x$ is the nominal value of the parameter and $y$ its actual value, then usually $y$ will deviate from $x$, and for every fixed $x$ the probability that the deviation $\|x - y\|$ does not exceed a given level $r$ monotonically increases with $r$. Under these conditions, for a given nominal value $x$ the production yield can be measured by the maximal value of $r = r(x)$ such that

$$B(x, r) := \{y : \|x - y\| \le r\} \subset M$$

and to maximize the production yield one has to solve the problem

$$\text{maximize } r(x) \quad \text{subject to } x \in R^n. \tag{5.2}$$

Often the norm $\|\cdot\|$ is ellipsoidal, i.e., there exists a positive definite matrix $Q$ such that $\|x - y\|^2 = (x - y)^TQ(x - y)$. Denoting $D = R^n \setminus M$, we then have

$$r^2(x) = \inf\{(x - y)^TQ(x - y) : y \in D\} = \inf\{x^TQx + y^TQy - 2x^TQy : y \in D\},$$

so $r^2(x) = x^TQx - h(x)$ with $h(x) = \sup\{2x^TQy - y^TQy : y \in D\}$. Since for each $y \in D$ the function $x \mapsto 2x^TQy - y^TQy$ is affine, $h(x)$ is a convex function and therefore $r^2(x)$ is a dc function. Using nonlinear programming methods we can only compute KKT points, which may even fail to be local maxima, so the design centering problem is another instance of a multiextremal optimization problem (see, e.g., Thach 1988). Problems similar to (5.2) appear in engineering design, coordinate measurement techniques, etc.
Example 5.4 (Pooling and Blending) In oil refineries, a two-stage process is used to form final products $j = 1, \ldots, J$ from given oil components $i = 1, \ldots, I$ (see Fig. 5.2). In the first stage, intermediate products are obtained by combining the components supplied from different sources and stored in special tanks or pools. In the second stage, these intermediate products are combined to form final products of prescribed quantities and qualities. A certain quantity of an initial component can also go directly to the second stage.

Fig. 5.2 The pooling problem (flows $x_{11}, x_{12}$ from Components 1, 2 to Pool 1; flows $y_{11}, y_{12}$ from Pool 1 to Products 1, 2; direct flows $z_{31}, z_{32}$ from Component 3)

If $x_{il}$ denotes the amount of component $i$ allocated to pool $l$, $y_{lj}$ the amount going from pool $l$ to product $j$, $z_{ij}$ the amount of component $i$ going directly to product $j$, $p_{lk}$ the level of quality $k$ (relative sulfur content level) in pool $l$, and $C_{ik}$ the level of quality $k$ in component $i$, then these variables must satisfy (see, e.g., Ben-Tal et al. 1994; Floudas and Aggarwal 1990; Foulds et al. 1992):

$$\begin{array}{ll}
\sum_l x_{il} + \sum_j z_{ij} \le A_i, \quad \sum_i x_{il} - \sum_j y_{lj} = 0 & \text{(component balance)}\\
\sum_i x_{il} \le S_l & \text{(pool balance)}\\
\sum_l (p_{lk} - P_{jk})y_{lj} + \sum_i (C_{ik} - P_{jk})z_{ij} \le 0 & \text{(pool quality)}\\
\sum_l y_{lj} + \sum_i z_{ij} \le D_j & \text{(product demand constraints)}\\
x_{il}, y_{lj}, z_{ij}, p_{lk} \ge 0,
\end{array}$$

where $A_i, S_l, P_{jk}, D_j$ are the upper bounds for component availabilities, pool sizes, product qualities, and product demands, respectively. Given the unit prices $c_i, d_j$ of component $i$ and product $j$, the problem is to determine the variables $x_{il}, y_{lj}, z_{ij}, p_{lk}$ satisfying the above constraints with maximum profit

$$-\sum_i\sum_l c_ix_{il} + \sum_l\sum_j d_jy_{lj} + \sum_i\sum_j (d_j - c_i)z_{ij}.$$

Thus, a host of optimization problems of practical interest in economics and engineering involve nonconvex functions and/or nonconvex sets in their description. Nonconvex global optimization problems arise in many disciplines, including control theory (robust control), computer science (VLSI chip design and databases), mechanics (structural optimization), physics (nuclear design and microcluster phenomena in thermodynamics), chemistry (phase and chemical reaction equilibrium, molecular conformation), and ecology (design and cost allocation for waste treatment systems), not to mention many mathematical problems of importance which can be looked at and studied as global optimization problems. For instance, finding a fixed point of a map $F: R^n \to R^n$ on a set $D \subset R^n$, which is a recurrent problem in applied mathematics, reduces to finding a global minimizer of the function $\|x - F(x)\|$ over $D$; likewise, solving a system of equations $g_i(x) = 0$ ($i = 1, \ldots, m$) amounts to finding a global minimizer of the function $f(x) = \max_i |g_i(x)|$.

5.3 Basic Classes of Nonconvex Problems

Optimization problems without any particular structure are hopelessly difficult. Therefore, our interest is focused first of all on problems whose underlying mathematical structure is more or less amenable to analysis and computational study. This point of view leads us to consider two basic classes of nonconvex global optimization problems: dc optimization and monotonic optimization.
optimization problems: dc optimization and monotonic optimization.

5.3.1 DC Optimization

A dc optimization (dc programming) problem is a problem of the form

$$\text{minimize } f(x) \quad \text{subject to } x \in D, \ h_i(x) \le 0 \ (i = 1, \ldots, m), \tag{5.3}$$

where $D$ is a convex set in $R^n$ and each of the functions $f(x), h_1(x), \ldots, h_m(x)$ is a dc function on $R^n$. Most optimization problems of practical interest are dc programs (see the examples in Sect. 4.2).

The class of dc optimization problems includes several important subclasses: concave minimization, reverse convex programming, and nonconvex quadratic optimization, among others.

a. Concave Minimization (Convex Maximization)

$$\text{minimize } f(x) \quad \text{subject to } x \in D, \tag{5.4}$$

where $f(x)$ is a concave function (i.e., $-f(x)$ is a convex function) and $D$ is a closed convex set given by an explicit system of convex inequalities $g_i(x) \le 0$, $i = 1, \ldots, m$.

This is the global optimization problem with the simplest structure. Since (5.4) can be written as $\min\{t \mid x \in D, \ f(x) \le t\}$, it is equivalent to minimizing $t$ under the constraint $(x, t) \in G \setminus H$ with $G = D \times R$ and $H = \{(x, t) \mid f(x) > t\}$ ($H$ is a convex set because the function $f(x)$ is concave).

Of particular interest is the linearly constrained concave minimization problem (i.e., problem (5.4) when the constraint set $D$ is a polyhedron given by a system of linear inequalities), which plays a central role in global optimization and has been extensively studied in the last four decades.

Since the most important property of $f(x)$ that should be exploited in problem (5.4) is its quasiconcavity (expressed in the fact that every upper level set $\{x \mid f(x) \ge \alpha\}$ is convex) rather than its concavity, it is convenient to slightly extend problem (5.4) by assuming $f(x)$ to be only quasiconcave. The corresponding problem, called quasiconcave minimization (or quasiconvex maximization), does not differ much from concave minimization. Also, by considering quasiconcave rather than concave minimization, more insight can be obtained into the relationship between this problem and other nonconvex optimization problems.

b. Reverse Convex Programming

$$\text{minimize } f(x) \quad \text{subject to } x \in D, \ h(x) \ge 0, \tag{5.5}$$

where $f(x), h(x)$ are convex finite functions on $R^n$ and $D$ is a closed convex set in $R^n$. Setting

$$C = \{x \mid h(x) \le 0\},$$

the problem can be rewritten as

$$\text{minimize } f(x) \quad \text{subject to } x \in D \setminus \mathrm{int}\,C.$$

This differs from a conventional convex program only by the presence of the reverse convex constraint $h(x) \ge 0$. Writing the problem as $\min\{t \mid f(x) \le t, \ x \in D, \ h(x) \ge 0\}$, we see that it amounts to minimizing $t$ over the set $G \setminus \mathrm{int}\,H$, with $G = \{(x, t) \mid f(x) \le t, \ x \in D\}$ and $H = C \times R$.

Just as concave minimization can be extended to quasiconcave minimization, so can problem (5.5) be extended to the case when $f(x)$ is quasiconvex. As indicated above, concave minimization can be converted to reverse convex programming by introducing an additional variable. For many years it was commonly believed (but never established) that, conversely, a reverse convex program should be equivalent to a concave minimization problem via some simple transformation. It was not until 1991 that a duality theory for nonconvex optimization was developed (Thach 1991), within which framework the two problems, quasiconcave minimization over convex sets and quasiconvex minimization over complements of convex sets, are dual to each other, so that solving one can be reduced to solving the other and vice versa.

c. Quadratic Optimization

An important subclass of dc programs is constituted by nonconvex quadratic programs which, in their most general formulation, consist in minimizing a quadratic function under nonconvex quadratic constraints, i.e., a problem (5.3) in which
\[
f(x) = \frac{1}{2}\langle x, Q_0 x\rangle + \langle c_0, x\rangle,
\qquad
h_i(x) = \frac{1}{2}\langle x, Q_i x\rangle + \langle c_i, x\rangle, \quad i = 1,\dots,m,
\]
where $Q_i,\ i = 0,1,\dots,m$, are symmetric $n \times n$ matrices and at least one of these matrices is indefinite or negative semidefinite.

5.3.2 Monotonic Optimization

This is the class of problems of the form

\[
\max\; f(x) \quad \text{subject to} \quad g(x) \le 0 \le h(x),\ x \in \mathbb{R}^n_+, \tag{5.6}
\]

where $f(x), g(x), h(x)$ are increasing functions on $\mathbb{R}^n_+$. A function $f: \mathbb{R}^n_+ \to \mathbb{R}$ is said to be increasing if $f(x) \le f(x')$ whenever $x, x' \in \mathbb{R}^n_+$, $x \le x'$.
Monotonic optimization is a new important chapter of global optimization which has been developed in the last 15 years. It is easily seen that if $\mathcal{F}$ denotes the family of increasing functions on $\mathbb{R}^n_+$ then
\[
f_1(x), f_2(x) \in \mathcal{F} \ \Rightarrow\ \min\{f_1(x), f_2(x)\} \in \mathcal{F}, \quad \max\{f_1(x), f_2(x)\} \in \mathcal{F}.
\]
These properties are similar to those of the family of dc functions, so a certain parallel exists between monotonic optimization and dc optimization.

5.4 General Nonconvex Structure

In the deterministic approach, much more than in the stochastic and other approaches to global optimization, it is essential to understand the mathematical structure of the problem being investigated. Only a deep understanding of this structure can provide insight into the most relevant properties of the problem and suggest efficient methods for handling it.
Despite the great variety of nonconvex optimization problems, most of them share a common mathematical structure, which is the complementary convex structure as described in (5.5). Consider, for example, the general dc programming problem (5.3). As we saw in Sect. 4.1, the system of dc constraints $h_i(x) \le 0,\ i = 1,\dots,m$, is equivalent to the single dc constraint

\[
h(x) = \max_{i=1,\dots,m} h_i(x) \le 0. \tag{5.7}
\]

If $h(x) = p(x) - q(x)$ then the constraint (5.7) in turn splits into two constraints: one convex constraint $p(x) \le t$ and one reverse convex constraint $q(x) \ge t$. Thus, assuming $f(x)$ linear and setting $z = (x,t)$, the dc program (5.3) can be reformulated as the reverse convex program

\[
\min\{l(z) \mid z \in G \setminus \mathrm{int}\,H\}, \tag{5.8}
\]

where $l(z) = f(x)$ is linear, $G = \{z = (x,t) \mid x \in D,\ p(x) \le t\}$, and $H = \{z = (x,t) \mid q(x) \le t\}$. This reverse convex program (5.8) is also referred to as a canonical dc programming problem.

Consider next the more general problem of continuous optimization

\[
\min\; f(x) \quad \text{subject to} \quad x \in S, \tag{5.9}
\]

where $S$ is a compact set in $\mathbb{R}^n$ and $f(x)$ is a continuous function on $\mathbb{R}^n$. As previously, by writing the problem as $\min\{t \mid f(x) \le t,\ x \in S\}$ and changing the notation, we can assume that the objective function $f(x)$ in (5.9) is a linear function. Then, by Proposition 4.13, if $r: \mathbb{R}^n \to \mathbb{R}_+$ is any separator for $S$ (for instance, $r(x) = \inf\{\|x - y\| \mid y \in S\}$), the set $S$ can be described as
\[
S = \{x \mid g(x) - \|x\|^2 \ge 0\},
\]
where $g(x) = \sup_{y \in S}\{-r^2(y) + 2\langle x, y\rangle - \|y\|^2\}$ is a convex function on $\mathbb{R}^n$. Therefore, problem (5.9) is again equivalent to a problem of the form (5.8).
To sum up, a complementary convex structure underlies every continuous global optimization problem. Of course, this structure is not always apparent, and even when it has been made explicit a lot of work still remains to be done to bring it into a form amenable to efficient computational analysis. Nevertheless, the attractive feature of the complementary convex structure is that, on the one hand, it involves convexity, and therefore brings to bear analytical tools from convex analysis, like subdifferentials, supporting hyperplanes, polar sets, etc. On the other hand, since convexity is involved in the wrong (reverse) direction, these tools must be used in some specific way and combined with combinatorial tools like cutting planes, relaxation, restriction, branch and bound, etc. In this sense, global optimization is a sort of bridge between nonlinear programming and combinatorial (discrete) optimization.
Remark 5.1 The connection between global optimization and combinatorial analysis is also apparent from the fact that any 0-1 integer programming problem can be converted into a global optimization problem. Specifically, a discrete constraint of the form
\[
x_i \in \{0,1\} \quad (i = 1,\dots,n)
\]
can be rewritten as
\[
x \in G \setminus H,
\]
with $G$ being the hypercube $[0,1]^n$ and $H$ the interior of the ball circumscribing the hypercube, or alternatively as
\[
0 \le x_i \le 1, \qquad \sum_{i=1}^n x_i(x_i - 1) \ge 0.
\]
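To make the second equivalence concrete, here is a minimal numerical check (Python; a sketch we add for illustration, not part of the original text) that on the unit cube the reverse convex constraint $\sum_i x_i(x_i-1) \ge 0$ cuts off exactly the non-binary points.

```python
# Check: on [0,1]^n, sum_i x_i*(x_i - 1) >= 0 holds exactly at 0-1 points.
import itertools
import random

def in_reverse_convex_set(x, tol=1e-12):
    """True iff 0 <= x_i <= 1 for all i and sum x_i(x_i - 1) >= -tol."""
    if any(xi < -tol or xi > 1 + tol for xi in x):
        return False
    return sum(xi * (xi - 1.0) for xi in x) >= -tol

# Every vertex of the cube satisfies the constraint ...
assert all(in_reverse_convex_set(v) for v in itertools.product([0.0, 1.0], repeat=3))
# ... while random interior points violate it (each term is strictly negative there).
random.seed(0)
assert not any(in_reverse_convex_set([random.uniform(0.05, 0.95) for _ in range(3)])
               for _ in range(100))
print("0-1 points are exactly the feasible points of the dc constraint")
```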

5.5 Global Optimality Criteria

The essence of an iterative method of optimization is that we have a trial solution $x^0$ to the problem and look for a better one. If the method used is some standard procedure of nonlinear programming and $x^0$ is a KKT point, there is no clue how to proceed further. We must then stop the procedure even though there is no guarantee that $x^0$ is a global optimal solution. It is this ability to become trapped at a stationary point which causes the failure of classical methods of nonlinear programming and motivates the need to develop global search procedures. Therefore, any global search procedure must be able to address the fundamental question of how to transcend a given feasible solution (the current incumbent, which may be a stationary point). In other words, the core of global optimization is the following issue:
Given a solution $x^0$ (the incumbent), check whether it is a global optimal solution, and if it is not, find a better feasible solution.
As in classical optimization theory, one way to deal with this issue is by devising criteria for recognizing a global optimum. For unconstrained dc optimization problems, we already know a global optimality criterion due to Hiriart-Urruty (Proposition 4.20), namely: given a proper convex l.s.c. function $h(x)$ on $\mathbb{R}^n$ and an arbitrary proper function $g: \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$, a point $x^0$ such that $g(x^0) < +\infty$ is an unconstrained global minimizer of $g(x) - h(x)$ if and only if

\[
\partial_\varepsilon h(x^0) \subset \partial_\varepsilon g(x^0) \quad \forall \varepsilon > 0. \tag{5.10}
\]

We now present two global optimality criteria for the general problem

\[
\min\{f(x) \mid x \in S\}, \tag{5.11}
\]

where the set $S \subset \mathbb{R}^n$ is assumed nonempty, compact, and robust, i.e., $S = \mathrm{cl}(\mathrm{int}\,S)$, and $f(x)$ is a continuous function on $S$. Since $f(x)$ is then bounded from below on $S$, without loss of generality we can also assume $f(x) > 0\ \forall x \in S$.

Proposition 5.1 A point $x^0 \in S$ is a global minimizer of $f(x)$ over $S$ if and only if $\gamma := f(x^0)$ satisfies either of the following conditions:
(i) The sequence $r(\gamma, k),\ k = 1,2,\dots$, is bounded (Falk 1973a), where
\[
r(\gamma, k) = \int_S \left(\frac{\gamma}{f(x)}\right)^k dx.
\]
(ii) The set $E := \{x \in S \mid f(x) < \gamma\}$ has null Lebesgue measure (Zheng 1985).
Proof Suppose $x^0$ is not a global minimizer, so that $f(\bar{x}) < \gamma$ for some $\bar{x} \in S$. By continuity we will have $f(x') < \gamma$ for all $x'$ sufficiently close to $\bar{x}$. On the other hand, since $S$ is robust, in any neighborhood of $\bar{x}$ there is an interior point of $S$. Therefore, by taking a point $x' \in \mathrm{int}\,S$ sufficiently close to $\bar{x}$ we will have $f(x') < \gamma$. Let $W$ be a neighborhood of $x'$ contained in $S$ such that $f(y) < \gamma$ for all $y \in W$. Then $W \subset E$, hence $\mathrm{mes}\,E \ge \mathrm{mes}\,W > 0$. Conversely, if $\mathrm{mes}\,E > 0$, then $E \ne \emptyset$, hence $x^0$ is not a global minimizer. This proves that (ii) is a necessary and sufficient condition for the optimality of $x^0$. We now show that (i) is equivalent to (ii). If $\mathrm{mes}\,E > 0$, then since $E = \bigcup_{i=1}^{+\infty}\{x \in S \mid \gamma/f(x) \ge 1 + 1/i\}$ there is $i_0$ such that $E_0 = \{x \in S \mid \gamma/f(x) \ge 1 + 1/i_0\}$ has
$\mathrm{mes}\,E_0 = \delta > 0$. This implies
\[
\int_S \left(\frac{\gamma}{f(x)}\right)^k dx \ \ge\ \delta\,(1 + 1/i_0)^k \to +\infty \quad \text{as } k \to +\infty.
\]
On the other hand, if $\mathrm{mes}\,E = 0$ then
\[
\int_S \left(\frac{\gamma}{f(x)}\right)^k dx = \int_{S \setminus E} \left(\frac{\gamma}{f(x)}\right)^k dx \le \mathrm{mes}(S \setminus E) \quad \forall k,
\]
proving that (i) $\Leftrightarrow$ (ii). $\square$
Using Falk's criterion it is even possible, when the global optimizer is unique, to derive a global criterion which in fact yields a closed formula for this unique global optimizer (Pincus 1968).

Proposition 5.2 Assume, in addition to the previously specified conditions, that $f(x)$ has a unique global minimizer $x^0$ on $S$. Then for every $i = 1,\dots,n$:

\[
\frac{\int_S x_i\, e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx} \ \to\ x^0_i \quad (k \to +\infty). \tag{5.12}
\]

Proof We show that for every $i = 1,\dots,n$:

\[
\varphi_i(k) := \frac{\int_S |x_i - x^0_i|\, e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx} \ \to\ 0 \quad (k \to +\infty). \tag{5.13}
\]

For any fixed $\varepsilon > 0$ let $W_\varepsilon = \{x \in S \mid \|x - x^0\| < \varepsilon\}$ and write
\[
\varphi_i(k) = \frac{\int_{W_\varepsilon} |x_i - x^0_i|\, e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx} + \frac{\int_{S \setminus W_\varepsilon} |x_i - x^0_i|\, e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx}.
\]
Since $x^0$ is the unique minimizer of $f(x)$ over $S$, clearly $\min\{f(x) \mid x \in S \setminus W_\varepsilon\} > f(x^0)$ and there is $\eta > 0$ such that $\max\{f(x^0) - f(x) \mid x \in S \setminus W_\varepsilon\} < -\eta$. Then
\[
\frac{\int_{W_\varepsilon} |x_i - x^0_i|\, e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx} \ \le\ \varepsilon\, \frac{\int_{W_\varepsilon} e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx} \ \le\ \varepsilon. \tag{5.14}
\]
On the other hand, setting $M = \max_{x \in S}\|x - x^0\|$, we have
\[
\frac{\int_{S \setminus W_\varepsilon} |x_i - x^0_i|\, e^{-kf(x)}\,dx}{\int_S e^{-kf(x)}\,dx}
\ \le\ M\, \frac{\int_{S \setminus W_\varepsilon} e^{k(f(x^0) - f(x))}\,dx}{\int_S e^{k(f(x^0) - f(x))}\,dx} \tag{5.15}
\]
\[
\ \le\ \frac{M\, \mathrm{mes}(S \setminus W_\varepsilon)\, e^{-k\eta}}{\int_S e^{k(f(x^0) - f(x))}\,dx}. \tag{5.16}
\]
The latter ratio tends to 0 as $k \to +\infty$ because $\int_S e^{k(f(x^0) - f(x))}\,dx$ is bounded by Proposition 5.1 (Falk's criterion for $x^0$ to be the global minimizer of $e^{f(x)}$). Since $\varepsilon > 0$ is arbitrary, this together with (5.14) implies (5.13). $\square$
Remark 5.2 Since $f(x) > 0\ \forall x \in S$, a minimizer of $f(x)$ over $S$ is also a maximizer of $\frac{1}{f(x)}$ and conversely. Therefore, from the above propositions we can state, under the same assumptions as previously ($S$ compact and robust and $f(x)$ continuous positive on $S$):
(i) A point $x^0 \in S$ with $f(x^0) = \gamma$ is a global maximizer of $f(x)$ over $S$ if and only if the sequence
\[
s(\gamma, k) := \int_S \left(\frac{f(x)}{\gamma}\right)^k dx, \quad k = 1,2,\dots,
\]
is bounded.
(ii) If $f(x)$ has a unique global maximizer $x^0$ over $S$ then
\[
x^0_i = \lim_{k \to +\infty} \frac{\int_S x_i\, e^{kf(x)}\,dx}{\int_S e^{kf(x)}\,dx}, \quad i = 1,\dots,n.
\]
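The following short Monte Carlo sketch (toy objective, sample size, and region of our choosing; not data from the book) illustrates formula (5.12): the Boltzmann-type weighted averages drift toward the unique global minimizer as $k$ grows.

```python
# Illustration of the Pincus formula (5.12) on a nonconvex double-well objective.
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # unique global minimizer near (-1.01, 0): the 0.1*x term tilts the double well
    return (x[..., 0] ** 2 - 1.0) ** 2 + 0.1 * x[..., 0] + x[..., 1] ** 2

X = rng.uniform(-2.0, 2.0, size=(200000, 2))    # uniform sample on S = [-2, 2]^2
for k in [1, 10, 100, 1000]:
    w = np.exp(-k * (f(X) - f(X).min()))        # shift the exponent for stability
    print(k, (X * w[:, None]).sum(axis=0) / w.sum())
# The printed averages approach the unique global minimizer, roughly (-1.01, 0).
```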

5.6 DC Inclusion Associated with a Feasible Solution

In view of the difficulties with computing multiple integrals, the above global optimality criteria are of very limited applicability. On the other hand, using the complementary convex structure common to most optimization problems, it is possible to formulate global optimality criteria in such a way that in most cases transcending an incumbent solution $x^0$ reduces to checking a dc inclusion of the form

\[
D \subset C, \tag{5.17}
\]

where $D, C$ are closed convex subsets of $\mathbb{R}^n$ depending in general on $x^0$. Furthermore, this inclusion is such that from any $x \in D \setminus C$ it is easy to derive a better feasible solution than $x^0$.
Specifically, for the concave minimization problem (5.4), $C = \{x \mid f(x) \ge f(x^0)\}$ and $D$ is the constraint set. Clearly these sets are closed and convex, and $x^0 \in D$ is optimal if and only if $D \subset C$, while any $x \in D \setminus C$ yields a better feasible solution than $x^0$.

Consider now the canonical dc program (5.8):

\[
\min\{l(x) \mid x \in G \setminus \mathrm{int}\,H\}, \tag{5.18}
\]

where $G$ and $H$ are closed convex subsets of $\mathbb{R}^n$. As shown in Sect. 4.3, this is the general form to which one can reduce virtually any global optimization problem to be considered in this book. Assume that the feasible set $D = G \setminus \mathrm{int}\,H$ is robust, i.e., satisfies $D = \mathrm{cl}(\mathrm{int}\,D)$, and that the problem is genuinely nonconvex, which means that
\[
\min\{l(x) \mid x \in G\} < \min\{l(x) \mid x \in G \setminus \mathrm{int}\,H\}. \tag{5.19}
\]

Proposition 5.3 Under the stated assumptions a feasible solution $x^0$ to (5.18) is global optimal if and only if
\[
\{x \in G \mid l(x) \le l(x^0)\} \subset H. \tag{5.20}
\]

Proof Suppose there exists $\bar{x} \in G \cap \{x \mid l(x) \le l(x^0)\} \setminus H$. From (5.19) we have $l(z) < l(\bar{x})$ for at least some $z \in G \cap \mathrm{int}\,H$. Then the line segment $[z, \bar{x}]$ meets the boundary $\partial H$ at some point $x' \in G \setminus \mathrm{int}\,H$, i.e., at some feasible point $x'$. Since $\bar{x} \notin H$, we must have $x' = \lambda\bar{x} + (1-\lambda)z$ with $0 < \lambda < 1$, hence $l(x') = \lambda l(\bar{x}) + (1-\lambda)l(z) < l(\bar{x}) \le l(x^0)$. Thus, if (5.20) does not hold then we can find a feasible solution $x'$ better than $x^0$. Conversely, suppose there exists a feasible solution $\bar{x}$ better than $x^0$. Let $W$ be a ball around $\bar{x}$ such that $l(x') < l(x^0)$ for all $x' \in W$. Since $D$ is robust one can find a point $x' \in W \cap \mathrm{int}\,D$. Then clearly $x' \in G \setminus H$ and $l(x') < l(x^0)$, contrary to (5.20). Therefore, if (5.20) holds, $x^0$ must be global optimal. $\square$
Thus, for problem (5.18) transcending $x^0$ amounts to solving an inclusion (5.17) where $D = \{x \in G \mid l(x) \le l(x^0)\}$ and $C = H$. If $\bar{x} \in D \setminus C$ then the point $x' \in \partial H \cap [z, \bar{x}]$ yields a better feasible solution than $x^0$, where $z$ is any point of $D \cap \mathrm{int}\,H$ with $l(z) < l(\bar{x})$ [this point exists by virtue of (5.19)].
We shall refer to the problem of checking an inclusion $D \subset C$ or, equivalently, solving the inclusion $x \in D \setminus C$, as the DC feasibility problem. Note that for the concave minimization problem the set $C = \{x \mid f(x) \ge f(x^0)\}$ depends on $x^0$, whereas for the canonical dc program the set $D = \{x \in G \mid l(x) \le l(x^0)\}$ depends on $x^0$.
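As an illustration of this improvement step, the following sketch (with an assumed toy set $H$, not an instance from the book) locates the boundary point $x' \in \partial H \cap [z,\bar{x}]$ by bisection on the segment parameter; any one-dimensional root-finding scheme would do.

```python
# Given z in int H and xbar outside H, find x' on [z, xbar] with x' on the
# boundary of H = {x : h(x) <= 0}, h convex; then l(x') < l(xbar) as in Prop. 5.3.
import numpy as np

def boundary_point(h, z, xbar, tol=1e-10):
    """Bisection on t in x(t) = (1-t) z + t xbar; requires h(z) < 0 < h(xbar)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h((1 - mid) * z + mid * xbar) <= 0:
            lo = mid            # still inside H: move toward xbar
        else:
            hi = mid            # outside H: move toward z
    return (1 - lo) * z + lo * xbar

# Toy instance: H is the unit ball, h(x) = |x|^2 - 1.
h = lambda x: float(np.dot(x, x)) - 1.0
print(boundary_point(h, np.zeros(2), np.array([2.0, 0.0])))   # ~ [1, 0]
```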

5.7 Approximate Solution

In practice, the computation of an exact global optimal solution may be too expensive, while most often we only need the global optimum within some prescribed tolerance $\varepsilon \ge 0$. So the problem
\[
(P) \qquad \min\{f(x) \mid x \in D\}
\]

can be considered solved if a feasible solution $\bar{x}$ has been found such that

\[
f(\bar{x}) - f(x) \le \varepsilon \quad \forall x \in D. \tag{5.21}
\]

Such a feasible solution is called a global $\varepsilon$-optimal solution. Setting $\Delta f = \max f(D) - \min f(D)$, the relative error made by accepting $f(\bar{x})$ as the global optimum is
\[
\frac{f(\bar{x}) - \min f(D)}{\Delta f} \ \le\ \frac{\varepsilon}{\Delta f} \ \le\ \frac{\varepsilon}{M},
\]
where $M$ is a lower bound for $\Delta f$.
With tolerance $\varepsilon$, the problem of transcending a feasible solution $x^0$ then consists in the following:
(*) Find a feasible solution $z$ such that $f(z) < f(x^0)$, or else prove that $x^0$ is a global $\varepsilon$-minimizer.
A typical feature of many global optimization problems which results from their complementary convex structure is that a subset $X$ (typically, a finite subset of the boundary) of the feasible domain $D$ is known to contain at least one global minimizer and is such that, starting from any point $z \in D$, we can find, by local search, a point $x \in X$ with $f(x) \le f(z)$. Under these circumstances, if a procedure is available for solving (*) then $(P)$ can be solved according to a two-phase scheme as follows:
Let $z^0 \in D$ be an initial feasible solution. Set $k = 0$.
Phase 1 (Local phase). Starting from $z^k$ search for a point $x^k \in X$ such that $f(x^k) \le f(z^k)$. Go to Phase 2.
Phase 2 (Global phase). Solve (*) for $x^0 = x^k$ by the available procedure. If the output of this procedure is the global $\varepsilon$-optimality of $x^k$, then terminate. Otherwise, the procedure returns a feasible point $z$ such that $f(z) < f(x^k)$. Then set $z^{k+1} = z$, increase $k$ by 1, and go back to Phase 1.
Phase 1 can be carried out by using any local optimization or nondeterministic method (heuristic or stochastic). Only Phase 2 is what distinguishes deterministic approaches from the other ones. In the next chapters we shall discuss various methods for solving inclusions of the form (*), i.e., practically, for carrying out Phase 2, for various classes of problems. Clearly, the completion of one cycle of the above described iterative scheme achieves a move from one feasible point $x^k \in X$ to a feasible point $x^{k+1} \in X$ such that $f(x^{k+1}) < f(x^k)$. If $X$ is finite then the process is guaranteed to terminate after finitely many cycles with a global $\varepsilon$-optimal solution. In the general case, there is the possibility of the process being infinite. Fortunately, however, for most problems it turns out that the iterative process can be organized so that, whenever infinite, it will converge to a global optimum; in other words, after a sufficient number of iterations it will yield a solution as close to the global optimum as desired.
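In code, the scheme can be organized as the following skeleton (a sketch: `local_search` and `transcend` stand for the problem-specific Phase 1 and Phase 2 oracles and are assumptions we introduce here, not procedures defined in the book).

```python
# Two-phase scheme of Sect. 5.7: alternate a local improvement (Phase 1)
# with a transcend-or-certify step (Phase 2).
def two_phase(f, local_search, transcend, z0, eps=1e-6, max_cycles=100):
    """local_search(z) -> x in X with f(x) <= f(z);
    transcend(x, eps) -> None if x is globally eps-optimal, else z with f(z) < f(x)."""
    z = z0
    for _ in range(max_cycles):
        x = local_search(z)          # Phase 1 (local)
        z = transcend(x, eps)        # Phase 2 (global)
        if z is None:
            return x                 # certified global eps-optimal
    raise RuntimeError("cycle budget exhausted")
```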

5.8 When is Global Optimization Motivated?

Since global optimization methods are generally expensive, they should be used only when they are really needed. Before applying a sophisticated global optimization algorithm to a given problem, it is therefore necessary to perform elementary operations (such as fixing certain variables to eliminate them, tightening bounds, or changing the variables) to improve or simplify the formulation wherever possible. In some simple cases, an obvious transformation may reduce a seemingly nonconvex problem to a convex or even linear program. For instance, a problem of the form
\[
\min\{\sqrt{1 + (l(x))^2}\ \mid\ x \in D\},
\]
where $D$ is a polyhedron and $l(x)$ a linear function nonnegative on $D$, is merely a disguise of the linear program $\min\{l(x) \mid x \in D\}$. More generally, problems such as $\min\{\varphi(l(x)) \mid x \in D\}$, where $D$ is a polyhedron, $l(x)$ a linear function, and $\varphi(u)$ a nondecreasing function of $u$ on the set $l(D)$, should be recognized as linear programs and solved as such.¹
There are, however, circumstances when more or less sophisticated transformations may be required to remove the apparent nonconvexities and reformulate the problem in a form amenable to known efficient solution methods. Therefore, preprocessing is often a necessary preliminary phase before solving any global optimization problem.
Example 5.5 Consider the following problem encountered in the theoretical evaluation of certain fault-tolerant shortest path distributed algorithms (Bui 1993):
\[
(Q) \quad \min \sum_{i=1}^n \frac{(\sigma - 2x_i)y_i + x_i}{(\sigma - y_i)\,x_i} \quad \text{s.t.} \tag{5.22}
\]
\[
\sum_{i=1}^n x_i \ge a, \qquad \sum_{i=1}^n y_i \ge b, \tag{5.23}
\]
\[
\alpha \le x_i \le \beta, \qquad \gamma \le y_i \le \delta, \tag{5.24}
\]
where $0 < n\gamma \le b$, $0 < \beta < 1 < \sigma$, and $0 < \delta < 1$. Assume that $n\beta \ge a$, $n\delta \ge b$ (otherwise the problem is infeasible). The objective function looks rather complicated, and one might be tempted to study the problem as a difficult nonconvex optimization problem. Close scrutiny reveals, however, that certain variables can be fixed and eliminated. Specifically, by rewriting the objective function as
\[
\sum_{i=1}^n \left[\left(\frac{\sigma^2}{x_i} - 2\sigma + 1\right)\frac{1}{\sigma - y_i} - \frac{\sigma}{x_i} + 2\right], \tag{5.25}
\]

¹ Unfortunately, such linear programs disguised as nonlinear global optimization problems have been, and still are, sometimes used for testing global optimization algorithms.

it is clear that a feasible solution $(x,y)$ of (Q) can be optimal only if $x_i = \beta,\ i = 1,\dots,n$ (the coefficient of $1/x_i$ in (5.25) is $\sigma y_i/(\sigma - y_i) > 0$, so each term decreases as $x_i$ increases). Hence, we can set $x_i = \beta,\ i = 1,\dots,n$. Problem (Q) then becomes
\[
(\tilde{Q}) \quad \min \sum_{i=1}^n \left[\left(\frac{\sigma^2}{\beta} - 2\sigma + 1\right)\frac{1}{\sigma - y_i} - \frac{\sigma}{\beta} + 2\right] \quad \text{s.t.} \tag{5.26}
\]
\[
\sum_{i=1}^n y_i \ge b, \qquad \gamma \le y_i \le \delta. \tag{5.27}
\]
The objective function of $(\tilde{Q})$ can further be rewritten as
\[
\sum_{i=1}^n \left[\frac{\mu}{\sigma - y_i} - \frac{\sigma}{\beta} + 2\right]
\]
with $\mu = \sigma^2/\beta + 1 - 2\sigma = (1 - \sigma) + \sigma(\sigma/\beta - 1) > 0$. Since each term $\frac{1}{\sigma - y_i}$ is a convex function of $y_i$ (recall $y_i \le \delta < 1 < \sigma$), it follows that $(\tilde{Q})$ is a convex program. Furthermore, due to the special structure of this program, an optimal solution of it can be computed in closed form by standard convex mathematical programming methods. Note, however, that if the inequalities (5.23) are replaced by equalities then the problem becomes a truly nonconvex optimization problem.
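For a concrete feel, the following sketch solves an instance of $(\tilde{Q})$ numerically; the parameter values (and indeed the letter names $\sigma, \beta, \gamma, \delta, a, b$ used in our reconstruction above) are illustrative assumptions, not data from the book.

```python
# Numerical sketch of the reduced convex program (Q~) with toy parameters.
import numpy as np
from scipy.optimize import minimize

n, sigma, beta, gamma, delta, b = 5, 2.0, 0.8, 0.1, 0.9, 2.5
mu = sigma**2 / beta + 1.0 - 2.0 * sigma          # > 0 under the stated conditions

def objective(y):
    # sum_i [ mu/(sigma - y_i) - sigma/beta + 2 ], convex for y_i < sigma
    return float(np.sum(mu / (sigma - y) - sigma / beta + 2.0))

cons = [{"type": "ineq", "fun": lambda y: np.sum(y) - b}]    # sum y_i >= b
res = minimize(objective, x0=np.full(n, 0.5), bounds=[(gamma, delta)] * n,
               constraints=cons, method="SLSQP")
print(res.x, res.fun)   # by symmetry the optimum spreads y equally: y_i = b/n
```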
Example 5.6 (Geometric Programming) An important class of optimization problems called geometric programming problems, which have met with many engineering applications (see, e.g., Duffin et al. 1967), have the following general formulation:
\[
\min\{g_0(x) \mid g_i(x) \le 1\ (i = 1,\dots,m),\ x = (x_1,\dots,x_n) > 0\},
\]
where each $g_i(x)$ is a posynomial, i.e., a function of the form
\[
g_i(x) = \sum_{j=1}^{T_i} c_{ij} \prod_{k=1}^n (x_k)^{a_{ijk}}, \quad i = 0,1,\dots,m,
\]
with $c_{ij} > 0\ \forall i = 1,\dots,m,\ j = 1,\dots,T_i$. Setting
\[
x_k = \exp(t_k), \qquad g_i(x) = \sum_{j=1}^{T_i} c_{ij}\exp\Big(\sum_k a_{ijk}t_k\Big) := h_i(t),
\]
one can rewrite this problem in the form
\[
\min\{\log h_0(t) \mid \log h_i(t) \le 0,\ i = 1,\dots,m\}.
\]



It is now not hard to see that each function $h_i(t)$ is convex on $\mathbb{R}^n$. Indeed, if the vectors $a_{ij} = (a_{ij1},\dots,a_{ijn}) \in \mathbb{R}^n$ are defined, then
\[
h_i(t) = \sum_{j=1}^{T_i} c_{ij}\, e^{\langle a_{ij}, t\rangle},
\]
and, since $c_{ij} > 0$, to prove the convexity of $h_i(t)$ it suffices to show that for any vector $a \in \mathbb{R}^n$ the function $e^{\langle a, t\rangle}$ is convex. But the function $\varphi(y) = e^y$ is convex increasing on the real line $-\infty < y < +\infty$, while $l(t) = \langle a, t\rangle$ is linear in $t$; therefore, by Proposition 2.8, the composite function $\varphi(l(t)) = e^{\langle a, t\rangle}$ is convex.
Thus a geometric program is actually a convex program of a special form. Very efficient specialized methods exist for solving this problem (Duffin et al. 1967). Note, however, that if certain $c_{ij}$ are negative, then the problem becomes a difficult dc optimization problem (see, e.g., McCormick 1982).
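The exp/log transformation is easy to carry out numerically. The sketch below (toy posynomial data of our choosing, solved with a general-purpose solver rather than a specialized geometric programming method) minimizes $\log h_0(t)$ subject to $\log h_1(t) \le 0$ and recovers $x = e^t$.

```python
# Geometric program via the convex reformulation in t = log x.
import numpy as np
from scipy.optimize import minimize

def h(c, A, t):
    # posynomial in x becomes h(t) = sum_j c_j exp(<a_j, t>), a convex function
    return float(np.sum(c * np.exp(A @ t)))

c0, A0 = np.array([1.0, 1.0]), np.array([[-1.0, -1.0], [1.0, 0.0]])  # g0 = 1/(x1 x2) + x1
c1, A1 = np.array([0.5, 0.5]), np.array([[1.0, 0.0], [0.0, 1.0]])    # g1 = (x1 + x2)/2

res = minimize(lambda t: np.log(h(c0, A0, t)), x0=np.zeros(2),
               constraints=[{"type": "ineq", "fun": lambda t: -np.log(h(c1, A1, t))}],
               method="SLSQP")      # constraint log h1(t) <= 0, i.e. g1(x) <= 1
print(np.exp(res.x))                # recover x; roughly (0.63, 1.37), g1 active
```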
Example 5.7 (Generalized Linear Programming) A generalized linear program is a problem of the form:
\[
(\mathrm{GLP}) \quad
\begin{array}{ll}
\min & cx \\
\text{s.t.} & \sum_{j=1}^n y^j x_j = b, \\
 & y^j \in Y_j, \quad j = 1,\dots,n, \\
 & x \ge 0,
\end{array} \tag{5.28}
\]
where $x \in \mathbb{R}^n$, $y^j \in \mathbb{R}^m\ (j = 1,\dots,n)$, $b \in \mathbb{R}^m$, and $Y_j \subset \mathbb{R}^m\ (j = 1,\dots,n)$ are polyhedrons. The name comes from the fact that this problem reduces to an ordinary linear program when the vectors $y^1,\dots,y^n$ are constant, i.e., when each $Y_j$ is a singleton. Since the $y^j$ are now variables, the constraints are bilinear. In spite of that, the analogy of the problem with ordinary linear programs suggests an efficient algorithm for solving it which can be considered as an extension of the classical simplex method (Dantzig 1963).
A special case of (GLP) is the following problem which arises, e.g., from certain applications in agriculture:
\[
(\mathrm{SP}) \quad \min \sum_{i=1}^n c_i x_i \quad \text{s.t.} \tag{5.29}
\]
\[
x \in D \subset \mathbb{R}^n, \tag{5.30}
\]
\[
x_i = u_i v_i, \quad i = 1,\dots,p \le n, \tag{5.31}
\]
\[
u \in [r,s] \subset \mathbb{R}^p_{++}, \qquad v \in S \subset \mathbb{R}^p_+, \tag{5.32}
\]

where $D, S$ are given polytopes in $\mathbb{R}^n, \mathbb{R}^p$, respectively. Setting $u_i = x_i,\ v_i = 1$ for all $i > p$ and substituting (5.31) into (5.29) and (5.30), we see that the problem is to minimize a bilinear function in $u, v$ under linear and bilinear constraints. However, this natural transformation would lead to a complicated nonlinear problem, whereas a much simpler solution method is obtained by reducing it to a (GLP) as follows. Let
\[
S = \Big\{v \in \mathbb{R}^p_+ \;\Big|\; \sum_{j=1}^p \alpha_{ij} v_j = e_i\ (i = 1,\dots,m)\Big\}, \tag{5.33}
\]
\[
D = \Big\{x \in \mathbb{R}^n_+ \;\Big|\; \sum_{j=1}^n \beta_{kj} x_j = d_k\ (k = 1,\dots,l)\Big\}. \tag{5.34}
\]

By rewriting the constraints (5.30)–(5.32) as
\[
\sum_{j=1}^p \alpha_{ij}\frac{x_j}{u_j} = e_i, \quad i = 1,\dots,m,
\]
\[
\sum_{j=1}^n \beta_{kj} x_j = d_k, \quad k = 1,\dots,l,
\]
\[
r_j \le u_j \le s_j, \quad j = 1,\dots,p, \qquad x_j \ge 0, \quad j = 1,\dots,n,
\]
we obtain a (GLP) where $b = \binom{e}{d} \in \mathbb{R}^{m+l}$,
\[
y^j \in \mathbb{R}^{m+l}, \qquad
y^j_h = \begin{cases}
\alpha_{hj}/u_j & h = 1,\dots,m;\ j = 1,\dots,p \\
0 & h = 1,\dots,m;\ j = p+1,\dots,n \\
\beta_{h-m,j} & h = m+1,\dots,m+l;\ j = 1,\dots,n
\end{cases}
\]
and $Y_j$ is a polyhedron in $\mathbb{R}^{m+l}$ defined by
\[
\alpha_{hj}/s_j \le y^j_h \le \alpha_{hj}/r_j, \quad h \in I_j^+, \tag{5.35}
\]
\[
\alpha_{hj}/r_j \le y^j_h \le \alpha_{hj}/s_j, \quad h \in I_j^-, \tag{5.36}
\]
\[
I_j^+ = \{i \mid \alpha_{ij} > 0\}, \qquad I_j^- = \{i \mid \alpha_{ij} < 0\}. \tag{5.37}
\]

An interesting feature of this problem worth noticing is that, despite the bilinear constraints, the set $\Omega$ of all $x$ satisfying (5.31), (5.32) is a polytope, so that (SP) is actually a linear program (Thieu 1992). To see this, for every fixed $u \in [r,s]$ denote by $\tilde{u}: \mathbb{R}^p \to \mathbb{R}^p$ the affine map that carries $v \in \mathbb{R}^p$ to $\tilde{u}(v) \in \mathbb{R}^p$ with $\tilde{u}_i(v) = u_i v_i,\ i = 1,\dots,p$. Then the set $\tilde{u}(S)$ is a polytope, as the image of a polytope under an affine map. Now let $u^j,\ j \in J$, be the vertices of the rectangle $[r,s]$, let $\tilde{G}$ be the convex hull of $\bigcup_{j \in J}\tilde{u}^j(S)$, and set $G = \{x \in \mathbb{R}^n \mid (x_1,\dots,x_p) \in \tilde{G}\}$. If $x \in G$, then $x_i = \sum_{j \in J}\lambda_j \tilde{u}^j_i(v^j),\ i = 1,\dots,p$, with $v^j \in S$, $\lambda_j \ge 0$, $\sum_{j \in J}\lambda_j = 1$ (Exercise 7, Chap. 1), and so
\[
x_i = \sum_{j \in J}\lambda_j u^j_i v^j_i, \quad i = 1,\dots,p. \tag{5.38}
\]
Since $u^j_i > 0,\ v^j_i \ge 0$, one can have $\sum_{j \in J}\lambda_j v^j_i = 0$ only if $x_i = 0$, so one can write $x_i = u_i v_i\ (i = 1,\dots,p)$, with $v = \sum_{j \in J}\lambda_j v^j \in S$, and
\[
u_i = \begin{cases} \dfrac{x_i}{v_i} & \text{if } v_i > 0 \\ r_i & \text{otherwise.} \end{cases}
\]
Clearly, $r_i \le u_i \le s_i,\ i = 1,\dots,p$, because $r_i v_i \le x_i \le s_i v_i$ by (5.38). Thus $x \in G$ implies that $x \in \Omega$. Conversely, if $x \in \Omega$, i.e., $x_i = u_i v_i,\ i = 1,\dots,p$, for some $u \in [r,s],\ v \in S$, then $u = \sum_{j \in J}\lambda_j u^j$ with $\lambda_j \ge 0$, $\sum_{j \in J}\lambda_j = 1$, hence $x_i = \sum_{j \in J}\lambda_j(u^j_i v_i),\ i = 1,\dots,p$, which means that $x \in G$. Therefore $\Omega = G$, and so the feasible set of (SP) is a polytope. Though this result implies that (SP) can be solved as a linear program, the determination of the constraints of this linear program [i.e., the linear inequalities describing the feasible set] may be computationally cumbersome. It should also be noted that the problem becomes highly nonconvex if the rectangle $[r,s]$ in the condition $u \in [r,s]$ is replaced by an arbitrary polytope.
Example 5.8 (Semidefinite Programming) In recent years linear matrix inequality (LMI) techniques have emerged as powerful numerical tools for the analysis and design of control systems. Recall from Sect. 2.12.2 that an LMI is any matrix inequality of the form
\[
Q(x) := Q_0 + \sum_{i=1}^n x_i Q_i \preceq 0, \tag{5.39}
\]
where $Q_i,\ i = 0,1,\dots,n$, are real symmetric $m \times m$ matrices, and for any two real symmetric $m \times m$ matrices $A, B$ the notation $A \preceq B$ ($A \prec B$, resp.) means that the matrix $A - B$ is negative semidefinite (definite, resp.), i.e., the largest eigenvalue of $A - B$, $\lambda_{\max}(A - B)$, is nonpositive (negative, resp.). A semidefinite program is a linear program under LMI constraints, i.e., a problem of the form
\[
\min\{cx \mid Q(x) \preceq 0\}, \tag{5.40}
\]

where Q.x/ is as in (5.39). Obviously, Q.x/ is negative semidefinite if and only if

supfuT Q.x/uj u 2 Rm g  0;

Since for fixed u the function 'u .x/ WD uT Q.x/u is affine in x, it follows that the set
G D fx 2 Rn j 'u .x/  0 8u 2 Rm g (the feasible set of (5.40)) is convex. Therefore,
the problem (5.40) is a convex program which can be solved by specialized interior
point methods (see, e.g., Boyd and Vandenberghe 2004). Note that the problem
minfcxj Q.x/ 0g is also a semidefinite program , since Q.x/ 0 if and only
if Q.x/
0.
Many problems from control theory reduce to nonconvex global optimization problems under LMI constraints. For instance, the robust performance problem (Apkarian and Tuan 1999) can be formulated as that of minimizing $\gamma$ subject to
\[
\sum_{k=1}^N z_k A_{0k} + \sum_{i=1}^m (u_i A_i + v_i A_{i+m}) + B \prec 0, \tag{5.41}
\]
\[
u_i = x_i\gamma, \quad x_i v_i = \gamma, \quad i = 1,\dots,m, \tag{5.42}
\]
\[
x \in [a,b], \quad z \in [d,\bar{d}], \tag{5.43}
\]
where $A_{0k}, A_i, B$ are real symmetric matrices (the constraint (5.41) is convex, while the constraints (5.42) are bilinear).
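A minimal numerical sketch of (5.40) follows (toy matrices of our choosing; a dedicated interior point SDP solver would be used in practice). It treats the LMI through the equivalent scalar constraint $\lambda_{\max}(Q(x)) \le 0$.

```python
# Semidefinite program min{ cx : Q(x) <= 0 } via the eigenvalue characterization.
import numpy as np
from scipy.optimize import minimize

Q0 = np.array([[-1.0, 0.0], [0.0, -2.0]])
Q1 = np.array([[1.0, 0.0], [0.0, 0.0]])
Q2 = np.array([[0.0, 1.0], [1.0, 0.0]])
c = np.array([-1.0, -1.0])                      # i.e., maximize x1 + x2

def lam_max(x):
    # eigvalsh returns eigenvalues in ascending order; take the largest
    return float(np.linalg.eigvalsh(Q0 + x[0] * Q1 + x[1] * Q2)[-1])

res = minimize(lambda x: c @ x, x0=np.zeros(2),
               constraints=[{"type": "ineq", "fun": lambda x: -lam_max(x)}],
               method="SLSQP")                  # toy solver choice, not robust in general
print(res.x, lam_max(res.x))                    # roughly (0.5, 1.0), constraint active
```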

5.9 Exercises

1 Let $C, D$ be two compact convex sets in $\mathbb{R}^2$ such that $D \subset \mathrm{int}\,C$. Show that the distance $d(x) = \inf\{\|x - y\| : y \notin C\}$ from a point $x \in D$ to the boundary of $C$ is a concave function. Suppose $C$ is a lake and $D$ an island in the lake. Show that finding the shortest bridge connecting $D$ with the mainland (i.e., $\mathbb{R}^2 \setminus \mathrm{int}\,C$) is a concave minimization problem. If $C, D$ are merely compact (not necessarily convex), show that this is a dc optimization problem.

2 (Weber problem with attraction and repulsion)
1. A single facility is to be constructed to serve $n$ users located at points $a^j \in D$ on the plane (where $D$ is a rectangle in $\mathbb{R}^2$). If the facility is located at $x \in D$, then the attraction of the facility to user $j$ is $q_j(h_j(x))$, where $h_j(x) = \|x - a^j\|$ is the distance from $x$ to $a^j$ and $q_j: \mathbb{R} \to \mathbb{R}$ is a convex decreasing function (the farther $x$ is away from $a^j$, the less attractive it looks to user $j$). Show that the problem of finding the location of the facility with maximal total attraction $\sum_{j=1}^n q_j(h_j(x))$ is a dc optimization problem.
2. In practice, some of the points $a^j$ may be repulsion rather than attraction points for the facility. For example, there may be in the area $D$ a garbage dump, or

sewage plant, or nuclear plant, and one may wish the facility to be located as far
away from these points as possible. If J1 is the set of attraction points, J2 the set
of repulsion points, then in typical situations one may have
(
j  wj kx  aj k j 2 J1 ;
qj hj .x/ D
wj e j kxa k
j
j 2 J2 ;

where j > 0 is the maximal attraction of point j 2 J1 , j > 0 is the rate of decay
of the repulsion of point j 2 J2 and wj > 0 for all j. Show that in this case the
problem of maximizing the function
X X
qj hj .x/  qj hj .x/
j2J1 j2J2

is still a dc optimization problem. (When J2 D ; the problem is known as


Webers problem).
3P In the previous problem, show that if there is an index i 2 J1 such that wi 
n i
jD1;ji wj then a is the optimal location of the facility (majority theorem).

4 (Competitive location) In Exercise 2, suppose that $J_2 = \emptyset$ (no repulsion point), and the attraction of the new facility to user $j$ is defined as follows. There exist already in the area several facilities, and the new facility is meant to compete with these existing facilities. If $\rho_j > 0$ ($\bar{\rho}_j > \rho_j$, resp.) is the distance from user $j$ to the nearest (farthest, resp.) existing facility, then the attraction of the new facility to user $j$ is a function $q_j(t)$ of the distance $t$ from the new facility to user $j$ such that
\[
q_j(t) = \begin{cases}
\alpha_j - w_j t & 0 \le t \le \rho_j \\
(\alpha_j - w_j\rho_j)\left(1 - \dfrac{t - \rho_j}{\bar{\rho}_j - \rho_j}\right) & \rho_j \le t \le \bar{\rho}_j \\
0 & t \ge \bar{\rho}_j
\end{cases}
\]
(for user $j$ the new facility is attractive as long as it is closer than any existing facility, but this attraction quickly decreases when the new facility is farther than some of the existing facilities and becomes zero when it is farther than the farthest existing facility). Denoting by $h_j(x)$ the distance from the unknown location $x$ of the new facility to user $j$, show that to maximize the total attraction one has again to solve a dc optimization problem.

5 In Example 5.3 on engineering design prove that $r(x) = \max\{r \mid B(x,r) \subset M\} = \inf\{\|x - y\| : y \in \partial M\}$ (distance from $x$ to $\partial M$). Derive that $r(x)$ is a dc function (cf. Exercise 1, Chap. 4), and hence that finding the maximal ball contained in a given compact set $M \subset \mathbb{R}^n$ reduces to maximizing a dc function under a dc constraint.

6 Consider a system of disjunctive convex inequalities $x \in \bigcup_{i=1}^n C_i$, where $C_i = \{x : g_i(x) \le 0\}$. Show that this system is equivalent to a single dc inequality $p(x) - q(x) \le 0$ and find the two convex functions $p(x), q(x)$.

7 In Example 5.4 introduce the new variables $q_{il}$ such that $x_{il} = q_{il}\sum_j y_{lj}$ and transform the pooling problem into a problem in $q_{il}, y_{lj}, z_{ij}$ (eliminate the variables $x_{il}$).
8 (Optimal design of a water distribution network) Assuming that a water distribution network has a single source, a single demand pattern, and a new pumping facility at the source node, the problem is to determine the pump pressure $H$, the flow rates $q_i$, and the head losses $J_i$ along the arcs $i = 1,\dots,s$, so as to minimize the cost (see, e.g., Fujiwara and Khang 1990):
\[
f(q,H,J) := b_1\sum_i L_i\,(KL_i/J_i)^{\gamma/\beta}(q_i/c)^{\alpha\gamma/\beta} + b_2|a_1|^{e_1}H^{e_2} + b_3|a_1|H
\]
under the constraints
\[
\begin{array}{ll}
\sum_{i \in \mathrm{in}(k)} q_i - \sum_{i \in \mathrm{out}(k)} q_i = a_k, & k = 1,\dots,n \quad \text{(flow balance)} \\[2pt]
\sum_{i \in \mathrm{loop}\,p} J_i = 0, & p = 1,\dots,m \quad \text{(head loss balance)} \\[2pt]
\sum_{i \in r(k)} J_i \le H + h_1 - h_k^{\min}, & k = 1,\dots,n \quad \text{(hydraulic requirement)} \\[2pt]
q_i^{\max} \ge q_i \ge q_i^{\min} \ge 0, & i = 1,\dots,s \quad \text{(bounds)} \\[2pt]
H^{\max} \ge H \ge H^{\min} \ge 0 \\[2pt]
KL_i(q_i/c)^{\alpha}/d_{\max}^{\beta} \le J_i \le KL_i(q_i/c)^{\alpha}/d_{\min}^{\beta}, & i = 1,\dots,s \quad \text{(Hazen–Williams equations)}
\end{array}
\]
($L_i$ is the length of arc $i$; $K$ and $c$ are the Hazen–Williams coefficients; $a_1$ is the supply water rate at the source; and $b_1, b_2, b_3, e_1, e_2, \gamma$ are constants such that $0 < e_1, e_2 < 1$ and $1 < \gamma < 2.5$; $\alpha = 1.85$ and $\beta = 4.87$). By the change of variables $p_i = \log q_i,\ \tau_i = \log J_i$, transform the first term of the sum defining $f(q,H,J)$ into a convex function of $(p,\tau)$ and reduce the problem to a dc optimization problem.
9 Solve the problem in Example 5.5.

10 Solve the problem
\[
\begin{array}{l}
\min\; 3x_1 + 4x_2 \quad \text{subject to} \\[2pt]
y_{11}x_1 + 2x_2 + x_3 = 5, \qquad y_{12}x_1 + x_2 + x_4 = 3, \\[2pt]
x_1, x_2 \ge 0, \qquad y_{11} \ge y_{12} \ge 1, \qquad 0 \le y_{11} \le 2, \quad 0 \le y_{12} \le 2.
\end{array}
\]
Chapter 6
General Methods

Methods for solving global optimization problems combine in different ways some
basic techniques: partitioning, bounding, cutting, and approximation by relaxation
or restriction. In this chapter we will discuss two general methods: branch and bound
(BB) and outer approximation (OA).
A basic operation involved in a BB procedure is partitioning, more precisely,
successive partitioning.

6.1 Successive Partitioning

Partitioning is a technique consisting in generating a process of subdivision of an initial set (which is either a simplex, a cone, or a rectangle) into several subsets of the same type (simplices, cones, or rectangles, respectively) and reiterating this operation for each of the partition sets as many times as needed. There are three basic types of partitioning: simplicial, conical, and rectangular.

6.1.1 Simplicial Subdivision

Consider a $(p-1)$-simplex $S = [u^1,\dots,u^p] \subset \mathbb{R}^n$ and an arbitrary point $v \in S$, so that
\[
v = \sum_{i=1}^p \lambda_i u^i, \qquad \sum_{i=1}^p \lambda_i = 1, \quad \lambda_i \ge 0\ (i = 1,\dots,p).
\]
Let $I = \{i \mid \lambda_i > 0\}$. It is a simple matter to check that for every $i \in I$ the points $u^1,\dots,u^{i-1},v,u^{i+1},\dots,u^p$ are vertices of a simplex $S_i \subset S$ and that:


Fig. 6.1 Simplicial and conical subdivision

\[
(\mathrm{ri}\,S_i) \cap (\mathrm{ri}\,S_j) = \emptyset \quad \forall j \ne i; \qquad \bigcup_{i \in I} S_i = S.
\]
We say that the simplices $S_i,\ i \in I$, form a radial partition (subdivision) of the simplex $S$ via $v$ (Fig. 6.1). Each $S_i$ will be referred to as a partition member or a child of $S$. Clearly, the partition is proper, i.e., consists of at least two members, if and only if $v$ does not coincide with any $u^i$. An important special case is when $v$ is a point of a longest edge of the simplex $S$, i.e., $v \in [u^k, u^h]$, where
\[
\|u^k - u^h\| = \max\{\|u^i - u^j\| \mid i < j;\ i,j = 1,\dots,p\}
\]
($\|\cdot\|$ denotes any given norm in $\mathbb{R}^n$, for example, $\|x\| = \max\{|x_1|,\dots,|x_n|\}$). If $v = \lambda u^k + (1-\lambda)u^h$ with $0 < \lambda \le 1/2$ then the partition is called a bisection of ratio $\lambda$ ($S$ is divided into two subsimplices such that the ratio of the volume of the smaller subsimplex to that of $S$ equals $\lambda$). When $\lambda = 1/2$, the bisection is called exact.
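The following sketch (our illustrative code, with vertices stored as rows of a matrix; not code from the book) implements the radial partition via $v$ and the bisection of ratio $\lambda$ just described.

```python
# Radial subdivision and longest-edge bisection of a simplex.
import numpy as np

def radial_partition(V, v):
    """Split the simplex with vertex rows V (p x n) via the point v:
    for each vertex with positive barycentric weight, replace it by v."""
    lam = np.linalg.lstsq(np.vstack([V.T, np.ones(len(V))]),
                          np.append(v, 1.0), rcond=None)[0]   # barycentric coords
    return [np.vstack([V[:i], v, V[i+1:]]) for i in range(len(V)) if lam[i] > 1e-12]

def bisect(V, ratio=0.5):
    """Bisection of the given ratio via a point on a longest edge."""
    p = len(V)
    k, h = max(((i, j) for i in range(p) for j in range(i + 1, p)),
               key=lambda e: np.linalg.norm(V[e[0]] - V[e[1]]))
    return radial_partition(V, ratio * V[k] + (1 - ratio) * V[h])

S = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print([child.tolist() for child in bisect(S)])   # two children sharing the new vertex
```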
Consider now an infinite sequence of nested simplices $S_1 \supset S_2 \supset \dots \supset S_k \supset \dots$ such that $S_{k+1}$ is a child of $S_k$ in a partition via some $v^k \in S_k$. Such a sequence is called a filter (of simplices), and the simplex $S_\infty = \bigcap_{k=1}^\infty S_k$ is the limit of the filter. Let $\delta(S)$ denote the diameter of $S$, i.e., the length of its longest edge. The filter is said to be exhaustive if $\bigcap_{k=1}^\infty S_k$ is a singleton, i.e., if $\lim_{k \to \infty}\delta(S_k) = 0$ (the filter shrinks to a single point). An important question about a filter concerns the conditions under which it is exhaustive.

Lemma 6.1 (Basic Exhaustiveness Lemma) Let $S_k = [u^{k1},\dots,u^{kn}],\ k = 1,2,\dots$, be a filter of simplices such that $S_{k+1}$ is a child of $S_k$ in a subdivision via $v^k \in S_k$. Assume that:
(i) For infinitely many $k$ the subdivision of $S_k$ is a bisection;
(ii) There exists a constant $\eta \in (0,1)$ such that for every $k$:
\[
\max\{\|v^k - u^{ki}\| : i = 1,\dots,n\} \le \eta\,\delta(S_k).
\]
Then the filter is exhaustive.

Proof Denote $\delta(S_k) = \delta_k$. Since $\delta_{k+1} \le \delta_k$, it follows that $\delta_k$ tends to some limit $\delta$ as $k \to \infty$. Arguing by contradiction, suppose $\delta > 0$, so there exists $t$ such that $\eta\delta_k < \delta$ for all $k \ge t$. By assumption (ii) we can write
\[
\max\{\|v^k - u^{ki}\| : i = 1,\dots,n\} \le \eta\delta_k < \delta \le \delta_k \quad \forall k \ge t. \tag{6.1}
\]
Let us color every vertex of $S_t$ black, and for any $k > t$ let us color white every vertex of $S_k$ which is not black. Clearly, every white vertex of $S_k$ must be a point $v^h$ for some $h \in \{t,\dots,k-1\}$; hence, by (6.1), $\max\{\|v^h - u^{hi}\| : i = 1,\dots,n\} \le \eta\delta_h < \delta \le \delta_k$. So any edge of $S_k$ incident to a white vertex must have a length less than $\delta_k$ and cannot be a longest edge of $S_k$. In other words, for $k \ge t$ a longest edge of $S_k$ must have two black endpoints. Now denote $K = \{k \ge t \mid S_k\ \text{is bisected}\}$. Then for $k \in K$, $v^k$ belongs to a longest edge of $S_k$, hence, according to what has just been proved, to an edge of $S_k$ joining two black vertices. Since $v^k$ becomes a white vertex of $S_{k+1}$, it follows that for $k \in K$, $S_{k+1}$ has one white vertex more than $S_k$. On the other hand, in the whole subdivision process the number of white vertices never decreases. That means, after each bisection the number of white vertices increases at least by 1. Therefore, after a certain number of bisections, we come up with a simplex $S_k$ with only white vertices. Then, according to (6.1), every edge of $S_k$ has length less than $\delta$, conflicting with $\delta \le \delta_k = \delta(S_k)$. Therefore, $\delta = 0$, as was to be proved. $\square$
It should be emphasized that condition (ii) is essential for the exhaustiveness of the filter (see Exercise 1 of this chapter). This condition implies in particular that the bisections mentioned in (i) are bisections of ratio no less than $1 - \eta > 0$. A filter of simplices satisfying (i) but not necessarily (ii) enjoys the following weaker property of semi-exhaustiveness which, however, may sometimes be more useful.
Theorem 6.1 (Basic Simplicial Subdivision Theorem) Let $S^k = [u^{k1},\dots,u^{kn}]$, $k = 1,2,\dots$, be a filter of simplices such that $S^{k+1}$ is a child of $S^k$ in a subdivision via a point $v^k \in S^k$. Then
(i) Each sequence $\{u^{ki},\ k = 1,2,\dots\}$ has an accumulation point $u^i$ such that $S_\infty := \bigcap_{k=1}^\infty S^k$ is the convex hull of $u^1,\dots,u^n$;
(ii) The sequence $\{v^k,\ k = 1,2,\dots\}$ has an accumulation point $\bar{v} \in \{u^1,\dots,u^n\}$;
(iii) If for infinitely many $k$ the subdivision of $S^k$ is a bisection of ratio no less than $\lambda > 0$ then an accumulation point of the sequence $\{v^k,\ k = 1,2,\dots\}$ is a vertex of $S_\infty = [u^1,\dots,u^n]$.

Proof (i) Since $S^1$ is compact and $\{u^{ki},\ k = 1,2,\dots\} \subset S^1$, there exists a subsequence $\{k_s\} \subset \{1,2,\dots\}$ such that for every $i = 1,\dots,n$, $u^{k_s i} \to u^i$ as $s \to +\infty$. If $x \in [u^1,\dots,u^n]$ then there exist $\lambda_i \ge 0$ with $\sum_{i=1}^n \lambda_i = 1$ such that $x = \sum_{i=1}^n \lambda_i u^i = \lim_{s \to +\infty}x^s$ for $x^s = \sum_{i=1}^n \lambda_i u^{k_s i} \in S^{k_s}$; since the sets $S^k$ are closed and nested, it follows that $x \in \bigcap_{k=1}^\infty S^k$. So $[u^1,\dots,u^n] \subset \bigcap_{k=1}^\infty S^k$. Conversely, if $x \in \bigcap_{k=1}^\infty S^k$ then for any $s$, $x \in S^{k_s}$, so $x = \sum_{i=1}^n \lambda_i^s u^{k_s i}$ with $\lambda_i^s \ge 0$ satisfying $\sum_{i=1}^n \lambda_i^s = 1$; passing to a further subsequence with $\lambda_i^s \to \lambda_i$ yields $x = \sum_{i=1}^n \lambda_i u^i \in [u^1,\dots,u^n]$. Therefore, $S_\infty = \bigcap_{k=1}^\infty S^k = [u^1,\dots,u^n]$.

(ii) $S^k$ is a child of $S^{k-1}$ in a subdivision via $v^{k-1}$, so a vertex of $S^k$, say $u^{k i_k}$, must be $v^{k-1}$. Since $i_k \in \{1,\dots,n\}$ there exists a subsequence $\{k_s\} \subset \{1,2,\dots\}$ such that $i_{k_s}$ is the same for all $s$, for instance, $i_{k_s} = 1\ \forall s$. Therefore $v^{k_s - 1} = u^{k_s 1}\ \forall s$. Since $u^1$ is an accumulation point of the sequence $\{u^{k1}\}$ we can assume that $u^{k_s 1} \to u^1$ as $s \to +\infty$. Hence, $v^{k_s - 1} \to u^1$ as $s \to +\infty$.
(iii) This is obviously true if $\delta(S_\infty) = 0$, i.e., if $S_\infty$ is a singleton. Consider the case $\delta(S_\infty) > 0$, which by Lemma 6.1 can occur only if condition (ii) of this Lemma does not hold, i.e., if for every $s = 1,2,\dots$ there is a $k_s$ such that
\[
\max\{\|v^{k_s} - u^{k_s i}\| \mid i = 1,\dots,n\} \ge \eta_s\,\delta(S^{k_s}),
\]
where $\eta_s \to 1\ (s \to +\infty)$. Since $S^1$ is compact, by taking a subsequence if necessary we may assume that $v^{k_s} \to \bar{v}$, $u^{k_s i} \to u^i\ (i = 1,\dots,n)$. Then letting $s \to \infty$ in the previous inequality yields $\max_{i=1,\dots,n}\|\bar{v} - u^i\| = \delta(S_\infty)$. Since for each $i$ the function $x \mapsto \|x - u^i\|$ is strictly convex, so is the function $x \mapsto \max_{i=1,\dots,n}\|x - u^i\|$. Therefore, a maximizer of the latter function over the polytope $\mathrm{conv}\{u^1,\dots,u^n\}$ must be a vertex of it: $\bar{v} \in \mathrm{vert}\,S_\infty$. $\square$
Incidentally, the proof has shown that one can have $\delta(S_\infty) > 0$ only if infinitely many subdivisions are not bisections of ratio $\ge \lambda$. Therefore:

Corollary 6.1 A filter of simplices $S_k,\ k = 1,2,\dots$, such that every $S_{k+1}$ is a child of $S_k$ via a bisection of ratio no less than $\lambda > 0$ is exhaustive. $\square$

A finite collection $\mathcal{N}$ of simplices covering a set $\Omega$ is called a (simplicial) net for $\Omega$. A net for $\Omega$ is said to be a refinement of another net if it is obtained by subdividing one or more members of the latter and replacing them with their partitions. A (simplicial) subdivision process for $\Omega$ is a sequence of (simplicial) nets $\mathcal{N}_1, \mathcal{N}_2, \dots$, each of which, except the first one, is a refinement of its predecessor.

Lemma 6.2 An infinite subdivision process $\mathcal{N}_1, \mathcal{N}_2, \dots$ generates at least one filter.

Proof Since $\mathcal{N}_1$ has finitely many members, at least one of these, say $S_{k_1}$, has infinitely many descendants. Since $S_{k_1}$ has finitely many children, one of these, say $S_{k_2}$, has infinitely many descendants. Continuing this way we shall generate an infinite nested sequence which is a filter. $\square$

A subdivision process is said to be exhaustive if every filter $\{S_k,\ k = 1,2,\dots\}$ generated by the process is exhaustive. From Corollary 6.1 it follows that a subdivision process consisting only of bisections of ratio at least $\lambda$ is exhaustive.

6.1.2 Conical Subdivision

Turning to conical subdivision, consider a fixed cone $M_0 \subset \mathbb{R}^n$ vertexed at $x^0$ (throughout this chapter, by a cone in $\mathbb{R}^n$ we always mean a polyhedral cone having exactly $n$ edges). To simplify the notation assume $x^0 = 0$. Let $H$ be a fixed hyperplane meeting each edge of $M_0$ at a point distinct from 0. For any cone $M \subset M_0$ the set $H \cap M$ is an $(n-1)$-simplex $S = [u^1,\dots,u^n]$ not containing 0. We shall refer to this simplex $S$ as the base of the cone $M$. Then any subdivision of the simplex $S$ via a point $v \in S$ induces in an obvious way a subdivision (splitting) of the cone $M$ into subcones, via the ray through $v$ (Fig. 6.1). A subdivision of $M$ is called a bisection (of ratio $\lambda$) if it is induced by a bisection (of ratio $\lambda$) of $S$. A filter of cones, i.e., an infinite sequence of nested cones each of which is a child of its predecessor by a subdivision, is said to be exhaustive if it is induced by an exhaustive filter of simplices, i.e., if their intersection is a ray.

Theorem 6.2 (Basic Conical Subdivision Theorem) Let $C$ be a compact convex set in $\mathbb{R}^n$ containing 0 in its interior, and let $\{M_k,\ k = 1,2,\dots\}$ be a filter of cones vertexed at 0 and having each exactly $n$ edges. For each $k$ let $z^{k1},\dots,z^{kn}$ be the intersection points of the $n$ edges of $M_k$ with $\partial C$ (the boundary of $C$). If for each $k$, $M_{k+1}$ is a child of $M_k$ in a subdivision via the ray through a point $q^k \in [z^{k1},\dots,z^{kn}]$ then one accumulation point of the sequence $\{q^k,\ k = 1,2,\dots\}$ belongs to $\partial C$.

Proof Let $S_k = [u^{k1},\dots,u^{kn}]$ be the base of $M_k$. Then $S_{k+1}$ is a child of $S_k$ in a subdivision via a point $v^k \in S_k$, with $v^k$ being the point where $S_k$ meets the halfline from 0 through $q^k$. For every $i = 1,\dots,n$ let $u^i$ be an accumulation point of the sequence $\{u^{ki},\ k = 1,2,\dots\}$. By Theorem 6.1(ii), there exists $u^i$ such that $v^{k_s} \to u^i$ for some subsequence $\{k_s\} \subset \{1,2,\dots\}$. Therefore, $q^{k_s} \to z^i$, where $z^i = \lim_{s \to \infty} z^{k_s i} \in \partial C$ (Fig. 6.2). $\square$

Fig. 6.2 Basic conical subdivision theorem



6.1.3 Rectangular Subdivision

A rectangle in $\mathbb{R}^n$ is a set of the form $M = [r,s] = \prod_{i=1}^n [r_i, s_i] = \{x \in \mathbb{R}^n \mid r_i \le x_i \le s_i\ (i = 1,\dots,n)\}$. There are several ways to subdivide a rectangle $M = [r,s]$ into subrectangles. However, in global optimization procedures we will mainly consider partitions through a hyperplane parallel to one facet of the given rectangle. Specifically, given a rectangle $M$, a partition of $M$ is defined by giving a point $v \in M$ together with an index $j \in \{1,\dots,n\}$: the partition sets are then the two subrectangles $M_-$ and $M_+$ determined by the hyperplane $x_j = v_j$, namely:
\[
M_- = \{x \mid r_j \le x_j \le v_j,\ r_i \le x_i \le s_i\ (i \ne j)\},
\]
\[
M_+ = \{x \mid v_j \le x_j \le s_j,\ r_i \le x_i \le s_i\ (i \ne j)\}.
\]
We will refer to this partition as a partition via $(v,j)$ (Fig. 6.3). A nice property of this partition method is expressed in the following:
Lemma 6.3 Let $\{M_k := [r^k, s^k],\ k \in K\}$ be a filter of rectangles such that $M_k$ is partitioned via $(v^k, j_k)$. Then there exists a subsequence $K_1 \subset K$ such that $j_k = j_0\ \forall k \in K_1$ and, as $k \to +\infty,\ k \in K_1$,
\[
r^k_{j_0} \to \bar{r}_{j_0}, \qquad s^k_{j_0} \to \bar{s}_{j_0}, \qquad v^k_{j_0} \to \bar{v}_{j_0} \in \{\bar{r}_{j_0}, \bar{s}_{j_0}\}. \tag{6.2}
\]

Proof Without loss of generality we can assume $K = \{1,2,\dots\}$. Since $j_k \in \{1,\dots,n\}$ there exists a $j_0$ such that $j_k = j_0$ for all $k$ forming an infinite subsequence $K_1 \subset K$. Because of boundedness we may assume $v^k_{j_0} \to \bar{v}_{j_0}\ (k \to +\infty,\ k \in K_1)$. Furthermore, since $r^k_{j_0} \le r^{k+1}_{j_0} \le r^h_{j_0} \le s^h_{j_0} \le s^{k+1}_{j_0} \le s^k_{j_0}$ for all $h > k$, we have $r^k_{j_0} \to \bar{r}_{j_0}$, $r^{k+1}_{j_0} \to \bar{r}_{j_0}$, and $s^k_{j_0} \to \bar{s}_{j_0}$, $s^{k+1}_{j_0} \to \bar{s}_{j_0}\ (k \to +\infty)$. But $v^k_{j_0} \in \{r^{k+1}_{j_0}, s^{k+1}_{j_0}\}$, hence, by passing to the limit, $\bar{v}_{j_0} \in \{\bar{r}_{j_0}, \bar{s}_{j_0}\}$. $\square$

Fig. 6.3 Rectangular subdivision
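The splitting operation itself is a one-liner per child, as the following sketch (our illustration, not code from the book) shows.

```python
# Rectangular partition via (v, j): split the box [r, s] by the hyperplane x_j = v_j.
def split_box(r, s, v, j):
    """r, s, v: lists of equal length; returns the two children M- and M+ of [r, s]."""
    s_minus = s[:]; s_minus[j] = v[j]       # M-: x_j <= v_j
    r_plus = r[:];  r_plus[j] = v[j]        # M+: x_j >= v_j
    return (r, s_minus), (r_plus, s)

print(split_box([0, 0], [4, 2], [1, 1], 0))
# (([0, 0], [1, 2]), ([1, 0], [4, 2]))
```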



Theorem 6.3 (Basic Rectangular Subdivision Theorem) Let $\{M_k = [r^k, s^k],\ k \in K\}$ be a filter of rectangles as in Lemma 6.3, where
\[
j_k \in \mathrm{argmax}\{\delta_{kj} \mid j = 1,\dots,n\}, \qquad \delta_{kj} = \min\{v^k_j - r^k_j,\ s^k_j - v^k_j\}. \tag{6.3}
\]
Then at least one accumulation point of the sequence of subdivision points $\{v^k\}$ coincides with a corner of the limit rectangle $M_\infty = \bigcap_{k=1}^{+\infty} M_k$.

Proof Let $K_1$ be the subsequence mentioned in Lemma 6.3. One can assume that $r^k \to \bar{r},\ s^k \to \bar{s},\ v^k \to \bar{v}$ as $k \to +\infty,\ k \in K_1$. From (6.2) and (6.3) it follows that for all $j = 1,\dots,n$:
\[
\delta_{kj} \le \delta_{kj_0} \to 0 \quad (k \to +\infty,\ k \in K_1),
\]
hence $\bar{v}_j \in \{\bar{r}_j, \bar{s}_j\}\ \forall j$, which means that $\bar{v}$ is a corner of $M_\infty = [\bar{r}, \bar{s}]$. $\square$
The subdivision of a rectangle $M = [r,s]$ via $(v,j)$ is called a bisection of ratio $\lambda$ if the index $j$ corresponds to a longest side of $M$ and $v$ is a point of this side such that $\delta_j = \min(v_j - r_j,\ s_j - v_j) = \lambda(s_j - r_j)$.

Corollary 6.2 Let $\{M_k := [r^k, s^k]\}$ be any filter of rectangles such that $M_{k+1}$ is a child of $M_k$ in a bisection of ratio no less than a positive constant $\lambda > 0$. Then the filter is exhaustive, i.e., the diameter $\delta(M_k)$ of $M_k$ tends to zero as $k \to \infty$.

Proof Indeed, by the previous theorem, one accumulation point $\bar{v}$ of the sequence of subdivision points $v^k$ must be a corner of the limit rectangle. This can occur only if $\delta(M_k) \to 0$. $\square$
Theorem 6.4 (Adaptive Rectangular Subdivision Theorem) For each $k = 1,2,\dots$, let $u^k, v^k$ be two points in the rectangle $M_k = [r^k, s^k]$ and let $w^k = \frac{1}{2}(u^k + v^k)$. Define $j_k$ by
\[
|u^k_{j_k} - v^k_{j_k}| = \max_{i=1,\dots,n}|u^k_i - v^k_i|.
\]
If for every $k = 1,2,\dots$, $M_{k+1}$ is a child of $M_k$ in the subdivision via $(w^k, j_k)$ then there exists an infinite subsequence $K \subset \{1,2,\dots\}$ such that $\|u^k - v^k\| \to 0$ as $k \in K,\ k \to +\infty$.

Proof By Theorem 6.3 there exists an infinite subsequence $K \subset \{1,2,\dots\}$ such that $j_k = j_0\ \forall k \in K$ and $w^k_{j_0} \to \bar{w}_{j_0} \in \{\bar{r}_{j_0}, \bar{s}_{j_0}\}$ as $k \in K,\ k \to +\infty$. Since $w^k_{j_0} = \frac{1}{2}(u^k_{j_0} + v^k_{j_0})$, while $u^k_{j_0}, v^k_{j_0} \in [r^k_{j_0}, s^k_{j_0}]$ and the midpoint tends to an endpoint of the limit interval, it follows that $|u^k_{j_0} - v^k_{j_0}| \to 0$ as $k \in K,\ k \to +\infty$, i.e., $\max_{i=1,\dots,n}|u^k_i - v^k_i| \to 0$ as $k \in K,\ k \to +\infty$. Hence, $\|v^k - u^k\| \to 0$ as $k \in K,\ k \to \infty$. $\square$

The term adaptive comes from the adaptive role of this subdivision rule in the context of BB algorithms for solving problems with nice constraints (see Sect. 6.2 below). Its aim is to drive the difference $\|u^k - v^k\|$ to zero as $k \to +\infty$. We refer to it as an adaptive subdivision via $(u^k, v^k)$.
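Combined with the box-splitting sketch given after Fig. 6.3, the adaptive rule reads as follows (again an illustrative sketch, reusing `split_box` from that earlier code).

```python
# Adaptive subdivision via (u, v): split at the midpoint w along a coordinate
# of largest discrepancy between u and v (Theorem 6.4).
def adaptive_split(r, s, u, v):
    w = [(ui + vi) / 2 for ui, vi in zip(u, v)]
    j = max(range(len(u)), key=lambda i: abs(u[i] - v[i]))
    return split_box(r, s, w, j)

print(adaptive_split([0, 0], [4, 2], [0.5, 0.2], [3.5, 0.4]))
# splits on coordinate 0 at w_0 = 2.0, the direction where u and v differ most
```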

6.2 Branch and Bound

Consider a problem
\[
(P) \qquad \min\{f(x) \mid x \in D\},
\]
where $f(x)$ is a real-valued continuous function on $\mathbb{R}^n$ and $D$ is a closed bounded subset of $\mathbb{R}^n$.
In general it is not possible to compute an exact global optimal solution of this problem by a finite procedure. Therefore, in many cases one must be content with a feasible solution whose objective function value is sufficiently near to the global optimal value. Given a tolerance $\varepsilon > 0$, a feasible solution $\bar{x}$ such that
\[
f(\bar{x}) - \min\{f(x) \mid x \in D\} \le \varepsilon
\]
is called a global $\varepsilon$-optimal solution of the problem.
A BB method for solving problem $(P)$ can be outlined as follows:
Start with a set (simplex or rectangle) $M_0 \supset D$ and perform a (simplicial or rectangular, resp.) subdivision of $M_0$, obtaining the partition sets $M_i,\ i \in I$. If a feasible solution is available, let $\bar{x}$ be the best feasible solution available and $\gamma = f(\bar{x})$. Otherwise, let $\gamma = +\infty$.
For each partition set $M_i$ compute a lower bound $\beta(M_i)$ for $f(M_i \cap D)$, i.e., a number such that $\beta(M_i) = +\infty$ if $M_i \cap D = \emptyset$, and $\beta(M_i) \le \inf f(M_i \cap D)$ otherwise. Update the current best feasible solution $\bar{x}$ and the current best feasible value $\gamma$.
Among all the current partition sets select a partition set $M^* \in \mathrm{argmin}_{i \in I}\,\beta(M_i)$. If $\gamma - \beta(M^*) \le \varepsilon$, stop: $\bar{x}$ is a global $\varepsilon$-optimal solution. Otherwise, further subdivide $M^*$ and add the new partition sets to the old ones to form a new refined partitioning of $M_0$.
For each new partition set $M$ compute a lower bound $\beta(M)$. Update $\bar{x}, \gamma$ and repeat the process.
More precisely we can state the following

6.2.1 Prototype BB Algorithm

Select a (simplicial or rectangular) subdivision rule and a tolerance $\varepsilon \ge 0$.
Initialization. Start with an initial set (simplex or rectangle) $M_0$ containing the feasible set $D$. If any feasible solution is readily available, let $x^0$ be the best among such solutions and $\gamma_0 = f(x^0)$. Otherwise, set $\gamma_0 = +\infty$. Set $\mathcal{P}_0 = \{M_0\}$, $\mathcal{S}_0 = \{M_0\}$, $k = 0$.
Step 1. (Bounding) For each set $M \in \mathcal{P}_k$ compute a lower bound $\beta(M)$ for $f(M \cap D)$, i.e., a number such that
\[
\beta(M) = +\infty \ \text{if } M \cap D = \emptyset; \qquad \beta(M) \le \inf f(M \cap D) \ \text{otherwise.} \tag{6.4}
\]
If some feasible solutions have been known thus far, let $x^k$ be the best among all of them and set $\gamma_k = f(x^k)$. Otherwise, set $\gamma_k = +\infty$.
Step 2. (Pruning) Delete every $M \in \mathcal{S}_k$ for which $\beta(M) \ge \gamma_k - \varepsilon$ and let $\mathcal{R}_k$ be the collection of remaining sets of $\mathcal{S}_k$.
Step 3. (Termination Criterion) If $\mathcal{R}_k = \emptyset$, terminate: $x^k$ is a global $\varepsilon$-optimal solution. Otherwise, select $M_k \in \mathrm{argmin}\{\beta(M) \mid M \in \mathcal{R}_k\}$.
Step 4. (Branching) Subdivide $M_k$ according to the chosen subdivision rule. Let $\mathcal{P}_k$ be the partition of $M_k$. Reset $\mathcal{S}_k \leftarrow (\mathcal{R}_k \setminus \{M_k\}) \cup \mathcal{P}_k$. If any new feasible solution is available, update the current best feasible solution $x^k$ and $\gamma_k = f(x^k)$. Set $k \leftarrow k+1$ and return to Step 1.
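To fix ideas, here is a compact runnable sketch of the prototype algorithm for a case where the whole box is feasible; the Lipschitz lower bound $\beta(M) = f(c_M) - L\,\mathrm{rad}(M)$ and the toy objective are our assumptions, chosen only so that (6.4) and the consistency condition below hold.

```python
# Prototype BB with a Lipschitz lower bound and longest-side bisection.
import heapq, math

def bb_minimize(f, r, s, L, eps=1e-3):
    center = lambda r, s: [(a + b) / 2 for a, b in zip(r, s)]
    radius = lambda r, s: math.sqrt(sum((b - a) ** 2 for a, b in zip(r, s))) / 2
    c = center(r, s)
    best_x, best_f = c, f(c)                        # incumbent x^k, gamma_k
    heap = [(best_f - L * radius(r, s), r, s)]      # boxes with their bounds beta(M)
    while heap:
        beta, r, s = heapq.heappop(heap)            # M_k = argmin beta (Step 3)
        if beta >= best_f - eps:                    # pruning/termination (Steps 2-3)
            break
        j = max(range(len(r)), key=lambda i: s[i] - r[i])
        m = (r[j] + s[j]) / 2                       # bisect the longest side (Step 4)
        for rc, sc in ((r, s[:j] + [m] + s[j+1:]), (r[:j] + [m] + r[j+1:], s)):
            c = center(rc, sc)
            if f(c) < best_f:                       # update the incumbent
                best_x, best_f = c, f(c)
            heapq.heappush(heap, (f(c) - L * radius(rc, sc), rc, sc))
    return best_x, best_f

# Toy instance: a one-dimensional multiextremal objective with |f'| <= 6.
x, fx = bb_minimize(lambda x: math.sin(5 * x[0]) + 0.1 * x[0] ** 2, [-3.0], [3.0], L=6.0)
print(x, fx)    # close to the global minimum near x = -0.31
```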
We say that bounding is consistent with branching (briefly, consistent) if
\[
\beta(M) - \min\{f(x) \mid x \in M \cap D\} \to 0 \quad \text{as } \mathrm{diam}\,M \to 0. \tag{6.5}
\]

Proposition 6.1 If the subdivision is exhaustive and bounding is consistent, then the BB algorithm is convergent, i.e., it can be infinite only when $\varepsilon = 0$, and then $\beta(M_k) \to v(P) := \min\{f(x) \mid x \in D\}$ as $k \to +\infty$.
Proof First observe that if the algorithm stops at Step 3 then $\gamma_k < \beta(M) + \varepsilon$ for every $M \in \mathcal{S}_k$, so $\gamma_k < +\infty$ and $\gamma_k = f(x^k)$. Since $x^k$ is feasible and satisfies $f(x^k) < \beta(M) + \varepsilon \le f(M \cap D) + \varepsilon$ for every $M \in \mathcal{S}_k$ [see (6.4)], $x^k$ is indeed a global $\varepsilon$-optimal solution. Suppose now that Step 3 never occurs, so that the algorithm is infinite. Since $\beta(M_k) < \gamma_k + \varepsilon$, it follows from condition (6.4) that $M_k \cap D \ne \emptyset$. So there exists $\tilde{x}^k \in M_k \cap D$. Let $z^k \in M_k \cap D$ be a minimizer of $f(x)$ over $M_k \cap D$ (the points $\tilde{x}^k$ and $z^k$ exist, but may not be known effectively). We now contend that $\mathrm{diam}(M_k) \to 0$ as $k \to +\infty$. In fact, for every partition set $M$ let $\nu(M)$ be the generation index of $M$, defined by induction as follows: $\nu(M_0) = 1$; whenever $M'$ is a child of $M$ then $\nu(M') = \nu(M) + 1$. It is easily seen that $\nu(M_k) \to +\infty$ as $k \to +\infty$. But $\nu(M_k) = m$ means that $M_k = \bigcap_{i=1}^m S_i$ where $S_1 = M_0$, $S_m = M_k$, and $S_{i+1}$ is a child of $S_i$. Hence, since the subdivision process is exhaustive, $\mathrm{diam}(M_k) \to 0$ as $k \to +\infty$. We then have $z^k - \tilde{x}^k \to 0$, so $z^k$ and $\tilde{x}^k$ tend to a common limit $x^* \in D$. But by (6.5), $f(z^k) - \beta(M_k) \to 0$, i.e., $\beta(M_k) \to f(x^*)$, hence, in view of (6.4), $f(x^*) \le v(P)$. Since $x^* \in D$ the latter implies $f(x^*) = v(P)$, i.e., $\beta(M_k) \to v(P)$. Therefore, for $k$ large enough, $\gamma_k = f(x^k) \le f(\tilde{x}^k) < \beta(M_k) + \varepsilon$, a contradiction. $\square$

As is apparent from the above proof, the condition $\beta(M) = +\infty$ if $M \cap D = \emptyset$ in (6.4) is essential in a BB algorithm: if it fails to hold, $M_k \cap D$ may be empty for some $k$, in which case the limit $x^* \in \bigcap_{h \ge k} M_h$ may be infeasible, hence $\beta(M_k) \not\to v(P)$, despite (6.5). In the general case, however, it may be very difficult to decide whether a set $M_k \cap D$ is empty or not. It may also happen that computing a feasible solution is as hard as solving the problem itself, and then no feasible solution is effectively available at each iteration. Under these conditions, when the algorithm stops (case $\varepsilon > 0$) it only provides an approximate value of $v(P)$, namely a value $\beta(M_k) < v(P) + \varepsilon$; when the algorithm is infinite (case $\varepsilon = 0$) it only provides a sequence of infeasible solutions converging to a global optimal solution.

6.2.2 Advantage of a Nice Feasible Set

All the above difficulties of the BB algorithm disappear when the problem has a nice feasible set $D$. By this we mean that for any simplex or rectangle $M$ a point $x \in M \cap D$ can be computed at cheap cost whenever it exists. More importantly, in most cases when the feasible set is nice, a rectangular adaptive subdivision process can be used which ensures a faster convergence than an exhaustive subdivision process does.
Specifically, suppose that for each rectangle $M_k$ two points $x^k, y^k$ in $M_k$ can be determined such that
\[
x^k \in M_k \cap D, \qquad y^k \in M_k, \tag{6.6}
\]
\[
f(y^k) - \beta(M_k) \to 0 \quad \text{as } \|x^k - y^k\| \to 0. \tag{6.7}
\]

Example 6.1 Let $\varphi_{M_k}(x)$ be an underestimator of $f(x)$ tight at $y^k \in M_k$ (i.e., a continuous function $\varphi_{M_k}(x)$ satisfying $\varphi_{M_k}(x) \le f(x)\ \forall x \in M_k \cap D$, $\varphi_{M_k}(y^k) = f(y^k)$), $\beta(M_k) = \min\{\varphi_{M_k}(x) \mid x \in M_k \cap D\}$, while $x^k \in \mathrm{argmin}\{\varphi_{M_k}(x) \mid x \in M_k \cap D\}$. (As is easily checked, $f(y^k) - \beta(M_k) = \varphi_{M_k}(y^k) - \varphi_{M_k}(x^k) \to 0$ as $x^k - y^k \to 0$.)

Example 6.2 Let $f(x) = f_1(x) - f_2(x)$, $x^k \in \mathrm{argmin}\{f_1(x) \mid x \in M_k \cap D\}$, $y^k \in \mathrm{argmax}\{f_2(x) \mid x \in M_k\}$, $\beta(M_k) = f_1(x^k) - f_2(y^k)$. (If $f_1(x)$ is continuous then $f(y^k) - \beta(M_k) = f_1(y^k) - f_2(y^k) - f_1(x^k) + f_2(y^k) = f_1(y^k) - f_1(x^k) \to 0$ as $x^k - y^k \to 0$.)

Under these conditions, the following adaptive bisection rule can be used for subdividing $M_k$: bisect $M_k$ via $(w^k, j_k)$, where $w^k = \frac{1}{2}(x^k + y^k)$ and $j_k$ satisfies
\[
|y^k_{j_k} - x^k_{j_k}| = \max_{j=1,\dots,n}|y^k_j - x^k_j|.
\]

Proposition 6.2 A rectangular BB algorithm using an adaptive bisection rule is convergent, i.e., it produces a global $\varepsilon$-optimal solution in finitely many steps.

Proof By Theorem 6.4 there exists a subsequence $\{k_\nu\}$ such that
\[
y^{k_\nu} - x^{k_\nu} \to 0 \quad (\nu \to +\infty), \tag{6.8}
\]
\[
\lim_{\nu \to +\infty} y^{k_\nu} = \lim_{\nu \to +\infty} x^{k_\nu} = x^* \in D. \tag{6.9}
\]
From (6.6) and (6.7) we deduce $\beta(M_{k_\nu}) \to f(x^*)$ as $\nu \to +\infty$. But, since $x^* \in D$, we have $\beta(M_{k_\nu}) \le v(P) \le f(x^*)$. Therefore, $\beta(M_{k_\nu}) \to v(P)\ (\nu \to +\infty)$, proving the convergence of the algorithm. $\square$

We shall refer to a BB algorithm using an adaptive (exhaustive, resp.) subdivision rule as an adaptive (exhaustive, resp.) BB algorithm.
To sum up, global optimization problems with all the nonconvexity concentrated in the objective function are generally easier to handle by BB algorithms than those with nonconvex constraints. This suggests that in dealing with nonconvex global optimization problems one should try to use valid transformations to shift, in one way or another, all the nonconvexity from the constraints to the objective function. Classical Lagrange multiplier or penalty methods, as well as decomposition methods, are various ways to exploit this idea. Most of these methods, however, work under restrictive conditions. In the next chapter we will present a more practical way to handle hard constraints in dc optimization problems.

6.3 Outer Approximation

One of the earliest methods of nonconvex optimization consists in approximating a given problem by a sequence of easier relaxed problems constructed in such a way that the sequence of solutions of these relaxed problems converges to a solution of the given problem. This approach, first introduced in convex programming in the late 1950s (Cheney and Goldstein 1959; Kelley 1960), was later extended, under the name of outer approximation, to concave minimization under convex constraints (Hoffman 1981; Thieu et al. 1983; Tuy 1983; Tuy and Horst 1988) and to more general nonconvex optimization problems (Mayne and Polak 1984; Tuy 1983, 1987a). Referring to the fact that cutting planes are used to eliminate unfit solutions of relaxed problems, this approach is sometimes also called a cutting plane method. It should be noted, however, that in an outer approximation procedure cuts are always conjunctive, i.e., the set that results from the cuts is always the intersection of all the cuts performed.
In this section we will present a flexible framework of outer approximation (OA) applicable to a wide variety of global optimization problems (Tuy 1995b), including dc optimization and monotonic optimization.
Consider the general problem of searching for an element of an unknown set $\Omega \subset \mathbb{R}^n$ (for instance, $\Omega$ is the set of optimal solutions of a given problem). Suppose there exist a closed set $G \supset \Omega$ and a family $\mathcal{P}$ of particular sets $P \supset G$ such that for each $P \in \mathcal{P}$ a point $x(P) \in P$ (called a distinguished point associated with $P$) can be defined satisfying the following conditions:
A1. $x(P)$ always exists and can be computed if $\Omega \ne \emptyset$, and whenever a sequence of distinguished points $x^1 = x(P_1), x^2 = x(P_2), \dots$ converges to a point $\bar{x} \in G$ then $\bar{x} \in \Omega$ (in particular, $x(P) \in G$ implies that $x(P) \in \Omega$).
A2. Given any distinguished point $z = x(P),\ P \in \mathcal{P}$, we can recognize when $z \in G$, and if $z \notin G$, we can construct an affine function $l(x)$ (called a cut) such that $P' = P \cap \{x \mid l(x) \le 0\} \in \mathcal{P}$ and $l(x)$ strictly separates $z$ from $G$, i.e., satisfies
\[
l(z) > 0, \qquad l(x) \le 0 \quad \forall x \in G.
\]


In traditional applications $\Omega$ is the set of optimal solutions of an optimization problem, $G$ the feasible set, $\mathcal{P}$ a family of polyhedrons enclosing $G$, and $x(P)$ an optimal solution of the relaxed problem obtained by replacing the feasible set with $P$. However, it should be emphasized that in the most general case $\Omega, G, \mathcal{P}, x(P)$ can be defined in quite different ways, provided the above requirements A1, A2 are fulfilled. For instance, in several interesting applications, the set $G$ may be defined only implicitly or may even be unknown, while $x(P)$ need not be a solution of any relaxed problem.

Under assumptions A1, A2 a natural method for finding a point of $\Omega$ is the following.
6.3.1 Prototype OA Procedure

Initialization. Start with a set $P_1 \in \mathcal{P}$. Set $k = 1$.

Step 1. Find the distinguished point $x^k = x(P_k)$ (by A1). If $x(P_k)$ does not exist then terminate: $\Omega = \emptyset$. If $x(P_k) \in G$ then terminate: $x(P_k) \in \Omega$. Otherwise, continue.

Step 2. Using A2 construct an affine function $l_k(x)$ such that $P_{k+1} = P_k \cap \{x \mid l_k(x) \le 0\} \in \mathcal{P}$ and $l_k(x)$ strictly separates $x^k$ from $G$, i.e., satisfies
$$l_k(x^k) > 0, \qquad l_k(x) \le 0 \quad \forall x \in G. \qquad (6.10)$$
Set $k \leftarrow k + 1$ and return to Step 1.
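To make the mechanics of this scheme concrete, here is a minimal Python sketch of the prototype OA loop. The callables `distinguished_point`, `in_G`, and `make_cut` are hypothetical placeholders for the problem-specific ingredients required by A1 and A2; a set $P_k$ is represented simply by the list of accumulated cuts.

```python
def prototype_oa(P1_cuts, distinguished_point, in_G, make_cut, max_iter=1000):
    """Generic outer approximation loop (conditions A1, A2 assumed).

    P1_cuts             -- initial list of affine cuts (p, alpha): p@x + alpha <= 0
    distinguished_point -- returns x(P) for the set given by the cuts, or None
    in_G                -- membership test for the closed set G
    make_cut            -- returns (p, alpha) with p@z + alpha > 0 and <= 0 on G
    """
    cuts = list(P1_cuts)
    for _ in range(max_iter):
        x = distinguished_point(cuts)        # Step 1 (condition A1)
        if x is None:
            return None                      # Omega is empty
        if in_G(x):
            return x                         # x belongs to Omega, by A1
        cuts.append(make_cut(x))             # Step 2 (condition A2): refine P_k
    raise RuntimeError("iteration limit reached")
```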
The above scheme generates a nested sequence of sets $P_k \in \mathcal{P}$:
$$P_1 \supset P_2 \supset \cdots \supset P_k \supset \cdots \supset G$$
approximating $G$ more and more closely from outside, hence the name given to the method (Fig. 6.4). The approximation is refined at each iteration, but only around the current distinguished point (considered the most promising at this stage).

An OA procedure as described above may be infinite (for instance, if $x(P)$ exists and $x(P) \notin G$ for every $P \in \mathcal{P}$). An infinite OA procedure is said to be convergent if it generates a sequence of distinguished points $x^k$, every accumulation point $\bar x$ of which belongs to $G$ [hence solves the problem, by A1]. In practice, the procedure is stopped when $x^k$ has become sufficiently close to $G$, which necessarily occurs for sufficiently large $k$ if the procedure is convergent.

The study of the convergence of an OA procedure relies on properties of the cuts $l_k(x) \le 0$ satisfying condition (6.10). Let
$$l_k(x) = \langle p^k, x\rangle + \alpha_k, \qquad \bar l_k(x) = \frac{l_k(x)}{\|p^k\|} = \langle \bar p^k, x\rangle + \bar\alpha_k.$$
Fig. 6.4 Outer approximation
Lemma 6.4 (Outer Approximation Lemma) If the sequence $\{x^k\}$ is bounded then
$$\bar l_k(x^k) \to 0 \quad (k \to +\infty).$$

Proof. Suppose the contrary, that $\bar l_{k_q}(x^{k_q}) \ge \delta > 0$ for some infinite subsequence $\{k_q\}$. Since $\{x^k\}$ is bounded we can assume $x^{k_q} \to \bar x$ $(q \to +\infty)$. Clearly, for every $k$,
$$\bar l_k(x^k) = \bar l_k(\bar x) + \langle \bar p^k, x^k - \bar x\rangle.$$
But for every fixed $k$, $\bar l_k(x^{k_q}) \le 0$ $\forall k_q > k$, and letting $q \to +\infty$ yields $\bar l_k(\bar x) \le 0$. Therefore, $0 < \delta \le \bar l_{k_q}(x^{k_q}) \le \langle \bar p^{k_q}, x^{k_q} - \bar x\rangle \to 0$, a contradiction. □
Theorem 6.5 (Basic Outer Approximation Theorem) Let $G$ be an arbitrary closed subset of $\mathbb{R}^n$ with nonempty interior, let $\{x^k\}$ be an infinite sequence in $\mathbb{R}^n$, and for each $k$ let $l_k(x) = \langle p^k, x\rangle + \alpha_k$ be an affine function satisfying (6.10). If the sequence $\{x^k\}$ is bounded and for every $k$ there exist $w^k \in G$ and $y^k \in [w^k, x^k] \setminus \mathrm{int}\,G$ such that $l_k(y^k) \ge 0$, $\{w^k\}$ is bounded and, furthermore, every accumulation point of $\{w^k\}$ belongs to the interior of $G$, then
$$x^k - y^k \to 0 \quad (k \to +\infty). \qquad (6.11)$$
In particular, if a sequence $\{y^k\}$ as above can be identified whose every accumulation point belongs to $G$, then the OA procedure converges.

Proof. By (6.10), $\langle \bar p^k, w^k\rangle \le -\bar\alpha_k < \langle \bar p^k, x^k\rangle$, so the sequence $\{\bar\alpha_k\}$ is bounded. We can then select an infinite subsequence $\{k_q\}$ such that $\bar\alpha_{k_q} \to \bar\alpha$, $x^{k_q} \to \bar x$, $y^{k_q} \to \bar y$, $\bar p^{k_q} \to \bar p$, $w^{k_q} \to \bar w$ ($\|\bar p\| = 1$ and $\bar w \in \mathrm{int}\,G$). So $\bar l_{k_q}(x) \to \bar l(x) := \langle \bar p, x\rangle + \bar\alpha$ $\forall x$. Since $\bar l(\bar x) = \lim \bar l_{k_q}(\bar x) = \lim[\bar l_{k_q}(x^{k_q}) + \langle \bar p^{k_q}, \bar x - x^{k_q}\rangle] = \lim \bar l_{k_q}(x^{k_q})$, by Lemma 6.4 we have $\bar l(\bar x) = 0$. On the other hand, from (6.10), $\bar l(x) \le 0$ $\forall x \in G$, and since $\bar w \in \mathrm{int}\,G$ and $\bar l(x) \not\equiv 0$, we must have $\bar l(\bar w) < 0$. From the hypothesis $l_k(y^k) \ge 0$ it follows that $\langle \bar p^k, y^k\rangle + \bar\alpha_k \ge 0$, hence $\bar l(\bar y) \ge 0$. But $\bar y = \lambda \bar w + (1 - \lambda)\bar x$ for some $\lambda \in [0, 1]$, hence $\bar l(\bar y) = \lambda \bar l(\bar w) + (1 - \lambda)\bar l(\bar x) = \lambda \bar l(\bar w)$, and since $\bar l(\bar y) \ge 0$, $\bar l(\bar w) < 0$, this implies that $\lambda = 0$. Therefore, $\bar y = \bar x$. □
An important special case is the following:

Corollary 6.3 Let $G = \{x \mid g(x) \le 0\}$, where $g: \mathbb{R}^n \to \mathbb{R}$ is a convex function, and let $\{x^k\} \subset \mathbb{R}^n \setminus G$ be a bounded sequence. If the affine functions $l_k(x)$ satisfying (6.10) are such that $l_k(x) = \langle p^k, x - y^k\rangle + \beta_k$,
$$p^k \in \partial g(y^k), \quad y^k \in [w^k, x^k] \setminus \mathrm{int}\,G, \quad 0 \le \beta_k \le g(y^k), \qquad (6.12)$$
where $\{w^k\}$ is a bounded sequence every accumulation point of which belongs to the interior of $G$, and $g(y^k) - \beta_k \to 0$ $(k \to +\infty)$, then every accumulation point $\bar x$ of $\{x^k\}$ belongs to $G$.

Proof. Since $\{x^k\}$ and $\{w^k\}$ are bounded, $\{y^k\}$ is bounded, too. Therefore $\{p^k\}$ is bounded (Theorem 2.6), and by Lemma 6.4, $l_k(x^k) \to 0$. Furthermore, by Theorem 6.5, $x^k - y^k \to 0$, hence $\beta_k = l_k(x^k) - \langle p^k, x^k - y^k\rangle \to 0$. But by hypothesis, $g(y^k) - \beta_k \to 0$; consequently, $g(y^k) \to 0$. If $x^{k_q} \to \bar x$ then by Theorem 6.5 $y^{k_q} \to \bar y = \bar x$, so $g(\bar x) = \lim g(y^{k_q}) = 0$, i.e., $\bar x \in G$. □

In a standard outer approximation procedure $w^k$ is fixed and $\beta_k = g(y^k)$ $\forall k$. But as will be illustrated in subsequent sections, by allowing more flexibility in the choice of $w^k$ and $\beta_k$ the above theorems can be used to establish convergence under much weaker assumptions than usual.
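To illustrate how cuts of the form (6.12) can be generated in the standard case, here is a minimal Python sketch, assuming a fixed interior point $w$ of $G = \{x \mid g(x) \le 0\}$ (so $g(w) < 0$) and a user-supplied subgradient oracle `subgrad`; the boundary point $y^k$ on $[w, x^k]$ is located by bisection.

```python
import numpy as np

def boundary_point(g, w, x, tol=1e-10):
    """Find y on the segment [w, x] with g(y) ~ 0 by bisection,
    assuming g(w) < 0 < g(x) and g continuous."""
    lo, hi = 0.0, 1.0                      # parameters along w + t*(x - w)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(w + mid * (x - w)) < 0:
            lo = mid
        else:
            hi = mid
    return w + hi * (x - w)

def oa_cut(g, subgrad, w, x):
    """Cut l(z) = <p, z - y> + g(y), p in the subdifferential of g at y,
    i.e., the case beta_k = g(y^k) of (6.12)."""
    y = boundary_point(g, w, x)
    p = subgrad(y)
    return p, y, g(y)                      # l(z) = p @ (z - y) + g(y)
```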
6.4 Exercises
1. Construct a sequence of nested simplices $M_k$ such that for every $k$ of an infinite sequence $\Delta \subset \{1, 2, \ldots\}$, $M_{k+1}$ is a child of $M_k$ by an exact bisection, and nevertheless the intersection of all $M_k$, $k = 1, 2, \ldots$, is not a singleton. Thus, in Lemma 6.1 (which is fundamental to the Basic Subdivision Theorem) condition (i) alone is not sufficient for the filter $\{M_k\}$ to be exhaustive.
2. Prove directly (i.e., without using Lemma 6.1) that if a sequence of nested simplices $M_k$, $k = 1, 2, \ldots$, is constructed such that $M_{k+1}$ is a child of $M_k$ in a bisection of ratio $\rho > 0$, then the intersection of all $M_k$ is a singleton. (Hint: observe that in a triangle $[a, b, c]$ where $[a, b]$ is the longest side, of length $\delta > 0$, and $v$ is a point of this side such that $\min\{\|v - a\|, \|v - b\|\} \ge \rho\delta$ with $0 < \rho \le 1/2$, there exists a constant $\theta \in (0, 1)$ such that $\|v - c\| \le \theta\delta$; in fact, one can take $\theta = \|w - s\|/\delta$, where $w$ is a point of $[a, b]$ such that $\|w - a\| = \rho\delta$ and $s$ is the third vertex of an equilateral triangle whose two other vertices are $a, b$.)
3. Show that Corollary 6.2 can be strengthened as follows: Let $\{M_k := [r^k, s^k]\}$ be any filter of rectangles in $\mathbb{R}^n$ such that for infinitely many $k$, $M_{k+1}$ is a child of $M_k$ in a bisection of ratio no less than a positive constant $\rho > 0$. Then the filter is exhaustive, i.e., $\mathrm{diam}\,M_k \to 0$ as $k \to +\infty$. (Hint: use Lemma 6.3; this result is in contrast with what happens for filters of simplices, see Exercise 1.)
4. Consider a sequence of nonempty polyhedrons $D_1 \supset \cdots \supset D_k \supset \cdots$ such that $D_{k+1} = D_k \cap \{x \mid l_k(x) \le 0\}$, where $l_k(x)$ is an affine function. Show that if there exists a number $\varepsilon > 0$ along with, for each $k$, a point $x^k \in D_k \setminus D_{k+1}$ such that the distance from $x^k$ to the hyperplane $H_k = \{x \mid l_k(x) = 0\}$ is at least $\varepsilon$, then the sequence $D_1, \ldots, D_k, \ldots$ must be finite.
5. Suppose in Step 1 of the Prototype BB Algorithm we take
$$\beta(M) \le \inf f(M \cap P),$$
where $P$ is a pregiven set containing $D$. Does the algorithm work with this modification?
Chapter 7
DC Optimization Problems

7.1 Concave Minimization Under Linear Constraints
One of the most frequently encountered problems of global optimization is Concave Minimization (or Concave Programming), which consists in minimizing a concave function $f: \mathbb{R}^n \to \mathbb{R}$ over a nonempty closed convex set $D \subset \mathbb{R}^n$. In this and the next chapters we shall focus on the Basic Concave Programming (BCP) Problem, which is the particular variant of the concave programming problem when all the constraints are linear, i.e., when $D$ is a polyhedron. Assuming $D = \{x \mid Ax \le b,\ x \ge 0\}$, where $A$ is an $m \times n$ matrix and $b$ an $m$-vector, the problem can be stated as
$$(\mathrm{BCP})\qquad \begin{aligned} &\text{minimize } f(x) && (7.1)\\ &\text{subject to } Ax \le b && (7.2)\\ &\hphantom{\text{subject to }} x \ge 0. && (7.3) \end{aligned}$$

This is the earliest problem of nonconvex global optimization studied by deterministic methods (Tuy 1964). Numerous decision models in operations research, mathematical economics, engineering design, etc., lead to formulations such as (BCP). Important special (BCP) models are plant location problems and also concave min-cost network flow problems, which have many applications. Other models which originally are not concave can be transformed into equivalent (BCP)s. On the other hand, techniques for solving (BCP) play a central role in global optimization, as will be demonstrated in subsequent chapters.

Recall that the concavity of $f(x)$ means that for any $x', x''$ in $\mathbb{R}^n$ we have
$$f((1 - \lambda)x' + \lambda x'') \ge (1 - \lambda)f(x') + \lambda f(x'') \quad \forall \lambda \in [0, 1].$$
Since $f(x)$ is finite throughout $\mathbb{R}^n$, it is continuous (Proposition 2.3). Later we shall discuss the case when $f(x)$ is defined only on $D$ and may even be discontinuous at certain boundary points of $D$.

Of course, all the difficulty of (BCP) is caused by the nonconvexity of $f(x)$. However, for the purpose of minimization, the concavity of $f(x)$ implies several useful properties listed below:

(i) For any real $\gamma$ the level set $\{x \mid f(x) \ge \gamma\}$ is convex (Proposition 2.11).
(ii) The minimum of $f(x)$ over any line segment is attained at one of the endpoints of the segment (Proposition 2.12).
(iii) If $f(x)$ is bounded below on a halfline, then its minimum over the halfline is attained at the origin of the halfline (Proposition 2.12).
(iv) A concave function is constant on any affine set $M$ where it is bounded below (Proposition 2.12).
(v) If $f(x)$ is unbounded below on some halfline then it is unbounded below on any parallel halfline (Corollary 2.4).
(vi) If the level set $\{x \mid f(x) \ge \gamma\}$ is nonempty and bounded for one $\gamma$, it will be bounded for all $\gamma$ (Corollary 2.3).

The first two properties, which are actually equivalent, characterize quasiconcavity. When $D$ is bounded, these turn out to be the most fundamental properties for the construction of solution methods for (BCP), so many results in this chapter still remain valid for quasiconcave minimization over polytopes.
Let $\gamma$ be a real number such that the set $C(\gamma) := \{x \mid f(x) \ge \gamma\}$ is nonempty. By quasiconcavity and continuity of $f(x)$ this set is closed and convex. To find a feasible point $x \in D$ such that $f(x) < \gamma$ if there is one (in other words, to transcend $\gamma$) we must solve the dc inclusion
$$x \in D \setminus C(\gamma). \qquad (7.4)$$
This inclusion depends upon the parameter $\gamma$. Since $C(\gamma) \supset C(\gamma')$ whenever $\gamma \le \gamma'$, the optimal value in (BCP) is the value
$$\inf\{\gamma \mid D \setminus C(\gamma) \ne \emptyset\}. \qquad (7.5)$$
A first property to be noted of this parametric dc inclusion is that the range of $\gamma$ can be restricted to a finite set of real numbers. This follows from Property (iii) listed above, or rather, from the following extension of it:

Proposition 7.1 Either the global minimum of $f(x)$ over $D$ is attained at one vertex of $D$, or $f(x)$ is unbounded below over some extreme direction of $D$.

Proof. Let $z$ be a point of $D$, chosen so that $z \in \mathrm{argmin}_{x \in D} f(x)$ if $f(x)$ is bounded below on $D$, and such that $f(z) < f(v)$ for every vertex $v$ of $D$ otherwise. From convex analysis (Theorem 2.6) we know that the concave function $f(x)$ has a subgradient at $z$, i.e., a vector $p$ such that $\langle p, x - z\rangle \ge f(x) - f(z)$ $\forall x \in \mathbb{R}^n$. Consider the linear program
$$\text{minimize } \langle p, x - z\rangle \text{ subject to } x \in D. \qquad (7.6)$$
Since the objective function vanishes at $x = z$, by solving this linear program we obtain either an extreme direction $u$ of $D$ such that $\langle p, u\rangle < 0$, or a basic optimal solution $v'$ such that $\langle p, v' - z\rangle \le 0$. In the first case, for any $x$ we have $f(x + \lambda u) - f(z) \le \langle p, x + \lambda u - z\rangle = \langle p, x - z\rangle + \lambda\langle p, u\rangle \to -\infty$ as $\lambda \to +\infty$, i.e., $f(x)$ is unbounded below over any halfline of direction $u$. In the second case, $v'$ is a vertex of $D$ such that $f(v') - f(z) \le \langle p, v' - z\rangle \le 0$, hence $f(v') \le f(z)$. By the choice of $z$ this implies that the second case cannot occur if $f(x)$ is unbounded below over $D$. Therefore, $f(x)$ has a minimum over $D$ and $z$ is a global minimizer. Since then $f(v') = f(z)$, the vertex $v'$ is also a global minimizer. □

By this proposition the search for an optimal solution can be restricted to the finite set of vertices and extreme directions of $D$, thus reducing (BCP) to a combinatorial optimization problem. Furthermore, if a vertex $v^0$ is such that $f(v^0) \le f(v)$ for all neighboring vertices $v$, then by virtue of the same property $v^0$ achieves the minimum of $f(x)$ over the polytope spanned by $v^0$ and all its neighbors, and hence is a local minimizer of $f(x)$ on $D$. Therefore, given any point $z \in D$ one can compute a vertex $\bar x$ which is a local minimizer such that $f(\bar x) \le f(z)$ by the following simple procedure:

Take a subgradient $p$ of $f(x)$ at $z$. Solve (7.6). If an extreme direction $u$ is found with $\langle p, u\rangle < 0$, then (BCP) is unbounded. Otherwise, a vertex $v^0$ is obtained with $f(v^0) \le f(z)$. Starting from $v^0$, pivot from a vertex to a better adjacent one, and repeat until a vertex $v^k$ is reached which is no worse than any of its neighbors. Then $\bar x = v^k$ is a local minimizer.
Thus, according to the two-phase scheme described in Chap. 5 (Sect. 5.7), to solve (BCP) within a tolerance $\varepsilon \ge 0$ we can proceed as follows:

Let $z \in D$ be an initial feasible solution.

Phase 1 (Local phase). Starting from $z$ check whether the optimal value is unbounded ($= -\infty$), and if it is not, search for a vertex $\bar x$ of $D$ such that $f(\bar x) \le f(z)$. Go to Phase 2.

Phase 2 (Global phase). Solve the inclusion (7.4) for $\gamma = f(\bar x) - \varepsilon$. If the inclusion has no solution, i.e., $D \setminus C(\gamma) = \emptyset$, then terminate: $\bar x$ is a global $\varepsilon$-optimal solution of (BCP). Otherwise, a feasible point $z'$ is obtained such that $f(z') < f(\bar x) - \varepsilon$. Then set $z \leftarrow z'$ and go back to Phase 1.

Since a polyhedron has finitely many vertices, this scheme terminates after finitely many cycles, and for sufficiently small $\varepsilon > 0$ a global $\varepsilon$-optimal solution is actually an exact global optimal solution. The key point, of course, is how to solve the dc inclusion (7.4).
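As a concrete illustration of Phase 1, the following Python sketch computes a vertex $\bar x$ with $f(\bar x) \le f(z)$ or detects unboundedness. It is a simplified variant that re-linearizes the concave objective at each new vertex instead of pivoting among adjacent vertices; `f` and `supergrad` (returning $p$ with $\langle p, y - x\rangle \ge f(y) - f(x)$) are assumed supplied by the user.

```python
import numpy as np
from scipy.optimize import linprog

def local_phase(f, supergrad, A, b, z, max_rounds=100):
    """Phase 1 sketch: minimize the linearization of the concave objective
    over D = {x : A x <= b, x >= 0}, move to the optimal vertex, repeat.
    Returns a local-minimizer candidate, or None if (BCP) is unbounded."""
    x = np.asarray(z, float)
    for _ in range(max_rounds):
        p = supergrad(x)
        res = linprog(p, A_ub=A, b_ub=b, bounds=[(0, None)] * len(x))
        if not res.success:                 # unbounded linearized LP
            return None
        v = res.x                           # generically a vertex of D
        if f(v) >= f(x) - 1e-12:            # no further improvement
            return x
        x = v
    return x
```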
7.1.1 Concavity Cuts

Most methods of global optimization combine in different ways some basic techniques: cutting, partitioning, approximating (by relaxation or restriction), and dualizing. In this chapter we will discuss successive partitioning methods, which combine cutting and partitioning techniques.
A combinatorial tool which has proven to be useful in the study of many nonconvex optimization problems is the concept of concavity cut, first introduced in Tuy (1964). This is a device based on the concavity of the objective function and used to exclude certain portions of the feasible set in order to avoid wasteful search on non-promising regions.

Without loss of generality, throughout the sequel we will assume that the feasible polyhedron $D$ in (BCP) is full-dimensional. Consider the inclusion (7.4) for a given value of $\gamma$ (for instance, $\gamma = f(\bar x) - \varepsilon$, where $\bar x$ is the incumbent feasible solution at a given stage). Let $x^0$ be a vertex of $D$ such that $f(x^0) > \gamma$. Suppose that the convex set $C(\gamma)$ is bounded and that a cone $M$ vertexed at $x^0$ is available that contains $D$ and has exactly $n$ edges. Since $x^0 \in \mathrm{int}\,C(\gamma)$, these edges meet the boundary of $C(\gamma)$ at $n$ uniquely determined points $z^1, \ldots, z^n$ which are affinely independent (because so are the directions of the $n$ edges of $M$). Let $\pi(x - x^0) = 1$ be the hyperplane passing through these $n$ points. From
$$\pi(z^i - x^0) = 1 \quad (i = 1, \ldots, n), \qquad (7.7)$$
we deduce
$$\pi = eQ^{-1}, \qquad (7.8)$$
where $Q = (z^1 - x^0, \ldots, z^n - x^0)$ is the nonsingular matrix with columns $z^i - x^0$, $i = 1, \ldots, n$, and $e = (1, \ldots, 1)$ is a row vector of $n$ ones.

Proposition 7.2 Any point $x \in D$ such that $f(x) < \gamma$ must lie in the halfspace
$$eQ^{-1}(x - x^0) > 1. \qquad (7.9)$$

Proof. Let $S = [x^0, z^1, \ldots, z^n]$ denote the simplex spanned by $x^0, z^1, \ldots, z^n$. Clearly, $S = M \cap \{x \mid eQ^{-1}(x - x^0) \le 1\}$. Since $x^0, z^1, \ldots, z^n \in C(\gamma)$, it follows that $S \subset C(\gamma)$, in other words, $f(x) \ge \gamma$ for all $x \in S$. Therefore, if $x \in D$ satisfies $f(x) < \gamma$ then $x \in M$ but $x \notin S$, and hence $x$ must satisfy (7.9). □

Corollary 7.1 The linear inequality
$$eQ^{-1}(x - x^0) \ge 1 \qquad (7.10)$$
excludes $x^0$ without excluding any point $x \in D$ such that $f(x) < \gamma$.

We call the inequality (7.10) (or the hyperplane $H$ of equation $eQ^{-1}(x - x^0) = 1$) a $\gamma$-valid cut for $(f, D)$, constructed at $x^0$ (Fig. 7.1).

Corollary 7.2 If $\gamma = f(\bar x) - \varepsilon$ for some $\bar x \in D$ and it so happens that
$$\max\{eQ^{-1}(x - x^0) \mid x \in D\} \le 1 \qquad (7.11)$$
then $f(x) \ge f(\bar x) - \varepsilon$ for all $x \in D$, i.e., $\bar x$ is a global $\varepsilon$-optimal solution of (BCP).

Thus, (7.11) provides a sufficient condition for global $\varepsilon$-optimality which can easily be checked by solving the linear program $\max\{eQ^{-1}(x - x^0) \mid x \in D\}$.
Fig. 7.1 $\gamma$-valid concavity cut
Remark 7.1 Denote by $u^i$, $i = 1, \ldots, n$, the directions of the edges of $M$ and by $U = (u^1, \ldots, u^n)$ the nonsingular matrix with columns $u^1, \ldots, u^n$, so that
$$M = \{x \mid x = x^0 + Ut,\ t = (t_1, \ldots, t_n)^T \ge 0\};$$
i.e., $x \in M$ if and only if $t = U^{-1}(x - x^0) \ge 0$. Let $z^i = x^0 + \theta_i u^i$, where $\theta_i > 0$ is defined from the condition $f(x^0 + \theta_i u^i) = \gamma$. Then $Q = U\,\mathrm{diag}(\theta_1, \ldots, \theta_n)$, hence $Q^{-1} = \mathrm{diag}\big(\tfrac{1}{\theta_1}, \ldots, \tfrac{1}{\theta_n}\big)U^{-1}$ and
$$eQ^{-1}(x - x^0) = \Big(\frac{1}{\theta_1}, \ldots, \frac{1}{\theta_n}\Big)(t_1, \ldots, t_n)^T = \sum_{i=1}^n \frac{t_i}{\theta_i}.$$
Thus, if $z^i = x^0 + \theta_i u^i$, $i = 1, \ldots, n$, then the cut (7.10) can also be written in the form
$$\sum_{i=1}^n \frac{t_i}{\theta_i} \ge 1, \qquad t = U^{-1}(x - x^0), \qquad (7.12)$$
which is often more convenient than (7.10).
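Numerically, the cut (7.12) only requires the $n$ values $\theta_i$, each obtainable by a one-dimensional search along an edge. A possible NumPy sketch (helper names are illustrative; the bisection assumes $f$ eventually drops below $\gamma$ along each ray, with the cap `theta_max` playing the role of $\theta_i = +\infty$ in (7.13) below):

```python
import numpy as np

def gamma_extension(f, x0, u, gamma, theta_max=1e6, tol=1e-8):
    """theta = sup{ t : f(x0 + t*u) >= gamma }, found by bisection;
    returns theta_max if the whole (capped) ray stays inside C(gamma)."""
    if f(x0 + theta_max * u) >= gamma:
        return theta_max                    # edge essentially inside C(gamma)
    lo, hi = 0.0, theta_max
    while hi - lo > tol * max(1.0, lo):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(x0 + mid * u) >= gamma else (lo, mid)
    return lo

def concavity_cut(f, x0, U, gamma):
    """Coefficients pi of the gamma-valid cut pi @ (x - x0) >= 1 in (7.10),
    computed via (7.12): e Q^{-1} = (1/theta_1, ..., 1/theta_n) U^{-1}."""
    n = len(x0)
    theta = np.array([gamma_extension(f, x0, U[:, i], gamma) for i in range(n)])
    return (1.0 / theta) @ np.linalg.inv(U)
```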
Remark 7.2 Given a real number $\gamma < f(x^0)$ and any point $x \ne x^0$, if
$$\alpha = \sup\{\lambda \mid f(x^0 + \lambda(x - x^0)) \ge \gamma\}$$
then the point $x^0 + \alpha(x - x^0)$ is called the $\gamma$-extension of $x$ with respect to $x^0$ (or simply, the $\gamma$-extension of $x$ if $x^0$ is clear from the context). The boundedness of $C(\gamma)$ ensures that $\alpha < +\infty$ (thus, each $z^i$ defined above is the $\gamma$-extension of $v^i = x^0 + u^i$ with respect to $x^0$). Without assuming the boundedness of $C(\gamma)$, there may exist an index set $I \subset \{1, \ldots, n\}$ such that each $i$th edge, $i \in I$, of the cone $M$ is entirely contained in $C(\gamma)$. In this case, for each $i \in I$ the $\gamma$-extension of $v^i = x^0 + u^i$ is infinite, but $x^0 + \theta_i u^i \in C(\gamma)$ for all $\theta_i > 0$, so the inequality (7.12) still defines a $\gamma$-valid cut for whatever large $\theta_i$ $(i \in I)$. Making $\theta_i = +\infty$ $(i \in I)$ in (7.12), we then obtain the cut
$$\sum_{i \notin I} \frac{t_i}{\theta_i} \ge 1, \qquad t = U^{-1}(x - x^0). \qquad (7.13)$$
We will refer to this cut (which exists regardless of whether $C(\gamma)$ is bounded or not) as the $\gamma$-valid concavity cut for $(f, D)$ at $x^0$ (or simply concavity cut, when $f, D, \gamma, x^0$ are clear from the context).
7.1.2 Canonical Cone
The construction of the cut (7.13) supposes the availability of a cone $M \supset D$ vertexed at $x^0$ and having exactly $n$ edges. When $x^0 = 0$ (which implies $b \ge 0$), $M$ can be taken to be the nonnegative orthant $x \ge 0$. In the general case, to construct $M$ we transform the problem to the space of nonbasic variables relative to $x^0$, in the following way. Recall that $D = \{x \in \mathbb{R}^n \mid Ax \le b,\ x \ge 0\}$ where $A$ is an $m \times n$ matrix and $b$ an $m$-vector. Introducing the slack variables $s = b - Ax$, denote $y = (x, s) \in \mathbb{R}^{n+m}$. Let $y_B = (y_i, i \in B)$ be the vector of basic variables and $y_N = (y_i, i \in N)$ the vector of nonbasic variables relative to the basic feasible solution $y^0 = (x^0, s^0)$, $s^0 = b - Ax^0$ ($|B| = m$, $|N| = n$ because $\dim D = n$ as assumed). The associated simplex tableau then yields
$$y_B = y_B^0 - Wy_N, \qquad y_B \ge 0,\ y_N \ge 0, \qquad (7.14)$$
where $W = [w_{ij}]$ is an $m \times n$ matrix and the basic solution $y^0$ corresponds to $y_N = 0$. By setting $y_N = t$ the constraints (7.14) become $Wt \le y_B^0$, $t \ge 0$, so the polyhedron $D$ is contained in the orthant $t \ge 0$. Note that, since $x_i = y_i$ for $i = 1, \ldots, n$, from (7.14) we derive
$$x = x^0 + Ut,$$
where $U$ is an $n \times n$ matrix of elements $u_{ij}$, $i = 1, \ldots, n$, $j \in N$, such that
$$u_{ij} = \begin{cases} -w_{ij} & i \in B \\ 1 & i = j \\ 0 & \text{otherwise.} \end{cases}$$
Thus, the orthant $t \ge 0$ corresponds, in the original space, to the cone
$$M = \{x \mid x = x^0 + Ut,\ t \ge 0\} \qquad (7.15)$$
which contains $D$, is vertexed at $x^0$ and has exactly $n$ edges, of directions represented by the $n$ columns $u^j$, $j = 1, \ldots, n$, of $U$. For convenience, in the sequel we will refer to the cone (7.15) as the canonical cone associated with $x^0$.
Concavity cuts of the above kind have been developed for general concave minimization problems. When the objective function $f(x)$ has some additional structure, as in concave quadratic minimization and bilinear programming problems, it is possible to significantly strengthen the cuts (Konno 1976a,b; see Chap. 10, Sect. 10.4). Also note that concavity cuts, originally designed for concave programming, have been used in other contexts (Balas 1971, 1972; Glover 1972, 1973a,b).

Given the inclusion (7.4), where $\gamma = f(\bar x) - \varepsilon$, one can check whether the incumbent $\bar x$ is global $\varepsilon$-optimal by using the sufficient optimality condition in Corollary 7.2. This suggests a cutting method for solving the inclusion which consists in iteratively reducing the feasible set by concavity cuts, until a global $\varepsilon$-optimal solution is identified (see, e.g., Bulatov 1977, 1987). Unfortunately, it has been observed that, when applied repeatedly, the concavity cuts often become shallower and shallower, and may even leave untouched a significant portion of the feasible set. To make the cuts deeper, the idea is to combine cutting with splitting the feasible set into smaller and smaller parts. As illustrated in Fig. 7.2, by splitting the initial cone $M_0$ into sufficiently small subcones and applying concavity cuts to each of the subcones separately, it is possible to exclude a portion of the feasible set as close to $D \cap C(\gamma)$ as desired, and obtain an accurate approximation to the set $D \setminus C(\gamma)$.
7.1.3 Procedure DC
With the preceding background, let us now consider the fundamental DC feasibility problem that was formulated in Sect. 7.1 as the key to the solution of (BCP):

$(*)$ Given two closed convex sets $C, D$ in $\mathbb{R}^n$, either find a point
$$x \in D \setminus C, \qquad (7.16)$$
or else prove that
$$D \subset C. \qquad (7.17)$$

Fig. 7.2 Cut and split process
As shown in Sect. 7.1, if $\bar x$ is the best feasible solution of (BCP) known thus far (the incumbent), then solving $(*)$ for $D = \{x \mid Ax \le b,\ x \ge 0\}$ and $C = \{x \mid f(x) \ge f(\bar x)\}$ amounts to transcending $\bar x$. According to the general two-phase scheme for solving (BCP) within a given tolerance $\varepsilon > 0$, it suffices, however, to find a point $x \in D \setminus C$ (i.e., a better feasible solution than $\bar x$) or else to establish that $D \subset C_\varepsilon$ for $C_\varepsilon = \{x \mid f(x) \ge f(\bar x) - \varepsilon\}$ (i.e., that $\bar x$ is a global $\varepsilon$-optimal solution). Therefore, we will discuss a slight modification of Problem $(*)$, obtained by relaxing the condition (7.17) to
$$D \subset C_\varepsilon, \qquad (7.18)$$
where $C_\varepsilon$ is a closed convex set such that
$$C_0 = C, \qquad C \subset \mathrm{int}\,C_\varepsilon \ \text{ if } \varepsilon > 0. \qquad (7.19)$$
In other words, we will consider the problem

$(\#)$ For a given closed convex set $C_\varepsilon$ satisfying (7.19) find a point $x \in D \setminus C$ or else prove that $D \subset C_\varepsilon$.

We will assume that $C_\varepsilon$ is bounded, since by restricting ourselves to the case when (BCP) has a finite optimal solution it is always possible to take $C_\varepsilon = \{x \mid f(x) \ge f(\bar x) - \varepsilon,\ \|x\| \le \rho\}$ with sufficiently large $\rho > 0$.

Let $x^0$ be a vertex of $D$ such that $f(x^0) \le f(z)$ for all neighboring vertices $z$ and $f(x^0) > f(\bar x) - \varepsilon$ (i.e., $x^0 \in \mathrm{int}\,C_\varepsilon$), and let $M_0 = \{x \mid x = x^0 + U_0 t,\ t \ge 0\}$ be the canonical cone associated with $x^0$ (so the columns $u^{01}, \ldots, u^{0n}$ of $U_0$ are the directions of the $n$ edges of $M_0$). By $S_0 = [v^{01}, \ldots, v^{0n}]$ denote the simplex spanned by $v^{0i} = x^0 + u^{0i}$, $i = 1, \ldots, n$.

For a given subcone $M$ of $M_0$ with base $S = [v^1, \ldots, v^n] \subset S_0$, i.e., a subcone generated by the $n$ halflines from $x^0$ through $v^1, \ldots, v^n$, let $U$ be the nonsingular matrix of columns $u^i = v^i - x^0$, $i = 1, \ldots, n$. If $\theta_i = \max\{\theta \mid x^0 + \theta u^i \in C_\varepsilon\}$ (so that $x^0 + \theta_i u^i$ is the $\gamma$-extension of $x^0 + u^i$ for $\gamma = f(\bar x) - \varepsilon$), then, as we saw from (7.12), the inequality
$$\sum_{i=1}^n \frac{t_i}{\theta_i} \ge 1, \qquad t = U^{-1}(x - x^0)$$
defines a $\gamma$-valid cut for $(f, D \cap M)$ at $x^0$. Consider the linear program
$$\max\Big\{\sum_{i=1}^n \frac{t_i}{\theta_i} \,\Big|\, t = U^{-1}(x - x^0) \ge 0,\ Ax \le b,\ x \ge 0\Big\},$$
i.e.,
$$\mathrm{LP}(D, M)\qquad \max\Big\{\sum_{i=1}^n \frac{t_i}{\theta_i} \,\Big|\, \tilde A U t \le \tilde b,\ t \ge 0\Big\} \qquad (7.20)$$
with
$$\tilde A = \begin{pmatrix} A \\ -I \end{pmatrix}, \qquad \tilde b = \begin{pmatrix} b - Ax^0 \\ x^0 \end{pmatrix}. \qquad (7.21)$$
Denote the optimal value and a basic optimal solution of this linear program in $t$ by $\mu(M)$ and $t(M)$, respectively, and let $\omega(M) = x^0 + Ut(M)$.
Proposition 7.3 If $\mu(M) \le 1$ then $D \cap M \subset C_\varepsilon$, i.e., $f(x) \ge f(\bar x) - \varepsilon$ for all points $x \in D \cap M$. If $\mu(M) > 1$ and $f(\omega(M)) \ge f(\bar x) - \varepsilon$, then $\omega(M)$ does not lie on any edge of $M$.

Proof. Clearly $\mu(M)$ and $\omega(M)$ are the optimal value and a basic optimal solution of the linear program
$$\text{maximize } eQ^{-1}(x - x^0) \ \text{ subject to } x \in D \cap M,$$
where $Q = U\,\mathrm{diag}(\theta_1, \ldots, \theta_n)$ (i.e., $Q$ is the matrix of columns $\theta_1 u^1, \ldots, \theta_n u^n$). The first assertion then follows from Proposition 7.2. To prove the second assertion, suppose that $\mu(M) > 1$ and $f(\omega(M)) \ge f(\bar x) - \varepsilon$ while $\omega(M)$ belongs to some $j$th edge. Since the hyperplane $\sum_{i=1}^n t_i/\theta_i = 1$ cuts this edge at the point $x^0 + \theta_j u^j$, we must have $\omega(M) = x^0 + \mu(M)(\theta_j u^j) = x^0 + \theta u^j$ for $\theta = \mu(M)\theta_j > \theta_j$, which contradicts the definition of $\theta_j$ because $f(\omega(M)) \ge f(\bar x) - \varepsilon$. □

From this proposition it follows that if $M_0$ can be partitioned into a number of subcones such that $\mu(M) \le 1$ for every subcone, then $D \subset C_\varepsilon$, i.e., $\bar x$ is a global $\varepsilon$-optimal solution. Furthermore, if $\mu(M) > 1$ for some subcone $M$ while $f(\omega(M)) \ge f(\bar x) - \varepsilon$, then the subdivision of $M$ via the ray through $\omega(M)$ will be proper. We can now state the following procedure for the master problem $(\#)$, assuming $D = \{x \mid Ax \le b,\ x \ge 0\}$, $C = \{x \mid f(x) \ge f(\bar x)\}$, and $\bar x$ a vertex of $D$.
7.1.3.1 Procedure DC
Let $x^0$ be a vertex of $D$ which is a local minimizer, such that $x^0 \in \mathrm{int}\,C_\varepsilon$, let $M_0 = \{x \mid x = x^0 + U_0 t,\ t \ge 0\}$ be the canonical cone associated with $x^0$, and let $S_0 = [x^0 + u^{01}, \ldots, x^0 + u^{0n}]$, where $u^{01}, \ldots, u^{0n}$ are the columns of $U_0$. Set $\mathcal{P} = \{M_0\}$, $\mathcal{N} = \{M_0\}$.

Step 1. For each cone $M \in \mathcal{P}$, of base $S = [x^0 + u^1, \ldots, x^0 + u^n] \subset S_0$, let $U$ be the matrix of columns $u^1, \ldots, u^n$. Compute $\theta_i$ such that $x^0 + \theta_i u^i \in \partial C_\varepsilon$ $(i = 1, \ldots, n)$ and solve the linear program [see (7.20)]
$$\mathrm{LP}(D, M)\qquad \max\Big\{\sum_{i=1}^n \frac{t_i}{\theta_i} \,\Big|\, \tilde A U t \le \tilde b,\ t \ge 0\Big\}$$
to obtain the optimal value $\mu(M)$ and a basic optimal solution $t(M)$. Let $\omega(M) = x^0 + Ut(M)$.

Step 2. If for some $M \in \mathcal{P}$, $\omega(M) \notin C$, i.e., $f(\omega(M)) < f(\bar x)$, then terminate: $\omega(M)$ yields a point $z \in D \setminus C$.

Step 3. Delete every cone $M \in \mathcal{N}$ for which $\mu(M) \le 1$ and let $\mathcal{R}$ be the collection of all remaining cones.

Step 4. If $\mathcal{R} = \emptyset$, then terminate: $D \subset C_\varepsilon$ ($\bar x$ is a global $\varepsilon$-optimal solution of (BCP)).

Step 5. Let $M^* \in \mathrm{argmax}\{\mu(M) \mid M \in \mathcal{R}\}$, with base $S^*$. Select a point $v^* \in S^*$ which does not coincide with any vertex of $S^*$. Split $S^*$ via $v^*$, obtaining a partition $\mathcal{P}^*$ of $M^*$.

Step 6. Set $\mathcal{N} \leftarrow (\mathcal{R} \setminus \{M^*\}) \cup \mathcal{P}^*$, $\mathcal{P} \leftarrow \mathcal{P}^*$ and return to Step 1.
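The subproblem $\mathrm{LP}(D, M)$ of (7.20)–(7.21) is an ordinary linear program in $t$. A possible SciPy sketch (infeasibility of $D \cap M$ is reported through `res.success`; the default `linprog` solver returns a basic optimal solution for this kind of LP):

```python
import numpy as np
from scipy.optimize import linprog

def lp_D_M(A, b, x0, U, theta):
    """Solve LP(D, M) in (7.20)-(7.21): maximize sum(t_i / theta_i) over
    { t >= 0 : A(x0 + U t) <= b, x0 + U t >= 0 }.
    Returns (mu(M), t(M), omega(M)), or (None, None, None) if infeasible."""
    A_tilde = np.vstack([A, -np.eye(len(x0))])
    b_tilde = np.concatenate([b - A @ x0, x0])
    c = -1.0 / np.asarray(theta)            # linprog minimizes, so negate
    res = linprog(c, A_ub=A_tilde @ U, b_ub=b_tilde,
                  bounds=[(0, None)] * len(x0))
    if not res.success:
        return None, None, None
    t = res.x
    return -res.fun, t, x0 + U @ t          # mu(M), t(M), omega(M)
```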
Incorporating the above procedure into the two-phase scheme for solving (BCP) (see Sect. 5.7), we obtain the following:

CS (Cone Splitting) Algorithm for (BCP)

Let $z^0$ be an initial feasible solution.

Phase 1. Compute a vertex $\bar x$ which is a local minimizer such that $f(\bar x) \le f(z^0)$.

Phase 2. Call Procedure DC with $C = \{x \mid f(x) \ge f(\bar x)\}$, $C_\varepsilon = \{x \mid f(x) \ge f(\bar x) - \varepsilon\}$.

(a) If the procedure returns a point $z \in D$ such that $f(z) < f(\bar x)$, then set $z^0 \leftarrow z$ and go back to Phase 1. (Optionally: with $\gamma = f(z)$ and $x^0$ being the vertex of $D$ where Procedure DC has been started, construct a $\gamma$-valid cut $\pi(x - x^0) \ge 1$ for $(f, D)$ at $x^0$ and reset $D \leftarrow D \cap \{x \mid \pi(x - x^0) \ge 1\}$.)

(b) If the procedure establishes that $D \subset C_\varepsilon$, then $\bar x$ is a global $\varepsilon$-optimal solution.
With regard to convergence and efficiency, a key point in Procedure DC is the subdivision strategy, i.e., the rule by which the cone $M^*$ in Step 5 must be subdivided. Let $M_k$ denote the cone $M^*$ at iteration $k$. Noting that $\omega(M_k)$ does not lie on any edge of $M_k$, a natural strategy that comes to mind is to split the cone $M_k$ upon the halfline from $x^0$ through $\omega(M_k)$, i.e., to subdivide its base $S_k$ via the intersection point $v^k$ of $S_k$ with this halfline. A subdivision of this kind is referred to as an $\omega$-subdivision, and the strategy which prescribes $\omega$-subdivision in every iteration is called the $\omega$-strategy. This natural strategy was used in the earliest works in concave minimization (Tuy 1964; Zwart 1974) and although it seemed to work quite well in practice (Hamami and Jacobsen 1988), its convergence was never rigorously established until the works of Jaumard and Meyer (2001), Locatelli (1999), and most recently Kuno and Ishihama (2015). Now we can state the following:

Theorem 7.1 The CS Algorithm with $\omega$-subdivisions terminates after finitely many steps, yielding a global $\varepsilon$-optimal solution of (BCP).

Proof. Thanks to Theorem 6.2, a new proof for this result can be given which is much shorter than all previously published ones in the above cited works. In fact, consider iteration $k$ of Procedure DC with $\omega$-subdivisions. The cone $M_k$ (i.e., $M^*$ in Step 5) is subdivided via the halfline from $x^0$ through $\omega^k := \omega(M_k)$. Let $z^{k1}, \ldots, z^{kn}$ be the points where the edges of $M_k$ meet the boundary $\partial C_\varepsilon$ of $C_\varepsilon$ and let $q^k$ be the point where the halfline from $x^0$ through $\omega^k$ meets the simplex $[z^{k1}, \ldots, z^{kn}]$. By Theorem 6.2, if the procedure is infinite there is an infinite subsequence $\{k_s\}$ such that $q^{k_s} \to q \in \partial C_\varepsilon$ as $s \to \infty$. By continuity of $f(x)$ it follows that $f(q^{k_s}) \to f(q) = f(\bar x) - \varepsilon$, so $f(\omega^{k_s}) < f(\bar x)$ for sufficiently large $k_s$. Since this is the stopping criterion in Step 2 of the procedure, the procedure must terminate after finitely many iterations, either by the stopping criterion in Step 2 or by that in Step 4. □
Remark 7.3 It is not hard to show that $\omega^k - x^0 = \mu(M_k)(q^k - x^0)$. From the above proof there is an infinite subsequence $\{k_s\}$ such that $\|\omega^{k_s} - q^{k_s}\| \to 0$, i.e., $(\mu(M_{k_s}) - 1)\|q^{k_s} - x^0\| \to 0$, hence
$$\mu(M_{k_s}) \to 1 \quad (s \to \infty). \qquad (7.22)$$
Later we will see that this condition is sufficient to ensure convergence of the CS Algorithm viewed as a branch and bound algorithm.

Remark 7.4 The CS Algorithm for solving (BCP) consists of a finite number of cycles, each of which involves a Procedure DC for transcending the current incumbent. Any new cycle is actually a restart of the algorithm, whence the name CS restart sometimes given to the algorithm. Due to restart, the set $\mathcal{N}_k$ (the collection of cones to be stored) can be kept within reasonable size to avoid storage problems. Furthermore, the vertex $x^0$ used to initialize Procedure DC in each cycle need not be fixed throughout the whole algorithm, but may vary from cycle to cycle. When started from $x^0$, Procedure DC concentrates the search over the part of the boundary of $D$ contained in the canonical cone associated with $x^0$, so a change of $x^0$ means a change of search directions. This multistart effect may sometimes drastically improve the convergence speed. On the other hand, further flexibility can be given to the algorithm by modifying Step 2 of Procedure DC as follows:

Step 2. (Modified) If for some $M \in \mathcal{N}$, $\omega(M) \notin C$, i.e., $f(\omega(M)) < f(\bar x)$, then, optionally: (a) either set $z^0 \leftarrow \omega(M)$ and return to Phase 1, or (b) compute a vertex $\hat x$ of $D$ such that $f(\hat x) \le f(\omega(M))$, reset $\bar x \leftarrow \hat x$, and go to Step 3.

With this modification, as soon as a feasible solution better than the incumbent is found in Step 2 of Procedure DC, the algorithm either returns to Phase 1 (with $z^0 \leftarrow \omega(M)$) or goes to Step 3 (with an updated incumbent) to continue the current Procedure DC. In general, the former option (return to Phase 1) should be chosen when the set $\mathcal{N}$ has reached a critical size. The efficiency of the multirestart strategy is illustrated in the following example:
Example 7.1 Consider the problem
$$\begin{aligned}
\text{minimize}\ \ & x_1 - 10x_2 + 10x_3 + x_8 - x_1^2 - x_2^2 - x_3^2 - x_4^2 - 7x_5^2 - 4x_6^2 - x_7^2 - 2x_8^2\\
& \quad + 2x_1x_2 + 6x_1x_5 + 6x_2x_5 + 2x_3x_4\\
\text{subject to}\ \ & x_1 + 2x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 \le 8\\
& 2x_1 + x_2 + x_3 \le 9\\
& x_3 + x_4 + x_5 \le 5\\
& 0.5x_5 + 0.5x_6 + x_7 + 2x_8 \le 3\\
& 2x_2 - x_3 - 0.5x_4 \le 5\\
& x_1 \le 6, \qquad x_i \ge 0, \ i = 1, 2, \ldots, 8.
\end{aligned}$$

A local minimizer of the problem is the vertex $x^0 = (0, 0, 0, 0, 5, 1, 0, 0)$ of the feasible polyhedron, with objective function value $-179$. Suppose that Procedure DC is applied to transcend this local minimizer. Started from $x^0$, the procedure runs through 37 iterations, generating 100 cones, without detecting any better feasible solution than $x^0$ or producing evidence that $x^0$ is already a global optimal solution. By restarting the procedure from the new vertex $x'^0 = (0, 2.5, 0, 0, 0, 0, 0, 0)$ we find after eight iterations that no better feasible solution than $x^0$ exists. Therefore, $x^0$ is actually a global minimizer.
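The stated objective value is easy to double-check: at the vertex $(0, 0, 0, 0, 5, 1, 0, 0)$ all linear and bilinear terms vanish and the objective reduces to $-7 \cdot 25 - 4 \cdot 1 = -179$. A one-line numerical confirmation:

```python
# Objective of Example 7.1, evaluated at the local minimizer.
def f(x):
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    return (x1 - 10*x2 + 10*x3 + x8
            - x1**2 - x2**2 - x3**2 - x4**2
            - 7*x5**2 - 4*x6**2 - x7**2 - 2*x8**2
            + 2*x1*x2 + 6*x1*x5 + 6*x2*x5 + 2*x3*x4)

print(f([0, 0, 0, 0, 5, 1, 0, 0]))   # -> -179, matching the stated value
```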
7.1.4 ω-Bisection
As said above, for some time the convergence status of conical algorithms using $\omega$-subdivisions was unsettled. It was then natural to look for other subdivisions which could secure convergence, even at some cost to efficiency. The simplest subdivision that came to mind was bisection: at each iteration the cone $M^*$ is split along the halfline from $x^0$ through the midpoint of a longest edge of the simplex $S^*$. In fact, a first conical algorithm using bisections was developed (Thoai and Tuy 1980) whose convergence was proved rigorously. Subsequently, however, it was observed that the convergence achieved with bisections is too slow. This motivated the Normal Subdivision (Tuy 1991a), which is a kind of hybrid strategy, using $\omega$-subdivisions in most iterations and bisections occasionally, just enough to prevent jamming and eventually ensure convergence.

At the time when the pure $\omega$-subdivision was efficient, yet uncertain to converge, Normal Subdivision was the best hybrid subdivision ensuring both convergence and efficiency (Horst and Tuy 1996). However, once the convergence of pure $\omega$-subdivision has been settled, bisections are no longer needed to prevent jamming in conical algorithms. On the other hand, an advantage of bisections over other kinds of subdivision is that each bisection creates only two new partition sets, so that using bisections may help to keep the growth of the collection $\mathcal{N}$ of accumulated partition sets within more manageable limits. Moreover, recent investigations by Kuno and Ishihama (2015) show that bisections can be built in such a way as to take account of current problem structure, just like $\omega$-subdivisions.
The following concept of $\omega$-bisection is a slight modification of that introduced by Kuno–Ishihama. The modification allows us to dispense with assuming strict concavity of $f(x)$ and also to simplify the convergence proof.

Let $M_k$ be the cone to be subdivided at iteration $k$ of Procedure DC, $S_k = [u^{k1}, \ldots, u^{kn}] \subset \mathbb{R}^n$ its base, and $v^k$ the point where $S_k$ meets the ray through $\omega^k = \omega(M_k)$. Let $v^k = \sum_{i=1}^n \lambda_i^k u^{ki}$ where $\sum_{i=1}^n \lambda_i^k = 1$, $\lambda_i^k \ge 0$ $(i = 1, \ldots, n)$. Take $\delta > 0$, and for every $k$ define
$$\mu_i^k = \frac{\lambda_i^k + \delta}{1 + n\delta}, \ i = 1, \ldots, n, \qquad \tilde v^k = \sum_{i=1}^n \mu_i^k u^{ki}.$$
Clearly $\mu_i^k > 0$ $\forall k$, $i = 1, \ldots, n$, and $\sum_{i=1}^n \mu_i^k = 1$, so $\tilde v^k \in S_k$. Let $[u^{kr_k}, u^{ks_k}]$ be a longest edge of the simplex $S_k$ and
$$w^k = \frac{\mu_{r_k}^k u^{kr_k} + \mu_{s_k}^k u^{ks_k}}{\mu_{r_k}^k + \mu_{s_k}^k}. \qquad (7.23)$$
The $\omega$-bisection of $M_k$ is then defined to be the subdivision of $M_k$ via the halfline from $x^0$ through $w^k$ (or, in other words, the subdivision of $M_k$ induced by the subdivision of $S_k$ via $w^k$).

Since $\mu_i^k > 0$ $\forall i$, the set $\{\tilde v^k, u^{ki}, i \notin \{r_k, s_k\}\}$ is affinely independent and, as is easily seen, $w^k$ is just the point where the segment $[u^{kr_k}, u^{ks_k}]$ meets the affine manifold through $\tilde v^k, u^{ki}, i \notin \{r_k, s_k\}$.
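A minimal NumPy sketch of the rule (7.23), assuming the base vertices $u^{k1}, \ldots, u^{kn}$ are given as the columns of an array `S` and the barycentric coordinates $\lambda^k$ of $v^k$ are already known:

```python
import numpy as np

def omega_bisection_point(S, lam, delta=0.1):
    """Point w^k of (7.23). S: n x n array of base vertices (columns);
    lam: barycentric coordinates of v^k in the base simplex."""
    n = S.shape[1]
    mu = (np.asarray(lam) + delta) / (1 + n * delta)   # mu_i^k > 0, sum = 1
    # locate a longest edge [u^{k r}, u^{k s}] of the simplex S_k
    r, s, best = 0, 1, -1.0
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(S[:, i] - S[:, j])
            if d > best:
                r, s, best = i, j, d
    return (mu[r] * S[:, r] + mu[s] * S[:, s]) / (mu[r] + mu[s])
```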
Theorem 7.2 The CS Algorithm with $\omega$-bisections terminates after finitely many steps, yielding a global $\varepsilon$-optimal solution of (BCP).

Proof. Since $\frac{\mu_{r_k}^k}{\mu_{r_k}^k + \mu_{s_k}^k} \ge \mu_{r_k}^k = \frac{\lambda_{r_k}^k + \delta}{1 + n\delta} \ge \frac{\delta}{1 + n\delta}$, the $\omega$-bisection of $S_k$ is a bisection of $S_k$ of ratio $\ge \frac{\delta}{1 + n\delta} > 0$, so by Corollary 6.1 the set $\cap_{k=1}^\infty S_k$ is a singleton $\{v\}$. Hence, if $z^{k1}, \ldots, z^{kn}$ are the points where the halflines from $x^0$ through $u^{k1}, \ldots, u^{kn}$ meet $\partial C_\varepsilon$, while $q^k, \hat q^k$ are the points where the halfline from $x^0$ through $v^k$ meets the simplex $[z^{k1}, \ldots, z^{kn}]$ and $\partial C_\varepsilon$, respectively, then $\|q^k - \hat q^k\| \to 0$ as $k \to \infty$. Therefore there is $q \in \partial C_\varepsilon$ such that $q^k, \hat q^k \to q$ as $k \to \infty$. Since $\omega^k := \omega(M_k) \in [q^k, \hat q^k]$ it follows that $\omega^k \to q \in \partial C_\varepsilon$. By reasoning just as in the last part of the proof of Theorem 6.1 we conclude that the CS Algorithm with $\omega$-bisections terminates after finitely many iterations, yielding a global $\varepsilon$-optimal solution. □

From the computational experiments reported in Kuno and Ishihama (2015), the CS Algorithm with $\omega$-bisection seems to compete favorably with the CS Algorithm using $\omega$-subdivisions.
7.1.5 Branch and Bound
Let $M$ be a cone vertexed at a point $x^0 \in D$ and having $n$ edges of directions $u^1, \ldots, u^n$. A lower bound of $f(M \cap D)$ to be used in a conical BB algorithm for solving (BCP) can be computed as follows.

Let $\gamma$ be the current incumbent value. Assuming that $\gamma - \varepsilon < f(x^0)$ (otherwise $x^0$ is already a global $\varepsilon$-optimal solution of (BCP)), compute
$$\theta_i = \sup\{\theta \mid f(x^0 + \theta u^i) \ge \gamma - \varepsilon\}, \quad i = 1, \ldots, n, \qquad (7.24)$$
and by $\mu = \mu(M)$ denote the optimal value of the linear program $\mathrm{LP}(D, M)$ given by (7.20). If $y^i = x^0 + \mu\theta_i u^i$, $i = 1, \ldots, n$, then clearly $D \cap M \subset [x^0, y^1, \ldots, y^n]$, so by concavity of $f(x)$ a lower bound of $f(D \cap M)$ can be taken to be the value
$$\beta(M) = \begin{cases} \gamma - \varepsilon & \text{if } \mu(M) \le 1 \\ \min\{f(y^1), \ldots, f(y^n)\} & \text{otherwise} \end{cases} \qquad (7.25)$$
(see Fig. 7.3). Using this lower bounding $M \mapsto \beta(M)$ together with the conical $\omega$-subdivision rule we can then formulate the following BB (branch and bound) version of the CS Algorithm discussed in Sect. 5.4.
Conical BB Algorithm for (BCP)

Initialization. Compute a feasible solution $z^0$.

Phase 1. Compute a vertex $\bar x$ of $D$ which is a local minimizer such that $f(\bar x) \le f(z^0)$.

Phase 2. Take a vertex $x^0$ of $D$ (local minimizer) such that $f(x^0) > f(\bar x) - \varepsilon$. Let $M_0 = \{x \mid x = x^0 + U_0 t,\ t \ge 0\}$ be the canonical cone associated with $x^0$, $S_0 = [x^0 + u^{01}, \ldots, x^0 + u^{0n}]$, where $u^{0i}$, $i = 1, \ldots, n$, are the columns of $U_0$. Set $\mathcal{P} = \{M_0\}$, $\mathcal{N} = \{M_0\}$, $\gamma = f(\bar x)$.

Fig. 7.3 Computation of $\beta(M)$
Step 1. (Bounding) For each cone $M \in \mathcal{P}$, of base $S = [z^1, \ldots, z^n] \subset S_0$, let $U$ be the matrix of columns $u^i = z^i - x^0$, $i = 1, \ldots, n$. Compute $\theta_i$, $i = 1, \ldots, n$, according to (7.24) and solve the linear program [see (7.20)]
$$\mathrm{LP}(D, M)\qquad \max\Big\{\sum_{i=1}^n \frac{t_i}{\theta_i} \,\Big|\, \tilde A U t \le \tilde b,\ t \ge 0\Big\} \qquad (7.26)$$
to obtain the optimal value $\mu(M)$ and a basic optimal solution $t(M)$. Let $\omega(M) = x^0 + Ut(M)$. Compute $\beta(M)$ as in (7.25).

Step 2. (Options) If for some $M \in \mathcal{P}$, $f(\omega(M)) < \gamma$, then do either of the options:
(a) with $\gamma \leftarrow f(\omega(M))$ construct a $(\gamma - \varepsilon)$-valid cut for $(f, D)$ at $x^0$: $\langle \pi, x - x^0\rangle \ge 1$, reset $D \leftarrow D \cap \{x \mid \langle \pi, x - x^0\rangle \ge 1\}$, $z^0 \leftarrow \omega(M)$, and go back to Phase 1;
(b) compute a vertex $\hat x$ of $D$ such that $f(\hat x) \le f(\omega(M))$, reset $\bar x \leftarrow \hat x$, $\gamma \leftarrow f(\hat x)$, and go to Step 3.

Step 3. (Pruning) Delete every cone $M \in \mathcal{N}$ for which $\beta(M) \ge \gamma - \varepsilon$ and let $\mathcal{R}$ be the collection of all remaining cones.

Step 4. (Termination Criterion) If $\mathcal{R} = \emptyset$, then terminate: $\bar x$ is a global $\varepsilon$-optimal solution of (BCP).

Step 5. (Branching) Let $M^* \in \mathrm{argmin}\{\beta(M) \mid M \in \mathcal{R}\}$, with base $S^*$. Perform an $\omega$-subdivision of $M^*$ and let $\mathcal{P}^*$ be the partition of $M^*$.

Step 6. (New Net) Set $\mathcal{N} \leftarrow (\mathcal{R} \setminus \{M^*\}) \cup \mathcal{P}^*$, $\mathcal{P} \leftarrow \mathcal{P}^*$ and return to Step 1.
Theorem 7.3 The above algorithm can be infinite only if $\varepsilon = 0$, and then any accumulation point of the sequence $\{\bar x^k\}$ is a global optimal solution of (BCP).

Proof. Since $D$ has finitely many vertices, if the algorithm is infinite, a Phase 2 must be infinite and generate a filter $\{M_k, k \in K\}$. Let $\gamma_k$ be the current value of $\gamma$, and let $\mu_k = \mu(M_k)$, $\omega^k = \omega(M_k)$, $\gamma = \lim \gamma_k$. By formula (7.22) (which extends easily to the case when $\gamma_k \searrow \gamma$), there exists $K_1 \subset K$ such that $\mu_k \to 1$, $\omega^k \to \omega$, with $\omega$ satisfying $f(\omega) = \gamma - \varepsilon$, as $k \to \infty$, $k \in K_1$. But in view of Step 2, $f(\omega^k) \ge \gamma_k$, hence $f(\omega) \ge \gamma$, which is possible only if $\varepsilon = 0$. Moreover, without loss of generality we can assume $\beta(M_k) = f(x^0 + \mu_k\theta_{1k}u^{1k})$, and so, in that case, $\gamma_k - \beta(M_k) \le f(x^0 + \theta_{1k}u^{1k}) - f(x^0 + \mu_k\theta_{1k}u^{1k}) \to 0$ as $k \to \infty$, $k \in K_1$. Therefore, by Proposition 6.1 the algorithm converges. □
Remark 7.5 Option (a) in Step 2 allows the algorithm to be restarted when storage problems arise in connection with the growth of the current set of cones. On the other hand, with option (b) the algorithm becomes a unified procedure. This flexibility sometimes enhances the practicability of the algorithm. Note, however, that the BB algorithm requires more function evaluations than the CS Algorithm (for each cone $M$ the CS Algorithm only requires the value of $f(x)$ at $\omega(M)$, while the BB Algorithm requires the values of $f(x)$ at the $n$ points $y^1, \ldots, y^n$). This should be taken into account when function evaluations are costly, as occurs in problems where the objective function $f(x)$ is implicitly defined via some auxiliary problem (for instance, $f(x) = \min\{\langle Qx, y\rangle \mid y \in G\}$, where $Q \in \mathbb{R}^{p \times n}$ and $G$ is a polytope in $\mathbb{R}^p$).
7.1.6 Rectangular Algorithms
Aside from simplicial and conical subdivisions, rectangular subdivisions are also used in branch and bound algorithms for (BCP). In particular, rectangular subdivisions are often preferred when the objective function is such that a lower bound of its values over a rectangle can be obtained in a straightforward manner. The latter occurs, in particular, when the function is separable, i.e., has the form $f(x) = \sum_{i=1}^n f_i(x_i)$.

Consider now the Separable Basic Concave Program
$$(\mathrm{SBCP})\qquad \min\Big\{f(x) := \sum_{j=1}^n f_j(x_j) \,\Big|\, x \in D\Big\}, \qquad (7.27)$$
where $D$ is a polytope contained in the rectangle $[c, d] = \{x \mid c \le x \le d\}$ and each function $f_j(t)$ is concave and continuous on the interval $[c_j, d_j]$. Separable programming problems of this form appear quite frequently in production, transportation, and planning models. Also note that any quadratic function can be made separable by an affine transformation (see Chap. 4, Sect. 4.5).
Proposition 7.4 If $f(x) = \sum_{j=1}^n f_j(x_j)$ is a separable concave function and $M = [r, s]$ is any rectangle with $c \le r, s \le d$, then there is an affine minorant of $f(x)$ on $M$ which agrees with $f(x)$ at the vertices of $M$. This function is given by
$$\varphi_M(x) = \sum_{j=1}^n \varphi_{M,j}(x_j),$$
where, for every $j$, $\varphi_{M,j}(\cdot)$ is the affine function of one variable that agrees with $f_j(\cdot)$ at the endpoints of the interval $[r_j, s_j]$, i.e.,
$$\varphi_{M,j}(t) = f_j(r_j) + \frac{f_j(s_j) - f_j(r_j)}{s_j - r_j}(t - r_j). \qquad (7.28)$$

Proof. That $\varphi_M(x)$ is affine and minorizes $f(x)$ over $M$ is obvious. On the other hand, if $x$ is a vertex of $M$ then, for every $j$, we have either $x_j = r_j$ or $x_j = s_j$, hence $\varphi_{M,j}(x_j) = f_j(x_j)$ $\forall j$, and therefore $f(x) = \varphi_M(x)$. □

On the basis of this proposition a lower bound for $f(x)$ over a rectangle $M = [r, s]$ can be computed by solving the linear program
$$\mathrm{LP}(D, M)\qquad \min\{\varphi_M(x) \mid x \in D \cap M\}. \qquad (7.29)$$
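A possible SciPy sketch of this bounding step: build the affine minorant (7.28) coefficient-wise and minimize it over $D \cap M$ by linear programming (here $D = \{x \mid Ax \le b\}$ for illustration; degenerate intervals $r_j = s_j$ are given slope 0):

```python
import numpy as np
from scipy.optimize import linprog

def sbcp_bound(f_list, A, b, r, s):
    """Lower bound beta(M) = min{ phi_M(x) : x in D, r <= x <= s }, see
    (7.28)-(7.29). f_list holds the univariate concave pieces f_j.
    Returns (beta, omega), or (inf, None) if D and M do not intersect."""
    r, s = np.asarray(r, float), np.asarray(s, float)
    fr = np.array([fj(rj) for fj, rj in zip(f_list, r)])
    fs = np.array([fj(sj) for fj, sj in zip(f_list, s)])
    slope = np.where(s > r, (fs - fr) / np.where(s > r, s - r, 1.0), 0.0)
    const = fr.sum() - slope @ r            # phi_M(x) = slope @ x + const
    res = linprog(slope, A_ub=A, b_ub=b, bounds=list(zip(r, s)))
    if not res.success:
        return np.inf, None                 # D & M empty: discard M
    return res.fun + const, res.x           # beta(M), omega(M)
```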
If $\omega(M)$ and $\beta(M)$ denote a basic optimal solution and the optimal value of this linear program, then $\beta(M) \le \min f(D \cap M)$, with equality holding when $f(\omega(M)) = \varphi_M(\omega(M))$. It is also easily seen that if a rectangle $M = [r, s]$ is entirely outside $D$ then $D \cap M$ is empty and $M$ can be discarded from consideration ($\beta(M) = +\infty$). On the other hand, if $M$ is entirely contained in $D$ then $D \cap M = M$, hence the minimum of $f(x)$ over $D \cap M$ is achieved at a vertex of $M$, and since $\varphi_M(x)$ agrees with $f(x)$ at the vertices of $M$, it follows that $\beta(M) = \min f(D \cap M)$. Therefore, in a branch and bound rectangular algorithm for (SBCP) only those rectangles intersecting the boundary of $D$ will need to be investigated, so that the search will essentially be concentrated on this boundary.
Proposition 7.5 Let $\{M_k, k \in K\}$ be a filter of rectangles as in Lemma 6.3. If
$$j_k \in \mathrm{argmax}\{f_j(\omega_j^k) - \varphi_{M_k,j}(\omega_j^k) \mid j = 1, \ldots, n\} \qquad (7.30)$$
then there exists a subsequence $K_1 \subset K$ such that
$$f(\omega^k) - \varphi_{M_k}(\omega^k) \to 0 \quad (k \to \infty,\ k \in K_1). \qquad (7.31)$$

Proof. By Lemma 6.3, we may assume, without loss of generality, that $j_k = 1$ $\forall k \in K_1$ and $\omega_1^k - r_1^k \to 0$ as $k \to \infty$, $k \in K_1$ (the case $\omega_1^k - s_1^k \to 0$ is similar). Since $\varphi_{M_k,1}(r_1^k) = f_1(r_1^k)$, it follows that $f_1(\omega_1^k) - \varphi_{M_k,1}(\omega_1^k) \to 0$. The choice of $j_k = 1$ by (7.30) then implies that $f_j(\omega_j^k) - \varphi_{M_k,j}(\omega_j^k) \to 0$ $(j = 1, \ldots, n)$, hence (7.31). □
The above results provide the basis for the following algorithm, essentially due to Falk and Soland (1969):

BB Algorithm for (SBCP)

Initialization. Let $x^0$ be the best feasible solution available, $\mathcal{P}_1 = \mathcal{N}_1 = \{M_0 := [c, d]\}$. Set $k = 1$.

Step 1. (Bounding) For each rectangle $M \in \mathcal{P}_k$ solve the linear program $\mathrm{LP}(D, M)$ given by (7.29) to obtain its optimal value $\beta(M)$ and a basic optimal solution $\omega(M)$.

Step 2. (Incumbent) Update the incumbent by setting $x^k$ equal to the best among $x^{k-1}$ and all $\omega(M)$, $M \in \mathcal{P}_k$.

Step 3. (Pruning) Delete every $M \in \mathcal{N}_k$ such that $\beta(M) \ge f(x^k) - \varepsilon$. Let $\mathcal{R}_k$ be the collection of remaining members of $\mathcal{N}_k$.

Step 4. (Termination Criterion) If $\mathcal{R}_k = \emptyset$, then terminate: $x^k$ is a global $\varepsilon$-optimal solution of (SBCP).

Step 5. (Branching) Choose $M_k \in \mathrm{argmin}\{\beta(M) \mid M \in \mathcal{R}_k\}$. Let $\omega^k = \omega(M_k)$. Choose $j_k$ according to (7.30) and subdivide $M_k$ via $(\omega^k, j_k)$. Let $\mathcal{P}_{k+1}$ be the partition of $M_k$.

Step 6. (New Partitioning) Set $\mathcal{N}_{k+1} = (\mathcal{R}_k \setminus \{M_k\}) \cup \mathcal{P}_{k+1}$. Set $k \leftarrow k + 1$ and go back to Step 1.
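Putting the pieces together, the following sketch implements the above rectangular BB loop, reusing the `sbcp_bound` helper from the previous sketch and a heap ordered by lower bounds; it assumes $D \ne \emptyset$, so the initial bounding LP is feasible.

```python
import heapq, itertools
import numpy as np

def falk_soland(f_list, A, b, c, d, eps=1e-6, max_iter=10_000):
    """Rectangular BB sketch for (SBCP); D = {x : A x <= b}, M_0 = [c, d]."""
    f = lambda x: sum(fj(xj) for fj, xj in zip(f_list, x))
    tie = itertools.count()                 # heap tiebreaker
    beta, omega = sbcp_bound(f_list, A, b, c, d)
    best_x, best_val = omega, f(omega)
    heap = [(beta, next(tie), np.array(c, float), np.array(d, float), omega)]
    for _ in range(max_iter):
        if not heap or heap[0][0] >= best_val - eps:
            return best_x, best_val         # every rectangle is pruned
        beta, _, r, s, w = heapq.heappop(heap)
        # branch on the coordinate with largest underestimation gap (7.30)
        gaps = [fj(wj) - (fj(rj) + (fj(sj) - fj(rj)) / (sj - rj) * (wj - rj))
                if sj > rj else 0.0
                for fj, wj, rj, sj in zip(f_list, w, r, s)]
        j = int(np.argmax(gaps))
        for lo, hi in ((r[j], w[j]), (w[j], s[j])):   # omega-subdivision
            if hi - lo < 1e-12:
                continue
            rr, ss = r.copy(), s.copy()
            rr[j], ss[j] = lo, hi
            b2, w2 = sbcp_bound(f_list, A, b, rr, ss)
            if w2 is None:
                continue                    # child does not meet D
            if f(w2) < best_val:
                best_val, best_x = f(w2), w2
            if b2 < best_val - eps:
                heapq.heappush(heap, (b2, next(tie), rr, ss, w2))
    return best_x, best_val
```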
Theorem 7.4 The above algorithm can be infinite only if $\varepsilon = 0$, and in this case every accumulation point of the sequence $\{x^k\}$ is a global optimal solution of (SBCP).

Proof. Proposition 7.5 ensures that every filter $\{M_k, k \in K\}$ contains an infinite subsequence $\{M_k, k \in K_1\}$ such that
$$f(\omega^k) - \beta(M_k) \to 0 \quad (k \to \infty,\ k \in K_1).$$
The conclusion then follows from Proposition 5.1. □

Remark 7.6 The subdivision via $(\omega^k, j_k)$ as used in the above algorithm is also referred to as $\omega$-subdivision. Note that if $M_k$ is chosen as in Step 5 and $\omega^k$ coincides with a corner of $M_k$ then, as we saw above [see the comments following (7.29)], $f(\omega^k) = \beta(M_k)$, hence $\omega^k$ solves (SBCP). Intuitively this suggests that convergence would be faster if the subdivision tends to bring $\omega^k$ more rapidly to a corner of $M_k$, i.e., if $\omega$-subdivision is used instead of bisection as in certain similar rectangular algorithms (see, e.g., Kalantari and Rosen 1987). In fact, although bisection also guarantees convergence (in view of its exhaustiveness, by Corollary 6.2), for (SBCP) it seems to be generally less efficient than $\omega$-subdivision. Also note that, just as with conical or simplicial subdivisions, rectangular procedures using $\omega$-subdivision do not need any special device to ensure convergence.
7.2 Concave Minimization Under Convex Constraints
Consider the following General Concave Minimization Problem:
$$(\mathrm{GCP})\qquad \min\{f(x) \mid x \in D\}, \qquad (7.32)$$
where $f: \mathbb{R}^n \to \mathbb{R}$ is a concave function and $D = \{x \mid g(x) \le 0\}$ is a compact convex set defined by the convex function $g: \mathbb{R}^n \to \mathbb{R}$. Since $f(x)$ is continuous and $D$ is compact, an optimal solution of (GCP) exists.

Using the outer approximation (OA) approach described in Chap. 5 we will reduce (GCP) to solving a sequence of (BCP)s. First it is not hard to check that the conditions A1 and A2 required in the Prototype OA Algorithm are fulfilled. Let $G = D$, let $\Omega$ be the set of optimal solutions to (GCP), and let $\mathcal{P}$ be the collection of polyhedrons that contain $D$. Since $G = D$ is a compact convex set, it is easily seen that Condition A2 is fulfilled by taking $l(x)$ to be an affine function strictly separating $z$ from $G$. The latter function $l(x)$ can be chosen in a variety of different ways. For instance, according to Corollary 6.3, if $g(x)$ is differentiable then to separate $x^k = x(P_k)$ from $G = D$ one can take
$$l_k(x) = \langle \nabla g(x^k), x - x^k\rangle + g(x^k) \qquad (7.33)$$
[so $y^k = x^k$, $\beta_k = g(x^k)$ in (6.12)]. If an interior point $x^0$ of $D$ is readily available, the following more efficient cut (Veinott 1967) should be used instead:
$$l_k(x) = \langle p^k, x - y^k\rangle + g(y^k), \qquad y^k \in [x^0, x^k] \setminus \mathrm{int}\,G, \quad p^k \in \partial g(y^k). \qquad (7.34)$$
Thus, to apply the outer approximation method to (GCP) it only remains to define a distinguished point $x^k := x(P_k)$ for every approximating polytope $P_k$ so as to fulfill Condition A1.
7.2.1 Simple Outer Approximation
In the simple outer approximation method for solving (GCP), $x^k = x(P_k)$ is chosen to be a vertex of $P_k$ minimizing $f(x)$ over $P_k$. In other words, denoting the vertex set of $P_k$ by $V_k$, one defines
$$x^k \in \mathrm{argmin}\{f(x) \mid x \in V_k\}. \qquad (7.35)$$
Since $f(x^k) \le \min f(D)$, it is obvious that $\bar x = \lim_{k \to \infty} x^k \in D$ implies that $\bar x$ solves (GCP), so Condition A1 is immediate.

A practical implementation of the simple OA method thus requires an efficient procedure for computing the vertex set $V_k$ of $P_k$. At the beginning, $P_1$ can be taken to be a simple polytope (most often a simplex) with a readily available vertex set $V_1$. Since $P_k$ is obtained from $P_{k-1}$ by adding a single linear constraint, all we need is a subroutine for solving the following subproblem of on-line vertex enumeration:

Given the vertex set $V_k$ of a polytope $P_k$ and a linear inequality $l_k(x) \le 0$, compute the vertex set $V_{k+1}$ of the polytope
$$P_{k+1} = P_k \cap \{x \mid l_k(x) \le 0\}.$$
Denote
$$V_k^- = \{v \in V_k \mid l_k(v) < 0\}, \qquad V_k^+ = \{v \in V_k \mid l_k(v) > 0\}.$$

Proposition 7.6 $V_{k+1} = V_k^- \cup W$, where $w \in W$ if and only if it is the intersection point of the hyperplane $l_k(x) = 0$ with an edge of $P_k$ joining an element of $V_k^+$ with an element of $V_k^-$.

Proof. The inclusion $V_k^- \subset V_{k+1}$ is immediate from the fact that a point of a polyhedron of dimension $n$ is a vertex if and only if it satisfies as equalities $n$ linearly independent constraints (Corollary 1.17). To prove that $W \subset V_{k+1}$, let $w$ be a point of an edge $[v, u]$ joining $v \in V_k^+$ with $u \in V_k^-$. Then $w$ must satisfy as equalities $n - 1$ linearly independent constraints of $P_k$ (Corollary 1.17). Since all these constraints are binding for $v$ while $l_k(v) > 0$ (because $v \in V_k^+$), the inequality $l_k(x) \le 0$ cannot be a linear combination of these constraints. Hence $w$ satisfies $n$ linearly independent constraints defining $P_{k+1}$, i.e., $w \in V_{k+1}$. Conversely, if $w \in V_{k+1}$ then either $l_k(w) < 0$ (i.e., $w \in V_k^-$), or else $l_k(w) = 0$. In the latter case, $n - 1$ linearly independent constraints of $P_k$ must be binding for $w$, so $w$ is the intersection of the edge determined by these $n - 1$ constraints with the hyperplane $l_k(x) = 0$. Obviously, one endpoint of this edge must be some $v \in V_k$ such that $l_k(v) > 0$. □
Assume now that the polytopes $P_k$ and $P_{k+1}$ are nondegenerate and that for each vertex $v \in V_k$ the list $N(v)$ of all its neighbors (adjacent vertices), together with the index set $I(v)$ of all binding constraints for $v$, is available. Also assume that $|V_k^+| \le |V_k^-|$ (otherwise, interchange the roles of these sets). Based on Proposition 7.6, $V_{k+1} = V_k^- \cup W$, where $W$ (the set of new vertices) can be computed as follows (Chen et al. 1991):

On-Line Vertex Enumeration Procedure

Step 1. Set $W \leftarrow \emptyset$ and for all $v \in V_k^+$:
  for all $u \in N(v) \cap V_k^-$:
    compute $\lambda \in (0, 1)$ such that $l_k(\lambda v + (1 - \lambda)u) = 0$;
    set $w = \lambda v + (1 - \lambda)u$; $N(w) = \{u\}$; $N(u) \leftarrow (N(u) \setminus \{v\}) \cup \{w\}$; $I(w) = (I(v) \cap I(u)) \cup \{\nu\}$, where $\nu$ denotes the index of the new constraint;
    set $W \leftarrow W \cup \{w\}$.

Step 2. For all $v \in W$, $u \in W$:
  if $|I(u) \cap I(v)| = n - 1$ then set $N(v) \leftarrow N(v) \cup \{u\}$, $N(u) \leftarrow N(u) \cup \{v\}$.
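A compact Python sketch of this procedure for the nondegenerate case (dictionaries `V`, `N`, `I` keyed by integer vertex ids; `lk` the affine cut function; `nu` the index of the new constraint; all names are illustrative):

```python
import numpy as np
from itertools import combinations

def online_vertex_update(V, N, I, lk, nu, n, tol=1e-9):
    """One on-line vertex enumeration step. V: id -> coordinates;
    N: id -> set of neighbor ids; I: id -> set of binding-constraint indices."""
    lv = {i: lk(x) for i, x in V.items()}
    plus = [i for i in V if lv[i] > tol]
    W = []
    for v in plus:                               # Step 1
        for u in [u for u in N[v] if u in lv and lv[u] < -tol]:
            lam = lv[u] / (lv[u] - lv[v])        # l_k(lam*v + (1-lam)*u) = 0
            w = max(V) + 1
            V[w] = lam * V[v] + (1 - lam) * V[u]
            N[w] = {u}; N[u] = (N[u] - {v}) | {w}
            I[w] = (I[v] & I[u]) | {nu}
            W.append(w)
    for v, u in combinations(W, 2):              # Step 2: link new vertices
        if len(I[v] & I[u]) == n - 1:
            N[v].add(u); N[u].add(v)
    for v in plus:                               # drop the cut-off vertices
        del V[v], N[v], I[v]
    return W
```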
In the context of outer approximation, if $P_k$ is nondegenerate then $P_{k+1}$ can be made nondegenerate by slightly perturbing the hyperplane $l_k(x) = 0$ to prevent it from passing through any vertex of $P_k$. In this way, by choosing $P_1$ to be nondegenerate, one can, at least in principle, arrange to secure nondegeneracy during the whole process. Earlier variants of on-line vertex enumeration procedures have been developed by Falk and Hoffman (1976), Thieu et al. (1983), Horst et al. (1988), and Thieu (1989).

Remark 7.7 If an interior point $x^0$ of $D$ is available, the point $y^k \in [x^0, x^k] \cap \partial D$ is feasible and $\bar x^k \in \mathrm{argmin}\{f(y^1), \ldots, f(y^k)\}$ is the incumbent at iteration $k$. Therefore, a global $\varepsilon$-optimal solution is obtained when $f(\bar x^k) \le f(x^k) + \varepsilon$, which must occur after finitely many iterations. The corresponding algorithm was first proposed by Hoffman (1981). A major difficulty with simple OA methods for global optimization is the rapid growth of the set $V_k$, which creates serious storage problems and often makes the computation of $V_{k+1}$ a formidable task.
Remark 7.8 When $D$ is a polyhedron, $g(x) = \max_{i=1,\ldots,m}(\langle a^i, x\rangle - b_i)$, so if $x^k \notin D$, then $\langle a^{i_k}, x^k\rangle - b_{i_k} > 0$ for $i_k \in \mathrm{argmax}\{\langle a^i, x^k\rangle - b_i \mid i = 1, \ldots, m\}$, and $\nabla g(x^k) = a^{i_k}$. Hence the cut (7.33) is $l_k(x) = \langle a^{i_k}, x\rangle - b_{i_k} \le 0$, and thus coincides with the constraint most violated by $x^k$. Therefore, for (BCP) a simple OA method produces an exact optimal solution after finitely many iterations. Despite this and several other positive features, the uncontrolled growth of $V_k$ is a handicap which prevents simple OA methods from competing favorably with branch and bound methods, except on problems of small size. In a subsequent section a dual version of OA methods, called polyhedral annexation, will be presented which offers restart and multistart possibilities, and hence can perform significantly better than simple OA procedures in many circumstances. Let us also mention a method for solving (BCP) via collapsing polytopes proposed by Falk and Hoffman (1986) which can be interpreted as an OA method with the attractive feature that at each iteration only $n + 1$ new vertices are to be computed for $P_k$. Unfortunately, strong requirements on nondegeneracy limit the applicability of this ingenious algorithm.
7.2.2 Combined Algorithm
By replacing $f(x)$, if necessary, with $\min\{\eta, f(x)\}$ where $\eta > 0$ is sufficiently large, we may assume that the upper level sets of the concave function $f(x)$ are bounded. According to (7.35), the distinguished point $x^k$ of $P_k$ in the above OA scheme is a global optimal solution of the relaxed problem
$$(\mathrm{SP}_k)\qquad \min\{f(x) \mid x \in P_k\}.$$
The determination of $x^k$ in (7.35) involves the computation of the vertex set $V_k$ of $P_k$. However, as said in Remark 7.7, a major difficulty of this approach is that, for many problems, the set $V_k$ may grow very rapidly and attain a prohibitively large size. To avoid computing $V_k$, an alternative approach is to compute $x^k$ by using a BB algorithm, e.g., the BB conical procedure, for solving $(\mathrm{SP}_k)$. A positive feature of this approach that can be exploited to speed up the convergence is that the relaxed problem $(\mathrm{SP}_k)$ differs from $(\mathrm{SP}_{k-1})$ only by one single additional linear constraint. Due to this fact, it does not pay to solve each of these problems to optimality. Instead, it may suffice to execute only a restricted number of iterations (even one single iteration) of the conical procedure for $(\mathrm{SP}_k)$, and take $x^k = \omega(M_k)$, where $M_k$ is the current candidate for branching. At the next iteration $k + 1$, the current conical partition in the conical procedure for $(\mathrm{SP}_k)$ may be used to start the conical procedure for $(\mathrm{SP}_{k+1})$. This idea of combining OA with BB concepts leads to the following hybrid procedure (Fig. 7.4).

Combined OA/BB Conical Algorithm for (GCP)

Initialization. Take an $n$-simplex $[s^1, \ldots, s^{n+1}]$ inscribed in $D$ and an interior point $x^0$ of this simplex. For each $i = 1, \ldots, n + 1$ compute the intersection $\bar s^i$ of the halfline from $x^0$ through $s^i$ with the boundary of $D$. Let $\bar x^1 \in \mathrm{argmin}\{f(\bar s^1), \ldots, f(\bar s^{n+1})\}$, $\gamma_1 = f(\bar x^1)$. Let $M_i$ be the cone with vertex at $x^0$ and $n$ edges passing through $\bar s^j$, $j \ne i$, $\mathcal{N}_1 = \{M_i, i = 1, \ldots, n + 1\}$, $\mathcal{P}_1 = \mathcal{N}_1$. (For any subcone of a cone $M \in \mathcal{P}_1$ its base is taken to be the simplex spanned by the $n$ intersection points of the edges of $M$ with the boundary of $D$.) Let $P_1$ be an initial polytope enclosing $D$. Set $k = 1$.
Fig. 7.4 Combined algorithm
Step 1. (Bounding) For each cone $M \in \mathcal{P}_k$ of base $[u^1, \ldots, u^n]$ compute the point $x^0 + \theta_i u^i$ where the $i$th edge of $M$ meets the surface $f(x) = \gamma_k$. Let $U$ be the matrix of columns $u^1, \ldots, u^n$. Solve the linear program
$$\mathrm{LP}(P_k, M)\qquad \max\Big\{\sum_{i=1}^n \frac{t_i}{\theta_i} \,\Big|\, x^0 + Ut \in P_k,\ t \ge 0\Big\}$$
to obtain the optimal value $\mu(M)$ and a basic optimal solution $t(M)$. Let $\omega(M) = x^0 + Ut(M)$. If $\mu(M) > 1$, compute $z^i = x^0 + \mu(M)\theta_i u^i$, $i = 1, \ldots, n$, and
$$\beta(M) = \min\{f(z^1), \ldots, f(z^n)\}. \qquad (7.36)$$

Step 2. (Pruning) Delete every cone $M \in \mathcal{N}_k$ such that $\mu(M) \le 1$ or $\beta(M) \ge \gamma_k - \varepsilon$. Let $\mathcal{R}_k$ be the collection of remaining members of $\mathcal{N}_k$.

Step 3. (Termination criterion 1) If $\mathcal{R}_k = \emptyset$, terminate: $\bar x^k$ is a global $\varepsilon$-optimal solution.

Step 4. (Termination criterion 2) Select $M_k \in \mathrm{argmin}\{\beta(M) \mid M \in \mathcal{R}_k\}$. If $\omega^k := \omega(M_k) \in D$, terminate: $\omega^k$ is a global optimal solution.

Step 5. (Branching) Subdivide $M_k$ via the halfline from $x^0$ through $\omega^k$, obtaining the partition $\mathcal{P}_{k+1}$ of $M_k$.

Step 6. (Outer approximation) Compute $p^k \in \partial g(\bar\omega^k)$, where $\bar\omega^k$ is the intersection of the halfline from $x^0$ through $\omega^k$ with the boundary of $D$, and define
$$P_{k+1} = P_k \cap \{x \mid \langle p^k, x - \bar\omega^k\rangle + g(\bar\omega^k) \le 0\}.$$

Step 7. (Updating) Let $\bar x^{k+1}$ be the best of the feasible points $\bar x^k, \bar\omega^k$. Let $\gamma_{k+1} = f(\bar x^{k+1})$, $\mathcal{N}_{k+1} = (\mathcal{R}_k \setminus \{M_k\}) \cup \mathcal{P}_{k+1}$. Set $k \leftarrow k + 1$ and go back to Step 1.
Theorem 7.5 The Combined Algorithm can be infinite only if $\varepsilon = 0$, and in that case every accumulation point of the sequence $\{\bar x^k\}$ is a global optimal solution of (GCP).

Proof. First observe that for any $k$ we have $\beta(M_k) \le \min f(D)$ while $f(\omega^k) \ge \min\{f(z^{k1}), \ldots, f(z^{kn})\} = \beta(M_k)$. Therefore, when Step 4 occurs, $\omega^k$ is effectively a global optimal solution. We now show that for $\varepsilon > 0$, if Step 4 never occurs, the current best value $\gamma_k$ decreases by at least a quantity $\varepsilon$ after finitely many iterations. In fact, suppose that $\gamma_k = \gamma_{k_0}$ for all $k \ge k_0$. Let $C = \{x \in \mathbb{R}^n \mid f(x) \ge \gamma_{k_0}\}$ and by $z^{ki}$ denote the intersection point of the $i$th edge of $M_k$ with $\partial C$. Let $\mu_k = \mu(M_k)$ and by $q^k$ denote the intersection point of the halfline from $x^0$ through $\omega^k$ with the simplex $[z^{k1}, \ldots, z^{kn}]$. By Theorem 6.2 there exist a subsequence $\{k_s\} \subset \{1, 2, \ldots\}$ and a point $q \in \partial C$ such that $q^{k_s} \to q$ as $s \to +\infty$. Moreover, $\mu_{k_s} \to 1$ as $s \to +\infty$ [see (7.22) in Remark 7.3]. Since $\omega^{k_s} - x^0 = \mu_{k_s}(q^{k_s} - x^0)$ it follows that $\omega^{k_s} \to q \in \partial C$, hence $f(\omega^{k_s}) \to f(q) = \gamma_{k_0}$ as $s \to +\infty$. But $f(\omega^k) \ge \beta(M_k)$ $\forall k$, hence $\beta(M_{k_s}) \le f(\omega^{k_s}) \to \gamma_{k_0}$ as $s \to +\infty$. Therefore, for $s$ sufficiently large we will have $\beta(M_{k_s}) \ge \gamma_{k_0} - \varepsilon = \gamma_{k_s} - \varepsilon$, and then the algorithm stops by the criterion in Step 3, yielding $\gamma_{k_0}$ as the global $\varepsilon$-optimal value and $\bar x^{k_s}$ as a global $\varepsilon$-optimal solution. Thus, if $\gamma_{k_0}$ is not yet the global $\varepsilon$-optimal value, we must have $\gamma_k \le \gamma_{k_0} - \varepsilon$ for some $k > k_0$. Since $\min f(D) > -\infty$, this proves the finiteness of the algorithm. □

Remark 7.9 If for solving $(\mathrm{SP}_k)$ the CS Algorithm is used instead of the BB algorithm, then we have a Combined OA/CS Conical Algorithm for (GCP). In Step 1 of the latter algorithm the values $\theta_i$ are computed from the condition $f(x^0 + \theta_i u^i) = \gamma_k - \varepsilon$, and there is no need to compute $\beta(M)$, while in Step 2 a cone $M$ is deleted if $\mu(M) \le 1$. The OA/CS Algorithm involves fewer function evaluations than the OA/BB Algorithm.

Combined algorithms of the above type were introduced for general global optimization problems in Tuy and Horst (1988) under the name of restart BB Algorithms. A related technique of combining BB with OA developed in Horst et al. (1991) leads essentially to the same algorithm in the case of (GCP).
7.3 Reverse Convex Programming

A reverse convex inequality is an inequality of the form

h.x/  0; (7.37)

where h W Rn ! R is a convex function. When such an inequality is added to


the constraints of a linear program minfcxjAx  b; x  0g, the latter becomes a
complicated nonconvex global optimization problem

.LRC/ minimize cx (7.38)


subject to Ax  b; x0 (7.39)
h.x/  0: (7.40)
190 7 DC Optimization Problems

Fig. 7.5 Reverse convex


constraint
D

c, x =
h(x) 0

The difficulty is brought on by the additional reverse convex constraint which


destroys the convexity and sometimes the connectedness of the feasible set
(Fig. 7.5). Even when h.x/ is as simple as a convex quadratic function, testing the
feasibility of the constraints (7.39)(7.40) is already NP-hard. This is no surprise
since (BCP), which is NP-hard, can be written as a special (LRC):

minftj Ax  b; x  0; t  f .x/  0g:


Reverse convex constraints occur in many problems of mechanics, engineering
design, economics, control theory, and other fields. For instance, in engineering
design a control variable x may be subject to constraints of the type kx  dk 
.; i:e:; h.x/ WD kx  dk   0/, or hpi ; xi  i for at least one i 2 I .i:e:; h.x/ D
max.hpi ; xi  i /  0/. In combinatorial optimization a discrete constraint such as
i2I
i 2 f0; 1g for all i 2 I amounts to requiring that 0  xi  1; i 2 I, and h.x/ WD
xP
i2I xi .xi  1/  0. Also, in bilevel linear programming problems, where the
variable is x D .y; z/ 2 Rp Rq , a constraint of the form z 2 argmaxfdzjPyCQz  sg
can be expressed as h.x/ WD '.y/  dz  0, where '.y/ WD maxfdzjQz  Py  sg is
a convex function of y.1

7.3.1 General Properties

Setting D D fxjAx  b; x  0g, C D fxjh.x/  0g, we can write the problem as


.LRC/ W minfcxj x 2 D n intCg: (7.41)

1
In the above formulation the function h.x/ is assumed to be finite throughout Rn I if this condition
may not be satisfied (as in the last example) certain results below must be applied with caution.
7.3 Reverse Convex Programming 191

Here D is a polyhedron and C is a closed convex set. For the sake of simplicity we
will assume D nonempty and C bounded. Furthermore, it is natural to assume that

minfcxjx 2 Dg < minfcxjx 2 D n intCg: (7.42)

which simply means that the reverse convex constraint h.x/  0 is essential, so that
(LRC) does not reduce to the trivial underlying linear program. Thus, if w is a basic
optimal solution of the linear program then

w 2 D \ intC and cw < cx; 8x 2 D n intC: (7.43)

Proposition 7.7 For any z 2 D n C there exists a better feasible solution y lying on
the intersection of the boundary @C of C with an edge E of the polyhedron D.
Proof We show how to compute y satisfying the desired conditions. If p 2 @h.z/
then the polyhedron fx 2 Dj hp; xziCh.z/ D 0g is contained in the feasible set, and
a basic optimal solution z1 of the linear program minfcxj x 2 D; hp; xziCh.z/ D 0g
will be a vertex of the polyhedron P D fx 2 Dj cx  cz1 g. But in view of (7.43) z1
is not optimal to the linear program minfcxj x 2 Pg. Therefore, starting from z1 and
using simplex pivots for solving the latter program we can move from a vertex of D
to a better adjacent vertex until we reach a vertex u with an adjacent vertex v such
that h.u/  0 but h.v/ < 0. The intersection of @C with the line segment u; v will
be the desired y. t
u
Corollary 7.3 (Edge Property) If (LRC) is solvable then at least one optimal
solution lies on the intersection of the boundary of C with an edge of D.
If E denotes the collection of all edges of D and for every E 2 E , .E/ is the set
of extreme points of the line segment E \ C, then it follows from Corollary 7.3
(Hillestad and Jacobsen 1980b) that an optimal solution must be sought only among
the finite set
[
XD f.E/j E 2 E g: (7.44)

We next examine how to recognize an optimal solution. Problem (LRC) is said to


be regular (stable) if for every feasible point x there exists, in every neighborhood
of x, a feasible point x0 such that h.x0 / > 0. Since by slightly moving x0 towards an
interior point of D, if necessary, one can make x0 2 intD, this amounts to saying that
the feasible set S D D n intC satisfies S D cl.intD/, i.e., it is robust. For example,
the problem depicted in Fig. 7.5 is regular, but the one in Fig. 7.6 is not (the point
x D opt which is an isolated feasible point does not satisfy the required condition).
By specializing Proposition 5.3 to (LRC) we obtain
Theorem 7.6 (Optimality Criterion) In order that a feasible point x be a global
optimal solution it is necessary that

fx 2 Dj cx  cxg  C: (7.45)
192 7 DC Optimization Problems

Fig. 7.6 Nonregular problem

x C opt

This condition is also sufficient if the problem is regular. t


u
For any real number let D. / D fx 2 Dj cx  g. Since D. /  D. 0 / whenever
 0 , it follows from the above theorem that under the regularity assumption , the
optimal value in (LRC) is

inff j D. / n C ;g: (7.46)

Also note that by letting D cx condition (7.45) is equivalent to

maxfh.x/j x 2 D. /g  0; (7.47)

where the maximization problem on the left-hand side is a (BCP) since h.x/
is convex. In the general case when regularity is not assumed, condition (7.45)
[i.e. (7.47)] may hold without x being a global optimal solution, as illustrated in
Fig. 7.6.
Fortunately, a slight perturbation of (LRC) can always make it regular, as shown
in the following:
Proposition 7.8 For " > 0 sufficiently small, the problem

minimize cx subject to x 2 D; h.x/ C ".kxk2 C 1/  0 (7.48)

is regular and its optimal value tends to the optimal value of (LRC) as " ! 0.
Proof. Since the vertex set V of D is finite, there exists "0 > 0 so small that " 2
.0; "0 / implies that h" .x/ WD h.x/ C ".kxk2 C 1/ 0; 8x 2 V. Indeed, if V1 D fx 2
Vjh.x/ < 0g and "0 satisfies h.x/ C "0 .kxk2 C 1/ < 0; 8x 2 V1 , then whenever
0 < " < "0 we have h.x/ C ".kxk2 C 1/ < 0; 8x 2 V1 , while h.x/ C ".kxk2 C 1/ 
0 C " > 0; 8x 2 V n V1 . Now consider problem (7.48) for 0 < " < "0 and let x 2 D
be such that h" .x/  0. If x V then x is the midpoint of some line segment   D,
and since the function h" .x/ is strictly convex, any neighborhood of x must contain
a point x0 of  such that h" .x0 / > 0. On the other hand, if x 2 V, then h" .x/ > 0,
and any point x0 of D sufficiently near to x will satisfy h" .x0 / > 0. Thus, given any
7.3 Reverse Convex Programming 193

feasible solution x of problem (7.48) there exists a feasible solution x0 arbitrarily


close to x such that h" .x0 / > 0. This means that problem (7.48) is regular. If x"
is a global optimal solution of (7.48) then obviously cx"  WD minfcxjx 2 D;
h.x/  0g and since h" .x" /  0 and h" .x" /  h.x" / ! 0 as " ! 0, if x is any cluster
point of fx" g then h.x/  0, i.e., x is feasible to (LRC), hence cx D . t
u
Note that the function h" .x/ in the above perturbed problem (7.48) is strictly
convex. This additional feature may sometimes be useful.
An alternative approach for handling nonregular problems is to replace the
original problem by the following:

.LRC" / minfcxj x 2 D; h.x/  "g: (7.49)

A feasible solution x" to the latter problem such that

cx"  minfcxj x 2 D; h.x/  0g (7.50)

is called an "-approximate optimal solution to (LRC). The next result shows that for
" > 0 sufficiently small an "-approximate optimal solution is as close to an optimal
solution as desired.
Proposition 7.9 If for each " > 0, x" is an "- approximate optimal solution of
(LRC), then any accumulation point x of the sequence fx" g solves (LRC).
Proof Clearly h.x/  0, hence x is feasible to (LRC). For any feasible solution x of
(LRC), the condition (7.50) implies that cx  minfcxj x 2 D; h.x/  0g, hence x is
an optimal solution of (LRC). t
u
Observe that if C" D fxj h.x/  "g then a feasible solution x" to the
problem (7.49) such that

fx 2 Dj cx  cx" g  C" (7.51)

is an "-approximate optimal solution of (LRC). This together with Proposition 7.7


suggests that, within a tolerance " > 0 (so small that still x0 2 intC" /, transcending
an incumbent value in (LRC) reduces to solving the following problem, analogous
to problem (#) for (BCP) in Sect. 7.1, Subsect. 7.1.3:

(\) Given two closed convex sets D. / D fx 2 Dj cx  g and C


D fxj h.x/ 

g (0 <
< "), find a point z 2 D. / n C" or else prove that D. /  C
.

Note that here C"  C


. By Proposition 7.7, from a point z 2 D. / n C" one
can easily derive a feasible point y to (LRC" ) with cy < I on the other hand, if
D. /  C
, then is an "-approximate optimal value because 0 <
< ".
194 7 DC Optimization Problems

7.3.2 Solution Methods

Thus the search for an "-approximate optimal solution of (LRC) reduces to solving
the parametric dc feasibility problem

inff j D. / n C" ;g: (7.52)

analogous to (7.5) for (BCP). This leads to a method for solving (LRC) which is
essentially the same as that used for solving (BCP).
In the following X" denotes the set analogous to the set X in (7.44) but defined
for problem (LRC" ) instead of (LRC).
CS (Cone Splitting) Restart Algorithm for (LRC)
Initialization Solve the linear program minfcxj x 2 Dg to obtain a basic optimal
solution w. If h.w/  ", terminate: w is an "-approximate optimal solution.
Otherwise:
(a) If a vertex z0 of D is available such that h.z0 / > ", go to Phase 1.
(b) Otherwise, set D 1; D. / D D, and go to Phase 2.

Phase 1. Starting from z0 search for a feasible point x 2 X" [see (7.44)]. Let
D cx; D. / D fx 2 Dj cx  g. Go to Phase 2.
Phase 2. Call Procedure DC to solve problem (\) (i.e., apply Procedure DC in
Sect. 6.1 with D. /; C" and C
in place of D; C; C" resp.).
(1) If a point z 2 D. / n C" is obtained, set z0 z and go to Phase 1.
(2) If D. /  C
then terminate: x is an "-approximate optimal solution
(when < C1/, or the problem is infeasible (when D C1/.

Fig. 7.7 The FW/BW z1


algorithm x0
x1 c, x =

z2
C
x2

Remark 7.10 In Phase 1, the point x 2 X can be computed by performing a number


of simplex pivots as indicated in the proof of Proposition 7.7. In Phase 2, the initial
cone of Procedure DC is the canonical cone associated with the vertex x0 D w
of D (w is also a vertex of the polyhedron D. //. Also note that " > 0, whereas
7.3 Reverse Convex Programming 195

the algorithm does not assume regularity of the problem. However, if the problem is
regular, then the algorithm with
D " D 0 will find an exact global optimal solution
after finitely many steps, though may require infinitely many steps to identify it as
such.
Since the current feasible solution moves in Phase I forward from the region
h.x/ > 0 to the frontier h.x/ D 0, and in Phase 2 backward from this frontier to
the region h.x/ > 0, the algorithm is also called the forwardbackward (FW/BW)
algorithm (Fig. 7.7).
Example 7.2

Minimize 2x1 C x2
subject to x1 C x2  10
x1 C 2x2  8
x1  x2  4
2x1  3x2  6
x1 ; x2  0
x21  x1 x2 C x22  6x1  0:

The optimal solution x D .4:3670068I 5:6329932/ is obtained after one cycle


(Fig. 7.8).

opt

w
2

0 3 4

Fig. 7.8 An example

Just as the CS Algorithm for (BCP) admits of a BB version, the CS Algorithm


for (LRC) can be transformed into a BB Algorithm by using lower bounds to define
the selection operation in Phase 2 .
Let M0 be the canonical cone associated with the vertex x0 D w of D, and let
z ; i D 1; : : : ; n, be the points where the ith edge of M0 meets the boundary @C of
i0

the closed convex set C D fxj h.x/  0g. Consider a subcone M of M0 , of base
196 7 DC Optimization Problems

S D z1 ; : : : ; zn   S0 . Let yi D x0 C i ui be the point where the ith edge of M meets


@C and let U be the nonsingular n  n matrix of columns u1 ; : : : ; un . Let AQ and bQ be
defined as in (7.20).
Proposition 7.10 A lower bound for minfcxj x 2 D \ M; h.x/  0g is given by the
optimal value .M/ of the linear program
( )
Xn
t
0 i
min hc; x C Utij AUt Q  b; Q  1; t  0 : (7.53)

iD1 i

Furthermore, if a basic optimal solution t.M/ of this linear program is such that
!.M/ D x0 C Ut.M/ lies on some edge of M then !.M/ 2 S, and hence .M/ D
minfcxj x 2 M \ Sg, where S D D n intC is the feasible set of (LRC).
P
Proof Clearly x 2 M n intC if and only if x D x0 C Ut; t  0; niD1 tii  1. Since
M \ S  .M \ D/ n intC, a lower bound of minfcxj x 2 M \ Sg is given by the
optimal value .M/ of the linear program
Xn
ti
minfhc; x0 C Utij x0 C Ut 2 D; t  0;  1g (7.54)

iD1 i

which is nothing but (7.53). To prove the second part of the proposition, observe
that if !.M/ lies on the ith edge it must satisfy !.M/ D x0 C .yi  x0 / for some
  1 and since x0 2 D; !.M/ 2 D it follows by convexity of D that yi 2 D, hence
yi 2 S. In view of assumption (7.42) this implies that cx0 < cyi , i.e., c.yi  x0 / > 0,
and hence hc; !.M/  yi i D hc; .  1/yi i > 0 if  > 1. Since c!.M/  cyi from
the definition of !.M/ we must have  D 1, i.e., !.M/ 2 D n intC. t
u
As a consequence of this proposition, when !.M/ is not feasible to (LRC) then
it does not lie on any edge of M, so the subdivision of M upon the halfline from
x0 through !.M/ is proper. Also, if all yi ; i D 1; : : : ; n, belong to D, then !.M/ is
decidedly feasible (and is one of the yi ), hence .M/ is the exact minimum of cx over
the feasible portion in M and M will be fathomed in the branch and bound process.
Thus, in a conical procedure using the above lower bounding, only those cones
will actually remain for consideration which contain boundary points of the feasible
set. In other words, the search will concentrate on regions which are potentially of
interest.
In the next algorithm, let "  0 be the tolerance.
Conical BB Algorithm for (LRC)
Initialization Solve the linear program minfcxj x 2 Dg to obtain a basic optimal
solution w. If h.w/  ", terminate: w is an "-approximate optimal solution.
Otherwise:
(a) If a vertex z0 of D is available such that h.z0 / > ", go to Phase 1.
(b) Otherwise, set 1 D 1; D. 1 / D D, and go to Phase 2.
7.3 Reverse Convex Programming 197

Phase 1. Starting from z0 search for a point x1 2 X" [see (7.44)]. Let 1 D
cx1 ; D. 1 / D fx 2 Dj cx  1 g. Go to Phase 2.
Phase 2 Let M0 be the canonical cone associated with the vertex x0 D w of D, let
N1 D fM0 g; P1 D fM0 g. Set k D 1.
Step 1. (Bounding) For each cone M 2 Pk whose base is x0 C u1 ; : : : ; x0 C un 
and whose edges meet @C at points yi D x0 C i ui ; i D 1; : : : ; n, solve
the linear program (7.53), where U denotes the matrix of columns ui D
yi  x0 ; i D 1; : : : ; n. Let .M/; t.M/ be the optimal value and a basic
optimal solution of this linear program, and let !.M/ D x0 C Ut.M/.
Step 2. (Options) If some !.M/; M 2 Pk , are feasible to (LRC" ) (7.49) and
satisfy hc; !.M/i < k , then reset k equal to the smallest among these
values and xk arg k (i.e., the x such that hc; xi D k ). Go to Step 3 or
(optionally) return to Phase 1.
Step 3. (Pruning) Delete every cone M 2 Nk such that .M/  k and let Rk be
the collection of remaining cones.
Step 4. (Termination Criterion) If Rk D ;, then terminate:
(a) If k < C1, then xk is an "-approximate optimal solution.
(b) Otherwise, (LRC) is infeasible.
Step 5. (Branching) Select Mk 2 argminf.M/j M 2 Rk g. Subdivide Mk along
the halfline from x0 through ! k D !.Mk /. Let PkC1 be the partition of Mk .
Step 6. (New Net) Set Nk .Rk n fMk g/ [ PkC1 , k k C 1 and return to
Step 1.
Theorem 7.7 The above algorithm can be infinite only if " D 0, and then any
accumulation point of each of the sequences f! k g; fxk g is a global optimal solution
of (LRC).
Proof. Since the set X" [see (7.44)] is finite, it suffices to show that Phase 2 is
finite, unless " D 0. If Phase 2 is infinite it generates at least a filter fMk ; k 2 Kg.
Denote by yki ; i D 1; : : : ; n the intersections of the edges of Mk with @C, by qk
the intersection of the halfline from x0 through ! k WD !.Mk / with the hyperplane
through yk1 ; : : : ; ykn . By Theorem 6.2, there exists at least one accumulation point
of fqk g, say q D lim qk .k ! 1; k 2 K1  K/, such that q 2 @C. We may assume
that ! k ! ! .k ! 1; k 2 K1 /. But, as a result of Steps 2 and 3, ! k 2 intC" , hence
! 2 @C" . On the other hand, qk 2 x0 ; ! k , hence q 2 x0 ; !. This is possible only if
" D 0 and ! D q 2 @C, so that ! is feasible. Since, furthermore, c! k  minfcxj x 2
D n intCg; 8k, it follows that !, and hence any accumulation point of the sequence
fxk g is a global optimal solution. t
u
Remark 7.11 It should be emphasized that the above algorithm does not require
regularity of the problem, even when " D 0. However, in the latter case, since every
! k is infeasible, the procedure may approach a global optimal solution through a
sequence of infeasible points only. This situation is illustrated in Fig. 7.9, where
198 7 DC Optimization Problems

y02

opt
3
2
1
y01

Fig. 7.9 Convergence through infeasible points

an isolated global optimal solution of a nonregular problem is approached by


a sequence of infeasible points ! 1 ; ! 2 ; : : : A conical algorithm for (LRC) using
bisections throughout was originally proposed by Muu (1985).

7.4 Canonical DC Programming

In its general formulation (cf. Sect. 5.3) the Canonical DC Programming problem
is to find
.CDC/ minff .x/j g.x/  0  h.x/g; (7.55)

where f ; g; h W Rn ! R are convex functions. Clearly (LRC) is a special case of


(CDC) when the objective function f .x/ is linear.
Proposition 7.11 Any dc optimization problem can be reduced to the canonical
form (CDC) at the expense of introducing at most two additional variables.
Proof An arbitrary dc optimization problem
minff1 .x/  f2 .x/j gi;1 .x/  gi;2 .x/  0; i D 1; : : : ; mg (7.56)
can be rewritten as
minff1 .x/  zj z  f2 .x/  0; g.x/  0; i D 1; : : : ; mg;

where the objective function is convex (in x; z) and g.x/ D maxiD1;:::;m gi;1 .x/ 
gi;2 .x/ is a dc function (cf. Proposition 4.1). Next the dc inequality g.x/ D p.x/ 
q.x/  0 with p.x/; q.x/ convex can be split into two inequalities:

p.x/  t  0; t  q.x/  0; (7.57)


7.4 Canonical DC Programming 199

where the first is a convex inequality, and the second is a reverse convex inequality.
By changing the notation if necessary, one obtains the desired canonical form. u t
Whereas the application of OA technique to .GCP/ is almost straightforward, it
is much less obvious for .CDC/ and requires special care to ensure convergence. By
hypothesis the sets

D WD fxj g.x/  0g; C WD fxj h.x/  0g

are closed and convex. For simplicity we will assume D bounded, although with
minor modifications the results can be extended to the unbounded case.
If an optimal solution w of the convex program minff .x/j g.x/  0g satisfies
h.w/  0, then the problem is solved (w is a global optimal solution). Therefore,
without loss of generality we may assume that there exists a point w satisfying

w 2 intD \ intC; f .x/ > f .w/ 8x 2 D n intC: (7.58)

The next important property is an immediate consequence of this assumption.


Proposition 7.12 (Boundary Property) Every global optimal solution lies
on D \ @C.
Proof Let z0 be any feasible solution. If z0 @C, then the line segment w; z0  meets
@C at a point z1 D .1  /w C z0 such that 0 <  < 1. By convexity we have
from (7.58), f .z1 /  .1  /f .w/ C f .z0 / < f .z0 /, so z1 is a better feasible solution
than z0 . t
u
Just like (LRC), Problem .CDC/ is said to be regular if the feasible set
S D D n intC is robust or, which amounts to the same,

D n intC D cl.D n C/: (7.59)

Theorem 7.6 for (LRC) extends to (CDC):


Theorem 7.8 (Optimality Criterion) In order that a feasible solution x to (CDC)
be global optimal it is necessary that

fx 2 Dj f .x/  f .x/g  C: (7.60)

This condition is also sufficient if the problem is regular.


Proof Analogous to the proof of Proposition 5.3, with obvious modifications (f .x/
instead of l.x/). t
u
200 7 DC Optimization Problems

To exploit the above optimality criterion, it is convenient to introduce the next


concept. Given " > 0 (the tolerance), a vector x is said to be an "-approximate
optimal solution to .CDC/ if

x 2 D; h.x/  "; (7.61)


f .x/  minff .x/j x 2 D; h.x/  0g: (7.62)

Clearly, as "k # 0, any accumulation point of a sequence fxk g of "k -approximate


optimal solutions to .CDC/ yields an exact global optimal solution. Therefore,
in practice one should be satisfied with an "-approximate optimal solution for "
sufficiently small. Denote C" D fxj h.x/  "g; D. / D fx 2 Dj f .x/  g. In
view of (7.58) it is natural to require that

w 2 intD \ intC" ; f .x/ > f .w/ 8x 2 D n intC" : (7.63)

Define D infff .x/j x 2 D; h.x/ > "g and let

G D D. /; D D. / n intC" :

By (7.63) and Proposition 7.12, it is easily seen that coincides with the set of "-
approximate optimal solutions of (CDC), so the problem amounts to searching for
a point x 2 . Denote by P the family of polytopes P in Rn for which there exists
2 ; C1 satisfying G  D. /  P. For every P 2 P define

x.P/ 2 argmaxfh.x/j x 2 Vg; (7.64)

where V is the vertex set of P. Let us verify that Conditions A1 and A2 stated in
Sect. 6.3 are satisfied. Condition A1 is obvious because x.P/ exists and can be
computed provided ;I moreover, since P  G  , we must have h.x.P// 
", so any accumulation point x of a sequence x.Pk /; k D 1; 2; : : :, satisfies h.x/ 
", and hence x 2 whenever x 2 G.
To verify A2, consider any z D x.P/ associated to a polytope P such that D. / 
P for some 2 ; C1. As we just saw, h.z/  ". If h.z/ < 0 then maxfh.x/j x 2
Pg < 0, hence D. /  fxj h.x/ < 0g, which implies that is an "approximate
optimal value (if < C1/ or the problem is infeasible (if D C1/. If h.z/  0
then maxfg.z/; h.z/ C "g > 0 and since maxfg.w/; h.w/ C "g < 0, we can compute
a point y such that

y 2 w; z; maxfg.y/; h.y/ C "g D 0:


7.4 Canonical DC Programming 201

Two cases are possible:


(a) g.y/ D 0 W since g.w/ < 0 this event may occur only if g.z/ > 0 and so z D x.P/
can be separated from G by a cut l.x/ WD hp; x  yi  0 with p 2 @g.y/.
(b) h.y/ D " W then g.y/  0, so y is an "-approximate solution in the sense
of (7.61). Furthermore, since y D .1  /w C z; 0 <  < 1/, it follows that
f .y/  .1  /f .w/ C f .z/ < f .z/, so z D x.P/ can be separated from G by a
cut l.x/ WD hp; x  yi  0 with p 2 @f .y/.
In either case, if we set 0 D minf ; f .y/g then P0 D P \ fxj l.x/  0g  D. 0 /,
i.e., P0 2 P.
Thus, both Conditions A1 and A2 are satisfied, and hence the outer approxima-
tion scheme described in Sect. 6.3 can be applied to solve .CDC/.

7.4.1 Simple Outer Approximation

We are led to the following OA Procedure (Fig. 7.10) for finding an "-approximate
optimal solution of (CDC):
OA Algorithm for (CDC)
Step 0. Let x1 be the best feasible solution available, 1 D f .x1 / (if no feasible
solution is known, set x1 D ;; 1 D C1/. Take a polytope P0  D and
let P1 D fx 2 P0 j f .x/  1 g. Determine the vertex set V1 of P1 . Set
k D 1.
Step 1. Compute xk 2 argmaxfh.x/j x 2 Vk g. If h.xk / < 0, then terminate:
(a) If k < C1, xk is an "-approximate optimal solution;
(b) If k D C1, the problem is infeasible.

x1
g(x) = 0 f (x) = c, x

h(x) =

P1
l1 (x) = 0 y1
P3 w y2 P2
l2 (x) = 0

x2

Fig. 7.10 Outer approximation algorithm for .CDC/


202 7 DC Optimization Problems

Step 2. Compute yk 2 w; xk  such that maxfg.yk /; h.yk / C "g D 0. If g.yk / D 0


then set kC1 D k ; xkC1 D xk and let

lk .x/ WD hpk ; x  yk i; pk 2 @g.yk /: (7.65)

Step 3. If h.yk / D ", then set kC1 D minf k ; f .yk /g, xkC1 D yk if f .yk /  k ,
xkC1 D xk otherwise. Let

lk .x/ WD hpk ; x  yk i; pk 2 @f .yk /: (7.66)

Step 4. Compute the vertex set VkC1 of PkC1 D Pk \ fxj lk .x/  0g, set k kC1
and go back to Step 1.

Theorem 7.9 The OA Algorithm for .CDC) can be infinite only if " D 0 and in
this case, if the problem is regular any accumulation point of the sequence fxk g will
solve (CDC).
Proof Suppose that the algorithm is infinite and let x D limq!C1 xkq . It can
easily be checked that all conditions of Theorem 6.5 are fulfilled (w 2 intG
by (7.58); yk 2 w; xk  n intG, lk .yk / D 0/. By this theorem, x D lim ykq , therefore
h.x/ D lim h.ykq /  ". On the other hand, h.xk /  0, hence h.x/  0. This is a
contradiction unless " D 0, Since g.yk /  0 8k it follows that x 2 D. Thus x is a
feasible solution. Now let Q D lim k . Since for every k; h.xk / D maxfh.x/j x 2 Pk g
and D. k /  Pk it follows that 0 D h.x/ D maxfh.x/j x 2 D. Q /g. Hence D. Q /  C.
By the optimality criterion (7.60) this implies, under the regularity assumption, that
x is a global optimal solution and Q D , the optimal value. Furthermore, since
k D f .xk /, any accumulation point of the sequence fxk g is also a global optimal
solution. t
u
Remark 7.12 No regularity condition is required when " > 0. It can easily be
checked that the convergence proof is still valid if in Step 3 of certain iterations the
cut is taken with pk 2 @h.yk / instead of pk 2 @f .yk /. This flexibility should be used
to monitor the speed of convergence.
Remark 7.13 The above algorithm presupposes the availability of a point w 2
intD \ intC, i.e., satisfying maxfg.w/; h.w/g < 0. Such a point, if not readily
available, can be computed by methods of convex programming. It is also possible
to modify the algorithm so as to dispense with the computation of w (Tuy 1995b).
An earlier version of the above algorithm with " D 0 was developed in (Tuy
1987a). Several modified versions of this algorithm have also been proposed (e.g.,
Thoai 1988) which, however, were later shown to be nonconvergent. In fact, the
convergence of cutting methods for (CDC) is a more delicate question than it seems.
7.4 Canonical DC Programming 203

7.4.2 Combined Algorithm

By translating if necessary, assume that (7.58) holds with w D 0, and the feasible
set D n intC lies entirely in the orthant M0 D RnC . It is easily verified that the above
OA algorithm still works if, in the case when maxfh.x/j x 2 Pk g > 0, x.Pk / is taken
to be any point xk 2 Pk satisfying h.xk / > " rather than a maximizer of h.x/ over
Pk as in (7.64). As shown in Sect. 6.3 such a point xk can be computed by applying
the Procedure DC to the inclusion

.SPk /x 2 Pk n C" :

For " > 0, after finitely many steps this procedure either returns a point xk such
that h.xk / > ", or establishes that maxfh.x/j x 2 Pk g < 0 (the latter implies that
the current incumbent is an "-approximate optimal solution if k < C1, or that
the problem is infeasible if k D C1/. Furthermore, since PkC1 differs from Pk by
only one additional linear constraint, the last conical partition in the DC procedure
for (SPk ) can be used to start the procedure for (SPkC1). We can thus formulate the
following combined algorithm (recall that w D 0/:
Combined OA/CS Conical Algorithm for (CDC)
Initialization Let 1 D cx1 , where x1 is the best feasible solution available (if no
feasible solution is known, set x1 D ;; 1 D C1/. Take a polytope P0  D and
let P1 D fx 2 P0 j cx  1 g. Set N1 D P1 D fRnC g (any subcone of RnC will
be assumed to have its base in the simplex spanned by the units vectors fei ; i D
1; : : : ; ng/. Set k D 1.
Step 1. (Evaluation) For each cone M 2 Pk of base u1 ; : : : ; un  compute the
point i ui where the ith edge of M meets @C. Let U be the matrix of
columns u1 ; : : : ; un . Solve the linear program
( )
Xn
ti
max j Ut 2 Pk ; t  0

iD1 i

to obtain the optimal value .M/ and a basic optimal solution t.M/. Let
!.M/ D Ut.M/.
Step 2. (Screening) Delete every cone M 2 Nk such that .M/ < 1, and let Rk
be the set of remaining cones.
Step 3. (Termination Criterion 1) If Rk D ;, then terminate: xk is an "-
approximate optimal solution of (CDC) (if k < C1/, or the problem
is infeasible (if k D 1/.
Step 4. (Termination Criterion 2) Select Mk 2 argmaxf.M/j M 2 Rk g. If ! k D
!.Mk / 2 D. k / WD fx 2 Dj f .x/  k g, then terminate: ! k is a global
optimal solution.
204 7 DC Optimization Problems

Step 5. (Branching) Subdivide Mk along the ray through ! k D !.Mk /. Let PkC1
be the partition of Mk .
Step 6. (Outer Approximation) If h.!.M// > " for some M 2 Rk then let xk D
!.M/; yk 2 0; xk  such that maxfg.yk /; h.yk / C "g D 0.
(a) If g.yk / D 0 let lk .x/  0 be the cut (7.65), xkC1 D xk ; kC1 D k .
(b) Otherwise, let lk .x/  0 be the cut (7.66), xkC1 D yk ; kC1 D cyk .
Step 7. (Updating) Let NkC1 D .Rk nfMk g/[PkC1 , PkC1 D fx 2 Pk j lk .x/  0g.
Set k k C 1 and go back to Step 1.
Note that in Step 2, a cone M is deleted if .M/ < 1 (strict inequality), so
Rk D ; implies that maxfh.x/j x 2 Pk g < 0. It is easy to prove that, analogously to
the Combined Algorithm for (GCP) (Theorem 7.5), the above Combined Algorithm
for (CDC) can be infinite only if " D 0 and in this case, if the problem is regular,
every accumulation point of the sequence fxk g is a global optimal solution of (CDC).

7.5 Robust Approach

Consider the general dc optimization problem

.P/ minff .x/j x 2 a; b; gi .x/  0; i D 1; : : : ; mg;

where f ; g1 ; : : : ; gm are dc functions on Rn ; a; b 2 Rn and a; b D fx 2 Rn j a  x 


bg. If f .x/ D f1 .x/  f2 .x/ with f1 ; f2 being convex functions, we can rewrite (P) as
minff1 .x/  tj x 2 a; b; gi .x/  0; i D 1; : : : ; m; t  f2 .x/  0g, so by changing the
notation it can always be assumed that the objective function f .x/ in (P) is convex.
In the most general case, a common approach for solving problem (P) is to use
a suitable OA or BB procedure for generating a sequence of infeasible solutions xk
such that, as k ! C1; xk tends to a feasible solution x , while f .xk / tends to the
optimal value of the problem. Under these conditions x is a global optimal solution,
and theoretically the problem can be considered solved.
Practically, however, the algorithm has to be stopped at some xk which is
expected to be close enough to the optimal solution. On the one hand, xk must be
approximately feasible, on the other hand, the value f .xk / must approach the optimal
value v.P/. In precise terms the former condition means that, given a tolerance
" > 0, one must have xk 2 a; b; gi .xk /  "; i D 1; : : : ; mI the latter condition
means that, given a tolerance
> 0, one must have f .xk /  v.P/ C
. A point
xk satisfying both these conditions is referred to as an .";
/-approximate optimal
solution, or for
D " an "-approximate optimal solution.
Everything in this approximation scheme seems to be in order, so computing
an (";
)-approximate optimal solution has become a common practice for solving
nonconvex global optimization problems. Close scrutiny reveals, however, pitfalls
behind this .";
)-approximation concept.
7.5 Robust Approach 205

Fig. 7.11 Inadequate b


"-approximate optimal
solution
g1 (x)
x
Opt

g2 (x)
x
g3 (x)

f (x)

Consider, for example, the quadratic program depicted in Fig. 7.11, where the
objective function f .x/ is linear and the nonconvex constraint set is defined by the
inequalities a  x  b; gi .x/  0 .i D 1; 2; 3/. The optimal solution is x , while
the point x is infeasible, but almost feasible:

g1 .x/ D 0; g2 .x/ D 0; g3 .x/ D

for some small > 0. If " > then x is feasible with tolerance " and will be
accepted as an "-approximate optimal solution, though it is quite far off the optimal
solution x . But if " < then x is no longer feasible with tolerance " and the "-
approximate solution will come close to x .
Thus, even for regular problems (problems with no isolate feasible solution),
the "-approximation approach may give an incorrect optimal solution if " is not
sufficiently small. The trouble is that in practice we often do not know what exactly
means sufficiently small, i.e., we do not know how small the tolerance should be
to guarantee a correct approximate optimal solution.
Things are much worse when the problem happens to have a global optimal solu-
tion which is an isolated feasible solution. In that case a slight perturbation of the
data may cause a drastic change of the global optimal solution; consequently, a small
change of the tolerances ";
may cause a drastic change of the (";
)-approximate
optimal solution. Due to this instability an isolated global optimal solution is very
difficult to compute and very difficult to implement when computable. To avoid such
annoying situation a common practice is to restrict consideration to problems with
a robust feasible set, i.e., a feasible set D satisfying

D D cl.intD/; (7.67)

where cl and int denote the closure and the interior, respectively. Unfortunately, this
condition (7.67) which implies the absence of isolated feasible solutions is generally
very hard to check for a given nonconvex problem. Practically we often have to deal
with feasible sets which are not known priori to contain isolated points or not.
206 7 DC Optimization Problems

Therefore, from a practical point of view only nonisolated feasible solutions of


(P) should be considered and instead of trying to find a global optimal solution
of (P) it would be more reasonable to look for the best nonisolated feasible
solution. Embodying this idea, the robust approach is devised for computing the best
nonisolated feasible solution without preliminary knowledge about condition (7.67)
holding or not. First let us introduce some basic concepts.
Let D denote the set of nonisolated points (i.e., accumulation points) of D WD
fx 2 a; bj g.x/  0g. A solution x 2 D is called an essential optimal solution of
(P) if
f .x /  f .x/ 8x 2 D :
In other words, an essential optimal solution is an optimal solution of the problem
minff .x/j x 2 D g:
For given " > 0;
> 0, a point x 2 a; b satisfying g.x/  " is said to be
"-essential feasible, and a nonisolated feasible solution x is called essential .";
/-
optimal if
f .x /  f .x/ C
8x 2 D" ;
where D" WD fx 2 a; bj g.x/  "g is the set of all "-essential feasible solutions.
Clearly for " D
D 0 an essential .";
/-optimal solution is a nonisolated feasible
point which is optimal.
To find the best nonisolated feasible solution the robust approach proceeds by
successive improvement of the current incumbent. Specifically, let w 2 Rn be any
point such that f .w/  " > f .x/ 8x 2 DI in each stage of the process, one has to
solve the following subproblem of incumbent transcending:

.SP/ Given an x 2 Rn , find a nonisolated feasible solution xO of (P) such that


f .Ox/  f .x/  ", or else establish that none such xO exists.

Clearly, if x D w then an answer to .SP/ would give a nonisolated feasible


solution or else identify essential infeasibility of the problem (P). If x is the best
nonisolated feasible solution currently available then an answer to .SP/ would give
a new nonisolated feasible solution xO with f .Ox/  f .x/  ", or else identify x as an
essential "-optimal solution.
Since the essential optimal value is bounded above in view of the compactness
of D, by successively solving a finite sequence of subproblems .SP/ we will finally
come up with an essential "-optimal solution, or an evidence that the problem is
essentially infeasible. The key reduces to investigating the incumbent transcending
subproblem .SP/.
Setting g.x/ D maxiD1;:::;m gi .x/, consider the following pair of problems:
minff .x/j g.x/  "; x 2 a; bg; (P" )
minfg.x/j f .x/  ; x 2 a; bg; (Q )
where the objective and constraint functions are interchanged.
7.5 Robust Approach 207

Due to the fact that f .x/ is convex a key property of problem (Q ) is that its
feasible set is a convex set, so (Q ) has no isolated feasible solution and computing
a feasible solution of it can be done at cheap cost. Therefore, as shown in Sect. 7.5,
an adaptive BB Algorithm can be devised that, for any given tolerance
> 0,
can compute an
-optimal solution of (Q ) in finitely many steps (Proposition 5.2).
Furthermore, such an
-optimal solution is stable under small changes of
.
For our purpose of solving problem (P) this property of (Q ) together with the
following relationship between (P" ) and (Q ) turned out to be crucial. Let min.P" /,
min.Q / denote the optimal values of problem (P" ), (Q ), respectively.
Proposition 7.13 For every " > 0, if min.Q / > " then min.P" / > .
Proof If min.Q / > ", then any x 2 a; b such that g.x/  " must satisfy
f .x/ > , hence, by compactness of the feasible set of (P" ), minff .x/j g.x/  ";
x 2 a; bg > . t
u
This almost trivial proposition simply expresses the interchangeability between
objective and constraint in many practical situations (Tuy 1987a, 2003). Exploiting
this interchangeability, the basic idea of robust approach is to replace the original
problem (P), possibly very difficult, by a sequence of easier, stable, problems (Q ),
where the parameter can be iteratively adjusted until a stable (robust) solution to
(P) is obtained.

7.5.1 Successive Incumbent Transcending

As shown above, a key step towards finding an essential (";


)-optimal solution of
problem (P) is to deal with the following question which is the subproblem .SP/
when D f .x/  " W

.SP / Given a real number , and " > 0, check whether problem (P) has a
nonisolated feasible solution x satisfying f .x/  , or else establish that no "-
essential feasible solution x exists such that f .x/  .

Based on the convexity of f .x/, consider an adaptive BB Algorithm for solving


problem (Q /. As described in the proof of Proposition 6.2, such an algorithm
generates at each iteration k two points xk 2 Mk ; yk 2 Mk satisfying

f .xk /  ; g.yk /  .Mk / ! 0 as k ! C1;

where .Mk /  v.Q / (note that the objective function is g.x/, the constraints are
f .x/  ; x 2 a; b).
Proposition 7.14 Let " > 0 be given. Either g.xk / < 0 for some k or .Mk / > "
for some k. In the former case, xk is a nonisolated feasible solution of (P) satisfying
f .xk /  . In the latter case, no "-essential feasible solution x of (P) exists such that
f .x/  (so, if D f .x/ 
for a given
> 0 and a nonisolated feasible solution
x then x is an essential .";
/-optimal solution).
208 7 DC Optimization Problems

Proof If g.xk / < 0, then xk is an essential feasible solution of (P) satisfying f .xk / 
, because by continuity g.x/ < 0 for all x in a neighborhood of xk . If .Mk / > ",
then v.Q / > ", hence, by Proposition 7.13, v.P" / > .
It remains to show that either g.xk / < 0 for some k, or .Mk / > " for some k.
Suppose the contrary that

g.xk /  0; .Mk /  " 8k: (7.68)

As we saw in Sect. 6.1 (Theorem 6.4), the sequence xk ; yk generated by an adaptive


BB Algorithm satisfies, for some infinite subsequence fk g  f1; 2; : : :g,

g.yk /  .Mk / ! 0 . ! C1/; (7.69)


  
k
lim!C1 x D lim!C1 y D x k
with f .x /  ; x 2 a; b: (7.70)

Since by (7.68) g.xk /  0 8k, it follows that g.x /  0. On the other hand, by (7.69)
and (7.68) we have g.x / D lim!C1 .Mk /  ". This contradiction shows that
the event (7.68) is impossible, completing the proof. t
u
Thus, with the stopping criterium g.xk / < 0 or .Mk / > " an adaptive
BB procedure for solving (Q ) will help to solve the subproblem .SP /. Note that
.Mk / D minf.M/j M 2 Rk g (Rk being the collection of partition sets remaining
for exploration at iteration k/. So if the deletion (pruning) criterion in each iteration
of this procedure is .M/ > " then the stopping criterion .Mk / > " is
nothing but the usual criterion Rk D ; .
For brevity an adaptive BB Algorithm for solving (Q / with the deletion criterion
.M/ > " and stopping criterion g.xk / < 0 will be referred to as Procedure
.SP /. Using this procedure, problem (P) can be solved according to the following
scheme:
Successive Incumbent Transcending (SIT) Scheme
Start with D 0 , where 0 D f .a/ 
.
Call Procedure .SP /. If a nonisolated feasible solution x of (P) is obtained with
f .x/  , reset f .x/ 
and repeat. Otherwise, stop: if D f .x/ 
for
a nonisolated feasible solution x then x is an essential .";
/-optimal solution; if
D 0 then problem (P) has no "-essential feasible solution.
Since f .D/ is compact and
> 0 this scheme is necessarily finite.

7.5.2 The SIT Algorithm for DC Optimization

For the sake of convenience let us rewrite problem (P) in the form:

.P/ minff .x/j g1 .x/  g2 .x/  0; x 2 a; bg;

where f ; g1 ; g2 are convex functions.


7.5 Robust Approach 209

As shown above (Proposition 7.14), Procedure .SP / for problem (P) is an


adaptive BB algorithm for solving problem (Q ), with deletion criterion .M/ >
" and stopping criterion g.xk / < 0.
Incorporating Procedure .SP / into the SIT Scheme we can state the following
SIT Algorithm for (P):
SIT Algorithm for (P)
Select tolerances " > 0;
> 0. Let 0 D f .a/ 
.
Step 0. Set P1 D fM1 g; M1 D a; b; R1 D ;; D 0 . Set k D 1.
Step 1. For each box (hyperrectangle) M 2 Pk W
Reduce M, i.e., find a box p; q  M as small as possible satisfying
minfg.x/j f .x/  ; x 2 p; qg D minfg.x/j f .x/  ; x 2 Mg and set
M p; q.
Compute a lower bound .M/ for g.x/ over the feasible solutions in
p; q.
.M/ must be such that: .M/ < C1 ) fx 2 Mj f .x/  g ;,
see (5.4).
Delete every M such that .M/ > ".
Step 2. Let P 0 k be the collection of boxes that results from Pk after completion
of Step 1. Let R 0 k D Rk [ P 0 k .
Step 3. If R 0 k D ;, then terminate: if D 0 the problem (P) is "-essential
infeasible; if D f .x/ 
for some nonisolated feasible solution x, then
x is an essential .";
/-optimal solution of (P).
[ Step 4.] If R 0 k ;, let Mk 2 argminf.M/j M 2 R 0 k g; k D .Mk /.
Determine xk 2 Mk and yk 2 Mk such that

f .xk /  ; g.yk /  .Mk / D o.kxk  yk k/: (7.71)

If g.xk / < ", go to Step 5. If g.xk /  ", go to Step 6.


Step 5. xk is a nonisolated feasible solution satisfying f .xk /  .
If D 0 define x D xk ; D f .x/ 
and go to Step 6. If D f .x/ 

and f .xk / < f .x/ reset x xk and go to Step 6.


Step 6. Divide Mk into two subboxes by the adaptive bisection, i.e., bisect Mk via
.v k ; jk /, where jk 2 argmaxfjykj  xkj j W j D 1; : : : ; ng; v k D 12 .xk C yk /.
Let PkC1 be the collection of these two subboxes of Mk ; RkC1 D R 0 k n
fMk g.
Increment k, and return to Step 1.
Theorem 7.10 The SIT Algorithm terminates after finitely many steps, yielding an
essential .";
/-optimal solution or an evidence that the problem has no "-essential
feasible solution.
Proof Follows from the above discussion. t
u
210 7 DC Optimization Problems

To help practical implementation, it may be useful to discuss in more detail two


operations important for the efficiency of the algorithm: the box reduction in Step 1
and the computation of xk and yk in Step 4.
Box Reduction
Let M D p; q be a box. We wish to determine a box redM D p0 ; q0   M such that

fx 2 p0 ; q0 j f .x/  ; g1 .x/  g2 .x/  "g D fx 2 Mj f .x/  ; g1 .x/  g2 .x/  "g:


(7.72)
Select a convex function g.x/ underestimating g.x/ D g1 .x/  g2 .x/ on M and tight
at a point y 2 M, i.e., such that g.y/ D g.y/. For example, g.x/ D g1 .x/  g2 .y/,
where y is a corner of the hyperrectangle p; q maximizing g2 .x/ over p; q/. Then
p0 ; q0 are determined by solving, for i D 1; : : : ; n, the convex programs:

p0i D minfxi j x 2 p; q; f .x/  ; g.x/  "g;


q0i D maxfxi j x 2 p; q; f .x/  ; g.x/  "g:

Bounding and Condition (7.71)


With g.x/ being a convex underestimator of g.x/ D g1 .x/g2 .x/ over M D p; q,
tight at a point y 2 redM define .M/ D minfg.x/j f .x/  ; x 2 redMg.
So for each k there is a point yk 2 Mk such that g.yk / D g.yk /. Also define
xk 2 argminfg.x/j f .x/  ; x 2 redMk g, so that g.xk / D .Mk /. If xk D yk
then .Mk / D minfg.x/j x 2 Mk g so, as can easily be checked, condition (7.71) is
satisfied.
Alternative Method: Define

xM 2 argminfg1 .x/j f .x/  ; x 2 redMg;


yM 2 argmaxfg2 .x/j f .x/  ; x 2 redMg;
.M/ D g1 .xM /  g2 .yM /:

The points xk D xMk ; yk D yMk are such that xk D yk implies that minfg1 .x/ 
g2 .x/j x 2 Mk g D 0, hence it is easy to verify that condition (7.71) is satisfied.
Special Case: When f ; g are polynomials or signomials, one can define at each
iteration k a convex program (e.g., a SDP)

.Lk / maxf`k .x; u/j f .x/  ; .x; u/ 2 Ck g;

where u 2 Rmk ; Ck is a compact set in Mk  Rmk , and `k .x; u/ is a linear function


such that the problem maxfg.x/j fQ .x/  ; x 2 Mk g is equivalent to
7.5 Robust Approach 211

.NLk / maxf`k .x; u/j f .x/  ; .x; u/ 2 Ck ; ri .x; u/  0 .i 2 Ik /g;

where ri .x; u/  0; i 2 Ik , are nonconvex constraints (Lasserre 2001; Sherali


and Adams 1999). Furthermore, .Lk / can be selected so that there exists yk 2 Mk
satisfying

.yk ; u/ 2 Ck ) ri .yk ; u/  0 .i 2 Ik / (7.73)

(see, e.g., Tuy (1998, Proposition 8.12)). In that case, setting

.Mk / D maxf`k .x; u/j f .x/  ; .x; u/ 2 Ck g

and taking an optimal solution .xk ; uk / of (Lk / one has a couple xk ; yk satisfy-
ing (7.73) and

xk 2 Mk ; f .xk /  ; .xk ; uk / 2 Ck ; `k .xk ; uk / D .Mk /: (7.74)

If xk D yk then from (7.73) and .xk ; uk / 2 Ck it follows that ri .xk ; u/  0 .i 2


Ik /, so .xk ; uk / solves (NLk / and hence xk solves the subproblem minfg.x/j fQ .x/ 
; x 2 Mk g, i.e., g.xk / D `k .xk ; uk / D .Mk /. It can then easily be shown that
g.yk /  .Mk / ! 0 as xk  yk ! 0, so that xk ; yk satisfy condition (7.71).

7.5.3 Equality Constraints

So far we assumed that all nonconvex constraints are of inequality type: gi .x/ 
0; i D 1; : : : ; m. If there are equality constraints such as

hj .x/ D 0; j D 1; : : : ; s

with all functions h1 ; : : : ; hs nonlinear, the SIT algorithm for whatever " > 0 will
conclude that there is no "-essential feasible solution. This does not mean, however,
that the problem has no nonisolated feasible solution. In that case, since an exact
solution to a nonlinear system of equations cannot be expected to be computable in
finitely many steps, one should be content with replacing every equality constraint
hj .x/ D 0 by an approximate one

jhj .x/j  ;

where > 0 is the tolerance. The above approach can then be applied to the resulting
set of inequality constraints.
212 7 DC Optimization Problems

7.5.4 Illustrative Examples

To illustrate the practicality of the robust approach, we give two numerical


examples.
Example 7.3 Minimize x1 subject to

.x1  5/2 C 2.x2  5/2 C .x3  5/2  18


.x1 C 7  2x2 /2 C 4.2x1 C x2  11/2 C 5.x3  5/2  100

This problem has one isolated feasible point .1; 3; 5/ which is also the optimal
solution. With " D
D 0:01 the essential (";
)-optimal solution (3.747692,
7.171420, 2.362317) with objective function value 3:747692. is found by the SIT
Algorithm after 15736 iterations.
Example 7.4 (n=4) Minimize .3 C x1 x3 /.x1 x2 x3 x4 C 2x1 x3 C 2/2=3 subject to

3.2x1 x2 C 3x1 x2 x4 /.2x1 x3 C 4x1 x4  x2 /


.x1 x3 C 3x1 x2 x4 /.4x3 x4 C 4x1 x3 x4 C x1 x3  4x1 x2 x4 /1=3
C3.x4 C 3x1 x3 x4 /.3x1 x2 x3 C 3x1 x4 C 2x3 x4  3x1 x2 x4 /1=4  309:219315
2.3x3 C 3x1 x2 x3 /.x1 x2 x3 C 4x2 x4  x3 x4 /2
C.3x1 x2 x3 /.3x3 C 2x1 x2 x3 C 3x4 /4  .x2 x3 x4 C x1 x3 x4 /.4x1  1/3=4
3.3x3 x4 C 2x1 x3 x4 /.x1 x2 x3 x4 C x3 x4  4x1 x2 x3  2x1 /4  78243:910551
3.4x1 x3 x4 /.2x4 C 2x1 x2  x2  x3 /2
C2.x1 x2 x4 C 3x1 x3 x4 /.x1 x2 C 2x2 x3 C 4x2  x2 x3 x4  x1 x3 /4  9618
0  xi  5 i D 1; 2; 3; 4:

This problem would be difficult to solve by Sherali-Adams RLT method or


Lasseres method which both would require introducing a very large number of
additional variables.
For " D
D 0:01 the essential (";
)-optimal solution

xessopt D .4:994594; 0:020149; 0:045424; 4:928073/

with objective function value 5.906278 is found by the SIT Algorithm at iteration
667, and confirmed as such at iteration 866.
7.6 DC Optimization in Design Calculations 213

7.5.5 Concluding Remark

Global optimization problems with nonconvex constraint sets are difficult in two
major aspects: (1) a feasible solution may be very hard to determine; (2) the optimal
solution may be an isolated feasible solution, in which case a correct solution cannot
be computed by a finite procedure and implemented successfully.
To cope with these difficulties the robust approach consists basically in finding a
nonisolated feasible solution and improving it step by step. The resulting algorithm
works its way to the best nonisolated optimal solution through a number of cycles of
incumbent transcending. A major advantage of it, apart from stability, is that when
prematurely stopped it may still provide a good nonisolated feasible solution, in
contrast to other current methods which are almost useless in that case.

7.6 DC Optimization in Design Calculations

An important class of nonconvex optimization problems encountered in the appli-


cations has the following general formulation:
.P/ Given a compact set S  Rn with intS ; and a gauge p W Rn ! RC find
the largest value r for which there is an x 2 S satisfying

B.xI r/ WD fy 2 Rn j p.y  x/  rg  S: (7.75)

Here by gauge is meant a nonnegative positively homogeneous and convex function.


As can be immediately seen, a gauge p W Rn ! RC is actually the gauge of the
convex set fxj p.x/  1g in the sense earlier defined in Sect. 1.3.
Problems of the above kind arise in many design calculations. A typical example
is the design centering problem mentioned in Example 5.3, Sect. 5.2. Specifically,
suppose in a fabrication process the quality of a manufactured item is characterized
by an n-dimensional parameter. An item is acceptable if this parameter lies in some
region of acceptability S. If x 2 Rn is the designed value of the parameter, then,
due to unavoidable fluctuations, the actual value y of it will usually deviate from x,
so that p.y  x/ > 0. For each fixed value of x the value of y must lie in B.xI r/
in order to be acceptable, so the production yield can be measured by the maximal
value of r D r.x/ such that B.xI r/ is still contained in S. Under these conditions, to
maximize the yield the designer has to solve the problem (7.75).
This is an inherently difficult problem involving two levels of optimization: for
every fixed value x 2 S compute

r.x/ WD maxfrj B.xI r/  Sg; (7.76)

then determine the value x that achieves

r WD maxfr.x/j x 2 Sg: (7.77)


214 7 DC Optimization Problems

Except in special cases both these two subproblems are nonconvex. Even in the
relatively simple case when S is convex nonpolyhedral and p.x/ D kxk (the
Euclidean norm), the subproblem (7.76), for fixed x, is known to be NP-hard (Shiau
1984). In view of these difficulties, for some time most of the papers published on
this subject either concentrated on some particular aspects of the problem or were
confined to the search for a local optimum, under rather restrictive conditions.
If no further structure is given on the data, the general problem of design
centering is essentially intractable by deterministic methods. Fortunately, in many
practically important cases (see, e.g., Vidigal and Director (1982)) it can be assumed
that
1. p.x/ D xT Ax,
where A is a symmetric positive definite matrix;
2. S D C \ D1 \    \ Dm ,
where C is a closed convex set in Rn with intC ; and Di D Rn n intCi .i D
1; : : : ; m/ and Ci is a closed convex subset of Rn .
Below we present a solution method proposed by Thach (1988) for problem .P/
under these assumptions.

7.6.1 DC Reformulation

For every x 2 Rn define


g.x/ D sup.2xT Ay  yT Ay/:
yS

Proposition 7.15 (Thach 1988) The function g.x/ is convex, finite on Rn and the
general design centering problem is equivalent to the unconstrained dc maximiza-
tion problem
maxfxT Ax  g.x/j x 2 Rn g: (7.78)

Proof For every x 2 Rn we can write


xT Ax  g.x/ D inf .xT Ax  2xT Ay C yT Ay C yT Ay/
yintS

D inf .y  x/T A.y  x/ (7.79)


yintS

D inf p2 .y  x/:
yintS

But for x 2 S it is easily seen from (7.76) that r.x/ D infyintS p.y  x/ 8x 2 S. On
the other hand, for x S, by making y D x in (7.79) we obtain xT Ax  g.x/ D 0
since p2 .y  x/  0. Hence,
7.6 DC Optimization in Design Calculations 215


T r2 .x/ x2S
x Ax  g.x/ D (7.80)
0 xS
This shows that problem (7.77) is equivalent to (7.78). The finiteness of g.x/
follows from (7.80) and that of r.x/. Finally, since for every fixed y the function
x 7! 2xT Ay  yT Ay is affine, g.x/ is convex as the pointwise supremum of a family
of affine functions. t
u
Remark 7.14 (i) In view of (7.80) problem (7.78) is the same as the constrained
optimization problem

maxfxT Ax  g.x/j x 2 Sg:

(ii) For x 2 S we can also write r.x/ D miny2@S p.y  x/, where @S WD S n intS is
the boundary of S. Therefore, from (7.79),

g.x/ D maxf2xT Ay  yT Ay/ 8x 2 S:


y2@S

7.6.2 Solution Method

A peculiar feature of the dc optimization problem (7.78) is that the convex function
g.x/ is not given explicitly but is defined as the optimal value in the optimization
problem

maxf2xT Ay  yT Ayj y Sg: (7.81)

However, from the relation Rn n intS D .[m


iD1 Ci / [ .R n intC/ it is easily seen that
n

g.x/ D maxfg0 .x/; g1 .x/; : : : ; gm .x/g

with

g0 .x/ D maxf2xT Ay  yT Ayj y 2 Rn n intCgI (7.82)


gi .x/ D maxf2x Ay  y Ayj y 2 Ci g; i D 1; : : : ; m:
T T
(7.83)

Thus, the subproblem (7.81) splits into mC1 simpler ones. Every subproblem (7.83)
is a convex program (maximizing a concave function y 7! 2xT Ay  yT Ay over a
closed convex set Ci ), so it can be solved by standard algorithms. The subprob-
lem (7.82), though harder (maximizing a concave function over a closed reverse
convex set), can be solved by reverse convex programming methods developed in
Section 7.3. In that way the value of g.x/ at every point x can be computed.
Also the next proposition helps to find a subgradient of g.x/ at every x.
216 7 DC Optimization Problems

Proposition 7.16 If x S, then 2Ax 2 @g.x/. If x 2 S, then 2Ay 2 @g.x/, where y is


an optimal solution of (7.81) for x D x.
Proof By y denote an optimal solution of (7.81) for x D x. Then g.x/ D 2xT Ay 
yT Ay, so

g.x/  g.x/  .2xT Ay  yT Ay/  .2xT Ay  yT Ay/ D .x  x/T 2Ay 8x 2 Rn :

This implies that 2Ay 2 @g.x/ for every x 2 Rn . In particular, when x S we have
g.x/ D xT Ax; so y D x is an optimal solution of (7.81), hence 2Ax 2 @g.x/. t
u
We are now in a position to derive a solution method for the dc optimization
problem (7.78) equivalent to the general design centering problem .P/.
Let f .x/ WD xT Ax  g.x/. Starting with an n-simplex Z0 containing the compact
set S we shall construct a sequence of functions fk .:/; k D 0; 1; : : :, such that
(i) fk .:/ overestimates f .:/, i.e., fk .x/  f .x/ allx 2 SI
(ii) Each problem maxffk .x/j x 2 Z0 g can be solved by available efficient
algorithms;
(iii) maxffk .x/j x 2 Z0 g ! maxff .x/j x 2 Sg as k ! C1.
Specifically, we can state the following:
Algorithm
Select an n-simplex Z0  S, a point x0 2 S and y0 D x0 . Set k D 1.
Step 1. Let fk .x/ D xT Ax  maxfh2Ayi ; x  xi i C g.xi /j i D 0; 1;    ; k  1g. Solve
the kth approximating problem

.Pk / maxffk .x/j x 2 Z0 g;

obtaining xk . If xk 2 S go to Step 2, otherwise go to Step 3.


Step 2. Solve the kth subproblem

.Sk / maxf2.xk /T Ay  yT Ayj y intSg

obtaining an optimal solution yk and the optimal value g.xk /.


If fk .xk / D .xk /T Axk  g.xk /, stop: xk solves .P/. Otherwise, set k kC1
and go back to Step 1.
Step 3. Let yk D xk . Set k k C 1 and go back to Step 1.
Proposition 7.17 If the above algorithm stops at Step 2 the current xk solves the
design centering (P). If it generates an infinite sequence xk then any cluster point of
this sequence solves (P).
Proof First observe that fk1 .x/  fk .x/  f .x/ 8x and that

fk .xi / D f .xi /; i D 0; 1; : : : ; k  1:
7.6 DC Optimization in Design Calculations 217

In fact, since $2Ay^i\in\partial g(x^i)$ by Proposition 7.16, it follows that $\langle 2Ay^i, x-x^i\rangle + g(x^i)\le g(x)$, $i=0,1,\ldots,k-1$, hence $f_k(x)\ge f_{k+1}(x)\ge f(x)$ $\forall x\in\mathbb{R}^n$. For $x=x^i$ we have $g(x^i)\ge\langle 2Ay^j, x^i-x^j\rangle + g(x^j)$, $j=0,1,\ldots,k-1$. Hence $f_k(x^i)\le (x^i)^TAx^i - g(x^i) = f(x^i)$, which implies that $f_k(x^i) = f(x^i)$.

Now suppose that at some iteration $k$, $f_k(x^k) = (x^k)^TAx^k - g(x^k)$, i.e., $f_k(x^k) = f(x^k)$. Then, since $x^k$ solves $(P_k)$ we have, for any $x\in Z_0$: $f(x)\le f_k(x)\le f_k(x^k) = f(x^k)$, so $x^k$ solves (P), justifying the stopping criterion in Step 2. If the algorithm is infinite, let $\bar x = \lim_{k\to+\infty}x^k$. As we just saw, $f_{k+1}(x^{k+1}) = \max\{f_{k+1}(x)\mid x\in Z_0\}\le\max\{f_k(x)\mid x\in Z_0\} = f_k(x^k)$, so $\{f_k(x^k)\}$ is a nonincreasing sequence bounded below by $\max\{f(x)\mid x\in S\}$. Therefore, there exists $\eta = \lim_{k\to+\infty}f_k(x^k)$. But for any $i<k$:

$$(x^k)^TAx^k - \langle 2Ay^i, x^k-x^i\rangle - g(x^i)\ \ge\ f_k(x^k)\ \ge\ f_k(\bar x)\ \ge\ f(\bar x).$$

By letting $k\to+\infty$ this yields

$$\bar x^TA\bar x - \langle 2Ay^i, \bar x-x^i\rangle - g(x^i)\ \ge\ \eta\ \ge\ f(\bar x) = \bar x^TA\bar x - g(\bar x). \qquad (7.84)$$

Since $y^k$ solves $(S_k)$ we have, by Remark 7.14, $y^k\in\partial S\subset S$, so the sequence $\{y^k\}$ is bounded. By passing to subsequences if necessary, we may assume $y^k\to\bar y$. Obviously, $2A\bar y\in\partial g(\bar x)$. Setting $i=k$ in (7.84) and letting $k\to+\infty$ yields

$$\bar x^TA\bar x - \langle 2A\bar y, \bar x-\bar x\rangle - g(\bar x)\ \ge\ \eta\ \ge\ \bar x^TA\bar x - g(\bar x),$$

hence $\eta = \bar x^TA\bar x - g(\bar x) = f(\bar x)$. Since $f_k(x^k) = \max\{f_k(x)\mid x\in Z_0\}\ge\max\{f(x)\mid x\in Z_0\}$, it follows that

$$f(\bar x) = \max\{f(x)\mid x\in Z_0\} = \max\{f(x)\mid x\in\mathbb{R}^n\}$$

(see Remark 7.14, (ii)). $\square$
Remark 7.15 The dc maximization problem (7.78) can be rewritten as

$$\max\{x^TAx - t\mid x\in S,\ g(x)\le t\}. \qquad (7.85)$$

It is not hard to see that the above algorithm can also be interpreted as an outer approximation method for solving this convex maximization (concave minimization) problem under convex constraints (see Sect. 7.2). In fact, the $k$-th approximating problem $(P_k)$ is equivalent to the problem

$$\max\{x^TAx - t\mid x\in Z_0,\ \langle 2Ay^i, x-x^i\rangle + g(x^i)\le t,\ i=0,\ldots,k-1\}. \qquad (7.86)$$

Noting that $2Ay^i\in\partial g(x^i)$ (Proposition 7.16) we have $\langle 2Ay^i, x-x^i\rangle + g(x^i)\le g(x)$ $\forall x\in\mathbb{R}^n$, so the linear constraints in (7.86) simply define the polytope outer approximating the convex set $\{x\mid g(x)\le t\}$ at iteration $k$ of an outer approximation procedure for solving (7.85).

7.7 DC Optimization in Location and Distance Geometry

7.7.1 Generalized Weber's Problem

The classical Weber's problem is to locate a single facility in the plane so as to minimize the sum of weighted Euclidean distances to the locations of $N$ known users. This problem can be generalized to the case of $p$ facilities as follows.

Suppose $p$ facilities (providing the same service) are designed to serve $N$ users located at points $a^1,\ldots,a^N$ in the plane. We wish to determine the locations $x^1,\ldots,x^p\in\mathbb{R}^2$ of the facilities so as to minimize the total cost (travel time, transport cost for customers, etc.), or to maximize the total attraction (utility, number of customers, etc.). The cost or attraction is a given function of the distances from the users to the facilities. Traditionally, the standard mathematical formulation of this problem is (Brimberg and Love 1994):

$$\min_{x,w}\ \sum_{i=1}^p\sum_{j=1}^N w_{ij}\,d(x^i,a^j)$$
$$\text{s.t.}\quad \sum_{i=1}^p w_{ij} = r_j,\quad j=1,\ldots,N$$
$$\qquad\ w_{ij}\ge 0,\quad i=1,\ldots,p,\ j=1,\ldots,N$$
$$\qquad\ x^i\in S,\quad i=1,\ldots,p,$$

where $w_{ij}$ is an unknown weight representing the flow from facility $i$ to the fixed point (warehouse) $a^j$, $j=1,\ldots,N$; $d(x,a)$ is the distance function which estimates the travel distance between any two points $x,a\in\mathbb{R}^2$; and $S$ is a rectangular domain in $\mathbb{R}^2$ where the facilities must be located.

A serious disadvantage of the above formulation is that it involves a huge number of variables ($pN$ variables $w_{ij}$ and $2p$ variables $x^k_1,x^k_2$) and a highly nonconvex objective function. Because of that the problem is very difficult to handle, even for small values of $p$ and $N$. In fact, while nonlinear programming methods can be trapped at a KKT point which may not even be a local optimum, the difficulties increase considerably when the model is further extended to deal with complicated but more realistic objective functions (e.g., when the cost is measured by a monotone rather than linear function of the distances, or when the facilities are attractive for some users and repelling for others).

Using the dc approach it is possible to significantly reduce the number of variables and to express the objective function in a much simpler form. Noticing that in an optimal solution each user must be served by the closest facility and setting

$$h_j(x^1,\ldots,x^p) = \min_{i=1,\ldots,p}\|x^i-a^j\|, \qquad (7.87)$$

we can rewrite the problem as

$$\min\ \sum_{j=1}^N r_jh_j(x^1,\ldots,x^p)\quad\text{s.t. } x^i\in S\subset\mathbb{R}^2,\ i=1,\ldots,p. \qquad (7.88)$$

Since clearly

$$h_j(x) = \sum_{i=1}^p\|x^i-a^j\| - \max_k\sum_{i\ne k}\|x^i-a^j\|,$$

where the functions $\sum_{i=1}^p\|x^i-a^j\|$ and $\max_k\sum_{i\ne k}\|x^i-a^j\|$ are convex, it follows that (7.88) is a dc optimization problem. One obvious advantage of this formulation is that it uses only $2p$ variables $x^i_1,x^i_2$, $i=1,\ldots,p$, instead of $pN+2p$ variables as in the standard formulation. This allows large-scale problems, usually out of reach of standard nonlinear methods, to be solved successfully by dc optimization methods. For instance, in Tuy et al. (1995a,b,c) and Al-Khayyal et al. (1997, 2002), good computational results have been reported for problems with $N=100{,}000$, $p=1$; $N=10{,}000$, $p=2$; and $N=1000$, $p=3$. The traditional approach would have difficulty even in solving problems with $N=500$, $p=2$, which would require more than 1000 variables to be introduced.
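As a small illustration (not from the original text), the following sketch evaluates the two convex components of the dc split of the objective of (7.88); their difference recovers $\sum_j r_j\min_i\|x^i-a^j\|$.

```python
import numpy as np

def weber_dc_parts(X, a, r):
    """dc split of the multisource Weber objective (7.88).

    X : (p, 2) facility locations; a : (N, 2) user locations; r : (N,) weights.
    Returns (G, H) with objective = G - H, where
      G(X) = sum_j r_j * sum_i ||x^i - a^j||               (convex)
      H(X) = sum_j r_j * max_k sum_{i != k} ||x^i - a^j||  (convex)
    """
    d = np.linalg.norm(X[:, None, :] - a[None, :, :], axis=2)  # d[i, j] = ||x^i - a^j||
    col = d.sum(axis=0)                     # sum_i ||x^i - a^j|| for each user j
    G = (r * col).sum()
    H = (r * (col - d).max(axis=0)).sum()   # max_k sum_{i != k} ||x^i - a^j||
    return G, H
# Since col - d[k, :] = sum_{i != k} d[i, :], one has G - H = sum_j r_j min_i d[i, j].
```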
Facility with Attraction and Repulsion
In more realistic models, the attraction of facility $i$ to user $j$ at distance $t$ away is measured by a convex decreasing function $q_j(t)$. Furthermore, for some users the attraction effect is positive, for others it is negative (which amounts to a repulsion). Let $J_+$ be the set of attraction points (attractors), $J_-$ the set of repulsion points (repellers). The problem is to find the locations of facilities so as to maximize the total effect, i.e.,

$$\max\Big\{\sum_{j\in J_+}q_j(h_j(X)) - \sum_{j\in J_-}q_j(h_j(X))\ \Big|\ X=(x^1,\ldots,x^p)\in S^p\subset(\mathbb{R}^2)^p\Big\},$$

where $h_j(X)$ is given by (7.87). Using Proposition 4.6 it is easy to find an explicit dc representation of the objective function. Hence the problem is a dc optimization problem just as (7.88). For details the interested reader is referred to Chen et al. (1992, 1992b), Tuy et al. (1992), and also Maranas and Floudas (1994) and Al-Khayyal et al. (1997, 2002), where results in solving problems with up to 100,000 attractors and repellers have been reported. A local approach to this problem under more restrictive assumptions ($J_-=\emptyset$) can be found in Idrissi et al. (1988).
Maximin Location
The maximin location problem is to determine the locations of $p$ facilities so as to maximize the minimum distance from a user to the closest facility. Examples are obnoxious facilities, such as nuclear plants, garbage dumps, sewage plants, etc. With $h_j(x^1,\ldots,x^p)$ defined by (7.87) the mathematical formulation of the problem is

$$\max\{\min_{j=1,\ldots,N}h_j(x^1,\ldots,x^p)\mid x^i\in S,\ i=1,\ldots,p\}. \qquad (7.89)$$

An analogous problem, arising when locating emergency facilities, such as fire stations, hospitals, patrol car centers for a security guard company, etc., consists in minimizing the maximum distance from a user to the nearest facility. With the same $h_j(x^1,\ldots,x^p)$ as above, the latter problem can be formulated as

$$\min\{\max_{j=1,\ldots,N}h_j(x^1,\ldots,x^p)\mid x^i\in S,\ i=1,\ldots,p\} \qquad (7.90)$$

and is often referred to as the p-center problem.

Since $h_j(x^1,\ldots,x^p)$ is a dc function, it follows that $\min_{j=1,\ldots,N}h_j(x^1,\ldots,x^p)$ and $\max_{j=1,\ldots,N}h_j(x^1,\ldots,x^p)$ are also dc functions with easily computable explicit dc representations. Therefore, (7.89) and (7.90) are dc optimization problems which can be efficiently solved by currently available methods (see Al-Khayyal et al. 1997 for the case $p=1$).
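For concreteness, here is a short sketch (an addition, not from the original text) evaluating the objectives of (7.89) and (7.90) for a given configuration; both are built from the same distance matrix $d_{ij}=\|x^i-a^j\|$.

```python
import numpy as np

def maximin_value(X, a):
    """Objective of (7.89): min over users of the distance to their closest facility."""
    d = np.linalg.norm(X[:, None, :] - a[None, :, :], axis=2)  # d[i, j]
    return d.min(axis=0).min()          # min_j h_j(x^1, ..., x^p)

def p_center_value(X, a):
    """Objective of (7.90): max over users of the distance to their closest facility."""
    d = np.linalg.norm(X[:, None, :] - a[None, :, :], axis=2)
    return d.min(axis=0).max()          # max_j h_j(x^1, ..., x^p)
```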

7.7.2 Distance Geometry Problem

A problem which has applications in molecular conformation and other areas, such as surveying and satellite ranging, data visualization and pattern recognition, is the multidimensional scaling problem, or distance geometry problem. It consists in finding $p$ objects $x^1,\ldots,x^p$ in $\mathbb{R}^n$ such that the quantity

$$V_p(x^1,\ldots,x^p) := \sum_{i<j}w_{ij}(\delta_{ij} - \|x^i-x^j\|)^2 \qquad (7.91)$$

is smallest, where $\Delta=(\delta_{ij})$, $W=(w_{ij})$ are symmetric matrices of order $p$ such that

$$\delta_{ij}=\delta_{ji}\ge 0,\ w_{ij}=w_{ji}\ge 0\ (i<j);\quad \delta_{ii}=w_{ii}=0\ (i=1,\ldots,p).$$

By writing the problem as

$$\min\ \Big(\sum_{i<j}w_{ij}\|x^i-x^j\|^2 - 2\sum_{i<j}w_{ij}\delta_{ij}\|x^i-x^j\|\Big)\quad\text{s.t. } x^i\in\mathbb{R}^n,\ i=1,\ldots,p \qquad (7.92)$$

or, alternatively, as

$$\min\ \sum_{i<j}w_{ij}(t_{ij}^2 - 2\delta_{ij}t_{ij})\quad\text{s.t. } \|x^i-x^j\|^2\ge t_{ij}\ (\forall i<j),\ x^i\in\mathbb{R}^n,\ i=1,\ldots,p,$$

we again obtain a dc optimization problem of a particular type which can be solved by methods of nonconvex quadratic programming (see Chap. 10). An efficient local

dc optimization algorithm called DCA has been developed by Tao (2003) for solving the multidimensional scaling problem on the basis of the dc formulation (7.92) (see also An (2003), where a combined DCA-smoothing technique is applied).
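The following small sketch (an illustration added here, not from the original text) evaluates the two convex parts of the dc split (7.92); the constant term $\sum_{i<j}w_{ij}\delta_{ij}^2$ of the stress (7.91) is omitted, as in (7.92).

```python
import numpy as np

def mds_dc_parts(X, delta, W):
    """dc split (7.92) of the stress (7.91), up to an additive constant.

    X : (p, n) configuration; delta, W : symmetric (p, p) with zero diagonal.
    Returns (G, H) with objective = G - H: G is a convex quadratic,
    H a nonnegative weighted sum of norms (hence convex).
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.linalg.norm(diff, axis=2)            # dist[i, j] = ||x^i - x^j||
    iu = np.triu_indices(len(X), k=1)              # pairs i < j
    G = (W[iu] * dist[iu] ** 2).sum()              # sum w_ij ||x^i - x^j||^2
    H = 2.0 * (W[iu] * delta[iu] * dist[iu]).sum() # 2 sum w_ij delta_ij ||x^i - x^j||
    return G, H
```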

7.7.3 Continuous p-Center Problem

The continuous p-center problem differs from the p-center problem only in that the set of users is a rectangle $S\subset\mathbb{R}^2$ rather than a finite set of points. It consists in finding the locations $x^1,\ldots,x^p$ of $p$ facilities in $S$ so as to minimize the maximal distance from a point $y\in S$ to the nearest facility, i.e.,

$$\min\ \max_{y\in S}h_y(x^1,\ldots,x^p),\quad h_y(x^1,\ldots,x^p)=\min_{i=1,\ldots,p}\|x^i-y\|,\quad x^i\in S,\ i=1,\ldots,p. \qquad (7.93)$$

For each $j$ define the Voronoi (Dirichlet) cell

$$D_j := \{x\mid \|x-x^j\|\le\|x-x^i\|\ \forall i=1,\ldots,p\}.$$

Computational results of Suzuki and Okabe (1995) for $p=30$ and $p=49$ suggest the conjecture that for large values of $p$ most Voronoi cells tend to be hexagons, and only a few are pentagons.

7.7.4 Distribution of Points on a Sphere

In closing this section we should mention a challenging problem which has in recent times attracted the attention of mathematicians, biologists, chemists, and physicists working in such fields as viral morphology, crystallography, molecular structure, and electrostatics (Saff and Kuijlaars 1997). Like many famous problems of mathematics, its formulation is rather simple: find the configuration of $p$ points on the unit sphere in $\mathbb{R}^3$ solving

$$\min\ V_p(x^1,\ldots,x^p) := \sum_{i<j}\frac{1}{\|x^i-x^j\|}\quad\text{s.t. } \|x^i\|=1,\ i=1,\ldots,p. \qquad (7.94)$$

Physically $V_p(x^1,\ldots,x^p)$ represents the energy of $p$ charged particles that repel each other according to Coulomb's law. An optimal solution $X=(x^1,\ldots,x^p)$ of the problem (7.94) is called a $p$-tuple of elliptic Fekete points. Computational experiments have shown that for $32\le p\le 200$ optimal configurations arrange points according to a hexagonal-pentagon pattern (i.e., all but exactly 12 Voronoi cells are hexagonal, the exceptional cells being pentagons).

A related problem is to find a configuration of $p$ points on the sphere that maximizes the product

$$W_p(x^1,\ldots,x^p) := \prod_{i<j}\|x^i-x^j\|. \qquad (7.95)$$

The difficulty of problem (7.94) is that there are many local minima that are not global; in addition, the local minimizers have objective function values very close to the global minimum, which makes it very hard to determine the exact value of the global minimum.

Observe that $\frac{1}{\|x^i-x^j\|} = q(\|x^i-x^j\|)$ with $q(t)=1/t$ a convex decreasing function for $t>0$. Hence, by Proposition 4.5 problem (7.94) can be rewritten as

$$\min\Big\{\sum_{i<j}\big(g(x^i,x^j) - K\|x^i-x^j\|\big)\ \Big|\ \|x^i\|=1,\ i=1,\ldots,p\Big\},$$

where $g(\cdot,\cdot)$ is convex and $K$ is some positive constant. This is a dc minimization problem over a sphere which should be practically solvable for small values of $p$. Analogously, the problem

$$\max\Big\{\prod_{i<j}\|x^i-x^j\|\ \Big|\ \|x^i\|=1,\ i=1,\ldots,p\Big\},$$

or, equivalently,

$$\max\Big\{\sum_{i<j}\log(\|x^i-x^j\|)\ \Big|\ \|x^i\|=1,\ i=1,\ldots,p\Big\}$$

can be written as a dc optimization problem over a sphere:

$$\max\Big\{\sum_{i<j}\big(L\|x^i-x^j\| - h_{ij}(x^i,x^j)\big)\ \Big|\ \|x^i\|=1,\ i=1,\ldots,p\Big\}$$

since $\log(\|x^i-x^j\|) = q(\|x^i-x^j\|)$ with $q(t)=\log t$ concave increasing for $t>0$.
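As a purely illustrative addition, the next snippet evaluates the Coulomb energy $V_p$ of (7.94) and the log-product objective equivalent to (7.95) for a random feasible configuration (points projected onto the unit sphere); it makes no claim about optimality.

```python
import numpy as np

def coulomb_energy(X):
    """V_p of (7.94): sum over pairs of 1 / ||x^i - x^j||."""
    p = len(X)
    return sum(1.0 / np.linalg.norm(X[i] - X[j])
               for i in range(p) for j in range(i + 1, p))

def log_product(X):
    """log W_p of (7.95): sum over pairs of log ||x^i - x^j||."""
    p = len(X)
    return sum(np.log(np.linalg.norm(X[i] - X[j]))
               for i in range(p) for j in range(i + 1, p))

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # project onto the unit sphere
print(coulomb_energy(X), log_product(X))
```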

7.8 Clustering via dc Optimization

The classification of objects into groups or clusters is important in many fields of operations research and the applied sciences. Existing approaches to this problem include statistical methods, machine learning, and mathematical programming, in particular linear and integer programming. Among them, approaches based on mathematical programming techniques are the most effective (Bradley et al. 1997; Mangasarian 1997).

The problem of cluster analysis can be formulated as follows. For a given set of points $a^i$, $i=1,\ldots,m$, in $\mathbb{R}^n$, find cluster centers $x^l$, $l=1,\ldots,p$, in $\mathbb{R}^n$ such that the sum over $i$ of the minima over $l\in\{1,\ldots,p\}$ of the distance between each point $a^i$ and the cluster centers $x^l$, $l=1,\ldots,p$, is minimized. Using the one-norm for measuring distances, the problem is

$$\begin{array}{ll}\text{minimize} & \sum_{i=1}^m\min_{l=1,\ldots,p}\langle e,y^{il}\rangle\\[2pt] \text{subject to} & -y^{il}\le a^i-x^l\le y^{il},\quad i=1,\ldots,m;\ l=1,\ldots,p,\end{array} \qquad (7.96)$$

where $e=(1,\ldots,1)$ is the vector of ones in $\mathbb{R}^n$.

Since for each $i=1,\ldots,m$ the function $\min_{l=1,\ldots,p}\langle e,y^{il}\rangle$ is concave piecewise linear, (7.96) is a piecewise linear concave minimization problem as studied in Section 7.2. The main difficulty of this approach is that in practice $m$ is too large for the problem to be solved efficiently. Therefore, exploiting the special form of the objective function, Bradley, Mangasarian, and Street (1997) transformed the problem into an equivalent bilinear program by rewriting it as

$$\begin{array}{ll}\min & \sum_{i=1}^m\sum_{l=1}^p\langle e,y^{il}\rangle t_{il}\\[2pt] \text{s.t.} & -y^{il}\le a^i-x^l\le y^{il},\quad i=1,\ldots,m;\ l=1,\ldots,p,\\[2pt] & \sum_{l=1}^p t_{il}=1,\ t_{il}\ge 0,\quad i=1,\ldots,m;\ l=1,\ldots,p.\end{array} \qquad (7.97)$$

This bilinear program is then solved by an algorithm which alternates between solving a linear program in the variable $t$ and a linear program in the variables $(x,y)$. Although this method terminates after finitely many iterations at a stationary point satisfying the necessary optimality conditions, this stationary point is not guaranteed to be a global optimal solution.

Furthermore, when the 2-norm is used instead of the 1-norm, the problem can no longer be formulated as a concave minimization. Instead, it becomes the considerably harder Weber multisource location problem. In Mangasarian (1997) the problem (7.97) is solved by a p-mean algorithm which again does not guarantee a global optimal solution.

Below we present a global optimization approach via dc optimization by Tuy et al. (2001). Another dc optimization approach, based on the DCA algorithm of P.D. Tao, was developed by An et al. (2007).

7.8.1 Clustering as a dc Optimization

Since at optimality each point $a^i$ must be assigned to the cluster center closest to it, the clustering problem can be formulated as

$$\min\ \sum_{i=1}^m\min_{l=1,\ldots,p}\|x^l-a^i\|\quad\text{s.t. } x^l\in S\subset\mathbb{R}^n,\ l=1,\ldots,p, \qquad (7.98)$$

where $S$ is a compact convex set in $\mathbb{R}^n$. Without loss of generality we can assume that $S$ is the unit simplex of $\mathbb{R}^n_+$. By a usual transformation,

$$\min_{l=1,\ldots,p}\|x^l-a^i\| = \sum_{l=1}^p\|x^l-a^i\| - \max_r\sum_{l\ne r}\|x^l-a^i\|,$$

where $\sum_{l=1}^p\|x^l-a^i\|$ and $\max_r\sum_{l\ne r}\|x^l-a^i\|$ are convex functions. Therefore, (7.98) is actually the dc optimization problem

$$\begin{array}{ll}\min & \sum_{i=1}^m\Big(\sum_{l=1}^p\|x^l-a^i\| - \max_r\sum_{l\ne r}\|x^l-a^i\|\Big)\\[2pt] \text{s.t.} & x^l\in S,\ l=1,\ldots,p.\end{array} \qquad (7.99)$$

A common method for solving this problem is branch and bound. There are $p\cdot n$ variables $x^l_j$, $j=1,\ldots,n$, $l=1,\ldots,p$. For the sake of clarity any vector in $X=\prod_{l=1}^pX_l$, $X_l\subset\mathbb{R}^n$, will be denoted by a boldface lowercase letter, e.g., $\mathbf{x}=(x^1,\ldots,x^p)$ where $x^l\in X_l$. We also agree to write $x_l=x^l$. The problem then takes the form

$$\min\{f(\mathbf{x}) := g(\mathbf{x}) - h(\mathbf{x})\mid \mathbf{x}\in S\}, \qquad (7.100)$$

where

$$g(\mathbf{x}) = \sum_{i=1}^m\sum_{l=1}^p g_{il}(\mathbf{x}),\qquad g_{il}(\mathbf{x}) = \|x^l-a^i\|,$$
$$h(\mathbf{x}) = \sum_{i=1}^m h_i(\mathbf{x}),\qquad h_i(\mathbf{x}) = \max_r\sum_{l\ne r}\|x^l-a^i\|,$$

and $S$ is a simplex in $X=\prod_{l=1}^pX^l$, $X^l=\mathbb{R}^n$.

a. Bounding

Let $M$ be a subsimplex of $S$, with vertex set $V(M)$ (note that $|V(M)|=pn+1$). To compute a lower bound $\beta(M)$ for the value of $f(\mathbf{x})$ over $M$ we replace the problem by a linear relaxation constructed as follows.

For each $v\in V(M)$ let $p_{il}(v)\in\partial g_{il}(v)$, i.e.,

$$p_{il}(v) = \begin{cases}\dfrac{v^l-a^i}{\|v^l-a^i\|} & \text{if } v^l\ne a^i,\\[4pt] 0 & \text{otherwise.}\end{cases}$$

Then an affine minorant of $g_{il}(\mathbf{x})$ is $\langle p_{il}(v), x^l-v^l\rangle + \|v^l-a^i\|$, so a minorant of $g(\mathbf{x})$ is

$$\varphi(\mathbf{x}) = \max_{v\in V(M)}\sum_{i=1}^m\sum_{l=1}^p\big(\langle p_{il}(v), x^l-v^l\rangle + \|v^l-a^i\|\big).$$

Also let $\psi_i(\mathbf{x})$ be the minorant of $h_i(\mathbf{x})$ provided by the affine function which matches $h_i(\mathbf{x})$ at every $v\in V(M)$, i.e.,

$$\psi_i(\mathbf{x}) = \sum_{v\in V(M)}\lambda_v\max_r\sum_{l\ne r}\|v^l-a^i\|\quad\text{for every}$$
$$\mathbf{x} = \sum_{v\in V(M)}\lambda_vv,\quad \sum_{v\in V(M)}\lambda_v=1,\quad \lambda_v\ge 0.$$

Proposition 7.18 A lower bound of $f(\mathbf{x})$ over $M$ is given by the value

$$\beta(M) = \left\{\begin{array}{l}\min\ t - \sum_{i=1}^m\sum_{v\in V(M)}\lambda_v\max_r\sum_{l\ne r}\|v^l-a^i\|\quad\text{s.t.}\\[2pt] \sum_{i=1}^m\sum_{l=1}^p\big(\langle p_{il}(v), x^l-v^l\rangle + \|v^l-a^i\|\big)\le t\quad\forall v\in V(M),\\[2pt] x^l=\sum_{v\in V(M)}\lambda_vv^l,\quad \sum_{v\in V(M)}\lambda_v=1,\quad \lambda_v\ge 0.\end{array}\right. \qquad (7.101)$$

Proof A relaxation of the subproblem associated with $M$ is

$$\min\Big\{\varphi(\mathbf{x}) - \sum_{i=1}^m\psi_i(\mathbf{x})\ \Big|\ \mathbf{x}\in M\Big\} \qquad (7.102)$$

which can also be written as

$$\min\Big\{t - \sum_{i=1}^m\psi_i(\mathbf{x})\ \Big|\ \varphi(\mathbf{x})\le t,\ \mathbf{x}\in M\Big\}$$

and in this form it coincides with (7.101). $\square$

We shall refer to the subproblem (7.101) with optimal value $\beta(M)$ as the bounding subproblem associated with $M$. Note that if $(t,\lambda)$ is an optimal solution of the linear program (7.101) then an optimal solution of the bounding subproblem (7.102) is $\mathbf{x}(M) = \sum_{v\in V(M)}\lambda_vv$.

Proposition 7.19 We have $\varphi(\mathbf{x}) = g(\mathbf{x})$ and $\psi_i(\mathbf{x}) = h_i(\mathbf{x})$ $\forall i=1,\ldots,m$ at every vertex of $M$. In particular, if $\mathbf{x}(M)$ is a vertex of $M$, i.e., if all but one $\lambda_v$, $v\in V(M)$, are zero, then $\mathbf{x}(M)$ is a minimizer of $f(\mathbf{x})$ over $M$.

Proof Obvious from the definitions of $\varphi(\mathbf{x})$ and $\psi_i(\mathbf{x})$. $\square$

b. Branching

Branching is performed using $\omega$-subdivision. Specifically, let $\mathbf{x}(M) = \sum_{v\in V(M)}\lambda_vv$ be an optimal solution of the relaxed problem (7.102) and let $J(M) = \{v\in V(M)\mid \lambda_v>0\}$. Then subdivide $M$ via the point $\mathbf{x}(M)$.

Theorem 7.11 The BB algorithm with the above bounding and $\omega$-subdivision for branching is convergent. More precisely, if $M_k$ is the partition set at iteration $k$, and $\mathbf{x}^k$ is the current best solution, then any cluster point $\bar{\mathbf{x}}$ of the sequence $\{\mathbf{x}^k\}$ yields an optimal solution of the problem.

Proof The algorithm works exactly the same way as the one for solving the equivalent concave minimization under convex constraints

$$\min\{t - h(\mathbf{x})\mid g(\mathbf{x}) - t\le 0,\ \mathbf{x}\in S\}.$$

The convergence of the algorithm then follows from Theorem 7.4 on the convergence of the BB algorithm for (SBCP). $\square$

Computational testing of the above algorithm has been carried out on the Wisconsin Diagnostic Breast Cancer Database created by Wolberg et al. (1994, 1995a,b). This database consists of 569 vectors in $\mathbb{R}^{30}$, which should be arranged into two clusters (malignant/benign). The original number of feature parameters ($n=30$) is too large for a direct application of global optimization methods, but it is argued that one can restrict consideration to the four most informative parameters. So the clustering problem is solved with $m=569$, $n=4$, $p=2$. The effectiveness of the method is confirmed by the computational experiments performed: in each set of these, the two clusters obtained turned out to be correct with 96 % accuracy. For details, see Tuy et al. (2001).

7.9 Exercises

1 Consider the problem $\min\{\sum_{j=1}^n f_j(x_j) : x\in D\}$, where $D\subset\mathbb{R}^n_+$ is a polytope and

$$f_j(t) = \begin{cases}0 & \text{if } t=0,\\ d_j+c_jt & \text{if } t>0\end{cases}$$

($d_j$ is a fixed cost). Show that there exists an $\varepsilon>0$ such that this problem is equivalent to a (BCP) with $f(x) = \sum_{j=1}^n\tilde f_j(x_j)$, where $\tilde f_j:\mathbb{R}\to\mathbb{R}$ is a concave function which is linear on the interval $(-\infty,\varepsilon]$ and agrees with $f_j(t)$ at the point $t=0$ and for $t\in[\varepsilon,+\infty)$.

2 Let $\varepsilon\ge 0$ be a given number. A feasible solution $\bar x$ of (BCP) is said to be an $\varepsilon$-approximate (global optimal) solution if for any feasible $x$ there exists at least one feasible point $y$ such that $\|y-x\|\le\varepsilon$ and $f(y)\ge f(\bar x)$ (for $\varepsilon=0$, this reduces to the usual concept of global optimal solution). Consider the following cutting method for (BCP):

Take a vertex $x^0$ of $D$. Set $D_0=D$, $k=0$.

(1) Let $\bar x^k\in\mathrm{argmin}\{f(x^i),\ i=0,\ldots,k\}$, $\gamma_k=f(\bar x^k)$. Compute a $\gamma_k$-valid cut $\pi^k(x-x^k)\ge 1$ for $(f,D_k)$ at $x^k$ and solve the linear program

$$\text{maximize } \pi^k(x-x^k)\ \text{ subject to } x\in D_k$$

obtaining its optimal value $\mu_k$ and a basic optimal solution $x^{k+1}$.
(2) If $\mu_k\le 1+\varepsilon\|\pi^k\|$, then terminate: accept $\bar x^k$ as an $\varepsilon$-approximate solution. Otherwise, let $D_{k+1}=D_k\cap\{x : \pi^k(x-x^k)\ge 1+\varepsilon\|\pi^k\|\}$ (geometrically, this amounts to adding a slice of thickness $\varepsilon$ to each cut), set $k\leftarrow k+1$ and go to (1).

Show that this cutting method is finite if $\varepsilon>0$. Does this method always give an $\varepsilon$-approximate solution of (BCP)?

3 Show that if the problem (BCP) has a unique global optimal solution then Procedure DC, in which the initial vertex $x^0$ is precisely the global optimal solution, cannot be infinite.

4 Solve the problem

$$\text{maximize } (x_1-\tfrac{1}{2})^2+(x_2-\tfrac{2}{3})^2 \ \text{ subject to}$$
$$-x_1+x_2\le 1;\quad x_1+x_2\le\tfrac{5}{2};\quad \tfrac{3}{2}x_1-x_2\le\tfrac{5}{4};\quad \tfrac{1}{2}x_1-x_2\le\tfrac{1}{4};$$
$$x_1\ge 0;\quad 0\le x_2\le\tfrac{3}{2}.$$

5 Solve the problem

$$\text{minimize } -x^2-y^2-(z-1)^2 \ \text{ subject to}$$
$$x+y-z\le 0;\quad -x+y-z\le 0;\quad 6x+y+z\le 1.9;$$
$$12x+5y+12z\le 22.8;\quad 12x+12y+7z\le 17.1;\quad y\ge 0.$$

6 Prove that (LRC) is equivalent to the parametric convex maximization problem: find the smallest value $\gamma$ such that

$$\max\{h(x)\mid Ax\le b,\ x\ge 0,\ cx\le\gamma\}\ge 0.$$

7 Solve the (LRC):

$$\text{maximize } -x_2 \ \text{ subject to}$$
$$x_1+2x_2\ge 2;\quad 0\le x_1;\quad 0\le x_2\le 2;$$
$$4(x_1-2)^2+4(x_2-2)^2\ge 9;\quad (x_1+3)^2+x_2^2\le\tfrac{81}{4}.$$

8 Consider the bilevel linear programming problem:

$$\text{minimize } 3x_1+2x_2+y_1+y_2 \ \text{ subject to}$$
$$x_1+x_2+y_1+y_2\le 4,\quad x_1\ge 0,\ x_2\ge 0,$$
$$(y_1,y_2)\in\mathrm{argmin}\{4y_1+y_2 : 3x_1+5x_2+6y_1+2y_2\ge 15,\ y_1\ge 0,\ y_2\ge 0\}.$$

Restate this problem as an (LRC) and solve it by the FW/BW or BB Algorithm.

9 Show that the problem of minimizing a convex function $f(x)$ under the reverse convex constraint $h(x)\ge 0$ is always regular. Describe an algorithm for solving it.

10 Consider the linear complementarity problem: given an $n\times n$ real matrix $Q$ and a vector $p\in\mathbb{R}^n$, find a vector $x\in\mathbb{R}^n$ such that $Qx+p\ge 0$, $x\ge 0$, $\langle x,Qx+p\rangle = 0$. Show that this problem can be converted into a (BCP).
Chapter 8
Special Methods

Many problems have a special structure which should be exploited appropriately for
their efficient solution. This chapter discusses special methods adapted to special
problems of dc optimization and extensions.

8.1 Polyhedral Annexation

A drawback of OA algorithms is that, due to their rigid structure, these algorithms allow neither restart nor multistart. Since a new linear constraint is added at each iteration while restart is not possible, the vertex set of the current outer approximating polytope may quickly reach a prohibitive size. Furthermore, OA methods require for their convergence certain conditions on the functions involved which may fail to hold in many problems of interest. It turns out that most of these difficulties can be overcome or mitigated by applying the OA strategy not to the original problem but, instead, to an equivalent problem in the dual space. This approach is based on the duality correspondence between convex sets and their polars, and is often referred to as Polyhedral Annexation (PA) or Inner Approximation (IA) (Tuy 1990).

8.1.1 Procedure DC*

In Chap. 7 (Sect. 7.1), we saw how the core of many global optimization problems is a DC feasibility problem of the following form:

(DC) Given two closed convex sets $C$ and $D$, find a point $x\in D\setminus C$ or else prove that $D\subset C$.


As in Chap. 6 (Sect. 6.3), assume that

$$0\in D\cap(\mathrm{int}\,C). \qquad (8.1)$$

By Proposition 1.21, condition (8.1), along with the convexity and closedness of $D$ and $C$, implies that $D=D^{\circ\circ}$, $C=C^{\circ\circ}$, so this polarity correspondence gives rise to a dual problem which will be referred to as the dual DC feasibility problem:

(C°D°) Find a point $v\in C^\circ\setminus D^\circ$ or else prove that $C^\circ\subset D^\circ$.

Theorem 8.1 The two problems (DC) and (C°D°) are equivalent in the following sense:

(i) $D\subset C$ if and only if $C^\circ\subset D^\circ$;
(ii) For each $v\in C^\circ\setminus D^\circ$ there exists a point $z\in D\setminus C$; conversely, for each $z\in D\setminus C$ there exists a point $v\in C^\circ\setminus D^\circ$.

Proof Since both $D$ and $C$ are closed convex sets, (i) follows immediately from Proposition 1.21. If $v\in C^\circ\setminus D^\circ$, then $\langle v,x\rangle\le 1$ $\forall x\in C$ while $\langle v,z\rangle>1$ for at least one $z\in D$, hence $z\in D\setminus C$. Conversely, if $z\in D\setminus C$, then $z\in D\setminus C^{\circ\circ}$, i.e., $\langle u,z\rangle\le 1$ $\forall u\in D^\circ$ while $\langle v,z\rangle>1$ for at least one $v\in C^\circ$, hence $v\in C^\circ\setminus D^\circ$. $\square$

Thus, the DC feasibility problem can be solved by solving the dual DC feasibility problem and vice versa. As often happens in optimization theory, the dual problem may turn out to be easier than the original one. In any case, the joint study of a dual pair of problems may give more insight into both and may suggest more efficient methods for handling each of them. Note that since $0\in\mathrm{int}\,C$, the set $C^\circ$ is compact by Proposition 1.21. Furthermore, if $D$ is compact then $0\in\mathrm{int}\,D^\circ$, i.e., $0\in C^\circ\cap\mathrm{int}\,D^\circ$, so there is full symmetry in the above duality correspondence.

Setting $G=D^\circ$, we see that the dual DC feasibility problem is to find an element of $C^\circ\setminus G$, where $G$ is a closed convex set. Let us examine how the OA method can be applied to solve this problem. Assume that $D=\{x\mid Ax\le b,\ x\ge 0\}$ is a polyhedron and $x^0=0$ is a vertex of $D$ satisfying (8.1). Let $M$ be the canonical cone associated with $x^0=0$ (see Sect. 7.1). For every $i=1,\ldots,n$ the $i$-th edge of $M$ either meets $\partial C$ at some point $z^i$ or is entirely contained in $C$. In the latter case, let $z^i$ be any point on this edge, arbitrarily far away from 0. Define the convex function

$$v\in\mathbb{R}^n\mapsto \mu(v) = \max\{\langle v,x\rangle\mid x\in D\}. \qquad (8.2)$$

Lemma 8.1 Let $S_1=[0,z^1,\ldots,z^n]$ and let $P$ be any polyhedron such that

$$C^\circ\subset P\subset P_1 := S_1^\circ. \qquad (8.3)$$

If $V$ is the vertex set of $P$ then

$$\mu(v)\le 1\ \forall v\in V\ \Rightarrow\ C^\circ\subset D^\circ. \qquad (8.4)$$

[Fig. 8.1 The initial polyhedron $P_1$, with unique vertex $v^1$, containing $C^\circ$]

Proof By Proposition 1.28, $P_1=\{y\mid \langle z^i,y\rangle\le 1,\ i=1,\ldots,n\}$, so by Proposition 1.27 the recession cone of $P_1$ is the cone $\{y\mid \langle z^i,y\rangle\le 0,\ i=1,\ldots,n\}=M^\circ$ (note that $M=\mathrm{cone}\{z^1,\ldots,z^n\}$). Since $D\subset M$, any $u\in M^\circ$ satisfies $\mu(u)\le 0$. From (8.3), for any halfline $\{v+\tau u\mid \tau\ge 0\}\subset P$, we have $u\in\mathrm{rec}\,P_1=M^\circ$, hence $\mu(v+\tau u)\le\max\{\langle v,x\rangle\mid x\in D\}+\tau\max\{\langle u,x\rangle\mid x\in D\}\le\mu(v)$. So the convex function $\mu(v)$ is bounded above on every halfline in $P$. Furthermore, $P_1$ has a unique vertex $v^1$ determined by

$$v^1 = eU^{-1},\quad U=[z^1,\ldots,z^n] \qquad (8.5)$$

($e$ is the row vector of $n$ ones, see (7.8)), so that $P_1=M^\circ+v^1$ and by Proposition 1.20 $P_1$ contains no line (Fig. 8.1). By Proposition 2.34, then, $\max\{\mu(v)\mid v\in P\}=\max\{\mu(v)\mid v\in V\}$, and so $\mu(v)\le 1\ \forall v\in V$ implies that $\max\{\mu(v)\mid v\in P\}\le 1$; hence $P\subset D^\circ$, i.e., (8.4) holds because $C^\circ\subset P$. $\square$

Denote now by $\mathcal{P}$ the family of polyhedrons $P$ satisfying (8.3) and for each polyhedron $P_k\in\mathcal{P}$ with vertex set $V_k$ define the distinguished point $v^k$ of $P_k$ to be the point

$$v^k\in\mathrm{argmax}\{\mu(v)\mid v\in V_k\}. \qquad (8.6)$$

By Lemma 8.1, if $\mu(v^k)\le 1$, then $C^\circ\subset D^\circ$ and the dual DC feasibility problem is solved. Otherwise, let $\mu(v^k)=\langle v^k,x^k\rangle$, $x^k\in D$. There are two possibilities:

(a) $x^k\notin C$: then $x^k\in D\setminus C$ and the problem is also solved.
(b) $x^k\in C$: then $v^k\notin C^\circ$ (because $\langle v^k,x^k\rangle>1$) and the point $\hat x^k$ where the ray through $x^k$ meets $\partial C$ satisfies $\langle v^k,\hat x^k\rangle\ge\mu(v^k)>1$, so the inequality

$$\langle\hat x^k,y\rangle\le 1 \qquad (8.7)$$

holds for all $y\in C^\circ$ but not for $y=v^k$. Therefore, the cut (8.7) determines a polytope $P_{k+1}=\{y\in P_k\mid \langle\hat x^k,y\rangle\le 1\}\in\mathcal{P}$ such that $v^k\notin P_{k+1}$.

We can thus formulate the following OA procedure for solving the dual DC feasibility problem.

Procedure DC* ($D$ Polyhedron, $C$ Closed Convex Set)
Initialization. Select a vertex $x^0$ of $D$ such that $x^0\in\mathrm{int}\,C$ and let $M$ be the associated canonical cone. Set $D\leftarrow D-x^0$, $C\leftarrow C-x^0$, $M\leftarrow M-x^0$. For each $i=1,\ldots,n$ let $z^i$ be the point where the $i$-th edge of $M$ meets $\partial C$ (or any point of this edge if it does not meet $\partial C$). Let $P_1=\{y\mid \langle z^i,y\rangle\le 1,\ i=1,\ldots,n\}$, $U=[z^1,\ldots,z^n]$, $v^1=eU^{-1}$, $V_1=\{v^1\}$, $V_1'=V_1$. Set $k=1$.
Step 1. For each $v\in V_k'$ solve

$$LP(v,D)\qquad \max\{\langle v,x\rangle\mid x\in D\},$$

to obtain its optimal value $\mu(v)$ and a basic optimal solution $x(v)$.
Step 2. Let $v^k\in\mathrm{argmax}\{\mu(v)\mid v\in V_k\}$. If $\mu(v^k)\le 1$, then terminate: $C^\circ\subset D^\circ$, hence $D\subset C$.
Step 3. If $\mu(v^k)>1$ and $x^k := x(v^k)\notin C$, then terminate: $x^k\in D\setminus C$.
Step 4. If $\mu(v^k)>1$ and $x^k\in C$, compute $\theta_k\ge 1$ such that $\hat x^k=\theta_kx^k\in\partial C$ and define

$$P_{k+1} = P_k\cap\{y\mid \theta_k\langle x^k,y\rangle\le 1\}. \qquad (8.8)$$

Step 5. Compute the vertex set $V_{k+1}$ of $P_{k+1}$ and let $V_{k+1}'=V_{k+1}\setminus V_k$. Set $k\leftarrow k+1$ and go back to Step 1.
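The following Python skeleton mirrors the steps of Procedure DC*; everything nontrivial (the LP oracle, membership in $C$, the boundary-scaling factor $\theta_k$, and vertex enumeration of the cut polytopes) is a hypothetical helper supplied by the caller, and the translation by $x^0$ is assumed already done.

```python
import numpy as np

def procedure_dc_star(solve_lp, in_C, boundary_scale, P1_vertices,
                      enumerate_vertices, max_iter=1000):
    """Skeleton of Procedure DC* (hypothetical oracles; x^0 translated to 0).

    solve_lp(v)        -> (mu(v), x(v)): value and argmax of <v, x> over D.
    in_C(x)            -> membership test for C.
    boundary_scale(x)  -> theta >= 1 with theta * x on the boundary of C.
    enumerate_vertices(cuts) -> vertices of P_k = P_1 ∩ {<xhat, y> <= 1, xhat in cuts}.
    """
    cuts, V = [], [np.asarray(v) for v in P1_vertices]
    mu = {tuple(v): solve_lp(v) for v in V}
    for _ in range(max_iter):
        vk = max(V, key=lambda v: mu[tuple(v)][0])   # Step 2: distinguished point
        mu_k, xk = mu[tuple(vk)]
        if mu_k <= 1.0:
            return ('D_subset_C', None)              # C° ⊂ D°, hence D ⊂ C
        if not in_C(xk):
            return ('found', xk)                     # Step 3: x^k ∈ D \ C
        cuts.append(boundary_scale(xk) * np.asarray(xk))  # Step 4: cut <xhat^k, y> <= 1
        V = enumerate_vertices(cuts)                 # Step 5
        for v in V:
            if tuple(v) not in mu:
                mu[tuple(v)] = solve_lp(v)           # Step 1 for new vertices only
    raise RuntimeError('iteration limit reached')
```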
Theorem 8.2 Procedure DC* terminates after finitely many steps, yielding a point of $D\setminus C$ or proving that $D\subset C$.

Proof Each iteration generates an $x^k$ which is a vertex of $D$ (basic optimal solution of $LP(v^k,D)$). But $\hat x^k$ is distinct from all $\hat x^{k-1},\ldots,\hat x^1$. Indeed, $v^k$ satisfies all the constraints $\langle\hat x^i,y\rangle-1\le 0$, $i=1,\ldots,k-1$, because $v^k\in P_k$, but violates the constraint $\langle\hat x^k,y\rangle-1\le 0$ because $\langle v^k,\hat x^k\rangle\ge\mu(v^k)>1$. Therefore, $x^k$ is distinct from all $x^i$, $i=1,\ldots,k-1$. This implies that the number of iterations cannot exceed the number of vertices of $D$. $\square$
Remark 8.1 All the linear programs $LP(v,D)$ have the same constraint set. This should make the solution of these programs an easy task. If 0 is not a vertex but an interior point of $D$ (so that $0\in\mathrm{int}\,D\cap\mathrm{int}\,C$), then one can take the initial polyhedron to be $P_1=\{y\mid \langle z^i,y\rangle\le 1,\ i=1,\ldots,n+1\}$, where $z^1,\ldots,z^{n+1}$ are $n+1$ points on $\partial C$ determining an $n$-simplex containing 0 in its interior. In this case $V_1$ consists of the $n+1$ vertices of $P_1$.

Remark 8.2 If we denote by $S_k$ the polyhedron such that $S_k^\circ=P_k$ (i.e., $S_k=P_k^\circ$), then assuming $x^0=0$ and using Propositions 1.21 and 1.28 it can be verified that

$$S_1\subset S_2\subset\ldots\subset S_k\subset\ldots\subset C,\qquad S_{k+1}=\mathrm{conv}(S_k\cup\{\hat x^k\}).$$

Indeed, from $P_{k+1}=\{y\mid \langle z^i,y\rangle\le 1,\ i=1,\ldots,n;\ \langle\hat x^j,y\rangle\le 1,\ j=1,\ldots,k\}$ it follows that for all $k=1,2,\ldots$:

$$S_{k+1}=\mathrm{conv}\{0,z^1,\ldots,z^n,\hat x^1,\ldots,\hat x^k\},$$

and hence $S_{k+1}=\mathrm{conv}(S_k\cup\{\hat x^k\})$.

The above procedure thus amounts to constructing a sequence of expanding polyhedrons contained in $C$, such that $S_{h+1}$ is obtained from $S_h$ by annexing $\hat x^h$ to $S_h$, until a polyhedron $S_k$ is obtained which entirely covers $D$ (in which case $D\subset C$) or a point $x^k\in D$ is found outside $C$ (then $x^k\in D\setminus C$). This idea of polyhedral annexation (which gives the method its name) was in fact suggested in Tuy (1964). On the other hand, since $S_1,S_2,\ldots$ form a sequence of polyhedrons approximating the convex set $C$ from the interior, the procedure can also be regarded as an inner approximation method.

Remark 8.3 With minor modifications Procedure DC* extends to the case where $D$ may not be bounded. For instance, in this case, if the point $x^k$ or $\hat x^k$ in Step 3 is at infinity in a direction $u^k$ then the inequality in (8.8) should be replaced by

$$\langle u^k,y\rangle\le 0.$$

When $C^\circ$ has a nontrivial lineality space $L$, then $r := \dim C^\circ = n-\dim L < n$, so the dual DC feasibility problem is actually a problem in dimension $r<n$ and should be easier to solve than the original problem. We shall examine in Chap. 9 how the PA Algorithm can be streamlined to exploit this structure.

8.1.2 Polyhedral Annexation Method for (BCP)

To obtain an algorithm for (BCP) it only remains to incorporate Procedure DC* into the usual two-phase scheme. Furthermore, to enhance efficiency, each time a feasible solution better than the incumbent is found, we can introduce a concavity cut to reduce the feasible polyhedron to be considered in the next cycle. This leads to the following:

PA (Polyhedral Annexation) Algorithm for (BCP)

Phase 1. Compute a vertex $\bar x$ of $D$.

Phase 2. Call Procedure DC* with $C=\{x\mid f(x)\ge f(\bar x)\}$ (take the initial vertex $x^0$ to be such that $f(x^0)>f(\bar x)$).
(a) If the procedure returns a vertex $z\in D\setminus C$, then set $\bar x\leftarrow z$ and go back to Phase 1. (Optionally: with $\gamma=f(z)$ construct a $\gamma$-valid cut $\pi(x-x^0)\ge 1$ for $(f,D)$ at $x^0$ and reset $D\leftarrow D\cap\{x\mid \pi(x-x^0)\ge 1\}$.)
(b) If the procedure establishes that $D\subset C$, then $\bar x$ is an optimal solution.

Theorem 8.3 The PA Algorithm for (BCP) terminates after finitely many steps at a global minimizer.

Proof Straightforward from the finiteness of Procedure DC*. $\square$

Remark 8.4 Each new cycle is essentially a restart. Just as with the CS Restart Algorithm in Chap. 6, restart is a device to avoid excessive growth of $V_k$. Moreover, since a new vertex $x^0$ is used to initialize Phase 2 and the region that remains to be explored is reduced by a cut, restart can also increase the chance of transcending the incumbent. In practical implementations restart with a new cut and a new $x^0$ is recommended when Phase 2 seems to drag on and $|V_k|$ approaches a critical limit.

Remark 8.5 In the above statement of the PA Algorithm we implicitly assumed that $D$ is bounded. To extend the procedure to the case when $D$ may be unbounded, it suffices to make the minor modifications indicated in Remark 8.3. It should be emphasized that, in contrast to OA methods, the PA Algorithm does not require the computation of subgradients. Nor does it require the function $f(x)$ to be finite on an open set containing $D$ (though it would be more efficient in this case). Another advantage of the PA Algorithm, which will be demonstrated in Chap. 9, is that it can be used as a technique for reducing the dimension of the problem when the lineality of $C^\circ$ is positive.

Example 8.1

$$\text{Minimize } f(x) := -(x_1-4.2)^2-(x_2-1.9)^2$$
$$\text{subject to}\quad -x_1+x_2\le 3$$
$$\qquad\qquad\ \ x_1+x_2\le 11$$
$$\qquad\qquad\ \ 2x_1-x_2\le 16$$
$$\qquad\qquad\ \ -x_1-x_2\le -1$$
$$\qquad\qquad\ \ x_2\le 5$$
$$\qquad\qquad\ \ x_1\ge 0,\ x_2\ge 0.$$

The algorithm is initialized from $z=(1.0,\,0.5)$.

Phase 1 finds a vertex local minimizer $\bar x=(9.0,\,2.0)$. In Phase 2, which is started from the vertex $x^0=(6.0,\,5.0)$, Procedure DC* with $C=\{x : f(x)\ge f(\bar x)\}$ terminates after two iterations with the conclusion $D\subset C$. Hence the optimal solution is $\bar x=(9.0,\,2.0)$ (Fig. 8.2).

8.1.3 Polyhedral Annexation Method for (LRC)

Since Procedure DC* solves the DC feasibility problem, it can also be incorporated into a two-phase scheme for solving linear programs with an additional reverse convex constraint:

[Fig. 8.2 Solving Example 8.1]

$$(LRC)\qquad \min\{cx\mid x\in D\setminus\mathrm{int}\,C\}$$

where $D=\{x\mid Ax\le b,\ x\ge 0\}$ is a polytope and $C=\{x\mid h(x)\le 0\}$ is a closed convex set ($h:\mathbb{R}^n\to\mathbb{R}$ being a convex function).

In the following algorithm (which is analogous to the CS Restart Algorithm for (LRC), except that Procedure DC* is used instead of Procedure DC), $C_\varepsilon=\{x\mid h(x)\le\varepsilon\}$ and

$$X_\varepsilon=\bigcup\{\pi(E)\mid E\in\mathcal{E}\}$$

where $\mathcal{E}$ is the collection of all edges of $D$ and for every $E\in\mathcal{E}$, $\pi(E)$ is the set of extreme points of the line segment $E\cap C_\varepsilon$.

PA Algorithm for (LRC)
Initialization. Solve the linear program $\min\{cx\mid x\in D\}$ to obtain a basic optimal solution $w$. If $h(w)\ge -\varepsilon$, terminate: $w$ is an $\varepsilon$-approximate optimal solution. Otherwise:
(a) If a vertex $z^0$ of $D$ is available such that $h(z^0)>\varepsilon$, go to Phase 1.
(b) Otherwise, set $\gamma=+\infty$, $D(\gamma)=D$, and go to Phase 2.

Phase 1. Starting from $z^0$ search for a feasible point $\bar x\in X_\varepsilon$. Let $\gamma=c\bar x$, $D(\gamma)=\{x\in D\mid cx\le\gamma\}$. Go to Phase 2.
Phase 2. Call Procedure DC* with $D(\gamma)$, $C_\varepsilon$ in place of $D$, $C$ respectively.
(a) If a point $z\in D(\gamma)\setminus C_\varepsilon$ is obtained, set $z^0\leftarrow z$ and go to Phase 1.
(b) If $D(\gamma)\subset C_\varepsilon$, then terminate: $\bar x$ is an $\varepsilon$-approximate optimal solution (when $\gamma<+\infty$), or the problem is infeasible (when $\gamma=+\infty$).

Proposition 8.1 The PA Algorithm for (LRC) is finite. $\square$
u

8.2 Partly Convex Problems

Among nonconvex global optimization problems a class of particular interest consists of partly convex problems, which have the following general formulation:

$$(P)\qquad \inf\{F(x,y)\mid G_i(x,y)\le 0,\ i=1,\ldots,m,\ x\in C,\ y\in D\},$$

where $C$ is a compact convex subset of $\mathbb{R}^n$, $D$ a closed convex subset of $\mathbb{R}^p$, while $F(x,y):C\times D\to\mathbb{R}$, $G_i(x,y):C\times D\to\mathbb{R}$, $i=1,\ldots,m$, are lower semi-continuous functions satisfying the following assumption:

(PCA) For every fixed $x\in\mathbb{R}^n$ the functions $F(x,y),G_1(x,y),\ldots,G_m(x,y)$ are convex in $y$.

When $x,y$ are both control variables, this is a problem encountered in numerous applications such as: pooling and blending in oil refinery, optimal design of water distribution, structural design, signal processing, robust stability analysis, design of chips, etc. When $y$ is the control variable and $x$ a parameter, this is a parametric optimization problem quite often considered in perturbation analysis of optimization problems.

In many cases of interest, e.g., when $x$ is a parameter, the problem is so structured that it becomes much easier to solve when $x$ is held temporarily fixed. To exploit this partitioning structure of the variables, a popular method for solving (P) is to proceed by branch and bound (BB), with branching performed in the $x$-space, i.e., in $\mathbb{R}^n$, according to an exhaustive rectangular successive subdivision rule (e.g., the standard bisection rule). Since in this way the problem is decomposed into a sequence of easier subproblems of smaller dimension, the procedure is referred to as a decomposition method.

Let $G(x,y) := (G_1(x,y),\ldots,G_m(x,y))$. Given a partition set $M\subset\mathbb{R}^n$, the subproblem associated with it is

$$(P(M))\qquad \alpha(M)=\inf\{F(x,y)\mid G(x,y)\le 0,\ x\in C\cap M,\ y\in D\}.$$

Consider the Lagrange dual of (P(M)):

$$(DP(M))\qquad \beta(M)=\sup_{\lambda\in\mathbb{R}^m_+}\ \inf_{x\in C\cap M,\,y\in D}\{F(x,y)+\langle\lambda,G(x,y)\rangle\}.$$

By the weak duality property it is well known that

$$\beta(M)\le\alpha(M), \qquad (8.9)$$

so $\beta(M)$ yields a lower bound for the optimal value $\alpha(M)$ of the subproblem (P(M)). The problem (DP(M)) is called a Lagrange relaxation of (P(M)) and the bound $\beta(M)$ is often referred to as a duality bound or a Lagrangian bound.

In general the inequality (8.9) is strict, i.e., the difference $\alpha(M)-\beta(M)$, referred to as the duality gap, or simply the gap, is positive and most often rather large. However, it has been observed that, through a suitable BB scheme, one can generate a nested sequence of boxes $\{M_k\}$ shrinking to a point $x^*\in C$, such that the gap $\alpha(M_k)-\beta(M_k)$ decreases as $k\to+\infty$. Under suitable conditions it is possible to eventually close the gap, i.e., to arrange things so that

$$\alpha(M_k)-\beta(M_k)\searrow 0\quad\text{as } k\to+\infty,$$

and an optimal solution $y^*$ of the convex subproblem

$$\min\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}$$

will give an optimal solution $(x^*,y^*)$ of (P).

This is the essential idea of the global optimization method by reducing the duality gap that was first developed in Ben-Tal et al. (1994) for partly linear problems, i.e., problems (P) in the special case when $F(x,y)$, $G(x,y)$ are affine in $y$ for every fixed value of $x$.

In fact, the idea of using duality bounds for solving nonconvex global optimization problems dates back at least to Falk (1969) and Falk and Soland (1969) (see also Ekeland and Temam 1976, and especially Shor and Stetsenko (1989), where a comprehensive study of the subject can be found). However, it was not until Ben-Tal et al. (1994) that the method was proposed for partly linear global optimization. Despite a fairly elaborate apparatus, only partial and specific results were obtained. This motivated quite a few researches aimed at refining the duality bound method and extending it to more general partly convex and nonconvex optimization problems. However, due to serious errors and misleading conclusions in several widespread publications, a rather confused situation resulted in this area that called for clarification (Tuy 2007a).

We first describe a prototype BB decomposition algorithm for the general problem (P) without assumption (PCA). Afterwards this algorithm will be specialized to problem (P) with assumption (PCA).

Since $C$ is compact, it is contained in a rectangle $X\subset\mathbb{R}^n$. The following standard BB algorithm can be stated for solving problem (P):

Prototype BB Decomposition Algorithm
Select an exhaustive subdivision rule (e.g., the standard bisection). Take a number $\gamma\ge\sup\{F(x,y)\mid x\in C,\ y\in D\}$. If no such $\gamma\in\mathbb{R}$ is readily available, let $\gamma=+\infty$.
Initialization. If some feasible solutions are available, let CBS $:=(x^0,y^0)$ be the best among them. Set $M_1:=X$, $\mathcal{P}_1:=\mathcal{S}_1:=\{M_1\}$, $k:=1$.
Step 1. For each rectangle $M\in\mathcal{P}_k$ compute a lower bound for $F(x,y)$ over the feasible points in $M$, i.e., a number $\beta(M)$ such that

$$\beta(M)\le\inf\{F(x,y)\mid G(x,y)\le 0,\ x\in M\cap C,\ y\in D\}. \qquad (8.10)$$


Step 2. Update the incumbent CBS by setting $(x^k,y^k)$ equal to the best among all feasible solutions available at the completion of the previous step.
Step 3. Delete every $M\in\mathcal{S}_k$ such that $\beta(M)\ge\min\{\gamma,F(x^k,y^k)\}$ (with the convention $F(x^k,y^k)=+\infty$ if $(x^k,y^k)$ is not defined). Let $\mathcal{R}_k$ be the set of remaining members of $\mathcal{S}_k$.
Step 4. If $\mathcal{R}_k=\emptyset$, then terminate: CBS $=(x^k,y^k)$ yields a global optimal solution, or else the problem is infeasible (if CBS is not defined).
Step 5. Choose $M_k\in\mathrm{argmin}\{\beta(M)\mid M\in\mathcal{R}_k\}$. Subdivide $M_k$ according to the chosen subdivision rule. Let $\mathcal{P}_{k+1}$ be the resulting partition of $M_k$.
Step 6. Let $\mathcal{S}_{k+1}:=(\mathcal{R}_k\setminus\{M_k\})\cup\mathcal{P}_{k+1}$. Increment $k$ and go back to Step 1.
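The next sketch (an addition) implements this prototype as a best-first loop; the bounding oracle `lower_bound`, the feasible-point generator `feasible_candidates`, and the subdivision rule `subdivide` are hypothetical helpers corresponding to Steps 1, 2, and 5.

```python
import heapq

def bb_decompose(X, lower_bound, feasible_candidates, subdivide,
                 tol=1e-6, max_iter=10000):
    """Skeleton of the prototype BB decomposition algorithm (hypothetical oracles).

    lower_bound(M)         -> beta(M) satisfying (8.10) and (8.12); +inf allowed.
    feasible_candidates(M) -> iterable of feasible (x, y, value) found while bounding M.
    subdivide(M)           -> partition of M (e.g., standard bisection).
    """
    best_val, best = float('inf'), None
    heap = [(lower_bound(X), 0, X)]              # entries (beta(M), tie-breaker, M)
    counter = 1
    for _ in range(max_iter):
        if not heap:
            return best, best_val                # nothing left: incumbent is optimal
        beta, _, M = heapq.heappop(heap)
        if beta >= best_val - tol:
            return best, best_val                # Step 3/4: all remaining boxes pruned
        for child in subdivide(M):               # Step 5
            for (x, y, val) in feasible_candidates(child):   # Step 2
                if val < best_val:
                    best_val, best = val, (x, y)
            b = max(beta, lower_bound(child))    # bounds never decrease on nested boxes
            if b < best_val - tol:
                heapq.heappush(heap, (b, counter, child)); counter += 1
    raise RuntimeError('iteration limit reached')
```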
A key operation in this algorithm is the computation of a bound $\beta(M)$ for every partition set $M$ (Step 1). According to the prototype BB algorithm described in Sect. 6.2 this bound must be valid, which means that it must satisfy not only condition (8.10) but also

$$\{(x,y)\mid G(x,y)\le 0,\ x\in M\cap C,\ y\in D\}=\emptyset\ \Rightarrow\ \beta(M)=+\infty. \qquad (8.11)$$

Furthermore, it is convenient to assume that $\beta(M')\ge\beta(M)$ for $M'\subset M$, as we can always replace $\beta(M')$ with $\beta(M)$ if $\beta(M')<\beta(M)$.

However, since it is very difficult to decide whether the nonconvex set on the left-hand side of the implication (8.11) is empty or not, we replace this possibly impractical condition by a simpler one, namely

$$M\cap C=\emptyset\ \Rightarrow\ \beta(M)=+\infty. \qquad (8.12)$$

Certainly, with this change the convergence criterion (6.5) for the prototype BB algorithm may no longer be valid. Fortunately, however, as will be seen shortly, it is not difficult to find a suitable alternative convergence criterion.

Usually $\beta(M)$ is taken to be the optimal value of some relaxation of the subproblem (P(M)) in (8.10). The most popular relaxations are:

- linear relaxation: $F(x,y)$ and $G(x,y)$ are replaced by their respective affine minorants, and $C$, $D$ by outer approximating polyhedrons;
- convex relaxation, in particular SDP (semidefinite programming) relaxation: a suitable convex program (in particular an SDP) is derived whose optimal value is an underestimator of $\alpha(M):=\inf\{F(x,y)\mid G(x,y)\le 0,\ x\in M\cap C,\ y\in D\}$;
- Lagrange relaxation: the subproblem (P(M)) is replaced by its Lagrange dual (DP(M)); in other words, $\beta(M)$ is taken to be the optimal value of (DP(M)):

$$\beta(M)=\sup_{\lambda\in\mathbb{R}^m_+}\inf\{F(x,y)+\langle\lambda,G(x,y)\rangle\mid x\in M\cap C,\ y\in D\}. \qquad (8.13)$$

Whatever bounds are used, when infinite the above BB procedure generates a filter (an infinite nested sequence of partition sets) $M_k$, $k\in I\subset\{1,2,\ldots\}$, such that

$$\beta(M_k)\le\min(P)\ \forall k\in I;\quad M_k\cap C\ne\emptyset\ \forall k\in I;\quad \bigcap_{k\in I}M_k=\{x^*\}. \qquad (8.14)$$

The algorithm is said to be convergent if $x^*\in C$ and

$$\min(P)=\min\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}, \qquad (8.15)$$

so that any optimal solution $y^*$ of this problem yields an optimal solution $(x^*,y^*)$ of (P).

Theorem 8.4 If $\beta(M_k)=+\infty$ for some $k$ then (P) is infeasible and the algorithm terminates. If $\beta(M_k)<+\infty$ $\forall k$ then there is an infinite subsequence $M_k$, $k\in I\subset\{1,2,\ldots\}$, satisfying (8.14) and such that $x^*\in C$. If, in addition,

$$\lim_{k\to\infty}\beta(M_k)=\min\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}, \qquad (8.16)$$

then (8.15) holds, i.e., the BB decomposition algorithm is convergent.

Proof Since $M_{k+1}\subset M_k$ and hence $\beta(M_{k+1})\ge\beta(M_k)$, we have from (8.14)

$$\beta(M_k)\nearrow\beta^*\le\min(P). \qquad (8.17)$$

If $\beta(M_k)=+\infty$, then $\beta(M)=+\infty$ $\forall M\in\mathcal{S}_k$, hence by (8.10) the problem is infeasible. If $\beta(M_k)<+\infty$ $\forall k$, then by (8.14) $M_k\cap C\ne\emptyset$ $\forall k$. The sets $M_k\cap C$, $k\in I$, then form a nested sequence of nonempty compact sets, so by the Cantor theorem $\cap_{k\in I}(M_k\cap C)=(\cap_{k\in I}M_k)\cap C\ne\emptyset$. Therefore, $x^*\in C$. On the other hand, since $\beta(M_k)\le\min(P)$ by (8.14), it follows that $\beta^*=\lim_{k\to+\infty}\beta(M_k)\le\min(P)$ (see (8.17)). If (8.16) holds then

$$\beta^*=\min\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}\ \ge\ \min\{F(x,y)\mid G(x,y)\le 0,\ x\in C,\ y\in D\}=\min(P),$$

hence (8.15), as was to be proved. $\square$

Remark 8.6 The condition $x\in M\cap C$ in (8.12) is essential to ensure that $x^*\in C$. It cannot be replaced by the weaker condition $x\in M$ (as has been proposed in some published algorithms), for then it may happen that $M_k\cap C=\emptyset$ (even though $\beta(M_k)<+\infty$) for some $k\in I$, hence $x^*\notin C$. To preclude this possibility one has to assume, as in the mentioned algorithms, that $M_1\subset C$, and hence $M_k\subset C$ $\forall k$, or else to compute, for each partition set $M$, a feasible solution $(x,y)\in M\times D$. The latter, however, is not always possible and may quite often be as difficult as solving the problem itself, especially when the constraints are highly nonconvex.

Back to partly convex problems, i.e., problem (P) with assumption (PCA), assume now that Lagrangian bounds are used throughout the BB decomposition algorithm. So for every partition set $M$ a lower bound $\beta(M)$ is computed according to formula (8.13).

Theorem 8.5 Assume the set $D$ is compact. Then the BB decomposition algorithm using Lagrangian bounds throughout is convergent.

Proof Let us fix $x\in C$. The function $y\mapsto F(x,y)+\langle\lambda,G(x,y)\rangle$ is convex in $y\in D$ and linear in $\lambda\in\mathbb{R}^m_+$. Since this function is lower semi-continuous in $y$ while $D$ is compact, we have the minimax equality (Theorem 2.7):

$$\inf_{y\in D}\sup_{\lambda\in\mathbb{R}^m_+}\{F(x,y)+\langle\lambda,G(x,y)\rangle\}=\sup_{\lambda\in\mathbb{R}^m_+}\min_{y\in D}\{F(x,y)+\langle\lambda,G(x,y)\rangle\}. \qquad (8.18)$$

Since clearly

$$\sup_{\lambda\in\mathbb{R}^m_+}\{F(x,y)+\langle\lambda,G(x,y)\rangle\}=\begin{cases}F(x,y) & \text{if } G(x,y)\le 0,\\ +\infty & \text{otherwise},\end{cases} \qquad (8.19)$$

we have from (8.18), for every $x\in C$,

$$\inf\{F(x,y)\mid G(x,y)\le 0,\ y\in D\}=\sup_{\lambda\in\mathbb{R}^m_+}\min_{y\in D}\{F(x,y)+\langle\lambda,G(x,y)\rangle\}. \qquad (8.20)$$

Consider now a filter $\{M_k\}$ collapsing to a point $x^*$ (to simplify the notation we write $M_k$ instead of $M_{k_\nu}$). As we saw in the proof of Theorem 8.4, $x^*\in C$, while by virtue of (8.20):

$$\inf\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}=\sup_{\lambda\in\mathbb{R}^m_+}\min_{y\in D}\{F(x^*,y)+\langle\lambda,G(x^*,y)\rangle\}. \qquad (8.21)$$

Let us show that this implies

$$\lim_{k\to\infty}\beta(M_k)=\inf\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}. \qquad (8.22)$$

From the obvious inequalities

$$\beta(M_k)\le\inf\{F(x,y)\mid G(x,y)\le 0,\ x\in M_k\cap C,\ y\in D\}\le\inf\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}$$

it follows that

$$\beta(M_k)\nearrow\beta^*\le\inf\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}.$$

Arguing by contradiction, suppose that (8.22) does not hold, i.e.,

$$\inf\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}>\beta^*. \qquad (8.23)$$

Then by virtue of (8.21),

$$\sup_{\lambda\in\mathbb{R}^m_+}\min_{y\in D}\{F(x^*,y)+\langle\lambda,G(x^*,y)\rangle\}>\beta^*, \qquad (8.24)$$

so there exists $\tilde\lambda$ satisfying

$$\min_{y\in D}\{F(x^*,y)+\langle\tilde\lambda,G(x^*,y)\rangle\}>\beta^*.$$

Using the lower semi-continuity of the function $(x,y)\mapsto F(x,y)+\langle\tilde\lambda,G(x,y)\rangle$ we can then find, for every fixed $y\in D$, an open ball $U_y$ in $\mathbb{R}^n$ around $x^*$ and an open ball $V_y$ in $\mathbb{R}^p$ around $y$ such that

$$F(x',y')+\langle\tilde\lambda,G(x',y')\rangle>\beta^*\quad \forall x'\in U_y\cap C,\ \forall y'\in V_y.$$

Since the balls $V_y$, $y\in D$, form a covering of the compact set $D$, there is a finite set $E\subset D$ such that the balls $V_y$, $y\in E$, still form a covering of $D$. If $U=\cap_{y\in E}U_y$, then for every $y\in D$ we have $y\in V_{y'}$ for some $y'\in E$, hence

$$F(x,y)+\langle\tilde\lambda,G(x,y)\rangle>\beta^*\quad \forall x\in U\cap C,\ \forall y\in D.$$

But $M_k\subset U$ for all sufficiently large $k$, because $\cap_kM_k=\{x^*\}$. Then the just established inequality implies that

$$\sup_{\lambda\in\mathbb{R}^m_+}\min\{F(x,y)+\langle\lambda,G(x,y)\rangle\mid x\in M_k\cap C,\ y\in D\}>\beta^*.$$

Hence, $\beta(M_k)>\beta^*$, a contradiction. This completes the proof. $\square$

Remark 8.7 It has been shown in Tuy (2007a) that Theorem 8.5 still holds if, rather than requiring that $D$ be compact, it is assumed only that there exists a compact set $D_0\subset D$ such that

$$(\forall x\in C)(\forall\lambda\in\mathbb{R}^m_+)\quad D_0\cap\mathrm{argmin}_{y\in D}\{F(x,y)+\langle\lambda,G(x,y)\rangle\}\ne\emptyset.$$

However, a counter-example given there also pointed out that Theorem 8.5 becomes false if the above condition is required only for those $x\in C$, $\lambda\in\mathbb{R}^m_+$ such that $\mathrm{argmin}_{y\in D}\{F(x,y)+\langle\lambda,G(x,y)\rangle\}\ne\emptyset$.

For certain applications the assumption of compactness of $D$ turned out to be too restrictive. In the next theorem this assumption is replaced by a weaker one, at the expense, however, of requiring the continuity, and not merely the lower semi-continuity, of the functions $F(x,y)$ and $G_i(x,y)$, $i=1,\ldots,m$.

Recall that a function $f:Y\to\mathbb{R}$ is said to be coercive on $Y$ if $f(y)\to+\infty$ as $y\in Y$, $\|y\|\to+\infty$, or, equivalently, if for any $\gamma\in\mathbb{R}$ the set $\{y\in Y\mid f(y)\le\gamma\}$ is bounded.

Theorem 8.6 Assume that the functions $F(x,y)$, $G_i(x,y)$, $i=1,\ldots,m$, are continuous in $x$ for fixed $y\in D$ and satisfy the condition

(S) for every $x\in C$ there exists $\lambda\in\mathbb{R}^m_+$ such that the function $F(x,y)+\langle\lambda,G(x,y)\rangle$ is coercive on $D$.

Then the BB decomposition algorithm using Lagrangian bounds throughout is convergent.

Proof In view of Theorem 8.5 it suffices to consider the case when $D$ is unbounded. Consider a filter $\{M_k\}$ collapsing to a point $x^*$. As we saw previously, $x^*\in C$, and, as $k\to+\infty$,

$$\beta(M_k)\nearrow\beta^*\le\alpha^*:=\inf\{F(x^*,y)\mid G(x^*,y)\le 0,\ y\in D\}.$$

According to (S) there exists $\tilde\lambda\in\mathbb{R}^m_+$ satisfying

$$F(x^*,y)+\langle\tilde\lambda,G(x^*,y)\rangle\to+\infty\quad\text{as } y\in D,\ \|y\|\to+\infty. \qquad (8.25)$$

By the minimax theorem (Sect. 2.12, Remark 2.2), this ensures that

$$\alpha^*=\sup_{\lambda\in\mathbb{R}^m_+}\inf_{y\in D}F(x^*,y)+\langle\lambda,G(x^*,y)\rangle\quad\text{(see (8.19))}$$
$$\quad=\min_{y\in D}\sup_{\lambda\in\mathbb{R}^m_+}F(x^*,y)+\langle\lambda,G(x^*,y)\rangle. \qquad (8.26)$$

We show that $\beta^*=\alpha^*$. Suppose the contrary, that

$$\beta^*<\alpha^*. \qquad (8.27)$$

Obviously $\mu\lambda+(1-\mu)\tilde\lambda\in\mathbb{R}^m_+$ for every $(\lambda,\mu)$ with $\lambda\in\mathbb{R}^m_+$, $0\le\mu<1$. Let

$$\Lambda:=\Big\{\lambda\in\mathbb{R}^m_+\ \Big|\ \inf_{y\in D}F(x^*,y)+\langle\lambda,G(x^*,y)\rangle>-\infty\Big\}. \qquad (8.28)$$

For every $k$, since $\beta(M_k)\le\beta^*$, we have

$$\forall(\lambda,\mu)\in\Lambda\times[0,1)\quad \inf_{x\in C\cap M_k,\,y\in D}F(x,y)+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x,y)\rangle\le\beta^*.$$

Hence, if $\varepsilon$ denotes an arbitrary number satisfying $0<\varepsilon<\alpha^*-\beta^*$, then for every $(\lambda,\mu)\in\Lambda\times[0,1)$ and every $k$ there exist $x^{(k,\lambda,\mu)}\in C\cap M_k$, $y^{(k,\lambda,\mu)}\in D$ satisfying

$$F(x^{(k,\lambda,\mu)},y^{(k,\lambda,\mu)})+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^{(k,\lambda,\mu)},y^{(k,\lambda,\mu)})\rangle\le\beta^*+\varepsilon. \qquad (8.29)$$

We contend that not for every $(\lambda,\mu)\in\Lambda\times[0,1)$ is the sequence $\{y^{(k,\lambda,\mu)},\ k=1,2,\ldots\}$ bounded. Indeed, otherwise we could assume that, for every $(\lambda,\mu)\in\Lambda\times[0,1)$: $y^{(k,\lambda,\mu)}\to y^{(\lambda,\mu)}\in D$ as $k\to+\infty$. Since $x^{(k,\lambda,\mu)}\in M_k$ while $\cap_{k=1}^{+\infty}M_k=\{x^*\}$, we would have $x^{(k,\lambda,\mu)}\to x^*$. Then letting $k\to+\infty$ in (8.29) would yield

$$F(x^*,y^{(\lambda,\mu)})+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^*,y^{(\lambda,\mu)})\rangle\le\beta^*+\varepsilon$$

for every $(\lambda,\mu)\in\Lambda\times[0,1)$; hence

$$\sup_{\lambda\in\Lambda,\,0\le\mu<1}\ \inf_{y\in D}F(x^*,y)+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^*,y)\rangle\le\beta^*+\varepsilon<\alpha^*.$$

But for every $\lambda\in\Lambda$ the concave function $\varphi(\mu)=\inf_{y\in D}F(x^*,y)+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^*,y)\rangle$ satisfies $\sup_{0\le\mu<1}\varphi(\mu)\ge\varphi(1)$, i.e.,

$$\sup_{0\le\mu<1}\ \inf_{y\in D}F(x^*,y)+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^*,y)\rangle\ \ge\ \inf_{y\in D}F(x^*,y)+\langle\lambda,G(x^*,y)\rangle.$$

Therefore, we would have

$$\sup_{\lambda\in\mathbb{R}^m_+}\inf_{y\in D}F(x^*,y)+\langle\lambda,G(x^*,y)\rangle$$
$$=\sup_{\lambda\in\Lambda}\inf_{y\in D}F(x^*,y)+\langle\lambda,G(x^*,y)\rangle\quad\text{(because of (8.28))}$$
$$\le\sup_{\lambda\in\Lambda,\,0\le\mu<1}\ \inf_{y\in D}F(x^*,y)+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^*,y)\rangle<\alpha^*,$$

contradicting (8.26). Thus, there exists $(\lambda,\mu)\in\Lambda\times[0,1)$ such that, by passing to a subsequence if necessary, $\|y^{(k,\lambda,\mu)}\|\to+\infty$ $(k\to+\infty)$. Now, for an arbitrary point $\bar y\in D$, let $\eta:=F(x^*,\bar y)+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^*,\bar y)\rangle$. Then there is $k_0$ such that for all $k\ge k_0$,

$$F(x^{(k,\lambda,\mu)},\bar y)+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^{(k,\lambda,\mu)},\bar y)\rangle<\eta+\varepsilon. \qquad (8.30)$$

For simplicity of notation, let us write $x^k,y^k$ for $x^{(k,\lambda,\mu)},y^{(k,\lambda,\mu)}$. For any $l>\|\bar y\|$ define

$$y^{kl}=\frac{l}{\|y^k\|}y^k+\Big(1-\frac{l}{\|y^k\|}\Big)\bar y.$$

Clearly, there is $k_1\ge k_0$ such that for all $k\ge k_1$,

$$0<l/\|y^k\|<1,\qquad l-\|\bar y\|\le\|y^{kl}\|\le l+\|\bar y\|.$$

From the inequalities (8.29) and (8.30) and the convexity of the function $y\mapsto F(x^k,y)+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^k,y)\rangle$, we then deduce, for every fixed $l\ge\|\bar y\|$,

$$F(x^k,y^{kl})+\langle\mu\lambda+(1-\mu)\tilde\lambda,\ G(x^k,y^{kl})\rangle\le\max\{\beta^*,\eta\}+\varepsilon<+\infty.$$

Since $\{y^{kl}\}$ is bounded, we can assume $y^{kl}\to u^l\in D$ as $k\to+\infty$. This yields

$$\mu\big[F(x^*,u^l)+\langle\lambda,G(x^*,u^l)\rangle\big]+(1-\mu)\big[F(x^*,u^l)+\langle\tilde\lambda,G(x^*,u^l)\rangle\big]\le\max\{\beta^*,\eta\}+\varepsilon,$$

where clearly $\|u^l\|\ge l-\|\bar y\|\to+\infty$ as $l\to+\infty$. Since $\lambda\in\Lambda$, by letting $l\to+\infty$ in the above inequality and taking account of (8.25) and (8.28), we get $+\infty\le\max\{\beta^*,\eta\}+\varepsilon$, which is a contradiction. Therefore $\beta^*=\alpha^*$, as was to be proved. $\square$
Remark 8.8 Condition (S) is equivalent to the following one:

(T) for every $x\in C$ there exist $\lambda\in\mathbb{R}^m_+$ and a real number $\gamma$ such that the set $\{y\in D\mid F(x,y)+\langle\lambda,G(x,y)\rangle\le\gamma\}$ is nonempty and bounded.

Indeed, it is straightforward that (S) implies (T). Conversely, if (T) holds then by a known property of proper convex functions (Corollary 2.3) the set $\{y\in D\mid F(x,y)+\langle\lambda,G(x,y)\rangle\le\gamma\}$ is bounded for every $\gamma\in\mathbb{R}$, hence (S) holds.

Corollary 8.1 In Problem (P) assume that $F(x,y)=\langle c,y\rangle$, $G(x,y)=A(x)y-b$ with $A(x)\in\mathbb{R}^{m\times p}$ for every fixed $x$, while $D=\mathbb{R}^p_+$ and the following condition is satisfied:

(S1) For every $x\in C$ there exists $\lambda\in\mathbb{R}^m_+$ such that $A(x)^T\lambda+c>0$.

Then the BB decomposition algorithm using Lagrangian bounds throughout is convergent.

Proof If (S1) holds then $F(x,y)+\langle\lambda,G(x,y)\rangle=\langle c,y\rangle+\langle\lambda,A(x)y-b\rangle=\langle A(x)^T\lambda+c,\ y\rangle-\langle\lambda,b\rangle\to+\infty$ as $y\in\mathbb{R}^p_+$, $\|y\|\to\infty$. Therefore, (S) holds. $\square$

Thus, Theorem 8.6 includes the basic result in Ben-Tal et al. (1994) for partly linear problems as a special case.
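Condition (S1) at a fixed $x$ is itself an LP feasibility question, so it can be checked numerically. The sketch below (an addition; `scipy.optimize.linprog` is assumed available) maximizes a margin $t$ with $A(x)^T\lambda + c\ge t\mathbf{1}$: (S1) holds at $x$ exactly when the optimal margin is positive.

```python
import numpy as np
from scipy.optimize import linprog

def check_S1(Ax, c, eps=1e-9):
    """Check condition (S1) at a fixed x: is there lam >= 0 with A(x)^T lam + c > 0?

    Solved as an LP: maximize t  s.t.  A(x)^T lam + c >= t * 1, lam >= 0, t <= 1
    (the cap t <= 1 only keeps the LP bounded).  Ax is the matrix A(x).
    """
    m, p = Ax.shape
    obj = np.concatenate([np.zeros(m), [-1.0]])        # variables (lam, t); maximize t
    A_ub = np.hstack([-Ax.T, np.ones((p, 1))])         # t*1 - A(x)^T lam <= c
    b_ub = np.asarray(c, dtype=float)
    bounds = [(0, None)] * m + [(None, 1.0)]
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.status == 0 and -res.fun > eps          # optimal t > 0 <=> (S1) at x
```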

8.2.1 Computation of Lagrangian Bounds

As we saw, a key step of the above decomposition scheme is the computation of the Lagrangian bounds. But these are in general not easily computable, because in most cases the Lagrangian dual of a nonconvex problem is itself a nonconvex problem. Therefore, the first important issue that arises is when Lagrangian bounds can be practically computed.

We now exhibit a class of problems for which the Lagrangian dual is a convex, or even a linear, program. This class includes partly linear optimization problems, which have the following general formulation:

$$(GPL)\qquad \min\{\langle c(x),y\rangle+\langle c_0,x\rangle\mid A(x)y+Bx\le b,\ r\le x\le s,\ y\ge 0\}$$

where $x\in\mathbb{R}^n$, $y\in\mathbb{R}^p$, $c:\mathbb{R}^n\to\mathbb{R}^p$, $c_0\in\mathbb{R}^n$, $A:\mathbb{R}^n\to\mathbb{R}^{m\times p}$, $B\in\mathbb{R}^{m\times n}$, $b\in\mathbb{R}^m$, $r,s\in\mathbb{R}^n_+$.

Setting $A(x)=[a_{ij}(x)]$, and denoting the $i$-th row of $B$ by $B_i$, this problem can also be written in the expanded form

$$(GPL)\qquad \begin{array}{ll}\min & \displaystyle\sum_{j=1}^p y_jc_j(x)+\langle c_0,x\rangle,\\[4pt] \text{s.t.} & \displaystyle\sum_{j=1}^p y_ja_{ij}(x)+\langle B_i,x\rangle\le b_i,\quad i=1,\ldots,m,\\[4pt] & y\ge 0,\quad r\le x\le s.\end{array}$$

Theorem 8.7 The Lagrangian dual of (GPL) with respect to the nonlinear constraints, i.e.,

$$\varphi_{r,s}=\sup_{\lambda\ge 0}\inf\{\langle y,c(x)\rangle+\langle c_0,x\rangle+\langle\lambda,A(x)y+Bx-b\rangle\mid x\in[r,s],\ y\ge 0\}, \qquad (8.31)$$

is the convex program

$$\varphi_{r,s}=\langle c_0,r\rangle+\max_{\lambda,t}\ \langle r-s,t\rangle+\langle Br-b,\lambda\rangle, \qquad (8.32)$$
$$\text{s.t. } g(\lambda)\ge 0,\quad t+B^T\lambda+c_0\ge 0,\quad \lambda\ge 0,\ t\ge 0, \qquad (8.33)$$

where $g(\lambda)=\min_{j=1,\ldots,p}\min_{r\le x\le s}\langle A_j(x),\lambda\rangle+c_j(x)$.

Proof For fixed $\lambda\ge 0$ we have

$$\inf\{\langle y,c(x)\rangle+\langle c_0,x\rangle+\langle\lambda,A(x)y+Bx-b\rangle\mid x\in[r,s],\ y\ge 0\}$$
$$=-\langle b,\lambda\rangle+\inf_{x\in[r,s]}\inf_{y\ge 0}\{\langle Bx,\lambda\rangle+\langle c_0,x\rangle+\langle c(x)+(A(x))^T\lambda,\ y\rangle\}$$
$$=-\langle b,\lambda\rangle+h(\lambda),$$

where

$$h(\lambda)=\begin{cases}\inf_{x\in[r,s]}\{\langle Bx,\lambda\rangle+\langle c_0,x\rangle\} & \text{if } c(x)+(A(x))^T\lambda\ge 0\ \forall x\in[r,s],\\ -\infty & \text{otherwise.}\end{cases} \qquad (8.34)$$

Now observe that for any $q\in\mathbb{R}^n$:

$$\min_{r\le x\le s}\langle q,x\rangle=\langle q,r\rangle+\max\{\langle r-s,t\rangle\mid t\ge 0,\ t\ge -q\},$$

since $\min\{\langle q,x\rangle\mid r\le x\le s\}=\min\{\langle q,r\rangle+\langle q,x-r\rangle\mid 0\le x-r\le s-r\}=\langle q,r\rangle+\sum_{q_i<0}q_i(s_i-r_i)=\langle q,r\rangle+\max\{\langle r-s,t\rangle\mid t\ge 0,\ t\ge -q\}$. Therefore,

$$\inf_{r\le x\le s}\{\langle Bx,\lambda\rangle+\langle c_0,x\rangle\}=\langle B^T\lambda+c_0,r\rangle+\max\{\langle r-s,t\rangle\mid t\ge 0,\ t\ge -B^T\lambda-c_0\}. \qquad (8.35)$$

Denote the $j$-th column of $A(x)$ by $A_j(x)$, so that the condition $c(x)+(A(x))^T\lambda\ge 0$ $\forall x\in[r,s]$ means $g(\lambda)\ge 0$, where

$$g(\lambda):=\min_{j=1,\ldots,p}\ \min_{r\le x\le s}\langle A_j(x),\lambda\rangle+c_j(x).$$

Since for fixed $x$ the function $\lambda\mapsto\langle A_j(x),\lambda\rangle+c_j(x)$ is affine, $g(\lambda)$ is a concave function and the problem (8.31) reduces to (8.32)–(8.33), which is a convex program. $\square$
Corollary 8.2 Assume that

(*) For every $j$ the functions $c_j(x),a_{ij}(x)$, $i=1,\ldots,m$, are either all increasing on $[r,s]$ or all decreasing on $[r,s]$.

Then the Lagrangian dual of (GPL) is the dual of its LP relaxation:

$$\min\ \Big\{\sum_{j\in J_+}c_j(r)y_j+\sum_{j\in J_-}c_j(s)y_j+\langle c_0,x\rangle\Big\} \qquad (8.36)$$
$$\text{s.t. }\sum_{j\in J_+}a_{ij}(r)y_j+\sum_{j\in J_-}a_{ij}(s)y_j+\langle B_i,x\rangle\le b_i,\quad i=1,\ldots,m, \qquad (8.37)$$
$$\qquad y\ge 0,\quad r\le x\le s, \qquad (8.38)$$

where $J_+$ is the set of all $j$ such that all $c_j(x),a_{ij}(x)$, $i=1,\ldots,m$, are increasing, and $J_-$ is the set of all $j$ such that all $c_j(x),a_{ij}(x)$, $i=1,\ldots,m$, are decreasing.

Proof By hypothesis $J_+\cup J_-=\{1,\ldots,p\}$, so for every $j=1,\ldots,p$ we have

$$\min_{r\le x\le s}\langle A_j(x),\lambda\rangle+c_j(x)\ge 0$$
$$\Leftrightarrow\ \min_{r\le x\le s}\Big[\sum_{i=1}^m\lambda_ia_{ij}(x)+c_j(x)\Big]\ge 0$$
$$\Leftrightarrow\ \begin{cases}\sum_{i=1}^m\lambda_ia_{ij}(r)+c_j(r)\ge 0 & \text{if } j\in J_+,\\ \sum_{i=1}^m\lambda_ia_{ij}(s)+c_j(s)\ge 0 & \text{if } j\in J_-.\end{cases} \qquad (8.39)$$

In view of (8.32), (8.33), (8.35), the problem (8.31) thus reduces to the linear program

$$\langle c_0,r\rangle+\max\ \Big\{\langle r-s,t\rangle+\sum_{i=1}^m\lambda_i(\langle B_i,r\rangle-b_i)\Big\},$$
$$\text{s.t.}\quad -\sum_{i=1}^m\lambda_iB_i-t\le c_0,$$
$$\qquad\ -\sum_{i=1}^m\lambda_ia_{ij}(r)\le c_j(r)\quad j\in J_+,$$
$$\qquad\ -\sum_{i=1}^m\lambda_ia_{ij}(s)\le c_j(s)\quad j\in J_-,$$
$$\qquad\ \lambda\ge 0,\ t\ge 0,$$

whose dual is

$$\langle c_0,r\rangle+\min\ \Big\{\sum_{j\in J_+}c_j(r)y_j+\sum_{j\in J_-}c_j(s)y_j+\langle c_0,z\rangle\Big\},$$
$$\text{s.t.}\quad \sum_{j\in J_+}a_{ij}(r)y_j+\sum_{j\in J_-}a_{ij}(s)y_j+\langle B_i,r+z\rangle\le b_i,\quad i=1,\ldots,m,$$
$$\qquad\ y\ge 0,\quad 0\le z\le s-r.$$

By setting $x=r+z$, the latter problem becomes (8.36)–(8.38). $\square$
Corollary 8.3 Assume that

(#) for each fixed $\lambda\ge 0$ each function $\langle A_j(x),\lambda\rangle+c_j(x)$ is quasiconcave.

Then the Lagrangian dual of (GPL) is the linear program

$$\max\ \Big\{\langle r-s,t\rangle+\sum_{i=1}^m\lambda_i(\langle B_i,r\rangle-b_i)\Big\},$$
$$\text{s.t.}\quad -\sum_{i=1}^m\lambda_iB_i-t\le 0,$$
$$\qquad\ -\sum_{i=1}^m\lambda_ia_{ij}(v)\le c_j(v)\quad v\in V,\ j=1,\ldots,p,$$
$$\qquad\ \lambda\ge 0,\ t\ge 0,$$

where $V$ denotes the vertex set of the hyperrectangle $[r,s]$.

Proof Indeed, the condition $\min_{r\le x\le s}\langle A_j(x),\lambda\rangle+c_j(x)\ge 0$ is then equivalent to saying that $\min_{v\in V}\langle A_j(v),\lambda\rangle+c_j(v)\ge 0$. $\square$
u
Thus, a Lagrangian bound for (GPL) is obtained by solving a convex program, which reduces to a linear program if assumption (*) or (#) is satisfied. In the case (*) is satisfied, this linear program is merely an LP relaxation of (GPL).
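Under assumption (*), the bound can thus be obtained by solving the LP (8.36)–(8.38) directly. The following sketch (an addition; `scipy.optimize.linprog` assumed) does so, classifying each column as increasing or decreasing by comparing its cost at the two box endpoints, which suffices when the $c_j$ are non-constant on $[r,s]$ (an assumption of this sketch).

```python
import numpy as np
from scipy.optimize import linprog

def gpl_bound_monotone(c_fun, a_fun, c0, B, b, r, s):
    """Lagrangian bound for (GPL) under (*), via the LP relaxation (8.36)-(8.38).

    c_fun(x) -> (p,) costs c_j(x); a_fun(x) -> (m, p) matrix [a_ij(x)];
    each column j is assumed monotone on [r, s] in the sense of (*).
    LP variables: (y, x).
    """
    cr, cs = c_fun(r), c_fun(s)
    Ar, As = a_fun(r), a_fun(s)
    inc = cr <= cs                         # increasing columns -> J_+, others -> J_-
    cy = np.where(inc, cr, cs)             # c_j(r) for j in J_+, c_j(s) for j in J_-
    Ay = np.where(inc[None, :], Ar, As)    # a_ij(r) or a_ij(s) accordingly
    p, n = len(cy), len(r)
    obj = np.concatenate([cy, c0])                     # min <cy, y> + <c0, x>
    A_ub = np.hstack([Ay, B]); b_ub = np.asarray(b)    # Ay @ y + B @ x <= b
    bounds = [(0, None)] * p + list(zip(r, s))         # y >= 0, r <= x <= s
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.fun                                     # = phi_{r,s}
```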

8.2.2 Applications

The usefulness of the above results is illustrated by two examples of applications.

I. The Pooling and Blending Problem. It was shown in Ben-Tal et al. (1994) that this problem from the petrochemical industry can be given in the form

$$\min\{c^T y \mid A(x)y \le b,\ y \ge 0,\ x \in X\},$$

where $X$ is a hyperrectangle in $\mathbb{R}^n$, and $A(x)$ is an $m \times p$ matrix whose elements $a_{ij}(x)$ are continuous functions of $x$. Since this is a special case of (GPL) satisfying condition (#) of Corollary 8.3, the Lagrangian bound can be computed by solving a linear program. Furthermore, for this (GPL) condition (S) in Theorem 8.6 now reads

$$(\forall x \in X)\ (\exists \lambda \in \mathbb{R}^m_+)\qquad c^T y + \langle \lambda,\, A(x)y - b\rangle \to +\infty \ \text{ as } \|y\| \to +\infty,$$

where the last condition holds if and only if $(A(x))^T\lambda + c > 0$. Therefore, as shown in Corollary 8.1 this problem can be solved by a convergent BB decomposition algorithm using Lagrangian bounds throughout.
II. The Bilinear Matrix Inequalities Problem. Many challenging problems of control theory can be reduced to the so-called bilinear matrix inequality feasibility problem (BMI problem for short) with the following general formulation (Tuan et al. 2000a,b):

$$\min\ \langle c, x\rangle + \langle d, y\rangle \tag{8.40}$$
$$\text{s.t.}\quad G_0 + \sum_{j=1}^m y_j G_j \preceq 0, \tag{8.41}$$
$$L_0 + \sum_{i=1}^n x_i L_{i0} + \sum_{j=1}^m y_j L_{0j} + \sum_{i=1}^n \sum_{j=1}^m x_i y_j L_{ij} \prec 0, \tag{8.42}$$
$$x \in X = [p,q] \subset \mathbb{R}^n, \quad y \in \mathbb{R}^m_+, \tag{8.43}$$

where $x, y$ are the decision variables, $G_0, G_j, L_0, L_{i0}, L_{0j}, L_{ij}$ are symmetric matrices of appropriate sizes, and the notation $G \preceq 0$, $L \prec 0$ means that $G$ is a negative semidefinite matrix and $L$ is a negative definite matrix.

For ease of notation we write

$$\begin{bmatrix} A & \\ & B \end{bmatrix} \quad\text{for}\quad \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix},$$

and define

$$A_{00}(x) = \mathrm{diag}\,\Big[\, G_0,\ \ L_0 + \sum_{i=1}^n x_i L_{i0},\ \ \langle x, c\rangle \,\Big],$$
$$A_{j0}(x) = \mathrm{diag}\,\Big[\, G_j,\ \ L_{0j} + \sum_{i=1}^n x_i L_{ij},\ \ d_j \,\Big], \qquad Q_{00} = \mathrm{diag}\,[\,0,\ 0,\ 1\,].$$

Then, as shown in Tuan et al. (2000a), this problem can be converted to the form

$$\min\Big\{\, t\ \Big|\ A_0(x,p,q) + \sum_{j=1}^m y_j A_j(x,p,q) \preceq tQ,\ \ y \ge 0,\ x \in X \,\Big\},$$

where

$$A_j(x,p,q) = \begin{bmatrix} A_{j0}(x) & \\ & A_{j1}(x,p,q) \end{bmatrix}, \quad Q = \begin{bmatrix} Q_{00} & \\ & Q_{01} \end{bmatrix}, \quad Q_{01} = 0,$$
$$A_{j1}(x,p,q) = \mathrm{diag}\,\big[\, (x_1-p_1)G_j,\ (q_1-x_1)G_j,\ \dots,\ (x_n-p_n)G_j,\ (q_n-x_n)G_j \,\big], \quad j = 0,1,\dots,m.$$

In this form the BMI problem appears to be a problem (GPL). Condition (S) in Theorem 8.6 can be formulated as

$$(\forall x \in X)(\exists Z_1 \succeq 0)\quad \mathrm{Tr}(Z_1 Q_{00}) = 1,\quad \mathrm{Tr}(Z_1 A_{j0}(x)) > 0,\ \ j = 1,\dots,m.$$

Therefore, by Theorem 8.6, under this assumption the BMI problem can be solved by a convergent branch and bound algorithm using Lagrangian bounds, as proposed in Tuan et al. (2000a). Note that the Lagrangian dual of the problem,

$$\max_{Z \succeq 0}\ \min_{t \in \mathbb{R},\, y \ge 0,\, x \in M}\ \Big\{\, t + \mathrm{Tr}\Big[ Z\Big( A_0(x,p,q) + \sum_{j=1}^m y_j A_j(x,p,q) - tQ \Big) \Big] \Big\},$$

has been shown there to be equivalent to the LMI program

$$\max\{\, t \mid \mathrm{Tr}(Z A_0(x,p,q)) \ge t,\ \ \mathrm{Tr}(Z A_j(x,p,q)) \ge 0\ \ \forall x \in \mathrm{vert}\,M,\ j = 1,\dots,m,\ \ \mathrm{Tr}(ZQ) = 1,\ Z \succeq 0 \,\},$$

where $\mathrm{vert}\,M$ denotes the vertex set of $M$.

8.3 Duality in Global Optimization

As we saw in Chap. 6 and the preceding sections, Lagrange duality is very useful in
global optimization, especially for computing bounds when solving a nonconvex
problem by a BB algorithm. However, since the dual problem of a nonconvex
optimization problem is also a nonconvex problem, these Lagrange bounds are often
not easily computable. In this section we present a different duality theory due to
Thach (1991, 1994, 2012) (see also Thach and Thang 2011) which has the attractive
feature of being symmetric and may turn certain difficult nonconvex problems into
more tractable, sometimes even convex or linear, ones.

8.3.1 Quasiconjugate Function

Basic for this nonconvex duality theory is the concept of quasiconjugation, which plays a role similar to that of conjugation in convex duality theory (Sect. 2.10). Given any function $f: \mathbb{R}^n_+ \to \mathbb{R}_+$, $f \not\equiv 0$, the function $f^\sharp: \mathbb{R}^n_+ \to [0,+\infty)$ such that

$$f^\sharp(p) = \frac{1}{\sup\{f(x) \mid p^T x \le 1,\ x \ge 0\}} \qquad \forall p \in \mathbb{R}^n_+$$

(where $\frac{1}{+\infty} = 0$ by the usual convention) is called the quasiconjugate of $f(x)$.

Example 8.2 Let $f$ be a Cobb-Douglas function on $\mathbb{R}^n_+$, i.e.,

$$f(x) = \prod_{i=1}^n x_i^{\alpha_i},$$

where $\alpha_i > 0$, $i = 1,\dots,n$. Then $f^\sharp$ is also a Cobb-Douglas function on $\mathbb{R}^n_+$:

$$f^\sharp(p) = \alpha^{\alpha} \prod_{i=1}^n \Big( \frac{p_i}{\alpha_i} \Big)^{\alpha_i}, \quad p \in \mathbb{R}^n_+,$$

where $\alpha = \sum_{i=1}^n \alpha_i$.

Proof If $p = (p_1,\dots,p_n)$ and $p_i = 0$ for some $i$, then $p^T x^k \le 1$ for every $x^k = (x_1,\dots,k x_i,\dots,x_n)$, where $x > 0$ satisfies $p^T x \le 1$. Since $f(x) > 0$ and $f(x^k) = k^{\alpha_i} f(x) \to +\infty$ as $k \to +\infty$, it follows that $f^\sharp(p) = 0$.

If $p > 0$ then, since $f(x)$ is continuous, the problem $\max\{f(x) \mid p^T x \le 1,\ x \ge 0\}$ has a solution $\bar x$. Noting that $f(x) = 0$ $\forall x \in \mathbb{R}^n_+ \setminus \mathbb{R}^n_{++}$, we have $\bar x > 0$. By the KKT condition there exists $\mu > 0$ such that

$$\nabla f(\bar x) - \mu\, \nabla(p^T \bar x - 1) = 0, \qquad \mu\,(p^T \bar x - 1) = 0.$$

Solving this system yields $\bar x_i = \frac{\alpha_i}{\alpha\, p_i}$, $i = 1,\dots,n$, and $\mu = \alpha f(\bar x)$. Hence,

$$f^\sharp(p) = \frac{1}{f(\bar x)} = \alpha^{\alpha} \prod_{i=1}^n \Big( \frac{p_i}{\alpha_i} \Big)^{\alpha_i} \qquad \forall p \in \mathbb{R}^n_{++}. \quad\Box$$
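A quick numerical sanity check of this closed form is easy to carry out. The sketch below (not from the book, with hypothetical exponents and price vector) approximates $\sup\{f(x) : p^T x \le 1,\ x \ge 0\}$ by dense sampling of the face $p^T x = 1$ (the supremum is attained there since $f$ is increasing) and compares its reciprocal with the Cobb-Douglas formula for $f^\sharp(p)$.

```python
# Numeric sanity check (not from the book) of the quasiconjugate formula
# of Example 8.2. Data are hypothetical.
import numpy as np

alpha = np.array([0.5, 1.5])            # exponents; a = sum(alpha) = 2
a = alpha.sum()
p = np.array([0.8, 1.2])

f = lambda x: np.prod(x**alpha)
closed_form = a**a * np.prod((p / alpha)**alpha)      # f_sharp(p)

rng = np.random.default_rng(1)
w = rng.dirichlet(np.ones(2), size=200000)            # w on the simplex
x = w / p                                             # then p.x = 1
sup_f = np.prod(x**alpha, axis=1).max()
print(closed_form, 1.0 / sup_f)                       # nearly equal
```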

In the next proposition $f^{\sharp\sharp}$ denotes the quasiconjugate of $f^\sharp$.

Theorem 8.8 (i) The function $f^\sharp$ is quasiconcave, nonnegative, and increasing on $\mathbb{R}^n_+$ and satisfies

$$f^\sharp(0) = \inf\{f^\sharp(p) \mid p \in \mathbb{R}^n_+\}. \tag{8.44}$$

(ii) If $f: \mathbb{R}^n_+ \to \mathbb{R}_+$ is quasiconcave, continuous, and increasing on $\mathbb{R}^n_+$ then $f = f^{\sharp\sharp}$:

$$f(x) = \frac{1}{\sup\{f^\sharp(p) \mid p^T x \le 1,\ p \ge 0\}}. \tag{8.45}$$

Proof (i) We need only prove the quasiconcavity of $f^\sharp$. Let $p, p' \in \mathbb{R}^n_+$. For every $t \in [0,1]$ and $x \ge 0$ we can write

$$\langle tp + (1-t)p',\, x\rangle \le 1 \ \Rightarrow\ p^T x \le 1 \ \text{ or } \ p'^T x \le 1,$$

hence

$$\sup\{f(x) \mid \langle tp + (1-t)p', x\rangle \le 1,\ x \ge 0\} \le \max\{\sup\{f(x) \mid p^T x \le 1,\ x \ge 0\},\ \sup\{f(x) \mid p'^T x \le 1,\ x \ge 0\}\},$$

which implies

$$f^\sharp(tp + (1-t)p') \ge \min\{f^\sharp(p),\ f^\sharp(p')\},$$

as was to be proved.

(ii) Define

$$\tilde f(x) = \frac{1}{\sup\{f^\sharp(p) : p^T x \le 1,\ p \ge 0\}}.$$

For $\gamma > 0$ let $F_\gamma, \tilde F_\gamma$ denote the upper level sets of $f$ and $\tilde f$, respectively, i.e.,

$$F_\gamma = \{x \in \mathbb{R}^n_+ : f(x) \ge \gamma\}, \qquad \tilde F_\gamma = \{x \in \mathbb{R}^n_+ : \tilde f(x) \ge \gamma\}.$$

To prove the equality $f(x) = \tilde f(x)$ $\forall x \in \mathbb{R}^n_+$ it suffices to show that $F_\gamma = \tilde F_\gamma$ for any $\gamma > 0$. Let $x \in F_\gamma$. Suppose $x \notin \tilde F_\gamma$; then we have $\tilde f(x) < \gamma$, or

$$\frac{1}{\sup\{f^\sharp(p) : p^T x \le 1,\ p \ge 0\}} < \gamma,$$

hence there exists $p \ge 0$, $p^T x \le 1$, such that

$$f^\sharp(p) > \frac{1}{\gamma}$$
$$\Leftrightarrow\ \frac{1}{\sup\{f(x') : p^T x' \le 1,\ x' \ge 0\}} > \frac{1}{\gamma}$$
$$\Leftrightarrow\ \sup\{f(x') : p^T x' \le 1,\ x' \ge 0\} < \gamma$$
$$\Leftrightarrow\ f(x') < \gamma \ \ \forall x' \ge 0 \text{ such that } p^T x' \le 1.$$

It follows that $f(x) < \gamma$. This conflicts with $x \in F_\gamma$. Hence, $F_\gamma \subset \tilde F_\gamma$.

Conversely, let $x \in \tilde F_\gamma$. We can write

$$\tilde f(x) \ge \gamma$$
$$\Leftrightarrow\ \frac{1}{\sup\{f^\sharp(p) : p^T x \le 1,\ p \ge 0\}} \ge \gamma$$
$$\Leftrightarrow\ f^\sharp(p) \le \frac{1}{\gamma} \ \ \forall p \ge 0 \text{ such that } p^T x \le 1$$
$$\Leftrightarrow\ \frac{1}{\sup\{f(x') : p^T x' \le 1,\ x' \ge 0\}} \le \frac{1}{\gamma} \ \ \forall p \ge 0 \text{ such that } p^T x \le 1$$
$$\Leftrightarrow\ \sup\{f(x') : p^T x' \le 1,\ x' \ge 0\} \ge \gamma \ \ \forall p \ge 0 \text{ such that } p^T x \le 1. \tag{8.46}$$

For $\tilde p > 0$ such that $\tilde p^T x \le 1$, the set $\{x' \in \mathbb{R}^n_+ : \tilde p^T x' \le 1\}$ is compact. Since $f$ is continuous, from (8.46) there exists a vector $\tilde x \ge 0$, $\tilde p^T \tilde x \le 1$, such that $f(\tilde x) \ge \gamma$. This implies that $\tilde x \in F_\gamma$. Hence, $F_\gamma$ is nonempty.

Suppose now that $x \notin F_\gamma$. The function $f$ is quasiconcave, continuous, and increasing on $\mathbb{R}^n_+$, so $F_\gamma$ is a closed convex set satisfying $x' \in F_\gamma$ whenever $x' \ge x'' \in F_\gamma$. Since $F_\gamma$ does not intersect the line segment $[0, x]$, by the strict separation theorem there exist a vector $q \ne 0$ and a real number $\beta$ such that

$$q^T x' > \beta \quad \forall x' \in F_\gamma; \tag{8.47}$$
$$q^T x' < \beta \quad \forall x' \in [0, x]. \tag{8.48}$$

From (8.47) we have $q \ge 0$ by Lemma 1. Further, from (8.48) it follows that $\beta > 0$ and there exists $t > 0$ sufficiently small such that $(q + te)^T x \le \beta$, where $e = (1,1,\dots,1)$. Setting $p = \frac{1}{\beta}(q + te)$, we have $p > 0$ and

$$p^T x' \ge \frac{1}{\beta}\, q^T x' > 1 \quad \forall x' \in F_\gamma; \tag{8.49}$$
$$p^T x \le 1. \tag{8.50}$$

Consequently,

$$p^T x' > 1 \quad \forall x' \ge 0 \text{ such that } f(x') \ge \gamma \tag{8.51}$$
$$\Leftrightarrow\ f(x') < \gamma \quad \forall x' \ge 0 \text{ such that } p^T x' \le 1. \tag{8.52}$$

From (8.46) and (8.50) it follows that

$$\sup\{f(x') : p^T x' \le 1,\ x' \ge 0\} \ge \gamma.$$

Since $\{x' \in \mathbb{R}^n_+ : p^T x' \le 1\}$ is a compact set and $f$ is continuous, there exists a vector $\hat x \ge 0$, $p^T \hat x \le 1$, such that $f(\hat x) \ge \gamma$. This conflicts with (8.52). Hence, $\tilde F_\gamma = F_\gamma$. □

8.3.2 Examples

The equation $f^{\sharp\sharp} = f$ holds for the following functions:

- Generalized Cobb-Douglas function

$$f(x) = \prod_{j=1}^k (f_j(x))^{\alpha_j}, \quad \alpha_j > 0,$$

where $f_j(x)$, $j = 1,2,\dots,k$, are positive, concave, and increasing on $\mathbb{R}^n_+$. This function is quasiconcave on $\mathbb{R}^n_+$.

- Generalized Leontiev function

$$f(x) = \Big( \min\Big\{ \frac{x_j}{c_j}\ \Big|\ j = 1,2,\dots,n \Big\} \Big)^{\alpha},$$

where $c_j > 0$, $j = 1,2,\dots,n$, $\alpha > 0$. This function is quasiconcave on $\mathbb{R}^n_+$ (and is then concave) if and only if $\alpha \le 1$.

- Constant elasticity of substitution (C.E.S.) function

$$f(x) = (a_1 x_1^{\delta} + a_2 x_2^{\delta} + \cdots + a_n x_n^{\delta})^{1/\delta},$$

where $a_j > 0$, $j = 1,2,\dots,n$, $0 < \delta \le 1$. This function is quasiconcave on $\mathbb{R}^n_+$.

- Posynomial function

$$f(x) = \sum_{j=1}^m c_j \prod_{i=1}^n (x_i)^{a_{ij}} \quad \text{with } c_j > 0 \text{ and } a_{ij} \ge 0.$$

This function is quasiconcave on $\mathbb{R}^n_+$.

We now introduce the concept of quasi-supdifferential for quasiconcave functions. This will allow us to develop in the next section an optimality condition in the generalized KKT form for a quasiconcave maximization problem.

Given any function $f: \mathbb{R}^n \to \mathbb{R} \cup \{-\infty\}$, a vector $p \in \mathbb{R}^n_+$ will be called a quasi-supgradient of $f$ at $\bar x$ if

$$p^T \bar x = 1 \quad \text{and} \quad f(\bar x)\, f^\sharp(p) \ge 1.$$

Since $f(\bar x)\, f^\sharp(p) \le 1$ by definition of $f^\sharp$, this is equivalent to saying that

$$p^T \bar x = 1 \quad \text{and} \quad f(\bar x)\, f^\sharp(p) = 1.$$

The quasi-supdifferential of $f$ at $\bar x$, written $\partial^\sharp f(\bar x)$, is then defined to be the set of all quasi-supgradients of $f$ at $\bar x$. If $\partial^\sharp f(\bar x)$ is nonempty, $f$ is said to be quasi-supdifferentiable at $\bar x$.
Proposition 8.2 Let $f(x)$ be continuous, quasiconcave, and monotonically increasing on $\mathbb{R}^n_+$. If $f$ is differentiable at a point $\bar x$ such that $f(\bar x) > 0$ and $\nabla f(\bar x) \ne 0$, then

$$\tilde\nabla f(\bar x) := \frac{1}{\langle \nabla f(\bar x), \bar x\rangle}\, \nabla f(\bar x) \in \partial^\sharp f(\bar x), \tag{8.53}$$

where $\nabla f(\bar x)$ denotes the gradient of $f$ at $\bar x$.

Proof Since $f$ is quasiconcave, continuous, and increasing on $\mathbb{R}^n_+$, the set $C_{\bar x} := \{x' \in \mathbb{R}^n_+ \mid f(x') \ge f(\bar x)\}$ is closed, convex, and satisfies $x' \ge x \in C_{\bar x} \Rightarrow x' \in C_{\bar x}$. For every $u \in \mathbb{R}^n_+$ we have

$$\langle \nabla f(\bar x), u\rangle = f'(\bar x; u) := \lim_{t \to 0^+} \frac{f(\bar x + tu) - f(\bar x)}{t}. \tag{8.54}$$

Since $f(\bar x + tu) \ge f(\bar x)$ this implies $\langle \nabla f(\bar x), u\rangle \ge 0$ $\forall u \in \mathbb{R}^n_+$, hence $\nabla f(\bar x) \ge 0$. For every $x \in C_{\bar x}$, if $t \in (0,1)$ then $\bar x + t(x - \bar x) \in C_{\bar x}$, so $f(\bar x + t(x - \bar x)) \ge f(\bar x)$, hence, by (8.54) for $u = x - \bar x$, $\langle \nabla f(\bar x), x - \bar x\rangle \ge 0$. Thus $\langle \nabla f(\bar x), x\rangle \ge \langle \nabla f(\bar x), \bar x\rangle$ $\forall x \in C_{\bar x}$. Setting $p := \frac{1}{\langle \nabla f(\bar x), \bar x\rangle}\, \nabla f(\bar x)$, we can write

$$\langle p, x\rangle \ge 1 \quad \forall x \in C_{\bar x}$$
$$\Leftrightarrow\ f(x) < f(\bar x) \quad \forall x \ge 0 \ \text{s.t.}\ \langle p, x\rangle < 1$$
$$\Rightarrow\ f(x) \le f(\bar x) \quad \forall x \ge 0 \ \text{s.t.}\ \langle p, x\rangle \le 1$$
$$\Leftrightarrow\ \sup\{f(x) \mid \langle p, x\rangle \le 1,\ x \ge 0\} \le f(\bar x)$$
$$\Leftrightarrow\ \frac{1}{\sup\{f(x) \mid \langle p, x\rangle \le 1,\ x \ge 0\}} \ge \frac{1}{f(\bar x)}$$
$$\Leftrightarrow\ f(\bar x)\, f^\sharp(p) \ge 1.$$

This together with the obvious equality $\langle p, \bar x\rangle = 1$ shows that $p \in \partial^\sharp f(\bar x)$. □
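The formula (8.53) is easy to check numerically for the Cobb-Douglas function of Example 8.2, for which $f^\sharp$ has the closed form derived above. The sketch below (not from the book, with hypothetical exponents and point $\bar x$) verifies both defining properties of a quasi-supgradient: $p^T \bar x = 1$ and $f(\bar x) f^\sharp(p) = 1$.

```python
# Numeric check (not from the book) of Proposition 8.2 for the Cobb-Douglas
# function f(x) = prod x_i**alpha_i of Example 8.2. Data are hypothetical.
import numpy as np

alpha = np.array([0.3, 0.7])
f      = lambda x: np.prod(x**alpha)
grad_f = lambda x: f(x) * alpha / x          # gradient of f at x > 0

xbar = np.array([2.0, 1.5])
p = grad_f(xbar) / grad_f(xbar).dot(xbar)    # quasi-supgradient (8.53)

a = alpha.sum()                              # closed form of f_sharp
f_sharp = lambda p: a**a * np.prod((p / alpha)**alpha)

print(p.dot(xbar))            # = 1
print(f(xbar) * f_sharp(p))   # = 1, so p lies in the quasi-supdifferential
```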
Theorem 8.9 If $f$ is quasiconcave, continuous, and strictly increasing on $\mathbb{R}^n_+$, then $\partial^\sharp f(\bar x)$ is a nonempty convex set for any $\bar x > 0$. Moreover,

$$p \in \partial^\sharp f(\bar x) \ \Leftrightarrow\ \bar x \in \partial^\sharp f^\sharp(p). \tag{8.55}$$

Proof Since $f$ is strictly increasing on $\mathbb{R}^n_+$, it is easily seen that $\bar x$ is a boundary point of the closed convex set $C_{\bar x} := \{x' \in \mathbb{R}^n_+ \mid f(x') \ge f(\bar x)\}$. Indeed, if there were an open ball $B$ of center $\bar x$ such that $B \subset C_{\bar x}$, there would exist $\hat x \in B$ such that $\hat x < \bar x$, which would imply $f(\hat x) < f(\bar x)$, conflicting with $\hat x \in C_{\bar x}$. Consequently, there exist a vector $q \ne 0$ and a number $\beta \in \mathbb{R}$ such that

$$\langle q, z\rangle \ge \beta \quad \forall z \in C_{\bar x}, \tag{8.56}$$
$$\langle q, \bar x\rangle = \beta. \tag{8.57}$$

From (8.56) we have $\langle q, \bar x + u\rangle \ge \beta$ $\forall u \in \mathbb{R}^n_+$, hence, in view of (8.57), $\langle q, u\rangle \ge 0$ $\forall u \in \mathbb{R}^n_+$, i.e., $q \in \mathbb{R}^n_+$. Noting that $\bar x > 0$ it then follows that $\beta > 0$. Setting $p = \frac{1}{\beta}\, q$ yields $p \ge 0$ and

$$p^T z \ge 1 \quad \forall z \in \mathbb{R}^n_+ \ \text{s.t.}\ f(z) \ge f(\bar x), \tag{8.58}$$
$$p^T \bar x = 1. \tag{8.59}$$

From (8.58) we have $f(z) < f(\bar x)$ for all $z \ge 0$ such that $p^T z < 1$, hence $f(z) \le f(\bar x)$ for all $z \ge 0$ such that $p^T z \le 1$. Thus, $\sup\{f(z) \mid p^T z \le 1,\ z \ge 0\} \le f(\bar x)$ and therefore $f^\sharp(p)\, f(\bar x) \ge 1$. This together with (8.59) shows that $p \in \partial^\sharp f(\bar x)$. The convexity of $\partial^\sharp f(\bar x)$ then follows from the definition of quasi-supgradient.

The equivalence (8.55) follows from the equality $(f^\sharp)^\sharp = f$ and the definition of quasi-supgradient. □

8.3.3 Dual Problems of Global Optimization

Let $f: \mathbb{R}^n_+ \to \mathbb{R}$ be a nonnegative, quasiconcave, continuous, strictly increasing function, and $X \subset \mathbb{R}^n_+$ a compact convex set with $\mathrm{int}\,X \ne \emptyset$, satisfying the following condition:

$$\forall x \in X \qquad 0 \le x' \le x \ \Rightarrow\ x' \in X. \tag{8.60}$$

In economics, when $X$ represents the production set (i.e., the set of all feasible production programs) this condition is referred to as a free disposal condition. Obviously, it is equivalent to saying that

$$\forall x \in X \qquad [0, x] \subset X.$$

In monotonic optimization (see Chap. 11) a set $X \subset \mathbb{R}^n_+$ satisfying this condition is said to be normal.

Consider the optimization problem:

$$(\mathrm{P}) \qquad \max\{f(x) \mid x \in X\}.$$

Since $f$ is continuous and $X$ is compact, this problem has a solution; moreover, its optimal value is positive.

For every vector $\bar x \in X$ let $N_X(\bar x)$ be the normal cone of $X$ at $\bar x$ (cf. Sect. 1.6):

$$N_X(\bar x) = \{p \mid p^T(x - \bar x) \le 0 \ \ \forall x \in X\}.$$

Theorem 8.10 A vector $\bar x \in X$ is a global optimal solution of (P) if and only if it satisfies the generalized KKT condition

$$0 \in \partial^\sharp f(\bar x) - N_X(\bar x). \tag{8.61}$$

Proof Suppose $\bar x \in X$ and (8.61) holds. Then there exists a vector $p \in \partial^\sharp f(\bar x)$ such that $p \in N_X(\bar x)$. Since $p \in \partial^\sharp f(\bar x)$ we have $p^T \bar x = 1$ and $f^\sharp(p)\, f(\bar x) = 1$, while $p \in N_X(\bar x)$ implies that $p^T(x - \bar x) \le 0$, hence $p^T x \le 1$ $\forall x \in X$. From the definition of $f^\sharp$ it is immediate that

$$f(x)\, f^\sharp(p) \le 1 \quad \forall (x, p) \in X \times \mathbb{R}^n_+ \ \text{s.t.}\ p^T x \le 1.$$

Therefore,

$$f^\sharp(p)\, f(\bar x) = \max\{f^\sharp(p)\, f(x) : x \in X\} = f^\sharp(p)\, \max_{x \in X} f(x),$$

and consequently,

$$f(\bar x) = \max_{x \in X} f(x).$$

So if (8.61) holds then $\bar x$ solves the problem (P).

Conversely, suppose $\bar x$ solves the problem (P). Then, as we saw in the proof of Theorem 8.9, $\bar x$ is a boundary point of the convex closed set $C_{\bar x} := \{x \mid f(x) \ge f(\bar x)\}$ and hence there is a vector $p \in \partial^\sharp f(\bar x)$ such that

$$p^T x \ge 1 \quad \forall x \in C_{\bar x}, \qquad p^T \bar x = 1.$$

So $p^T(x - \bar x) \le 0$ $\forall x \in X$, i.e., $p \in N_X(\bar x)$; hence $0 \in \partial^\sharp f(\bar x) - N_X(\bar x)$. □
Let $X^\sharp$ be the lower conjugate of $X$ defined as

$$X^\sharp = \{p \ge 0 \mid p^T x \le 1 \ \ \forall x \in X\}. \tag{8.62}$$

Since $X^\sharp = X^\circ \cap \mathbb{R}^n_+$, where $X^\circ$ is the polar of $X$ defined in Sect. 1.8, it follows from the properties of $X^\circ$ that $X^\sharp$ is a compact convex set with nonempty interior in $\mathbb{R}^n_+$ and, moreover, that $X$ is the lower conjugate of $X^\sharp$:

$$X = \{x \ge 0 : p^T x \le 1 \ \ \forall p \in X^\sharp\}. \tag{8.63}$$

Define the dual of the problem (P) to be the problem

$$(\mathrm{DP}) \qquad \max\{f^\sharp(p) \mid p \in X^\sharp\}.$$

We show that $f^\sharp(p)$ is continuous. For every $\gamma \in \mathbb{R}$ consider the set $C_\gamma := \{p \in \mathbb{R}^n_+ \mid f^\sharp(p) \ge \gamma\} = \{p \in \mathbb{R}^n_+ \mid f(x) \le 1/\gamma \ \forall x \in \mathbb{R}^n_+ \ \text{s.t.}\ \langle p, x\rangle \le 1\}$. Let $p^k \in C_\gamma$ and $p^k \to p$ as $k \to +\infty$. Then $f(x) \le 1/\gamma$ for all $x \ge 0$ such that $\langle p^k, x\rangle \le 1$. Observe that $\langle p^k, x\rangle = \langle p, x^k\rangle + \sum_{p_i = 0} p^k_i x_i$ with

$$x^k_i = \begin{cases} \frac{p^k_i}{p_i}\, x_i & \text{if } p_i > 0, \\ x_i & \text{if } p_i = 0. \end{cases}$$

Consequently, if $\langle p^k, x\rangle \le 1$, then $\langle p, x^k\rangle \le 1$, hence $f(x^k) \le 1/\gamma$, and therefore, as $k \to +\infty$, $f(x) \le 1/\gamma$ by continuity of $f$. This implies $p \in C_\gamma$, so the set $C_\gamma$ is closed for every $\gamma \in \mathbb{R}$, proving that $f^\sharp$ is u.s.c. Analogously, we can show that the set $\{p \in \mathbb{R}^n_+ \mid f^\sharp(p) \le \gamma\}$ is closed for every $\gamma \in \mathbb{R}$, so $f^\sharp(p)$ is also l.s.c.; hence $f^\sharp(p)$ is continuous. Since $X^\sharp$ is compact, the dual problem has an optimal solution.

Since $X$ and $X^\sharp$ are conjugates of each other and $(f^\sharp)^\sharp = f$, the dual of the dual problem coincides with the primal problem. Therefore, the duality is symmetric. If $\bar x$ solves (P) and $\bar p$ solves (DP), then $f(\bar x) \le \sup\{f(x) \mid \bar p^T x \le 1,\ x \ge 0\} = \frac{1}{f^\sharp(\bar p)}$. The difference $1 - f(\bar x)\, f^\sharp(\bar p) \ge 0$ measures in a sense the duality gap.

The next proposition shows that the duality gap is actually zero.
Theorem 8.11 Let $\bar x \in X$ and $\bar p \in X^\sharp$. Then $\bar x$ is optimal to (P) and $\bar p$ is optimal to (DP) if and only if

$$f(\bar x)\, f^\sharp(\bar p) = 1. \tag{8.64}$$

Proof Suppose $\bar x \in X$ and $\bar p \in X^\sharp$ satisfy the duality equation (8.64). Then

$$f(\bar x)\, f^\sharp(\bar p) = \max\{f(x)\, f^\sharp(p) \mid x \in X,\ p \in X^\sharp\} = \max_{x \in X} f(x)\, \max_{p \in X^\sharp} f^\sharp(p),$$

hence,

$$f(\bar x) = \max_{x \in X} f(x), \qquad f^\sharp(\bar p) = \max_{p \in X^\sharp} f^\sharp(p).$$

Therefore, $\bar x$ solves (P) and $\bar p$ solves (DP).

Conversely, suppose $\bar x$ solves (P) and $\bar p$ solves (DP). Since $\bar x$ solves (P), $f(\bar x) > 0$. By Theorem 8.10 we have

$$0 \in \partial^\sharp f(\bar x) - N_X(\bar x),$$

i.e., there is a vector $q \in \mathbb{R}^n_+$ such that

$$q^T \bar x = 1, \tag{8.65}$$
$$f(\bar x)\, f^\sharp(q) = 1, \tag{8.66}$$
$$q^T x \le 1 \quad \forall x \in X. \tag{8.67}$$

From (8.67) it follows that $q \in X^\sharp$. Since $\bar x \in X$, $\bar x^T p \le 1$ $\forall p \in X^\sharp$. This together with (8.65) implies that $\bar x^T(p - q) \le 0$ $\forall p \in X^\sharp$, i.e., $\bar x \in N_{X^\sharp}(q)$. From (8.65) and (8.66) we have $\bar x \in \partial^\sharp f^\sharp(q)$. Thus,

$$0 \in \partial^\sharp f^\sharp(q) - N_{X^\sharp}(q).$$

By Theorem 8.10, it follows that $q$ is an optimal solution of the problem (DP). Since $\bar p$ also solves (DP), we have $f^\sharp(q) = f^\sharp(\bar p)$. Therefore, $f^\sharp(\bar p)\, f(\bar x) = 1$. □

Corollary 8.4 Let $\bar x \in X$ and $\bar p \in X^\sharp$. Then $\bar x$ solves (P) and $\bar p$ solves (DP) if and only if

$$\bar p \in \partial^\sharp f(\bar x) \quad \text{or} \quad \bar x \in \partial^\sharp f^\sharp(\bar p).$$

Proof The "if" part is obvious. To prove the "only if" part, suppose $\bar x$ solves (P) and $\bar p$ solves (DP). By Theorem 8.11 we have

$$f^\sharp(\bar p)\, f(\bar x) = 1.$$

We show that $\bar p^T \bar x = 1$. Since $\bar x$ solves (P), we have $f(\bar x) > 0$. Again by Theorem 8.11,

$$f^\sharp(\bar p) = \frac{1}{\sup\{f(x) : \bar p^T x \le 1,\ x \ge 0\}} = \frac{1}{f(\bar x)}.$$

Hence,

$$\sup\{f(x) : \bar p^T x \le 1,\ x \ge 0\} = f(\bar x). \tag{8.68}$$

If $\bar p^T \bar x < 1$ there would exist a real number $t > 0$ such that $\bar p^T(\bar x + te) = 1$ (as usual $e = (1,1,\dots,1)$). Since $f$ is strictly increasing, this would imply $f(\bar x) < f(\bar x + te)$, conflicting with (8.68). Therefore, $\bar p^T \bar x = 1$. □

8.4 Application

A basic feature of the quasi-gradient duality scheme as expressed in Corollary 8.4 is


that a quasi-gradient of the primal objective function at a primal optimal solution is
an optimal solution of the dual problem. Based on this property certain nonconvex
problems can be converted into more tractable problems (Thach 1993b), sometimes
even into convex problems (Thach and Thang 2014).
Below we illustrate this by an economic application discussed in Thach and
Thang (2014).

8.4.1 A Feasibility Problem with Resource Allocation Constraint

Let us begin with a feasibility problem. Given $m$ vectors $a^i \in \mathbb{R}^n_+$, $i = 1,\dots,m$, together with a vector $\alpha \in \mathbb{R}^n_+$ such that

$$a_{ij} > 0 \ \ \forall j = 1,\dots,n; \qquad \sum_{j=1}^n \alpha_j = 1, \tag{8.69}$$

consider the problem of finding a couple $(x, p) \in \mathbb{R}^n_+ \times \mathbb{R}^n_+$ satisfying the following system of equalities and inequalities:

$$x = \sum_{i=1}^m \lambda_i a^i, \quad \lambda_i \ge 0, \ i = 1,2,\dots,m, \quad \sum_{i=1}^m \lambda_i = 1, \tag{8.70}$$
$$p_j \ge 0, \ j = 1,2,\dots,n, \qquad p^T a^i \le 1, \ i = 1,2,\dots,m, \tag{8.71}$$
$$p_j x_j = \alpha_j, \quad j = 1,2,\dots,n. \tag{8.72}$$

This problem is encountered, e.g., in the activity planning of a company that has $m$ factories for producing $n$ different goods. For the production of these goods a given resource of total amount 1 has to be allocated to the factories. Let $a^i$ be the capacity of the $i$-th factory, so the $i$-th factory can produce $a_{ij}$ units of the $j$-th good, $j = 1,\dots,n$, when running at full capacity. Let $p_j$ denote the amount of resource to be allocated to the production of a unit of the $j$-th good and $x_j$ the quantity of the $j$-th good to be produced. Then $p_j x_j$ is the total amount of resource to be allocated to the production of the $j$-th good. The technology or/and the market requires that $p_j x_j = \alpha_j$, $j = 1,\dots,n$, where $\alpha_1,\dots,\alpha_n$ are given numbers. The problem is to select a feasible activity program, i.e., a vector $(x, p) \in \mathbb{R}^n_+ \times \mathbb{R}^n_+$, satisfying the system (8.70)-(8.72).

Let $X$ denote the set of all $x \in \mathbb{R}^n_+$ satisfying (8.70) and $P$ the set of all $p \in \mathbb{R}^n_+$ satisfying (8.71). It is obvious that both $X, P$ are convex polytopes with nonempty interior. Furthermore, by formulas (8.62)-(8.63),

$$P = X^\sharp, \qquad X = P^\sharp. \tag{8.73}$$

A solution to the system (8.70)-(8.72) is a vector $(x, p) \in X \times P$ satisfying the equalities (8.72). Despite the nonconvexity of these equalities, we will show that by quasi-supgradient duality the system (8.70)-(8.72) can be reduced to a convex minimization problem. More specifically, this system has a unique solution $(\bar x, \bar p)$ which can be obtained by solving either of two concave maximization (i.e., convex minimization) problems.

Define the functions

$$f(x) := \prod_{j=1}^n x_j^{\alpha_j}, \ \ x \in X; \qquad g(p) := \prod_{j=1}^n p_j^{\alpha_j}, \ \ p \in P.$$

Proposition 8.3 We have

$$g(p) = f^\sharp(p) \cdot \prod_{j=1}^n \alpha_j^{\alpha_j}, \qquad f(x) = g^\sharp(x) \cdot \prod_{j=1}^n \alpha_j^{\alpha_j}.$$

Proof Following Example 8.2, for every $p \ge 0$,

$$f^\sharp(p) = \prod_{j=1}^n \Big( \frac{p_j}{\alpha_j} \Big)^{\alpha_j} = \frac{\prod_{j=1}^n p_j^{\alpha_j}}{\prod_{j=1}^n \alpha_j^{\alpha_j}} = \frac{g(p)}{\prod_{j=1}^n \alpha_j^{\alpha_j}}.$$

Similarly, for every $x \ge 0$,

$$g^\sharp(x) = \prod_{j=1}^n \Big( \frac{x_j}{\alpha_j} \Big)^{\alpha_j} = \frac{f(x)}{\prod_{j=1}^n \alpha_j^{\alpha_j}}. \quad\Box$$

Consider now the two optimization problems

$$\max f(x), \quad \text{s.t. } x \in X, \tag{8.74}$$
$$\max g(p), \quad \text{s.t. } p \in P. \tag{8.75}$$

These problems are equivalent, respectively, to the following concave maximization (i.e., convex minimization) problems:

$$\max\Big\{ \sum_{j=1}^n \alpha_j \log(x_j)\ \Big|\ x \in X \Big\},$$
$$\max\Big\{ \sum_{j=1}^n \alpha_j \log(p_j)\ \Big|\ p \in P \Big\}.$$

Since $X$ and $P$ contain positive vectors, the optimal values in (8.74) and in (8.75) are positive. Moreover, since the functions $\log(f(x)) = \sum_{j=1}^n \alpha_j \log(x_j)$ and $\log(g(p)) = \sum_{j=1}^n \alpha_j \log(p_j)$ are strictly concave on the positive orthant, each problem (8.74) and (8.75) has a unique optimal solution.

Noting that $P = X^\sharp$ while $g(p) = f^\sharp(p) \cdot \prod_{j=1}^n \alpha_j^{\alpha_j}$, we see that problem (8.75) is the dual of problem (8.74). In fact, as $f(x) = g^\sharp(x) \cdot \prod_{j=1}^n \alpha_j^{\alpha_j}$ and $X = P^\sharp$, the two problems are dual of each other.

Theorem 8.12 If $\bar x$ solves problem (8.74), then $\bar p = \tilde\nabla f(\bar x)$ solves problem (8.75). Conversely, if $\bar p$ solves problem (8.75), then $\bar x = \tilde\nabla g(\bar p)$ solves problem (8.74).

Proof This follows from Corollary 8.4 and Proposition 8.2. □

Theorem 8.13 If $(\bar x, \bar p)$ is a solution of the system (8.70)-(8.72), then $\bar x$ is optimal to (8.74) and $\bar p$ is optimal to (8.75). Conversely, if $\bar x$ is optimal to (8.74), then $(\bar x, \bar p)$ with $\bar p = \tilde\nabla f(\bar x)$ is a solution of the system (8.70)-(8.72); if $\bar p$ is optimal to (8.75), then $(\bar x, \bar p)$ with $\bar x = \tilde\nabla g(\bar p)$ is a solution to the system (8.70)-(8.72).

Proof Let $(\bar x, \bar p)$ be a solution of the system (8.70)-(8.72). From (8.72) we have $\bar p_i = \frac{\alpha_i}{\bar x_i}$, $i = 1,\dots,n$, while $\nabla f(\bar x) = \big( \frac{\alpha_1 f(\bar x)}{\bar x_1},\dots,\frac{\alpha_n f(\bar x)}{\bar x_n} \big)$, hence

$$\bar p = \frac{\nabla f(\bar x)}{\langle \nabla f(\bar x), \bar x\rangle} = \tilde\nabla f(\bar x).$$

By Corollary 8.4, $\bar x$ solves (8.74) and $\bar p$ solves (8.75).

Conversely, if $\bar x$ solves (8.74) and $\bar p = \tilde\nabla f(\bar x)$ then obviously $\bar p$ satisfies (8.72), so $(\bar x, \bar p)$ is a solution of the system (8.70)-(8.72). Analogously, if $\bar p$ solves (8.75) and $\bar x = \tilde\nabla g(\bar p)$, then $\bar x$ satisfies (8.72), so $(\bar x, \bar p)$ is a solution of the system (8.70)-(8.72). □

The above theorem shows that the nonconvex system (8.70)-(8.72) has a unique solution $(\bar x, \bar p)$ which can be obtained either by solving the concave maximization problem (8.74) (then $\bar x$ is the unique solution of (8.74) and $\bar p = \tilde\nabla f(\bar x)$) or by solving the concave maximization problem (8.75) (then $\bar p$ is the unique solution of (8.75) and $\bar x = \tilde\nabla g(\bar p)$).
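Solving the nonconvex system (8.70)-(8.72) thus amounts to one convex program. The sketch below (not from the book, with hypothetical capacities $a^i$ and weights $\alpha$) solves problem (8.74) by maximizing $\sum_j \alpha_j \log x_j$ over the simplex of convex combinations of the $a^i$, using SciPy, and then recovers $\bar p$ via (8.72).

```python
# Illustrative sketch (not from the book): solving (8.70)-(8.72) through the
# concave maximization (8.74). All data are hypothetical.
import numpy as np
from scipy.optimize import minimize

a = np.array([[3.0, 1.0], [1.0, 4.0]])       # capacities a^i (rows)
alpha = np.array([0.4, 0.6])                 # alpha_j, summing to 1

m = a.shape[0]
# maximize sum_j alpha_j log x_j with x = sum_i lam_i a^i, lam in the simplex
neg_obj = lambda lam: -alpha.dot(np.log(lam.dot(a)))
cons = ({'type': 'eq', 'fun': lambda lam: lam.sum() - 1.0},)
res = minimize(neg_obj, np.full(m, 1.0/m), bounds=[(0, 1)]*m, constraints=cons)

x_bar = res.x.dot(a)
p_bar = alpha / x_bar                        # (8.72): p_j x_j = alpha_j
print(x_bar, p_bar, a.dot(p_bar))            # a.dot(p_bar) <= 1 checks (8.71)
```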

8.4.2 Problems with Multiple Resources Allocation

A more general problem than (8.70)-(8.72) occurs when a company has $k \ge 1$ resources to be used for the production of $n$ goods in $m$ factories of capacities $a^i \in \mathbb{R}^n_+$, $i = 1,\dots,m$, where $\min_{j=1,\dots,n} a_{ij} > 0$. Let $\theta_r$ be the cost of the $r$-th resource, $r = 1,\dots,k$. Without loss of generality we can suppose $\sum_{r=1}^k \theta_r = 1$.

If a fraction $\alpha^r_j$ of the $r$-th resource is allocated to the production of the $j$-th good, and each resource is totally used, then obviously

$$0 \le \alpha^r_j \le 1, \qquad \sum_{j=1}^n \alpha^r_j = 1, \quad r = 1,\dots,k. \tag{8.76}$$

A vector $\alpha = (\alpha_1,\dots,\alpha_n) \in \mathbb{R}^n_+$ with $\alpha_j = \sum_{r=1}^k \theta_r \alpha^r_j$ (cost of all resources to be used for producing the $j$-th good), $j = 1,\dots,n$, will then be referred to as a resource cost allocation vector. It indicates that an amount of resources of total cost $\alpha_j$ is allocated to the production of the $j$-th good, $j = 1,\dots,n$. Let $\Theta$ be the set of all resource cost allocation vectors, i.e.,

$$\Theta = \Big\{ \alpha \in \mathbb{R}^n_+\ \Big|\ \alpha = \sum_{r=1}^k \theta_r \alpha^r, \quad \sum_{j=1}^n \alpha^r_j = 1, \ \ \alpha^r \ge 0, \ \ r = 1,2,\dots,k \Big\}. \tag{8.77}$$

A vector $x \in \mathbb{R}^n_+$ satisfying (8.70) will be termed, as previously, a production program, while a vector $p = (p_1,\dots,p_n)$, where $p_j$ is the total cost of all resources used for producing a unit of the $j$-th good (i.e., the production cost of a unit of the $j$-th good), will be referred to as a production unit cost vector. Clearly $p$ must satisfy (8.71).

The problem is now to find a production program $x \in \mathbb{R}^n_+$ and a production unit cost vector $p \in \mathbb{R}^n_+$ satisfying $(p_1 x_1,\dots,p_n x_n) \in \Theta$, where $\Theta$ is given by (8.77); that is, to solve the following nonlinear system:

$$x = \sum_{i=1}^m \lambda_i a^i, \quad \lambda_i \ge 0, \ i = 1,2,\dots,m, \quad \sum_{i=1}^m \lambda_i = 1, \tag{8.78}$$
$$p_j \ge 0, \ j = 1,2,\dots,n, \qquad p^T a^i \le 1, \ i = 1,2,\dots,m, \tag{8.79}$$
$$p_j x_j = \alpha_j, \quad j = 1,2,\dots,n, \tag{8.80}$$
$$\alpha = (\alpha_1,\dots,\alpha_n) \in \Theta. \tag{8.81}$$

Clearly the system (8.70)-(8.72) is a special case of the system (8.78)-(8.81) when $k = 1$, i.e., when $\Theta$ is the standard simplex in $\mathbb{R}^n_+$ (which is obvious from (8.76)). We now show that solving the system (8.78)-(8.81) reduces to solving either of two suitable concave maximization problems.

Define

$$f_r(x) = \prod_{j=1}^n x_j^{\alpha^r_j}, \quad r = 1,2,\dots,k; \qquad g_r(p) = \prod_{j=1}^n p_j^{\alpha^r_j}, \quad r = 1,2,\dots,k.$$

As before let $X$ be the set of all $x$ satisfying (8.78) and $P$ the set of all $p$ satisfying (8.79). For each $\theta \in \mathbb{R}^k_+$ consider the two following problems:

$$\max\Big\{ \prod_{r=1}^k f_r(x)^{\theta_r}\ \Big|\ x \in X \Big\}, \tag{8.82}$$
$$\max\Big\{ \prod_{r=1}^k g_r(p)^{\theta_r}\ \Big|\ p \in P \Big\}. \tag{8.83}$$

Each of these problems is the maximization of a continuous strictly concave function over a polytope and consequently has a unique optimal solution which can be found by methods of convex minimization.

Theorem 8.14 Let $\bar x \in X$, $\bar p \in P$. Then $(\bar x, \bar p)$ is a solution of the system (8.78)-(8.81) if and only if there exists $\theta \in \mathbb{R}^k_+$ such that $\bar x$ is an optimal solution of problem (8.82) and $\bar p$ is an optimal solution of problem (8.83).

Proof (i) Suppose $(\bar x, \bar p)$ is a solution of the system (8.78)-(8.81). According to (8.81) there is $\theta \in \mathbb{R}^k_+$ such that

$$\alpha = \sum_{r=1}^k \theta_r \alpha^r. \tag{8.84}$$

By Theorem 8.13, $\bar x$ is the unique optimal solution of problem (8.74) and $\bar p$ the unique optimal solution of problem (8.75), where $\alpha_j = \sum_{r=1}^k \theta_r \alpha^r_j$. However, from (8.84) it follows that

$$f(x) = \prod_{j=1}^n x_j^{\alpha_j} = \prod_{j=1}^n x_j^{\sum_{r=1}^k \theta_r \alpha^r_j} = \prod_{j=1}^n \prod_{r=1}^k x_j^{\theta_r \alpha^r_j} = \prod_{r=1}^k \prod_{j=1}^n x_j^{\theta_r \alpha^r_j} = \prod_{r=1}^k \Big( \prod_{j=1}^n x_j^{\alpha^r_j} \Big)^{\theta_r} = \prod_{r=1}^k f_r(x)^{\theta_r}.$$

So problem (8.74) is exactly problem (8.82). Similarly,

$$g(p) = \prod_{r=1}^k g_r(p)^{\theta_r},$$

and problem (8.75) is exactly problem (8.83).

(ii) Conversely, suppose there exists $\theta \in \mathbb{R}^k_+$ such that $\bar x \in X$ is optimal to problem (8.82) and $\bar p$ is optimal to problem (8.83). By Theorem 8.13, $(\bar x, \bar p)$ is a solution of the system (8.78)-(8.81). □

Thus any solution $(\bar x, \bar p)$ of the nonconvex system (8.78)-(8.81) can be obtained either by taking $\bar x$ to be the optimal solution of problem (8.82) for some $\theta \in \mathbb{R}^k_+$ and $\bar p = \tilde\nabla f(\bar x)$ (with $f = \prod_{r=1}^k f_r^{\theta_r}$), or by taking $\bar p$ to be the optimal solution of problem (8.83) for some $\theta \in \mathbb{R}^k_+$ and $\bar x = \tilde\nabla g(\bar p)$ (with $g = \prod_{r=1}^k g_r^{\theta_r}$).

8.4.3 Optimization Under Resources Allocation Constraints

Consider now the optimization problem

$$\max\ q(x) \quad \text{s.t.} \tag{8.85}$$
$$x = \sum_{i=1}^m \lambda_i a^i, \quad \lambda_i \ge 0, \ i = 1,2,\dots,m, \quad \sum_{i=1}^m \lambda_i = 1, \tag{8.86}$$
$$p_j \ge 0, \ j = 1,2,\dots,n, \qquad p^T a^i \le 1, \ i = 1,2,\dots,m, \tag{8.87}$$
$$p_j x_j = \alpha_j, \quad j = 1,2,\dots,n, \tag{8.88}$$
$$\alpha = (\alpha_1,\dots,\alpha_n) \in \Theta, \tag{8.89}$$

where $q(x)$ is a continuous concave function defined on $\mathbb{R}^n_+$ such that $q(x) > 0$ $\forall x > 0$. This is a difficult problem of concave maximization over a highly nonconvex set.

We shall refer to a vector $x \in \mathbb{R}^n_+$ for which there is a vector $p \in \mathbb{R}^n_+$ satisfying (8.86) through (8.89) as a feasible production program (under given resources allocation constraints). If $q(x)$ represents a profit, this problem amounts to finding a feasible production program with maximal profit.

By $X_E$ denote the set of all feasible production programs $x$, so the problem (8.85)-(8.89) is

$$\max\{q(x) \mid x \in X_E\}. \tag{8.90}$$

Recall from Theorem 8.14 that $x \in X_E$ if and only if there exists $\theta \in \mathbb{R}^k_+$ such that $x$ is the unique maximizer of the function $\prod_{r=1}^k f_r^{\theta_r}(x)$ over $X$. So problem (8.90) becomes

$$\max\Big\{ q(x)\ \Big|\ x \in X,\ \theta \in \mathbb{R}^k_+,\ \prod_{r=1}^k f_r^{\theta_r}(x) = \max_{x' \in X} \prod_{r=1}^k f_r^{\theta_r}(x') \Big\}. \tag{8.91}$$

By rescaling if necessary we can assume without loss of generality that $\min_{j=1,\dots,n} a_{ij} > 2$ for every $i = 1,\dots,m$, and consequently, that

$$\min_{j=1,\dots,n} x_j > 2 \quad \forall x \in X. \tag{8.92}$$

Define now

$$M = \Big\{ \tau = (\tau_1,\tau_2,\dots,\tau_k) \in \mathbb{R}^k_+\ \Big|\ \max_{x \in X} \prod_{r=1}^k f_r^{\tau_r}(x) \le 3 \Big\},$$
$$h(\tau) = \max\Big\{ q(x)\ \Big|\ \prod_{r=1}^k f_r^{\tau_r}(x) \ge 3,\ x \in X \Big\} \quad \forall \tau \in \mathbb{R}^k_+$$

(agree that $\max \emptyset = 0$).

Note that for each $\tau \in \mathbb{R}^k_+$ the function $\log \prod_{r=1}^k f_r^{\tau_r}(x) = \sum_{r=1}^k \tau_r \log(f_r(x)) = \sum_{r=1}^k \tau_r \sum_{j=1}^n \alpha^r_j \log(x_j)$ is concave, so the feasible set of the subproblem that defines $h(\tau)$ is a convex subset of the polytope $X$. Therefore, the value of $h(\tau)$ is obtained by solving a convex problem (maximizing a concave function over a compact convex set).

Lemma 8.2 $M$ is a nonempty compact convex set in $\mathbb{R}^k_+$.

Proof For any $\tau \in \mathbb{R}^k_+$ we have

$$\max_{x \in X} \prod_{r=1}^k f_r^{\tau_r}(x) \le 3 \ \Leftrightarrow\ \max_{x \in X} \sum_{r=1}^k \tau_r \log(f_r(x)) \le \log(3).$$

Since $\max_{x \in X} \sum_{r=1}^k \tau_r \log(f_r(x)) \le \big( \max_{x \in X} \max_r \log(f_r(x)) \big) \sum_{r=1}^k \tau_r$, it is easily seen that there always exists $\tau \in \mathbb{R}^k_+$ such that $\max_{x \in X} \prod_{r=1}^k f_r^{\tau_r}(x) \le 3$. This proves that $M \ne \emptyset$.

The function

$$h^0(\tau) = \max_{x \in X} \sum_{r=1}^k \tau_r \log(f_r(x))$$

is the pointwise maximum of a family of linear functions of $\tau$, so it is a closed convex function on $\mathbb{R}^k_+$. Therefore, $M$ is a closed convex set. Furthermore, since $\log(f_r(x)) = \sum_{j=1}^n \alpha^r_j \log(x_j) > \log(2)$ by (8.92) (recall that $\sum_{j=1}^n \alpha^r_j = 1$), we have

$$\tau \in M \ \Leftrightarrow\ \tau \ge 0, \ h^0(\tau) \le \log(3)$$
$$\Rightarrow\ \tau \ge 0, \ \sum_{r=1}^k \tau_r \log(f_r(x)) \le \log(3) \ \ \forall x \in X$$
$$\Rightarrow\ \tau \ge 0, \ \sum_{r=1}^k \tau_r \log(2) \le \log(3).$$

So, $M$ is bounded. □

Lemma 8.3 The function $h$ is continuous and convex on $\mathbb{R}^k_+$.

Proof The constraint $\prod_{r=1}^k f_r^{\tau_r}(x) \ge 3$ can be written as $\sum_{r=1}^k \tau_r \log(f_r(x)) \ge \log 3$. Since $q(x)$ and $\log(f_r(x))$ are continuous concave, the subproblem that defines $h(\tau)$ is a convex program. Since furthermore $\log(f_r(x))$ is strictly concave, it is easily seen that there must exist $x \in X$ satisfying $\sum_{r=1}^k \tau_r \log(f_r(x)) > \log 3$. Therefore, by the strong Lagrange duality theorem, there exists $\eta \in \mathbb{R}_+$ such that

$$h(\tau) = \max_{x \in X} \Big\{ q(x) + \eta \Big[ \sum_{r=1}^k \tau_r \log(f_r(x)) - \log 3 \Big] \Big\}.$$

Thus $h(\tau)$ is the upper envelope of a family of affine functions. The convexity and continuity of $h(\tau)$ on $\mathbb{R}^k_+$ follow. □
Consider now the problem

$$\max\{h(\tau) \mid \tau \in M\}, \tag{8.93}$$

which is a convex maximization over a convex compact set.

Theorem 8.15 The optimal values in problems (8.90) and (8.93) are equal: $q^* = h^*$. Moreover, if $\tau^*$ solves (8.93) then the unique maximizer $x^*$ of the strictly concave function $\prod_{r=1}^k f_r^{\tau^*_r}(x)$ on $X$ solves (8.90).

Proof Let $x^*$ be an optimal solution of (8.90), i.e., $x^* \in X_E$, $q(x^*) = q^*$. Since $x^* \in X_E$, there is a vector $\theta \in \mathbb{R}^k_+$ such that $x^*$ is the optimal solution of (8.82). It is easy to see that for some $\gamma > 0$:

$$\prod_{r=1}^k f_r^{\gamma\theta_r}(x^*) = 3. \tag{8.94}$$

Indeed, let

$$\omega = \prod_{r=1}^k f_r^{\theta_r}(x^*).$$

By (8.92), $x^*_j > 2$, $j = 1,\dots,n$, so $\log(\omega) > 0$. Taking

$$\gamma = \frac{\log(3)}{\log(\omega)}$$

yields $\gamma \log(\omega) = \log(3)$, i.e., $\log\big( \prod_{r=1}^k f_r^{\gamma\theta_r}(x^*) \big) = \log(3)$, hence (8.94). Now let $\tau = \gamma\theta$. Clearly

$$\max_{x \in X} \prod_{r=1}^k f_r^{\tau_r}(x) = \Big( \max_{x \in X} \prod_{r=1}^k f_r^{\theta_r}(x) \Big)^{\gamma} = \Big( \prod_{r=1}^k f_r^{\theta_r}(x^*) \Big)^{\gamma} = 3,$$

so $\tau \in M$. Hence,

$$q^* = q(x^*) \le \sup\Big\{ q(x)\ \Big|\ \prod_{r=1}^k f_r^{\tau_r}(x) \ge 3,\ x \in X \Big\} = h(\tau) \le h^*.$$

This implies, in particular, that $h^* > 0$. Conversely, let $\tau^*$ be an optimal solution of (8.93), i.e., $\tau^* \in M$ and $h(\tau^*) = h^*$. By $x^*$ denote a maximizer of the function $\prod_{r=1}^k f_r^{\tau^*_r}(x)$ on $X$. By Theorem 8.14, $x^* \in X_E$. Since $\tau^* \in M$ we have

$$\prod_{r=1}^k f_r^{\tau^*_r}(x^*) = \sup_{x \in X} \prod_{r=1}^k f_r^{\tau^*_r}(x) \le 3.$$

But if

$$\prod_{r=1}^k f_r^{\tau^*_r}(x^*) < 3,$$

then

$$h(\tau^*) = \sup\Big\{ q(x)\ \Big|\ \prod_{r=1}^k f_r^{\tau^*_r}(x) \ge 3,\ x \in X \Big\} = \sup \emptyset = 0 < h^*,$$

which is a contradiction. Thus,

$$\prod_{r=1}^k f_r^{\tau^*_r}(x^*) = 3,$$

and therefore,

$$h^* = h(\tau^*) = \sup\Big\{ q(x)\ \Big|\ \prod_{r=1}^k f_r^{\tau^*_r}(x) \ge 3,\ x \in X \Big\} = q(x^*) \le q^*.$$

Consequently, $q^* = h^*$, and so $x^*$ is an optimal solution of the problem (8.90). □

Thus, by quasi-supgradient duality a difficult nonconvex problem in an $n$-dimensional space like (8.85)-(8.89) can be converted into a $k$-dimensional problem (8.93) with $k \ll n$, which is a much more tractable problem of convex maximization over a compact convex set. Especially when $k$ is small, as is often the case in practice, the converted problem can be solved efficiently even by the outer approximation method as proposed in Thach et al. (1996).

8.5 Noncanonical DC Problems

In many dc optimization problems the constraint set is defined by a mixture of inequalities of the following types:

- convex;
- reverse convex;
- piecewise linear;
- quadratic (possibly indefinite).

Problems of this form often occur, for instance, in continuous location theory (Tuy 1996; Tuy et al. 1995a). To handle these problems it is not always convenient to reduce them to the canonical form. In this section we present an outer approximation method based on a visibility assumption borrowed from location theory. This method is practical when the number of nonconvex variables is small.

Consider the problem

$$\min\{f(x) \mid x \in S\}, \tag{8.95}$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is a convex function, and $S$ is a compact set in $\mathbb{R}^n$.

A feasible point $x$ is said to be visible from a given point $w \in \mathbb{R}^n \setminus S$ if $x$ is the unique feasible point in the line segment $[w, x]$, i.e., if $[w, x] \cap S = \{x\}$ (Fig. 7.3). Our basic assumption is the following:

Visibility Assumption: The feasible set $S$ is such that there exists an efficient finite procedure to compute, for any given point $w \in \mathbb{R}^n \setminus S$ and any $z \ne w$, the visible feasible point $\pi_w(z)$ on the line segment $[w, z]$, if it exists.

Note that since $S$ is compact, either the line through $w, z$ does not meet $S$, or $\pi_w(z)$ exists and is the nearest feasible point to $w$ in this line. To simplify the notation we shall omit the subscript $w$ and write $\pi(z)$ when $w$ is clear from the context.

Proposition 8.4 If $S = \{x \mid g_i(x) \le 0,\ i = 1,\dots,m\}$ and each function $g_i(x)$ is either convex, concave, piecewise linear, or quadratic, then $S$ satisfies the visibility assumption.

Proof It suffices to check that if $g(x)$ belongs to one of the mentioned types and $g(w) > 0$ then, for any given $z \ne w$, one can easily determine whether $g(w + \lambda(z - w)) \le 0$ for some $\lambda > 0$ and, if so, compute the smallest such $\lambda$. In particular, if $g(x)$ is quadratic (definite or indefinite), then $g(w + \lambda(z - w))$ is a quadratic function of $\lambda$, so computing $\sup\{\lambda \mid g(w + \lambda(z - w)) \le 0\}$ is an elementary problem. □
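To make Proposition 8.4 concrete, the sketch below (not from the book) computes the visible point $\pi_w(z)$ for a set defined by purely quadratic constraints $g_i(x) = x^T Q_i x + q_i^T x + d_i \le 0$. For clarity it scans a dense grid of $t \in [0,1]$ instead of solving each scalar quadratic exactly; in an exact implementation one would intersect the root intervals of the univariate quadratics $t \mapsto g_i(w + t(z - w))$.

```python
# Minimal sketch (not from the book) of the visibility computation pi_w(z)
# when every constraint is quadratic. Data below are hypothetical.
import numpy as np

def visible_point(w, z, quads, tol=1e-9, grid=10001):
    """Return the feasible point on [w, z] nearest to w, or None."""
    d = z - w
    for t in np.linspace(0.0, 1.0, grid):      # smallest feasible t first
        x = w + t * d
        if all(x @ Q @ x + q @ x + c <= tol for Q, q, c in quads):
            return x
    return None

# Feasible set: ring 0.5 <= |x| <= 1 (one convex, one reverse convex piece).
quads = [(np.eye(2), np.zeros(2), -1.0),       # |x|^2 - 1    <= 0
         (-np.eye(2), np.zeros(2), 0.25)]      # 0.25 - |x|^2 <= 0
w, z = np.array([2.0, 0.0]), np.array([0.0, 0.0])
print(visible_point(w, z, quads))              # ~ (1, 0)
```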
Since $f(x)$ is convex one can easily compute $w \in \mathrm{argmin}\{f(x) \mid x \in \mathbb{R}^n\}$. If it so happens that $w \in S$, we are done: $w$ solves the problem. Otherwise, $w \notin S$ and we have

$$f(w) < \min\{f(x) \mid x \in S\}. \tag{8.96}$$

From the convexity of $f(x)$ it then follows that $f(w + \lambda(z - w)) < f(z)$ for any $z \in S$, $\lambda \in (0,1)$, so the visible feasible point in $[w, z]$ achieves the minimum of $f(x)$ over the line segment $[w, z]$. Therefore, assuming the availability of a point $w$ satisfying (8.96), we can reformulate (8.95) as the following:

Optimal Visible Point Problem: Find the feasible point visible from $w$ with minimal function value $f(x)$.

Aside from the basic visibility condition and (8.96), we will further assume that:

(a) the set $S$ is robust, i.e., $S = \mathrm{cl}(\mathrm{int}\,S)$;
(b) $f(x)$ is strictly convex and has bounded level sets;
(c) a number $\gamma \in \mathbb{R}$ is known such that $f(x) < \gamma$ $\forall x \in S$.

Let $P_0$ be a polytope containing $S$. If we denote $\bar\gamma = \min\{f(x) \mid x \in S\}$, $G = \{x \in P_0 \mid f(x) \le \bar\gamma\}$, then $G$ is a closed convex set and the problem is to find a point of $\Omega := S \cap G$. Applying the outer approximation method to this problem, we construct a sequence of polytopes $P_1 \supset P_2 \supset \dots \supset G$. At iteration $k$ let $\bar x^k$ be the incumbent feasible point and $z^k \in \mathrm{argmax}\{f(x) \mid x \in P_k\}$. The distinguished point associated with $P_k$ is defined to be $x^k = \pi(z^k)$. Clearly, if $x^k \in G$, then $x^k$ is an optimal solution. If $f(x^k) < f(\bar x^k)$, then $\bar x^{k+1} = x^k$; otherwise $\bar x^{k+1} = \bar x^k$. Furthermore, $P_{k+1} = P_k \cap \{x \mid l_k(x) \le 0\}$, where $l_k(x) \le 0$ is a cut which separates $z^k$ from the closed convex set $G$. We can thus state the following OA algorithm for solving the problem (8.95):

Optimal Visible Point Algorithm

Initialization. Take a polytope (simplex or rectangle) $P_1$ known to contain one optimal solution. Let $\bar x^1$ be the best feasible solution available, $\gamma_1 = f(\bar x^1)$ (if no feasible solution is known, set $\bar x^1 = \emptyset$, $\gamma_1 = \gamma$). Let $V_1$ be the vertex set of $P_1$. Set $k = 1$.

Step 1. If $\bar x^k \in \mathrm{argmin}\{f(x) \mid x \in P_k\}$, then terminate: $\bar x^k$ is an optimal solution.
Step 2. Compute $z^k \in \mathrm{argmax}\{f(x) \mid x \in V_k\}$.
Step 3. Find $\pi(z^k)$. If $\pi(z^k)$ exists and $f(\pi(z^k)) < f(\bar x^k)$, set $\bar x^{k+1} = \pi(z^k)$ and $\gamma_{k+1} = f(\bar x^{k+1})$; otherwise set $\bar x^{k+1} = \bar x^k$, $\gamma_{k+1} = \gamma_k$.
Step 4. Let $y^k \in [w, z^k] \cap \{x \mid f(x) = \gamma_{k+1}\}$. Compute $p^k \in \partial f(y^k)$ and let

$$P_{k+1} = P_k \cap \{x \mid \langle p^k, x - y^k\rangle \le 0\}.$$

Step 5. Compute the vertex set $V_{k+1}$ of $P_{k+1}$. Set $k \leftarrow k+1$ and go back to Step 1.
Figure 8.3 illustrates the algorithm on a simple example in $\mathbb{R}^2$ where the feasible domain consists of two disjoint parts. Starting from a rectangle $P_1$ we arrive after two iterations at a polygon $P_3$. The feasible solution $\bar x^2$ obtained at the second iteration is already an optimal solution.

Theorem 8.16 Either the above algorithm terminates with an optimal solution, or it generates an infinite sequence $\{\bar x^k\}$ every accumulation point of which is an optimal solution.
Fig. 8.3 The optimal visible point problem

Proof Let $\bar\gamma = \lim_{k \to +\infty} \gamma_k$, $G = \{x \mid f(x) \le \bar\gamma\}$. By Theorem 6.5, $z^k - y^k \to 0$ and any accumulation point $\bar z$ of $\{z^k\}$ belongs to $G$. Suppose now that there exists a feasible point $x^*$ better than $\bar z$, e.g., such that $f(\bar z) - f(x^*) > \varepsilon > 0$. By robustness of $S$, there exists a point $x' \in \mathrm{int}\,S$ so close to $x^*$ that $f(x') < f(\bar z) - \varepsilon$ and, consequently, a ball $W \subset S$ around $x'$ such that $f(x) < f(\bar z) - \varepsilon$ $\forall x \in W$. Let $\hat x$ be the point where the halfline from $w$ through $x'$ meets $\partial G$. Since $f(x)$ is strictly convex, its maximum over $P_1 \supset G$ is achieved at a vertex, hence $f(\hat x) < f(z^1)$ and there exists in $P_1$ a sequence $\{x^q\} \to \hat x$ such that $f(x^q) \searrow f(\hat x) = \bar\gamma$. Since $f(z^k) = \max\{f(x) \mid x \in P_k\} \searrow \bar\gamma$, for each $q$ there exists $k_q$ such that $f(z^{k_q}) < f(x^q)$, hence $x^q \notin P_{k_q}$, and consequently, $\langle p^{k_q}, x^q - y^{k_q}\rangle > 0$. By passing to subsequences if necessary we may assume $z^{k_q} \to \bar z$, $y^{k_q} \to \bar y$, $p^{k_q} \to \bar p$. Since $z^{k_q} - y^{k_q} \to 0$, we must have $\bar y = \bar z \in G$. Furthermore, from $p^k \in \partial f(y^k)$ it follows that $\bar p \in \partial f(\bar y)$, hence $\bar p \ne 0$, for the relation $0 \in \partial f(\bar y)$ would imply that $\bar y$ is a minimizer of $f(x)$ over $\mathbb{R}^n$, which is not the case. Letting $q \to \infty$ in the inequality $\langle p^{k_q}, x^q - y^{k_q}\rangle > 0$ then yields $\langle \bar p, \hat x - \bar y\rangle \ge 0$. This together with the equality $f(\hat x) = f(\bar y) = \bar\gamma$ implies, by the strict convexity of $f(x)$, that $\hat x = \bar y$, and consequently, $z^{k_q} \to \hat x$. For sufficiently large $q$ the halfline from $w$ through $z^{k_q}$ must then meet $W \subset S$, i.e., $\pi(z^{k_q})$ exists and $f(\pi(z^{k_q})) < f(\bar z) - \varepsilon$, conflicting with $f(\pi(z^k)) \ge \bar\gamma = f(\bar z)$ $\forall k$. Thus, $\bar z$ is a global optimal solution, and hence so is any accumulation point of the sequence $\{\bar x^k\}$. □

Remark 8.9 Assumptions (a)-(c) are essential for the implementability and convergence of the algorithm. If $f(x)$ is not strictly convex, one could add a small perturbation $\varepsilon\|x\|^2$ to make it strictly convex, though this may not always be computationally practical. The robustness assumption can be bypassed by requiring only an $\varepsilon$-approximate optimal solution, as in the case of (CDC).
Remark 8.10 Suppose $S = \{x \mid g_i(x) \le 0,\ i = 1,\dots,m\}$ and we know, for every $i \in I \subset \{1,2,\dots,m\}$, a concave function $\tilde g_i(x)$ such that $\tilde g_i(x) \le g_i(x)$ for all $x \in P_1$. Then along with $z^k$ we can compute the value

$$\varepsilon_k = \max_{i \in I}\, \min_{x \in P_k} \tilde g_i(x).$$

(Of course this involves minimizing the concave functions $\tilde g_i(x)$ over $P_k$, but not much extra computational effort is needed, since we have to compute the vertex set $V_k$ of $P_k$ anyway.) Let $g(x) = \max_{i=1,\dots,m} g_i(x)$. Clearly,

$$\varepsilon_k \le \max_{i=1,\dots,m}\, \min_{x \in P_k} g_i(x) \le \min_{x \in P_k}\, \max_{i=1,\dots,m} g_i(x) = \min_{x \in P_k} g(x). \tag{8.97}$$

But it is easy to see that $\{x \in P_1 \mid f(x) \le f(\bar x^k)\} \subset P_k$. Indeed, the inclusion is obvious for $k = 1$; assuming that it holds for some $k \ge 1$, we have, for any $x \in P_k$ satisfying $f(x) \le f(\bar x^{k+1})$: $\langle p^k, x - y^k\rangle \le f(x) - f(y^k) = f(x) - f(\bar x^{k+1}) \le 0$, hence $x \in P_{k+1}$, so that the inclusion also holds for $k+1$. From the just established inclusion and (8.97) we deduce

$$\min\{g(x) \mid x \in P_1,\ f(x) \le f(\bar x^k)\} \ge \varepsilon_k,$$

implying that there is no $x \in P_1$ such that $g(x) < \varepsilon_k$, $f(x) \le f(\bar x^k)$. In other words,

$$f(\bar x^k) \le \inf\{f(x) \mid g(x) < \varepsilon_k\},$$

so $\bar x^k$ is an $\varepsilon_k$-approximate optimal solution.

8.6 Continuous Optimization Problems

The most challenging of all nonconvex optimization problems is the one in which no convexity or reverse convexity is explicitly present, namely:

$$(\mathrm{COP}) \qquad \min\{f(x) \mid x \in S\},$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is a continuous function and $S$ is a compact set in $\mathbb{R}^n$ given by a system of continuous inequalities of the form

$$g_i(x) \le 0, \quad i = 1,\dots,m. \tag{8.98}$$

Despite its difficulty, this problem started to be investigated by a number of authors from the early seventies (Dixon and Szegő (1975, 1978) and the references therein). However, most methods in this period dealt with unconstrained minimization of smooth or Lipschitzian functions and were able to handle only problems of just one or two dimensions. Attempts to solve constrained problems of higher dimensions by deterministic methods have begun only in recent years.

As shown in Chap. 4 the core of (COP) is the subproblem of transcending an incumbent:

(SP) Given a number $\alpha \in f(S)$ (the best feasible value of the objective function known at a given stage), find a point $x \in S$ with $f(x) < \alpha$, or else prove that $\alpha = \min\{f(x) \mid x \in S\}$.

In the so-called modified function approaches, the subproblem (SP) for $\alpha = f(\bar x)$ is solved by replacing the original function with a properly modified function such that a local search procedure, started from $\bar x$ and applied to the modified function, will lead to a better feasible solution than $\bar x$ when $\bar x$ is not yet a global minimizer. However, a modified function satisfying the required conditions is very difficult to construct. In its two best known versions, this modified function (tunneling function of Levy and Montalvo (1988), or filled function of Ge and Qin (1987)) depends on parameters whose correct values, in many cases, can be determined only by trial and error.

The following approach to (COP) (Thach and Tuy 1990) is based on the transformation of (SP) into the minimization of a dc function $\varphi(\alpha, x)$, called a relief indicator.

Assume that the problem is regular, i.e.,

$$\min\{f(x) \mid x \in S\} = \inf\{f(x) \mid x \in \mathrm{int}\,S\}. \tag{8.99}$$

Since $f(x)$ is continuous this assumption is satisfied provided the constraint set $S$ is robust:

$$S = \mathrm{cl}(\mathrm{int}\,S),$$

where as usual int and cl denote the interior and the closure, respectively. For $\alpha \in (-\infty, +\infty]$ define

$$F(\alpha, x) = \sup\{f(x) - \alpha,\ g_1(x),\dots,g_m(x)\}. \tag{8.100}$$

Proposition 8.5 Under the assumption (8.99) there holds

$$\alpha = \min_{x \in S} f(x) \ \Leftrightarrow\ 0 = \min_{x \in \mathbb{R}^n} F(\alpha, x).^1 \tag{8.101}$$

Proof If $\alpha = \min_{x \in S} f(x)$ and $\bar x \in S$ satisfies $f(\bar x) = \alpha$, then $g_i(\bar x) \le 0$ for $i = 1,\dots,m$, hence $F(\alpha, \bar x) = 0$; on the other hand, $f(x) \ge \alpha$ $\forall x \in S$, hence $F(\alpha, x) \ge 0$ $\forall x \in \mathbb{R}^n$, i.e., $\min_{x \in \mathbb{R}^n} F(\alpha, x) = 0$. Conversely, if the latter equality holds then for all $x \in \mathrm{int}\,S$, i.e., for all $x$ satisfying $g_i(x) < 0$, $i = 1,\dots,m$, we have $f(x) \ge \alpha$, hence $\min_{x \in S} f(x) \ge \alpha$ in view of (8.99). Furthermore, there is $\bar x$ satisfying $F(\alpha, \bar x) = 0$, hence $g_i(\bar x) \le 0$, $i = 1,\dots,m$ (so $\bar x \in S$) and $f(\bar x) = \alpha$, i.e., $\alpha = \min_{x \in S} f(x)$. □

Thus, solving (COP) can be reduced, essentially, to finding the unconstrained minimum of the function $F(\alpha, x)$: if this minimum is zero and $F(\alpha, \bar x) = 0$, then $\bar x$ solves (COP); if it is negative, then any $x$ such that $F(\alpha, x) < 0$ will be a feasible point satisfying $f(x) < \alpha$; finally, if it is positive then $S = \emptyset$, i.e., the problem is infeasible.

¹ If $f, g_1,\dots,g_m$ are convex, condition (8.99) is implied by $\mathrm{int}\,S \ne \emptyset$, and by writing (8.101) in the form $0 \in \partial F(\alpha, \bar x)$ we recover the classical optimality criterion in convex programming.

However, finding the global minimum of $F(\alpha, x)$ may be as hard as solving the original problem. Therefore, we look for some function $\varphi(\alpha, x)$ which could do essentially the same job as $F(\alpha, x)$ while being easier to handle. The key for this is the representation of closed sets by dc inequalities as discussed in Chap. 3. Define

$$\tilde S_\alpha = \{x \in \mathbb{R}^n \mid F(\alpha, x) \le 0\} = \{x \in S \mid f(x) \le \alpha\}. \tag{8.102}$$

To make the problem tractable, assume that for every $\alpha \in (-\infty, +\infty]$ a function $r(\alpha, x)$ (separator for $(f, S)$) is available which is l.s.c. in $x$ and satisfies:

(a) $r(\alpha, x) = 0$ $\forall x \in \tilde S_\alpha$;
(b) $0 < r(\alpha, y) \le d(y, \tilde S_\alpha) := \min\{\|x - y\| \mid x \in \tilde S_\alpha\}$ $\forall y \notin \tilde S_\alpha$;
(c) for fixed $x$, $r(\alpha_1, x) \ge r(\alpha_2, x)$ if $\alpha_1 < \alpha_2$.

This assumption is fulfilled in many cases of interest, as shown in the following examples:

Example 8.3 If all functions $f, g_1,\dots,g_m$ are Lipschitzian with constants $L_0, L_1,\dots,L_m$, then a separator is

$$r(\alpha, x) = \max\Big\{ 0,\ \frac{f(x) - \alpha}{L_0},\ \frac{g_i(x)}{L_i},\ i = 1,\dots,m \Big\}.$$

Indeed, for any $x \in \tilde S_\alpha$:

$$f(y) - \alpha \le f(y) - f(x) \le L_0 \|x - y\|,$$
$$g_i(y) \le g_i(y) - g_i(x) \le L_i \|x - y\|, \quad i = 1,\dots,m,$$

hence $r(\alpha, y) \le \|x - y\|$; from this it is easy to verify (a)-(c).
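In code, this Lipschitz separator is a one-liner. The sketch below (not from the book) assumes the objective $f$, the constraint functions $g_i$, and their Lipschitz constants are supplied by the caller.

```python
# Sketch (not from the book) of the Lipschitz separator of Example 8.3.
def separator(alpha, x, f, gs, L0, Ls):
    """r(alpha, x) = max(0, (f(x) - alpha)/L0, g_i(x)/L_i)."""
    vals = [(f(x) - alpha) / L0] + [g(x) / L for g, L in zip(gs, Ls)]
    return max(0.0, *vals)
```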
Example 8.4 If $f$ is twice continuously differentiable, with $\|f''(x)\| \le K$ for every $x$, and $S = \{x \mid \|x\| \le c\}$, then a separator is

$$r(\alpha, x) = \max\{\rho(\alpha, x),\ \|x\| - c\},$$

where $\rho(\alpha, x)$ denotes the unique nonnegative root of the equation

$$\frac{K}{2}\, t^2 + \|f'(x)\|\, t = \max\{0,\ f(x) - \alpha\}.$$

Indeed, we need only verify condition (b), since the other conditions are obvious. For any $x \in \tilde S_\alpha$ we have, by Taylor's formula:

$$|f(x) - f(y) - f'(y)(x - y)| \le \frac{K}{2}\, \|x - y\|^2,$$

hence

$$|f(x) - f(y)| \le \|f'(y)\|\, \|x - y\| + \frac{K}{2}\, \|x - y\|^2,$$
$$f(x) - f(y) \ge -\frac{K}{2}\, \|x - y\|^2 - \|f'(y)\|\, \|x - y\|.$$

Now $\frac{K}{2}\, t^2 + \|f'(y)\|\, t$ is a monotonically increasing function of $t \ge 0$. Therefore, since $f(x) \le \alpha$ implies that

$$\frac{K}{2}\, \|x - y\|^2 + \|f'(y)\|\, \|x - y\| \ge f(y) - \alpha,$$

it follows from the definition of $\rho(\alpha, y)$ that $\rho(\alpha, y) \le \|x - y\|$. On the other hand, since $\|x\| \le c$, we have $\|y\| - c \le \|y\| - \|x\| \le \|x - y\|$. Hence, $r(\alpha, y) \le \|x - y\|$ $\forall x \in \tilde S_\alpha$, proving (b).
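Since the defining equation of $\rho(\alpha, x)$ is a scalar quadratic in $t$, its nonnegative root is available in closed form by the quadratic formula. The sketch below (not from the book) implements this separator; $f$, its gradient, and the bound $K$ are assumed given.

```python
# Sketch (not from the book) of the separator of Example 8.4. The root of
# (K/2) t^2 + ||f'(x)|| t = max(0, f(x) - alpha) comes from the quadratic formula.
import numpy as np

def rho(alpha, x, f, grad_f, K):
    rhs = max(0.0, f(x) - alpha)
    b = np.linalg.norm(grad_f(x))
    return 0.0 if rhs == 0.0 else (-b + np.sqrt(b*b + 2*K*rhs)) / K

def separator(alpha, x, f, grad_f, K, c):
    return max(rho(alpha, x, f, grad_f, K), np.linalg.norm(x) - c)
```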
By Proposition 4.13, given a separator $r(\alpha, x)$, if we define

$$h(\alpha, x) = \sup_{y \notin \tilde S_\alpha} \{r^2(\alpha, y) + 2xy - \|y\|^2\},$$

then $h(\alpha, x)$ is a closed convex function in $x$ such that

$$\tilde S_\alpha = \{x \mid h(\alpha, x) - \|x\|^2 \le 0\}. \tag{8.103}$$

Since for any $y \in S$ satisfying $f(y) = \alpha$ we have $r(\alpha, y) = 0$, hence $r^2(\alpha, y) + 2xy - \|y\|^2 - \|x\|^2 = -\|x - y\|^2 \le 0$ $\forall x$, the representation (8.103) still holds with

$$h(\alpha, x) = \sup_{y \notin S_\alpha} \{r^2(\alpha, y) + 2xy - \|y\|^2\}, \tag{8.104}$$

where

$$S_\alpha = \{x \in S \mid f(x) < \alpha\}. \tag{8.105}$$

Proposition 8.6 Assume $S_\alpha \ne \emptyset$ and let $\varphi(\alpha, x) = h(\alpha, x) - \|x\|^2$. If

$$\varphi(\alpha, \bar x) = 0 = \min_{x \in \mathbb{R}^n} \varphi(\alpha, x), \tag{8.106}$$

then $\bar x$ is a global optimal solution of (COP). Otherwise, the minimum of $\varphi(\alpha, x)$ is negative and any $x$ such that $\varphi(\alpha, x) < 0$ satisfies $x \in S$, $f(x) < \alpha$.

Proof Since $F(\alpha, x) \le 0 \Leftrightarrow \varphi(\alpha, x) \le 0$ (see (8.102) and (8.103)), it suffices to show further that $F(\alpha, x) < 0 \Leftrightarrow \varphi(\alpha, x) < 0$. But if $F(\alpha, x) < 0$ then $x \in \mathrm{int}\,S$ and, denoting by $\delta > 0$ the radius of a ball around $x$ contained in $S_\alpha$, we have for every $y \notin S_\alpha$: $d^2(y, S_\alpha) \le \|x - y\|^2 - \delta^2$, hence $r^2(\alpha, y) \le \|x - y\|^2 - \delta^2$, i.e., $\varphi(\alpha, x) \le -\delta^2 < 0$. Conversely, if $\varphi(\alpha, x) < 0$ then for every $y \notin S_\alpha$: $r^2(\alpha, y) + 2xy - \|y\|^2 < \|x\|^2 - \varepsilon$, with $\varepsilon > 0$, hence $r^2(\alpha, y) < \|x - y\|^2 - \varepsilon$, and we must have $F(\alpha, x) < 0$ (for otherwise $x \notin \mathrm{int}\,S_\alpha$ and by taking a sequence $y^n \to x$ such that $y^n \notin S_\alpha$ $\forall n$ and letting $n \to \infty$ in the inequality $r^2(\alpha, y^n) < \|x - y^n\|^2 - \varepsilon$ we would arrive at a contradiction: $0 \le -\varepsilon$). □

The function '.; x/ is called a relief indicator. By Proposition 8.6 it gives


information about the altitude f .x/ of every point x relative to the level and
can be used as a substitute for F.; x/.
The knowledge of a relief indicator '.; x/ reduces each subproblem SP.) to an
unconstrained dc optimization problem, namely:

minimize h.; x/  kxk2 s:t: x 2 Rn : (8.107)

We shall shortly discuss an algorithm for solving this dc optimization problem.


Assuming such an algorithm available, the global optimum of (COP) can be
computed by the following iterative scheme:
Relief Indicator Method
Start with an initial value 1 2 f .S/, if available, or 1 D C1 if no feasible solution
is known. At iteration k D 1; 2; : : : solve

minf'.k ; x/j x 2 Rn g; (8.108)

to obtain its optimal solution xk and optimal value k . If k  0 then terminate: xk


is an optimal solution of (COP) (if k D 0/, or (COP) is infeasible (if k > 0/.
Otherwise, xk is a feasible solution such that f .xk / < k W then set kC1 D f .xk / and
go to the next iteration.
The convergence of this iterative scheme is immediate. Indeed, if x is a accumu-
lation point of the sequence fxk g and D f .x/, then x 2 S and '.; x/  0. On the
other hand, since xk S it follows that '.; x/  r2 .; xk / C 2xxk  kxk k2  kxk2 
kxk  xk2 ! 0 as xk ! x. Hence, '.; x/ D 0, so by Proposition 8.6 x is a global
optimal solution of (COP). t
u
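The outer loop of this scheme is very short once a global solver for the dc subproblem (8.108) is available. The sketch below (not from the book) shows the control flow; `solve_dc` is a placeholder for any routine returning a global minimizer and minimum of $\varphi(\alpha, \cdot)$, e.g., the OA algorithm of the next subsection, and the termination cases mirror Proposition 8.6.

```python
# Schematic sketch (not from the book) of the Relief Indicator Method.
import math

def relief_indicator_method(f, solve_dc, alpha=math.inf, tol=1e-8):
    while True:
        x, mu = solve_dc(alpha)     # min of phi(alpha, .) and its value
        if mu > tol:
            return None, None       # (COP) is infeasible
        if mu >= -tol:
            return x, alpha         # alpha is the global minimum of (COP)
        alpha = f(x)                # x feasible with f(x) < alpha: iterate
```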

8.6.1 Outer Approximation Algorithm

The implementation of the above conceptual scheme requires the availability of a procedure for solving subproblems of the form (8.108). One possibility is to rewrite (8.108) as a concave minimization problem:

$$\min\{t - \|x\|^2 \mid h(\alpha_k, x) \le t\}, \tag{8.109}$$

and to solve the latter by the simple OA algorithm (Sect. 6.2). Since $\alpha_{k+1} < \alpha_k$ we have $h(\alpha_k, x) \le h(\alpha_{k+1}, x)$, so if $G_k$ denotes the constraint set of (8.109) then $G_{k+1} \subset G_k$ $\forall k$. This allows the procedures of solving (8.109) for successive $k = 1, 2, \dots$ to be integrated into a unified OA procedure for solving the single concave minimization problem

$$\min\{t - \|x\|^2 \mid h(\bar\alpha, x) \le t\}, \tag{8.110}$$

where $\bar\alpha$ is the (unknown) optimal value of (COP).



Specifically, denote by $G$ the constraint set of (8.110). Let $P_k$ be a polytope outer approximating $G_k \supset G$ and let $(x^k, t_k)$ be an optimal solution of the relaxed problem

$$\min\{t - \|x\|^2 \mid (x, t) \in P_k\}.$$

If $t_k - \|x^k\|^2 \ge 0$ then the optimal value $\mu^*$ of (8.110) satisfies $\mu^* \ge t_k - \|x^k\|^2 \ge 0$ and there are two possibilities:

(1) if $\alpha_k = +\infty$, then (COP) is infeasible (otherwise, one would have $\mu^* < 0$, conflicting with $0 \le t_k - \|x^k\|^2 \le \mu^*$);
(2) if $\alpha_k < +\infty$, then $\bar x^k$ is an optimal solution of (COP) (because $0 \le t_k - \|x^k\|^2 \le \mu^* \le 0$, hence $\mu^* = 0$).

So the only case that needs further investigation is when $t_k - \|x^k\|^2 < 0$. Define

$$\alpha_{k+1} = \begin{cases} \min\{\alpha_k,\ f(x^k)\} & \text{if } x^k \in S, \\ \alpha_k & \text{otherwise.} \end{cases} \tag{8.111}$$

Lemma 8.4 The affine function

$$l_k(x) := 2\langle x^k, x\rangle - \|x^k\|^2 + r^2(\alpha_{k+1}, x^k)$$

strictly separates $(x^k, t_k)$ from $G_{k+1}$, i.e., satisfies

$$l_k(x^k) > t_k; \qquad l_k(x) - t \le 0 \quad \forall (x, t) \in G_{k+1}.$$

Proof Since $t_k < \|x^k\|^2$ we have $l_k(x^k) - t_k > l_k(x^k) - \|x^k\|^2 = r^2(\alpha_{k+1}, x^k) \ge 0$. On the other hand, $(x, t) \in G_{k+1}$ implies $h(\alpha_{k+1}, x) - t \le 0$, but since $x^k \notin S_{\alpha_{k+1}}$ it follows from (8.104) that $l_k(x) \le h(\alpha_{k+1}, x)$, and hence $l_k(x) - t \le h(\alpha_{k+1}, x) - t \le 0$. □

In view of this Lemma, if we define $P_{k+1} = P_k \cap \{(x, t) \mid l_k(x) \le t\}$ then $(x^k, t_k) \notin P_{k+1}$ while $P_{k+1} \supset G_{k+1} \supset G$, so the sequence $P_1 \supset P_2 \supset \dots \supset G$ realizes an OA scheme for solving (8.110). We are thus led to the following:
Relief Indicator OA Algorithm for (COP)

Initialization. Take a simplex or rectangle $P_1 \subset \mathbb{R}^n$ known to contain an optimal solution of the problem. Let $x^1$ be a point in $P_1$ and let $\alpha_1 = f(x^1)$ if $x^1 \in S$, or $\alpha_1 = +\infty$ otherwise. Set $k = 1$.

Step 1. If $x^k \in S$, then improve $x^k$ by local search methods (provided this can be done at reasonable cost). Denote by $\bar x^k$ the best feasible solution so far obtained. If $\bar x^k$ exists, let $\alpha_{k+1} = f(\bar x^k)$, otherwise let $\alpha_{k+1} = +\infty$. Define

$$l_k(x) = 2\langle x^k, x\rangle - \|x^k\|^2 + r^2(\alpha_{k+1}, x^k).$$

Step 2. Solve

$$(\mathrm{SP}_k) \qquad \min\{t - \|x\|^2 \mid x \in P_1,\ l_i(x) \le t,\ i = 1,\dots,k\}$$

to obtain $(x^{k+1}, t_{k+1})$.

Step 3. If $t_{k+1} - \|x^{k+1}\|^2 > 0$, then terminate: $S = \emptyset$ ((COP) is infeasible). If $t_{k+1} - \|x^{k+1}\|^2 = 0$, then terminate ($S = \emptyset$ if $\alpha_{k+1} = +\infty$; $\bar x^k$ solves (COP) if $\alpha_{k+1} < +\infty$).
Step 4. If $t_{k+1} - \|x^{k+1}\|^2 < 0$, then set $k \leftarrow k+1$ and go back to Step 1.
Lemma 8.5 After finitely many steps the above algorithm either gives evidence that (COP) is infeasible or generates a feasible solution, if there is one.

Proof Let $\tilde x \in \mathrm{int}\,S$ (see assumption (8.99)), so $\varphi(\alpha, \tilde x) < 0$ for $\alpha = +\infty$. If $x^k \notin S$ $\forall k$, then $\alpha_k = +\infty$, $t_k < \|x^k\|^2$ $\forall k$, and

$$t_k - \|x^k\|^2 \le \min_{x \in P_1} \varphi(\alpha_{k-1}, x) \le \varphi(\alpha_{k-1}, \tilde x) \quad \forall k,$$

while for all $i < k$:

$$t_k - \|x^k\|^2 \ge l_i(x^k) - \|x^k\|^2 \ge -\|x^i - x^k\|^2.$$

Thus, $\|x^i - x^k\|^2 \ge -\varphi(\alpha_k, \tilde x) > 0$ $\forall i < k$, conflicting with the boundedness of $\{x^k\} \subset P_1$. Therefore, $x^k \in S$ for all sufficiently large $k$. In particular, any accumulation point of the sequence $\{x^k\}$ is feasible. □

Proposition 8.7 Whenever infinite, the Relief Indicator OA Algorithm generates an infinite sequence $\{x^k\}$ every accumulation point of which is an optimal solution of (COP).

Proof Let $\bar x = \lim_{s \to +\infty} x^{k_s}$ (as we just saw, $\bar x \in S$). Define

$$\tilde l_k(x) = l_k(x) - \|x\|^2.$$

For any $q > s$ we have $\tilde l_{k_s}(x^{k_q}) = l_{k_s}(x^{k_q}) - \|x^{k_q}\|^2 \le t_{k_q} - \|x^{k_q}\|^2 < 0$, hence, by making $q \to +\infty$: $\tilde l_{k_s}(\bar x) \le 0$. But clearly, $\tilde l_{k_s}(x^{k_s}) = \tilde l_{k_s}(\bar x) + \|x^{k_s} - \bar x\|^2 \le \|x^{k_s} - \bar x\|^2 \to 0$ as $s \to +\infty$, while $\tilde l_k(x^k) = r^2(\alpha_{k+1}, x^k) \ge 0$. Therefore, $\tilde l_{k_s}(x^{k_s}) \to 0$ and also $t_{k_s} - \|x^{k_s}\|^2 \to 0$, as $s \to +\infty$. If $\bar t = \lim t_{k_s}$ then $\bar t - \|\bar x\|^2 = 0$. Since $t_k - \|x^k\|^2 = \min\{t - \|x\|^2 \mid x \in P_1,\ l_i(x) \le t,\ i = 1,2,\dots,k-1\} \le \min\{t - \|x\|^2 \mid x \in P_1,\ h(\alpha_k, x) \le t\}$ $\forall k$, it follows that $\bar t - \|\bar x\|^2 = \min\{t - \|x\|^2 \mid x \in P_1,\ h(\bar\alpha, x) \le t\}$ for $\bar\alpha = \lim \alpha_k$. That is, $\varphi(\bar\alpha, \bar x) = 0$, and hence $\bar x$ solves (COP). □

Remark 8.11 Each subproblem (SP$_k$) is a linearly constrained concave minimization problem which can be solved by vertex enumeration as in simple outer approximation algorithms. Since (SP$_{k+1}$) differs from (SP$_k$) by just one additional linear constraint, if $V_k$ denotes the vertex set of the constraint polytope of (SP$_k$) then $V_1$ can be considered known, while $V_{k+1}$ can be derived from $V_k$ using an on-line vertex enumeration subroutine (Sect. 6.2). If the algorithm is stopped when $t_k - \|x^k\|^2 \ge -\varepsilon^2$ then $\bar x^k$ yields an $\varepsilon$-approximate optimal solution.

Example 8.5 Solve the problem

$$\text{Minimize } f(x) = x_1^2 + x_2^2 - \cos 18x_1 - \cos 18x_2$$
$$\text{subject to } g(x) = -\big[ (x_1 - 0.5)^2 + (x_2 - 0.415331)^2 \big]^{1/2} + 0.65 \le 0,$$
$$0 \le x_1 \le 1, \quad 0 \le x_2 \le 1.$$

The functions $f$ and $g$ are Lipschitzian on $S = [0,1] \times [0,1]$, with Lipschitz constants 28.3 and 1, respectively. Following Example 8.3 a separator is

$$r(\alpha, x) = \max\Big\{ 0,\ \frac{f(x) - \alpha}{28.3},\ g(x) \Big\}.$$

With tolerance $\varepsilon = 0.01$ the algorithm terminates after 16 iterations, yielding an approximate optimal solution $x = (0, 1)$ with objective function value $-0.6603168$.
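For reference, this instance of the separator is just a direct instantiation of Example 8.3. The sketch below (not from the book) encodes it and checks the reported solution: $g \le 0$ at $(0, 1)$ and the separator vanishes at the incumbent level.

```python
# Concrete instantiation (sketch, not from the book) of the separator used
# in Example 8.5; the Lipschitz constants 28.3 and 1 are those quoted above.
import numpy as np

f = lambda x: x[0]**2 + x[1]**2 - np.cos(18*x[0]) - np.cos(18*x[1])
g = lambda x: 0.65 - np.hypot(x[0] - 0.5, x[1] - 0.415331)

def r(alpha, x):
    return max(0.0, (f(x) - alpha) / 28.3, g(x))

x = np.array([0.0, 1.0])
print(f(x), g(x), r(f(x), x))   # f ~ -0.6603, g < 0 (feasible), r = 0
```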

8.6.2 Combined OA/BB Algorithm

An alternative method for solving (8.110) is to combine outer approximation with branch and bound. This gives rise to a relief indicator OA/BB algorithm (Tuy 1992b, 1995a) for problems of the following form:

$$(\mathrm{CO/SNP}) \qquad \min\{f(x) \mid G(x, y) \le 0,\ g_i(x) \le 0,\ i = 1,\dots,m\},$$

where $f, g_1,\dots,g_m$ are as previously, while $G: \mathbb{R}^n \times \mathbb{R}^q \to \mathbb{R}$ is a convex function. To take advantage of the convexity of $G$ in this problem it is convenient to treat the constraint $G(x, y) \le 0$ separately. Assuming robustness of the constraint set of (CO/SNP), and the availability of a separator $r(\alpha, x)$ for $(f, S)$ (with $S$ being the set determined by the system $g_i(x) \le 0$, $i = 1,\dots,m$), we can transform (CO/SNP) into the concave minimization problem

$$\min\{t - \|x\|^2 \mid G(x, y) \le 0,\ h(\bar\alpha, x) \le t\}, \tag{8.112}$$

where $h(\bar\alpha, x)$ is defined as in (8.104) and $\bar\alpha$ is the (unknown) optimal value of $f(x)$. Since for every fixed $x$ this is a convex problem, the most convenient way is to branch upon $x$ and compute bounds by using relaxed subproblems of the form

$$\mathrm{LP}_k(M) \qquad \min\{t - \psi_M(x) \mid x \in M,\ l_i(x) \le t \ (i \in I_k),\ L_j(x, y) \le 0 \ (j \in J_k)\},$$

where $M$ is the partition set, $\psi_M(x)$ is the affine function that agrees with $\|x\|^2$ at the vertices of $M$, $\{(x, t) \mid l_i(x) \le t \ (i \in I_k)\}$ is the outer approximating polytope of the convex set $C := \{(x, t) \mid h(\bar\alpha, x) \le t\}$, and $\{(x, y) \mid L_j(x, y) \le 0 \ (j \in J_k)\}$ is the outer approximating polytope of the convex set $\Omega = \{(x, y) \mid G(x, y) \le 0\}$, at iteration $k$.

BB/OA Algorithm for (CO/SNP)

Initialization. Take an $n$-simplex $M_1$ in $\mathbb{R}^n$ known to contain an optimal solution of (CO/SNP). If a feasible solution is available then let $(\bar x^0, \bar y^0)$ be the best one, $\alpha_0 = f(\bar x^0)$; otherwise, let $\alpha_0 = +\infty$. Set $I_1 = J_1 = \emptyset$, $\mathcal{R} = \mathcal{N} = \{M_1\}$. Set $k = 1$.

Step 1. For each $M \in \mathcal{N}$ compute the optimal value $\beta(M)$ of the linear program $\mathrm{LP}_k(M)$.
Step 2. Update $(\bar x^k, \bar y^k)$ and $\alpha_k$. Delete every $M \in \mathcal{R}$ such that $\beta(M) > 0$. Let $\mathcal{R}'$ be the remaining collection of simplices. If $\mathcal{R}' = \emptyset$ then terminate: (CO/SNP) is infeasible.
Step 3. Select $M_k \in \mathrm{argmin}\{\beta(M) \mid M \in \mathcal{R}'\}$. Let $(x^k, y^k, t_k)$ be a basic optimal solution of $\mathrm{LP}_k(M_k)$. If $\beta(M_k) = 0$ then terminate: (CO/SNP) is infeasible (if $\alpha_k = +\infty$), or $(\bar x^k, \bar y^k)$ solves (CO/SNP) (if $\alpha_k < +\infty$).
Step 4. If $\|x^k\|^2 \le t_k$ and $G(x^k, y^k) \le 0$ then set $I_{k+1} = I_k$, $J_{k+1} = J_k$. Otherwise:

(a) if $\|x^k\|^2 - t_k > G(x^k, y^k)$ then set $J_{k+1} = J_k$,

$$I_{k+1} = I_k \cup \{k\}, \qquad l_k(x) = 2\langle x^k, x\rangle - \|x^k\|^2 + r^2(\alpha_k, x^k);$$

(b) if $\|x^k\|^2 - t_k \le G(x^k, y^k)$ then set $I_{k+1} = I_k$,

$$J_{k+1} = J_k \cup \{k\}, \qquad L_k(x, y) = \langle u^k, x - x^k\rangle + \langle v^k, y - y^k\rangle + G(x^k, y^k),$$

where $(u^k, v^k) \in \partial G(x^k, y^k)$.

Step 5. Perform a bisection or an $\omega$-subdivision (subdivision via $\omega^k = x^k$) of $M_k$, according to a normal subdivision rule. Let $\mathcal{N}^*$ be the partition of $M_k$. Set $\mathcal{N} \leftarrow \mathcal{N}^*$, $\mathcal{R} \leftarrow \mathcal{N}^* \cup (\mathcal{R}' \setminus \{M_k\})$, increase $k$ by 1 and return to Step 1.

The convergence of this algorithm follows from that of the general algorithm for (COP).

8.7 Exercises

1 Solve by polyhedral annexation:

min .x1  1/2  x22  .x3  1/2 s:t:


4x1  5x2 C 4x3  4
6x1  x2  x3  4:1
x1 C x2  x3  1
280 8 Special Methods

12x1 C 5x2 C 12x3  34:8


12x1 C 12x2 C 7x3  29:1
x1 ; x2 ; x3  0

(optimal solution x D .1; 0; 0//


2 Consider the problem:

minfcxj x 2 ; g.x/  0; h.x/  0; hg.x/; h.x/i D 0g

where g; h W Rn ! Rn are two convex maps and a convex subset of Rn (convex


program with an additional complementarity condition). Show that this problem
can be converted into a (GCP) (a map g W Rn ! Rm is said to be convex if its
components gi .x/; i D 1; : : : ; m are all convex functions)
3 Show that an ε-optimal solution of (LRC) can be computed according to the following scheme (Nghia and Hieu 1986):

Initialization. Solve the linear program

α₁ = min{ cx | x ∈ D }.

Let w be a basic optimal solution of this program. If h(w) ≥ 0, terminate: w solves (LRC). Otherwise, set γ₁ = +∞, k = 1.

Iteration k.
(1) If γ_k − α_k ≤ ε, terminate: x̄ is an ε-optimal solution.
(2) Solve

μ_k = max{ h(z) | z ∈ D, cz ≤ (α_k + γ_k)/2 }

to obtain z^k.
(3) If μ_k < 0 and γ_k = +∞, terminate: the problem is infeasible.
(4) If μ_k < 0 and γ_k < +∞, set γ_{k+1} = γ_k, α_{k+1} = (α_k + γ_k)/2, and go to iteration k + 1.
(5) If μ_k ≥ 0, then starting from z^k perform a number of simplex pivots to move to a point x^k such that cx^k ≤ cz^k and x^k is on the intersection of an edge of D with the surface h(x) = 0 (see Phase 1 of the FW/BW Procedure). Set x̄ = x^k, γ_{k+1} = cx^k, α_{k+1} = α_k, and go to iteration k + 1.
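A minimal numerical rendering of this scheme on a hypothetical one-dimensional instance (all data below are invented: D = [0, 4], c = 1, h(x) = x² − 1, so the reverse convex constraint is h(x) ≥ 0 and the optimal value is 1); the convex maximization of step (2) reduces to comparing endpoint values, and the edge search of step (5) degenerates to one-dimensional root finding:

```python
def h(x):
    return x * x - 1.0

c = 1.0
lo, hi = 0.0, 4.0          # D = [lo, hi]
eps = 1e-6

# Initialization: alpha_1 = min{c x | x in D}, attained at w = lo.
alpha, w = c * lo, lo
if h(w) >= 0:
    print("w solves (LRC):", w)
else:
    gamma, x_bar = float("inf"), None
    while gamma - alpha > eps:
        m = (alpha + gamma) / 2 if gamma < float("inf") else float("inf")
        # mu_k = max{h(z) | z in D, c z <= m}; a convex function attains its
        # maximum over an interval at an endpoint.
        z = max([lo, min(hi, m / c)], key=h)
        mu = h(z)
        if mu < 0 and gamma == float("inf"):
            print("problem infeasible")
            break
        if mu < 0:
            alpha = m                      # no feasible point with c x <= m
        else:
            # Move from z toward w onto the surface h(x) = 0 (the 1-D analogue
            # of Phase 1 of the FW/BW Procedure): bisection for the root.
            a, b = w, z                    # h(a) < 0 <= h(b)
            for _ in range(100):
                mid = (a + b) / 2
                if h(mid) < 0:
                    a = mid
                else:
                    b = mid
            x_bar = b
            gamma = c * x_bar              # improved upper bound
    else:
        print("eps-optimal solution:", x_bar, "value:", gamma)
```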
4 Solve min{ y³ + 4.5y² − 6y | 0 ≤ y ≤ 3 }.

5 Solve

min 4x₁⁴ + 2x₂² − 4x₁²   s.t.
x₁² − 2x₁ − 2x₂ − 1 ≤ 0
−1 ≤ x₁ ≤ 1,  −1 ≤ x₂ ≤ 1.

(optimal solution x = (0.707, 0.000))

6 Solve

min{ −2x₁ + x₂ | x₁² + x₂² ≤ 1, x₁ − x₂² ≤ 0 }.

(optimal solution x = (0.618034, −0.786151))


7 Solve

min{ x₁² + x₂² + x₁x₂ + x₁ + x₂ | x₁² + x₂² ≥ 0.5, x₁, x₂ ≥ 0 }.

8 Solve

min 2x₁² + x₁x₂ − 10x₁ + 2x₂² − 20   s.t.
x₁ + x₂ ≤ 11,  1.5x₁ + x₂ ≤ 11.4,  x₁, x₂ ≥ 0.

9 Solve

min{ (1/2)x₁ + (2/3)x₂ + (1/2)x₂² − x₁² | 2x₁ − x₂ ≤ 3, x₁, x₂ ≥ 0 }.

10 Solve the previous problem by reduction to quasiconcave minimization via dualization (see Sect. 8.3).
11 Let C and D be two compact convex sets in R^n containing 0 in their interiors. Consider the pair of dual problems

(P)   max_{x,r} { r | x + rC ⊆ D }
(Q)   min_{x,r} { r | C ⊆ x + rD }.

Show that: 1) (x, r) solves (P) if and only if (x/r, 1/r) solves (Q); 2) when D is a polytope, problem (P) merely reduces to a linear program; 3) when C is a polytope, problem (Q) reduces to minimizing a dc function.
12 Solve min{ e^{−x²} sin x − |y| | 0 ≤ x ≤ 10, 0 ≤ y ≤ 10 }. (Hint: Lipschitz constant λ = 2.)
Chapter 9
Parametric Decomposition

A partly convex problem as formulated in Sect. 8.2 of the preceding chapter can also be viewed as a convex optimization problem depending upon a parameter x ∈ R^n. The dimension n of the parameter space is referred to as the nonconvexity rank of the problem. Roughly speaking, this is the number of nonconvex variables in the problem. As we saw in Sect. 8.2, a partly convex problem of rank n can be efficiently solved by a BB decomposition algorithm with branching performed in R^n. This chapter discusses decomposition methods for partly convex problems with small nonconvexity rank n. It turns out that these problems can be solved by streamlined decomposition methods based on parametric programming.

9.1 Functions Monotonic w.r.t. a Convex Cone

Let X be a convex set in R^n and K a closed convex cone in R^n. A function f: R^n → R is said to be monotonic on X with respect to K (or K-monotonic on X for short) if

x, x′ ∈ X, x′ − x ∈ K  ⇒  f(x′) ≥ f(x).   (9.1)

In other words, for every x ∈ X and u ∈ K the univariate function t ↦ f(x + tu) is nondecreasing in the interval (0, τ), where τ = sup{t | x + tu ∈ X}. When X = R^n we simply say that f(x) is K-monotonic.

Let L be the lineality space of K, and Q an r × n matrix of rank r such that L = {x | Qx = 0}. From (9.1) it follows that

x, x′ ∈ X, Qx = Qx′  ⇒  f(x) = f(x′),   (9.2)

i.e., f(x) is constant on every set {x ∈ X | Qx = const}. Denote Q(X) = {y ∈ R^r | y = Qx, x ∈ X}.


Proposition 9.1 Let K ⊆ rec X. A function f(x) is K-monotonic on X if and only if

f(x) = F(Qx)  ∀x ∈ X,   (9.3)

where F: Q(X) → R satisfies

y, y′ ∈ Q(X), y′ − y ∈ Q(K)  ⇒  F(y′) ≥ F(y).   (9.4)

Proof If f(x) is K-monotonic on X, then we can define a function F: Q(X) → R by setting, for every y ∈ Q(X):

F(y) = f(x) for any x ∈ X such that y = Qx.   (9.5)

This definition is correct in view of (9.2). Obviously, f(x) = F(Qx). For any y, y′ ∈ Q(X) such that y′ − y ∈ Q(K), we have y = Qx, y′ = Qx′ with Q(x′ − x) = Qu, u ∈ K, i.e., Qx′ = Q(x + u), u ∈ K. Here x + u ∈ X because u ∈ rec X; hence by (9.1), f(x′) = f(x + u) ≥ f(x), i.e., F(y′) ≥ F(y). Conversely, if f(x) = F(Qx) and F: Q(X) → R satisfies (9.4), then for any x, x′ ∈ X such that x′ − x ∈ K we have y = Qx, y′ = Qx′ with y′ − y ∈ Q(K); hence F(y′) ≥ F(y), i.e., f(x′) ≥ f(x). □
The above representation of K-monotonic functions explains their role in global optimization. Indeed, using this representation, any optimization problem in R^n of the form

min{ f(x) | x ∈ D }   (9.6)

where D ⊆ X and f(x) is K-monotonic on X, can be rewritten as

min{ F(y) | y ∈ Q(D) }   (9.7)

which is a problem in R^r. Assume further that f(x) is quasiconcave on X, i.e., by (9.5), F(y) is quasiconcave on Q(X).

Proposition 9.2 Let K ⊆ rec X. A quasiconcave function f: X → R is K-monotonic on X if and only if for every x⁰ ∈ X with f(x⁰) = γ:

x⁰ + K ⊆ X(γ) := {x ∈ X | f(x) ≥ γ}.

Proof If f(x) is K-monotonic on X, then for any x ∈ X(γ) and u ∈ K we have x + u ∈ X (because K ⊆ rec X), and condition (9.1) implies that f(x + u) ≥ f(x) ≥ γ; hence x + u ∈ X(γ), i.e., u is a recession direction for X(γ). Conversely, if for any γ ∈ f(X), K is contained in the recession cone of X(γ), then for any x′, x ∈ X such that x′ − x ∈ K one has, for γ = f(x): x ∈ X(γ), hence x′ ∈ X(γ), i.e., f(x′) ≥ f(x). □

Proposition 9.1 provides the foundation for reducing the dimension of problem (9.6), whereas Proposition 9.2 suggests methods for restricting the global search domain. Indeed, since local information is generally insufficient to verify the global optimality of a solution, the search for a global optimal solution must be carried out, in principle, over the entire feasible set. If, however, the objective function f(x) in (9.6) is K-monotonic, then, once a solution x⁰ ∈ D is known, one can ignore the whole set D ∩ (x⁰ + K) ⊆ {x ∈ D | f(x) ≥ f(x⁰)}, because no better feasible solution than x⁰ can be found in this set. Such kind of information is often very helpful and may drastically simplify the problem by limiting the global search to a restricted region of the feasible domain. In the next sections we shall discuss how efficient algorithms for K-monotonic problems can be developed based on these observations.

Below are some important classes of quasiconcave monotonic functions encountered in the applications.
Example 9.1 f(x) = φ(x₁, …, x_p) + d^T x, with φ(y₁, …, y_p) a continuous concave function of y = (y₁, …, y_p).
Here F(y) = φ(y₁, …, y_p) + y_{p+1}, Qx = (x₁, …, x_p, d^T x)^T. Monotonicity holds with respect to the cone K = {u | u_i = 0 (i = 1, …, p), d^T u ≥ 0}. It has long been recognized that concave minimization problems with objective functions of this form can be efficiently solved only by taking advantage of the presence of the linear part in the objective function (Rosen 1983; Tuy 1984).

Example 9.2 f(x) = −Σ_{i=1}^p ⟨c^i, x⟩² + ⟨c^{p+1}, x⟩.
Here F(y) = −Σ_{i=1}^p y_i² + y_{p+1}, Qx = (⟨c¹, x⟩, …, ⟨c^{p+1}, x⟩)^T. Monotonicity holds with respect to K = {u | ⟨c^i, u⟩ = 0 (i = 1, …, p), ⟨c^{p+1}, u⟩ ≥ 0}. For p = 1 such a function f(x) cannot in general be written as a product of two affine functions, and finding its minimum over a polytope is a problem known to be NP-hard (Pardalos and Vavasis 1991); it is, however, quite practically solvable by a parametric algorithm (see Sect. 9.5).

Example 9.3 f(x) = Π_{i=1}^r (⟨c^i, x⟩ + d_i)^{α_i} with α_i > 0, i = 1, …, r.
This class includes functions which are products of affine functions. Since log f(x) = Σ_{i=1}^r α_i log(⟨c^i, x⟩ + d_i) for every x such that ⟨c^i, x⟩ + d_i > 0, it is easily seen that f(x) is quasiconcave on the set X = {x | ⟨c^i, x⟩ + d_i ≥ 0, i = 1, …, r}. Furthermore, f(x) is monotonic on X with respect to the cone K = {u | ⟨c^i, u⟩ ≥ 0, i = 1, …, r}. Here f(x) has the form (9.3) with F(y) = Π_{i=1}^r (y_i + d_i)^{α_i}. Problems of minimizing functions f(x) of this form (with α_i = 1, i = 1, …, r) under linear constraints are termed linear multiplicative programs and will be discussed later in this chapter (Sect. 9.5).

Example 9.4 f(x) = −Σ_{i=1}^r ρ_i e^{⟨c^i, x⟩} with ρ_i > 0, i = 1, …, r.
Here F(y) = −Σ_{i=1}^r ρ_i e^{y_i} is concave in y, so f(x) is concave in x. Monotonicity holds with respect to the cone K = {x | ⟨c^i, x⟩ ≤ 0, i = 1, …, r}. Functions of this type appear when dealing with geometric programs with negative coefficients (signomial programs, cf. Example 5.6).

Example 9.5 f(x) quasiconcave and differentiable, with

∇f(x) = Σ_{i=1}^r p_i(x) c^i,  c^i ∈ R^n, i = 1, …, r.

For every x, x′ ∈ X satisfying ⟨c^i, x′ − x⟩ = 0, i = 1, …, r, there is some θ ∈ (0, 1) such that f(x′) − f(x) = ⟨∇f(x + θ(x′ − x)), x′ − x⟩ = Σ_{i=1}^r p_i(x + θ(x′ − x))⟨c^i, x′ − x⟩ = 0; hence f(x) is monotonic with respect to the space L = {x | ⟨c^i, x⟩ = 0, i = 1, …, r}. If, in addition, p_i(x) ≥ 0 ∀x (i = 1, …, r), then monotonicity holds with respect to the cone K = {u | ⟨c^i, u⟩ ≥ 0, i = 1, …, r}.

Example 9.6 f(x) = min{y^T Qx | y ∈ E}, where E is a convex subset of R^m and Q is an m × n matrix of rank r.
This function appears when transforming rank r bilinear programs into concave minimization problems (see Sect. 10.4). Here monotonicity holds with respect to the subspace K = {u | Qu = 0}. If E ⊆ R^m_+, then Qu ≥ 0 implies y^T Q(x + u) ≥ y^T Qx, hence f(x + u) ≥ f(x), and so f(x) is monotonic with respect to the cone K = {u | Qu ≥ 0}.

Example 9.7 f(x) = sup{d^T y | Ax + By ≤ q}, where A ∈ R^{p×n} with rank A = r, B ∈ R^{p×m}, y ∈ R^m and q ∈ R^p.
This function appears in bilevel linear programming, for instance in the max-min problem (Falk 1973b):

min_x max_y { c^T x + d^T y | Ax + By ≤ q, x ≥ 0, y ≥ 0 }.

Obviously, Au ≤ 0 implies A(x + u) ≤ Ax, hence {y | Ax + By ≤ q} ⊆ {y | A(x + u) + By ≤ q}, hence f(x + u) ≥ f(x). Therefore monotonicity holds with respect to the cone K = {u | Au ≤ 0}.

Note that by Corollary 1.15, rank Q = n − dim L = dim L^⊥. A K-monotonic function f(x) on X with rank Q = r is also said to possess the rank r monotonicity property.
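As a quick numerical sanity check of Proposition 9.1, the sketch below verifies representation (9.3) and condition (9.1) on a hypothetical instance of Example 9.1 (all data invented): f(x) = −x₁² + x₂ + x₃, i.e., φ(y₁) = −y₁² and d = (0, 1, 1), with K = {u | u₁ = 0, d^T u ≥ 0}.

```python
import numpy as np

rng = np.random.default_rng(0)
d = np.array([0.0, 1.0, 1.0])
Q = np.array([[1.0, 0.0, 0.0], d])        # rows c^1 = e1 and c^2 = d

def f(x):
    return -x[0] ** 2 + d @ x

def F(y):                                  # F(y1, y2) = phi(y1) + y2
    return -y[0] ** 2 + y[1]

for _ in range(1000):
    x = rng.normal(size=3)
    # draw u in K: first coordinate 0, d^T u >= 0
    u = np.array([0.0, *rng.normal(size=2)])
    if d @ u < 0:
        u = -u
    assert np.isclose(f(x), F(Q @ x))      # representation (9.3)
    assert f(x + u) >= f(x) - 1e-12        # K-monotonicity (9.1)
print("representation and monotonicity verified on random samples")
```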

9.2 Decomposition by Projection

Let D be a compact subset of a closed convex set X in R^n and let f: R^n → R be a quasiconcave function, monotonic on X with respect to a polyhedral convex cone K contained in the recession cone of X. Assume that the lineality space of K is L = {x | Qx = 0}, where Q is an r × n matrix of rank r. Consider the quasiconcave monotonic problem

(QCM)   min{ f(x) | x ∈ D }.   (9.8)

In this and the next two sections, we shall discuss decomposition methods for solving this problem, under the additional assumption that D is a polytope. The idea of decomposition is to transform the original problem into a sequence of subproblems, each involving a reduced number of nonconvex variables. This can be done either by projection, or by dualization (polyhedral annexation), or by parametrization (for r ≤ 3). We first present decomposition methods by projection.

As has been previously observed, by setting

F(y) = f(x) for any x ∈ X satisfying y = Qx,   (9.9)

we unambiguously define a quasiconcave function F: Q(X) → R such that problem (QCM) is equivalent to

min F(y)  s.t.  y = (y₁, …, y_r) ∈ G   (9.10)

where G = Q(D) is a polytope in R^r. The algorithms presented in Chaps. 5 and 6 can of course be adapted to solve this quasiconcave minimization problem.

We first show how to compute F(y). Since rank Q = r, we may write Q = (Q_B, Q_N) and x = (x_B, x_N), where Q_B is an r × r nonsingular matrix; the equation Qx = y then yields

x_B = Q_B^{-1}y − Q_B^{-1}Q_N x_N.

Setting Z = (Q_B^{-1}, 0)^T (an n × r matrix) and u = (−Q_B^{-1}Q_N x_N, x_N), we obtain

x = Zy + u  with  Qu = −Q_N x_N + Q_N x_N = 0.   (9.11)

Since Qx = Q(Zy) with Zy = x − u ∈ X − L = X, it follows that f(x) = f(Zy). Thus, the objective function of (9.10) can be computed by the formula

F(y) = f(Zy).   (9.12)

We shall assume that f(x) is continuous on some open set in R^n containing D, so F(y) is continuous on some open set in R^r containing the constraint polytope G = Q(D) of (9.10). The peculiar form of the latter set (the image of a polytope D under the linear map Q) is a feature which requires special treatment, but does not cause any particular difficulty.
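In computational terms, (9.11)-(9.12) amount to the following small sketch (the matrix Q and the function f are illustrative, and the first r columns of Q are assumed to form a basis):

```python
import numpy as np

Q = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])       # r = 2, n = 4, rank 2
r, n = Q.shape
QB = Q[:, :r]                              # assume first r columns form a basis
Z = np.vstack([np.linalg.inv(QB), np.zeros((n - r, r))])
assert np.allclose(Q @ Z, np.eye(r))       # Q(Zy) = y for every y

def f(x):                                  # any K-monotonic f; here a product
    return (Q[0] @ x) * (Q[1] @ x)         # of the two linear forms

def F(y):                                  # formula (9.12)
    return f(Z @ y)

y = np.array([2.0, 5.0])
print(F(y))                                # equals y1 * y2 = 10 here
```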

9.2.1 Outer Approximation

To solve (9.10) by outer approximation, the key point is: given a point ŷ ∈ Q(R^n) = R^r, determine whether ŷ ∈ G = Q(D), and if ŷ ∉ G, construct a linear inequality (cut) l(y) ≤ 0 to exclude ŷ without excluding any point of G.

Observe that ŷ ∉ Q(D) if and only if the linear system Qx = ŷ has no solution in D, so the existence of l(y) is ensured by the separation theorem or any of its equivalent forms (such as the Farkas–Minkowski Lemma). However, since we are not so much interested in the existence as in the effective construction of l(y), the best way is to use the duality theorem of linear programming (which is another form of the separation theorem). Specifically, assuming D = {x | Ax ≤ b, x ≥ 0}, with A ∈ R^{m×n}, b ∈ R^m, consider the dual pair of linear programs

min{ ⟨h, x⟩ | Qx = ŷ, Ax ≤ b, x ≥ 0 }   (9.13)
max{ ⟨ŷ, v⟩ − ⟨b, w⟩ | Q^T v − A^T w ≤ h, w ≥ 0 }   (9.14)

where h is any vector chosen so that (9.14) is feasible (for example, h = 0) and can be solved with least effort.

Proposition 9.3 Solving (9.14) yields either a finite optimal solution or an extreme direction (v̄, w̄) of the cone Q^T v − A^T w ≤ 0, w ≥ 0, such that ⟨ŷ, v̄⟩ − ⟨b, w̄⟩ > 0. In the former case, ŷ ∈ G = Q(D); in the latter case, the affine function

l(y) = ⟨y, v̄⟩ − ⟨b, w̄⟩   (9.15)

satisfies

l(ŷ) > 0,  l(y) ≤ 0  ∀y ∈ G.   (9.16)

Proof Immediate from the duality theorem of linear programming. □
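A hedged computational stand-in for this oracle, assuming SciPy's HiGHS-based linprog: instead of solving the pair (9.13)-(9.14) directly, it solves the phase-one program V(ŷ) = min{‖s‖₁ : Qx + s = ŷ, Ax ≤ b, x ≥ 0}, whose value vanishes exactly when ŷ ∈ Q(D). When V(ŷ) > 0, the equality-constraint marginals reported by the solver form a subgradient v of the convex value function V, so ⟨v, y⟩ ≤ ⟨v, ŷ⟩ − V(ŷ) is a cut of the kind guaranteed by Proposition 9.3 (all data below are illustrative).

```python
import numpy as np
from scipy.optimize import linprog

def membership_or_cut(Q, A, b, y):
    r, n = Q.shape
    # variables: x (n), s_plus (r), s_minus (r), all nonnegative
    c = np.concatenate([np.zeros(n), np.ones(2 * r)])
    A_eq = np.hstack([Q, np.eye(r), -np.eye(r)])      # Qx + s+ - s- = y
    A_ub = np.hstack([A, np.zeros((A.shape[0], 2 * r))])
    res = linprog(c, A_ub=A_ub, b_ub=b, A_eq=A_eq, b_eq=y, method="highs")
    if res.fun <= 1e-9:
        return ("member", None)
    v = res.eqlin.marginals            # subgradient of V at y
    beta = v @ y - res.fun             # <v, y'> <= beta on Q(D), > beta at y
    return ("cut", (v, beta))

# D = {x >= 0 : x1 + x2 <= 1} in R^2, Q = identity, so Q(D) = D.
Q, A, b = np.eye(2), np.array([[1.0, 1.0]]), np.array([1.0])
print(membership_or_cut(Q, A, b, np.array([0.2, 0.3])))   # member
print(membership_or_cut(Q, A, b, np.array([0.9, 0.9])))   # separating cut
```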
On the basis of this proposition, a finite OA procedure for solving (9.10) can be carried out in the standard way, starting from an initial r-simplex P₁ ⊇ G = Q(D) with a known or readily computable vertex set. In most cases one can take this initial simplex to be

P₁ = { y | y_i ≥ α_i (i = 1, …, r), Σ_{i=1}^r (y_i − α_i) ≤ α }   (9.17)

where

α_i = min{ y_i | y = Qx, x ∈ D },  i = 1, …, r,   (9.18)
α = max{ Σ_{i=1}^r (y_i − α_i) | y = Qx, x ∈ D }.   (9.19)

Remark 9.1 A typical case of interest is the concave program with few nonconvex variables (Example 7.1):

min{ φ(y) + dz | Ay + Bz ≤ b, y ≥ 0, z ≥ 0 }   (9.20)

where the function φ: R^p → R is concave and (y, z) ∈ R^p × R^q. By rewriting this problem in the form (9.10):

min{ φ(y) + t | dz ≤ t, Ay + Bz ≤ b, y ≥ 0, z ≥ 0 }

we see that for checking the feasibility of a vector (y, t) ∈ R^p_+ × R, the best choice of h in (9.13) is h = d. This leads to the pair of dual linear programs

min{ dz | Bz ≤ b − Ay, z ≥ 0 }   (9.21)
max{ ⟨Ay − b, v⟩ | −B^T v ≤ d, v ≥ 0 }.   (9.22)

We leave it to the reader to verify that:

1. If v̄ is an optimal solution to (9.22) and ⟨Ay − b, v̄⟩ ≤ t, then (y, t) is feasible;
2. If v̄ is an optimal solution to (9.22) and ⟨Ay − b, v̄⟩ > t, then the cut ⟨Ay − b, v̄⟩ ≤ t excludes (y, t) without excluding any feasible point;
3. If (9.22) is unbounded, i.e., ⟨Ay − b, v̄⟩ > 0 for some extreme direction v̄ of the feasible set of (9.22), then the cut ⟨Ay − b, v̄⟩ ≤ 0 excludes (y, t) without excluding any feasible point.

Sometimes the matrix B has a favorable structure such that for each fixed y ≥ 0 the problem (9.20) is solvable by efficient specialized algorithms (see Sect. 9.8). Since the above decomposition preserves this structure in the auxiliary problems, the latter can be solved by these algorithms.

9.2.2 Branch and Bound

A general rule in the branch and bound approach to nonconvex optimization problems is to branch upon the nonconvex variables, except in rare cases where there are obvious reasons to do otherwise. Following this rule, the space to be partitioned for solving (QCM) is Q(R^n) = R^r. Given a partition set M ⊆ R^r, a basic operation is to compute a lower bound for F(y) over the feasible points in M. To this end, select an appropriate affine minorant l_M(y) of F(y) over M and solve the linear program

min{ l_M(y) | Qx = y, x ∈ D, y ∈ M }.

If β(M) is the optimal value of this linear program, then obviously

β(M) ≤ min{ F(y) | y ∈ Q(D) ∩ M }.

According to condition (6.4) in Sect. 6.2, this can serve as a valid lower bound provided, in addition, β(M) < +∞ whenever Q(D) ∩ M ≠ ∅, which is the case when M is bounded, e.g., when using simplicial or rectangular partitioning.

Simplicial Algorithm

If M = [u¹, …, u^{r+1}] is an r-simplex in Q(R^n), then l_M(y) can be taken to be the affine function that agrees with F(y) at u¹, …, u^{r+1}. Since l_M(y) = Σ_{i=1}^{r+1} t_i F(u^i) for y = Σ_{i=1}^{r+1} t_i u^i with t = (t₁, …, t_{r+1}) ≥ 0, Σ_{i=1}^{r+1} t_i = 1, the above linear program can also be written as

min{ Σ_{i=1}^{r+1} t_i F(u^i) | x ∈ D, Qx = Σ_{i=1}^{r+1} t_i u^i, Σ_{i=1}^{r+1} t_i = 1, t ≥ 0 }.
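For concreteness, the following sketch sets up and solves this bounding linear program with SciPy on illustrative data (D, Q, F and the simplex M are invented; the nonnegativity of x and t is supplied by the solver's default bounds):

```python
import numpy as np
from scipy.optimize import linprog

def simplex_lower_bound(F, Q, A, b, U):
    """Bound min F over Q(D) ∩ M, where M = conv(columns of U), F concave."""
    r, n = Q.shape
    k = U.shape[1]                               # k = r + 1 vertices
    c = np.concatenate([np.zeros(n), [F(U[:, j]) for j in range(k)]])
    A_eq = np.vstack([np.hstack([Q, -U]),        # Qx - Ut = 0
                      np.hstack([np.zeros(n), np.ones(k)])])  # sum t = 1
    b_eq = np.append(np.zeros(r), 1.0)
    A_ub = np.hstack([A, np.zeros((A.shape[0], k))])
    res = linprog(c, A_ub=A_ub, b_ub=b, A_eq=A_eq, b_eq=b_eq, method="highs")
    return res.fun if res.success else np.inf    # +inf: no feasible y in M

# toy data: D = {x >= 0 : x1 + x2 <= 1}, Q = I (so Q(D) = D), F concave
Q, A, b = np.eye(2), np.array([[1.0, 1.0]]), np.array([1.0])
U = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])                  # simplex with vertices 0, e1, e2
F = lambda y: -(y[0] ** 2) - (y[1] ** 2)
print(simplex_lower_bound(F, Q, A, b, U))        # -1.0, attained at e1 or e2
```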

Simplicial subdivision is advisable when f(x) = f₀(x) + dx, where f₀(x) is concave and monotonic with respect to the cone {x | Qx ≥ 0}. Although f(x) is then actually monotonic with respect to the cone {x | Qx ≥ 0, dx ≥ 0}, by writing (QCM) in the form

min{ F₀(y) + dx | Qx = y, x ∈ D }

one can see that it can be solved by a simplicial algorithm operating in Q(R^n) rather than Q(R^n) × R. As the initial simplex M₁ one can take the simplex (9.17).
Rectangular Algorithm

When F(y) is concave separable, i.e.,

F(y) = Σ_{i=1}^r F_i(y_i),

where each F_i(·), i = 1, …, r, is a concave univariate function, this structure can be best exploited by rectangular subdivision. In that case, for any rectangle M = [p, q] = {y | p ≤ y ≤ q}, l_M(y) can be taken to be the uniquely defined affine function which agrees with F(y) at the corners of M. This function is l_M(y) = Σ_{i=1}^r l_{i,M}(y_i), where each l_{i,M}(t) is the affine univariate function that agrees with F_i(t) at the endpoints p_i, q_i of the segment [p_i, q_i]. The initial rectangle can be taken to be M₁ = [α₁, β₁] × … × [α_r, β_r], where α_i is defined as in (9.18) and

β_i = sup{ y_i | y = Qx, x ∈ D }.

Conical Algorithm

In the general case, since the global minimum of F(y) is achieved on the boundary of Q(D), conical subdivision is most appropriate. Nevertheless, instead of bounding, we use a selection operation, so that the algorithm is a branch and select rather than a branch and bound procedure.

The key point in this algorithm is to decide, at each iteration, which cones in the current partition are nonpromising and which one is the most promising (selection rule).

Let ȳ be the current best solution (CBS) at a given iteration and γ := F(ȳ). Assume that all the cones are vertexed at 0 and that F(0) > γ, so that

0 ∈ int C,

where C = {y | F(y) ≥ γ}. Let l_M(y) = 1 be the equation of the hyperplane in R^r passing through the intersections of the edges of M with the surface F(y) = γ. Denote by ω(M) a basic optimal solution and by μ(M) the optimal value of the linear program

LP(M, γ)   max{ l_M(y) | y ∈ Q(D) ∩ M }.

Note that if u¹, …, u^r are the directions of the edges of M and U = (u¹, …, u^r) is the matrix with columns u¹, …, u^r, then M = {y | y = Ut, t = (t₁, …, t_r)^T ≥ 0}, so LP(M, γ) can also be written as

max{ l_M(Ut) | Ax ≤ b, Qx = Ut, x ≥ 0, t ≥ 0 }.

The deletion and selection rules are based on the following:

Proposition 9.4 If μ(M) ≤ 1, then M cannot contain any feasible point better than the CBS. If μ(M) > 1 and ω(M) lies on an edge of M, then ω(M) is a better feasible solution than the CBS.

Proof Clearly Q(D) ∩ M is contained in the simplex Z := M ∩ {y | l_M(y) ≤ μ(M)}. If μ(M) ≤ 1, then the vertices of Z other than 0 lie inside C; hence the values of F(y) at these vertices are at least equal to γ. But by quasiconcavity of F(y), its minimum over Z must be achieved at some vertex of Z, hence must be at least equal to γ. This means that F(y) ≥ γ for all y ∈ Z, and hence for all y ∈ Q(D) ∩ M. To prove the second assertion, observe that if ω(M) lies on an edge of M and μ(M) > 1, then ω(M) ∉ C, i.e., F(ω(M)) < γ; furthermore, ω(M) is obviously feasible. □

The algorithm is initialized from a partition of R^r into r + 1 cones vertexed at 0 (take, e.g., the partition induced by the r-simplex [e⁰, e¹, …, e^r], where e^i, for i = 1, …, r, is the i-th unit vector of R^r and e⁰ = −(1/r)(e¹ + … + e^r)). On the basis of Proposition 9.4, at iteration k a cone M is fathomed if μ(M) ≤ 1, while the most promising cone M_k (the candidate for further subdivision) is the unfathomed cone with largest μ(M). If ω^k = ω(M_k) lies on an edge of M_k, it yields, by Proposition 9.4, a better feasible solution than the CBS, and the algorithm can go to the next iteration with this improved CBS. Otherwise, M_k can be subdivided via the ray through ω^k.

As proved in Chap. 7 (Theorem 7.3), the above described conical algorithm converges, in the sense that any accumulation point of the sequence of CBS that it generates is a global optimal solution. Note that no bound is needed in this branch and select algorithm. Although a bound for F(y) over the feasible points in M can easily be derived from the constructions used in Proposition 9.4, it would require the computation of the values of F(y) at r additional points (the intersections of the edges of M with the hyperplane l_M(y) = μ(M)), which may entail a nonnegligible extra cost.

9.3 Decomposition by Polyhedral Annexation

The decomposition method by outer approximation is conceptually simple. However, a drawback of this method is its inability to be restarted. Furthermore, the construction of the initial polytope involves solving r + 1 linear programs. An alternative decomposition method, which allows restarts and usually performs better, is by dualization through polyhedral annexation.

Let x̄ be a feasible solution and C := {x ∈ X | f(x) ≥ f(x̄)}. By quasiconcavity and continuity of f(x), C is a closed convex set. By translating if necessary, assume that 0 is a feasible point such that f(0) > f(x̄), so 0 ∈ D ∩ int C. The PA method (Chap. 8, Sect. 8.1) uses as a subroutine Procedure DC* for solving the following subproblem:

Find a point x ∈ D \ C (i.e., a better feasible solution than x̄) or else prove that D ⊆ C (i.e., x̄ is a global optimal solution).

The essential idea of Procedure DC* is to build up, starting with a polyhedron P₁ ⊇ C° (Lemma 8.1), a nested sequence of polyhedrons P₁ ⊇ P₂ ⊇ … ⊇ C° that eventually yields either a polyhedron P_k ⊆ D° (in that case C° ⊆ D°, hence D ⊆ C) or a point x^k ∈ D \ C. In order to exploit the monotonicity structure, we now try to choose the initial polyhedron P₁ so that P₁ ⊆ L^⊥ (the orthogonal complement of L). If this can be achieved, then the whole procedure will operate in L^⊥, which allows much computational saving since dim L^⊥ = r < n.

Let c¹, …, c^r be the rows of Q (so that L^⊥ is the subspace generated by these vectors) and let c⁰ = −Σ_{i=1}^r c^i. Let φ: L^⊥ → R^r be the linear map such that y = Σ_{i=1}^r t_i c^i ⟺ φ(y) = t. For each i = 0, 1, …, r, since 0 ∈ int C, the halfline from 0 through c^i intersects C. If this halfline meets ∂C, we define θ_i to be the positive number such that z^i = θ_i c^i ∈ ∂C; otherwise we set θ_i = +∞. Let I = {i | θ_i < +∞} and S₁ = conv{0; z^i, i ∈ I} + cone{c^i, i ∉ I}.

Lemma 9.1
(i) The polar P₁ of S₁ + L is an r-simplex in L^⊥ given by

φ(P₁) = { t ∈ R^r | Σ_{i=1}^r t_i⟨c^i, c^j⟩ ≤ 1/θ_j, j = 0, 1, …, r }   (9.23)

(as usual, 1/(+∞) = 0).
(ii) C° ⊆ P₁, and if V is the vertex set of any polytope P such that C° ⊆ P ⊆ P₁, then

max{⟨v, x⟩ | x ∈ D} ≤ 1 ∀v ∈ V  ⇒  C° ⊆ D°.   (9.24)

Proof Clearly S₁ ⊆ L^⊥ and 0 ∈ ri S₁, so 0 ∈ int(S₁ + L); hence P₁ is compact. Since (S₁ + L)° = S₁° ∩ L^⊥, any y ∈ P₁ belongs to L^⊥, i.e., is of the form y = Σ_{i=1}^r t_i c^i for some t = (t₁, …, t_r). But by Proposition 1.28, y ∈ S₁° is equivalent to ⟨c^j, y⟩ ≤ 1/θ_j ∀j = 0, 1, …, r. Hence P₁ is the polyhedron defined by (9.23). We have 0 < θ_j ∀j = 0, 1, …, r, so dim P₁ = r (Corollary 1.16), and since P₁ is compact it must be an r-simplex. To prove (ii), observe that S₁ + L ⊆ (C ∩ L^⊥) + L = C, so C° ⊆ (S₁ + L)° = P₁. If max{⟨v, x⟩ | x ∈ D} ≤ 1 ∀v ∈ V, then max{⟨y, x⟩ | x ∈ D} ≤ 1 ∀y ∈ P, i.e., P ⊆ D°; hence C° ⊆ D° because C° ⊆ P. □
Thus Procedure DC* can be applied, starting from P₁ as the initial polytope. By incorporating Procedure DC* (initialized from P₁) into the PA Algorithm for (BCP) (Sect. 7.1) we obtain the following decomposition algorithm for (QCM):

PA Algorithm for (QCM)

Initialization. Let x̄ be a feasible solution (the best available), C = {x ∈ X | f(x) ≥ f(x̄)}. Choose a feasible point x⁰ of D such that f(x⁰) > f(x̄) and set D ← D − x⁰, C ← C − x⁰. Let P̃₁ = φ(P₁) be the simplex in R^r defined by (9.23) (or P̃₁ = φ(P₁) with P₁ being any polytope in L^⊥ such that C° ⊆ P₁). Let V₁ be the vertex set of P̃₁. Set k = 1.

Step 1. For every new t = (t₁, …, t_r) ∈ V_k solve the linear program

max{ Σ_{i=1}^r t_i⟨c^i, x⟩ | x ∈ D }   (9.25)

to obtain its optimal value λ(t) and a basic optimal solution x(t).

Step 2. Let t^k ∈ argmax{λ(t) | t ∈ V_k}. If λ(t^k) ≤ 1, then terminate: x̄ is an optimal solution of (QCM).

Step 3. If λ(t^k) > 1 and x^k = x(t^k) ∉ C, then update the current best feasible solution and the set C by resetting x̄ = x^k.

Step 4. Compute

θ_k = sup{θ | f(θx^k) ≥ f(x̄)}

and define

P̃_{k+1} = P̃_k ∩ { t | Σ_{i=1}^r t_i⟨c^i, x^k⟩ ≤ 1/θ_k }.

Step 5. From V_k derive the vertex set V_{k+1} of P̃_{k+1}. Set k ← k + 1 and go back to Step 1.

Remark 9.2 Reported computational experiments (Tuy and Tam 1995) seem to indicate that the PA method is quite practical for problems with fairly large n, provided r is small (typically r ≤ 6). For r ≤ 3, however, specialized parametric methods can be developed which are more efficient. These will be discussed in the next section.

Case When K = {u | Qu ≤ 0}

A case when the initial simplex P₁ can be constructed so that its vertex set is readily available is when K = {u | Qu ≤ 0}, with an r × n matrix Q of rank r. Let c¹, …, c^r be the rows of Q. Using formula (9.11) one can compute a point w satisfying Qw = e, where e = (1, …, 1)^T, and a value σ > 0 such that ŵ = σw ∈ C (for the efficiency of the algorithm, σ should be taken as large as possible).

Lemma 9.2 The polar P₁ of the set S₁ = ŵ + K is an r-simplex with vertex set {0, c¹/σ, …, c^r/σ}, and it satisfies condition (ii) in Lemma 9.1.

Proof Since K ⊆ rec C and ŵ ∈ C, it follows that S₁ = ŵ + K ⊆ C, and hence P₁ = S₁° ⊇ C°. Furthermore, clearly S₁ = {x | ⟨c^i, x⟩ ≤ σ, i = 1, …, r}, because x ∈ S₁ if and only if x − ŵ ∈ K, i.e., ⟨c^i, x − ŵ⟩ ≤ 0, i = 1, …, r, i.e., ⟨c^i, x⟩ ≤ ⟨c^i, ŵ⟩ = σ, i = 1, …, r. From Proposition 1.28 we then deduce that S₁° = conv{0, c¹/σ, …, c^r/σ}. The rest is immediate. □

Also note that in this case K° = cone{c¹, …, c^r}, and so φ(K°) = R^r_+, while φ(P₁) = {t ∈ R^r_+ | Σ_{i=1}^r t_i ≤ 1/σ}.

9.4 The Parametric Programming Approach

In the previous sections it was shown how a quasiconcave monotonic problem can be transformed into an equivalent quasiconcave problem of reduced dimension. In this section an alternative approach will be presented, in which a quasiconcave monotonic problem is solved through the analysis of an associated linear program depending on a parameter (of the same dimension as the reduced problem in the former approach).

Consider the general problem formulated in Sect. 8.2:

(QCM)   min{ f(x) | x ∈ D }   (9.26)

where D is a compact set in R^n, not necessarily even convex, and f(x) is a quasiconcave monotonic function on a closed convex set X ⊇ D with respect to a polyhedral cone K with lineality space L = {x | Qx = 0}. To this problem we associate the following program, where v is an arbitrary vector in K*:

LP(v)   min{ ⟨v, x⟩ | x ∈ D }.   (9.27)

Denote by K_b any compact set such that cone(K_b) = K*.


Theorem 9.1 For every v ∈ K*, let x_v be an arbitrary optimal solution of LP(v). Then

min{ f(x) | x ∈ D } = min{ f(x_v) | v ∈ K_b }.   (9.28)

Proof Let γ = min{f(x_v) | v ∈ K_b \ {0}}. Clearly γ = min{f(x_v) | v ∈ K* \ {0}}. Denote by E the convex hull of the closure of the set of all x_v, v ∈ K* \ {0}. Since D is compact, so is E, and it is easy to see that the convex set G = E + K is closed. Indeed, for any sequence x^ν = u^ν + v^ν → x⁰ with u^ν ∈ E, v^ν ∈ K, one has, by passing to subsequences if necessary, u^ν → u⁰ ∈ E, hence v^ν = x^ν − u^ν → x⁰ − u⁰ ∈ K, which implies that x⁰ = u⁰ + (x⁰ − u⁰) ∈ E + K = G. Thus G is a closed convex set. By K-monotonicity we have f(y + u) ≥ f(y) for all y ∈ E, u ∈ K; hence f(x) ≥ γ for all x ∈ E + K = G. Suppose now that γ > min{f(x) | x ∈ D}, so that for some z ∈ D we have f(z) < γ, i.e., z ∈ D \ G. By Proposition 1.15 on the normals to a convex set, there exists x⁰ ∈ ∂G such that v := x⁰ − z satisfies

⟨v, x − x⁰⟩ ≥ 0 ∀x ∈ G,  ⟨v, z − x⁰⟩ < 0.   (9.29)

For every u ∈ K, since x⁰ + u ∈ G + K = G, we have ⟨v, u⟩ ≥ 0; hence v ∈ K* \ {0}. But since z ∈ D, it follows from the definition of x_v that ⟨v, x_v⟩ ≤ ⟨v, z⟩ < ⟨v, x⁰⟩ by (9.29); hence ⟨v, x_v − x⁰⟩ < 0, conflicting with the fact x_v ∈ E ⊆ G and (9.29). □
The above theorem reduces problem (QCM) to minimizing the function φ(v) := f(x_v) over K_b. Though K* ⊆ L^⊥, and dim L^⊥ = r may be small, this is still a difficult problem even when D is convex, because φ(v) is generally highly nonconvex. The point, however, is that in many important cases the problem (9.28) is more amenable to computational analysis than the original problem. For instance, as we saw in the previous section, when D is a polytope the PA Method solves (QCM) through solving a suitable sequence of linear programs (9.25), which can be recognized to be identical to (9.27) with v = −Σ_{i=1}^r t_i c^i.
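A minimal numerical illustration of Theorem 9.1, assuming SciPy and invented data: f(x) = x₁x₂ is quasiconcave and K-monotonic on R²₊ for K = R²₊, so K* = R²₊ and K_b can be taken as the segment {(λ, 1 − λ) : 0 ≤ λ ≤ 1}; sweeping v over K_b and keeping the best f(x_v) recovers the global minimum over a polytope D (a grid over K_b suffices here, since only finitely many LP vertices occur).

```python
import numpy as np
from scipy.optimize import linprog

# D = {0 <= x <= 3 : x1 + 2 x2 >= 2, 2 x1 + x2 >= 2}
A_ub = -np.array([[1.0, 2.0], [2.0, 1.0]])
b_ub = -np.array([2.0, 2.0])

best_val, best_x = np.inf, None
for lam in np.linspace(0.0, 1.0, 101):     # grid over K_b
    v = np.array([lam, 1.0 - lam])
    res = linprog(v, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 3), (0, 3)],
                  method="highs")
    x = res.x                               # x_v, a vertex of D
    if x[0] * x[1] < best_val:
        best_val, best_x = x[0] * x[1], x
print(best_x, best_val)                     # a vertex with x1*x2 = 0, e.g. (0, 2)
```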
An important consequence of Theorem 9.1 can be drawn when:

D is a polyhedron;
K = {x | ⟨c^i, x⟩ ≥ 0, i = 1, …, p; ⟨c^i, x⟩ = 0, i = p + 1, …, r}   (9.30)

(c¹, …, c^r being linearly independent vectors of R^n). Let

Λ = {λ ∈ R^p_+ | λ_p = 1},
W = {ξ = (ξ_{p+1}, …, ξ_r) | β_i ≤ ξ_i ≤ β̄_i, i = p + 1, …, r},
β_i = inf_{x∈D} ⟨c^i, x⟩,  β̄_i = sup_{x∈D} ⟨c^i, x⟩,  i = p + 1, …, r.

Corollary 9.1 For each (λ, ξ) ∈ Λ × W let x_{λ,ξ} be an arbitrary basic optimal solution of the linear program

LP(λ, ξ)   min Σ_{i=1}^{p−1} λ_i⟨c^i, x⟩ + ⟨c^p, x⟩   s.t.  x ∈ D, ⟨c^i, x⟩ = ξ_i, i = p + 1, …, r.

Then

min{ f(x) | x ∈ D } = min{ f(x_{λ,ξ}) | λ ∈ Λ, ξ ∈ W }.   (9.31)

Proof Clearly

min{ f(x) | x ∈ D } = min_{ξ∈W} inf{ f(x) | x ∈ D ∩ H_ξ },

where H_ξ = {x | ⟨c^i, x⟩ = ξ_i, i = p + 1, …, r}. Let K̃ = {x | ⟨c^i, x⟩ ≥ 0, i = 1, …, p}. Since f(x) is quasiconcave K̃-monotonic on X ∩ H_ξ, we have by Theorem 9.1:

min{ f(x) | x ∈ D ∩ H_ξ } = min{ f(x_{v,ξ}) | v ∈ K̃_b },   (9.32)

where x_{v,ξ} is any basic optimal solution of the linear program

min{ ⟨v, x⟩ | x ∈ D ∩ H_ξ }.   (9.33)

Noting that K̃* = cone{c¹, …, c^p}, we can take K̃_b = [c¹, …, c^p]. Also, we can assume v ∈ ri K̃_b, since for any v ∈ K̃_b there exists v′ ∈ ri K̃_b for which x_{v′,ξ} is also a basic optimal solution of (9.33). So v = Σ_{i=1}^p t_i c^i with t_p > 0, and the desired result follows by setting λ_i = t_i/t_p. □

As mentioned above, by Theorem 9.1 one can solve (QCM) by finding first the vector v̄ ∈ K_b that minimizes f(x_v) over all v ∈ K_b. In certain methods, as for instance in the PA method when D is polyhedral, an optimal solution x̄ = x_{v̄} is immediately known once v̄ has been found. In other cases, however, x̄ must be sought by solving the associated program min{⟨v̄, x⟩ | x ∈ D}. The question then arises as to whether any optimal solution of the latter program will solve (QCM).

Theorem 9.2 If, in addition to the already stated assumptions, D ⊆ int X, while the function f(x) is continuous and strictly quasiconcave on X, then for any optimal solution x̄ of (QCM) there exists v̄ ∈ K* \ {0} such that x̄ is an optimal solution of LP(v̄), and any optimal solution of LP(v̄) will solve (QCM).

(A function f(x) is said to be strictly quasiconcave on X if for any two x, x′ ∈ X, f(x′) < f(x) always implies f(x′) < f(λx + (1 − λ)x′) ∀λ ∈ (0, 1).)

Proof It suffices to consider the case when f(x) is not constant on D. Let x̄ be an optimal solution of (QCM) and X(x̄) = {x ∈ X | f(x) ≥ f(x̄)}. By quasiconcavity and continuity of f(x), this set X(x̄) is closed and convex. By optimality of x̄, D ⊆ X(x̄), and again by continuity of f(x), {x ∈ X | f(x) > f(x̄)} ⊆ int X(x̄). Furthermore, since f(x) is not constant on D, one can find x⁰ ∈ D such that f(x⁰) > f(x̄); so if x̄ ∈ int X(x̄), then for ε > 0 small enough, x_ε = x̄ − ε(x⁰ − x̄) ∈ X(x̄), f(x_ε) < f(x⁰); hence, by strict quasiconcavity of f(x), f(x_ε) < f(x̄), conflicting with the definition of X(x̄). Therefore, x̄ lies on the boundary of X(x̄). By Theorem 1.5 on supporting hyperplanes, there then exists a vector v̄ ≠ 0 such that

⟨v̄, x − x̄⟩ ≥ 0  ∀x ∈ X(x̄).   (9.34)

For any u ∈ K we have x̄ + u ∈ D + K ⊆ X(x̄), so ⟨v̄, u⟩ ≥ 0, and consequently v̄ ∈ K* \ {0}. If x̃ is any optimal solution of LP(v̄), then ⟨v̄, x̃ − x̄⟩ = 0, and in view of (9.34), x̃ cannot be an interior point of X(x̄), for then ⟨v̄, x − x̄⟩ = 0 ∀x ∈ R^n, conflicting with v̄ ≠ 0. So x̃ is a boundary point of X(x̄), and hence f(x̃) = f(x̄), because {x ∈ X | f(x) > f(x̄)} ⊆ int X(x̄). Thus any optimal solution x̃ of LP(v̄) is optimal to (QCM). □

Remark 9.3 If f(x) is differentiable and pseudoconcave, then it is strictly quasiconcave (see, e.g., Mangasarian 1969). In that case, since X(x̄) = {x ∈ X | f(x) ≥ f(x̄)}, one must have ⟨∇f(x̄), x − x̄⟩ ≥ 0 ∀x ∈ X(x̄), and since one can assume that there is x ∈ X satisfying f(x) > f(x̄), the strict quasiconcavity of f(x) implies that ∇f(x̄) ≠ 0 (Mangasarian 1969). So ∇f(x̄) satisfies (9.34), and one can take v̄ = ∇f(x̄). This result was established in Sniedovich (1986) as the foundation of the so-called C-programming.

Corollary 9.2 Let F: R^q → R be a continuous, strictly quasiconcave, and K-monotonic function on a closed convex set Y ⊆ R^q, where K ⊆ R^q is a polyhedral convex cone contained in the recession cone of Y. Let g: D → R^q be a map defined on a set D ⊆ R^n such that g(D) ⊆ int Y. If F(g(x)) attains a minimum on D, then there exists t ∈ K* such that any optimal solution of

min{ ⟨t, g(x)⟩ | x ∈ D }   (9.35)

will solve

min{ F(g(x)) | x ∈ D }.   (9.36)

Proof If x̄ is a minimizer of F(g(x)) over D, then ȳ = g(x̄) is a minimizer of F(y) over G = g(D), and the result follows from Theorem 9.2. □

In particular, when K ⊇ R^q_+ and D is a convex set while g_i(x), i = 1, …, q, are convex functions, problem (9.35) is a parametric convex program (t ≥ 0 because K* ⊆ R^q_+).

Example 9.8 Consider the problem (Tanaka et al. 1991):

min_x { (Σ_{i=1}^n f_{1,i}(x_i)) · (Σ_{i=1}^n f_{2,i}(x_i)) | x_i ∈ X_i, i = 1, …, n }   (9.37)

where X_i ⊆ R_{++} and f_{1,i}, f_{2,i} are positive-valued functions (i = 1, …, n). Applying the Corollary for g(x) = (g₁(x), g₂(x)), g₁(x) = Σ_{i=1}^n f_{1,i}(x_i), g₂(x) = Σ_{i=1}^n f_{2,i}(x_i), F(y) = y₁y₂ for every y = (y₁, y₂) ∈ R²_{++}, we see that if this problem is solvable, then there exists a number t > 0 such that any optimal solution of

min_x { Σ_{i=1}^n (f_{1,i}(x_i) + t f_{2,i}(x_i)) | x_i ∈ X_i, i = 1, …, n }   (9.38)

will solve (9.37). For fixed t > 0 the problem (9.38) splits into n one-dimensional subproblems

min{ f_{1,i}(x_i) + t f_{2,i}(x_i) | x_i ∈ X_i }.

In the particular case of the problem of optimal ordering policy for jointly replenished products, as treated in the mentioned paper of Tanaka et al., f_{1,i}(x_i) = a/n + a_i/x_i, f_{2,i}(x_i) = b_i x_i/2 (a, a_i, b_i > 0) and X_i is the set of positive integers, so each of the above subproblems can easily be solved.
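As a numerical illustration, the sketch below solves a small hypothetical instance of this joint replenishment model (the data a, a_i, b_i are invented): for each fixed t the surrogate problem (9.38) separates into one-dimensional integer minimizations, and the parameter t is searched over a simple grid.

```python
import math

# hypothetical data
a, ai, bi = 10.0, [4.0, 9.0, 1.0], [0.5, 2.0, 1.0]
n = len(ai)

def solve_fixed_t(t):
    """Solve the separable surrogate problem (9.38) for a fixed t > 0."""
    xs = []
    for i in range(n):
        # minimize ai/x + t*bi*x/2 over positive integers; the continuous
        # minimizer is sqrt(2*ai/(t*bi)), so its integer neighbours suffice.
        x_cont = math.sqrt(2 * ai[i] / (t * bi[i]))
        cand = {max(1, math.floor(x_cont)), max(1, math.ceil(x_cont))}
        xs.append(min(cand, key=lambda x: ai[i] / x + t * bi[i] * x / 2))
    g1 = sum(a / n + ai[i] / xs[i] for i in range(n))
    g2 = sum(bi[i] * xs[i] / 2 for i in range(n))
    return g1 * g2, xs

ts = [10 ** (k / 20 - 2) for k in range(81)]     # grid for t in [1e-2, 1e2]
value, x = min((solve_fixed_t(t) for t in ts), key=lambda p: p[0])
print(value, x)                                  # objective and order intervals
```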

9.5 Linear Multiplicative Programming

In this section we discuss parametric methods for solving problem (QCM) under assumptions (9.30). Let Q be the matrix with rows c¹, …, c^r. In view of Proposition 9.1, the problem can be reformulated as

min{ F(⟨c¹, x⟩, …, ⟨c^r, x⟩) | x ∈ D },   (9.39)

where D is a polyhedron in R^n and F: R^r → R is a quasiconcave function on a closed convex set Y ⊇ Q(D) such that for any y, y′ ∈ Y:

y_i ≤ y′_i (i = 1, …, p),  y_i = y′_i (i = p + 1, …, r)  ⇒  F(y) ≤ F(y′).   (9.40)

A typical problem of this class is the linear multiplicative program:

(LMP)   min{ Π_{j=1}^r ⟨c^j, x⟩ | Ax = b, x ≥ 0 }   (9.41)

which corresponds to the case F(y) = Π_{i=1}^r y_i. Aside from (LMP), a wide variety of other problems can be cast in the form (9.39), as shown by the many examples in Sect. 7.1. In spite of that, it turns out that solving (9.39) with different functions F(y) reduces to comparing the objective function values on a finite set of extreme points of D which depends upon the vectors c¹, …, c^r but not on the specific form of F(y).

Due to their relevance to applications in various fields (multiobjective programming, bond portfolio optimization, VLSI design, etc.), the above class of problems, including (LMP) and its extensions, has been the subject of a good deal of research. Parametric methods for solving (LMP) date back to Swarup (1966a,b,c), Forgo (1975), and Gabasov and Kirillova (1980). However, intensive development in the framework of global optimization began only with the works of Konno and Kuno (1990, 1992). The basic idea of the parametric method can be described as follows. By Corollary 9.1 one can solve (LMP) by solving the problem

min{ f(x_{λ,ξ}) | λ ∈ Λ, ξ ∈ W },   (9.42)

where, in the notation of Corollary 9.1, x_{λ,ξ} is an arbitrary basic optimal solution of the linear program

LP(λ, ξ)   min Σ_{i=1}^p λ_i⟨c^i, x⟩   s.t.  x ∈ D, ⟨c^j, x⟩ = ξ_j, j = p + 1, …, r.

For solving (9.42) the parametric method exploits the property of LP(λ, ξ) that the parameter space is partitioned into finitely many polyhedrons over each of which the minimum of the function φ(λ, ξ) = f(x_{λ,ξ}) can be computed relatively easily. Specifically, writing the constraints of this linear program in the form

Ax = b + Sξ,  x ≥ 0,   (9.43)

where S is a suitable matrix, denote by B the collection of all basis matrices of this system.

Lemma 9.3 Each basis B of the system (9.43) determines a set Λ_B × W_B ⊆ Λ × W which is a polyhedron whenever nonempty, and an affine map x_B: W_B → R^n, such that x_B(ξ) is a basic optimal solution of LP(λ, ξ) for all (λ, ξ) ∈ Λ_B × W_B. The collection of polyhedrons Λ_B × W_B corresponding to all bases B ∈ B covers all (λ, ξ) for which LP(λ, ξ) has an optimal solution.

 
Proof Let B be any basis of this system. If A = (B, N) and x = (x_B, x_N), so that Bx_B + Nx_N = b + Sξ, then

x_B = B^{-1}(b + Sξ) − B^{-1}N x_N   (9.44)

and the associated basic solution of (9.43) is

x_B = B^{-1}(b + Sξ),  x_N = 0.   (9.45)

This basic solution is feasible for all ξ ∈ W such that B^{-1}(b + Sξ) ≥ 0 (nonnegativity condition), and is dual feasible for all λ ∈ Λ such that Σ_{i=1}^p λ_i[(c^i_N)^T − (c^i_B)^T B^{-1}N] ≥ 0 (optimality criterion), where c^i = (c^i_B, c^i_N). If we denote the vector (9.45) by x_B(ξ) and define the polyhedrons

Λ_B = { λ ∈ Λ | Σ_{i=1}^p λ_i[(c^i_N)^T − (c^i_B)^T B^{-1}N] ≥ 0 },
W_B = { ξ ∈ W | B^{-1}(b + Sξ) ≥ 0 },

then, whenever both Λ_B and W_B are nonempty, the affine map ξ ∈ W_B ↦ x_B(ξ) ∈ R^n is such that x_B(ξ) is a basic optimal solution of LP(λ, ξ) for all λ ∈ Λ_B, ξ ∈ W_B. The last assertion of the Lemma is obvious from the fact that if LP(λ, ξ) has a basic optimal solution, then (λ, ξ) ∈ Λ_B × W_B for the corresponding basis B. □

Each polyhedron Λ_B × W_B is called a cell. Thus, by taking x_{λ,ξ} = x_B(ξ) for every (λ, ξ) ∈ Λ_B × W_B, the function φ(λ, ξ) = f(x_{λ,ξ}) is constant in λ ∈ Λ_B for every fixed ξ ∈ W_B and quasiconcave in ξ ∈ W_B for every fixed λ ∈ Λ_B. Hence its minimum over the cell Λ_B × W_B is equal to the minimum of f(x_B(ξ)) over W_B and is achieved at a vertex of the polytope W_B. Therefore, if the collection of cells can be effectively computed, then problem (9.42) can be solved by scanning the vertices of the cells. Unfortunately, however, when r > 3 the cells are very complicated and there is in general no practical method for computing them, except in rare cases where additional structure (like network constraints, see Sect. 9.8) may simplify this computation. Below we briefly examine the method for problems with r ≤ 3. For more detail the reader is referred to the cited papers of Konno and Kuno.
9.5.1 Parametric Objective Simplex Method

When p = r = 2, problem (9.39) reads

min{ F(⟨c¹, x⟩, ⟨c², x⟩) | x ∈ D }   (9.46)

where D is a polyhedron contained in the set {x | ⟨c¹, x⟩ ≥ 0, ⟨c², x⟩ ≥ 0}, and F(y) is a quasiconcave function on the orthant R²_+ such that

y, y′ ∈ R²_+,  y ≤ y′  ⇒  F(y) ≤ F(y′).   (9.47)

As seen earlier, aside from the special case F(y) = y₁y₂ (linear multiplicative program), the class (9.46) includes problems with quite different objective functions, such as

(⟨c¹, x⟩)^{q₁}(⟨c², x⟩)^{q₂};   −q₁e^{−⟨c¹,x⟩} − q₂e^{−⟨c²,x⟩}   (q₁ > 0, q₂ > 0),

corresponding to the different choices of F(y):

y₁^{q₁}y₂^{q₂};   −q₁e^{−y₁} − q₂e^{−y₂}.

By Corollary 9.1 (with p = r = 2) the parametric linear program associated with (9.46) is

LP(θ)   min{ ⟨θc¹ + c², x⟩ | x ∈ D },  θ ≥ 0.   (9.48)

Since the parameter domain Λ is the nonnegative real line, the cells of this parametric program are intervals and can easily be determined by the standard parametric objective simplex method (see, e.g., Murty 1983), which consists in the following.

Suppose a basis B_k is already available such that the corresponding basic solution x^k is optimal to LP(θ) for all θ ∈ Δ_k := [θ_{k−1}, θ_k] but not for θ outside this interval. The latter means that if θ is slightly greater than θ_k, then at least one of the inequalities in the optimality criterion

(θc¹ + c²)_{N_k} − (θc¹ + c²)_{B_k}B_k^{-1}N_k ≥ 0

becomes violated. Therefore, by performing a number of simplex pivot steps, we will pass to a new basis B_{k+1} with a basic solution x^{k+1} which will be optimal to LP(θ) for all θ in a new interval Δ_{k+1} ⊆ [θ_k, +∞). In this way, starting from the value θ₀ = 0, one can determine the interval Δ₁ = [θ₀, θ₁], then pass to the next Δ₂ = [θ₁, θ₂], and so on, until the whole halfline is covered. Note that the optimal value function δ(θ) of the linear program LP(θ) is a concave piecewise affine function of θ with the Δ_k as linearity intervals. The points θ₀ = 0, θ₁, …, θ_l are the breakpoints of δ(θ). An optimal solution to (9.46) is then x^{k*}, where

k* ∈ argmin{ F(⟨c¹, x^k⟩, ⟨c², x^k⟩) | k = 1, 2, …, l }.   (9.49)

Remark 9.4 Condition (9.47) is a variant of (9.40) for p = r = 2. Condition (9.40) with p = 1, r = 2 always holds, since it then simply means F(y) = F(y′) whenever y = y′. Therefore, without assuming (9.47), Corollary 9.1 and hence the above method still apply. However, in this case Λ = R, i.e., the linear program LP(θ) should be considered for all θ ∈ (−∞, +∞).
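The following sketch mimics the parametric sweep numerically, assuming SciPy: where the text computes the exact breakpoints θ₁, …, θ_l by simplex pivoting, the sketch merely samples θ on a finite grid and collects the distinct LP vertices (all data are illustrative).

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data: F(y) = y1*y2 over D = {0.2 <= x <= 2 : x1 + x2 >= 1}.
c1, c2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
A_ub, b_ub = -np.array([[1.0, 1.0]]), -np.array([1.0])   # x1 + x2 >= 1
bounds = [(0.2, 2.0), (0.2, 2.0)]

candidates = []
for theta in np.linspace(0.0, 50.0, 501):        # finite stand-in for [0, +inf)
    res = linprog(theta * c1 + c2, A_ub=A_ub, b_ub=b_ub,
                  bounds=bounds, method="highs")
    if not any(np.allclose(res.x, x) for x in candidates):
        candidates.append(res.x)                 # one vertex per linearity interval

best = min(candidates, key=lambda x: (c1 @ x) * (c2 @ x))
print(best, (c1 @ best) * (c2 @ best))   # (0.8, 0.2) or (0.2, 0.8), value 0.16
```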

9.5.2 Parametric Right-Hand Side Method

Consider now problem (9.39) under somewhat weaker assumptions than previously: namely, F(y) is not required to be quasiconcave, while condition (9.47) is replaced by the following weaker one:

y₁ ≤ y′₁,  y₂ = y′₂  ⇒  F(y) ≤ F(y′).   (9.50)

The latter condition implies that for fixed y₂ the univariate function F(·, y₂) is nondecreasing on the real line. It is then not hard to verify that Corollary 9.1 still holds in this case. In fact, this is also easy to prove directly.

Lemma 9.4 Let β = min{⟨c², x⟩ | x ∈ D}, β̄ = max{⟨c², x⟩ | x ∈ D}, and for every ξ ∈ [β, β̄] let x^ξ be a basic optimal solution of the linear program

LP(ξ)   min{ ⟨c¹, x⟩ | x ∈ D, ⟨c², x⟩ = ξ }.

Then

min{ F(⟨c¹, x⟩, ⟨c², x⟩) | x ∈ D } = min{ F(⟨c¹, x^ξ⟩, ⟨c², x^ξ⟩) | β ≤ ξ ≤ β̄ }.

Proof For every feasible solution x of LP(ξ) we have ⟨c¹, x − x^ξ⟩ ≥ 0 (because x^ξ is optimal to LP(ξ)) and ⟨c², x − x^ξ⟩ = 0 (because x is feasible to LP(ξ)). Hence, from (9.50), F(⟨c¹, x⟩, ⟨c², x⟩) ≥ F(⟨c¹, x^ξ⟩, ⟨c², x^ξ⟩), and the conclusion follows. □

Solving (9.39) thus reduces to solving LP(ξ) parametrically in ξ. This can be done by the following standard parametric right-hand side method (see, e.g., Murty 1983). Assuming that the constraints of LP(ξ) have been rewritten in the form

Ax = b + ξb⁰,  x ≥ 0,

let B_k be a basis of this system satisfying the dual feasibility condition

c¹_{N_k} − c¹_{B_k}B_k^{-1}N_k ≥ 0.

Then the basic solution defined by this basis, i.e., the vector x^k = u^k + ξv^k such that x^k_{B_k} = B_k^{-1}(b + ξb⁰), x^k_{N_k} = 0 [see (9.45)], is dual feasible. This basic solution x^k is optimal to LP(ξ) if and only if it is feasible, i.e., if and only if it satisfies the nonnegativity condition

B_k^{-1}(b + ξb⁰) ≥ 0.

Let Δ_k := [ξ_{k−1}, ξ_k] be the interval formed by all values ξ satisfying the above inequalities. Then x^k is optimal to LP(ξ) for all ξ in this interval, but for any value ξ slightly greater than ξ_k at least one of the above inequalities becomes violated, i.e., at least one of the components of x^k becomes negative. By performing then a number of dual simplex pivots, we can pass to a new basis B_{k+1} with a basic solution x^{k+1} which will be optimal for all ξ in some interval Δ_{k+1} = [ξ_k, ξ_{k+1}]. Thus, starting from the value ξ₀ = β, one can determine an initial interval Δ₁ = [β, ξ₁], then pass to the next interval Δ₂ = [ξ₁, ξ₂], and so on, until the whole interval [β, β̄] is covered. The generated points ξ₀ = β, ξ₁, …, ξ_h = β̄ are the breakpoints of the optimal value function δ(ξ) of LP(ξ), which is a convex piecewise affine function. By Lemma 9.4, an optimal solution to (9.39) is then x^{k*}, where

k* ∈ argmin{ F(⟨c¹, x^k⟩, ⟨c², x^k⟩) | k = 1, 2, …, h }.   (9.51)

Remark 9.5 In both the objective and the right-hand side methods, the sequence of points x^k on which the objective function values have to be compared depends upon the vectors c¹, c² but not on the specific function F(y). Computational experiments (Konno and Kuno 1992; also Tuy and Tam 1992) indicate that solving a problem (9.39) requires no more effort than solving just a few linear programs of the same size. Two particular problems of this class are worth mentioning:

min{ (c¹x)(c²x) | x ∈ D }   (9.52)
min{ −x₁² + c⁰x | x ∈ D },   (9.53)

where D is a polyhedron [contained in the domain c¹x > 0, c²x > 0 for (9.52)]. These correspond to the functions F(y) = y₁y₂ and F(y) = −y₁² + y₂, respectively, and have been shown to be NP-hard (Matsui 1996; Pardalos and Vavasis 1991), although, as we saw above, they are efficiently solved by the parametric method.
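An analogous grid sketch for the right-hand side parametrization, again assuming SciPy and invented data: the text's method locates the exact breakpoints ξ₁, …, ξ_h by dual simplex pivoting, while the sketch simply samples ξ ∈ [β, β̄] and compares F at the LP solutions.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data for (9.52): minimize (c1 x)(c2 x) over
# D = {0 <= x <= 2 : x1 + x2 >= 1}; both linear forms are positive on D.
c1, c2 = np.array([1.0, 1.0]), np.array([2.0, 1.0])
A_ub, b_ub = -np.array([[1.0, 1.0]]), -np.array([1.0])
bounds = [(0.0, 2.0), (0.0, 2.0)]

beta = linprog(c2, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs").fun
beta_bar = -linprog(-c2, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs").fun

best_val, best_x = np.inf, None
for xi in np.linspace(beta, beta_bar, 401):       # grid over [beta, beta_bar]
    res = linprog(c1, A_ub=A_ub, b_ub=b_ub, bounds=bounds,
                  A_eq=c2[None, :], b_eq=[xi], method="highs")
    if res.success:
        val = (c1 @ res.x) * (c2 @ res.x)
        if val < best_val:
            best_val, best_x = val, res.x
print(best_x, best_val)                           # expect x = (0, 1), value 1.0
```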

9.5.3 Parametric Combined Method

When r = 3, the parametric linear program associated with (9.39) is difficult to handle, because the parameter domain is generally two-dimensional. In some particular cases, however, a problem (9.39) with r = 3 may be successfully tackled by the parametric method. As an example, consider the problem

min{ c⁰x + (c¹x)(c²x) | x ∈ D }   (9.54)

which may not belong to the class (QCM), because we cannot prove the quasiconcavity of the function F(y) = y₁ + y₂y₃. It is easily seen, however, that this problem is equivalent to

min{ ⟨c⁰ + ξc¹, x⟩ | x ∈ D, ⟨c², x⟩ = ξ, β ≤ ξ ≤ β̄ }   (9.55)

where β = min_{x∈D}⟨c², x⟩ and β̄ = max_{x∈D}⟨c², x⟩. By writing the constraints x ∈ D, ⟨c², x⟩ = ξ in the form

Ax = b + ξb⁰,   (9.56)

we can prove the following:

Lemma 9.5 Each basis B of the system (9.56) determines an interval Δ_B ⊆ [β, β̄] and an affine map x_B: Δ_B → R^n such that x_B(ξ) is a basic optimal solution of the problem

P(ξ)   min{ ⟨c⁰ + ξc¹, x⟩ | x ∈ D, ⟨c², x⟩ = ξ }

for all ξ ∈ Δ_B. The collection of all intervals Δ_B corresponding to all bases B such that Δ_B ≠ ∅ covers all ξ for which P(ξ) has an optimal solution.

Proof Let B be a basis, A = (B, N), x = (x_B, x_N), so that the associated basic solution is

x_B = B^{-1}(b + ξb⁰),  x_N = 0.   (9.57)

This basic solution is feasible for all ξ ∈ [β, β̄] satisfying the nonnegativity condition

B^{-1}(b + ξb⁰) ≥ 0   (9.58)

and is dual feasible for all ξ satisfying the dual feasibility condition (optimality criterion)

(c⁰ + ξc¹)_N − (c⁰ + ξc¹)_B B^{-1}N ≥ 0.   (9.59)

Conditions (9.58) and (9.59) each determine an interval. If the intersection Δ_B of the two intervals is nonempty, then the basic solution (9.57) is optimal for all ξ ∈ Δ_B. The rest is obvious. □

Based on this Lemma, one can generate all the intervals Δ_B corresponding to different bases B by a procedure combining primal simplex pivots with dual simplex pivots (see, e.g., Murty 1983). Specifically, for any given basis B_k, denote by u^k + ξv^k the vector (9.57), which is the basic solution of P(ξ) corresponding to this basis, and by Δ_k = [ξ_{k−1}, ξ_k] the interval of optimality of this basic solution. When ξ is slightly greater than ξ_k, either the nonnegativity condition [see (9.58)] or the dual feasibility condition [see (9.59)] becomes violated. In the latter case we perform a primal simplex pivot step, in the former case a dual simplex pivot step, and repeat as long as necessary, until a new basis B_{k+1} with a new interval Δ_{k+1} is obtained. Thus, starting from an initial interval Δ₁ = [β, ξ₁], we can pass from one interval to the next, until we reach β̄. Let then Δ_k, k = 1, …, l, be the collection of all intervals. For each interval Δ_k let u^k + ξv^k be the associated basic solution, which is optimal to P(ξ) for all ξ ∈ Δ_k, and let x^k = u^k + ξ̄_k v^k, where ξ̄_k is a minimizer of the quadratic function φ(ξ) := ⟨c⁰ + ξc¹, u^k + ξv^k⟩ over Δ_k. Then, from (9.55) and by Lemma 9.5, an optimal solution of (9.54) is x^{k*}, where

k* ∈ argmin{ c⁰x^k + (c¹x^k)(c²x^k) | k = 1, 2, …, l }.

Note that problem (9.54) is also NP-hard (Pardalos and Vavasis 1991).

9.6 Convex Constraints

So far we have been mainly concerned with solving quasiconcave monotonic minimization problems under linear constraints. Consider now the problem

(QCM)   min{ f(x) | x ∈ D }   (9.60)

formulated in Sect. 9.2 [see (9.8)], where the constraint set D is compact convex but not necessarily polyhedral. Recall that by Proposition 9.1, where Qx = (c¹x, …, c^r x)^T, there exists a continuous quasiconcave function F: R^r → R on a closed convex set Y ⊇ Q(D) satisfying condition (9.40) and such that

f(x) = F(⟨c¹, x⟩, …, ⟨c^r, x⟩).   (9.61)

It is easy to extend the branch and bound algorithms of Sect. 9.2 and the polyhedral annexation algorithm of Sect. 9.3 to (QCM) with a compact convex constraint set D, and in fact to the following more general problem:

(GQCM)   min{ F(g(x)) | x ∈ D }   (9.62)

where D and F(y) are as previously (with g(D) replacing Q(D)), while g = (g₁, …, g_r)^T: R^n → R^r is a map such that g_i(x), i = 1, …, p, are convex on a closed convex set X ⊇ D and g_i(x) = ⟨c^i, x⟩ + d_i, i = p + 1, …, r.

For the sake of simplicity we shall focus on the case p = r, i.e., we shall assume F(y) ≤ F(y′) whenever y, y′ ∈ Y, y ≤ y′. This occurs, for instance, for the convex multiplicative programming problem

min{ Π_{i=1}^r g_i(x) | x ∈ D }   (9.63)

where F(y) = Π_{i=1}^r y_i and D ⊆ {x | g_i(x) ≥ 0, i = 1, …, r}. Clearly problem (GQCM) can be rewritten as

min{ F(y) | g(x) ≤ y, x ∈ D }   (9.64)

which is to minimize the quasiconcave R^r_+-monotonic function F(y) over the closed convex set

E = { y | g(x) ≤ y, x ∈ D } = g(D) + R^r_+.   (9.65)
9.6.1 Branch and Bound

If the function F(y) is concave (and not just quasiconcave), then simplicial subdivision can be used. In this case, a lower bound for F(y) over the feasible points y in a simplex M = [u¹, …, u^{r+1}] can be obtained by solving the convex program

min{ l_M(y) | y ∈ E ∩ M },

where l_M(y) is the affine function that agrees with F(y) at the vertices of M (this function satisfies l_M(y) ≤ F(y) ∀y ∈ M in view of the concavity of F(y)). Using this lower bounding rule, a simplicial algorithm for (GQCM) can be formulated following the standard branch and bound scheme. It is also possible to combine branch and bound with outer approximation of the convex set E by a sequence of nested polyhedrons P_k ⊇ E.

If the function F(y) is concave separable, i.e., F(y) = Σ_{j=1}^r F_j(y_j), where each F_j(·) is a concave function of one variable, then rectangular subdivision can be more convenient. However, in the general case, when F(y) is quasiconcave but not concave, an affine minorant of a quasiconcave function over a simplex (or a rectangle) is in general not easy to compute. In that case, one should use conical rather than simplicial or rectangular subdivision.

9.6.2 Polyhedral Annexation

Since F(y) is monotonic with respect to the orthant R^r_+, we can apply the PA Algorithm, using Lemma 9.2 for constructing the initial polytope P₁.

Let ȳ ∈ E be an initial feasible point of (9.64) and C = {y | F(y) ≥ F(ȳ)}. By translating, we can assume that 0 ∈ E and 0 ∈ int C, i.e., F(0) > F(ȳ). Note that instead of a linear program like (9.25) we now have to solve a convex program of the form

max{ ⟨t, y⟩ | y ∈ E } = max{ ⟨t, g(x)⟩ | x ∈ D }   (9.66)

where t ∈ (R^r_+)° = −R^r_+. Therefore, to construct the initial polytope P₁ (see Lemma 9.2) we compute σ > 0 such that F(σw) = F(ȳ) (where w = −(1, …, 1)^T ∈ R^r) and take as initial polytope P₁ the r-simplex in R^r with vertex set V₁ = {0, −e¹/σ, …, −e^r/σ}. We can thus state:

PA Algorithm (for (GQCM))

Initialization. Let ȳ be a feasible solution (the best available), C = {y | F(y) ≥ F(ȳ)}. Choose a point y⁰ ∈ E such that F(y⁰) > F(ȳ) and set E ← E − y⁰, C ← C − y⁰. Let P₁ be the simplex in R^r with vertex set V₁. Set k = 1.

Step 1. For every new t = (t₁, …, t_r)^T ∈ V_k \ {0} solve the convex program (9.66), obtaining the optimal value λ(t) and an optimal solution y(t).

Step 2. Let t^k ∈ argmax{λ(t) | t ∈ V_k}. If λ(t^k) ≤ 1, then terminate: ȳ is an optimal solution of (GQCM).

Step 3. If λ(t^k) > 1 and y^k := y(t^k) ∉ C (i.e., F(y^k) < F(ȳ)), then update the current best feasible solution and the set C by resetting ȳ = y^k.

Step 4. Compute

θ_k = sup{θ | F(θy^k) ≥ F(ȳ)}   (9.67)

and define

P_{k+1} = P_k ∩ { t | ⟨t, y^k⟩ ≤ 1/θ_k }.

Step 5. From V_k derive the vertex set V_{k+1} of P_{k+1}. Set k ← k + 1 and go back to Step 1.

To establish the convergence of this algorithm we need two lemmas.

Lemma 9.6 λ(t^k) ↘ 1 as k → +∞.

Proof Observe that λ(t) = max{⟨t, y⟩ | y ∈ g(D)} is a convex function and that y^k ∈ ∂λ(t^k), because λ(t) − λ(t^k) ≥ ⟨t, y^k⟩ − ⟨t^k, y^k⟩ = ⟨t − t^k, y^k⟩ ∀t. Denote l_k(t) = ⟨t, y^k⟩ − 1. We have l_k(t^k) = ⟨t^k, y^k⟩ − 1 = λ(t^k) − 1 > 0, and for t ∈ g(D)°: l_k(t) = ⟨t, y^k⟩ − 1 ≤ λ(t) − 1 ≤ 0, because y^k ∈ g(D) + R^r_+. Since g(D) is compact, there exists t⁰ ∈ int g(D)° (Proposition 1.21), i.e., a point t⁰ with λ(t⁰) − 1 < 0, and for each k a point s^k ∈ [t⁰, t^k) \ int g(D)° such that l_k(s^k) = 0. By Theorem 6.1 applied to the set g(D)°, the sequence {t^k} and the cuts l_k(t), we conclude that t^k − s^k → 0. Hence every accumulation point t* of {t^k} satisfies λ(t*) = 1, i.e., λ(t^k) ↘ λ(t*) = 1. □

Denote by C_k the set C at iteration k.

Lemma 9.7 C°_k ⊆ P_k for every k.

Proof The inclusion C°₁ ⊆ P₁ follows from the construction of P₁. Arguing by induction on k, suppose that C°_k ⊆ P_k for some k ≥ 1. Since θ_k y^k ∈ C_{k+1}, for all t ∈ C°_{k+1} we must have ⟨t, θ_k y^k⟩ ≤ 1, and since C_k ⊆ C_{k+1}, i.e., C°_{k+1} ⊆ C°_k ⊆ P_k, it follows that t ∈ P_k ∩ {t | ⟨t, θ_k y^k⟩ ≤ 1} = P_{k+1}. □

Proposition 9.5 Let ȳ^k be the incumbent at iteration k. Either the above algorithm terminates with an optimal solution of (GQCM), or it generates an infinite sequence {ȳ^k} every accumulation point of which is an optimal solution.

Proof Let ȳ* be any accumulation point of the sequence {ȳ^k}. Since by Lemma 9.6 max{λ(t) | t ∈ P_k} ↘ 1, it follows from Lemma 9.7 that max{λ(t) | t ∈ ∩_{k=1}^∞ C°_k} ≤ 1. This implies that ∩_{k=1}^∞ C°_k ⊆ E°, and hence E ⊆ ∪_{k=1}^∞ C_k. Thus, for any y ∈ E there is k such that y ∈ C_k, i.e., F(y) ≥ F(ȳ^k) ≥ F(ȳ*), proving the optimality of ȳ*. □

Incidentally, the above PA Algorithm shows that

min{ F(y) | y ∈ E } = min{ F(g(x(t))) | t ∈ −R^r_+ },   (9.68)

where x(t) is an arbitrary optimal solution of (9.66), i.e., of the convex program

min{ ⟨−t, g(x)⟩ | x ∈ D }.

This result could also be derived from Theorem 9.1.

Remark 9.6 All the convex programs (9.66) have the same constraints. This fact should be exploited for an efficient implementation of the above algorithm. Also, in practice these programs are usually solved only approximately, so λ(t) is merely an ε-optimal value, i.e., λ(t) ≥ max{⟨t, y⟩ | y ∈ E} − ε for some tolerance ε > 0.

9.6.3 Reduction to Quasiconcave Minimization

An important class of problems (GQCM) is constituted by problems of the form:

X Y
p qi
minimize gij .x/ s:t x 2 D; (9.69)
iD1 jD1

where D is a compact convex set and gij W D ! RCC ; i D 1; : : : ; p; j D 1; : : : ; qi ; are


continuous convex positive-valued functions on D: This class includes generalized
convex multiplicative programming problems (Konno and Kuno 1995; Konno et al.
1994) which can be formulated as:
9.6 Convex Constraints 309

8 9
< Y
q =
min g.x/ C gj .x/j x 2 D : (9.70)
: ;
jD1

or, alternatively as
( )
X
p
min g.x/ C gi .x/hi .x/j x 2 D ; (9.71)
iD1

where all functions g.x/; gi .x/; hi .x/ are convex positive-valued on D: Another
special case worth mentioning is the problem of minimizing the scalar product of
two vectors:
( n )
X
min xi yi j .x; y/ 2 D  RCC :
n
(9.72)
iD1

Let us show that any problem (9.69) can be reduced to concave minimization under convex constraints. For this we rewrite (9.69) as
$$\begin{array}{ll}\text{minimize} & \sum_{i=1}^{p}\big(\prod_{j=1}^{q_i} y_{ij}\big)^{1/q_i}\\ \text{s.t.} & g_{ij}(x)^{q_i} \le y_{ij},\quad i=1,\dots,p;\ j=1,\dots,q_i\\ & y_{ij}\ge 0,\quad x\in D.\end{array}\qquad(9.73)$$
Since $\varphi_j(y) = y_{ij}$, $j=1,\dots,q_i$, are linear, it follows from Proposition 2.7 that each term $\big(\prod_{j=1}^{q_i} y_{ij}\big)^{1/q_i}$, $i=1,\dots,p$, is a concave function of $y = (y_{ij})\in\mathbb{R}^{q_1+\dots+q_p}$; hence their sum, i.e., the objective function $F(y)$ in (9.73), is also a concave function of $y$. Furthermore, $y \ge y'$ obviously implies $F(y) \ge F(y')$, so $F(y)$ is monotonic with respect to the cone $\{y\mid y\ge 0\}$. Finally, since every function $g_{ij}(x)$ is convex, so is $g_{ij}(x)^{q_i}$. Consequently, (9.73) is a concave programming problem. If $q = q_1+\dots+q_p$ is relatively small this problem can be solved by methods discussed in Chaps. 5 and 6.
An alternative way to convert (9.69) into a concave program is to use the following relation already established in the proof of Proposition 2.7 [see (2.6)]:
$$\prod_{j=1}^{q_i} g_{ij}(x) = \min_{\theta_i\in T_i} \frac{1}{q_i}\sum_{j=1}^{q_i}\theta_{ij}\, g_{ij}^{q_i}(x),$$
with $\theta_i = (\theta_{i1},\dots,\theta_{iq_i})$, $T_i = \{\theta_i\mid \prod_{j=1}^{q_i}\theta_{ij}\ge 1\}$. Hence, setting $\varphi_i(\theta_i,x) = \frac{1}{q_i}\sum_{j=1}^{q_i}\theta_{ij}\, g_{ij}^{q_i}(x)$, the problem (9.69) is equivalent to
$$\begin{array}{l}\displaystyle \min_{x\in D}\sum_{i=1}^{p}\min_{\theta_i\in T_i}\varphi_i(\theta_i,x)\\ \displaystyle\quad = \min_{x\in D}\min\Big\{\sum_{i=1}^{p}\varphi_i(\theta_i,x) : \theta_i\in T_i,\ i=1,\dots,p\Big\}\\ \displaystyle\quad = \min\Big\{\min_{x\in D}\sum_{i=1}^{p}\varphi_i(\theta_i,x) : \theta_i\in T_i,\ i=1,\dots,p\Big\}\\ \displaystyle\quad = \min\{\varphi(\theta_1,\dots,\theta_p)\mid \theta_i\in T_i,\ i=1,\dots,p\}\qquad(9.74)\end{array}$$
where
$$\varphi(\theta_1,\dots,\theta_p) = \min_{x\in D}\sum_{i=1}^{p}\varphi_i(\theta_i,x) = \min\Big\{\sum_{i=1}^{p}\frac{1}{q_i}\sum_{j=1}^{q_i}\theta_{ij}\, g_{ij}^{q_i}(x)\ \Big|\ x\in D\Big\}.$$
Since each function $g_{ij}^{q_i}(x)$ is convex, $\varphi(\theta_1,\dots,\theta_p)$ is the optimal value of a convex program. Furthermore, the objective function of this convex program is a linear function of $(\theta_1,\dots,\theta_p)$ for fixed $x$. Therefore, $\varphi(\theta_1,\dots,\theta_p)$ is a concave function (pointwise minimum of a family of linear functions). Finally, each $T_i$, $i=1,\dots,p$, is a closed convex set in $\mathbb{R}^{q_i}$ because
$$T_i = \Big\{\theta_i\ \Big|\ \sum_{j=1}^{q_i}\log\theta_{ij}\ge 0\Big\}.$$
Thus, problem (9.69) is equivalent to (9.74), which seeks to minimize the concave function $\varphi(\theta_1,\dots,\theta_p)$ over the convex set $\prod_{i=1}^{p} T_i$.
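The evaluation of $\varphi$ at a given $\theta$ is just one convex program, so a generic NLP solver suffices. The following is a minimal sketch under illustrative assumptions ($p = 1$, $q_1 = 2$, two hypothetical convex factors on the unit box); it is not the book's algorithm, only a demonstration of how $\varphi(\theta)$ of (9.74) can be evaluated numerically.

```python
# Sketch: evaluating phi(theta) of (9.74) for an assumed instance with
# p = 1, q_1 = 2, g_11(x) = ||x - a||^2 + 1, g_12(x) = ||x - b||^2 + 1
# on the box D = [0, 1]^2.  All data are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

a, b, q = np.array([0.2, 0.8]), np.array([0.7, 0.3]), 2

def g11(x): return np.sum((x - a) ** 2) + 1.0
def g12(x): return np.sum((x - b) ** 2) + 1.0

def phi(theta):
    """Optimal value of the convex program min_{x in D} (1/q) sum_j theta_j g_j(x)^q."""
    obj = lambda x: (theta[0] * g11(x) ** q + theta[1] * g12(x) ** q) / q
    res = minimize(obj, x0=np.full(2, 0.5), bounds=[(0.0, 1.0)] * 2)
    return res.fun

# phi is concave in theta; minimizing it over T_1 = {theta >= 0, theta_1*theta_2 >= 1}
# is exactly the concave program (9.74).
print(phi(np.array([1.0, 1.0])))
```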
Example 9.9 As an illustration, consider the problem of determining a rectangle of minimal area enclosing the projection of a given compact convex set $D\subset\mathbb{R}^n$ ($n > 2$) onto the plane $\mathbb{R}^2$. This problem arises from applications in packing and optimal layout (Gale 1981; Haims and Freeman 1970; Maling et al. 1982), especially when two-dimensional layouts are restricted by $n-2$ factors. When the vertices of $D$ are known, there exists an efficient algorithm based on computational geometry (Bentley and Shamos 1978; Graham 1972; Toussaint 1978). In the general case, however, the problem is more complicated.

Assume that $D$ has full dimension in $\mathbb{R}^n = \mathbb{R}^2\times\mathbb{R}^{n-2}$. The projection of $D$ onto $\mathbb{R}^2$ is the compact convex set $\mathrm{pr}D = \{y\in\mathbb{R}^2\mid \exists z\in\mathbb{R}^{n-2},\ (y,z)\in D\}$. For any vector $x\in\mathbb{R}^2$ such that $\|x\| = 1$ define
$$g_1(x) = \max\{\langle x,y\rangle\mid (y,z)\in D\}$$
$$g_2(x) = \min\{\langle x,y\rangle\mid (y,z)\in D\}.$$
Then the width of $\mathrm{pr}D$ in the direction $x$ is
$$g(x) = g_1(x) - g_2(x).$$
On the other hand, since $Hx = (-x_2, x_1)^T$ is the vector orthogonal to $x$ such that $\|Hx\| = 1$ if $\|x\| = 1$, the width of $\mathrm{pr}D$ in the direction orthogonal to $x$ is $g(Hx) = g(-x_2, x_1)$. Hence the product $g(x)\cdot g(Hx)$ measures the area of the smallest rectangle containing $\mathrm{pr}D$ and having a side parallel to $x$. The problem can then be formulated as
$$\min\{g(x)\cdot g(Hx)\mid x\in\mathbb{R}^2_+,\ \|x\| = 1\}.\qquad(9.75)$$

Lemma 9.8 The function $g(x)$ is convex and satisfies $g(\lambda x) = \lambda g(x)$ for any $\lambda\ge 0$.

Proof Clearly $g_1(x)$ is convex as the pointwise maximum of a family of linear functions, while $g_2(x)$ is concave as the pointwise minimum of a family of linear functions. It is also obvious that $g_i(\lambda x) = \lambda g_i(x)$, $i = 1,2$. Hence, the conclusion. □
Since $H: \mathbb{R}^2\to\mathbb{R}^2$ is a linear map, it follows that $g(Hx)$ is also convex and that $g(H(\lambda x)) = \lambda g(Hx)$ for every $\lambda\ge 0$. Exploiting this property, a successive underestimation method for solving problem (9.75) has been proposed in Kuno (1993). In view of the above established property, problem (9.75) is actually equivalent to the convex multiplicative program
$$\min\{g(x)\cdot g(Hx)\mid x\in\mathbb{R}^2_+,\ \|x\|\le 1\},\qquad(9.76)$$
and so can be transformed into the concave minimization problem
$$\begin{array}{ll}\min & \sqrt{t_1 t_2}\\ \text{s.t.} & g(x)\le t_1,\ g(Hx)\le t_2\\ & x\in\mathbb{R}^2_+,\ \|x\|\le 1,\end{array}$$
or, alternatively,
$$\min\{\varphi(\theta_1,\theta_2)\mid \theta_1\theta_2\ge 1,\ \theta\in\mathbb{R}^2_+\},\qquad(9.77)$$
where
$$\varphi(\theta_1,\theta_2) = \min\{\theta_1 g(x) + \theta_2 g(Hx)\mid x\ge 0,\ \|x\|\le 1\}.\qquad(9.78)$$

In the special case when $D$ is a polytope given by a system of linear inequalities $Ay + Bz\le b$ the problem can be solved by an efficient parametric method (Kuno 1993). For any point $x\in\mathbb{R}^2_+$ with $\|x\| = 1$, let $\pi(x) = (\theta, 1-\theta)$ be the point where the halfline from 0 through $x$ intersects the line segment joining $(1,0)$ and $(0,1)$. Since obviously $\|\pi(x)\|^2 = \theta^2 + (1-\theta)^2$, using Lemma 9.8 problem (9.76) becomes
$$\min\{F(\theta)\mid \theta\in[0,1]\},\qquad(9.79)$$
where
$$F(\theta) = \frac{g(\theta, 1-\theta)\cdot g(\theta-1, \theta)}{\theta^2 + (1-\theta)^2}.$$

Furthermore, $g(\theta, 1-\theta) = g_1(\theta, 1-\theta) - g_2(\theta, 1-\theta)$ with
$$g_1(\theta, 1-\theta) = \max\{\theta y_1 + (1-\theta)y_2\mid Ay + Bz\le b\}$$
$$g_2(\theta, 1-\theta) = \min\{\theta y_1 + (1-\theta)y_2\mid Ay + Bz\le b\}.$$
So each $g_i(\theta, 1-\theta)$, $i = 1,2$, is the optimal value of a linear parametric program. Let $0, \theta^{1i}_1, \dots, \theta^{1i}_{p_{1i}}, 1$ be the sequence of breakpoints of $g_i(\theta, 1-\theta)$, $i = 1,2$. Analogously, let $0, \theta^{2i}_1, \dots, \theta^{2i}_{p_{2i}}, 1$ be the sequence of breakpoints of $g_i(\theta-1, \theta)$. Finally let
$$0 = \theta_0, \theta_1, \theta_2, \dots, \theta_{N-1}, \theta_N = 1\qquad(9.80)$$
be the union of all these four sequences rearranged in increasing order.

Proposition 9.6 A global minimum of the function $F(\theta)$ over $[0,1]$ exists among the points $\theta_s$, $s = 0,\dots,N$.
Proof For any interval $[\theta_s, \theta_{s+1}]$, there exist two vectors $u = x^1 - x^2$, $v = x^3 - x^4$, where $x^i$, $i = 1,\dots,4$, are vertices of $D$, such that for every $x\in\mathbb{R}^2_+$ of length $\|x\| = 1$, with $\pi(x) = (\theta, 1-\theta)$, $\theta\in(\theta_s, \theta_{s+1})$, we have $g(x) = \langle x,u\rangle$, $g(Hx) = \langle Hx,v\rangle$. Then
$$F(\theta) = (\|u\|\cos\alpha)\cdot(\|v\|\cos\beta),$$
where $\alpha = \operatorname{angle}(x,u)$ (angle between the vectors $x$ and $u$) and $\beta = \operatorname{angle}(Hx,v)$. Noting that $\operatorname{angle}(x,u) + \operatorname{angle}(u,v) + \operatorname{angle}(v,y) = \operatorname{angle}(x,y)$, we deduce
$$\beta = \alpha + \omega - \pi/2,\qquad(9.81)$$
where $\omega = \operatorname{angle}(u,v)$. Thus, for all $\theta\in(\theta_s, \theta_{s+1})$:
$$F(\theta) = (\|u\|\cos\alpha)(\|v\|\sin(\alpha+\omega)).$$
Setting $G(\alpha) = \cos\alpha\,\sin(\alpha+\omega)$, we have $G'(\alpha) = -\sin\alpha\sin(\alpha+\omega) + \cos\alpha\cos(\alpha+\omega) = \cos(2\alpha+\omega)$. If $F(\cdot)$ attains a minimum at $\bar\theta$, i.e., $G(\cdot)$ attains a minimum at $\bar\alpha$, then one must have
$$\cos(2\bar\alpha+\omega) = 0,\ \text{hence}\ 2\bar\alpha+\omega = \pi/2\pm\pi.\qquad(9.82)$$
But from (9.81) it follows that $\bar\alpha+\omega = \bar\beta+\pi/2$, hence $2\bar\alpha+\omega = \bar\alpha+\bar\beta+\pi/2$. In view of (9.82) this in turn implies that
$$\bar\alpha+\bar\beta = \pm\pi,$$
contradicting the fact that $0 < |\bar\alpha| < \pi/2$, $0 < |\bar\beta| < \pi/2$ (because $\langle x,u\rangle > 0$, $\langle Hx,v\rangle > 0$). Therefore, the minimum of $F(\theta)$ over an interval $[\theta_s, \theta_{s+1}]$ cannot be attained at any $\theta$ such that $\theta_s < \theta < \theta_{s+1}$, but must be attained at either $\theta_s$ or $\theta_{s+1}$. Hence the global minimum over the entire segment $[0,1]$ is attained at one point of the sequence $\theta_0, \theta_1, \dots, \theta_N$. □
Corollary 9.3 An optimal solution of (9.79) is
$$x^* = \frac{(\theta^*, 1-\theta^*)}{\sqrt{(\theta^*)^2 + (1-\theta^*)^2}},\qquad \theta^*\in\operatorname{argmin}\{F(\theta_s) : s = 0,1,\dots,N\}.$$
Proof Indeed, $x^* = \pi(x^*)/\|\pi(x^*)\|$ and $\pi(x^*) = (\theta^*, 1-\theta^*)$. □
9.7 Reverse Convex Constraints

In the previous sections we were concerned with low rank nonconvex problems with nonconvex variables in the objective function. We now turn to low rank nonconvex problems with nonconvex variables in the constraints. Consider the following monotonic reverse convex problem
$$(\mathrm{MRP})\qquad \min\{\langle c,x\rangle\mid x\in D,\ h(x)\le 0\},$$
where $c\in\mathbb{R}^n$, $D$ is a closed convex set in $\mathbb{R}^n$, and the function $h: X\to\mathbb{R}$ is quasiconcave and $K$-monotonic on a closed convex set $X\supset D$. In this section we shall present decomposition methods for (MRP) by specializing the algorithms for reverse convex programs studied in Chaps. 6 and 7.

As in Sect. 9.2 assume that $K\subset\operatorname{rec}X$, with lineality $L = \{x\mid Qx = 0\}$, where $Q$ is an $r\times n$ matrix with $r$ linearly independent rows $c^1,\dots,c^r$. By quasiconcavity of $h(x)$ the set $C = \{x\in X\mid h(x)\ge 0\}$ is convex and by Proposition 9.2 $K\subset\operatorname{rec}C$. Assume further that $D = \{x\mid Ax\le b,\ x\ge 0\}$ is a polytope with a vertex at 0 and that:

(a) $\min\{\langle c,x\rangle\mid x\in D,\ h(x)\le 0\} > 0$, so that $h(0) > 0$;
(b) The problem is regular, i.e.,
$$D\setminus\operatorname{int}C = \operatorname{cl}(D\setminus C).$$

9.7.1 Decomposition by Projection

By writing, as in Sect. 7.2 [see formulas (9.11) and (9.12)], $Q = [Q_B, Q_N]$, $x = \binom{x_B}{x_N}$, where $Q_B$ is an $r\times r$ nonsingular matrix, and setting
$$Z = \binom{Q_B^{-1}}{0},\qquad \varphi(y) = h(Zy)$$
we define a function $\varphi(y)$ such that $\varphi(y) = h(x)$ for all $x\in X$ satisfying $y = Qx$. The latter also implies that $\varphi(y)$ is quasiconcave on $Q(X)$.

Denote by $\psi(y)$ the optimal value of the linear program
$$\min\{\langle c,x\rangle\mid x\in D,\ Qx = y\}.$$
Then (MRP) can be rewritten as a problem in $y$:
$$\min\{\psi(y)\mid y\in Q(D),\ \varphi(y)\le 0\}.\qquad(9.83)$$
Since it is well known that $\psi(y)$ is a convex piecewise affine function, (9.83) is a reverse convex program in the $r$-dimensional space $Q(\mathbb{R}^n)$. If $r$ is not too large this program can be solved in reasonable time by outer approximation or branch and bound.

9.7.2 Outer Approximation

Following the OA Algorithm for (CDC) (Sect. 6.3), we construct a sequence of polytopes $P_1\supset P_2\supset\cdots$ outer approximating the convex set
$$\Omega = \{y\in Q(D)\mid \psi(y)\le\bar\gamma\},$$
where $\bar\gamma$ is the optimal value of (MRP). Also we construct a sequence $+\infty\ge\gamma_1\ge\gamma_2\ge\cdots$ such that
$$\gamma_k < +\infty\ \Rightarrow\ \exists y^k\in Q(D),\ \varphi(y^k)\le 0,\ \psi(y^k) = \gamma_k;$$
$$\{y\in Q(D)\mid \psi(y)\le\gamma_k\}\subset P_k.$$
At iteration $k$ we select (by vertex enumeration)
$$y^k\in\operatorname{argmin}\{\varphi(y)\mid y\in P_k\}$$
(note that this requires $\varphi(y)$ to be defined on $P_1$). If $\varphi(y^k)\ge 0$ then (MRP) is infeasible when $\gamma_k = +\infty$, or $y^k$ is optimal when $\gamma_k < +\infty$. If $\varphi(y^k) < 0$, then $y^k\notin\Omega$ and we construct an affine inequality strictly separating $y^k$ from $\Omega$. The polytope $P_{k+1}$ is defined by appending this inequality to $P_k$.

The key issue in this OA scheme is, given a point $y^k$ such that $\varphi(y^k) < 0$, to construct an affine function $l(y)$ satisfying
$$l(y^k) > 0,\qquad l(y)\le 0\ \forall y\in\Omega.\qquad(9.84)$$
Note that we assume (a) and $0\in D$, so
$$\varphi(0) > 0,\qquad \psi(0) < \min\{\psi(y)\mid y\in Q(D),\ \varphi(y)\le 0\}.\qquad(9.85)$$
Now, since $\varphi(y^k) < 0 < \varphi(0)$ and $\gamma_k - \psi(y^k)\le 0 < \gamma_k - \psi(0)$, one can compute $z^k\in[0, y^k]$ satisfying $\min\{\varphi(z^k),\ \gamma_k - \psi(z^k)\} = 0$. Consider the linear program
$$(\mathrm{SP}(z^k))\qquad \max\{\langle z^k,v\rangle - \langle b,w\rangle\mid Q^Tv - A^Tw\le c,\ w\ge 0\}.$$

Proposition 9.7
(i) If SP($z^k$) has a (finite) optimal solution $(v^k, w^k)$ then $z^k\in Q(D)$ and (9.84) is satisfied by the affine function
$$l(y) := \langle v^k, y - z^k\rangle.\qquad(9.86)$$
(ii) If SP($z^k$) has no finite optimal solution, so that the cone $Q^Tv - A^Tw\le 0,\ w\ge 0$ has an extreme direction $(v^k, w^k)$ such that $\langle z^k, v^k\rangle - \langle b, w^k\rangle > 0$, then $z^k\notin Q(D)$ and (9.84) is satisfied by the affine function
$$l(y) := \langle y, v^k\rangle - \langle b, w^k\rangle.\qquad(9.87)$$

Proof If SP($z^k$) has an optimal solution $(v^k, w^k)$ then its dual
$$(\mathrm{SP}^*(z^k))\qquad \min\{\langle c,x\rangle\mid Ax\le b,\ Qx = z^k,\ x\ge 0\}$$
is feasible, hence $z^k\in Q(D)$. It is easy to see that $v^k\in\partial\psi(z^k)$. Indeed, for any $y$, since $\psi(y)$ is the optimal value in SP$^*(y)$, it must also be the optimal value in SP$(y)$; i.e., $\psi(y)\ge\langle y, v^k\rangle - \langle b, w^k\rangle$, hence $\psi(y) - \psi(z^k)\ge\langle y, v^k\rangle - \langle b, w^k\rangle - \langle z^k, v^k\rangle + \langle b, w^k\rangle = \langle v^k, y - z^k\rangle$, proving the claim. For every $y\in\Omega$ we have $\psi(y)\le\bar\gamma\le\psi(z^k)$, so $l(y) := \langle v^k, y - z^k\rangle\le\psi(y) - \psi(z^k)\le 0$. On the other hand, $y^k = \lambda z^k = \lambda z^k + (1-\lambda)0$ for some $\lambda\ge 1$, hence $l(y^k) = \lambda l(z^k) + (1-\lambda)l(0) > 0$ because $l(0)\le\psi(0) - \psi(z^k) < 0$, while $l(z^k) = 0$.

To prove (ii), observe that if SP($z^k$) is unbounded, then SP$^*(z^k)$ is infeasible, i.e., $z^k\notin Q(D)$. Since the recession cone of the feasible set of SP($z^k$) is the cone $Q^Tv - A^Tw\le 0,\ w\ge 0$, SP($z^k$) may be unbounded only if an extreme direction $(v^k, w^k)$ of this cone satisfies $\langle z^k, v^k\rangle - \langle b, w^k\rangle > 0$. Then the affine function (9.87) satisfies $l(z^k) > 0$ while for any $y\in Q(D)$, SP$^*(y)$ (hence SP$(y)$) must have a finite optimal value, and consequently, $\langle y, v^k\rangle - \langle b, w^k\rangle\le 0$, i.e., $l(y)\le 0$. □
The convergence of an OA method for (MRP) based on the above proposition follows from general theorems established in Sect. 6.3.

Remark 9.7 In the above approach we only used the constancy of $h(x)$ on every manifold $\{x\mid Qx = y\}$. In fact, since $\varphi(y')\le\varphi(y)$ for all $y'\ge y$, problem (MRP) can also be written as
$$\min\{\psi(y)\mid x\in D,\ Qx\le y,\ \varphi(y)\le 0\}.\qquad(9.88)$$
Therefore, SP($z^k$) could be replaced by
$$\max\{\langle z^k,v\rangle - \langle b,w\rangle\mid Q^Tv - A^Tw\le c,\ v\ge 0,\ w\ge 0\}.$$

9.7.3 Branch and Bound

We describe a conical branch and bound procedure. A simplicial and a rectangular algorithm for the case when $\varphi(y)$ is separable can be developed similarly.

From assumptions (a), (b) we have that
$$0\in Q(D),\quad \varphi(0) > 0,\quad \psi(0) < \min\{\psi(y)\mid y\in Q(D),\ \varphi(y)\le 0\}.\qquad(9.89)$$
A conical algorithm for solving (9.83) starts with an initial cone $M_0$ contained in $Q(X)$ and containing the feasible set $Q(D)\cap\{y\mid \varphi(y)\le 0\}$.

Given any cone $M\subset Q(\mathbb{R}^n)$ of base $[u^1,\dots,u^k]$, let $\theta_i u^i$ be the point where the $i$-th edge of $M$ intersects the surface $\varphi(y) = 0$. If $U$ denotes the matrix of columns $u^1,\dots,u^k$ then $M = \{Ut\mid t\in\mathbb{R}^k,\ t\ge 0\}$ and $\sum_{i=1}^{k} t_i/\theta_i = 1$ is the equation of the hyperplane passing through the intersections of the edges of $M$ with the surface $\varphi(y) = 0$. Denote by $\beta(M)$ and $t(M)$ the optimal value and a basic optimal solution of the linear program
$$\mathrm{LP}(M)\qquad \min\Big\{\langle c,x\rangle\ \Big|\ x\in D,\ Qx = \sum_{i=1}^{k} t_i u^i,\ \sum_{i=1}^{k}\frac{t_i}{\theta_i}\ge 1,\ t\ge 0\Big\}.$$

Proposition 9.8 $\beta(M)$ gives a lower bound for $\psi(y)$ over the set of feasible points in $M$. If $\omega(M) = Ut(M)$ lies on an edge of $M$ then
$$\beta(M) = \min\{\psi(y)\mid y\in Q(D)\cap M,\ \varphi(y)\le 0\}.\qquad(9.90)$$

Proof The first part of the proposition follows from the inclusion $\{y\in Q(D)\cap M\mid \varphi(y)\le 0\}\subset\{y\in Q(D)\cap M\mid \sum_{i=1}^{k} t_i/\theta_i\ge 1\}$ and the fact that $\min\{\psi(y)\mid y\in Q(D)\cap M,\ \varphi(y)\le 0\} = \min\{\langle c,x\rangle\mid x\in D,\ Qx = y,\ y\in M,\ \varphi(y)\le 0\}$. On the other hand, if $\omega(M)$ lies on an edge of $M$ it must lie on the portion of this edge outside the convex set $\{\varphi(y)\ge 0\}$, hence $\varphi(\omega(M))\le 0$. Consequently, $\omega(M)$ is feasible to the minimization problem in (9.90) and hence is an optimal solution of it. □

It can be proved that a conical branch and bound procedure using $\omega$-subdivision or $\omega$-bisection (see Sect. 7.1) with the above bounding is guaranteed to converge.
Example 9.10
$$\begin{array}{ll}\min & 2x_1 + x_2 + 0.5x_3\\ \text{s.t.} & 2x_1 + x_3 - x_4\le 2.5\\ & x_1 - 3x_2 + x_4\le 2\\ & x_1 + x_2\le 2\\ & x_1, x_2, x_3, x_4\ge 0\\ & h(x) := (3x_1+6x_2+8x_3)^2 - (4x_1+5x_2+x_4)^2 + 154\le 0.\end{array}$$
Here the function $h(x)$ is concave and monotonic with respect to the cone $K = \{y\mid c^1y = 0,\ c^2y = 0\}$, where $c^1 = (3,6,8,0)$, $c^2 = (4,5,0,1)$ (cf. Example 7.2). So $Q$ is the matrix of the two rows $c^1, c^2$ and $Q(D) = \{y = Qx,\ x\in D\}$ is contained in the cone $M_0 = \{y\mid y\ge 0\}$ ($D$ denotes the feasible set). Since the optimal solution $0\in\mathbb{R}^4$ of the underlying linear program satisfies $h(0) > 0$, conditions (9.89) hold, and $M_0 = \operatorname{cone}\{e^1, e^2\}$ can be chosen as an initial cone. The conical algorithm terminates after 4 iterations, yielding an optimal solution of $(0, 0, 1.54, 2.0)$ and the optimal value $0.76$. Note that branching is performed in $\mathbb{R}^2$ ($t$-space) rather than in the original space $\mathbb{R}^4$.

9.7.4 Decomposition by Polyhedral Annexation

From $L = \{x\mid Qx = 0\}\subset K\subset\operatorname{rec}C$ we have
$$C^\circ\subset K^\circ\subset L^\perp,\qquad(9.91)$$
where $L^\perp$ (the orthogonal complement of $L$) is the space generated by the rows $c^1,\dots,c^r$ of $Q$ (assuming $\operatorname{rank}Q = r$). For every $\gamma\in\mathbb{R}\cup\{+\infty\}$ let $D_\gamma = \{x\in D\mid \langle c,x\rangle\le\gamma\}$. By Theorem 7.6 the value $\gamma$ is optimal if and only if
$$D_\gamma\setminus\operatorname{int}C\ne\emptyset,\qquad D_\gamma\subset C.\qquad(9.92)$$
Following the PA Method for (LRC) (Sect. 8.1), to solve (MRP) we construct a nested sequence of polyhedrons $P_1\subset P_2\subset\cdots$ together with a nonincreasing sequence of real numbers $\gamma_1\ge\gamma_2\ge\cdots$ such that $D_{\gamma_k}\setminus\operatorname{int}C\ne\emptyset$, $C^\circ\subset P_k\subset L^\perp$ $\forall k$, and eventually $P_l\subset D_{\gamma_l}^\circ$ for some $l$: then $D_{\gamma_l}\subset P_l^\circ\subset C$, hence $\gamma_l$ is optimal by the criterion (9.92).

To exploit monotonicity [which implies (9.91)], the initial polytope $P_1$ is taken to be the $r$-simplex constructed as in Lemma 9.1 (note that $0\in\operatorname{int}C$ by assumption (a), so this construction is possible). Let $\pi: L^\perp\to\mathbb{R}^r$ be the linear map $y = \sum_{i=1}^{r} t_ic^i\mapsto\pi(y) = t$. For any set $E\subset L^\perp$ denote $\tilde E = \pi(E)\subset\mathbb{R}^r$. Then by Lemma 9.1 $\tilde P_1$ is the simplex in $\mathbb{R}^r$ defined by the inequalities
$$\sum_{i=1}^{r} t_i\langle c^i, c^j\rangle\le\frac{1}{\alpha_j},\qquad j = 0, 1,\dots,r$$
where $c^0 = -\sum_{i=1}^{r} c^i$, $\alpha_j = \sup\{\alpha\mid \alpha c^j\in C\}$. In the algorithm below, when a better feasible solution $x^k$ than the incumbent has been found, we can compute a feasible point $\hat x^k$ at least as good as $x^k$ and lying on an edge of $D$ (see Sect. 5.7). This can be done by carrying out a few steps of the simplex algorithm. We shall refer to $\hat x^k$ as a point obtained by local improvement from $x^k$.
PA Algorithm for (MRP)

Initialization. Choose a vertex $x^0$ of $D$ such that $h(x^0) > 0$ and set $D\leftarrow D - x^0$, $C\leftarrow C - x^0$. Let $x^1$ = best feasible solution available, $\gamma_1 = \langle c, x^1\rangle$ ($x^1 = \emptyset$, $\gamma_1 = +\infty$ if no feasible solution is available). Let $P_1$ be the simplex constructed as in Lemma 9.1, $V_1$ the vertex set of $\tilde P_1$. Set $k = 1$.

Step 1. For every $t = (t_1,\dots,t_r)^T\in V_k$ solve the linear program
$$\max\Big\{\sum_{i=1}^{r} t_i\langle c^i, x\rangle\ \Big|\ x\in D,\ \langle c,x\rangle\le\gamma_k\Big\}$$
to obtain its optimal value $\delta(t)$. (Only the new $t\in V_k$ should be considered if this step is entered from Step 3.) If $\delta(t)\le 1$ $\forall t\in V_k$, then terminate: if $\gamma_k < +\infty$, $x^k$ is an optimal solution; otherwise (MRP) is infeasible.

Step 2. If $\delta(t^k) > 1$ for some $t^k\in V_k$, then let $x^k$ be a basic optimal solution of the corresponding linear program. If $x^k\notin C$, then let $\hat x^k$ be the solution obtained by local improvement from $x^k$; reset $x^k = \hat x^k$, $\gamma_k = \langle c, \hat x^k\rangle$, and return to Step 1.

Step 3. If $x^k\in C$ then set $x^{k+1} = x^k$, $\gamma_{k+1} = \gamma_k$; compute $\theta_k = \sup\{\theta\mid \theta x^k\in C\}$ and define
$$\tilde P_{k+1} = \tilde P_k\cap\Big\{t\ \Big|\ \sum_{i=1}^{r} t_i\langle x^k, c^i\rangle\le\frac{1}{\theta_k}\Big\}.$$
From $V_k$ derive the vertex set $V_{k+1}$ of $\tilde P_{k+1}$. Set $k\leftarrow k+1$ and return to Step 1.
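Each pass through Step 1 is an ordinary linear program. The following is a minimal sketch of the evaluation of $\delta(t)$ with scipy, where $A, b$ describe $D = \{x\mid Ax\le b,\ x\ge 0\}$ and $C$ holds the rows $c^1,\dots,c^r$; all data shapes are illustrative assumptions, not the book's notation.

```python
# Sketch of Step 1: delta(t) = max { sum_i t_i <c^i, x> : Ax <= b, x >= 0,
# <c, x> <= gamma }.  Infeasibility handling is omitted for brevity.
import numpy as np
from scipy.optimize import linprog

def delta(t, C, c, A, b, gamma):
    obj = -(C.T @ t)                         # linprog minimizes, so negate
    A_ub = np.vstack([A, c])                 # append the cut <c, x> <= gamma
    b_ub = np.append(b, gamma)
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * A.shape[1])
    return -res.fun, res.x                   # optimal value delta(t) and x
```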

Proposition 9.9 The above PA Algorithm for (MRP) is finite.

Proof Immediate, since this is a mere specialization of the PA Algorithm for (LRCP) in Sect. 6.4 to (MRP). □

Remark 9.8 If $K = \{x\mid Qx\le 0\}$ then $P_1$ can be constructed as in Lemma 9.2. The above algorithm assumes regularity of the problem (assumption (b)). Without this assumption, we can replace $C$ by $C_\varepsilon = \{x\in X\mid h(x)\ge\varepsilon\}$. Applying the above algorithm with $C\leftarrow C_\varepsilon$ will yield an $\varepsilon$-approximate optimal solution (cf. Sect. 7.3), i.e., a point $x_\varepsilon$ satisfying $x_\varepsilon\in D$, $h(x_\varepsilon)\le\varepsilon$, and $\langle c, x_\varepsilon\rangle\le\min\{\langle c,x\rangle\mid x\in D,\ h(x)\le 0\}$.

9.8 Network Constraints

Consider a special concave programming problem of the form:
$$(\mathrm{SCP})\qquad\begin{array}{ll}\min & g(y_1,\dots,y_r) + \langle c,x\rangle\\ \text{s.t.} & y\in Y\qquad(9.93)\\ & Qx = y,\ Bx = d,\ x\ge 0,\qquad(9.94)\end{array}$$
where $g: \mathbb{R}^r_+\to\mathbb{R}_+$ is a concave function, $Y$ a polytope in $\mathbb{R}^r_+$, $c, x\in\mathbb{R}^n$, $d\in\mathbb{R}^m$, $Q$ an $r\times n$ matrix of rank $r$, and $B$ an $m\times n$ matrix. In an economic interpretation, $y = (y_1,\dots,y_r)^T$ may denote a production program to be chosen from a set $Y$ of technologically feasible production programs, $x$ a distribution-transportation program to be determined so as to meet the requirements (9.94). The problem is then to find a production-distribution-transportation program, satisfying conditions (9.93) and (9.94), with a minimum cost. A variant of this problem is the classical concave production-transportation problem
$$(\mathrm{PTP})\qquad\begin{array}{ll}\min & g(y) + \sum_{i=1}^{r}\sum_{j=1}^{m} c_{ij}x_{ij}\\ \text{s.t.} & \sum_{j=1}^{m} x_{ij} = y_i\quad i=1,\dots,r\\ & \sum_{i=1}^{r} x_{ij} = d_j\quad j=1,\dots,m\\ & x_{ij}\ge 0,\quad i=1,\dots,r,\ j=1,\dots,m\end{array}$$
where $y_i$ is the production level to be determined for factory $i$ and $x_{ij}$ the amount to be sent from factory $i$ to warehouse $j$ in order to meet the demands $d_1,\dots,d_m$ of the warehouses. Another variant of SCP is the minimum concave cost network flow problem which can be formulated as
$$(\mathrm{MCCNFP})\qquad\begin{array}{ll}\min & \sum_{i=1}^{r} g_i(x_i) + \sum_{i=r+1}^{n} c_ix_i\\ \text{s.t.} & Ax = d,\ x\ge 0,\end{array}$$
where $A$ is the node-arc incidence matrix of a given directed graph with $n$ arcs, $d$ the vector of node demands (with supplies interpreted as negative demands), and $x_i$ the flow value on arc $i$. Nodes $j$ with $d_j > 0$ are the sinks, nodes $j$ with $d_j < 0$ are the sources, and it is assumed that $\sum_j d_j = 0$ (balance condition). The cost of shipping $t$ units along arc $i$ is a nonnegative concave nondecreasing function $g_i(t)$ for $i = 1,\dots,r$, and a linear function $c_it$ (with $c_i\ge 0$) for $i = r+1,\dots,n$. To cast (MCCNFP) into the form (SCP) it suffices to rewrite it as
$$\begin{array}{ll}\min & \sum_{i=1}^{r} g_i(y_i) + \sum_{i=r+1}^{n} c_ix_i\qquad(9.95)\\ \text{s.t.} & x_i = y_i\ (i=1,\dots,r),\ Ax = d,\ x\ge 0.\end{array}$$
Obviously, Problem (SCP) is equivalent to
$$\begin{array}{ll}\min & g(y) + t\\ \text{s.t.} & \langle c,x\rangle\le t,\ Qx = y,\ Bx = d,\ x\ge 0,\ y\in Y.\end{array}\qquad(9.96)$$
Since this is a concave minimization problem with few nonconvex variables, it can be handled by the decomposition methods discussed in the previous Sects. 7.2 and 7.3.

Note that in both (PTP) and (MCCNFP) the constraints (9.94) have a nice structure such that when $y$ is fixed the problem reduces to a special linear transportation problem (for (PTP)) or a linear min-cost flow problem (for (MCCNFP)), which can be solved by very efficient specialized algorithms. Therefore, to take full advantage of the structure of (9.94) it is important to preserve this structure under the decomposition.

9.8.1 Outer Approximation

This method is initialized from a polytope $P_1 = Y\times[\alpha, \beta]$, where $\alpha, \beta$ are, respectively, a lower bound and an upper bound of $\langle c,x\rangle$ over the polytope $\{Bx = d,\ x\ge 0\}$. At iteration $k$, we have a polytope $P_k\subset P_1$ containing all $(y,t)$ for which there exists $x\ge 0$ satisfying (9.96). Let
$$(y^k, t_k)\in\operatorname{argmin}\{t + g(y)\mid (y,t)\in P_k\}.$$
Following Remark 7.1, to check whether $(y^k, t_k)$ is feasible to (9.96) we solve the pair of dual programs
$$\min\{\langle c,x\rangle\mid Qx = y^k,\ Bx = d,\ x\ge 0\}\qquad(9.97)$$
$$\max\{\langle y^k,u\rangle + \langle d,v\rangle\mid Q^Tu + B^Tv\le c\}\qquad(9.98)$$
obtaining an optimal solution $x^k$ of (9.97) and an optimal solution $(u^k, v^k)$ of the dual (9.98). If $\langle c, x^k\rangle\le t_k$ then $(x^k, y^k)$ solves (SCP). Otherwise, we define
$$P_{k+1} = P_k\cap\{(y,t)\mid \langle y, u^k\rangle + \langle d, v^k\rangle\le t\}\qquad(9.99)$$
and repeat the procedure with $k\leftarrow k+1$.

In the particular case of (PTP) the subproblem (9.97) is a linear transportation problem obtained from (PTP) by fixing $y = y^k$, while in the case of (MCCNFP) the subproblem (9.97) is a linear min-cost network flow problem obtained from (MCCNFP) by fixing $x_i = y_i^k$ for $i = 1,\dots,r$ (or equivalently, by deleting the arcs $i = 1,\dots,r$ and accordingly modifying the demands of the nodes which are their endpoints).
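The cut (9.99) only needs the optimal duals of (9.97), which most LP solvers return. A minimal sketch with scipy's HiGHS interface follows; the attribute names and the sign convention of the returned marginals are solver-specific assumptions and should be checked against (9.98) before use.

```python
# Sketch of one OA iteration (9.97)-(9.99): fix y = y^k, solve the inner LP,
# and read off dual multipliers (u^k, v^k) defining the cut
# <y, u^k> + <d, v^k> <= t.  Q, B, c, d, yk are assumed sample data.
import numpy as np
from scipy.optimize import linprog

def oa_cut(Q, B, c, d, yk):
    # inner LP (9.97): min <c, x>  s.t.  Q x = y^k, B x = d, x >= 0
    A_eq = np.vstack([Q, B])
    b_eq = np.concatenate([yk, d])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * len(c))
    duals = res.eqlin.marginals      # sensitivities w.r.t. the rhs (y^k, d)
    u, v = duals[: len(yk)], duals[len(yk):]
    return res.fun, u, v             # <c, x^k> and the cut coefficients
```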

9.8.2 Branch and Bound Method

Assume that in problem (SCP) the function $g(y)$ is concave separable, i.e., $g(y) = \sum_{i=1}^{r} g_i(y_i)$. For each rectangle $M = [a,b]$ let
$$l_M(y) = \sum_{i=1}^{r} l_{i,M}(y_i)$$
where $l_{i,M}(t)$ is the affine function that agrees with $g_i(t)$ at the endpoints $a_i, b_i$ of the segment $[a_i, b_i]$, i.e.,
$$l_{i,M}(t) = g_i(a_i) + \frac{g_i(b_i) - g_i(a_i)}{b_i - a_i}(t - a_i)\qquad(9.100)$$
(see Proposition 7.4). Then a lower bound $\beta(M)$ for the objective function in (SCP) over $M$ is provided by the optimal value in the linear program
$$(\mathrm{LP}(M))\qquad \min\Big\{\sum_{i=1}^{r} l_{i,M}(y_i) + \langle c,x\rangle\ \Big|\ Qx = y,\ Bx = d,\ x\ge 0\Big\}.$$
Using this lower bound and a rectangular $\omega$-subdivision rule, a convergent rectangular algorithm has been developed in Horst and Tuy (1996) which is a refined version of an earlier algorithm by Soland (1974) and should be able to handle even large scale (PTP) provided $r$ is relatively small. Also note that if the data are all integral, then it is well known that a basic optimal solution of the linear transportation problem LP($M$) always exists with integral values. Consequently, in this case, starting from an initial rectangle with integral vertices the algorithm will generate only subrectangles with integral vertices, and so it will necessarily be finite.

In practice, aside from production and transportation costs there may also be shortage/holding costs due to a failure to meet the demands exactly. Then we have the following problem (cf. Example 5.1) which includes a formulation of the stochastic transportation-location problem as a special case:
$$(\mathrm{GPTP})\qquad\begin{array}{ll}\min & \langle c,x\rangle + \sum_{i=1}^{r} g_i(y_i) + \sum_{j=1}^{m} h_j(z_j)\\ \text{s.t.} & \sum_{j=1}^{m} x_{ij} = y_i\quad (i=1,\dots,r)\\ & \sum_{i=1}^{r} x_{ij} = z_j\quad (j=1,\dots,m)\\ & 0\le y_i\le s_i\ \forall i,\quad x_{ij}\ge 0\ \forall i,j,\end{array}$$
where $g_i(y_i)$ is the cost of producing $y_i$ units at factory $i$ and $h_j(z_j)$ is the shortage/holding cost to be incurred if the warehouse $j$ receives $z_j\ne d_j$ (demand of warehouse $j$). It is assumed that $g_i$ is a concave nondecreasing function satisfying $g_i(0) = 0\le g_i(0^+)$ (so possibly $g_i(0^+) > 0$ as, e.g., for a fixed charge), while $h_j: [0,s]\to\mathbb{R}_+$ is a convex function ($s = \sum_{i=1}^{r} s_i$), such that $h_j(d_j) = 0$, and the right derivative of $h_j$ at 0 has a finite absolute value $\kappa_j$. The objective function is then a d.c. function, which is a new source of difficulty. However, since the problem becomes convex when $y$ is fixed, it can be handled by a branch and bound algorithm in which branching is performed upon $y$ (Holmberg and Tuy 1995).

For any rectangle $M = [a,b]\subset Y := \{y\in\mathbb{R}^r_+\mid 0\le y_i\le s_i,\ i=1,\dots,r\}$ let as previously $l_{i,M}(t)$ be defined by (9.100) and $l_M(y) = \sum_{i=1}^{r} l_{i,M}(y_i)$. Then a lower bound of the objective function over the feasible points in $M$ is given by the optimal value $\beta(M)$ in the convex program
$$\mathrm{CP}(M)\qquad\begin{array}{ll}\min & \langle c,x\rangle + l_M(y) + \sum_{j=1}^{m} h_j(z_j)\\ \text{s.t.} & \sum_{j=1}^{m} x_{ij} = y_i\quad (i=1,\dots,r)\\ & \sum_{i=1}^{r} x_{ij} = z_j\quad (j=1,\dots,m)\\ & a_i\le y_i\le b_i\ \forall i,\quad x_{ij}\ge 0\ \forall i,j.\end{array}$$
Note that if an optimal solution $(x(M), y(M), z(M))$ of this convex program satisfies $l_M(y(M)) = g(y(M))$ then $\beta(M)$ equals the minimum of the objective function over the feasible solutions in $M$. When $y$ is fixed, (GPTP) is a convex program in $x, z$. If $\varphi(y)$ denotes the optimal value of this program then the problem amounts to minimizing $\varphi(y)$ over all feasible $(x,y,z)$.
Fig. 9.1 Consequence of concavity of $g_1(t)$ (graph of $g_1$ against $t$; figure omitted)

BB Algorithm for (GPTP)

Initialization. Start with a rectangle $M_1\supset Y$. Let $y^1$ be the best feasible solution available, INV $= \varphi(y^1)$. Set $\mathcal{P}_1 = \mathcal{S}_1 = \{M_1\}$, $k = 1$.

Step 1. For each $M\in\mathcal{P}_k$ solve the associated convex program CP($M$) obtaining its optimal value $\beta(M)$ and optimal solution $(x(M), y(M), z(M))$.

Step 2. Update INV and $y^k$.

Step 3. Delete all $M\in\mathcal{S}_k$ such that $\beta(M)\ge\mathrm{INV} - \varepsilon$. Let $\mathcal{R}_k$ be the collection of remaining rectangles.

Step 4. If $\mathcal{R}_k = \emptyset$, then terminate: $y^k$ is a global $\varepsilon$-optimal solution of (GPTP).

Step 5. Select $M_k\in\operatorname{argmin}\{\beta(M)\mid M\in\mathcal{R}_k\}$. Let $y^k = y(M_k)$,
$$i_k\in\arg\max_i\{g_i(y_i^k) - l_{i,M_k}(y_i^k)\}.\qquad(9.101)$$
If $g_{i_k}(y_{i_k}^k) - l_{i_k,M_k}(y_{i_k}^k) = 0$, then terminate: $y^k$ is a global optimal solution.

Step 6. Divide $M_k$ via $(y^k, i_k)$ (see Sect. 5.6). Let $\mathcal{P}_{k+1}$ be the partition of $M_k$ and $\mathcal{S}_{k+1} = (\mathcal{R}_k\setminus\{M_k\})\cup\mathcal{P}_{k+1}$. Set $k\leftarrow k+1$ and return to Step 1.
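Steps 5 and 6 are purely combinatorial once CP($M$) has been solved. A minimal sketch of the branching rule (9.101) and the rectangle split follows, with $g$ given as a list of cost oracles; the data layout is an illustrative assumption.

```python
# Sketch of Steps 5-6: pick the coordinate with the largest gap between g_i
# and its affine interpolant l_{i,M} at y = y(M_k), then split [a, b] there.
def l_affine(gi, ai, bi, t):
    """Affine function agreeing with g_i at a_i, b_i -- formula (9.100)."""
    if bi == ai:
        return gi(ai)
    return gi(ai) + (gi(bi) - gi(ai)) / (bi - ai) * (t - ai)

def branch(g, a, b, y):
    gaps = [g[i](y[i]) - l_affine(g[i], a[i], b[i], y[i]) for i in range(len(y))]
    ik = max(range(len(y)), key=lambda i: gaps[i])     # (9.101)
    if gaps[ik] == 0.0:
        return None                                    # y is globally optimal
    left = (a, [*b[:ik], y[ik], *b[ik + 1:]])          # child with b_ik = y_ik
    right = ([*a[:ik], y[ik], *a[ik + 1:]], b)         # child with a_ik = y_ik
    return ik, left, right
```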
To establish the convergence of this algorithm, recall that $y^k = y(M_k)$. Similarly denote $x^k = x(M_k)$, $z^k = z(M_k)$, so $(x^k, y^k, z^k)$ is an optimal solution of CP($M_k$). Suppose the algorithm is infinite and let $\bar y$ be any accumulation point of $\{y^k\}$, say $\bar y = \lim_{q\to\infty} y^{k_q}$. Without loss of generality we may assume $x^{k_q}\to\bar x$, $z^{k_q}\to\bar z$ and also $i_{k_q} = 1$ $\forall q$, so that
$$1\in\arg\max_i\{g_i(y_i^{k_q}) - l_{i,M_{k_q}}(y_i^{k_q})\}.\qquad(9.102)$$
Clearly if for some $q$ we have $y_1^{k_q} = 0$, then $g_1(y_1^{k_q}) = l_{1,M_{k_q}}(y_1^{k_q})$, and the algorithm terminates in Step 5. Therefore, since the algorithm is infinite, we must have $y_1^{k_q} > 0$ $\forall q$.

Lemma 9.9 If $g_1(0^+) > 0$ then
$$\bar y_1 = \lim_{q\to\infty} y_1^{k_q} > 0.\qquad(9.103)$$
In other words, $\bar y_1$ is always a continuity point of $g_1(t)$.

Proof Let $M_{k_q,1} = [a_1^q, b_1^q]$, so that $y_1^{k_q}\in[a_1^q, b_1^q]$ and $[a_1^q, b_1^q]$, $q = 1,2,\dots$, is a sequence of nested intervals. If $0\notin[a_1^{q_0}, b_1^{q_0}]$ for some $q_0$, then $0\notin[a_1^q, b_1^q]$ $\forall q\ge q_0$, and $y_1^{k_q}\ge a_1^{q_0} > 0$ $\forall q\ge q_0$, hence $\bar y_1\ge a_1^{q_0} > 0$, i.e., (9.103) holds. Thus, it suffices to consider the case
$$a_1^q = 0,\quad y_1^{k_q} > 0\quad\forall q.\qquad(9.104)$$
Recall that $\kappa_j$ denotes the absolute value of the right derivative of $h_j(t)$ at 0. Let
$$\kappa = \max_j\kappa_j - \min_j c_{1j}.\qquad(9.105)$$
Since $\sum_{j=1}^{m} x_{1j}^{k_q} = y_1^{k_q} > 0$, there exists $j_0$ such that $x_{1j_0}^{k_q} > 0$. Let $(\tilde x^{k_q}, \tilde y^{k_q}, \tilde z^{k_q})$ be a feasible solution to CP($M_{k_q}$) obtained from $(x^{k_q}, y^{k_q}, z^{k_q})$ by subtracting a very small positive amount $\delta < x_{1j_0}^{k_q}$ from each of the components $x_{1j_0}^{k_q}, y_1^{k_q}, z_{j_0}^{k_q}$, leaving unchanged all the other components. Then the production cost at factory 1 decreases by $l_{1,M_{k_q}}(y_1^{k_q}) - l_{1,M_{k_q}}(\tilde y_1^{k_q}) = g_1(b_1^{k_q})\delta/b_1^{k_q} > 0$; the transportation cost on the arc $(1,j_0)$ decreases by $c_{1j_0}\delta$; while the penalty incurred at warehouse $j_0$ either decreases (if $z_{j_0}^{k_q} > d_{j_0}$), or increases by $h_{j_0}(z_{j_0}^{k_q} - \delta) - h_{j_0}(z_{j_0}^{k_q})\le\kappa_{j_0}\delta$. Thus the total cost decreases by at least
$$\Delta = \Big[\frac{g_1(b_1^{k_q})}{b_1^{k_q}} + c_{1j_0} - \kappa_{j_0}\Big]\delta.\qquad(9.106)$$
If $\kappa\le 0$ then $c_{1j_0} - \kappa_{j_0}\ge 0$, hence $\Delta > 0$ and $(\tilde x^{k_q}, \tilde y^{k_q}, \tilde z^{k_q})$ would be a better feasible solution than $(x^{k_q}, y^{k_q}, z^{k_q})$, contradicting the optimality of the latter for CP($M_{k_q}$). Therefore, $\kappa > 0$. Now suppose $g_1(0^+) > 0$. Since $g_1(t)/t\to+\infty$ as $t\to 0^+$ (in view of the fact $g_1(0^+) > 0$), there exists $\varepsilon_1 > 0$ satisfying
$$\frac{g_1(\varepsilon_1)}{\varepsilon_1} > \kappa.$$
Observe that since $M_{k_q,1} = [0, b_1^{k_q}]$ is divided via $y_1^{k_q}$, we must have $[0, b_1^{q'}]\subset[0, y_1^{k_q}]$ for all $q' > q$, while $[0, y_1^{k_q}]\subset[0, b_1^{k_q}]$ for all $q$. With this in mind we will show that
$$b_1^{k_q}\ge\varepsilon_1\quad\forall q.\qquad(9.107)$$
By the above observation this will imply that $y_1^{k_q}\ge\varepsilon_1$ $\forall q$, thereby completing the proof. Suppose (9.107) does not hold, i.e., $b_1^{k_q} < \varepsilon_1$ for some $q$. Then $y_1^{k_q}\le b_1^{k_q} < \varepsilon_1$, and since $g_1(t)$ is concave it is easily seen from Fig. 9.1 that $g_1(b_1^{k_q})\ge b_1^{k_q}\,g_1(\varepsilon_1)/\varepsilon_1$, i.e.,
$$\frac{g_1(b_1^{k_q})}{b_1^{k_q}}\ge\frac{g_1(\varepsilon_1)}{\varepsilon_1}.$$
This, together with (9.106), implies that
$$\Delta\ge\Big[\frac{g_1(\varepsilon_1)}{\varepsilon_1} - \kappa\Big]\delta > 0,$$
hence $(\tilde x^{k_q}, \tilde y^{k_q}, \tilde z^{k_q})$ is a better feasible solution than $(x^{k_q}, y^{k_q}, z^{k_q})$. This contradiction proves (9.107). □
Theorem 9.3 The above algorithm for (GPTP) can be infinite only if $\varepsilon = 0$ and in this case it generates an infinite sequence $y^k = y(M_k)$, $k = 1,2,\dots$, every accumulation point of which is an optimal solution.

Proof For $\varepsilon = 0$ let $\bar y = \lim_{q\to\infty} y^{k_q}$ be an accumulation point of $\{y^{k_q}\}$ with, as previously, $i_{k_q} = 1$ $\forall q$, and $M_{k_q,1} = [a_1^q, b_1^q]$. Since $\{a_1^q\}$ is nondecreasing and $\{b_1^q\}$ is nonincreasing, we have $a_1^q\to\bar a_1$, $b_1^q\to\bar b_1$. For each $q$ either of the following situations occurs:
$$a_1^q\le a_1^{q+1}\le b_1^{q+1}\le y_1^{k_q}\le b_1^q\qquad(9.108)$$
$$a_1^q\le y_1^{k_q}\le a_1^{q+1}\le b_1^{q+1}\le b_1^q\qquad(9.109)$$
If (9.108) occurs for infinitely many $q$ then $\lim_{q\to\infty} y_1^{k_q} = \lim_{q\to\infty} b_1^q = \bar b_1$. If (9.109) occurs for infinitely many $q$ then $\lim_{q\to\infty} y_1^{k_q} = \lim_{q\to\infty} a_1^q = \bar a_1$. Thus, $\bar y_1\in\{\bar a_1, \bar b_1\}$. By Lemma 9.9, the concave function $g_1(t)$ is continuous at $\bar y_1$. Suppose, for instance, that $\bar y_1 = \bar a_1$ (the other case is similar). Since $y_1^{k_q}\to\bar a_1$ and $a_1^q\to\bar a_1$, it follows from the continuity of $g_1(t)$ at $\bar a_1$ that $g_1(y_1^{k_q}) - l_{1,M_{k_q}}(y_1^{k_q})\to 0$ and hence, by (9.102),
$$g_i(y_i^{k_q}) - l_{i,M_{k_q}}(y_i^{k_q})\to 0\qquad i = 1,\dots,r$$
as $q\to\infty$. This implies that $g(y^{k_q}) - l_{M_{k_q}}(y^{k_q})\to 0$ and hence, that
$$\beta(M_{k_q})\to\langle c,\bar x\rangle + g(\bar y) + h(\bar z)$$
as $q\to\infty$. Since $\beta(M_{k_q})\le\gamma^* :=$ optimal value of (GPTP), it follows that $\langle c,\bar x\rangle + g(\bar y) + h(\bar z)$ is the sought minimum of (GPTP). □

Computational experiments have shown that the above algorithm for (GPTP) can solve problems with up to $r = 100$ supply points and $n = 500$ demand points in reasonable time on a Sun SPARCstation SLC (Holmberg and Tuy 1995). To apply this algorithm to the case when $h_j(z_j) = +\infty$ for $z_j\ne d_j$ (e.g., the fixed charge problem) it suffices to redefine $h_j(z_j) = \kappa_j|d_j - z_j|$ where $\kappa_j$ is finite but sufficiently large.

9.9 Some Polynomial-Time Solvable Problems

As mentioned previously, even quadratic programming with one negative eigenvalue


is NP-hard (Pardalos and Vavasis 1991). There are, nevertheless, genuine concave
programs which are polynomial-time solvable. In this section we discuss this topic,
to illustrate the efficiency of the parametric approach for certain classes of low rank
nonconvex problems.

9.9.1 The Special Concave Program CPL(1)

Consider the special concave program under linear constraints:
$$(\mathrm{CPL}(1))\qquad\begin{array}{ll}\min & g(y) + \langle c,x\rangle\\ \text{s.t.} & \sum_{j=1}^{n} x_j = y,\quad 0\le x_j\le d_j\ \forall j,\end{array}$$
where $g(y)$ is a concave function, $c\in\mathbb{R}^n$ and $d\in\mathbb{R}^n_+$. Let $s = \sum_{j=1}^{n} d_j$. To solve this concave program by the parametric right-hand side method presented in Sect. 8.5, we first compute the breakpoints $y^0 = 0\le y^1\le\dots\le y^N = s$ of the optimal value function of the associated parametric linear program
$$(\mathrm{LP}(y))\qquad \min\Big\{\langle c,x\rangle\ \Big|\ \sum_{j=1}^{n} x_j = y,\ 0\le x_j\le d_j\ \forall j\Big\},\quad 0\le y\le s.$$
If $x^i$ is a basic optimal solution of LP($y^i$) then, by Lemma 9.4, an optimal solution of CPL(1) is given by $(x^{i^*}, y^{i^*})$ where
$$i^*\in\arg\min\{g(y^i) + cx^i\mid i = 0,1,\dots,N\}.\qquad(9.110)$$
Thus, all is reduced to solving LP($y$) parametrically in $y\in[0,s]$. But this is very easy because LP($y$) is a continuous knapsack problem. In fact, by reindexing we can assume that the $c_j$ are ordered by increasing values:
$$c_1\le c_2\le\dots\le c_n.$$
Let $k$ be an index such that $\sum_{j=1}^{k-1} d_j\le y\le\sum_{j=1}^{k} d_j$.

Lemma 9.10 An optimal solution of LP($y$) is $\bar x$ with
$$\bar x_j = d_j\ \forall j = 1,\dots,k-1;\qquad \bar x_k = y - \sum_{j=1}^{k-1} d_j;\qquad \bar x_j = 0\ \forall j = k+1,\dots,n.$$
Proof This well-known result (see, e.g., Dantzig 1963) is rather intuitive because for any feasible solution $x$ of LP($y$), by noting that $y = \sum_{j=1}^{n} x_j$ we can write
$$\begin{array}{rl}\langle c,\bar x\rangle &= \displaystyle\sum_{j=1}^{k-1} c_jd_j + c_k\sum_{j=1}^{n} x_j - c_k\sum_{j=1}^{k-1} d_j\\ &= \displaystyle\sum_{j=1}^{k-1} c_jd_j - c_k\sum_{j=1}^{k-1}(d_j - x_j) + c_k\sum_{j=k}^{n} x_j\\ &\le \displaystyle\sum_{j=1}^{k-1} c_jd_j - \sum_{j=1}^{k-1} c_j(d_j - x_j) + \sum_{j=k}^{n} c_jx_j = \langle c,x\rangle.\ \square\end{array}$$
Proposition 9.10
(i) The breakpoints of the optimal value function of LP($y$), $0\le y\le s$, are among the points:
$$y^0 = 0,\qquad y^k = y^{k-1} + d_k\quad k = 1,\dots,n.$$
(ii) For every $k = 0,1,\dots,n$ a basic optimal solution of LP($y^k$) is the vector $x^k$ such that
$$x_j^k = d_j,\ j = 1,\dots,k;\qquad x_j^k = 0,\ j = k+1,\dots,n.\qquad(9.111)$$

Proof From Lemma 9.10, for any $y\in[y^k, y^{k+1}]$ a basic optimal solution of LP($y$) is
$$x^y = x^k + \lambda(x^{k+1} - x^k)\ \text{ with }\ \lambda = \frac{y - y^k}{d_{k+1}}.$$
Hence each segment $[y^k, y^{k+1}]$ is a linear piece of the optimal value function $\varphi(y)$ of LP($y$), which proves (i). Part (ii) is immediate. □

As a result, the concave program CPL(1) can be solved by the following:

Parametric Algorithm for CPL(1)

Step 1 (Initialization). Order the coefficients $c_j$ by increasing values:
$$c_1\le c_2\le\dots\le c_n.$$
Step 2. For $k = 0,1,\dots,n$ compute
$$f_k = g\Big(\sum_{j=1}^{k} d_j\Big) + \sum_{j=1}^{k} c_jd_j.$$

Step 3. Find $f_{k^*} = \min\{f_0, f_1,\dots,f_n\}$. Then $x^{k^*}$ is an optimal solution.

Proposition 9.11 The above algorithm requires $O(n\log_2 n)$ elementary operations and $n$ evaluations of $g(\cdot)$.

Proof The complexity dominant part of the algorithm is Step 1 which requires $O(n\log_2 n)$ operations, using, e.g., the procedure Heapsort (Ahuja et al. 1993). Also an evaluation of $g(\cdot)$ is needed for each number $f_k$. □

Thus, assuming the values of the nonlinear function $g(\cdot)$ are provided by an oracle, the above algorithm for CPL(1) runs in strongly polynomial time.
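A direct implementation of the three steps takes only a few lines. The following sketch assumes $g$ is given as an oracle; the sample data at the end are illustrative only.

```python
# Minimal implementation of the parametric algorithm for CPL(1): sort the
# costs, scan the breakpoints y^k = d_1 + ... + d_k of Proposition 9.10,
# and keep the best value f_k = g(y^k) + sum_{j<=k} c_j d_j.
def solve_cpl1(g, c, d):
    order = sorted(range(len(c)), key=lambda j: c[j])   # Step 1: O(n log n)
    y, lin = 0.0, 0.0
    best_k, best_f = 0, g(0.0)                          # breakpoint k = 0
    for k, j in enumerate(order, start=1):              # Step 2
        y += d[j]; lin += c[j] * d[j]
        fk = g(y) + lin
        if fk < best_f:
            best_k, best_f = k, fk
    x = [0.0] * len(c)                                  # Step 3: x^{k*} of (9.111)
    for j in order[:best_k]:
        x[j] = d[j]
    return x, best_f

# Example: g(y) = sqrt(y) (concave), unit capacities, mixed-sign costs.
import math
x_opt, f_opt = solve_cpl1(math.sqrt, [-3.0, 1.0, -2.0], [1.0, 1.0, 1.0])
```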

9.10 Applications

9.10.1 Problem PTP(2)

For convenience we shall refer to problem (PTP) with $r$ factories as PTP($r$). So PTP(2) is the problem
$$\mathrm{PTP}(2)\qquad\begin{array}{ll}\min & g(y) + \sum_{i=1}^{2}\sum_{j=1}^{m} c_{ij}x_{ij}\\ \text{s.t.} & \sum_{j=1}^{m} x_{1j} = y\\ & x_{1j} + x_{2j} = d_j\quad j = 1,\dots,m,\\ & 0\le x_{ij}\le d_j\quad i = 1,2,\ j = 1,\dots,m.\end{array}$$
Substituting $d_j - x_{1j}$ for $x_{2j}$ and setting $x_j = x_{1j}$, $c_j = c_{1j} - c_{2j}$ we can convert this problem into a CPL(1):
$$\begin{array}{ll}\min & g(y) + \langle c,x\rangle\\ \text{s.t.} & \sum_{j=1}^{m} x_j = y,\quad 0\le x_j\le d_j\ \forall j.\end{array}$$
Therefore,

Proposition 9.12 PTP(2) can be solved by an algorithm requiring $O(m\log_2 m)$ elementary operations and $m$ evaluations of $g(\cdot)$.
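A sketch of this substitution, reusing the solve_cpl1 routine sketched above (the constant $\sum_j c_{2j}d_j$ dropped by the substitution is added back inside the oracle; the data layout is an illustrative assumption):

```python
# Sketch of the PTP(2) -> CPL(1) reduction: x_{2j} = d_j - x_{1j}.
def solve_ptp2(g, c1, c2, d):
    c = [c1[j] - c2[j] for j in range(len(d))]
    const = sum(c2[j] * d[j] for j in range(len(d)))
    x1, val = solve_cpl1(lambda y: g(y) + const, c, d)
    x2 = [d[j] - x1[j] for j in range(len(d))]
    return x1, x2, val
```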

9.10.2 Problem FP(1)

We shall refer to Problem (MCCNFP) with a single source and $r$ nonlinear arc costs as FP($r$). So FP(1) is the problem of finding a feasible flow with minimum cost in a given network $G$ with $m$ nodes (indexed by $j = 1,\dots,m$) and $n$ arcs (indexed by $i = 1,\dots,n$), where node $j$ has a demand $d_j$ (with $d_m < 0$, $d_j\ge 0$ for all other $j$, $\sum_{j=1}^{m} d_j = 0$), arc 1 has a concave nonlinear cost function $g_1(t)$ while each arc $i = 2,\dots,n$ has a linear cost $g_i(t) = c_it$ (with $c_i\ge 0$). If $A_j^+$ and $A_j^-$ denote the sets of arcs entering and leaving node $j$, respectively, then the problem is
$$\mathrm{FP}(1)\qquad\begin{array}{ll}\min & g_1(x_1) + \sum_{i=2}^{n} c_ix_i\\ \text{s.t.} & \sum_{i\in A_j^+} x_i - \sum_{i\in A_j^-} x_i = d_j,\quad j = 1,2,\dots,m,\\ & x\ge 0.\end{array}$$

Proposition 9.13 FP(1) can be reduced to a PTP(2) by means of $O(m\log_2 m + n)$ elementary operations.

Proof To simplify the language we call the single arc with nonlinear cost the black arc. All the other arcs are called white, and the unit cost $c_i$ attached to a white arc is called its length. A directed path through white arcs only is called a white path; its length is then the sum of the lengths of its arcs. A white path from a node $j$ to a node $j'$ is called shortest if its length is smallest among all white paths from $j$ to $j'$. Denote by $Q_{0j}$ the shortest white path from the source to sink $j$, by $Q_{1j}$ the shortest white path from the head of the black arc to sink $j$, and by $P$ the shortest white path from the source to the tail of the black arc. Let $c_{0j}, c_{1j}, p$ be the lengths of these paths, respectively (the length being $+\infty$, or an arbitrarily large positive number, if the path does not exist). Assuming $\{j\mid d_j > 0\} = \{1,\dots,m_1\}$, let $s = \sum_{j=1}^{m_1} d_j$. Consider now a problem PTP(2) defined for two factories $F_0, F_1$ and $m_1$ warehouses $W_1,\dots,W_{m_1}$ with demands $d_1,\dots,d_{m_1}$, where the cost of producing $y$ units at factory $F_0$ and $s - y$ units at factory $F_1$ is $g_1(y) + py$, while the unit transportation cost from $F_i$ to $W_j$ is $c_{ij}$ ($i = 0,1$), i.e.,
$$\begin{array}{ll}\min & g_1(y) + py + \sum_{i=0}^{1}\sum_{j=1}^{m_1} c_{ij}x_{ij}\\ \text{s.t.} & \sum_{j=1}^{m_1} x_{0j} = y\\ & x_{0j} + x_{1j} = d_j\quad j = 1,\dots,m_1\\ & 0\le x_{0j}\le d_j\quad j = 1,\dots,m_1\end{array}\qquad(9.112)$$
(the corresponding reduced network is depicted in Fig. 9.2).

Fig. 9.2 Reduced network (two factories serving warehouses $W_1,\dots,W_{m_1}$; figure omitted)

Clearly from an optimal solution $(x_{ij}^*)$ of (9.112) we can derive an optimal solution of FP(1) by setting, in FP(1),
$$x_1 = \sum_{j=1}^{m_1} x_{0j}^*$$
and solving the corresponding linear min-cost network flow problem. Thus, solving FP(1) is reduced to solving the just defined PTP(2). To complete the proof, it remains to observe that the computation of $c_{ij}$, $i = 0,1$, $j = 1,\dots,m_1$, and $p$ needed for the above transformation requires solving two single-source multiple-sink shortest path problems with nonnegative arc costs, hence a time of $O(m\log_2 m + n)$ (using Fredman and Tarjan's (1984) implementation of Dijkstra's algorithm). □

It follows from Propositions 9.11 and 9.13 that FP(1) can be solved by a strongly polynomial algorithm requiring $O(m\log_2 m + m\log_2 m + n) = O(m\log_2 m + n)$ elementary operations and $m$ evaluations of the function $g_1(\cdot)$.

Remark 9.9 A polynomial (but not strongly polynomial) algorithm for FP(1) was first given by Guisewite and Pardalos (1992). Later a strongly polynomial algorithm based on solving linear min-cost network flow problems with a parametric objective function was obtained by Klinz and Tuy (1993). The algorithms presented in this section for PTP(2) and FP(1) were developed by Tuy et al. (1993a) and subsequently extended in a series of papers by Tuy et al. (1993b, 1994a, 1995b, 1996a) to prove the strong polynomial solvability of a class of network flow problems including PTP($r$) and FP($r$) with fixed $r$.

9.10.3 Strong Polynomial Solvability of PTP($r$)

The problem PTP($r$) as was formulated at the beginning of Sect. 8.8 is
$$\mathrm{PTP}(r)\qquad\begin{array}{ll}\min & g\Big(\sum_{j=1}^{m} x_{1j},\dots,\sum_{j=1}^{m} x_{rj}\Big) + \sum_{i=1}^{r}\sum_{j=1}^{m} c_{ij}x_{ij}\\ \text{s.t.} & \sum_{i=1}^{r} x_{ij} = d_j\quad j = 1,\dots,m\\ & x_{ij}\ge 0,\quad i = 1,\dots,r,\ j = 1,\dots,m,\end{array}$$
where $g(y')\le g(y)$ whenever $y'\le y$.
Following the approach in Sect. 8.5 we associate with PTP($r$) the parametric problem
$$\mathrm{P}(t)\qquad\begin{array}{ll}\min & \sum_{i=1}^{r}\sum_{j=1}^{m} t_ix_{ij} + \sum_{i=1}^{r}\sum_{j=1}^{m} c_{ij}x_{ij}\\ \text{s.t.} & \sum_{i=1}^{r} x_{ij} = d_j\quad j = 1,\dots,m\\ & x_{ij}\ge 0,\quad i = 1,\dots,r,\ j = 1,\dots,m,\end{array}$$
where $t\in\mathbb{R}^r_+$. The parametric domain $\mathbb{R}^r_+$ is partitioned into a finite collection $\mathcal{P}$ of polyhedrons (cells) such that for each cell $\Delta\in\mathcal{P}$ there is a basic solution $x^\Delta$ which is optimal to P($t$) for all $t\in\Delta$. Then an optimal solution of PTP($r$) is $x^{\Delta^*}$ where
$$\Delta^*\in\operatorname{argmin}\Big\{g\Big(\sum_{j=1}^{m} x_{1j}^\Delta,\dots,\sum_{j=1}^{m} x_{rj}^\Delta\Big) + \sum_{i,j} c_{ij}x_{ij}^\Delta\ \Big|\ \Delta\in\mathcal{P}\Big\}.\qquad(9.113)$$
Following Tuy (2000b) we now show that the collection $\mathcal{P}$ can be constructed and its cardinality is bounded by a polynomial in $m$.
Observe that the dual of P($t$) is
$$\mathrm{P}^*(t)\qquad\begin{array}{ll}\max & \sum_{j=1}^{m} d_ju_j\\ \text{s.t.} & u_j\le t_i + c_{ij},\quad i = 1,\dots,r,\ j = 1,\dots,m.\end{array}$$
Also for any fixed $t\in\mathbb{R}^r_+$ a basic solution of P($t$) is a vector $x^t$ such that for every $j = 1,\dots,m$ there is an $i_j$ satisfying
$$x_{ij}^t = \left\{\begin{array}{ll}d_j & i = i_j\\ 0 & i\ne i_j\end{array}\right.\qquad(9.114)$$
By the duality of linear programming $x^t$ is a basic optimal solution of P($t$) if and only if there exists a feasible solution $u = (u_1,\dots,u_m)$ of P$^*(t)$ satisfying
$$u_j\left\{\begin{array}{ll}= t_i + c_{ij} & i = i_j\\ \le t_i + c_{ij} & i\ne i_j\end{array}\right.$$
or, alternatively, if and only if for every $j = 1,\dots,m$:
$$i_j\in\operatorname{argmin}_{i=1,\dots,r}\{t_i + c_{ij}\}.\qquad(9.115)$$
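Criterion (9.115) makes the basic optimal solution of P($t$) computable in $O(rm)$ time for any fixed $t$. A minimal numpy sketch:

```python
# Sketch of (9.114)-(9.115): for a fixed t, a basic optimal solution of P(t)
# ships each demand d_j entirely from a factory minimizing t_i + c_{ij}.
import numpy as np

def basic_optimal(t, c, d):
    r, m = c.shape
    x = np.zeros((r, m))
    i_star = np.argmin(t[:, None] + c, axis=0)   # i_j of (9.115), per column j
    x[i_star, np.arange(m)] = d
    return x
```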



Now let $I_2$ be the set of all pairs $(i_1, i_2)$ such that $i_1 < i_2\in\{1,\dots,r\}$. Define a cell to be a polyhedron $\Delta\subset\mathbb{R}^r_+$ which is the solution set of a linear system formed by taking, for every pair $(i_1, i_2)\in I_2$ and every $j = 1,\dots,m$, one of the following inequalities:
$$t_{i_1} + c_{i_1j}\le t_{i_2} + c_{i_2j},\qquad t_{i_1} + c_{i_1j}\ge t_{i_2} + c_{i_2j}.\qquad(9.116)$$
Then for every $j = 1,\dots,m$ the order of magnitude of the sequence
$$t_i + c_{ij},\quad i = 1,\dots,r$$
remains unchanged as $t$ varies over a cell $\Delta$. Hence the index $i_j$ satisfying (9.115) remains the same for all $t\in\Delta$; in other words, $x^t$ (basic optimal solution of P($t$)) equals a constant vector $x^\Delta$ for all $t\in\Delta$. Let $\mathcal{P}$ be the collection of all cells defined that way. Since every $t\in\mathbb{R}^r_+$ satisfies one of the inequalities (9.116) for every $(i_1, i_2)\in I_2$ and every $j = 1,\dots,m$, the collection $\mathcal{P}$ covers all of $\mathbb{R}^r_+$. Let us estimate an upper bound of the number of cells in $\mathcal{P}$.

Observe that for any fixed pair $(i_1, i_2)\in I_2$ we have $t_{i_1} + c_{i_1j}\le t_{i_2} + c_{i_2j}$ if and only if $t_{i_1} - t_{i_2}\le c_{i_2j} - c_{i_1j}$. Let us sort the numbers $c_{i_2j} - c_{i_1j}$, $j = 1,\dots,m$, in increasing order
$$c_{i_2j_1} - c_{i_1j_1}\le c_{i_2j_2} - c_{i_1j_2}\le\dots\le c_{i_2j_m} - c_{i_1j_m},\qquad(9.117)$$
and let $\sigma_{i_1,i_2}(j)$ be the position of $c_{i_2j} - c_{i_1j}$ in this ordered sequence.
Proposition 9.14 A cell $\Delta$ is characterized by a map $k_\Delta: I_2\to\{1,\dots,m,m+1\}$ such that $\Delta$ is the solution set of the linear system
$$t_{i_1} + c_{i_1j}\le t_{i_2} + c_{i_2j}\quad\forall(i_1,i_2)\in I_2\ \text{s.t.}\ \sigma_{i_1,i_2}(j)\ge k_\Delta(i_1,i_2)\qquad(9.118)$$
$$t_{i_1} + c_{i_1j}\ge t_{i_2} + c_{i_2j}\quad\forall(i_1,i_2)\in I_2\ \text{s.t.}\ \sigma_{i_1,i_2}(j) < k_\Delta(i_1,i_2)\qquad(9.119)$$

Proof Let $\Delta\subset\mathbb{R}^r_+$ be a cell. For every pair $(i_1, i_2)$ with $i_1 < i_2$ denote by $J_\Delta^{i_1,i_2}$ the set of all $j = 1,\dots,m$ such that the left inequality (9.116) holds for all $t\in\Delta$ and define
$$k_\Delta(i_1,i_2) = \left\{\begin{array}{ll}\min\{\sigma_{i_1,i_2}(j)\mid j\in J_\Delta^{i_1,i_2}\} & \text{if } J_\Delta^{i_1,i_2}\ne\emptyset,\\ m+1 & \text{otherwise.}\end{array}\right.$$
It is easy to see that $\Delta$ is then the solution set of the system (9.118) and (9.119). Indeed, let $t\in\Delta$. If $\sigma_{i_1,i_2}(j)\ge k_\Delta(i_1,i_2)$, then $k_\Delta(i_1,i_2)\ne m+1$, so $k_\Delta(i_1,i_2) = \sigma_{i_1,i_2}(l)$ for some $l\in J_\Delta^{i_1,i_2}$. Then $t_{i_1} + c_{i_1l}\le t_{i_2} + c_{i_2l}$, hence $t_{i_1} - t_{i_2}\le c_{i_2l} - c_{i_1l}$, and since the relation $\sigma_{i_1,i_2}(j)\ge\sigma_{i_1,i_2}(l)$ means that $c_{i_2j} - c_{i_1j}\ge c_{i_2l} - c_{i_1l}$ it follows that $t_{i_1} - t_{i_2}\le c_{i_2j} - c_{i_1j}$, i.e., $t_{i_1} + c_{i_1j}\le t_{i_2} + c_{i_2j}$. Therefore (9.118) holds. On the other hand, if $\sigma_{i_1,i_2}(j) < k_\Delta(i_1,i_2)$ then by definition of a cell $j\notin J_\Delta^{i_1,i_2}$, hence (9.119) holds, too [since from the definition of a cell, any $t\in\Delta$ must satisfy one of the inequalities (9.116)]. Thus every $t\in\Delta$ is a solution of the system (9.118) and (9.119). Conversely, if $t$ satisfies (9.118) and (9.119) then for every $(i_1,i_2)\in I_2$, $t$ satisfies the left inequality (9.116) for $j\in J_\Delta^{i_1,i_2}$ and the right inequality for $j\notin J_\Delta^{i_1,i_2}$, hence $t\in\Delta$. Therefore, each cell $\Delta$ is determined by a map $k_\Delta: I_2\to\{1,\dots,m+1\}$. Furthermore, it is easily seen that $k_\Delta\ne k_{\Delta'}$ for two different cells $\Delta, \Delta'$. Indeed, if $\Delta\ne\Delta'$ then at least for some $(i_1,i_2)\in I_2$ and some $j = 1,\dots,m$, one has $j\in J_\Delta^{i_1,i_2}\setminus J_{\Delta'}^{i_1,i_2}$. Then $k_\Delta(i_1,i_2)\le\sigma_{i_1,i_2}(j)$ but $k_{\Delta'}(i_1,i_2) > \sigma_{i_1,i_2}(j)$. □
u
Corollary 9.4 The total number of cells is bounded above by $(m+1)^{r(r-1)/2}$.

Proof The number of cells does not exceed the number of different maps $k_\Delta: I_2\to\{1,\dots,m+1\}$ and there are $(m+1)^{r(r-1)/2}$ such maps. □

To sum up, the proposed algorithm for solving PTP($r$) involves the following steps:
(1) Ordering the sequence $c_{i_2j} - c_{i_1j}$, $j = 1,\dots,m$, for every pair $(i_1,i_2)\in I_2$ so as to determine $\sigma_{i_1,i_2}(j)$, $j = 1,\dots,m$, $(i_1,i_2)\in I_2$.
(2) Computing the vector $x^\Delta$ for every cell $\Delta\in\mathcal{P}$ ($\mathcal{P}$ is the collection of cells determined by the maps $k_\Delta: I_2\to\{1,\dots,m+1\}$).
(3) Computing the values $f(x^\Delta)$ and selecting $\Delta^*$ according to (9.113).
Steps (1) and (2) obviously require a number of elementary operations bounded by a polynomial in $m$, while step (3) requires $m^{r(r-1)/2}$ evaluations of $f(x)$.
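For $r = 2$ the construction is especially transparent: $I_2$ contains the single pair $(1,2)$ and the cells are the $m+1$ intervals of $t_1 - t_2$ cut out by the sorted differences (9.117). Here is a sketch reusing basic_optimal from the previous code block; the sample data layout is an illustrative assumption.

```python
# Sketch of cell enumeration for r = 2: one representative t per cell, one
# candidate basic solution x^Delta per cell, best value per (9.113).
import numpy as np

def solve_ptp_r2_cells(g, c, d):                 # c has shape (2, m)
    diff = np.sort(c[1] - c[0])
    reps = np.concatenate([[diff[0] - 1.0],      # one point inside each interval
                           (diff[:-1] + diff[1:]) / 2.0,
                           [diff[-1] + 1.0]])
    best = None
    for s in reps:
        t = np.array([max(s, 0.0), max(-s, 0.0)])      # t in R^2_+, t_1 - t_2 = s
        x = basic_optimal(t, c, d)                     # constant on the cell
        val = g(x.sum(axis=1)) + float((c * x).sum())  # objective of (9.113)
        if best is None or val < best[0]:
            best = (val, x)
    return best
```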

9.11 Exercises

9.1 Show that a differentiable function $f(x)$ is $K$-monotonic on an open convex set $X\subset\mathbb{R}^n$ if and only if $\nabla f(x)\in K^\circ$ $\forall x\in X$.
9.2 Consider the problem $\min\{f(x)\mid x\in D\}$ where $D$ is a polyhedron in $\mathbb{R}^n$ with a vertex at 0, and $f: \mathbb{R}^n\to\mathbb{R}$ is a concave function such that for any real number $\gamma\le f(0)$ the level set $C_\gamma = \{x\mid f(x)\ge\gamma\}$ has a polar contained in the cone generated by two linearly independent vectors $c^1, c^2\in\mathbb{R}^n$. Show that this problem reduces to minimizing a concave function of two variables on the projection of $D$ on $\mathbb{R}^2$. Develop an OA method for solving the problem.
9.3 Consider the problem
$$\min\{g(t) + \langle d,y\rangle\mid ta + By\ge c,\ t\ge 0,\ y\ge 0\}$$
where $t\in\mathbb{R}$, $y\in\mathbb{R}^n$, $a\in\mathbb{R}^m$, $B\in\mathbb{R}^{m\times n}$, $c\in\mathbb{R}^m_+$. Develop a parametric algorithm for solving the problem and compare with the polyhedral annexation method.

9.4 Show that a linearly constrained problem of the form
$$\min\{F(l(x))\mid x\in D\}$$
where $D$ is a polyhedron in $\mathbb{R}^n$, $l(x)$ is an affine function, and $F(t)$ is an arbitrary quasiconcave real-valued function on an interval $\Gamma\supset l(D)$, reduces to at most two linear programs. If $F(\cdot)$ is monotonic on $l(D)$ then the problem is in fact equivalent to even a mere linear program.
9.5 Solve the plant location problem
$$\begin{array}{ll}\min & \sum_{i=1}^{3} f_i(x_i) + \sum_{i=1}^{3}\sum_{j=1}^{5} d_{ij}y_{ij}\\ \text{s.t.} & x_1 + x_2 + x_3 = 203\\ & y_{1j} + y_{2j} + y_{3j} = b_j\quad j = 1,\dots,5\\ & y_{i1} + \dots + y_{i5} = x_i\quad i = 1,2,3\\ & x_i\ge 0;\ y_{ij}\ge 0\quad i = 1,2,3;\ j = 1,\dots,5,\end{array}$$
where $f_i(t) = 0$ if $t = 0$, $f_i(t) = r_i + s_it$ if $t > 0$,
$$r_1 = 1,\ s_1 = 1.7;\quad r_2 = 88,\ s_2 = 8.4;\quad r_3 = 39,\ s_3 = 4.7$$
$$b = (62, 65, 51, 10, 15);\qquad d = [d_{ij}] = \begin{bmatrix}6 & 66 & 68 & 81 & 4\\ 40 & 20 & 34 & 83 & 27\\ 90 & 22 & 82 & 17 & 8\end{bmatrix}$$

9.6 Suppose that details of $n$ different kinds must be manufactured. The production cost of $t$ items of the $i$-th kind is a concave function $f_i(t)$. The number of items of the $i$-th kind required is $a_i > 0$, but one item of the $i$-th kind can be replaced by one item of any $j$-th kind with $j > i$. Determine the number $x_i$ of items of the $i$-th kind, $i = 1,\dots,n$, to be manufactured so as to satisfy the demands with minimum cost. By setting $\alpha_i = \sum_{j=1}^{i} a_j$, this problem can be formulated as:
$$\min\ \sum_{i=1}^{n} f_i(x_i)\quad\text{s.t.}\quad x_1\ge\alpha_1,\ x_1 + x_2\ge\alpha_2,\ \dots,\ x_1 + \dots + x_n\ge\alpha_n,\quad x_1,\dots,x_n\ge 0$$
(Zukhovitzki et al. 1968). Solve this problem.
(Hint: Any extreme point $x$ of the feasible polyhedron can be characterized by the condition: for each $i = 1,\dots,n$, either $x_i = 0$ or $x_1 + \dots + x_i = \alpha_i$.)
9.7 Develop an algorithm for solving the problem
$$\min\{c^0x + (c^1x)^{1/2} - (c^2x)^{1/3}\mid x\in D\}$$
where $D\subset\mathbb{R}^n_+$ is a polytope, $c^1, c^2\in\mathbb{R}^n_+$, $c^0\in\mathbb{R}^n$, and $cx$ denotes the inner product of $c$ and $x$. Apply the algorithm to a small numerical example with $n = 2$.

9.8 Solve the numerical Example 9.10 (page 317).

9.9 Solve by polyhedral annexation:
$$\begin{array}{ll}\min & -(0.5x_1 + 0.75x_2 + 0.75x_3 + 1.25x_4)\quad\text{s.t.}\\ & 0.25x_2 + 0.25x_3 + 0.75x_4\le 2\\ & 0.25x_2 - 0.25x_3 + 0.25x_4\le 0\\ & x_1 - 0.5x_2 + 0.5x_3 + 1.5x_4\le 1.5\\ & h(x) + 4x_1 - x_2 = 0\\ & x = (x_1,\dots,x_4)\ge 0,\end{array}$$
where $h(x)$ is the optimal value of the linear program
$$\begin{array}{ll}\min & -(4y_1 + y_2)\quad\text{s.t.}\\ & y_1 + y_2\le 1.5 - 0.5x_2 - 0.5x_3 - 1.5x_4\\ & y_2\le x_1 + x_2;\quad y_1, y_2\ge 0.\end{array}$$
(Hint: Exploit the monotonicity property of $h(x)$.)


9.10 Solve the problem
$$\begin{array}{ll}\min & \sum_{i=1}^{n}\dfrac{x_i}{y_i + \varepsilon}\quad\text{s.t.}\\ & \sum_{i=1}^{n} x_i = a,\quad \sum_{i=1}^{n} y_i = b\\ & 0\le x_i\le\alpha,\quad 0\le y_i\le 1\quad i = 1,\dots,n,\end{array}$$
where $0 < \varepsilon$, $0 < a\le n\alpha$, $0 < b\le n$.
Hint: Reduce the problem to the concave minimization problem
$$\min\Big\{h(x)\ \Big|\ \sum_{i=1}^{n} x_i = a,\ 0\le x_i\le\alpha\Big\}$$
where
$$h(x) = \min\Big\{\sum_{i=1}^{n}\frac{x_i}{y_i + \varepsilon}\ \Big|\ \sum_{i=1}^{n} y_i = b,\ 0\le y_i\le 1,\ i = 1,\dots,n\Big\}$$
and use the results on the special concave minimization CPL(1).

9.11 Solve the problem in Example 5.5 (page 142) where the constraints (5.24) are replaced by
$$\sum_{i=1}^{n} x_i = a,\qquad \sum_{i=1}^{n} y_i = b.$$
Chapter 10
Nonconvex Quadratic Programming

10.1 Some Basic Properties of Quadratic Functions

As pointed out in the preceding chapters, in deterministic global optimization it is of


utmost importance to take advantage of the particular mathematical structure of the
problem under study. There are in general two aspects of nonconvexity deserving
special attention. First, the rank of nonconvexity, i.e., roughly speaking, the number
of nonconvex variables. Second, the degree of nonconvexity, i.e., the extent to which
the variables are nonconvex. In Chap. 9 we have discussed decomposition methods
for handling low rank nonconvex problems. The present Chap. 10 is devoted to
nonconvex optimization problems which involve only linear or quadratic functions,
i.e., in a sense functions with lowest degree of nonconvexity.
The importance of quadratic optimization models is due to several reasons:
– Quadratic functions are the simplest smooth d.c. functions whose derivatives are readily available and easy to manipulate.
– Any twice differentiable function can be approximated by a quadratic function in the neighborhood of a given point, so in a sense quadratic models are the most natural.
– Numerous applications in economics, engineering, and other fields lead to quadratic nonconvex optimization problems. Certain combinatorial optimization problems can also be studied as quadratic optimization problems. For instance, a 0-1 constraint of the form $x_i\in\{0,1\}$, $i = 1,\dots,p$ can be written as the quadratic constraint $\sum_{i=1}^{p} x_i(x_i - 1)\ge 0$, $0\le x_i\le 1$.
Before discussing specialized approaches to quadratic optimization it is convenient to recall some of the most elementary properties of quadratic functions.

1. Let $Q$ be a symmetric matrix. Then
$$Q = U\Lambda U^T,\qquad \Lambda = \operatorname{diag}(\lambda_1,\dots,\lambda_n),$$
where $UU^T = I$, $U = (u^1,\dots,u^n)$ is the matrix of normalized eigenvectors of $Q$, and $\lambda_1,\dots,\lambda_n$ are the eigenvalues. In particular, $Q$ is positive definite (semidefinite) if and only if $\lambda_i > 0$ ($\lambda_i\ge 0$, respectively) for all $i = 1,\dots,n$. The number $\rho(Q) = \max_{i=1,\dots,n}|\lambda_i|$ is called the spectral radius of $Q$.

2. For any symmetric matrix $Q$ the matrix $Q + rI$ is positive semidefinite if $r\ge\rho(Q)$ (Proposition 4.11); in other words, the function $f(x) = \langle x, Qx\rangle + r\|x\|^2$ is convex if $r\ge\rho(Q)$.

3. Any system of quadratic inequalities
$$f_i(x) = \frac12\langle x, Q_ix\rangle + \langle c^i,x\rangle + d_i\le 0\qquad i = 1,\dots,m\qquad(10.1)$$
is equivalent to a single inequality
$$g(x) - r\|x\|^2\le 0,$$
where $g(x)$ is convex and $r\ge\max_{i=1,\dots,m}\frac12\rho(Q_i)$ (Corollary 4.3).

4. The convex envelope of the function $f(x,y) = xy$ on a rectangle $\{(x,y)\in\mathbb{R}^2\mid p\le x\le q,\ r\le y\le s\}$ is the function
$$\max\{rx + py - pr,\ sx + qy - qs\}$$
(Corollary 4.6).

5. A convex minorant of a quadratic function $f(x) = \frac12\langle x, Qx\rangle + \langle c,x\rangle$ on a rectangle $M = \{x\mid p\le x\le q\}$ is
$$\varphi_M(x) = f(x) + r\sum_{i=1}^{n}(x_i - p_i)(x_i - q_i),$$
where $r$ is any positive number satisfying $r\ge\frac12\rho(Q)$. Indeed, the function $f(x) + r\sum_{i=1}^{n} x_i^2$ is convex for $r\ge\frac12\rho(Q)$, while $x\in M$ implies that $\sum_{i=1}^{n}(x_i - p_i)(x_i - q_i)\le 0$.
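A sketch of property 5 in numpy: compute $\rho(Q)$ from the eigenvalues and return the shifted function $\varphi_M$; the indefinite $Q$ at the end is an illustrative assumption.

```python
# Sketch: convex minorant phi_M of f(x) = 0.5 <x, Qx> + <c, x> on a box [p, q],
# obtained with the shift r = 0.5 * rho(Q) as in property 5.
import numpy as np

def convex_minorant(Q, c, p, q):
    r = 0.5 * np.max(np.abs(np.linalg.eigvalsh(Q)))   # 0.5 * spectral radius
    def phi(x):
        f = 0.5 * x @ Q @ x + c @ x
        return f + r * np.dot(x - p, x - q)           # <= f everywhere on the box
    return phi

Q = np.array([[0.0, 2.0], [2.0, -1.0]])               # indefinite example
phi = convex_minorant(Q, np.zeros(2), np.zeros(2), np.ones(2))
```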
We next mention some less known properties which are important in the computational study of quadratic problems. Recall that the rank of a quadratic form is equal to the rank of its matrix.

Proposition 10.1 Let $Q$ be an $n\times n$ matrix of rank $k$. Then
$$Q = \sum_{i=1}^{k}(a^i)(b^i)^T,\quad\text{i.e.,}\quad \langle x, Qy\rangle = \sum_{i=1}^{k}\langle a^i,x\rangle\langle b^i,y\rangle,$$
where $a^i, b^i$ are $n\times 1$ vectors.

Proof Let $Q = (a^1,\dots,a^n)$ with $a^1,\dots,a^k$ linearly independent. Then every $a^j$ is a linear combination of $a^1,\dots,a^k$, i.e., $a^j = \sum_{i=1}^{k} b_{ij}a^i$ ($j = 1,\dots,n$), for some $b_{ij}\in\mathbb{R}$, $i = 1,\dots,k$. Hence, letting $b^i = (b_{i1},\dots,b_{in})^T$, we have $Q = \sum_{i=1}^{k}(a^i)(b^i)^T$. □

Consequence A quadratic function of rank $k$ is of the form
$$f(x) = \frac12\sum_{i=1}^{k}\langle a^i,x\rangle\langle b^i,x\rangle + \langle c,x\rangle.\ \square$$
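One way to produce such a decomposition numerically is a truncated SVD; the following sketch verifies $Q = \sum_{i=1}^{k} a^i(b^i)^T$ for a random rank-2 matrix (illustrative data only).

```python
# Sketch of Proposition 10.1 via SVD: any rank-k matrix is a sum of k outer
# products a^i (b^i)^T, so <x, Qy> = sum_i <a^i, x><b^i, y>.
import numpy as np

rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 5))  # rank 2
U, s, Vt = np.linalg.svd(Q)
k = int(np.sum(s > 1e-10))
A = U[:, :k] * s[:k]            # columns a^1, ..., a^k
B = Vt[:k].T                    # columns b^1, ..., b^k
assert np.allclose(Q, A @ B.T)  # Q = sum_i a^i (b^i)^T
```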

Proposition 10.2 Let $f(x): \mathbb{R}^n\to\mathbb{R}$ be a quadratic function, $H\subset\mathbb{R}^n$ an affine manifold. Any local minimizer $\bar x$ of $f(x)$ on $H$ is a global minimizer on $H$. If $f(x)$ is bounded below on $H$ it is convex on $H$.

Proof The first part follows from Proposition 4.18. If a quadratic function $f(x)$ is bounded below on an affine manifold $H$ it must be so on every line $\Gamma\subset H$, hence convex on this line, and consequently convex on the whole $H$. □

Proposition 10.3 A quadratic function $f(x)$ is strictly convex on an affine manifold $H$ if and only if it is coercive on $H$, i.e., $f(x)\to+\infty$ as $x\in H$, $\|x\|\to+\infty$, or equivalently, for any real number $\gamma$ the set $\{x\in H\mid f(x)\le\gamma\}$ is compact.

Proof If a quadratic function $f(x)$ is strictly convex on $H$, then it is strictly convex on every line in $H$, so for any $\gamma\in\mathbb{R}$ the set $\{x\in H\mid f(x)\le\gamma\}$ (which is closed and convex) cannot contain any halfline, hence must be compact, i.e., $f(x)$ is coercive on $H$. Conversely, if a quadratic function $f(x)$ is not strictly convex on $H$, it is not strictly convex on some line $\Gamma\subset H$, so it is concave on $\Gamma$ and hence cannot be coercive. □

10.2 The S-Lemma

Quadratic optimization with a single quadratic constraint began to be studied as early as the middle of the last century in connection with nonlinear control theory. A famous result obtained by Yakubovich in those days and named by him the S-Lemma (Yakubovich 1977) turned out to play a crucial role in the subsequent development of quadratic optimization.
There are numerous proofs and extensions of the S-Lemma. Below, following Tuy and Tuan (2013), we present an extension and a proof of this proposition based on a special minimax theorem which is more suitable for studying quadratic inequalities than classical minimax theorems.
Theorem 10.1 Let $C$ be a closed subset of $\mathbb{R}^n$, $D$ an interval (closed or open) of $\mathbb{R}$, and $L(x,y): \mathbb{R}^n\times\mathbb{R}\to\mathbb{R}$ a continuous function. Assume the following conditions:
(i) for every $x\in\mathbb{R}^n$ the function $L(x,\cdot)$ is concave;
(ii) for every $y\in D$ any local minimizer of $L(\cdot,y)$ on $C$ is a global minimizer of $L(\cdot,y)$ on $C$;
(iii) there exists $y^*\in D$ such that $L(x,y^*)\to+\infty$ as $x\in C$, $\|x\|\to+\infty$.
Then we have the minimax equality
$$\inf_{x\in C}\sup_{y\in D} L(x,y) = \sup_{y\in D}\inf_{x\in C} L(x,y).\qquad(10.2)$$
Furthermore, if $\inf_{x\in C}\sup_{y\in D} L(x,y) < +\infty$ there exists $\bar x\in C$ satisfying
$$\sup_{y\in D} L(\bar x, y) = \min_{x\in C}\sup_{y\in D} L(x,y).\qquad(10.3)$$

Proof Define $\alpha := \inf_{x\in C}\sup_{y\in D} L(x,y)$, $\beta := \sup_{y\in D}\inf_{x\in C} L(x,y)$. We can assume $\beta < +\infty$, otherwise the equality (10.2) is trivial. Take an arbitrary real number $\eta > \beta$ and for every $y\in D$ let $C_\eta(y) := \{x\in C\mid L(x,y)\le\eta\}$. We first prove that for any segment $\Delta = [a,b]\subset D$ the following holds:
$$\cap_{y\in\Delta} C_\eta(y)\ne\emptyset.\qquad(10.4)$$
It can always be assumed that $y^*\in(a,b)$. Since $\inf_{x\in C} L(x,y^*)\le\beta < \eta$ there exists $\bar x\in C$ satisfying $L(\bar x, y^*) < \eta$, and by continuity there exist $u, v$ such that $u < y^* < v$ and $L(\bar x, y) < \eta$ $\forall y\in[u,v]$. Therefore, setting
$$s = \sup\{y\in[u,b]\mid \cap_{u\le z\le y} C_\eta(z)\ne\emptyset\},\qquad(10.5)$$
we have $s\ge v$. By definition of $s$ there exists a sequence $y_k\nearrow s$ such that $C_k := \cap_{u\le y\le y_k} C_\eta(y)\ne\emptyset$. In view of assumption (iii) $C_\eta(y^*)$ is compact, so $C_k$, $k = 1,2,\dots$, form a nested sequence of nonempty closed subsets of the compact set $C_\eta(y^*)$. Therefore, there exists $x^0\in\cap_{k=1}^{\infty} C_k$, i.e., satisfying $L(x^0,y)\le\eta$ $\forall y\in[u,s)$. Since $L(x^0, y_k)\le\eta$ $\forall k$, letting $k\to+\infty$ yields $L(x^0,s)\le\eta$. So $L(x^0,y)\le\eta$ $\forall y\in[u,s]$. We contend that $s = b$. Indeed, if $s < b$ we cannot have $L(x^0,s) < \eta$ for then by continuity there would exist $q > s$ satisfying $L(x^0,y) < \eta$ $\forall y\in[s,q)$, conflicting with (10.5). Therefore, if $s < b$ then necessarily $L(x^0,s) = \eta$. Furthermore, for any ball $W$ centered at $x^0$ we cannot have $\eta = \min_{x\in C\cap W} L(x,s)$ for then assumption (ii) would imply that $\eta = L(x^0,s) = \min_{x\in C} L(x,s)$, conflicting with $\eta > \inf_{x\in C} L(x,s)$. So there exists $x^k\in C$ such that $x^k\to x^0$ and $L(x^k,s) < \eta$. If for some $y\in(s,b]$ and some $k$ we had $L(x^k,y) > \eta$, then, since $L(x^k,s) < \eta$, assumption (i) would imply that $L(x^k,z) < \eta$ $\forall z\in[u,s]$ and, moreover, by continuity $L(x^k,z) < \eta$ $\forall z\in[s,q]$ for some $q > s$, conflicting with (10.5). Therefore, $L(x^k,y)\le\eta$ $\forall k$; hence letting $x^k\to x^0$ yields $L(x^0,y)\le\eta$ $\forall y\in(s,b]$. This proves that $s = b$ and so
$$\cap_{u\le y\le b} C_\eta(y)\ne\emptyset.\qquad(10.6)$$


In an analogous manner, setting
$$t = \inf\{y\in[a,b]\mid \cap_{y\le z\le b} C_\eta(z)\ne\emptyset\},$$
we can show that $t = a$, i.e., $\cap_{a\le y\le b} C_\eta(y)\ne\emptyset$, proving (10.4).

It is now easy to complete the proof. By assumption (iii) the set $C_\eta(y^*)$ is compact. Since any finite set $E\subset D$ is contained in some segment $\Delta$ it follows from (10.4) that the family $C_\eta(y)\cap C_\eta(y^*)$, $y\in D$, has the finite intersection property. So there exists $x\in C$ satisfying $L(x,y)\le\eta$ $\forall y\in D$. Taking $\eta = \beta + 1/k$ we see that for each $k = 1,2,\dots$ there exists $x^k\in C$ satisfying $L(x^k,y)\le\beta + 1/k$ $\forall y\in D$. Since, in particular, $L(x^k,y^*)\le\beta + 1/k < \beta + 1$, i.e., $x^k\in C_{\beta+1}(y^*)$ $\forall k$, while the set $C_{\beta+1}(y^*)$ is compact, the sequence $\{x^k\}$ has a cluster point $\bar x\in C$ satisfying $L(\bar x,y)\le\beta$ $\forall y\in D$. Noting that $\alpha\ge\beta$ always holds, this implies (10.2) and (10.3). □
In the above theorem $D$ is an interval of $\mathbb{R}$. The following proposition deals with the case $D\subset\mathbb{R}^m$ with $m\ge 1$.

Corollary 10.1 Let $C$ be a closed subset of $\mathbb{R}^n$, $D$ a convex subset of $\mathbb{R}^m$, and $L(x,y): \mathbb{R}^n\times D\to\mathbb{R}$ be a continuous function. Assume the following conditions:
(i) for every $x\in\mathbb{R}^n$ the function $L(x,\cdot)$ is affine;
(ii) for every $y\in D$ any local minimizer of $L(x,y)$ on $C$ is a global minimizer of $L(x,y)$ on $C$;
(iii*) there exists $y^*\in D$ satisfying $L(x,y^*)\to+\infty$ as $x\in C$, $\|x\|\to+\infty$, and such that, moreover,
$$\inf_{x\in C} L(x,y^*) = \max_{y\in D}\inf_{x\in C} L(x,y)\ \ \&\ \ L(x^*,y^*) = \min_{x\in C} L(x,y^*)\qquad(10.7)$$
for a unique $x^*\in C$.
Then we have the minimax equality
$$\min_{x\in C}\sup_{y\in D} L(x,y) = \max_{y\in D}\inf_{x\in C} L(x,y).\qquad(10.8)$$

Proof Consider an arbitrary >


WD supy2D infx2C L.x; y/ and for every y 2 D let
C .y/ WD fx 2 Cj L.x; y/  g: By assumption (iii*) infx2C L.x; y / D
; hence
C
.y / D fx 2 Cj L.x; y / D
g D fx g:
For every y 2 D let Dy WD y ; y D f.1  t/y C tyj 0  t  1g  R and
Ly .x; t/ WD L.x; .1  t/y C ty/; which is a function on C  Dy : It can easily be
verified that all conditions of Theorem 1 are satisfied for C; Dy and Ly .x; t/: Since
>
 supt2Dy infx2C L.x; t/; it follows that

; C .y / \ C .y/  C .y /  C:


342 10 Nonconvex Quadratic Programming

Furthermore, since by assumption L.x; y / ! C1 as x 2 C; kxk ! C1; the


set C .y / is compact and so is C .y / \ C .y/ because C .y/ is closed. Letting
!
C then yields

; C
.y / \ C
.y/  C
.y / D fx g:

Thus,

L.x ; y/ 
8y 2 D

and hence,

inf sup L.x; y/ 

x2C y2D

completing the proof. t


u
Remark 10.1 (a) When C is an affine manifold and L.x; y/ is a quadratic function
in x it is superfluous to require the uniqueness of x in assumption (iii*). In fact,
in that case the uniqueness of x follows from the condition L.x; y / ! C1 as
x 2 C; kxk ! C1 (which implies that x 7! L.x; y / is strictly convex on the
affine manifold C/:
(b) Conditions (iii*) is satisfied, provided there exists a unique pair .x ; y / 2 CD
such that
L.x ; y / D max min L.x; y/ and L.x; y / ! C1 as x 2 C; kxk ! C1:
y2D x2C
Indeed, let y satisfy min L.x; y/ D maxy2D minx2C L.x; y/; and L.x; y/ ! C1
x2C
as x 2 C; kxk ! C1: Since L.:; y/ is convex, there is x satisfying L.x; y/ D
minx2C L.x; y/: Then L.x; y/ D maxy2D minx2C L.x; y/; so .x ; y / D .x; y/ by
uniqueness condition. Hence minx2C L.x; y / D maxy2D minx2C L.x; y/; i.e., (iii*)
holds.
Assumption (ii) in Theorem 10.1 or Corollary 10.1 is obviously satisfied if L.:; y/ is
a convex function and C is a convex set (then Theorem 10.1 reduces to a classical
minimax theorem). However, as will be seen shortly, this assumption is also satisfied
in many other cases of interest in quadratic optimization.
Corollary 10.2 Let f .x/ and g.x/ be quadratic functions on Rn ; L.x; y/ D f .x/ C
yg.x/ for x 2 Rn ; y 2 R. Then for every line  in Rn there holds the minimax
equality

inf sup L.x; y/ D sup inf L.x; y/: (10.9)


x2 y2R y2RC x2
C

If g.x/ is not affine on  there also holds the minimax equality

inf sup L.x; y/ D sup inf L.x; y/: (10.10)


x2 y2R y2R x2
10.2 The S-Lemma 343

Proof If f .x/ and g.x/ are both concave on  and one of them is not affine then
clearly infff .x/j g.x/  0; x 2  g D 1; hence infx2 supy2RC L.x; y/ D
infff .x/j g.x/  0; x 2  g D 1; and (10.9) is trivial. If f .x/ and g.x/ are
both affine on ; (10.9) is a consequence of linear programming duality. Otherwise,
either f .x/ or g.x/ is strictly convex on  . Then there exists y 2 RC such
that f .x/ C y g.x/ is strictly convex on ; hence such that L.x; y / ! C1 as
x 2 ; kxk ! C1: Since by Proposition 10.2 condition (ii) in Theorem 10.1 is
satisfied, (10.9) follows by Theorem 10.1 with C D ; D D RC :
If g.x/ is not affine on ; there always is y 2 R such that f .x/ C y g.x/ is strictly
convex on  (y > 0 when g.x/ is strictly convex on  and y < 0 when g.x/ is
strictly concave on  ). Therefore all conditions of Theorem 10.1 are satisfied again
for C D  and D D R and (10.10) follows. t
u
We can now state a generalized version of the S-Lemma which is in fact valid for
arbitrary lower semi-continuous functions f .x/ and g.x/; not necessarily quadratic.
Theorem 10.2 (Generalized S-Lemma (Tuy and Tuan 2013)) Let f .x/ and g.x/
be quadratic, functions on Rn ; W a closed subset of Rn ; D a closed subset of R and
L.x; y/ D f .x/ C yg.x/ for y 2 R: Assume WD infx2W supy2D L.x; y/ > 1 and
the following conditions are satisfied:
(i) Either : D D RC and there exists x 2 W such that g.x / < 0
Or : D D R and there exist a; b 2 W such that g.a/ < 0 < g.b/:
(ii) For any two points a; b 2 W such that g.a/ < 0 < g.b/ we have
sup inf L.x; y/  :
y2D x2fa;bg

Then there exists y 2 D satisfying

inf L.x; y/ D ; (10.11)


x2W

and so infx2W supy2D L.x; y/ D maxy2D infx2W L.x; y/:


Proof We first show that for any real < and any finite set E  W there exists
y 2 D satisfying

inf L.x; y/  : (10.12)


x2E

Since x 2 E  W; g.x/ D 0 implies L.x; y/ D f .x/ C yg.x/ D f .x/  > 8y 2


D, it suffices to prove (10.12) for any finite set E  fx 2 Wj g.x/ 0g: Then
E D E1 [ E2 where E1 WD fx 2 Ej g.x/ < 0g; E2 D fx 2 Ej g.x/ > 0g: For every
x 2 E define .x/ WD f .x/
g.x/
: Clearly
for every x 2 E1 W .x/ > 0 and L.x; y/  , y  .x/ ];
for every x 2 E2 W L.x; y/  , y  .x/:
344 10 Nonconvex Quadratic Programming

Consequently, if E1 ; then minx2E1 .x/ > 0 and we have L.x; y/  8x 2 E1


when 0  y  minx2E1 .x/I if E2 ;, then we have L.x; y/  8x 2 E2
when y  maxx2E2 .x/: Further, if Ei ;; i D 1; 2; then by assumption (ii)
with a 2 argminx2E1 .x/; b 2 argmaxx2E2 .x/ there exists y 2 D satisfying
infx2fa;bg L.x; y/  ; i.e., such that

max .x/ D .b/  y  .a/ D min .x/;


x2E2 x2E1

hence

L.x; y/  8x 2 E2 ; L.x; y/  8x 2 E1 :

Thus in any case for any finite set E  W there exists y 2 D satisfying (10.12).
Setting D.x/ WD fy 2 Dj L.x; y/  g we then have \x2E D.x/ ; for any finite
set E  W: In other words, the collection of closed sets D.x/; x 2 W; has the finite
intersection property. If assumption (i) holds then
either D D RC and f .x / C yg.x / ! 1 as y ! C1; so the set D.x / D
fy 2 RC j L.x ; y/  g is compact
or D D R and f .a/ C yg.a/ ! 1 as y ! C1; f .b/ C yg.b/ ! 1 as
y ! 1:
So the set D.a/ D fy 2 Rj L.a; y/  g is bounded above, while D.b/ is bounded
below; hence the set D.a/ \ D.b/ is compact.
Consequently, \x2W D.x/ ;: For every k D  1=k; k D 1; 2; : : : there
exists yk 2 \x2W fy 2 Dj L.x; y/  k g; i.e., yk 2 D such that L.x; yk /  k 8x 2 W:
Obviously every yk is contained in the compact set \x2W fy 2 Dj L.x; y/  1 g;
so there exists y 2 D such that, up to a subsequence yk ! y .k ! C1/: Then
L.x; y/  8x 2 W; hence, infx2W L.x; y/ D : This proves (10.11). t
u
Theorem 10.2 yields the most general version of S-Lemma. In fact, if W is an
affine manifold in Rn and D D RC condition (ii) is automatically fulfilled because
for every line  connecting a and b in W we have, by Corollary 10.2,

sup inf L.x; y/  sup inf L.x; y/ D inf sup L.x; y/


y2RC x2fa;bg y2RC x2 x2 y2R
C

 inf sup L.x; y/ D :


x2W y2R
C

Also using Corollary 10.2, condition (ii) is automatically fulfilled if W is an affine


manifold, D D R and g.x/ is a quadratic function g.x/ D hx; Q1 xiChc1 ; xiCd1 with
either Q1 nonsingular or with c1 D 0 and d1 D 0 (g.x/ is homogeneous). Indeed,
either of the latter conditions implies that g.x/ is not affine on any line  in W and
hence, by Corollary 10.2,

sup inf L.x; y/  sup inf L.x; y/ D inf sup L.x; y/  inf sup L.x; y/ D :
y2R x2fa;bg y2R x2 x2 y2R x2W y2R
10.2 The S-Lemma 345

Corollary 10.3 Let f .x/ and g.x/ be two quadratic functions on Rn :


(i) Let H be an affine manifold in Rn and assume there exists x 2 H such that
g.x / < 0: Then the system x 2 H; f .x/ < 0; g.x/  0 is inconsistent if and
only if there exists y 2 RC such that f .x/ C yg.x/  0 8x 2 H:
(ii) Let H be an affine manifold in Rn and assume that g.x/ is either homogeneous
or g.x/ D hx; Q1 xi C hc1 ; xi C d1 with Q1 nonsingular and that, moreover, g.x/
takes both positive and negative values on H: Then the system x 2 H; f .x/ <
0; g.x/ D 0 is inconsistent if and only if there exists y 2 R such that f .x/ C
yg.x/  0 8x 2 H:
(iii) Let H be a subspace of Rn and assume both f .x/; g.x/ are homogeneous. Then
the system x 2 H n f0g; f .x/  0; g.x/  0 is inconsistent if and only if there
exists .y1 ; y2 / 2 R2C such that y1 f .x/ C y2 g.x/ > 0 8x 2 H n f0g:
Proof Since the if part in each of the three assertions is obvious, it suffices to
prove the only if part. Let L.x; y/ D f .x/ C yg.x/:
(i) This is a special case of Theorem 10.2 with D D RC ; W D H: In fact,

WD infff .x/j g.x/  0; x 2 Hg D inf sup L.x; y/;


x2H y2R
C

so inconsistency of the system fx 2 H; f .x/ < 0; g.x/  0g means  0.


Therefore, by Theorem 10.2 there exists y 2 RC such that f .x/ C yg.x/  
0; 8x 2 H:
(ii) This is a special case of Theorem 10.2 with D D R; W D H: In fact,
inconsistency of the system fx 2 H; f .x/  0; g.x/ D 0g means WD
infff .x/j g.x/ D 0; x 2 Hg D infx2H supy2R L.x; y/  0:
(iii) With H being a subspace and both f .x/; g.x/ homogeneous, the system fx 2
H n f0g; f .x/  0; g.x/  0g is inconsistent only if so is the system fx 2
H n f0g; f .x/=hx; xi  0; g.x/=hx; xi  0g:
p
Setting z D x= hx; xi we have the implication: z 2 H; hz; zi D 1; g.z/ 
0 ) f .z/ > 0: Since the set fx 2 Hj hx; xi D 1g is compact it follows that 1 WD
infff .z/j g.z/  0; hz; zi D 1g > 0 and noting that f .0/ D g.0/ D 0 we can write
infff .x/  1 hx; xij g.x/  0; x 2 Hg  0: If there exists x 2 H satisfying g.x/ < 0
then by (i) there exists y2 2 RC satisfying f .x/  1 hx; xi C y2 g.x/  8x 2 H;
hence f .x/ C y2 g.x/ > 0 8x 2 H n f0g:
Similarly, with the roles of f .x/ and g.x/ interchanged, if there exists x 2 H
satisfying f .x/ < 0; then there exists y1 2 RC satisfying y1 f .x/ C g.x/  0 8x 2
H n f0g:
Finally, in the case g.x/  0 8x 2 H and f .x/  0 8x 2 H; since there cannot
exist x 2 H n f0g such that f .x/ D 0 and g.x/ D 0 we must have f .x/ C g.x/ >
0 8x 2 H n f0g; completing the proof. t
u
Assertion (i) of Corollary 10.3 is a result proved in Polik and Terlaky (2007)
and later termed the generalized S-Lemma in Jeyakumar et al. (2009). Assertion
(ii) for homogeneous g.x/ is the S-Lemma with equality constraint while Assertion
346 10 Nonconvex Quadratic Programming

(iii) is the S-Lemma with nonstrict inequalities. So Theorem 10.2 is indeed the
most general version of the S-Lemma. Condition (i) is a kind of generalized Slater
condition and the proof given above is a simple elementary proof for the S-Lemma
in its full generality.
Remark 10.2 By Proposition 10.2 the function f .x/ C yg.x/ in assertions (i) and
(ii) of the above corollary is convex on H:

10.3 Single Constraint Quadratic Optimization

Let H be an affine manifold in Rn ; and f .x/; g.x/ two quadratic functions on Rn


defined by

1 1
f .x/ WD hx; Q0 xi C hc0 ; xi; g.x/ D hx; Q1 xi C hc1 ; xi C d1 ; (10.13)
2 2

where Qi ; i D 0; 1; are symmetric n  n matrices and ci 2 Rn ; d1 2 R: Consider the


quadratic optimization problems:

.QPA/ minff .x/j x 2 H; g.x/  0g;

.QPB/ minff .x/j x 2 H; g.x/ D 0g:

Define D D RC ; G D fx 2 Rn j g.x/  0g for .QPA/, and D D R; G D fx 2


Rn j g.x/ D 0g for .QPB/ and let L.x; y/ D f .x/ C yg.x/; where x 2 Rn ; y 2 R:
Clearly

f .x/ if x 2 G
sup L.x; y/ D
y2D C1 otherwise

so the optimal value of problem .QPA/ or .QPB/ is

v.QPA/ D inf sup L.x; y/; Q v.QPB/ D inf sup L.x; y/:
x2H y2R x2H y2R
C

Strong duality is said to hold for .QPA/ or .QPB/ if

v.QPA/ D sup inf L.x; y/ .case D D RC / (10.14)


y2RC x2H

v.QPB/ D sup inf L.x; y/ .case D D R/: (10.15)


y2R x2H

The problem on the right-hand side of (10.14) or (10.15) is called the dual
problem of .QPA/ or .QPB/, respectively. By Proposition 10.2, if infx2H L.x; y/ >
1 the function L.:; y/ is convex on H: Therefore, with the usual convention
10.3 Single Constraint Quadratic Optimization 347

inf ; D 1; the supremum in the dual problem can be restricted to those y 2 D
for which the function L.:; y/ is convex on H: As will be seen shortly (Remark 10.4
below), in view of this fact many properties are obvious which otherwise have been
proved sometimes by unnecessarily involved arguments. Actually, it is this hidden
convexity that is responsible for the striking analogy between certain properties of
quadratic inequalities and corresponding ones of convex inequalities.
To investigate conditions guaranteeing strong duality we introduce the following
assumptions:
(S) There exists x 2 H satisfying g.x / < 0:
(T) There exists y 2 D such that L.x; y / D f .x/ C y g.x/ is strictly convex on H:
Theorem 10.3 (Duality Theorem)
(i) If assumption (S) is satisfied then (10.14) holds and if, furthermore, v.QPA/ >
1 (which is the case if assumption (T), too, is satisfied with D D RC ), then
there exists y 2 RC such that L.x; y/ is convex on H and

v.QPA/ D min L.x; y/ D max inf L.x; y/: (10.16)


x2H y2RC x2H

A point x 2 H is then an optimal solution of .QPA/ if and only if it satisfies

hrf .x/ C yrg.x/; ui D 0 8u 2 E; yg.x/ D 0; g.x/  0; (10.17)

where E is the subspace parallel to the affine manifold H:


(ii) If H is a subspace and v.QPB/ > 1 while
either g.x/ is homogeneous and takes both positive and negative values
on H;
or g.x/ and f .x/ are both homogeneous,
then (10.15) holds and there exists y 2 R such that L.x; y/ is convex on H and

v.QPB/ D min L.x; y/ D max inf L.x; y/: (10.18)


x2H y2R x2H

A point x 2 H is then an optimal solution of .QPB/ if and only if it satisfies

hrf .x/ C yrg.x/; ui D 0 8u 2 H; g.x/ D 0: (10.19)

(iii) If assumption (T) is satisfied then (10.15) holds, where the supremum can be
restricted to those y 2 D for which L.:; y/ is convex on H: If, furthermore, the
dual problem has an optimal solution y 2 D then x 2 H is an optimal solution
of the primal problem if and only if

hrf .x/ C yrg.x/; ui D 0 8u 2 E; (10.20)

where E is the subspace parallel to H:


348 10 Nonconvex Quadratic Programming

Proof (i) The existence of y follows from Corollary 10.3, (i). Then x is an optimal
solution of .QPA/ if and only if

max L.x; y/  L.x; y/  min L.x; y/;


y2RC x2H

hence, if and only if conditions (10.17) hold by noting that L.:; y/ is convex
on H:
(ii) When g.x/ is homogeneous and takes both positive and negative values on
the subspace H the existence of y follows from Corollary 10.3, (ii). When
both f .x/; g.x/ are homogeneous, if g.x/  0 8x 2 H then infff .x/j g.x/ 
0; x 2 Hg D v.QPB/, so by Corollary 10.3, (iii), v.QPB/ > 1 implies the
existence of .y1 ; y2 / 2 RC ; such that y1 f .x/ C y2 g.x/ > 0 8x 2 H n f0g:
If y1 D 0 then y2 g.x/ > 0 8x 2 H n f0g; hence g.x/  0 8x 2 H;
hence g.x/ D 0 8x 2 H; a contradiction. Therefore, y1 > 0 and (10.18)
holds with y D y2 =y1 2 RC : Analogously, if g.x/  0 8x 2 H, then
infff .x/j  g.x/  0; x 2 Hg D v.QPB/; so (10.18) holds for some y 2 R :
Since L.x; y/ is convex on H, conditions (10.19) are necessary and sufficient
for x 2 H to be an optimal solution.
(iii) If assumption (T) is satisfied the conclusion follows from Theorem 10.1 where
C D H and D D R for .QPB/: If, in addition to (T), the dual problem has an
optimal solution y 2 D then x 2 H solves the primal problem if and only if it
solves the convex optimization problem minx2H L.x; y/; and hence, if and only
if it satisfies (10.20). t
u
Remark 10.3 In the usual case H D Rn it is well known that by setting Q.y/ D
Q0 C yQ1 ; c.y/ D c0 C yc1 ; d.y/ D yd1 the dual problem of .QPA/ can be written
as the SDP
 

Q.y/ c.y/
max tj 0; y 2 D : (10.21)
c.y/T 2.d.y/  t/

In the general case H Rn ; as noticed above the supremum in (10.14) can be


restricted to the set of those y 2 RC for which L.x; y/ is a convex function on
H; i.e., infx2H L.x; y/ is a convex problem. Since the problem (10.14) amounts to
maximizing the concave function '.y/ D infx2H L.x; y/ on RC ; and the value of
'.y/ at each y is obtained by solving a convex problem, it is easy to solve the dual
problem by convex programming methods.
Corollary 10.4 In .QPA/ or .QPB/ if f .x/ or g.x/ is strictly convex on Rn then
strong duality holds.
Proof If f .x/ or g.x/ is strictly convex on Rn there exists y 2 RC such that f .x/ C
y g.x/ is strictly convex on Rn ; and hence, by Proposition 10.2, coercive on H: So
the function L.x; y/ D f .x/ C yg.x/ satisfies all conditions of Theorem 10.1 with
C D H; D D RC (for .QPA/) or D D R (for .QPB/). By this theorem, strong
duality follows. t
u
10.3 Single Constraint Quadratic Optimization 349

As a consequence, quadratic minimization over an ellipsoid, i.e., the problem

minff .x/j hQx; xi  r2 ; x 2 Rn g (10.22)

with Q 0 (positive definite) can be solved efficiently by solving a SDP which is


its dual. This is in contrast with linearly constrained quadratic minimization which
has been long known to be NP-hard (Sahni 1974).
Remark 10.4 As a simple example illustrating the relevance of condition (T),
consider the problem minfx41 Cax21 Cbx1 g: Upon the substitution x21 ! x2 ; it becomes
a quadratic minimization problem of two variables x1 ; x2 :

minfx22 C ax2 C bx1 j x21  x2 D 0g: (10.23)

Since .x22 C ax2 C bx1 / C .x21  x2 / D x1 .x1 C b/ C x2 .x2 C a  1/ ! C1 as


k.x1 ; x2 /k ! C1 condition (T) is satisfied. Hence, by Theorem 10.3, (iii), (10.23)
is equivalent to a convex problema result proved by an involved argument in Shor
and Stetsenko (1989). Incidentally, note that the convex hull of the nonconvex set

f.1; x1 ; x21 ; : : : ; xn1 /T j x1 2 a; bg  RnC1

is exactly described by SDP constraints (Hoang et al. 2008), so for any n-order
univariate polynomial Pn .x1 / the optimization problem minx1 2a;b Pn .x1 / can be
fully characterized by means of SDPs.
Corollary 10.5 In .QPA/ where H D Rn assume condition (S). Then x 2 Rn solves
.QPA/ if and only if there exists y 2 RC satisfying

Q0 C yQ1 0 (10.24)
0 1 0 1
.Q C yQ /x C c C yc D 0 (10.25)
hQ1 x C 2c1 ; xi C 2d1  0: (10.26)

When Q0 is not positive definite, the last condition is replaced by

hQ1 x C 2c1 ; xi C 2d1 D 0 (10.27)

Proof By Theorem 10.3, (i), x is an optimal solution of .QPA/ if and only if there
exists y 2 RC such that L.x; y/ D f .x/ C yg.x/ is convex and x is a minimizer of
this convex function. That is, if and only if there exists y 2 RC satisfying (10.24)
and (10.17), i.e., (10.24)(10.26).
If the inequality (10.26) holds strictly, i.e.,

hQ1 x C 2c1 ; xi C 2d1 < 0


350 10 Nonconvex Quadratic Programming

then g.x/ < 0; hence g.x/  0 for all x in some neighborhood of x: Then x is a local
minimizer, hence a global minimizer, of f .x/ over Rn ; which happens only if Q0 is
positive definite. t
u
Corollary 10.6 With f ; g; H defined as in (10.13) above, assume that g.x/ is
homogeneous and takes both positive and negative values on H: Then

x 2 H; g.x/ D 0 ) f .x/  0 (10.28)

if and only if there exists y 2 R such that f .x/ C yg.x/  0 8x 2 H:


Proof This is a proposition often referred to as the S-Lemma with equality con-
straint. By Corollary 10.3, (ii), it extends to the case when g.x/ D hx; Q1 xi C
hc1 ; xiCd1 with Q1 nonsingular and, moreover, g.x/ takes both positive and negative
values on H: t
u
Note that when H D Rn and f .x/; g.x/ are homogeneous with associated matrices
Q and Q1 ; g.x/ takes on both positive and negative values if Q1 is indefinite, while
0

f .x/ C yg.x/  0 8x 2 Rn if and only if Q0 C yQ1 0: So in this special case


Corollary 10.6 gives the following nonstrict Finsler Theorem (Finsler 1937):
If Q1 is indefinite then hQ1 x; xi D 0 implies hQ0 x; xi  0 if and only if Q0 CyQ1
0 for some y 2 R:
Also note:
Corollary 10.7 (Strict Finslers Theorem (Finsler 1937)) With f ; g homogeneous
and H D Rn we have

x 0; g.x/ D 0 ) f .x/ > 0 (10.29)

if and only if there exists y 2 R such that f .x/ C yg.x/ > 0 8x 0:


Proof This follows directly from Corollary 10.3, (iii) (S-Lemma with nonstrict
inequalities). Since f .x/ and g.x/ are both homogeneous, there exists .y1 ; y2 / 2 R2C
such that y1 f .x/ C y2 g.x/ > 0 8x 2 H n f0g: From (10.29) we must have y1 > 0; so
y D y2 =y1 : t
u
Corollary 10.8 Let f ; g; H be defined as in (10.13) above. Then just one of the
following alternatives holds:
(i) There exists x 2 H satisfying f .x/ < 0; g.x/ < 0I
(ii) There exists y D .y1 ; y2 / 2 R2C n f.0; 0/g satisfying

y1 f .x/ C y2 g.x/  0 8x 2 H:

Proof Clearly (i) and (ii) cannot hold simultaneously. If (i) does not hold, two cases
are possible: 1) f .x/  0 for every x 2 H such that g.x/ < 0I 2) g.x/  0 for every
x 2 H such that f .x/ < 0: In case (1), if 0 is a local minimum of g.x/ on H; then
g.x/  0 8x 2 H; so y1 f .x/ C y2 g.x/  0 8x 2 H with y1 D 0; y2 D 1I if, on
10.3 Single Constraint Quadratic Optimization 351

the contrary, 0 is not a local minimum of g.x/ on H; then every x 2 H satisfying


g.x/ D 0 is the limit of a sequence xk 2 H satisfying g.xk / < 0; k D 1; 2; : : : ;
so infff .x/j g.x/  0; x 2 Hg D infff .x/j g.x/ < 0; x 2 Hg  0; hence by
Theorem 10.3, (i), there exists y2 2 RC such that f .x/ C y2 g.x/  0 8x 2 H: The
proof in case (2) is similar, with the roles of f .x/; g.x/ interchanged. t
u
Corollary 10.9 Let f ; g; H be defined as above, let A; B be two closed subsets of an
affine manifold H such that A [ B D H: Assume condition (S). If

f .x/  0 8x 2 A; while g.x/  0 8x 2 B (10.30)

then there exists y 2 RC such that f .x/ C yg.x/  0 8x 2 H:


Proof Condition (S) means that H n B ;; hence A ;: Clearly fx 2 Hj g.x/ <
0g  A: If x0 2 H satisfies g.x0 / D 0; then for every k there exists xk 2 H satisfying
kxk  x0 k  1=k; g.xk / < 0 for otherwise x0 would be a local minimum of g.x/ on
H; hence a global minimum on H; i.e., g.x/  g.x0 / D 0 8x 2 H; contradicting (S).
Therefore, fx 2 Hj g.x/  0g D clfx 2 Hj g.x/ < 0g  A: Assumption (10.30) then
implies f .x/  0 for every x 2 H such that g.x/  0; hence infff .x/j g.x/  0g  0;
and the existence of y follows from Theorem 10.3, (i). t
u
The above results, and especially the next one, are very similar to corresponding
properties of convex inequalities.
Corollary 10.10 (i) With f .x/; g.x/; H as in (10.13) the set C D ft 2 R2 j 9x 2
H s.t. f .x/  t1 ; g.x/  t2 g is convex.
(ii) With f .x/; g.x/ homogeneous and H a subspace, the set G D ft 2 R2 j 9x 2
H s.t. f .x/ D t1 ; g.x/ D t2 g is convex.
Proof (i) 1 We first show that for any two points a; b 2 H the set Cab WD ft 2
R2 j 9x 2 a; b s.t. f .x/  t1 ; g.x/  t2 g is closed and convex. Let tk 2 Cab ; k D
1; 2; : : : ; be a sequence converging to t: For each k let xk 2 a; b satisfy f .xk / 
t1k ; g.xk /  t2k : Then xk ! x 2 a; b (up to a subsequence) and by continuity
of f ; g we have f .x/  t1 ; g.x/  t2 ; so t 2 Cab : Therefore, Cab is a closed set.
Further, consider any point t D .t1 ; t2 / 2 R2 n Cab : Since Cab is a closed set
there exists a point t0 Cab such that t10 > t1 ; t20 > t2 : Then

6 9x 2 a; b f .x/ < t10 ; g.x/ < t20 :

By Corollary 10.8 there exists .y1 ; y2 / 2 R2C n f0; 0g satisfying y1 .f .x/  t10 / C
y2 .g.x/  t20 /  0 8x 2 a; b: Setting Lt WD fs 2 R2 j y1 s1 C y2 s2  y1 t10 C y2 t20 g
it is easily checked that Cab  Lt while t Lt : So for every t Cab there exists
a halfplane Lt separating t from Cab : Consequently, C D \tC Lt ; proving the
convexity of Cab :

1
This proof corrects a flaw in the original proof given in Tuy and Tuan (2013).
352 10 Nonconvex Quadratic Programming

Now if t; s are any two points of C then there exist a; b 2 H such that

f .a/  t1 ; g.a/  t2 ; f .b/  s1 ; g.b/  s2 :

This means t; s 2 Cab ; and hence t; s  Cab  C: Thus t; s 2 C ) t; s  C;


proving the convexity of C:
(ii) Turning to the set G with f .x/; g.x/ homogeneous and H a subspace, observe
that if there exists x 2 H satisfying a1 g.x/  a2 f .x/ D 0; a1 f .x/ > 0; then
f .x/ 2 2
a1
f .x/
D fa1.x/ 2 > 0; so setting f .x/ D r yields a1 D r f .x/ D f .rx/ and a2 D
a1

a1 2
f .x/ g.x/ D r g.x/ D g.rx/; hence a 2 G because rx 2 H: Therefore, if a G
then minfa1 f .x/j a1 g.x/ a2 f .x/ D 0; x 2 Hg D 0: By Corollary 10.3, (ii),
there exists y 2 R satisfying a1 f .x/ C y.a1 g.x/  a2 f .x//  0 8x 2 H: Then
G  La WD ft 2 R2 j  a1 t1 C y.a1 t2  a2 t1 /  0g; while a La (because
a21 < 0/: Thus, every a G can be separated from G by a halfspace, proving
the convexity of the set G: t
u
Assertion (i) of Corollary 10.10 is analogous to a well-known proposition for
convex functions. It has been derived from Corollary 10.3, (i), but the converse
is also true. In fact, suppose for given f .x/; g.x/; H the set C WD ft 2 R2 j 9x 2
H s:t: f .x/  t1 ; g.x/  t2 g is convex. Clearly if the system x 2 H; f .x/ 
0; g.x/  0 is inconsistent then .0; 0/ C; hence, by the separation theorem, there
exists .u; y/ 2 R2C ; such that ut1 Cyt2  0 8.t1 ; t2 / 2 C; i.e., uf .x/Cyg.x/  0 8x 2
H: If, furthermore, there exists x 2 H such that g.x / < 0 then .f .x /; g.x // 2 C;
so uf .x / C yg.x /  0: We cannot have u D 0 for this would imply yg.x /  0;
hence y D 0; conflicting with .u; y/ 0: Therefore u > 0 and setting y D y=u
yields f .x/ C yg.x/  0 8x 2 H: So the fact that C is convex is equivalent to
Corollary 10.3, (i).
Assertion (ii) of Corollary 10.10 (which still holds with H replaced by a regular
cone) includes a result of Dines (1941) which was used by Yakubovich (1977)
in his original proof of the S-Lemma. Since this result has been derived from
Corollary 10.3, (ii), i.e., the S-Lemma with equality constraints, Dines theorem
is actually equivalent to the latter.
It has been long known that the general linearly constrained quadratic pro-
gramming problem is NP-hard (Sahni 1974). Recently, two special variants of this
problem have also been proved to be NP-hard:
1. The linearly constrained quadratic program with just one negative eigenvalue
(Pardalos and Vavasis 1991):

minfx21 C hc; xij Ax  b; x  0g:

2. The linear multiplicative program (Matsui 1996):

minf.hc; xi C /.hd; xi C /j Ax  b; x  0g
10.3 Single Constraint Quadratic Optimization 353

Nevertheless, quite practical algorithms exist which can solve each of the two
mentioned problems in a time usually no longer than the time needed for solving
a few linear programs of the same size (Konno et al. 1997). In this section we
are going to discuss another special quadratic minimization problem which can
also be solved by efficient algorithms (Karmarkar 1990; Ye 1992). This problem
has become widely known due to several interesting features. Referring to its
role in nonlinear programming methods it is sometimes called the Trust Region
Subproblem. It consists of minimizing a nonconvex quadratic function over a
Euclidean ball, i.e.,

1
.TRS/ min hx; Qxi C hc; xij hx; xi  r2 ;
2
(10.31)
where Q is an n  n symmetric matrix and c 2 Rn ; r 2 R: A seemingly more general
formulation, in which the constraint set is an ellipsoid hx; Hxi  r2 (with H being a
positive definite matrix), can be reduced to the above one by means of a nonsingular
linear transformation.
Let u1 ; : : : ; un be a full set of orthonormal eigenvectors of Q; and let 1 : : : ; n
be the corresponding eigenvalues. Without loss of generality one can assume that
1 D : : : D k < kC1  : : :  n ; and 1 < 0: (The latter inequality simply
means that the problem is nonconvex.) By Proposition 4.18, the global minimum is
achieved on the sphere hx; xi D r2 ; so the problem is equivalent to

1 2
min hx; Qxi C hc; xij hx; xi D r :
2

Proposition 10.4 A necessary and sufficient condition for x to be a global optimal


solution of .TRS/ is that there exist   0 satisfying

Q C  yI is positive semidefinite (10.32)


  
Qx C  x C c D 0 (10.33)

kx k D r: (10.34)

Proof Clearly .TRS/ is a problem .QPA/ where H D Rn ; Q0 D Q; Q1 D I and


condition .S/ is satisfied, so the result follows from Corollary 10.5. t
u
Remark 10.5 It is interesting to note that when this result first appeared it came
as a surprise. In fact, at the time .TRS/ was believed to be a nonconvex problem,
so the fact that a global optimal solution can be fully characterized by conditions
like (10.32)(10.34) seemed highly unusual. However, as we saw by Theorem 10.3,
x is a global optimal solution of .TRS/ as a .QPA/ if and only if there exists
  0 such that the function 12 hx; Qxi C  hc; xi is convex (which is expressed
by (10.32)) and x is a minimizer of this convex function (which is expressed by the
equalities (10.33)(10.34)standard optimality condition in convex programming).
354 10 Nonconvex Quadratic Programming

Also note that by a result of Martinez (1994) .TRS/ has at most one local nonglobal
optimal solution. No wonder that good local methods such as the DCA method (Tao
1997; Tao and Hoai-An (1998) and the references therein) can often give actually a
global optimal solution.
In view of Proposition 10.4, solving .TRS/ reduces to finding   0 satisfy-
ing (10.32)(10.34). Condition (10.33) implies that 1   < C1: For 1 < 
since QCI is positive definite, the equation QxCxCc D 0 has a unique solution
given by
x./ D .Q C I/1 c:

Proposition 10.5 The function kx./k is monotonically strictly increasing in the


interval .1 ; C1/: Furthermore,

lim kx./k D 0;
!1


kxk if hc; ui i D 0 8i D 1; : : : ; k
lim kx./k D
!1 C1 otherwise;

where

Xn
hc; ui iui
xD :
  1
iDkC1 i

Proof From x./ D Qx./ C c; we have for every i D 1; : : : ; n:

hx./; ui i D hx; Qui i C hc; ui i D i hx; ui i C hc; ui i;

hence, for  > 1 :

hc; ui i
hx./; ui i D :
i  

Since fu1 ; : : : ; un g is an orthonormal base of Rn ; we then derive

X
n
hc; ui iui
x./ D ; (10.35)
iD1
i  

and consequently

Xn
.hc; ui i/2
kx./k2 D : (10.36)
iD1
.i  /2
10.4 Quadratic Problems with Linear Constraints 355

This expression shows that kx./k is a strictly monotonically increasing function of


 in the interval .1 ; C1/: Furthermore, lim kx./k D 0: If hc; ui i D 0 8i D
Pn !1i i
1; : : : ; k, then from (10.35) x./ D iDkC1 hc; u iu =.i  /; hence x./ ! x
as  ! 1 : Otherwise, since hc; ui i 0 for at least some i D 1; : : : ; k and
1 D : : : ; k we have kx./k ! C1 as  ! 1 : t
u
From the above results we derive the following method for solving (TRS):
If hc; ui i 0 for at least one i D 1; : : : ; k or if kxk  r then there exists a
unique value  2 1 ; C1/ satisfying kx. /k D r: This value  can be
determined by binary search. The global optimal solution of (TRS) is then x D
.Q C  I/1 c and is unique.
If hc; ui i D 0 8i D 1; : : : ; k and kxk < r then  D 1 and an optimal
solution is
X
k
x D i ui C x;
iD1
P
where 1 ; : : : ; k are any real numbers such that kiD1 i2 D r2  kxk2 : Indeed,
P
it is easily verified that kx k2 D r2 and also Qx C c D kiD1 i Qui C Qx C c D
Pk P
 iD1 i 1 ui  1 x D 1 . kiD1 i ui C x/ D 1 x : Since there are many
different choices for 1 ; : : : ; k satisfying the required condition, the optimal
solution in this case is not unique.

10.4 Quadratic Problems with Linear Constraints

10.4.1 Concave Quadratic Programming

The Concave Quadratic Programming Problem

.CQP/ minff .x/j Ax  b; x  0g;

where f .x/ is a concave quadratic function, is a special case of .BCP/the Basic


Concave Programming Problem (Chap. 5). Algorithms for this problem have been
developed in Konno (1976b), Rosen (1983), Rosen and Pardalos (1986), Kalantari
and Rosen (1987), Phillips and Rosen (1988), and Thakur (1991), . . . . Most often
the quadratic function f .x/ is considered in the separable form, which is always
possible via an affine transformation of the variables (cf Sect. 8.1). Also, particular
attention is paid to the case when the quadratic function is of low rank, i.e.,
involves a restricted number of nonconvex variables and, consequently, is amenable
to decomposition techniques (Chap. 7).
356 10 Nonconvex Quadratic Programming

Of course, the algorithms presented in Chaps. 567 for .BCP/ can be adapted
to handle .CQP/: In doing this one should note two points where substantial
simplifications and improvements are possible due to the specific properties of
quadratic functions.
The first point concerns the computation of the -extension (Remark 5.2). Since
f .x/ is concave quadratic, for any point x0 2 Rn and any direction u 2 Rn ; the
function 7! f .x0 C u/ is concave quadratic in : Therefore, for any x0 ; x 2 Rn
and any real number such that f .x0 / > ; the -extension of x with respect to x0 ;
i.e., the point x0 C .x  x0 / corresponding to the value

D supfj f .x0 C .x  x0 //  g;

can be found by solving the quadratic equation in W f .x0 C .x  x0 // D (note


that the level set f .x/  is bounded).
The second point is the construction of the cuts to reduce the search domain. It
turns out that when the objective function f .x/ is quadratic, the usual concavity cuts
can be substantially strengthened by the following method (Konno 1976a, see also
Balas and Burdet 1973). Assume that D  RnC ; x0 D 0 is a vertex of D and write
the objective function as

f .x/ D hx; Qxi C 2hc; xi;

where Q is a negative semidefinite matrix. Consider a -valid cut for .f ; D/


constructed at x0 :

X
n
xi
 1: (10.37)
iD1
ti
P
So if we define M.t/ D fx 2 Dj niD1 xi =ti /  1g, then fx 2 Dj f .x/ < g  M.t/:
Note that by construction f .ti ei /  ; i D 1; : : : ; n; where as usual ei is the ith unit
vector. We now construct a new concave function 't .x/ such that a -valid concavity
cut for .'t ; D/ will also be -valid for .f ; D/ but in general deeper than the above
cut. To this end define the bilinear function

F.x; y/ D hx; Qyi C hc; xi C hc; yi:

Observe that

f .x/ D F.x; x/; F.x; y/  minff .x/; f .y/g: (10.38)

Indeed, the first equality is evident. On the other hand,

F.x; y/  F.x; x/ D hx; Q.y  x/i C hc; y  xi


F.x; y/  F.y; y/ D hy; Q.x  y/i C hc; x  yi;
10.4 Quadratic Problems with Linear Constraints 357

hence F.x; y/  F.x; x/ C F.x; y/  F.y; y/ D hx  y; Q.x  y/i  0 because Q is
negative semidefinite. Thus at least one of the differences F.x; y/  F.x; x/; F.x; y/ 
F.y; y/ must be nonnegative, whence the inequality in (10.38). The argument also
shows that if Q is negative definite then the inequality in (10.38) is strict for x y
because then hx  y; Q.x  y/i < 0:
Lemma 10.1 The function

't .x/ D minfF.x; y/j y 2 M.t/g

is concave and satisfies

't .x/  f .x/ 8x 2 M.t/ (10.39)


minf't .x/j x 2 M.t/g D minff .x/j x 2 M.t/g: (10.40)

Proof The concavity of 't .x/ is obvious from the fact that it is the lower envelope
of the family of affine functions x 7! F.x; y/; y 2 M.t/: Since 't .x/  F.x; x/ D
f .x/ 8x 2 M.t/ (10.39) holds. Furthermore, 't .x/  minfF.x; y/j y 2 M.t/; y D
xg D minfF.x; x/j x 2 M.t/g; hence

't .x/  minff .x/j x 2 M.t/g;

and (10.40) follows from (10.39). t


u
Proposition 10.6 Let '.ti e / D F.ti e ; y /; i.e., y 2 argminfF.ti e ; y/j y 2 M.t/g: If
i i i i i

f .yi /  ; i D 1; : : : ; n then the vector t D .t1 ; : : : ; tn / such that

ti D maxfj 't .ei /  g; i D 1; : : : ; n (10.41)

defines a -valid cut for .f ; D/ at x0 :

Xn
xi
 1; (10.42)
t
iD1 i

which is no weaker than (10.37) (i.e., such that ti  ti ; i D 1; : : : ; n/:


Proof The cut (10.42) is -valid for .'t ; D/, hence for .f ; D/; too, in view of (10.39).
Since 't .ti ei /  minff .ti ei /; f .yi /g by (10.38), and since f .ti ei /  ; it follows from
the inequality f .yi /  that 't .ti ei /  ; and hence ti  ti by the definition of
ti : t
u
Note that if 't .ti ei / > minff .ti ei /; f .yi /g (which occurs when Q is negative
definite) then ti > ti provided that ti ei D (for yi 2 M.t/ implies yi 2 D; hence
yi ti ei /: This means that the cut (10.42) in this case is strictly deeper than (10.37).
358 10 Nonconvex Quadratic Programming

The computation of ti in (10.41) reduces to solving a linear program. Indeed,


if Qi denotes the ith row of Q, then F.ei ; y/ D hQi C c; yi C ci ; so for D D
fxj Ax  b; x  0g we have 't .ei / D gi ./ C ci ; where
( )
X
n
yi
gi ./ D min hQi C hc; yij  1; Ay  b; y  0 :
iD1
ti

By the duality theorem of linear programming,


(  T )
1 1
gi ./D max hb; ui C j  A u C  ;:::;
T
 .Qi C c/ ; u  0;   0 :
T
t1 tn

Thus, ti D maxfj gi ./ C ci  g, hence


(  T
1 1
ti D max j  hb; ui C  C ci  ; Q  A u C  ;:::;
T
 .Qi C c/T ;
t1 tn
u  0;   0g : (10.43)

To sum up, a cut (10.37) can be improved by the following procedure:


1. Compute yi 2 argminfF.ti ei ; y/j y 2 M.t/g; i D 1; : : : ; n:
2. If f .yi / < for some i, then construct a 0 -valid cut for .f ; D/,with the new
incumbent 0 WD minff .yi /j i D 1; : : : ; ng:
3. Otherwise, compute ti ; i D 1; : : : ; n by (10.43), and construct the cut (10.42).
Clearly, if a further improvement is needed the above procedure can be
repeated with t t :

10.4.2 Biconvex and ConvexConcave Programming

A problem that has earned a considerable interest due to its many applications is the
bilinear program whose general form is

minfhc; xi C hx; Ayi C hd; yi W x 2 X; y 2 Yg;

where X; Y are polyhedrons in Rp ; Rq , respectively, and A is a pq matrix. By setting


'.x/ D hc; xi C minfhx; Ayi C hd; yi W y 2 Yg, the problem becomes minf'.x/ W x 2
Xg, which is a concave program, because '.x/ is a concave function (pointwise
minimum of a family of affine functions). A more difficult problem, which can also
be shown to be more general, is the following jointly constrained biconvex program:

.JCBP/ minff .x/ C hx; yi C g.y/ W .x; y/ 2 Gg;


10.4 Quadratic Problems with Linear Constraints 359

where f .x/; g.y/ are convex functions on Rn and G is a compact convex subset of
Rn  Rn : Since hx; yi D .kx C yk2  kx  yk2 /=4, the problem can be solved by
rewriting it as the reverse convex program:

minff .x/ C g.y/ C .kx C yk2  t/=4 W .x; y/ 2 G; kx  yk2  tg:

An alternative branch and bound method was proposed in Al-Khayyal and Falk
(1983), where branching is performed by rectangular partition of Rn  Rn : To
compute a lower bound for F.x; y/ WD f .x/ C hx; yi C g.y/ over any rectangle
M D f.x; y/ 2 Rn  Rn W r  x  sI u  y  vg observe the following relations for
every i D 1; : : : ; n:

xi yi  ui xi C ri yi  ri ui 8xi  ri ; yi  ui
xi yi  vi xi C si yi  si vi 8xi  si ; yi  vi

(these relations hold because .xi  ri /.yi  ui /  0 and .xi  si /.yi  vi /  0,


respectively). Therefore, an underestimator of xy over M is the convex function
P
'.x; y/ D niD1 'i .x; y/;
(10.44)
'i .x; y/ D maxfui xi C ri yi  ri ui ; vi xi C si yi  si vi g

and a lower bound for minfF.x; y/ W .x; y/ 2 Mg is provided by the value

.M/ WD minff .x/ C '.x; y/ C g.y/ W .x; y/ 2 G \ Mg (10.45)

(the latter problem is a convex program and can be solved by standard methods).
At iteration k of the branch and bound procedure, a rectangle Mk in the current
partition is selected which has smallest .Mk /: Let .xk ; yk / be the optimal solution
of the bounding subproblem (10.45) for Mk ; i.e., the point .xk ; yk / 2 G \ Mk ; such
that

.Mk / D f .xk / C '.xk ; yk / C g.yk /:

If xk yk D '.xk ; yk / then the procedure terminates: .xk ; yk / is a global optimal


solution of (JCBP). Otherwise, select

ik 2 arg max fxki yki  'i .xk ; yk /g:


iD1;:::;n

If Mk D rk ; sk   uk ; v k ; then subdivide rk ; sk  via .xk ; ik / and subdivide uk ; v k 


via .yk ; ik / to obtain a partition of Mk into four smaller rectangles which will replace
Mk in the next iteration.
By simple computation, one can easily verify that '.x; y/ D xy; hence f .x/ C
'.x; y/Cg.y/ D F.x; y/ at every corner .x; y/ of the rectangle M: Using this property
360 10 Nonconvex Quadratic Programming

and the basic rectangular subdivision theorem it can then be proved that the above
branch and bound procedure converges to a global optimal solution of .JCBP/:
Another generalization of the bilinear programming problem is the convex
concave programming problem studied in Muu and Oettli (1991):

.CCP/ minfF.x; y/ W .x; y/ 2 G \ .X  Y/g;

where G  Rp  Rq is a closed convex set, X a rectangle in Rp ; Y a rectangle in Rq ,


and F.x; y/ is convex in x 2 Rp but concave in y 2 Rq : The method of MuuOettli
starts from the observation that, for any rectangle M  Y; if .M/ is the optimal
value and .xM ; yM ; uM / an optimal solution of the problem

minfF.x; y/ W x 2 X; y 2 M; u 2 M; .x; u/ 2 Gg; (10.46)

then .xM ; uM / is feasible to .CCP/ and

.M/ D F.xM ; yM /  minfF.x; y/ W .x; y/ 2 G \ .X  M/g  F.xM ; uM /:

In particular, .M/ yields a lower bound for F.x; y/ on G \ .X  M/; which becomes
the exact minimum if yM D uM : This suggests a branch and bound procedure in
which branching is by rectangular subdivision of Y and .M/ is used as a lower
bound for every subrectangle M of Y: At iteration k of this procedure, a rectangle
Mk in the current partition is selected which has smallest .Mk /: Let .xk ; yk ; uk / D
.xMk ; yMk ; uMk /: If yk D uk then the procedure terminates: .xk ; yk / solves (CCP).
Otherwise, select

ik 2 arg max fjyki  uki jg


iD1;:::;q

and subdivide Mk via . y Cu


k k
2
; ik /: In other words, subdivide Mk according to the
adaptive rule via yk ; uk (see Theorem 6.4, Chap. 6).
The convergence of the procedure follows from the fact that, by virtue of
Theorem 5.4, a nested sequence fMk.s/ g is generated such that jyMk.s/  uMk.s/ j ! 0
as s ! C1:
This method is implementable if the bounding subproblems (10.46) can be solved
efficiently, which is the case, e.g., for the d.c. problem

minfg.x/  h.y/ W .x; y/ 2 G \ .X  Y/g:

In that case, the subproblem (10.46):

minfg.x/  h.y/ W x 2 X; y 2 M; u 2 M; .x; u/ 2 Gg

is equivalent to

minfg.x/ W x 2 X; u 2 M; .x; u/ 2 Gg  maxfh.y/ W y 2 Mg;


10.4 Quadratic Problems with Linear Constraints 361

which reduces to solving a standard convex program and computing the maximum
of the convex function h.y/ at 2q corners of the rectangle M:
By bilinear programming, one usually means a problem of the form

.BLP/ minfhx; Qyi C hc; xi C hd; yij x 2 D; y 2 Eg;

where D; E are polyhedrons in Rn1 and Rn2 , respectively, Q is an n1  n2 matrix


and c 2 Rn1 ; d 2 Rn2 : This is a rather special (though perhaps the most popular)
variant of bilinear problem, since in the general case a bilinear problem may involve
bilinear constraints as well.
The fact that the constraints in .BLP/ are linear and separated in x and y allow this
problem to be converted into a linearly constrained concave minimization problem.
In fact, setting

'.x/ D minfhx; Qyi C hd; yij y 2 Eg

we see that .BLP/ is equivalent to

minf'.x/ C hc; xij x 2 Dg:

Since for each fixed y the function x 7! hx; Qyi C hd; yi is affine, it follows that '.x/
is concave. Therefore, the converted problem is a concave program.
Aside from the bilinear program, a number of other nonconvex problems can
also be reduced to concave or quasiconcave minimization. Some of these problems
(such as minimizing a sum of products of convex positive-valued functions) have
been discussed in Chap. 9 (Subsect. 9.6.3).

10.4.3 Indefinite Quadratic Programs

Using an appropriate affine transformation any indefinite quadratic minimization


problem under linear constraints can be written as:

.IQP/ minfhx; xi  hy; yij Ax C By C c  0g;

where x 2 Rn1 ; y 2 Rn2 : Earlier methods for dealing with this problem (Kough
1979; Tuy 1987b) were essentially based on Geoffrion-Benders decomposition
ideas and used the representation of the objective function as a difference of
two convex quadratic separable functions. Koughs method, which closely follows
Geoffrions scheme, requires the solution, at each iteration, of a relaxed master
problem which is equivalent to a number of linearly constrained concave programs,
and, consequently, may not be easy to handle. Another approach is to convert .IQP/
to a single concave minimization under convex constraints and to use for solving it
a decomposition technique which extends to convex constraints the decomposition
362 10 Nonconvex Quadratic Programming

by outer approximation presented in Chap. 7 for linear constraints. In more detail,


this approach can be described as follows.
For simplicity assume that the set Y D fyj .9x/ Ax C By C c  0g is bounded.
Setting

'.y/ D inffhx; xij Ax C By C c  0g

it is easily seen that the function '.y/ is convex (see Proposition 2.6) and that
dom'  Y: Then .IQP/ can be rewritten as

minft  hy; yij '.y/  t; y 2 Yg: (10.47)

Here the objective function t  hy; yi is concave, while the constraint set D WD
f.y; t/ 2 Rn2  Rj '.y/  t; y 2 Yg is convex. So (10.47) is a concave program under
convex constraints, which can be solved, e.g., by the outer approximation method.
Since, however, the convex feasible set in this program is defined implicitly via a
convex program, the following key question must be resolved for carrying out this
method (Chap. 7, Sect. 7.1): given a point .y; t/ determine whether it belongs to D
and if it does not, construct a cutting plane (in the space Rn2  R/ to separate it
from D:
Clearly, .y; t/ 2 D if and only if y 2 Y; and '.y/  t:
One has y 2 Y if and only if the linear program minf0j Ax  Bycg is feasible,
hence, by duality, if and only if the linear program

maxfhBy C hc; uij AT u D 0; u  0g (10.48)

has optimal value 0. So, if u D 0 is an optimal solution of the linear


program (10.48), then y 2 YI otherwise, in solving this linear program an extreme
direction v of the cone AT u D 0; u  0 is obtained such that hBy C c; vi > 0 and
the cut

hBy C hc; vi  0 (10.49)

separates .y; t/ from D:


To check the condition '.y/  t one has to solve the convex program

minfhx; xij Ax C By C c  0g: (10.50)

Lemma 10.2 The convex program (10.50) has at least one Karush-Kuhn-Tucker
vector  and any such vector satisfies

BT  2 @'.y/:
10.5 Quadratic Problems with Quadratic Constraints 363

Proof Since hx; xi  0 one has '.y/ > 1; so the existence of  follows
from a classical theorem of convex programming (see, e.g., Rockafellar 1970,
Corollary 28.2.2). Then

'.y/ D inffhx; xi C h; Ax C By C cij x 2 Rn1 g (KKT vector)

and for any y 2 Rn2 :

'.y/  inffhx; xi C h; Ax C By C cij x 2 Rn1 g (because   0/:

Hence '.y/  '.y/  h; By  Byi D hBT ; y  yi; i.e., BT  2 @'.y/: t


u
Corollary 10.11 If '.y/ > t, then a cut separating .y; t/ from D is given by

h; B.y  y/i C '.y/  t  0: (10.51)

Thus, given any point .y; t/ one can check whether .y; t/ 2 D and if .y; t/ D then
construct a cut separating .y; t/ from D (this cut is (10.49) if y Y; or (10.51) if
'.y/ > t/:
With the above background, a standard outer approximation procedure can be
developed for solving the indefinite quadratic problem .IQP/ (Tuy 1987b). At
iteration k of this procedure a polytope Pk is in hand which is an outer approximation
of the feasible set D: Let V.Pk / be the vertex set of Pk : By solving the subproblem
minft hy; yij y 2 V.Pk /g a solution .yk ; tk / is obtained. If .yk ; tk / 2 D the procedure
terminates. Otherwise a hyperplane is constructed which cuts off this solution and
determines a new polytope PkC1 : Under mild assumptions this procedure converges
to a global optimal solution.
In recent years, BB methods have been proposed for general quadratic problems
which are applicable to .IQP/ as well. These methods will be discussed in the next
section and also in Chap. 11 devoted to monotonic optimization.

10.5 Quadratic Problems with Quadratic Constraints

We now turn to the general nonconvex quadratic problem :

.GQP/ minff0 .x/j fi .x/  0; i D 1; : : : ; mI x 2 Xg;

where X is a closed convex set and the functions fi ; i D 0; 1; : : : ; m are indefinite


quadratic:

1
fi .x/ D hx; Qi xi C hci ; xi C di ; i D 0; 1; : : : ; m:
2
364 10 Nonconvex Quadratic Programming

By virtue of property (iii) listed in Sect. 8.1 .GQP/ can be rewritten as

minf'0 .x/  rkxk2 j 'i .x/  rkxk2  0; i D 1; : : : ; mI x 2 Xg; (10.52)

where r  maxiD0;1;:::;m 12 .Qi / and 'i .x/ D fi .x/ C rkxk2 ; i D 0; 1; : : : ; m; are


convex functions. Thus .GQP/ is a d.c. optimization problem to which different
approaches are possible. Aside from methods derived by specializing the general
methods presented in the preceding chapters, there exist other methods designed
specifically for quadratic problems. Below we shall focus on BB (branch and bound)
methods.

10.5.1 Linear and Convex Relaxation

For low rank nonconvex problems, i.e., for problems which are convex when a
relatively small number of variables are kept fixed, and also for problems whose
quadratic structure is implicit, outer approximation (or combined OA/BB methods)
may in many cases be quite practical. Two such procedures have been discussed
earlier: the Relief Indicator Method (Chap. 7, Sect. 7.5), designed for continuous
optimization and the Visible Point Method (Chap. 7, Sect. 7.4) which seems to work
well for problems in low dimension with many nonconvex quadratic or piecewise
affine constraints (such as those encountered in continuous location theory (Tuy
et al. 1995a)).
For solving large-scale quadratic problems with little additional special structure
branch and bound methods seem to be most suitable. Two fundamental operations
in a BB algorithm are bounding and branching. As we saw in Chap. 5, these two
operations must be consistent in order to ensure convergence of the procedure.
Roughly speaking, the latter means that in an infinite subdivision process, the
current lower bound must tend to the exact global minimum.
A standard bounding method consists in considering for each partition set a
relaxed problem obtained from the original one by replacing the objective function
with an affine or convex minorant and enclosing the constraint set in a polyhedron
or a convex set.

a. Lower Linearization

Consider a class of problems which can be formulated as:



min g0 .x/ C h0 .y/

.SNP/ s:t: gi .x/ C hi .y/  0; i D 1; : : : ; m

.x; y/ 2 ;
10.5 Quadratic Problems with Quadratic Constraints 365

where g0 ; gi W Rn ! R are continuous functions, h0 ; hi W Rp ! R are


affine functions, while is a polytope in Rn  Rp : Obviously any quadratically
constrained quadratic program is of this form. It turns out that for this problem a
bounding method can be defined consistent with simplicial subdivision provided the
nonconvex functions involved satisfy a lower linearizability condition to be defined
below.

I. Simplicial Subdivision

A function f W Rn ! R is said to be lower linearizable on a compact set C  Rn


with respect to simplicial subdivision if for every n-simplex M  C there exists an
affine function 'M .x/ such that

'M .x/  f .x/ 8x 2 M; supf .x/  'M .x/ ! 0 as diamM ! 0: (10.53)


x2M

The function 'M .x/ will be called a lower linearization of f .x/ on M: Most
continuous functions are locally lower linearizable in the above sense. For example,
if f .x/ is concave, then 'M .x/ can be taken to be the affine function that agrees with
f .x/ at the vertices of M; if f .x/ is Lipschitz, with Lipschitz constant L, then 'M .x/
can be taken to be the affine function that agrees with fQ .x/ WD f .xM /  Lkx  xM k
at the vertices of M; where xM is an arbitrary point in M (in fact, fQ .x/ is concave,
so 'M .x/ is a lower linearization of fQ .x/; hence, also a lower linearization of f .x/
because by Lipschitz condition fQ .x/  f .x/ and jf .x/  fQ .x/j  2kx  xM k 8x 2
M/: Thus special cases of problem (SNP) with locally lower linearizable gi ; i D
0; 1; : : : ; m; include reverse convex programs (g0 .x/ linear, gi .x/; i D 1; : : : ; m;
concave), as well as Lipschitz optimization problems (g0 .x/; gi .x/; i D 1; : : : ; m;
Lipschitz).
Now assume that every function gi is lower linearizable and let M;i .x/ be a lower
linearization of gi .x/ on a given simplex M: Then the following linear program is a
relaxation of (SNP) on MI

min M;0 .x/ C h0 .y/

SP.M/ s:t: M;i .x/ C hi .y/  0; i D 1; : : : ; m:

.x; y/ 2 ; x 2 M:

Therefore, the optimal value .M/ of this linear program is a lower bound for the
objective function value on the feasible solutions .x; y/ such that x 2 M: Incorporat-
ing this lower bounding operation into an exhaustive simplicial subdivision process
of the x-space yields the following algorithm:
BB Algorithm for (SNP)
Initialization. Take an n-simplex M1  Rn containing the projection of on Rn :
Let .x; y/ be the best feasible solution available, D g0 .x/ ( D C1 if no feasible
solution is known). Set R D S D fM1 g; k D 1:
366 10 Nonconvex Quadratic Programming

Step 1. For each M 2 S compute the optimal value .M/ of the linear program
SP.M/:
0
Step 2. Update .x; y/ and : Delete every M 2 R such that .M/  . Let R be
the remaining collection of simplices.
0
Step 3. If R D ;, then terminate: .x; y/ solves (SNP) (if < C1) or (SNP) is
infeasible (if D C1/:
0
Step 4. Let Mk 2 argminf.M/ W M 2 R g; and let .xk ; yk / be a basic optimal
solution of SP.Mk /.
Step 5. Perform a bisection or a radial subdivision of Mk ; following an exhaustive
subdivision rule. Let S  be the partition of Mk .
0
Step 6. Set S S ; R S  [ .R n fMk g/; increase k by 1 and return to
Step 1.
The convergence of this algorithm is easy to establish. Indeed, if the algorithm
is infinite, it generates a filter fMk g collapsing to a point x : Since .Mk / 
min(SNP) and (10.53) implies that Mk ;i .x / D gi .x /; i D 0; 1; : : : ; m; it follows
that if .Mk / D g0 .xk / C h0 .yk /, then xk ! x ; yk ! y such that .x ; y /
solves (SNP).
Remark 10.6 If the lower linearizations are such that for any simplex M each
function M;i .x/g matches gi .x/ at every corner of M then as soon as xk in Step
3 coincides with a corner of Mk the solution .xk ; yk / will be feasible, hence
optimal to the problem. Therefore, in this case an adaptive subdivision rule to
drive xk eventually to a corner of Mk should ensure a much faster convergence
(Theorem 6.4).

II. Rectangular Subdivision

A function f .x/ is said to be lower linearizable on a compact set C  Rn with respect


to rectangular subdivision if for any rectangle M  C there exists an P affine function
'M .x/ satisfying (10.53). For instance, if f .x/ is separable: f .x/ D niD1 fi .x/ and
each univariate function fi .t/ has a lower linearization 'M;i .x/ on the segment ai ; bi 
Pn a lower linearization of f .x/ on the rectangle C D a; b is obviously 'M .x/ D
then
iD1 'M;i .xi /:
For problems .SNP/ where the functions gi .x/; i D 0; 1; : : : ; m are lower
linearizable with respect to rectangular subdivision, a rectangular algorithm can be
developed similar to the simplicial algorithm.

b. Tight Convex Minorants

A convex function '.x/ is said to be a tight convex minorant of a function f .x/ on a


rectangle M (or a simplex M/ if
'.x/  f .x/ 8x 2 M; minf .x/  '.x/ D 0: (10.54)
x2M
10.5 Quadratic Problems with Quadratic Constraints 367

If f .x/ is continuous on M, then its convex envelope '.x/ WD convf .x/ over M must
be a tight convex minorant, for if minx2M f .x/  '.x/ D > 0 then the function
'.x/ C would give a larger convex minorant.
Example 10.1 The function maxfpy C rx  pr; qy C sx  qsg is a tight convex
minorant of xy 2 R2 on p; q  r; s: It matches xy for x D p or x D q (and y
arbitrary in the interval r  y  s).
Proof This function is the convex envelope of the product xy on the rectangle p; q
r; s (Corollary 4.6). The second assertion follows from the inequalities xy  .py C
rx  pr/ D .x  p/.y  r/  0; xy  .qy C sx  qs/ D .q  x/.s  y/  0 which
hold for every .x; y/ 2 p; q  r; s: t
u
Example 10.2 Let f .x/ D hx; Qxi: The function

X
n
'M .x/ D f .x/ C r .xi  pi /.xi  qi /;
iD1

where r  .Q/ is a tight convex minorant of f .x/ on the rectangle M D p; q:


It matches f .x/ at each corner of the rectangle p; q; i.e., at each point x such that
xi 2 fpi ; qi g; i D 1; : : : ; n:
Proof By (v), Sect. 8.1, 'M .x/ is a convex minorant of f .x/ and clearly 'M .x/ D
f .x/ when xi 2 fpi ; qi g; i D 1; : : : ; n: t
u
Note that a d.c. representation of f .x/ is f .x/ D g.x/  rkxk2 with g.x/ D
2 2
f .x/ C rkxk
Pn ; and r  .Q/: Since the convex envelope of kxk on p; q is
l.x/ D  iD1 .pi C qi /xi  pi qi ; a tight convex minorant of f .x/ is g.x/ C rl.x/: It
is easily verified that g.x/ C rl.x/ D 'M .x/:
Example 10.3 Let Qi be the row i and Qij be the element .i; j/ of a symmetric n  n
matrix Q: For any rectangle M D p; q  Rn define
Pn
ki D minfQi xj x 2 p; qg D jD1 minfpj Qij ; qj Qij g; (10.55)
Pn
Ki D maxfQi xj x 2 p; qg D jD1 maxfpj Qij ; qj Qij g: (10.56)

Then a tight convex minorant of the function f .x/ D hx; Qxi on M D p; q is


X
n
'M .x/ D 'M;i .x/; (10.57)
iD1

'M;i .x/ D maxfpi Qi x C ki xi  pi ki ; qi Qi x C Ki xi  qi Ki g: (10.58)

Actually, 'M .x/ D f .x/ at every corner x of the rectangle p; q:


368 10 Nonconvex Quadratic Programming

Pn
Proof We can write f .x/ D iD1 fi .x/ with fi .x/ D xi yi ; yi D Qi x so the result
simply follows from the fact that 'M;i .x/ is a convex minorant of fi .x/ and 'M;i .x/ D
fi .x/ when xi 2 fpi ; qi g (Example 10.1). t
u
Consider now the nonconvex optimization problem

minff0 .x/j fi .x/  0; i D 1; : : : ; m; x 2 Dg; (10.59)

where f0 ; fi W Rn ! R are nonconvex functions and D a convex set in Rn : Given any


rectangle M  Rn one can compute a lower bound for f0 .x/ over the feasible points
x in M by solving the convex problem

minf'M;0 .x/j x 2 M \ D; 'M;i .x/  0; i D 1; : : : ; mg

obtained by replacing fi ; i D 0; 1; : : : ; m with their convex minorants 'M;i on M:


Using this bounding a BB algorithm for (10.59) with an exhaustive rectangular
subdivision can then be developed.
Theorem 10.4 Assume that $f_i,\ i = 0,1,\dots,m$, are continuous functions and for every partition set $M$ each function $\varphi_{M,i}$ is a tight convex minorant of $f_i$ on $M$ such that for any filter $\{M_k\}$ and any two sequences $x^k \in M_k,\ z^k \in M_k$:

$$\varphi_{M_k,i}(x^k) - \varphi_{M_k,i}(z^k) \to 0 \quad\text{whenever}\quad x^k - z^k \to 0.$$

Then the corresponding BB algorithm converges.

Proof If the algorithm is infinite, it generates a filter $\{M_{k_\nu}\}$ such that $\cap_\nu M_{k_\nu} = \{\bar x\}$. Let us write $\varphi_{\nu,i}$ for $\varphi_{M_{k_\nu},i}$ and denote by $z^{\nu,i}$ a point of $M_{k_\nu}$ where $\varphi_{\nu,i}(z^{\nu,i}) = f_i(z^{\nu,i})$. Then for every $i = 0,1,\dots,m$: $z^{\nu,i} \to \bar x$. Furthermore, denote by $x^\nu$ an optimal solution of the relaxed program associated with $M_{k_\nu}$, so that $\varphi_{\nu,0}(x^\nu) = \beta(M_{k_\nu})$. Then $x^\nu \to \bar x$. Hence $f_i(\bar x) = \lim f_i(z^{\nu,i}) = \lim \varphi_{\nu,i}(z^{\nu,i}) = \lim \varphi_{\nu,i}(x^\nu)$ for all $i = 0,1,\dots,m$. Since $\varphi_{\nu,i}(x^\nu) \le 0\ \forall i = 1,\dots,m$, it follows that $f_i(\bar x) \le 0\ \forall i = 1,\dots,m$, i.e., $\bar x$ is feasible. On the other hand, for every $\nu$: $\beta(M_{k_\nu}) \le \gamma := \min\{f_0(x) \mid f_i(x) \le 0,\ i = 1,\dots,m,\ x \in D\}$, so $\varphi_{\nu,0}(x^\nu) \le \gamma$, hence $f_0(\bar x) \le \gamma$. Therefore $f_0(\bar x) = \gamma$ and $\bar x$ is an optimal solution, proving the convergence of the algorithm. □
It can easily be verified that the convex minorants in Examples 10.1-10.3 satisfy the conditions in the above theorem. Consequently, BB algorithms using these convex minorants for computing bounds are guaranteed to converge. An algorithm of this type has been proposed in Al-Khayyal et al. (1995) (see also Van Voorhis 1997) for the quadratic program (10.59) where $f_0(x) = \langle x, Q^0x\rangle + \langle c^0, x\rangle$, $f_h(x) = \langle x, Q^hx\rangle + \langle c^h, x\rangle + d_h,\ h = 1,\dots,m$, and $D$ is a polytope. For each rectangle $M$ the convex minorant of $\langle x, Q^hx\rangle$ over $M$ is $\varphi_M^h(x)$, computed as in Example 10.3. To obtain an upper bound one can use, in addition, a concave majorant for each function $f_h$. By similar arguments to the above, it is easily seen that a concave majorant of $\langle x, Q^hx\rangle$ is $\psi_M^h(x) = \sum_{i=1}^{n}\psi_i^h(x)$ with $\psi_i^h(x) = \min\{p_iQ_i^hx + K_i^hx_i - p_iK_i^h,\ q_iQ_i^hx + k_i^hx_i - q_ik_i^h\}$. So the relaxed problem is the linear program

$$\begin{array}{ll}
\min & \varphi_M^0(x) + \langle c^0, x\rangle\\
\text{s.t.} & t_h + \langle c^h, x\rangle + d_h \le 0 \quad h = 1,\dots,m\\
& \varphi_M^h(x) \le t_h \le \psi_M^h(x) \quad h = 1,\dots,m\\
& x \in M \cap D.
\end{array}$$

Tight convex minorants of the above type are also used in Androulakis et al. (1995) for the general nonconvex problem (10.59). By writing each function $f_h(x),\ h = 0,1,\dots,m$, in (10.59) as a sum of bilinear terms of the form $b_{ij}x_ix_j$ and a nonconvex function with no special structure, a convex minorant $\varphi_{M,h}(x)$ of $f_h(x)$ on a rectangle $M$ can easily be computed from Examples 10.1 and 10.2. Such a convex minorant is tight because it matches $f_h(x)$ at any corner of $M$; furthermore, as is easily seen, $\varphi_{M,h}(x),\ h = 0,1,\dots,m$, satisfy the conditions formulated in Theorem 10.4. The convergence of the BB algorithm in Androulakis et al. (1995) then follows.

The reader is referred to the cited papers (Al-Khayyal et al. 1995; Androulakis et al. 1995) for details and computational results concerning the above algorithms. In fact the convex minorants in Examples 10.1-10.3 could be obtained from the general rules for computing convex minorants of factorable functions (see Chap. 4, Sect. 4.7). In other words, the above algorithms could be viewed as specific realizations of the general scheme of McCormick (1982) for factorable programming.
Remark 10.7 (i) If the convex minorants used are of the types given in Examples 10.1-10.3 then for any rectangle $M$ each function $\varphi_{M,i}$ matches $f_i$ at the corners of $M$. In that case, the convergence of the algorithm would be faster by using the $\omega$-subdivision, viz: at iteration $k$ divide $M_k = [p^k, q^k]$ via $(\omega^k, j_k)$, where $\omega^k$ is an optimal solution of the relaxed problem at iteration $k$ and $j_k$ is a properly chosen index ($j_k \in \operatorname{argmax}\{\delta_{k,j} \mid j = 1,\dots,n\}$, where $\delta_{k,j} = \min\{\omega_j^k - p_j^k,\ q_j^k - \omega_j^k\}$); see Theorem 6.4.

(ii) A function $f(x)$ is said to be locally convexifiable on a compact set $C \subset \mathbb{R}^n$ with respect to rectangular subdivision if there exists for each rectangle $M \subset C$ a convex minorant $\varphi_M(x)$ satisfying (10.53) (this convex minorant is then said to be asymptotically tight). By an argument analogous to the proof of Theorem 10.4 it is easy to show that if the functions $f_i(x),\ i = 0,1,\dots,m$, in (10.59) are locally convexifiable and the convex minorants $\varphi_{M,i}$ used for bounding are asymptotically tight, while the subdivision is exhaustive, then the BB algorithm converges. For the applications, note that if two functions $f_1(x), f_2(x)$ are locally convexifiable then so is their sum $f_1(x) + f_2(x)$.

c. Decoupling Relaxation

A lower bounding method which can be termed decoupling relaxation consists in converting the problem into an equivalent one where the nonconvexity is concentrated in a number of coupling constraints. When these coupling constraints are omitted, the problem becomes linear, or convex, or easily solvable. The method proposed in Muu and Oettli (1991), Muu (1993), and Muu (1996) for convex-concave problems, and also the reformulation-linearization method of Sherali and Tuncbilek (1992, 1995), can be interpreted as different variants of variable decoupling relaxations.

d. Convex-Concave Approach

A real-valued function $f(x,y)$ on $X \times Y \subset \mathbb{R}^{n_1}\times\mathbb{R}^{n_2}$ is said to be convex-concave if it is convex on $X$ for every fixed $y \in Y$ and concave on $Y$ for every fixed $x \in X$. Consider the problem

$$(CCAP)\qquad \min\{f(x,y) \mid (x,y) \in C,\ x \in X,\ y \in Y\}, \tag{10.60}$$

where $X \subset \mathbb{R}^{n_1},\ Y \subset \mathbb{R}^{n_2}$ are polytopes, $C$ is a compact convex set in $\mathbb{R}^{n_1}\times\mathbb{R}^{n_2}$, and the function $f(x,y)$ is convex-concave on $X \times Y$. Clearly, linearly (or convexly) constrained quadratic programs and, more generally, d.c. programming problems with convex constraints are of this form. Since by fixing $y$ the problem is convex, a branch and bound method for solving (CCAP) should branch upon $y$. Using, e.g., rectangular partition of the $y$-space, one has to compute, for a given rectangle $M \subset \mathbb{R}^{n_2}$, a lower bound $\beta(M)$ for

$$(SP(M))\qquad \min\{f(x,y) \mid (x,y) \in C,\ x \in X,\ y \in Y \cap M\}. \tag{10.61}$$

The difficulty of this subproblem is due to the fact that convexity and concavity are coupled in the constraints. To get round this difficulty the idea is to decouple $x$ and $y$ by reformulating the problem as

$$\min\{f(x,y) \mid (x,u) \in C,\ x \in X,\ u \in Y,\ y \in M,\ u = y\}$$

and omitting the constraint $u = y$. Then, obviously, the optimal value in the relaxed problem

$$(RP(M))\qquad \min\{f(x,y) \mid (x,u) \in C,\ x \in X,\ u \in Y,\ y \in M\} \tag{10.62}$$

is a lower bound of the optimal value in (SP(M)). Let $V(M)$ be the vertex set of $M$. By concavity of $f(x,y)$ in $y$ we have

$$\min\{f(x,y) \mid y \in M\} = \min\{f(x,y) \mid y \in V(M)\},$$

so the problem (10.62) is equivalent to

$$\min_{y\in V(M)}\ \min\{f(x,y) \mid (x,u) \in C,\ x \in X,\ u \in Y\}.$$

Therefore, one can take

$$\beta(M) = \min_{y\in V(M)}\ \min\{f(x,y) \mid (x,u) \in C \cap (X\times Y)\}. \tag{10.63}$$

For each fixed $y$ the problem $\min\{f(x,y) \mid (x,u) \in C \cap (X\times Y)\}$ is convex (or linear if $f(x,y)$ is affine in $x$ and $C$ is a polytope), so (10.63) can be solved by standard algorithms. If $(x^M, y^M, u^M)$ is an optimal solution of $RP(M)$ then along with the lower bound $f(x^M, y^M) = \beta(M)$, we also obtain a feasible solution $(x^M, u^M)$ which can be used to update the current incumbent. Furthermore, when $y^M = u^M$, $\beta(M)$ equals the exact minimum in (10.61).

At iteration $k$ of the branch and bound procedure let $M_k$ be the rectangle such that $\beta(M_k)$ is smallest among all partition sets still of interest. Denote $(x^k, y^k, u^k) = (x^{M_k}, y^{M_k}, u^{M_k})$. If $u^k = y^k$ then, as mentioned above, $\beta(M_k) = f(x^k, y^k)$ while $(x^k, y^k)$ is a feasible solution. Since $\beta(M_k) \le \min(CCAP)$ it follows that $(x^k, y^k)$ is an optimal solution of (CCAP). Therefore, if $u^k = y^k$ the procedure terminates. Otherwise, $M_k$ must be further subdivided. Based on the above observation, one should choose a subdivision method which, intuitively, could drive $\|y^k - u^k\|$ to zero more rapidly than an exhaustive subdivision process. A good choice, confirmed by computational experience, is an adaptive subdivision via $(y^k, u^k)$, i.e., to subdivide $M_k$ via the hyperplane $y_{i_k} = (y_{i_k}^k + u_{i_k}^k)/2$, where

$$i_k \in \operatorname{argmax}_{i=1,\dots,n_2}\ |y_i^k - u_i^k|.$$
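This branching rule is simple to implement. The following Python fragment (an illustrative sketch, not from the book) selects the index $i_k$ and splits the rectangle at the midpoint of $y_{i_k}^k$ and $u_{i_k}^k$:

import numpy as np

def split_rectangle(M, y, u):
    """Adaptive subdivision of M = (lo, hi) via (y^k, u^k): bisect along the
    coordinate where |y_i - u_i| is largest, at the midpoint (y_i + u_i)/2."""
    lo, hi = M
    i = int(np.argmax(np.abs(y - u)))          # i_k in argmax |y_i - u_i|
    cut = 0.5 * (y[i] + u[i])
    left_hi, right_lo = hi.copy(), lo.copy()
    left_hi[i], right_lo[i] = cut, cut
    return (lo, left_hi), (right_lo, hi)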

Proposition 10.7 The rectangular algorithm with the above branching and bounding operations is convergent.

Proof If the algorithm is infinite, it generates at least a filter of rectangles, say $M_{k_q},\ q = 1,2,\dots$. To simplify the notation we write $M_q$ for $M_{k_q}$. Let $M_q = [a^q, b^q]$, so that by a basic property of rectangular subdivision (Lemma 5.4) one can suppose, without loss of generality, that $i_q \equiv 1$, $[a^q, b^q] \to [a^*, b^*]$, $(y_1^q + u_1^q)/2 \to y_1^* \in \{a_1^*, b_1^*\}$. The latter relation implies that $|y_1^q - u_1^q| \to 0$, hence $\|y^q - u^q\| \to 0$, because $|y_1^q - u_1^q| = \max_{i=1,\dots,n_2}|y_i^q - u_i^q|$ by the definition of $i_q = 1$. Let $x^* = \lim x^q,\ y^* = \lim y^q\ (= \lim u^q)$. Then $\beta(M_q) = f(x^q, y^q) \to f(x^*, y^*)$, and since $\beta(M_q) \le \min(CCAP)$ for all $q$, while $(x^*, y^*)$ is feasible, it follows that $(x^*, y^*)$ is an optimal solution. □
This method is practicable if $n_2$ is relatively small. Often, instead of rectangular subdivision, it may be more convenient to use simplicial subdivision. In that case, a simplex $M_k$ is subdivided via $\omega^k = (y^k + u^k)/2$ according to the $\omega$-subdivision rule. The method can also be extended to problems in which convex and nonconvex variables are coupled in certain joint constraints, e.g., to problems of the form

$$\min\{f(x) \mid (x,y) \in C,\ g(x,y) \le 0,\ x \in X,\ y \in Y\}, \tag{10.64}$$

where $f(x)$ is a convex function, $X, Y$ are as previously, and $g(x,y)$ is a convex-concave function on $X \times Y$. For any rectangle $M$ in the $y$-space, one can take

$$\begin{aligned}
\beta(M) &= \min\{f(x) \mid (x,u) \in C,\ g(x,y) \le 0,\ x \in X,\ u \in Y,\ y \in M\}\\
&= \min\{f(x) \mid (x,u) \in C \cap (X\times Y),\ g(x,y) \le 0,\ y \in V(M)\}.
\end{aligned}$$

Computing $\beta(M)$ again amounts to solving a convex subproblem.

e. Reformulation-Linearization

Let $X, Y$ be two polyhedra in $\mathbb{R}^{n_1}, \mathbb{R}^{n_2}$, $n_1 \ge n_2$, defined by

$$X = \{x \in \mathbb{R}^{n_1} \mid \langle a^i, x\rangle \le \alpha_i,\ i = 1,\dots,h_1\} \tag{10.65}$$

$$Y = \{y \in \mathbb{R}^{n_2} \mid \langle b^j, y\rangle \le \beta_j,\ j = 1,\dots,h_2\}, \tag{10.66}$$

and let

$$f_0(x,y) = \langle x, Q^0y\rangle + \langle c^0, x\rangle + \langle d^0, y\rangle \tag{10.67}$$

$$f_i(x,y) = \langle x, Q^iy\rangle + \langle c^i, x\rangle + \langle d^i, y\rangle + g_i,\quad i = 1,\dots,m, \tag{10.68}$$

where for each $i = 0,1,\dots,m$, $Q^i$ is an $n_1\times n_2$ matrix, $c^i \in \mathbb{R}^{n_1},\ d^i \in \mathbb{R}^{n_2}$ and $g_i \in \mathbb{R}$. Consider the general bilinear program

$$\min\{f_0(x,y) \mid f_i(x,y) \le 0\ (i = 1,\dots,m),\ x \in X,\ y \in Y\}. \tag{10.69}$$

Assume that the linear inequalities defining $X$ and $Y$ are such that both systems

$$\langle a^i, x\rangle - \alpha_i \ge 0 \quad i = 1,\dots,h_1 \tag{10.70}$$

$$\langle b^j, y\rangle - \beta_j \ge 0 \quad j = 1,\dots,h_2 \tag{10.71}$$

are inconsistent, or equivalently,

$$\forall x \in \mathbb{R}^{n_1}\ \ \min_{i=1,\dots,h_1}(\langle a^i, x\rangle - \alpha_i) < 0,\qquad
\forall y \in \mathbb{R}^{n_2}\ \ \min_{j=1,\dots,h_2}(\langle b^j, y\rangle - \beta_j) < 0. \tag{10.72}$$

This condition is fulfilled if the inequalities defining $X$ and $Y$ include inequalities of the form

$$-\infty < p \le x_{i_0} \le q < +\infty \tag{10.73}$$

$$-\infty < r \le y_{j_0} \le s < +\infty \tag{10.74}$$

for some $i_0 \in \{1,\dots,n_1\}$ with $p < q$ and some $j_0 \in \{1,\dots,n_2\}$ with $r < s$; in particular, it is fulfilled if $X$ and $Y$ are bounded.
Lemma 10.3 Under condition (10.72) the system $x \in X,\ y \in Y$ is equivalent to

$$(\langle a^i, x\rangle - \alpha_i)(\langle b^j, y\rangle - \beta_j) \ge 0 \quad \forall i = 1,\dots,h_1,\ j = 1,\dots,h_2. \tag{10.75}$$

Proof It suffices to show that (10.75) implies $(x,y) \in X \times Y$. Let $(x,y)$ satisfy (10.75). By (10.72) there exists for this $y$ an index $j_0 \in \{1,\dots,h_2\}$ such that $\langle b^{j_0}, y\rangle - \beta_{j_0} < 0$. Then for any $i = 1,\dots,h_1$, since

$$(\langle a^i, x\rangle - \alpha_i)(\langle b^{j_0}, y\rangle - \beta_{j_0}) \ge 0,$$

it follows that $\langle a^i, x\rangle - \alpha_i \le 0$. Similarly, for any $j = 1,\dots,h_2$: $\langle b^j, y\rangle - \beta_j \le 0$. □
Thus, a system of two sets of linear constraints in $x$ and in $y$ like (10.65) and (10.66) can be converted to an equivalent system of formally bilinear constraints. Now for each bilinear function $P(x,y)$ denote by $[P(x,y)]_L$ the affine function of $x, y, w$ obtained by replacing each product $x_ly_{l'}$ by $w_{ll'}$. For instance, if $P(x,y) = (\langle a, x\rangle - \alpha)(\langle b, y\rangle - \beta)$, where $a \in \mathbb{R}^{n_1},\ b \in \mathbb{R}^{n_2},\ \alpha, \beta \in \mathbb{R}$, then

$$[(\langle a, x\rangle - \alpha)(\langle b, y\rangle - \beta)]_L = \sum_{l=1}^{n_1}\sum_{l'=1}^{n_2}a_lb_{l'}w_{ll'} - \alpha\langle b, y\rangle - \beta\langle a, x\rangle + \alpha\beta.$$

Let $[f_i(x,y)]_L = \varphi_i(x,y,w),\ i = 0,1,\dots,m$, and for every $(i,j)$: $[(\langle a^i, x\rangle - \alpha_i)(\langle b^j, y\rangle - \beta_j)]_L = L_{ij}(x,y,w)$. Then the bilinear program (10.69) can be written as

$$\begin{array}{ll}
\min & \varphi_0(x,y,w)\\
\text{s.t.} & \varphi_i(x,y,w) \le 0 \quad i = 1,\dots,m,\\
& L_{ij}(x,y,w) \ge 0 \quad i = 1,\dots,h_1,\ j = 1,\dots,h_2,\\
& w_{ll'} = x_ly_{l'} \quad l = 1,\dots,n_1,\ l' = 1,\dots,n_2.
\end{array} \tag{10.76}$$

A new feature of the latter problem is that the only nonlinear elements in it are
the coupling constraints wll0 D xl yl0 : Let (LP) be the linear program obtained
from (10.76) by omitting these coupling constraints:

min '0 .x; y; w/

.LP/ s:t: 'i .x; y; w/  0 i D 1; : : : ; m;

Lij .x; y; w/  0 i D 1; : : : ; h1 ; j D 1; : : : ; h2 :

Proposition 10.8 The optimal value in the linear program (LP) is a lower bound of
the optimal value in (10.69). If an optimal solution .x; y; w/ of (LP) satisfies wll0 D
xl yl0 8l; l0 then it solves (10.69).
374 10 Nonconvex Quadratic Programming

Proof This follows because (LP) is a relaxation of the problem (10.76)


equivalent to (10.69). t
u
Remark 10.8 The replacement of the linear constraints $x \in X,\ y \in Y$ by the constraints $L_{ij}(x,y,w) \ge 0,\ i = 1,\dots,h_1,\ j = 1,\dots,h_2$, is essential. For example, to compute a lower (or upper) bound for the product $xy$ over the rectangle $p \le x \le q,\ r \le y \le s$, one should minimize (maximize, resp.) the variable $w$ under the constraints

$$[(x-p)(y-r)]_L \ge 0,\quad [(q-x)(s-y)]_L \ge 0,$$

$$[(x-p)(s-y)]_L \ge 0,\quad [(q-x)(y-r)]_L \ge 0.$$

This yields the well-known bounds given in Example 10.1:

$$\max\{rx + py - pr,\ sx + qy - qs\} \le w \le \min\{sx + py - ps,\ rx + qy - qr\}.$$

Notice that the trivial bounds $-\infty < w < +\infty$ would have been obtained were the original linear constraints used instead of the constraints $L_{ij}(x,y,w) \ge 0$. Thus, although it is more natural to linearize nonlinear constraints, it may sometimes be useful to transform linear constraints into formally nonlinear constraints.
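The one-dimensional computation of Remark 10.8 can be reproduced numerically. The sketch below (illustrative code using SciPy, not from the book) minimizes and maximizes $w$ over the four linearized factor products and recovers the Example 10.1 bounds; on $[1,2]\times[3,5]$ it returns exactly the corner values $3$ and $10$ of $xy$:

from scipy.optimize import linprog

def rlt_bounds_xy(p, q, r, s):
    """Lower/upper bound for w ~ x*y from the four linearized factor
    products of Remark 10.8, solved as LPs in the variables (x, y, w)."""
    # A @ (x, y, w) <= b encodes [(factor_i)(factor_j)]_L >= 0
    A = [[ r,  p, -1.0],    # (x-p)(y-r) >= 0
         [ s,  q, -1.0],    # (q-x)(s-y) >= 0
         [-s, -p,  1.0],    # (x-p)(s-y) >= 0
         [-r, -q,  1.0]]    # (q-x)(y-r) >= 0
    b = [p*r, q*s, -p*s, -q*r]
    bounds = [(p, q), (r, s), (None, None)]
    lo = linprog(c=[0, 0, 1], A_ub=A, b_ub=b, bounds=bounds).fun
    up = -linprog(c=[0, 0, -1], A_ub=A, b_ub=b, bounds=bounds).fun
    return lo, up

print(rlt_bounds_xy(1.0, 2.0, 3.0, 5.0))   # (3.0, 10.0): exact at box corners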
From the above results we now derive a BB algorithm for solving (10.69). For simplicity, consider the special case of (10.69) when $x = y$, i.e., the general nonconvex quadratic program

$$\min\{f_0(x) \mid f_i(x) \le 0\ (i = 1,\dots,m),\ x \in X\}, \tag{10.77}$$

where $X$ is a polyhedron in $\mathbb{R}^n$ and

$$f_0(x) = \frac12\langle x, Q^0x\rangle + \langle c^0, x\rangle,\qquad f_i(x) = \frac12\langle x, Q^ix\rangle + \langle c^i, x\rangle + d_i. \tag{10.78}$$

Given any rectangle $M$ in $\mathbb{R}^n$, the set $X \cap M$ can always be described by a system

$$\langle a^i, x\rangle \le \alpha_i \quad i = 1,\dots,h \tag{10.79}$$

which includes among others all the inequalities $p_i \le x_i \le q_i,\ i = 1,\dots,n$, and satisfies (10.72), i.e., $\min_{i=1,\dots,h}(\langle a^i, x\rangle - \alpha_i) < 0\ \forall x \in \mathbb{R}^n$. For every $(i,j)$ with $1 \le i \le j \le h$ denote by

$$L_{ij}(x,w) = [(\langle a^i, x\rangle - \alpha_i)(\langle a^j, x\rangle - \alpha_j)]_L$$

the function that results from $(\langle a^i, x\rangle - \alpha_i)(\langle a^j, x\rangle - \alpha_j)$ by the substitution $w_{ll'} = x_lx_{l'},\ 1 \le l \le l' \le n$. Also let
$$\varphi_0(x,w) = \left[\frac12\langle x, Q^0x\rangle + \langle c^0, x\rangle\right]_L$$

$$\varphi_i(x,w) = \left[\frac12\langle x, Q^ix\rangle + \langle c^i, x\rangle + d_i\right]_L \quad i = 1,\dots,m.$$

Then a lower bound of $f_0(x)$ over all feasible points $x$ in $X \cap M$ is given by the optimal value $\beta(M)$ in the linear program

$$LP(M)\qquad \begin{array}{ll}
\min & \varphi_0(x,w)\\
\text{s.t.} & \varphi_i(x,w) \le 0 \quad i = 1,\dots,m,\\
& L_{ij}(x,w) \ge 0 \quad 1 \le i \le j \le h.
\end{array}$$

To use this lower bounding for algorithmic purposes, one has to construct a subdivision process consistent with this lower bounding, i.e., such that the resulting BB algorithm is convergent. The next result gives the foundation for such a subdivision process.
Proposition 10.9 Let $M = [p,q]$. If $(x,w)$ is a feasible point of $LP(M)$ such that $x$ is a corner of the rectangle $M$, then $(x,w)$ satisfies the coupling constraints $w_{ll'} = x_lx_{l'}$ for all $l, l'$.

Proof Since $x$ is a corner of $M = [p,q]$ one has, for some $I \subset \{1,\dots,n\}$: $x_i = p_i$ for $i \in I$ and $x_j = q_j$ for $j \notin I$. The feasibility of $(x,w)$ implies that $[(x_l - p_l)(q_{l'} - x_{l'})]_L \ge 0,\ [(x_l - p_l)(x_{l'} - p_{l'})]_L \ge 0$ for any $l, l'$, i.e.,

$$p_lx_{l'} + p_{l'}x_l - p_lp_{l'} \le w_{ll'} \le q_{l'}x_l + p_lx_{l'} - p_lq_{l'}.$$

For $l \in I$ we have $x_l = p_l$, so

$$x_lx_{l'} + p_{l'}p_l - p_lp_{l'} \le w_{ll'} \le q_{l'}p_l + x_lx_{l'} - p_lq_{l'},$$

hence $w_{ll'} = x_lx_{l'}$. On the other hand, $[(q_l - x_l)(q_{l'} - x_{l'})]_L \ge 0,\ [(q_l - x_l)(x_{l'} - p_{l'})]_L \ge 0$, i.e.,

$$-q_lq_{l'} + q_{l'}x_l + q_lx_{l'} \le w_{ll'} \le q_lx_{l'} - q_lp_{l'} + p_{l'}x_l.$$

For $l \notin I$ we have $x_l = q_l$, so

$$-q_lq_{l'} + q_{l'}q_l + x_lx_{l'} \le w_{ll'} \le x_lx_{l'} - q_lp_{l'} + p_{l'}q_l,$$

hence $w_{ll'} = x_lx_{l'}$. Thus, in any case, $w_{ll'} = x_lx_{l'}$. □
Based on this property the following rectangular BB procedure can be proposed for solving (10.77). At iteration $k$, let $M_k = [p^k, q^k]$ be the partition set with smallest lower bound among all partition sets still of interest and let $(x^k, w^k)$ be the optimal solution of the bounding problem, so that $\beta(M_k) = \varphi_0(x^k, w^k)$. If $w_{ll'}^k = x_l^kx_{l'}^k$ for all $l, l'$ (in particular, if $x^k$ is a corner of $M_k$), then $x^k$ is an optimal solution of (10.77) and the procedure terminates. Otherwise, bisect $M_k$ via $(x^k, i_k)$, where

$$i_k \in \operatorname{argmax}\{\delta_i^k \mid i = 1,\dots,n\},\qquad \delta_i^k = \min\{x_i^k - p_i^k,\ q_i^k - x_i^k\}.$$

Proposition 10.10 The rectangular algorithm with bounding and branching as defined above converges to an optimal solution of (10.77).

Proof The subdivision ensures that if the algorithm is infinite it generates a filter of rectangles $\{M_\nu = [p^\nu, q^\nu]\}$ such that $(x^\nu, w^\nu) \to (x^*, w^*)$ and $x^*$ is a corner of $[p^*, q^*] = \cap_\nu [p^\nu, q^\nu]$ (Theorem 6.3). Clearly $(x^*, w^*)$ is feasible to $LP(M^*)$ for $M^* = [p^*, q^*]$, so by Proposition 10.9, $w_{ll'}^* = x_l^*x_{l'}^*$ for all $l, l'$, and hence $x^*$ is a feasible solution to (10.77). Since $\varphi_0(x^*, w^*) = f_0(x^*)$ is a lower bound for $f_0(x)$ over the feasible solutions, it follows that $x^*$ is an optimal solution. □
Example 10.4 (Sherali and Tuncbilek 1995) Minimize $-(x_1 - 12)^2 - x_2^2$ subject to

$$-6x_1 + 8x_2 \le 48,\quad 3x_1 + 8x_2 \le 120,\quad 0 \le x_1 \le 24,\quad x_2 \ge 0.$$

Setting $w_{11} = x_1^2,\ w_{12} = x_1x_2,\ w_{22} = x_2^2$, the first bounding problem is to minimize the linear function $-w_{11} + 24x_1 - w_{22} - 144$ subject to the $(5\cdot 6)/2 = 15$ linear constraints corresponding to the different pairwise products of the factors

$$(48 + 6x_1 - 8x_2) \ge 0,\ (120 - 3x_1 - 8x_2) \ge 0,\ (24 - x_1) \ge 0,\ x_1 \ge 0,\ x_2 \ge 0.$$

The solution of this linear program is $(x_1, x_2, w_{11}, w_{12}, w_{22}) = (8, 6, 192, 48, 72)$, with objective function value $-216$. By bisecting the feasible set $X$ into $M_1^- = \{x \in X \mid x_1 \le 8\}$ and $M_1^+ = \{x \in X \mid x_1 \ge 8\}$ and solving the corresponding bounding problems one obtains a lower bound of $-180$ for both. Furthermore, the solution $(x_1, x_2, w_{11}, w_{12}, w_{22}) = (0, 6, 0, 0, 36)$ of the bounding problem on $M_1^-$ is such that $(x_1, x_2) = (0, 6)$ is a vertex of $M_1^-$. Hence, by Proposition 10.9, $(0, 6)$ is an optimal solution of the original problem and $-180$ is the optimal value. This result can also be checked by solving the problem as a concave minimization over a polytope (the vertices of this polytope are $(0,0), (0,6), (8,12), (24,6)$, and $(24,0)$).
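The first bounding LP of Example 10.4 can be assembled mechanically from the five factors. The sketch below (illustrative SciPy code, not from the book) builds all 15 pairwise-product constraints and, if the numbers reported above are reproduced, should return the point $(8, 6, 192, 48, 72)$ with bound $-216$:

import itertools
import numpy as np
from scipy.optimize import linprog

# Factors alpha + beta @ x >= 0 describing the feasible set of Example 10.4
factors = [(48.0,  np.array([ 6.0, -8.0])),   # 48 + 6x1 - 8x2 >= 0
           (120.0, np.array([-3.0, -8.0])),   # 120 - 3x1 - 8x2 >= 0
           (24.0,  np.array([-1.0,  0.0])),   # 24 - x1 >= 0
           (0.0,   np.array([ 1.0,  0.0])),   # x1 >= 0
           (0.0,   np.array([ 0.0,  1.0]))]   # x2 >= 0

A_ub, b_ub = [], []
for (a1, b1), (a2, b2) in itertools.combinations_with_replacement(factors, 2):
    lin = a1 * b2 + a2 * b1                                       # x-part of [f*g]_L
    quad = [b1[0]*b2[0], b1[0]*b2[1] + b1[1]*b2[0], b1[1]*b2[1]]  # w11, w12, w22
    A_ub.append(np.concatenate([-lin, -np.array(quad)]))          # [f*g]_L >= 0
    b_ub.append(a1 * a2)

# variables (x1, x2, w11, w12, w22); objective -w11 + 24 x1 - w22 - 144
c = np.array([24.0, 0.0, -1.0, 0.0, -1.0])
res = linprog(c, A_ub=np.vstack(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * 5)
print(res.x, res.fun - 144.0)   # reported solution: (8, 6, 192, 48, 72), value -216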
Remark 10.9 Since the number of constraints $L_{ij}(x,w) \ge 0$ is often quite large, it may sometimes be more practical to use only a selected subset of these constraints, even if this may result in a poorer bound. On the other hand, it is always possible to add some convex constraints of the form $x_i^2 \le w_{ii}$ to obtain a convex rather than linear relaxation of the original problem. This may give, of course, a tighter bound at the expense of more computational effort.

10.5.2 Lagrangian Relaxation

Consider again the general problem

$$f_X^\natural := \inf\{f_0(x) \mid f_i(x) \le 0\ (i = 1,\dots,m),\ x \in X\}, \tag{10.80}$$

where $X$ is a convex set in $\mathbb{R}^n$ and $f_0(x), f_i(x)$ are arbitrary real-valued functions. The function

$$L(x,u) = f_0(x) + \sum_{i=1}^{m}u_if_i(x),\quad u \in \mathbb{R}_+^m,$$

is called the Lagrangian of problem (10.80) and the problem

$$\eta_X^* := \sup_{u\in\mathbb{R}_+^m}\theta(u),\qquad \theta(u) = \inf_{x\in X}L(x,u) \tag{10.81}$$

the Lagrange dual problem. Clearly $\theta(u) \le f_X^\natural\ \forall u \in \mathbb{R}_+^m$, hence

$$\eta_X^* \le f_X^\natural, \tag{10.82}$$

i.e., $\eta_X^*$ always provides a lower bound for $f_X^\natural$. The difference $f_X^\natural - \eta_X^* \ge 0$ is referred to as the Lagrange duality gap. When the gap is zero, i.e., strong duality holds, then $\eta_X^*$ is the exact optimal value of (10.80).

Note that the function $\theta(u)$ is concave on every convex subset of its domain, as it is the lower envelope of a family of affine functions of $u$. Furthermore:

Proposition 10.11 Assume $X$ is compact while $f_0(x)$ and $f_i(x)$ are continuous on $X$. Then $\theta(u)$ is defined and concave on $\mathbb{R}_+^m$, and if $\theta(\bar u) = L(\bar x, \bar u)$ then the vector $(f_i(\bar x),\ i = 1,\dots,m)$ is a subgradient of $\theta(u)$ at $\bar u$.

Proof It is enough to prove the last part of the proposition. For any $u \in \mathbb{R}_+^m$ we have

$$\theta(u) - \theta(\bar u) \le L(\bar x, u) - L(\bar x, \bar u) = \sum_{i=1}^{m}(u_i - \bar u_i)f_i(\bar x).$$

The rest is obvious. □
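Proposition 10.11 suggests a simple dual-ascent scheme for computing the bound (10.81). The sketch below is a heuristic illustration only (not from the book): the inner minimization uses a generic local solver, so it is reliable only when $L(\cdot,u)$ is convex, and the step size and iteration count are arbitrary choices.

import numpy as np
from scipy.optimize import minimize

def dual_ascent(f0, fs, x0, steps=200, lr=0.1):
    """Projected subgradient ascent on theta(u) = inf_x L(x, u).
    By Proposition 10.11, (f_i(x_u)) is a (super)gradient of theta at u,
    where x_u minimizes the Lagrangian L(., u)."""
    u = np.zeros(len(fs))
    for _ in range(steps):
        L = lambda x: f0(x) + sum(ui * fi(x) for ui, fi in zip(u, fs))
        x_u = minimize(L, x0).x                     # inner problem (local solve)
        g = np.array([fi(x_u) for fi in fs])        # subgradient of theta at u
        u = np.maximum(u + lr * g, 0.0)             # ascent step + projection
    return u, x_u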
The next three propositions indicate important cases when strong duality holds.

Proposition 10.12 Assume $f_0(x)$ and $f_i(x)$ are convex and finite on $X$. If the optimal value in (10.80) is not $-\infty$ and the Slater regularity condition holds, i.e., there exists at least one feasible solution $x^0$ of the problem such that

$$x^0 \in \operatorname{ri}X,\quad f_i(x^0) < 0\ \ \forall i \in I,$$

where $I = \{i = 1,\dots,m \mid f_i(x)\ \text{is not affine}\}$, then

$$\eta_X^* = f_X^\natural.$$

Proof This is a classical result from convex programming theory which can also be directly derived from Theorem 2.3 on convex inequalities. □

Proposition 10.13 In (10.80) with $X = \mathbb{R}^n$, assume that the concave function $u \mapsto \inf_{x\in\mathbb{R}^n}L(x,u)$ attains its maximum at a point $\bar u \in \mathbb{R}_+^m$ such that $L(x,\bar u)$ is strictly convex. Then strong duality holds, so

$$\eta_X^* = f_X^\natural.$$

Proof This follows from Corollary 10.1 with $C = \mathbb{R}^n,\ D = \mathbb{R}_+^m$ and Remark 10.1(a), by noting that by Proposition 10.3 $L(x,\bar u)$ is coercive on $\mathbb{R}^n$. □
Example 10.5 Consider the following minimization of a concave quadratic function under linear constraints:

$$\min_{x\in\mathbb{R}^{10}}\ \{-50\|x\|^2 + c^Tx \mid Ax \le b,\ x_i \in [0,1],\ i = 1,2,\dots,10\}, \tag{10.83}$$

$$A = \begin{bmatrix}
2 & 6 & 1 & 0 & 3 & 3 & 2 & 6 & 2 & 2\\
6 & 5 & 8 & 3 & 0 & 1 & 3 & 8 & 9 & 3\\
5 & 6 & 5 & 3 & 8 & 8 & 9 & 2 & 0 & 9\\
9 & 5 & 0 & 9 & 1 & 8 & 3 & 9 & 9 & 3\\
8 & 7 & 4 & 5 & 9 & 1 & 7 & 1 & 3 & 2
\end{bmatrix}$$

$$c = (48, 42, 48, 45, 44, 41, 47, 42, 45, 46)^T,\qquad b = (4, 22, 3, 23, 12)^T.$$

The box constraint $0 \le x_i \le 1,\ i = 1,\dots,10$, can be written as 10 quadratic constraints $x_i^2 - x_i \le 0,\ i = 1,2,\dots,10$, so actually there are in total 15 constraints in (10.83).

By solving an SDP, it can be checked that $\max_{y\in\mathbb{R}_+^{15}}\inf_{x\in\mathbb{R}^{10}}L(x,y)$ is attained at $\bar y$ such that $L(x,\bar y)$ is strictly convex and attains its minimum at

$$\bar x = (1, 1, 0, 1, 1, 1, 0, 1, 1, 1)^T,$$

which is feasible to (10.83). The optimal value is $-47$ with zero duality gap by Proposition 10.13.
Proposition 10.14 In (10.80) with $X = \mathbb{R}^n$, assume there exists $\bar y \in \mathbb{R}_+^m$ such that the function $L(x,\bar y) := f_0(x) + \sum_{i=1}^{m}\bar y_if_i(x)$ is convex and has a minimizer $\bar x$ satisfying $\bar y_if_i(\bar x) = 0,\ f_i(\bar x) \le 0\ \forall i = 1,\dots,m$. Then strong duality holds, so $\eta_X^* = f_X^\natural$.

Proof It is readily verified that

$$L(\bar x, y) \le L(\bar x, \bar y) \le L(x, \bar y)\quad \forall x \in \mathbb{R}^n,\ \forall y \in \mathbb{R}_+^m,$$

so $(\bar x, \bar y)$ is a saddle point of $L(x,y)$ on $\mathbb{R}^n\times\mathbb{R}_+^m$, and hence, by Proposition 2.35,

$$\max_{y\in\mathbb{R}_+^m}\ \inf_{x\in\mathbb{R}^n}L(x,y) = \min_{x\in\mathbb{R}^n}\ \sup_{y\in\mathbb{R}_+^m}L(x,y). \qquad \square$$

Example 10.6 Consider the following nonconvex quadratic problem, which arises from the so-called optimal beamforming in wireless communication (Bengtsson and Ottersen 2001):

$$\min_{x_i\in\mathbb{R}^{N_i},\ i=1,2,\dots,d}\ \sum_{i=1}^{d}\|x_i\|^2 \quad\text{s.t.}\quad -x_i^TR_{ii}x_i + \sum_{j\ne i}x_j^TR_{ij}x_j + \sigma_i \le 0,\quad i = 1,2,\dots,d, \tag{10.84}$$

where

$$0 < \sigma_i,\qquad 0 \preceq R_{ij} \in \mathbb{R}^{N_j\times N_j},\ i, j = 1,2,\dots,d, \tag{10.85}$$

so each $x_i$ is a vector in $\mathbb{R}^{N_i}$, while each $R_{ij}$ is a positive semidefinite matrix of dimension $N_j\times N_j$.

Applying Proposition 10.14 yields the following result (Tuy and Tuan 2013):

Corollary 10.12 Strong Lagrangian duality holds for problem (10.84), i.e.,

$$\min_{x_i\in\mathbb{R}^{N_i},\ i=1,2,\dots,d}\ \sup_{y\in\mathbb{R}_+^d}L(x,y) = \max_{y\in\mathbb{R}_+^d}\ \min_{x_i\in\mathbb{R}^{N_i},\ i=1,2,\dots,d}L(x,y), \tag{10.86}$$

where $L(x,y) = \sum_{i=1}^{d}x_i^T\Big(I + \sum_{j\ne i}y_jR_{ji} - y_iR_{ii}\Big)x_i + \sum_{i=1}^{d}\sigma_iy_i$.

Proof It is convenient to precede the proof by a lemma. As usual, the inequalities $y > 0$ or $y < x$ for vectors $y$ and $x$ are understood component-wise.

Lemma 10.4 For a given matrix $R \in \mathbb{R}^{d\times d}$ with positive diagonal entries $r_{ii} > 0$ and nonpositive off-diagonal entries $r_{ij} \le 0,\ i \ne j$, assume that there is $y > 0$ such that

$$R^Ty > 0. \tag{10.87}$$

Then the inverse matrix $R^{-1}$ is positive, i.e., all its entries are positive.
Proof Define the positive diagonal matrix $D := \operatorname{diag}[r_{ii}]_{i=1,2,\dots,d}$ and the nonnegative matrix $G = D - R$, so $R = D - G$. Then (10.87) means $\xi := Dy - G^Ty > 0$ and $0 \le D^{-1}G^Ty = y - D^{-1}\xi < y$. Due to (10.87), without loss of generality we can assume $r_{ij} < 0$, so $GD^{-1}$ is irreducible. Let $\lambda_{\max}[\,\cdot\,]$ stand for the maximal eigenvalue of a matrix. As $D^{-1}G^T$ is obviously nonnegative, by the Perron-Frobenius theorem (Gantmacher 1960, p. 69) we can write

$$\lambda_{\max}[GD^{-1}] = \lambda_{\max}[D^{-1}G^T] = \min_{y>0}\max_{i=1,2,\dots,d}\frac{(D^{-1}G^Ty)_i}{y_i} \le \max_{i=1,2,\dots,d}\frac{(D^{-1}G^Ty)_i}{y_i} < 1.$$

Again by the Perron-Frobenius theorem, the matrix $I - GD^{-1}$ is invertible; moreover, its inverse $(I - GD^{-1})^{-1}$ is positive. Then $R^{-1} = D^{-1}(I - GD^{-1})^{-1}$ is obviously positive. □
Turning to the proof of Corollary 10.12, let

$$\eta := \sup_{y\in\mathbb{R}_+^d}\ \min_{x_i\in\mathbb{R}^{N_i},\ i=1,2,\dots,d}\left[\sum_{i=1}^{d}x_i^T\Big(I + \sum_{j\ne i}y_jR_{ji} - y_iR_{ii}\Big)x_i + \sum_{i=1}^{d}\sigma_iy_i\right],$$

which means

$$\eta = \sup_{y\in\mathbb{R}_+^d}\left\{\sum_{i=1}^{d}\sigma_iy_i\ \Big|\ I + \sum_{j\ne i}y_jR_{ji} - y_iR_{ii} \succeq 0,\ i = 1,2,\dots,d\right\}, \tag{10.88}$$

so at the optimum $\bar y \in \mathbb{R}_+^d$ of (10.88) the function $L(x,\bar y)$ is convex in $x = (x_1,\dots,x_d) \in \mathbb{R}^{N_1}\times\dots\times\mathbb{R}^{N_d}$.

For proving (10.86) it suffices to show that the conditions of Proposition 10.14 are fulfilled, i.e., there exist $\bar y \in \mathbb{R}_+^d$ and $(\bar x_1,\dots,\bar x_d) \in \arg\min_{x_i\in\mathbb{R}^{N_i},\,i=1,\dots,d}L(x,\bar y)$ such that $(\bar x_1,\dots,\bar x_d)$ is feasible to (10.84) and moreover

$$\bar y_i\Big(\bar x_i^TR_{ii}\bar x_i - \sum_{j\ne i}\bar x_j^TR_{ij}\bar x_j - \sigma_i\Big) = 0,\quad i = 1,2,\dots,d. \tag{10.89}$$

Note that an optimal solution $\bar y \in \mathbb{R}_+^d$ of (10.88) must satisfy

$$I + \sum_{j\ne i}\bar y_jR_{ji} - \bar y_iR_{ii} \not\succ 0,\quad i = 1,2,\dots,d. \tag{10.90}$$

Indeed, if for some $i$,

$$I + \sum_{j\ne i}\bar y_jR_{ji} - \bar y_iR_{ii} \succ 0,$$

we can take $\tilde y_i > \bar y_i$ such that

$$I + \sum_{j\ne i}\bar y_jR_{ji} - \tilde y_iR_{ii} \succeq 0.$$

Then setting

$$\tilde y_j = \begin{cases}\bar y_j & j \ne i\\ \tilde y_i & j = i\end{cases}$$

yields a feasible solution $\tilde y \in \mathbb{R}_+^d$ to (10.88) satisfying $\sum_{i=1}^{d}\sigma_i\bar y_i < \sum_{i=1}^{d}\sigma_i\tilde y_i$, so $\bar y$ cannot be an optimal solution of (10.88).

Thus, at optimality each matrix

$$I + \sum_{j\ne i}\bar y_jR_{ji} - \bar y_iR_{ii}$$

must be singular. We have $\bar y_i > 0,\ i = 1,2,\dots,d$, and there are $\tilde x_i \in \mathbb{R}^{N_i},\ \|\tilde x_i\| = 1,\ i = 1,2,\dots,d$, such that

$$\tilde x_i^T\Big(I + \sum_{j\ne i}\bar y_jR_{ji} - \bar y_iR_{ii}\Big)\tilde x_i = 0,\quad i = 1,2,\dots,d, \tag{10.91}$$

hence, by setting $R = [r_{ij}]_{i,j=1,2,\dots,d}$ with $r_{ii} := \tilde x_i^TR_{ii}\tilde x_i$ and $r_{ij} := -\tilde x_j^TR_{ij}\tilde x_j,\ i \ne j$,

$$R^T\bar y = (1, 1, \dots, 1)^T > 0.$$

By Lemma 10.4, $R^{-1}$ is positive, so $p = R^{-1}\sigma > 0$, which is the solution of the linear equations

$$p_i(\tilde x_i^TR_{ii}\tilde x_i) - \sum_{j\ne i}p_j(\tilde x_j^TR_{ij}\tilde x_j) = \sigma_i,\quad i = 1,2,\dots,d. \tag{10.92}$$

With such $p$ it is obvious that $\{\bar x_i := \sqrt{p_i}\,\tilde x_i,\ i = 1,2,\dots,d\}$ is feasible to (10.84) and satisfies (10.89), completing the proof of (10.86). □
Remark 10.10 The main result in Bengtsson and Ottersen (2001) states that the convex relaxation of (10.84), i.e., the convex program

$$\min_{0\preceq X_i\in\mathbb{R}^{N_i\times N_i},\ i=1,2,\dots,d}\ \sum_{i=1}^{d}\operatorname{Trace}(X_i)\quad\text{s.t.}\quad \operatorname{Trace}(X_iR_{ii}) \ge \sum_{j\ne i}\operatorname{Trace}(X_jR_{ij}) + \sigma_i,\ i = 1,2,\dots,d, \tag{10.93}$$

attains its optimal value at a rank-one feasible solution $X_i = \bar x_i\bar x_i^T$, so (10.84) and (10.93) are equivalent. It can be shown that the Lagrangian dual of the convex problem (10.93) is precisely $\max_{y\in\mathbb{R}_+^d}\min_{x_i\in\mathbb{R}^{N_i}}L(x,y)$. In other words, the above result of Bengtsson-Ottersen is just equivalent to Corollary 10.12, which, besides, has been obtained here by a more straightforward and rigorous argument.

Also note that computing an optimal solution of the nonconvex program (10.84) from the optimal solution $\bar y$ of the convex Lagrangian dual (10.88), via $\bar x_i = \sqrt{p_i}\,\tilde x_i,\ i = 1,2,\dots,d$, with (10.91) and (10.92), is much more efficient than using the optimal solutions of the convex program (10.93). This is because the convex program (10.93) has multiple optimal solutions, so its rank-one optimal solution is not quite easily computed.
We have shown above several examples where strong duality holds, so that the optimal value of the dual equals exactly the optimal value of the original problem. In the general case, a gap exists, which in a sense reflects the extent to which the given problem fails to be convex and regular. Quite often, however, the gap can be reduced by gradually restricting the set $X$. More specifically, it may happen that:

(*) Whenever a nested sequence of rectangles $M_1 \supset M_2 \supset \dots$ collapses to a singleton, $\cap_{k=1}^{\infty}M_k = \{\bar x\}$, then $f_{M_k}^\natural - \eta_{M_k}^* \to 0$.

Under these circumstances, a BB algorithm for (10.80) can be developed in the standard way, such that:

- Branching follows a selected exhaustive rectangular subdivision rule.
- Bounding is by Lagrangian relaxation: for each rectangle $M$ the value $\eta_M^*$ is taken to be a lower bound for $f_M^\natural$.
- The candidate for further subdivision at iteration $k$ is $M_k \in \operatorname{argmin}\{\eta_M^* \mid M \in \mathcal{R}_k\}$, where $\mathcal{R}_k$ denotes the collection of all partition sets currently still of interest.
- Termination criterion: at each iteration $k$ a new feasible solution is computed and the incumbent $x^k$ is set equal to the best among all feasible solutions so far obtained. If $f(x^k) = \eta_{M_k}^*$, then terminate ($x^k$ is an optimal solution).

Theorem 10.5 Under assumption (*) the above BB algorithm converges. In other words, whenever infinite, the algorithm generates a sequence $\{M_k\}$ such that $\eta_{M_k}^* \nearrow f^\natural := f_X^\natural$.

Proof By the selection of $M_k$ we have $\eta_{M_k}^* \le f^\natural$ for all $k$. By exhaustiveness of the subdivision, if the procedure is infinite it generates at least one filter $\{M_{k_\nu}\}$ collapsing to a singleton $\{\bar x\}$. Since $\eta_{M_k}^* \le f^\natural \le f_{M_k}^\natural$, and since obviously $\eta_{M_{k+1}}^* \ge \eta_{M_k}^*$ for all $k$, it follows from Assumption (*) that $\eta_{M_k}^* \nearrow f^\natural$. □

10.5.3 Application

a. Bounds by Lagrange Relaxation

Following Ben-Tal et al. (1994), let us apply the above method to the problem

$$(PX)\qquad f_X^\natural = \min\{c^Ty \mid A(x)y \ge b,\ y \ge 0,\ x \in X\}, \tag{10.94}$$

where $X$ is a rectangle in $\mathbb{R}^n$ and $A(x)$ is an $m\times p$ matrix whose elements $a_{ij}(x)$ are continuous functions of $x$. Clearly this problem includes as a special case the problem of minimizing a linear function under bilinear constraints. Since for fixed $x$ the problem is linear in $y$, we form the dual problem with respect to the constraints $A(x)y \ge b$, i.e.,

$$\max_{u\ge 0}\ \min\{\langle c, y\rangle - \langle u, A(x)y - b\rangle \mid y \ge 0,\ x \in X\}. \tag{10.95}$$

Lemma 10.5 The problem (10.95) can be written as

$$(DPX)\qquad \max_{u\ge 0}\{\langle b, u\rangle \mid c - A(x)^Tu \ge 0\ \ \forall x \in X\}. \tag{10.96}$$

Proof For each fixed $u \ge 0$ we have

$$\begin{aligned}
&\min\{\langle c, y\rangle - \langle u, A(x)y - b\rangle \mid x \in X,\ y \ge 0\}\\
&\quad= \min_{x\in X}\min_{y\ge 0}\{\langle c, y\rangle - \langle u, A(x)y - b\rangle\}\\
&\quad= \langle b, u\rangle + \min_{x\in X}\min_{y\ge 0}\langle c - A(x)^Tu,\ y\rangle\\
&\quad= \langle b, u\rangle + h(u),
\end{aligned}$$

where

$$h(u) = \begin{cases} 0 & \text{if}\ c - A(x)^Tu \ge 0\ \forall x \in X,\\ -\infty & \text{otherwise.}\end{cases}$$

The last equality is due to the fact that the infimum of a linear function of $y$ over the orthant $y \ge 0$ is $0$ or $-\infty$ depending on whether the coefficients of this linear function are all nonnegative or not. The conclusion follows. □

Denote the column $j$ of $A(x)$ by $A_j(x)$, and for any given rectangle $M \subset X$ define

$$g_M(u) = \min_{j=1,\dots,p}\ \min_{x\in M}\ [\,c_j - \langle A_j(x), u\rangle\,].$$

Since for each fixed $x$ the function $u \mapsto c_j - \langle A_j(x), u\rangle$ is affine, $g_M(u)$ is a concave function and the optimal value of the dual (DPM) is

$$\eta_M^* = \max\{\langle b, u\rangle \mid g_M(u) \ge 0,\ u \ge 0\}.$$
Consider now a filter $\{M_k\}$ of subrectangles of $X$ such that $\cap_{k=1}^{\infty}M_k = \{\bar x\}$. For brevity write $f_k^\natural,\ \eta_k^*,\ \eta_{\bar x}^*$ for $f_{M_k}^\natural,\ \eta_{M_k}^*,\ \eta_{\{\bar x\}}^*$, respectively. We have

$$\begin{aligned}
f_k^\natural &\le \min\{\langle c, y\rangle \mid A(\bar x)y \ge b,\ y \ge 0\} \quad (\text{since } \bar x \in M_k\ \forall k),\\
\min\{\langle c, y\rangle \mid A(\bar x)y \ge b,\ y \ge 0\} &= \max\{\langle b, u\rangle \mid c - A(\bar x)^Tu \ge 0,\ u \ge 0\} \quad (\text{dual linear programs}),\\
\eta_{\bar x}^* &= \max\{\langle b, u\rangle \mid c - A(\bar x)^Tu \ge 0,\ u \ge 0\} \quad (\text{definition of } \eta_{\{\bar x\}}^*),
\end{aligned}$$

hence

$$f_k^\natural \le \eta_{\bar x}^*. \tag{10.97}$$

Also, setting

$$L = \{u \in \mathbb{R}_+^m \mid c - A(\bar x)^Tu \ge 0\},\qquad L_k = \{u \in \mathbb{R}_+^m \mid c - A(x)^Tu \ge 0\ \forall x \in M_k\},$$

we can write

$$\eta_k^* = \max\{\langle b, u\rangle \mid u \in L_k\},\qquad \eta_{\bar x}^* = \max\{\langle b, u\rangle \mid u \in L\},$$

$$L_1 \subset \dots \subset L_k \subset \dots \subset L,\qquad \eta_1^* \le \dots \le \eta_k^* \le \dots \le \eta_{\bar x}^*.$$

Proposition 10.15 Assume

B1: $(\forall x \in X)\ (\exists u \in \mathbb{R}_+^m)\quad c - A(x)^Tu > 0$.

If $\{M_k\}$ is any filter of subrectangles of $X$ such that $\cap_{k=1}^{\infty}M_k = \{\bar x\}$, then

$$f_k^\natural - \eta_k^* \to 0.$$

Proof For all $k$ we have, by (10.97), $\eta_k^* \le f_k^\natural \le \eta_{\bar x}^*$, so it suffices to show that

$$\eta_k^* \nearrow \eta_{\bar x}^* \quad (k \to +\infty).$$

Denote $\Delta(x,u) = c - A(x)^Tu \in \mathbb{R}^p$. Assumption B1 ensures that $\operatorname{int}L \ne \emptyset$ and $\Delta(x,u)$ is never identically zero. Therefore, if $u \in \operatorname{int}L$ then $\Delta(\bar x, u) > 0$ (i.e., $\Delta_j(\bar x, u) > 0\ \forall j = 1,\dots,p$), and since $\Delta(x,u)$ is continuous in $x$, one must have $\Delta(x,u) > 0$ for all $x$ in a sufficiently small ball around $\bar x$, hence for all $x \in M_k$ with sufficiently large $k$ (since $\max_{x\in M_k}\|x - \bar x\| \to 0$ as $k \to +\infty$). Thus, if $u \in \operatorname{int}L$ then $u \in L_k$ for all sufficiently large $k$. Keeping this in mind, let $\varepsilon > 0$ be an arbitrary positive number and $u$ an arbitrary element of $L$. Since $L$ is a polyhedron with nonempty interior, there exists a sequence $\{u^t\} \subset \operatorname{int}L$ such that $u = \lim_{t\to+\infty}u^t$. Hence, there exists $u^t \in \operatorname{int}L$ such that $\langle b, u^t\rangle \ge \langle b, u\rangle - \varepsilon$. Since $u^t \in \operatorname{int}L$, for all sufficiently large $k$ we have, by the above, $u^t \in L_k$ and, consequently, $\langle b, u^t\rangle \le \eta_k^*$. Thus $\eta_k^* \ge \langle b, u\rangle - \varepsilon$ for all sufficiently large $k$. Since $\varepsilon > 0$ and $u \in L$ are arbitrary, this shows that $\eta_k^* \to \eta_{\bar x}^*$, completing the proof. □
Thus, under Assumption B1, Condition (*) is fulfilled. It then follows from Theorem 10.5 that Problem (PX) can be solved by a convergent BB procedure using the bounds by Lagrangian relaxation and an exhaustive rectangular subdivision process.

For the implementation of this BB algorithm note that the dual problem (DPX) is a semi-infinite linear program which in the general case may not be easy to solve. However, if we assume, additionally, that for $u \ge 0$:

B2: every function $c_j - \langle A_j(x), u\rangle,\ j = 1,\dots,p$, is quasiconcave,

then the condition $c_j - \langle A_j(x), u\rangle \ge 0\ \forall x \in M$ is equivalent to $c_j - \langle A_j(v), u\rangle \ge 0\ \forall v \in V(M)$, where $V(M)$ denotes the vertex set of the rectangle $M$. Therefore, under Assumption B2, (DPM) merely reduces to the linear program

$$\max_{u\ge 0}\{\langle b, u\rangle \mid c_j - \langle A_j(v), u\rangle \ge 0\ \ \forall v \in V(M),\ j = 1,\dots,p\}. \tag{10.98}$$

If $M = [r,s]$ then the set $V(M)$ consists of all points $v \in \mathbb{R}^n$ such that $v_j \in \{r_j, s_j\},\ j = 1,\dots,n$. Thus, under Assumptions B1 and B2 the problem (PX) can be solved by the following algorithm:
be solved by the following algorithm:
BB Algorithm for (PX)

Initialization. Let $(x^1, y^1)$ be an initial feasible solution and $\gamma_1 = \langle c, y^1\rangle$ (found, e.g., by local search). Set $M_0 = X,\ S_1 = P_1 = \{M_0\}$. Set $k = 1$.

Step 1. For each rectangle $M \in P_k$ solve

$$\max\{\langle b, u\rangle \mid c_j - \langle A_j(v), u\rangle \ge 0\ \forall v \in V(M),\ j = 1,\dots,p,\ u \ge 0\}$$

to obtain $\eta_M^*$.

Step 2. Delete every rectangle $M \in S_k$ such that $\eta_M^* \ge \gamma_k - \varepsilon$. Let $\mathcal{R}_k$ be the collection of remaining rectangles.

Step 3. If $\mathcal{R}_k = \emptyset$, then terminate: $(x^k, y^k)$ is an $\varepsilon$-optimal solution.

Step 4. Let $M_k \in \operatorname{argmin}\{\eta_M^* \mid M \in \mathcal{R}_k\}$. Compute a new feasible solution by local search from the center of $M_k$ (or by solving a linear program; see Remark 10.11 below). Let $(x^{k+1}, y^{k+1})$ be the new incumbent, $\gamma_{k+1} = \langle c, y^{k+1}\rangle$.

Step 5. Bisect $M_k$ upon its longest side. Let $P_{k+1}$ be the partition of $M_k$.

Step 6. Set $S_{k+1} = (\mathcal{R}_k \setminus \{M_k\}) \cup P_{k+1}$, $k \leftarrow k + 1$, and return to Step 1.
Proposition 10.16 The above algorithm can be infinite only if $\varepsilon = 0$. In the latter case, $\eta_{M_k}^* \nearrow \min(PX)$, and any filter of rectangles $\{M_{k_\nu}\}$ determines an optimal solution $(\bar x, \bar y)$ such that $\{\bar x\} = \cap_{\nu=1}^{\infty}M_{k_\nu}$ and $\bar y$ is an optimal solution of $P\{\bar x\}$.

Proof Immediate from Theorem 10.5 and Proposition 10.15. □

Remark 10.11 (i) If $x^k$ is any point of $M_k$, then an optimal solution $y^k$ of the linear program

$$\min\{\langle c, y\rangle \mid A(x^k)y \ge b,\ y \ge 0\}$$

yields a feasible solution $(x^k, y^k)$.

(ii) For definiteness we assumed that a rectangular subdivision is used, but the above development is also valid for simplicial subdivisions (replace everywhere "rectangle" by "simplex"). For many problems, simplicial subdivision is even more efficient because in that case $|V(M)| = n + 1$ instead of $|V(M)| = 2^n$ as above, so that (10.98) will involve many fewer constraints.
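Assembling the LP (10.98) for a rectangle is mechanical. The sketch below (illustrative SciPy code, not from the book; the matrix function A is a user-supplied stand-in) enumerates the $2^n$ vertices of $M$ and maximizes $\langle b, u\rangle$ subject to $A(v)^Tu \le c$ at each vertex:

import itertools
import numpy as np
from scipy.optimize import linprog

def dual_bound(A, b, c, rect):
    """Bound eta*_M of (10.98): max <b,u> s.t. c_j - <A_j(v), u> >= 0 for
    every vertex v of the rectangle M = rect, u >= 0.  Under B2, checking
    the vertices of M suffices."""
    r, s = rect
    A_ub, b_ub = [], []
    for v in itertools.product(*zip(r, s)):        # all 2^n vertices of M
        Av = A(np.array(v))                        # m x p matrix A(v)
        A_ub.append(Av.T)                          # rows: <A_j(v), u> <= c_j
        b_ub.append(c)
    res = linprog(-b, A_ub=np.vstack(A_ub), b_ub=np.hstack(b_ub),
                  bounds=[(0, None)] * len(b))
    return -res.fun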
Example 10.7 Consider the Pooling and Blending Problem formulated in Example 4.4. Setting $x_{il} = q_{il}\sum_j y_{lj}$ and $f_{lk} = \sum_i C_{ik}q_{il}$, it can be reformulated as

$$\begin{array}{ll}
\min & -\sum_j \gamma_j \quad\text{s.t.}\\
& \sum_l\big(d_j - \sum_i c_iq_{il}\big)y_{lj} + \sum_i(d_j - c_i)z_{ij} = \gamma_j\\
& \sum_l\sum_j q_{il}y_{lj} + \sum_j z_{ij} \le A_i,\quad \sum_j y_{lj} \le S_l,\quad \sum_l y_{lj} + \sum_i z_{ij} \le D_j\\
& \sum_l\big(\sum_i C_{ik}q_{il} - P_{jk}\big)y_{lj} + \sum_i(C_{ik} - P_{jk})z_{ij} \le 0\\
& q_{il} \ge 0,\quad \sum_i q_{il} = 1,\quad y_{lj} \ge 0,\quad z_{ij} \ge 0.
\end{array}$$

In this form it appears as a problem of type (PX), since for fixed $q$ it is linear in $(y, z, \gamma)$, while the set $X = \{q \mid q_{il} \ge 0,\ \sum_i q_{il} = 1\}$ is a simplex. It is readily verified that conditions B1 and B2 are satisfied, so the above method applies.

b. Bounds via Semidefinite Programming

Lagrange relaxation has also been used to derive good and cheaply computable lower bounds for the optimal value of problem (10.80), where $X$ is a polyhedron and

$$f_0(x) = \frac12\langle x, Q^0x\rangle + \langle c^0, x\rangle + d_0,\qquad f_i(x) = \frac12\langle x, Q^ix\rangle + \langle c^i, x\rangle + d_i.$$

The Lagrangian is

$$L(x,u) = \frac12\langle x, Q(u)x\rangle + \langle c(u), x\rangle + d(u)$$

with

$$Q(u) = Q^0 + \sum_{i=1}^{m}u_iQ^i,\quad c(u) = c^0 + \sum_{i=1}^{m}u_ic^i,\quad d(u) = d_0 + \sum_{i=1}^{m}u_id_i.$$

Define

$$D = \{u \in \mathbb{R}_+^m \mid Q(u) \succ 0\},\qquad \bar D = \{u \in \mathbb{R}_+^m \mid Q(u) \succeq 0\},$$

where for a matrix $A$ the notation $A \succ 0$ ($A \succeq 0$) means that $A$ is positive definite (positive semidefinite, resp.). Obviously

$$Q(u) \succ 0 \Leftrightarrow \min\{\langle x, Q(u)x\rangle \mid \|x\| = 1\} > 0,$$

$$Q(u) \succeq 0 \Leftrightarrow \min\{\langle x, Q(u)x\rangle \mid \|x\| = 1\} \ge 0,$$

so $D, \bar D$ are convex sets (when nonempty) and $\bar D$ is the closure of $D$.

If $u \in \bar D$, then finding the value

$$\theta(u) = \inf_{x\in X}L(x,u)$$

is a convex minimization problem which can be solved by efficient algorithms. But if $u \notin \bar D$, then computing $\theta(u)$ is a difficult global optimization problem. Therefore, instead of $\eta^* = \sup_{u\in\mathbb{R}_+^m}\theta(u)$, Shor (1991) proposes to consider the weaker estimate

$$\bar\eta = \sup_{u\in\bar D}\theta(u) \le \eta^*.$$

Note that in the special case $X = \mathbb{R}^n$ we have $\bar\eta = \eta^*$, because then $u \notin \bar D$ implies $\theta(u) = -\infty$. For simplicity below we assume $X = \mathbb{R}^n$.

Proposition 10.17 If $\bar\eta = \sup_{u\in\bar D}\theta(u)$ is attained on $D$, then $\bar\eta = f^\natural$.

Proof Let $\theta(\bar u) = L(\bar x, \bar u) = \sup_{u\in\bar D}\theta(u)$. By Proposition 10.11 the vector $(f_i(\bar x),\ i = 1,\dots,m)$ is a subgradient of $\theta(u)$ at $\bar u$. If the concave function $\theta(u)$ attains a maximum at a point $\bar u \in D$, then since $\bar u$ is an interior point of $\bar D$ one must have $f_i(\bar x) \le 0,\ i = 1,\dots,m$. That is, $\bar x$ is feasible, and since $\bar\eta = L(\bar x, \bar u) \le f^\natural$ one must have $\bar\eta = f^\natural$. □
Thus, to compute a lower bound, one way is to solve the problem

$$\sup\{\theta(u) \mid Q(u) \succeq 0,\ u = (u_1,\dots,u_m) \ge 0\}, \tag{10.99}$$

where $\theta(u)$ is the concave function

$$\theta(u) = \inf_{x\in\mathbb{R}^n}\ \frac12\langle x, Q(u)x\rangle + \langle c(u), x\rangle + d(u).$$

Remark 10.12 The problem (10.99) is a concave maximization (i.e., convex minimization) problem under convex constraints. It can also be written as

$$\max t\quad\text{s.t.}\quad \theta(u) - t \ge 0,\ Q(u) \succeq 0,\ u = (u_1,\dots,u_m) \ge 0.$$

Since $\theta(u) - t \ge 0$ if and only if

$$\frac12\langle x, Q(u)x\rangle + \langle c(u), x\rangle + d(u) - t \ge 0\quad \forall x \in \mathbb{R}^n,$$

it is easy to check that (10.99) is equivalent to

$$\max\left\{t\ \Big|\ \begin{bmatrix} Q(u) & c(u)\\ (c(u))^T & 2(d(u) - t)\end{bmatrix} \succeq 0,\ u = (u_1,\dots,u_m) \ge 0\right\}.$$

Thus (10.99) is actually a linear program under LMI constraints. For this problem several efficient algorithms are currently available (see, e.g., Vandenberghe and Boyd (1996) and the references therein). However, it is not known whether the bounds computed that way are consistent with a rectangular or simplicial subdivision process and can be used to produce convergent BB algorithms.
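The LMI formulation of Remark 10.12 maps directly onto an off-the-shelf SDP modeling layer. The sketch below assumes the CVXPY package and is illustrative only (the function name and data layout are not from the book): Q0, c0, d0 and the lists Qs, cs, ds hold the coefficients of $f_0$ and $f_1,\dots,f_m$.

import cvxpy as cp
import numpy as np

def shor_bound(Q0, c0, d0, Qs, cs, ds):
    """Shor bound (10.99) as an LMI program:
    max t s.t. [[Q(u), c(u)], [c(u)^T, 2(d(u)-t)]] PSD, u >= 0."""
    n, m = len(c0), len(Qs)
    u = cp.Variable(m, nonneg=True)
    t = cp.Variable()
    Qu = Q0 + sum(u[i] * Qs[i] for i in range(m))
    cu = c0 + sum(u[i] * cs[i] for i in range(m))
    du = d0 + sum(u[i] * ds[i] for i in range(m))
    M = cp.bmat([[Qu, cp.reshape(cu, (n, 1))],
                 [cp.reshape(cu, (1, n)), cp.reshape(2 * (du - t), (1, 1))]])
    # symmetrize explicitly so the PSD constraint is accepted
    prob = cp.Problem(cp.Maximize(t), [0.5 * (M + M.T) >> 0])
    prob.solve()
    return t.value, u.value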

We will have to get back to quadratic optimization under quadratic constraints in Chap. 11 on polynomial optimization. There a SIT algorithm for this class of problems will be presented which is the most suitable when a feasible solution is available and we want to check whether it is already optimal and, if not, to find a better feasible solution.

10.6 Exercises

1. Let $D = \{x \in \mathbb{R}^n \mid \langle a^i, x\rangle \le \alpha_i,\ i = 1,\dots,m\}$ be a polytope.

1. Show that there exist $\lambda_{ij} \ge 0$ such that $\langle x, x\rangle = -\sum_{i,j=1}^{m}\lambda_{ij}\langle a^i, x\rangle\langle a^j, x\rangle$. (Hint: use (2) of Exercise 11, Chap. 1, to show that the convex hull of $a^1,\dots,a^m$ contains $0$ in its interior; then for every $i = 1,\dots,n$ the $i$-th unit vector $e^i$ is a nonnegative combination of $a^1,\dots,a^m$.)

2. Deduce that for any quadratic function $f(x)$ there exists $\rho$ such that for all $r > \rho$ the quadratic function $f(x) - r\sum_{i,j=1}^{m}\lambda_{ij}(\langle a^i, x\rangle - \alpha_i)(\langle a^j, x\rangle - \alpha_j)$ is convex, where the $\lambda_{ij}$ are defined as in (1).

2. Consider the problem $\min\{f(x) \mid x \in X\}$, where $X$ is a polyhedron and $f(x)$ is an indefinite quadratic function. Denote the optimal value of this problem by $f_X^\natural$. Given any rectangle $M = [p,q]$, assume that $X \cap M = \{x \in \mathbb{R}^n \mid \langle a^i, x\rangle \le \alpha_i,\ i = 1,\dots,m\}$. Show that for sufficiently large $r$ the function $f(x) - r\sum_{i,j=1}^{m}\lambda_{ij}(\langle a^i, x\rangle - \alpha_i)(\langle a^j, x\rangle - \alpha_j)$ is a convex minorant of $f(x)$ over $M$, and a lower bound for $f_M^\natural = \min\{f(x) \mid x \in X \cap M\}$ is given by the unconstrained minimum of this convex function over $\mathbb{R}^n$. (Hint: see Exercise 1.)
3. Solve the indefinite quadratic program:

$$\begin{array}{ll}
\min & 3(x - 0.5)^2 - 2y^2\\
\text{s.t.} & x + 2y \le 1,\quad -0.5 \le x \le 1,\quad -0.75 \le y \le 2.
\end{array}$$

4. Solve the generalized linear program

$$\begin{array}{ll}
\min & -3x_1 + 4x_2\\
\text{s.t.} & y_{11}x_1 + 2x_2 + x_3 = 5,\quad y_{12}x_1 + x_2 + x_4 = 3\\
& x \ge 0;\quad y_{11}y_{12} \ge 1;\quad 0 \le y_{11} \le 2,\ 0 \le y_{12} \le 2.
\end{array}$$

5. Solve the problem

$$\min\{0.25x_1 - 0.5x_2 + x_1x_2 \mid x_1 \ge 0.5,\ 0 \le x_2 \le 0.5\}.$$

6. Solve the problem

$$\min\{-x_1 + x_1x_2 - x_2 \mid -6x_1 + 8x_2 \le 3,\ 3x_1 - x_2 \le 3,\ 0 \le x_1 \le 5,\ 0 \le x_2 \le 5\}.$$

7. Solve the problem

$$\min\ [x_1\ x_2]\begin{bmatrix}1 & 2\\ 1 & 1\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + [1\ 1]\begin{bmatrix}x_1\\ x_2\end{bmatrix}\quad\text{s.t.}$$

$$[x_1\ x_2]\begin{bmatrix}1 & 1\\ 3 & 2\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} \le 6,\quad x_1 + x_2 \le 4,\quad 0 \le x_1 \le 4,\ 0 \le x_2 \le 4.$$

8. Consider a quadratic program $\min\{f(x) \mid x \in D\}$, where $D$ is a polytope in $\mathbb{R}^n$, $f(x) = \frac12\langle x, Qx\rangle + \langle c, x\rangle$, and $Q$ is a symmetric matrix of rank $k < n$. Show that there exist $k$ linearly independent vectors $a^1,\dots,a^k$ such that for fixed values of $t_1,\dots,t_k$ the problem $\min\{f(x) \mid x \in D,\ \langle a^i, x\rangle = t_i,\ i = 1,\dots,k\}$ becomes a linear program. Develop a decomposition method for solving this problem (see Chap. 7).
9. Let $G = G(V,E)$ be an undirected graph, where $V = \{1,2,\dots,n\}$ denotes the set of vertices and $E$ the set of edges. A subset $V' \subset V$ is said to be stable if no two distinct members of $V'$ are adjacent, i.e., if $(i,j) \notin E$ for every $i, j \in V'$ such that $i \ne j$. Let $\alpha(G)$ be the cardinality of the largest stable set in $G$. Show that $\alpha(G)$ is the optimal value in the quadratic problem

$$\max_x\left\{\sum_{i=1}^{n}x_i\ \Big|\ x_ix_j = 0\ \forall (i,j) \in E,\ \ x_i(x_i - 1) = 0\ \forall i = 1,\dots,n\right\}.$$

1. Using the Lagrangian relaxation method show that an upper bound for $\alpha(G)$ is given by

$$\bar\eta = \inf\{\theta(\Lambda) \mid Q(\Lambda) \succ 0\},\qquad \theta(\Lambda) = \max_x\left[2\sum_{i=1}^{n}x_i - \langle x, Q(\Lambda)x\rangle\right],$$

where $Q(\Lambda) = [Q_{ij}]$ is a symmetric matrix such that $Q_{ij} = 0$ if $i \ne j,\ (i,j) \notin E$, and $Q_{ij} = \lambda_{ij}$ for every other $(i,j)$.

2. Show that $\theta(\Lambda) = 1/\varphi(\Lambda)$, where

$$\varphi(\Lambda) = \min_x\left\{\langle x, Q(\Lambda)x\rangle\ \Big|\ \sum_{i=1}^{n}x_i = 1\right\}$$

(Shor and Stetsenko 1989). Hint: show that for any positive definite matrix $A$ and vector $b \in \mathbb{R}^n$: $\max_x\,[2\langle b, x\rangle - \langle x, Ax\rangle] = [\min_x\{\langle x, Ax\rangle \mid \langle b, x\rangle = 1\}]^{-1}$.
10. Let $A_0, A_1, A_2, B$ be symmetric matrices of order $n$, and $a, b$ two positive numbers such that $a < b$. Consider the problem

$$\beta^* := \inf\{\beta \mid A_0 + uA_1 + vA_2 + \beta B \prec 0,\ u = x,\ xv \ge \beta,\ a \le x \le b,\ \beta > 0\}.$$

Show that

$$\frac{4ab}{(a+b)^2}\,\beta^* \le \gamma \le \beta^*,$$

where $\gamma$ is the optimal value of the problem

$$\min\{\beta \mid A_0 + uA_1 + vA_2 + \beta B \prec 0,\ 4u + (a+b)^2v \ge 4(a+b)\beta,\ a \le u \le b,\ \beta \ge 0\}.$$

Hint: Observe that in the plane $(x,v)$ the line segment $\{(x,v) \mid 4\beta x + (a+b)^2v = 4(a+b)\beta,\ a \le x \le b\}$ is tangent to the arc $\{(x,v) \mid xv = \beta,\ a \le x \le b\}$ at the point $\left(\frac{a+b}{2}, \frac{2\beta}{a+b}\right)$, and is the chord of the arc $\{(x,v) \mid xv = \frac{4ab}{(a+b)^2}\beta,\ a \le x \le b\}$.

11. Show that the water distribution network problem formulated in Exercise 8, Chap. 4, is of type (PX) and can be solved by a BB method using Lagrange relaxation for lower bounding.

12. Show that the objective function $f(q,H,J)$ in the water distribution network problem (Exercise 8, Chap. 4) is convex in $J$ and concave in $(q,H)$. Given any rectangle $M = \{(q,H) \mid \underline q^M \le q \le \overline q^M,\ \underline H^M \le H \le \overline H^M\}$, we have

$$f(\overline q^M, \overline H^M, J) \le f(q,H,J),$$

$$KL_i(\underline q_i^M/c) \le KL_i(q_i/c) \le KL_i(\overline q_i^M/c),\quad i = 1,\dots,s,$$

for all $(q,H) \in M$. Therefore, a lower bound for the objective function value over all feasible solutions $(q,H,J)$ with $(q,H) \in M$ can be computed by solving a convex program. Derive a BB method for solving the problem based on this lower bounding.
Chapter 11
Monotonic Optimization

By their very nature, many nonconvex functions encountered in mathematical modeling of real world systems in a broad range of activities, including economics and engineering, exhibit monotonicity with respect to some variables (partial monotonicity) or to all variables (total monotonicity). In this chapter we will present a mathematical framework for solving optimization problems that involve increasing functions, and more generally, functions representable as differences of increasing functions. Monotonic optimization was first considered in Rubinov et al. (2001) and extensively developed in the two main papers Tuy (2000a) and Tuy et al. (2006).

11.1 Basic Concepts

11.1.1 Increasing Functions, Normal Sets, and Polyblocks

For any two vectors $x, y \in \mathbb{R}^n$ we write $x \le y$ ($x < y$, resp.) and say that $y$ dominates $x$ ($y$ strictly dominates $x$, resp.) if $x_i \le y_i$ ($x_i < y_i$, resp.) for all $i = 1,\dots,n$. If $a \le b$ then the box $[a,b]$ is the set of all $x$ such that $a \le x \le b$. We also write $(a,b] := \{x \mid a < x \le b\}$, $[a,b) := \{x \mid a \le x < b\}$. As usual, $e$ is the vector of all ones and $e^i$ the $i$-th unit vector of the space under consideration, i.e., the vector such that $e_i^i = 1,\ e_j^i = 0\ \forall j \ne i$.

A function $f: \mathbb{R}_+^n \to \mathbb{R}$ is said to be increasing if $f(x) \le f(x')$ whenever $0 \le x \le x'$; strictly increasing if, in addition, $f(x) < f(x')$ whenever $x < x'$. A function $f$ is said to be decreasing if $-f$ is increasing. Many functions encountered in various applications are increasing in this sense. Outstanding examples are the production functions and the utility functions in mathematical economics (under the assumption that all goods are useful), polynomials (in particular quadratic functions) with nonnegative coefficients, and posynomials
functions) with nonnegative coefficients, and posynomials


$$\sum_{j=1}^{m}c_j\prod_{i=1}^{n}x_i^{a_{ij}}\quad\text{with}\ c_j \ge 0\ \text{and}\ a_{ij} \ge 0$$

(in particular, Cobb-Douglas functions $f(x) = \prod_i x_i^{a_i},\ a_i \ge 0$).
The following obvious proposition shows that the class of increasing functions includes a large variety of functions:

Proposition 11.1 (i) If $f_1, f_2$ are increasing functions, then for any nonnegative numbers $\lambda_1, \lambda_2$ the function $\lambda_1f_1 + \lambda_2f_2$ is increasing.
(ii) The pointwise supremum of a bounded above family $(f_\alpha)_{\alpha\in A}$ of increasing functions and the pointwise infimum of a bounded below family $(f_\alpha)_{\alpha\in A}$ of increasing functions are increasing.

Nontrivial examples of increasing functions given by this proposition are functions of the form $f(x) = \sup_{y\in a(x)}g(y)$, where $g: \mathbb{R}_+^n \to \mathbb{R}$ is an arbitrary function and $a: \mathbb{R}_+^n \to \mathbb{R}_+^n$ is a set-valued map with bounded images such that $a(x') \supset a(x)$ for $x' \ge x$.

A set $G \subset \mathbb{R}_+^n$ is called normal if $[0,x] \subset G$ whenever $x \in G$; in other words, if $0 \le x' \le x \in G \Rightarrow x' \in G$. The empty set, the singleton $\{0\}$, and $\mathbb{R}_+^n$ are special normal sets which we shall refer to as trivial normal sets in $\mathbb{R}_+^n$. If $G$ is a normal set, then $G \cup \{x \in \mathbb{R}_+^n \mid x_i = 0\}$ for some $i = 1,\dots,n$ is still normal.
Proposition 11.2 For any increasing function $g(x)$ on $\mathbb{R}_+^n$ and any $\gamma \in \mathbb{R}$ the set $G = \{x \in \mathbb{R}_+^n \mid g(x) \le \gamma\}$ is normal, and it is closed if $g(x)$ is lower semi-continuous. Conversely, for any closed normal set $G \subset \mathbb{R}_+^n$ with nonempty interior there exist a lower semi-continuous increasing function $g: \mathbb{R}_+^n \to \mathbb{R}$ and $\gamma \in \mathbb{R}$ such that $G = \{x \in \mathbb{R}_+^n \mid g(x) \le \gamma\}$.

Proof We need only prove the second assertion. Let $G$ be a closed normal set with nonempty interior. For every $x \in \mathbb{R}_+^n$ define $g(x) = \inf\{\lambda > 0 \mid x \in \lambda G\}$. Since $\operatorname{int}G \ne \emptyset$, there is $u > 0$ such that $[0,u] \subset G$. The halfline $\{\lambda x \mid \lambda \ge 0\}$ intersects $[0,u] \subset G$, hence $0 \le g(x) < +\infty$. Since for every $\lambda > 0$ the set $\lambda G$ is normal, if $0 \le x \le x' \in \lambda G$ then $x \in \lambda G$, too, so $g(x') \ge g(x)$, i.e., $g(x)$ is increasing. We show that $G = \{x \in \mathbb{R}_+^n \mid g(x) \le 1\}$. In fact, if $x \in G$ then obviously $g(x) \le 1$. Conversely, if $x \notin G$, then since $G$ is closed there exists $\mu > 1$ such that $x \notin \mu G$; hence, since $\mu G$ is normal, $x \notin \lambda G\ \forall \lambda \le \mu$, which means that $g(x) \ge \mu > 1$. Consequently, $G = \{x \in \mathbb{R}_+^n \mid g(x) \le 1\}$. It remains to prove that $g(x)$ is lower semi-continuous. Let $\{x^k\} \subset \mathbb{R}_+^n$ be a sequence such that $x^k \to x^0$ and $g(x^k) \le \gamma\ \forall k$. Then for any given $\gamma' > \gamma$ we have $\inf\{\lambda \mid x^k \in \lambda G\} < \gamma'\ \forall k$, i.e., $x^k \in \gamma'G\ \forall k$; hence $x^0 \in \gamma'G$ in view of the closedness of the set $\gamma'G$. This implies $g(x^0) \le \gamma'$, and since $\gamma'$ can be taken arbitrarily close to $\gamma$, we must have $g(x^0) \le \gamma$, as was to be proved. □
A point $y \in \mathbb{R}^n$ is called an upper boundary point of a normal set $G$ if $y \in \operatorname{cl}G$ and there is no point $x \in G$ such that $x = a + \lambda(y - a)$ with $\lambda > 1$. The set of all upper boundary points of $G$ is called its upper boundary and is denoted $\partial^+G$.

A point $y \in \operatorname{cl}G$ is called a Pareto point of $G$ if $x \in G,\ x \ge y \Rightarrow x = y$. Since such a point necessarily belongs to $\partial^+G$, it is also called an upper extreme point of $G$.
Proposition 11.3 If $G \subset [a,b]$ is a compact normal set with nonempty interior, then for every point $z \in [a,b]\setminus\{a\}$ the line segment joining $a$ to $z$ meets $\partial^+G$ at a unique point $\pi_G(z)$ defined by

$$\pi_G(z) = a + \bar\lambda(z - a),\qquad \bar\lambda = \max\{\lambda > 0 \mid a + \lambda(z - a) \in G\}. \tag{11.1}$$

Proof If $u \in \operatorname{int}G$ then $[a,u] \subset G$ and the set $\{\lambda > 0 \mid a + \lambda(z - a) \in G\}$ is nonempty. So the point $\pi_G(z)$ defined by (11.1) exists and is unique. Clearly $\pi_G(z) \in G$, and for $\lambda > \bar\lambda$ we have $a + \lambda(z - a) \notin G$, hence $\pi_G(z)$ is in fact the point where the line segment joining $a$ to $z$ meets $\partial^+G$. □
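In computations, $\pi_G(z)$ is typically found by a one-dimensional search along the segment. The following Python sketch (illustrative, not from the book) does this by bisection, given only a membership oracle for $G$ and assuming $a \in G$, $z \notin G$ (so $\bar\lambda \in [0,1)$):

import numpy as np

def pi_G(in_G, a, z, tol=1e-8):
    """Bisection sketch for pi_G(z) = a + lam*(z - a), formula (11.1),
    given a membership oracle in_G(x) for the compact normal set G."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_G(a + mid * (z - a)):
            lo = mid          # still inside G: move up
        else:
            hi = mid          # outside G: move down
    return a + lo * (z - a)

# Example: G = {x in [0,1]^2 : x1 + x2 <= 1} is a compact normal set
a, z = np.zeros(2), np.array([1.0, 1.0])
print(pi_G(lambda x: x.sum() <= 1.0, a, z))   # approximately (0.5, 0.5)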
A set $H \subset \mathbb{R}_+^n$ is called conormal (or sometimes, reverse normal) if $x + \mathbb{R}_+^n \subset H$ whenever $x \in H$. It is called conormal in a box $[a,b]$ if $b \ge x' \ge x \ge a,\ x \in H$ implies $x' \in H$, or equivalently, if $x \notin H$ whenever $a \le x \le x' \le b,\ x' \notin H$. Clearly a set $H$ is conormal if and only if its complement $\mathbb{R}_+^n\setminus H$ is normal.

The following propositions are analogous to Propositions 11.2 and 11.3 and can be proved by similar arguments:

1. For any increasing function $h(x)$ on $\mathbb{R}_+^n$ and any $\gamma \in \mathbb{R}$ the set $H = \{x \in \mathbb{R}_+^n \mid h(x) \ge \gamma\}$ is conormal and is closed if $h(x)$ is upper semi-continuous. Conversely, for any closed conormal set $H \subset \mathbb{R}_+^n$ such that $\mathbb{R}_+^n\setminus H$ has a nonempty interior there exist an upper semi-continuous increasing function $h: \mathbb{R}_+^n \to \mathbb{R}$ and $\gamma \in \mathbb{R}$ such that $H = \{x \in \mathbb{R}_+^n \mid h(x) \ge \gamma\}$.

A point $y \in [a,b]$ is called a lower boundary point of a conormal set $H \subset [a,b]$ if $y \in \operatorname{cl}H$ and there is no point $x \in H$ such that $x = a + \lambda(y - a)$ with $\lambda < 1$. The set of all lower boundary points of $H$ is called its lower boundary and is denoted $\partial^-H$. A point $y \in \operatorname{cl}H$ is called a Pareto point, or a lower extreme point, of $H$ if $x \in H,\ x \le y \Rightarrow x = y$.

2. If $H \subset [a,b]$ is a closed conormal set with $b \in \operatorname{int}H$, then for every point $z \in [a,b]\setminus H$ the line segment joining $z$ to $b$ meets $\partial^-H$ at a unique point $\rho_H(z)$ defined by

$$\rho_H(z) = b - \bar\lambda(b - z),\qquad \bar\lambda = \max\{\lambda > 0 \mid b - \lambda(b - z) \in H\}. \tag{11.2}$$

Given a set $A \subset \mathbb{R}_+^n$, the whole orthant $\mathbb{R}_+^n$ is a normal set containing $A$. The intersection of all normal sets containing $A$, i.e., the smallest normal set containing $A$, is called the normal hull of $A$ and denoted $A^e$. The conormal hull of $A$, denoted $A^b$, is the smallest conormal set containing $A$.

Proposition 11.4 (i) The normal hull of a set $A \subset [a,b]$ is the set $A^e = \cup_{z\in A}[a,z]$. If $A$ is compact then so is $A^e$.
(ii) The conormal hull of a set $A \subset [a,b]$ is the set $A^b = \cup_{z\in A}[z,b]$. If $A$ is compact then so is $A^b$.

Proof It suffices to prove (i), because the proof of (ii) is similar. Let $P = \cup_{z\in A}[a,z]$. Clearly $P$ is normal and $P \supset A$, hence $P \supset A^e$. Conversely, if $x \in P$ then $x \in [a,z]$ for some $z \in A \subset A^e$, hence $x \in A^e$ by normality of $A^e$, so $P \subset A^e$, proving the equality $P = A^e$. If $A$ is compact, then for any sequence $\{x^k\} \subset A^e$ we have $x^k \in [a,z^k]$ with $z^k \in A$, so, up to subsequences, $z^k \to z^0 \in A$ and $x^k \to x^0 \in [a,z^0] \subset A^e$, proving the compactness of $A^e$. □

Proposition 11.5 A compact normal set $G \subset [a,b]$ has at least one upper extreme point and is equal to the normal hull of the set $V$ of its upper extreme points.

Proof For each $i = 1,\dots,n$ an extreme point of the line segment $\{x \in G \mid x = \lambda e^i,\ \lambda \ge 0\}$ is an upper extreme point of $G$. So $V \ne \emptyset$. We show that $G = V^e$. Since $V \subset \partial^+G$, we have $V^e \subset (\partial^+G)^e \subset G$, so it suffices to show that $G \subset V^e$. If $y \in G$ then $\pi_G(y) \in \partial^+G$, so $y \in (\partial^+G)^e$. Define $x^1 \in \operatorname{argmax}\{x_1 \mid x \in G,\ x \ge y\}$ and $x^i \in \operatorname{argmax}\{x_i \mid x \in G,\ x \ge x^{i-1}\}$ for $i = 2,\dots,n$. Then $v := x^n \in G$, $v \ge y$, and by construction $x \in G,\ x \ge v \Rightarrow x = v$. This means that $y \le v \in V$, hence $y \in V^e$. Consequently, $G \subset V^e$, as was to be proved. □
A polyblock $P$ is the normal hull of a finite set $V \subset [a,b]$, called its vertex set. By the above proposition, $P = \cup_{z\in V}[a,z]$. A point $z \in V$ is called a proper vertex of $P$ if there is no $z' \in V\setminus\{z\}$ such that $z' \ge z$. It is easily seen that a proper vertex of a polyblock is nothing but a Pareto point of it. The vertex set of a polyblock $P$ is denoted by $\operatorname{vert}P$, the set of proper vertices by $\operatorname{pvert}P$. A vertex which is not proper is called improper. Clearly a polyblock is fully determined by its proper vertices; more precisely, $P = (\operatorname{pvert}P)^e$, i.e., a polyblock is the normal hull of its proper vertex set (Fig. 11.1).

Similarly, a copolyblock (or reverse polyblock) $Q$ is the conormal hull of a finite set $T \subset [a,b]$, i.e., $Q = \cup_{z\in T}[z,b]$. The set $T$ is called the vertex set of the copolyblock. A proper vertex of a copolyblock $Q$ is a vertex $z$ for which there is no vertex $z' \ne z$ such that $z' \le z$. A vertex which is not proper is said to be improper. A copolyblock is fully determined by its proper vertex set, i.e., a copolyblock is the conormal hull of its proper vertex set.

Fig. 11.1 Polyblock, Copolyblock



Proposition 11.6 (i) The intersection of finitely many polyblocks is a polyblock.
(ii) The intersection of finitely many copolyblocks is a copolyblock.

Proof If $V_1, V_2$ are the vertex sets of two polyblocks $P_1, P_2$, respectively, then

$$P_1 \cap P_2 = \left(\cup_{z\in V_1}[a,z]\right) \cap \left(\cup_{y\in V_2}[a,y]\right) = \cup_{(z,y)\in V_1\times V_2}[a,z]\cap[a,y] = \cup_{(z,y)\in V_1\times V_2}[a,\ z\wedge y],$$

where $u = z\wedge y$ means $u_i = \min\{z_i, y_i\}\ \forall i = 1,\dots,n$. Similarly, if $T_1, T_2$ are the vertex sets of two copolyblocks $Q_1, Q_2$, respectively, then $Q_1 \cap Q_2 = \cup_{(z,y)\in T_1\times T_2}[z,b]\cap[y,b] = \cup_{(z,y)\in T_1\times T_2}[z\vee y,\ b]$, where $v = z\vee y$ means $v_i = \max\{z_i, y_i\}\ \forall i = 1,\dots,n$. □
Proposition 11.7 If $y \in [a,b)$ then the set $[a,b]\setminus(y,b]$ is a polyblock with proper vertices

$$u^i = b + (y_i - b_i)e^i,\quad i = 1,\dots,n. \tag{11.3}$$

Proof We have $(y,b] = \cap_{i=1}^{n}\{x \in [a,b] \mid x_i > y_i\}$, so $[a,b]\setminus(y,b] = \cup_{i=1}^{n}\{x \in [a,b] \mid x_i \le y_i\} = \cup_{i=1}^{n}[a,u^i]$. □
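Both of the last two constructions are one-liners in practice. The Python sketch below (illustrative only) computes the proper vertices (11.3) of the complementary polyblock and the candidate vertex set of a polyblock intersection via componentwise minima:

import numpy as np

def complement_vertices(y, b):
    """Proper vertices u^i = b + (y_i - b_i) e^i of the polyblock
    [a,b] minus (y,b], formula (11.3)."""
    U = np.tile(b, (len(b), 1))
    np.fill_diagonal(U, y)            # replace the i-th coordinate b_i by y_i
    return U

def intersect_polyblocks(V1, V2):
    """Vertex set of the intersection of two polyblocks (Proposition 11.6):
    all componentwise minima z ^ y, z in V1, y in V2; improper vertices may
    remain and can be pruned afterwards."""
    return np.array([np.minimum(z, y) for z in V1 for y in V2])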

11.1.2 DM Functions and DM Constraints

A function $f: \mathbb{R}_+^n \to \mathbb{R}$ is said to be a dm (difference-of-monotonic) function if it can be represented as a difference of two increasing functions: $f(x) = f_1(x) - f_2(x)$, where $f_1, f_2$ are increasing functions.

Proposition 11.8 If $f_1(x), f_2(x)$ are dm, then
(i) for any $\lambda_1, \lambda_2 \in \mathbb{R}$ the function $\lambda_1f_1(x) + \lambda_2f_2(x)$ is dm;
(ii) the functions $\max\{f_1(x), f_2(x)\}$ and $\min\{f_1(x), f_2(x)\}$ are dm.

Proof (i) is trivial. To prove (ii), let $f_i = p_i(x) - q_i(x),\ i = 1,2$, with $p_i, q_i$ increasing functions. Since $f_1 = p_1 + q_2 - (q_1 + q_2)$ and $f_2 = p_2 + q_1 - (q_1 + q_2)$, we have $\max\{f_1, f_2\} = \max\{p_1 + q_2,\ p_2 + q_1\} - (q_1 + q_2)$ and $\min\{f_1, f_2\} = \min\{p_1 + q_2,\ p_2 + q_1\} - (q_1 + q_2)$. □
Proposition 11.9 The following functions are dm on $[a,b] \subset \mathbb{R}_+^n$:
(i) any polynomial, and more generally, any synomial $P(x) = \sum_\alpha c_\alpha x^\alpha$, where $\alpha = (\alpha_1,\dots,\alpha_n) \ge 0$, $x^\alpha = x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}$, and $c_\alpha \in \mathbb{R}$;
(ii) any convex function $f: [a,b] \subset \mathbb{R}_+^n \to \mathbb{R}$;
(iii) any Lipschitz function $f: [a,b] \to \mathbb{R}$ with Lipschitz constant $K$;
(iv) any $C^1$-function $f: [a,b] \subset \mathbb{R}_+^n \to \mathbb{R}$.

Proof (i) $P(x) = P_1(x) - P_2(x)$, where $P_1(x)$ is the sum of the terms $c_\alpha x^\alpha$ with $c_\alpha > 0$ and $-P_2(x)$ is the sum of the terms $c_\alpha x^\alpha$ with $c_\alpha < 0$.

(ii) Since the set $\cup_{a<x<b}\partial f(x)$ is nonempty and bounded (Theorem 2.6), we can take $M > K := \sup\{\|p\| \mid p \in \partial f(x),\ x \in [a,b]\}$. Then the function $g(x) = f(x) + M\sum_{i=1}^{n}x_i$ satisfies, for $a < x \le x' < b$:

$$g(x') - g(x) = f(x') - f(x) + M\sum_{i=1}^{n}(x_i' - x_i) \ge \langle p, x' - x\rangle + M\sum_{i=1}^{n}(x_i' - x_i),$$

where $p \in \partial f(x)$. Hence $g(x') - g(x) \ge (M - K)\sum_{i=1}^{n}(x_i' - x_i) \ge 0$, so $g(x)$ is increasing.

(iii) The function $g(x) = f(x) + M\sum_{i=1}^{n}x_i$, where $M > K$, is increasing on $[a,b]$ because one has, for $a \le x \le x' \le b$:

$$g(x') - g(x) = M\sum_{i=1}^{n}(x_i' - x_i) + f(x') - f(x) \ge M\sum_{i=1}^{n}(x_i' - x_i) - K|x' - x| \ge 0.$$

(iv) For $M > K := \sup_{x\in[a,b]}\|\nabla f(x)\|$ the function $g(x) = f(x) + M\sum_{i=1}^{n}x_i$ is increasing. In fact, by the mean-value theorem, for $a \le x \le x' \le b$ there exists $\tau \in (0,1)$ such that

$$f(x') - f(x) = \langle \nabla f(x + \tau(x' - x)),\ x' - x\rangle,$$

hence $g(x') - g(x) \ge M\sum_{i=1}^{n}(x_i' - x_i) - K\sum_{i=1}^{n}(x_i' - x_i) \ge 0$. □

As a consequence of Proposition 11.9(i), the class of dm functions on $[a,b]$ is dense in the space $C_{[a,b]}$ of continuous functions on $[a,b]$ with the sup-norm topology.
A function $f(x)$ is said to be locally increasing (locally dm, resp.) on $\mathbb{R}_+^n$ if for every $x \in \mathbb{R}_+^n$ there is a box $[a,b]$ containing $x$ in its interior such that $f(x)$ is increasing (dm, resp.) on $[a,b]$.

Proposition 11.10 A locally increasing function on $\mathbb{R}_+^n$ is increasing on $\mathbb{R}_+^n$. A locally dm function on a box $[a,b]$ is dm on this box.

Proof Suppose $f(x)$ is locally increasing and let $0 \le x \le x'$. Since $[x,x']$ is compact, there exists a finite covering of it by boxes such that $f(x)$ is increasing on each of these boxes. Then the segment joining $x$ to $x'$ can be divided into a finite number of intervals by points $x^1 = x \le x^2 \le \dots \le x^m = x'$ such that $f(x^i) \le f(x^{i+1}),\ i = 1,\dots,m-1$. Hence $f(x) \le f(x^2) \le \dots \le f(x')$.

Suppose now that $f(x)$ is locally dm on $[a,b]$. By compactness the box $[a,b]$ can be covered by a finite number of boxes $[p^1,q^1],\dots,[p^m,q^m]$ such that $f(x) = u_i(x) - v_i(x)\ \forall x \in [p^i,q^i]$, with $u_i, v_i$ increasing functions on $[p^i,q^i]$. For $x \in [p^i,q^i]\cap[p^j,q^j]$ we have $u_i(x) - v_i(x) = u_j(x) - v_j(x) = f(x)$, so we can define $u(x) = u_i(x) + \sum_{j\ne i}\max\{0,\ u_j(q^i) - u_i(q^i)\}$ and $v(x) = v_i(x) + \sum_{j\ne i}\max\{0,\ u_j(q^i) - u_i(q^i)\}$ for every $x \in [p^i,q^i]$. These two functions are increasing and obviously $f(x) = u(x) - v(x)$. □

11.1.3 Canonical Monotonic Optimization Problem

A monotonic optimization problem is an optimization problem of the form

max{f(x) | g_i(x) ≤ 0, i = 1,...,m, x ∈ [a,b]},  (11.4)

where the objective function f(x) and the constraint functions g_i(x) are dm functions on the box [a,b] ⊂ R^n_+. For the purpose of optimization an important property of the class of dm functions is that, like the class of dc functions, it is closed under linear operations and under the operations of taking pointwise maximum and pointwise minimum. Exploiting this property we have the following result:
Theorem 11.1 Any monotonic optimization problem can be reduced to the canonical form

(MO)  max{f(x) | g(x) ≤ 0 ≤ h(x), x ∈ [a,b]}  (11.5)

where f, g, h are increasing functions.

Proof Consider a monotonic optimization problem in the general form (11.4). Since the system of inequalities g_i(x) ≤ 0, i = 1,...,m, is equivalent to the single inequality g(x) := max_{i=1,...,m} g_i(x) ≤ 0, where g(x) is dm, we can rewrite the problem as

max{f₁(x) − f₂(x) | g₁(x) − g₂(x) ≤ 0, x ∈ [a,b]},

where f₁, f₂, g₁, g₂ are increasing functions. Introducing two additional variables z, t we can further write

max{f₁(x) + z | f₂(x) + z ≤ 0, g₁(x) + t ≤ 0 ≤ g₂(x) + t,
    z ∈ [−f₂(b), −f₂(a)], t ∈ [−g₂(b), −g₁(a)], x ∈ [a,b]}.

Finally, replacing the system of two inequalities f₂(x) + z ≤ 0, g₁(x) + t ≤ 0 by the single inequality max{f₂(x) + z, g₁(x) + t} ≤ 0 and changing the notation yields the canonical form (MO). ⊓⊔

Remark 11.1 A minimization problem such as

min{f(x) | g(x) ≤ 0 ≤ h(x), x ∈ [a,b]}

can be converted to the form (MO). In fact, by setting x = a + b − y, f̃(y) = −f(a + b − y), g̃(y) = −g(a + b − y), h̃(y) = −h(a + b − y), it can be rewritten as

max{f̃(y) | h̃(y) ≤ 0 ≤ g̃(y), y ∈ [a,b]}.

For simplicity we will assume in the following that f(x), h(x) are upper semi-continuous, while g(x) is lower semi-continuous. Then the set G := {x ∈ [a,b] | g(x) ≤ 0} is a compact normal set, H := {x ∈ [a,b] | h(x) ≥ 0} is a compact conormal set, and the problem (MO) can be rewritten as

max{f(x) | x ∈ G ∩ H}.  (11.6)

Proposition 11.11 If G ∩ H ≠ ∅ the problem (MO) has at least one optimal solution which is a Pareto point of G.
Proof Since the feasible set G ∩ H is compact, while the objective function f(x) is upper semi-continuous, the problem has an optimal solution x̄. If x̄ is not a Pareto point of G then the set G_{x̄} := {x ∈ G | x ≥ x̄} is a normal set in the box [x̄,b], and any Pareto point x̃ of this set is clearly a Pareto point of G with f(x̃) = f(x̄), i.e., x̃ is also optimal. ⊓⊔
Proposition 11.12 Let G ⊂ [a,b] be a compact normal set and z ∈ [a,b] \ G, y = π_G(z). Then the polyblock P with proper vertices

u^i = b + (y_i − b_i)e^i,  i = 1,...,n  (11.7)

separates z strictly from G, i.e., P ⊃ G, z ∉ P.
Proof Recall that π_G(z) denotes the point where the line segment from a to z meets the upper boundary ∂⁺G of G, so y < z and hence z ∈ (y,b]. By Proposition 11.7, P = [a,b] \ (y,b] is a polyblock with proper vertices (11.7); since G is normal and y ∈ ∂⁺G, we have G ∩ (y,b] = ∅, i.e., G ⊂ P, while z ∉ P. ⊓⊔
The next corollary shows that with respect to compact normal sets polyblocks behave like polytopes with respect to compact convex sets.
Corollary 11.1 Any compact normal set is the intersection of a family of polyblocks. In other words, a compact normal set can be approximated as closely as desired by a polyblock.
Proof Clearly, if G ⊂ [a,b] then P₀ = [a,b] is a polyblock containing G. Therefore the family {P_i, i ∈ I} of polyblocks containing G is nonempty and G ⊂ ∩_{i∈I} P_i. If there were x ∈ (∩_{i∈I} P_i) \ G then by the above proposition there would exist a polyblock P ⊃ G such that x ∉ P, a contradiction. ⊓⊔

On the basis of these properties, one method for solving problem (MO) is to generate inductively a nested sequence of polyblocks outer approximating the feasible set:

[a,b] =: P₀ ⊃ P₁ ⊃ ⋯ ⊃ G ∩ H,

in such a way that

max{f(x) | x ∈ P_k} ↘ max{f(x) | x ∈ G ∩ H}.

This can be done as follows. Start with the polyblock P₁ = [a,b]. At iteration k a polyblock P_k ⊃ G has been constructed, with proper vertex set T_k. If T_k′ = T_k ∩ H = ∅, stop: the problem is infeasible. Otherwise let z^k ∈ argmax{f(x) | x ∈ T_k′}. Since f(x) is an increasing function, f(z^k) is the maximum of f(x) over P_k ∩ H ⊃ G ∩ H. If z^k ∈ G, then stop: z^k ∈ G ∩ H, i.e., z^k is feasible, so f(z^k) is also the maximum of f(x) over G ∩ H, i.e., z^k solves the problem. If, on the other hand, z^k ∉ G, then compute x^k = π_G(z^k) and let P_{k+1} = ([a,b] \ (x^k,b]) ∩ P_k. By Proposition 11.7, [a,b] \ (x^k,b] is a polyblock, so the polyblock P_{k+1} satisfies G ⊂ P_{k+1} ⊂ P_k \ {z^k}. The procedure can then be repeated at the next iteration.
This method is analogous to the outer approximation method for minimizing a convex function over a convex set. Although it can be proved that, under mild conditions, as k → +∞ the sequence x^k generated by this method converges to a global optimal solution of (MO), the convergence is generally slow. To speed up convergence the method should be improved in two points: (1) bounding: using valid cuts, reduce the current box [p,q] in order to have tighter bounds; (2) computing the set T_{k+1} (the proper vertex set of the polyblock P_{k+1}) not from scratch, but by deriving it from T_k.

11.2 The Polyblock Algorithm

11.2.1 Valid Reduction

At any stage of the solution process, some information may have been gathered about the global optimal solution. To economize the computational effort this information should be used to reduce the search domain by removing certain portions where the global optimal solution cannot possibly be found. The subproblem which arises is the following.
Given a number γ ∈ f(G ∩ H) (the current best feasible objective function value) and a box [p,q] ⊂ [a,b], check whether the box [p,q] contains at least one solution to the system

g(x) ≤ 0 ≤ h(x), f(x) ≥ γ, x ∈ [p,q],  (11.8)



and to find a smaller box [p′,q′] ⊂ [p,q] still containing at least one solution of (11.8). Such a box [p′,q′] is called a γ-valid reduction of [p,q] and is denoted red_γ[p,q].
Observe that if g(q) ≥ 0 then for every x ∈ [p,q] satisfying (11.8) the line segment joining x to q intersects the surface g(·) = 0 at a point x′ ∈ [p,q] satisfying

g(x′) = 0 ≤ h(x′),  f(x′) ≥ f(x) ≥ γ.

Therefore, the box [p′,q′] should contain all x ∈ [p,q] satisfying g(x) = 0 ≤ h(x), f(x) ≥ γ, i.e., satisfying

g(x) ≤ 0 ≤ h_γ(x) := min{h(x), f(x) − γ}.  (11.9)

Lemma 11.1 (i) If g(p) > 0 or h_γ(q) < 0 then there is no x ∈ [p,q] satisfying (11.9).
(ii) If g(p) ≤ 0 then the box [p,q′], where q′ = p + Σ_{i=1}^n α_i(q_i − p_i)e^i with

α_i = sup{α | 0 ≤ α ≤ 1, g(p + α(q_i − p_i)e^i) ≤ 0},  i = 1,...,n,  (11.10)

still contains all solutions to (11.9) in [p,q].
(iii) If g(p) ≤ 0 ≤ h_γ(q′) then the box [p′,q′], where p′ = q′ − Σ_{i=1}^n β_i(q′_i − p_i)e^i with

β_i = sup{β | 0 ≤ β ≤ 1, h_γ(q′ − β(q′_i − p_i)e^i) ≥ 0},  i = 1,...,n,  (11.11)

still contains all solutions to (11.9) in [p,q].


Proof It suffices to prove (ii), because (iii) can be proved analogously while (i) is obvious. Since q′_i = α_i q_i + (1 − α_i)p_i with 0 ≤ α_i ≤ 1, it follows that p_i ≤ q′_i ≤ q_i ∀i = 1,...,n, i.e., [p,q′] ⊂ [p,q]. Recall that

G := {x ∈ [p,q] | g(x) ≤ 0}.

For any x ∈ G ∩ [p,q] we have by normality [p,x] ⊂ G, so x^i := p + (x_i − p_i)e^i ∈ G, i = 1,...,n. But x_i ≤ q_i, so x^i = p + α(q_i − p_i)e^i with 0 ≤ α ≤ 1. This implies that α ≤ α_i, i.e., x_i ≤ p_i + α_i(q_i − p_i), i = 1,...,n, and consequently x ≤ q′, i.e., x ∈ [p,q′]. Thus G ∩ [p,q] ⊂ G ∩ [p,q′], which completes the proof because the converse inclusion is obvious from the fact [p,q′] ⊂ [p,q]. ⊓⊔
Clearly the box [p,q′] defined in (ii) is obtained from [p,q] by cutting off the set ∪_{i=1}^n {x | x_i > q′_i}, while the box [p′,q′] defined in (iii) is obtained from [p,q′] by cutting off the set ∪_{i=1}^n {x | x_i < p′_i}. The cut ∪_{i=1}^n {x | x_i > q′_i} is referred to as an upper γ-valid cut with vertex q′ and the cut ∪_{i=1}^n {x | x_i < p′_i} as a lower γ-valid cut with vertex p′, applied to the box [p,q] (Fig. 11.2). We now deduce a closed formula for red_γ[p,q].

Fig. 11.2 Reduced box

Given any continuous increasing function φ(α): [0,1] → R such that φ(0) < 0 < φ(1), let N(φ) denote the largest value α ∈ (0,1) satisfying φ(α) = 0. For every i = 1,...,n let q^i := p + (q_i − p_i)e^i.
Proposition 11.13 A γ-valid reduction of a box [p,q] is given by the formula

red_γ[p,q] = ∅ if g(p) > 0 or h_γ(q′) < 0;
red_γ[p,q] = [p′,q′] if g(p) ≤ 0 and h_γ(q′) ≥ 0,  (11.12)

where

q′ = p + Σ_{i=1}^n α_i(q_i − p_i)e^i,  p′ = q′ − Σ_{i=1}^n β_i(q′_i − p_i)e^i,  (11.13)

with

α_i = 1 if g(q^i) ≤ 0, α_i = N(φ_i) if g(q^i) > 0, where φ_i(t) = g(p + t(q_i − p_i)e^i);
β_i = 1 if h_γ(q′^i) ≥ 0, β_i = N(ψ_i) if h_γ(q′^i) < 0, where ψ_i(t) = −h_γ(q′ − t(q′_i − p_i)e^i) and q′^i := q′ − (q′_i − p_i)e^i.

Proof Straightforward from Lemma 11.1. ⊓⊔
Remark 11.2 The reduction operation, which is meant to enhance the efficiency of the search procedure, is worthwhile only if its cost does not offset its advantage. Therefore, although the exact values of N(φ_i), N(ψ_i) can be computed easily using, e.g., a Bolzano bisection procedure, in practical implementations of the above formula it is enough to compute only approximate values of N(φ_i), N(ψ_i) by applying just a few iterations of the Bolzano bisection procedure.
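For illustration, here is a minimal Python sketch of the γ-valid reduction (11.12)–(11.13), with N(·) approximated by a fixed number of Bolzano bisection steps as suggested in the remark. The helper names (largest_root, reduce_box) and the iteration count are assumptions of this sketch, not from the text.

import numpy as np

def largest_root(phi, iters=10):
    """Approximate N(phi) for an increasing phi on [0,1] with
    phi(0) <= 0 < phi(1), by Bolzano bisection. Returns an
    over-estimate of the root, so the resulting cut stays valid
    (it never removes a solution of (11.9))."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phi(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return hi

def reduce_box(g, h, f, gamma, p, q):
    """gamma-valid reduction red_gamma[p,q] per (11.12)-(11.13);
    g, h, f increasing. Returns (p', q') or None if empty."""
    n = len(p)
    h_gamma = lambda x: min(h(x), f(x) - gamma)
    if g(p) > 0:
        return None
    qp = p.copy()                             # will become q'
    for i in range(n):
        qi = p.copy(); qi[i] = q[i]           # q^i = p + (q_i - p_i)e^i
        if g(qi) <= 0:
            alpha = 1.0
        else:
            alpha = largest_root(lambda t: g(p + t*(q[i]-p[i])*np.eye(n)[i]))
        qp[i] = p[i] + alpha*(q[i] - p[i])
    if h_gamma(qp) < 0:
        return None
    pp = qp.copy()                            # will become p'
    for i in range(n):
        qpi = qp.copy(); qpi[i] = p[i]        # q'^i = q' - (q'_i - p_i)e^i
        if h_gamma(qpi) >= 0:
            beta = 1.0
        else:
            beta = largest_root(lambda t: -h_gamma(qp - t*(qp[i]-p[i])*np.eye(n)[i]))
        pp[i] = qp[i] - beta*(qp[i] - p[i])
    return pp, qp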

Proposition 11.14 Let E_γ := {x ∈ G ∩ H | f(x) ≥ γ}, and let P be a polyblock containing E_γ with proper vertex set V. Let V′ be the set obtained from V by deleting

every z ∈ V satisfying red_γ[a,z] = ∅ and replacing every other z ∈ V with the highest corner z′ of the box red_γ[a,z]. Then the polyblock P′ generated by V′ satisfies

{x ∈ G ∩ H | f(x) ≥ γ} ⊂ P′ ⊂ P.

Proof Since E_γ ∩ [a,z] = ∅ for every deleted z, while E_γ ∩ [a,z] ⊂ red_γ[a,z] ⊂ [a,z′] for every other z, the conclusion follows. ⊓⊔
We shall refer to the polyblock P′ with vertex set V′ as the γ-valid reduction of the polyblock P and denote it by red_γ P.

11.2.2 Generating the New Polyblock

The next question after reducing the search domain is how to derive the new polyblock P_{k+1} that should approximate G ∩ H better than the current polyblock P_k. The general subproblem to be solved is the following.
Given a polyblock P ⊂ [a,b] with V = pvert P, a vertex v ∈ V and a point x ∈ [a,v) such that G ∩ H ⊂ P \ (x,b], determine a new polyblock P′ satisfying P \ (x,b] ⊂ P′ ⊂ P \ {v}.
For any two vectors z, y let J(z,y) = {j | z_j > y_j}, and if a ≤ x ≤ z ≤ b define z^i = z + (x_i − z_i)e^i, i = 1,...,n.
Proposition 11.15 Let P be a polyblock with proper vertex set V ⊂ [a,b] and let x ∈ [a,b] satisfy V* := {z ∈ V | x < z} ≠ ∅.
(i) P′ := P \ (x,b] is a polyblock with vertex set

T′ = (V \ V*) ∪ {z^i = z + (x_i − z_i)e^i | z ∈ V*, i = 1,...,n}.  (11.14)

(ii) The proper vertex set of P′ is obtained from T′ by removing improper elements according to the rule: for every pair z ∈ V*, y ∈ V₊ := {y ∈ V | x ≤ y} compute J(z,y) = {j | z_j > y_j}, and if J(z,y) = {i} then z^i is an improper element of T′.
Proof Since [a,z] ∩ (x,b] = ∅ for every z ∈ V \ V*, it follows that P \ (x,b] = P₁ ∪ P₂, where P₁ is the polyblock with vertex set V \ V* and P₂ = (∪_{z∈V*}[a,z]) \ (x,b] = ∪_{z∈V*}([a,z] \ (x,b]). Noting that by Proposition 11.7 [a,b] \ (x,b] is a polyblock with vertices u^i = b + (x_i − b_i)e^i, i = 1,...,n, we can then write [a,z] \ (x,b] = [a,z] ∩ ([a,b] \ (x,b]) = [a,z] ∩ (∪_{i=1}^n [a,u^i]) = ∪_{i=1}^n ([a,z] ∩ [a,u^i]) = ∪_{i=1}^n [a, z∧u^i], where z∧u^i denotes the vector whose j-th component equals min{z_j, u^i_j}. Hence P₂ = ∪{[a, z∧u^i] | z ∈ V*, i = 1,...,n}. Since z∧u^i = z^i for z ∈ V*, this shows that the vertex set of P \ (x,b] is the set T′ given by (11.14).

It remains to show that every y ∈ V \ V* is proper, while a z^i = z + (x_i − z_i)e^i with z ∈ V* is improper if and only if J(z,y) = {i} for some y ∈ V₊.
Since every y ∈ V \ V* is proper in V, while z^i ≤ z ∈ V* for every z^i, it is clear that every y ∈ V \ V* is proper in T′. Therefore, an improper element must be some z^i such that z^i ≤ y for some y ∈ T′. Two cases are possible: (1) y ∈ V; (2) y ∈ T′ \ V. In the former case, since obviously x ≤ z^i we must have x ≤ y, i.e., y ∈ V₊; furthermore, z_j = z^i_j ≤ y_j ∀j ≠ i, hence, since z ≰ y, it follows that z_i > y_i, i.e., J(z,y) = {i}. In the latter case, z^i ≤ y^l for some y ∈ V* and some l ∈ {1,...,n}. Then it is clear that we cannot have y = z, so y ≠ z and z^i_j ≤ y^l_j ∀j = 1,...,n. If l = i then this implies z_j = z^i_j ≤ y^i_j = y_j ∀j ≠ i, hence, since z ≰ y, it follows that z_i > y_i and J(z,y) = {i}. On the other hand, if l ≠ i then from z^i_j ≤ y^l_j ∀j = 1,...,n we have z_j ≤ y_j ∀j ≠ i and again, since z ≰ y, we must have z_i > y_i, so J(z,y) = {i}. Thus an improper z^i must satisfy J(z,y) = {i} for some y ∈ V₊. Conversely, if J(z,y) = {i} for some y ∈ V₊ then z_j ≤ y_j ∀j ≠ i; hence, since z^i_i = x_i, and x_i < y_i if y ∈ V* or else x_i ≤ y_i if y ∈ V₊ \ V*, it follows that z^i ≤ y^i if y ∈ V* or else z^i ≤ y if y ∈ V₊ \ V*; so in either case z^i is improper. This completes the proof of the proposition. ⊓⊔
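A small Python sketch of the vertex-set update of Proposition 11.15 (the list-based data layout and function name are illustrative):

import numpy as np

def new_vertex_set(V, x):
    """Proper vertex set of P \\ (x,b] per Proposition 11.15.
    V: list of proper vertices (numpy arrays); x: cut point with
    x < z for at least one z in V."""
    n = len(x)
    V_star = [z for z in V if np.all(x < z)]     # V* = {z : x < z}
    V_plus = [y for y in V if np.all(x <= y)]    # V+ = {y : x <= y}
    kept = [z for z in V if not np.all(x < z)]   # V \ V*
    new = []
    for z in V_star:
        for i in range(n):
            improper = False
            for y in V_plus:
                J = np.nonzero(z > y)[0]         # J(z,y) = {j : z_j > y_j}
                if len(J) == 1 and J[0] == i:
                    improper = True              # rule: J(z,y) = {i}
                    break
            if not improper:
                zi = z.copy(); zi[i] = x[i]      # z^i = z + (x_i - z_i)e^i
                new.append(zi)
    return kept + new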

11.2.3 Polyblock Algorithm

By shifting the origin if necessary we may always assume that

x ≥ a + ρe  ∀x ∈ H,  (11.15)

where ρ > 0 is chosen not too small compared to ‖b − a‖ (e.g., ρ = ¼‖b − a‖). Furthermore, since the problem is obviously infeasible if b ∉ H, whereas b is an obvious optimal solution if b ∈ G ∩ H, we may also assume that

b ∈ H \ G, i.e., h(b) ≥ 0, g(b) > 0.  (11.16)

Algorithm POA (Polyblock Outer Approximation Algorithm, or briefly, Polyblock Algorithm)
Initialization. Let P₁ be an initial polyblock containing G ∩ H and let V₁ be its proper vertex set; for example, P₁ = [a,b], V₁ = {b}. Let x̄¹ be the best feasible solution available and CBV = f(x̄¹) (if no feasible solution is available, set CBV = −∞). Set k = 1.
Step 1. For γ = CBV let P̄_k = red_γ P_k and V̄_k = pvert P̄_k. Reset b ← ∨{x | x ∈ V̄_k}, where x = ∨{a | a ∈ A} means x_i = max_{a∈A} a_i ∀i = 1,...,n.
Step 2. If V̄_k = ∅, terminate: if CBV = −∞ the problem is infeasible; if CBV > −∞ the current best solution x̄^k is an optimal solution.
Step 3. If V̄_k ≠ ∅, select z^k ∈ argmax{f(x) | x ∈ V̄_k}. If g(z^k) ≤ 0, terminate: z^k is an optimal solution. Otherwise, go to Step 4.

Step 4. Compute xk D G .zk /; the last point of G in the line segment from a to xk :
Determine the new current best feasible xkC1 and the new current best
value CBV: Compute the proper vertex set VkC1 of PkC1 D Pk n .xk ; b
according to Proposition 11.15.
Step 5. Increment k and return to Step 1.
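Assembling the pieces, the following Python sketch implements a bare-bones version of Algorithm POA (without the box-reduction step; vertices are pruned by the incumbent value instead). It reuses new_vertex_set from above; all names, tolerances, and the bisection-based π_G are illustrative assumptions of this sketch.

import numpy as np

def pi_G(g, a, z, iters=40):
    """Last point of G = {x : g(x) <= 0} on the segment from a to z,
    by bisection (assumes g(a) <= 0 < g(z))."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5*(lo + hi)
        if g(a + mid*(z - a)) <= 0:
            lo = mid
        else:
            hi = mid
    return a + lo*(z - a)

def polyblock_max(f, g, h, a, b, tol=1e-6, max_iter=1000):
    """Basic polyblock outer approximation for
    max{f(x) : g(x) <= 0 <= h(x), x in [a,b]}, f, g, h increasing;
    assumes a < z componentwise for the vertices encountered."""
    V = [b.astype(float)]
    best_x, best_val = None, -np.inf
    for _ in range(max_iter):
        V = [z for z in V if h(z) >= 0]        # keep T_k' = T_k with H
        if not V:
            return best_x, best_val            # infeasible or solved
        zk = max(V, key=f)                     # vertex maximizing f
        if g(zk) <= 0:
            return zk, f(zk)                   # zk feasible, hence optimal
        xk = pi_G(g, a, zk)                    # last point of G on [a, zk]
        if h(xk) >= 0 and f(xk) > best_val:    # xk feasible: new incumbent
            best_x, best_val = xk, f(xk)
        if f(zk) - best_val < tol:             # upper and lower bounds met
            return best_x, best_val
        V = new_vertex_set(V, xk)              # P_{k+1} = P_k \ (xk, b]
    return best_x, best_val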
Proposition 11.16 When infinite, Algorithm POA generates an infinite sequence {x^k} every cluster point of which is a global optimal solution of (MO).
Proof First note that by replacing P_k with P̄_k = red_γ P_k in Step 1, all z ∈ V_k such that f(z) < γ are deleted, so that at any iteration k all feasible solutions z with f(z) ≥ CBV are contained in the polyblock P̄_k. This justifies the conclusions when the algorithm terminates at Step 2 or Step 3. Therefore it suffices to consider the case when the algorithm is infinite.
Condition (11.15) implies that

min_i (z_i − a_i) ≥ ρ ≥ (ρ/‖b − a‖)‖z − a‖ = δ‖z − a‖  ∀z ∈ H,  (11.17)

where δ = ρ/‖b − a‖. We contend that z^k − x^k → 0 as k → +∞. Suppose the contrary, that there exist ε > 0 and an infinite sequence {k_l} such that ‖z^{k_l} − x^{k_l}‖ ≥ ε > 0 ∀l. For μ > l we have z^{k_μ} ∉ (x^{k_l}, b] because P_{k_μ} ⊂ P_{k_l} \ (x^{k_l}, b]; hence there is an index i with z^{k_μ}_i ≤ x^{k_l}_i, so ‖z^{k_μ} − z^{k_l}‖ ≥ min_{i=1,...,n} |z^{k_l}_i − x^{k_l}_i|. On the other hand, min_{i=1,...,n}(z^{k_l}_i − a_i) ≥ δ‖z^{k_l} − a‖ by (11.17) because z^{k_l} ∈ H, while x^{k_l} lies on the line segment joining a to z^{k_l}, so

z^{k_l}_i − x^{k_l}_i = ((z^{k_l}_i − a_i)/‖z^{k_l} − a‖)·‖z^{k_l} − x^{k_l}‖ ≥ δ‖z^{k_l} − x^{k_l}‖  ∀i,

i.e., min_i |z^{k_l}_i − x^{k_l}_i| ≥ δ‖z^{k_l} − x^{k_l}‖ ≥ δε. Therefore ‖z^{k_μ} − z^{k_l}‖ ≥ δε for all μ > l, conflicting with the boundedness of the sequence {z^{k_l}} ⊂ [a,b].
Thus z^k − x^k → 0 and by boundedness we may assume that, up to subsequences, x^k → x̄, z^k → x̄. Then, since z^k ∈ H, x^k ∈ G ∀k, it follows that x̄ ∈ G ∩ H, i.e., x̄ is feasible. Furthermore, f(z^k) ≥ f(z) ∀z ∈ P_k ∩ H ⊃ G ∩ H; hence, by letting k → +∞, f(x̄) ≥ f(x) ∀x ∈ G ∩ H, i.e., x̄ is a global optimal solution. ⊓⊔
Remark 11.3 In practice we must stop the algorithm at some iteration k. The question arises as to when to stop, i.e., how large k should be, to obtain a z^k sufficiently close to an optimal solution.
If the problem (MO) does not involve conormal constraints, i.e., has the form

(MO1)  max{f(x) | g(x) ≤ 0, x ∈ [a,b]},

then x^k is always feasible, so when f(z^k) − f(x^k) < ε we have f(x^k) + ε > f(z^k) ≥ max{f(x) | g(x) ≤ 0, x ∈ [a,b]}, i.e., x^k is an ε-optimal solution of the problem. In other words, the Polyblock Algorithm applied to problem (MO1) provides an ε-optimal solution in finitely many steps.

In the general case, when both normal and conormal constraints are present, x^k may not be feasible, and the Polyblock Algorithm may not provide an ε-optimal solution in finitely many steps. The most that can be said is that, since g(z^k) − g(x^k) → 0 with g(x^k) ≤ 0, while h(z^k) ≥ 0 ∀k and f(z^k) → max(MO) as k → +∞, we will have, for k sufficiently large, g(z^k) ≤ ε, h(z^k) ≥ 0, f(z^k) ≥ max(MO) − ε.
A point x̄ ∈ [a,b] satisfying

g(x̄) − ε ≤ 0 ≤ h(x̄),  f(x̄) ≥ max(MO) − ε

is called an ε-approximate optimal solution of (MO). Thus, given any tolerance ε > 0, Algorithm POA can only provide an ε-approximate optimal solution z^k in finitely many steps.

11.3 Successive Incumbent Transcending Algorithm

Very often in practice we are not so much interested in solving a problem to optimality as in obtaining quickly a feasible solution better than a given one. In that case, the Polyblock Algorithm is a good choice only if the problem does not involve conormal constraints, i.e., if its feasible set is a normal set. But if the problem involves both normal and conormal constraints, i.e., if the feasible set is the intersection of a normal and a conormal set, then the Polyblock Algorithm does not serve our purpose. As we saw at the end of the previous section, for such a problem the Polyblock Algorithm is incapable of computing a feasible solution in finitely many iterations. More importantly, when the problem happens to have an isolated feasible or almost feasible solution with an objective function value significantly better than the optimal value (as, e.g., the point x̂ in the problem depicted in Fig. 11.3), the ε-approximate solution produced by the algorithm may be quite far off the optimal solution and so can hardly be accepted as an adequate approximation of it.
The situation is very much like what happens for dc optimization problems with a nonconvex feasible set. To overcome, or at least alleviate, the above difficulties a robust approach to (MO) can be developed which is similar to the robust approach for dc optimization (Chap. 7, Sect. 7.5), with the two following basic features:
1. Instead of an (ε,η)-approximate optimal solution, an essential η-optimal solution is computed which is always feasible and provides an adequate approximation of the optimal solution;
2. The algorithm generates a sequence of better and better nonisolated feasible solutions converging to a nonisolated feasible solution which is the best among all nonisolated feasible solutions. In that way, for any prescribed tolerance η > 0 an essential η-optimal solution is obtained in finitely many iterations.

Fig. 11.3 Inadequate approximate optimal solution

11.3.1 Essential Optimal Solution

When the problem has an isolated optimal solution, this optimal solution is generally difficult to compute, and difficult to implement even when computable.
To get around the difficulty a common practice is to assume that the feasible set D of the problem under consideration is robust, i.e., that it satisfies

D = cl(int D),  (11.18)

where cl and int denote the closure and the interior, respectively. Condition (11.18) rules out the existence of isolated feasible solutions and ensures that for ε > 0 sufficiently small an ε-approximate optimal solution will be close enough to the exact optimum. However, even in this case the trouble is that in practice we often do not know precisely what "sufficiently small ε > 0" means. Moreover, condition (11.18) is generally very hard to check. Quite often we have to deal with feasible sets which are not known a priori to contain isolated points or not.
Therefore, from a practical point of view a method is desirable which could help to discard isolated feasible solutions from consideration without having to check their presence beforehand. Such a method should employ a more adequate concept of approximate optimal solution than the commonly used concept of ε-approximate optimal solution.
A point x ∈ Rⁿ is called an essential feasible solution of (MO) if it satisfies

x ∈ [a,b],  g(x) < 0 ≤ h(x).

Given a tolerance η > 0, an essential feasible solution x̄ is said to be an essential η-optimal solution of (MO) if f(x) ≤ f(x̄) + η for all essential feasible solutions x of (MO).

Given a tolerance ε > 0, a point x ∈ Rⁿ is said to be an ε-essential feasible solution if it satisfies

x ∈ [a,b],  g(x) + ε ≤ 0 ≤ h(x).

Finally, given two tolerances ε > 0, η > 0, an ε-essential feasible solution x̄ is said to be (ε,η)-optimal if it satisfies f(x) ≤ f(x̄) + η for all ε-essential feasible solutions x.
Since g, h are continuous increasing functions, it is readily seen that if x is an essential feasible point then for all t > 0 sufficiently small x′ = x + t(b − x) is still an essential feasible point. Therefore, an essential feasible solution is always a nonisolated feasible solution.
The above discussion suggests that instead of trying to find an optimal solution of (MO), it would be more practical and reasonable to look for an essential η-optimal solution. The robust approach to problem (MO) to be presented below embodies this point of view.

11.3.2 Successive Incumbent Transcending Strategy

A key step towards finding a global optimal solution of problem (MO) is to deal with the following incumbent transcending subproblem:
(SP_γ) Given a real number γ, check whether (MO) has a nonisolated feasible solution x satisfying f(x) > γ, or else establish that no such nonisolated feasible solution exists.
Since f, g are increasing, if f(b) ≤ γ then f(x) ≤ γ ∀x ∈ [a,b], so there is no feasible solution with objective function value exceeding γ; if g(a) > 0, then g(x) > 0 ∀x ∈ [a,b], so the problem is infeasible.
If f(b) > γ and g(b) < 0 then b is an essential feasible solution with objective function value exceeding γ.
Therefore, barring these trivial cases, we can assume that

f(b) > γ,  g(a) ≤ 0 ≤ g(b).  (11.19)

To seek an answer to (SP_γ) consider the master problem:

min{g(x) | h(x) ≥ 0, f(x) ≥ γ, x ∈ [a,b]}.

Setting k_γ(x) := min{h(x), f(x) − γ} and noting that k_γ(x) is an increasing function, we write the master problem as
(Q_γ)  min{g(x) | k_γ(x) ≥ 0, x ∈ [a,b]}.
For our purpose an important feature of this problem is that it can be solved efficiently, while solving it furnishes an answer to (SP_γ), as will shortly be seen.

Denote by min(Q_γ) the optimal value of (Q_γ). The robust approach to (MO) is founded on the following relationship between problems (MO) and (Q_γ):
Proposition 11.17 Assume (11.19).
(i) If min(Q_γ) < 0 there exists an essential feasible solution x⁰ of (MO) satisfying f(x⁰) ≥ γ.
(ii) If min(Q_γ) ≥ 0, then there is no essential feasible solution x of (MO) satisfying f(x) ≥ γ.
Proof (i) If min(Q_γ) < 0 there exists an x⁰ ∈ [a,b] satisfying g(x⁰) < 0 ≤ h(x⁰), f(x⁰) ≥ γ. This point x⁰ is an essential feasible solution of (MO) satisfying f(x⁰) ≥ γ.
(ii) If min(Q_γ) ≥ 0 then any x ∈ [a,b] such that g(x) < 0 is infeasible to (Q_γ), i.e., satisfies k_γ(x) < 0. Since an essential feasible solution has g(x) < 0 ≤ h(x), it must therefore satisfy f(x) < γ. ⊓⊔
On the basis of this proposition, to solve the problem (MO) one can proceed according to the following conceptual SIT (Successive Incumbent Transcending) scheme:
Select a tolerance η > 0. Start from γ = γ₀ with γ₀ = f(a).
Solve the master problem (Q_γ).
If min(Q_γ) < 0, an essential feasible x̄ with f(x̄) ≥ γ is found. Reset γ ← f(x̄) + η and repeat.
Otherwise, min(Q_γ) ≥ 0: no essential feasible solution x of (MO) exists such that f(x) ≥ γ. Hence, if γ = f(x̄) + η for some essential feasible solution x̄ of (MO), then x̄ is an essential η-optimal solution; if γ = γ₀, the problem (MO) is essentially infeasible (has no essential feasible solution).
Below we indicate how this conceptual scheme can be practically implemented.

a. Solving the Master Problem

Recall that the master problem (Q_γ) is

min{g(x) | x ∈ [a,b], k_γ(x) ≥ 0},  (11.20)

where k_γ(x) := min{h(x), f(x) − γ} is an increasing function. Since the feasible set of (Q_γ) is robust (has no isolated point) and easy to handle (a feasible solution can be computed at very cheap cost), this problem can be solved efficiently by a rectangular RBB (reduce-bound-and-branch) algorithm characterized by three basic operations (see Chap. 6, Sect. 6.2):
Reducing: using valid cuts, reduce each partition set M = [p,q] ⊂ [a,b] without losing any feasible solution currently still of interest; the resulting box [p′,q′] is referred to as a valid reduction of M.

Bounding: for each partition set M = [p,q] compute a lower bound β(M) for g(x) over the feasible solutions of (Q_γ) contained in the valid reduction of M.
Branching: use an adaptive subdivision rule to further subdivide a selected box of the current partitioning of the original box [a,b].

b. Valid Reduction

At a given stage of the RBB algorithm for (Q_γ), let [p,q] be a box generated during the partitioning process and still of interest. The search for a nonisolated feasible solution x of (MO) in [p,q] such that f(x) ≥ γ can then be restricted to the set

{x ∈ [p,q] | g(x) ≤ 0 ≤ k_γ(x)}.

A γ-valid reduction of [p,q], written red_γ[p,q], is a box [p′,q′] ⊂ [p,q] that still contains all of the just mentioned set.
For any box [p,q] ⊂ Rⁿ define q^i := p + (q_i − p_i)e^i, i = 1,...,n. As in Sect. 11.2.1, for any increasing function φ(t): [0,1] → R denote by N(φ) the largest value of t ∈ [0,1] such that φ(t) = 0. By Proposition 11.13 we can state:
A γ-valid reduction of a box [p,q] is given by the formula

red_γ[p,q] = ∅ if g(p) > 0 or k_γ(q) < 0;
red_γ[p,q] = [p′,q′] if g(p) ≤ 0 and k_γ(q) ≥ 0,

where

q′ = p + Σ_{i=1}^n α_i(q_i − p_i)e^i,  p′ = q′ − Σ_{i=1}^n β_i(q′_i − p_i)e^i,  (11.21)

with

α_i = 1 if g(q^i) ≤ 0, α_i = N(φ_i) if g(q^i) > 0, where φ_i(t) = g(p + t(q_i − p_i)e^i);
β_i = 1 if k_γ(q′^i) ≥ 0, β_i = N(ψ_i) if k_γ(q′^i) < 0, where ψ_i(t) = −k_γ(q′ − t(q′_i − p_i)e^i) and q′^i := q′ − (q′_i − p_i)e^i.

In the context of the RBB algorithm for (Q_γ), a smaller red_γ[p,q] allows a tighter bound to be obtained for g(x) over [p,q] but requires more computational effort. Therefore, as already mentioned in Remark 11.2, in practical implementations of the above formulas for α_i, β_i it is enough to compute approximate values of N(φ_i) or N(ψ_i) by using, e.g., just a few iterations of a Bolzano bisection procedure.

c. Bounding

Let M = [p,q] be the valid reduction of a box in the current partition. Since g(x) is an increasing function, an obvious lower bound for g(x) over the box [p,q] is the number

β(M) = g(p).  (11.22)

Simple as it is, this bound combined with an adaptive branching rule suffices to ensure convergence of the algorithm.
However, to enhance efficiency better bounds are often needed.
For example, applying two or more iterations of the Polyblock procedure for computing min{g(x) | x ∈ [p,q], k_γ(x) ≥ 0} may yield a reasonably good lower bound.
Alternatively, based on the observation that min{g(x) + t·k_γ⁻(x) | x ∈ [p,q]} ≤ min{g(x) | x ∈ [p,q], k_γ(x) ≥ 0} for all t ∈ R₊, where k_γ⁻(x) := min{0, k_γ(x)}, one may take β(M) = g(p) + t·k_γ⁻(p) for some t ≫ 1 (note that g + t·k_γ⁻ is increasing, so its minimum over [p,q] is attained at p). As shown in (Tuy 2010), if either of the functions f(x), g(x) + t·k_γ⁻(x) is coercive, i.e., tends to +∞ as ‖x‖ → +∞, then this bound approaches the exact minimum of g(x) over the feasible portion of [p,q] as t → +∞.
Also one may try to exploit any partial convexity present in the problem. For instance, if a convex function u(x) and a concave function v(x) are available such that u(x) ≤ g(x) and v(x) ≥ k_γ(x) ∀x ∈ [p,q], then a lower bound is given by the optimal value of the convex minimization problem

min{u(x) | v(x) ≥ 0, x ∈ [p,q]},

which can be solved by many currently available algorithms.
Sometimes it may be useful to write g(x) as a sum of several functions such that a lower bound can be easily computed for each of these functions: β(M) can then be taken to be the sum of the separate lower bounds. Various methods and examples of computing good bounds are considered in (Tuy, Al-Khayyal, and Thach 2005).
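For instance, the penalized bound just described can be evaluated in closed form at the lower corner of the box; a one-line sketch (the function name and the value of t are illustrative, and the bound formula reflects the reconstruction above):

def lower_bound(g, k_gamma, p, t=100.0):
    """beta(M) = g(p) + t*min(0, k_gamma(p)): a lower bound for
    min{g(x) : x in [p,q], k_gamma(x) >= 0}, since g + t*min(0, k_gamma)
    is increasing and minorizes g on the feasible part of the box."""
    return g(p) + t * min(0.0, k_gamma(p))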

d. Branching

Branching is performed according to an adaptive rule.
Specifically, let M_k = [p^k,q^k] be the valid reduction of the partition set chosen for further subdivision at iteration k, and let z^k ∈ M_k be a point such that g(z^k) provides a lower bound for min{g(x) | k_γ(x) ≥ 0, x ∈ M_k} (for instance, z^k = p^k). Obviously, if k_γ(z^k) ≥ 0 then g(z^k) is the exact minimum of the latter problem, and so if M_k is the partition set with smallest lower bound among all partition sets M currently still of interest then z^k is a global optimal solution of (Q_γ). Based on this observation one can use an adaptive branching method to eventually drive z^k to the feasible set of (Q_γ). This method proceeds as follows. Let y^k be the point where the line segment [z^k,q^k] meets the surface k_γ(x) = 0, i.e., y^k = z^k + t_k(q^k − z^k) with t_k = N(ψ) for ψ(t) = k_γ(z^k + t(q^k − z^k)), 0 ≤ t ≤ 1. Let w^k be the midpoint of the segment

[z^k,y^k], w^k = (z^k + y^k)/2, and let j_k ∈ argmax{|z^k_i − y^k_i| | i = 1,...,n}. Then divide the box M_k = [p^k,q^k] into two smaller boxes M_k⁻, M_k⁺ via the hyperplane x_{j_k} = w^k_{j_k}:

M_k⁻ = [p^k, q^{k−}], M_k⁺ = [p^{k+}, q^k], with
q^{k−} = q^k − (q^k_{j_k} − w^k_{j_k})e^{j_k},  p^{k+} = p^k + (w^k_{j_k} − p^k_{j_k})e^{j_k}.

In other words, M_k is subdivided according to the adaptive subdivision rule via (z^k,y^k).
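A sketch of this subdivision step (illustrative names; largest_root is the bisection helper from Sect. 11.2):

import numpy as np

def adaptive_subdivide(p, q, z, k_gamma):
    """Split [p,q] by the adaptive rule via (z, y), where y is the point
    where the segment [z,q] meets the surface k_gamma = 0
    (assumes k_gamma(z) < 0 <= k_gamma(q))."""
    t = largest_root(lambda s: k_gamma(z + s*(q - z)))
    y = z + t*(q - z)
    w = 0.5*(z + y)                          # midpoint of [z,y]
    j = int(np.argmax(np.abs(z - y)))        # longest edge of [z,y]
    q_minus = q.copy(); q_minus[j] = w[j]    # M- = [p, q^{k-}]
    p_plus = p.copy(); p_plus[j] = w[j]      # M+ = [p^{k+}, q]
    return (p, q_minus), (p_plus, q), y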
Proposition 11.18 Whenever infinite, the above adaptive RBB algorithm for (Q_γ) generates a sequence of feasible solutions y^k at least one cluster point ȳ of which is an optimal solution of (Q_γ).
Proof Since the algorithm uses the adaptive subdivision rule via (z^k,y^k), by Theorem 6.4 there exists an infinite sequence K ⊂ {1,2,...} such that ‖z^k − y^k‖ → 0 as k ∈ K, k → +∞. From the facts

g(z^k) ≤ min{g(x) | x ∈ [a,b], k_γ(x) ≥ 0},  k_γ(y^k) = 0,

we deduce that, up to a subsequence, z^k → ȳ, y^k → ȳ with g(ȳ) = min{g(x) | x ∈ [a,b], k_γ(x) ≥ 0}. ⊓⊔
Incorporating the above RBB procedure for solving the master problem into the SIT (Successive Incumbent Transcending) scheme yields the following robust algorithm for solving (MO):

SIT Algorithm for (MO)

Let γ₀ = f(a). Select tolerances ε > 0, η > 0.
Initialization. Start with 𝒫₁ = {M₁}, M₁ = [a,b], ℛ₁ = ∅. If an essential feasible solution x̄ of (MO) is available, let γ = f(x̄) + η; otherwise, let γ = γ₀. Set k = 1.
Step 1. For each box M ∈ 𝒫_k:
Compute its valid reduction red_γ M;
Delete M if red_γ M = ∅; otherwise replace M by red_γ M;
Letting red_γ M = [p,q], compute a lower bound β(M) for g(x) over the set {x ∈ [p,q] | k_γ(x) ≥ 0} such that β(M) = g(z) for some z ∈ [p,q];
Delete M if β(M) > −ε.
Step 2. Let 𝒫′_k be the collection of boxes that results from 𝒫_k after completion of Step 1, and let ℛ′_k = ℛ_k ∪ 𝒫′_k.
Step 3. If ℛ′_k = ∅, terminate: if γ = γ₀ the problem (MO) is ε-essentially infeasible; if γ = f(x̄) + η then x̄ is an essential (ε,η)-optimal solution of (MO).
Step 4. If ℛ′_k ≠ ∅, let [p^k,q^k] = M_k ∈ argmin{β(M) | M ∈ ℛ′_k} and let z^k ∈ M_k be the point satisfying g(z^k) = β(M_k). If k_γ(z^k) ≥ 0, let y^k = z^k and go to Step 5. If k_γ(z^k) < 0, let y^k be the point where the line segment joining z^k

to q^k meets the surface k_γ(x) = 0. If g(y^k) < −ε, go to Step 5. If g(y^k) ≥ −ε, go to Step 6.
Step 5. y^k is an essential feasible solution of (MO) with f(y^k) ≥ γ. Reset x̄ ← y^k and return to Initialization (where γ is reset to f(x̄) + η).
Step 6. Divide M_k into two subboxes according to the adaptive branching rule via (z^k,y^k). Let 𝒫_{k+1} be the collection of these two subboxes of M_k, and let ℛ_{k+1} = ℛ′_k \ {M_k}. Increment k and return to Step 1.
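The following condensed Python sketch shows how the pieces fit together in the SIT loop; it reuses adaptive_subdivide from above, omits the box reduction for brevity, and prunes with the simple bound β(M) = g(p) of (11.22). All names and default tolerances are illustrative assumptions of this sketch.

import numpy as np

def sit_maximize(f, g, h, a, b, eps=1e-4, eta=1e-4, max_iter=10000):
    """SIT sketch for max{f(x) : g(x) <= 0 <= h(x), x in [a,b]},
    f, g, h increasing. Returns an essential (eps,eta)-optimal point
    or reports essential infeasibility."""
    x_bar = None
    gamma = f(a)                                    # gamma_0
    while True:
        k_gamma = lambda x, c=gamma: min(h(x), f(x) - c)
        boxes = [(np.asarray(a, float), np.asarray(b, float))]
        found = None
        for _ in range(max_iter):
            # prune boxes that cannot contain an eps-essential point
            boxes = [(p, q) for (p, q) in boxes
                     if g(p) <= -eps and k_gamma(q) >= 0]
            if not boxes:
                break                               # gamma not transcendable
            i = min(range(len(boxes)), key=lambda i: g(boxes[i][0]))
            p, q = boxes.pop(i)                     # box with smallest bound
            if k_gamma(p) >= 0:
                found = p                           # z^k itself is feasible
                break
            left, right, y = adaptive_subdivide(p, q, p, k_gamma)
            if g(y) < -eps:
                found = y                           # essential feasible, f >= gamma
                break
            boxes += [left, right]
        if found is None:
            if x_bar is None:
                return None, "essentially infeasible"
            return x_bar, "essential (eps, eta)-optimal"
        x_bar = found
        gamma = f(x_bar) + eta                      # transcend the incumbent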
Proposition 11.19 The above algorithm terminates after finitely many steps, yielding either an essential (ε,η)-optimal solution of (MO), or an evidence that the problem is essentially infeasible.
Proof Since Step 1 deletes every box M with β(M) > −ε, the fact ℛ′_k = ∅ in Step 3 implies that g(x) + ε > 0 for every x ∈ [a,b] with k_γ(x) ≥ 0; hence there is no ε-essential feasible solution x of (MO) with f(x) ≥ γ, justifying the conclusion in Step 3. It remains to show that Step 3 occurs for sufficiently large k. Suppose the contrary, that the algorithm is infinite. Since each occurrence of Step 5 increases the current value of f(x̄) by at least η > 0 while f(x) is bounded above on the feasible set, Step 5 cannot occur infinitely often. Therefore, for all k sufficiently large, Step 6 occurs, i.e., g(y^k) ≥ −ε. By Proposition 11.18 there exists a subsequence {k_s} such that ‖z^{k_s} − y^{k_s}‖ → 0 as s → +∞. Also, by compactness, one can assume that y^{k_s} → ȳ, z^{k_s} → ȳ. Then g(ȳ) ≥ −ε, while g(z^{k_s}) = β(M_{k_s}) ≤ −ε ∀s, hence g(ȳ) ≤ −ε, a contradiction. ⊓⊔
Remark 11.4 So far we have assumed that the problem involves only inequality constraints. In case there are also equality constraints, e.g.,

c_j(x) = 0, j = 1,...,s,  (11.23)

observe that the linear ones among them can be used to eliminate certain variables, so we can always assume that all the constraints (11.23) are nonlinear. Since, however, in the most general case one cannot expect to compute a solution to a system of nonlinear equations in finitely many steps, one should be content with replacing the system (11.23) by the approximate system

−δ ≤ c_j(x) ≤ δ, j = 1,...,s,

where δ > 0 is the tolerance. The method presented in the previous sections can then be applied to the resulting approximate problem. An essential ε-optimal solution of the above defined approximate problem is called an essential (δ,ε)-optimal solution of (MO).

e. Application to Quadratic Programming

Consider the quadratically constrained quadratic optimization problem



(QQP)  min{f₀(x) | f_k(x) ≤ 0, k = 1,...,m, x ∈ C},

where C ⊂ Rⁿ is a polyhedron and each f_k(x), k = 0,1,...,m, is a quadratic function. Noting that every quadratic polynomial can be written as a difference of two quadratic polynomials with positive coefficients, we can rewrite (QQP) as a monotonic optimization problem of the form

(P)  min{f(x) | g(x) ≤ 0 ≤ h(x), x ∈ [a,b]},

where f(x), g(x), h(x) are quadratic functions with positive coefficients (hence increasing functions on R^n_+). The SIT method applied to the monotonic optimization problem (P) then yields a robust algorithm for solving (QQP). For details of implementation, we refer the reader to the article (Tuy and Hoai-Phuong 2007).
Sometimes it may be more convenient to convert (QQP) to the form

(Q)  min{f(x) | g(x) ≥ 0, x ∈ [a,b]},

where g(x) = min_{k=1,...,m}{v_k(x) − u_k(x)} and f, u_k, v_k are quadratic functions with positive coefficients. Then the SIT Algorithm can be applied, using the master problem

(Q_γ)  max{g(x) | f(x) ≤ γ, x ∈ [a,b]}.
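The positive/negative coefficient split invoked above is mechanical; a sketch for a quadratic f(x) = xᵀQx + cᵀx on the positive orthant (the helper name and data layout are illustrative):

import numpy as np

def dm_split_quadratic(Q, c):
    """Split f(x) = x^T Q x + c^T x, x >= 0, into f = f1 - f2 where f1, f2
    have nonnegative coefficients, hence are increasing on R^n_+."""
    Qp, Qm = np.maximum(Q, 0), np.maximum(-Q, 0)   # Q = Qp - Qm elementwise
    cp, cm = np.maximum(c, 0), np.maximum(-c, 0)
    f1 = lambda x: x @ Qp @ x + cp @ x
    f2 = lambda x: x @ Qm @ x + cm @ x
    return f1, f2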

11.4 Discrete Monotonic Optimization

Consider the discrete monotonic optimization problem

max{f(x) | g(x) ≤ 0 ≤ h(x), (x₁,...,x_s) ∈ S, x ∈ [a,b]},

where f, g, h: R^n_+ → R are increasing functions and S is a given finite subset of R^s_+, s ≤ n. Setting

G = {x | g(x) ≤ 0},  H = {x | h(x) ≥ 0},
S* = {x ∈ [a,b] | (x₁,...,x_s) ∈ S},

we can write the problem as

(DMO)  max{f(x) | x ∈ [a,b], x ∈ (G ∩ S*) ∩ H}.



Proposition 11.20 Let G̃ denote the normal hull of the set G ∩ S*. Then problem (DMO) is equivalent to

max{f(x) | x ∈ G̃ ∩ H}.  (11.24)

Proof Since G ∩ S* ⊂ G̃, we need only show that for each point x ∈ G̃ there exists z ∈ G ∩ S* satisfying f(z) ≥ f(x). By Proposition 11.4, G̃ = ∪_{z∈G∩S*} [a,z], so x ∈ [a,z] for some z ∈ G ∩ S*. Since x ≤ z and f(x) is increasing, we have f(x) ≤ f(z). ⊓⊔
As a consequence, solving problem (DMO) reduces to solving (11.24), which is a monotonic problem without explicit discrete constraint. The difficulty now is how to handle the polyblock G̃, which is defined only implicitly as the normal hull of G ∩ S*.
Given a box [p,q] ⊂ [a,b] and a point x ∈ [p,q], we define the lower S-adjustment of x to be the point

⌊x⌋_S = x̃, with x̃_i = max{y_i | y ∈ S* ∪ {p}, y_i ≤ x_i}, i = 1,...,n,  (11.25)

and the upper S-adjustment of x to be the point

⌈x⌉_S = x̂, with x̂_i = min{y_i | y ∈ S* ∪ {q}, y_i ≥ x_i}, i = 1,...,n.  (11.26)

In the frequently encountered special case when S = S₁ × ⋯ × S_s and every S_i is a finite set of real numbers we have

x̃_i = max{α | α ∈ S_i ∪ {p_i}, α ≤ x_i}, i = 1,...,n,
x̂_i = min{α | α ∈ S_i ∪ {q_i}, α ≥ x_i}, i = 1,...,n.

For example, if each S_i is the set of integers and p_i, q_i ∈ S_i, then x̃_i is the largest integer no larger than x_i, while x̂_i is the smallest integer no less than x_i.
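In the separable case the adjustments (11.25)–(11.26) are computed coordinatewise; a sketch, assuming each S_i is supplied as a sorted array already containing p_i (resp. q_i) as its extreme element (names illustrative):

import bisect
import numpy as np

def lower_adjust(x, S):
    """Componentwise lower S-adjustment: largest value of S_i that is <= x_i.
    Assumes min(S_i) <= x_i, e.g., p_i was appended to S_i."""
    xt = np.array(x, float)
    for i, Si in enumerate(S):
        xt[i] = Si[bisect.bisect_right(Si, x[i]) - 1]
    return xt

def upper_adjust(x, S):
    """Componentwise upper S-adjustment: smallest value of S_i that is >= x_i.
    Assumes x_i <= max(S_i), e.g., q_i was appended to S_i."""
    xh = np.array(x, float)
    for i, Si in enumerate(S):
        xh[i] = Si[bisect.bisect_left(Si, x[i])]
    return xh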
The box obtained from red_γ[p,q] upon the above S-adjustment is referred to as the S-adjusted γ-valid reduction of [p,q] and denoted redS_γ[p,q]. However, rather than first computing red_γ[p,q] and then S-adjusting it, one should compute redS_γ[p,q] directly, using the formulas redS_γ[p,q] = [p̃,q̃] with

q̃_i = max{x_i | x ∈ red_γ[p,q] ∩ (G ∩ S*)}, i = 1,...,n,
p̃_i = min{x_i | x ∈ red_γ[p,q̃] ∩ (H ∩ S*)}, i = 1,...,n.

With the above modifications to accommodate the discrete constraint, the Polyblock Algorithm and the SIT Algorithm for (MO) can be extended to solve (DMO). For every box [p,q] let redS_γ[p,q] denote the S-adjusted γ-valid reduction of [p,q],

i.e., the box obtained from [p′,q′] := red_γ[p,q] by replacing p′ with ⌈p′⌉_S and q′ with ⌊q′⌋_S.
Discrete Polyblock Algorithm
Initialization. Take an initial polyblock P₀ ⊃ G ∩ H, with proper vertex set T₀. Let x̂⁰ be the best feasible solution available (the current best feasible solution), γ₀ = f(x̂⁰), S₀ = {x ∈ S* | f(x) > γ₀}. If no feasible solution is available, let γ₀ = −∞. Set k := 0.
Step 1. Delete every v ∈ T_k such that redS_{γ_k}[a,v] = ∅. For every v ∈ T_k such that redS_{γ_k}[a,v] ≠ ∅, denote the highest vertex of the latter box again by v, and if f(v) ≤ γ_k then drop v. Let T̃_k be the set that results from T_k after performing this operation for every v ∈ T_k (so T̃_k ⊂ H). Reset T_k := T̃_k.
Step 2. If T_k = ∅, terminate: if γ_k = −∞ the problem is infeasible; if γ_k = f(x̂^k), then x̂^k is an optimal solution.
Step 3. If T_k ≠ ∅, select v^k ∈ argmax{f(v) | v ∈ T_k}. If v^k ∈ G ∩ S*, terminate: v^k is an optimal solution.
Step 4. If v^k ∈ G \ S*, compute ṽ^k = ⌊v^k⌋_{S_k} (using formula (11.25) for S = S_k). If v^k ∉ G, compute x^k = π_G(v^k) and define ṽ^k = x^k if x^k ∈ S_k, ṽ^k = ⌊x^k⌋_{S_k} if x^k ∉ S_k.
Step 5. Let T_{k,*} = {z ∈ T_k | z > ṽ^k}. Compute

T′_k = (T_k \ T_{k,*}) ∪ {z^{k,i} = z + (ṽ^k_i − z_i)e^i | z ∈ T_{k,*}, i = 1,...,n}.

Let T_{k+1} be the set obtained from T′_k by removing every z^{k,i} such that {j | z_j > y_j} = {i} for some z ∈ T_{k,*} and y ∈ T_k.
Step 6. Determine the new current best feasible solution x̂^{k+1}. Set γ_{k+1} = f(x̂^{k+1}), S_{k+1} = {x ∈ S* | f(x) > γ_{k+1}}. Increment k and return to Step 1.
Proposition 11.21 If s = n the DPA (Discrete Polyblock Algorithm) is finite. If s < n and there exists a constant ρ > 0 such that

min_{i=s+1,...,n} (x_i − a_i) ≥ ρ  ∀x ∈ H,  (11.27)

then either the DPA is finite or it generates an infinite sequence {v^k} converging to an optimal solution.
Proof At each iteration k a pair v^k, ṽ^k is generated such that ṽ^k belongs to the normal hull of G ∩ S_k while v^k does not, and the box (ṽ^k,b] contains no point of P_l with l > k and hence no ṽ^l with l > k. Therefore there can be no repetition in the sequence {ṽ⁰, ṽ¹,...,ṽ^k,...} ⊂ S*. This implies finiteness of the algorithm in the case s = n, because then S* = S is finite. In the case s < n, if the sequence {v^k} is infinite, it follows from the fact (ṽ^k_1,...,ṽ^k_s) ∈ S that for some sufficiently large k₀,

ṽ^k_i = ṽ_i, i = 1,...,s, ∀k ≥ k₀.

On the other hand, since v^k ∈ H we have from (11.27) that

min_{i=s+1,...,n} (v^k_i − a_i) ≥ ρ  ∀k.  (11.28)

We show that v^k − x^k → 0 as k → +∞. Suppose the contrary, that there exist ε > 0 and an infinite sequence {k_l} such that ‖v^{k_l} − x^{k_l}‖ ≥ ε > 0 ∀l. For all μ > l we have v^{k_μ} ∉ (ṽ^{k_l}, v^{k_l}] because P_{k_μ} ⊂ P_{k_l} \ (ṽ^{k_l}, b]. Since ṽ^k_i = x^k_i ∀i > s, we then derive

‖v^{k_μ} − v^{k_l}‖ ≥ min_{i=s+1,...,n} |v^{k_l}_i − ṽ^{k_l}_i| = min_{i=s+1,...,n} |v^{k_l}_i − x^{k_l}_i|.

By (11.28), min{v^{k_l}_i − a_i | i = s+1,...,n} ≥ ρ because v^{k_l} ∈ H, while x^{k_l} lies on the line segment joining a to v^{k_l}. Thus

v^{k_l}_i − x^{k_l}_i = ((v^{k_l}_i − a_i)/‖v^{k_l} − a‖)·‖v^{k_l} − x^{k_l}‖ ≥ (ρ/‖b − a‖)‖v^{k_l} − x^{k_l}‖  ∀i > s.

Consequently,

‖v^{k_μ} − v^{k_l}‖ ≥ min_{i=s+1,...,n} |v^{k_l}_i − x^{k_l}_i| ≥ (ρ/‖b − a‖)ε,

conflicting with the boundedness of the sequence {v^{k_l}} ⊂ [a,b]. Therefore ‖v^k − x^k‖ → 0, and by boundedness we may assume that, up to a subsequence, x^k → x̄, v^k → x̄. Then, since v^k ∈ H, x^k ∈ G ∀k, it follows that x̄ ∈ G ∩ H and hence v̄ ∈ G ∩ H ∩ S* for v̄ defined by v̄_i = ṽ_i (i = 1,...,s), v̄_i = x̄_i (i = s+1,...,n). Furthermore, f(v^k) ≥ f(v) ∀v ∈ P_k ⊃ G ∩ H ∩ S_k; hence, letting k → +∞ yields f(x̄) ≥ f(x) ∀x ∈ G ∩ H ∩ S_γ̄, where γ̄ = lim_{k→+∞} γ_k. The latter in turn implies that v̄ ∈ argmax{f(x) | x ∈ G ∩ H ∩ S*}, so v̄ is an optimal solution. ⊓⊔
Discrete SIT Algorithm
Let γ₀ = f(a). Select tolerances ε > 0, η > 0.
Initialization. Start with 𝒫₁ = {M₁}, M₁ = [a,b], ℛ₁ = ∅. If an essential feasible solution x̄ of (DMO) is available, let γ = f(x̄) + η; otherwise, let γ = γ₀. Set k = 1.
Step 1. For each box M ∈ 𝒫_k:
Compute its S-adjusted γ-valid reduction redS_γ M;
Replace M by redS_γ M (in particular, delete it if redS_γ M = ∅);
Letting redS_γ M = [p_M, q_M], take g(p_M) to be β(M), a lower bound for g(x) over the box M;
Delete M if g(p_M) > −ε.
Step 2. Let 𝒫′_k be the collection of boxes that results from 𝒫_k after completion of Step 1, and let ℛ′_k = ℛ_k ∪ 𝒫′_k.
Step 3. If ℛ′_k = ∅, terminate: in the case γ = γ₀ the problem (DMO) is ε-essentially infeasible; otherwise γ = f(x̄) + η and x̄ is an essential (ε,η)-optimal solution of (DMO).
Step 4. If ℛ′_k ≠ ∅, let [p^k,q^k] = M_k ∈ argmin{β(M) | M ∈ ℛ′_k}. If k_γ(p^k) ≥ 0, let y^k = p^k and go to Step 5. If k_γ(p^k) < 0, let y^k be the upper S-adjustment of the point where the line segment joining p^k to q^k meets the surface k_γ(x) = 0. If g(y^k) ≤ −ε, go to Step 5; otherwise go to Step 6.

Step 5. yk is an essential feasible solution of (DMO) with f .yk /  : If D 0


define x D yk ; D f .x/ 
and go back to Step 0. If D f .x/ 
and
f .yk / > f .x/ reset x yk and go to Step 0.
Step 6. Divide Mk into two subboxes according to an adaptive branching via
.pk ; yk /: Let PkC1 be the collection of these two subboxes of Mk : Let
RkC1 D R 0 k n fMk g: Increment k and return to Step 1.
Proposition 11.22 The Discrete SIT Algorithm terminates after finitely many iterations, yielding either an essential (ε,η)-optimal solution, or an evidence that problem (DMO) is ε-essentially infeasible.
Proof Analogous to the proof of Proposition 11.19. ⊓⊔

11.5 Problems with Hidden Monotonicity

Many problems encountered in various applications have a hidden monotonic structure which can be disclosed by suitable variable transformations. In this section we study two important classes of such problems.

11.5.1 Generalized Linear Fractional Programming

This is a class of problems with hidden monotonicity in the objective function. It includes problems of either of the following forms:

(P)  max{Φ(f₁(x)/g₁(x), ..., f_m(x)/g_m(x)) | x ∈ D}
(Q)  min{Φ(f₁(x)/g₁(x), ..., f_m(x)/g_m(x)) | x ∈ D}

where D is a nonempty polytope in Rⁿ, f₁,...,f_m, g₁,...,g_m are affine functions on Rⁿ such that

−∞ < a_i := min_{x∈D} f_i(x)/g_i(x) < +∞, i = 1,...,m,  (11.29)

while Φ: R^m → R is a continuous function, increasing on {y ∈ R^m | y_i ≥ a_i, i = 1,...,m}, i.e., satisfying

a_i ≤ y′_i ≤ y_i (i = 1,...,m) ⇒ Φ(y′) ≤ Φ(y).  (11.30)

This class of problems includes, aside from linear multiplicative programs (see Chap. 8, Sect. 8.5), diverse generalized linear fractional programs, namely:

1. Maximin and Minimax

max{min{f₁(x)/g₁(x), ..., f_m(x)/g_m(x)} | x ∈ D}  (Φ(y) = min{y₁,...,y_m})  (11.31)
min{max{f₁(x)/g₁(x), ..., f_m(x)/g_m(x)} | x ∈ D}  (Φ(y) = max{y₁,...,y_m})  (11.32)

2. Maxsum and Minsum

max{Σ_{i=1}^m f_i(x)/g_i(x) | x ∈ D}  (Φ(y) = Σ_{i=1}^m y_i)  (11.33)
min{Σ_{i=1}^m f_i(x)/g_i(x) | x ∈ D}  (Φ(y) = Σ_{i=1}^m y_i)  (11.34)

3. Maxproduct and Minproduct

max{Π_{i=1}^m f_i(x)/g_i(x) | x ∈ D}  (Φ(y) = Π_{i=1}^m y_i)  (11.35)
min{Π_{i=1}^m f_i(x)/g_i(x) | x ∈ D}  (Φ(y) = Π_{i=1}^m y_i)  (11.36)

(For the last two problems it is assumed that a_i > 0, i = 1,...,m, in condition (11.29).)
Standard linear fractional programs, which are the special cases of problems (P), (Q) with m = 1, were first studied by Charnes and Cooper as early as 1962. Problems (11.31) and (11.32) received much attention from researchers (Schaible 1992 and the references therein); problems (11.33) and (11.34) were studied in Falk and Palocsay (1992), Konno and Abe (1999), Konno et al. (1991), and problems (11.35), (11.36) in Konno and Abe (1999) and Konno and Yajima (1992).
For a review of fractional programming and extensions up to 1993, the reader is referred to Schaible (1992), where various potential and actual applications of fractional programming are also described. In general, fractional programming arises in situations where one would like to optimize functions of ratios such as return/investment, return/risk, cost/time, or output/input. Let us also mention the book (Stancu-Minasian 1997), which gives a comprehensive presentation of the theory, methods, and applications of fractional programming up to that date.
From a computational point of view, so far different methods have been proposed for solving different variants of problems (P) and (Q). A general unifying idea of parametrization underlies the methods developed by Konno and his associates (Konno et al. 1991; Konno and Yajima 1992; Konno and Yamashita 1997). Each of these methods is devised for a specific class of problems, and most of them work quite well for problem instances with m ≤ 3.
In the following we present a unified approach to all variants of problems (P) and (Q), developed in Hoai-Phuong and Tuy (2003) and based on an application of monotonic optimization.

a. Conversion to Monotonic Optimization

First note that in the study of problems (11.31)–(11.34) it is usually assumed that

min_{x∈D} f_i(x) ≥ 0,  min_{x∈D} g_i(x) > 0,  i = 1,...,m,  (11.37)

a condition generally weaker than (11.29). Further, without loss of generality we can always assume in problems (P) and (Q) that Φ: R^m_+ → R_{++} and

min{g_i(x), f_i(x)} > 0  ∀x ∈ D, i = 1,...,m.  (11.38)

In fact, in view of (11.29), g_i(x) does not vanish on D, hence has a constant sign on D, and by replacing f_i, g_i with their negatives if necessary we can assume g_i(x) > 0 ∀x ∈ D, i = 1,...,m. Then, setting f̃_i(x) = f_i(x) − a_i g_i(x), we have, by (11.29), f̃_i(x) ≥ 0 ∀x ∈ D, i = 1,...,m, and since f̃_i(x)/g_i(x) + a_i = f_i(x)/g_i(x), the problem (P) is equivalent to

max{Φ̃(f̃₁(x)/g₁(x), ..., f̃_m(x)/g_m(x)) | x ∈ D},

where Φ̃(y₁,...,y_m) := Φ(y₁ + a₁, ..., y_m + a_m). In view of (11.30), Φ̃: R^m_+ → R is an increasing function, and by adding a positive constant one can finally assume Φ̃: R^m_+ → R_{++}. Analogously for problem (Q).
Now define the sets

G = {y ∈ R^m_+ | y_i ≤ f_i(x)/g_i(x) (i = 1,...,m) for some x ∈ D},  (11.39)
H = {y ∈ R^m_+ | y_i ≥ f_i(x)/g_i(x) (i = 1,...,m) for some x ∈ D}.  (11.40)

Clearly, G is a normal set, H a conormal set, and both are contained in the box [0,b] with

b_i = max_{x∈D} f_i(x)/g_i(x),  i = 1,...,m.

Proposition 11.23 Problems (P) and (Q) are equivalent, respectively, to the following problems:

(MP)  max{Φ(y) | y ∈ G}
(MQ)  min{Φ(y) | y ∈ H}.

More precisely, if x̄ solves (P) ((Q), resp.) then ȳ with ȳ_i = f_i(x̄)/g_i(x̄) solves (MP) ((MQ), resp.). Conversely, if ȳ solves (MP) ((MQ), resp.) and x̄ satisfies ȳ_i ≤ f_i(x̄)/g_i(x̄) (ȳ_i ≥ f_i(x̄)/g_i(x̄), resp.), then x̄ solves (P) ((Q), resp.).
Proof Suppose x̄ solves (P). Then ȳ = (f₁(x̄)/g₁(x̄), ..., f_m(x̄)/g_m(x̄)) satisfies Φ(ȳ) ≥ Φ(y) for all y satisfying y_i = f_i(x)/g_i(x), i = 1,...,m, with x ∈ D. Since Φ is increasing this implies Φ(ȳ) ≥ Φ(y) ∀y ∈ G, hence ȳ solves (MP). Conversely, if ȳ solves (MP) then ȳ_i ≤ f_i(x̄)/g_i(x̄), i = 1,...,m, for some x̄ ∈ D, and since Φ is increasing, Φ(f₁(x̄)/g₁(x̄), ..., f_m(x̄)/g_m(x̄)) ≥ Φ(ȳ) ≥ Φ(y) for every y ∈ G, i.e., for every y satisfying y_i = f_i(x)/g_i(x), i = 1,...,m, with x ∈ D. This means x̄ solves (P). ⊓⊔
Thus generalized linear fractional problems (P), (Q) in Rⁿ can be converted into monotonic optimization problems (MP), (MQ), respectively, in R^m_+, with m usually much smaller than n.

b. Solution Method

We only briefly discuss a solution method for problem (MP) (a similar method works for (MQ)):

(MP)  max{Φ(y) | y ∈ G},

where (see (11.39)) G = {y ∈ R^m_+ | y_i ≤ f_i(x)/g_i(x), i = 1,...,m, for some x ∈ D} is contained in the box [0,b] with b satisfying

b_i = max_{x∈D} f_i(x)/g_i(x),  i = 1,...,m.  (11.41)

In view of assumption (11.38) we can select a vector a ∈ R^m_{++} such that

a_i = min_{x∈D} f_i(x)/g_i(x) > 0,  i = 1,...,m,  (11.42)

where it can be assumed that a_i < b_i, i = 1,...,m, because if a_i = b_i for some i then f_i(x)/g_i(x) = a_i ∀x ∈ D and Φ(·) reduces to a function of m − 1 variables. So, with a, b defined as in (11.41) and (11.42), the problem to be solved is

max{Φ(y) | y ∈ G ∩ [a,b]},  [a,b] ⊂ R^m_{++}.  (11.43)

Let γ* be the optimal value of (MP) (γ* > 0 since Φ: R^m_+ → R_{++}). Given a tolerance ε > 0, we say that a vector ȳ ∈ G is an ε-optimal solution of (MP) if γ* ≤ (1 + ε)Φ(ȳ), or equivalently, if Φ(y) ≤ (1 + ε)Φ(ȳ) ∀y ∈ G. The vector x̄ ∈ D satisfying ȳ_i = f_i(x̄)/g_i(x̄) is then called an ε-optimal solution of (P).

The POA algorithm specialized to problem (11.43) reads as follows:

POA Algorithm (tolerance ε > 0)
Initialization. Start with T₁ = {b}. Set k = 1.
Step 1. Select z^k ∈ argmax{Φ(z) | z ∈ T_k}. Compute y^k = π_G(z^k), the point where the halfline from a through z^k meets the upper boundary ∂⁺G of G. Let x^k ∈ D satisfy y^k_i ≤ f_i(x^k)/g_i(x^k), i = 1,...,m. Define ỹ^k ∈ G by ỹ^k_i = f_i(x^k)/g_i(x^k), i = 1,...,m (so ỹ^k is a feasible solution). Determine the new current best solution ȳ^k (with corresponding x̄^k ∈ D) and the new current best value CBV = Φ(ȳ^k) by comparing ỹ^k with ȳ^{k−1} (for k = 1 set ȳ¹ = ỹ¹, x̄¹ = x¹).
Step 2. Select a set T′_k ⊂ T_k such that z^k ∈ T′_k ⊂ {z | z > y^k}, and let T̃_k be the set obtained from T_k by replacing each z ∈ T′_k with the points z^i = z − (z_i − y^k_i)e^i, i = 1,...,m.
Step 3. From T̃_k remove all improper elements, all z ∈ T̃_k such that Φ(z) < (1 + ε)CBV, and also all z ≱ a. Let T_{k+1} be the set of remaining elements. If T_{k+1} = ∅, terminate: ȳ^k is an ε-optimal solution of (MP), and the corresponding x̄^k an ε-optimal solution of (P).
Step 4. If T_{k+1} ≠ ∅, set k ← k + 1 and return to Step 1.

c. Implementation Issues

A basic operation in Step 1 is to compute the point π_G(z) where the halfline from a through z meets the upper boundary of the normal set G. Since G ⊂ [0,b] we have π_G(z) = λ̄z, where

λ̄ = max{λ | λz_i ≤ f_i(x)/g_i(x) (i = 1,...,m), x ∈ D}
  = max_{x∈D} min_{i=1,...,m} f_i(x)/(z_i g_i(x)).  (11.44)

So computing π_G(z) amounts to solving a problem of the form (11.31). Since this is computationally expensive, instead of computing π_G(z) = λ̄z one should compute only an approximate value of π_G(z), i.e., a vector y ≥ π_G(z) such that

y = (1 + η)λ̂z, with λ̂ ≤ λ̄ ≤ (1 + η)λ̂,  (11.45)

where η > 0 is a tolerance. Of course, to guarantee termination of the above algorithm, the tolerance η must be suitably chosen. This is achieved by choosing η > 0 such that

Φ((1 + η)λ̂z) < (1 + ε/2)Φ(λ̂z),  (11.46)

i.e., Φ(y) ≤ (1 + ε/2)Φ(π_G(z)). Indeed, since λ̂z ≤ π_G(z) ≤ y := (1 + η)λ̂z, while Φ(y) is continuous, it follows that, for sufficiently large k, Φ(z^k) − Φ(y^k) ≤ (ε/2)Φ(π_G(z^k)). Then, noting that Φ(ỹ^k) ≥ Φ(π_G(z^k)) because Φ(y) is increasing, we will have

(1 + ε)CBV ≥ (1 + ε)Φ(ỹ^k) ≥ (1 + ε)Φ(π_G(z^k))
 ≥ (1 + ε/2)Φ(π_G(z^k)) + (ε/2)Φ(π_G(z^k))
 ≥ Φ(y^k) + (ε/2)Φ(π_G(z^k)) ≥ Φ(y^k) + Φ(z^k) − Φ(y^k)
 = Φ(z^k),

and so the termination criterion will hold.


Consider the subproblem

LP(λ):  R(λ) := max_{x∈D} min_{i=1,...,m} (f_i(x) − λz_i g_i(x)),

which is equivalent to the linear program

max{t | t ≤ f_i(x) − λz_i g_i(x), i = 1,...,m, x ∈ D}.
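Assuming D = {x ≥ 0 | Ax ≤ c} and affine data f_i(x) = F_i x + f̄_i, g_i(x) = G_i x + ḡ_i (these data names are assumptions of the sketch, not from the text), LP(λ) can be passed directly to an off-the-shelf LP solver such as scipy.optimize.linprog:

import numpy as np
from scipy.optimize import linprog

def solve_LP_lambda(lam, z, F, fb, G, gb, A, c):
    """LP(lambda): max{t : t <= f_i(x) - lam*z_i*g_i(x), i=1..m,
    A x <= c, x >= 0}. Returns (R(lambda), x_opt)."""
    m, n = F.shape
    obj = np.concatenate([np.zeros(n), [-1.0]])     # minimize -t
    # t - f_i(x) + lam*z_i*g_i(x) <= 0
    A_t = np.hstack([-(F - lam * z[:, None] * G), np.ones((m, 1))])
    b_t = fb - lam * z * gb
    A_ub = np.vstack([A_t, np.hstack([A, np.zeros((A.shape[0], 1))])])
    b_ub = np.concatenate([b_t, c])
    bounds = [(0, None)] * n + [(None, None)]       # x >= 0, t free
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return -res.fun, res.x[:n]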


As has been proved in Hoai-Phuong and Tuy (2003), one way to ensure (11.46) is
to compute y  G .z/ by the following subroutine:
Procedure G .z/
Step 1. Set s D 1; 1 D 0: Solve LP.1 /; obtaining an optimal solution x1 of it.
1
Define Q 1 D mini zifgi .xi .x1/ / :
Step 2. Let sC1 D Q s .1 C
/: Solve LP.sC1 /; obtaining an optimal solution
xsC1 of it. If R.sC1 /  0; and .sC1 z/  .1 C "=2/.Q s z/; terminate:
y D sC1 z; yQ si D fi .xsC1 /=gi .xsC1 /; i D 1; : : : ; m:
Step 3. If R.sC1 /  0 but .sC1 z/ > .1 C "=2/.Q sz/ reset

=2; Q sC1 D
fi .xsC1 /
mini z g .xsC1 / ; increment s and go back to Step 2.
i i
fi .xsC1 /
Step 4. If R.sC1 / > 0; set Q sC1 D mini zi gi .xsC1 /
; increment s and go back to
Step 2.

11.5.2 Problems with Hidden Monotonicity in Constraints

This class includes problems of the form

min{f(x) | Ψ(C(x)) ≤ 0, x ∈ D},  (11.47)

where f(x) is an increasing function, Ψ: R^m_+ → R is an u.s.c. increasing function, D a compact convex set, and C: Rⁿ → R^m_+ a continuous map.

The monotonic structure in (11.47) is easily disclosed. Since the set C(D) := {y = C(x), x ∈ D} is compact, it is contained in some box [a,b] ⊂ R^m_+. For each y ∈ [a,b] let

R(y):  φ(y) := min{f(x) | C(x) ≤ y, x ∈ D}.

Proposition 11.24 The problem (11.47) is equivalent to the monotonic optimization problem

min{φ(y) | Ψ(y) ≤ 0, y ∈ [a,b]}  (11.48)

in the sense that if y* solves (11.48) then an optimal solution x* of R(y*) solves (11.47), and conversely, if x* solves (11.47) then y* = C(x*) solves (11.48).
Proof That (11.48) is a monotonic optimization problem is clear from the fact that φ(y) is an increasing function by the definition of R(y), while Ψ(y) is an increasing function by assumption. The equivalence of the two problems is easily seen by writing (11.47) as

inf_{x,y} {f(x) | x ∈ D, C(x) ≤ y, y ∈ [a,b], Ψ(y) ≤ 0}
 = min_{y∈[a,b], Ψ(y)≤0} min_x {f(x) | x ∈ D, C(x) ≤ y}
 = min{φ(y) | y ∈ [a,b], Ψ(y) ≤ 0}.

In more detail, let y* be an optimal solution of (11.48) and let x* be an optimal solution of R(y*), i.e., φ(y*) = f(x*) with x* ∈ D, C(x*) ≤ y*. For any x ∈ D satisfying Ψ(C(x)) ≤ 0 we have Ψ(y) ≤ 0 for y = C(x), so f(x) ≥ φ(y) ≥ φ(y*) = f(x*), proving that x* solves (11.47). Conversely, let x* be an optimal solution of (11.47), i.e., x* ∈ D, Ψ(y*) ≤ 0 with y* = C(x*). For any y ∈ [a,b] satisfying Ψ(y) ≤ 0, we have φ(y) = f(x) for some x ∈ D with C(x) ≤ y, hence Ψ(C(x)) ≤ Ψ(y) ≤ 0, i.e., x is feasible to (11.47), and hence φ(y) = f(x) ≥ f(x*) ≥ φ(y*), i.e., y* is an optimal solution of (11.48). ⊓⊔
Note that the objective function φ(y) in the monotonic optimization problem (11.48) is given only implicitly, as the optimal value of a subproblem R(y). However, in many cases of interest R(y) is solvable by standard methods and no difficulty arises from that. Consider, for example, the problem

min{f(x) | x ∈ D, Σ_{i=1}^m u_i(x)/v_i(x) ≤ 1},

where D is a polytope, f(x) is an increasing function, and u_i(x), v_i(x), i = 1,...,m, are continuous functions such that v_i(x) > 0 ∀x ∈ D and the set {y | y_i = u_i(x)/v_i(x), i = 1,...,m, x ∈ D} is contained in a box [a,b] ⊂ R^m_+.

Setting C_i(x) = u_i(x)/v_i(x) and Ψ(y) = Σ_{i=1}^m y_i − 1, we have

φ(y) = min{f(x) | x ∈ D, y_i v_i(x) ≥ u_i(x), i = 1,...,m},

so the equivalent monotonic optimization problem is

min{φ(y) | Σ_{i=1}^m y_i ≤ 1, y ∈ [a,b]}.
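When f, u_i, v_i are affine, each evaluation of φ(y) is itself a linear program; a sketch with illustrative data names (D = {x ≥ 0 | Ax ≤ c}, u_i(x) = U_i x + ū_i, v_i(x) = V_i x + v̄_i are assumptions of this sketch):

import numpy as np
from scipy.optimize import linprog

def phi(y, fc, U, ub, V, vb, A, c):
    """phi(y) = min{fc @ x : A x <= c, x >= 0, y_i*v_i(x) >= u_i(x)}."""
    A_ratio = U - y[:, None] * V        # u_i(x) - y_i*v_i(x) <= 0
    b_ratio = y * vb - ub
    A_ub = np.vstack([A, A_ratio])
    b_ub = np.concatenate([c, b_ratio])
    res = linprog(fc, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)]*len(fc))
    return res.fun if res.success else np.inf   # +inf if R(y) infeasible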

11.6 Applications

Monotonic optimization has met with important applications in economics, engineering, and technology, especially in risk and reliability theory and in communication and networking systems.

11.6.1 Reliability and Risk Management

As an example of a problem with hidden monotonicity in the constraints, consider the following problem which arises in reliability and risk management (Prékopa 1995):

min{⟨c,x⟩ | Ax = b, x ∈ R^n_+, P{Tx ≥ ξ(ω)} ≥ α},  (11.49)

where c ∈ Rⁿ, A ∈ R^{p×n}, b ∈ R^p, T ∈ R^{m×n} are given deterministic data, ω is a random vector from the probability space (Ω, Σ, P), ξ: Ω → R^m represents stochastic right-hand side parameters, P{S} denotes the probability of the event S ∈ Σ under the probability measure P, and α ∈ (0,1) is a scalar. This is a linear program with an additional probabilistic constraint which requires that the inequalities Tx ≥ ξ(ω) hold with probability at least α.
If the stochastic parameters have a continuous log-concave distribution, then problem (11.49) is guaranteed to have a convex feasible set (Prékopa 1973), and hence may be solvable with standard convex programming techniques (Prékopa 1995). For general distributions, in particular for discrete distributions, the feasible set is nonconvex.
For every y ∈ R^m define f(y) := inf{⟨c,x⟩ | Ax = b, x ∈ R^n_+, Tx ≥ y}. Let F: R^m → [0,1] be the cumulative distribution function of the random vector ξ(ω), i.e., F(y) = P{y ≥ ξ(ω)}. Assuming p ≤ y ≤ q with −∞ < f(y) < +∞ ∀y ∈ [p,q], problem (11.49) can then be rewritten as

min{f(y) | F(y) ≥ α, y ∈ [p,q]}.  (11.50)


11.6 Applications 425

As is easily seen, f .y/ is a continuous convex and increasing function while


F.y/ is an u.s.c. increasing function, so this is a monotonic optimization problem
equivalent to (11.49). For a study of problem (11.49) based on its monotonic
reformulation (11.50) see, e.g., Cheon et al. (2006).
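For a discrete distribution given by finitely many equiprobable scenarios ξ^1,…,ξ^N, both ingredients of (11.50) are straightforward to evaluate. The following small sketch (our own code and naming, with scipy assumed available) illustrates this:

```python
# Sketch: the two ingredients of the monotonic reformulation (11.50)
# for an empirical (discrete) distribution of xi.
import numpy as np
from scipy.optimize import linprog

def F(y, scenarios):
    """Empirical c.d.f.: fraction of scenarios with xi <= y (componentwise)."""
    return np.mean(np.all(scenarios <= y, axis=1))

def f(y, c, A, b, T):
    """f(y) = min{ <c,x> : Ax = b, x >= 0, Tx >= y }."""
    res = linprog(c, A_ub=-T, b_ub=-y, A_eq=A, b_eq=b)  # Tx >= y as -Tx <= -y
    return res.fun if res.success else np.inf
```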

11.6.2 Power Control in Wireless Communication Networks

We briefly describe a general monotonic optimization framework for solving resource allocation problems in communication and networking systems, as developed in the monographs (Bjornson and Jorswieck 2012; Zhang et al. 2012) (see also Bjornson et al. 2012; Utschick and Brehmer 2012; Jorswieck and Larson 2010; Qian and Jun Zhang 2010; Qian et al. 2009, 2012). For details the reader is referred to these works and the references therein.

Consider a wireless network with a set of n distinct links. Each link i includes a transmitter node T_i and a receiver node R_i. Let G_ij be the channel gain between node T_i and node R_j, p_i the transmission power of link i (from T_i to R_i), and η_i the received noise power on link i. Then the received signal to interference-plus-noise ratio (SINR) of link i is

γ_i = G_ii p_i / (Σ_{j≠i} G_ij p_j + η_i).

If p = (p_1,…,p_n), γ(p) = (γ_1(p),…,γ_n(p)) and U(γ(p)) is the system utility, then the problem we are concerned with can be formulated as

max U(γ(p))
s.t. γ_i(p) ≥ γ_{i,min},  i = 1,…,n;      (11.51)
     0 ≤ p_i ≤ p_i^{max},  i = 1,…,n.

By its very nature U(γ(p)) is an increasing function of γ(·), so this is a problem of Generalized Linear Fractional Programming as discussed in the previous section.
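In coded form the SINR map and the constraints of (11.51) look as follows; this is a small illustrative sketch in our own notation, not part of any of the cited frameworks:

```python
# Sketch: SINR computation and feasibility test for problem (11.51).
import numpy as np

def sinr(p, G, eta):
    """gamma_i(p) = G[i,i]*p[i] / (sum_{j != i} G[i,j]*p[j] + eta[i])."""
    received = G @ p                          # includes the own-signal term
    interference = received - np.diag(G) * p  # remove G[i,i]*p[i]
    return np.diag(G) * p / (interference + eta)

def feasible(p, G, eta, gamma_min, p_max):
    return np.all(sinr(p, G, eta) >= gamma_min) and np.all((p >= 0) & (p <= p_max))
```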
In the more general case of MISO (multi-input single-output) links, each link includes a multi-antenna transmitter T_i and a single-antenna receiver R_i. The channel gain between node T_i and node R_j is denoted by the vector h_ij. If w_i is the beamforming vector of T_i, which yields a transmit power level of p_i = ‖w_i‖_2^2, and η_i is the noise power at node R_i, then the SINR of link i is

γ_i(w_1,…,w_n) = |h_ii^H w_i|^2 / (Σ_{j≠i} |h_ij^H w_j|^2 + η_i).

As before, by U(γ) denote the system utility function, which monotonically increases with γ := (γ_1,…,γ_n). Then the problem is to

max U(γ(w))
s.t. γ_i(w_1,…,w_n) ≥ γ_{i,min},  i = 1,…,n;      (11.52)
     0 ≤ ‖w_i‖_2^2 ≤ P_i^{max},  i = 1,…,n,

where γ_{i,min} is the minimum SINR requirement of link i and P_i^{max} is the maximum transmit power of link i.

Again problem (11.52) can be converted into a canonical monotonic optimization problem. In fact, setting

G = {y | 0 ≤ y_i ≤ γ_i(w_1,…,w_n), i = 1,…,n, for some w_1,…,w_n with ‖w_i‖_2^2 ≤ P_i^{max}, i = 1,…,n},
H = {y | y_i ≥ γ_{i,min}, i = 1,…,n},

we can rewrite problem (11.52) as

max{U(y) | y ∈ G ∩ H},

where G is a closed normal set and H a closed conormal set.
Remark 11.5 In Bjornson and Jorswieck (2012) and Zhang et al. (2012) the monotonic optimization problem was solved by the Polyblock Algorithm or the BRB Algorithm. Certainly the performance could have been much better with the SIT Algorithm, had it been available at the time. Once again, it should be emphasized that not only is the SIT algorithm more efficient, but it also suits the user's purpose best if we are mostly interested in quickly finding a feasible solution or, possibly, a feasible solution better than a given one.

11.6.3 Sensor Cover Energy Problem

An important optimization problem that arises in wireless communication, concerning sensor coverage energy (Yick et al. 2008), can be described as follows:

Consider a wireless network deployed in a certain geographical region, with n sensors at points s_i ∈ R^p, i = 1,…,n, and m target nodes at points t_j ∈ R^p, j = 1,…,m, where p is generally set to 2 or 3. The energy consumption per unit time of a sensor i is a monotonically nondecreasing function of its sensing radius r_i. Denoting this function by E_i(r_i), assume, as usual in many studies on wireless sensor coverage, that

E_i(r_i) = α_i r_i^{β_i} + σ,   l_i ≤ r_i ≤ u_i,

where α_i > 0, β_i > 0 are constants depending on the specific device and σ represents the idle-state energy cost. The problem is to determine r_i ∈ [l_i, u_i] ⊂ R_+, i = 1,…,n, such that each target node t_j, j = 1,…,m, is covered by at least one sensor and the total energy consumption is minimized. That is,

(SCEP)   min Σ_{i=1}^n E_i(r_i)
         s.t. max_{1≤i≤n} (r_i − ‖s_i − t_j‖) ≥ 0,  j = 1,…,m,
              l ≤ r ≤ u,

where the last inequalities are understood component-wise (Fig. 11.4).

Fig. 11.4 Sensor cover energy problem (n = 3, m = 14): sensor and target locations in the plane.

Setting x_i = α_i r_i^{β_i}, a_ij = α_i ‖s_i − t_j‖^{β_i} and a_i = α_i l_i^{β_i}, b_i = α_i u_i^{β_i}, we can rewrite (SCEP) as

min f(x) = Σ_{i=1}^n x_i subject to
g_j(x) := max_{1≤i≤n} (x_i − a_ij) ≥ 0,  j = 1,…,m,      (11.53)
a ≤ x ≤ b.

Clearly each g_j(x), j = 1,…,m, is a convex function, so the problem is a linear optimization under reverse convex constraints and can be solved by methods of reverse convex programming (Chap. 7, Sect. 7.3). In Astorino and Miglionico (2014), the authors use a penalty function to shift the reverse convex constraints to the objective function, transforming (11.53) into an equivalent dc optimization problem which is then solved by the DCA algorithm of P.D. Tao. Aside from the need of a penalty factor which is not simple to determine even approximately, a drawback of this method is that it can at best provide an approximate local optimal solution which is not guaranteed to be global.

A better approach to problem (SCEP) is to view it as a monotonic optimization problem. In fact, each g_j(x), j = 1,…,m, is an increasing function, i.e., g_j(x) ≥ g_j(x′) whenever x ≥ x′. So the feasible set of problem (11.53),

H := {x ∈ [a,b] | g_j(x) ≥ 0, j = 1,…,m},

is a conormal set (see Sect. 11.1) and (SCEP) is actually a standard monotonic optimization problem: minimizing the increasing function f(x) = Σ_{i=1}^n x_i over the conormal set H.
Let a^j = (a_1j,…,a_nj) and J := {j = 1,…,m | a ≤ a^j}. We have

H = ∩_{j∈J} H_j,

where H_j = {x ∈ [a,b] | max_{1≤i≤n}(x_i − a_ij) ≥ 0}. Clearly H_j = [a,b] \ C_j with C_j = {x ∈ [a,b] | max_{1≤i≤n}(x_i − a_ij) < 0}. For every j ∈ J, since the closure of C_j is C̄_j = {x | a ≤ x ≤ a^j} = [a, a^j], the set

G := ∪_{j∈J} C̄_j = ∪_{j∈J} [a, a^j]

is a polyblock. Let J_1 be the set of all j ∈ J for which a^j is a proper vertex of the polyblock G, i.e., for which there is no k ∈ J, k ≠ j, such that a^k ≥ a^j. Then G = ∪_{j∈J_1} [a, a^j], hence ∪_{j∈J} C_j = ∪_{j∈J_1} C_j. Since

H = ∩_{j∈J} H_j = [a,b] \ ∪_{j∈J} C_j = [a,b] \ ∪_{j∈J_1} C_j,

it follows that

H = {x ∈ [a,b] | g_j(x) = max_{1≤i≤n}(x_i − a_ij) ≥ 0 ∀j ∈ J_1} = ∩_{j∈J_1} H_j.

Proposition 11.25 H is a copolyblock in [a,b], and (SCEP) reduces to minimizing the increasing function f(x) = Σ_{i=1}^n x_i over this copolyblock.

Proof We have H_j = [a,b] \ C_j, where C_j = {x ∈ [a,b] | max_{1≤i≤n}(x_i − a_ij) < 0} = [a,b] ∩ [a, a^j), so that H_j = [a,b] \ [a, a^j) = ∪_{1≤i≤n} [u^{ij}, b] with u^{ij} = a + (a_ij − a_i)e^i (Proposition 11.7). Thus each H_j is a copolyblock with vertex set {u^{1j},…,u^{nj}}, hence H = ∩_{j∈J_1} H_j is a copolyblock (Proposition 11.6). ⊓⊔

Let V denote the proper vertex set of the copolyblock H. Since each v ∈ V is a local minimizer of f(x) = Σ_{i=1}^n x_i over H, we thus have a simple procedure for computing a local optimal solution of (SCEP). Starting from such a local optimal solution, a BRB Algorithm can then be applied to solve the discrete monotonic optimization problem

min{Σ_{i=1}^n x_i | x ∈ V},

which is equivalent to (SCEP). This is essentially the method recently developed in Hoai and Tuy (2016) for solving (SCEP). For details on the implementation of the method and computational experiments with problems of up to 1000 variables, the reader is referred to the cited paper.
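As an aside, the proper-vertex computation underlying this construction is a plain pairwise domination test. The small sketch below is our own code (it assumes the columns a^j are pairwise distinct):

```python
# Sketch: extract J_1, the indices j whose a^j is a proper vertex of the
# polyblock G, i.e. is not dominated by any other a^k (a^k >= a^j, k != j).
import numpy as np

def proper_vertices(A):
    """A[:, j] holds a^j; returns the list J_1 (columns assumed distinct)."""
    m = A.shape[1]
    return [j for j in range(m)
            if not any(k != j and np.all(A[:, k] >= A[:, j]) for k in range(m))]
```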
11.6.4 Discrete Location

Consider the following discrete location problem:

(DL) Given m balls in R^n with centers a^i ∈ R^n_{++} and radii γ_i (i = 1,…,m), and a bounded discrete set S ⊂ R^n_+, find the largest ball that has center in S and is disjoint from each of these m balls. In other words,

maximize z subject to
‖x − a^i‖ − γ_i ≥ z,  i = 1,…,m,      (11.54)
x ∈ S ⊂ R^n_+,  z ∈ R_+.

This problem is encountered in various applications. For example, in location theory (Plastria 1995), it can be interpreted as a maximin location problem: a^i, i = 1,…,m, are the locations of m obnoxious facilities, and γ_i > 0 is the radius of the polluted region of facility i, while an optimal solution is a location x ∈ S outside all polluted regions and as far as possible from the nearest of these obnoxious facilities. In engineering design, (DL) appears as a variant of the design centering problem (Thach 1988; Vidigal and Director 1982), an important special case of which, when γ_i = 0, i = 1,…,m, is the largest empty ball problem (Shi and Yamamoto 1996): given m points a^1,…,a^m in R^n_{++} and a bounded set S, find the largest ball that has center in S ⊂ R^n_+ and contains none of these points (Fig. 11.5).

As a first step towards solving (DL) one can study the following feasibility problem:

(Q(r)) Given a number r ≥ 0, find a point x(r) ∈ S lying outside every one of the m balls of centers a^i and radii δ_i = γ_i + r.

It has been shown in Tuy et al. (2003) that this problem can be reduced to

max{‖x‖^2 − h(x) | x ∈ S},

Fig. 11.5 Discrete location problem (n = 2, m = 10).
where h(x) = max_{i=1,…,m} (2⟨a^i, x⟩ + δ_i^2 − ‖a^i‖^2). Since both ‖x‖^2 and h(x) are obviously increasing functions, this is a discrete dm optimization problem that can be solved by a branch-reduce-and-bound method as discussed for solving the Master Problem in Sect. 11.3. Clearly, if r̄ is the maximal value of r such that (Q(r)) is feasible, then x̄ = x(r̄) will solve (DL). Noting that r̄ ≥ 0 and that for any r > 0 one has r ≤ r̄ or r > r̄ according to whether (Q(r)) is feasible or not, the value r̄ can be found by a Bolzano binary search scheme: starting from an interval [0, s] containing r̄, one reduces it by half at each step by solving a (Q(r)) with a suitable r. Quite encouraging computational results with this preliminary version of the SIT approach to dm optimization have been reported in Tuy et al. (2003). However, it turns out that more complete and much better results can be obtained by a direct application of the enhanced Discrete SIT Algorithm adapted to problem (11.54). Next we describe this method.
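The Bolzano search just described can be sketched as follows, assuming an oracle feasible_Q(r) that decides feasibility of (Q(r)), e.g., by the branch-reduce-and-bound method above; the code and names are ours:

```python
# Sketch: Bolzano binary search for r_bar on an interval [0, s] known to
# contain it.  Assumes Q(0) is feasible and Q(s) is infeasible, so the
# invariant "Q(lo) feasible, Q(hi) infeasible" holds throughout.
def bolzano_search(feasible_Q, s, tol=1e-6):
    lo, hi = 0.0, s
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible_Q(mid):
            lo = mid
        else:
            hi = mid
    return lo                  # approximates r_bar from below within tol
```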
Observe that

‖x − a^i‖ − γ_i ≥ z, i = 1,…,m
⟺ ‖x‖^2 + ‖a^i‖^2 − 2⟨a^i, x⟩ ≥ (z + γ_i)^2, i = 1,…,m
⟺ max_{i=1,…,m} {(z + γ_i)^2 + 2⟨a^i, x⟩ − ‖a^i‖^2} ≤ ‖x‖^2.

Therefore, by setting

φ(x, z) = max_{i=1,…,m} ((z + γ_i)^2 + 2⟨a^i, x⟩ − ‖a^i‖^2),      (11.55)

problem (11.54) can be restated as

max{z | φ(x, z) − ‖x‖^2 ≤ 0, x ∈ [a,b] ∩ S, z ≥ 0},      (11.56)

where φ(x, z) and ‖x‖^2 are increasing functions and [a,b] is a box containing S. To apply the Discrete SIT Algorithm for solving this problem, observe that the value of z is determined when x is fixed. Therefore, branching should be performed upon the variables x only.

Another key operation in the algorithm is bounding: given a box [p,q] ⊂ [a,b], compute an upper bound β(M) for the optimal value of the subproblem

max{z | φ(x, z) − ‖x‖^2 ≤ 0, x ∈ S ∩ [p,q], 0 ≤ z}.      (11.57)

If CBV = r > 0 is the largest thus far known value of z at a feasible solution (x̄, z̄), then only feasible points (x, z) with z > r are still of interest. Therefore, to obtain a tighter value of β(M) one should consider, instead of (11.57), the problem

(DL(M, r))   max{z | φ(x, z) − ‖x‖^2 ≤ 0, x ∈ S ∩ [p,q], z ≥ r}.


Because φ(x, z) and ‖x‖^2 are both increasing, the constraints of (DL(M, r)) can be relaxed to φ(p, z) ≤ ‖q‖^2, so an obvious upper bound is

c(p, q) := max{z | φ(p, z) − ‖q‖^2 ≤ 0}.      (11.58)

Although this bound is easy to compute (it is the zero of the increasing function z ↦ φ(p, z) − ‖q‖^2), it is often not quite efficient.
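In fact, for this particular φ the zero can be written in closed form: by (11.55), φ(p, z) ≤ ‖q‖^2 means (z + γ_i)^2 ≤ ‖q‖^2 − 2⟨a^i, p⟩ + ‖a^i‖^2 for every i, whence c(p, q) = min_i [(‖q‖^2 − 2⟨a^i, p⟩ + ‖a^i‖^2)^{1/2} − γ_i]. A small sketch (our own derivation and code) of this computation:

```python
# Sketch: the bound c(p,q) of (11.58) in closed form.
import numpy as np

def c_bound(p, q, A, gamma):
    """A[i] = a^i (centers), gamma[i] = radii; max{z : phi(p,z) <= ||q||^2}."""
    t = q @ q - 2 * A @ p + np.sum(A * A, axis=1)  # ||q||^2 - 2<a^i,p> + ||a^i||^2
    if np.any(t < 0):
        return -np.inf         # some inequality has no solution: discard the box
    return np.min(np.sqrt(t) - gamma)
```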
A better bound can be computed by solving a linear relaxation of (DL(M, r)) obtained by omitting the discrete constraint x ∈ S and replacing φ(x, z) with a linear underestimator. Since

(z + γ_i)^2 ≥ (r + γ_i)^2 + 2(r + γ_i)(z − r),

a linear underestimator of φ(x, z) is

ℓ(x, z) := max_{i=1,…,m} {2(r + γ_i)(z − r) + (r + γ_i)^2 + 2⟨a^i, x⟩ − ‖a^i‖^2}.

On the other hand,

‖x‖^2 ≤ Σ_{j=1}^n [(p_j + q_j)x_j − p_j q_j]   ∀x ∈ [p,q],

so by (11.55) the constraint φ(x, z) − ‖x‖^2 ≤ 0 can be relaxed to

ℓ(x, z) − Σ_{j=1}^n [(p_j + q_j)x_j − p_j q_j] ≤ 0,

which is equivalent to the system of linear constraints

2(r + γ_i)(z − r) + (r + γ_i)^2 + 2⟨a^i, x⟩ − ‖a^i‖^2 ≤ Σ_{j=1}^n [(p_j + q_j)x_j − p_j q_j],  i = 1,…,m.

Therefore, an upper bound can be taken to be the optimal value β(M) of the linear programming problem

(LP(M, r))   max z
             s.t. 2(r + γ_i)(z − r) + (r + γ_i)^2 + 2⟨a^i, x⟩ − ‖a^i‖^2 − Σ_{j=1}^n [(p_j + q_j)x_j − p_j q_j] ≤ 0,  i = 1,…,m,
                  z ≥ r,  p ≤ x ≤ q.
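A hedged sketch (ours, scipy assumed) of setting up and solving LP(M, r): each constraint row is rearranged as [2a^i − (p + q)]ᵀx + 2(r + γ_i)z ≤ ‖a^i‖^2 − (γ_i^2 − r^2) − ⟨p, q⟩, using (r + γ_i)^2 − 2r(r + γ_i) = γ_i^2 − r^2.

```python
# Sketch: the linear relaxation LP(M, r) solved with scipy.
import numpy as np
from scipy.optimize import linprog

def lp_bound(p, q, A, gamma, r):
    """beta(M) for the box M = [p,q]; returns None if LP(M,r) is infeasible."""
    n = len(p)
    A_ub = np.hstack([2 * A - (p + q), 2 * (r + gamma)[:, None]])
    b_ub = np.sum(A * A, axis=1) - (gamma**2 - r**2) - p @ q
    c = np.zeros(n + 1); c[-1] = -1.0                 # maximize z
    bounds = [(p[j], q[j]) for j in range(n)] + [(r, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[-1] if res.success else None
```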

The bounds can be further improved by using valid reduction operations. For this, observe that, since φ(x, r) ≤ φ(x, z) for r ≤ z, the feasible set of (DL(M, r)) is contained in the set

{x ∈ [p,q] | φ(x, r) − ‖x‖^2 ≤ 0},

so, according to Proposition 11.13, if p′, q′ are defined by (11.13) with

g(x) = φ(x, r),   h(x) = ‖x‖^2,

and p̃ = ⌈p′⌉_S, q̃ = ⌊q′⌋_S, then the box [p̃, q̃] ⊂ [p,q] is a valid reduction of [p,q]. Thus, the Discrete SIT Algorithm as specialized to problem (DL) becomes the following algorithm:

SIT Algorithm for (DL)

Initialization. Let P_1 := {M_1}, M_1 := [a,b], R_1 := ∅. If some feasible solution (x̄, z̄) is available, let r := z̄ be CBV, the current best value. Otherwise, let r := 0. Set k := 1.

Step 1. Apply S-adjusted reduction cuts to reduce each box M := [p,q] ∈ P_k. In particular, delete any box [p,q] such that φ(p, r) − ‖q‖^2 ≥ 0. Let P′_k be the resulting collection of reduced boxes.
Step 2. For each M := [p,q] ∈ P′_k compute β(M) by solving LP(M, r). If LP(M, r) is infeasible or β(M) = r, then delete M. Let P_k := {M ∈ P′_k | β(M) > r}. For every M ∈ P_k, if a basic optimal solution of LP(M, r) can be S-adjusted to derive a feasible solution (x^M, z^M) with z^M > r, then use it to update CBV.
Step 3. Let S_k := R_k ∪ P_k. Reset r := CBV. Delete every M ∈ S_k such that β(M) < r and let R_{k+1} be the collection of remaining boxes.
Step 4. If R_{k+1} = ∅, then terminate: if r = 0, the problem is infeasible; otherwise, r is the optimal value and the feasible solution (x̄, z̄) with z̄ = r is an optimal solution.
Step 5. If R_{k+1} ≠ ∅, let M_k ∈ argmax{β(M) | M ∈ R_{k+1}}. Divide M_k into two boxes according to the standard bisection rule. Let P_{k+1} be the collection of these two subboxes of M_k.
Step 6. Increment k and return to Step 1.

Theorem 11.2 The SIT Algorithm for (DL) solves the problem (DL) in finitely many iterations.

Proof Although the variable z is not explicitly required to take on discrete values, it is actually a discrete variable, since the constraint (11.54) amounts to requiring that z = min_{i=1,…,m} (‖x − a^i‖ − γ_i) for x ∈ S. The finiteness of the SIT Algorithm for (DL) then follows from Theorem 7.10. ⊓⊔
As reported in Tuy et al. (2006), the above algorithm has been tested on a number of problem instances of dimension ranging from 10 to 100 (10 problems for each instance of n). The points a^j were randomly generated in the square 1000 ≤ x_i ≤ 90,000, i = 1,…,n, while S was taken to be the lattice of points with integral coordinates. The computational results suggest that the method is quite practical for this class of discrete optimization problems. The computational cost increases much more rapidly with n (dimension of space, or number of variables) than with m (number of balls). Also, when compared with the results preliminarily reported in Tuy et al. (2003), they show that the performance of the method can be drastically improved by using a more suitable monotonic reformulation.

11.7 Exercises

1. Let P be a polyblock in the box [a,b] ⊂ R^n_+. Show that the set Q = cl([a,b] \ P) is a copolyblock.
Hint: For each vertex a^i of P the set [a,b] \ [a, a^i) is a copolyblock; see Proposition 11.7.

2. Let h(x) be a dm function on [a,b] such that h(x) ≥ α > 0 ∀x ∈ [a,b], and let q: R_+ → R be a convex decreasing function with finite right derivative at α: q′_+(α) > −∞. Prove that q(h(x)) is a dm function on [a,b].
Hint: For any τ ∈ R_+ we have q(t) ≥ q(τ) + q′_+(τ)(t − τ), with equality holding for τ = t, so q(t) = sup_{τ∈R_+} {q(τ) + q′_+(τ)(t − τ)}. If h(x) = u(x) − v(x) with u(x), v(x) increasing, show that g(x) := K(u(x) + v(x)) − q(h(x)) is increasing for the constant K = −q′_+(α) ≥ 0.

3. Rewrite the following problem (Exercise 10, Chap. 5) as a canonical monotonic optimization problem (MO):

minimize 3x_1 + 4x_2 subject to
y_11 x_1 + 2x_2 + x_3 = 5;  y_12 x_1 + x_2 + x_4 = 3;
x_1, x_2 ≥ 0;  y_11 − y_12 ≥ 1;  0 ≤ y_11 ≤ 2;  0 ≤ y_12 ≤ 2.

Solve this problem by the SIT Algorithm for (MO).

4. Consider the monotonic optimization problem min{f(x) | g(x) ≤ 0, x ∈ [a,b]}, where f(x), g(x) are lower semi-continuous increasing functions. Devise a rectangular branch and bound method for solving this problem using an adaptive branching rule.
Hint: A polyblock (with vertex set V) which contains the normal set G = {x ∈ [a,b] | g(x) ≤ 0} can be viewed as a covering of G by the set of overlapping boxes [a,v], v ∈ V: G ⊂ ∪_{v∈V} [a,v]. So the POA algorithm is a BB procedure in each iteration of which a set of overlapping boxes [a,v], v ∈ V, is used to cover the feasible set, instead of a set of non-overlapping boxes [p,q] ⊂ [a,b] as in a conventional BB algorithm.
Chapter 12
Polynomial Optimization

In this chapter we are concerned with polynomial optimization, which is a very important subject with a multitude of applications: production planning, location and distribution (Duffin et al. 1967; Horst and Tuy 1996), risk management, water treatment and distribution (Shor 1987), chemical process design, pooling and blending (Floudas 2000), structural design, signal processing, robust stability analysis, design of chips (Ben-Tal and Nemirovski 2001), etc.

12.1 The Problem

In its general form the polynomial optimization problem can be formulated as

(P)   min{f_0(x) | f_i(x) ≥ 0 (i = 1,…,m), a ≤ x ≤ b},

where a, b ∈ R^n_+ and f_0(x), f_1(x),…,f_m(x) are polynomials, i.e.,

f_i(x) = Σ_{α∈A_i} c_{iα} x^α,   x^α := x_1^{α_1} x_2^{α_2} ⋯ x_n^{α_n},   c_{iα} ∈ R,      (12.1)

with α = (α_1,…,α_n) being an n-vector of nonnegative integers. Since the set of polynomials on [a,b] is dense in the space C[a,b] of continuous functions on [a,b] with the sup-norm topology, any continuous optimization problem can, in principle, be approximated as closely as desired by a polynomial optimization problem. Therefore, polynomial programming includes an extremely wide class of optimization problems. It is also a very difficult problem of global optimization, because many special cases of it, such as the nonconvex quadratic optimization problem, are known to be NP-hard.

Despite its difficulty, polynomial optimization has become a subject of intensive research during the last decades. Among the best so far known methods of polynomial optimization one should mention: an earliest method by Shor and Stetsenko (1989), the RLT method (Reformulation-Linearization Technique) by Sherali and Adams (1999), and a SDP relaxation method by Lasserre (2001). Furthermore, since any polynomial on a compact set can be represented as a dc function, it is natural that dc methods of optimization have been adapted to solve polynomial programs. In particular, a primal-dual relaxed algorithm called GOP, developed by Floudas (2000) for biconvex programming, has been later extended to a wider class of polynomial optimization problems. Finally, by viewing a polynomial as a dm function, polynomial optimization problems can be efficiently treated by methods of monotonic optimization.

12.2 Linear and Convex Relaxation Approaches

In branch and bound methods for solving optimization problems with a nonconvex constraint set, an important operation is bounding over a current partition set M. A common bounding practice consists in replacing the original problem (P) by a more easily solvable relaxed problem. Solving this relaxed problem restricted to M yields a lower bound for the objective function value over M. Under suitable conditions this relaxation can be proved to lead to a convergent algorithm. Below we present the RLT method (Sherali and Adams 1999), which is an extension of the Reformulation-Linearization Technique for quadratic programming (Chap. 9, Sect. 9.5), and the SDP relaxation method (Lasserre 2001, 2002), which uses a specific convex relaxation technique based on Semidefinite Programming.

12.2.1 The RLT Relaxation Method

Letting l_i(x) := c_{i0} + ⟨c^i, x⟩ be the linear part of f_i(x), the polynomial (12.1) can be written as

f_i(x) = l_i(x) + Σ_{α∈A_i'} c_{iα} x^α,

where A_i' = {α ∈ A_i | Σ_{j=1}^n α_j ≥ 2}. Setting A = ∪_{i=0}^m A_i' and c_{iα} = 0 for α ∈ A \ A_i', we can further write (12.1) as

f_i(x) = l_i(x) + Σ_{α∈A} c_{iα} x^α.      (12.2)
The RLT method is based on the simple observation that, using the substitution y_α = x^α, each polynomial f_i(x) can be transformed into the function L_i(x, y) = l_i(x) + Σ_{α∈A} c_{iα} y_α, which is linear in (x, y) ∈ R^n × R^{|A|}. Consequently, the polynomial optimization problem (P) can be rewritten as

(Q)   min l_0(x) + Σ_{α∈A} c_{0α} y_α
      s.t. l_i(x) + Σ_{α∈A} c_{iα} y_α ≥ 0  (i = 1,…,m),
           a ≤ x ≤ b,
           y_α = x^α, α ∈ A,

which is a linear program with the additional nonconvex constraints

y_α = x^α,  α ∈ A.      (12.3)

The linear program obtained by omitting these nonconvex constraints,

(LR(P))   min l_0(x) + Σ_{α∈A} c_{0α} y_α
          s.t. l_i(x) + Σ_{α∈A} c_{iα} y_α ≥ 0  (i = 1,…,m),
               a ≤ x ≤ b,

yields a linear relaxation of (P). For any box M = [p,q] ⊂ [a,b] let (P_M) denote the problem (P) restricted to the box M = [p,q]; then the optimal value of LR(P_M) yields a lower bound β(M) for min(P_M), with β(M) = +∞ whenever the feasible set of LR(P_M) is empty.
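A schematic sketch (our own code and data conventions, with scipy assumed) of building and solving LR(P): each monomial of degree ≥ 2 becomes a variable y_α, which we additionally bound over [a,b] by the product of the coordinate bounds (valid since a ≥ 0).

```python
# Sketch: the linear relaxation LR(P).  lin[i] = (c_i0, c_vec_i) is the
# linear part of f_i, mono[i] = {alpha: c_{i,alpha}} its higher-degree part,
# alpha being a tuple of exponents.  Row 0 is the objective f_0; a, b are
# numpy arrays with 0 <= a <= b.
import numpy as np
from scipy.optimize import linprog

def lr_bound(lin, mono, a, b):
    keys = sorted({al for mo in mono for al in mo})            # the index set A
    n = len(a)
    def row(i):                                                # coeffs on (x, y)
        return np.concatenate([lin[i][1],
                               [mono[i].get(al, 0.0) for al in keys]])
    c = row(0)                                                 # minimize l_0 + sum c_0a*y_a
    A_ub = -np.array([row(i) for i in range(1, len(lin))])     # f_i >= 0
    b_ub = np.array([lin[i][0] for i in range(1, len(lin))])
    y_bnd = [(np.prod(a**np.array(al)), np.prod(b**np.array(al))) for al in keys]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(a[j], b[j]) for j in range(n)] + y_bnd)
    return lin[0][0] + res.fun if res.success else np.inf
```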
The bounds computed this way may be rather poor. To have a tighter relaxation, and hence tighter bounds, one way is to augment the original constraint set with implied constraints formed by taking mixed products of original constraints. For example, for a partition set [p,q] such an implied constraint is (x_1 − p_1)(x_3 − p_3)(q_2 − x_2) ≥ 0, i.e.,

−x_1x_2x_3 + p_1x_2x_3 + p_3x_1x_2 + q_2x_1x_3 − p_1p_3x_2 − q_2p_3x_1 − p_1q_2x_3 + p_1p_3q_2 ≥ 0.

Of course, the more implied constraints are added, the tighter the relaxation is, but the larger the problem becomes. In practice it is enough to add just a few implied constraints.

With bounds computed by RLT relaxation it is not difficult to devise a convergent rectangular branch and bound algorithm for polynomial optimization.
To ease the reasoning in the sequel it is convenient to write a vector α = (α_1, α_2,…,α_n) ∈ A as a sequence of nonnegative integers between 0 and n as follows: α_1 numbers 1, then α_2 numbers 2, …, then α_n numbers n. For example, for n = 4 the vector α = (2,3,1,2) is written as J(α) = 11222344. By A* denote the set of all J(α), α ∈ A.
Lemma 12.1 Consider a box M = [p,q] ⊂ [a,b] and let (x, y) be a feasible solution to LR(P_M). If x_l = p_l then

y_{J∪{l}} = p_l y_J   ∀J ∈ A*, l ∈ {0,1,…,n}.      (12.4)

Similarly, if x_l = q_l then

y_{J∪{l}} = q_l y_J   ∀J ∈ A*, l ∈ {0,1,…,n}.      (12.5)

Proof Consider first the case x_l = p_l. For |J| = 1 consider any j ∈ {0,1,…,n}. If (x, y) is feasible to LR(P_M) then (x, y) satisfies

[(x_l − p_l)(x_j − p_j)]_L = y_{lj} − p_j x_l − p_l x_j + p_l p_j ≥ 0,
[(x_l − p_l)(q_j − x_j)]_L = −y_{lj} + p_l x_j + q_j x_l − p_l q_j ≥ 0,

hence, noting that x_l = p_l,

p_j(x_l − p_l) + p_l x_j ≤ y_{lj} ≤ q_j(x_l − p_l) + p_l x_j,

and consequently y_{lj} = p_l x_j. Thus (12.4) is true for |J| = 1.

Arguing by induction, suppose (12.4) true for |J| = 1,…,t−1 and consider |J| = t, where 2 ≤ t ≤ |A*|. For any j ∈ {0,1,…,n}, if (x, y) is feasible to LR(P_M) then

[(x_j − p_j)(x_l − p_l) ∏_{r∈J\{j}} (x_r − p_r)]_L ≥ 0,
[(x_l − p_l)(q_j − x_j) ∏_{r∈J\{j}} (x_r − p_r)]_L ≥ 0.

By writing ∏_{r∈J\{j}} (x_r − p_r) = ∏_{r∈J\{j}} x_r + f(x), where f(x) is a polynomial in x of degree no more than t − 2, we then have

(y_{J∪{l}} − p_l y_J) ≥ p_j(y_{J∪{l}\{j}} − p_l y_{J\{j}}) + p_l [x_j f(x)]_L − [x_l x_j f(x)]_L + p_j([x_l f(x)]_L − p_l [f(x)]_L),

(y_{J∪{l}} − p_l y_J) ≤ q_j(y_{J∪{l}\{j}} − p_l y_{J\{j}}) + p_l [x_j f(x)]_L − [x_l x_j f(x)]_L + q_j([x_l f(x)]_L − p_l [f(x)]_L).

By the induction hypothesis y_{J∪{l}\{j}} = p_l y_{J\{j}}, [x_l x_j f(x)]_L = p_l [x_j f(x)]_L, and [x_l f(x)]_L = p_l [f(x)]_L, so the right-hand sides of both the above inequalities are zero. Hence y_{J∪{l}} = p_l y_J. ⊓⊔
Corollary 12.1 Let M = [p,q] ⊂ [a,b]. If (x, y) is a feasible solution of LR(P_M) such that x is a corner of the box M, then x is a feasible solution of (P_M), i.e., a feasible solution of problem (P) in the box M.

Proof If x is a corner of the box M = [p,q], there exists a set I ⊂ {1,…,n} such that x_i = p_i ∀i ∈ I and x_j = q_j ∀j ∈ K := {1,…,n} \ I. Consider any α ∈ A and J = J(α). By Lemma 12.1, for every j ∈ J ∩ K we have y_J = q_j y_{J\{j}} = x_j y_{J\{j}}, and likewise y_J = p_i y_{J\{i}} = x_i y_{J\{i}} for every i ∈ J ∩ I. Applying this repeatedly yields y_J = (∏_{j∈J∩K} x_j)(∏_{i∈J∩I} x_i), i.e., y_α = x^α. Hence x ∈ M and is a feasible solution of (P). ⊓⊔
Observe that the above corollary is an extension of Proposition 9.9 for quadratic programming to polynomial optimization. Just like for quadratic programming, it gives rise to the following branching method in a BB procedure for solving (P). By D denote the feasible set of (P). Let M_k = [p^k, q^k] ⊂ [a,b] be the partition set selected for further exploration at iteration k of the BB procedure. Let β(M_k) be the lower bound for f_0(x) over D ∩ M_k computed by RLT relaxation, i.e.,

β(M_k) = l_0(x^k) + Σ_{α∈A} c_{0α} y^k_α,      (12.6)

where (x^k, y^k) ∈ R^n × R^{|A|} is an optimal solution of problem LR(P_{M_k}). If x^k is a corner of M_k, we are done by the above corollary. Otherwise, we bisect M_k via (x^k, i_k), where

i_k ∈ argmax{δ^k_i | i = 1,…,n},   δ^k_i = min{x^k_i − p^k_i, q^k_i − x^k_i}.

Theorem 12.1 A rectangular BB algorithm with bounding by RLT relaxation and branching as above either terminates after finitely many iterations, yielding an optimal solution x^k of (P), or it generates an infinite sequence {x^k} converging to an optimal solution x* of (P).

Proof The proof is analogous to that of Proposition 9.10. Specifically, we need only consider the case when the algorithm is infinite. By Theorem 6.3 the branching rule ensures that a filter of rectangles M_ν = [p^ν, q^ν] is generated such that (x^ν, y^ν) → (x*, y*), and x* is a corner of M* := [p*, q*] = ∩_ν [p^ν, q^ν]. Clearly (x*, y*) is a feasible solution of LR(P_{M*}), so by the above corollary x* is a feasible solution of (P). Since β(M_ν) = l_0(x^ν) + Σ_{α∈A} c_{0α} y^ν_α ≤ f_0(x) ∀x ∈ D, we have l_0(x*) + Σ_{α∈A} c_{0α} y*_α ≤ f_0(x) ∀x ∈ D. But x* ∈ D implies that l_0(x*) + Σ_{α∈A} c_{0α} y*_α = f_0(x*), hence f_0(x*) ≤ f_0(x) ∀x ∈ D. ⊓⊔

12.2.2 The SDP Relaxation Method

This method is due essentially to Lasserre (2001, 2002). By p* denote the optimal value of the polynomial optimization problem (P):

p* := min{f_0(x) | f_i(x) ≥ 0 (i = 1,…,m), x = (x_1, x_2,…,x_n) ∈ R^n}.

We show that a SDP relaxation of (P) can be constructed whose optimal value approximates p* as closely as desired. Consider the following matrices, called moment matrices:

M_1(x) = (1, x_1, …, x_n)^T (1, x_1, …, x_n)      (12.7)

= [ 1     x_1      x_2      …  x_n
    x_1   x_1^2    x_1x_2   …  x_1x_n
    x_2   x_1x_2   x_2^2    …  x_2x_n
    …     …        …        …  …
    x_n   x_1x_n   x_2x_n   …  x_n^2 ]      (12.8)

and, with u_2(x) := (1, x_1,…,x_n, x_1^2, x_1x_2,…,x_1x_n, x_2^2, x_2x_3,…,x_n^2)^T denoting the vector of all monomials of degree at most 2,

M_2(x) = u_2(x) u_2(x)^T      (12.9)

= [ 1      x_1       …  x_n     x_1^2      …  x_n^2
    x_1    x_1^2     …  x_1x_n  x_1^3      …  x_1x_n^2
    …      …         …  …       …          …  …
    x_n^2  x_1x_n^2  …  x_n^3   x_1^2x_n^2 …  x_n^4 ],      (12.10)

a matrix whose entries are all the monomials of degree at most 4,
and so on. We also set M_0(x) = 1. Clearly, each moment matrix M_k(x) is positive semidefinite. Setting M_{ik}(x) := f_i(x)M_k(x) for any given natural k, we then have

M_{i(k−1)}(x) ⪰ 0  ⟺  f_i(x) ≥ 0,

so the problem (P) is equivalent to

(Q_k)   min{f_0(x) | M_{i(k−1)}(x) ⪰ 0, i = 1,…,m}.

Upon the change of variables x^α → y_α the objective function in (Q_k) becomes a linear function of y = {y_α}, while the constraints become linear matrix inequalities in y = {y_α}. The problem (Q_k) then becomes a SDP, which is a relaxation of (P), referred to as the k-th order SDP relaxation of (P). We have the following fundamental result:

Theorem 12.2 Suppose the constraint set {x | f_i(x) ≥ 0, i = 1,2,…,m} is compact. Then

lim_{k→+∞} inf(Q_k) = min(P).      (12.11)

We omit the proof, which is based on a theorem of real algebraic geometry by Putinar (1993) on the representation of a positive polynomial as a sum of squares of polynomials. The interested reader is referred to Lasserre (2001, 2002) for more detail.

Remark 12.1 In the particular case of quadratic programming, with the constraints written in the form f_i(x) ≤ 0:

f_i(x) = ½⟨Q_i x, x⟩ + ⟨c_i, x⟩ + d_i,   i = 0, 1,…,m,

with symmetric matrices Q_i, the first order SDP relaxation is

min_{x,y} { ½⟨Q_0, y⟩ + ⟨c_0, x⟩ + d_0 | ½⟨Q_i, y⟩ + ⟨c_i, x⟩ + d_i ≤ 0, i = 1,…,m,
            [ y    x
              x^T  1 ] ⪰ 0 },

which can be shown to be the Lagrangian dual of the problem

max_{t∈R, u∈R^m_+} { t | [ Q(u)      c(u)
                           (c(u))^T  2(d(u) − t) ] ⪰ 0 },

where Q(u) = Q_0 + Σ_{i=1}^m u_i Q_i, c(u) = c_0 + Σ_{i=1}^m u_i c_i, d(u) = d_0 + Σ_{i=1}^m u_i d_i. Thus, for quadratic programming the first order SDP relaxation coincides with the Lagrange relaxation as computed at the end of Chap. 10.
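For the quadratic case of Remark 12.1, the first order SDP relaxation can be set up directly with a modeling tool. The following is a minimal sketch (ours, not the book's), assuming cvxpy with an SDP-capable solver is installed:

```python
# Sketch: first order SDP relaxation of min{f_0 | f_i <= 0, i = 1..m} with
# f_i(x) = 0.5*<Q_i x, x> + <c_i, x> + d_i, data given in lists Q, c, d.
import cvxpy as cp

def sdp1_bound(Q, c, d):
    n = Q[0].shape[0]
    Z = cp.Variable((n + 1, n + 1), symmetric=True)   # Z = [[Y, x], [x^T, 1]]
    Y, x = Z[:n, :n], Z[:n, n]
    cons = [Z >> 0, Z[n, n] == 1]
    cons += [0.5 * cp.trace(Q[i] @ Y) + c[i] @ x + d[i] <= 0
             for i in range(1, len(Q))]
    obj = 0.5 * cp.trace(Q[0] @ Y) + c[0] @ x + d[0]
    return cp.Problem(cp.Minimize(obj), cons).solve()
```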

12.3 Monotonic Optimization Approach

Although very ingenious, the RLT and SDP relaxation approaches suffer from two drawbacks.

First, they are very sensitive to the highest degree of the polynomials involved. Even for problems of small size but with polynomials of high degree they require introducing a huge number of additional variables.

Second, they only provide an approximate optimal value rather than an approximate optimal solution of the problem. In fact, in these methods the problem (P) is relaxed to a linear program or a SDP problem in the variables x ∈ R^n, y = {y_α, α ∈ A}, which becomes equivalent to (P) when the nonconvex constraints

y_α = x^α   ∀α ∈ A      (12.12)

are added. So if (x̂, ŷ) is an optimal solution to this relaxed problem, then x̂ will solve problem (P) if ŷ_α = x̂^α ∀α ∈ A. The idea, then, is to use a suitable BB procedure with bounding based on this kind of relaxation to generate a sequence (x̂^k, ŷ^k) such that max_{α∈A} |ŷ^k_α − (x̂^k)^α| → 0 as k → +∞. Under these conditions, given tolerances ε > 0, η > 0, it is expected that for large enough k one would have max_{α∈A} |ŷ^k_α − (x̂^k)^α| ≤ ε, while f_0(x̂^k) − η ≤ p*, where p* is the optimal value of problem (P). Such an x̂^k is accepted as an approximate optimal solution, though it is infeasible and may sometimes be quite far off the true optimal solution.

Because of these difficulties, if we are not so much interested in finding an optimal solution as in finding quickly a feasible solution, or possibly a feasible solution better than a given one, then neither the RLT nor the SDP approach can properly serve our purpose. By contrast, with the monotonic optimization approach to be presented below we will be able to get around these limitations.

12.3.1 Monotonic Reformulation

Recall that a function f: R^n → R is said to be increasing on a box [a,b] := {x ∈ R^n | a ≤ x ≤ b} if f(x′) ≤ f(x) for all x′, x satisfying a ≤ x′ ≤ x ≤ b. It is said to be a d.m. function on [a,b] if f(x) = f_1(x) − f_2(x), where f_1, f_2 are increasing on [a,b].

Clearly a polynomial of n variables with positive coefficients is increasing on the orthant R^n_+. Since every polynomial can be written as a difference of two polynomials with positive coefficients:

Σ_α c_α x^α = Σ_{c_α>0} c_α x^α − Σ_{c_α<0} (−c_α) x^α,

it follows that any polynomial of n variables is a d.m. function on R^n_+.
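In code, this d.m. decomposition is a one-line split of the coefficient list; the sketch below uses our own dictionary representation of a polynomial, not anything prescribed by the book.

```python
# Sketch: split a polynomial {exponent tuple: coefficient} into increasing
# parts u, v (both with positive coefficients) so that f = u - v on R^n_+.
def dm_split(poly):
    u = {al: co for al, co in poly.items() if co > 0}
    v = {al: -co for al, co in poly.items() if co < 0}
    return u, v

# Example: f(x) = 3*x1^2*x2 - 5*x2^3 + x1
u, v = dm_split({(2, 1): 3.0, (0, 3): -5.0, (1, 0): 1.0})
# u = {(2,1): 3.0, (1,0): 1.0},  v = {(0,3): 5.0}
```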


As shown in Chap. 11, if g_1(x),…,g_m(x) are d.m. functions on [a,b], then so are the functions

max{g_1(x),…,g_m(x)},   min{g_1(x),…,g_m(x)}.

Therefore, the system of polynomial constraints

f_i(x) ≥ 0,  i = 1,…,m,  x ∈ [a,b]

can be replaced by the single d.m. constraint

g(x) := min_{i=1,…,m} f_i(x) ≥ 0.
If f_0(x) = f_01(x) − f_02(x), where f_01(x), f_02(x) are polynomials with positive coefficients, then the problem (P) can be written as

minimize f_01(x) + t − f_02(b)
s.t. f_j(x) ≥ 0,  j = 1,…,m,
     f_02(x) + t − f_02(b) ≥ 0,
     a ≤ x ≤ b,  0 ≤ t ≤ f_02(b) − f_02(a),

where the objective function (x, t) ↦ f_01(x) + t − f_02(b) is, up to an additive constant, a polynomial with positive coefficients.

It then follows that, by introducing an additional variable if necessary and changing the notation, we can convert any polynomial programming problem (P) into the form

min{f(x) | g(x) ≥ 0, x ∈ [a,b]},      (Q)

where f(x) is an increasing polynomial (a polynomial with positive coefficients), while

g(x) = min_{j=1,…,m} {u_j(x) − v_j(x)},      (12.13)

and u_j(x), v_j(x) are increasing polynomials.

12.3.2 Essential Optimal Solution

From now on we assume that the original polynomial programming problem (P) has been converted to the form (Q).

In general the feasible set D of (Q) is not robust; it may contain isolated points. Therefore, as argued in Sect. 11.3, to avoid possible numerical instability, instead of trying to compute a global optimal solution of (Q) one should look for the best nonisolated feasible solution. This motivates the following definitions:

A nonisolated feasible solution x* of (Q) is called an essential optimal solution if f(x*) ≤ f(x) for all nonisolated feasible solutions x of (Q), i.e., if

f(x*) = min{f(x) | x ∈ S*},

where S* denotes the set of all nonisolated feasible solutions of (Q). Assume

{x ∈ [a,b] | g(x) > 0} ≠ ∅.      (12.14)

For ε ≥ 0, an x ∈ [a,b] satisfying g(x) ≥ ε is called an ε-essential feasible solution, and a nonisolated feasible solution x̄ of (Q) is called an essential ε-optimal solution if it satisfies

f(x̄) − ε ≤ inf{f(x) | g(x) ≥ ε, x ∈ [a,b]}.      (12.15)

Clearly, for ε = 0, a nonisolated feasible solution which is essential ε-optimal is essential optimal.
following subproblem of incumbent transcending:
.SP / Given a real number  f .a/ (the incumbent value), find a nonisolated
feasible solution xO of .Q/ such that f .Ox/  ; or else establish that none such xO
exists.
Since f .x/ is increasing, if D f .b/ then an answer to the subproblem .SP /
would give a nonisolated feasible solution or else identify essential infeasibility
of the problem (a problem is called essentially infeasible if it has no nonisolated
feasible solution). If x is the best nonisolated feasible solution currently available
and D f .x/  "; then an answer to .SP / would give a new nonisolated feasible
solution xO with f .Ox/  f .x/  "; or else identify x as an essential "-optimal solution.
In connection with question .SP / consider the master problem:

.Q / maxfg.x/j f .x/  ; x 2 a; bg:

If g.a/ > 0 and D f .a/ then a solves .Q /: Therefore, barring this trivial case, we
can assume that

g.a/  0; f .a/ < : (12.16)

Since g and f are both increasing functions, problem .Q / can be solved very
efficiently, while, as it turns out, solving .Q / furnishes an answer to question .SP /.
More precisely,
Proposition 12.1 Under assumption (12.16):
(i) Any feasible solution x0 of .Q / such that g.x0 / > 0 is a nonisolated feasible
solution of (P) with f .x0 /  : In particular, if max.Q / > 0 then the optimal
solution xO of .Q / is a nonisolated feasible solution of .P/ with f .Ox/  :
(ii) If max.Q / < " for D f .x/  "; and x is a nonisolated feasible solution of .P/,
then it is an essential "-optimal solution of (P). If max(Q / < " for D f .b/;
then the problem .P/ is "-essentially infeasible.
Proof (i) In view of assumption (12.16), x0 a: Since f .a/ < ; and x0 a;
every x in the line segment joining a to x0 and sufficiently close to x0 satisfies
f .x/  f .x0 /  ; g.x/ > 0; i.e., is a feasible solution of .P/: Therefore, x0 is a
nonisolated feasible solution of .P/ satisfying f .x0 /  :
12.3 Monotonic Optimization Approach 445

(ii) If max.Q // < " then

" > supfg.x/j f .x/  ; x 2 a; bg;

so for every x 2 a; b satisfying g.x/  "; we must have f .x/ > D f .x/  ";
in other words,

infff .x/j g.x/  "; x 2 a; bg > f .x/  ":

This means that, if x is a nonisolated feasible solution, then it is an essential


"-optimal solution, and if D f .b/; then fx 2 a; bj g.x/  "g D ;; i.e., the
problem is "-essentially infeasible. t
u
On the basis of this proposition, the problem (Q) can be solved by proceeding according to the following SIT (Successive Incumbent Transcending) scheme:

1. Select a tolerance ε > 0. Start with γ = γ_0 for γ_0 = f(b).
2. Solve the master problem (Q_γ).
   - If max(Q_γ) > 0, an essential feasible solution x̂ with f(x̂) ≤ γ is found. Reset γ ← f(x̂) − ε and repeat.
   - Otherwise max(Q_γ) ≤ 0: no essential feasible solution x of (Q) exists such that f(x) ≤ γ. So if γ = f(x̂) − ε for some essential feasible solution x̂ of (Q), then x̂ is an essential ε-optimal solution; if γ = γ_0, problem (Q) is essentially infeasible.
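This scheme is just a loop around the master-problem solver. A schematic sketch (ours), with solve_master standing for any algorithm for (Q_γ), such as the RBB procedure developed below:

```python
# Sketch: the SIT scheme.  solve_master(gamma) returns (max(Q_gamma), argmax).
def sit_scheme(f, b, solve_master, eps):
    gamma, incumbent = f(b), None            # gamma_0 = f(b)
    while True:
        val, x_hat = solve_master(gamma)
        if val > 0:                          # x_hat nonisolated feasible, f(x_hat) <= gamma
            incumbent = x_hat
            gamma = f(x_hat) - eps           # transcend the incumbent
        else:                                # no feasible x with f(x) <= gamma remains
            return incumbent                 # essential eps-optimal, or None (infeasible)
```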
The key for implementing this conceptual scheme is an algorithm for solving the
master problem.

12.3.3 Solving the Master Problem

Recall that the master problem (Q_γ) is

max{g(x) | f(x) ≤ γ, x ∈ [a,b]}.

We will solve this problem by a special RBB (Reduce-Bound-and-Branch) algorithm involving three basic operations: reducing (the partition sets), bounding, and branching.

- Reducing: using valid cuts, reduce the size of each current partition set M = [p,q] ⊂ [a,b]. The box [p′,q′] obtained from M as a result of the cuts is referred to as a valid reduction of M and denoted by red[p,q].
- Bounding: for each current partition set M = [p,q], estimate a valid upper bound β(M) for the maximum of g(x) over the feasible portion of (Q_γ) contained in the box [p′,q′] = red[p,q].
- Branching: successively partition the initial box M_0 = [a,b], using an exhaustive subdivision rule or any subdivision consistent with the bounding.
a. Valid Reduction

At a given stage of the RBB Algorithm for (Q_γ), let [p,q] ⊂ [a,b] be a box generated during the partitioning procedure and still of interest. The search for a nonisolated feasible solution of (Q) in [p,q] such that f(x) ≤ γ can then be restricted to the set B_γ ∩ [p,q], where

B_γ := {x | γ − f(x) ≥ 0, g(x) ≥ 0}.      (12.17)

Since g(x) = min_{j=1,…,m} {u_j(x) − v_j(x)} with u_j(x), v_j(x) increasing polynomials (see (12.13)), we can also write

B_γ = {x | f(x) ≤ γ, u_j(x) − v_j(x) ≥ 0, j = 1,…,m}.

The reduction operation aims at replacing the box [p,q] with a smaller box [p′,q′] ⊂ [p,q] without losing any point x ∈ B_γ ∩ [p,q], i.e., such that

B_γ ∩ [p′,q′] = B_γ ∩ [p,q].

The box [p′,q′] satisfying this condition is referred to as a valid reduction of [p,q] and denoted by red[p,q].

Let e^i be the i-th unit vector of R^n and, for any continuous function φ(t): [0,1] → R such that φ(0) < 0 < φ(1), let N(φ) denote the largest value t ∈ (0,1) satisfying φ(t) = 0. Also, for every i = 1,…,n let q^i = p + (q_i − p_i)e^i. By Proposition 11.13 applied to problem (Q_γ) we have

red[p,q] = ∅                if f(p) > γ or min_j {u_j(q) − v_j(p)} < 0,
red[p,q] = [p′,q′]          if f(p) ≤ γ and min_j {u_j(q) − v_j(p)} ≥ 0,

where

q′ = p + Σ_{i=1}^n α_i(q_i − p_i)e^i,   p′ = q′ − Σ_{i=1}^n β_i(q′_i − p_i)e^i,

with

α_i = 1 if f(q^i) − γ ≤ 0;  α_i = N(φ_i) if f(q^i) − γ > 0,  where φ_i(t) = f(p + t(q_i − p_i)e^i) − γ;

β_i = 1 if min_j {u_j(q′ − (q′_i − p_i)e^i) − v_j(p)} ≥ 0;  β_i = N(ψ_i) otherwise,  where ψ_i(t) = min_j {u_j(q′ − t(q′_i − p_i)e^i) − v_j(p)}.

Remark 12.2 It is easily verified that the box [p′,q′] = red[p,q] still satisfies

f(p′) ≤ γ,   min_j {u_j(q′) − v_j(p′)} ≥ 0.
b. Valid Bounds

Given a box M := [p,q], which is supposed to have been reduced, we want to compute an upper bound β(M) for

max{g(x) | f(x) ≤ γ, x ∈ [p,q]}.

Since g(x) = min_{j=1,…,m} {u_j(x) − v_j(x)} and u_j(x), v_j(x) are increasing, an obvious upper bound is

β(M) = min_{j=1,…,m} [u_j(q) − v_j(p)].      (12.18)

Though very simple, this bound ensures convergence of the algorithm, as will be seen shortly. However, for a better performance of the procedure, tighter bounds can be computed using, for instance, the following:

Lemma 12.2 (i) If g(p) > 0 and f(p) < γ, then p is a nonisolated feasible solution with f(p) < γ.
(ii) If f(q) > γ and x(M) = p + λ̄(q − p) with λ̄ = max{λ | f(p + λ(q − p)) ≤ γ}, z^i = q + (x_i(M) − q_i)e^i, i = 1,…,n, then an upper bound of g(x) over all x ∈ [p,q] satisfying f(x) ≤ γ is

β(M) = max_{i=1,…,n} min_{j=1,…,m} {u_j(z^i) − v_j(p)}.

Proof (i) Obvious.
(ii) Let M_i = [p, z^i] = {x | p ≤ x ≤ z^i} = {x ∈ [p,q] | p_i ≤ x_i ≤ x_i(M)}. From the definition of x(M) it is clear that f(z) > f(x(M)) = γ for all z = p + λ(q − p) with λ > λ̄. Since for every x > x(M) there exists z = p + λ(q − p) with λ > λ̄ such that x ≥ z, it follows that f(x) ≥ f(z) > γ for all x > x(M). So {x ∈ [p,q] | f(x) ≤ γ} ⊂ [p,q] \ {x | x > x(M)}. On the other hand, noting that {x | x > x(M)} = ∩_{i=1}^n {x | x_i > x_i(M)}, we can write [p,q] \ ∩_{i=1}^n {x | x_i > x_i(M)} = ∪_{i=1,…,n} {x ∈ [p,q] | p_i ≤ x_i ≤ x_i(M)} = ∪_{i=1,…,n} M_i. Since β(M_i) = min_{j=1,…,m} [u_j(z^i) − v_j(p)] ≥ max{g(x) | x ∈ M_i}, it follows that β(M) = max{β(M_i) | i = 1,…,n} ≥ max{g(x) | x ∈ ∪_{i=1,…,n} M_i} ≥ max{g(x) | x ∈ B_γ ∩ [p,q]}. ⊓⊔
Remark 12.3 Each box [p, z^i] can be reduced by the method presented above. If [p′^i, q′^i] = red[p, z^i], i = 1,…,n, then without much extra effort we can have a more refined upper bound, namely

β(M) = max_{i=1,…,n} { min_{j=1,…,m} [u_j(q′^i) − v_j(p′^i)] }.

The points z^i constructed as in (ii) determine a set Z := ∪_{i=1}^n [p, z^i] containing {x ∈ [p,q] | f(x) ≤ γ}. Such a set Z is a polyblock with vertex set {z^i, i = 1,…,n} (see Sect. 11.1). The above procedure thus amounts to constructing a polyblock Z ⊃ {x ∈ [p,q] | f(x) ≤ γ}, which is possible because f(x) is increasing. To have a tighter bound, one can even construct a sequence of polyblocks Z_1 ⊃ Z_2 ⊃ ⋯ approximating the set B_γ ∩ [p,q] more and more closely. For the details of this construction the interested reader is referred to Tuy (2000a). By using polyblock approximation one could compute a bound as tight as we wish; since, however, the computation cost increases rapidly with the quality of the bound, a trade-off must be made, so practically just one approximating polyblock as in the above lemma is used.

Remark 12.4 In many cases, efficient bounds can also be computed by exploiting additional structure, such as partial convexity, aside from monotonicity properties (Tuy 2007b). For example, assuming, without loss of generality, that p ∈ R^n_{++}, i.e., p_i > 0, i = 1,…,n, and using the transformation t_i = log x_i, i = 1,…,n, one can write each monomial x^α as exp⟨t, α⟩, and each polynomial f_k(x) = Σ_{α∈A_k} c_{kα} x^α as a function φ_k(t) = Σ_{α∈A_k} c_{kα} exp⟨t, α⟩. Since, as can easily be seen, exp⟨t, α⟩ with α ∈ R^n_+ is an increasing convex function of t, each polynomial in x becomes a d.c. function of t, and the subproblem over [p,q] becomes a d.c. optimization problem under d.c. constraints. Here each d.c. function is a difference of two increasing convex functions of t on the box [log p, log q], where log p denotes the vector (log p_1,…,log p_n). A bound over [p,q] can then be computed by exploiting simultaneously the d.c. and d.m. structures.

c. A Robust Algorithm

Incorporating the above RBB procedure for solving (Q_γ) into the SIT scheme yields the following robust algorithm for solving (Q):

SIT Algorithm for (Q)

Initialization. If no feasible solution is known, let γ = f(b); otherwise, let x̄ be the best nonisolated feasible solution available and γ = f(x̄) − ε. Let P_1 = {M_1}, M_1 = [a,b], R_1 = ∅. Set k = 1.

Step 1. For each box M ∈ P_k:
- Compute its valid reduction redM;
- Delete M if redM = ∅;
- Replace M by redM if redM ≠ ∅;
- If redM = [p′,q′], then compute an upper bound β(M) for g(x) over the feasible solutions in redM (β(M) must satisfy β(M) ≤ min_{j=1,…,m} [u_j(q′) − v_j(p′)]; see (12.18)). Delete M if β(M) < 0.
Step 2. Let P′_k be the collection of boxes that results from P_k after completion of Step 1. Let R′_k = R_k ∪ P′_k.
Step 3. If R′_k = ∅, then terminate: x̄ is an essential ε-optimal solution of (Q) if γ = f(x̄) − ε, or the problem (Q) is essentially infeasible if γ = f(b).
Step 4. If R′_k ≠ ∅, let [p^k, q^k] := M_k ∈ argmax{β(M) | M ∈ R′_k} and β_k = β(M_k).
Step 5. If β_k < ε, then terminate: x̄ is an essential ε-optimal solution of (Q) if γ = f(x̄) − ε, or the problem (Q) is ε-essentially infeasible if γ = f(b).
Step 6. If β_k ≥ ε, compute λ_k = max{λ | f(p^k + λ(q^k − p^k)) − γ ≤ 0} and let

x^k = p^k + λ_k(q^k − p^k).

6a) If g(x^k) > 0, then x^k is a new nonisolated feasible solution of (Q) with f(x^k) ≤ γ: if g(p^k) < 0, compute the point x̃^k where the line segment joining p^k to x^k meets the surface g(x) = 0 and reset x̄ ← x̃^k; otherwise, reset x̄ ← p^k. In either case reset γ = f(x̄) − ε and go to Step 7.
6b) If g(x^k) ≤ 0, go to Step 7 with x̄ unchanged.
Step 7. Divide M_k into two subboxes by the standard bisection (or any bisection consistent with the bounding M ↦ β(M)). Let P_{k+1} be the collection of these two subboxes of M_k, and R_{k+1} = R′_k \ {M_k}. Increment k and return to Step 1.
Proposition 12.2 The above algorithm terminates after finitely many steps, yielding either an essential ε-optimal solution of (Q), or an evidence that the problem is essentially infeasible.

Proof Since any feasible solution x with f(x) ≤ γ = f(x̄) − ε must lie in some box M ∈ R′_k, the event R′_k = ∅ implies that no such solution exists, hence the conclusion in Step 3. If Step 5 occurs, so that β_k < ε, then max(Q_γ) < ε, hence the conclusion in Step 5 (see Proposition 12.1). Thus the conclusions in Step 3 and Step 5 are correct. It remains to show that either Step 3 (R′_k = ∅) or Step 5 (β_k < ε) must occur for sufficiently large k. To this end, observe that in Step 6, since f(p^k) ≤ γ (Remark 12.2), the point x^k exists and satisfies f(x^k) ≤ γ, so if g(x^k) > 0, then by Proposition 12.1, x^k is a nonisolated feasible solution with f(x^k) ≤ f(x̄) − ε, justifying Step 6a. Suppose now that the algorithm is infinite. Since each occurrence of Step 6a decreases the current best value at least by ε > 0, while f(x) is bounded below, it follows that Step 6a cannot occur infinitely often. Consequently, for all k sufficiently large, x̄ is unchanged and g(x^k) ≤ 0, while β_k ≥ ε. But, as k → +∞, we have, by exhaustiveness of the subdivision, diam M_k → 0, i.e., ‖q^k − p^k‖ → 0. Denote by x̃ the common limit of q^k and p^k as k → +∞. Since

ε ≤ β_k ≤ min_{j=1,…,m} [u_j(q^k) − v_j(p^k)],

it follows that

ε ≤ lim_{k→+∞} β_k ≤ min_{j=1,…,m} [u_j(x̃) − v_j(x̃)] = g(x̃).

But by continuity, g(x̃) = lim_{k→+∞} g(x^k) ≤ 0, a contradiction. Therefore, the algorithm must be finite. ⊓⊔

Remark 12.5 The SIT Algorithm works its way to the optimum through a sequence
of better and better nonisolated solutions. If for some reason the algorithm has to be
stopped prematurely, some reasonably good feasible solution may have been already
obtained. This is a major advantage of it over most existing algorithms, which may
be useless when stopped prematurely.

12.3.4 Equality Constraints

So far we assumed (12.14), so that the feasible set has a nonempty interior. Consider now the case when there are equality constraints, e.g.,

h_l(x) = 0,  l = 1,…,s.      (12.19)

First, if some of these constraints are linear, they can be used to eliminate certain variables. Therefore, without loss of generality we can assume that all the constraints (12.19) are nonlinear. Since, however, in the most general case one cannot expect to compute a solution of a nonlinear system of equations in finitely many steps, one should be content with an approximate system

−δ ≤ h_l(x) ≤ δ,  l = 1,…,s,

where δ > 0 is the tolerance. In other words, a set of constraints of the form

g_j(x) ≥ 0,  j = 1,…,m,
h_l(x) = 0,  l = 1,…,s,

should be replaced by the approximate system

g_j(x) ≥ 0,  j = 1,…,m,
h_l(x) + δ ≥ 0,  l = 1,…,s,
−h_l(x) + δ ≥ 0,  l = 1,…,s.

The method presented in the previous sections can then be applied to the resulting approximate problem. With g(x) = min_{j=1,…,m} g_j(x) and h(x) = max_{l=1,…,s} |h_l(x)|, the required assumption is, instead of (12.14),

{x ∈ [a,b] | g(x) > 0, −h(x) + δ > 0} ≠ ∅.

12.3.5 Discrete Polynomial Optimization

The above approach can be extended without difficulty to the class of discrete (in particular, integer) polynomial optimization problems. This can be done along the same lines as the extension of monotonic optimization methods to discrete monotonic optimization.

12.3.6 Signomial Programming

A large class of problems called signomial programming or generalized geometric programming can be considered, which includes problems of the form (P) where each vector α = (α_1,…,α_n) may involve rational nonintegral components, i.e., α may be an arbitrary vector in R^n_+. The above solution method, via conversion of polynomial programming into monotonic optimization, can be applied without modification to signomial programming.

12.4 An Illustrative Example

We conclude with an illustrative example.

Minimize (3 + x_1x_3)(x_1x_2x_3x_4 + 2x_1x_3 + 2)^{2/3} subject to

3(2x_1x_2 + 3x_1x_2x_4)(2x_1x_3 + 4x_1x_4 − x_2)
− (x_1x_3 + 3x_1x_2x_4)(4x_3x_4 + 4x_1x_3x_4 + x_1x_3 − 4x_1x_2x_4)^{1/3}
+ 3(x_4 + 3x_1x_3x_4)(3x_1x_2x_3 + 3x_1x_4 + 2x_3x_4 − 3x_1x_2x_4)^{1/4} ≤ 309.219315,

2(3x_3 + 3x_1x_2x_3)(x_1x_2x_3 + 4x_2x_4 − x_3x_4)^2
+ (3x_1x_2x_3)(3x_3 + 2x_1x_2x_3 + 3x_4)^4 − (x_2x_3x_4 + x_1x_3x_4)(4x_1 − 1)^{3/4}
− 3(3x_3x_4 + 2x_1x_3x_4)(x_1x_2x_3x_4 + x_3x_4 − 4x_1x_2x_3 − 2x_1)^4 ≤ 78243.910551,

3(4x_1x_3x_4)(2x_4 + 2x_1x_2 − x_2 − x_3)^2
+ 2(x_1x_2x_4 + 3x_1x_3x_4)(x_1x_2 + 2x_2x_3 + 4x_2 − x_2x_3x_4 − x_1x_3)^4 ≤ 9618,

0 ≤ x_i ≤ 5,  i = 1, 2, 3, 4.

This problem would be difficult to solve by the RLT or Lasserre's method, since a very large number of additional variables would have to be introduced.

For ε = 0.01, after 53 cycles of incumbent transcending, the SIT Algorithm found the essential ε-optimal solution

x_essopt = (4.994594, 0.020149, 0.045424, 4.928073)

with essential ε-optimal value 5.906278 at iteration 667, and confirmed its essential ε-optimality at iteration 866.

12.5 Exercises

1. Solve the problem

minimize 3x_1 + 4x_2, s.t.
x_1^2 x_2 + 2x_2 + x_3 = 5;  x_1^2 x_2^2 + x_2 + x_4 = 3;
x_1 ≥ 0, x_2 ≥ 0;  x_1x_2 − x_1x_2^2 ≥ 1;  0 ≤ x_1x_2 ≤ 2;  0 ≤ x_1x_2^2 ≤ 2.

2. Consider the problem

Minimize 4(x_1^2 x_3 + 2x_1^2 x_2 x_3^2 x_5 + 2x_1^2 x_2 x_3)(5x_1^2 x_3 x_4^2 x_5 + 3x_2)^{3/5}
+ 3(2x_4^2 x_5^2)(4x_1^2 x_4 + 4x_2 x_5)^{5/3}
subject to

2(2x_1 x_5 + 5x_1^2 x_2 x_4^2 x_5)(3x_1 x_4 x_5^2 + 5 + 4x_3 x_5^2)^{1/2} ≤ 7684.470329,
2(2x_1 x_2^2 x_3 x_4^2)(2x_1 x_2 x_3 x_4^2 + 2x_2 x_4^2 x_5 − x_1^2 x_5^2)^{3/2} ≤ 1286590.314422,
0 ≤ x_i ≤ 5,  i = 1,…,5.

With tolerance ε = 0.01, solve this problem by the SIT monotonic optimization algorithm.
Hints: the essential ε-optimal solution is

x = (4.987557, 4.984973, 0.143546, 1.172267, 0.958926)

with objective function value 28766.057367.


Chapter 13
Optimization Under Equilibrium Constraints

In this closing chapter we study optimization problems with nonconvex constraints of a particular type called equilibrium constraints. This class of constraints arises from numerous applications in natural sciences, engineering, and economics.

13.1 Problem Formulation

Let there be given two closed subsets C, D of R^n, R^m, respectively, and a function F: C × D → R. As defined in Chap. 3, Sect. 3.3, an equilibrium point of the system (C, D, F) is a point z ∈ R^n such that

z ∈ C,   F(z, y) ≥ 0  ∀y ∈ D.

Consider now a real-valued function f(x, z) on R^n × R^m together with a nonempty closed set U ⊂ R^n × R^m, and two set-valued maps C, D defined on U, with closed values C(x), D(x) for every (x, z) ∈ U. The problem we are concerned with, commonly referred to as the Mathematical Programming problem with Equilibrium Constraints, can be formulated as follows:

(MPEC)   minimize f(x, z)
         subject to (x, z) ∈ U,  z ∈ C(x),
                    F(z, y) ≥ 0  ∀y ∈ D(x).

Setting S(x) = {z ∈ C(x) | F(z, y) ≥ 0 ∀y ∈ D(x)}, the equilibrium constraints can also be written as

z ∈ S(x).      (13.1)
For every fixed x ∈ X := {x ∈ R^n | (x, z) ∈ U for some z ∈ R^m}, S(x) is the set of all equilibrium points of the system (C(x), D(x), F). So the problem can be viewed as a game (called a Stackelberg game) in which a distinctive player is the leader and the remaining players are the followers. The leader has an objective function f(x, z) depending on two variables x ∈ R^n and z ∈ R^m. The leader controls the variable x and, to each decision x ∈ X by the leader, the followers react by choosing a value of z ∈ C(x) achieving equilibrium of the system (C(x), D(x), F). Supposing that f(x, z) is a loss, the leader's objective is to minimize f(x, z) under these conditions. In this interpretation the variable x under the leader's control is also referred to as the upper level variable, and the variable z under the followers' control as the lower level variable.

The most important cases that occur in the applications are the following:

1. S is generated by an optimization problem:

S(x) = argmin{g(x, y) | y ∈ D(x)};      (13.2)

then F(z, y) = g(x, y) − g(x, z).

2. S is generated by a variational inequality:

S(x) = {z ∈ C(x) | ⟨g(z), y − z⟩ ≥ 0 ∀y ∈ D(x)};      (13.3)

then F(z, y) = ⟨g(z), y − z⟩.

3. S is generated by a complementarity problem:

S(x) = {z ∈ R^n_+ | ⟨F(z), y − z⟩ ≥ 0 ∀y ∈ R^n_+}
     = {z ∈ R^n_+ | F(z) ≥ 0, ⟨F(z), z⟩ = 0};      (13.4)

then C(x) = D(x) = R^n_+ and F(z, y) = ⟨F(z), y − z⟩. Although this case can be reduced to the previous one, it deserves special interest, as it can be shown to include the important case when S is generated by a Nash equilibrium problem.

There is no need to dwell on the difficulty of MPEC. Finding just an equilibrium point is already a hard problem, let alone solving an optimization problem with an equilibrium constraint. In spite of that, the problem has attracted a great deal of research, due to its numerous important applications in natural sciences, engineering, and economics.

Quite understandably, in the early period, aside from heuristic algorithms for solving certain classes of MPECs (Friesz et al. 1992), many solution methods proposed for MPECs have been essentially local optimization methods. Few papers have been concerned with finding a global optimal solution to MPEC.

Among the most important works devoted to MPECs we should mention the monographs by Luo et al. (1996) and Outrata et al. (1998). Both monographs describe a large number of MPECs arising from applications in diverse fields of natural sciences, engineering, and economics.
In Luo et al. (1996), under suitable assumptions including a constraint qualification, MPEC is converted into a standard optimization problem. First and second order optimality conditions are then derived, and based on these a penalty interior point algorithm and a descent algorithm are developed for solving the problem. Particular attention is also given to MPAECs, which are mathematical programs with affine equilibrium constraints.

The monograph by Outrata, Kocvara, and Zowe presents a nonsmooth approach to MPECs where the equilibrium constraint is given in the form of a variational or quasivariational inequality. Specifically, using the implicit function locally defined by the equilibrium constraint, the problem is converted into a mathematical program with a nonsmooth objective which can then be solved by the bundle method of nonsmooth optimization.

In the following we will present a global optimization approach to MPEC. Just like in the approaches of Luo–Pang–Ralph and Outrata–Kocvara–Zowe, the basic idea is to convert MPEC into a standard optimization problem. The difference, however, is that the latter will be an essentially global optimization problem belonging to a class amenable to efficient solution methods, such as dc or monotonic optimization. As it turns out, most MPECs of interest can be approached in that way. The simplest among these are bilevel programming and optimization over the efficient set, for which the equilibrium constraint is an ordinary or multiobjective optimization problem.

13.2 Bilevel Programming

In its general form the Bilevel Programming Problem can be formulated as follows:
\[
\text{(GBP)}\quad \min F(x,y) \ \ \text{s.t.}\ \ g_1(x,y) \le 0,\ x \in \mathbb{R}^{n_1}_+,\ y \text{ solves }
R(x):\ \min\{d(y) \mid g_2(C(x),y) \le 0,\ y \in \mathbb{R}^{n_2}_+\},
\]
where it is assumed that:

(A1) The map $C: \mathbb{R}^{n_1} \to \mathbb{R}^m$ and the functions $F(x,y),\ d(y),\ g_1(x,y),\ g_2(u,y)$ are continuous.
(A2) The set $D := \{(x,y) \in \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+ \mid g_1(x,y) \le 0,\ g_2(C(x),y) \le 0\}$ is nonempty and compact.

Also an assumption common to a wide class of (GBP) is:

(A3) For every fixed $y$ the function $u \mapsto g_2(u,y)$ is decreasing.

Assumptions (A1) and (A2) are quite natural. Assumption (A3) means that
the function $g_2(C(x),y)$ decreases when the action $C(x)$ of the leader increases
(component-wise). By replacing $u$ with $v = -u$ if necessary, this assumption also
covers the case when $g_2(C(x),y)$ increases with $C(x)$.

If all the functions $F(x,y),\ g_1(x,y),\ g_2(C(x),y),\ d(y)$ are convex, the problem is
called a Convex Bilevel Program (CBP); if all these functions are affine, the
problem is called a Linear Bilevel Program (LBP).

Specifically, the Linear Bilevel Programming problem can be formulated as
\[
\text{(LBP)}\quad \min L(x,y)\ \ \text{s.t.}\ \ A_1x + B_1y \ge c^1,\ x \in \mathbb{R}^{n_1}_+,\ y \text{ solves }
\min\{\langle d,y\rangle \mid A_2x + B_2y \ge c^2,\ y \in \mathbb{R}^{n_2}_+\},
\]
where $L(x,y)$ is a linear function, $d \in \mathbb{R}^{n_2}$, $A_i \in \mathbb{R}^{m_i \times n_1}$, $B_i \in \mathbb{R}^{m_i \times n_2}$, $c^i \in \mathbb{R}^{m_i}$,
$i = 1,2$. In this case $g_1(x,y) = \max_{i=1,\dots,m_1}(c^1 - A_1x - B_1y)_i \le 0$ and $g_2(u,y) =
\max_{i=1,\dots,m_2}(c^2 - u - B_2y)_i \le 0$, while $C(x) = A_2x$.

Clearly all assumptions (A1), (A2), (A3) hold for (LBP) provided the set $D :=
\{(x,y) \in \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+ \mid A_1x + B_1y \ge c^1,\ A_2x + B_2y \ge c^2\}$ is a nonempty polytope. Also,
for every $x \ge 0$ satisfying $A_1x + B_1y \ge c^1$ for some $y$, the set $\{y \in \mathbb{R}^{n_2}_+ \mid A_2x + B_2y \ge c^2\}$ is
compact, so $R(x)$ is then solvable.
Originally formulated and studied as a mathematical program by Bracken
and McGill (1973, 1974), bilevel and, more generally, multilevel optimization has
become a subject of extensive research, owing to numerous applications in diverse
fields such as economic development policy, agricultural economics, road network
design, oil industry regulation, international water systems and flood control,
energy policy, traffic assignment, structural analysis in mechanics, etc. Multilevel
optimization is also useful for the study of hierarchical designs in many complex
systems, particularly in biological systems (Baer 1992; Vincent 1990).

Even (LBP) is an NP-hard nonconvex global optimization problem (Hansen et al.
1992). Despite the linearity of all functions involved, this is a difficult problem
fraught with pitfalls. Actually, several early published methods for its solution
turned out to be nonconvergent, incorrect, or convergent to a local optimum quite
far off the true optimum. Some of these errors have been reported and analyzed,
e.g., in Ben-Ayed (1993). A popular approach to (LBP) at that time was to use,
directly or indirectly, a reformulation of it as a single mathematical program which,
in most cases, was a linear program with an additional nonconvex constraint. It was
this peculiar additional nonconvex constraint that constituted the major source of
difficulty and was actually the cause of most common errors.

Whereas (LBP) has been extensively studied, the general bilevel programming
problem has received much less attention. Until two decades ago, most research on
(GBP) was limited to convex or quadratic bilevel programming problems, and/or
was mainly concerned with finding stationary points or local solutions (Bard 1988,
1998; Bialas and Karwan 1982; Candler and Townsley 1982; Migdalas et al. 1998;
Shimizu et al. 1997; Vicente et al. 1994; White and Anandalingam 1993). Global
optimization methods for (GBP) appeared much later, based on converting (GBP)
into a monotonic optimization problem (Tuy et al. 1992, 1994b; see also Tuy and
Ghannadan 1998).

13.2.1 Conversion to Monotonic Optimization

We show that under assumptions (A1)-(A3), (GBP) can be converted into a single-level
optimization problem which is a monotonic optimization problem.

In view of Assumptions (A1) and (A2) the set $\{(C(x), d(y)) \mid (x,y) \in D\}$ is
compact, so without loss of generality we can assume that this set is contained in
some box $[a,b] \subset \mathbb{R}^{m+1}_+$:
\[ a \le (C(x), d(y)) \le b \quad \forall (x,y) \in D. \tag{13.5} \]

Setting
\[ U = \{u \in \mathbb{R}^m \mid \exists (x,y) \in D:\ u \ge C(x)\}, \tag{13.6} \]
\[ W = \{(u,t) \in \mathbb{R}^m \times \mathbb{R} \mid \exists (x,y) \in D:\ u \ge C(x),\ t \ge d(y)\}, \tag{13.7} \]
we define
\[ \varphi(u) = \min\{d(y) \mid (x,y) \in D,\ u \ge C(x)\}, \tag{13.8} \]
\[ f(u,t) = \min\{F(x,y) \mid (x,y) \in D,\ u \ge C(x),\ t \ge d(y)\}. \tag{13.9} \]

Proposition 13.1 (i) The function $\varphi(u)$ is finite for every $u \in U$ and satisfies
$\varphi(u') \le \varphi(u)$ whenever $u' \ge u \in U$; in other words, $\varphi(u)$ is a decreasing
function on $U$.
(ii) The function $f(u,t)$ is finite on the set $W$. If $(u',t') \ge (u,t) \in W$ then
$(u',t') \in W$ and $f(u',t') \le f(u,t)$; in other words, $f(u,t)$ is a decreasing
function on $W$.
(iii) $u \in U$ for every $(u,t) \in W$.

Proof (i) If $u \in U$ then there exists $(x,y) \in D$ such that $C(x) \le u$; hence
$\varphi(u) < +\infty$. It is also obvious that $u' \ge u \in U$ implies $\varphi(u') \le \varphi(u)$.
(ii) By assumption (A2) the set $D$ of all $(x,y) \in \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+$ satisfying $g_1(x,y) \le 0$,
$g_2(C(x),y) \le 0$ is nonempty and compact. Then for every $(u,t) \in W$ the
feasible set of the problem determining $f(u,t)$ is nonempty and compact, hence
$f(u,t) < +\infty$ because $F(x,y)$ is continuous by (A1). Furthermore, if $(u',t') \ge
(u,t) \in W$, then obviously $(u',t') \in W$ and $f(u',t') \le f(u,t)$.
(iii) Obvious. ∎

As we just saw, assumption (A2) only serves to ensure the solvability of
problem (13.9) and hence finiteness of the function $f(u,t)$. Therefore, without any
harm it can be replaced by the weaker assumption:

(A2*) The set $W = \{(u,t) \in \mathbb{R}^m \times \mathbb{R} \mid \exists (x,y) \in D:\ u \ge C(x),\ t \ge d(y)\}$ is
compact.

In addition to (A1), (A2)/(A2*), and (A3) we now make a further
assumption:

(A4) The function $\varphi(u)$ is upper semi-continuous on $\operatorname{int} U$ while $f(u,t)$ is lower
semi-continuous on $\operatorname{int} W$.

Proposition 13.2 If all the functions $F(x,y),\ g_1(x,y),\ g_2(C(x),y),\ d(y)$ are convex,
then Assumption (A4) holds, with $\varphi(u),\ f(u,t)$ being convex functions.

Proof Observe that if $\varphi(u) = d(y)$ for $x$ satisfying $g_2(C(x),y) \le 0$, $C(x) \le u$,
and $\varphi(u') = d(y')$ for $x'$ satisfying $g_2(C(x'),y') \le 0$, $C(x') \le u'$, then $g_2(C(\lambda x +
(1-\lambda)x'), \lambda y + (1-\lambda)y') \le \lambda g_2(C(x),y) + (1-\lambda)g_2(C(x'),y') \le 0$, and $C(\lambda x + (1-
\lambda)x') \le \lambda C(x) + (1-\lambda)C(x') \le \lambda u + (1-\lambda)u'$, so that $(\lambda x + (1-\lambda)x', \lambda y + (1-\lambda)y')$
is feasible to the problem determining $\varphi(\lambda u + (1-\lambda)u')$; hence $\varphi(\lambda u + (1-\lambda)u') \le
d(\lambda y + (1-\lambda)y') \le \lambda d(y) + (1-\lambda)d(y') = \lambda\varphi(u) + (1-\lambda)\varphi(u')$. Therefore the
function $\varphi(u)$ is convex, and hence continuous, on $\operatorname{int} U$. The convexity, and hence
the continuity, of $f(u,t)$ on $\operatorname{int} W$ is proved similarly. ∎
Proposition 13.3 (GBP) is equivalent to the monotonic optimization problem
\[ \text{(Q)}\quad \min\{f(u,t) \mid t - \varphi(u) \le 0,\ (u,t) \in W\} \]
in the sense that $\min(\mathrm{GBP}) = \min(\mathrm{Q})$, and if $(\bar u, \bar t)$ solves (Q) then any optimal
solution $(\bar x, \bar y)$ of the problem
\[ \min\{F(x,y) \mid g_1(x,y) \le 0,\ g_2(\bar u, y) \le 0,\ \bar u \ge C(x),\ \bar t \ge d(y),\ x \ge 0,\ y \ge 0\} \]
solves (GBP).

Proof By Proposition 13.1 the function $f(u,t)$ is decreasing while $t - \varphi(u)$ is
increasing, so (Q) is indeed a monotonic optimization problem. If $(x,y)$ is feasible
to (GBP), then $g_1(x,y) \le 0$, $g_2(C(x),y) \le 0$, and taking $u = C(x)$, $t = d(y) =
\varphi(C(x))$ we have $\varphi(u) = t$; hence
\[
F(x,y) \ge \min\{F(x',y') \mid g_1(x',y') \le 0,\ g_2(C(x'),y') \le 0,\ u \ge C(x'),\ t \ge d(y'),\ x' \ge 0,\ y' \ge 0\}
= f(u,t) \ge \min\{f(u',t') \mid \varphi(u') - t' \ge 0\} = \min(\mathrm{Q}).
\]

Conversely, if $\varphi(u) - t \ge 0$, then the inequalities $u \ge C(x)$, $t \ge d(y)$ imply $d(y) \le
t \le \varphi(u) \le \varphi(C(x))$; hence
\[
f(u,t) = \min\{F(x,y) \mid g_1(x,y) \le 0,\ g_2(C(x),y) \le 0,\ u \ge C(x),\ t \ge d(y),\ x \ge 0,\ y \ge 0\}
\ge \min\{F(x,y) \mid g_1(x,y) \le 0,\ g_2(C(x),y) \le 0,\ \varphi(C(x)) \ge d(y)\}
= \min(\mathrm{GBP}).
\]

Consequently, $\min(\mathrm{GBP}) = \min(\mathrm{Q})$. The last assertion of the proposition readily
follows. ∎
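When the data are linear, as in (LBP) below, the functions $\varphi(u)$ and $f(u,t)$ of (13.8)-(13.9) are optimal values of linear programs. The following sketch (ours, under that linearity assumption and with placeholder array names) evaluates $\varphi(u)$ with scipy; $f(u,t)$ is obtained the same way by changing the objective and adding the row for $t \ge \langle d, y\rangle$.

```python
import numpy as np
from scipy.optimize import linprog

def phi(u, A1, B1, c1, A2, B2, c2, d):
    """phi(u) = min{ <d,y> : (x,y) in D, A2 x <= u }   (cf. (13.8))."""
    n1, n2 = A1.shape[1], B1.shape[1]
    obj = np.concatenate([np.zeros(n1), d])
    # linprog wants A_ub z <= b_ub; flip the ">=" rows defining D.
    A_ub = np.vstack([np.hstack([-A1, -B1]),
                      np.hstack([-A2, -B2]),
                      np.hstack([A2, np.zeros((A2.shape[0], n2))])])
    b_ub = np.concatenate([-c1, -c2, u])
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    return res.fun if res.success else np.inf   # +inf: u not in U
```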

13.2.2 Solution Method

For convenience of notation we set
\[ z = (u,t) \in \mathbb{R}^{m+1}, \qquad h(z) = t - \varphi(u). \tag{13.10} \]

The problem (Q) can then be rewritten as
\[ \text{(Q)}\quad \min\{f(z) \mid h(z) \le 0,\ z \in W\}, \]
where $f(z)$ is a decreasing function and $h(z)$ a continuous increasing function. The
feasible set of this monotonic optimization problem (Q) is a normal set, hence is
robust and can be handled without difficulty. As shown in Chap. 11, such a problem
(Q) can be solved either by the polyblock outer approximation method or by the
BRB (branch-reduce-and-bound) method. The polyblock method is practical only
for problems (Q) with small values of $m$ (typically $m \le 5$). The BRB method we
are going to present is more suitable for problems with fairly large values of $m$.

As its name indicates, the BRB method proceeds according to the standard
branch and bound scheme, with three basic operations: branching, reducing (the
partition sets), and bounding.

1. Branching consists in a successive rectangular partition of the initial box $M_0 =
[a,b] \subset \mathbb{R}^m \times \mathbb{R}$, following a chosen subdivision rule. As will be explained
shortly, branching should be performed upon the variables $u \in \mathbb{R}^m$, so the
subdivision rule is induced by a standard bisection upon these variables.
2. Reducing consists in applying valid cuts to reduce the size of the current partition
set $M = [p,q] \subset [a,b]$. The box $[p',q']$ that results from the cuts is referred to as
a valid reduction of $M$.
3. Bounding consists in estimating a valid lower bound $\beta(M)$ for the objective
function value $f(z)$ over the feasible portion contained in the valid reduction
$[p',q']$ of a given partition set $M = [p,q]$.

Throughout the sequel, for any vector $z = (u,t) \in \mathbb{R}^m \times \mathbb{R}$ we define $\hat z = u$.

a. Branching

Since an optimal solution $(\bar u, \bar t)$ of problem (Q) must satisfy $\bar t = \varphi(\bar u)$, it suffices
to determine the values of the variables $u$ at an optimal solution. This suggests
that instead of branching upon $z = (u,t)$ as usual, we should branch upon $u$,
according to a chosen subdivision rule, which may be the standard bisection or
the adaptive subdivision rule. As shown by Corollary 6.2, the standard subdivision
method is exhaustive, i.e., for any infinite nested sequence $M_k$, $k = 1,2,\dots$, such
that each $M_{k+1}$ is a child of $M_k$ via a bisection, we have $\operatorname{diam} M_k \to 0$. This property
guarantees convergence, but a rather slow one, because the method takes no account
of the current problem structure. By contrast, the adaptive subdivision method
exploits the current problem structure and owing to that can ensure much more rapid
convergence.
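Both subdivision rules reduce to splitting a box via a point $w$ and an index $j$; a minimal sketch (ours) of that common operation, with the standard rule (midpoint, longest edge) as a special case:

```python
def bisect_box(p, q, w, j):
    """Split [p, q] into two children along coordinate j at level w[j]."""
    q1, p2 = list(q), list(p)
    q1[j] = w[j]     # child 1: upper face lowered to w[j]
    p2[j] = w[j]     # child 2: lower face raised to w[j]
    return (p, q1), (p2, q)

def standard_bisection(p, q):
    """Standard rule: midpoint along the longest edge."""
    w = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    j = max(range(len(p)), key=lambda i: q[i] - p[i])
    return bisect_box(p, q, w, j)
```

In the adaptive rule described later, $w$ and $j$ are taken from the current bounding solution instead of the midpoint and longest edge.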

b. Valid Reduction

At a given stage of the BRB algorithm for (Q), a feasible solution $\bar z$ is available which
is the best known so far. Let $\gamma = f(\bar z)$, and let $[p,q] \subset [a,b]$ be a box generated
during the partitioning procedure which is still of interest. Since an optimal solution
of (Q) is attained at a point satisfying $h(z) := t - \varphi(u) = 0$, the search for a feasible
solution $z$ of (Q) in $[p,q]$ such that $f(z) \le \gamma$ can be restricted to the set $B_\gamma \cap [p,q]$,
where
\[ B_\gamma := \{z \mid f(z) \le \gamma,\ h(z) = 0\}. \tag{13.11} \]

The reduction aims at replacing the box $[p,q]$ with a smaller box $[p',q'] \subset [p,q]$
without losing any point of $B_\gamma \cap [p,q]$, i.e., such that
\[ B_\gamma \cap [p',q'] = B_\gamma \cap [p,q]. \]
A box $[p',q']$ satisfying this condition is referred to as a $\gamma$-valid reduction of $[p,q]$
and denoted by $\mathrm{red}_\gamma[p,q]$.

As usual $e^i$ denotes the $i$-th unit vector of $\mathbb{R}^{m+1}$, i.e., $e^i_i = 1$, $e^i_j = 0\ \forall j \ne i$.
Given any continuous increasing function $\varphi(\alpha): [0,1] \to \mathbb{R}$ such that $\varphi(0) <
0 < \varphi(1)$, let $N(\varphi)$ denote the largest value $\alpha \in (0,1)$ satisfying $\varphi(\alpha) = 0$. For
every $i = 1,\dots,m+1$ let $q^i := p + (q_i - p_i)e^i$.

Proposition 13.4 A $\gamma$-valid reduction of a box $[p,q]$ is given by the formula
\[
\mathrm{red}_\gamma[p,q] = \begin{cases} \emptyset, & \text{if } h(q) < 0 \text{ or } \gamma - f(q) < 0,\\
[p',q'], & \text{if } h(q) \ge 0 \text{ and } \gamma - f(q) \ge 0, \end{cases} \tag{13.12}
\]
where
\[ q' = p + \sum_{i=1}^{m+1} \alpha_i(q_i - p_i)e^i, \qquad p' = q' - \sum_{i=1}^{m+1} \beta_i(q'_i - p_i)e^i, \tag{13.13} \]
\[
\alpha_i = \begin{cases} 1, & \text{if } h(q^i) \le 0\\ N(\varphi_i), & \text{if } h(q^i) > 0 \end{cases}
\qquad \varphi_i(\alpha) = h(p + \alpha(q_i - p_i)e^i);
\]
\[
\beta_i = \begin{cases} 1, & \text{if } f(q'^i) - \gamma \le 0\\ N(\psi_i), & \text{if } f(q'^i) - \gamma > 0 \end{cases}
\qquad \psi_i(\beta) = f(q' - \beta(q'_i - p_i)e^i) - \gamma,
\]
with $q'^i := q' - (q'_i - p_i)e^i$.

Proof Follows from Proposition 11.13. ∎

Remark 13.1 As can be easily checked, the box $[p',q'] = \mathrm{red}_\gamma[p,q]$ still satisfies
$h(q') \ge 0$, $f(q') \le \gamma$.
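Since $\varphi_i$ and $\psi_i$ are increasing, $N(\cdot)$ can be located by bisection on $[0,1]$. The following sketch (ours, with a hypothetical tolerance) implements $N$ and the two cuts of Proposition 13.4, given callables h and f:

```python
import numpy as np

def largest_root(phi, tol=1e-9):
    """Largest alpha in (0,1) with phi(alpha) = 0, for increasing phi
    satisfying phi(0) < 0 < phi(1); plain bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0:
            hi = mid
        else:
            lo = mid
    return lo

def reduce_box(p, q, h, f, gamma):
    """gamma-valid reduction of [p, q] per Proposition 13.4 (sketch)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    if h(q) < 0 or gamma - f(q) < 0:
        return None                        # box deleted
    e = np.eye(len(p))
    qp = p.copy()
    for i in range(len(p)):                # upper cut -> q'
        qi = p + (q[i] - p[i]) * e[i]
        a = 1.0 if h(qi) <= 0 else largest_root(
            lambda t, i=i: h(p + t * (q[i] - p[i]) * e[i]))
        qp[i] = p[i] + a * (q[i] - p[i])
    pp = qp.copy()
    for i in range(len(p)):                # lower cut -> p'
        qpi = qp - (qp[i] - p[i]) * e[i]
        b = 1.0 if f(qpi) - gamma <= 0 else largest_root(
            lambda t, i=i: f(qp - t * (qp[i] - p[i]) * e[i]) - gamma)
        pp[i] = qp[i] - b * (qp[i] - p[i])
    return pp, qp
```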

c. Valid Bounds

Let $M := [p,q]$ be a partition set which is supposed to have been reduced, so that,
as mentioned in the above remark,
\[ h(q) \ge 0, \qquad f(q) \le \gamma. \tag{13.14} \]

Let us now examine how to compute a lower bound $\beta(M)$ for
\[ \min\{f(z) \mid z \in [p,q],\ h(z) = 0\}. \]

Since $f(z)$ is decreasing, an obvious lower bound is $f(q)$. We will shortly see that, to
ensure convergence of the BRB Algorithm, it suffices that the lower bounds satisfy
\[ \beta(M) \ge f(q). \]
We shall refer to such a bound as a valid lower bound.

Define $h_\gamma(z) := \min\{\gamma - f(z),\ h(z)\}$. Since $\gamma - f(z)$ and $h(z)$ are increasing, so
is $h_\gamma(z)$. Let
\[ G = \{z \in [p,q] \mid h(z) \le 0\}, \qquad H = \{z \in [p,q] \mid h_\gamma(z) \ge 0\}. \]

From (13.14), $h_\gamma(q) \ge 0$; so if $h(q) = 0$ then obviously $q$ is an exact minimizer of
$f(z)$ over the feasible points $z \in [p,q]$ that are at least as good as the current best,
and $f(q)$ can be used to update the current best.

Suppose now that $h(q) > 0$, and hence $h(p) < 0$. For each $z \in [p,q]$ such that
$z \in H \setminus G$ let $\pi_p(z)$ be the first point where the line from $z$ to $p$ meets the upper
boundary of $G$, i.e.,
\[ \pi_p(z) = z - \bar\lambda(z - p) \quad \text{with } \bar\lambda = N(\varphi),\ \varphi(\lambda) = h(z - \lambda(z - p)). \]
Obviously, $h(\pi_p(z)) = 0$.

Lemma 13.1 If $s = \pi_p(q)$, $v^i = q - (q_i - s_i)e^i$, $i = 1,\dots,m+1$, and $I = \{i \mid v^i \in H\}$,
then a valid lower bound over $M = [p,q]$ is
\[
\beta(M) = \min\{f(v^i) \mid i \in I\}
= \min_{i\in I} \min\{F(x,y) \mid g_1(x,y) \le 0,\ g_2(\hat v^i, y) \le 0,\ (C(x), d(y)) \le v^i,\ x \ge 0,\ y \ge 0\}.
\]

Proof This follows since the polyblock with vertices $v^i$, $i \in I$, contains all feasible
solutions $z = (u,t)$ still of interest in $[p,q]$. ∎

Remark 13.2 Each box $M_i = [p, v^i]$ can be reduced by the method presented
above. If $[p', v'^i] = \mathrm{red}_\gamma[p, v^i]$, $i = 1,\dots,m+1$, then without much extra effort we
can have the following more refined lower bound:
\[ \beta(M) = \min_{i\in I} f(v'^i), \qquad I = \{i \mid \tilde f(v^i) \ge 0\}, \]
where $\tilde f(z) := \gamma - f(z)$. The above procedure amounts to constructing a polyblock $Z \supset B_\gamma \cap [p,q] = \{z \in
[p,q] \mid h(z) \le 0 \le \tilde f(z)\}$, which is possible since $h(z)$ and $\tilde f(z)$ are increasing functions.

d. Improvements for the Case of CBP

In the case of (CBP) (Convex Bilevel Programming) the function $\varphi(u)$ is convex
by Proposition 13.2, so the function $h(u,t) = t - \varphi(u)$ is concave, and a rather tight
lower bound can be computed. In fact, since $h(z)$ is concave and since
the box $[p,q]$ has been $\gamma$-reduced, $h(q^i) = 0$, $i = 1,\dots,m+1$ [see (13.13)], the set
$\{z \in [p,q] \mid h(z) = 0\}$ is contained in the simplex $S := [q^1,\dots,q^{m+1}]$. Hence a lower
bound of $f(u,t)$ over $M = [p,q]$ is given by
\[ \beta(M) = \min\{f(z) \mid z \in S\}. \tag{13.15} \]

Taking account of (13.8)-(13.9), this convex program can be written as
\[
\min_{x,y,u,t} F(x,y)\ \ \text{s.t.}\ \ g_1(x,y) \le 0,\ g_2(C(x),y) \le 0,\
C(x) \le u,\ d(y) \le t,\ x \ge 0,\ y \ge 0,\ (u,t) \in [q^1,\dots,q^{m+1}]. \tag{13.16}
\]

We are now in a position to state the proposed monotonic optimization (MO)
algorithm for (Q). Recall that the problem is $\min\{f(z) \mid z \in [a,b],\ h(z) = 0\}$, and we assume (A1)-(A4),
so always $b \in W$, $\hat b = (b_1,\dots,b_m) \in U$, and $(\hat b, \varphi(\hat b))$ yields a feasible
solution.

MO Algorithm for (GBP)

Initialization. Start with $P_1 = \{M_1\}$, $M_1 = [a,b]$, $R_1 = \emptyset$. Let CBS be the best
feasible solution available and CBV (current best value) the corresponding value of
$f(z)$. (At least CBS $= (\hat b, \varphi(\hat b))$ and CBV $= f(\hat b, \varphi(\hat b))$.) Set $k = 1$.

Step 1. For each box $M \in P_k$ and for $\gamma =$ CBV:
(1) Compute the $\gamma$-valid reduction $\mathrm{red}_\gamma M$ of $M$;
(2) Delete $M$ if $\mathrm{red}_\gamma M = \emptyset$;
(3) Replace $M$ by $\mathrm{red}_\gamma M$ if $\mathrm{red}_\gamma M \ne \emptyset$;
(4) If $\mathrm{red}_\gamma M = [p,q]$, then compute a valid lower bound $\beta(M)$ for $f(z)$
over the feasible solutions in $M$.

Step 2. Let $P'_k$ be the collection of boxes that results from $P_k$ after completion
of Step 1. Update CBS and CBV. From $R_k$ remove all $M \in R_k$ such
that $\beta(M) \ge$ CBV and let $R'_k$ be the resulting collection. Let $S_k =
R'_k \cup P'_k$.

Step 3. If $S_k = \emptyset$, terminate: CBV is the optimal value and CBS (the feasible
solution $\bar z$ with $f(\bar z) =$ CBV) is an optimal solution.

Step 4. If $S_k \ne \emptyset$, select $M_k \in \operatorname{argmin}\{\beta(M) \mid M \in S_k\}$. Subdivide $M_k = [p^k, q^k]$
according to the adaptive subdivision rule: if $\beta(M_k) = f(v^k)$ for some
$v^k \in M_k$ and $z^k = \pi_{p^k}(v^k)$ is the first point where the line segment from
$v^k$ to $p^k$ meets the surface $h(z) = 0$, then bisect $M_k$ via $(w^k, j_k)$, where
$w^k = \frac12(v^k + z^k)$ and $j_k$ satisfies $|z^k_{j_k} - v^k_{j_k}| = \max_j |z^k_j - v^k_j|$. Let $P_{k+1}$
be the collection of newly generated boxes.

Step 5. Set $R_{k+1} = S_k \setminus \{M_k\}$, increment $k$, and return to Step 1.
Proposition 13.5 Whenever infinite, the above algorithm generates an infinite
sequence $\{z^k\}$, every cluster point of which yields a global optimal solution of (GBP).

Proof Follows from Proposition 6.2. ∎
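The overall loop can be summarized in the following structural sketch (our simplification, not the book's pseudocode): reduce_box is the reduction sketched earlier (with h, f, and $\gamma$ baked into the callable), lower_bound returns a valid bound plus an optional incumbent candidate, and subdivide implements the chosen bisection rule.

```python
def brb_solve(a, b, reduce_box, lower_bound, subdivide, first_feasible, tol=1e-6):
    """Skeleton of the branch-reduce-and-bound loop for (Q) (sketch only)."""
    best_val, best_sol = first_feasible()      # e.g. z = (b_hat, phi(b_hat))
    active = [(a, b)]
    while active:
        scored = []
        for (p, q) in active:
            red = reduce_box(p, q, best_val)   # gamma-valid reduction or None
            if red is None:
                continue                       # box fathomed
            bound, candidate = lower_bound(*red)
            if candidate is not None and candidate[0] < best_val:
                best_val, best_sol = candidate # update incumbent
            if bound < best_val - tol:
                scored.append((bound, red))    # still of interest
        if not scored:
            return best_val, best_sol          # optimal within tol
        scored.sort(key=lambda t: t[0])        # branch on smallest bound
        _, (p, q) = scored[0]
        active = [box for _, box in scored[1:]] + list(subdivide(p, q))
```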

13.2.3 Cases When the Dimension Can Be Reduced

Suppose $m > n_1$ while the map $C$ satisfies
\[ x' \ge x \ \Rightarrow\ C(x') \ge C(x). \tag{13.17} \]

In that case the dimension of the problem (Q) can be reduced. For this, define
\[ \varphi(v) = \min\{d(y) \mid g_2(C(v), y) \le 0,\ y \in \mathbb{R}^{n_2}_+\}, \tag{13.18} \]
\[ f(v,t) = \min\{F(x,y) \mid g_1(x,y) \le 0,\ g_2(C(v), y) \le 0,\ v \ge x,\ t \ge d(y),\ x \in \mathbb{R}^{n_1}_+,\ y \in \mathbb{R}^{n_2}_+\}. \tag{13.19} \]

It is easily seen that all these functions are decreasing. Indeed, $v' \ge v$ implies
$C(v') \ge C(v)$ and hence $g_2(C(v'), y) \le g_2(C(v), y) \le 0$, i.e., whenever $y$ is feasible
to (13.18) for $v$ and $v' \ge v$, then $y$ is also feasible for $v'$. This shows that $\varphi(v') \le \varphi(v)$.
Similarly, whenever $(x,y)$ is feasible to (13.19) for $(v,t)$ and $(v',t') \ge (v,t)$, then $(x,y)$
is also feasible for $(v',t')$, and hence $f(v',t') \le f(v,t)$.

With $\varphi(v)$ and $f(v,t)$ defined that way, we still have

Proposition 13.6 Problem (GBP) is equivalent to the monotonic optimization
problem
\[ (\tilde{\mathrm{Q}})\quad \min\{f(v,t) \mid t - \varphi(v) \le 0\} \]

and if $(\bar v, \bar t)$ solves $(\tilde{\mathrm{Q}})$ then an optimal solution $(\bar x, \bar y)$ of the problem
\[ \min\{F(x,y) \mid g_1(x,y) \le 0,\ g_2(C(x), y) \le 0,\ \bar v \ge x,\ \bar t \ge d(y),\ x \ge 0,\ y \ge 0\} \]
solves (GBP).

Proof If $(x,y)$ is feasible to (GBP), then $g_1(x,y) \le 0$, $g_2(C(x),y) \le 0$, and taking
$v = x$, $t = d(y) = \varphi(x)$ we have $t - \varphi(v) = 0$; hence, setting $G := \{(v,t) \mid t -
\varphi(v) \le 0\}$ yields
\[
F(x,y) \ge \min\{F(x',y') \mid g_1(x',y') \le 0,\ g_2(C(x'),y') \le 0,\ v \ge x',\ t \ge d(y'),\ x' \ge 0,\ y' \ge 0\}
= f(v,t) \ge \min\{f(v',t') \mid (v',t') \in G\} = \min(\tilde{\mathrm{Q}}).
\]

Conversely, if $t - \varphi(v) \le 0$, then the inequalities $v \ge x$, $t \ge d(y)$ imply $d(y) \le
t \le \varphi(v) \le \varphi(x)$; hence
\[
f(v,t) = \min\{F(x,y) \mid g_1(x,y) \le 0,\ g_2(C(x),y) \le 0,\ v \ge x,\ t \ge d(y),\ x \ge 0,\ y \ge 0\}
\ge \min\{F(x,y) \mid g_1(x,y) \le 0,\ g_2(C(x),y) \le 0,\ \varphi(x) \ge d(y),\ x \ge 0,\ y \ge 0\}
= \min(\mathrm{GBP}).
\]

Consequently, $\min(\mathrm{GBP}) = \min(\tilde{\mathrm{Q}})$. The rest is clear. ∎

Thus, under assumption (13.17), (GBP) can be solved by the same method as
previously, with $v \in \mathbb{R}^{n_1}$ as main parameter instead of $u \in \mathbb{R}^m$. Since $n_1 < m$ the
method works in a space of smaller dimension.
More generally, if $C$ is a linear map with $\operatorname{rank} C = r < m$ and $E$ is an $r \times n_1$ matrix with $\operatorname{Ker} E = \operatorname{Ker} C$ satisfying
\[ Ex' \ge Ex \ \Rightarrow\ C(x') \ge C(x), \tag{13.20} \]
then one can arrange the method to work basically in a space of dimension $r$ rather
than $m$. In fact, writing $E = [E_B, E_N]$, $x = \begin{pmatrix} x_B \\ x_N \end{pmatrix}$, where $E_B$ is an $r \times r$ nonsingular
matrix, we have for every $v = Ex$:
\[ x = \begin{pmatrix} E_B^{-1} \\ 0 \end{pmatrix} v + \begin{pmatrix} -E_B^{-1}E_N x_N \\ x_N \end{pmatrix}. \]
Hence, setting $Z = \begin{pmatrix} E_B^{-1} \\ 0 \end{pmatrix}$, $z = \begin{pmatrix} -E_B^{-1}E_N x_N \\ x_N \end{pmatrix}$ yields
\[ x = Zv + z \quad \text{with } Ez = -E_N x_N + E_N x_N = 0. \]

Since by (13.20) $Ez = 0$ implies $Cz = 0$, it follows that $Cx = C(Zv)$. Now define
\[ \varphi(v) = \min\{d(y) \mid g_2(C(Zv), y) \le 0,\ y \in \mathbb{R}^{n_2}_+\}, \tag{13.21} \]
\[ f(v,t) = \min\{F(x,y) \mid g_1(x,y) \le 0,\ g_2(C(Zv), y) \le 0,\ v \ge Ex,\ t \ge d(y),\ x \in \mathbb{R}^{n_1}_+,\ y \in \mathbb{R}^{n_2}_+\}. \tag{13.22} \]

For any $v' \in \mathbb{R}^r = E(\mathbb{R}^{n_1})$ we have $v' = Ex'$ for some $x'$; so $v' \ge v$ implies
$Ex' \ge Ex$, hence by (13.20) $Cx' \ge Cx$, i.e., $C(Zv') \ge C(Zv)$, and therefore
$g_2(C(Zv'), y) \le g_2(C(Zv), y) \le 0$. It is then easily seen that $\varphi(v') \le \varphi(v)$ and
$f(v',t') \le f(v,t)$ whenever $v' \ge v$ and $t' \ge t$, i.e., both $\varphi(v)$ and $f(v,t)$ are
decreasing. Furthermore, if $(\bar v, \bar t)$ solves the problem
\[ \min\{f(v,t) \mid t - \varphi(v) \le 0\} \tag{13.23} \]
(so that, in particular, $\bar t = \varphi(\bar v)$), and $(\bar x, \bar y)$ solves the problem (13.22) where $v =
\bar v$, $t = \bar t = \varphi(\bar v)$, then, since $\bar v \ge E\bar x$, we have $\varphi(E\bar x) \ge \varphi(\bar v) \ge d(\bar y)$. Noting
that $C(ZE\bar x) = C(\bar x)$, this implies that $\bar y$ solves $\min\{d(y) \mid g_2(C\bar x, y) \le 0\}$, and
consequently, $(\bar x, \bar y)$ solves (GBP).

13.2.4 Specialization to (LBP)

In the special case when all functions involved in (GBP) are affine we have the
Linear Bilevel Programming problem
\[
\text{(LBP)}\quad \min L(x,y)\ \ \text{s.t.}\ \ A_1x + B_1y \ge c^1,\ x \in \mathbb{R}^{n_1}_+,\ y \text{ solves }
\min\{\langle d,y\rangle \mid A_2x + B_2y \ge c^2,\ y \in \mathbb{R}^{n_2}_+\},
\]
where $L(x,y)$ is a linear function, $d \in \mathbb{R}^{n_2}$, $A_i \in \mathbb{R}^{m_i \times n_1}$, $B_i \in \mathbb{R}^{m_i \times n_2}$, $i = 1,2$.
In this case $C(x) = A_2x$, while $g_1(x,y) = \max_{i=1,\dots,m_1}(c^1 - A_1x - B_1y)_i \le 0$,
$g_2(u,y) = \max_{i=1,\dots,m_2}(c^2 - u - B_2y)_i \le 0$.

Clearly all assumptions (A1), (A2), and (A3) hold, provided the polytope $D :=
\{(x,y) \in \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+ \mid A_1x + B_1y \ge c^1,\ A_2x + B_2y \ge c^2\}$ is nonempty. To specialize
the MO Algorithm for (GBP) to (LBP), define

\[
A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}, \quad B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}, \quad c = \begin{pmatrix} c^1 \\ c^2 \end{pmatrix},
\]
\[
A_1 = \begin{pmatrix} A_{1,1} \\ \vdots \\ A_{1,m_1} \end{pmatrix}, \quad A_2 = \begin{pmatrix} A_{2,1} \\ \vdots \\ A_{2,m_2} \end{pmatrix},
\]
\[
D = \{(x,y) \mid Ax + By \ge c,\ x \ge 0,\ y \ge 0\}.
\]

By Proposition 13.3, (LBP) is equivalent to the monotonic optimization problem
$\min\{f(u,t) \mid h(u,t) \le 0,\ (u,t) \in W\}$, where for $u \in \mathbb{R}^{m_2}$, $t \in \mathbb{R}$:
\[ f(u,t) = \min\{L(x,y) \mid (x,y) \in D,\ u \ge A_2x,\ t \ge \langle d,y\rangle\}, \]
\[ h(u,t) = t - \min\{\langle d,y\rangle \mid (x,y) \in D,\ u \ge A_2x\}, \]
with $f(u,t)$ decreasing and $h(u,t)$ increasing.

For convenience set $z = (u,t) \in \mathbb{R}^{m_2+1}$. The basic operations proceed as
follows:

Initial box: $M_1 = [a,b] \subset \mathbb{R}^{m_2+1}$ with
\[
\begin{aligned}
a_i &= \min\{\langle A_{2,i}, x\rangle \mid (x,y) \in D\}, \ \ i = 1,\dots,m_2,\\
a_{m_2+1} &= \min\{\langle d,y\rangle \mid (x,y) \in D\},\\
b_i &= \max\{\langle A_{2,i}, x\rangle \mid (x,y) \in D\}, \ \ i = 1,\dots,m_2,\\
b_{m_2+1} &= \max\{\langle d,y\rangle \mid (x,y) \in D\}.
\end{aligned}
\]

Reduction operation: Let $\bar z = (\bar u, \bar t) =$ CBS and $\gamma = f(\bar z) =$ CBV. If $M = [p,q]$
is any subbox of $[a,b]$ still of interest at a given iteration, then the box $\mathrm{red}_\gamma M$
(valid reduction of $M$) is determined as follows:

(1) If $h(q) < 0$ or $f(q) > \gamma$, then $\mathrm{red}_\gamma(M) = \emptyset$ ($M$ is to be deleted);
(2) Otherwise, $\mathrm{red}_\gamma(M) = [p',q']$ with $p',q'$ determined by formulas (13.12)-(13.13), i.e.,
\[ q' = p + \sum_{i=1}^{m_2+1} \alpha_i(q_i - p_i)e^i, \qquad p' = q' - \sum_{i=1}^{m_2+1} \beta_i(q'_i - p_i)e^i, \]
where
\[
\alpha_i = \begin{cases} 1, & \text{if } h(q^i) \le 0\\ N(\varphi_i), & \text{if } h(q^i) > 0 \end{cases}
\qquad \varphi_i(\alpha) = h(p + \alpha(q_i - p_i)e^i);
\]
\[
\beta_i = \begin{cases} 1, & \text{if } f(q'^i) - \gamma \le 0\\ N(\psi_i), & \text{if } f(q'^i) - \gamma > 0 \end{cases}
\qquad \psi_i(\beta) = f(q' - \beta(q'_i - p_i)e^i) - \gamma.
\]

Lower bounding: For any box $M = [p,q]$ with $\mathrm{red}_\gamma(M) = [p',q']$, a lower bound
for $f(z)$ over $[p,q]$ is
\[
\beta(M) = \min_{x,y,u,t} L(x,y)\ \ \text{s.t.}\ \
A_1x + B_1y \ge c^1,\ A_2x + B_2y \ge c^2,\
A_2x \le u,\ \langle d,y\rangle \le t,\ x \ge 0,\ y \ge 0,\
(u,t) \in [q'^1,\dots,q'^{m_2+1}].
\]
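All quantities above are linear programs. A sketch of the initial box computation with scipy (ours; A1, B1, c1, A2, B2, c2, d are placeholder arrays following the conventions of this subsection):

```python
import numpy as np
from scipy.optimize import linprog

def initial_box(A1, B1, c1, A2, B2, c2, d):
    """Bounds a_i, b_i on (A2 x, <d, y>) over D (sketch)."""
    n1, n2 = A1.shape[1], B1.shape[1]
    A_ub = np.vstack([np.hstack([-A1, -B1]), np.hstack([-A2, -B2])])
    b_ub = np.concatenate([-c1, -c2])   # encodes A1 x + B1 y >= c1, etc.

    def extreme(obj, sign):
        # sign = +1 for a minimum, -1 for a maximum of <obj, (x, y)>.
        res = linprog(sign * obj, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
        return sign * res.fun

    rows = [np.concatenate([A2[i], np.zeros(n2)]) for i in range(A2.shape[0])]
    rows.append(np.concatenate([np.zeros(n1), d]))
    a = np.array([extreme(r, +1.0) for r in rows])
    b = np.array([extreme(r, -1.0) for r in rows])
    return a, b
```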

13.2.5 Illustrative Examples

Example 13.1 Solve the problem
\[
\min (x^2 + y^2)\ \ \text{s.t.}\ \ x \ge 0,\ y \ge 0,\ y \text{ solves }
\min(-y)\ \ \text{s.t.}\ \ 3x + y \le 15,\ x + y \le 7,\ x + 3y \le 15.
\]

We have $D = \{(x,y) \in \mathbb{R}^2_+ \mid 3x + y \le 15,\ x + y \le 7,\ x + 3y \le 15\}$,
\[ \varphi(u) = \min\{-y \mid (x,y) \in D,\ 3x \le u\}, \]
\[ f(u,t) = \min\{x^2 + y^2 \mid (x,y) \in D,\ 3x \le u,\ -y \le t\}. \]

Computational results (tolerance 0.0001):

Optimal solution: (4.492188, 1.523438)
Optimal value: 22.500610 (relative error $\le 0.0001$)
Optimal solution found and confirmed at iteration 10.
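The reported solution can be checked directly; in this small sketch (ours) the follower's best response for a given $x$ is read off the three upper bounds on $y$, and the leader objective is evaluated at the reported point:

```python
def follower_best_y(x):
    # follower maximizes y subject to y <= 15 - 3x, 7 - x, (15 - x)/3, y >= 0
    return max(0.0, min(15 - 3 * x, 7 - x, (15 - x) / 3))

x = 4.492188
y = follower_best_y(x)      # ~1.523438, binding constraint: 3x + y = 15
print(y, x**2 + y**2)       # ~22.5006, matching the reported optimal value
```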
Example 13.2 (Test Problem 7 in Floudas (2000), Chap. 9)
\[
\min(-8x_1 - 4x_2 + 4y_1 - 40y_2 + 4y_3)\ \ \text{s.t.}\ \ x_1, x_2 \ge 0,\ y \text{ solves }
\]
\[
\min(y_1 + y_2 + 2y_3)\ \ \text{s.t.}\ \
-y_1 + y_2 + y_3 \le 1,\
2x_1 - y_1 + 2y_2 - 0.5y_3 \le 1,\
2x_2 + 2y_1 - y_2 - 0.5y_3 \le 1,\
y_1, y_2, y_3 \ge 0.
\]

Here
\[
D = \{(x,y) \in \mathbb{R}^2_+ \times \mathbb{R}^3_+ \mid -y_1 + y_2 + y_3 \le 1,\ 2x_1 - y_1 + 2y_2 - 0.5y_3 \le 1,\
2x_2 + 2y_1 - y_2 - 0.5y_3 \le 1\},
\]
\[ \varphi(u) = \min\{y_1 + y_2 + 2y_3 \mid (x,y) \in D,\ x_1 \le u_1,\ x_2 \le u_2\}, \]
\[ f(u,t) = \min\{-8x_1 - 4x_2 + 4y_1 - 40y_2 + 4y_3 \mid (x,y) \in D,\ x_1 \le u_1,\ x_2 \le u_2,\ y_1 + y_2 + 2y_3 \le t\}. \]

Computational results (tolerance 0.01):

Optimal solution $x = (0, 0.896484)$, $y = (0, 0.597656, 0.390625)$
Optimal value: $-25.929688$ (relative error $\le 0.01$)
Optimal solution found and confirmed at iteration 22.

13.3 Optimization Over the Efficient Set

Given a nonempty set $X \subset \mathbb{R}^n$ and a mapping $C = (C_1,\dots,C_m): \mathbb{R}^n \to \mathbb{R}^m$, a point
$x^0 \in X$ is said to be efficient w.r.t. $C$ if there is no $x \in X$ such that $C(x) \ge C(x^0)$ and
$C(x) \ne C(x^0)$. It is said to be weakly efficient w.r.t. $C$ if there is no $x \in X$ such that
$C(x) > C(x^0)$. These concepts appear in the context of multiobjective programming,
when a decision maker has several objectives $C_1(x),\dots,C_m(x)$ he/she would like to
maximize over a feasible set $X$. Since these objectives may not be consistent with
one another, efficient or weakly efficient solutions are used as substitutes for the
standard concept of optimal solution.

Denote the set of efficient points by $X_E$ and the set of weakly efficient points by
$X_{WE}$. Obviously, $X_E \subset X_{WE}$. To help the decision maker select a most preferred
solution, the following problems can be considered:
\[ \text{(OE)}\quad \min\{f(x) \mid x \in X,\ g(x) \le 0,\ x \in X_E\} \]
\[ \text{(OWE)}\quad \min\{f(x) \mid x \in X,\ g(x) \le 0,\ x \in X_{WE}\} \]
where $f(x), g(x)$ are given functions.

Since $x \in X_{WE}$ if and only if there is no $y \in X$ with $C(y) - C(x) > 0$, the problem (OWE) is an
MPEC. It can be proved that (OE), too, is an MPEC, though the proof of this fact is
a little more involved (see Philip 1972).

Among numerous studies devoted to different aspects of the two above problems
we should mention Benson (1984, 1986, 1991, 1993, 1995, 1996, 1998, 1999,
2006), Bolintineau (1993a,b), Ecker and Song (1994), Bach-Kim (2000), An-Tao-Muu
(2003), An-Tao-Nam-Muu (2010), Muu and Tuyen (2002), and Thach et al. (1996). All these
approaches are based in one way or another on dc optimization or extensions. Below
we present a different approach, first proposed in Luc (2001) and later expanded in
Tuy and Hoai-Phuong (2006), which is based on reformulating these problems as
monotonic optimization problems.

13.3.1 Monotonic Optimization Reformulation

Assume $X$ is compact and $C$ is continuous. Then the set $C(X)$ is compact, and there
exists a box $[a,b] \subset \mathbb{R}^m_+$ such that $C(X) \subset [a,b]$.

For every $y \in [a,b]$ define
\[ h(y) = \min\{t \mid y_i - C_i(x) \le t,\ i = 1,\dots,m,\ x \in X\}. \tag{13.24} \]

Proposition 13.7 The function $h: [a,b] \to \mathbb{R}$ is increasing and continuous, and
there holds
\[ x \in X_{WE} \ \Leftrightarrow\ h(C(x)) \ge 0. \]

Proof We can write
\[
\begin{aligned}
x \in X_{WE} &\Leftrightarrow \forall z \in X\ \exists i:\ C_i(x) \ge C_i(z)\\
&\Leftrightarrow \min_{z\in X} \max_i\,(C_i(x) - C_i(z)) \ge 0\\
&\Leftrightarrow \min\{t \mid C_i(x) - C_i(z) \le t,\ i = 1,\dots,m,\ z \in X\} \ge 0\\
&\Leftrightarrow h(C(x)) \ge 0. \ \blacksquare
\end{aligned}
\]
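When $X$ is a polytope $\{x \ge 0 : Ax \le b\}$ and $C$ is linear ($C(x) = Cx$), (13.24) is a linear program in $(x,t)$; a sketch with scipy, under those assumptions and with placeholder data names:

```python
import numpy as np
from scipy.optimize import linprog

def h(y, C, A, b):
    """h(y) = min{ t : y_i - (Cx)_i <= t for all i, Ax <= b, x >= 0 }."""
    m, n = C.shape
    obj = np.concatenate([np.zeros(n), [1.0]])     # variables (x, t)
    A_ub = np.vstack([
        np.hstack([-C, -np.ones((m, 1))]),          # y_i - (Cx)_i - t <= 0
        np.hstack([A, np.zeros((A.shape[0], 1))]),  # Ax <= b
    ])
    b_ub = np.concatenate([-y, b])
    bounds = [(0, None)] * n + [(None, None)]       # t is a free variable
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.fun
```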

Since the set $C(X)$ is compact, it is contained in some box $[a,b] \subset \mathbb{R}^m$. For each
$y \in [a,b]$ let
\[ \varphi(y) := \min\{f(x) \mid x \in X,\ g(x) \le 0,\ C(x) \ge y\}. \]
Clearly for $y' \ge y$ we have $\varphi(y') \ge \varphi(y)$, so this is an increasing function.

Proposition 13.8 Problem (OWE) is equivalent to the monotonic optimization
problem
\[ \min\{\varphi(y) \mid h(y) \ge 0,\ y \in [a,b]\}. \tag{13.25} \]

Proof By Proposition 13.7 problem (OWE) can be written as
\[ \min\{f(x) \mid x \in X,\ g(x) \le 0,\ h(C(x)) \ge 0\}. \tag{13.26} \]
This is a problem with hidden monotonicity in the constraints, which has been studied
in Chap. 11, Sect. 11.5. By Proposition 11.24, it is equivalent to problem (13.25). ∎

13.3.2 Solution Method for (OWE)

For small values of $m$ the monotonic optimization problem (13.25) can be solved
fairly efficiently by the copolyblock approximation algorithm analogous to the
polyblock algorithm described in Sect. 11.2, Chap. 11. For larger values of $m$, the
SIT Algorithm presented in Sect. 11.3 is more suitable. The latter is a rectangular
BRB algorithm involving two specific operations: valid reduction and lower
bounding.

a. Valid Reduction

At a given stage of the BRB algorithm for solving (13.25), a feasible solution $\bar y$ is
known or has been produced which is the best so far available. Let $\gamma = \varphi(\bar y)$ and
let $[p,q] \subset [a,b]$ be a box currently still of interest among all boxes generated in the
partitioning process. Since both $\varphi(y)$ and $h(y)$ are increasing, an optimal solution
of (13.25) must be attained at a point $y \in [p,q]$ satisfying $h(y) = 0$ (i.e., a lower
boundary point of the conormal set $\Gamma := \{y \in [a,b] \mid h(y) \ge 0\}$). The search for a
better feasible solution than $\bar y$ can therefore be restricted to the set
\[ B_\gamma := \{y \in [p,q] \mid h(y) = 0,\ \varphi(y) - \gamma \le 0\} = \{y \in [p,q] \mid h(y) \le 0 \le h_\gamma(y)\}, \tag{13.27} \]
where $h_\gamma(y) = \min\{\gamma - \varphi(y),\ h(y)\}$.

In other words, we can replace the box $[p,q]$ by a smaller one, say $[p',q'] \subset [p,q]$,
provided
\[ B_\gamma \cap [p',q'] = B_\gamma. \]
A box $[p',q']$ satisfying this condition is referred to as a $\gamma$-valid reduction of $[p,q]$
and denoted $\mathrm{red}_\gamma[p,q]$.

For any continuous increasing function $\varphi(t): [0,1] \to \mathbb{R}$ such that $\varphi(0) < 0 <
\varphi(1)$, let $N(\varphi)$ denote the largest value of $t \in (0,1)$ satisfying $\varphi(t) = 0$. For every
$i = 1,\dots,m$ let $q^i := p + (q_i - p_i)e^i$, where $e^i_i = 1$, $e^i_j = 0\ \forall j \ne i$.
Proposition 13.9 We have
\[
\mathrm{red}_\gamma[p,q] = \begin{cases} \emptyset, & \text{if } h(p) > 0 \text{ or } h_\gamma(q') < 0,\\
[p',q'], & \text{if } h(p) \le 0 \text{ and } h_\gamma(q') \ge 0, \end{cases}
\]
where
\[ q' = p + \sum_{i=1}^{m} \alpha_i(q_i - p_i)e^i, \qquad p' = q' - \sum_{i=1}^{m} \beta_i(q'_i - p_i)e^i, \]
\[
\alpha_i = \begin{cases} 1, & \text{if } h(q^i) \le 0\\ N(\varphi_i), & \text{if } h(q^i) > 0 \end{cases}
\qquad \varphi_i(t) = h(p + t(q_i - p_i)e^i);
\]
\[
\beta_i = \begin{cases} 1, & \text{if } h_\gamma(q'^i) \ge 0\\ N(\psi_i), & \text{if } h_\gamma(q'^i) < 0 \end{cases}
\qquad \psi_i(t) = -h_\gamma(q' - t(q'_i - p_i)e^i),
\]
with $q'^i := q' - (q'_i - p_i)e^i$.

Remark 13.3 It is easily verified that the box $[p',q'] = \mathrm{red}_\gamma[p,q]$ still satisfies
$\varphi(p') \le \gamma$, $h(p') \le 0$, $h(q') \ge 0$.

b. Lower Bounding

Let $M := [p,q]$ be a partition set which is assumed to have been reduced, so that,
according to Remark 13.3,
\[ \varphi(p) \le \gamma, \qquad h(p) \le 0, \qquad h(q) \ge 0. \]

To compute a lower bound $\beta(M)$ for
\[ \min\{\varphi(y) \mid y \in [p,q],\ h(y) = 0\}, \]
we note that, since $\varphi(y)$ is increasing, an obvious bound is $\varphi(p)$. It turns out that, as
will shortly be seen, the convergence of the BRB algorithm for solving (13.25)
will be ensured provided that
\[ \beta(M) \ge \varphi(p). \tag{13.28} \]
A lower bound satisfying this condition will be referred to as a valid lower bound.

Define $\sigma(y) = \min\{\gamma - \varphi(y),\ h(y)\}$ and $\Lambda = \{y \in [p,q] \mid \sigma(y) \ge 0\}$.
If $\sigma(p) \ge 0$, then obviously $p$ is an exact minimizer of $\varphi(y)$ over the
feasible points in $[p,q]$ at least as good as the current best, and $p$ can be used to update
the current best solution. Suppose therefore that $h(p) < 0$, i.e., $p \notin \Lambda$.
For each $y \in [p,q]$ such that $h(y) < 0$ let $\rho(y)$ be the first point where the line
segment from $y$ to $q$ meets the lower boundary of the conormal set $\{y \mid h(y) \ge 0\}$, i.e.,
\[ \rho(y) = y + \bar\lambda(q - y), \quad \text{with } \bar\lambda = \max\{\lambda \mid h(y + \lambda(q - y)) \le 0\}. \]
Obviously, $h(\rho(y)) = 0$.

Lemma 13.2 If $z = \rho(p)$, $z^i = p + (z_i - p_i)e^i$, $i = 1,\dots,m$, and $I = \{i \mid z^i \in \Lambda\}$,
then a valid lower bound over $M = [p,q]$ is
\[
\beta(M) = \min\{\varphi(z^i) \mid i \in I\} = \min_{i\in I} \min\{f(x) \mid x \in D,\ C(x) \ge z^i\}.
\]

Proof Let $M_i = [z^i, q]$. From the definition of $z$ it is easily seen that $h(u) < 0\ \forall u \in
[p,z)$, i.e., $[p,q] \cap \Lambda \subset [p,q] \setminus [p,z)$. Noting that $\{u \mid p \le u < z\} = \bigcap_{i=1}^m \{u \mid p_i \le
u_i < z_i\}$, we can write $[p,q] \setminus [p,z) = [p,q] \setminus \bigcap_{i=1}^m \{u \mid u_i < z_i\} = \bigcup_{i=1}^m \{u \in [p,q] \mid z_i \le
u_i \le q_i\} = \bigcup_{i=1}^m M_i$. Thus, if $I = \{i \mid z^i \in \Lambda\}$, then $[p,q] \cap \Lambda \subset \bigcup_{i\in I} M_i$. Since
$\varphi(z^i) \le \min\{\varphi(y) \mid y \in M_i\}$, the result follows. ∎

MO Algorithm for (OWE)

Initialization. Start with $P_1 = \{M_1\}$, $M_1 = [a,b]$, $R_1 = \emptyset$. If a current best feasible
solution (CBS) is available, let CBV (current best value) denote the value of $\varphi$ at
this point. Otherwise, set CBV $= +\infty$. Set $k = 1$.

Step 1. For each box $M \in P_k$:
Compute the $\gamma$-valid reduction $\mathrm{red}_\gamma M$ of $M$ for $\gamma =$ CBV;
Delete $M$ if $\mathrm{red}_\gamma M = \emptyset$;
Replace $M$ by $\mathrm{red}_\gamma M$ otherwise;
If $\mathrm{red}_\gamma M = [p,q]$, then compute a valid lower bound $\beta(M)$ for $\varphi(y)$
over the feasible solutions in $M$.

Step 2. Let $P'_k$ be the collection of boxes that results from $P_k$ after completion
of Step 1. From $R_k$ remove all $M \in R_k$ such that $\beta(M) \ge$ CBV and let $R'_k$
be the resulting collection. Let $S_k = R'_k \cup P'_k$.

Step 3. If $S_k = \emptyset$, then terminate: the problem is infeasible if CBV $= +\infty$;
otherwise CBV is the optimal value and the feasible solution $\bar y$ with $\varphi(\bar y) =$ CBV is an
optimal solution.
Otherwise, let $M_k \in \operatorname{argmin}\{\beta(M) \mid M \in S_k\}$.

Step 4. Divide $M_k$ into two subboxes by the standard bisection. Let $P_{k+1}$ be the
collection of these two subboxes of $M_k$.

Step 5. Let $R_{k+1} = S_k \setminus \{M_k\}$. Increment $k$ and return to Step 1.

Proposition 13.10 Either the algorithm is finite, or it generates an infinite filter of
boxes $\{M_{k_l}\}$ whose intersection yields a global optimal solution.

Proof If the algorithm is infinite, it must generate an infinite filter of boxes $\{M_{k_l}\}$ such
that $\bigcap_{l=1}^{+\infty} M_{k_l} = \{y^*\}$, by exhaustiveness of the standard bisection process. Since
$M_{k_l} = [p^{k_l}, q^{k_l}]$ with $h(p^{k_l}) \le 0 \le h(q^{k_l})$ and $y^* = \lim p^{k_l} = \lim q^{k_l}$, it follows
that $h(y^*) = 0$, so $y^*$ is a feasible solution. Therefore, if the problem
is infeasible, the algorithm must stop at some iteration where no box remains for
consideration while CBV $= +\infty$ (see Step 3). Otherwise,
\[ \varphi(p^{k_l}) \le \beta(M_{k_l}) \le \varphi(q^{k_l}), \]
whence $\lim_{l\to+\infty} \beta(M_{k_l}) = \varphi(y^*)$. On the other hand, since $\beta(M_{k_l})$ is minimal
among the current set $S_{k_l}$, we have $\beta(M_{k_l}) \le \min\{\varphi(y) \mid y \in [a,b],\ h(y) = 0\}$ and,
consequently,
\[ \varphi(y^*) \le \min\{\varphi(y) \mid y \in [a,b],\ h(y) = 0\}. \]
Since $y^*$ is feasible, it must be an optimal solution. ∎

c. Enhancements When C Is Concave

Tighter Lower Bounds

Recall from (13.25) that (OWE) is equivalent to the monotonic optimization
problem
\[ \min\{\varphi(y) \mid y \in \Gamma\}, \]
where $\Gamma = \{y \in [a,b] \mid h(y) \ge 0\}$, $h(y) = \min\{t \mid y_i - C_i(x) \le t,\ i = 1,\dots,m,\ x \in X\}$,
and
\[ \varphi(y) = \min\{f(x) \mid x \in X,\ g(x) \le 0,\ C(x) \ge y\}. \tag{13.29} \]

Lemma 13.3 If the functions $C_i(x)$, $i = 1,\dots,m$, are concave, the set $\Omega = \{y \in
[a,b] \mid h(y) \le 0\}$ is convex.

Proof Clearly $\Omega = \{y \in [a,b] \mid y \le C(x) \text{ for some } x \in X\}$. If $y, y' \in \Omega$, $0 \le \lambda \le 1$,
then $y \le C(x)$, $y' \le C(x')$ with $x, x' \in X$, and $\lambda y + (1-\lambda)y' \le \lambda C(x) + (1-\lambda)C(x') \le
C(\lambda x + (1-\lambda)x')$, where $\lambda x + (1-\lambda)x' \in X$ because $X$ is convex by assumption. ∎

The convexity of $\Omega$ permits a simple method for computing a tight lower bound
for the minimum of $\varphi(y)$ over the feasible solutions in $[p,q]$.

For each $i = 1,\dots,m$ let $y^i$ be the last point of $\Omega$ on the line segment from $p$ to
$q^i = p + (q_i - p_i)e^i$, i.e., $y^i = p + \lambda_i(q_i - p_i)e^i$ with
\[ \lambda_i = \max\{\lambda \mid p + \lambda(q_i - p_i)e^i \in \Omega,\ 0 \le \lambda \le 1\}. \]
Clearly $h(y^i) = 0$.

Proposition 13.11 If the functions $C_i(x)$, $i = 1,\dots,m$, are concave, then a lower
bound for $\varphi(y)$ over the feasible portion in $M = [p,q]$ is
\[ \beta(M) = \min\Big\{f(x)\ \Big|\ x \in X,\ g(x) \le 0,\ C(x) \ge p,\ \sum_{i=1}^m (C_i(x) - p_i)/(y^i_i - p_i) \ge 1\Big\}. \tag{13.30} \]

Proof First observe that by convexity of $\Omega$ the simplex $S = [y^1,\dots,y^m]$ satisfies
$\{y \in [p,q] \mid h(y) \ge 0\} \subset S + \mathbb{R}^m_+$, and since $\varphi(y)$ is increasing, every point of
$S + \mathbb{R}^m_+$ majorizes some point of $S$ in the value of $\varphi$. Hence
\[ \beta(M) = \min\{\varphi(y) \mid y \in S\} \le \min\{\varphi(y) \mid y \in [p,q],\ h(y) = 0\}, \tag{13.31} \]

so (13.31) gives a valid lower bound. Further, letting
\[ E = \{(x,y) \mid x \in X,\ g(x) \le 0,\ C(x) \ge y,\ y \in S\}, \]
\[ E_y = \{x \in X \mid g(x) \le 0,\ C(x) \ge y\} = \{x \mid (x,y) \in E\} \ \text{ for every } y \in S, \]
we can write
\[
\min_{y\in S} \varphi(y) = \min_{y\in S} \min\{f(x) \mid x \in E_y\}
= \min\{f(x) \mid (x,y) \in E\}
= \min\{f(x) \mid x \in X,\ g(x) \le 0,\ C(x) \ge y,\ y \in S\}.
\]
Since $S = \{y \ge p \mid \sum_{i=1}^m (y_i - p_i)/(y^i_i - p_i) = 1\}$, (13.30) follows. ∎

Remark 13.4 In particular, when $X$ is a polytope, $C$ is a linear mapping, and $g(x), f(x)$
are affine functions, as e.g. in problems (OWE) with linear objective criteria, then
$\varphi(y)$ and $\beta(M)$ are optimal values of linear programs. Also note that the linear
programs (13.29) for different $y$ differ only by the right-hand side of the linear
constraints, and the linear programs (13.30) for different partition sets $M$ differ only
by the last constraint and the right-hand side of the constraint $C(x) \ge p$. Exploiting
these facts, reoptimization techniques can be used to solve these linear programs
efficiently.

Adaptive Subdivision Rule

Let $y^M$ be the optimal solution of problem (13.30) and $z^M = \rho(y^M)$, the first point
where the line segment from $y^M$ to $q$ meets the surface $h(y) = 0$. Since the
equality $y^M = z^M$ would imply that $\beta(M) = \min\{\varphi(y) \mid y \in [p,q],\ h(y) = 0\}$, the
following adaptive subdivision rule must be used to accelerate convergence of the
algorithm (see Theorem 6.4, Chap. 6):

Determine $v^M = \frac12(y^M + z^M)$, $s_M \in \operatorname{argmax}_{i=1,\dots,m} |y^M_i - z^M_i|$, and subdivide $M$
via $(v^M, s_M)$.
13.3.3 Solution Method for (OE)

We now turn to the constrained optimization problem over the efficient set
\[ \text{(OE)}\quad \alpha := \min\{f(x) \mid x \in X,\ g(x) \le 0,\ x \in X_E\}, \]
where $X_E$ is the set of efficient points of $X$ with respect to $C: \mathbb{R}^n \to \mathbb{R}^m$.

Although efficiency is a much stronger property than weak efficiency, it turns
out that problem (OE) can be solved, roughly speaking, by the same method as
(OWE). More precisely, let, as previously, $\Omega = \{y \in [a,b] \mid y \le C(x) \text{ for some } x \in X\}$.
This is a normal set in the sense that $y' \le y \in \Omega \Rightarrow y' \in \Omega$ (see Chap. 11);

moreover, it is compact. Recall from Sect. 11.1 (Chap. 11) that a point $z \in \Omega$ is
called an upper boundary point of $\Omega$, written $z \in \partial^+\Omega$, if $(z, b] \subset [a,b] \setminus \Omega$. A point
$z \in \partial^+\Omega$ is called an upper extreme point of $\Omega$, written $z \in \operatorname{ext}^+\Omega$, if $x \in \Omega$, $x \ge z$
implies $x = z$. By Proposition 11.5 a compact normal set such as $\Omega$ always has at
least one upper extreme point.

Proposition 13.12 $x \in X_E \ \Leftrightarrow\ C(x) \in \operatorname{ext}^+\Omega$.

Proof If $x^0 \in X_E$, then $y^0 = C(x^0) \in \Omega$ and there is no $x \in X$ such that $C(x) \ge
C(x^0)$, $C(x) \ne C(x^0)$, i.e., no $y = C(x)$, $x \in X$, such that $y \ge y^0$, $y \ne y^0$; hence
$y^0 \in \operatorname{ext}^+\Omega$. Conversely, if $y^0 = C(x^0) \in \operatorname{ext}^+\Omega$ with $x^0 \in X$, then there is no
$y = C(x)$, $x \in X$, such that $y \ge y^0$, $y \ne y^0$, i.e., no $x \in X$ with $C(x) \ge C(x^0)$, $C(x) \ne
C(x^0)$; hence $x^0 \in X_E$. ∎

For every $y \in \mathbb{R}^m_+$ define
\[ \psi(y) = \min\Big\{\sum_{i=1}^m (y_i - z_i)\ \Big|\ z \ge y,\ z \in \Omega\Big\}. \tag{13.32} \]

Proposition 13.13 The function $\psi(y): \mathbb{R}^m_+ \to \mathbb{R} \cup \{+\infty\}$ is increasing and
satisfies
\[
\psi(y) \begin{cases} \le 0 & \text{if } y \in \Omega,\\ = 0 & \text{if and only if } y \in \operatorname{ext}^+\Omega,\\ = +\infty & \text{if } y \notin \Omega. \end{cases} \tag{13.33}
\]

Proof If $y' \ge y$, then $z \ge y'$ implies $z \ge y$; hence $\psi(y') \ge \psi(y)$. Suppose
$\psi(y) = 0$. Then $y \in \Omega$ since there exists $z \in \Omega$ such that $z \ge y$. There cannot
be any $z \in \Omega$ with $z \ge y$, $z \ne y$, for this would imply $z_i > y_i$ for at least one $i$, hence
$\psi(y) \le \sum_{i=1}^m (y_i - z_i) < 0$, a contradiction. Conversely, if $y \in \operatorname{ext}^+\Omega$ then for every $z \in \Omega$
such that $z \ge y$ one must have $z = y$, hence $\sum_{i=1}^m (y_i - z_i) = 0$, i.e., $\psi(y) = 0$. That
$\psi(y) \le 0\ \forall y \in \Omega$ and $\psi(y) = +\infty\ \forall y \notin \Omega$ is obvious. ∎
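Like $h$, the function $\psi$ of (13.32) is a linear program when $X$ is a polytope and $C$ linear, since $z \in \Omega$, $z \ge y$ means $y \le z \le Cx$ for some feasible $x$. A sketch under those assumptions (ours, placeholder names):

```python
import numpy as np
from scipy.optimize import linprog

def psi(y, C, A, b):
    """psi(y) = min{ sum_i (y_i - z_i) : y <= z <= Cx, Ax <= b, x >= 0 }."""
    m, n = C.shape
    # Variables (x, z); objective sum(y) - sum(z), constant sum(y) added later.
    obj = np.concatenate([np.zeros(n), -np.ones(m)])
    A_ub = np.vstack([
        np.hstack([-C, np.eye(m)]),                # z - Cx <= 0
        np.hstack([A, np.zeros((A.shape[0], m))]), # Ax <= b
    ])
    b_ub = np.concatenate([np.zeros(m), b])
    bounds = [(0, None)] * n + [(yi, None) for yi in y]   # z >= y
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return y.sum() + res.fun if res.success else np.inf   # +inf: y not in Omega
```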
Setting $D = \{x \in X \mid g(x) \le 0\}$, $\tilde h(y) = \min\{h(y), \psi(y)\}$, we can thus formulate
the problem (OE) as
\[ \min\{f(x) \mid x \in D,\ \tilde h(C(x)) \ge 0\}, \tag{13.34} \]
which has the same form as (13.26), with the difference, however, that $\tilde h(y)$, though
increasing like $h(y)$, is not usc. Upon the same transformation as previously, this
problem can be rewritten as
\[ \min\{\varphi(y) \mid y \in \Gamma,\ \psi(y) = 0\}, \tag{13.35} \]
where $\varphi(y)$ is defined as in (13.29) and $\Gamma = \{y \in [a,b] \mid h(y) \ge 0\}$.

Clearly problem (13.35) differs from problem (13.25) only by the additional
constraint $\psi(y) = 0$, whose presence is what makes (OE) a bit more complicated
than (OWE).

There are two possible approaches for solving (OE). In the first approach,
proposed by Thach and discussed in Tuy et al. (1996a), the problem (OE) is
approximated by an (OWE) differing from (OE) only in that the criterion mapping
$C$ is slightly perturbed and replaced by $C^\varepsilon$ with
\[ C^\varepsilon_j(x) = C_j(x) + \varepsilon \sum_{i=1}^m C_i(x), \quad j = 1,\dots,m, \]
where $\varepsilon > 0$ is sufficiently small. Denote by $X^\varepsilon_{WE}$ and $X^\varepsilon_E$, respectively, the weakly efficient
set and the efficient set of $X$ w.r.t. $C^\varepsilon(x)$, and consider the problem
\[ \text{(OWE}_\varepsilon)\quad \alpha(\varepsilon) := \min\{f(x) \mid g(x) \le 0,\ x \in X^\varepsilon_{WE}\}. \]

Proposition 13.14 (i) For any $\varepsilon > 0$ we have $X^\varepsilon_{WE} \subset X_E$.
(ii) $\alpha(\varepsilon) \to \alpha$ as $\varepsilon \searrow 0$.
(iii) If $X$ is a polytope and $C$ is linear, then there is $\varepsilon_0 > 0$ such that $X^\varepsilon_{WE} = X_E$ for
all $\varepsilon \in (0, \varepsilon_0)$.
Proof (i) Let $x^* \in X \setminus X_E$. Then there is $x \in X$ satisfying $C(x) \ge C(x^*)$,
$C_i(x) > C_i(x^*)$ for at least one $i$. This implies $\sum_{i=1}^m C_i(x) > \sum_{i=1}^m C_i(x^*)$,
hence $C^\varepsilon(x) > C^\varepsilon(x^*)$, so that $x^* \notin X^\varepsilon_{WE}$. Thus, if $x^* \in X \setminus X_E$ then
$x^* \in X \setminus X^\varepsilon_{WE}$, i.e., $X^\varepsilon_{WE} \subset X_E$.
(ii) Property (i) implies that $\alpha(\varepsilon) \ge \alpha$. Define
\[ \Omega^\varepsilon = \{y \in [a,b] \mid y \le C^\varepsilon(x),\ x \in X\}, \tag{13.36} \]
\[ \varphi_\varepsilon(y) = \min\{f(x) \mid x \in X,\ g(x) \le 0,\ C^\varepsilon(x) \ge y\}, \tag{13.37} \]
so that problem (OWE$_\varepsilon$) reduces to
\[ \min\{\varphi_\varepsilon(y) \mid y \in \operatorname{cl}([a,b] \setminus \Omega^\varepsilon)\}. \tag{13.38} \]

Next consider an arbitrary point $x \in X_E \setminus X^\varepsilon_{WE}$. Since $x \notin X^\varepsilon_{WE}$ there exists
$x^\varepsilon \in X$ such that $C^\varepsilon(x^\varepsilon) > C^\varepsilon(x)$. Then $x^\varepsilon$ is feasible to problem (OWE$_\varepsilon$)
and, consequently, $f(x^\varepsilon) \ge \alpha(\varepsilon) \ge \alpha$. In view of the compactness of $X$, by
passing to a subsequence if necessary, one can suppose that $x^\varepsilon \to x^* \in X$
as $\varepsilon \searrow 0$. Therefore, $f(x^*) \ge \lim \alpha(\varepsilon) \ge \alpha$. But from $C^\varepsilon(x^\varepsilon) > C^\varepsilon(x)$ we
have $C(x^*) \ge C(x)$, and since $x \in X_E$ it follows that $x^* = x$, and hence
$f(x) \ge \lim \alpha(\varepsilon) \ge \alpha$. This being true for arbitrary $x \in X_E \setminus X^\varepsilon_{WE}$, we conclude
that $\lim \alpha(\varepsilon) = \alpha$.
(iii) We only sketch the proof, referring the interested reader to Tuy et al. (1996b,
Proposition 15.1) for details. Since $C(x)$ is linear, we can consider the cones
$K := \{x \mid C_i(x) \ge 0,\ i = 1,\dots,m\}$, $K^\varepsilon := \{x \mid C^\varepsilon_i(x) \ge 0,\ i = 1,\dots,m\}$.
Let $x^0 \in X_E$ and denote by $F$ the smallest facet of $X$ that contains $x^0$. Since
$x^0 \in X_E$ we have $(x^0 + K) \cap F = \emptyset$, and hence there exists $\varepsilon_F > 0$ such that
$(x^0 + K^\varepsilon) \cap F = \emptyset\ \forall \varepsilon \in (0, \varepsilon_F)$. Then we have $(x + K^\varepsilon) \cap F = \emptyset\ \forall x \in F$
whenever $0 < \varepsilon \le \varepsilon_F$. If $\mathcal{F}$ denotes the set of facets of $X$ such that $F \subset X_E$,
then $\mathcal{F}$ is finite and $\varepsilon_0 = \min\{\varepsilon_F \mid F \in \mathcal{F}\} > 0$. Hence, $(x + K^\varepsilon) \cap F =
\emptyset\ \forall x \in F,\ \forall F \in \mathcal{F}$, i.e., $X_E \subset X^\varepsilon_{WE}$ whenever $0 < \varepsilon < \varepsilon_0$. ∎
u
As a consequence, an approximate optimal solution of (OE) can be obtained by
solving (OWE$_\varepsilon$) for any $\varepsilon$ such that $0 < \varepsilon \le \varepsilon_0$.

An alternative approach to (OE) consists in solving the equivalent monotonic
optimization problem (13.35) by a BRB algorithm analogous to the one for solving
problem (OWE). To take account of the peculiar constraint $\psi(y) = 0$, some
precautions and special manipulations are needed:

(i) A feasible solution is a $y \in [a,b]$ satisfying $\psi(y) = 0$, i.e., a point of $\operatorname{ext}^+\Omega$.
The current best solution is the best among all feasible solutions known so far.
(ii) In Step 1, after completion of the reduction operation described above, a special
supplementary reduction-deletion operation is needed to identify and fathom
any box $[p,q]$ that contains no point of $\operatorname{ext}^+\Omega$. This supplementary operation
will ensure that every filter of partition sets $[p^k, q^k]$ collapses to a point
corresponding to an efficient point.

Define $h_\varepsilon(y) = \min\{t \mid x \in X,\ y_i - C^\varepsilon_i(x) \le t,\ i = 1,\dots,m\}$ and $\Omega^\varepsilon = \{y \in
[a,b] \mid y \le C^\varepsilon(x),\ x \in X\}$, where $C^\varepsilon_j(x) = C_j(x) + \varepsilon \sum_{i=1}^m C_i(x)$, $j = 1,\dots,m$.
Since $C(x) \ge a \in \mathbb{R}^m_+\ \forall x \in X$, we have $C(x) \le C^\varepsilon(x)\ \forall x \in X$, hence $\Omega \subset \Omega^\varepsilon$.

Let $[p,q]$ be a box which has been reduced as described in Proposition 13.9, so
that $p \in \Omega$, $q \notin \Omega$. Since $p \in \Omega$ we have $\psi(p) \le 0$.

Supplementary Reduction-Deletion Rule: Fathom $[p,q]$ in either of the
following events:

(a) $\psi(q) \le 0$ (so that $\psi(y) \le 0\ \forall y \in \Omega \cap [p,q] \supset \partial^+\Omega \cap [p,q]$); if $\psi(q) = 0$, then
$q$ gives an efficient point and can be used to update the current best.
(b) $h_\varepsilon(q) \le 0$ for some small enough $\varepsilon > 0$ (then $q \in \Omega^\varepsilon$, so the box $[p,q]$ contains
no point of $\partial^+(\Omega^\varepsilon)$, and hence no point of $\operatorname{ext}^+\Omega$).

Proposition 13.15 The BRB algorithm applied to problem (13.35) with precaution
(i) and the supplementary reduction-deletion rule either yields an optimal solution
to (OE) in finitely many iterations or generates an infinite filter of boxes whose
intersection is a global optimal solution $y^*$ of (13.35), corresponding to an optimal
solution $x^*$ of (OE).

Proof Because of precautions (i) and (ii) the box selected for further partitioning
in each iteration contains either a point of $\operatorname{ext}^+\Omega$ or a point of $\operatorname{ext}^+\Omega^\varepsilon$. Therefore,
any infinite filter of boxes generated by the algorithm collapses either to a point
$y^* \in \operatorname{ext}^+\Omega$ (yielding then an $x^* \in X_E$) or to a point $y^* \in \operatorname{ext}^+\Omega^\varepsilon$ (yielding then
$x^* \in X^\varepsilon_{WE} \subset X_E$). ∎

13.3.4 Illustrative Examples

Example 13.3 Problem (OWE) with linear input functions. Solve
\[ \min\{f(x) \mid x \in X,\ g(x) \le 0,\ x \in X_{WE}\} \]
with the following data:
\[ f(x) = \langle c, x\rangle, \quad c = (4, 8, 3, 1, 7, 0, 6, 2, 9, 3), \]
\[ X = \{x \in \mathbb{R}^{10}_+ \mid A_1x \le b_1,\ A_2x \ge b_2\}, \]
where
\[
A_1 = \begin{pmatrix}
7 & 1 & 3 & 7 & 0 & 9 & 2 & 1 & 5 & 1\\
5 & 8 & 1 & 7 & 5 & 0 & 1 & 3 & 4 & 0\\
2 & 1 & 0 & 2 & 3 & 2 & 2 & 5 & 1 & 3\\
1 & 4 & 2 & 9 & 4 & 3 & 3 & 4 & 0 & 2\\
3 & 1 & 0 & 8 & 3 & 1 & 2 & 2 & 5 & 5
\end{pmatrix},
\qquad b_1 = (66, 150, 81, 79, 53),
\]
\[
A_2 = \begin{pmatrix}
2 & 9 & 1 & 2 & 2 & 1 & 4 & 1 & 5 & 2\\
2 & 3 & 2 & 4 & 5 & 4 & 1 & 9 & 2 & 1\\
4 & 8 & 1 & 1 & 5 & 3 & 2 & 0 & 2 & 9\\
2 & 7 & 1 & 2 & 5 & 9 & 4 & 1 & 2 & 0\\
2 & 3 & 1 & 1 & 4 & 3 & 1 & 0 & 0 & 6\\
6 & 0 & 0 & 0 & 4 & 3 & 2 & 2 & 4 & 6\\
2 & 2 & 3 & 5 & 6 & 4 & 0 & 0 & 1 & 4\\
1 & 4 & 4 & 6 & 0 & 3 & 4 & 2 & 4 & 1
\end{pmatrix},
\qquad b_2 = (126, 12, 52, 23, 2, 23, 28, 90),
\]
\[ g(x) := Gx - d \le 0, \quad \text{where}\quad
G = \begin{pmatrix}
1 & 1 & 2 & 4 & 4 & 4 & 1 & 4 & 6 & 3\\
4 & 9 & 0 & 1 & 2 & 1 & 6 & 5 & 0 & 0
\end{pmatrix},
\quad d = (14, 89),
\]
\[
C = \begin{pmatrix}
5 & 1 & 7 & 1 & 4 & 9 & 0 & 4 & 3 & 7\\
1 & 2 & 5 & 4 & 1 & 6 & 4 & 0 & 3 & 0\\
3 & 3 & 0 & 4 & 0 & 1 & 2 & 1 & 4 & 0
\end{pmatrix}.
\]

Computational results:
Optimal solution (tolerance 0.01):
$x = (0,\ 8.741312,\ 8.953411,\ 9.600323,\ 3.248449,\ 3.916472,\ 6.214436,\ 9.402384,\ 10,\ 4.033163)$
(found at iteration 78 and confirmed at iteration 316)

Optimal value: 107.321065
Maximal number of nodes generated: 32

Example 13.4 Problem (OE) with linear-fractional multicriteria (taken from Muu
and Tuyen (2002)). Solve
\[ \min\{f(x) \mid x \in X,\ x \in X_E\} \]
with the following data:
\[ f(x) = -x_1 - x_2, \qquad X = \{x \in \mathbb{R}^2_+ \mid Ax \le b\}, \]
where
\[
A = \begin{pmatrix} 1 & 2\\ 1 & 2\\ 1 & 1\\ 1 & 0 \end{pmatrix},
\qquad b = (2, 2, 1, 6)^T,
\]
\[
C_1(x) = \frac{x_1}{x_1 + x_2}, \qquad C_2(x) = \frac{x_1 + 6}{x_1 - x_2 + 3}.
\]

Computational results:
Optimal solution (tolerance 0.01):
$x = (1.991962,\ 2.991962)$
(found at iteration 16 and confirmed at iteration 16)
Optimal value: $-4.983923$.

13.4 Equilibrium Problem

Given a nonempty compact set $C \subset \mathbb{R}^n$, a nonempty closed convex set $D \subset \mathbb{R}^m$,
and a function $F(x,y): C \times D \to \mathbb{R}$ which is upper semi-continuous in $x$ and
quasiconvex in $y$, the problem is to find an equilibrium of the system $(C, D, F)$, i.e.,
a point $\bar x \in \mathbb{R}^n$ satisfying
\[ \bar x \in C, \qquad F(\bar x, y) \ge 0 \ \ \forall y \in D \tag{13.39} \]
(cf. Sect. 3.3).

By Theorem 3.4 we already know that this problem has a solution, provided the
following assumption holds:

(A) There exists a closed set-valued map $\varphi$ from $D$ to $C$ with nonempty compact
values $\varphi(y) \subset C\ \forall y \in D$, such that
\[ \inf_{y\in D}\ \inf_{x\in\varphi(y)} F(x,y) \ge 0. \tag{13.40} \]

The next question that arises is how to find the equilibrium point effectively
when condition (13.40) is fulfilled.
As it turns out, this is a question of fundamental importance both for the theory
and the applications of mathematics.

Historically, about 40 years ago the so-called combinatorial pivotal methods
were developed for computing fixed points and economic equilibria, which was
at the time a very popular subject. Subsequently, interest switched to variational
inequalities, a new class of equilibrium problems which have come to play a central
role in many theoretical and practical questions of modern nonlinear analysis.
With the rapid progress in nonlinear optimization, including global optimization,
nowadays equilibrium problems and, in particular, variational inequalities can be
approached and solved as equivalent optimization problems.

13.4.1 Variational Inequality

Let $C$ be a compact convex set in $\mathbb{R}^n$ and $\varphi: \mathbb{R}^n \to \mathbb{R}^n$ a continuous mapping. Consider
the variational inequality
\[ \text{(VI)}\quad \text{Find } \bar x \in C \text{ s.t. } \langle \varphi(\bar x), x - \bar x\rangle \ge 0 \ \ \forall x \in C. \]

For $x, y \in C$ define
\[ F(x,y) = \max\{\langle x, y - z\rangle \mid z \in C\} = \langle x, y\rangle - \min_{z\in C}\langle x, z\rangle. \tag{13.41} \]

Writing $\min_{z\in C}\langle x, z\rangle = \langle x, z_x\rangle$ for some $z_x \in C$, we see that the function $F(x,y)$ is continuous
in $x$ and convex in $y$. By Theorem 3.4 there exists $\bar x \in C$ such that
\[ \inf_{y\in C} F(\bar x, y) \ge \inf_{y\in C} F(\varphi(y), y). \]

But $\inf_{y\in C} F(\bar x, y) = \inf_{y\in C}\langle \bar x, y\rangle - \min_{z\in C}\langle \bar x, z\rangle = 0$, so
\[ \inf_{y\in C} F(\varphi(y), y) \le 0. \tag{13.42} \]

On the other hand, $F(\varphi(y), y) = \langle \varphi(y), y\rangle - \min_{z\in C}\langle \varphi(y), z\rangle \ge 0\ \forall y \in C$; hence,
taking account of the continuity of $F(\varphi(y), y)$ and the compactness of $C$,
\[ \min_{y\in C} F(\varphi(y), y) = 0. \tag{13.43} \]

If $\bar y$ is a global optimal solution of this optimization problem, then $F(\varphi(\bar y), \bar y) =
0$, i.e., $0 = \langle \varphi(\bar y), \bar y\rangle - \min_{z\in C}\langle \varphi(\bar y), z\rangle$, hence $\langle \varphi(\bar y), \bar y\rangle = \min_{y\in C}\langle \varphi(\bar y), y\rangle$, i.e.,
$\langle \varphi(\bar y), y - \bar y\rangle \ge 0\ \forall y \in C$, which means $\bar y$ is a solution to (VI). Conversely, it is
easily seen that any solution $\bar y$ of (VI) satisfies $F(\varphi(\bar y), \bar y) = 0$.

Thus, under the stated assumptions the variational inequality (VI) is solvable, and
any solution of it can be found by solving the global optimization problem
\[ \min\{F(\varphi(y), y) \mid y \in C\}, \tag{13.44} \]
where
\[ F(\varphi(y), y) = \langle \varphi(y), y\rangle - \min_{z\in C}\langle \varphi(y), z\rangle \ge 0 \ \ \forall y \in C. \tag{13.45} \]

It is important to note that the optimization problem (13.44) should be solved to
global optimality: $F(\varphi(y), y)$ is generally nonconvex, and a local minimizer
$\bar y$ of it over $C$ may satisfy $F(\varphi(\bar y), \bar y) > 0$, hence may fail to provide a solution to the
variational inequality.

A function $g(y)$ such that $g(y) \ge 0\ \forall y \in C$, and $g(y) = 0$ for $y \in C$ if and
only if $y$ solves the variational inequality (VI), is often referred to as a gap function
associated with (VI).

It follows from the above that the function $F(\varphi(y), y)$ defined by (13.45), which
was first introduced by Auslender (1976), is a gap function for (VI).
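On a polytope $C = \{z : Az \le b\}$ the Auslender gap (13.45) is evaluated by one linear program per point. A sketch (ours, with placeholder data and a toy check):

```python
import numpy as np
from scipy.optimize import linprog

def auslender_gap(phi, y, A, b):
    """g(y) = <phi(y), y> - min{ <phi(y), z> : Az <= b }   (cf. (13.45))."""
    c = phi(y)
    res = linprog(c, A_ub=A, b_ub=b, bounds=(None, None))
    return float(c @ y - res.fun)

# Toy check: phi(y) = y on the box [-1, 1]^2; the unique VI solution is y = 0.
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.ones(4)
print(auslender_gap(lambda y: y, np.zeros(2), A, b))  # 0.0 -> solves (VI)
```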
Thus, once a gap function $g(y)$ is known for the variational inequality (VI),
solving the latter reduces to solving the equation
\[ y \in C, \quad g(y) = 0, \tag{13.46} \]
which in turn is equivalent to the global optimization problem
\[ \min\{g(y) \mid y \in C\}. \tag{13.47} \]

The existence theorem for variational inequalities and the conversion of variational
inequalities into equivalent equations date back to Hartman and Stampacchia
(1966). The use of gap functions to convert variational inequalities into optimization
problems (13.47) appeared subsequently (Auslender 1976). However, in the earlier
period, when global optimization methods were underdeveloped, the converted
optimization problem was solved mainly by local optimization methods, i.e., by
methods seeking a local minimizer, or even just a stationary point satisfying necessary
optimality conditions. Since the function $g(y)$ may not be convex, such points do not
always satisfy the required inequality in (VI) for all $y \in C$, so strictly speaking they
do not solve the variational inequality. Therefore, additional ad hoc manipulations
have often to be used to guarantee the global optimality of the solution found.
Aside from the function (13.45) introduced by Auslender, many other gap
functions have been developed and analyzed in the last two decades: Fukushima
(1992), Zhu and Marcotte (1994), Mastroeni (2003), among others. For a gap function to be
practically useful it should possess some desirable properties making it amenable to
currently available optimization methods. In particular, it should be differentiable
if differentiable optimization methods are to be used. From this point of view, the
following function, discovered independently by Auchmuty (1989) and Fukushima
(1992), is of particular interest:
\[ g(y) = \max_{z\in C}\Big\{ \langle \varphi(y), y - z\rangle - \frac12\langle z - y, M(z - y)\rangle \Big\}, \tag{13.48} \]
where $M$ is an arbitrary $n \times n$ symmetric positive definite matrix. Let
\[ h(z) = \langle \varphi(y), y - z\rangle - \frac12\langle z - y, M(z - y)\rangle. \tag{13.49} \]

For every fixed $y \in C$, since $h(z)$ is a strictly concave quadratic function, it has a
unique maximizer $u = u(y)$ on the compact convex set $C$, which is also the unique minimizer
of the strictly convex quadratic function $-h(z)$ on $C$ and hence is completely
defined by the condition $0 \in -\nabla h(u) + N_C(u)$ (see Proposition 2.31), i.e.,
\[ \langle M(u(y) - y) + \varphi(y),\ z - u(y)\rangle \ge 0 \ \ \forall z \in C. \tag{13.50} \]

Proposition 13.16 (Fukushima 1992) The function (13.48) is a gap function for
the variational inequality (VI). Furthermore, a vector $\bar y \in C$ solves (VI) if and only
if $\bar y = u(\bar y)$, i.e., if and only if $\bar y$ is a fixed point of the mapping $u: C \to C$ that
carries each point $y \in C$ to the maximizer $u(y)$ of the function $h(z)$ over the set $C$.

Proof Clearly $h(y) = 0$, so $g(y) = \max_{z\in C} h(z) \ge h(y) = 0\ \forall y \in C$. We show
that $\bar y \in C$ solves (VI) if and only if $g(\bar y) = 0$. Since $g(y) = h(u(y))$ we have
$g(y) = \langle \varphi(y), y - u(y)\rangle - \frac12\langle u(y) - y, M(u(y) - y)\rangle$. It follows that $g(\bar y) = 0$ for
$\bar y \in C$ if and only if $\bar y = u(\bar y)$, hence, by (13.50), if and only if $\langle \varphi(\bar y), z - \bar y\rangle \ge 0\ \forall z \in C$,
i.e., if and only if $\bar y$ is a solution of (VI). ∎

It has also been proved in Fukushima (1992) that under mild assumptions on
continuity and differentiability of $\varphi(y)$, the function (13.48) is continuous and
differentiable and any stationary point of it on $C$ solves (VI). Consequently,
with the special gap function (13.48) and under appropriate assumptions, (VI) can
be solved by using conventional differentiable local optimization methods to find a
stationary point of the corresponding problem (13.47).
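With $M = I$, the maximizer $u(y)$ in (13.48) is the Euclidean projection of $y - \varphi(y)$ onto $C$, so the fixed-point test of Proposition 13.16 is easy to run when $C$ is a box. A sketch under these simplifying assumptions (ours):

```python
import numpy as np

def u_of_y(phi, y, lo, hi):
    """u(y) = proj_C(y - phi(y)) for M = I and a box C = [lo, hi]^n."""
    return np.clip(y - phi(y), lo, hi)

def fukushima_gap(phi, y, lo, hi):
    u = u_of_y(phi, y, lo, hi)
    d = u - y
    return float(phi(y) @ (y - u) - 0.5 * d @ d)   # h(u(y)), cf. (13.48)-(13.49)

# Toy check: phi(y) = y - 1 on C = [0, 2]^2; y = (1, 1) is a fixed point of u.
phi = lambda y: y - 1.0
y = np.ones(2)
print(np.allclose(u_of_y(phi, y, 0.0, 2.0), y),    # True: y = u(y)
      fukushima_gap(phi, y, 0.0, 2.0))             # 0.0: y solves (VI)
```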
The next proposition indicates two important cases when the problem (13.44)
using the gap function (13.45) can be solved by well-developed global optimization
methods.

Proposition 13.17 (i) If the map $\varphi: \mathbb{R}^n \to \mathbb{R}^n$ is affine, then the gap
function (13.45) is a dc function and the corresponding optimization problem
equivalent to (VI) is a dc optimization problem.
(ii) If $C \subset \mathbb{R}^n_+$ and each function $\varphi_i(y): \mathbb{R}^n_+ \to \mathbb{R}_+$, $i = 1,\dots,n$, is an increasing
function, then the gap function (13.45) is a dm function and the corresponding
optimization problem equivalent to (VI) is a dm optimization problem.

Proof (i) If $\varphi(y)$ is affine, then $\langle \varphi(y), y\rangle$ is a quadratic function, while
$\min_{z\in C}\langle \varphi(y), z\rangle$ is a concave function, so (13.45) is a dc function.
(ii) If every function $\varphi_i(y)$ is increasing, then for any $y, y' \in \mathbb{R}^n_+$ such that $y' \ge y$, by
writing
\[ \langle \varphi(y'), y'\rangle - \langle \varphi(y), y\rangle = \langle \varphi(y') - \varphi(y), y'\rangle + \langle \varphi(y), y' - y\rangle, \]
it is easily seen that $\langle \varphi(y), y\rangle$ is an increasing function. Since it is also obvious
that $\min_{z\in C}\langle \varphi(y), z\rangle$ is an increasing function, (13.45) is a dm function. ∎

In case (i), when in addition $C$ is a polytope, (VI) is called an affine
variational inequality.

13.5 Optimization with Variational Inequality Constraint

A class of MPECs which has attracted much research in recent years consists of
optimization problems with a variational inequality constraint. These are problems
of the form
\[
\text{(OVI)}\quad \text{minimize } f(x,y)\ \ \text{subject to}\ \ (x,y) \in U,\ y \in C,\
\langle \varphi(x,y), z - y\rangle \ge 0 \ \ \forall z \in C,
\]
where $f(x,y): \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ is a continuous function, $U \subset \mathbb{R}^n \times \mathbb{R}^m$ a nonempty
closed set, and $C \subset \mathbb{R}^m$ a nonempty compact set. For each given $x \in X := \{x \in
\mathbb{R}^n \mid (x,y) \in U \text{ for some } y \in \mathbb{R}^m\}$, the set $S(x) := \{y \in C \mid \langle \varphi(x,y), z - y\rangle \ge 0\ \forall z \in C\}$
is the solution set of a variational inequality. So (OVI) is effectively an MPEC as
defined at the beginning of this chapter.

Among the best known solution methods for MPECs let us mention the
nonsmooth approach by Outrata et al. (1998), and the implicit programming approach
and the penalty interior point method by Luo et al. (1996).

Using a gap function $g(y)$ we can convert (OVI) into an equivalent optimization
problem as follows:

Proposition 13.18 (OVI) is equivalent to the optimization problem
\[ \min f(x,y)\ \ \text{s.t.}\ \ (x,y) \in U,\ y \in C,\ g(y) \le 0. \]

Proof Immediate, because by definition of a gap function, $y \in C$ solves (VI) if and
only if $g(y) = 0$. ∎
Of particular interest is (OVI) when (VI) is an affine variational inequality, i.e.,
when $\varphi$ is an affine mapping and $C$ is a polytope. The problem is then an MPEC with
an affine equilibrium constraint and is referred to as MPAEC:
\[
\text{(MPAEC)}\quad \text{minimize } f(x,y)\ \ \text{subject to}\ \ (x,y) \in U,\ y \in C,\
\langle Ax + By + c, z - y\rangle \ge 0 \ \ \forall z \in C,
\]
where $f(x,y): \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ is a convex function, $U \subset \mathbb{R}^n \times \mathbb{R}^m$ a nonempty closed
set, $A, B$ are appropriate real matrices, $c \in \mathbb{R}^m$, and $C \subset \mathbb{R}^m$ a nonempty polytope.

By Proposition 13.18, using a gap function $g(x,y)$ we can reformulate (MPAEC)
as the optimization problem
\[ \min f(x,y)\ \ \text{s.t.}\ \ (x,y) \in U,\ g(x,y) \le 0. \tag{13.51} \]

If the gap function used is (13.48), i.e.,
\[ g(x,y) = \max_{z\in C}\Big\{ \langle Ax + By + c, y - z\rangle - \frac12\langle z - y, M(z - y)\rangle \Big\}, \]
where $M$ is a symmetric $m \times m$ positive definite matrix, then (13.51) is a differentiable
optimization problem for which, as shown by Fukushima (1992), even a stationary
point yields a global optimal solution, hence a solution to MPAEC. Exploiting this
property, which is rather exceptional among gap functions, descent methods and other
differentiable (local) optimization methods can be developed for solving MPAEC:
see Fukushima (1992), and also Muu-Quoc-An-Tao (2012), among others.

If the gap function (13.45) is used, we have
\[ g(x,y) = \langle Ax + By + c, y\rangle - \min_{z\in C}\langle Ax + By + c, z\rangle, \]
which is a dc function (more precisely, the difference of a quadratic function
and a concave function). In this case, (13.51) is a convex minimization problem under dc
constraints, i.e., a tractable global optimization problem.

Since the feasible set of problem (13.51) is nonconvex, an efficient way to
solve it, as we saw in Chap. 7, Sect. 7.4, is to proceed according to the following
SIT scheme.

For any real number $\gamma$ denote by (SP$_\gamma$) the auxiliary problem of finding a feasible
solution $(x,y)$ of (MPAEC) such that $f(x,y) \le \gamma$. This problem is a dc optimization
problem under convex constraints, since it can be formulated as
\[ \text{(SP}_\gamma)\quad \min g(x,y)\ \ \text{s.t.}\ \ (x,y) \in U,\ f(x,y) \le \gamma. \tag{13.52} \]

Consequently, as argued in Sect. 6.2, it should be much easier to handle than (13.51),
whose feasible set is nonconvex. In fact this is the rationale for trying to reduce
(MPAEC) to a sequence of subproblems (SP$_\gamma$).
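A compact sketch of one (SP$_\gamma$) solve for the dc gap (13.45), assuming $f$ linear and $U$, $C$ boxes, and using a generic local solver as a stand-in for the dc machinery of Chap. 7 (illustration only; a local solver does not certify global optimality of (SP$_\gamma$)):

```python
import numpy as np
from scipy.optimize import minimize

def solve_SP(gamma, f_vec, A, B, c, Clo, Chi, Ulo, Uhi):
    """Try to find (x, y) in U with f <= gamma minimizing the gap (13.52)."""
    n = len(Ulo) - len(Clo)         # split of the variable v = (x, y)

    def gap(v):
        x, y = v[:n], v[n:]
        w = A @ x + B @ y + c
        # min <w, z> over the box C has a closed form.
        zmin = np.where(w >= 0, Clo, Chi)
        return w @ y - w @ zmin

    cons = [{'type': 'ineq', 'fun': lambda v: gamma - f_vec @ v}]
    bnds = list(zip(Ulo, Uhi))
    v0 = np.array([(l + u) / 2 for l, u in bnds])
    res = minimize(gap, v0, bounds=bnds, constraints=cons, method='SLSQP')
    return res.x, res.fun           # gap near 0 => feasible point found
```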
Given a tolerance $\eta > 0$, a feasible solution $(\bar{x}, \bar{y})$ of (MPAEC) is said to be an $\eta$-optimal solution if $f(\bar{x}, \bar{y}) \le f(x,y) + \eta$ for every feasible solution $(x,y)$.
Basically, the SIT approach consists in replacing (MPAEC) by a sequence of auxiliary problems $(\mathrm{SP}_\gamma)$ where the parameter $\gamma$ is gradually adjusted until a suitable value is reached for which, by solving the corresponding $(\mathrm{SP}_\gamma)$, an $\eta$-optimal solution of (MPAEC) will be found.
Suppose $(x^0, y^0)$ is the best feasible solution available at a given stage and let $\gamma_0 = f(x^0, y^0) - \eta$. Solving the problem $(\mathrm{SP}_{\gamma_0})$ leads to two alternatives: either a solution $(x^1, y^1)$ is obtained, or evidence is found that $(\mathrm{SP}_{\gamma_0})$ is infeasible. In the former alternative, we set $(x^0, y^0) \leftarrow (x^1, y^1)$ and repeat, with $\gamma_1 = f(x^1, y^1) - \eta$ in place of $\gamma_0$. In the latter alternative, $f(x^0, y^0) \le f(x,y) + \eta$ for any feasible solution $(x,y)$ of MPAEC, so $(x^0, y^0)$ is an $\eta$-optimal solution as desired and we stop. In that way each iteration improves the objective function value by at least $\eta > 0$, so the procedure must terminate after finitely many iterations, eventually yielding an $\eta$-optimal solution.
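This outer loop can be sketched in a few lines of code (a hypothetical oracle solve_SP is assumed, returning a feasible pair of (MPAEC) with $f \le \gamma$, or None when $(\mathrm{SP}_\gamma)$ is infeasible):

# Hedged sketch of the outer SIT loop; solve_SP(gamma) is an assumed oracle
# returning a feasible (x, y) of (MPAEC) with f(x, y) <= gamma, or None.
def sit(f, solve_SP, x0, y0, eta):
    x, y = x0, y0                    # best feasible solution so far
    while True:
        sol = solve_SP(f(x, y) - eta)
        if sol is None:              # (SP_gamma) infeasible: (x, y) is eta-optimal
            return x, y
        x, y = sol                   # objective improved by at least eta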
Formally, we can state the following algorithm.
By translating if necessary, it can be assumed that $U$ is a compact convex subset of a hyperrectangle $[a,b] \subset \mathbb{R}^n_+ \times \mathbb{R}^m_+$ and $C$ is a compact convex subset of $\mathbb{R}^m_+$.
A vector $u = (x,y) \in U$ is said to be $\varepsilon$-feasible to (MPAEC) if it satisfies $g(u) \le \varepsilon$; a vector $\bar{u} = (\bar{x}, \bar{y}) \in U$ is called an $(\varepsilon, \eta)$-optimal solution if it is $\varepsilon$-feasible and satisfies $f(\bar{u}) \le f(u) + \eta$ for all $\varepsilon$-feasible solutions $u$.

Algorithm for (MPAEC)

Select tolerances $\varepsilon > 0$, $\eta > 0$. Let $\gamma_0 = f(a)$.
Step 0. Set $\mathcal{P}_1 = \{M_1\}$, $M_1 = [a,b]$, $\mathcal{R}_1 = \emptyset$, $\gamma = \gamma_0$. Set $k = 1$.
Step 1. For each box (hyperrectangle) $M \in \mathcal{P}_k$:
Reduce $M$, i.e., find a box $[p,q] \subset M$ as small as possible satisfying $\min\{g(u) \mid f(u) \le \gamma, \, u \in [p,q]\} = \min\{g(u) \mid f(u) \le \gamma, \, u \in M\}$. Set $M \leftarrow [p,q]$.
Compute a lower bound $\beta(M)$ for $g(u)$ over the feasible solutions in $[p,q]$.

Delete every $M$ such that $\beta(M) > \varepsilon$.
Step 2. Let $\mathcal{P}'_k$ be the collection of boxes that results from $\mathcal{P}_k$ after completion of Step 1. Let $\mathcal{R}'_k = \mathcal{R}_k \cup \mathcal{P}'_k$.
Step 3. If $\mathcal{R}'_k = \emptyset$, then terminate: if $\gamma = \gamma_0$, the problem (MPAEC) is $\varepsilon$-infeasible (has no $\varepsilon$-feasible solutions); if $\gamma = f(\bar{u}) - \eta$ for some $\varepsilon$-feasible solution $\bar{u}$, then $\bar{u}$ is an $(\varepsilon,\eta)$-optimal solution of (MPAEC).
Step 4. If $\mathcal{R}'_k \ne \emptyset$, let $M_k \in \operatorname{argmin}\{\beta(M) \mid M \in \mathcal{R}'_k\}$ and $\beta_k = \beta(M_k)$. Determine $u^k \in M_k$ and $v^k \in M_k$ such that
\[
f(u^k) \le \gamma, \qquad g(v^k) - \beta(M_k) = o(\|u^k - v^k\|). \tag{13.53}
\]
If $g(u^k) < \varepsilon$, go to Step 5. If $g(u^k) \ge \varepsilon$, go to Step 6.
Step 5. $u^k$ is an $\varepsilon$-feasible solution satisfying $f(u^k) \le \gamma$. If $\gamma = \gamma_0$, reset $\gamma \leftarrow f(u^k) - \eta$, $\bar{u} \leftarrow u^k$, and go to Step 6. If $\gamma = f(\bar{u}) - \eta$ and $f(u^k) < f(\bar{u})$, reset $\gamma \leftarrow f(u^k) - \eta$, $\bar{u} \leftarrow u^k$, and go to Step 6.
Step 6. Divide $M_k$ into two subboxes by the adaptive bisection, i.e., bisect $M_k$ via $(w^k, j_k)$, where $w^k = \frac{1}{2}(u^k + v^k)$ and $j_k \in \operatorname{argmax}\{|v^k_j - u^k_j| : j = 1, \ldots, n\}$. Let $\mathcal{P}_{k+1}$ be the collection of these two subboxes of $M_k$, and set $\mathcal{R}_{k+1} = \mathcal{R}'_k \setminus \{M_k\}$. Increment $k$ and return to Step 1.
Proposition 13.19 The above SIT algorithm terminates after finitely many iterations, yielding either an $(\varepsilon,\eta)$-optimal solution of MPAEC, or evidence that the problem is $\varepsilon$-infeasible.
Proof Immediate. $\square$
Remark 13.5 If a feasible solution can be found by some fast local optimization method, it should be used to initialize the algorithm. Also, if for some reason the algorithm has to be stopped prematurely, some reasonably good feasible solution may already have been obtained. This is in contrast with other algorithms, which would become useless in that case.
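For readers who wish to experiment, here is a rough skeleton of the above branch-reduce-and-bound loop. The box reduction of Step 1, the lower bounding, and the choice of the points $u^k, v^k$ in (13.53) are problem specific, so they appear as user-supplied oracles; this is only an illustrative sketch, not the book's implementation.

# Hedged skeleton of the SIT branch-and-bound algorithm for (MPAEC).
# reduce_box, lower_bound and select_uv are assumed user-supplied oracles
# implementing Step 1 and condition (13.53); a box is a pair (p, q) of arrays
# and u = (x, y) is handled as one concatenated vector.
import numpy as np

def mpaec_sit(f, g, a, b, reduce_box, lower_bound, select_uv, eps, eta):
    gamma, best = f(a), None                        # gamma_0 = f(a)
    current = [(np.asarray(a, float), np.asarray(b, float))]
    saved = []                                      # R_k
    while True:
        # Step 1: reduce and bound new boxes, delete those with beta(M) > eps
        for M in current:
            M = reduce_box(M, gamma)
            if lower_bound(M, gamma) <= eps:
                saved.append(M)                     # Step 2: R'_k
        if not saved:                               # Step 3: terminate
            return best                             # None <=> eps-infeasible
        # Step 4: most promising box and the points u^k, v^k of (13.53)
        i = min(range(len(saved)), key=lambda j: lower_bound(saved[j], gamma))
        p, q = saved.pop(i)
        u, v = select_uv((p, q), gamma)
        if g(u) < eps and (best is None or f(u) < f(best)):
            best, gamma = u, f(u) - eta             # Step 5: new incumbent
        # Step 6: adaptive bisection of M_k via (w, j)
        w = 0.5 * (u + v)
        j = int(np.argmax(np.abs(v - u)))
        q1, p2 = q.copy(), p.copy()
        q1[j] = p2[j] = w[j]
        current = [(p, q1), (p2, q)]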

13.6 Exercises

1 Solve the Linear Bilevel Program (Tuy et al. 1993a):
\[
\begin{array}{l}
\min \ -2x_1 + x_2 + 0.5 y_1 \quad \text{s.t.} \\
\quad x_1, x_2 \ge 0, \ y = (y_1, y_2) \ \text{solves} \\
\qquad \min \ -4y_1 + y_2 \quad \text{s.t.} \\
\qquad\quad -2x_1 + y_1 - y_2 \le -2.5 \\
\qquad\quad x_1 - 3x_2 + y_2 \le 2 \\
\qquad\quad y_1, y_2 \ge 0.
\end{array}
\]

Hints: a non-optimal feasible solution is $x = (1, 0)$, $y = (0.5, 1)$.
2 Solve the Convex Bilevel Program (Floudas 2000, Sect. 9.3):
\[
\begin{array}{l}
\min \ (x - 5)^2 + (2y + 1)^2 \quad \text{s.t.} \\
\quad x \ge 0, \ y \ \text{solves} \\
\qquad \min \ (y - 1)^2 - 1.5xy \quad \text{s.t.} \\
\qquad\quad -3x + y \le -3 \\
\qquad\quad x - 0.5y \le 4 \\
\qquad\quad x + y \le 7 \\
\qquad\quad x \ge 0, \ y \ge 0.
\end{array}
\]
Hints: a known feasible solution is $x = 1$, $y = 0$.


3 Show that any linear bilevel program is an MPAEC.
4 Solve the MPAEC equivalent to the LBP in Exercise 1.
5 Consider the MPAEC
\[
\begin{array}{lll}
\text{minimize} & f(x, y) & (1) \\
\text{subject to} & (x, y) \in U, \ y \in C, & (2) \\
& \langle Ax + By + c, \, z - y \rangle \ge 0 \quad \forall z \in C, & (3)
\end{array}
\]
where $U \subset \mathbb{R}^{n+m}$ and $C \subset \mathbb{R}^n$ are two nonempty closed convex sets, $f(x,y)$ is a convex function on $\mathbb{R}^{n+m}$, while $A, B$ are given appropriate real matrices and $c \in \mathbb{R}^n$.
Show that if $B$ is a symmetric positive semidefinite matrix then this problem reduces to a convex bilevel program.
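Hint (a sketch of the intended argument, not from the book): when $B$ is symmetric positive semidefinite, condition (3) is exactly the optimality condition of a convex program in $y$. Indeed, setting
\[
h_x(z) = \tfrac{1}{2} \langle z, Bz \rangle + \langle Ax + c, \, z \rangle, \qquad \nabla h_x(z) = Bz + Ax + c,
\]
condition (3) reads $\langle \nabla h_x(y), \, z - y \rangle \ge 0$ for all $z \in C$, i.e., $y$ minimizes the convex function $h_x$ over $C$, so (1)-(3) is a bilevel program with a convex lower level problem.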
6 Let $g(x, y) = \max_{z \in C}\{\langle Ax + By + c, \, y - z \rangle - \frac{1}{2} \langle z - y, \, G(z - y) \rangle\}$, where $G$ is an arbitrary $n \times n$ symmetric positive definite matrix. Show that:
(i) $g(x, y) \ge 0$ for all $(x, y)$ with $y \in C$;
(ii) $(x, y) \in U$, $y \in C$, $g(x, y) = 0$ if and only if $(x, y)$ satisfies the conditions (2) and (3) in Exercise 5.
In other words, the MPAEC in Exercise 5 is equivalent to the dc optimization problem
\[
\min f(x, y) \quad \text{s.t.} \quad (x, y) \in U, \ y \in C, \ g(x, y) \le 0.
\]
References

Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows: Theory, Algorithms and Applications. Prentice Hall, Englewood Cliffs, NJ (1993)
Alexandrov, A.D.: On surfaces which may be represented by a difference of convex functions. Izvestiya Akademii Nauk Kazakhskoj SSR, Seria Fiziko-Matematicheskikh 3, 3–20 (1949, Russian)
Alexandrov, A.D.: On surfaces which may be represented by differences of convex functions. Dokl. Akad. Nauk SSR 72, 613–616 (1950, Russian)
Al-Khayyal, F.A., Falk, J.E.: Jointly constrained biconvex programming. Math. Oper. Res. 8, 273–286 (1983)
Al-Khayyal, F.A., Larsen, C., Van Voorhis, T.: A relaxation method for nonconvex quadratically constrained quadratic programs. J. Glob. Optim., 215–230 (1995)
Al-Khayyal, F.A., Tuy, H., Zhou, F.: DC optimization methods for multisource location problems. School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta (1997). Preprint
Al-Khayyal, F.A., Tuy, H., Zhou, F.: Large-scale single facility continuous location by DC optimization. Optimization 51, 271–292 (2002)
An, L.T.H.: Solving large scale molecular distance geometry problems by a smoothing technique via the Gaussian transform and D.C. programming. J. Glob. Optim. 27, 375–397 (2003)
An, L.T.H., Tao, P.D.: The DC (difference of convex) programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann. Oper. Res. 133, 23–46 (2005)
An, L.T.H., Tao, P.D., Muu, L.D.: Simplicially-constrained DC optimization over the efficient and weakly efficient set. J. Optim. Theory Appl. 117, 503–531 (2003)
An, L.T.H., Belghiti, M.T., Tao, P.D.: A new efficient algorithm based on DC programming and DCA for clustering. J. Glob. Optim. 37, 557–569 (2007)
An, L.T.H., Tao, P.D., Nam, N.C., Muu, L.D.: Methods for optimization over the efficient and weakly efficient sets of an affine fractional vector optimization problem. Optimization 59, 77–93 (2010)
Androulakis, I.P., Maranas, C.D., Floudas, C.A.: αBB: a global optimization method for general constrained nonconvex problems. J. Glob. Optim. 7, 337–363 (1995)
Apkarian, P., Tuan, H.D.: Concave programming in control theory. J. Glob. Optim. 15, 343–370 (1999)
Asplund, E.: Differentiability of the metric projection in finite dimensional Euclidean space. Proc. AMS 38, 218–219 (1973)
Astorino, A., Miglionico, G.: Optimizing sensor cover energy via DC programming. Optim. Lett. (2014). doi:10.1007/s11590-014-0778-y


Aubin, J.-P., Ekeland, I.: Applied Nonlinear Analysis. John Wiley & Sons, New York (1984)
Auchmuty, G.: Variational principles for variational inequalities. Numer. Funct. Anal. Optim. 10, 863–874 (1989)
Auslender, A.: Optimisation, Méthodes Numériques. Masson, Paris (1976)
Bach-Kim, N.T.: An algorithm for optimization over the efficient set. Vietnam J. Math. 28, 329–340 (2000)
Baer, E., et al.: Biological and synthetic hierarchical composites. Phys. Today 46, 60–67 (1992)
Balas, E.: Intersection cuts – a new type of cutting planes for integer programming. Oper. Res. 19, 19–39 (1971)
Balas, E.: Integer programming and convex analysis: intersection cuts and outer polars. Math. Program. 2, 330–382 (1972)
Balas, E., Burdet, C.A.: Maximizing a convex quadratic function subject to linear constraints. Management Science Research Report No. 299, Carnegie-Mellon University (1973)
Bard, J.F.: Practical Bilevel Optimization: Algorithms and Applications. Kluwer, Dordrecht (1998)
Bard, J.F.: Convex two-level optimization. Math. Program. 40, 15–27 (1988)
Ben-Ayed, O.: Bilevel linear programming. Comput. Oper. Res. 20, 485–501 (1993)
Bengtsson, M., Ottersten, B.: Optimum and suboptimum transmit beamforming. In: Handbook of Antennas in Wireless Communications, chap. 18. CRC, Boca Raton (2001)
Benson, H.P.: Optimization over the efficient set. J. Math. Anal. Appl. 98, 562–580 (1984)
Benson, H.P.: An algorithm for optimizing over the weakly-efficient set. Eur. J. Oper. Res. 25, 192–199 (1986)
Benson, H.P.: An all-linear programming relaxation algorithm for optimization over the efficient set. J. Glob. Optim. 1, 83–104 (1991)
Benson, H.P.: A bisection-extreme point search algorithm for optimizing over the efficient set in the linear dependence case. J. Glob. Optim. 3, 95–111 (1993)
Benson, H.P.: Concave minimization: theory, applications and algorithms. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 43–148. Kluwer Academic Publishers, Dordrecht (1995)
Benson, H.P.: Deterministic algorithms for constrained concave minimization: a unified critical survey. Nav. Res. Logist. 43, 765–795 (1996)
Benson, H.P.: An outer approximation algorithm for generating all efficient extreme points in the outcome set of a multiple objective linear programming problem. J. Glob. Optim. 13, 1–24 (1998)
Benson, H.P.: An outcome space branch and bound outer approximation algorithm for convex multiplicative programming. J. Glob. Optim. 15, 315–342 (1999)
Benson, H.P.: A global optimization approach for generating efficient points for multiplicative concave fractional programming. J. Multicrit. Decis. Anal. 13, 15–28 (2006)
Ben-Tal, A., Eiger, G., Gershovitz, V.: Global minimization by reducing the duality gap. Math. Program. 63, 193–212 (1994)
Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization. MPS-SIAM Series on Optimization. SIAM, Philadelphia (2001)
Bentley, J.L., Shamos, M.I.: Divide and conquer for linear expected time. Inf. Proc. Lett. 7, 87–91 (1978)
Bialas, W.F., Karwan, C.H.: On two-level optimization. IEEE Trans. Autom. Control AC-27, 211–214 (1982)
Bjornson, E., Jorswieck, E.: Optimal resource allocation in coordinated multi-cell systems. Foundations and Trends in Communications and Information Theory, vol. 9(2–3), pp. 113–381 (2012). doi:10.1561/0100000069
Bjornson, E., Zheng, G., Bengtsson, M., Ottersten, B.: Robust monotonic optimization framework for multicell MISO systems. IEEE Trans. Signal Process. 60(5), 2508–2521 (2012)
Blum, E., Oettli, W.: From optimization and variational inequalities to equilibrium problems. Math. Student 63, 127–149 (1994)
Bolintineanu, S.: Minimization of a quasi-concave function over an efficient set. Math. Program. 61, 83–120 (1993a)

Bolintineanu, S.: Optimality condition for minimization over the (weakly or properly) efficient set. J. Math. Anal. Appl. 173, 523–541 (1993b)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge, United Kingdom (2004)
Bracken, J., McGill, J.: Mathematical programs with optimization problems in the constraints. Oper. Res. 21, 37–44 (1973)
Bracken, J., McGill, J.: A method for solving mathematical programs with nonlinear programs in the constraints. Oper. Res. 22, 1097–1101 (1974)
Bradley, P.S., Mangasarian, O.L., Street, W.N.: Clustering via concave minimization. Technical Report 96-03. Computer Science Department, University of Wisconsin (1997)
Brimberg, J., Love, R.F.: A location problem with economies of scale. Stud. Locat. Anal. (7), 9–19 (1994)
Bui, A.: Etude analytique d'algorithmes distribués de routage. Thèse de doctorat, Université Paris VII (1993)
Bulatov, V.P.: Embedding Methods in Optimization Problems. Nauka, Novosibirsk (1977, Russian)
Bulatov, V.P.: Methods for solving multiextremal problems (global search). In: Beltiukov, B.A., Bulatov, V.P. (eds.) Methods of Numerical Analysis and Optimization. Nauka, Novosibirsk (1987, Russian)
Candler, W., Townsley, R.: A linear two-level programming problem. Comput. Oper. Res. 9, 59–76 (1982)
Caristi, J.: Fixed point theorems for mappings satisfying inwardness conditions. Trans. Am. Math. Soc. 215, 241–251 (1976)
Charnes, A., Cooper, W.W.: Programming with linear fractional functionals. Nav. Res. Logist. 9, 181–186 (1962)
Chen, P.C., Hansen, P., Jaumard, B.: On-line and off-line vertex enumeration by adjacency lists. Oper. Res. Lett. 10, 403–409 (1991)
Chen, P.-C., Hansen, P., Jaumard, B., Tuy, H.: Weber's problem with attraction and repulsion. J. Regional Sci. 32, 467–486 (1992)
Chen, P.-C., Hansen, P., Jaumard, B., Tuy, H.: Solution of the multifacility Weber and conditional Weber problems by D.C. programming. Cahier du GERAD G-92-35. Ecole Polytechnique, Montréal (1992b)
Chen, P.C., Hansen, P., Jaumard, B., Tuy, H.: Solution of the multifacility Weber and conditional Weber problems by DC programming. Oper. Res. 46, 548–562 (1998)
Cheon, M.-S., Ahmed, S., Al-Khayyal, F.: A branch-reduce-cut algorithm for the global optimization of probabilistically constrained linear programs. Math. Program. Ser. B 108, 617–634 (2006)
Cheney, E.W., Goldstein, A.A.: Newton's method for convex programming and Tchebycheff approximation. Numer. Math. 1, 253–268 (1959)
Cooper, L.: Location-allocation problems. Oper. Res. 11, 331–343 (1963)
Dantzig, G.B.: Linear Programming and Extensions. Princeton University Press, Princeton, NJ (1963)
Dines, L.L.: On the mapping of quadratic forms. Bull. Am. Math. Soc. 47, 494–498 (1941)
Dixon, L.C.W., Szegö, G.P. (eds.): Towards Global Optimization, vol. I. North-Holland, Amsterdam (1975)
Dixon, L.C.W., Szegö, G.P. (eds.): Towards Global Optimization, vol. II. North-Holland, Amsterdam (1978)
Duffin, R.J., Peterson, E.L., Zener, C.: Geometric Programming – Theory and Application. Wiley, New York (1967)
Eaves, B.C., Zangwill, W.I.: Generalized cutting plane algorithms. SIAM J. Control 9, 529–542 (1971)
Ecker, J.G., Song, H.G.: Optimizing a linear function over an efficient set. J. Optim. Theory Appl. 83, 541–563 (1994)
Ekeland, I.: On the variational principle. J. Math. Anal. Appl. 47, 324–353 (1974)

Ekeland, I., Temam, R.: Convex Analysis and Variational Problems. North-Holland, Amsterdam (1976)
Ellaia, R.: Contribution à l'analyse et l'optimisation de différences de fonctions convexes. Thèse de troisième cycle, Université de Toulouse (1984)
Falk, J.E.: Lagrange multipliers and nonconvex problems. SIAM J. Control 7, 534–545 (1969)
Falk, J.E.: Conditions for global optimality in nonlinear programming. Oper. Res. 21, 337–340 (1973a)
Falk, J.E.: A linear max-min problem. Math. Program. 5, 169–188 (1973b)
Falk, J.E., Hoffman, K.L.: A successive underestimation method for concave minimization problems. Math. Oper. Res. 1, 251–259 (1976)
Falk, J.E., Hoffman, K.L.: Concave minimization via collapsing polytopes. Oper. Res. 34, 919–929 (1986)
Falk, J.E., Palocsay, S.W.: Optimizing the sum of linear fractional functions. In: Floudas, C., Pardalos, P. (eds.) Global Optimization, pp. 221–258. Princeton University Press, Princeton (1992)
Falk, J.E., Soland, R.M.: An algorithm for separable nonconvex programming problems. Manag. Sci. 15, 550–569 (1969)
Fan, K.: A minimax inequality and applications. In: Shisha (ed.) Inequalities, vol. 3, pp. 103–113. Academic, New York (1972)
Finsler, P.: Über das Vorkommen definiter und semidefiniter Formen in Scharen quadratischer Formen. Comment. Math. Helv. 9, 188–192 (1937)
Floudas, C.A., Aggarwal, A.: A decomposition strategy for global optimum search in the pooling problem. ORSA J. Comput. 2(3), 225–235 (1990)
Floudas, C.A.: Deterministic Global Optimization: Theory, Methods and Applications. Kluwer, Dordrecht, The Netherlands (2000)
Forgo, F.: The solution of a special quadratic problem (in Hungarian). Szigma, 53–59 (1975)
Foulds, L.R., Haugland, D., Jörnsten, K.: A bilinear approach to the pooling problem. Optimization 24, 165–180 (1992)
Fredman, M.L., Tarjan, R.E.: Fibonacci heaps and their uses in improved network optimization algorithms. J. Assoc. Comput. Mach. 34, 596–615 (1984)
Friesz, T.L., et al.: A simulated annealing approach to the network design problem with variational inequality constraints. Transp. Sci. 28, 18–26 (1992)
Fujiwara, O., Khang, D.B.: A two-phase decomposition method for optimal design of looped water distribution networks. Water Resour. Res. 23, 977–982 (1990)
Fukushima, M.: Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems. Math. Program. 53, 99–110 (1992)
Gabasov, R., Kirillova, F.M.: Linear Programming Methods, Part 3 (Special Problems). Minsk (1980, Russian)
Gale, P.: An algorithm for exhaustive generation of building floor plans. Commun. ACM 24, 813–825 (1981)
Gantmacher, F.R.: The Theory of Matrices, vol. 2. Chelsea Publishing Company, New York (1960)
Ge, R.P., Qin, Y.F.: A class of filled functions for finding global minimizers of a function of several variables. J. Optim. Theory Appl. 54, 241–252 (1987)
Glover, F.: Cut search methods in integer programming. Math. Program. 3, 86–100 (1972)
Glover, F.: Convexity cuts and cut search. Oper. Res. 21, 123–134 (1973a)
Glover, F.: Concave programming applied to a special class of 0-1 integer programs. Oper. Res. 21, 135–140 (1973b)
Graham, R.L.: An efficient algorithm for determining the convex hull of a finite planar set. Inf. Proc. Lett. 1, 132–133 (1972)
Groch, A., Vidigal, L., Director, S.: A new global optimization method for electronic circuit design. IEEE Trans. Circuits Syst. 32, 160–179 (1985)
Guisewite, G.M., Pardalos, P.M.: A polynomial time solvable concave network flow problem. Networks 23, 143–148 (1992)

Haims, M.J., Freeman, H.: A multistage solution of the template-layout problem. IEEE Trans. Syst. Sci. Cybern. 6, 145–151 (1970)
Hamami, M., Jacobsen, S.E.: Exhaustive non-degenerate conical processes for concave minimization on convex polytopes. Math. Oper. Res. 13, 479–487 (1988)
Hansen, P., Jaumard, B., Savard, G.: New branch and bound rules for linear bilevel programming. SIAM J. Sci. Stat. Comput. 13, 1194–1217 (1992)
Hansen, P., Jaumard, B., Tuy, H.: Global optimization in location. In: Drezner, Z. (ed.) Facility Location, pp. 43–68. Springer, New York (1995)
Hartman, P.: On functions representable as a difference of convex functions. Pac. J. Math. 9, 707–713 (1959)
Hartman, P., Stampacchia, G.: On some nonlinear elliptic differential functional equations. Acta Math. 115, 153–188 (1966)
Hillestad, R.J., Jacobsen, S.E.: Linear programs with an additional reverse convex constraint. Appl. Math. Optim. 6, 257–269 (1980b)
Hiriart-Urruty, J.-B.: A short proof of the variational principle for approximate solutions of a minimization problem. Am. Math. Mon. 90, 206–207 (1983)
Hiriart-Urruty, J.-B.: From convex optimization to nonconvex optimization, Part I: necessary and sufficient conditions for global optimality. In: Clarke, F.H., Demyanov, V.F., Giannessi, F. (eds.) Nonsmooth Optimization and Related Topics. Plenum, New York (1989a)
Hiriart-Urruty, J.-B.: Conditions nécessaires et suffisantes d'optimalité globale en optimisation de différences de deux fonctions convexes. C.R. Acad. Sci. Paris 309, Série I, 459–462 (1989b)
Hoai, P.T., Tuy, H.: Monotonic optimization for sensor cover energy problem. Institute of Mathematics, Hanoi (2016). Preprint
Hoai-Phuong, N.T., Tuy, H.: A unified monotonic approach to generalized linear fractional programming. J. Glob. Optim. 26, 229–259 (2003)
Hoang, H.G., Tuan, H.D., Apkarian, P.: A Lyapunov variable-free KYP lemma for SISO continuous systems. IEEE Trans. Autom. Control 53, 2669–2673 (2008)
Hoffman, K.L.: A method for globally minimizing concave functions over convex sets. Math. Program. 20, 22–32 (1981)
Holmberg, K., Tuy, H.: A production-transportation problem with stochastic demands and concave production cost. Math. Program. 85, 157–179 (1999)
Horst, R., Tuy, H.: Global Optimization (Deterministic Approaches), 3rd edn. Springer, Berlin/Heidelberg/New York (1996)
Horst, R., Thoai, N.V., Benson, H.P.: Concave minimization via conical partitions and polyhedral outer approximation. Math. Program. 50, 259–274 (1991)
Horst, R., Thoai, N.V., de Vries, J.: On finding new vertices and redundant constraints in cutting plane algorithms for global optimization. Oper. Res. Lett. 7, 85–90 (1988)
Idrissi, H., Loridan, P., Michelot, C.: Approximation for location problems. J. Optim. Theory Appl. 56, 127–143 (1988)
Jaumard, B., Meyer, C.: On the convergence of cone splitting algorithms with ω-subdivisions. J. Optim. Theory Appl. 110, 119–144 (2001)
Jeyakumar, V., Lee, G.M., Li, G.Y.: Alternative theorems for quadratic inequality systems and global quadratic optimization. SIAM J. Optim. 20, 983–1001 (2009)
Jorswieck, E.A., Larsson, E.G.: Monotonic optimization framework for the two-user MISO interference channel. IEEE Trans. Commun. 58, 2159–2168 (2010)
Kakutani, S.: A generalization of Brouwer's fixed point theorem. Duke Math. J. 8, 457–459 (1941)
Kalantari, B., Rosen, J.B.: An algorithm for global minimization of linearly constrained concave quadratic problems. Math. Oper. Res. 12, 544–561 (1987)
Karmarkar, N.: An interior-point approach for NP-complete problems. Contemp. Math. 114, 297–308 (1990)
Kelley, J.E.: The cutting plane method for solving convex programs. J. SIAM 8, 703–712 (1960)
Klinz, B., Tuy, H.: Minimum concave-cost network flow problems with a single nonlinear arc cost. In: Pardalos, P., Du, D. (eds.) Network Optimization Problems, pp. 125–143. World Scientific, Singapore (1993)

Knaster, B., Kuratowski, C., Mazurkiewicz, S.: Ein Beweis des Fixpunktsatzes für n-dimensionale Simplexe. Fund. Math. 14, 132–137 (1929)
Konno, H.: A cutting plane algorithm for solving bilinear programs. Math. Program. 11, 14–27 (1976a)
Konno, H.: Maximization of a convex function subject to linear constraints. Math. Program. 11, 117–127 (1976b)
Konno, H., Abe, N.: Minimization of the sum of three linear fractions. J. Glob. Optim. 15, 419–432 (1999)
Konno, H., Kuno, T.: Generalized linear and fractional programming. Ann. Oper. Res. 25, 147–162 (1990)
Konno, H., Kuno, T.: Linear multiplicative programming. Math. Program. 56, 51–64 (1992)
Konno, H., Kuno, T.: Multiplicative programming problems. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 369–405. Kluwer, Dordrecht, The Netherlands (1995)
Konno, H., Yajima, Y.: Minimizing and maximizing the product of linear fractional functions. In: Floudas, C., Pardalos, P. (eds.) Recent Advances in Global Optimization, pp. 259–273. Princeton University Press, Princeton (1992)
Konno, H., Yamashita, Y.: Minimization of the sum and the product of several linear fractional functions. Nav. Res. Logist. 44, 473–497 (1997)
Konno, H., Yajima, Y., Matsui, T.: Parametric simplex algorithms for solving a special class of nonconvex minimization problems. J. Glob. Optim. 1, 65–82 (1991)
Konno, H., Kuno, T., Yajima, Y.: Global minimization of a generalized convex multiplicative function. J. Glob. Optim. 4, 47–62 (1994)
Konno, H., Thach, P.T., Tuy, H.: Optimization on Low Rank Nonconvex Structures. Kluwer, Dordrecht, The Netherlands (1997)
Kough, P.L.: The indefinite quadratic programming problem. Oper. Res. 27, 516–533 (1979)
Kuno, T.: Globally determining a minimum-area rectangle enclosing the projection of a higher dimensional set. Oper. Res. Lett. 13, 295–305 (1993)
Kuno, T., Ishihama, T.: A convergent conical algorithm with ω-bisection for concave minimization. J. Glob. Optim. 61, 203–220 (2015)
Landis, E.M.: On functions representable as the difference of two convex functions. Dokl. Akad. Nauk SSSR 80, 9–11 (1951)
Lasserre, J.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11, 796–817 (2001)
Lasserre, J.: Semidefinite programming vs. LP relaxations for polynomial programming. Math. Oper. Res. 27, 347–360 (2002)
Levy, A.L., Montalvo, A.: The tunneling algorithm for the global minimization of functions. SIAM J. Sci. Stat. Comput. 6, 15–29 (1988)
Locatelli, M.: Finiteness of conical algorithm with ω-subdivisions. Math. Program. 85, 593–616 (1999)
Luc, L.T.: Reverse polyblock approximation for optimization over the weakly efficient set and the efficient set. Acta Math. Vietnam. 26, 65–80 (2001)
Luo, Z.Q., Pang, J.S., Ralph, D.: Mathematical Programs with Equilibrium Constraints. Cambridge University Press, Cambridge, United Kingdom (1996)
Maling, K., Mueller, S.H., Heller, W.R.: On finding optimal rectangle package plans. In: Proceedings 19th Design Automation Conference, pp. 663–670 (1982)
Mangasarian, O.L.: Nonlinear Programming. McGraw-Hill, New York (1969)
Mangasarian, O.L.: Mathematical programming in data mining. In: Data Mining and Knowledge Discovery, vol. 1, pp. 183–201 (1997)
Maranas, C.D., Floudas, C.A.: A global optimization method for Weber's problem with attraction and repulsion. In: Hager, W.W., Hearn, D.W., Pardalos, P.M. (eds.) Large Scale Optimization, pp. 259–293. Kluwer, Dordrecht (1994)
Martinez, J.M.: Local minimizers of quadratic functions on Euclidean balls and spheres. SIAM J. Optim. 4, 159–176 (1994)
Mastroeni, G.: Gap functions for equilibrium problems. J. Glob. Optim. 27, 411–426 (2003)

Matsui, T.: NP-hardness of linear multiplicative programming and related problems. J. Glob. Optim. 9, 113–119 (1996)
Mayne, D.Q., Polak, E.: Outer approximation algorithms for non-differentiable optimization. J. Optim. Theory Appl. 42, 19–30 (1984)
McCormick, G.P.: Attempts to calculate global solutions of problems that may have local minima. In: Lootsma, F. (ed.) Numerical Methods for Nonlinear Optimization, pp. 209–221. Academic, London/New York (1972)
McCormick, G.P.: Nonlinear Programming: Theory, Algorithms and Applications. Wiley, New York (1982)
Megiddo, N., Supowit, K.J.: On the complexity of some common geometric location problems. SIAM J. Comput. 13, 182–196 (1984)
Migdalas, A., Pardalos, P.M., Värbrand, P. (eds.): Multilevel Optimization: Algorithms and Applications. Kluwer, Dordrecht (1998)
Murty, K.G.: Linear Programming. Wiley, New York (1983)
Murty, K.G., Kabadi, S.N.: Some NP-complete problems in quadratic and nonlinear programming. Math. Program. 39, 117–130 (1987)
Muu, L.D.: A convergent algorithm for solving linear programs with an additional reverse convex constraint. Kybernetika 21, 428–435 (1985)
Muu, L.D.: An algorithm for solving convex programs with an additional convex–concave constraint. Math. Program. 61, 75–87 (1993)
Muu, L.D.: Models and methods for solving convex–concave programs. Habilitation thesis, Institute of Mathematics, Hanoi (1996)
Muu, L.D., Oettli, W.: Method for minimizing a convex–concave function over a convex set. J. Optim. Theory Appl. 70, 377–384 (1991)
Muu, L.D., Tuyen, H.Q.: Bilinear programming approach to optimization over the efficient set of a vector affine fractional problem. Acta Math. Vietnam. 27, 119–140 (2002)
Muu, L.D., Quoc, T.D., An, L.T.H., Tao, P.D.: A new decomposition algorithm for globally solving mathematical programs with equilibrium constraints. Acta Math. Vietnam. 37, 201–217 (2012)
Nadler, S.B.: Multivalued contraction mappings. Pac. J. Math. 30, 475–488 (1969)
Nash, J.: Equilibrium points in n-person games. Proc. Natl. Acad. Sci. U.S.A. 36, 48–49 (1950)
Nghia, N.D., Hieu, N.D.: A method for solving reverse convex programming problems. Acta Math. Vietnam. 11, 241–252 (1986)
Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Academic, New York (1970)
Outrata, J.V., Kocvara, M., Zowe, J.: Nonsmooth Approach to Optimization Problems with Equilibrium Constraints. Kluwer, Cambridge (1998)
Pardalos, P.M., Vavasis, S.A.: Quadratic programming with one negative eigenvalue is NP-hard. J. Glob. Optim. 1, 15–22 (1991)
Philip, J.: Algorithms for the vector maximization problem. Math. Program. 2, 207–229 (1972)
Phillips, A.T., Rosen, J.B.: A parallel algorithm for constrained concave quadratic global minimization. Math. Program. 42, 421–448 (1988)
Pincus, M.: A closed form solution of certain programming problems. Oper. Res. 16, 690–694 (1968)
Plastria, F.: Continuous location problems. In: Drezner, Z., Hamacher, H.W. (eds.) Facility Location, pp. 225–262. Springer, Berlin (1995)
Polik, I., Terlaky, T.: A survey of the S-lemma. SIAM Rev. 49, 371–418 (2007)
Prékopa, A.: Contributions to the theory of stochastic programming. Math. Program. 4, 202–221 (1973)
Prékopa, A.: Stochastic Programming. Kluwer, Dordrecht (1995)
Putinar, M.: Positive polynomials on compact semi-algebraic sets. Indiana Univ. Math. J. 42, 969–984 (1993)
Qian, L.P., Zhang, Y.J.: S-MAPEL: monotonic optimization for non-convex joint power control and scheduling problems. IEEE Trans. Wirel. Commun. 9, 1708–1719 (2010)

Qian, L.P., Zhang, Y.J., Huang, J.W.: MAPEL: achieving global optimality for a nonconvex wireless power control problem. IEEE Trans. Wirel. Commun. 8, 1553–1563 (2009)
Qian, L.P., Zhang, Y.J., Huang, J.: Monotonic optimization in communications and networking systems. Found. Trends Netw. 7, 1–75 (2012)
Roberts, A.W., Varberg, D.E.: Convex Functions. Academic, New York (1973)
Robinson, S.: Generalized equations and their solutions, part 1: basic theory. Math. Program. Stud. 10, 128–141 (1979)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Rosen, J.B.: Global minimization of a linearly constrained concave function by partition of feasible domain. Math. Oper. Res. 8, 215–230 (1983)
Rosen, J.B., Pardalos, P.M.: Global minimization of large-scale constrained concave quadratic problems by separable programming. Math. Program. 34, 163–174 (1986)
Rubinov, A., Tuy, H., Mays, H.: Algorithm for a monotonic global optimization problem. Optimization 49, 205–221 (2001)
Saff, E.B., Kuijlaars, A.B.J.: Distributing many points on a sphere. Math. Intell. 10, 5–11 (1997)
Sahni, S.: Computationally related problems. SIAM J. Comput. 3, 262–279 (1974)
Schaible, S.: Fractional programming. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 495–608. Kluwer, Dordrecht (1995)
Sherali, H.D., Adams, W.P.: A Reformulation-Linearization Technique (RLT) for Solving Discrete and Continuous Nonconvex Programming Problems. Kluwer, Dordrecht (1999)
Sherali, H.D., Tuncbilek, C.H.: A global optimization algorithm for polynomial programming problems using a reformulation-linearization technique. J. Glob. Optim. 2, 101–112 (1992)
Sherali, H.D., Tuncbilek, C.H.: A reformulation-convexification approach to solving nonconvex quadratic programming problems. J. Glob. Optim. 7, 1–31 (1995)
Shi, J.S., Yamamoto, Y.: A D.C. approach to the largest empty sphere problem in higher dimension. In: Floudas, C., Pardalos, P. (eds.) State of the Art in Global Optimization, pp. 395–411. Kluwer, Dordrecht (1996)
Shiau, T.-H.: Finding the largest lp-ball in a polyhedral set. Technical Summary Report No. 2788. University of Wisconsin, Mathematics Research Center, Madison (1984)
Shimizu, K., Ishizuka, Y., Bard, J.F.: Nondifferentiable and Two-Level Mathematical Programming. Kluwer, Cambridge (1997)
Shor, N.Z.: Quadratic optimization problems. Tekhnicheskaya Kibernetika 1, 128–139 (1987, Russian)
Shor, N.Z.: Dual estimates in multiextremal problems. J. Glob. Optim. 2, 411–418 (1991)
Shor, N.Z., Stetsenko, S.I.: Quadratic Extremal Problems and Nondifferentiable Optimization. Naukova Dumka, Kiev (1989, Russian)
Sion, M.: On general minimax theorems. Pac. J. Math. 8, 171–176 (1958)
Sniedovich, M.: C-programming and the minimization of pseudolinear and additive concave functions. Oper. Res. Lett. 5, 185–189 (1986)
Soland, R.M.: Optimal facility location with concave costs. Oper. Res. 22, 373–382 (1974)
Sperner, E.: Neuer Beweis für die Invarianz der Dimensionszahl und des Gebietes. Abh. Math. Seminar Hamburg VI, 265–272 (1928)
Stancu-Minasian, I.M.: Fractional Programming: Theory, Methods and Applications. Kluwer, Dordrecht (1997)
Strekalovski, A.C.: On the global extremum problem. Sov. Dokl. 292, 1062–1066 (1987, Russian)
Strekalovski, A.C.: On problems of global extremum in nonconvex extremal problems. Izvestiya Vuzov, Ser. Matematika 8, 74–80 (1990, Russian)
Strekalovski, A.C.: On search for global maximum of convex functions on a constraint set. J. Comput. Math. Math. Phys. 33, 349–363 (1993, Russian)
Suzuki, A., Okabe, A.: Using Voronoi diagrams. In: Drezner, Z. (ed.) Facility Location, pp. 103–118. Springer, New York (1995)
Swarup, K.: Programming with indefinite quadratic function with linear constraints. Cahiers du Centre d'Études de Recherche Opérationnelle 8, 132–136 (1966a)

Swarup, K.: Indefinite quadratic programming. Cahiers du Centre d'Études de Recherche Opérationnelle 8, 217–222 (1966b)
Swarup, K.: Quadratic programming. Cahiers du Centre d'Études de Recherche Opérationnelle 8, 223–234 (1966c)
Tanaka, T., Thach, P.T., Suzuki, S.: An optimal ordering policy for jointly replenished products by nonconvex minimization techniques. J. Oper. Res. Soc. Jpn. 34, 109–124 (1991)
Tao, P.D.: Convex analysis approach to dc programming: theory, algorithm and applications. Acta Math. Vietnam. 22, 289–355 (1997)
Tao, P.D.: Large-scale molecular optimization from distance matrices by a dc optimization approach. SIAM J. Optim. 14, 77–116 (2003)
Tao, P.D., Hoai-An, L.T.: DC optimization algorithms for solving the trust region subproblem. SIAM J. Optim. 8, 476–505 (1998)
Thach, P.T.: The design centering problem as a d.c. programming problem. Math. Program. 41, 229–248 (1988)
Thach, P.T.: Quasiconjugates of functions, duality relationship between quasiconvex minimization under a reverse convex constraint and quasiconvex maximization under a convex constraint, and applications. J. Math. Anal. Appl. 159, 299–322 (1991)
Thach, P.T.: New partitioning method for a class of nonconvex optimization problems. Math. Oper. Res. 17, 43–69 (1992)
Thach, P.T.: D.c. sets, d.c. functions and nonlinear equations. Math. Program. 58, 415–428 (1993a)
Thach, P.T.: Global optimality criteria and duality with zero gap in nonconvex optimization problems. SIAM J. Math. Anal. 24, 1537–1556 (1993b)
Thach, P.T.: A nonconvex duality with zero gap and applications. SIAM J. Optim. 4, 44–64 (1994)
Thach, P.T.: Symmetric duality for homogeneous multiple-objective functions. J. Optim. Theory Appl. (2012). doi:10.1007/s10957-011-9822-6
Thach, P.T., Konno, H.: On the degree and separability of nonconvexity and applications to optimization problems. Math. Program. 77, 23–47 (1996)
Thach, P.T., Thang, T.V.: Conjugate duality for vector-maximization problems. J. Math. Anal. Appl. 376, 94–102 (2011)
Thach, P.T., Thang, T.V.: Problems with resource allocation constraints and optimization over the efficient set. J. Glob. Optim. 58, 481–495 (2014)
Thach, P.T., Tuy, H.: The relief indicator method for constrained global optimization. Naval Res. Logist. 37, 473–497 (1990)
Thach, P.T., Konno, H., Yokota, D.: A dual approach to a minimization on the set of Pareto-optimal solutions. J. Optim. Theory Appl. 88, 689–701 (1996)
Thakur, L.S.: Domain contraction in nonlinear programming: minimizing a quadratic concave function over a polyhedron. Math. Oper. Res. 16, 390–407 (1991)
Thieu, T.V.: Improvement and implementation of some algorithms for nonconvex optimization problems. In: Optimization – Fifth French-German Conference, Castel Novel 1988. Lecture Notes in Mathematics, vol. 1405, pp. 159–170. Springer, Berlin (1989)
Thieu, T.V.: A linear programming approach to solving a jointly constrained bilinear programming problem with special structure. In: Van tru va He thong, vol. 44, pp. 113–121. Institute of Mathematics, Hanoi (1992)
Thieu, T.V., Tam, B.T., Ban, V.T.: An outer approximation method for globally minimizing a concave function over a compact convex set. Acta Math. Vietnam. 8, 21–40 (1983)
Thoai, N.V.: A modified version of Tuy's method for solving d.c. programming problems. Optimization 19, 665–674 (1988)
Thoai, N.V.: Convergence of duality bound method in partly convex programming. J. Glob. Optim. 22, 263–270 (2002)
Thoai, N.V., Tuy, H.: Convergent algorithms for minimizing a concave function. Math. Oper. Res. 5, 556–566 (1980)
Toland, J.F.: Duality in nonconvex optimization. J. Math. Anal. Appl. 66, 399–415 (1978)
Toussaint, G.T.: Solving geometric problems with the rotating calipers. In: Proc. IEEE MELECON '83 (1983)

Tuan, H.D., Apkarian, P., Hosoe, S., Tuy, H.: D.C. optimization approach to robust control: feasibility problems. Int. J. Control 73(2), 89–104 (2000a)
Tuan, H.D., Hosoe, S., Tuy, H.: D.C. optimization approach to robust controls: the optimal scaling value problems. IEEE Trans. Autom. Control 45(10), 1903–1909 (2000b)
Tuy, H.: Concave programming under linear constraints. Sov. Math. Dokl. 5, 1437–1440 (1964)
Tuy, H.: On a general minimax theorem. Sov. Math. Dokl. 15, 1689–1693 (1974)
Tuy, H.: A note on the equivalence between Walras' excess demand theorem and Brouwer's fixed point theorem. In: Los, J., Los, M. (eds.) Equilibria: How and Why?, pp. 61–64. North-Holland, Amsterdam (1976)
Tuy, H.: On outer approximation methods for solving concave minimization problems. Acta Math. Vietnam. 8, 3–34 (1983)
Tuy, H.: Global minimization of a difference of two convex functions. In: Lecture Notes in Economics and Mathematical Systems, vol. 226, pp. 98–108. Springer, Berlin (1984)
Tuy, H.: Convex programs with an additional reverse convex constraint. J. Optim. Theory Appl. 52, 463–486 (1987a)
Tuy, H.: Global minimization of a difference of two convex functions. Math. Program. Study 30, 150–182 (1987b)
Tuy, H.: Convex Analysis and Global Optimization. Kluwer, Dordrecht (1998)
Tuy, H.: On polyhedral annexation method for concave minimization. In: Leifman, L.J., Rosen, J.B. (eds.) Functional Analysis, Optimization and Mathematical Economics, pp. 248–260. Oxford University Press, Oxford (1990)
Tuy, H.: Normal conical algorithm for concave minimization over polytopes. Math. Program. 51, 229–245 (1991)
Tuy, H.: On nonconvex optimization problems with separated nonconvex variables. J. Glob. Optim. 2, 133–144 (1992)
Tuy, H.: D.C. optimization: theory, methods and algorithms. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 149–216. Kluwer, Dordrecht (1995a)
Tuy, H.: Canonical d.c. programming: outer approximation methods revisited. Oper. Res. Lett. 18, 99–106 (1995b)
Tuy, H.: A general d.c. approach to location problems. In: Floudas, C., Pardalos, P. (eds.) State of the Art in Global Optimization: Computational Methods and Applications, pp. 413–432. Kluwer, Dordrecht (1996)
Tuy, H.: Monotonic optimization: problems and solution approaches. SIAM J. Optim. 11, 464–494 (2000a)
Tuy, H.: Strong polynomial time solvability of a minimum concave cost network flow problem. Acta Math. Vietnam. 25, 209–217 (2000b)
Tuy, H.: On global optimality conditions and cutting plane algorithms. J. Optim. Theory Appl. 118, 201–216 (2003)
Tuy, H.: On a decomposition method for nonconvex global optimization. Optim. Lett. 1, 245–258 (2007a)
Tuy, H.: On dual bound methods for nonconvex global optimization. J. Glob. Optim. 37, 321–323 (2007b)
Tuy, H.: D(C)-optimization and robust global optimization. J. Glob. Optim. 47, 485–501 (2010)
Tuy, H., Ghannadan, S.: A new branch and bound method for bilevel linear programs. In: Migdalas, A., Pardalos, P.M., Värbrand, P. (eds.) Multilevel Optimization: Algorithms and Applications, pp. 231–249. Kluwer, Dordrecht (1998)
Tuy, H., Hoai-Phuong, N.T.: Optimization under composite monotonic constraints and constrained optimization over the efficient set. In: Liberti, L., Maculan, N. (eds.) Global Optimization: From Theory to Implementation, pp. 3–32. Springer, Berlin/New York (2006)
Tuy, H., Hoai-Phuong, N.T.: A robust algorithm for quadratic optimization under quadratic constraints. J. Glob. Optim. 37, 557–569 (2007)
Tuy, H., Horst, R.: Convergence and restart in branch and bound algorithms for global optimization. Application to concave minimization and d.c. optimization problems. Math. Program. 42, 161–184 (1988)

Tuy, H., Luc, L.T.: A new approach to optimization under monotonic constraints. J. Glob. Optim. 18, 1–15 (2000)
Tuy, H., Oettli, W.: On necessary and sufficient conditions for global optimization. Matemáticas Aplicadas 15, 39–41 (1994)
Tuy, H., Tam, B.T.: An efficient solution method for rank two quasiconcave minimization problems. Optimization 24, 3–56 (1992)
Tuy, H., Tam, B.T.: Polyhedral annexation vs outer approximation methods for decomposition of monotonic quasiconcave minimization. Acta Math. Vietnam. 20, 99–114 (1995)
Tuy, H., Tuan, H.D.: Generalized S-Lemma and strong duality in nonconvex quadratic programming. J. Glob. Optim. 56, 1045–1072 (2013)
Tuy, H., Migdalas, A., Värbrand, P.: A global optimization approach for the linear two-level program. J. Glob. Optim. 3, 1–23 (1992)
Tuy, H., Dan, N.D., Ghannadan, S.: Strongly polynomial time algorithm for certain concave minimization problems on networks. Oper. Res. Lett. 14, 99–109 (1993a)
Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: Strongly polynomial algorithm for a production-transportation problem with concave production cost. Optimization 27, 205–227 (1993b)
Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: Strongly polynomial algorithm for two special minimum concave cost network flow problems. Optimization 32, 23–44 (1994a)
Tuy, H., Migdalas, A., Värbrand, P.: A quasiconcave minimization method for solving linear two-level programs. J. Glob. Optim. 4, 243–263 (1994b)
Tuy, H., Al-Khayyal, F., Zhou, F.: A d.c. optimization method for single facility location problems. J. Glob. Optim. 7, 209–227 (1995a)
Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: The minimum concave cost flow problem with fixed numbers of nonlinear arc costs and sources. J. Glob. Optim. 6, 135–151 (1995b)
Tuy, H., Al-Khayyal, F., Zhou, P.C.: DC optimization method for single facility location problems. J. Glob. Optim. 7, 209–227 (1995c)
Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: Strongly polynomial algorithm for a concave production-transportation problem with a fixed number of nonlinear variables. Math. Program. 72, 229–258 (1996a)
Tuy, H., Thach, P.T., Konno, H., Yokota, D.: Dual approach to optimization on the set of Pareto-optimal solutions. J. Optim. Theory Appl. 88, 689–707 (1996b)
Tuy, H., Bagirov, A.M., Rubinov, A.M.: Clustering via D.C. optimization. In: Hadjisavvas, N., Pardalos, P.M. (eds.) Advances in Convex Analysis and Optimization, pp. 221–235. Kluwer, Dordrecht (2001)
Tuy, H., Nghia, N.D., Vinh, L.S.: A discrete location problem. Acta Math. Vietnam. 28, 185–199 (2003)
Tuy, H., Al-Khayyal, F., Thach, P.T.: Monotonic optimization: branch and cut methods. In: Audet, C., Hansen, P., Savard, G. (eds.) Essays and Surveys in Global Optimization, pp. 39–78. GERAD/Springer, New York (2005)
Tuy, H., Minoux, M., Hoai-Phuong, N.T.: Discrete monotonic optimization with application to a discrete location problem. SIAM J. Optim. 17, 78–97 (2006)
Utschick, W., Brehmer, J.: Monotonic optimization framework for coordinated beamforming in multi-cell networks. IEEE Trans. Signal Process. 60, 2496–2507 (2012)
Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996)
Van Voorhis, T.: The quadratically constrained quadratic programs. Ph.D. thesis, Georgia Institute of Technology, Atlanta (1997)
Veinott, A.F.: The supporting hyperplane method for unimodal programming. Oper. Res. 15, 147–152 (1967)
Vidigal, L., Director, S.: A design centering algorithm for nonconvex regions of acceptability. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 1, 13–24 (1982)
Vincent, J.F.: Structural Biomaterials. Princeton University Press, Princeton (1990)
Vicente, L., Savard, G., Judice, J.: Descent approaches for quadratic bilevel programming. J. Optim. Theory Appl. 81, 379–388 (1994)

von Neumann, J.: Zur Theorie der Gesellschaftsspiele. Math. Ann. 100, 293–320 (1928)
White, D.J., Anandalingam, G.: A penalty function approach for solving bilevel linear programs. J. Glob. Optim. 3, 397–419 (1993)
Wolberg, W.H., Street, W.N., Mangasarian, O.L.: Machine learning techniques to diagnose breast cancer from fine-needle aspirates. Cancer Lett. 77, 163–171 (1994)
Wolberg, W.H., Street, W.N., Mangasarian, O.L.: Image analysis and machine learning applied to breast cancer diagnosis and prognosis. Anal. Quant. Cytol. Histol. 17(2), 77–87 (1995)
Wolberg, W.H., Street, W.N., Heisey, D.M., Mangasarian, O.L.: Computerized breast cancer diagnosis and prognosis from fine-needle aspirates. Arch. Surg. 130, 511–516 (1995a)
Wolberg, W.H., Street, W.N., Heisey, D.M., Mangasarian, O.L.: Computer-derived nuclear features distinguish malignant from benign breast cytology. Hum. Pathol. 26, 792–796 (1995b)
Yakubovich, V.A.: The S-procedure in nonlinear control theory. Vestnik Leningrad Univ. 4, 73–93 (1977)
Ye, Y.: On affine scaling algorithms for nonconvex quadratic programming. Math. Program. 56, 285–300 (1992)
Yick, J., Mukherjee, B., Ghosal, D.: Wireless sensor network survey. Comput. Netw. 52, 2292–2320 (2008)
Zadeh, N.: On building minimum cost communication networks over time. Networks 4, 19–34 (1974)
Zhang, Y.J., Qian, L.P., Huang, J.W.: Monotonic optimization in communication and networking systems. Found. Trends Commun. Inf. Theory 7(1), 1–75 (2012)
Zheng, Q.: Optimality conditions for global optimization (I) and (II). Acta Math. Appl. Sinica, English Ser. 2, 66–78, 118–134 (1985)
Zhu, D.L., Marcotte, P.: An extended descent framework for variational inequalities. J. Optim. Theory Appl. 80, 349–366 (1994)
Zukhovitzki, S.I., Polyak, R.A., Primak, M.E.: On a class of concave programming problems. Ekonomika i Matematicheskie Metody 4, 431–443 (1968, Russian)
Zwart, P.B.: Global maximization of a convex function with linear inequality constraints. Oper. Res. 22, 602–609 (1974)
Index

Symbols
K-monotonic functions, 283
γ-extension, 171, 356
γ-valid cut, 170
ω-bisection, 179
ω-subdivision, 176, 184
ε-approximate optimal solution, 193
ε-subgradient, 65
p-center problem, 220
r-simplex, 30

A
adaptive BB algorithm, 161
adaptive subdivision, 157
affine
  hull, 4
  manifold, 3
  set, 3
affine function, 39
affine variational inequality, 484
affinely independent, 5
apex of a cone, 7
approximate minimum, 87
approximate optimal solution, 204
approximate subdifferential, 65

B
barycentric coordinates, 12
basic concave programming (BCP) problem, 167
basic outer approximation theorem, 163
BB/OA algorithm for (CO/SNP), 279
bilevel programming, 455
bilinear matrix inequalities problem, 248
bilinear program, 358
bisection of ratio α, 152
  of a cone, 155
bisection of ratio α of a rectangle, 157
Brouwer fixed point theorem, 97, 98

C
canonical cone, 172, 173
canonical dc programming, 198
canonical dc programming problem, 135
Caratheodory core, 14
Caratheodory's Theorem, 12
Caristi fixed point theorem, 88
closed
  convex function, 50
  set-valued map, 94
closure
  of a convex function, 50
  of a set, 8
clustering, 222
coercive function, 87
combined OA/BB conical algorithm for (GCP), 187
combined OA/CS algorithm for (CDC), 203
competitive location, 148
complementary convex, 104, 135
complementary convex set, 116
complementary convex structure, 103
concave function, 39
concave majorant, 118


concave minimization, 133, 167
concave production-transportation problem, 319
concave programming, 167
concavity cut, 170
cone, 6, 155
cone generated by a set, 7
conical BB algorithm for (BCP), 180
conical BB algorithm for LRC, 196
conical subdivision, 155
conjugate, 67
conormal hull, 393
conormal set, 393
consistent bounding, 159
constancy space, 49
continuity of convex function, 42
continuous p-center problem, 221
continuous optimization problems, 271
contraction, 89
convex bilevel program, 456
convex combination, 6
convex cone, 7
convex envelope, 46
convex envelope of a concave function, 46
convex function, 39
convex hull, 12
convex hull of a function, 46
convex inequality, 52
convex maximization, 133
convex minorant, 118
convex minorant of a dc function, 117
convex set, 6
copolyblock, 394
CS algorithm for (BCP), 176
CS restart algorithm for (LRC), 194

D
DC feasibility problem, 140, 229
dc function, 103
dc inclusion, 139
dc representation, 107
  of basic composite functions, 107
dc set, 113
dc structure, 103
decomposition methods by projection, 287
decreasing function, 391
design centering problem, 213
dimension
  of a convex set, 9
  of an affine set, 3
dimension of a convex function, 50
direction of recession, 10
directional derivative, 61
directions in which f is affine, 49
discrete location, 429
distance function, 40
distance geometry problems, 220
dm function, 395
dual DC feasibility problem, 230
duality bound, 236
duality gap, 237
dualization through polyhedral annexation, 292

E
edge of a polyhedron, 29
edge property, 191
efficient point, 468
effective domain, 39
Ekeland variational principle, 87
epigraph, 39
equilibrium constraints, 453
equilibrium point, 94
equilibrium problem, 479
essential optimal solution, 206, 406
exact bisection, 152
exhaustive BB algorithm, 161
exhaustive filter, 152
exhaustive subdivision process, 154
extreme direction, 21
extreme point, 21

F
face, 28
face of a convex set, 21
facet, 28
facet of a simplex, 91
factorable function, 107
Fejer Theorem, 37
Fekete points, 221
filter, 152
First Separation Theorem, 17
free disposal, 255

G
gap function, 481
gauge, 9
General Concave Minimization Problem, 184
general equilibrium theorem, 94
general nonconvex quadratic problem, 363
generalized convex multiplicative programming problems, 308

generalized convex optimization, 78
generalized equation, 100
generalized inequality, 76
generalized KKT condition, 256
generalized linear program, 144
geometric programming, 143
global ε-optimal solution, 141, 158
global maximizer, 69
global minimizer, 69, 128
global optimal solution, 127, 128
global phase, 141

H
halfspaces, 6
Hausdorff distance, 89
hyperplane, plane, 4

I
increasing function, 391
indefinite quadratic minimization, 361
indicator function, 40
infimal convolution, 46
inner approximation, 229
interior of a set, 8

K
K-convex function, 77
K-monotonic function, 283
Kakutani fixed point theorem, 98
KKM Lemma, 91, 92
Ky Fan inequality, 97

L
l.s.c. hull, 50
Lagrange dual problem, 377
Lagrangian, 377
Lagrangian bound, 236, 244
line segment, 5
lineality of a convex function, 49
lineality of a convex set, 23
lineality space, 23
lineality space of a convex function, 49
linear bilevel program, 456
linear matrix inequality, 146
linear multiplicative program, 299
linear multiplicative programming, 298
linear program under LMI constraints, 387
Lipschitz continuity, 51
LMI (linear matrix inequality), 83
local maximizer, 69
local minimizer, 69, 128
local optimal solution, 127, 128
local phase, 141
locally convexifiable, 369
locally dc, 105
locally increasing, 396
lower γ-valid cut, 400
lower S-adjustment, 414
lower level set, 47
lower linearizable, 365
lower semi-continuous (l.s.c.), 48

M
master problem, 408
Mathematical Programming problem with Equilibrium Constraints, 453
maximin location, 219
minimax
  equality, 72
  theorem, 72
minimum concave cost network flow problem, 319
minimum of a dc function, 120
Minkowski functional, 9
Minkowski's Theorem, 24
modulus of strong convexity, 66
moment matrices, 440
Monotonic Optimization, 391
monotonic reverse convex problem, 313
multidimensional scaling, 220
multiextremal, 127

N
Nadler contraction theorem, 90
Nash equilibrium, 98
Nash equilibrium theorem, 99
network constraints, 319
non-cooperative game, 98
noncanonical dc problems, 268
nonconvex quadratic programming, 337
nonconvexity rank, 283
nondegenerate vertex, 30
nonregular problems, 193
norm
  lp-norm, 7
  equivalent norms, 8
  Euclidean norm, 8
  Tchebycheff norm, 8
normal cone, 20
  inward normal cone, 20
  outward normal cone, 20

normal hull, 393
normal set, 392
normal to a convex set, 20
normal vector, 4

O
on-line vertex enumeration, 185
optimal visible point algorithm, 269
optimal visible point problem, 269
outer approximation, 161

P
Pareto point, 393
partition via (v, j), 156
partly convex problems, 236
piecewise convex function, 43
pointed cone, 7
polar, 25
  of a polyhedron, 31
polyblock, 394
polyblock algorithm, 403
polyhedral annexation, 229
polyhedral annexation method, 233
polyhedral annexation algorithm for BCP, 233
polyhedral convex set, 27
polyhedron, 27
Polynomial Optimization, 435
polytope, 30
pooling and blending problem, 131, 247
positively homogeneous, 59
posynomial, 391
preprocessing, 142
problem PTP(2), 328
procedure DC, 175
procedure DC*, 229
projection, 20
proper function, 39
proper partition, 152
proper vertex, 394
prototype BB algorithm, 158
prototype BB decomposition algorithm, 237
prototype OA procedure, 162

Q
quadratic function, 42
quasi-subdifferential, 253
quasi-subgradient, 253
quasi-supdifferential, 254
quasi-supgradient, 254
quasiconcave, 72
quasiconcave function, 47
quasiconjugate function, 250
quasiconvex function, 47

R
radial partition (subdivision), 152
radius of a compact set, 37
rank of a convex function, 50
rank of a convex set, 26
rank of a quadratic form, 338
recession cone, 10
recession cone of a convex function, 49
rectangular subdivision, 156
redundant inequality, 29
reformulation-linearization, 372
regular (stable) problem, 191
regularity assumption, 192
relative boundary, 8
relative interior, 8
relief indicator, 272, 275
  method, 275
  OA algorithm, 276
Representation of polyhedrons, 32
representation theorem, 23
reverse convex, 52
reverse convex constraint, 134
reverse convex inequality, 104
reverse convex programming, 189
reverse convex set, 116
reverse normal set, 393
reverse polyblock, 394
robust
  approach, 204
  constraint set, 272
  feasible set, 140, 406

S
S-shaped functions, 112
saddle point, 75
SDP (semidefinite program), 84
Second Separation Theorem, 18
semi-exhaustiveness, 153
semidefinite program, 147
sensor cover energy problem, 426
Separable Basic Concave Program, 182
separator, 115
set-valued map, 88
Shapley-Folkman's Theorem, 13
signomial programming, 451
simple OA for CDC, 201

simple outer approximation, 185
simplex, 12
simplicial subdivision, 91, 151
SIT algorithm for dc optimization, 208
SIT scheme, 208
special concave program CPL(1), 326
Sperner Lemma, 92
Stackelberg game, 454
stochastic transportation-location problem, 322
strictly convex, 69
strictly convex function, 114
strongly convex, 66
strongly separated, 18
subdifferentiable, 58
subdifferential, 58
subgradient, 58
successive incumbent transcending, 207
successive incumbent transcending scheme, 208
successive partitioning, 151
support function, 26, 40
supporting halfspace, 19
supporting hyperplane, 19

T
tight convex minorant, 366
Toland theorem, 121
transcending stationarity, 139
trust region subproblem, 353
two phase scheme, 141

U
upper γ-valid cut, 400
upper S-adjustment, 414
upper boundary point, 475
upper extreme point, 475
upper level sets, 47
upper semi-continuous (u.s.c.), 48
upper semi-continuous set-valued map, 100

V
variational inequality, 99
vertex
  adjacent, 30
vertex of a cone, 7
vertex of a polyhedron, 29
visibility assumption, 268
Voronoi cell, 221

W
weakly convex, 123
weakly efficient point, 468
Weber problem, 218
Weber problem with attraction and repulsion, 147