
Global Optimization

Springer-Verlag Berlin Heidelberg GmbH


Reiner Horst· Hoang Tuy

Global
Optimization
Deterministic Approaches

Third Revised and Enlarged Edition

With 55 Figures
and 7 Tables

Springer
Professor Dr. Reiner Horst
University of Trier
Department of Mathematics
P.O. Box 3825
D-54286 Trier, Germany

Professor Dr. Hoang Tuy
Vien Toan Hoc
Institute of Mathematics
P.O. Box 631, Bo Ho
10000 Hanoi, Vietnam

Cataloging-in-Publication Data applied for


Die Deutsche Bibliothek - CIP-Einheitsaufnahme
Horst, Reiner:
Global optimization : deterministic approaches / Reiner Horst
; Hoang Tuy. - 3., rev. and enl. ed.
ISBN 978-3-642-08247-4 ISBN 978-3-662-03199-5 (eBook)
DOI 10.1007/978-3-662-03199-5
NE: Tuy, Hoang:

This work is subject to copyright. All rights are reserved, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data
banks. Duplication of this publication or parts thereof is only permitted under the provisions
of the German Copyright Law of September 9, 1965, in its version of June 24, 1985, and a
copyright fee must always be paid. Violations fall under the prosecution act of the German
Copyright Law.

© Springer-Verlag Berlin Heidelberg 1990, 1993, 1996


Originally published by Springer-Verlag Berlin Heidelberg New York in 1996
Softcover reprint of the hardcover 3rd edition 1996

The use of registered names, trademarks, etc. in this publication does not imply, even in the
absence of a specific statement, that such names are exempt from the relevant protective
laws and regulations and therefore free for general use.
SPIN 10517261. Printed on acid-free paper.
PREFACE TO THE THIRD EDITION

Most chapters contain quite a number of modifications and additions which take into account the recent development in the field. Among other things, one finds additional d.c. decompositions, new results and proofs on normal partitioning procedures, optimality conditions, outer approximation methods and design centering, as well as revisions in the presentation of some basic concepts.

In Section VII.5.1, a new version of the decomposition method for minimum concave cost flow problems is added which provides a useful bound on the number of iterations in the integer case. One of the outer approximation algorithms for canonical d.c. problems in Section X.1.2 has been replaced by a new stable approach. Finally, Section 2.5 on functions with concave minorants is added in Chapter XI, where new branch and bound methods are discussed which provide considerably improved linear programming bounding procedures for diverse problem types such as linearly constrained Lipschitz, d.c. and Hölder optimization problems, as well as systems of nonlinear inequalities.

November 1995 Reiner Horst Hoang Tuy


PREFACE TO THE SECOND EDITION

The main contents and character of the monograph did not change with respect
to the first edition. However, within most chapters we incorporated quite a number
of modifications which take into account the recent development of the field, the
very valuable suggestions and comments that we received from numerous colleagues
and students as well as our own experience while using the book. Some errors and
misprints in the first edition are also corrected.

May 1992 Reiner Horst Hoang Tuy


PREFACE TO THE FIRST EDITION

The enormous practical need for solving global optimization problems coupled with a rapidly advancing computer technology has allowed one to consider problems which a few years ago would have been considered computationally intractable. As a consequence, we are seeing the creation of a large and increasing number of diverse algorithms for solving a wide variety of multiextremal global optimization problems. The goal of this book is to systematically clarify and unify these diverse approaches in order to provide insight into the underlying concepts and their properties. Aside from a coherent view of the field, much new material is presented.

By definition, a multiextremal global optimization problem seeks at least one global minimizer of a real-valued objective function that possesses different local minimizers. The feasible set of points in ℝⁿ is usually determined by a system of inequalities. It is well known that in practically all disciplines where mathematical models are used there are many real-world problems which can be formulated as multiextremal global optimization problems.

Standard nonlinear programming techniques have not been successful for solving these problems. Their deficiency is due to the intrinsic multiextremality of the formulation and not to the lack of smoothness or continuity, for often the latter properties are present. One can observe that local tools such as gradients, subgradients, and second order constructions such as Hessians, cannot be expected to yield more than local solutions. One finds, for example, that a stationary point is often detected for which there is even no guarantee of local minimality. Moreover, determining the local minimality of such a point is known to be NP-hard in the sense of computational complexity even in relatively simple cases. Apart from this deficiency in the local situation, classical methods do not recognize conditions for global optimality.

For these reasons global solution methods must be significantly different from standard nonlinear programming techniques, and they can be expected to be, and are, much more expensive computationally. Throughout this book our focus will be on typical procedures that respond to the inherent difficulty of multiextremality and which take advantage of helpful specific features of the problem structure. In certain sections, methods are presented for solving very general and difficult global problems, but the reader should be aware that difficult large scale global optimization problems cannot be solved with sufficient accuracy on currently available computers. For these very general cases our exposition is intended to provide useful tools for transcending local optimality restrictions, in the sense of providing valuable information about the global quality of a given feasible point. Typically, such information will give upper and lower bounds for the optimal objective function value and indicate parts of the feasible set where further investigations of global optimality will not be worthwhile.

On the other hand, in many practical global optimizations, the multiextremal feature involves only a small number of variables. Moreover, many problems have additional structure that is amenable to large scale solutions.

Many global optimization problems encountered in the decision sciences, engineering and operations research have at least the following closely related key properties:

(i) convexity is present in a limited and often unusual sense;

(ii) a global optimum occurs within a subset of the boundary of the feasible set.

With the current state of the art, these properties are best exploited by deterministic methods that combine analytical and combinatorial tools in an effective way. We find that typical approaches use techniques such as branch and bound, relaxation, outer approximation, and valid cutting planes, whose basic principles have long appeared in the related fields of integer and combinatorial optimization as well as convex minimization. We have found, however, that application of these fruitful ideas to global optimization is raising many new interesting theoretical and computational questions whose answers cannot be inferred from previous successes. For example, branch and bound methods applied to global optimization problems generate infinite processes, and hence their own convergence theory must be developed. In contrast, in integer programming these are finite procedures, and so their convergence properties do not directly apply. Other examples involve important

results in convex minimization that reflect the coincidence of local and global solutions. Here also one cannot expect a direct application to multiextremal global minimization.

In an abundant class of global optimizations, convexity is present in a reverse sense. In this direction we focus our exposition on the following main topics:

(a) minimization of concave functions subject to linear and convex constraints (i.e., "concave minimization");
(b) convex minimization over the intersection of convex sets and complements of convex sets (i.e., "reverse convex programming"); and
(c) global optimization of functions that can be expressed as a difference of two convex functions (i.e., "d.c. programming").

Another large class of global optimization that we shall discuss in some detail has been termed "Lipschitz programming", where now the functions in the formulation are assumed to be Lipschitz continuous on certain subsets of their domains. Although neither of the aforementioned properties (i)-(ii) is necessarily satisfied in Lipschitz problems, much can be done here by applying the basic ideas and techniques which we shall develop for the problem classes (a), (b), and (c) mentioned above.

Finally, we also demonstrate how global optimization problems are related to solving systems of equations and/or inequalities. As a by-product, then, we shall present some new solution methods for solving such systems.

The underlying purpose of this book is to present general methods in such a way as to enhance the derivation of special techniques that exploit frequently encountered additional problem structure. The multifaceted approach is manifested occasionally in some computational results for these special but abundant problems. However, at the present stage, these computational results should be considered as preliminary.

The book is divided into three main parts.

Part A introduces the main global optimization problem classes we study, and develops some of their basic properties and applications. It then discusses the fundamental concepts that unify the various general methods of solution, such as outer approximation, concavity cuts, and branch and bound.

Part B treats concave minimization and reverse convex programming subject to linear and reverse convex constraints. In this part we present additional detail on specially structured problems. Examples include decomposition, projection, separability, and parametric approaches.

In Part C we consider rather general global optimization problems. We study


d.c.-programming and Lipschitz optimization, and present our most recent attempts
at solving more general global optimization problems. In this part, the speciali-
zations most naturally include biconvex programming, indefinite "all-quadratic"
optimization, and design centering as encountered in engineering design.

Each chapter begins with a summary of its contents.

The technical prerequisites for this book are rather modest, and are within reach of most advanced undergraduate university programs. They include a sound knowledge of elementary real analysis, linear algebra, and convexity theory. No familiarity with any other branch of mathematics is required.

In preparing this book, we have received encouragement, advice, and suggestions


from a large group of individuals. For this we are grateful to Faiz Al-Khayyal,
Harold P. Benson, Neil Koblitz, Ken Kortanek, Christian Larsen, Panos Pardalos,
Janos Pinter, Phan Thien Thach, Nguyen van Thoai, Jakob de Vries, Graham
Wood, and to several other friends, colleagues and students. We are indebted to
Michael Knuth for drawing the figures.

We acknowledge hospitality and/or financial support given by the following organizations and institutions: the German Research Association (DFG), the Alexander von Humboldt Foundation, the Minister of Science and Art of the State of Lower Saxony, the National Scientific Research Center in Hanoi, and the Universities of Oldenburg and Trier.

Our special thanks are due to Frau Rita Feiden for the efficient typing and
patient retyping of the many drafts of the manuscript.

Finally, we thank our families for their patience and understanding.

December 1989 Reiner Horst Hoang Tuy


CONTENTS

PART A: INTRODUCTION AND BASIC TECHNIQUES 1

CHAPTER I. SOME IMPORTANT CLASSES OF GLOBAL


OPTIMIZATION PROBLEMS 3
1. Global Optimization 3
2. Concave Minimization 9
2.1. Definition and Basic Properties 9
2.2. Brief Survey of Direct Applications 12
2.3. Integer Programming and Concave Minimization 14
2.4. Bilinear Programming and Concave Minimization 20
2.5. Complementarity Problems and Concave Minimization 24
2.6. Max-Min Problems and Concave Minimization 25

3. D.C. Programming and Reverse Convex Constraints 26


3.1. D.C. Programming: Basic Properties 26
3.2. D.C. Programming: Applications 32
3.3. Reverse Convex Constraints 37
3.4. Canonical D.C. Programming Problems 39
4. Lipschitzian Optimization and Systems of Equations
and Inequalities 43
4.1. Lipschitzian Optimization 43
4.2. Systems of Equations and Inequalities 47

CHAPTER II. OUTER APPROXIMATION 53


1. Basic Outer Approximation Method 53

2. Outer Approximation by Convex Polyhedral Sets 58


3. Constraint Dropping Strategies 67
4. On Solving the Subproblems (Qk) 71
4.1. Finding an Initial Polytope D1 and its Vertex Set V1 72
4.2. Computing New Vertices and New Extreme Directions 74
4.3. Identifying Redundant Constraints 84

CHAPTER III. CONCAVITY CUTS 89


1. Concept of a Valid Cut 89
2. Valid Cuts in the Degenerate Case 95
3. Convergence of Cutting Procedures 99
4. Concavity Cuts for Handling Reverse Convex Constraints 104
5. A Class of Generalized Concavity Cuts 108
6. Cuts Using Negative Edge Extensions 113

CHAPTER IV. BRANCH AND BOUND 115


1. A Prototype Branch and Bound Method 115
2. Finiteness and Convergence Conditions 126
3. Typical Partition Sets and their Refinement 137
3.1. Simplices 137
3.2. Rectangles and Polyhedral Cones 142

4. Lower Bounds 145


4.1. Lipschitzian Optimization 145
4.2. Vertex Minima 146
4.3. Convex Subfunctionals 148
4.4. Duality 159
4.5. Consistency 164
5. Deletion by Infeasibility 169
6. Restart Branch and Bound Algorithm 176

PART B: CONCAVE MINIMIZATION 179

CHAPTER V. CUTTING METHODS 181


1. A Pure Cutting Algorithm 181
1.1. Valid Cuts and a Sufficient Condition
for Global Optimality 182
1.2. Outline of the Method 187
2. Facial Cut Algorithm 190
2.1. The Basic Idea 190
2.2. Finding an Extreme Face of D Relative to M 192
2.3. Facial Valid Cuts 196
2.4. A Finite Cutting Algorithm 198

3. Cut and Split Algorithm 201


3.1. Partition of a Cone 202
3.2. Outline of the Method 203
3.3. Remarks 206
4. Generating Deep Cuts: The Case of Concave
Quadratic Functionals 211
4.1. A Hierarchy of Valid Cuts 211
4.2. Konno's Cutting Method for Concave Quadratic Programming 217
4.3. Bilinear Programming Cuts 222

CHAPTER VI. SUCCESSIVE APPROXIMATION METHODS 225


1. Outer Approximation Algorithms 225
1.1. Linearly Constrained Problem 226
1.2. Problems with Convex Constraints 234
1.3. Reducing the Sizes of the Relaxed Problems 240
2. Inner Approximation 243
2.1. The (DG) Problem 244
2.2. The Concept of Polyhedral Annexation 245
2.3. Computing the Facets of a Polytope 248
2.4. A Polyhedral Annexation Algorithm 251
2.5. Relations to Other Methods 261
2.6. Extensions 264
3. Convex Underestimation 267
3.1. Relaxation and Successive Underestimation 267
3.2. The Falk and Hoffman Algorithm 269
3.3. Rosen's Algorithm 273

4. Concave Polyhedral Underestimation 278


4.1. Outline of the Method 278
4.2. Computation of the Concave Underestimators 281
4.3. Computation of the Nonvertical Facets 282
4.4. Polyhedral Underestimation Algorithm 285
4.5. Alternative Interpretation 287
4.6. Separable Problems 289

CHAPTER VII. SUCCESSIVE PARTITION METHODS 295

1. Conical Algorithms 295


1.1. The Normal Conical Subdivision Process 296
1.2. The Main Subroutine 298
1.3. Construction of Normal Subdivision Processes 300
1.4. The Basic NCS Process 306
1.5. The Normal Conical Algorithm 308
1.6. Remarks Concerning Implementation 312
1.7. Example 315
1.8. Alternative Variants 319
1.9. Concave Minimization with Convex Constraints 323
1.10. Unbounded Feasible Domain 328

1.11. A Class of Exhaustive Subdivision Processes 329


1.12. Exhaustive Nondegenerate Subdivision Processes 335
2. Simplicial Algorithms 342
2.1. Normal Simplicial Subdivision Processes 343
2.2. Normal Simplicial Algorithm 345
2.3. Construction of an NSS Process 347
2.4. The Basic NSS Process 349
2.5. Normal Simplicial Algorithm for Problems with
Convex Constraints 351
3. An Exact Simplicial Algorithm 353
3.1. Simplicial Subdivision of a Polytope 353
3.2. A Finite Branch and Bound Procedure 356
3.3. A Modified ES Algorithm 358
3.4. Unbounded Feasible Set 361
4. Rectangular Algorithms 365
4.1. Normal Rectangular Algorithm 366
4.2. Construction of an NRS Process 369
4.3. Specialization to Concave Quadratic Programming 371
4.4. Example 377

CHAPTER VIII. DECOMPOSITION OF LARGE SCALE PROBLEMS 381


1. Decomposition Framework 382

2. Branch and Bound Approach 384


2.1. Normal Simplicial Algorithm 385
2.2. Normal Rectangular Algorithm 388
2.3. Normal Conical Algorithm 390
3. Polyhedral Underestimation Method 391
3.1. Nonseparable Problems 391
3.2. Separable Problems 393
4. Decomposition by Outer Approximation 401
4.1. Basic Idea 401
4.2. Decomposition Algorithm 403
4.3. An Extension 408
4.4. Outer Approximation Versus Successive Partition 412
4.5. Outer Approximation Combined with Branch and Bound 417

5. Decomposition of Concave Minimization Problems


over Networks 421
5.1. The Minimum Concave Cost Flow Problem 421
5.2. The Single Source Uncapacitated Minimum Concave
Cost Flow (SUCF) Problem 426
5.3. Decomposition Method for (SUCF) 433
5.4. Extension 443

CHAPTER IX. SPECIAL PROBLEMS OF CONCAVE MINIMIZATION 447


1. Bilinear Programming 447
1.1. Basic Properties 448
1.2. Cutting Plane Method 451
1.3. Polyhedral Annexation 456
1.4. Conical Algorithm 458
1.5. Outer Approximation Method 462

2. Complementarity Problems 469


2.1. Basic Properties 470
2.2. Polyhedral Annexation Method for the Linear
Complementarity Problem (LCP) 472
2.3. Conical Algorithm for the (LCP) 475
2.4. Other Global Optimization Approaches to (LCP) 483
2.5. The Concave Complementarity Problem 486

3. Parametric Concave Programming 490


3.1. Basic Properties 492
3.2. Outer Approximation Method for (LRCP) 497
3.3. Methods Based on the Edge Property 500
3.4. Conical Algorithms for (LRCP) 508

PART C: GENERAL NONLINEAR PROBLEMS 517

CHAPTER X. D.C. PROGRAMMING 519

1. Outer Approximation Methods for Solving the Canonical


D.C. Programming Problem 519
1.1. Duality between the Objective and the Constraints 520
1.2. Outer Approximation Algorithms for Canonical D.C. Problems 526
1.3. Outer Approximation for Solving Noncanonical
D.C. Problems 541
2. Branch and Bound Methods 553

3. Solving D.C. Problems by a Sequence of Linear Programs


and Line Searches 558

4. Some Special D.C. Problems and Applications 572


4.1. The Design Centering Problem 572
4.2. The Diamond Cutting Problem 581
4.3. Biconvex Programming and Related Problems 592

CHAPTER XI. LIPSCHITZ AND CONTINUOUS OPTIMIZATION 603

1. Brief Introduction into the Global Minimization


of Univariate Lipschitz Functions 604
1.1. Saw-Tooth Covers 604
1.2. Algorithms for Solving the Univariate Lipschitz-Problem 609

2. Branch and Bound Algorithms 616


2.1. Branch and Bound Interpretation of Piyavskii's
Univariate Algorithm 617
2.2. Branch and Bound Methods for Minimizing a Lipschitz Function
over an n-dimensional Rectangle 621
2.3. Branch and Bound Methods for Solving Lipschitz
Optimization Problems with General Constraints 632
2.4. Global Optimization of Concave Functions Subject
to Separable Quadratic Constraints 634
2.5. Linearly Constrained Global Optimization of
Functions with Concave Minorants 645
3. Outer Approximation 653
4. The Relief Indicator Method 662
4.1. Separators for f on D 663
4.2. A Global Optimality Criterion 666
4.3. The Relief Indicator Method 670

References 681
Notation 719
Index 723
PART A

INTRODUCTION AND BASIC TECHNIQUES

Part A introduces the main global optimization problem classes we study, and develops some of their basic properties and applications. It then discusses some fundamental concepts that unify the various general methods of solution, such as outer approximation, concavity cuts, and branch and bound.
CHAPTER I

SOME IMPORTANT CLASSES OF GLOBAL


OPTIMIZATION PROBLEMS

In Chapter I, we introduce the main classes of global optimization problems that we study: concave minimization, reverse convex constraints, d.c. programming, and Lipschitz optimization. Some basic properties of these problems and various applications are discussed. It is also shown that very general systems of equalities and (or) inequalities can be formulated as global optimization problems.

1. GLOBAL OPTIMIZATION

We define a standard global optimization problem as follows.

Given a nonempty, closed set D ⊂ ℝⁿ and a continuous function f: A → ℝ, where A ⊂ ℝⁿ is a suitable set containing D, find at least one point x* ∈ D satisfying f(x*) ≤ f(x) for all x ∈ D, or show that such a point does not exist.

For the sake of simplicity of presentation it will sometimes be assumed that a solution x* ∈ D exists. In many cases, D will be compact and f will be continuous on an open set A ⊃ D. Then, clearly, the existence of x* is assured by the well-known Theorem of Weierstraß. In other important cases, one encounters compact feasible sets D and objective functions f that are continuous in the (relative) interior of D, but have discontinuities on the boundary of D.

Throughout this book, a global optimization problem will be denoted by

minimize f(x)    (1)
s.t. x ∈ D

A point x* ∈ D satisfying f(x*) ≤ f(x) ∀x ∈ D is called a global minimizer of f over D. The corresponding value of f is called the global minimum of f over D and is denoted by min f(D). The set of all solutions of problem (1) will be denoted by argmin f(D).

Note that since

max f(D) = −min(−f(D)),

global maximization problems are included in (1).
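Since every maximization problem can thus be recast as a minimization problem, an implementation needs only a minimization routine. A minimal numerical sketch (the objective and the grid stand-in for D are our own illustrative choices, not from the text):

```python
import numpy as np

# Check max f(D) = -min(-f(D)) on a discretized feasible set.
f = lambda x: np.sin(5 * x) + 0.5 * x      # a multiextremal objective (our choice)
D = np.linspace(0.0, 2.0, 2001)            # grid stand-in for D = [0, 2]

assert np.isclose(f(D).max(), -(-f(D)).min())
```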


Sometimes we shall be able to find all solutions. On the other hand, we frequently have to require additional properties of f and D, one of them being robustness of D.

Definition 1.1. A closed subset D ⊂ ℝⁿ is called robust if it is the closure of an open set.

Note that a convex set D ⊂ ℝⁿ with nonempty interior is robust (cf., e.g., Rockafellar (1970), Theorem 6.3).

Let ‖·‖ denote the Euclidean norm in ℝⁿ and let ε > 0 be a real number. Then an (open) ε-neighbourhood of a point x* ∈ ℝⁿ is defined as the open ball

N(x*, ε) := {x ∈ ℝⁿ: ‖x − x*‖ < ε}

centered at x* with radius ε.

A point x* ∈ D is called a local minimizer of f over D if there is an ε > 0 such that

f(x*) ≤ f(x) ∀x ∈ N(x*, ε) ∩ D

holds.
In order to understand the enormous difficulties inherent in global optimization problems and the computational cost of solving them, it is important to notice that all standard techniques in nonlinear optimization can at most locate local minima. Moreover, there is no local criterion for deciding whether a local solution is global. Therefore, conventional methods of optimization using such tools as derivatives, gradients, subgradients and the like, are, in general, not capable of locating or identifying a global optimum.

Remark 1.1. Several global criteria for a global minimizer have been proposed. Let D be bounded and robust and let f be continuous on D. Denote by μ(M) a measure of a subset M ⊂ ℝⁿ. Furthermore, let x* ∈ D and

S := {x ∈ D: f(x) < f(x*)}.

Then, obviously, μ(S) = 0 implies that x* is a global minimizer of f over D. But, apart from very special cases, there is no numerically feasible method for computing μ(S). A theoretical iterative scheme for solving global optimization problems that is based on this criterion was proposed by Galperin and Zheng (1987) (for a thorough exposition of the related so-called integral methods, see also Chew and Zheng (1988)).
Another global criterion is given by Falk (1973a). Let D have the above property and suppose that f(x) > 0 ∀x ∈ D. Let x̄ ∈ D. Define for k ∈ ℕ

r(x̄, k) := ∫_D [f(x)/f(x̄)]ᵏ dx.

Then, it is shown in Falk (1973a) that x̄ is a global maximizer of f over D if and only if the sequence {r(x̄, k)}_{k∈ℕ} is bounded.
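The behaviour of this criterion can be observed numerically. The sketch below (test function, domain, and Riemann-sum quadrature are our own assumptions) approximates r(x̄, k) for a point that is a global maximizer and for one that is not:

```python
import numpy as np

f = lambda x: 1.0 + np.exp(-20.0 * (x - 0.3) ** 2)   # f > 0 on D = [0, 1], peak at x = 0.3
x_grid = np.linspace(0.0, 1.0, 20001)

def r(x_bar, k):
    # Riemann-sum approximation of the integral of [f(x)/f(x_bar)]^k over D
    return ((f(x_grid) / f(x_bar)) ** k).mean() * 1.0

for k in (1, 5, 20, 40):
    # r(0.3, k) stays bounded by |D| = 1; r(0.9, k) grows, since f > f(0.9) near x = 0.3
    print(k, r(0.3, k), r(0.9, k))
```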


Although it is not very practical for solving the general global optimization problem, similar ideas have been used by Zhirov (1985) and Duong (1987) to propose algorithms for globally minimizing a polynomial over a parallelepiped, respectively, a polytope. Global optimality conditions for special problems can be found, e.g., in Hiriart-Urruty (1989) (differences of convex functions), and in Warga (1992) (quadratic problems).

Note that certain important classes of optimization problems have the property that every local minimum is a global minimum. A well-known example is convex minimization, where the objective function is a convex function and where the feasible set is a convex set. Since, in these classes, standard optimization procedures for finding local solutions will yield the global optimum, considerable effort has gone into characterizing families of functions and feasible sets having the property that every local minimum is a global solution. For a number of results on this question see Martos (1967), Mangasarian (1969), Martos (1975), Zang and Avriel (1975), Netzer and Passy (1975), Zang et al. (1976), Avriel (1976), Avriel and Zang (1981), Horst (1982 and 1984a,b), Gasanov and Rikun (1984 and 1985), Horst and Thach (1988).

Throughout this volume global optimization problems are considered where standard optimization techniques fail because of the existence of local minima that are not global. These global optimization problems will be called multiextremal global optimization problems.

Due to the inherent difficulties mentioned above, the methods devised for analysing multiextremal global optimization problems are quite diverse and significantly different from the standard tools referred to above.

Though several general theoretical concepts exist for solving problem (1), in order to build a numerically promising implementation, additional properties of the problem's data usually have to be exploited. Convexity, for example, will often be present. Many problems have linear constraints. Other problems involve Lipschitzian functions with known Lipschitz constants.

In recent years a rapidly growing number of proposals has been published for solving specific classes of multiextremal global optimization problems. It seems to be impossible to present a thorough treatment of all these methods in one volume. However, many of them can be interpreted as applications and combinations of certain recent basic approaches. Knowledge of these approaches not only leads to a deeper understanding of various techniques designed for solving specific problems, but also serves as a guideline in the development of new procedures.

This book presents certain deterministic concepts used in many methods for solving multiextremal global optimization problems that we believe to be promising for further research. These concepts will be applied to derive algorithms for solving broad classes of multiextremal global optimization problems that are frequently encountered in applications.

In order to describe these classes, the following definitions are introduced. We assume the reader to be familiar with basic notions and results on convexity.

Definition 1.2. Let C ⊂ ℝⁿ be convex. A function h: C → ℝ is called d.c. on C if there are two convex functions p: C → ℝ, q: C → ℝ such that

h(x) = p(x) − q(x) ∀x ∈ C.

A function that is d.c. on ℝⁿ will be called d.c. An inequality h(x) ≤ 0 is called a d.c. inequality whenever h is d.c.

In Definition 1.2, d.c. is an abbreviation for "difference of two convex functions".

Definition 1.3. Let M ⊂ ℝⁿ. A function h: ℝⁿ → ℝ is called Lipschitzian on M if there is a real constant L = L(h, M) > 0 such that

|h(x) − h(y)| ≤ L‖x − y‖ ∀x, y ∈ M.

An inequality h(x) ≤ 0 is called a Lipschitzian inequality (on M) whenever h is Lipschitzian on M.

Definition 1.4. Let h: ℝⁿ → ℝ be a convex function. Then the inequality h(x) ≤ 0 is called convex, whereas the inequality h(x) ≥ 0 is called reverse convex.

Sometimes one encounters quasiconvex functions. Recall that h: C → ℝ, with C convex, is quasiconvex if and only if, for all α ∈ ℝ, the level sets {x ∈ C: h(x) ≤ α} are convex. An alternative characterization of quasiconvexity is given by

h(λx + (1 − λ)y) ≤ max{h(x), h(y)} ∀x, y ∈ C, 0 ≤ λ ≤ 1.

The concepts to be described in the subsequent chapters will be used to develop algorithms for solving broad classes of multiextremal global optimization problems. The feasible set D and the objective function f can belong to one of the following classes.

Feasible Set D:
- convex and defined by finitely many convex inequalities,
- intersection of a convex set with finitely many complements of a convex set, defined by finitely many convex and finitely many reverse convex inequalities,
- defined by finitely many Lipschitzian inequalities.

Objective Function f:
- convex,
- concave,
- d.c.,
- Lipschitzian,
- certain generalizations of these four classes.

In addition, some classes of problems will be considered in which D is defined by a finite number of d.c. inequalities. Finally, it will be shown that various systems of equalities and inequalities can be solved by the algorithmic concepts to be presented.

The sections that follow contain an introduction to some basic properties of the problems introduced above. Many applications will be described, and various connections between these classes will be revealed.

2. CONCAVE MINIMIZATION

2.1. Definition and Basic Properties

One of the most important global optimization problems is that of minimizing a concave function over a convex set (or, equivalently, of maximizing a convex function over a convex set):

minimize f(x)    (2)
s.t. x ∈ D

where D ⊂ ℝⁿ is nonempty, closed and convex, and where f: A → ℝ is concave on a suitable set A ⊂ ℝⁿ containing D.

The concave minimization problem (2) is a multiextremal global optimization problem: it is easy to construct, for example, concave functions f and polytopes D having the property that every vertex of D is a local minimizer of f over D.

Example 1.1. Let f(x) = −‖x‖², where ‖·‖ denotes the Euclidean norm, and let D = {x ∈ ℝⁿ: a ≤ x ≤ b} with a, b ∈ ℝⁿ, a < 0, b > 0 (all inequalities are understood with respect to the componentwise order of ℝⁿ). It is easily seen that every vertex of D is a local minimizer of f over D: let v = (v₁,...,vₙ)ᵀ be a vertex of D. Then there are two index sets I₁, I₂ ⊂ {1,...,n} satisfying I₁ ∪ I₂ = {1,...,n}, I₁ ∩ I₂ = ∅ such that we have

vᵢ = aᵢ (i ∈ I₁), vᵢ = bᵢ (i ∈ I₂).

Let

0 < ε < min{−aᵢ, bᵢ: i = 1,...,n}

and consider the cube

B(v, ε) := {y ∈ ℝⁿ: maxᵢ |yᵢ − vᵢ| ≤ ε}.

Clearly, we have

N(v, ε) = {y ∈ ℝⁿ: ‖y − v‖ < ε} ⊂ B(v, ε).

All x ∈ B(v, ε) ∩ D satisfy

xᵢ = aᵢ + yᵢ (i ∈ I₁), xᵢ = bᵢ − yᵢ (i ∈ I₂), where 0 ≤ yᵢ ≤ ε.

By the definition of ε, it follows that we have

xᵢ² ≤ vᵢ² (i = 1,...,n),

hence f(v) ≤ f(x) ∀x ∈ N(v, ε) ∩ D.
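A numerical spot-check of this example (box data, neighbourhood radius, and sampling are our own choices) confirms that no feasible point near a vertex improves on the vertex value:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
a, b = np.array([-1.0, -2.0, -0.5]), np.array([2.0, 1.0, 1.5])      # a < 0 < b
f = lambda x: -np.dot(x, x)                                         # f(x) = -||x||^2
eps = 0.25 * min(np.min(-a), np.min(b))                             # eps < min{-a_i, b_i}

for bits in itertools.product((0, 1), repeat=3):
    v = np.where(np.array(bits) == 0, a, b)                         # a vertex of D
    pts = np.clip(v + rng.uniform(-eps, eps, (2000, 3)), a, b)      # feasible points near v
    assert all(f(v) <= f(x) + 1e-12 for x in pts)                   # v is a local minimizer
```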

Some basic properties of problem (2), however, make concave programming problems easier to handle than general multiextremal global optimization problems.

Besides convexity of the feasible set D, which will be heavily exploited in the design of algorithms for solving (2), the most interesting property is that a concave function f attains its global minimum over D at an extreme point of D.

Theorem 1.1. Let f: D → ℝ be concave and let D ⊂ ℝⁿ be nonempty, compact and convex. Then the global minimum of f over D is attained at an extreme point of D.

Proof. The global minimum of f over the compact set D exists by the well-known Theorem of Weierstraß, since a concave function defined on ℝⁿ is continuous everywhere. It suffices to show that for every x ∈ D there is an extreme point v of D such that f(x) ≥ f(v) holds.

By the Theorems of Krein-Milman/Carathéodory, there is a natural number k ≤ n+1 such that

x = ∑ᵢ₌₁ᵏ λᵢvⁱ,  ∑ᵢ₌₁ᵏ λᵢ = 1,  λᵢ ≥ 0 (i = 1,...,k),    (3)

where vⁱ (i = 1,...,k) are extreme points of D. Let v satisfy f(v) = min{f(vⁱ): i = 1,...,k}. Then we see from the concavity of f and from (3) that we have

f(x) ≥ ∑ᵢ₌₁ᵏ λᵢf(vⁱ) ≥ f(v)·(∑ᵢ₌₁ᵏ λᵢ) = f(v). •
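When the vertex set of D is available explicitly, Theorem 1.1 already yields the crudest possible algorithm: evaluate f at every vertex and keep the best. A minimal sketch (cube and objective are our own illustrative data):

```python
import itertools
import numpy as np

f = lambda x: -np.sum(x ** 2)                            # a concave objective
vertices = [np.array(v, dtype=float)                     # vertex set of the unit cube in R^3
            for v in itertools.product((0.0, 1.0), repeat=3)]
v_star = min(vertices, key=f)                            # global minimizer by Theorem 1.1

# sanity check: no convex combination of the vertices does better
rng = np.random.default_rng(1)
w = rng.dirichlet(np.ones(len(vertices)), size=1000)     # random barycentric weights
assert all(f(v_star) <= f(x) + 1e-12 for x in w @ np.stack(vertices))
```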
From the viewpoint of computational complexity, a concave minimization problem is NP-hard, even in such special cases as that of minimizing a quadratic concave function over very simple polytopes such as hypercubes (e.g., Pardalos and Schnitger (1987)). One exception is the negative Euclidean norm, which can be minimized over a hyperrectangle by an O(n) algorithm (for related work see Gritzmann and Klee (1988), Bodlaender et al. (1990)). Among the few practical instances of concave minimization problems for which polynomial time algorithms have been constructed are certain production scheduling, production-transportation, and inventory models that can be regarded as special network flow problems (e.g., Zangwill (1968 and 1985), Love (1973), Konno (1973 and 1988), Afentakis et al. (1984), Tuy et al. (1995)). More details on the complexity of concave minimization and related problems can be found in Horst et al. (1995), Vavasis (1991, 1995).

The range of practical applications of concave minimization problems is very broad. Large classes of decision models arising from operations research and mathematical economics and many engineering problems lead to formulations such as problem (2). Furthermore, many models which originally are not concave can be transformed into equivalent concave minimization problems. We briefly discuss some important examples and relationships in the next sections.

A comprehensive survey of concave programming is given in Benson (1995), a more introductory treatment in Tuy (1994a) and Horst et al. (1995).

2.2. Brief Survey of Direct Applications

Many problems consist in choosing the levels xᵢ of n activities i = 1,...,n, restricted to aᵢ ≤ xᵢ ≤ bᵢ, aᵢ, bᵢ ∈ ℝ₊, aᵢ < bᵢ, producing independent costs fᵢ: [aᵢ, bᵢ] → ℝ₊, subject to additional convex or (in most applications) linear inequality constraints. The objective function f(x) = ∑ᵢ₌₁ⁿ fᵢ(xᵢ) is separable, i.e., the sum of n functions fᵢ(xᵢ). Each fᵢ typically reflects the fact that the activities incur a fixed setup cost when the activity is started (positive jump at xᵢ = 0) as well as a variable cost related to the level of the activity.

If the variable cost is linear and the setup cost is positive, then fᵢ is nonlinear and concave, and we have the objective functions of the classical fixed charge problems (e.g., Murty (1969), Bod (1970), Steinberg (1970), Cabot (1974)). Frequently encountered special examples include fixed charge transportation problems and capacitated as well as uncapacitated plant location (or site selection) problems (e.g., Manne (1964), Gray (1971), Dutton et al. (1974), Barr et al. (1981)). Multilevel fixed charge problems have also been discussed by Jones and Soland (1969). Interactive fixed charge problems were investigated by Kao (1979), Erenguc and Benson (1986), Benson and Erenguc (1988) and references therein.

Often, price breaks and setup costs yield concave functions fi having piecewise

linear variable costs. An early problem of this type was the bid evaluation problem

(e.g., Bracken and McCormick (1968), and Horst (1980a)). Moreover, piecewise

linear concave functions frequently arise in inventory models and in connection with

constraints that form so-called Leontiev substitution systems (e.g., Zangwill (1966),
Veinott (1969), and Koehler et al. (1975)).

Nonlinear concave variable costs occur whenever it is assumed that as the number of units of a product increases, the unit cost strictly decreases (economies of scale). Different concave cost functions that might be used in practice are discussed by, e.g., Zwart (1974). Often the concave functions fᵢ are assumed to be quadratic (or
approximated by quadratic functions); the most well-known examples are quadratic

transportation and related network-flow problems (e.g., Koopmans and Beckman

(1957), Cabot and Francis (1974), Bhatia (1981), Bazaraa and Sherali (1982),

Florian (1986)).
Many other situations arising in practice lead to minimum concave objective

network problems. Examples include problems in communications network planning,


transportation, water resource management, air traffic control, hydraulic or sewage
network planning, location problems, inventory and production planning, etc. A
comprehensive survey of nonconvex, in particular concave network problems with an
extensive bibliography is given in Guisewite (1995).

Global optimization of the general (possibly indefinite) quadratic case is discussed

by many authors, e.g., Ritter (1965 and 1966), Cabot and Francis (1970), Balas

(1975a), Tammer (1976), Konno (1976a and 1980), Gupta and Sharma (1983), Rosen

(1983, 1984 and 1984a), Aneja et al. (1984), Kalantari (1984), Schoch (1984), Thoai

(1984), Pardalos (1985, 1987, 1988 and 1988a), Benacer and Tao (1986), Kalantari

and Rosen (1986 and 1987), Pardalos and Rosen (1986 and 1987), Rosen and

Pardalos (1986), Thoai (1987), Tuy (1987), Pardalos and Gupta (1988), Warga
(1992), Bomze and Danninger (1994), Horst et al. (1995), Horst and Thoai (1995),

Floudas and Visweswaran (1995).

Often, nonseparable concave objective functions occur in the context of models related to economies of scale or the maximization of utility functions having increasing marginal values. Some models yield quasiconcave rather than concave objective functions that in many respects behave like concave functions in minimization problems. Moreover, quasiconcave functions are often concave in the region of interest (e.g., Rössler (1971) and references therein). Sometimes these functions can be suitably transformed into concave functions that yield equivalent minimization problems (e.g., Horst (1984)).


A model arising from production planning is discussed in Rössler (1971), and a

model for national development is treated in Brotchi (1971).

Many other specific problems lead directly to concave minimization; examples are discussed in, e.g., Grotte (1975), and McCormick (1973 and 1983). Of particular interest are certain engineering design problems. Many of them can be formulated as suitable concave objective network problems (see above). Other examples arise from VLSI chip design (Watanabe (1984)), in the fabrication of integrated circuits (Vidigal and Director (1982)), and in diamond cutting (Nguyen et al. (1985)). The last two examples are so-called design centering problems that turn out to be linear, concave or d.c. programming problems (cf. Vidigal and Director (1982), Thach (1988), see also Section 1.3.).

2.3. Integer Programming and Concave Minimization

One of the most challenging classes of optimization problems with a wide range of
applications is integer programming. These are extremum problems with a discrete
feasible set.

In this section, it will be shown that concave programming is a sort of bridge between integer and nonlinear programming. Broad classes of integer programming problems can be formulated as equivalent concave programming problems, where equivalence is understood in the sense that the sets of optimal solutions coincide.

This equivalence is well known for the quadratic assignment problem (e.g., Bazaraa and Sherali (1982), Lawler (1963)) and the 3-dimensional assignment problem (e.g., Frieze (1974)). The zero-one integer linear programming problem was reduced to a quadratic concave programming problem subject to linear constraints by Raghavachari (1969). More general results on the connections between integer and nonlinear programming are given in Giannessi and Niccolucci (1976), where Theorem 1.2 below is presented.

Let C ⊂ ℝⁿ, B = {0,1}, Bⁿ = B×...×B (n times), f: ℝⁿ → ℝ. Consider the integer programming problem

minimize f(x)
s.t. x ∈ C ∩ Bⁿ    (4)

Define e := (1,1,...,1)ᵀ ∈ ℝⁿ, E := {x ∈ ℝⁿ: 0 ≤ x ≤ e}, and associate with (4) the nonlinear problem

minimize [f(x) + μx(e − x)]
s.t. x ∈ C ∩ E    (5)

which depends on the real number μ. Then the following connection between (4) and (5) holds.

Theorem 1.2. Let C be a closed subset of ℝⁿ satisfying C ∩ Bⁿ ≠ ∅ and suppose that f: ℝⁿ → ℝ is Lipschitzian on an open set A ⊃ E and twice continuously differentiable on E. Then there exists a μ₀ ∈ ℝ such that for all μ > μ₀ we have

(i) (4) and (5) are equivalent,
(ii) f(x) + μx(e − x) is concave on E.

Proof. (i): Set φ(x) := x(e − x). The function φ(x) is continuous on E, and obviously we have

φ(x) = 0 ∀x ∈ Bⁿ,  φ(x) > 0 ∀x ∈ E \ Bⁿ.    (6)

We show that there is a μ₀ ∈ ℝ such that, whenever μ > μ₀, the global minimum of f(x) + μφ(x) over C ∩ E is attained on Bⁿ.

First, note that, for all y ∈ Bⁿ, there exists an open neighbourhood N(y, ε) = {x ∈ ℝⁿ: ‖x − y‖ < ε} such that for all x ∈ N(y, ε) ∩ (E \ Bⁿ) we have

φ(x) ≥ (1 − ε)‖x − y‖    (7)

(‖·‖ denotes the Euclidean norm).

To see this, let ε < 1, r := ‖x − y‖ < ε, and u := (1/r)(x − y). Then, as x = y + ru ∈ E, we have

φ(x) = ∑ⱼ₌₁ⁿ (ruⱼ + yⱼ)(1 − yⱼ − ruⱼ).

Since yⱼ = 1 implies uⱼ ≤ 0, and yⱼ = 0 implies uⱼ ≥ 0, we may express φ(x) in the following way:

φ(x) = ∑_{uⱼ>0} ruⱼ(1 − ruⱼ) + ∑_{uⱼ<0} (1 + ruⱼ)(−ruⱼ) = ∑ⱼ₌₁ⁿ r|uⱼ|(1 − r|uⱼ|) = r ∑ⱼ₌₁ⁿ |uⱼ| − r² ∑ⱼ₌₁ⁿ |uⱼ|².

Using 0 < r < ε < 1, ∑ⱼ₌₁ⁿ |uⱼ|² = (1/r²)‖x − y‖² = 1 and ∑ⱼ₌₁ⁿ |uⱼ| ≥ ‖u‖ = 1, we finally see that

φ(x) ≥ r(1 − ε) = (1 − ε)‖x − y‖

holds.

Now define C₁ := C ∩ Bⁿ, C₂ := C ∩ E, and, for y ∈ C₁,

F_y(x) := (f(y) − f(x))/φ(x),  x ∈ C₂ \ C₁.

We prove that F_y(x) is always bounded in some neighbourhood of y. To see this, consider A(y, ε) := A ∩ N(y, ε), where A is the open set containing E introduced in the assumptions of Theorem 1.2. Then, by (7), it follows that

φ(x) ≥ (1 − ε)‖x − y‖  ∀x ∈ A(y, ε) ∩ (C₂ \ C₁)

holds. Moreover, by assumption, f is Lipschitzian on A, i.e., there is a constant L > 0 such that we have

|f(x) − f(y)| ≤ L‖x − y‖  ∀x, y ∈ A.

The last two inequalities yield

|F_y(x)| ≤ L/(1 − ε) < +∞  ∀x ∈ A(y, ε) ∩ (C₂ \ C₁), y ∈ C₁.

The family of sets {A(yⁱ, ε): yⁱ ∈ C₁} is a finite cover of C₁ with at most k = 2ⁿ members. Consider

C₃ := (∪ᵢ₌₁ᵏ A(yⁱ, ε)) ∩ C₂.

Clearly, C₁ ⊂ C₃, and μ > μ₂ := L/(1 − ε) implies that

f(x) + μφ(x) > f(yⁱ)  ∀x ∈ A(yⁱ, ε) ∩ (C₂ \ C₁), yⁱ ∈ C₁    (8)

holds.

Finally, consider the compact set

C₄ := C₂ \ ∪ᵢ₌₁ᵏ N(yⁱ, ε).

By the definition of the sets involved, the following relations hold:

φ(x) > 0 ∀x ∈ C₄,  C₄ ∪ (C₃ \ C₁) = C₂ \ C₁.

Since f is continuous on E and C₂ is a compact set contained in E, the quantities m_f := min f(C₂) and M_f := max f(C₂) exist. By a similar argument, we see that also m_φ := min φ(C₄) exists. Note that we have φ(x) > 0 on C₄, hence m_φ > 0 and

μ₁ := (M_f − m_f)/m_φ ≥ 0.

It follows that

f(x) + μφ(x) > M_f    (9)

holds for all x ∈ C₄ whenever μ > μ₁.

Choose μ > max{μ₁, μ₂}. Then both inequalities (8) and (9) are fulfilled, and the global minimum m of f(x) + μφ(x) over C₂ cannot be attained on C₄ ∪ (C₃ \ C₁) = C₂ \ C₁. It follows that m has to be attained on C₁. But φ(x) vanishes on C₁. In other words, the problems (4) and (5) are equivalent.

(ii): Denote by ∇²f the Hessian matrix of f. The Hessian ∇²f exists on E and its elements are bounded there. Then, by a well-known criterion on the definiteness of symmetric matrices and diagonal dominance, there is a μ₃ > 0 such that, whenever μ > μ₃, the Hessian ∇²f − 2μI of f(x) + μxᵀ(e − x) is negative semidefinite on E, and this implies concavity of f(x) + μxᵀ(e − x).

From the above, Theorem 1.2 follows for μ₀ = max{μ₁, μ₂, μ₃}. •

Now let C be a convex set. Then problem (5) is a concave minimization problem whenever μ > μ₀, and Theorem 1.2 shows that large classes of integer programming problems are equivalent to concave minimization problems.
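The equivalence can be watched on a tiny instance by brute force. In the sketch below (the data and the value μ = 100 are our own ad hoc choices, not the parameter estimates cited further on), the penalty μx(e − x) vanishes exactly on Bⁿ and dominates the objective at the fractional points of a test grid:

```python
import itertools
import numpy as np

n, mu = 4, 100.0
c = np.array([3.0, -1.0, 2.0, -2.0])
f = lambda x: c @ x                                      # objective of (4), with C = R^n
pen = lambda x: f(x) + mu * (x @ (1.0 - x))              # objective of (5); x(e - x) = 0 on B^n

best_int = min((np.array(v, float) for v in itertools.product((0, 1), repeat=n)), key=f)
grid = [np.array(g, float) for g in itertools.product((0.0, 0.5, 1.0), repeat=n)]
best_pen = min(grid, key=pen)                            # grid contains fractional points too

assert np.allclose(best_pen, best_int)                   # penalized minimum is the 0-1 minimum
assert np.isclose(pen(best_pen), f(best_int))
```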

Important special classes of integer programming problems are the integer linear programming problem

minimize cx
s.t. Ax ≤ b, x ∈ Bⁿ,    (10)

where c ∈ ℝⁿ, b ∈ ℝᵐ, A ∈ ℝ^{m×n}, and the integer quadratic programming problem

minimize cx + ½ x(Cx)
s.t. Ax ≤ b, x ∈ Bⁿ,    (11)

which adds to the objective function of (10) a quadratic term with matrix C ∈ ℝ^{n×n} (assumed symmetric without loss of generality). When the feasible sets in (10) and (11) are not empty, the assumptions of Theorem 1.2 are obviously satisfied for problems (10) and (11).

Estimates for the parameter μ in Theorem 1.2 can be found in Borchardt (1980), Kalantari and Rosen (1987a), Horst et al. (1995).

Note that problem (4) also covers the cases where x ∈ Bⁿ is replaced by xⱼ ∈ ℕ ∪ {0}, and the variables xⱼ are bounded. A simple representation of xⱼ by (0-1)-variables is then

xⱼ = ∑ᵢ₌₀ᴷ yᵢⱼ 2ⁱ,  yᵢⱼ ∈ {0,1},

where K is an integer upper bound on log₂ xⱼ.
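A tiny helper pair (our sketch) realizing this 0-1 representation of a bounded integer variable:

```python
def to_digits(x, K):
    # binary digits y_0, ..., y_K of x, so that x = sum_i y_i * 2^i
    return [(x >> i) & 1 for i in range(K + 1)]

def from_digits(y):
    return sum(bit * 2 ** i for i, bit in enumerate(y))

assert from_digits(to_digits(13, K=4)) == 13             # 13 = 1 + 4 + 8
```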


In practice, when dealing with integer programming problems, a transformation into an equivalent concave minimization problem may be of benefit only for special cases (e.g., Bazaraa and Sherali (1982), Frieze (1974)). The connections discussed above, however, may lead to an adaptation of typical ideas used in concave minimization to approaches for solving certain integer problems (e.g., Adams and Sherali (1986), Beale and Forrest (1978), Erenguc and Benson (1987), Glover and Klingman (1973), Horst et al. (1995)).



2.4. Bilinear Programming and Concave Minimization

One of the most often encountered difficult multiextremal global problems in mathematical programming is the bilinear programming problem, whose general form is

minimize f(x,y) = px + x(Cy) + qy    (12)
s.t. x ∈ X, y ∈ Y

where X, Y are given closed convex polyhedral sets in ℝⁿ, ℝᵐ respectively, and p ∈ ℝⁿ, q ∈ ℝᵐ, C ∈ ℝ^{n×m}. Problem (12) was studied in the bimatrix game context by, e.g., Mangasarian (1964), Mangasarian and Stone (1964), Altman (1968). Further applications include dynamic Markovian assignment problems, multicommodity network flow problems and certain dynamic production problems. An extensive discussion of applied problems which can be formulated as bilinear programming problems is given by Konno (1971a).
The first solution procedures were either locally convergent (e.g., Altman (1968), Cabot and Francis (1970)) or completely enumerative (e.g., Mangasarian and Stone (1964)). Cabot and Francis (1970) proposed an extreme point ranking procedure. Subsequent solution methods included various relaxation and cutting plane techniques (e.g., Gallo and Ülkücü (1977), Konno (1971, 1971a and 1976), Vaish and Shetty (1976 and 1977), Sherali and Shetty (1980a)), or branch and bound approaches (e.g., Falk (1973), Al-Khayyal (1990)). A general treatment of cutting plane methods and branch and bound techniques will be given in subsequent chapters. Related approaches to bilinear programming can be found in Al-Khayyal (1977), Czochralska (1982 and 1982a), Thieu (1988 and 1989) and Sherali and Alameddine (1992).
An extension to biconvex programming problems, where the objective function in (12) is biconvex, i.e.,

f(x,y) = f₁(x) + x(Cy) + f₂(y)

with f₁, f₂ convex, and where the constraints form a convex set in ℝ^{n+m}, is given in Al-Khayyal and Falk (1983).

Most of the methods mentioned above use in an explicit or implicit way the close relationship between problem (12) and concave minimization. Formulations of bilinear problems as concave minimization problems are discussed, e.g., in Altman (1968), Konno (1976), Gallo and Ülkücü (1977), Thieu (1980) and Aggarwal and Floudas (1990). Hansen and Jaumard (1992) discuss different ways to reduce linearly constrained concave quadratic problems to bilinear problems. Frieze (1974) has reduced the 3-dimensional assignment problem to a bilinear programming problem and then to a special concave programming problem.

Theorem 1.3. In problem (12) assume that Y has at least one vertex and that for every x ∈ X

minimize f(x,y)
s.t. y ∈ Y    (13)

has a solution. Then problem (12) can be reduced to a concave minimization problem with piecewise linear objective function and linear constraints.

Proof. Note first that the assumptions of Theorem 1.3 are both satisfied if Y is nonempty and compact. Denote by V(Y) the set of vertices of Y. It is known from the theory of linear programming that for every x ∈ X the solution of (13) is attained at least at one vertex of Y. Problem (12) can be restated as

min_{x∈X, y∈Y} f(x,y) = min_{x∈X} {min_{y∈Y} f(x,y)} = min_{x∈X} {min_{y∈V(Y)} f(x,y)} = min_{x∈X} f̄(x),

where

f̄(x) := min_{y∈V(Y)} f(x,y) = min_{y∈Y} f(x,y).

The set V(Y) is finite, and for each y ∈ V(Y), f(x,y) is an affine function of x. Thus, f̄(x) is the pointwise minimum of a finite family of affine functions, and hence is concave and piecewise linear. •
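The construction in the proof is easy to carry out for small instances. In the sketch below (all data our own), Y is a box, so V(Y) can be enumerated, and the reduced objective is checked for midpoint concavity on random pairs:

```python
import itertools
import numpy as np

p, q = np.array([1.0, -1.0]), np.array([0.5, 2.0])
C = np.array([[1.0, -2.0], [3.0, 0.5]])
f_xy = lambda x, y: p @ x + x @ (C @ y) + q @ y          # bilinear objective of (12)

V_Y = [np.array(v, float) for v in itertools.product((0.0, 1.0), repeat=2)]
f_bar = lambda x: min(f_xy(x, y) for y in V_Y)           # pointwise min of affine functions of x

rng = np.random.default_rng(2)
for _ in range(1000):
    x1, x2 = rng.uniform(-2.0, 2.0, (2, 2))
    # midpoint concavity: f((x1 + x2)/2) >= (f(x1) + f(x2))/2
    assert f_bar((x1 + x2) / 2) >= (f_bar(x1) + f_bar(x2)) / 2 - 1e-9
```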

In order to obtain a converse result, let f(x) := 2px + x(Cx), where p ∈ ℝⁿ and C ∈ ℝ^{n×n} is symmetric and negative semidefinite. Consider the quadratic concave programming problem

minimize f(x)    (14)
s.t. x ∈ X

where X is a closed convex polyhedral set in ℝⁿ.

Define f(x,y) := px + py + x(Cy), let Y = X, and consider the bilinear programming problem

minimize f(x,y)    (15)
s.t. x ∈ X, y ∈ Y

Theorem 1.4. Under the assumptions above the following equivalence holds: If x* solves (14), then (x*, x*) solves (15). If (x̄, ȳ) solves (15), then x̄ and ȳ solve (14).

Proof. By definition, f(x*) ≤ f(x) ∀x ∈ X. In particular,

f(x*) ≤ f(x̄) = f(x̄, x̄),
f(x*) ≤ f(ȳ) = f(ȳ, ȳ).    (16)

On the other hand, we obviously have

f(x̄, ȳ) = min_{x∈X, y∈Y} f(x,y) ≤ min_{x∈X} f(x,x) = min_{x∈X} f(x) = f(x*).    (17)

Combining (16) and (17), we obtain

f(x̄, ȳ) ≤ f(x*) ≤ min{f(x̄), f(ȳ)},

and so Theorem 1.4 is established if we prove that

f(x̄, ȳ) = f(x̄) = f(ȳ).    (18)

Obviously, we have

0 ≤ f(x̄, x̄) − f(x̄, ȳ) = p(x̄ − ȳ) + x̄(C(x̄ − ȳ)),
0 ≤ f(ȳ, ȳ) − f(x̄, ȳ) = p(ȳ − x̄) + ȳ(C(ȳ − x̄)).    (19)

Adding these inequalities yields

(x̄ − ȳ)(C(x̄ − ȳ)) ≥ 0,

which, by the negative semidefiniteness of C, implies

C(x̄ − ȳ) = 0.    (20)

Inserting (20) into (19), we obtain p(x̄ − ȳ) = 0, i.e.,

px̄ = pȳ.    (21)

Thus, by (20) and (21),

f(x̄, ȳ) = px̄ + pȳ + x̄(Cȳ) = 2px̄ + x̄(Cx̄) = 2pȳ + ȳ(Cȳ),

hence (18) follows. •
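Theorem 1.4 can be verified numerically on a small instance. Both minima below are attained at vertices (by Theorem 1.1, and by the piecewise linear concavity from Theorem 1.3), so vertex enumeration suffices; the data are our own:

```python
import itertools
import numpy as np

p = np.array([1.0, -2.0])
C = -np.array([[2.0, 1.0], [1.0, 1.0]])                  # symmetric, negative definite
f = lambda x: 2.0 * (p @ x) + x @ (C @ x)                # concave quadratic of (14)
F = lambda x, y: p @ x + p @ y + x @ (C @ y)             # bilinear objective of (15)

V = [np.array(v, float) for v in itertools.product((0.0, 1.0), repeat=2)]  # vertices of X = Y
assert np.isclose(min(f(x) for x in V),
                  min(F(x, y) for x in V for y in V))    # equal optimal values
```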



Note that the proof of Theorem 1.4 did not make use of the special form of the feasible set X in (14). Hence, Theorem 1.4 can be extended to more general feasible sets.

Note as well that, by Theorem 1.2 and Theorem 1.4, the integer linear programming problem (10) can be reduced to a bilinear problem.

Bilinear programming problems will be treated in Chapter IX.



2.5. Complementarity Problems and Concave Minimization

Let D ⊂ ℝⁿ, g, h: ℝⁿ → ℝᵐ. The problem of finding x ∈ D such that

g(x) ≥ 0, h(x) ≥ 0, g(x)h(x) = 0    (22)

is called a complementarity problem.


Complementarity problems play a very important role in the decision sciences
and in applied mathematics. There are elose relations to fixed-point problems and

variational inequalities. The literature on complementarity problems contains an

enormous nurnber of papers and some excellent textbooks. We only cite here the
monographs of Lüthi (1976), Garcia and Zangwill (1981), Gould and Tolle (1983),

Murty (1988), Cottle et al. (1992), and the surveys of Al-Khayyal (1986a) and Pang

(1995).
A frequently studied special case is the linear complementarity problem which is

obtained from (22) if we set D = ~, n = m, g(x) = x and h(x) = Mx+q, M E IRnxn ,


n
q E IR .
It has been shown in Mangasarian (1978) that a linear complementarity problem

can be converted into an equivalent concave minimization problem. Certain linear


complementarity problems are even solvable as linear programs (e.g., Cottle and
Pang (1978), Mangasarian (1976, 1978 and 1979), AI-Khayyal (1985 and 1986a».
Thieu (1980) proved that certain elasses of nonlinear complementarity problems

are equivalent to concave minimization problems.

Theorem 1.5. In (22) let D be a convex set and let g and h be concave mappings on D. Suppose that (22) has a solution. Then (22) is equivalent to the concave minimization problem

minimize f(x) := ∑ᵢ₌₁ᵐ min{gᵢ(x), hᵢ(x)}
s.t. g(x) ≥ 0, h(x) ≥ 0, x ∈ D.    (23)

Proof. Let x* be a solution of (22). Then, obviously, x* is feasible for (23). From g(x*) ≥ 0, h(x*) ≥ 0, and g(x*)h(x*) = 0 it follows that f(x*) = 0. But all feasible points x of (23) satisfy f(x) ≥ 0; hence x* is an optimal solution of (23).

Conversely, since (22) is assumed to have a solution, every optimal solution x* of (23) has to satisfy f(x*) = 0. Thus we have g(x*) ≥ 0, h(x*) ≥ 0, min{gᵢ(x*), hᵢ(x*)} = 0 (i = 1,...,m), hence g(x*)h(x*) = 0. •
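For the linear complementarity problem (g(x) = x, h(x) = Mx + q), the reduction reads f(x) = ∑ᵢ min{xᵢ, (Mx + q)ᵢ}. A small check (data ours; this instance happens to satisfy Mx* + q = 0):

```python
import numpy as np

M = np.array([[2.0, 1.0], [1.0, 3.0]])
q = np.array([-3.0, -4.0])
f = lambda x: np.minimum(x, M @ x + q).sum()             # objective of (23) for the LCP

x_star = np.linalg.solve(M, -q)                          # x* = (1, 1): x* >= 0 and Mx* + q = 0
assert np.all(x_star >= 0) and np.isclose(f(x_star), 0.0)

# f >= 0 on the feasible set {x >= 0, Mx + q >= 0}, so x* is its global minimizer
rng = np.random.default_rng(3)
pts = rng.uniform(0.0, 3.0, (2000, 2))
assert all(f(x) >= -1e-12 for x in pts if np.all(M @ x + q >= 0))
```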

Again, since the convexity of D and the concavity of g and h were not used in the proof, the result can be generalized to arbitrary D, g, and h. Additional formulations of linear complementarity problems are given in Horst et al. (1995).

In the linear case, when D is a convex polyhedral set and h and g are affine functions, f(x) is concave and piecewise linear.

An algorithm for solving certain complementarity problems that is based on the above relationship with concave minimization is proposed by Thoai and Tuy (1983) and Tuy et al. (1985); see also Al-Khayyal (1986a and 1987), Pardalos and Rosen (1988) and references therein. These procedures will be treated in Chapter IX.

2.6. Max-Min Problems and Concave Minimization

A linear max-min problem can be stated as

maximize_x min_y {cx + dy}    (24)
s.t. Ax + By ≤ b, x, y ≥ 0

where x, c ∈ ℝⁿ, y, d ∈ ℝᵐ, A ∈ ℝ^{s×n}, B ∈ ℝ^{s×m}, and b ∈ ℝˢ.

The relationship between problem (24) and concave minimization was discussed and exploited by Falk (1973).



Theorem 1.6. Let the feasible set of (24) be nonempty and compact. The problem (24) is equivalent to a concave minimization problem with piecewise linear objective function and linear constraints.

Proof. Let D = {(x, y) ∈ ℝ^{n+m}: Ax + By ≤ b, x, y ≥ 0} denote the feasible set of (24). Consider the orthogonal projection P of D onto the space ℝⁿ of the variable x, i.e., P := {x ∈ ℝⁿ: x ≥ 0, Ax + By ≤ b for at least one y ≥ 0}. Further, for x ∈ P, let D_x := {y ∈ ℝᵐ: By ≤ b − Ax, y ≥ 0}. Then (24) may be rewritten as

maximize_{x∈P} {cx + min_{y∈D_x} dy}.

It is well known from the theory of parametric linear programming that the function

f(x) := min_y {dy: By ≤ b − Ax, y ≥ 0}

is convex and piecewise linear on P. Therefore, (24) is equivalent to the convex maximization (or, equivalently, concave minimization) problem

maximize f(x) + cx
s.t. x ∈ P. •
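The proof is constructive enough to implement directly for small data: f(x) is the value of a linear program, and f(x) + cx is then maximized over P. The sketch below (all problem data our own; SciPy's linprog does the inner minimization) merely scans a grid of P for illustration, whereas the exact method would search the extreme points of P:

```python
import numpy as np
from scipy.optimize import linprog

c, d = np.array([1.0]), np.array([2.0, -1.0])
A = np.array([[1.0], [0.5], [0.0]])
B = np.array([[1.0, 1.0], [0.0, 1.0], [-1.0, 0.0]])
b = np.array([4.0, 3.0, 0.0])

def f(x):
    # f(x) = min { dy : By <= b - Ax, y >= 0 }, convex and piecewise linear on P
    res = linprog(d, A_ub=B, b_ub=b - A @ x, bounds=[(0, None)] * 2)
    return res.fun if res.success else None              # None signals x outside P

xs = np.linspace(0.0, 4.0, 81)                           # grid over the projection P = [0, 4]
vals = []
for t in xs:
    ft = f(np.array([t]))
    vals.append(-np.inf if ft is None else ft + c[0] * t)
i = int(np.argmax(vals))
print("approximate max-min value:", vals[i], "at x =", xs[i])
```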

3. D.C. PROGRAMMING AND REVERSE CONVEX CONSTRAINTS

3.1. D.C. Programming: Basic Properties

Recall from Section I.1 that a real-valued function f defined on a convex set C ⊂ ℝⁿ is called d.c. on C if, for all x ∈ C, f can be expressed in the form

f(x) = p(x) − q(x),    (25)

where p and q are convex functions on C.

The function f is called d.c. if f is d.c. on ℝⁿ. The representation (25) is said to be a d.c. decomposition of f.

A global optimization problem is called a d.c. programming problem if it has the form

minimize f(x)
s.t. x ∈ C, gⱼ(x) ≤ 0 (j = 1,...,m)    (26)

where C ⊂ ℝⁿ is convex and all functions f and gⱼ are d.c. on C.

Note that C is usually given by a system of convex inequalities, so that, for the sake of simplicity of notation, we will sometimes set C = ℝⁿ without loss of generality.

Clearly, every concave minimization problem is also a d.c. programming problem, and it follows from Example 1.1 (Section I.2.1) that (26) is a multiextremal optimization problem.

The following results show that the class of d.c. functions is very rich and, moreover, that it enjoys a remarkable stability with respect to operations frequently encountered in optimization.

Theorem 1.7. Let f, fᵢ (i = 1,...,m) be d.c. Then the following functions are also d.c.:

(i) ∑ᵢ₌₁ᵐ λᵢfᵢ, for any real numbers λᵢ;

(ii) maxᵢ₌₁,...,ₘ fᵢ and minᵢ₌₁,...,ₘ fᵢ;

(iii) |f(x)|, f⁺(x) := max{0, f(x)} and f⁻(x) := min{0, f(x)}.

Proof. Let f = p − q, f_i = p_i − q_i (i=1,...,m) be d.c. decompositions of f and the f_i, respectively.

Assertion (i) is a straightforward consequence of well-known properties of convex and concave functions.

(ii): f_i = p_i − q_i = p_i + Σ_{j≠i} q_j − Σ_{j=1}^m q_j . The last sum does not depend on i. Therefore, we have

max_{i=1,...,m} f_i = max_{i=1,...,m} {p_i + Σ_{j≠i} q_j} − Σ_{j=1}^m q_j .

This is a d.c. decomposition, since the sum and the maximum of finitely many convex functions are convex.

Similarly, we see that

min_{i=1,...,m} f_i = Σ_{j=1}^m p_j + min_{i=1,...,m} {−(Σ_{j≠i} p_j) − q_i} = Σ_{j=1}^m p_j − max_{i=1,...,m} {(Σ_{j≠i} p_j) + q_i}

is d.c.

(iii): Suppose that we have p(x) ≥ q(x). Then |f(x)| = p(x) − q(x) = 2p(x) − (p(x) + q(x)) holds.

Now let p(x) < q(x). Then we have |f(x)| = q(x) − p(x) = 2q(x) − (p(x) + q(x)). Hence, it follows that

|f| = 2 max {p, q} − (p + q) ,

which is a d.c. decomposition of |f|.

Similarly (or directly from (ii)) we see that

f⁺ = max {p, q} − q  and  f⁻ = p − max {p, q}

are d.c. decompositions of f⁺ and f⁻, respectively.  •
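As a concrete instance of part (iii): for f(x) = x² − 1 on ℝ with the d.c. decomposition p(x) = x², q(x) = 1, the formulas of the proof give

```latex
% Worked instance of Theorem 1.7 (iii), with p(x) = x^2 and q(x) = 1:
|f|(x) = |x^2 - 1| = 2\max\{x^2,\,1\} - (x^2 + 1),\qquad
f^+(x) = \max\{x^2,\,1\} - 1,\qquad
f^-(x) = x^2 - \max\{x^2,\,1\}.
```

In each case the right-hand side is visibly a difference of two convex functions.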



It can be shown that the d.c. property is also preserved under other operations which will not be used in this book (e.g., Tuy (1995)). An example is the product of two d.c. functions (e.g., Hiriart-Urruty (1985) and Horst et al. (1995)).

A main result concerning the recognition of d.c. functions goes back to Hartman (1959). Before stating it, let us agree to call a function f: ℝ^n → ℝ locally d.c. if, for every x⁰ ∈ ℝ^n, there exists a neighbourhood N = N(x⁰,ε) of x⁰ and convex functions p_N, q_N such that

f(x) = p_N(x) − q_N(x)  for all x ∈ N .                            (27)

Theorem 1.8. Every locally d.c. function is d.c.

The proof requires some extension techniques that are beyond the scope of this book and is omitted (cf. Hartman (1959), Ellaia (1984), Hiriart-Urruty (1985)).

Denote by C² the class of functions ℝ^n → ℝ whose second partial derivatives are continuous everywhere.

Corollary 1.1. Every function f ∈ C² is d.c.

Proof. Corollary 1.1 is merely a consequence of Theorem 1.8. If f ∈ C², then we see that, for all x⁰ ∈ ℝ^n, the Hessian ∇²f of f is bounded on the closed neighbourhood N(x⁰,ε) := {x ∈ ℝ^n: ‖x − x⁰‖ ≤ ε}, ε > 0. As in the proof of Theorem 1.2, part (ii), it then follows that there is a real number η > 0 such that f(x) + η‖x‖² is convex on N(x⁰,ε). Due to this simple fact, one can find a d.c. decomposition

f(x) = (f(x) + η‖x‖²) − η‖x‖²

of f on N(x⁰,ε), i.e., f is locally d.c., and hence d.c. (cf. Theorem 1.8).  •

Furthermore, it turns out that any problem of minimizing a continuous real function over a compact subset D of ℝ^n can, in principle, be approximated as closely as desired by a problem of minimizing a d.c. function over D.

Corollary 1.2. A real-valued continuous function on a compact (convex) subset D of ℝ^n is the limit of a sequence of d.c. functions on D which converges uniformly on D.

Proof. Corollary 1.2 is a consequence of the famous Stone-Weierstraß Theorem or of the original Theorem of Weierstraß, which states that every continuous function on D is the limit of a uniformly convergent sequence of polynomials on D. Corollary 1.2 then follows from Corollary 1.1, since every polynomial is C².  •

Of course, the main concern when using Corollary 1.2 is how to construct an ap-
propriate approximation by d.c. functions for a given continuous function on D.
However, in many special cases of interest, the exact d.c. decomposition is already
given or easily found (cf. Theorem 1.7, Bittner (1970), Hiriart-Urruty (1985) and
the brief discussion of direct applications below).

Example 1.2. Let f(x) be separable, i.e., we have f(x) = Σ_{i=1}^n f_i(x_i), where each f_i is a real-valued function of one real variable on a given (possibly unbounded) interval. In many economic applications where the functions f_i represent, e.g., utility functions or production functions, each f_i is differentiable and has the property that there is a point x̄_i such that f_i(x_i) is concave for x_i < x̄_i and convex for x_i ≥ x̄_i (Fig. 1.1). Likewise, we often encounter the case where f_i(x_i) is convex for x_i < x̄_i and concave for x_i ≥ x̄_i.

Fig. 1.1

In the case corresponding to Fig. 1.1, where f_i is concave-convex, we obviously have the d.c. decomposition f_i = p_i − q_i given by

p_i(x_i) := { (f_i'(x̄_i)(x_i − x̄_i) + f_i(x̄_i)) / 2                  (x_i < x̄_i)
            { f_i(x_i) − (f_i'(x̄_i)(x_i − x̄_i) + f_i(x̄_i)) / 2       (x_i ≥ x̄_i)

q_i(x_i) := { (f_i'(x̄_i)(x_i − x̄_i) + f_i(x̄_i)) / 2 − f_i(x_i)       (x_i < x̄_i)
            { −(f_i'(x̄_i)(x_i − x̄_i) + f_i(x̄_i)) / 2                 (x_i ≥ x̄_i)
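A minimal numerical check of this decomposition, using the hypothetical concave-convex function f_i(t) = t³ with x̄_i = 0 (so that f_i(x̄_i) = f_i'(x̄_i) = 0):

```python
import numpy as np

# Hypothetical instance: f(t) = t^3 is concave for t < 0 and convex for t >= 0.
f = lambda t: t**3
xbar, f_xbar, df_xbar = 0.0, 0.0, 0.0    # inflection point data

def ell(t):                               # tangent line of f at xbar
    return df_xbar * (t - xbar) + f_xbar

def p(t):                                 # first convex piece of f = p - q
    return ell(t) / 2 if t < xbar else f(t) - ell(t) / 2

def q(t):                                 # second convex piece
    return ell(t) / 2 - f(t) if t < xbar else -ell(t) / 2

ts = np.linspace(-2.0, 2.0, 9)
assert all(abs(p(t) - q(t) - f(t)) < 1e-12 for t in ts)   # f = p - q
```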

Let f(x) be a real-valued function defined on a compact interval I = {x: a ≤ x ≤ b}, b > a, of the real line. Then f is said to be a piecewise linear function if I can be partitioned into a finite number of subintervals such that in each subinterval f(x) is affine. That is, there exist values a < x₁ < ... < x_r < b such that

f(x) = α_i x + β_i  for  x_{i−1} ≤ x ≤ x_i ,  i=1,...,r+1 ,

where x₀ := a, x_{r+1} := b and α_i, β_i are known real constants (i=1,...,r+1). A similar definition holds for noncompact intervals I.

A piecewise linear function is continuous if α_i x_i + β_i = α_{i+1} x_i + β_{i+1} for i=1,...,r.

Corollary 1.3. A continuous piecewise-linear function on an interval I is d.c. on I.

Proof. Let f: I → ℝ be continuous and piecewise linear. By setting f(x) = α₁x + β₁ for −∞ < x ≤ x₀ and f(x) = α_{r+1}x + β_{r+1} for x_{r+1} ≤ x < ∞, we see that f can be extended to a continuous piecewise-linear function f̄ on ℝ. From the definition of a continuous piecewise-linear function it follows that, for each point x̄ ∈ ℝ, there is a neighbourhood N(x̄,ε) such that f̄ can be expressed in N(x̄,ε) as the pointwise maximum or minimum of at most two affine functions. Corollary 1.3 is then established via Theorem 1.7 (ii) and Theorem 1.8.  •

Corollary 1.3 also holds for continuous piecewise-linear functions on ℝ^n, n > 1.

3.2. D.C. Programming: Applications

Example 1.3. An indefinite quadratic form x(Qx), Q a real symmetric n×n matrix, is a d.c. function because of Corollary 1.1. It is easy to see that

x(Qx) = ‖Q‖ ‖x‖² − x((‖Q‖ I − Q)x)

is a d.c. representation for any matrix norm ‖Q‖ (cf., e.g., Phong et al. (1995)); both quadratic forms on the right are convex, since ‖Q‖ is at least as large as the spectral radius of Q.

Another well-known d.c. representation, provided by the eigenvector transformation x = Uy, is of the form

x(Qx) = Σ_{i: λ_i > 0} λ_i y_i² + Σ_{i: λ_i < 0} λ_i y_i² ,

where the λ_i denote the eigenvalues of Q; the first sum is convex and the second is concave in y.
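A sketch of the eigenvector-based splitting: the symmetric eigendecomposition separates Q into a positive semidefinite and a negative semidefinite part, writing x(Qx) as the sum of a convex and a concave quadratic form. The matrix Q below is a hypothetical example.

```python
import numpy as np

Q = np.array([[1.0, 3.0], [3.0, -2.0]])   # hypothetical indefinite symmetric Q

lam, U = np.linalg.eigh(Q)                         # Q = U diag(lam) U^T
Qpos = U @ np.diag(np.maximum(lam, 0.0)) @ U.T     # PSD part (convex form)
Qneg = U @ np.diag(np.minimum(lam, 0.0)) @ U.T     # NSD part (concave form)

x = np.array([0.7, -1.2])
# x(Qx) = x(Qpos x) + x(Qneg x): a d.c. representation of the quadratic form.
assert abs(x @ Q @ x - (x @ Qpos @ x + x @ Qneg @ x)) < 1e-12
```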

Example 1.4. Let M be an arbitrary nonempty closed subset of ℝ^n and let d²_M(x) denote the square of the distance from a point x ∈ ℝ^n to M. Then it can be shown that ‖x‖² − d²_M(x) is convex (cf. Asplund (1973)). Consequently,

d²_M(x) = ‖x‖² − (‖x‖² − d²_M(x))

is a d.c. decomposition of d²_M(x). Note that d²_M(x) is convex whenever M is convex. Simple proofs for ellipsoidal norms can be found, e.g., in Horst et al. (1995), Tuy (1995).

In many econometric applications, we encounter a situation where the objective function is of the form p(y) − q(z), where p: ℝ^{n₁} → ℝ and q: ℝ^{n₂} → ℝ are convex. Extending the considerations of Section 1.2.2, we see that such a cost function reflects the fact that in some activities the unit cost increases when the scale of activity is enlarged (diseconomies of scale), whereas in other activities the unit cost decreases when the scale of activity is enlarged. Likewise, in optimal investment planning, one might have to find an investment program with maximal profit, where the profit function is (−p(y)) + q(z) with (−p(y)) concave (according to a law of diminishing marginal returns) and with q(z) convex (according to a law of increasing marginal returns).
Certain general economic models also give rise to d.c. functions in the constraints. Suppose that a vector of activities x has to be selected from a certain convex set C ⊂ ℝ^n of technologically feasible activities. Suppose also that the selection has to be made so as to keep some kind of "utility" above a certain level. This leads to constraints of the form

u_i(x) ≥ c_i  (i=1,...,m) ,

where u_i(x) represents some utility depending on the activity level x, and where c_i is a minimal level of utility required. In many cases, the u_i(x) can be assumed to be separable, i.e., of the form

u_i(x) = Σ_{j=1}^n u_{ij}(x_j)

(cf. Example 1.2) with u_{ij} either convex or concave or of the form discussed in Example 1.2. For a thorough discussion of the underlying economic models, see, e.g., Hillestad (1975), Zaleesky (1980 and 1981).

Indefinite quadratic constraints arise, for example, in certain packing problems (e.g., Horst and Thoai (1995)), and in blending and pooling problems encountered in oil refineries (e.g., Floudas and Aggarwal (1990), Al-Khayyal et al. (1995)).

D.C. programming problems are also frequently encountered in engineering and physics (cf., e.g., Giannessi et al. (1979), Heron and Sermange (1982), Mahjoub (1983), Vidigal and Director (1982), Strodiot et al. (1985 and 1988), Polak and Vincentelli (1979), Toland (1978), Tuy (1986 and 1987)).

Moreover, for certain d.c. programming problems, there is a nice duality theory developed by Toland (1978 and 1979) and Singer (1979, 1979a and 1992) that leads to several applications in engineering, mathematics and physics (cf., e.g., Toland (1978 and 1979), Hiriart-Urruty (1985)).

In engineering design, we often encounter constraints of the form

g(x) := max_{s∈S} G(x,s) ≤ 0 ,                                     (28)

where G: ℝ^n×ℝ^s → ℝ and S ⊂ ℝ^s is compact. Here, x represents the design vector, and (28) is used to express bounds on time and frequency responses of a dynamical system as well as tolerancing or uncertainty conditions in worst case design (cf. Polak (1987)). The function g(x) belongs to the class of so-called lower C²-functions if G is a function which has partial derivatives up to order 2 with respect to x and which, along with all these derivatives, is jointly continuous in (x,s) ∈ ℝ^n×S. But lower C²-functions are d.c. For a detailed discussion of lower C²-functions see Rockafellar (1981) and Hiriart-Urruty (1985).

Example 1.5. Let D ⊂ ℝ^n be a compact set, and let K ⊂ ℝ^n be a convex compact set. Assume that the origin 0 is an interior point of K. Define the function r_D: D → ℝ by

r_D(x) := max {r: x + rK ⊂ D} .                                    (29)

Then the problem

maximize r_D(x)
s.t.  x ∈ D                                                        (30)

is called a design centering problem.

An application of the design centering problem is the optimum shape design problem from the diamond industry, where one wants to know how to cut the largest diamond of shape K that can be cut from a rough stone D ⊃ K (Nguyen et al. (1985)). Other applications are discussed by Tuy (1986) and Polak and Vincentelli (1979).

In a more general context, consider any fabrication process where random variations may result in a very low production yield. A method to minimize the influence of these random variations consists of centering the nominal value of the designable parameters in the so-called region of acceptability. This leads to a problem of the form (29), (30) (cf. Vidigal and Director (1982)).

In many cases, D can be described by finitely many d.c. inequalities. Moreover, it has been shown in Thach (1988) that in these cases (30) is actually a d.c. programming problem. A somewhat different approach for polyhedral sets D is described in Thoai (1987). Generalizations are discussed in Thach et al. (1988). We shall come back to this problem in Chapter X.

Example 1.6. The jointly constrained biconvex programming problem is of the form

minimize f(x) + xy + h(y)
s.t.  (x,y) ∈ D ⊂ ℝ^n×ℝ^n ,

where D is a closed convex subset of ℝ^n×ℝ^n, and f and h are real-valued convex functions on D. The objective function is a d.c. function, since xy = ¼(‖x+y‖² − ‖x−y‖²).

The intuition of many people working in global optimization is that most of the "reasonable" optimization problems actually involve d.c. functions. It is easy to see, for example, that a problem of the form

minimize f(x)                                                      (31)
s.t.  x ∈ D ,

where D is a closed subset of ℝ^n and f: ℝ^n → ℝ is continuous, can be converted into a d.c. program (e.g., Tuy (1985)). Introducing the additional real variable t, we can obviously write problem (31) in the equivalent form

minimize t
s.t.  x ∈ D , f(x) ≤ t .                                           (32)

The feasible set M := {(x,t) ∈ ℝ^{n+1}: x ∈ D, f(x) ≤ t} in (32) is closed, and the condition (x,t) ∈ M is equivalent to the constraint d²_M(x,t) ≤ 0, where d_M(x,t) denotes the distance from (x,t) to M. This is a d.c. inequality (cf. Example 1.4).

3.3. Reverse Convex Constraints

Recall from Section 1.1 that a constraint h(x) ≥ 0 is called reverse convex whenever h: ℝ^n → ℝ is convex. Obviously, every optimization problem with concave, convex or d.c. objective function and a combination of convex, reverse convex and d.c. constraints is a d.c. programming problem. However, it is worthwhile to consider reverse convex constraints separately. One reason for doing so is that we often encounter convex or even linear problems having only one or very few additional reverse convex constraints. Usually, these problems can be solved more efficiently by taking into account their specific structure instead of using the more general approaches designed for solving d.c. problems (e.g., Tuy (1987), Horst (1988), Thoai (1988)). Another reason is that every d.c. programming problem can be converted into an equivalent, relatively simple problem having only one reverse convex constraint. This so-called canonical d.c. programming problem will be derived below.

Example 1.7. Given two mappings h: ℝ^n → ℝ^m and g: ℝ^n → ℝ^m, the condition h(x)g(x) = 0 is often called a complementarity condition (cf. Section 1.2.5). Several applications yield optimization problems where an additional simple linear complementarity condition of the form xy = 0 (x, y ∈ ℝ^n) has to be fulfilled.

An example is given by Schoch (1984), where it is shown that the indefinite quadratic programming problem is equivalent to a parametric linear program subject to a complementarity condition.
Another example has been discussed in Giannessi et al. (1979). In offshore technology, a submarine pipeline is usually laid so that it rests freely on the sea bottom. Since, however, the sea bed profile is usually irregularly hilly, it is often regularized by means of trench excavation in order to bury the pipe for protection and to avoid excessive bending moments on the pipe. The optimization problem which arises is to minimize the total cost of the excavation, under the condition that the free contact equilibrium configuration of the pipe nowhere implies excessive bending. It has been shown by Giannessi et al. (1979) that the resulting optimization problem is a linear program with one additional complementarity condition of the form

x ≥ 0 ,  y ≥ 0 ,  xy = 0  (x, y ∈ ℝ^n) .

Obviously, this condition is equivalent to

x ≥ 0 ,  y ≥ 0 ,  Σ_{i=1}^n min {x_i, y_i} ≤ 0 ,

the last inequality being a reverse convex constraint, since the function Σ_{i=1}^n min {x_i, y_i} is concave.

Example 1.8. A (0-1) restriction can be cast into the reverse convex form. For instance, the constraint x_i = 0 or x_i = 1 can be rewritten as

−x_i + (x_i)² ≥ 0 ,  0 ≤ x_i ≤ 1 .

Example 1.9. Let G be an open convex set defined by the convex inequality g(x) < 0, g: ℝ^n → ℝ convex. Let K be a compact, convex set contained in G. Then the problem of minimizing the distance d(x) from K to ℝ^n \ G is a convex minimization problem with the additional reverse convex constraint g(x) ≥ 0 (Fig. 1.2).

Fig. 1.2

Certain classes of optimization problems involving reverse convex constraints

were studied by Rosen (1966), Avriel and Williams (1970), Meyer (1970), Ueing

(1972), Bansal and Jacobsen (1975), Hillestad and Jacobsen (1980 and 1980a), Tuy

(1983 and 1987), Thuong and Tuy (1984), Horst and Dien (1987), Horst (1988),

Horst et al. (1990), Horst and Thoai (1994). Avriel and Williams (1970) showed that

reverse convex constraints may occur in certain engineering design problems.


Hillestad (1975) and Zaleesky (1980 and 1981) discussed economic models yielding

reverse convex constraints (cf. Section 1.3.2). In an abstract setting, Singer (1980)

related reverse convex constraints to certain problems in approximation theory,

where the set of approximating functions is the complement of a convex set.

3.4. Canonical D.C. Programming Problems

A striking feature of d.c. programming problems is that any d.c. problem can

always be reduced to a canonical form which has a linear objective function and only

two constraints, one of them being a convex inequality, the other being reverse

convex (cf. Tuy (1986)).

Definition 1.5. A canonical d.c. program is an optimization problem of the form

minimize cx                                                        (33)
s.t.  h(x) ≤ 0 , g(x) ≥ 0 ,

where c ∈ ℝ^n and h, g: ℝ^n → ℝ are convex.

Theorem 1.9. Every d.c. programming problem of the form

minimize f(x)                                                      (34)
s.t.  x ∈ C , g_j(x) ≤ 0  (j=1,...,m) ,

where C is defined by a finite system of convex inequalities h_k(x) ≤ 0, k ∈ I ⊂ ℕ, and where f, g_j are d.c. functions, can be converted into an equivalent canonical d.c. program.

Proof. By introducing an additional variable t, we see that problem (34) is equivalent to the following one:

minimize t
s.t.  h_k(x) ≤ 0 (k ∈ I) , g_j(x) ≤ 0 (j=1,...,m) , f(x) − t ≤ 0 .

Therefore, by changing the notation, we have obtained a linear objective function. Furthermore, Theorem 1.7 shows that the finite set of d.c. inequalities f(x) − t ≤ 0, g_j(x) ≤ 0 (j=1,...,m) can be replaced by a single d.c. inequality

r(x,t) := max {f(x) − t, g_j(x): j=1,...,m} ≤ 0 .

Moreover, the d.c. inequality

r(x,t) = p(x,t) − q(x,t) ≤ 0

with p and q convex is equivalent to the system

p(x,t) − z ≤ 0 ,  z − q(x,t) ≤ 0                                   (35)

involving the additional real variable z. The first inequality in (35) is convex and the second is reverse convex.

Finally, setting h(x,t,z) := max {p(x,t) − z, h_k(x): k ∈ I} and g(x,t,z) := q(x,t) − z, we see that problem (34) is transformed into an equivalent canonical d.c. program.  •
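As a small illustration of this construction (a hypothetical one-dimensional instance, not taken from the text): the concave problem min {−x²: 0 ≤ x ≤ 1} has r(x,t) = −x² − t = p(x,t) − q(x,t) with p(x,t) = −t and q(x,t) = x², and the proof yields the canonical form

```latex
% Canonical form (33) of min {-x^2 : 0 <= x <= 1}, with auxiliary t and z:
\min\; t \quad\text{s.t.}\quad
h(x,t,z) := \max\{-t - z,\; -x,\; x - 1\} \le 0,\qquad
g(x,t,z) := x^2 - z \ge 0 .
```

At an optimal point z = x² and t = −x², so the canonical program recovers the original optimal value −1 at x = 1.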

Along the same lines, various other transformations can be carried out that transform, e.g., a d.c. problem into a program having convex or concave objective functions, etc. Obviously, since a canonical d.c. problem has a simpler structure than the general d.c. problem, transformations of the above type will be useful for the development of certain algorithms for solving d.c. problems. Note, however, that these transformations increase the number of variables of the original problem. Since the numerical effort required to solve these kinds of difficult problems generally grows exponentially with the number of variables, we will also attempt to solve d.c. problems without prior transformations (cf. Chapter X).

We now consider the canonical d.c. problem (33), and we let ∂H and ∂G denote the boundaries of the sets H := {x: h(x) ≤ 0} and G := {x: g(x) ≥ 0}, respectively. For the sake of simplicity we shall assume that H is bounded and that H ∩ G is not empty, so that an optimal solution of problem (33) exists.

Definition 1.6. The reverse convex constraint g(x) ≥ 0 is called essential in the canonical d.c. program (33) if the inequality

min {cx: x ∈ H} < min {cx: x ∈ H ∩ G}

holds.

It can be assumed that g(x) ≥ 0 is always essential, because otherwise problem (33) would be equivalent to the convex minimization problem

minimize cx                                                        (36)
s.t.  x ∈ H ,

which can be solved by standard optimization techniques.

Theorem 1.10. Consider the canonical d.c. program. Suppose that H is bounded, H ∩ G is nonempty and the reverse convex constraint is essential. Then the canonical d.c. program (33) always has a solution lying on ∂H ∩ ∂G.

Proof. Since the reverse convex constraint is essential, there must be a point w ∈ H satisfying

g(w) < 0 ,  cw < min {cx: x ∈ H ∩ G} .                             (37)

As ℝ^n \ G is convex, we know that for every x ∈ G there is a point

w(x) = λx + (1 − λ)w ,  λ ∈ (0,1] ,

where the line segment [w,x] intersects the boundary ∂G. The number λ is uniquely determined by the equation

g(λx + (1 − λ)w) = 0 .

Now let x ∈ H ∩ G satisfy g(x) > 0. Then we have

w(x) = λx + (1 − λ)w ,  0 < λ < 1 ,

and it follows from (37) that

cw(x) = λcx + (1 − λ)cw < λcx + (1 − λ)cx = cx .

Therefore, problem (33) is equivalent to

minimize cx                                                        (38)
s.t.  x ∈ H ∩ ∂G .

Now consider an optimal solution x̄ of (38). Denote by int M the interior of a set M. Then the closed convex set (ℝ^n \ G) ∪ ∂G has a supporting hyperplane at x̄. The intersection of this hyperplane with the compact convex set H is a compact convex set K. It is well-known from linear programming theory that cx attains its minimum over K at an extreme point u of K (cf. also Theorem 1.1).

Clearly, we have u ∈ ∂H, and since K ⊂ H \ int (ℝ^n \ G), it follows that u is a feasible point of problem (33). Finally, we see by construction that x̄ ∈ K, hence cu ≤ cx̄ = min {cx: x ∈ H ∩ G}. This implies that u solves (33). From (38) it then follows that u ∈ ∂G.  •

In certain applications the function h(x) is the maximum of a finite number of affine functions, i.e., the canonical d.c. program reduces to a linear program with one additional reverse convex constraint. It will be shown in Part B of this volume that in this case the global minimum is usually attained on an edge of H.

Problems with reverse convex constraints will be discussed in Chapter IX and Chapter X.

4. LIPSCHITZIAN OPTIMIZATION AND SYSTEMS OF EQUATIONS AND


INEQUALITIES

4.1. Lipschitzian Optimization

Recall from Definition 1.3 that a real-valued function f is called Lipschitzian on a set M ⊂ ℝ^n if there is a constant L = L(f,M) > 0 such that

|f(x) − f(y)| ≤ L‖x − y‖  ∀x, y ∈ M .                              (39)

It is well-known that all continuously differentiable functions f with bounded gradients on M are Lipschitzian on M, where

L = sup {‖∇f(x)‖: x ∈ M}                                           (40)

is a Lipschitz constant (‖·‖ again denotes the Euclidean norm).

Obviously, if f is Lipschitzian with constant L, then f is also Lipschitzian with all constants L' > L.

The value of knowing a Lipschitz constant L arises from the following simple observation. Suppose that the diameter d(M) := sup {‖x − y‖: x, y ∈ M} < ∞ of M is known. Then we easily see from (39) that

f(x) ≥ f(y) − L‖x − y‖ ≥ f(y) − Ld(M)                              (41)

holds. Let S ⊂ M denote a finite sample of points in M where the function values have been calculated. Then it follows from (41) that we have

min f(S) ≥ inf f(M) ≥ max f(S) − Ld(M) ,                           (42)

i.e., knowledge of L and d(M) leads to computable bounds for inf f(M).
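The following sketch evaluates the bounds (42) for a hypothetical Lipschitz function on the unit square (L from (40), d(M) = √2):

```python
import numpy as np

rng = np.random.default_rng(0)

f = lambda x: np.sin(3.0 * x[0]) + abs(x[1] - 0.4)   # hypothetical f on [0,1]^2
L = np.sqrt(3.0**2 + 1.0**2)      # gradient norm bound sqrt(10), cf. (40)
d_M = np.sqrt(2.0)                # diameter of M = [0,1]^2

S = rng.random((100, 2))          # finite sample S of M
vals = np.array([f(x) for x in S])

# (42): min f(S) >= inf f(M) >= max f(S) - L d(M)
print(f"{vals.max() - L * d_M:.3f} <= inf f(M) <= {vals.min():.3f}")
```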

Note that the following well-known approach for solving the problem

minimize f(x)
s.t.  x ∈ D ,                                                      (43)

with D ⊂ ℝ^n nonempty and compact and f Lipschitzian on D, is based directly on (41):

Start with an arbitrary point x⁰ ∈ D, define the first approximating function by

F₀(x) := f(x⁰) − L‖x − x⁰‖ ,

and the next iteration point by

x¹ ∈ argmin F₀(D) .

In Step k, the approximating function is

F_k(x) := max_{i=0,...,k} {f(x^i) − L‖x − x^i‖} ,                  (44)

and the next iteration point is

x^{k+1} ∈ argmin F_k(D) .                                          (45)
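A one-dimensional sketch of the scheme (44), (45). On an interval the minimization of F_k can be done exactly via the intersection points of the supporting cones; the crude variant below simply minimizes F_k over a fine grid, which is enough to illustrate the iteration (the objective and its constant are hypothetical):

```python
import numpy as np

f = lambda t: np.sin(3.0 * t) + 0.5 * t   # hypothetical objective on D = [0, 4]
L = 3.5                                   # valid Lipschitz constant: |f'| <= 3.5
grid = np.linspace(0.0, 4.0, 4001)        # grid stand-in for D

points = [grid[0]]                        # x^0, an arbitrary starting point in D
for k in range(30):
    # F_k(x) = max_i {f(x^i) - L |x - x^i|}, cf. (44)
    Fk = np.max([f(x) - L * np.abs(grid - x) for x in points], axis=0)
    points.append(grid[np.argmin(Fk)])    # x^{k+1} in argmin F_k(D), cf. (45)

best = min(points, key=f)
print(best, f(best))
```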

It is easy to show that any accumulation point of the sequence {x^k} solves problem (43). This algorithm was proposed by Piyavskii (1967 and 1972) and Shubert (1972) for one-dimensional intervals D. An extension to the n-dimensional case was proposed by Mladineo (1986). Alternative aspects were considered in Mayne and Polak (1984), Horst and Tuy (1987), and Pinter (1983 and 1986). See also Bulatov (1977), Evtushenko (1985), and, in particular, the survey of Hansen and Jaumard (1995). The crucial part of this method is the minimization in (45). Unfortunately, since F_k is the pointwise maximum of a finite family of concave functions, this minimization problem is, except in the one-dimensional case, a very difficult one. Actually, F_k(x) is a d.c. function (cf. Theorem 1.7), and applying the above approach results in solving a sequence of increasingly complicated d.c. programming subproblems. In the chapters that follow, some different and, hopefully, more practical approaches will be presented.

Lipschitzian optimization problems of the form (43), where D is either a convex

set or the intersection of a convex set with finitely many complements of convex sets
or a set defined by a finite number of Lipschitzian inequalities, are encountered in
many economic and engineering applications. Examples are discussed, e.g., in Dixon
and Szego (1975 and 1978), Strongin (1978), Zielinski and Neumann (1983), Fedorov
(1985), Zilinskas (1982 and 1986), Pinter et al. (1986), Tuy (1986), Hansen and
Jaumard (1995). Algorithms such as those proposed in Horst (1987 and 1988), Pinter

(1988), Thach and Tuy (1987), Horst et al. (1995) will be discussed in Chapter XI.

Apparently problem (43) is a very general problem. Most problems discussed in the previous sections are actually also Lipschitzian. This follows from the fact that a convex function is Lipschitzian on a compact subset of the relative interior of its effective domain (cf. Rockafellar (1970), Theorem 10.4), and, moreover, the Lipschitzian property is preserved under operations such as forming linear combinations and maxima or minima of finitely many Lipschitzian functions (cf. also Section 1.4.2).

However, we point out that all methods for solving Lipschitzian optimization problems to be presented in this book require knowledge of a Lipschitz constant for some or all of the functions involved.

Though such a constant can often be estimated, this requirement sets limits on the application of Lipschitzian optimization techniques, since - in general - finding a good estimate for L (using, e.g., (40)) can be almost as difficult as solving the original problem. In most applications where Lipschitzian techniques have been proposed, the sets M are successively refined in such a way that one can use adaptive approximation of L (cf. Strongin (1973) and Pinter (1986)). Another means of calculating suitable approximations of L is by interval analysis (cf., e.g., Ratscheck and Rokne (1984 and 1988), and Ratscheck (1985)).

Example 1.10. Many practical problems may involve indefinite, separable quadratic objective functions and/or constraints (cf., e.g., Pardalos et al. (1987), Pardalos and Rosen (1986 and 1987)). To solve some of these problems, Al-Khayyal et al. (1989) proposed several variants of a branch and bound scheme that require a Lipschitz constant L for a separable quadratic function

f(x) = Σ_{k=1}^n (p_k x_k²/2 + q_k x_k)

on a rectangle M = {x: a_k ≤ x_k ≤ b_k, k=1,...,n}. In this case, the relation (40) yields

L = max {‖∇f(x)‖: x ∈ M} = max {(Σ_{k=1}^n (p_k x_k + q_k)²)^{1/2}: x ∈ M} .

Using monotonicity and separability, we see that this maximization and

Σ_{k=1}^n max {|p_k y_k + q_k|: a_k ≤ y_k ≤ b_k}

have the same optimal solutions. Hence we have

L = (Σ_{k∈I₁} (p_k b_k + q_k)² + Σ_{k∈I₂} (p_k a_k + q_k)²)^{1/2} ,

where

I₁ = {k: −q_k/p_k ≤ (a_k + b_k)/2} ,  I₂ = {k: −q_k/p_k > (a_k + b_k)/2} .
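Since |p_k y_k + q_k| is affine in y_k on [a_k, b_k], its maximum sits at one of the two endpoints; the sketch below computes L by comparing the endpoint values directly, which is equivalent to the I₁/I₂ case distinction above (all data hypothetical):

```python
import numpy as np

p = np.array([2.0, -1.5, 0.5])     # hypothetical separable quadratic data
q = np.array([1.0, 0.0, -2.0])
a = np.array([-1.0, -1.0, -1.0])   # box M: a_k <= x_k <= b_k
b = np.array([2.0, 1.0, 3.0])

# max over [a_k, b_k] of |p_k y + q_k| is attained at an endpoint:
g = np.maximum(np.abs(p * a + q), np.abs(p * b + q))
L = np.linalg.norm(g)              # L = max_M ||grad f||, cf. (40)
print(L)
```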

4.2. Systems of Equations and Inequalities

Solving systems of nonlinear equations and/or inequalities is a basic activity in numerical analysis. It is beyond the scope of our book to present here an overview of applications and methodology in that field (see, e.g., Forster (1980 and 1995), Allgower and Georg (1980 and 1983), Dennis and Schnabel (1983)). It is the purpose of this section to show that unconventional global optimization methods designed for solving Lipschitzian optimization problems or d.c. programs can readily be applied to treat systems of equations and/or inequalities. Consider the system

f_i(x) = 0  (i ∈ I₁) ,                                             (46)

f_i(x) ≤ 0  (i ∈ I₂) ,                                             (47)

subject to x ∈ D ⊂ ℝ^n, where I₁, I₂ are finite index sets satisfying I₁ ∩ I₂ = ∅. Suppose that D is nonempty and compact and that all functions f_i are continuous on D. The system (46), (47) can be transformed into an equivalent global optimization problem.

Suppose first that we have I₂ = ∅ in (46), (47), i.e., we consider the system of equations (46).

Let F: ℝ^n → ℝ^{|I₁|} be the mapping that associates to x ∈ ℝ^n the vector with components f_i(x) (i ∈ I₁), and let ‖·‖_N denote any norm on the image space of F.

Then we have

Lemma 1.1. x* ∈ D is a solution of the system of equations (46) if and only if

0 = ‖F(x*)‖_N = min {‖F(x)‖_N: x ∈ D}                              (48)

holds.

Proof. Lemma 1.1 is an immediate consequence of the norm properties ‖z‖_N ≥ 0 and ‖z‖_N = 0 if and only if z = 0.  •

Let f(x) = ‖F(x)‖_N. Then, by virtue of Lemma 1.1, the optimization problem

minimize f(x)                                                      (49)
s.t.  x ∈ D

contains all of the information on (46) that is usually of interest. We see that min f(D) > 0 holds if and only if (46) has no solution, and in the case min f(D) = 0 the set of solutions of (49) coincides with the set of solutions of (46).

Suppose now that we have I₁ = ∅ in (46), (47), i.e., we consider the system of inequalities (47). The following lemma is obvious.

Lemma 1.2. x* ∈ D solves the system of inequalities (47) if and only if

max {f_i(x*): i ∈ I₂} ≤ 0                                          (50)

holds.

Lemma 1.2 suggests that we consider the optimization problem

minimize f̄(x)                                                      (51)
s.t.  x ∈ D ,

where f̄(x) = max {f_i(x): i ∈ I₂}. Whenever a procedure for solving (51) detects a point x* ∈ D satisfying f̄(x*) ≤ 0, then a solution of (47) has been found. The system (47) has no solution if and only if min f̄(D) > 0.

The case I₁ ≠ ∅ and I₂ ≠ ∅ can be treated in a similar way. As above, we see that x* ∈ D is a solution of (46) and (47) if and only if for

f̂(x) := max {‖F(x)‖_N , f_i(x) (i ∈ I₂)}                           (52)

we have

0 = f̂(x*) = min f̂(D) .                                            (53)

Now let all the functions in (46) and (47) be d.c., and let

‖F(x)‖_N = ‖F(x)‖₁ = Σ_{i∈I₁} |f_i(x)|  or  ‖F(x)‖_N = ‖F(x)‖_∞ = max_{i∈I₁} |f_i(x)| .

Then, by virtue of Theorem 1.7 (Section 1.3.1), each of the objective functions f, f̄, f̂ in (49), (51) and (53) is d.c. In other words, whenever the functions f_i involved in a system of equations and/or inequalities of the form (46), (47) are d.c., then this system can be solved by d.c. programming techniques.

Now suppose that all of the functions f_i in (46), (47) are Lipschitzian on M ⊃ D with known Lipschitz constants L_i. Consider for z = (z₁,...,z_m)^T ∈ ℝ^m the p-norms

‖z‖_p = (Σ_{i=1}^m |z_i|^p)^{1/p} ,  1 ≤ p < ∞ ,  and  ‖z‖_∞ = max_{i=1,...,m} |z_i| .

Lemma 1.3. Let f_i be Lipschitzian on M ⊂ ℝ^n with Lipschitz constants L_i (i=1,...,m), and let F = (f₁,...,f_m)^T. Then ‖F(x)‖_p (1 ≤ p ≤ ∞) and max_{i=1,...,m} f_i(x) define Lipschitzian functions on M with Lipschitz constants Σ_{i=1}^m L_i and max_{i=1,...,m} L_i , respectively.

Proof. We use two well-known properties of the norms involved. First, it follows from the triangle inequality that for any norm ‖·‖_N on ℝ^m and any z¹, z² ∈ ℝ^m we have

| ‖z¹‖_N − ‖z²‖_N | ≤ ‖z¹ − z²‖_N .                                (54)

Furthermore, for 1 ≤ p ≤ ∞ and z ∈ ℝ^m, the inequality

‖z‖_p ≤ ‖z‖₁                                                       (55)

is satisfied (Jensen's inequality).

Using (54), (55) and the Lipschitz constants L_i of f_i (i=1,...,m), we obtain the following relations:

| ‖F(x)‖_p − ‖F(y)‖_p | ≤ ‖F(x) − F(y)‖_p ≤ ‖F(x) − F(y)‖₁
  = Σ_{i=1}^m |f_i(x) − f_i(y)| ≤ Σ_{i=1}^m L_i ‖x − y‖ = ‖x − y‖ (Σ_{i=1}^m L_i) ,

where ‖x − y‖ denotes the Euclidean norm of (x − y) in ℝ^n.

Therefore, for all 1 ≤ p ≤ ∞, ‖F(x)‖_p defines a Lipschitzian function on M with L = Σ_{i=1}^m L_i being a Lipschitz constant which is independent of p.

In a similar way, since

| max_i f_i(x) − max_i f_i(y) | ≤ max_i |f_i(x) − f_i(y)| ≤ max_i {L_i ‖x − y‖} = (max_i L_i) ‖x − y‖ ,

we see that max_{i=1,...,m} f_i(x) defines a Lipschitzian function with Lipschitz constant max_{i=1,...,m} L_i .  •

Lemma 1.3 provides Lipschitz constants of the objective functions in (49), (51) and (53); and in a manner similar to the d.c. case we have:

Whenever the functions f_i involved in a system of equations and/or inequalities of the form (46) and (47) are Lipschitzian with known Lipschitz constants L_i, then the system can be solved by Lipschitzian optimization techniques.

The above transformations have been applied by Horst and Thoai (1988) and Horst et al. (1995) in order to solve systems of equations and inequalities. Another method for solving systems of nonlinear equations via d.c. programming was proposed by Thach (1987).
CHAPTER II

OUTER APPROXIMATION

Outer approximation of the feasible set by a sequence of simpler relaxed sets is a basic method in many fields of optimization. In this technique, the current approximating set is improved by a suitable additional constraint (a cut).

In this chapter, outer approximation is developed with regard to the specific needs of global optimization. First, a simple general convergence principle is presented which permits nonlinear cuts and unbounded feasible sets. Specialization to outer approximation by polyhedral convex sets enables us to derive a large class of "cutting-plane" methods. Then, some "constraint dropping" strategies and several computational issues are addressed. These include the calculation of new vertices and extreme directions generated from a polyhedral convex set by a linear cut and the identification of redundant constraints.

1. BASIC OUTER APPROXIMATION METHOD

In this section we present a class of methods which are among the basic tools in many fields of optimization and which have been used in many forms and variants. The feasible set is relaxed to a simpler set D₁ containing D, and the original objective function f is minimized over the relaxed set. If the solution of this relaxed problem is in D, then we are done; otherwise an appropriate portion of D₁ \ D is cut off by an additional constraint, yielding a new relaxed set D₂ that is a better approximation of D than D₁. Then, D₁ is replaced by D₂, and the procedure is repeated.
These methods are frequently called outer approximation or relaxation methods. Since the pioneering papers of Gomory (1958 and 1960), Cheney and Goldstein (1959) and Kelley (1960), outer approximation in this sense has developed into a basic tool in combinatorial optimization and (nonsmooth) convex programming. In global optimization, where certain theoretical and computational questions arise that cannot be inferred from previous applications in other fields, outer approximation has been applied in various forms for solving most of the problem classes that we introduced in the preceding chapter. Examples include concave minimization (e.g., Hoffman (1981), Thieu, Tam and Ban (1983), Tuy (1983), Thoai (1984)), problems having reverse convex constraints (e.g., Tuy (1987)), d.c. programming (e.g., Tuy (1986), Thoai (1988)) and Lipschitzian optimization (e.g., Thach and Tuy (1987)).

We modify a general treatment given by Horst, Thoai and Tuy (1987 and 1989). Consider the global optimization problem

(P)  minimize f(x)                                                 (1)
     s.t.  x ∈ D ,

where f: ℝ^n → ℝ is continuous and D ⊂ ℝ^n is closed.

We shall suppose throughout this chapter that min f(D) exists.

A widely used outer approximation method for solving (P) is obtained by replacing it by a sequence of simpler "relaxed" problems

(Q_k)  minimize f(x)                                               (2)
       s.t.  x ∈ D_k ,

where ℝ^n ⊃ D₁ ⊃ D₂ ⊃ ... ⊃ D and

min f(D_k) → min f(D)  (k → ∞) .

Usually, the sets D_k belong to a family ℰ with the following properties:

a) The sets D_k ⊂ ℝ^n are closed, and any problem (Q_k) with D_k ∈ ℰ has a solution and can be solved by available algorithms;

b) for any D_k ∈ ℰ containing D and any point x^k ∈ D_k \ D one can define a constraint function ℓ_k: ℝ^n → ℝ satisfying

ℓ_k(x) ≤ 0  ∀x ∈ D ,                                               (3)

ℓ_k(x^k) > 0 .                                                     (4)

(5)

Under these conditions the following solution method suggests itself:

Outer Approximation Method:

Choose D₁ ∈ ℰ such that D₁ ⊃ D. Set k ← 1.

Iteration k (k = 1,2,...):

Solve the relaxed problem (Q_k), obtaining a solution x^k ∈ argmin f(D_k).

If x^k ∈ D, then stop: x^k solves (P).

Otherwise construct a constraint function ℓ_k: ℝ^n → ℝ satisfying (3), (4), (5) and set

D_{k+1} := D_k ∩ {x ∈ ℝ^n: ℓ_k(x) ≤ 0} .                           (6)

Go to the next iteration.
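The loop structure can be rendered schematically as follows; the three problem-specific operations (solving (Q_k), the membership test x ∈ D, and the cut construction) are passed in as callables, so this is only a hypothetical skeleton, not a particular cut rule:

```python
def outer_approximation(solve_relaxed, in_D, make_cut, D1, max_iter=100):
    """Generic outer approximation: 'cuts' collects the constraints l_i(x) <= 0."""
    cuts = []                              # D_k = D_1 intersected with all cuts, cf. (6)
    xk = None
    for k in range(max_iter):
        xk = solve_relaxed(D1, cuts)       # x^k in argmin f(D_k), problem (Q_k)
        if in_D(xk):
            return xk                      # x^k solves (P)
        cuts.append(make_cut(xk))          # l_k with l_k <= 0 on D, cf. (3),
                                           # and l_k(x^k) > 0, cf. (4)
    return xk                              # best relaxed solution after max_iter cuts
```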


Fig. II.1. Portion of D_k \ D containing x^k is cut off.

Conditions (3) and (4) imply that the set {x ∈ ℝ^n: ℓ_k(x) = 0} strictly separates x^k ∈ D_k \ D from D. The additional constraint ℓ_k(x) ≤ 0 cuts off a subset of D_k. However, since we have D ⊂ D_k for all k (no part of D is cut off), each D_k constitutes an outer approximation of D.

Note that, since D_k ⊃ D_{k+1} ⊃ D, we have

min f(D_k) ≤ min f(D_{k+1}) ≤ min f(D) ,                           (7)

and x^k ∈ D implies x^k ∈ argmin f(D).

In order to ensure that D_{k+1} is closed whenever D_k is closed, we require that the constraint functions ℓ_k are lower semicontinuous.

Next, we provide a result on the convergence of outer approximation methods.

Theorem II.1. In the context of the outer approximation method above, assume that

(i) ℓ_k is lower semicontinuous for each k = 1,2,... ;

(ii) each convergent subsequence {x^q} ⊂ {x^k} satisfying x^q → x̄ (q → ∞) contains a subsequence {x^r} ⊂ {x^q} such that

lim_{r→∞} ℓ_r(x^r) = lim_{r→∞} ℓ_r(x̄) ;

and

(iii) lim_{r→∞} ℓ_r(x̄) = 0 implies x̄ ∈ D.

Then every accumulation point of the sequence {x^k} belongs to D, and hence solves (P).

Proof. Let x̄ be an accumulation point of {x^k}, and let {x^q} ⊂ {x^k} be a corresponding subsequence satisfying x^q → x̄ (q → ∞). Furthermore, let {x^r} be the subsequence of {x^q} in assumptions (ii) and (iii). From (6), we have ℓ_r(x^q) ≤ 0 for all q > r, and hence, because of the lower semicontinuity of ℓ_r, ℓ_r(x̄) ≤ 0.

From assumption (ii) we then see that

lim_{r→∞} ℓ_r(x^r) = lim_{r→∞} ℓ_r(x̄) ≤ 0 .                       (8)

On the other hand, from (4), we have ℓ_r(x^r) > 0 ∀r, which implies

lim_{r→∞} ℓ_r(x^r) = lim_{r→∞} ℓ_r(x̄) ≥ 0 .                       (9)

From (8) and (9) it follows that lim_{r→∞} ℓ_r(x̄) = 0, and hence, from assumption (iii), we have x̄ ∈ D.

Finally, since f(x^k) ≤ f(x) ∀x ∈ D_k ⊃ D, it follows, by continuity of f, that f(x̄) ≤ f(x) ∀x ∈ D, i.e., x̄ solves (P).  •

In many applications which we shall encounter in the sequel, the functions ℓ_k are even continuous, and, moreover, there often exists a function ℓ: ℝ^n → ℝ such that

lim_{r→∞} ℓ_r(x^r) = lim_{r→∞} ℓ_r(x̄) = ℓ(x̄) .

Assumption (ii) is then fulfilled, for example, when ℓ_r(x) converges uniformly (or continuously) to ℓ(x) (cf. Kall (1986)).

Several alternative versions of Theorem II.1 can be derived along the very same lines of reasoning. For example, instead of (ii), (iii) one could require

(ii') lim_{r→∞} ℓ_r(x^r) = lim_{r→∞} ℓ_r(x^{r+1}) = ℓ(x̄)

and

(iii') ℓ(x̄) = 0 implies x̄ ∈ D.

2. OUTER APPROXIMATION BY CONVEX POLYHEDRAL SETS

In most realizations of outer approximation methods the sets D_k are convex polyhedral sets, and the constraints ℓ_k(x) are affine functions such that {x: ℓ_k(x) = 0} define hyperplanes strictly separating x^k and D. The set D is usually assumed to be a closed or even compact convex set, and the procedures are often called cutting plane methods.

Let

D = {x ∈ ℝ^n: g(x) ≤ 0} ,                                          (10)

where g: ℝ^n → ℝ is convex. Clearly, D is a closed convex set. Note that in the case of several convex constraints g_i(x) ≤ 0 (i ∈ I), we set

g(x) = max_{i∈I} g_i(x)                                            (11)

(g is, of course, also convex).


Suppose that we are given a convex polyhedral set D₁ ⊃ D (for the construction of D₁, see Section II.4). Denote by xy the inner product of x, y ∈ ℝ^n. Let at each iteration k the constraint function ℓ_k be defined by

ℓ_k(x) = p^k(x − y^k) + β_k ,                                      (12)

where p^k ∈ ℝ^n, y^k ∈ D₁, β_k ∈ ℝ are suitably chosen.

In (12) we now set

p^k ∈ ∂g(y^k) ,  β_k = g(y^k) ,                                    (13)

where ∂g(y^k) denotes the subdifferential of g at y^k.


If in addition we choose yk = xk, then (12) and (13) yield the so-called
KCG-&lgorithm, essentially proposed by Cheney and Goldstein (1959) and by
Kelley (1960) for convex programming problems.
Moreover, if a point w satisfying g(w) < 0 is given and if we choose yk as the
unique point where the line segment [w,xkj meets the boundary an of n, Le.,
{l} = [w,xkj n an, then (12), (13) yield the so-called supporting hyperplane
approach of Veinott (1967) (cf. Fig. 11.2).
Note that, by the convexity of g, we have

(14)

where Ak is the unique solution of the univariate convex programming problem

min {>. E [0,1]: AW + (l-A)xk E D} , (15)

or equivalently, of

g(AW + (1 - >.)xk) = 0, >. ~ 0. (15')


When applying these methods to solve convex programming problems

min {f(x): g(x) ≤ 0} ,                                             (16)

where f, g: ℝ^n → ℝ are convex, one usually transforms (16) into the equivalent problem

min {t: g(x,t) ≤ 0} ,                                              (16')

where t ∈ ℝ and g(x,t) = max {g(x), f(x) − t}. The relaxed problems (Q_k) are then linear programming problems.

Instead of subgradients p^k ∈ ∂g(y^k) one can as well use ε-subgradients (cf., e.g., Parikh (1976)).

Fig. II.2. The supporting hyperplane method.

In a modified form (essentially relating to the solution of the relaxed problems (Q_k)), both classical methods have been applied for solving concave minimization problems: the KCG approach by Thieu, Tam and Ban (1983) and the supporting hyperplane method by Hoffman (1981). Furthermore, the general incorporation of outer approximation by cutting planes into procedures for solving global optimization problems was discussed in Tuy (1983), Horst (1986) and Tuy and Horst (1988). We will return to these discussions later.

Another approach to outer approximation by convex polyhedral sets uses duality and decomposition (see, e.g., Dantzig and Wolfe (1960), Mayne and Polak (1984) and references given therein). In several of these dual approaches, the important question of constraint dropping strategies is addressed. This question will be discussed in the next section, where we will also mention some additional important approaches to outer approximation by cutting planes.

The following class of outer approximation by polyhedral convex sets - introduced in Horst, Thoai and Tuy (1987 and 1989) - provides a large variety of methods that include as very special cases the KCG and supporting hyperplane methods. Consider problem (P) with feasible set D as defined by (10). Assume that an initial polyhedral convex set D₁ ⊃ D is given and let the constraints ℓ_k be affine, i.e., of the form (12).

Denote

D⁰ := {x ∈ ℝ^n: g(x) < 0} .                                        (17)

We assume that ∂g(x)\{0} ≠ ∅ ∀x ∈ ℝ^n \ D⁰, where ∂g(x) denotes (as above) the subdifferential of g at x. This assumption is certainly fulfilled if D⁰ ≠ ∅ (Slater condition): let z ∈ D⁰ and y ∈ ℝ^n \ D⁰, p ∈ ∂g(y). Then, by the definitions of ∂g(y) and D⁰, we have

0 > g(z) ≥ p(z − y) + g(y) ≥ p(z − y) ,

which implies p ≠ 0, i.e., 0 ∉ ∂g(y).

Theorem II.2. Let K be any compact convex subset of D⁰. In (12), for each k = 1,2,... choose

y^k ∈ conv (K ∪ {x^k}) \ D⁰ ,  p^k ∈ ∂g(y^k) ,  β_k = g(y^k) .     (18)

Then the conditions (3), (4) of the outer approximation method and the assumptions (i), (ii) and (iii) of Theorem II.1 are fulfilled.

The set Y^k := conv (K ∪ {x^k}) \ D⁰ is displayed in Fig. II.3.

Proof. (3): Since p^k ∈ ∂g(y^k), we have g(x) ≥ p^k(x − y^k) + g(y^k) = ℓ_k(x) ∀x ∈ ℝ^n. Hence, using x ∈ D ⟹ g(x) ≤ 0, we see that ℓ_k(x) ≤ 0 ∀x ∈ D.

(4): If D⁰ = ∅, then K = ∅; hence y^k = x^k and ℓ_k(x^k) = g(x^k) > 0, since x^k ∉ D.

Let D⁰ ≠ ∅, K ≠ ∅. Then every point y^k ∈ conv (K ∪ {x^k}) \ D⁰ can be expressed in the form

y^k = λ_k z^k + (1 − λ_k)x^k ,                                     (19)

where z^k ∈ K, 0 ≤ λ_k < 1. Then

x^k − y^k = α_k(y^k − z^k) ,  where  α_k = λ_k/(1 − λ_k) ≥ 0 .     (20)

Since K is compact and is contained in D⁰, there exists a number δ > 0 such that g(x) ≤ −δ < 0 ∀x ∈ K.

Using p^k ∈ ∂g(y^k), g(y^k) ≥ 0, we obtain

p^k(z^k − y^k) ≤ g(z^k) − g(y^k) ≤ g(z^k) ≤ −δ ,

hence

p^k(y^k − z^k) ≥ δ > 0 .

From this and (20) it follows that

ℓ_k(x^k) = p^k(x^k − y^k) + g(y^k) ≥ α_k δ > 0

whenever α_k > 0 (i.e., x^k ≠ y^k). However, for α_k = 0, i.e., y^k = x^k, we have ℓ_k(x^k) = g(x^k) > 0, since x^k ∉ D.

(i): For each k = 1,2,..., the affine function ℓ_k defined by (12) and (18) is obviously continuous.

(ii): Let {x^q} be any convergent subsequence of {x^k}, and let x^q → x̄. Then there is a q₀ ∈ ℕ sufficiently large such that x^q ∈ B := {x ∈ ℝ^n: ‖x − x̄‖ ≤ 1} ∀q > q₀. Since K and {∂g(y): y ∈ Y} are compact sets when Y is compact (cf., e.g., Rockafellar (1970), Chapter 24) and λ_q ∈ [0,1], we see that there is a subsequence {x^r} of {x^q} such that

p^r → p̄ ,  λ_r → λ̄ ∈ [0,1] ,  z^r → z̄  as r → ∞ .                (21)

Then it follows that

y^r = λ_r z^r + (1 − λ_r)x^r → ȳ = λ̄z̄ + (1 − λ̄)x̄ ,  β_r = g(y^r) → g(ȳ) = β̄ .   (22)

Observe that we cannot have λ̄ = 1, because λ̄ = 1 would imply ȳ = z̄, which is impossible, since z̄ ∈ K ⊂ int D. Note that, since the set-valued mapping y ↦ ∂g(y) is closed (cf., e.g., Rockafellar (1970)), we even have p̄ ∈ ∂g(ȳ). Now

lim_{r→∞} ℓ_r(x^r) = lim_{r→∞} ℓ_r(x̄) = p̄(x̄ − ȳ) + β̄ = ℓ̄(x̄) ,  where  ℓ̄(x) = p̄(x − ȳ) + β̄ .   (23)

(iii): Let x^r → x̄, ℓ_r(x^r) → ℓ̄(x̄) = 0.

By (20) - (23), we have

0 = ℓ̄(x̄) = p̄(x̄ − ȳ) + β̄ = (λ̄/(1 − λ̄)) p̄(ȳ − z̄) + β̄ .        (24)

But since y^q ∉ D⁰, we have β_q = g(y^q) ≥ 0; hence, by continuity of the convex function g, we have β̄ = g(ȳ) ≥ 0. Moreover, while verifying (4), we saw that p^q(y^q − z^q) ≥ δ > 0, and hence p̄(ȳ − z̄) ≥ δ > 0. From (24) it then follows that λ̄ = 0, and hence ȳ = x̄ (cf. (22)), and also that β̄ = g(ȳ) = 0; hence x̄ ∈ D.  •

",

k
D X

Fig. 11.3 Choice of i in Theorem 11.2.

For any compact, convex set K ⊂ D⁰ we obviously may always choose y^k = x^k. The resulting algorithm is the KCG-method. If we fix K = {w}, i.e., z^k = w, where g(w) < 0, and determine y^k as the unique point where the line segment [w,x^k] meets the boundary of D, then we obtain the supporting hyperplane algorithm. The KCG and the supporting hyperplane methods, which in their basic form were originally designed for convex programming problems, are useful tools in that field if the problems are of moderate size and if the accuracy of the approximate solution obtained when the algorithm is stopped does not need to be very high. The rate of convergence, however, generally cannot be expected to be faster than linear (cf., e.g., Levitin and Polyak (1966) and Wolfe (1970)).

Moreover, in the case of cutting planes, it has been observed that near a solution the vectors p^k of successive hyperplanes may tend to become linearly dependent, causing numerical instabilities.

In order to solve convex programming problems and also to obtain local solutions in nonlinear programming, recently several faster cutting plane methods have been proposed that combine the idea of outer approximation methods with typical local elements such as line searches etc. We will return briefly to these methods when discussing constraint dropping strategies in the next section. In global optimization, where rapid local methods fail and where - because of the inherent difficulties of global (multiextremal) optimization - only problems of moderate size can be treated, the class of relaxation methods by polyhedral convex sets that was introduced in Theorem II.2 constitutes a basic tool, especially when combined with branch and bound techniques (cf. Chapter IV). However, the great variety of concrete methods included in Theorem II.2 needs further investigation in order to find the most promising realizations.

Another way of constructing supporting hyperplanes in the outer approximation approach is by projecting the current iteration point x^k onto the feasible set D. Denote by π(x) the projection of x ∈ ℝ^n onto the closed convex set D, i.e.,

‖x − π(x)‖ = min {‖x − y‖: y ∈ D} .                                (25)

It is well-known that π(x) exists, is unique, and is continuous everywhere; moreover, we have

(x − π(x)) (y − π(x)) ≤ 0  ∀y ∈ D                                  (26)

(cf., e.g., Luenberger (1969)).


Theorem II.3. If in (12) for each k = 1,2,... we choose

ℓ_k(x) = (x^k − π^k)(x − π^k) ,                                    (27)

where π^k = π(x^k), then the outer approximation conditions (3), (4) and the assumptions (i), (ii) and (iii) of Theorem II.1 are satisfied.

Proof. (3): By (27) and (26), we have ℓ_k(x) ≤ 0 ∀x ∈ D.

(4): ℓ_k(x^k) = (x^k − π^k)(x^k − π^k) = ‖x^k − π^k‖² > 0, since x^k ∉ D.

(i), (ii): The affine function ℓ_k defined by (27) is continuous for each k. Let {x^q} be a subsequence of {x^k} satisfying x^q → x̄. Since the function π is continuous, we have π^q → π(x̄). It follows that

lim_{q→∞} ℓ_q(x^q) = lim_{q→∞} (x^q − π^q)(x^q − π^q) = lim_{q→∞} (x^q − π^q)(x̄ − π^q) = lim_{q→∞} ℓ_q(x̄) = (x̄ − π(x̄))(x̄ − π(x̄)) = ℓ̄(x̄) ,

where ℓ̄(x) = (x̄ − π(x̄))(x − π(x̄)).

(iii): Let x^q → x̄, ℓ_q(x^q) → ℓ̄(x̄) = 0.

Then it follows that ℓ̄(x̄) = (x̄ − π(x̄))(x̄ − π(x̄)) = ‖x̄ − π(x̄)‖² = 0. This implies x̄ = π(x̄) ∈ D.  •
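For feasible sets with a closed-form projection the cut (27) is immediate to implement; in the sketch below D is a hypothetical unit ball:

```python
import numpy as np

def proj(x):                        # projection pi(x) onto the unit ball, cf. (25)
    nx = np.linalg.norm(x)
    return x if nx <= 1.0 else x / nx

def make_cut(xk):
    """Return l_k(x) = (x^k - pi^k)(x - pi^k), cf. (27)."""
    pik = proj(xk)
    return lambda x: (xk - pik) @ (x - pik)

xk = np.array([2.0, 0.0])
lk = make_cut(xk)
assert lk(xk) > 0                   # (4): the cut excludes x^k ...
assert lk(np.zeros(2)) <= 0         # (3): ... while keeping all of D
```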

Note that Theorem II.3 cannot be subsumed under Theorem II.2. As an example, consider the case where the ray from x^k through π^k does not meet the set D⁰ (Fig. II.4).

Fig. II.4. Theorem II.3 is not included in Theorem II.2.

3. CONSTRAINT DROPPING STRATEGIES

One clear disadvantage of the outer approximation methods discussed so far is that the size of the subproblems (Q_k) increases from iteration to iteration, since in each step a new constraint ℓ_k(x) ≤ 0 is added to the existing set of constraints, but no constraint is ever deleted. Most of the theoretical work on outer approximation by polyhedral convex sets has centered on the crucial question of dropping (inactive) constraints. Topkis (1970) proposed a cutting plane method where certain constraints could be dropped. Eaves and Zangwill (1971) - by introducing the notions of a cut map and a separator - gave a general and abstract theory of outer approximation by cutting planes. This theory was generalized further by Hogan (1973) to cover certain dual approaches. Examples of dual approaches that include constraint dropping strategies are given in Gonzaga and Polak (1979) and in Mayne and Polak (1984).
Though convex programming is not within the scope of our treatment, we would like to mention that for (possibly nonsmooth) convex problems several cutting plane approaches have been proposed that converge linearly or faster; moreover, they enable one to drop constraints in such a manner that the number of constraints used in each step is bounded. Most of these algorithms do not fit into our basic approach, since in each step a quadratic term is added to the objective function of the original problem, while the outer approximation methods presented here use the original objective function throughout the iterations. For details, we refer to Fukushima (1984) and to the book of Kiwiel (1985). However, these results on local optimization and convex programming, respectively, could not yet be carried over to the global optimization of multiextremal problems.

Below, we present a theorem on constraint dropping strategies in global optimization that can be applied to all algorithms satisfying the assumptions of Theorem II.1, hence, e.g., also to the classes of procedures discussed in the preceding section. Note that these results also apply to nonlinear cuts.

We shall need separators as defined in Eaves and Zangwill (1971).

Definition II.1. A function δ: D₁ \ D → ℝ₊ is called a separator if for all sequences {x^k} ⊂ D₁ \ D we have that x^k → x̄, δ(x^k) → 0 imply x̄ ∈ D.

Note that for x ∈ D₁ \ D we must have δ(x) > 0. Each function δ: D₁ \ D → ℝ that is lower semicontinuous and satisfies δ(x) > 0 ∀x ∈ D₁ \ D is a separator. Another example is given by the distance δ(x) = d_D(x) of x ∈ D₁ \ D from D.

For practical purposes, when separators are used to drop certain old cuts, a straightforward choice of separator is any penalty function, as is well-known from standard local nonlinear programming techniques. For example, let

D = {x ∈ ℝ^n: g_i(x) ≤ 0 (i=1,...,m)}

with continuous g_i: ℝ^n → ℝ (i=1,...,m). Then all functions of the form

δ(x) = Σ_{i=1}^m [max {0, g_i(x)}]^β ,  β ≥ 1                      (28)

are examples of separators.


Let

L_i := {x ∈ ℝ^n: ℓ_i(x) ≤ 0} .                                     (29)

Then we would like to be able to say that

D_{k+1} := D₁ ∩ ∩_{i∈I_k} L_i ,                                    (30)

where I_k is a subset of {1,...,k}, |I_k| < k, defines a convergent outer approximation method. Let ↑, ↓ denote monotonically increasing or decreasing convergence, respectively.

Theorem II.4. Let δ be a separator. Let {ε_ij}, j ≥ i, be a double-indexed sequence of nonnegative real numbers such that

ε_ii = 0 ,                                                         (31)

ε_ij ↑ t_i  as j → ∞ ,  t_i ↓ 0  as i → ∞ .                        (32)

Assume that the sequence {ℓ_k(x)} in the outer approximation method satisfies (3), (4) and the requirements (i), (ii) and (iii) of Theorem II.1. Then

D_{k+1} := D₁ ∩ ∩_{i∈I_k} L_i                                      (33)

with

I_k := {i ∈ {1,...,k}: δ(x^i) ≥ ε_ik}                              (34)

defines a convergent outer approximation method, i.e., every accumulation point of {x^k} solves (P).

Proof. Let {x^k} ⊂ D₁ \ D be the sequence of iteration points defined by an outer approximation method satisfying (31), (32), (33), (34). Let {x^q} be a subsequence of {x^k} converging to x̄.

If δ(x^q) → 0, then x̄ ∈ D by Definition II.1.

Suppose that δ(x^q) ↛ 0. By passing to a suitable subsequence if necessary, we may assume that δ(x^q) ≥ ε > 0 ∀q. Since t_i ↓ 0, we may also assume that ε ≥ t_i for all i considered. Then, using (32), we have δ(x^i) ≥ ε ≥ t_i ≥ ε_{i,q−1} ∀i < q−1. By (34), this implies i ∈ I_{q−1}, hence x^q ∈ L_i, i.e., ℓ_i(x^q) ≤ 0 ∀i < q. Considering the subsequence {x^r} of {x^q} satisfying (ii) of Theorem II.1, we see that ℓ_r(x^{r'}) ≤ 0 ∀r' > r, and hence, as in the proof of Theorem II.1, lim_{r→∞} ℓ_r(x̄) ≤ 0.

But by (4) we have ℓ_r(x^r) > 0 ∀r; hence by (ii), lim_{r→∞} ℓ_r(x̄) ≥ 0. By (iii) of Theorem II.1, it then follows that x̄ ∈ D, which implies that x̄ solves (P).  •

Many double-indexed sequences satisfying the above requirements exist. For example, let {η_i} be any monotonically decreasing sequence of positive real numbers that converges to 0. Then ε_ij = η_i − η_j (i ≤ j) satisfies (31), (32).

Note that, since ε_kk = 0, we always have k ∈ I_k. Moreover, because ε_ik is monotonically increasing with k for fixed i, we conclude that i ∉ I_k implies i ∉ I_s for all s > k: a dropped cut remains dropped forever. Likewise, a cut retained at some iteration k may be dropped at a subsequent iteration s > k. On the other hand, for i,k → ∞, we have ε_ik → 0, which at first glance seems to imply that few "late" constraints can be dropped for large i,k, by (34). However, note that we also have δ(x^i) → 0, so that the number of constraints to be dropped for large i,k depends on the relative speed of convergence of δ(x^i) and ε_ik. This limit behaviour can be influenced by the choices of δ and ε_ik, and should be investigated further.

Another constraint dropping strategy that is useful when outer approximation


methods are applied to solve concave minimization problems will be presented in
Chapter VI.

4. ON SOLVING THE SUBPROBLEMS (Q_k)

We now discuss the question of how to solve the subproblems (Q_k) in outer approximation methods that use affine cuts. In global optimization, these algorithms have often been applied to problems satisfying

min f(D_k) = min f(V_k) ,

where D_k is a polytope and V_k = V(D_k) denotes the vertex set of D_k. The most well-known examples are concave minimization and stable d.c. programming that can be reduced to parametric concave minimization (cf. Chapters I, IX and X).

In this section we shall address three questions:

- How do we determine the initial polytope D₁ and its vertex set V₁?

- How do we find the vertex set V_{k+1} of D_{k+1} in each step, if D_k, V_k and an affine cut are given?

- How do we identify redundant constraints?

The problem of finding all vertices and redundant constraints of a polytope given by a system of linear inequalities has been treated often (e.g., Manas and Nedoma (1968), Matheiss (1973), Gal (1975), Dyer and Proll (1977 and 1982), Matheiss and Rubin (1980), Dyer (1983), Khang and Fujiwara (1989)). Since, however, in our case V_k and the new constraint are given, we are interested in developing methods that take into account our specific setting rather than applying one of the standard methods referred to above. We will also briefly treat the case of unbounded polyhedral convex sets D_k, where vertices and extreme directions are of interest.

4.1. Finding an Initial Polytope D₁ and its Vertex Set V₁

Let D be a nonempty, convex and compact set of dimension n. D₁ should be a simple polytope tightly enclosing D and having a small number of vertices.

Denote

α_j := min {x_j: x ∈ D}  (j=1,...,n)                               (35)

and

α := max {Σ_{j=1}^n x_j: x ∈ D} .                                  (36)

Then it is easily seen that

D₁ := {x ∈ ℝ^n: α_j − x_j ≤ 0 (j=1,...,n), Σ_{j=1}^n x_j − α ≤ 0}  (37)

is a simplex containing D. The n+1 facets of D₁ are defined by the n+1 hyperplanes {x ∈ ℝ^n: x_j = α_j} (j=1,...,n) and {x ∈ ℝ^n: Σ_{i=1}^n x_i = α}. Each of these hyperplanes is a supporting hyperplane of D. The set of vertices of D₁ is

V₁ = {v⁰, v¹,...,vⁿ} ,

where

v⁰ = (α₁,...,α_n)^T                                                (38)

and

v^j = (α₁,...,α_{j−1}, β_j, α_{j+1},...,α_n)^T  (j=1,...,n)        (39)

with

β_j = α − Σ_{i≠j} α_i .
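A sketch of the construction (35)-(39). To stay self-contained it uses a hypothetical D (a Euclidean ball), for which the linear subproblems (35), (36) have closed-form optima; for general D one would call a convex programming routine instead:

```python
import numpy as np

c, r, n = np.array([1.0, 2.0, 0.5]), 1.5, 3   # hypothetical ball D = {|x - c| <= r}

alpha_j = c - r                        # alpha_j = min {x_j : x in D}, cf. (35)
alpha = c.sum() + r * np.sqrt(n)       # alpha = max {sum_j x_j : x in D}, cf. (36)

V1 = [alpha_j.copy()]                  # v^0 = (alpha_1, ..., alpha_n), cf. (38)
for j in range(n):                     # v^j replaces coordinate j by beta_j, cf. (39)
    v = alpha_j.copy()
    v[j] = alpha - (alpha_j.sum() - alpha_j[j])
    V1.append(v)
# V1 now holds the n + 1 vertices of the simplex D_1 from (37).
```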

Note that (35), (36) define n+1 convex optimization problems with linear objective functions. For this reason their solution can be efficiently computed using standard optimization algorithms. If D is contained in the orthant ℝ^n₊ (e.g., if the constraints include the inequalities x_j ≥ 0 (j=1,...,n)), it suffices to compute α according to (36), and then

D₁ = {x ∈ ℝ^n: x_j ≥ 0 (j=1,...,n), Σ_{j=1}^n x_j ≤ α}             (40)

is a simplex containing D with vertices v⁰ = 0, v^j = αe^j (j=1,...,n), where e^j is the j-th unit vector in ℝ^n.
If the constraints comprise the inequalities α_j ≤ x_j ≤ β_j (j=1,...,n) with given lower and upper bounds α_j and β_j, then obviously D₁ may be the rectangular set

D₁ = {x ∈ ℝ^n: α_j ≤ x_j ≤ β_j (j=1,...,n)} .                      (41)

The vertex set is then

V₁ = {v¹,...,v^{2ⁿ}}

with

v^k = (τ^k_1,...,τ^k_n)^T ,                                        (42)

where

τ^k_i = α_i or β_i  (i=1,...,n)                                    (43)

and k ranges over the 2ⁿ possible different combinations in (42), (43).

Note that more sophisticated constructions of D₁ are available in the case of linear constraints (cf. Part B).

4.2. Computing New Vertices and New Extreme Directions

Let the current polyhedral convex set constituting the feasible set of the relaxed problem (Q_k) at iteration k be defined as

D_k = {x ∈ ℝ^n: ℓ_i(x) = a^i x + β_i ≤ 0 (i ∈ K)} ,                (44)

where K ⊂ ℕ is a finite set of indices and k ∉ K.

Let ℓ_k(x) ≤ 0 be the new constraint defining

D_{k+1} = D_k ∩ {x ∈ ℝ^n: ℓ_k(x) ≤ 0} .                            (45)

Denote by V_k, V_{k+1} the vertex sets of D_k and D_{k+1}, respectively, and let U_k, U_{k+1} denote the sets of extreme directions of D_k and D_{k+1}, respectively.

The following lemma characterizes the new vertices and extreme directions of D_{k+1}.

Lemma 11.1. Let P ~ IR n be an n-dimensional convex polyhedral set with vertex set

V and set U of extreme directions. Let l(x) = ax + ß 5 0 (a E IR n, ßE IR) be an


additional linear constraint, and let V', U' be the vertex set and set of extreme
directions of

P' =pn{x:l(x)5 O}.

Then we have
a) w E V I \ Vif and only if w is the intersection of the hyperplane {x: l(x) = O}
with an edge [v -, v+Jof P satisfying l(v -) < 0, l(v +) > 0 or with an unbounded edge
emanating /rom avertex v E V in a direction 11. E U satisJying either l(v) < 0 and
au > 0 or l(v) > 0 and au < O.

b) 11. E U ' \ U if and only ifu satisfies au = 0 and is ofthe form 11. = >'11.- + I'U +
with >',1' > 0, au < 0, au + > 0, which defines a tWQ-dimensional face of the
recession cone of P.
Proof. Suppose that P = {x ∈ ℝ^n: l_i(x) ≤ 0, i ∈ K}, where l_i(x) = a^i x + β_i (i ∈ K).

a): The "if" part of the assertion is obvious.

Now let w ∈ V' \ V. Since w ∈ V', among the linear constraints defining P' there are n linearly independent constraints which are active at w. One of these constraints must be l(x) ≤ 0, i.e., we have l(w) = 0, since otherwise w ∈ V.

Let l_i(x) ≤ 0, i ∈ J, |J| = n−1, be the n−1 constraints that remain active at w if we delete l(x) ≤ 0. Then

F := {x ∈ P: l_i(x) = 0, i ∈ J}

is a face of P containing w. We have dim F = 1 since l_i(x) = 0, i ∈ J, are linearly independent constraints and |J| = n−1. It follows that F is an edge of P.

If F is bounded, then we have F = [v⁻,v⁺], v⁻ ∈ V, v⁺ ∈ V and w ∈ [v⁻,v⁺] but w ≠ v⁻, w ≠ v⁺. This implies l(v⁻) ≠ 0, l(v⁺) ≠ 0. Since l(w) = 0, we must have l(v⁻)·l(v⁺) < 0.

If F is an unbounded one-dimensional face (unbounded edge) of P, then F = {v + αu: α ≥ 0}, v ∈ V, u ∈ U and w ≠ v, i.e., w = v + α_0 u, α_0 > 0. Since l(w) = 0, this implies that l(v) ≠ 0, au ≠ 0 and l(v)·au < 0.

b) The "if" part of the assertion is again obvious.

Now let C and C' denote the recession cones of P and P', respectively. We have

C' = cone U' = C ∩ {y ∈ ℝ^n: ay ≤ 0} .

Let u ∈ U' \ U. Then, as in the proof of part a), we conclude that among the constraints defining C' there are n−1 linearly independent constraints active at u. Since u ∉ U, one of these active constraints has to be ay ≤ 0, i.e., we have au = 0.

Let J denote the index set of the n−2 linearly independent constraints that remain if we delete ay ≤ 0, |J| = n−2. Then

G := {y ∈ C: a^i y = 0, i ∈ J}

is a smallest face of C containing the ray {αu: α ≥ 0}. Certainly, G ≠ {αu: α ≥ 0}, since otherwise u would be an extreme direction of P, i.e., u ∈ U. Therefore, dim G = 2, and G is a two-dimensional cone generated by two extreme directions u⁻, u⁺ ∈ U, u⁻ ≠ u, u⁺ ≠ u. Thus, we have

u = λu⁻ + μu⁺ with λ, μ > 0 .

This implies au⁻ ≠ 0, au⁺ ≠ 0 and (au⁻)(au⁺) < 0, since 0 = au = λau⁻ + μau⁺. •

Note that part a) of Lemma II.1 requires that P possesses edges, and part b) requires the existence of at least one two-dimensional face of the recession cone of P. It is easy to see that otherwise we cannot have new vertices or new extreme directions, respectively.

A straightforward application of this result is the following procedure for calculating V_{k+1} and U_{k+1} from V_k and U_k (cf. Thieu, Tam and Ban (1983), Thieu (1984)).

Let D_k = {x ∈ ℝ^n: l_i(x) = a^i x + β_i ≤ 0 (i ∈ K)}, where K is a finite set of indices satisfying k ∉ K.

Method I:

Let l_k(x) = a^k x + β_k, and define

V_k^+ := {v ∈ V_k: l_k(v) > 0},  V_k^− := {v ∈ V_k: l_k(v) < 0} ,   (46)

U_k^+ := {u ∈ U_k: a^k u > 0},  U_k^− := {u ∈ U_k: a^k u < 0} .   (47)
α) Finding the vertices of V_{k+1} \ V_k that are points where {x ∈ ℝ^n: l_k(x) = 0} meets a bounded edge of D_k:

For any pair (v⁻,v⁺) ∈ V_k^− × V_k^+ let w = αv⁻ + (1−α)v⁺, where

α = l_k(v⁺) / (l_k(v⁺) − l_k(v⁻))  (i.e., l_k(w) = 0) .   (48)

Compute I(w) = {i ∈ K: l_i(w) = 0} .   (49)

If the rank of the matrix A(w) having the rows a^i, where l_i(x) = a^i x + β_i, a^i ∈ ℝ^n, β_i ∈ ℝ, i ∈ I(w), is less than n−1, then w cannot be in V_{k+1} \ V_k. Otherwise, w is in V_{k+1} \ V_k.

If D_k is bounded, then V_{k+1} \ V_k is determined in this way.

Note that, by the proof of Lemma II.1, we have

I(w) = {i ∈ K: l_i(v⁻) = l_i(v⁺) = 0} = I(v⁻) ∩ I(v⁺) .   (50)

Since the calculation of the rank of A(w) may be time-consuming, one may simply calculate I(w) via (50) and consider all points w defined by (48), (49) which satisfy |I(w)| = n−1. Then, in general, we obtain a set Ṽ_{k+1} larger than V_{k+1}. But since V_{k+1} ⊆ Ṽ_{k+1} ⊂ D_{k+1}, and since we assume that argmin f(D_{k+1}) = argmin f(V_{k+1}), we have argmin f(D_{k+1}) = argmin f(Ṽ_{k+1}). The computational effort for the whole procedure may be less if we use Ṽ_{k+1} instead of V_{k+1}.
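For bounded D_k, step α) with the relaxed test |I(w)| = n−1 can be sketched as follows (a toy implementation under our own data conventions, not the authors' code; constraints are stored as rows a^i with constants β_i):

```python
import numpy as np

def update_vertices(Ak, beta, Vk, a_new, b_new, tol=1e-9):
    """Method I, step alpha): vertex set of D_{k+1} for bounded D_k, where the
    old constraints are l_i(x) = Ak[i] x + beta[i] <= 0 and the new cut is
    a_new x + b_new <= 0.  Uses the relaxed test |I(w)| >= n-1 with
    I(w) = I(v-) intersected with I(v+), cf. (50)."""
    n = Ak.shape[1]
    lk = lambda v: float(a_new @ v + b_new)
    act = lambda v: {i for i in range(len(beta)) if abs(Ak[i] @ v + beta[i]) <= tol}
    V_minus = [v for v in Vk if lk(v) < -tol]            # V_k^-, cf. (46)
    V_plus  = [v for v in Vk if lk(v) >  tol]            # V_k^+
    V_keep  = [v for v in Vk if lk(v) <= tol]            # old vertices of D_{k+1}
    new = []
    for vm in V_minus:
        Im = act(vm)
        for vp in V_plus:
            if len(Im & act(vp)) >= n - 1:               # relaxed rank test via (50)
                t = lk(vp) / (lk(vp) - lk(vm))           # cf. (48): l_k(w) = 0
                new.append(t * vm + (1.0 - t) * vp)
    return V_keep + new

# Example: unit square cut by x1 + x2 - 1.5 <= 0
Ak = np.vstack([np.eye(2), -np.eye(2)])
beta = np.array([-1.0, -1.0, 0.0, 0.0])                  # x <= 1, -x <= 0
Vk = [np.array(v, float) for v in [(0, 0), (1, 0), (0, 1), (1, 1)]]
print(update_vertices(Ak, beta, Vk, np.array([1.0, 1.0]), -1.5))
# keeps (0,0), (1,0), (0,1) and adds (1, 0.5), (0.5, 1)
```

Steps β) and γ) below extend the same pattern to unbounded edges and to pairs of extreme directions.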

β) Finding the vertices of V_{k+1} \ V_k that are points where {x ∈ ℝ^n: l_k(x) = 0} meets an unbounded edge of D_k:

For any pair (u,v) ∈ (U_k^− × V_k^+) ∪ (U_k^+ × V_k^−), determine w = v + αu, where α = −l_k(v) / (a^k u) (i.e., l_k(w) = 0). Compute I(w) = {i ∈ K: l_i(w) = 0} and, as in the bounded case, decide from the rank of A(w) whether w ∈ V_{k+1} \ V_k. Note that in this case we have I(w) = {i ∈ K: l_i(v) = 0, a^i u = 0}.

γ) Finding the new extreme directions:

For any pair (u⁻,u⁺) ∈ U_k^− × U_k^+ determine u = (a^k u⁺)u⁻ − (a^k u⁻)u⁺.

It is easily seen that u ∈ cone U_k and a^k u = 0. Let J(u) = {j ∈ K: a^j u⁻ = a^j u⁺ = 0}. If the rank of the system of equations

a^j y = 0  (j ∈ J(u))

is less than n−2, then u cannot be an extreme direction of D_{k+1}; otherwise u ∈ U_{k+1} \ U_k.

Method II:

In the most interesting case in practice, when D and D_k are bounded, the vertices of V_{k+1} \ V_k may be calculated by linear programming techniques (cf., e.g., Falk and Hoffman (1976), Hoffman (1981), Horst, Thoai and de Vries (1988)).

We say that a vertex v of a polytope P is a neighbour in P of a vertex w of P if [v,w] is an edge of P.

Let l_k(x) ≤ 0 be the new constraint and consider the polytope

D_{k+1} = D_k ∩ {x: l_k(x) ≤ 0} .

Then, from Lemma II.1, it follows that w ∈ V_{k+1} \ V_k if and only if w is a vertex of D_{k+1} satisfying w ∉ V_k, l_k(w) = 0, and having a neighbour v in D_{k+1} with v ∈ V_k^−.

Falk and Hoffman (1976) represent each vertex v in V_k^− by the entire system of inequalities defining D_{k+1}. This system must be transformed by a suitable pivoting such that each v ∈ V_k^− is represented in the form

s + Tt = b ,
where s and t are the vectors of basic and nonbasic variables, respectively, corresponding to v. By performing dual pivoting on all current nonbasic variables in the row corresponding to l_k(x) ≤ 0, one obtains the neighbour vertices of v in the new cutting hyperplane or detects that such a neighbour vertex does not exist. The case of degenerate vertices is not considered in Falk and Hoffman (1976) and Hoffman (1981).

The proposal of Horst, Thoai and de Vries (1988) is based on the following consideration.

For each v ∈ V_k^−, denote by E(v) the set of halflines (directions) emanating from v, each of which contains an edge of D_k, and denote by I(v) the index set of all constraints that are active (binding) at v. By Lemma II.1, the set of new vertices coincides with those intersection points of the halflines e ∈ E(v) with the hyperplane H_k := {x: l_k(x) = a^k x + β_k = 0} that belong to D_k.

Suppose first that v is nondegenerate. Then I(v) contains exactly n linearly independent constraints, and the set of inequalities

a^{i_k} x + β_{i_k} ≤ 0 ,  i_k ∈ I(v)  (k = 1,...,n)

defines a polyhedral cone with vertex at v which is generated by the halflines e ∈ E(v). To carry out the calculations, in a basic form of the procedure, introduce slack variables y_1,...,y_{n+1} and construct a simplex tableau

        x_1        x_2     ...    x_n      y_1  y_2 ... y_n  y_{n+1}     RHS
     a^{i_1}_1  a^{i_1}_2  ...  a^{i_1}_n   1    0  ...  0      0      −β_{i_1}
        .          .              .         .    .       .      .         .
     a^{i_n}_1  a^{i_n}_2  ...  a^{i_n}_n   0    0  ...  1      0      −β_{i_n}
      a^k_1      a^k_2     ...   a^k_n      0    0  ...  0      1       −β_k
where the last row corresponds to the inequality l_k(x) ≤ 0.

Perform pivot operations in such a way that all variables x_i (i=1,...,n) become basic variables. In the last tableau on hand after these operations, let s_j (j=1,...,2n+1) be the first 2n+1 elements of the last row (i.e., the elements on the left in the row corresponding to the cut l_k(x) ≤ 0). For each j satisfying s_j ≠ 0, a point w ∈ H_k is obtained by a pivot operation with pivot element s_j. If we have w ∈ D_k, i.e., a^i w + β_i ≤ 0 (i ∈ K), then w is a new vertex of D_{k+1}.

In the case of a degenerate vertex v ∈ V_k^−, the operation presented above must be considered for each system of linearly independent constraints. This can be done by considering the system of all equations binding at v. This system then admits several basic forms (simplex tableaus) corresponding to v, for each of which the above calculation can be carried out. The transition from one of these basic forms to another can be done by pivoting with pivot rows corresponding to a basic variable with value 0. For an efficient bookkeeping that prevents a new vertex w from being determined twice, one can apply any of the devices used with algorithms for finding all vertices of a given polytope (e.g., Matheiss and Rubin (1980); Dyer and Proll (1977 and 1983); Horst, Thoai and de Vries (1988)). A comprehensive treatment of the determination of the neighbouring vertices of a given degenerate vertex in polytopes is given in Kruse (1986) (cf. also Gal et al. (1988), Horst (1991)).

In the procedure above, the new vertices are calculated from V_k^−. Obviously, instead of V_k^− one can likewise use V_k^+. Since the number of elements of these two sets may differ considerably, one should always use the set with the smaller number of vertices.

We give some remarks on a comparison of these methods.

Note first that the calculation of the new vertices w from one of the known sets V_k^− or V_k^+ is justified by the observation that in almost all of the examples where we applied a cutting plane procedure to solve concave minimization problems, the number of elements in V_k^− (or in V_k^+, respectively) was considerably smaller than the number of new vertices created by the cut (for a related conjecture see, e.g., Matheiss and Rubin (1980), Avis and Fukuda (1992)).

Table II.1 below shows some related results of 20 randomly generated examples.

Problem No.   n    m    |V_k|   sign   V1     V2    |V_{k+1}|
     1        5    8      12     V+      3      9       18
     2        -   14      18     V-      6     12       18
     3        -   19      18     V+      6     12       24
     4        -   23      24     V+      6     12       30
     5       10   13      27     V+      1     10       36
     6        -   18     122     V-     28     84      112
     7        -   22     112     V+     52    120      180
     8        -   25     360     V+    120    304      544
     9        -   28     680     V-    304    336      640
    10        -   35     688     V+    192    360      856
    11       20   23      57     V-     18     54       72
    12        -   25     192     V+      4    168      256
    13        -   26     256     V+     48    624      832
    14        -   29    1008     V+    432   1152     1728
    15        -   31    2376     V-    780   1728     2508
    16        -   32    2508     V+    598   1474     3884
    17       50   51      51     V+      1      5      100
    18        -   54     675     V-     48   1248     1296
    19        -   56    2484     V+    108   2376     4752
    20        -   58    9113     V+    351   8718    16580

Table II.1: Number of vertices in a cutting procedure

In Table II.1, we have used the following headings:

n: dimension of D_k;
m: number of constraints that define D_k;
|V_k| and |V_{k+1}|: number of vertices of D_k and D_{k+1}, respectively;
sign: indicates which set of vertices is taken for the calculation of the new vertices (V+ = V_k^+, V− = V_k^−);
V1: number of elements of the set indicated in sign;
V2: number of newly generated vertices.

For a comparison of the procedure of Horst, Thoai and de Vries (HTV) with the method of Falk-Hoffman, note that in an outer approximation algorithm using cutting planes one usually starts with a simplex or an n-rectangle defined by (n+1) or 2n constraints, respectively. Then in each step an additional constraint is added. Therefore, even when applying certain constraint dropping strategies, the number of constraints defining the polytope D_k = P is always greater than n, and it increases from step to step in the master cutting plane algorithm. In typical examples one usually has to expect that the number of constraints defining D_k will be considerably larger than the dimension n of the variable space.

The Falk-Hoffman approach has to handle all these constraints for each vertex v⁻ ∈ V_k^−, whereas the number of rows in the tableau of the HTV-procedure never exceeds n+1. Moreover, the case of degenerate vertices v⁻ ∈ V_k^− is easier to handle in the HTV-approach than in the method of Falk-Hoffman.

Thieu-Tam-Ban's method, because of its simplicity, operates well for very small problems. However, since in this approach one must investigate the intersection of all line segments [v⁻,v⁺], v⁻ ∈ V_k^−, v⁺ ∈ V_k^+, with the hyperplane H_k, the computational effort is prohibitive even for problems of medium size. Note that, for |V_k^−| = 200, |V_k^+| = 100, we already have 2·10^4 line segments [v⁻,v⁺], v⁻ ∈ V_k^−, v⁺ ∈ V_k^+.

A numerical comparison is given in Horst, Thoai and de Vries (1988). Several hundred randomly generated test problems were run on an IBM-PC/AT using the original FORTRAN code of Thieu-Tam-Ban for their relaxed version and our FORTRAN code for the procedure presented in the preceding section. In the relaxed version of Thieu-Tam-Ban, the decision on w ∈ V_{k+1} or w ∉ V_{k+1} is based only on the number of constraints that are binding at w and omits the relatively expensive determination of their rank (relaxed procedure). The results show that the Thieu-Tam-Ban algorithm should not be used if |V_k| exceeds 100 (cf. Table II.2 below).

Problem No.   n    m   |V_k|  |V_{k+1}|   T1 [min]   T2 [min]
     1        3    7     9       10        0.008      0.004
     2        -   16    10       10        0.018      0.010
     3        5   28    34       42        0.390      0.300
     4       10   14    27       32        0.420      0.170
     5        -   17    48       48        0.720      0.670
     6        -   12    32       56        0.120      0.110
     7        -   18    56       80        0.580      1.040
     8        -   19    80       80        1.12       2.18
     9        -   16   160      256        3.16      20.610
    10        -   18   288      256        3.55      27.310
    11        -   27   400      520        2.65      53.650
    12        -   30   672      736        1.91      81.850
    13       20   23    57       72        0.96       1.980
    14        -   24    72      220        1.21       8.250
    15        -   25   220      388        7.54      81.640
    16        -   26   388     1233        7.71     303.360

Table II.2: A comparison of the HTV method with the Thieu-Tam-Ban procedure

In Table II.2, we have used the following headings:

n: dimension of D_k;
m: number of constraints defining D_k;
|V_k| and |V_{k+1}|: number of vertices of D_k and D_{k+1}, respectively;
T1: CPU time for the method of Horst, Thoai and de Vries;
T2: CPU time for Thieu-Tam-Ban's relaxed procedure.

Another way of treating the unbounded case is as follows (cf. Tuy (1983)). Assume D_1 ⊂ ℝ^n_+. Recall first that, under the natural correspondence between points of the hyperplane {(1,x): x ∈ ℝ^n} in ℝ^{n+1} and points x ∈ ℝ^n, a point x ∈ D_k can be represented by the ray ρ(x) = {α(1,x): α ≥ 0}, and a direction of recession y can be represented by ρ(y) = {α(0,y): α ≥ 0}. In ℝ^{n+1}_+ define the n-simplex

S = {(t,x̄) ∈ ℝ^{n+1}_+: Σ_{i=1}^n x̄_i + t = 1} .

Associate to each point x ∈ D_k the point s(x) where the ray ρ(x) meets S, and to each direction of recession y of D_k associate the point s(y) where the ray ρ(y) meets S. In this way there is a one-to-one correspondence between the points and directions of recession of D_k and the points of the corresponding polytope s(D_k) ⊂ S. A vertex (t,x̄) of s(D_k) corresponds to an (ordinary) vertex x = x̄/t of D_k if t > 0, or to a vertex at infinity (an extreme direction x̄) of D_k if t = 0. Furthermore, since the corresponding point of D_k is simply x = x̄/t for (t,x̄) ∈ s(D_k), t > 0, we conclude that if

D_{k+1} = D_k ∩ {x: a^k x + β_k ≤ 0} ,

then

s(D_{k+1}) = s(D_k) ∩ {(t,x̄): a^k x̄ + β_k t ≤ 0} .

Denote the vertex set of s(D_k) by W_k. Then it follows from these observations that W_{k+1} can be derived from W_k by one of the methods presented above. Therefore, the set of vertices and extreme directions of D_{k+1} can be computed whenever the vertices and extreme directions of D_k and the new constraint are known.
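A minimal sketch of this correspondence (illustrative helper functions of our own; vertices and extreme directions are passed as lists of vectors, and all data are assumed to lie in the nonnegative orthant as required above):

```python
import numpy as np

def to_simplex_points(vertices, directions):
    """Map vertices x and extreme directions y of D_k to points of s(D_k) on
    the simplex S = {(t, xbar) >= 0: sum(xbar) + t = 1}: a vertex x goes to
    the point where the ray {alpha(1, x)} meets S, a direction y to the point
    where {alpha(0, y)} meets S."""
    pts = [np.concatenate(([1.0], x)) / (1.0 + np.sum(x)) for x in vertices]
    pts += [np.concatenate(([0.0], y)) / np.sum(y) for y in directions]
    return pts

def from_simplex_points(W, tol=1e-12):
    """Split the vertex set W of s(D_{k+1}) back into ordinary vertices
    (t > 0, giving x = xbar / t) and extreme directions (t = 0, giving xbar)."""
    vertices = [w[1:] / w[0] for w in W if w[0] > tol]
    directions = [w[1:] for w in W if w[0] <= tol]
    return vertices, directions
```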

An interesting "on-line" version of the Thieu-Tam-Ban approach is given in


Chen et al. (1991).

4.3. Identifying Redundant Constraints

Redundancy of constraints in linear programming has been discussed by many authors (cf., e.g., Matheiss and Rubin (1980) and references given there). However, in contrast to the common linear programming setting, in our context all vertices of a given polytope are known; hence we present a characterization of redundant constraints by means of the vertex set of a given polytope.

Let P := D_k = {x ∈ ℝ^n: l_i(x) ≤ 0, i ∈ K}, k ∉ K, and P' := D_{k+1} = P ∩ {x ∈ ℝ^n: l_k(x) ≤ 0} be polytopes defined as above. A constraint l_j(x) ≤ 0, j ∈ K, is called redundant for P if its removal does not change P.

Definition II.2. A constraint l_j(x) ≤ 0, j ∈ K, is redundant for a polytope P = {x ∈ ℝ^n: l_i(x) ≤ 0, i ∈ K} if there is an i_0 ∈ K \ {j} such that we have

F_j := P_{i_0} ∩ {x ∈ ℝ^n: l_j(x) = 0} ⊂ {x ∈ ℝ^n: l_{i_0}(x) ≥ 0} ,   (51)

where

P_{i_0} := {x ∈ ℝ^n: l_i(x) ≤ 0, i ∈ K \ {i_0}} .   (52)

We also say that the constraint l_j(x) ≤ 0 is redundant for P relative to l_{i_0}.

In a cutting plane approach as described above, the first polytope can usually be assumed to have no redundant constraints. Let P' be a polytope generated from a given polytope P by a cut l_k(x) ≤ 0. Then, by the definition of a cut, l_k(x) ≤ 0 cannot be redundant for P'. If we assume that redundancy is checked and redundant constraints are eliminated in each iteration of the cutting plane algorithm (i.e., P has no redundant constraints), then P' can only possess redundant constraints relative to the cut l_k. These redundant constraints can be eliminated by means of the following assertion. Again denote by V(P) the vertex set of a polytope P.

Theorem II.6. Assume that V^−(P) := {v ∈ V(P): l_k(v) < 0} ≠ ∅. Then a constraint l_j(x) ≤ 0, j ∈ K, is redundant for P' relative to l_k if and only if we have

l_j(v) < 0  for all v ∈ V^−(P) .   (53)
Proof. Since F_j is a polytope, we have F_j = conv V(F_j), where conv V(F_j) denotes the convex hull of the vertex set V(F_j) of F_j. Then, by Definition II.2, l_j(x) ≤ 0 is redundant for P' relative to l_k if and only if conv V(F_j) ⊂ {x: l_k(x) ≥ 0}. Obviously, because of the convexity of the halfspace {x: l_k(x) ≥ 0}, this inclusion is equivalent to the condition V(F_j) ∩ V^−(P) = ∅, which in turn holds if and only if (53) is satisfied. •

If one does not want to investigate redundancy at each step of the cutting plane
method, then one can choose to identify only the so-called strictly redundant

constraints. As we shall see, this can be done at an arbitrarily chosen iteration.

Definition II.3. A constraint l_j(x) ≤ 0, j ∈ K, is strictly redundant for a polytope P = {x ∈ ℝ^n: l_i(x) ≤ 0, i ∈ K} if there is an i_0 ∈ K \ {j} such that we have

F_j := P_{i_0} ∩ {x ∈ ℝ^n: l_j(x) = 0} ⊂ {x ∈ ℝ^n: l_{i_0}(x) > 0} ,   (54)

where P_{i_0} is defined by (52).

We also say that the constraint l_j(x) ≤ 0 is strictly redundant for P relative to l_{i_0}.

The following assertion shows that whenever a constraint is strictly redundant for P, it is strictly redundant for P' relative to the new constraint l_k(x) ≤ 0.

Theorem II.7. a) A constraint l_j(x) ≤ 0, j ∈ K, is strictly redundant for P relative to l_k(x) ≤ 0 if and only if we have

l_j(v) < 0  for all v ∈ V(P) \ V^+(P) ,   (55)

where V^+(P) := {v ∈ V(P): l_k(v) > 0}.

b) Every constraint that is strictly redundant for P is strictly redundant for P' relative to l_k(x) ≤ 0.
Proof. a) The proof of (55) is similar to the proof of (53) (replace V^−(P) by V(P) \ V^+(P)).

b) Let l_j(x) ≤ 0 be strictly redundant for P, and let i_0 satisfy F_j ⊂ {x: l_{i_0}(x) > 0}. Then we have P ∩ {x: l_j(x) = 0} ⊂ F_j ∩ {x: l_{i_0}(x) ≤ 0} = ∅, hence P ⊂ {x: l_j(x) < 0}. It follows that l_j(v) < 0 ∀v ∈ V(P). Hence, we see that l_j(v) < 0 ∀v ∈ V(P) \ V^+(P) holds, and, by part a) of Theorem II.7, l_j(x) ≤ 0 is strictly redundant for P' relative to l_k(x) ≤ 0. •
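Since the vertex set of P is known, both criteria reduce to sign checks over vertex lists; the following sketch (our own illustration, based on the criteria (53) and (55)) tests all constraints at once:

```python
import numpy as np

def redundant_constraints(A, beta, V, a_cut, b_cut, strict=False, tol=1e-9):
    """Indices j whose constraint l_j(x) = A[j] x + beta[j] <= 0 is redundant
    for P' = P cut by a_cut x + b_cut <= 0, relative to the cut: test
    l_j(v) < 0 on V^-(P) (criterion (53)) or, for strict redundancy, on
    V(P) minus V^+(P) (criterion (55)).  V is the known vertex set of P."""
    lk = lambda v: a_cut @ v + b_cut
    if strict:
        test_set = [v for v in V if lk(v) <= tol]        # V(P) minus V^+(P)
    else:
        test_set = [v for v in V if lk(v) < -tol]        # V^-(P)
    return [j for j in range(len(beta))
            if all(A[j] @ v + beta[j] < -tol for v in test_set)]

# Example: P = unit square; the cut x1 - 0.5 <= 0 makes x1 <= 1 redundant
A = np.vstack([np.eye(2), -np.eye(2)])
beta = np.array([-1.0, -1.0, 0.0, 0.0])
V = [np.array(v, float) for v in [(0, 0), (1, 0), (0, 1), (1, 1)]]
print(redundant_constraints(A, beta, V, np.array([1.0, 0.0]), -0.5))   # -> [0]
```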

Remark II.1. Let the feasible set be D := {x ∈ ℝ^n: g(x) ≤ 0} with strictly convex g: ℝ^n → ℝ, and assume that int D ≠ ∅. If all of the facets of the initial polytope D_1 and all hyperplanes generated by an outer approximation method support D, then it is easily seen that redundant constraints cannot occur.


CHAPTER III

CONCAVITY CUTS

In Chapter II we discussed the general concept of a cut and the use of cuts in the basic technique of outer approximation. There, we were mainly concerned with using cuts in a "conjunctive" manner: typically, cuts were generated in such a way that no feasible point of the problem is excluded and the intersection of all the cuts contains the whole feasible region. This technique is most successful when the feasible region is a convex set, so that supporting hyperplanes can easily be constructed to separate this convex set from a point lying outside.

We now discuss cuts that are used in a "disjunctive" manner: each cut taken separately may exclude certain points of the current region of interest, but the union of all cuts constructed at a given stage covers this region entirely. This technique is often used for approximating a certain nonconvex set and separating it from a point lying outside.

1. CONCEPT OF A VALID CUT

Consider the problem of minimizing a continuous function f: ℝ^n → ℝ over a polyhedron D of full dimension in ℝ^n. Suppose that γ is the smallest value of f(x) taken over all feasible points where an evaluation has been made up to a given stage in the process of solving the problem. Then we may restrict our further search to the subset of D consisting only of points x satisfying f(x) < γ. In other words, we may discard all points x ∈ D in the set

G = {x ∈ ℝ^n: f(x) ≥ γ}   (1)

from further consideration. Of course, we could do this by adjoining the constraint f(x) < γ to the feasible set D. However, this would lead us to consider the new feasible set

D*(γ) = D ∩ {x: f(x) < γ} ,

which except in certain simple cases is no longer polyhedral, and so would be difficult to handle. Therefore, in order to preserve the polyhedral structure of the constraint set, we have to consider instead an affine function L(x) such that the constraint L(x) ≥ 0 does not exclude any feasible point x with f(x) < γ, i.e., such that

D*(γ) ⊂ {x ∈ D: L(x) ≥ 0} 1) .   (2)

Definition III.1. A linear inequality L(x) ≥ 0 satisfying (2) is called a γ-valid cut for (f,D).

Typically, we have a point z ∈ D such that f(z) > γ, and we would like to have the cut eliminate it, i.e., we require that

L(z) < 0 .

When f(x) is a convex function, so that the set D*(γ) is convex, any hyperplane L(x) = 0 strictly separating z from D*(γ) is a γ-valid cut that excludes z. For example, if p ∈ ∂f(z), then f(z) + p(x−z) ≤ f(x), and the inequality

f(z) + p(x−z) ≤ γ  (i.e., L(x) = γ − f(z) − p(x−z) ≥ 0)

1) In this chapter it will be more convenient to denote a cut by L(x) ≥ 0 rather than by l(x) ≤ 0 as in Chapter II.
defines a γ-valid cut that excludes z (see Chapter II).

A more complicated situation typical of global optimization occurs when f(x) is a concave function. Then the set D*(γ) is nonconvex, and the subgradient concept can no longer be used. However, since the complement of D*(γ) (relative to D), i.e., the set D ∩ G, is convex, we may try to exploit this convexity in order to construct a hyperplane strictly separating z from D*(γ).

Assume that a vertex x^0 of D is available which satisfies x^0 ∈ int G (i.e., f(x^0) > γ) and is nondegenerate, i.e., there are exactly n edges of D emanating from x^0. Let u^1,u^2,...,u^n denote the directions of these edges. If the i-th edge is a line segment joining x^0 to an adjacent vertex y^i, then one can take u^i = y^i − x^0. However, note that certain edges of D may be unbounded (the corresponding vectors u^i are extreme directions of D). Since D has full dimension (dim D = n), the vectors u^1,u^2,...,u^n are linearly independent, and the cone vertexed at x^0 and generated by the halflines emanating from x^0 in the directions u^1,u^2,...,u^n is an n-dimensional cone K with exactly n edges such that D ⊂ K (in fact, K is the smallest cone with vertex at x^0 that contains D).

Now for each i=1,...,n let us take a point z^i ≠ x^0 on the i-th edge of K such that f(z^i) ≥ γ (see Fig. III.1). These points exist by virtue of the assumption x^0 ∈ int G. The n×n matrix

Q = (z^1−x^0, z^2−x^0,...,z^n−x^0)

with z^i−x^0 as its i-th column is nonsingular because its columns are linearly independent. Let e = (1,1,...,1) be the row vector of n ones.

Theorem III.1. If z^i = x^0 + θ_i u^i with θ_i > 0 and f(z^i) ≥ γ, then the linear inequality

eQ^{−1}(x − x^0) ≥ 1   (3)
defines a γ-valid cut for (f,D).

Proof. Since z^1−x^0, z^2−x^0,...,z^n−x^0 are linearly independent, there is a unique hyperplane H passing through the n points z^1,z^2,...,z^n. The equation of this hyperplane can be written in the form π(x−x^0) = 1, where π is some n-vector. Since this equation is satisfied by each z^i, we can write

π(z^i−x^0) = 1  (i=1,2,...,n) ,   (4)

i.e.,

πQ = e ,

hence π = eQ^{−1}. Thus, the equation of the hyperplane H through z^1,z^2,...,z^n is eQ^{−1}(x−x^0) = 1.

Now let S = [x^0,z^1,...,z^n] := conv {x^0,z^1,...,z^n} denote the simplex spanned by x^0,z^1,...,z^n. Clearly S = K ∩ {x: eQ^{−1}(x−x^0) ≤ 1}. Since x^0, z^i ∈ G (i=1,2,...,n) and the set G is convex, it follows that S ⊂ G, in other words, f(x) ≥ γ ∀x ∈ S. Then, if x ∈ D satisfies f(x) < γ, we must have x ∉ S; and since D is contained in the cone K, this implies eQ^{−1}(x−x^0) > 1. Therefore,

{x ∈ D: f(x) < γ} ⊂ {x ∈ D: eQ^{−1}(x−x^0) ≥ 1} ,

proving that the cut (3) is γ-valid for (f,D). •



Let U = (u^1,u^2,...,u^n) be the nonsingular n×n matrix with columns u^1,u^2,...,u^n. Then x ∈ K if and only if x = x^0 + Σ_{i=1}^n t_i u^i with t = (t_1,...,t_n)^T ≥ 0, i.e.,

K = {x: x = x^0 + Ut, t ≥ 0} .   (5)

Clearly, t = U^{−1}(x−x^0). Using (5), from Theorem III.1 we can derive the following form of a valid cut which in computations is more often used than the form (3).
Corollary III.1. For any θ = (θ_1,θ_2,...,θ_n) > 0 such that f(x^0 + θ_i u^i) ≥ γ, the linear inequality

Σ_{i=1}^n t_i/θ_i ≥ 1 ,   (6)

where t = U^{−1}(x−x^0), defines a γ-valid cut for (f,D).

Proof. If z^i = x^0 + θ_i u^i (i=1,2,...,n) and we denote Q = (z^1−x^0, z^2−x^0,...,z^n−x^0), then Q = U diag(θ_1,...,θ_n). Hence,

Q^{−1} = diag(1/θ_1,...,1/θ_n) U^{−1} ,

eQ^{−1}(x − x^0) = (1/θ_1,...,1/θ_n) U^{−1}(x − x^0) = (1/θ_1,...,1/θ_n)(t_1,...,t_n)^T = Σ_{i=1}^n t_i/θ_i ,

and so (3) is equivalent to (6). •

Definition III.2. A γ-valid cut which excludes a larger portion of the set D ∩ G = {x ∈ D: f(x) ≥ γ} than another one is said to dominate the latter, or to be stronger (deeper) than the latter.

It is straightforward to see that if θ'_i ≤ θ_i (∀i), then the cut Σ_{i=1}^n t_i/θ'_i ≥ 1 cannot dominate the cut (6). Therefore, the strongest γ-valid cut of type (6) corresponds to θ_i = α_i (i=1,2,...,n), where

α_i = sup {θ > 0: f(x^0 + θu^i) ≥ γ}  (i=1,2,...,n) .   (7)

Since this cut originated in concave programming, and since the set D*(γ) that it separates from x^0 is "concave", we shall refer to this strongest cut, i.e., to the cut

Σ_{i=1}^n t_i/α_i ≥ 1 ,

as a concavity cut.

Definition III.3. A cut of the form (6) with θ_i = α_i satisfying (7) is called a γ-valid concavity cut for (f,D), constructed at the point x^0.

This cut was first introduced by Tuy (1964).
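Numerically, each α_i in (7) is a one-dimensional problem along an edge. The sketch below is our own illustration, not the original construction: it assumes f concave with f(x^0) > γ, finds each α_i by bisection, and treats an edge as receding in G when it stays in G up to a large bound theta_max:

```python
import numpy as np

def concavity_cut(f, x0, U, gamma, theta_max=1e6, iters=60):
    """Concavity cut of Definition III.3 at a nondegenerate vertex x0.
    U has the edge directions u^1,...,u^n as columns; alpha_i from (7) is
    found by bisection on the concave function theta -> f(x0 + theta u^i).
    Returns alpha and pi = (1/alpha_1,...,1/alpha_n) U^{-1}, so the cut reads
    pi (x - x0) >= 1, i.e. sum_i t_i / alpha_i >= 1 for t = U^{-1}(x - x0)."""
    n = U.shape[1]
    alpha = np.empty(n)
    for i in range(n):
        if f(x0 + theta_max * U[:, i]) >= gamma:
            alpha[i] = np.inf            # edge recedes in G: pi_i = 0, cf. (8)
            continue
        lo, hi = 0.0, theta_max          # invariant: f(x0 + lo*u^i) >= gamma
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if f(x0 + mid * U[:, i]) >= gamma:
                lo = mid
            else:
                hi = mid
        alpha[i] = lo
    pi = (1.0 / alpha) @ np.linalg.inv(U)   # 1/inf = 0 handles alpha_i = +inf
    return alpha, pi

# Example: f(x) = -(x1^2 + x2^2), vertex x0 = 0 of the orthant, gamma = -1:
# G = {f >= -1} is the unit ball, alpha = (1, 1), cut x1 + x2 >= 1.
f = lambda x: -float(x @ x)
print(concavity_cut(f, np.zeros(2), np.eye(2), -1.0))
```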

Fig. III.1. A γ-valid cut

Note that some of the values α_i defined by (7) may be equal to +∞. This occurs when u^i is a recession direction of the convex set G. Setting 1/α_i = 0 when α_i = +∞, we see that if I = {i: α_i < +∞}, then the concavity cut is given by

Σ_{i∈I} t_i/α_i ≥ 1 .   (8)

Its normal vector is π = (π_1,π_2,...,π_n) with π_i = 1/α_i (i ∈ I), π_j = 0 (j ∉ I), i.e., in this case the hyperplane of the concavity cut is parallel to each direction u^j with j ∉ I (see Fig. III.2). Of course, the cut (8) will exist only if I is nonempty. Otherwise, the cone K will have all its edges contained in G, in which case K ⊂ G and there is no point x ∈ D with f(x) < γ (in this situation one can say that the cut is "infinite").

Fig. III.2. The case α_i = +∞

2. VALID CUTS IN THE DEGENERATE CASE

The construction of the cut (3) or (6) relies heavily on the assumption that x^0 is a nondegenerate vertex of D. When this assumption fails to hold, there are more than n edges of D emanating from x^0. Then the smallest cone K vertexed at x^0 and containing D has s > n edges. If, as before, we take a point z^i ≠ x^0 satisfying f(z^i) ≥ γ on each edge of K, then there may not exist any hyperplane passing through all these z^i.

Several methods can be used to deal with this case.

The first method, based on linear programming theory, is simply to avoid degeneracy. In fact, it is well known that by a slight perturbation of the vector b one can always make a given polyhedron

D = {x ∈ ℝ^n: Ax ≤ b}

nondegenerate (i.e., have only nondegenerate vertices). However, despite its conceptual simplicity, this perturbation method is rarely practical.

The second method, suggested by Balas (1971), is to replace D by a slightly larger polyhedron D' that admits x^0 as a nondegenerate vertex. In fact, since x^0 is a vertex, among the constraints that define D one can always find n linearly independent constraints that are binding at x^0. Let D' denote the polyhedron obtained from D by omitting all the other binding constraints for x^0. Then x^0 is a nondegenerate vertex of D', and a γ-valid cut for (f,D') can be constructed at x^0. Since D' ⊃ D, this will also be a γ-valid cut for (f,D).

In practical computations, the most convenient way to carry out this method is as follows. Using standard linear programming techniques, write the constraints defining D in the form

x_B = x_B^0 + U x_N ,   (9)

x_B ≥ 0, x_N ≥ 0 ,   (10)

where x_B = (x_i, i ∈ B) is the vector of basic variables corresponding to the basic solution x^0 = (x_B^0, 0), x_N = (x_j, j ∈ N), |N| = n, is the vector of the corresponding nonbasic variables, and U is a matrix given by the corresponding simplex tableau.
Setting x_N = t, we can represent the polyhedron D by the system in t:

x_B^0 + Ut ≥ 0, t ≥ 0 .   (11)

The vertex x^0 now corresponds to t^0 = 0 ∈ ℝ^n, while the concave function f(x) becomes a certain concave function φ(t) such that φ(0) = f(x^0) > γ (hence, the convex set {x: f(x) > γ} becomes the convex set {t: φ(t) > γ} in t-space). The polyhedron D is now contained in the orthant t ≥ 0, which is a cone vertexed at the origin t^0 = 0 and having exactly n edges (though some of these edges may not be edges of D, but rather lie outside D). Therefore, all the constructions discussed in the previous section apply. In particular, if e^i denotes the i-th unit vector in t-space, and if φ(θ_i e^i) ≥ γ, then a γ-valid concavity cut is given by the same inequality (6) of Corollary III.1. Note that when the vertex x^0 is nondegenerate (so that D' = D), the variables t_i (i=1,...,n) and the matrix U in (11) can be identified with the variables t_i (i=1,...,n) and the matrix U in (5).

The third method, proposed by Carvajal-Moreno (1972), is to determine a normal vector π of a γ-valid cut not as a solution of the system of equations (4) (as we did in the nondegenerate case), but, instead, as a basic solution of the system of inequalities

πu^i ≥ 1/α_i  (i=1,2,...,s) ,   (12)

where u^1,u^2,...,u^s are the directions of the edges of D emanating from x^0 (s > n), and α_i is defined by (7) (with the usual convention 1/α_i = 0 if α_i = +∞).

Lemma III.1. The system (12) is consistent and has at least one basic solution.

Proof. Since dim K = n and K has a vertex (i.e., the lineality of K is 0), the polar K° of −K + x^0 must have full dimension: dim K° = n, so that K° contains an interior point, i.e., a point t satisfying tu^i > 0 ∀i=1,2,...,s (cf. Rockafellar (1970), Corollary 14.6.1). Then the vector π = λt^T, with λ ∈ ℝ, λ > 0 sufficiently large, will satisfy the system (12). That is, the system (12) is consistent. Since obviously the origin cannot belong to the polyhedron described by (12), the latter contains no entire straight line. Hence, it has at least one extreme point (cf. Rockafellar (1970), Corollary 18.5.3), i.e., the system (12) has at least one basic solution. •

Proposition III.1. Any solution π of (12) provides a γ-valid cut for (f,D) by means of the inequality

π(x − x^0) ≥ 1 .

Proof. Denote M = {x ∈ K: π(x − x^0) ≤ 1}, where, we recall, K is the cone vertexed at x^0 with s edges of directions u^1,u^2,...,u^s. Since D ⊂ K, it suffices to show that

M ⊂ G = {x: f(x) ≥ γ} .   (13)

In fact, consider any point x ∈ M. Since x ∈ K, we have

x = x^0 + Σ_{i=1}^s λ_i u^i ,  λ_i ≥ 0 .

Furthermore, since π(x − x^0) ≤ 1, it follows that

1 ≥ Σ_{i=1}^s λ_i (πu^i) ≥ Σ_{i=1}^s λ_i/α_i .

Let us assume first that α_i < ∞ ∀i. Setting

μ = Σ_{i=1}^s λ_i/α_i  (≤ 1) ,

we can write

x = Σ_{i=1}^s (λ_i/α_i)(x^0 + α_i u^i) + (1−μ)x^0 ,   (14)

which means that x is a convex combination of the points x^0 and x^0 + α_i u^i (i=1,...,s). Since f(x^0) ≥ γ and f(x^0 + α_i u^i) ≥ γ, it follows that f(x) ≥ γ; hence we have proven (13) for the case where all of the α_i are finite.

In the case where some of the α_i may be +∞, the argument is similar, but instead of (14) we write

x = Σ_{i∈I} (λ_i/α_i)(x^0 + α_i u^i) + (1−μ) [x^0 + Σ_{i∉I} (λ_i/(1−μ)) u^i] ,

where I = {i: α_i < +∞}, μ = Σ_{i∈I} λ_i/α_i . •
If π is a basic solution of (12), then it satisfies n of the inequalities of (12) as equalities, i.e., the corresponding cutting hyperplane passes through n points z^i = x^0 + α_i u^i (with the understanding that if α_i = +∞ for some i, this means that the hyperplane is parallel to u^i).

A simple way to obtain a basic solution of the system (12) is to solve a linear program such as

minimize Σ_{i∈I} α_i πu^i  s.t.  πu^i ≥ 1/α_i  (i=1,...,s) .

When s = n (i.e., the vertex x^0 is nondegenerate), the system (12) has a unique basic solution, which corresponds exactly to the concavity cut.
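Such an LP can be set up directly; the following sketch (our own, assuming all α_i are finite so that I = {1,...,s}, and using scipy's dual simplex so that the returned optimum is a basic solution) computes the Carvajal-Moreno cut:

```python
import numpy as np
from scipy.optimize import linprog

def carvajal_moreno_cut(U, alpha):
    """Basic solution pi of system (12): pi u^i >= 1/alpha_i (i = 1,...,s),
    obtained from the LP  min sum_i alpha_i (pi u^i)  s.t. (12).
    U: n x s matrix whose columns are the edge directions u^i at the
    (possibly degenerate) vertex x0; alpha: values from (7), assumed finite.
    The cut is pi (x - x0) >= 1."""
    n = U.shape[0]
    c = U @ alpha                           # c . pi = sum_i alpha_i (pi u^i)
    # rewrite pi u^i >= 1/alpha_i as (-u^i) . pi <= -1/alpha_i for linprog
    res = linprog(c, A_ub=-U.T, b_ub=-1.0 / alpha,
                  bounds=[(None, None)] * n, method='highs-ds')   # simplex
    return res.x

# Example: degenerate vertex 0 with s = 3 edges in R^2
U = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
print(carvajal_moreno_cut(U, np.array([1.0, 1.0, 2.0])))   # pi = (1, 1)
```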

3. CONVERGENCE OF CUTTING PROCEDURES

The cuts which are used in the outer approximation method (Chapter II) are conjunctive in the following sense: at each step k=1,2,... of the procedure we construct a cut L_k = {x: l_k(x) ≤ 0} such that the set to be considered in the next step is

D_{k+1} = D_k ∩ L_k ⊃ D ,

where D is the feasible set. Since each L_k is a halfspace, it is clear that this method can be successful only if the feasible set D is convex.

By contrast, the valid cuts developed above may be used in disjunctive form.

To be specific, let us again consider the problem of minimizing a continuous function f over a polyhedron D in ℝ^n. If γ is the best feasible value of f(x) known up to a given stage, we would like to find a feasible point x with f(x) < γ or else to be sure that no such point exists (i.e., γ is the global optimal value). We construct a γ-valid cut L_0 = {x: L_0(x) ≥ 0} ⊃ D*(γ) and find an element x^1 ∈ D ∩ L_0. If we are lucky and f(x^1) < γ, then we are done. Otherwise, we try to construct a γ-valid cut L_1 = {x: L_1(x) ≥ 0} to exclude x^1. However, sometimes this may not be possible; for example, if the situation is as depicted in Fig. III.3, then there is no way to exclude x^1 by a single γ-valid cut. Therefore, in such cases we construct a system of cuts L_{1,1} = {x: L_{1,1}(x) ≥ 0},..., L_{1,N_1} = {x: L_{1,N_1}(x) ≥ 0} such that

x^1 ∉ L_1 := ∪_{j=1}^{N_1} L_{1,j} ⊃ D*(γ) ,

i.e., the union (disjunction) of all these cuts together forms a set L_1 which plays the role of a γ-valid cut for (f,D), namely it excludes x^1 without excluding any point of D*(γ).

Thus, a general cutting procedure works as follows. Given a target set Ω which is a subset of some set D (for instance Ω = D*(γ)), we start with a set L_0 ⊃ Ω. At step k = 1,2,..., we find a point x^k ∈ D ∩ L_{k−1}. If x^k happens to belong to Ω, then we are done. Otherwise, we construct a system of cuts L_{k,j} (j=1,...,N_k) such that

L_k = ∪_{j=1}^{N_k} L_{k,j} ⊃ Ω   (15)

and

D ∩ L_k ⊂ D ∩ L_{k−1} ,   (16)

and go to the next step.

Fig. III.3. Disjunctive cuts

Since the aim of a cutting procedure is eventually to obtain an element of Ω, the procedure will be successful either if x^k ∈ Ω at some step k (in which case we say that the procedure is finite or finitely convergent), or if the sequence {x^k} is bounded and any cluster point of it belongs to Ω (in which case we say that the procedure is convergent, or infinitely convergent).

Note that, by (16), D ∩ L_k ⊂ D ∩ L_h for all h ≤ k, and therefore

x^k ∈ ∩_{h<k} L_h .   (17)

Many convergence proofs for cutting plane procedures are based on the following simple but useful proposition.
Lemma III.2 (Bounded Convergence Principle). Let L_k, k = 1,2,..., be a sequence of arbitrary sets in ℝ^n. If {x^k} is a bounded sequence of points satisfying

x^k ∈ ∩_{h<k} L_h ,   (18)

then

d(x^k, L_k) → 0  (k → ∞)

(where d is the distance function in ℝ^n). The same conclusion holds if, instead of (18), we have

x^k ∈ ∩_{h>k} L_h .   (19)

Proof. Suppose that (18) holds, but d(x^k,L_k) does not tend to 0. Then there exists a positive number ε and an infinite sequence {k_ν} ⊂ {1,2,...} satisfying d(x^{k_ν}, L_{k_ν}) ≥ ε for all ν. Since {x^k} is bounded, we may assume, by passing to subsequences if necessary, that x^{k_ν} converges, so that ||x^{k_μ} − x^{k_ν}|| < ε for all μ, ν sufficiently large. But, by (18), for all μ > ν we have x^{k_μ} ∈ L_{k_ν}, and hence

||x^{k_μ} − x^{k_ν}|| ≥ d(x^{k_ν}, L_{k_ν}) ≥ ε .

This contradiction proves the lemma.

The case (19) is analogous, except that for μ > ν we have x^{k_ν} ∈ L_{k_μ}, and hence

||x^{k_μ} − x^{k_ν}|| ≥ d(x^{k_μ}, L_{k_μ}) ≥ ε . •

Now consider a cutting procedure in which in (15) N_k = 1 (k=1,2,...), so that each L_k consists of a single cut:

L_k = {x: l_k(x) ≥ 0} .

Then

l_k(x^k) < 0 ,   (20)

while

l_h(x^k) ≥ 0  (h = 0,1,...,k−1) .   (21)

Assume that the sequence {x^k} is bounded.

Theorem III.2. Let H_k = {x: l_k(x) = 0}. If there exists an ε > 0 such that d(x^k,H_k) ≥ ε for all k, then the procedure is finite.

Proof. It suffices to apply Lemma III.2 to {x^k} and L_k = {x: l_k(x) ≥ 0}. In view of (21) we have (18). But from (20), d(x^k,L_k) = d(x^k,H_k), and since d(x^k,H_k) ≥ ε for all k, it follows from Lemma III.2 that the procedure must be finite. •

From this theorem we can derive the following results (see Bulatov (1977)).

Corollary III.2. Suppose that every cut is of the form

π^k(x − x^k) ≥ 1 .

If the sequence {π^k} is bounded, then the procedure is finite.

Proof. Clearly, d(x^k,H_k) = ||π^k||^{−1}. Therefore, if ||π^k|| ≤ c for some c ∈ ℝ, c > 0, then d(x^k,H_k) ≥ 1/c, and the conclusion follows from Theorem III.2. •

Corollary III.3. Suppose that every cut is of the form (3), i.e.,

eQ_k^{−1}(x − x^k) ≥ 1 .

If the sequence ||Q_k^{−1}||, k = 1,2,..., is bounded, then the procedure is finite.

Proof. This follows from Corollary III.2 for π^k = eQ_k^{−1}. •

Corollary III.4. Suppose that every cut is of the form (6), i.e.,

Σ_{i=1}^n t_i/θ_i^k ≥ 1 ,

where t ∈ ℝ^n is related to x ∈ ℝ^n by

t = U_k^{−1}(x − x^k) .

If the sequence ||U_k^{−1}|| is bounded and there exists θ > 0 such that θ_i^k ≥ θ for all i = 1,2,...,n and all k, then the procedure is finite.

Proof. As seen from the proof of Corollary III.1, we have

π^k = (1/θ_1^k,...,1/θ_n^k) U_k^{−1} ,

from which the result follows. •



The conditions indicated in Theorem III.2 and its corollaries are not easy to enforce in practical implementations. Nevertheless, in Part B we shall see how they can sometimes be used to establish the finiteness or convergence of a given cutting procedure.

4. CONCAVITY CUTS FOR HANDLING REVERSE CONVEX CONSTRAINTS

Cuts of the type (3) or (6), which were originally devised for concave programming, have proved to be useful in some other problems as well, especially in integer programming (see Glover (1972, 1973 and 1973a), Balas (1971, 1972, 1975, 1975a and 1979), Young (1971)). In fact, Corollary III.1 can be restated in the following form (see Glover (1972)), which may be more convenient for use in certain questions not directly related to the minimization of a concave function f(x).

Proposition III.2. Let G be a convex set in ℝ^n whose interior contains a point x^0 but does not contain any point of a given set S. Let U be a nonsingular n×n matrix with columns u^1,u^2,...,u^n. Then for any constants θ_i > 0 such that x^0 + θ_i u^i ∈ G (i=1,...,n), the cut

Σ_{i=1}^n t_i/θ_i ≥ 1 ,

where t = U^{−1}(x−x^0), excludes x^0 without excluding any point x ∈ S satisfying U^{−1}(x−x^0) ≥ 0.

Proof. Clearly K = {x: U^{−1}(x−x^0) ≥ 0} is nothing but the cone generated by the n halflines emanating from x^0 in the directions u^1,u^2,...,u^n. It can easily be verified that the argument used to prove Theorem III.1 and its corollary carries over to the case when G is an arbitrary convex set with x^0 ∈ int G, and the set D*(γ) is replaced by S ∩ K, where S has no common point with int G.

The proposition can also be derived directly from Corollary III.1 by setting D = K, f(x) = −p(x−x^0), γ = −1, where p(y) = inf {λ > 0: x^0 + y/λ ∈ G} is the gauge of G relative to x^0, so that G = {x: p(x−x^0) ≤ 1}. •

To emphasize the role of the convex set G in the construction of the cut, Glover (1972) has called this cut a convexity cut. In addition, the term "intersection cut" has been used by Balas (1970a), referring to the fact that the cut is constructed by taking the intersection of the edges of the cone K with the boundary of G.

In the sequel, we shall refer to the cut obtained in Proposition III.2 as a concavity cut relative to (G,K). Such a cut is valid for K\G, in the sense that it does not exclude any point of this set.


In Section III.1 concavity cuts were introduced as a tool for handling nonconvex objective functions. We now show how these cuts can also be used to handle nonconvex constraints.

As a typical example let us consider a problem

(P)  minimize f(x)  s.t.  x ∈ D, g(x) ≥ 0

such that by omitting the last constraint g(x) ≥ 0 we obtain a relatively easy problem

(Q)  minimize f(x)  s.t.  x ∈ D .

This is the case, for example, if D is a convex set, while both functions f(x) and g(x) are convex. Setting

G = {x: g(x) ≤ 0} ,

we can rewrite the last constraint of (P) as

x ∉ int G .   (22)

In order to solve the problem (P) by the outer approximation method discussed in Chapter II, we can proceed as follows.

Start with D_0 = D.

At iteration k=0,1,2,... find an optimal solution x^k of the relaxed problem

(Q_k)  minimize f(x)  s.t.  x ∈ D_k .

If x^k ∉ int G, then stop (x^k solves (P)). Otherwise, construct a cut l_k(x) ≤ 0 to exclude x^k without excluding any feasible point of (P). Form the new relaxed problem (Q_{k+1}) with constraint set

D_{k+1} = D_k ∩ {x: l_k(x) ≤ 0}

and go to iteration k+1.

Obviously, a crucial point in this approach is the construction of the cuts. If x^0 denotes an optimal solution of (Q_0), then we can assume that x^0 ∈ int G (otherwise x^0 would solve (P)). Furthermore, in most cases we can find a cone K vertexed at x^0 having exactly n edges (of directions u^1,u^2,...,u^n) and such that D ⊂ K. Then all the conditions of Proposition III.2 are fulfilled (with S = D \ int G), and a cut of type (6) can be constructed that excludes x^0 without excluding any feasible point of (P). The cut at any iteration k=1,2,... can be constructed in a similar way.
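The scheme can be written as a short skeleton in which all problem-specific pieces are callbacks (all names here are hypothetical; make_cut would, e.g., implement a concavity cut of type (6) relative to (G,K)):

```python
def reverse_convex_oa(solve_relaxed, in_int_G, make_cut, D0, max_iter=1000):
    """Skeleton of the outer approximation scheme for (P) with the reverse
    convex constraint x not in int G:
      solve_relaxed(Dk) -> optimal solution x^k of the relaxed problem (Q_k),
      in_int_G(x)       -> True iff x lies in int G,
      make_cut(x, Dk)   -> a constraint (a, b) with a x + b <= 0 excluding x
                           but no feasible point of (P).
    Dk is a list of such constraint pairs."""
    Dk = list(D0)
    for _ in range(max_iter):
        xk = solve_relaxed(Dk)
        if not in_int_G(xk):
            return xk                   # x^k is feasible for (P), hence optimal
        Dk.append(make_cut(xk, Dk))     # D_{k+1} = D_k cut by l_k(x) <= 0
    raise RuntimeError("iteration limit reached")
```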

There are cases where a cone K with the above-mentioned properties does not exist, or is not efficient because the cut constructed using this cone would be too shallow (see, for example, the situation with x^1 depicted in Fig. III.3). However, in these cases it is always possible to find a number of cones K_{0,1}, K_{0,2},...,K_{0,N_0} covering D such that the concavity cuts L_{0,i}(x) ≥ 0 relative to (G, K_{0,i}) (i=1,...,N_0) form a disjunctive system which does not exclude any point of D \ int G. (We shall see in Part B how these cones and the corresponding cuts can be constructed effectively.)

Thus, concavity cuts offer a tool for handling reverse convex constraints such as (22).
In integer programming these cuts are useful, too, because an integrality condition like x_j ∈ {0,1} (j ∈ J) can be rewritten as x_j^2 ≥ x_j, 0 ≤ x_j ≤ 1 (j ∈ J); and hence it implies Σ_{j∈J} h_j(x_j^2 − x_j) ≥ 0 for arbitrary nonnegative numbers h_j (j ∈ J). Note that the convex set

G = {x: Σ_{j∈J} h_j(x_j^2 − x_j) ≤ 0}

does not contain in its interior any point x satisfying x_j ∈ {0,1} (j ∈ J). Therefore, if 0 ≤ x_j^0 ≤ 1 for all j ∈ J and 0 < x_j^0 < 1 for some j ∈ J with h_j > 0, then x^0 ∈ int G, and a concavity cut can be constructed to exclude x^0 without excluding any x satisfying x_j ∈ {0,1} (j ∈ J).
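For this particular G, the quantities θ_i of Proposition III.2 are available in closed form, because membership of x^0 + θd in G is a quadratic inequality in θ. The following sketch (our own illustration with hypothetical data) computes such a cut; note that it is valid only on the cone K spanned at x^0:

```python
import numpy as np

def quad_step(h, x0, d):
    """Largest theta >= 0 with x0 + theta*d in
    G = {x: sum_j h_j (x_j^2 - x_j) <= 0}; the membership condition is a
    quadratic in theta, solved in closed form.  Returns np.inf if the whole
    ray stays in G."""
    a = np.sum(h * d * d)
    b = np.sum(h * (2.0 * x0 - 1.0) * d)
    c = np.sum(h * (x0 * x0 - x0))           # c < 0 when x0 lies in int G
    if a <= 0.0:
        return np.inf if b <= 0.0 else -c / b
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

def integrality_cut(h, x0, U):
    """Cut of Proposition III.2 excluding the fractional point x0: theta_i
    along each column u^i of U, cut pi (x - x0) >= 1 with
    pi = (1/theta) U^{-1}; valid only on K = {x: U^{-1}(x - x0) >= 0}."""
    theta = np.array([quad_step(h, x0, U[:, i]) for i in range(U.shape[1])])
    return theta, (1.0 / theta) @ np.linalg.inv(U)

# Example: x0 = (0.5, 0.5), h = (1, 1), K = x0 + nonnegative orthant:
# theta_i = 1/sqrt(2), cut x1 + x2 >= 1 + 1/sqrt(2); the only 0-1 point
# in K, namely (1, 1), is not excluded.
print(integrality_cut(np.ones(2), np.full(2, 0.5), np.eye(2)))
```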

The reader interested in a more detailed discussion of suitable choices of the h_j or, more generally, of the use of concavity cuts in integer programming is referred to the cited papers of Glover, Balas, and Young. For a detailed treatment of disjunctive cuts, see Jeroslow (1976 and 1977) and also Sherali and Shetty (1980).

5. A CLASS OF GENERALIZED CONCAVITY CUTS

The cuts (6) (or (3)) do not exhaust the class of γ-valid cuts for (f,D). For example, the cut (II) in Fig. III.2, which is stronger than the concavity cut (I), cannot be obtained by Theorem III.1 or its corollary (to see this, it suffices to observe that, by definition, the concavity cut is the strongest of all cuts obtained by Corollary III.1; moreover, a cut obtained by Corollary III.1 never meets an edge of K in its negative extension, as does the cut (II)).

In this section we describe a general class of cuts that usually includes stronger valid cuts than the concavity cut.

Definition III.4. Let G be a convex subset of ℝ^n with an interior point x^0, and let K be a full dimensional polyhedral cone with vertex at x^0. A cut that does not exclude any point of K\G is called a (G,K)-cut (cf. Fig. III.4).

When G = {x: f(x) ≥ γ} with f(x) a concave function, we recover the concept of a γ-valid cut.
Assume that the cone K has s edges (s ≥ n) and that for each i ∈ I ⊂ {1,2,...,s} the i-th edge meets the boundary of G at a point z^i ≠ x^0, while for i ∉ I the i-th edge is entirely contained in G, i.e., its direction belongs to the recession cone R(G) of G (cf. Rockafellar (1970)). Denote by u^i the direction of the i-th edge and define

α_i = max {α ≥ 0: x^0 + αu^i ∈ G}  (i ∈ I) ,   (23)

β_ij = min {β ≥ 0: α_i u^i + βu^j ∈ R(G)}  (i ∈ I, j ∉ I) .   (24)

Clearly, α_i > 0 because z^i ≠ x^0, β_ij > 0 (otherwise u^i would belong to R(G), because the latter cone is convex and closed), and β_ij may be +∞ for certain (i,j), which means that α_i u^i + βu^j ∉ R(G) for any β ≥ 0.

Since u^j ∈ R(G) for j ∉ I, it follows from (24) that for 0 < α ≤ α_i and β > 0 we have

αu^i + βu^j ∈ R(G) if and only if α_i β ≥ α β_ij .   (25)

Fig. III.4. (G,K)-cut

Theorem III.3. The inequality

π(x − x^0) ≥ 1   (26)

defines a (G,K)-cut if and only if π satisfies the following relations:

α_i πu^i ≥ 1  (i ∈ I) ,   (27)

π(α_i u^i + β_ij u^j) ≥ 0  (i ∈ I, j ∉ I) .   (28)

(It is understood that if β_ij = +∞, then the condition π(α_i u^i + β_ij u^j) ≥ 0, i.e., π((α_i/β_ij) u^i + u^j) ≥ 0, means πu^j ≥ 0.)

Proof. First observe that the inequality (26) defines a valid cut for K \ G if and only if the polyhedron

M = {x ∈ K: π(x − x^0) ≤ 1}

is contained in G. Because of the convexity of all the sets involved, the latter condition in turn is equivalent to requiring that

(i) each vertex of M belongs to G,

while

(ii) all extreme directions of M belong to the cone R(G).

(i): Let I^+ = {i: πu^i > 0}. We must have I ⊂ I^+, since if πu^i ≤ 0 for some i ∈ I, then for θ > 0 sufficiently large we would have x = x^0 + θu^i ∈ K \ G, whereas π(x−x^0) = θπu^i ≤ 0 < 1.

It is easily seen that the extreme points of M consist of x^0 and the points where the edges of K intersect the hyperplane π(x−x^0) = 1, i.e., the points

x^0 + θ_i u^i ,  θ_i = 1/(πu^i)  (i ∈ I^+) .   (29)
According to (23), for i ∈ I these points belong to G if and only if θ_i ≤ α_i, i.e., if and only if (27) holds. On the other hand, for i ∉ I, since u^i is a direction of recession for G, we have x^0 + θ_i u^i ∈ G for all θ_i > 0.

(ii): Let us now show that any extreme direction of M is either a vector u^j (j ∉ I^+) or a vector of the form

θ_i u^i + ξ_j u^j  (i ∈ I^+, j ∈ C) ,   (30)

where θ_i is defined by (29), C = {j: πu^j < 0}, and

ξ_j = −1/(πu^j)  (j ∈ C) .   (31)

To see this, consider the cone K_0 = K − x^0. Let tx = 1 be a hyperplane cutting all edges of K_0 (for example, take a hyperplane strictly separating the origin from the convex hull of u^1,u^2,...,u^s). Since the recession cone R(M) of M is

R(M) = {x ∈ K_0: πx ≤ 0} ,

the extreme directions of M are given by the vertices of the polytope

P = {x ∈ R(M): tx = 1} = {x ∈ K_0: tx = 1, πx ≤ 0} = S ∩ {x: πx ≤ 0} ,

with S = {x ∈ K_0: tx = 1}. But the vertices of S are obviously v^i = λ_i u^i (i=1,2,...,s) with λ_i = 1/(tu^i) > 0. Therefore, a vertex of P must be either a v^j such that πv^j ≤ 0 (i.e., j ∉ I^+) or the intersection of the hyperplane πx = 0 with an edge of S, i.e., with a line segment joining two vertices v^i, v^j such that πv^i > 0 (i.e., πu^i > 0) and πv^j < 0 (i.e., πu^j < 0). In the latter case, let u^{ij} denote the intersection of the line segment [v^i,v^j] with the hyperplane πx = 0. Since the hyperplane πx = 1 meets the halfline from 0 through v^i at θ_i u^i and the halfline from 0 through v^j at −ξ_j u^j (Fig. III.5), it is clear that u^{ij} is parallel to θ_i u^i + ξ_j u^j.
Fig. III.5

We have thus proved that the extreme directions of M consist of the vectors u^j (j ∉ I^+) and a set of vectors of the form (30).

Since j ∉ I^+ implies j ∉ I, any vector u^j (j ∉ I^+) belongs to R(G). Moreover, according to (25), for i ∈ I, j ∈ C (which implies j ∉ I) a vector of the form (30) belongs to the recession cone of G if and only if

α_i ξ_j ≥ θ_i β_ij ,

and hence, from (29) and (31), if and only if (28) holds.

The proof of Theorem III.3 is completed by noting that for i ∈ I, j ∉ I ∪ C we have πu^i > 0, πu^j ≥ 0, so that (28) trivially holds (by convention the inequality π(α_i u^i + β_ij u^j) ≥ 0 means πu^j ≥ 0 if β_ij = +∞), while for i ∈ I^+ \ I, j ∈ C we have u^i, u^j ∈ R(G) (because i ∉ I, j ∉ I), so that θ_i u^i + ξ_j u^j ∈ R(G) for all θ_i > 0, ξ_j > 0. •

6. CUTS USING NEGATIVE EDGE EXTENSIONS

Theorem III.3 yields a very general class of cuts, since the inequalities (27) and (28) that characterize this class are linear in π and can be satisfied in many different ways.

The cut (6) in the nondegenerate case (s = n), the Carvajal-Moreno cut for the degenerate case (Proposition III.1) and the cut obtained by Proposition III.2 are special variants of this class, which share the common feature that their coefficient vectors π satisfy the inequalities πu^i ≥ 0 (i=1,2,...,s). Though these cuts are easy to construct, they are not always the most efficient ones. In fact, Glover (1973 and 1974) showed that if G is a polyhedron and some edges of K (say, the edges of direction u^j, j ∈ J) strictly recede from all boundary hyperplanes of G, then stronger cuts than the concavity cut can be constructed which satisfy πu^j < 0 for j ∈ J (thus, these cuts use negative extensions of the edges rather than the usual positive extensions: an example is furnished by the cut (II) in Fig. III.2, already mentioned in the previous section).

We close this chapter by showing that these cuts can also be derived from Theorem III.3.

As before, let I denote the index set of those edges of K which meet the boundary of G. Assume that I ≠ ∅ (otherwise, K \ G = ∅).

Theorem III.4. The system

α_i πu^i ≥ 1  (i ∈ I) ,   (32)

(max_{i∈I} β_ij) πu^j ≥ −1  (j ∉ I) ,   (33)

where α_i and β_ij are defined by (23), (24), has at least one basic solution, and any basic solution is a (G,K)-cut.
Proof. Since (12) implies (32) and (33), the consistency of the system (12) implies that of the system (32), (33). Obviously, any solution of the latter satisfies the conditions (27), (28). Hence, by Theorem III.3, it generates a (G,K)-cut. It remains to show that the polyhedron (32), (33) has at least one extreme point. But since I ≠ ∅, the origin 0 does not satisfy (32); therefore this polyhedron cannot contain any line and must have at least one extreme point, i.e., the system (32), (33) must have a basic solution. •

The following remark provides a cut of the type (32), (33).

Remark III.1. To obtain a cut of the type indicated in Theorem III.4, it suffices to solve, for example, the linear program

minimize Σ_{i∈I} α_i πu^i  subject to (32) and (33) .

This cut is obviously stronger than the Carvajal-Moreno cut obtained by minimizing the same linear function over the smaller polyhedron (12). In particular, if s = n (nondegenerate case), then this cut is determined by the system of equations

α_i πu^i = 1  (i ∈ I) ,

(max_{i∈I} β_ij) πu^j = −1  (j ∉ I) .

From the latter equation it follows that πu^j < 0 for j ∉ I, provided that max_{i∈I} β_ij < +∞.

When G is a polyhedron, the recession cone R(G) of G is itself polyhedral and readily available from the constraints determining G. In this case, the values β_ij can easily be computed from (24), and we necessarily have max_{i∈I} β_ij < +∞ for every j such that u^j is strictly interior to R(G).
CHAPTER IV

BRANCH AND BOUND

A widely used method to solve various kinds of difficult optimization problems is

called branch and bound. In this technique, the feasible set is relaxed and

subsequently split into parts (branching) over which lower (and often also upper)

bounds of the objective function value can be determined (bounding).

In this chapter, branch and bound is developed with regard to the specific needs of global optimization. First, a general prototype method is presented which includes all the specific approaches that will be discussed in subsequent chapters. A general convergence theory is developed and applied to concave minimization, d.c. programming and Lipschitzian optimization. The basic operations of branching and bounding are discussed in detail.

1. A PROTOTYPE BRANCH AND BOUND METHOD

Consider again the global optimization problem

(P)  minimize f(x)   (1)
     s.t. x ∈ D ,

where f: A → ℝ and D ⊂ A ⊂ ℝ^n. The set A is assumed to contain all of the subsets that will be used below.


For the moment we only assume that min f(D) exists. Further assumptions will be needed later.

Notice that mere smoothness of the functions involved in problem (P) is not the determining factor in the development of a practical global optimization algorithm, i.e., a finite method that guarantees to find a point x* ∈ D such that f(x*) differs from f* = min f(D) by no more than a specified accuracy. It is easy to see that for D robust, from finitely many function values and derivatives and the information that f ∈ C^k (k ∈ ℕ_0 ∪ {∞} known) one cannot compute a lower bound for f*. The reason is simply that one can find a point y* ∈ D and an ε-neighbourhood U of y* such that no point in U has been considered. Then it is well known that one can modify f on U such that f(y*) takes on any desired value, the modification is not detectable from the above local information from outside U, and the modified function is still C^k on D.
Any method not susceptible to this argument must make global assumptions (or somehow generate global information, as, for example, interval methods do) which allow one to compute suitable lower bounds for min f(M) at least for some sets of sufficiently simple structure. When such global information is available, as, for example, in the case of Lipschitz functions or concave functions over polytopes with known vertex sets, a "branch and bound" method (abbreviated BB) can often be constructed. In the last three decades, along with general BB concepts, abundant branch and bound methods have been proposed for many classes of global optimization problems. Many of these will be treated and sometimes combined with other methods in subsequent parts of this book. An extensive list of references can also be found in Chapters 2, 3, 4, 7, 8, 10 and 13 of the Handbook Horst and Pardalos (1995).

The following presentation is mainly based on Horst (1986 and 1988) and Tuy and Horst (1988).

The idea of BB methods in global optimization is rather simple:

- Start with a relaxed feasible set M_0 ⊃ D and split (partition) M_0 into finitely many subsets M_i, i ∈ I.

- For each subset M_i determine lower and (if possible) upper bounds β(M_i), α(M_i), respectively, satisfying

  β(M_i) ≤ inf f(M_i ∩ D) ≤ α(M_i) .

  Then β := min_{i∈I} β(M_i), α := min_{i∈I} α(M_i) are "overall" bounds, i.e., we have

  β ≤ min f(D) ≤ α .

- If α = β (or α − β ≤ ε, for prescribed ε > 0), then stop.

- Otherwise, select some of the subsets M_i and partition these chosen subsets in order to obtain a refined partition of M_0. Determine new, hopefully better bounds on the new partition elements, and repeat the process in this way.

An advantage of BB methods is that during the iteration process one can usually delete certain subsets of D, since one knows that min f(D) cannot be attained there. A typical disadvantage is that, as a rule, the accuracy of an approximate solution can only be measured by the difference α − β of the current bounds. Hence a "good" feasible point found early may be detected as "good" only much later, after many further refinements.

Definition IV.1. Let B be a subset of ℝ^n and I be a finite set of indices. A set {M_i: i ∈ I} of subsets of B is said to be a partition of B if

B = ∪_{i∈I} M_i  and  M_i ∩ M_j = ∂M_i ∩ ∂M_j  ∀i,j ∈ I, i ≠ j ,

where ∂M_i denotes the (relative) boundary of M_i.


Let M denote an element of a current partition used in a BB procedure as indicated above. For M_0 and all partition sets M, it is natural to use very simple polytopes or convex polyhedral sets, such as simplices, rectangles and polyhedral cones. In this context, a polytope M is often given by its vertex set, a polyhedral cone by its generating vectors.

A generalization of partitions to so-called covers is discussed in Horst, Thoai and de Vries (1992 and 1992a).

An important question arising from BB procedures will be that of properly deleting partition sets satisfying M ∩ D = ∅. Clearly, for many feasible sets a decision on whether we have M ∩ D = ∅ or M ∩ D ≠ ∅ will be difficult based on the information at hand (cf. Section IV.5).

Definition IV.2. A partition set M satisfying M ∩ D = ∅ is called infeasible; a partition set M satisfying M ∩ D ≠ ∅ is called feasible. A partition set M is said to be uncertain if we do not know whether M is feasible or infeasible.

In the following description of a general BB method we adopt the usual conven-


tion that infima and minima taken over an empty set equal +(1).

Prototype BB procedure:

Step 0 (Initialization):

Choose M_0 ⊇ D, S_{M_0} ⊂ D, −∞ < β_0 ≤ min f(D).

Set ℳ_0 = {M_0}, α_0 = min f(S_{M_0}), β(M_0) = β_0.

If α_0 < ∞, then choose x^0 ∈ argmin f(S_{M_0}) (i.e., f(x^0) = α_0).

If α_0 − β_0 = 0, then stop: α_0 = β_0 = min f(D), x^0 is an optimal solution.

Otherwise, go to Step 1.

Step k (k = 1,2,...):

At the beginning of Step k we have the current partition ℳ_{k−1} of a subset of M_0 still of interest. Furthermore, for every M ∈ ℳ_{k−1} we have S_M ⊂ M ∩ D and bounds β(M), α(M) satisfying

β(M) ≤ inf f(M ∩ D) ≤ α(M) if M is known to be feasible,   (2)
β(M) ≤ inf f(M) if M is uncertain.

Moreover, we have the current lower and upper bounds β_{k−1}, α_{k−1} satisfying

β_{k−1} ≤ min f(D) ≤ α_{k−1} .

Finally, if α_{k−1} < ∞, then we have a point x^{k−1} ∈ D satisfying f(x^{k−1}) = α_{k−1} (the best feasible point obtained so far).

k.1. Delete all M ∈ ℳ_{k−1} satisfying

β(M) ≥ α_{k−1} .

Let ℛ_k be the collection of the remaining sets in the partition ℳ_{k−1}.

k.2. Select a nonempty collection of sets 𝒫_k ⊂ ℛ_k and construct a partition of every member of 𝒫_k. Let 𝒩_k be the collection of all new partition elements.

k.3. Delete each M ∈ 𝒩_k for which it is known that M ∩ D = ∅ or where it is otherwise known that min f(D) cannot occur. Let ℳ_k' be the collection of all remaining members of 𝒩_k.

k.4. Assign to each M ∈ ℳ_k' a set S_M and a quantity β(M) satisfying

β(M) ≤ inf f(M ∩ D) if M is known to be feasible,
β(M) ≤ inf f(M) if M is uncertain.

Furthermore, it is required that we have

S_M ⊇ M ∩ S_{M'} , β(M) ≥ β(M') whenever M ⊂ M' ∈ ℳ_{k−1} .

Set α(M) = min f(S_M).

k.5. Set ℳ_k = (ℛ_k \ 𝒫_k) ∪ ℳ_k'. Compute

α_k = inf {α(M): M ∈ ℳ_k} and β_k = min {β(M): M ∈ ℳ_k} .

If α_k < ∞, then let x^k ∈ D such that f(x^k) = α_k.

k.6. If α_k − β_k = 0, then stop: α_k = β_k = min f(D), x^k is an optimal solution.

Otherwise, go to Step k+1.

Fig. IV.1. A BB step



Remarks IV.1. (i) We say that a partition element M ∈ ℳ_k is fathomed if β(M) ≥ α_{k−1}. Such a partition element will be deleted in Step k. The stopping criterion α_k = β_k means that all partition elements are fathomed.

(ii) Step k is to be performed only if there remain unfathomed partition elements, i.e., if ℛ_k ≠ ∅. In general, the partition operation is defined only on a certain family ℐ of subsets of ℝⁿ (e.g., simplices, or rectangles, or cones of a certain type). In order that Step k can actually be carried out, one must require that ℛ_k ⊂ ℐ, i.e., every unfathomed partition element should be capable of further refinement.

(iii) In Step k.4 one can obviously replace any M ∈ ℳ_k' by a smaller set M' ⊂ M such that M' ∈ ℐ and M' ∩ D = M ∩ D.

(iv) For each partition set M, S_M ⊂ M ∩ D is the collection of feasible points in M that can be found by reasonable effort when a BB algorithm is performed. We assume that min f(S_M) exists or S_M = ∅. In the applications, the sets S_M are usually finite. However, the above procedure is also defined in the case where many or even all sets S_M are empty, and we may even have α_k = ∞ for all k. We will return to this point later, but we first treat the case where "enough" feasible points are available. Then the conditions imposed on S_M and β(M) ensure that {α_k} = {f(x^k)} is a nonincreasing sequence, {β_k} is a nondecreasing sequence, and α_k ≥ min f(D) ≥ β_k. Thus, the difference α_k − β_k measures the proximity of the current best solution x^k to the optimum. For a given tolerance ε > 0, the algorithm can be stopped as soon as α_k − β_k ≤ ε.

Since {α_k} and {β_k} are monotonically nonincreasing and nondecreasing, respectively, the limits α = lim_{k→∞} α_k and β = lim_{k→∞} β_k necessarily exist, and, by construction, they satisfy β ≤ min f(D) ≤ α. The algorithm is said to be finite if α_k = β_k occurs at some step k, while it is convergent if α_k − β_k → 0, i.e.,
α = lim_{k→∞} f(x^k) = β = min f(D) .

It is clear that the concrete choice of the following three basic operations is crucially important for the convergence and efficiency of the prototype branch and bound procedure:

Bounding (how to determine α(M), β(M)?),

Selection (how to choose 𝒫_k?) and

Refining (how to construct the partitions?).
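To fix ideas, the following minimal Python sketch (our own illustration, not part of the original text) shows the control flow of the prototype procedure; the callbacks bound and split are hypothetical placeholders for the bounding and refining operations just listed, and the selection rule used is the bound improving one discussed below (refine a set attaining the smallest lower bound).

import numpy as np  # not strictly needed here; callbacks may use it

# Minimal sketch of the prototype BB procedure.  Assumed callbacks:
#   bound(M) -> (beta, S): a lower bound beta(M) and a finite set S of
#                          feasible points found in M (S may be empty),
#   split(M) -> [M1, ...]: a partition of M (e.g., bisection).
def branch_and_bound(f, M0, bound, split, eps=1e-6, max_iter=10000):
    beta0, S0 = bound(M0)
    best = min(S0, key=f) if S0 else None            # incumbent x^0
    alpha = f(best) if best is not None else float("inf")
    active = [(beta0, M0)]                           # current partition sets
    for _ in range(max_iter):
        # Step k.1: delete fathomed sets, i.e., beta(M) >= alpha - eps
        active = [(b, M) for (b, M) in active if b < alpha - eps]
        if not active:                               # all sets fathomed:
            break                                    # alpha - beta <= eps
        # Step k.2: bound-improving selection: split a set with smallest beta
        active.sort(key=lambda t: t[0])
        _, M = active.pop(0)
        for child in split(M):
            b, S = bound(child)                      # Step k.4
            for x in S:                              # update incumbent
                if f(x) < alpha:
                    alpha, best = f(x), x
            active.append((b, child))
    beta = min((b for b, _ in active), default=alpha)
    return best, beta, alpha                         # x^k, beta_k, alpha_k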

Example IV.1. Consider the concave minimization problem

minimize f(x_1, x_2) := −(x_1 − 20)² − (x_2 − 10)²
s.t. x_2 − ½ x_1 ≤ 10 ,
x_1² + (x_2 − 10)² ≤ 500 ,
x_1 ≥ 0 , x_2 ≥ 0 .

The procedure that follows is one of several possibilities for solving this problem by a BB algorithm.

Step 0: M_0 is chosen to be the simplex conv {(0,0), (0,40), (40,0)} having vertices (0,0), (0,40), (40,0).

To obtain lower bounds, we construct the affine function φ(x_1,x_2) = a_1x_1 + a_2x_2 + a_3 that coincides with f at the vertices of M_0. Solving the corresponding system of linear equations

a_3 = −500 , 40a_2 + a_3 = −1300 , 40a_1 + a_3 = −500

yields φ(x_1,x_2) = −20x_2 − 500. Because of the concavity of f(x_1,x_2), φ(x_1,x_2) is underestimating f(x_1,x_2), i.e., we have φ(x_1,x_2) ≤ f(x_1,x_2) ∀(x_1,x_2) ∈ M_0. A lower bound β_0 can be found by solving the convex optimization problem (with linear objective function)

minimize φ(x_1,x_2) s.t. (x_1,x_2) ∈ M_0 ∩ D .

We obtain β_0 = −900 (attained at (20,20)).

The two feasible points (0,0) and (20,20) are at hand, and we set

S_{M_0} = {(0,0), (20,20)} , α_0 = min f(S_{M_0}) = f(0,0) = −500 and x^0 = (0,0) .

Step 1: We partition M_0 into the two simplices M_{1,1} = conv {(0,0), (0,40), (20,20)} and M_{1,2} = conv {(0,0), (20,20), (40,0)}.

As in Step 0, we construct lower bounds β(M_{1,1}) and β(M_{1,2}) by minimizing over M_{1,1} ∩ D the affine function φ_{1,1} that coincides with f at the vertices of M_{1,1}, and by minimizing over M_{1,2} ∩ D the affine function φ_{1,2} that coincides with f at the vertices of M_{1,2}, respectively.

One obtains

β(M_{1,1}) = −700 (attained at (0,10)) and β(M_{1,2}) = −500 (attained at (0,0)),

which implies β_1 = −700. The sets of feasible points in M_{1,1}, M_{1,2} which are known until now are

S_{M_{1,1}} = {(0,0), (0,10), (20,20)} , S_{M_{1,2}} = {(0,0), (20,20)} .

Hence α(M_{1,1}) = α(M_{1,2}) = f(0,0) = −500, α_1 = −500 and x^1 = (0,0).

Step 2: The set M_{1,2} can be deleted since β(M_{1,2}) = −500 = α_1. We partition M_{1,1} into the two simplices

M_{2,1} = conv {(0,0), (0,20), (20,20)} , M_{2,2} = conv {(0,20), (0,40), (20,20)} .

In a similar way to the above, we calculate the subfunctional φ_{2,1} = 20x_1 − 500. The subfunctional φ_{2,2} does not need to be calculated, since M_{2,2} ∩ D = {(20,20)}. We obtain

β(M_{2,1}) = −500 (attained at (0,0)) and β(M_{2,2}) = f(20,20) = −100 ,

hence β_2 = −500;

S_{M_{2,1}} = {(0,0), (0,10), (0,20), (20,20)} , α(M_{2,1}) = −500 ;

S_{M_{2,2}} = {(20,20)} , α(M_{2,2}) = −100 .

Hence α_2 = −500 and x^2 = (0,0). Since β_2 = α_2, we stop the procedure; x^2 = (0,0) is the optimal solution.

The particular way of subdividing simplices applied above will be justified in Section IV.3.1.

Another possibility for calculating the lower bounds would have been simply by minimizing f over the vertex set of the corresponding partition element M. Keeping the partitioning unchanged, we then would have obtained β_0 = −1300, β_1 = −1300, deletion of M_{1,2} as above, deletion of M_{2,2} (since M_{2,2} ∩ D ⊂ M_{2,1} ∩ D), and β_2 = −500 = α_2.

Fig. IV.2. Feasible set D and simplices for Example IV.1

The example shows both the mentioned disadvantage that for a solution found very early it may take a long time to verify optimality, and the typical advantage that parts of the feasible region may be deleted from further consideration.

The BB procedure can be visualized by a graph hanging at M_0: the nodes correspond to successively generated partition sets, two nodes being connected by an arc if the second is obtained by a direct partition of the first (cf. the proof of the following Theorem IV.1).

Fig. IV.3. The graph corresponding to Example IV.1.

2. FINITENESS AND CONVERGENCE CONDITIONS

Finiteness and convergence conditions depend on the limit behaviour of the difference α_k − β_k. Therefore, in order to study convergence properties for realizations of the BB procedure, it is natural to consider decreasing (nested) sequences of successively refined partition elements, i.e., sequences {M_{k_q}} such that

M_{k_{q+1}} ⊂ M_{k_q} , k_{q+1} > k_q (q = 1,2,...) .

Definition IV.3. A bounding operation is called finitely consistent if, at every step, any unfathomed partition element can be further refined, and if any decreasing sequence {M_{k_q}} of successively refined partition elements is finite.

Theorem IV.1. In a BB procedure, suppose that the bounding operation is finitely consistent. Then the procedure terminates after finitely many steps.

Proof. Since any unfathomed partition element can be further refined, the procedure stops only when α_k = β_k and an optimal solution has been attained. A directed graph G can be associated to the procedure in a natural way. The nodes of G consist of M_0 and all partition elements generated by the algorithm. Two nodes are connected by an arc if and only if the first node represents an immediate ancestor of the second node, i.e., the second is obtained by a direct partition of the first.

Obviously, in terms of graph theory, G is a rooted tree with root M_0. A path in G corresponds to a decreasing sequence {M_{k_q}} of successively refined partition elements, and the assumption of the theorem means that every path in G is finite.

On the other hand, by Definition IV.1, each partition consists of finitely many elements: hence, from each node of G only a finite number of arcs can emanate (the "out-degree" of each node is finite, the "in-degree" of each node different from M_0 is one, by construction).

Therefore, for each node M, the set of descendants of M, i.e., the set of nodes attainable (by a path) from M, must be finite. In particular, the set of all descendants of M_0 is finite, i.e., the procedure itself is finite. ∎

Remark IV.2. The type of tree that appears in the proof is discussed in Berge (1958) (the so-called "Γ-finite graphs", cf. especially Berge (1958, Chapter 3, Theorem 2)).

In the sequel, convergence conditions for the infinite BB procedure are considered. It will turn out that convergence of an infinite BB procedure is guaranteed if an obvious requirement is imposed on the selection operation, and if a certain consistency property of the bounding operations is satisfied on nested sequences of successively refined partition elements. An immediate consequence of Theorem IV.1 is the following corollary.

Corollary IV.1. If a BB procedure is infinite, then it generates at least one infinitely decreasing sequence {M_{k_q}} of successively refined partition elements.

Definition IV.4. A bounding operation is called consistent if at every step any unfathomed partition element can be further refined, and if any infinitely decreasing sequence {M_{k_q}} of successively refined partition elements satisfies

lim_{q→∞} (α_{k_q} − β(M_{k_q})) = 0 .   (3)

Remarks IV.3. (i) By construction, {α_{k_q}} is a nonincreasing sequence, {β(M_{k_q})} is a nondecreasing sequence and α_{k_q} > β(M_{k_q}), since otherwise M_{k_q} would be deleted in the next step, and so it could not be an element of a decreasing sequence of successively refined partition elements. However, since β(M_{k_q}) ≤ β_{k_q}, (3) does not necessarily imply lim_{k→∞} α_k = lim_{k→∞} β_k = min f(D) via lim_{q→∞} α_{k_q} = lim_{q→∞} β_{k_q}. In order to guarantee convergence, an additional requirement must be imposed on the selection operation.

(ii) The relation (3) may be difficult to verify in practice, since α_{k_q} is not necessarily attained at M_{k_q}. In view of the inequality α(M_{k_q}) ≥ α_{k_q} ≥ β(M_{k_q}) and the properties just mentioned, (3) will be implied by the more practical requirement

lim_{q→∞} (α(M_{k_q}) − β(M_{k_q})) = 0 ,   (4)

which simply says that, whenever a decreasing sequence of partition sets converges to a certain limit set, the bounds also must converge to the exact minimum of f over this limit set.

Definition IV.5. A selection operation is called complete if for every

M ∈ ∪_{p=1}^∞ ∩_{k=p}^∞ ℛ_k

we have

inf f(M ∩ D) ≥ α := lim_{k→∞} α_k .

Stated in words, this means that any portion of the feasible set which is left "unexplored forever" must (in the end) be not better than the fathomed portions.

Theorem IV.2. In an infinite BB procedure suppose that the bounding operation is consistent and the selection operation is complete. Then

α := lim_{k→∞} α_k = lim_{k→∞} f(x^k) = min f(D) .

Proof. Since, by construction, we have α ≥ min f(D), it suffices to prove that f(x) ≥ α ∀x ∈ D.

(i) If x ∈ D belongs to a partition set M that is deleted at iteration k, then by step k.1 of the prototype BB procedure we have f(x) ≥ β(M) ≥ α_{k−1} ≥ α.

(ii) If x ∈ D belongs to a partition set M ∈ ∪_{p=1}^∞ ∩_{k=p}^∞ ℛ_k, then by the completeness property we have inf f(M ∩ D) ≥ α, and hence f(x) ≥ α.

(iii) If neither of the two previous cases holds for x ∈ D, then any partition set M containing x must be partitioned at some iteration, i.e., it must belong to 𝒫_k for some k. Therefore, one can find a decreasing sequence {M_{k_q}} of partition sets satisfying M_{k_q} ∈ 𝒫_{k_q}, x ∈ M_{k_q}. By consistency, it follows that lim_{q→∞} α_{k_q} = α = lim_{q→∞} β(M_{k_q}). Hence f(x) ≥ α, since f(x) ≥ β(M_{k_q}) for every q. ∎

Theorem IV.2 does not include results on the behaviour of the sequence {β_k} of lower bounds. By construction (since {β_k} is a nondecreasing sequence bounded from above by min f(D)), we only have the existence of

β := lim_{k→∞} β_k ≤ inf f(D) .

However, as already seen, the degree of approximation attained at Step k is measured by the difference of the bounds α_k − β_k. It is therefore desirable to apply selection operations that improve lower bounds and imply that α = β.

Definition IV.6. A selection operation is said to be bound improving if, at least each time after a finite number of steps, 𝒫_k satisfies the relation

𝒫_k ∩ argmin {β(M): M ∈ ℛ_k} ≠ ∅ ,   (5)

i.e., at least one partition element where the actual lower bound is attained is selected for further partition in Step k of the prototype algorithm.

Several selection rules not explicitly using (5) are actually bound improving. For example, for every partition set M let σ(M) denote the index of the step where M is generated, and choose the oldest partition set, i.e., select

𝒫_k = argmin {σ(M): M ∈ ℛ_k} .

Alternatively, for every partition set M we could define a quantity δ(M) closely related to the size of M; for example, δ(M) could be the diameter of M, the volume of M, or the length of the longest edge if M is a polytope. Let the refining operation be such that, for every compact M, given ε > 0, after a finite number of refinements of M we have δ(M_i) ≤ ε for all elements M_i of the current partition of M. Choose the largest partition element, i.e., select

𝒫_k = argmax {δ(M): M ∈ ℛ_k} .

Both selections are bound improving simply because of the finiteness of the number of partition elements in each step, which assures that any partition set M will be deleted or split again after finitely many steps.

Theorem IV.3. In the infinite BB procedure, suppose that the bounding operation is consistent and the selection operation is bound improving. Then the procedure is convergent:

α := lim_{k→∞} α_k = min f(D) = lim_{k→∞} β_k =: β .

Proof. Let us mark every partition element M satisfying M ∈ 𝒫_k ∩ argmin {β(M): M ∈ ℛ_k}, i.e., such that β(M) = β_{k−1} at some Step k where (5) occurs. Since (5) occurs after finitely many steps, it follows that, if the procedure is infinite, it must generate infinitely many marked partition elements.

Among the members of the finite partition of M_0 there exists one, say M_{k_0}, with infinitely many marked descendants; then, among the members of the partition of M_{k_0} there exists one, say M_{k_1}, with infinitely many marked descendants. Continuing in this way, we see that there exists a decreasing sequence {M_{k_q}} such that every M_{k_q} has infinitely many marked descendants. Since the bounding operation is consistent, we have lim_{q→∞} (α_{k_q} − β(M_{k_q})) = 0. But every M_{k_q} has at least one marked descendant, say M^q. Then β(M^q) = β_{h_q} for some h_q > k_q, and since M^q ⊂ M_{k_q} we must have β(M^q) ≥ β(M_{k_q}) (see Step k.4). Thus, β(M_{k_q}) ≤ β_{h_q} ≤ α_{h_q} ≤ α_{k_q}, which implies that lim_{q→∞} (α_{h_q} − β_{h_q}) = 0. Therefore, we have α = β. ∎

Note that for consistent bounding operations, bound improving selection operations are complete. This follows from Theorem IV.3, since the relation f(x) ≥ β_{k_q} ∀x ∈ D implies that inf f(M ∩ D) ≥ β = α for every partition set M.

Convergence of the sequence of current best points x^k now follows by standard arguments.

Corollary IV.2. Let f: ℝⁿ → ℝ be continuous, D be closed and C(x^0) := {x ∈ D: f(x) ≤ f(x^0)} be bounded. In an infinite BB procedure suppose that the bounding operation is consistent and the selection operation is complete. Then every accumulation point of {x^k} solves problem (P).

Proof. The set C(x^0) is bounded and, by continuity of f, C(x^0) is closed and therefore compact. By construction, we have f(x^{k+1}) ≤ f(x^k) ∀k; thus {x^k} ⊂ C(x^0). Hence {x^k} possesses accumulation points. Corollary IV.2 then follows immediately from Theorem IV.2. ∎



Several BB procedures proposed earlier for solving special global optimization problems define the iteration sequence in a different way. Instead of choosing the best feasible point x^k obtained so far, a point x̄^k is considered which is associated with the lower bound β_k. Usually β(M) is calculated by minimizing or maximizing a certain subfunction of f over D ∩ M or M. Then, in several procedures, x̄^k is a point where β_k is attained (cf., e.g., Falk and Soland (1969), Horst (1976 and 1980), Benson (1982)). If x̄^k is feasible, then x̄^k will enter the set S_M of feasible points known in M. Since, however, x^k is the best feasible point known in Step k, we will then have f(x^k) ≤ f(x̄^k), i.e., x^k will be the better choice. Note that for all k we have f(x^{k+1}) ≤ f(x^k), whereas f(x̄^{k+1}) ≤ f(x̄^k) does not necessarily hold.

If in these algorithms the iteration sequence {x̄^k} is replaced by the sequence of best feasible points {x^k}, then, for continuous f and compact M_0, convergence is preserved in the following sense: whenever every accumulation point of {x̄^k} solves (P), then every accumulation point of {x^k} solves (P) as well. To see this, let y be an accumulation point of {x^k} and {x^q}_{q∈I} be a subsequence of {x^k} satisfying x^q → y (q ∈ I). By continuity of f, we have f(x^q) → f(y) (q ∈ I).

Now consider the corresponding subsequence {x̄^q}_{q∈I} of {x̄^k}. Since M_0 is compact and {x̄^q}_{q∈I} ⊂ M_0, there is a convergent subsequence {x̄^r}_{r∈Ī}, Ī ⊂ I, with f(x̄^r) → f(x*) (r ∈ Ī), where x* ∈ argmin f(D). For the corresponding subsequence {x^r}_{r∈Ī}, we then have f(x^r) → f(y) ≥ f(x*). But, because f(x̄^r) ≥ f(x^r), we also have f(y) ≤ f(x*).

For several classes of problems, however, it is not possible to guarantee that a sequence of best feasible points x^k and the associated upper bounds α_k can be obtained in such a way that consistency holds (e.g., Horst (1988 and 1989), Horst and Dien (1987), Horst and Thoai (1988), Pinter (1988), cf. the comments in Section IV.1). In this case we propose to consider the above sequence {x̄^k} or sequences closely related to {x̄^k} (cf. Section IV.5).

Example IV.2. We want to minimize the concave function f(x_1,x_2) = −(x_2 − 10)² on a set D ⊂ ℝ² for which, obviously, min f(D) = −100 is attained at x̄ = (0,0). Suppose that we apply a BB procedure that uses two-dimensional simplices M defined as the convex hull of their known vertex sets V(M), i.e., M = conv V(M). Moreover, suppose that all operations required by the algorithm are carried out on V(M). For example, let the lower bounds β(M) be the minimum of f on V(M), i.e., β(M) = min f(V(M)), and let the upper bounds α(M) be the minimum of f taken over the feasible vertices of M, i.e., S_M = V(M) ∩ D, i.e.,

α(M) = min f(V(M) ∩ D) , if V(M) ∩ D ≠ ∅ ,
α(M) = ∞ , if V(M) ∩ D = ∅ .

By concavity of f, we have

β(M) = min f(M) ≤ inf f(M ∩ D) ≤ α(M)

(cf. Section I.2, Theorem 1).

It is easy to define refining procedures and selection rules such that the branch and bound procedure generates a decreasing sequence {M_n} of simplices

M_n = conv {(1/n, 1/n), (−1/n, 1/n), (0, −1/n)} , n ∈ ℕ ,

satisfying

V(M_n) ∩ D = ∅ ∀n ∈ ℕ , lim_{n→∞} M_n = {x̄} .

Obviously, we have

lim_{n→∞} β(M_n) = f(x̄) , lim_{n→∞} v^n = x̄

for any sequence of points v^n ∈ M_n.

The algorithm converges in the sense that every accumulation point of such a sequence {v^n} solves the problem. However, because V(M_n) ∩ D = ∅, we have α(M_n) = ∞ ∀n ∈ ℕ, and the iteration sequence of best feasible points as used in the prototype algorithm is empty. Consistency does not hold, and convergence cannot be established by the theory developed above.

A related question is that of deleting infeasible partition sets in an appropriate way. As mentioned in Section IV.1, it is in many cases too difficult to decide exactly whether M ∩ D = ∅ for all partition sets M from the information available. Therefore, we have to invent simple techniques for checking infeasibility that, though possibly incorrect for a given M, lead to correct and convergent algorithms when incorporated in Step k.3 of the prototype BB procedure.

Example IV.3. In Example IV.2, we could have decided precisely whether M ∩ D = ∅ or M ∩ D ≠ ∅. For example, given a convex function g with D = {x: g(x) ≤ 0}, we solve the convex optimization problem

minimize g(x) s.t. x ∈ M .

We have x ∈ D if and only if g(x) ≤ 0. Hence M ∩ D = ∅ if and only if min g(M) > 0.
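As a numerical illustration of this test (our own sketch; the function names and the use of SciPy's general-purpose SLSQP solver are our assumptions, not part of the text), one can minimize a convex g over a simplex M = conv(V) in barycentric coordinates and compare the optimal value with zero:

import numpy as np
from scipy.optimize import minimize

def simplex_is_infeasible(g, V):
    # Return True if min g(M) > 0, i.e., M ∩ D = ∅ for D = {x: g(x) <= 0}.
    # V holds the vertices of the simplex M as rows; M is parametrized by
    # barycentric coordinates lam (lam >= 0, sum(lam) = 1, x = lam @ V).
    # g is convex but possibly nonsmooth, so SLSQP is only a pragmatic
    # stand-in for a proper convex (nonsmooth) solver.
    m = V.shape[0]
    cons = ({"type": "eq", "fun": lambda lam: lam.sum() - 1.0},)
    res = minimize(lambda lam: g(lam @ V), np.full(m, 1.0 / m),
                   bounds=[(0.0, 1.0)] * m, constraints=cons, method="SLSQP")
    return res.fun > 0.0

As the text notes next, when min g(M) is only slightly positive such a numerical test may well return the wrong answer, which is precisely why "deletion by infeasibility" rules need only be certain in the limit.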

Note that for the convex set D := {x: g_i(x) ≤ 0, i = 1,...,m} defined by convex functions g_i: ℝⁿ → ℝ (i = 1,...,m), the function g(x) = max {g_i(x), i=1,...,m} is nonsmooth, and it may be numerically expensive to solve the above minimization problem for each M satisfying V(M) ∩ D = ∅. Moreover, whenever min g(M) > 0 is small, the algorithms available for solving this problem may in practice produce a result that leads to a wrong decision on M.

The situation is, of course, worse for more complicated feasible sets D. For example, let D be defined by a finite number of inequalities g_i(x) ≤ 0, i ∈ I, where g_i: ℝⁿ → ℝ is Lipschitzian on a set M_0 containing D (i ∈ I). Then the problem of deciding whether we have M ∩ D = ∅ or M ∩ D ≠ ∅ is clearly almost as difficult as the original problem (1).

Definition IV.7. A lower bounding operation is called strongly consistent if, at every step, any undeleted partition element can be further refined, and if any infinite decreasing sequence {M_{k_q}} of successively refined partition elements possesses a subsequence {M_{k_{q'}}} satisfying

M ∩ D ≠ ∅ , β(M_{k_{q'}}) → min f(M ∩ D) (q' → ∞) , where M = ∩_q M_{k_q} .

Note that the notions of consistency and strong consistency involve not only the calculation of bounds but, obviously, also the subdivision of partition elements. Moreover, in order to ensure strong consistency of the lower bounding operation, the deletion of infeasible sets in Step k.3 of the BB procedure has to guarantee that M ∩ D ≠ ∅ holds for the limit M of every nested sequence of partition elements generated by the algorithm.

Definition IV.8. The "deletion by infeasibility" rule used in Step k.3 throughout a BB procedure is called certain in the limit if for every infinite decreasing sequence {M_{k_q}} of successively refined partition elements with limit M we have M ∩ D ≠ ∅.

Corollary IV.3. In the BB procedure suppose that the lower bounding operation is strongly consistent and the selection operation is bound improving. Then we have

β = lim_{k→∞} β_k = min f(D) .

Proof. As in the proof of Theorem IV.3, it follows by the bound improving selection rule that there must be a decreasing sequence {M_{k_q}} satisfying β_{k_q} = β(M_{k_q}). Clearly, β = lim_{k→∞} β_k ≤ min f(D) exists (the limit of a nondecreasing sequence bounded from above). Hence we have

β = lim_{q→∞} β(M_{k_q}) = min f(M ∩ D) ≤ min f(D) .

Since D ⊃ M ∩ D, this is only possible if min f(M ∩ D) = min f(D). ∎

In the following sections, some concrete types of partition sets, refining operations, and bounding and deletion rules are presented that illustrate the wide range of applicability of the BB procedure.

3. TYPICAL PARTITION SETS AND THEIR REFINEMENT

As mentioned above, for the partition sets M it is natural to use most simple

polytopes or convex polyhedral sets such as simplices, rectangles and polyhedral

cones.

3.1. Simplices

Suppose that D ⊂ ℝⁿ is robust and convex. Furthermore, let M_0 and all partition elements be n-dimensional simplices (n-simplices). An initial simplex M_0 can usually be determined as described in Section II.4.

Definition IV.9. Let M be an n-simplex with vertex set V(M) = {v^0, v^1,...,v^n}. Choose a point w ∈ M, w ∉ V(M), which is uniquely represented by

w = Σ_{i=0}^n λ_i v^i , λ_i ≥ 0 (i=0,...,n) , Σ_{i=0}^n λ_i = 1 ,   (3)

and for each i such that λ_i > 0 form the simplex M(i,w) obtained from M by replacing the vertex v^i by w, i.e., M(i,w) = conv {v^0,...,v^{i−1}, w, v^{i+1},...,v^n}.

This subdivision is called a radial subdivision (Fig. IV.4).

Fig. IV.4. Radial subdivision of a simplex



Radial subdivisions of simplices were introduced in Horst (1976) and subsequently used by many authors.
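The construction of Definition IV.9 is easy to make computational. The sketch below (our own illustration; all names are ours) computes the barycentric coordinates of w and returns the subsimplices M(i,w):

import numpy as np

def radial_subdivision(V, w, tol=1e-12):
    # Radial subdivision of the n-simplex conv(V) at a point w (Definition
    # IV.9).  V has shape (n+1, n), one vertex v^i per row.  For every i with
    # lambda_i > 0 in the (unique) barycentric representation (3) of w,
    # vertex v^i is replaced by w, giving the subsimplex M(i, w).
    n = V.shape[1]
    A = np.vstack([V.T, np.ones(n + 1)])   # coordinate rows plus sum-to-one row
    lam = np.linalg.solve(A, np.append(np.asarray(w, float), 1.0))
    children = []
    for i, li in enumerate(lam):
        if li > tol:                       # only indices with lambda_i > 0
            Vi = V.copy()
            Vi[i] = w                      # M(i, w) = conv{v^0,..,w,..,v^n}
            children.append(Vi)
    return children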

Proposition IV.1. The set of subsets M(i,w) that can be constructed from an n-simplex M by an arbitrary radial subdivision forms a partition of M into n-simplices.

Proof. It is well-known from elementary linear algebra that, given an affinely independent set of points {v^0,...,v^n} and a point w represented by (3), then {v^0,...,v^{i−1}, w, v^{i+1},...,v^n} is a set of affinely independent points whenever we have λ_i > 0 in (3). Thus, all the sets M(i,w) generated by a radial subdivision of an n-simplex M are n-simplices.

Let x ∈ M(i,w), i.e.,

x = Σ_{j=0, j≠i}^n μ_j v^j + μ_i w , μ_j ≥ 0 (j=0,...,n) , μ_i + Σ_{j=0, j≠i}^n μ_j = 1 .   (4)

Inserting (3) yields

x = Σ_{j=0, j≠i}^n μ_j v^j + μ_i Σ_{k=0}^n λ_k v^k , μ_j ≥ 0 (j=0,...,n) , λ_k ≥ 0 (k=0,...,n) ,
μ_i + Σ_{j=0, j≠i}^n μ_j = 1 .   (5)

This can be expressed as

x = Σ_{j=0, j≠i}^n (μ_j + μ_i λ_j) v^j + μ_i λ_i v^i ,   (6)

where all of the coefficients in (6) are greater than or equal to zero, and, by (5), their sum equals (1 − μ_i) + μ_i(1 − λ_i) + μ_i λ_i = 1. Hence, x is a convex combination of the vertices v^0,...,v^n of M and we have x ∈ M. Consequently, M(i,w) ⊂ M.

To show that M ⊂ ∪_i M(i,w), let x ∈ M, x ≠ w, and consider the ray ρ(w,x) = {α(x−w) + w, α ≥ 0} from w through x. Let F be the face of M with smallest dimension among all faces containing x and w, and let y be the point where ρ(w,x) intersects the relative boundary of F. Then we have

y = Σ_{i∈I_1} μ_i v^i , μ_i ≥ 0 (i ∈ I_1) , Σ_{i∈I_1} μ_i = 1 , I_1 ⊂ {0,...,n} , |I_1| < n+1 .   (7)

By construction, we also have

x = ᾱy + (1 − ᾱ)w , 0 < ᾱ ≤ 1 .

Hence, using (7), we obtain

x = Σ_{i∈I_1} ᾱμ_i v^i + (1 − ᾱ)w ,   (8)

where ᾱμ_i ≥ 0 (i ∈ I_1), (1 − ᾱ) ≥ 0 and Σ_{i∈I_1} ᾱμ_i + (1 − ᾱ) = 1. It follows that x ∈ ∪_i M(i,w).

Finally, for i ≠ k, let x ∈ M(i,w) ∩ M(k,w), so that

x = Σ_{j=0, j≠i}^n λ_j v^j + λ_i w = Σ_{j=0, j≠k}^n μ_j v^j + μ_k w ,   (9)

λ_j ≥ 0 , μ_j ≥ 0 (j=0,...,n) , Σ_{j=0}^n λ_j = Σ_{j=0}^n μ_j = 1 .

In (9), we cannot have λ_k > 0 or μ_i > 0. To see this, express in (9) w as the convex combination w = Σ_{i=0}^n α_i v^i of v^0,...,v^n. Then in (9) x is expressed by two convex combinations of v^0,...,v^n whose coefficients have to coincide. Taking into account that α_i, α_k ≠ 0, it is then easy to see that μ_i = λ_k = 0. The details are left to the reader. ∎


In order to establish consistency or strong consistency for a bounding operation in a BB procedure using radially subdivided simplices, we have to make sure that the points w in Definition IV.9 are chosen in a way which guarantees convergence of every decreasing sequence of simplices generated by the algorithm to a simple set M, where min f(M) is known. If no specific properties of f are known that could be exploited, then we would like to ensure that M is a singleton.

Denote by δ(M) the diameter of M (measured by the Euclidean distance). For the simplex M, δ(M) is also the length of a longest edge of M.

Definition IV.10. A subdivision is called exhaustive if δ(M_q) → 0 (q → ∞) for all decreasing subsequences {M_q} of partition elements generated by the subdivision.

The notion of exhaustiveness was introduced in Thoai and Tuy (1980) for a similar splitting procedure for cones (see also Horst (1976)), and it was further investigated in Tuy, Katchaturov and Utkin (1987). Note that exhaustiveness, though often intuitively clear, is usually not easy to prove for a given radial subdivision procedure. Moreover, some straightforward simplex splitting procedures are not exhaustive.

Example IV.4. Let v^i_q (i=0,...,n) denote the vertices of a simplex M_q. Let n > 1, and in Definition IV.9 choose w to be the barycenter of M_q, i.e.,

w = w_q = (1/(n+1)) Σ_{i=0}^n v^i_q .   (10)

Construct a decreasing sequence {M_q} of simplices using radial subdivision with w given by (10), and suppose that for all q, M_{q+1} is obtained from M_q by replacing v^n_q by the barycenter w_q of M_q. Then, clearly, every simplex M_q contains the face conv {v^0_1,...,v^{n−1}_1} of the initial simplex M_1, and thus M = ∩_{q=1}^∞ M_q has positive diameter.

A large class of exhaustive radial subdivisions is discussed in Tuy, Katchaturov and Utkin (1987). We present here only the most frequently used bisection, introduced in Horst (1976), where, in Definition IV.9, w is the midpoint of one of the longest edges of M, i.e.,

w = ½ (v^{i_1} + v^{i_2}) ,   (11)

where [v^{i_1}, v^{i_2}] is a longest edge of M. In this case, M is obviously subdivided into two n-simplices having equal volume. The exhaustiveness of any decreasing sequence of simplices produced by successive bisection follows from the following result.

Proposition IV.2. Let {M_q} be any decreasing sequence of n-simplices generated by the bisection subdivision process. Then we have

(i) δ(M_{q+n}) ≤ (√3/2) δ(M_q) ∀q ,

(ii) δ(M_q) → 0 (q → ∞) .

Proof. Consider a sequence {M_q} such that M_{q+1} is always obtained from M_q by bisection. Let δ(M_q) = δ_q. It suffices to prove (i) for q = 1. Color every vertex of M_1 "black", and color "white" every vertex of M_r with r > 1 which is not black. Let d_r denote the longest edge of M_r that is bisected. Let p be the smallest index such that d_p has at least one white endpoint.

Since a black vertex is replaced by a white one at each bisection before p, we must have p ≤ n+1.

Let d_p = [u,v] with u white. Then u is the midpoint of some d_k with k < p. Let d_k = [a,b]. If a or b coincides with v, then obviously δ_p = ½ δ_k ≤ ½ δ_1 and (i) holds.

Otherwise, consider the triangle conv {a,b,v}. Since v ∈ M_p ⊂ M_k and δ_k = ‖d_k‖ is the diameter of M_k, we must have ‖v−a‖ ≤ δ_k, ‖v−b‖ ≤ δ_k. Since u is the midpoint of [a,b], we deduce from the "parallelogram rule"

2‖u−v‖² = ‖v−a‖² + ‖v−b‖² − ½ ‖a−b‖²

and the relation ‖a−b‖ = δ_k that

2‖u−v‖² ≤ 2δ_k² − ½ δ_k² = (3/2) δ_k² ,

and therefore δ_p ≤ (√3/2) δ_k. Since δ_{n+1} ≤ δ_p and k ≥ 1, we then have (i).

(ii) is an immediate consequence of (i). ∎
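In code, bisection is simply the radial subdivision at the midpoint of a longest edge; a possible sketch (ours, under the same conventions as the radial_subdivision sketch above):

import numpy as np

def bisect_simplex(V):
    # Bisection (11): w is the midpoint of a longest edge [v^i1, v^i2]; the
    # simplex conv(V) is split into the two subsimplices of equal volume
    # obtained by replacing v^i1 resp. v^i2 by w.  (Our own illustration.)
    m = V.shape[0]                         # m = n + 1 vertices, one per row
    i1, i2, dmax = 0, 1, -1.0
    for i in range(m):                     # locate a longest edge
        for j in range(i + 1, m):
            d = np.linalg.norm(V[i] - V[j])
            if d > dmax:
                i1, i2, dmax = i, j, d
    w = 0.5 * (V[i1] + V[i2])
    A, B = V.copy(), V.copy()
    A[i1] = w                              # child M(i1, w)
    B[i2] = w                              # child M(i2, w)
    return [A, B]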



Though very natural and simple, bisection is not necessarily the most efficient
way to subdivide simplices, since it does not take into account the structure of a
given optimization problem.
More sophisticated subdivision procedures for simplices that are especially useful
in linearly constrained concave minimization problems will be presented in Chapter
VII of this book.

Note that the notion of a radial subdivision can be defined similarly for any se-

quence of polytopes, and the definition of exhaustiveness is obviously not restricted

to simplices.

3.2. Rectangles and Polyhedral Cones

The further development of BB methods to be presented later shows that in most cases the whole vertex set V(M) of a partition set M is needed to compute bounds and "deletion by infeasibility" rules. Since an n-simplex has the least number of vertices among all n-dimensional polytopes, it is frequently natural to choose n-simplices.

For some classes of problems, however, rectangles M = {x: a ≤ x ≤ b}, a,b ∈ ℝⁿ, a < b, are a more natural choice. Note that M is uniquely determined by its "lower left" vertex a = (a_1,...,a_n)^T and its "upper right" vertex b = (b_1,...,b_n)^T. Each of the 2ⁿ vertices of the rectangle M is of the form

a + c ,

where c is a vector with components 0 or (b_i − a_i) (i ∈ {1,...,n}).

Moreover, an initial rectangle M_0 ⊃ D is often known by given bounds on the variables.

Rectangular partition sets have been used to solve certain Lipschitzian optimization problems (e.g., Strongin (1984), Pinter (1986, 1986a, 1987), Horst (1987 and 1988), Horst and Tuy (1987), Nefedov (1987), Horst, Nast and Thoai (1995), cf. Chapter XI).

Most naturally, rectangular sets are suitable if the functions involved in the prob-
lem are separable, i.e., the sum of n functions of one variable, since in this case ap-
propriate lower bounds are often readily available (e.g., Falk and Soland (1969), So-

land (1971), Falk (1972), Horst (1978), Kalantari and Rosen (1987), Pardalos and
Rosen (1987)). We will return to separable problems in several later chapters.

Let M be an n-rectangle and let w ∈ M, w ∉ V(M), where V(M) again denotes the vertex set of M. Then a radial subdivision of M using w (defined in the same way as in the case of simplices) does not partition M into n-rectangles but rather into more complicated sets.

Therefore, the subdivision of n-rectangles is usually defined via hyperplanes passing through w parallel to the facets ((n−1)-dimensional faces) of M, so that M is partitioned into up to 2ⁿ rectangles.

For most algorithms, the subdivision must be exhaustive. An example is the bisection, where w is the midpoint of one of the longest edges of M, and M is subdivided into two n-rectangles having equal volume and such that w is a vertex of both new n-rectangles. It can be shown, in a manner similar to Proposition IV.2, that the bisection of n-rectangles is exhaustive.
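A sketch of this rectangle bisection (our own illustration; the representation of M by the corner vectors a, b follows the text):

import numpy as np

def bisect_rectangle(a, b):
    # Split M = {x: a <= x <= b} by the hyperplane through the midpoint of a
    # longest edge, perpendicular to that edge; the two children have equal
    # volume and share the splitting facet.  (Our own illustration.)
    a, b = np.asarray(a, float), np.asarray(b, float)
    i = int(np.argmax(b - a))              # direction of a longest edge
    mid = 0.5 * (a[i] + b[i])
    b_left, a_right = b.copy(), a.copy()
    b_left[i] = mid                        # left child  [a, b_left]
    a_right[i] = mid                       # right child [a_right, b]
    return (a, b_left), (a_right, b)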

Polyhedral cones are frequently used for concave minimization problems with (possibly unbounded) robust convex feasible sets (e.g., Thoai and Tuy (1980), Tuy, Thieu and Thai (1985), Horst, Thoai and Benson (1991)).

Assume that D possesses an interior point y^0. Let S be an n-simplex containing y^0 in its interior. Consider its n+1 facets F_i. Each F_i is an (n−1)-simplex. For each F_i let C_i be the convex polyhedral cone vertexed at y^0 and having exactly n edges defined by the halflines from y^0 through the n vertices of F_i. Then {C_i: i = 1,...,n+1} is a conical partition of ℝⁿ.

We construct a partition of such a cone C by means of the corresponding facet F. Any radial subdivision of the simplex F defines a partition of C into cones vertexed at y^0 and having exactly n edges, namely the halflines from y^0 through the vertices of the corresponding partition element of F. Whenever the subdivision of the (n−1)-simplices F is exhaustive, then any nested sequence {C_q} of cones associated with the subdivision of the corresponding sequence {F_q} of (n−1)-simplices converges to a ray from y^0 through the point x̄ satisfying F_q → {x̄} (q → ∞).

4. LOWER BOUNDS

In this section, we discuss some examples of bounding operations. Given a family of partition sets, a subdivision procedure (e.g., as outlined in the preceding section) and an appropriate "deletion by infeasibility" mechanism, we find a group of consistent (strongly consistent) bounding operations. Since the calculation of lower bounds β(M) depends on the data of the optimization problem to be solved, we first discuss some ideas and classes of problems which occur frequently.

Note that not all the bounding operations discussed below are necessarily monotonic as required in Step k.4 of the prototype BB procedure. Let M ⊂ M' be an element of a partition of M' and suppose that we have β(M) < β(M'). Then let us agree to use

β'(M) = max {β(M), β(M')}

instead of β(M) as defined below.

4.1. Lipschitzian Optimization

Natural lower bounds on Lipschitzian functions have already been mentioned in Section I.4.

Let f be Lipschitzian on M, i.e., assume that there is a constant L > 0 such that

|f(z) − f(x)| ≤ L ‖z−x‖ ∀x,z ∈ M ,   (12)

where ‖·‖ denotes the Euclidean norm.

Suppose that an upper bound A for L is known.

Finding a good upper bound A ≥ L is, of course, difficult in general, but without such a bound branch and bound cannot be applied for Lipschitzian problems. On the other hand, there are many problems where A can readily be determined. Note that

in a branch and bound procedure "local" bounds for L on the partition sets M should be used instead of global bounds on M_0.

Moreover, suppose that the diameter δ(M) of M is known.

Recall that for a simplex M, δ(M) is the length of the longest edge of M. For a rectangle M = {x ∈ ℝⁿ: a ≤ x ≤ b}, a,b ∈ ℝⁿ, a < b, δ(M) is the length of the diagonal [a,b] joining the "lower left" vertex a and the "upper right" vertex b (all inequalities are to be understood in the sense of the componentwise ordering in ℝⁿ).

By (12), we then have

f(z) ≥ f(x) − L ‖z−x‖ ≥ f(x) − A δ(M) ∀x,z ∈ M .   (13)

Let V'(M) be a nonempty subset of the vertex set V(M) of M. Then

β(M) := max {f(x): x ∈ V'(M)} − A δ(M)   (14)

is a natural lower bound.

In (14), V'(M) can be replaced by any known subset of M.

If M = {x ∈ ℝⁿ: a ≤ x ≤ b} is a hyperrectangle, then

β(M) := f(½(a+b)) − (A/2) δ(M)   (14')

might be a better choice than (14). For more sophisticated bounds, see Chapter XI.2.5.
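For a rectangle, both bounds are one-liners; the following sketch is our own illustration (it assumes the reading of (14') given above, where the factor A/2 reflects that every point of M lies within δ(M)/2 of the midpoint):

import numpy as np

def lipschitz_bounds(f, a, b, A):
    # Lower bounds (14) and (14') on M = {x: a <= x <= b}, where A >= L is a
    # known upper bound on the Lipschitz constant of f on M.  (Our sketch.)
    a, b = np.asarray(a, float), np.asarray(b, float)
    delta = np.linalg.norm(b - a)                    # delta(M) = |[a, b]|
    beta14 = max(f(a), f(b)) - A * delta             # (14) with V'(M) = {a, b}
    beta14p = f(0.5 * (a + b)) - 0.5 * A * delta     # (14')
    return beta14, beta14p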

4.2. Vertex Minima

Let M be a polytope. Then, for certain classes of objective functions, lower bounds for inf f(M ∩ D) or inf f(M) can be determined simply by minimizing a certain function related to f over the finite vertex set V(M) of M.

For example, if f is concave on M, then

inf f(M ∩ D) ≥ min f(M) = min f(V(M)) ,

and we may choose

β(M) = min f(V(M)) .   (15)

Tighter bounds can be obtained by cutting off parts of M \ D by means of some steps of an outer approximation (relaxation) method by cutting planes (cf. Chapter II). The resulting polytope P will satisfy M ∩ D ⊂ P ⊂ M, hence

inf f(D ∩ M) ≥ min f(V(P)) ≥ min f(M) ,   (16)

and β(M) = min f(V(P)) is, in general, a tighter bound than min f(V(M)).

Another example is the calculation of lower bounds for a d.c. function f(x) = f_1(x) + f_2(x), where f_1 is concave and f_2 is convex on a polytope M.

Choose any v* ∈ M and let ∂f_2(v*) denote the subdifferential of f_2 at v*. Let p* ∈ ∂f_2(v*) and determine

v̄ ∈ argmin {f_1(v) + f_2(v*) + p*(v−v*): v ∈ V(M)} .   (17)

Then, by the definition of a subgradient, we have

ℓ(x) := f_2(v*) + p*(x−v*) ≤ f_2(x) ∀x ∈ ℝⁿ ,

and hence f_1(x) + ℓ(x) ≤ f(x) ∀x ∈ ℝⁿ.

But f_1(x) + ℓ(x) is concave and attains its minimum on M at a vertex of M. Consequently,

β(M) := f_1(v̄) + f_2(v*) + p*(v̄−v*)   (18)

is a lower bound for min f(M), and hence for inf f(M ∩ D).
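A sketch of this bound (our own illustration; for differentiable f_2 the gradient is the natural choice of subgradient p*, and taking f_2 ≡ 0 recovers the concave vertex-minimum bound (15)):

import numpy as np

def dc_vertex_bound(f1, f2, grad_f2, V):
    # Lower bound (18) for f = f1 + f2 on the polytope conv(V), with f1
    # concave and f2 convex; V holds the vertices as rows.  We take v* as the
    # first vertex and p* = grad_f2(v*), a subgradient of f2 at v*.
    v_star = V[0]
    p_star = np.asarray(grad_f2(v_star), float)
    # minimize the concave minorant f1(x) + f2(v*) + p*(x - v*) over V(M);
    # by concavity this equals its minimum over all of M
    return min(f1(v) + f2(v_star) + p_star @ (v - v_star) for v in V)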



4.3. Convex Subfunctionals

A commonly used way of calculating lower bounds β(M) for min f(D ∩ M) or min f(M) is by minimizing a suitable convex subfunctional of f over D ∩ M or over M. A convex subfunctional of f on M is a convex function that never exceeds f on M. A convex subfunctional φ is said to be the convex envelope of f on M if no other convex subfunctional of f on M exceeds φ at any point x ∈ M (i.e., φ is the pointwise supremum of all convex subfunctionals of f on M). Convex envelopes play an important role in optimization and have been discussed by many authors. We refer here mainly to Falk (1969), Rockafellar (1970), Horst (1976, 1976a and 1979).

The convex envelope φ is the uniformly best convex approximation of f on M from below. For our purposes, it is sufficient to consider lower semicontinuous functions f: M → ℝ, where M ⊂ ℝⁿ is compact and convex. We are led thereby to the following definition.

Definition IV.11. Let M ⊂ ℝⁿ be convex and compact, and let f: M → ℝ be lower semicontinuous on M. A function φ: M → ℝ is called the convex envelope of f on M if it satisfies

(i) φ(x) is convex on M,

(ii) φ(x) ≤ f(x) ∀x ∈ M,

(iii) there is no function Ψ: M → ℝ satisfying (i), (ii) and φ(x̄) < Ψ(x̄) for some point x̄ ∈ M.

Thus we have φ(x) ≥ Ψ(x) for all x ∈ M and all convex subfunctionals Ψ of f on M.

By (iii), it is easily seen that the convex envelope is uniquely determined, if it exists.

Theorem IV.4. Let f: M → ℝ be lower semicontinuous on the convex compact set M ⊂ ℝⁿ. Let φ be the convex envelope of f over M. Then we have

a) min φ(M) = min f(M) ,

b) argmin φ(M) ⊃ argmin f(M) .

Proof. Let x̄ ∈ argmin f(M). Then we must have φ(x̄) ≤ f(x̄) (by Definition IV.11 (ii)). But we cannot have φ(x̄) < f(x̄), since if this were the case, then the constant function Ψ(x) ≡ f(x̄) would satisfy (i), (ii) of Definition IV.11; but the relation Ψ(x̄) > φ(x̄) would contradict (iii).

Moreover, it follows by the same argument that we have

φ(x) ≥ min f(M) ∀x ∈ M ,

hence a) and b) are both proved. ∎



By Theorem IV.4, many nonconvex minimization problems could be replaced by a convex minimization problem if the convex envelope φ of f were available. However, for arbitrary convex compact sets and arbitrary lower semicontinuous f, computing φ is at least as difficult as solving the original minimization problem. Nevertheless, within the BB method, for sufficiently simple M and certain special forms of f, the convex envelope is readily available and can be used to compute lower bounds β(M).

A geometrically more appealing way to introduce convex envelopes is by means of the following characterization (Fig. IV.5).

Recall that the epigraph epi(f) := {(x,r) ∈ M × ℝ: r ≥ f(x)} of a given function f: M → ℝ consists of the points in M × ℝ on and above the graph of f.

Lemma IV.1. Let M ⊂ ℝⁿ be compact and convex, and let f: M → ℝ be lower semicontinuous on M. Then φ: M → ℝ is the convex envelope of f if and only if

epi(φ) = conv(epi(f))   (19)

or, equivalently,

φ(x) = inf {r: (x,r) ∈ conv(epi(f))} .   (20)

Proof. The proof is an immediate consequence of the definition of the convex hull conv(A) of a set A as the smallest convex set containing A. ∎

Note that epi(f) and conv(epi(f)) are closed sets, since f is lower semicontinuous (cf., e.g., Rockafellar (1970), Blum and Oettli (1975)).

Fig. IV.5. Convex envelope



A useful result can be derived by means of Caratheodory's Theorem, which states that every point of a compact convex set M in ℝⁿ is the convex combination of at most n+1 extreme points of M (cf. Rockafellar (1970)).

Corollary IV.4. Let M ⊂ ℝⁿ be compact and convex, and let f: M → ℝ be lower semicontinuous on M. Then φ and f coincide at the extreme points of M.

Proof. Applying Caratheodory's Theorem to (20), we see that φ(x) can be expressed as

φ(x) = inf { Σ_{i=1}^{n+1} λ_i f(x^i): Σ_{i=1}^{n+1} λ_i = 1 , Σ_{i=1}^{n+1} λ_i x^i = x ;
λ_i ≥ 0 , x^i ∈ M (i=1,...,n+1) } .

Since, for an extreme point x, x = x is the only representation of x as a convex combination of points in M, it follows that φ(x) = f(x). ∎

The convex envelope φ can also be characterized by means of the conjugate function as defined by Fenchel (1949, 1951) (see also, e.g., Rockafellar (1970)).

Definition IV.12. Let f: M → ℝ be lower semicontinuous on the convex, compact set M ⊂ ℝⁿ. Then

f*(t) = max_{x∈M} {xt − f(x)}   (21)

is called the conjugate function of f.

Note that the maximum in (21) always exists, since f is lower semicontinuous and M is compact. Hence, f* is defined for all t ∈ ℝⁿ. If we replace the max operator in (21) by the sup operator, then the conjugate can be defined for arbitrary functions on

M. The domain of f* is then

D(f*) = {t: sup_{x∈M} {xt − f(x)} < ∞} .   (22)

It is easily seen that f* is convex (the pointwise supremum of a family of affine functions).

The same operation can be performed on f* to yield a new convex function f**: this is the so-called second conjugate of f. The function f** turns out to be identical to the convex envelope φ (cf., e.g., Falk (1969), Rockafellar (1970)). Our proof of the following theorem follows Falk (1969).

Theorem IV.5. Let f: M → ℝ be lower semicontinuous on the compact convex set M ⊂ ℝⁿ. Denote by f** the second conjugate of f and by φ the convex envelope of f on M. Then we have

f**(x) = φ(x) ∀x ∈ M .   (23)

Proof. We first show that f** is defined throughout M, i.e., D(f**) ⊃ M.

Let x^0 ∈ M. Then, by (21), we have

x^0 t − f(x^0) ≤ f*(t) ∀t ∈ ℝⁿ .

Hence,

x^0 t − f*(t) ≤ f(x^0) ∀t ∈ ℝⁿ ,

and

f**(x^0) = sup_{t∈ℝⁿ} {x^0 t − f*(t)} ≤ f(x^0) < ∞ ,   (24)

i.e., x^0 ∈ D(f**).

Note that it can actually be shown that D(f**) = M (cf., e.g., Falk (1969), Rockafellar (1970)).

Since f** is convex on M and, by (24), f**(x) ≤ f(x) ∀x ∈ M, it follows from the definition of φ that f**(x) ≤ φ(x) ∀x ∈ M. Suppose that f**(x^0) < φ(x^0) for some x^0 ∈ M. Then we have (x^0, f**(x^0)) ∉ epi(φ). Note that epi(φ) is a closed convex set. Thus, there is a hyperplane strictly separating the point (x^0, f**(x^0)) from epi(φ), i.e., there is a vector (s,σ) ∈ ℝ^{n+1} satisfying

xs + σφ(x) > x^0 s + σ f**(x^0) ∀x ∈ M .   (25)

If σ = 0, then xs > x^0 s ∀x ∈ M, which is impossible, since x^0 ∈ M.

Now, since σ ≠ 0, we may divide (25) by −σ and set s ← −s/σ. Then from (25) it follows that either

xs − φ(x) > x^0 s − f**(x^0) ∀x ∈ M   (26)

or

xs − φ(x) < x^0 s − f**(x^0) ∀x ∈ M .   (27)

If (26) holds, then in particular it must hold for x = x^0, so that

x^0 s − φ(x^0) > x^0 s − f**(x^0) .

But this is equivalent to f**(x^0) > φ(x^0), which we have seen above to be false.

If (27) holds, then, since φ(x) ≤ f(x), it follows that

xs − f(x) < x^0 s − f**(x^0) ∀x ∈ M .

But because f**(x^0) ≥ x^0 s − f*(s) this implies that

−f(x) + xs < f*(s) ∀x ∈ M .   (28)

This contradicts the definition of f*, which implies that in (28) equality (rather than strict inequality) holds for some x ∈ M. ∎


Further useful results on convex envelopes have been obtained for special classes

of functions f over special polytopes. The following two results are due to Falk and

Hoffman (1976) and to Horst (1976 and 1976a).

Theorem IV.6. Let M be a polytope with vertices v^1,...,v^k, and let f: M → ℝ be concave on M. Then the convex envelope φ of f on M can be expressed as

φ(x) = min_α { Σ_{i=1}^k α_i f(v^i): Σ_{i=1}^k α_i v^i = x , Σ_{i=1}^k α_i = 1 , α ≥ 0 } .   (29)

Proof. The function φ defined in (29) is convex. To see this, let 0 ≤ λ ≤ 1 and x^1, x^2 ∈ M. Let α^1, α^2 solve (29) for x = x^1 and x = x^2, respectively. Then

φ(λx^1 + (1−λ)x^2) ≤ Σ_{i=1}^k (λα^1_i + (1−λ)α^2_i) f(v^i) = λ φ(x^1) + (1−λ) φ(x^2) ,

where the inequality follows from the feasibility of λα^1 + (1−λ)α^2.

For every α that is feasible for the above problem (29), the concavity of f implies that

Σ_{i=1}^k α_i f(v^i) ≤ f( Σ_{i=1}^k α_i v^i ) = f(x) , hence φ(x) ≤ f(x) ∀x ∈ M .

Finally, suppose that Ψ is a convex function on M which underestimates f over M, and suppose that φ(x̄) < Ψ(x̄) for some x̄ ∈ M. Let ᾱ solve (29) for x = x̄. Then we have

Ψ(x̄) = Ψ( Σ_{i=1}^k ᾱ_i v^i ) ≤ Σ_{i=1}^k ᾱ_i Ψ(v^i) ≤ Σ_{i=1}^k ᾱ_i f(v^i) = φ(x̄) < Ψ(x̄) ,

which is a contradiction. ∎
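For a concave f given on a polytope by its vertices, (29) is an ordinary linear program; a sketch using SciPy's linprog (our own illustration, not from the book):

import numpy as np
from scipy.optimize import linprog

def concave_envelope_value(f, V, x):
    # phi(x) for concave f on M = conv(V) via the LP (29):
    #   minimize sum_i alpha_i f(v^i)
    #   s.t. sum_i alpha_i v^i = x, sum_i alpha_i = 1, alpha >= 0.
    # V holds the vertices v^i as rows.
    k = V.shape[0]
    c = np.array([f(v) for v in V])        # objective coefficients f(v^i)
    A_eq = np.vstack([V.T, np.ones(k)])    # convex-combination constraints
    b_eq = np.append(np.asarray(x, float), 1.0)
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * k)
    return res.fun                         # = phi(x); None if x not in conv(V)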

Theorem IV.7. Let M = conv {v^0,...,v^n} be an n-simplex with vertices v^0,...,v^n, and let f: M → ℝ be concave on M. Then the convex envelope of f on M is the affine function

φ(x) = ax + α , a ∈ ℝⁿ , α ∈ ℝ ,   (30)

which is uniquely determined from the system of linear equations

f(v^i) = av^i + α (i=0,1,...,n) .   (30')

Proof. (30') constitutes a system of n+1 linear equations in the n+1 unknowns a ∈ ℝⁿ, α ∈ ℝ.

Subtracting the first equation from all of the n remaining equations yields

a(v^i − v^0) = f(v^i) − f(v^0) (i=1,...,n) .

The coefficient matrix V^T whose columns are the vectors v^i − v^0 (i=1,...,n) is nonsingular, since the vectors v^i − v^0 (i=1,...,n) are linearly independent. Thus, a and then α are uniquely determined by (30').

But φ(x) = ax + α is affine, and hence convex.

Let x ∈ M, hence x = Σ_{i=0}^n λ_i v^i , Σ_{i=0}^n λ_i = 1 , λ_i ≥ 0 (i=0,...,n).

From the concavity of f it follows that

f(x) ≥ Σ_{i=0}^n λ_i f(v^i) = Σ_{i=0}^n λ_i (av^i + α) = ax + α = φ(x) ,

hence φ(x) ≤ f(x) ∀x ∈ M, and φ is a convex subfunctional of f on M.

Now suppose that there is another convex subfunctional Ψ of f on M and a point x̄ ∈ M satisfying Ψ(x̄) > φ(x̄).


Then x̄ = Σ_{i=0}^n μ_i v^i , Σ_{i=0}^n μ_i = 1 , μ_i ≥ 0 (i=0,...,n), and

Ψ(x̄) ≤ Σ_{i=0}^n μ_i Ψ(v^i) ≤ Σ_{i=0}^n μ_i f(v^i) = Σ_{i=0}^n μ_i φ(v^i) = φ(x̄) < Ψ(x̄) ,

which is a contradiction. ∎

Note that Theorem IV.7 can also be derived from Theorem IV.6. Each point x of an n-simplex M has a unique representation as a convex combination of the n+1 affinely independent vertices v^0,...,v^n. To see this, consider

x = Σ_{i=1}^n α_i v^i + (1 − Σ_{i=1}^n α_i) v^0 , α_i ≥ 0 (i=1,...,n) ,

i.e.,

x = Σ_{i=1}^n α_i (v^i − v^0) + v^0 = Vα + v^0 , α ≥ 0 ,

where α = (α_1,...,α_n)^T and V is the nonsingular matrix whose columns are the vectors v^i − v^0 (i=1,...,n). It follows that α = V^{−1}(x − v^0) is uniquely determined by x.

Hence, by Theorem IV.6, we have

φ(x) = Σ_{i=0}^n α_i f(v^i) ,   (30")

where x = Σ_{i=0}^n α_i v^i , Σ_{i=0}^n α_i = 1 , α_i ≥ 0 (i=0,...,n) is the unique representation of x in the barycentric coordinates α_0,...,α_n of M. It is very easy to see that this function coincides with (30).

It follows from (30") that, whenever the barycentric coordinates of M are used, the system of linear equations (30') does not need to be solved in order to determine φ(x).
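On a simplex, then, the envelope of a concave f is obtained from one (n+1)×(n+1) linear solve; the sketch below (ours) also reproduces the envelope φ(x_1,x_2) = −20x_2 − 500 of Example IV.1 when applied to the initial simplex M_0 there.

import numpy as np

def simplex_envelope(f, V):
    # Affine convex envelope phi(x) = a.x + alpha of a concave f on the
    # n-simplex conv(V) via the linear system (30'):
    #   f(v^i) = a.v^i + alpha, i = 0,...,n.
    # V holds the n+1 vertices as rows.
    m, n = V.shape                          # m = n + 1
    A = np.hstack([V, np.ones((m, 1))])     # row i: [v^i, 1]
    sol = np.linalg.solve(A, np.array([f(v) for v in V]))
    a, alpha = sol[:n], sol[n]
    return lambda x: a @ np.asarray(x, float) + alpha

# e.g., for M_0 of Example IV.1:
# phi = simplex_envelope(lambda v: -(v[0] - 20)**2 - (v[1] - 10)**2,
#                        np.array([[0.0, 0.0], [0.0, 40.0], [40.0, 0.0]]))
# phi((x1, x2)) evaluates to -20*x2 - 500.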

By Theorem IV.7, the construction of convex envelopes is especially easy for concave functions of one real variable f: [a,b] → ℝ over an interval [a,b]. The graph of the convex envelope φ then is simply the line segment passing through the points (a, f(a)) and (b, f(b)).

Another useful result is due to Falk (1969) and Al-Khayyal (1983).
Theorem IV.8. Let M = Π_{i=1}^r M_i be the product of r compact n_i-dimensional rectangles M_i (i=1,...,r) satisfying Σ_{i=1}^r n_i = n. Suppose that f: M → ℝ can be decomposed into the form f(x) = Σ_{i=1}^r f_i(x^i), where f_i: M_i → ℝ is lower semicontinuous on M_i (i=1,...,r). Then the convex envelope φ of f on M is equal to the sum of the convex envelopes φ_i of f_i on M_i, i.e.,

φ(x) = Σ_{i=1}^r φ_i(x^i) .

Proof. We use Theorem IV.5. Let t^i ∈ ℝ^{n_i} (i=1,...,r). Then we have

f*(t) = max_{x∈M} {xt − f(x)} = Σ_{i=1}^r max_{x^i∈M_i} {x^i t^i − f_i(x^i)} = Σ_{i=1}^r f_i*(t^i) ,

and hence

φ(x) = f**(x) = sup_{t∈ℝⁿ} {xt − f*(t)} = Σ_{i=1}^r sup_{t^i∈ℝ^{n_i}} {x^i t^i − f_i*(t^i)} = Σ_{i=1}^r φ_i(x^i) . ∎

Theorem IV.8 is often used for separable functions f, where r = n and the M_i = [a_i,b_i] are one-dimensional intervals, i.e., we have f(x) = Σ_{i=1}^n f_i(x_i), f_i: [a_i,b_i] → ℝ (i=1,...,n).

Note that Theorem IV.8 cannot be generalized to arbitrary sums of functions. For example, the convex envelope of the function f(x) = x² − x² ≡ 0 over the interval [0,1] is the zero function, while the sum of the convex envelopes of x² and −x² is x² − x.
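Combining Theorems IV.7 and IV.8 for a genuinely separable concave f on a box gives a closed-form envelope as a sum of chords; a sketch (ours, not from the book):

def separable_concave_envelope(fs, a, b):
    # Convex envelope on the box [a, b] of f(x) = sum_i f_i(x_i) with each
    # f_i concave: by Theorem IV.7 the i-th envelope is the chord through
    # (a_i, f_i(a_i)) and (b_i, f_i(b_i)); by Theorem IV.8 the envelope of f
    # is the sum of these chords.  fs is a list of one-dimensional functions.
    def phi(x):
        val = 0.0
        for fi, ai, bi, xi in zip(fs, a, b, x):
            slope = (fi(bi) - fi(ai)) / (bi - ai)
            val += fi(ai) + slope * (xi - ai)   # chord value at x_i
        return val
    return phi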

The following theorem was given in Al-Khayyal (1983).

Theorem IV.9. Let f: M → ℝ be lower semicontinuous on the convex compact set M ⊂ ℝⁿ and let h: ℝⁿ → ℝ be an affine function. Then

φ_{f+h} = φ_f + h ,

where φ_f and φ_{f+h} denote the convex envelopes of f and (f+h) on M, respectively.

Proof. By the definition of a convex envelope, we have

f(x) + h(x) ≥ φ_{f+h}(x) ≥ φ_f(x) + h(x) ∀x ∈ M ,

with the last inequality holding because the right-hand side is a convex subfunctional of f+h. Hence,

f(x) ≥ φ_{f+h}(x) − h(x) ≥ φ_f(x) ∀x ∈ M .

Since the middle expression is a convex function, equality must hold in the second inequality. ∎

More on convex envelopes and attempts to determine φ can be found in McCormick (1976 and 1983). Convex envelopes of bilinear forms over rectangular sets are discussed in Al-Khayyal (1983) and Al-Khayyal and Falk (1983). Convex envelopes of negative definite quadratic forms over the parallelepipeds defined by the conjugate directions of the quadratic form are derived in Kalantari and Rosen (1987). We will return to some of these problems in subsequent chapters.



4.4. Duality

Consider the so-called primal optimization problem

(P) minimize f(x)
s.t. x ∈ C , g_i(x) ≤ 0 (i=1,...,m) ,   (31)

where f, g_i: ℝⁿ → ℝ. Assume that g_i is convex (i=1,...,m), f is lower semicontinuous and C is convex and compact.

A dual problem to (31) is defined by

(D) max_{u∈ℝ^m_+} { inf_{x∈C} [f(x) + ug(x)] } .   (32)

Problem (32) has the objective function

d(u) := inf_{x∈C} [f(x) + ug(x)] ,

which is the pointwise infimum of a collection of functions affine in u, and hence concave on the feasible region {u ∈ ℝ^m_+: d(u) > −∞} of (32).

Let inf(P) and sup(D) denote the optimal values of (P) and (D), respectively. Then we have the following familiar result (so-called weak duality).
Then we have the following familiar result (so-<:alled weak duality).

Lemma IV.2. We always have inf(P) ≥ sup(D).

Proof. Let u ≥ 0 satisfy d(u) > −∞, and let x̄ ∈ C satisfy g(x̄) ≤ 0. Then it follows that

d(u) = inf_{x∈C} [f(x) + ug(x)] ≤ f(x̄) + ug(x̄) ≤ f(x̄) . ∎

Note that Lemma IV.2 holds without the convexity and continuity assumptions made on C, g_i and f, respectively.

Let M = C ∩ {x: g_i(x) ≤ 0 (i=1,...,m)}. Then, by Lemma IV.2, any feasible point ū of the dual problem could be a candidate for deriving the lower bound β(M) = d(ū) for min f(M).

It is well-known that for convex f we have inf(P) = sup(D) whenever a suitable "constraint qualification" holds. The corresponding duality theory can be found in many textbooks on optimization, cf. also Geoffrion (1971) for a thorough exposition.

For nonconvex f, however, a "duality gap" inf(P) − sup(D) > 0 has to be expected, and we would like to have an estimate of this duality gap (cf. Bazaraa (1973), Aubin and Ekeland (1976)). A very easy development is as follows (cf. Horst (1980a)).

Let φ be the convex envelope of f on C. Replacing f by φ in the definition of problems (P) and (D), we obtain two new problems, which we denote (P̄) and (D̄), with d̄(u) being the objective function of (D̄). Obviously, since φ(x) ≤ f(x) ∀x ∈ C, one has d̄(u) ≤ d(u) ∀u ∈ ℝ^m_+, and we obtain the following lemma as a trivial consequence of the definition of φ.

Lemma IV.3. inf(P̄) ≤ inf(P), sup(D̄) ≤ sup(D).

Convex duality applies to (P̄) and (D̄) (cf., e.g., Geoffrion (1971)):

Lemma IV.4. If a so-called "constraint qualification" is fulfilled in (P̄), then

inf(P̄) = max(D̄) .

One of the most popular "constraint qualifications" is Slater's condition:

(I) There exists a point x^0 ∈ C such that g_i(x^0) < 0 (i=1,...,m).

(I) depends only on the constraints; hence its validity for (P̄) can be verified on (P).

Another "constraint qualification" applies for linear constraints:

(II) g(x) = Ax − b, A ∈ ℝ^{m×n}, b ∈ ℝ^m, C is a convex polytope, and φ can be extended to an open convex set Ω ⊃ C such that the extension of φ is convex on Ω (cf. Blum and Oettli (1975)).

Combining the preceding lemmas we can easily obtain an estimate for the duality gap inf(P) − sup(D).

Theorem IV.10. Suppose that C ⊂ ℝⁿ is nonempty, convex and compact, f: ℝⁿ → ℝ is lower semicontinuous on C and each g_i: ℝⁿ → ℝ is convex (i=1,...,m). Moreover, suppose that a "constraint qualification" is fulfilled. Then

0 ≤ inf(P) − sup(D) ≤ inf(P) − inf(P̄) ≤ sup_{x∈C} {f(x) − φ(x)} .

Proof. The function φ exists and Lemmas IV.2, IV.3, IV.4 yield

inf(P) ≥ sup(D) ≥ sup(D̄) = inf(P̄) .   (33)

By Lemma IV.4, M = C ∩ {x: g_i(x) ≤ 0 (i=1,...,m)} is nonempty and inf(P̄) is finite. Hence, by (33), we have inf(P) ≠ ±∞, and the first two inequalities in the assertion are fulfilled.

By the definition of sup and inf, we have

inf(P) − inf(P̄) ≤ sup_{x∈M} {f(x) − φ(x)} ,

and obviously (since M ⊂ C) we also have

sup_{x∈M} {f(x) − φ(x)} ≤ sup_{x∈C} {f(x) − φ(x)} . ∎

The quantity sup_{x∈C} {f(x) − φ(x)} may be considered as a measure of the lack of convexity of f over C.

An interesting result in connection with BB procedures is that if, in addition to the assumptions of Theorem IV.10, the constraint functions g_i(x) are affine, then we have sup(D) = sup(D̄), i.e., instead of minimizing φ on M, we can solve the dual of the original problem min {f(x): x ∈ M} without calculating φ (cf., e.g., Falk (1969)). However, since (D) is usually a difficult problem, until now this approach has been applied only for some relatively simple problems (cf., e.g., Falk and Soland (1969), Horst (1980a)).

Theorem IV.11. Suppose that the assumptions of Theorem IV.10 are fulfilled and, in addition, g(x) = Ax − b, A ∈ ℝ^{m×n}, b ∈ ℝ^m. Then we have

sup(D) = sup(D̄) = inf(P̄) .

Proof. The last equation is Lemma IV.4. To prove the first equation, we first observe that

f*(t) = max_{x∈C} {xt − f(x)}

is defined throughout ℝⁿ, by the assumptions concerning f and C. Moreover, f* is a pointwise maximum of affine functions, and hence a convex function. Thus, f* is continuous everywhere and equals its convex envelope, so that we may apply Theorem IV.5 to obtain

f***(t) = f*(t) = φ*(t) ∀t ∈ ℝⁿ .   (34)

Consider the objective function d(u) of the dual (D):

d(u) = min_{x∈C} {f(x) + u(Ax − b)}
     = min_{x∈C} {f(x) + (A^T u)x} − ub
     = −max_{x∈C} {x(−A^T u) − f(x)} − ub
     = −f*(−A^T u) − ub .

From (34) it follows that

d(u) = −φ*(−A^T u) − ub
     = −max_{x∈C} {x(−A^T u) − φ(x)} − ub
     = min_{x∈C} {φ(x) + u(Ax − b)} .

But this is the objective function of (D̄), and since we have in fact shown that the objective functions of (D) and (D̄) coincide, this clearly implies the assertion. ∎
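When g is affine and f is concave on a polytope C given by its vertices, d(u) itself is a finite vertex minimum (f + u·g is again concave), so the weak-duality bound of Lemma IV.2 is directly computable; a sketch (ours, not from the book):

import numpy as np

def dual_lower_bound(f, V, A, b, u):
    # d(u) = min_{x in C} [f(x) + u.(Ax - b)] for C = conv(V) (vertices as
    # rows of V), g(x) = Ax - b and u >= 0.  Since f is concave and u.g is
    # affine, the minimand is concave and attains its minimum at a vertex of
    # C; by Lemma IV.2 the returned value is a lower bound for min f over
    # M = C ∩ {x: Ax <= b}.  (Our own illustration.)
    u = np.asarray(u, float)
    A, b = np.asarray(A, float), np.asarray(b, float)
    return min(f(v) + u @ (A @ v - b) for v in V)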

A related result on the convergence of branch and bound methods using duality bounds is given in Ben-Tal et al. (1994).

We would like to mention that another very natural tool to provide lower (and

upper) bounds in rectangular BB procedures is interval arithmetic (see, e.g., Hansen

(1979 and 1980), Ratschek and Rokne (1984, 1988 and 1995)).

Some other bounding operations that are closely related to specific properties of

special problems will be introduced later.


164

4.5. Consistency

In tbis section, we show that for many important c1asses of global optimization
problems, the examples discussed in the preceding sections yield consistent or
strongly consistent bounding operations whenever the subdivision is exhaustive and
the deletion of infeasible partition elements is certain in the limit (cf. Definitions
IV.6, IV.7, IV.S, IV.lO).

Recall that f denotes the objective function and D denotes the c10sed feasible set
of a given optimization problem. Let SM C M n D be the set introduced in Step k.4
of the prototype BB procedure, and recall that we suppose that er (M) = min f(SM)
is available whenever SM # •.

The following Lemma is obvious.

Lemma IV.5. Suppose tha.t f: MO ---I R is continuous a.nd the subdivision procedure
is ezha.ustive. Furthermore, a.ssume tha.t every infinite decrea.sing sequence {Mql 0f
succusivel1l refined pa.rtition elements sa.tisfiu • # SM CM n D.
q q
Then every strongl,l consistent lower bounding opera.tion 1Iields a. consistent bounding
opera.tion.

ProoC. Exhaustiveness and strong consistency imply that Mq ~ {i}, i e D and


ß(Mq) ~ f(X). The sequence of upper bounds er (M q) associated with Mq is
defined by er (Mq) = min f(SM ) and continuity of f implies that er (M ) - - f(X) . •
q q q-tal

Suppose that MO and all of the partition sets M are polytopes. Recall that the BB
procedure requires that
165

-m < ß (M) ~ min f(M n D), if M is known to be feasible,

-m < ß (M) ~ min f(M) if M is uncertain.

Of course, if the set M is known to be infeasible, then it will be deleted.

Looking at the examples discussed in the preceding sections, we see that ß (M) is
always determined by a certain optimization procedure. A suitably defined function
,: M - - I I is minimized or maximized over a known subset T o{ M:

ß (M) = min {~(x): xE T} (35)

or

ß (M) = max {~(x): x E T} . (36)

Examples.

1. Lipschitzian Opümization (d. Section IVA.l):

Let { be Lipschitzian on MO' let A be an upper bound tor the Lipschitz constant of

{on M, and let 6(M) be the diameter of M.


Then we may set

~1(x) = f(x) - A6(M), T = V'(M), ß (M) = max{~I(x): x E T}, (37)

where V'(M) is a nonempty subset of the vertex set V(M) (see also (14')).

2. Concave Objective Function - Verlex Minima (d. Section IVA.2):

Let f be concave on an open set containing MO' Then we may set

~2(x) = f(x), T = V(M), ß (M) = min {~2(x): x E T} . (38)


166

3. D.C. programming Vertex Minima (cf. Section IVA.2):

Let f = f1 + 12 ' with f1 concave and f2 convex on an open set containing MO'
Then we have

'3(x) = f1(x) + 12(v*) + p*(x-v*), T = V(M), P(M) = min {'3(X): x e T}, (39)

where v* e V(M) and p* e 8f2(v*).

4. ConvexEnvelopes (cf. Sections IVA.3, IVAA):

Suppose that the convex envelope 'PM of f over M is available, and D is such that
min 'PM(D n M) can be calculated. Then we may set

T = D n M, if M is known to be feasible , (40)

T = M, if M is uncertain.

Now suppose that 'PM exists but is not explicitly available. Let D be a polytope.
Then, by Theorem IV.1O, P(M) in (40) can be obtained by solving the dual to
min {f(x): xe T}. Since, however, this dual problem is difficult to solve, this ap-
proach seems to be applicable only in special cases, e.g., if M is a rectangle and Dis
defined by a few separable constraints (cf., e.g., Horst (1980a)).

Let M c M' be an element of a partition of M'. Then it is easily seen that the

bounding operations (37), (39) do not necessarily fulfill the monotonicity require-

ment P(M) ~ P(M') in Step kA of the prototype BB procedure, whereas (38), (40)
yield monotonie bounds. Recall that in the case of nonmonotonic lower bounds we
agreed to use j'1(M):= max {P (M), P(M')} instead of P(M).
Since P(M) ~ j'1 (M) ~ f(x) Vx e M, obviously j'1 is strongly consistent whenever P
is strongly consistent.
167

Similarly, it is clear that a strongly consistent lower bounding operation ß can be


replaced by any lower bounding operation ß' satisfying ß (M) ~ ß '(M) ~ f(x)
Vx E M. For example, in (37), whenever possible, A should be replaced by tighter
local upper bounds AM < A on the Lipschitz constant of f on M.
Let ~q denote the functions ~ on Mq defined in (35), (36). Let ß (M) be deter-
mined by (35), (36), respectively, and associate to each bound ß (M) a point i E M

as follows

i E argmin ~(T), if (35) holds, (41)

i E argmax ~(T), if (36) holds. (42)

In each step of the BB procedure choose i k to be the point associated to fix in


this way.

Now consider the bounding methods in the above example, respectively using

(37), (38), (39), (40). In the first three cases, f is obviously continuous on MO'
Suppose that fis also continuous on MO in case 4 (convex envelopes, (40)).

Proposition IV.3. Suppose that at every step any undeleted partition element can
be further refined. Furthermore, suppose that the "deletion by infeasibility" rule is
certain in the limit and the sub division is exha'ILStive. Then each bounding operation in
the above example given by (97), (98), (99), (~.O), respectively, is strongly consistent.

Proof. Let~.l,q: Mq -+ IR (i=1,2,3) denote the functions defined in (37), (38),

(39), i.e., we have

~1 (x) = f(x) - A 6(M )j f Lipschitzian,


,q q q (43)

~2 (x) = f(x)j f concave,


,q
(44)
168

It is sufficient to demonstrate that, for every decreasing sequence {Mq} of suc-


cessively refi.ned partition elements generated by an exhaustive subdivision such that
Mq - t {i}, there is a subsequence {Mq,} satisfying

(J (M q,) = 'i,q,(iq') q'--: f(i) (i=1,2,3).

Since -p (Mq) is a nondecreasing sequence satisfying (J (Mq) S 7l{Mq) S f(i), it fol-


lows that -p (Mq) q:;: f(i). (Note that in the case of monotonic {J (Mq), we have
(J (Mq) = -P(Mq).)
The assumption concerning the deletion rule then implies that i e D, and hence we
have strong consistency.

In the case i=l, we can assume that Aq S Ao Vq. Then we have

which, by the continuity off and the exhaustiveness, implies that '1 ,q(iq) - I f(i).
q-iaJ

The case i=2 is trivial since '2,q = f, and fis continuous.

In the case i=3, note that v*q q:;: i E D since Mq q:;: {i}. Moreover, since
{~(x): x E MO} is compact (cf., e.g., Rockafellar (1970)), there exists a subse-
quence p*q' =-+
q-iaJ p*. Hence, we have

Now consider the case (40) (convex envelopes). By Theorem IVA, we have
min f(M q) = min IPM (Mq), and hence
q

min f(Mq)S{J(M q) S min f(D n Mq) if Mq is lmown to be feasible,}


(46)
min f(Mq)={J(Mq) i f M ia unc e rtain.
169

---. {i},
From the assumptions we know that Mqq-im xE D. If D n Mq,f. 0 is known
for infinitely many q', then D n Mqq-im
,-r-I {i}. If M ,is uncertain for all but finitely
q
many q', then ß(Mq ,) = rnin f(M q ,). Finally, the continuity off and (46) impIy that
ß (Mq,) -r-I
q-im
f(i}, and hence we have strong consistency.

The following Corollary IV.5 is an immediate consequence of Corollary IV.3.

Corollary IV.5. In the BB procedure suppose that the lower bounding operation is
strongly consistent, and the selection is bound improving. Suppose that 1is continuous
on MO' Then every accumulation point 01 {;;k} solves problem (P).

Note that it is natural to require that all partition sets where ßk is attained are

refined in each step, since otherwise 1\+1 = 1\ holds and the Iower bound is not im-
proved.

5. DELETION BY INFEASIBILITY

In this section, following Horst (1988) certain rules are proposed for deleting in-
feasible partition sets M. These rules, properly incorporated into the branch and
bound concept, will lead to convergent algorithms (cf. Section IVA.5). Since the in-

feasibility of partition sets depends upon the specific type of the feasible set D, three

cases which frequentIy arise will be distinguished:


Convex feasible sets D, intersections of a convex set with finitely many compIe-

ments of convex sets (reverse convex programrning), feasible sets defined by Lip-

schitzian inequalities.

Again suppose that the partition sets M are convex polytopes defined by their vertex

sets V(M).
170

Deletion by Cenainty.

Clearly, whenever we have a procedure that, for each partition set M, can

definitely decide after a reasonable computational effort whether we have M nD = 0


or M n D f 0, then this procedure should be applied (deletion by certainty, cf.
Example IV.3).

Example IV.5. Let D:= {x E IRn: h(x) ~ O}, where h: IRn --I IR is convex. Then a

partition set M is deleted whenever its vertex set V(M) satisfies

V(M) ( {x E IRn: h(x) < O}. (47)

Because of the convexity of the polytope M and the convexity of the set {x E IRn:

h(x)<O}, we obviously have Mn D = 0 if and only if (47) holds.

Convex Feasible Sets.

Let

D:= {x E IRn: g(x) S O} , (48)

where g: IRn --I IR is convex, e.g., g(x) = sup {~(x): i E I} with gi: IRn --I IR convex,

i E I ( IN. Suppose that a point yO satisfying g(yO) < ° is known (the Slater

condition) and that D is compact.

Let M be a partition set defined by its vertex set V(M). If there is a vertex v E

V(M) satisfying v E D, then trivially D n M f 0. However, V(M) n D = 0 does not


imply M n D = 0 (cf., e.g., Example IV.2).
Suppose that V(M) nD = 0. Let p be an arbitrary point of M \ D.
Compute the point z where the line segment [yO,p] intersects the boundary of D.

By convexity of D, we have

z = >.yO + (1 _ >.)p, (49)


171

where ), is the unique solution of the univariate convex programming problem

min {JL E [0,1]: JLY O+ (1 - JL)p E D}. (50)

Equivalently, ), is the unique solution of the equation

g(JLyO + (1 - JL)p) = 0, JL ~ O. (51)

Let s(z) E 8g(z) be a subgradient of g at z. If M is strictly separated from D by

the hyperplane (s(z), x-z) = 0 supporting D at z, then M is deleted, Le., we have the
first deletion rule

(DR 1) Delete a partition element M if its vertex set V(M) satisfles

V(M) ( {x: s(z)(x - z) > O} , (52)

where s(z) and z are defined above.

Clearly, a partition element that is deleted according to (DR 1) must be infeas-

ible. However, it is easy to see that infeasible sets M may exist that do not satisfy

(52).

Intersections of a Convex Set with Finitely Many Complements of a Convex Set.

Let

(53)

where

Dl = {x E IR n : g(x) 5 O} is compact, (54)

(55)

and g, h.: IRn --I IR are convex (j=I, ... ,r).


J
172

°
Assume that a point yO satisfying g(yO) < is known.
Recall that a typical feasible set arising from revene conves: programmiDg can be
described by (53), (54), and (55).
Let

(56)

(DR 2) Delete a partition set M if its vertex set V(M) satisfies either (DR 1)
applied to Dl (in place of D) or if there is a jE (l,... ,r) such that we ove

V(M) (Cj .

Again, it is easy to see that by (DR 2) only infeasible sets will be deleted, but
possibly not all infeasible sets. For arecent related rule, see Fülöp (1995).

Lipschiu Constraints.

Let

(57)

where gi Rn -IR are Lipschitzian on the partition set M with Lipschitz constants Lj
(j=I, ... ,m). Let 6(M) be the diameter of M, and let Aj be the upper bounds for Lj
(j=I, ... ,m). Furthermore, let V'(M) be an arbitrary nonempty subset of the vertex
set V(M). Then we propose

(DR 3) De1ete a partition element M whenever there is a j E (l, ... ,m) satisfying

m.ax {gj(x): x E V'(M)} - Ajf(M) > O. (58)

Again, if M =: {x E IRn: a ~ x ~ b} one might use


1
gj(~ (a + b» -rA. 6(M) > ° (58')
173

rather than (58).

Proposition IV.". Suppose that the subdivision is exhaustive. Then the "deletion by
infeasibility" mles (DR 1) - (DR 9) are certain in the limit.

Proof. a) Let D = {x: g(x) ~ O}, where g: IRn --I IR is convex. Suppose that we
have yO satisfying g(yO) < 0. Apply deletion rule (DR 1). Let {Mq} be a decreasing
sequence of partition sets and let pq E Mq \ D. Consider the line segment [yO, pq].
By exhaustiveness, there is an i satisfying pq - < i.
q-laJ
Suppose that we have i ~ D. Then by the convexity of gon IRn we have continuity
of g on IRn (e.g., Rockafellar (1970)), and IRn \ D is an open set. It follows that there
exists a ball B(i,e):= {x: IIx-ill ~ cl, e > 0, satisfying B(i, e) ( IRn\D, such that
Mq (B(i,c) Vq > ~ , where qo E IN is sufficiently large. Consider the sequence of

points zq E 8D, where [yO,pq] intersects the boundary an of D, pq E Mq:

(59)

Since an is compact and Aq E [0,1], there is a subsequence (that we denote by q'),


satisfying

(60)

Consider the associated sequence of subgradients s(zq') E 8g(zq'). Since {8g(x):


x E an} is compact (cf. Rockafellar (1970)), we may assume (possibly passing to
another subsequence which we again denote by q') that s(zq') qi~ s. But we know
that the set-valued mapping x --I 8g(x) is closed on an (cf. Rockafellar (1970)), and
s
hence we have E 8g(Z). By the definition of a subgradient, it follows that

(61)

But (60) implies that we have

(1 - X) (i - Z) = -X(yO - Z) . (62)
174

Note that X > 0 since otherwise (60) would imply that i = z E an ( D which
contradicts the assumption that i;' D. Likewise, it follows that we have X < 1, since

z= yO is not possible.
) we have -(-
Obviously, by (62, ;;"\
s x-z, -Xs-( y0;;"\
=- -z" 0 < ->. < 1.
I-X
Hence, from (61) it follows that

(5, i- Z) > o. (63)

However, (63) implies that for sufficiently large ql we have

(64)

Since pq' is an arbitrary point of Mq, and Mq, - - I {i}, we see that (64) also holds

for all vertices of Mq" q' sufficiently large. Hence, according to deletion rule (DR 1),

Mq, was deleted, and we contradict the assumptions. Thus, we must have i E D.

b) Let D = D1 n D2 ,where

D1 = {x: g(x) SO},

D2 = {x: hlx) ~ 0, j = 1, ... ,r}.

and g, hf IRn - - I IR are convex (j=I, ... ,r).

Apply deletion rule (DR 2).

By part a) above, we have i E D1.

Let Cf = {x E IRn : hj(x) < O}, j=I, ... ,r. Suppose that we have i ;. D2. Then there

is a jE {1, ... ,r} such that i E Cj , and in a manner similar to the first part of a)

above, we conc1ude that Mq ( Cj for sufficiently large q, since Cj is an open set. This
contradicts the deletion rule (DR 2).
175

e) Finally, let D = {x: glx) ~ 0 (j=l, ... ,m)}, where all gf IRn -i IR are Lipsehitzian

on MO. Suppose that the loeal overestimators AiMq) of the Lipsehitz eonstants

L.(M ) are known (j=l, ... ,m). Sinee the overestimator A.(M ) is an overestimator
J q J q
for LiMq') whenever Mq, ( Mq , we may assume that there is abound A satisfying

(65)

Apply deletion rule (DR 3), and suppose that we have xt D. Sinee Mq -i {X}, by
the eontinuity of gj (j=l, ... ,m) and by (65) it follows that for every sequenee of
nonempty sets V'(M q) ( V(M q ), we have

Sinee xt D, there is at least one j E {l, ... ,m} satisfying gj(X) > O. Taking into
aecount the boundedness of {AlM q ) }, the limit 5(M q) -i 0 and the eontinuity of

gj' we then see from (66) that there is a qo E IN such that

This eontradiets deletion rule (DR 3).



Combining the bounding proeedures diseussed in Seetion IIIA.5 and the above
deletion rules yields a family of BB proeedures for many important classes of global

optimization problems (cf. Part B and C of this book).

A generalization of BB proeedures that allows eertain covers instead of partitions

ean be found in Horst, Thoai and de Vries (1992a).


176

6. RESTART BRANCH AND BOUND ALGORITHM

In this section the combination of outer approximation and branch and bound will

be discussed in the following sense.


Suppose that we apply an outer approximation algorithm to our global optimiza-
tion problem

minimize f (x)
(P)
S.t. xED

and we would like to solve each relaxed problem

minimize f(x)
s.t. xE D v

by a BB-Procedure.

Clearly, since in the basic outer approximation method, each of these problems
differs from the previous one by only a single additional constraint, when solving
(Qv+1) we would like to be able to take advantage of the solution of (Qv). In other
words, in order to handle efficiently the new constraints to be added to the current
feasible set, the algorithm selected for solving the relaxed problems in an outer ap-
proximation scheme should have the capability of being restarted at its previous
stage after each iteration of the relaxation scheme.

An algorithm which proceeds strictly according to the prototype BB procedure


would clearly not satisfy this requirement.

Indeed, suppose we are at Step k of the BB-Procedure for solving (Qv). Let

,k-1 denote the value of the objective at the best feasible point in Dv obtained so
<kv

far. Then certain pa.rtition sets M ma.y be deleted in Step k.1, since ß (M)
,k-1.
~ <kv

These sets, however, may not qualify for deletion when the BB procedure is applied

to solve a subproblem (Q~), ~ > v, since D~ ( D v and possibly <k~,k_1 > <kv,k-1 for
177

all steps k of the BB procedure used to solve (Q ,J


This defect can often be avoided: it suffices, when solving any relaxed problem
(Q) to make sure that SM ( M nD and not just SM ( M n D// (d. Step k.4 of the
BB procedure), so that xk is always the current best feasible solution to (P) and not
just to (Q//).
Since often outer approximation by convex polyhedral sets is used and in many
applications (Q//) can be solved in finitely many steps, we assume that the BB
procedure for solving (Q) is finite.

Restart Branch and Bound - Outer Approximation Procedure (RBB-R)

Let :Y be the family of sets D// admitted in the outer approximation procedure
(d. Chapter 11).
Choose D 1 E :Y such that D 1 J D. Set // = 1.

Apply the finite BB procedure of Section IV.l to problem (Q//) with the conditions
in Step k.4 replaced by

SM ( M n D, ß(M) S inf (M nD//) (67)

and with step k.5 modified as follows.

a) If Qk = f\, then stop: xk solves (P).

b) If ~ > f\ and f\ = f(z//) for some z// E D// \ D, then construct ~ constraint
l//(x) S 0 satisfying {x E D//: l//(x) S O} E :Y, l//(z//) > 0, l//(x) S 0 Vx E D
(d. Chapter 11), and let

(68)

Set v+- v+1 and go to Step k+1 of the BB procedure (applied from now on to

problem (Q//+1»'
178

c) If neither a) nor b) occurs, then go to Step k+1 of the BB-Procedure (with 11

unchanged).

Theorem IV.12. 11 the lamilll:Y is finite, then the (RBB-R)-algorithm


terminates after finitelll manll steps at an optimal solution. 11 conditions Ci) and (ii)
01 Theorem III.l (the convergence theorem for the basic outer approzimation
procedure) are satisfied and the {RBB-R)-Algorithm generates an infinite sequence
{l'}, then every cltl.Ster point 01 {I'} solves (P).

proof. Since DII ) D, in view of (67) we have

Therefore, if it happens that Qk = ~ (which has to be the case for some k, if the
family :y is finite), then f(xk) = min f(D), i.e., xk solves (P).
Now suppose that the algorithm generates an infinite sequence. Then we have

Qk> ßk at every Step k of the BB procedure within the (RBB-R)-aJgorithm. As


long as 11 is unchanged, since D C D II and we have (67), we are in fact applying the
BB procedure to problem (QII)' Since we assume that this algorithm is finite, after
finitely many steps we must have ~ = min f(DJ, i.e., f\ = f(zll) for some Zll E DII•
Note that Zll t D, since otherwise we would have Qk =~. So the case b) must occur
after finitely many steps. We then replace DII by DII+1 a.ccording to (68) and go to
the next Step k+1 with 11 changed to 11+1. That is, we apply the BB-Procedure to

problem (P) starting from the most recent partition and bounds obtained in solving

(QII)' Consequently, if the (RBB-R)-aJgorithm generates an infinite sequence {Zll},


then every zll solves (QII)' Theorem IV.12 now follows from Theorem III.1. •

Note that the (RBB-R)-algorithm may be regarded as 80 single BB procedure 80S

well, and hence Theorem IV.12 can also be verified by proving consistency of the
bounding operation and completeness of the selection.
Applications of the (RBB-R) algorithm will be presented in Part B.
PARTB

CONCAVE MINIMIZATION

Many applications lead to minimizing a concave function over a convex set (cf.
Chapter I). Moreover, it turns out that concave minimization techniqlles also play
an important role in other fields of global optimization.
Part B is devoted to a thorough study of methods for solving concave minim-
ization problems and some related problems having reverse convex constraints.
The methods for concave minimization fall into three main categories: cutting
methods, successive approximation methods, and successive partition methods. Al-
though most methods combine several different techniques, cutting planes play a
dominant role in cutting methods, relaxation and restrietion are the main aspects of
successive approximation, and branch and bound concepts usually serve as the
framework for successive partition.
Aside from general purpose methods, we also discuss decomposition approaches to
large scale problems and specialized methods adapted to problems with a particular
structure, such as quadratic problems, separable problems, bilinear programming,
complementarity problems and concave network problems.
CHAPTER V

CUTTING METHODS

In this chapter we discuss some basic cutting plane methods for concave minim-

ization. These include concavity cuts and related cuts, facial cuts, cut and split pro-

cedures and a discussion of how to generate deep cuts. The important special case of
concave quadratic objective functions is treated in some detail.

1. A PURE CUTTING ALGORlTHM

The basic concave programming (BCP) problem to be studied here is

minimize f(x) (1)

(BCP) s.t. Ax ~ b, (2)

x ~ 0, (3)

where Ais an mxn matrix, x is an n-vector, b is an m-vector, and f: IRn -+ IR is a

concave function. For ease of exposition, in this chapter we shall assume that the

feasible domain D = {x: Ax ~ b, x ~ O} is bounded (i.e., is a polytope) with

int D # 0, and that for any real number a the level set {x: IRn : f(x) ~ a} is bounded.
182

Note that the assumption that fex) is defined and finite throughout IRn is not ful-
filled in various applications. However, we can prove the following result on the ex-

tension of concave functions.

Proposition V.1. Let 10: D - t IR be any concave fv,nction which is contin'UO'US on

D. 11 int D # 0 and IIVIlx)1I is bounded on the set 01 all x e D where Ilx) is diffe.,..
entiable (Vllx) denotes the gradient olloM at the point x), then 10 can be extended
to a concave fv,nction I: IR n - t IR.

Proof. It is well-known that the set C of points x e D where fO(x) is differ-


entiable is dense in int D (cf. Rockafellar (1970), Theorem 25.5). For each point

ye C, consider the affine function hy(x) = fO(Y) + Vfo(Y)(x-y). Then the function
fex) = inf {hy(x): y e C} is concave on IRn (as the pointwise infimum of a family of
affine functions). Since by assumption IIVfO(y)1I is bounded on C, it is easily seen
that -GI < fex) < +1Il for an x e IRn. Moreover, for any x,y e C we have hy(x) ~ fO(x),
while hx(x) = fo(x). Since fex) = fo(x) for an x e C and C is dense in int D, con-
tinuity implies that fex) = fo(x) for an x e D. •
Note that if ~(x) is any other concave extension of fO' then for any x e IRn and
any y e C, we have ~(x) ~ hix), hence ~(x) ~ fex). Thus, fex) is the maximal exten-

sion of fO(x). Also, observe that the condition on boundedness of IIVfO(x)1I is fulfilled

if, for example, fO(x) is defined and finite on some open set containing D (cf. Rocka-

fellar (1970), Theorem 24.7).

1.1. Valid Cuts and a Suflicient Condition for Global Optimality

Let x O be the feasible solution with the least objective function value found so far
by some method. A fundamental question in solving our problem is how to check
whether xOis a global solution.
183

Clearly, by the very nature of global optimization problems, any criterion for glo-

bal optimality must be based on global information about the behaviour of the ob-
jective function on the whole feasible set. Standard nonlinear programming methods
use only local information, and hence cannot be expected to provide global

optimality criteria.

However, when a given problem has some particular structure, by exploiting this
structure, it is often possible to obtain useful sufficient conditions for global op-
timality. For the problem (BCP) formulated above, the structural property to be ex-
ploited can be expressed in either of the following forms:

I. The global minimum of f(x) over any polytope is always attained at some vertex
(extreme point) ofthe polytope (see Theorem 11).

Therefore, the problem is equivalent to minimizing f(x) over the vertex set of D.

H. For any polytope P with vertices u1,u2, ... ,us the number

min {f(u 1),/(u2), ... ,f(us)} is a lower bound for f(D n P).

Here the points ui might not belong to D. Thus, the values of f(x) outside D can
be used to obtain information on the values inside.

These observations underlie the main idea of the cutting method we are going to
present.
First of all, in view of Property I, we can assume that the point xO under consid-

eration is a vertex of D.

Definition V.I. Let 'Y = f(:P). For any x e IRn satisfying f(x) ~ 'Y the point
xO + O(x-xO) such that

0= sup{ t: t ~ 0, f(xO + t(x-i) ~ 'Y} (4)

is called the 'Y-eztension of z (with respect to zO).


184

From the concavity of f(x) and the boundedness of its upper level sets it is imme-
diate that 1 ~ °< +111.
Let y1,y2,,,.,ys denote the vertices of D adjacent to xo (s ~ n). We may assume
that

(5)

(otherwise we know that xo is not optimal). For each i=l,2,,,.,s let


zi = xo + 0i{yi-xO) be the 7--extension of l Normally, these points lie outside D.
Hy Proposition m.1, we know that any solution '11' of the system of linear inequalities

. °
0i'll'{yl-x ) ~ 1 (i=l,2,,,.,s) (6)

provides a ')'-valid cut for (f,D). In other words, we have the following information
on the values off(x) inside the polytope D.

Theorem V.I. (Sufficient condition for global optimality). Let '11' be a solution of
the system (6).
Then

'II'(:r;-i) > 1 for all :r; E D such that f(:r;) < 7. (7)

Hence, if

maz {'II'(:r; - i)::r; E D} ~ 1, (8)

then:r;O is a global optimal solution of (Eep).

Proof. Theorem V.1. is an obvious consequence of Proposition 111.1.



Thus, to check the global optimality of xO we can solve the linear program

°
LP(x ,'II',D) max {'II'{x-xO): xE D} , (9)
185

where 1I{x-xO) ~ 1 is a valid cut for (f,D). If the optimal value of this program does

not exceed 1, then xO is a global optimal solution. Otherwise, we know that any feas-
ible point that is bett er than xO must be sought in the region D n {x: 1I{x-xO) ~ I}
left over by the cut.

Clearly, we are interested in using the deepest possible cut, or at least a cut which
is not dominated by any other one of the same kind. Such cuts correspond to the
basic solutions of (6), and can be obtained, for example, by solving the linear pro-

gram

s
min E
i=l 1
°
o·',,(i. -x) . °
s.t. 0i1l{yl-x) ~ 1 (i=1,2, ... ,s). (10)

When xO is a nondegenerate vertex of D, Le., s = n, the system (6) has a unique

basic solution 11", satisfying


0.1I{yl-x ) = 1
1
(i=l, ... ,n). (11)

This yields

11" = eQ-1 , Q = (1 ° °
z -x , ... ,z2-x ,... ,zn-x0) , (12)

where e = (1,1, ... ,1) and zi is the -y-extension of the i-th vertex yi of D adjacent to
xO. The corresponding cut 1I{x - xO) ~ 1 is then the 7-valid concavity cut as defined
in Definition 1113.

In the general case, where degeneracy may occur, solving the linear program (10)

may be time consuming. Therefore, as pointed out in Section III.2, the most conveni-

ent approach is to transform the problem to the space of nonbasic variables relative

to the basic solution xO. More specifically, in the system (2), (3) let us introduce the

slack variables s = b - Ax E IRm and denote t = (s,x). Let t B = (ti' i E B) and t N =

(ti' i E N) be the basic and nonbasic variables, respectively, relative to the basic
solution t O = (sO,xO), sO = b - AxO, that corresponds to the vertex xO of D. Then,
186

expressing the basic variables in terms of the nonbasic ones, we obtain from (2), (3)
a system of the form

(13)

where tg ~ 0 (since t N = °corresponds to the basic solution xo) and W is an m"n


matrix. The objective function becomes a certain concave function of the nonbasic
variables t N .
Writing (13) as: Wt N ~ tg, t N ~ 0, and changing the notation (x +- t N, A +- W,

b +- tg), we can thus assume that the original constraints (2), (3) have been given
such that xo = 0.

In the sequel, when the constraints have the form (2) (3) with the origin ° at a
vertex xo of D, we shall say that the BCP problem is in standard form with respect
to zOo Under these conditions, if 'Y< f(xo) (e.g., 'Y = f(xo)-c , e > °being the
tolerance), and

Ui = max {t: f(tei ) ~ 'Y} > ° (i=1,2, ... ,n), (14)

then a -y-valid cut is given by


n
E x.jU. > 1 . (15)
i=l 1 1-

More generally , if a cone K O J D is available that is generated by n linearly


independent vectors u1,u2,... ,un , and if 'Y < f(O), and

Ui = max {t: f(tui } ~ 'Y} > ° (i=1,2, ... ,n), (16)

then a -y-valid cut is given by

(17)
187

1.2. Ouiline of the Method

The above sufficient condition for global optimality suggests the following cutting
method for solving the problem (BCP).

Since the search for the global minimum can be restricted to the vertex set of D,
we first compute a vertex xO which is a Iocal minimizer of f(x) over D. Such a vertex
can be found, e.g., as follows: starting from an arbitrary vertex vO, pivot from vO to
a better vertex vI adjacent to vO, then pivot from vI to a better vertex v2 adjacent

to vI, and so on, until a vertex vn = xO is obtained which is not worse than any
vertex adjacent to it. From the concavity of f(x) it immediately follows that
f(xO) ~ f(x) for any x in the convex hull of xO and the adjacent vertices; hence xO is

actually a local minimizer.


Let 7 = f(xO). In order to test xO for global optimality, we construct a -y-valid
cut for xO:

(18)

and solve the linear program LP(xO,1!"°,D). If the optimal value of this linear

program does not exceed 1, then by Theorem V.l, xO is a global minimizer and we
stop. Otherwise, let wO be a basic optimal solution of L(xO,1!"°,D), and consider the
residual polytope left over by the cut (18), i.e., the polytope

(19)

By Theorem V.l, any feasible solution better than xO must be sought only in Dl .

Therefore, starting from wO, we find avertex xl of D l which is a local minimizer of

f(x) over Dl (then f(x l ) ~ f(wO)). It may happen that f(x l ) < 7. Then, by Theorem
V.l, xl must satisfy (18) as a strict inequality; hence, since it is a vertex of Dl' it

must also be a vertex of D. In that case the same procedure as before can be

repeated, but with xl and 71 = f(x l ) replacing xO and 7 = f(xO).


188

More often, however, we have f(x I ) ~ 1. In this ease, the procedure is repeated
with xO +- xl, D +- DI , while 1 is unehanged.
In this manner, after one iteration we either obtain a better vertex of D, or at
least reduee the polytope that remains to be explored. Sinee the number of vertiees
of D is finite, the first situation ean oeeur only finitely often. Therefore, if we ean
also ensure the finiteness of the number of oeeurrenees of the second situation, the
method will terminate sueeessfully after finitely many iterations (Fig. V.I).

Fig. V.I

Algorithm V.I.

Initialization:

Seareh for a vertex xO whieh is a loeal minimizer. Set 1 = f(xO), DO = D.


Iteration k = 0.1 •... :

1) At xk eonstruet a 1-valid eut ?rk for (f,D k ).

2) Solve the linear program


189

Let J. be a basic optimal solution of this linear program. If '/rk(J._xk) ~ 1, then


stop: xO is a global minimizer. Otherwise, go to 3).

3) Let Dk +1 = Dk n {x: '/rk(x_xk ) ~ I}. Starting from J. find a vertex xk+ 1 of

Dk+ 1 which is a local minimizer of f(x) over Dk+ 1 . If f(xk+ 1) ~ 7, then go to


iteration k+l. Otherwise, go to 4).

Theorem V.2. [f the sequence {·i} is bounded, then the above cutting algorithm is
finite.

Proof. The algorithm consists of a number of cycles of iterations. During each


cycle the incumbent xO is unchanged, and each occurrence of Step 4) marks the end

of one cycle and the beginning of a new one. As noticed above, in view of the
inequality f(xk +1) < 7 in Step 4), xk+ 1 satisfies al1 the previous cuts as strict
inequalities; hence, since xk+ 1 is a vertex of Dk+ 1 ' it must be a vertex of D,

distinct from al1 the vertices of D previously encountered. Since the vertex set of D

is finite, it follows that the number of occurrences of Step 4), Le., the number of
cycles, is finite.

Now during each cycle a sequence of cuts ~(x):= ,t(x-xk )-1 ~ °is generated
such that

Since the sequence {'/rk} is bounded, we conclude from Corollary III.2 that each cycle

is finite. Hence the algorithm itself is finite.



Thus, to ensure finiteness of Algorithm V.l we should select the cuts '/rk so as to

have II'/rkll ~ C for some constant C. Note that 1111'/rk ll is the distance from xk to the

hyperplane '/rk(x_xk ) = 1, so these distances (which measure the depth of the cuts)
190

must be bounded away from 0. Though there is some freedom in the choice of 7rk

(condition (6)), it is generaIly very difficult to enforce the boundedness of this se-

quence. In the sections that follow we shaIl discuss various methods for overcoming

this difficulty in cutting plane algorithms.

2. FACIAL CUT ALGORITHM

An advantage of concavity cuts as developed above is that they are easy to con-

struct. Unfortunately, in practical computations, it has been observed that these

cuts, when used alone in a pure cutting algorithm, often tend to become shaIlower

and shaIlower as the algorithm proceeds, thus making the convergence very difficult

to achieve. Therefore, it is of interest to study other kinds of cuts which may be

more expensive but have the advantage that they guarantee finiteness and can be

suitably combined with concavity cuts to produce reasonably practical finite algo-

rithms.

2.1. The Basic Idea

A problem closely related to the concave minimization problem (BCP) is the fol-

lowing:

Vertex problem. Given two polyhedra D and M, find avertex 0/ D lying in M, or


else establish that no such vertex exists.

If we know some efficient procedure for solving this problem, then the concave

programming problem (BCP) can be solved as follows.

Start !rom a vertex xo of D which is a loeal mini mi zer. At step k = O,l, ... ,let 'Yk
be the best feasible value of the objective function known so far, i.e., 'Yk =

min {f(xO), .... ,f(xk )}. At xk eonstruct a 'Yk-valid cut 7rk (x - xk ) ~ 1 for (f,D) and let
191

where 1fi defines a 7i-valid cut for (f,D) at i


Solve the vertex problem for D and Mk. If Mk contains no vertex of D, then stop:
7k = min f(D). Otherwise, let xk +1 be a vertex of D lying in Mk. Go to Step k+l.

Since each cut eliminates at least one vertex of D, the above procedure is obvious-

ly finite.

Despite its attractiveness, this procedure cannot be implemented. In fact, the vertex
problem is a very difficult one, and up to now there has been no reasonably efficient
method developed to solve it. Therefore, following Majthay and Whinston (1974),
we replace the vertex problem by an easier one.

Definition V.2. A face F of a polyhedron D is called an extreme face of D relative

to a polyhedron M if

0f.FnMcriF. (20)

For example, in Fig. V.2, page 186, F l' F 2 and the vertex x are extreme faces of
the polytope D relative to the polyhedral cone M. Since the relative interior of a
point coincides with the point itself, any vertex of D lying in M is an extreme
O-dimensional face of D relative to M.

The following problem should be easier than the vertex problem:

Extreme face problem. Given two polyhedra D and M, find an extreme face of D

relative to M, or else prove that no such face exists.

Actually, as it will be seen shortly, this problem can be treated by linear pro-

gramrning methods.
192

A cutting scheme that uses extreme faces can be realized in the following way: at

each step k find an extreme face of D relative to Mk and construct a cut eliminating

this extreme face without eliminating any possible candidate for an optimal solution.

Since the number of faces of D is finite, this procedure will terminate after finitely

many steps.

Fig. V.2

2.2. Finding an Extreme Face of D Relative to M

Assume now that the constraints defining the polytope D have been given in the

canonical form

X.
1
= p.10 - E p..x. (i e B)
jeJ IJ J
(21)

Xk ~ 0 (k=1, ... ,n), (22)

where B is the index set of basic variables (I B I = m) and J is the index set of

nonbasic variables (I J I = n-m).


193

The following consideration makes use of the fact that a face of D is described in

(21), (22) by setting some of the variables x k equal to zero.

For any x E !Rn let Z(x) = {j: xj = O}. Then we have the following characteristic
property of an extreme face of D relative to M.

Proposition V.2. Let:"o E D n M, FO= {x E D: xj = ° Vj E Z(i)}. Then FOis


an extreme face of D relative to M if and only iffor any i E {1,2, ... ,n} \ Z(xO):

°
Proof. Obviously, F is a face of D containing xO, so FOn M # 0.

°
If F is an extreme face of D relative to M, then for any x E FOn M we must have

x E ri F O' hence xi > ° for any i E {1,2, ... ,n} \ Z(xO) (since the linear function

°
x - xi which is nonnegative on F can vanish at a relative interior point of F only °
if it vanishes at every point of F 0)' In view of the compactness of FOn M, this

implies that °< min {xi: x E FO n M} for any i E {1,2, ... ,n}\Z(xO).

Conversely, if the latter condition is satisfied, then for any x E FOn M we must

have xi> ° for all i; Z(xO), hence xE ri F O' Le., F O n M ( ri F O ' and F O is an
extreme face.

Note that the above proposition is equivalent to saying that a face F O of D that

meets M is an extreme face of D relative to M if and only if Z(x) = Z(x') for any

x,x' E F On M.

On the basis of this proposition, we can solve the extreme face problem by the

following procedure of Majthay and Whinston (1974).

Let M be given by the system of linear inequalities

n
E c..x. > d. (i=n+1, ... ,n) .
j=l IJ r 1
194

Introducing the slack variables xi (i=n+1, ... ,ii) and using (21), (22) we can de-
scribe the polytope D n M by a canonical system of the form

X.
1
= p.10 - jei
E p..x.
IJ J
(i E B) , (23)

Xi ~ ° (i=l, ... ,ii), (24)

where B is the index set ofbasic variables (I BI = m + (ii-n)), j is theindex set of


nonbasic variables (I j I = n-m).
In the sequel, the variables xi (i=l, ... ,n) will be referred to as "original", the
others (Xi' i > n) as "nonoriginal".
Let xO = (x~, ... ,x~) E D n M be the point in the space IRn of original variables ob-
tained by setting x j = 0 for all j E j in (23), (24), i.e., x~ = Pio (i E B, i ~ n),

X~ = ° (j E j, H n).
If j ( Z(xO):= {j E {1,2, ... ,n}: X~ = O}, i.e., if all nonbasic variables are original,
then xO is avertex of D, since in this case the system (23), (24) restricted to i ~ n
gives a canonical representation of D.
In the general case, let

Then we can derive the following procedure.

Proc:edure I.

Starting /rom k=l, solve

min zi s.t. (23), (24) and z· = 0 Vj EZ(Je-l) (25)


k J

Let ek be the optimal value and Je be a basic optimal solution 01 {PIIi. Set k -- k+l
and repeat the procedure until k = s.
195

Proposition V.3. If Z(l) # 0, then F = {x E D: xj = 0 Vj E Z(l)} is an extreme


face of D relative to M. Otherwise, there is no extreme face of D relative to M other
than D itself

Proof. Ifi E {1,2, ... ,n} \ Z(xs), then i = ik for some ik t Z(xs), hence k > 0, i.e., e
° °
< min {xi: x E D n M, xj = Vj E Z(xs)}. Therefore, F is an extreme face by
Proposition V.2 provided Z(xs) # 0. Moreover, if Z(xs) = 0, then °< min {xi:
xe D n M}, from which it easily follows that the only extreme face is D itself. •

Remarks V.L (i) A convenient way to solve (P k) for k=1 is as follows.


Recan that xo is given by (23), (24). If Z(xO) = J n {1,2, ... ,n} (i.e., all the
variables xj ' j E Z(xO), are nonbasic), then to solve (P 1) we apply the simplex
procedure to the linear program

min x.1 subject to (23), (24)* , (25*)


1

where the asterisk means that an nonbasic original variables in (23), (24) should be
omitted.

However, it may occur that Z(xO) nB# 0 (i.e., x~ = ° for certain


i E B n {1,2, ... ,n}). In this case, we must first remove the variables xi' E Z(xo) n B,i
from the basis whenever possible. To do this, we observe that, since x~ = 0, we must
have

0= min {xi: (23), (24) and xj = ° Vj E j n{1,2, ... ,n}} ,

hence Pi,j ~ °for an j E j \ {1,2, ... ,n}. If Pij > °for at least one j E j \ {1,2,... ,n},
then by pivoting on this element (i,j) we will force ~ out of the basis. On the other

D
°
hand, if p.. = for an je j \ {1,2, ... ,n}, this means that x. depends only on the vari-
1

ables X j ' j E j n {1,2, ... ,n}. Repeating the same operation for each i E Z(xO) n B, we
will transform the tableau (23) into one where the only variables xi' i e Z(xO) which
196

remain basic are those which depend only on the nonbasic original variables. Then

and only then we start the simplex procedure for minimizing Xi with this tableau,
1
where we omit, along with the nonbasic original variables (Le., all the columns
j ~ n), also all the basic original variables Xi with i E Z(xO) (Le., all the rows
i E Z(xO)).

In an analogous manner each problem (P k ) is replaced by a problem (P k),


starting with a tableau where a variable Xi ' i E Z(xk- 1), is basic only if it depends
upon the non basic original variables alone.

(ii) The set Z(xk) is equal to the index set Zk of nonbasic original variables in
the optimal tableau of (P k) plus the indices of all of the basic original variables
which are at level zero in this tableau. In particular, this implies that F = {x E D:
x. = ° Vj E Z }, and therefore that F is a vertex of D if and only if IZs I = n-m. We
J 8
can thu8 8top the extreme face finding proces8 when k = s or IZk I = n-m.

2.3. Facial Valid Cuts

Let F = {x ED: xj = ° Vj E Z} be an extreme face of D relative to M that has


been obtained by the above method. If F = D (Le., Z = 0), then, by the definition of
an extreme face, D n M c ri D, 80 that M does not contain any vertex of D. If Fis a
vertex of D, then we already know how to construct a cut which eliminates it with-
out eliminating any better feasible point.

Now consider the case where F is a proper face but not avertex of D

(0< IZI < n-m).

Definition V.3. Let F be a proper face but not avertex of D. A linear inequality
l (x)~O is a facial cut if it eliminates F without eliminating any vertex of D lying in M.

A facial cut can be derived as folIows.


197

Let Qj (j E Z) be prechosen positive numbers and for each h E {1,2,... ,n} \ Z con-
sider the parametric linear program

min {xh: xE D n M, E Q.x. ~ q} , (26)


jeZ J J

where q ia a nonnegative parameter. From Proposition V.2 it followa that

o < min {xh: xe D nM, xj = 0 V j e Z} , (27)

i.e., the optimal value in (26) for q = 0 ia positive. Therefore, if


qh = sup {q: 0 < optimal value of (Ph(q))}, then qh > 0, and hence

p:= min {qh: he {1,2, ... ,n} \ Z} > 0 . (28)

Proposition V.4. Let 0< IZI < n-m. Ifp < +m, then the inequality

E Q.z.~ p (29)
jeZ J J

deftnes a facialvalid cut.

Proof. It is obvioua that the cut e1iminates F. If x e D n M violatea (29), then,


since p ~ qh ' it follows that xh > 0 for all he {1,2, ... ,n} \ Z. Hence, Z J {i: xi = O}j
and, since IZI < n-m, x cannot be a vertex of D.

If p = +m we say that the facial cut ia infinite. Obviously, in that case there is no
vertex of D in M.

Remark V.2. For each h the value qh = aup {q: 0 < optimal value in (Ph(q))}
can be computed by parametric linear programming methods.
Of course, the construction of a facial cut is computationally rather expensive,
even though it only involves aolving linear programs. However, such a cut eliminates
an entire face of D, i.e., all the vertices of D in this face, and this is sometimes worth
the cost. Moreover, if all we need is a valid (even shallow) cut, it is enough for each
198

°
h to choose any q = qh > for which the optimal value in (Ph(q)) is positive.

2.4. A Finite Cutting Algorithm

Since a facial cut eliminates a face of D, the maximal number of facial valid cuts
cannot exceed the total number of distinct faces of D. Therefore, a cutting procedure
in which facial valid cuts are used each time after finitely many steps must be finite.
The following modification of Algorithm V.l is based upon this observation.

Algorithm V.2.

Initialization:

Search for a vertex xOwhich is a Iocal minimizer. Set 7 = f(xo), DO= D.

Select two numbers 50 ~ 0, N > 1. Set dO = +m.

Iteration k = 0,1, ... :

0) If k = 0 or dk_ 1 ~ 6k_ l , go to la); otherwise, go to Ib).

la) Construct a 7-valid cut 4c(x):= Jt(x-xk) - 1 ~ 0 for (f,D k) at x~


k
Set dk = 1/11'11" 11, 6k = 6k- l and go to 2).

Ib) Starting from xk, identify an extreme face Fk of D relative to Dk


(the intersection of D with all previously generated cuts).
If Fk = D, then stop: xO is an optimal solution of (BCP).

If Fk is a vertex of D (Le., Fk = xk), then construct at xk a -y-valid cut


4c(x):= 'll"k(x_xk) -1 ~ 0 for (f,D). Set dk = 1/IIJtIl, 5k = ~k-l and go to 2).

If Fk is a proper face but not a vertex of D, construct a facial valid cut 4c(x) ~ 0.
If this cut is infinite, then stop: xO is a global optimal solution; otherwise, set
1
dk = +m, 6k = "Nök-l and go to 3).
199

2) Solve the linear program

max lk(x) subject to x E Dk

to obtain a basic optimal solution wk of this problem. If lk( uf) ~ 0, then stop: xo
is a global optimal solution. Otherwise, go to 3).

3) Let Dk + l = Dk n {x: ~(x) ~ o}. Find a vertex x k+1 of Dk+1 which is a local
minimizer of f(x) over Dk + l . If f(x k + l ) ~ i, go to iteration k+l. Otherwise, go
to 4).

Theorem V.3. The above algorithm i3 finite.

Proof. Like Algorithm V.l, the above procedure consists of a number of cydes of
iterations, each of which results in a vertex of D (the point xk+ l in Step 4) which is

better than the incumbent one. Therefore, it suffices to show that each cyde is
finite. But within a given cyde, since the number of facial cuts is finite, there must

exist a kO such that Step la) occurs in all iterations k ~ kO· Then d k ~ Dk = Dk for

all k > kO' Le., lI1Tk ll ~


°
l/Dk . So the sequence II1Tk ll is bounded, which, by Theorem
o
V.2, implies the finiteness of the algorithm.

Remark V.3. The above algorithm differs from Algorithm V.l only in Step lb),
which occurs in iteration k when dk_ l < Dk_ l . Since dk- l = l/lI1Tk- l ll is the

distance from xk- l to the cutting hyperplane (when a i-valid cut has been applied

in iteration k-l), this means that a facial cut is introduced if the previous cut was a

'"(-valid cut that was too shallow; moreover, in that case Dk is decreased to kDk_ l '
so that a facial cut will have less chance of being used again in subsequent iterations.

Roughly speaking, the (D,N) device allows facial cuts to intervene from time to time

in order to prevent the cutting process from jamming, while keeping the frequency of
these expensive cuts to a low level.
200

Of course, the choice of the parameters 60 and N is up to the user, and N may

vary with k. If 60 is close to zero, then the procedure will degenerate into Algorithm

V.l; if 60is large, then the procedure will emphasize on facial cuts.

In practice, it may be simpler to introduce facial cuts periodically, e.g., after

every N cuts, where N is some natural number. However, this method might end up

using more facial cuts than are actually needed.

Example V.l. We consider the problem

minimize - (Xl - 1.2)2 - (~ - 0.6)2

subject to -2xl + ~ ~ 1 ,
x2 ~ 2 ,
xl+~~4,
Xl ~ 3 ,
0.5xl-~ ~ 1 ,
Xl ~ 0, ~ ~ O.

Suppose that in each cycle of iterations we decide to introduce a facial cut after
every two concavity cuts. Then, starting from the vertex xO = (0,0) (a local minim-
izer with f(xO) = -1.8), after two cycles of 1 and 3 iterations, respectively, we find
the global minimizer x* = (3,1) with f(x*) = -3.4.

Cycle 1.

Iteration 0: x O = (0,0), 7 = -1.8; concavity cut: 0.4l7xl + 0.833~ ~ 1.


Cycle 2.

Iteration 0: xO = (3,1), 7 =-3.4; concavity cut 1.964xl + 1.250~ ~ 6.143;


Iteration 1: xl = (0.5,2); concavity cut: -5.26lxl + 0.75l~ ~ 0.238.

Iteration 2: X
2 = (2.856,0.43).

Extreme face containing x 2: 0.5xl - ~ = 1.


Facial cut: 0.5x l - ~ ~ l-p, where p > 0 can be arbitrarily large.
201

Since the facial cut is infinite, the incumbent x* = (3,1) is the global optimizer (Fig.
V.3).

Fig. V.3

3. CUT AND SPLIT ALGORITHM

A pure cutting plane algorithm for the BCP problem can be made convergent in
two different ways: either by introducing special (usuaIly expensive) cuts from time
to time, for example, facial cuts, that will eliminate a face of the polytope D; or by

the use of deep cuts that at one stoke eliminates a sufficiently "thick" portion of D.

A drawback of concavity cuts is that, when used repeatedly, these cuts tend to

degrade and become shaIlow. To overcome this phenomenon, we could attempt to

strengthen these cuts: later, in Section VA, we shaIl examine how this can be done

in certain circumstances.
202

In the general case, a procedure which has proven to be rather efficient for making

deep cuts is in an appropriate manner to combine cutting with splitting


(partitioning) the feasible domain.

3.1. Partition of a Cone

Let us first describe a construction which will frequently be used in this and

subsequent chapters.
To simplify the language, in the sequel, unless otherwise stated, a cone always

means a convex polyhedral cone vertexed at the origin 0 and generated by n linearly

independent vectors. A cone K generated by n vectors zl,z2, ... , zn forming a non-


singular matrix Q = (z l i,... ,zn) will be denoted: K = con(Q).

Since a point x belongs to K if and only if x = E~izi = Q~ T with


~ = (~1'~2""'~n) ~ 0, we have
(30)

Now let u be any point of K, so that Q-1 u = (~1'~2"'.'~n.)T ~ O. Let


1= {i: ~i > O}. For each i E I, zi Can be expressed as a linear combination of zj
(j # i) and u, namely: zi = (u - .~. ~j~) / ~ij
hence the matrix Qi = (zl, ... ,zi-l,u,
Jr1
zi+1, ... ,zn) obtained from Q by substituting u fOI zi is still nonsingular. Denoting

Ki = con(Qi)' we then have the following fact whose pIoof is analogous to that of

Proposition IV.l:

(int K i ) n (int K j ) = 0 (j # i) j
K = U{Ki : i E I} .

In other words, the cones Ki (i E I) constitute a partition of K. In the sequel, this


partition will be referred to as the partition (qlitting) 0/ the cone K with respect

to u.
203

In actual computations, we work with the matrices rather than with the cones.

Therefore, it will often be more convenient to speak of matrix partitioning instead of

cone partitioning. Thus, we shall say that the matrices Qi (i EI), with Qi = (zi, ... ,

zi-I ,u,zi+l , ... ,z,


n) E
torm th · · 01Q = (1
e partitIOn z ,... ,z n).
Wlt h respect to u. Note t hat

the partitions of a matrix Q with respect to >.u for different>. > 0, lead to different
matrices, although the corresponding partitions of the cone K = con(Q) are the

same.

3.2. Outline of the Method

We consider problem (BCP) in the form (1), (2), (3).

Let e ~ 0 be a prescribed tolerance. Our aim is to find a global g-optimum, Le., a

feasible solution x* such that

f(x*) - e ~ min f(D).

Let us start with a vertex xO of D which is a local mini mi zer of f(x) over D. Set
, = f(xO), 0: = ,-c. Without loss of generality we may always assume that the
problem is in standard form with respect to xO (cf. Section V.l), so that xO = 0 and

condition (14) holds.

Let G denote the set of


0:
an points Z E IRn such that z is the a-extension of some
y:/: o. Clearly, since f(x) is assumed to have bounded level sets, Go: is a compact set
and

G = {z: f(z) =
0:
0:, f(>'z) < 0: V>. > I} .

For each i=I,2, ... ,n let zi be the point where Go: meets the positive xi-ms.

Then, as already seen from (17), the vector


204

defines an a-valid cut for (f,D) at xO.

Consider the linear program

LP(Q,D) max eQ-Ix S.t. x e D ,

and let w = w(Q) be a basic optimal solution and # = #(Q) be the optimal value of
thislinear program (i.e., #(Q) =eQ-Iw).
If # ~ I, then xO is a global e-optimizer. On the other hand, if f(w) < a, then we
can find a vertex xl of D such that f(x l ) ~ f(w) < a. Replacing the polytope D by
D n {x: eQ-Ix ~ I}, we can rest art from xl instead of xO. So the case that remains
to be examined is when # > I, but f(w) ~ a (Fig. V.4).
As we already know, a feasible point x with fex) < -y-e should be sought only in
the residual polytope D n {x: eQ-Ix ~ I} left over by the cut. In order to cut off
more of the unwanted portion of D, instead of repeating for this residual polytope
what was done for D (as in Algorithm V.I) we now construct the a-ilXtension Wof w
and split the cone K = con(Q) with respect to w(Fig. VA).
Let Qi ' i e I, be the corresponding partition of the matrix Q:

Qi = (zi ,,,.,zi-I·,w,Zi+1 ,,,.,z.


n)

Now note that for each subcone Ki = con(Qi) the cut through zl,,,.,zi-l,w,
i+1,,,.,zn does not eliminate any point x of Ki with fex) < -y-e. (This can be seen in
the same way that one sees that the cut through zli,,,.,zn does not eliminate any
point x e K with fex) < -y-e.) Therefore, to check whether there is a feasible point x
with fex) < -y-e in any subcone Ki' for each i e I we solve the linear program

max eQi x
-1 -1
s.t. x e D, Qi x ~ °.
205

Note that the constraint Qjlx ~ 0 simply expresses the condition that x E Ki =
con(Qi) (see (30)). If all of these linear programs have optimal values ~ 1, this
indicates that no point x E D in any cone Ki has fex) < 7-t:, and hence, that x O is a
global c:-optimal solution. Otherwise, each subcone K i for which the linear program

LP(Qi'D) has an optimal value > 1 can be furt her explored by the same splitting
method that was used for the cone K.

f(x)=O - (

Fig. V.4

In a formal way we can give the following algorithm.

Algorithm V.3.

Select c: ~ O.

Initialization:

Compute a point z E D. Set M = D.


206

Phase I.

Starting from z search for avertex xO of M which is a loca.l minimizer of f(x)

over M.

Phase 11.

0) Let 7 = f(xo), a = -y-e. Rewrite the problem in standard form with respect to
xO. Construct QO = (zOl,z02, ... ,zOn), where zOi is the intersection of Ga with the
i-th edge of K o ' Set .Jt = {Qo}'

1) For each Q E .Jt solve the linear program

LP(Q,M) max eQ-lx S.t. x E M , Q-lx ~ °


to obtain a basic optimal solution w (Q) and the optimal value

",(Q) = eQ-1w(Q). Hf(w(Q» < afor some Q, then set

and return to Phase I. Otherwise, go to 2).

2) Let .9t = {Q E .Jt: ",(Q) > I}. H .9t = 0, then stop: xO is a global c-Qptimal
solution. Otherwise, go to 3).

3) For each Q E .9t construct the a-extension w(Q) of w (Q) and split Q with
respect to w(Q). Replace Q by the resulting partition and let .Jt' be the resulting
collection of matrices. Set .Jt I - .Jt' and return to 1).

3.3. Remarks V.4

(i) The algorithm involves a sequence of cydes of iterations. Every "return to

Phase I" indicates the passage to a new cyde. Within a cyde the polytope M that

remains to be explored and the incumbent xO do not change, but from one cyde to
207

the next the polytope M is reduced by a cut

-1
eQO x ~ 1 ,

while xO changes to a better vertex of D. (Note that QO = (zOl,z02 ... ,zOn) is the

matrix formed in Step 0 of Phase II, and hence determines an a-valid cut for (f,M)
at xO; the subsequent cuts eQÖ1x ~ 1 in this cycle cannot be used to reduce M,
because they are not a-valid for (f,M) at xO.) Under these conditions, it is readily
seen that at each stage the current xO satisfies every previous cut as a strict
inequality. Therefore, since xO is a vertex of M, it must also be a vertex of D.
Moreover, since the vertex set of D is finite, the number of cycles of iterations must

also be finite.

°
(ii) The value of c: ~ is selected by the user. If c: is large, then few iterations will
be needed but the accuracy of the solution will be poor; on the other hand, if c: is
small, the accuracy will be high but many iterations will be required. Also, since the
minimum of f(x) is achieved at least at one vertex, if c: is smaller than the difference
between the values of f at the best and the second best vertex, then a vertex xO
which is globally c:--optimal will actually be an exact global optimal solution. There-

fore, for c: small enough, the solution given by Algorithm V.3 is an exact global
optimizer.

(iii) The linear programs LP(Q,M) can be given in a more convenient form which does not require computing the inverse matrices Q^{-1}. Indeed, since the problem is in standard form with respect to x^0, the initial cone K_0 in Phase II is the nonnegative orthant. Then any cone K = con(z^1,z^2,…,z^n) generated in Phase II is a subcone of K_0, and the constraints Q^{-1}x ≥ 0 (i.e., x ∈ K) imply that x ≥ 0. Therefore, if M = D ∩ {x: Cx ≤ d}, where Cx ≤ d is the system formed by the previous cuts, then the constraints of the linear program LP(Q,M) are

Ax ≤ b,  Cx ≤ d,  Q^{-1}x ≥ 0.

Thus, in terms of the variables (λ_1,λ_2,…,λ_n) = Q^{-1}x, this linear program can be written as

LP(Q,M)   max Σ_{j=1}^n λ_j
          s.t.  Σ_{j=1}^n λ_j (Az^j) ≤ b,  Σ_{j=1}^n λ_j (Cz^j) ≤ d,
                λ_j ≥ 0  (j=1,2,…,n).

In this form, all the linear programs LP(Q,M) corresponding to different cones K = con(Q) have the same objective function. If u ∈ con(Q) and Q′ is obtained from Q by replacing a certain column z^i of Q by u, then LP(Q′,M) is derived from LP(Q,M) simply by replacing Az^i, Cz^i by Au, Cu respectively.

Moreover, if z^{0j} = θ_j e^j, where e^j denotes the j-th unit vector, then Q_0^{-1} = diag(1/θ_1,1/θ_2,…,1/θ_n), and e Q_0^{-1}x = Σ_j x_j/θ_j. Consequently, the linear programs LP(Q,M) can easily be derived, and we do not need to compute the corresponding inverse matrices Q^{-1}. Furthermore, if the dual simplex method is used to solve these linear programs, then the optimal solution of an LP(Q,M) is dual feasible for any LP(Q′,M) corresponding to an immediate successor Q′ of Q, and therefore can be used to start the solution process for LP(Q′,M).

(iv) For ease of exposition, we assumed that the function f(x) has bounded level sets. If this is not the case, we cannot represent a cone by a matrix Q = (z^1,…,z^n), where each z^i is the intersection of the i-th edge with G_α (because this intersection may not exist). Therefore, to each cone we associate a matrix Q = (z^1,…,z^n), where each z^i is the intersection point of the i-th edge with G_α if this intersection exists, or the direction of the i-th edge otherwise. Then, in the linear subproblem LP(Q,M) the vector e should be replaced by a vector with its i-th component equal to 1 if z^i is a point or equal to 0 if z^i is a direction. If I = {i: z^i is a point}, this subproblem can similarly be written as
max Σ_{i∈I} λ_i
s.t.  Σ_{i=1}^n λ_i (Az^i) ≤ b,  Σ_{i=1}^n λ_i (Cz^i) ≤ d,
      λ_i ≥ 0  (i=1,2,…,n).

With these modifications, Algorithm V.3 still works for an arbitrary concave function f: ℝ^n → ℝ.

(v) An algorithm which is very similar to Algorithm V.3, but with ε = 0 and the condition Q_i^{-1}x ≥ 0 omitted in LP(Q_i,M), was first given in Tuy (1964). This algorithm was later shown to involve cycling (cf. Zwart (1973)). Zwart then developed a modified algorithm by explicitly incorporating the condition Q_i^{-1}x ≥ 0 into the linear program LP(Q_i,M) and by introducing a tolerance parameter ε ≥ 0. When ε = 0, Zwart's algorithm and Algorithm V.3 coincide; when ε > 0, the two algorithms differ only in the use of this parameter.

While Zwart's algorithm for ε > 0 is computationally finite, it may give an incorrect solution, which is not even an ε-approximate solution in the sense of Zwart (cf. Tuy (1987b)).


On the other hand, as it stands, Algorithm V.3 is not guaranteed, theoretically, to terminate in finitely many steps. However, the computational experiments reported by Zwart (1974) as well as more recent ones seem to suggest that the algorithm will be finite in most cases encountered in practice. Moreover, it turns out that, theoretically, the algorithm can be made finite through the use of an appropriate anti-jamming device. The resulting algorithm, called the Normal Conical Algorithm, will be discussed in Chapter VII.

Let us also mention two other modifications of Tuy's original algorithm: one by Bali (1973), which is only slightly different from Zwart's algorithm (for ε = 0), the other by Gallo and Ülkücü (1977), in the application to the bilinear programming problem. The algorithm of Gallo and Ülkücü has been proved to cycle in an example of Vaish and Shetty (cf. Vaish and Shetty (1976 and 1977)). The convergence of Bali's algorithm, like that of the ε = 0 version of Zwart's algorithm or Algorithm V.3, is problematic, although no counter-example has been found yet.

The only result in this regard that has been established is stated in the following proposition (cf. Jacobsen (1981)):

Proposition V.5. Let Q_i = (z^{i1},z^{i2},…,z^{in}), i=1,2,…, be the matrices generated in a Phase II of Algorithm V.3, where the index system is such that i < j if Q_i is generated before Q_j. Suppose that ε = 0. If e Q_j^{-1}z^{ik} ≤ 1 (k=1,2,…,n) for any i < j, then Phase II is finite, unless x^0 is already a global optimizer.

Proof. Suppose that a Phase II is infinite, while there is a feasible solution x* better than x^0. Denote ω^j = ω(Q_j), L_j = {x: e Q_j^{-1}x ≤ 1}. By construction, we have ω^j ∉ L_j, and by hypothesis ω^i ∈ L_j for any j > i. Hence,

ω^j ∈ ( ∩_{i<j} L_i ) \ L_j  for every j.

By Lemma III.2, this implies that d(ω^i, L_i) → 0 as i → ∞. Now for any i, x* must belong to some cone K_j = con(Q_j) with j > i, and from the definition of ω^j = ω(Q_j) we have d(x*, L_j) ≤ d(ω^j, L_j). Therefore, d(x*, L_j) → 0 as j → ∞.

On the other hand, by hypothesis, the halfspace L_j entirely contains the polytope spanned by 0, z^{01},z^{02},…,z^{0n} and ω^0 = ω(Q_0). Hence, the distance from x^0 = 0 to the hyperplane H_j = {x: e Q_j^{-1}x = 1}, i.e., 1/‖e Q_j^{-1}‖, must be bounded below by some positive constant δ.

Let y^j denote the intersection of H_j with the halfline from 0 through x*, and note that ‖x* − y^j‖/d(x*, L_j) = ‖y^j‖/d(0, H_j), which implies that ‖x* − y^j‖ ≤ ‖y^j‖ d(x*, L_j)/δ → 0. But since y^j belongs to the simplex [z^{j1},z^{j2},…,z^{jn}] with f(z^{jk}) = γ (k=1,2,…,n), we must have f(y^j) ≥ γ. This contradicts the assumption that f(x*) < γ, since we have just established that x* − y^j → 0 as j → ∞.


For every cone K = con(Q) generated by Algorithm V.3, let Δ(K) = K ∩ {x: e Q^{-1}x ≤ 1}. Then the convergence condition stated in the above proposition is equivalent to requiring that at the completion of any iteration in Phase II, the union of all Δ(K) corresponding to all of the cones K that have been generated so far (in the current Phase II) is a polytope. In the next chapter, we shall present an algorithm which realizes this condition (the Polyhedral Annexation Algorithm).

4. GENERATING DEEP CUTS: THE CASE OF CONCAVE QUADRATIC

FUNCTIONALS

The basic construction in a cutting method for solving (BCP) is, for a given feasible value γ (the current best value) of the objective function f(x), to define a cutting plane which deletes as large as possible a subset of D ∩ {x: f(x) ≥ γ}. Therefore, although shallow cuts which can delete some basic constituent of the feasible set (such as a face) may sometimes be useful, we are more often interested in generating a deep cut.

4.1. A Hierarchy of Valid Cuts

Consider a vertex x^0 of D which is a local minimizer of f(x) over D, and let α ≤ f(x^0). As usual, we may assume that the problem is in standard form with respect to x^0 and that (14) holds. So x^0 = 0, D ⊂ ℝ^n_+ and

θ_i = max {τ: f(τe^i) ≥ α} > 0  (i=1,2,…,n),

where e^i is the i-th unit vector. We already know that an α-valid cut for x^0 is furnished by the concavity cut

Σ_{i=1}^n x_i/θ_i ≥ 1.  (31)

Our aim is to develop α-valid cuts at x^0 which may be stronger than the concavity cut.

Suppose that we know a continuous function F(u,v): ℝ^n×ℝ^n → ℝ such that:

(i) F(x,x) = f(x)  ∀x ∈ D;  (32)

(ii) F(u,v) ≥ min {f(u), f(v)}  ∀u,v;  (33)

(iii) F(u,v) is concave in u for every fixed v and affine in v for every fixed u.  (34)

For t = (t_1,…,t_n) > 0, let

Δ(t) = {x ∈ D: Σ_i x_i/t_i ≤ 1},   M(t) = {x ∈ D: Σ_i x_i/t_i ≥ 1},

and define the function

φ_t(x) = min {F(x,v): v ∈ M(t)}  (35)

(where we set φ_t(x) = −∞ if F(x,·) is not bounded from below on M(t)).
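Evaluating φ_t(x) thus amounts to one linear program. A sketch of ours, under the assumptions that D = {v: Av ≤ b, v ≥ 0} and that the caller supplies F(x,·) in the affine form F(x,v) = a_x·v + β_x:

    # phi_t(x) = min{F(x,v): v in M(t)}, with
    # M(t) = {v: Av <= b, v >= 0, sum(v_i/t_i) >= 1}.
    import numpy as np
    from scipy.optimize import linprog

    def phi_t(a_x, beta_x, A, b, t):
        A_ub = np.vstack([A, -1.0 / np.asarray(t, float)])  # cut row
        b_ub = np.append(b, -1.0)                           # sum(v_i/t_i) >= 1
        res = linprog(c=a_x, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * len(t))
        if res.status == 3:       # LP unbounded below: phi_t(x) = -infinity
            return -np.inf
        return res.fun + beta_x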

Proposition V.6. The function φ_t(·) is concave and satisfies

min {φ_t(x): x ∈ M(t)} = min {f(x): x ∈ M(t)}.  (36)

Proof. The concavity of the function in (35) follows from assumption (iii). Since, by assumption (ii), F(x,v) ≥ min {f(x), f(v)} ≥ min f(M(t)) for any x,v ∈ M(t), we have:

φ_t(x) ≥ min f(M(t))  ∀x ∈ M(t).  (37)

But min {F(x,v): v ∈ M(t)} ≤ F(x,x); hence, in view of (i):

φ_t(x) ≤ f(x)  ∀x ∈ M(t).

This, together with (37), yields (36).



Theorem V.4. Let t > 0 define an α-valid cut:

Σ_i x_i/t_i ≥ 1,  (38)

and let t_i^* = max {τ: φ_t(τe^i) ≥ α} (i=1,2,…,n). If

φ_t(t_i e^i) ≥ α  (i=1,2,…,n),  (39)

then t^* ≥ t and the inequality

Σ_i x_i/t_i^* ≥ 1  (40)

is an α-valid cut for (f,D).

Proof. The inequality t^* ≥ t follows from the definition of t^*. Then Δ(t^*) ∩ M(t) is contained in a polytope with vertices u^i = t_i e^i, v^i = t_i^* e^i (i=1,2,…,n). Since we have φ_t(u^i) ≥ α, φ_t(v^i) ≥ α (i=1,2,…,n), by (39) and the definition of t^*, it follows from the concavity of φ_t(x) that φ_t(x) ≥ α ∀x ∈ Δ(t^*) ∩ M(t). Therefore,

f(x) = F(x,x) ≥ φ_t(x) ≥ α  ∀x ∈ Δ(t^*) ∩ M(t).

Moreover, the α-validity of the cut (38) implies

f(x) ≥ α  ∀x ∈ Δ(t).

Hence, f(x) ≥ α ∀x ∈ Δ(t^*), which proves the theorem.




Referring to the concavity cut (31), we obtain the following version of Theorem V.4.

Corollary V.1. Let θ = (θ_1,θ_2,…,θ_n) define the concavity cut (31). For each i = 1,2,…,n let

t_i = max {τ: φ_θ(τe^i) ≥ α},  (41)

y^i ∈ argmin {F(θ_i e^i, v): v ∈ M(θ)}.  (42)

If f(y^i) ≥ α (i=1,2,…,n), then t_i ≥ θ_i (i=1,2,…,n) and the inequality

Σ_i x_i/t_i ≥ 1  (43)

is an α-valid cut.

Proof. If f(y^i) ≥ α (i=1,2,…,n), then φ_θ(θ_i e^i) = F(θ_i e^i, y^i) ≥ min {f(θ_i e^i), f(y^i)} ≥ α, so with t = θ the conditions of Theorem V.4 are satisfied.



Remark V.5. If property (ii) holds with strict inequality in (33) whenever u ≠ v, i.e., if we have

F(u,v) > min {f(u), f(v)}  ∀u ≠ v,  (44)

then under the conditions of the above corollary we will have t_i > θ_i, provided that θ_i e^i ∉ D, because then y^i ≠ θ_i e^i and consequently φ_θ(θ_i e^i) > min {f(θ_i e^i), f(y^i)} ≥ α. This means that the cut (43) will generally be strictly deeper than the concavity cut (Fig. V.5).

Moreover, note that for the purpose of solving the problem (BCP), a cut construction is only a means to move towards a better feasible solution than the incumbent one whenever possible. Therefore, in the situation of the corollary, if f(y^i) < α for some i, then the goal of the cut is achieved, because the incumbent has improved.

Fig. V.5

These results lead to the following procedure.

Iterative Cut Improvement Procedure:

Start with the concavity cut (31).

Iteration 1:

Compute the points y^i (i=1,2,…,n) according to (42) (this requires solving a linear program for each i, since F(u,v) is affine in v for fixed u).

If f(y^i) < α for some i, then the incumbent has improved.

Otherwise, compute the values t_i (i=1,2,…,n) according to (41). Then by Corollary V.1, t ≥ θ, so that the cut defined by t is deeper than the concavity cut provided t ≠ θ (which is the case if (44) holds).

Iteration k > 1:

Let (38) be the current cut. Compute t_i^* = max {τ: φ_t(τe^i) ≥ α} (i=1,2,…,n). Since t ≥ θ, we have M(t) ⊂ M(θ); hence φ_t(t_i e^i) ≥ φ_θ(t_i e^i) ≥ α (i=1,2,…,n). Therefore, by Theorem V.4, t^* ≥ t, and the cut defined by t^* is deeper than the current one provided t^* ≠ t.

Thus, if condition (44) holds, the above iterative procedure either leads to an improvement of the incumbent, or else generates an increasing sequence of vectors θ ≤ t ≤ t^* ≤ t^{**} ≤ …, which define an increasing sequence of valid cuts (the procedure stops when successive cuts do not differ substantially).

Of course, this scheme requires the availability of a function F(u,v) with properties (i), (ii), (iii).

In the next section we shall see that this is the case if f(x) is quadratic. For the

moment, let us mention a result which sometimes may be useful.

Proposition V.7. If a function F(u,v) satisfies (ii) and (iii), then the function f(x) = F(x,x) is quasiconcave.

Proof. For any x,y ∈ ℝ^n and λ,μ ≥ 0 such that λ+μ = 1, we have f(λx+μy) = F(λx+μy, λx+μy) = λF(λx+μy, x) + μF(λx+μy, y) ≥ λ[λF(x,x) + μF(y,x)] + μ[λF(x,y) + μF(y,y)]. Hence, if γ = min {f(x), f(y)}, then f(λx+μy) ≥ λ[λγ+μγ] + μ[λγ+μγ] = λγ + μγ = γ. This proves the quasiconcavity of f(x).



Concerning the applicability of this proposition, note that most of the material discussed in this chapter can easily be extended to quasiconcave minimization problems.

Another important issue when implementing the above scheme is how to compute the values t_i according to (41) (or the values t_i^* in Theorem V.4). For this we consider the linear program

min {F(τe^i, v): v ∈ M(θ)},  (45)

whose objective function is a concave function of the parameter τ by assumption (iii). Since the optimal value φ_θ(τe^i) of this linear program is concave in τ (by Proposition V.6), and since φ_θ(θ_i e^i) ≥ α, it follows that t_i is the value such that φ_θ(τe^i) ≥ α for all τ ∈ [θ_i, t_i], while φ_θ(τe^i) < α for all τ > t_i. Therefore, this value can be computed by parametric linear programming techniques.

More specifically, let v^1 be a basic optimal solution of (45) for τ = θ_i. Then the reduced costs (in linear programming terminology) for this optimal solution must be ≥ 0. Clearly, these reduced costs are concave functions of the parameter τ, and one can determine the maximal value τ_1 such that these reduced costs are still ≥ 0 for θ_i ≤ τ ≤ τ_1. If φ_θ(τ_1 e^i) = F(τ_1 e^i, v^1) < α, then t_i is equal to the unique value τ ∈ [θ_i, τ_1] satisfying F(τe^i, v^1) = α. Otherwise, F(τ_1 e^i, v^1) ≥ α, in which case at least one of these reduced costs will become negative for τ > τ_1, and by pivoting we can pass to a basic solution v^2 which is optimal for τ > τ_1 sufficiently close to τ_1. The procedure is then repeated, with τ_1 and v^2 instead of θ_i and v^1. (For the sake of simplicity we assumed τ_1 to be finite, but the case τ_1 = +∞ can be considered similarly.)
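A simple, if less efficient, alternative of ours to this parametric pivoting is to bracket t_i by bisection, exploiting the concavity of τ → φ_θ(τe^i) (each evaluation of φ is one linear program; an upper bound τ_max with φ_θ(τ_max e^i) < α is assumed to be known):

    # t_i = max{tau: phi(tau) >= alpha}, for a concave phi with
    # phi(theta_i) >= alpha and phi(tau_max) < alpha.
    def t_i_by_bisection(phi, theta_i, tau_max, alpha, tol=1e-8):
        lo, hi = theta_i, tau_max
        while hi - lo > tol * max(1.0, lo):
            mid = 0.5 * (lo + hi)
            if phi(mid) >= alpha:
                lo = mid          # mid still satisfies phi >= alpha
            else:
                hi = mid
        return lo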

4.2. Konno's Cutting Method for Concave Quadratic Programming

The cut improvement technique presented above was first developed in a more

specialized form by Konno (1976a) for solving the concave quadratic programming

problem (CQP). Results similar to those of Konno have also been obtained by Balas

and Burdet (1973) using the theory of generalized outer polars (the corresponding

cuts are sometimes called polar cuts).



The concave quadratic programming (CQP) problem is an important special case of the problem (BCP). It consists in finding the global minimum of a concave quadratic function f(x) over a polyhedron D. Assuming that the polyhedron D is bounded and int D ≠ ∅, let x^0 be a vertex of D which achieves a local minimum of f(x) over D. Writing the problem in standard form with respect to x^0 (see Section V.1.1), we have

(CQP)   minimize f(x) = 2px − x(Cx)  (46)
        s.t.  Ax ≤ b,  (47)
              x ≥ 0,  (48)

where p is an n-vector, C = (c_ij) is a symmetric positive semidefinite n×n matrix, A is an m×n matrix and b is an m-vector. Then the conditions x^0 = 0 (so f(x^0) = 0), D ⊂ ℝ^n_+ (hence b ≥ 0) are satisfied.

Furthermore, p ≥ 0, because x^0 = 0 is a local minimizer of f(x) over D (if p_i < 0 for some i, we would have f(λe^i) = 2λp_i − λ²c_ii < 0 for all sufficiently small λ > 0).

Let us associate with f(x) the bilinear function

F(u,v) = pu + pv − u(Cv)   (u,v ∈ ℝ^n).  (49)

Proposition V.8. The bilinear function (49) satisfies the conditions (32), (33), (34) for the concave quadratic function (46). If the matrix C is positive definite, then

F(u,v) > min {f(u), f(v)}  ∀u ≠ v.  (50)

Proof. Clearly, f(x) = F(x,x), and F(u,v) is affine in u for fixed v and affine in v for fixed u. For any u, v we can write

F(u,v) − F(u,u) = p(v−u) − u(C(v−u)),

F(u,v) − F(v,v) = p(u−v) − v(C(u−v)).

Hence, [F(u,v)−F(u,u)] + [F(u,v)−F(v,v)] = (v−u)(C(v−u)) ≥ 0, since C is positive semidefinite. Therefore, at least one of the two differences F(u,v)−F(u,u), F(u,v)−F(v,v) must be ≥ 0. This proves that

F(u,v) ≥ min {f(u), f(v)}.

If C is positive definite, then (v−u)(C(v−u)) > 0 for u ≠ v and one of the two differences must be positive, proving (50).

It follows from this proposition that the cut improvement scheme can be applied

to the problem (CQP). We now show that this scheme can be implemented.

Let α < f(x^0) = 0. First, the α-valid concavity cut for x^0 = 0

Σ_i x_i/θ_i ≥ 1  (51)

can easily be determined on the basis of the following result.

Proposition V.9. The quantity θ_i is the larger root of the quadratic equation:

c_ii τ² − 2p_i τ + α = 0.  (52)

Proof. Since α < f(x^0) = 0, p_i ≥ 0 and c_ii > 0, it is easily seen that the equation (52) always has a positive root which is also the larger one. Obviously, f(θ_i e^i) = α, and the proposition follows.
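In code, (52) yields the concavity cut coefficients directly; the following sketch of ours also shows the result for the data of the example given at the end of this section (p = (1, 1.5), C = ((2,−1),(−1,2)), α = −3):

    # theta_i = larger root of c_ii*tau**2 - 2*p_i*tau + alpha = 0.
    # Assumes p >= 0, alpha < 0, c_ii > 0 (positive discriminant).
    import numpy as np

    def concavity_cut(p, C, alpha):
        c = np.diag(C)
        return (p + np.sqrt(p * p - c * alpha)) / c  # cut: sum x_i/theta_i >= 1

    # concavity_cut(np.array([1.0, 1.5]),
    #               np.array([[2.0, -1.0], [-1.0, 2.0]]), -3.0)
    # gives theta approximately (1.82, 2.19).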

Next, the points y^i in Corollary V.1 are obtained by solving the linear programs:

min {F(θ_i e^i, v): v ∈ M(θ)}.  (53)

If f(y^i) ≥ α (i=1,2,…,n), then, according to Corollary V.1, a stronger cut than the concavity cut can be obtained by computing the values

t_i = max {τ: φ_θ(τe^i) ≥ α}  (i=1,2,…,n).

Proposition V.10. The value t_i is the optimal value of the linear program

(*)   minimize  pz − αz_0
      s.t.  −Az + z_0 b ≥ 0,
            Σ_{j=1}^n z_j/θ_j − z_0 ≥ 0,
            Σ_{j=1}^n c_ij z_j − p_i z_0 = 1,
            z_j ≥ 0  (j=0,1,…,n).

Proof. Since F(τe^i, v) = τp_i + pv − τ Σ_j c_ij v_j, the optimal value of the linear program min {F(τe^i, v): v ∈ M(θ)} is equal to ξ(τ) + τp_i, where

ξ(τ) = min {pv − τ Σ_{j=1}^n c_ij v_j : −Av ≥ −b, Σ_{j=1}^n v_j/θ_j ≥ 1, v ≥ 0}.

Since the constraint set of this linear program is bounded, by the duality theorem in linear programming we have

ξ(τ) = max {−bs + r : −A^T s + r(1/θ_1,…,1/θ_n)^T ≤ p − τC_i^T, s ≥ 0, r ≥ 0},

where C_i is the i-th row of C. Thus,

t_i = max {τ: ξ(τ) + τp_i ≥ α}.

Hence,

t_i = max {τ: −bs + r + τp_i ≥ α, −A^T s + r(1/θ_1,…,1/θ_n)^T ≤ p − τC_i^T, s ≥ 0, r ≥ 0}.

By passing again to the dual program and noting that the above program is always feasible, we obtain the desired result (with the understanding that t_i = +∞ if the dual program is infeasible).


Since p ≥ 0 and α < 0, it follows that (z,z_0) = (0,0) is a dual feasible solution of (*) with only one constraint violated, and it usually takes only a few pivots to solve (*) starting from this dual feasible solution. Moreover, the objective function of (*) increases monotonically while the dual simplex algorithm proceeds, and we can stop when the objective function value reaches a satisfactory level.
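For concreteness, here is a self-contained sketch of ours of one cut-improvement pass for (CQP). It evaluates φ_θ(τe^i) as a linear program and locates t_i by bisection instead of the dual program (*) (assumptions: D = {v: Av ≤ b, v ≥ 0} is bounded, the test f(y^i) ≥ α of Corollary V.1 has already been passed, and a bound τ_max with φ_θ(τ_max e^i) < α is available):

    # One pass of cut improvement for f(x) = 2 p.x - x.Cx over D.
    import numpy as np
    from scipy.optimize import linprog

    def phi(tau, i, p, C, A, b, theta):
        # F(tau*e_i, v) = tau*p_i + (p - tau*C_i).v, minimized over M(theta)
        A_ub = np.vstack([A, -1.0 / theta])   # sum(v_j/theta_j) >= 1
        b_ub = np.append(b, -1.0)
        res = linprog(c=p - tau * C[i], A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * len(p))
        return tau * p[i] + res.fun

    def improved_cut(p, C, A, b, theta, alpha, tau_max=1e4, tol=1e-8):
        theta = np.asarray(theta, dtype=float)
        t = theta.copy()
        for i in range(len(theta)):
            lo, hi = theta[i], tau_max
            while hi - lo > tol * max(1.0, lo):
                mid = 0.5 * (lo + hi)
                if phi(mid, i, p, C, A, b, theta) >= alpha:
                    lo = mid
                else:
                    hi = mid
            t[i] = lo
        return t                              # improved cut: sum x_i/t_i >= 1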

After the first iteration, if a further improvement is needed, we may try a second iteration, and so forth. It turns out that this iterative procedure quite often yields a substantially deeper cut than the concavity cut. For example, for the problem

minimize 2x_1 + 3x_2 − 2x_1² + 2x_1x_2 − 2x_2²

s.t.  −x_1 + x_2 ≤ 1,
      x_1 − x_2 ≤ 1,
      −x_1 + 2x_2 ≤ 3,
      2x_1 − x_2 ≤ 3,
      x_1 ≥ 0,  x_2 ≥ 0,

(Konno (1976a)), the concavity cut and the cuts produced by this method are shown in Fig. V.6 for α = −3.

Fig. V.6. The concavity cut, the cuts produced by the first three iterations (x_1/3.25 + x_2/4.66 ≥ 1, x_1/4.05 + x_2/5.48 ≥ 1, …), and the BLP cut.

The cutting method of Konno (1976a) for the problem (CQP) is similar to Algorithm V.1, except that in Step 1 a deep cut of the type discussed above is generated rather than any γ-valid cut.

The convergence of this method has never been established, though it has seemed to work successfully on test problems. Of course, the method can be made convergent by occasionally introducing facial cuts as in Algorithm V.2, or by combining cuts with splitting as in the Normal Conical Algorithm which will be introduced in Section VII.1.

4.3. Bilinear Programming Cuts

The cut improvement procedure may be started from the concavity cut or the
bilinear programming (BLP) cut constructed in the following way.

Define the function

φ_0(x) = min {F(x,v): v ∈ D}  (54)

(so that φ_0(x) has the form (35), with M(0) = D).
By the same argument as in the proof of Proposition V.6 it follows that the function φ_0 is concave and satisfies

min {φ_0(x): x ∈ D} = min {f(x): x ∈ D}  (55)

(i.e., Proposition V.6 extends to t = 0).

Since φ_0(0) = min {pv: v ∈ D} ≥ 0 > α, we can define

θ_i^0 = max {τ: φ_0(τe^i) ≥ α}  (i=1,2,…,n).

Then θ_i^0 > 0 ∀i and the cut

Σ_i x_i/θ_i^0 ≥ 1,  (56)

which is nothing but the α-valid concavity cut for the problem min {φ_0(x): x ∈ D}, will also be α-valid for the problem (CQP), because f(x) = F(x,x) ≥ min {F(x,v): v ∈ D} = φ_0(x) ∀x ∈ D. This cut is sometimes referred to as the bilinear programming (BLP) cut.

If f(y^{i*}) ≥ α (i=1,2,…,n), where y^{i*} ∈ argmin {F(θ_i e^i, v): v ∈ D}, and if C is positive definite, then the BLP cut is strictly deeper than the concavity cut (for the problem (CQP)).

Note that F(τe^i, v) = τp_i + pv − τ Σ_j c_ij v_j. With the same argument as in the proof of Proposition V.10, one can easily show that θ_i^0 is equal to the optimal value of the linear program

minimize  pz − αz_0
s.t.  −Az + z_0 b ≥ 0,
      Σ_j c_ij z_j − p_i z_0 = 1,
      z_j ≥ 0  (j=0,1,…,n).

The BLP cut for the previously given example is shown in Fig. V.6.

Another interpretation is as follows.

Denote

D*(α) = {x: F(x,y) ≥ α  ∀y ∈ D}.

Since D*(α) = {x: φ_0(x) ≥ α}, this set is closed and convex and contains the origin as an interior point. Therefore, the BLP cut can also be defined as a valid cut for ℝ^n_+ \ D*(α) (i.e., a convexity cut relative to D*(α), see Section III.4).

The set D*(α) is sometimes called the polaroid (or generalized outer polar) set of D with respect to the bilinear function F(u,v) (cf. Balas (1972) and Burdet (1973)).

For a given valid cut Σ_i x_i/t_i ≥ 1, the improved cut Σ_i x_i/t_i^* ≥ 1 provided by Theorem V.4 can also be viewed as a convexity cut relative to the polaroid of M(t) = D ∩ {x: Σ_i x_i/t_i ≥ 1}.
CHAPTER VI

SUCCESSIVE APPROXIMATION METHODS

In the cutting plane methods discussed in the previous chapter, the feasible domain is reduced at each step by cutting off a feasible portion that is known to contain no better solution than the current best solution.

In the successive approximation methods, to which this chapter is devoted, we construct a sequence of problems which are used to approximate the original one, where each approximating problem can be solved by the methods already available and, as we refine the approximation, the optimal solution of the approximating problem will get closer and closer to a global optimal solution of the original one.

The approximation may proceed in various ways: outer approximation (relaxation) of the constraint set by a sequence of nested polyhedra, inner approximation of the constraint set by a sequence of expanding polyhedra (polyhedral annexation), or successive underestimation of the objective function by convex or polyhedral functions.

1. OUTER APPROXIMATION ALGORITHMS

Consider the concave programming (CP) problem

(CP)   minimize f(x)  subject to  x ∈ D,

where D is a closed convex subset of ℝ^n and f: ℝ^n → ℝ is a concave function.

In Chapter II the general concept of outer approximation for nonlinearly constrained problems was discussed. In this section, we shall deal with the application of outer approximation methods to the CP problem (cf. Hoffman (1981), Thieu, Tam and Ban (1983), Thieu (1984), Tuy (1983), Horst, Thoai and Tuy (1987 and 1989), Horst, Thoai and de Vries (1988)). A more introductory discussion is given in Horst, Pardalos and Thoai (1995).

1.1. Linearly Constrained Problem

Let us first consider the problem (BCP) as defined in Section V.1, i.e., the problem of minimizing a concave function f: ℝ^n → ℝ over a polyhedron D given by a system of linear inequalities of the form

A_i x ≤ b_i  (i=1,2,…,m),  (1)

x_j ≥ 0  (j=1,2,…,n),  (2)

where b_i ∈ ℝ and A_i is the i-th row of an m×n matrix A.


The outer approximation method for solving this problem, when the feasible set D

may be unbounded, relles upon the following properties of a concave function


f: IRn - I IR.

Proposition VI.1. If the concave function f(x) is bounded from below on some halfline, then it is bounded from below on any parallel halfline.

Proof. Define the closed convex set (the hypograph of f)

G = {(x,t): x ∈ ℝ^n, f(x) ≥ t} ⊂ ℝ^{n+1}.

Suppose that f(x) is bounded from below on a halfline from x^0 in the direction y, i.e., f(x) ≥ c for all x = x^0 + λy with λ ≥ 0. Then the halfline {(x^0 + λy, c): λ ≥ 0} (in ℝ^{n+1}) lies entirely in G. Hence, by a well-known property of closed convex sets (cf. Rockafellar (1970), Theorem 8.3), (y,0) is a recession direction of G, so that {(x̄ + λy, f(x̄)): λ ≥ 0} ⊂ G for any x̄, which means that f(x̄ + λy) ≥ f(x̄) for all λ ≥ 0. Thus, f(x) is bounded from below on any halfline parallel to y.

Proposition VI.2. Let M be any closed convex set in ℝ^n. If the concave function f(x) is bounded from below on every extreme ray of M, then it is bounded from below on any halfline contained in M.

Proof. Denote by K the recession cone of M, and let x^0 ∈ M. Then K + x^0 ⊂ M, and by Proposition VI.1 it follows from the hypothesis that f(x) is bounded from below on every edge of K + x^0. Therefore, by a well-known property of concave functions (cf. Rockafellar (1970), Theorem 32.3), f(x) is bounded from below on K + x^0. Since any halfline contained in M is parallel to some halfline in K + x^0, it follows, again by Proposition VI.1, that f(x) is bounded from below on any halfline contained in M.

Corollary VI.1. Let M ≠ ∅ be a polyhedron in ℝ^n that contains no line. Either f(x) is unbounded from below on an unbounded edge of M, or else the minimum of f(x) over M is attained at some vertex of M.

Proof. Suppose that f(x) is bounded from below on every unbounded edge of M. Then, by Proposition VI.2, f(x) is bounded from below on any halfline contained in M. By a well-known property of concave functions (cf. Rockafellar (1970), Corollaries 32.3.3 and 32.3.4) it follows that f(x) is bounded from below on M and attains its minimum over M at some vertex of M (see Theorem I.1).

On the basis of the above propositions, it is now easy to describe an outer approximation method for solving the problem (BCP).

If the polyhedron D is bounded, then the KCG-method proceeds as follows (cf. Section II.2).

Start with a simplex D_1 such that D ⊂ D_1 ⊂ ℝ^n_+. At iteration k = 1,2,…, one has a polytope D_k such that D ⊂ D_k ⊂ ℝ^n_+. Solve the relaxed problem

(Q_k)   minimize f(x)  s.t.  x ∈ D_k,

obtaining an optimal solution x^k. If x^k ∈ D, then stop: x^k is a global optimal solution of (BCP). Otherwise, x^k violates at least one of the constraints (1). Select

i_k ∈ argmax {A_i x^k − b_i: i=1,2,…,m}  (3)

and form

D_{k+1} = D_k ∩ {x: A_{i_k} x ≤ b_{i_k}}.  (4)

Go to iteration k+1.

Since each new constraint (4) cuts off x^k, and hence is different from all of the previous ones, the finiteness of this procedure is an immediate consequence of the finiteness of the number of constraints (1).

The implementation of this method and its extension to the case of an unbounded feasible set D require us to examine two questions:

(i) When the polyhedron D defined by (1), (2) is unbounded, the starting polyhedron D_1 as well as all the subsequent polyhedra D_k are unbounded. Then problem (1), (2) may possess an optimal solution, but the relaxed problem (Q_k) may not have an optimal solution. According to Corollary VI.1, we may find an unbounded edge of D_k over which f(x) → −∞. How should we proceed in such a case?

Let u^k be the direction of the edge mentioned above. If u^k is a recession direction for D, i.e., if A_i u^k ≤ 0 (i=1,2,…,m), then we stop, since by Proposition VI.1, either D = ∅ or f(x) is unbounded from below on any halfline emanating from a feasible point in the direction u^k.

Otherwise, u^k violates one of the inequalities A_i u ≤ 0 (i=1,…,m). Let

i_k ∈ argmax {A_i u^k: i=1,2,…,m}.  (5)

If we define

D_{k+1} = D_k ∩ {x: A_{i_k} x ≤ b_{i_k}},  (6)

then, since A_{i_k} u^k > 0, it follows that u^k is no longer a recession direction for D_{k+1}.
k

(ii) Each relaxed problem (Q_k) is itself a concave minimization problem over a polyhedron. How can it be solved? Taking into account that (Q_k) differs from (Q_{k−1}) by just one additional linear constraint, how can we use the information obtained in solving (Q_{k−1}) in order to solve (Q_k)?

Denote the vertex set of D_k by V_k and the extreme direction set of D_k by U_k: V_k = vert D_k, U_k = extd D_k, respectively. Since the initial polyhedron D_1 is subject to our choice, we may assume that V_1 and U_1 are known. At iteration k−1, knowing the sets V_{k−1} and U_{k−1}, and the constraint adjoined to D_{k−1} to form D_k, we can compute V_k and U_k by the procedures presented in Section II.4.2. Thus, we embark upon iteration k with knowledge of the sets V_k and U_k. This is sufficient information for solving (Q_k).

Indeed, by the above Corollary VI.1, a concave function which is bounded from below on a halfline (emanating from the origin) must attain its minimum over this halfline at the origin. Therefore, if there exists u^k ∈ U_k such that we have f(λu^k) < f(0) for some λ > 0, then f(x) is unbounded from below on the extreme ray of D_k in the direction u^k. Otherwise, the minimum of f(x) over D_k must be attained at one of the vertices of D_k, namely, at x^k ∈ argmin {f(x): x ∈ V_k}.

We can thus state the following procedure.



Algorithm VI.1.

Initialization:

Take a (generalized) n-simplex D_1 containing D. Set V_1 = vert D_1, U_1 = extd D_1. (For instance, one can take D_1 = ℝ^n_+. Then V_1 = {0}, U_1 = {e^1,e^2,…,e^n}, where e^i is the i-th unit vector of ℝ^n.)

Set I_1 = {1,…,m}.

Iteration k = 1,2,…:

1) For each u ∈ U_k check whether there exists λ > 0 such that f(λu) < f(0). If this occurs for some u^k ∈ U_k, then:

a) If A_i u^k ≤ 0 (i ∈ I_k), then stop: either D is empty or the problem has no finite global optimal solution and f(x) is unbounded from below on any halfline parallel to u^k contained in D.

b) Otherwise, compute

i_k ∈ argmax {A_i u^k: i ∈ I_k}  (7)

and go to 3).

2) If no u ∈ U_k exists such that f(λu) < f(0) for some λ > 0, then find x^k ∈ argmin {f(x): x ∈ V_k}.

a) If x^k ∈ D, i.e., if A_i x^k ≤ b_i (∀i ∈ I_k), then terminate: x^k is a global optimal solution.

b) Otherwise, compute

i_k ∈ argmax {A_i x^k − b_i: i ∈ I_k}  (8)

and go to 3).

3) Form

D_{k+1} = D_k ∩ {x: A_{i_k} x ≤ b_{i_k}}.  (9)

Compute the vertex set V_{k+1} and the extreme direction set U_{k+1} of D_{k+1} from knowledge of V_k and U_k. Set I_{k+1} = I_k \ {i_k} and go to iteration k+1.
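The selection rules (7) and (8) are straightforward to implement; a sketch of ours (the machinery for updating V_k, U_k belongs to Section II.4.2 and is not reproduced here):

    # Rules (7) and (8) of Algorithm VI.1: among the unused constraint
    # indices I_k, pick the one most violated by u^k, resp. x^k.
    import numpy as np

    def select_cut_direction(A, I_k, u_k):
        return max(I_k, key=lambda i: A[i] @ u_k)           # rule (7)

    def select_cut_point(A, b, I_k, x_k):
        return max(I_k, key=lambda i: A[i] @ x_k - b[i])    # rule (8)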

Theorem VI.1. Suppose that D ≠ ∅. Then Algorithm VI.1 terminates after at most m iterations, yielding a global optimal solution of (BCP) or a halfline in D on which f(x) → −∞.

Proof. It is easy to see that each i_k is distinct from all i_1,…,i_{k−1}. Indeed, if 1b) occurs, then we have A_i u^k ≤ 0 (i = i_1,…,i_{k−1}), A_{i_k} u^k > 0, while if 2b) occurs, then we have A_i x^k ≤ b_i (i = i_1,…,i_{k−1}), A_{i_k} x^k > b_{i_k}. Since I_k ⊂ {1,…,m}, it follows that the algorithm must terminate after at most m iterations.

Remarks VI.1. (i) An alternative procedure for solving (Q_k) which has restart capability and does not necessarily involve a complete inspection of all of the vertices of D_k will be discussed in Chapter VII (the Modified ES Algorithm).

(ii) A drawback of the above outer approximation algorithm is that all of the intermediate approximate solutions x^k, except the last one, are infeasible. In order to have an estimate of the accuracy provided by an intermediate approximate solution x^k, we can proceed in the following way. Suppose that D is full dimensional. At the beginning we take an interior point z^0 of D; then, at iteration k, when Step 2 occurs, we compute y^k ∈ argmin {f(x): x ∈ Γ(x^k)}, where Γ(x^k) denotes the intersection of D with the halfline from x^k through z^0 (y^k is one endpoint of this line segment). If x̄^k denotes the best among all of the feasible points y^h (h ≤ k) obtained in this way until step k, then f(x^k) ≤ min f(D) ≤ f(x̄^k). Therefore, the difference f(x̄^k) − f(x^k) yields an estimate of the accuracy attained; in particular, if f(x̄^k) − f(x^k) ≤ ε, then x̄^k is a global ε-optimal solution.

(iii) In the worst case, the algorithm may terminate only when all the constraints of D have been used, i.e., when D_k coincides with D. However, when applying the outer approximation strategy, one generally hopes that a global optimal solution can be found before too many constraints have been used. In the computational experiments on problems of size up to 14–16 reported by Thieu et al. (1983), on the average only about half of the constraints of D (not including the nonnegativity constraints) were used.

Example VI.1. Consider the problem

minimize f(x) = x_1x_2/(x_1 + x_2) − 0.05(x_1 + x_2)

subject to  −3x_1 + x_2 − 1 ≤ 0,  −3x_1 − 5x_2 + 23 ≤ 0,
            x_1 − 4x_2 − 2 ≤ 0,  −x_1 + x_2 − 5 ≤ 0,
            x_1 ≥ 0,  x_2 ≥ 0.

Fig. VI.1

Initialization:

Start with D_1 = ℝ²_+, V_1 = {(0,0)}, U_1 = {(1,0), (0,1)}, I_1 = {1,2,3,4}.

Iteration 1:

For x = (t,0), f(x) = −0.05t → −∞ as t → +∞. Hence, u^1 = (1,0).

Values A_i u^1 (i=1,2,3,4): −3, −3, 1, −1. Since the largest of these values is 1, we have i_1 = 3, so that

D_2 = D_1 ∩ {x: x_1 − 4x_2 ≤ 2},

with V_2 = {(0,0), (2,0)}, U_2 = {(0,1), (4,1)}, I_2 = {1,2,4}.

Iteration 2:

For x = (0,t), f(x) = −0.05t → −∞ as t → +∞. Hence, u^2 = (0,1).

Values A_i u^2 (i=1,2,4): 1, −5, 1. Since the largest of these values is 1, we select i_2 = 1, so that

D_3 = D_2 ∩ {x: −3x_1 + x_2 ≤ 1},

with V_3 = {(0,0), (2,0), (0,1)}, U_3 = {(4,1), (1,3)}, I_3 = {2,4}.

Iteration 3:

f(x) is bounded from below in each direction u ∈ U_3. Therefore, we compute min f(V_3) = min {0, −0.1, −0.05} = −0.1. Thus x^3 = (2,0).

Values A_i x^3 − b_i (i=2,4): 17, −7. Since the largest of these values is 17, we have i_3 = 2, so that

D_4 = D_3 ∩ {x: −3x_1 − 5x_2 ≤ −23},

with V_4 = {(6,1), (1,4)}, U_4 = U_3, I_4 = {4}.



Iteration 4:

f(x) is bounded from below in each direction u ∈ U_4. Therefore, we compute min f(V_4) = min {0.507143, 0.55} = 0.507143. Thus, x^4 = (6,1).

Since A_4 x^4 − b_4 = −10 < 0, this is the optimal solution (Fig. VI.1).

1.2. Problems with Convex Constraints

Now let us consider the concave minimization problem in the general case when the feasible set D is a closed convex set given by an inequality of the form

g(x) ≤ 0,  (10)

where g: ℝ^n → ℝ is a continuous convex function.

Suppose that D is compact. Then, according to the general outer approximation method discussed in Chapter II, in order to solve the problem (CP) we can proceed as follows.

Start with a polytope D_1 containing D. At iteration k = 1,2,…, solve the relaxed problem

minimize f(x)  s.t.  x ∈ D_k,

obtaining an optimal solution x^k. If x^k ∈ D, then stop: x^k is a global optimal solution of (CP). Otherwise we have g(x^k) > 0. Construct a hyperplane strictly separating x^k from D, i.e., construct an affine function l_k(x) such that

l_k(x) ≤ 0  ∀x ∈ D,  (11)

l_k(x^k) > 0.  (12)

Form

D_{k+1} = D_k ∩ {x: l_k(x) ≤ 0},

and go to iteration k+1.

The convergence of this procedure depends on the choice of the affine function l_k(x) in iteration k. Let K be a compact subset of the interior of D (K may be empty, for example, when D has no interior point). If we again choose a point y^k ∈ conv(K ∪ {x^k}) \ int D, and let

l_k(x) = p^k(x − y^k) + g(y^k),  (13)

with p^k ∈ ∂g(y^k) \ {0}, then by Theorem II.2, l_k(x) satisfies (11), (12), and the procedure will converge to a global optimal solution, in the sense that whenever the algorithm is infinite, any accumulation point of the generated sequence {x^k} is such a solution.

For K = ∅, this is the KCG-method applied to concave programming; for |K| = 1 and y^k ∈ ∂D this is the method of Veinott (1960), which has been applied to concave minimization by Hoffman (1981). Notice that here each relaxed problem (Q_k) is solved by choosing x^k ∈ argmin {f(x): x ∈ V_k}, where V_k = vert D_k; the set V_k is derived from V_{k−1} by one of the procedures indicated in Section II.4.2 (V_1 is known at the beginning).
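A sketch of ours of the cut construction (13) for the case |K| = 1 (Veinott's variant, with y^k located on the segment [x^0, x^k] by bisection; g is assumed smooth, so that ∇g(y) is the unique subgradient):

    # Build l_k(x) = p.(x - y) + g(y), where y is (approximately) the point
    # where the segment from the interior point x0 to the infeasible point
    # xk crosses the boundary of D = {x: g(x) <= 0}.
    import numpy as np

    def separating_cut(g, grad_g, x0, xk, tol=1e-10):
        x0, xk = np.asarray(x0, float), np.asarray(xk, float)
        lo, hi = 0.0, 1.0                    # g(x0) < 0 <= g(xk)
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if g(x0 + mid * (xk - x0)) <= 0:
                lo = mid
            else:
                hi = mid
        y = x0 + lo * (xk - x0)
        p = grad_g(y)
        return lambda x: p @ (np.asarray(x) - y) + g(y)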

Extending the above algorithm to the general case when D may be unbounded is not trivial. In fact, the relaxed problem (Q_k) may now have no finite optimal solution, because the polyhedron D_k may be unbounded. We cannot, however, resolve this difficulty in the same way as in the linearly constrained case, since the convex constraint g(x) ≤ 0 is actually equivalent to an infinite system of linear constraints.



The algorithm below was proposed in Tuy (1983).

Assume that D has an interior point.

Algorithm VI.2.

Initialization:

Select an interior point x^0 of D and a polyhedron D_1 containing D.

Set V_1 = vert D_1, U_1 = extd D_1.

Iteration k = 1,2,…:

1) Solve the relaxed problem

(Q_k)   minimize f(x)  s.t.  x ∈ D_k

by a search through V_k = vert D_k, U_k = extd D_k. If a direction u^k ∈ U_k is found on which f(x) is unbounded from below, then:

a) If g(x^0 + λu^k) ≤ 0 for all λ ≥ 0 (i.e., the halfline Γ_k from x^0 in the direction u^k lies entirely in D), then stop: the function f(x) is unbounded from below on Γ_k ⊂ D.

b) Otherwise, compute the intersection point y^k of Γ_k with the boundary ∂D of D, find p^k ∈ ∂g(y^k) and go to 3).

2) If x^k ∈ V_k is an optimal solution of (Q_k), then:

a) If g(x^k) ≤ 0 (i.e., x^k ∈ D), then terminate: x^k is a global optimal solution of (CP).

b) Otherwise, compute the intersection point y^k of the line segment [x^0,x^k] with the boundary ∂D of D, find p^k ∈ ∂g(y^k), and go to 3).

3) Form

D_{k+1} = D_k ∩ {x: p^k(x − y^k) + g(y^k) ≤ 0}.

Compute V_{k+1} = vert D_{k+1}, U_{k+1} = extd D_{k+1}, and go to iteration k+1.

Theorem VI.2. Assume that for some α < f(x^0) the set {x ∈ ℝ^n: f(x) = α} is bounded. Then the above algorithm either terminates after finitely many iterations (with a finite global optimal solution of (CP) or with a halfline in D on which f(x) is unbounded from below), or it is infinite. In the latter case, the algorithm either generates a bounded sequence {x^k}, every accumulation point x̄ of which is a global optimal solution of (CP), or it generates a sequence {u^k}, every accumulation point ū of which is the direction of a halfline in D on which f(x) is unbounded from below.

The proof of Theorem VI.2 uses the following lemmas.

Lemma VI.1. Under the assumptions of Theorem VI.2, let u^k → u (k → ∞) and let f be unbounded from below on each halfline Γ_k = {x^0 + λu^k: λ ≥ 0}. Then f is also unbounded from below on the halfline Γ = {x^0 + λu: λ ≥ 0}.

Proof. On each Γ_k take a point z^k such that f(z^k) = α. By hypothesis, the sequence {z^k} is bounded; hence, by passing to a subsequence if necessary, we may assume that the z^k converge to some z. Because of the continuity of f, we then have f(z) = α < f(x^0). Since z ∈ Γ, it follows by Corollary VI.1 (applied to M = Γ) that f is unbounded from below on Γ.

Lemma VI.2. Under the assumptions of Theorem VI.2, if the algorithm generates an infinite sequence {x^k}, then this sequence is bounded and any accumulation point of the sequence is a global optimal solution of (CP).

Proof. Suppose that the sequence {x^k} is unbounded, so that it contains a subsequence {x^{k_q}} satisfying ‖x^{k_q}‖ > q (q=1,2,…). Let x̄ be an accumulation point of the sequence x^{k_q}/‖x^{k_q}‖. Since f(x^{k_q}) < f(x^0), Corollary VI.1 implies that f is unbounded from below on the halfline from x^0 through x^{k_q}. Hence, by the previous lemma, f is unbounded from below on the halfline from x^0 in the direction x̄, and we can find a point z on this halfline such that f(z) < f(x^1). Let B be a ball around z such that f(x) < f(x^1) for all x ∈ B. Then for all sufficiently large q, the halfline from x^0 through x^{k_q} meets B at some point z^q such that f(z^q) < f(x^1) ≤ f(x^{k_q}). Because of the concavity of f(x), this implies that x^{k_q} lies on the line segment [x^0, z^q]. Thus, all of the x^{k_q} with q large enough belong to the convex hull of x^0 and B, contradicting the assumption that ‖x^{k_q}‖ > q. Therefore, the sequence {x^k} is bounded. Since the conditions of Theorem II.2 are satisfied, it then follows that any accumulation point of this sequence solves the problem (CP).

Lemma VI.3. Under the assumptions of Theorem VI.2, if the algorithm generates an infinite sequence {u^k}, then every accumulation point of this sequence yields a recession direction of D on which f(x) is unbounded from below.

Proof. Let ū = lim u^{k_q} and denote by Γ_k (resp. Γ) the halfline emanating from x^0 in the direction u^k (resp. ū). Suppose that Γ is not entirely contained in D, and let ȳ be the intersection point of Γ with ∂D. It is not hard to see that y^{k_q} → ȳ (y^k is the point defined in Step 1b of the algorithm). Indeed, denoting by φ the gauge of the convex set D − x^0, by the continuity of φ we have:

y^{k_q} − x^0 = u^{k_q}/φ(u^{k_q}) → ū/φ(ū) = ȳ − x^0,

since φ(y^{k_q} − x^0) = φ(ȳ − x^0) = 1.

Now, since p^k ∈ ∂g(y^k), we can write

g(x) ≥ g(y^k) + p^k(x − y^k)  ∀x ∈ ℝ^n.

Let z^k = 2y^k − x^0. Clearly z^k = y^k + (y^k − x^0) ∈ Γ_k, and from this inequality (applied at x = x^0, where g(y^k) = 0) it follows that

p^k(z^k − y^k) = p^k(y^k − x^0) ≥ −g(x^0) > 0.  (14)

But the sequence {y^{k_q}} is convergent, and hence bounded. It follows that the sequence p^{k_q} ∈ ∂g(y^{k_q}) is also bounded (cf. Rockafellar (1970), Theorem 24.7). Therefore, by passing to subsequences if necessary, we may assume that p^{k_q} → p̄ ∈ ∂g(ȳ). Obviously, z^{k_q} → z̄ = 2ȳ − x^0, and, by (14), p̄(z̄ − ȳ) ≥ −g(x^0) > 0. Since p^{k_q}(z^{k_s} − y^{k_q}) → p̄(z̄ − ȳ) as s → ∞, q → ∞, it follows that for all sufficiently large q and all s > q we must have

p^{k_q}(z^{k_s} − y^{k_q}) > 0.  (15)

However, for s > q, we have

Γ_{k_s} ⊂ D_{k_s} ⊂ D_{k_q+1} ⊂ {x: p^{k_q}(x − y^{k_q}) + g(y^{k_q}) ≤ 0},

which implies that p^{k_q}(x − y^{k_q}) ≤ 0 for all x ∈ Γ_{k_s}. This contradicts (15), since z^{k_s} ∈ Γ_{k_s}.

Therefore, Γ must lie entirely in D. Finally, since f(x) is unbounded from below on each halfline Γ_k, it follows from Lemma VI.1 that it must be unbounded on Γ. This completes the proof of Lemma VI.3, and with it the proof of Theorem VI.2. •

1.3. Reducing the Sizes of the Relaxed Problems

A convex constraint can actually be viewed as an infinite system of linear con-

straints. Outer approximation by polyhedral sets can simply be regarded as a

method for generating these constraints one by one, as they are needed. For prob-

lems with many variables, usually a large number of constraints has to be generated

before a satisfactory approximate solution can be obtained. Accordingly, the sets Vk ,

Uk increase rapidly in size, making the computation of these sets more and more dif-

ficult as the algorithm proceeds. In practice, for problems with about 15 variables

IVkl may exceed several thousand after 8-10 iterations.

To alleviate this difficulty, a common idea is from time to time to drop certain

constraints that are no longer indispensable. Some constraint dropping strategies

were discussed in Section II.3. Here is another constraint dropping strategy which is

more suitable for the present problem:

Let K denote the index set of all iterations k in which the relaxed problem (Q_k) has a finite optimal solution x^k. Then for k ∈ K we have

l_k(x^j) ≤ 0 < l_k(x^k)  ∀j > k.  (16)

For k ∉ K, a direction u^k is generated. Let us define

x^k = 2y^k − x^0

(to avoid confusion this point was denoted by z^k in the proof of Lemma VI.3). Recall from (14) that

l_k(x^k) = p^k(x^k − y^k) + g(y^k) > 0,

whereas x^j ∈ Γ_j ⊂ D_{k+1} for all j > k (Γ_j is the halfline from x^0 parallel to u^j). Therefore, (16) holds for all k = 1,2,….

Now, choose a natural number N. At each iteration k let ν_k denote the number of points x^j with j < k such that l_k(x^j) > 0 (i.e., the number of previously generated points that violate the current constraint). Let N_0 be a fixed natural number greater than N. Then we may modify the rule for forming the relaxed problems (Q_k) as follows:

(*) At every iteration k ≥ N_0, if ν_{k−1} ≥ N, then form (Q_{k+1}) by adjoining the newly generated constraint l_k(x) ≤ 0 to (Q_k); otherwise, form (Q_{k+1}) by adjoining the newly generated constraint to (Q_{k−1}).

It is easily seen that in this way any constraint l_k(x) ≤ 0 with ν_k < N, k > N_0, is used just once (in the (k+1)-th relaxed problem) and will be dropped in all subsequent iterations. Intuitively, only those constraints are retained that are sufficiently efficient in the sense of having discarded at least N previously generated points.

Since (Q_{k+1}) is constructed by adding just one new constraint to (Q_k) or (Q_{k−1}), the sets V_{k+1}, U_{k+1} can be computed from V_k, U_k or V_{k−1}, U_{k−1}, respectively (of course, this requires that at each iteration one stores the sets V_k, U_k, V_{k−1}, U_{k−1}, as well as the previously obtained points x^i, i < k).

Proposition VI.3. With the above modification, Algorithm VI.2 still converges under the same conditions as in Theorem VI.2.

Proof. Observe that instead of (16) we now have

l_k(x^j) ≤ 0 < l_k(x^k) for all j > k such that the constraint l_k(x) ≤ 0 is retained.  (17)

Let us first show that any accumulation point x̄ of the sequence {x^k} belongs to D. Assume the contrary, i.e., that there is a subsequence x^{k_q} → x̄ ∉ D. Without difficulty one can prove, by passing to subsequences if necessary, that

y^{k_q} → ȳ ∈ ∂D,  p^{k_q} → p̄ ∈ ∂g(ȳ) (cf. the proof of Lemma VI.3).

Let l(x) = p̄(x − ȳ). Since g(x^0) < 0, we have l(x^0) = p̄(x^0 − ȳ) ≤ g(x^0) − g(ȳ) = g(x^0) < 0, and consequently, l(x̄) > 0 (because l(ȳ) = 0). Noting that l_{k_q}(x^j) − l_{k_q}(x̄) = p^{k_q}(x^j − x̄), for any q, j we can write

l_{k_q}(x^j) ≥ l_{k_q}(x̄) − C‖x^j − x̄‖,

where C is a constant such that ‖p^{k_q}‖ ≤ C. Since x^j → x̄ (j → +∞) and l_{k_q}(x̄) → l(x̄) > 0 (q → +∞), there exist j_0 and q_0 such that for j ≥ j_0 and q ≥ q_0:

l_{k_q}(x̄) − C‖x^j − x̄‖ > 0.

Therefore,

l_{k_q}(x^j) > 0  for all j ≥ j_0, q ≥ q_0.

In particular, for any q such that q ≥ q_0 and k_q > j_0 + N we have

l_{k_q}(x^j) > 0  (j = j_0+1,…,j_0+N).

This implies that ν_{k_q} ≥ N; hence the constraint l_{k_q}(x) ≤ 0 will be retained, and so l_{k_q}(x^j) ≤ 0 for all j > k_q, a contradiction.

Therefore, any accumulation point x̄ of {x^k} belongs to D. Now suppose that the algorithm generates an infinite sequence {x^k, k ∈ K}. Then, by Lemma VI.2, this sequence is bounded, and, by the above, any accumulation point x̄ of the sequence must belong to D, and hence must solve (CP).

On the other hand, if the algorithm generates an infinite sequence {u^k, k ∉ K}, then for any accumulation point ū of this sequence, with ū = lim u^{k_q}, we must have Γ = {x^0 + λū: λ ≥ 0} ⊂ D. Indeed, otherwise, using the same notation and the same argument as in the proof of Lemma VI.3, we would have y^{k_q} → ȳ ∈ ∂D, x^{k_q} → x̄ = 2ȳ − x^0, and hence 2ȳ − x^0 ∈ D, which is impossible, since ȳ ∈ ∂D, x^0 ∈ int D. This completes the proof of the proposition.

The parameters N_0 and N in the above constraint dropping rule can be chosen arbitrarily, considering only computational efficiency. While a large value of N allows one to reduce the number of constraints of the relaxed problems more significantly at each iteration, this advantage can be offset by a greater number of required iterations.

Though constraint dropping strategies may help to reduce the size of the sets V_k, U_k, when n is large they are often not efficient enough to keep these sets within manageable size. Therefore, outer approximation methods are practical only for CP problems with a relatively small number of variables. Nevertheless, by their structure, outer approximation methods lend themselves easily to reoptimization when handling new additional constraints. Because of this, they are useful, especially in combination with other methods, in decomposing large scale problems or in finding a rough approximate solution of highly nonlinear problems which otherwise would be almost impossible to handle. In Section VII.1.10 we shall present a more efficient method for solving (CP), which combines outer approximation ideas with cone splitting techniques and branch and bound methods.

2. INNER APPROXIMATION

In the outer approximation method for finding min f(D), we approximate the feasible set D from the outside by a sequence of nested polytopes D_1 ⊃ D_2 ⊃ … ⊃ D such that min f(D_k) ↑ min f(D). Dually, in the inner approximation method, we approximate the feasible set D from the inside by a sequence of expanding polytopes D_1 ⊂ D_2 ⊂ … ⊂ D such that min f(D_k) ↓ min f(D).

However, since the concavity of the function f: ℝ^n → ℝ enables us to use its values outside D to infer estimates for its values inside, it is more advantageous to take D_k = P_k ∩ D, where P_1 ⊂ P_2 ⊂ … ⊂ P_h is a finite sequence of expanding polytopes satisfying P_h ⊃ D.

The inner approximation approach originated from the early work of Tuy (1964). Later it was developed by Glover (1975) for integer programming and by Vaish and Shetty (1976) for bilinear programming. These authors used the term "polyhedral annexation", referring to the process of enlarging the polytopes P_k by "annexing" more and more portions of the space. Other developments of the inner approximation approach can be found in Istomin (1977) and Mukhamediev (1982).

Below we essentially follow Tuy (1988b and 1990), Horst and Tuy (1991).

2.1. The (DG)-Problem

Polyhedral annexation is a technique originally devised to solve a special problem

which can be formulated as follows:

(DG) Given a polytope D contained in a cone K_0 ⊂ ℝ^n and a compact convex set G with 0 ∈ int G, find a point y ∈ D \ G, or else establish that D ⊂ G.

The (DG)-problem turns out to be of importance for a large class of global optimization problems, including problems in concave minimization and reverse convex programming.

For example, consider the problem (BCP), i.e., the problem (CP) where the feasible domain is a polytope D determined by the linear constraints (1), (2). Assume that the concave function f(x) has bounded level sets.

As seen in Chapter V, when solving the BCP problem a crucial question is the following one: given a real number γ ∈ f(D) (e.g., γ is the best feasible value of f(x) obtained so far), and given a tolerance ε > 0, find a feasible solution y with f(y) < γ−ε, or else establish that no such point exists (i.e., min f(D) ≥ γ−ε).

To reduce this question to a (DG)-problem, we compute a vertex x^0 of D such that f(x^0) ≤ γ, and, after translating the origin to x^0, we construct a polyhedral cone K_0 ⊃ D (this can be done, e.g., by rewriting the problem in standard form with respect to x^0, so that K_0 = ℝ^n_+). Then the above question is just a (DG)-problem, with G = {x: f(x) ≥ γ−ε} (this set is obviously compact and convex, and it contains x^0 = 0 in its interior because f(x^0) > γ−ε).

If we know a method for solving (DG), the BCP problem can be solved according to the following two phase scheme:

Start from a feasible solution z ∈ D.

Phase I.

Search for a local minimizer x^0 which is a vertex of D such that f(x^0) ≤ f(z).

Phase II.

Let α = f(x^0) − ε. Translate the origin to x^0 and construct a cone K_0 ⊃ D. Solve the (DG)-problem for G = {x: f(x) ≥ f(x^0) − ε}. If D ⊂ G, then terminate: x^0 is a global ε-optimal solution. Otherwise, let y ∈ D \ G. Then f(y) < f(x^0) − ε. Set z ← y and return to Phase I.

Since a new vertex of D is found at each return to Phase I that is better than all of the previous ones, the scheme is necessarily finite.

Thus, the problem (BCP) can always be decomposed into a finite sequence of (DG)-problems.

2.2. The Concept of Polyhedral Annexation

In this and the next sections we shall present the polyhedral annexation method
for solving (DG), and hence the problem (BCP).

The idea is rather simple. Since D ⊂ K_0, we can replace G by G ∩ K_0. Now we start with the n-simplex P_1 spanned by the origin 0 and the n points where the boundary ∂G of G meets the n edges of K_0. We solve the problem:

(DP_1)   Find a point y^1 ∈ D \ P_1.

(This is easy, since P_1 is an n-simplex.) If no such point exists (which implies that D ⊂ G), or if y^1 ∉ G (which means that y^1 ∈ D \ G), then we are done. Otherwise, let z^1 be the point where the halfline from 0 through y^1 meets ∂G (such a point z^1 will be called the G-extension of y^1).

Enlarge P_1 to

P_2 = conv(P_1 ∪ {z^1}),

and repeat the procedure with P_2 replacing P_1.

In this manner, we generate a sequence of expanding polytopes that approximate G ∩ K_0 from the interior. We have P_1 ⊂ P_2 ⊂ … ⊂ G ∩ K_0 satisfying

con(P_k) = K_0,  0 ∈ P_k,  vert(P_k) \ {0} ⊂ ∂G,  (18)

y^k ∈ D \ P_k,  P_{k+1} = conv(P_k ∪ {z^k}),  (19)

where y^k is obtained by solving (DP_k), while z^k is the G-extension of y^k.

Clearly, when D \ P_k ≠ ∅, then D \ P_k contains at least one vertex of D, since otherwise vert D ⊂ P_k would imply D ⊂ P_k. Therefore, in the (DP_k)-problem we can require that y^k be a vertex of D. Under these conditions, each P_{k+1} contains at least one vertex of D which does not belong to any of P_1, P_2,…,P_k. Since the vertex set of D is finite, the above polyhedral annexation procedure must terminate with a polytope P_h ⊃ D (proving that D ⊂ G) or with a point y^h ∈ D \ G.

Let us examine how to solve the problems

(DP_k)   Find a point y^k ∈ D \ P_k which is a vertex of D.

Recall that a facet of a polytope P is an (n−1)-dimensional subpolytope of P which is the intersection of P with a supporting hyperplane.

Definition VI.1. A facet of a polytope is said to be transversal if its corresponding supporting hyperplane does not contain the origin 0.

A transversal facet is determined by an equation of the form vx = 1, where v is a normal vector and vx denotes the inner product of v and x. To simplify the language, we shall identify a facet with its normal vector, and instead of saying "the facet whose hyperplane is vx = 1", we shall simply say "the facet v".
In view of the property con(P_k) = K_0 (see (18)), we see that, if V_k is the collection of all transversal facets of P_k, then P_k is determined by the inequalities:

vx ≤ 1  (v ∈ V_k),   x ∈ K_0.

Now for each v ∈ V_k let μ(v) denote the optimal value in the linear program:

LP(v;D)   maximize vx  s.t.  x ∈ D.

Proposition VI.4. If μ(v) ≤ 1 for all v ∈ V_k, then D ⊂ P_k. If μ(v) > 1 for some v ∈ V_k, then any basic optimal solution y^k of LP(v;D) satisfies y^k ∈ D \ P_k.

Proof. Suppose that μ(v) ≤ 1 for all v ∈ V_k. Then x ∈ D implies that vx ≤ 1 for all v ∈ V_k. Hence, x ∈ D implies x ∈ P_k, that is, D ⊂ P_k. On the other hand, if we have μ(v) > 1 for some v ∈ V_k, then a basic optimal solution y^k of the linear program LP(v;D) must satisfy y^k ∈ D and vy^k > 1, hence y^k ∈ D \ P_k. This completes the proof of the proposition.
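A sketch of ours of this test, with D = {x: Ax ≤ b, x ≥ 0}:

    # Solve LP(v;D) = max{vx: x in D} for each transversal facet v of P_k.
    # Returns an optimal vertex y^k with y^k in D \ P_k, or None if D c P_k.
    import numpy as np
    from scipy.optimize import linprog

    def find_point_outside(V_k, A, b):
        for v in V_k:
            res = linprog(c=-np.asarray(v, float), A_ub=A, b_ub=b,
                          bounds=[(0, None)] * A.shape[1])
            if res.success and -res.fun > 1.0:   # mu(v) > 1
                return res.x                     # y^k in D \ P_k
        return None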



Thus, to solve (DP_k) we have to solve a linear program LP(v;D) for each transversal facet v of P_k. The question that remains is how to find the set V_k of these facets.

2.3. Computing the Facets of a Polytope

The set V_1 is very simple: it consists of a single element, namely the facet whose hyperplane passes through the n intersections of ∂G with the edges of K_0. Since P_{k+1} = conv(P_k ∪ {z^k}), it will suffice to know how to derive V_{k+1} from V_k. We are thus led to consider the following auxiliary problem:

New Facet Finding Problem.

Given a polytope P of full dimension in ℝ^n which contains 0 and whose set of transversal facets is known, and given a point z ∉ P, compute the set of transversal facets of the polytope P′ = conv(P ∪ {z}).

However, rather than solving this problem directly, we shall associate it with another problem which is easier to visualize and has already been studied in Section II.4.2.

New Vertex Finding Problem.

Given a polyhedron S of full dimension in ℝ^n which contains 0 in its interior and whose vertex set is known, and given a hyperplane zx = 1, where z is a normal vector, compute the vertex set of the polyhedron S′ = S ∩ {x: zx ≤ 1}.

It is well-known from polyhedral geometry that the following duality relationship

exists between the above two problems which allows one to reduce the first problem
to the second and vice-versa (cf., e.g., Balas (1972)).

Proposition VI.5. Let P be a polytope of full dimension which contains 0, and let S = {x: zx ≤ 1 ∀z ∈ P} be the polar of P. Then 0 ∈ int S, and each transversal facet vx = 1 of P corresponds to a vertex v of S and vice versa; each nontransversal facet vx = 0 of P corresponds to an extreme direction v of S and vice versa.

Proof. The inclusion 0 ∈ int S follows from a well-known property of polars of convex sets (cf. Rockafellar (1970), Corollary 14.5.1). Denote by Z the vertex set of P. Since for any x the linear function z → zx attains its maximum over P at some vertex of P, it is easily seen that S = {x: zx ≤ 1 ∀z ∈ Z}. Now let vx = 1 be a transversal facet of P. Since dim P = n, and since the hyperplane of this facet does not pass through 0, the facet must contain at least n linearly independent vertices of P. In other words, the equation vz = 1 must be satisfied by at least n linearly independent elements of Z.

Furthermore, since 0 ∈ P, the fact that vx = 1 is a facet of P implies that vx ≤ 1 for all x ∈ P. Hence, v belongs to S and satisfies n linearly independent constraints of S as equalities. This means that v is a vertex of S.

Conversely, let v be any vertex of S (v ≠ 0 because 0 ∈ int S). Then v satisfies all of the constraints of S, with equality in n linearly independent constraints. That is, vz ≤ 1 for all z ∈ P and vz = 1 for n linearly independent elements of Z. Therefore, vx = 1 is a supporting hyperplane to P that contains n linearly independent vertices of P. Hence, vx = 1 is a transversal facet of P.

The assertion about nontransversal facets can be proved similarly (if vx = 0 is such a facet, then v ∈ S and vz = 0 for n−1 linearly independent elements of Z). •

Corollary VI.2. Let P be a polytope of full dimension which contains 0, and let z ∉ P. If S denotes the polar of P, then each transversal facet vx = 1 of the polytope P′ = conv(P ∪ {z}) corresponds to a vertex v of the polyhedron S′ = S ∩ {x: zx ≤ 1} and vice versa; each nontransversal facet of P′ corresponds to an extreme direction of S′ and vice versa.

Proof. Indeed, the polar of P′ is precisely S′.



To summarize, the computation of the transversal facets of P_{k+1} = conv(P_k ∪ {z^k}) reduces to the computation of the vertices of the polyhedron S_{k+1} = S_k ∩ {x: z^k x ≤ 1}, where S_k is the polar of P_k (Fig. VI.2).

Since P_1 = [0, u^1,…,u^n], where u^i is the intersection of ∂G with the i-th edge of K_0, its polar S_1 is the polyhedral cone {x: u^i x ≤ 1, i=1,2,…,n} with a unique vertex v^1 = eQ_1^{-1}, where Q_1 = (u^1,u^2,…,u^n). Starting with V_1 = {v^1}, one can compute the vertex set V_2 of S_2 = S_1 ∩ {x: z^1 x ≤ 1}, then the vertex set V_3 of S_3 = S_2 ∩ {x: z^2 x ≤ 1}, i.e., the set of transversal facets of P_3, and so on. To derive V_{k+1} from V_k one can use, for example, the methods discussed in Section II.4.2.

Fig. VI.2
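To make this update concrete, the following Python sketch (all function names are ours) recomputes the vertex set of Sk+1 directly from its inequality description. It is a brute-force substitute for the incremental methods of Section II.4.2 and is adequate only for small illustrative examples.

```python
import itertools
import numpy as np

def vertices_from_halfspaces(A, b, tol=1e-9):
    """Vertices of {x : A x <= b}, found by solving every n x n subsystem
    A_I x = b_I and keeping the solutions that satisfy all constraints.
    Exponential in general -- for illustration only."""
    m, n = A.shape
    verts = []
    for I in itertools.combinations(range(m), n):
        AI, bI = A[list(I)], b[list(I)]
        if abs(np.linalg.det(AI)) < tol:
            continue                        # chosen rows not independent
        v = np.linalg.solve(AI, bI)
        if np.all(A @ v <= b + tol):        # v feasible for all constraints
            verts.append(v)
    return np.unique(np.round(verts, 8), axis=0) if verts else np.empty((0, n))

def add_cut(A, b, z):
    """Pass from S_k = {x : A x <= b} to S_{k+1} = S_k with z x <= 1 added."""
    A1, b1 = np.vstack([A, z]), np.append(b, 1.0)
    return A1, b1, vertices_from_halfspaces(A1, b1)
```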

2.4. A Polyhedral Annexation Algorithm.

In the preceding section we have developed a polyhedral annexation procedure for solving the (DG)-problem. It now suffices to incorporate this procedure into the two-phase scheme outlined in Section VI.2.1 in order to obtain a polyhedral annexation algorithm for the problem (BCP). In the version presented below we describe the procedure through the sequence S1 ⊃ S2 ⊃ … and add a concavity cut before each return to Phase I.

Algorithm VI.3 (PA Algorithm for BCP).

Select ε ≥ 0.

Initialization:

Compute a point z ∈ D. Set M = D.

Phase I.

Starting with z, search for a vertex x0 of M which is a local minimizer of f(x) over M.

Phase II.

0) Let α = f(x0) − ε. Translate the origin to x0 and construct a cone K0 containing M (e.g., by rewriting the problem in standard form with respect to x0; then K0 = ℝ^n_+).

For each i = 1, 2, …, n construct the intersection ui of the i-th edge of K0 with the surface f(x) = α. Let

S1 = {x: uix ≤ 1, i = 1, 2, …, n}.

Compute v1 = eQ1^(-1), where Q1 = (u1, u2, …, un). Let V1 = {v1}, V'1 = V1. Set k = 1. (For k > 1, V'k is the set of new vertices of Sk.)


1) For each v ∈ V'k solve the linear program

LP(v; M)   maximize vx  subject to x ∈ M

to obtain the optimal value μ(v) and a basic optimal solution ω(v).

If f(ω(v)) < α for some v ∈ V'k, then set

M ← M ∩ {x: v1x ≥ 1},

where v1 was defined in Step 0), and return to Phase I. Otherwise, go to 2).

2) Compute vk ∈ arg max {μ(v): v ∈ Vk}. If μ(vk) ≤ 1, then stop: x0 is a global ε-optimal solution of (BCP). Otherwise, go to 3).

3) Let zk be the α-extension of ω(vk). Form

Sk+1 = Sk ∩ {x: zkx ≤ 1}.

Compute the vertex set Vk+1 of Sk+1 and let V'k+1 = Vk+1 \ Vk. Set k ← k+1 and return to 1).
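The following sketch shows how one Phase II cycle can be organized, assuming M = {x: Ax ≤ b} with the origin already translated to x0 and assuming the caller supplies the α-extension map; for brevity it re-solves the LPs for all vertices of Sk instead of only the new ones, and it reuses vertices_from_halfspaces from the sketch above.

```python
import numpy as np
from scipy.optimize import linprog

def pa_phase2(A, b, f, alpha, alpha_extension, U, max_iter=1000):
    """One Phase II cycle of Algorithm VI.3 (illustrative only).
    Rows of U are u1,...,un, so S1 = {x : U x <= e}.
    Returns ('optimal', None) if x0 is epsilon-optimal, or
    ('improve', w) when a feasible point better than alpha is found."""
    Az, bz = U.astype(float), np.ones(len(U))       # halfspaces of S_k
    for _ in range(max_iter):
        V = vertices_from_halfspaces(Az, bz)
        best_mu, best_w = -np.inf, None
        for v in V:
            # LP(v; M): maximize v x over M (linprog minimizes, hence -v)
            res = linprog(-v, A_ub=A, b_ub=b,
                          bounds=[(None, None)] * len(v))
            w, mu = res.x, -res.fun
            if f(w) < alpha:                         # trigger a restart
                return 'improve', w
            if mu > best_mu:
                best_mu, best_w = mu, w
        if best_mu <= 1.0:                           # D contained in G
            return 'optimal', None
        z = alpha_extension(best_w)                  # the point z^k
        Az, bz = np.vstack([Az, z]), np.append(bz, 1.0)
    raise RuntimeError('iteration limit reached')
```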

Theorem VI.3. The PA Algorithm terminates after a finite number of steps at a global ε-optimal solution.

Proof. Phase II is finite because it is the polyhedral annexation procedure for solving a (DG)-problem, with D = M and G = {x: f(x) ≥ α}. Each time the algorithm returns to Phase I, the current feasible set is reduced by a concavity cut v1x ≥ 1. Since the vertex x0 of M satisfies all the previous cuts as strict inequalities, it will actually be a vertex of D. The finiteness of the algorithm follows from the finiteness of the vertex set of D. ∎

Remarks VI.2. (i) During a Phase II, all of the linear programs LP(v; M) have the same constraint set M. This makes their solution a relatively easy task.

(ii) It is not necessary to take a local minimizer as x0. Actually, for the algorithm to work, it suffices that x0 be a vertex of D and α < f(x0) (in order that ui ≠ 0 (i = 1, 2, …, n) can be constructed). For example, Phase I can be modified as follows: Compute a vertex x̄ of D such that f(x̄) ≤ f(z). Let x0 ∈ arg min {f(x): x = x̄ or x is a vertex of D adjacent to x̄}. Then α = f(x0) − ε in Phase II, and when μ(vk) ≤ 1 (Step 2), x0 is a global ε-optimal solution.

(iii) Each return to Phase I is essentially a restart. Since the cardinality of Vk increases rapidly as Phase II proceeds, an important effect of the restarts is to prevent an excessive growth of Vk. Moreover, since a new vertex x0 is used as the starting point for the search process at each restart (and the region that remains to be explored is reduced by a cut), another effect of the restarts is to increase the chance of improving the current best feasible solution.

Therefore, the possibility of restarting is a notable advantage to be exploited in practical implementations.

The algorithm prescribes a restart when a point ω(v) has been found such that f(ω(v)) < α. However, independently of f(ω(v)), one can also restart whenever a point ω(v) has been found which is a vertex of D (note that each ω(v) is a vertex of M, but not necessarily a vertex of D): if f(ω(v)) ≥ α, simply return to Step 0 with x0 ← ω(v); otherwise, return to Phase I with z ← ω(v). It is easy to see that with this modification the algorithm will still be finite.

In other cases, when the set Vk approaches a critical size, a restart is advisable even if no ω(v) with the above properties is available. But then the original feasible domain D should be replaced by the last set M obtained. Provided that such restarts are applied in limited number, the convergence of the algorithm will not be adversely affected. On the other hand, since the polyhedral annexation method is sensitive to the choice of the starting vertex x0, a restart can often help to correct a bad choice.
(iv) The assumption ε > 0 is to ensure that ui ≠ 0 (i = 1, 2, …, n). However, if the polytope D is nondegenerate (i.e., if every vertex of D is adjacent to exactly n other vertices), then the algorithm works even for ε = 0, because then ℝ^n_+ coincides with the cone generated by the n edges of D emanating from x0.

Example VI.2. Consider the problem:

minimize f(x)
subject to  −x1 + x2 ≤ 3,
            x1 + x2 ≤ 11,
            2x1 − x2 ≤ 16,
            −x1 − x2 ≤ −1,
            x2 ≤ 5,
            x1 ≥ 0, x2 ≥ 0.

Fig. VI.3

The algorithm is initialized from z = (1.0, 0.5).

Phase I:

x0 = (9.0, 2.0).

Phase II:

0) α = f(x0) = −23.05;
the unique element v1 of V1 corresponds to the hyperplane through u1 = (7.0, −2.0) and u2 = (4.3, 6.7).

Iteration 1:

1) Solution of LP(v1, D): ω(v1) = (0.0, 1.0) with f(ω(v1)) = −18.45 > α.

2) μ(v1) > 1.

3) α-extension of ω(v1): z1 = (−0.505, 0.944).

V2 = {v21, v22} corresponds to the hyperplanes through u1, z1 and through u2, z1, respectively.

Iteration 2:

1) LP(v21, D): ω(v21) = (1.0, 0.0); LP(v22, D): ω(v22) = (0.0, 3.0).

2) v2 = arg max {μ(v): v ∈ V2} = v22.

3) V3 = {v31, v32} corresponds to the hyperplanes through z1, z2 and through u2, z2, respectively, where z2 = α-extension of ω(v22) = (−0.461, 3.051).

Iteration 3:

1) LP(v31, D): μ(v31) < 1; LP(v32, D): μ(v32) < 1.

2) v3 = v21.

3) z3 = (−0.082, −0.271);
V4 = {v41, v42} corresponds to the hyperplanes through u1, z3 and through z1, z3, respectively.

Iteration 4:

1) LP(v41, D) and LP(v42, D) have μ(v) < 1.

2) μ(v4) < 1: x0 = (9.0, 2.0) is a global optimal solution (Fig. VI.3).

Thus, the algorithm has actually verified the global optimality of the solution already obtained in the beginning.

Example VI.3. We consider the problem

minimize f(x) subject to Ax ≤ b, x ≥ 0,

where x ∈ ℝ^4,

f(x) = −[x1^(3/2) + 0.1(x1 − 0.5x2 + 0.3x3 + x4 − 4.2)²],

A =   1.2   1.4   0.4   0.8          6.8
     −0.7   0.8   0.8   0.0          0.8
      0.0   1.2   0.0   0.4    b =   2.1
      2.8  −2.1   0.5   0.0          1.2
      0.4   2.1  −1.5  −0.2          1.4
     −0.6  −1.3   2.4   0.5          0.8

Tolerance ε = 10^(−6).

First cycle:

D = {x: Ax ≤ b, x ≥ 0}.

Phase I:

x0 = (0, 0, 0, 0) (nondegenerate vertex of D).

Adjacent vertices:

y01 = (0.428571, 0.000000, 0.000000, 0.000000),
y02 = (0.000000, 0.666667, 0.000000, 0.000000),
y03 = (0.000000, 0.000000, 0.333333, 0.000000),
y04 = (0.000000, 0.000000, 0.000000, 1.600000).

Current best point x̄ = (0.000000, 0.666667, 0.000000, 0.000000) (see Remark VI.2(ii)).

Phase II:

0) α = f(x̄) − ε = −2.055111. The problem is in standard form with respect to x0: K0 = ℝ^4_+.

u1 = (1.035485, 0.000000, 0.000000, 0.000000),
u2 = (0.000000, 0.666669, 0.000000, 0.000000),
u3 = (0.000000, 0.000000, 29.111113, 0.000000),
u4 = (0.000000, 0.000000, 0.000000, 8.733334).

The vertex set V1 of S1 = {x: uix ≤ 1, i = 1, …, 4} is V1 = {v1} with

v1 = (0.413885, 0.999997, 0.011450, 0.183206).

Iteration 1:

1) Solution of LP(v1, D): ω(v1) = (0.413885, 0.999997, 0.011450, 0.183206) with f(ω(v1)) = −1.440295 > α.

2) μ(v1) = 3.341739 > 1.

3) α-extension of ω(v1): z1 = (1.616242, 1.654782, 1.088219, 3.144536);

V2 = {v21, v22, v23, v24}, with

v21 = (−1.162949, 1.499995, 0.034351, 0.114504),
v22 = (0.965731, −0.579109, 0.034351, 0.114504),
v23 = (0.965731, 1.499995, −3.127204, 0.114504),
v24 = (0.965731, 1.499995, 0.034351, −0.979605).

Iteration 2:

1) Solution of LP(v24, D): ω(v24) = (1.083760, 1.080259, 0.868031, 0.000000) with f(ω(v24)) = −2.281489 < α.

D ← D ∩ {x: v1x ≥ 1}, with v1 = (0.413885, 0.999997, 0.011450, 0.183206).

Second cycle (restart):

M = D ∩ {x: v1x ≥ 1}.

Phase I:

x0 = (1.169415, 1.028223, 0.169811, 4.861582) (nondegenerate vertex of M) with adjacent vertices:

y01 = (1.104202, 0.900840, 0.000000, 5.267227),
y02 = (1.216328, 1.245331, 0.818956, 2.366468),
y03 = (1.134454, 0.941176, 0.000000, 5.151261),
y04 = (0.957983, 0.991597, 0.000000, 5.327731).

Current best point x̄ = (1.083760, 1.080259, 0.868031, 0.000000) with f(x̄) = −2.281489.

Phase II:

After 12 iterations the algorithm finds the global optimal solution

(1.083760, 1.080259, 0.868031, 0.000000)

with objective function value −2.281489.

Thus, the global optimal solution ω(v24) is encountered after two iterations of the first cycle, but is identified as such only after a second cycle involving twelve iterations.

A report on some computational experiments with modifications of Algorithm VI.3 can be found in Horst and Thoai (1989), where, among others, the following types of objective functions are minimized over randomly generated polytopes in ℝ^n:

(1) xTCx + 2pTx (C a negative semidefinite n×n matrix),

(5) −max {‖x − di‖: i = 1, …, p} with chosen d1, …, dp ∈ ℝ^n.

The computational results, partially summarized in the table below for a number of problems with n ranging from 5 to 50, give an idea of the behaviour of slight modifications of Algorithm VI.3. The column "f(x)" indicates the form of the objective function; the column "Res" gives the number of restarts; the column "Lin" gives the number of linear programs solved.

The algorithm was coded in FORTRAN 77, and the computer used was an IBM PS/2, Model 80 (DOS 3.3). The time in seconds includes CPU time and time for printing the intermediate results.

 n   m   f(x)   Res   Lin   Time (seconds)

 5   15   (1)    2     19      5.39
 8   21   (1)    3     86     32.48
 9   27   (2)    5    417    200.76
10   30   (5)    3    138     84.60
12   11   (3)    4    129     73.35
12   18   (1)    9    350    220.20
20   18   (1)    6    265    442.52
20   13   (4)   18    188    950.41
30   22   (3)    7    341   1350.85
40   20   (3)    6    275   2001.71
50   21   (3)    8    582   6632.37

Table VI.1.

Notice that, as communicated to us by its authors, in Horst and Thoai (1989), 3.4, pp. 283–285, by an input error the figures correspond to the truncated objective function f(u) = −|u1|^(3/2). This example reduces to one linear program, for which general purpose concave minimization methods are, of course, often not efficient. Other objective functions encountered in the concave minimization literature turn out to be of the form φ[ℓ(u)], ℓ: ℝ^n → ℝ affine, φ: ℝ → ℝ (quasi)concave, so that they actually reduce to two linear programs (e.g., Horst and Thoai (1989), Horst, Thoai and Benson (1991)).

More recent investigations (Tuy (1991b and 1992a), Tuy and Tam (1992), Tuy, Tam and Dan (1994), Tuy (1995)) have shown that a number of interesting global optimization problems belong to the class of so-called rank k problems. These problems can be transformed into considerably easier problems of smaller dimension. Examples include certain problems with products in the objective (multiplicative programs), certain location problems, transportation-production models, and bilevel programs (Stackelberg games).

It is easy to see that objective functions of type (3) above belong to the class of rank two quasiconcave minimization problems, which can be solved by a parametric method that is a specialized version of the polyhedral annexation procedure (Tuy and Tam (1992)).

For more computational results on the polyhedral annexation method, we refer the reader to Tuy and Tam (1995).

2.5. Relations to Other Methods

There are several interpretations of the PA algorithm which show the relationship between this approach and other known methods.

(i) Polyhedral annexation and outer approximation.

Consider the (DG)-problem as formulated in Section 2.1, and let G# denote the polar of G, i.e., G# = {v: vx ≤ 1 ∀x ∈ G}. Since the PA algorithm for (DG) generates a nested sequence of polyhedra S1 ⊃ S2 ⊃ … ⊃ G#, each of which is obtained from the previous one by adding just one new constraint, the computational scheme is much like an outer approximation procedure performed on G#. One can even view the PA algorithm for (DG) as an outer approximation procedure for solving the following convex maximization problem:

maximize μ(v) subject to v ∈ G#,

where μ(v) := max v(D) (this is a convex function, since it is the pointwise maximum of the family of linear functions v ↦ vx, x ∈ D).

Indeed, starting with the polyhedron S1 ⊃ G#, one finds the maximum of μ(v) over S1. Since this maximum is achieved at a vertex v1 of S1, if μ(v1) > 1, then v1 ∉ G# (because max v1(D) = μ(v1) > 1), so one can separate v1 from G# by the hyperplane z1x = 1.

Next one finds the maximum of μ(v) over S2 = S1 ∩ {x: z1x ≤ 1}. If this maximum is achieved at v2, and if μ(v2) > 1, then v2 ∉ G#, so one can separate v2 from G# by the hyperplane z2x = 1, and so on. Note, however, that one stops when a vk is obtained such that μ(vk) ≤ 1, since this already solves our (DG)-problem.

Thus, the PA algorithm is a kind of dual outer approximation procedure. It would seem, then, that this algorithm should share most of the advantages and disadvantages of the outer approximation methods. However, unlike the usual outer approximation method when applied to the problem (BCP), the polyhedral annexation method at each stage provides a current best feasible solution that monotonically improves as the algorithm proceeds. Furthermore, since restarts are possible, the number of constraints on Sk has a better chance of being kept within manageable limits than in the usual outer approximation algorithms.

(ii) Polyhedral annexation as a procedure for constructing the convex hull.

In the (DG)-problem we assumed that 0 ∈ int G. However, it can easily be verified that all of the previous results still hold if we only assume that 0 ∈ int_K0 G (the interior of G relative to K0).

Now suppose that G = D, and that 0 is a nondegenerate vertex of D, while K0 is the cone generated by the n edges of D emanating from 0. Then the PA procedure for (DG) will stop only when Pk = D. That is, it will generate a sequence of expanding polytopes which are the convex hulls of increasing subsets of the vertex set of D; the last of these polytopes is precisely D.

Note that the same procedure can also be performed using the vertex set rather than the constraints of D. In fact, if we know a finite set E such that D = conv E, then instead of defining μ(v) = max {vx: x ∈ D} as above (the optimal value of LP(v; D)), we can simply define μ(v) = max {vx: x ∈ E}. Therefore, the PA algorithm can be used to solve the following problem:

Given a finite set E in ℝ^n, find linear inequalities that determine the convex hull of E.

This problem is encountered in certain applications (cf. Schachtman (1974)). For instance, if we have to solve a sequence of problems of the form min {ckx: x ∈ E},

where ck ∈ ℝ^n, k = 1, 2, …, then it may be more convenient first to find the constraints pix ≤ qi, i = 1, 2, …, r, of the convex hull of E and then solve the linear programs

min {ckx: pix ≤ qi, i = 1, 2, …, r}.

Assuming that E contains at least n+1 affinely independent points, to solve the above problem we start with the n-simplex P1 spanned by these n+1 points, and we translate the origin to an interior point of P1. Then we use the polyhedral annexation procedure to construct a sequence of expanding polytopes P1 ⊂ P2 ⊂ … terminating with Ph = conv E.
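As an illustration, the sketch below (names ours, reusing vertices_from_halfspaces from the earlier sketch) runs this annexation loop on a finite set E, assuming the origin has already been translated to an interior point of the initial simplex.

```python
import numpy as np

def pa_convex_hull(E, simplex_idx, tol=1e-9):
    """Facets of conv E by polyhedral annexation (sketch).
    E: (N, n) array of points; simplex_idx: indices of n+1 affinely
    independent points of E spanning the initial simplex P1.
    Returns vectors v with conv E = {x : v x <= 1 for every v}."""
    A = E[list(simplex_idx)].astype(float)   # polar of P1: v p_i <= 1
    b = np.ones(len(A))
    while True:
        V = vertices_from_halfspaces(A, b)   # vertices of S_k = facets of P_k
        mu = V @ E.T                         # mu[i, j] = v^i . x^j
        i, j = np.unravel_index(np.argmax(mu), mu.shape)
        if mu[i, j] <= 1.0 + tol:            # all of E lies in P_k: done
            return V
        A = np.vstack([A, E[j]])             # annex the point z^k = E[j]
        b = np.append(b, 1.0)
```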

(iii) Polyhedral annexation as a finite cut and split procedure.

For each polytope Pk let ℳk denote the collection of cones generated by the transversal facets of Pk. Clearly, ℳk forms a conical subdivision of the initial cone K0, and so the PA algorithm may also be regarded as a modification of the cut and split algorithm V.3.

Specifically, each transversal facet vx = 1 of Pk is a concavity cut which in the corresponding cone determines a pyramid M(v) containing only feasible points x with f(x) ≥ α. If max v(D) ≤ 1 for all these v (i.e., if the collection of all of the pyramids covers all of D), then the algorithm stops. Otherwise, using zk ∈ arg max {vkx: x ∈ D} with vk maximizing μ(v), we generate a new conical subdivision ℳk+1, etc.

Thus, compared with the cut and split algorithm, the fundamental difference is that the bases of the pyramids M(v) are required to be the transversal facets of a convex polytope. Because of this requirement, a cone may have more than n edges, and ℳk+1 may not necessarily be a refinement of ℳk (i.e., not every cone in ℳk+1 is a subcone of some cone in ℳk). On the other hand, this requirement allows the cones to be examined through linear programs with the same feasible domain throughout a cycle. This property in turn ensures that each zk is a vertex of D, which is a crucial condition for the finiteness of this kind of procedure.

In the cut and split algorithm, Pk+1 is obtained by merely adding to Pk the n-simplex spanned by zk and the vertices of the facet vk that generated zk; thus the annexation procedure is very simple. But, on the other hand, the method might not be finite, and we might need some anti-jamming device in order for it to converge. We shall return to this question in Section VII.1.

2.6. Extensions

So far the PA algorithm has been developed under the assumption that the feasible domain D is a polytope and that the objective function f(x) is finite throughout ℝ^n and has bounded level sets.

However, this algorithm can be extended in a straightforward manner to more general situations. Specifically, let us consider the following cases:

(i) D is a polyhedron, possibly unbounded but line free, while f(x) has bounded level sets.

In this case the linear program LP(v; D) might have no finite optimal solution. If max v(D) = +∞ for some v, i.e., if a halfline in D in the direction y = ω(v) is found for which vy > 0, then, since the set {x: f(x) ≥ t} is bounded for any real number t, we must have f(x) → −∞ over this halfline (therefore, the algorithm stops). Otherwise, we proceed exactly as in the case when D is bounded.

(ii) D is a polyhedron, possibly unbounded but line free, while inf f(D) > −∞ (but f(x) may have unbounded level sets).

Under these conditions certain edges of K0 might not meet the surface f(x) = α, so we define

S1 = {x: uix ≤ 1 (i ∈ I), uix ≤ 0 (i ∉ I)},

where I is the index set of the edges of K0 which meet the surface f(x) = α, and if i ∉ I, then ui denotes the direction of the i-th edge. Moreover, even if the linear program LP(v; D) has a finite optimal solution ω(v), this point might have no finite α-extension. That is, at certain steps, the polyhedral annexation process might involve taking the convex hull of the union of the current polyhedron Pk and a point at infinity (i.e., a direction) zk.

To see how one should proceed in this case, observe that Proposition VI.5 still holds when P is an unbounded polyhedron, except that 0 is then a boundary rather than an interior point of S (the polar of P). In particular, an extreme direction of P corresponds to a nontransversal facet of S. Therefore, the polyhedron Sk+1 in Step 3 of Algorithm VI.3 should be defined as follows:

If the α-extension of ω(vk) is a (finite) point zk, then let

Sk+1 = Sk ∩ {x: zkx ≤ 1}.

Otherwise, either ω(vk) or its α-extension is at infinity. In this case let zk be the direction of ω(vk) and

Sk+1 = Sk ∩ {x: zkx ≤ 0}.

(iii) D is a polytope, while f(x) is finite only on D (f(x) = −∞ outside D).

Most of the existing concave minimization methods require that the objective function f: D → ℝ can be extended to a finite concave function on a suitable set A ⊃ D.

However, certain problems of practical interest involve an objective function f(x) which is defined only on D and cannot be finitely extended to ℝ^n. To solve these problems, the outer approximation methods, for example, cannot be applied.

However, the PA algorithm might still be useful in these cases. Assume that a nondegenerate vertex of D, say x0 = 0, is available which is a local minimizer.

Clearly, for any α ∈ f(D) the set G = {x ∈ D: f(x) ≥ α} is a subset of D, while if α = min f(D), then it is identical to D. Therefore, the best way to apply Algorithm VI.3 is to keep x0 fixed (so that M = D) and to always execute Step 2 after solving the linear programs LP(v; D), v ∈ V'k, regardless of the value f(ω(v)). Thus the algorithm will involve just one cycle and will generate a finite sequence of expanding polytopes, each obtained from the previous one by annexing just one new vertex.

Such a procedure, if carried out completely, will certainly produce all of the vertices of D. However, there are several respects in which this procedure differs from a simple full enumeration.

First, it is a kind of branch and bound procedure, in each step of which all of the vertices that remain to be examined are divided into a number of subsets (corresponding to the facets v of the current polytope Pk), and the subset that has maximal μ(v) = max v(D) is chosen for further partitioning. Although μ(v) is not a lower bound for f(x) on the set D ∩ {x: vx ≤ 1}, it does provide reasonable heuristics to guide the branching process.

Second, the best solution obtained up to iteration k is the best vertex of the polytope Pk; Pk approximates D monotonically as k increases.

Third, the accuracy of the approximation can be estimated with the help of the following proposition.

Proposition VI.6. Let Pk be the approximating polytope at iteration k, and let Vk be the collection of its transversal facets (i.e., the collection of vertices of its polar Sk). If d(x, Pk) denotes the distance from x to Pk, then

max {d(x, Pk): x ∈ D} ≤ ρk max {μ(v) − 1: v ∈ Vk},

where ρk is the maximal distance from the origin x0 = 0 to the vertices of Pk.

Proof. Let x ∈ D \ Pk. Then there is a v ∈ Vk such that x belongs to the cone generated by the facet v. Denoting by y the point where the line segment [x0, x] meets the hyperplane through the facet v, we see that y ∈ Pk and

d(x, Pk) ≤ ‖x − y‖ = (vx − 1)‖y‖ ≤ (μ(v) − 1) ρk,

and hence

d(x, Pk) ≤ ρk max {μ(v) − 1: v ∈ Vk}.

Since this holds for arbitrary x ∈ D \ Pk, the proposition follows. ∎


3. CONVEX UNDERESTIMATION

It is sometimes more convenient to apply an outer approximation (or relaxation) method not to the original problem itself, but rather to a transformed problem. An example of such an approach is the successive underestimation method, which was first introduced by Falk and Soland (1969) in the context of separable nonconvex programming, and was later used by Falk and Hoffman (1976) to solve the concave programming problem when the feasible set D is a polytope. Similar methods have also been developed by Emelichev and Kovalev (1970), by Bulatov (1977), and by Bulatov and Kasinskaya (1982) (see also Bulatov and Khamisov (1992)). The first method for solving the nonlinearly constrained concave minimization problem by means of convex underestimation was presented in Horst (1976); a survey of the development since then is contained in Benson (1995).

3.1. Relaxation and Successive Underestimation

Consider the general problem

min f(D),   (22)

where D is an arbitrary subset of ℝ^n and f(x) is an arbitrary function defined on some set S ⊃ D. Setting

Ḡ = {(x,t): x ∈ D, f(x) ≤ t} ⊂ ℝ^(n+1),

we see that this problem is equivalent to the problem

min {t: (x,t) ∈ Ḡ}.   (23)

Applying an outer approximation method to the latter problem leads us to construct a sequence of nested sets

Ḡ1 ⊃ Ḡ2 ⊃ … ⊃ Ḡ

such that each problem

min {t: (x,t) ∈ Ḡk}   (24)

can be solved by available algorithms, and its optimal solution approaches an optimal solution of (22) as k increases.

Suppose that Ḡk = Gk ∩ (Dk × ℝ), where Gk ⊃ Gk+1 ⊃ Ḡ, Dk ⊃ Dk+1 ⊃ D, and Gk is the epigraph of some function φk defined on some set Sk ⊃ Dk.

It follows that we must have

φk(x) ≤ φk+1(x) ≤ f(x) ∀x ∈ D,   (25)

i.e., the functions φk(x) form a nondecreasing sequence of underestimators (subfunctionals) of f(x).

Problem (24) is equivalent to

min {φk(x): x ∈ Dk},   (26)

and is called a relaxation of the problem (22). The above variant of outer approximation is called the successive relaxation method.

Usually Dk is a polyhedron, and φk(x) is an affine, or piecewise affine, or convex underestimator of f(x), so that the relaxed problem (26) can be solved by standard linear or convex programming techniques.

When Dk = D for every k, the method is also called the successive underestimation method. Thus, the successive underestimation method for solving (22) consists in constructing a sequence of underestimators φk(x) satisfying (25) and such that the sequence

xk ∈ arg min {φk(x): x ∈ D}

converges to an optimal solution of (22). These underestimators are constructed adaptively, i.e., φk+1 is determined on the basis of the information given by xk. Specifically, since φk(xk) = min φk(D) and φk underestimates f, it is clear that:

1) if φk(xk) = f(xk), then f(xk) = min f(D), i.e., xk solves (22), and we stop;

2) otherwise, φk+1 must be constructed such that φk(x) ≤ φk+1(x) for all x ∈ D and φk ≠ φk+1.

Of course, the convergence of the method crucially depends upon the choice of φk, k = 1, 2, ….

Remark VI.3. Another special case of the successive relaxation method is when φk(x) ≡ f(x) for every k. Then the sequence Dk is constructed adaptively, based on the results of solving the relaxed problem min f(Dk) at each step: this is just the usual outer approximation method that we discussed in Chapter II and Section VI.1.

3.2. The Falk and Hoffman Algorithm

With the above background, let us now return to the concave programming problem (BCP), i.e., the problem (22) in which D is a polytope of the form

D = {x ∈ ℝ^n: Ax ≤ b}

with b ∈ ℝ^m, A an m×n matrix, and f(x) a concave function defined throughout ℝ^n.

Let us apply the successive underestimation method to this problem, using as underestimators of f(x) the convex envelopes of f(x) taken over polytopes S ⊃ D.

Recall from Section IV.4.3 that the convex envelope of f(x) taken over S is the largest convex underestimator of f over S.

Proposition VI.7. Let Sk be a polytope with vertices vk,0, vk,1, …, vk,N(k), and let φk(x) be the convex envelope of f(x) taken over Sk. Then the relaxed problem min {φk(x): x ∈ D} is a linear program which can be written as

(Qk)   minimize Σ_{j=0}^{N(k)} λj f(vk,j)

       s.t.  Σ_{j=0}^{N(k)} λj A vk,j ≤ b,

             Σ_{j=0}^{N(k)} λj = 1,

             λj ≥ 0 (j = 0, …, N(k)).

Proof. Proposition VI.7 follows in a straightforward manner from Theorem IV.6. ∎
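In practice (Qk) is a small LP in the weights λ; a minimal sketch with an off-the-shelf solver (names ours):

```python
import numpy as np
from scipy.optimize import linprog

def solve_Qk(V, f, A, b):
    """Solve (Q_k): min sum_j lam_j f(v_j)  s.t.  sum_j lam_j A v_j <= b,
    sum_j lam_j = 1, lam >= 0, where the rows of V are the vertices of S_k.
    Returns the weights lam and the point x_k = sum_j lam_j v_j."""
    fv = np.array([f(v) for v in V])      # objective coefficients f(v_j)
    res = linprog(fv, A_ub=A @ V.T, b_ub=b,
                  A_eq=np.ones((1, len(V))), b_eq=[1.0])
    lam = res.x
    return lam, lam @ V
```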

Now let λk = (λk0, …, λkN(k)) be an optimal solution of (Qk), and let xk = Σ_{j∈J(k)} λkj vk,j, where J(k) = {j: λkj > 0}. Then

xk ∈ arg min {φk(x): x ∈ D}.   (28)

If vk,j ∈ D for all j ∈ J(k), then f(vk,j) ≥ min f(D) for all j ∈ J(k), and hence

φk(xk) = Σ_{j∈J(k)} λkj f(vk,j) ≥ min f(D).   (29)

In view of (28), this implies that

f(vk,j) = min f(D) for all j ∈ J(k).   (30)

Therefore, any vk,j (j ∈ J(k)) is a global optimal solution of (BCP).

In general, however, vk,j ∉ D at least for some j ∈ J(k), for example, vk,jk ∉ D. Then, taking any ik ∈ {1, …, m} such that

A_ik vk,jk > b_ik

(i.e., the ik-th constraint of D is violated), and defining

Sk+1 = Sk ∩ {x: A_ik x ≤ b_ik},

we obtain a polytope smaller than Sk but still containing D. Hence, for the convex envelope φk+1 of f over Sk+1 we have φk+1(x) ≥ φk(x), φk+1 ≠ φk.

We are thus led to the following successive convex underestimation (SCU) algorithm of Falk and Hoffman (1976):

Algorithm VI.4 (SCU Algorithm).

Initialization:

Select an n-simplex S0 containing D. Identify the vertex set V0 = vert S0. Let V0 = {v0,0, …, v0,n}. Set N(0) = n.

Iteration k = 0, 1, …:

1) Solve problem (Qk), obtaining an optimal solution λk. Let J(k) = {j: λkj > 0}.

2) If vk,j ∈ D for all j ∈ J(k), then stop: any vk,j with j ∈ J(k) is a global optimal solution of (BCP).

3) Otherwise, there is a vk,jk ∉ D with jk ∈ J(k). Select any ik ∈ {1, …, m} such that A_ik vk,jk > b_ik, and define

Sk+1 = Sk ∩ {x: A_ik x ≤ b_ik}.

4) Compute the vertex set Vk+1 of Sk+1 (from knowledge of Vk, using, e.g., one of the procedures in Section II.4). Let Vk+1 = {vk+1,0, …, vk+1,N(k+1)}. Go to iteration k+1.

Theorem VI.4. Algorithm VI.4 terminates after at most m iterations, yielding a global optimal solution.

Proof. As in the proof of Theorem VI.1, it is readily seen that each ik is distinct from all the previous indices i0, …, ik−1. Hence, after at most m iterations, we have Sm = D, and then vm,j ∈ D for all j ∈ J(m), i.e., each of these points is a global optimal solution of (BCP). ∎

Remarks VI.4. (i) Step 2 can be improved by computing

vk,jk ∈ arg min {f(vk,j): j ∈ J(k)}.

If vk,jk ∈ D, then f(vk,jk) ≤ min f(D), and hence (29) and (30) hold, i.e., vk,jk is a global optimal solution. Otherwise, one goes to Step 3.

In Step 3, a reasonable heuristic for choosing ik is to choose that constraint (not already used in Sk) which is most severely violated by vk,jk.

(ii) When compared to Algorithm VI.1, a relative advantage of the successive convex underestimation method is that it yields a sequence of feasible solutions xk such that φk(xk) monotonically approaches the optimal value of the problem (see (25)). However, the price paid for this advantage is that the computational effort required to solve (Qk) is greater than that needed to determine xk ∈ arg min {f(x): x ∈ Vk} in Algorithm VI.1. It is not clear whether the advantage is worth the price.

3.3. Rosen's Algorithm

A method closely related to convex underestimation ideas is that of Rosen (1983), which is primarily concerned with the following concave quadratic programming problem (CQP):

(CQP)   minimize f(x) := pTx − ½ xTCx   subject to x ∈ D,

where D is a polytope in ℝ^n, p ∈ ℝ^n, and C is a symmetric positive definite n×n matrix.

An important property of a quadratic function is that it can be put into separable form by a linear transformation. Namely, if U = [u1, u2, …, un] is a matrix of normalized eigenvectors of C, so that UTCU = diag(λ1, λ2, …, λn) > 0, then by setting x = Uy, F(y) = f(Uy), we have:

F(y) = qy − ½ yT(UTCU)y = Σ_{j=1}^n Fj(yj),

where

Fj(yj) = qj yj − ½ λj yj²,   q = UTp.   (31)

Thus, problem (CQP) becomes

minimize Σ_{j=1}^n Fj(yj) subject to y ∈ Ω,

where Ω = {y: Uy ∈ D}.
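Numerically, the separable form is obtained from one symmetric eigendecomposition; a minimal sketch (names ours):

```python
import numpy as np

def separable_form(C, p):
    """Diagonalize f(x) = p.x - 0.5 x.Cx via x = U y: returns U, the
    eigenvalues lam, q = U^T p, and a function giving the terms
    F_j(y_j) = q_j y_j - 0.5 lam_j y_j^2."""
    lam, U = np.linalg.eigh(C)     # columns of U: orthonormal eigenvectors
    q = U.T @ p
    F_terms = lambda y: q * np.asarray(y) - 0.5 * lam * np.asarray(y)**2
    return U, lam, q, F_terms
```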

Lemma VI.4. Let F(y) = Σ_{j=1}^n Fj(yj) be a separable function, and let S be any rectangular domain of the form

S = {y: βj+n ≤ yj ≤ βj, j = 1, …, n}.   (32)

Then for every vertex w of S we have

F(w) = Σ_{j=1}^n Fj(wj)  with  wj = βj or wj = βj+n.   (33)

Proof. Lemma VI.4 follows because any vertex w of S must satisfy wj = βj or wj = βj+n. ∎

To simplify the language, in the sequel we shall always use the term rectangle to mean a domain defined by 2n inequalities of the form (32), the facets of which are parallel to the hyperplanes yj = 0.

Now the basic idea of the method can be described as follows.

Compute a starting vertex v of Ω which is reasonably good (provided, of course, that we can find it with a relatively small amount of calculation). Then with the help of the above lemma construct two rectangles S, T such that: S is the smallest rectangle containing Ω, while T is the largest rectangle contained in the ellipsoid E = {y ∈ ℝ^n: F(y) ≥ F(v)}. Denote φ* = min {F(y): y ∈ Ω}. Then, since S ⊃ Ω, the number φ̲ = min {F(y): y ∈ S} (which is easy to compute using the above lemma) yields a lower bound for φ*, while F(v) is an upper bound. Furthermore, since T ⊂ E, we will have F(y) ≥ F(v) for all y ∈ T. Hence, if Ω \ T = ∅, then no feasible point better than v can exist, i.e., v is a global optimal solution. Otherwise, the interior of T can be excluded from further consideration and it remains to investigate only the set Ω \ int T. Though this set is not convex, it is the union of r ≤ 2n convex pieces, and it should be possible to handle each of these pieces, for example, by the Falk–Hoffman method, or, alternatively, in the same manner as Ω was treated.

Specifically, let us first compute for each j = 1, …, n:

vj ∈ arg max {yj: y ∈ Ω},   vj+n ∈ arg min {yj: y ∈ Ω}.

Clearly, since each vj (j = 1, …, 2n) is the optimal solution of a linear program, vj can be assumed to be a vertex of Ω. Let

v ∈ arg min {F(vj): j = 1, 2, …, 2n}.

Then F(v) is an upper bound for the number φ* = min {F(y): y ∈ Ω}. On the other hand, with βj = (vj)j, βj+n = (vj+n)j, the set

S = {y: βj+n ≤ yj ≤ βj, j = 1, …, n}

is the smallest rectangle containing Ω. Therefore, the number

φ̲ = min {F(y): y ∈ S}

furnishes a reasonably tight lower bound for φ*. If it happens that

F(v) − φ̲ ≤ ε,

where ε is the prescribed tolerance, then v is accepted as an approximate global optimal solution of (CQP).

Proposition VI.8. We have

φ̲ = Σ_{j=1}^n min (ωj, ωj+n),   (34)

where

ωj = Fj(βj),  ωj+n = Fj(βj+n)   (j = 1, 2, …, n).

Proof. Since the minimum of F(y) over S is attained at some vertex, φ̲ is equal to the smallest of all of the numbers F(w) where w is a vertex of S. The proposition then follows from formulas (31) and (33). ∎
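Formula (34) is a one-line computation once the endpoint values are known; a sketch using F_terms from the previous sketch:

```python
import numpy as np

def phi_lower(F_terms, beta_hi, beta_lo):
    """Lower bound (34): sum over j of min(F_j(beta_j), F_j(beta_{j+n}))."""
    return float(np.sum(np.minimum(F_terms(np.asarray(beta_hi)),
                                   F_terms(np.asarray(beta_lo)))))
```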



Now let us construct the second rectangle T. Let ȳ be the point of Ω where F(y) attains its maximum over Ω (this point is computed by solving the convex program max {F(y): y ∈ Ω}). Obviously, ΔF := F(ȳ) − F(v) > 0.

Proposition VI.9. For each j let γj, γj+n (γj > γj+n) be the roots of the equation

−½ λj η² + qj η = Fj(ȳj) − ΔF/n.   (35)

Then the rectangle

T = {y: γj+n ≤ yj ≤ γj, j = 1, …, n}

has all its vertices lying on the surface F(y) = F(v).

Proof. Since Fj(ȳj) − ΔF/n < max_{t∈ℝ} Fj(t) (where Fj(t) = qj t − ½ λj t²), the quadratic equation (35) has two distinct roots. By Lemma VI.4, for each vertex w of T we have F(w) = Σ_{j=1}^n Fj(wj) with wj = γj or wj = γj+n, and hence F(w) = Σ_{j=1}^n (Fj(ȳj) − ΔF/n) = F(ȳ) − ΔF = F(v). ∎
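The bounds of T follow from the quadratic formula applied to (35); a sketch (assuming all λj > 0, names ours):

```python
import numpy as np

def rectangle_T(lam, q, y_bar, F_bar, F_v):
    """Roots gamma_j > gamma_{j+n} of equation (35):
    -0.5 lam_j eta^2 + q_j eta = F_j(ybar_j) - (F_bar - F_v)/n."""
    n = len(lam)
    rhs = (q * y_bar - 0.5 * lam * y_bar**2) - (F_bar - F_v) / n
    disc = np.sqrt(q**2 - 2.0 * lam * rhs)     # positive: two distinct roots
    return (q - disc) / lam, (q + disc) / lam  # gamma_{j+n}, gamma_j
```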

Since φ̲ ≤ φ* ≤ F(v) ≤ F(y) for all y ∈ T, the optimal solution of (CQP) must be sought among the feasible points lying on the boundary or outside of T, i.e., in the set Ω \ int T (see Fig. VI.4). Since the latter set is nonconvex, it must be treated in a special way. The most convenient method is to use the hyperplanes that determine T to partition Ω \ int T into 2n convex pieces (some of which may be empty). That is, if

Hj = {y: yj ≥ γj},   Hj+n = {y: yj ≤ γj+n}   (j = 1, …, n),

then we must consider the polytopes

Ωj = Hj ∩ Ω   (j = 1, …, 2n).

For each nonempty polytope Ωj we already know one vertex, namely vj. Usually an n-simplex with a vertex at vj can be constructed that contains Ωj, so that each problem

min {F(y): y ∈ Ωj}   (j = 1, …, 2n)   (36)

can be treated, for example, by the Falk–Hoffman algorithm. If φj denotes the optimal value in (36), then φ* = min {φ1, …, φ2n, F(v)}, with the convention that φj = +∞ if Ωj = ∅. The following example illustrates the above procedure.

Example VI.4. Consider the two-dimensional problem whose main features are presented in Fig. VI.4.

The feasible set Ω and a level curve of the objective function F(y) are shown.

First, v1 ∈ arg max {y1: y ∈ Ω}, v3 ∈ arg min {y1: y ∈ Ω}, and (similarly) v2, v4 are computed, and the rectangle S is constructed. The minimum of F(vi), i = 1, …, 4, is attained at v = v3 (this is actually the global minimum over Ω, but it is not yet recognized as such, since F(v) > φ̲ = min {F(y): y ∈ S}).

Next we determine ȳ ∈ arg max {F(y): y ∈ Ω} and construct the rectangle T inscribed in the level ellipsoid F(y) = F(v). The residual domains that remain after the deletion of the interior of T are Ω3 and Ω4. Since Ω4 is a simplex with all vertices inside the level ellipsoid F(y) = F(v), it can be eliminated from further consideration. By constructing a simplex which has one vertex at v3 and contains Ω3, we also see that all of the vertices of this simplex lie inside our ellipsoid. Therefore, the global minimum of F over Ω is identified as v* = v = v3.

Fig. VI.4

4. CONCAVE POLYHEDRAL UNDERESTIMATION

In this section concave piecewise affine underestimators are used to solve the concave minimization problem.

4.1. Outline of the Method

We again consider the problem

(BCP)   minimize f(x) subject to x ∈ D,

where D is a polytope in ℝ^n and f: ℝ^n → ℝ is a concave function.

Let M1 = [v1, …, vn+1] be an n-simplex containing D. For the method to be developed below, it will suffice to suppose that f(x) is concave and finite on M1.

Let

G = {(x,t) ∈ M1 × ℝ: f(x) ≥ t}

denote the hypograph of f(x). Clearly, G is a convex set. Now consider a finite set X ⊂ M1 such that conv X = M1. Let

Z = {z = (x, f(x)): x ∈ X} ⊂ M1 × ℝ,

and let P denote the set of points (x,t) ∈ M1 × ℝ on or below the convex hull of Z. Then we can write

P = conv Z − Γ,

where Γ = {(0,t) ∈ ℝ^n × ℝ: t > 0} is the positive vertical halfline. We shall call P a trunk with base X. Clearly, P is the hypograph of a certain concave function φ(x) on M1.

Proposition VI.10. The polyhedral function φ(x) with hypograph P is the lowest concave underestimator of f(x) that agrees with f(x) at each point x ∈ X.

Proof. The function φ(x) is polyhedral, since P is a polyhedron. From the construction it follows that P ⊂ G, hence φ(x) ≤ f(x) ∀x ∈ D.

If ψ is any concave function that agrees with f(x) on X, then its hypograph

{z = (x,t) ∈ M1 × ℝ: ψ(x) ≥ t}

must be convex and must contain Z; hence it must contain P. This implies that ψ(x) ≥ φ(x). ∎

We now outline the concave polyhedral underestimation method and discuss the main computational issues involved.

We start with the set

X1 = {v1, …, vn+1}.

At the beginning of iteration k = 1, 2, … we already have a finite set Xk such that X1 ⊂ Xk ⊂ M1 = conv Xk, along with the best feasible point x̄k−1 known so far. Let Pk be the trunk with base Xk and let φk(x) be the concave function with hypograph Pk. We solve the relaxed problem

(SPk)   minimize φk(x) subject to x ∈ D,

obtaining a basic optimal solution xk. Let x̄k be the point with the least function value among x̄k−1 and all of the new vertices of D that are encountered while solving (SPk). Since φk(x) is an underestimator of f(x), φk(xk) yields a lower bound for min f(D). Therefore, if

φk(xk) = f(x̄k),

then x̄k is a global optimal solution of (BCP) and we terminate. Otherwise, we must have xk ∉ Xk (since xk ∈ Xk would imply that φk(xk) = f(xk) ≥ f(x̄k), because the function φk(x) agrees with f(x) on Xk). Setting Xk+1 = Xk ∪ {xk}, we then pass to iteration k+1.

Theorem VI.5. The procedure just described terminates after finitely many iterations, yielding a global optimal solution of (BCP).

Proof. We have φk(xk) ≤ min f(D) ≤ f(x̄k). Therefore, if the k-th iteration is not the last one, then φk(xk) < f(x̄k) ≤ f(xk) = φk+1(xk). This shows that φk ≠ φk+1, and hence Xk ≠ Xh for all h < k. Since each xk is a vertex of D, the number of iterations is bounded from above by the number of vertices of D. ∎

Thus, finite termination is ensured for this procedure, which can also be identified with a polyhedral annexation (inner approximation) method, where the target set is G and the expanding polyhedra are P1, P2, … (note that Pk+1 = conv(Pk ∪ {zk}), with zk = (xk, f(xk)) ∉ Pk).

For a successful implementation of this procedure, the main issue is, of course, how to compute the functions φk(x) and how to solve the relaxed problems (SPk). We proceed to discuss this issue in the sections that follow.

4.2. Computation of the Concave Underestimators φk(x)

Consider any nonvertical facet σ of Pk. Since dim σ = n, σ lies in a uniquely determined hyperplane Hσ in ℝ^(n+1), which is a supporting hyperplane of Pk at every (x,t) ∈ σ. This hyperplane is not vertical, so its equation has the form

qx + t = q0, with q = q(σ) ∈ ℝ^n, q0 = q0(σ) ∈ ℝ.

Now let ℱk denote the collection of all nonvertical facets of Pk.

Proposition VI.11. We have φk(x) = min {φk,σ(x): σ ∈ ℱk}, where

φk,σ(x) = q0(σ) − q(σ)x.

Proof. The result follows because the trunk Pk is defined by the inequalities

t ≤ q0(σ) − q(σ)x (σ ∈ ℱk),   x ∈ M1. ∎
The above formulas show that the function φk(x) can be computed once the equations of the hyperplanes through the nonvertical facets are known. Since

min {φk(x): x ∈ D} = min_{x∈D} min_{σ∈ℱk} φk,σ(x) = min_{σ∈ℱk} min_{x∈D} φk,σ(x),

the relaxed problem (SPk) can be solved by separately solving each linear program

LP(σ; D)   minimize [q0(σ) − q(σ)x] subject to x ∈ D,

and taking

xk = xk,σ* with σ* ∈ arg min {q0(σ) − q(σ)xk,σ: σ ∈ ℱk},   (37)

where

xk,σ ∈ arg min {q0(σ) − q(σ)x: x ∈ D}.

Thus, computing the functions φk as well as solving (SPk) reduces to the computation of the nonvertical facets of Pk, or rather, of the associated affine functions φk,σ(x).
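Assuming the pairs (q(σ), q0(σ)) are at hand, (SPk) is solved by one LP per facet; a sketch (names ours):

```python
import numpy as np
from scipy.optimize import linprog

def solve_SPk(facets, A, b):
    """(SP_k) via LP(sigma; D) for each nonvertical facet (q, q0):
    minimize q0 - q x over D = {x : A x <= b}; keep the best value."""
    best_val, best_x = np.inf, None
    for q, q0 in facets:
        q = np.asarray(q, float)
        res = linprog(-q, A_ub=A, b_ub=b,        # maximize q x over D
                      bounds=[(None, None)] * len(q))
        val = q0 + res.fun                       # = min over D of (q0 - q x)
        if val < best_val:
            best_val, best_x = val, res.x
    return best_val, best_x                      # phi_k(x^k) and x^k
```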

4.3. Computation of the Nonvertical Facets of Pk

Observe that the initial trunk P1 has just one nonvertical facet, which is the n-simplex spanned by (vi, f(vi)), i = 1, …, n+1. Therefore, it will suffice to consider the following auxiliary problem:

(ℱ)   Let Xk+1 = Xk ∪ {xk}, where xk ∈ M1 \ Xk. Given the collection ℱk of nonvertical facets of the trunk Pk with base Xk, find the collection ℱk+1 of nonvertical facets of the trunk Pk+1 with base Xk+1.

By translating if necessary, we may assume that 0 ∈ int M1 and f(x) > 0 for all x ∈ M1, so that any trunk P with base X contains 0 ∈ ℝ^(n+1) in its interior. Under these conditions, we shall convert problem (ℱ) into an easier one by using the following result, which is analogous to Proposition VI.4, from which, in fact, it could be derived.

Proposition VI.12. Let P be a trunk with base X, and let

S = {(q,q0) ∈ ℝ^n × ℝ: q0 − qx ≥ t ∀(x,t) ∈ vert P}.

Then each nonvertical facet σ of P whose hyperplane is qx + t = q0 corresponds to a vertex (q,q0) of S, and conversely.

Proof. Consider any nonvertical facet σ of P. Since 0 ∈ int P, the hyperplane through σ does not contain 0, and so σ must contain n+1 linearly independent vertices of P. That is, the equation of the hyperplane through σ, t = q0 − qx, must be satisfied by n+1 linearly independent elements of vert P. Since this hyperplane is a supporting hyperplane for P, we have t ≤ q0 − qx for all (x,t) ∈ vert P. Hence, (q,q0) ∈ S, and it satisfies at least n+1 linearly independent constraints of S as equalities. This implies that (q,q0) is a vertex of S.

Conversely, let (q,q0) be a vertex of S. Then t ≤ q0 − qx for all (x,t) ∈ vert P, and hence for all (x,t) ∈ P; moreover, of these at least n+1 linearly independent constraints are satisfied as equalities. Hence, the hyperplane t = q0 − qx is a supporting hyperplane of P, and it passes through at least n+1 linearly independent vertices of P. This implies that the intersection of P with this hyperplane is a facet of P; this facet cannot be vertical, because the coefficient of t in the equation of its hyperplane is 1. ∎

Corollary VI.3. Let

Sk = {(q,q0): q0 − qx ≥ f(x) ∀x ∈ Xk}.   (38)

Then every nonvertical facet of Pk+1 whose hyperplane is t = q0 − qx corresponds to a vertex (q,q0) of

Sk+1 = Sk ∩ {(q,q0): q0 − qxk ≥ f(xk)}   (39)

and conversely.

Proof. This follows because vert Pk+1 ⊂ vert Pk ∪ {(xk, f(xk))}. ∎

The above results show that the auxiliary problem (ℱ) is equivalent to the following problem:
(ℳ)   Suppose that the vertex set ℳk of Sk (defined in (38)) is known. Compute the vertex set ℳk+1 of Sk+1.

Since Sk+1 is obtained by adding just one new linear constraint to Sk (formula (39)), problem (ℳ) can be solved by the available methods (cf. Section II.4.2). Once the vertices of Sk+1 have been computed, the equations of the nonvertical facets of Pk+1, and hence the function φk+1, are known.

In more detail, the computation of the nonvertical facets of Pk can be carried out in the following way.

First compute the unique nonvertical facet of P1, which is given by the unique vertex of

S1 = {(q,q0): q0 − qvi ≥ f(vi), i = 1, …, n+1}.   (40)

Clearly this vertex is the solution of the system

q0 − qvi = f(vi)   (i = 1, …, n+1),

and hence

(q, q0) = (f(v1), …, f(vn+1)) Q1^(-1),   (41)

where

Q1 is the (n+1)×(n+1) matrix whose i-th column is (−vi, 1).   (42)

At iteration k, the vertex set ℳk of Sk is already known. Form Sk+1 by adding to Sk the new constraint

q0 − qxk ≥ f(xk),

where xk is the point to be added to Xk. Compute the vertex set ℳk+1 of Sk+1 (by any available subroutine, e.g., by any of the methods discussed in Section II.4.2). Then every (q,q0) ∈ ℳk+1 yields a nonvertical facet of Pk+1, defined by the hyperplane t = q0 − qx.

4.4. Polyhedral Underestimation Algorithm

The above development leads to the following algorithm.

Algorithm VI.5 (PU Algorithm)

Compute a vertex x̄0 of D.

0) Choose an n-simplex [v1, v2, …, vn+1] containing D, where v1 = x̄0. Set X1 = {v1, v2, …, vn+1}, S1 = {(q,q0): q0 − qvi ≥ f(vi), i = 1, 2, …, n+1}. Let ℳ1 be the singleton {(f(v1), …, f(vn+1)) Q1^(-1)}, where Q1 is the matrix (42). Set 𝒩1 = ℳ1, k = 1.

1) For each (q,q0) ∈ 𝒩k solve the linear program

minimize (q0 − qx) subject to x ∈ D,

obtaining a basic optimal solution ω(q,q0) and the optimal value β(q,q0).

2) Compute

(qk, q0k) ∈ arg min {β(q,q0): (q,q0) ∈ ℳk}

and let xk = ω(qk, q0k), γk = β(qk, q0k).

3) Update the current best solution by taking x̄k equal to the best among x̄k−1 and the points ω(q,q0), (q,q0) ∈ 𝒩k.

4a) If f(x̄k) = γk, then terminate: x̄k is a global optimal solution of (BCP).

4b) If f(x̄k) < f(x̄k−1), set x̄0 ← x̄k, and return to Step 0.

5) Otherwise, let

Sk+1 = Sk ∩ {(q,q0): q0 − qxk ≥ f(xk)}.

Compute the vertex set ℳk+1 of Sk+1. Set 𝒩k+1 = ℳk+1 \ ℳk. Let k ← k+1 and return to 1).

Remarks VI.5. (i) Finiteness of the above algorithm follows from Theorem VI.5. Indeed, since at each return to Step 0 (restart) the new vertex x̄0 is better than the previous one, Step 4b can occur only finitely many times. That is, from a certain moment on, Step 4b never occurs, and the algorithm coincides exactly with the procedure described in Section VI.4.1. Therefore, by Theorem VI.5 it must terminate at a Step 4a, establishing that the last x̄k is a global optimal solution.

(ii) A potential difficulty of the algorithm is that the set ℳk, i.e., the collection of nonvertical facets of Pk, might become very numerous. However, according to Step 4b, when the current best feasible solution is improved, the algorithm returns to Step 0 with the new best feasible solution as x̄0. Such restarts can often accelerate the convergence and prevent an excessive growth of |ℳk|.

(iii) Sometimes it may also happen that, while the current best feasible solution remains unchanged, the set ℳk becomes too large. To overcome the difficulty in that case, it is advisable to make a restart after replacing the polyhedron D by D ∩ {x: ℓ(x − x̄0) ≥ 1}, where ℓ(x − x̄0) ≥ 1 is an α-valid cut for (f, D) at x̄0 with α = f(x̄0). The finiteness of the algorithm cannot be adversely affected by such a step.

4.5. Alternative Interpretation

The above polyhedral underestimation algorithm was derived by means of an inner approximation of the hypograph of the function f(x). An alternative interpretation is based on the representation of the function f(x) as the pointwise infimum of a collection of affine functions.

Observe that, since f(x) is concave, we have

f(x) = inf {h(x): h affine, h(y) ≥ f(y) ∀y ∈ ℝ^n}.

Now consider a finite set Xk in ℝ^n such that D ⊂ conv Xk, and let

φk(x) = inf {h(x): h affine, h(v) ≥ f(v) ∀v ∈ Xk}.   (43)

Since Xk is finite, φk is a polyhedral function, and since D ⊂ conv Xk, we have

φk(x) ≤ f(x) ∀x ∈ D,

i.e., φk is a polyhedral concave function that underestimates f over D.

Solving the relaxed problem

min {φk(x): x ∈ D},

we obtain as optimal solution a vertex xk of D such that

φk(xk) = min {φk(x): x ∈ D} ≤ min f(D).

Therefore, if φk(xk) = f(x̄k) for the best feasible point x̄k so far encountered, then x̄k solves the problem (BCP). Otherwise, since φk(x) = f(x) for any x ∈ Xk and φk(xk) < f(x̄k) ≤ f(xk), we must have xk ∉ Xk. Setting Xk+1 = Xk ∪ {xk}, we can define

φk+1(x) = inf {h(x): h affine, h(v) ≥ f(v) ∀v ∈ Xk+1},

and repeat the procedure just described with φk+1 in place of φk.

This is exactly the PU algorithm if we start with X1 = {v1, …, vn+1}, where M1 = [v1, …, vn+1] is an n-simplex containing D. To see this, it suffices to observe the following proposition.

Proposition VI.13. The function φk defined in (43) is identical to the concave function whose hypograph is the trunk Pk with base Xk.

Proof. For any x' ∈ M1 = conv Xk, let (x',t') be the point where the vertical line through x' meets the upper boundary of Pk. Then (x',t') belongs to some nonvertical facet σ of Pk, so that t' = q0 − qx', where t = q0 − qx is the equation of the hyperplane through σ. Since the latter is a supporting hyperplane of Pk at (x',t'), we must have q0 − qv ≥ f(v) for all v ∈ Xk. Moreover, any hyperplane h(x) = t such that h(v) ≥ f(v) ∀v ∈ Xk must meet the vertical line through x' at a point (x',t*) such that t* ≥ t'. Therefore, t' = min {h(x'): h affine, h(v) ≥ f(v) ∀v ∈ Xk}; that is, φk(·) coincides with the function whose hypograph is just Pk. ∎

Remark VI.6. Set vn+1+i := xi (i = 1, …, k−1), so that Xk = {vi: i ∈ Ik} with Ik = {1, …, n+k}. By (43), for each x ∈ M1, φk(x) is given by the optimal value of the linear program

L(x; Xk)   minimize q0 − qx   s.t.  q0 − qvi ≥ f(vi), i ∈ Ik.

Thus, if as before ℳk denotes the vertex set of

Sk = {(q,q0): q0 − qvi ≥ f(vi), i ∈ Ik}

(the feasible set of L(x; Xk)), then the relaxed problem min φk(D) is the same as

min {q0 − qx: x ∈ D, (q,q0) ∈ ℳk}.

From this it is obvious that the crucial step consists in determining the set ℳk. On the other hand, instead of determining ℳk one could also determine directly the set ℱk of all nonvertical facets of Pk, i.e., the linearity pieces of φk. For this, observe that ℱ1 is readily available. Once ℱk has been computed, ℱk+1 ∩ ℱk consists of all elements (q,q0) ∈ ℱk whose hyperplanes q0 − qx = t are such that q0 − qxk ≥ f(xk), while ℱk+1 \ ℱk is given by the collection of all (q,q0) that are basic optimal solutions of the linear program L(xk; Xk+1).

4.6. Separable Problems

Separable problems form an important class of problems for which the polyhedral underestimation method seems to be particularly suitable. These are problems in which the feasible domain is a polytope D contained in a rectangle M = {x ∈ ℝ^n: rj ≤ xj ≤ sj, j = 1, …, n}, while the objective function f(x) has the form

f(x) = Σ_{j=1}^n fj(xj),

where each fj(·) is a function of one variable which is concave and finite on the line segment Δj = [rj, sj] (but possibly discontinuous at the endpoints of the segment, see Fig. VI.5). This situation occurs in practice, e.g., when D represents the set of feasible production programs and fj(t) is the cost of producing t units of the j-th product. If a fixed cost and economies of scale are present, then seeking the cheapest production program amounts to solving a problem of the type under consideration.

Fig. VI.5

For this problem, many of the methods discussed previously either do not apply or lose their efficiency, because they assume that the function f(x) is extendable to a finite concave function on ℝ^n, which is not the case here.

To apply the PU method, for each j let us choose a finite grid Xkj of points in the segment Δj such that rj, sj ∈ Xkj. Consider the piecewise affine function φkj(t), t ∈ ℝ, that agrees with fj(t) at each point of Xkj. Since we are dealing with functions of one variable, the construction of φkj presents no difficulty. Clearly, φkj(t) is a concave underestimator of fj(t) on Δj. Therefore, the function

φk(x) = Σ_{j=1}^n φkj(xj)

is a concave underestimator of f(x) on M.
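Since each fj is concave, linear interpolation between consecutive grid points produces exactly such a φkj; a one-function sketch (names ours):

```python
import numpy as np

def phi_kj(grid, fj):
    """Piecewise affine concave underestimator of the concave function fj
    on [r_j, s_j]: interpolate linearly between the (sorted) grid points,
    where it agrees with fj by construction."""
    xs = np.sort(np.asarray(grid, dtype=float))
    ys = np.array([fj(t) for t in xs])
    return lambda t: np.interp(t, xs, ys)
```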


Solving the relaxed problem

(SPk)   minimize φk(x) subject to x ∈ D,

we obtain an optimal solution which is a vertex xk of D.

Let x̄k be the best feasible solution so far available (i.e., the best among x1, …, xk and the other vertices of D that may have been encountered during the process of computation). If φk(xk) = f(x̄k), then f(x̄k) ≤ φk(x) ≤ f(x) ∀x ∈ D; hence x̄k solves our problem (BCP) and we stop. Otherwise,

φk(xk) = Σ_{j=1}^n φkj(xkj) < f(x̄k) ≤ f(xk) = Σ_{j=1}^n fj(xkj),   (44)

and therefore xkj* ∉ Xkj* for at least one j*. Setting Xk+1,j* = Xkj* ∪ {xkj*} and Xk+1,j = Xkj (j ≠ j*), we can then repeat the procedure with k ← k+1.

Thus, if we start at k = 1 with X1j = {rj, sj} (j = 1, …, n) and perform the above procedure, we generate a sequence of underestimators φ1, …, φk, … along with a sequence of vertices of D: xk ∈ arg min {φk(x): x ∈ D} (k = 1, 2, …).

Proposition VI.14. The above procedure converges to a global optimal solution of (BCP) in a finite number of iterations.

Proof. It is clear that Xh ≠ Xk for h ≠ k. But Xk+1 is obtained from Xk by adding to Xkj* a point xkj* ∉ Xkj*. Since xk is a vertex of D, the set of all possible xkj (j = 1, …, n; k = 1, …) is finite. Hence, the sequence {X1, X2, …} is finite. ∎

An important issue in the implementation of the above procedure is how to solve the relaxed problems (SPk). Of course, since the objective function φk(x) in each problem (SPk) is concave, piecewise affine, and finite throughout ℝ^n, these problems can be solved by any of the methods discussed previously. However, in view of the strong connection between (SPk+1) and (SPk), an efficient method for solving these problems should take advantage of this structure in order to save computational effort. Falk and Soland (1969) suggested the following approach.

Let Xk = Π_{j=1}^n Xk,j, and denote by 𝒫k the partition of M determined by Xk, i.e., the partition obtained by constructing, for each j = 1, 2, …, n, all the hyperplanes parallel to the facets of M and passing through the points of Xk,j (let us agree to call these hyperplanes partitioning hyperplanes of 𝒫k).

For k = 1, 𝒫1 = {M}, so that φ1(x) is affine, and solving (SP1), i.e., finding

x1 ∈ arg min {φ1(x): x ∈ D ∩ M},

presents no difficulty. Set ℛ1 = 𝒫1. At the end of iteration k = 1, 2, … we already have:

a) a collection ℛk of rectangles forming a partition of M. These rectangles are of the form P = {x: ℓ ≤ x ≤ L}, where ℓ, L belong to Xk (so that either P ∈ 𝒫k, or else P can be subdivided into a finite number of members of 𝒫k by means of partitioning hyperplanes of 𝒫k);

b) for each P ∈ ℛk, a point x(P) and a number μ(P) which are, respectively, a basic optimal solution and the optimal value of the linear program

min ψP(x)   s.t. x ∈ D ∩ P,   (45)

where P = {x: ℓ ≤ x ≤ L} and ψP(x) = Σ_{j=1}^n ψP,j(xj), in which ψP,j(xj) is the affine function that agrees with fj(xj) at the points ℓj and Lj;

c) a rectangle Pk ∈ arg min {μ(P): P ∈ ℛk} such that Pk ∈ 𝒫k. Observe that ψP(x) always underestimates φk(x) on P, and ψP(x) = φk(x) ∀x ∈ P if P ∈ 𝒫k.

Therefore, μ(P) serves as a lower bound for min {φk(x): x ∈ D ∩ P}, and μ(P) is exactly equal to this minimum if P ∈ 𝒫k.

Now, to pass from iteration k to iteration k+1, we choose an index j* such that xkj* ∉ Xk,j* and set Xk+1,j* = Xk,j* ∪ {xkj*}, Xk+1,j = Xk,j (j ≠ j*). To solve (SPk+1) we proceed according to the following branch and bound scheme (starting from ℛk0 = ℛk and Pk0 = Pk):

1) Split Pk0 into two subrectangles by means of a partitioning hyperplane of 𝒫k+1. Compute x(P), μ(P) for each of these subrectangles P. Let ℛk1 be the new partition of M obtained from ℛk0 by replacing Pk0 with its subrectangles.

2) Find Pk1 ∈ arg min {μ(P): P ∈ ℛk1}. If Pk1 ∈ 𝒫k+1, then stop: xk+1 = x(Pk1), ℛk+1 = ℛk1. Otherwise, set ℛk0 ← ℛk1, Pk0 ← Pk1 and return to 1).

It is easily seen that the above process must be finite. Note that in this way it is generally not necessary to investigate all of the members of the partition 𝒫k+1; nor is it necessary to find φk+1(x) explicitly.
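The bound μ(P) of (45) is one LP per rectangle; a sketch (assuming ℓ < L componentwise, names ours):

```python
import numpy as np
from scipy.optimize import linprog

def bound_over_rectangle(A, b, lo, hi, f_list):
    """mu(P) and x(P) of problem (45) for P = {lo <= x <= hi}: minimize
    psi_P(x) = sum_j psi_j(x_j) over D n P, where psi_j is the affine
    function with psi_j(lo_j) = f_j(lo_j) and psi_j(hi_j) = f_j(hi_j)."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    fl = np.array([f(t) for f, t in zip(f_list, lo)])
    fh = np.array([f(t) for f, t in zip(f_list, hi)])
    slope = (fh - fl) / (hi - lo)
    const = float(np.sum(fl - slope * lo))     # psi_P(x) = const + slope.x
    res = linprog(slope, A_ub=A, b_ub=b, bounds=list(zip(lo, hi)))
    if not res.success:                        # D n P is empty
        return np.inf, None
    return const + res.fun, res.x
```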
CHAPTER VII

SUCCESSIVE PARTITION METHODS

This chapter is devoted to a class of methods for concave minimization which investigate the feasible domain by dividing it into smaller pieces and refining the partition as needed (successive partition methods, branch and bound).

We shall discuss algorithms that proceed through conical subdivisions (conical algorithms), simplicial subdivisions (simplicial algorithms) or rectangular subdivisions (rectangular algorithms).

1. CONICAL ALGORITHMS

The technique of conical subdivision for concave minimization was introduced by Tuy (1964). Since a conical subdivision induces a partition of the boundary of the feasible set, this technique seems to be appropriate for nonconvex problems where the optimum is achieved at certain boundary points.

Based on rectangular and simplicial branch and bound methods proposed by Falk and Soland (1969) and by Horst (1976), in subsequent conical algorithms the process of conical subdivision was coupled with a lower bounding or some equivalent operation, following the basic steps of the branch and bound scheme.

A first convergent algorithm of this type was developed by Thoai and Tuy (1980). The algorithm to be presented below is an extended and improved version of both the original algorithm in Tuy (1964) and that of Thoai and Tuy (1980).

1.1. The Normal Conical Subdivision Process

Let us begin with the following (DG) problem which was considered in Section VI.2.

(DG) Given a polytope D contained in a cone K_0 ⊂ ℝ^n and a compact convex set G with 0 ∈ int G, find a point y ∈ D \ G or else establish that D ⊂ G.

Clearly, if p(x) is a gauge of G, i.e., a convex, positively homogeneous function such that G = {x: p(x) ≤ 1}, then D ⊂ G is equivalent to max p(D) ≤ 1; in this case, solving the (DG)-problem is reduced to solving the convex maximization problem:

maximize p(x)  subject to  x ∈ D .    (1)

To construct a conical procedure for solving (1) by the branch and bound scheme, we must determine three basic operations: branching, bounding and candidate selection (cf. Section IV.2).

1) Branching (conical subdivision). Obviously, any cone can be assumed to be of the form K = con(Q) with Q = (z^1,z^2,...,z^n), z^i ∈ ∂G (the boundary of G), i=1,2,...,n. Given such a cone, a subdivision of K is determined by a point u ∈ K such that u ≠ λz^i ∀λ ≥ 0, i=1,2,...,n. As we saw in Section V.3.1, if u = Σ_{i∈I} λ_i z^i (λ_i > 0) and ū is the point where the halfline from 0 through u meets ∂G, then the partition (splitting) of K with respect to u consists of the subcones K_i = con(Q_i), i ∈ I, with

Q_i = (z^1,...,z^{i-1}, ū, z^{i+1},...,z^n) .

Thus, to determine the branching operation, a rule has to be specified that assigns to each cone K = con(Q), Q = (z^1,z^2,...,z^n), a point u(Q) ∈ K which does not lie on any edge of K.

2) Bounding. For any cone K = con(Q), Q = (z^1,z^2,...,z^n), the hyperplane eQ^{-1}x = 1 passes through z^1,z^2,...,z^n, i.e., the linear function h(x) = eQ^{-1}x agrees with p(x) at z^1,...,z^n. Hence, h(x) ≥ p(x) for all x ∈ K, and the value

μ(Q) = max {eQ^{-1}x: x ∈ K ∩ D}

will satisfy μ(Q) ≥ max p(K ∩ D). In other words, μ(Q) is an upper bound for max p(K ∩ D) (note that (1) is a maximization problem).

3) Selection. The simplest rule is to select (for further splitting) the cone K = con(Q) with largest μ(Q) among all cones currently of interest.

Once the three basic operations have been defined, a corresponding branch and bound procedure can be described that will converge under appropriate conditions. Since the selection here is bound improving, we know from the general theory of branch and bound algorithms that a sufficient convergence condition is consistency of the bounding operation (cf. Section IV.3).

Let us proceed as follows. For each cone K = con(Q) denote by ω(Q) a basic optimal solution of the linear program

LP(Q;D)   maximize eQ^{-1}x  subject to  x ∈ D, Q^{-1}x ≥ 0 .    (2)

Note that ω(Q) is a vertex of D ∩ K satisfying eQ^{-1}ω(Q) = μ(Q).

Now consider any infinite nested sequence of cones K_s = con(Q_s), Q_s = (z^{s1},z^{s2},...,z^{sn}), s=1,2,..., generated by the cone splitting process described above. For each s let ω^s = ω(Q_s), u^s = u(Q_s) (in general, u^s ≠ ω^s and even u^s ≠ λω^s for every λ), and denote by q^s and ω̄^s the points where the halfline from 0 through ω^s meets the simplex [z^{s1},z^{s2},...,z^{sn}] and the boundary ∂G of G, respectively.

Definition VII.1. A sequence K_s = con(Q_s), s=1,2,..., is said to be normal for given D, G if

lim inf_{s→∞} ‖q^s − ω̄^s‖ = 0 .    (3)

A cone splitting process is said to be normal (an NCS process) if any infinite nested sequence of cones that it generates is normal.

It turns out that normality of a cone splitting process is a sufficient condition for consistency of the bounding operation defined above and thereby for convergence of the resulting conical procedure.

1.2. The Main Subroutine

With the operations of branching, bounding and selection defined above, we now state the procedure for solving (DG), which we shall refer to as the (DG)-procedure and which will be used as a main subroutine in the algorithm to be developed for concave minimization.

(DG)-Procedure:

Select a rule u: Q → u(Q) for the cone splitting operation so as to generate an NCS process (we shall later examine how to select such a rule).

1) Compute the intersections z^{01},z^{02},...,z^{0n} of the edges of K_0 with ∂G. Set Q_0 = (z^{01},z^{02},...,z^{0n}), ℳ = {Q_0}, 𝒮 = ℳ.

2) For each matrix Q ∈ 𝒮 solve the linear program LP(Q,D) to obtain the optimal value μ(Q) and a basic optimal solution ω(Q) of this program. If ω(Q) ∉ G for some Q, then terminate: y = ω(Q). Otherwise, ω(Q) ∈ G for all Q ∈ 𝒮; then go to 3).

3) In ℳ delete all Q ∈ 𝒮 such that μ(Q) ≤ 1. Let ℛ be the collection of remaining matrices. If ℛ = ∅, terminate: D ⊂ G. Otherwise, ℛ ≠ ∅; then go to 4).

4) Choose Q* ∈ argmax {μ(Q): Q ∈ ℛ}, and split K* = con(Q*) with respect to u* = u(Q*). Let 𝒮* be the collection of matrices corresponding to this partition of Q*.

5) Replace Q* by 𝒮* in ℛ and denote by ℳ* the resulting collection of matrices. Set 𝒮 ← 𝒮*, ℳ ← ℳ* and return to 2).
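The control flow of this procedure is summarized by the following Python sketch. It is schematic only; solve_lp, in_G and split are hypothetical callbacks standing for the linear program (2), the membership test for G, and the selected NCS subdivision rule:

    def dg_procedure(Q0, solve_lp, in_G, split, tol=1e-9):
        # Schematic (DG)-procedure.  Returns a point y in D \ G, or None
        # when the procedure establishes that D is contained in G.
        new = [Q0]                      # matrices generated at the last splitting
        candidates = []                 # pairs (mu(Q), Q) with mu(Q) > 1
        while True:
            for Q in new:               # step 2: bound every new cone
                mu, omega = solve_lp(Q)
                if not in_G(omega):
                    return omega        # y = omega(Q) solves (DG)
                if mu > 1.0 + tol:      # step 3: delete cones with mu(Q) <= 1
                    candidates.append((mu, Q))
            if not candidates:
                return None             # D ⊂ G
            candidates.sort(key=lambda t: t[0])
            mu_star, Q_star = candidates.pop()   # step 4: largest upper bound
            new = split(Q_star)         # step 5: replace Q* by its partition

The filter μ(Q) > 1 realizes the deletion of Step 3, and popping the largest element of the sorted candidate list realizes the bound improving selection of Step 4.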

Proposition VII.1. If the (DG)-procedure is infinite, then D ⊂ G and there exists a point y ∈ D ∩ ∂G. Consequently, if D \ G is nonempty, the (DG)-procedure must terminate after finitely many steps at a point y ∈ D \ G.

Proof. Consider any infinite nested sequence of cones K_s = con(Q_s), s=1,2,..., generated by the procedure. As previously, let ω^s = ω(Q_s), u^s = u(Q_s), and denote by q^s and ω̄^s the points where the halfline from 0 through ω^s meets the hyperplane eQ_s^{-1}x = 1 and ∂G, respectively. Then, by normality, we may assume, by passing to subsequences if necessary, that ‖q^s − ω̄^s‖ → 0. Hence, ‖ω^s − q^s‖ ≤ ‖ω̄^s − q^s‖ → 0, and since ‖q^s‖ is bounded away from 0, we have

μ(Q_s) → 1  (s → ∞) .

This means that the bounding is consistent (cf. Definition IV.4). Since the selection is bound improving, it follows from Theorem IV.2 that

max p(D) = 1 ,

and hence D ⊂ G. Furthermore, by Corollary IV.1, if the procedure is infinite, then there exists at least one infinite nested sequence K_s = con(Q_s) of the type just considered. Because of the normality and of the boundedness of the sequence ω^s, we may assume that q^s − ω̄^s → 0, while the ω^s approach some y ∈ D. Then ω^s − ω̄^s → 0, implying that ω̄^s → y. Since ω̄^s ∈ ∂G, it follows that y ∈ ∂G, which proves the proposition. •

Proposition VII.2. Let G' be a compact convex set contained in the interior of G. Then after finitely many steps the (DG)-procedure either establishes that D ⊂ G, or else finds a point y ∈ D \ G'.

Proof. Let us apply the (DG)-procedure with the following stopping rule: stop if ω(Q) ∉ G' for some Q in Step 2, or if ℛ = ∅ in Step 3 (i.e., D ⊂ G). Suppose that the procedure is infinite. Then, as we saw in the previous proof, there is a sequence ω^s → y, where y ∈ D ∩ ∂G. In view of the compactness of both G and G' and the fact that G' ⊂ int G, we must have d(y,G') > 0. Hence, ω^s ∉ G' for some sufficiently large s, and the procedure would have stopped. •

1.3. Construction of Normal Subdivision Processes

Now we discuss the question of how to construct a normal conical subdivision process.

Definition VII.2. An infinite nested sequence of cones K_s = con(Q_s), s=0,1,..., is said to be exhaustive if the intersection ∩_s K_s is a ray (a halfline emanating from 0); it is said to be nondegenerate if lim inf_{s→∞} ‖eQ_s^{-1}‖ < ∞, i.e., if there exists an infinite subsequence Δ ⊂ {0,1,...} and a constant η such that ‖eQ_s^{-1}‖ ≤ η ∀s ∈ Δ. A conical subdivision process is said to be exhaustive (nondegenerate) if all of the infinite nested sequences of cones that it generates are exhaustive (nondegenerate).

Based on a related subdivision of simplices (Horst (1976)), the concept of exhaustiveness was introduced by Thoai and Tuy (1980) (see Section IV.3.1). The concept of nondegeneracy is related to, but somewhat different from, an analogous concept of Hamami and Jacobsen (1988).

Proposition VII.3. Let K_s = con(Q_s) be an infinite nested sequence of cones with Q_s = (z^{s1},z^{s2},...,z^{sn}), z^{si} ∈ ∂G (i=1,2,...,n). For each s let u^s ∈ {z^{s+1,1},...,z^{s+1,n}} and let q^s be the point where the halfline from 0 through u^s meets the simplex [z^{s1},...,z^{sn}]. If the sequence K_s is exhaustive or nondegenerate, then

lim_{s→∞} (q^s − u^s) = 0 .

Proof. Suppose that the sequence K_s shrinks to a ray Γ. Then each point z^{si} (i=1,2,...,n) approaches a unique point x* of Γ ∩ ∂G. Hence, both q^s and u^s tend to x*, i.e., q^s − u^s → 0.

Now suppose that the sequence K_s is nondegenerate and denote by H_{s+1} the hyperplane through z^{s+1,1},z^{s+1,2},...,z^{s+1,n} and by L_{s+1} the halfspace not containing 0 with bounding hyperplane H_{s+1}. Then u^s ∈ H_{s+1}, while q^s ∉ int L_{s+1}. Since the sequence q^s is bounded, it follows from Lemma III.2 on the convergence of cutting procedures that d(q^s,L_{s+1}) → 0, and hence d(q^s,H_{s+1}) → 0. But the equation of H_{s+1} is eQ_{s+1}^{-1}x = 1, and hence we have d(0,H_{s+1}) = 1/‖eQ_{s+1}^{-1}‖. Therefore,

‖q^s − u^s‖ ≤ d(q^s,H_{s+1})·‖eQ_{s+1}^{-1}‖·‖u^s‖ .

By the nondegeneracy property, there is a subsequence Δ ⊂ {0,1,...} such that ‖eQ_{s+1}^{-1}‖ (s ∈ Δ) is bounded. The previous relation then implies that q^s − u^s → 0 (s ∈ Δ, s → ∞). •
Letting u^s = ω̄^s, we can state the following consequence of the above proposition:

Corollary VII.1. (Sufficient condition for normality). A conical subdivision process is normal if any infinite nested sequence K_s = con(Q_s) that it generates satisfies either of the following conditions:

1) the sequence is exhaustive;

2) the sequence is nondegenerate, and for all but finitely many s the subdivision of K_s is performed with respect to a basic optimal solution ω^s = ω(Q_s) of the associated linear program LP(Q_s,D).

In particular, an exhaustive subdivision process is normal.

Corollary VII.2. The (DG)-procedure, where an exhaustive subdivision process is used, can be infinite only if D ⊂ G and D ∩ ∂G ≠ ∅.

This result was essentially established directly by Thoai and Tuy (1980), who pointed out a typical exhaustive subdivision process called bisection.

For simplices, this specific method of subdivision had been introduced earlier by Horst (1976) (cf. Section IV.3.1). If for any cone K = con(Q) ⊂ K_0 we denote by Z(Q) the simplex which is the section of K formed by the hyperplane H_0 through z^{01},z^{02},...,z^{0n}, then the bisection of K is simply the subdivision of K that is induced in the obvious way by the bisection of the simplex Z(Q). In other words, the bisection of a cone K = con(Q) (or of a matrix Q) is the subdivision of K (or Q) with respect to the midpoint of a longest edge of Z(Q).

A subdivision process consisting exclusively of bisections is called a bisection process. The exhaustiveness of such a subdivision process is shown in Proposition IV.2. In order to construct more efficient subdivision processes, we now extend this result to subdivision processes in which bisection is used infinitely often, but not exclusively.
Given a simplex Z = [v^1,v^2,...,v^n] and a point w ∈ Z, we define

δ(Z) = max_{i<j} ‖v^i − v^j‖ ,   δ(w,Z) = max_{i=1,...,n} ‖w − v^i‖ .

Note that δ(Z) is the diameter of Z, while δ(w,Z) is the radius of the smallest ball with center w containing Z.
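Both quantities are easy to evaluate numerically; a small numpy sketch (vertices stored as the rows of an array) reads:

    import numpy as np

    def delta(Z):
        # delta(Z): the diameter of the simplex, i.e. the length of a
        # longest edge; Z holds the vertices v^1,...,v^n as rows.
        n = len(Z)
        return max(np.linalg.norm(Z[i] - Z[j])
                   for i in range(n) for j in range(i + 1, n))

    def delta_w(w, Z):
        # delta(w,Z): radius of the smallest ball with center w containing Z.
        return max(np.linalg.norm(w - v) for v in Z)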
The following lemma is closely related to Proposition IV.2.

Lemma VII.1. If w is the midpoint of a longest edge of Z, then

δ(w,Z) ≤ (√3/2) δ(Z) .

Proof. Let w be the midpoint of [v^1,v^2], with ‖v^2 − v^1‖ = δ(Z). Of course,

‖w − v^i‖ = (1/2) δ(Z) ≤ (√3/2) δ(Z)  (i=1,2).

For i>2, since the line segment [w,v^i] is a median of the triangle [v^1,v^2,v^i], we have

‖w − v^i‖² = (1/2)‖v^1 − v^i‖² + (1/2)‖v^2 − v^i‖² − (1/4)‖v^1 − v^2‖² ≤ (3/4) δ(Z)² ,

whence the desired inequality. •



Lemma VII.2. If w = Σ_{i=1}^n ζ_i v^i, ζ_i ≥ 0, Σ_{i=1}^n ζ_i = 1, and

1 − ζ_i ≤ ρ  (i=1,2,...,n) ,

then δ(w,Z) < ρδ(Z).

Proof. For any i such that ζ_i > 0, denote by r^i the point where the halfline from v^i through w meets the facet of Z opposite v^i. Then w = v^i + θ(r^i − v^i) for some θ ∈ (0,1), and r^i = Σ_{j≠i} ξ_j v^j, ξ_j ≥ 0, Σ_{j≠i} ξ_j = 1. Hence

Σ_{j=1}^n ζ_j v^j = (1−θ)v^i + θ( Σ_{j≠i} ξ_j v^j )

and, equating the terms in v^i, we obtain ζ_i = 1 − θ, which shows that θ = 1 − ζ_i. Thus w = v^i + (1−ζ_i)(r^i − v^i), and consequently,

‖w − v^i‖ ≤ (1−ζ_i)‖r^i − v^i‖ < ρδ(Z) whenever ζ_i > 0. The lemma follows immediately. •

Proposition VII.4. Let Z_s = [v^{s1},...,v^{sn}], s=0,1,..., be a nested sequence of (n−1)-simplices such that for each s, Z_{s+1} is obtained from Z_s by replacing some v^{si} with a point w^s ∈ Z_s satisfying

δ(w^s,Z_s) ≤ ρ δ(Z_s) ,    (4)

where ρ ∈ (0,1) is some constant. If for an infinite subsequence Δ ⊂ {0,1,...} each w^s, s ∈ Δ, is the midpoint of a longest edge of Z_s (i.e., Z_{s+1} is obtained from Z_s by a bisection), then the intersection ∩_{s=0}^∞ Z_s is a singleton.

Proof. Denote the diameter of Z_s by δ_s. Since δ_{s+1} ≤ δ_s, δ_s tends to some limit δ̄ as s → ∞. Assume that δ̄ > 0. Then we can choose t such that ρδ_s < δ̄ for all s ≥ t. From (4) it follows that for all s ≥ t

max_{i=1,...,n} ‖w^s − v^{si}‖ ≤ ρδ_s < δ̄ ≤ δ_s .    (5)

Let us color every vertex of Z_t "black" and color "white" every vertex of any Z_s with s > t which is not black. Then (5) implies that for s ≥ t a longest edge of Z_s must have two black endpoints. Consequently, if s ≥ t and s ∈ Δ, then w^s must be the midpoint of an edge of Z_s joining two black vertices, so that Z_{s+1} will have at least one black vertex less than Z_s. On the other hand, Z_{s+1} can never have more black vertices than Z_s. Therefore, after at most n (not necessarily consecutive) bisections corresponding to s_1 < s_2 < ... < s_n (s_i ∈ Δ, s_1 ≥ t), we obtain a simplex Z_s with only white vertices, i.e., according to (5), with only edges of length less than δ̄. This contradiction shows that we must have δ̄ = 0, as was to be proved. •



Remarks VII.1. (i) From the proof it is easily seen that the proposition remains valid even if each w^s, s ∈ Δ, is an arbitrary point (not necessarily the midpoint) of a longest edge of Z_s satisfying (4).

(ii) It is not difficult to construct a nested sequence {Z_s} such that Z_{s+1} is obtained from Z_s by a bisection for infinitely many s, and nevertheless the intersection ∩_s Z_s is not a singleton (cf. Chapter IV). Thus, condition (4) cannot be omitted in Proposition VII.4.

Returning to conical subdivision processes, let us consider a subcone K = con(Q) of an initial cone K_0 = con(Q_0). Let Z = Z(Q) be the (n−1)-simplex formed by the section of K by the hyperplane H_0 = {x: eQ_0^{-1}x = 1}. For any point u ∈ K, let w be the point where the halfline from 0 through u meets Z. In the sequel, we shall refer to the ratio δ(w,Z)/δ(Z) as the eccentricity of Z relative to w, or as the eccentricity of K relative to u.

The following corollary then follows from Proposition VII.4:

Corollary VII.3. Let K_s = con(Q_s), s=0,1,..., be an infinite nested sequence of cones, in which K_{s+1} is obtained from K_s either by bisection or by a subdivision with respect to a point u^s ∈ K_s such that the eccentricity of K_s relative to u^s does not exceed a constant ρ, 0 < ρ < 1. If the sequence involves infinitely many bisections, then it is exhaustive.

The importance of this result is that it serves as a basis for "normalizing" any subdivision process satisfying condition (4).

1.4. The Basic NCS Process

To avoid tedious repetitions when considering a conical subdivision process, for any cone K = con(Q) with Q = (z^1,z^2,...,z^n), z^i ∈ ∂G, generated during the process, we shall always use the following notation:

ω(Q): basic optimal solution of the linear program LP(Q,D) associated with Q;

u(Q): point with respect to which K is split;

Z(Q): (n−1)-simplex which is the section of K by the hyperplane H_0 = {x: eQ_0^{-1}x = 1} through z^{01},z^{02},...,z^{0n};

σ(Q): eccentricity of K relative to ω(Q) (i.e., eccentricity of Z(Q) relative to the point where Z(Q) intersects the halfline from 0 through ω(Q)).

When u(Q) = ω(Q), i.e., K is subdivided with respect to ω(Q), we shall refer to this subdivision as an ω-subdivision. A subdivision process consisting solely of ω-subdivisions will be called an ω-subdivision process.

This subdivision method obviously depends on the data of the problem and seems to be the most natural one. However, we do not know whether it is normal. By Corollary VII.1, it will be normal whenever it is nondegenerate, i.e., if lim inf_{s→∞} ‖eQ_s^{-1}‖ < ∞ for any infinite nested sequence of cones K_s = con(Q_s) that it generates. Unfortunately, this kind of nondegeneracy is a condition which is very difficult to enforce in practice.

On the other hand, while exhaustive subdivision processes are normal (and hence ensure the convergence of the (DG)-procedure), so far the most commonly used exhaustive process, the bisection, has not proven to be very efficient computationally. An obvious drawback of this subdivision is that it is defined independently of the problem's data, which partially explains the slow rate of convergence usually observed with the bisection, as compared with the ω-subdivision process in cases when the latter works.

when the latter works.

To resolve this conflict between convergence and efficiency, the best strategy sug-

gested by Corollary VII.3 seems to be to generate a "hybrid" process by properly in-

serting a bisection into an w-subdivision process from time to time. Roughly


speaking, w-subdivisions should be used most of the time, whereas bisections, be-
cause of their convergence ensuring property, can serve as arecourse to get out of
possible jams.

It is this strategy which is embodied in the following rule (Tuy 1991a)):

Basic NCS Process

Select an infinite increasing sequence A. ( {0,1, ... }.

Set r(K o) = 0 for the initial cone K O = con(QO)' At each iteration, an index r(K)

has been defined for the cone K = con(Q) to be subdivided.


a) If r(K) t A. perform an w-subdivision of K = con(Q) (Le., choose u(Q) = w(Q))
and set r(K') = r(K)+1 for every subcone K' = con(Q') in the partition;

b) Otherwise, perform a bisection of K = con(Q) (Le., choose u(Q) to be the


midpoint of a longest edge of Z(Q)) and set r(K') = r(K)+1 for every subcone
K' = con(Q') in the partition.
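One step of this rule can be transcribed as follows (a Python sketch; the cone objects, the index map r and the two splitting callbacks are placeholders for an actual implementation):

    def basic_ncs_step(K, r, Delta, omega_subdivide, bisect):
        # One splitting step of the basic NCS process.  r maps each cone to
        # its index r(K); Delta is the selected infinite increasing index set;
        # the two callbacks return the subcones of the corresponding splitting.
        if r[K] not in Delta:
            children = omega_subdivide(K)   # case a): u(Q) = omega(Q)
        else:
            children = bisect(K)            # case b): midpoint of a longest edge of Z(Q)
        for child in children:
            r[child] = r[K] + 1             # every subcone receives the index r(K)+1
        return children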

Proposition VII.5. The conical subdivision process just described is normal.

Proof. Let K_s = con(Q_s), s=1,..., be an infinite nested sequence of cones generated in this process. By completing if necessary, we may assume that K_{s+1} is an immediate successor of K_s for every s, so that r(K_s) = s. By construction, the sequence K_s, s = 0,1,..., involves infinitely many bisections. If it is exhaustive, then by reasoning as in the first part of the proof of Proposition VII.3, it is easily seen that the sequence is normal. Therefore, it suffices to consider the case where δ(Z_s) ≥ δ > 0.

Let u(Q_s) = u^s, ω(Q_s) = ω^s. By Corollary VII.3, the eccentricity σ(K_s) cannot be bounded by any constant ρ ∈ (0,1). Consequently, there exists a subsequence {s_h, h = 1,2,...} such that σ(K_{s_h}) = δ(w^{s_h},Z_{s_h})/δ(Z_{s_h}) → 1 as h → ∞, and by Lemma VII.1, u^{s_h} = ω^{s_h} for all but finitely many h. By compactness, we may assume that w^{s_h} → w*, while z^{s_h,i} → z̄^i, i = 1,...,n. Since δ(Z_{s_h}) ≥ δ > 0, it follows that δ(w^{s_h},Z_{s_h}) − δ(Z_{s_h}) → 0, and hence w* must be a vertex of the simplex Z* = [z̄^1,...,z̄^n], say w* = z̄^1. Obviously, since 0 ∈ int G, z̄^1 is the unique intersection point of ∂G with the halfline from 0 through z̄^1. Therefore, ω̄^{s_h} → z̄^1, i.e., ω̄^{s_h} − w^{s_h} → 0 (h → ∞). Noting that, in the notation of Definition VII.1, q^{s_h} = w^{s_h}, this implies normality of the sequence. •

Corollary VII.4. The (DG)-procedure using the basic NCS process can be infinite only if D \ G = ∅. If G' is a compact subset of int G, then after finitely many steps this procedure either establishes that D ⊂ G, or else finds a point y ∈ D \ G'.

Proof. This is a straightforward consequence of Propositions VII.1, VII.2 and VII.5. •

1.5. The Normal Conical Algorithm

We apply the previous results to the BCP problem:

minimize f(x)    (6)

s.t. Ax ≤ b ,    (7)

x ≥ 0 ,    (8)

where we assume that the constraints (7), (8) define a polytope D, and the concave objective function f: ℝ^n → ℝ has bounded level sets.

In view of Corollary VII.2, we can solve the problem (BCP) by the following two phase scheme which is similar to the one in Section VI.2.2.

Start with a feasible solution z̄ ∈ D.

Phase I:

Search for a local minimizer x^0 which is a vertex of D such that f(x^0) ≤ f(z̄).

Phase II:

Let α = f(x^0) − ε. Translate the origin to x^0 and construct a cone K_0 ⊃ D. Using the basic NCS process, apply the (DG)-procedure for G = {x: f(x) ≥ f(x^0) − ε}, G' = {x: f(x) ≥ f(x^0)}. If D ⊂ G, then terminate: x^0 is a global ε-optimal solution. Otherwise, a point y ∈ D \ G' is obtained (so that f(y) < f(x^0)): set z̄ ← y and return to Phase I.

As in the PA algorithm (Section VI.2.4), it is not necessary that x^0 be a local minimizer. Also, a concavity cut can be added to the current feasible set before a return to Phase II. Incorporating these observations in the above scheme and replacing cones by their defining matrices, we can state the following procedure.

Algorithm VII.1 (Normal Conical Algorithm for BCP)

Select ε ≥ 0.

Initialization:

Compute a point z̄ ∈ D. Set M = D.

Phase I:

Starting with z̄ find a vertex x^0 of D such that f(x^0) ≤ f(z̄). Let x̄ be the best among x^0 and all the vertices of D adjacent to x^0; let γ = f(x̄).

Phase II:

Select an infinite increasing sequence Δ of natural numbers.

0) Let α = γ − ε. Translate the origin to x^0 and construct a cone K_0 ⊃ D. For each i=1,2,...,n compute the point z^{0i} where the i-th edge of K_0 meets the surface f(x) = α. Let Q_0 = (z^{01},z^{02},...,z^{0n}), ℳ = {Q_0}, 𝒮 = ℳ, r(Q_0) = 0.

1) For each Q = (z^1,...,z^n) ∈ 𝒮 solve

LP(Q;M)   max eQ^{-1}x  s.t.  x ∈ M, Q^{-1}x ≥ 0

to obtain the optimal value μ(Q) and a basic optimal solution ω(Q).

If for some Q, f(ω(Q)) < γ, set z̄ ← ω(Q) and return to Phase I. Otherwise, go to 2).

2) In ℳ delete all Q ∈ 𝒮 satisfying μ(Q) ≤ 1. Let ℛ be the collection of remaining matrices. If ℛ = ∅, then terminate: x̄ is a global ε-optimal solution of (BCP). Otherwise, ℛ ≠ ∅; go to 3).

3) Choose Q* ∈ argmax {μ(Q): Q ∈ ℛ}.

a) If r(Q*) ∉ Δ, then split Q* with respect to ω(Q*) (perform an ω-subdivision) and set r(Q) = r(Q*)+1 for each member Q of the partition.

b) Otherwise, split Q* with respect to the midpoint of a longest edge of Z(Q*) (perform a bisection) and set r(Q) = r(Q*)+1 for every member Q of the partition.

4) Let 𝒮* be the partition of Q*, and let ℳ* be the collection obtained from ℛ by replacing Q* by 𝒮*. Set 𝒮 ← 𝒮*, ℳ ← ℳ* and return to 1).

As a consequence of the above discussion we can state the following result.

Theorem VII.1. For ε > 0 the normal conical algorithm terminates after finitely many steps at a global ε-optimal solution.

Proof. By Corollary VII.4, where G = {x: f(x) ≥ α}, G' = {x: f(x) ≥ γ}, Phase II must terminate after finitely many steps either with ℛ = ∅ (the incumbent x̄ is a global ε-optimal solution), or else with a point ω(Q) such that f(ω(Q)) < γ. In the latter case, the algorithm returns to Phase I, and the incumbent x̄ in the next cycle of iterations will be a vertex of D better than all of the vertices previously encountered. The finiteness of the algorithm follows from the finiteness of the vertex set of D. •

As previously remarked (cf. Section V.3.3), if ε is sufficiently small, a vertex of D which is a global ε-optimal solution will actually be an exact global optimal solution.

The algorithm will still work for ε = 0, provided that the points z^{0i} ≠ 0 (i=1,2,...,n) in Step 0 can be constructed. The latter condition holds, for example, if x^0 is a nondegenerate vertex of D (because then the positive i-th coordinate axis will coincide with the i-th edge of D emanating from x^0).

Theorem VII.2. For ε = 0 the normal conical algorithm either terminates at an exact global optimal solution after finitely many steps, or else it involves an infinite Phase II. The latter case can occur only if the current best solution x̄ is actually already globally optimal.

Proof. Apply the first part of Corollary VII.4. •



Remark VII.2. Several different algorithms can be derived from the normal conical algorithm by different choices of the sequence Δ ⊂ {0,1,2,...}.

If Δ = {0,1,2,...}, then bisections are used throughout and we recover the Thoai-Tuy algorithm (1980).

If Δ = {N,2N,3N,...} where N is a natural number (typically N = 5), then r(Q*) ∉ Δ is likely to hold most often and the algorithm will generate a hybrid subdivision process, with ω-subdivision in most iterations and bisection occasionally. The larger N, the smaller the frequency of bisections.

If N is very large, then in Step 3 we almost always have r(Q*) ∉ Δ, so that the algorithm involves practically only ω-subdivisions, and it is very close to Zwart's algorithm (1974), except for a substantial difference in the use of the tolerance parameter ε.

Thus, the Thoai-Tuy "pure" bisection algorithm and Zwart's "pure" ω-subdivision algorithm appear as two extreme cases in a whole range of algorithms. On the other hand, for N very large, the normal conical algorithm operates much the same as the cut and split algorithm, except that in certain iterations a bisection is used instead of an ω-subdivision. One could say that the Δ-device is a method of forcing the convergence of the cut and split algorithm.

As mentioned earlier, it is an open question whether the algorithm will still be convergent if Δ = ∅ (the algorithm will then coincide exactly with the cut and split algorithm).

However, since the algorithm is convergent for any, however large, value of N, one can expect that in most cases in practice it will be convergent even if N = +∞. This seems to be in line with the finite termination of Zwart's algorithm for ε = 0 observed in the numerical experiments reported in Zwart (1974).

1.6. Remarks Concerning Implementation

(i) As in the cut and split algorithm (Section V.3.2), when K_0 = ℝ^n_+ (which is the case if the problem is in standard form with respect to x^0), the linear program LP(Q;M) in Step 1) can be solved without having to invert the matrix Q. Actually, if Q = (z^1,...,z^n) and if the additional constraints (cuts) that define M are Cx ≤ d, then in terms of the variables (λ_1,λ_2,...,λ_n) = Q^{-1}x this program can be written as

LP(Q,M)   max Σ_{j=1}^n λ_j    (9)
          s.t. Σ_{j=1}^n λ_j(Az^j) ≤ b , Σ_{j=1}^n λ_j(Cz^j) ≤ d , λ_j ≥ 0 (j=1,...,n).

If Q' is the successor of Q obtained by replacing some z^i with a vector u ∈ con(Q), then LP(Q',M) is derived from LP(Q,M) simply by replacing Az^i and Cz^i with Au and Cu, respectively. So to solve LP(Q,M) there is no need to know Q^{-1}.
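For illustration, the form (9) can be passed directly to an off-the-shelf LP solver. The following Python sketch (with scipy; the argument names are ours) returns μ(Q) and ω(Q) = Qλ*:

    import numpy as np
    from scipy.optimize import linprog

    def solve_lp_lambda(A, b, C, d, Z):
        # LP(Q,M) in the lambda-variables of (9): maximize sum(lambda) subject
        # to sum_j lambda_j (A z^j) <= b, sum_j lambda_j (C z^j) <= d and
        # lambda >= 0, where the columns of Z are z^1,...,z^n.
        n = Z.shape[1]
        A_ub = np.vstack([A @ Z, C @ Z])   # replacing z^i by u changes only column i
        b_ub = np.concatenate([b, d])
        res = linprog(c=-np.ones(n), A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * n)
        return -res.fun, Z @ res.x         # mu(Q), omega(Q) = Q lambda*

As noted above, when a column z^i of Q is replaced by u, only one column of the constraint matrix changes, so in a serious implementation the LP data should be updated rather than rebuilt.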

(ii) As in the PA method (Section VI.2), we return to Phase I (i.e., we restart a new cycle) whenever an ω(Q) is found with f(ω(Q)) < γ. Restarting is also possible when ω(Q) is a vertex of D, no matter what f(ω(Q)) is. Then, instead of returning to Phase I, one should simply return to Step 0, with x^0 ← ω(Q). In this way the new x^0 will be different from all of the previous ones, so that the convergence of the algorithm will still be ensured (recall that a concavity cut should be made before each restart).

Sometimes it may happen that at a given stage, while no ω(Q) satisfying the above conditions is available, the set ℛ of cones that must be investigated has become very numerous. In that event, it is also advisable to return to Phase I with z̄ ← ω(Q_0), M ← M ∩ {x: eQ_0^{-1}x ≥ 1}, D ← M ∩ {x: eQ_0^{-1}x ≥ 1} (note that not only M, but also the original feasible domain D is reduced). Of course, a finite number of restarts of this kind will not adversely affect the convergence.

Computational experience reported in Horst and Thoai (1989) has shown that a judicious restart strategy can often substantially enhance the efficiency of the algorithm by keeping ℛ within manageable size and correcting a bad choice of the starting vertex x^0.

(iii) A further question of importance for efficient implementation of the normal conical algorithm is the selection of the subdivision rule.

The above basic NCS rule aims at ensuring the convergence of the algorithm with as few bisections as possible. An alternative normal subdivision rule is the following (Tuy (1991a)):

(*) Select a natural number N and a sequence η_k ↓ 0. At the beginning set r(Q_0) = 0 for the initial cone K_0 = con(Q_0). At iteration k, if r(Q*) < N and μ(Q*) − 1 ≤ η_k, then perform an ω-subdivision of Q* and set r(Q) = r(Q*) + 1 for every member Q of the partition; otherwise, perform a bisection of Q* and set r(Q) = 0 for every member Q of the partition.

It has been shown in Tuy (1991a) that this rule indeed generates a normal subdivision process. If N is chosen sufficiently large, then the condition r(Q*) < N almost always holds (in the computational experiments reported in Zwart (1974), r(Q*) rarely exceeds 5 for problems up to 15 variables), and the just described rule (*) practically amounts to using ω-subdivisions if μ(Q*) − 1 ≤ η_k and bisections otherwise.

Since the value μ(Q*) − 1 indicates how far we are from the optimum, the fact μ(Q*) − 1 ≤ η_k means, roughly speaking, that the algorithm runs normally (according to the "criterion" {η_k} supplied by the user). Thus, in practical implementations ω-subdivisions are used as long as the algorithm runs normally, and bisections only when the algorithm slows down and threatens to jam.

In extensive numerical experiments given in Horst and Thoai (1989) the following heuristic rule has been successful:

(**) Choose c > 0 sufficiently small. Use an ω-subdivision if min {λ_i*: λ_i* > 0} ≥ c, and a bisection otherwise, where λ* = (λ_1*,...,λ_n*) is the basic optimal solution of LP(Q*,M) in the form (9), so that ω(Q*) = Σ_i λ_i* z^i.

In Horst and Thoai (1989), c = 1/(2n²) was used.
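A direct transcription of rule (**) reads as follows (a sketch; lam is assumed to be the optimal λ-vector of LP(Q*,M) in the form (9)):

    def choose_subdivision(lam, n, c=None):
        # Heuristic rule (**): lam is the optimal lambda-vector of LP(Q*,M),
        # so that omega(Q*) = sum_i lam_i z^i.  An omega-subdivision is used
        # when every positive weight is at least c, a bisection otherwise.
        if c is None:
            c = 1.0 / (2 * n ** 2)      # the value used in Horst and Thoai (1989)
        return 'omega' if min(t for t in lam if t > 0) >= c else 'bisection'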

(iv) It follows from Theorem VII.2 that Algorithm VII.1 with ε = 0 will find an exact global optimal solution after finitely many steps, but it may require infinitely many steps to recognize this global optimal solution as such. A similar situation may occur with ε > 0: though a global ε-optimal solution has already been found at a very early stage, the algorithm might have to go through many more steps to check the global ε-optimality of the solution attained. This is not a peculiar feature, but rather a typical phenomenon in these types of methods.

Finally, the assumption that f(x) has bounded level sets can be removed. In fact, if this assumption is not satisfied, the set {x: f(x) ≥ α} may be unbounded, and the α-extension of a point may lie at infinity. By analogy with the cut and split algorithm, Algorithm VII.1 can be modified to solve the BCP problem as follows.

To each cone K we associate a matrix Q = (z^1,z^2,...,z^n) where z^i is the intersection of the i-th edge of K with the surface f(x) = α, if this intersection exists, or the direction of the i-th edge otherwise. Then, in the problem LP(Q,M), the vector e should be understood as a vector whose i-th component is 1 if z^i is a point, or 0 if z^i is a direction. Also, if I = {i: z^i is a point}, then this linear program can be written as

max Σ_{j∈I} λ_j
s.t. Σ_{j=1}^n λ_j(Az^j) ≤ b , λ_j ≥ 0 (j=1,...,n).

Aside from these modifications, the algorithm proceeds exactly as before.

1.7. Example VII.1. We consider the problem:

minimize f(x)  subject to  Ax ≤ b, x ≥ 0 ,

where

A =
   1.2   1.4   0.4   0.8
  −0.7   0.8   0.8   0.0
   0.0   1.2   0.0   0.4
   2.8  −2.1   0.5   0.0
   0.4   2.1  −1.5  −0.2
  −0.6  −1.3   2.4   0.5

b = (6.8, 0.8, 2.1, 1.2, 1.4, 0.8)ᵀ .

Tolerance ε = 10^{-6}.

With the heuristic subdivision rule (**), where the parameter was chosen as c = 1/(2n²), one obtains the following results:

First cycle

Initialization: z̄ = (0,0,0,0), M = D = {x: Ax ≤ b, x ≥ 0}.

Phase I:

x^0 = (0,0,0,0) (nondegenerate vertex of D)

y^{01} = (0.428571, 0.000000, 0.000000, 0.000000)
y^{02} = (0.000000, 0.666667, 0.000000, 0.000000)
y^{03} = (0.000000, 0.000000, 0.333333, 0.000000)
y^{04} = (0.000000, 0.000000, 0.000000, 1.600000)

Current best point: x̄ = (0.000000, 0.666667, 0.000000, 0.000000) with f(x̄) = −2.055110.

Phase II:

0) α = −2.055111. The problem is in standard form with respect to x^0; K_0 = ℝ^4_+.

z^{01} = (1.035485, 0.000000, 0.000000, 0.000000)
z^{02} = (0.000000, 0.666669, 0.000000, 0.000000)
z^{03} = (0.000000, 0.000000, 29.111113, 0.000000)
z^{04} = (0.000000, 0.000000, 0.000000, 8.733334)

Q_0 = (z^{01}, z^{02}, z^{03}, z^{04}), ℳ_0 = 𝒮_0 = {Q_0}.

Iteration 1

1) Solution of LP(Q_0,D): ω^0 = (1.216328, 1.245331, 0.818556, 2.366468),
f(ω^0) = −1.440295, μ(Q_0) = 3.341739.

2) ℛ_1 = ℳ_0.

3) Q* = Q_0. Split K_0 = con(Q_0) with respect to ω^0.

4) 𝒮* = {Q_{1,1}, Q_{1,2},..., Q_{1,4}}, ℳ_1 = 𝒮*, 𝒮_1 = 𝒮*.

Iteration 2

1) Solution of LP(Q_{1,i},D) (i=1,2,...,4):

ω(Q_{1,1}) = (1.083760, 1.080259, 0.868031, 0.000000) with objective function value −2.281489 < γ.

Cut: eQ_0^{-1}x ≥ 1, with eQ_0^{-1} = (0.413885, 0.999997, 0.011450, 0.183206).

Second cycle (Restart): M = D ∩ {x: eQ_0^{-1}x ≥ 1}

Phase I:

x^0 = (1.1694015, 1.028223, 0.169811, 4.861582) (nondegenerate vertex of M),

y^{01} = (1.104202, 0.900840, 0.000000, 5.267227)
y^{02} = (1.216328, 1.245331, 0.818956, 2.366468)
y^{03} = (1.134454, 0.941176, 0.000000, 5.151261)
y^{04} = (0.957983, 0.991597, 0.000000, 5.327731)

Current best point x̄ = (1.083760, 1.080259, 0.868031, 0.000000) with f(x̄) = −2.281489.

Phase II:

After 20 iterations (generating 51 subcones in all) the algorithm finds the global optimal solution (1.083760, 1.080259, 0.868031, 0.000000) with objective function value −2.281489.

Thus, the global optimal solution is encountered at the end of the first cycle (with two iterations), but checking its optimality requires a second cycle with twenty more iterations.

Accounts of the first computational experiments with the normal conical algorithm and some modifications of it can be found in Thieu (1989) and in Horst and Thoai (1989) (cf. the remarks on page 260). For example, in Horst and Thoai (1989), among others, numerical results for a number of problems with n ranging from 5 to 50 and with different objective functions of the forms mentioned in Section VI.2.4 are summarized in the table below (the column f(x): form of the objective function; Res: number of cycles; Con: number of cones generated; Bi: number of bisections performed). The time includes CPU time and time for printing intermediate results (the algorithm was coded in FORTRAN 77 and run on an IBM PS/2, Model 80, DOS 3.3).

n    m    f(x)   Res   Con   Bi   Time (sec)

5    15   (1)    3     25    2    6.42
8    21   (1)    3     31    3    22.30
9    27   (2)    5     105   10   90.80
10   30   (5)    3     46    0    68.63
12   11   (3)    4     49    2    74.38
12   18   (1)    5     71    5    110.21
20   18   (1)    8     123   7    436.59
20   13   (4)    20    72    12   1020.72
30   22   (3)    9     133   0    1800.23
40   20   (3)    6     70    12   2012.38
50   21   (3)    6     172   0    8029.24

Table VII.1

1.8. Alternative Variants

In this section we discuss two alternative variants of the normal conical algorithm.

I. Branch and Bound Variant

Note that the above Algorithm VII.1 operates in the same manner as a branch and bound algorithm, although the number μ(Q) associated with each cone K = con(Q) is not actually a lower bound for f(M ∩ K), as required by the conventional branch and bound concept. This is so because Phase II solves a (DG)-problem with G = {x: f(x) ≥ α} and, as seen in Section VII.1.1, μ(Q) is an upper bound for max p(M ∩ K), where p(x) is the gauge of the convex set G.

Because of this observation, we see that an alternative variant of the normal conical algorithm can be developed where a lower bound for f(M ∩ K) is used in place of μ(Q) to determine the cones that can be deleted as nonpromising (Step 2), and to select the candidate for further subdivision (Step 3).

Let Q = (z^1,z^2,...,z^n) and z̄^i = μ(Q)z^i, i.e., z̄^i is the intersection of the hyperplane through ω(Q) parallel to eQ^{-1}x = 1 with the ray through z^i. Since the simplex [0,z̄^1,z̄^2,...,z̄^n] entirely contains M ∩ K, and f(z̄^i) < f(0) (i=1,2,...,n), the concavity of f(x) implies that f(M ∩ K) ≥ min {f(z̄^i), i=1,2,...,n}. Hence, if we define β(Q) inductively, starting with β(Q_0), by the formula

β(Q) = α                                                    if μ(Q) ≤ 1 ;
β(Q) = max {β(Q_anc), min [f(z̄^1),f(z̄^2),...,f(z̄^n)]}      if μ(Q) > 1 ,

where Q_anc denotes the immediate ancestor of Q, then clearly β(Q) yields a lower bound for min f(M ∩ K) such that β(Q') ≥ β(Q) whenever con(Q') is a subcone of con(Q).

Based on this lower bound, we should delete any Q ∈ 𝒮 such that β(Q) ≥ α, and choose for further splitting Q* ∈ argmin {β(Q): Q ∈ ℛ}. Note, however, that β(Q) ≥ α if and only if μ(Q) ≤ 1, so in practice a change will occur only in the selection of Q*.
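In implementation terms this bound is a one-line recursion; the following sketch (our own naming) mirrors the formula above:

    def lower_bound(mu_Q, z_bar, f, beta_anc, alpha):
        # beta(Q): lower bound for min f(M ∩ con(Q)).  z_bar contains the
        # points z̄^i = mu(Q) z^i; beta_anc is the bound of the immediate
        # ancestor Q_anc.
        if mu_Q <= 1.0:
            return alpha                # such a cone is deleted anyway
        return max(beta_anc, min(f(z) for z in z_bar))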

Theorem VII.3. The conclusions of Theorems VII.1 and VII.2 still hold for the variant of the normal conical algorithm where β(Q) is used in place of μ(Q) in Steps 2 and 3.

Proof. It suffices to show that Phase II can be infinite only if ε = 0 and γ = min f(D). To this end consider any infinite nested sequence of cones K_s = con(Q_s), s=1,2,..., generated in Phase II (by Corollary IV.1, such a sequence exists if Phase II is infinite). Let ω^s = ω(Q_s) and, as before, denote by ω̄^s and q^s, respectively, the α-extension of ω^s and the intersection of the hyperplane eQ_s^{-1}x = 1 with the ray through ω^s. Then f(ω^s) ≥ γ = α + ε (see Step 1) and ω^s ∈ [q^s,ω̄^s]. Since q^s − ω̄^s → 0 (s → ∞) by the normality condition, we deduce that ω^s − ω̄^s → 0, and hence, by the uniform continuity of f(x) on the compact set G = {x: f(x) ≥ α}, we have f(ω^s) − f(ω̄^s) → 0, i.e., f(ω^s) → α. We thus arrive at a contradiction, unless ε = 0.

On the other hand, by the same argument as in the proof of Proposition VII.1, we can show, by passing to subsequences if necessary, that μ(Q_s) → 1 (s → ∞). Let Q_s = (z^{s1},...,z^{sn}), z̄^{si} = μ(Q_s)z^{si}. Then z̄^{si} − z^{si} → 0, and hence f(z̄^{si}) − f(z^{si}) → 0, i.e., f(z̄^{si}) → α. Since from the definition of β(Q) we have

γ = α > β(Q_s) ≥ min {f(z̄^{si}): i=1,2,...,n} ,

it follows that β(Q_s) → γ (s → ∞). This proves the consistency of the bounding operation. Since the selection operation is bound improving, we finally conclude from Theorem IV.2 that

min f(D) = min f(M) = γ . •



II. One Phase Algorithm

In Algorithm VII.1 the starting vertex x^0 is changed at each return to Phase I. The advantage of such restarts was discussed in Section VII.1.6. However, sometimes this advantage can be offset by the computational cost of the transformations necessary to start a new cycle. Especially if the old starting vertex x^0 has proven to be a good choice, one may prefer to keep it for the new cycle. For this reason, and also in view of the possibility of extending conical procedures to more general situations (see Section VII.1.9 and Chapter X), it is of interest to consider variants of conical algorithms with fixed starting point x^0 = 0. Below we present a one phase algorithm which is close to the algorithm of Thoai and Tuy (1980).

For each cone K = con(Q), Q = (z^1,z^2,...,z^n), with f(z^i) = γ < f(0) (i=1,2,...,n), we define

ν(Q,γ) = min {f(z̄^i), i=1,2,...,n} ,

where z̄^i = μ(Q)z^i, and μ(Q) is the optimal value of the linear program

LP(Q,D)   max eQ^{-1}x  s.t.  x ∈ D, Q^{-1}x ≥ 0 .

Algorithm VII.1*.

Select ε ≥ 0 and the sequence Δ ⊂ {0,1,...} for an NCS rule.

0) Translate the origin 0 to a vertex of D. Let x^0 ∈ argmin {f(x): x = 0 or x is a vertex of D adjacent to 0}, γ_0 = f(x^0) (assume that f(0) > γ_0).

Construct a cone K_0 ⊃ D. For each i=1,2,...,n compute the point z^{0i} ≠ 0 where the i-th edge of K_0 meets the surface f(x) = γ_0. Set Q_0 = (z^{01},z^{02},...,z^{0n}), ℳ_0 = 𝒮_0 = {Q_0}. Set k = 0.

1) For each Q ∈ 𝒮_k solve LP(Q,D) to obtain the optimal value μ(Q) and a basic optimal solution ω(Q) of this program. Compute ν(Q,γ_k), and let

β(Q) = max {β(Q_{k-1}), ν(Q,γ_k)}  (k ≥ 1) ,  β(Q_0) = ν(Q_0,γ_0) .

2) Let ℛ_k = {Q ∈ ℳ_k: β(Q) ≤ γ_k − ε}. If ℛ_k = ∅, terminate: x^k is a global ε-optimal solution. Otherwise,

3) Select Q_k ∈ argmin {β(Q): Q ∈ ℛ_k} and split it by proceeding as in Step 3 of Algorithm VII.1.

4) Let x^{k+1} be the best point among x^k, ω(Q) (Q ∈ 𝒮_k), and the point (if it exists) where the splitting ray for con(Q_k) meets the boundary of D. Let γ_{k+1} = f(x^{k+1}).

Let 𝒮_{k+1} be the partition of Q_k obtained in Step 3). For each Q ∈ 𝒮_{k+1} reset Q = (z^1,z^2,...,z^n) with z^i a point on the i-th edge of K = con(Q) such that f(z^i) = γ_{k+1} (i=1,2,...,n). Let ℳ_{k+1} = (ℛ_k \ {Q_k}) ∪ 𝒮_{k+1}. Set k ← k+1 and return to 1).

Theorem VII.4. Algorithm VII.1* can be infinite only if ε = 0. In this case γ_k ↓ min f(D), and any accumulation point of {x^k} is a global optimal solution.

Proof. First observe that Proposition VII.5 remains valid if the condition z^{si} ∈ ∂G is replaced by the following weaker one: z^{si} ∈ ∂G_s, where G_s is a convex subset of a convex compact set G, and 0 ∈ int G_s ⊂ int G_{s+1}. Now, if the procedure is infinite, it generates at least one infinite nested sequence of cones K_s = con(Q_s), s ∈ T ⊂ {0,1,2,...}, with Q_s = (z^{s1},z^{s2},...,z^{sn}) such that f(z^{si}) = γ_s (i=1,2,...,n) and γ_s ↓ γ (s → ∞). By Proposition VII.5, we can assume that q^s − ω̄^s → 0. Hence, as in the proof of Proposition VII.1, μ(Q_s) → 1. This implies that lim_s ν(Q_s,γ_s) ≥ γ. But since

ν(Q_s,γ_s) ≤ β(Q_s) ≤ γ_s − ε ,

this is possible only if ε = 0. Then β(Q_s) → γ, i.e., the lower bounding is consistent. Hence, by Theorem IV.2, min f(D) = γ. •

Remarks VII.3. (i) If the problem is in standard form with respect to x^0, so that x^0 = 0 is a vertex of the feasible polytope

D = {x: Ax ≤ b, x ≥ 0} ,

then one can take K_0 = ℝ^n_+ and the linear program LP(Q,D), where Q = (z^1,...,z^n), can be written as

max Σ_j λ_j
s.t. Σ_{j=1}^n λ_j(Az^j) ≤ b , λ_j ≥ 0 (j=1,2,...,n).

(ii) As with Algorithm VII.1, the efficiency of the procedure critically depends upon the choice of rule for cone subdivision. Although theoretically an NCS rule is needed to guarantee convergence, a simple rule like (**) in Section VII.1.6 seems to work sufficiently well in practice.

1.9. The Concave Minimization Problem with Convex Constraints

An advantage of the one phase Algorithm VII.1* is that it can easily be extended to the general concave programming problem:

(CP)   minimize f(x)  subject to  x ∈ D ,

where f: ℝ^n → ℝ is a concave function and D is a closed convex set defined by the inequality

g(x) ≤ 0 ,

with g: ℝ^n → ℝ a convex function.

Assume that the constraint set D is compact and that int D ≠ ∅.

When extending Algorithm VII.1* to problem (CP), the key point is to develop a method for estimating a bound β(Q) ≤ min f(D ∩ K) for each given cone K = con(Q) such that the lower bounding process is consistent.

Tuy, Thieu and Thai (1985) proposed cutting the cone K by a supporting hyperplane of the convex set D, thus generating a simplex containing D ∩ K. Then the minimum of f(x) over this simplex provides a lower bound for min f(D ∩ K). This method is simple and can be carried out easily (it does not even involve solving a linear program); furthermore, it applies even if D is unbounded. However, it is practical only for relatively small problems.

A more efficient method was developed by Horst, Thoai and Benson (1991) (see also Benson and Horst (1991)). Their basic idea was to combine cone splitting with outer approximation in a scheme which can roughly be described as a conical procedure of the same type as Algorithm VII.1*, in which lower bounds are computed using an adaptively constructed sequence of outer approximating polytopes D_0 ⊃ D_1 ⊃ ... ⊃ D.

The algorithm we are going to present is a modified version of the original method of Horst, Thoai and Benson. The modification consists mainly in using an NCS process instead of a pure bisection process for cone subdivision.

Algorithm VII.2 (Normal Conical Algorithm for CP)

Assume that f(0) > min f(D) and that a cone K_0 is available such that for any x ∈ K_0 \ {0} the ray {τx: τ > 0} meets D. Denote by x̄ the point θx where θ = sup {τ: τx ∈ D}. Select ε ≥ 0 and an infinite increasing sequence Δ ⊂ {0,1,2,...}.

0) For each i=1,2,...,n take a point y^i ≠ 0 on the i-th edge of K_0 and compute the corresponding point ȳ^i = θ_i y^i. Let x^0 ∈ argmin {f(ȳ^i), i=1,2,...,n}, γ_0 = f(x^0), and let z^{0i} be the γ_0-extension of y^i (i=1,...,n) (cf. Definition V.1). Set Q_0 = (z^{01},...,z^{0n}) and ℳ_0 = 𝒮_0 = {Q_0}. Construct a polytope D_0 ⊃ D. Set k = 0.

1) For each Q ∈ 𝒮_k, Q = (z^1,z^2,...,z^n), solve the linear program LP(Q,D_k) to obtain the optimal value μ(Q) and a basic optimal solution ω(Q) of this program; compute ν(Q,γ_k) = min {f(z̄^i), i=1,2,...,n}, where z̄^i = μ(Q)z^i, and let

β(Q) = max {β(Q_{k-1}), ν(Q,γ_k)}  (k ≥ 1) ,  β(Q_0) = ν(Q_0,γ_0) .

2) Let ℛ_k = {Q ∈ ℳ_k: β(Q) ≤ γ_k − ε}. If ℛ_k = ∅, then terminate: x^k is a global ε-optimal solution. Otherwise, go to 3).

3) Select Q_k ∈ argmin {β(Q): Q ∈ ℛ_k} and split it as in Step 3 of Algorithm VII.1.

4) If ω^k ∈ D, set D_{k+1} = D_k. Otherwise, take a vector p^k ∈ ∂g(ω̄^k) (recall that x̄ = θx, with θ = sup{τ: τx ∈ D}), and form

D_{k+1} = D_k ∩ {x: p^k(x − ω̄^k) ≤ 0} .

5) Let x^{k+1} be the best point among x^k, ω̄^k and those ω(Q), Q ∈ 𝒮_k, that belong to D; let γ_{k+1} = f(x^{k+1}).

Denote the partition of Q_k obtained in 3) by 𝒮_{k+1}. For each Q ∈ 𝒮_{k+1} reset Q = (z^1,...,z^n) with z^i ≠ 0 a point where the i-th edge of K = con(Q) meets the surface f(x) = γ_{k+1}.

Let ℳ_{k+1} = (ℛ_k \ {Q_k}) ∪ 𝒮_{k+1}. Set k ← k+1 and return to 1).
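The outer approximation update in Step 4 is the only place where the convex constraint function g enters the algorithm; a minimal Python sketch (the names, in particular ray_exit for the map x ↦ x̄, are ours) is:

    def update_relaxation(cuts, omega_k, g, subgrad_g, ray_exit):
        # Step 4 of Algorithm VII.2.  cuts is the list of pairs (p, x_bar),
        # each defining a half-space {x: p·(x - x_bar) <= 0} added to D_0
        # so far; together they describe the current polytope D_k.
        if g(omega_k) <= 0:            # omega^k ∈ D: keep D_{k+1} = D_k
            return cuts
        x_bar = ray_exit(omega_k)      # the point where the ray through omega^k leaves D
        p = subgrad_g(x_bar)           # any subgradient p^k ∈ ∂g(x̄)
        cuts.append((p, x_bar))        # D_{k+1} = D_k ∩ {x: p·(x - x̄) <= 0}
        return cuts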

Theorem VII.5. Algorithm VII.2 can be infinite only if ε = 0. In this case γ_k ↓ γ, and every accumulation point of the sequence {x^k} is a global optimal solution of (CP).

Proof. Consider any infinite nested sequence of cones K_s = con(Q_s), s ∈ T ⊂ {0,1,2,...}, with Q_s = (z^{s1},z^{s2},...,z^{sn}), such that f(z^{si}) = γ_s (i=1,2,...,n). Let

γ = lim_{k→∞} γ_k = lim γ_s  (s ∈ T, s → ∞) .

For each s denote by q^s and ω̂^s the points where the halfline from 0 through ω^s meets the simplex [z^{s1},...,z^{sn}] and the surface f(x) = γ_s, respectively; also recall that ω̄^s is the point where this halfline meets the boundary ∂D of D. Since f(ω̂^s) = f(z^{si}) = γ_s and γ_s ↓ γ, as in the proof of Theorem VII.4 it can be seen that the sequence K_s = con(Q_s) is normal, i.e., lim (q^s − ω̂^s) = 0. Hence, in view of the boundedness of D_0, we may assume, by taking subsequences if necessary, that q^s, ω̂^s tend to a common limit q^∞, while ω^s, ω̄^s tend to ω^∞, ω̄^∞, respectively. We claim that

ω^∞ = ω̄^∞ .

Indeed, this is obvious if ω^s ∈ D for infinitely many s. Otherwise, for all sufficiently large s we must have ω^s ∉ D; and, according to Step 4, each ω^s is strictly separated from D by a hyperplane l_s(x) := p^s(x − ω̄^s) = 0 such that

l_s(ω^s) > 0 ,  l_s(x) ≤ 0  ∀x ∈ D .

Therefore, by Theorem II.2, lim_{s→∞} ω^s ∈ D, i.e., ω^∞ = ω̄^∞ in any case. But clearly, γ_{k+1} ≤ f(ω̄^k) for any k. By passing to the limit, this yields γ ≤ f(ω^∞), i.e., ω^∞ ∈ [0,q^∞]. On the other hand, since q^s ∈ [0,ω^s], we have q^∞ ∈ [0,ω^∞]. Consequently, q^∞ = ω^∞, i.e., q^s − ω^s → 0. As illustrated in Fig. VII.1, we then have

μ(Q_s) → 1  (s ∈ T, s → ∞) .

This implies that z̄^{si} − z^{si} → 0, where z̄^{si} = μ(Q_s)z^{si}. Since f(z^{si}) = γ_s, it then follows from the definition of ν(Q_s,γ_s) that lim ν(Q_s,γ_s) ≥ γ. Hence, noting that γ_s − ε ≥ β(Q_s) ≥ ν(Q_s,γ_s), we arrive at a contradiction as s → ∞, if ε > 0. Therefore, the algorithm can be infinite only if ε = 0. But then β(Q_s) ↑ γ, i.e., the lower bounding is consistent. Since the candidate selection here is bound improving, the conclusion follows from Theorem IV.2. •

Fig. VII.1

Remarks VII.4. (i) The above algorithm differs from Algorithm VII.1* only in the presence of Step 4. If D is a polytope and we take D_0 = D, then D_k = D ∀k, and the above algorithm reduces exactly to Algorithm VII.1*.

(ii) If D_0 = {x: Ax ≤ b} and D_k = {x: Ax ≤ b, Cx ≤ d}, then for Q = (z^1,z^2,...,z^n) the linear program LP(Q;D_k) is

max Σ_{j=1}^n λ_j  s.t.  Σ_{j=1}^n λ_j(Az^j) ≤ b , Σ_{j=1}^n λ_j(Cz^j) ≤ d , λ_j ≥ 0 (j=1,...,n).

(iii) In the general case when an initial cone K_0 as defined above is not available, Step 0) should be modified as follows.

By translating if necessary, assume that 0 ∈ int D. Take an n-simplex [y^1,...,y^{n+1}] containing 0 in its interior. Let x^0 ∈ argmin {f(y^i), i=1,...,n+1}, γ_0 = f(x^0), and let z^{0i} be the γ_0-extension of y^i (i=1,...,n+1). For each i=1,...,n+1 let Q_{0i} be the matrix with columns z^{0j}, j ∈ {1,...,n+1}\{i}. Set ℳ_0 = 𝒮_0 = {Q_{0i}, i=1,...,n+1}. Construct a polytope D_0 ⊃ D. Set k=0.

1.10. Unbounded Feasible Domain

The previous algorithms can be extended to problems with unbounded feasible domain.

Consider, for example, Algorithm VII.2. If D is unbounded, then D_k is unbounded, and in Step 1 the linear program LP(Q,D_k) may have no finite optimal solution. That is, there is the possibility that μ(Q) = +∞, and ω^k = ω(Q_k) is not a finite point but rather the direction of a halfline contained in D_k. If we assume, as before, that the function f(x) has bounded level sets (so that, in particular, it is unbounded from below over any halfline), Algorithm VII.2 will still work, provided we make the following modifications:

When ω^k = ω(Q_k) is a direction, the γ_k-extension of ω^k should be understood as a point ω̂^k = λω^k such that f(λω^k) = γ_k. Moreover, in Step 4, if ω^k is a recession direction of D, then the algorithm terminates, because inf f(D) = −∞ and f(x) is unbounded from below on the halfline in the direction ω^k. Otherwise, the halfline in the direction ω^k intersects the boundary of D at a unique point ω̄^k ≠ 0. In this case let p^k ∈ ∂g(ω̄^k), and

D_{k+1} = D_k ∩ {x: p^k(x − ω̄^k) ≤ 0} .

Theorem VII.6. Assume that the function f(x) has bounded level sets. For ε = 0 consider Algorithm VII.2 with the above modifications in Steps 1 and 4. Then

f(x^k) ↓ inf f(D).

Proof. It suffices to consider the case where γ_k ↓ γ > −∞. Let K_s = con(Q_s) be any infinite nested sequence of cones, with Q_s = (z^{s1},z^{s2},...,z^{sn}), f(z^{si}) = γ_s (i=1,2,...,n). We claim that μ(Q_s) = +∞ for at most finitely many s. Indeed, suppose the contrary, so that, by taking a subsequence if necessary, we have μ(Q_s) = +∞ ∀s, i.e., each ω^s = ω(Q_s) is a direction. Again by passing to subsequences if necessary, we may assume that ω^s/‖ω^s‖ → u. Then, arguing as in the proof of Lemma VI.3, it can be seen that the halfline Γ in the direction u lies entirely in D.

Let W be the intersection of K_0 with a ball around 0, which is small enough so that W ⊂ D, and hence W + Γ ⊂ D. Since f(x) is unbounded from below on Γ, we can take a ball B around some point of Γ such that B ⊂ W + Γ ⊂ D and f(x) < γ ∀x ∈ B. Then for all sufficiently large s, the ray in the direction ω^s will intersect B at some point x ∈ D with f(x) < γ. This implies that f(ω̄^s) < γ ≤ γ_{s+1}, contradicting the definition of γ_{s+1}. Therefore, we have μ(Q_s) < +∞ for all but finitely many s.

Arguing as in the first part of the proof of Theorem VII.5, we then can show that q^s − ω^s → 0. Since f(ω^s) ≥ γ, the sequence {ω^s} is bounded. If ‖ω^s − ω̄^s‖ ≥ η > 0 for infinitely many s, then, by taking the point y^s ∈ [ω̄^s,ω^s] such that ‖y^s − ω̄^s‖ = η, we have l_s(y^s) > 0, l_s(x) ≤ 0 ∀x ∈ D. Hence, by Theorem III.2, lim y^s ∈ D, i.e., y^s − ω̄^s → 0, a contradiction. Therefore, ω^s − ω̄^s → 0, and, as in the last part of the proof of Theorem VII.5, we conclude that β(Q_s) ↑ γ.

We have thus proved that the lower bounding is consistent. The conclusion of the theorem then follows from the general theory of branch and bound methods (Theorem IV.2). •

1.11. A Class of Exhaustive Subdivision Processes

In the general theory of branch and bound discussed in Chapter IV, we have seen that exhaustive subdivisions play an important role.

The simplest example of an exhaustive subdivision process is bisection. Since the convergence of conical (and also simplicial) algorithms using bisection is sometimes slow, it is of interest to develop alternative exhaustive subdivision processes.

From Corollary VII.3 we already know an exhaustive subdivision process which involves infinitely many bisections. Now we present another class of exhaustive subdivisions in which a cone is most often split into more than two subcones. This class was introduced by Utkin (see Tuy, Khachaturov and Utkin (1987) and also Utkin, Khachaturov and Tuy (1988)).


Given an (n-1)--simplex Zo which is the section of the initial cone by a fixed

hyperplane, we wish to define a subdivision process on Zo such that any infinite


nested sequence of simplices {Zs} generated by such a process shrinks to a single
point. Clearly, the conical subdivision induced by this simplicial subdivision will

then be exhaustive.
Recall from Section IV.3 that a radial subdivision process is specified by giving a
function w{Z) which assigns to each (n-l)--simplex Z = [vl,v 2,... ,vn] a point
w{Z) E Z \ V{Z), where V{Z) = {vl,v2,... ,vn }. If
n. n
w{Z) = E A. VI , A. ~ 0 , E A. =1 ,
i=1 I I i=l I

I(Z) = {i: \ > o} ,

then Z is divided into subsimplices Z{i,w) that for each i E I{Z) are formed by
substituting w{Z) for vi in the vertex set of Z. In order to have an exhaustive radial
subdivision process, certain conditions must be imposed on the function w{Z). Let us
denote

5{Z) = max IIvi-vili ,


i <j

5{i,Z) = max IIvi-vili (i=1,2, ... ,n).


j

We see that 5{Z) is the diameter (the length of the longest edge) of Z, whereas
6{i,Z) is the length of the longest edge of Z incident to the vertex vi. The following
conditions define an exhaustive class of subdivisions.
There exists a constant ρ, 0 < ρ < 1, such that for any simplex Z = [v^1,v^2,...,v^n]:

‖w(Z) − v^i‖ ≤ ρδ(Z)  ∀i ∈ I(Z) ,    (11)

while the set I(Z) defined above satisfies

I(Z) = {i: δ(i,Z) > ρδ(Z)} .    (12)

Proposition VII.6. A radial subdivision process defined by a function w(Z) satisfying the above conditions (11) and (12) is exhaustive.

The proof of Proposition VII.6 will follow from several lemmas.

Consider any nested sequence of simplices {Z_s}, each of which is an immediate descendant of the previous one by means of a subdivision defined by w(·). Denote by i_s the index of the vertex of Z_s that is to be replaced by w(Z_s) to generate Z_{s+1}. In other words, if Z_s = [v^{s1},v^{s2},...,v^{sn}], then the vertices of Z_{s+1} are v^{s+1,i} = v^{si} (i ≠ i_s), v^{s+1,i_s} = w(Z_s), so that the new vertex in Z_{s+1} receives the index of the vertex of Z_s that it replaces. Obviously, i_s ∈ I(Z_s).

Lemma VII.3. For every s ≥ 1 we have

δ(i_s,Z_{s+1}) ≤ ρδ(Z_s) .

Proof. It suffices to show that ‖w(Z_s) − v^{s+1,i}‖ ≤ ρδ(Z_s) for all i ∈ {1,...,n} \ {i_s}. But for i ∈ I(Z_s) \ {i_s} this follows from (11), while for i ∈ {1,...,n} \ I(Z_s) we have, by (12):

‖w(Z_s) − v^{si}‖ ≤ Σ_j λ_{sj}‖v^{sj} − v^{si}‖ ≤ δ(i,Z_s) ≤ ρδ(Z_s) ,

where the λ_{sj} are the coefficients expressing w(Z_s) as a convex combination of the vertices of Z_s. •

Lemma VII.4. For every s at least one of the following relations holds:

δ(Z_{s+1}) < δ(Z_s) ,    (13)

I(Z_{s+1}) ⊂ I(Z_s) \ {i_s} .    (14)

Proof. Obviously, δ(Z_{s+1}) ≤ δ(Z_s). Suppose that δ(Z_{s+1}) = δ(Z_s). Since Z_{s+1} is obtained from Z_s by replacing the vertex of index i_s by w(Z_s), for every i ≠ i_s we can write:

δ(i,Z_{s+1}) ≤ max {δ(i,Z_s), δ(i_s,Z_{s+1})} .    (15)

Hence, if i ∈ I(Z_{s+1}), i.e., δ(i,Z_{s+1}) > ρδ(Z_{s+1}) = ρδ(Z_s), then by the previous lemma and (15), δ(i,Z_{s+1}) ≤ max {δ(i,Z_s), ρδ(Z_s)}, and consequently δ(i,Z_s) > ρδ(Z_s), i.e., i ∈ I(Z_s). Therefore, since Lemma VII.3 also gives δ(i_s,Z_{s+1}) ≤ ρδ(Z_s) = ρδ(Z_{s+1}), it follows that i_s ∉ I(Z_{s+1}). This proves (14). •



Let us now define

I_s = {i: δ(i,Z_s) > ρδ(Z_1)} .

Lemma VII.5. For s ≥ 1 suppose that I_s ≠ ∅. Let h_s = |I(Z_s)| − |I_s| + 1. Then for all h ≥ h_s we have

|I_{s+h}| ≤ |I_s| − 1 .    (16)

Proof. First note that for every s ≥ 1 one has

I_{s+1} ⊂ I_s ⊂ I(Z_s) .    (17)

Indeed, the second inclusion follows from the inequality δ(Z_1) ≥ δ(Z_s). To check the first, observe that if i ∉ I_s, i.e., if δ(i,Z_s) ≤ ρδ(Z_1), then from (15) and Lemma VII.3 (which implies that δ(i_s,Z_{s+1}) ≤ ρδ(Z_1)) we derive the inequality δ(i,Z_{s+1}) ≤ ρδ(Z_1); hence i ∉ I_{s+1}. From (17) it follows, in particular, that h_s ≥ 1.

Now suppose that for some h ≥ 1

|I_{s+h}| = |I_s| ,    (18)

while

δ(Z_{s+j+1}) = δ(Z_{s+j})  (j=0,1,...,h−1) .    (19)

From Lemma VII.4 it then follows that |I(Z_{s+h})| ≤ |I(Z_s)| − h; hence, by (17) and (18),

|I_s| = |I_{s+h}| ≤ |I(Z_{s+h})| ≤ |I(Z_s)| − h .

Therefore, for h ≥ h_s either (18) or (19) does not hold. If (18) does not hold, then in view of (17) we must have (16). Otherwise, there exists r such that s ≤ r < s+h and

δ(Z_{r+1}) < δ(Z_r) .    (20)

If I_r = ∅, then (16) trivially holds, since I_s ≠ ∅. Consequently, we may assume that I_r ≠ ∅. From (20) it follows that δ(i_r,Z_r) = δ(Z_r), and hence

δ(i_r,Z_r) = δ(Z_r) ≥ δ(i,Z_r) for all i .

In particular, by taking an i ∈ I_r we can write

δ(i_r,Z_r) ≥ δ(i,Z_r) > ρδ(Z_1) .

This implies that i_r ∈ I_r. But, by Lemma VII.3, δ(i_r,Z_{r+1}) ≤ ρδ(Z_r) ≤ ρδ(Z_1), so that i_r ∉ I_{r+1}. Taking account of (17), we then deduce that |I_{r+1}| < |I_r|, and hence (16) holds since |I_{s+h}| ≤ |I_{r+1}| and |I_r| ≤ |I_s|. This completes the proof of Lemma VII.5. •

Lemma VII.6. For p = n(n−1)/2 and any s ≥ 1 we have

δ(Z_{s+p}) ≤ ρδ(Z_s) .    (21)

Proof. Without loss of generality we may assume that s = 1. Consider a sequence of integers s = h_0 < h_1 < ... < h_ν such that I_{h_ν} = ∅. According to Lemma VII.5, such a sequence can be obtained inductively by defining

h_j = h_{j−1} + |I(Z_{h_{j−1}})| − |I_{h_{j−1}}| + 1  (j=1,2,...,ν).

Since any nonempty I_t has at least two elements, it follows that |I_{h_{ν−1}}| ≥ 2. Hence, ν ≤ n−1 and |I_{h_{j−1}}| ≥ 2 + (ν−j). Substituting into the expression for h_j we obtain

h_j ≤ h_{j−1} + n − (ν−j) − 1  (j=1,2,...,ν).

Therefore, h_ν ≤ h_{ν−1} + n−1 ≤ h_{ν−2} + (n−2) + (n−1) ≤ ... ≤ h_0 + (n−ν) + ... + (n−2) + (n−1) ≤ s + n(n−1)/2. Thus, I_{s+p} = ∅, implying the inequality (21). •

From this lemma it readily follows that δ(Z_s) → 0 as s → ∞. Hence, Proposition VII.6 is established.

Example VII.2. Conditions (11), (12) are satisfied, with ρ = 1−1/n, if for each (n−1)-simplex Z we let

I(Z) = {i: δ(i,Z) > (1−1/n)δ(Z)} ,  w(Z) = Σ_i λ_i v^i , where λ_i = 0 for i ∉ I(Z), λ_i = 1/|I(Z)| for i ∈ I(Z).

Indeed, condition (12) is obvious, while for i ∈ I(Z) we have, by setting ν = 1/|I(Z)|:

‖w(Z) − v^i‖ = ‖ν Σ_j (v^j − v^i)‖ ≤ ν Σ_{j≠i} ‖v^j − v^i‖ ≤ (1−ν)δ(Z) ≤ (1−1/n)δ(Z) ,

where Σ_j denotes the sum over all j ∈ I(Z).

Rem.ark Vll.5. In the example above, p = 1-I/n -lIas n -I w. However, using

the Young inequality for the radius of the smallest ball containing a compact set

with a given diameter (cf., e.g., Leichtweiß (1980)), one can show the existence of a
subdivision process of the type (11) and (12) with p = [(n-l)/2n] 1/2 $ 2-1/2. For

another discussion of generalized bisection, see Horst (1995) and Horst, Pardalos and
Thoai (1995).

1.12. Exhaustive Nondegenerate Subdivision Processes

In certain applications, an c-optimal solution in terms of the objective may not

provide an adequate answer to the original problem, so that finite convergence may

be a desirable property to achieve in concave minimization over polytopes.

To generate finitely convergent conical algorithms, Hamami and Jacobsen (1988)

(cf. also Hamami (1982)) introduced a subclass of exhaustive subdivision processes


which satisfy a nondegeneracy condition that is stronger than the one defined in Sec-
336

tion VILl.3.
Given the initial cone KO = IR~ and the (n-1)-fiimplex Zo = [el,e2,... ,en], which

is the intersection of KO with the hyperplane xl +x2+ ... +xn = 1, we associate to


each subsimplex Zk = [vkl,vk2, ... ,vkn] of Zo the nx n(n-1)/2 matrix

j] .
ki kj
Bk =[ vki -vkJ ' i <
IIv -v 11

The columns of Bk represent the unit vectors in the edge directions of the

(n-1)-fiimplex Zk' and therefore, rank(B k) = n-l.

Definition Vll.3. An exhaustive sub division process on Z0 is said to be nondegen-

erate (briefly, an END process) if for any infinite nested sequence of simplices Zk=
[Jl,J2, ... ,Jn] obtained by the sub division method, any convergent subsequence of
the aBsociated sequence {Bk} converges to a matrix B ofrank n-l.

Basically, this property ensures that when Qk = (zk1,zk2, ... ,zkn), where zki is the
intersection of the ray through vki with the boundary of a given bounded convex set
G whose interior contains 0, then, under mild conditions, the vector eQk-1 tends to
anormal to G at the point z* = lim zki (k - f IV).
To show this, we need some lemmas.

Lemma Vll.7. Let C be a convex bounded set containing 0 in its interior. Then the

mapping ?r that aBsociates to each point x f 0 the intersection ?r(x) of the boundary of
C with the ray (from 0) through x, is Lipschitzian relative to any compact subset S of
IR n not containing o.

Proof. If p(x) denotes the gauge of C, then ?r(x) = x/p(x), so that

From this the lemma follows, since S is compact, the convex function p(x) is Lip-

schitzian relative to S (cf. Rockafellar (1970), Theorem 10.4) and, moreover, IIxll is
337

bounded from above, while IIp(x)1I is bounded from below on S.



Now consider any infinite nested sequence of simplices Zk = [vk1 ,v k2 ,... ,vkn]

generated in an END subdivision process on the initial simplex ZOo Assume that
7< f(O) (f(x) is the objective function of the BCP problem) and let zki = 0ki vki be
the ~xtension of vki . Since the subdivision is exhaustive, the sequences {vki }
(i=1,2, ... ,n) tend to a common limit v* and if z* is the 7-€xtension of v* then
z*-li
- m zki(k ) 1-
'-12
- - I ID , , , ... ,n.

As a consequence we have the following lemma.

Lemma Vll.8. There exist two positive numbers 1'Jij and Pij such that for i < j and

every k:

Proof. The second inequality is due to the Lipschitz property of the mapping 71",

where 7r{x) denotes the intersection of the boundary of the convex set
G = {x: f(x) ~ 7} with the ray through X. The first inequality is due to the Lipschitz

property of the mapping (J, where a(x) denotes the intersection of Zo with the ray
through x (Zo can be embedded in the boundary of a convex set containing 0 in its
interior).

Also note the following simple fact:

Lemma Vll.9. There exist 01 and 02 such that 0 < 01 ~ 0ki ~ 02 for i=I,2, ... ,n
and every k.

ki
Proof. We have 0ki = l/p(v ), where p(x) is the gauge of the convex set

G = {x: f(x) ~ 7}. The lemma then follows from the continuity of p(x) on ZO' since
any continuous function on a compact set must have a maximum and a: minimum on
338

this set.

Now for each simplex Zk = [vk1,vk2, ... ,vkn] define the matrix

ki kj
Uk = [ II:ki:kJi!' i < j ] .

Lemma VII.I0. There exists a subsequence A ( {k} such that Us ' se A, tends to

a matrix U with rank(U) = n-l.

Proof. By the definition of an END process, there exists a subsequence A ( {k}

such that lim Bs = B as s -Im, S e A, with rank(B) = n-l. Let bij , (i,j) e I, be a set
of n-l linearly independent columns of B. We mayassume that (zsi~sj)/lizsi ~sjll

converges to uij for i < j. Let us show that the vectors uij , (i,j) e I, are linearly in-
dependent. Suppose that E a..uij = o.
(i,j)eI IJ
We may write

zsi~sj = °. IIvsi_vsjll (vsi_vsj ) + (OsCOSj) vsj (22)


IIzSl~SJII SI IIvS1_vsJII Ilz Sl _z sJII IIzSl~SJII

and therefore

E a .. (zsi _ zsj)/lizsi _ zSjll


(i,j)eI IJ

= E ~ .(vsi _ vSj)/livsi _ vsjll


(i,j)eI IJ

+ . ~ aiJ·(( 0si-Os ·)/llii - zsjlDvsj , (23)


(1,J)eI J

where

(24)

By Lemma VII.9 and Lemma VII.8, we may assume that this latter sequence con-

verges. Then from equation (22) we may also assume that (Osi-Osj)/lizSi - zsjll con-
verges. Multiplying both sides of (23) by e = (1,1, ... ,1) and letting s -Im, we obtain
339

0= e( E (}..uij ) = lim E (}..(0.-0 .)/lizsi - zsjll .


(i,j)EI IJ S-+ID (i,j)EI IJ SI SJ

Now, taking the limit in equation (23) as S -I ID, we deduce from the above equality

that

0= E (}..uij = E p..bij , where p.. = lim p.~


(i,j)EI IJ (i,j)EI IJ IJ S-+ID IJ

Since the bij , (i,j) E I, are linearly independent, this implies that ßij = 0 for (i,j) E I.

But from (24) it follows, by Lemma VII.9 and Lemma VII.S, that the only way

ßij = 0 can hold is that (}ij = 0 for all (i,j) E I. Therefore, the vectors uij , (i,j) E I,

are linearly independent.



We are now in a position to prove the main property of END processes.

Recall that for any simplex Zk = [ vkl ,vk2 ,... ,Vkn] , Qk denotes the matnx
.
of
kl k2 kn
columns z ,z ,... ,Z .

Proposition Vll.7. Let {Zk} be any infinite nested sequence 01 simplices generated

in an END process. II the function I(z) is continuously differentiable (at least in a


neighbourhood 01 z*), then there exists a subsequence /1 ( {k} such that
-1
eQs V!(z*)
-I _
(s - I ID, SE /1) . (25)
lIeQs-ili IIVI(u)1I

Proof. Let qsi = Vf(zsi)/IIVf(zsi)lI, q* = Vf(z*)/IIVf(Z*)II. From the continuous

differentiability of fex) we have

But f(zsj) = f(zsi) = "{, and therefore, dividing by IIz sj-2si li and letting s - I ID we

obtain

q*uij = 0 Vi <j ,
340

Le., q*U = 0, where rank(U) = n-l. On the other hand, one can assume that the
subsequence d = {s} has been chosen in such a way that the vectors qS = eQ;I/
IIeQ;I 11 converge. Since eQ;I(zSj-zsi) = 0, it follows that if q = lim qS, then we also
have quij =0 Vi < j, i.e., qU = o. This implies that q = :I: q*. But eQ;lzSi = 1 for
any s, hence qz* = 1, and so we must have q = -q*, proving the proposition. _

Because of the above property, END subdivision processes can be used to ensure
finite convergence in conical algorithms for concave minimization over polytopes.
More specifically, let us select an END process and consider Algorithm 1* for the
BCP problem, in which the following subdivision method is used:

a) If the optimal solution w(Qk) of the linear program LP(QkjD) is avertex of D


(the feasible polytope) , then split con(Qk) with respect to this vertex, Le., select

u(Qk) to be the 7k-extension of w(Qk)·

b) Otherwise, subdivide con(Qk) using the END process.

Theorem W.7. Assume that the function f(z) has bounded level sets and is eon-
tinuously differentiable (at least in a neighbourhood of any global optimal solution of
(BCP)). Furthermore, assume that any global optimal solution of (BCP) is astriet
loeal minimum. Then for t = 0 Algorithm VII. 1* with the above sub division method is
finite.

Proof. If the algorithm is infinite, it must generate an infinite nested sequence of


cones: con(Qs) = [sSI,z82, ... ,zsn]. Because of the exhaustiveness of the subdivision,
the algorithm is convergent (see Theorem VII.4): 7s --I7 = min f(D)j and since
f(zsi) = 7s' zSi --Iz*, it follows that z* is a global optimal solution. By Proposition
VI!.7, we may assume that (25) holds (although 7k here is not constant, the pre-
vious reasoning applies because 7k 17).
341

But sinee z* is astriet loeal minimizer, it must be a vertex of D; moreover, the

tangent hyperplane to the surfaee f(x) = 7 at z* must be a supporting hyperplane of


D whieh eontains no other point of D. That is, any linear program max {px: x E D},
where p / IIpll is sufficiently near to -Vf(z*) / IIVf(z*)II, will have z* as its unique

optimal solution. In view of (25), then, for large enough s, say for s ~ sO' the linear
program LP(Qs;D), i.e., max {eQs-1x: x E D n eon(QsH will have z* as its unique

optimal solution, i.e., w(Qs) = z*. Then eon(Qs) is subdivided with respeet to z*.
So we must have z* E {zsl, ... ,zsn}
But z* is eommon to all eon(Qs); henee for all s >

and eQs-1z* = 1. Sinee z* solves LP(Qs;D), it follows that J.t(Qs) = max {eQs-1x:
xE D n eon(QsH = 1, contradicting the fact that J.t(Qk) > 1 for any k. This proves
finiteness of the algorithm.

Remarks vrr.6. (i) An example of an END process is the following: Given a
. 1ex Zk
Slmp = [vk1 ,vk2 ,... ,vkn] ,we dVIi 'ed1' t 'mto 2n- 1 sub'
Slmpli ces by t he
(n-2)-hyperplanes through the midpoints of the edges (for each i there is an (n-2)-
hyperplane through the midpoints of the n-1 edges of Zk emanating from vk1 ).

Clearly, the edge directions of each sub simplex are the same as those of Zk' There-
fore, the sequence {Bk} will be constant.
But of course there are subdivision processes (even exhaustive subdivision pro-
cesses) which do not satisfy the END condition. An example is the classical bary-
centric subdivision method used in combinatorial topology. Fig. VII.2 illustrates the

case n = 2: the edge lengths of Zk+1 are less than 2/3 those of Zk' and therefore the
process is exhaustive. It is also easy to verify that the sequence {Bk} converges to a

matrix ofrank 1.

(ii) The difficulty with the above method of Hamami and Jacobsen is that no
sufficiently simple END process has yet been found. The authors conjecture that

bisection should be an END process, but this eonjecture has never been proved or

disproved. In any case, the idea of using hybrid subdivision proeesses, with an
emphasis on subdivision with respect to vertices, has inspired the development of
342

further significant improvements of conical algorithms for concave minimization.

1 1
v v

2 .3 2 .3
v v v v

Fig. VII.2

(iii) From Proposition VII.7 it follows that if the function f(x) is continuously
differentiable, then the END process is nondegenerate in the sense presented in Sec-
tion VII.l.3. Thus, the condition of an END process is generally much stronger than
the nondegeneracy condition discussed in normal conical algorithms.

2. SIMPLICIAL ALGORITHMS

In this section we present branch and bound algorithms for solving problems

(BCP) and (CP), in which branching is perlormed by means of simplicial subdi-

visions rather than conical subdivisions.

The first approach for minimizing a concave function f: IRn ---. IR over a nonpoly-
hedral compact convex set D was the simplicial branch and bound algorithm by
Horst (1976) (see also Horst (1980)). In this algorithm the partition sets M are
simplices that are successively subdivided by a certain process. A lower bound ß(M)
343

is determined by constructing the convex envelope IPM of f over M and subsequently

minimizing IPM over D n M. Recall from Theorem IV.7 that IPM is the affine func-
ti on that is uniquely determined by the set of linear equations that arises from the

requirements that IPM and f have to coincide at the vertices of M. Note that this lin-
ear system does not need to be solved explicitly, since (in barycentric coordinates)

wehave

n+l . n+l
IPM(x) = E .>..f(vl ), E >.. = 1 , >.. ~ 0 (i=l, ... ,n),
i=l I i=l I I

. n+l .
where VI (i=l, ... ,n) denotes the vertices of M and x = E >..vl • The lower bound
i=l I
ß(M) is then determined by the convex program

minimize IPM(x) S.t. xE D n M ,

which is a linear program when D is a polytope.

Whenever the selection operation is bound improving, the convergence of this al-

gorithm follows from the discussion in Chapter IV. An example that illustrates this
approach is Example IV. I.

We discuss certain generalizations and improvements of this algorithm and begin


by introducing "normal" simplicial algorithms.

Since the conical subdivision discussed in the preceding sections are defined by
means of subdivisions of simplices the normal simplicial subdivision processes turn
out to be very similar to the conical ones.

We first treat the case (BCP) where D is a polytope.

2.1. Normal Simplicial Subdivision Processes

Let MO be an n-simplex containing D such that a vertex xO of MO is an extreme


point ofD.
344

For any n-subsimplex M = [i', ... ,vn+1] of MO denote by 'PM(x) the convex
envelope of f over Mj it is the affine function that agrees with f(x) at each vertex of

M (cf. Theorem IV.7). Associate to M the linear program

LP(M,D) min 'PM(x) s.t. xe D n M .

Let w(M) be a basic optimal solution of this linear program, and let ß(M) be its

optimal value.
Then we know from Section IV.4.3 that

ß(M) ~ min f(D n M) ,

Le., ß(M) is a lower bound for f(x) on D n M. If M' is a subsimplex of M then the
value of ~, at any vertex v of M' is f(v) ~ ~(v) (by the concavity of f)j hence
V'M,(x) ~ V'M(x) Vx e M'. Therefore, thislower bounding satisfies the.monotonicity

condition:

ß(M') ~ ß(M) whenever M' c M .

Now consider a simplicial subdivision process in which:

1) The initial simplex is MO'


2) The sub division of anY n-simplex M is a radial subdivision with respect to a
point w = w(M) e M distinct from any vertex of M (cf. Section IV.3.1).

Let Ms' s=0,1, ... , be any infinite nested sequence of simplices generated by the

process, Le., such that MS+1 is obtained by a subdivision of Ms ' For each s let
,J = w(Ms)' wS = w(M s) (note that in general wS # ,J).

Definiüon Vll.4. A nested sequence Ms, s=011 1"" is said to be flDfTTl4l for a giuen

(/,D) if

Zim If(ws) - ß(M) I = 0 . (26)


""'CD
345

A simplicial subdivision process is said to be normal if any infinite nested sequence


of simplices that it generates is normal.

We shalliater examine how to construct anormal simplicial subdivision (NSS)

process.

A rule for simplicial subdivision that generates an NSS process is called an NSS rule.

2.2. Normal Simplicial Algorithm

Incorporating anormal simplicial subdivision (NSS) process and the above lower

bounding into a branch and bound scheme yields the following generalization of the

algorithm in Horst (1976).

Algorithm VII.3 (Normal Simplicial Algorithm)

Select a tolerance c ~ 0 and an NSS rule for simplicial subdivision.

Initialization:

Choose an n-ßimplex MO ) D such that a vertex of MO is also a vertex of D. Let

xO be the best feasible solution available. Set .At 0 = .%0 = {MO}'

Iteration k = 0,1, ... :

1) For each M E .%k form the affine function fPM(x) that agrees with f(x) at the
vertices of M, and solve the linear program

LP(M,D) s.t. xE D n M

to obtain a basic optimal solution w(M) and the optimal value ß(M).

2) Delete all M E .At k such that ß(M) ~ f(xk ) - c. Let .5e k be the remaining

collection of simplices.
346

3) If .9t k = 0, terminate: xk is an e~ptimal solution of (BCP). Otherwise,


4) Select Mk e argmin {ß(M): M e .9t k} and subdivide it according to the chosen
NSS process.

5) Update the incumbent, setting xk +1 equal to the best of all feasible solutions
known so far.

Let f k+1 be the collection of subsimplices of M k provided by the subdivision in

Step 4, and let .At k+1 = (.9t k\ {Mk}) U f k+1 . Set k t- k+1 and return to
Step 1.

Theorem. VII.S. The normal simplicial algorithm can be infinite only il e = 0, and
in this case any accumulation point 01 the generated sequence {i} is a global optimal
solution 01 (BCP).

Proof. Consider an infinite nested sequence Ms' se/). ( {O,l, ... }, generated by
the algorithm. Since the sequence f(x k) is nonincreasing, while the sequence ß (M k )

is nondecreasing (by virtue of the monotonicity of the lower bounding and the selec-
tion criterion), there exist 'Y = !im f(x k) = !im f(xk+1) and ß = lim ß (M k ).
Furthermore, by the normality condition, we mayassume (taking subsequences if

necessary) that !im If( tl) - ß(M s)I = 0, and hence lim f( tl) = ß, where
tl = w (Ms). But, from the selection of Mk in Step 2 we find that ß (M s) < f(xs)-t:,
and from the definition of the incumbent in Step 6 we have f(l+1) 5 f(wk ) for any

k. Therefore, ß 5 'Y-t:, while 'Y 5 lim f(tl) = ß. This leads to a contradiction unless
e = 0, and in the latter case we obtain !im ('Yk-ß(M k)) = O. The conclusion then fol-
lows by Theorem IV.3.

R.emark VII.7. Given a feasible point z we can find a vertex of D which cor-
responds to an objective function value no greater than f(z). Therefore, we may sup-
pose that each xk is a vertex of D. Then for e = 0, we have f(xk) = min f(D) when k
is sufficiently large.
347

2.3. Construction of an NSS Proc:ess

Of course, the implementation of the above algorithm requires us to eonstruet an


NSS proeess.

For any n-simplex M = [v1,v2, ... ,vn] in ~ let

'PM(x) = 1I{M)x + r(M), 'lI'{M) e ~ , r(M) e IR ,

where, as before, 'PM(x) is the affine funetion that agrees with f(x) at the vertices

ofM.

Definition W.5. An infinite nested sequence 01 simplices M is said to be nonde-


generAle il lim 117I'(M)1I < CD, i.e., ilthere exists a subsequence!:J. C {1,2, ... } and a
s-+CD
constant '7 such that 117I'(M)1I ~ '7 Vs e !:J.. A simplicial sub division process is nonde-
generale il any infinite nested sequence 01 simplices that it produces is nondegenerate.

Proposition W.8. Let Ms = [lI s1 ,lIs2,... ,ln], s=l,2, ... , be an infinite nested
sequence 01 simplices such that MS+1 is obtained from Ms by means 01 a radial
sub division with respect to a point ws. Ilthe sequence is nondegenerate, then

lim II(ws) - 'PM (ws) I = 0 .


s-+ID S

Proof. Denote by Hs the hyperplane in IRn+1 that is the graph of the affine

funetion 'PM (x), and denote by L the halfspaee that is the epigraph of this fune-
s s
tion. Let

For j < s, from the eoneavity of f(x) it is easily seen that


348

Since the sequence l is bounded, it follows by Lemma III.2 that d(yS,Ls+ 1) ~ 0,


and hence d(yS,HS+1) ~ 0. Noting that Hs+1 = ((x,t) E IRn"lR: t = 1r{Ms+1)x +
r(M s+1)}' we can write

S I1r{ Ms+ 1 )w s + r (Ms+ 1) - IPM (ws)!


d(y H ) - s
, s+1 - (1 + I11r{Ms+1 ) 11 2 )1/2

Now, from the degeneracy assumption, it follows that there exist an infinite
sequence l:l ( {1,2, ... } and a constant fJ such that I11r{M s+1)1I ~ fJ Vs E l:l. Therefore,
If(ws) -IPM (ws) I ~ (1+fJ2)1/2d(yS,HS+1) ~ 0,
s

as s ~ m, S E l:l, proving the proposition.



Corollary Vll.5. A simplicial sub division process is normal if any infinite nested

sequence of simplices Ms that it generates satisfies either of the following conditions:


1) it is exhaustive;
2) it is nondegenerate and satisfies WS = WS for aU but finitely many s.

Proof. If the sequence {M s} is exhaustive, so that Ms shrinks to a point x* as


s ~ m, then cl' ~ x*, and the relation (26) readily follows from the inequalities

If the sequence Ms is nondegenerate and wS = WS for an but finitely many 5, then the
relation (26) follows from Proposition VI1.8, since ß(M ) = IPM (ws) for sufficiently
S s
large s.

349

2.4. The Basic NSS Process

According to the previous corollary, if bisection is used throughout Algorithm


VI1.3, then convergence is assured. On the other hand, if the w-subdi'llision process
(i.e., sub division with respect to w(Mk ) = w (M k)) is used throughout, then conver-
gence will be achieved when the subdivision process is nondegenerate. However, just
as with conical algorithms, nondegeneracy is a difficult property to realize, whereas

pure bisection is expected to give slow convergence. Therefore, to obtain an algo-


rithm that is convergent and at the same time more efficient than when using pure

bisection, we can combine w-subdivisions with bisections in a suitable manner, using


the same tJ. device that was used earlier for conical algorithms.

Specifically, select an infinite increasing sequence tJ. of natural numbers and adopt
the following rule for the subdivision process (Tuy (1991a)):

Set r(M O) = 0 for the initial simplex MO. At iteration k = 0,1,2, ... , if Mk is the
simplex to be subdivided, then:

a) If r(M k ) t tJ., then subdivide Mk with respect to wk = uf


(perform an w-subdivision of Mk ) and set r(M) = r(M k )+1 for each subsimplex
M of this subdivisionj

b) Otherwise, bisect Mk , Le., subdivide it with respect to wk = midpoint of a

longest edge of Mk , and set r(M) = r(M k )+1 for each subsimplex M of this

subdivision.

Proposition VTI.9. The simplicial s'Ubdivision process just defined is an NSS


process.

Proof. Let {M s} be any infinite nested sequence of simplices generated by the

subdivision process. By reasoning as in the proof of Proposition VII.5, we can show


350

that either this sequence is exhaustive (in which case it it obviously normal), or
sh sh sh
there exists a subsequence {sh' h=1,2, ... } such that w = wand, as h -100, W

tends to a point W* which is a vertex of the simplex M* = n+h~1 M . Since


- sh

ß(M ) = cp(M ) (wsh ) ~ min f(D n Ms ) ~ f(wsh ), since cp(M ) (wsh ) -I f(W*),
sh sh h sh
s s
while f( w h) -I f( W*), it follows that f( w h) - ß(M s ) -I o. Therefore, in any case the
h
sequence {M s} is normal.

We shall refer to the subdivision process constructed above as the Basic NSS

Process. Of course, the choice of the sequence 6. is user specified and must be based

upon computational considerations.

When 6. ={0,1,2, ... } the subdivision process consists exclusively of bisections: the

corresponding variant of Algorithm VII.3 is just the algorithm of Horst (1976).

When 6. = {N,2N,3N, ... } with N very large, the subdivison process consists

essentially of w-subdivisions.

Remark Vll.8. Let D be given by the system (7) (8). Recall that for any simplex
M = [v 1,v2, ... ,v n + 1], the affine function that agrees with f(x) at the vertices of M

is rpM(x) = E -\f(vi ), with

E A/ = x, E \ = 1, Ai ~0 (i=1,2, ... ,n+1).

Hence, the associated linear program LP(M,D) is

s.t. E \Avi ~ b ,E\ = 1, \ ~0 (i=1,2, ... ,n+1) .

If we denote

1 2 ... vn+1] ,
Q= [ vv
1 1 ... 1
351

then (>'1'>'2, ... ,>'n+1)T = Q-1[~] , and so

'PM(x) = (f(v 1),f(v2), ... ,f(vn +1))Q-1 [~]


Note, however, that to solve LP (M,D) there is no need to compute this expression

of 'PM(x).

2.5. Normal Simplicial Algorithm for Problems with Convex Constraints

Like the normal conical algorithm, the normal simplicial algorithm can be ex-

tended to problems where the constraint set D is a convex nonpolyhedral set defined

by an inequality of the form

g(x) ~ 0 .

The extension is based upon the same method that was used for conical algo-

rithms. Namely, outer approximation is combined with a branch and bound

technique according to the scheme proposed by Horst, Benson and Thoai (1988) and

Benson and Horst (1988).


For simplicity, let us assume that Dis bounded and that the convex function g(x)

is finite throughout IRn (hence, it is subdifferentiable at every point x).

Algorithm vrr.4 (Normal Simplicial Algorithm for CP)

Select a tolerance c ~ 0 and an NSS rule for simplicial subdivision.

Initialization:

Choose an n-simplex MO ) D and translate the origin to an interior point of D.

Let x O be the best available feasible solution. Set DO = MO' ..(( 0 = .A"O = {MO}'
7J(MO) = '110 .
352

Iteration k = 0,1, ... :

1) For each M E f k form the affine function IPM( x) and sol ve the linear pro gram

min lPM(x) s.t. x E Dk n M

to obtain a basic optimal solution w(M) and the optimal value ß(M).

2) Delete all M E .At k such that ß(M) ~ f(xk)-c. Let .ge k be the remaining
collection of simplices.

3) If .ge k = 0, terminate: x k is an c-optimal solution of (CP). Otherwise,

4) Select Mk E argmin {ß(M): M E .ge k} and subdivide it according to the

chosen NSS rule.

5) Let wk = w(M k ). If Wk E D, set Dk + 1 = Dk . Otherwise, let pk E 8g (-k


W ), where

xdenotes the intersection of the boundary of D with the ray through x. Set

6) Update the incumbent, setting x k+1 equal to the best among xk , ük and all

W(M) corresponding to M E f k .
Let f k +1 be the partition of Mk , .At k +1 = (.ge k \ {M k}) U f k +1. Set
k +- k+l and return to Step 1.

Theorem W.9. Algorithm VILj. can be infinite only if c = 0 and in this case any

accumulation point ofthe generated sequence {:i} is a global optimal solution.

Proof. Consider any infinite nested sequence of simplices Ms ' 5 E T, generated by

the algorithm. By the normality condition, we may assume (passing to subsequences

if necessary) that lim If(J) - ß(M s) I = 0, where J = w(M s). Let '1 = lim f(xk) =
lim f(xk +\ ß = lim ß(M k ). Then ß = lim f(~). But f(xS+ 1) ~ f(l1) (see Step 6)
and f(xs+ 1) ~ f(xo) ~ f(O)j Rence if WS E D (i.e., ~ E [0,11]) for infinitely many 5,

then, by the concavity of f(x), we have f(x s+1) ~ f( ~), which, if we let 5 -+ 11), gives
353

'Y ~ ß·
On the other hand, if wS ;. D for infinitely many s, then, reasoning as in the proof

of Theorem VII.5, we see that J - "J -I 0, and since f(x S+1) ~ f("J) again
it follows that 'Y ~ ß· Now, from the selection of Mk in Step 4, we have

ß(M s) < f(x s)-€, and hence ß ~ "f-€. This contradicts the above inequality 'Y ~ ß un-
less c: = O. Then 'Y = ß, i.e., lim (ß(M s) - 'Ys) = 0, and the conclusion follows by

Theorem IV.2.

3. AN EXACT SIMPLICIAL ALGORITHM

The normal simplicial algorithm with c: = 0 is convergent but may be infinite. We

now present an exact (finite) algorithm based upon a specific simplicial subdivision

of the feasible set. The original idea for this method is due to Ban (1983, 1986) (cf.

also Tam and Ban (1985)). Below we follow a modified presentation of Tuy and

Horst (1988).

3.1. Simplicial Subdivision of a Polytope

We shall assume that the feasible set is a polytope D C IR~ defined by the

inequalities

n
gi(x):= .E aiJ·xJ. - bi ~ 0 , (i=l, ... ,m) , x ~ 0 . (27)
J=l

Definition vn.6. A simplex M = ['1l,'1I, ... ,'1l] c IR~ with vertices u1,u2,,,.,ur

(r ~ n+ 1) is said to be trivial (more precisely, D-trivial, or trivial with respect to the


system (27)) if for every i=l,,,.,m:

(28)
354

The motivation for defining this notion stems from the following property.

Proposition VIT.I0. If a simplex M is trivial then Mn Dis simply the face of M

spanned by those vertices of M that lie in D:

Mn D = conv {1/ ,,) E D} . (29)

Proof. Let x E M, so that x = E ). .uj with J ( {I, ... ,r}, ).. > 0, E)'. = 1. If xE D
~J J J J
then for every i=I, ... ,m:

In view of (28), this implies that for all j E J

~(uj) ~ 0 (i=I, ... ,m) ,

i.e., uj E D. Therefore, M nD ( conv {uj : uj E D}. Since the reverse inclusion is

obvious, we must have (29).



Now suppose that a simplex M = [u\u2,... ,ur ] is nontrivial. Then there is an
index s (which, for convenience, will be called the test index of M) such that (21)
holds for all i=I, ... ,s-I, while there are p,q such that

(30)

Define

p(M) = s + the number ofindices j such that gs(u j ) = o.

From (30) and the linearity of gs' there is a unique v = ).uP)+(I-).)uq

(0 < ). < 1) satisfying

(31)
355

Let M1 (resp. M2) be the simplex whose vertex set is obtained from that of M by

replacing uP (resp. u q ) by v (i.e., Ml' M2 form a radial subdivision of M with

respect to v).

Proposition W.ll. The set {M1 ' M2} is a partition 0/ M. 1/ each Mv (v=l,2) is
nontrivial, then p(MJ > p(M).

Proof. The first assertion follows from Proposition IV.l. To prove the second

assertion, observe that for each i < sand each j:

Therefore, when Mv is still non trivial, its test index Sv is at least equal to s. If

Sv = s, then it follows from (30) and (31) that the number of vertices u satisfying
gs(u) = 0 has increased by at least one. Consequently, p(M v) > p(M). •

The operation of dividing M into M1, M2 that was just described will be called a
D-bisection (or a bisection with respect to the constraints (27». Noting that

p(M) ~ m+r, we can conclude the following corollary.

Corollary W.6. Any sequence 0/ simplices Mk such that Mk+1 is obtained from
Mk by a D-bisection is finite.

Corollary W.7. Any simplex can be partitioned into trivial subsimplices by


means 0/ a finite number 0/ successive D-bisections.

In particular, if we start with a simplex MO containing D, then after finitely many

D-bisections MO will be decomposed into trivial subsimplices. If we then take the

intersections of D with these subsimplices, we obtain a partition of D into simplices,

each of which is entirely contained in a face of D, by Proposition VILlO. It is easily

seen that every vertex of D must be a vertex of some simplex in the decomposition.
356

Therefore, this simplicial subdivision process will eventually produce all of the
vertices of the polytope D (incidentally, we thus obtain a method for generating all
of the vertices of a polytope). The point, however, is that this subdivision process
can be incorporated into a branch and bound scheme. Since deletions will be possible
by bounding, we can hope to find the optimal vertex well before all the simplices of
the decomposition have been generated.

3.2. A Finite Branch and Bound Procedure

To every simplex M = [ul,u2,... ,ur] C IR~ let us assign two numbers o(M), ß(M)

defined as follows:

o(M) = min f(D nvert M) , (32)


min f(D n v e rt M) i f M is D-trivial
ß(M) ={ (33)
min f( vert M) otherwise

Clearly, by Proposition VII.10,

o(M) ~ min f(M nD) ~ ß(M) (34)

and equality must hold everywhere in (34) if M is a trivial simplex. Furthermore,


ß(M') ~ ß(M) whenever M' is contained in a nontrivial simplex M. Therefore, the
bounding (34) can be used along with the above simplicial subdivision process to

define a branch and bound algorithm according to the Prototype BB-Procedure


(cf. Section IV.1).

Proposition vn.12. The BB procedure just defined, with the selection rule

(35)

is finite.
357

Proof. By Theorem IV.I, it suffices to show that the bounding is finitely


consistent. For any trivial simplex M, since equality holds everywhere in (34), we
obviously have

ß(M) ~ llk:= min {a(M): ME .Jt'k}

Hence, any trivial simplex is removed (cf. Section IV.I). This means that every

unremoved simplex M is nontrivial, and consequently is capable of furt her re-


finement by the above simplicial subdivision. Then by Corollary VII.6 the bounding
is finitely consistent.

In order to speed up the convergence ofthis BB procedure (which we shall refer to

as the Exa.ct Simplicial (ES) Algorithm), we may use the following rules in Step k.3
to reduce or delete certain simplices M E .9l k.
Rule 1. Let M = [u l ,u2,... ,ur]. If for some index i there is just one p such that
~(up) > 0, then reduce M by replacing each u q for which gi(u q) < 0 by the point
vq E [up,uq] that satisfies ~(vq) = o.

Rule 2. If there is a p such that gi(uP) < 0 for some i satisfying (28), then replace M
by the proper face of M spanned by those uj with gi(u j ) = o. If for some i we have
gi(uj ) < 0 Vj, then delete M.

The hypothesis in Rule I means that exactly one vertex up of the simplex M lies

on the positive side of the hyperplane ~(x) = o. It is then easily seen that, when we
replace M by the smaller simplex M' which is the part of M on the nonnegative side

of this hyperplane, we do not lose any point of M n D (Le. M n D ( M'). This is just

the adjustment prescribed by Rule 1.

The hypothesis in Rule 2 means that all vertices of M lie on the nonpositive side
of the hyperplane ~(x) = 0 and at least one vertex uP lies on the negative side. It is
358

then obvious that we do not lose any point of M n D when we replace M by its inter-

section with this hyperplane. This is just the adjustment prescribed by Rule 2. If

every vertex of M lies on the negative side of the hyperplane ~(x) = 0, then
Mn D = 0, and M can therefore be deleted.

3.3. A Modified ES Algorithm.

In the ES algorithm., a simplex Mk must always be subdivided with respect to the


test constraint, Le., the first constraint in (27) that does not satisfy (28). Since this
might not be the constraint that is the most violated by the current approximate
solution or the most frequently violated by vertices of Mk , more subdivisions than

necessary may have to be performed before reaching the trivial simplex of interest.
On the other hand, if Mk is subdivided with respect to some constraint other than

the test one, the finiteness of the procedure may not be guaranteed.

To overcome this drawback, we modify the algorithm so as to favour subdivision


with respect to the constraint that is the most violated by the current approximate
solution. For this we use the strategy of Restart Branch and Bound-Outer Ap-
proximation that was discussed in Section IV.6.

Consider the following outer approximation procedure for solving (BCP):

Start with a simplex MO such that D ( MO ( IR~. Set DO = MO. At iteration


11 = 0,1, ... use the ES algorithm to solve the relaxed problem

Let Zll be the vertex of DII which solves (Q). If Zll E D, stop: Zll is a global
optimal solution of (P).

Otherwise, form D II+1 by adjoining to D II the constraint of D that ia the most


violated by Zll. Go to iteration 11+1.
359

By incorporating the ES Algorithm into the above outer approximation scheme

we obtain the following RBB-OP procedure (cf. Section IV.6).

Algorithm Vll.5 (Modified ES Algorithm)

Iteration 0:

Choose a simplex MO such that D ( MO ( IR! and D n vert MO # 0. Let

Qo = min f(D nvert MO) = f(xO), xO e D ,


ßO = min f(vert Mo) = f(zO), zO e vert MO .

If QO = ßO' then stop: xO is a global optimal solution. Otherwise, let .

Form the polytope D1 by adjoining to MO the io-th constraint in the system (27):

Set /I = 1 and go to iteration 1.


Iteration k = 1,2•... :
k.1. De1ete a11 M e .At k-l such that

ß(M) ~ Qk-l .

Let .9t k be the collection of remaining members of .At k-l.

k.2. Se1ect Mk e arg min {ß(M): M e .9t k}. Bisect Mk with respect to the polytope

D/I (i.e. with respect to the constraints iO, ... ,iv-l)·

k.3. Reduce or delete any newly generated simplex that can be reduced or deleted

according to Rules 1 and 2 (with respect to the polytope D/I). Let .At k be the
360

resulting collection of newly generated simplices.

k
kA. For each ME .At compute

a(M) = min f(D v n vert M)

min f(D n vert M) if M is D -trivial


ß(M) ={ v v
min f( vert M) otherwise

k.5. Let .At k = (.ge k \ {Mk}) U .At k' Compute


ctk = min {a(M): M E .At k} = f(xk ), xk E D ,

ßk = min {ß(M): M E .At k} = f(zk) .

k.6. If ctk = ßk , then stop: xk is a global optimal solution of (BCP).

If ctk > ßk and zk E Dv' choose

Let Dv +1 be the polytope which is determined by adjoining to Dv the iv-th


constraint of the system (27). Set v I - v+l and go to iteration k+1.
Otherwise, go to iteration k+l with v unchanged.

Theorem Vll.I0. The modified ES algorithm terminates after finitely many steps.

Proof. Clearly D c Dv and the number v is bounded from above by the total

number of constraints in (27), i.e., v ~ m. As long as v is unchanged, we are applying

the ES Algorithm to minimize f(x) over Dv' Since this algorithm is finite, the

finiteness of the Modified ES Algorithm follows from Theorem IV.12.



Remarks Vll.9. (i) For each simplex M = [ul,u2,... ,l] we must consider the
associated matrix

(g .. ), i=l, ... ,m; j=l, ... ,r , (36)


IJ
361

where ~j = gi(uj). M is trivial if no row of this matrix has two entries with opposite
sign. The test index s is the index of the first row that has two such entries, i.e.,

gsP > 0, gsq < O. Then M is subdivided with respect to

~ uq - ~ up
v= p q
gip - ~q

and the matrices associated with the two subsimplices are obtained by replacing
column p (resp. q) in the matrix (36) with the column of entries

(ii) When the constraints of D are

Ax=b , x~O ,

we can use standard transformations to rewrite this system in the form


n
x· = "0 - E a·kxk ~ 0 (i=1, ... ,n)
1 k=m+1 1

Viewing the polytope D as a subset of the (n-m)-dimensional space of xm+l' ... ,xn '
we can then apply the above method with ~(x) = xi" In this case the matrix
associated with a simplex M = [u1,u2, ... ,ur] is given simply by the first m rows of
the matrix (ui), i=l, ... ,nj j=1, ... ,r.

3.4. Unbounded Feasible Set

The above method can also be interpreted as a conical algorithm. Indeed,


introducing an additional variable t, we can rewrite the constraints on Das follows:
n
h.(y):= E a. ·x· - b.t ~ 0 (i=l, ... ,m),
1 j=1 IJ J 1
362

n+l
y= (x,t ) EIR+ '

t = 1.

Then D is embedded in a subset of the hyperplane t = 1 in IRn+l, and each

simplex M = [u 1,u2,... ,ur ] in IR! may be regarded as the intersection of the hyper-
plane t = 1 with the cone K = con(yl,y2, ... ,y\ where yi = (ui ,I). A simplicial sub-

division of D may be regarded as being induced by a conical sub division of IR! +1.

With this interpretation the method can be extended to the case of an unbounded

polyhedron D.

Specifically, for any y = (x,t) E IR!+! define ?r{y) = r


ift > 0, ?r{y) = x if t = O.

't'
P roposllon vn.13. For any cone K = con{y IR n +1 tcn·th yi = (i
I. 1,y2, ... ,yr) (+ u, ti)
and 1= {i:ti > O}, the set 7r{K) is a generalized simplex with vertices 7r{yi), i E I, and
extreme directions 7f{yi), i ~ I.

r ..
Proof. Let z = 7f{y) with y = (x,t) E K, i.e. y = E .\(ul,t1),).. ~ O. If t =
i=1 I

E )..t i > 0, then


iEI 1

with E J.L: = I, J.L: ~ 0 (i=l, ... ,r). If t = 0, Le., ).. = 0 (iEI), then z = x = E )..ui =
iEI 1 1 1 i~I 1

E >..?r{yi) with >.. ~ O. Thus, in any case z E ?r{K) implies that z belongs to the
iiI 1 1

generalized simplex generated by the points 7r(yi), i E I, and the directions 1r{yi)),
i ~ I. The converse is obvious.

363

A cone K = con(y1,y2, ... ,l) ( 1R~+1 is said to be trivial if for every i=1, ... ,m we

have

The following propositions are analogous to those established in Section VII.3.1.

Proposition VII.14. 1/ a cone K = con{y1,y2, ... ,{) is trivial, then 1f{K) n Dis the
face 0/ 1f{K) whose vertices are those points 1f{yi), i E 1 = {i: ti > O}, which are
vertices 0/ D and whose extreme directions are those vectors 1f{yi), i t 1, which are
extreme directions 0/ D.

Now if a cone K = con(y1,y2, ... ,l) is nontrivial, we define its test index to be the
smallest s E {1, ... ,m} such that there exist p, q satisfying

and we let p(K) = s + the number of indices j such that hs(~) = O.

Proposition VII.15. Let K1, K2 be the subcones 0/ K arising in the subdivision 0/


K with respect to the point z E [I,y~ such that hlz) = O. Then p{KJ > p{K) for
/I = 1,2, unless K/I is trivial.

This subdivision of K into K1,K2 is called a D-bisection (or a bisection with


respect to the constraints (27)).

As in the bounded case, it follows from this proposition that:

1) Any sequence of cones generated by successive D-bisections is finite.

2) Any cone in 1R~+1 can be partitioned into trivial subcones by means of a finite

number of successive D-bisections. It follows that one can generate a conical


partition of 1R~+1 by means of a finite sequence of D-bisections of 1R~+1 which
induce a partition of D into generalized simplices.
364

In order to define a branch and bound algorithm based on this partition method,
it remains to specify the bounds.

Let K = con (y1,y2,... ,yr) . h Yi


, Wlt = (i
u,ti) n+l ' I
E IR+ = {'1: ti > 0} . Take an
i* . i*
arbitrary i* E 1. If there is a j t I such that f(7 + auJ) < f(~) for some er > 0,
t t1
then set p(K) =- CD. Otherwise, set
i
ß(K) = min {f(~) : i E I}
t

Proposition Vll.16. The above bounding operation is ftnitely consistent.

Proof. Since we start with 1R~+1 or with a cone in 1R~+1 containing D, it is


easily seen that any cone K = con(yl,y2, ... ,l) obtained in the above subdivision
process must have I ;. 0. Furthermore, since the infimum of a concave function f(x)
over a halfline either is - CD or else is attained at the origin of the halfline, the
number ß(K) computed above actually yields a lower bound for min f(?r{K) n D). As
in the bounded case (cf. the proof of Proposition VII.12), every trivial cone is
fathomed, hence any unfathomed cone is capable of furt her sub division. By
Proposition VII.15 (consequence 1) it then follows that any nested sequence of cones
generated by the branch and bound process must be finite.

Corollary Vll.8. The BB procedure using the sub division and bounding methods
defined above along with the selection rule

is finite.

Remark Vll.I0. Incidentally, we have obtained a procedure for generating all the
vertices and extreme directions of a given polyhedron D.
365

4. RECTANGULAR ALGORITHMS

A standard method for lower bounding in branch and bound algorithms for

concave minimization is to use convex underestimators (cf. IVA.3). Namely, if cp is


the convex envelope of the objective function f taken over a set M, then the number

min CP(M) (which can be computed by convex programming methods) yields a lower
bound for min f(M). The fact that the convex envelope of a concave function f taken

over a simplex is readily computable (and is actually linear), gave rise to the conical

and simplicial subdivisions most commonly used in concave minimization branch


and bound algorithms.

Another case where the convex envelope is easy to compute is when the function

f(x) is separable, while the set M is a rectangle (parallelepiped). In fact, by The-


n
orem IV.8, if f(x) = E fJ.(x.), then the convex envelope of f(x) taken over a rect-
j=l J
angle M = {x: r j $ xj $ Sj (j=l, ... ,n)} is equal to the sum of the convex envelopes of

the functions fP) taken over the line segments r j $ t $ Sj (j=l, ... ,n).
Moreover, the convex envelope of a concave function fP) (of one variable) taken

over a line segment [rlj ] is simply the affine function that agrees with fj at the
endpoints of this segment, Le., the function

f.(s.)-f.(r.)
J
t -f()
cp.() - . r·
J J
+ JJ _ JJ( t - r· ) .
s· r. J
(37)
J J

Therefore, when the function f(x) is separable or can be made separable by an

affine transformation of the variables, then rectangular subdivisions might be con-

veniently used in branch and bound procedures.

A branch and bound algorithm of this type for separable concave programming

was developed by Falk and Soland in 1969. A variant of this algorithm was discussed

in Horst (1977).
366

The same idea of rectangular subdivision was used for minimizing a concave

quadratic function subject to linear constraints by Kalantari (1984), Kalantari and

Rosen (1987), Pardalos (1985), Rosen and Pardalos (1986), Phillips and Rosen

(1988). An extension to the case of indefinite quadratic objective function is given in

Pardalos, Glick and Rosen (1987).

In this section, a "normal rectangular method" is introduced that includes pre-

vious rectangular procedures. We present here in detail the Falk-Soland method and

the approach of Kalantari-Rosen. The Rosen-Pardalos procedure that has been suc-

cessfully applied for typical largEHIcale problems will be presented in connection

with the decomposition methods of the next chapter.

4.1. Normal Rectangular Algorithm

We first define the concept of anormal rectangular subdivision, which is similar

to that of anormal simplicial subdivision.


Consider the separable concave programming problem

n
(SCP) minimize f(x):= E f.(x.)
j=l J J

subject to x e D ,

where D is a polytope contained in {x: c ~ x ~ d} c IRn and each fP) is a concave

function continuous on the interval [ci~.

Let M = {x: r ~ x ~ s} be a rectangle with c ~ r, s ~ d (in the seque1 by

"rectangle" we shall always mean a set of this type). Let IPM(x) = ~ IPM.j(Xj ) be the
J
convex enve10pe of the function f(x) taken over M. Denote by w (M) and ß(M) a
basic optimal solution and the optimal value, respectively, of the linear program

LP(M,D) min lPM(x) s.t. xE D nM .


367

Now consider a reet angular subdivision proeess, Le., a subdivision proeess in


whieh a reet angle is subdivided into subrectangles by means of a finite number of
hyperplanes parallel to eertain faeets of the orthant IR!. If the initial reet angle is
MO = {x: rO ~ x ~ sO}, then such a process generates a family of rectangles which
can be represented by a tree with root MO and such that anode is a successor of
another one if and only if it represents an element of the partition of the rectangle
corresponding to the latter node. An infinite path in this tree corresponds to an in-
finite nested sequence of rectangles Mh ' h=O,l, ... For each h let J. = w(M h) ,

"1t(x) = IPMh(x).

Definiüon VII.7. A nested sequence Mh , h=O,l, ... , is said to be nDfT1/.al if

(38)

A rectang1J.lar s1J.bdivision process is said to be nofT1/.al if any infinite nested sequence


of rectangles that it generates is nOfT1/.al.

We shalliater discuss how to construct anormal reet angular subdivision (NRS)


process.
The importanee of this eoneept is that it provides a sufficient eondition for the
convergence of branch and bound algorithms operating with reet angular sub-
divisions. To be specific, suppose we have an NRS proeess. Then, using this sub-
division process in eonjunetion with the lower bounding defined above we ean con-
struct the following branch and bound algorithm.
368

Algorithm Vll.6 (Normal Rectangular Algorithm)

Select a tolerance E ~ 0 and an NRS process.

Initialization:

Choose a rectangle MO containing D. Let xO be the best feasible point available.


Set .At 0 = J O= {MO}·

Iteration k = 0,1,... :

1) For each M E J k compute the affine function VM(x) that agrees with fex) at the
vertices of M and solve the linear program

LP(M,D) min VM(X) s.t. x E D nM

to obtain a basic optimal solution weM) and the optimal value ß(M).

2) Delete an M E .At k such that ß(M) ~ f(xk)-e. Let .9t k be the remaining
collection of rectangles.

3) If .9t k = 0, terminate: xk is an E-optimal solution of (SCP). Otherwise,

4) Select Mk E argmin {ß(M): M E .9t k} and subdivide Mk according to the NRS


process.

5) Update the incumbent, setting xk+ 1 equal to the best of the feasible solutions
known so far.
Let J k+1 be the collection of subrectangles of Mk provided by the subdivision
in Step 5, and let .At k+ 1 = (.9t k\ {Mkl) U J k+ 1. Set k I- k+ 1 and return to
Step 1.

Theorem Vll.ll. The normal rectangular algorithm can be infinite only i/ E = 0,


and in this case it generates asequence {J} every accumulatton point 0/ which is a
global optimal solution 0/ (SOP).
369

The proof of this theorem is exactly the same as that of Theorem VII.8 on con-
vergence of the normal simplicial algorithm.

4.2. Construction of an NRS Process

Let us examine the question of how to construct an NRS process.

The selection of Mk implies that ß(M k ) < f(i)--c; hence, setting j. = W (Mk ),

I{Jk(x) = I{JM (x), IJ}, .(x.) = I{JM (x.), we have


k 'K,J J k,j J

(39)

k k
Let Mk = {x: r ~ x ~ s }.

Choose an index jk E {1,2, ... ,n} and a number wk E (r~ ,s~ ), and, using the
Jk Jk
hyper plane x· = wk , subdivide Mk into two rectangles
Jk

Proposition VII.17. The above sub division rv,le (i.e., the rv,le lor selecting jk and
Je) generates an NRS process il it satisfies either 01 the lollowing conditions:
. k k k k
(i) Jk E argm~ I~ (w j ) - I{Jk/wj)1 and w = wj ;
J

(ii) jk E argmj ukj and wk = (1 + S~';/2, where


k ukj is such that

k k . k k
f.(w"'.} - I{Jk {w"'.} < uk ·, uk' - - I 0 sI s . - r . - - I 0 . (40)
J Y J Y- J J J J

Proof. Let Mk ' h=1,2, ... , be any infinite nested sequence of rectangles gen-
h
erated by the rule. To simplify the notation let us write h for kh. By taking a subse-

quence if necessary, we may assume that jh = jo ' for example jh = 1, for all h. It
suffices to show that
370

(41)

where in case (i) we set O"h1 = f1(~) - ~h1(~)' From the definition of ~ it then
follows that fj(~) - V1tj(~) - i 0 Vj and hence f(Jt) - V1t(Jt) -i 0, whichjs the
desired normality condition.

Clearly, if rule (ii) is applied, then the interval [r~,s~] shrinks to a point as
h - i 11), and this implies (41), in view of (40). Now consider the case when rule (i) is
applied. Again taking a subsequence if necessary, we may assume that ~ - i w1 as
h - i 11). If cl < w1 < dl' then, since the concave function f1(t) is Lipschitzian in any
bounded interval contained in (cl'd1) (cf. Rockafellar (1970), Theorem 10.4), for all
h sufficiently large we have

where 1/ is a positive constant. Hence, using formula (37) for V1tj(t), we obtain

Since ~-1 is one of the endpoints of the interval [r~,s~], it follows that
V1t1(~-1) = f1(~-1). Therefore,

If1(~)-V1t1(~)1 ~ If1(~)-f1(~-I) 1 + 1V1t1(~)-IPJt1(~-1)1 ~

~ 21/1 ~-u{-1 1 --I 0,

proving (41).

w
On the other hand, if 1 coincides with an endpoint of [cl'd1], for example
- h h h-1 h -
w1 = cl' then we must have r 1 = cl' SI = "1 Vh, and hence, SI --I w1 = Cl as
h - i 11). Noting that V1t1 (~) ~ min {fl(c l ), fl(s~)}, we then conclude that

by the continuity off1(t).


371

Thus, (41) holds in any case, completing the proof of the proposition.

Corollary VII,9. If the subdivision in Step 5 is performed according to either of the
mIes (i), (ii) described in Proposition VII.17, then Algorithm VII. 6 converges (in the
sense of Theorem VII. 11).

Let us call a subdivision according to rule (i) in Proposition VII.17 a bisection,

and a subdivision according to rule (ii) w-subdivision.


Algorithm VII.6 with the w-subdivision rule was proposed by Falk and Soland

(1969) as a relaxed form of an algorithm which was finite but somewhat more
complicated (see VI.4.6). In contrast to the "complete" algorithm, the "relaxed"
algorithm may involve iterations in which there is no concave polyhedral function

agreeing with l'M(x) on each reet angle M of the current partition. For this reason,
the argument used to prove finite convergence of the "complete" algorithm cannot

be extended to the "relaxed" algorithm.


Intuitively, it can be expected that, as in the case of conical and simplicial algo-

rithms, variants of rectangular algorithms using w-subdivision should converge more


rapidly than those using bisection, because w-subdivision takes account of the solu-
tion of the current approximating subproblern (this idea will be illustrated by an
example in VII.4.4). But because of the separable structure, a new feature is that,
while w-subdivision has to be properly combined with bisection to produce conver-
gent conical and simplicial algorithms, no such combination is necessary for reet an-

gular algorithms.

4.3. Specialization to Concave Quadratic Programming

As seen earlier, an important property of a quadratic function is that it can be

transformed to separable form by an affine transformation. It seems natural, then,


that branch and bound algorithms with reet angular subdivisions should be used for
372

concave quadratic programming. In fact, sueh algorithms have been developped by

Kalantari and Rosen (1987), Rosen and Pardalos (1986), ete. (cf. the introduetion to
Section IV). We diseuss the Kalantari-Rosen proeedure here as a specialized version

of Algorithm VII.6, whereas the Rosen-Pardalos method will be treated in Chapter

VIII (large-scale problems).

Consider the eoneave quadratie programming problem

(CQP) minimize f(x): = px - !x( Cx) subject to x E D ,

where p E IRn, C is a symmetrie positive definite nxn matrix, and D is the polytope
in IRn defined by the linear inequalities

(42)

x~O (43)

with b E IRn and A an mxn matrix.

If U = [ul,u 2,... ,un] is a matrix formed by n C-conjugate vectors, so that

UTCU = diag(>'1'>'2, ... ,>'n) > 0, then, as shown in Seetion VI.3.3, after the affine
transformation x = Uy this problem ean be rewritten as
n
min F(y):= E F.(y.) s.t. yEn, (44)
j=l J J

where Fj(Yj) = q!j -!>.!/ with q = UTp, and n = {y: Uy E D} .

In this separable form the problem can be solved by the normal rectangular

algorithm. To specialize Algorithm VII.6 to this ease, we need the following prop-

erties of separable concave quadratic funetions.


373

Proposition vn.18. Let M = {y: r ~ x ~ s} be a rectangle. Then:


1) The convex envelope 0/ F(y) over Misthe linear junction
n
'l/JMY) =.E 'l/JM,{Y J, with
,=1 ,1

(45)

2) (46)

Prool. 1) By Theorem IV.8, the convex envelope of F(y) over M is equal to

'M(Y) = E'l/JMj(Yj)' where 'MP) is the convex envelope of FP) over the interval
[rl~' Let GP) = - ~ >.l·
Since FP) = qjt + GP), it follows that

,tl.. _.(t)
TMJ
= q.tJ + -y.(t)
J
,

where -yP) is the convex envelope of GP) over the interval [rj,sj]' But -yP) is an
affine function that agrees with GP) at the endpoints of this interval; hence, after
an easy computation we obtain

(47)

This proves the first assertion in the theorem.

2) From (47) we have:

F'(Y)-'l/JM .(y)
J ,J
= G.(y.)--y.(y.)
J J J J
1 2 1 1
= - 2' >'lj + 2' >'j(rj + Sj)Yj - 2' >'llj
= ~ >'iYj - r j ) (Sj - Yj) . (48)

Since the two numbers Yj - r j and Sj - Yj have a constant sum (equal to Sj - r j ),


their product is maximum when they are equal, and then their product is equal to
374

~Sj - r/, from which (46) follows.



With these results in mind, it is now easy to specialize Algorithm VII.6 to the

problem (44).
According to Corollary vn.8, for the subdivision operation in Step 5 we can use
either of the rules given in Proposition VII. 17. Rule (i) is easy to apply, and

corresponds to the method of Falk and Soland. Rule (ii) requires us to choose
numbers O"kj satisfying (40). Formula (46) suggests taking

1 k k2
uk·=öÄ.(s.-r.) . (49)
,J 0 J J J

For each rectangle M = {y: r ~ y ~ s} to be investigated in Step 1, the objective


function of the linear program LP(M,O) is computed according to formula (45),
while the constraints are

AUy ~ b, Uy ~ 0, r ~ y ~ s .

In certain cases, the constraints of the original polytope have .some special
structure which can be exploited in solving the subproblems. Therefore, it may be
more advantageous to work with the original polytope D (in the x-ilpace), rat her
than with the transformed polytope 0 (in the Y-ilpace). Since x = Uy and
ÄjYj = uj(Cx), a rectangle M = {y: r ~ y ~ s} is described by inequalities

Älj ~ uj(Cx) ~ Älj (j=1,2, ... ,n),

while the convex envelope "'M(x) of f(x) over M is

(50)

We can thus state the following algorithm of Kalantari and Rosen (1987) for solving
(CQP):
375

Algorithm vrr.7.

Initia1ization:

Select E~ O. Solve the 2n linear programs

min {uj(Cx): x e D}, max {uj(Cx): x e D}

obtaining the basic optimal solutions x Oj , xDj and the optimal values t'/j' ;;j of these
programs, respectively.

Clearly, DeMO = {x: >'jt'/j ~ uj(Cx) ~ >'j;;j , j=l,2, ... ,n}. Set .J( 1 = .%1 = {MO}'
x O = argmin{f(xOj ), f(xD j ), j=l,2,,,.,n}.

Iteration k = 1,2,,,. :

1) For each M e .%k compute ~(x) = E ~ .(x.) according to (50) and solve the
j ,J J
linear program

LP(M,D) min ~(x)

s.t. xe D, >'h ~ uj(cx) ~ >'jSj (j=l,2,,,.,n)

to obtain a basic optimal solution w(M) and the optimal value ß(M).

2) Update the incumbent by setting xk equal to the best among all feasible
k-1
solutions so far encountered: x and all w(M), M e .%k'
Delete all M e .J( k such that ß(M) ~ f(xk)-e. Let .ge k be the remaining collection

of rectangles.

3) If .ge k = 0, terminate: x k is a global E-optimal solution of (CQP). Otherwise,

4) Select Mk e argmin {ß(M): M e .ge k}'

5) Let jk e argmax {O'kf j=l,2,,,.,n}, where O'kj is given by (49) (rki are the
k 1 k k
vectors that define Mk ), w = 2' (r jk + Sj/
376

Subdivide Mk into two subrectangles Mkl , Mk2 by means of the hyperplane


. 1 k k
uJ(Cx) = ,,>..(r. + s· ).
~ J Jk Jk

6) Let f k+1 = {Mkl'Mk2 }, .J(k+1 = (.ge k \ {Mk}) U f k+ l ·


Set k .- k+l and return to Step 1.

Remarks VII.11. (i) As already mentioned, in Step 5 instead of bisection one could also use ω-subdivision. If we denote ω^k = ω(M_k), y^k = U^{−1}ω^k, then, since λ_j y^k_j = u^j(Cω^k), we have from (48)

2 [F_j(y^k) − ψ_{M_k,j}(y^k)] = [u^j(Cω^k) − λ_j r^k_j] [λ_j s^k_j − u^j(Cω^k)] / λ_j .            (51)

Therefore, if one uses the ω-subdivision, then one should choose the index j_k that maximizes (51), and divide the parallelepiped M_k by the hyperplane

u^{j_k}(C(x − ω^k)) = 0 .

(ii) Another way of improving the algorithm is to use a concavity cut from time to time (cf. Section V.1) to reduce the feasible polytope. Kalantari (1984) reported computational experiments showing the usefulness of such cuts for the convergence of the algorithm. Of course, deeper cuts specially devised for concave quadratic programming, such as those developed by Konno (cf. Section V.4), should provide even better results.

(iii) Kalantari and Rosen (1987), and also Rosen and Pardalos (1986) and Pardalos and Rosen (1987), actually considered linearly constrained concave quadratic programming problems in which the number of variables that enter the nonlinear part of the objective function is small in comparison with the total number of variables. In the next chapter these applications to the large-scale case will be discussed in a more general framework.

4.4. Example VII.2.

The following simple example is taken from Kalantari and Rosen (1987), where one can also find details of computational results with Algorithm VII.7.

Let f(x) = −½ (2x_1^2 + 8x_2^2), and let the polytope D be given by the constraints:

We have u^i = e^i (the i-th unit vector, i=1,2), and these are normalized eigenvectors of C, λ_1 = 2, λ_2 = 8.
It can easily be checked that

M_1 = {x: 0 ≤ x_1 ≤ 8, 0 ≤ x_2 ≤ 4} ,  x^0 = (8,2) ,  f(x^0) = −80 .

(Below β_k, ω^k stand for β(M_k), ω(M_k), respectively.)

Algorithm VII.7 using bisection (Kalantari–Rosen)

Iteration 1:

β_0 = −104 , ω^0 = (7,3) ,
x^1 = (7,3) , f(x^1) = −85 .
M_1 = M_0 .
f(ω^1) − φ_{M_1}(ω^1) = 19 . Divide M_1 into M_11 and M_12 (j_1 = 1).
Fig. a

Iteration 2:

β_11 = −73.6 , ω^11 = (4, 3.6) ,
β_12 = −100 , ω^12 = (7,3) ,
x^2 = (7,3) , f(x^2) = −85 .
M_11 fathomed; M_2 = M_12 .
f(ω^2) − φ_{M_2}(ω^2) = 15 . Divide M_2 into M_21, M_22 (j_2 = 2).
Fig. b

Iteration 3:

β_21 = −80 , ω^21 = (8,2) ,
β_22 = −90 , ω^22 = (7,3) ,
x^3 = (7,3) , f(x^3) = −85 .
M_21 fathomed; M_3 = M_22 .
f(ω^3) − φ_{M_3}(ω^3) = 7 . Divide M_3 into M_31, M_32 (j_3 = 1).
Fig. c

Iteration 4:

β_31 = −80.8 , ω^31 = (6, 3.2) ,
β_32 = −90 , ω^32 = (7,3) ,
x^4 = (7,3) , f(x^4) = −85 .
M_31 is fathomed; M_4 = M_32 .
f(ω^4) − φ_{M_4}(ω^4) = 5 . Divide M_4 into M_41, M_42 (j_4 = 2).
Fig. d

Iteration 5:

β_41 = −86 , ω^41 = (7,3) ,
β_42 = −86 , ω^42 = (7,3) ,
x^5 = (7,3) , f(x^5) = −85 .
M_5 = M_42 .
f(ω^5) − φ_{M_5}(ω^5) = 1 . Divide M_5 into M_51, M_52 (j_5 = 1).
Fig. e

Iteration 6:

β_51 = −85 , ω^51 = (7,3) ,
β_52 = −85 , ω^52 = (7,3) ,
x^6 = (7,3) , f(x^6) = −85 .
M_51, M_52 are fathomed; M_6 = M_41 .
f(ω^6) − φ_{M_6}(ω^6) = 1 . Divide M_6 into M_61, M_62 (j_6 = 1).
Fig. f

Iteration 7:

β_61 = β_62 = −85 , x^7 = (7,3) , f(x^7) = −85 . Hence ℛ_7 = ∅, and x^7 = (7,3) is a global optimal solution.

Note that the solution (7,3) was already encountered in the first iteration, but the algorithm had to go through six more iterations to prove its optimality.

Algorithm VII.7 using ω-subdivision (Falk–Soland)

Applying rule (ii) in Proposition VII.17 we have in Iteration 1: ω^1 = (7,3), j_1 = 1, and M_1 is divided into M_11, M_12 as in Fig. a.
In Iteration 2, since β_11 = β_12 = −97, we can take M_2 = M_12; then ω^2 = (7,3), j_2 = 2, and M_2 is divided into M_21, M_22 as in Fig. b. In Iteration 3, β_21 = β_22 = −85, while the incumbent is x^3 = (7,3) with f(x^3) = −85, so M_21, M_22 are fathomed, and M_3 = M_11. Then ω^3 = (7,3), j_3 = 2, and M_3 is divided into M_31, M_32, which will be fathomed in Iteration 4, because β_31 = β_32 = −85. Thus, the algorithm terminates after only four iterations.

Fig. a    Fig. b
CHAPTER VIII

DECOMPOSITION OF LARGE SCALE


PROBLEMS

In many problems of large size encountered in applications, the constraints are linear, while the objective function is a sum of two parts: a linear part involving most of the variables of the problem, and a concave part involving only a relatively small number of variables. More precisely, these problems have the form

(P)  minimize f(x) + dy  subject to (x,y) ∈ Ω ⊂ ℝ^n×ℝ^h ,

where f: ℝ^n → ℝ is a concave function, Ω is a polyhedron, d and y are vectors in ℝ^h, and n is generally much smaller than h.

In solving these problems it is essential to consider methods which take full advantage of this specific structure in order to save computational effort.


In this chapter, different approaches to the decomposition of problem (P) are pre-
sented within a general framework. These approaches include branch and bound
techniques, polyhedral underestimation and outer approximation methods. Import-

ant special cases such as separable concave minimization and concave minimization

on networks are discussed.



1. DECOMPOSITION FRAMEWORK

Denote by D the projection of Ω on the x-space, i.e.,

D = {x ∈ ℝ^n: ∃y such that (x,y) ∈ Ω} .            (1)

We shall assume that for every fixed x ∈ D the linear function dy attains a minimum over the set of all y such that (x,y) ∈ Ω. Define the function

g(x) = min {dy: (x,y) ∈ Ω} .            (2)

Proposition VIII.1. D is a polyhedron, and g(x) is a convex polyhedral function with dom g = D.

Proof. D is a polyhedron because it is the image of a polyhedron under a linear transformation from ℝ^{n+h} to ℝ^n (see e.g. Rockafellar (1970), Theorem 19.3). Since any (x,y) ∈ Ω can be written as

(x,y) = Σ_{i=1}^s λ_i (u^i,v^i) ,  Σ_{i=1}^r λ_i = 1 ,  λ_i ≥ 0 (∀i) ,            (3)

where r ≤ s, and since (u^i,v^i) for i ≤ r are the extreme points of Ω, while (u^i,v^i) for i > r are the extreme directions of Ω, we have g(x) = inf Σ_{i=1}^s λ_i dv^i, where the infimum is taken over all choices of λ_i satisfying (3). That is, g(x) is finitely generated, and hence convex polyhedral (see Rockafellar (1970), Corollary 19.1.2). Furthermore, from (1) and (2) it is obvious that g(x) < +∞ if and only if x ∈ D.
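To make the definition (2) concrete: for a fixed x, g(x) is the value of a single linear program in y. A minimal sketch, assuming Ω = {(x,y): Ax + By ≤ c, x ≥ 0, y ≥ 0} and using scipy.optimize.linprog (the data below are made up):

```python
import numpy as np
from scipy.optimize import linprog

def g(x, A, B, c, d):
    """g(x) = min {dy : By <= c - Ax, y >= 0}; +inf when the program is not
       solved (in particular, when it is infeasible, i.e. x lies outside D)."""
    res = linprog(c=d, A_ub=B, b_ub=c - A @ x,
                  bounds=[(0, None)] * B.shape[1], method="highs")
    return res.fun if res.status == 0 else np.inf

# Hypothetical data: two constraints, x in R^1, y in R^2, d >= 0
A = np.array([[1.0], [0.0]]); B = np.array([[1.0, 1.0], [1.0, -1.0]])
c = np.array([4.0, 2.0]);     d = np.array([1.0, 2.0])
print(g(np.array([1.0]), A, B, c, d))   # 0.0 here: y = 0 is already optimal
```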

Proposition VIII.2. Problem (P) is equivalent to the problem

(H)  minimize f(x) + g(x)  subject to x ∈ D ⊂ ℝ^n .

Specifically, if (x̄, ȳ) solves (P), then x̄ solves (H); and if x̄ solves (H), then (x̄, ȳ) solves (P), where ȳ is a point satisfying g(x̄) = dȳ, (x̄, ȳ) ∈ Ω.

Proof. If (x̄,ȳ) solves (P), then x̄ ∈ D and f(x̄) + dȳ ≤ f(x̄) + dy for all y such that (x̄,y) ∈ Ω; hence dȳ = g(x̄), and we have f(x̄) + g(x̄) = f(x̄) + dȳ ≤ f(x) + dy for all (x,y) ∈ Ω. This implies that f(x̄) + g(x̄) ≤ f(x) + g(x) for all x ∈ D, i.e., x̄ solves (H).
Conversely, suppose that x̄ solves (H) and let ȳ ∈ argmin {dy: (x̄,y) ∈ Ω}. Then f(x̄) + dȳ = f(x̄) + g(x̄) ≤ f(x) + g(x) for all x ∈ D, and hence f(x̄) + dȳ ≤ f(x) + dy for all (x,y) ∈ Ω, i.e., (x̄,ȳ) solves (P).

Thus, solving (P) is reduced to solving (H), which involves only the x variables and might have a much smaller dimension than (P). Difficulties could arise for the following two reasons:

1) the function g(x) is convex (therefore, f(x) + g(x) is neither convex nor concave, but is a difference of two convex functions);

2) the function g(x) and the set D are not defined explicitly.

To cope with these issues, a first method is to convert the problem (H) to the form

(H̄)  minimize f(x) + t  subject to x ∈ D , g(x) ≤ t ,            (4)

and to deal with the implicitly defined constraints of this concave program in the same way as it is done in Benders' decomposition procedure.

We shall see, however, that in many cases the two mentioned points can merely be bypassed by employing appropriate methods such as branch and bound or concave underestimation.
We shall discuss these methods in the next sections, where, for the sake of simplicity, we shall assume that the set D is bounded.



2. BRANCH AND BOUND APPROACH

One of the most appropriate techniques for solving (H) without explicit knowledge of the function g(x) and the set D is the branch and bound approach (cf. Horst and Thoai (1992); Tuy (1992); Horst, Pardalos and Thoai (1995)).
In fact, in this approach all that we need to solve the problem is:

1) the construction of a simple polyhedron M_0 in the x-space which contains D, to be used for initialization;

2) a practical procedure for computing a lower bound for the minimum of f(x) + g(x) over any partition set M, such that the lower bounding is consistent (see Chapter IV).

Quite often, the construction of the starting polyhedron M_0 is straightforward. In any event, by solving the linear programs

min {x_j: (x,y) ∈ Ω} ,  max {x_j: (x,y) ∈ Ω}  (j=1,...,n)

one can always determine the smallest rectangle containing D.


As for the lower bounding, it should not present any particular difficulty, in view of the following result.

Proposition VIII.3. For any polyhedron M ⊂ M_0, if ψ_M(x) is a linear underestimator of f(x) on M, then the optimal value of the linear program

(LP(M,Ω))  min [ψ_M(x) + dy]  s.t. x ∈ M , (x,y) ∈ Ω            (5)

yields a lower bound for min {f(x) + g(x): x ∈ M ∩ D}.

Proof. Indeed, by Proposition VIII.2, problem (5) is equivalent to

min {ψ_M(x) + g(x): x ∈ M ∩ D} .


Thus, to compute a lower bound for f(x) + g(x) over M ∩ D, it suffices to solve a linear program of the form (5). In this manner, any branch and bound algorithm originally devised for concave minimization over polytopes that uses linear underestimators for lower bounding can be applied to solve the reduced problem (H). In doing this, there is no need to know the function g(x) and the set D explicitly. The convergence of such a branch and bound algorithm when extended to (H) can generally be established in much the same way as that of its original version.

Let us examine how the above decomposition scheme works with the branch and bound algorithms previously developed for (BCP).

2.1. Normal Simplicial Algorithm

When minimizing a concave function f(x) over a polytope D ⊂ ℝ^n, the normal simplicial algorithm (Algorithm VII.3) starts with an initial n-simplex M_0 ⊃ D, and proceeds through successive simplicial subdivision of M_0. For each n-subsimplex M of M_0, a lower bound for f(x) over M ∩ D is taken to be the optimal value of the linear program

min {φ_M(x): x ∈ M ∩ D} ,            (6)

where φ_M(x) is the linear underestimator of f(x) given simply by the affine function that agrees with f(x) at the n+1 vertices of M. In order to extend this algorithm to problem (H), it suffices to consider the linear program (5) in place of (6). We can thus formulate the following algorithm for solving (P).

Assume that the polyhedron Ω is defined by the inequalities

Ax + By ≤ c ,  x ≥ 0 ,  y ≥ 0 .

Then for any simplex M = [v^1,v^2,...,v^{n+1}] in the x-space, since every x ∈ M is a convex combination of the vertices v^1,v^2,...,v^{n+1}, the lower bounding subproblem (5) can also be written as:

min  Σ_{i=1}^{n+1} λ_i f(v^i) + dy            (7)
s.t. A (Σ_{i=1}^{n+1} λ_i v^i) + By ≤ c ,
     Σ_{i=1}^{n+1} λ_i = 1 ,  λ_i ≥ 0 (∀i) ,  y ≥ 0

(cf. Section VII.2.4).
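A sketch of how (7) could be assembled and solved numerically (scipy.optimize.linprog; the routine lp7 and its data layout are illustrative assumptions, and the LP is assumed feasible):

```python
import numpy as np
from scipy.optimize import linprog

def lp7(f, V, A, B, c, d):
    """LP (7) over the simplex M = [v^1,...,v^{n+1}] with vertex list V:
       min sum_i lam_i f(v^i) + d y
       s.t. A(sum_i lam_i v^i) + B y <= c, sum_i lam_i = 1, lam >= 0, y >= 0."""
    n1, h = len(V), B.shape[1]
    obj = np.concatenate([[f(v) for v in V], d])
    A_ub = np.hstack([A @ np.array(V).T, B])               # columns: Av^i, then B
    A_eq = np.hstack([np.ones((1, n1)), np.zeros((1, h))])
    res = linprog(obj, A_ub=A_ub, b_ub=c, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (n1 + h), method="highs")
    lam, y = res.x[:n1], res.x[n1:]
    return res.fun, np.array(V).T @ lam, y                 # beta(M), omega(M), y(M)
```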

Algorithm VIII.1 (Normal Simplicial Algorithm for (P)).

Select a tolerance ε ≥ 0, and a normal simplicial subdivision process (cf. Sections VII.1.3 and VII.1.6).

Initialization:

Construct an n-simplex M_0 such that D ⊂ M_0 ⊂ ℝ^n_+ . Let (x^0,y^0) be the best feasible solution available. Set ℳ_1 = 𝒩_1 = {M_0}.

Iteration k = 1,2,... :

1) For each M ∈ 𝒩_k solve the linear program (7) to obtain a basic optimal solution (ω(M), y(M)) and the optimal value β(M) of LP(M,Ω).

2) Update the incumbent by setting (x^k,y^k) equal to the best among all feasible solutions known so far: (x^{k−1},y^{k−1}) and (ω(M), y(M)) for all M ∈ 𝒩_k.
Delete all M ∈ ℳ_k for which β(M) ≥ f(x^k) + dy^k − ε. Let ℛ_k be the remaining collection of simplices.

3) If ℛ_k = ∅, terminate: (x^k,y^k) is an ε-optimal solution of (P). Otherwise, continue.

4) Select M_k ∈ argmin {β(M): M ∈ ℛ_k} and subdivide it according to the chosen subdivision process.

5) Let 𝒩_{k+1} be the partition of M_k and ℳ_{k+1} = (ℛ_k \ {M_k}) ∪ 𝒩_{k+1}. Set k ← k+1 and return to Step 1.

Theorem VIII.1. Algorithm VIII.1 can be infinite only if ε = 0, and in this case, any accumulation point of the generated sequence {(x^k,y^k)} is a global optimal solution of (P).

Proof. Consider the algorithm as a branch and bound procedure applied to problem (H), where for each simplex M the value α(M) = f(ω(M)) + dy(M) is an upper bound, and β(M) a lower bound, for min {f(x) + g(x): x ∈ M ∩ D} (cf. Section IV.1). It suffices to show that the bounding operation is consistent, i.e., that for any infinite nested sequence {M_q} generated by the algorithm we have lim (α_q − β_q) = 0 (q → ∞), where α_q = α(M_q), β_q = β(M_q) (then the proof can be completed in the same way as for Theorem VII.8).
But if we denote ψ_q(·) = ψ_{M_q}(·), ω^q = ω(M_q), then clearly α_q − β_q = f(ω^q) − ψ_q(ω^q). Now, as shown in the proof of Proposition VII.9, by taking a subsequence if necessary, ω^q → ω*, where ω* is a vertex of M* = ∩_{q=1}^∞ M_q. We can assume that ω* = lim_{q→∞} v^{q,1}, where v^{q,1} is a vertex of M_q, and that ω^q = Σ_{i=1}^{n+1} λ_{q,i} v^{q,i}, where λ_{q,i} → λ*_i with λ*_1 = 1, λ*_i = 0 for i ≠ 1. Hence, by continuity of f(x), ψ_q(ω^q) = Σ_{i=1}^{n+1} λ_{q,i} f(v^{q,i}) → f(ω*). Since f(ω^q) → f(ω*), it follows that f(ω^q) − ψ_q(ω^q) → 0, as was to be proved.

We recall from the discussion in Section VII.1.6 that in practice, instead of a normal rule, one can often use the following simple rule for simplicial subdivision: Choose γ > 0 sufficiently small. At iteration k, let λ^k = (λ^k_1,...,λ^k_{n+1}) be an optimal solution of the linear program (7) for M = M_k. Use ω-subdivision if min {λ^k_i: λ^k_i > 0} ≥ γ, and bisection otherwise.
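This rule is trivially implementable once λ^k is available from (7); a minimal sketch (plain Python; γ is the chosen threshold):

```python
def choose_subdivision(lam_k, gamma):
    """omega-subdivision iff min {lam_i : lam_i > 0} >= gamma, else bisection."""
    return "omega" if min(l for l in lam_k if l > 0) >= gamma else "bisection"

print(choose_subdivision([0.50, 0.50, 0.00], 0.1))   # 'omega'
print(choose_subdivision([0.98, 0.02, 0.00], 0.1))   # 'bisection'
```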

2.2. Normal Rectangular Algorithm

Suppose that the function f(x) is separable:

f(x) = Σ_{j=1}^n f_j(x_j) .

As previously shown (Theorem IV.8), the convex envelope of such a function over a rectangle M = {x ∈ ℝ^n: r ≤ x ≤ s} is equal to

ψ_M(x) = Σ_j ψ_{M,j}(x_j) ,            (8)

where each ψ_{M,j}(·) is the affine function of one variable which agrees with f_j(·) at the endpoints of the interval [r_j, s_j]. Hence, a lower bound for f(x) + g(x) over M ∩ D can be obtained by solving the linear program (5), with ψ_M(x) defined by (8). This leads to the following extension of Algorithm VII.6 for solving problem (P).

Algorithm VIII.2 (Normal Rectangular Algorithm for (P))

Select a tolerance ε ≥ 0.

Initialization:

Construct a rectangle M_0 = {x ∈ ℝ^n: r^0 ≤ x ≤ s^0} such that D ⊂ M_0 ⊂ ℝ^n_+ . Let (x^0,y^0) be the best feasible solution available. Set ℳ_1 = 𝒩_1 = {M_0}.

Iteration k = 1,2,... :

1) For each member M = {x ∈ ℝ^n: r ≤ x ≤ s} of 𝒩_k compute the function ψ_M(x) according to (8) and solve the linear program

LP(M,Ω)  min ψ_M(x) + dy  s.t. r ≤ x ≤ s , Ax + By ≤ c , y ≥ 0 .

Let (ω(M), y(M)) and β(M) be a basic optimal solution and the optimal value of LP(M,Ω).

2) Update the incumbent by setting (x^k,y^k) equal to the best feasible solution among (x^{k−1},y^{k−1}) and all (ω(M), y(M)), M ∈ 𝒩_k.
Delete all M ∈ ℳ_k for which β(M) ≥ f(x^k) + dy^k − ε. Let ℛ_k be the remaining collection of rectangles.

3) If ℛ_k = ∅, terminate: (x^k,y^k) is a global ε-optimal solution of (P). Otherwise, continue.

4) Select M_k ∈ argmin {β(M): M ∈ ℛ_k}. Let ω^k = ω(M_k), and let ψ_{k,j}(·) = ψ_{M_k,j}(·).

5) Select j_k ∈ argmax {|f_j(ω^k_j) − ψ_{k,j}(ω^k_j)|: j=1,...,n}. Divide M_k into two subrectangles by the hyperplane x_{j_k} = ω^k_{j_k} .

6) Let 𝒩_{k+1} be the partition of M_k, ℳ_{k+1} = (ℛ_k \ {M_k}) ∪ 𝒩_{k+1}. Set k ← k+1 and return to 1).

Theorem VIII.2. Algorithm VIII.2 can be infinite only if ε = 0, and in this case, any accumulation point of {(x^k,y^k)} is a global optimal solution of (P).

Proof. As with Algorithm VIII.1, all that needs to be proved is that f(ω^q) − ψ_q(ω^q) → 0 as q → ∞ for any infinite nested sequence {M_q} generated by the algorithm (the notation is the same as in the proof of Theorem VIII.1). But this follows from Proposition VII.17(i).
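The one-dimensional pieces ψ_{M,j} in (8) are simple endpoint interpolants; a minimal sketch (plain Python, with a made-up f_j):

```python
def endpoint_affine(fj, r, s):
    """Affine function agreeing with fj at the endpoints of [r, s]; for concave fj
       it underestimates fj on [r, s] (it is the convex envelope of fj there)."""
    slope = (fj(s) - fj(r)) / (s - r) if s > r else 0.0
    return lambda t: fj(r) + slope * (t - r)

psi = endpoint_affine(lambda t: -t * t, 0.0, 4.0)   # chord of -t^2 over [0, 4]
print(psi(2.0))   # -8.0, while f_j(2) = -4.0 >= psi(2.0)
```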



Remark VIII.1. The above algorithm can, of course, be applied to concave quadratic programming problems, since a concave quadratic function f(x) can always be made separable by means of an affine transformation of the variables. An alternative algorithm can also be obtained by a similar extension of Algorithm VII.7, which is left to the reader.

2.3. Normal Conical Algorithm

Strictly speaking, the normal conical algorithm (Section VII.1) is not a branch and bound procedure using linear underestimators for lower bounding. However, it can be extended in a similar way to solve problems of the form (P) with d = 0 (i.e., where the objective function does not depend upon y):

minimize f(x)  subject to Ax + By ≤ c , x ≥ 0 , y ≥ 0 .            (9)

Obviously, by introducing an additional variable t, problem (P) can also be written as

minimize f(x) + t  subject to dy ≤ t , (x,y) ∈ Ω ,

i.e., a problem of the form (9), with (x,t) ∈ ℝ^{n+1} in the role of x.
As before, let D = {x ≥ 0: ∃y ≥ 0 such that Ax + By ≤ c}. Then (9) becomes a (BCP) problem to which Algorithm VII.1* can be applied. The fact that D is not known explicitly does not matter, since the linear program

LP(Q;D)  max eQ^{−1}x  s.t. x ∈ D , Q^{−1}x ≥ 0

that must be solved in Step 2 of Algorithm VII.1* is simply

max eQ^{−1}x  s.t. Ax + By ≤ c , Q^{−1}x ≥ 0 , y ≥ 0

(we assume con Q ⊂ ℝ^n_+, so the constraint x ≥ 0 is implied by Q^{−1}x ≥ 0).


Of course, the same remark applies to Algorithm VII.1. Thus, after converting problem (P) into the form (9), the application of the normal conical algorithm is straightforward.

Note that in all the above methods, the function g(x) and the set D are needed only conceptually and are not used in the actual computation.

3. POLYHEDRAL UNDERESTIMATION METHOD

Another approach which allows us to solve (H) without explicit knowledge of the function g(x) and the set D uses polyhedral underestimation.

3.1. Nonseparable Problems

The polyhedral underestimation method for solving (BCP) is based on the following property, which was derived in Section VI.4 (cf. Proposition VI.10):

To every finite set X_k in ℝ^n such that conv X_k = M_1 ⊃ D one can associate a polyhedron

S_k = {(q,q_0) ∈ ℝ^n×ℝ: q_0 − qx ≤ f(x) ∀x ∈ X_k}

such that the function

φ_k(x) = min {q_0 − qx: (q,q_0) ∈ vert S_k}            (10)

is the lowest concave function which agrees with f(x) at all points of X_k (so, in particular, φ_k(x) underestimates f(x) over D).


Now suppose we start with an n-simplex M_1 ⊃ D and the grid X_1 = V(M_1) (vertex set of M_1). Then S_1 has a unique vertex which can easily be determined. At iteration k we have a finite grid X_k of M_1 such that conv X_k = M_1, together with the associated polyhedron S_k and its vertex set ℳ_k. By solving the relaxed problem

(P_k)  min φ_k(x) + dy  s.t. (x,y) ∈ Ω

with φ_k(x) defined by (10), we obtain a lower estimate for min {f(x) + dy: (x,y) ∈ Ω}. Therefore, if (x^k,y^k) is a basic optimal solution of (P_k) and (x^k,y^k) satisfies φ_k(x^k) = f(x^k), then φ_k(x^k) + dy^k = f(x^k) + dy^k; hence (x^k,y^k) solves (P).
Otherwise, φ_k(x^k) < f(x^k), which can happen only if x^k ∈ D \ X_k. We then consider the new grid X_{k+1} = X_k ∪ {x^k}. Since

S_{k+1} = S_k ∩ {(q,q_0): q_0 − qx^k ≤ f(x^k)} ,            (11)

the vertex set of S_{k+1} can be derived from that of S_k by any of the procedures discussed in Section II.4.2. Once this vertex set is known, the procedure can be repeated, with X_{k+1} in place of X_k.
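Given the vertex set of S_k, the underestimator (10) is just a pointwise minimum of affine functions; a minimal sketch (plain Python/NumPy, hypothetical data):

```python
import numpy as np

def phi_k(x, vertices):
    """phi_k(x) = min {q0 - q.x : (q, q0) in vert S_k};
       `vertices` is a list of pairs (q, q0), q an n-vector."""
    return min(q0 - np.dot(q, x) for q, q0 in vertices)

# With a single vertex (q, q0) = ((1, -1), 3) -- e.g. the unique vertex of S_1:
print(phi_k(np.array([1.0, 2.0]), [(np.array([1.0, -1.0]), 3.0)]))   # 4.0
```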

In this manner, starting from a simple grid X_1 (with conv X_1 = M_1), we generate a sequence of expanding grids

X_1 ⊂ X_2 ⊂ ... ⊂ X_k ⊂ ... ,

a sequence of polyhedral concave functions

φ_1(x) ≤ φ_2(x) ≤ ... ≤ φ_k(x) ≤ ... ≤ f(x)  ∀x ∈ M_1 ,

and a sequence of points (x^k,y^k) ∈ Ω. Since (x^k,y^k) is a vertex of Ω (optimal solution of (P_k)) and since, by (11), the sequence (x^k,y^k) never repeats, it follows that the number of iterations is bounded above by the number of vertices of Ω. Therefore, the procedure will terminate after finitely many iterations at a global optimal solution (x^k,y^k).
We are thus led to the following extension of Algorithm VI.5 (we allow restarting in Step 4 in order to avoid having |ℳ_k| too large).

Algorithm VIII.3.

0) Set X_1 = {v^1,...,v^{n+1}} (where [v^1,...,v^{n+1}] is a simplex in ℝ^n which contains D). Let (x̄^0,ȳ^0) be the best feasible solution known so far; set S_1 = {(q,q_0): q_0 − qv^i ≤ f(v^i), i=1,...,n+1}. Let ℳ_1 be the vertex set of S_1, i.e., the singleton {[f(v^1),...,f(v^{n+1})] Q_1^{−1}}, where Q_1 is the matrix with the n+1 columns (−v^j, 1)^T (j=1,...,n+1). Set 𝒩_1 = ℳ_1, k=1.

1) For every (q,q_0) ∈ 𝒩_k solve the linear program

minimize (q_0 − qx) + dy  subject to (x,y) ∈ Ω ,

obtaining a basic optimal solution ω(q,q_0) and the optimal value β(q,q_0).

2) Compute

(q^k, q_0^k) ∈ argmin {β(q,q_0): (q,q_0) ∈ ℳ_k}

and let (x^k,y^k) = ω(q^k,q_0^k), β_k = β(q^k,q_0^k).

3) Update the current best solution by taking the point (x̄^k,ȳ^k) with the smallest value of f(x) + dy among (x̄^{k−1},ȳ^{k−1}) and all points ω(q,q_0), (q,q_0) ∈ 𝒩_k.

4) If f(x̄^k) + dȳ^k = β_k, then terminate: (x̄^k,ȳ^k) is a global optimal solution.

5) Otherwise, set S_{k+1} = S_k ∩ {(q,q_0): q_0 − qx^k ≤ f(x^k)}, and compute the vertex set ℳ_{k+1} of S_{k+1}. Set 𝒩_{k+1} = ℳ_{k+1} \ ℳ_k and go to iteration k+1.

3.2. Separable Problems

When the function f(x) is separable,

f(x) = Σ_{j=1}^n f_j(x_j) ,            (12)

it is more convenient to start with a rectangle M_1 = {x: r^1 ≤ x ≤ s^1} and to construct the grids X_k so as to determine rectangular subdivisions of M_1, i.e., X_k = Π_{j=1}^n X_{kj}, where each X_{kj} consists of k+1 points on the x_j-axis. The polyhedral concave function φ_k(x) corresponding to X_k is then easy to determine. In fact:

φ_k(x) = Σ_{j=1}^n φ_{kj}(x_j) ,            (13)

where each φ_{kj}(·) is a piecewise affine function defined by a formula of the type ((37), Section VII.4) in each of the k subintervals that the grid X_{kj} determines in the segment [r_j^1, s_j^1].
Theoretically, the method looks very simple. From a computational point of view, however, for large values of k the functions φ_{kj}(t) are not easy to manipulate. To cope with this difficulty, some authors propose using mixed integer programming, as in the following method of Rosen and Pardalos (1986) for concave quadratic minimization.
Consider the problem (P) where f(x) is concave quadratic. Without loss of generality we may assume that f(x) has the form (12) with

f_j(x_j) = q_j x_j − ½ λ_j x_j^2 ,  λ_j > 0 ,            (14)

(cf. Section VI.3.3), and that Ω is the polytope defined by the linear inequalities

Ax + By ≤ c ,  x ≥ 0 ,  y ≥ 0 .            (15)

As seen in Section VI.3.3, we can easily construct a rectangular domain M = {x ∈ ℝ^n: 0 ≤ x_j ≤ β_j (j=1,...,n)} which contains D (recall that D is the projection of Ω on the x-space).
Suppose we partition each interval I_j = [0, β_j] (j=1,...,n) into k_j equal subintervals of length δ_j = β_j/k_j. Let φ_j(·) be the lowest concave function that agrees with f_j(·) at all subdivision points of the interval I_j, and let

φ(x) = Σ_{j=1}^n φ_j(x_j) .

If (x̃,ỹ) is an optimal solution of the approximating problem

min {φ(x) + dy: (x,y) ∈ Ω} ,            (16)

then (x̃,ỹ) is an approximate optimal solution of (P) with an accuracy that obviously depends on the numbers k_j (j=1,...,n) of subintervals into which each interval I_j is divided. It turns out that we can select k_j (j=1,...,n) so as to ensure any prescribed level of accuracy (cf. Pardalos and Rosen (1987)).

Let Ψ(x,y) = f(x) + dy and denote by Ψ* the global minimum of Ψ(x,y) over Ω. The error at (x̃,ỹ) is given by Ψ(x̃,ỹ) − Ψ*. Since

φ(x̃) + dỹ ≤ Ψ* ≤ f(x̃) + dỹ = Ψ(x̃,ỹ) ,

we have Ψ(x̃,ỹ) − Ψ* ≤ f(x̃) − φ(x̃). So the error is bounded by

E(x̃) := f(x̃) − φ(x̃) .

We now give a bound for E(x̃) relative to the range of f(x) over M. Let f_max = max_{x∈M} f(x), f_min = min_{x∈M} f(x). Then the range of f(x) over M is

Δf = f_max − f_min .

We first need a lower bound for Δf.
Assume (without loss of generality) that

λ_1 β_1^2 = max {λ_j β_j^2: j=1,...,n} .            (17)

Define the ratios

ρ_j = λ_j β_j^2 / (λ_1 β_1^2)  (j=1,...,n) .            (18)

The function f_j(t) attains its unconstrained maximum at the point x̄_j = q_j/λ_j. Define

η_j = min {1, |2x̄_j/β_j − 1|}  (j=1,...,n) .            (19)

Note that 0 ≤ η_j ≤ 1.



Lemma VIII.1. Δf ≥ ⅛ λ_1 β_1^2 Σ_{j=1}^n ρ_j (1 + η_j)^2 .

Proof. Denote

Δf_j = max f_j(t) − min f_j(t) ,

where the maximum and minimum are taken over the interval [0, β_j].
There are four cases to consider:

(i) 0 < x̄_j ≤ ½β_j :
Then f_j(β_j) = λ_j β_j (x̄_j − ½β_j) ≤ 0, and since the minimum of the concave function f_j(·) over the interval 0 ≤ t ≤ β_j is attained at an endpoint, we have min f_j(·) = f_j(β_j). On the other hand, max f_j(·) = ½λ_j x̄_j^2. From (19) we have x̄_j = ½β_j(1−η_j). Hence,

Δf_j = ½λ_j x̄_j^2 − λ_j β_j x̄_j + ½λ_j β_j^2 = ½λ_j (β_j − x̄_j)^2 = ⅛λ_j β_j^2 (1+η_j)^2 .

(ii) ½β_j < x̄_j ≤ β_j :
Then f_j(β_j) ≥ 0, so that min f_j(t) = f_j(0) = 0. From (19) we have x̄_j = ½β_j(1+η_j), hence

Δf_j = max f_j(t) = ½λ_j x̄_j^2 = ⅛λ_j β_j^2 (1+η_j)^2 .

(iii) x̄_j ≤ 0 :
This implies q_j ≤ 0, f_j(t) ≤ −½λ_j t^2 (0 ≤ t ≤ β_j), so that max f_j(t) = 0, min f_j(t) ≤ −½λ_j β_j^2. Therefore, since η_j = 1,

Δf_j ≥ ½λ_j β_j^2 = ⅛λ_j β_j^2 (1+η_j)^2 .

(iv) x̄_j > β_j :
Then min f_j(t) = f_j(0) = 0, and since η_j = 1,

Δf_j = max f_j(t) = f_j(β_j) = λ_j β_j (x̄_j − β_j/2) ≥ ½λ_j β_j^2 = ⅛λ_j β_j^2 (1+η_j)^2 .

Finally, we have

Δf = Σ_{j=1}^n Δf_j ≥ ⅛ Σ_{j=1}^n λ_j β_j^2 (1+η_j)^2 = ⅛ λ_1 β_1^2 Σ_{j=1}^n ρ_j (1+η_j)^2 . ∎
Theorem VIII.3. We have

E(x̃)/Δf ≤ [Σ_{j=1}^n ρ_j/k_j^2] / [Σ_{j=1}^n ρ_j (1+η_j)^2] .            (20)

Proof. As we saw before,

Ψ(x̃,ỹ) − Ψ* ≤ f(x̃) − φ(x̃) = E(x̃) .            (21)

Since f_j is concave and φ_j(·) interpolates it at the points t = iδ_j (i=0,1,...,k_j), it can easily be shown that

0 ≤ f_j(x̃_j) − φ_j(x̃_j) ≤ ⅛λ_j δ_j^2 = ⅛λ_j (β_j/k_j)^2  (j=1,...,n)

(cf. Proposition VII.18). Hence

f(x̃) − φ(x̃) = Σ_{j=1}^n [f_j(x̃_j) − φ_j(x̃_j)] ≤ ⅛ Σ_{j=1}^n λ_j β_j^2/k_j^2 = ⅛ λ_1 β_1^2 Σ_{j=1}^n ρ_j/k_j^2 ,

and the inequality (20) follows from Lemma VIII.1 and (21). ∎

The following corollaries are immediate consequences of Theorem VIII.3.

Corollary VIII.1. Let φ(x) be the convex envelope of f(x) taken over M, and let (x̃,ỹ) be an optimal solution of the problem

min {φ(x) + dy: (x,y) ∈ Ω} .

Then

[Ψ(x̃,ỹ) − Ψ*] / Δf ≤ Σ_{j=1}^n ρ_j / Σ_{j=1}^n ρ_j (1+η_j)^2 =: σ(ρ,η) .            (22)

Proof. Take k_j = 1 (j=1,...,n) in Theorem VIII.3. ∎

Note that σ(ρ,η) ∈ [¼, 1], and furthermore, σ(ρ,η) < 1 unless x̄_j = β_j/2 for every j. In particular, if x̄_j ∉ (0,β_j) ∀j, then σ(ρ,η) = ¼.
Corollary VIII.2. If for each j=1,...,n:

k_j ≥ (n ρ_j / a)^{1/2} ,  where a = ε Σ_{j=1}^n ρ_j (1+η_j)^2 ,            (23)

then an optimal solution (x̃,ỹ) of the approximation problem (16) satisfies Ψ(x̃,ỹ) − Ψ* ≤ ε Δf.

Proof. From (23) it immediately follows that

Σ_{j=1}^n ρ_j/k_j^2 ≤ a = ε Σ_{j=1}^n ρ_j (1+η_j)^2 . ∎
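Thus a is computed once, and each k_j is the smallest integer satisfying (23); a sketch (plain Python; ρ, η, ε are made-up data):

```python
import math

def grid_sizes(rho, eta, eps):
    """Smallest integers with k_j >= (n * rho_j / a)^(1/2),
       a = eps * sum_j rho_j * (1 + eta_j)^2, so that (20) yields E/Df <= eps."""
    n = len(rho)
    a = eps * sum(r * (1 + e) ** 2 for r, e in zip(rho, eta))
    return [max(1, math.ceil(math.sqrt(n * r / a))) for r in rho]

print(grid_sizes(rho=[1.0, 0.5], eta=[1.0, 0.0], eps=0.01))   # [7, 5]
```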

Thus, in the case of a quadratic function f(x), we can choose the numbers k_j (j=1,...,n) so that the solution of the approximation problem (16) corresponding to the subdivision of each interval [0,β_j] into k_j subintervals will give an ε-approximate global optimal solution to the original problem (P) (ε > 0 is a prescribed tolerance). In general, for small ε, the numbers k_j must be large, and the explicit construction of the functions φ_j(·) may be cumbersome. Rosen and Pardalos (1986) suggest reformulating the problem (16) as a zero-one mixed integer programming problem as follows.
Let us introduce new variables w_{ij} such that

x_j = δ_j Σ_{i=1}^{k_j} w_{ij}  (j=1,...,n) .            (24)

The variables w_{ij} are restricted to w_{ij} ∈ [0,1], and furthermore, w_j = (w_{1j},...,w_{k_j j}) is restricted to have the form w_j = (1,...,1, w_{lj}, 0,...,0). Then there will be a unique vector w_j representing any x_j ∈ [0,β_j], and it is easy to see that

φ_j(x_j) = Σ_{i=1}^{k_j} Δf_{ij} w_{ij} ,  with

Δf_{ij} = f_j(iδ_j) − f_j((i−1)δ_j) ,  i=1,...,k_j ,

where w_j is determined by (24). We can therefore reformulate (16) as

(MI)  min Σ_{j=1}^n Σ_{i=1}^{k_j} Δf_{ij} w_{ij} + dy

      s.t. Σ_{j=1}^n δ_j a_j Σ_{i=1}^{k_j} w_{ij} + By ≤ c ,

           z_{ij} ∈ {0,1} ,

where a_j is the j-th column of A in (15), and the 0–1 variables z_{ij} serve to enforce the above form of the vectors w_j.
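The encoding (24) and the telescoping coefficients Δf_ij can be sketched as follows (plain Python; the grid and f_j are made-up):

```python
def encode(xj, delta, kj):
    """w_j = (1,...,1, fractional part, 0,...,0) with x_j = delta * sum_i w_ij."""
    full, frac = divmod(xj / delta, 1.0)
    w = [1.0] * int(full) + ([frac] if frac > 0 else [])
    return w + [0.0] * (kj - len(w))

def phi_j(xj, fj, delta, kj):
    """phi_j(x_j) = sum_i Df_ij w_ij, Df_ij = f_j(i delta) - f_j((i-1) delta):
       the piecewise affine interpolant of f_j on the grid 0, delta, ..., kj*delta."""
    w = encode(xj, delta, kj)
    return sum((fj(i * delta) - fj((i - 1) * delta)) * w[i - 1]
               for i in range(1, kj + 1))

fj = lambda t: 4 * t - t * t          # concave; beta_j = 4, k_j = 4, delta_j = 1
print(phi_j(2.5, fj, 1.0, 4))         # 3.5, the chord value between f(2) and f(3)
```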

Algorithm VIII.4 (The Rosen–Pardalos Algorithm for (P))

1) For the objective function, compute the eigenvalues λ_i and the corresponding orthonormal eigenvectors u^i, i=1,2,...,n. Construct a rectangular domain M = {x: 0 ≤ x_j ≤ β_j} containing the projection D of Ω onto ℝ^n. Evaluate Ψ(x,y) at every vertex of Ω encountered.

2) Choose the incumbent function value (IFV). Construct φ(x), Δf, ρ_j, η_j (j=1,...,n).

3) Solve the approximating problem (16) to obtain (x̃,ỹ). If Ψ(x̃,ỹ) < IFV, reset IFV = Ψ(x̃,ỹ).

4) If IFV − φ(x̃) − dỹ ≤ ε Δf, then stop: IFV is an ε-approximate solution.

5) Otherwise, construct the piecewise affine functions φ_j(t) corresponding to the given tolerance ε.
Solve (MI) to obtain an ε-approximate solution, using the incumbent to accelerate pruning.

Remarks VIII.2. (i) The rectangular domain M is constructed by solving 2n linear programs:

max {x_j: (x,y) ∈ Ω} ,  min {x_j: (x,y) ∈ Ω}  (j=1,2,...,n).

In the process of solving these multi-cost-row problems, each time a vertex of Ω is encountered, the corresponding value of Ψ(x,y) is calculated. The vertex with the minimum Ψ is the incumbent and gives IFV in Step 2.

(ii) The formulation of (MI) requires a total of N = Σ_{j=1}^n k_j − n  0–1 integer variables. If all possible combinations of the integer variables were allowed, this problem would have a maximum of 2^N possibilities to be evaluated. However, it can be shown that, because of the specific structure of the problem, the maximum number of possibilities is only k_1 × ... × k_n. In addition, the number N of 0–1 integer variables never exceeds a bound given in Rosen and Pardalos (1986).

4. DECOMPOSITION BY OUTER APPROXIMATION

In this section we discuss a decomposition approach based on outer approximation of problem (H̄) (i.e., (4)), which is equivalent to (H). This method (cf. Tuy (1985, 1987) and Thieu (1989)) can also be regarded as an application of Benders' partitioning approach to concave minimization problems.

4.1. Basic Idea

As before, we shall assume that the constraint set Ω is given in the form

Ax + By ≤ c ,  x ≥ 0 ,  y ≥ 0 .            (25)

The problem considered is

(P)  minimize f(x) + dy  subject to (25).

As shown in Section VIII.1, this problem is equivalent to the following one in the space ℝ^n×ℝ:

(H̄)  minimize f(x) + t  subject to g(x) ≤ t , x ∈ D ,

where

g(x) = inf {dy: −By ≥ Ax − c, y ≥ 0} ,
D = {x ≥ 0: ∃y ≥ 0 such that Ax + By ≤ c} .

Since, by Proposition VIII.1, D is a polyhedron and g(x) is a convex polyhedral function with dom g = D, it follows that the constraint set G of (H̄) is a polyhedron; hence problem (H̄) can be solved by the outer approximation method described in Section VI.1.

Note that the function g(x) is continuous on D, because any proper convex polyhedral function is closed and hence continuous on any polyhedron contained in its domain (Rockafellar (1970), Corollary 19.1.2 and Theorem 10.2). Hence, if we assume that D is bounded, then g(x) is bounded on D: α ≤ g(x) ≤ β ∀x ∈ D. Since obviously any optimal solution of (H̄) must satisfy g(x) = t, we can add the constraint α ≤ t ≤ β to (H̄). In other words, we may assume that the constraint set G of (H̄) is contained in the bounded set D×[α,β].

Under these conditions, an outer approximation method for solving (H̄) proceeds according to the following scheme.
Start with a polytope T_0 in ℝ^n×ℝ which contains G. At iteration k one has on hand a polytope T_k in ℝ^n×ℝ. Solve the relaxed problem

(H_k)  min {f(x) + t: (x,t) ∈ T_k}

and let (x^k,t_k) be an optimal solution. If (x^k,t_k) happens to be feasible (i.e., belongs to G), then it solves (H̄). Otherwise, one constructs a linear constraint on G that is violated by (x^k,t_k). Adding this constraint to T_k, one defines a new polytope T_{k+1} that excludes (x^k,t_k) but contains G. Then the process is repeated, with T_{k+1} in place of T_k. Since the number of constraints on G is finite, the process will terminate at an optimal solution of (P) after finitely many iterations.

Clearly, to carry out this scheme we must carefully examine two essential questions:

(i) How can one check whether a point (x^k,t_k) is feasible for (H̄)?

(ii) If (x^k,t_k) is infeasible, how does one construct a constraint on G that is violated by this point?

These questions arise because, although we know that G is a polytope, we do not know its constraints explicitly.

In the next section we shall show how these two questions can be resolved without, in general, having to generate explicitly all of the constraints of G.

4.2. Decomposition Algorithm

Let (x^0,t^0) ∈ ℝ^n×ℝ. Recall that, by definition, g(x^0) is the optimal value of the linear program

(C(x^0))  min {dy: −By ≥ Ax^0 − c, y ≥ 0} .

Consider the dual program

(C*(x^0))  max {(Ax^0 − c)w: B^T w ≥ −d, w ≥ 0} .

Since the function g(x) is bounded on D, C(x) is either infeasible or else has a finite optimal value.

Proposition VIII.4. The point (x^0,t^0) is feasible for (H̄), i.e., x^0 ∈ D and g(x^0) ≤ t^0, if and only if both programs C(x^0) and C*(x^0) are feasible and their common optimal value does not exceed t^0.

Proof. Since the point (x^0,t^0) is feasible for (H̄) if and only if C(x^0) has an optimal value that does not exceed t^0, the conclusion follows immediately from the duality theory of linear programming. ∎

Observe that, since C(x) is either infeasible or has a finite optimal value (g(x) > −∞), C*(x) must be feasible, i.e., the polyhedron

W = {w: B^T w ≥ −d, w ≥ 0}

is nonempty.

Now suppose that (x^0,t^0) is infeasible. By the proposition just stated, this can happen only in the following two cases:

I. C(x^0) is infeasible.

Then, since C*(x^0) is always feasible (W is nonempty), it follows that C*(x^0) must be unbounded. That is, in solving C*(x^0) we must find an extreme direction v^0 of W such that (Ax^0 − c)v^0 > 0.

Proposition VIII.5. The linear inequality

(Ax − c)v^0 ≤ 0            (26)

is satisfied by all x ∈ D, but is violated by x^0.

Proof. Indeed, for any x ∈ D, C(x) has a finite optimal value. Hence, C*(x) has a finite optimal value, too. This implies that (Ax − c)v ≤ 0 for any extreme direction v of W. ∎

II. C(x^0) is feasible but its optimal value exceeds t^0.

Then C*(x^0) has an optimal solution w^0 with (Ax^0 − c)w^0 = g(x^0) > t^0.

Proposition VIII.6. The linear inequality

(Ax − c)w^0 ≤ t            (27)

is satisfied by all (x,t) such that x ∈ D, g(x) ≤ t, but it is violated by (x^0,t^0).

Proof. For any x ∈ D, since C(x) has g(x) as its optimal value, it follows that C*(x), too, has g(x) as its optimal value. Hence,

g(x) ≥ (Ax − c)w^0 ,

which implies (27) if g(x) ≤ t. Since t^0 < g(x^0) = (Ax^0 − c)w^0, the proposition follows. ∎
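Numerically, the case distinction of Propositions VIII.4–VIII.6 reduces to one solve of C*(x^0); the following sketch (scipy.optimize.linprog, hypothetical data layout) reports which case occurs — extracting the actual extreme direction v^0 in the unbounded case would require an additional step not shown here:

```python
import numpy as np
from scipy.optimize import linprog

def classify(x0, t0, A, B, c, d):
    """Solve C*(x0): max (Ax0 - c)w  s.t.  B^T w >= -d, w >= 0."""
    r = A @ x0 - c
    res = linprog(-r, A_ub=-B.T, b_ub=d,
                  bounds=[(0, None)] * B.shape[0], method="highs")
    if res.status == 3:            # dual unbounded <=> C(x0) infeasible, x0 not in D
        return "case I: add a cut of type (26)"
    if -res.fun > t0:              # g(x0) = (Ax0 - c)w0 > t0
        return "case II: add the cut (27)"
    return "(x0, t0) is feasible for (H-bar)"
```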


Thus, given any (x^0,t^0), by solving C(x^0) and C*(x^0) we obtain all the information needed to check feasibility of (x^0,t^0) and, in case of infeasibility, to construct the corresponding inequality that excludes (x^0,t^0) without excluding any feasible (x,t).
We are thus led to the following algorithm for solving (P).
Let α ≤ g(x) ≤ β ∀x ∈ D.

Algorithm VIII.5

Initialization:

Construct a polytope T_0 ⊂ ℝ^{n+1} containing D×[α,β]. Set k = 0.


Iteration k = 0,1,... :

1) Solve the relaxed problem

(H_k)  min {f(x) + t: (x,t) ∈ T_k} ,

obtaining an optimal solution (x^k,t_k) of it.

2) Solve the linear program C*(x^k).

3) If a basic optimal solution w^k of C*(x^k) is found with (Ax^k − c)w^k ≤ t_k, then terminate: (x^k,y^k), with y^k a basic optimal solution of C(x^k), is a global optimal solution of (P).
If a basic optimal solution w^k of C*(x^k) is found with (Ax^k − c)w^k > t_k, then form

T_{k+1} = T_k ∩ {(x,t): (Ax − c)w^k ≤ t}

and go to 1) with k ← k+1.

4) Otherwise, an extreme direction v^k of the cone B^T v ≥ 0, v ≥ 0, is found such that (Ax^k − c)v^k > 0. In this case form

T_{k+1} = T_k ∩ {(x,t): (Ax − c)v^k ≤ 0}

and go to 1) with k ← k+1.

Theorem VIII.4. Algorithm VIII.5 terminates after finitely many iterations.

Proof. Each w^k is a vertex of the polyhedron {w: B^T w ≥ −d, w ≥ 0}, while each v^k is an extreme direction of the cone {v: B^T v ≥ 0, v ≥ 0}. But each T_{k+1} is obviously a proper subset of T_k. Hence, in each sequence {w^k}, {v^k} there cannot be any repetition. Since the set of vertices and extreme directions of the polyhedron {w: B^T w ≥ −d, w ≥ 0} is finite, the finiteness of the algorithm follows. ∎

Remarks VIII.3. (i) All of the linear programs C*(x^k) have the same constraints: B^T w ≥ −d, w ≥ 0, while the linear programs C(x^k) have constraints that differ only in the right hand side: −By ≥ Ax^k − c, y ≥ 0. This feature greatly simplifies the process of solving the auxiliary linear subproblems and is particularly valuable when the matrix B has some special structure which allows efficient procedures for solving the corresponding subproblems. We shall return to this matter in Section VIII.4.4.

(ii) If we know bounds α, β for g(x) over D, then we can take T_0 = S×[α,β], where S is an n-simplex or a rectangle in ℝ^n which contains D. Obviously, a lower bound for g(x) over D is

α = min {dy: Ax + By ≤ c, x ≥ 0, y ≥ 0} = min {g(x): x ∈ D} .

It may be more difficult to compute an upper bound β. However, for the above algorithm to work it is not necessary that T_0 be a polytope. Instead, one can take T_0 to be a polyhedron of the form S×[α,+∞), where S is a polytope containing D. Indeed, this will suffice to ensure that any relaxed problem (H_k) will have an optimal solution which is a vertex of T_k.

(iii) If x^k ∈ D, i.e., if C*(x^k) has an optimal solution w^k, then f(x^k) + g(x^k) = f(x^k) + (Ax^k − c)w^k provides an upper bound for the optimal value of (P). On the basis of this observation, the algorithm can be improved by the following modification.
To start, set CBS (Current Best Solution) = x̄, UBD (Upper Bound) = f(x̄) + g(x̄), where x̄ is some best available point of D (if no such point is available, set CBS = ∅, UBD = +∞). At iteration k, in Step 1, if f(x^k) + t_k ≥ UBD − ε, then stop: CBS is a global ε-optimal solution. In Step 3 (entered with an optimal solution w^k of C*(x^k), i.e., x^k ∈ D), if f(x^k) + (Ax^k − c)w^k < UBD, reset UBD = f(x^k) + g(x^k), CBS = x^k.

(iv) As in all outer approximation procedures, the relaxed problem (H_k) differs from (H_{k−1}) by just one additional constraint. Therefore, to solve the relaxed problems one should use an algorithm with restart capability. For example, if Algorithm VI.1 is used, then at the beginning the vertex set of T_0 is known, and at iteration k the vertex set of T_k is computed using knowledge of the vertex set of T_{k−1} and the newly added constraint (see Section II.4.2). The optimal solution (x^k,t_k) is found by comparing the values of f(x) + t at the vertices.

Example VIII.1. Consider the problem:

minimize f(x) + dy
s.t. Ax + By ≤ c , x ≥ 0 , y ≥ 0 ,

where x ∈ ℝ^2, y ∈ ℝ^5, f(x) := −(x_1 − 1)^2 − (x_2 + 1)^2, d = (1,−1,2,1,−1)^T, and

B = ( …  …  …  …  …
      …  …  …  …  …
      0   0   1   0   1 ) ,   c = ( … , … , −3 )^T .

Initialization:
α = min {dy: Ax + By ≤ c, x ≥ 0, y ≥ 0} = −3.
T_0 = S×[α,+∞), S = {x: x_1 ≥ 0, x_2 ≥ 0, x_1 + x_2 ≤ 3}.

Iteration 0:
The relaxed problem is

(H_0)  min {f(x) + t: (x,t) ∈ T_0}

with the optimal solution x^0 = (0,3), t_0 = −3.
Solving C*(x^0) yields w^0 = (0,1,0) with (Ax^0 − c)w^0 = −1 > t_0.
Form (H_1) by adding the constraint (Ax − c)w^0 = 2x_1 + x_2 − 4 ≤ t.

Iteration 1:
Optimal solution of (H_1): x^1 = (0,3), t_1 = −1.
Solving C*(x^1) yields w^1 = (0,1,0), with (Ax^1 − c)w^1 = −1 = t_1.
The termination criterion in Step 3 is satisfied. Hence (x^1; y^1) = (0,3; 0,1,0,0,0) is the desired global optimal solution of (P).

4.3. An Extension

Algorithm VIII.5 can easily be extended to the case where

Ω = {(x,y) ∈ ℝ^n×ℝ^h: Ax + By ≤ c, x ∈ X, y ≥ 0} ,            (28)

with X a convex polyhedron in ℝ^n, and the projection D of Ω on ℝ^n is not necessarily bounded.
As above, define

g(x) = inf {dy: (x,y) ∈ Ω} ,

G = {(x,t) ∈ ℝ^n×ℝ: x ∈ D, g(x) ≤ t} ,

and assume that g(x) ≥ α ∀x ∈ D; hence W ≠ ∅ (this is seen by considering the dual linear programs C(x), C*(x) for an arbitrary x ∈ D). Denote the set of vertices and the set of extreme directions of W by vert W and extd W, respectively.

Proposition VIII.7. A vector (x,t) ∈ X×ℝ belongs to G if and only if it satisfies

(Ax − c)v ≤ 0  ∀v ∈ extd W ,            (29)

(Ax − c)w ≤ t  ∀w ∈ vert W .            (30)

Proof. The proofs of Propositions VIII.5 and VIII.6 do not depend upon the hypotheses that X = ℝ^n_+ and D is bounded. According to these propositions, for any x ∈ X, if (x,t) ∉ G, then (x,t) violates at least one of the inequalities (29), (30). On the other hand, these inequalities are satisfied by all (x,t) ∈ G. ∎

Thus, the feasible set G can be described by the system of constraints: x ∈ X, (29) and (30).
In view of this result, checking the feasibility of a point (x^k,t_k) ∈ X×ℝ and constructing the constraint that excludes it when it is infeasible proceeds exactly as before, i.e., by solving C*(x^k).
The complication now is that, since the outer approximating polyhedron T_k may be unbounded, the relaxed problem (H_k) may have an unbounded optimal solution. That is, in solving (H_k) we may obtain an extreme direction (z^k,s_k) of T_k on which the function f(x) + t is unbounded from below. Whenever this situation occurs, we must check whether this direction belongs to the recession cone of the feasible set G, and if not, we must construct a linear constraint on G that excludes this direction without excluding any feasible point.

Corollary VIII.3. Let z be a recession direction of X. Then (z,s) ∈ ℝ^n×ℝ is a recession direction of G if and only if

(Az)v ≤ 0  ∀v ∈ extd W ,            (31)

(Az)w ≤ s  ∀w ∈ vert W .            (32)

Proof. Indeed, let (x,t) be an arbitrary point of G. Then (z,s) is a recession direction of G if and only if (x,t) + λ(z,s) ∈ G ∀λ ≥ 0, i.e., by Proposition VIII.7, if and only if for all λ ≥ 0, (A(x + λz) − c)v ≤ 0 ∀v ∈ extd W and (A(x + λz) − c)w ≤ t + λs ∀w ∈ vert W. This is equivalent to (31), (32). ∎

Therefore, a vector (z^k,s_k) ∈ ℝ^n×ℝ belongs to the recession cone of G if the linear program

(S(z^k))  max {(Az^k)w: w ∈ W}

has a basic optimal solution w^k with (Az^k)w^k ≤ s_k. If not, i.e., if (Az^k)w^k > s_k, then the constraint (30) corresponding to w = w^k will exclude (z^k,s_k) from the recession cone. If S(z^k) has an unbounded optimal solution with direction v^k, then the constraint (29) corresponding to v = v^k will exclude (z^k,s_k) from the recession cone.
On the basis of the above results one can propose the following modification of Algorithm VIII.5 for the case where X is an arbitrary convex polyhedron and D may be unbounded:

Initialization:
Construct a polyhedron T_0 such that G ⊂ T_0 ⊂ S×[α,+∞). Set k = 0.

Iteration k = 0,1,... :

1) Solve the relaxed problem (H_k). If a finite optimal solution (x^k,t_k) of (H_k) is obtained, then go to 2). If an unbounded optimal solution with direction (z^k,s_k) is obtained, then go to 5).

2) – 4): as in Algorithm VIII.5.

5) Solve S(z^k).

6) If a basic optimal solution w^k of S(z^k) is found with (Az^k)w^k ≤ s_k, then terminate: f(x) + t is unbounded from below over the direction (z^k,s_k).
If a basic optimal solution w^k of S(z^k) is found with (Az^k)w^k > s_k, then form

T_{k+1} = T_k ∩ {(x,t): (Ax − c)w^k ≤ t}

and go to 1) with k ← k+1.

7) Otherwise, a direction v^k ∈ extd W is found such that (Az^k)v^k > 0. Then form

T_{k+1} = T_k ∩ {(x,t): (Ax − c)v^k ≤ 0}

and go to 1) with k ← k+1.

It is clear that the modified Algorithm VIII.5 will terminate in finitely many steps.

Remarks VIII.4. (i) If Ω is given in the form Ax + By = c, x ∈ X, y ≥ 0, then the constraints of the linear program C(x) (for x ∈ X) are −By = Ax − c, y ≥ 0. Hence the nonnegativity constraint w ≥ 0 in C*(x) is dropped, i.e., we have W = {w: B^T w ≥ −d}.

(ii) A further extension of the algorithm can be made to the case when the nonnegativity constraint y ≥ 0 is replaced by a constraint of the form Ey ≥ p, where E is an ℓ×h matrix and p an ℓ-vector. Then the problems C*(x), S(z) become

(C*(x))  max {(Ax − c)w − pu: (w,u) ∈ W} ,

(S(z))  max {(Az)w: (w,u) ∈ W} ,

where W = {(w,u): B^T w + E^T u = −d, w ≥ 0, u ≥ 0}.
Setting

L = {w: ∃u with (w,u) ∈ vert W} ,  K = {v: ∃u with (v,u) ∈ extd W} ,

α(v) = min {(B^T v)y: Ey ≥ p} ,  β(w) = min {(B^T w + d)y: Ey ≥ p} ,

one can extend Proposition VIII.7 as follows:

A vector (x,t) ∈ X×ℝ belongs to G if and only if:

(Ax − c)v + α(v) ≤ 0  ∀v ∈ K ,

(Ax − c)w + β(w) ≤ t  ∀w ∈ L

(for the details, see Tuy (1985, 1987) and Thieu (1989)). From these results it is apparent how the algorithm should be modified to handle this case.

4.4. Outer Approximation Versus Successive Partition

As we saw in Chapter II and Section VI.1, a major difficulty with outer approximation methods is the rapid growth of the number of constraints in the relaxed problems (H_k). Despite this difficulty, there are instances when outer approximation methods appear to be more easily applicable than other methods. In the decomposition context discussed in this chapter, an obvious advantage of the outer approximation approach is that all the linear subproblems involved in it have the same constraint set. This is in contrast with branch and bound methods, in which each linear subproblem has a different constraint set.
To illustrate this remark, let us consider a class of two-level decision problems which are sometimes encountered in production planning. They have the following general formulation:

(*)  min [f(x) + dy]
     s.t. x ∈ X ,
          By = b , Cy = x , y ≥ 0 .            (33)

Specifically, x might denote a production program to be chosen (at the first decision level) from a set X of feasible programs, and y might denote some transportation-distribution program that is to be determined once the production program x has been chosen, in such a way that the requirements (33) are met. The objective function f(x) is the production cost, which is assumed to be concave, and dy is the transportation-distribution cost. Often, the structure of the constraints (33) is such that highly efficient algorithms are currently available for solving linear programs with these constraints. This is the case, for example, with the so-called plant location problem, in which y = {y_{ij}: i=1,...,n, j=1,...,m} and the constraints (33) are:

Σ_i y_{ij} = b_j  (j=1,...,m) ,
Σ_j y_{ij} = x_i  (i=1,...,n) ,            (34)
y_{ij} ≥ 0  (i=1,...,n; j=1,...,m) .

Clearly, branch and bound methods do not take advantage of the specific structure of the constraints (33): each linear subproblem corresponding to a partition set involves additional constraints which totally destroy the original structure. In contrast, the decomposition approach by outer approximation allows this structure to be fully exploited: the linear programs involved in the iterations are simply

(C(x))  min {dy: By = b, Cy = x, y ≥ 0}            (35)

and their duals

(C*(x))  max {wx + ub: C^T w + B^T u ≤ d} .            (36)

This is particularly convenient in the case of the constraints (34), for then each C(x) is a classical transportation problem, and the dual variables w, u are merely the associated potentials.

When specialized to the problem under consideration, Algorithm VIII.5, with the modifications described in Section VIII.4.3, goes as follows (cf. Thieu (1987)). For the sake of simplicity, assume that X is bounded and the constraints (33) are feasible for every fixed x ∈ X.

Algorithm VIII.6

Initialization:

Estimate a lower bound t_0 for the transportation-distribution cost, i.e., compute a number t_0 ≤ inf {dy: x ∈ X, By = b, Cy = x, y ≥ 0}. Set T_0 = {(x,t): x ∈ X, t_0 ≤ t}.
Let x^0 be an optimal solution of the problem

(H_0)  min {f(x): x ∈ X} .

Iteration k = 0,1,... :

1) Solve the linear program C(x^k), obtaining the optimal transportation-distribution cost of x^k: g(x^k) = w^k x^k + u^k b, where (w^k,u^k) is a basic optimal solution of the dual C*(x^k).

2) If g(x^k) ≤ t_k (the optimal transportation-distribution cost does not exceed the current estimate t_k), then terminate: x^k yields a global optimal solution of the problem (*). Otherwise, continue.

3) Form T_{k+1} by adding the constraint

w^k x + u^k b ≤ t

to T_k. Solve the new relaxed problem

(H_{k+1})  min {f(x) + t: (x,t) ∈ T_{k+1}} ,

obtaining an optimal solution (x^{k+1}, t_{k+1}) (the new production program together with the new estimate of the transportation-distribution cost).
Go to iteration k+1.
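Step 3 adds an affine minorant of g, supported at x^k; the construction is a one-liner once the potentials are known. A sketch (plain Python/NumPy) using the iteration-0 data of Example VIII.2 below:

```python
import numpy as np

def benders_cut(wk, uk, b):
    """The new constraint of Step 3, wk.x + uk.b <= t, as a function of x;
       by LP duality it minorizes g and is tight at x^k."""
    const = float(np.dot(uk, b))
    return lambda x: float(np.dot(wk, x)) + const

cut0 = benders_cut(np.array([0.0, -46.0, -64.0]),
                   np.array([6.0, 66.0, 68.0, 81.0, 4.0]),
                   np.array([62.0, 65.0, 51.0, 10.0, 15.0]))
print(cut0(np.array([203.0, 0.0, 0.0])))   # 9000.0 = g(x^0): the cut is tight at x^0
```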

Example VIII.2. Solve the plant location problem:

minimize Σ_i f_i(x_i) + Σ_i Σ_j d_{ij} y_{ij}
s.t. Σ_i x_i = Σ_j b_j ,  x_i ≥ 0  (i=1,...,n) ,            (37)
     Σ_i y_{ij} = b_j  (j=1,...,m) ,
     Σ_j y_{ij} = x_i  (i=1,...,n) ,
     y_{ij} ≥ 0  (i=1,...,n; j=1,...,m) ,

with the following data:

f_i(x_i) = 0 if x_i = 0 ,  f_i(x_i) = r_i + s_i x_i if x_i > 0  (i=1,...,n)

(r_i is a setup cost for plant i; see Section I.2.2);

n = 3 , m = 5 , r = (1, 88, 39) , s = (1.7, 8.4, 4.7) ,
b = (62, 65, 51, 10, 15) ,

d = (d_{ij}) = (  6  66  68  81   4
                 40  20  34  83  27
                 90  22  82  17   8 ) .

Applying Algorithm VIII.6 with X defined by (37) we obtain the following results.

Initialization:

t_0 = 3636 , x^0 = (203, 0, 0).

Iteration 0:

Optimal value of C(x^0): 9000 > t_0.
Potentials: w^0 = (0, −46, −64) , u^0 = (6, 66, 68, 81, 4).
New constraint: −46x_2 − 64x_3 + 9000 ≤ t.
Optimal solution of (H_1): x^1 = (119.1875, 0, 83.8125) , t_1 = 3636.

Iteration 1:

Optimal value of C(x^1): 5535.25 > t_1.
Potentials: w^1 = (−106, −140, −102) , u^1 = (112, 124, 174, 119, 110).
New constraint: −106x_1 − 140x_2 − 102x_3 + 26718 ≤ t.
Optimal solution of (H_2): x^2 = (104.7, 51.5, 46.8) , t_2 = 3636.

Iteration 2:

Optimal value of C(x^2): 4651.44407 > t_2.
Potentials: w^2 = (−60, −94, −92) , u^2 = (66, 114, 128, 109, 64).
New constraint: −60x_1 − 94x_2 − 92x_3 + 20080 ≤ t.
Optimal solution of (H_3): x^3 = (73.176, 54.824, 75) , t_3 = 3636.

Iteration 3:

Optimal value of C(x^3): 3773.647 > t_3.
Potentials: w^3 = (−48, −46, −44) , u^3 = (54, 66, 80, 61, 52).
New constraint: −48x_1 − 46x_2 − 44x_3 + 13108 ≤ t.
Optimal solution of (H_4): x^4 = (77, 51, 75) , t_4 = 3766.

Iteration 4:

Optimal value of C(x^4): 3766 = t_4.

Thus, a global optimal solution of the plant location problem is x^4 = (77, 51, 75), with the corresponding transportation program

(y_{ij}) = ( 62   0   0   0  15
              0   0  51   0   0
              0  65   0  10   0 ) .
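The final figures are easy to double-check from the data (plain Python/NumPy):

```python
import numpy as np

b  = np.array([62, 65, 51, 10, 15]); r = np.array([1, 88, 39])
s  = np.array([1.7, 8.4, 4.7]);      x4 = np.array([77, 51, 75])
d  = np.array([[ 6, 66, 68, 81,  4],
               [40, 20, 34, 83, 27],
               [90, 22, 82, 17,  8]])
y  = np.array([[62,  0,  0,  0, 15],
               [ 0,  0, 51,  0,  0],
               [ 0, 65,  0, 10,  0]])

assert (y.sum(axis=0) == b).all() and (y.sum(axis=1) == x4).all()
print((d * y).sum())                       # 3766: transportation cost = t_4
print(r.sum() + s @ x4 + (d * y).sum())    # ~4805.8: f(x^4) + dy (all plants open)
```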

4.5. Outer Approximation Combined with Branch and Bound

As discussed in connection with Algorithm VIII.5, in order to solve the relaxed problem (H_k) one can use any procedure, as long as it is capable of being restarted. One such procedure is the method of Thieu–Tam–Ban (Algorithm VI.1), which relies on the inductive computation of the vertex set V_k of the approximating polytope T_k. However, for relatively large values of n, this method encounters serious computational difficulties in view of the rapid growth of the set V_k, which may attain a prohibitive size.
One possible way of avoiding these difficulties is to solve the relaxed problems by a restart branch and bound procedure (Section IV.6). This idea is related to the approaches of Benson and Horst (1991) and Horst, Thoai and Benson (1991) for concave minimization under convex constraints (Section VII.1.9).
In fact, solving the relaxed problems in Algorithm VIII.5 by the restart normal conical algorithm amounts to applying Algorithm VII.2 (normal conical algorithm for (CP)) to the problem (H̄):

(H̄)  min {f(x) + t: x ∈ D, g(x) ≤ t} .

For the sake of simplicity, as before we assume that D is bounded; as seen above (in Section VIII.4.2), this implies that g(x) is bounded on D, and for any x ∈ ℝ^n the linear program

(C*(x))  max {(Ax − c)w: B^T w ≥ −d, w ≥ 0}

is feasible.
Let G = {(x,t): x ∈ D, g(x) ≤ t} be the feasible set of (H̄). From the above results (Section VIII.4.2) it follows that for any point (x^k,t_k), one of the following three cases must occur:

a) C*(x^k) has an optimal solution w^k with (Ax^k − c)w^k ≤ t_k. Then (x^k,t_k) ∈ G;

b) C*(x^k) has an optimal solution w^k with (Ax^k − c)w^k > t_k.
Then x^k ∈ D but (x^k,t_k) ∉ G, and the inequality

(Ax − c)w^k ≤ t            (38)

excludes (x^k,t_k) without excluding any point (x,t) of G.

c) C*(x^k) is unbounded, so that an extreme direction v^k of the polyhedron W = {w: B^T w ≥ −d, w ≥ 0} can be found over which (Ax^k − c)w is unbounded.
Then x^k ∉ D and the inequality

(Ax − c)v^k ≤ 0            (39)

excludes x^k without excluding any point x of D.

Now, when applying Algorithm VII.2 to problem (H̄), let us observe that the feasible set G of (H̄) is a convex set of a particular kind: namely, it is the epigraph of a convex function on D. In view of this particular structure of G, instead of using a conical subdivision of the (x,t)-space as prescribed by Algorithm VII.2, it is more convenient to subdivide the space into prisms of the form M×ℝ, where M is an n-simplex in the x-space (such a prism can also be viewed as a cone with vertex at infinity, in the direction t → +∞).

With this subdivision in mind, let T_k be a polyhedron containing G in the (x,t)-space, and let γ_k be the incumbent function value. For every prism M×ℝ which is in some partition of T_k, where M = [s^1,...,s^{n+1}], we can compute the points (s^i, θ^i) on the verticals x = s^i such that f(s^i) + θ^i = γ_k, and consider the linear program

max {Σ_i λ_i θ^i − t: (Σ_i λ_i s^i, t) ∈ T_k, Σ_i λ_i = 1, λ_i ≥ 0 (i=1,...,n+1)} ,            (40)

which is equivalent to

max {φ_M(x) − t: (x,t) ∈ T_k, x ∈ M} ,            (41)

where (x, φ_M(x)) is the intersection point of the vertical through x with the hyperplane through (s^1,θ^1),...,(s^{n+1},θ^{n+1}). Let μ(M) be the optimal value, and let z(M) = (x(M), t(M)) be a basic optimal solution of (40). Clearly, if μ(M) ≤ 0, then the portion of G contained in M×ℝ lies entirely above this hyperplane. Hence, by the concavity of Φ(x,t) := f(x) + t, we must have f(x) + t ≥ γ_k ∀(x,t) ∈ G ∩ (M×ℝ), i.e., this prism can be fathomed. Otherwise, if μ(M) > 0, this prism should be further investigated. In any case, x(M) is distinct from all vertices of M, so that M can be subdivided with respect to this point. In addition, the number

γ_k − μ(M) = min {f(s^i) + θ^i − μ(M): i=1,2,...,n+1}

clearly yields a lower bound for min {f(x) + t: x ∈ M, (x,t) ∈ T_k}.
On the other hand, by solving C*(x(M)) we can check whether z(M) belongs to G, and if not, construct a constraint to add to T_k to define the new polyhedron T_{k+1}.
For every x ∈ D denote F(x) = f(x) + g(x). The above development leads to the following procedure:

Algorithm VIII.7.

Select ε > 0 and a normal rule for simplicial subdivision.

1) Construct an n-simplex M_0 containing D in the x-space and let T_0 = M_0×ℝ. Choose a point x̄^0 ∈ D and let γ_0 = F(x̄^0). Set ℳ_0 = 𝒫_0 = {M_0}, k = 0.

2) For each M ∈ 𝒫_k solve the linear program (40), obtaining the optimal value μ(M) and a basic optimal solution z(M) = (x(M), t(M)) of (41).

3) Let ℛ_k = {M ∈ ℳ_k: μ(M) > ε}. If ℛ_k = ∅, then terminate: x̄^k is a global ε-optimal solution of (H̄). Otherwise, go to 4).

4) Select M_k ∈ argmax {μ(M): M ∈ ℛ_k}, and subdivide it according to the normal process that was chosen.
Let 𝒫_{k+1} be the resulting partition of M_k.

5) Let x^k = x(M_k), t_k = t(M_k). Solve C*(x^k). If case a) occurs, i.e., (x^k,t_k) ∈ G, then let T_{k+1} = T_k. Otherwise, form T_{k+1} by adding the new constraint (38) or (39) to T_k, according to whether case b) or case c) occurs.

6) Let x̄^{k+1} be the best (in terms of the value of F(x)) among x̄^k, all x(M) for M ∈ 𝒫_{k+1}, and the point ω(M_k) ∈ D used for subdividing M_k if ω(M_k) ≠ x(M_k). Let γ_{k+1} = F(x̄^{k+1}), ℳ_{k+1} = (ℛ_k \ {M_k}) ∪ 𝒫_{k+1}. Set k ← k+1 and return to 2).

Theorem VIII.5. Algorithm VIII.7 terminates after finitely many iterations.

Proof. By viewing a prism M×ℝ as a cone with vertex at infinity in the direction t → +∞, the proof can be carried out in the same way as for Theorem VII.5 on the convergence of Algorithm VII.2. If the subdivision process is exhaustive, the argument is even simpler. ∎

Remark VIII.5. Note the difference between the above algorithm and Algorithm VIII.1 (normal simplicial algorithm for (P), Section VIII.2.1). Although both algorithms proceed by simplicial subdivision of the x-space, the lower bounding subproblem LP(M,Ω) in Algorithm VIII.1 is much larger than the corresponding linear program (40) in Algorithm VIII.7. Of course, Algorithm VIII.7 requires us to solve an additional subproblem C*(x^k) (which is the dual of a linear program in y). However, since at least two new simplices appear at each iteration, all in all Algorithm VIII.7 should be less expensive.



5. DECOMPOSITION OF CONCAVE MINIMIZATION PROBLEMS OVER NETWORKS

A significant class of concave minimization problems relates to networks. These include problems in inventory and production planning, capacity sizing, location and network design which involve set-up charges, discounting, or economies of scale.
Other, more general nonconvex network problems can be transformed into equivalent concave network problems (Lamar, 1993). Large scale problems of this class can often be treated by appropriate decomposition methods that take advantage of the specific underlying network structure.

5.1. The Minimum Concave Cost Flow Problem

Consider a (directed) graph G = (V, A), where V is the set of nodes and A is the set of arcs (an arc is an ordered pair of nodes). Suppose we are given a real number d(v) for each node v ∈ V and two nonnegative numbers p_a, q_a (p_a ≤ q_a) for each arc a ∈ A. A vector x with components x(a) ≥ 0, a ∈ A, is called a flow in the network G (where the component x(a) is the flow value in the arc a). A flow is said to be feasible if

p_a ≤ x(a) ≤ q_a  ∀a ∈ A ,            (42)

Σ_{a∈A^+(v)} x(a) − Σ_{a∈A^−(v)} x(a) = d(v)  ∀v ∈ V ,            (43)

where A^+(v) (resp. A^−(v)) denotes the set of arcs entering (resp. leaving) node v.
The number d(v) expresses the "demand" at node v (if d(v) < 0, then node v is a "supply" node with supply −d(v)). The numbers p_a, q_a represent lower and upper bounds on the flow value in arc a. The relation (43) expresses flow conservation. It follows immediately from (43) that a feasible flow exists only if Σ_{v∈V} d(v) = 0.
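Conditions (42)–(43) are straightforward to check for a given flow vector; a minimal sketch (plain Python, with a tiny made-up network):

```python
def is_feasible(flow, p, q, demand):
    """Check (42) and (43); flow, p, q map arcs (u, v) to numbers, demand maps nodes."""
    if any(not (p[a] <= flow[a] <= q[a]) for a in flow):             # (42)
        return False
    for v in demand:                                                 # (43)
        balance = (sum(f for (i, j), f in flow.items() if j == v)
                   - sum(f for (i, j), f in flow.items() if i == v))
        if balance != demand[v]:
            return False
    return True

flow = {("s", "v"): 2.0, ("v", "w"): 2.0}
print(is_feasible(flow, p={a: 0.0 for a in flow}, q={a: 5.0 for a in flow},
                  demand={"s": -2.0, "v": 0.0, "w": 2.0}))   # True
```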

Furthermore, to each arc we associate a concave function f_a: ℝ₊ → ℝ₊ whose value f_a(t) at a given t ≥ 0 represents the cost of sending an amount t of the flow through the arc a. The minimum concave cost flow problem (CF) is to find a feasible flow x with smallest cost

f(x) = ∑_{a∈A} f_a(x(a)).   (44)

When p_a = 0, q_a = +∞ ∀a ∈ A, the problem is called the uncapacitated minimum concave cost flow problem (UCF). When, in addition, there is only one supply node (one node v with d(v) < 0), the problem is referred to as the single source uncapacitated minimum concave cost flow problem (SUCF). It is known that (SUCF) is NP-hard (cf., e.g., Nemhauser and Wolsey (1988), Guisewite (1995)).
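Before turning to solution methods, it may help to fix the data structures. The following Python sketch (all names are ours, not from the text) simply checks the feasibility conditions (42)-(43) and evaluates the cost (44) of a given flow.

def is_feasible(nodes, arcs, p, q, d, x, tol=1e-9):
    # Capacity constraints (42): p_a <= x(a) <= q_a for every arc a.
    if any(x[a] < p[a] - tol or x[a] > q[a] + tol for a in arcs):
        return False
    # Conservation constraints (43): inflow minus outflow equals d(v).
    for v in nodes:
        inflow = sum(x[a] for a in arcs if a[1] == v)    # a in A+(v)
        outflow = sum(x[a] for a in arcs if a[0] == v)   # a in A-(v)
        if abs(inflow - outflow - d[v]) > tol:
            return False
    return True

def total_cost(arcs, f, x):
    # Objective (44); f[a] is the concave arc cost function f_a.
    return sum(f[a](x[a]) for a in arcs)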
The (CF) problem has been studied by several authors. One of the earliest works
is a paper of Zangwill (1968), where a dynamic programming method was developed
for certain important special cases of (SUCF). The dynamic programming approach
of Zangwill was further extended for (CF) in a study by Erickson et al. (1987)
(send-and-split method).
Other methods using branch and bound concepts have been proposed by Soland
(1974), Gallo, Sandi and Sodini (1980), Konno (1988), and others (cf. Horst and
Thoai (1995), the survey of Guisewite (1995)).
Denote by A^I the set of all arcs a ∈ A for which the cost function f_a is affine, i.e., of the form f_a(x(a)) = f_a(p_a) + c_a(x(a) − p_a) (c_a ≥ 0), and let A^II = A \ A^I. In many practical cases, |A^II| is relatively small compared to |A^I|, i.e., the problem involves relatively few nonlinear variables. Then the minimum concave cost flow problem belongs to the class considered in this chapter and can be treated by the methods discussed above. It has been proved recently that by fixing the number of sources (supply points), capacitated arcs and nonlinear arc costs (i.e., |A^II|), this problem becomes even strongly polynomially solvable (Tuy, Ghannadan, Migdalas and Värbrand (1995)). In particular, efficient algorithms have been proposed for (SUCF) with just one or two nonlinear arc costs (Guisewite and Pardalos (1992), Tuy, Dan and Ghannadan (1993), Tuy, Ghannadan, Migdalas and Värbrand (1993b), Horst, Pardalos and Thoai (1995)).

For every flow x let us write x = (x^I, x^II), where x^I = (x(a), a ∈ A^I), x^II = (x(a), a ∈ A^II). The following Algorithm VIII.8 is a specialization to the concave cost flow problem of the normal rectangular algorithm for separable problems (Algorithm VIII.2), which differs from the algorithm of Soland (1974) by a more efficient bounding method.

Algorithm VIII.8.

Start with the rectangle M_1 = ∏_{a∈A^II} [p_a, q_a]. Set x^0 = 0, γ_0 = +∞.
Let 𝓜_1 = 𝓝_1 = {M_1}.

Iteration k=1,2,...:

1) For each rectangle M ∈ 𝓝_k with M = ∏_{a∈A^II} [r_a, s_a] solve the linear problem

(LP(M))  minimize ∑_{a∈A^I} c_a x(a) + ∑_{a∈A^II} c_a^M x(a)
         s.t. ∑_{a∈A⁺(v)} x(a) − ∑_{a∈A⁻(v)} x(a) = d(v)  (v ∈ V),
              p_a ≤ x(a) ≤ q_a (a ∈ A^I),  r_a ≤ x(a) ≤ s_a (a ∈ A^II),

where c_a^M is the slope of the affine function ψ_a^M(t) which agrees with f_a(t) at the points t = r_a and t = s_a.
If (LP(M)) is infeasible, then set β(M) = +∞. Otherwise, let ω(M) be an optimal solution of (LP(M)) with components ω_a(M), and let β(M) = ∑_{a∈A^I} f_a(ω_a(M)) + ∑_{a∈A^II} ψ_a^M(ω_a(M)), i.e., the minimal value of the linear underestimation of f, which is a lower bound for f over the feasible flows with x^II ∈ M.

2) Define the incumbent by setting x^k equal to the best feasible solution among x^{k−1} and all ω(M), M ∈ 𝓝_k. Let γ_k = f(x^k) if x^k exists, γ_k = +∞ otherwise.
Delete all M ∈ 𝓜_k for which β(M) ≥ γ_k. Let 𝓡_k be the remaining collection of rectangles.

3) If 𝓡_k = ∅, then terminate: if γ_k < +∞, then x^k is an optimal solution (optimal flow); otherwise the problem is infeasible.

4) Select M_k ∈ argmin {β(M): M ∈ 𝓡_k} and let ω^k = ω(M_k).

5) Select an arc a_k ∈ argmax {f_a(ω_a^k) − ψ_a^{M_k}(ω_a^k): a ∈ A^II}. Denote by M_k^1, M_k^2 the rectangles obtained from M_k by replacing [r_{a_k}, s_{a_k}] by [r_{a_k}, ω_{a_k}^k] and [ω_{a_k}^k, s_{a_k}], respectively.

6) Let 𝓝_{k+1} = {M_k^1, M_k^2}, 𝓜_{k+1} = (𝓡_k \ {M_k}) ∪ 𝓝_{k+1}. Set k ← k+1 and return to 1).

It is easily seen that γ_k < +∞ for all k > 1.
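The only nonlinear ingredient of Step 1) is the chord slope c_a^M of the envelope ψ_a^M; a minimal sketch (names are ours) of this construction and of the underestimation property it rests on:

def chord_slope(f_a, r, s):
    # Slope c_a^M of the affine function agreeing with f_a at r and s.
    return 0.0 if s == r else (f_a(s) - f_a(r)) / (s - r)

def psi(f_a, r, s, t):
    # psi_a^M(t): the convex envelope of the concave f_a on [r, s].
    # Concavity of f_a gives psi(t) <= f_a(t) on [r, s], with equality
    # at the endpoints, which is exactly what the bound beta(M) uses.
    return f_a(r) + chord_slope(f_a, r, s) * (t - r)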

Convergence of Algorithm VIII.8 can be deduced from Theorem VIII.2, since Algorithm VIII.8 is a specialization to the concave cost flow problem of Algorithm VIII.2, which handles more general separable problems. It follows that every accumulation point of the sequence {x^k} is an optimal solution of (CF), and f(x^k) converges to the optimal value of (CF) as k → ∞.

Next, assume that the numbers p_a, q_a, d(v) in (42), (43) are all integers, which is usually the case in practical applications. Then it is well known that the vertices of the feasible polytope defined by (42), (43) are all integer vectors, because of the total unimodularity of the matrix defining the left hand side of (43) (which is the node-arc incidence matrix of G) (cf., e.g., Nemhauser and Wolsey (1988), Papadimitriou and Steiglitz (1982)). Since we know that the concave function (44) attains its minimum at a vertex of the feasible polytope, it is clear that we can add the requirement
x(a) ∈ ℕ₀   (46)

to the constraints (42), (43) without changing the problem (in (46), ℕ₀ := {0} ∪ ℕ). Finiteness of Algorithm VIII.8 then follows because the optimal solution ω(M) of every linear subproblem LP(M) is integral, and hence every x^k is integral. Since the sequence {x^k} is bounded and integral, it can take only finitely many distinct values; hence the algorithm is finite.

Notice that, for solving the linear cost network flow problem (LP(M)) in Step 1), a number of very efficient polynomial algorithms are available (cf., e.g., Ahuja, Magnanti and Orlin (1993)).
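For instance, each (LP(M)) can be passed to the network simplex routine of the networkx library; the sketch below is our construction, assuming no parallel arcs (networkx's routine is only guaranteed for integer weights, so rational slopes may need scaling). The lower bounds r_a are removed by the standard shift x'(a) = x(a) − r_a, which only changes the node demands.

import networkx as nx

def solve_lp_m(nodes, arcs, r, s, d, c):
    # Solve min sum c[a]*x(a) over flows with r <= x <= s and demands d.
    # Constant terms of the envelopes are omitted; they do not affect
    # the minimizer omega(M).
    G = nx.DiGraph()
    demand = dict(d)
    for (u, v) in arcs:
        demand[v] = demand.get(v, 0) - r[(u, v)]   # forced r_a units arrive at v
        demand[u] = demand.get(u, 0) + r[(u, v)]   # ... and leave u
        G.add_edge(u, v, capacity=s[(u, v)] - r[(u, v)], weight=c[(u, v)])
    for v in nodes:
        G.add_node(v, demand=demand.get(v, 0))
    cost, flow = nx.network_simplex(G)
    return {(u, v): flow[u][v] + r[(u, v)] for (u, v) in arcs}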

Another integral subdivision can be interpreted as an integral equivalent of rectangular bisection (Horst and Thoai (1994a and 1995)): Let M_k be the rectangle chosen in Step 4) with edges [r_a, s_a], a ∈ A^II, and let

δ(M_k) = s_{a_k} − r_{a_k} = max {s_a − r_a: a ∈ A^II}   (47)

be the length of one of its longest edges. Then, in Step 5), subdivide M_k into

M_k^1 = {x ∈ M_k: x(a_k) ≤ r_{a_k} + ⌊δ(M_k)/2⌋}   (48)

and

M_k^2 = {x ∈ M_k: x(a_k) ≥ r_{a_k} + ⌊δ(M_k)/2⌋}.   (49)

Proposition VIII.8. The integral version of Algorithm VIII.8 using the subdivision (47)-(49) terminates after at most T = ∏_{a∈A^II} ⌈(q_a − p_a)/2⌉ iterations.

Proof. Notice that optimal solutions of LP(M) are integer. From this and the fact that the convex envelope of a univariate concave function f over an interval coincides with f at the endpoints, it follows that rectangles M satisfying δ(M) = 1 are deleted in Step 2). Therefore, it is sufficient to show that after at most T iterations no partition element M satisfying δ(M) > 1 is left. But this follows readily from (48), (49), which imply that an edge e of the initial rectangle M_1 cannot be involved in a subsequent subdivision more than ⌈|e|/2⌉ times, where |e| denotes the length of the edge e. Since |e| ≤ q_a − p_a for the corresponding a ∈ A^II, we obtain the above bound. ∎
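A sketch of the integral bisection (47)-(49), with rectangles stored as two dicts of integer bounds (names are ours):

def integral_bisect(r, s):
    # (47): pick a longest edge a_k and its length delta(M).
    a_k = max(r, key=lambda a: s[a] - r[a])
    delta = s[a_k] - r[a_k]
    assert delta > 1, "rectangles with delta(M) <= 1 are fathomed, not split"
    mid = r[a_k] + delta // 2   # integral split point
    # (48)/(49): both children keep integer bounds, and every integer
    # point of M lies in at least one of them.
    s1 = dict(s); s1[a_k] = mid
    r2 = dict(r); r2[a_k] = mid
    return (dict(r), s1), (r2, dict(s))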

5.2. The Single Source Uncapacitated Minimum Concave Cost Flow (SUCF) Problem

Now we consider the (SUCF) problem, i.e., the special case of (CF) when there are no capacity constraints (i.e., p_a = 0, q_a = +∞ ∀a ∈ A) and there is only one supply node (i.e., one node v ∈ V with d(v) < 0). Since (SUCF) is NP-hard, large scale (SUCF) problems cannot be expected to be effectively solved by general purpose methods. Fortunately, many (SUCF) problems encountered in practice have special additional structure that can be exploited to devise specialized algorithms. Examples include the concave warehouse problem, the plant location problem, the multi-product production and inventory models, etc. All of these problems and many others can be described as (SUCF) over a network which consists of several pairwise interconnected subnetworks like the one depicted in Fig. VIII.1.

Fig. VIII.1 Typical structure of (SUCF)

For problems with such a recurring special structure, a decomposition method can be derived that combines dynamic programming and polyhedral underestimation techniques. We follow the approach of Thach (1988).

Let us first restate the problem and its structure in a convenient form. A node v of a graph G = (V,A) is called a source if A⁺(v) = ∅. Let S(G) be the set of all sources of G, and suppose that we are given a demand d(v) ≥ 0 for each node v ∈ V \ S(G) and a concave cost function f_a: ℝ₊ → ℝ₊ for each arc a ∈ A. Then we consider the problem

minimize f(x) = ∑_{a∈A} f_a(x(a))   (50)

s.t. ∑_{a∈A⁺(v)} x(a) − ∑_{a∈A⁻(v)} x(a) = d(v)   (v ∈ V \ S(G)),   (51)

x(a) ≥ 0   (a ∈ A).   (52)

Clearly this problem becomes an (SUCF) problem when we add to the network a fictive supply node v_0 along with a fictive arc (v_0,v) for each v ∈ S(G). Furthermore, we set d(v) = 0 (v ∈ S(G)), d(v_0) := −∑ {d(v): v ∈ V \ S(G)}.
Conversely, an (SUCF) problem in the original formulation corresponds to the case |S(G)| = 1. Therefore, from now on, by (SUCF) we shall mean the problem (50)-(52) with the fictive node v_0 and the fictive arcs (v_0,v) (v ∈ S(G)) as introduced above.
We shall further impose a certain structure on the network G = (V,A):

(*) There is a partition {V_1,...,V_n} of V such that for every i=1,...,n we have the following property: if the initial node of an arc belongs to V_i, then its final node belongs to either V_i or V_{i+1}.

In view of this condition, the coefficient matrix of the flow conservation constraints (51) has a staircase structure which we exploit in order to derive a decomposition method.

For each i=1,...,n we define

A_i = {a ∈ A: a = (u,v), u ∈ V, v ∈ V_i},

and for each v we let A_i⁺(v) (resp. A_i⁻(v)) denote the set of arcs in A_i which enter (resp. leave) v.
Then A_i ∩ A_j = ∅ (i ≠ j) and A = ∪_{i=1}^n A_i, i.e., the sets A_1,...,A_n form a partition of A. Furthermore, let G_i be the subgraph of G generated by A_i, i.e., G_i = (W_i, A_i), where W_i is the set of all nodes incident to arcs in A_i. Then any flow x in G can be written as x = (x_1,...,x_n), where x_i is the restriction of x to A_i.
Setting U_i = W_i ∩ W_{i+1}, we see that v ∈ U_i if and only if v is the initial node of an arc going from V_i to V_{i+1}. Let h_i: V → ℝ₊ be a function such that h_i(v) = 0 for v ∉ U_i. Denote by X(G,d) the set of all feasible flows in G for the demand vector d with components d(v), v ∈ V \ S(G), i.e., the set of all vectors x = (x(a), a ∈ A) satisfying (51) and (52); similarly, denote by X(G_i, d + h_i) the set of all feasible flows in G_i for the demands d(v) + h_i(v).

Proposition VIII.9. We have

X(G,d) = {x = (x_1,...,x_n): x_i ∈ X(G_i, d + h_i) (i=1,...,n), h_n ≡ 0, h_i(u) = ∑_{a∈A⁻_{i+1}(u)} x_{i+1}(a) ∀u ∈ U_i (i=1,...,n−1)},

where A⁻_{i+1}(v) denotes the set of arcs a ∈ A_{i+1} leaving v.

Proof. By virtue of assumption (*) and the definition of A_i we have

A⁺(v) = A_i⁺(v)   ∀v ∈ V_i,

A⁻(v) = A_i⁻(v) ∪ A⁻_{i+1}(v) if v ∈ U_i,   A⁻(v) = A_i⁻(v) if v ∈ V_i \ U_i,

from which the assertion follows. ∎



In a network G = (V,A) a path is a finite sequence of distinct arcs such that (except for the last arc) the final node of any arc in the sequence is the initial node of the next arc.
We shall assume that for any node v ∈ V \ S(G) there is at least one path going from a source to v. This amounts to requiring that X(G,d) is nonempty for any vector d of nonnegative demands d(v).
A subgraph T = (V,B) of G is called a spanning forest of G if for any node v ∈ V \ S(G) there is exactly one path in T going from a source to v.
An extreme flow in G is an extreme point of the polyhedron X(G,d).

Proposition VIII.10. A feasible flow x is an extreme flow if and only if there exists a spanning forest T = (V,B) such that {a ∈ A: x(a) > 0} ⊂ B.

We omit the proof, which can be found in textbooks on elementary network flow theory (see also Zangwill (1968)).

Now consider the function

φ(d) = inf {f(x): x ∈ X(G,d)}.

Proposition VIII.11. The function φ(d) is concave on the set of all d = (d(v)), v ∈ V \ S(G), d(v) ≥ 0.

Proof. For d = λd' + (1−λ)d″, 0 ≤ λ ≤ 1, let x̄ ∈ argmin {f(x): x ∈ X(G,d)}. We may assume that x̄ is an extreme flow (because f(x) is concave). Then there exists a spanning forest T = (V,B) such that B ⊃ {a ∈ A: x̄(a) > 0}. Since T is a forest, there exist a unique x' ∈ X(T,d') and a unique x″ ∈ X(T,d″). Clearly λx' + (1−λ)x″ ∈ X(T,d), and hence, by uniqueness, x̄ = λx' + (1−λ)x″. But we have φ(d) = f(x̄) ≥ λf(x') + (1−λ)f(x″) ≥ λφ(d') + (1−λ)φ(d″), which proves the concavity of φ. ∎

Next, consider the functions

F_1(h_1) = inf {∑_{a∈A_1} f_a(x_1(a)): x_1 ∈ X(G_1, d + h_1)},
...
F_{n−1}(h_{n−1}) = inf {F_{n−2}(h_{n−2}) + ∑_{a∈A_{n−1}} f_a(x_{n−1}(a)): x_{n−1} ∈ X(G_{n−1}, d + h_{n−1}), h_{n−2}(u) = ∑_{a∈A⁻_{n−1}(u)} x_{n−1}(a) ∀u ∈ U_{n−2}}.

Proposition VIII.12. The functions F_i(·) are concave on their domains of definition. If φ_i(x_i) denotes the function obtained from F_{i−1}(h_{i−1}) by replacing h_{i−1} by the vector (∑_{a∈A_i⁻(u)} x_i(a), u ∈ U_{i−1}), then φ_i(x_i) is concave.

Proof. The concavity of F_1(·) follows from Proposition VIII.10. Since the mapping x_2 → (∑_{a∈A_2⁻(u)} x_2(a), u ∈ U_1) is linear in x_2, and since F_1(·) is concave, we deduce that φ_2(x_2) is concave. But

F_2(h_2) = inf {φ_2(x_2) + ∑_{a∈A_2} f_a(x_2(a)): x_2 ∈ X(G_2, d + h_2)}.

Therefore, again using Proposition VIII.10, we see that F_2(·) and φ_3(·) are concave. The proof can be completed by induction. ∎



With the above background, we can describe the basic idea of the decomposition method proposed in Thach (1988).
Consider the subproblem

(P_n)  minimize F_{n−1}(h_{n−1}) + ∑_{a∈A_n} f_a(x_n(a))
       s.t. x_n ∈ X(G_n, d), h_{n−1}(u) = ∑_{a∈A_n⁻(u)} x_n(a) ∀u ∈ U_{n−1}.

Let (x̄_n, h̄_{n−1}) denote an optimal solution of (P_n), and consider the next subproblem (P_{n−1}).

Continuing in this way, we can successively define the subproblems P_{n−2},...,P_2, where P_i is the problem

(P_i)  minimize F_{i−1}(h_{i−1}) + ∑_{a∈A_i} f_a(x_i(a))
       s.t. x_i ∈ X(G_i, d + h̄_i), h_{i−1}(u) = ∑_{a∈A_i⁻(u)} x_i(a) ∀u ∈ U_{i−1},

in which h̄_i is obtained from an optimal solution (x̄_{i+1}, h̄_i) of P_{i+1}.
Finally, let x̄_1 be an optimal solution of the subproblem

(P_1)  minimize ∑_{a∈A_1} f_a(x_1(a))  s.t. x_1 ∈ X(G_1, d + h̄_1).

Note that, if we agree to set φ_1(·) ≡ 0, then by Proposition VIII.12 each problem (P_i) is equivalent to minimizing the concave function φ_i(x_i) + ∑_{a∈A_i} f_a(x_i(a)) subject to x_i ∈ X(G_i, d + h̄_i).

Theorem VIII.6. Let α be the optimal value of the objective function in (P_n). Then α is the optimal value of the objective function in (P), and x̄ = (x̄_1,...,x̄_n), with x̄_i (i=1,...,n) as defined above, is an optimal solution of (P).

Proof. Replacing the functions F_i (i=n−1,...,1) in the problems (P_i) (i=n−1,...,1) by their defining expressions, we can easily see that α is the optimal value of the objective function and x̄ is a minimizer of the function ∑_{a∈A} f_a(x(a)) over the domain described in Proposition VIII.9. From Proposition VIII.9 it follows that this is just problem (P). ∎



Thus, to obtain an optimal flow x̄, it suffices to solve (P_n), (P_{n−1}),...,(P_1) successively (note that by solving (P_i) we obtain (x̄_i, h̄_{i−1}), and then h̄_{i−1} is used to define P_{i−1}). Since each (P_i) is a concave minimization over X(G_i, d + h̄_i), the original problem (P) decomposes into n subproblems of the same type but much smaller size. We next discuss how to solve these subproblems.

5.3. Decomposition Method for (SUCF)

When solving the subproblems (P_i), the difficulty is that the functions F_{i−1}(h_{i−1}) which occur in the objective functions of these subproblems are defined only implicitly. One way to overcome this difficulty is as follows: first approximate the functions F_i(·) (which are nonnegative and concave by Proposition VIII.11) by certain polyhedral concave underestimators ψ_i^0, and for i=n,n−1,...,1 solve the approximate subproblems (P_i^0) obtained from (P_i) by substituting ψ_{i−1}^0 for F_{i−1}. Then use the solutions of (P_n^0),...,(P_1^0) to define functions ψ_i^1 that are better approximations to F_i than ψ_i^0, and solve the new approximate subproblems (P_n^1),...,(P_1^1), and so on. It turns out that with an appropriate choice of the approximating functions ψ_i^0, ψ_i^1,...,ψ_i^k,..., this iterative procedure will generate an optimal flow x̄ after finitely many iterations.
Recall that for any real valued function ψ(h) defined on a set dom ψ, the hypograph of ψ(h) is the set

hypo ψ = {(h,t) ∈ dom ψ × ℝ: t ≤ ψ(h)}.


Clearly, a concave function is determined by its hypograph.
Denote by ℝ^U the space of all vectors h = (h(u), u ∈ U) (i.e., all functions h: U → ℝ), and denote by ℝ₊^U the set of all h ∈ ℝ^U such that h(u) ≥ 0 ∀u ∈ U.

Algorithm VIII.9 (Decomposition Algorithm for SUCF)

Initialization:
For each i=1,...,n−1 let ψ_i^0 ≡ 0. Set k = 0.

Iteration k=0,1,2,...:
This iteration is entered with knowledge of ψ_i^k, i=1,...,n−1. Define h_n^k ≡ 0, ψ_0^k ≡ 0. Set i=n.

k.1. Solve

(P_i^k)  minimize ψ_{i−1}^k(h_{i−1}) + ∑_{a∈A_i} f_a(x_i(a))
         s.t. x_i ∈ X(G_i, d + h_i^k), h_{i−1}(u) = ∑_{a∈A_i⁻(u)} x_i(a) ∀u ∈ U_{i−1},

obtaining an optimal solution (x_i^k, h_{i−1}^k) and the optimal value t_i^k of (P_i^k).
If i ≥ 2, set i ← i−1 and return to k.1. Otherwise, go to k.2.

k.2. If t_i^k ≤ ψ_i^k(h_i^k) for all i=1,...,n−1, then stop: x^k = (x_1^k,...,x_n^k) is an optimal flow. Otherwise, go to k.3.

k.3. Construct a new concave underestimator ψ_i^{k+1} for each F_i such that the hypograph of ψ_i^{k+1} is the convex hull of the set obtained by adjoining the point (h_i^k, t_i^k) to hypo ψ_i^k, i.e.,

hypo ψ_i^{k+1} = conv (hypo ψ_i^k ∪ {(h_i^k, t_i^k)}).   (54)

Go to iteration k+1.

Remarks VIII.5. Before discussing the convergence of the algorithm, we make some remarks on how to construct the functions ψ_i^{k+1} and how to solve the subproblems (P_i^k).

Construction of the functions ψ_i^{k+1} (Step k.3):

In view of (54) and the relation ψ_i^0 ≡ 0 we can write

hypo ψ_i^{k+1} = conv {hypo ψ_i^0, (h_i^j, t_i^j) j=0,...,k}
              = conv {ℝ₊^{U_i} × ℝ₋, (h_i^j, t_i^j) j=0,...,k}.

Thus, for any h_i ∈ ℝ₊^{U_i} we have

ψ_i^{k+1}(h_i) = sup {t: (h_i,t) ∈ hypo ψ_i^{k+1}}
= sup {∑_{j=0}^k s_j t_i^j: ∑_{j=0}^k s_j h_i^j(u) ≤ h_i(u) ∀u ∈ U_i, ∑_{j=0}^k s_j ≤ 1, s_j ≥ 0 ∀j=0,...,k}.   (55)

For a given h_i ∈ ℝ₊^{U_i}, the value of ψ_i^{k+1}(h_i) is thus equal to the optimal value of the linear program (55).

On the other hand, since hypo ψ_i^{k+1} is a polyhedral convex set, ψ_i^{k+1} can be expressed as a pointwise minimum of affine functions. The graphs of these affine functions are hyperplanes through the nonvertical facets of hypo ψ_i^{k+1}. Starting with the fact that ψ_i^0 has a unique nonvertical facet, namely ℝ^{U_i} × {0}, and using formula (54), we can inductively determine the nonvertical facets of ψ_i^1, ψ_i^2,...,ψ_i^{k+1} by the polyhedral annexation technique (see Section VI.4.3). In the present context this procedure works in the following way.
Consider the dual problem of (55):

minimize ∑_{u∈U_i} r(u)h_i(u) + t   (56)

s.t. ∑_{u∈U_i} r(u)h_i^j(u) + t ≥ t_i^j   (j=0,1,...,k),   (57)

r ∈ ℝ₊^{U_i}, t ≥ 0.   (58)

Since ψ_i^{k+1}(h_i) is equal to the optimal value in (55), it is also equal to the optimal value in (56)-(58). Furthermore, since ψ_i^{k+1}(h_i) is finite, the optimal value in (56)-(58) must be attained at at least one vertex of the polyhedral convex set Z_i^k defined by (57) and (58). Let E_i^k denote the vertex set of Z_i^k. Then E_i^k is finite, and we have

ψ_i^{k+1}(h_i) = min {∑_{u∈U_i} r(u)h_i(u) + t: (r,t) ∈ E_i^k}.   (59)

Therefore, in order to obtain an explicit expression for ψ_i^{k+1} we compute the vertex set E_i^k of Z_i^k. Clearly, E_i^0 = {0} for all i=1,...,n−1, and Z_i^{k+1} differs from Z_i^k by just one additional linear constraint. Hence, we can compute the vertex set of Z_i^{k+1} from that of Z_i^k by the methods discussed in Section II.4.2. Since Z_i^k ⊂ ℝ^{U_i} × ℝ, the above procedure is practical if |U_i| is relatively small.
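Alternatively to maintaining E_i^k, the value ψ_i^{k+1}(h_i) can be read off the linear program (55) directly; a sketch using scipy (array conventions are ours):

import numpy as np
from scipy.optimize import linprog

def psi_value(H, t, h):
    # LP (55): maximize t @ s subject to H.T @ s <= h, sum(s) <= 1, s >= 0,
    # where the rows of H are the points h_i^0,...,h_i^k and t holds t_i^j.
    m = H.shape[0]
    A_ub = np.vstack([H.T, np.ones((1, m))])
    b_ub = np.concatenate([h, [1.0]])
    res = linprog(-t, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * m)
    return -res.fun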

Solving the subproblems (P_i^k) (Step k.1):

From (59) it is easily seen that (P_i^k) can be rewritten as

min {t + min P_i^k(r): (r,t) ∈ E_{i−1}^{k−1}},

where min P_i^k(r) denotes the optimal value of the problem

(P_i^k(r))  minimize ∑_{u∈U_{i−1}} r(u)h_{i−1}(u) + ∑_{a∈A_i} f_a(x_i(a))
            s.t. x_i ∈ X(G_i, d + h_i^k), h_{i−1}(u) = ∑_{a∈A_i⁻(u)} x_i(a) ∀u ∈ U_{i−1}.

Recalling now that G_i = (W_i, A_i) and using an artificial node w, let us define the network Ḡ_i = (W̄_i, Ā_i), where W̄_i = W_i ∪ {w}, Ā_i = A_i ∪ {(w,u): u ∈ U_{i−1}}, and to each arc a = (w,u) (u ∈ U_{i−1}) one assigns the linear cost function f_a(x) = r(u)x. Then P_i^k(r) becomes an uncapacitated minimum concave cost flow problem on Ḡ_i which can be solved by several currently available algorithms (for example, Algorithm VIII.8). Note that for r' ≠ r the problem P_i^k(r') differs from P_i^k(r) only in the linear cost functions on the artificial arcs a = (w,u) (u ∈ U_{i−1}).
Moreover, it follows from the above discussion that each (P_i^k) has an optimal solution (x_i^k, h_{i−1}^k) such that x_i^k is an extreme flow in G_i. Hence, we may assume that for every k, x^k = (x_1^k,...,x_n^k) is an extreme flow in G.

In order to show convergence of the above algorithm, we first formulate the following propositions.

Proposition VIII.13. For any i=1,...,n−1 and any k=0,1,2,... we have ψ_i^k(h_i) ≤ F_i(h_i) ∀h_i ∈ ℝ₊^{U_i} (i.e., ψ_i^k is actually an underestimator of F_i).

Proof. Since F_i is nonnegative on ℝ₊^{U_i} (i=1,...,n−1), it is immediate that

ψ_i^0(h_i) ≤ F_i(h_i) ∀h_i ∈ ℝ₊^{U_i} (i=1,...,n−1).   (60)

Now, for any k we have

t_1^k = inf {∑_{a∈A_1} f_a(x_1(a)): x_1 ∈ X(G_1, d + h_1^k)} = F_1(h_1^k),

i.e., (h_1^k, t_1^k) ∈ hypo F_1. Furthermore, since ψ_1^0(h_1) ≤ F_1(h_1), we have hypo ψ_1^0 ⊂ hypo F_1. Therefore, by (54), hypo ψ_1^k ⊂ hypo F_1 ∀k, or, equivalently, ψ_1^k(h_1) ≤ F_1(h_1) ∀k. Since

t_2^k = inf {ψ_1^k(h_1) + ∑_{a∈A_2} f_a(x_2(a)): x_2 ∈ X(G_2, d + h_2^k), h_1(u) = ∑_{a∈A_2⁻(u)} x_2(a) ∀u ∈ U_1}
     ≤ inf {F_1(h_1) + ∑_{a∈A_2} f_a(x_2(a)): x_2 ∈ X(G_2, d + h_2^k), h_1(u) = ∑_{a∈A_2⁻(u)} x_2(a) ∀u ∈ U_1} = F_2(h_2^k),

i.e., (h_2^k, t_2^k) ∈ hypo F_2, and hypo ψ_2^0 ⊂ hypo F_2, it follows from (54) that ψ_2^k(h_2) ≤ F_2(h_2) for all k.
By the same argument we have ψ_i^k(h_i) ≤ F_i(h_i) for all k and all i=3,...,n−1. ∎


Proposition VIII.14. If at some iteration k

t_i^k ≤ ψ_i^k(h_i^k)   ∀i=1,...,n−1,   (61)

or, equivalently,

ψ_i^k = ψ_i^{k+1}   ∀i=1,...,n−1,   (62)

then x^k = (x_1^k,...,x_n^k) is an optimal flow.

Proof. First we prove that if (61) (or (62)) holds, then

ψ_i^k(h_i^k) = F_i(h_i^k)   (63)

for all i=1,...,n−1. Indeed, for i=1 it is obvious that

ψ_1^k(h_1^k) = ψ_1^{k+1}(h_1^k) ≥ t_1^k = inf (P_1^k) = inf {∑_{a∈A_1} f_a(x_1(a)): x_1 ∈ X(G_1, d + h_1^k)} = F_1(h_1^k).

In view of Proposition VIII.13, we then deduce that ψ_1^k(h_1^k) = F_1(h_1^k).
Now, assuming that (63) is true for i−1, let us prove (63) for i. We have

F_i(h_i^k) ≥ ψ_i^k(h_i^k) = ψ_i^{k+1}(h_i^k) ≥ t_i^k = inf (P_i^k)
= ψ_{i−1}^k(h_{i−1}^k) + ∑_{a∈A_i} f_a(x_i^k(a)) = F_{i−1}(h_{i−1}^k) + ∑_{a∈A_i} f_a(x_i^k(a)) ≥ F_i(h_i^k),

where the last two relations use (63) for i−1 and the definition of F_i. Hence equality holds throughout, and (63) holds for all i=1,...,n−1.
To complete the proof it remains to show that (x_i^k, h_{i−1}^k) is an optimal solution of (P_i) with h̄_i = h_i^k (where we agree to set h_0^k = 0 and F_0 ≡ 0). We have

F_{i−1}(h_{i−1}^k) + ∑_{a∈A_i} f_a(x_i^k(a)) = ψ_{i−1}^k(h_{i−1}^k) + ∑_{a∈A_i} f_a(x_i^k(a))
= inf {ψ_{i−1}^k(h_{i−1}) + ∑_{a∈A_i} f_a(x_i(a)): x_i ∈ X(G_i, d + h_i^k), h_{i−1}(u) = ∑_{a∈A_i⁻(u)} x_i(a) ∀u ∈ U_{i−1}}
≤ inf {F_{i−1}(h_{i−1}) + ∑_{a∈A_i} f_a(x_i(a)): x_i ∈ X(G_i, d + h_i^k), h_{i−1}(u) = ∑_{a∈A_i⁻(u)} x_i(a) ∀u ∈ U_{i−1}},

i.e., (x_i^k, h_{i−1}^k) is an optimal solution of (P_i).
Finally, from Theorem VIII.6 it follows that x^k = (x_1^k,...,x_n^k) is an optimal solution of (P). ∎

Theorem VIII.7. Algorithm VIII.9 terminates after finitely many iterations at an optimal solution of (SUCF).

Proof. We first show that for any fixed i ∈ {0,1,...,n−1} there is a finite collection 𝓗_i of functions such that ψ_i^k ∈ 𝓗_i for all k=0,1,2,.... Indeed, this is obvious for i=0, since ψ_0^k ≡ 0 ∀k. Arguing by induction, suppose that the claim is true for i = p−1 (p ≥ 1), and consider the case i = p. For k = 0 we have ψ_p^0 ≡ 0, while for k ≥ 1

hypo ψ_p^k = conv (hypo ψ_p^{k−1} ∪ {(h_p^{k−1}, t_p^{k−1})}),   (64)

where

t_p^{k−1} = ψ_{p−1}^{k−1}(h_{p−1}^{k−1}) + ∑_{a∈A_p} f_a(x_p^{k−1}(a)).   (65)

Since every x^{k−1} = (x_1^{k−1},...,x_n^{k−1}) is an extreme flow in G and the number of extreme flows is finite, x_p^{k−1} must belong to some finite set X_p. Moreover, since the quantities h_i^{k−1} (i=1,...,n−1) are uniquely determined by x^{k−1}, they must belong to certain finite sets H_i (i=1,...,n−1). Therefore,

h_{p−1}^{k−1} ∈ H_{p−1}, x_p^{k−1} ∈ X_p,

and since ψ_{p−1}^{k−1} ∈ 𝓗_{p−1} and 𝓗_{p−1} is finite by assumption, it follows from (65) that t_p^{k−1} belongs to some finite set T_p ⊂ ℝ₊. But then

(h_p^{k−1}, t_p^{k−1}) ∈ H_p × T_p.

By virtue of (64), this implies that the hypograph of any ψ_p^k is the convex hull of the union of the hypograph of ψ_p^0 and some subset of the finite set H_p × T_p. Hence, ψ_p^k itself belongs to some finite family 𝓗_p.
We have thus proved that for any i=0,1,...,n−1, 𝓗_i is finite. Since for any k

hypo ψ_i^k ⊂ hypo ψ_i^{k+1}   (i=0,1,...,n−1),

and both ψ_i^k and ψ_i^{k+1} belong to the finite set 𝓗_i, there must exist a k such that

ψ_i^k = ψ_i^{k+1}   (i=0,1,...,n−1).

In other words, the algorithm must stop at some iteration k. By Proposition VIII.14, x^k is then an optimal flow. ∎



Example VIII.3. Consider the network G in Fig. VIII.2, with cost functions of the form

f_a(t) = c(a)t + b(a)δ(t),

where δ(t) = 1 if t > 0, δ(t) = 0 if t = 0 (i.e., b(a) is a fixed cost).
The data of the problem are as follows:

Node v:           2     3     4    5    6    7    9     10    11   12
Demand d(v):      3    29.2  1.5  2.5   0    0   20.3  39.5  1.5  30.5

Arc a:              1     2     3     4     5     6     7     8
Fixed cost b(a):  25.1  15.7  15.7  14.9  14.9  14.8  30    29
c(a):              3.2   3.3   3.3   2.8   2.8   2.7   5.5   5.4

Arc a:              9    10    11    12    13    14    15    16
Fixed cost b(a):  15.7  15.7  15.5  15.7  50.5  41.5  55    41.5
c(a):              3.3   3.3   3.2   3.3   9.5   5.7   8.5   5.7

Arc a:             17    18    19
Fixed cost b(a):  15.7  15.7  41.5
c(a):              3.3   3.3   5.7


Fig. VIII.2

Clearly, the set of nodes V can be partitioned into two subsets V_1 = {1,2,3,4,5,6,7} and V_2 = {8,9,10,11,12} satisfying assumption (*).
Applying the above method we obtain the following results (U_1 = {6,7}).

Iteration 0:
(Here we write x_j^0 rather than x^0(j) to denote the value of the flow x^0 in arc j.)

0.1: Solving (P_2^0):
x_11^0 = 59.8, x_12^0 = 0, x_13^0 = 0, x_14^0 = 0, x_15^0 = 0, x_16^0 = 32, x_17^0 = 1.5, x_18^0 = 0, x_19^0 = 39.5.
Solving (P_1^0):
x_1^0 = 68.2, x_2^0 = 30.7, x_3^0 = 1.5, x_4^0 = 0, x_5^0 = 2.5, x_6^0 = 0, x_7^0 = 32, x_8^0 = 0, x_9^0 = 0, x_10^0 = 0.
Optimal value of (P_1^0): t_1^0 = 599.3.

0.2: t_1^0 = 599.3 > 0 = ψ_1^0(h_1^0) (stopping criterion not satisfied).

0.3: (From step 0.1 we have h_1^0(6) = 32, h_1^0(7) = 0.)
hypo ψ_1^1 = conv {ℝ₊^2 × ℝ₋, (32,0;599.3)} = {(θ_1,θ_2,t): t ≤ min (599.3, 18.728 θ_1), θ_1 ≥ 0, θ_2 ≥ 0}.
Hence, ψ_1^1(h_1) = ψ_1^1(h_1(6), h_1(7)) = min {599.3, 18.728 h_1(6)}.

Iteration 1:
1.1: Solving (P_2^1) and (P_1^1), we obtain x^1 = x^0, t_1^1 = 599.3.
1.2: t_1^1 = 599.3 = ψ_1^1(h_1^1): stop. Optimal value: 1317.36.

In this example the total number N of extreme flows is 162. Further numerical results are reported in Thach (1988).

5.4. Extension

If we set X_i(h_i) := X(G_i, d + h_i) (i=1,...,n), H_i(x_{i+1}) := (∑_{a∈A⁻_{i+1}(u)} x_{i+1}(a), u ∈ U_i) (i=1,...,n−1), f_i(x_i) := ∑_{a∈A_i} f_a(x_i(a)) (i=1,...,n), then it is easily seen that (SUCF) is a special case of the following more general problem:

(P)  minimize ∑_{i=1}^n f_i(x_i)   (66)
     s.t. x_i ∈ X_i(h_i) (i=1,...,n),   (67)
     h_i = H_i(x_{i+1}) (i=1,...,n−1),   (68)
     h_n = 0.   (69)

Here, x_i ∈ ℝ₊^{m_i}, h_i ∈ ℝ₊^{k_i}, and it is assumed that:

(i) each f_i(·) is a concave function on ℝ₊^{m_i};

(ii) each X_i(h_i) is a convex polyhedron, and the point-to-set mapping h_i ↦ X_i(h_i) is affine, i.e.,

X_i(λh_i' + (1−λ)h_i″) = λX_i(h_i') + (1−λ)X_i(h_i″)

for any h_i', h_i″ ∈ ℝ₊^{k_i} and 0 ≤ λ ≤ 1;

(iii) each H_i: ℝ₊^{m_{i+1}} → ℝ₊^{k_i} is a linear mapping.

It is obvious that (SUCF) satisfies (i) and (iii). To see that (SUCF) also satisfies (ii), it suffices to show that any extreme point x_i of X_i(λh_i' + (1−λ)h_i″) = X(G_i, d + λh_i' + (1−λ)h_i″) is of the form x_i = λx_i' + (1−λ)x_i″ with x_i' ∈ X_i(h_i'), x_i″ ∈ X_i(h_i″). But, since x_i is an extreme flow in G_i = (W_i, A_i), there exists (by Proposition VIII.10) a spanning forest T_i = (W_i, B) such that {a ∈ A_i: x_i(a) > 0} ⊂ B. Let x_i', x_i″ be the feasible flows in T_i = (W_i, B) for the demands d + h_i' and d + h_i″, respectively. Then x_i' ∈ X_i(h_i'), x_i″ ∈ X_i(h_i″), and hence λx_i' + (1−λ)x_i″ ∈ X_i(λh_i' + (1−λ)h_i″). Since there is a unique feasible flow in T_i for the demand d + λh_i' + (1−λ)h_i″, we conclude that x_i = λx_i' + (1−λ)x_i″.

Note that several problems of practical interest can be formulated as special cases of problem (P). Consider for example the following concave cost production and inventory model (see, e.g., Zangwill (1968)):

minimize ∑_{i=1}^n (p_i(y_i) + q_i(h_i))   (70)
s.t. h_{i−1} + y_i = h_i + d_i (i=1,...,n),   (71)
y_i ≥ 0, h_i ≥ 0 (i=1,...,n).   (72)

Here d_i > 0 is the given market demand for a product in period i, y_i is the amount to be produced, and h_i is the inventory in that period. The function p_i(y_i) is the production cost, and q_i(h_i) is the inventory holding cost in period i (where both p_i(·) and q_i(·) are assumed to be concave functions).
Setting x_i = (y_i, h_{i−1}) (i=1,...,n), and

x_i ∈ X_i(h_i) ⟺ x_i = (y_i, h_{i−1}), h_{i−1} + y_i = h_i + d_i,
f_i(x_i) = p_i(y_i) + q_i(h_i),

we see that we have a special case of problem (P).

A feasible solution x = (x_1,...,x_n) for (P) can be generated in the following way: first, choose x_n ∈ X_n(0) and compute h_{n−1} = H_{n−1}(x_n); then choose x_{n−1} ∈ X_{n−1}(h_{n−1}) and compute h_{n−2} = H_{n−2}(x_{n−1}), and so on; finally, choose x_1 ∈ X_1(h_1).
Therefore, the problem can be considered as a multistage decision process. This suggests decomposing (P) into a sequence of smaller problems which correspond to different stages of the process.
For each j=1,...,n consider the problem

(P_j(h_j))  minimize ∑_{i=1}^j f_i(x_i)   (73)
            s.t. x_i ∈ X_i(h_i) (i=1,...,j),   (74)
            h_i = H_i(x_{i+1}) (i=1,...,j−1),   (75)
            h_j given.   (76)

Let F_j(h_j) denote the optimal value of P_j(h_j) as a function of h_j.

Proposition VIII.15. The function F_j(·) is concave.

Proof. Denote by Ω_j(h_j) the set of all (x_1,...,x_j) that are feasible for P_j(h_j), i.e., for each (x_1,...,x_j) ∈ Ω_j(h_j) there exist h_1,...,h_{j−1} satisfying (74), (75). By induction, it can easily be shown that the point-to-set mapping h_j ↦ Ω_j(h_j) is affine.
Now let h_j = λh_j' + (1−λ)h_j″ (0 ≤ λ ≤ 1), and let (x_1,...,x_j) ∈ Ω_j(h_j) satisfy (x_1,...,x_j) = λ(x_1',...,x_j') + (1−λ)(x_1″,...,x_j″) with (x_1',...,x_j') ∈ Ω_j(h_j') and (x_1″,...,x_j″) ∈ Ω_j(h_j″). Since f_i(·) is concave, we have f_i(x_i) ≥ λf_i(x_i') + (1−λ)f_i(x_i″). Hence, if (x_1,...,x_j) is an optimal solution of P_j(h_j), then F_j(h_j) = ∑_{i=1}^j f_i(x_i) ≥ λ ∑_{i=1}^j f_i(x_i') + (1−λ) ∑_{i=1}^j f_i(x_i″) ≥ λF_j(h_j') + (1−λ)F_j(h_j″). This proves the concavity of F_j(·). ∎

Obviously the optimal value of (P) is simply F_n(0), and we can write the recursive equations:

F_j(h_j) = min {F_{j−1}(h_{j−1}) + f_j(x_j): h_{j−1} = H_{j−1}(x_j), x_j ∈ X_j(h_j)}   (j=2,...,n).

Because of Proposition VIII.15, these subproblems are concave minimization problems under linear constraints. Hence, in these problems one can replace the polyhedra X_j(h_j) by their vertex sets. Since the latter sets are finite, one can compute the function F_1(h_1) (i.e., the tableau of its values corresponding to different possible values of h_1). Then, using the recursive equation, one can find F_2(h_2) from F_1(h_1),..., and finally F_n(0) from F_{n−1}(h_{n−1}). However, this method is practical only if n and the vertex sets of X_j(h_j) are small.
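When the inventory levels can be restricted to a finite grid (e.g., integer demands), the recursion is a plain dynamic program; a small sketch for the model (70)-(72), where h_max, an a priori bound on the inventories, is our assumption:

def solve_inventory(d, p, q, h_max):
    # F_i(h) = min { F_{i-1}(h') + p_i(y) + q_i(h) : h' + y = h + d_i, y >= 0 }.
    INF = float("inf")
    F = {0: 0.0}                       # F_0: zero initial inventory
    for i in range(len(d)):
        F_new = {}
        for h in range(h_max + 1):     # inventory at the end of period i
            best = INF
            for h_prev, val in F.items():
                y = h + d[i] - h_prev  # production dictated by (71)
                if y >= 0:
                    best = min(best, val + p[i](y) + q[i](h))
            F_new[h] = best
        F = F_new
    return F[0]                        # final inventory h_n = 0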

Under more general conditions it is easy to see that the same method that was developed in Section VIII.5.3 for the (SUCF) problem can be applied to solve (P), provided that the dimensions of the variables h_i are relatively small (but the dimensions of the variables x_i may be fairly large, as in (SUCF)).
CHAPTER IX

SPECIAL PROBLEMS OF CONCAVE MINIMIZATION

Many nonconvex optimization problems can be reduced to concave minimization problems of a special form and can be solved by specialized concave minimization methods. In this chapter we shall study some of the most important examples of these problems. They include bilinear programming, complementarity problems and certain parametric concave minimization problems. An important subclass of parametric concave minimization which we will study is linear programming subject to an additional reverse convex constraint.

1. BILINEAR PROGRAMMING

A number of situations in engineering design, economic management, operations research (e.g., constrained bimatrix games (Mangasarian (1964)), three dimensional assignment (Frieze (1974)), multicommodity network flow, production scheduling, rectilinear distance location-allocation; cf. Konno (1971a), Soland (1974), Sherali and Shetty (1980a), Almeddine (1990), Benett and Mangasarian (1992), Sherali and Almeddine (1992), Benson (1995), etc.) can be modeled by the following general mathematical formulation, often called the bilinear programming problem:

(BLP)  minimize F(x,y) := px + y(Cx) + qy,
       subject to x ∈ X, y ∈ Y,

where X, Y are nonempty polyhedra given by

X = {x ∈ ℝ^n: Ax ≤ a, x ≥ 0},  Y = {y ∈ ℝ^{n'}: By ≤ b, y ≥ 0},

with a ∈ ℝ^m, b ∈ ℝ^{m'}, p ∈ ℝ^n, q ∈ ℝ^{n'}, and C, A, B matrices of dimension n'×n, m×n, m'×n', respectively. Let V(X) and V(Y) denote the vertex sets of X and Y, respectively. A number of further applications can be found in Almeddine (1990), Benett and Mangasarian (1992), Sherali and Almeddine (1992), Benson (1995).
Problem (BLP) has been extensively studied in the literature for more than twenty years (e.g., Mangasarian (1964), Mangasarian and Stone (1964), Altman (1968), Konno (1976), Vaish and Shetty (1976 and 1977), Gallo and Ülkücü (1977), Mukhamediev (1982), Sherali and Shetty (1980), Thieu (1980 and 1988), Czochralska (1982 and 1982a), Al-Khayyal (1986), Sherali and Almeddine (1992)). We shall focus on methods that are directly related to concave minimization.

1.1. Basic Properties

The key property which is exploited in most methods for bilinear programming is the equivalence of (BLP) with a polyhedral concave minimization problem (see Section I.2.4). Recall that the problem (BLP) can be rewritten as

min {f(x): x ∈ X},   (1)

where

f(x) = inf {F(x,y): y ∈ Y} = px + inf {(q + Cx)y: y ∈ Y}.   (2)

If Y has at least one vertex, and if we denote the vertex set of Y by V(Y), then the hypograph P of f is a polyhedron

P = {(x,t) ∈ ℝ^n × ℝ: px + (q + Cx)y ≥ t ∀y ∈ V(Y)}.   (3)

Hence, f(x) is a polyhedral concave function. In particular, dom f := {x ∈ ℝ^n: f(x) > −∞} is a polyhedron (note that dom f = ℝ^n if Y is bounded). Moreover, the function f(x) is piecewise affine on dom f.
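Note that even evaluating f at a single point x costs one linear program; a sketch with scipy, in the notation above (the status code 3 signals an unbounded inner LP, i.e., f(x) = −∞):

import numpy as np
from scipy.optimize import linprog

def f_value(x, p, q, C, B, b):
    # f(x) = p @ x + min { (q + C @ x) @ y : B @ y <= b, y >= 0 }.
    res = linprog(q + C @ x, A_ub=B, b_ub=b, bounds=[(0, None)] * B.shape[1])
    if res.status == 3:           # inner LP unbounded below: x outside dom f
        return float("-inf")
    return p @ x + res.fun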


As a consequence of this equivalence, any method of concave minimization, when specialized to problem (1), will produce an algorithm for solving problem (BLP). It is the aim of this section to examine some of the methods for which this specialization is not trivial. But before turning to the methods, let us mention some general properties of the optimal solutions of (BLP).

Proposition IX.1. If problem (BLP) has a finite optimal value (e.g., if X and Y are bounded), then an optimal solution (x̄,ȳ) exists such that x̄ ∈ V(X), ȳ ∈ V(Y).

Proof. Indeed, the infimum of f(x) over X, if it is finite, must be attained at some x̄ ∈ V(X). Furthermore, from (3) we see that f(x̄) = F(x̄,ȳ) for some ȳ ∈ V(Y) (cf. Theorem I.3). ∎

Of course, by interchanging the roles of x and y, we obtain another equivalent form of (BLP):

min {g(y): y ∈ Y},  where g(y) = qy + inf {(p + C^T y)x: x ∈ X}.

In view of this symmetric structure, it is clear that a necessary condition for a pair (x̄,ȳ) with x̄ ∈ V(X), ȳ ∈ V(Y) to be an optimal solution of the problem is that

min {F(x,ȳ): x ∈ X} = F(x̄,ȳ) = min {F(x̄,y): y ∈ Y}.   (4)

However, this condition is not sufficient. We only have:

Proposition IX.2. Let (x̄,ȳ) satisfy (4). If ȳ = argmin {F(x̄,y): y ∈ Y} (i.e., ȳ is the unique minimizer of F(x̄,·) over Y), then x̄ is a local optimal solution of (1).

Proof. By hypothesis, F(x̄,ȳ) < F(x̄,y) for all y ∈ Y satisfying y ≠ ȳ. Therefore, by continuity, for each y ∈ Y, y ≠ ȳ, there exists an open neighbourhood U_y of x̄ satisfying F(x̄,ȳ) < F(x,y) for all x ∈ U_y. Let U = ∩ {U_y: y ∈ V(Y), y ≠ ȳ}. Then for all x ∈ U we have

F(x̄,ȳ) < F(x,y)   ∀y ∈ V(Y), y ≠ ȳ.

But F(x̄,ȳ) = min {F(x,ȳ): x ∈ X}. Hence, for all x ∈ U:

F(x̄,ȳ) ≤ min {F(x,y): y ∈ V(Y)} = min {F(x,y): y ∈ Y},

i.e., f(x̄) ≤ f(x). ∎


To find a pair (x̄,ȳ) satisfying (4), one can use the following "mountain climbing" procedure (e.g., Konno (1976)).
Assume that X, Y are bounded. Let x^0 ∈ V(X). Set h = 0.

1) Solve the linear program min {(q + Cx^h)y: y ∈ Y} to obtain a vertex y^h of Y such that F(x^h,y^h) = min {F(x^h,y): y ∈ Y}.

2) Solve the linear program min {(p + C^T y^h)x: x ∈ X} to obtain a vertex x^{h+1} of X such that F(x^{h+1},y^h) = min {F(x,y^h): x ∈ X}.
If F(x^{h+1},y^h) = F(x^h,y^h), then stop. Otherwise, set h ← h+1 and go to 1).

In view of the finiteness of the set V(X)×V(Y), the situation F(x^{h+1},y^h) < F(x^h,y^h) cannot occur infinitely many times. Therefore, the above procedure must terminate after finitely many steps with a pair (x^h,y^h) such that min {F(x^h,y): y ∈ Y} = F(x^h,y^h) = F(x^{h+1},y^h) = min {F(x,y^h): x ∈ X}.

1.2. Cutting Plane Method

As seen above, problem (BLP) is equivalent to each of the following concave minimization problems:

min {f(x): x ∈ X},  f(x) = inf {F(x,y): y ∈ Y},   (5)

min {g(y): y ∈ Y},  g(y) = inf {F(x,y): x ∈ X}.   (6)

To solve (BLP) we can specialize Algorithm V.1 to either of these problems. However, if we use the symmetric structure of the bilinear programming problem, a more efficient method might be to alternate the cutting process for problem (5) with the cutting process for problem (6) in such a way that one uses the information obtained in the course of one process to speed up the other.
Specifically, consider two polyhedra X_0 ⊂ X, Y_0 ⊂ Y with vertex sets V(X_0), V(Y_0), respectively, and let

f_0(x) = min {F(x,y): y ∈ Y_0},  g_0(y) = min {F(x,y): x ∈ X_0}.

By the "mountain climbing" procedure we can find a pair (x^0,y^0) ∈ V(X_0)×V(Y_0) such that

f_0(x^0) = F(x^0,y^0) = g_0(y^0).   (7)

Let ε > 0 be a tolerance number, α = f_0(x^0) − ε.
Assuming x^0 to be a nondegenerate vertex of X_0, denote by d^{0j} (j=1,...,n) the directions of the n edges of X_0 emanating from x^0. Then, as shown in Section V.1, an
α-valid cut for the concave program min f_0(X_0) is given by a vector π_{X_0}(Y_0) such that

θ_j = sup {λ ≥ 0: f_0(x^0 + λd^{0j}) ≥ α}  (j=1,...,n),   (8)

π_{X_0}(Y_0) d^{0j} = 1/θ_j  (j=1,...,n).   (9)

Therefore, setting

Δ_{X_0}(Y_0) = {x: π_{X_0}(Y_0)(x − x^0) ≤ 1},   (10)

we have f_0(x) ≥ α for all x ∈ Δ_{X_0}(Y_0), i.e., any candidate (x,y) ∈ X_0×Y_0 with F(x,y) < α must lie in the region X_1×Y_0, where X_1 = X_0 \ Δ_{X_0}(Y_0).
Thus, if X_1 = ∅, then

α ≤ min {F(x,y): x ∈ X_0, y ∈ Y_0}.

Otherwise, X_1 ≠ ∅, and we can consider the problem

min {F(x,y): x ∈ X_1, y ∈ Y_0}.

In an analogous manner, but interchanging the roles of x and y, we can construct a vector π_{Y_0}(X_1) which determines an α-valid cut for the concave program min g_1(Y_0), where g_1(y) = min {F(x,y): x ∈ X_1}. (Note that this is possible, provided y^0 is a nondegenerate vertex of Y_0, because α = g_0(y^0) − ε ≥ g_1(y^0) − ε.) Setting

Δ_{Y_0}(X_1) = {y: π_{Y_0}(X_1)(y − y^0) ≤ 1},   (11)

we have g_1(y) ≥ α for all y ∈ Δ_{Y_0}(X_1), so that any candidate (x,y) ∈ X_1×Y_0 with F(x,y) < α must lie in the region X_1×Y_1, where Y_1 = Y_0 \ Δ_{Y_0}(X_1).

Thus, the problem to be considered now is min {F(x,y): x ∈ X_1, y ∈ Y_1}. Of course, the same operations can be repeated with X_1, Y_1 in place of X_0, Y_0. We are led to the following procedure (Konno (1976)).

Algorithm IX.1.

Initialization: Let X_0 = X, Y_0 = Y, α = +∞.

Step 1: Compute a pair (x^0,y^0) satisfying (7). If F(x^0,y^0) − ε < α, then reset α ← F(x^0,y^0) − ε.

Step 2: Construct the cut π_{X_0}(Y_0) defined by (8), (9), and let Δ_{X_0}(Y_0) denote the set (10). If X_1 = X_0 \ Δ_{X_0}(Y_0) = ∅, then stop: (x^0,y^0) is a global ε-optimal solution of problem (BLP).
Otherwise, go to Step 3.

Step 3: Construct the cut π_{Y_0}(X_1), and define the set (11). If Y_1 = Y_0 \ Δ_{Y_0}(X_1) = ∅, then stop: (x^0,y^0) is a global ε-optimal solution.
Otherwise, go to Step 4.

Step 4: Set X_0 ← X_1, Y_0 ← Y_1 and go to Step 1.

Denote by π_x^k and π_y^k the cuts generated in Steps 2 and 3 of iteration k, respectively. From Theorem V.2 we know that the cutting plane algorithm just described will converge if the sequences {π_x^k}, {π_y^k} are bounded. However, in the general case the algorithm may not converge. To ensure convergence one could, from time to time, insert a facial cut as described in Section V.2; but this may be computationally expensive.
For the implementation of the algorithm, note that, because of the specific structure of the function f_0(x), the computation of the numbers θ_j defined by (8) reduces simply to solving linear programs. This will be shown in the following proposition.

Proposition IX.3. Let X_0 = X. Then θ_j equals the optimal value of the linear program

minimize (px^0 − α)s_0 + (q + Cx^0)s
s.t. (pd^{0j})s_0 + (Cd^{0j})s = −1,
     bs_0 − Bs ≥ 0, s_0 ≥ 0, s ≥ 0.

Proof. Define φ_j(λ) = λpd^{0j} + min {(q + Cx^0 + λCd^{0j})y: By ≤ b, y ≥ 0}. From (2) we have f(x^0 + λd^{0j}) = px^0 + φ_j(λ), so that

θ_j = max {λ: φ_j(λ) ≥ α − px^0}.

By the duality theorem of linear programming we have

φ_j(λ) = λpd^{0j} + max {−bu: −B^T u ≤ q + C(x^0 + λd^{0j}), u ≥ 0}.

Hence,

θ_j = max λ
s.t. λpd^{0j} − bu ≥ α − px^0,
     −λCd^{0j} − B^T u ≤ q + Cx^0,
     u ≥ 0.

The assertion follows by passing to the dual of this linear program. ∎



Konno (1976) has also indicated a procedure for constructing an a-valid cut

which is usually deeper than the ooncavity cut 1rX (YO) defined by (8), (9).
o
455

Note that the numbers θ_j in (8) depend upon the set Y_0. If we write θ_j(Y_0) to emphasize this dependence, then for Y_1 ⊂ Y_0 and Y_1 smaller than Y_0 we have θ_j(Y_1) ≥ θ_j(Y_0) ∀j, usually with strict inequality for at least one j; i.e., the cut π_{X_0}(Y_1) is usually deeper than π_{X_0}(Y_0). Based on this observation, Konno's cut improvement procedure consists in the following.
Construct π_{X_0}(Y_0), then π_{Y_0}(X_1) for X_1 = X_0 \ Δ_{X_0}(Y_0), and π_{X_0}(Y_1) for Y_1 = Y_0 \ Δ_{Y_0}(X_1).
The cut π_{X_0}(Y_1) is also an α-valid cut at x^0 for the concave program min f_0(X_0), and it is generally deeper than the cut π_{X_0}(Y_0). Of course, the process can be iterated until successive cuts converge within some tolerance.
This cut improvement procedure seems to be particularly efficient when the problem is symmetric with respect to the variables x, y, as happens, e.g., in the bilinear programming problem associated with a given quadratic minimization problem (see Section V.4).

Example IX.1 (Konno (1976)). Consider a problem of the form (BLP) whose feasible sets are given by

x_1 + 4x_2 ≤ 8,      2y_1 + y_2 ≤ 8,
4x_1 + x_2 ≤ 12,     y_1 + 2y_2 ≤ 8,
3x_1 + 4x_2 ≤ 12,    y_1 + y_2 ≤ 5,
x_1 ≥ 0, x_2 ≥ 0,    y_1 ≥ 0, y_2 ≥ 0.

Applying Algorithm IX.1, where Step 3 is omitted (with Y_1 = Y_0), we obtain the following results (ε = 0):

1st iteration: x^0 = P_1, y^0 = Q_1 (see Fig. IX.1); α = −10; the α-valid cut (8)-(9) at x^0 leaves X_1 ≠ ∅ (shaded region).

2nd iteration: x^1 = P_4, y^1 = Q_4; α = −13; X_2 = ∅. Optimal solution.


Fig. IX.1

1.3. Polyhedral Annexation

Polyhedral annexation type algorithms for bilinear programming have been proposed by several authors (see, e.g., Vaish and Shetty (1976), Mukhamediev (1978)). We present here an algorithm similar to that of Vaish and Shetty (1976), which is a direct application of the PA algorithm (Algorithm VI.3) to the concave minimization problem (1).
Because of the special form of the concave function f(x), the α-extension of a point with respect to a given vertex x^0 of X such that f(x^0) > α can be computed by solving a linear program. Namely, by Proposition IX.3, if x^0 = 0, then the value

θ = max {t: f(tx) ≥ α}

is equal to the optimal value of the linear program

minimize −αs_0 + qs
s.t. (px)s_0 + (Cx)s = −1, bs_0 − Bs ≥ 0, s_0 ≥ 0, s ≥ 0.

We can now give the following polyhedral annexation algorithm (assuming X, Y to be bounded).

Algorithm IX.2 (PA Algorithm for (BLP))

Compute a point z ∈ X. Set X_0 = X.

0) Starting with z, search for a vertex x^0 of X_0 which is a local minimizer of f(x) over X_0. Let α = f(x^0). Translate the origin to x^0, and construct a cone K_0 containing X_0 such that for each i the i-th edge of K_0 contains a point y^{0i} ≠ 0 satisfying f(y^{0i}) ≥ α. Construct the α-extension z^{0i} of y^{0i} (i=1,...,n), and find the (unique) vertex v^1 of

S_1 = {v: vz^{0i} ≤ 1 (i=1,...,n)}.

Let V_1 = {v^1}, V_1' = V_1. Set k = 1.

1) For each v ∈ V_k' solve the linear program

max {vx: x ∈ X_0}

to obtain the optimal value μ(v) and a basic optimal solution ω(v). If for some v ∈ V_k' we have f(ω(v)) < α, then set z ← ω(v),

X_0 ← X_0 ∩ {x: v^1 x ≥ 1},

where v^1 was defined in Step 0), and return to Step 0). Otherwise, go to 2).

2) Select v^k ∈ argmax {μ(v): v ∈ V_k'}. If μ(v^k) ≤ 1, then stop: an optimal solution of (BLP) is (x^0,y^0), where y^0 ∈ argmin {(q + Cx^0)y: y ∈ Y}.
Otherwise, go to 3).

3) Construct the α-extension z^k of ω(v^k) and form S_{k+1} by adjoining to S_k the constraint

vz^k ≤ 1.

Compute the vertex set V_{k+1} of S_{k+1}, and let V_{k+1}' = V_{k+1} \ V_k. Set k ← k+1 and return to 1).

It follows from Theorem VI.3 that this algorithm must terminate after finitely many steps at an optimal solution of problem (BLP).

1.4. Conical Algorithm

A cone splitting algorithm to solve problem (BLP) was proposed by Gallo and Ülkücü (1977). However, it has been shown subsequently that this algorithm can be considered as a specialization of the cut and split algorithm (Algorithm V.3) to the concave program (5), which is equivalent to problem (BLP) (cf. Thieu (1980)). Though the latter algorithm may work quite successfully in many circumstances, we now know that its convergence is not guaranteed (see Sections V.3.3 and VII.1.6). In fact, an example by Vaish (1974) has shown that the algorithm of Gallo and Ülkücü may lead to cycling.
However, from the results of Chapter VII it follows that a convergent conical algorithm for solving (BLP) can be obtained by specializing Algorithm VII.1 to problem (5).

Recall that for a given x the value f(x) is computed by solving the linear program

min {(q + Cx)y: By ≤ b, y ≥ 0},

while the α-extension of x with respect to a vertex x^0 of X (when f(x^0) > α, x ≠ x^0 and f(x) ≥ α) is computed according to Proposition IX.3. More precisely, the number

θ = max {t: f(x^0 + t(x − x^0)) ≥ α}

is equal to the optimal value of the linear program

max λ
s.t. −λp(x−x^0) + bu ≤ px^0 − α,
     −λC(x−x^0) − B^T u ≤ Cx^0 + q,
     u ≥ 0.

By passing to the dual linear program, we see that, likewise, the number θ is equal to the optimal value of

min (px^0 − α)s_0 + (q + Cx^0)s
s.t. (p(x−x^0))s_0 + (C(x−x^0))s = −1,
     bs_0 − Bs ≥ 0,
     s_0 ≥ 0, s = (s_1,...,s_{n'})^T ≥ 0.

Assuming X and Y to be bounded, we thus can specialize Algorithm VII.1 to problem (5) as follows.

Algorithm IX.3.

Select ε ≥ 0 and an NCS rule (see Section VII.1.4). Compute a point z ∈ X.

0) Starting with z, find a vertex x^0 of X such that f(x^0) ≤ f(z). Let x̄ be the best among x^0 and all the vertices of X adjacent to x^0. Let γ = f(x̄).

1) Let α = γ − ε. Translate the origin to x^0, and construct a cone K_0 ⊃ X such that for each i the i-th edge of K_0 contains a point y^{0i} ≠ x^0 satisfying f(y^{0i}) ≥ α. Let Q_0 = (z^{01},z^{02},...,z^{0n}), where each z^{0i} is the α-extension of y^{0i}. Let 𝓜 = 𝓟 = {Q_0}.

2) For each Q = (z^1,z^2,...,z^n) ∈ 𝓟, solve the linear program

(LP(Q,X))  max {eQ^{−1}x: x ∈ X, Q^{−1}x ≥ 0}

to obtain its optimal value μ(Q) and a basic optimal solution ω(Q). If f(ω(Q)) < γ for some Q, then return to 0) with

z ← ω(Q),  X ← X ∩ {x: eQ_0^{−1}x ≥ 1},

where Q_0 is the matrix in Step 1). Otherwise go to 3).

3) In 𝓜 delete all Q with μ(Q) ≤ 1. Let 𝓡 be the collection of remaining matrices. If 𝓡 = ∅, then terminate: x̄ is a global ε-optimal solution of (5).
Otherwise, go to 4).

4) Select Q* ∈ argmax {μ(Q): Q ∈ 𝓡} and subdivide Q* according to the chosen NCS rule.

5) Let 𝓟* be the partition of Q*. For each Q ∈ 𝓟* reset Q = (z^1,...,z^n) with z^i such that f(z^i) = α.
Return to 2) with 𝓟 ← 𝓟*, 𝓜 ← (𝓡 \ {Q*}) ∪ 𝓟*.

By Theorem VII.1, this algorithm terminates at a global ε-optimal solution after finitely many steps whenever ε > 0.

Example IX.2 (Gallo and Ülkücü, 1977). Consider a problem of the form (BLP) whose feasible sets are given by

x_1 + x_2 ≤ 5,      y_1 + 2y_2 ≤ 8,
2x_1 + x_2 ≤ 7,     3y_1 + y_2 ≤ 14,
3x_1 + x_2 ≤ 6,     2y_1 ≤ 9,
x_1 − 2x_2 ≤ 1,     y_2 ≤ 3,
x_1 ≥ 0, x_2 ≥ 0,   y_1 ≥ 0, y_2 ≥ 0.

Applying Algorithm IX.3 with ε = 0 and the NCS rule (*) of Section VII.1.6, where N is very large and ρ is very close to 1, leads to the following calculations.

0) Choose x^0 = (0,0). The neighbouring vertices are (1,0) and (0,5). The values of f(x) at these three points are −3, −13/2, −18. Hence, x̄ = (0,5) and γ = −18.

Iteration 1:

1) Choose Q_0 = (z^{01}, z^{02}), with z^{01} = (36/13, 0), z^{02} = (0,5). 𝓜_1 = 𝓟_1 = {Q_0}.

2) Solve LP(Q_0,X): μ(Q_0) = 119/90 > 1, ω(Q_0) = (2,3) with f(ω(Q_0)) = −10 > γ.

3) 𝓡_1 = {Q_0}.

4) Q* = Q_0. Subdivide Q_0 with respect to ω^0 = (2,3). The α-extension of ω^0 is ω̄^0 = (30/17, 45/17).
𝓟_2 = {Q_{01}, Q_{02}} with Q_{01} = (z^{01}, ω̄^0), Q_{02} = (z^{02}, ω̄^0).

Iteration 2:

2) Both Q_{01} and Q_{02} are deleted (μ(Q) ≤ 1): 𝓡_2 = ∅, and hence the global optimal solution of (BLP) is x̄ = (0,5), ȳ = (0,3).

Fig. IX.2

1.5. Outer Approximation Method

In the previous methods we have assumed that the polyhedra X, Y are bounded. We now present a method, due to Thieu (1988), which applies to the general case, when X and (or) Y may be unbounded.
This method is obtained by specializing the outer approximation method for concave minimization (Section VI.1) to problem (5) (or (6)).
A nontrivial question that arises here is how to check whether the function f(x) is bounded from below over a given halfline Γ = {x^0 + θu: 0 ≤ θ < +∞} (note that f(x) is not given in an explicit form, as assumed in the conventional formulation of problem (BCP)). A natural approach would be to consider the parametric linear program

min {F(x^0 + θu, y): y ∈ Y},  0 ≤ θ < +∞,   (12)

or, alternatively, the subproblem

max {θ: f(x^0 + θu) ≥ f(x^0), θ ≥ 0},   (13)

which can be shown to be a linear program. However, this approach is not the best one. In fact, solving the parametric linear program (12) is computationally expensive, and the subproblems (12) and (13) depend upon x^0, which means that different subproblems have to be solved for different points x^0.
Thieu (1988) proposed a more efficient method, based on the following fact.
Let f(x) be a concave function defined by

f(x) = inf {r(y)x + s(y): y ∈ J},   (14)

where r(y) ∈ ℝ^n, s(y) ∈ ℝ and J is an arbitrary set of indices (in the present context, J = Y, r(y) = p + C^T y, s(y) = qy). Let Γ be the halfline emanating from a point x^0 ∈ ℝ^n in the direction u.

Proposition IX.4. The function f(x) is bounded from below on Γ if and only if

ρ(u) := inf {r(y)u: y ∈ J} ≥ 0.   (15)

Proof. Suppose that ρ(u) ≥ 0. From (14) we have, for every θ ≥ 0,

f(x^0 + θu) = inf {r(y)(x^0 + θu) + s(y): y ∈ J}
            ≥ inf {r(y)x^0 + s(y): y ∈ J} + θ inf {r(y)u: y ∈ J}
            = f(x^0) + θρ(u) ≥ f(x^0).

This shows that f(x) is bounded from below on Γ. In the case ρ(u) < 0, let y^0 ∈ J be such that γ = r(y^0)u < 0. Then from (14) we see that

f(x^0 + θu) ≤ r(y^0)(x^0 + θu) + s(y^0) = r(y^0)x^0 + s(y^0) + θγ → −∞ as θ → +∞.

Therefore, f(x) is unbounded from below on Γ. ∎

In the general case, when the index set J is arbitrary, the value ρ(u) defined by (15) might not be easy to determine. For example, for the formula

f(x) = inf {x*x − f*(x*): x* ∈ dom f*},

where f*(x*) is the concave conjugate of f(x) (cf. Rockafellar (1970)), computing inf {x*u: x* ∈ dom f*} may be difficult. But in our case, because of the specific structure of f(x), the determination of ρ(u) is very simple. Indeed, since J = Y, r(y) = p + C^T y, s(y) = qy, the condition (15) reduces to

ρ(u) = inf {(p + C^T y)u: y ∈ Y} = pu + inf {(Cu)y: y ∈ Y} ≥ 0.

Thus, in order to check whether f(x) is bounded from below over Γ it suffices to solve the linear subprogram

minimize (Cu)y subject to y ∈ Y.   (16)

Note that all of these subproblems have the same constraint set Y, and their objective function depends only on u, but not on x^0.
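In code, the recession test of Proposition IX.4 is one LP per direction u (a sketch, same notation as above):

import numpy as np
from scipy.optimize import linprog

def rho(u, p, C, B, b):
    # rho(u) = p @ u + inf {(C @ u) @ y : y in Y}; f is bounded below on
    # the halfline x^0 + theta*u, theta >= 0, iff rho(u) >= 0 (Prop. IX.4).
    res = linprog(C @ u, A_ub=B, b_ub=b, bounds=[(0, None)] * B.shape[1])
    if res.status == 3:            # inf over Y is -infinity
        return float("-inf")
    return p @ u + res.fun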

Let us now consider the concave minimization problem (5) (which is equivalent to (BLP)) in explicit form:

minimize f(x) := px + min {(q + Cx)y: y ∈ Y}
subject to A_i x ≤ a_i (i=1,...,m),
           x_j ≥ 0 (j=1,...,n).

For the sake of convenience we shall assume that both polyhedra X, Y are nonempty.
On the basis of the above results, we can give the following outer approximation method for solving the problem (BLP):

Algorithm IX.4.

Initialization:
Set X_1 = ℝ₊^n. Let V_1 = {0} (vertex set of X_1), U_1 = {e^1,...,e^n} (extreme direction set of X_1), where e^i is the i-th unit vector of ℝ^n. Set I_1 = {1,...,m}.

Iteration k = 1,2,...:

1) For each u ∈ U_k compute ρ(u) = pu + inf {(Cu)y: y ∈ Y} by solving (16). If ρ(u^k) < 0 for some u^k ∈ U_k, then:

a) If A_i u^k ≤ 0 ∀i ∈ I_k, stop: problem (5) has no finite optimal solution, and f(x) is unbounded from below on any halfline in X parallel to u^k. In this case (BLP) is unsolvable.

b) Otherwise, select

i_k ∈ argmax {A_i u^k: i ∈ I_k}

and go to 3).

2) If ρ(u) ≥ 0 ∀u ∈ U_k, then select

x^k ∈ argmin {f(x): x ∈ V_k}.

a) If A_i x^k ≤ a_i ∀i ∈ I_k, then terminate: x^k is a global optimal solution of (5). Compute

y^k ∈ argmin {(q + Cx^k)y: y ∈ Y}

(this linear program must be solvable, because f(x^k) = px^k + inf {(q + Cx^k)y: y ∈ Y} is finite). Then (x^k,y^k) is an optimal solution of (BLP).

b) If A_i x^k > a_i for some i ∈ I_k, then select

i_k ∈ argmax {A_i x^k − a_i: i ∈ I_k}

and go to 3).

3) Form

X_{k+1} = X_k ∩ {x: A_{i_k} x ≤ a_{i_k}}.

Determine the vertex set V_{k+1} and the extreme direction set U_{k+1} of X_{k+1} from V_k and U_k (see Section II.4.2). Set I_{k+1} = I_k \ {i_k} and go to iteration k+1.

Remark IX.1. In the worst case, the above algorithm might stop only when k = m. Then the algorithm would have enumerated not only all of the vertices and extreme directions of X, but also the vertices and extreme directions of the intermediate polyhedra X_k generated during the procedure. However, computational experiments reported in Thieu (1988) suggest that this case generally cannot be expected to occur, and that the number of linear programs to be solved is likely to be substantially less than the total number of vertices and extreme directions of X.

Example IX.3. Consider a problem of the form (BLP) whose feasible sets are given by

−x_1 + x_2 ≤ 5,        y_1 + y_2 + y_3 ≤ 6,
x_1 − 4x_2 ≤ 2,        y_1 − y_2 + y_3 ≤ 2,
−3x_1 + x_2 ≤ 1,       −y_1 + y_2 + y_3 ≤ 2,
−3x_1 − 5x_2 ≤ −23,    −y_1 − y_2 + y_3 ≤ −2.

The sets X and Y are depicted in the following figures.


Fig. IX.3

Fig. IX.4

The algorithm starts with X_1 = ℝ₊^2, V_1 = {0}, U_1 = {e^1,e^2}, I_1 = {1,2,3,4}.

Iteration 1.

ρ(e^1) = pe^1 + min {(Ce^1)y: y ∈ Y} = 3 + 0 = 3 > 0.
ρ(e^2) = 5 − 10 = −5 < 0.
max {A_i e^2: i ∈ I_1} = max {1,−4,1,−5} = 1 > 0. This maximum is achieved for i = 1. Hence i_1 = 1.
Form X_2 = X_1 ∩ {x: −x_1 + x_2 ≤ 5}.
Then V_2 = {0, v^2} with v^2 = (0,5); U_2 = {e^1, u^3} with u^3 = (1,1).
I_2 = I_1 \ {i_1} = {2,3,4}.

Iteration 2.

ρ(e^1) = 3 > 0, ρ(u^3) = 0.
min {f(x): x ∈ X_2} = −19. This minimum is achieved at x^2 = (0,5) = v^2.
max {A_i x^2 − a_i: i ∈ I_2} = max {−22,4,−2} = 4 > 0. This maximum is attained for i = 3. Hence i_2 = 3.
Form

X_3 = X_2 ∩ {x: −3x_1 + x_2 ≤ 1}.

Then V_3 = {0, v^3, v^4} with v^3 = (0,1), v^4 = (2,7), while U_3 = U_2 = {e^1, u^3}.
I_3 = I_2 \ {i_2} = {2,4}.

Iteration 3.

min {f(x): x ∈ X_3} = −19. This minimum is achieved at x^3 = (2,7) = v^4.
max {A_i x^3 − a_i: i ∈ I_3} = −18 < 0. Hence, x^3 is a global optimal solution of the concave minimization problem min {f(x): x ∈ X}.
By solving the problem min {(q + Cx^3)y: y ∈ Y}, we then obtain y^3 = (4,2,0). Thus, (x^3,y^3) is a global optimal solution of the (BLP) problem under consideration.
Note that the polytope Y has five vertices: (2,2,2), (2,0,0), (0,2,0), (4,2,0), (2,4,0).

2. COMPLEMENTARITY PROBLEMS

Complementarity problems form a class of nonconvex optimization problems which play an important role in mathematical programming and are encountered in numerous applications, ranging from mechanics and engineering to economics (see Section I.2.5). In this section, we shall consider the concave complementarity problem (CCP), which can be formulated as follows:
Given a concave mapping h: ℝ^n → ℝ^n, i.e., a mapping h(x) = (h_1(x),...,h_n(x)) such that each h_i(x) is a concave function, find a point x ∈ ℝ^n satisfying

x ≥ 0, h(x) ≥ 0, ∑_{i=1}^n x_i h_i(x) = 0.   (17)

Note that in the literature problem (17) is sometimes called a convex complementarity problem.
When h is an affine mapping, i.e., h(x) = Mx + q (with q ∈ ℝ^n and M ∈ ℝ^{n×n}), the problem is called the linear complementarity problem (LCP): find a point x satisfying

x ≥ 0, Mx + q ≥ 0, ∑_{i=1}^n x_i(M_i x + q_i) = 0,   (18)

where M_i denotes the i-th row of the matrix M.
Over the past 25 years, numerous methods have been devised to solve this problem. The best known of these methods, Lemke's complementarity pivot method (Lemke (1965), Tomlin (1978)), and other pivoting methods due to Cottle and Dantzig (1968), Murty (1974), and Van der Heyden (1980), are guaranteed to work only under restrictive assumptions on the structure of the problem matrix M. Recently, optimization methods have been proposed to solve larger classes of linear complementarity problems (Mangasarian (1976, 1978 and 1979), Cottle and Pang (1978), Cheng (1982), Cirina (1983), Ramarao and Shetty (1984), Al-Khayyal (1986, 1986a and 1987), Pardalos (1988b), Pardalos and Rosen (1988); see also the books of Murty (1988), Cottle, Pang and Stone (1992), and the survey of Pang (1995)).
In the sequel we shall be concerned with the global optimization approach to complementarity problems, as initiated by Thoai and Tuy (1983) and further developed in Tuy, Thieu and Thai (1985) for the convex complementarity problem and in Pardalos and Rosen (1987) for the (LCP). An advantage of this approach is that it does not depend upon any special properties of the problem matrix M (which, however, has to be paid for by a greater computational cost).

2.1. Basic Properties

As shown in Section I.2.5, by setting

f(x) = ∑_{i=1}^n min {x_i, h_i(x)}

one can reduce the concave complementarity problem (17) to the concave minimization problem

minimize f(x)  s.t. x ≥ 0, h(x) ≥ 0.   (19)

Proposition IX.5. A vector x̄ is a solution to the complementarity problem (17) if and only if it is a global optimal solution of the concave minimization problem (19) with f(x̄) = 0.

Proof. This follows from Theorem I.5, where g(x) ≡ x. ∎

An immediate consequence of this equivalence between (17) and (19) is:

Proposition IX.6. If the concave complementarity problem (17) is solvable, then at least one solution is an extreme point of the convex set defined by x ≥ 0, h(x) ≥ 0. In particular, either (LCP) has no solution, or else at least one solution of (LCP) is a vertex of the polyhedron defined by

x ≥ 0, Mx + q ≥ 0.

Many solution methods for (LCP) are based on this property of (LCP). From the

equivalence between (17) and (19) it also follows that, in principle, any method of

solution for concave minimization problems gives rise to a method for solving con-

cave complementarity problems. In practice, however, there are some particular fea-

tures of the concave minimization problem (19) that should be taken into account
when devising methods for solving (17):

1) The objective function f(x) is nonnegative on the feasible domain D and must be

zero at an optimal solution.

2) The feasible domain D, as well as the level sets of the objective function f(x),

may be unbounded.

Furthermore, what we want to compute is not really the optimal value of (19),
but rather a point x̄ (if it exists) such that

    x̄ ∈ D , f(x̄) = 0 .

In other words, if we denote

    D = {x: x ≥ 0 , h(x) ≥ 0} ,  G = {x: f(x) > 0} ,

then the problem is to find a point x̄ ∈ D \ G or else establish that D \ G = ∅ (i.e.,
D ⊂ G).

Observe that, since the functions h(x) and f(x) are concave, both of the sets D, G
are convex. Thus, the complementarity problem is a special case of the following
general "geometric complementarity problem":

Given two convex sets D and G, find an element of the complement of G with
respect to D.

In Chapters VI and VII we saw that the concave minimization problem is also
closely related to a problem of this form (the "(DG) problem", cf. Section VI.2). It
turns out that the procedures developed in Sections VI.2 and VII.1 for solving the
(DG) problem can be extended to the above geometric complementarity problem.

2.2. Polyhedral Annexation Method for the Linear Complementarity Problem

Consider the linear complementarity problem (18)

(LCP)    x ∈ D , f(x) = 0 ,

where D = {x: x ≥ 0, Mx + q ≥ 0} and f(x) = Σ_{i=1}^n min {x_i, M_i x + q_i}.

Let x^0 be a vertex of the polyhedron D. If f(x^0) = 0, then x^0 solves (LCP). Other-
wise, we have f(x^0) > 0.

We introduce the slack variables x_{n+i} = M_i x + q_i (i=1,...,n) and express the
basic variables (relative to the basic solution x^0) in terms of the nonbasic ones. If we
then change the notation, we can rewrite (LCP) in the form

    y ≥ 0 , Cy + d ≥ 0 , f̃(y) = 0 ,     (20)

where y is related to x by a linear equation x = x^0 + Uy (with U an n×n matrix),
f̃(y) = f(x^0 + Uy) is a concave function with f̃(0) = f(x^0) > 0, and y = 0 is a vertex
of the polyhedron

    D̃ = {y: y ≥ 0 , Cy + d ≥ 0} ,

while f̃(y) ≥ 0 for all y ∈ D̃. Setting

    G̃ = {y: f̃(y) > 0} ,

we see that 0 ∈ int G̃, and all of the conditions assumed in the (D̃G̃) problem as for-
mulated in Section VI.2 are fulfilled, except that G̃ is an open (rather than closed)
set and D̃ and G̃ may be unbounded. Therefore, with some suitable modifications,
the polyhedral annexation algorithm (Section VI.2.4) can be applied to solve (20),
and hence (LCP).

In Section VI.2.6 it was shown how this algorithm can be extended in a natural
way to the case when D̃ and G̃ may be unbounded, but D̃ contains no line and
inf f̃(D̃) > −∞ (the latter conditions are fulfilled here, because D̃ ⊂ ℝ^n_+ and
f̃(y) ≥ 0 for all y ∈ D̃). On the other hand, it is straightforward to see that, when
G̃ is open, the stopping criterion in Step 2 of the polyhedral annexation algorithm
should be μ(v^k) < 1 rather than μ(v^k) ≤ 1. We can thus give the following

Algorithm IX.5 (PA Algorithm for LCP)

Using a vertex x^0 of the polyhedron D, rewrite (LCP) in the form (20).

0) For every i=1,...,n compute

    θ_i = sup {t: f̃(te^i) ≥ 0} ,

where e^i is the i-th unit vector of ℝ^n (θ_i > 0 because f̃(0) > 0 and f̃ is
continuous). Let

    S_1 = {y: y_i ≤ 1/θ_i (i=1,...,n)}

(with the usual convention that 1/∞ = 0), and let v^1 = (1/θ_1,...,1/θ_n) (the unique
vertex of S_1).

Set V_1 = {v^1}, V'_1 = V_1, k = 1 (for k > 1, V'_k is the set of new vertices of S_k).

k.1. For each v ∈ V'_k solve the linear program

    LP(v,D̃)    max {vy: y ∈ D̃}

to obtain the optimal value μ(v) and a basic optimal solution ω(v) (when
μ(v) = +∞, ω(v) is an extreme direction of D̃ over which vy → +∞). If for some
v ∈ V'_k the point ω(v) satisfies f̃(ω(v)) = 0, then stop. Otherwise, go to k.2.

k.2. Select v^k ∈ argmax {μ(v): v ∈ V_k}. If μ(v^k) < 1, then stop: (LCP) has no
solution. Otherwise, go to k.3.

k.3. Let y^k = ω(v^k),

    τ_k = sup {t: f̃(ty^k) ≥ 0}

(if y^k is an extreme direction of D̃, then necessarily τ_k = +∞ because f̃(x) ≥ 0
∀x ∈ D̃). Form the polyhedron

    S_{k+1} = S_k ∩ {y: τ_k ⟨y^k,y⟩ ≤ 1}

(to be understood as S_k ∩ {y: ⟨y^k,y⟩ ≤ 0} when τ_k = +∞).

Compute the vertex set V_{k+1} of S_{k+1}, and let V'_{k+1} = V_{k+1} \ V_k. Set
k ← k+1 and return to k.1.
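Before stating the convergence result, here is a schematic rendering of this loop in
Python for the bounded case (all μ(v) finite, no extreme directions); it is an
illustrative sketch only, not part of the original text. The helpers ray_sup (the
one-dimensional search for θ_i and τ_k) and vertices (vertex enumeration of S_k)
are assumed to be available.

    import numpy as np
    from scipy.optimize import linprog

    def pa_lcp(f_tilde, C, d, ray_sup, vertices, tol=1e-9):
        # Schematic of Algorithm IX.5, bounded case. D~ = {y >= 0, Cy + d >= 0};
        # ray_sup(v) returns sup{t >= 0: f_tilde(t v) >= 0} (assumed finite);
        # vertices(A, b) enumerates the vertices of {y: A y <= b} (assumed given).
        n = C.shape[1]
        theta = np.array([ray_sup(e) for e in np.eye(n)])
        A, b = np.eye(n), 1.0 / theta           # S_1 = {y: y_i <= 1/theta_i}
        scores, new_verts = [], [b.copy()]      # v^1, the unique vertex of S_1
        while True:
            for v in new_verts:                 # step k.1: LP(v, D~)
                lp = linprog(-v, A_ub=-C, b_ub=d, bounds=(0, None))
                if f_tilde(lp.x) <= tol:
                    return lp.x                 # solves (20), hence (LCP)
                scores.append((-lp.fun, lp.x, v))
            mu, y_k, _ = max(scores, key=lambda s: s[0])
            if mu < 1.0:                        # step k.2
                return None                     # (LCP) has no solution
            tau = ray_sup(y_k)                  # step k.3: cut tau_k <y^k,y> <= 1
            A = np.vstack([A, tau * y_k]); b = np.append(b, 1.0)
            verts = vertices(A, b)
            scores = [s for s in scores
                      if any(np.allclose(s[2], v) for v in verts)]
            new_verts = [v for v in verts
                         if not any(np.allclose(v, s[2]) for s in scores)]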

Theorem IX.1. The above algorithm terminates after finitely many steps, either
yielding a solution to (LCP) or else establishing that (LCP) has no solution.

Proof. Denote by P_k the polar set of S_k. It is easily verified that P_1 is the convex
hull of {0,u^1,...,u^n}, where u^i is the point θ_i e^i if θ_i < +∞, and the direction e^i if
θ_i = +∞. Similarly, P_{k+1} is the convex hull of P_k ∪ {z^k}, where z^k is the point
τ_k y^k if τ_k < +∞ and the direction y^k if τ_k = +∞ (see Section VI.2.6). Since each
y^k = ω(v^k) is a vertex or an extreme direction of D̃ which does not belong to P_k,
there can be no repetition in the sequence {y^k}. Hence, the procedure must termin-
ate after finitely many steps. If it terminates at a step k.1, then a solution ȳ of (20)
(and hence a solution x̄ = Uȳ + x^0 of (LCP)) is obtained. If it terminates at a step
k.2 (because μ(v^k) < 1), then μ(v) < 1 for all v ∈ V_k, and hence D̃ ⊂ G̃, i.e.,
f̃(y) > 0 for all y ∈ D̃. This implies that f(x) > 0 for all x ∈ D, i.e., (LCP) has no
solution.

Note that the set V_k may increase quickly in size as the algorithm proceeds.
Therefore, to alleviate storage problems and other difficulties that can arise when V_k
becomes too large, it is recommended to restart the algorithm from some ω(v),
v ∈ V_k, i.e., to return to Step 0 with x^0 ← ω(v) and

    D̃ ← D̃ ∩ {y: v^1 y ≥ 1} ,  f̃ ← f̃ ,

where v^1 is the vertex computed in Step 0 of the current cycle of iterations. At each
restart the feasible domain is reduced, while the starting vertex x^0 changes; there-
fore, the chance for successful termination will increase.

2.3. Conical Algorithm for the Linear Complementarity Problem

Like the polyhedral annexation algorithm, the conical (DG) procedure in Section
VII.1.2 can be extended to solve the problem (20). This extension, however, requires
a careful examination of the following circumstance.

In the conical (DG) procedure, when the sets D and G are not bounded (as assu-
med in Section VII.1.2), the optimal value μ(Q) of the linear program LP(Q,D) asso-
ciated with a given cone K = con(Q) might be +∞. In this case, the procedure might
generate an infinite nested sequence of cones K_s = con(Q_s) with μ(Q_s) = +∞. To
avoid this difficulty and ensure convergence of the method, we need an appropriate
subdivision process in conjunction with an appropriate selection rule in order to pre-
vent such sequences of cones from being generated when the problem is solvable.

To be specific, consider (LCP) in the formulation (20). Let e^i be the i-th unit
vector of ℝ^n, and let H_0 be the hyperplane passing through e^1,...,e^n, i.e.,
H_0 = {y: Σ_{i=1}^n y_i = 1}. Then for any cone K in ℝ^n_+ we can define a nonsingular
n×n matrix Z = (v^1,...,v^n) whose i-th column v^i is the intersection of H_0 with the
i-th edge of K. Since f̃(0) > 0 and the function f̃ is continuous, we have

    θ_i = sup {t: f̃(tv^i) ≥ 0} > 0  (i=1,...,n) .     (21)
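Since f̃ is concave with f̃(0) > 0, the set {t ≥ 0: f̃(tv^i) ≥ 0} is an interval, so each
θ_i in (21) can be computed by a one-dimensional bisection. The routine below is our
own illustration (the cap t_max and the tolerance are arbitrary choices, not from the
original text):

    import numpy as np

    def theta_sup(f_tilde, v, t_max=1e8, tol=1e-10):
        # theta = sup{t >= 0: f_tilde(t v) >= 0} as in (21).  Concavity of
        # f_tilde and f_tilde(0) > 0 make {t: f_tilde(t v) >= 0} an interval,
        # so bisection on its right endpoint is valid.
        v = np.asarray(v, dtype=float)
        if f_tilde(t_max * v) >= 0:
            return np.inf                 # the edge never leaves {f~ >= 0}
        lo, hi = 0.0, t_max               # f_tilde(lo*v) >= 0 > f_tilde(hi*v)
        while hi - lo > tol * max(1.0, lo):
            mid = 0.5 * (lo + hi)
            if f_tilde(mid * v) >= 0:
                lo = mid
            else:
                hi = mid
        return lo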

Consider the linear program

    LP(Z,D̃)    max Σ_{i=1}^n λ_i/θ_i   s.t.  CZλ + d ≥ 0 , λ ≥ 0     (22)

(here, as before, it is agreed that λ_i/θ_i = 0 if θ_i = +∞).

Denote by μ(Z) and λ̄ = (λ̄_1,...,λ̄_n)^T the optimal value and a basic optimal solu-
tion of this linear program (when μ(Z) = +∞, λ̄ denotes an extreme direction of the
polyhedron CZλ + d ≥ 0, λ ≥ 0 over which Σ_i λ_i/θ_i → +∞).

Proposition IX.7. Let ℓ(y) = 1 be the equation of the hyperplane passing through
z^i = θ_i v^i (i ∈ I) which is parallel to the directions v^j (j ∉ I), where I = {i: θ_i < +∞}.
Then μ(Z) and ω(Z) = Zλ̄ are the optimal value and a basic optimal solution of the
linear program

    max ℓ(y)  s.t.  y ∈ D̃ ∩ K .

Proof. Since Z is a nonsingular n×n matrix, we can write y = Zλ = Σ_{i=1}^n λ_i v^i =
Σ_{i∈I} (λ_i/θ_i) z^i + Σ_{j∉I} λ_j v^j with λ = Z^{-1}y. Hence, noting that for the linear form
ℓ we have ℓ(z^i) = 1 (i ∈ I) and ℓ(v^j) = 0 (j ∉ I), we see that

    ℓ(y) = Σ_{i∈I} (λ_i/θ_i) ℓ(z^i) = Σ_{i∈I} λ_i/θ_i .

On the other hand, we have D̃ ∩ K = {y: Cy + d ≥ 0, Z^{-1}y ≥ 0} = {y = Zλ: CZλ +
d ≥ 0, λ ≥ 0}, from which the assertion immediately follows.
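In the λ-variables, (22) is an ordinary linear program that can be assembled
directly. The sketch below is our own illustration (bounded case, all θ_i finite),
using scipy's linprog:

    import numpy as np
    from scipy.optimize import linprog

    def solve_LP_Z(Z, theta, C, d):
        # LP(Z,D~) of (22): max sum_i lambda_i/theta_i
        # s.t. C Z lambda + d >= 0, lambda >= 0 (all theta_i finite here).
        obj = -1.0 / np.asarray(theta)              # linprog minimizes
        res = linprog(obj, A_ub=-(C @ Z), b_ub=d, bounds=(0, None))
        return -res.fun, Z @ res.x                  # mu(Z), omega(Z) = Z lambda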

Corollary IX.1. If μ(Z) < 1, then f̃(y) > 0 for all y ∈ D̃ ∩ K.

Proof. The hyperplane ℓ(y) = μ(Z) passes through z*^i = σ_i v^i (i ∈ I) and is paral-
lel to the directions v^j (j ∉ I), where 0 < σ_i < θ_i. Since D̃ ∩ K ⊂ {y ∈ K:
ℓ(y) ≤ μ(Z)}, any y ∈ D̃ ∩ K can be represented as y = Σ_{i∈I} t_i z*^i + Σ_{j∉I} t_j v^j,
where t_i ≥ 0 (i=1,...,n) and Σ_{i∈I} t_i ≤ 1. But clearly f̃(z*^i) > 0 (i ∈ I) and
f̃(tv^j) > 0 (j ∉ I) for all t > 0. Hence f̃(y) > 0 by the concavity of f̃ (recall that
f̃(0) > 0).

Now for each cone K with matrix Z = (v^1,...,v^n) define the number

    θ(Z) = min {θ_i: i=1,...,n} ,

where the θ_i are computed according to (21). Let ℛ be the collection of matrices
that remain for study at a given stage of the conical procedure. It turns out that the
following selection rule coupled with an exhaustive subdivision process will suffice to
ensure convergence of this procedure:

A cone K is said to be of the first category if the corresponding matrix Z =
(v^1,...,v^n) is such that θ_i < +∞ ∀i and μ(Z) < +∞; K is said to be of the second
category otherwise. If there exists at least one cone of the first category in ℛ, then
choose a cone of the first category with maximal μ(Z) for further subdivision; other-
wise, choose a cone with minimal θ(Z). (A sketch of this rule in code follows.)
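In code, the category test and the selection rule read as follows (an illustrative
fragment of our own; the tuple layout (Z, theta, mu) is an arbitrary choice):

    import numpy as np

    def first_category(theta, mu):
        # Z is of the first category iff all theta_i and mu(Z) are finite.
        return np.all(np.isfinite(theta)) and np.isfinite(mu)

    def select_cone(R):
        # R: list of triples (Z, theta, mu) for the remaining cones.
        first = [c for c in R if first_category(c[1], c[2])]
        if first:
            return max(first, key=lambda c: c[2])     # maximal mu(Z)
        return min(R, key=lambda c: np.min(c[1]))     # minimal theta(Z)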


The corresponding algorithm can be described as follows.

Algorithm IX.6 (Conical Algorithm for LCP).

Assume that a vertex x^0 of the polyhedron

    x ≥ 0 , Mx + q ≥ 0

is available such that f(x^0) > 0. Using x^0, rewrite (LCP) in the form (20). Select an
exhaustive cone subdivision process.

0) Let Z_1 = (e^1,e^2,...,e^n), where e^i is the i-th unit vector of ℝ^n, and let
𝒫_1 = ℳ_1 = {Z_1}. Set k = 1.

k.1. For each Z ∈ 𝒫_k solve the linear program LP(Z,D̃) (see (22)). Let μ(Z) and λ̄
be the optimal value and a basic optimal solution of LP(Z,D̃), and let ω(Z) = Zλ̄.
If f̃(ω(Z)) = 0 for some Z ∈ 𝒫_k, then terminate. Otherwise (f̃(ω(Z)) > 0
∀Z ∈ 𝒫_k), go to k.2.

k.2. In ℳ_k delete all Z ∈ 𝒫_k satisfying μ(Z) < 1. Let ℛ_k be the collection of the
remaining elements of ℳ_k. If ℛ_k = ∅, then terminate: f̃(y) > 0 ∀y ∈ D̃ (i.e.,
(LCP) has no solution). Otherwise, go to k.3.

k.3. Let ℛ_k^(1) = {Z ∈ ℛ_k: Z = (v^1,...,v^n), θ_i < +∞ ∀i, μ(Z) < +∞}, where θ_i is
defined by (21). If ℛ_k^(1) ≠ ∅, then select

    Z_k ∈ argmax {μ(Z): Z ∈ ℛ_k^(1)} .

Otherwise, select

    Z_k ∈ argmin {θ(Z): Z ∈ ℛ_k} .

Subdivide the corresponding cone according to the chosen exhaustive cone
subdivision rule.

k.4. Let 𝒫_{k+1} be the partition of Z_k, and let ℳ_{k+1} be the collection obtained
from ℳ_k by replacing Z_k with 𝒫_{k+1}. Set k ← k+1 and return to k.1.

To establish convergence of this algorithm, let us first observe the following

Lemma IX.1. For every point v of the simplex [e^1,...,e^n] denote by Γ(v) the half-
line from 0 through v. If Γ(v*) ∩ D̃ is a line segment [0,y*], then for all v ∈
[e^1,...,e^n] sufficiently close to v*, Γ(v) ∩ D̃ is a line segment [0,y] satisfying y → y*
as v → v*.

Proof. Since 0 ∈ D̃, we must have d ≥ 0. Consider a point v* such that Γ(v*) ∩ D̃
is a line segment [0,y*]. First suppose that y* ≠ 0. Then, since y* is a boundary
point of D̃, there exists an i such that C_i y* + d_i = 0 but C_i (λy*) + d_i < 0 for all
λ > 1, i.e., d_i > 0. Hence, J = {i: d_i > 0} ≠ ∅. Define

    g(u) = max {−C_i u / d_i : i ∈ J} .

Obviously, g(y*) = 1, and if y* = θv* then 1 = g(θv*) = θ g(v*). It follows that
g(v*) > 0 and y* = v*/g(v*). Since g(·) is continuous, we see that for all v sufficiently
close to v* we still have g(v) > 0. Then y = v/g(v) will satisfy g(y) = g(v)/g(v) = 1,
and this implies that y is a boundary point of D̃, and Γ(v) ∩ D̃ = [0,y]. Clearly, as
v → v* we have y = v/g(v) → v*/g(v*) = y*.

On the other hand, if y* = 0, then Γ(v*) ∩ D̃ = {0}. Therefore, for all v suffi-
ciently close to v*, we have Γ(v) ∩ D̃ = [0,y] with y close to 0. This completes the
proof of the lemma.

Lemma IX.2. If K_s, s ∈ Δ ⊂ {1,2,...}, is an infinite nested sequence of cones of
the first category, then ∩_{s∈Δ} K_s = Γ is a ray such that Γ ∩ D̃ = [0,y*] with

    y* = lim_{s→∞, s∈Δ} ω(Z_s) .

Proof. Recall that K_s denotes the cone to be subdivided at iteration s. From the
exhaustiveness of the subdivision process it follows that the intersection Γ of all K_s,
s ∈ Δ, is a ray. Now, since K_s is of the first category, the set K_s ∩ {y: ℓ_s(y) ≤ μ(Z_s)}
(which contains K_s ∩ D̃) is a simplex (ℓ_s(y) = 1 is the equation of the hyperplane
through z^{si} = θ_{si} v^{si}, i=1,...,n, where Z_s = (v^{s1},...,v^{sn}),
θ_{si} = sup {t: f̃(tv^{si}) ≥ 0}). Hence K_s ∩ D̃ is bounded, and consequently Γ ∩ D̃
is bounded, because Γ ∩ D̃ ⊂ K_s ∩ D̃. That is, Γ ∩ D̃ is a line segment [0,y*]. If v*
and v^s denote the intersections of the rays through y* and ω(Z_s), respectively, with
the simplex [e^1,...,e^n], then, as s → ∞, s ∈ Δ, we have v^s → v*, and hence, by
Lemma IX.1, ω(Z_s) → y*.

Lemma IX.3. If the cone K_s chosen for further subdivision is of the second cat-
egory at some iteration s, then f̃(y) > 0 for all y ∈ D̃ in the simplex T_s = {y ∈ ℝ^n_+:
Σ_{i=1}^n y_i ≤ θ(Z_s)}.

Proof. The selection rule implies that ℛ_s^(1) = ∅ and θ(Z_s) ≤ θ(Z) for all Z ∈ ℛ_s.
Now if y ∈ D̃ ∩ T_s, then y either belongs to a cone already deleted at an iteration
k ≤ s (in which case f̃(y) > 0), or else belongs to a cone with matrix Z ∈ ℛ_s. In the
latter case, since θ(Z_s) ≤ θ(Z), we have Σ_{i=1}^n y_i ≤ θ(Z), and it follows from the de-
finition of θ(Z) that f̃(y) > 0.

Lemma IX.4. If the algorithm generates infinitely many cones K_s of the second
category, s ∈ Δ ⊂ {1,2,...}, then θ(Z_s) → ∞ as s → ∞, s ∈ Δ.

Proof. Among the cones in the partition of K_1 = ℝ^n_+ there exists a cone, say K_{s_1},
that contains infinitely many cones of the second category generated by the algo-
rithm. Then among the cones in the partition of K_{s_1} there exists a cone, say K_{s_2},
that contains infinitely many cones of the second category. Continuing in this way,
we find an infinite nested sequence of cones K_{s_ν}, ν=1,2,..., each of which contains in-
finitely many cones of the second category. Since the subdivision is exhaustive, the
intersection ∩_{ν=1}^∞ K_{s_ν} = Γ is a ray. If Γ contains a point y such that f̃(y) < 0,
then, since y ∉ D̃, there exists a ball U around y, disjoint from D̃, such that f̃(u) < 0
for all u ∈ U. Then for ν sufficiently large, any ray contained in K_{s_ν} will meet U.
This implies that for all k such that K_k ⊂ K_{s_ν}, we have θ_{ki} < ∞ (i=1,...,n) and
μ(Z_k) < ∞. That is, all subcones of K_{s_ν} generated by the algorithm will be of the
first category. This contradicts the above property of K_{s_ν}. Therefore, we must have
f̃(y) ≥ 0 for all y ∈ Γ. Since f̃ is concave and f̃(0) > 0, it follows that f̃(y) > 0 for
all y ∈ Γ.

For any given positive number N, consider a point c ∈ Γ and a ball W around c such
that Σ_{i=1}^n y_i > N and f̃(y) > 0 for all y ∈ W. When ν is sufficiently large, say
ν ≥ ν_0, the edges of K_{s_ν} will meet W at points y^{s_ν,i} such that f̃(y^{s_ν,i}) > 0
(i=1,...,n). Since Σ_{j=1}^n y_j^{s_ν,i} > N, it follows that θ_{s_ν,i} > N (i=1,...,n), and
hence θ(Z_{s_ν}) > N. Now for any s ∈ Δ such that s ≥ s_{ν_0}, the cone K_s must be a
subcone of some cone in ℛ_{s_{ν_0}}. Therefore, θ(Z_s) ≥ θ(Z_{s_{ν_0}}) > N.

Theorem IX.2. If Algorithm IX.6 generates infinitely many cones of the second
category, then the LCP problem has no solution. Otherwise, beginning at some
iteration, the algorithm generates only cones of the first category. In the latter case, if
the algorithm is infinite, then the sequence y^k = ω(Z_k) has at least one accumulation
point and each of its accumulation points yields a solution of (LCP).

Proof. Suppose that the algorithm generates an infinite number of cones K_s of
the second category, s ∈ Δ ⊂ {1,2,...}. By Lemma IX.3, the problem has no solution
in the simplices T_s = {y ∈ ℝ^n_+: Σ_{i=1}^n y_i ≤ θ(Z_s)}. On the other hand, by Lemma
IX.4, θ(Z_s) → ∞ as s → ∞, s ∈ Δ. Therefore, the problem has no solution.

Now suppose that for all k ≥ k_0, ℳ_k consists only of cones of the first category. If
the algorithm is infinite, then it generates at least one infinite nested sequence of
cones K_s, s ∈ Δ ⊂ {1,2,...}. Since all K_s, s ≥ k_0, are of the first category, it follows
from Lemma IX.2 that lim_{s→∞} ω(Z_s) = y*, where y* is such that [0,y*] = D̃ ∩
∩_{s=1}^∞ K_s.

Now consider any accumulation point ȳ of {ω(Z_k)}, for example, ȳ = lim_{r→∞} ω(Z_{k_r}).
Reasoning as in the beginning of the proof of Lemma IX.4, we can find an infinite
nested sequence of cones K_s, s ∈ Δ' ⊂ {1,2,...}, such that each K_s, s ∈ Δ', contains
infinitely many members of the sequence {K_{k_r}, r=1,2,...}. Without loss of gen-
erality, we may assume that K_{k_s} ⊂ K_s (s ∈ Δ'). Since the subdivision process is ex-
haustive, the intersection of all of the K_s, s ∈ Δ', is the ray passing through ȳ. If
f̃(ȳ) > 0, then around some point c of this ray there exists a ball U such that
[0,ȳ] ∩ U = ∅ and f̃(u) > 0 ∀u ∈ U. Then for all sufficiently large s ∈ Δ', we have
[0,ω(Z_{k_s})] ∩ U = ∅.

On the other hand, the i-th edge of the cone K_{k_s} meets U at some point u^{k_s,i}.
Since f̃(u^{k_s,i}) > 0, it follows that u^{k_s,i} will lie on the line segment [0,z^{k_s,i}]. Con-
sequently, since μ(Z_{k_s}) ≥ 1, the line segment [0,ω(Z_{k_s})] meets the simplex
[u^{k_s,1},...,u^{k_s,n}] at some point u^{k_s}. Then we have u^{k_s} ∉ U (because
u^{k_s} ∈ [0,ω(Z_{k_s})] ⊂ ℝ^n \ U), while u^{k_s} ∈ [u^{k_s,1},...,u^{k_s,n}] ⊂ U. This contra-
diction shows that f̃(ȳ) ≤ 0, and hence f̃(ȳ) = 0, since ȳ ∈ D̃.  •

Corollary IX.2. For any ε > 0 and any N > 0, Algorithm IX.6 either finds an
ε-approximate solution of (20) (i.e., a point y ∈ D̃ such that f̃(y) < ε) after finitely
many iterations or else establishes that the problem has no solution in the ball
‖y‖ < N.

Proof. If the first alternative in the previous theorem holds, then, for all k such
that K_k is of the second category and the simplex T_k = {y ∈ ℝ^n_+: Σ_{i=1}^n y_i ≤ θ(Z_k)}
contains the ball ‖y‖ < N, it follows that we have f̃(y) > 0 for all y ∈ D̃ with
‖y‖ < N.

If the second alternative holds, then for sufficiently large k we have f̃(ω(Z_k)) < ε.  •

Remarks IX.2. (i) As in the case of Algorithm IX.5, when ℛ_k becomes too
large, it is advisable to restart (i.e., to return to Step 0), with x^0 ← ω(Z) and

    D̃ ← D̃ ∩ {y: ℓ_1(y) ≥ 1} ,  f̃ ← f̃ ,

where Z is the matrix of a cone such that ω(Z) is a vertex of D̃ satisfying
f̃(ω(Z)) > 0, and where ℓ_1(y) = 1 is the equation of the hyperplane in Proposition
IX.7 for Z_1 constructed in Step 0.

(ii) When the problem is known to be solvable, the algorithm can be made finite as
follows. At each iteration k, denote by 𝒫_k^(1) the set of cones of the first category in
𝒫_k, and let y^1 = ω(Z_1), y^k ∈ argmin {f̃(y^{k-1}), f̃(ω(Z)) ∀Z ∈ 𝒫_k^(1)} (i.e., y^k is
the best point of D̃ known up to iteration k). If y^k is a vertex of D̃, let ŷ^k = y^k; other-
wise, find a vertex ŷ^k of D̃ such that f̃(ŷ^k) ≤ f̃(y^k). Since f̃(y^k) → 0 (k → ∞) and
the vertex set of D̃ is finite, we have f̃(ŷ^k) = 0 after finitely many steps.

(iii) To generate an exhaustive subdivision process one can use the rules discussed
in Section VII.1.6, for example, the rule (*) in Section VII.1.6, which generates
mostly ω-subdivisions. Note that when ω(Z_k) is a direction (i.e., μ(Z_k) = +∞), the
ω-subdivision of Z_k is the subdivision with respect to the point where the ray in the
direction ω(Z_k) intersects the simplex [Z_k] defined by the matrix Z_k.

(iv) Algorithm IX.6 can be considered as an improved version of an earlier conical
algorithm for (LCP) proposed by Thoai and Tuy (1983). The major improvement
consists in using a more efficient selection rule and an exhaustive subdivision process
which involves mostly ω-subdivisions instead of pure bisection.

2.4. Other Global Optimization Approaches to (LCP)

The above methods for solving (LCP) are based on the reduction of (LCP) to the
concave minimization problem

    minimize Σ_{i=1}^n min {x_i, M_i x + q_i}  s.t.  x ≥ 0 , Mx + q ≥ 0 .

Since min {x_i, M_i x + q_i} = x_i + min {0, M_i x − x_i + q_i}, by introducing the auxili-
ary variables w_i we can rewrite this concave minimization problem in the separable
form

    minimize Σ_{i=1}^n {x_i + min (0, w_i)}
    s.t.  x ≥ 0 , Mx + q ≥ 0 , w − Mx + x = q .

Bard and Falk (1982) proposed solving this separable program by a branch and
bound algorithm which reduces the problem to a series of linear programs with con-
straint set D = {x: x ≥ 0, Mx + q ≥ 0} and cost functions

    Σ_{i=1}^n {x_i + α_i (M_i x − x_i + q_i)} ,

where α_i is a parameter with value 0, 1/2 or 1, depending on the stage of the algo-
rithm.
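Collecting terms, each such cost function is linear in x with coefficient vector
1 + (M − I)^T α and constant α^T q; the small helper below (our own illustration)
makes this explicit:

    import numpy as np

    def bard_falk_cost(M, q, alpha):
        # Cost sum_i { x_i + alpha_i (M_i x - x_i + q_i) }, alpha_i in {0, 1/2, 1},
        # written as c.x + const with c = 1 + (M - I)^T alpha, const = alpha.q.
        n = len(q)
        c = np.ones(n) + (M - np.eye(n)).T @ alpha
        return c, alpha @ q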
A more direct approach to (LCP) is based on the equivalence of (LCP) and the
quadratic programming problem

    minimize xMx + qx  s.t.  x ≥ 0 , Mx + q ≥ 0 .     (23)

Here M can be replaced by the symmetric matrix M̄ = (1/2)(M + M^T). When M is
positive semidefinite, (23) is a convex quadratic program and can be solved by effi-
cient procedures. When M is negative semidefinite, it is a concave quadratic program
and can be treated by the methods in Sections V.4.2, VI.3.3, VII.4.3, and VIII.3.2.
In the general case when M is indefinite, (23) becomes a harder d.c. optimization
problem (see Chapter X).

Pardalos and Rosen (1988) showed that (LCP) is equivalent to the following
mixed zero-one integer program:

    (MIP)    maximize α
             s.t.  0 ≤ M_i y + q_i α ≤ 1 − z_i   (i=1,...,n) ,
                   0 ≤ y_i ≤ z_i                 (i=1,...,n) ,
                   z_i ∈ {0,1}  (i=1,...,n) ,   0 ≤ α ≤ 1 .

Of course, here we assume that q_i < 0 for at least one i (otherwise x = 0 is an ob-
vious solution of (LCP)).

Proposition IX.8. If (MIP) has an optimal solution (ᾱ,ȳ,z̄) with ᾱ > 0, then
x̄ = ȳ/ᾱ solves (LCP). If the optimal value of (MIP) is ᾱ = 0, then (LCP) has no
solution.

Proof. (MIP) always has the feasible solution y = 0, α = 0, and z_i = 0 or 1
(i=1,...,n). Since the constraint set is bounded, (MIP) has an optimal solution
(ᾱ,ȳ,z̄). Suppose that ᾱ > 0, and let x̄ = ȳ/ᾱ. Then Mȳ + ᾱq = ᾱ(Mx̄ + q) ≥ 0,
and hence Mx̄ + q ≥ 0. Furthermore, for each i, either z̄_i = 0 (and hence x̄_i = 0) or
else z̄_i = 1 (and hence M_i ȳ + q_i ᾱ = 0, i.e., M_i x̄ + q_i = 0). Therefore, x̄ solves
(LCP).

Now suppose that ᾱ = 0. If (LCP) had a solution x, then we would have
max {x_i, M_i x + q_i (i=1,...,n)} > 0. Denote by a the reciprocal of this positive
number. Then a feasible solution of (MIP) is α = a, y = ax, z_i = 0 if x_i = 0, z_i = 1
if x_i > 0. Hence we have a ≤ ᾱ = 0, a contradiction. Therefore, (LCP) has no
solution.
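As an illustration of this equivalence (not part of the original text), the formulation
(MIP) can be handed to any mixed-integer solver. The sketch below uses the PuLP
modeling library with its bundled CBC solver; the function name and tolerance are
our own choices:

    from pulp import LpProblem, LpVariable, LpMaximize, lpSum, value

    def lcp_via_mip(M, q, tol=1e-8):
        # Pardalos-Rosen formulation (MIP); returns x solving (LCP), or None.
        n = len(q)
        prob = LpProblem("LCP_MIP", LpMaximize)
        a = LpVariable("alpha", 0, 1)
        y = [LpVariable("y%d" % i, 0, 1) for i in range(n)]
        z = [LpVariable("z%d" % i, cat="Binary") for i in range(n)]
        prob += a                                   # objective: maximize alpha
        for i in range(n):
            My = lpSum(M[i][j] * y[j] for j in range(n)) + q[i] * a
            prob += My >= 0
            prob += My <= 1 - z[i]      # z_i = 1 forces M_i y + q_i alpha = 0
            prob += y[i] <= z[i]        # z_i = 0 forces y_i = 0
        prob.solve()
        alpha = value(a)
        if alpha is not None and alpha > tol:
            return [value(v) / alpha for v in y]    # x = y / alpha
        return None                     # alpha = 0: (LCP) has no solution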

Using this result, Pardalos and Rosen suggested the following method of solving
(LCP):

1. Solve the quadratic program (23) by a local optimization method. If x̄ is a
solution and f(x̄) = 0, then stop. Otherwise, go to 2).

2. Choose n orthogonal directions u^i (i=1,...,n), and solve the linear programs
min {c^T x: x ∈ D} with c = u^i or c = −u^i (i=1,...,n). This will generate k ≤ 2n
vertices x^j of D (j ∈ J). If f(x^j) = 0 for some j, then stop. Otherwise, go to 3).

3. Starting from the vertex x^j (j ∈ J) with smallest f(x^j), solve the quadratic
program (23) (by a local optimization method) to obtain a Kuhn-Tucker point
x̂. If f(x̂) = 0, then stop. Otherwise, go to 4).

4. Solve (MIP) by a mixed integer programming algorithm.

In this approach, (MIP) is used only as a last resort, when local methods fail.
From the computational results reported by Pardalos and Rosen (1987), it seems
that the average complexity of this algorithm is O(n^4).
Let us also mention another approach to (LCP), which consists in reformulating
(23) as a bilinear program:

    minimize xw  s.t.  x ≥ 0 , w ≥ 0 , −Mx + w = q .     (24)

Since the constraints of this program involve both x and w, the standard bilinear
programming algorithms discussed in Section IX.1 cannot be used. The problem can,
however, be handled by a method of Al-Khayyal and Falk (1983) for jointly con-
strained biconvex programming (cf. Chapter X). For details of this approach, we
refer to Al-Khayyal (1986a).

2.5. The Concave Complementarity Problem

Now consider the concave complementarity problem (17)

    x ≥ 0 , h(x) ≥ 0 , Σ_{i=1}^n x_i h_i(x) = 0 ,

where h: ℝ^n → ℝ^n is a given concave mapping such that h_i(0) < 0 for at least one
i ∈ {1,...,n}. Setting

    D = {x: x ≥ 0, h(x) ≥ 0} ,  G = {x: Σ_{i=1}^n min {x_i, h_i(x)} > 0} ,

we saw that the sets D and G are convex, and the problem is to find a point
x̄ ∈ D \ G.

By translating the origin to a suitable point of G and performing some simple
manipulations, we can reformulate the problem as follows.

Given an open convex set G containing the origin 0 and a closed convex set
D ⊂ ℝ^n_+ ∩ cl G (where cl G denotes the closure of G), find a point x ∈ D \ G.

Note that the assumptions in this reformulation imply that, if the problem is solv-
able, then a solution always exists on the part of the boundary of D that consists of
points x such that x = θy with y ∈ ℝ^n_+ \ {0}, θ = sup {t: ty ∈ D}. Therefore, re-
placing D by the convex hull of D ∪ {0} if necessary, we may assume that 0 ∈ D.
The following algorithm can then be deduced from the normal conical algorithm for
(CP) (Algorithm VII.2).

Algorithm IX.7.

Select an exhaustive subdivision process.

0) Let Z_1 = (e^1,e^2,...,e^n), where e^i is the i-th unit vector of ℝ^n, and let
𝒫_1 = ℳ_1 = {Z_1}. Construct a polyhedron D_1 ⊃ conv (D ∪ {0}). Set k = 1.

k.1. For each Z ∈ 𝒫_k solve the linear program

    max Σ_{i=1}^n λ_i/θ_i   s.t.  Zλ ∈ D_1 , λ ≥ 0 ,

where Z = (v^1,...,v^n), θ_i = sup {t: tv^i ∈ G} (i=1,...,n), λ = (λ_1,...,λ_n)^T (as
usual, λ_i/θ_i = 0 if θ_i = +∞). Let μ(Z) and λ̄ be the optimal value and a basic
optimal solution of this linear program, and let ω(Z) = Zλ̄. If we have
ω(Z) ∈ D \ G for some Z ∈ 𝒫_k, then terminate. Otherwise (ω(Z) ∈ G or
ω(Z) ∉ D ∀Z ∈ 𝒫_k), go to k.2.

k.2. In ℳ_k delete all Z ∈ 𝒫_k satisfying μ(Z) < 1. Let ℛ_k be the collection of
remaining elements of ℳ_k. If ℛ_k = ∅, then terminate: D ⊂ G (the problem has
no solution). Otherwise, go to k.3.

k.3. Let ℛ_k^(1) = {Z ∈ ℛ_k: Z = (v^1,...,v^n), θ_i < +∞ ∀i, μ(Z) < +∞}. If
ℛ_k^(1) ≠ ∅, select

    Z_k ∈ argmax {μ(Z): Z ∈ ℛ_k^(1)} .

Otherwise, select

    Z_k ∈ argmin {θ(Z): Z ∈ ℛ_k} ,

where, for Z = (v^1,...,v^n), we define

    θ(Z) = min {θ_i: i=1,...,n} = min_i sup {t: tv^i ∈ G} .

Subdivide (the cone generated by) Z_k according to the chosen exhaustive sub-
division rule.

k.4. Denote ω^k = ω(Z_k). If ω^k ∈ D, then set D_{k+1} = D_k. Otherwise, take a vector
p^k such that the halfspace p^k(x − w̄^k) ≤ 0 separates ω^k from D (where w̄^k = θ̄ω^k,
θ̄ = sup {t: tω^k ∈ D}), and set D_{k+1} = D_k ∩ {x: p^k(x − w̄^k) ≤ 0}.

k.5. Let 𝒫_{k+1} be the partition of Z_k obtained in Step k.3, and let ℳ_{k+1} be the
collection that results from ℳ_k by substituting 𝒫_{k+1} for Z_k. Set k ← k+1 and
go to k.1.

As before, we shall say that a cone K with matrix Z = (v^1,...,v^n) is of the first
category if θ_i < +∞ ∀i and μ(Z) < +∞, and that it is of the second category other-
wise.

Theorem IX.3. If Algorithm IX.7 generates infinitely many cones of the second
category, then the concave complementarity problem has no solution. Otherwise, be-
ginning at some iteration, the algorithm generates only cones of the first category. In
the latter case, if the algorithm is infinite, then the sequence ω^k = ω(Z_k) has at least
one accumulation point, and any of its accumulation points yields a solution of the
problem.

Proof. It is easily seen that Lemmas IX.3 and IX.4 are still valid (the condition
f̃(y) > 0 now means that y ∈ G), and hence the first part of the theorem can be
established in the same way as the first part of Theorem IX.2.

Now suppose that for all sufficiently large k, ℳ_k consists only of cones of the first
category. If the algorithm is infinite, then it generates at least one infinite nested se-
quence of cones K_s of the first category, s ∈ Δ ⊂ {1,2,...}. For each s, let
Z_s = (v^{s,1},...,v^{s,n}), z^{s,i} = θ_{s,i} v^{s,i}, θ_{s,i} = sup {t: tv^{s,i} ∈ G} (i=1,...,n),
and let ℓ_s(x) = 1 be the equation of the hyperplane through z^{s,1},...,z^{s,n} (so that
ω^s ∈ argmax {ℓ_s(x): x ∈ D_s ∩ K_s}; see Proposition IX.7). Then, denoting the
smallest index s ∈ Δ by s_1, we find that ω^s ∈ {x ∈ K_{s_1}: ℓ_{s_1}(x) ≤ μ(Z_{s_1})} for
all s ∈ Δ, i.e., the sequence {ω^s, s ∈ Δ} is bounded, and hence must have an
accumulation point.

Consider any accumulation point x̄ of the sequence {ω^k} (for all k sufficiently large,
ω^k is a point). For example, let x̄ = lim_{r→∞} ω^{k_r}. Reasoning as in the proof of
Lemma IX.4, we can find an infinite nested sequence of cones K_s, s ∈ Δ' ⊂ {1,2,...},
such that each K_s, s ∈ Δ', contains infinitely many members of the sequence
{K_{k_r}, r=1,2,...}. Without loss of generality we may assume that K_{k_s} ⊂ K_s
(s ∈ Δ') and some K_{s_1} is of the first category. It is easily verified that all of the
conditions of Theorem II.2 are fulfilled for the set D ∩ {x ∈ K_{s_1}: ℓ_{s_1}(x) ≤ μ(Z_{s_1})}
and the sequences {ω^{k_r}}, {p^{k_r}}. Therefore, by this theorem, we conclude that
x̄ ∈ D.
On the other hand, the exhaustiveness of the subdivision process implies that the
simplex [Z_{k_r}] = [v^{k_r,1},...,v^{k_r,n}] shrinks to a point v̄. Hence z^{k_r,i} = θ_{k_r,i} v^{k_r,i}
converges to a point z̄ = θ̄v̄ as r → ∞. Since z^{k_r,i} ∈ ∂G (the boundary of G), we
must have z̄ ∈ ∂G. If z^{k_r} denotes the point where the halfline from 0 through ω^{k_r}
meets the simplex [z^{k_r,1},...,z^{k_r,n}], then obviously z^{k_r} → z̄. But
ω^{k_r} = μ(Z_{k_r}) z^{k_r}, and since μ(Z_{k_r}) ≥ 1, it follows that x̄ = lim_{r→∞} ω^{k_r} ∉ G.
Therefore we have x̄ ∈ D \ G.  •

3. PARAMETRIC CONCAVE PROGRAMMING

An important problem that arises in certain applications is the following:

(PCP) Find the smallest value θ such that

    min {f(x): x ∈ D, cx ≤ θ} ≤ α ,     (25)

where D is a polyhedron in ℝ^n, c is an n-vector, and f: ℝ^n → ℝ is a concave function.

We shall call this a parametric concave programming problem, since the minimi-
zation problem in (25) is a concave program depending on the parameter θ.

In a typical interpretation of (PCP), D represents the set of all feasible produc-
tion programs, while the inequality cx ≤ θ expresses a constraint on the amount of a
certain scarce commodity that can be used in the production, and f(x) is the produc-
tion cost of the program x. Then the problem is to find the least amount of the
scarce commodity required for a feasible production program with a cost not ex-
ceeding a given level α.

In the literature, the PCP problem has received another formulation which is
often more convenient.

Proposition IX.9. The PCP problem is equivalent to

(LRCP)    minimize cx  s.t.  x ∈ D , f(x) ≤ α .     (26)

Proof. We may of course assume that min f(D) ≤ α, for otherwise both problems
are infeasible. If θ_0 is optimal for (PCP) and x^0 is an optimal solution of the corres-
ponding concave program, then obviously cx^0 = θ_0, and x^0 is feasible for (LRCP);
hence θ_0 ≥ θ_1 := optimal value of (LRCP).

Conversely, if x^1 is optimal for (LRCP) and cx^1 = θ_1, then θ_1 is feasible for
(PCP), hence θ_1 ≥ θ_0. Therefore, θ_1 = θ_0, and x^0 is optimal for (LRCP), while θ_1
is optimal for (PCP).

An inequality of the form f(x) ≤ α, where f(x) is a concave function, is called a re-
verse convex inequality, because it becomes convex when reversed (see Chapter I). If
this inequality is omitted, then problem (26) is merely a linear program; therefore it
is often referred to as a linear program with an additional reverse convex constraint.

LRCP problems were first studied with respect to global solutions by Bansal and
Jacobsen (1975 and 1975a), Hillestad (1975) and also Hillestad and Jacobsen (1980).
In Bansal and Jacobsen (1975 and 1975a) the special problem of optimizing a net-
work flow capacity under economies of scale was discussed. Several methods for
globally solving (LRCP) with bounded feasible domain have been proposed since
then. Hillestad (1975) and Hillestad and Jacobsen (1980 and 1980a) developed
methods based on the property that an optimal solution lies on an edge of the poly-
hedron D. These authors also showed how cuts that were originally devised for con-
cave minimization problems can be applied to (LRCP). Further developments along
these lines were given in Sen and Sherali (1985 and 1987), Gurlitz (1985) and Fülöp
(1988). On the other hand, the branch and bound methods originally proposed for
minimizing concave functions over polytopes have been extended to (LRCP) by Muu
(1985), Hamami and Jacobsen (1988), Utkin, Khachaturov and Tuy (1988), and
Horst (1988).

For some important applications of (LRCP) we refer to the discussion in Section
I.2.5.

3.1. Basic Properties

We shall make the following assumptions:

(a) D is nonempty and contains no lines;

(b) either D or G := {x: f(x) ≥ α} is compact;

(c) min {cx: x ∈ D} < min {cx: x ∈ D, f(x) ≤ α}.

The last assumption simply means that the constraint f(x) ≤ α is essential and
(LRCP) does not reduce to the trivial linear program min {cx: x ∈ D}. It follows
that there exists a point w satisfying

    w ∈ D , f(w) > α , cw < cx  ∀x ∈ D \ int G     (27)

(such a point is provided, for example, by an optimal solution of the linear program
min {cx: x ∈ D}).


The following property was first established by Hillestad and Jacobsen (1980) in
the case when D is bounded, and was later extended to the general case by Tuy
(1983) (see also Sen and Sherali (1985)).

Let conv Adenote the closure of conv A.

Proposition IX.IO. The set conv(D \ int G) is a polyhedron whose extreme diT'-

ections are the same as those of D and whose vertices are endpoints of sets of the
form conv(E \ int G), where Eis any edge of D.

Proof. Denote by M the set of directions and points described in the proposition.

Obviously, M ( D \ int G (note that, in view of assumption (b), any recession dir-
ection of D must be a recession direction of D \ int G). Hence, convM (
conv(D \ int G).
We now show the inverse inclusion. Suppose z E D \ int G. Since G is convex,
there is a halfspace H = {x: h(x~) ~ O} such that zEH ( IRn \ G. Then, since D n H
493

contains no lines, z belongs to the convex hull of the set of extreme points and direc-

tions of H n D. But obviously any extreme direction of the latter polyhedron is a re-

cession direction of D, while any extreme point must lie on an edge E of D such that

E \ int G is nonempty, and hence must be an endpoint of the segment

conv(E \ int G). Consequently, D \ int Gis contained in conv M; and, since conv M
is closed (M is finite), conv(D \ int G) ( conv M. Hence, conv(D \ int G) = conv M.


Since minimizing a linear form over a closed set is equivalent to minimizing it
over the closure of the convex hull of this set, we have

Corollary IX.3. (LRCP) or (PCP) is equivalent to the implicit linear program

    min {cx: x ∈ cl conv(D \ int G)} .     (28)

Here the term "implicit" refers to the fact that, although cl conv(D \ int G) is a poly-
hedron, its constraints are not given explicitly (this constitutes, of course, the main
difficulty of the problem).

Proposition IX.11. If (LRCP) is solvable, then at least one of its optimal solutions
lies on the intersection of the boundary ∂G of G with an edge of D.

Proof. If (LRCP) is solvable, then at least one of its optimal solutions (i.e., an
optimal solution of (28)), say x^0, is a vertex of cl conv(D \ int G), and hence is an end-
point of the set cl conv(E \ int G), where E is some edge of D. If f(x^0) < α, then x^0
must be a local (hence, global) minimum of cx over D, contrary to assumption (c).
Hence, f(x^0) = α, i.e., x^0 ∈ ∂G.  •


It follows from the above that in the search for an optimal solution of (LRCP) we

can restrict ourselves to the set of intersection points of 00 with the edges of D.

Several earlier approaches to solving (LRCP) are based on this property (see, e.g.,
494

Hillestad and Jacobsen (1980), and Tuy and Thuong (1984».

Another property which is fundamental in some recent approaches to the LRCP
problem is the following.

Definition IX.1. We say that the LRCP problem is regular if D \ int G = cl(D \ G),
i.e., if any feasible point is the limit of a sequence of points x ∈ D satisfying f(x) < α.

Thus, if D \ int G has isolated points, as in Fig. IX.5, then the problem is not
regular. However, a regular problem may have a disconnected feasible set as in Fig.
IX.6.

Fig. IX.5 (a nonregular problem: at the isolated feasible point x^0 one has
D(x^0) \ G = ∅, although x^0 is not optimal; the curve f(x) = α and the optimal
point are indicated)

Fig. IX.6

For any point x^0 of D let us denote

    D(x^0) = {x ∈ D: cx ≤ cx^0} .     (29)

Theorem IX.4. In order that a feasible solution x^0 be globally optimal for (LRCP),
it is necessary and, if the problem is regular, also sufficient that

    D(x^0) \ G = ∅ .     (30)

Proof. Suppose D(x^0) \ G is not empty, i.e., there exists a point z of D(x^0) such
that f(z) < α. Let x^1 be the point where the boundary ∂G of G intersects the line
segment joining z and the point w satisfying (27). Then x^1 belongs to D \ int G,
and, since cw < cz, it follows that cx^1 < cz ≤ cx^0; and hence x^0 is not optimal. Con-
versely, suppose that (30) holds and the problem is regular. If there were a feasible
point x with cx < cx^0, in any neighbourhood of x we would find a point x' ∈ D with
f(x') < α. When this point is sufficiently near to x, we would have cx' < cx^0, i.e.,
x' ∈ D(x^0) \ G, contrary to the assumption. Therefore, x^0 must be optimal. This
completes the proof.



It is easy to give examples of problems where, because regularity fails, condition
(30) holds without x^0 being optimal (see Fig. IX.5). However, the next result shows
the usefulness of condition (30) in the most general case, even when we do not know
whether a given problem is regular.

For each k = 1,2,... let ε_k: ℝ^n → ℝ be a convex function such that 0 < ε_k(x)
∀x ∈ D, max {ε_k(x): x ∈ D} → 0 (k → ∞), and consider the perturbed problem

(LRCP_k)    minimize cx  s.t.  x ∈ D , f(x) − ε_k(x) ≤ α .

Denote G_k = {x: f(x) − ε_k(x) ≥ α}.

Theorem IX.5. Let x^k be a feasible solution to (LRCP_k) satisfying
D(x^k) \ G_k = ∅. Then any accumulation point x̄ of the sequence x^k, k=1,2,..., is a
global optimal solution of (LRCP).

Proof. Clearly, because of the continuity of f(x) − ε_k(x), x̄ is feasible for (LRCP).
For any feasible solution x of (LRCP), since f(x) − ε_k(x) < f(x) ≤ α, we have
x ∉ G_k. Therefore, the condition D(x^k) \ G_k = ∅ implies that x ∉ D(x^k). Since
x ∈ D, we must have cx > cx^k, and hence cx ≥ cx̄. This proves that x̄ is a global
optimal solution of (LRCP).

In practice, the most commonly used perturbation functions are ε(x) ≡ ε or
ε(x) = ε(‖x‖^2 + 1) (for D bounded), where ε ↓ 0.

Proposition IX.13. For sufficiently small ε > 0, the problem

(LRCP-ε)    min cx  s.t.  x ∈ D , f(x) − ε(‖x‖^2 + 1) ≤ α

is regular.

Proof. Since the vertex set V of D is finite, there exists ε_0 > 0 small enough so
that ε ∈ (0,ε_0) implies that F(ε,x) := f(x) − ε(‖x‖^2 + 1) ≠ α ∀x ∈ V. Indeed, if
V_1 = {x ∈ V: f(x) > α} and ε_0 satisfies ε_0(‖x‖^2 + 1) < f(x) − α ∀x ∈ V_1, then
whenever 0 < ε < ε_0, we have for all x ∈ V \ V_1: f(x) − ε(‖x‖^2 + 1) ≤ α − ε < α,
while for all x ∈ V_1: f(x) − ε(‖x‖^2 + 1) > f(x) − ε_0(‖x‖^2 + 1) > α. Also note that
the function F(ε,x) := f(x) − ε(‖x‖^2 + 1) is strictly concave in x. Now consider the
problem (LRCP-ε), where 0 < ε < ε_0, and let x ∈ D be such that F(ε,x) ≤ α. If x is
not a vertex of D, then x is the midpoint of a line segment Δ ⊂ D, and, because of
the strict concavity of F(ε,·), any neighbourhood of x must contain a point x' of Δ
such that F(ε,x') < α. On the other hand, if x is a vertex of D, then F(ε,x) < α, and
any point x' of D sufficiently near to x will satisfy F(ε,x') < α. Thus, given any
x ∈ D such that F(ε,x) ≤ α, there exists a point x' arbitrarily close to x such that
x' ∈ D, F(ε,x') < α. This means that the problem (LRCP-ε) is regular.

It follows from the above result that an LRCP problem can always be regularized
by a slight perturbation. Moreover, this perturbation makes the function f(x) strictly
concave, a property which may be very convenient in certain circumstances.

3.2. Outer Approximation Method for (LRCP)

To simplify the presentation of the methods, in the sequel instead of (b) we shall
assurne astronger condition:

(b') Dis a bounded polyhedron (a polytope).

With suitable modifications, most of the results below can be extended to the case
when D is unbounded.
49S

Under assumptions (a), (b'), (c), if w is a basic optimal solution of the linear pro-

gram min {cx: x E D}, then, by transforming the problem to the space of nonbasic

variables relative to this basic solution, we can always arrange that:

1) w = 0 is a vertex of D;

2) f(O) > a and min {cx: x E D} > 0 (see (27));


3) D is defined by constraints of ~he form Ax ~ b, x ~ O.

One of the most natural approaches to solving the LRCP problem is by outer ap-
proximation (cf. Forgó (1988), Hillestad and Jacobsen (1980a), and Fülöp (1988);
see also Bulatov (1977) for a related discussion). This approach is motivated by the
following simple observation.

Proposition IX.14. Let x^0 be a basic optimal solution of the linear program

    min {cx: x ∈ D} .

Suppose that f(x^0) > α, i.e., x^0 is not feasible for (LRCP). If π(x−x^0) ≥ 1 is an
α-valid cut constructed at x^0 for the concave program

    min {f(x): x ∈ D} ,

then the inequality

    ℓ(x) := π(x−x^0) − 1 ≥ 0

excludes x^0 without excluding any feasible solution of (LRCP).

Proof. The proof is trivial, since, by definition, an α-valid cut at x^0 is an in-
equality ℓ(x) ≥ 0 which excludes x^0 without excluding any point x ∈ D such that
f(x) ≤ α.

It follows from this fact that concavity cuts (see Chapter III) can be used in outer
approximation methods to solve (LRCP).

Specifically, to solve (LRCP) by the outer approximation approach one constructs a
nested sequence of polytopes S_0 ⊃ S_1 ⊃ ... ⊃ S_k ⊃ ... in the following way. One
starts with S_0 = D. When S_0,...,S_k have been constructed, one solves the linear
program

    min {cx: x ∈ S_k} ,

obtaining a basic optimal solution x^k. If x^k happens to be feasible for (LRCP), then
the procedure terminates: x^k solves (LRCP), since S_k contains the feasible set of
(LRCP). Otherwise, one generates a concavity cut π^k(x−x^k) ≥ 1 to exclude x^k and
forms S_{k+1} by adding this constraint to S_k. The procedure is then repeated with
S_{k+1} in place of S_k (see Fig. IX.7); a sketch of this loop in code follows.
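The cutting loop can be written down in a few lines. The sketch below is our own
illustration, in which concavity_cut(x) is a hypothetical routine returning the
vector π of an α-valid cut π(x − x^k) ≥ 1 at the infeasible vertex x^k (cf. Chapter III):

    import numpy as np
    from scipy.optimize import linprog

    def lrcp_outer_approx(c, A, b, f, alpha, concavity_cut, max_iter=50):
        # Outer approximation for (LRCP): min cx s.t. Ax <= b, x >= 0,
        # f(x) <= alpha.  The polytope S_k is stored via (A_k, b_k).
        A_k, b_k = A.copy(), b.copy()
        for _ in range(max_iter):
            lp = linprog(c, A_ub=A_k, b_ub=b_k, bounds=(0, None))
            xk = lp.x
            if f(xk) <= alpha:
                return xk                    # feasible, hence optimal
            pi = concavity_cut(xk)
            # pi.(x - xk) >= 1   <=>   -pi.x <= -(1 + pi.xk)
            A_k = np.vstack([A_k, -pi])
            b_k = np.append(b_k, -(1.0 + pi @ xk))
        return None   # convergence is not guaranteed (see below)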


Although this method is conceptually simple, and, as reported in Hillestad and
Jacobsen (1980a), it may sometimes help solve problems which otherwise would be
difficult to attack, its convergence is not guaranteed. For example, Gurlitz (1985)
has shown that, when applied to the 5-dimensional problem

    min {x_1: 0 ≤ x_i ≤ 1 (i=1,...,5), Σ_i x_i^2 ≥ 4.5} ,

the outer approximation method using such cuts will generate a sequence x^k which
converges to an infeasible solution.

To overcome the difficulty and ensure finiteness of the procedure, Fülöp (1988) pro-
posed combining concavity cuts with facial cuts, similar to those introduced by
Majthay and Whinston (1974) (see Section V.2). His method involves solving a set
covering subproblem in certain steps.

Note that by Corollary III.4, a sufficient convergence condition is that the se-
quence {π^k} be bounded. A promising approach suggested by this condition is to
combine cutting with partitioning the feasible domain by means of cone splitting.
This leads to conical algorithms, which will be discussed later in this section.

Fig. IX.7 (successive concavity cuts, 1st cut through 4th cut, and a level line of cx)

3.3. Methods Based on the Edge Property

Since, by Proposition IX.11, an optimal solution must exist on some edge of D in-
tersecting ∂G, and since the number of such edges is finite, one can hope to solve the
problem by a suitable edge search procedure. The first method along these lines is
due to Hillestad (1975). Another method, proposed by Hillestad and Jacobsen
(1980a), is based on a characterization of those edges which can contain an optimal
solution in terms of the best feasible vertex of the polytope D.

In addition to (a), (b'), (c), assume that f(x) is strictly concave and the problem
is regular (these additional assumptions are innocuous, by virtue of Proposition
IX.13).

Typically, a method based on the edge property alternates between steps of two
kinds: "forward" and "backward".

First, starting with a vertex s^0 of D such that f(s^0) < α, we try to decrease the ob-
jective function value cx while moving forward to the surface f(x) = α. To do this,
it suffices to apply the simplex procedure to the linear program

    min {cx: x ∈ D, cx ≤ cs^0} .

In view of assumption (c), s^0 cannot be a minimizer of cx over D, so at least one
neighbouring vertex u of s^0 satisfies cu < cs^0. If f(u) < α, we perform a simplex
pivot to move from s^0 to u. This pivoting process is continued until we find a pair of
vertices u,v of D such that f(u) < α, f(v) ≥ α (this must occur, again because of as-
sumption (c)). Then we move along the edge [u,v] of D to the point x^0 where this
edge meets the surface f(x) = α (due to the strict concavity of f(x), x^0 is uniquely
determined). At this stage x^0 is the best feasible point obtained thus far, so for fur-
ther investigation we need only consider the set D(x^0) defined in (29).

Since we are now stopped by the "wall" f(x) = α, we try to move backward to the
region f(x) < α, while keeping the objective function value at the lowest level al-
ready attained. This can be done by finding a vertex s^1 of D(x^0) such that f(s^1) < α
which is as far as possible from the surface f(x) = α (intuitively, the further we can
move backward, the more we will gain in the next forward step).

If such a point s^1 can be found, then another forward step can be performed from s^1,
and the whole process can be repeated with s^1 and D(x^0) replacing s^0 and D.

On the other hand, if such an s^1 does not exist, this means that D(x^0) \ G is empty.
By Theorem IX.4 and the regularity of the problem, this implies that x^0 is a global
optimal solution of (LRCP).


The most difficult part of this forward-backward scheme of course lies in the
backward steps: given the polyhedron D(x^0), how do we find a vertex s^1 such that
f(s^1) < α?

Hillestad and Jacobsen (1980) suggested a combinatorial procedure for the backward
step which involves enumerating the vertices of certain polytopes. Unfortunately, in
certain cases this procedure may require us to solve by a rather expensive method
(vertex enumeration) a subproblem which is almost as difficult as the original one
(finding a feasible point s^1 better than the current best feasible solution x^0).

However, a more systematic way to check whether D(x^0) has a vertex s such that
f(s) < α, and to find such a vertex if one exists, is to solve the concave program

    min {f(x): x ∈ D(x^0)} .

Therefore, Thuong and Tuy (1985) proposed that one solve this concave program
in the backward step. With this approach, the algorithm can be summarized as fol-
lows:

Algorithm IX.8.

Initialization:

If a vertex s^0 of D is available such that f(s^0) < α, set D_0 = D, k = 0 and go to 1).
Otherwise, apply any finite algorithm to the concave program min {f(x): x ∈ D}
until a vertex s^0 of D is found such that f(s^0) ≤ α (if such a vertex cannot be found,
the problem has no feasible solution). Set D_0 = D, k = 0. Go to 1) if f(s^0) < α, go to
2) (with x^0 = s^0) if f(s^0) = α.

Iteration k = 1,2,...:

1) Starting from s^{k-1}, pivot by means of the simplex algorithm for solving the
linear program

    min {cx: x ∈ D_{k-1}}     (31)

until a pair of vertices u, v of D_{k-1} is found so that f(u) < α, f(v) ≥ α, and
cv < cu ≤ cs^{k-1}. Let x^k be the (unique) point of the line segment [u,v] such that
f(x^k) = α. Go to 2).

2) Form D_k = {x ∈ D: cx ≤ cx^k} and solve the concave program

    min {f(x): x ∈ D_k} ,     (32)

obtaining an optimal vertex solution s^k.

a) If f(s^k) = α, terminate.

b) Otherwise, f(s^k) < α; set k ← k+1 and go to 1).
Theorem IX.6. Assume that (a), (b'), (c) hold, and that moreover, f(x) is strictly
concave and the LRCP problem is regular. If the problem has a feasible solution, then
the above algorithm terminates at Step 2a) after finitely many iterations, yielding a
global optimal solution.

Proof. If the algorithm terminates at Step 2a), then min {f(x): x ∈ D_k} = α, and
hence, D(x^k) \ G = D_k \ G = ∅. Therefore, by Theorem IX.4 and the regularity as-
sumption, x^k is a global optimal solution. Now, by construction, [u^k,v^k] is an edge
of D_{k-1} = D(x^{k-1}). Since cv^k < cu^k ≤ cs^{k-1} ≤ cx^{k-1}, [u^k,v^k] cannot be
contained in the face {x: cx = cx^{k-1}} of D_{k-1}. Hence, [u^k,v^k] is contained in an
edge of D.

Let M denote the set of all x ∈ D such that f(x) = α and x is contained in some
edge of D. M is finite, since the number of edges of D is finite, and the strictly con-
cave function f(x) can assume the value α on each edge of D at most at two distinct
points. Finiteness of the algorithm then follows from finiteness of M and the fact
that each iteration generates a point x^k ∈ M satisfying cx^k < cx^{k-1}.  •

The algorithm involves an alternating sequence of linear programming steps (31)
and concave programming steps (32).
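A compact sketch of this alternation (our own illustration) is given below; D_adj
(vertex adjacency in D), concave_min (a finite concave minimization routine for
(32)) and edge_root (the univariate search for the point with f = α on an edge) are
hypothetical helpers, not from the original text:

    import numpy as np

    def lrcp_edge_search(D_adj, c, f, alpha, concave_min, edge_root, s0):
        # Schematic of Algorithm IX.8 (strict concavity, regularity assumed).
        s = np.asarray(s0)                 # vertex of D with f(s) < alpha
        while True:
            while True:                    # forward step: pivot toward cheaper
                nbrs = [v for v in D_adj(s) if c @ v < c @ s]
                u = min(nbrs, key=lambda v: c @ v)   # an improving neighbour
                if f(u) >= alpha:
                    break                  # edge [s,u] crosses {f = alpha}
                s = u
            # NB: after the first cycle the pivots should stay within
            # D(x) = {x in D: cx <= gamma}; omitted here for brevity.
            x = edge_root(s, u)            # unique point on [s,u] with f = alpha
            s = concave_min(c @ x)         # backward step: solve (32) on D(x)
            if f(s) >= alpha - 1e-9:
                return x                   # D(x) \ G empty: x globally optimal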

An important feature of the concave programming subproblems

    min {f(x): x ∈ D, cx ≤ cx^k}

is that the subproblem in each iteration differs from the one in the previous iteration
only in the right hand side of the constraint cx ≤ cx^k. To increase efficiency, the so-
lution method chosen for the subproblems should take advantage of this structure.
For example, if D is bounded and the outer approximation algorithm (Algorithm
VI.1) is used for the concave programs (32), then the algorithm could proceed as fol-
lows:

Algorithm IX.8*.

Initialization:

Construct a polytope D_0 ⊃ D with a known (and small) vertex set V_0. Let s^0 be a
vertex of D_0 such that f(s^0) < α. Set k = 0.

Iteration k = 1,2,...:

1) Starting from s^{k-1}, pivot by means of the simplex algorithm for solving the
linear program min {cx: x ∈ D_{k-1}} until a pair of vertices u,v of D_{k-1} is found
so that f(u) < α, f(v) ≥ α, and cv < cu ≤ cs^{k-1}. Let x^k be the intersection of
[u,v] with the surface f(x) = α. Go to 2).

2) If x^k ∈ D, set D_k = D_{k-1} ∩ {x: cx ≤ cx^k}. Otherwise, set D_k = D_{k-1} ∩ {x:
ℓ_k(x) ≤ 0}, where ℓ_k(x) ≤ 0 is the constraint of D that is most violated by x^k.
Compute the vertex set V_k of D_k (from knowledge of V_{k-1}).
Let s^k ∈ argmin {f(x): x ∈ V_k}.

a) If f(s^k) = α, then terminate: x̄^k ∈ argmin {cx^i: x^i ∈ D, i=0,1,...,k} is a global
optimal solution of (LRCP).

b) If f(s^k) < α, set k ← k+1 and return to 1).

c) If f(s^k) > α, terminate: (LRCP) is infeasible.


Theorem IX.7. Under the same assumptions as in Theorem IX.6, Algorithm IX.8*
terminates at Step 2a) or 2c) after finitely many iterations.

Proof. Since the number of constraints of D is finite, either all of these con-
straints are generated (and from then on Theorem IX.6 applies), or else the algo-
rithm terminates before that. In the latter case, if 2a) occurs, since
{x ∈ D_k: cx ≤ cx̄^k} \ G = ∅, we have D(x̄^k) \ G = ∅. Hence, x̄^k is optimal for
(LRCP). Similarly, if 2c) occurs, then {x ∈ D_k: cx ≤ cx̄^k} \ int G = ∅. Hence
x^i ∉ D (i=0,1,...,k), and therefore D \ int G = ∅.

Example IX.3.

    Minimize −2x_1 + x_2
    s.t.  x_1 + x_2 ≤ 10 ,
          −x_1 + 2x_2 ≤ 8 ,
          −2x_1 − 3x_2 ≤ −6 ,
          x_1 − x_2 ≤ 4 ,
          x_1 ≥ 0 , x_2 ≥ 0 ,
          −x_1^2 + x_1 x_2 − x_2^2 + 6x_1 ≤ 0 .

Since the problem is only two-dimensional, it will suffice to use Algorithm IX.8.
However, we shall also show Algorithm IX.8* for comparison.

Applying Algorithm IX.8, we start with the vertex s^0 = (0,4).

Iteration 1.

Step 1 finds u^1 = (0,2), v^1 = (3,0) and x^1 = (0.4079356, 1.728049). Step 2 solves the
concave program min {f(x): x ∈ D(x^1)} and finds s^1 = (2,5).

Iteration 2.

Step 1 finds u^2 = (4,6), v^2 = (7,3) and x^2 = (4.3670068, 5.6329334). Since the op-
timal value of the concave program min {f(x): x ∈ D(x^2)} is 0, Step 2 concludes that
x^2 is an optimal solution of the problem (see Fig. IX.8).

Fig. IX.8

Applying Algorithm IX.8*, we start with the simplex D_0 = {x: x_1 ≥ 0, x_2 ≥ 0,
x_1 + x_2 ≤ 10} and its vertex s^0 = (0,10).

Iteration 1:

Step 1 finds x^1 = (4.3670068, 5.6329334). Since x^1 is feasible,
D_1 = D_0 ∩ {x: cx ≤ cx^1}. Step 2 finds the vertex s^1 = (10,0), which achieves the
minimum of f(x) over D_1.

Iteration 2:

Step 1 finds x^2 = (7.632993, 2.367007). Since this point is infeasible,
D_2 = D_1 ∩ {x: x_1 − x_2 ≤ 4}. Since the minimum of f(x) over D_2 is 0, Step 2 con-
cludes that the best feasible solution so far obtained, i.e., x^1, is optimal.

Fig. IX.9

Remark IX.3. The above method assumes that the function f(x) is strictly con-
cave and the LRCP problem is regular. If these assumptions are not readily verifi-
able, we apply Algorithm IX.8 or IX.8* to the perturbed problem

    min cx  s.t.  x ∈ D , f(x) − ε(‖x‖^2 + 1) ≤ α     (33)

which is regular and satisfies the strict concavity assumption. Clearly, if this
perturbed problem is infeasible, then the original problem itself is infeasible. Other-
wise, we obtain a global optimal solution x(ε) of the perturbed problem, and, by
Theorem IX.5, as ε ↓ 0 any accumulation point of the sequence {x(ε)} yields a global
optimal solution of (LRCP).

3.4. Conical Algorithms for (LRCP)

It is also natural to extend conical algorithms to (LRCP). The extension is based
on the fact that a basic subproblem of the LRCP problem is the following:

(*) Given a feasible point x^k, find a point y ∈ D(x^k) \ G, or else establish that
D(x^k) ⊂ G.

Indeed, whenever a point y ∈ D(x^k) \ G exists, then, using inexpensive local
methods, one can find a vertex s of D(x^k) such that f(s) ≤ f(y) < α. Therefore, the
backward step in Algorithm IX.8 is reduced to solving the subproblem (*).

Now recall that, because of the basic assumptions (a), (b'), (c), we can arrange it so
that the conditions 1) - 3) at the beginning of Section IX.3.2 hold (in particular,
0 ∈ D ∩ int G and D ⊂ ℝ^n_+). Then, setting D_k = D(x^k) in (*), we recognize the
(D_kG) problem studied in Sections VI.2.2 and VII.1.1. If we then use the (D_kG)
procedure (Section VII.1.2) in the backward step of each iteration, then we obtain an
algorithm which differs from Algorithm IX.8 only in the way the backward step is
carried out. However, just as with Algorithm IX.8, it is important to take advantage
of the fact that D_k differs from D_{k-1} only in the right hand side of the constraint
cx ≤ cx^{k-1}. This suggests integrating all of the (D_kG) procedures in successive
iterations k = 1,2,... into a unified conical algorithm, as follows.

Algorithm IX.9 (Normal Conical Algorithm for (LRCP))

Select an NCS rule for cone subdivision (see Sections VII.1.4 and VII.1.6). Set
γ_0 = +∞, k = 0.

0) Let D_k = D ∩ {x: cx ≤ γ_k}. Select a cone K_0 such that D_k ⊂ K_0 ⊂ ℝ^n_+ and
min {cx: x ∈ K_0} = 0 (e.g., K_0 = ℝ^n_+ if D_k = D). Compute the points z^{0i} ≠ 0
where the surface f(x) = α meets the i-th edge of K_0 (i=1,2,...,n). Set
Q_0 = (z^{01},z^{02},...,z^{0n}), ℳ = 𝒫 = {Q_0}.

1) For each Q ∈ 𝒫 with Q = (z^1,...,z^n), f(z^i) = α (i=1,2,...,n), solve the linear
program

    max {Σ_i λ_i: Σ_i λ_i Az^i ≤ b, Σ_i λ_i cz^i ≤ γ_k, λ_i ≥ 0 ∀i} =
    max {eQ^{-1}x: x ∈ D_k, Q^{-1}x ≥ 0} ,

obtaining the optimal value μ(Q) and a basic optimal solution ω(Q). If
f(ω(Q)) < α for some Q, then go to 5). Otherwise, go to 2).

2) In ℳ delete all Q ∈ 𝒫 such that μ(Q) ≤ 1. Let ℛ be the remaining collection of
matrices. If ℛ = ∅, then terminate. Otherwise go to 3).

3) Select Q* ∈ argmax {μ(Q): Q ∈ ℛ} and split it according to the NCS rule chosen.

4) Let 𝒫* be the partition of Q* obtained in this way.
Set 𝒫 ← 𝒫*, ℳ ← (ℛ \ {Q*}) ∪ 𝒫* and go to 1).

5) Compute a vertex s^k of D_k such that f(s^k) ≤ f(ω(Q)), and, starting from s^k,
perform a forward step (Step 1 in Algorithm IX.8) to obtain a point x^{k+1} which
lies on the intersection of the surface f(x) = α with an edge of D. Let
γ_{k+1} = cx^{k+1}. Set k ← k+1 and go to 0).

Theorem IX.8. Assume that the LRCP problem is regular. If Algorithm IX.9 is in-
finite, then some iteration k ≥ 1 is infinite and x^k is a global optimal solution. If the
algorithm terminates at iteration k with γ_k < +∞ (or equivalently, k ≥ 1), then x^k is
a global optimal solution; if it terminates at iteration k with γ_k = +∞ (or equivalently,
k = 0), then the problem is infeasible.

Proof. Before proving the theorem, observe that so long as Step 5) has not yet
occurred, we are executing the (D_kG) procedure described in Section VII.1.2. An
occurrence of Step 5) marks the end of an iteration k and the passage to iteration
k+1, with γ_{k+1} < γ_k. Bearing this in mind, suppose first that the algorithm is in-
finite. Since each x^k lies on some edge E of D and achieves the minimum of cx over
E \ int G, it follows that the number of all possible values of γ_k, and hence the num-
ber of iterations, is finite. That is, some iteration k must continue endlessly. Since
this iteration is exactly the (D_kG) procedure, by Proposition VII.2 the set D_k ∩ ∂G
is nonempty, while D_k ⊂ G. If k = 0, i.e., D_k = D, this would mean that the prob-
lem is feasible, but has no feasible point x such that f(x) < α, conflicting with the
regularity assumption. Hence, k ≥ 1, and then the fact that D_k ⊂ G together with the
regularity assumption imply that x^k is a global optimal solution. Now suppose that
the algorithm terminates at Step 2 of some iteration k. Then, since the current set
ℛ is empty, no cone of the current partition of K_0 contains points of D_k \ G.
Hence, if γ_k < +∞, then by Theorem IX.4, x^k is a global optimal solution. On the
other hand, if γ_k = +∞ (i.e., D_k = D), then D ⊂ G; hence, since the problem is regu-
lar, it must be infeasible.

As with Algorithms IX.8 and IX.8*, the regularity assumption here is not too restrictive. If this assumption cannot be readily checked, the problem can always be handled by replacing f(x) with f(x) − ε(‖x‖² + 1), where ε > 0 is sufficiently small. On the other hand, there exist variants of conical algorithms which do not require regularity of the problem (Muu (1985), Sen and Whiteson (1985)). However, these algorithms approach the global optimum from outside the feasible region and at any stage can generally guarantee only an infeasible solution sufficiently near to a global optimal solution.

Next we present an algorithm of this class which is an improved version of an earlier algorithm of Muu (1985). The main improvement consists in allowing any exhaustive subdivision process instead of a pure bisection process. This is possible due to the following lower bounding method.

Proposition IX.15. Let K = con(Q), Q = (z^1, ..., z^n), be a cone generated by n linearly independent vectors z^i ∈ ℝⁿ₊ such that f(z^i) = α. Then a lower bound for cx over the set K ∩ (D \ int G) is given by the optimal value β(Q) of the linear program

min cx  s.t.  x ∈ D, eQ^{-1}x ≥ 1, Q^{-1}x ≥ 0,    (34)

i.e.,

β(Q) = min {Σ_i λ_i cz^i : Σ_i λ_i Az^i ≤ b, Σ_i λ_i ≥ 1, λ_i ≥ 0 (i=1,...,n)}.    (35)

If cx > 0 ∀x ∈ K \ int G, then for an optimal solution ω(Q) of (34) either f(ω(Q)) = α (and in this case β(Q) equals the exact minimum of cx over the feasible portion contained in K), or else ω(Q) does not lie on any edge of K.

Proof. Because of the convexity of G, K ∩ (D \ int G) is contained in K ∩ D ∩ H, where H is the halfspace not containing 0 whose bounding hyperplane passes through z^1, z^2, ..., z^n. This proves the first assertion (the formulation (35) follows from the fact that any point x ∈ K ∩ H can be represented as x = Σλ_i z^i with Σλ_i ≥ 1, λ_i ≥ 0 (i=1,...,n)). The assumption cx > 0 ∀x ∈ K \ int G implies that c(λz^i) > cz^i ∀λ > 1, while the convexity of D implies that [0, ω(Q)] ⊂ D. Therefore, if an optimal solution ω(Q) of (34) lies on the i-th edge of K, i.e., if it satisfies ω(Q) = λz^i for some λ ≥ 1, then necessarily ω(Q) = z^i, and hence f(ω(Q)) = α. On the other hand, if f(ω(Q)) = α, then ω(Q) ∈ D \ int G, and hence ω(Q) achieves the minimum of cx over K ∩ (D \ int G) ⊂ K ∩ (D ∩ H).

The algorithm we are going to describe is a branch and bound procedure similar to Algorithm IX.8*, in which branching is performed by means of conical subdivision and lower bounding is based on Proposition IX.15.
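For D = {x: Ax ≤ b}, the bound (35) is again a small linear program in λ-space. A sketch (Python with scipy; all data names are illustrative assumptions):

import numpy as np
from scipy.optimize import linprog

def lower_bound(Q, A, b, c):
    # beta(Q) = min sum(lam_i c z^i)  s.t.  sum(lam_i A z^i) <= b,
    #           sum(lam_i) >= 1, lam >= 0   (the LP (35))
    n = Q.shape[1]
    A_ub = np.vstack([A @ Q, -np.ones((1, n))])   # -sum(lam) <= -1 encodes sum >= 1
    b_ub = np.append(b, -1.0)
    res = linprog(c @ Q, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * n)
    if not res.success:        # K ∩ D ∩ H empty: the cone can be fathomed
        return np.inf, None
    return res.fun, Q @ res.x  # beta(Q) and a basic optimal solution omega(Q)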

We start with a cone K_0 as in Algorithm IX.9. Then for any subcone K = con(Q) of this cone the condition cx > 0 ∀x ∈ K \ int G in Proposition IX.15 is satisfied. At any given stage, if con(Q_k) denotes the cone chosen for further partition, then ω(Q_k) must be infeasible (otherwise, the exact minimum of cx over the feasible portion contained in this cone would be known, and Q_k would have been fathomed); hence, by Proposition IX.15, ω(Q_k) does not lie on any edge of con(Q_k) and can be used for further subdivision of con(Q_k). We can thus state:

Algorithm IX.9*

Choose a rule for cone subdivision so as to generate an exhaustive process (cf. Section VII.1.4).

0) Construct a matrix Q_0 = (z^{01}, z^{02}, ..., z^{0n}) as in Step 0 of iteration k = 0 of Algorithm IX.9. Set ℳ_0 = {Q_0}, β(Q_0) = −∞, γ_0 = +∞ (or γ_0 = cx^0 if a feasible point x^0 is available). Set k = 0.

1) In ℳ_k delete all Q such that β(Q) ≥ γ_k. Let ℛ_k be the remaining collection of matrices. If ℛ_k = ∅, terminate: x^k is a global optimal solution of (LRCP) if γ_k < +∞; the problem is infeasible if γ_k = +∞. Otherwise, if ℛ_k is nonempty, go to 2).

2) Select Q_k ∈ argmin {β(Q): Q ∈ ℛ_k} and split it according to the subdivision process chosen.

3) Let 𝒫_k be the partition of Q_k so obtained. For each Q ∈ 𝒫_k solve (35) to obtain the optimal value β(Q) and a basic optimal solution ω(Q) of (34).

4) Update the incumbent: set x^{k+1} equal to the best among x^k, ω^k = ω(Q_k) (if these points exist) and all ω(Q), Q ∈ 𝒫_k, that are feasible. Set γ_{k+1} = cx^{k+1}. Set ℳ_{k+1} = (ℛ_k \ {Q_k}) ∪ 𝒫_k, k ← k+1 and go to 1).

Theorem IX.7*. If Algorithm IX.9* is infinite, it generates an infinite sequence ω^k = ω(Q_k), every accumulation point of which is a global optimal solution.

Proof. If the algorithm is infinite, it generates at least one infinite nested sequence of cones K_s = con(Q_s), s ∈ Λ ⊂ {0,1,...}, with Q_s = (z^{s1}, z^{s2}, ..., z^{sn}) such that f(z^{si}) = α (i=1,2,...,n). By virtue of the subdivision rule, such a sequence shrinks to a ray; consequently, z^{si} → x* (s → ∞, s ∈ Λ) for i=1,2,...,n.

Since ω^s belongs to the halfspace {x: eQ_s^{-1}x ≥ 1}, the halfline from 0 through ω^s meets the simplex [z^{s1},...,z^{sn}] at some point v^s, and it meets the surface f(x) = α at some point y^s. Clearly, v^s → x*, y^s → x*. But we have f(ω^s) > α, for otherwise, by Proposition IX.15, ω^s would be feasible and cω^s ≥ γ_s, i.e., β(Q_s) ≥ γ_s, conflicting with the fact that Q_s ∈ ℛ_s.

On the other hand, since f(ω^s) > α, by the concavity of f(x) it follows that ω^s belongs to the line segment [v^s, y^s]. Hence, ω^s → x*. Noting that ω^s ∈ D and f(z^{si}) = α ∀s, by letting s → ∞ we then deduce that x* ∈ D, f(x*) = α. Thus, x* is feasible, proving that the lower bounding used in the algorithm is strongly consistent in the sense of Definition IV.7. It then follows by Corollary IV.3 that lim β(Q_s) = lim β(Q_k) = min {cx: x ∈ D, f(x) ≤ α}, and, since cx* = lim β(Q_s), we conclude that x* is a global optimal solution.

(This can also be seen directly: for any feasible point x, if x belongs to one of the cones that have been deleted at some iteration h ≤ s, then cx ≥ γ_h ≥ γ_s > β(Q_s) = cω^s, and hence cx ≥ cx*; if x belongs to some con(Q), Q ∈ ℛ_s, then cx ≥ β(Q) ≥ β(Q_s) = cω^s, and hence cx ≥ cx*.)

Now let x̄ be an arbitrary accumulation point of the sequence {ω^k = ω(Q_k)}, e.g., x̄ = lim ω^h (h → ∞, h ∈ H). It is easy to see that there exists a nested sequence K_s = con(Q_s), s ∈ Λ, such that any K_s contains infinitely many K_h, h ∈ H. Indeed, at least one of the cones con(Q), Q ∈ ℛ_1, contains infinitely many K_h; such a cone must be split at some subsequent iteration; let this cone be K_{s_1} for some s_1 ≥ 1. Next, among the successors of K_{s_1} at least one contains infinitely many K_h; such a cone must, in turn, be split at some iteration s_2 > s_1; let this cone be K_{s_2}. Continuing in this way, we obtain an infinite nested sequence {K_s, s ∈ Λ} with the desired property, where Λ = {s_1, s_2, ...}. Since for every s there exists an h_s such that K_{h_s} is a descendant of K_s, and since the sequence {K_s} shrinks to a ray, it follows that ω(Q_{h_s}) → x*, where x* = lim ω(Q_s) (s → ∞, s ∈ Λ). That is, x̄ = x*, and hence, by the above, x̄ is a global optimal solution of (LRCP).


Remark IX.4. In the interest of efficiency of the procedure, one should choose a cone subdivision rule that involves mostly ω-subdivisions. It can be verified that the algorithm will still work if an arbitrary NCS rule is allowed.

Examples IX.4. Fig. IX.10 illustrates Algorithm IX.9 for a regular problem. In Step 1 a point ω^0 of D is found with f(ω^0) < α; hence the algorithm goes to 5). A forward step from s^0 then finds the optimal solution x*.

Fig. IX.11 illustrates Algorithm IX.9* for a nonregular problem. The algorithm generates a sequence of infeasible points ω^1, ω^2, ... approaching the optimal solution x*, which is an isolated point of the feasible region.

Fig. IX.10 (the surface f(x) = α is shown)

Fig. IX.11 (the surface f(x) = α is shown)
PART C

GENERAL NONLINEAR PROBLEMS

Part C is devoted to the study of methods of solution for quite general global optimization problems. Several outer approximation algorithms, branch and bound procedures and combinations thereof are developed for solving d.c. programming and Lipschitzian optimization problems, and problems with concave minorants. The "relief indicator method" may serve as a conceptual tool for even more general global problems. The applications that we discuss include design centering problems, biconvex programming, optimization problems with indefinite quadratic constraints and systems of equations and/or inequalities.
CHAPTER X

D.C. PROGRAMMING

In Chapter X, we continue the discussion of d.c. programming problems. First, a duality theory is developed between the objective and the constraints of a very general class of optimization problems. This theory allows one to derive several outer approximation methods for solving canonical d.c. problems and even certain d.c. problems that involve functions whose d.c. representations are not known. Then we present branch and bound methods for the general d.c. program and a combination of outer approximations and branch and bound. Finally, the design centering problem and biconvex programming are discussed in some detail.

1. OUTER APPROXIMATION METHODS FOR SOLVING THE CANONICAL

D.C. PROGRAMMING PROBLEM

Recall from Chapter I that a d.c. programming problem is a global optimization problem of the form

minimize f(x)    (1)
s.t. x ∈ C, g_i(x) ≤ 0 (i=1,...,m),

where C ⊂ ℝⁿ is convex and all of the functions f, g_i are d.c. Suppose that C is defined by a finite system of convex inequalities h_k(x) ≤ 0, k ∈ I ⊂ ℕ. In Theorem I.9, it is shown that, by introducing at most two additional variables, every d.c. programming problem can be transformed into an equivalent canonical d.c. programming problem

(CDC) minimize cx    (2)
s.t. h(x) ≤ 0, g(x) ≥ 0,

where c ∈ ℝⁿ (cx denotes the inner product), and where h and g: ℝⁿ → ℝ are real-valued convex functions on ℝⁿ.

In this section, an outer approximation method is presented for solving (CDC) that is based on Tuy (1987), Tuy (1994) and Tuy and Thuong (1988). Moreover, it will be shown that the method can easily be extended to solve problems where in (2) the objective function is convex (cf. Tuy (1987 and 1995)).

1.1. Duality between the Objective and the Constraints

A general and simple duality principle allows one to derive an optimality condition for problem (CDC). We present a modification of the development given in Tuy (1987) and Tuy and Thuong (1988), cf. also Tichonov (1980), where a related "reciprocity principle" is discussed. Let D be an arbitrary subset of ℝⁿ, and let f: ℝⁿ → ℝ, g: ℝⁿ → ℝ, α, β ∈ ℝ. Consider the following pair of global optimization problems, in which the objective function of one is the constraint of the other, and vice versa:

(P_β)  inf {f(x): x ∈ D, g(x) ≥ β},    (3)

(Q_α)  sup {g(x): x ∈ D, f(x) ≤ α}.    (4)

Denote by inf P_β and sup Q_α the optimal values of (P_β) and (Q_α), respectively.
the optimal valuesa

Definition X.1. Problem (P_β) is said to be stable if

lim_{β' → β+0} inf P_{β'} = inf P_β.    (5)

Similarly, problem (Q_α) is stable if

lim_{α' → α−0} sup Q_{α'} = sup Q_α.    (6)

Lemma X.1. (i) If (Q_α) is stable, then α ≤ inf P_β implies β ≥ sup Q_α.

(ii) If (P_β) is stable, then β ≥ sup Q_α implies α ≤ inf P_β.

Proof. (i) Assume that (Q_α) is stable and α ≤ inf P_β. Then for all α' < α the set {x ∈ D: g(x) ≥ β, f(x) ≤ α'} is empty. Hence,

sup Q_{α'} := sup {g(x): x ∈ D, f(x) ≤ α'} ≤ β  ∀α' < α,

and, letting α' → α − 0, we see from (6) that β ≥ sup Q_α.

(ii) Similarly, if (P_β) is stable and β ≥ sup Q_α, then for all β' > β the set {x ∈ D: g(x) ≥ β', f(x) ≤ α} is empty. Hence,

inf P_{β'} := inf {f(x): x ∈ D, g(x) ≥ β'} ≥ α  ∀β' > β,

and, letting β' → β + 0, we see from (5) that inf P_β ≥ α. ∎

Corollary X.1. If both (P_β) and (Q_α) are stable, then α ≤ inf P_β if and only if β ≥ sup Q_α.
Proposition X.1. (i) If (Q_α) is stable and α = min P_β, then β = sup Q_α.

(ii) If (P_β) is stable and β = max Q_α, then α = inf P_β.



Proof. (i) Since α = min P_β, there must exist an x̄ ∈ D satisfying f(x̄) = α, g(x̄) ≥ β. It follows that β ≤ sup Q_α. But, by Lemma X.1, we know that β ≥ sup Q_α.

(ii) Similarly, since β = max Q_α, we see that there exists ȳ ∈ D satisfying g(ȳ) = β, f(ȳ) ≤ α. Hence, inf P_β ≤ α, and Lemma X.1 shows that we must have inf P_β = α. ∎

In order to apply the above results, it is important to have some criteria for
checking stability of a given problem.

Lemma X.2. If inf P_β < +∞, f is upper semicontinuous (u.s.c.) and β is not a local maximum of g over D, then (P_β) is stable. Similarly, if sup Q_α > −∞, g is lower semicontinuous (l.s.c.) and α is not a local minimum of f over D, then (Q_α) is stable.

Proof. We prove only the first assertion, since the second can be established in a similar way.

Suppose that inf P_β < +∞, f is u.s.c. and β is not a local maximum of g with respect to D. Then there is a sequence {x^k} ⊂ D such that

g(x^k) ≥ β, f(x^k) ≤ c_k, c_k ↓ inf P_β,

where {c_k} is any sequence of real numbers having the above property. If for some k ∈ ℕ we have g(x^k) > β, then, obviously, for all β' > β sufficiently close to β we also have g(x^k) ≥ β', and hence

inf P_{β'} ≤ f(x^k) ≤ c_k.

It follows that

lim_{β'→β+0} inf P_{β'} ≤ c_k.

Therefore, since c_k ↓ inf P_β, we see that, if g(x^k) > β holds for infinitely many k, then

lim_{β'→β+0} inf P_{β'} ≤ inf P_β.    (7)

Since obviously inf P_{β'} ≥ inf P_β for all β' ≥ β, we have equality in (7).

On the other hand, if g(x^k) = β for all but finitely many k, then, since β is not a local maximum of g over D, it follows that for every k with g(x^k) = β there is a sequence x^{k,ν} → x^k such that x^{k,ν} ∈ D, g(x^{k,ν}) > β. Then for all β' sufficiently close to β we have g(x^{k,ν}) ≥ β', and hence

inf P_{β'} ≤ f(x^{k,ν}),

and

lim_{β'→β+0} inf P_{β'} ≤ f(x^{k,ν}).

Letting ν → ∞, we see from the upper semicontinuity of f that

lim_{β'→β+0} inf P_{β'} ≤ f(x^k) ≤ c_k.

Finally, letting k → ∞, as above we see that

lim_{β'→β+0} inf P_{β'} = inf P_β. ∎
The following concept of regularity provides further insight into the notion of stability.

Definition X.2. A feasible point x of (P_β) is said to be regular for (P_β) if every neighbourhood of x contains a point x' ∈ D satisfying g(x') > β (i.e., there exists a sequence x^ν → x such that x^ν ∈ D, g(x^ν) > β).

Clearly, if the function g is l.s.c. (so that {x: g(x) > β} is open), then every nonisolated point x ∈ D satisfying g(x) > β is regular for (P_β). Saying that β is not a local maximum of g over D (cf. Lemma X.2) is equivalent to saying that every point x ∈ D satisfying g(x) = β is regular.

The notion of a regular point for (Q_α) is defined in a similar way, replacing the condition g(x) > β by f(x) < α.


Note that the notion of a regular linear program with an additional reverse con-
vex constraint in Definition IX.1 is related to Definition X.2 in an obvious way.

Proposition X.2. If f is u.s.c. and if there exists at least one optimal solution of (P_β) that is regular for (P_β), then (P_β) is stable. Similarly, if g is l.s.c. and there exists at least one optimal solution of (Q_α) that is regular for (Q_α), then (Q_α) is stable.

Proof. Let x̄ be an optimal solution of (P_β) that is regular. Then there exists a sequence {x^k} ⊂ D satisfying g(x^k) > β, x^k → x̄. For any fixed k we have g(x^k) ≥ β' for all β' sufficiently close to β; hence inf P_{β'} ≤ f(x^k). This implies that

lim_{β'→β+0} inf P_{β'} ≤ f(x^k).

But from the upper semicontinuity of f we know that

lim sup_{k→∞} f(x^k) ≤ f(x̄) = inf P_β.

Therefore,

lim_{β'→β+0} inf P_{β'} ≤ inf P_β.

Since the reverse inequality is obvious, the first assertion in Proposition X.2 is proved.

The second assertion can be proved in an analogous way.


Now let us return to the canonical d.c. problem (CDC). Denote

G := {x: g(x) ≥ 0}, H := {x: h(x) ≤ 0},

and recall that g and h are convex functions. In (3) let D = H, f(x) = cx and β = 0. Then problems (P_0) and (CDC) coincide. Assume that H ∩ G ≠ ∅.

Corollary X.2. (i) Let H be bounded, and suppose that g(x) ≠ 0 at every extreme point x of H. Then problem (CDC) is stable.

(ii) If at least one optimal solution of (CDC) is regular, then problem (CDC) is stable.

Proof. (i): A local maximum of the convex function g with respect to H is always attained at an extreme point of H. Therefore, 0 cannot be a local maximum of g over H, and hence (CDC) is stable, by Lemma X.2.

(ii) Assertion (ii) follows from Proposition X.2.



As in Section I.3.4 assume that H is bounded, H ∩ G ≠ ∅, and the reverse convex constraint g(x) ≥ 0 is essential, i.e., we have

min {cx: x ∈ H} < min {cx: x ∈ H ∩ G}    (8)

(cf. Definition I.6). Then the following optimality criterion can be derived from the above considerations. Recall from Theorem I.10 that an optimal solution is attained on ∂H ∩ ∂G. The following result generalizes Theorem IX.4.

Theorem X.1. In problem (CDC) let H be bounded, H ∩ G ≠ ∅ and g(x) ≥ 0 be essential. For a point x̄ ∈ ∂G to be an optimal solution to problem (CDC) it is necessary that

max {g(x): x ∈ H, cx ≤ cx̄} = 0.    (9)

This condition is also sufficient if problem (CDC) is stable.



Proof. Let f(x) = cx, D = H, β = 0, ᾱ = min {cx: x ∈ H ∩ G}, and consider the problems (Q_ᾱ) and (P_0) = (CDC). The convex function g is continuous, and it follows from (8) that ᾱ is not a local minimum of f(x) = cx over H. Hence from the second part of Lemma X.2 we see that (Q_ᾱ) is stable. Therefore, if x̄ is optimal, i.e., cx̄ = ᾱ, then, by Lemma X.1(i),

0 ≥ max {g(x): x ∈ H, cx ≤ cx̄}.    (10)

But since g(x̄) = 0, x̄ ∈ H, it follows that in (10) we have equality, i.e., (9) holds.

Conversely, if (CDC) is stable and (9) holds for x̄ ∈ ∂G, then, by Proposition X.1(ii), we have

ᾱ = cx̄ = inf {cx: x ∈ H ∩ G} = min {cx: x ∈ H ∩ G},

since (CDC) has an optimal solution.



An alternative short proof of Theorem X.1 which does not use the above duality can be derived from the fact that int(H ∩ G) ⊂ {x: g(x) > 0} (cf. Horst and Thoai (1994) and Horst, Pardalos and Thoai (1995)).

1.2. Outer Approximation Algorithms for Canonical D.C. Problems

As above, consider the problem (CDC)

minimize cx    (11)
s.t. x ∈ H ∩ G,

where c ∈ ℝⁿ, H := {x: h(x) ≤ 0}, G := {x: g(x) ≥ 0} with h, g: ℝⁿ → ℝ convex. Assume that

(a) int H = {x: h(x) < 0} ≠ ∅,

(b) H is bounded,

(c) the reverse convex constraint is essential.

A straightforward application of the development presented in Section X.1.1 leads to a sequence of concave minimization problems. In addition to (a), (b) and (c), assume that H ∩ G ≠ ∅, so that an optimal solution to (11) exists. Let w ∈ int H be a point satisfying

g(w) < 0,    (12)

cw < min {cx: x ∈ H ∩ G}.    (13)

Note that w is readily available by assumptions (a), (b), (c) (cf. Section I.3.4). For example, solve the convex minimization problem

minimize cx  s.t. x ∈ H,

obtaining an optimal solution w̄. If w̄ ∈ H ∩ G, then assumption (c) above does not hold, and w̄ is an optimal solution to (11). Hence, by assumption (c), we have g(w̄) < 0, cw̄ < min {cx: x ∈ H ∩ G}, where w̄ is a boundary point of H. Then w ∈ int H can be found by a small perturbation of w̄.

For every x ∈ G, let π(x) denote the point where the line segment [w,x] intersects the boundary ∂G of G. Since g is convex, and since g(w) < 0 while g(x) ≥ 0, it is clear that

π(x) = tx + (1−t)w,

where t ∈ (0,1] is uniquely determined by a univariate convex minimization problem (cf. Chapter II).



Note that for x ∈ H ∩ G it follows from (12) and (13) that cπ(x) = tcx + (1−t)cw < cx.
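Since t ↦ g(tx + (1−t)w) is convex, negative at t = 0 and nonnegative at t = 1, the value of t, and hence π(x), can be computed by a simple bisection. A minimal sketch (Python; x and w are numpy arrays, names illustrative):

def pi(x, w, g, tol=1e-10):
    # returns the point of [w, x] on the boundary {g = 0};
    # requires g(w) < 0 <= g(x) and g convex (unique sign change in t)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid * x + (1.0 - mid) * w) < 0:
            lo = mid
        else:
            hi = mid
    return hi * x + (1.0 - hi) * w

Any univariate root finder could replace the bisection here.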

Algorithm X.1.

Initialization:
Determine a point x^1 ∈ H ∩ ∂G. Set k ← 1.

Iteration k=1,2,...:

Solve the subproblem

(Q(x^k))  maximize g(x)  s.t. x ∈ H, cx ≤ cx^k,    (14)

obtaining an optimal solution z^k of (Q(x^k)). If g(z^k) = 0, then stop.

Otherwise, set x^{k+1} = π(z^k), and go to iteration k+1.

Remarks X.1. (i) A point x^1 ∈ H ∩ ∂G can be determined by running an algorithm to solve the convex maximization problem max {g(x): x ∈ H} until a point y^1 ∈ H has been found satisfying g(y^1) ≥ 0. Set x^1 = π(y^1).

(ii) Also note that x^k satisfies g(x^k) = 0 for all k, so that g(z^k) ≥ 0.

Proposition X.3. Assume that problem (CDC) is stable. Then the following assertions hold.

(i) If Algorithm X.1 stops at z^k, then x^k is an optimal solution of (CDC).

(ii) If Algorithm X.1 is infinite, then it generates a sequence {x^k} ⊂ H ∩ ∂G, every accumulation point of which is an optimal solution of (CDC).

Proof. If the algorithm stops at z^k, then we have g(z^k) = max {g(x): x ∈ H, cx ≤ cx^k} = 0, and x^k satisfies the necessary and sufficient optimality condition (9) in Theorem X.1.

Suppose that Algorithm X.1 is infinite. Then, since H ∩ ∂G is compact, the sequence {x^k} ⊂ H ∩ ∂G has accumulation points in H ∩ ∂G. Let {x^{k_q}} be a subsequence of {x^k} such that

x̄ = lim_{q→∞} x^{k_q}.

Since the sequences {x^k} and {z^k} are bounded, we may, by considering a subsequence if necessary, assume that x^{k_q+1} → x̂, z^{k_q} → z̄. Clearly, since cπ(x) < cx for all x ∈ H ∩ G, it follows that cx^{k+1} < cx^k for all k. Moreover, for x ∈ H satisfying cx ≤ cx̄, we have cx ≤ cx^k, k=1,2,.... But, by the definition of z^{k_q} in Algorithm X.1, it follows that g(x) ≤ g(z^{k_q}); and hence, letting q → ∞, g(x) ≤ g(z̄). Since z̄ ∈ H and cz̄ ≤ cx̄, we thus see that z̄ is an optimal solution of the subproblem (Q(x̄)).

Now suppose that g(z̄) > 0. Since x^{k_q+1} = π(z^{k_q}), it is easily seen (from the definition of π(x)) that x̂ = π(z̄). Thus,

x̂ = tz̄ + (1 − t)w, 0 < t < 1;

hence, using (13),

cx̂ = tcz̄ + (1−t)cw < tcz̄ + (1−t)cz̄ = cz̄.    (15)

But

cx^{k_{q+1}} ≤ cx^{k_q+1} < cz^{k_q} ≤ cx^{k_q},    (16)

where the strict inequality holds by an argument similar to that used to derive (15). If we let q → ∞, then (16) yields

cx̄ = cx̂ = cz̄,

contradicting (15). Therefore, g(z̄) = 0; hence (9) is satisfied, and Proposition X.3 is established by virtue of Theorem X.1. ∎


Remark X.2. Theorem X.1 and Proposition X.3 remain true if we replace the linear function cx by a convex function f(x). Hence, Algorithm X.1 can be used to minimize a convex function f(x) subject to convex and reverse convex constraints (cf. Tuy (1987)).

Note that each subproblem (Q(x^k)) is a difficult global optimization problem that cannot be assumed to be solved in a finite number of iterations, since its feasible set is not polyhedral.

Therefore, on the basis of the above conceptual scheme, in Tuy (1987) and Tuy (1994) the following algorithm was proposed that can be interpreted as an outer approximation method for solving

max {g(x): x ∈ H, cx ≤ α},

where α = min {cx: x ∈ H ∩ G} is the unknown optimal value of (CDC).

Denote by V(D_k) the vertex set of a polytope D_k, and let ∂f(x) denote the set of subgradients of a convex function f at x.

Algorithm X.2.

Initialization:
Set α_1 = cx^1, where x^1 ∈ H ∩ ∂G is the best feasible solution available (if no feasible solution is known, set x^1 = ∅, α_1 = +∞). Generate a polytope D_1 containing the compact convex set {x ∈ H: cx ≤ α_1}. Let k ← 1.

Iteration k = 1,2,...:
Solve the subproblem

(Q_k)  maximize g(x)  s.t. x ∈ D_k.

Let z^k be an optimal solution of (Q_k). If g(z^k) = 0, then stop.

Otherwise, determine the point y^k where the line segment [w, z^k] intersects the surface

max {cx − α_k, g(x)} = 0.    (17)

(a) If y^k ∈ H (i.e., h(y^k) ≤ 0), then set

l_k(x) = c(x − y^k).    (18)

(b) If y^k ∉ H (i.e., h(y^k) > 0), then choose p^k ∈ ∂h(y^k) and set

l_k(x) = p^k(x − y^k) + h(y^k).    (19)

Let

D_{k+1} = D_k ∩ {x: l_k(x) ≤ 0}.    (20)

Set

x^{k+1} = y^k, α_{k+1} = cy^k, if y^k ∈ H and g(y^k) = 0,

x^{k+1} = x^k, α_{k+1} = α_k, otherwise.

Go to iteration k+1.
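A single iteration thus amounts to a line search along [w, z^k] followed by the construction of one affine cut. The sketch below (Python; the functions g, h, subgrad_h and all data are illustrative assumptions; vectors are numpy arrays) returns y^k together with the coefficients of l_k written as l_k(x) = p·x + q:

import numpy as np

def next_cut(z, w, c, alpha, g, h, subgrad_h, tol=1e-10):
    # find y on [w, z] with max{c y - alpha, g(y)} = 0 by bisection on the
    # convex function t -> max{...}, which is < 0 at w and > 0 at z
    phi = lambda t: max(c @ ((1 - t) * w + t * z) - alpha,
                        g((1 - t) * w + t * z))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) < 0 else (lo, mid)
    y = (1 - hi) * w + hi * z
    if h(y) <= 0:                     # case (a): objective cut (18), c(x - y) <= 0
        return y, c, -(c @ y)
    p = subgrad_h(y)                  # case (b): feasibility cut (19)
    return y, p, -(p @ y) + h(y)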

We establish convergence of this algorithm under the assumptions (a), (b), (c) stated at the beginning of Section X.1.2.

Observe that for each k=1,2,..., x^k is the best feasible point obtained until step k, while α_k = cx^k is the corresponding objective function value, α_k ≥ α_{k+1}.

Lemma X.3. For every k, we have

{x ∈ H: cx ≤ α_k} ⊂ D_k.

Proof. Let x ∈ H, cx ≤ α_k. Since α_k ≤ α_1 and {y ∈ H: cy ≤ α_1} ⊂ D_1, it follows that x ∈ D_1. Furthermore, if i < k and y^i ∈ H, then l_i(x) = c(x − y^i) = cx − cy^i ≤ cx − α_k ≤ 0. If y^i ∉ H, then p^i ∈ ∂h(y^i), and, by the definition of a subgradient, we have l_i(x) = p^i(x − y^i) + h(y^i) ≤ h(x) ≤ 0. Therefore, x ∈ D_k. ∎

Lemma X.4. (i) If g(z^k) = 0, then

0 = max {g(x): x ∈ H, cx ≤ α_k}.    (21)

(ii) If the sequence {z^k} has an accumulation point z̄ satisfying g(z̄) = 0, then

0 = max {g(x): x ∈ H, cx ≤ ᾱ},    (22)

where ᾱ = inf {α_k: k = 1,2,...}.

Proof. (i) From α_k ≥ α := min {cx: x ∈ H, g(x) ≥ 0} and Lemma X.3 we see that

{x ∈ H: cx ≤ α} ⊂ {x ∈ H: cx ≤ α_k} ⊂ D_k.    (23)

By Theorem X.1, we have

0 = max {g(x): x ∈ H, cx ≤ α},    (24)

and, by hypothesis,

g(z^k) = max {g(x): x ∈ D_k} = 0,

and (21) follows in view of (23).

(ii) To prove (22), observe that

g(z^k) = max {g(x): x ∈ D_k} ≥ sup {g(x): x ∈ ∩_{j=1}^∞ D_j}  ∀k.

Hence, using the continuity of g, we have

g(z̄) ≥ sup {g(x): x ∈ ∩_{k=1}^∞ D_k}.    (25)

But (23) implies that

{x ∈ H: cx ≤ α} ⊂ {x ∈ H: cx ≤ ᾱ} ⊂ ∩_{k=1}^∞ D_k.    (26)

Since g(z̄) = 0, the relations (24)–(26) together imply (22).

Lemma X.5. Assume that problem (CDC) is stable, and let ᾱ ∈ ℝ satisfy

0 = max {g(x): x ∈ H, cx ≤ ᾱ}.    (27)

Then

ᾱ ≤ α := min {cx: x ∈ H, g(x) ≥ 0}.

Proof. From (27) we see that for any number α' < ᾱ we have

0 ≥ max {g(x): x ∈ H, cx ≤ α'}.

Since (CDC) is stable, it follows by Lemma X.1(ii) that α' ≤ α, and hence ᾱ ≤ α.

Theorem X.2. Assume that the conditions (a), (b), (c) are fulfilled and that problem (CDC) is stable. If Algorithm X.2 terminates at iteration k, then x^k is an optimal solution for problem (CDC) (if α_k < +∞), or (CDC) is infeasible (if α_k = +∞). If the algorithm is infinite, then every accumulation point x̄ of the sequence {x^k} is an optimal solution for problem (CDC).

Proof. If g(z^k) = 0, then we see from Lemma X.4 (i) that

0 = max {g(x): x ∈ H, cx ≤ cx^k},

and, by Theorem X.1, x^k is an optimal solution for problem (CDC).



In order to prove the second part of Theorem X.2, we shall show that any accumulation point z̄ of the sequence {z^k} satisfies g(z̄) = 0. Then, in view of Lemma X.4 (ii), we have

0 = max {g(x): x ∈ H, cx ≤ cx̄},

and the equality cx̄ = α = min {cx: x ∈ H, g(x) ≥ 0} follows from Lemma X.5, since x̄ is feasible for (CDC).

The assertion g(z̄) = 0 will be established by checking that all of the conditions of Theorem II.1 are fulfilled for the sequence {x^k = z^k} and the set D = {x: g(x) ≥ 0, cx ≤ α} (α := min {cx: x ∈ H, g(x) ≥ 0}).

Condition (3) in Section II.1, l_k(x) ≤ 0 ∀x ∈ D, obviously holds. To verify condition (4), l_k(z^k) > 0, observe that

z^k = y^k + λ_k(y^k − w), λ_k > 0.    (28)

If l_k is of the form (18), then from (28) and the inequality cw < min {cx: x ∈ H ∩ G} we deduce that

l_k(z^k) = c(z^k − y^k) = λ_k c(y^k − w) > 0.

If l_k is of the form (19), then observe that from the definition of a subgradient and the inequality h(w) < 0 we have

l_k(w) = p^k(w − y^k) + h(y^k) ≤ h(w) < 0.    (29)

Using (28), (29) and l_k(y^k) = h(y^k) > 0, we see that

l_k(z^k) = l_k(y^k) + λ_k(l_k(y^k) − l_k(w)) > 0.

Now let {z^q} be a subsequence of {z^k} converging to a point z̄. Since cz^q ≤ α_q ∀q, it follows that cz̄ ≤ ᾱ. Therefore, to prove that the conditions of Theorem II.1 are fulfilled it remains to show that there exists a subsequence {z^r} ⊂ {z^q} such that

lim l_r(z^r) = lim l_r(z̄), and, moreover, that lim l_r(z̄) = 0 implies g(z̄) ≥ 0. Without loss of generality we may assume that one of the following cases occurs.

(a) Suppose that y^q ∈ H for all q. Then l_q has the form (18); and, since z^q, y^q ∈ D_1 and D_1 is compact, there is a subsequence {(y^r, λ_r)} such that y^r → ȳ, λ_r → λ̄. Therefore, by (18) and (28), with l(x) = c(x − ȳ) we have

z̄ = ȳ + λ̄(ȳ − w), λ̄ ≥ 0,    (30)

and

lim_{r→∞} l_r(z^r) = lim_{r→∞} c(z^r − y^r) = c(z̄ − ȳ) = l(z̄).

Now let

l(z̄) = c(z̄ − ȳ) = λ̄c(ȳ − w) = 0.    (31)

As above, from the definition of w we deduce that cȳ > cw. Therefore, we have λ̄ = 0 and z̄ = ȳ. But for all r we have g(z^r) ≥ 0, g(y^r) ≤ 0, by the construction of the algorithm. Therefore, z̄ = ȳ implies that g(z̄) = 0.

(b) Suppose that y^q ∉ H for all q. Then l_q has the form (19); and, as before, we have a subsequence y^r → ȳ, λ_r → λ̄. Moreover, we may also assume that p^r → p̄ ∈ ∂h(ȳ) (cf. Rockafellar (1970), Chapter 24). It follows that

l(x) = p̄(x − ȳ) + h(ȳ).

Clearly, l(w) = p̄(w − ȳ) + h(ȳ) ≤ h(w) < 0 and l(ȳ) = h(ȳ) ≥ 0, since h(y^r) > 0. As above, from (30) we conclude that then

l(z̄) = l(ȳ) + λ̄(l(ȳ) − l(w)) = 0

is only possible for λ̄ = 0. But this implies that z̄ = ȳ and g(z̄) = 0 (cf. Theorem II.2).


Fig. X.1. Algorithm X.2

Remark X.3. Algorithm X.2 is formulated in a manner that allows us to extend it in a straightforward way to the more general case of minimizing a convex function f(x) over H ∩ G. For that purpose we only have to start with a polytope D_1 ⊃ H, to replace equation (17) by max {f(x) − α_k, g(x)} = 0, and to replace l_k(x) = c(x − y^k) in (18) by l_k(x) = t^k(x − y^k), where t^k ∈ ∂f(y^k) (cf. Remark X.2 and Tuy (1987)).

Remark X.4. In the case of a linear objective function (for which Algorithm X.2 is formulated) it is easy to see that certain simplifications can be built in. For example, if (18) occurs, i.e., if a cut cx ≤ cy^k is added, then obviously all previous cuts of the form cx ≤ cy^i, i < k, are redundant and can be omitted. Moreover, if the initial polytope D_1 is contained in the halfspace {x ∈ ℝⁿ: cx ≤ α_1}, i.e., if cx ≤ α_1 is among the constraints which define D_1, then equation (17) can be replaced by g(x) = 0.

Remark X.5. In practice, given a tolerance ε > 0, we terminate when

g(z^k) < ε.    (32)

Since any accumulation point z̄ of the sequence {z^k} satisfies g(z̄) = 0, (32) must occur after finitely many iterations. Suppose that

γ = max {g(x): x ∈ H} > 0,    (33)

and ε < γ. Then we have

max {g(x): x ∈ H, cx ≤ α_k} ≤ g(z^k) < ε,

which, because of (33), implies that α_k < +∞, i.e., there is a point x^k ∈ H satisfying g(x^k) = 0, cx^k = α_k. Furthermore, there is no x ∈ H such that cx ≤ α_k, g(x) ≥ ε. Hence,

cx^k = α_k < min {cx: x ∈ H, g(x) ≥ ε}.    (34)

Therefore, with the stopping rule (32), where ε < γ, the algorithm is finite and provides an ε-optimal solution in the sense of (34). Note that this assertion is valid no matter whether or not the problem is stable.

There are two points in the proposed algorithm which may cause problems in the implementation: 1) the regularity assumption; 2) the requirement on the availability of a point w ∈ int D ∩ int C (where D = H and C := {x: g(x) ≤ 0}). It turns out, however, that the trouble can be avoided by using an appropriate concept of approximate optimal solution.

A point x̄ is called an ε-approximate feasible solution of (CDC) if

h(x̄) ≤ ε, g(x̄) + ε ≥ 0.

It is called an ε-approximate optimal solution of (CDC) if it is an ε-approximate feasible solution and

cx̄ ≤ min {cx: x ∈ D, g(x) ≥ 0} + ε.

The following modified variant of Algorithm X.2 has been proposed in Tuy (1994):

Algorithm X.2*.

0. Let γ_1 = cx^1, where x^1 is the best feasible solution available (if no feasible solution is known, set x^1 = ∅, γ_1 = +∞). Take a polytope P_1 such that {x ∈ D: cx ≤ γ_1 − ε} ⊂ P_1 ⊂ {x: cx ≤ γ_1 − ε} and having a known vertex set V_1. Set k = 1.

1. Compute z^k ∈ argmax {g(x): x ∈ V_k}. If g(z^k) < 0, then terminate:

a) If γ_k < +∞, then x^k is an ε-approximate optimal solution of (CDC);

b) If γ_k = +∞, then (CDC) is infeasible.

2. Select w^k ∈ V_k such that cw^k ≤ min {cx: x ∈ V_k} + ε. If h(w^k) ≤ ε, g(w^k) ≥ −ε, then terminate: w^k is an ε-approximate optimal solution.

3. If h(w^k) > ε/2, then define x^{k+1} = x^k, γ_{k+1} = γ_k. Let p^k ∈ ∂h(w^k),

l_k(x) = p^k(x − w^k) + h(w^k),    (35)

and go to Step 6.

4. Determine y^k ∈ [w^k, z^k] such that g(y^k) = −ε (y^k exists because g(z^k) ≥ 0, g(w^k) < −ε). If h(y^k) > ε, then set x^{k+1} = x^k, γ_{k+1} = γ_k. Determine u^k ∈ [w^k, y^k] such that h(u^k) = ε, p^k ∈ ∂h(u^k) (u^k exists because h(w^k) ≤ ε/2 and h(y^k) > ε), and let

l_k(x) = p^k(x − u^k),    (36)

and go to Step 6.

5. If h(y^k) ≤ ε, then set x^{k+1} = y^k, γ_{k+1} = cy^k.

a) If cw^k ≥ cy^k, then terminate: x^{k+1} is an ε-approximate global optimal solution.

b) Otherwise, let

l_k(x) = c(x − y^k) + ε,    (37)

and go to Step 6.

6. Compute the vertex set V_{k+1} of the polytope

P_{k+1} = P_k ∩ {x: l_k(x) ≤ 0}.

Set k ← k+1 and go back to Step 1.

To establish the convergence of this algorithm we use a stronger convergence principle than those discussed in Chapter II, Section II.2 (cf. Tuy (1994)).

Lemma X.6. Let {z^k} be a bounded sequence of points in ℝⁿ, and let l_k(·) be a sequence of affine functions such that

l_k(z^k) > 0, l_k(z^r) ≤ 0  ∀r > k.

Let {w^k} be a bounded sequence such that lim_{q→∞} l_{k_q}(w̄) < 0 for any w̄ = lim_{q→∞} w^{k_q}. If y^k ∈ [w^k, z^k] satisfies l_k(y^k) ≥ 0, then

lim_{k→∞} (z^k − y^k) = 0.
k k
Proof. Suppose the contrary, that ‖z^{k_q} − y^{k_q}‖ ≥ δ > 0 for some infinite subsequence {k_q}. Let l_k(x) = p^k x + β_k with ‖p^k‖ = 1. We can assume z^{k_q} → z̄, w^{k_q} → w̄, y^{k_q} → ȳ ∈ [w̄, z̄], p^{k_q} → p̄. Since l_{k_q}(z^{k_q}) > 0 > l_{k_q}(w̄) implies that −p^{k_q}z^{k_q} < β_{k_q} < −p^{k_q}w̄, we can also assume β_{k_q} → β̄; hence,

l_{k_q}(x) → l(x) := p̄x + β̄  ∀x.

Furthermore, l(w̄) = lim_{q→∞} l_{k_q}(w̄) < 0. From the relation 0 < l_{k_q}(z^{k_q}) = l_{k_q}(z̄) + p^{k_q}(z^{k_q} − z̄) it follows that lim l_{k_q}(z̄) ≥ 0. On the other hand, since l_{k_q}(z^{k_s}) ≤ 0 ∀s > q, by fixing q and letting s → +∞ we obtain l_{k_q}(z̄) ≤ 0 ∀q. Hence, l(z̄) = 0. Also, since l_k(y^k) ≥ 0 ∀k, we have l(ȳ) ≥ 0. But ȳ = θw̄ + (1−θ)z̄ for some θ ∈ [0,1]; hence l(ȳ) = θl(w̄) + (1−θ)l(z̄) = θl(w̄), and since l(w̄) < 0 while l(ȳ) ≥ 0, this is possible only if θ = 0, i.e., ȳ = z̄, a contradiction. ∎



Proposition X.4. Algorithm X.2* terminates after finitely many steps with an ε-approximate optimal solution or with the evidence that the problem has no feasible solution.

Proof. It is easily seen that the algorithm stops only at one of the Steps 1, 2, 5a. Since x^k changes only at Step 5, and x^{k+1} = y^k with h(y^k) ≤ ε, g(y^k) = −ε, it is clear that every x^k satisfies h(x^k) ≤ ε, g(x^k) = −ε. If Step 1a) occurs, then {x ∈ D: cx ≤ γ_k − ε} ⊂ P_k ⊂ {x: g(x) < 0}; hence {x ∈ D: g(x) ≥ 0, cx < cx^k − ε} = ∅, so x^k is an ε-approximate optimal solution.

If Step 1b) occurs, then D ⊂ P_k ⊂ {x: g(x) < 0}; hence (CDC) is infeasible.

If Step 2 occurs, then h(w^k) ≤ ε, g(w^k) ≥ −ε, while from the definition of w^k, cw^k ≤ cx + ε for all x ∈ D ⊂ P_k; hence w^k is an ε-approximate optimal solution.

If Step 5a occurs, then cy^k ≤ cw^k ≤ min {cx: x ∈ P_k} + ε, and since y^k is ε-approximate feasible, it follows that y^k is an ε-approximate optimal solution.

Now suppose the algorithm is infinite. Step 5b) cannot occur infinitely often, because y^k ∈ P_k ⊂ {x: cx ≤ γ_k − ε}, and hence γ_{k+1} = cy^k ≤ γ_k − ε. Step 3 cannot occur infinitely often either, for then, by Corollary 1, any cluster point of {w^k} would belong to D; hence for sufficiently large k one would have h(w^k) < ε/2, a contradiction. Thus the only way the algorithm can be infinite is that Step 4 occurs for all but finitely many k, say for all k ≥ k_0. We now show that all conditions in Lemma X.6 are fulfilled for the sequences {z^k}, {y^k}, {w^k} and the functions l_k(x), where k ≥ k_0. In fact, since z^r ∈ P_r we have l_k(z^r) ≤ 0 ∀r > k. On the other hand, l_k(w^k) = p^k(w^k − u^k) ≤ h(w^k) − h(u^k) ≤ ε/2 − ε = −ε/2 < 0, while l_k(u^k) = 0; hence l_k(z^k) > 0. Furthermore, since u^k is bounded and p^k ∈ ∂h(u^k), it follows by a well-known property of subdifferentials (see, e.g., Rockafellar (1970)) that p^k is also bounded. If w^{k_q} → w̄ (q → +∞), then, by taking a subsequence if necessary, we can assume u^{k_q} → ū, p^{k_q} → p̄ ∈ ∂h(ū), so l_{k_q}(w̄) = p^{k_q}(w̄ − u^{k_q}) → p̄(w̄ − ū) ≤ h(w̄) − h(ū) ≤ ε/2 − ε = −ε/2 < 0. Finally, y^k = w^k + θ_k(u^k − w^k) for some θ_k ≥ 1; hence

l_k(y^k) = θ_k l_k(u^k) + (1 − θ_k) l_k(w^k) = (1 − θ_k) l_k(w^k) ≥ 0.

Thus, all conditions of Lemma X.6 are fulfilled, and by this Lemma, z^k − y^k → 0. Since g(y^k) = −ε, for sufficiently large k we would have g(z^k) < 0, and the algorithm would stop at Step 1. Consequently, the algorithm is finite.


1.3. Outer Approximation for Solving Noncanonical D.C. Problems

Transforming a general d.c. problem into a canonical d.c. program (or into a similar program with convex objective function) requires the introduction of additional variables. Moreover, the functions h and g in a canonical d.c. program that is derived from a general d.c. problem can be quite complicated (cf. Section I.3.4). Even more important, however, is the fact that, in order to apply one of the approaches for solving d.c. programs discussed so far, we must know a d.c. representation of the functions involved. Although such a representation is readily available in many interesting cases (cf. Section I.3), very often we know that a given function is d.c., but we are not able to find one of its d.c. representations.

A first attempt to overcome the resulting difficulties was made by Tuy and Thuong (1988), who derived a conceptual outer approximation scheme that is applicable to certain noncanonical d.c. problems, and even in certain cases where a d.c. representation of some of the functions involved is not available.

Consider the problem

(P)  minimize f(x)  s.t. g_i(x) ≥ 0 (i=1,...,m),

where f is a finite convex function on ℝⁿ and the g_i (i=1,...,m) are continuous functions in a large class satisfying the assumptions below.

Let us set

C_i = {x ∈ ℝⁿ: g_i(x) < 0} (i=1,...,m);    (38)

g(x) = min_{i=1,...,m} g_i(x);  C = {x ∈ ℝⁿ: g(x) < 0} = ∪_{i=1}^m C_i.

With this notation, problem (P) asks us to find the global minimum of f(x) over the complement of the set C.

Suppose that the following assumptions are fulfilled:

(i) The finite convex function f has bounded level sets {x ∈ ℝⁿ: f(x) ≤ γ}, and the functions g_i are everywhere continuous;

(ii) α := min {f(x): g(x) ≥ 0} exists;

(iii) a point w ∈ ℝⁿ is available satisfying

f(w) < α; g(w) < 0;

(iv) for any z ∈ ℝⁿ, one can compute the point π(z) nearest to w in the intersection of the line segment [w,z] with the boundary ∂C of C, or else establish that such a point does not exist.

Assumptions (i) and (ii) are self-explanatory. Assumption (iii) can be verified by solving the unconstrained convex program

minimize f(x)  s.t. x ∈ ℝⁿ.

If an optimal solution x̄ of this convex program exists satisfying g(x̄) ≥ 0, then it will solve problem (P). Otherwise, a point w as required is available.

Assumption (iv) constitutes an essential restriction on the class of constraint functions that we admit. Since, by assumption (i), the functions g_i(x) are continuous, the set of feasible points lying on a line segment [w,z] is compact. This set does not contain w. Therefore, whenever it is nonempty, it must have an element nearest to w. It is easily seen that such a point must lie on ∂C, and is just π(z). Assumption (iv) requires that an algorithm exist to compute π(z).

For instance, if each of the functions g_i(x) (i=1,...,m) is convex, then π(z) can be computed in the following way:

Initialization:
Set w^1 = w, I_1 = {1,...,m}.

Step k=1,2,...:
Choose i_k ∈ argmin {g_i(w^k): i ∈ I_k}.
If g_{i_k}(z) < 0, then stop: there is no feasible point on the line segment [w^k, z], because g_{i_k}(x) < 0 ∀x ∈ [w^k, z].
Otherwise, compute the point w^{k+1} ∈ [w^k, z] satisfying g_{i_k}(w^{k+1}) = 0. (Since C_{i_k} is convex, this point is unique and can be easily determined.)
Set I_{k+1} = I_k \ {i_k}. If min {g_i(w^{k+1}): i ∈ I_{k+1}} ≥ 0, then stop: π(z) = w^{k+1}. Otherwise, go to Step k+1.

Clearly, after at most m steps this procedure either finds π(z) or else establishes that π(z) does not exist.
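Under the stated convexity assumption this procedure is straightforward to implement. A compact sketch (Python; g_list is the list of constraint functions, the one-dimensional searches are bisections, all names illustrative):

def pi_of_z(z, w, g_list, tol=1e-10):
    # the at-most-m-step procedure above; returns pi(z), or None when the
    # segment [w, z] contains no feasible point
    def first_zero(gi, a, b):        # gi(a) < 0 <= gi(b), gi convex on [a, b]
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            m = 0.5 * (lo + hi)
            lo, hi = (m, hi) if gi((1 - m) * a + m * b) < 0 else (lo, m)
        return (1 - hi) * a + hi * b
    wk, I = w, set(range(len(g_list)))
    while True:
        ik = min(I, key=lambda i: g_list[i](wk))
        if g_list[ik](z) < 0:
            return None              # g_{ik} < 0 on all of [wk, z]: infeasible
        wk = first_zero(g_list[ik], wk, z)
        I.remove(ik)
        if not I or min(g_list[i](wk) for i in I) >= 0:
            return wk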



It is also not difficult to derive a procedure to determine π(z) for other classes of functions g_i(x), for example, when all of the g_i(x) are piecewise affine (this is left to the reader).

Proposition X.5. Suppose that assumptions (i)–(iv) are satisfied. Then every optimal solution of (P) lies on the boundary ∂C of the set C.

Proof. Suppose that z is a point satisfying g(z) > 0. Then π(z) as defined in assumption (iv) exists. By assumption (iii), we have f(w) < f(z), and hence

f(π(z)) = f(λw + (1−λ)z) ≤ λf(w) + (1−λ)f(z) < f(z),

since f is convex and there is a number λ ∈ (0,1) such that π(z) = λw + (1−λ)z. ∎
Note that the duality theory discussed in the preceding sections also applies to problem (P). In particular, Theorem X.1 remains valid, i.e., we have the following corollary.

Corollary X.3. For a point x̄ ∈ ∂C to be an optimal solution of (P) it is necessary that

max {g(x): f(x) ≤ f(x̄)} = 0.    (39)

This condition is also sufficient if problem (P) is stable.

In addition to the conditions (i)–(iv) we shall assume that

(v) the objective function f(x) is strictly convex.

Assumption (v) is purely technical. If it is not satisfied (i.e., if f(x) is convex but not strictly convex), then we may replace f(x) by f_ε(x) = f(x) + ε‖x‖², which is obviously strictly convex. For an optimal solution x̄(ε) of the resulting problem we then have f(x̄(ε)) + ε‖x̄(ε)‖² ≤ f(x) + ε‖x‖² for all feasible points x of the original problem (P). It follows that f(x̄(ε)) → α as ε → 0, since {x̄(ε)} is bounded, by assumption (i) (cf. Remark IX.3).

The role of assumption (v) is to ensure that every supporting hyperplane of the level set {x: f(x) ≤ α} supports it at exactly one point. This property is needed in order to prevent the following algorithm from jamming.

Algorithm X.3.

Initialization:
Let w ∈ argmin {f(x): x ∈ ℝⁿ} (or any point satisfying assumption (iii)). Compute a point x^1 ∈ ∂C. Generate a polytope D_1 satisfying

{x ∈ ℝⁿ: x ∈ w + ∪_{θ≥1} θ(∂C − w), f(x) ≤ f(x^1)} ⊂ D_1,

while f(z) > f(x^1) for any vertex z of D_1. Set k = 1.

Iteration k=1,2,...:

k.1.: Check whether f(x^k) = min f(D_k). If f(x^k) = min f(D_k), then stop: x^k is an optimal solution of problem (P). Otherwise, continue.

k.2.: Solve the subproblem

(SP_k)  maximize f(x)  s.t. x ∈ D_k

by a finite algorithm. Let z^k be an optimal solution of (SP_k) (z^k is a vertex of D_k, since f is strictly convex).

k.3.: Compute π(z^k). If π(z^k) exists and f(π(z^k)) < f(x^k), then set x^{k+1} = π(z^k). Otherwise, set x^{k+1} = x^k.

k.4.: Let y^{k+1} be the point where the line segment [w, z^k] intersects the surface {x: f(x) = f(x^{k+1})}. Compute p^{k+1} ∈ ∂f(y^{k+1}) (∂f(y^{k+1}) denotes the subdifferential of f at y^{k+1}). Set

D_{k+1} = D_k ∩ {x: p^{k+1}(x − y^{k+1}) ≤ 0}.

Set k ← k+1 and go to Step k.1.
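Step k.4 is again a one-dimensional search: since f(w) < f(x^{k+1}) ≤ f(z^k) and f is convex, the point y^{k+1} and the corresponding cut can be obtained as in the following sketch (Python; f, subgrad_f and the data are illustrative assumptions, vectors are numpy arrays):

def level_cut(z, w, f, alpha, subgrad_f, tol=1e-10):
    # y on [w, z] with f(y) = alpha (bisection; f(w) < alpha <= f(z)),
    # then the cut l(x) = p (x - y) <= 0 with p in the subdifferential of f at y
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        m = 0.5 * (lo + hi)
        lo, hi = (m, hi) if f((1 - m) * w + m * z) < alpha else (lo, m)
    y = (1 - hi) * w + hi * z
    p = subgrad_f(y)
    return y, p, -(p @ y)       # D_{k+1} = D_k ∩ {x: p x + q <= 0}, q = -p y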

Remarks X.6. (i) In Step k.1, checking whether f(x^k) = min f(D_k) is easy, because f is convex and D_k is a polytope. It suffices to determine whether one of the standard first order optimality conditions holds. For example, x^k is optimal if

0 ∈ ∂f(x^k) + N_{D_k}(x^k),

where N_{D_k}(x^k) denotes the outward normal cone to D_k at x^k, i.e., the cone which is generated by the normal vectors of the constraints of D_k that are binding (active) at x^k (cf., e.g., Rockafellar (1970)).


It will be shown that D_k contains all feasible points of (P) satisfying f(x) ≤ f(x^k). Therefore, if f(x^k) = min f(D_k), then x^k must be an optimal solution of (P).

(ii) The subproblem (SP_k) is equivalent to the problem of globally minimizing the concave function (−f) over the polytope D_k, and it can be solved by any of the algorithms described in Part B. Since D_{k+1} differs from D_k by just one additional linear constraint, the algorithm for solving (SP_k) should have the capability of being restarted at the current solution of (SP_{k−1}) (cf. Chapter VII). However, most often one would proceed as discussed in Chapter II: start with a simple initial polytope D_1 (for example, a simplex) whose vertex set V(D_1) is known, and determine the vertex set V(D_{k+1}) of D_{k+1} from the vertex set V(D_k) of D_k by one of the methods described in Chapter II. Since max f(D_k) = max f(V(D_k)), problem (SP_k) is then reduced to the problem of determining V(D_{k+1}) from V(D_k).



The convergence of Algorithm X.3 is established through some lemmas. Let

l_k(x) := p^{k+1}(x − y^{k+1}).

Lemma X.7. For every k one has

{x ∈ D_1: f(x) ≤ f(x^k)} ⊂ D_k.

Proof. The assertion is obvious for k=1. Supposing that it holds for some k, we prove it for k+1.

If x ∈ D_1 and f(x) ≤ f(x^{k+1}), then f(x) ≤ f(x^k), since f(x^{k+1}) ≤ f(x^k); hence, x ∈ D_k by the induction assumption. Furthermore, from the definition of a subgradient and the equality f(y^{k+1}) = f(x^{k+1}), we see that

p^{k+1}(x − y^{k+1}) ≤ f(x) − f(y^{k+1}) = f(x) − f(x^{k+1}) ≤ 0,

i.e., l_k(x) ≤ 0, and hence x ∈ D_{k+1}. ∎


Lemma X.8. (i) There exists a positive real number L such that

|l_k(z) − l_k(x)| ≤ L‖z − x‖  ∀z, x ∈ ℝⁿ, ∀k.

(ii) There exists an α* ∈ ℝ satisfying f(x^k) → α* (k → ∞). Moreover, whenever z^{k_q} → z̄ and l_{k_q}(z^{k_q}) → 0 (q → ∞) for some subsequence {k_q}, we have f(z̄) = α*.

Proof. (i) Since y^{k+1} ∈ D_1 and D_1 is compact (it is a polytope), it follows from a well-known result of convex analysis (see, e.g., Rockafellar (1970)) that the sequence {p^{k+1}}, p^{k+1} ∈ ∂f(y^{k+1}), is bounded, i.e., ‖p^{k+1}‖ < L for some L > 0. Hence, for every z and x we have

|l_k(z) − l_k(x)| = |p^{k+1}(z − x)| ≤ L‖z − x‖.

(ii) The sequence {f(x^k)} is nonincreasing and bounded from below by f(w) (cf. assumption (iii)). Therefore, it must converge to some limit α*.

Now consider a subsequence {k_q} such that z^{k_q} → z̄ and l_{k_q}(z^{k_q}) → 0. In view of the boundedness of the sequences {p^{k+1}} and {y^{k+1}} we may assume (by passing to subsequences if necessary) that p^{k_q+1} → p̄, y^{k_q+1} → ȳ. Then l_{k_q}(z^{k_q}) = p^{k_q+1}(z^{k_q} − y^{k_q+1}) → p̄(z̄ − ȳ) = 0 (by hypothesis).

Since p̄ ∈ ∂f(ȳ) (cf. Rockafellar (1970)) and f(ȳ) = lim f(y^{k_q+1}) = lim f(x^{k_q+1}) = α*, we must have p̄ ≠ 0. Otherwise, 0 = p̄(w − ȳ) ≤ f(w) − f(ȳ) = f(w) − α* would imply f(ȳ) ≤ f(w), which contradicts assumption (iii).

Furthermore, since y^{k+1} ∈ [w, z^k], we see that ȳ ∈ [w, z̄], i.e., z̄ − ȳ = −λ(w − ȳ) for some λ ≥ 0.

But from the above we know that 0 = p̄(z̄ − ȳ), and hence

0 = p̄(z̄ − ȳ) = −λp̄(w − ȳ) ≥ −λ(f(w) − f(ȳ)).

Finally, since f(w) − f(ȳ) < 0, this implies that λ = 0, i.e., z̄ = ȳ. Therefore, f(z̄) = f(ȳ) = α*. ∎



Lemma X.9. Every accumulation point z̄ of the sequence {z^k} satisfies f(z̄) = α*.

Proof. Lemma X.9 can easily be deduced from the preceding two lemmas and Theorem II.1. A simple direct proof is as follows:

Let z̄ = lim_{q→∞} z^{k_q}. From Lemma X.8 (i) we see that

|l_{k_q}(z^{k_q}) − l_{k_q}(z̄)| ≤ L‖z^{k_q} − z̄‖.    (40)

But from the construction of the algorithm it is clear that l_k(z^j) ≤ 0 ∀k < j. Fixing k and setting j = k_q → ∞, we obtain l_k(z̄) ≤ 0. Inserting this in the above inequality yields

0 ≤ l_{k_q}(z^{k_q}) ≤ L‖z^{k_q} − z̄‖ → 0,

where the first inequality follows from the definition of z^{k_q} in the above algorithm (l_k(w) < 0 ≤ l_k(z^k) ∀k).

The equality f(z̄) = α* then follows from Lemma X.8 (ii). ∎

Lemma X.10. Every point x̄ ∈ D_1 which satisfies f(x̄) = α* is the limit of some subsequence {y^{k_q}} of {y^k}.

Proof. Since α* ≤ f(x^1), it follows from the construction of D_1 that x̄ is not a vertex of D_1. Therefore, using the strict convexity of f(x), we see that for any integer q > 0 there exists a point u^q ∈ D_1 such that ‖u^q − x̄‖ ≤ 1/q and f(u^q) > α*.

But the inclusion z^k ∈ argmax {f(z): z ∈ D_k} and Lemma X.9 imply that max {f(z): z ∈ D_k} → α* (k → ∞). Hence, u^q ∉ D_{k_q} for some k_q, i.e., one has y^{k_q} and p^{k_q} ∈ ∂f(y^{k_q}) such that

p^{k_q}(u^q − y^{k_q}) > 0.    (41)

By passing to subsequences if necessary, we may assume that y^{k_q} → ȳ and p^{k_q} → p̄ as q → ∞. Then we know from the proof of Lemma X.8 that f(ȳ) = α* and p̄ ∈ ∂f(ȳ). We must have p̄ ≠ 0, because 0 ∈ ∂f(ȳ) would imply that f(ȳ) = α* = min f(ℝⁿ), which contradicts assumption (iii).

Letting q → ∞, we see from (41) that p̄(x̄ − ȳ) ≥ 0, and since f(x̄) = f(ȳ) = α*, from the strict convexity of f(x) again it follows that x̄ = ȳ, i.e., x̄ = lim y^{k_q}. ∎


Proposition X.6. Every accumulation point x̄ of the sequence {x^k} generated by Algorithm X.3 satisfies the condition

0 = max {g(x): f(x) ≤ f(x̄)}.    (42)

If problem (P) is stable, then every accumulation point x̄ of {x^k} is an optimal solution of (P).

Proof. First note that for every k the line segment [w, y^k] contains at most one feasible point, namely either y^k, or no point at all. Indeed, either y^k = x^k = π(z^{k−1}), and, from the definition of π(z^{k−1}), y^k is the only feasible point in [w, y^k]; or else y^k satisfies f(y^k) = f(x^{k−1}), and there is no feasible point in [w, y^k].

Now suppose that (42) does not hold. Then there is a point x satisfying f(x) ≤ f(x̄) = α* and g(x) > 0. Since f and g are continuous and f(w) < α*, there is also a point x̃ satisfying f(x̃) < α* and g(x̃) > 0. Let U denote a closed ball around x̃ such that g(x) > 0 and f(x) < α* for all x ∈ U. Let x̂ ∈ D_1 be the point of the halfline from w through x̃ for which f(x̂) = α*. From Lemma X.10 we know that there is a sequence y^{k_q} → x̂ (q → ∞). For sufficiently large q the line segment [w, y^{k_q}] will intersect the ball U at a point x' satisfying g(x') > 0 and f(x') < α*. But since f(y^{k_q}) = f(x^{k_q}) ≥ α*, this implies that x' ≠ y^{k_q} ∀q, contradicting the above observation.

Therefore, relation (42) holds, and the second assertion follows from Corollary X.3. ∎

Example X.1. Computing a global solution of a difficult multiextremal optimization problem is usually very expensive. Therefore, a reasonable approach is to transcend local optimality by first computing a local minimum (or stationary point) x_loc by one of the standard nonlinear programming techniques, and then to apply a global optimization algorithm with the aim of locating a feasible solution x^k that is substantially better than x_loc, for example, such that

f(x^k) < f(x_loc) − η,

where η is some prescribed positive number. As an example, consider the problem



minimize f(x) := (x_1 − 14)² + 2(x_2 − 10)² + 3(x_3 − 16)² + 4(x_4 − 8)²

s.t.  g_1(x) := 2(x_1 − 15)² + (x_2 − 9)² + 3(x_3 − 18)² + 2(x_4 − 10)² − 60 ≥ 0,

g_2(x) := (x_1 − 10)² + 3(x_2 − 12)² + 2(x_3 − 14)² + (x_4 − 13)² − 50 ≥ 0.

The global minimum of f(x) over ℝ⁴ is 0. Suppose that a feasible solution is given by

x^1 = (13.87481, 9.91058, 15.85692, 16.870087) with f(x^1) = 314.86234.

We try to find a feasible solution x̄ satisfying

f(x̄) < f(x^1) − 300 = 14.86234,

or to establish that such a feasible point does not exist.

We choose w = (14, 10, 16, 8) with f(w) = 0, g(w) = min {g_1(w), g_2(w)} = −37.

D_1 is the simplex in ℝ⁴ with vertices v^0 = 0, v^i = 10³·e^i (i=1,2,3,4), where e^i is the i-th unit vector in ℝ⁴.

After eight iterations the algorithm finds

x^9 = (14.44121, 10.48164, 14.30500, 7.15250) with f(x^9) = 12.15073 < 14.86234.

The intermediate results are:

x^2 = (13.91441, 9.93887, 22.01549, 7.95109), y^2 = x^2, f(x^2) = 108.58278;

x^3 = (13.92885, 15.03099, 15.91869, 7.95994), y^3 = x^3, f(x^3) = 50.65313;

x^4 = x^3, y^4 = (13.92728, 15.03025, 15.91690, 8.07018);

x^5 = x^4, y^5 = (13.92675, 15.03119, 16.06476, 7.95814);

x^6 = (19.34123, 9.94583, 15.91333, 7.95667), y^6 = x^6, f(x^6) = 28.56460;

x^7 = (19.32583, 10.06260, 15.91166, 7.95583), y^7 = x^7, f(x^7) = 28.40348;

x^8 = (12.38792, 8.84852, 14.15763, 8.98799), y^8 = x^8, f(x^8) = 19.33813.
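These values are easy to check numerically (a minimal script in Python; the functions are those stated above):

import numpy as np

f  = lambda x: (x[0]-14)**2 + 2*(x[1]-10)**2 + 3*(x[2]-16)**2 + 4*(x[3]-8)**2
g1 = lambda x: 2*(x[0]-15)**2 + (x[1]-9)**2 + 3*(x[2]-18)**2 + 2*(x[3]-10)**2 - 60
g2 = lambda x: (x[0]-10)**2 + 3*(x[1]-12)**2 + 2*(x[2]-14)**2 + (x[3]-13)**2 - 50

w  = np.array([14.0, 10.0, 16.0, 8.0])
x9 = np.array([14.44121, 10.48164, 14.30500, 7.15250])
print(f(w), min(g1(w), g2(w)))     # 0.0 and -37.0
print(f(x9), g1(x9), g2(x9))       # about 12.1507, 0 (x9 lies on g1 = 0), 11.0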



Now suppose that problem (P) is not stable. In this case, Algorithm X.3 is not guaranteed to converge to a global solution (cf. Proposition X.6). However, an ε-perturbation in the sense of the following proposition can always be used to handle unstable problems, no matter whether or not the perturbed problem is stable.

Proposition X.7. Let x̄(ε) denote any accumulation point of the sequence {x^k(ε)} generated by Algorithm X.3 when applied to the ε-perturbed problem

(P(ε))  minimize f(x)  s.t. g_i(x) + ε ≥ 0 (i=1,...,m).

Then as ε → 0 every accumulation point of the sequence {x̄(ε)} is an optimal solution of problem (P).

Proof. Let D(ε) denote the feasible set of problem (P(ε)). Clearly, D(ε) contains the feasible set of (P) for all ε > 0. From Proposition X.6 we know that every accumulation point x̄(ε) of {x^k(ε)} satisfies g(x̄(ε)) = −ε and

max {g(x): f(x) ≤ f(x̄(ε))} = −ε.

This implies that f(x̄(ε)) < min {f(x): g(x) ≥ −ε/2}. Therefore, as ε → 0 every accumulation point x̄ of x̄(ε) satisfies g(x̄) = 0 and f(x̄) ≤ min {f(x): g(x) ≥ 0}; hence, it is an optimal solution of (P).

Remark X.7. If we apply Algorithm X.3 to the above perturbed problem and stop as soon as

max {g(x): f(x) ≤ f(x^k(ε))} < 0,

then x^k(ε) yields an approximate optimal solution of (P) in the sense that

g(x^k(ε)) + ε = 0, f(x^k(ε)) ≤ min {f(x): g(x) ≥ 0}.



2. BRANCH AND BOUND METHODS

The development of BB methods presented in Chapter IV allows one to design various procedures for a variety of difficult global optimization problems.

We choose an appropriate type of partition set (e.g., simplices or rectangles) and an exhaustive subdivision process. We select a bounding operation in accordance with the given type of objective function which provides a lower bound β(M) for min f(D ∩ M) or min f(M), respectively. We apply a bound improving selection for the partition elements to be refined. Finally, if necessary, we choose from Section IV.5 the "deletion by infeasibility" rule that corresponds to the given feasible set, and we incorporate all of these elements into the prototype BB procedure described in Section IV.1. Then the theory developed in Chapter IV guarantees convergence in the sense discussed there.

Note that, as shown in Section IV.4.5, whenever a lower bound β(M) yields consistency or strong consistency, then any lower bound β̄(M) satisfying β̄(M) ≥ β(M) for all partition sets M will, of course, also provide consistency or strong consistency, respectively. Hence, better and more sophisticated lower bounds than those discussed in Section IV.4.5 can be incorporated in the corresponding BB procedures without worrying about convergence.
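In a few lines, the resulting prototype procedure is the familiar loop below (a minimal sketch in Python; lower_bound, feasible_point, subdivide and the deletion test stand for the problem-specific ingredients just listed, and all names are illustrative):

import math

def branch_and_bound(M0, lower_bound, feasible_point, subdivide, deleted, eps=1e-6):
    # prototype BB loop: deletion by bound, bound-improving selection,
    # exhaustive subdivision, deletion by infeasibility
    best_x, best_val = None, math.inf
    active = [(lower_bound(M0), M0)]
    while active:
        active = [(b, M) for (b, M) in active if b < best_val - eps]
        if not active:
            break
        i = min(range(len(active)), key=lambda j: active[j][0])
        beta, M = active.pop(i)                  # partition set with smallest bound
        for child in subdivide(M):               # e.g. simplicial or rectangular split
            if deleted(child):                   # "deletion by infeasibility" rule
                continue
            cand = feasible_point(child)         # returns (x, f(x)) or None
            if cand is not None and cand[1] < best_val:
                best_x, best_val = cand
            active.append((lower_bound(child), child))
    return best_x, best_val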

As an example of the various possibilities, we present here an algorithm for globally minimizing a d.c. function subject to a finite number of convex and reverse convex inequalities, which has been proposed in Horst and Dien (1987).

Let f_1: ℝⁿ → ℝ be a concave function, and let f_2, g_i, h_j: ℝⁿ → ℝ be convex functions (i=1,...,m; j=1,...,r). Define the convex function

g(x) = max {g_i(x): i=1,...,m},    (43)


and the sets

D_1 = {x ∈ ℝⁿ: g(x) ≤ 0},    (44)

D_2 = {x ∈ ℝⁿ: h_j(x) ≥ 0 (j=1,...,r)}.    (45)

The problem to be considered is then

minimize f(x) := f_1(x) + f_2(x)  s.t. x ∈ D := D_1 ∩ D_2.    (46)

Assume that D is nonempty and compact and that a point y^0 satisfying g(y^0) < 0 is known.

Let min f(S) = +∞ whenever S is an empty set.

Algorithm X.4.

Step 0 (Initialization):

Construct an n-simplex M_0 ⊃ D_1 (cf. Chapter III) and its radial partition ℳ_0 with respect to y^0 (cf. Section IV.3).

For each M ∈ ℳ_0:
Set S_M = V(M) ∩ D and α(M) = min f(S_M), where V(M) denotes the vertex set of M.
Choose v* ∈ V(M), p* ∈ ∂f_2(v*) and compute

β(M) = min {φ_M(v) := f_1(v) + f_2(v*) + p*(v − v*): v ∈ V(M)}.    (47)

Choose y(M) ∈ V(M) satisfying φ_M(y(M)) = β(M).    (48)

Set α_0 = min {α(M): M ∈ ℳ_0}, β_0 = min {β(M): M ∈ ℳ_0}.

If α_0 < +∞, then choose x^0 satisfying f(x^0) = α_0.

If α_0 − β_0 = 0 (≤ ε), then stop: x^0 is an (ε-)optimal solution.

Otherwise, choose M_0 ∈ ℳ_0 satisfying β_0 = β(M_0) and set y^0 = y(M_0).

Step k=1,2,...:

At the beginning of Step k we have the current partition ℳ_{k−1} of a subset of M_0 still of interest.

Furthermore, for every M ∈ ℳ_{k−1} we have S_M ⊂ M ∩ D (possibly S_M = ∅), and bounds β(M), α(M) (possibly α(M) = +∞) which satisfy

β(M) ≤ min f(M ∩ D) ≤ α(M)  if M is known to be feasible,

β(M) ≤ min f(M)  if M is uncertain.

Moreover, we have the current lower and upper bounds β_{k−1}, α_{k−1} (possibly α_{k−1} = +∞) which satisfy

β_{k−1} ≤ min f(D) ≤ α_{k−1},

and a partition set M_{k−1} satisfying β_{k−1} = β(M_{k−1}).

Finally, we have a corresponding point y^{k−1} = y(M_{k−1}), and, if α_{k−1} < +∞, we have x^{k−1} ∈ D satisfying f(x^{k−1}) = α_{k−1}.

k.1. Delete all M ∈ ℳ_{k−1} satisfying β(M) > α_{k−1}. Let ℛ_k be the collection of remaining members of ℳ_{k−1}.

k.2. Select a collection 𝒫_k ⊂ ℛ_k satisfying M_{k−1} ∈ 𝒫_k, and subdivide every member of 𝒫_k into a finite number of n-simplices by means of an exhaustive radial subdivision. Let 𝒫*_k be the collection of all new partition elements.

k.3. Delete every M ∈ 𝒫*_k for which the deletion rule (DR1) (cf. Section IV.5) applies or for which it is otherwise known that min f(D) cannot occur in M. Let 𝒩_k be the collection of all remaining members of 𝒫*_k.

k.4. For each M ∈ 𝒩_k:

Set S_M = V(M) ∩ D and α(M) = min f(S_M).
Determine β'(M) = min φ_M(V(M)) and y(M) according to (47), (48).
Set β(M) = max {β'(M), β(M')}, where M ⊂ M' ∈ ℳ_{k−1}.

k.5. Set ℳ_k = (ℛ_k \ 𝒫_k) ∪ 𝒩_k.

Compute

α_k = min {α(M): M ∈ ℳ_k}, β_k = min {β(M): M ∈ ℳ_k}.

If α_k < +∞, then let x^k ∈ D be such that f(x^k) = α_k.

If α_k − β_k = 0 (≤ ε), then stop: x^k is an (ε-)optimal solution.

Otherwise, choose M_k ∈ ℳ_k satisfying β_k = β(M_k), set y^k = y(M_k), and go to Step k+1.
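The bound (47)–(48) requires only one subgradient and n+1 function evaluations per simplex. A sketch (Python with numpy; f1, f2 and subgrad_f2 are illustrative names):

import numpy as np

def bound_47(V, f1, f2, subgrad_f2):
    # V: (n+1) x n array whose rows are the vertices of the simplex M.
    # f2 is replaced by its affine minorant at v*, so phi_M <= f on M, and the
    # concave function phi_M attains its minimum over M at a vertex.
    v_star = V[0]                 # any vertex may serve as linearization point
    p = subgrad_f2(v_star)
    phi = [f1(v) + f2(v_star) + p @ (v - v_star) for v in V]
    j = int(np.argmin(phi))
    return phi[j], V[j]           # beta(M) and y(M) as in (48)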

The following proposition shows convergence of the procedure (ε = 0).

Proposition X.8. (i) If Algorithm X.4 does not terminate after a finite number of iterations, then every accumulation point of the sequence {y^k} is an optimal solution of problem (46), and

lim_{k→∞} β_k = min f(D).    (49)

(ii) If S_M ≠ ∅ for every partition element M that is never deleted, then the sequence {x^k} has accumulation points, and every accumulation point is an optimal solution of problem (46) satisfying

lim_{k→∞} α_k = lim_{k→∞} f(x^k) = min f(D).    (50)

Proof. Proposition X.8 follows from the theory presented in Chapter IV. Since {β_k} is a nondecreasing sequence bounded from above by min f(D), we have the existence of

β := lim_{k→∞} β_k ≤ min f(D) .   (51)

Now let ȳ be an accumulation point of {y^k}, and denote by {y^q} a subsequence of {y^k} satisfying y^q → ȳ. Then, since the selection is bound improving and the subdivision is exhaustive (cf. Definition IV.6 and Definition IV.10), we can use a standard argument on finiteness of the number of partition elements in each step (similarly to the argument in the proof of Theorem IV.3) to conclude that there exists a decreasing sequence {M_q} ⊂ {M_k} of successively refined partition elements satisfying

β_q = β(M_q) , y^q = y(M_q) ,   (52)

and

∩_q M_q = {ȳ} .   (53)

In Proposition IV.4, it was shown that deletion rule (DR2) is certain in the limit, and by Proposition IV.3 we have strong consistency of the bounding operation (cf. Definition IV.7). It follows that

ȳ ∈ D   (54)

and

lim_{q→∞} β(M_q) = f(ȳ) .   (55)

Considering (51), (52) and (55), we see that

β = lim_{q→∞} β(M_q) = f(ȳ) ≥ min f(D) ,

and hence

β = f(ȳ) = min f(D) ,

since ȳ ∈ D, and assertion (i) is verified.

In order to prove (ii), recall from Lemma IV.5 that under the assumptions of Proposition X.8 (ii) the bounding operation is also consistent. The assertion then follows from Theorem IV.3 and Corollary IV.2, since f is continuous and D is compact. ■

Remark X.8. In addition to (47), several other bounding procedures are available. One possibility, for example, is to linearize the convex part f_2 of f at different vertices v* of M and to choose the best bound obtained from (47) over all v* considered (cf. Section XI.2.5).

Another method is to replace the concave part f_1 of f by its convex envelope φ over the simplex M (which is an affine function, cf. Section IV.4.3) and to minimize the convex function (φ + f_2) over M.

More sophisticated bounding operations have been applied to problems where additional structure can be exploited; examples include separable d.c. problems such as, e.g., the minimization of indefinite quadratic functions over polytopes, where piecewise linearization has been used (Chapter IX).
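The bounding step (47) is straightforward to implement. The following minimal Python sketch computes β(M) and y(M) for a simplex M given by its vertex list; the concave part f_1, the convex part f_2 and its gradient (used as the subgradient p*) are illustrative stand-ins, not data from the text.

import numpy as np

# Illustrative d.c. decomposition f = f1 + f2 (f1 concave, f2 convex):
f1 = lambda x: -np.sum(x**2)
f2 = lambda x: float(np.sum(np.exp(x)))
grad_f2 = lambda x: np.exp(x)          # any subgradient p* of f2 works

def bound_47(vertices, v_star_idx=0):
    """Lower bound beta(M) and point y(M) from (47)-(48).

    ell_M(v) = f1(v) + f2(v*) + p*(v - v*) underestimates f on M by the
    subgradient inequality, and it is concave, so its minimum over the
    simplex M is attained at a vertex; hence the minimum over V(M)
    bounds min f(M) from below.
    """
    V = [np.asarray(v, dtype=float) for v in vertices]
    v_star = V[v_star_idx]
    p_star = grad_f2(v_star)
    vals = [f1(v) + f2(v_star) + p_star @ (v - v_star) for v in V]
    j = int(np.argmin(vals))
    return vals[j], V[j]               # beta(M), y(M)

beta_M, y_M = bound_47([np.zeros(2), np.array([1.0, 0.0]), np.array([0.0, 1.0])])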

3. SOLVING D.C. PROBLEMS BY A SEQUENCE OF LINEAR PROGRAMS AND LINE SEARCHES

The standard deterministic global optimization methods, such as outer approximation and branch and bound, were first investigated in order to solve the concave minimization problem and problems closely related to concave minimization (cf. Part B). In the preceding sections we saw that these basic methods can also be used

to solve the more general and more difficult d.c. problem (general reverse convex programming problem).

Further developments in concave minimization, however, have led to certain combinations of outer approximation and branch and bound methods that involve only linear programming subproblems and line searches (Benson and Horst (1991), Horst, Thoai and Benson (1991)), cf. Algorithm VII.2 and the discussion in Section VII.1.9. The first numerical experiments indicate that, for concave minimization, these methods can be expected to be more efficient than pure outer approximation and pure branch and bound methods (cf. Horst, Thoai and Benson (1991) and the discussion in Chapter VII). Therefore, some effort has been devoted to the extension of these approaches to the d.c. problem. The resulting procedure, which is presented below, can be viewed as an extension of Algorithm VII.2 to the d.c. problem which takes into account the more complicated nature of the latter problem by an appropriate deletion-by-infeasibility rule and a modified bounding procedure (cf. Horst et al. (1990)).

For other possible extensions of the above mentioned linear programming and line search approaches for concave minimization to the d.c. problem we refer to Horst (1989), Tuy (1989a) and Horst et al. (1991), see also Horst, Pardalos and Thoai (1995).

Let us again consider the canonical d.c. problem

(CDC)   minimize f(x) := cx
        s.t. h(x) ≤ 0 , g(x) ≥ 0 ,

where h, g are convex functions on ℝ^n and c ∈ ℝ^n.

A modification of the algorithm which handles concave objective functions will be given below.

Let

Ω := {x ∈ ℝ^n: h(x) ≤ 0} ,
G := {x ∈ ℝ^n: g(x) ≥ 0} ,  C := {x ∈ ℝ^n: g(x) < 0} = ℝ^n \ G ,   (57)
D := Ω ∩ G .

(Note that Ω is the set which in Section X.1 was denoted by H.) Assume that

(a) Ω is bounded, and there is a polytope T containing Ω;

(b) C is bounded;

(c) there is a point w satisfying

h(w) < 0 , g(w) < 0 , cw < min {cz: z ∈ Ω ∩ G} .

Remarks X.9. (i) Assumptions (a), (b), (c) are quite similar to the standard assumptions in Section X.1.2. A polytope T satisfying (a) is often given as a rectangle defined by known bounds on the variables. Another possibility is to construct a simplex T satisfying Ω ⊂ T by one of the methods in Chapter II.

(ii) Assumption (b) is often not satisfied in formulations of (CDC) arising from applications. However, since Ω is bounded and a simple polytope T ⊃ Ω with known vertex set V(T) is at hand, we can always redefine

C ← C ∩ {x ∈ ℝ^n: ‖x − w‖ < max {‖v − w‖: v ∈ V(T)}}

without changing the problem.

(iii) Assumption (c) is fulfilled if Ω satisfies the Slater condition (h(w) < 0) and if the reverse convex constraint is essential, i.e., if it cannot be omitted (cf. Section X.1.2).

Initial conical partition and subdivision process

In order to simplify notation, let us assume that the coordinate system has been
translated such that the point w in assumption (c) is the origin 0.

Let

S = conv {v^1,...,v^{n+1}}

denote an n-simplex with vertices v^1,...,v^{n+1} which satisfies

0 ∈ int S .

For example, we can take S = T if T is an n-simplex. Another example is given by v^i = e^i (i=1,...,n), v^{n+1} = −e, where e^i is the i-th unit vector in ℝ^n and e = (1,1,...,1)^T ∈ ℝ^n.

Let F_i denote the facet of S opposite v^i, i.e., we have v^i ∉ F_i (i=1,...,n+1). Clearly, F_i is an (n−1)-simplex.

Let M(F_i) be the convex polyhedral cone generated by the vertices of F_i. Then we know from Chapter IV and Chapter VII that

ℳ_1 = {M(F_i): i=1,...,n+1}

is a conical partition of ℝ^n. This partition will be the initial partition of the algorithm.

The subdivision process for a cone M = M(U), where U is an (n−1)-simplex contained in some facet F_i of S, is assumed to be induced by an exhaustive radial subdivision of U (cf. Chapter IV, and also Chapter VII where, however, a slightly different but equivalent matrix notation was used).
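For illustration, a small numpy sketch of this initial construction (assuming the coordinates have already been translated so that w = 0):

import numpy as np

def initial_conical_partition(n):
    """Generators of the cones M(F_i), i = 1,...,n+1.

    S = conv{e^1,...,e^n, -e}; the i-th returned matrix has as columns
    the vertices of the facet F_i opposite v^i, i.e., the generators of
    the polyhedral cone M(F_i).  Together the n+1 cones partition R^n.
    """
    V = np.vstack([np.eye(n), -np.ones((1, n))])   # rows v^1,...,v^{n+1}
    return [np.delete(V, i, axis=0).T for i in range(n + 1)]

cones = initial_conical_partition(3)               # four cones in R^3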

Lower bounds and deletion by infeasibility

Let U be an (n−1)-simplex contained in some facet F_i of S, and let M = M(U) denote the cone generated by the vertices of U. Furthermore, suppose that α is the best objective function value attained at a feasible point known so far.

Let P be a polytope containing D, and consider the polytope

Q = P ∩ {x: cx ≤ α} .

The algorithm below begins with P = T, and then successively redefines P to include constraints generated by an outer approximation procedure.

Now we present a method for calculating a lower bound of f(x) = cx on the intersection D ∩ M ∩ {x: cx ≤ α} of the part of the feasible set still of interest with the cone M, provided that D ∩ M ∩ {x: cx ≤ α} ≠ ∅. This method will also enable us to detect sufficiently many (but not necessarily all) cones M of a current partition which satisfy

D ∩ M ∩ {x: cx ≤ α} = ∅ .

These cones will be deleted from further consideration.

Let the above polytope Q be defined by

Q = {x ∈ ℝ^n: Ax ≤ b} ,

where A is a (p×n)-matrix and b ∈ ℝ^p.

Furthermore, for each i=1,...,n, let y^i = y^i(M) denote the point where the i-th edge of the cone M intersects the boundary ∂C of C. (Note that y^i is uniquely determined by means of a univariate convex minimization (line search), since C is convex and bounded and 0 ∈ int C.)
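Each point y^i can be found by bisection along the corresponding edge direction, since g is convex, g(0) < 0, and g eventually becomes positive along every ray (C is bounded). A minimal sketch, with an illustrative g:

import numpy as np

def edge_boundary_point(g, d, t_hi=1e6, tol=1e-10):
    """Return y = t d on the ray {t d: t >= 0} with g(y) = 0.

    Assumes g convex, g(0) < 0, and g(t_hi * d) > 0, so the zero
    crossing on the ray is unique."""
    t_lo = 0.0
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        if g(t_mid * d) < 0.0:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return t_hi * d

g = lambda x: np.sum(x**2) - 2.4       # illustrative reverse convex g
y1 = edge_boundary_point(g, np.array([1.0, 0.0]))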
Suppose that an upper bound α ≥ min f(D) is known (usually α is the objective function value at the current best feasible point). Since cw < min {cx: x ∈ D}, and since w was translated to the origin, we may assume that α ≥ 0. Then the cone M can be deleted from further consideration whenever we have

cy^i ≥ α   ∀i ∈ {1,...,n} .   (58)

To see this, consider the hyperplane H = {x ∈ ℝ^n: x = Σ_{i=1}^n λ_i y^i, Σ_{i=1}^n λ_i = 1}, which is uniquely determined by the linearly independent points y^i (i=1,...,n). Let H^+ denote the closed halfspace generated by H that does not contain the origin. Then we have M ∩ D ⊂ H^+ ∩ M because of the convexity of C. Since

H^+ ∩ M = {x ∈ ℝ^n: x = Σ_{i=1}^n λ_i y^i, Σ_{i=1}^n λ_i ≥ 1, λ_i ≥ 0 (i=1,...,n)} ,

we see that (58) with α ≥ 0 implies that

cx = Σ_{i=1}^n λ_i cy^i ≥ Σ_{i=1}^n λ_i α ≥ α   ∀x ∈ H^+ ∩ M ,

and hence cx ≥ α ∀x ∈ M ∩ D, i.e., the upper bound α cannot be improved in M.

Now let Y denote the (n×n)-matrix with columns y^i (i=1,...,n), and consider the linear programming problem

(LP) = (LP(M,Q)):   max {Σ_{i=1}^n λ_i: AYλ ≤ b , λ ≥ 0} ,

where λ = (λ_1,...,λ_n)^T ∈ ℝ^n. Define μ(λ) = Σ_{i=1}^n λ_i.

We recall the geometric meaning of (LP): consider the hyperplane H = {x ∈ ℝ^n: x = Yλ, μ(λ) = 1} defined above. Changing the value of μ(λ) results in a translation of H into a hyperplane which is parallel to H. The constraints in (LP) describe the set M ∩ Q. Let λ*(M) and μ* = μ*(M) = μ(λ*) denote an optimal solution and the optimal objective function value of (LP), respectively. Then H* = {x ∈ ℝ^n: x = Yλ, μ(λ) = μ*} describes a hyperplane parallel to H that supports Q ∩ M at

x* = x*(M) = Yλ* .   (59)

Let z^i = μ*y^i denote the point where the i-th edge of M intersects H* (i=1,...,n).
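A minimal scipy sketch of this bounding construction (the arrays Y, A, b and the cost vector c are placeholders for problem data):

import numpy as np
from scipy.optimize import linprog

def bound_cone(Y, A, b, c):
    """Solve LP(M,Q): max sum(lambda) s.t. A Y lambda <= b, lambda >= 0.

    Y has the edge points y^i of M as columns; Q = {x: A x <= b}.
    Returns (mu*, beta(M,Q)); beta is None when mu* < 1, in which case
    the cone M is deleted (Lemma X.11 (i))."""
    n = Y.shape[1]
    res = linprog(-np.ones(n), A_ub=A @ Y, b_ub=b,
                  bounds=[(0, None)] * n, method="highs")
    if not res.success:                # e.g. M intersected with Q is empty
        return 0.0, None
    mu = -res.fun                      # mu*(M)
    if mu < 1.0:
        return mu, None
    Z = mu * Y                         # columns z^i = mu* y^i
    return mu, min(float(np.min(c @ Y)), float(np.min(c @ Z)))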

Lemma X.11. Let λ* = λ*(M), μ* = μ*(M) and y^i, z^i (i=1,...,n) be defined as above.

(i) If μ* = μ*(M) < 1, then M ∩ D ∩ {x: cx ≤ α} = ∅.

(ii) If μ* = μ*(M) ≥ 1, then

β(M,Q) := min {cy^i, cz^i: i=1,...,n}

is a lower bound for f(x) = cx over M ∩ D ∩ {x: cx ≤ α}.

Proof. Let H^− and H*^− denote the closed halfspaces containing 0 generated by the hyperplanes H and H* defined above, and let H̊^− be the open halfspace int H^−. Since H is the hyperplane passing through the points y^i ∈ ∂C (i=1,...,n), C is convex and 0 ∈ int C, it follows that the simplex H^− ∩ M = conv {0, y^1,...,y^n} is contained in C ∪ ∂C; hence

H̊^− ∩ M ∩ D = ∅   (because D ∩ C = ∅).

But from the definition of H* it follows that

M ∩ D ∩ {x: cx ≤ α} ⊂ M ∩ Q ⊂ H*^− ∩ M ,

and hence

M ∩ D ∩ {x: cx ≤ α} ⊂ (H*^− \ H̊^−) ∩ M .   (60)

Therefore, since μ* < 1 implies that H*^− ⊂ H̊^−, we see that assertion (i) holds.

Now consider the case μ* ≥ 1. It is easily seen that (H*^− \ H̊^−) ∩ M is a polytope with vertex set {y^i, z^i (i=1,...,n)}. It follows from the linearity of the objective function cx and from (60) that assertion (ii) holds. ■

Remark X.10. When μ*(M) < 1 occurs, then the cone M is deleted.

Algorithm X.5.

Initialization:

Let ℳ_1 = {M(F_i): i=1,...,n+1} be the initial conical partition as defined above. Determine the intersection points y^i (i=1,...,n+1) of the rays emanating from 0 and passing through the vertices of the simplex S with the boundary ∂C of C.

Determine

α_1 = min {cy^i: y^i ∈ D, i ∈ {1,...,n+1}}

(initial upper bound; α_1 = +∞ if y^i ∉ D ∀i).

If α_1 ≠ +∞, then choose x^1 ∈ {y^i ∈ D: i ∈ {1,...,n+1}} satisfying cx^1 = α_1.

For each cone M_i = M(F_i) (i=1,...,n+1) solve the linear programming problem LP(M_i,Q_1), where Q_1 = P_1 ∩ {x: cx ≤ α_1} and P_1 = T.

Delete each M_i ∈ ℳ_1 for which μ*(M_i) < 1 (cf. Lemma X.11).

If {M_i ∈ ℳ_1: μ*(M_i) ≥ 1} = ∅, then stop: the feasible set D of problem (CDC) is empty.

Otherwise, for each cone M_i ∈ ℳ_1 satisfying μ*(M_i) ≥ 1, compute the lower bound

β(M_i) = β(M_i,Q_1)

(cf. Lemma X.11).

Set β_1 = min {β(M_i): M_i ∈ ℳ_1, μ*(M_i) ≥ 1}, and let x̄^1 be a point where β_1 is attained, i.e., cx̄^1 = β_1.

Iteration k = 1,2,...:

At the beginning of Step k we have a polytope P_k ⊃ D, an upper bound α_k (possibly +∞) and, if a feasible point has been found, a point x^k ∈ D satisfying α_k = f(x^k). Furthermore, we have a set ℳ_k of cones generated from the initial partition ℳ_1 by deletion operations and subdivisions according to the rules stated below. Finally, for each cone M ∈ ℳ_k, a lower bound β(M) ≤ min {cx: x ∈ D ∩ M} is known, and we have the bound β_k ≤ min {cx: x ∈ D} and a not necessarily feasible point x̄^k associated with β_k such that cx̄^k = β_k.
k.1. Delete all M ∈ ℳ_k satisfying

β(M) ≥ α_k .

Let ℛ_k be the collection of remaining cones in ℳ_k. If ℛ_k = ∅, then stop: x^k is an optimal solution of problem (CDC) with optimal value β_k = α_k.

k.2. Select a cone M_k ∈ ℛ_k satisfying

β(M_k) = min {β(M): M ∈ ℛ_k} .

Let x*^k = Y(M_k)λ*(M_k) be the point defined in (59) corresponding to the cone M_k.

If x*^k ∈ D, then set P_{k+1} = P_k and go to k.4.

k.3. Determine the point w^k where the line segment [0,x*^k] intersects the boundary ∂Ω of Ω. Compute a subgradient t^k of h at w^k, and define the affine function

ℓ_k(x) = h(w^k) + t^k(x − w^k) .

Set

P_{k+1} = P_k ∩ {x: ℓ_k(x) ≤ 0} .

k.4. Subdivide M_k into a finite number of cones M_{k,j} (j ∈ J_k) by an exhaustive subdivision process and compute the point y*^k where the new (common) edge of the cones M_{k,j} (j ∈ J_k) intersects ∂C.

Set

α_{k+1} = min {α_k, cy*^k} if y*^k ∈ D, and α_{k+1} = α_k otherwise.

Let ℳ'_k = {M_{k,j} ⊂ M_k: j ∈ J_k} and set 𝒩_k = (ℛ_k \ {M_k}) ∪ ℳ'_k.
For each cone M ∈ 𝒩_k let y^i(M) denote the intersection point of its i-th edge with ∂C (i=1,...,n).
Delete M ∈ 𝒩_k if cy^i(M) ≥ α_{k+1} ∀i=1,...,n (cf. (58)).
Let 𝒩'_k denote the set of remaining cones.

k.5. Set

Q_{k+1} = P_{k+1} ∩ {x: cx ≤ α_{k+1}} ,

and, for each newly generated M ∈ 𝒩'_k, solve the linear programming problem LP(M,Q_{k+1}), obtaining the optimal values μ*(M).

Delete all M ∈ 𝒩'_k satisfying μ*(M) < 1. Let ℳ_{k+1} denote the collection of cones in 𝒩'_k that are not deleted. If ℳ_{k+1} = ∅, then stop: if α_{k+1} = +∞, then the feasible set D is empty; otherwise, x^k is an optimal solution.

k.6. For all M ∈ 𝒩'_k determine the lower bound

β(M) = β(M,Q_{k+1})

(cf. Lemma X.11).

k.7. Set

β_{k+1} := min {β(M): M ∈ ℳ_{k+1}} ,

and let x̄^{k+1} be a point where β_{k+1} is attained, i.e., cx̄^{k+1} = β_{k+1}.

From α_{k+1} and the new feasible points obtained in this iteration determine a current best feasible point x^{k+1} and the corresponding upper bound

α_{k+1} = cx^{k+1} .

Set k ← k+1 and go to the next iteration.

Convergence of the algorithm will be established by means of the general convergence theory developed in Chapter IV. According to Corollary IV.3 and Corollary IV.5, convergence in the sense that lim_{k→∞} β_k = min {cx: x ∈ D} and every accumulation point of the sequence {x̄^k} is an optimal solution of (CDC), is guaranteed if we show that any infinite decreasing sequence {M_q} of successively refined partition cones M_q satisfies

M̄ ∩ D ≠ ∅ ,  β(M_{q'}) → min {cx: x ∈ M̄ ∩ D}  (q' → ∞) ,

where M̄ = ∩_q M_q and {M_{q'}} is an infinite subsequence of {M_q}.

Lemma X.12. Assume that the algorithm is infinite. Then we have x* ∈ D ∩ ∂Ω for every accumulation point x* of the sequence {x*^k}, where x*^k = Y(M_k)λ*(M_k) (cf. Step k.2).

Proof. From the convergence theory that is known for the outer approximation method defined in Step k.3 and the linear programs LP(M,Q_k), it follows that x* ∈ ∂Ω (cf. Chapter II). In order to prove x* ∈ D, note that, using a standard argument on finiteness of the number of partition cones in each step, one can conclude that there exists a decreasing sequence {M_q} of successively refined partition cones such that for the sequence x*^q → x* we have [0,x*^q] ⊂ M_q and [0,x*] ⊂ M̄ := ∩_q M_q. Since the algorithm uses an exhaustive subdivision process, the limit M̄ must be a ray emanating from 0 which intersects ∂C at a point y*.

For M = M_q, let y^{q,i}, z^{q,i} denote the quantities y^i, z^i introduced above, and let μ*_q = μ*(M_q).

Suppose that x* ∉ D. Then, since x* ∈ ∂Ω, we must have g(x*) < 0, i.e., x* is an interior point of the open convex set C, and it is also a relative interior point of the line segment [0,y*]. Hence, there exist a point w ∈ [x*,y*] and a closed ball B around w such that x* ∉ B and B ⊂ C. For sufficiently large q, we then must have x*^q ∉ B, while the edges of M_q intersect B at points in the relative interior of [0,y^{q,i}]. It follows that x*^q is contained in the open halfspace generated by the hyperplane H_q through y^{q,i} (i=1,...,n) and containing the origin. But this implies that we have μ*_q < 1, and this would have led to the deletion of M_q, a contradiction. ■

Lemma X.13. Let {M_q} be an infinite decreasing sequence of successively refined partition cones generated by the algorithm. Let x* be an accumulation point of the corresponding sequence {x*^q}, and denote by y* the intersection of the ray M̄ := lim_q M_q with ∂C. Then we have

M̄ ∩ D = [y*,x*] .

Proof. Clearly, {x*^q} has accumulation points, since Ω and the initial polytope T are bounded sets, and x* ∈ M̄. In the proof of Lemma X.12 we showed that x* ∈ ∂Ω ∩ D. In particular, we have g(x*) ≥ 0; hence y* ∈ [0,x*]. This implies that y* ∈ Ω, since Ω is a convex set and 0 ∈ Ω. Therefore, y* ∈ ∂C ∩ Ω ⊂ D. Again from the convexity of Ω and C it follows that the subset [y*,x*] of M̄ is contained in D, i.e., [y*,x*] ⊂ M̄ ∩ D.

In a similar way, from the convexity of the sets involved it is easily seen that x ∈ M̄ \ [y*,x*] implies that x ∉ D. ■



Lemma X.14. Let {M_q} be an infinite decreasing sequence of successively refined partition cones generated by the algorithm, and let M̄ = lim_{q→∞} M_q. Then there exists a subsequence {M_{q'}} of {M_q} such that

β(M_{q'}) → min {cx: x ∈ M̄ ∩ D}  (q' → ∞) .

Proof. Consider the quantities

y^{q,i} , z^{q,i} = μ*_q y^{q,i} (i=1,...,n) , μ*_q = μ*(M_q)

corresponding to M_q that we introduced in Lemma X.11.

Since M_q converges to the ray M̄ and y^{q,i} ∈ ∂C (i=1,...,n), where ∂C is compact, we see that

y^{q,i} → y* ∈ M̄ ∩ ∂C  (i=1,...,n).

In the proof of Lemma X.13 we saw that every accumulation point x* of {x*^q} satisfies x* ∈ M̄ ∩ ∂Ω and [y*,x*] = M̄ ∩ D.

Consider a subsequence {q'} such that x*^{q'} → x*. We show (again passing to a subsequence if necessary; we also denote this subsequence by {q'}) that z^{q',i} → x* (i=1,...,n).

Recall that z^{q,i} = μ*_q y^{q,i}. Let ȳ^q be the point where the line segment [0,x*^q] intersects the hyperplane H_q through y^{q,i} (i=1,...,n). Since ȳ^q ∈ conv {y^{q,i}: i=1,...,n}, we see from y^{q,i} → y* that ȳ^q → y*.

But the relations x*^q = μ*_q ȳ^q , x*^{q'} → x* , ȳ^{q'} → y* and the boundedness of {μ*_q} imply that (passing to a subsequence if necessary) we have μ*_{q'} → μ*. It follows that

z^{q',i} → μ*y* = x* ∈ M̄ ∩ ∂Ω

(cf. Lemma X.12).

From the continuity of the function cx we see that

β(M_{q'},Q_{q'}) = min {cy^{q',i}, cz^{q',i}: i=1,...,n} → min {cy*, cx*} .

But min {cy*,cx*} = min {cx: x ∈ [y*,x*]}, and from Lemma X.13 it follows that

min {cy*, cx*} = min {cx: x ∈ M̄ ∩ D} .

Finally, since β(M_{q'},Q_{q'}) ≤ β(M_{q'}) ≤ min {cx: x ∈ M_{q'} ∩ D}, we must also have

β(M_{q'}) → min {cx: x ∈ M̄ ∩ D} . ■



Proposition X.9. If Algorithm X.5 is infinite and D ≠ ∅, then we have

(i) β := lim_{k→∞} β_k = min {cx: x ∈ D} ;

(ii) every accumulation point of the sequence {x̄^k} is an optimal solution of problem (CDC).

Proof. Since the selection of the cone to be subdivided in each iteration is bound improving, and the preceding lemmas have shown that the lower bounding is strongly consistent (cf. Definition IV.7), the assertions follow from Corollary IV.3 and Corollary IV.5. ■

Extension

Note that Algorithm X.5 can also be applied if the objective function f(x) is a concave function. The only modification that is necessary to cover this case is to omit the deletion rule (58) and the sets {x: cx ≤ α_k} whenever they occur. Since for concave f the corresponding set {x: f(x) ≤ α_k} cannot be handled by linear programming techniques, we replace Q = P ∩ {x: cx ≤ α} by Q = P, M ∩ D ∩ {x: cx ≤ α} in Lemma X.11 by M ∩ D, Q_{k+1} = P_{k+1} ∩ {x: cx ≤ α_{k+1}} in Step k.5 by Q_{k+1} = P_{k+1}, etc. Then the concavity of f implies that all of the above lemmas and Proposition X.9 remain true when cx is replaced by f(x).

The following slight modification has improved the performance of the algorithm in all of the examples which we calculated. Instead of taking the point w defined in assumption (c) as both the vertex of all the cones generated by the procedure and the endpoint of the ray that determines the boundary point where the hyperplane constructed in Step k.3 supports the set Ω, we used two points w^0 and w̄^0. The point w^0 has to satisfy the conditions

cw^0 ≤ min {cx: h(x) ≤ 0} ,  g(w^0) < 0 ,

and serves as a vertex of the cones generated by the partitioning and subdivision process. The point w̄^0 is an interior point of Ω satisfying h(w̄^0) < 0, and it is used as the endpoint of the rays that define the points where the hyperplanes constructed in Step k.3 support Ω. Note that w^0 can be found by solving the convex program

minimize cx   s.t. h(x) ≤ 0 .

After a coordinate transformation w^0 → 0, the algorithm can be run with the above modification and with {x: cx ≤ α_k} replaced by {x: cw^0 ≤ cx ≤ α_k}.



Example X.2.

Objective function: f(x_1,x_2) = 0.5x_1 + 1.2x_2 .
Constraints: h(x_1,x_2) = max {(x_1−1)² + (x_2−1)² − 0.6, 1/(0.7x_1 + x_2) − 5},
g(x_1,x_2) = x_1² + x_2² − 2.4, T = {0 ≤ x_1 ≤ 5, 0 ≤ x_2 ≤ 5}.

The points w^0, w̄^0 were chosen as w^0 = (1.097729, 0.231592) with f(w^0) = 0.826776 and w̄^0 = (1.0, 1.0).

The initialization step led to x^1 = (1.097729, 1.093156), α_1 = f(x^1) = 1.860652, β_1 = 0.826776.

In the first iteration the point x*^1 = (0.000000, 1.461790) and the first supporting hyperplane ℓ_1(x_1,x_2) = 1.406x_1 + 0.650x_2 − 0.443 were determined.

After 16 iterations the algorithm stopped at an approximate optimal solution x* = (1.510419, 0.417356) with f(x*) = 1.25603 satisfying min f(D) ≤ f(x*) ≤ min f(D) + 10^{-6}. The point x* was found at iteration 13.
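The reported solution is easy to check numerically; the following snippet merely evaluates the data of Example X.2 at x*:

import numpy as np

f = lambda x: 0.5 * x[0] + 1.2 * x[1]
h = lambda x: max((x[0] - 1)**2 + (x[1] - 1)**2 - 0.6,
                  1.0 / (0.7 * x[0] + x[1]) - 5.0)
g = lambda x: x[0]**2 + x[1]**2 - 2.4

x_star = np.array([1.510419, 0.417356])
print(f(x_star))   # 1.256037, matching f(x*) = 1.25603
print(h(x_star))   # about 1e-6: x* lies (numerically) on the boundary of Omega
print(g(x_star))   # about 0.0556 >= 0: the reverse convex constraint holds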

4. SOME SPECIAL D.C. PROBLEMS AND APPLICATIONS

Some special d.c. problems have already been treated in Chapter IX. In this section, we discuss design centering problems and biconvex programming.

4.1. The Design Centering Problem

Recall from Example I.5 that a design centering problem is defined as follows. Let K ⊂ ℝ^n be a compact, convex set containing the origin in its interior. Furthermore, let M ⊂ ℝ^n be a nonempty, compact set. Then the problem of finding x ∈ M, r ∈ ℝ_+ satisfying

max_{x,r} {r: x + rK ⊂ M}   (61)

is called the design centering problem.

Problems of the form (61) often arise in optimal engineering design (e.g., Polak and Vincentelli (1979), Vidigal and Director (1982), Polak (1982), Nguyen et al. (1985 and 1992), Thoai (1987)). For example, consider a fabrication process where the quality of a manufactured item is characterized by an n-dimensional parameter. An item is accepted if this parameter is contained in some region of acceptability M. Let x be the nominal value of this parameter, and let y be its actual value. Assume that for fixed x the probability

P(‖y − x‖ ≤ r) = p(x,r)

that the deviation is no greater than r is monotonically increasing in r. Then for a given nominal value of x the production yield can be measured by the maximal value of r = r(x) satisfying

{y: ‖y − x‖ ≤ r} ⊂ M .

In order to maximize the production yield, one should choose the nominal value x̄ so that

r(x̄) = max {r(x): x ∈ M} .   (62)

Setting K = {z: ‖z‖ ≤ 1} and y = x + rz, z ∈ K, we see that this is a design centering problem.

Another interesting application has been described in Nguyen et al. (1985). In the diamond industry, an important problem is to cut the largest diamond of a prescribed form inside a rough stone M. This form can often be described by a convex body K. Assume that the orientation of K is fixed, i.e., only translation and dilatation of K is allowed. Then, obviously, we are led to problem (61).

Note that, in (61), we may assume that int M ≠ ∅, since otherwise (61) has the solution r = 0.

In many cases of interest, the set M is the intersection of a number of convex and complementary convex sets, i.e.,

M = C ∩ D_1 ∩ ... ∩ D_m ,   (63)

where C is a closed convex set satisfying int C ≠ ∅, and D_i = ℝ^n \ C_i is the complement of an open convex subset C_i of ℝ^n (i=1,...,m). We show that in this case the design centering problem is a d.c. programming problem (cf. Thach (1988)).

Let

r_M(x) := sup {r: x + rK ⊂ M}  if x ∈ M ,  and  r_M(x) := 0  if x ∉ M .   (64)

Note that max {r: x + rK ⊂ M} exists if M is compact. The expression (64), however, is defined for arbitrary M ⊂ ℝ^n, and we will also consider unbounded closed sets M.

Obviously, if M = ∩_{j∈J} M_j, where J ⊂ ℕ, then

r_M(x) = inf {r_{M_j}(x): j ∈ J} .   (65)

Lemma X.15. Let H be a closed halfspace. Then r_H(x) is affine on H.

Proof. Consider

H = {y: cy ≤ α} , c ∈ ℝ^n, c ≠ 0, α ∈ ℝ .

The programming problem

max {cz: z ∈ K}

has a solution z̄ satisfying v̄ = cz̄ > 0, because K is compact, 0 ∈ int K, and c ≠ 0.

Let x ∈ H, and define

ρ(x) = (α − cx)/v̄ .

Obviously, ρ(x) ≥ 0, and x + ρ(x)K ⊂ H if ρ(x) = 0.

Let ρ(x) > 0 and z ∈ K. Then

c(x + ρ(x)z) = cx + ρ(x)cz ≤ cx + ρ(x)cz̄ = α ;

hence x + ρ(x)K ⊂ H. But c(x + rz̄) > c(x + ρ(x)z̄) = α whenever r > ρ(x). Therefore, r_H(x) = ρ(x), i.e., we have

r_H(x) = (α − cx)/v̄   ∀x ∈ H .   (66)

Note that r_H(x) = max {0, (1/v̄)(α − cx)}   ∀x ∈ ℝ^n . ■

Now consider a convex polyhedral set M, i.e.,

M = {x: c^i x ≤ α_i (i=1,...,m)} ,   (67)

where c^i ∈ ℝ^n \ {0}, α_i ∈ ℝ (i=1,...,m).

Then it follows from (65) and (66) that

r_M(x) = min_{i=1,...,m} (α_i − c^i x)/v̄_i   ∀x ∈ M ,   (68)

where

v̄_i = max {c^i z: z ∈ K} > 0  (i=1,...,m) .   (69)

In this case, the design centering problem reduces to maximizing r_M(x) over M. But since x ∉ M implies that α_i − c^i x < 0 for at least one i, we may also maximize the expression (68) over ℝ^n.

Proposition X.10. Let M be a polytope of the form (67). Then the design centering problem is the linear programming problem

maximize t   (70)
s.t. α_i − c^i x ≥ t v̄_i  (i=1,...,m) ,

where v̄_i is defined by (69).

Proof. Problem (70) with the additional variable t ∈ ℝ is equivalent to max {r_M(x): x ∈ ℝ^n} with r_M(x) of the form (68). ■

Since (69) must be evaluated, we see that the design centering problem for convex polyhedral M requires us to solve m convex programming problems (which reduce to linear programs when K is a polytope) and one linear program. The special case when K is the unit ball of an ℓ_p-norm, 1 ≤ p ≤ ∞, is investigated further in Shiau (1984).
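For polyhedral data, (69) and (70) amount to m+1 linear programs. A minimal scipy sketch (the polytopes M and K used in the example are illustrative, not from the text):

import numpy as np
from scipy.optimize import linprog

def design_center(Cm, alpha, Gk, hk):
    """Design centering for M = {x: Cm x <= alpha}, K = {z: Gk z <= hk}.

    Computes v_i = max {c^i z: z in K} as in (69), then solves the
    LP (70): max t s.t. c^i x + t v_i <= alpha_i (i=1,...,m)."""
    m, n = Cm.shape
    v = np.array([-linprog(-Cm[i], A_ub=Gk, b_ub=hk,
                           bounds=[(None, None)] * n,
                           method="highs").fun for i in range(m)])
    A = np.hstack([Cm, v.reshape(-1, 1)])          # variables (x, t)
    res = linprog(np.r_[np.zeros(n), -1.0], A_ub=A, b_ub=alpha,
                  bounds=[(None, None)] * n + [(0, None)], method="highs")
    return res.x[:n], res.x[n]                     # center x, radius t

# Example: M = [0,2]^2, K = unit ball of the l1-norm (0 in int K)
Cm = np.array([[1., 0.], [-1., 0.], [0., 1.], [0., -1.]])
alpha = np.array([2., 0., 2., 0.])
Gk = np.array([[1., 1.], [1., -1.], [-1., 1.], [-1., -1.]])
x, t = design_center(Cm, alpha, Gk, np.ones(4))    # x = (1,1), t = 1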

Proposition X.11. Let M = C ∩ D_1 ∩ ... ∩ D_m, where C is a closed convex subset of ℝ^n, int C ≠ ∅, and D_i = ℝ^n \ C_i is the complement of an open convex set in ℝ^n (i=1,...,m). Assume that int M ≠ ∅. Then the design centering problem is equivalent to maximizing a d.c. function over M.

Proof. a) It is well-known that C = ∩_{H∈ℋ} H, where ℋ is the family of closed halfspaces containing C. From Lemma X.15 and (65), we see that

r_C(x) = inf {r_H(x): H ∈ ℋ} ,   (71)

where, for any H ∈ ℋ, r_H(x) is the affine function (66). Let H = {y: cy ≤ α}, v̄(c) = max {cz: z ∈ K}. Then we have

r_C(x) = inf_{H∈ℋ} (α − cx)/v̄(c) ,   (72)

where the infimum is taken over all c, α for which H ∈ ℋ.

Note that r_C(x) is finite for every x. To see this, let x^0 ∈ C. Then for all H = {y: cy ≤ α} ∈ ℋ, we have cx^0 ≤ α, and hence

(α − cx)/v̄(c) ≥ c(x^0 − x)/v̄(c) ≥ −‖c‖ ‖x^0 − x‖ / v̄(c) .   (73)

Let

r_0 = min {‖y‖: y ∈ ∂K} > 0 ,   (74)

where ∂K denotes the boundary of K. Then we see that

v̄(c) ≥ r_0 ‖c‖ .   (75)

It follows from (73) that

(α − cx)/v̄(c) ≥ −(1/r_0) ‖x^0 − x‖ .

Hence, r_C(x) is a finite concave function.

b) Now consider the set D = ℝ^n \ C̃, where C̃ is an open convex set. Then we have cl C̃ = ∩_{H∈ℋ} H, where ℋ is the family of all halfspaces H = {y: cy ≤ α} containing cl C̃.

For any H ∈ ℋ, let H' = {y: cy ≥ α} be the closed halfspace opposite H, and denote by ℋ' the family of all halfspaces H' defined in this way.

Since H' ⊂ D for every H' ∈ ℋ', it follows that r_D(x) ≥ r_{H'}(x); hence,

r_D(x) ≥ sup_{H'∈ℋ'} r_{H'}(x) .

The converse inequality obviously holds for r_D(x) = 0, since r_{H'}(x) ≥ 0. Assume that r_D(x) = δ > 0. Then we see from the definition of r_D(x) that there exists a closed ball B centered at x ∈ D with radius δ satisfying B ∩ C̃ = ∅. Therefore, there is also a hyperplane separating C̃ and B. In other words, there is an H' ∈ ℋ' such that B ⊂ H', where H' was defined above. Then, we have r_{H'}(x) ≥ r_D(x), and hence

r_D(x) ≤ sup_{H'∈ℋ'} r_{H'}(x) .   (76)

It is easy to see that r_D(x) = sup_{H'∈ℋ'} r_{H'}(x) is finite everywhere.
c) Finally, since M = ∩_{i=1}^m D_i ∩ C, by (65) we find that

r_M(x) = min {r_C(x), r_{D_1}(x),..., r_{D_m}(x)} .   (77)

This is a d.c. function as shown in Theorem I.7. ■



Remark X.11. Part a) of the above proof showed that r_C(x) is concave if C is convex, int C ≠ ∅, i.e., in this case the design centering problem is a concave maximization problem.

When C and the collection of D_i (i=1,...,m) are given by a convex inequality and m reverse convex inequalities, then we are led to a d.c. programming problem of the form discussed in the preceding section. Note, however, that the functions r_C(x), r_D(x) in the proof of Proposition X.11 can, by the same expression, be extended to the whole space ℝ^n. But if x ∉ M, then r_M(x) ≤ 0, whereas r_M(x) > 0 in int M. Therefore, (77) can be maximized over ℝ^n instead of over M.

Note that in part b) of the proof it is essential to consider all supporting halfspaces of cl C̃.

Example X.3. Let D = ℝ² \ C̃, where C̃ = {(x_1,x_2): x_1 + x_2 > 0, −x_1 + x_2 > 0}. Obviously, cl C̃ is defined by the two supporting halfspaces x_1 + x_2 ≥ 0, −x_1 + x_2 ≥ 0. But finding r_D(x) for given K, x may require that one considers additional halfspaces that support cl C̃ at (0,0). In the case of K, x as in Fig. X.2, for example, it is the halfspace x_2 ≤ 0 that determines r_D(x).
Fig. X.2. Design Centering and Supporting Halfspaces

The proof of Proposition X.11 shows that, under the assumptions there, the design centering problem can also be formulated as a semi-infinite optimization problem, i.e., a problem with an infinite number of linear constraints.

Knowledge of the d.c. structure is of little practical use if an explicit d.c. decomposition of r_M(x) is not available. Such a d.c. representation is not known unless additional assumptions are made. To see how difficult this general d.c. problem is, let

x + rK = {y ∈ ℝ^n: p(y − x) ≤ r} ,   (78)

where p(z) = inf {λ > 0: z ∈ λK} is the Minkowski functional of K (cf. Thach (1988), Thach and Tuy (1988)). Then it is readily seen that

r_M(x) = inf {p(y − x): y ∉ M} .   (79)

Suppose that p(z) is given. (Note that p(z) = ‖z‖_N if K is the unit ball with respect to a norm ‖z‖_N.) Then we have

r_{D_i}(x) = inf {p(y − x): y ∈ C_i} ,   (80)

which amounts to solving a convex programming problem, whereas

r_C(x) = inf {p(y − x): y ∈ ℝ^n \ C}   (81)

requires minimizing a convex function over the complement of a convex set.


Since r_M(x) is d.c. everywhere, it is also Lipschitzian on compact subsets of ℝ^n. A Lipschitz constant can be found in the following way.

Proposition X.12. Let M be given as in Proposition X.11. Assume that M is bounded and int M ≠ ∅. Then r_M(x) is Lipschitzian with Lipschitz constant

L = 1/r_0 ,

where r_0 = min {‖z‖: z ∈ ∂K}.

Proof. Let p(z) denote the Minkowski functional of K such that K = {z: p(z) ≤ 1}. Then it is well-known that for y ∈ ℝ^n we have p(y) = ‖y‖/‖ȳ‖, where ‖·‖ denotes the Euclidean norm, and where ȳ is the intersection point of the ray {ρy: ρ ≥ 0} with the boundary ∂K. Therefore, for arbitrary x^1, x^2 ∈ ℝ^n, it follows that

p(x^1 − x^2) ≤ (1/r_0) ‖x^1 − x^2‖ .

Using the triangle inequality

p(y − x^2) ≤ p(y − x^1) + p(x^1 − x^2)

and

r_M(x) = inf {p(y − x): y ∉ M}

(cf. (79)), we see that

r_M(x^2) ≤ r_M(x^1) + (1/r_0) ‖x^1 − x^2‖ .

Interchanging x^1 and x^2, we conclude that

|r_M(x^1) − r_M(x^2)| ≤ (1/r_0) ‖x^1 − x^2‖ . ■


Two algorithmic approaches to solve fairly general design centering problems have been proposed by Thach (1988) and Thach and Tuy (1988). Thach considers the case when M is defined as in Proposition X.12 and where p(z) = (z(Az))^{1/2} with a symmetric positive definite (n×n) matrix A. He reduces the design centering problem to concave minimization and presents a cutting plane method for solving it. However, there is a complicated, implicitly given function involved that requires that in each step of the algorithm, in addition to the calculation of new vertices generated by a cut, one minimizes a convex function over the complement of a convex set and solves several convex minimization problems. However, when C and K are polytopes, only linear programs have to be solved in this approach (cf. Thach (1988)).

Thach and Tuy (1988) treat the design centering problem in an even more general setting and develop an approach via so-called relief indicators which will be discussed in Section XI.4.

The first numerical tests (Boy (1988)) indicate, however, that - as expected from the complicated nature of the general problems considered - the practical impact of both methods is limited to very small problems.

4.2. The Diamond Cutting Problem

In the diamond cutting problem mentioned at the beginning of this section, the design centering problem can be assumed to have a polyhedral structure, i.e., K is a polytope and M = C ∩ D_1 ∩ ... ∩ D_m, where C is a polytope and each D_i is the complement of an open convex polyhedral set C_i (i=1,...,m). Let

K = {z: a^i z + α_i ≤ 0, i ∈ I} ,   (82)

C = {y: b^j y + β_j ≤ 0, j ∈ J} ,   (83)

and

C_i = {y: c^{i,k} y + γ_{i,k} > 0, k ∈ K_i}  (i=1,...,m) ,

where I, J, K_i are finite index sets; a^i, b^j, c^{i,k} ∈ ℝ^n; and α_i, β_j, γ_{i,k} ∈ ℝ.

Then we have

D_i = {y: min_{k∈K_i} (c^{i,k} y + γ_{i,k}) ≤ 0}  (i=1,...,m) ,   (84)

and, according to Proposition X.11,

r_M(x) = min {r_C(x), r_{D_1}(x),..., r_{D_m}(x)} .   (85)

It follows from Lemma X.15 (see also (68)) that

r_C(x) = min_{j∈J} [−(β_j + b^j x)/v̄_j]  if x ∈ C ,  and  r_C(x) = 0  if x ∉ C ,   (86)

where v̄_j is the optimal value of the linear programming problem

max {b^j z: z ∈ K}  (j ∈ J) .

Furthermore, evaluating r_{D_i}(x) (i=1,...,m) by means of (80) also only requires solving linear programs. Indeed, since K contains 0 in its interior, we can assume α_i = −1 for all i ∈ I. Then it is easily seen that for any x ∈ D_k we have

r_{D_k}(x) = min {t: a^i(y − x) ≤ t (i ∈ I), y ∈ cl C_k} ,

where cl C_k denotes the closure of C_k. Therefore, the computation of r_M(x) according to (85) reduces to solving m+1 linear programs. Alternatively, under additional assumptions on the available a priori information, it is possible to compute the quantities r_{D_i}(x) without solving linear programs.
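A minimal scipy sketch of this evaluation, with K normalized so that α_i = −1, i.e. K = {z: a^i z ≤ 1} (the data layout is an assumption for illustration):

import numpy as np
from scipy.optimize import linprog

def vbar(B, A):
    """v_j = max {b^j z: z in K} for K = {z: A z <= 1} (rows a^i)."""
    n = A.shape[1]
    return np.array([-linprog(-bj, A_ub=A, b_ub=np.ones(A.shape[0]),
                              bounds=[(None, None)] * n,
                              method="highs").fun for bj in B])

def r_C(x, B, beta, v):
    """(86) for C = {y: b^j y + beta_j <= 0}; rows of B are the b^j."""
    if np.any(B @ x + beta > 0):
        return 0.0                                 # x outside C
    return float(np.min(-(beta + B @ x) / v))

def r_Dk(x, A, Ck, gamma_k):
    """r_{D_k}(x) = min {t: a^i (y - x) <= t (i in I), y in cl C_k},
    where cl C_k = {y: Ck y + gamma_k >= 0}; LP in the variables (y, t)."""
    m, n = A.shape
    A_ub = np.vstack([np.hstack([A, -np.ones((m, 1))]),
                      np.hstack([-Ck, np.zeros((Ck.shape[0], 1))])])
    b_ub = np.concatenate([A @ x, gamma_k])
    res = linprog(np.r_[np.zeros(n), 1.0], A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (n + 1), method="highs")
    return res.fun

# r_M(x) = min(r_C(x), r_D1(x), ..., r_Dm(x)), cf. (85)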

Consider the case n = 3, and replace D_i, C_i by D, C, respectively, i.e., we have D = ℝ³ \ C, where C is an open polyhedral convex set.

Assume that all of the extreme points, edges and facets of cl C and of K are known.

Recall from part b) of the proof of Proposition X.11 that

r_D(x) = sup_{H'∈ℋ'} r_{H'}(x) ,   (87)

where ℋ' is the set of closed halfspaces H' = {y: cy ≥ α} determined by the supporting planes of cl C which satisfy cl C ⊂ {y: cy ≤ α}. Observe that (87) is not useful for computing r_D(x), because ℋ' may contain an infinite number of elements. Following Nguyen and Strodiot (1988, 1992), we show that, by the above assumptions, one can find a finite subcollection 𝒥' of ℋ' such that for all x ∈ ℝ³

r_D(x) = max_{H'∈𝒥'} r_{H'}(x) .   (88)

Then computing r_D(x) amounts to using a formula similar to (86) (cf. also (66)) a finite number of times.

Proposition X.16. If 𝒥' is a subcollection of ℋ' such that

∀x ∈ D, ∃ H'_0 ∈ 𝒥' such that x + r_D(x)K ⊂ H'_0 ,   (89)

then for all x ∈ ℝ³ we have

r_D(x) = max_{H'∈𝒥'} r_{H'}(x) .

Proof. From the inclusion 𝒥' ⊂ ℋ' and (87) it follows that for all x ∈ ℝ³ one has

0 ≤ max_{H'∈𝒥'} r_{H'}(x) ≤ sup_{H'∈ℋ'} r_{H'}(x) = r_D(x) .   (90)

But if x ∈ D, then by (89) we have x + r_D(x)K ⊂ H'_0 for some H'_0 ∈ 𝒥'; hence, by the definition of r_{H'},

r_D(x) ≤ r_{H'_0}(x) ≤ max_{H'∈𝒥'} r_{H'}(x) .

Therefore, from (90) we see that r_D(x) = max_{H'∈𝒥'} r_{H'}(x) whenever x ∈ D.

On the other hand, when x ∉ D, we have r_D(x) = 0 by definition, and thus, again using (87), we obtain the required equality. ■

Since cl C is polyhedral in ℝ³, there are three types of supporting planes for cl C: the ones containing a facet of cl C, the ones containing only a single extreme point of cl C, and the ones containing only one edge of cl C. Corresponding to this classification, 𝒥' will be constructed as the union of three finite subsets 𝒥_1', 𝒥_2' and 𝒥_3' of ℋ'.

𝒥_1' will be generated by all of the supporting planes corresponding to the facets of cl C. Since cl C has only a finite number of facets, the collection 𝒥_1' is finite. Let f be a facet of cl C, and let p_1, p_2, p_3 be three affinely independent points which characterize it. Suppose that they are numbered counterclockwise when cl C is viewed from outside. Then the halfspace H' ∈ 𝒥_1' which corresponds to the facet f
has the representation

H' = {y: cy ≥ α} ,

where c and α are defined by

c = (p_2 − p_1) × (p_3 − p_1)  and  α = cp_1 .

Here "×" denotes the usual cross product in ℝ³.

The set 𝒥_2' will be defined as follows. To each vertex p of cl C and to each facet v of K we associate (if it exists) the plane

cy = α

which is parallel to the facet v and passes through the vertex p but not through any other vertex of cl C. Moreover, it is required that

cl C ⊂ {y: cy ≤ α}  and  K ⊂ {y: cy ≥ cq} ,   (91)

where q is any point of the facet v. Condition (91) ensures that {y: cy ≥ α} ∈ ℋ' and that K is contained in the parallel halfspace {y: cy ≥ cq}.

Computationally, 𝒥_2' can be obtained in the following way. Consider each couple (p,v), where p is a vertex of cl C and v is a facet of K. Let q_1, q_2, q_3 be three affinely independent points which determine the facet v. Suppose that they are numbered counterclockwise when K is viewed from outside. Then

cy = α

with c = (q_3 − q_1) × (q_2 − q_1) and α = cp represents a plane containing p which is parallel to v and satisfies K ⊂ {y ∈ ℝ³: cy ≥ cq} ∀q ∈ v. The halfspace {y: cy ≥ α} will be put into 𝒥_2' if cl C ⊂ {y: cy ≤ α}. The latter condition can be verified by considering the edges of cl C emanating from p.

Let each edge of cl C emanating from p be represented by a point p̃ ≠ p on it. Since α = cp, we have cl C ⊂ {y: cy ≤ α} if cp̃ ≤ α for all of these points (edges) p̃. The collection 𝒥_2' is finite because there exist only a finite number of vertices p and facets v.

Finally, 𝒥_3' is defined in the following way. To each edge e of cl C and to each edge w of K which is not parallel to e, we associate (if it exists) the plane

cy = α

that contains the edge e and is parallel to the edge w. Then as in (91) we have

cl C ⊂ {y: cy ≤ α}  and  K ⊂ {y: cy ≥ cq} ,   (92)

where q is now an arbitrary extreme point of w. Condition (92) means that {y: cy ≥ α} ∈ ℋ', and that K is contained in the parallel halfspace {y: cy ≥ cq}.

Computationally, for each pair (e,w), where e is an edge of cl C and w is an edge of K, one can proceed as follows. Let p_1, p_2 be two points defining e, and let q_1, q_2 be two points defining w. Set c_{ew} = (p_1 − p_2) × (q_1 − q_2). If c_{ew} = 0, then e and w are parallel to each other, and we must consider another pair of edges. Otherwise, we set c = ± c_{ew}, where the plus or minus sign is determined in such a way that cq_1 < 0, and we set α = cp_1. Note that cq_1 < 0 is always possible, because 0 ∈ int K. Then the plane cy = α contains the edge e and is parallel to the edge w. Moreover, we have 0 ∈ {y: cy ≥ cq} for any extreme point q of w. The halfspace {y: cy ≥ α} will be put into 𝒥_3' if (92) is fulfilled, i.e., if cp̃ ≤ α for each point p̃ which defines an edge of cl C emanating from p_1 or p_2, and if cr ≥ cq_1 for each extreme point r of K.

Since there exist only a finite number of edges e and only a finite number of edges w, the collection 𝒥_3' is finite.
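As an illustration of these constructions, a compact Python sketch of the 𝒥_2' test (the data layout is an assumption; testing cl C ⊂ {y: cy ≤ α} against all vertices of cl C is equivalent to the edge test described above, since cl C is the convex hull of its vertices):

import numpy as np

def J2_prime(C_vertices, K_facets, eps=1e-12):
    """Candidate halfspaces {y: c y >= alpha} for J2'.

    C_vertices: vertices of cl C (3-vectors).
    K_facets: triples (q1, q2, q3) of affinely independent points on a
    facet of K, numbered counterclockwise seen from outside K.
    (Planes passing through several vertices of cl C are not filtered
    out in this sketch.)"""
    out = []
    for p in C_vertices:
        for (q1, q2, q3) in K_facets:
            c = np.cross(q3 - q1, q2 - q1)   # normal, parallel to facet v
            alpha = float(c @ p)             # plane through p parallel to v
            if all(c @ v <= alpha + eps for v in C_vertices):
                out.append((c, alpha))       # cl C in {y: c y <= alpha}
    return out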

Finally, we set

𝒥' = 𝒥_1' ∪ 𝒥_2' ∪ 𝒥_3' .   (93)

In order to prove that the collection 𝒥' satisfies condition (89) of Proposition X.16, we establish the following lemma (cf. Nguyen and Strodiot (1988)).

Lemma X.16. Let x ∈ D, and consider the halfspaces H' = {y: cy ≥ α} ∈ ℋ' whose generating plane P = {y: cy = α} separates cl C and x + r_D(x)K. Denote by 𝒫_x the set of these planes. Then the following assertions hold:

(i) cl C ∩ (x + r_D(x)K) ≠ ∅ ;

(ii) 𝒫_x ≠ ∅, and (x + r_D(x)K) ∩ cl C ⊂ P  ∀P ∈ 𝒫_x ;

(iii) if P ∈ 𝒫_x contains a relative interior point of a facet of cl C (or of x + r_D(x)K, respectively), then P contains the whole facet;

(iv) if P ∈ 𝒫_x contains a relative interior point of an edge of cl C (or of x + r_D(x)K, respectively), then P contains the whole edge;

(v) if P ∈ 𝒫_x contains an extreme point of cl C and a facet of x + r_D(x)K, then we have H' ∈ 𝒥_2' for the halfspace H' corresponding to P;

(vi) if P ∈ 𝒫_x contains an edge of cl C and an edge of x + r_D(x)K, then we have H' ∈ 𝒥_3' for the halfspace H' corresponding to P.

Proof. In order to prove (i), we exhibit a point y* ∈ cl C ∩ (x + r_D(x)K). From (80) we see that r_D(x) = inf {p(y − x): y ∈ C}. It follows, by the continuity of the Minkowski functional and the assumptions on D, that r_D(x) is attained in cl C, i.e., there is a y* ∈ cl C such that p(y* − x) = r_D(x). But the point y* also belongs to x + p(y* − x)K because, by the well-known properties of p(z), we have (y* − x)/p(y* − x) ∈ K.

Since x + r_D(x)K ⊂ D and C ∩ D = ∅, assertion (ii) follows from (i) by a well-known classical theorem on separation of convex sets (cf., e.g., Rockafellar (1970)).

Finally, assertions (iii) and (iv) are straightforward because P is a plane separating cl C and x + r_D(x)K, whereas (v) and (vi) are immediate consequences of the definition of 𝒥_2' and 𝒥_3'. ■

The following proposition is also due to Nguyen and Strodiot (1988, 1992).

Proposition X.17. The collection 𝒥' is finite and satisfies property (89) of Proposition X.16.

Proof. Finiteness of 'I' = '11' U '12' U '13' has already been demonstrated above.

Let x E D. First suppose that x E cl C. Then x belongs to a facet f of cl C. The

supporting plane P of cl C which contains f, determines a hyperplane H' E '11' which

satisfies x + rD(x)K = {x} C P eH'.


Next suppose that x ~ cl C. Denote by Y* the intersection of cl C and x + rD(x)K.
From Lemma X.16 (i) and (ii) we know that Y* f 0 and .9lx f 0. We consider several
cases according to the dimension dim Y* of Y* (recall that dim Y* is defined as the

dimension of the affine hull of Y*).

Case 1: dim Y* = 2:

Let P E .9l . Then, by Lemma X.16 (ii), P contains a facet of cl


x
C, and the

halfspace H' E CH' generated by P belongs to 'I{ Since x + rD(x)K eH', condition
(89) is satisfied.

Case 2: dim Y* = 1:
Since x + rn(x)K is bounded and dirn Y* = 1, we see that Y* is a c1osed,
bounded interval which does not reduce to a singleton. Consider the following three
subcases.

Case 2.1: Y* is part 0/ an edge 0/ x + rD(x)K but is not contained in an edge 0/ cl C.


Then Y* must contain a relative interior point of a facet of cl C. Let P E .9l . We
x
see from Lemma X.16(ii) that, because Y* E P, it follows that P contains a relative

interior point of a facet of cl C, and thus, by Lemma X.16 (iii), it contains the whole
facet. Consequently, the halfspace H' E CH' generated by P again belongs to '11', and

condition (89) is implied by the inclusion x + ID(X)K eH'.

Case 2.2: Y* is part 0/ an edge 0/ cl C but is not contained in an edge 0/ x + r D(x)K.


As in the case 2.1 above, we conclude that now each P E .9lx contains a facet of

x + rD(x)K. Ey Lemma X.16 (iv), we see that P also contains an edge of cl C.


589

Hence, by Lemma X.16 (v), it follows that the halfspace H' E tR' generated by P

belongs to '12', and condition (89) holds because x + rn(x)K eH'.

Case 2.3: Y* is part 0/ an edge e 0/ cl Cand part 0/ an edge W0/ z + rD(zjK.


whas the form x + rn(x)w, where w is an edge of K. Let fl and f2 be
The edge

the two facets of cl C such that e = fl nf2 ' and let vI and v2 be the two facets of K

determining w = vI nv 2 . Then, by Lemma X.16 (ii) and (iv), each P E .9'x contains
the two colinear edges e and x + rn(x)w. Therefore, among all planes P E .9'x ' there

exists at least one plane P' which contains one of the four facets f p ~, x + rn(x)vl'

x + r n (x)v 2. Let H' E tR' be the halfspace corresponding to P'. Then we have

H' E '11' if f l E P' or f2 E P'. Otherwise, by Lemma X.16 (v), one has H' E '12' . In

any case, x + rn(x)K eH', and condition (89) is satisfied.

Case 3: dim Y* = 0:
In this case, when Y* is reduced to a singleton Y* = {y*}, we must consider six
subcases according to the position of y* with respect to cl Cand x + rn(x)K.

Case 3.1: y* is avertu 0/ z + rD(zjK and belongs to the relative interior 0/ a /acet 0/
cl C.
Let P E .9'x' and let H' E tR' be the corresponding halfspace. By Lemma X.16

(ii) and (iii), P contains a facet of cl C. Thus we have H' E '11' and, since

x + rn(x)K eH', condition (89) is satisfied.

Case 3.2: y* is a vertez 0/ cl C and belongs to the relative interior 0/ a /acet 0/ z +


rD(zjK.
Since P E .9' and the corresponding halfspace H' E tR', we see from Lemma X.16
x
(ii) and (iii) that P contains a vertex of cl C and a facet of x + rn(x)Kj hence, by

Lemma X.16 (v), H' E '12'. Since x + rn(x)K eH', it follows that condition (89) is
satisfied.
590

Case 3.3: y* is a relative interior point 0/ both an edge 0/ cl C and an edge 0/ x +


rD(x)K.
Since dim Y* = 0 we see that these two edges cannot be collinear. Let P E .J',
x
and let H' E tR' be generated by P. Lemma X.16 (ii) and (iv) show that P contains

an edge of cl C and an edge of x + rn(x)K which are not collinear. Therefore, by


Lemma X.16 (vi), H' E 13' and (86) is satisfied, because x + rn(x)K ( H'.

Case 3.4: y* is avertex 0/ x + rn(x)K and belongs to the relative interior 0/ an edge e
o/cl C.
Let f1 and f2 be the two facets of cl C which determine e, i.e., e = f1 n f2. Then,
by Lemma X.16 (ii) and (iv), each P E .J'x contains e. Therefore, there exists at least
one plane P' E .J'x which contains either one of the faces f1 and f2 or an edge of
x + rn(x)K emanating from x. (Observe that in the latter case this edge of

x + rn(x)K cannot be collinear with e because Y* = {y*} is a singleton.) If f1 ( P'


or f2 ( P', then the halfspace H' E tR' generated by P' belongs to 11'; otherwise, by
Lemma X.16 (vi), we have H' E 13'. In any case x + rn(x)K ( H', and (89) is satis-
fied.

Case 3.5: y* is avertex 0/ cl C and belongs to the relative interior 0/ an edge 0/


x+ rn(x)K.
The proof in this case is similar to Case 3.4.

Case 3.6: y* is avertex 0/ cl Cand avertex 0/ x + rn(x)K.


In this case, there exists at least one plane P E .J'x which contains either an edge

of cl C or an edge of x + rn(x)K. Since both cases can be examined in a similar way,


we suppose that there exist an edge e of cl C and a plane P E .J'x satisfying e ( P.

Two possibilities can occur. If none of the edges of x + rn(x)K is collinear with e,
then we can argue in the same way as in Case 3.4 to conclude that (89) holds.

Otherwise, there is an edge w of K such that x + rn(x)w is collinear with e. We


can argue as in Case 2.3. Let f1 and f2 be the two facets of cl C satisfying
591

e = fl nf2, and let vI and v2 be the two facets of K determining w = vI n v2. Then,
among all of the planes P E .9'x containing e, there exists at least one plane P' which
contains one ofthe four facets fl' f2, x + rD(x)vl' x + rD(x)v2 . Let H' e JI' be the
halfspace generated by P'. Then we have H' E 11' if f l E P' or ~ E P'. Otherwise,
using Lemma X.16 (v), we see that H' E 12'. In any case, x + rD(x)K eH', and con-
dition (89) is satisfied. _

Example X.4. An application of the above approach to the diamond cutting and dilatation problem is reported in Nguyen and Strodiot (1988). In all the tests discussed there, the reference diamond K has 9 vertices and 9 facets, and the rough stone is a nonconvex polyhedron of the form M = C ∩ D_1, D_1 = ℝⁿ \ C_1, as described in (82), (83), with 9 facets and one "nonconvexity" D_1 determined by four planes.

First, from the vertices, edges and facets of K and M the three finite collections 𝒥_1', 𝒥_2', 𝒥_3' were automatically computed. Then the design centering problem was solved by an algorithm given in Thoai (1988). Although the theoretical foundation of this algorithm contains an error, it seems that, as a heuristic tool, it worked quite efficiently on some practical problems (see Thoai (1988), Nguyen and Strodiot (1988 and 1992)). The following figure shows the reference diamond K, the rough stone M and the optimal diamond inside the rough stone for one of the test examples. In this example 𝒥_1' has 9 elements, 𝒥_2' has 1 element and 𝒥_3' has 3 elements. The optimal dilatation γ is 1.496.

Fig. X.3. Diamond problem

4.3. Biconvex Programming and Related Problems

In this section, we consider a special jointly constrained biconvex programming problem, namely

(SBC)   minimize F(x,y) = f(x) + xy + g(y)   (94)
        s.t. (x,y) ∈ K ∩ R ,

where:

(a) R = {(x,y) ∈ ℝ^n × ℝ^n: a̲ ≤ x ≤ ā, b̲ ≤ y ≤ b̄} with a̲, ā, b̲, b̄ ∈ ℝ^n, a̲ < ā, b̲ < b̄;

(b) K is a closed convex set in ℝ^{2n};



(c) f and g are real-valued convex functions on an open set containing R_x and R_y, respectively, where

R_x = {x ∈ ℝ^n: a̲ ≤ x ≤ ā}  and  R_y = {y ∈ ℝ^n: b̲ ≤ y ≤ b̄} .

Problem (SBC) is a d.c. programming problem, since xy = ¼(‖x+y‖² − ‖x−y‖²) is a d.c. function (cf. Example I.6).

Some immediate extensions of problem (SBC) are obvious. For example, a term x(Ay) can be transformed into xz if the linear constraint z = Ay is included among the constraints defining K.

Note that, even though each term x_i y_i in xy is quasiconvex, it is possible to have proper local optima that are not global. For example, the problem min {xy: (x,y) ∈ ℝ², −1 ≤ x ≤ 2, −2 ≤ y ≤ 3} has local minima at (−1,3) and (2,−2).
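A quick numerical check of this two-dimensional example (local minima of a bilinear function over a box lie at its corners):

from itertools import product

corners = list(product([-1.0, 2.0], [-2.0, 3.0]))
vals = {c: c[0] * c[1] for c in corners}
# (-1, 3) -> -3 (local minimum), (2, -2) -> -4 (global minimum)
print(min(vals, key=vals.get), min(vals.values()))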

An often treated important special case is the bilinear program

(BLP)   minimize px + x(Cy) + qy   (95)
        s.t. x ∈ X , y ∈ Y ,

where p, q ∈ ℝ^n; C is an (n×n) matrix, and X and Y are polytopes in ℝ^n. This problem is treated in Section IX.1. In Section I.2.4, we showed that problem (BLP) is equivalent to a special concave minimization problem. From the form of this equivalent concave minimization problem (Section I.2.4) and the corresponding well-known property of concave functions it follows that (BLP) has an optimal solution (x̄,ȳ), where x̄ is an extreme point of X and ȳ is an extreme point of Y (cf. Theorem I.1 and Section I.2.4). As pointed out by Al-Khayyal and Falk (1983), this property is lost in the jointly constrained case.



Example X.5. The problem in ℝ²

minimize (−x + xy − y)
s.t. −6x + 8y ≤ 3 ,
     3x − y ≤ 3 ,
     0 ≤ x, y ≤ 5

has an optimal solution at (7/6, 1/2) which is not an extreme point of the feasible set. Moreover, no extreme point of the feasible set is a solution.

The jointly constrained bilinear programming problem, however, has an optimal solution on the boundary of a compact, convex feasible set. This can be shown in a somewhat more general context, by assuming that the objective function is biconcave in the sense of the following proposition (cf. Al-Khayyal and Falk (1983)).

Proposition X.15. Let F(x,y) be a biconcave real-valued function on a compact convex set C ⊂ ℝ^n × ℝ^m, i.e., F(x,·) and F(·,y) are concave on the projections of C onto ℝ^m and ℝ^n, respectively. If min F(C) exists, then it is attained on the boundary ∂C of C.

Proof. Assume that there exists a point (x^0,y^0) ∈ int C such that

F(x^0,y^0) < F(x,y)   ∀(x,y) ∈ ∂C .

It follows that in particular we have

F(x^0,y^0) < F(x^0,y)   ∀y ∈ ∂C(x^0) ,

where

C(x^0) = {y: (x^0,y) ∈ C} .

But C(x^0) is compact and convex, and y^0 is a (relative) interior point of C(x^0). Hence, the concave function F(x^0,·) attains its global minimum over C(x^0) at an extreme point of C(x^0), which is a contradiction to the last inequality above. ■

Another proof of Proposition X.15 can be found in Al-Khayyal and Falk (1983).

Bilinear programming is of considerable interest because of its many applications. Recall, for example, that minimization of a (possibly indefinite) quadratic form over a polytope can be written as a bilinear problem. Another prominent example is the linear complementarity problem (cf. Chapter IX).

Several algorithms have been proposed for solving the bilinear programming problem, some of which are discussed in Section IX.1.

For the biconvex problem under consideration, observe that the methods in the preceding sections can easily be adapted to problem (SBC). Specifically, several branch and bound approaches are available. One of them is the method discussed in Section X.2 for a more general problem.

Another branch and bound algorithm is developed in Al-Khayyal and Falk (1983). Below we present a variant of this algorithm that differs from the original version in the subdivision rule which governs the refinement of partition sets and in the choice of iteration points.

Starting with M_0 = R, let all partition sets M be 2n-rectangles, and let the rectangles be refined by an exhaustive subdivision, e.g., by bisection. The procedure below will use bisection, but any other exhaustive subdivision will do as well (cf. also Chapter VII).

The lower bounds β(M) will be determined by minimizing over M ∩ K a convex function Φ_M(x,y) that underestimates F(x,y) on M. Let φ_M(x,y) denote the convex envelope of xy over M. Then

Φ_M(x,y) = f(x) + φ_M(x,y) + g(y)   (96)

will be used.

The selection rule that determines the partition sets to be refined in the current iteration will be bound improving, as usual.

We begin by deriving a formula for φ_M(x,y) (cf. Al-Khayyal and Falk (1983)). Let M = {(x,y): a̲ ≤ x ≤ ā, b̲ ≤ y ≤ b̄} be a 2n-rectangle in ℝ^{2n}, and let M_i = {(x_i,y_i): a̲_i ≤ x_i ≤ ā_i, b̲_i ≤ y_i ≤ b̄_i}, so that M = M_1 × M_2 × ... × M_n. Denote by φ_{M_i}(x_i,y_i) the convex envelope of x_i y_i on M_i (i=1,...,n).

Proposition X.19. (i) The convex envelope of xy over M is given by

φ_M(x,y) = Σ_{i=1}^n φ_{M_i}(x_i,y_i) .

(ii) If f(x) = cx and g(y) = dy, c,d ∈ ℝ^n, then cx + φ_M(x,y) + dy is the convex envelope of F(x,y) over M.

Proof. Part (i) is an immediate consequence of Theorem IV.8, and part (ii) follows from Theorem IV.9. ■


Proposition X.20. For each i ∈ {1,...,n}, the convex envelope of x_i y_i over M_i is given by

φ_{M_i}(x_i,y_i) = max {b̲_i x_i + a̲_i y_i − a̲_i b̲_i , b̄_i x_i + ā_i y_i − ā_i b̄_i} .

Proof. We temporarily drop the subscripts in φ_{M_i}, x_i, y_i, a̲_i, ā_i, b̲_i, b̄_i and consider φ(x,y), where x and y are now real variables rather than vectors.

Recall that the convex envelope of a function h on M may be equivalently defined as the pointwise supremum of all affine functions which underestimate h over M. Since x − a̲ ≥ 0 and y − b̲ ≥ 0, it follows after multiplication that

xy ≥ b̲x + a̲y − a̲b̲ ,

so that ℓ_1(x,y) = b̲x + a̲y − a̲b̲ underestimates xy over M. Similarly, it follows from ā − x ≥ 0 and b̄ − y ≥ 0 that ℓ_2(x,y) = b̄x + āy − āb̄ underestimates xy over M. Hence

φ(x,y) = max {ℓ_1(x,y), ℓ_2(x,y)}

is a convex underestimating function for xy over M. In addition, a simple computation shows that φ(x,y) agrees with xy at the four extreme points of M.

Let M_1, M_2 ⊂ M be the closed triangles below and above the diagonal joining (a̲,b̄) and (ā,b̲), respectively. Then it is easy to see that

φ(x,y) = ℓ_1(x,y) for (x,y) ∈ M_1 ,  φ(x,y) = ℓ_2(x,y) for (x,y) ∈ M_2 .   (97)

If φ were not the convex envelope of xy over M, there would be a third affine function ℓ_3(x,y) underestimating xy over M such that

φ(x̄,ȳ) < ℓ_3(x̄,ȳ) for some (x̄,ȳ) ∈ M .   (98)

Suppose that (x̄,ȳ) ∈ M_1. Then (x̄,ȳ) is a unique convex combination of the three extreme points v^1, v^2, v^3 of M_1. Hence, for every affine function ℓ one has

ℓ(x̄,ȳ) = Σ_{i=1}^3 λ_i ℓ(v^i)

with uniquely determined λ_i ≥ 0 (i=1,...,3), Σ_{i=1}^3 λ_i = 1. But since φ agrees with xy at these extreme points and ℓ_3 underestimates xy there, by (97) we must have

ℓ_3(x̄,ȳ) = Σ_{i=1}^3 λ_i ℓ_3(v^i) ≤ Σ_{i=1}^3 λ_i φ(v^i) = φ(x̄,ȳ) ,

contradicting (98). A similar argument holds when (x̄,ȳ) ∈ M_2. ■
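The formulas of Proposition X.19 (i) and Proposition X.20 translate directly into code; a minimal numpy sketch of the underestimator (96), with illustrative f and g:

import numpy as np

def phi_M(x, y, a_lo, a_hi, b_lo, b_hi):
    """Convex envelope of xy over the box M (Propositions X.19/X.20)."""
    l1 = b_lo * x + a_lo * y - a_lo * b_lo
    l2 = b_hi * x + a_hi * y - a_hi * b_hi
    return float(np.sum(np.maximum(l1, l2)))

def Phi_M(x, y, a_lo, a_hi, b_lo, b_hi, f, g):
    """Convex underestimator (96) of F(x,y) = f(x) + xy + g(y) on M."""
    return f(x) + phi_M(x, y, a_lo, a_hi, b_lo, b_hi) + g(y)

# sanity check: the envelope never exceeds the bilinear term on M
rng = np.random.default_rng(0)
a_lo, a_hi = -np.ones(3), np.ones(3)
b_lo, b_hi = -np.ones(3), 2.0 * np.ones(3)
x = rng.uniform(a_lo, a_hi); y = rng.uniform(b_lo, b_hi)
assert phi_M(x, y, a_lo, a_hi, b_lo, b_hi) <= x @ y + 1e-12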



Algorithm X.6.

Step 0 (Initialization):

Set ℳ_0 = {M}, where M = R, and determine Φ_M(x,y) = f(x) + φ_M(x,y) + g(y) according to Proposition X.19 (i) and Proposition X.20.

Solve the convex minimization problem

(P_M)   minimize Φ_M(x,y)   s.t. (x,y) ∈ M ∩ K

to determine β_0 = min Φ_M(M ∩ K). Let S_M be the finite set of iteration points in M ∩ K obtained while solving (P_M). Set α_0 = min F(S_M) and choose (x^0,y^0) ∈ argmin F(S_M).

If α_0 − β_0 = 0 (≤ ε), then stop: (x^0,y^0) is an (ε-)optimal solution.

Step k = 1,2,...:

At the beginning of Step k we have the current partition ℳ_{k−1} of a subset of R still of interest. Furthermore, for every M ∈ ℳ_{k−1}, we have S_M ⊂ M ∩ K and the bounds β(M), α(M) satisfying β(M) ≤ min F(M ∩ K) ≤ α(M). Moreover, the current lower and upper bounds β_{k−1}, α_{k−1} satisfying β_{k−1} ≤ min F(K ∩ R) ≤ α_{k−1} are at hand, and we have a subset ℳ*_{k−1} of ℳ_{k−1} whose elements are the partition sets M such that β_{k−1} = β(M). Finally, the current iteration point (x^{k−1},y^{k−1}) ∈ K ∩ R is the best feasible point obtained so far, i.e., one has F(x^{k−1},y^{k−1}) = α_{k−1}.

k.1. Delete all M ∈ ℳ_{k−1} satisfying β(M) > α_{k−1}.

Let ℛ_k be the collection of remaining members of ℳ_{k−1}.

k.2. Select a collection 𝒫_k ⊂ ℛ_k satisfying ℳ*_{k−1} ⊂ 𝒫_k, and bisect each member of 𝒫_k. Let 𝒫'_k be the collection of all new partition elements.

k.3. For each M ∈ 𝒫'_k:

Determine Φ_M(x,y) according to Proposition X.19 (i) and Proposition X.20. Solve the convex minimization problem (P_M). Delete M if the procedure for solving (P_M) detects M ∩ K = ∅.

Let ℳ'_k be the collection of all remaining members of 𝒫'_k.

k.4. For each M ∈ ℳ'_k:

Let S_M be a finite set of iteration points in M ∩ K known so far (S_M contains iteration points obtained while solving (P_M) and best feasible points known from iteration k−1). Set α(M) = min F(S_M), β(M) = min Φ_M(K ∩ M).

k.5. Set ℳ_k = (ℛ_k \ 𝒫_k) ∪ ℳ'_k.

Compute

α_k = min {α(M): M ∈ ℳ_k} ,  β_k = min {β(M): M ∈ ℳ_k} .

Let (x^k,y^k) ∈ K ∩ R be such that F(x^k,y^k) = α_k, and set ℳ*_k = {M ∈ ℳ_k: β_k = β(M)}.

If α_k − β_k = 0 (≤ ε), then stop: (x^k,y^k) is an (ε-)optimal solution. Otherwise, go to Step k+1.

The following proposition shows convergence of the procedure ($\varepsilon = 0$).

Proposition X.21. If Algorithm X.6 does not terminate after a finite number of iterations, then the sequence $\{(x^k,y^k)\}$ has accumulation points, and every accumulation point of $\{(x^k,y^k)\}$ is an optimal solution of problem (91) satisfying

$$\lim_{k \to \infty} \beta_k = \min F(K \cap R) = \lim_{k \to \infty} \alpha_k = \lim_{k \to \infty} F(x^k,y^k)\,. \qquad (99)$$

Proof. Proposition X.21 can be derived from the general theory of branch and bound methods presented in Chapter IV; we refer to Theorem IV.3 and Corollary IV.2.

Since the functions f, g are convex on an open set containing R, F is continuous on R. Obviously, because of the compactness of the feasible set $D = K \cap R$ we then see that $\{(x,y) \in D: F(x,y) \le F(x^0,y^0)\}$ is bounded.

Recall from the discussion of Theorem IV.3 that for consistent bounding operations, bound improving selections are complete. Therefore, Proposition X.21 follows from Theorem IV.3 when consistency of the bounding operation is established. We show that any decreasing sequence $\{M_q\}$ of successively refined partition elements satisfies

$$\lim_{q \to \infty} \big(\alpha(M_q) - \beta(M_q)\big) = 0\,, \qquad (100)$$

and this implies consistency (cf. Section IV.2).

Let $M_q = M_{x,q} \times M_{y,q}$, where $M_{x,q}$, $M_{y,q}$ denote the projections of $M_q$ onto the x-space $\mathbb{R}^n$ and the y-space $\mathbb{R}^n$, respectively. Denote $h(x,y) = xy$. Recall from Theorem IV.4 that for the convex envelope $\varphi_{M_q}$ of h one has $\min \varphi_{M_q}(M_q) = \min h(M_q)$, i.e., the global minimum of h over $M_q$ is equal to the global minimum of its convex envelope $\varphi_{M_q}$ over $M_q$. Then, from the construction of the lower bounds $\beta(M_q)$, we see that

$$\min f(M_{x,q}) + \min h(M_q) + \min g(M_{y,q}) \le \beta(M_q)\,. \qquad (101)$$

But the subdivision is exhaustive, i.e., $M_q \to \{(\bar{x},\bar{y})\} \subset D$, and all of the infeasible partition sets are deleted. Recall that $\alpha(M_q) = F(\bar{x}^q,\bar{y}^q)$ for some $(\bar{x}^q,\bar{y}^q) \in D \cap M_q$. Using the continuity of f, h, g, it follows that as $q \to \infty$ the left-hand side of (101) converges to $f(\bar{x}) + h(\bar{x},\bar{y}) + g(\bar{y}) = F(\bar{x},\bar{y})$. Since we also have $\alpha(M_q) = F(\bar{x}^q,\bar{y}^q) \to F(\bar{x},\bar{y})$, we see that condition (100) follows from (101) if we let $q \to \infty$.


The original version of Al-Khayyal and Falk (1983) proceeds like Algorithm X.6 with two modifications.

The iteration point $(x^k,y^k)$ is chosen to be the point where the lower bound $\beta_k$ is attained, in the sense that

$$\Phi_M(x^k,y^k) = \beta_k \quad \text{for some } M \in \mathcal{M}_k$$

(cf. the discussion in Section IV.2).

The subdivision of a partition element M involves setting up four new partition sets by choosing an index j satisfying

$$x^k_j y^k_j - \varphi_{M_j}(x^k_j,y^k_j) = \max_{i=1,\dots,n}\ \{x^k_i y^k_i - \varphi_{M_i}(x^k_i,y^k_i)\}$$

and then splitting with two hyperplanes through $(x^k_j,y^k_j)$ orthogonal to the $(x_j,y_j)$-plane, as illustrated in Fig. X.4 in the case k = 1 (cf. Chapter VII).


Fig. X.4. Splitting at Stage 1

Notice that a number of applications also lead to optimization problems where the objective function F(x,y) is convex in x and concave in y. Algorithmic approaches for such convex-concave problems can be found in Muu and Oettli (1991) and Horst, Muu and Nast (1994).


CHAPTER XI

LIPSCHITZ AND CONTINUOUS OPTIMIZATION

In this chapter, we discuss global optimization problems where the functions involved are Lipschitz-continuous or have a related property on certain subsets $M \subset \mathbb{R}^n$. Section 1 presents a brief introduction into the most often treated univariate case. Section 2 is devoted to branch and bound methods. First it is shown that the well-known univariate approaches can be interpreted as branch and bound methods. Next, several extensions of univariate methods to the case of n-dimensional problems with rectangular feasible sets are discussed. Then it is recalled from Chapter IV that very general Lipschitz optimization problems and also very general systems of equations and (or) inequalities can be solved by means of branch and bound techniques. As an example of Lipschitz optimization, the problem of minimizing a concave function subject to separable indefinite quadratic constraints is discussed in some detail. Finally, the concept of Lipschitz functions is extended to so-called functions with concave minorants.

In Section 3 it is shown that Lipschitz optimization problems can be transformed into equivalent special d.c. programs which can be solved by outer approximation techniques. This approach will then be generalized further to a "relief indicator" method.

1. BRIEF INTRODUCTION INTO THE GLOBAL MINIMIZATION OF
UNIVARIATE LIPSCHITZ FUNCTIONS

1.1. Saw-Tooth Covers

Recall from Definition I.3 that a real-valued function f is called a Lipschitz function on a set $M \subset \mathbb{R}^n$ if there is a constant $L = L(f,M) > 0$ such that

$$|f(x) - f(y)| \le L\,\|x - y\| \quad \forall x,y \in M\,. \qquad (1)$$

In (1), $\|\cdot\|$ again denotes the Euclidean norm.

We first consider a univariate Lipschitz function defined on an interval [a,b]. We are interested in finding the global minimum of f over [a,b] and a point $x^* \in [a,b]$ such that

$$f^* := f(x^*) = \min\,\{f(x): x \in [a,b]\}\,.$$

This problem will be denoted by

$$(UL) \qquad \text{minimize } f(x) \quad \text{s.t. } x \in [a,b]\,. \qquad (2)$$

It is not assumed that an analytic expression of f is known; i.e., f may be given by a so-called oracle.

This relatively simple problem (UL) is interesting because it arises in many applications and also because some algorithms for solving problem (UL) can easily be extended to the n-dimensional case.

Examples of applications are found in the optimization of the performance of a system, which can in many cases be measured for given values of some parameter(s) even if the governing equations are unknown (e.g., Brooks (1958)). Other examples are discussed in Pinter (1989); see also the survey of Hansen and Jaumard (1995).

Denote by $\Phi_L[a,b]$ the class of Lipschitz functions on [a,b] having Lipschitz constant L. Then it is easy to see that no algorithm can solve (UL) for all $f \in \Phi_L[a,b]$ by using only a finite number of function evaluations (cf. Hansen et al. (1989)).

Theorem XI.1. There is no algorithm for solving any problem (UL) in $\Phi_L[a,b]$ that uses only a finite number of function evaluations.

Proof. Assume that there is such a finitely convergent algorithm A yielding a global minimizer x* after k steps, i.e., we have $f(x^*) = \min_{i=1,\dots,k} f(x^i)$ (k > 1).

Denote $X_k = \{x^1,\dots,x^k\}$, and let $f(X_k)$ be the set of corresponding function values. Let $x^j \in X_k \setminus \{x^*\}$ be the evaluation point different from x* which is closest to x* on the left (if such a point does not exist, then a similar argument holds for the point in $X_k \setminus \{x^*\}$ closest to x* on the right). Consider the function

$$f^1(x) = \begin{cases} f(x)\,, & x \in [a,x^j] \cup [x^*,b] \\ \max\,\{f(x^j) - L(x - x^j),\ f(x^*) - L(x^* - x)\}\,, & x \in [x^j,x^*] \end{cases}$$

Obviously, we have $f^1 \in \Phi_L[a,b]$, and it is easy to see by a straightforward geometric argument that the global minimum of $f^1$ is attained at

$$\bar{x} = \frac{x^j + x^*}{2} + \frac{f(x^j) - f(x^*)}{2L}$$

with

$$f^1(\bar{x}) = \frac{1}{2}\,\big(f(x^j) + f(x^*) - L(x^* - x^j)\big) < f(x^*)\,,$$

whenever $f(x^*) > f(x^j) - L(x^* - x^j)$.

The strategy of algorithm A, however, depends only on L, $X_k$, $f(X_k)$, which coincide for f and $f^1$. Hence, we have

$$f(x^*) = \min_{i=1,\dots,k} f(x^i) = \min_{i=1,\dots,k} f^1(x^i)\,,$$

and algorithm A concludes that x* is also a global minimizer of $f^1$, a contradiction.


Frequently, instead of problem (UL), one investigates problem (UL$_\varepsilon$), which consists in finding a point $x^*_\varepsilon \in [a,b]$ such that, for small $\varepsilon > 0$,

$$f^*_\varepsilon := f(x^*_\varepsilon) \le f^* + \varepsilon\,. \qquad (3)$$

It is obvious that every problem (UL$_\varepsilon$) can always be solved by a finite algorithm. Evaluating f at the equidistant points

$$x^i = a + (2i-1)\,\frac{\varepsilon}{L} \quad (i=1,\dots,k)\,, \qquad (4)$$

where $k = \lceil L(b-a)/(2\varepsilon) \rceil$ is the smallest integer satisfying $k \ge L(b-a)/(2\varepsilon)$, yields a point satisfying (3) for any Lipschitz function $f \in \Phi_L[a,b]$.
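As an illustration, the passive strategy (4) can be sketched in a few lines of Python (a minimal sketch of ours; function and variable names are not from the book):

import math

def passive_minimize(f, a, b, L, eps):
    # Evaluate f on the grid (4) with spacing 2*eps/L; the best value found
    # is within eps of the global minimum f* for every f in Phi_L[a,b].
    k = max(1, math.ceil(L * (b - a) / (2 * eps)))
    best_x, best_val = None, float("inf")
    for i in range(1, k + 1):
        x = min(a + (2 * i - 1) * eps / L, b)   # clip last point into [a,b]
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val                      # best_val <= f* + eps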

Problems (UL) and (UL$_\varepsilon$) have been studied by several authors, e.g., Danilin (1971), Evtushenko (1971 and 1985), Piyavskii (1972), Shubert (1972), Strongin (1973 and 1978), Timonov (1977), Schoen (1982), Shepilov (1987), Pinter (1986 and 1988), etc. A recent survey is Hansen and Jaumard (1995).


A number of procedures have also been designed to approximate the set

$$X^* = \{x^* \in [a,b]: f(x^*) = f^*\} \qquad (5)$$

of all optimal solutions to (UL); see, for example, Basso (1982), Galperin (1985 and 1988), Pinter (1986 and 1988), Hansen et al. (1991).
An algorithm such as (4), where the evaluation points are chosen simultaneously, is frequently said to be passive, since the step size is predetermined and does not depend on the function values. Its counterpart is a sequential algorithm, in which the choice of new evaluation points depends on the information gathered at previous iterations.

For most functions f, the number of evaluation points required to solve (UL$_\varepsilon$) will be much smaller with a suitable sequential algorithm than with a passive algorithm. In the worst case, however, the numbers of evaluation points required by a passive and by a best possible sequential algorithm are the same (cf. Ivanov (1972), Archetti and Betro (1978), Sukharev (1985)). It can easily be seen from the following discussion that this case arises when f is a constant function over [a,b].

Given $X_k = \{x^1,\dots,x^k\}$, the corresponding set $f(X_k)$ of function values, and the Lipschitz constant L, it is natural to bound f* from below by a piecewise linear function with slope +L or −L that exploits the Lipschitz bounds given by

$$f(x) \ge f(x^i) - L\,|x - x^i| \quad (i=1,\dots,k) \qquad (6)$$

(cf. I.4.1). Obviously, for a fixed set $X_k$, the best underestimating function using the above information is given by

$$F_k(x) = \max_{i=1,\dots,k}\ \{f(x^i) - L\,|x - x^i|\}\,. \qquad (7)$$

Because of its shape, a function $F_k$ of the form (7) will be called a saw-tooth cover of f. Let $X_k$ be ordered such that $a \le y^1 \le y^2 \le \dots \le y^k \le b$, where $\{y^1,\dots,y^k\} = \{x^1,\dots,x^k\}$. The restriction of $F_k$ to the interval $[y^i,y^{i+1}]$ of two consecutive evaluation points is said to be the tooth on $[y^i,y^{i+1}]$. A straightforward simple calculation shows that the tooth on $[y^i,y^{i+1}]$ attains its minimal value (downward peak) at

$$x^{p,i} = \frac{y^i + y^{i+1}}{2} + \frac{f(y^i) - f(y^{i+1})}{2L} \qquad (8)$$

with

$$F_k(x^{p,i}) = \frac{f(y^i) + f(y^{i+1})}{2} - L\,\frac{y^{i+1} - y^i}{2}\,. \qquad (9)$$
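Formulas (8) and (9) translate directly into code (our sketch, with self-explanatory names):

def downward_peak(y_i, y_ip1, f_i, f_ip1, L):
    # Peak location and value of the tooth on [y_i, y_ip1].
    x_p = (y_i + y_ip1) / 2 + (f_i - f_ip1) / (2 * L)    # (8)
    F_p = (f_i + f_ip1) / 2 - L * (y_ip1 - y_i) / 2      # (9)
    return x_p, F_p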

Since the number of necessary function evaluations for solving problem (UL$_\varepsilon$) measures the efficiency of a method, Danilin (1971) suggested studying the minimum number of evaluation points required to obtain a guaranteed solution of problem (UL$_\varepsilon$) (cf. also Hansen et al. (1988 and 1989), Hansen and Jaumard (1995)).

This can be done by constructing a reference saw-tooth cover $F_R(x)$ for solving (UL$_\varepsilon$) with a minimal number $k_R$ of function evaluations. Such a reference cover is constructed with f* assumed to be known. It is, of course, designed not to solve problem (UL$_\varepsilon$) from the outset, but rather to give a reference number of necessary evaluation points in order to study the efficiency of other algorithms.

It is easy to see that a reference saw-tooth cover $F_R(x)$ can be obtained in the following way. Set $F_R(a) = f^* - \varepsilon$. The first evaluation point $x^1$ is then the intersection point of the line $(f^* - \varepsilon) + L(x - a)$ with the curve f(x). The next downward peak is at

$$x^{p,1} = x^1 + \frac{f(x^1) - (f^* - \varepsilon)}{L}\,,$$

and it satisfies $F_R(x^{p,1}) = f^* - \varepsilon$.

Proceeding in this way, we construct a saw-tooth cover $F_R(x)$ for which the lowest value of a downward peak (with the possible exception of the last one) is $f^* - \varepsilon$ (Fig. XI.1).

Fig. XI.1. Reference saw-tooth cover



Algorithm XI.1 (Reference saw-tooth cover)

Initialization:

Set k = 1, $x^1$ = solution of the equation $f(x) = (f^* - \varepsilon) + L(x - a)$.

Reference saw-tooth cover:

Step k = 1,2,...:

Set $x^{p,k} = x^k + \dfrac{f(x^k) - (f^* - \varepsilon)}{L}$;

$x^{k+1}$ = solution of the equation $f(x) = (f^* - \varepsilon) + L(x - x^{p,k})$;

If $x^{k+1} \ge b$, then stop: $k_R = k$.
Otherwise, set k = k+1.
Go to Step k.

1.2. Algorithms for Solving the Univariate Lipschitz Problem

Consider problem (UL$_\varepsilon$). Let $x^k$ be the last evaluation point, and let $f_\varepsilon$ denote the current best known function value. We try to find $x^{k+1}$ such that the step-size $x^{k+1} - x^k$ is maximal under the condition that, if $f(x^{k+1}) \ge f_\varepsilon$, then we have

$$F_k(x^{p,k}) \ge f_\varepsilon - \varepsilon\,, \qquad (10)$$

where $x^{p,k}$ denotes the downward peak of the tooth on $[x^k, x^{k+1}]$. Inserting (9) into (10), we obtain

$$\frac{f(x^k) + f(x^{k+1})}{2} - L\,\frac{x^{k+1} - x^k}{2} \ge f_\varepsilon - \varepsilon\,, \qquad (11)$$

which, because of the condition $f(x^{k+1}) \ge f_\varepsilon$, leads to

$$x^{k+1} = x^k + \frac{1}{L}\,\big(2\varepsilon + f(x^k) - f_\varepsilon\big)\,. \qquad (12)$$

This is essentially the procedure of Evtushenko (1971) (cf. also Hansen and Jaumard (1995)).

Algorithm XI.2 (Evtushenko's saw-tooth cover)

Initialization:

Set k = 1, $x^1 = a + \frac{\varepsilon}{L}$, $x_c = x^1$, $f_c = f(x_c)$.

Evtushenko's saw-tooth cover:

Step k = 1,2,...:

If $x^k > b$, then stop.

Otherwise, set

$$x^{k+1} = x^k + \frac{1}{L}\,\big(2\varepsilon + f(x^k) - f_c\big)\,.$$

If $f(x^{k+1}) < f_c$, then set $f_c = f(x^{k+1})$, $x_c = x^{k+1}$;
Set k = k+1. Go to Step k.
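A compact Python sketch of Algorithm XI.2 (our transcription; the incumbent update is folded into the loop, and $\varepsilon/L \le b - a$ is assumed):

def evtushenko(f, a, b, L, eps):
    x = a + eps / L
    x_c, f_c = None, float("inf")          # incumbent point / value
    while x <= b:
        fx = f(x)
        if fx < f_c:
            x_c, f_c = x, fx
        x += (2 * eps + fx - f_c) / L      # step-size rule (12)
    return x_c, f_c                        # guarantees f_c <= f* + eps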

Note that from the derivation of (12) it follows that, if $f(x^{k+1}) < f(x^k)$, then the downward peak $F_k(x^{p,k})$ differs from the new incumbent value $f(x^{k+1})$ by more than $\varepsilon$.

We see from (12) that in the worst case of a constant or a monotonically decreasing function f, we have $f(x^k) = f_c$ for all k; hence $x^{k+1} - x^k = \frac{2\varepsilon}{L}$, which is the step-size of the passive algorithm (4).

The minimum number of evaluation points required by Evtushenko's algorithm is $1 + \lceil \log_2 (1 + \frac{L(b-a)}{2\varepsilon}) \rceil$. It is attained when f is an affine function on [a,b] with slope L. The efficiency of the procedure depends greatly on the position of the optimal solution x*, and it tends to become worse when $x^* \to b$. Its saw-tooth cover can differ considerably from the reference saw-tooth cover, particularly when x* is far from the left bound a of the interval.

Evtushenko's algorithm was originally designed to globally optimize a multivariate Lipschitz function by repeatedly solving the univariate problems obtained by fixing all variables but one.

In contrast to Evtushenko's method, which is an ordered sequential algorithm, i.e., the evaluation points at successive iterations are increasing values of x belonging to [a,b], the algorithm of Piyavskii (1967 and 1972) constructs more and more refined saw-tooth covers of f in the following way. Starting with $x^1 = \frac{a+b}{2}$ and $f(x^1)$, the first saw-tooth cover

$$F_1(x) = f(x^1) - L\,|x - x^1|$$

is minimized over [a,b] in order to obtain its lowest downward peak at $x^2 \in \operatorname{argmin} F_1([a,b])$.

The function f is then evaluated at this "peak point" $x^2$, and the corresponding tooth is split into smaller teeth to obtain the next cover, which in turn is minimized over [a,b], etc.

Thus, the procedure is governed by the formulas

$$F_k(x) = \max_{i=1,\dots,k}\ \{f(x^i) - L\,|x - x^i|\} \qquad (13)$$

(cf. (7)) and

$$x^{k+1} \in \operatorname{argmin}\ F_k([a,b])\,. \qquad (14)$$
Piyavskii's algorithm with various extensions seems to be the most often discussed approach for problems (UL) and (UL$_\varepsilon$). It was rediscovered by Shubert (1972) and Timonov (1977). Archetti and Betro (1978) discuss it in a general framework of sequential methods. Basso (1982) concentrates on convergence issues, and proposes some modifications in order to approximate the set X* of all optimal solutions. Pinter (1986 and 1986a) introduces five axioms which guarantee the convergence of certain global algorithms, and he shows that Piyavskii's algorithm, as well as others, satisfies them. Schoen (1982) proposes a variant of Piyavskii's approach that, instead of choosing the point of lowest downward peak to be the next evaluation point, selects the evaluation point of the passive strategy (4) that is closest to this peak point. Shen and Zhu (1987) discuss a simplified version in which at each iteration the new evaluation point is at the middle of a subinterval bounded by two consecutive previous evaluation points. Hansen et al. (1989 and 1991) present a thorough discussion of Piyavskii's univariate algorithm and related approaches, which includes a theoretical study of the number of iterations initiated by Danilin (1971). A recent comprehensive survey is Hansen and Jaumard (1995).

Multidimensional extensions were proposed by Piyavskii (1967 and 1972), Mayne and Polak (1984), Strigul (1985), Mladineo (1986), Pinter (1986, 1986a, and 1988), Shepilov (1987), Nefedov (1987), Meewella and Mayne (1988), Wood (1992), Baoping et al. (1993), Baritompa (1994) and others.

Note that a direct extension of Piyavskii's method to the case of an n-rectangle D, where (13), (14) are replaced by

$$F_k(x) = \max_{i=1,\dots,k}\ \{f(x^i) - L\,\|x - x^i\|\} \qquad (13')$$

and

$$x^{k+1} \in \operatorname{argmin}\ F_k(D)\,, \qquad (14')$$

seems not to be very promising with respect to numerical efficiency for dimension n > 2, since (14') constitutes an increasingly difficult d.c. problem (cf. Section I.4).

In Horst and Tuy (1987) it is shown that Piyavskii's algorithm, as well as others, can be viewed as a branch and bound algorithm. In this branch and bound reformulation, the subproblems correspond to a tooth of the current saw-tooth cover, and the selection is bound improving.

This way of viewing the procedure will allow one to delete unpromising teeth in order to reduce the need for memory space. We shall return to this branch and bound formulation in the more general framework of Section XI.2.

In the following algorithmic description of Piyavskii's method for solving (UL$_\varepsilon$), $f_c$ denotes the current best value of the function f, whereas $F_c$ denotes the minimal value of the current saw-tooth cover.

Algorithm XI.3 (Piyavskii's saw-tooth cover)

Initialization:

Set k = 1, $x^1 = \frac{a+b}{2}$, $x_c = x^1$, $f_c = f(x_c)$,

$F_c = f_c - L\,\frac{b-a}{2}$, $F_1(x) = f(x^1) - L\,|x - x^1|$.

Piyavskii's saw-tooth cover:

Step k = 1,2,...:

If $f_c - F_c \le \varepsilon$, then stop.
Otherwise determine

$$x^{k+1} \in \operatorname{argmin}\ F_k([a,b])\,.$$

If $f(x^{k+1}) \le f_c$, then set $f_c = f(x^{k+1})$, $x_c = x^{k+1}$.

Set $F_{k+1}(x) = \max_{i=1,\dots,k+1}\ \{f(x^i) - L\,|x - x^i|\}$, $F_c = \min F_{k+1}([a,b])$, and k = k+1.

Go to Step k.
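The following Python sketch (ours, for ε > 0) implements Piyavskii's method in a "list of teeth" form that anticipates the branch and bound interpretation of Section 2.1: the teeth are kept in a heap keyed by their downward-peak values (9), and the tooth with the lowest peak is refined at its peak point (8). Unlike the description above, the sketch starts by evaluating f at both endpoints rather than at the midpoint; this is an inessential variation.

import heapq

def piyavskii(f, a, b, L, eps):
    # Heap entries: (peak_value, y1, y2, f(y1), f(y2), peak_point).
    def peak(y1, y2, f1, f2):
        xp = (y1 + y2) / 2 + (f1 - f2) / (2 * L)     # (8)
        Fp = (f1 + f2) / 2 - L * (y2 - y1) / 2       # (9)
        return xp, Fp
    fa, fb = f(a), f(b)
    x_c, f_c = (a, fa) if fa <= fb else (b, fb)      # incumbent
    xp, Fp = peak(a, b, fa, fb)
    heap = [(Fp, a, b, fa, fb, xp)]
    while f_c - heap[0][0] > eps:                    # stop when f_c - F_c <= eps
        _, y1, y2, f1, f2, xp = heapq.heappop(heap)
        fx = f(xp)
        if fx < f_c:
            x_c, f_c = xp, fx
        for u, v, fu, fv in ((y1, xp, f1, fx), (xp, y2, fx, f2)):
            q, Fq = peak(u, v, fu, fv)
            heapq.heappush(heap, (Fq, u, v, fu, fv, q))
    return x_c, f_c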

Obviously, the second and the third evaluation points in Piyavskii's algorithm are a and b, where $x^2 = a$ implies $x^3 = b$ and $x^2 = b$ implies $x^3 = a$.

Now suppose that for $k \ge 3$ the first k evaluation points have already been generated and ordered so that we have $a = y^1 \le y^2 \le \dots \le y^k = b$, where $\{y^1,\dots,y^k\} = \{x^1,\dots,x^k\}$. Then, by (8), (9), we see that $x^{k+1}$ is determined by

$$\min_{i=1,\dots,k-1}\ \Big\{\frac{f(y^i) + f(y^{i+1})}{2} - L\,\frac{y^{i+1} - y^i}{2}\Big\}\,.$$

Assume that the minimum is attained at i = j. Then we have

$$x^{k+1} = \frac{y^j + y^{j+1}}{2} + \frac{f(y^j) - f(y^{j+1})}{2L} \qquad (15)$$

and

$$F_k(x^{k+1}) = \frac{f(y^{j+1}) + f(y^j)}{2} - L\,\frac{y^{j+1} - y^j}{2}\,. \qquad (16)$$

We show that Piyavskii's algorithm is one-step optimal in the sense explained below.

Let $f(x^1), f(x^2), \dots, f(x^{k-1})$ be the values of $f \in \Phi_L[a,b]$ at the first k−1 evaluation points of a saw-tooth cover algorithm for solving problem (UL) (or problem (UL$_\varepsilon$)). In addition, let $\Phi_{k-1}(f) \subset \Phi_L[a,b]$ denote the set of all Lipschitz functions (with Lipschitz constant L) which coincide with f(x) at these points. For all $\varphi \in \Phi_{k-1}(f)$, denote $\varphi^* = \min \varphi([a,b])$. Consider the saw-tooth cover $F^{\varphi}_k(x)$ of $\varphi(x)$ at iteration k which, given $F^{\varphi}_{k-1}(x)$, is determined by the choice of the evaluation point $x^k$. We are interested in making the choice in such a way that the error

$$\delta_k(\varphi) = \min_{i=1,\dots,k} \varphi(x^i) - \varphi^* \qquad (17)$$

is minimized in the worst case. In other words, we attempt to minimize the quantity

$$\max\,\{\delta_k(\varphi): \varphi \in \Phi_{k-1}(f)\} \qquad (18)$$

("optimality in one step", cf. Sukharev (1985) for a related definition).
Proposition XI.1. Piyavskii's algorithm is optimal in one step in the sense of (17), (18).

Proof. Since for all $\varphi \in \Phi_{k-1}(f)$ we have $f(x^i) = \varphi(x^i)$ $(i=1,\dots,k-1)$, the previous saw-tooth covers $F^{\varphi}_{k-1}(x)$ coincide for all $\varphi \in \Phi_{k-1}(f)$, i.e., one has $F^{\varphi}_{k-1}(x) = F_{k-1}(x)$.

By the construction of $F_{k-1}(x)$, we have

$$F_{k-1}(x) \le \varphi(x) \quad \forall x \in [a,b]$$

and

$$F_{k-1}(x^i) = \varphi(x^i) \quad (i=1,\dots,k-1)\,.$$

It follows from the construction of $F_{k-1}$ that the maximal error after k−1 iterations occurs when $\varphi(x) = F_{k-1}(x)$, i.e., we have

$$\max\,\{\delta_{k-1}(\varphi): \varphi \in \Phi_{k-1}(f)\} = \min_{i=1,\dots,k-1} f(x^i) - \min F_{k-1}([a,b])\,, \qquad (19)$$

which is attained at the Piyavskii evaluation point $x^{p,k}$.

In order to investigate the worst case error in the next iteration, first note that there exist functions $\varphi \in \Phi_{k-1}(f)$ satisfying

$$\varphi(x) \ge \min_{i=1,\dots,k-1} \varphi(x^i) = \min_{i=1,\dots,k-1} f(x^i) \quad \forall x \in [a,b]\,. \qquad (20)$$

Furthermore, let $a_{k-1}, b_{k-1} \in \{x^1,\dots,x^{k-1}\}$, $a_{k-1} < b_{k-1}$, be the nearest evaluation points to the left and to the right of $x^{p,k}$, respectively, i.e., we have

$$a_{k-1} < x^{p,k} < b_{k-1}$$

(cf. (15), (16), where $a_{k-1}$, $b_{k-1}$ correspond to $y^j$, $y^{j+1}$).

Let $x^k$ denote the next evaluation point, and consider the two cases $x^k \notin [a_{k-1}, b_{k-1}]$ and $x^k \in [a_{k-1}, b_{k-1}]$. Denote

$$M_{k-1} = \min F_{k-1}([a,b])$$

and let $f_{k-1} = \min\,\{f(x^1),\dots,f(x^{k-1})\}$.

Suppose that $x^k \notin [a_{k-1}, b_{k-1}]$. Then there are functions $\varphi \in \Phi_{k-1}(f)$ which coincide with $F_{k-1}$ on $[a_{k-1}, b_{k-1}]$ and satisfy $\varphi(x^k) \ge f_{k-1}$, so that $\delta_k(\varphi) = f_{k-1} - M_{k-1}$ (cf. (17)). In this case, the maximal error is not improved by the choice of $x^k$, and

$$\max\,\{\delta_k(\varphi): \varphi \in \Phi_{k-1}(f)\} = f_{k-1} - M_{k-1}\,. \qquad (21)$$

Now suppose that $x^k \in [a_{k-1}, b_{k-1}]$. In this case, we have $x^k \ne a_{k-1}$ and $x^k \ne b_{k-1}$, since $a_{k-1}$, $b_{k-1}$ are previous evaluation points, and the optimal error reduction is obtained by setting $x^k = x^{p,k}$.

The algorithms for solving problem (UL$_\varepsilon$) discussed so far regard the Lipschitz constant L as known a priori. We would like to mention that Strongin (1973 and 1978) proposes an algorithm that, instead of L, uses an estimate of L which is a multiple of the greatest absolute value of the slopes of the lines joining successive evaluation points. Convergence to a global minimum ($\varepsilon = 0$) can be guaranteed whenever this estimate is a sufficiently large upper bound for the Lipschitz constant L (for details, see Strongin (1978), Hansen et al. (1989), Hansen and Jaumard (1995)).

2. BRANCH AND BOUND ALGORITHMS

In this section, the branch and bound concept developed in Chapter IV will be applied to certain Lipschitz optimization problems. We begin with an interpretation of Piyavskii's univariate algorithm as a branch and bound procedure. Then the case of an n-dimensional rectangular feasible set D is considered, where a generalization of Piyavskii's univariate algorithm and an axiomatic approach are discussed in the branch and bound framework. Finally, it is recalled from Chapter IV that branch and bound methods can be designed for minimizing Lipschitz functions over convex sets, over intersections of a convex set with finitely many complements of convex sets, and over sets defined by a finite number of Lipschitz inequalities. The resulting approach will be applied to global optimization problems with indefinite separable quadratic constraints.

2.1. Branch and Bound Interpretation of Piyavskii's Univariate Algorithm

Let the first k (k ≥ 3) evaluation points of Piyavskii's algorithm for solving problem (UL) be ordered in such a way that we have $a = y^1 \le y^2 \le \dots \le y^k = b$, where $\{y^1,\dots,y^k\} = \{x^1,\dots,x^k\}$. Obviously, the intervals

$$M_{k,i} = [y^i, y^{i+1}] \quad (i=1,\dots,k-1)$$

define a partition of [a,b], and

$$\beta(M_{k,i}) = \frac{f(y^{i+1}) + f(y^i)}{2} - L\,\frac{y^{i+1} - y^i}{2} \quad (i=1,\dots,k-1) \qquad (22)$$

constitute the associated lower bounds (cf. (15) and (16)). Changing notation slightly, let $x^{k,i} \in \{y^i, y^{i+1}\}$ satisfy $f(x^{k,i}) = \min\,\{f(y^i), f(y^{i+1})\}$; set

$$\alpha(M_{k,i}) = f(x^{k,i}) \quad (i=1,\dots,k-1)\,, \qquad (23)$$

and let the current iteration point $x^k \in \{x^{k,i}: i=1,\dots,k-1\}$ be such that

$$f(x^k) = \alpha_k = \min\,\{\alpha(M_{k,i}): i=1,\dots,k-1\}\,. \qquad (24)$$

Subdivide an interval $M_{k,j}$ satisfying

$$\beta_k = \beta(M_{k,j}) = \min\,\{\beta(M_{k,i}): i=1,\dots,k-1\} \qquad (25)$$

into the two intervals $[y^j, z^k]$, $[z^k, y^{j+1}]$, where

$$z^k = \frac{y^{j+1} + y^j}{2} + \frac{f(y^j) - f(y^{j+1})}{2L} \qquad (26)$$

is the next evaluation point (cf. (15)).

Rearranging the evaluation points $y^1,\dots,y^k,z^k$ in the order of increasing values to obtain $a = y^1 \le y^2 \le \dots \le y^{k+1} = b$, $\{y^1,\dots,y^k,z^k\} = \{y^1,\dots,y^{k+1}\}$, we see that all of the ingredients of a BB procedure as described in Section IV.1 are at hand. In this way, Piyavskii's algorithm can be interpreted as a branch and bound method, in which deletion operations can be used to reduce memory space compared to the version presented in XI.1.2.

Proposition XI.2. Consider Piyavskii's algorithm for solving problem (UL) in its branch and bound interpretation described above. Then we have

$$\lim_{k \to \infty} \beta_k = \min f([a,b]) = \lim_{k \to \infty} \alpha_k\,,$$

and every accumulation point of $\{x^k\}$ is an optimal solution of problem (UL).

Proof. We refer to the convergence theory of the BB procedure presented in Chapter IV, and we verify the assumptions of Theorem IV.3 and Corollary IV.3.

The selection is obviously bound improving, and the set $\{x \in [a,b]: f(x) \le f(x^1)\}$ is compact by continuity of f. It remains to verify consistency of the bounding operation.

Let $\{M_{k_q}\}$ be a nested sequence of partition intervals generated by the algorithm. The sequence $\alpha(M_{k_q})$ of associated upper bounds is nonincreasing and bounded from below by $\min f([a,b])$; the sequence $\beta(M_{k_q})$ of associated lower bounds is nondecreasing and bounded from above by $\min f([a,b])$. Hence, we have the existence of limits $\alpha$, $\beta$ satisfying

$$\alpha = \lim_{q \to \infty} \alpha(M_{k_q}) \ge \min f([a,b]) \ge \lim_{q \to \infty} \beta(M_{k_q}) = \beta\,. \qquad (27)$$

Now consider the sequence $\{z^k\}$ of evaluation points $z^k$ generated at each Step k. Since $M_{k_q}$ is subdivided in iteration $k_q + 1$, we have

$$z^{k_q+1} \in M_{k_q} \quad \text{and} \quad \alpha(M_{k_{q+1}}) \le f(z^{k_q+1})\,. \qquad (28)$$

Let $\bar{z}$ be an accumulation point of the subsequence $\{z^{k_q+1}\}$ of $\{z^k\}$. It follows from (28) and the continuity of f that

$$\alpha \le f(\bar{z}) \qquad (29)$$

holds.

Now suppose that there is an $\varepsilon > 0$ such that

$$f(\bar{z}) > \beta + \varepsilon\,. \qquad (30)$$

By definition of $\bar{z}$, there is a subsequence $\{z^{k_{q'}+1}\} \subset \{z^{k_q+1}\}$ and a $q_\varepsilon \in \mathbb{N}$ such that, for $q' > q_\varepsilon$,

$$|z^{k_{q'}+1} - \bar{z}| < \frac{\varepsilon}{2L}\,. \qquad (31)$$

It follows from the Lipschitz continuity of f and from (30) that

$$f(z^{k_{q'}+1}) \ge f(\bar{z}) - L\,|z^{k_{q'}+1} - \bar{z}| > \beta + \frac{\varepsilon}{2} \quad \forall q' > q_\varepsilon\,.$$

Hence, for $x \in [a,b]$ satisfying $|x - z^{k_{q'}+1}| < \frac{\varepsilon}{2L}$, $q' > q_\varepsilon$, we have

$$F_k(x) \ge f(z^{k_{q'}+1}) - L\,|x - z^{k_{q'}+1}| > \beta \quad (k \ge k_{q'}+1)\,. \qquad (32)$$

However, by construction of BB procedures, every evaluation point satisfies $F_k(z^{k+1}) = \beta_k \le \beta$, and hence, from (32), it follows that

$$z^k \notin \{x: |x - z^{k_{q'}+1}| < \tfrac{\varepsilon}{2L}\} \quad \forall k > k_{q'}+1,\ q' > q_\varepsilon\,.$$

This contradicts the assumption that $\bar{z}$ is an accumulation point of $\{z^k\}$. Therefore, there is no $\varepsilon > 0$ such that (30) holds, and this (by using (29)) implies that

$$\alpha \le f(\bar{z}) \le \beta\,.$$

Considering (27), we finally see that there must hold

$$\alpha = \min f([a,b]) = \beta\,,$$

from which consistency follows.



In the preceding discussion of Lipschitz optimization it was always assumed that the Lipschitz constant L is used. It is obvious, however, that, for a given Lipschitz function, there are infinitely many Lipschitz constants, i.e., if L is a Lipschitz constant, then all numbers L' > L are Lipschitz constants as well. Let f be a Lipschitz function on [a,b] and let

$$L = \inf\,\{L': L' \text{ is a Lipschitz constant of } f \text{ on } [a,b]\}\,.$$

Then, in practice, we often know only some L' > L. Assuming this and applying Piyavskii's algorithm with L' > L instead of L, Proposition XI.2 can also be derived along the very simple lines presented in Section IV.4.5.

To see this, consider the subdivision of the interval $[y^j, y^{j+1}]$ into two subintervals $[y^j, z^k]$, $[z^k, y^{j+1}]$ as described by (26). From (26), it follows that

$$\max\,\{y^{j+1} - z^k,\ z^k - y^j\} \le \frac{y^{j+1} - y^j}{2} + \frac{|f(y^{j+1}) - f(y^j)|}{2L'}$$

and, using the Lipschitz continuity with constant L,

$$\max\,\{y^{j+1} - z^k,\ z^k - y^j\} \le \frac{1}{2}\,\Big(1 + \frac{L}{L'}\Big)(y^{j+1} - y^j)\,. \qquad (33)$$

Considering a nested sequence $\{M_q\}$ of successively refined partition intervals $M_q$ with length $\delta(M_q)$, we see that

$$\delta(M_{q+1}) \le \gamma\,\delta(M_q)\,, \qquad (34)$$

where $\gamma = \frac{1}{2}\,(1 + \frac{L}{L'}) < 1$. This establishes the exhaustiveness of the subdivision procedure.
cedure.

Consistency is then an obvious consequence of (22), (23), and the continuity of f.

2.2. Branch and Bound Methods for Minimizing a Lipschitz Function over an
n-dimensional Rectangle

Now let the feasible set D be an n-dimensional interval, i.e., there are vectors $a,b \in \mathbb{R}^n$, $a < b$, such that

$$D = \{x \in \mathbb{R}^n: a \le x \le b\}\,, \qquad (35)$$

where the inequalities are understood with respect to the componentwise ordering of $\mathbb{R}^n$. Let the objective function f be Lipschitzian on D with Lipschitz constant L, and consider the global optimization problem

$$\min_{x \in D}\ f(x)\,. \qquad (36)$$

As shown in Section XI.1.2, Piyavskii's univariate algorithm can easily be formulated for the case of problem (36), and one obtains a corresponding convergence result as a straightforward n-dimensional extension of Proposition XI.2 (cf. Horst and Tuy (1987)). Since, however, the computational effort in solving the corresponding subproblems is enormous in dimension n ≥ 2, we prefer to present branch and bound extensions of Piyavskii's approach that essentially apply the univariate version to the main diagonal of rectangular partition sets.

Let L' > L be an upper bound for the optimal Lipschitz constant L. Denote by $a_M$, $b_M$ the lower left and upper right vertex of an n-rectangle M, respectively, i.e., we have

$$M = \{x \in \mathbb{R}^n: a_M \le x \le b_M\}\,.$$
Algorithm XI.4 (Prototype Diagonal Extension of Piyavskii's saw-tooth cover)

Step 0 (Initialization):

Set $M_0 = D$, $\mathcal{M}_0 = \{M_0\}$,

$$\alpha_0 = \min\,\{f(a), f(b)\}\,, \quad x^0 \in \{a,b\} \text{ such that } \alpha_0 = f(x^0)\,, \qquad (37)$$

$$\beta_0 = \max\,\{f(a), f(b)\} - L'\,\|b - a\|\,. \qquad (38)$$

Go to Step 1.

Step k (k=1,2,...):

At the beginning of Step k we have the current rectangular partition $\mathcal{M}_{k-1}$ of a subset of $M_0 = D$ which is still of interest, and for every $M \in \mathcal{M}_{k-1}$ we have bounds $\beta(M)$, $\alpha(M)$ satisfying

$$\beta(M) \le \min f(M) \le \alpha(M)\,.$$

Moreover, we have the current lower and upper bounds $\beta_{k-1}$, $\alpha_{k-1}$ satisfying

$$\beta_{k-1} \le \min f(D) \le \alpha_{k-1}$$

and a point $x^{k-1}$ such that $f(x^{k-1}) = \alpha_{k-1}$.

k.1. Delete all $M \in \mathcal{M}_{k-1}$ satisfying

$$\beta(M) \ge \alpha_{k-1}\,.$$

Let $\mathcal{R}_k$ be the collection of remaining partition sets in $\mathcal{M}_{k-1}$.

k.2. Select a collection $\mathcal{P}_k \subset \mathcal{R}_k$ satisfying

$$\mathcal{P}_k \cap \operatorname{argmin}\,\{\beta(M): M \in \mathcal{R}_k\} \ne \emptyset\,. \qquad (39)$$

For each $M \in \mathcal{P}_k$ choose

$$x_M = \frac{1}{2}\,(a_M + b_M) + \frac{f(a_M) - f(b_M)}{2L'\,\|b_M - a_M\|}\,(b_M - a_M)\,, \qquad (40)$$

and subdivide M into two n-dimensional subintervals using the hyperplane which contains $x_M$ and is orthogonal to one of the longest edges of M.

Let $\mathcal{M}'_k$ be the collection of new partition elements. For each $M' \in \mathcal{M}'_k$ denote by $\bar{M}' \in \mathcal{P}_k$ the rectangle whose subdivision generated M'.

k.3. For each $M' \in \mathcal{M}'_k$ set

$$\alpha(M') = \min\,\{f(a_{M'}),\ f(b_{M'}),\ f(x_{\bar{M}'})\}\,, \qquad (41)$$

$$\beta(M') = \max\,\{\beta(\bar{M}')\,,\ \max\{f(a_{M'}), f(b_{M'}), f(x_{\bar{M}'})\} - L'\,\|b_{M'} - a_{M'}\|\}\,. \qquad (42)$$

k.4. Set $\mathcal{M}_k = (\mathcal{R}_k \setminus \mathcal{P}_k) \cup \mathcal{M}'_k$.

Compute

$$\alpha_k = \min\,\{\alpha(M): M \in \mathcal{M}_k\}\,, \quad \beta_k = \min\,\{\beta(M): M \in \mathcal{M}_k\}\,.$$

Let $x^k \in D$ be such that $f(x^k) = \alpha_k$.

k.5. If $\alpha_k - \beta_k = 0$ $(\le \varepsilon)$, then stop. Otherwise, go to Step k+1.
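For one rectangle $M = [a_M, b_M]$, the quantities used in Steps k.2 and k.3 can be sketched as follows (our code with our names; the max with the parent bound in (42) is left to the caller, and a nondegenerate rectangle is assumed):

import math

def norm(u):
    return math.sqrt(sum(t * t for t in u))

def diagonal_step(f, a, b, L_prime):
    # a, b: lists (lower-left and upper-right vertices of M); f: objective.
    d = [bi - ai for ai, bi in zip(a, b)]
    fa, fb = f(a), f(b)
    t = 0.5 + (fa - fb) / (2.0 * L_prime * norm(d))   # t in [0,1] if L' >= L
    x_M = [ai + t * di for ai, di in zip(a, d)]        # diagonal point (40)
    alpha = min(fa, fb, f(x_M))                        # upper bound as in (41)
    beta = max(fa, fb, f(x_M)) - L_prime * norm(d)     # lower bound as in (42),
    return x_M, alpha, beta                            # before max with parent bound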



Proposition XI.3. Consider Algorithm XI.4 (with $\varepsilon = 0$). Then we have

$$\lim_{k \to \infty} \beta_k = \min f(D) = \lim_{k \to \infty} \alpha_k\,,$$

and every accumulation point $\bar{x}$ of $\{x^k\}$ satisfies $f(\bar{x}) = \min f(D)$.

Proof. Proposition XI.3 readily follows from Theorem IV.3, Corollary IV.3 and Proposition IV.3 if the subdivision procedure is exhaustive. But from (33), (34), we see that the length of the diagonal $[a_M, b_M]$ is reduced by a factor $\gamma = \frac{1}{2}\,(1 + \frac{L}{L'}) < 1$. Exhaustiveness then follows (cf. also the proof of Proposition IV.2).

Remarks XI.1. (i) Many variants of this algorithm are possible. For example, recall from Chapter IV that, if we choose a subset $V'(M) \supset \{a_M, b_M\}$ of the vertices of a partition set M, the lower and upper bounds may be determined by

$$\alpha(M) = \min f(V'(M))$$

and

$$\beta(M) = \max f(V'(M)) - L'\,\|b_M - a_M\|\,,$$

respectively. Moreover, any exhaustive subdivision process will yield a convergent procedure as long as one of these bounds or any uniformly better bound is used.

(ii) In practical computation, an interval M will not be subdivided further when its diameter $\|b_M - a_M\|$ is less than a fixed parameter $\delta > 0$.

(iii) Whenever adaptive estimates L(M) of the Lipschitz constant of f over current intervals are available, then, of course, these should be used instead of L'.
(iv) When the branch and bound algorithm is stopped at an iteration point $x^k$ because we have $\alpha_k - \beta_k < \varepsilon$ or $\|b_M - a_M\| < \delta$ for all sets M which are still of interest, then, of course, a local search starting from $x^k$ could lead to an improved approximation of a global minimum.

A so-called "axiomatic" approach to problem (36) has been proposed by Pinter (1986 and 1986a). This approach uses typical branch and bound elements, such as partitions of the feasible n-interval D into finite sets of n-intervals and refinement by subdivision of selected partition elements. The selection of partition elements is governed by a selector function that takes into account the size of a given partition interval as well as the objective function values at its vertices. Each selected n-interval M is subdivided into $2^n$ subintervals using all of the hyperplanes through selected interior points of M that are parallel to the facets of M.

However, the current iteration point in Pinter's approach is not necessarily the best feasible point obtained so far. Moreover, Pinter's method does not make use of lower bounds, and hence it does not provide estimates of the quality of a current iteration point or a deletion rule to remove partition elements not of interest.

In Horst and Tuy (1987) it was shown, however, that Pinter's approach can readily be modified, improved and generalized by viewing it within the framework of branch and bound methods discussed in Chapter IV. A simplified and slightly generalized version of the presentation in Horst and Tuy (1987) follows.

Consider Algorithm XI.4 with the following modifications:

(i) Upper and lower bounds are determined on the complete vertex set V(M) of a partition interval M (cf. Remark XI.1 (i)).

Following Pinter (1986 and 1986a), the vertex set V(M) can be described by an $(n \times 2^n)$ matrix X(M) whose columns are the lexicographically ordered vertices of M. The corresponding values of the objective function f at the vertices of M define a $2^n$-vector z(M).

Using this notation, in Step k.3 of Algorithm XI.4 we replace the bounds (41), (42) by

$$\alpha(M') = \min\,\{z_i(M'): i=1,\dots,2^n\} \qquad (41')$$

and

$$\beta(M') = \max\,\{\beta(\bar{M}')\,,\ \max\{z_i(M'): i=1,\dots,2^n\} - L'\,\|b_{M'} - a_{M'}\|\}\,. \qquad (42')$$

(ii) Rule (39) in Step k.2 of Algorithm XI.4, which selects the n-intervals to be subdivided further, is replaced by the following procedure.

Let R(X(M), z(M)) be a suitable real-valued function of the vertices of M (represented by the matrix X(M)) and of the vector z(M) of the objective function values at these vertices. A function R(X(M), z(M)) is suitable if it satisfies the requirements R.2 - R.5 listed below. Given such a suitable (selector) function, select

$$\mathcal{P}_k = \{M \in \mathcal{R}_k: R(X(M), z(M)) = \max\,\{R(X(M'), z(M')): M' \in \mathcal{R}_k\}\}\,. \qquad (39')$$

Subdivide each $M \in \mathcal{P}_k$ into r n-intervals, $2 \le r \le 2^n$, using hyperplanes parallel to certain facets of M that pass through a chosen interior point of M. Any subdivision of that kind is admitted as long as the following requirement R.1 is fulfilled.

R.1. The subdivision is exhaustive.

The requirements for a suitable selector function R(X(M), z(M)) are as follows.

R.2. R(X(M), z(M)) is continuous in (X(M), z(M)) and, for every decreasing sequence $\{M_q\}$ of n-intervals $M_q \subset D$, the limit $\lim_{q \to \infty} R(X(M_q), z(M_q))$ exists. If $\{M_q\}$ converges to a singleton $\{\bar{x}\}$, then

$$R(\bar{X}, \bar{z}) = \lim_{q \to \infty} R(X(M_q), z(M_q)) \qquad (43)$$

holds. In (43), $\bar{X}$ is the $(n \times 2^n)$-matrix having $2^n$ identical columns $\bar{x} \in \mathbb{R}^n$, and $\bar{z}$ is the vector of $2^n$ identical components $f(\bar{x})$.

R.3. R(X(M), z(M)) is translation-invariant with respect to M, i.e., for an arbitrary vector $c \in \mathbb{R}^n$ satisfying $M + c \subset D$, we have

$$R(X(M), z(M)) = R(X(M + c), z(M))\,. \qquad (44)$$

R.4. R(X(M), z(M)) is strictly monotonically decreasing in z(M), i.e., for an arbitrary $d \in \mathbb{R}^{2^n}$, $d \ne 0$, $d \ge 0$ (componentwise), we have

$$R(X(M), z(M)) < R(X(M), z(M) - d)\,. \qquad (45)$$

R.5. If M is an n-interval and $\bar{x} \in M$, then

$$R(\bar{X}, \bar{z}) < R(X(M), z(M))\,, \qquad (46)$$

where $\bar{X}$, $\bar{z}$ are defined as in R.2.


Example XI.1. Consider the branch-and-bound interpretation of Piyavskii's uni-
variate algorithm as discussed in Section XI.2.l. Suppose that a Lipschitz constant

L' > L is used, where L denotes the infimum taken over al1 Lipschitz constants of f
on D = [a,b]. Then exhaustiveness of the subdivision (26) has been demonstrated in
Section XI.2.l., Le., the requirement R.l. holds.

Let

47)

where M = {x E IR: x 1(M) ~ x ~ x 2(M)} and zl(M) = f(i(M», z2(M) = f(x 2(M».
Note that (-R(x(M), z(M» describes Piyavskii's lower bound (cf. (22), where

slightly different notation is used).



The function R(X(M), z(M)) defined by (47) is obviously continuous in its arguments $x^i(M)$, $z^i(M)$ (i=1,2), and from the continuity of f we see that requirement R.2 is satisfied. Requirements R.3 and R.4 obviously hold. Finally, in order to verify R.5, let $\bar{x} \in M$, $\bar{z} = f(\bar{x})$. Then

$$R(\bar{X}, \bar{z}) = -\bar{z} = -\frac{1}{2}\,(\bar{z} + \bar{z})$$
$$\le -\frac{1}{2}\,\big[z^1(M) - L(\bar{x} - x^1(M))\big] - \frac{1}{2}\,\big[z^2(M) - L(x^2(M) - \bar{x})\big]$$
$$= L\,\frac{x^2(M) - x^1(M)}{2} - \frac{z^2(M) + z^1(M)}{2}$$
$$< L'\,\frac{x^2(M) - x^1(M)}{2} - \frac{z^2(M) + z^1(M)}{2} = R(X(M), z(M))\,.$$

Proposition XI.4. Suppose that in the above branch and bound interpretation of Pinter's method the requirements R.1 - R.5 are satisfied. Then, if the algorithm is infinite, we have

$$\lim_{k \to \infty} \alpha_k = \min f(D)\,,$$

and every accumulation point $\bar{x}$ of $\{x^k\}$ satisfies $f(\bar{x}) = \min f(D)$.

Proof. We show that the assumptions of Theorem IV.2 and Corollary IV.2 are satisfied.

Since every Lipschitz function is continuous, and hence $\{x \in D: f(x) \le f(x^0)\}$ is compact because of the compactness of D, it remains to verify that the bounding operation is consistent and the selection operation is complete.

Requirement R.1 (exhaustiveness of the subdivision process) implies that every nested sequence of n-intervals $\{M_{k_q}\}$ generated by the algorithm converges to a singleton $\{\bar{x}\}$; hence we have

$$\lim_{q \to \infty} \delta(M_{k_q}) = 0\,,$$

where $\delta(M_{k_q})$ denotes the diameter $\|b_{M_{k_q}} - a_{M_{k_q}}\|$ of $M_{k_q}$. Using (41'), (42') and the continuity of f, we see that

$$\lim_{q \to \infty} \alpha(M_{k_q}) = f(\bar{x}) \quad \text{and} \quad \lim_{q \to \infty} \beta(M_{k_q}) \ge f(\bar{x})\,;$$

hence $\lim_{q \to \infty} \beta(M_{k_q}) = f(\bar{x})$, because $\alpha(M_{k_q}) \ge \beta(M_{k_q})$ for all q. This implies consistency of the bounding operation.

To verify completeness, let $\bar{x}$ be an arbitrary point of a partition set

$$M \in \bigcup_{p=1}^{\infty}\ \bigcap_{k=p}^{\infty} \mathcal{R}_k\,;$$

i.e., there is a $k_0 \in \mathbb{N}$ such that M is not subdivided further if $k > k_0$. We have to show that $\inf f(M \cap D) = \inf f(M) \ge \alpha$, where $\alpha = \lim_{k \to \infty} \alpha_k$.

Let M be represented by the $(n \times 2^n)$ matrix X = X(M) of its vertices, and let z = z(M) be the corresponding vector of the values of f at the vertices of M. Consider any nested sequence $\{M_{k_q}\}$ of intervals where $M_{k_q}$ is subdivided in Step $k_q > k_0$. In order to simplify notation, set $M_q = M_{k_q}$, let $X_q$ be the matrix representation of $M_q$, and let $z^q$ be the associated vector of the function values at the vertex set $V(M_q)$ of $M_q$.

The selection rule (39') implies that

$$R(X_q, z^q) \ge R(X(M), z(M)) \quad \forall q\,, \qquad (48)$$

and because of R.1 there is a point $x^* \in D$ such that $\lim_{q \to \infty} M_q = \{x^*\}$.

Let $X^*$, $z^*$ be the quantities associated to $x^*$ defined as in R.2, and take the limit as $q \to \infty$ in (48). Then R.2 and R.5 yield

$$R(X^*, z^*) = \lim_{q \to \infty} R(X_q, z^q) \ge R(X(M), z(M)) > R(\bar{X}, \bar{z})\,, \qquad (49)$$

where $\bar{X}$, $\bar{z}$ correspond to $\bar{x}$ as specified above.

Using R.3, from (43) we obtain

$$R(\bar{X}, z^*) = R(X^*, z^*) > R(\bar{X}, \bar{z})\,, \qquad (50)$$

which, by R.4, implies that $z^* < \bar{z}$, i.e.,

$$f(x^*) < f(\bar{x})\,. \qquad (51)$$

Now consider the sequence of points $x^q \in M_q$ satisfying $f(x^q) = \alpha(M_q)$. Since $M_q \to \{x^*\}$, $\alpha(M_q) \ge \alpha$, and since f is continuous, it follows that

$$\alpha \le \lim_{q \to \infty} \alpha(M_q) = f(x^*)\,;$$

and hence, by (51), $\alpha < f(\bar{x})$.

Since $\bar{x}$ is an arbitrary point of M, it follows that $\inf f(M) \ge \alpha$, i.e., the selection process is complete.

A class of functions R(X(M), z(M)) that satisfies R.2 - R.5 is proposed in Pinter (1986a).

Let $x^j(M)$ $(j=1,\dots,2^n)$ denote the $2^n$ lexicographically ordered vertices of M, and consider selector functions of the form

$$R(X(M), z(M)) = R_1\Big(\sum_{i=1}^{n} \big(x_i^{2^n}(M) - x_i^1(M)\big)\Big) + R_2\Big(\frac{1}{2^n} \sum_{j=1}^{2^n} z_j(M)\Big)\,, \qquad (52)$$

where $R_1: \mathbb{R}_+ \to \mathbb{R}$ is strictly monotonically increasing and satisfies $R_1(0) = 0$, whereas $R_2: \mathbb{R} \to \mathbb{R}$ is strictly monotonically decreasing. Furthermore, it is assumed that $R_1$ and $R_2$ are continuous in their respective domains.

Recall that $z_j(M) = f(x^j(M))$ $(j=1,\dots,2^n)$. The function $R_1$ depends only on the "lower left" vertex $x^1$ and the "upper right" vertex $x^{2^n}$, whereas $R_2$ takes into account the function values at all of the vertices of M.

It is easily seen that under the above assumptions the requirements R.2, R.3, R.4 are satisfied. In order to meet requirement R.5, suppose in addition that $R_2$ is Lipschitzian with Lipschitz constant $L(R_2)$ and that

$$R_1(y) \ge L(R_2) \cdot L \cdot y \quad \forall y \in \mathbb{R}_+\,, \qquad (53)$$

where L is the Lipschitz constant of f.

Let $\bar{x} \in M$, and consider $\bar{x}$ as an n-interval $\bar{M}$ with coinciding vertices $x^j(\bar{M}) = \bar{x}$ $(j=1,\dots,2^n)$. Since $R_1(0) = 0$ and $z_j(\bar{M}) = f(\bar{x})$ $(j=1,\dots,2^n)$, we see that

$$R(\bar{X}, \bar{z}) = R(X(\bar{M}), z(\bar{M})) = R_2(f(\bar{x}))\,.$$

By (53), the inequality $R(\bar{X}, \bar{z}) < R(X(M), z(M))$ of R.5 then follows from the following chain of relations:

$$R_2(f(\bar{x})) - R_2\Big(\frac{1}{2^n}\sum_{j=1}^{2^n} z_j(M)\Big) \le L(R_2)\,\Big|f(\bar{x}) - \frac{1}{2^n}\sum_{j=1}^{2^n} z_j(M)\Big|$$
$$< L(R_2) \cdot L \cdot \sum_{i=1}^{n} \big(x_i^{2^n}(M) - x_i^1(M)\big) \le R_1\Big(\sum_{i=1}^{n} \big(x_i^{2^n}(M) - x_i^1(M)\big)\Big)\,.$$

Example XI.2. An example of functions $R_1$, $R_2$ satisfying the above assumptions is given by

$$R_1 = c \cdot \sum_{i=1}^{n} \big(x_i^{2^n}(M) - x_i^1(M)\big) \quad (c \in \mathbb{R}_+,\ c > L)\,,$$

and by a strictly decreasing, Lipschitzian function $R_2$ constructed by means of a lower bound $p_0$ for $\min f(D)$ in such a way that $L(R_2) < 1$. Then $R_1(y) = cy \ge L(R_2) \cdot L \cdot y$ for all $y \in \mathbb{R}_+$, so that (53) holds.

2.3. Branch and Bound Methods for Solving Lipschitz Optimization Problems with
General Constraints

The purpose of this section is to recall certain classes of general multiextremal global optimization problems

$$\text{minimize } f(x) \quad \text{s.t. } x \in D \qquad (54)$$

and the corresponding branch and bound procedures that were already treated in Chapter IV.

Let the compact feasible set D and the objective function $f: \mathbb{R}^n \to \mathbb{R}$ belong to one of the following classes:

Feasible set D:

(D1) - robust, convex and defined by a finite number of convex constraints;

(D2) - robust, intersection of a convex set with finitely many complements of convex sets and defined by a finite number of convex and reverse convex constraints;

(D3) - defined by a finite number of Lipschitzian inequalities.

Objective function f:

(f1) - convex,

(f2) - concave,

(f3) - d.c.,

(f4) - Lipschitzian.

In Chapter IV we saw that a convergent branch and bound procedure can be developed for every optimization problem (54) where D and f belong to one of the above classes, respectively. One has to apply the prototype BB procedure of Section IV.1 with

- an appropriate choice of partition sets (e.g., simplices, rectangles, cones (cf. Section IV.3));

- an exhaustive subdivision process (cf. Section IV.3, Chapter VII);

- bound improving selection (cf. Definition IV.6);

- an appropriate bounding operation (cf. Section IV.4);

- an appropriate "deletion by infeasibility" rule (cf. Section IV.5).

We expect the reader to be able to formulate a convergent branch and bound procedure following the lines of the discussion in Chapter IV for each of the resulting problem classes (cf. also Horst (1988 and 1989)). More efficient approaches can be constructed for the important case of linearly constrained Lipschitz and d.c. problems (cf. Horst, Nast and Thoai (1995), Horst and Nast (1996)). These will be discussed in Section 2.5.

It is also recalled from Section I.4.2 that broad classes of systems of equalities and (or) inequalities can be solved by transforming them into an equivalent optimization problem out of the above classes (for some numerical results, see Horst and Thoai (1988)).

2.4. Global Optimization of Concave Functions Subject to Separable Quadratic
Constraints

The bounds provided by the general methods referred to in the preceding section can often be improved for problems with additional structure. In this section, as an example, we consider problems of the following form:

$$\begin{array}{ll} \text{minimize} & f(x) \\ \text{s.t.} & g_i(x) = \sum\limits_{k=1}^{n} \big(\tfrac{1}{2}\,p_{ik} x_k^2 + q_{ik} x_k + r_{ik}\big) \le 0 \quad (i=1,\dots,m)\,, \\ & \underline{m}_k \le x_k \le \overline{m}_k \quad (k=1,\dots,n)\,, \end{array} \qquad (55)$$

where $p_{ik}, q_{ik}, r_{ik}, \underline{m}_k, \overline{m}_k$ $(i=1,\dots,m;\ k=1,\dots,n)$ are given real numbers, and f(x) is a real-valued concave function defined on an open convex set containing the rectangle $R := \{x \in \mathbb{R}^n: \underline{m} \le x \le \overline{m}\}$, where $\underline{m} = (\underline{m}_1,\dots,\underline{m}_n)^T$, $\overline{m} = (\overline{m}_1,\dots,\overline{m}_n)^T$. Denote by D the feasible set of problem (55).

Note that several other problems of importance can be included under problem (55). For example, the problem of globally minimizing a separable, possibly indefinite quadratic form subject to separable quadratic constraints, i.e., the problem

$$\text{minimize } f_0(x) = \sum_{k=1}^{n} \big(\tfrac{1}{2}\,p_{0k} x_k^2 + q_{0k} x_k + r_{0k}\big) \quad \text{s.t. } x \in D\,, \qquad (56)$$

where $p_{0k}, q_{0k}, r_{0k} \in \mathbb{R}$ $(k=1,\dots,n)$, is equivalent to

$$\text{minimize } t \quad \text{s.t. } x \in D\,,\ f_0(x) \le t \qquad (56')$$

with the additional variable $t \in \mathbb{R}$.



Practical problems that can be formulated as in (55) include location problems, production planning, and minimization of chance-constrained risks (cf., e.g., Phan (1982)). Problem (56) also arises as a subproblem in certain bilevel programs (Stackelberg games, cf. Al-Khayyal et al. (1991), Vicente and Calamai (1995)) and from some VLSI chip design problems (e.g., Maling et al. (1982)).

A direct application of the branch and bound methods developed in Chapter IV would probably use exhaustive rectangular partitions, lower bounding by means of vertex minima (cf. Section IV.4.5, Example IV.2) and the "Lipschitzian" deletion-by-infeasibility rule (DR3) (cf. Section IV.5).

A Lipschitz constant L(M) for a separable indefinite quadratic function $\sum_{k=1}^{n} (\tfrac{1}{2}\,p_k x_k^2 + q_k x_k + r_k)$ on a rectangle $M = \{x \in \mathbb{R}^n: a_k \le x_k \le b_k\,,\ k=1,\dots,n\}$ was derived in Section I.4.1:

$$L(M) = \Big(\sum_{k=1}^{n} \ell_k^2\Big)^{1/2}\,, \qquad (57)$$

where

$$\ell_k = \max\,\{|p_k a_k + q_k|\,,\ |p_k b_k + q_k|\} \quad (k=1,\dots,n)\,.$$
In this way we are led to the following algorithm.

Algorithm XI.5.

Step 0 (Initialization):

Let $M_0 = R$, choose a finite set $S_{M_0} \subset D$ ($S_{M_0}$ possibly empty), and determine $\beta(M_0) = \min f(V(M_0))$ (where V(M) denotes the vertex set of M), $\alpha_0 = \min f(S_{M_0})$ ($\alpha_0 = \infty$ if $S_{M_0} = \emptyset$).

Let $\mathcal{M}_0 = \{M_0\}$ and $\beta_0 = \beta(M_0)$.

If $\alpha_0 < \infty$, then choose $x^0 \in \operatorname{argmin} f(S_{M_0})$ (i.e., $f(x^0) = \alpha_0$).

If $\alpha_0 - \beta_0 = 0$ (or, in practice, $\le \varepsilon$, where $\varepsilon > 0$), then stop: $\alpha_0 = \beta_0 = \min f(D)$ (if $\alpha_0 - \beta_0 \le \varepsilon$, $x^0$ is an $\varepsilon$-approximate solution). Otherwise, set r = 1 and go to Step r.

Step r = 1,2,... :

At the beginning of Step r we have the current rectangular partition $\mathcal{M}_{r-1}$ of a subset of $M_0$ still under consideration. Furthermore, for every $M \in \mathcal{M}_{r-1}$ we have $S_M \subset M \cap D$ and bounds $\beta(M)$, $\alpha(M)$ satisfying

$$\beta(M) \le \min f(M) \le \alpha(M)\,.$$

Moreover, we have the current lower and upper bounds $\beta_{r-1}$, $\alpha_{r-1}$ satisfying

$$\beta_{r-1} \le \min f(D) \le \alpha_{r-1}\,.$$

Finally, if $\alpha_{r-1} < \infty$, then we have a point $x^{r-1} \in D$ satisfying $f(x^{r-1}) = \alpha_{r-1}$ (the best feasible point obtained so far).

r.1. Delete all $M \in \mathcal{M}_{r-1}$ satisfying $\beta(M) \ge \alpha_{r-1}$.

Let $\mathcal{R}_r$ be the collection of remaining rectangles in the partition $\mathcal{M}_{r-1}$.

r.2. Select a nonempty collection of sets $\mathcal{P}_r \subset \mathcal{R}_r$ satisfying

$$\operatorname{argmin}\,\{\beta(M): M \in \mathcal{M}_{r-1}\} \subset \mathcal{P}_r\,,$$

and subdivide every member of $\mathcal{P}_r$ by bisection (or any other exhaustive or normal subdivision yielding rectangular partitions). Let $\mathcal{P}'_r$ be the collection of all new partition elements.

r.3. Remove any $M \in \mathcal{P}'_r$ for which there is an $i \in \{1,\dots,m\}$ satisfying

$$\max\,\{g_i(x): x \in V(M)\} - L_i(M)\,\delta(M) > 0\,, \qquad (58)$$

where $L_i(M)$ is a Lipschitz constant for $g_i$ over M given as in (57), and where $\delta(M)$ is the diameter of M (rule (DR3); note that V(M) can be replaced by any nonempty subset V'(M) of V(M)).

Let $\mathcal{M}'_r$ be the collection of all remaining members of $\mathcal{P}'_r$.

r.4. Assign to each $M \in \mathcal{M}'_r$ the set $S_M \subset M \cap D$ of feasible points in M known so far and the bounds

$$\beta(M) = \min f(V(M))\,, \quad \alpha(M) = \min f(S_M) \quad (\alpha(M) = \infty \text{ if } S_M = \emptyset)\,.$$

r.5. Set $\mathcal{M}_r = (\mathcal{R}_r \setminus \mathcal{P}_r) \cup \mathcal{M}'_r$. Compute

$$\alpha_r = \min\,\{\alpha(M): M \in \mathcal{M}_r\}\,, \quad \beta_r = \min\,\{\beta(M): M \in \mathcal{M}_r\}\,.$$

If $\alpha_r < \infty$, then let $x^r \in D$ be such that $f(x^r) = \alpha_r$.

r.6. If $\alpha_r - \beta_r = 0$ $(\le \varepsilon)$, then stop: $x^r$ is an ($\varepsilon$-approximate) optimal solution. Otherwise go to Step r+1.

From the theory developed in Chapter IV we know that

$$\beta := \lim_{r \to \infty} \beta_r = \min f(D)\,.$$

Moreover, if $S_M \ne \emptyset$ for all partition sets M, then

$$\beta = \lim_{r \to \infty} \beta_r = \min f(D) = \lim_{r \to \infty} \alpha_r =: \alpha\,,$$

and every accumulation point of the sequence $\{x^r\}$ solves problem (55).

If not enough feasible points can be obtained such that $S_M \ne \emptyset$ for all partition elements M, then, as discussed in Chapter IV, one may consider the iteration sequence $\{\tilde{x}^r\}$ defined by $f(\tilde{x}^r) = \beta_r$. Although $\tilde{x}^r$ is not necessarily feasible for problem (55), we know that every accumulation point of $\{\tilde{x}^r\}$ solves problem (55).
The general algorithm uses only simple calculations to make decisions on partitioning, deleting and bounding. However, the lower bounds $\beta(M) = \min f(V(M))$ are weak, and a closer examination leads to procedures that allow improved bounding, at the expense of having to solve additional subproblems. Moreover, a mechanism needs to be devised for identifying points in $S_M \subset M \cap D$. Following Al-Khayyal, Horst and Pardalos (1992), we next attempt to obtain better bounds than $\beta(M)$ using only linear programming calculations. In the process, at times, we will also be able to identify when $M \cap D = \emptyset$, or possibly uncover feasible points of $M \cap D \ne \emptyset$ for inclusion in $S_M$.

Let

$$G = \{x: g_i(x) = \sum_{k=1}^{n} g_{ik}(x_k) \le 0 \quad (i=1,\dots,m)\}\,,$$

where $g_{ik}(x_k) = \tfrac{1}{2}\,p_{ik} x_k^2 + q_{ik} x_k + r_{ik}$ $(i=1,\dots,m;\ k=1,\dots,n)$.

Note that for each partition element M we have $M \cap D = M \cap G$, since $M \subset M_0 \subset R$ and $D = R \cap G$.
Linearization of the constraints in G:

We begin with a simple linearization of the constraints in G. Let $M = \{x \in \mathbb{R}^n: a \le x \le b\}$, and let $\varphi_{M,h}(x)$ denote the convex envelope of a function h over M. Then we know from Theorem IV.8 that

$$\varphi_{M,g_i}(x) = \sum_{k=1}^{n} \varphi_{M_k,g_{ik}}(x_k) \le g_i(x)\,, \qquad (59)$$

where $M_k = \{x_k \in \mathbb{R}: a_k \le x_k \le b_k\}$.

Each $g_{ik}(x_k)$ can be linearized according to one of the following three cases:
Case 1: $p_{ik} = 0$. Then $g_{ik}(x_k)$ is linear.

Case 2: $p_{ik} < 0$. Then $g_{ik}(x_k)$ is concave and

$$\varphi_{M_k,g_{ik}}(x_k) = \alpha_{ik} x_k + \beta_{ik}\,,$$

where

$$\alpha_{ik} = \frac{g_{ik}(b_k) - g_{ik}(a_k)}{b_k - a_k}\,, \quad \beta_{ik} = g_{ik}(a_k) - \alpha_{ik} a_k$$

(cf. the remark after Theorem IV.7).

In this case we can replace $g_{ik}(x_k)$ by $\ell^{(0)}_{ik}(x_k) = \alpha_{ik} x_k + \beta_{ik}$.

Case 3: $p_{ik} > 0$. Then $g_{ik}$ is convex. Compute $\bar{x}_k = -q_{ik}/p_{ik}$, which minimizes $g_{ik}$. If $g_{ik}(\bar{x}_k) \ge 0$, then replace $g_{ik}(x_k)$ by the constant $g_{ik}(\bar{x}_k)$ in the constraint $g_i(x) \le 0$. (This effectively enlarges the region which is feasible for that constraint.) Otherwise, continue.

Case 3a: If $\bar{x}_k < a_k$, then compute

$$\rho^{(1)}_{ik} = \frac{1}{p_{ik}}\,\big[-q_{ik} + (q_{ik}^2 - 2 p_{ik} r_{ik})^{1/2}\big]$$

($\rho^{(1)}_{ik}$ is the zero of $g_{ik}$ to the right of $\bar{x}_k$). Replace $g_{ik}(x_k)$ by the linear support of its graph at $\rho^{(1)}_{ik}$. This is given by

$$\ell^{(1)}_{ik}(x_k) = \alpha_{ik} x_k + \beta_{ik}\,, \quad \text{where } \alpha_{ik} = p_{ik}\,\rho^{(1)}_{ik} + q_{ik} > 0 \text{ and } \beta_{ik} = -\alpha_{ik}\,\rho^{(1)}_{ik}\,.$$

Case 3b: If $\bar{x}_k > b_k$, then replace $g_{ik}(x_k)$ by the linear support of its graph at

$$\rho^{(2)}_{ik} = \frac{1}{p_{ik}}\,\big[-q_{ik} - (q_{ik}^2 - 2 p_{ik} r_{ik})^{1/2}\big]\,;$$

namely,

$$\ell^{(2)}_{ik}(x_k) = \alpha_{ik} x_k + \beta_{ik}\,,$$

where $\alpha_{ik} = p_{ik}\,\rho^{(2)}_{ik} + q_{ik} < 0$ and $\beta_{ik} = -\alpha_{ik}\,\rho^{(2)}_{ik}$ ($\rho^{(2)}_{ik}$ is the zero of $g_{ik}$ to the left of $\bar{x}_k$).

Case 3c: If $a_k \le \bar{x}_k \le b_k$, then compute $\rho^{(1)}_{ik}$ and $\rho^{(2)}_{ik}$ as above, and replace $g_{ik}(x_k)$ by the maximum of the supports at $\rho^{(1)}_{ik}$ and $\rho^{(2)}_{ik}$.

Let $t_i$ be the number of terms in $g_i(x)$ that fall into Case 3c; that is, $t_i$ is the cardinality of the index set

$$K_i = \{k \in \{1,\dots,n\}: g_{ik} \text{ falls under Case 3c}\}\,.$$

Setting

$$\ell_{ik}(x_k) = \begin{cases} g_{ik}(x_k) & \text{if Case 1,} \\ \ell^{(0)}_{ik}(x_k) & \text{if Case 2,} \\ g_{ik}(\bar{x}_k) & \text{if Case 3,} \\ \ell^{(1)}_{ik}(x_k) & \text{if Case 3a,} \\ \ell^{(2)}_{ik}(x_k) & \text{if Case 3b,} \end{cases}$$

we have

$$\ell_i(x) := \sum_{k \notin K_i} \ell_{ik}(x_k) + \sum_{k \in K_i} \max\,\{\ell^{(1)}_{ik}(x_k)\,,\ \ell^{(2)}_{ik}(x_k)\} \le g_i(x)\,,$$

and $\ell_i(x)$ is a linear underestimate of $g_i(x)$ when $K_i = \emptyset$, and it is piecewise linear and convex otherwise.

In particular, the region defined by $\ell_i(x) \le 0$ is equivalent to the polyhedral set defined by

$$\sum_{k \notin K_i} \ell_{ik}(x_k) + \sum_{k \in K_i} z_{ik} \le 0\,,$$
$$\ell^{(1)}_{ik}(x_k) \le z_{ik} \quad (k \in K_i)\,,$$
$$\ell^{(2)}_{ik}(x_k) \le z_{ik} \quad (k \in K_i)\,,$$

which involves $t_i$ new variables $z_{ik}$ $(k \in K_i)$ and $2t_i$ additional constraints.
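For a single convex term (Case 3), the supports at the two zeros can be computed as in the following sketch (names are ours, and the branching mirrors the case analysis above):

import math

def case3_supports(p, q, r):
    # g(t) = p/2 * t^2 + q*t + r with p > 0.
    t_min = -q / p                          # unconstrained minimizer
    g_min = r - q * q / (2 * p)
    if g_min >= 0:                          # g >= 0 everywhere:
        return ("constant", g_min)          # replace g by the constant g_min
    disc = math.sqrt(q * q - 2 * p * r)
    rho2, rho1 = (-q - disc) / p, (-q + disc) / p   # left / right zero
    # Support at a zero rho: l(t) = (p*rho + q)*(t - rho) = alpha*t + beta.
    supports = [(p * rho + q, -(p * rho + q) * rho) for rho in (rho1, rho2)]
    return ("supports", supports)           # pairs (slope alpha, intercept beta)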

Performing the above linearization for every constraint i, we let L(G) denote the resulting polyhedral set, and we let $L_x(G)$ denote its projection onto the n-dimensional space of x. It is clear that

$$L_x(G) \supset \operatorname{conv}(G) \supset G\,.$$

It is also clear that, if $(\bar{x},\bar{z})$ minimizes f(x) subject to $(x,z) \in L(G)$ and $x \in M$, then $\bar{x}$ minimizes f(x) over $L_x(G) \cap M$. If $L_x(G) \cap M = \emptyset$, then $G \cap M = \emptyset$ and $D \cap M = \emptyset$.

Therefore, if we solve the linearly constrained concave minimization problem

$$\text{minimize } f(x) \quad \text{s.t. } (x,z) \in L(G)\,,\ x \in R \qquad (60)$$

by one of the methods developed in Part B, we obtain an initial lower bound for min f(D). (A refined linearization technique will be discussed below.)

Solving the problem

$$\text{minimize } f(x) \quad \text{s.t. } (x,z) \in L(G)\,,\ x \in M \qquad (61)$$

would lead to a better lower bound than min f(V(M)). However, since f is concave (and is not assumed to have any additional exploitable structure), too much effort is required to solve (61). Instead, we seek an $\tilde{x} \in L_x(G) \cap M$ that satisfies $\min f(V(M)) \le f(\tilde{x}) \le f(\bar{x})$. This can be done by using any of the algorithms discussed in Part B of this book. We run one of these algorithms on problem (61) until a point $(\tilde{x},\tilde{z}) \in L(G)$, $\tilde{x} \in M$ satisfying $\min f(V(M)) \le f(\tilde{x})$ is found. If $\tilde{x} \in G$, then $S_M \ne \emptyset$.

For problem (56) in the form (56'), however, problems (60) and (61) are linear programs that can be solved easily.

Bilinear constraint approach:

An alternative linearization of G can be derived from the bilinear constraints that arise from the quadratic constraints in the following way.

Each constraint function can be written as

$$g_i(x) = \sum_{k=1}^{n} \big(\tfrac{1}{2}\,p_{ik} x_k^2 + q_{ik} x_k + r_{ik}\big) = \sum_{k=1}^{n} (x_k y_{ik} + r_{ik})\,,$$

where $y_{ik} = \tfrac{1}{2}\,p_{ik} x_k + q_{ik}$ $(i=1,\dots,m;\ k=1,\dots,n)$.

From Theorem IV.8 and Proposition X.20 we also have that the convex envelope of each $g_i$ over $\Omega_i$ can be expressed by

$$\varphi_{\Omega_i,g_i}(x) = \sum_{k=1}^{n} \big(\varphi_{\Omega_{ik}}(x_k y_{ik}) + r_{ik}\big)\,, \qquad (62)$$

where $\varphi_{\Omega_{ik}}(x_k y_{ik})$ denotes the convex envelope of $x_k y_{ik}$ over the set $\Omega_{ik}$ and

$$\Omega_{ik} = \{(x_k, y_{ik}): a_k \le x_k \le b_k\,,\ c_{ik} \le y_{ik} \le d_{ik}\}\,,$$
$$c_{ik} = \tfrac{1}{2}\,\min\,\{p_{ik} a_k\,,\ p_{ik} b_k\} + q_{ik}\,, \quad d_{ik} = \tfrac{1}{2}\,\max\,\{p_{ik} a_k\,,\ p_{ik} b_k\} + q_{ik}\,.$$

Let $y_i = (y_{i1},\dots,y_{in})^T$. Since $\varphi_{\Omega_i,g_i}(x)$ is piecewise linear and convex, the set defined by

$$\varphi_{\Omega_i,g_i}(x) \le 0$$

is equivalent to a polyhedral set whose description involves n additional variables $(z_{i1},\dots,z_{in})$ and 2n additional linear constraints:

$$\sum_{k=1}^{n} z_{ik} \le 0\,,$$
$$\ell^{(1)}_{ik}(x_k, y_{ik}) + r_{ik} \le z_{ik} \quad (k=1,\dots,n)\,,$$
$$\ell^{(2)}_{ik}(x_k, y_{ik}) + r_{ik} \le z_{ik} \quad (k=1,\dots,n)\,,$$

where $\ell^{(1)}_{ik}$, $\ell^{(2)}_{ik}$ denote the two affine pieces of the convex envelope $\varphi_{\Omega_{ik}}$ (cf. Proposition X.20).

This is done for each constraint i=1,...,m to yield a polyhedral set P(G) whose projection onto $\mathbb{R}^n$ (the space of x-variables), denoted by $P_x(G)$, contains the convex hull of G; that is, we have

$$G \subset \operatorname{conv}(G) \subset P_x(G)\,.$$

Hence,

$$\min\,\{f(x): (x,y,z) \in P(G)\,,\ x \in M\} \le \min\,\{f(x): x \in G \cap M\}\,.$$

The discussion at the end of the preceding subsection ("Linearization of the constraints") can now be followed with L(G) replaced by P(G) and the vector (x,z) replaced by (x,y,z).
644

Piecewise linear approximation:

Piecewise linear approximation of separable functions belongs to the folklore of mathematical programming. Recently this approach has been studied in the context of separable (quadratic) concave problems subject to linear constraints (e.g., Rosen and Pardalos (1986), Pardalos and Rosen (1987)). A detailed discussion is given in Chapter IX. An extension of these techniques to problems with an indefinite quadratic objective function is presented in Pardalos, Glick and Rosen (1987).

Let $g: [a,b] \to \mathbb{R}$ be a continuous univariate function on the interval $[a,b] \subset \mathbb{R}$. Choose a fixed grid of points by partitioning [a,b] into r subintervals of length

$$h = \frac{b-a}{r}\,,$$

and determine the piecewise linear function $\ell(x)$ that linearly interpolates g(x) at the grid points

$$x^j = a + jh \quad (j=0,\dots,r)\,.$$

Replacing a constraint $g(x) \le 0$ by $\ell(x) \le 0$ leads to linear constraints and additional zero-one integer variables having a special structure. To see this, one writes $x \in [a,b]$ in the form

$$x = a + \sum_{j=1}^{r} h\,w_j\,,$$

where

$$0 \le w_j \le 1 \quad (j=1,\dots,r)\,, \qquad (63)$$
$$w_{j+1} \le z_j \le w_j\,, \quad z_j \in \{0,1\} \quad (j=1,\dots,r-1)\,.$$

The last constraints in (63) imply that the vector $z = (z_1,\dots,z_{r-1})$ of zero-one variables must have the form $(1,1,\dots,1,0,0,\dots,0)$, i.e., whenever $z_j = 0$ for some j, one has $z_{j+1} = \dots = z_{r-1} = 0$. Hence, z takes only r possible values, instead of $2^{r-1}$ values as in the case of a general zero-one vector with r−1 components. Under the transformation (63), $\ell(x)$ can be written in the form

$$\ell(x) = \ell(a) + \sum_{j=1}^{r} w_j\,\big[\ell(a + jh) - \ell(a + (j-1)h)\big]$$
$$= g(a) + \sum_{j=1}^{r} w_j\,\big[g(a + jh) - g(a + (j-1)h)\big]\,. \qquad (64)$$

Replacing each term $g_{ik}$ in the constraints of (55) by its corresponding piecewise linear approximation in the form (63), (64), one obtains an approximation for G which is described by mixed integer linear constraints. Then, in particular, problem (56), (56') can be approximated by a mixed integer linear program.

Since the constraints are quadratic, it is easy to derive bounds on the maximum interpolation error.
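For instance (our sketch, with our names), the grid values needed in (64) and the standard linear-interpolation error bound $|p|\,h^2/8$ for a quadratic term $\tfrac{1}{2}p x^2 + qx + r$ (whose second derivative is the constant p) read:

def interpolant_values(g, a, b, r):
    # Knot/value pairs (x^j, g(x^j)) on the uniform grid with step h = (b-a)/r;
    # consecutive differences give the coefficients of w_j in (64).
    h = (b - a) / r
    return [(a + j * h, g(a + j * h)) for j in range(r + 1)]

def quad_interp_error_bound(p, a, b, r):
    h = (b - a) / r
    return abs(p) * h * h / 8     # max |g - l| on each subinterval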

Recent branch and bound approaches for general indefinite quadratic constraints are given in Sherali and Alameddine (1992), Al-Khayyal et al. (1995), Phong et al. (1995).

2.5. Linearly Constrained Global Optimization of Functions with Concave
Minorants

A common property of Lipschitz functions, d.c. functions and some other function classes of interest in global optimization is that, at every point of the domain, one can construct a concave function which coincides with the given function at this point and underestimates the function on the whole domain (concave minorant, cf. Khamisov (1995)). We present a new branch and bound algorithm for minimizing such a function over a polytope which, when specialized to Lipschitz or d.c. functions, yields improved lower bounds as compared to the bounds discussed in the previous sections. Moreover, the linear constraints will be incorporated in a straightforward way so that "deletion-by-infeasibility" rules can be avoided. Finally, we show that these bounds can be improved further when the algorithm is applied to solve systems of inequalities. Our presentation is based on Horst and Nast (1996) and Horst, Nast and Thoai (1995), where additional details and a report on implementational issues and numerical experiments are given.

The following definition is essentially equivalent to the definition given in Khamisov (1995).

Definition XI.1. A function $f: S \to \mathbb{R}$, defined on a nonempty convex set $S \subset \mathbb{R}^n$, is said to have a concave minorant on S if, for every $y \in S$, there exists a function $F_y: S \to \mathbb{R}$ satisfying

(i) $F_y(x)$ is concave on S, (65)

(ii) $F_y(y) = f(y)$, (66)

(iii) $F_y(x) \le f(x) \quad \forall x \in S$. (67)

The functions $F_y(x)$ are called concave minorants of f(x) (at $y \in S$), and the class of functions having a concave minorant on S will be denoted by CM(S).

Example XI.3. Let f(x) = p(x) - q(x), where p and q are convex functions on IRn .

Then it is well-known that pis subdifferentiable at every y E IRn , Le., the set 8p(y)

(subdifferential) of subgradients s(y) of p at y is nonempty and satiesfies, by

definition,

p(x) ~ p(y) + s(y)(x-y) V xE IRn . (68)

Therefore, every d.c. function is in CM(lRn ) with concave minorants

F y(x) = p(y) + s(y)(x-y) - q(x). (69)

Example XI.4. A function f : IRn -+ IR is called p-eonvex if there is some p E IR such

that for every y E IRn , there is some s(y) E IRn satisfying

f(x) ~ f(y) + s(y)(x-y) + pllx_yl!2 (70)


647

(cf. Vial (1983». For p > 0 one obtains the class of strongly convex functions, p = 0
characterizes a convex function, and p-<!onvex functions with p < 0 are called
weakly convex. From (70) we see that weakly convex functions are in CM(lRn) with
concave minorants

F(y) = f(y) + s(y)T (x-y) + pllx-yf (71)

Example XI.5. Functions f : S -f IR are said to be Bölder continuous on S if there


exist L > 0, P e IR such that

If(x) - f(y) I ~ L IIx-yliP V x,y eS (72)

(where 11·11 denotes the Euclidean norm). It follows from (72) that, for all x,y e S,
we have

f(x) ~ f(y) - L IIx-yIlP, (73)

Le., for P ~ 1, Bölder continuous functions are in CM(S) with

Fy (x) = f(y) - Lllx-yllp. (74)

In order to ensure convergence of the algorithm given below one needs continuous
convergence in the sense of the following lemma.

Lemma XI.1. Let {xk} and {Yk} be sequences in S such that lim xk = lim 1Jk =
k-+m k-+m
seS. Then, for each of the concave minorants given in Example XI.9 - XI.5, we
have
I im F (xII = f(s).
k-+m 1Jk

Proof. First, notice that each of the three types of functions considered in the
above examples is continous on S. This follows in Example XI.3 from continuity of
convex functions on open sets (since S = IRn) and is trivial in Example XI.5. Since
648

p-convex functions are not treated in detail in this monograph, we refer to Vial

(1983) for Example XIA. Let B(s) be a compact ball centered at s. Then the

assertion follows for Example XI.3 from boundedness of { &(y) : y E B( s)} (cf., e.g.,

Rockafellar (1970)) and continuity of p and q. For Example XIA, the property of

Lemma XI.1 follows in a similar way since s(y) is a subgradient of the convex func-

tion f(x) - pllxf The case of Example XI.5 is trivial. •

We consider the problem

minimize (x), (75)


xED

where D is a polytope in IRn with nonempty interior, and f E CM(S) for some

n-simplex S 2. D.

A lower bound for f over the intersection of an n-simplex S with the feasible set is
obtained by minimizing the maximum of the convex envelopes IPy(x) of the concave

minorants F y(x), taken at a finite set T ( S. Recall from Theorem IV.7 that, for

each y E T, the convex envelope IPy of F y over S is precisely that affine function
which coincides with Fy at the vertices of S.

Proposition XI.5. Let S = [vO' ...'vn] be an n-simplex with vertices vO, ... ,vn' D be
a polytope in IRn, T be a nonempty finite set of points in S, and fE CM(S) with con-
cave minorants Fy. For each y E T, let IP y denote that affine functiuon which is
uniquely defined by the system of linear equations

(76)

Then, the optimal value ß(S n D) 0 f the linear pragram

minimize t
s.t. IP y (x)$ t, yE T, xE Sn D (77)

is a tower bound for min{J(x) : xE Sn D}.


649

Proof. Concavity of Fy implies that <,Oy(x) ~ F y(x), v x E S, Y E T. It is easy to


see (and well-known) that the optimal value P(S n D) of the linear program

(76)-{77) satisfies

P(S n D) = min max <,0 (x),


xE SnD yET y

and hence, by Definition XI.1,

P(S nD) ~ min max F (x) ~ min fex).


xE SnD yET y xE SnD •
Notice that one can avoid solving the system (76), since in barycentric co-

ordinates

n n
xE S ~ x= ~ ).. V., ~ ).. = 1, ).. ~ 0, i = O, ... ,n
i=O 1 1 i=O 1 1

one has

n
<,0 (x) = ~ ).. F (v.).
Y i=O 1 Y 1

As usual, we set p(SnD) = + aJ when SnD = 0. When in (76), (77) SnD f. 0 we ob-
tain a set Q(S) of feasible points in S while solving (76), (77). The construction of a
tight initial simplex S l Dis known from previous chapters.

Algorithm X.6.

Initialization:

Determine an initial n-simplex S l D, the lower bound p(SnD), and the set Q(S).

Set peS) = p(SnD), Q = Q(S), a = min{f(x) : x E Q}, and choose Z E Q satisfying

fez) = a. Define M = {S}, set p= peS), k = 1.

Iteration k:

If a = P, then stop; z is an optimal solution, and I is the optimal objective function


value of Problem (75).
650

Otherwise, choose

SEM satisfying ß(S) = ß· (79)

Bisect Sinto the simplices SI and S2. Compute ß(Si n D), i = 1,2j and

ß(Si) = max{ß(S)j ß(Si n D)} (i=I,2). (80)

. Set Q = Q U {Q(SI)' Q(S2)}' update a = min{f(x) : x E Q}, and choose Z EQ

satisfying f(z) = a. Set

M= (M\ {S}) U {Sl,S2}' (81)

M = M \ {S:ß(S)1 ~ a},

Jl. = { min{ß(S) : SEM}, if M f 0


(82)
a, if M= 0

and go to iteration k+1.

Clearly, if the algorithm terminates after a finite number of iterations, then it


yields an optimal solution. In order to investigate convergence in the infinite case,
let us attach the index k to each quantity and set at the beginning of iteration k.

Proposition XI.6. In Problem (75), let fE CM(S} be continuous on the initial sim-

plex S. Moreover, for each pair of sequences {xk}, {Yk} C S such that 1im xk I im
k-tCD k-tCD
Yk = s assume that 1im Fy (x~ = f(s). Then, if the algorithm does not terminate
k-tm k
after a finite number of iterations, we have

and every accumulation point z* of the sequence {zk} is an optimal solution of Prob-
lem (75).
651

Proof. Let z* be an accumulation point of the sequence {zk} C P, and let f" =

min{f(x) : x E D}. The sequence {{\} of lower bounds is monotonically non-


decreasing and bounded from above by f", likewise, the sequence {ok} of upper

bounds is nonincreasing and bounded from below by f". Therefore, ß" := 1i m ßk


k-loo
and 0" = 1im 0k exist, and because of 0k = f(zk) and continuity off(x), we have
k-llll

ß" ~ f " ~ 11m


. f(zk) = f(")
Z = °" . (83)
k-llll

Next, consider a subsequence of {zk} converging to z". It follows from (83) by a

standard argument (see Chapter 4) that this subsequence must contain an infinite

subsequence {zk } such that the corresponding sequence {Sk } satisfies Sk )


q q q
Sk ' and 1\ = ß(Sk ), V q. We must have Sk n D t 0, V q, since infeasible sim-
q+1 q q q
plices are deleted because of ß(S n D) =+ 1Il, if S n D = 0. For all q, choose a point
xk E Sk n D, x k E Qk . We know that every decreasing sequence of simplices
q q q q
generated by successive bisection converges to a singleton. Therefore, we have 1 im
q-llll
Sk = {s} for some SE D, and continuity off implies ß ~ lim f(x k ) = f(s). On the
q ~1Il q
other hand, each of the affine functions l{Jy defined in Proposition XI.6 for a given

simplex S attains its minimum at a vertex of S, where l{Jy coincides with F y" In view

of Proposition XI.6 we have

ß(S n D) = min max I{J (x) ~ min max l{Jy (x)


xE SnD xES y xES yET

> min cp (x) = F (v(y)), V Y E T,


xES Y Y

where v(y) is the vertex of S at which min cp (x) is attained. It follows that ßk ~
xES y q
F (v(Yk)) for an arbitray Yk E T k and &11 q. Since lim Sk = {s}, we must
Yk q q q q-llll q
q
have 1 im Yk = 1 im v(Yk ) = s, and hence, using the continuous convergence as
q-llll q q-llll q
652

sumption, ß'" = lim 1\ ~ f(s). Therefore, the assertion follows from (83).- •
q-i1D q

Systems of CM-lnequaliües

Let D c IRn be a polytope with nonempty interior, and let fi E CM(S) be eontinuous

on the initial simplex S J D, i = 1,... ,m. The system


fi(x) ~ 0, i = 1, ... ,m (84)

has a solution x", E D if and only if

max{fi(x"') : i = 1,... ,m} ~ 0 (85)

It follows from Definition Xl.l that f(x) = max{fi (x) : i = 1,... ,m} E CM(S), so that
the system (84) of inequalities ean be investigated by applying the above algorithm
to the optimization problem (75) until a point x'" E D satisfying f(x"') ~ 0 is deteeted
or the optimal value of (75) is found to be positive (indieating that the system (84)
has no solution in D, cf. Seetion 4.2). A straightforward applieation of Proposition
XI.6 would lead to the bound

ß(S n D) = min max I{J. (x) , (86)


xESnD yET y

where rpy (x) is the eonvex envelope of F y(x), F y(x) being the eoneave minorant of
one ofthe funetions fj satisfying fj(y) = max{fi(y) : i = 1, ... ,m}. This bound ean eer-
tainly be improved by eonsidering

ß1(S n D) =. max min max rp; (x), (87)


1=1, ... ,m xE SnD yET

where rp; is the eonvex envelope ofthe eoneave minorant F; offi' i = 1,... ,m.
Further improvement results from the well-known observation that a maximum
operations always leads to a smaller value than the eorresponding minmax opera-

tion, so that we propose to use


653

ß2(S n D) = min . max max cp; (x). (88)


xe SnD 1=1, ... ,m yeT

Notice that ß2(S n D) is the optimal objective function value of the linear program

minimize t

S.t. cp;(xHt, l~i~m,yeT,


xe S n D.

3. OUTER APPROXIMATION

Consider the problem

minimize f(x) (89)


(P) s .t. gi(x) ~ 0 (i=l, ... ,m)

where f, ~: IRn -+ IR are Lipschitz functions (i=l, ... ,m). Suppose that the feasible set

D = {x e IRn : ~(x) ~ 0 (i=l, ... ,m)} (90)

is nonempty and compact, and suppose that areal number r > 0 is known which
satisfies

(91)

Le., we know a ball of radius r containing D (11·11 denotes the Euclidean norm).

Moreover, it is assumed that Lipschitz constants Li of ~ on the ball IIxll ~ r


(i=l, ... ,m) are known.

We attempt to apply to (89) an outer approximation method of the type dis-

cussed in Chapter 11. Recall from Section II.1 that in each step of an out er approx-

imation method a subproblem of the form


654

minimize f (x) (92)


s .t. xE Dk

must be solved, where Dk is a relaxation of D and

(93)

with a suitable function 4c:: IRn --I IR satisfying

lk(x) ~ 0 \Ix E D , (94)

lk(xk ) >0 (95)

(xk denotes a solution of (Qk))'

A direct application of such an outer approximation scheme does not seem to be

promising, because the subproblems (Qk) are still Lipschitz optimization problems,

and, moreover, it will be difficult to find suitable functions lk such that {x: 4c:(x)
= o} separates x k from D in the sense of (94), (95). Convexity does not seem to be
present in problem (89), and this makes it difficult to apply outer approximation
methods.

Therefore, we shall first transform problem (89) into an equivalent program where
convexity is present. Specifically, problem (89) will be converted into a problem of

globally minimizing a concave (in fact, even linear) function subject to a convex and

areverse convex constraint.


This idea and the following outer approximation method are due to Thach and Tuy

(1987).

First we note that in (89) one may always assume that the objective function f(x) is

concave (even linear). Indeed, in the general case of a nonconcave function f, it


would suffice to write the problem as

minimize t
s . t. f(x) ~ t , ~(x) ~ 0 (i=l, ... ,m)
655

which involves the additional variable t and the additional constraint f(x) ~ t.

Now, in the space IRn+1let us consider the hemisphere

nH 2 nH 2 2
S = {u e IR : lIull = i!l u i = r , unH ~ O} ,

whose projection onto IRn is just the ball B:= {x e IRn : IIxll ~ r} introduced above.

Using the projection 'lI": IRnH --+ IRn defined by u = (ul'''''un + 1) --+ 'lI"(u) =
(u1""'un ) we can establish an obvious homeomorphism between the hemisphere S in
IRnH and the ball B in IRn.

Let

<p(u) = f('lI"(u)), 'Pi(u) = ~('lI"(u)) (i=l,:.. ,m). (96)

Then we rewrite problem (P) in the form

minimi ze 'P(u)
s.t·'Pi(u)~O (i=l, ... ,m) , (97)

lIull = r , unH ~ O.

In fact, if x solves (P), then u with unH = ~ r 2 - IIxll 2 , 'lI"(U) = x (Le., ui = Xi


(i=l, ... ,n)) solves (PS)' Conversely, ifu solves (PS)' then x = 'lI"(U) solves (P).
Since 'lI" is a linear mapping, the function 'P is still concave, while the functions 'Pi
(i=l, ... ,m) are still Lipschitzian on S with the same Lipschitz constants Li as gi (on
B).

At first glance, problem (PS) seems to be more complicated than the original

problem (P). However, the following proposition shows that the feasible set of (97)

can be expressed as the difference of two convex sets.

Let C denote the feasible set of problem (P S), i.e.,

C = {u e IRnH : 'Pi(u) ~0 (i=l, ... ,m), lIul! = r, unH ~ O} (98)

= {u eS: 'Pi(u) ~ 0 (i=l, ... ,m)}.


656

Proposition XI.7. The feasible set C ofproblem (PsJ is a difference oftwo convex

sets:

C=Cl\G, (99)

where Cl is the (closed) convex hull of C and G is the (open) ball of radius r in IR n +1,
i.e.,

Cl = convC, G={uElR n+1:lIull <r}. (100)

Proof. Clearly, we have C ( Cl \ G.

To prove the converse inclusion, consider an arbitrary point u E Cl \ G. Then u can

be written as u = E ).. ui , with E)'. = 1 , ).. > 0, ui E C (i EI). Suppose that


iel 1 iel 1 1

II I > 1. Then we would have

(since the norm"." is strictly convex), Le., U E G. Therefore, u ;. G implies 111 = 1,


and hence u e C.

Note that Proposition XI.7 simply expresses the observation that, when one forms

the convex hull of a closed subset of a sphere in IRn +1 (with respect to the Euclidean

norm), one only needs to add to this set strictly interior points of the ball which

deterrnines the sphere.

A related observation is that, given a closed subset M of the hernisphere S in

IR n+1 and a point uD eS \ M, there always exists a hyperplane H in IRn +! which

strictly separates uD from M: take the hyperplane which supports the ball Ilull $ r at

uD, and move it parallel to itself a sufficiently small distance towards the interior of
the ball. In particular, a separating hyperplane can easily be constructed for M = C,
where the functions IPi are Lipschitzian functions with Lipschitz constants Li .
657

Proposition XI.S. For every U E S define

1
h(u):= "2 ._ maz
t
['P (u) ] 2
L. (101)
z-l, . .. ,m z

where 'Pt (u) = maz {O, 'P/u)}. If uO E S \ C, then the affine fu,nction

(102)

strictly separates uO from C, i.e., one has

l(uO) > 0, l(u) ~ ° Vu E C. (103)

°
Proof. Since uO ~ C, we have 'Pi{uO) > fOI at least one i E {l, ... ,m}. Thelefole,
h{uo) > 0, and since lIuo" = I, it follows that

On the othel hand, u E C implies 'Pt{u) = ° (i=l, ... ,m).


It follows that fOI any u E C we have

Thelefole, if i* denotes an index such that

°1
h( u ) = "2. max
['P+
L.
°
i (u )] 2 1 ['P+
= "2
° °
i * (u )] 2 1 ['Pi* (u )] 2
L. * = "2 L. * '
1=1, ... ,m 1 1 1

then

(104)

l{u) ~ uO{u - uD) + ~ lIuO - ull 2

= (uO+ ~ (u - uD)) (u - uD) = ~ (u + uD) (u - uD)


658

This completes the proof.



As a consequence of the above results we see that problem (PS) is actually a
special d.c. programming problem, namely, it is a problem of minimizing a concave
function over the intersection of the convex set n with the complement of the con-
vex set {u E IRn+l : lIull < r} (cf. Proposition XI.6). Therefore, an outer approxi-
mation method, such as those discussed in Chapter X, could be applied. A difficulty
which then arises is that the convex set n is not defined explicitly by a set of finitely
many convex constraints.
However, for any uD E S \ C, Proposition XI.7 allows us to construct a hyperplane
which strictly separates uD from C, as required by outer approximation methods.
Before describing such a method we first prove a result which is very similar to Pro-
position IX.l!.

Proposition XI.9. Consider a polytope P in IR n+1 and the problem

minimize rp(u) (105)


s. t. u E P , lIull ~ r

where rp(u) is a concave function on IR n+1. I/ this problem is solvable, then there is
always an optimal solution 0/ the problem which is a vertez 0/ P or lies on the inteT'-
section o/the sur/ace lIull = r with an edge 0/ P.

Proof. For every w E P, let Fw denote the face of P containing w of smallest di-
mension.
Suppose that problem (105) has an optimal solution in the region lIull > r and let
w be such an optimal solution with minimal Fw.
We first show that in this case dim Fw = 0, i.e., w must be a vertex of P. Indeed,
if dim Fw ~ I, then there exists a line whose intersection with Fw n {u: lIull ~ r} is a
659

segment [w' ,W"] containing w in its relative interior. Because of the concavity of
rp(u), we must have rp(w') = rp(w") = rp(w). Moreover, if /lw'lI > r, then
dim F w' < dim Fw' and this contradicts the minimality of dim Fw. Therefore,
/lw'lI = r, and similarly Ilw"/l = r. But then we must have [w', w"] C {u: /lu/l ~ r},
contradicting the assumption that IIwll > r.
Now consider the case when all of the optimal solutions of (105) lie on the surface
/lu/l = r,
and let w be an optimal solution. If dim Fw > 1, then the tangent hyper-
plane to the sphere IIull = r at the point wand F w n {u: IIull ~ r} would have a
common line segment [w', w"] which contains w in its relative interior. Because of
the concavity of C{J, we must have rp(w') = rp(w") = rp(w), Le., w' and wll are also op-
timal solutions. But since IIw'II > rand IIw"1l > r, this contradicts the hypothesis.
Therefore, dim Fw ~ 1, i.e., w lies on an edge of P.

Algorithm XI.7:

Initialization:
Se1ect a polytope D1 with known vertex set V 1 satisfying

Oe D1 C {u: 117r{u)II ~ r, un+1 ~ O} .

Compute the set V1 of all points w such that w is either a vertex of D1 satisfying
IIwll ~ r or else the intersection of an edge of D1 with the sphere IIull = r.
Set k = 1.

Step 1. Compute

wk E argmin {rp(w): w E V
- }
k (106)

and the point uk where the verticalline through wk meets the hemisphere

S = {u: /lull = r, unH ~ O}.


If C{Ji(uk) ~ 0 (i=l, ... ,m), Le., uk E C, then stop: uk is an optimal solution of

(PS)·
Otherwise go to Step 2.
660

Step 2. Define the additional linear constraint

(107)

and set

Dk+1 = Dk n {u: 4c(u) S O} .

Compute the vertex set Vk+1 of Dk +1 and the set Vk+1 of all points w such
that w is either a vertex of Dk +1 satisfying IIwl/ ~ r or else the intersection of an

edge of Dk +1 with the sphere I/ull = I.


Set k I- k+1 and return to Step 1.

Remark XI.2. The sets Vk+ 1 ' Vk+ 1 can be obtained from Vk' Vk using the
methods discussed in Section 11.4.2.

Remark XI.3. The constraint (83) corresponds to a quadratic cut in the original

variables (xp- .. ,xn ) of the form

The convergence of Algorithm XI. 7 follows from the general theory of out er ap-
proximation methods that was discussed in Chapter II.

Proposition XI.IO. I/ Algorithm X1.6 is infinite, then every accumulation point 0/


{uk} is an optimal solution 0/ problem (P sJ.

Proof. We refer to Theorem II.1, and we verify that the assumptions of this the-

orem are satisfied.


The conditions 4c(uk ) > 0 and lk(u) S 0 Vu E C follow from Proposition XI.8.

Since the functions 'Pi( u) are Lipschitzian, and hence continuous, we have continuity

of 4c(u).
661

Now consider a subsequence {uq} ( {uk} satisfying u q --I Ü. Clearly li m 1 (u q) =


q-too q
1i m (uqu q - r 2 + h(u q)) = 1i m (uqü - r 2 + h(uq )) = uu - r 2 + h(U).
q-too q-too

Moreover, since lIu qll


k k
= r, un +~
1 0 Vq, we have lIüll = r, ünH ~ O. Therefore,
using h(u) ~ 0 Vu, we see that l(ü) = lIüll 2 - r 2 + h(U) = 0 implies h(U) = O. But
h(U) = 0 is equivalent to ü E C.
Thus, the assumptions of Theorem 11.1 are satisfied, and Proposition XI.lO follows
from Theorem 11.1.

Example XI.6. Consider the feasible set D c 1R2 defined by

g2(x) = 0.125 sin 0.625 xl + 0.0391 ~ SO,


which is contained in the rectangle -5 S xl S 5 , -5 S ~ S 3.2 .

Consider the ball B of radius 1 around (-1,0). Let a = (-1.5,0.5) and for every x
define

p(x) = max {t E IR: (x-a) E t B} .


It is easily seen that p(x) is a convex function having severallocal extrema over D

(actually p(x) is equal to the value of the gauge of B - a at the point x - a). We
want to find the global maximum of p(x), or equivalently, the global minimum of
f(x) = -p(x) over D. By an easy computation we see that f(a) = 0 and
2 2
2x1 + 2x2 + 6x1 - 2x 2 +5
p(x) = 2 2 172
(3x 1 + 3~ - 2x1x 2 + 10x 1 - 6x2 + 9) + Xl - ~ + 2

for x # a. The Lipschitz constants of g1' g2 are LI = 1, L2 = 0.0875.


We can choose r = 100 and the starting polytope
662

After 13 iterations the algorithm finds a solution with i = (-3.2142, 3.0533),

p(X) = 10.361, max {gI (X), g2(i)} = 0.0061. The intermediary results, taken from
Thach and Tuy (1987), are shown in the table below, where Nk denotes the number

of vertices of the current polytope and Ek = max {gI (xk), g2(xk)} .


k Nk xk = (x~,x~) l'(X k ) fk
h(uk )
I-
1 8 -5,3.2) 15.005 0.9363 0.9868
2 10 -5,1.7956) 11.927 0.3126 0.3026
3 12 -3.5964,3.2) 11.606 0.0276 0.0496
4 14 -3.6321,2.8871) 10.915 0.0171 0.0191
5 16 -3.2816,3.2) 10.886 0.0141 0.0131
6 18 -3.6891,2.7) 10.596 0.0128 0.0106
7 20 -3.3231, 3.0439~ 10.585 0.0096 0.0060
S 22 -3.4406,2.9265 10.562 0.0098 0.0062
9 24 -3.1201,3.2) 10.524 0.0089 0.0052
10 26 -5,1.0178) 10.435 0.1026 0.0927
11 28 -4.2534,2.0118) 10.423 0.0205 0.0275
12 30 -2.748,2.57) 3) 10.420 0.0107 0.0074
13 32 -3.2142,3.0533 10.361 0.0061 0.0024

Table XI.I.

4. THE RELIEF INDICATOR METHOD

The conceptual method that folIows, which is essentially due to Thach and Tuy
(1990), is intended to get beyond local optimality in general global optimization and
to provide solution procedures for certain important special problem classes.
Consider the problem

minimize f(x) (108)


(P) S.t. xe D

where f: IRn ---t IR is continuous and D ( IRn is compact. Moreover, it will be assumed

that

inf{f(x): x E D} = inf{f(x): xe int D} . (109)

Assumption (108) is fulfilled, for example, if Dis robust (cI. Definition 1.1).
The purpose of our development is to associate to f, D and to every a E IR a d.c. func-
663

tion I{Jo.(x) such that xis a global minimizer of f over D if and only if

(110)

where a = f(x). Based on this optimality criterion, a method will be derived to


handle problem (P) in the sense mentioned above.

4.1. Separators tor { on D

The function I{Jo.(x) will be defined by means of aseparator of f on D in the fol-


lowing sense.
Let iR = IR U{m}, and for each 0. E iR consider the level sets

Do.:= {x E D: f(x) < o.},

Let dA(x):= inf {lix - ylI: Y E A} denote the distance from xE IRn to a set A ( IRn
(with the usua! convention that dA(x) = +m if Ais empty).

Definition XI.2. Areal val'Ued junction r(o.,x) dejined on iRIClR n is called a


separator for the junction f(x) on the set D if it satisfies the foUowing conditions:

(i) 0 ~ r(o.,x) ~ dn (x) for every 0. E iR, xE IRn;


0.

(ii) for each fixed 0. E IR, i ---I X~ D0. implies that


h

lim r(o.}) > 0;


~

(iii) r(o.,x) is monotonically nonincreasing in 0., i.e., 0. ~ 0.' implies that


r(o.,x) ~ r(o.',x) Vx E IRn.
664

Note that the notion of aseparator for fon D is related to but is different from

the notion of a separator (for a set) as introduced in Definition 11.1

Example XI.7. The distance function dj) (x) is aseparator for fon D.
a

Example XI.S. Let D = {x E !Rn: ~(x) ~ 0 (i=l, ... ,mn with ~: !Rn --I IR

(i=l, ... ,m). Suppose that fis (L,J.')-Hölder continuous and ~ is (Li'I1)-Hölder con-
tinuous (i=l, ... ,m), i.e., for a11 x,y E IRn one has

If(x) -f(y)1 ~ Lllx-yllJ.', (111)

1~(x)-~(y)1 ~Lillx-YII
11 (l=l,
. ... ,m) (112)

with L, Li > Oj J.', 11 E (0,1] (i=l, ... ,m).


Then

r( ) = max {[max ( 0, f(x)-a)]l/J.'


a,x L '
gi (x) 1/11] .
[max (0, --r:-)] (l=l, ... ,mn (113)
1

is aseparator for fon D.


We verify the conditions (i), (ii), (iii) of Definition XI.2:
(i) Let a E iR, x E IRn, y E j)a. Then it follows from (111) and (112) that

IIx _ ylI ~ max {I f(x)Lf(y) 11/ J.', 1~(x)~(y) 11/ 11 (i=l, ... ,mn.
1

But since

If(x) - f(y) 1 ~ max {O, f(x) - f(yn ~ max {O, f(x) - a}

and

I~(x) - ~(y)1 ~ max {O, gi(x) - ~(yn ~ max {O, ~(xn (i=l, ... ,m),

we have
665

Ilx - ylI ~ r(/l,x) Vx E IRn , y E D/l;

and hence

r( /l,x) S inf {lix - ylI: y E D/l} = dD (x).


/l

The conditions (ii) and (iii) in Definition XI.2 obviously hold for the function r( /l,x)

given in (89).

Example XI.9. Let f: IR --I IR be twice continuously differentiable with bounded


second derivative, i.e., there is a constant M > 0 such that

If"(x)1 SM VxEIR.

For every /l E iR, x E IR set

0 iff(x) S /l
p(/l x)·= { (114)
,. - ~ If '(x) 1- (I f '(x) 12 + 2M(f(x)_/l))1/2) iff(x) > /l

Then aseparator for fon the ball D = {x E IR: lxi Sc} is given by

r( /l,x) = max {p( /l,x), 1xl - c} . (91)

Note that the second expression in (90) describes the unique positive solution

t = p( /l,x) of the equation

~ t 2 + 1f '(x) 1t = f(x) - /l.

Conditions (ii) and (iii) in Definition XI.2 are again obviously satisfied by (91). To

demonstrate (i), it suffices to show that lyl S c and f(y) S /l imply that r(/l,x) S

1x - y I· But, since 1y 1 S c implies that 1x 1 - c S Ix 1 - 1y 1 S 1x - y I. we need only

to show that f(y) S /l implies that p( /l,x) S 1x - y I. This can be seen using Taylor's

formula

If(y) -f(x) -f'(x) (x-y)1 S ~ ly-xI 2 .


666

From this it follows that

~ ly-xI 2 + If'(x)lly-xl ~f(x)-f(y). (116)

Now let f(y) ~ a. Then from (116) we have that

:w.ly _x1 2 + If '(x) I Iy -xl ~ fex) - a.

But ~ t 2 + If '(x) I t is monotonically increasing in t > 0 (for x fixed). Therefore, it


follows from the definition of p( a,x) that p( a,x) ~ Ix - y I .
Note that, by the same arguments, Example XI.9 can be extended to the case
where fis defined on IRn with bounded Hessian f"(x):= V2f(x) and f '(x):= Vf(x).

4.2. AGlobal Opüm.aJity Criterion

Suppose that aseparator r( a,x) for the function f on the set D is available, and
for every a e öl define the function

(117)

(with the usual convention that sup 0 = _, inf 0 = +m). Clearly, for fixed a, h(a,x)
is the pointwise supremum of a family of affine functions, and hence is convex (it is
a so-called closed convex function, cf. Rockafellar (1970)).
Now consider the d.c. function

(118)

Lemma XI.2. We have

l{Jiz) > 0 if zt 'Da I (119)

l{Jiz} = -inf{IIz - vll~: vt Da} if xe 'Da' (120)


667

Proof. If x ~ Dn' then it follows from Definition XI.2 (ii) that r( n,x) > 0, and
hence, by (117),

This proves (119).

In order to prove (120), we observe that from (117) it follows that

2 2
= sup {r (n,v) -lIx-vll }
v~Dn

(121)

Now consider an arbitrary point x E Dn' If f(x) = n, then it follows that

cpn(x) ~- inf IIx-vll 2 ~ -lIx-xIl 2 = 0, (122)


v~Dn

This implies that - in f IIx - vll 2 = 0,


v~Dn

On the other hand, from Definition XI.2 (i) we know that

r 2(n,v) ~ d~ (v) ~ IIx-vll 2 Vv E!Rn ,


n

It follows that

cP (x) = sUP{-llx-vIl2+r2(n,v)}~0,
n v~Dn

and hence by (122)


668

Therefore, (120) holds for x E TI er satisfying f(x) = er.

If f(x) < er, then to each point v ~ Der we associate a point z(v) on the intersec-
tion of the line segment [x,v] with the boundary an er of Der. Such a point z(v)
exists, because x E Der while v ~ Der.
Since z(v) E [x,v] , z(v) E TIer' and because of Definition Xl.l (i), we have

/Ix - v/l = /Ix - z(v)1I + /Iv - z(v)11 ~ /Ix - z(v)/I + r( er,v) .

It follows that

2 2 2
-/lx-z(v)1I ~r (er,v)-lIx-v/i ,

and hence

Finally, from (121) and (123) it follows that we must have (120) for xe Der. •

Corollary XI.I. For every er E iR satisfying TIer f. 0 we have

in! tpJxh o.
xelR n ...

Furthermore, for every x E D and er = f(x) we have

Proof. Corollary Xl.l is an immediate consequence of Lemma X1.2.



669

x
Theorem XI.2. Let E D be a leasible point 01 problem (P), and let a= f(x). Gon-
sider the function rpa{x) defined in (107), (108).

Ci) I/

(124)

then /or any x satisfying rpa(x) < 0 we have xE D, I(x) < a {i.e., xis a better /eas-
ible point than x}.

{ii} I/ xis a {globaUy} optimal solution 0/ problem {P}, then

min rp-{x} = o. (125)


xE/Rn a

(iii) I/problem {P} is regular in the sense 01 {109}, i.e., il

in/ {f(x}: xE D} = in/ {f{x}: x Eint D} ,

then any xsatisfying {125} is a (globally) optimal solution 01 problem (P).

Proof. Because of Lemma XI.2, every x E /Rn satisfying rpa(x) < 0 must be in Da.
This proves (i) by the definition of Da.
Using the assertion (i) just proved and Corollary XI.1, we see that condition (125)

is necessary for the global optimality of X, Le., we have (ii).

In order to prove (iii), suppose that xsatisfies (125) but is not a globally optimal
solution of (P). Then, using the regularity condition (109) we see that there exists a

point x' Eint D satisfying f(x') < a. Because of the continuity of f it follows that
x' Eint D -. By Lemma XU this implies that
a

inf rp-(x) $ rp-(x') =- inf IIx' -vII< 0,


xE/Rn a a vtD-
a

i.e., xdoes not satisfy (125). This contradiction proves (iii). •


A slightly modified version of the above results is the following corollary.
670

Corollary XI.2. 11 ä = min I{D} is the optimal objective fu,nction value 01 problem

{P}, then ä satisfies {101}, and every optimal solution 01 {P} is a global minimizer 01
CPä(x) over IRn.
Conversely, ilthe regularity condition {109} is fu,lfiUed and ä satisfies {125}, then
ä is the optimal objective fu,nction value olproblem {P}, and every global minimizer 01
cpiz} overlR n is an optimal solution 01 {P}.

Proof. The first assertion follows from Corollary XI.1 and Theorem XI.2.

In order to prove the second assertion, assume that the regularity condition (109)

is fulfilled. Let ä satisfy (125) and let i be a global minimizer of cpä(x) over IRn.
Then cpä(i) = 0, and from Lemma XI.1 we have i e D, f(i) ~ ä.
Let ä = f(i). Since ä ~ ä, it follows from Definition XI.2 (iii) that r(ä,x) ~ r(ä,x),
and hence

Using the first part of Corollary XI.1, we deduce that inf IP(j..x) = O. Therefore, by
xelRn
Theorem XI.2 (iii) we conclude that i is an optimal solution of problem (P).

Furthermore, it follows from the regularity condition (109) that we cannot have
ä< ä, because in that case there would exist a point x' eint D satisfying f(x') < ä.
But, by Lemma XI.1, this would imply that cpä(x') = - inf {lIx' - vf v t Dä} < 0,
which contradicts the hypothesis that ä satisfies (125). Therefore, we must have

ä = ä, i.e., ä is the optimal objective function value of (P).


4.3. The Relief Indicator Method

The properties of the function cP(l(x) presented in the preceding section suggest

interpreting cp(x) as a sort of generalized gradient or relief indicator. Following

Thach and Tuy (1990), we shall call cP(l(x) a reliel indicator fu,nction lor 1 on D. (A
671

slightly more general not ion of rellef indicator is discussed in Thach and Tuy

(1989)).

We have seen that under the regularity assumption (85)

min I{J (x) = 0 if and only if Q = min f(D) .


xelRn Q

Thls suggests replacing problem (P) by the parametric unconstrained d.c. minimi-

zation problem:

find Q such that 0 = infn I{J {x}. (126)


zelR Q

Suppose that D f 0. A straightforward conceptual iterative method to solve prob-

lem (126) is as follows:

Start with an arbitrary feasible point xl.

At iteration k, solve the auxiliary problem

(P k ) minimize I{J (x) over IRn ,


Qk

where Qk = f(xk). Let xk+ 1 denote an optimal solution of(P k)· If I{JQk(xk +1) =
0, then stop: xk is an optimal solution of (P). Otherwise, go to iteration k+1 with

Qk+1 = f(x k+1 ).

Proposition XI.n. Let problem {P} be regular in the sense 0/ {109}. 1/ the above
iterative procedure is infinite, then every accumulation point i 0/ the sequence {i} is
an optimal solution 0/ {P}.

Proof. We first note that problem (PI) has an optimal solution x 2 with

I{J (x2) ~ o. To see this, recall from Corollary XI.1 that i nf n I{J Q (x) ~ 0 since DQ
Q1 xelR 1 1
f 0 (because xl e D). Moreo~er, from Lemma XI.1 we know that I{JQ (x) > 0 if x ~
1
TI . It follows that inf I{J (x) = inf I{J (x). But I{J (x) is lower semi-
Q1 xelRn Q1 xeD Q1 Q1
672

continuous, since h( a,x) is lower semicontinuous (cf. Rockafellar (1970)), and from

the compactness of D we see that mi n 'Pa (x) exists.


xED 1
2
We have 'P
a 1(x ) < 0, since otherwise the above procedure would terminate at
x 2. It follows that x 2 E D, f(x 2) < f(x 1). By induction, it is then easily seen that we
have x k E D, f(xk +1) < f(x k) for all k. Therefore, the sequence {ak }, ak = f(x k ), is
monotonically decreasing. It is also bounded from below by min f(D), which exists

because of the inc1usion xl E D, the continuity of f and the compactness of D. It fol-

lows that a:= lim ak exists.


k-l(l)
k
Let x = I im x q be an accumulation point of {xk}. Then we have a = fex).
q-l(l)

Now, for every r < kq one has


k k
'Pa (x q) ~ 'Pa _l(x q) ~ 'Pa -1 (x) ~ 'Pa (x) VxE IRn . (127)
r q q
The first and last inequalities in (127) follow from Definition XI.2 (iii) and the
k
definition of 'Pa(x). The second inequality in (127) holds since 'Pa -1 (x q) =
q
min 'P l(x). Keeping Q fixed and letting q -- (I) in (127), we see from (127) and
xElRn Qq- r
the lower semicontinuity of 'P (x) that
Qr

'P Q (x) ~ 'P a (x) Vx E IRn . (128)


r

Letting r -- (I) in (128), by virtue of Corollary XI.1 we obtain

Since 'P - (x)


Q
= 0, it follows from this inequality that 0 = min 'P - (x). But this is
xElRn a
= min f(D) by Theorem XI.2.
equivalent to fex)

The implementation of the above iterative method requires that we investigate

two issues. The first matter concerns the solution of the subproblems (P k ). These

problems in general might not be solvable by a finite procedure. Therefore, one


673

should replace problem (P k) by a suitable approximation (Qk) which can be solved


by a finite algorithm.
The second matter regards the computation of the initial feasible point xl. In

addition, the implicit definition of r a(x) by means of (117), (118) necessitates


suitable approximations.

One possibility for handling these matters is as follows.

Replace problem (P k ) by a relaxed problem of the form

minimize (hk(x) -lIxIl2) , (129)


S.t. xES

where S is a suitable polytope and hk(x) is a suitable polyhedral convex function


that underestimates h( ak,x). Since min {'P a (x): x E IRn} is attained in D, it suffices
k
to choose any polytope S that encloses D. Moreover, the form (109) of h(a,x) sug-

gests that we consider

(130)

where a i is the smallest value of f(x) at an feasible points evaluated until iteration i,
and where xi+! (i ~ 1) is an optimal solution of problem (Qi)'
By the definition of the ai in (130), we must have a i ~ f(xi ) whenever xi E D. It fol-
lows that xi ~ D ,and hence xi ~ D,. for i=l, ... ,k. Since r(a.,x) ~ r(ak,x) for an
~ ~ 1

xe IRn , i=l, ... ,k (Definition XI.2 (iii», we see from (109) and (130) that for an x

Le., the functions hk(x) defined by (130) underestimate h(ak,x).

Moreover, the functions hk(x) are polyhedral convex functions since

with
674

(131)

It follows that problem (Qk) is equivalent to

minimize (t -lIxIl 2) (132)


s.t. xE S, li(x) ~ t (i=l, ... ,k)

In this way we have replaced the unconstrained d.c. problem (P k ) by the linearly

constrained concave minimization problem (Qk) with quadratic objective function.

Several finite algorithms to solve problem (Qk) were discussed in Part B of this

book.

Since hk(x) ~ h( Ilk,x) for all x E S, it follows from Lemma XI.1 that the optimal ob-

jective function value of (Qk) is nonpositive. Moreover, if this value is zero, then we

must have

0= minn VJ at (x) ,
xEIR k

and it follows from Theorem XI.2 that Ilk = min f(D) and every point i k E D satis-
fying f(xk ) = Ilk solves (P).
However, if h k (xk +1) - IIxk +1 11 2 < 0, then it is not guaranteed that f(xk +1) < Ilk .

Therefore, we set Ilk+1 = Ilk if x k +1 t D. When the solution xk +1 of (Qk) is feas-


ible, then we can run a local optimization procedure that starts with x k +l and

yields a feasible point i k +1 satisfying f(ik +1) ~ f(xk +1). In this case we set
. -k+1
'\+1 = mm {Ilk' f(x H·

AIgorithm XI.8 (approximate relief indicator method)

Initialization:

Construct a polytope S J D and choose xl E S (if available, then choose xl E D).

Set 110 = IIJ if no feasible point is known, and set 110 equal to the minimal value of

f at known feasible points, if feasible points are known.


675

Iteration 1:=1,2,... :
1:.1.: If i E D, then, using a local optimization procedure that starts with xk, find a
point i k E D satisfying f(ik) < f(xk), and set tlk = min {tlk_l,f(ik)). If xk t D,
then set tlk = tlk_l·
Denote by i k the best feasible point known so far, i.e., we have f(ik) = tlk.

1:.2.: Set

and solve the relaxed problem

(Qk) minimize (t -lIxIl 2)


s .t. xE S, Li (x) 5 t (i=I, ... ,k)

Let (xk+1,t k+1) be an optimal solution of (Qk).


If

then stop: i k is an optimal solution of (P), and tlk = f(i k) = min f(D).
Otherwise (t k+1 _lIxk+1 11 2 < 0), go to iteration k+1.

Rernark XIA. Let Dk denote the feasible set of ("Q"k)' Le.,

Dk = ((x,t): xE S, 4(x) 5 t (i=I, ... ,k)} ,

and let Vk denote the vertex set of Dk. Then we know that

Since Dk+1 = Dk n ({x,t): 4t(x) 5 t}, we have the same situation as with the outer
approximation methods discussed in Chapter 11, and Vk+ 1 can be determined from
Vk by one of the methods presented in Chapter 11.
676

Before proving convergence of Algorithm XI.8, we state the following result.

Proposition XI.12. Let problem {P} be regular in the sense of {109}, and let the

feasible set D of {P} be nonempty. Assume that we have aseparator r{a,x} for f on D
which is lower semi-continuo'US in IRJllR n. Then Algorithm XI. 7 either terminates after
a finite number of iterations with an optimal solution zk of {P}, or else it generates an
infinite sequence {i}, each accumulation point of which is optimal for {P}.

Proof. If the algorithm terminates after a finite number of iterations, then, by

construction, it terminates with an optimal solution of (P).


In order to prove the second assertion, first note that the sequence {ak } gene-
rated by the algorithm is nonincreasing and is bounded from below by min f(D).

Therefore,a := I im ak exists. We show that a = min f(D), and that every accu-
k-t1D

mulation point xof {xk} satisfies xE D and f(X) = a.

We use the general convergence results for outer approximation methods (cf.
Theorem II.1) in the following form:
Consider the nonempty set

and the sequence {(xk,tkn generated by Algorithm X.6. This sequence is bounded,

because {xk} eS and li (xk ) ~ hk- I (xk ) ~ tk < IIxkf


Then we know from the outer approximation theory (cf. Theorem II.1) that every

accumulation point (x,I) belongs to A if the functions lk(x,t):= lk(x) - t (k=I,2, ... )
satisfy the following conditions:

(1)

I k' k' ,
(2) lk(x,tH 0 V(x,t) E A, '1c (x ,t ) ~ 0 V k > k;
677

(3) ~(x,t) is lower semi--continuous, and every convergent subsequence


k k k k
{(x q,t qn ({(xk,tkn satisfying (x q,t q) ----+ (i,t) has a
(q-lal )
kr kr k k
subsequence {x ,t } ( {x q,t q} such that
k k
lim ~ (x r,t r) = lim ~ (i,t) = l (i,t);
r-+m r r-+m r

(4) l(i,t) = 0 implies that (i,t) E A.


We show that the conditions (1) - (4) are satisfied.

(1): We have

~(xk,tk) = 2xkxk + r2(Qk,xk) _lI xk ll 2 _ t k = II xk ll 2 _ t k + r2( Qk,xk)


~ IIxk ll 2 _t k > 0,

where the first inequality is obvious and the second inequality follows from the as-
sumption that the algorithm is infinite (cf., Step k.2).

(2): Let (x,t) E A. Then

lk(x,t) = 2xkx + r2( Qk,xk) _lIxkll2 - t

~ 2xkx + r2( Qk,xk) -lIill2 -lIxll 2

= r 2( Qk,xk) -lIxk_x1l 2 ~ r2(a,xk) _ d 2 (xk) ~ O.


u- Q

Here the last two inequalities follow from Definition XI (iii) and (i), respectively.

(3): The affine functions ~(x,t) are lower semi--continuous. Consider a subsequence
k k k k
(x q,t q) satisfying (x q,t q) ----+ (i,t). Then there is a subsequence {kr} ( {kq}
(q-lal )
such that
k k
lim lk (x r ,t r ) = lim 1_'11: (x,t,
-~ - 2 + r2(-:;"\
= IIxll Q,X, - -t = 1f-~
"\x,t,,
r-+m r r-+m r
678

where Z(x,t) = 11 xII 2 + r2(-)


a,x - t.

(4): Let l(x,i) = IIxl1 2 + r2(ii,x) - t = O. Since the algorithm is infinite, i.e., we
k k
have IIx qll2 > t q Vq, it follows that IIxll 2 - t ~ 0, and hence, because Z(x,i") = 0,
we have r 2(ii,x) ~ o. This is possibly only if r(ii,x) = 0, and hence IIxll2 = t. But
from Definition X1.1 (ii) we see that then we must have x E Dii, and hence (x,t) E A.
Therefore, by Theorem I1.1, every accumulation point (x,i") of {(xk,tk)} satisfies
(x,i") E A, i.e.

(133)

Now we show that the optimality condition of Theorem XI.2 (resp. Corollary
XI.2) is satisfied.

Since x E Dii it follows by Lemma XI.2 that

I{J ii (x) = h(ii,x) - IIxll 2 = - inf {lix - vf v;' Da} ~ 0. (134)


k k
Let {(x S,t s)} be a subsequence converging to (x,i"). Then (134) implies that

Now let s - I m in (135) and observe that IIxll 2 ~ t (cf. 133). We obtain

o ~ h(ii,X) -lIxll 2 ~ min {h(ii,x) -lIxII} ~ 0 .


xES

But since S J Dii, this implies, because of Lemma XI.l, that


679

The assertion follows from Theorem XI.2.



Example XI.IO. Consider the problem

minimize f(x) = (x1)2 + (~)2 - cos 18x1 - cos 18x2

s.t. [(xl - 0.5)2 + (x2 - 0.415331)2] 1/2 ~ 0.65 ,

Let g(xl'x2) = - [(xl - 0.5)2 + (x2 - 0.415331)2] 1/2 + 0.65 and let
S = {(xl'x2) E 1R2: 0 ~ xl ~ 1, 0 ~ x2 ~ I}.
The functions g and f are Lipschitz continuous on S with Lipschitz constants 28.3 for

fand 1.0 for g.

According to Example XI.6 we choose the separator

r(a,x) = max {O, f~8?3a g(x)} . ,


With the stopping criterion inf (Qk) > -10-4 the algorithm terminates after 16
iterations with x = (0,1), f(X) = -0.6603168. No local methods were applied, Le., we
k -k
usedx =x.
The intermediate results are given in Table XI.2

Her. xk f(x)k i k CCk in/(Qk) I Vk I

1 fO, 1~ -0.6603 (0,1) -0.6603 -0.1693 4


2 1,1 0.6793 - - -0.1693 6
3
4 0.,693"1
0.8284,1
1.3639
1.7252
-
-
-
-
-0.0286
-0.0271
S
10
5 0,0.8551 0.6822 - - -0.0209 12
6 1,0.8532 2.0074 - - -0.0192 14
7 0.0695, 0.9353~ 0.9939 - - -0.0090 16
8 0.9284,0.9493 204908 - - 0.0054 18
9 0,0.9567~ -0.0271 - - -0.0018 18
10 0.0402,1 -004080 - - -0.0016 20
11 0.0041,0.9145) 0.5678 - - -0.0012 22
12 0.0899,1) 0.3967 - - -0.0011 24
13 . 0.0191,0.9841~ -0.3948 - - -0.0006 26
14 0.0445,0.8794 1.0731 - - -0.0003 28
15 0.0006,0.9841 -004537 - - -0.0002 30
16 0.0138,1) -0.6293 - - -0.0001 32
Table XI.2
REFERENCES

ABRHAM, J., and BUIE, R.N. (1975), A Note on Nonconcave Continuous Program-
ming. Zeitschrift für Operations Research, Serie A, 3, 107-114.
ADAMS, W.P. and SHERALI, H.D. (1986), A Tight Linearization and an Algorithm
for zero-one Quadratic Programming Problems. Management Science, 32,
1274-1290.

AFENTAKIS, P, GARISH, B. and KARMARKAR, U. (1984), Computationally


efficient optimal solutions to the lot-sizing problem in multistage assembly systems.
Management Science, 30, 222-239.

AGGARW AL, A. and FLOUDAS, C.A. (1990), A Decomposition Approach for


Global Optimum Search in QP, NLP and MINLP problems. Annals of Operations
Research, 25, 119-146.

AHUJA, R.K., MAGNANTI, T.L. and ORLIN, J.B. (1993), Network Flows: Theory,
Algorithms and Applications. Prentice Hall, Englewood cliffs, N.J.
ALAMEDDlNE, A. (1990), A New Reformulation-Linearization Technique for the
Bilinear Programming and Related Problems with Applications to Risk Management.
Ph.D., Dissertation, Department of Industrial and Systems Engineering, Virginia
Polytechnic Institute and State University, Blacksburg, Virginia.

AL-KHA YYAL, F.A. (1986), Further Simplified Characterizations of Linear Com-


plementarity Problems Solvable as Linear Programs. Report PDRC 864l8, School of
Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta.

AL-KHAYYAL, F.A. (1986a), Linear, Quadratic and Bilinear Programming


Approaches to the Linear Complementarity Problem. European Journal of
Operational Research, 24, 216-227.

AL-KHAYYAL, F.A. (1987), An Implicit Enumeration Procedure for the General


Linear Complementarity Problem. Mathematical Programming Study, 31, 1-20.
AL-KHAYYAL, F.A. (1989), Jointly Constrained Bilinear Programs and Related
Problems: An Overview. Computers and Mathematics with Applications, 19, 53-62 ..
AL-KHA YYAL, F .A. and F ALK, J.E. (1983), Jointly constrained biconvex pro-
gramming. Mathematics of Operations Research, 8,.273-286.
AL-KHAYYAL, F.A., HORST, R., and PARDALOS, P. (1992), Global Optimiza-
tion 0/ Concave Functions Subject to Separable Quadratic 'Constraints - An
Application in Nonlinear Bilevel Programming. Annals of Operations Research, 34,
125-147.
682

AL-KHAYYAL, F.A., LARSEN, C. and VAN VOORHIS, T. (1995), ARelaxation


Method for Noneonvex QuadratieaUy Constrained Quadratie Programs. Journal of
Global Optimization, 6, 215-230.

ALLGOWER, E.L. and GEORG, K. (1980), Simplieial and Continuation Methods


for Approximating Fixed Points and Solutions to Systems of Equations. SIAM
Review, 22, 28-85.
ALLGOWER, E.L. and GEORG, K. (1983), Predietor-eorreetor and Simplieial
Methods for Approximating Fixed Points and Zero Points of Nonlinear Mappings. In:
Bachem, A., Grötschel, M. and Korte, B. (ed.), Mathematical Programming: The
State of the Art. Springer, Berlin

ALTMANN, M. (1968), Bilinear Programming. BuH. Acad. Polon. Sei. Sero Sei.
Math. Astronom. Phys., 16, 741-745.

ANEJA, Y. P., AGGARWAL, V. and NAIR, K. P. (1984), On a Class ofQuadratie


Programs. European Journal of Operations Research, 18, 62-72.
ARCHETTI, F. and BETRO, B. (1978), APriori Analysis of Deterministie Strate-
gies for Global Optimization Problems. In: L.C.W. Dixon and G.P. Szegö (eds.),
Towards Global Optimization 2, North Holland.

ARCHETTI, F. and BETRO, B. (1979), A Probabalistie Algorithm for Global Opti-


mization. Calcolo, 16, 335-343
ARCHETTI, F. and BETRO, B. (1980), Stoehastic Models and Global Optimization.
Bolletino della Unione Matematica Italiana, 5, 17-A, 295-30l.
ARSOVE, G. (1953), Functions Representable as the Differenee of Subharmonie
Funetions. Transactions of the American Mathematical Soeiety, 75, 327-365.
ASPLUND, E. (1973), Differentiability of the Metrie Projeetion in Finite Dimen-
sional Euclidean Spaee. Proceedings of the American Mathematical Soeiety, 38,
218-219.
AUBIN, J.P. and EKELAND, I. (1976), Estimates ofthe Duality Gap in Noneonvex
Optimization. Mathematics of Operations Research, 1, 225-245.
AVIS, D. and FUKUDA, K. (1992), A Pivoting Algorithm for Convex Hulls and
Vertex Enumeration of Arrangements and Polyhedra. Discrete and Computational
Geometry, 8, 295-313.

AVRIEL, M. (1973), Methods for Solving Signomial and Reverse Convex Program-
ming Problems. In: Avriel et al. (ed.), Optimization and Design, Prentice-Hall Inc.,
Englewood Cliffs, N.J. 307-320.

AVRIEL, M. (1976), Nonlinear Programming: Analysis and Methods. Prentice Hall


Inc. Englewood Cliffs, N.J.

AVRIEL, M. and WILLIAMS, A.C. (1970), Complementary Geometrie Program-


ming. SIAM Journal of Applied Mathematics, 19, 125-14l.
AVRIEL, M. and ZANG, I. (1981), Generalized Arewise Conneeted Funetions and
Characterizations of Loeal-Global Minimum Properties. In: Schaible, S. and
Ziemba, W.T. (ed.) Generalized Concavity in Optimization and Economics,
Academic Press, New York.
683

BALAS, E. (1968), A Note on the Branch and Bound Principle. Operations Research
16, 442-445.

BALAS, E. (1971), Intersection Cuts - a New Type of Cutting Planes for Integer
Programming. Operations Research, 19, 19-39.
BALAS, E. (1972), Integer Programming and Convex Analysis: Intersection Cuts
!rom Outer Polars. Mathematical Programming, 2, 330-382.
BALAS, E. (1975), Disjunctive Programming: Cutting Planes !rom Logical Condi-
tions. In: "Nonlinear Programming, 2", Academic Press, Inc., New York, San
Francisco, London, 279-312.

BALAS, E. (1975a), Nonconvex Quadratic Programming via Generalized Polars.


SIAM Journal on Applied Mathematics, 28, 335-349.

BALAS, E. (1979), Disjunctive Programming. Annals of Discrete Mathematics, 5,


3-51.
BALAS, E., and BURDET, C.A. (1973), Maximizing a Convex Quadratic Function
Subject to Linear Constraints. Management Science Research Report No. 299,
Carnegie-Mellon University (Pittsburg, P.A.).

BALl, S. (1973), Minimization of a concave fu,nction on a bounded convex poly-


hedron. Ph.D. Dissertation, UCLA.
BALINSKI, M.L. (1961), An Algorithm for Finding All Vertices of Convex Poly-
hedral Sets. SIAM Journal, 9, 72-88.
BAN, V.T. (1982), A finite Algorithm for GlobaUy Minimizing a Concave Function
Under Linear Constraints and its Applications. Preprint, Institute of Mathematics,
Hanoi.

BANSAL, P.P. and JACOBSEN, S.E. (1975), Characterization of Local Solutions


for a Class of Nonconvex Programs. Journal of Optimzation Theory and
Applications, 15, 549-564.
BANSAL, P.P. and JACOBSEN, S.E. (1975a), An Algorithm for Optimizing Net-
work Flow Capacity under Economies of Scale. Journal of Optimization Theory and
Applications, 15, 565-586.
BAOPING, Z., WOOD, G.R. and BARITOMPA, W.P. (1993), Multidimensional
Bisection: Performance and Context. Journal of Global Optimization, 3, 337-358.
BARD, J.F. and F ALK, J.E. (1982), A Separable Programming Approach to the
Linear Complementarity Problem. Computation and Operations Research, 9,
153-159.

BARITOMPA, W.P. (1994), Accelerations for a Variety of Global Optimization


Methods. Journal of Global Optimization, 4, 37-46.
BARON, D.P. (1972), Quadratic Programming with Quadratic Constraints. Naval
Research Logistics Quarterly, 19, 253-260.

BARR, R.S., GLOVER, F. and KLINGMAN, D (1981), A New Optimization


Method for Large Scale Fixed Charge Transportation Problems. Operations
Research, 29, 3, 448-463.
684

BASSO, P. (1982), Iterative methods for the localization of the global maximum.
SIAM Journal on Numerical Analysis, 19, 781-792.

BASSO, P. (1985), Optimal Search for the Global Maximum of Functions with
Bounded Seminorm. SIAM Journal on Numerical Analysis, 22, 888-903.
BAZARAA, M.S. (1973), Geometry and Resolution of Duality Gaps. Naval Research
Logistics, 20, 357-366.
BAZARAA, M.S. and SHETTY, C.M. (1979), Nonlinear Programming: Theory and
Algorithms. John Wiley and Sons, New York.
BAZARAA, M.S. and SHERALI, H.D. (1982), On the Use of Exact and Heuristic
Cutting Plane Methods for the Quadratic Assignment Problem. Journal Operational
Society, 33, 991-1003.
BEALE, E.M.L. (1980), Branch and Bound Methods for Numerical Optimization of
Nonconvex Functions. In: Compstat 1980, Physica, Vienna, 11-20.
BEALE, E.M.L. and FORREST, J.J.H. (1978), Global Optimization as an Extension
of Integer Programming. In: Towards Globaf Optimization 2, Dixon, L.C.W. and
Jzego, eds., North Holland, Amsterdam, 131-149.

BENACER, R. (1986), Contribution a. letude des algorithmes de loptimisation non


convexe et non difforentiable. These, Universite Scientifique, Technologique et
Medicale de Grenoble.

BENACER, R. and TAO, P.D. (1986), Global Maximization. of a Nondefinite Qua-


dratic Function over a Convex Polyhedron. In: Hiriart-Urruty, ed., FERMAT DAYS
1985: Mathematics for Optimization, North-Holland, Amsterdam, 65-76.
BENDERS, J.F. (1962), Partitioning Procedures for Solving Mixed- Variables Pro-
gramming Problems. Numerische Mathematik, 4, 238-252.
BEN NAHIA, K. (1986), Autour de la biconvexite en optimisation. These, Universite
Paul Sabatier de Toulouse.
BENNET, K.P. and MANGASARIAN, O.L. (1992), Bilinear Separation oftwo Sets
in n-Space. Computer Science Technical Report No. 1109, Center for Parallel
Optimization, University of Wisconsin, Madison, Wisconsin.

BENSON, H.P. (l982), On the Convergence of Two Branch and Bound Algorithms
for Nonconvex Programming Problems. Journal of Optimization Theory and
Applications, 36, 129-134.

BENSON, H.P. (1982a), Algorithms for Parametrie Nonconvex Programming.


Journal of Optimization Theory and Applications, 38, 319-340.

BENSON, H.P. (1985), A Finite Algorithm for Concave Minimization Over a Poly-
hedron. Naval Research Logistics Quarterly, 32, 165-177.
BENSON, H.P. (1990), Separable Concave Minimization via Partial Outer Approxi-
mation and Branch and Baund. Operations Research Letters, 9, 389-394.
BENSON, H.P. (1995), Concave Minimization: Theory, Applications and
Algorithms. In: Horst, R. and Pardalos, P.M. (eds.), Handbook of Global
Optimization, 43-148, Kluwer, Dordrecht-Boston-London.
685

BENSON, H.P. and ERENGUC, S. (1988), Using Convex Envelopes to Solve the
Interactive Fixed Charge Linear Programming Problem. Journal of Optimization
Theory and Applications 59, 223-246.

BENSON, H.P., ERENGUC, S., HORST, R. (1990), A Note on Adapting Methods


for Continuous Global Optimization to the Discrete Case. Annals of Operations
Research 25, 243-252.

BENSON, H.P. and HORST, R. (1991), A Branch and Bound - Outer Approxi-
mation Algorithm /or Concave Minimization Over a Convex Set. Journal of
Computers and Mathematics with Applications 21, 67-76.

BENSON, H.P. and SA YlN, S. (1994), A finite Concave Minimization Algorithm


using Branch and Bound and Neighbor Generation. Journal of Global Optimization,
5, 1-14.

BEN TAL, A., EIGER, G. and GERSHOVITZ, V. (1994), Global Minimization by


Reducing the Duality Gap. Mathematical Programming, 63, 193-212.
BERGE, C. (1958), Theorie des Graphes et ses Applications. Dunod Paris.

BHATIA, H.L., (1981), Indefinite Quadratic Solid Transportation Problem. Journal


of Information and Optimization, 2, 297-303.

BITRAN, G.R., MAGNANTI, T.L. and YANASSE, H.H. (1984), Approximation


Methods /or the Uncapacitated Dynamic Lot Size Problem. Management Science, 30,
1121-1140.

BITTNER (1970), Some Representation Theorems for Functions and Sets and their
Applications to Nonlinear Programming. Numerische Mathematik, 16, 32-51.
BLUM, E. and OETTLI, W. (1975), Mathematische Optimierung. Springer-Verlag,
Berlin.

BOD, P. (1970), Solution 0/ a Fixed-Charge Linear Programming Problem. In:


Proceedings of Princeton Symposium on Mathematica Programming. Princeton
University Press, 367-375.

BODLAENDER, H.L., GRITZMANN, P., KLEE, V. and VAN LEEUWEN, J.


(1990), Computational Complexity 0/ Norm-Maximization. Combinatorica 10,
203-225.

BOHRINGER, M.C. (1973), Linear Programs with Additional Reverse Convex


Constraints. Ph.D. Dissertation, UCLA.
BOMZE, I.M. and DANNINGER, G. (1994), A finite Algorithm for Solving General
Quadratic Problems. Journal of Global Optimization, 4, 1-16.
BORCHARDT, M. (1988), An Exact Penalty Approach /or Solving a Class 0/
Minimization Problems with Boolean Variables. Optimization 19, 829-838.
BOY, M. (1988), The Design Centering Problem. Diplom-Thesis. Department of
Mathematics, University of Trief.

BRACKEN, J. and McCORMICK, G.P. (1968), Selected Applications 0/ Nonlinear


Programming, J. Wiley, New York.
BRAESS, D. (1986), Nonlinear Approximation Theory. Springer-Verlag, Berlin.
686

BRANIN, F.H. (1972), Widely Convergent Method for Finding Multiple Solutions of
Simultaneous Nonlinear Equations. IBM J. Res. Dev. 504-522.
BRENT, R.P. (1973), Algorithms for Minimization Without Derivatives. Prentice-
Hall, Englewood Cliffs.

BROOKS, S.H. (1958), Discussion of Random Methods for Locating Surface


Maxima. Operations Research, 6, 244-251.
BROTCHI, J.F. (1971), A Model for National Development. Management Science,
18, B14 - B18.

BULATOV, V.P. (1977), Embedding Methods in Optimization Problems. Nauka,


Novosibirsk (in Russian).
BULATOV, V.P. (1987), Methods for Solving Multiextremal Problems (Global
Search). In: Methods of Numerical Analysis and Optimization, ed. B.A. Beltiukov
and V.P. Bulatov, Nauka, Novosibirsk, 133-157 (in Russian).
BULATOV, V.P. and KASINKAYA, L.1. (1982), Some Methods of Concave
Minimization on a Convex Polyhedron and Their Applications. In: Methods of
Optimization and their Applications, Nauka, Novosibirsk, 71-80 (in Russian).

BULATOV, V.P. and KHAMISOV, O.V. (1992), The Branch and Bound Method
with Cuts in E n+1 for Solving Concave Programming Problems. Lecture Notes in
Control and Information Sciences, 180, 273-281.

BURDET, C.A. (1973), Polaroids: A New Tool in Nonconvex and in Integer Pro-
gramming. Naval Research Logistics Quarterly, 20, 13-24.
BURDET, C.A. (1973), Enumerative Cuts I. Operations Research, 21, 61-89.
BURDET, C.A. (1977), Convex and Polaroid Extensions. Naval Research Logistics
Quarterly, 26, 67-82.

BURDET, C.A. (1977a), Elements of a Theory in Nonconvex Programming. Naval


Research Logistics Quarterly, 24, 47-{)6.
CABOT, A.V. (1974), Variations on a Cutting Plane Method for Solving Concave
Minimization Problems with Linear Constraints. Naval Research Logistics, 21,
265-274.

CABOT, A.V. and ERENGUC, S.S. (1984), Some Branch and Bound Procedures
for Fixed-Cost Transportation Problems. Naval Research Logistics Quarterly, 31,
129-138.

CABOT, A.V. and ERENGUC, S.S. (1986), A Branch and Bound Algorithm for
Solving a Class of Nonlinear Integer Programming Problems. Naval Research
Logistics, 33, 559-567.

CABOT, A.V. and FRANCIS, R.L. (1970), Solving Nonconvex Quadratic Minimiza-
tion Problems by Ranking the Extreme Points. Operations Research, 18, 82-86.
CANDLER, W. and TOWNSLEY, R.J. (1964), The Maximization of a Quadratic
Function of Variables subject to Linear Inequalities. Management Science, 10,
515-523.

CARILLO, M.J. (1977), A Relaxation Algorithm for the Minimization of a Quasiconcave Function on a Convex Polyhedron. Mathematical Programming, 13, 69-80.

CARVAJAL-MORENO, R. (1972), Minimization of Concave Functions Subject to Linear Constraints. Operations Research Center Report ORC 72-3, University of California, Berkeley.

CHEN, P.C., HANSEN, P. and JAUMARD, B. (1991), On-Line and Off-Line Vertex Enumeration by Adjacency Lists. Operations Research Letters, 10, 403-409.

CHEN, P.C., HANSEN, P., JAUMARD, B. and TUY, H. (1992), Weber's Problem with Attraction and Repulsion. Journal of Regional Science, 32, 467-468.

CHENEY, E.W. and GOLDSTEIN, A.A. (1959), Newton's Method for Convex Programming and Tchebycheff Approximation. Numerische Mathematik, 1, 253-268.

CHENG, Y.C. (1984), On the Gradient Projection Method for Solving the Nonsymmetric Linear Complementarity Problem. Journal of Optimization Theory and Applications, 43, 527-541.

CHEW, S.H. and ZHENG, Q. (1988), Integral Global Optimization. Lecture Notes in
Economics and Mathematical Systems, 289, Springer-Verlag, Berlin.

CIRINA, M. (1983), Recent Progress in Complementarity. Paper presented at TIMS/ORSA Joint National Meeting, Chicago, IL.

CIRINA, M. (1985), A Class of Nonlinear Programming Test Problems. Working Paper, Dipartimento di Informatica, Torino, Italy.

CIRINA, M. (1986), A Finite Algorithm for Global Quadratic Minimization. Working Paper, Dip. di Informat., Torino, Italy.

COHEN, J.W. (1975), Plastic-Elastic-Torsion, Optimal Stopping and Free Boundaries. Journal of Engineering Mathematics, 9, 219-226.
COOPER, M. (1981), Survey of Methods for Pure Nonlinear Integer Programming.
Management Science, 27, 353-361.
COTTLE, R.C. and DANTZIG, G.B. (1968), Complementary Pivot Theory of Mathematical Programming. Linear Algebra and its Applications, 1, 103-125.

COTTLE, R. and MYLANDER, W.C. (1970), Ritter's Cutting Plane Method for Nonconvex Quadratic Programming. In: J. Abadie (ed.), Integer and Nonlinear Programming, North-Holland, Amsterdam.
COTTLE, R. and PANG, J.S. (1978), On Solving Linear Complementarity Problems
as Linear Programs. Mathematical Programming Study, 7, 88-107.
COTTLE, R., PANG, J.S. and STONE, R.E. (1992), The Linear Complementarity
Problem. Academic Press, Boston.
CZOCHRALSKA, I. (1982), Bilinear Programming. Zastosow. Mat., 17, 495-514.

CZOCHRALSKA, I. (1982a), The Method of Bilinear Programming for Nonconvex Quadratic Problems. Zastosow. Mat., 17, 515-523.
DAJANI, J.S. and HASIT, Y. (1974), Capital Cost Minimization of Drainage
Networks. Journal Environ. Eng. Div., 100, 325-337.

DANILIN, Y.M. (1971), Estimation of the Efficiency of an Absolute-Minimum-Finding Algorithm. USSR Computational Mathematics and Mathematical Physics, 11, 261-267.

DANTZIG, G.B. and WOLFE, P. (1960), Decomposition Principle for Linear Pro-
grams. Operations Research, 8, 101-111.
DENNIS, J.E. and SCHNABEL, R.B. (1983), Numerical Methods for Nonlinear
Equations and Unconstrained Optimization. Prentice-Hall, Englewood Cliffs, New
Jersey.
DEWDER, D.R. (1967), An Approximate Algorithm for the Fixed Charge Problem.
Naval Research Logistics Quarterly, 14, 101-113.
DIENER, I. (1987), On the Global Convergence of Path-Following Methods to Determine all Solutions to a System of Nonlinear Equations. Mathematical Programming, 39, 181-188.

DINH ZUNG (1987), Best Linear Methods of Approximation for Classes of Periodic Functions of Several Variables. Matematicheskie Zametki, 41, 646-653 (in Russian).

DIXON, L.C.W. (1978), Global Optima Without Convexity. In: Greenberg, H. (ed.), Design and Implementation of Optimization Software, Sijthoff and Noordhoff, Alphen aan den Rijn, 449-479.
DIXON, L.C.W., and SZEGO, G.P. (eds.) (1975), Towards Global Optimization.
Volume I. North-Holland, Amsterdam.

DIXON, L.C.W., and SZEGO, G.P. (eds.) (1978), Towards Global Optimization.
Volume II. North-Holland, Amsterdam.
DUONG, P.C. (1987), Finding the Global Extremum of a Polynomial Function. In:
Essays on Nonlinear Analysis and Optimization Problems, Institute of Mathematics,
Hanoi (Vietnam), 111-120.
DUTTON, R., HINMAN, G. and MILLHAM, C.B. (1974), The Optimal Location of
Nuclear-Power Facilities in the Pacific Northwest. Operations Research, 22,
478-487.
DYER, M.E. (1983), The Complexity of Vertex Enumeration Methods. Mathematics
of Operations Research, 8, 381-402.

DYER, M.E. and PROLL, L.G. (1977), An Algorithm for Determining All Extreme
Points of a Convex Polytope. Mathematical Programming, 12, 81-96.
DYER, M.E. and PROLL, L.G. (1982), An Improved Vertex Enumeration Algo-
rithm. European Journal of Operational Research, 9, 359-368.
EAVES, B.C. and ZANGWILL, W.I. (1971), Generalized Cutting Plane Algorithms. SIAM Journal on Control, 9, 529-542.
ECKER, J.G. and NIEMI, R.D. (1975), A Dual Method for Quadratic Programs
with Quadratic Constraints. SIAM Journal Applied Mathematics, 28, 568-576.
ELLAIA, R. (1984), Contribution à l'Analyse et l'Optimisation de Différences de Fonctions Convexes. Thèse du 3ème Cycle, Université Paul Sabatier, Toulouse.
ELSHAFEI, A.N. (1975), An Approach to Locational Analysis. Operational Re-
search Quarterly, 26, 167-181.

EMELICHEV, V.A. and KOVALEV, M.M. (1970), Solving Certain Concave Programming Problems by Successive Approximation I. Izvestya Akademii Nauk BSSR, 6, 27-34 (in Russian).

ERENGUC, S.S. (1988), Multiproduct Dynamic Lot-Sizing Model with Coordinated Replenishment. Naval Research Logistics, 35, 1-22.

ERENGUC, S.S. and BENSON, H.P. (1986), The Interactive Fixed Charge Linear Programming Problem. Naval Research Logistics, 33, 157-177.
ERENGUC, S.S. and BENSON, H.P. (1987), An Algorithm for Indefinite Integer
Quadratic Programming. Discussion Paper 134, Center for Econometrics and
Decision Sciences, University of Florida.

ERENGUC, S.S. and BENSON, H.P. (1987a), Concave Integer Minimizations Over
a Compact, Convex Set. Working Paper No. 135, Center for Econometrics and
Decision Science, University of Florida.

ERICKSON, R.E., MONMA, C.L. and VEINOTT, A.F. (1987), Send-and-Split Method for Minimum-Concave-Cost Network Flows. Mathematics of Operations Research, 12, 634-664.

EVTUSHENKO, Y.G. (1971), Numerical Methods for Finding the Global Extremum
of a Function. USSR Computational Mathematics and Mathematical Physics, 11,
38-54.

EVTUSHENKO, Y.G. (1985), Numerical Optimization Techniques. Translation Series in Mathematics and Engineering, Optimization Software Inc. Publication Division, New York.
EVTUSHENKO, Y.G. (1987), Bisection Method for Global Optimization of
Functions of Many Variables. Technicheskaya Kibernetika, 1, 119-127 (in Russian).
FALK, J.E. (1967), Lagrange Multipliers and Nonlinear Programming. J. Mathe-
matical Analysis and Applications, 19, 141-159.
FALK, J.E. (1969), Lagrange Multipliers and Nonconvex Programs. SIAM Journal
on Control, 7, 534-545.

FALK, J.E. (1972), An Algorithm for Locating Approximate Global Solutions of Nonconvex, Separable Problems. Working Paper Serial T-262, Program in Logistics, The George Washington University.
FALK, J.E. (1973), A Linear Max-Min Problem. Mathematical Programming, 5,
169-188.

FALK, J.E. (1973a), Conditions for Global Optimality in Nonlinear Programming. Operations Research, 21, 337-340.

FALK, J.E. (1974), Sharper Bounds on Nonconvex Programs. Operations Research, 22, 410-412.

FALK, J.E., BRACKEN, J. and McGILL, J.T. (1974), The Equivalence of Two Mathematical Programs with Optimization Problems in the Constraints. Operations Research, 22, 1102-1104.

FALK, J.E. and HOFFMAN, K.L. (1976), A Successive Underestimation Method for Concave Minimization Problems. Mathematics of Operations Research, 1, 251-259.

FALK, J.E. and HOFFMAN, K.L. (1986), Concave Minimization via Collapsing Polytopes. Operations Research, 34, 919-929.
FALK, J.E. and SOLAND, R.M. (1969), An Algorithm for Separable Nonconvex
Programming Problems. Management Science, 15, 550-569.
FEDOROV, V.V. (ed.) (1985), Problems of Cybernetics, Models and Methods in
Global Optimization. USSR Academy of Sciences, Moscow (in Russian).
FENCHEL, W. (1949), On Conjugate Convex Functions. Canadian Journal of
Mathematics, 1, 73-77.
FENCHEL, W. (1951), Convex Cones, Sets and Functions. Mimeographed Lecture
Notes, Princeton University.

FERLAND, A.J. (1975), On the Maximum of a Quasi-Convex Quadratic Function on a Polyhedral Convex Set. SIAM Journal on Applied Mathematics, 29, 409-415.
FLORIAN, M. (1986), Nonlinear Cost Network Models in Transportation Analysis.
Mathematical Programming Study, 26, 167-196.

FLORIAN, M. and ROBILLARD, P. (1971), An Implicit Enumeration Algorithm for the Concave Cost Network Flow Problem. Management Science, 18, 184-193.

FLORIAN, M., ROSSIN, M.A. and de WERRA, D. (1971), A Property of Minimum Concave Cost Flows in Capacitated Networks. Canadian Journal of Operations Research, 9, 293-304.
FLOUDAS, C.A. and AGGARWAL, A. (1990), A Decomposition Strategy for
Global Optimum Search in the Pooling Problem. ORSA Journal on Computing, 2,
225-235.

FLOUDAS, C.A. and PARDALOS, P.M. (1990), A Collection of Test Problems for
Constrained Global Optimization Algorithms. Lecture Notes in Computer Science,
455, Springer Verlag, Berlin.

FLOUDAS, C.A. and VISWESWARAN, V. (1995), Quadratic Optimization. In: Horst, R. and Pardalos, P.M. (eds.), Handbook of Global Optimization, 217-270, Kluwer, Dordrecht-Boston-London.

FORGO, F. (1972), Cutting Plane Methods for Solving Nonconvex Quadratic Prob-
lems. Acta Cybernetica, 1, 171-192.
FORGO, F. (1988), Nonconvex Programming. Akademiai Kiado, Budapest.

FORSTER, W. (ed.) (1980), Numerical Solution of Highly Nonlinear Problems. North-Holland, Amsterdam.

FORSTER, W. (1995), Homotopy Methods. In: Horst, R. and Pardalos, P.M. (eds.), Handbook of Global Optimization, 669-750, Kluwer, Dordrecht-Boston-London.

FRIEZE, A.M. (1974), A Bilinear Programming Formulation of the 3-Dimensional Assignment Problem. Mathematical Programming, 7, 376-379.

FÜLÖP, J. (1990), A Finite Cutting Plane Method for Solving Linear Programs with an Additional Reverse Convex Constraint. European Journal of Operational Research, 44, 395-409.

FÜLÖP, J. (1995), Deletion-by-Infeasibility Rule for DC-Constrained Global Optimization. Journal of Optimization Theory and Applications, 84, 443-455.

FUJIWARA, O. and KHANG, D.B. (1988), Optimal Design of Looped Water Distribution Networks via Concave Minimization. 13th International Symposium on Mathematical Programming 1988.

FUKUSHIMA, M. (1983), An Outer Approximation Algorithm for Solving General Convex Programs. Operations Research, 31, 101-113.
FUKUSHIMA, M. (1984), On the Convergence of a Class of Outer Approximation
Algorithms for Convex Programs. Journal of Computational and Applied
Mathematics, 10, 147-156.

GAL, T. (1975), Zur Identifikation Redundanter Nebenbedingungen in Linearen Programmen. Zeitschrift für Operations Research, 19, 19-28.

GAL, T. (1985), On the Structure of the Set Bases of a Degenerate Point. Journal of Optimization Theory and Applications, 45, 577-589.

GAL, T., KRUSE, H.J. and ZÖRNIG, P. (1988), Survey of Solved and Open
Problems in the Degeneracy Phenomenon. Mathematical Programming B, 42,
125-133.

GALLO, G., SANDI, C. and SODINI, C. (1980), An Algorithm for the Min Concave
Cost Flow Problem, European Journal of Operational Research, 4, 249-255.
GALLO, G. and SODINI, C. (1979), Adjacent Extreme Flows and Application to
Min Concave Cost Flow Problem. Networks, 9, 95-122.
GALLO, G. and SODINI, C. (1979a), Concave Cost Minimization on Networks.
European Journal of Operations Research, 3, 239-249.
GALLO, G. and ÜLKUCÜ, A. (1977), Bilinear Programming: An Exact Algorithm.
Mathematical Programming, 12, 173-194.
GALPERIN, E.A. (1985), The Cubic Algorithm. Journal of Mathematical Analysis and Applications, 112, 635-640.

GALPERIN, E.A. (1988), Precision, Complexity, and Computational Schemes of the Cubic Algorithm. Journal of Optimization Theory and Applications, 57, 223-238.
GALPERIN, E.A. and ZHENG, Q. (1987), Nonlinear Observation via Global Opti-
mization Methods: Measure Theory Approach. Journal of Optimization Theory and
Applications, 54, 1, 63-92.

GANSHIN, G.S. (1976), Function Maximization. USSR Computational Mathematics and Mathematical Physics, 16, 26-36.

GANSHIN, G.S. (1976a), Simplest Sequential Search Algorithm for the Largest Value of a Twice-Differentiable Function. USSR Computational Mathematics and Mathematical Physics, 16, 508-509.

GANSHIN, G.S. (1977), Optimal Passive Algorithms for Evaluating the Maximum of
a Function in an Interval. USSR Computational Mathematics and Mathematical
Physics, 17, 8-17.
GARCIA, C.B. and ZANGWILL, W.I. (1981), Pathways to Solutions, Fixed Points and Equilibria. Prentice-Hall, Englewood Cliffs, N.J.

GAREY, M.P., JOHNSON, D.S. and STOCKMEYER, L. (1976), Some Simplified NP-Complete Problems. Theoretical Computer Science, 1, 237-267.
GAREY, M.P. and JOHNSON, D.S. (1979), Computers and Intractability: A Guide
to the Theory of NP-Completeness. Freeman, San Francisco.
GASANOV, I.I. and RIKUN, A.D. (1984), On Necessary and Sufficient Conditions for Uniextremality in Nonconvex Mathematical Programming Problems. Soviet Mathematics Doklady, 30, 457-459.

GASANOV, I.I. and RIKUN, A.D. (1985), The Necessary and Sufficient Conditions for Single Extremality in Nonconvex Problems of Mathematical Programming. USSR Computational Mathematics and Mathematical Physics, 25, 105-113.
GEOFFRION, A.M. (1971), Duality in Nonlinear Programming: A Simplified
Applications-Oriented Development. SIAM Reviews, 13, 1-37.
GEOFFRION, A.M. (1972), Generalized Benders Decomposition. Journal of Opti-
mization Theory and Applications, 10, 237-260.

GHANNADAN, S., MIGDALAS, A., TUY, H. and VARBRAND, P. (1994), Heuristics Based on Tabu Search and Lagrangian Relaxation for the Concave Production-Transportation Problem. Studies in Regional & Urban Planning, Issue 3, 127-140.

GHANNADAN, S., MIGDALAS, A., TUY, H. and VARBRAND, P. (1994), Tabu Meta-heuristic Based on Local Search for the Concave Production-Transportation Problem. Preprint, Department of Mathematics, Linköping University, to appear in Studies in Location Analysis.
GIANNESSI, F., JURINA, L. and MAIER, G. (1979), Optimal Excavation Profile
for a Pipeline Freely Resting on the Sea Floor. Engineering Structures, 1, 81-91.
GIANNESSI, F. and NICCOLUCCI, F. (1976), Connections between Nonlinear and
Integer Programming Problems. Istituto Nazionale di Alta Mathematica, Symposia
Mathematica, 19, 161-176.
GINSBERG, W. (1973), Concavity and Quasiconcavity in Economics. Journal of
Econ. Theory, 6, 596-605.

GLOVER, F. (1972), Cut Search Methods in Integer Programming. Mathematical Programming, 3, 86-100.

GLOVER, F. (1973), Convexity Cuts and Cut Search. Operations Research, 21, 123-124.

GLOVER, F. (1973a), Concave Programming Applied to a Special Class of 0-1 Integer Programs. Operations Research, 21, 135-140.

GLOVER, F. (1974), Polyhedral Convexity Cuts and Negative Edge Extensions. Zeitschrift für Operations Research, 18, 181-186.

GLOVER, F. (1975), Polyhedral Annexation in Mixed Integer and Combinatorial Programming. Mathematical Programming, 8, 161-188.

GLOVER, F. (1975a), Surrogate Constraint Duality in Mathematical Programming. Operations Research, 23, 434-451.

GLOVER, F. and KLINGMAN, D. (1973), Concave Programming Applied to a Special Class of 0-1 Integer Programs. Operations Research, 21, 135-140.
GLOVER, F. and WOLSEY, E. (1973), Further Reduction of Zero-One Polynomial
Programming Problems to Zero-One Linear Programming Problems. Operations
Research, 21, 141-161.

GLOVER, F. and WOLSEY, E. (1974), Converting the 0-1 Polynomial Programming Problem to a Linear 0-1 Program. Operations Research, 22, 180-182.

GOMORY, R.E. (1958), Outline of an Algorithm for Integer Solutions to Linear Programs. Bulletin of the American Mathematical Society, 64, 275-278.

GOMORY, R.E. (1960), Solving Linear Programming Problems in Integers. In: Combinatorial Analysis (R. Bellman and M. Hall eds.), Providence, 211-215.

GONZAGA, C. and POLAK, E. (1979), On Constraint Dropping Schemes and Optimality Functions for a Class of Outer Approximation Algorithms. SIAM Journal on Control and Optimization, 17, 477-493.
GOULD, F.J. (1969), Extension of Lagrange Multipliers in Nonlinear Programming.
SIAM Journal on Applied Mathematics, 17, 1280-1297.

GOULD, F.J. and TOLLE, J.W. (1983), Complementary Pivoting on a Pseudomanifold Structure with Applications in the Decision Sciences. Heldermann, Berlin.
GRAVES, S.T. and ORLIN, J.B. (1985), A Minimum-Cost Dynamic Network Flow
Problem with an Application to Lot-Sizing. Networks, 15, 59-71.
GRAVES, G.W. and WHINSTON, A.B. (1970), An Algorithm for the Quadratic
Assignment Problem. Management Science, 17, 453-471.
GRAY, P. (1971), Exact Solution of the Fixed Charge Transportation Problem.
Operations Research, 19, 1529-1538.

GREENBERG, H.J. (1973), Bounding Nonconvex Programs by Conjugates. Operations Research, 21, 346-347.
GRIEWANK, A.O. (1981), Generalized Descent for Global Optimization. Journal of
Optimization Theory and Applications, 34, 11-39.

GRITZMANN, P. and KLEE, V. (1988), On the 0-1-Maximization of Positive Definite Quadratic Forms. Discussion Paper. Operations Research Proceedings, Springer, Berlin, 222-227.
GROCH, A., VIDIGAL, L. and DIRECTOR, S. (1985), A New Global Optimization
Method for Electronic Circuit Design. IEEE Transactions on Circuits and Systems,
CAS-32, 160-170.
GROTTE, J.H. (1975), A Nontarget Damage Minimizer which achieves Targeting
Objectives: Model Description and Users Guide. Working Paper WP-8, Institute of
Defense Analysis, Arlington, Virginia.

GUISEWITE, G.M. (1995), Network Problems. In: Horst, R. and Pardalos, P.M.
(eds.), Handbook of Global Optimization, 609-678, Kluwer, Dordrecht-Boston-Lon-
don.

GUISEWITE, G.M. and PARDALOS, P.M. (1989), On the Complexity of Minimum Concave-Cost Network Flow Problems. Discussion Paper, Department of Computer Science, Pennsylvania State University.

GUISEWITE, G.M. and PARDALOS, P.M. (1991), Algorithms for the Single-Source Uncapacitated Minimum Concave-Cost Network Flow Problem. Journal of Global Optimization, 1, 246-265.
GUISEWITE, G.M. and PARDALOS, P.M. (1993), A Polynomial Time Solvable Concave Network Flow Problem. Networks, 23, 143-147.
GUPTA, A.K. and SHARMA, J.K. (1983), A Generalized Simplex Technique for
Solving Quadratic Programming Problems. Indian Journal of Technology, 21,
198-201.
GURLITZ, T.R. (1985), Algorithms for Reverse Convex Programs. Ph.D. Disser-
tation, UCLA.
GURLITZ, T.R. and JACOBSEN, S.E. (1991), On the Use of Cuts in Reverse Convex Programs. Journal of Optimization Theory and Applications, 68, 251-274.

HAGER, W.W., PARDALOS, P.M., ROUSSOS, I.M. and SAHINOGLOW, H.D. (1991), Active Constraints, Indefinite Quadratic Programming and Test Problems. Journal of Optimization Theory and Applications, 68, 499-511.
HAMAMI, M. (1982), Finitely Convergent Tuy-Type Algorithms for Concave Mini-
mization. Ph.D. Dissertation, UCLA.
HAMAMI, M. and JACOBSEN, S.E. (1988), Exhaustive Nondegenerate Conical Processes for Concave Minimization on Convex Polytopes. Mathematics of Operations Research, 13, 419-481.
HAMMER, P.L. (1965), Some Network Flow Problems with Pseudo-Boolean
Programming. Operations Research, 13, 388-399.
HANSEN, E. (1979), Global Optimization Using Interval Analysis - The One-Dimensional Case. Journal of Optimization Theory and its Applications, 29, 331-344.

HANSEN, E. (1980), Global Optimization Using Interval Analysis - The Multidimensional Case. Numerische Mathematik, 34, 247-270.

HANSEN, P. (1979), Methods of Nonlinear 0-1 Programming. Annals of Discrete Mathematics, Discrete Optimization II, 5, 53-70.

HANSEN, P. and JAUMARD, B. (1995), Lipschitz Optimization. In: Horst, R. and Pardalos, P.M. (eds.), Handbook of Global Optimization, 408-494, Kluwer, Dordrecht-Boston-London.

HANSEN, P., JAUMARD, B. and LU, S.H. (1989), A Framework for Algorithms in Globally Optimal Design. Research Report G-88-11, HEC-GERAD, University of Montreal.
HANSEN, P., JAUMARD, B. and LU, S.H. (1989a), Global Minimization of Uni-
variate Functions by Sequential Polynomial Approximation. International Journal of
Computer Mathematics, 28, 183-193.
695

HANSEN, P., JAUMARD, B. and LU, S.H. (1991), On the Number of Iterations of Piyavskii's Global Optimization Algorithm. Mathematics of Operations Research, 16, 334-350.

HANSEN, P., JAUMARD, B. and LU, S.H. (1992), Global Optimization of Univariate Lipschitz Functions: I. Survey and Properties. Mathematical Programming, 55, 251-272.

HANSEN, P., JAUMARD, B. and LU, S.H. (1992a), Global Optimization of Univariate Lipschitz Functions: Part II. New Algorithms and Computational Comparison. Mathematical Programming, 55, 273-292.
HARMAN, J.K. (1973), Some Experience in Global Optimization. Naval Research Logistics Quarterly, 20, 569-576.

HARTMAN, P. (1959), On Functions Representable as a Difference of Convex Functions. Pacific Journal of Mathematics, 9, 707-713.
HASHAM, A. and SACK, J.-R. (1987), Bounds for Min-Max Heaps. BIT, 27,
315-323.
HERON, B. and SERMANGE, M. (1982), Nonconvex Methods for Computing Free Boundary Equilibria of Axially Symmetric Plasmas. Applied Mathematics and Optimization, 8, 351-382.

HEYDEN, L. Van der (1980), A Variable Dimension Algorithm for the Linear
Complementarity Problem. Mathematical Programming, 19, 123-130.
HILLESTAD, R.J. (1975), Optimization Problems Subject to a Budget Constraint with Economies of Scale. Operations Research, 23, 1091-1098.

HILLESTAD, R.J. and JACOBSEN, S.E. (1980), Reverse Convex Programming. Applied Mathematics and Optimization, 6, 63-78.

HILLESTAD, R.J. and JACOBSEN, S.E. (1980a), Linear Programs with an Additional Reverse Convex Constraint. Applied Mathematics and Optimization, 6, 257-269.
HIRIART-URRUTY, J.B. (1985), Generalized Differentiability, Duality and Optimization for Problems Dealing with Differences of Convex Functions. Lecture Notes in Economics and Mathematical Systems, 256, 37-69, Springer-Verlag, Berlin.

HIRIART-URRUTY, J.B. (1986), When is a point x satisfying ∇f(x) = 0 a global minimum of f? Amer. Math. Monthly, 93, 556-558.

HIRIART-URRUTY, J.B. (1989), From Convex Optimization to Nonconvex Optimization. Part I: Necessary and Sufficient Conditions for Global Optimality. In: Clarke, F.H., Demyanov, V.F. and Gianessi, F. (eds.), Nonsmooth Optimization and Related Topics, 219-239, Plenum, New York.

HOFFMAN, K.L. (1981), A Method for Globally Minimizing Concave Functions over Convex Sets. Mathematical Programming, 20, 22-32.

HOGAN, W.W. (1973), Applications of a General Convergence Theory for Outer Approximation Algorithms. Mathematical Programming, 5, 151-168.
HORST, R. (1976), An Algorithm for Nonconvex Programming Problems. Mathema-
tical Programming, 10, 312-321.

HORST, R. (1976a), On the Characterization of Affine Hull-Functionals. Zeitschrift für Angewandte Mathematik und Mechanik, 52, 347-348 (in German).

HORST, R. (1976b), A New Branch and Bound Approach for Concave Minimization Problems. Lecture Notes in Computer Science, 41, 330-337, Springer-Verlag, Berlin.
HORST, R. (1978), A New Approach for Separable Nonconvex Minimization Prob-
lems Including a Method for Finding the Global Minimum of a Function of a Single
Variable. Proceedings in Operations Research, 7, 39-47, Physica, Heidelberg.
HORST, R. (1979), Nichtlineare Optimierung. Carl Hanser-Verlag, München.

HORST, R. (1980), A Note on the Convergence of an Algorithm for Nonconvex Programming Problems. Mathematical Programming, 19, 237-238.

HORST, R. (1980a), A Note on the Duality Gap in Nonconvex Optimization and a very simple Procedure for Bid Evaluation Type Problems. European Journal of Operational Research, 5, 205-210.
HORST, R. (1982), A Note on Functions whose Local Minima are Global. Journal of
Optimization Theory and Applications, 36, 457-463.

HORST, R. (1984), On the Convexification of Nonconvex Programming Problems. European Journal of Operational Research, 15, 382-392.

HORST, R. (1984a), Global Minimization in Arcwise Connected Spaces. Journal of Mathematical Analysis and Applications, 104, 481-483.

HORST, R. (1984b), On the Global Minimization of Concave Functions: Introduction and Survey. Operations Research Spektrum, 6, 195-205.
HORST, R. (1986), A General Class of Branch and Bound Methods in Global Optimization with some New Approaches for Concave Minimization. Journal of Optimization Theory and Applications, 51, 271-291.

HORST, R. (1987), On Solving Lipschitzian Optimization Problems. In: Essays on Nonlinear Analysis and Optimization Problems, National Center for Scientific Research, Hanoi, 73-88.

HORST, R. (1987a), Outer Cut Methods in Global Optimization. Lecture Notes in Economics and Mathematical Systems, 304, 28-40.

HORST, R. (1988), Deterministic Global Optimization with Partition Sets whose Feasibility is not Known. Application to Concave Minimization, Reverse Convex Constraints, D.C.-Programming and Lipschitzian Optimization. Journal of Optimization Theory and Applications, 58, 11-37.

HORST, R. (1989), On Consistency of Bounding Operations in Deterministic Global Optimization. Journal of Optimization Theory and Applications, 61, 143-146.

HORST, R. (1990), Deterministic Global Optimization: Some Recent Advances and New Fields of Application. Naval Research Logistics, 37, 433-471.

HORST, R. (1991), On the Vertex Enumeration Problem in Cutting Plane Algorithms for Global Optimization. In: Fandel and Rauhut (eds.), Beiträge zur Quantitativen Wirtschaftsforschung, Springer, Berlin.

HORST, R. (1995), On Generalized Bisection of N-Simplices. Preprint, Department of Mathematics, University of Trier.

HORST, R. and DIEN, L.V. (1987), A Solution Concept for a very General Class of Decision Problems. In: Opitz and Rauhut (eds.), Mathematik und Ökonomie, Springer, 1987, 143-153.

HORST, R., MUU, L.D. and NAST, M. (1994), A Branch and Bound Decomposition Approach for Solving Quasiconvex-Concave Programs. Journal of Optimization Theory and Applications, 82, 267-293.

HORST, R. and NAST, M. (1996), Linearly Constrained Global Minimization of Functions with Concave Minorants. To appear in Journal of Optimization Theory and Applications.

HORST, R., NAST, M. and THOAI, N.V. (1995), New LP-Bound in Multivariate Lipschitz Optimization: Theory and Applications. Journal of Optimization Theory and Applications, 86, 369-388.

HORST, R. and PARDALOS, P.M. (eds.) (1995), Handbook of Global Optimization. Kluwer, Dordrecht-Boston-London.

HORST, R., PARDALOS, P.M. and THOAI, N.V. (1995), Introduction to Global
Optimization. Kluwer, Dordrecht-Boston-London.
HORST, R., PHONG, T.Q. and THOAI, N.V. (1990), On Solving General Reverse Convex Programming Problems by a Sequence of Linear Programs and Line Searches. Annals of Operations Research, 25, 1-18.

HORST, R., PHONG, T.Q., THOAI, N.V. and VRIES, J. de (1991), On Solving a D.C. Programming Problem by a Sequence of Linear Programs. Journal of Global Optimization, 1, 183-204.

HORST, R. and THACH, P.T. (1988), A Topological Property of Limes-Arcwise Strictly Quasiconvex Functions. Journal of Mathematical Analysis and Applications, 134, 426-430.

HORST, R. and THOAI, N.V. (1988), Branch-and-Bound Methods for Solving Systems of Lipschitzian Equations and Inequalities. Journal of Optimization Theory and Applications, 58, 139-146.

HORST, R. and THOAI, N.V. (1989), Modification, Implementation and Comparison of Three Algorithms for Globally Solving Linearly Constrained Concave Minimization Problems. Computing, 42, 271-289.

HORST, R. and THOAI, N.V. (1992), Conical Algorithms for the Global Minimization of Linearly Constrained Large-Scale Concave Minimization Problems. Journal of Optimization Theory and Applications, 74, 469-486.

HORST, R. and THOAI, N.V. (1994), Constraint Decomposition Algorithms in Global Optimization. Journal of Global Optimization, 5, 333-348.
HORST, R. and THOAI, N.V. (1994a), An Integer Concave Minimization Approach
for the Minimum Concave Cost Capacitated Flow Problem on Networks. Preprint,
Department of Mathematics, University of Trier.
HORST, R. and THOAI, N.V. (1995), Global Optimization of Separable Concave
Functions under Linear Constraints with Totally Unimodular Matrices. To appear in
Floudas, C. and Pardalos, P.M. (eds.), State of the Art in Global Optimization,
Kluwer, Dordrecht-Boston-London.

HORST, R. and THOAI, N.V. (1995a), A New Algorithm for Solving the General Quadratic Programming Problem. To appear in Computational Optimization and Applications.

HORST, R. and THOAI, N.V. (1996), A Decomposition Approach for the Global Minimization of Biconcave Functions over Polytopes. To appear in Journal of Optimization Theory and Applications.

HORST, R., THOAI, N.V. and BENSON, H.P. (1991), Concave Minimization via Conical Partitions and Polyhedral Outer Approximation. Mathematical Programming, 50, 259-274.

HORST, R., THOAI, N.V. and TUY, H. (1987), Outer Approximation by Polyhedral
Convex Sets. Operations Research Spektrum, 9, 153-159.

HORST, R., THOAI, N.V. and TUY, H. (1989), On an Outer Approximation Concept in Global Optimization. Optimization, 20, 255-264.

HORST, R., THOAI, N.V. and VRIES, J. de (1988), On Finding New Vertices and Redundant Constraints in Cutting Plane Algorithms for Global Optimization. Operations Research Letters, 7, 85-90.

HORST, R., THOAI, N.V. and VRIES, J. de (1992), A New Simplicial Cover
Technique in Constrained Global Optimization. Journal of Global Optimization, 2,
1-19.

HORST, R., THOAI, N.V. and VRIES, J. de (1992a), On Geometry and Convergence of a Class of Simplicial Covers. Optimization, 25, 53-64.

HORST, R. and TUY, H. (1987), On the Convergence of Global Methods in Multiextremal Optimization. Journal of Optimization Theory and Applications, 54, 253-271.

HORST, R. and TUY, H. (1991), The Geometric Complementarity Problem and Transcending Stationarity in Global Optimization. DIMACS Series in Discrete Mathematics and Computer Science, Volume 4, "Applied Geometry and Discrete Mathematics, The Victor Klee Festschrift", 341-354.

IBARAKI, T. (1971), Complementarity Programming. Operations Research, 19, 1523-1528.

ISTOMIN, L.A. (1977), A Modification of Tuy's Method for Minimizing a Concave Function Over a Polytope. USSR Computational Mathematics and Mathematical Physics, 17, 1582-1592 (in Russian).

IVANOV, V.V. (1972), On Optimal Algorithms of Minimization in the Class of Functions with the Lipschitz Condition. Information Processing, 71, 1324-1327.

JACOBSEN, S.E. (1981), Convergence of a Tuy-Type-Algorithm for Concave Minimization Subject to Linear Inequality Constraints. Applied Mathematics and Optimization, 7, 1-9.

JACOBSEN, S.E. and TORABI, M. (1978), A Global Minimization Algorithm for a Class of One-Dimensional Functions. Journal of Mathematical Analysis and Applications, 62, 310-324.

JENSEN, P.A. and BARNES, J.W. (1980), Network Flow Programming. John
Wiley, New York.

JEROSLOW, R.G. (1976), Cutting Planes for Complementarity Constraints. SIAM Journal on Control and Optimization, 16, 56-62.
JEROSLOW, R.G. (1977), Cutting Plane Theory: Disjunctive Methods. Annals of
Discrete Mathematics, 1, 293-330.

JONES, A.P. and SOLAND, R.M. (1969), A Branch-and-Bound Algorithm for Multi-Level Fixed-Charge Problems. Management Science, 16, 67-76.

KALANTARI, B. (1984), Large Scale Global Minimization of Linearly Constrained Concave Quadratic Functions and Related Problems. Ph.D. Thesis, Computer Sci. Dept., University of Minnesota.

KALANTARI, B. (1986), Quadratic Functions with Exponential Number of Local Maxima. Operations Research Letters, 5, 47-49.
KALANTARI, B. and ROSEN, J.B. (1986), Construction of Large-Scale Global
Minimum Concave Test Problems. Journal of Optimization Theory and Applica-
tions, 48, 303-313.

KALANTARI, B. and ROSEN, J.B. (1987), An Algorithm for Global Minimization of Linearly Constrained Concave Quadratic Functions. Mathematics of Operations Research, 12, 544-561.

KALANTARI, B. and ROSEN, J.B. (1987a), Penalty Formulation for Zero-One Nonlinear Programming. Discrete Applied Mathematics, 16, 179-182.
KALL, P. (1986), Approximation to Optimization Problems: An Elementary Review.
Mathematics of Operations Research, 11, 9-17.

KAO, E. (1979), A Multi-Product Lot-Size Model with Individual and Joint Set-Up
Costs. Operations Research, 27, 279-289.
KEARFOTT, R.B. (1987), Abstract Generalized Bisection and a Cost Bound.
Mathematics of Computation, 49, 187-202.
KEDE, G. and WATANABE, H. (1983), Optimization Techniques for IC Layout and Compaction. Proceedings IEEE Intern. Conf. in Computer Design: VLSI in Computers, 709-713.

KELLEY, J.E. (1960), The Cutting-Plane Method for Solving Convex Programs. Journal SIAM, 8, 703-712.
KHACHATUROV, V. and UTKIN, S. (1988), Solving Multiextremal Concave Programming Problems by Combinatorial Approximation Method. Preprint, Computer Center of the Academy of Sciences, Moscow (in Russian).

KHAMISOV, O. (1995), Functions with Concave Minorants. To appear in Floudas, C. and Pardalos, P.M. (eds.), State of the Art in Global Optimization, Kluwer, Dordrecht-Boston-London.

KHANG, D.B. and FUJIWARA, O. (1989), A New Algorithm to Find All Vertices of
a Polytope. Operations Research Letters, 8, 261-264.
KIEFER, J. (1957), Optimum Search and Approximation Methods Under Minimum
Regularity Assumptions. SIAM Journal, 5, 105-136.
KIWIEL, K.C. (1985), Methods of Descent for Nondifferentiable Optimization.
Lecture Notes in Mathematics, 1133, Springer-Verlag, Berlin.

KLEIBOHM, K. (1967), Remarks on the Non-Convex Programming Problem. Unternehmensforschung, 2, 49-60 (in German).

KLEITMAN, D.J. and CLAUS, A. (1972), A Large Scale Multicommodity Flow Problem: Telepack. Proceedings Symp. Comput. Commun. Networks and Teletraffic, Polytechn. Institute New York, 22, 335-338.

KLINZ, B. and TUY, H. (1993), Minimum Concave Cost Network Flow Problems
with a Single Nonlinear Arc Cost. In: Pardalos, P.M. and Du, D.Z. (eds.), Network
Optimization Problems, 125-143, World Scientific, Singapore.
KOEHLER, G., WHINSTON, A.B. and WRIGHT, G.P. (1975), Optimization over
Leontiev Substitution Systems. North Holland, Amsterdam.
KONNO, H. (1971), Bilinear Programming: Part I. An Algorithm for Solving
Bilinear Programs. Technical Report No. 71-9, Operations Research House,
Stanford University.
KONNO, H. (1971a), Bilinear Programming: Part II. Applications of Bilinear
Programming. Technical Report No. 71-10, Operations Research House, Stanford
University, Stanford, CA.
KONNO, H. (1973), Minimum Concave Series Production Systems with Deterministic Demand-Backlogging. Journal of the Operations Research Society of Japan, 16, 246-253.

KONNO, H. (1976), A Cutting-Plane Algorithm for Solving Bilinear Programs. Mathematical Programming, 11, 14-27.

KONNO, H. (1976a), Maximization of a Convex Quadratic Function under Linear Constraints. Mathematical Programming, 11, 117-127.
KONNO, H. (1980), Maximizing a Convex Quadratic Function over a Hypercube.
Journal of the Operations Research Society of Japan, 23, 171-189.
KONNO, H. (1988), Minimum Concave Cost Production System: A Further
Generalization of Multi-Echelon Model. Mathematical Programming, 41, 185-193.
KONNO, H. (1981), An Algorithm for Solving Bilinear Knapsack Problems. Journal of Operations Research Society of Japan, 24, 360-373.

KONNO, H. and KUNO, T. (1995), Multiplicative Programming. In: Horst, R. and Pardalos, P.M. (eds.), Handbook of Global Optimization, 369-406, Kluwer, Dordrecht-Boston-London.

KONNO, H., YAJIMA, Y. and MATSUI, T. (1991), Parametric Simplex Algorithms for Solving a Special Class of Nonconvex Minimization Problems. Journal of Global Optimization, 1, 65-82.

KOUGH, P.L. (1979), The Indefinite Quadratic Programming Problem. Operations Research, 27, 516-533.

KRIPFGANZ, A. and SCHULZE, R. (1987), Piecewise Affine Function as a Difference of Two Convex Functions. Optimization, 18, 23-29.
KRUSE, H.-J. (1986), Degeneracy Graphs and the Neighbourhood Problem. Lecture
Notes in Economics and Mathematical Systems 260, Springer-Verlag, Berlin.

LAMAR, B.W. (1993), A Method for Solving Network Flow Problems with General Nonlinear Arc Costs. In: Du, D.Z. and Pardalos, P.M. (eds.), Network Optimization Problems, 147-168, World Scientific, Singapore.

LANDIS, E.M. (1951), On Functions Representable as the Difference of two Convex Functions. Dokl. Akad. Nauk SSSR, 80, 9-11.
LAWLER, E.L. (1963), The Quadratic Assignment Problem. Management Science, 9, 586-599.

LAWLER, E.L. (1966), Branch and Bound Methods: A Survey. Operations Research, 14, 699-719.

LAZIMY, R. (1982), Mixed-Integer Quadratic Programming. Mathematical Programming, 22, 332-349.

LAZIMY, R. (1985), Improved Algorithm for Mixed-Integer Quadratic Programs and a Computational Study. Mathematical Programming, 32, 100-113.

LEBEDEV, V.Y. (1982), A Method of Solution of the Problem of Maximization of a Positive Definite Quadratic Form on a Polyhedron. USSR Computational Mathematics and Mathematical Physics, 22, 1344-1351 (in Russian).

LEICHTWEISS, K. (1980), Konvexe Mengen. VEB Deutscher Verlag der Wissenschaften, Berlin.

LEMKE, C.E. (1965), Bimatrix Equilibrium Points and Mathematical Programming. Management Science, 4, 681-689.

LEVITIN, E.S. and POLYAK, B.T. (1966), Constrained Minimization Methods. USSR Computational Mathematics and Mathematical Physics, 6, 1-50.

LEVY, A.V. and MONTALVO, A. (1985), The Tunneling Algorithm for the Global Minimization of Functions. SIAM Journal on Scientific and Statistical Computing, 6, 15-29.

LOVE, F.C. (1973), A Facilities in Series Inventory Model with Nested Schedules. Management Science, 18, 327-338.

LUENBERGER, D.G. (1969), Optimization by Vector Space Methods. John Wiley, New York.

LÜTHI, H.J. (1976), Komplementaritäts- und Fixpunktalgorithmen in der Mathematischen Programmierung, Spieltheorie und Ökonomie. Lecture Notes in Economics and Mathematical Systems 129, Springer-Verlag, Berlin.

MAJTHAY, A. and WHINSTON, A. (1974), Quasiconcave Minimization Subject to Linear Constraints. Discrete Mathematics, 9, 35-59.

MAHJOUB, Z. (1983), Contribution à l'Etude de l'Optimisation des Réseaux Maillés. Thèse de Doctorat d'Etat, Institut National Polytechnique de Toulouse.

MALING, K., MUELLER, S.H. and HELLER, W.R. (1982), On Finding Most Optimal Rectangular Package Plans. Proceedings of the 19th Design Automation Conference, 663-670.

MANAS, M. and NEDOMA, J. (1968), Finding all Vertices of a Convex Polyhedron. Numerische Mathematik, 12, 226-229.

MANGASARIAN, O.L. (1964), Equilibrium Points of Bimatrix Games. SIAM Journal, 12, 778-780.

MANGASARIAN, O.L. (1969), Nonlinear Programming. McGraw-Hill, New York.

MANGASARIAN, O.L. (1976), Linear Complementarity Problems Solvable by a Single Linear Program. Mathematical Programming, 10, 263-270.

MANGASARIAN, O.L. (1978), Characterization of Linear Complementarity Problems as Linear Programs. Mathematical Programming Study, 7, 74-87.

MANGASARIAN, O.L. (1979), Simplified Characterizations of Linear Complementarity Problems Solvable as Linear Programs. Mathematics of Operations Research, 4, 268-273.

MANGASARIAN, O.L. and SHIAU, T.B. (1985), A Variable Complexity Norm Maximization Problem. Technical Summary Report, University of Wisconsin, Madison.

MANGASARIAN, O.L. and STONE, B. (1964), Two Person Nonzero-Sum Games and Quadratic Programming. Journal Mathematical Analysis and Applications, 9, 345-355.
MANNE, A.S. (1964), Plant Location under Economies 01 Scale - Decentralization
and Computation. Management Science, 11, 213-235.
MARANAS, C.D. and FLOUDAS, C. (1994), A Global Optimization Method for Weber's Problem with Attraction and Repulsion. In: Hager, W.W., Hearn, D.W. and Pardalos, P.M. (eds.), Large-Scale Optimization: State of the Art, 259-293, Kluwer, Dordrecht-Boston-London.
MARTOS, B. (1967), Quasiconvexity and Quasimonotonicity in Nonlinear Program-
ming. Studia Scientiarum Mathematicarum Hungaria, 2, 265-273.
MARTOS, B. (1971), Quadratic Programming with a Quasiconvex Objective Func-
tion. Operations Research, 19, 87-97.
MARTOS, B. (1975), Nonlinear Programming, North Holland Publishing Company,
Amsterdam.
MATHEISS, T.H. (1973), An Algorithm for Determining Irrelevant Constraints and all Vertices in Systems of Linear Inequalities. Operations Research, 21, 247-260.

MATHEISS, T.H. and RUBIN, D.S. (1980), A Survey and Comparison of Methods for Finding all Vertices of Convex Polyhedral Sets. Mathematics of Operations Research, 5, 167-184.

MAYNE, D.Q. and POLAK, E. (1984), Outer Approximation Algorithm for Nondifferentiable Optimization Problems. Journal of Optimization Theory and Applications, 42, 19-30.

MAYNE, D.Q. and POLAK, E. (1986), Algorithms for Optimization Problems with Exclusion Constraints. Journal of Optimization Theory and Applications, 51, 453-473.

McCORMICK, G.P. (1972), Attempts to Calculate Global Solutions of Problems that may have Local Minima. In: Numerical Methods for Nonlinear Optimization, F. Lootsma, Ed., Academic, London and New York, 209-221.

McCORMICK, G.P. (1973), Algorithmic and Computational Methods in Engineering Design. Computers and Structures, 3, 1241-1249.
McCORMICK, G.P. (1976), Computability of Global Solutions to Factorable Non-
convex Programs: Part I - Convex Underestimating Problems. Mathematical
Programming, 10, 147-175.

McCORMICK, G.P. (1980), Locating an Isolated Global Minimizer of a Constrained Nonconvex Program. Mathematics of Operations Research, 5, 435-443.

McCORMICK, G.P. (1981), Finding the Global Minimum of a Function of one Variable Using the Methods of Constant Signed Higher Order Derivatives. In: Mangasarian, O.L., Meyer, R.R., Robinson, S.M. (eds.), Nonlinear Programming 4, Proceedings Symposium Madison, July 14-16, Academic Press, New York, 223-243.

McCORMICK, G.P. (1983), Nonlinear Programming: Theory, Algorithms and Applications. John Wiley & Sons, New York.

MEEWELLA, C.C. and MAYNE, D.Q. (1988), An Algorithm for Global Optimization of Lipschitz Continuous Functions. Journal of Optimization Theory and Applications, 57, 307-322.

MEYER, R.R. (1970), The Validity of a Family of Optimization Methods. SIAM Journal on Control, 8, 41-54.

MEYER, R.R. (1983), Computational Aspects of Two-Segment Separable Programming. Mathematical Programming, 26, 21-39.

MILLS, H. (1960), Equilibrium Points in Finite Games. SIAM Journal, 8, 397-402.

MINOUX, M. (1986), Mathematical Programming. John Wiley, New York.


MINOUX, M. (1989), Network Synthesis and Optimum Network Design Problems: Models, Solution Methods and Applications. Networks, 19, 313-360.

MINOUX, M. and SERREAULT, J.Y. (1981), Synthèse Optimale d'un Réseau de Télécommunications avec Contraintes de Sécurité. Annales des Télécommunications, 36, 211-230.
MITTEN, L.G. (1970), Branch and Bound Methods: General Formulation and
Properties. Operations Research, 18, 24-34.
MLADINEO, R.H. (1986), An Algorithm for Finding the Global Maximum of a
Multimodal, Multivariate Function. Mathematical Programming, 34, 188-200.
MOORE, R.E. (1979), Methods and Applications of Interval Analysis. SIAM Studies
in Applied Mathematics, W.F. Ames (ed.), Philadelphia.

MOORE, R.E. (1988) (ed.), Reliability in Computing: The Role of Interval Methods.
Academic Press, New York.

MUELLER, R.K. (1970), A Method for Solving the Indefinite Quadratic Program-
ming Problem. Management Science, 16, 333-339.
MUKHAMEDIEV, B.M. (1982), Approximate Methods for Solving Concave Programming Problems. USSR Computational Mathematics and Mathematical Physics, 22, 238-245 (in Russian).

MURTY, K.G. (1969), Solving the Fixed-Charge Problem by Ranking the Extreme Points. Operations Research, 16, 268-279.

MURTY, K.G. (1974), Note on a Bard-Type Scheme for Solving the Complementarity Problem. Operations Research, 11, 123-130.

MURTY, K.G. (1988), Linear Complementarity, Linear and Nonlinear Programming. Heldermann Verlag, Berlin.
MURTY, K.G. and KABADI, S.N. (1987), Some NP-Complete Problems in
Quadratic and Nonlinear Programming. Mathematical Programming, 39, 117-130.
MUU, L.D. (1985), A Convergent Algorithm for Solving Linear Programs with an
Additional Reverse Convex Constraint. Kybernetika, 21, 428-435.
MUU, L.D. (1993), An Algorithm for Solving Convex Programs with an Additional Convex-Concave Constraint. Mathematical Programming, 61, 75-87.

MUU, L.D. and OETTLI, W. (1989), An Algorithm for Indefinite Quadratic Programming with Convex Constraints. Operations Research Letters, 10, 323-327.
NEFERDOV, V.N. (1987), The Search for the Global Maximum of a Function of
Several Variables on a Set Defined by Constraints of Inequality Type. USSR
Computational Mathematics and Mathematical Physics, 27, 23-32.

NEMHAUSER, G.L. and WOLSEY, L.A. (1988), Integer and Combinatorial Optimi-
zation. John Wiley & Sons, New York.
NETZER, D. and PASSY, W. (1975), A Note on the Maximum of Quasiconvex
Functions. Journal of Optimization Theory and Applications, 16, 565-569.
NGHIA, N.D. and HIEU, N.D. (1986), A Method for Solving Reverse Convex
Programming Problems. Acta Mathematica Vietnamica, 11, 241-252.
NGUYEN, V.H. and STRODIOT, J.J. (1988), Computing a Global Optimal Solution
to a Design Centering Problem. Technical Report 88/12, Facultes Universitaires de
Namur, Namur, Belgium.
NGUYEN, V.H. and STRODIOT, J.J. (1992), Computing a Global Optimal Solution
to a Design Centering Problem. Mathematical Programming, 53, 111-123.
NGUYEN, V.H., STRODIOT, J.J. and THOAI, N.V. (1985), On an Optimum
Shape Design Problem. Technical Report 85/5. Department of Mathematics,
Facultes Universitaires de Namur.

OWEN, G. (1973), Cutting Planes for Programs with Disjunctive Constraints. Journal of Optimization Theory and Applications, 11, 49-55.

PANG, J.S. (1995), Complementarity Problems. In: Horst, R. and Pardalos, P.M.,
Handbook of Global Optimization, 271-338, Kluwer, Dordrecht-Boston-London.

PAPADIMITRIOU, C.H. and STEIGLITZ, K. (1982), Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, Englewood Cliffs, N.J.
PARDALOS, P.M. (1985), Integer and Separable Programming Techniques for
Large-Scale Global Optimization Problems. Ph.D. Thesis. Computer Science
Department, Univ. Minnesota, Minneapolis, M.N.

PARDALOS, P.M. (1986), Aspects of Parallel Computation in Global Optimization. Proc. of the 24th Annual Allerton Conference on Communication, Control and Computing, 812-821.

PARDALOS, P.M. (1986a), An Algorithm for a Class of Nonlinear Fractional Problems Using Ranking of the Vertices. BIT, 26, 392-395.
PARDALOS, P .M. (1987), Generation of Large-Scale Quadratic Programs for Use
as Global Optimization Test Problems. ACM Transactions on Mathematical
Software, 13, No. 2, 133-137.

PARDALOS, P.M. (1987a), Objective Function Approximation in Nonconvex Programming. Proceedings of the 18th Modeling and Simulation Conference, 1605-1610.
PARDALOS, P.M. (1988), Quadratic Programming Defined on a Convex Hull of
Points. BIT, 28, 323-328.
PARDALOS, P.M. (1988a), Enumerative Techniques for Solving some Nonconvex
Global Optimization Problems. Operations Research Spektrum, 10, 29-35.
PARDALOS, P.M. (1988b), Linear Complementarity Problems Solvable by Integer
Programming. Optimization, 19, 467-474.
PARDALOS, P.M. (1991), Global Optimization Algorithms for Linearly Constrained
Indefinite Quadratic Problems. Journal of Computers and Mathematics with
Applications, 21, 87-97.

PARDALOS, P.M., GLICK, J.H. and ROSEN, J.B. (1987), Global Minimization of
Indefinite Quadratic Problems. Computing, 39, 281-291.
PARDALOS, P.M. and GUPTA, S. (1988), A Note on a Quadratic Formulation for
Linear Complementarity Problems. Journal of Optimization Theory and Applica-
tions, 57, 197-202.
PARDALOS, P.M. and KOVOOR, N. (1990), An Algorithm for Singly Constrained Quadratic Programs. Mathematical Programming, 46, 321-328.

PARDALOS, P.M. and PHILLIPS, A.T. (1990), A Global Optimization Approach for the Maximum Clique Problem. International Journal of Computer Math., 33, 209-216.

PARDALOS, P.M. and PHILLIPS, A.T. (1991), Global Optimization of Fractional Programs. Journal of Global Optimization, 1, 173-182.
PARDALOS, P.M. and ROSEN, J.B. (1986), Methods for Global Concave Minimi-
zation: A Bibliographic Survey. SIAM Review, 28, 367-379.
PARDALOS, P.M. and ROSEN, J.B. (1987), Constrained Global Optimization:
Algorithms and Applications. Lecture Notes in Computer Science, 268,
Springer-Verlag, Berlin.

PARDALOS, P.M. and ROSEN, J.B. (1988), Global Optimization Approach to the
Linear Complementarity Problem. SIAM Journal on Scientific and Statistical
Computing, 9, 341-353.
PARDALOS, P.M. and SCHNITGER, G. (1987), Checking Local Optimality in Con-
strained Quadratic Programming is NP-hard. Operations Research Letters, 7,
33-35.

PARIKH, S.C. (1976), Approximate Cutting Planes in Nonlinear Programming. Mathematical Programming, 11, 184-198.

PEVNYI, A.V. (1982), On Optimal Search Strategies for the Maximum of a Function with Bounded Highest Derivative. USSR Computational Mathematics and Mathematical Physics, 22, 38-44.

PFERSCHY, U. and TUY, H. (1994), Linear Programs with an Additional Rank Two Reverse Convex Constraint. Journal of Global Optimization, 4, 347-366.

PFORR, E.A. and GUNTHER, R.H. (1984), Die Normalform des Quadratischen Optimierungsproblems und die pol-Polaren Theorie. Mathematische Operationsforschung und Statistik, Ser. Optimization, 15, 41-55.

PHAN, H. (1982), Quadratically Constrained Quadratic Programming: Some Applications and a Method of Solution. Zeitschrift für Operations Research, 26, 105-119.

PHAM, D.T. (1984), Algorithmes de calcul du maximum d'une forme quadratique sur la boule unité de la norme du maximum. Numerische Mathematik, 45, 163-183.

PHAM, D.T. and SOUAD, E.B. (1986), Algorithms for Solving a Class of Nonconvex Optimization Problems. Method of Subgradients. In: Hiriart-Urruty (ed.), Fermat Days 1985: Mathematics for Optimization, 249-272.
PHILLIPS, A.T. and ROSEN, J.B. (1988), A Parallel Algorithm for Constrained
Concave Quadratic Global Minimization: Computational Aspects. Mathematical
Programming B, 42, 421-448.
PHONG, T.Q., TAO, P.D. and AN, L.T.H. (1995), A Method for Solving D.C. Programming Problems. Application to Fuel Mixture Nonconvex Optimization Problems. Journal of Global Optimization, 6, 87-106.

PINCUS, M. (1968), A Closed Form Solution of Certain Programming Problems. Operations Research, 16, 690-694.
PINTER, J. (1983), A Unified Approach to Globally Convergent One-Dimensional
Optimization Algorithms. Technical Report IA MI 83-5, Istituto per le Applicazioni
della Matematica e dell' Informatica CNR, Milan.
PINTER, J. (1986), Extended Univariate Algorithms for n-dimensional Global Optimization. Computing, 36, 91-103.

PINTER, J. (1986a), Globally Convergent Methods for n-dimensional Multiextremal Optimization. Optimization, 17, 187-202.

PINTER, J. (1986b), Global Optimization on Convex Sets. Operations Research Spektrum, 8, 197-202.

PINTER, J. (1987), Convergence Qualification of Partition Algorithms in Global Optimization. Research Report 87-61, Department of Mathematics and Informatics, Delft University of Technology.
PINTER, J. (1988), Branch-and-Bound Algorithm for Solving Global Optimization
Problems with Lipschitzian Structure. Optimization, 19, 101-110.
PINTER, J. (1990), Solving Nonlinear Equation Systems via Global Partition and
Search. Computing, 43, 309-323.

PINTER, J., SZABO, J. and SOMLYODY, L. (1986), Multiextremal Optimization for Calibrating Water Resources Models. Environmental Software, 1, 98-105.

PIYAVSKII, S.A. (1967), An Algorithm for Finding the Absolute Minimum of a Function. Theory of Optimal Solutions, No. 2, Kiev, IK AN USSR, 13-24 (in Russian).

PIYAVSKII, S.A. (1972), An Algorithm for Finding the Absolute Extremum of a Function. USSR Computational Mathematics and Mathematical Physics, 12, 57-67.

POLAK, E. (1982), An Implementable Algorithm for the Optimal Design Centering, Tolerancing and Tuning Problem. Journal of Optimization Theory and Applications, 37, 45-67.

POLAK, E. (1987), On the Mathematical Foundations of Nondifferentiable Optimization in Engineering Design. SIAM Review, 29, 21-89.

POLAK, E. and VINCENTELLI, A.S. (1979), Theoretical and Computational Aspects of the Optimal Design Centering, Tolerancing and Tuning Problem. IEEE Transactions Circuits and Systems, CAS-26, 795-813.

POLJAK, B.T. (1969), Minimization of Nonsmooth Functionals. USSR Computational Mathematics and Mathematical Physics, 9, 14-29.

RAGHAVACHARI, M. (1969), On Connections between Zero-One Integer Programming and Concave Programming under Linear Constraints. Operations Research, 17, 680-684.

RAMARAO, B. and SHETTY, C.M. (1984), Application of Disjunctive Programming to the Linear Complementarity Problem. Naval Research Logistics, 31, 589-600.
RATSCHEK, H. (1985), Inclusion Functions and Global Optimization. Mathematical
Programming, 33, 300-317.

RATSCHEK, H. and ROKNE, J. (1984), Computer Methods for the Range of Functions. Ellis Horwood Series Mathematics and its Applications, Wiley, New York.

RATSCHEK, H. and ROKNE, J. (1988), New Computer Methods for Global Optimization. Ellis Horwood, Chichester.

RATSCHEK, H. and ROKNE, J. (1995), Interval Methods. In: Horst, R. and Pardalos, P.M. (eds.), Handbook of Global Optimization, 751-828, Kluwer, Dordrecht-Boston-London.
REEVES, G.R. (1975), Global Minimization in Nonconvex All-Quadratic Programming. Management Science, 22, 76-86.
RINNOOY KAN, A.H.G. and TIMMER, G.T. (1987), Stochastic Global Optimiza-
tion Methods. Part I: Clustering Methods. Mathematical Programming, 39, 27-56.
RINNOOY KAN, A.H.G. and TIMMER, G.T. (1987), Stochastic Global Optimiza-
tion Methods. Part II: Multi-Level Methods. Mathematical Programming, 39, 57-78.
RITTER, K. (1965), Stationary Points of Quadratic Maximum Problems. Zeitschrift
für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 4, 149-158.

RITTER, K. (1966), A Method for Solving Maximum-Problems with a Nonconcave Quadratic Objective Function. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 4, 340-351.

ROCKAFELLAR, R.T. (1970), Convex Analysis. Princeton University Press, Princeton, N.J.
ROCKAFELLAR, R.T. (1974), Augmented Lagrange Multiplier Functions and
Duality in Nonconvex Programming. SIAM Journal on Control, 12, 268-285.
ROCKAFELLAR, R.T. (1981), Favorable Classes of Lipschitz Continuous Functions in Subgradient Optimization. In: Nurminskii (ed.), Progress in Nondifferentiable Optimization, Pergamon Press.

ROESSLER, M. (1971), Eine Methode zur Berechnung des optimalen Produktionsprogramms bei konkaven Kosten. Unternehmensforschung, 15, 103-111.

ROKNE, J. (1977), Bounds for an Interval Polynomial. Computing, 18, 225-240.

ROSEN, J.B. (1960), The Gradient Projection Method for Nonlinear Programming, I: Linear Constraints. SIAM Journal on Applied Mathematics, 8, 181-217.

ROSEN, J.B. (1966), Iterative Solution of Nonlinear Optimal Control Problems. SIAM Journal on Control, 4, 223-244.
ROSEN, J.B. (1983), Global Minimization of a Linearly Constrained Concave
Function by Partition of Feasible Domain. Mathematics of Operations Research, 8,
215-230.

ROSEN, J.B. (1983a), Parametric Global Minimization for Large Scale Problems. Technical Report 83-11 (revised), Computer Sci. Dept., Univ. of Minnesota.

ROSEN, J.B. (1984), Performance of Approximate Algorithms for Global Minimization. Math. Progr. Study, 22, 231-236.

ROSEN, J.B. (1984a), Computational Solution of Large-Scale Constrained Global Minimization Problems. Numerical Optimization 1984 (P.T. Boggs, R.H. Byrd, R.B. Schnabel, Eds.), SIAM, Philadelphia, 263-271.

ROSEN, J.B. and PARDALOS, P.M. (1986), Global Minimization of Large-Scale Constrained Concave Quadratic Problems by Separable Programming. Mathematical Programming, 34, 163-174.

RUBINSTEIN, R. and SAMORODNITSKY, G. (1986), Optimal Coverage of Convex Regions. Journal of Optimization Theory and Applications, 51, 321-343.
SAHNI, S. (1974), Computationally Related Problems. SIAM J. on Computing, 3,
262-279.

SCHOCH, M. (1984), Über die Äquivalenz der allgemeinen quadratischen Optimierungsaufgabe zu einer linearen parametrischen komplementären Optimierungsaufgabe. Mathematische Operationsforschung und Statistik, Ser. Optimization, 16, 211-216.

SCHOEN, F. (1982), On a Sequential Search Strategy in Global Optimization Problems. Calcolo, 19, 321-334.
SEN, S. (1987), A Cone Splitting Algorithm for Reverse Convex Programming.
Preprint, University of Arizona, Tucson.

SEN, S. and SHERALI, H.D. (1985), On the Convergence of Cutting Plane Algorithms for a Class of Nonconvex Mathematical Programs. Mathematical Programming, 31, 42-56.

SEN, S. and SHERALI, H.D. (1985a), A Branch-and-Bound Algorithm for Extreme Point Mathematical Programming Problems. Discrete Applied Mathematics, 11, 265-280.

SEN, S. and SHERALI, H.D. (1986), Facet Inequalities from Simple Disjunctions in
Cutting Plane Theory. Mathematical Programming, 34, 72-83.
SEN, S. and SHERALI, H.D. (1987), Nondifferentiable Reverse Convex Programs
and Facial Convexity Cuts via a Disjunctive Characterization. Mathematical
Programming, 37, 169-183.

SEN, S. and WHITESON, A. (1985), A Cone Splitting Algorithm for Reverse Convex
Programming. Proceedings of IEEE Conference on Systems, Man, and Cybernetics,
Tucson, Az. 656-660.
SHEN, Z. and ZHU, Y. (1987), An Interval Version of Shubert's Iterative Method for the Localization of the Global Maximum. Computing, 38, 275-280.
SHEPILOV, M.A. (1987), Determination of the Roots and of the Global Extremum of a Lipschitz Function. Cybernetics, 23, 233-238.
SHERALI, H.D. and SHETTY, C.M. (1980), Optimization with Disjunctive Con-
straints. Lecture Notes in Economics and Mathematical Systems, 181, Springer-
Verlag, Berlin.
SHERALI, H.D. and SHETTY, C.M. (1980a), A Finitely Convergent Algorithm for
Bilinear Programming Problems Using Polar and Disjunctive Face Cuts.
Mathematical Programming, 19, 14-31.

SHERALI, H.D. and SHETTY, C.M. (1980b), Deep Cuts in Disjunctive Program-
ming. Naval Research Logistics, 27, 453-476.

SHIAU, T.R. (1984), Finding the Largest ℓp-ball in a Polyhedral Set. Technical
Summary Report, University of Wisconsin.
SHOR, N.Z. (1987), Quadratic Optimization Problems. Technicheskaya Kibernetika,
1, 128-139 (in Russian).
SHUBERT, B.O. (1972), A Sequential Method Seeking the Global Maximum of a
Function. SIAM Journal on Numerical Analysis, 9, 379-388.
SINGER, I. (1979), A Fenchel-Rockafellar Type Duality Theorem for Maximization.
Bulletin of the Australian Mathematical Society, 20, 193-198.

SINGER, I. (1979a), Maximization of Lower Semi-Continuous Convex Functionals on Bounded Subsets of Locally Convex Spaces. I: Hyperplane Theorems. Applied Mathematics and Optimization, 5, 349-362.

SINGER, I. (1980), Minimization of Continuous Convex Functionals on Complements of Convex Subsets of Locally Convex Spaces. Mathematische Operationsforschung und Statistik, Serie Optimization, 11, 221-234.

SINGER, I. (1987), Generalization of Convex Supremization Duality. In: Lin, B.L. and Simons, S. (eds.): Nonlinear and Convex Analysis, Lecture Notes in Pure and Applied Mathematics 107, M. Dekker, New York.

SINGER, I. (1992), Some Further Duality Theorems for Optimization Problems with Reverse Convex Constraint Sets. Journal of Mathematical Analysis and Applications, 171, 205-219.
SOLAND, R.M. (1971), An Algorithm for Separable Nonconvex Programming Problems II: Nonconvex Constraints. Management Science, 17, 759-773.
SOLAND, R.M. (1974), Optimal Facility Location with Concave Costs. Operations
Research, 22, 373-382.
STEINBERG, D.I. (1970), The Fixed Charge Problem. Naval Research Logistics, 17,
217-235.

STONE, R.E. (1981), Geometric Aspects of the Linear Complementarity Problem. Technical Report SOL81-6, Department of Operations Research, Stanford University.
STREKALOVSKI, A.C. (1987), On the Global Extremum Problem. Soviet Doklady, 292, 1062-1066 (in Russian).

STREKALOVSKI, A.C. (1990), On Problems of Global Extremum in Nonconvex Extremal Problems. Izvestiya Vuzov, ser. Matematika, 8, 74-80 (in Russian).
STREKALOVSKI, A.C. (1991), On Conditions for Global Extremum in a Nonconvex Minimization Problem. Izvestiya Vuzov, ser. Matematika, 2, 94-96 (in Russian).
STRIGUL, O.I. (1985), Search for a Global Extremum in a Certain Subclass of
Functions with the Lipschitz Condition. Cybernetics, 21, 193-195.
STRONGIN, R.G. (1973), On the Convergence of an Algorithm for Finding a Global Extremum. Engineering Cybernetics, 11, 549-555.
STRONGIN, R.G. (1978), Numerical Methods for Multiextremal Problems. Nauka,
Moscow (in Russian).
STRONGIN, R.G. (1984), Numerical Methods for Multiextremal Nonlinear Programming Problems with Nonconvex Constraints. In: Demyanov, V.F. and Pallaschke, D.
(eds.), Non-Differentiable Optimization: Motivations and Applications. Proceedings
IIASA Workshop, Sopron 1984, IIASA, Laxenburg.

STUART, E.L., PHILLIPS, A.T. and ROSEN, J.B. (1988), Fast Approximate
Solution of Large-Scale Constrained Global Optimization Problems. Technical
Report No. 88-9, Computer Science Department, University of Minnesota.

SUKHAREV, A.G. (1971), Best Sequential Search Strategies for Finding an Extremum. USSR Computational Mathematics and Mathematical Physics, 12, 35-50.
SUKHAREV, A.G. (1985), The Equality of Errors in Classes of Passive and
Sequential Algorithms. USSR Computational Mathematics and Mathematical
Physics, 25, 193-195.
SUNG, Y.Y. and ROSEN, J.B. (1982), Global Minimum Test Problem Construction.
Mathematical Programming, 24, 353-355.

SWARUP, K. (1966), Quadratic Programming. Cahiers du Centre d'Etudes de Recherche Operationelle, 8, 223-234.

TAHA, H.A. (1972), A Balasian-Based Algorithm for Zero-One Polynomial Programming. Management Science, 18, 328-343.
TAHA, H.A. (1973), Concave Minimization over a Convex Polyhedron. Naval
Research Logistics, 20, 533-548.

TAM, B.T. and BAN, V.T. (1985), Minimization of a Concave Function Under Li-
near Constraints. Ekonomika i Matematicheskie Metody, 21, 709-714 (in Russian).
TAMMER, K. (1976), Möglichkeiten zur Anwendung der Erkenntnisse der para-
metrischen Optimierung für die Lösung indefiniter quadratischer Optimierungs-
probleme. Mathematische Operationsforschung und Statistik, Serie Optimization, 7,
206-222.
THACH, P.T. (1985), Convex Programs with Several Additional Reverse Convex
Constraints. Acta Mathematica Vietnamica, 10, 35-57.
THACH, P.T. (1987), D.C. Sets, D.C. Functions and Systems of Equations.
Preprint, Institute of Mathematics, Hanoi.
THACH, P.T. (1988), The Design Centering Problem as a D.C. Programming
Problem. Mathematical Programming, 41, 229-248.
THACH, P.T. (1990), A Decomposition Method for the Min Concave Cost Flow
Problem with a Special Structure. Japan Journal of Applied Mathematics 7, 103-120.
THACH, P.T. (1990a), Convex Minimization Under Lipschitz Constraints. Journal
of Optimization Theory and Applications 64, 595-614.

THACH, P.T. (1991), Quasiconjugates of Functions and Duality Correspondence Between Quasiconcave Minimization under a Reverse Convex Constraint and Quasiconvex Maximization under a Convex Constraint. Journal of Mathematical Analysis and Applications, 159, 299-322.
THACH, P.T. (1991a), A Nonconvex Duality with Zero Gap and Applications. SIAM Journal on Optimization, 4, 44-64.
THACH, P.T. (1992), New Partitioning Method for a Class of Nonconvex
Optimization Problems. Mathematics of Operations Research, 17, 43-69.
THACH, P.T. (1993), D.C. sets, D.C. Functions and Nonlinear Equations.
Mathematical Programming, 58, 415-428.

THACH, P.T. (1993a), Global Optimality Criterion and Duality with Zero Gap in
Nonconvex Optimization Problems. SIAM J. Math. Anal., 24, 2537-2556.
THACH, P.T., BURKARD, R.E. and OETTLI, W. (1991), Mathematical Programs
with a Two Dimensional Reverse Convex Constraint. Journal of Global
Optimization, 1, 145-154.

THACH, P.T., THOAI, N.V. and TUY, H. (1987), Design Centering Problem with
Lipschitzian Structure. Preprint, Institute of Mathematics, Hanoi.
THACH, P.T. and TUY, H. (1987). Global Optimization under Lipschitzian Con-
straints. Japan Journal of Applied Mathematics, 4, 205-217.

THACH, P.T. and TUY, H. (1988), Parametric Approach to a Class of Nonconvex Global Optimization Problems. Optimization, 19, 3-11.
THACH, P.T. and TUY, H. (1989), Remarks on a New Approach to Global
Optimization: The Relief Indicator Method. Proceedings of the International Seminar
on Optimization Methods and Applications, Irkutsk.

THACH, P.T. and TUY, H. (1990), The Relief Indicator Method for Constrained
Global Optimization, Naval Research Logistics 37, 473-497.
THAKUR, L. (1990), Domain Contraction in Nonlinear Programming: Minimizing a
Quadratic Concave Objective over a Polyhedron. Mathematics of Operations
Research 16, 390-407.
THIEU, T.V. (1980), Relationship between Bilinear Programming and Concave
Programming. Acta Mathematica Vietnamica, 2, 106-113.
THIEU, T.V. (1984), A Finite Method for Globally Minimizing Concave Functions
over Unbounded Convex Sets and its Applications. Acta Mathematica Vietnamica 9,
173-191.
THIEU, T.V. (1987), Solving the Lay-Out Planning Problem with Concave Cost. In:
Essays in Nonlinear Analysis and Optimization, Institute of Mathematics, Hanoi,
101-110.

THIEU, T.V. (1988), A Note on the Solution of Bilinear Programming Problem by Reduction to Concave Minimization. Mathematical Programming, 41, 249-260.
THIEU, T.V. (1989), Improvement and Implementation of some Algorithms for
Nonconvex Optimization Problems. In: Optimization - Fifth French German
Conference Castel Novel 1988, Lecture Notes in Mathematics 1405, Springer,
159-170.
THIEU, T.V. (1991), A Variant of Tuy's Decomposition Algorithm for Solving a
Class of Concave Minimization Problems. Optimization 22, 607-620.
THIEU, T.V., TAM, B.T. and BAN, V.T. (1983), An Outer Approximation Method
for Globally Minimizing a Concave Function over a Compact Convex Set. Acta
Mathematica Vietnamica, 8, 21-40.
THOAI, N.V. (1981), Anwendung des Erweiterungsprinzips zur Lösung konkaver
Optimierungsaufgaben. Mathematische Operationsforschung und Statistik, Serie
Optimization, 11, 45-51.

THOAI, N.V. (1984), Verfahren zur Lösung konkaver Optimierungsaufgaben. Seminarbericht 63 der Humboldt Universität Berlin, Sektion Mathematik.

THOAI, N.V. (1987), On Canonical D.C. Programs and Applications. In: Essays on
Nonlinear Analysis and Optimization Problems, Institute of Mathematics, Hanoi,
88-100.

THOAI, N.V. (1988), A Modified Version of Tuy's Method for Solving D.C.
Programming Problems. Optimization, 19, 665-674.
THOAI, N.V. (1994), On the Construction of Test Problems for Concave
Minimization Algorithms. Journal of Global Optimization 5, 399-402.

THOAI, N.V. and de VRIES, J. (1988), Numerical Experiments on Concave Minimization Algorithms. Methods of Operations Research, 60, 363-366.
THOAI, N.V. and TUY, H. (1980), Convergent Algorithms for Minimizing a
Concave Function. Mathematics of Operations Research, 5, 556-566.
THOAI, N.V. and TUY, H. (1983), Solving the Linear Complementarity Problem
through Concave Programming. USSR Computational Mathematics and
Mathematical Physics, 23, 602-608.

THUONG, T.V. and TUY, H. (1984), A Finite Algorithm for Solving Linear
Programs with an Additional Reverse Convex Constraint. Lecture Notes in
Economics and Mathematical Systems, 225, Springer, 291-302.

TICHONOV, A.N. (1980), On a Reciprocity Principle. Soviet Mathematics Doklady, 22, 100-103.

TIMONOV, L.N. (1977), An Algorithm for Search of a Global Extremum. Engineering Cybernetics, 15, 38-44.

TÖRN, A. and ZILINSKAS, A. (1989), Global Optimization. Lecture Notes in Computer Science, 350, Springer-Verlag, Berlin.

TOLAND, J.F. (1978), Duality in Nonconvex Optimization. Journal of Mathematical Analysis and Applications, 66, 399-415.

TOLAND, J.F. (1979), A Duality Principle for Nonconvex Optimisation and the Calculus of Variations. Archive for Rational Mechanics and Analysis, 71, 41-61.
TOMLIN, J.A. (1978), Robust Implementation of Lemke's Method for the Linear
Complementarity Problem. Mathematical Programming Study, 7, 55-60.
TOPKIS, D.M. (1970), Cutting Plane Methods without Nested Constraint Sets.
Operations Research, 18, 404-413.
TUY, H. (1964), Concave Programming under Linear Constraints. Soviet Mathema-
tics, 5, 1437-1440.

TUY, H. (1981), Conical Algorithm for Solving a Class of Complementarity Problems. Acta Mathematica Vietnamica, 6, 3-17.
TUY, H. (1982), Global Maximization of a Convex Function over a Closed, Convex, Not Necessarily Bounded Set. Cahiers de Mathématiques de la Décision, 8223, Université Paris IX-Dauphine.

TUY, H. (1983), On Outer Approximation Methods for Solving Concave Minimization Problems. Acta Mathematica Vietnamica, 8, 3-34.
TUY, H. (1983a), Global Minimization of a Concave Function Subject to Mixed Linear and Reverse Convex Constraints. IFIP Conference on "Recent Advances in System Modeling and Optimization", Hanoi.

TUY, H. (1984), Global Minimization of a Difference of Two Convex Functions. Proceedings of the VII Symposium on Operations Research, Karlsruhe, August 1983, Lecture Notes in Economics and Mathematical Systems, 226, Springer-Verlag, Berlin, 98-108.

TUY, H. (1985), Concave Minimization Under Linear Constraints with Special Structure. Optimization, 16, 335-352.
TUY, H. (1986), A General Deterministic Approach to Global Optimization via D.C.
Programming. In: Hiriart-Urruty, J.B. (ed.), Fermat Days 1985: Mathematics for
Optimization, Elsevier, Amsterdam, 137-162.

TUY, H. (1987), Global Minimization of a Difference of Two Convex Functions. Mathematical Programming Study, 30, 150-182.

TUY, H. (1987a), Convex Programs with an Additional Reverse Convex Constraint. Journal of Optimization Theory and Applications, 52, 463-485.
TUY, H. (1987b), An Implicit Space Covering Method with Applications to Fixed Point and Global Optimization Problems. Acta Mathematica Vietnamica, 12, 162-170.
TUY, H. (1988), On Large Scale Concave Minimization Problems with Relatively Few
Nonlinear Variables. Preprint, Institute of Mathematics, Hanoi.
TUY, H. (1988a), On Cutting Methods for Concave Minimization. Preprint, Institute of Mathematics, Hanoi.
TUY, H. (1988b), Concave Minimization and Constrained Nonsmooth Global
Optimization: A Survey. Preprint, Institute of Mathematics, Hanoi.
TUY, H. (1989), Computing Fixed Points by Global Optimization Methods. Procee-
dings of the International Conference on Fixed Point Theory and Applications.
Marseille-Luminy.
TUY, H. (1989a), On D.C. Programming and General Nonconvex Global Optimization Problems. Preprint, Institute of Mathematics, Hanoi.
TUY, H. (1990), On a Polyhedral Annexation Method for Concave Minimization. In:
Leifman, L.J. and Rosen, J.B. (eds.): Functional Analysis, Optimization and
Mathematical Economics, Oxford University Press, 248-260.
TUY, H. (1991), Normal Conical Algorithm for Concave Minimization Over Poly-
topes. Mathematical Programming 51, 229-245.
TUY, H. (1991a), Effect of the Subdivision Strategy on Convergence and Efficiency of
Some Global Optimization Algorithms. Journal of Global Optimization 1, 23-36.
TUY, H. (1991b), Polyhedral Annexation, Dualization and Dimension Reduction Technique in Global Optimization. Journal of Global Optimization, 1, 229-244.
TUY, H. (1992), On Nonconvex Optimization Problems with Separated Nonconvex Variables. Journal of Global Optimization, 2, 133-144.
TUY, H. (1992a), The Complementary Convex Structure in Global Optimization. Journal of Global Optimization, 2, 21-40.
TUY, H. (1994), Canonical D.C. Programming Problem: Outer Approximation
Methods Revisited. Preprint, Institute of Mathematics, to appear in Operations
Research Letters.

TUY, H. (1994a), Introduction to Global Optimization. Ph.D. course, GERAD G-94-04, Ecole Polytechnique, Montreal.

TUY, H. (1995), D.C. Optimization: Theory, Methods and Algorithms. In: Horst, R. and Pardalos, P.M. (eds.), Handbook of Global Optimization, 149-216, Kluwer, Dordrecht-Boston-London.

TUY, H. (1995a), A General D.C. Approach to Location Problems. Conference State of the Art in Global Optimization: Computational Methods and Applications, Princeton, April 1995.

TUY, H. and AL-KHAYYAL, F.A. (1988), Concave Minimization by Piecewise Affine Approximation. Preprint, Institute of Mathematics, Hanoi.
TUY, H. and AL-KHAYYAL, F.A. (1992), Global Optimization of a Nonconvex Single Facility Location Problem by Sequential Unconstrained Convex Minimization. Journal of Global Optimization, 2, 61-71.

TUY, H., AL-KHAYYAL, F.A. and ZHOU, F. (1994), D.C. Optimization Method for Single Facility Problems. Conference State of the Art in Global Optimization: Computational Methods and Applications, Princeton, April 1995.

TUY, H., DAN, N.D. and GHANNADAN, S. (1992), Strongly Polynomial Time Algorithm for Certain Concave Minimization Problems on Networks. Operations Research Letters, 14, 99-109.

TUY, H., GHANNADAN, S., MIGDALAS, A. and VARBRAND, P. (1993), Strongly Polynomial Algorithm for a Production-Transportation Problem with Concave Production Cost. Optimization, 27, 205-227.
TUY, H., GHANNADAN, S., MIGDALAS, A. and VARBRAND, P. (1993a), The
Production-Transportation Problem with a Fixed Number of Nonlinear Variables.
Manuscript, to appear in Mathematical Programming.

TUY, H., GHANNADAN, S., MIGDALAS, A. and VARBRAND, P. (1993b), The Minimum Concave Cost Flow Problem with Fixed Numbers of Sources and of Nonlinear Arc Costs. Journal of Global Optimization, 6, 135-151.
TUY, H. and HORST, R. (1988), Convergence and Restart in Branch and Bound Algorithms for Global Optimization. Application to Concave Minimization and D.C. Optimization Problems. Mathematical Programming, 41, 161-183.
TUY, H., KHACHATUROV, V. and UTKIN, S. (1987), A Class of Exhaustive Cone Splitting Procedures in Conical Algorithms for Concave Minimization. Optimization, 18, 791-807.

TUY, H. and OETTLI, W. (1991), On Necessary and Sufficient Conditions for Global Optimality. Revista de Matematicas Aplicadas, 15, 39-41.
TUY, H. and TAM, B.T. (1992), An Efficient Method for Rank Two Quasiconcave
Minimization Problems. Optimization, 24, 43-56.
TUY, H. and TAM, B.T. (1994), Polyhedral Annexation versus Outer Approximation for Decomposition of Monotonic Quasiconcave Minimization Problems. To appear in Acta Mathematica Vietnamica.
TUY, H., TAM, B.T. and DAN, N.D. (1994), Minimizing the Sum of a Convex Function and a Specially Structured Nonconvex Function. Optimization, 28, 237-248.

TUY, H. and THAI, N.Q. (1983), Minimizing a Concave Function Over a Compact
Convex Set. Acta Mathematica Vietnamica, 8, 12-20.
TUY, H., THIEU, T.V. and THAI, N.Q. (1985), A Conical Algorithm for Globally
Minimizing a Concave Function Over a Closed Convex Set. Mathematics of
Operations Research, 10, 498-515.

TUY, H. and THUONG, N.V. (1985), Minimizing a Convex Function over the
Complement of a Convex Set. Methods of Operations Research, 49, 85-89.
TUY, H. and THUONG, N.V. (1988), On the Global Minimization of a Convex
Function under General Nonconvex Constraints. Applied Mathematics and
Optimization, 18, 119-142.

UEING, U. (1972), A Combinatorial Method to Compute the Global Solution of Certain Non-Convex Optimization Problems. In: Lootsma, F.A. (ed.), Numerical Methods for Nonlinear Optimization, Academic Press, 223-230.
UTKIN, S., KHACHATUROV, V. and TUY, H. (1988), A New Exhaustive Cone
Splitting Procedure for Concave Minimization. USSR Computational Mathematics
and Mathematical Physics, 7, 992-999 (in Russian).

VAISH, H. and SHETTY, C.M. (1976), The Bilinear Programming Problem. Naval
Research Logistics Quarterly, 23, 303-309.

VAISH, H. and SHETTY, C.M. (1977), A Cutting Plane Algorithm for the Bilinear
Programming Problem. Naval Research Logistics Quarterly, 24, 83-94.
VAN der HEYDEN, L. (1980), A Variable Dimension Algorithm for the Linear
Complementarity Problem. Mathematical Programming, 19, 328-346.
VASIL'EV, N.S. (1985), Minimum Search in Concave Problems Using the Sufficient
Condition for a Global Extremum. USSR Computational Mathematics and
Mathematical Physics, 25, 123-129 (in Russian).
VASIL'EV, S.B. and GANSHIN, G.S. (1982), Sequential Search Algorithm for the
Largest Value of a Twice Differentiable Function. Mathematical Notes of the
Academy of Sciences of the USSR 31, 312-316.

VAVASIS, S.A. (1991), Nonlinear Optimization: Complexity Issues. Oxford University Press, Oxford.

VEINOTT, A.F. (1967), The Supporting Hyperplane Method for Unimodal Program-
ming. Operations Research, 15, 147-152.
VEINOTT, A.F. (1969), Minimum Concave Cost Solution of Leontiev Substitution Models of Multi-Facility Inventory Systems. Operations Research, 17, 262-291.
VERGIS, A., STEIGLITZ, K. and DICKINSON, B. (1986), The Complexity of
Analog Computation. Mathematics and Computers in Simulation, 28, 91-113.
VIDIGAL, L.M. and DIRECTOR, S.W. (1982), A Design Centering Algorithm for
Nonconvex Regions of Acceptability. IEEE Transactions on Computer-Aided-Design
of Integrated Circuits and Systems, CAD-1, 13-24.
WARGA, J. (1992), A Necessary and Sufficient Condition for a Constrained Minimum. SIAM Journal on Optimization, 2, 665-667.

WATANABE, H. (1984), IC Layout Generation and Compaction Using Mathematical Programming. Ph.D. Thesis, Computer Sc. Dept., Univ. of Rochester.
WINGO, D.R. (1985), Globally Minimizing Polynomials Without Evaluating Deriva-
tives. International Journal of Computer Mathematics, 17, 287-294.
WOLFE, P. (1970), Convergence Theory in Nonlinear Programming. In: Integer and
Nonlinear Programming, J. Abadie, Ed., North-Holland, Amsterdam.

WOOD, G.R. (1991), Multidimensional Bisection and Global Optimization. Computers and Mathematics with Applications, 21, 161-172.

YAGED, B. (1971), Minimum Cost Routing for Static Network Models. Networks, 1,
139-172.

YAJIMA, Y. and KONNO, H. (1990), Efficient Algorithms for Solving Rank Two
and Rank Three Bilinear Programming Problems. Journal of Global Optimization 1,
155-172.

YOUNG, R.D. (1971), Hypercylindrically Deduced Cuts in Zero-One Integer Programs. Operations Research, 19, 1393-1405.
ZALEESKY, A.B. (1980), Non-Convexity of Admissible Areas and Optimization of Economic Decisions. Ekonomika i Matematicheskie Metody, XVI, 1069-1081 (in Russian).

ZALIZNYAK, N.F. and LIGUN, A.A. (1978), Optimal Strategies for Seeking the
Global Maximum of a Function. USSR Computational Mathematics and Mathemati-
cal Physics, 18, 31-38.

ZANG, I. and AVRIEL, M. (1975), On Functions whose Local Minima are Global.
Journal of Optimization Theory and Applications, 16, 183-190.

ZANG, I., CHOO, E.W. and AVRIEL, M. (1976), A Note on Functions whose Local
Minima are Global. Journal of Optimization Theory and Applications, 18, 556-559.
ZANGWILL, W.I. (1966), A Deterministic Multi-Product Multi-Facility Production and Inventory Model. Operations Research, 14, 486-507.
ZANGWILL, W.I. (1968), Minimum Concave Cost Flows in Certain Networks.
Management Science, 14, 429-450.

ZANGWILL, W.I. (1969), A Backlogging Model and a Multi-Echelon Model of a Dynamic Economic Lot Size Production System - A Network Approach. Management Science, 15, 506-527.

ZANGWILL, W.I. (1985), Setup Cost Reduction in Series Facility Production. Discussion Paper, Graduate School of Business, University of Chicago.

ZHIROV, V.S. (1985), Searching for a Global Extremum of a Polynomial on a Parallelepiped. USSR Computational Mathematics and Mathematical Physics, 25, 63-180 (in Russian).

ZIELINSKI, R. and NEUMANN, P. (1983), Stochastische Verfahren zur Suche nach dem Minimum einer Funktion. Mathematical Research, 16, Akademie-Verlag, Berlin.

ZILINSKAS, A. (1981), Two Algorithms for One-Dimensional Multimodal Minimization. Optimization, 12, 53-63.
ZILINSKAS, A. (1982), Axiomatic Approach to Statistical Models, Applications and
their Use in Multimodal Optimization Theory. Mathematical Programming, 22,
104-116.

ZILINSKAS, A. (1986), Global Optimization: Axiomatics of Statistical Models, Algorithms and their Applications. Mokslas, Vilnius (in Russian).
ZWART, P.B. (1973), Nonlinear Programming: Counterexamples to Two Global Opti-
mization Algorithms. Operations Research, 21, 1260-1266.
ZWART, P.B. (1974), Global Maximization of a Convex Function with Linear Inequality Constraints. Operations Research, 22, 602-609.
NOTATION

IN              set of natural numbers
IR              set of real numbers
iR              set of extended real numbers (iR = IR ∪ {+∞, -∞})
IR+             set of nonnegative real numbers
⌊a⌋             lower integer part of a
⌈a⌉             upper integer part of a
M ⊂ N           M (not necessarily strict) subset of N
M \ N           difference of sets M and N
M - N           algebraic difference of sets M and N
M + N           sum of sets M and N
|M|             cardinality of set M
lin M           linear hull of set M
aff M           affine hull of set M
conv M          convex hull of set M
cone M          conical hull of set M
δ(M) = d(M)     diameter of M
xy              inner product of vectors x, y
I               identity matrix
I_n             (n×n) identity matrix
diag(α) = diag(α_1,...,α_n)   diagonal matrix with entries α_1,...,α_n (where α = (α_1,...,α_n))
A^T             transpose of matrix A
A^{-1}          inverse of matrix A
det A           determinant of A
x ≤ y           x_i ≤ y_i for all i (where x = (x_1,...,x_n), y = (y_1,...,y_n))
Ax = b          system of linear equalities
Ax ≤ b          system of linear inequalities
Q = (z^1,...,z^n)   matrix of columns z^1,...,z^n
con(Q)          cone spanned by the columns of the matrix Q
conv[x^0,...,x^n] = [x^0,...,x^n]   simplex spanned by its n+1 affinely independent vertices x^0,...,x^n
polyhedron, convex polyhedral set   set of solutions of a system of linear inequalities
polytope        bounded polyhedron
vert(P), V(P)   vertex set of polyhedron (polytope) P
extd(P), U(P)   set of extreme directions of a polyhedron P
R(K)            recession cone of convex set K
G = (V,A)       directed graph
epi(f)          epigraph of f
hypo(f)         hypograph of f
dom(f)          effective domain of a function f: IR^n → iR
∇f(x)           gradient of f at x
∂f(x)           subdifferential of f at x
‖·‖             Euclidean norm
N(x,ε)          open ball centered at x with radius ε
int M           interior of set M
cl M = M̄        closure of set M
∂M              boundary of set M
π(x)            projection of a point x onto a specified set
min f(M)        global minimum of function f over set M
argmin f(M)     set of global minimizers of f over M
(BCP)           basic concave programming problem (= linearly constrained concave minimization problem)
(BLP)           bilinear programming problem
(CCP)           concave complementarity problem
(CDC)           canonical d.c. programming problem
(CF)            minimum concave cost flow problem
(CP)            concave minimization problem (concave programming problem)
(LCP)           linear complementarity problem
(LP)            linear programming problem
(LRCP)          linear program with an additional reverse convex constraint
(MIP)           mixed integer programming problem
(PCP)           parametric concave programming problem
(QCP)           quadratic concave programming problem
(SBC)           special biconvex programming problem
(SCP)           separable concave programming problem
(SUCF)          single source uncapacitated minimum concave cost flow problem
(UCF)           uncapacitated minimum concave cost flow problem
(UL)            univariate Lipschitz optimization problem
NCS             normal conical subdivision
NRS             normal rectangular subdivision
NSS             normal simplicial subdivision
INDEX

A
all-quadratic problem, 634
approximate relief indicator method, 674
arc, 125, 421
assignment problems, 15, 20, 21

B
barycenter, 140
basic solution, 96
basic variables, 96, 192, 194
Bender's decomposition, 383
biconvex programming problem, 20, 36, 592
bid evaluation problem, 13
bilinear constraints approach, 642
bilinear programming cut, 222
bilinear programming problem, 20, 209, 447, 594
bimatrix game, 20
bisection, 141, 302, 329, 371
bounded convergence principle, 102
bound improving selection, 131, 136
bounding, 115, 126, 145, 163, 175
branch and bound, 115, 297, 553, 616
branch and bound algorithms for (see also conical alg., rectangular alg., simplicial alg.)
- biconvex problems, 595
- concave minimization, 190, 295, 298, 323, 335, 340, 342, 365
- d.c. problems, 558, 572, 579
- Lipschitz optimization, 603, 616, 632, 654
- branch and bound / outer approximation algorithms, 176 ff., 324, 351, 559

C
canonical d.c. problem, 40, 41, 519, 525, 559
Caratheodory's Theorem, 11, 151
certain in the limit, 136
complementarity condition, 37
complementarity problem, 24, 469 ff.
complete selection, 129
communication network, 13
concave complementarity problem, 469 ff., 486
concavity cut, 94, 104 ff., 211 ff.
concave minimization, 9, 14, 179, 181, 190
concave minimization over network, 421
concave minorants, 645 ff.
concave objective network problem, 13, 421
concave polyhedral underestimation, 278
concave variable cost, 13
cone of the first category, 477
cone of the second category, 477
conical algorithms, 295, 321
conical algorithm for the
- bilinear programming problem, 447 ff.
- concave complementarity problem, 469, 486
- concave minimization problem, 190, 229, 278, 295 ff., 323
- d.c. programming problem, 520, 558 ff., 574
- linear complementarity problem, 469, 472, 475, 508 ff.
- linear program with an additional reverse convex constraint, 559 ff., 564
conjugate function, 151
conjunctive cuts, 99
consistent bounding operation, 131, 164
constraint dropping strategies, 53, 61, 65, 67 ff., 240 ff.
constraint qualification, 160
convergent branch and bound procedure, 122, 134, 163
convergent cutting procedures, 99
convex envelope, 148, 270, 365, 388, 595 ff., 638, 642, 648, 652
convex inequality, 8, 38, 39
convexity cut, 105
convex minimization, 6, 38, 41
convex subfunctional, 148
convex underestimation, 148, 267
cut, 89 ff., 182 ff., 211
cut and split algorithm, 201 ff.
cuts using negative edge extensions, 113
cutting plane method, 58 ff.
cutting plane methods for the
- bilinear programming problem, 447 ff.
- concave minimization problem, 190, 228, 234
- concave quadratic programming problem, 211 ff., 273
cutting procedure, 99 ff.

D
d.c. decomposition, 27 ff.
d.c. function, 7
d.c. inequality, 7
d.c. programming problem, 26, 166, 517 ff., 558 ff., 572 ff.
d.c. programming problem with concave objective function, 571
d.c. programming problem with convex objective function, 541
decomposition, 381 ff.
decomposition by outer approximation, 401 ff.
decomposition framework, 382 ff.
decomposition of concave minimization problems over networks, 421 ff.
deep cuts, 181, 201, 211 ff.
degeneracy, 80, 95 ff.
deletion by infeasibility, 136, 145, 169 ff., 556, 561 ff., 633
deletion of a partition element, 119 ff., 164
demand, 421
design centering problem, 14, 35, 572 ff.
(DG) problem, 244 ff., 296 ff.
(DG) procedure, 300
diamond cutting problem, 14, 572 ff.
disjunctive cuts, 101
downward peak, 607
duality, 34, 159, 520 ff.
duality between objective and constraints, 520
duality gap, 160 f.
dynamic programming, 422

E
eccentricity, 305
economies of scale, 13
END process, 336
engineering design, 14, 34
enumerative method, 20
epigraph, 149
essential reverse convex constraint, 41, 525
Evtushenko's saw-tooth cover, 609
exact simplicial algorithm, 353
exhaustive nondegenerate subdivision, 335
exhaustive subdivision, 140
exhaustive subdivision of cones, 300, 329 ff.
exhaustive subdivision of simplices, 137, 329 ff.
extreme direction, 74
extreme face, 191
extreme face problem, 191
extreme flow, 429
extreme point, 10, 74 ff.

F
facet finding problem, 248
facial cut, 196
facial cut algorithm, 190
Falk-Hoffman algorithm, 82, 269
feasible partition set, 118
feasible set, 6, 8 ff.
finite branch and bound procedure, 121, 126 ff.
finite consistency, 126
finite cutting procedure, 103
fixed charge problem, 12
fixed charge transportation problem, 12
fixed-point problem, 24
flow, 421
functions with concave minorants, 645 ff.

G
γ-extension, 183
generalized concavity cut, 108
geometric complementarity problem, 471 ff.
(G,K)-cut, 108
global minimizer, 4
global minimum, 4
global optimality criterion, 5, 183, 525, 663
global optimization problem, 3
graph, 421
graph of a branch and bound method, 125

H
hemisphere, 655
hierarchy of valid cuts, 211
Hölder continuity, 664
Horst-Thoai-de Vries method, 78
hydraulic network, 13
hypograph, 226, 279

I, J
indefinite quadratic problem, 13, 32, 46
infeasible partition set, 118, 134
initial polytope, 71 f., 87
inner approximation, 225, 243
integer linear programming problem, 19
integer quadratic programming problem, 19
integer programming, 14, 104 ff.
integrated circuits, 14
interactive fixed charge problem, 12
intersection cut, 105
inventory, 11, 13
investment, 33
jointly constrained biconvex programming problem, 36, 592

K
KCG algorithm, 61, 64, 228, 235
Konno's cutting method, 217
Krein-Milman Theorem, 11

L
Leontiev substitution system, 13
level set, 8, 181
linear complementarity problem, 24, 469 ff.
linearization of the constraints, 638
Lipschitz constant, 6, 50
Lipschitz function, 7, 50, 604
Lipschitz inequality, 8
Lipschitz optimization, 43, 145, 603
local d.c. function, 29
local minimizer, 4
local minimum, 4
lower bounding, 135 ff., 167
lower C² function, 35

M
Markovian assignment problem, 20
max-min problem, 25
minimum concave cost flow problem, 421 ff.
mixed integer program, 399, 645
modified exact simplicial algorithm, 353
mountain climbing procedure, 450
multicommodity network flow, 20
multiextremal global optimization problem, 6
multilevel fixed charge problem, 12

N
national development, 14
network, 11, 13
node, 421
noncanonical d.c. problem, 541 ff.
nondegenerate subdivision of cones, 300 ff.
nondegenerate subdivision of simplices, 347 ff.
nonlinear cut, 53, 68, 660
normal cone splitting process, 297 ff.
normal conical algorithm, 308 ff.
normal rectangular algorithm, 295, 365 ff.
normal rectangular subdivision, 295, 365 ff.
normal simplicial algorithm, 345 ff.
normal simplicial subdivision, 343 ff.

O
ω-subdivision, 306 ff., 349 ff., 371
one-step optimality, 614
optimality condition, 6, 520, 528, 546
ordered sequential algorithm, 611
outer approximation by convex polyhedral sets, 58
outer approximation by projection, 65 ff.
outer approximation method, 53 ff.
outer approximation method for the
- bilinear programming problem, 462 ff.
- canonical d.c. problem, 519 ff.
- concave minimization problem, 225 ff.
- linear programming problem with an additional reverse convex constraint, 497, 519 ff.
- Lipschitz optimization problem, 653 ff.
- noncanonical d.c. problem, 541 ff.

P
parametric concave programming, 490
partition, 117 ff.
partition of a cone, 142 ff., 297 ff.
partition of a rectangle, 142 ff., 365 ff.
partition of a simplex, 137 ff., 342, 355 ff.
partition set, 118 ff., 137 ff.
passive algorithm, 606
path, 429
penalty function, 68
piecewise linear approximation, 645
piecewise linear cost, 13
piecewise linear function, 13, 31
Pinter's axiomatic method, 625 ff.
Piyavskii's saw tooth cover, 611 ff.
plant location, 12 f., 413 ff.
polyhedral annexation, 225 ff., 245 ff., 251 ff.
polyhedral annexation for the
- bilinear programming problem, 456 ff.
- concave minimization problem, 266 ff.
- linear complementarity problem, 469 ff.
polyhedral underestimation algorithm, 285, 391
polynomial time algorithm, 11
price break, 13
production function, 30
production planning, 13, 14
projection method, 65, 641
prototype branch and bound procedure, 115 ff.
prototype diagonal extension, 622

Q
quadratic constraints, 634
quadratic problem, 6, 21, 179
quasiconvex function, 8

R
radial subdivision, 137 ff.
recession cone, 74 ff.
rectangular algorithm, 365 ff., 388 ff., 621 ff., 694
redundant constraint, 84 ff.
reference saw-tooth cover, 609
refining, 122, 130
relaxation, 54, 225, 267 ff.
relief indicator function, 670
relief indicator method, 662 ff.
restart branch and bound algorithm, 176 ff.
reverse convex constraint, 26, 37
reverse convex inequality, 8
robust set, 4
Rosen's algorithm, 273 ff.
Rosen-Pardalos algorithm, 399 ff.

S
saw-tooth cover, 604
selection, 127 ff.
separable constraint, 634
separable function, 12, 143, 157
separable problem, 289, 365 ff., 393
separator, 67 ff., 663
sequential algorithm, 606
sewage network, 13
simplicial algorithm, 342 ff., 543 ff., 649 ff.
simplicial subdivision, 137 ff., 329 ff., 343 ff.
single source uncapacitated minimum concave cost flow problem, 422
site selection, 12
Slater condition, 61
source, 427
spanning forest, 429
stable optimization problem, 521 ff.
standard form of a concave minimization problem, 186
standard global optimization problem, 3
Stone-Weierstraß Theorem, 30
strictly redundant constraint, 86
stronger cut, 113
strongly consistent bounding operation, 135
subdifferential, 59
subgradient, 60
subdivision (see partition)
successive partitioning method, 295 ff.
successive relaxation method, 267 ff.
successive underestimation, 225, 267
sufficient global optimality condition, 138, 525 ff., 663
supporting hyperplane, 59 ff.
supply, 421
system of equations, 78 ff.
system of inequalities, 48 ff., 625 ff.

T
Thieu-Tam-Ban method, 82 ff.
transversal facet, 247 ff.
trivial simplex, 356 ff.

U
uncapacitated minimum concave cost flow problem, 422, 426, 437
uncertain partition set, 118
univariate Lipschitz optimization, 604, 609 ff.
unstable d.c. problem, 537
upper bound, 117, 119, 133, 145, 164
upper level set, 184
utility, 14, 30, 33, 34

V
valid cut, 89 ff., 182 ff., 211 ff.
variable cost, 12
variational inequality, 24
vertex minima, 146, 165, 166
vertex problem, 191
vertices, 53, 71 ff., 96, 111, 189 ff., 249 ff., 341 ff., 385 f., 406 ff.

W, X, Y, Z
Weierstraß Theorem, 3, 11, 30
zero-one integer programming, 15, 399, 644