
Computers Math. Applic. Vol. 21, No. 6/7, pp. 1-6, 1991
Printed in Great Britain. All rights reserved
0097-4943/91 $3.00 + 0.00
Copyright © 1991 Pergamon Press plc

PROBLEM-METHOD CLASSIFICATION
IN OPTIMIZATION AND CONTROL
EFIM A. GALPERIN
Département de Mathématiques et d'Informatique, Université du Québec à Montréal
C.P. 8888, Succ. "A", Montréal, Québec, H3C 3P8, Canada

Abstract--There are many classifications in optimization and control (continuous, discrete, integer problems; convex or concave minimization; steepest descent, branch and bound, path following, space filling, random search methods; smooth or non-smooth problems; etc.). The aim of this short note is not to propose yet another classification, but to clarify existing ambiguities and imprecisions in problem formulations and to emphasize certain fundamental distinctions in solution methods currently used.

1. INTRODUCTION

In almost all literature to date, the problems of mathematical programming (parameter optimization) and optimal control are formulated as follows.

Mathematical Programming:

$$\min f(x) \quad \text{or} \quad \max f(x), \qquad (1.1)$$
under the conditions

$$g_i(x) \le 0, \quad i = 1, \ldots, m, \qquad (1.2)$$


$$h_j(x) = 0, \quad j = 1, \ldots, k; \qquad (1.3)$$

see, for example, [1,2].

Optimal Control:
Here the function f(x) in (1.1) is replaced by a functional J(u) in the form of an integral plus a function of the terminal state; the inequalities (1.2) are treated as restrictions on the control function u(x, t) and/or the state variables x(t_0, x_0, t, u(·)); and the equalities (1.3) are replaced by a system of differential equations.
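For concreteness, a typical problem of this kind may be written in the following form (this display is added here for illustration and is not part of the original text; the symbols f_0, Φ, φ and U are generic placeholders):

$$J(u) = \int_{t_0}^{t_1} f_0(x(t), u(t), t)\, dt + \Phi(x(t_1)) \to \min_u,$$
$$\dot{x} = \varphi(x, u, t), \quad x(t_0) = x_0, \quad u(\cdot) \in U,$$

with inequality restrictions of the type (1.2) imposed on u(·) and/or on the trajectory x(·).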
Such formulations, in fact, represent only a declaration of the intention to look for an optimal solution under certain conditions. They are so ambiguous that one can really see what problem is being solved only in the process of solution by a certain available method. Until recently, the most popular methods were the Karush-Kuhn-Tucker conditions [1,3], the maximum principle [4] and the Hamilton-Jacobi-Bellman equation [5]. Consequently, the optimal solution was understood and determined simply as a solution that would fit the Karush-Kuhn-Tucker conditions, the maximum principle or the Hamilton-Jacobi-Bellman equation, according to the case.
Let us see what the sense of such a solution is.
Since the Karush-Kuhn-Tucker conditions employ derivatives, their solution, if it exists, is necessarily local. If there is a set (a continuum) of equi-optimal solutions, it cannot be found; only one or several solutions can be obtained from those conditions. To apply the conditions, the functions f, g_i, h_j should be differentiable and certain convexity assumptions (at least local) should be imposed. It is well known that there are problems to which the Karush-Kuhn-Tucker conditions do not apply; such problems are usually discarded. Much less known is that the set
given by (1.2), (1.3) may not be inf-robust (this term is not defined in the classical literature; for its meaning see, e.g., [6] in this issue); in such a case, gradient conditions cannot be applied despite the fact that f, g_i, h_j may all be differentiable.
If we regard solutions of optimal control problems as solutions to the maximum principle or to the Hamilton-Jacobi-Bellman equation, then, replacing the notion of derivative by the notion of variation, we can see the same handicaps.
If we apply a popular random search method or a heuristic method, then the result is optimal "in probability" or up to the heuristic employed.
Thus, the traditional formulation (1.1)-(1.3) makes the notion of optimal solution vary and depend on the method employed. From the writing (1.1)-(1.3), it is not clear whether we look for a local or a global solution, for just one or for all optimal solutions, or whether or not the set (1.2)-(1.3) is robust so that a given method should work, or we can only try and see what happens. To improve the situation, we have to detach the problem formulation from the methods employed and clearly specify what kind of solution is sought, irrespective of the capacity of certain methods to deliver such a solution.
Consider a nonempty robust set X which belongs to R^n for parameter optimization or to a functional space for optimal control; for any set Y, robust means

$$\operatorname{closure}(\operatorname{interior} Y) = \operatorname{closure} Y; \qquad (1.4)$$

for details, see [6] in this issue. The set X can be given by relations (1.2)-(1.3). For simplicity, we assume the set X to be compact, though this is not necessary for iterative methods. Over the set X, a continuous functional f : X → R is defined. As a matter of convenience, we consider here minimization problems.
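As a simple illustration of (1.4) (an example added here, not from the original text): the interval Y = [0, 1] ⊂ R is robust, since closure(interior Y) = closure((0, 1)) = [0, 1] = closure Y, whereas Y = [0, 1] ∪ {2} is not robust, since closure(interior Y) = [0, 1] omits the isolated point 2 and (1.4) fails.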

2. THREE PROBLEMS IN OPTIMIZATION AND CONTROL

There are three distinct problems of optimization (min) of f over X, depending on the exact sense that is meant under the notation (1.1).

Problem A. Traditional (local) optimization, or mathematical programming (optimal control):


Find an element (a point) x^0 ∈ X such that


$$f(x^0) \le f(x) \quad \text{for all } x \in N_\delta(x^0), \qquad (2.1)$$

where

$$N_\delta(x^0) = \{\, x \in X \mid \|x - x^0\| < \delta,\ \delta > 0 \,\} \qquad (2.2)$$


$$= \{\, x \in \mathbb{R}^n \mid \|x - x^0\| < \delta,\ \delta > 0 \,\} \cap X \subset X \qquad (2.3)$$
is some (maybe truncated) δ-neighborhood of x^0. Here ‖·‖ is a norm in the appropriate space. The notations (2.2), (2.3) are equivalent; if X ⊂ R^n, we prefer notation (2.2) since it is simpler and does not carry the indication of a space, being, thus, more universal.
Problem A is very old and there are many different methods for its solution, based, mainly, on
(sub)gradients, Taylor series, cones, variations, Euler equation, etc. They include such achieve-
ments as the maximum principle, dynamic programming and the Hamilton-Jacobi-Bellman equa-
tion. There are scores of papers and books on this topic.

Problem B. Limited global optimization:


Find an element (a point) x* ∈ X such that

$$f(x^*) \le f(x) \quad \text{for all } x \in X. \qquad (2.4)$$

Efforts on this problem have been in progress for many years and have produced various
solution methods, such as enumerative techniques, separation and/or cutting plane methods,
branch and bound methods, outer approximation (relaxation), clustering methods, stochastic
search methods, etc.; see [7-9] and bibliographies therein.
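The following sketch (added here; a deliberately naive pure random search, with the domain and test function as illustrative assumptions, not one of the methods cited above) shows the character of Problem B: a single, approximately global minimizer is reported, with no attempt to recover all of them, and the result is only optimal "in probability":

```python
import random

def f(x):
    return x**4 - 3*x**2 + 2*x   # same illustrative function as above

def random_search(n=100_000, lo=-2.0, hi=2.0, seed=1):
    # Problem B: keep only the single best point seen so far.
    random.seed(seed)
    best = None
    for _ in range(n):
        x = random.uniform(lo, hi)
        if best is None or f(x) < f(best):
            best = x
    return best

x_star = random_search()
print(x_star, f(x_star))   # close to (-1.366, -4.848) with high probability
```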

Problem C. Full global optimization:


Find the set X^0 ⊂ X such that

$$f(x^0) = p^0 < f(x), \quad \text{for all } x^0 \in X^0 \text{ and all } x \in X - X^0. \qquad (2.5)$$
Here p^0 is a constant. We shall use henceforth a shorter notation for (2.5).
Find
$$p^0 = \min_{x \in X} f(x), \qquad (2.6)$$
and
$$X^0 = \{\, x \in X \mid f(x) = p^0 \,\}, \qquad (2.7)$$

with the understanding that the minimization in (2.6) is carried out over the whole set X; hence, the number p^0 represents the unique global minimum value of f(x) over X.
Although intuitively transparent, a clear formulation of Problem C, including the requirement to find the entire set X^0 of all global minimizers, was probably first suggested in [10-12], and then in [13,14]. An integral global optimization method for its solution has been developed in [10-12,15]; see also [16,17] for its application to nonlinear observation and identification. The universal global optimality criterion (a necessary and sufficient condition) has been developed in [11,15]. A different criterion was proposed earlier in [18], but this criterion can only distinguish an isolated global minimizer (a point x* ∈ X^0, as in limited global optimization, Problem B).
A clear distinction between limited and full global optimization is formally introduced in [19] and here. In contrast, Problem B in [7-9] is simply called "global optimization." It seems that such a distinction is necessary since Problems B and C are quite different not only in the search task (x* ∈ X or X^0 ⊂ X) but also in the applicable methods. There are many methods for solving Problem B, see [7-9] and bibliographies therein, but only three (at the time of printing) for solving Problem C. Those are the integral global optimization method [10-12,15,20], the cubic algorithm and its extensions [21-24], and interval methods [25-28].
Robustness of X is an important requirement. If a set X is compact but not inf-robust (roughly speaking, a "bearded" set), then x*, X^0 still exist but cannot be obtained by any known method. There is, though, a way around it, outlined in §§ 7.2, 7.3 of [24].
If X and f are convex, then Problems A and B coincide, but C is distinct. If X is convex and f is strictly convex, then Problems A, B, C all coincide, having one unique minimizer. Otherwise, the three problems are profoundly different in the solution and in the methods by which the solution can be obtained.
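To make the distinction concrete (an illustrative sketch added here; the function is an arbitrary choice whose global minimizer set is the continuum X^0 = [-1, 1]): a point-to-point or grid search method reports individual points, whereas Problem C asks for the whole set, which on a grid can only be approximated by collecting every minimizing node:

```python
def f(x):
    # Every point of [-1, 1] is a global minimizer: X0 = [-1, 1], p0 = 0.
    return max(abs(x) - 1.0, 0.0) ** 2

grid = [-3.0 + 6.0 * i / 600 for i in range(601)]   # uniform grid on [-3, 3]

# A "multi-point" (grid search) method still reports a single point:
print("one minimizer found by grid search:", min(grid, key=f))

# Problem C asks for the entire set X0 = {x : f(x) = p0}; on the grid it can
# only be approximated by the collection of all minimizing nodes:
p0 = min(f(x) for x in grid)
X0_nodes = [x for x in grid if f(x) == p0]
print("grid nodes attaining p0 range from", X0_nodes[0], "to", X0_nodes[-1])
```

Even this approximation is only a finite list of points; representing and converging to the set X^0 itself is the subject of the set-to-set methods discussed in Section 4.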

3. NON-COMPACT SETS AND η-OPTIMAL SOLUTIONS


Compactness of the set X guarantees the existence of min, max in (1.1), (2.6) and of the minimizing elements x^0 or the set X^0. For iterative algorithms, compactness is a technical requirement and the problem can be formulated and solved without it. Indeed, iterative algorithms deliver, in the limit, inf f(x) or sup f(x) in the local or global sense, depending on the algorithm employed. Since the limit is never achieved, they deliver an η-approximation to the optimal solution that can be described as follows.

Problem B*. Given η > 0, find an element (a point) x* ∈ X such that

$$f(x^*) \le f(x) + \eta, \quad \text{for all } x \in X. \qquad (3.1)$$

Problem C*. Given η > 0, find the largest set X* ⊆ X such that (3.1) holds for all x* ∈ X* and all x ∈ X.
Here the largest means that either
$$X^* = X, \qquad (3.2)$$
or X* ⊂ X, that is, the set X - X* is non-empty, and for every x̄ ∈ X - X*, there is a point x_0 ∈ X such that
$$f(\bar{x}) > f(x_0) + \eta. \qquad (3.3)$$
The sets X, X* may be neither closed nor inf-compact. For technical reasons, the set X should be inf-robust [6]; then X* will necessarily be inf-robust as well.
We did not assume knowledge of the global optimum value

$$p^0 = \inf_{x \in X} f(x) < \infty, \qquad (3.4)$$

which is supposed to exist, and did not use this value in (3.1). Hence, we have to prove that formulations B* and C* are, indeed, equivalent to the intuitive notion of the global η-optimal solution (respectively, limited and full) which follows from (2.4)-(2.7).
THEOREM 3.1. If an element x* ∈ X satisfies (3.1), then

$$p^0 \le f(x^*) \le p^0 + \eta. \qquad (3.5)$$

PROOF. The left inequality is trivial by the definition of the infimum. By the same definition, for each ε > 0, there is an x̄ ∈ X such that

$$f(\bar{x}) < p^0 + \varepsilon. \qquad (3.6)$$

Since (3.1) holds for all x ∈ X, we can take x̄ for x in (3.1), yielding, due to (3.6):

$$f(x^*) < p^0 + \varepsilon + \eta, \quad \text{for given } \eta > 0 \text{ and all } \varepsilon > 0. \qquad (3.7)$$

Note that we cannot take the limit in (3.6) as ε → 0, because this may imply x̄ ∉ X, so that the value f(x̄) ≤ p^0 may not exist for a function f defined on the set X only. On the contrary, since η > 0, we can take the limit as ε → 0 in (3.7), yielding

$$f(x^*) \le p^0 + \eta, \qquad (3.8)$$

which completes the proof. ∎

THEOREM 3.2. If each point of the set X* ⊆ X satisfies (3.1), then the set X* is the largest set with the property (3.1), in the sense of the definition (3.2)-(3.3), if and only if

$$X^* = \{\, x \in X \mid p^0 \le f(x) \le p^0 + \eta \,\}. \qquad (3.9)$$

PROOF. For every x* satisfying (3.1) we have (3.5). Define X* as the set composed of all such x*. Then (3.9) follows and, if the set X - X* is not empty, then (3.3) holds for every x̄ ∈ X - X* and a corresponding x_0 ∈ X. Vice versa, let X* be defined by (3.1)-(3.3). Then (3.5) holds for every x* ∈ X*. If X* = X, then X is itself an η-optimal set and (3.9) follows. If X - X* ≠ ∅, then from (3.3) we get, for every x̄ ∈ X - X*:

$$f(\bar{x}) > f(x_0) + \eta \ge \inf_{x \in X} f(x) + \eta = p^0 + \eta, \qquad (3.10)$$

so that X* of (3.9) is the largest subset of X with the property (3.1). ∎
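A small numerical illustration of Theorem 3.2 and of the set (3.9) (added here; the function, the grid and the value of η are arbitrary choices): on a fine grid, the η-optimal set is approximated by the sublevel set of height p^0 + η.

```python
eta = 0.05

def f(x):
    return x**4 - 3*x**2 + 2*x   # illustrative function used earlier

# Discretize X = [-2, 2]; this only approximates the sets involved.
X = [-2.0 + 4.0 * i / 4000 for i in range(4001)]

p0 = min(f(x) for x in X)                       # approximation of p0 = inf f
X_eta = [x for x in X if f(x) <= p0 + eta]      # approximation of the set (3.9)

print(round(p0, 3))                                   # about -4.848
print(round(min(X_eta), 3), round(max(X_eta), 3))     # a short interval around -1.366
```

Note that the local minimizer x = 1 (with f = 0) is excluded, in accordance with (3.9): an η-optimal point must be within η of the global minimum value, not merely of some local one.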

Remark 3.1.
It is instructive that a definition of a local η-optimal solution obtained by replacing f(x) with f(x) + η in (2.1) (compare (2.4) and (3.1)) is incorrect, as can be clearly seen graphically. Here we need the a priori assumption that a minimizer x^0 does exist in a neighborhood N_δ(x^0).

4. POINT-TO-POINT VERSUS SET-TO-SET METHODS
It is clear that conventional methods based on derivatives, (sub)gradients and variations use only local information and, thus, cannot distinguish between local and global minimizers unless the global property of convexity is called in. This makes them inapplicable for finding a global minimum in nonconvex problems. Moreover, such methods are indirect; for example, the steepest descent method minimizes not the function itself but the norm of its gradient. This implies, in many cases, ill-conditioning and sensitivity of the solution to perturbations of the parameters.
Another characteristic of conventional methods and of methods for limited global optimization is that those methods are either "point-to-point" (cf. the steepest descent) or grid search ("multi-point") methods. This is acceptable for Problems A and B, where it is required to locate just one (or "at least one", as is sometimes said) minimizer. In contrast, Problem C requires one to determine the entire set X^0 of all global minimizers, which set may well be a continuum. If such a set is not a polyhedral set (a simplex, rectangle, etc.), then grid search methods fail to deliver X^0, even in the limit.
Thus, in order to develop methods for the solution of different optimization problems, we
have to distinguish between point-to-point (or multi-point, grid search) methods and set-to-set
methods. It is clear that point-to-point methods can deliver only a limited solution and are, thus,
inapplicable to Problem C which requires the application of a set-to-set method.
There are three classes of set-to-set methods. Those are interval methods [25-28], integral methods [10-12,20] and the cubic algorithm [24], which can be classified as semi-interval. This brings up the problem of exact representation of sets in a computer. It is clear that, with finite memory of whatever volume, most sets cannot be exactly represented in a computer, even if it is assumed free of round-off errors. However, certain sets can, and the simplest are the cube and the ball, which are used for partitions and coverings, respectively, and can be represented, for example, by the central point and the diagonal (radius). Such a representation is used, e.g., in the cubic algorithm and its extensions, which are direct set-to-set methods delivering (in the limit) the entire set X^0, whatever its configuration may be.
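The following sketch (added here) conveys the flavor of a set-to-set covering scheme in one dimension; it is not the cubic algorithm of [24] nor an interval method of [25-28], only a simplified bisect-and-discard procedure under a stated assumption: f is Lipschitz on the domain with a known constant L, which yields a valid lower bound on every subinterval.

```python
# Illustrative set-to-set covering scheme (an assumption-laden sketch, not the
# cubic algorithm itself).  Boxes that may still contain global minimizers are
# kept and bisected; all others are discarded.

def f(x):
    return max(abs(x) - 1.0, 0.0) ** 2     # global minimizer set X0 = [-1, 1]

L = 4.0                                    # a Lipschitz constant of f on [-3, 3]
boxes = [(-3.0, 3.0)]
upper = min(f(-3.0), f(3.0))               # best (smallest) value seen so far

for _ in range(12):                        # bisect-and-discard iterations
    new_boxes = []
    for a, b in boxes:
        m = 0.5 * (a + b)
        upper = min(upper, f(m))                   # improve the record value
        for lo, hi in ((a, m), (m, b)):
            c = 0.5 * (lo + hi)
            lower = f(c) - L * (hi - lo) / 2.0     # valid lower bound on [lo, hi]
            if lower <= upper:                     # keep only possibly optimal boxes
                new_boxes.append((lo, hi))
    boxes = new_boxes

# The retained boxes jointly cover the whole minimizer set, whatever its shape.
print(len(boxes), min(a for a, _ in boxes), max(b for _, b in boxes))
```

The union of the retained intervals shrinks onto the set X^0 = [-1, 1] as the subdivision is refined; this is precisely the kind of output (a covering of the whole set rather than a single point) that a point-to-point method cannot provide.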

REFERENCES

1. H.W. Kuhn and A.W. Tucker, Nonlinear Programming, In Proceedings of the 2nd Berkeley Symposium on Mathematical Statistics and Probability (Jerzy Neyman, ed.), Univ. of California Press, Berkeley, pp. 481-492, (1951).
2. G.P. McCormick, Optimality Criteria in Nonlinear Programming, In SIAM-AMS Proceedings, Volume IX: Nonlinear Programming, AMS, Providence, R.I., pp. 27-38, (1976).
3. W. Karush, Minima of Functions of Several Variables with Inequalities as Side Conditions, Master's Thesis, Univ. of Chicago, (1939).
4. L.S. Pontryagin, V.G. Boltyanskii, R.V. Gamkrelidze and E.F. Mishchenko, The Mathematical Theory of Optimal Processes, John Wiley & Sons, New York, (1962).
5. R. Bellman, Dynamic Programming, Princeton University Press, Princeton, New Jersey, (1957).
6. Q. Zheng, Robust analysis and global optimization, Int. J. Computers and Mathematics with Applications, Special Issue on Global Optimization, Control and Games, this issue.
7. P.M. Pardalos and J.B. Rosen, Constrained Global Optimization: Algorithms and Applications, Lecture Notes in Computer Science, 268, Springer-Verlag, (1987).
8. A.H.G. Rinnooy Kan and G.T. Timmer, Stochastic global optimization methods, Mathematical Programming 39 (1) (Part I: Clustering Methods, pp. 27-56; Part II: Multilevel Methods, pp. 57-78), (1987).
9. R. Horst and H. Tuy, Global Optimization, Springer-Verlag, (1990).
10. Q. Zheng, B. Jiang and S. Zhuang, A method for finding the global extremum, Acta Mathematicae Applicatae Sinica (2), 161-174 (1978), (in Chinese).
11. Q. Zheng, On optimality conditions of a global extremum, Numerical Mathematics--A Journal of Chinese Universities (3), 273-275 (1981), (in Chinese).
12. Q. Zheng, Global optimization: Theory, methods and applications, Communication at the Conference Optimization Days, Montreal, Canada, Abstracts Volume.
13. E.A. Galperin, The cubic algorithm, Preliminary report, Abstracts of the American Mathematical Society (AMS), Issue 27, 4 (6), 501 (October 1983); Also in: Abstracts of the 9th Symposium on Operations Research, Osnabrück, West Germany, (Aug. 1984), 43.
14. E.A. Galperin, The cubic algorithm for a non-cubic domain, Preliminary report, Abstracts of the American Mathematical Society (AMS), Issue 30, 5 (2), 213 (Feb. 1984).
15. Q. Zheng, Optimality conditions for global optimization (I, II), Acta Mathematicae Applicatae Sinica 2 (1), 49-76 (1985).
16. E.A. Galperin and Q. Zheng, Nonlinear observation via global optimization methods, Proc. of the 17th International Conf. on Information Sciences and Systems, The Johns Hopkins University, Baltimore, USA.
17. E.A. Galperin and Q. Zheng, Nonlinear observation via global optimization methods: measure theory approach, Journal of Optimization Theory and Applications (JOTA) 54 (1), 63-92 (1987).
18. J.E. Falk, Conditions for global optimality in nonlinear programming, Operations Research 21, 337-340 (1973).
19. E.A. Galperin, Global optimization via cubic algorithm, Abstracts of the 13th International Symposium on Mathematical Programming, Tokyo, Japan, August 1988.
20. Q. Zheng, Theory and methods for global optimization--An integral approach, Advances in Optimization and Control, Lecture Notes, 302.
21. E.A. Galperin, The cubic algorithm, JMAA 112 (2), 635-640 (Dec. 1985).
22. E.A. Galperin, The beta-algorithm, JMAA 126 (2), 455-468 (Sept. 1987).
23. E.A. Galperin, Precision, complexity and computational schemes of the cubic algorithm, JOTA 67 (2), 223-238 (1988).
24. E.A. Galperin, The Cubic Algorithm for Optimization and Control, NP Research Publ., P.O. Box 1691, Station B, Montreal, Que., Canada H3B 3L3, (1990).
25. H. Ratschek and J. Rokne, New Computer Methods for Global Optimization, Ellis Horwood and John Wiley, (1988).
26. E.R. Hansen, Interval Methods for Global Optimization, (to appear).
27. R.E. Moore and H. Ratschek, Inclusion functions and global optimization II, Mathematical Programming 41, 341-356 (1988).
28. R.E. Moore, Global optimization to prescribed accuracy, Int. J. Computers and Mathematics with Applications, this issue.
