Introduction to the Theory of Optimization in Euclidean Space
Chapman & Hall/CRC Series in Operations Research
Series Editors: Malgorzata Sterna, Marco Laumanns
Samia Challal
Glendon College-York University
Toronto, Canada
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2020 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Contents
Preface
Acknowledgments
Author
1 Introduction
1.1 Formulation of Some Optimization Problems
1.2 Particular Subsets of Rn
1.3 Functions of Several Variables
2 Unconstrained Optimization
2.1 Necessary Condition
2.2 Classification of Local Extreme Points
2.3 Convexity/Concavity and Global Extreme Points
2.3.1 Convex/Concave Several Variable Functions
2.3.2 Characterization of Convex/Concave C1 Functions
2.3.3 Characterization of Convex/Concave C2 Functions
2.3.4 Characterization of a Global Extreme Point
2.4 Extreme Value Theorem
Bibliography
Index
Preface
In presenting the material, we refer first to the intuitive idea in one dimension, then make the jump to n dimensions as naturally as possible. This approach allows the reader to focus on understanding the idea, postpone the proofs, and learn to apply the theorems through examples and problem solving. A detailed solution follows each problem, illustrating and deepening the theory. These solved problems provide repetition of the basic principles, a review of some difficult concepts and a further development of some ideas.
Students are taken progressively through the development of the proofs, where they have the occasion to practice tools of differentiation (chain rule, Taylor formula) for functions of several variables in abstract settings. They learn to apply important results established in advanced algebra and analysis courses, such as the Farkas–Minkowski lemma, the implicit function theorem and the extreme value theorem.
– Among the local candidate points, which of them are local maximum or local minimum points? Here, we establish sufficient conditions for identifying a local candidate point as an extreme point.
– Now, among the local extreme points found, which ones are global extreme points? Here, the convexity/concavity property provides the answer.
Finally, we explore how the extreme value of the objective function f is affected when some parameters involved in the definition of the functions f or g change slightly.
Acknowledgments
Symbol Description
∀  for all, or for each
∃  there exists
∃!  there exists a unique
∅  the empty set
s.t.  subject to
‖A‖ = (Σ_{i,j=1}^n a_ij²)^{1/2}  norm of the matrix A = (a_ij)_{i,j=1,…,n}
rank A  rank of the matrix A
det A  determinant of the matrix A
Ker A = {x : Ax = 0}  kernel of the matrix A
S°  interior of the set S
Author
Chapter 1
Introduction
The purpose of this short section is to show, through some examples, the
main elements involved in an optimization problem.
Example 1 (Designing a can). A cylindrical can must hold a fixed volume V (one liter, say), and we wish to choose its radius r so as to minimize the area of material used.
i) Show how to make this choice without finding the exact radius.
ii) How should the radius be chosen if the volume V may vary from one liter to two liters?
Solution: Denote by h and r the height and the radius of the can, respectively. Then the area and the volume of the can are given by
A = 2πr² + 2πrh,   V = πr²h.
Note that the set S, as shown in Figure 1.1, is an open unbounded interval of R.
FIGURE 1.1: The open interval S = (0, +∞) of admissible radii r.
ii) In case we allow more possibilities for the volume, say 1 ≤ V ≤ 2, we can formulate the problem as a two-dimensional problem:
minimize A(r, h) = 2πr² + 2πrh over the set S,
S = {(r, h) ∈ R₊ × R₊ : 1/(πr²) ≤ h ≤ 2/(πr²)}.
The set S is the plane region, in the first quadrant, between the curves h = 1/(πr²) and h = 2/(πr²) (see Figure 1.3).
FIGURE 1.3: The region S between the curves h = 1/(πr²) and h = 2/(πr²).
where the set S ⊂ R³ is the part of the surface V = πr²h located between the planes V = 1 and V = 2 in the first octant; see Figure 1.4.
FIGURE 1.4: The part of the surface V = πr²h between the planes V = 1 and V = 2.
Diet Problem.* One can buy four types of food, where the nutritional content per unit weight of each food and its price are shown in Table 1.1 [5]. The diet problem consists of obtaining, at minimum cost, at least twelve calories and seven vitamins.
Solution: Let u_i be the weight of the food of type i. The total price of the four foods consumed is f(u1, u2, u3, u4). To ensure that at least twelve calories and seven vitamins are included, we can express these conditions by writing
minimize f(u1, u2, u3, u4) over the set
S = {(u1, u2, u3, u4) ∈ R⁴ : 2u1 + u2 + u4 ≥ 12, 3u1 + 4u2 + 3u3 + 5u4 ≥ 7}.
** With seven types of food available, the total price of the foods consumed is
2u1 + 2u2 + u3 + 8u4 + 12u5 + 10u6 + 8u7 = p(u1, u2, u3, u4, u5, u6, u7).
To ensure that at least twelve calories, seven vitamins and twenty proteins are included, and that less than fifteen fats are consumed, the problem would be formulated as
minimize p(u1, …, u7) over the set
S = {(u1, …, u7) ∈ R⁷ :
  3u1 + u2 + 2u3 + 7u4 + 8u5 + 5u6 + 10u7 ≥ 20,
  u2 + 8u4 + 15u5 + 10u6 + 6u7 ≤ 15,
  2u1 + u2 + u4 + 5u5 + 7u6 + 9u7 ≥ 12,
  3u1 + 4u2 + 3u3 + 5u4 + u5 + 2u6 + 5u7 ≥ 7}.
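Problems of this form are linear programs and can be solved numerically. The following is a minimal sketch, not part of the original text, feeding the problem above to scipy; it assumes in addition that the weights u_i are nonnegative, and it negates the "≥" rows since linprog expects "≤" constraints.

# A sketch: the seven-food diet problem as a linear program (assumes u_i >= 0).
import numpy as np
from scipy.optimize import linprog

c = np.array([2, 2, 1, 8, 12, 10, 8], dtype=float)   # objective: total price p(u)

# ">=" constraints (proteins >= 20, calories >= 12, vitamins >= 7).
A_ge = np.array([[3, 1, 2, 7, 8, 5, 10],
                 [2, 1, 0, 1, 5, 7, 9],
                 [3, 4, 3, 5, 1, 2, 5]], dtype=float)
b_ge = np.array([20, 12, 7], dtype=float)

# "<=" constraint (fats <= 15).
A_le = np.array([[0, 1, 0, 8, 15, 10, 6]], dtype=float)
b_le = np.array([15], dtype=float)

A_ub = np.vstack([-A_ge, A_le])          # negate ">=" rows for A_ub @ u <= b_ub
b_ub = np.concatenate([-b_ge, b_le])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 7)
print(res.x, res.fun)                    # optimal weights and minimal cost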
[Figure: the triangular region S bounded by the lines L1, L2, L3.]
S = {(x, y) ∈ R² : 3x + 2y ≤ 6, x ≥ 0, y ≥ 0}.
The set S is the triangular plane region bounded by the sides L1, L2 and L3, defined by L1 = {(x, 0) : 0 ≤ x ≤ 2}, L2 = {(x, y) : 3x + 2y = 6, x, y ≥ 0} and L3 = {(0, y) : 0 ≤ y ≤ 3}.
Here, the objective function f (x, y) = xy is nonlinear and the set S is described
by linear inequalities.
** Such a model may work for a certain production process. However, it may not reflect the situation when other factors involved in the production process cannot be ignored. Therefore, new models have to be considered. For example [7]:
- The production of the Canadian manufacturing industries in 1927 is estimated by:
- The production P of dairy farming in Iowa (1939) is estimated by:
As seen above, the main purpose of this study is to find solutions to optimization problems of the form min_S f and max_S f.
Remark 1.1.1 The extreme points may fail to exist on the set S. In our study, we will explore situations where min_S f and max_S f are attained.
For example
Indeed, suppose there exists x0 ∈ (0, 1) such that f(x0) = min_{(0,1)} f(x). Then
0 < x0/2 < x0 ⟹ x0/2 ∈ (0, 1),
and since f is a strictly increasing function on (0, 1), f(x0/2) < f(x0),
which contradicts the fact that x0 is a minimum point of f on (0, 1). However, we remark that
We list here the main categories of sets that we will encounter and give the main tools that allow us to identify them easily. Even though the purpose is not a topological study of these sets, it is important to be aware of the precise definitions and how to apply them accurately [18], [13].
In one dimension, the distance between two real numbers x and y is mea-
sured by the absolute value function and is given by
d(x, y) = |x − y|.
d(x, y) ≥ 0 and d(x, y) = 0 ⟺ x = y,
d(y, x) = d(x, y) (symmetry),
d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
In Rⁿ, the distance between two points x and y is d(x, y) = ‖x − y‖; d is called the Euclidean distance and satisfies the three properties above. A set O ⊂ Rⁿ is said to be open if and only if, at each point x0 ∈ O, we can insert a small ball
B_ε(x0) = {x ∈ Rⁿ : ‖x − x0‖ < ε} ⊂ O.
Example 1. As n varies, the ball takes different shapes; see Figure 1.6.
n = 1, a ∈ R: B_r(a) = (a − r, a + r), an open interval;
n = 2, a = (a1, a2): B_r(a) = {(x1, x2) : (x1 − a1)² + (x2 − a2)² < r²}, the open disk centered at a with radius r;
n = 3, a = (a1, a2, a3): B_r(a) = {(x1, x2, x3) : (x1 − a1)² + (x2 − a2)² + (x3 − a3)² < r²}, the set of points delimited by the sphere centered at a with radius r.
FIGURE 1.6: An interval (n = 1), a disk (n = 2), and a ball bounded by the sphere x² + y² + z² = 4 (n = 3).
– S̄ = S ∪ ∂S is the closure of S.
– S is bounded ⟺ ∃M > 0 such that ‖x‖ ≤ M ∀x ∈ S.
– S is closed ⟺ S̄ = S.
For example, for S1 = (−2, 2) we have S1° = (−2, 2) = S1, ∂S1 = {−2, 2} and S̄1 = [−2, 2].
FIGURE 1.7: The set S = {(x, y) : x > 0, y > 0, xy ≥ 1}.
∗ Note that the set S, sketched in Figure 1.7, doesn’t contain the points on the x and y axes. So
S = {(x, y) : x > 0, y > 0, xy ≥ 1},
and S can be described using the continuous function f : (x, y) ↦ xy on the open set Ω = {(x, y) : x > 0, y > 0} as S = {(x, y) ∈ Ω : f(x, y) ≥ 1}.
∗∗ The set is unbounded since it contains the points (x(t), y(t)) = (t, t) for t ≥ 1 (xy = t·t = t² ≥ 1) and
‖(x(t), y(t))‖ = ‖(t, t)‖ = √(t² + t²) = √2·t → +∞ as t → +∞.
∗∗∗ We have
S° = {(x, y) : x > 0, y > 0, xy > 1},
the region in the first quadrant above the hyperbola y = 1/x.
∗ Figure 1.8 shows that S is the triangular region formed by all the points in the first quadrant below the line x + 3y = 7:
S = {(x, y) : x + 3y ≤ 7, x ≥ 0, y ≥ 0}.
∗∗ The set is bounded since
‖(x, y)‖ = √(x² + y²) ≤ √(7² + (7/3)²) = (7/3)√10 ∀(x, y) ∈ S.
∗∗∗ We have
S° = {(x, y) : x > 0, y > 0, x + 3y < 7}, the region S excluding its three sides.
Convex sets
The category of convex sets deals with sets S ⊂ Rⁿ in which any two points x, y ∈ S can be joined by a line segment that remains entirely within the set. Such sets have no holes and do not bend inwards. Thus
S is convex ⟺ (1 − t)x + ty ∈ S ∀x, y ∈ S, ∀t ∈ [0, 1].
[Figure 1.9: examples — a line segment, a disk and a line; two points A and B joined by a segment.]
Example. A ball B_r(x0) is convex: for a, b ∈ B_r(x0) and t ∈ [0, 1],
‖(1 − t)a + tb − x0‖ = ‖(1 − t)(a − x0) + t(b − x0)‖ ≤ (1 − t)‖a − x0‖ + t‖b − x0‖ < (1 − t)r + tr = r.
Hence (1 − t)a + tb ∈ B_r(x0) for any t ∈ [0, 1]; that is, [a, b] ⊂ B_r(x0).
[Figure: the closed disk x² + y² ≤ 4.]
The set is the closed disk with center (0, 0) and radius 2. It is closed since it includes its boundary points located on the circle with center (0, 0) and radius 2. The set is bounded since ‖(x, y)‖ ≤ 2 ∀(x, y) ∈ B̄2((0, 0)).
A hyperplane is a set of the form
{x = (x1, …, xn) ∈ Rⁿ : a1x1 + a2x2 + … + anxn = a·x = b}.
[Figure: hyperplanes in R, R² and R³ — a point, a line and a plane.]
Indeed, as above, consider x1, x2 in the region [a·x ≤ b] and t ∈ [0, 1]; then
a·((1 − t)x1 + tx2) = (1 − t) a·x1 + t a·x2 ≤ (1 − t)b + tb = b.
The set [a·x ≤ b] describes the region of points located below the hyperplane a·x = b. For example,
S1 = {(x, y) ∈ R² : x + 6y ≤ 0},   S2 = {(x, y) ∈ R² : x ≤ 6}.
[Figure: a polygonal region S bounded by the lines L1, …, L6, and the circle (x − 1)² + (y − 1)² = 4.]
[Figure: the circle x² + y² = 4.]
Indeed, we have
(1/2)(x1∗, …, xn∗ + 2r) + (1 − 1/2)(x1∗, …, xn∗ − 2r) = (1/2)(2x1∗, …, 2xn∗ + 2r − 2r) = x∗ ∈ S.
For example, in the plane, the set
{(x, y) : x² + y² > 4} = R² \ B̄2((0, 0)) is not convex.
Moreover, the set is open since it is the complement of the closed disk with center (0, 0) and radius 2 (see Figure 1.14). It is not bounded since, for t ≥ 2, the points (0, t²) belong to the set, but ‖(0, t²)‖ = t² → +∞ as t → +∞.
∗ The union of the disk and the line in Figure 1.9 is not convex.
We refer the reader to any book of calculus [1], [3], [21], [23] for details on the points introduced in this section.
– Linear function:
f(x1, …, xn) = a1x1 + a2x2 + … + anxn.
– The electric potential function for two positive charges, one at (0, 1) with twice the magnitude of the charge at (0, −1), is given by
ϕ(x, y) = 2/√(x² + (y − 1)²) + 1/√(x² + (y + 1)²).
are
Df = {x ∈ R : x ≥ 0}.
[Figure: the domains Df : x ≥ 0, Dg : x ≥ 0 and Dh : x ≥ 0, with the corresponding graphs.]
The following examples illustrate how to proceed to graph some surfaces and
level curves.
Example 3. A cylinder is a surface that consists of all lines that are parallel
to a given line and that pass through a given plane curve.
Let
E = {(x, y, z), x = y 2 }.
The set E cannot be the graph of a function z = f (x, y) since (1, 1, z) ∈ E
for any z, and then (1, 1) would have an infinite number of images. However,
we can look at E as the graph of the function x = f (y, z) = y 2 . Moreover, we
have
E = ⋃_{z∈R} {(x, y, z) : x = y²}.
This means that any horizontal plane z = k (parallel to the xy-plane) intersects the
graph in a curve with equation x = y 2 . So these horizontal traces E ∩ [z = k],
k ∈ R are parabolas. The graph is formed by taking the parabola x = y 2
in the xy-plane and moving it in the direction of the z-axis. The graph is a
parabolic cylinder as it can be seen as formed by parallel lines passing through
the parabola x = y 2 in the xy-plane (see Figure 1.17).
Note that for any k ∈ R, the level curve z = k is the parabola x = y 2 in the
xy plane.
FIGURE 1.17: The parabolic cylinder x = y²: traces, level curves and graph.
The graph
Gf = {(x, y, z) : x²/a² + y²/b² = z}
can be seen as the union, over z = k ∈ [0, +∞), of the ellipses x²/a² + y²/b² = k in the planes z = k. By choosing the traces in Table 1.3, we can shape the graph in space (see Figure 1.18 for a = 2, b = 3):
plane — trace
xy (z = 0) — point: (0, 0)
xz (y = 0) — parabola: z = x²/a²
yz (x = 0) — parabola: z = y²/b²
z = 1 — ellipse: x²/a² + y²/b² = 1
FIGURE 1.18: The elliptic paraboloid z = x²/4 + y²/9 with level curves k = 0, 1, 4, 9.
Note that for any k < 0, the level curves z = k are not defined. For k > 0, the level curves are ellipses
x²/(a√k)² + y²/(b√k)² = 1
centered at the origin. For k = 0, the level curve is reduced to the point (0, 0).
plane — trace
xy (z = 0) — point: (0, 0)
xz (y = 0) — lines: z = ±x/a
yz (x = 0) — lines: z = ±y/b
z = ±1 — ellipse: x²/a² + y²/b² = 1
FIGURE 1.19: The elliptic cone with level curves k = 0, 1, 2, 3.
x²/a² + y²/b² + z²/c² = 1, with a > 0, b > 0, c > 0.
It is the union of the graphs of the functions z = ±c√(1 − x²/a² − y²/b²), which one can sketch by making the following choice of traces in Table 1.5 (see Figure 1.20 for a = 2, b = 3, c = 4):
plane — trace
xy (z = 0) — ellipse: x²/a² + y²/b² = 1
xz (y = 0) — ellipse: x²/a² + z²/c² = 1
yz (x = 0) — ellipse: y²/b² + z²/c² = 1
FIGURE 1.20: The ellipsoid x²/4 + y²/9 + z²/16 = 1.
For k ∈ (−c, c), the level curves z = ±k are ellipses centered at the origin with vertices
(±a√(1 − k²/c²), 0) and (0, ±b√(1 − k²/c²))
in the xy-plane.
ii) One can establish, using similar tools in one dimension [2], that the
standard properties of limits hold for limits of functions of n variables.
iii) If the limit of f (x) fails to exist as x −→ x0 along some smooth
curve, or if f (x) has different limits as x −→ x0 along two different
smooth curves, then the limit of f (x) does not exist as x −→ x0 .
Example 7.
• lim_{x→a} xi = ai, i = 1, …, n, for a = (a1, …, an) ∈ Rⁿ.
• lim_{(x,y,z)→(1,2,3)} (3xy² + z − 5) = 3(1)(2)² + 3 − 5 = 10.
• The limit
lim_{(x,y)→(0,0)} 2x²y/(x⁴ + y²) does not exist.
Indeed, along the curve C1 = {y = x},
lim_{(x,y)→(0,0), (x,y)∈C1} 2x²y/(x⁴ + y²) = lim_{x→0} 2x²·x/(x⁴ + x²) = lim_{x→0} 2x/(x² + 1) = 0,
while along C2 = {y = x²},
lim_{(x,y)→(0,0), (x,y)∈C2} 2x²y/(x⁴ + y²) = lim_{x→0} 2x⁴/(2x⁴) = 1;
the limits have different values along C1 and C2 (see Figure 1.21).
FIGURE 1.21: Behavior of f(x, y) = 2x²y/(x⁴ + y²) near (0, 0).
f(x, y) = 1/(e^{xy} − 1).
Df = R² \ {(x, y) ∈ R² : x = 0 or y = 0}.
∗∗ (x, y) ↦ 1/(e^{xy} − 1) is continuous on Df as the composition of the C⁰ function (x, y) ↦ xy on R² and the C⁰ function t ↦ 1/(eᵗ − 1) on R \ {0}:
(x, y) ∈ Df ↦ xy = t ∈ R \ {0} ↦ 1/(eᵗ − 1).
∂f/∂xi (x) = lim_{h→0} [f(x1, …, xi + h, …, xn) − f(x1, …, xi, …, xn)] / h.
Solution: For f(x, y, z, w) = x e^{yw} sin z, we have, at (w, x, y, z) = (1, 2, 3, π/2):
fx = e^{yw} sin z, so fx(1, 2, 3, π/2) = e³;
fy = xw e^{yw} sin z, so fy(1, 2, 3, π/2) = 2e³;
fz = x e^{yw} cos z, so fz(1, 2, 3, π/2) = 0;
fw = xy e^{yw} sin z, so fw(1, 2, 3, π/2) = 6e³.
Example 10. The rate of change of the body mass index (BMI) function B(w, h) = w/h² with respect to the weight w at a constant height h is
∂B/∂w = 1/h² > 0.
Thus, at constant height, BMI increases with weight at the rate 1/h². The rate of change of the BMI with respect to the height h at a constant weight w is
∂B/∂h = −2w/h³ < 0.
Therefore, at a given weight, BMI is a decreasing function of height.
∂/∂xj (∂f/∂xi) = ∂²f/(∂xj ∂xi) = f_{xi xj}.
The n second-order partial derivatives f_{xi xi} are called direct second-order partials; the others, f_{xi xj} with i ≠ j, are called mixed second-order partials. Usually these second-order partial derivatives are displayed in an n × n matrix named the Hessian:
Hf(x) = (f_{xi xj})_{n×n} =
[ f_{x1x1}  f_{x1x2}  …  f_{x1xn} ]
[ f_{x2x1}  f_{x2x2}  …  f_{x2xn} ]
[   ⋮         ⋮       ⋱     ⋮    ]
[ f_{xnx1}  f_{xnx2}  …  f_{xnxn} ]
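For concrete functions, the Hessian can be generated symbolically; a minimal sketch with sympy (the chosen function is only an illustration, not part of the original text):

# A sketch: building the Hessian of f(x, y) = x*y**3 + x**2*y symbolically.
import sympy as sp

x, y = sp.symbols('x y')
f = x*y**3 + x**2*y
H = sp.hessian(f, (x, y))   # the matrix of second-order partials (f_{xi xj})
print(H)                    # Matrix([[2*y, 2*x + 3*y**2], [2*x + 3*y**2, 6*x*y]])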
∂(ln Q)/∂L = QL/Q = a/L ⟹ QL = (a/L) Q,
∂(ln Q)/∂K = QK/Q = b/K ⟹ QK = (b/K) Q,
QLL = (a/L) QL + (−a/L²) Q = (a/L)(a/L) Q − (a/L²) Q = [a(a − 1)/L²] Q,
QKK = (b/K) QK + (−b/K²) Q = (b/K)(b/K) Q − (b/K²) Q = [b(b − 1)/K²] Q,
QKL = QLK = (a/L) QK = [ab/(LK)] Q.
The Hessian matrix of Q is given by:
HQ(L, K) = [ QLL QLK ; QKL QKK ] = Q [ a(a−1)/L²  ab/(LK) ; ab/(LK)  b(b−1)/K² ].
Example 12. Laplace's equation for a function u = u(x1, …, xn) is
Δu = ∂²u/∂x1² + ∂²u/∂x2² + … + ∂²u/∂xn² = 0.
For which value of k does the function u = (x1² + x2² + … + xn²)ᵏ satisfy Laplace's equation?
Solution: We have
∂u/∂xi = 2k xi (x1² + … + xn²)^{k−1},
∂²u/∂xi² = 2k (x1² + … + xn²)^{k−1} + 4k(k − 1) xi² (x1² + … + xn²)^{k−2}.
Summing over i gives
Δu = 2k [n + 2(k − 1)] (x1² + … + xn²)^{k−1},
which vanishes identically (away from the origin) if and only if k = 0 or k = 1 − n/2.
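As a check, for n = 3 the nontrivial exponent is k = 1 − 3/2 = −1/2, i.e. u = 1/√(x² + y² + z²); a minimal symbolic sketch (an illustration, not part of the original text):

# A sketch: u = (x^2 + y^2 + z^2)^(-1/2) solves Laplace's equation for n = 3.
import sympy as sp

x, y, z = sp.symbols('x y z')
u = (x**2 + y**2 + z**2) ** sp.Rational(-1, 2)
lap = sp.diff(u, x, 2) + sp.diff(u, y, 2) + sp.diff(u, z, 2)
print(sp.simplify(lap))   # prints 0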
Differentiability
While the existence of a derivative of a one-variable function at a point guarantees the continuity of the function at this point, the existence of partial derivatives of a function of several variables doesn't. Indeed, for example,
f(x, y) = 2 if x > 0 and y > 0, and f(x, y) = 0 otherwise,
has partial derivatives at (0, 0), since
fx(0, 0) = lim_{h→0} [f(h, 0) − f(0, 0)]/h = lim_{h→0} (0 − 0)/h = 0,
fy(0, 0) = lim_{h→0} [f(0, h) − f(0, 0)]/h = 0,
but f is not continuous at (0, 0), since
lim_{t→0⁺} f(t, t) = lim_{t→0⁺} 2 = 2 ≠ 0 = f(0, 0).
with
lim_{x→a} ε(‖x − a‖) = 0.
In particular, differentiability at a implies that f is continuous at a.
Theorem 1.3.2 If all first-order partial derivatives of f exist and are con-
tinuous at a point, then f is differentiable at that point.
Example 13. Use the linear approximation to estimate the change of the Cobb–Douglas production function
Q(L, K) = L^{1/3} K^{2/3} from (20, 10) to (20.6, 10.3).
Solution: We have
QL(L, K) = Q/(3L), QK(L, K) = 2Q/(3K), Q(20, 10) = 20^{1/3} 10^{2/3} = 10·2^{1/3},
QL(20, 10) = Q(20, 10)/60, QK(20, 10) = 2Q(20, 10)/30.
Thus, close to (20, 10), we have
Q(L, K) ≈ Q(20, 10) + QL(20, 10)(L − 20) + QK(20, 10)(K − 10)
        = [1 + (1/60)(L − 20) + (2/30)(K − 10)] Q(20, 10),
from which we deduce the estimate
Q(20.6, 10.3) ≈ [1 + (1/60)(0.6) + (2/30)(0.3)] Q(20, 10) = 1.03 Q(20, 10).
(In fact, since Q is homogeneous of degree one and (20.6, 10.3) = 1.03·(20, 10), the value 1.03 Q(20, 10) is exact here.)
Another consequence of differentiability is the chain rule for differentiating compositions.
f(x, y) = x² − 2xy + 2y³, x = s ln t, y = st.
Solution: i) We have
∂f/∂x = 2x − 2y, ∂f/∂y = −2x + 6y²,
x = x(s, t): ∂x/∂s = ln t, ∂x/∂t = s/t,
y = y(s, t): ∂y/∂s = t, ∂y/∂t = s.
Hence the partial derivatives of f at (s, t) are:
∂f/∂s = (∂f/∂x)(∂x/∂s) + (∂f/∂y)(∂y/∂s) = (2x − 2y) ln t + (−2x + 6y²) t,
∂f/∂t = (∂f/∂x)(∂x/∂t) + (∂f/∂y)(∂y/∂t) = (2x − 2y)(s/t) + (−2x + 6y²) s
      = (2s ln t − 2st)(s/t) + (−2s ln t + 6s²t²) s.
At (s, t) = (1, 1) (so that x = 0, y = 1):
∂f/∂s |_{s=1,t=1} = (2x − 2y) ln t + (−2x + 6y²) t |_{s=1,t=1} = 6,
∂f/∂t |_{s=1,t=1} = (2x − 2y)(s/t) + (−2x + 6y²) s |_{s=1,t=1} = 4.
Solved Problems
Solution:
[Figure 1.22: the three domains of parts i), ii) and iii).]
i) f(x, y) = e^{2x} √(y − x²):
Df = {(x, y) ∈ R² : y − x² ≥ 0},
the plane region located above the parabola y = x², parabola included.
ii) f(x, y, z) = z √((1 − x²)(y² − 4)):
Df = {(x, y, z) ∈ R³ : (1 − x²)(y² − 4) ≥ 0}.
Since 1 − x² ≥ 0 ⟺ −1 ≤ x ≤ 1 and y² − 4 ≥ 0 ⟺ |y| ≥ 2, the product is nonnegative exactly when both factors have the same sign, so
Df = ([−1, 1] × ((−∞, −2] ∪ [2, +∞)) × R) ∪ ((((−∞, −1] ∪ [1, +∞)) × [−2, 2]) × R).
iii) H(x, y, z) = √(z − x² − y²):
DH = {(x, y, z) ∈ R³ : z − x² − y² ≥ 0},
the set of points bounded by the paraboloid z = x² + y², paraboloid included.
The three domains are illustrated in Figure 1.22.
Match the equations with the surfaces labeled A–F in Figure 1.23:
a. y − z² = 0   b. x + y + z = 0   c. 4x² + y²/9 + z² = 1
d. x² + y²/9 − z² = 1   e. x² + y²/9 = z²   f. z − y² = 0
[Figure 1.23: six surfaces labeled A–F.]
Solution:
c. 4x² + y²/9 + z² = 1 — (E): ellipsoid centered at (0, 0, 0);
d. x² + y²/9 − z² = 1 — (F): the traces at z = −1, 0, 1 are ellipses;
e. x² + y²/9 = z² — (B): elliptic cone.
Solution: i)
FIGURE 1.24: Domain and graph of z = √(81 − x²).
Graph of f: Gf = {(x, y, z) ∈ R³ : (x, y) ∈ Df, z = √(81 − x²)}
= {(x, y, z) ∈ R³ : (x, y) ∈ Df, x² + z² = 81, z ≥ 0}.
It is the half circular cylinder located in the region z ≥ 0, with radius 9 and axis the y-axis (see Figure 1.24).
ii) It is the plane passing through (0, 0, 3) with normal vector k = ⟨0, 0, 1⟩ (see Figure 1.25).
iii) Domain of f: Df = {(x, y) ∈ R² : x² + y² ≥ 0} = R².
The graph is the part of the circular cone z² = x² + y² located in the region [z ≤ 0]; see Figure 1.25.
FIGURE 1.25: Graph of z = 3 and graph of z = −√(x² + y²).
[Figure: six level-curve plots (1–6) to be matched with six surfaces (A–F).]
Solution:
ii) The level set (see the 2nd graph in Figure 1.27) (x − 2)² + y² + z² = k is reduced to:
– the point (2, 0, 0) if k = 0;
– the sphere centered at (2, 0, 0) with radius √k if k > 0;
– no points if k < 0.
[Figure 1.27: level curves of x² + y = k and level surfaces (x − 2)² + y² + z² = k.]
∗∗ v : (x, y, z) ↦ √(y − x²) is continuous on D2 = {(x, y, z) : y − x² ≥ 0} as the composite of the polynomial function (x, y, z) ∈ D2 ↦ y − x² ∈ R₊ and the function t ↦ √t, continuous on R₊:
(x, y, z) ∈ D2 ↦ y − x² = t ∈ R₊ ↦ √t.
FIGURE 1.28: Domain of continuity of f(x, y, z) = √(y − x²) ln z.
(c) fxz = (fx )z = −2e−2z sin(πy) (d) fzz = (fz )z = 4xe−2z sin(πy)
8. – Show that u = ln(x² + y²) satisfies Laplace's equation ∂²u/∂x² + ∂²u/∂y² = 0. Show, without calculation, that ∂²u/(∂x∂y) = ∂²u/(∂y∂x).
Solution: We have
∂u/∂x = 2x/(x² + y²), ∂u/∂y = 2y/(x² + y²),
∂²u/∂x² = [2(x² + y²) − 2x(2x)]/(x² + y²)² = (2y² − 2x²)/(x² + y²)²,
∂²u/∂y² = (2x² − 2y²)/(x² + y²)²,
so ∂²u/∂x² + ∂²u/∂y² = 0.
Note that ∂u/∂x is a rational function; then ∂²u/(∂y∂x) is also rational, and as a consequence it is continuous on R² \ {(0, 0)}. In the same way, ∂u/∂y and ∂²u/(∂x∂y) are rational, so ∂²u/(∂x∂y) is continuous on R² \ {(0, 0)}.
From Clairaut's theorem, the two mixed second derivatives uxy and uyx are equal on R² \ {(0, 0)}.
9. – Find the value of dw/ds at s = 0 if
w = x² e^{2y} cos(3z); x = cos s, y = ln(s + 2), z = s.
Solution: We have
dx/ds = −sin s, dy/ds = 1/(s + 2), dz/ds = 1,
∂w/∂x = 2x e^{2y} cos(3z), ∂w/∂y = 2x² e^{2y} cos(3z), ∂w/∂z = −3x² e^{2y} sin(3z),
dw/ds = (∂w/∂x)(dx/ds) + (∂w/∂y)(dy/ds) + (∂w/∂z)(dz/ds)
      = [2x e^{2y} cos(3z)](−sin s) + [2x² e^{2y} cos(3z)]·1/(s + 2) + [−3x² e^{2y} sin(3z)]·1.
At s = 0 (so x = 1, y = ln 2, z = 0):
dw/ds |_{s=0} = e^{2 ln 2} = 4.
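As a sanity check, the whole computation can be reproduced symbolically; a minimal sketch (an illustration, not part of the original text):

# A sketch: verifying dw/ds at s = 0 for w = x^2 e^{2y} cos(3z),
# with x = cos(s), y = ln(s + 2), z = s.
import sympy as sp

s = sp.symbols('s')
x, y, z = sp.cos(s), sp.log(s + 2), s
w = x**2 * sp.exp(2*y) * sp.cos(3*z)
print(sp.simplify(sp.diff(w, s).subs(s, 0)))   # prints 4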
10. – Let
R = ln(u² + v² + w²), u = x + 2y, v = 2x − y, w = 2xy.
Find ∂R/∂x |_{x=1,y=0} and ∂R/∂y |_{x=1,y=0}.
Solution: We have
∂R/∂u = 2u/(u² + v² + w²), ∂R/∂v = 2v/(u² + v² + w²), ∂R/∂w = 2w/(u² + v² + w²),
∂u/∂x = 1, ∂v/∂x = 2, ∂w/∂x = 2y,
∂u/∂y = 2, ∂v/∂y = −1, ∂w/∂y = 2x.
Hence
∂R/∂x = (∂R/∂u)(∂u/∂x) + (∂R/∂v)(∂v/∂x) + (∂R/∂w)(∂w/∂x) = (2u + 4v + 4wy)/(u² + v² + w²),
∂R/∂y = (∂R/∂u)(∂u/∂y) + (∂R/∂v)(∂v/∂y) + (∂R/∂w)(∂w/∂y) = (4u − 2v + 4wx)/(u² + v² + w²).
At x = 1, y = 0: u = 1, v = 2, w = 0, u² + v² + w² = 5. Thus
∂R/∂x = (2(1) + 4(2) + 4(0))/5 = 2, ∂R/∂y = (4(1) − 2(2) + 4(0))/5 = 0.
11. – Use the linear approximation of f(x, y, z) = x³√(y² + z²) at the point (2, 3, 4) to estimate the number (1.98)³√((3.01)² + (3.97)²).
Solution: Since f is differentiable at the point (2, 3, 4), the linear approximation L(x, y, z) at the point (2, 3, 4) is given by:
L(x, y, z) = f(2, 3, 4) + fx(2, 3, 4)(x − 2) + fy(2, 3, 4)(y − 3) + fz(2, 3, 4)(z − 4).
We have
fx = 3x²√(y² + z²), fy = yx³/√(y² + z²), fz = zx³/√(y² + z²),
and
f(2, 3, 4) = 40, fx(2, 3, 4) = 60, fy(2, 3, 4) = 24/5, fz(2, 3, 4) = 32/5.
Thus
L(x, y, z) = 40 + 60(x − 2) + (24/5)(y − 3) + (32/5)(z − 4).
Using this approximation, one obtains the estimate
(1.98)³√((3.01)² + (3.97)²) ≈ L(1.98, 3.01, 3.97)
= 40 + 60(1.98 − 2) + (24/5)(3.01 − 3) + (32/5)(3.97 − 4)
= 40 + 60(−0.02) + (24/5)(0.01) + (32/5)(−0.03) = 38.656.
12. – Determine whether the limit exists. If so, find its value.
lim_{(x,y)→(0,0)} (x⁴ − x + y − x³y)/(x − y),  lim_{(x,y)→(0,0)} cos(xy)/(x + y),  lim_{(x,y)→(1,1)} (x − y⁴)/(x³ − y⁴).
Solution: We have
i) lim_{(x,y)→(0,0)} (x⁴ − x + y − x³y)/(x − y) = lim_{(x,y)→(0,0)} [x³(x − y) − (x − y)]/(x − y) = lim_{(x,y)→(0,0)} (x³ − 1) = −1.
ii) lim_{(x,y)→(0,0)} cos(xy)/(x + y) doesn't exist, since
along (x, y) = (t, t), t > 0: lim_{t→0⁺} cos(t²)/(2t) = +∞,
and along (x, y) = (t, t), t < 0: lim_{t→0⁻} cos(t²)/(2t) = −∞.
iii) Along C1 = {x = 1}: lim (1 − y⁴)/(1 − y⁴) = 1.
Along C2 = {x = y}: lim_{y→1} y(1 − y³)/(y³(1 − y)) = lim_{y→1} (y² + y + 1)/y² = 3.
The limits are different along C1 and C2. Thus, the limit doesn't exist.
Chapter 2
Unconstrained Optimization
Many results are well known when dealing with functions of one variable (n = 1). The concept of differentiability offers useful and flexible tools to describe the local and global behavior of a function. These results are generalized to functions of n variables in these notes. Indeed, we obtain, in Section 2.1, a characterization of local critical points as solutions of the vectorial equation ∇f(x) = 0 when f is regular. In Section 2.2, we use the second partial derivatives to identify the nature of the critical points. In Section 2.3, we first define the convexity/concavity property for a function of several variables, then we show how to use it to identify the global extreme points. Finally, Section 2.4 extends the extreme value theorem to continuous functions on closed bounded subsets of Rⁿ.
In this section, we would like to have a close look at our candidates for optimality. In other words, if we are close enough to such points (when they exist), what conditions must be satisfied? In doing so, we hope to reduce the size of the set of candidate points, then identify among these points the extreme ones. This motivates the following definition of local extreme points.
∃r > 0 such that f(x) < f(x∗) (resp. >) ∀x ∈ Br(x∗) ∩ S, x ≠ x∗.
Remark 2.1.1 Note that a global extreme point is also a local extreme point when S is an open set, but the converse is not always true.
Indeed, if x∗ is, say, a global maximum point, then
f(x) ≤ f(x∗) ∀x ∈ S.
Because S is an open set and x∗ ∈ S, there exists a ball Br(x∗) ⊂ S, and the inequality holds in particular on Br(x∗).
To show that the converse is not true, consider the function f(x) = x³ − 3x. The study of the variations of f, in Table 2.1, and its graph, in Figure 2.1, show that f has a local minimum at x = 1 and a local maximum at x = −1, but neither of them is a global maximum or a global minimum, since f(x) → −∞ as x → −∞ and f(x) → +∞ as x → +∞.
FIGURE 2.1: y = x³ − 3x.
‖(x1∗, …, xj∗ + t, …, xn∗) − x∗‖ = ‖(0, …, 0, t, 0, …, 0)‖ = √(0² + … + t² + … + 0²) = |t| < ε.
Thus the points (x1∗, …, xj∗ + t, …, xn∗) remain inside the ball Bε(x∗) and therefore satisfy
f(x1∗, …, xj∗ + t, …, xn∗) ≥ f(x∗)
⟺ f(x1∗, …, xj∗ + t, …, xn∗) − f(x1∗, …, xj∗, …, xn∗) ≥ 0.
Dividing by t > 0 and letting t → 0⁺ gives ∂f/∂xj(x∗) ≥ 0; dividing by t < 0 and letting t → 0⁻ gives ∂f/∂xj(x∗) ≤ 0. We deduce that ∂f/∂xj(x∗) = 0.
This holds for each j ∈ {1, …, n}. Hence ∇f(x∗) = 0.
A similar argument applies if f has a local maximum at x∗ .
Remark 2.1.2 Note that a local extremum can also occur at a point where a function is not differentiable.
FIGURE 2.2: y = |x|.
• For example, the one-variable function f(x) = |x|, illustrated in Figure 2.2, has a local minimum at 0, but f is not differentiable at 0, since the difference quotient (f(h) − f(0))/h = |h|/h tends to 1 as h → 0⁺ and to −1 as h → 0⁻.
FIGURE 2.3: A minimum point where f(x, y) = √(x² + y²) is not differentiable.
• The two-variable function f(x, y) = √(x² + y²), graphed in Figure 2.3, attains its minimum value at (0, 0), because
f(x, y) = √(x² + y²) ≥ 0 = f(0, 0) ∀(x, y) ∈ R².
But f is not differentiable at (0, 0) since, for example, fx(0, 0) doesn't exist. Indeed, we have
[f(0 + h, 0) − f(0, 0)]/h = √(h²)/h = |h|/h → 1 if h → 0⁺, and → −1 if h → 0⁻.
Example 1. (0, 0) is the only stationary point of the functions f and g shown in Figures 2.4 and 2.5. It is a local and absolute minimum for f and a local and absolute maximum for g. The values of the level curves are increasing in Figure 2.4, while they are decreasing in Figure 2.5.
FIGURE 2.4: z = x² + y² and its level curves.
FIGURE 2.5: A surface with an absolute maximum at (0, 0) and its level curves.
The maximum profit occurs when P′(x) = 0, i.e., R′(x) = C′(x). From the linear approximation, we have, for Δx = 1,
• For example, the one-variable function f(x) = x³ has a critical point, since
f′(x) = 3x² = 0 ⟺ x = 0.
But 0 is not a local extremum (see Figure 2.6): indeed, f(x) < 0 = f(0) for x < 0 and f(x) > f(0) for x > 0.
FIGURE 2.6: y = x³.
Now, we give a necessary condition when the extreme point is not necessarily an interior point [5].
Suppose ∇f(x∗)·(x − x∗) < 0 for some x ∈ S. Then, for θ ∈ (0, 1) small,
f(x∗ + θ(x − x∗)) − f(x∗) = θ ∇f(x∗)·(x − x∗) + o(θ) < 0,
which contradicts the fact that x∗ is a relative minimum. Therefore, we have
∇f(x∗)·(x − x∗) ≥ 0.
The case of a relative maximum is proved similarly.
Example 3. Consider the real function f(x) = x² with x ∈ [1, 2]; see Figure 2.8. The interval S = [1, 2] is a convex subset of Ω = R. f is differentiable on R and has no critical points on (1, 2), since f′(x) = 2x ≥ 2 > 0 on [1, 2].
FIGURE 2.8: y = x².
Example. Consider f(x1, x2) = x1² − x1 + x2 + x1x2 on S = R₊ × R₊. We have
f(x1, 0) = x1² − x1 = (x1 − 1/2)² − 1/4 ≥ −1/4 = f(1/2, 0) ∀x1 ≥ 0
and
f(0, x2) = x2 ≥ 0 ∀x2 ≥ 0.
Since −1/4 < 0, the point (1/2, 0) is the global minimum point of f on S, as shown in Figure 2.9. At this point
∇f(x1, x2) |_{x1=1/2, x2=0} = ⟨2x1 − 1 + x2, 1 + x1⟩ |_{x1=1/2, x2=0} = ⟨0, 3/2⟩ ≠ ⟨0, 0⟩,
FIGURE 2.9: The graph of f over S = R₊ × R₊ and the minimum point (1/2, 0).
and
∇f(1/2, 0)·(x1 − 1/2, x2 − 0) = (3/2) x2 ≥ 0 ∀(x1, x2) ∈ S = R₊ × R₊.
Remark 2.1.5 * Note that it is not easy to find the candidate points by solving an inequality ∇f(x∗)·(x − x∗) ≥ 0 (resp. ≤ 0). However, the information gained is useful for establishing other results.
** Solving the equation ∇f(x) = 0 is not that easy either! It leads to nonlinear equations, or to large linear systems when the number of variables is large. To overcome this difficulty, we resort to approximate methods. Newton's method is one of the well-known approximate methods for approaching a root of an equation F(x) = 0. In Exercise 5, the method is described and applied to a nonlinear equation in one dimension. The steepest descent method, conjugate gradient methods and many other methods have been developed for approximating the solution [22], [5].
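To illustrate the approximate methods just mentioned, here is a minimal steepest-descent sketch with a fixed step size; the quadratic objective, the starting point and the step size are illustrative choices, not from the text.

# A sketch: steepest (gradient) descent on f(x, y) = (x - 1)^2 + 2*(y + 3)^2.
import numpy as np

def grad_f(p):
    x, y = p
    return np.array([2.0 * (x - 1.0), 4.0 * (y + 3.0)])

p = np.array([0.0, 0.0])            # starting guess (illustrative)
step = 0.1                          # fixed step size (illustrative)
for _ in range(200):
    g = grad_f(p)
    if np.linalg.norm(g) < 1e-10:   # stop near a stationary point
        break
    p = p - step * g
print(p)                            # converges to the unique minimizer (1, -3)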
on the window [−10, 8] × [−10, 8] × [−1, 12], shows three peaks. Thus, we have at least three local maximum points. These points are solutions of the system
fx = −20x e^{−(x²+y²)} − 10(x + 5) e^{−[(x+5)²+(y−3)²]/10} − 16(x − 4) e^{−2[(x−4)²+(y+1)²]} = 0,
fy = −20y e^{−(x²+y²)} − (y − 3) e^{−[(x+5)²+(y−3)²]/10} − 16(y + 1) e^{−2[(x−4)²+(y+1)²]} = 0.
The result is
[10.1678223807097599, [x = −0.842598632890276e−2, y = 0.505559179745079e−2]].
[Figure: the three-peak surface.]
Solved Problems
1. – Find the point on the parabola y = x² closest to the point (3, 0).
[Figure: the parabola y = x² and the point (3, 0).]
The distance from a point (x, x²) of the parabola to (3, 0) is D(x) = √((x − 3)² + x⁴), so we solve min_{x∈R} D(x). Since R is an open set, the minimum must occur at a critical point, i.e., since D is differentiable, at a point where
dD/dx = [2(x − 3) + 4x³] / (2√((x − 3)² + x⁴)) = 0.
TABLE 2.2: Variations of D(x) = √((x − 3)² + x⁴): x: −∞ → 1 → +∞; D′(x): − 0 +; D(x): +∞ ↓ D(1) ↑ +∞.
Thus
min_{x∈R} D(x) = D(1) = √5.
Alternatively, since 0 ≤ D²(x0) ≤ D²(x) ⟺ 0 ≤ D(x0) ≤ D(x) (because t ↦ √t is increasing on [0, +∞)), it suffices to minimize on R the function
F(x) = (x − 3)² + x⁴.
Since R is an open set, the minimum must occur at a critical point, i.e., since F is differentiable, at a point where
dF/dx = 2(x − 3) + 4x³ = 0 ⟺ 2(x − 1)(2x² + 2x + 3) = 0 ⟺ x = 1.
Since F ∈ C⁰(R), and
TABLE 2.3: Variations of F(x): x: −∞ → 1 → +∞; F′(x) = 2(x − 1)(2x² + 2x + 3): − 0 +; F(x): +∞ ↓ F(1) ↑ +∞.
Solution: From Section 1.1, Example 1, we are led to solve the minimization problem
minimize A(r) = 2πr² + 2V/r over the set S = (0, +∞) = {r ∈ R : r > 0}.
Since S is an open set, the minimum must occur at a critical point, i.e., since A is differentiable, at a point where
dA/dr = 4πr − 2V/r² = 0 ⟹ r = (V/2π)^{1/3} ∈ S.
Since A ∈ C⁰(S), the minimum exists and must be at r = (V/2π)^{1/3}. Indeed, the variations of A are as shown in Table 2.4.
TABLE 2.4: Variations of A(r) = 2πr² + 2V/r: r: 0 → (V/2π)^{1/3} → +∞; A′(r) = 4πr − 2V/r²: − 0 +; A(r): +∞ ↓ A((V/2π)^{1/3}) ↑ +∞.
3. – Locate all absolute maxima and minima, if any, for each function.
ii) g(x, y) = 3x − 2y + 5
Solution: i)
[Figure: surface and level curves for part i).]
By the triangle inequality (note ‖(−1, 5)‖ = √26),
(‖(x, y)‖ − √26)² ≤ ‖(x, y) − (−1, 5)‖² ≤ (‖(x, y)‖ + √26)².
Then
1 − (‖(x, y)‖ + √26)² ≤ f(x, y) ≤ 1 − (‖(x, y)‖ − √26)².
It suffices also to show that f takes arbitrarily large negative values on its domain R², so f has no absolute minimum.
ii) Since g is differentiable on R², its absolute extreme points, which are also local extreme points (if they exist), are stationary points, i.e., solutions of ∇g = ⟨0, 0⟩. But ∇g = ⟨3, −2⟩ never vanishes, so g has no stationary points. Moreover,
g(0, y) = −2y + 5 → ∓∞ as y → ±∞,
g(x, 0) = 3x + 5 → ±∞ as x → ±∞,
so g has no absolute extreme points.
[Figure: the plane z = 3x − 2y + 5 and its level curves.]
iii) Since h is differentiable on R², its absolute extreme points, which are also local extreme points (if they exist), are stationary points, i.e., solutions of ∇h = ⟨0, 0⟩. Here h(1, 2) = 1 − 2 + 4 − 6 = −3 and
h(x, y) − h(1, 2) = x² − xy + y² − 3y + 3 = (x − y/2)² + (3/4)(y − 2)² ≥ 0 ∀(x, y) ∈ R².
Hence, the point (1, 2) is a global minimum of h in R². Here also, one can see that h takes large values; for example, along the x-axis we have h(x, 0) = x² → +∞ as x → ±∞, so h has no absolute maximum.
i) f has no stationary points in the interior of S, since
∇f = ⟨0, 1⟩ ≠ ⟨0, 0⟩ ∀(x, y) ∈ {(x, y) : x² + y² < 1} = S°.
ii) If extreme points exist, they must be on the unit circle, the boundary of S:
∂S = {(x, y) : x² + y² = 1}.
[FIGURE 2.15: the graph of f over S.]
So the only candidate point is (a, b) = (0, 1); see Figure 2.15. Note that on all of R² we have f(0, y) = y → −∞ as y → −∞, so f has no unconstrained minimum.
5. – Newton's method approximates a root of F(x) = 0 by the iteration
x_{n+1} = x_n − F(x_n)/F′(x_n) ∀n ∈ N.
Solution: i) We have f(2) < 0 < f(2.2); from the intermediate value theorem, there exists x0 ∈ (2, 2.2) such that f(x0) = 0.
We deduce that the sequence (x_n) converges to a root r of f(x) = 0 in [2, 2.2] and satisfies
|x_{n+1} − r| ≤ K |x_n − r|², with K = M/(2m) = 0.66 < 0.7.
Thus
|K e_{n+1}| ≤ |K e_1|^{2ⁿ} ≤ ((0.7)(0.2))^{2ⁿ} ≤ (0.0196)ⁿ,
so
|e_{n+1}| ≤ (0.0196)ⁿ / 0.66 ≤ 10⁻⁶
for n large enough. We have
(0.0196)²/0.66 = 0.000582061, (0.0196)³/0.66 ≈ 0.0000114084, (0.0196)⁴/0.66 ≈ 0.0000002236 < 10⁻⁶.
The desired accuracy is obtained for n = 4.
For f(x) = x³ − 2x − 5, the iteration reads x_{n+1} = (2x_n³ + 5)/(3x_n² − 2); starting from x1 = 2, the approximate values of the root are:
x2 = (2(8) + 5)/(3(2²) − 2) = 21/10 = 2.1,
x3 = 23.522/11.23 ≈ 2.09457,
x4 ≈ 2.09455148,
x5 ≈ 2.0945514841.
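The iteration is easy to reproduce; a minimal sketch, assuming f(x) = x³ − 2x − 5 (the classical example consistent with the iterates above):

# A sketch: Newton's method x_{n+1} = x_n - F(x_n)/F'(x_n) for F(x) = x^3 - 2x - 5.
def F(x):  return x**3 - 2.0*x - 5.0
def dF(x): return 3.0*x**2 - 2.0

x = 2.0                       # x1 = 2, as in the text
for n in range(4):
    x = x - F(x) / dF(x)
    print(n + 2, x)           # x2 = 2.1, ..., x5 ~ 2.0945514815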
[FIGURE 2.16: f(x) = x³ − 2x − 5 and the Newton iterates.]
f(x) = f(x∗) + (f″(c)/2!)(x − x∗)².
Now, if we have f″(x∗) > 0, then by continuity of f″ we deduce that, for x close to x∗ (x ∈ (x∗ − ε, x∗ + ε)), we will have f″(c) > 0, hence f(x) ≥ f(x∗): x∗ is a local minimum point.
This classification of critical points, into minima and maxima points, where
the sign of the second derivative intervenes, is generalized to C 2 functions with
several variables in the theorem below, following the definition:
Theorem 2.2.1 (Second derivatives test — sufficient conditions for a strict local extreme point).
Before proving the theorem, we will see its application through some examples.
Solution: The total revenue for selling x units is R(x) = 5x. Thus, the profit P(x) on x units is
P(x) = R(x) − C(x) = 5x − (x³ − 10x² + 17x + 66) = −x³ + 10x² − 12x − 66.
The profit, illustrated in Figure 2.17, will be at its maximum at points where
dP/dx = −3x² + 20x − 12 = −3(x − 6)(x − 2/3) = 0.
We deduce that we have two critical points, x = 6 and x = 2/3. The Hessian of P is
HP(x) = [d²P/dx²] = [−6x + 20].
Applying the second derivatives test, we obtain:
∗ at x = 6: (−1)¹ D1(6) = (−1)(d²P/dx²)(6) = (−1)(−6(6) + 20) = 16 > 0; thus x = 6 is a local maximum.
∗∗ at x = 2/3: D1(2/3) = (d²P/dx²)(2/3) = −6(2/3) + 20 = 16 > 0; thus x = 2/3 is a local minimum.
Thus six units is a candidate point for optimality. We have to check that it is the point at which the profit is largest. This can be done by comparing P(x) and P(6). Indeed, we have
P(x) − P(6) = −(x − 6)²(x + 2) ≤ 0 ∀x > 0, x ≠ 6
⟹ P(x) < P(6) ∀x ∈ (0, +∞) \ {6}.
FIGURE 2.17: y = −x³ + 10x² − 12x − 66.
FIGURE 2.18: Profit function P(x, y) = −0.2x² − 0.05xy + 55x − 0.05y² + 35y − 2500 and maximum point (100, 300).
f(x, y) = 3x − x³ − 2y² + y⁴.
∇f = ⟨3 − 3x², −4y + 4y³⟩ = ⟨0, 0⟩ ⟺ x = ±1 and y ∈ {0, 1, −1}.
We deduce that (1, 0), (1, 1), (1, −1), (−1, 0), (−1, 1) and (−1, −1) are the critical points of f. The level curves, graphed in Figure 2.19, show the nature of these points.
FIGURE 2.19: Graph and level curves of z = y⁴ − 2y² − x³ + 3x.
With Hf(x, y) = [−6x 0; 0 12y² − 4], the second derivatives test gives:
(−1, 1): Hf = [6 0; 0 8], D2 = 48 > 0, D1 = 6 > 0 — local minimum;
(−1, −1): Hf = [6 0; 0 8], D2 = 48 > 0, D1 = 6 > 0 — local minimum;
(1, 1): Hf = [−6 0; 0 8], D2 = −48 < 0 — saddle point;
(1, −1): Hf = [−6 0; 0 8], D2 = −48 < 0 — saddle point;
(−1, 0): Hf = [6 0; 0 −4], D2 = −24 < 0 — saddle point;
(1, 0): Hf = [−6 0; 0 −4], D2 = 24 > 0, D1 = −6 < 0 — local maximum.
The proof of Theorem 2.2.1 uses Taylor’s formula for a function of several
variables and a characterization of symmetric quadratic forms (see the end of
this section). Taylor’s formula will be used several times through out the next
chapters. It is therefore important to understand its proof.
f(x∗ + h) = f(x∗) + Σ_{i=1}^n (∂f/∂xi)(x∗) hi + (1/2) Σ_{i=1}^n Σ_{j=1}^n (∂²f/∂xi∂xj)(x∗ + c h) hi hj,
or
f(x∗ + h) = f(x∗) + ∇f(x∗)·h + (1/2) ᵗh Hf(x∗ + ch) h
for some c ∈ (0, 1), where x∗ = (x1∗, …, xn∗)ᵗ and h = (h1, …, hn)ᵗ are column vectors and ᵗh = (h1 … hn). Here, we identified the column vector x∗ + th with the point (x1∗ + th1, …, xn∗ + thn), t ∈ R.
Note that
g′(t) = ∇f(x∗ + th)·h.
g″(t) = (d/dt)[fx1(x∗ + th)] h1 + (d/dt)[fx2(x∗ + th)] h2 + … + (d/dt)[fxn(x∗ + th)] hn.
For each i = 1, …, n, we have
(d/dt) fxi(x∗ + th) = Σ_{j=1}^n fxixj(x∗ + th) hj.
Hence
g″(t) = Σ_{i=1}^n Σ_{j=1}^n fxixj(x∗ + th) hi hj.
Now, since f is defined on the segment [x∗, x∗ + h], g is defined on the interval [0, 1], and by using the 2nd-order Taylor formula for real functions [1], [2], we get
g(1) = g(0) + g′(0)(1 − 0)/1! + g″(c)(1 − 0)²/2! = g(0) + g′(0) + (1/2) g″(c) for some c ∈ (0, 1),
or equivalently
f(x∗ + h) = f(x∗) + Σ_{i=1}^n fxi(x∗) hi + (1/2) Σ_{i=1}^n Σ_{j=1}^n fxixj(x∗ + ch) hi hj.
For h ∈ Rⁿ such that x∗ + h ∈ S, we have, from the 2nd-order Taylor formula,
f(x∗ + h) = f(x∗) + (1/2) ᵗh Hf(x∗ + ch) h for some c ∈ (0, 1).
Similarly,
(−f)(x∗ + h) − (−f)(x∗) = (1/2) ᵗh H_{−f}(x∗ + ch) h > 0,
which shows that the stationary point x∗ is a strict local maximum point for f in S.
Situation (iii): Assume Dn(x∗) ≠ 0 and neither of the conditions i) and ii) holds.
Note that situation (i) (resp. (ii)) means that the matrix A = (fxixj(x∗))_{n×n} is positive (resp. negative) definite, which is equivalent to each of its eigenvalues λi being positive (resp. negative). So, if neither (i) nor (ii) holds, there exist i0, j0 ∈ {1, …, n} such that
Dn(x∗) = Π_{i=1}^n λi ≠ 0 with λ_{i0} > 0 and λ_{j0} < 0.
Along the corresponding orthonormal eigenvectors e_{i0} and e_{j0}, the quadratic form ᵗh A h takes both positive and negative values arbitrarily close to x∗, so x∗ is neither a local minimum nor a local maximum: it is a saddle point.
Proof. (i) Suppose that x∗ is an interior local minimum point for f. There exists r > 0 such that
f(x∗) ≤ f(x) ∀x ∈ Br(x∗).
Hence
g″(0) = Σ_{i=1}^n Σ_{j=1}^n fxixj(x∗) hi hj = ᵗh Hf(x∗) h ≥ 0.
Quadratic forms
Consider the quadratic form in n variables
Q(h) = Σ_{i=1}^n Σ_{j=1}^n aij hi hj = ᵗh A h, ᵗh = (h1 … hn).
Definition. Q is positive (resp. negative) definite if Q(h) > 0 (resp. < 0) for all h ≠ 0.
We have the following necessary and sufficient conditions for a quadratic form Q to be positive (negative) definite or semi-definite.
Theorem. Q is positive definite ⟺ Dk > 0 for k = 1, …, n; Q is negative definite ⟺ (−1)ᵏ Dk > 0 for k = 1, …, n, where Dk denotes the leading principal minor of order k of A.
Theorem. Q is positive semi-definite ⟺ Δk ≥ 0 for every principal minor Δk of A; Q is negative semi-definite ⟺ (−1)ᵏ Δk ≥ 0 for every principal minor Δk of order k.
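Numerically, the leading principal minors give a quick definiteness check; a minimal sketch (the test matrix is an illustrative choice):

# A sketch: classify a symmetric matrix via its leading principal minors D_k.
import numpy as np

def leading_minors(A):
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

A = np.array([[2.0, -1.0], [-1.0, 2.0]])   # illustrative symmetric matrix
D = leading_minors(A)
if all(d > 0 for d in D):
    print("positive definite", D)           # here D = [2.0, 3.0]
elif all((-1)**(i + 1) * d > 0 for i, d in enumerate(D)):
    print("negative definite", D)            # signs alternate: D1 < 0, D2 > 0, ...
else:
    print("indefinite or semidefinite", D)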
Solved Problems
Solution: Consider f(x, y) = x⁴ + y⁴, g(x, y) = −(x⁴ + y⁴) and h(x, y) = x⁴ − y⁴.
[Figure: z = x⁴ + y⁴, z = −(x⁴ + y⁴) and z = x⁴ − y⁴.]
We have ∇f = ⟨4x³, 4y³⟩, ∇g = −∇f and ∇h = ⟨4x³, −4y³⟩, so (0, 0) is the only stationary point for f, g and h. But we cannot conclude anything about its nature by using the second derivatives test, since the Hessian matrix of each function at (0, 0) is the zero matrix:
Hf = [12x² 0; 0 12y²], Hg = −Hf, Hh = [12x² 0; 0 −12y²],
Hf(0, 0) = Hg(0, 0) = Hh(0, 0) = [0 0; 0 0].
Nevertheless, (0, 0) is a global minimum for f, since f(x, y) = x⁴ + y⁴ ≥ 0 = f(0, 0), and similarly a global maximum for g. For h,
h(x, 0) = x⁴ ≥ 0 = h(0, 0) ∀x ∈ R,
h(0, y) = −y⁴ ≤ 0 = h(0, 0) ∀y ∈ R.
Thus, in any disk centered at (0, 0), h takes values greater and lower than h(0, 0): (0, 0) is a saddle point of h.
Df = {(x, y) ∈ R² : 1 + x²y > 0} = {(0, y) : y ∈ R} ∪ {(x, y) ∈ R∗ × R : y > −1/x²}.
The domain of f is the region located above the curve y = −1/x², including the y-axis; see Figure 2.21.
FIGURE 2.21: The domain of f(x, y) = ln(1 + x²y).
∇f(x, y) = ⟨2xy/(1 + x²y), x²/(1 + x²y)⟩ = ⟨0, 0⟩
⟺ xy = 0 and x² = 0 ⟺ x = 0, y ∈ R.
We deduce that the points located on the y-axis are the critical points of f.
FIGURE 2.22: Graph and level curves of f(x, y) = ln(1 + x²y).
The leading minor D2(0, y) = det(Hf(0, y)) = 0, so the second derivatives test fails at these points. The behavior of the function is illustrated in Figure 2.22.
• The points (0, y0) with y0 > 0 are local minimum points for f. Indeed, since the logarithm function is increasing and 1 + x²y ≥ 1 for y > 0,
f(x, y) = ln(1 + x²y) ≥ ln 1 = 0 = f(0, y0) ∀x ∈ R, ∀y ∈ (y0 − y0/2, y0 + y0/2) = (y0/2, 3y0/2).
• The points (0, y0) with y0 < 0 are local maximum points for f. Indeed, since ln is an increasing function and 1 + x²y ≤ 1 for y < 0, f(x, y) ≤ ln 1 = 0 = f(0, y0) for y close to y0.
Solution: Consider f(x, y) = xy³ + x²y − xy.
[FIGURE 2.23: Graph and level curves of z = xy³ + x²y − xy.]
∇f(x, y) = ⟨y³ + 2xy − y, 3xy² + x² − x⟩ = ⟨y(y² + 2x − 1), x(3y² + x − 1)⟩ = ⟨0, 0⟩
⟺ [y = 0 or y² + 2x − 1 = 0] and [x = 0 or 3y² + x − 1 = 0]
⟺ [y = 0 and x = 0] or [y = 0 and 3y² + x − 1 = 0] or [y² + 2x − 1 = 0 and x = 0] or [y² + 2x − 1 = 0 and 3y² + x − 1 = 0]
⟺ [y = 0 and x = 0] or [y = 0 and x = 1] or [y² = 1 and x = 0] or [y² = 1/5 and x = 2/5].
We deduce that (0, 0), (1, 0), (0, 1), (0, −1), (2/5, 1/√5) and (2/5, −1/√5) are the critical points of f. Reading the level curves in Figure 2.23, one can locate four saddle points and two local extrema:
(2/5, 1/√5): D1 = 2/√5 > 0, D2 = 4/5 > 0 — local minimum point;
(2/5, −1/√5): D1 = −2/√5 < 0, D2 = 4/5 > 0 — local maximum point;
where
fxx(x, y) = 2y, fyy(x, y) = 6xy, fxy(x, y) = 2x + 3y² − 1,
Hf(x, y) = [2y  2x + 3y² − 1; 2x + 3y² − 1  6xy], D1(x, y) = fxx = 2y,
D2 = fxx fyy − fxy² = 12xy² − [2x + 3y² − 1]².
Finally, note that f takes arbitrarily large positive and negative values, since f(1, y) = y³ → ±∞ as y → ±∞.
Solution: Let (x, y) be the position of the power substation. Then we have to minimize
f(x, y) = d²((x, y), (0, 0)) + d²((x, y), (1, 1)) + d²((x, y), (0, 2))
= [(x − 0)² + (y − 0)²] + [(x − 1)² + (y − 1)²] + [(x − 0)² + (y − 2)²]
= (3x² − 2x + 1) + (3y² − 6y + 5).
Setting ∇f = ⟨6x − 2, 6y − 6⟩ = ⟨0, 0⟩ gives (x, y) = (1/3, 1). Thus we have one critical point, and by applying the second derivatives test we obtain:
Hf(x, y) = [6 0; 0 6], D1(1/3, 1) = 6 > 0, D2(1/3, 1) = 36 > 0.
So (1/3, 1) is a local minimum; see Figure 2.24 for the position of the point and the three houses.
FIGURE 2.24: The three houses and the optimal substation position (1/3, 1).
To show that it is the point that minimizes f globally, we proceed by comparing the values of f and completing squares:
f(x, y) − f(1/3, 1) = 3x² − 2x + 1 + 3y² − 6y + 5 − (2/3 + 2) = 3(x − 1/3)² + 3(y − 1)² ≥ 0 ∀(x, y) ∈ R².
6. – Based on the level curves that are visible in Figures 2.25 and 2.26, identify the approximate positions of the local maxima, local minima and saddle points.
FIGURE 2.25: Level curves of f(x, y) = −xye^{−(x²+y²)/2} on [−2, 2] × [−2, 2].
[FIGURE 2.26: Level curves of the second function on a larger window.]
with(Student[MultivariateCalculus]):
LagrangeMultipliers(-x*y*exp(-(x^2+y^2)/2), [], [x, y], output = detailed);
  [x = 0, y = 0, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = 0],
  [x = 1, y = 1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = -exp(-1)],
  [x = 1, y = -1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = exp(-1)],
  [x = -1, y = 1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = exp(-1)],
  [x = -1, y = -1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = -exp(-1)]
SecondDerivativeTest(-x*y*exp(-(x^2+y^2)/2), [x, y] = [0, 0]);
  LocalMin = [], LocalMax = [], Saddle = [[0, 0]]
SecondDerivativeTest(-x*y*exp(-(x^2+y^2)/2), [x, y] = [1, 1]);
  LocalMin = [[1, 1]], LocalMax = [], Saddle = []
...
ii) For the second figure, the exact points found using Maple are:
– 5 saddle points: (3π/2, 3π/2), (π/2, 3π/2), (3π/2, π/2), (7π/2, π/2), (π/2, 7π/2);
– 4 local maxima: (π/2, 5π/2), (5π/2, π/2), (π/2, π/2), (5π/2, 5π/2);
– 2 local minima: (11π/6, 11π/6), (7π/2, 7π/2).
Each function is not differentiable at the origin and represents the Euclidean distance in R and R², respectively. We use the triangle inequality to verify that they are convex.
• One can form new convex/concave functions using algebraic operations. For example [25], if f, g are convex (resp. concave) functions defined on a convex set S ⊂ Rⁿ and s, t ≥ 0, then sf + tg is convex (resp. concave).
f is convex in S ⟺ f(x) − f(a) ≥ ∇f(a)·(x − a) ∀x, a ∈ S,
f is strictly convex in S ⟺ f(x) − f(a) > ∇f(a)·(x − a), x ≠ a,
f is concave in S ⟺ f(x) − f(a) ≤ ∇f(a)·(x − a),
f is strictly concave in S ⟺ f(x) − f(a) < ∇f(a)·(x − a), x ≠ a.
Proof. We prove the first assertion; the other assertions can be established similarly.
f(b) − f(a) ≥ lim_{t→0⁺} (g(t) − g(0))/(t − 0) = g′(0),
where g(t) = f(a + t(b − a)). Indeed,
g′(t) = fx1(a + t(b − a))(b1 − a1) + … + fxn(a + t(b − a))(bn − an) = ∇f(a + t(b − a))·(b − a).
Conversely, for t ∈ (0, 1),
f(a) − f(ta + (1 − t)b) ≥ ∇f(ta + (1 − t)b)·(a − [ta + (1 − t)b]) = (1 − t) ∇f(ta + (1 − t)b)·(a − b), (∗)
f(b) − f(ta + (1 − t)b) ≥ ∇f(ta + (1 − t)b)·(b − [ta + (1 − t)b]) = −t ∇f(ta + (1 − t)b)·(a − b). (∗∗)
Multiply the inequality (∗) by t > 0 and the inequality (∗∗) by (1 − t) > 0, then add the resulting inequalities. This gives
t f(a) + (1 − t) f(b) − f(ta + (1 − t)b) ≥ 0.
Therefore f is convex.
Solution: We have
f(x, y) − f(s, t) − ∇f(s, t)·(x − s, y − t) = x² + y² − (s² + t²) − (2s, 2t)·(x − s, y − t)
= x² + y² − s² − t² − 2s(x − s) − 2t(y − t) = (x − s)² + (y − t)² ≥ 0.
Thus f is convex on R². Note that by taking (s, t) = (0, 0), the critical point of f, we deduce that f(x, y) − f(0, 0) ≥ 0 ∀(x, y) ∈ R². Hence, (0, 0) is a global minimum of f.
As we can expect from the above example, it will not always be easy to check the convexity or concavity of a function by solving inequalities. Next, we show a more practical characterization, which requires more regularity of the function.
g′(t) = fx1(a + t(b − a))(b1 − a1) + … + fxn(a + t(b − a))(bn − an).
For each i = 1, …, n, we have fxi(a + t(b − a)) = fxi(x1(t), x2(t), …, xn(t)), whence
(d/dt) fxi(a + t(b − a)) = Σ_{j=1}^n fxixj(a + t(b − a))(bj − aj).
Hence
g″(t) = Σ_{i=1}^n Σ_{j=1}^n [fxixj(a + t(b − a))](bi − ai)(bj − aj).
Now, by assumption, we have Dk(z) > 0 for all z ∈ S and all k = 1, …, n; then the quadratic form
Q(h) = Σ_{i=1}^n Σ_{j=1}^n fxixj(a + t(b − a)) hi hj,
with associated symmetric matrix (fxixj(a + t(b − a)))_{n×n}, is positive definite. As a consequence, g″(t) > 0 and g is strictly convex. In particular
f is convex in S ⟺ Δk(x) ≥ 0 ∀x ∈ S, ∀k = 1, …, n.
Proof. We prove only the first assertion; the second one is established by replacing f by −f.
⟹) Suppose f is convex in S. It suffices to show that the quadratic form Q(h) satisfies
Q(h) = Σ_{i=1}^n Σ_{j=1}^n fxixj(a) hi hj ≥ 0 ∀a ∈ S.
So, let a ∈ S. Since S is an open set, there exists ε > 0 such that Bε(a) ⊂ S. In particular, for h ∈ Rⁿ, h ≠ 0, we have
a + th ∈ Bε(a) ⟺ ‖a + th − a‖ = |t| ‖h‖ < ε ⟺ |t| < ε/‖h‖ = α.
So, for t ∈ (−α, α), the function u(t) = f(a + th) is well defined. We claim that u is convex. Indeed, for λ ∈ [0, 1] and t, s ∈ (−α, α),
u(λt + (1 − λ)s) = f(a + (λt + (1 − λ)s)h) = f(λ(a + th) + (1 − λ)(a + sh)) ≤ λu(t) + (1 − λ)u(s).
Since u is then a convex C² function of one variable, u″(t) = ᵗh Hf(a + th) h ≥ 0 on (−α, α), and for t = 0 we obtain the semi-definite positivity of the quadratic form Q.
Solution: We have
∇f(x, y) = ⟨4x³, 4y³⟩.
The Hessian matrix of f is
Hf(x, y) = [fxx fxy; fyx fyy] = [12x² 0; 0 12y²].
We have Δ1¹¹(x, y) = 12y² ≥ 0, Δ1²²(x, y) = 12x² ≥ 0, and Δ2(x, y) = 144x²y² ≥ 0. Thus, f is convex on R².
x∗ is a global maximum (resp. minimum) point ⟺ ∇f(x∗)·(x − x∗) ≤ 0 ∀x ∈ S (resp. ≥ 0).
Moreover, if x∗ ∈ S°, then the condition reduces to ∇f(x∗) = 0.
Solution: Since the total revenue for selling x units is R(x) = x(160 − 0.01x), the profit P(x) on x units will be P(x) = R(x) − C(x). From
dP/dx = −0.02x + 120,
we have
dP/dx = 0 ⟺ x = 6000.
The only stationary point is 6000, and it cannot be the maximum point since it is not in S. Let us then explore the concavity of P. We have
P(x) − P(a) − (dP/dx)(a)(x − a) = −0.01(x − a)² ≤ 0 ∀x, a ∈ S, x ≠ a.
Thus, P is strictly concave on S. Therefore, the maximum point x∗ (which exists by the extreme value theorem) must satisfy
(dP/dx)(x∗)(x − x∗) ≤ 0 ∀x ∈ S ⟺ (−0.02x∗ + 120)(x − x∗) ≤ 0 ∀x ∈ S.
Since −0.02x∗ + 120 < 0 on S, we must have x − x∗ ≥ 0 for all x ∈ S; that is, x∗ is the smallest point of S.
Theorem 2.3.5 Let S be a convex set of Rn and x∗ ∈ S. Let f : S −→ R
be a C 2 concave (resp. convex) function on S, then
Example 4. Find the global maxima and minima points if any of f defined
by
f (x, y, z, t) = 24x + 32y + 48z + 72t − (x2 + y 2 + 2z 2 + 3t2 ).
Therefore, f is strictly concave on R4 and the point (12, 16, 12, 12) is the only
global maximum point.
Solved Problems
Solution: Let (x, y) be the position of the power substation. Then we have to look for (x, y) as the point that minimizes the function
f(x, y) = d²((x, y), (x1, y1)) + d²((x, y), (x2, y2)) + … + d²((x, y), (xm, ym)).
Setting ∇f = 0 gives 2Σᵢ(x − xi) = 0 and 2Σᵢ(y − yi) = 0, so the minimizer is the centroid
(x, y) = ((x1 + … + xm)/m, (y1 + … + ym)/m),
and the strict convexity of f (its Hessian is 2m·I) shows this point is the global minimum.
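The centroid answer is consistent with the point (1/3, 1) found for the three houses (0, 0), (1, 1), (0, 2) in the previous section; a minimal numeric sketch confirming it (an illustration, not part of the original text):

# A sketch: the sum of squared distances is minimized at the centroid of the points.
import numpy as np
from scipy.optimize import minimize

pts = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]])  # the three houses

def f(p):
    return np.sum((pts - p)**2)      # sum of squared distances to p

res = minimize(f, x0=np.zeros(2))
print(res.x, pts.mean(axis=0))       # both ~ (1/3, 1)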
Consider f(x, y) = y⁴ − 4xy + x².
ii) Find all the stationary points of f and classify them by means of the second derivatives test.
∇f = ⟨−4y + 2x, 4y³ − 4x⟩ = ⟨0, 0⟩ ⟺ x = 2y and y³ = x = 2y
⟺ [x = 2y and y = 0] or [x = 2y and y = √2] or [x = 2y and y = −√2].
We deduce that (0, 0), (2√2, √2) and (−2√2, −√2) are the critical points.
iii) and iv) The first graph in Figure 2.28 shows the shape of a saddle. On the second graph, there are two families of circular curves and a hyperbola, which confirm the classification of the critical points.
FIGURE 2.28: Graph and level curves of z = y⁴ − 4xy + x².
Δ1¹¹(x, y) = 12y² ≥ 0, Δ1²²(x, y) = 2 ≥ 0, Δ2(x, y) = 24y² − 16.
Since f(0, y) = y⁴ → +∞ as y → ±∞, f cannot attain a maximum value in R².
3. – Let f(x, y) = x².
i) Show that f has infinitely many critical points and that the second derivatives test fails for these points.
Solution:
FIGURE 2.29: The cylinder z = x².
∇f = ⟨2x, 0⟩ = ⟨0, 0⟩ ⟺ x = 0, so every point (0, y), y ∈ R, is a critical point. Moreover,
Δ1¹¹(x, y) = 0, Δ1²²(x, y) = 2, Δ2(x, y) = det [2 0; 0 0] = 0,
so the second derivatives test fails at these points. Each point (0, y) is nevertheless a global minimum, since f(x, y) = x² ≥ 0 = f(0, y). Since
f(x, 0) = x² → +∞ as x → +∞,
f cannot attain a maximum value M in R². Indeed, if it did, we would have f(x, y) ≤ M ∀(x, y) ∈ R², hence x² ≤ M ∀x ∈ R, which is impossible (take x = √(M + 1), for example).
Solution: Consider f(x, y) = −x² + 4xy − 6x − y².
FIGURE 2.30: Graph of z = −x² + 4xy − 6x − y².
We have
∇f(x, y) = ⟨4y − 2x − 6, 4x − 2y⟩.
Since f is C^∞, its extreme points (if any) are stationary points. Remark that
f(0, y) = −y² → −∞ as y → ±∞:
f takes arbitrarily large negative values and doesn't attain its minimal value.
On the other hand, solving ∇f = 0 gives y = 2x and 4(2x) − 2x − 6 = 6x − 6 = 0, i.e., the single stationary point (1, 2); moreover,
f(x, 2x) = 3x² − 6x → +∞ as x → ±∞,
so f takes arbitrarily large positive values and doesn't attain its maximal value either.
Solution: The shape of the surface in Figure 2.31 shows that the function f(x, y) = x⁴ − 2x² + y² − 6y is neither convex nor concave.
FIGURE 2.31: z = x⁴ − 2x² + y² − 6y.
∇f = ⟨4x³ − 4x, 2y − 6⟩ = ⟨0, 0⟩
⟺ [x = 0 or x + 1 = 0 or x − 1 = 0] and y = 3
⟺ [x = 0 and y = 3] or [x = −1 and y = 3] or [x = 1 and y = 3].
We deduce that (−1, 3), (0, 3) and (1, 3) are the critical points of f .
(−1, 3): Hf = [8 0; 0 2], D1 = 8 > 0, D2 = 16 > 0 — local minimum;
(0, 3): Hf = [−4 0; 0 2], D1 = −4 < 0, D2 = −8 < 0 — saddle point;
(1, 3): Hf = [8 0; 0 2], D1 = 8 > 0, D2 = 16 > 0 — local minimum.
The second derivatives test gives the characterization of the points in Table 2.8. Thus
min_{(x,y)∈R²} f(x, y) = −10 = f(1, 3) = f(−1, 3).
So
Δk ≥ 0, k = 1, 2 ⟺ |x| ≥ 1/√3.
Hence, Hf is positive semi-definite on each of the open convex sets S1 = {(x, y) : x < −1/√3} and S2 = {(x, y) : x > 1/√3}; hence f is convex on S1 and on S2.
[Figure 2.32: the convexity regions S1 and S2.]
vii) Since f is convex on S1 = [x < −1/√3] and the critical point (−1, 3) lies in S1 with f(−1, 3) = −10, we get min_{S1} f = f(−1, 3) = −10; similarly, min_{S2} f = f(1, 3) = −10.
viii) We have
for ϕ(x) = x⁴ − 2x²: x: −1 → 0 → 1; ϕ′(x): + 0 −; ϕ(x): −1 ↗ 0 ↘ −1.
f(x, y) ≥ −5/9 + y² − 6y = (y − 3)² − 9 − 5/9 ≥ −9 − 5/9 = f(±1/√3, 3) ∀(x, y) ∈ R² \ S.
Hence,
min_{R²\S} f(x, y) = f(−1/√3, 3) = f(1/√3, 3) = −9 − 5/9 = −86/9.
The proof of the extreme value theorem uses the fact that the image of a closed bounded set S of Rⁿ under a real-valued continuous function f : S → R is a closed bounded set of R [18]. Thus f(S) is a closed bounded interval [a, b]. Therefore,
∃xm, xM ∈ S such that f(xm) = a, f(xM) = b.
Since f(S) = [f(xm), f(xM)], we have
f(xm) ≤ f(x) ≤ f(xM) ∀x ∈ S.
Therefore,
f(xm) = min_S f(x) and f(xM) = max_S f(x).
– find the boundary points where f takes its absolute values on the
boundary
• The values max_{x∈[−1,1]} f (x) and min_{x∈[−1,1]} f (x) exist by the extreme value theorem
because f is continuous on the closed bounded interval [−1, 1]. Now, since
there are no critical points in the interior (−1, 1), these values
must be in {f (−1), f (1)}. Comparing these two values, we conclude that
max_{x∈[−1,1]} f (x) = f (−1) = 25/6   and   min_{x∈[−1,1]} f (x) = f (1) = 5/6.
• The values max_{x∈[−2,2]} f (x) and min_{x∈[−2,2]} f (x) exist by the extreme value theorem
because f is continuous on the closed bounded interval [−2, 2]. The critical
point −1 is in the interior (−2, 2), so the extreme values must be
in {f (−2), f (−1), f (2)} = {7/3, 25/6, −1/3}. Comparing these three values, we
conclude that
max_{x∈[−2,2]} f (x) = f (−1) = 25/6   and   min_{x∈[−2,2]} f (x) = f (2) = −1/3.
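These values can be confirmed directly in Mathematica (a sketch; it assumes, as the figure title below indicates, that f(x) = x³/3 − x²/2 − 2x + 3):

f[x_] := x^3/3 - x^2/2 - 2 x + 3;
Maximize[{f[x], -2 <= x <= 2}, x]   (* {25/6, {x -> -1}} *)
Minimize[{f[x], -2 <= x <= 2}, x]   (* {-1/3, {x -> 2}} *)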
[Figure: graph of y = x³/3 − x²/2 − 2x + 3.]
f (x, y) = 4xy − x2 − y 2 − 6x
The point (1, 2) is the only critical point of f and f (1, 2) = −3.
[Figure: the region bounded by the sides L1, L2, L3 and the graph of z = 4xy − x² − y² − 6x.]
x       0         2
g′(x)        −
g(x)    0       −16

y       0     4     6
h′(y)      +     −
h(y)   −16    0    −4

x       0    3/2    2
l′(x)      −     +
l(x)    0   −9/2   −4
Theorem 2.4.2
Let f (x) be a continuous function on an unbounded set S of Rn such that
Indeed, we have
f (x) ≥ max( f (x0), min_{z∈S0} f (z) ) ≥ min_{z∈S0} f (z)   ∀x ∈ S.
Note that the minimum min_{z∈S0} f (z) is attained by the extreme value theorem
since S0 is a closed bounded set of Rⁿ. Therefore, f attains its minimum over S.
[Figure: graph of y = 3x⁴ + 4x³ − 12x² + 2.]
f ′(x) = 0 ⇐⇒ x = 0, x = −2 or x = 1.
We deduce that
min_{x∈R} f (x) = min{ f (0), f (−2), f (1) } = min{ 2, −30, −3 } = −30 = f (−2).
If n is odd, the limits of f (x) as x −→ ±∞ have opposite signs (one is +∞ and the other is −∞), so f has no absolute
extreme points.
If n is even, then the limits above have the same sign. When they are both
equal to +∞, f has an absolute minimum but no absolute maximum. When
the limits are both equal to −∞, f has an absolute maximum but no absolute
minimum.
Solved Problems
1. – Define the function f (x, y) = (1/4)x² − (1/9)y² on the closed unit disk. Find
i) the critical points
Solution:
[Figure: the surface z = x²/4 − y²/9 over the closed unit disk.]
i) Since f is differentiable, the critical points are solutions of ∇f (x, y) = ⟨0, 0⟩.
That is
∇f (x, y) = ⟨x/2, −2y/9⟩ = ⟨0, 0⟩  ⇐⇒  (x, y) = (0, 0).
So (0, 0), the center of the unit disk, is the unique critical point of f.
Then D2(0, 0) = [fxx fyy − fxy²](0, 0) = −1/9 < 0 and (0, 0) is a saddle
point; see Figure 2.36.
θ        0     π/2     π     3π/2     2π
sin θ       +      +      −      −
cos θ       +      −      −      +
g′(θ)       −      +      −      +
g(θ)    1/4   −1/9   1/4   −1/9    1/4
∗ Conclusion:
We list, in Table 2.15, the values of f at the critical point and at the boundary
points where f attains its absolute values on that boundary.
Solution:
[Figure: the rectangular region R with sides L1, L2, L3, L4.]
⇐⇒  [ x = 2 or cos y = 0 ]  and  [ x = 0 or x = 4 or sin y = 0 ]  ⇐⇒  (x, y) = (2, 0).
The point (2, 0) is the only critical point of f , as shown in Figure 2.38, and
f (2, 0) = 4.
[Figure 2.38: the surface z = (4x − x²) cos y.]
– On L1, we have: f (x, −π/4) = (√2/2)(4x − x²) = g(x),  g′(x) = √2 (2 − x).

x        1      2      3
g′(x)       +      −
g(x)   3√2/2   2√2   3√2/2

TABLE 2.16: Variations of g(x) = (√2/2)(4x − x²)
max_{L1} f = f (2, −π/4) = 2√2,   min_{L1} f = f (1, −π/4) = f (3, −π/4) = 3√2/2.
– On L2, we have: f (3, y) = 3 cos y = h(y).

y        −π/4     0     π/4
h′(y)         +      −
h(y)    3√2/2     3    3√2/2

max_{L2} f = f (3, 0) = 3,   min_{L2} f = f (3, −π/4) = f (3, π/4) = 3√2/2.
– On L3, we have: f (x, π/4) = (√2/2)(4x − x²) = l(x),  l′(x) = √2 (2 − x).

x        1      2      3
l′(x)       +      −
l(x)   3√2/2   2√2   3√2/2

TABLE 2.18: Variations of l(x) = (√2/2)(4x − x²)
max_{L3} f = f (2, π/4) = 2√2,   min_{L3} f = f (1, π/4) = f (3, π/4) = 3√2/2.
– On L4, we have: f (1, y) = 3 cos y = m(y).

y        −π/4     0     π/4
m′(y)         +      −
m(y)    3√2/2     3    3√2/2

max_{L4} f = f (1, 0) = 3,   min_{L4} f = f (1, −π/4) = f (1, π/4) = 3√2/2.
3. – Find the points on the surface z² = xy + 4 that are closest to the origin.
Solution:
The distance of a point (x, y, z) to the origin is given by d = √(x² + y² + z²).
The problem is equivalent to minimizing d² = x² + y² + z² on
the set z² = xy + 4, or equivalently to look for
Note that the function f is continuous on the unbounded set R² and satisfies
f (x, y) ≥ x² + y² − (1/2)(x² + y²) + 4 = (1/2)(x² + y²) + 4 = (1/2)‖(x, y)‖² + 4
since
|xy| ≤ (1/2)(x² + y²).
Thus
lim_{‖(x,y)‖→+∞} f (x, y) = +∞.
Note that a global minimum of the problem is also a local minimum, i.e., a
solution of
∇f = ⟨2x + y, 2y + x⟩ = ⟨0, 0⟩  ⇐⇒  (x, y) = (0, 0)   since det [ 2 1 ; 1 2 ] = 3 ≠ 0.
2x + 3y ≤ 19,   −3x + 2y ≤ 4,
x + y ≤ 8,   0 ≤ x ≤ 6,   y ≥ 0.
Solution:
[Figure 2.39: the region S bounded by the sides L1, . . . , L6.]
i) Set
The set S is the region of the xy plane located in the first quadrant and
bounded by the lines 2x + 3y = 19, −3x + 2y = 4, x + y = 8; see Figure 2.39.
It is a closed, bounded, convex subset of R². Since f is continuous (because it is a
polynomial), it attains its extreme values on S.
Unconstrained Optimization 131
L3 = {(x, 8 − x), 5 ≤ x ≤ 6},   L4 = {(x, (19 − 2x)/2), 2 ≤ x ≤ 5},
L5 = {(x, (4 + 3x)/2), 0 ≤ x ≤ 2},   L6 = {(0, y), 0 ≤ y ≤ 2}.
On L1 , we have: f (x, 0) = x,
∗ Conclusion:
We list, in Table 2.21 below, the values of f at the boundary points where f
attains its extreme values on each side of the set S. We conclude that the abso-
lute maximum value of f is f (2, 5) = 22 and the absolute minimum value is
f (0, 0) = 0.
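Since f is linear and S is a polytope, the conclusion can also be verified with a one-line linear program (a sketch; f(x, y) = x + 4y, as the level lines in the figure suggest):

Maximize[{x + 4 y, 2 x + 3 y <= 19, -3 x + 2 y <= 4, x + y <= 8, 0 <= x <= 6, y >= 0}, {x, y}]
(* {22, {x -> 2, y -> 5}} *)
Minimize[{x + 4 y, 2 x + 3 y <= 19, -3 x + 2 y <= 4, x + y <= 8, 0 <= x <= 6, y >= 0}, {x, y}]
(* {0, {x -> 0, y -> 0}} *)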
[Figure: the region S with the level lines x + 4y = 22 and x + 4y = 0.]
Remark 2.4.2 Note that the points that appear in the above table are the
vertices of the hexagon S. The extreme points are attained at two of these
vertices. This is true in the more general problem
with
S = {x = (x1, . . . , xn) ∈ R₊ⁿ : Ax ≤ b}
where
d/dt f (x(t)) |_{t=0} = ∇f (x(t)) · x′(t) |_{t=0} = 0   =⇒   ∇f (x∗) · x′(0) = 0
x′(0) is a tangent vector to the curve x(t) at the point x(0) = x∗. This equality
must not depend on a particular curve x(t). So, we must have
∇f (x∗) · x′(0) = 0   for any curve x(t) such that g(x(t)) = c.
In this chapter, we will first characterize, in Section 3.1, the set of tangent vectors to
such curves, then establish, in Section 3.2, the equations satisfied by a local extreme
point x∗. In Section 3.3, we identify the candidate points for optimality, and in
Section 3.4, we explore the global optimality of a constrained local candidate point.
Finally, we establish, in Section 3.5, the dependence of the optimal value with
respect to certain parameters.
Let
x∗ ∈ S = [g(x) = c].
v1, . . . , vm ∈ Rⁿ are LI  ⇐⇒  [ α1 v1 + . . . + αm vm = 0 =⇒ α1 = . . . = αm = 0 ].
M = {y ∈ Rⁿ : g′(x∗) y = 0}.
Proof. We have
T ⊂ M : Indeed, let y ∈ T, then
∀(t, u) ∈ (−δ0, δ0) × B_{δ0}(0)   with   δ0 = min( δ/(2‖y‖), δ/(2‖g′(x∗)‖) ),
‖ty + ᵗg′(x∗)u‖ ≤ |t|‖y‖ + ‖g′(x∗)‖‖u‖ < (δ/(2‖y‖))‖y‖ + (δ/(2‖g′(x∗)‖))‖g′(x∗)‖ = δ/2 + δ/2 = δ.
We have
Xj(t, u) = x∗j + t yj + Σ_{l=1}^m (∂gl/∂xj)(x∗) ul,   hence   ∂Xj/∂ui = (∂gi/∂xj)(x∗),
∂Fk/∂ui (t, u) = Σ_{j=1}^n (∂gk/∂Xj)(X(t, u)) (∂gi/∂xj)(x∗),
that is,
( ∂Fk/∂ui (t, u) )_{k,i=1,··· ,m} = g′(X(t, u)) ᵗg′(x∗).
By hypotheses, we have
– F (0, 0) = g(x∗) − c = 0
– det(∇u F (0, 0)) = ∂(F1, · · · , Fm)/∂(u1, · · · , um) = det( g′(x∗) ᵗg′(x∗) ) ≠ 0   as
rank g′(x∗) = m.
The curve
x(t) = X(t, u(t)) = x∗ + ty + ᵗg′(x∗) u(t)
is thus, by construction, a curve on S. By differentiating both sides of
0 = d/dt g(x(t)) = Σ_{j=1}^n (∂g/∂Xj)(∂Xj/∂t)
and, since Xj(t, u) = x∗j + t yj + Σ_{l=1}^m (∂gl/∂xj)(x∗) ul,
∂Xj/∂t = yj + Σ_{l=1}^m (∂gl/∂xj)(x∗)(∂ul/∂t).
Hence
0 = d/dt g(x(t)) |_{t=0} = Σ_{j=1}^n (∂g/∂xj)(X(t, u)) [ yj + Σ_{l=1}^m (∂gl/∂xj)(x∗)(∂ul/∂t) ] |_{t=0}
= g′(x∗)y + g′(x∗) ᵗg′(x∗) u′(0).
F (x, y) = 0.
If
∃(x0, y0) ∈ Å,   F (x0, y0) = 0   and   det Fy(x0, y0) ≠ 0,
then
ϕ′(x) = −[ Fy(x, y) ]⁻¹ Fx(x, y)
where
Fy(x, y) = ∇y F (x, y) = ( ∂Fi/∂yj )_{i,j=1,...,m}   is the gradient of F with respect to y
and
det(Fy(x, y)) = ∂(F1, . . . , Fm)/∂(y1, . . . , ym)   is the Jacobian of F with respect to y.
[Figures 3.4 and 3.5: the curves y = (x − 1)² − 1 and y = 1 − x² with their horizontal tangent lines y = −1 and y = 1.]
*** The graph of the tangent plane is the graph of the linear approximation
L(x) = f (x∗) + ∇f (x∗) · (x − x∗).
The following examples, in Table 3.1, show that the tangent line is horizontal
at local extreme points and separates the graph into two parts at an inflection
point; see Figure 3.4 and Figure 3.5.
f (x)              behavior                       f ′(x)                       tangent line
(x − 1)² − 1    ≥ −1 : global minimum      2(x − 1) = 0 at x = 1     y = −1
1 − x²           ≤ 1 : global maximum       −2x = 0 at x = 0          y = 1
(x − 1)³ + 1    inflection point             3(x − 1)² = 0 at x = 1    y = 1
ln x              point x = e                  (1/x)|_{x=e} = 1/e          y − 1 = (1/e)(x − e)
The examples, given in Table 3.2 and graphed in Figures 3.6 and 3.7, show
that the tangent plane is horizontal at local extreme points and separates the
graph into two parts at a saddle point.
[Figures 3.6 and 3.7: the four surfaces of Table 3.2 with their tangent planes, namely:]
a) z = −1, b) z = 4, c) z = 0, d) z = 2x + 2y − 3
Example 3. Find the tangent plane at the point (0, 1, 0) to the set g =
(g1, g2) = ⟨1, 1⟩ with
Solution: The surface g(x, y, z) = ⟨1, 1⟩ is the intersection of the two surfaces
g1(x, y, z) = 1 and g2(x, y, z) = 1. So, it is a curve in the space R³. We have
g′(x, y, z) = [ ∂g1/∂x  ∂g1/∂y  ∂g1/∂z ;  ∂g2/∂x  ∂g2/∂y  ∂g2/∂z ] = [ 1 1 1 ;  2x 2y 2z ].
z = f (x, y)                         ∇f (x0, y0)
a) (x − 1)² + (y + 1)² − 1    ⟨2(x − 1), 2(y + 1)⟩|_{(1,−1)} = ⟨0, 0⟩
b) 4 − x² − y²                   ⟨−2x, −2y⟩|_{(0,0)} = ⟨0, 0⟩
c) y² − x²                        ⟨−2x, 2y⟩|_{(0,0)} = ⟨0, 0⟩
d) (x − 1)² + (y + 1)² − 1    ⟨2(x − 1), 2(y + 1)⟩|_{(2,0)} = ⟨2, 2⟩
g′(0, 1, 0) = [ 1 1 1 ;  0 2 0 ]   has rank 2.
The tangent plane is the set of points (x, y, z) such that
g′(0, 1, 0) · ⟨x − 0, y − 1, z − 0⟩ = [ 1 1 1 ;  0 2 0 ] · ᵗ( x, y − 1, z ) = ᵗ( 0, 0 )
⇐⇒  x + y − 1 + z = 0   and   2(y − 1) = 0.
A parametrization of the tangent plane to the two surfaces at (0, 1, 0) is the
line (see Figure 3.8)
x = t,   y = 1,   z = −t,   t ∈ R.
[Figure 3.8: the two surfaces and the tangent line at (0, 1, 0).]
Remark 3.1.3 Note that the representation of the tangent plane obtained
in the theorem used the fact that the point was regular. When this
hypothesis is omitted, the representation is not necessarily true.
Indeed, if S is the set defined by
x(t) = 0,   y(t) = y0 + t
passes through the point (0, y0) at t = 0 with direction ⟨x′(0), y′(0)⟩ = ⟨0, 1⟩
and remains included in S. Hence, the tangent plane is equal to S.
Solved Problems
[Figure: the ellipsoid 2x² + 3y² + 4z² = 9.]
g′(x, y, z) = ( 4x  6y  8z ) ≠ 0 on [g = 9]   =⇒   rank(g′(x, y, z)) = 1.
The tangent plane to the surface g(x, y, z) = 9 at a point (x0 , y0 , z0 ) is the set
of points (x, y, z) such that
g′(x0, y0, z0) · ⟨x − x0, y − y0, z − z0⟩ = ( 4x0  6y0  8z0 ) · ᵗ( x − x0, y − y0, z − z0 ) = 0
=⇒  2(t/4)² + 3(−t/3)² + 4(3t/8)² = 9   =⇒   t = ±(12/7)√3.
The needed points on the surface are
( (3/7)√3, −(4/7)√3, (9/14)√3 ),   ( −(3/7)√3, (4/7)√3, −(9/14)√3 ).
The equations of the tangent planes to the surface (see Figure 3.10) at these
points are
( x − (3/7)√3 ) − 2( y + (4/7)√3 ) + 3( z − (9/14)√3 ) = 0,
( x + (3/7)√3 ) − 2( y − (4/7)√3 ) + 3( z + (9/14)√3 ) = 0.
Solution: Set
g1(x, y, z) = z − √(x² + y²),   g2(x, y, z) = z − (1/10)(x² + y²) − 5/2.
Since g1 (3, 4, 5) = 0 and g2 (3, 4, 5) = 0, then the point (3, 4, 5) is a common
point to the surfaces g1 (x, y, z) = 0 and g2 (x, y, z) = 0. We have
g1′(x, y, z) = −( x/√(x² + y²) ) i − ( y/√(x² + y²) ) j + k
g2′(x, y, z) = −(x/5) i − (y/5) j + k
g1′(3, 4, 5) = −(3/5) i − (4/5) j + k ≠ 0,   rank(g1′(3, 4, 5)) = 1
g2′(3, 4, 5) = −(3/5) i − (4/5) j + k ≠ 0,   rank(g2′(3, 4, 5)) = 1.
Note that the normal vectors g1 (3, 4, 5) and g2 (3, 4, 5) of the tangent planes
to the surfaces g1 (x, y, z) = 0 and g2 (x, y, z) = 0 respectively are the same.
Hence, the two surfaces have a common tangent plane at this point with the
equation
−(3/5)(x − 3) − (4/5)(y − 4) + (z − 5) = 0.
sin(xz) − 4 cos(yz) = 4
± g′(π, π, 1)/‖g′(π, π, 1)‖ = ± ⟨ −1/√(1 + π²), 0, −π/√(1 + π²) ⟩.
Before setting out the results rigorously, we will give an intuitive approach to
comparing the values of f close to a local maximum value f (x∗) under the
constraints g(x) = c. We will follow the unconstrained case in parallel.
d/dt f (x(t)) |_{t=0} = ∇f (x(t)) · x′(t) |_{t=0} = 0   =⇒   ∇f (x∗) · x′(0) = 0.
x′(0) is a tangent vector to the curve x(t) at the point x(0) = x∗. This equality must not
depend on a particular curve. Thus, it must be satisfied for any y = x′(0) ∈ M, which
is summarized below:
∀y ∈ Rⁿ :   g′(x∗) y = 0  =⇒  ∇f (x∗) y = 0.
b = ∇f (x∗) = ( ∂f/∂x1, . . . , ∂f/∂xn ),   b ∈ Rⁿ.
From the previous lemma, we have
∀y ∈ Rn : Ay = 0 =⇒ b.y = 0.
where Ker N denotes the kernel [10] of the linear transformation induced by
the matrix N. Since we have [10], writing [ A ; b ] for A augmented with the row b,
dim Rⁿ = dim(ker A) + rank(A) = dim(ker [ A ; b ]) + rank([ A ; b ]),
then
rank(A) = rank([ A ; b ])
which means that the vector b is linearly dependent on the row vectors of A,
so there exists a unique vector λ∗ = (λ∗1, . . . , λ∗m) ∈ Rᵐ such that
ᵗb = ᵗA λ∗   ⇐⇒   ∂f/∂xi (x∗) = Σ_{j=1}^m λ∗j ∂gj/∂xi (x∗),   i = 1, . . . , n.
∂f/∂xi (x) − Σ_{j=1}^m λj ∂gj/∂xi (x) = 0,   i = 1, · · · , n
gj(x) − cj = 0,   j = 1, · · · , m.
∃!λ∗ ∈ R : ∇f = λ∗ ∇g =⇒ ∇f // ∇g.
The vectors g′(x∗) and ∇f (x∗) are respectively normal to the level curves
g(x) = c and f (x) = f (x∗). When the extreme point is attained, the
two vectors g′(x∗) and ∇f (x∗) are parallel. Thus the two level curves have
a common tangent plane at x∗. When using a graphing utility, the constrained
extreme points may therefore be located where the level curves are tangent.
Solution: Set
g(x, y) = x2 + y 2 S = {(x, y) : g(x, y) = x2 + y 2 = 1}
[Figure: the surface z = xy over the constraint circle x² + y² = 1.]
Next, the functions f and g are C¹ around each point (x, y) ∈ R² and in
particular each point of S is relatively interior to S and is a regular point
since we have
g′(x, y) = (2x, 2y) ≠ (0, 0) on S.
Thus, we can apply Lagrange multipliers method to look for the interior extreme
points as solutions of the system
Lx = fx(x, y) − λ gx(x, y) = y − 2xλ = 0
Ly = fy(x, y) − λ gy(x, y) = x − 2yλ = 0
Lλ = −(g(x, y) − 1) = −(x² + y² − 1) = 0
⇐⇒  { y − 2xλ = 0,   x(1 − 4λ²) = 0,   x² + y² − 1 = 0 }
⇐⇒  { y − 2xλ = 0,   x = 0 or λ = ±1/2,   x² + y² − 1 = 0 }.
So, the stationary points for the Lagrangian are the four points
(1/√2, 1/√2),  (−1/√2, −1/√2),  (1/√2, −1/√2),  (−1/√2, 1/√2)
at which f takes its maximum and minimum values respectively
f (1/√2, 1/√2) = f (−1/√2, −1/√2) = 1/2,   f (1/√2, −1/√2) = f (−1/√2, 1/√2) = −1/2.
The problem can be solved graphically, as illustrated in Figure 3.12.
FIGURE 3.12: The constraint [x² + y² = 1] and the level curves xy = −1/2, xy = 1/2
are tangent
Remark 3.2.2 Note that Lagrange's method doesn't transform a con-
strained optimization problem into one of finding an unconstrained extreme
point of the Lagrangian.
max xy subject to x + y = 2, x ≥ 0, y ≥ 0.
Using the Lagrange multiplier method, prove that (x, y) = (1, 1) solves
the problem with λ = 1. Prove also that (1, 1, 1) does not maximize the
Lagrangian L.
S = {(x, y) : g(x, y) = 2, x ≥ 0, y ≥ 0}
which is a closed and bounded subset of R².
Next, the functions f and g are C¹ around each point (x, y) ∈ (0, 2) × (0, 2),
which is a regular point since we have g′(x, y) = (1, 1) ≠ (0, 0).
[Figure: the constraint segment S joining (2, 0) and (0, 2).]
It remains to show that the point (1, 1) is the maximum point for the problem;
see Figure 3.14 for a graphical solution using level curves. Indeed, since it is
the only interior point to the segment, it suffices to compare the value of f at
(1, 1) with its value at the end points of the segment. We have
f (1, 1) = 1 > f (2, 0) = 0 = f (0, 2).
[Figure 3.14: level curves of z = xy and the constraint segment.]
min xy subject to x + y = 2, x ≥ 0, y ≥ 0.
Using the Lagrange multiplier method, prove that (x, y) = (1, 1) doesn’t solve
the problem with λ = 1.
Solved Problems
1. –
have no solution.
ii) Show that any point of the constraints’ set is a regular point.
iii) What can you conclude about the minimum and maximum values of
f subject to g = 0? Show this directly.
Solution: i) Set
[Figure 3.15: the graph of f (a plane) and the constraint line x − y = 0.]
system. Indeed, all conditions of the theorem on the necessary conditions for
a constrained candidate point are satisfied.
Therefore, f cannot attain a finite lower or upper bound on the set of the
constraints.
The graph of f is a plane; see Figure 3.15. The level curves y + 1 = k are parallel
lines that intersect the constraint line x − y = 0 at the points (k − 1, k − 1).
This shows that f takes arbitrarily large values (see Figure 3.16).
FIGURE 3.16: ∇f = ⟨1, 1⟩ ∦ ⟨1, −1⟩ = ∇g
i) Show, without using calculus, that the minimum occurs at (0, 2). Is
it a regular point?
ii) Show that the Lagrange condition ∇f = λ∇g is not satisfied for any
value of λ.
iii) Does this contradict the theorem on the necessary conditions for a
constrained candidate point?
[Figure: the constraint set x⁴ − (y − 2)⁵ = 0 and the plane z = y + 1.]
ii) Let
Lx = fx − λgx = 0 − 4λx³ = 0
Ly = fy − λgy = 1 − 5λ(y − 2)⁴ = 0
Lλ = −(g − 0) = −( x⁴ − (y − 2)⁵ ) = 0.
Note that λ = 0 is not possible by the second equation. So, we deduce that
x = 0, from the first equation, and then y = 2 from the third equation. But,
this leads to a contradiction by the second equation. So the system has no
solution. No level curve is tangent to the constraint set in Figure 3.18.
[Figure 3.18: level curves of f and the constraint set; no level curve is tangent to it.]
iii) This does not contradict the theorem on the necessary conditions for a
constrained candidate point, since the theorem applies only if all assumptions are
satisfied, which is not the case for the regularity of the point (0, 2). Indeed, we
have
g′(x, y) = ( 4x³, −5(y − 2)⁴ ),   so g′(0, 2) = (0, 0).
Solution: Note that, the optimization problem has a solution by the extreme-
value theorem since f is continuous on the closed and bounded subset [g =
1] = g −1 {1} of R2 .
Next, the functions f and g are C¹ around each point (x, y) ∈ R². In particular,
each point of [g = 1] is relatively interior to [g = 1]. Indeed, if (x0, y0) ∈ [g =
1], then the point (x0², y0²) is on the unit circle, and we conclude by noting
that the preimage of an open set under the continuous map (x, y) −→ (x², y²)
is an open set.
Moreover, each point of [g = 1] is a regular point since we have
g′(x, y) = (4x³, 4y³) ≠ (0, 0) on [g = 1].
Writing the Lagrangian
L(x, y, λ) = x² + y² − λ(x⁴ + y⁴ − 1)
we are led to solve the system
Lx = 2x − 4λx³ = 0          2x(1 − 2λx²) = 0
Ly = 2y − 4λy³ = 0   ⇐⇒    2y(1 − 2λy²) = 0
Lλ = −(x⁴ + y⁴ − 1) = 0     x⁴ + y⁴ = 1
⇐⇒  { x = 0 or 2λx² = 1,   y = 0 or 2λy² = 1,   x⁴ + y⁴ = 1 }
⇐⇒  [ x = 0, y = ±1, λ = 1/2 ]  or  [ y = 0, x = ±1, λ = 1/2 ]  or  [ x² = y², x⁴ = 1/2, λ = 1/(2x²) ].
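A sketch solving the same system symbolically:

Solve[{2 x - 4 l x^3 == 0, 2 y - 4 l y^3 == 0, x^4 + y^4 == 1}, {x, y, l}, Reals]
(* (0, ±1) and (±1, 0) with l = 1/2 and f = 1;
   (±2^(-1/4), ±2^(-1/4)) with f = Sqrt[2] *)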
FIGURE 3.19: The constraint [x⁴ + y⁴ = 1] and the level curves f = 1, √2 are
tangent
Since f (x, y) = ‖(x, y) − (0, 0)‖², the problem looks for points (x, y) on
the curve x⁴ + y⁴ = 1 that are closest to and farthest from the origin; see Figure
3.19.
4. – Figures A and B (see Figure 3.20) show the level curves of f and the
constraint curve g(x, y) = 0 graphed thickly. Estimate the maximum and
minimum values of f subject to the constraint. Locate the point(s), if any,
where an extreme value occurs.
To look for the shortest and the farthest distance when (x, y, z) remains on
the unit sphere is equivalent to optimizing D²(x, y, z) under the constraint x² +
y² + z² = 1. So, let us denote
First, the optimization problem has a solution by the extreme value theorem
since f is continuous on the unit sphere S, which is a closed and bounded
subset of R3 .
Next, f and g are C∞ around each point (x, y, z) ∈ R³. In particular, each
point of S is a relatively interior point and is a regular point since we have
g′(x, y, z) = (2x, 2y, 2z) ≠ (0, 0, 0) on S,
and we apply Lagrange multipliers method to look for the interior extreme points
by solving the system
Lx = fx(x, y, z) − λgx(x, y, z) = 2(x − 1) − 2xλ = 0
Ly = fy(x, y, z) − λgy(x, y, z) = 2(y − 2) − 2yλ = 0
Lz = fz(x, y, z) − λgz(x, y, z) = 2(z − 2) − 2zλ = 0
Lλ = −(g(x, y, z) − 1) = −(x² + y² + z² − 1) = 0.
If x = 0, then the first equation gives −2 = 0, which is impossible. Similarly, we
cannot have y = 0 or z = 0. So, we deduce from the system that
λ = 1 − 1/x = 1 − 2/y = 1 − 2/z
from which we deduce
y = z = 2x,   λ = 1 − 1/x,   x² + 4x² + 4x² − 1 = 0  ⇐⇒  x = ±1/3.
FIGURE 3.21: The constraint [g = 1] and the level curves f = 4, 16 are tangent
So, the stationary points for the Lagrangian are the two points
(1/3, 2/3, 2/3, −2),   (−1/3, −2/3, −2/3, 4)
and f takes its maximum and minimum values respectively
f (−1/3, −2/3, −2/3) = 16   and   f (1/3, 2/3, 2/3) = 4.
The level surfaces passing through these points are spheres tangent to the constraint,
as shown in Figure 3.21.
Then Br(x∗) is the determinant of the bordered matrix
Br(x∗) = det [ 0_{m×m}    ( ∂gj/∂xi (x∗) )_{j≤m, i≤r} ;
               ( ∂gj/∂xi (x∗) )ᵗ    ( Lxixj(x∗, λ∗) )_{i,j≤r} ],
i.e., the m × m zero block bordered by the first r partial derivatives of the
constraints g1, . . . , gm and by the second partials Lxixj(x∗, λ∗), i, j = 1, . . . , r.
The variables are renumbered in order to make the first m columns in the
matrix g (x∗ ) linearly independent.
M = {h ∈ Rn : g (x∗ ).h = 0}
Before proving the theorem, we will see its application through some examples.
Constrained Optimization-Equality Constraints 169
From the first three equations, we deduce that λ/2 = x = y = z, which inserted
into the last equation gives
x = y = z = 1,   λ = 2.
Now, let us study the nature of the point (1, 1, 1). For this we use the second
derivative test since f and g are C² around this point. The first column vector
of g′(1, 1, 1) is linearly independent. So, we keep the matrix g′(1, 1, 1) without
renumbering the variables. As n = 3 and m = 1, we have to consider the signs
of the following bordered Hessian determinants:
(−1)² B2(1, 1, 1) = det [ 0  gx  gy ;  gx  Lxx  Lxy ;  gy  Lxy  Lyy ](1, 1, 1, 2)
= det [ 0 1 1 ;  1 0 1 ;  1 1 0 ] = 2 > 0.
(−1)³ B3(1, 1, 1) = −det [ 0  gx  gy  gz ;  gx  Lxx  Lxy  Lxz ;  gy  Lyx  Lyy  Lyz ;  gz  Lzx  Lzy  Lzz ](1, 1, 1, 2)
= −det [ 0 1 1 1 ;  1 0 1 1 ;  1 1 0 1 ;  1 1 1 0 ] = −(−3) = 3 > 0.
Proof. We will prove assertion i). Assertion ii) can be established similarly.
We follow for this the proof in [25] with more details in the steps involved.
Step 1: Let Ω be a neighborhood of x∗. For h ∈ Rⁿ such that x∗ + h ∈ Ω,
we have from Taylor's formula, for some τ ∈ (0, 1),
L(x∗ + h, λ∗) = L(x∗, λ∗) + Σ_{i=1}^n Lxi(x∗, λ∗) hi + (1/2) Σ_{i=1}^n Σ_{j=1}^n Lxixj(x∗ + τh, λ∗) hi hj.
Since x∗ ∈ Ω̊ and (x∗, λ∗) is a local stationary point of L then, in particular,
Lxi(x∗, λ∗) = 0,   i = 1, · · · , n.
Moreover, we have
f (x∗ + h) − f (x∗) = Σ_{k=1}^m λ∗k [ gk(x∗ + h) − ck ] + (1/2) Σ_{i=1}^n Σ_{j=1}^n Lxixj(x∗ + τh, λ∗) hi hj
and
gk(x∗ + h) − ck = gk(x∗ + h) − gk(x∗) = Σ_{j=1}^n (∂gk/∂xj)(x∗ + τk h) hj,   τk ∈ (0, 1),
where
G(x¹, . . . , xᵐ) = ( (∂gi/∂xj)(xⁱ) )_{m×n}
and
G(x¹, . . . , xᵐ) · t = 0  ⇐⇒  Σ_{j=1}^n (∂gk/∂xj)(xᵏ) tj = 0,   k = 1, . . . , m.
x0 = x∗ + τ h, x1 = x∗ + τ1 h, . . . , xm = x∗ + τm h ∈ Bρ (x∗ ).
Then
Σ_{i=1}^n Σ_{j=1}^n Lxixj(x∗ + τh, λ∗) ti tj > 0   for all t ≠ 0 such that
Σ_{j=1}^n (∂gk/∂xj)(x∗ + τk h) tj = 0,   k = 1, . . . , m.
we have
f (x∗ + h) − f (x∗) = (1/2) Σ_{i=1}^n Σ_{j=1}^n Lxixj(x∗ + τh, λ∗) hi hj > 0.   (2)
This shows that the stationary point x∗ is a strict local minimum point for f
subject to the constraint g(x) = c in particular directions.
Step 4: Suppose that x∗ is not a strict relative minimum point. Then, there
exists a sequence of points yl satisfying
yl = x∗ + δl sl,   sl ∈ Rⁿ,  ‖sl‖ = 1,  δl > 0  ∀l.
gk(x∗ + h) − gk(x∗) = Σ_{j=1}^n (∂gk/∂xj)(x∗ + τk h) hj = 0,   τk ∈ (0, 1),  k = 1, . . . , m,
which is a contradiction.
Then, we have
d²/dt² f (x(t)) = ᵗx′(t) Hf(x(t)) x′(t) + ∇f (x(t)) · x″(t)
d²/dt² f (x(t)) |_{t=0} = ᵗx′(0) Hf(x∗) x′(0) + ∇f (x∗) · x″(0)
= ᵗx′(0) [ Hf(x∗) − ᵗλ∗ Hg(x∗) ] x′(0) + [ ∇f (x∗) − ᵗλ∗ ∇g(x∗) ] · x″(0)
= ᵗx′(0) [ HL(x∗) ] x′(0)   since ∇f (x∗) − ᵗλ∗ ∇g(x∗) = 0.
subject to the m linear homogeneous constraints
b11 h1 + . . . + b1n hn = 0
. . .
bm1 h1 + . . . + bmn hn = 0.
Set
A = ( aij )_{n×n},   B = ( bij )_{m×n},   h = ᵗ( h1, . . . , hn ).
Definition.
Q(h) = ᵗh A h is positive (resp. negative) definite subject to the linear
constraints Bh = 0 if Q(h) > 0 (resp. < 0) for all h ≠ 0 that satisfy
Bh = 0.
We have the following necessary and sufficient condition for a quadratic form
Q to be positive (resp. negative) definite subject to linear constraints.
Theorem: Assume the first m columns in the matrix B = (bij) are linearly
independent. Then
Q is positive definite subject to Bh = 0  ⇐⇒  (−1)ᵐ Br > 0,   r = m + 1, . . . , n
Q is negative definite subject to Bh = 0  ⇐⇒  (−1)ʳ Br > 0,   r = m + 1, . . . , n
where, for r = m + 1, . . . , n,
Br = det [ 0_{m×m}    ( bij )_{i≤m, j≤r} ;    ᵗ( bij )    ( aij )_{i,j≤r} ].
Solved Problems
iii) Graph some level curves of f and the graph of g = 1. Explain where
the extreme points occur.
S = {(x, y) : g(x, y) = 1}
which is a closed and bounded subset of R².
[Figure: the surface z = x² + 2y² over the unit circle.]
Next, the functions f and g are C¹ in R² and any point on the unit circle is
regular since, for each (x, y) ∈ S, we have g′(x, y) = (2x, 2y) ≠ (0, 0).
ii) Now, because f and g are C 2 , we may study the nature of the four points
by using the second derivatives test. Here, we have n = 2 and m = 1. Then,
we have to consider the sign of the bordered Hessian determinant B2 at each
point.
g′(x, y) = (2x, 2y),   g′(0, ±1) = (0, ±2)   =⇒   rank(g′(0, ±1)) = 1.
Note that the first column vector of g′(0, ±1) is linearly dependent (it is zero) and the
second column vector is linearly independent. So, we renumber the variables
so that the second column vector of g′(0, ±1) is in the first position. Hence B2
will be written as
B2(x, y) = det [ 0  gy  gx ;  gy  Lyy  Lyx ;  gx  Lxy  Lxx ] = det [ 0  2y  2x ;  2y  4 − 2λ  0 ;  2x  0  2 − 2λ ]
B2(0, 1) = det [ 0 2 0 ;  2 0 0 ;  0 0 −2 ] = 8,   B2(0, −1) = det [ 0 −2 0 ;  −2 0 0 ;  0 0 −2 ] = 8.
For r = m + 1 = 2 = n, we have
min f (x, y, z) = (x − x0 )2 + (y − y0 )2 + (z − z0 )2
subject to g(x, y, z) = ax + by + cz + d = 0
min x2 + y 2 + z 2 subject to x + y + z = 1.
So, by applying Lagrange multipliers method, we will look for the candidate
extreme points as stationary points for the Lagrangian:
∇L(x, y, z, λ) = ⟨0, 0, 0, 0⟩  ⇐⇒
Lx = 2(x − x0) − λa = 0
Ly = 2(y − y0) − λb = 0
Lz = 2(z − z0) − λc = 0
Lλ = −(ax + by + cz + d) = 0
Case a ≠ 0.
The first column vector of g′(x∗, y∗, z∗) is linearly independent, and because
n = 3 and m = 1, we have to consider the signs of the following bordered
Hessian determinants:
B2(x∗, y∗, z∗) = det [ 0  gx  gy ;  gx  Lxx  Lxy ;  gy  Lxy  Lyy ] = det [ 0 a b ;  a 2 0 ;  b 0 2 ] = −2(a² + b²) < 0.
Case a = 0 and b ≠ 0.
The first column vector of g′(x∗, y∗, z∗) is linearly dependent and the second
is linearly independent. We renumber the variables in the order y, x, z and
obtain
B2 = det [ 0 b a ;  b 2 0 ;  a 0 2 ] = −2(a² + b²),   B3 = det [ 0 b a c ;  b 2 0 0 ;  a 0 2 0 ;  c 0 0 2 ] = −4(a² + b² + c²).
Case a = 0, b = 0, and c ≠ 0.
The first and second column vectors of g′(x∗, y∗, z∗) are linearly dependent
and the third is linearly independent. We renumber the variables in the order
z, x, y and obtain
B2 = det [ 0 c a ;  c 2 0 ;  a 0 2 ] = −2(a² + c²),   B3 = det [ 0 c a b ;  c 2 0 0 ;  a 0 2 0 ;  b 0 0 2 ] = −4(a² + b² + c²).
f (x, y, z) = (x − x0)² + (y − y0)² + (z − z0)² = ‖M0M‖²
is the square of the distance of the point M to the point M0. The constraint
surface is the plane ax + by + cz + d = 0, and the line through M0 normal to it is
x = x0 + ta,   y = y0 + tb,   z = z0 + tc,   t ∈ R.
a(x0 + ta) + b(y0 + tb) + c(z0 + tc) + d = 0  ⇐⇒  t = λ∗/2, and
f ( (λ∗/2)a + x0, (λ∗/2)b + y0, (λ∗/2)c + z0 ) = ((λ∗/2)a)² + ((λ∗/2)b)² + ((λ∗/2)c)² = (λ∗²/4)(a² + b² + c²).
v) From the previous study, choose (a, b, c) = (1, 1, 1), d = −1, (x0, y0, z0) =
(0, 0, 0). Then
λ/2 = 1/3   and   (x∗, y∗, z∗) = (1/3, 1/3, 1/3).
We conclude that the point (1/3, 1/3, 1/3) is a local minimum to the constrained
minimization problem. At this point, the two level surfaces
x² + y² + z² = 1/3 = f (1/3, 1/3, 1/3)   and   x + y + z = 1
are tangent, as described in Figure 3.25.
FIGURE 3.25: The level surface and the plane are tangent
min f (x, y, z) = x² + y² + z²   subject to   g1(x, y, z) = x + y + z = 3,
                                              g2(x, y, z) = x − y = 2.
Note that f , g1 and g2 are C 1 in R3 and any point of the set of the constraints,
sketched in Figure 3.26 and defined by g = (g1 , g2 ) = (3, 2), is an interior point
and regular since we have
g′(x, y, z) = [ 1 1 1 ;  1 −1 0 ],   rank(g′(x, y, z)) = 2.
FIGURE 3.26: The constraints, the origin and the minimum point
L(x, y, z, λ1, λ2) = x² + y² + z² − λ1(x + y + z − 3) − λ2(x − y − 2)
The only critical point for L is (x∗ , y ∗ , z ∗ , λ∗1 , λ∗2 ) = (2, 0, 1, 2, 2).
ii) Note that the first two column vectors of g′(x, y, z) are linearly independent.
We can, therefore, keep the matrix without renumbering the variables, and
consider the sign of the following bordered Hessian determinant (n = 3, m =
2, r = m + 1 = 3):
B3(2, 0, 1) = det [ 0 0 1 1 1 ;  0 0 1 −1 0 ;  1 1 2 0 0 ;  1 −1 0 2 0 ;  1 0 0 0 2 ] = 12.
We have
(−1)m B3 (2, 0, 1) = (−1)2 B3 (2, 0, 1) = 12 > 0.
iii) To show that the point is the global minimum point, we use the following
parametrization of the set of the constraints; see Figure 3.26:
x = t + 2, y = t, z = 1 − 2t t ∈ R.
So the optimization problem is reduced to
min_{t∈R} F (t) = f (t + 2, t, 1 − 2t) = (t + 2)² + t² + (1 − 2t)².
We have
F ′(t) = 12t   and   F ″(t) = 12 > 0 ∀t ∈ R.
Hence 0 is a global minimum for F. That is, the point (2, 0, 1) is the solution
to the minimization problem.
In Section 3.4, we will see that using the convexity of the Lagrangian in
(x, y, z), when (λ1, λ2) = (2, 2), we can conclude that the local minimum point
(2, 0, 1) is the global minimum point. Therefore, it solves the problem. The
advantage of arguing in this way is that it spares us from exploring the geometry of
the constraint set.
Then, we have
∃λ∗ = (λ∗1, . . . , λ∗m) : ∇x,λ L(x∗, λ∗) = 0
and
L(·, λ∗) is concave (resp. convex) in x ∈ S
=⇒ x∗ is a global maximum (resp. minimum) point.
Since we have
∂L/∂λj (x∗, λ∗) = −(gj(x∗) − cj) = 0,   j = 1, . . . , m,
then
g1(x∗) − c1 = g2(x∗) − c2 = . . . = gm(x∗) − cm = 0.
Solution: The inputs K and L minimizing the cost must solve the problem
min rK + wL   subject to   cKᵃLᵇ = Q.
We look for the extreme points in the set Ω = (0, +∞) × (0, +∞) since K and
L must satisfy cKᵃLᵇ = Q. Denote
f (K, L) = rK + wL,   g(K, L) = cKᵃLᵇ,   S = Ω.
Note that f and g are C 1 in the open convex set Ω.
Consider the Lagrangian
L(K, L, λ) = f (K, L) − λ(g(K, L) − Q) = rK + wL − λ(cKᵃLᵇ − Q)
and Lagrange's necessary conditions
∇L(K, L, λ) = ⟨0, 0, 0⟩  ⇐⇒
LK = r − λcaKᵃ⁻¹Lᵇ = 0
LL = w − λcbKᵃLᵇ⁻¹ = 0
Lλ = −(cKᵃLᵇ − Q) = 0.
Multiplying each side of the first equality by K, and each side of the second equality
by L, we obtain
D1(K, L) = −λ∗ca(a − 1)Kᵃ⁻²Lᵇ > 0   since 0 < a < a + b < 1
D2(K, L) = det [ −λ∗ca(a − 1)Kᵃ⁻²Lᵇ   −λ∗cabKᵃ⁻¹Lᵇ⁻¹ ;   −λ∗cabKᵃ⁻¹Lᵇ⁻¹   −λ∗cb(b − 1)KᵃLᵇ⁻² ]
= (λ∗)²c²abK²ᵃ⁻²L²ᵇ⁻²(1 − (a + b)) > 0.
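A numerical sketch with illustrative parameter values (r = 2, w = 1, c = 1, a = 0.3, b = 0.5, Q = 10; these numbers are assumptions, not from the text):

NMinimize[{2 K + L, K^0.3 L^0.5 == 10 && K > 0 && L > 0}, {K, L}]
(* cost ≈ 44.7 at K ≈ 8.38, L ≈ 27.9; note L/K = (b/a)(r/w) = 10/3,
   exactly as the first-order conditions predict *)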
∇L(x, y, z, λ1, λ2) = 0_{R⁵}  ⇐⇒
(1) Lx = 1 − 2xλ1 − 2xλ2 = 0
(2) Ly = 0 − 2yλ1 = 0
(3) Lz = −1 − 2zλ2 = 0
(4) Lλ1 = −(x² + y² − 1) = 0
(5) Lλ2 = −(x² + z² − 1) = 0.
From (2), λ1 = 0 or y = 0. If y = 0, then (4) and (5) give
x = ±1 and z = 0,
which contradicts (3). So λ1 = 0, and (1) and (3) become
1 − 2xλ2 = 0   and   −1 − 2zλ2 = 0.
Since λ2 cannot be equal to zero, we deduce that
x = −z = 1/(2λ2).
Inserting x = −z in (5), we obtain
2x² = 1  ⇐⇒  x = ±1/√2.
(1/√2, ±1/√2, −1/√2, λ∗1, λ∗2)   with   (λ∗1, λ∗2) = (0, 1/√2),
(−1/√2, ±1/√2, 1/√2, λ∗1, λ∗2)   with   (λ∗1, λ∗2) = (0, −1/√2).
f (1/√2, ±1/√2, −1/√2) = √2,   f (−1/√2, ±1/√2, 1/√2) = −√2.
ii) To study the convexity of L in (x, y, z), consider the Hessian matrix
HL(x,y,z,λ1,λ2) = [ Lxx Lxy Lxz ;  Lyx Lyy Lyz ;  Lzx Lzy Lzz ] = [ −2(λ1 + λ2) 0 0 ;  0 −2λ1 0 ;  0 0 −2λ2 ].
* With (λ∗1, λ∗2) = (0, 1/√2), the Hessian is
HL(x,y,z,0,1/√2) = [ −√2 0 0 ;  0 0 0 ;  0 0 −√2 ]
and its principal minors are
first order:  −√2, 0, −√2;   second order:  0, 2, 0;   third order:  Δ3 = 0,
so that
(−1)ᵏ Δk ≥ 0,   k = 1, 2, 3.
Thus L(·, 0, 1/√2) is concave in R³ and the points (1/√2, ±1/√2, −1/√2) are maximum
points.
** Similarly, we show that L(·, 0, −1/√2) is convex and the points
(−1/√2, ±1/√2, 1/√2) are minimum points.
iii) Comments. The constraint set, illustrated in Figure 3.27, is the inter-
section of two cylinders. A parametrization of this set is described by the
equations
x(t) = ±√(1 − t²),   y(t) = t,   z(t) = t or −t,   t ∈ [−1, 1].
The set is closed since g1 and g2 are continuous on R³ and
[(g1, g2) = (1, 1)] = g1⁻¹{1} ∩ g2⁻¹{1}.
It is bounded since, for any (x, y, z) ∈ [(g1, g2) = (1, 1)], we have
‖(x, y, z)‖² = x² + y² + z² ≤ (x² + y²) + (x² + z²) = 1 + 1 = 2.
[FIGURE 3.27: the constraint set, the intersection of the cylinders x² + y² = 1 and x² + z² = 1.]
Show that the local maximum point (1, 1, 1) of the constrained optimization
problem, with λ = 2, is a global maximum, but L(., 2) is not concave.
Solution: We have
Lx = y + z − λ,   Ly = x + z − λ,
Lz = y + x − λ,   Lλ = −(x + y + z − 3).
The principal minors of the Hessian of L(·, 2) are
first order:  0, 0, 0;   second order:  det [ 0 1 ; 1 0 ] = −1 (three times);
Δ3 = det [ 0 1 1 ;  1 0 1 ;  1 1 0 ] = 2.
So L(., 2) is neither concave nor convex in (x, y, z). Thus, we cannot conclude,
by using the theorem, whether the point (1, 1, 1) is a global maximum or not.
Solved Problems
where c ∈ R.
ii) Use part (i) to show that if x1 , x2 , . . . , xn are given numbers, then
ii) Use part (i) to show that if x1, x2, . . . , xn are given numbers, then
n Σ_{i=1}^n xi² ≥ ( Σ_{i=1}^n xi )².
∇L(x1, . . . , xn, λ) = ⟨0, . . . , 0, 0⟩  ⇐⇒
Lx1 = 2x1 − λ = 0
. . .
Lxi = 2xi − λ = 0
. . .
Lxn = 2xn − λ = 0
Lλ = −(x1 + x2 + . . . + xn − c) = 0.
Now, let us study the convexity of L in (x1, . . . , xn) when λ = 2c/n.
The corresponding Hessian matrix is
The corresponding Hessian matrix is
⎡ ⎤
2 ··· 0
⎢ .. . . .. ⎥
⎣ . . . ⎦
0 ··· 2
The leading principal minors are all positive, so L is convex and
f (c/n, . . . , c/n) ≤ f (t1, . . . , tn)   ∀(t1, . . . , tn) ∈ [t1 + . . . + tn = c].
In particular, for the given xi, we can write
(c/n)² + (c/n)² + . . . + (c/n)² ≤ x1² + x2² + . . . + xn²
⇐⇒  n (c²/n²) = c²/n ≤ x1² + x2² + . . . + xn²
⇐⇒  c² = (x1 + x2 + . . . + xn)² ≤ n(x1² + x2² + . . . + xn²).
The equality holds only at the minimum point, whose coordinates are all equal to
(x1 + x2 + . . . + xn)/n.
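A quick numerical illustration of the inequality (the data x = (1, 2, 3, 4) is just an example):

With[{x = {1, 2, 3, 4}},
 {Length[x] Total[x^2], Total[x]^2}]   (* {120, 100}: n Σxi² ≥ (Σxi)² *)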
Since D is differentiable, the local extreme points are stationary points, i.e.,
solutions of ∇D(m, b) = ⟨0, 0⟩:
∂D/∂m = −2[y1 − (mx1 + b)]x1 − . . . − 2[yn − (mxn + b)]xn = 0
∂D/∂b = −2[y1 − (mx1 + b)] − . . . − 2[yn − (mxn + b)] = 0
⇐⇒
Σ_{i=1}^n yi xi = m [ Σ_{i=1}^n xi² ] + b [ Σ_{i=1}^n xi ]
Σ_{i=1}^n yi = m [ Σ_{i=1}^n xi ] + b n.
since x1, . . . , xn are different (see Part 1). Therefore, there exists a unique
solution to the system. It remains to show that it is the minimum point. For
this, we study the convexity of D, whose Hessian matrix is given by
HD(m, b) = [ 2 Σ_{i=1}^n xi²    2 Σ_{i=1}^n xi ;   2 Σ_{i=1}^n xi    2n ].
So D is convex and the unique critical point (m∗, b∗) is the global minimum.
The regression line equation is y = m∗x + b∗ with
m∗ = [ n Σ_{i=1}^n xi yi − ( Σ_{i=1}^n xi )( Σ_{i=1}^n yi ) ] / [ n Σ_{i=1}^n xi² − ( Σ_{i=1}^n xi )² ]
b∗ = [ ( Σ_{i=1}^n xi² )( Σ_{i=1}^n yi ) − ( Σ_{i=1}^n xi )( Σ_{i=1}^n xi yi ) ] / [ n Σ_{i=1}^n xi² − ( Σ_{i=1}^n xi )² ].
iii) Plot the points and the regression line on the same graph.
iv) Use your answer from ii) to predict the final exam score of a student
whose midterm score was 41 and who dropped the course.
xi 100 95 81 71 83 48 92 100 85 63 78 58 73 60
yi 95 88 53 58 80 31 91 78 85 52 78 74 60 60
Solution: i) The plot, in Figure 3.28, shows that 10 points are close to a line.
The plot is obtained using the Mathematica coding below:
fp = {{100, 95}, {95, 88}, {81, 53}, {71, 58}, {83, 80}, {48, 31}, {92, 91},
{100, 78}, {85, 85}, {63, 52}, {78, 78}, {58, 74}, {73, 60}, {60, 60}};
gp = ListPlot[fp]
Σ_{i=1}^{14} xi = 1087,   Σ_{i=1}^{14} xi² = 87855,   Σ_{i=1}^{14} yi = 983,   Σ_{i=1}^{14} xi yi = 79428
m∗ = 1499/1669 ≈ 0.8981426,   b∗ = 801/1669 ≈ 0.4799281
and the regression line will be
y = 0.8981426 x + 0.4799281.
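These coefficients can also be reproduced from the closed-form formulas of Part 2 (a sketch; fp is the data list defined above):

xs = fp[[All, 1]]; ys = fp[[All, 2]]; n = Length[xs];
m = (n Total[xs ys] - Total[xs] Total[ys])/(n Total[xs^2] - Total[xs]^2);
b = (Total[ys] - m Total[xs])/n;
{m, b}   (* {1499/1669, 801/1669} *)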
iii) To check the equation of the line of best fit, we use the instruction
line = Fit[fp, {1, x}, x]
0.479928 + 0.898143 x
To sketch the line (see Figure 3.29) with the data, we add the following Math-
ematica coding:
iv) The student who dropped the course would have obtained on the final exam the ap-
proximate mark of y(41) ≈ 0.8981426(41) + 0.4799281 ≈ 37.3. The student
would have failed without improving their understanding of the material stud-
ied. However, this is only a rough prediction that doesn't take into account
other factors involved in the student's learning experience.
Solution: i) The data, of points (x, y), appear to lie along a straight line.
The plot, shown in Figure 3.30, is obtained using the Mathematica coding
below:
fp1 = {{1, 6.1}, {2, 6.8}, {3, 7.5}, {4, 8.5}, {5, 9.3}, {6, 10.5}, {7, 11.5},
{8, 12.625}, {9, 13.975}, {10, 14.975}};
gp1 = ListPlot[fp1]
The plot of the data, of points (xi, ln yi), appears also to lie along a straight
line (see Figure 3.31).
fp2 = {{1, Log[6.1]}, {2, Log[6.8]}, {3, Log[7.5]}, {4, Log[8.5]}, {5, Log[9.3]},
{6, Log[10.5]}, {7, Log[11.5]}, {8, Log[12.625]}, {9, Log[13.975]}, {10, Log[14.975]}};
gp2 = ListPlot[fp2]
FIGURE 3.31: The data (xi , ln yi ) are positioned along a straight line
ii) Using the results from Part 2, the least squares line of best fit is given by
ln(y) = ln(β0) + β1 x, where b∗ = ln(β0) and m∗ = β1 are the solution of the
linear system. With A = Σ xi², B = Σ xi, p = Σ xi ln yi and q = Σ ln yi,
m∗ = (10p − qB)/(10A − B²) ≈ 0.10156,   b∗ = (qA − Bp)/(10A − B²) ≈ 1.71975
and the regression line will be
ln y = 0.10156 x + 1.71975.
Thus
β0 = e^{b∗} ≈ 5.583112,   β1 = m∗ ≈ 0.10156.
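The same values come out of Fit applied to the log data (a sketch; fp2 as defined above):

lm = Fit[fp2, {1, x}, x]   (* ≈ 1.71975 + 0.10156 x *)
E^1.71975                  (* ≈ 5.5831, the value of β0 *)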
We sketch the line with the data (xi , ln(yi )), in Figure 3.32, using the coding:
Finally, we sketch, in Figure 3.33, the curve y = f (x) = β0 eβ1 x , with the
original data (xi , yi ):
vi) Using the formula for prediction, we need to solve the equation
1000 f (x) = 40000  ⇐⇒  x = (1/0.10156) ln(40/5.583112) ≈ 19.38886498.
Thus in the year 1981 + 19 = 2000, the tuition fees reached the rate of $40000.
Chapter 4
Constrained Optimization-Inequality
Constraints
Example.
Note that sets defined by inequalities contain interior points and boundary
points. So, for comparing the values of a function f taken around an extreme
point x∗, it will be suitable to consider curves x(t) passing through x∗ and
included in the constraint set [g ≤ b]. We will consider, this time, curves
t −→ x(t) such that the set {x(t) : t ∈ [0, a], x(0) = x∗}, for some a > 0, is
included in [g ≤ b]. If x∗ is a local maximum of f, then we have
Thus, 0 is a local maximum point for the function t −→ f (x(t)). Hence
d/dt f (x(t)) |_{t=0} = ∇f (x(t)) · x′(t) |_{t=0} ≤ 0   =⇒   ∇f (x∗) · x′(0) ≤ 0.
x′(0) is a tangent vector to the curve x(t) at the point x(0) = x∗. This inequality
must not depend on a particular curve x(t). So, we should have
∇f (x∗) · x′(0) ≤ 0   for any curve x(t) such that g(x(t)) ≤ b.
In this chapter, we will first characterize, in Section 4.1, the set of tangent vectors to
such curves, then establish, in Section 4.2, the equations satisfied by a local extreme
point x∗. In Section 4.3, we identify the candidate points for optimality, and in
Section 4.4, we explore the global optimality of a constrained local candidate point.
Finally, we establish, in Section 4.5, the dependence of the optimal value of the
objective function with respect to certain parameters involved in the problem.
Let
x∗ ∈ S = [g(x) ≤ b].
So y = 0 ∈ T .
∗∗ Suppose y ≠ 0. We have x∗ ∈ ∩_{j=1}^m [gj(x) < bj], which is an open subset of
Rⁿ. So there exists δ > 0 such that
B_δ(x∗) ⊂ ∩_{j=1}^m [gj(x) < bj].
Now
x(t) = x∗ + ty ∈ B_δ(x∗)   ∀t ∈ [−δ/(2|y|), δ/(2|y|)]
since
|x(t) − x∗| = |t||y| ≤ (δ/(2|y|))|y| = δ/2 < δ.
We deduce that y ∈ T since the curve satisfies: x ∈ C¹[0, δ/(2|y|)],
x(0) = x∗,   x′(t) = y,   x′(0) = y,   x(t) = x∗ + ty ∈ S ∀t ∈ [0, δ/(2|y|)].
Example 1. Find and sketch the cone of feasible directions at the point
(−1/2, 1/2) belonging to the set
Solution: The set S is the part of the unit disk located above the line y = x.
The point (−1/2, 1/2) is an interior point of S; see Figure 4.1. Thus T = R2 .
[Figure 4.1: the set S and the interior point (−1/2, 1/2).]
Theorem 4.1.1
At a regular point x∗ ∈ S = [g ≤ b], where g is C¹ in a neighborhood of
x∗, the cone of feasible directions T is equal to the convex cone
Before giving the proof, we give some remarks and identify some cones.
Example 2. Find and sketch the cone of feasible directions C(x, y) with vertex
(x, y) = (−1/2, −1/2), (0, 1) and (1/√2, 1/√2). The points belong to the set
Solution: Note that the three points belong to ∂S; see Figure 4.2.
[Figure 4.2: the set S and the three boundary points.]
To determine the cone of feasible directions at each point (see Figures 4.2
and 4.3), we need to discuss the regularity of each point. First, we will need:
g′(x, y) = [ ∂g1/∂x  ∂g1/∂y ;  ∂g2/∂x  ∂g2/∂y ] = [ 2x  2y ;  1  −1 ].
g1′(0, 1) = ( 0  2 )   and   rank(g1′(0, 1)) = 1
C(0, 1) = { (x, y) ∈ R² : ( 0  2 ) · ᵗ( x − 0, y − 1 ) ≤ 0 }
        = { (x, y) ∈ R² : y ≤ 1 }.
FIGURE 4.3: C(0, 1) = [y ≤ 1] and C(1/√2, 1/√2) = [x + y ≤ √2] ∩ [x ≤ y]
√ √
∗ ∗ ∗ At (1/ 2, 1/ 2), the two equality constraints g1 = g2 = 0 are satisfied
and the point is regular. We have
√ √
1 1 1 21 2
g (√ , √ ) = rank(g ( √ , √ )) = 2 and
2 2 21 2 −1
⎡ ⎤
√ √ x − √12
1 1 2 2 2 ⎣ ⎦0
C( √ , √ ) = (x, y) ∈ R : .
2 2 1 −1
y − √12
√
= (x, y) ∈ R2 : x + y − 2 0 and x − y 0 .
Remark 4.1.3 The conclusion of the theorem is also true when the point
x∗ satisfies any one of the following regularity conditions [5]:
i) Each constraint gj(x) is affine for j ∈ I(x∗).
ii) There exists x̄ such that, for each j ∈ I(x∗),
gj(x̄) ≤ bj,   with gj(x̄) < bj if gj is not affine.
Example 3. Suppose that all the constraints are affine and that the set S is
described by
S = { x ∈ Rⁿ : Σ_{j=1}^n aij xj ≤ bi, i = 1, . . . , m } = { x ∈ Rⁿ : Ax ≤ b }.
Then
C(x∗) = { x ∈ Rⁿ : gi′(x) · (x − x∗) = Σ_{j=1}^n aij (xj − x∗j) ≤ 0, i ∈ I(x∗) }
      = { x ∈ Rⁿ : Σ_{j=1}^n aij xj ≤ Σ_{j=1}^n aij x∗j = bi, i ∈ I(x∗) }.
Let x∗ be a relative interior point of the surface z = f (x). Find the cone at
x∗.
g′(x∗, f (x∗)) = ( −f ′(x∗)  1 ) ≠ 0,   rank(g′(x∗, f (x∗))) = 1.
The cone of feasible directions at the point (x∗, f (x∗)) with vertex (x∗, f (x∗))
is given by
C(x∗, f (x∗)) = { (x, z) ∈ Rⁿ × R : g′(x∗, f (x∗)) · ᵗ( x − x∗, z − f (x∗) ) ≤ 0 }.
We have
g′(x∗, f (x∗)) · ᵗ( x − x∗, z − f (x∗) ) = −f ′(x∗) · (x − x∗) + z − f (x∗).
The cone is the region below the hyperplane z = f (x∗) + f ′(x∗) · (x − x∗), which
is also the tangent plane to the surface z = f (x) at x∗.
Remark 4.1.4 Note that the representation of the cone of feasible direc-
tions obtained in the theorem used the fact that the point was regular.
When this hypothesis is omitted, the representation is not necessarily valid.
g(x, y) = x² ≤ 0  ⇐⇒  g(x, y) = x² = 0.
Proof. We have: φi(0) = 0 and φi(t) ≤ 0 on [0, a], so 0 is a maximum point for the
function φi(t) = gi(x(t)) − bi, (i ∈ I(x∗)), over the interval [0, a]. Since
φi(t) − φi(0) = φi′(0) t + t α(t) = t ( φi′(0) + α(t) )   with lim_{t→0+} α(t) = 0,
we obtain
φi′(0) = d/dt ( gi(x(t)) ) |_{t=0} = ∇gi(x(t)) · x′(t) |_{t=0} = gi′(x∗) · y ≤ 0.
Since x∗ ∈ [gj(x) < bj] for j ∉ I(x∗) and g is continuous, there exists δ > 0 such
that B_δ(x∗) ⊂ [gj(x) < bj] for all j ∉ I(x∗).
We claim that
∃δ0 ∈ ( 0, min(δ, δ/|y|) ) such that x(t) ∈ S = [g(x) ≤ b] ∀t ∈ [0, δ0].
For j ∈ I(x∗),
gj(x(t)) = gj(x∗ + ty) = gj(x∗) + t gj′(x∗) · y + t εj(t)   with lim_{t→0} εj(t) = 0.
Since gj(x∗) = bj and gj′(x∗) · y < 0, we deduce the existence of
δ0j ∈ ( 0, min(δ, δ/|y|) ) such that
|εj(t)| < −(1/2) gj′(x∗) · y.
Consequently, for δ0 = min_{j∈I(x∗)} δ0j, we have ∀j ∈ I(x∗),
gj(x(t)) < bj + t gj′(x∗) · y − (t/2) gj′(x∗) · y = bj + (t/2) gj′(x∗) · y < bj   ∀t ∈ (0, δ0).
F (t, u) = G( x∗ + ty + ᵗG′(x∗)u ) − B = 0
where, for t fixed, u ∈ Rᵖ is the unknown. As before,
‖ty + ᵗG′(x∗)u‖ < (δ/(2‖y‖))‖y‖ + (δ/(2‖G′(x∗)‖))‖G′(x∗)‖ = δ/2 + δ/2 = δ
and, with Xj(t, u) = x∗j + t yj + Σ_{l=1}^p (∂Gl/∂xj)(x∗) ul, we have ∂Xj/∂ui = (∂Gi/∂xj)(x∗).
By hypotheses, we have
– F is a C¹ function in the open set A = (−δ0, δ0) × B_{δ0}(0)
– F (0, 0) = G(x∗) − B = 0
– det(∇u F (0, 0)) = ∂(F1, . . . , Fp)/∂(u1, . . . , up) = det( G′(x∗) ᵗG′(x∗) ) ≠ 0 since
G′(x∗) has rank p.
Then, by the implicit function theorem, there exist open balls B_ε(0) ⊂
(−δ0, δ0) and B_η(0) ⊂ B_{δ0}(0), ε, η > 0, with B_ε(0) × B_η(0) ⊆ A, and such that
0 = d/dt G(x(t)) = Σ_{j=1}^n (∂G/∂Xj)(∂Xj/∂t)
where, since Xj(t, u) = x∗j + t yj + Σ_{l=1}^p (∂Gl/∂xj)(x∗) ul,
∂Xj/∂t = yj + Σ_{l=1}^p (∂Gl/∂xj)(x∗)(∂ul/∂t).
Hence
0 = d/dt G(x(t)) |_{t=0} = Σ_{j=1}^n (∂G/∂xj)(X(t, u)) [ yj + Σ_{l=1}^p (∂Gl/∂xj)(x∗)(∂ul/∂t) ] |_{t=0}.
Since we have G′(x∗)y = 0 and G′(x∗) ᵗG′(x∗) is nonsingular and positive
definite, we conclude that u′(0) = 0.
Hence
x′(0) = y + ᵗG′(x∗) u′(0) = y.
gj(x(t)) = gj(x(0)) + t gj′(x∗) · x′(0) + t η(t) = bj + t gj′(x∗) · y + t η(t)
with lim_{t→0} η(t) = 0. Then, from the first case, there exists ε0 ∈ (0, ε) such that
gj(x(t)) < bj for all t ∈ (0, ε0), thus x(t) ∈ S.
Finally, y is a tangent vector to the curve x(t) included in S for t ∈ [0, ε0/2],
so y ∈ T.
gi′(x∗)( sy + (1 − s)y′ ) = s gi′(x∗) y + (1 − s) gi′(x∗) y′ ≤ s·0 + (1 − s)·0 = 0
Solved Problems
1. – Find and draw the cone of feasible directions at the point (0, 3, 0)
belonging to the set x² + y² + z² ≤ 9.
[Figure 4.4: the ball x² + y² + z² ≤ 9 and the cone of feasible directions [y ≤ 3] at (0, 3, 0).]
2. – Find the cone of feasible directions at the point (0, 1, 0) to the set
g1 (x, y, z) = x + y + z, g2 (x, y, z) = x2 + y 2 + z 2 .
Solution: The set S = [g ≤ (1, 1)], as illustrated in Figure 4.5, is the part of
the unit ball located below the plane x + y + z = 1.
[Figure 4.5: the unit ball cut by the plane x + y + z = 1, and the cone at (0, 1, 0).]
g′(x, y, z) = [ ∂g1/∂x  ∂g1/∂y  ∂g1/∂z ;  ∂g2/∂x  ∂g2/∂y  ∂g2/∂z ] = [ 1 1 1 ;  2x 2y 2z ]
g′(0, 1, 0) = [ 1 1 1 ;  0 2 0 ]   has rank 2.
The cone of feasible directions to the set S at the point (0, 1, 0), with vertex
this point, is the set of points (x, y, z) such that
g′(0, 1, 0) · ᵗ( x − 0, y − 1, z − 0 ) = [ 1 1 1 ;  0 2 0 ] · ᵗ( x, y − 1, z ) ≤ ᵗ( 0, 0 )
⇐⇒  x + y − 1 + z ≤ 0   and   y ≤ 1.
Solution: Set
g1(x, y, z) = z − √(x² + y²),   g2(x, y, z) = z − (1/10)(x² + y²) − 5/2.
We have
g1(3, 4, 5) = g2(3, 4, 5) = 0
g1′ = −( x/√(x² + y²) ) i − ( y/√(x² + y²) ) j + k,   g1′(3, 4, 5) = −(3/5) i − (4/5) j + k ≠ 0
g2′(x, y, z) = −(x/5) i − (y/5) j + k,   g2′(3, 4, 5) = −(3/5) i − (4/5) j + k ≠ 0
C1(3, 4, 5) = { (x, y, z) ∈ R³ : g1′(3, 4, 5) · ᵗ( x − 3, y − 4, z − 5 ) ≤ 0 }
C2(3, 4, 5) = { (x, y, z) ∈ R³ : g2′(3, 4, 5) · ᵗ( x − 3, y − 4, z − 5 ) ≤ 0 }.
Clearly, since g1′(3, 4, 5) = g2′(3, 4, 5), the two sets are equal and we have for
i = 1, 2
gi′(3, 4, 5) · ᵗ( x − 3, y − 4, z − 5 ) ≤ 0  ⇐⇒  ( −3/5  −4/5  1 ) · ᵗ( x − 3, y − 4, z − 5 ) ≤ 0
⇐⇒  −(3/5)(x − 3) − (4/5)(y − 4) + (z − 5) ≤ 0.
Hence, the two given sets have a common cone of feasible directions at this
point (see the illustrations in Figure 4.6) characterized by the inequality
−(3/5)(x − 3) − (4/5)(y − 4) + (z − 5) ≤ 0.
[Figure 4.6: the two surfaces and their common cone of feasible directions at (3, 4, 5).]
max f (x1, . . . , xn)   subject to   g1(x1, . . . , xn) ≤ b1,  . . . ,  gm(x1, . . . , xn) ≤ bm
The results established are strongly related to the fact that we are maximizing a
function f under inequality constraint g(x) ≤ b. To solve a minimization problem
min f (x), we can maximize −f (x), and if a constraint is given in the form gj(x) ≥ bj,
we can transform it into −gj(x) ≤ −bj. An equality constraint gj(x) = bj can be
equivalently written as gj(x) ≤ bj and −gj(x) ≤ −bj.
f (x(t)) ≤ f (x∗) = f (x(0))  ⇐⇒  ( f (x(t)) − f (x(0)) ) / (t − 0) ≤ 0
from which we deduce
∇f (x∗) · y = ∇f (x(t)) · x′(t) |_{t=0} = d/dt f (x(t)) |_{t=0} = lim_{t→0+} ( f (x(t)) − f (x(0)) ) / (t − 0) ≤ 0.
Remark 4.2.1 The lemma generalizes the necessary condition for a local
maximum point x∗ in a convex set S:
∇f (x∗) · (x − x∗) ≤ 0   ∀x ∈ S.
such that
∂f/∂xi (x∗) − Σ_{j∈I(x∗)} λ∗j ∂gj/∂xi (x∗) = 0,   i = 1, · · · , n.
{x ∈ Rⁿ : Ax ≤ 0} ⊂ {x ∈ Rⁿ : c · x ≤ 0}
∂f/∂xi (x) − Σ_{j∈I(x∗)} λj ∂gj/∂xi (x) = 0,   i = 1, . . . , n,   λj ≥ 0,  j ∈ I(x∗)
gj(x) − bj = 0   ∀j ∈ I(x∗)
Remark 4.2.2 The numbers λ∗j, j ∈ I(x∗) are unique. Indeed, suppose
there exist λ = (λ1, · · · , λp) and λ′ = (λ′1, · · · , λ′p), both solutions of
c = ᵗAλ   and   c = ᵗAλ′.
Since the vectors gj′(x∗) are linearly independent, we deduce that λ = λ′.
First, let us practice writing the KKT conditions through simple examples.
Solution: i) The set of constraints is reduced to the closed interval
[0, 3]. The problem consists of maximizing the real function y = (x − 2)³ = f (x),
which is increasing on R (f ′(x) = 3(x − 2)² ≥ 0). Therefore, it attains its maximal
value on [0, 3] at x = 3.
Thus we obtain
x = 3   with   (α, β) = (0, 3).
Note that only the constraint g2(x) = x is active at 3. We have
vi) Conclusion. x = 2 is not the optimal point since f (2) = 0 < 1 = f (3).
Because f is increasing, 3 is the maximum point. We can also conclude by
using the extreme value theorem since f is continuous on the closed bounded
constraint set [(g1, g2) ≤ (0, 3)].
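A one-line check of the conclusion:

Maximize[{(x - 2)^3, 0 <= x <= 3}, x]   (* {1, {x -> 3}} *)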
Solution: i) The problem describes the shortest distance of the point (a, b)
to the unit disk (here (a, b) is located outside the unit disk). This distance is
attained by the extreme value theorem since f is continuous on the constraint
set [g 1], which is a closed and bounded subset of R2 . The case (a, b) = (2, 3)
is illustrated in Figure 4.7 and a graphical solution is described in Figure 4.8
using level curves.
[Figure 4.7: the unit disk and the surface z = (x − 2)² + (y − 3)².]
∗ If x2 + y 2 < 1 then λ = 0, and then (i) and (ii) yield (x, y) = (a, b) which
leads to a contradiction since a2 + b2 > 1.
∗∗ If x² + y² = 1, then from (i) and (ii), we deduce that (x, y) = ( a/(1 + λ), b/(1 + λ) ).
By substitution in x² + y² = 1, we get
( a/(1 + λ) )² + ( b/(1 + λ) )² = 1,   λ ≥ 0  ⇐⇒  λ = √(a² + b²) − 1.
Thus, the only solution of the system is the point
(x∗, y∗) = ( a/√(a² + b²), b/√(a² + b²) )
where the constraint is active. Finally, the point is regular since we have
g′(x, y) = ( 2x  2y )   and   rank g′(x∗, y∗) = 1.
Therefore, the point is a candidate for optimality.
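A numerical sketch with the illustrated case (a, b) = (2, 3):

NMinimize[{(x - 2)^2 + (y - 3)^2, x^2 + y^2 <= 1}, {x, y}]
(* {6.7889, {x -> 0.5547, y -> 0.83205}}, i.e. the point (2, 3)/Sqrt[13],
   with multiplier λ = Sqrt[13] - 1 *)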
[Figure 4.8: level curves of f and the constraint disk; the minimum is attained on the boundary.]
ii) Slater’s condition: gj (x) is convex and there exists x̄ such that gj (x̄) <
bj , j = 1, · · · , m (with f concave).
iv) The rank condition: The constraints gi1, . . . , gip, (p ≤ m), are binding.
The rank of the matrix
[ gi1′(x∗) ;  . . . ;  gip′(x∗) ]
is equal to p.
This last case is the one we consider here in our study. These four conditions
are not equivalent to one another. For example, the uniqueness of the
Lagrange multipliers is established under the rank condition iv).
S = {(x, y) ∈ R₊ × R₊ : 2x + y ≤ 3, x + 2y ≤ 3, x + y ≤ 2}
is a closed bounded subset of R². f is continuous on S, then, by the extreme
value theorem,
∃(x∗, y∗) ∈ S such that f (x∗, y∗) = max_{(x,y)∈S} f (x, y).
So f (x∗, y∗) = max_{(x,y)∈S} f (x, y) > 0. Therefore, at the maximum point, the con-
straints x ≥ 0 and y ≥ 0 cannot be binding.
[Figure: the region S and the graph of f.]
where
S∗ = {(x, y) ∈ Ω : 2x + y ≤ 3, x + 2y ≤ 3, x + y ≤ 2}.
Set
F (x, y) = ln f (x, y) = (1/2) ln(x) + (1/4) ln(y),   (x, y) ∈ Ω.
F is well defined and we have
iii) Since F and the constraints are C 1 in Ω, to solve the problem, we write
the KKT conditions for the associated Lagrangian
1 1
L(x, y, λ1 , λ2 , λ3 ) = ln x+ ln y−λ1 (2x+y−3)−λ2 (x+2y−3)−λ3 (x+y−2),
2 4
The necessary conditions to satisfy are:
(i) Lx = 1/(2x) − 2λ1 − λ2 − λ3 = 0
(ii) Ly = 1/(4y) − λ1 − 2λ2 − λ3 = 0
(iii) λ1 ≥ 0   with λ1 = 0 if 2x + y < 3
(iv) λ2 ≥ 0   with λ2 = 0 if x + 2y < 3
(v) λ3 ≥ 0   with λ3 = 0 if x + y < 2
◦◦ if x + 2y = 3, then
** If x + y = 2, then by drawing the constraint set, we see that the only point
satisfying x + y = 2 is (x, y) = (1, 1), for which we also have 2x + y = 3 and
x + 2y = 3, with
iv) Conclusion. The only candidate point is (1, 1), and it is the maximum
point since we know that such a point exists.
Note also that the rank condition is not satisfied: the three constraints are active at (1, 1), while their gradients, being vectors of R², span a subspace of dimension at most 2 < 3.
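A numerical confirmation of the candidate (a sketch, optimizing F = ln f over the feasible set):

NMaximize[{Log[x]/2 + Log[y]/4,
   2 x + y <= 3 && x + 2 y <= 3 && x + y <= 2 && x > 0 && y > 0}, {x, y}]
(* {0., {x -> 1., y -> 1.}} *)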
Mixed Constraints
max f (x)   subject to   gj(x) = bj,  j = 1, · · · , r  (r < n)
                         hk(x) ≤ ck,  k = 1, · · · , s
We have:
where
L(x, λ, μ) = f (x) − Σ_{j=1}^r λj (gj(x) − bj) − Σ_{k=1}^s μk (hk(x) − ck).
max f (x)   subject to   gj(x) ≤ bj,  j = 1, . . . , r
                         −gj(x) ≤ −bj,  j = 1, . . . , r
                         hk(x) ≤ ck,  k = 1, . . . , s
r
r
s
∗
L (x, τ, κ, μ) = f (x)− τj (gj (x)−bj )− κj (−gj (x)+bj )− μk (hk (x)−ck )
j=1 j=1 k=1
there exist unique multipliers τj∗ , κ∗j , μ∗k such that the necessary conditions are
satisfied:
∂L∗/∂xi (x∗, τ∗, κ∗, μ∗) = ∂f/∂xi (x∗) − Σ_{j=1}^{r} τj∗ ∂gj/∂xi (x∗) + Σ_{j=1}^{r} κj∗ ∂gj/∂xi (x∗) − Σ_{k=1}^{s} μk∗ ∂hk/∂xi (x∗) = 0,  i = 1, . . . , n

τj∗ ≥ 0, with τj∗ = 0 if gj(x∗) < bj, j = 1, . . . , r
κj∗ ≥ 0, with κj∗ = 0 if −gj(x∗) < −bj, j = 1, . . . , r
μk∗ ≥ 0, with μk∗ = 0 if hk(x∗) < ck, k = 1, . . . , s.
Setting λ∗ = τ∗ − κ∗ and

L(x, λ, μ) = L∗(x, τ, κ, μ) = f(x) − Σ_{j=1}^{r} (τj − κj)(gj(x) − bj) − Σ_{k=1}^{s} μk (hk(x) − ck)

∂f/∂xi (x∗) − Σ_{j=1}^{r} λj ∂gj/∂xi (x∗) − Σ_{k=1}^{s} μk ∂hk/∂xi (x∗) = 0.
Subtracting the two equalities and using the fact that x∗ is a regular point, we obtain a contradiction:

Σ_{j=1}^{r} (λ′j − λj) ∇gj(x∗) − Σ_{k=1}^{s} (μ′k − μk) ∇hk(x∗) = 0  =⇒  (λ′, μ′) = (λ, μ).
Nonnegativity constraints

max f(x) subject to  gj(x) ≤ bj, j = 1, . . . , m;  x1 ≥ 0, . . . , xn ≥ 0.

max f(x) subject to  gj(x) ≤ bj, j = 1, . . . , m;  gj(x) ≤ 0, j = m + 1, . . . , m + n.

By applying the KKT conditions, for a regular point x, with the Lagrangian

L∗(x, λ, μ) = f(x) − Σ_{j=1}^{m} λj (gj(x) − bj) − Σ_{k=1}^{n} μk (−xk)
λj ≥ 0, with λj = 0 if gj(x) < bj, j = 1, . . . , m
μk ≥ 0, with μk = 0 if xk > 0, k = 1, . . . , n.
We deduce then:
∂L/∂xi (x, λ) = ∂f/∂xi (x) − Σ_{j=1}^{m} λj ∂gj/∂xi (x) ≤ 0 (= 0 if xi > 0),  i = 1, . . . , n

λj ≥ 0, with λj = 0 if gj(x) < bj, j = 1, . . . , m
Solved Problems
[Figure: the feasible set S.]
are
(1) Lx = 2x − 2α(x − 1) = 0
(2) Ly = 1 − 3β(y + 1)² = 0
(3) β ≥ 0 with β = 0 if (y + 1)³ < 0
∗ If (y + 1)³ < 0, then β = 0. We get a contradiction with (2), which leads to 1 = 0.
∗ If (y + 1)³ = 0, then y = −1, and by (2) again we obtain 1 = 0, which is not possible.
Thus, the KKT conditions have no solution.
Thus
max_S f(x, y) = f(2, −1) = 3.
Note that the point is not a candidate for the KKT conditions. This is because
it doesn’t satisfy the constraint qualification under which the KKT conditions
are established. In particular, the rank condition is not satisfied.
Indeed, the two constraints are active at (2, −1), but we have rank [g1′(2, −1) ; g2′(2, −1)] = 0, since
g1′(x, y) = [2(x − 2)  0],  g2′(x, y) = [0  3(y + 1)²],  so  g1′(2, −1) = [0  0]  and  g2′(2, −1) = [0  0].
i) Sketch the feasible set and write down the necessary KKT conditions.
ii) Find the point(s) solution of the KKT conditions and check their
regularity.
iii) What can you conclude about the solution of the minimization prob-
lem?
iv) Does this contradict the theorem on the necessary conditions for a
constrained candidate point?
Solution: i) The set of the constraints is the set of points on the line y = x included in the region below the line y = 2 − x, as shown in Figure 4.12.
[Figure 4.12: the feasible set: the part of the line y = x lying in the half-plane x + y ≤ 2.]
g1(x, y) = y − x,  g2(x, y) = x + y − 2.
∗∗ If x + y − 2 = 0, then
2(x − 1) + α − β = 0
1 − α − β = 0
y − x = 0
x + y − 2 = 0
=⇒ (x, y) = (1, 1) and (α, β) = (1/2, 1/2).
So, there are two solutions of the KKT conditions: (1/2, 1/2) and (1, 1).
g1′(x, y) = [−1  1],  rank(g1′(1/2, 1/2)) = rank([−1  1]) = 1.
Regularity of the point (1, 1). The two constraints are active at (1, 1). We have
[g1′(x, y) ; g2′(x, y)] = [−1 1 ; 1 1],  rank [g1′(1, 1) ; g2′(1, 1)] = rank [−1 1 ; 1 1] = 2.
iii) Conclusion. The two points are candidates for optimality. Comparing the values taken by f at these points,
f(1, 1) = 1,  f(1/2, 1/2) = 2 − 1/2 − 1/4 = 5/4 > 1,
we deduce that only (1, 1) is the candidate for minimality. However, it is not the minimum point. Indeed, we have
iv) This doesn’t contradict the theorem since KKT conditions indicate only
where to find the possible points when they exist.
S = {(x, y) : 0 ≤ x ≤ 2, x² ≤ y ≤ 4}.
[Figure: the set S between the parabola y = x² and the line y = 4, with the graph of f over S.]
∗∗ Extreme values on ∂S: Let L1, L2 and L3 be the three parts of the boundary of S defined by:
L3 = {(0, y), 0 ≤ y ≤ 4}
g′(x) = 3x² − 2x + 1.
Since g′(x) > 0 on [0, 2], g increases from g(0) = 3 to g(2) = 9; similarly, h′(x) > 0 on [0, 2], so h increases from h(0) = −1 to h(2) = 9; and ϕ′(y) < 0 on [0, 4], so ϕ decreases from ϕ(0) = 3 to ϕ(4) = −1.
∗ ∗ ∗Conclusion:
The maximal value of f on S is 9 and is attained at the point (2, 4).
The minimal value of f on S is −1 and is attained at the point (0, 4).
– Suppose y = x2 .
with no solution.
∗∗ If y = 4
[Figure: level curves of f over the set S.]
iv) Find all points that satisfy the KKT conditions. Check whether or
not each point is regular.
Solution: i) The square of the distance between (x, y) and (2, 3) is given by (x − 2)² + (y − 3)². Finding the point (x, y) ∈ S that lies closest to the point (2, 3) is therefore equivalent to solving the minimization problem

min (x − 2)² + (y − 3)²  subject to  g1(x, y) = x + y ≤ 0,  g2(x, y) = x² − 4 ≤ 0

S = {(x, y) : y ≤ −x, −2 ≤ x ≤ 2}

The level curves of f, with equations (x − 2)² + (y − 3)² = k where k ≥ 0, are circles centered at (2, 3) with radius √k; see Figure 4.16. As the radius increases, the values of f increase, so the minimum corresponds to the first circle that intersects the set S: the one whose radius equals the distance from the point (2, 3) to the line y = −x. So, only the first constraint will be active in solving the optimization problem.
[Figure 4.16: the set S, the surface z = (x − 2)² + (y − 3)², and the level circles centered at (2, 3).]

{ −2(x − 2) − 2βx = 0 ; −2(y − 3) = 0 }  =⇒  x(1 + β) = 2 and y = 3
[Figure: level curves of f centered at (2, 3) and the feasible set S.]
∗∗ If x + y = 0, then
(x, y) = (2, −2) =⇒ { λ + 4β = 0 ; 10 − λ = 0 } =⇒ (λ, β) = (10, −5/2)
(x, y) = (−2, 2) =⇒ { 8 − λ + 4β = 0 ; λ = 0 } =⇒ (λ, β) = (0, −2)
contradicting β ≥ 0.
So, the only solution of the system is the point (−1/2, 1/2).
Regularity of the candidate point (−1/2, 1/2). Note that only the constraint g1(x, y) = x + y is active at (−1/2, 1/2). We have
g1′(x, y) = [1  1],  rank(g1′(−1/2, 1/2)) = rank([1  1]) = 1.
By Theorem 2.4.2, there exists a minimum point for f on S. Thus, the candi-
date found solves the problem.
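The conclusion can be checked numerically (a sketch under the assumption that SciPy is available; the starting point is arbitrary):

    from scipy.optimize import minimize

    # Closest point of S = {x + y <= 0, x^2 <= 4} to (2, 3).
    cons = [{'type': 'ineq', 'fun': lambda p: -(p[0] + p[1])},   # x + y <= 0
            {'type': 'ineq', 'fun': lambda p: 4 - p[0]**2}]      # x^2 <= 4
    res = minimize(lambda p: (p[0] - 2)**2 + (p[1] - 3)**2,
                   x0=[0.0, 0.0], constraints=cons, method='SLSQP')
    print(res.x)   # close to (-1/2, 1/2)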
S = {(x, y, z) : x + y + z ≤ 0, 2x² + y² + z² = 1}
[Figure: the plane x + y + z = 0 and the ellipsoid 2x² + y² + z² = 1.]
(i) Lx = 2x − λ − 4μx = 0 ⇐⇒ 2x(1 − 2μ) = λ
(ii) Ly = 2y − λ − 2μy = 0 ⇐⇒ 2y(1 − μ) = λ
(iii) Lz = 2z − λ − 2μz = 0 ⇐⇒ 2z(1 − μ) = λ
(iv) Lμ = −(2x² + y² + z² − 1) = 0
(v) λ ≥ 0 with λ = 0 if x + y + z < 0.
* If x + y + z < 0, then λ = 0, and from (i), (ii), (iii) and (iv), we deduce
that
x = 0 or μ = 1/2
y = 0 or μ = 1
z = 0 or μ = 1
2x² + y² + z² = 1
We obtain the points (0, −1, 0), (0, 0, −1) and (−1/√2, 0, 0), with
rank h′(0, −1, 0) = rank h′(0, 0, −1) = rank h′(−1/√2, 0, 0) = 1.
Thus, the points are regular and candidates for optimality.
∗ If x + y + z = 0, then (with x = 0)
y² + z² = 1 and y + z = 0
(x, y, z) = (0, 1/√2, −1/√2) or (0, −1/√2, 1/√2) with (λ, μ) = (0, 1).
The two constraints are active at these points and satisfy
g(x, y, z) = x + y + z,  h(x, y, z) = 2x² + y² + z² − 1
[g′(x, y, z) ; h′(x, y, z)] = [1 1 1 ; 4x 2y 2z]
rank [g′(0, 1/√2, −1/√2) ; h′(0, 1/√2, −1/√2)] = rank [1 1 1 ; 0 √2 −√2] = 2
rank [g′(0, −1/√2, 1/√2) ; h′(0, −1/√2, 1/√2)] = rank [1 1 1 ; 0 −√2 √2] = 2
The points satisfy the constraint qualification. They are regular points and candidates for optimality.
• Suppose x ≠ 0. Then
(x, y, z) = (−2/√10, 1/√10, 1/√10) with (λ, μ) = (4/(5√10), 3/5).
It is clear also that the constraint qualification condition is satisfied, so the point is regular.
rank [g′(−2/√10, 1/√10, 1/√10) ; h′(−2/√10, 1/√10, 1/√10)] = rank [1 1 1 ; −8/√10 2/√10 2/√10] = 2.

Since
f(0, 0, −1) = f(0, −1, 0) = f(0, 1/√2, −1/√2) = f(0, −1/√2, 1/√2) = 1
we deduce that f attains its maximum value subject to the constraints at
(0, 0, −1), (0, −1, 0), (0, 1/√2, −1/√2) and (0, −1/√2, 1/√2).
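As a sanity check (a sketch assuming SciPy; since there are several maximizers, the solver simply returns whichever one its arbitrary starting point leads to):

    from scipy.optimize import minimize

    # Maximize x^2 + y^2 + z^2 subject to 2x^2 + y^2 + z^2 = 1 and x + y + z <= 0.
    cons = [{'type': 'eq',   'fun': lambda p: 2*p[0]**2 + p[1]**2 + p[2]**2 - 1},
            {'type': 'ineq', 'fun': lambda p: -(p[0] + p[1] + p[2])}]
    res = minimize(lambda p: -(p[0]**2 + p[1]**2 + p[2]**2),
                   x0=[0.1, -0.9, 0.1], constraints=cons, method='SLSQP')
    print(-res.fun, res.x)   # maximal value 1, at one of the four maximizers above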
Consequently, we can apply the second derivative test established for equality constraints by considering in the test only the active constraints at that point.
In what follows, suppose we have:
The variables are renumbered in order to make the first p columns of the matrix G′(x∗) linearly independent.
Proof. The proof follows the one seen for the case of equality constraints. We outline here the key modification that allows us to conclude with the previous proof. We assume that I(x∗) = {1, . . . , m} to avoid the case of equality constraints. Note that the positivity of λ is not assumed in the hypothesis H, in order to include both the maximization and minimization problems, as explained below. The Lagrangian introduced is used to link the values of f and g for comparison. Then, depending on its positivity or negativity on the tangent plane of the active constraints at that point, we identify whether we have a minimum or a maximum point.
then
−L(x, β) = f(x) − (−β)·(g(x) − b),  with −β ≤ 0.
So, to consider the two problems simultaneously, we can introduce the Lagrangian
L(x, λ) = f(x) − λ·(g(x) − b)
with λ ≥ 0 (resp. ≤ 0) for the maximization (resp. minimization) problem.
Step 1: We have
Thus x∗ belongs to the open set O. So, one can find ρ0 > 0 such that B_{ρ0}(x∗) ⊂ O. Then, for h ∈ Rⁿ such that x∗ + h ∈ B_{ρ0}(x∗), we have from Taylor's formula, for some τ ∈ (0, 1),

L(x∗ + h, λ∗) = L(x∗, λ∗) + Σ_{i=1}^{n} L_{xi}(x∗, λ∗) hi + (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} L_{xi xj}(x∗ + τh, λ∗) hi hj.

By assumptions, we have
L_{xi}(x∗, λ∗) = 0,  i = 1, . . . , n.
With
L(x, λ) = f(x) − Σ_{j∈I(x∗)} λj (gj(x) − bj),
we have
L(x∗, λ∗) = f(x∗) − λ∗_{i1}(g_{i1}(x∗) − b_{i1}) − . . . − λ∗_{ip}(g_{ip}(x∗) − b_{ip}) = f(x∗)
L(x∗ + h, λ∗) = f(x∗ + h) − λ∗_{i1}(g_{i1}(x∗ + h) − b_{i1}) − . . . − λ∗_{ip}(g_{ip}(x∗ + h) − b_{ip})
from which we deduce

f(x∗ + h) − f(x∗) = Σ_{k∈I(x∗)} λ∗k [gk(x∗ + h) − bk] + (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} L_{xi xj}(x∗ + τh, λ∗) hi hj.

gk(x∗ + h) − bk = gk(x∗ + h) − gk(x∗) = Σ_{j=1}^{n} ∂gk/∂xj (x∗ + τk h) hj,  τk ∈ (0, 1).
G(x¹, · · · , x^p) = ( ∂g_{ik}/∂xj (x^k) )_{p×n} = [ ∂g_{i1}/∂x1(x¹) · · · ∂g_{i1}/∂xn(x¹) ; ⋮ ; ∂g_{ip}/∂x1(x^p) · · · ∂g_{ip}/∂xn(x^p) ]

The remaining steps of the proof for equality constraints carry over using the above notations.

M = {h ∈ Rⁿ : G′(x∗)·h = 0}
(−1)² B2(1, 1) = | 0 gx(1,1) gy(1,1) ; gx(1,1) Lxx(1,1,1) Lxy(1,1,1) ; gy(1,1) Lxy(1,1,1) Lyy(1,1,1) | = | 0 1 1 ; 1 0 1 ; 1 1 0 | = 2 > 0.

We conclude that the point (1, 1) is a local maximum for the problem.
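The determinant is easy to confirm numerically (a minimal sketch assuming NumPy is available):

    import numpy as np

    # Bordered Hessian at (1, 1): rows/columns ordered (border, x, y).
    B2 = np.array([[0., 1., 1.],
                   [1., 0., 1.],
                   [1., 1., 0.]])
    print(np.linalg.det(B2))   # 2.0, so (-1)^2 B2(1, 1) > 0: local maximum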
Proof. Let x(t) ∈ C²[0, a], a > 0, be a curve on the constraint set g(x) ≤ b passing through x∗ at t = 0. Suppose that x∗ is a local maximum point for f subject to the constraint g(x) ≤ b. Then,
or
f̂(0) = f(x∗) ≥ f(x(t)) = f̂(t)  ∀t ∈ [0, a).
So f̂ is a one-variable function that has a local maximum at t = 0. Consequently, it satisfies f̂′(0) ≤ 0 and f̂″(0) ≤ 0, or equivalently
∇f(x∗)·x′(0) ≤ 0 and (d²/dt²) f(x(t))|_{t=0} ≤ 0.
We have
(d²/dt²) f(x(t)) = ᵗx′(0) H_f(x∗) x′(0) + ∇f(x∗)·x″(0).
Moreover, differentiating the relations gk(x(t)) = bk, k ∈ I(x∗), twice and denoting Λ∗ = (λ∗_{i1}, . . . , λ∗_{ip}), we obtain
ᵗx′(0) H_G(x∗) x′(0) + ∇G(x∗)·x″(0) = 0  =⇒  ᵗx′(0) ᵗΛ∗ H_G(x∗) x′(0) + ᵗΛ∗ ∇G(x∗)·x″(0) = 0.
Hence
0 ≥ (d²/dt²) f(x(t))|_{t=0} = [ᵗx′(0) H_f(x∗) x′(0) + ∇f(x∗)·x″(0)] − [ᵗx′(0) ᵗΛ∗ H_G(x∗) x′(0) + ᵗΛ∗ ∇G(x∗)·x″(0)]
= ᵗx′(0)[H_f(x∗) − ᵗΛ∗ H_G(x∗)] x′(0) + [∇f(x∗) − ᵗΛ∗ ∇G(x∗)]·x″(0)
Solved Problems
local max −f(x, y) = −(x² + y²)  s.t.  −(x + 2y) ≤ −3,  −(2x − y) ≤ −1
Consider the Lagrangian
The constraints are linear, so we can look for the candidate points by writing
the Karush-Kuhn-Tucker conditions:
(i) Lx = −2x + λ1 + 2λ2 = 0
(ii) Ly = −2y + 2λ1 − λ2 = 0
(iii) λ1 ≥ 0 with λ1 = 0 if x + 2y > 3
(iv) λ2 ≥ 0 with λ2 = 0 if 2x − y > 1.
We distinguish several cases:
• If 2x − y = 1, then
– If x + 2y > 3, then λ1 = 0. From (i) and (ii), we deduce that λ2 = x = 2y. With 2x − y = 1, we deduce (x, y) = (2/3, 1/3). But 2/3 + 2(1/3) ≯ 3. So, no solution.
– If x + 2y = 3, then with 2x − y = 1, we have (x, y) = (1, 1) and (λ1, λ2) are such that
{ λ1 + 2λ2 = 2 ; 2λ1 − λ2 = 2 }  ⇐⇒  (λ1, λ2) = (6/5, 2/5).
Regularity of the point. The two constraints are active at the point. We have
g = (g1, g2) = (−(x + 2y), −(2x − y)),
g′(x, y) = [∂g1/∂x ∂g1/∂y ; ∂g2/∂x ∂g2/∂y] = [−1 −2 ; −2 1]  =⇒  rank(g′(1, 1)) = 2.
Hence, (1, 1) is the minimum point solution; see Figure 4.18 for a geometric
interpretation of the solution.
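A numeric check of this solution (a sketch, assuming SciPy; the inequalities are written in the solver's fun(x) ≥ 0 convention):

    from scipy.optimize import minimize

    # Minimize x^2 + y^2 subject to x + 2y >= 3 and 2x - y >= 1.
    cons = [{'type': 'ineq', 'fun': lambda p: p[0] + 2*p[1] - 3},
            {'type': 'ineq', 'fun': lambda p: 2*p[0] - p[1] - 1}]
    res = minimize(lambda p: p[0]**2 + p[1]**2, x0=[2.0, 2.0],
                   constraints=cons, method='SLSQP')
    print(res.x)   # close to (1, 1)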
[Figure 4.18: level curves of f(x, y) = x² + y² and the feasible set bounded by x + 2y = 3 and 2x − y = 1; the smallest level circle meeting the set touches it at (1, 1).]
[Figure: the constraint curve [g = 0] and the graph of f.]
g (x, y) = −y −x rank(g (x, y)) = 1 for (x, y) ∈ [g = 0]
x = (1/2)λ1 − λ2,  y = λ1 + λ2,  z = (1/2)λ1 + (3/2)λ2.
We distinguish several cases:
Now, let us study the nature of the point (5, 10, 5). For this, we use the second
derivatives test since f , g1 and g2 are C 2 around this point. Since n = 3 and
p = 1 (only the constraint g1 is active), then r takes the values p + 1 = 2 to
n = 3. First, we consider the matrix
g′(x, y, z) = [−∂g1/∂x  −∂g1/∂y  −∂g1/∂z] = [−1  −2  −1]
Then rank(g′(x, y, z)) = 1. Moreover, the first column vector of g′(5, 10, 5) is nonzero, hence linearly independent, so we don't have to renumber the variables.
Next, we have to consider the sign of the following bordered Hessian determinants:

B2(5, 10, 5) = | 0 −∂g1/∂x −∂g1/∂y ; −∂g1/∂x Lxx Lxy ; −∂g1/∂y Lyx Lyy | = | 0 −1 −2 ; −1 −2 0 ; −2 0 −2 | = 10.
B3(5, 10, 5) = | 0 −∂g1/∂x −∂g1/∂y −∂g1/∂z ; −∂g1/∂x Lxx Lxy Lxz ; −∂g1/∂y Lyx Lyy Lyz ; −∂g1/∂z Lzx Lzy Lzz | = | 0 −1 −2 −1 ; −1 −2 0 0 ; −2 0 −2 0 ; −1 0 0 −2 | = −24.

Here, the partial derivatives of g1 are evaluated at the point (5, 10, 5) and the second partial derivatives of L are evaluated at the point (5, 10, 5, 10, 0).
We have
We conclude that the point (5, 10, 5) is a local maximum to the maximization
problem, or equivalently, a local minimum to the minimization problem.
Conclusion: The minimization problem has one local minimum at the point
(5, 10, 5).
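The two determinants can be confirmed numerically (a minimal sketch assuming NumPy):

    import numpy as np

    B2 = np.array([[ 0., -1., -2.],
                   [-1., -2.,  0.],
                   [-2.,  0., -2.]])
    B3 = np.array([[ 0., -1., -2., -1.],
                   [-1., -2.,  0.,  0.],
                   [-2.,  0., -2.,  0.],
                   [-1.,  0.,  0., -2.]])
    print(np.linalg.det(B2), np.linalg.det(B3))   # 10.0 and -24.0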
−λ2 = 1 − 2xλ1,  λ1(x + y) = 1,  λ1(x + z) = 1.
Note that λ1 = 0 is not possible because we would have from (1): λ2 = −1, and from (2): λ2 = 1. So λ1 ≠ 0 and we have
x + y = x + z = 1/λ1  =⇒  y = z.
∗ If x − y − z > 1, then λ2 = 0. We deduce that 1/λ1 = 2x, thus x + y = x + z = 1/λ1 = 2x. So x = y = z, which inserted into (4) gives 3x² = 1. Hence, we have two points
(x, y, z) = (1/√3, 1/√3, 1/√3) or (−1/√3, −1/√3, −1/√3).
But they do not satisfy x − y − z > 1.
We deduce then
(x, y, z) = (−1/3, −2/3, −2/3) with λ1 = −1, λ2 = −1/3.
g′(1, 0, 0) = [2 0 0 ; −1 1 1],  g′(−1/3, −2/3, −2/3) = [−2/3 −4/3 −4/3 ; −1 1 1].
Then
rank(g′(1, 0, 0)) = rank(g′(−1/3, −2/3, −2/3)) = 2.
The two points are regular. Moreover, we remark that the first two column
vectors are linearly independent and we will not renumber the variables.
iii) Classification of the points. Now, let us study the nature of the points
(1, 0, 0) and (− 13 , − 23 , − 23 ). For this we use the second derivatives test since
f , g1 and g2 are C 2 around these points. We have to consider the sign of the
following bordered Hessian determinant:
B3(x, y, z) = | 0 0 ∂g1/∂x ∂g1/∂y ∂g1/∂z ; 0 0 −∂g2/∂x −∂g2/∂y −∂g2/∂z ; ∂g1/∂x −∂g2/∂x Lxx Lxy Lxz ; ∂g1/∂y −∂g2/∂y Lyx Lyy Lyz ; ∂g1/∂z −∂g2/∂z Lzx Lzy Lzz |
= | 0 0 2x 2y 2z ; 0 0 −1 1 1 ; 2x −1 −2λ1 0 0 ; 2y 1 0 −2λ1 0 ; 2z 1 0 0 −2λ1 |.
The first partial derivatives of g1 and g2 are evaluated at (x, y, z). The second
partial derivatives of L are evaluated at (x, y, z, λ1 , λ2 ).
B3(−1/3, −2/3, −2/3) = | 0 0 −2/3 −4/3 −4/3 ; 0 0 −1 1 1 ; −2/3 −1 2 0 0 ; −4/3 1 0 2 0 ; −4/3 1 0 0 2 | = 16,  (−1)² B3 = 16 > 0.

max_{g1=1, g2≥1} f = f(1, 0, 0) = 1,  min_{g1=1, g2≥1} f = f(−1/3, −2/3, −2/3) = −5/3.
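These two values can be reproduced numerically (a sketch assuming SciPy, reading the second constraint as x − y − z ≥ 1; starting points are chosen near each solution):

    from scipy.optimize import minimize

    cons = [{'type': 'eq',   'fun': lambda p: p[0]**2 + p[1]**2 + p[2]**2 - 1},
            {'type': 'ineq', 'fun': lambda p: p[0] - p[1] - p[2] - 1}]
    fmax = minimize(lambda p: -(p[0] + p[1] + p[2]), x0=[1.0, 0.1, -0.1],
                    constraints=cons, method='SLSQP')
    fmin = minimize(lambda p: p[0] + p[1] + p[2], x0=[-0.3, -0.6, -0.6],
                    constraints=cons, method='SLSQP')
    print(-fmax.fun, fmax.x)   # 1,    near (1, 0, 0)
    print(fmin.fun, fmin.x)    # -5/3, near (-1/3, -2/3, -2/3)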
(1) Lx = 1 − 2xλ1 − λ2 = 0
(2) Ly = 1 − 2yλ1 + λ2 = 0
(3) Lz = 1 − 2zλ1 + λ2 = 0
(4) Lλ1 = −(x² + y² + z² − 1) = 0
(5) λ2 = 0 if x − y − z < 1.
λ2 = 1 − 2xλ1,  λ1(x + y) = 1,  λ1(x + z) = 1.
Note that λ1 = 0 is not possible because we would have from (1): λ2 = −1, and from (2): λ2 = 1. So λ1 ≠ 0 and we have
x + y = x + z = 1/λ1  =⇒  y = z.
∗ If x − y − z < 1, then λ2 = 0. We deduce that 1/λ1 = 2x, thus x + y = x + z = 1/λ1 = 2x. So x = y = z, which inserted into (4) gives 3x² = 1.
Hence, we have two solutions
(x, y, z) = (1/√3, 1/√3, 1/√3) with λ1 = √3/2, λ2 = 0
(x, y, z) = (−1/√3, −1/√3, −1/√3) with λ1 = −√3/2, λ2 = 0.
(x, y, z) = (−1/3, −2/3, −2/3) with λ1 = −1, λ2 = 1/3.
g1′(1/√3, 1/√3, 1/√3) = [2/√3  2/√3  2/√3] = −g1′(−1/√3, −1/√3, −1/√3)
rank(g1′(1/√3, 1/√3, 1/√3)) = rank(g1′(−1/√3, −1/√3, −1/√3)) = 1.
g′(1, 0, 0) = [2 0 0 ; 1 −1 −1],  g′(−1/3, −2/3, −2/3) = [−2/3 −4/3 −4/3 ; 1 −1 −1]
rank(g′(1, 0, 0)) = rank(g′(−1/3, −2/3, −2/3)) = 2.
The four points are regular. Moreover, we will not have to renumber the variables since the first two column vectors of each derivative above are linearly independent.
iii) Classification of the points (±1/√3, ±1/√3, ±1/√3).
Here n = 3, p = 1; thus we have to consider the sign of the following bordered Hessian determinants:
B2 = | 0 2x 2y ; 2x −2λ1 0 ; 2y 0 −2λ1 |,  B3 = | 0 2x 2y 2z ; 2x −2λ1 0 0 ; 2y 0 −2λ1 0 ; 2z 0 0 −2λ1 |.
We have
B2(1/√3, 1/√3, 1/√3) = 8/√3,  B3(1/√3, 1/√3, 1/√3) = −12
(−1)^r Br(1/√3, 1/√3, 1/√3) > 0,  r = 2, 3.
Thus, the point is a local maximum since λ1 = √3/2 > 0 and (−1)² B2 > 0, (−1)³ B3 > 0.
B2(−1/√3, −1/√3, −1/√3) = −8/√3,  B3(−1/√3, −1/√3, −1/√3) = −12
(−1)¹ Br(−1/√3, −1/√3, −1/√3) > 0,  r = 2, 3.
Thus, the point is a local minimum since λ1 = −√3/2 < 0 and (−1)¹ B2 > 0, (−1)¹ B3 > 0.
iv) Classification of the points (1, 0, 0), (−1/3, −2/3, −2/3).
Here n = 3, p = 2; thus we have to consider the sign of the following bordered Hessian determinant:
B3(x, y, z) = | 0 0 2x 2y 2z ; 0 0 1 −1 −1 ; 2x 1 −2λ1 0 0 ; 2y −1 0 −2λ1 0 ; 2z −1 0 0 −2λ1 |.
(0  k  −k) [−2 0 0 ; 0 −2 0 ; 0 0 −2] ᵗ(0, k, −k) = −4k² ≤ 0 on M.

max_{g1=1, g2≤1} f(x, y, z) = √3  and  min_{g1=1, g2≤1} f(x, y, z) = −√3.
Then, we have
Proof. i) First implication. The point x∗ is a critical point for the Lagrangian L(·, λ∗) (∇x L(x∗, λ∗) = 0) and L(·, λ∗) is concave on the convex set S; then x∗ is a global maximum for L(·, λ∗) on S (by Theorem 2.3.4). Thus, we have
L(x∗, λ∗) = f(x∗) ≥ f(x) − λ1∗(g1(x) − b1) − . . . − λm∗(gm(x) − bm) = L(x, λ∗).
For each j = 1, . . . , m, we also have λj∗ ≥ 0 and gj(x) − bj ≤ 0, so −λj∗(gj(x) − bj) ≥ 0. Therefore,
ii) Second implication. This part can be deduced similarly. Moreover, it suggests, for example, that when looking for candidates for a maximization problem, we can keep the points with negative Lagrange multipliers and check whether they are global minimum points, without maximizing (−f) and introducing another Lagrangian.
L(x, y, z, λ) = x² + y² + z² − λ(x − 2z + 5)
Let us solve the system
(i) Lx = 2x − λ = 0
(ii) Ly = 2y = 0
(iii) Lz = 2z + 2λ = 0
(iv) λ = 0 if x − 2z + 5 < 0.
∗ If x − 2z + 5 < 0 , then λ = 0. From the equations (i), (ii) and (iii), we
deduce that (x, y, z) = (0, 0, 0). But, then the inequality x − 2z + 5 < 0 is
not satisfied.
Hence, L(·, −2) is strictly convex in (x, y, z), and we conclude that the point (−1, 0, 2) is the solution to the constrained minimization problem.
The maximization problem doesn't have a solution, since there is only one solution to the system and it is a global minimum point.
Interpretation. The problem looks for the shortest and farthest distance from the origin to the space region located below the plane x − 2z + 5 = 0. The shortest distance is attained on the plane.
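A numeric confirmation of the interpretation (a sketch, assuming SciPy is available):

    from scipy.optimize import minimize

    # Minimize x^2 + y^2 + z^2 subject to x - 2z + 5 <= 0.
    res = minimize(lambda p: p[0]**2 + p[1]**2 + p[2]**2, x0=[0.0, 0.0, 3.0],
                   constraints=[{'type': 'ineq',
                                 'fun': lambda p: -(p[0] - 2*p[2] + 5)}],
                   method='SLSQP')
    print(res.x, res.fun)   # close to (-1, 0, 2), squared distance 5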
Remark 4.4.1 The rank condition, at the point x∗ , is not assumed in the
theorem. The proof uses the characterization of a C 1 convex function on a
convex set only.
Example 2. In Example 4, Section 4.2, the point (1, 1) doesn’t satisfy the
rank condition. It solves the KKT conditions related to the problem with
linear constraints:
is negative definite since the leading principal minors are such that
D1(x, y) = −1/x² < 0,  D2(x, y) = 1/(x²y²) > 0
for (x, y) ∈ Ω = (0, +∞) × (0, +∞). So the Lagrangian is strictly concave in (x, y) ∈ Ω, and (1, 1) is the maximum point.
ii) Note that, on the boundary [g = 4] of the constraint set, we have y = 4/x and f takes the values
f(x, 4/x) = 4x + 12/x − 4
and
lim_{x→+∞} f(x, 4/x) = +∞  and  lim_{x→−∞} f(x, 4/x) = −∞.
Hence f attains neither an absolute maximum nor an absolute minimum value on the constraint set.
f(√3, 4/√3) = 24/√3 − 4,  f(−√3, −4/√3) = −24/√3 − 4.
With f(√3, 4/√3) > f(−√3, −4/√3), (√3, 4/√3) being a local minimum and (−√3, −4/√3) being a local maximum, we can see that (√3, 4/√3) cannot be a global minimum and (−√3, −4/√3) cannot be a global maximum. A constrained global extreme point would be a local one, since any point of the set of the constraints g = 4 is an interior point and regular.
no constraints:  f′(x) = 0
with constraints:  F(x, λ) = 0
On the other hand, solving a nonlinear equation is not easy, even when F is a polynomial of degree 3 in one variable.
These two points are the starting point for the development of numerical methods for approaching the solution with accuracy (see [17], [19], [8], [4]).
*** The proofs we studied for optimization problems in Euclidean space constitute a natural step toward more complex ones developed in the calculus of variations, where the maximum and minimum are searched for in a class of functions and where the objective function is defined on that class (see [16], [6], [9]).
Solved Problems
Solution: i) Let ᵗa = (a1, . . . , an), ᵗx = (x1, . . . , xn). The minimization problem looks for points in the region above the hyperplane
ᵗa·x = b ⇐⇒ a1x1 + a2x2 + . . . + anxn = b

L_{x1} = −2x1 + λa1 = 0
⋮
L_{xn} = −2xn + λan = 0
λ ≥ 0 with λ = 0 if a1x1 + a2x2 + . . . + anxn > b.

Finding a candidate.
∗ If a1x1 + a2x2 + . . . + anxn = b, then
xi = (λ/2) ai,  i = 1, . . . , n,
which, inserted in the equation of the hyperplane, gives
a1(λ/2)a1 + a2(λ/2)a2 + . . . + an(λ/2)an = b ⇐⇒ λ/2 = b/‖a‖².
Hence, a solution to the system is
xi = (b/‖a‖²) ai, i = 1, . . . , n ⇐⇒ x = (b/‖a‖²) a.
To study the concavity of L in x when λ = 2b/‖a‖², consider the Hessian matrix
H_{L(·,λ)}(x) = [ L_{x1x1} · · · L_{x1xn} ; ⋮ ; L_{xnx1} · · · L_{xnxn} ] = [ −2 · · · 0 ; ⋮ ; 0 · · · −2 ]
The leading principal minors are equal to Dk(x) = (−2)^k, k = 1, . . . , n. The matrix is negative definite. Thus, the point maximizes −‖x‖² subject to the constraint ᵗa·x ≥ b. Hence, the point solves the minimization problem, and the minimal distance from the origin to this point is equal to
‖(b/‖a‖²) a‖ = |b|/‖a‖.
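The closed-form candidate is easy to test on concrete data (a sketch assuming NumPy; the vector a and scalar b below are hypothetical values chosen only for illustration):

    import numpy as np

    a = np.array([1.0, 2.0, 2.0]); b = 6.0    # hypothetical data, ||a|| = 3
    x_star = (b / a.dot(a)) * a               # the candidate x = (b/||a||^2) a
    print(a.dot(x_star))                      # equals b: the point lies on the hyperplane
    print(np.linalg.norm(x_star))             # |b|/||a|| = 2, the minimal distance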
Moreover, we have
Thus
min_{−x+y≥2} √(x² + y²) = 2/√2 = √2
and is attained at (x∗, y∗) = (−1, 1). Hence
min_{−x+y≥2} (5 + √(x² + y²)) = 5 + √2.
β) We have
Thus
max_{2x−y+2z≤−1} (−6√(x² + y² + z²) + 4) = 4 − 6/‖ᵗ(2, −1, 2)‖ = 4 − 6/√9 = 2,
and is attained at (x∗, y∗, z∗) = (1/9)(−2, 1, −2).
α) −y ≤ −2, x ≥ 0  β) x − y ≥ 2, x ≥ 0, y ≥ 0
γ) −x + y ≥ 2, x ≥ 0, y ≥ 0  δ) x + y ≥ 2, x ≥ 0, y ≥ 0
and that are closest to the origin. It is a nonlinear minimization problem with
inequality constraints. We introduce the Lagrangian
L_{x1} = −2x1 + λa1 ≤ 0 (= 0 if x1 > 0)
⋮
L_{xn} = −2xn + λan ≤ 0 (= 0 if xn > 0)
λ ≥ 0 with λ = 0 if a1x1 + a2x2 + . . . + anxn > b.
Finding a candidate.
a1(λ/2)a1⁺ + a2(λ/2)a2⁺ + . . . + an(λ/2)an⁺ = b ⇐⇒ λ/2 = b/‖a⁺‖²,
where a⁺ denotes the vector of positive parts ai⁺ = max(ai, 0). For each set, one computes ᵗa, ᵗa⁺ and the corresponding candidate (x∗, y∗). One can easily check the minimal distance from the origin to the given sets from the graphics in Figure 4.21.
[Figure 4.21: the four feasible sets of α)–δ) and their closest points to the origin.]
iv) Determine whether or not the point(s) in part ii) satisfy the second-
order sufficient condition.
Solution: i) The feasible set is the plane region located above the curve and
the two lines, as described in Figure 4.22.
[Figure 4.22: the feasible set bounded by the lines y = 3x, y = −3x and the parabola y = 4 − x².]
ii) Writing the KKT conditions. The problem is equivalent to the following maximization problem
max (−x² − y²) subject to
g1(x, y) = 4 − x² − y ≤ 0
g2(x, y) = 3x − y ≤ 0
g3(x, y) = −3x − y ≤ 0.
{ −2x − 3β + 3γ = 0 ; −2(3x) + β + γ = 0 }  =⇒  6γ = 20x and 3β = 8x ≥ 0.
•• If 4 − x² − y = 0, then
{ −2x + 2λx = 0 ; −2y + λ = 0 }  ⇐⇒  { 2x(−1 + λ) = 0 ⇐⇒ x = 0 or λ = 1 ;  2y = λ ≥ 0 }
◦ λ = 1 leads to y = 1/2 and x = ±√(7/2). But, for (x, y) = (√(7/2), 1/2), the inequality 3x − y ≤ 0 is not satisfied, and for (x, y) = (−√(7/2), 1/2), the inequality −3x − y ≤ 0 is not satisfied. So we cannot have λ = 1.
From 4 − x² − y = 0, we have
The point (4, −12) doesn't satisfy the inequality 3x − y ≤ 0, so it cannot be a solution. The point (−4, −12) doesn't satisfy the inequality −3x − y ≤ 0, so it cannot be a candidate.
The point (1, 3) satisfies the inequality −3x − y < 0; thus γ = 0, and we have
{ 2λ − 3β = 2 ; λ + β = 6 }  =⇒  (λ, β) = (4, 2).
Regularity of the candidate point (0, 4). Only the constraint g1(x, y) = 4 − x² − y is active at (0, 4) and we have
g1′(x, y) = [−2x  −1],  g1′(0, 4) = [0  −1],  rank(g1′(0, 4)) = 1.
Regularity of the candidate point (−1, 3). Only the constraints g1(x, y) = 4 − x² − y and g3(x, y) = −3x − y are active at (−1, 3) and we have
[g1′(x, y) ; g3′(x, y)] = [−2x −1 ; −3 −1],  [g1′(−1, 3) ; g3′(−1, 3)] = [2 −1 ; −3 −1].
Thus the point (−1, 3) is a regular point since rank [g1′(−1, 3) ; g3′(−1, 3)] = 2.
Regularity of the candidate point (1, 3). Only the constraints g1(x, y) = 4 − x² − y and g2(x, y) = 3x − y are active at (1, 3) and we have
[g1′(x, y) ; g2′(x, y)] = [−2x −1 ; 3 −1],  [g1′(1, 3) ; g2′(1, 3)] = [−2 −1 ; 3 −1].
Thus the point (1, 3) is a regular point since rank [g1′(1, 3) ; g2′(1, 3)] = 2.
iv) With p = 2 (the number of active constraints) at the points (−1, 3) and (1, 3), and n = 2 (the dimension of the space), we have p = n. The second derivatives test cannot be applied since it is established for p < n.
For the point (0, 4), we have p = 1 < 2 = n. We consider the following determinant (r = p + 1 = 2). (Note that the first column vector of [g1′(x, y)] vanishes at (0, 4), so we have to renumber the variables.)
B2(x, y) = | 0 ∂g1/∂y ∂g1/∂x ; ∂g1/∂y Lyy Lyx ; ∂g1/∂x Lxy Lxx | = | 0 −1 −2x ; −1 −2 0 ; −2x 0 −2 + 2λ |
We have (−1)² B2(0, 4) = −14 < 0. So the second derivatives test is not satisfied at (0, 4).
v) Let us explore the concavity and convexity of L with respect to (x, y), where the Hessian matrix of L in (x, y) is
H_L = [Lxx Lxy ; Lyx Lyy] = [−2 + 2λ 0 ; 0 −2]
When λ = 8 or 4, the principal minors are
On the lines y = ±3x, with |x| ≥ 1, the function f(x, y) = x² + y² takes the values
Thus,
So we can conclude that the minimum value attained by f on the set of the constraints is 10.
S = {(x, y) : 4 − x² − y ≤ 0, 3x − y ≤ 0, −3x − y ≤ 0}
The level curves of f, with equations x² + y² = k where k ≥ 0, are circles centered at (0, 0) with radius √k; see Figure 4.23.
If we increase the values of the radius, the values of f increase. The value k = 10 is the first one at which the level curve intersects the constraint set, at the points where g1 = g2 = 0 and g1 = g3 = 0. Thus the value 10 is the minimal value of f, reached at (±1, 3).
[FIGURE 4.23: Level curves of f and the closest points of S to the origin.]
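A numeric check of this geometric conclusion (a sketch, assuming SciPy; a start with x > 0 steers the solver to (1, 3), the point (−1, 3) being symmetric):

    from scipy.optimize import minimize

    cons = [{'type': 'ineq', 'fun': lambda p: p[0]**2 + p[1] - 4},   # 4 - x^2 - y <= 0
            {'type': 'ineq', 'fun': lambda p: p[1] - 3*p[0]},        # 3x - y <= 0
            {'type': 'ineq', 'fun': lambda p: p[1] + 3*p[0]}]        # -3x - y <= 0
    res = minimize(lambda p: p[0]**2 + p[1]**2, x0=[0.5, 4.0],
                   constraints=cons, method='SLSQP')
    print(res.fun, res.x)   # 10 at (1, 3)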
4. – The data in Table 4.5 can be found in [8]. Here we consider boundary conditions to illustrate an inequality-constrained problem.
The Body Fat Index (BFI) measures the fitness of an individual. It is a function of the body density ρ (in units of kilograms per liter) according to Brozek's formula,
BFI = 457/ρ − 414.2.
However, the accurate measurement of ρ is costly. An alternative solution is to try to describe the dependence of the BFI with respect to five variables x1, x2, x3, x4, x5 in the form
f : x ↦ BFI = y = f(x) = a1x1 + a2x2 + a3x3 + a4x4 + a5x5
where xᵢ = (xi1, xi2, xi3, xi4, xi5) are the measurements for the i-th individual.
Table 4.5:
x1      x2     x3    x4    x5    y
154.25 67.75 85.2 17.1 36.2 12.6
173.25 72.25 83 18.2 38.5 6.9
154 66.25 87.9 16.6 34 24.6
184.75 72.25 86.4 18.2 37.4 10.9
184.25 71.25 100 17.7 34.4 27.8
210.25 74.75 94.4 18.8 39 20.6
181 69.75 90.7 17.7 36.4 19
176 72.5 88.5 18.8 37.8 12.8
191 74 82.5 18.2 38.1 5.1
198.25 73.5 88.6 19.2 42.1 12
with(Optimization) : LSSolve(
[154.25a1 + 67.75a2 + 85.2a3 + 17.1a4 + 36.2a5 − 12.6, 173.25a1 + 72.25a2 + 83a3 + 18.2a4 +
38.5a5 − 6.9,
154a1 + 66.25a2 + 87.9a3 + 16.6a4 + 34a5 − 24.6, 184.75a1 + 72.25a2 + 86.4a3 + 18.2a4 +
37.4a5 − 10.9,
184.25a1 + 71.25a2 + 100a3 + 17.7a4 + 34.4a5 − 27.8, 210.25a1 + 74.75a2 + 94.4a3 + 18.8a4 +
39a5 − 20.6,
181a1 +69.75a2 +90.7a3 +17.7a4 +36.4a5 −19, 176a1 +72.5a2 +88.5a3 +18.8a4 +37.8a5 −12.8,
191a1 +74a2 +82.5a3 +18.2a4 +38.1a5 −5.1, 198.25a1 +73.5a2 +88.6a3 +19.2a4 +42.1a5 −12],
{180.7a1 + 71.425a2 + 88.72a3 + 18.05a4 + 37.39a5 ≤ 15.23})
[15.0549945448635683, [a1 = 0.474753096134219e − 1, a2 = −1.03634130223772,
a3 = 1.22920301075594, a4 = −1.86308283592359, a5 = .140089140413700]]
Thus
f(x1, x2, x3, x4, x5) ≈ 0.047x1 − 1.036x2 + 1.229x3 − 1.863x4 + 0.140x5
and f can be used to predict an individual's body fat index, based upon the five measurement types.
– When the residuals in the objective function and the constraints are all linear, which is the case here, an active set method is used. This is an approximate method [19], [22], [17].
– The LSSolve command uses various methods implemented in a built-in library of numerical algorithms.
with(Optimization) :
c := Vector([12.6, 6.9, 24.6, 10.9, 27.8, 20.6, 19, 12.8, 5.1, 12], datatype = float) :
G := Matrix([[154.25, 67.75, 85.2, 17.1, 36.2], [173.25, 72.25, 83, 18.2, 38.5],
[154, 66.25, 87.9, 16.6, 34], [184.75, 72.25, 86.4, 18.2, 37.4],
[184.25, 71.25, 100, 17.7, 34.4], [210.25, 74.75, 94.4, 18.8, 39],
[181, 69.75, 90.7, 17.7, 36.4], [176, 72.5, 88.5, 18.8, 37.8],
[191, 74, 82.5, 18.2, 38.1], [198.25, 73.5, 88.6, 19.2, 42.1]], datatype = float) :
with(Statistics) :
A := Mean(G) :
b := Mean(c) :
A := Matrix([[180.7, 71.425, 88.72, 18.05, 37.39]], datatype = float) :
b := Vector([15.23], datatype = float) :
lc := [A, b] :
LSSolve([c, G], lc) :
[15.0549945448635683, [0.0474753096134219, −1.03634130223772, 1.22920301075594, −1.86308283592359, 0.140089140413700]]
Hence, we obtain the same coefficients ai .
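For readers without Maple, the same constrained least-squares fit can be sketched in Python (assuming NumPy/SciPy; a general-purpose solver is used here rather than Maple's active-set method, so the digits may differ slightly in the last places):

    import numpy as np
    from scipy.optimize import minimize

    G = np.array([[154.25, 67.75,  85.2, 17.1, 36.2],
                  [173.25, 72.25,  83.0, 18.2, 38.5],
                  [154.00, 66.25,  87.9, 16.6, 34.0],
                  [184.75, 72.25,  86.4, 18.2, 37.4],
                  [184.25, 71.25, 100.0, 17.7, 34.4],
                  [210.25, 74.75,  94.4, 18.8, 39.0],
                  [181.00, 69.75,  90.7, 17.7, 36.4],
                  [176.00, 72.50,  88.5, 18.8, 37.8],
                  [191.00, 74.00,  82.5, 18.2, 38.1],
                  [198.25, 73.50,  88.6, 19.2, 42.1]])
    c = np.array([12.6, 6.9, 24.6, 10.9, 27.8, 20.6, 19.0, 12.8, 5.1, 12.0])
    A = G.mean(axis=0)   # the mean row (180.7, 71.425, 88.72, 18.05, 37.39)

    # Minimize (1/2)||G a - c||^2 subject to A . a <= 15.23.
    res = minimize(lambda a: 0.5*np.sum((G @ a - c)**2), x0=np.zeros(5),
                   constraints=[{'type': 'ineq', 'fun': lambda a: 15.23 - A @ a}],
                   method='SLSQP')
    print(res.fun, res.x)   # about 15.055 and (0.0475, -1.036, 1.229, -1.863, 0.140)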
ϕ(a) = (1/2)‖G·a − c‖² = (1/2)(G·a − c)·(G·a − c)
= (1/2)(‖G·a‖² − 2 ᵗc·G·a + ‖c‖²) = (1/2)(ᵗa ᵗG G·a − 2 ᵗc·G·a + ‖c‖²)
ϕ′(a) = ᵗG G·a − ᵗG·c, and the Hessian of ϕ is ϕ″(a) = ᵗG G.
Checking that the Hessian is positive definite:
with(LinearAlgebra) :
H := Multiply(Transpose(G), G) :
IsDefinite(H)
true
To set the main result of this section, we suppose the objective function f and the constraint function g depend on a parameter r ∈ R^k, i.e.
f(x, r) = f(x1, . . . , xn, r1, . . . , rk),
g(x, r) = g(x1, . . . , xn, r1, . . . , rk),  g = (g1, . . . , gm),
I(x(r)) = {i ∈ {1, · · · , m} : gi(x(r), r) = 0}.
Consider the problem (Pr)
f∗(r) = local max f(x, r) (resp. local min) s.t. g(x, r) ≤ 0
and introduce the Lagrangian
L(x, λ, r) = f(x, r) − λ1 g1(x, r) − . . . − λm gm(x, r).
rank(G′(x∗, r̄)) = p,  G′(x∗, r̄) = (g′_{i1}(x∗, r̄), . . . , g′_{ip}(x∗, r̄))
∂f∗/∂rj (r) = ∂L/∂rj (x(r), λ(r), r),  j = 1, . . . , k,
where
∂f∗/∂rj (r̄) = ∂L/∂rj (x, λ, r) |_{x=x∗, λ=λ∗, r=r̄},  j = 1, . . . , k.
= f(x(r), r) − Σ_{i∈I(x(r))} λi(r) gi(x(r), r) − Σ_{i∉I(x(r))} λi(r) gi(x(r), r)
= f(x(r), r) = f∗(r)
because gi(x(r), r) = 0 for i ∈ I(x(r)) and λi(r) = 0 for i ∉ I(x(r)); then, using the Chain rule formula, we obtain
∂f∗/∂rj (r) = ∂(L(x(r), λ(r), r))/∂rj
= Σ_{i=1}^{n} ∂L/∂xi (x(r), λ(r), r) ∂xi/∂rj (r) + Σ_{t=1}^{m} ∂L/∂λt (x(r), λ(r), r) ∂λt/∂rj (r) + Σ_{l=1}^{k} ∂L/∂rl (x(r), λ(r), r) ∂rl/∂rj
= Σ_{i=1}^{n} [ ∂f/∂xi (x(r), r) − λ1(r) ∂g1/∂xi (x(r), r) − . . . − λm(r) ∂gm/∂xi (x(r), r) ] ∂xi/∂rj (r)
+ Σ_{t=1}^{m} [ −gt(x(r), r) ] ∂λt/∂rj (r) + ∂L/∂rj (x(r), λ(r), r).
Since x(r) optimizes f(x, r) subject to the constraints g(x, r) ≤ 0, the necessary condition gives, for each i = 1, . . . , n,
∂f/∂xi (x(r), r) − λ1(r) ∂g1/∂xi (x(r), r) − . . . − λm(r) ∂gm/∂xi (x(r), r) = 0.
Now, since we have gt(x(r), r) = 0 for t ∈ I(x(r)) and λt(r) = 0 for t ∉ I(x(r)), then
Σ_{t=1}^{m} [ −gt(x(r), r) ] ∂λt/∂rj (r) = 0.
Hence
∂f∗/∂rj (r) = ∂L/∂rj (x(r), λ(r), r).
– (x∗, λ∗p, r̄) ∈ Ω × R^p × B(r̄, δ), so (x∗, λ∗p, r̄) is an interior point
– det(∇_{x,λp} F(x∗, λ∗p, r̄)) = det [ (L_{xi xj}) −G′(x∗, r̄) ; −ᵗG′(x∗, r̄) 0 ] = (−1)^{2p} Bn(x∗, λ∗p, r̄) ≠ 0,  Bn: bordered Hessian determinant.
Remark 4.5.2 * In the theorem above, the local max(min) problem can
be replaced by the max(min) problem, provided we assume, for example,
Example 1. Suppose that when a firm produces and sells x units of a com-
modity, it has a revenue R(x) = x, while the cost is C(x) = x2 .
i) Find the optimal choice of units of the commodity that maximize profit.
ii) Find the approximate change of the optimal profit if the revenue changes
to 0.99x.
Since the set of the constraints S = (0, +∞) is an open set and the profit function is regular, the optimal point, if it exists, is a critical point, solution of the equation
dP/dx = 1 − 2x = 0 ⇐⇒ x = 1/2.
Moreover, we have
d²P/dx² = −2 and d²P/dx² < 0 ∀x ∈ S.
Then P is strictly concave on the convex set S. Hence, the only critical point x = 1/2 is a global maximum point. Thus x∗ = 1/2 units should be produced to achieve maximum profit.
ii) Introduce the new profit function with the new revenue rx where r > 0 :
1. For r close to 1, we have d²P/dx² (x, r) = −2 < 0. Thus P(·, r) is concave in x.
2. The second order condition for strict maximality is satisfied when r = 1.
3. P(1/2, 1) = max_S P(x, 1) = 1/2 − 1/4 = 1/4.
As a consequence,
– ∃η > 0 such that the function P∗(r) = max_{x∈S} P(x, r) is defined for any r ∈ (1 − η, 1 + η)
– P∗ is C¹ and
dP∗/dr (1) = ∂P/∂r (x, r) |_{x=1/2, r=1} = x |_{x=1/2, r=1} = 1/2.
We can write the following approximation
P∗(r) ≈ P∗(1) + dP∗/dr (1)(r − 1) = 1/4 + (1/2)(r − 1) for r close to 1.
In particular, for r = 0.99, the objective function P∗ takes the approximate value P∗(0.99) ≈ 1/4 + (1/2)(−0.01) = 0.245, and the approximate change in the maximum profit is
P∗(0.99) − P∗(1) ≈ 0.5(−0.01) = −0.005.
∗ Note that, for this example, we easily have the exact value of the objective function P∗; see Figure 4.24. Indeed, we have
P∗(r) = P(x∗(r), r) = P(r/2, r) = r(r/2) − (r/2)² = r²/4
from which we deduce
P∗(0.99) = (0.99)²/4 = 0.245025
We also have the following equality
dP∗/dr = r/2 = x∗(r) = ∂P(x, r)/∂r |_{x=x∗(r)}.
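The agreement between the exact value and the envelope approximation is immediate to verify (a trivial sketch in Python):

    P_star = lambda r: r**2 / 4                # exact optimal value function
    print(P_star(0.99))                        # 0.245025
    print(P_star(1.0) + 0.5*(0.99 - 1.0))      # 0.245, the linear approximation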
[Figure 4.24: graphs of y = x − x² and y = 0.99x − x².]
∂f∗/∂bj (b̄) = ∂L/∂bj (x, λ, b) |_{x=x∗, λ=λ∗, b=b̄} = λj(b̄),  j = 1, . . . , m.
This tells us that the Lagrange multiplier λj = λj(b̄) for the j-th constraint is the rate of change at which the optimal value function changes with respect to the parameter bj at the point b̄.
Using the linear approximation formula,
f∗(b) − f∗(b̄) ≈ ∂f∗/∂b1 (b̄)(b1 − b̄1) + · · · + ∂f∗/∂bm (b̄)(bm − b̄m),
the change in the optimal value function is estimated when one or more components of the resource vector are slightly changed.
∂f∗/∂b (3) = ∂L/∂b (x(b), y(b), z(b), λ(b), b) |_{b=3} = λ(b) |_{b=3} = 2
Solved Problems
Solution: This example shows that the optimal value function is not neces-
sarily regular. Indeed, set
We have
dy
y =
= fx (x, r) = 2(x − r).
dx
We distinguish different cases:
x −1 r 1
y = 2(x − r) − +
y = (x − r)2 (1 + r)2 0 (1 − r)2
x −1 1 r
y = 2(x − r) − −
y = (x − r)2 (1 + r)2 (1 − r)2 0
x r −1 1
y = 2(x − r) + +
y = (x − r)2 0 (1 + r)2 (1 − r)2
(f∗(r) − f∗(0)) / (r − 0) = { ((1 − r)² − 1)/r = −(2 − r) if r < 0 ;  ((1 + r)² − 1)/r = 2 + r if r > 0 }
Hence the one-sided derivatives of f∗ at 0 differ (−2 from the left, 2 from the right), so f∗ is not differentiable at 0.
This doesn't contradict the theorem, since the regularity of f∗ was proved when x∗ is an interior point for f, which is not the case here with x∗ = ±1. Indeed, we have f(x, 0) = x² and f∗(0) = f(±1, 0) = 1.
So f attains its minimal value at the interior point 0, where the second derivatives test is satisfied. Moreover, f is convex on [−1, 1], which makes 0 the global minimum point. Therefore, for r close to 0, that is, ∃η > 0 such that g∗ ∈ C¹(−η, η). In fact, from i), we have exactly g∗(r) = 0 for r ∈ (−1, 1), which is a regular function.
max_{R²} ((1.05)² x + 5y sin(0.01) − 2x² − 3y²)
Solving max_{R²} (x − 2x² − 3y²).
The leading principal minors are D1(x, y) = −4 < 0 and D2(x, y) = 24 > 0. Hence, f is strictly concave on R², and we conclude that (x∗, y∗) = (1/4, 0) is a global maximum point, and the only one.
As a consequence,
We can write the following approximation, for (r, s) close to (1, 0),
f∗(r, s) ≈ f∗(1, 0) + ∂f∗/∂r (1, 0)(r − 1) + ∂f∗/∂s (1, 0)(s − 0) = 1/8 + (1/2)(r − 1).
In particular, for (r, s) = (1.05, 0.01), the objective function f∗ takes the following approximate value:
f∗(1.05, 0.01) ≈ 0.125 + (1/2)(1.05 − 1) = 0.125 + 0.025 = 0.15
and the approximate change in the maximum value is
f∗(1.05, 0.01) − f∗(1, 0) ≈ (1/2)(1.05 − 1) = 0.025.
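Here too the estimate can be compared with the exact optimal value; solving the first-order conditions of the perturbed problem gives x = r²/4 and y = (5/6) sin s, hence f∗(r, s) = r⁴/8 + (25/12) sin² s (a short check, assuming NumPy):

    import numpy as np

    r, s = 1.05, 0.01
    f_exact = r**4 / 8 + 25*np.sin(s)**2 / 12
    print(f_exact)   # about 0.1522, against the linear estimate 0.15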
i) Apply Lagrange’s theorem to the problem to show that there are four
points satisfying the necessary conditions.
ii) Show that each point is a regular point.
iii) What can you conclude about the global minimal and maximal values
of f subject to g1 = g2 = 1? Justify your answer.
iv) Replace the constraints by x + y + z = a and x2 + y 2 + z 2 = b with
(a, b) close to (1, 1) (a > 0, b > 0).
- What is the approximate change in the optimal value function
L(x, y, z, λ1, λ2) = eˣ + y + z − λ1(x + y + z − 1) − λ2(x² + y² + z² − 1)

∇L(x, y, z, λ1, λ2) = 0_{R⁵} ⇐⇒
(1) Lx = eˣ − λ1 − 2xλ2 = 0
(2) Ly = 1 − λ1 − 2yλ2 = 0
(3) Lz = 1 − λ1 − 2zλ2 = 0
(4) Lλ1 = −(x + y + z − 1) = 0
(5) Lλ2 = −(x² + y² + z² − 1) = 0.
(z − y)λ2 = 0 =⇒ z=y or λ2 = 0.
2y 2 − 2y = 0 =⇒ y=0 or y = 1.
Each critical point is regular, and we remark that the first two column vectors in the matrices g′(−1/3, 2/3, 2/3), g′(0, 1, 0) and g′(1, 0, 0) are linearly independent, while they are linearly dependent in g′(0, 0, 1). Therefore, when applying the second derivatives test, we can keep the first three matrices without renumbering the variables and change the order of the variables in the last one.
iii) Now, f is continuous on the constraint set, which is a closed and bounded curve of R³, being the intersection of the unit sphere x² + y² + z² = 1 and the plane x + y + z − 1 = 0. So, by the extreme value theorem, f attains its optimal values at points that are also critical points of the Lagrangian. Comparing the values of f at these points, we obtain
2 < f(−1/3, 2/3, 2/3) = e^{−1/3} + 4/3 ≈ 2.0498 < e.
and
max_{g1=1, g2=1} f(x, y, z) = f(1, 0, 0) = e.
iv) ∗ If we denote
1. for (a, b) close to (1, 1), there exists a solution to the constrained minimization problem, by the extreme value theorem (because f is continuous on the closed bounded set x + y + z = a, x² + y² + z² = b).
2. (0, 1, 0) and (0, 0, 1) are solutions to the constrained minimization problem when (a, b) = (1, 1), and are regular points.
3. the second order condition for minimality is satisfied when (a, b) = (1, 1) at (0, 1, 0) and (0, 0, 1). Indeed, n = 3 and m = 2; then we have to consider the sign of the following bordered Hessian determinant:
B3(x, y, z) = | 0 0 ∂g1/∂x ∂g1/∂y ∂g1/∂z ; 0 0 ∂g2/∂x ∂g2/∂y ∂g2/∂z ; ∂g1/∂x ∂g2/∂x Lxx Lxy Lxz ; ∂g1/∂y ∂g2/∂y Lyx Lyy Lyz ; ∂g1/∂z ∂g2/∂z Lzx Lzy Lzz |
= | 0 0 1 1 1 ; 0 0 2x 2y 2z ; 1 2x eˣ − 2λ2 0 0 ; 1 2y 0 −2λ2 0 ; 1 2z 0 0 −2λ2 |.
B3(0, 1, 0) = | 0 0 1 1 1 ; 0 0 0 2 0 ; 1 0 1 0 0 ; 1 2 0 0 0 ; 1 0 0 0 0 | = 4  =⇒ (−1)² B3(0, 1, 0) = 4 > 0.
We change the variables in the order (x, z, y) to compute B3(0, 0, 1) and obtain
B3(0, 0, 1) = | 0 0 1 1 1 ; 0 0 0 2 0 ; 1 0 1 0 0 ; 1 2 0 0 0 ; 1 0 0 0 0 | = 4  =⇒ (−1)² B3(0, 0, 1) = 4 > 0.
we have
∂f∗/∂a (1, 1) = ∂L_{a,b}/∂a |_{(x,y,z,λ1,λ2)=(0,1,0,λ1(1,1),λ2(1,1))} = λ1(1, 1) = 1
∂f∗/∂b (1, 1) = ∂L_{a,b}/∂b |_{(x,y,z,λ1,λ2)=(0,1,0,λ1(1,1),λ2(1,1))} = λ2(1, 1) = 0
f∗(a, b) ≈ f∗(1, 1) + ∂f∗/∂a (1, 1)(a − 1) + ∂f∗/∂b (1, 1)(b − 1) = 2 + (a − 1) + (0)(b − 1) = a + 1.
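The approximation f∗(a, b) ≈ a + 1 can be probed numerically (a sketch assuming SciPy; the starting point near (0, 1, 0) keeps the solver at the local minimizer studied above):

    import numpy as np
    from scipy.optimize import minimize

    def f_star(a, b):
        cons = [{'type': 'eq', 'fun': lambda p: p[0] + p[1] + p[2] - a},
                {'type': 'eq', 'fun': lambda p: p[0]**2 + p[1]**2 + p[2]**2 - b}]
        res = minimize(lambda p: np.exp(p[0]) + p[1] + p[2], x0=[0.0, 0.9, 0.1],
                       constraints=cons, method='SLSQP')
        return res.fun

    print(f_star(1.00, 1.00))   # about 2.00
    print(f_star(1.02, 1.01))   # about 2.02, matching a + 1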
∗∗ If we denote
F∗(a, b) = max_{g1=a, g2=b} f(x, y, z)
1. for (a, b) close to (1, 1), there exists a solution to the constrained maximization problem, by the extreme value theorem (because f is continuous on the closed bounded set x + y + z = a, x² + y² + z² = b).
2. (1, 0, 0) is the solution to the constrained maximization problem when (a, b) = (1, 1), and it is a regular point.
3. the second order condition for maximality is satisfied when (a, b) = (1, 1) at (1, 0, 0). Indeed, n = 3 and m = 2; then we have to consider the sign of the following bordered Hessian determinant:
B3(1, 0, 0) = | 0 0 1 1 1 ; 0 0 2 0 0 ; 1 2 1 0 0 ; 1 0 0 1 − e 0 ; 1 0 0 0 1 − e | = 8(1 − e) < 0,  (−1)³ B3 = 8(e − 1) > 0.
Consequently, we have
i) Sketch the feasible set and write down the necessary KKT conditions.
ii) Find the candidate solutions of the necessary KKT conditions.
v) What can you conclude about the solution of the maximization problem?
vi) Determine the approximate values of each problem.

min(max) 1 − (0.98)³(x − 2)² − e^{−0.01} y²  s.t.  x² + √(1.04) y² ≤ 8,  (1.04)² x − y ≤ 0.
Solution: i) Figure 4.25 describes the constraint set and locates the extreme points, approximately, following the variation of the objective function along the level curves.
[Figure 4.25: level curves of the objective function over the set bounded by x² + y² = 8 and x − y = 0.]
(1) Lx = −2(x − 2) − 2λx − β = 0
(2) Ly = −2y − 2λy + β = 0
(3) λ = 0 if x² + y² < 8
(4) β = 0 if x − y < 0
with
(λ, β) ≥ (0, 0) for a maximum point
(λ, β) ≤ (0, 0) for a minimum point
ii) Solving the system.
∗ If x² + y² < 8, then λ = 0 and
{ −2(x − 2) − β = 0 ; −2y + β = 0 }  =⇒  y = −(x − 2),
which, inserted in (4), leads to the discussion of the two cases.
∗ If x² + y² = 8, then
– Suppose x − y < 0; then β = 0 and
{ x − 2 + λx = 0 ; −2y(1 + λ) = 0 ⇐⇒ y = 0 or λ = −1 }.
λ = −1 is not possible by x − 2 + λx = 0. Thus y = 0.
With x² + y² = 8, we deduce that x = √8, which contradicts x < y, or x = −√8. Inserting the value x = −√8 into x − 2 + λx = 0 gives λ = −1 − 1/√2. So, we have another candidate
(x, y) = (−√8, 0) with (λ, β) = (−1 − 1/√2, 0).
(x, y) = (2, 2) =⇒ { −4λ − β = 0 ; −4λ + β = 4 } ⇐⇒ (λ, β) = (−1/2, 2)
(x, y) = (−2, −2) =⇒ { 4λ − β = −8 ; 4λ + β = −4 } ⇐⇒ (λ, β) = (−3/2, 2)
Regularity of the candidate point (1, 1). Note that the constraints g1(x, y) = x² + y² and g2(x, y) = x − y are C¹ in R² and that only the constraint g2 is active at (1, 1). We have
g2′(x, y) = [1  −1],  rank(g2′(1, 1)) = 1.
Thus the point (1, 1) is a regular point.
Regularity of the candidate point (−√8, 0). Only the constraint g1 is active at (−√8, 0). We have
g1′(x, y) = [2x  2y],  rank(g1′(−√8, 0)) = 1.
Thus the point (−√8, 0) is a regular point.
iii) Second derivatives test at (1, 1). With p = 1 (the number of active constraints) and n = 2 (the dimension of the space), r ranges from p + 1 = 2 to n = 2, so r = 2, and we will consider the following determinant
B2(x, y) = | 0 ∂g2/∂x ∂g2/∂y ; ∂g2/∂x Lxx Lxy ; ∂g2/∂y Lyx Lyy | = | 0 1 −1 ; 1 −2 − 2λ 0 ; −1 0 −2 − 2λ |
and, when the active constraint is g1,
B2(x, y) = | 0 ∂g1/∂x ∂g1/∂y ; ∂g1/∂x Lxx Lxy ; ∂g1/∂y Lyx Lyy | = | 0 2x 2y ; 2x −2 − 2λ 0 ; 2y 0 −2 − 2λ |
∗ At (−√8, 0), we have λ = −1 − 1/√2, so −2 − 2λ = √2, and
B2(−√8, 0) = | 0 −2√8 0 ; −2√8 √2 0 ; 0 0 √2 | = −32√2  =⇒ (−1)¹ B2(−√8, 0) > 0
and (−√8, 0) is a local minimum.
iv) and v) Let us explore the concavity and convexity of L with respect to (x, y), where the Hessian matrix of L in (x, y) is
H_L = [Lxx Lxy ; Lyx Lyy] = [−2 − 2λ 0 ; 0 −2 − 2λ]
• When λ = 0, the principal minors are Δ1¹ = Lyy = −2 < 0, Δ1² = Lxx = −2 < 0 and Δ2 = 4 > 0. So (−1)^k Δk ≥ 0 for k = 1, 2. Therefore, L is concave in (x, y), and then (1, 1) is a global maximum for the constrained maximization problem.
• When λ = −1 − 1/√2, the principal minors are Δ1¹ = Lyy = √2 > 0, Δ1² = Lxx = √2 > 0 and Δ2 = 2 > 0. So Δk ≥ 0 for k = 1, 2. Therefore, L is convex in (x, y), and then (−√8, 0) is a global minimum for the constrained minimization problem.
vi) Note that 0.98 ≈ 1, 1.04 ≈ 1, 0.01 ≈ 0 and e^{−0.01} ≈ 1. Thus, the new problems look like a perturbation of the original problem. Therefore, we will use linear approximation to solve the problem when r = 0.98, s = 1.04 and t = −0.01. So, introduce the Lagrangian associated with the new constrained optimization problem
L(x, y, λ, β, r, s, t) = 1 − r³(x − 2)² − eᵗ y² − λ(x² + √s y² − 8) − β(s² x − y)
As a consequence, evaluating at (x, y, λ, β) = (−√8, 0, −1 − 1/√2, 0) and (r, s, t) = (1, 1, 0):
∂f∗/∂r (1, 1, 0) = ∂L/∂r = −3r²(x − 2)² = −3(√8 + 2)²
∂f∗/∂s (1, 1, 0) = ∂L/∂s = −λ y²/(2√s) − 2βsx = 0
∂f∗/∂t (1, 1, 0) = ∂L/∂t = −eᵗ y² = 0.
Evaluating at (x, y, λ, β) = (1, 1, 0, 2) and (r, s, t) = (1, 1, 0):
∂F∗/∂s (1, 1, 0) = ∂L/∂s = −λ y²/(2√s) − 2βsx = −4
∂F∗/∂t (1, 1, 0) = ∂L/∂t = −eᵗ y² = −1.
F∗(r, s, t) ≈ F∗(1, 1, 0) + ∂F∗/∂r (1, 1, 0)(r − 1) + ∂F∗/∂s (1, 1, 0)(s − 1) + ∂F∗/∂t (1, 1, 0)(t − 0)
F∗(r, s, t) ≈ −1 − 3(r − 1) − 4(s − 1) − (t − 0)
F∗(0.98, 1.04, −0.01) ≈ −1 − 3(−0.02) − 4(0.04) − (−0.01) = −1.09.
Bibliography
[21] S.L. Salas, E. Hille, and G.J. Etgen. Calculus: One and Several Variables. Tenth Edition. John Wiley & Sons, Inc., 2007.