
Introduction to the

Theory of
Optimization in
Euclidean Space
Series in Operations Research
Series Editors:
Malgorzata Sterna, Marco Laumanns

About the Series

The CRC Press Series in Operations Research encompasses books that
contribute to the methodology of Operations Research and applying advanced
analytical methods to help make better decisions.
The scope of the series is wide, including innovative applications of Operations
Research which describe novel ways to solve real-world problems, with
examples drawn from industrial, computing, engineering, and business
applications. The series explores the latest developments in Theory and
Methodology, and presents original research results contributing to the
methodology of Operations Research, and to its theoretical foundations.
Featuring a broad range of reference works, textbooks and handbooks, the
books in this Series will appeal not only to researchers, practitioners and
students in the mathematical community, but also to engineers, physicists,
and computer scientists. The inclusion of real examples and applications is
highly encouraged in all of our books.

Rational Queueing
Refael Hassin

Introduction to the Theory of Optimization in Euclidean Space
Samia Challal

For more information about this series please visit: https://www.crcpress.com/Chapman--HallCRC-Series-in-Operations-Research/book-series/CRCOPSRES
Introduction to the
Theory of
Optimization in
Euclidean Space

Samia Challal
Glendon College-York University
Toronto, Canada
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742


© 2020 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper

International Standard Book Number-13: 978-0-367-19557-1 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at


http://www.taylorandfrancis.com

and the CRC Press Web site at


http://www.crcpress.com
To my parents
Contents

Preface ix

Acknowledgments xi

Symbol Description xiii

Author xv

1 Introduction 1
1.1 Formulation of Some Optimization Problems . . . . . . . . . 1
1.2 Particular Subsets of Rn . . . . . . . . . . . . . . . . . . . . 8
1.3 Functions of Several Variables . . . . . . . . . . . . . . . . . 20

2 Unconstrained Optimization 49
2.1 Necessary Condition . . . . . . . . . . . . . . . . . . . . . . . 49
2.2 Classification of Local Extreme Points . . . . . . . . . . . . . 71
2.3 Convexity/Concavity and Global Extreme Points . . . . . . 93
2.3.1 Convex/Concave Several Variable Functions . . . . . 93
2.3.2 Characterization of Convex/Concave C 1 Functions . . 95
2.3.3 Characterization of Convex/Concave C 2 Functions . . 98
2.3.4 Characterization of a Global Extreme Point . . . . . . 102
2.4 Extreme Value Theorem . . . . . . . . . . . . . . . . . . . . 117

3 Constrained Optimization-Equality Constraints 135


3.1 Tangent Plane . . . . . . . . . . . . . . . . . . . . . . . . . . 137
3.2 Necessary Condition for Local Extreme
Points-Equality Constraints . . . . . . . . . . . . . . . . . . . 151
3.3 Classification of Local Extreme Points-Equality
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
3.4 Global Extreme Points-Equality Constraints . . . . . . . . . 187

4 Constrained Optimization-Inequality Constraints 203


4.1 Cone of Feasible Directions . . . . . . . . . . . . . . . . . . . 204
4.2 Necessary Condition for Local Extreme Points/
Inequality Constraints . . . . . . . . . . . . . . . . . . . . . . 220
4.3 Classification of Local Extreme Points-Inequality
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251


4.4 Global Extreme Points-Inequality Constraints . . . . . . . . 271


4.5 Dependence on Parameters . . . . . . . . . . . . . . . . . . . 292

Bibliography 315

Index 317
Preface

The book is intended to provide students with a useful background in opti-


mization in Euclidean space. Its primary goal is to demystify the theoretical
aspect of the subject.

In presenting the material, we refer first to the intuitive idea in one dimension, then make the jump to n dimensions as naturally as possible. This approach allows the reader to focus on understanding the idea, postpone the proofs, and learn to apply the theorems through examples and problem solving. A detailed solution follows each problem, illustrating and deepening the theory. These solved problems provide repetition of the basic principles, clarification of some difficult concepts and further development of some ideas.

Students are taken progressively through the development of the proofs, where they have the occasion to practice the tools of differentiation (chain rule, Taylor formula) for functions of several variables in abstract situations. They learn to apply important results established in advanced algebra and analysis courses, such as the Farkas-Minkowski lemma, the implicit function theorem and the extreme value theorem.

The book starts, in Chapter 1, with a short introduction to mathematical modeling leading to the formulation of optimization problems. Each formulation involves a function and a set of points. Thus, basic properties of open, closed, convex subsets of Rⁿ are discussed. Then, the usual topics of differential calculus for functions of several variables are reviewed.

In the following chapters, the study is devoted to the optimization of a function of several variables f over a subset S of Rⁿ. Depending on the particularity of this set, three situations are identified. In Chapter 2, the set S has a nonempty interior; in Chapter 3, S is described by an equation g(x) = 0; and in Chapter 4, by an inequality g(x) ≤ 0, where g is a function of several variables. In each case, we try to answer the following questions:

– If the extreme point exists, then where is it located in S? Here, we


look for necessary conditions to have candidate points for optimality.
We make the distinction between local and global points.

– Among the local candidate points, which of them are local maximum or
local minimum points? Here, we establish sufficient conditions to identify
a local candidate point as an extreme point.

– Now, among the local extreme points found, which ones are global extreme points? Here, the convexity/concavity property provides a positive answer.
Finally, we explore how the extreme value of the objective function f is affected
when some parameters involved in the definition of the functions f or g change
slightly.
Acknowledgments

I am very grateful to my colleagues David Spring, Mario Roy and Alexander Nenashev for introducing the course on optimization to our math program for the first time and giving me the opportunity to teach it. I especially thank Professor Vincent Hildebrand, Chair of the Economics Department, for the useful discussions during the planning of the course content to support students majoring in Economics.
My thanks are also due to Sarfraz Khan and Callum Fraser from Taylor
and Francis Group, to the reviewers for their invaluable help, and to Shashi
Kumar for the expert technical support.
I have relied on the various authors cited in the bibliography, and I am
grateful to all of them. Many exercises are drawn or adapted from the cited
references for their aptitude to reinforce the understanding of the material.

Symbol Description

∀                For all, or for each
∃                There exists
∃!               There exists a unique
∅                The empty set
s.t.             Subject to
S̊                Interior of the set S
S̄                Closure of the set S
∂S               Boundary of the set S
C_S              The complement of S
i, j, k          i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1), standard basis of R³
Br(x0)           Ball centered at x0 with radius r
Br(x0)           Bordered Hessian of order r at x0
⟨. , .⟩ or [. , .]   Brackets for vectors
∇f               Gradient of f
x* = [x1*, ..., xn*]ᵗ   Column vector, identified sometimes with the point (x1*, ..., xn*)
‖x‖ = √(x1² + x2² + ... + xn²)   Norm of the vector x
M_{m n}          Set of matrices with m rows and n columns
A = (aij), i = 1, ..., m, j = 1, ..., n   An m × n matrix
‖A‖ = (Σ_{i,j=1}^n aij²)^{1/2}   Norm of the matrix A = (aij)_{i,j=1,...,n}
rank A           Rank of the matrix A
det A            Determinant of the matrix A
Ker A = {x : Ax = 0}   Kernel of the matrix A
ᵗh = (h1 ... hn)   Transpose of the column vector h = [h1, ..., hn]ᵗ
ᵗh.x* = Σ_{k=1}^n hk xk*   Dot product of the vectors h and x*
C¹(D)            Set of continuously differentiable functions on D
C^k(D)           Set of continuously differentiable functions on D up to the order k
C^∞(D)           Set of continuously differentiable functions on D for any order k
Hf(x) = (f_{xi xj})_{n×n}   Hessian of f
Dk(x) = det(f_{xi xj})_{i,j=1,...,k}   Leading minor of order k of the Hessian Hf
Author

Samia Challal is an assistant professor of Mathematics at Glendon College,


the bilingual campus of York University. Her research interests include homogenization, optimization, free boundary problems, partial differential equations
and problems arising from mechanics.

Chapter 1
Introduction

Optimization problems arise in different domains. In Section 1.1 of this chapter, we introduce some applications and learn how to model a situation as an optimization problem.
The points where an optimal quantity is attained are looked for in subsets that can be one-dimensional, multi-dimensional, open, closed, bounded or unbounded, etc. We devote Section 1.2 to the study of some topological properties of such subsets of Rⁿ.
Finally, since the phenomena analyzed are often complex because of the many parameters involved, we conclude with an introduction to functions of several variables in Section 1.3.

1.1 Formulation of Some Optimization Problems

The purpose of this short section is to show, through some examples, the
main elements involved in an optimization problem.

Example 1. Different ways of modeling a problem.

To minimize the material in manufacturing a closed can with volume capacity


of V units, we need to choose a suitable radius for the container.

i) Show how to make this choice without finding the exact radius.
ii) How should the radius be chosen if the volume V may vary from one liter to two liters?


Solution: Denote by h and r the height and the radius of the can respectively.
Then, the area and the volume of the can are given by

area = A = 2πr² + 2πrh,   volume = V = πr²h.


i) * The area can be expressed as a function of r and the problem is reduced
to find r ∈ (0, +∞) for which A is minimum:

minimize A = A(r) = 2πr² + 2V/r over the set S,

S = (0, +∞) = {r ∈ R : r > 0}.

Note that the set S, as shown in Figure 1.1, is an open unbounded interval
of R.

FIGURE 1.1: S = (0, +∞) ⊂ R
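As a quick illustration of formulation i), one can minimize A(r) numerically; the short Python sketch below (assuming V = 1 liter and using SciPy's minimize_scalar, neither of which is part of the text) recovers the radius predicted by calculus, r = (V/(2π))^(1/3).

# Minimal numerical check of formulation i), assuming V = 1.
import numpy as np
from scipy.optimize import minimize_scalar

V = 1.0
A = lambda r: 2 * np.pi * r**2 + 2 * V / r   # area as a function of the radius

res = minimize_scalar(A, bounds=(1e-6, 10), method="bounded")
# The calculus answer is r = (V / (2*pi))**(1/3); both values should agree.
print(res.x, (V / (2 * np.pi)) ** (1 / 3))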

** We can also express the problem as follows:



minimize A(r, h) = 2πr² + 2πrh over the set S,

S = {(r, h) ∈ R⁺ × R⁺ : πr²h = V}.

Here, the set S is a curve in R² and is illustrated by Figure 1.2 below:


FIGURE 1.2: S is the curve h = π⁻¹/r² in the plane (V = 1 liter)

ii) In the case where we allow more possibilities for the volume, for example 1 ≤ V ≤ 2, then we can formulate the problem as a two-dimensional problem:

minimize A(r, h) = 2πr² + 2πrh over the set S,

S = {(r, h) ∈ R⁺ × R⁺ : 1/(πr²) ≤ h ≤ 2/(πr²)}.

The set S is the plane region, in the first quadrant, between the curves h = 1/(πr²) and h = 2/(πr²) (see Figure 1.3).
FIGURE 1.3: S is a plane region between two curves

A three-dimensional formulation of the same problem is

minimize A(r, h, V) = 2πr² + 2V/r over the set S,

S = {(r, h, V) ∈ R⁺ × R⁺ × R⁺ : πr²h = V, 1 ≤ V ≤ 2}

where the set S ⊂ R³ is the part of the surface V = πr²h located between the planes V = 1 and V = 2 in the first octant; see Figure 1.4.
FIGURE 1.4: S is a surface in the space



Example 2. Too many variables and linear inequalities.

Diet Problem. * One can buy four types of aliments where the nutritional
content per unit weight of each food and its price are shown in Table 1.1 [5].
The diet problem consists of obtaining, at the minimum cost, at least twelve
calories and seven vitamins.

            type 1   type 2   type 3   type 4
calories      2        1        0        1
vitamins      3        4        3        5
price         2        2        1        8

TABLE 1.1: A diet problem with four variables

Solution: Let ui be the weight of the food of type i. The total price of the
four aliments consumed is given by the relation

2u1 + 2u2 + u3 + 8u4 = f (u1 , u2 , u3 , u4 ).

To ensure that at least twelve calories and seven vitamins are included, we
can express these conditions by writing

2u1 + u2 + u4 ≥ 12 and 3u1 + 4u2 + 3u3 + 5u4 ≥ 7.

Hence, the problem would be

minimize f(u1, u2, u3, u4) over the set

S = {(u1, u2, u3, u4) ∈ R⁴ : 2u1 + u2 + u4 ≥ 12, 3u1 + 4u2 + 3u3 + 5u4 ≥ 7}.
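Since the objective and the constraints are all linear, this is a linear program. A minimal sketch with SciPy's linprog (an illustration, not part of the text; it also assumes nonnegative weights u ≥ 0, which is implicit physically but not written in S) could look as follows. linprog works with ≤ constraints, so the ≥ constraints are multiplied by −1.

import numpy as np
from scipy.optimize import linprog

# Cost per unit weight of each food type.
c = np.array([2, 2, 1, 8])
# Constraints 2u1 + u2 + u4 >= 12 and 3u1 + 4u2 + 3u3 + 5u4 >= 7,
# rewritten as -A u <= -b for linprog.
A_ub = -np.array([[2, 1, 0, 1],
                  [3, 4, 3, 5]])
b_ub = -np.array([12, 7])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
print(res.x, res.fun)   # cheapest diet and its cost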

** The above problem is rendered more complex if more factors (fat,


proteins) and types of food (steak, potatoes, fish, ...) were to be considered.
For example, from Table 1.2, we deduce that the total price of the seven

            type 1   type 2   type 3   type 4   type 5   type 6   type 7
protein       3        1        2        7        8        5       10
fat           0        1        0        8       15       10        6
calories      2        1        0        1        5        7        9
vitamins      3        4        3        5        1        2        5
price         2        2        1        8       12       10        8

TABLE 1.2: A diet problem with seven variables


aliments consumed is

2u1 + 2u2 + u3 + 8u4 + 12u5 + 10u6 + 8u7 = p(u1, u2, u3, u4, u5, u6, u7).

To ensure that at least twelve calories, seven vitamins, twenty proteins are included, and less than fifteen fats are consumed, the problem would be formulated as

minimize p(u1, u2, u3, u4, u5, u6, u7) over the set

S = {(u1, u2, u3, u4, u5, u6, u7) ∈ R⁷ :
    3u1 + u2 + 2u3 + 7u4 + 8u5 + 5u6 + 10u7 ≥ 20,
    u2 + 8u4 + 15u5 + 10u6 + 6u7 ≤ 15,
    2u1 + u2 + u4 + 5u5 + 7u6 + 9u7 ≥ 12,
    3u1 + 4u2 + 3u3 + 5u4 + u5 + 2u6 + 5u7 ≥ 7}.

Example 3. Too many variables and nonlinearities.

* A company uses x units of capital and y units of labor to produce xy units of a manufactured good. Capital can be purchased at $3/unit and labor can be purchased at $2/unit. A total of $6 is available to purchase capital and labor. How can the firm maximize the quantity of the good that can be manufactured?

Solution: We need to maximize the quantity xy on the set of points (see Figure 1.5)

S = {(x, y) ∈ R² : 3x + 2y ≤ 6, x ≥ 0, y ≥ 0}.

FIGURE 1.5: S is a triangular region in the plane

The set S is the triangular plane region bounded by the sides L1, L2 and L3, defined by:

L1 = {(x, 0), 0 ≤ x ≤ 2},  L2 = {(0, y), 0 ≤ y ≤ 3},  L3 = {(x, (6 − 3x)/2), 0 ≤ x ≤ 2}.

Here, the objective function f(x, y) = xy is nonlinear and the set S is described by linear inequalities.
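For a concrete check: at the optimum the budget is exhausted and xy = x(6 − 3x)/2 is maximized at x = 1, y = 3/2, giving the value 3/2. The sketch below, using SciPy's minimize with an inequality constraint (solver choice and variable names are illustrative, not from the text), confirms this numerically.

import numpy as np
from scipy.optimize import minimize

# Maximize f(x, y) = x*y on S = {3x + 2y <= 6, x >= 0, y >= 0}
# by minimizing -x*y.
objective = lambda v: -v[0] * v[1]
constraints = [{"type": "ineq", "fun": lambda v: 6 - 3 * v[0] - 2 * v[1]}]
res = minimize(objective, x0=[0.5, 0.5], bounds=[(0, None), (0, None)],
               constraints=constraints)
print(res.x, -res.fun)   # expected: approximately (1, 1.5) and 1.5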

** Such a model may work for a certain production process. However, it may not reflect the situation as other factors involved in the production process cannot be ignored. Therefore, new models have to be considered. For example [7]:

- The production of the Canadian manufacturing industries for 1927 is estimated by

P(l, k) = 33 l^0.46 k^0.52

where P is product, l is labor and k is capital.

- The production P for dairy farming in Iowa (1939) is estimated by

P(A, B, C, D, E, F) = A^0.27 B^0.01 C^0.01 D^0.23 E^0.09 F^0.27

where A is land, B is labor, C is improvements, D is liquid assets, E is working assets and F is cash operating expenses.

Each of these nonlinear production functions P is optimized on a suitable set S that describes well the elements involved.

As seen above, the main purpose of this study is to find a solution to the following optimization problems:

find u ∈ S such that f(u) = min_S f(v)
or
find u ∈ S such that f(u) = max_S f(v)

where f : S ⊂ Rⁿ −→ R is a given function and S a given subset of Rⁿ.

It is obvious that establishing existence and uniqueness results for the extreme points depends on properties satisfied by the set S and the function f. So, we need to know some categories of subsets of Rⁿ as well as some calculus on multi-variable functions. But first, look at the following remark:

Remark 1.1.1 The extreme point may not exist on the set S. In our study, we will explore the situations where min_S f and max_S f are attained in S.

For example,

min_(0,1) f(x) = x² does not exist.

Indeed, suppose there exists x0 ∈ (0, 1) such that f(x0) = min_(0,1) f(x). Then,

0 < x0/2 < x0 =⇒ x0/2 ∈ (0, 1),
f is a strictly increasing function on (0, 1) =⇒ f(x0/2) < f(x0),

which contradicts the fact that x0 is a minimum point of f on (0, 1). However, we remark that

f(x) > 0 ∀x ∈ (0, 1).

To include these limit cases, usually, instead of looking for a minimum or a maximum, we look for

inf_S f(x) = inf{f(x) : x ∈ S} and sup_S f(x) = sup{f(x) : x ∈ S}

where inf E and sup E of a nonempty subset E of R are defined by [2]

sup E = the least number greater than or equal to all numbers in E
inf E = the greatest number less than or equal to all numbers in E.

If E is not bounded below, we write inf E = −∞. If E is not bounded above, we write sup E = +∞. By convention, we write sup ∅ = −∞ and inf ∅ = +∞.

For the previous example, we have

inf_(0,1) x² = 0 and sup_(0,1) x² = 1.
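A small numerical illustration of the infimum not being attained (purely illustrative, not from the text): evaluating x² at points approaching 0 inside (0, 1) gives values tending to 0 without ever reaching it.

# f(x) = x**2 at points x = 10**(-k) inside (0, 1):
for k in range(1, 6):
    x = 10.0 ** (-k)
    print(x, x**2)   # f(x) > 0 always, but f(x) -> 0 = inf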

1.2 Particular Subsets of Rⁿ

We list here the main categories of sets that we will encounter and give the main tools that allow them to be identified easily. Even though the purpose is not a topological study of these sets, it is important to be aware of the precise definitions and how to apply them accurately [18], [13].

Open and Closed Sets

In one dimension, the distance between two real numbers x and y is mea-
sured by the absolute value function and is given by

d(x, y) = |x − y|.

d satisfies, for any x, y, z, the properties

d(x, y) ≥ 0,   d(x, y) = 0 ⇐⇒ x = y
d(y, x) = d(x, y)   (symmetry)
d(x, z) ≤ d(x, y) + d(y, z)   (triangle inequality).

These three properties induce on R a metric topology where a set O is said to be open if and only if, at each point x0 ∈ O, we can insert a small interval centered at x0 that remains included in O, that is,

O is open ⇐⇒ ∀x0 ∈ O, ∃ε > 0 such that (x0 − ε, x0 + ε) ⊂ O.

In higher dimensions, these tools are generalized as follows:

The distance between two points x = (x1, ..., xn) and y = (y1, ..., yn) is measured by the quantity

d(x, y) = ‖x − y‖ = √((x1 − y1)² + ... + (xn − yn)²).

d is called the Euclidean distance and satisfies the three properties above. A set O ⊂ Rⁿ is said to be open if and only if, at each point x0 ∈ O, we can insert a small ball

Bε(x0) = {x ∈ Rⁿ : ‖x − x0‖ < ε}

centered at x0 with radius ε that remains included in O, that is,

O is open ⇐⇒ ∀x0 ∈ O, ∃ε > 0 such that Bε(x0) ⊂ O.

The point x0 is said to be an interior point of O.

Example 1. As n varies, the ball takes different shapes; see Figure 1.6.

n = 1, a ∈ R: Br(a) = (a − r, a + r), an open interval.
n = 2, a = (a1, a2): Br(a) = {(x1, x2) : (x1 − a1)² + (x2 − a2)² < r²}, an open disk.
n = 3, a = (a1, a2, a3): Br(a) = {(x1, x2, x3) : (x1 − a1)² + (x2 − a2)² + (x3 − a3)² < r²}, the set of points delimited by the sphere centered at a with radius r.
n > 3, a = (a1, ..., an): Br(a) is the set of points delimited by the hypersphere of points x satisfying d(a, x) = r.

FIGURE 1.6: Shapes of balls in R, R² and R³

Using the distance d, we define

Definition 1.2.1 Let S be a subset of Rⁿ.

– S̊ is the interior of S, the set of all interior points of S.
– S is a neighborhood of a if a is an interior point of S.
– S is a closed set ⇐⇒ C_S (the complement of S) is open.
– ∂S is the boundary of S, the set of boundary points of S, where
  x0 ∈ ∂S ⇐⇒ ∀r > 0, Br(x0) ∩ S ≠ ∅ and Br(x0) ∩ C_S ≠ ∅.
– S̄ = S ∪ ∂S is the closure of S.
– S is bounded ⇐⇒ ∃M > 0 such that ‖x‖ ≤ M ∀x ∈ S.
– S is unbounded if it is not bounded.

Example 2. For the sets S1 = [−2, 2] ⊂ R,

S2 = {(x, y) : x² + y² ≤ 4} ⊂ R²,   S3 = {(x, y, z) : x² + y² + z² < 4} ⊂ R³,

we have

set    interior S̊      boundary ∂S       closure S̄
S1     (−2, 2)         {−2, 2}           S1
S2     B2(0)           C2(0) : circle    S2
S3     S3 = B2(0)      S2(0) : sphere    S3 ∪ S2(0)

where

C2(0) = {(x, y) : x² + y² = 4},   S2(0) = {(x, y, z) : x² + y² + z² = 4}.

We have the following properties:

Remark 1.2.1 – Rⁿ and ∅ are open and closed sets.

– The union (resp. intersection) of arbitrary open (resp. closed) sets is open (resp. closed).
– The finite intersection (resp. union) of open (resp. closed) sets is open (resp. closed).
– S is open ⇐⇒ S = S̊.
– S is closed ⇐⇒ S = S̄.
– If f is continuous on an open subset Ω ⊂ Rⁿ (see Section 1.3), then

[f ≤ a] = f⁻¹((−∞, a]), [f ≥ a], [f = a] are closed sets in Rⁿ
[f < a] = f⁻¹((−∞, a)), [f > a] are open sets in Rⁿ.

Example 3. Sketch the set S in the xy-plane and determine whether it is open, closed, bounded or unbounded. Give S̊, ∂S and S̄.

S = {(x, y) : x ≥ 0, y ≥ 0, xy ≥ 1}

FIGURE 1.7: An unbounded closed subset of R²

∗ Note that the set S, sketched in Figure 1.7, doesn't contain the points on the x and y axes. So

S = {(x, y) : x > 0, y > 0, xy ≥ 1}

and can be described using the continuous function f : (x, y) −→ xy on the open set Ω = {(x, y) : x > 0, y > 0} as

S = {(x, y) ∈ Ω : f(x, y) ≥ 1} = f⁻¹([1, +∞)).

Therefore, S is a closed subset of R². Thus S̄ = S.

∗∗ The set is unbounded since it contains the points (x(t), y(t)) = (t, t) for t ≥ 1 (xy = t·t = t² ≥ 1) and

‖(x(t), y(t))‖ = ‖(t, t)‖ = √(t² + t²) = √2 t −→ +∞ as t −→ +∞.
∗ ∗ ∗ We have

S̊ = {(x, y) : x > 0, y > 0, xy > 1},
the region in the 1st quadrant above the hyperbola y = 1/x;

∂S = {(x, y) : x > 0, y > 0, xy = 1},
the arc of the hyperbola in the 1st quadrant.

Example 4. A person can afford any commodities x ≥ 0 and y ≥ 0 that satisfy the budget inequality x + 3y ≤ 7.
Sketch the set S described by these inequalities in the xy-plane and determine whether it is open, closed, bounded or unbounded. Give S̊, ∂S and S̄.

FIGURE 1.8: Closed set as intersection of three closed sets of R²

∗ Figure 1.8 shows that S is the triangular region formed by all the points in the first quadrant below the line x + 3y = 7:

S = {(x, y) : x + 3y ≤ 7, x ≥ 0, y ≥ 0}

and can be described using the continuous functions

f1 : (x, y) −→ x + 3y,   f2 : (x, y) −→ x,   f3 : (x, y) −→ y

on R² as

S = {(x, y) ∈ R² : f1(x, y) ≤ 7, f2(x, y) ≥ 0, f3(x, y) ≥ 0}
  = f1⁻¹((−∞, 7]) ∩ f2⁻¹([0, +∞)) ∩ f3⁻¹([0, +∞)).

Therefore, S is a closed subset of R² as the intersection of three closed subsets of R². Thus S̄ = S.
∗∗ The set S is bounded since

x + 3y ≤ 7, x ≥ 0, y ≥ 0 =⇒ 0 ≤ x ≤ 7, 0 ≤ y ≤ 7/3,

from which we deduce

‖(x, y)‖ = √(x² + y²) ≤ √(7² + (7/3)²) = (7/3)√10   ∀(x, y) ∈ S.

∗ ∗ ∗ We have

S̊ = {(x, y) : x > 0, y > 0, x + 3y < 7}, the region S excluding its three sides;
∂S = the three sides of the triangular region.

Convex sets

The category of convex sets, deals with sets S ⊂ Rn where any two points
x, y ∈ S can be joined by a line segment that remains entirely into the set.
Such sets are without holes and do not bend inwards. Thus
S is convex ⇐⇒ (1 − t)x + ty ∈ S ∀x, y ∈ S ∀t ∈ [0, 1].

We have the following properties:

Remark 1.2.2 – Rn and ∅ are convex sets

– A finite intersection of convex sets is a convex set.

Example 5. “Well known convex sets” (see Figure 1.9)

∗ A line segment joining two points x and y is convex. It is described by

[x, y] = {z ∈ Rn : ∃t ∈ [0, 1] such that z = x + t(y − x) = (1 − t)x + ty}.


∗∗ A line passing through two points x0 and x1 is convex. It is described by
L = {x ∈ Rn : ∃t ∈ R such that x = x0 + t(x1 − x0 )}.
FIGURE 1.9: Convex sets in R²

∗ ∗ ∗ A ball Br(x0) = {x ∈ Rⁿ : ‖x − x0‖ < r} is convex.

Indeed, let a and b be in Br(x0) and t ∈ [0, 1]. We have

‖[(1 − t)a + tb] − x0‖ = ‖(1 − t)(a − x0) + t(b − x0)‖
   ≤ ‖(1 − t)(a − x0)‖ + ‖t(b − x0)‖ = |1 − t| ‖a − x0‖ + |t| ‖b − x0‖
   < |1 − t| r + |t| r = r   since ‖a − x0‖ < r and ‖b − x0‖ < r.

Hence (1 − t)a + tb ∈ Br(x0) for any t ∈ [0, 1]; that is, [a, b] ⊂ Br(x0).

FIGURE 1.10: A closed ball is convex

∗ ∗ ∗∗ A closed ball B̄r(x0) = {x ∈ Rⁿ : ‖x − x0‖ ≤ r} is convex.

For example, in the plane, the set in Figure 1.10, defined by

{(x, y) : x² + y² ≤ 4} = B̄2((0, 0)), is convex.

The set is the closed disk with center (0, 0) and radius 2. It is closed since it includes its boundary points located on the circle with center (0, 0) and radius 2. This set is bounded since ‖(x, y)‖ ≤ 2 ∀(x, y) ∈ B̄2((0, 0)).

Example 6. “Convex sets described by linear expressions”

∗ For a = (a1, ..., an) ∈ Rⁿ, b ∈ R, the set of points

{x = (x1, ..., xn) ∈ Rⁿ : a1x1 + a2x2 + ... + anxn = a.x = b}

is convex and is called a hyperplane.

Indeed, consider x1, x2 in the hyperplane and t ∈ [0, 1]; then

a.[(1 − t)x1 + tx2] = (1 − t)a.x1 + t a.x2 = (1 − t)b + tb = b,

thus (1 − t)x1 + tx2 belongs to the hyperplane.

As illustrated in Figure 1.11, the graph of a hyperplane is reduced to the point x1 = b/a1 when n = 1, to the line a1x1 + a2x2 = b in the plane when n = 2, and to the plane a1x1 + a2x2 + a3x3 = b in the space when n = 3.

FIGURE 1.11: Hyperplanes in R, R² and R³

∗∗ The set of points x = (x1, ..., xn) ∈ Rⁿ defined by a linear inequality

a1x1 + a2x2 + ... + anxn = a.x ≤ b   (resp. ≥, <, >) is convex.

Indeed, as above, consider x1, x2 in the region [a.x ≤ b] and t ∈ [0, 1]; then

a.x1 ≤ b =⇒ (1 − t)a.x1 ≤ (1 − t)b   since (1 − t) ≥ 0
a.x2 ≤ b =⇒ t a.x2 ≤ tb   since t ≥ 0.

Adding the two inequalities, we get

a.[(1 − t)x1 + tx2] = (1 − t)a.x1 + t a.x2 ≤ (1 − t)b + tb = b,

thus (1 − t)x1 + tx2 belongs to the region [a.x ≤ b].

The set [a.x ≤ b] describes the region of points located below the hyperplane a.x = b.

∗ ∗ ∗ A set of points in Rⁿ described by linear equalities and inequalities is convex, as it can be seen as the intersection of convex sets described by equalities and inequalities.
For example, in Figure 1.12, the set

S = {(x, y) : 2x + 3y ≤ 19, −3x + 2y ≤ 4, x + y ≤ 8, 0 ≤ x ≤ 6, x + 6y ≥ 0}

can be described as S = S1 ∩ S2 ∩ S3 ∩ S4 ∩ S5 ∩ S6 where

S1 = {(x, y) ∈ R² : x + 6y ≥ 0}    S2 = {(x, y) ∈ R² : x ≤ 6}
S3 = {(x, y) ∈ R² : x + y ≤ 8}     S4 = {(x, y) ∈ R² : 2x + 3y ≤ 19}
S5 = {(x, y) ∈ R² : −3x + 2y ≤ 4}  S6 = {(x, y) ∈ R² : x ≥ 0}.

FIGURE 1.12: A convex set described by linear inequalities


S is the region of the xy plane bounded by the lines

L1 : x + 6y = 0,   L2 : x = 6,   L3 : x + y = 8,
L4 : 2x + 3y = 19,   L5 : −3x + 2y = 4,   L6 : x = 0.

Often, such sets are described using matrices and vectors:

S = {(x, y) ∈ R² : A (x, y)ᵗ ≤ b},   with

A = [  2   3 ]        b = [ 19 ]
    [ −3   2 ]            [  4 ]
    [  1   1 ]            [  8 ]
    [  1   0 ]            [  6 ]
    [ −1  −6 ]            [  0 ]
    [ −1   0 ]            [  0 ].

Example 7. “Well-known non-convex sets”

∗ The hypersphere (see Figure 1.13 for an illustration in the plane)

∂Br(x*) = {x ∈ Rⁿ : ‖x − x*‖ = r} is not convex.

FIGURE 1.13: Circle ∂B2((1, 1)) is not convex

Indeed, we have

(x1*, ..., xn* ± r) ∈ ∂Br(x*) since ‖(0, ..., 0, ±r)‖ = r,

but

‖(1/2)(x1*, ..., xn* + r) + (1 − 1/2)(x1*, ..., xn* − r) − x*‖
   = ‖(1/2)(2x1*, ..., 2xn* + r − r) − x*‖ = ‖x* − x*‖ = 0 ≠ r

=⇒ (1/2)(x1*, ..., xn* + r) + (1 − 1/2)(x1*, ..., xn* − r) = x* ∉ ∂Br(x*).

∗∗ The domain located outside the hypersphere, described by

S = {x ∈ Rⁿ : ‖x − x*‖ > r} = Rⁿ \ B̄r(x*), is not convex.

FIGURE 1.14: An unbounded open non-convex set of R²

Indeed, we have

(x1*, ..., xn* ± 2r) ∈ S since ‖(0, ..., 0, ±2r)‖ = 2r > r,

but

(1/2)(x1*, ..., xn* + 2r) + (1 − 1/2)(x1*, ..., xn* − 2r) = (1/2)(2x1*, ..., 2xn* + 2r − 2r) = x* ∉ S.

For example, in the plane, the set

{(x, y) : x² + y² > 4} = R² \ B̄2((0, 0)) is not convex.

Moreover, the set is open since it is the complement of the closed disk with center (0, 0) and radius 2 (see Figure 1.14). It is not bounded since for t ≥ 2, the points (0, t²) belong to the set, but ‖(0, t²)‖ = t² −→ +∞ as t −→ +∞.

∗ ∗ ∗ The region located outside the hypersphere, including the hypersphere, described by

S = {x ∈ Rⁿ : ‖x − x*‖ ≥ r} = Rⁿ \ Br(x*), is not convex.

Example 8. “The union of convex sets is not necessarily convex”

∗ The union of the disk and the line in Figure 1.9 is not convex.

∗∗ The set E = {(x, y) ∈ R² : xy + x − y − 1 > 0}, graphed in Figure 1.15, is not convex.

Indeed, we have

xy + x − y − 1 > 0 ⇐⇒ (x − 1)(y + 1) > 0
⇐⇒ (x > 1 and y > −1) or (x < 1 and y < −1).

Thus E is the union of the sets

E1 = {(x, y) ∈ R² : x > 1 and y > −1}
E2 = {(x, y) ∈ R² : x < 1 and y < −1}.

E1 and E2 are convex since they are described by linear inequalities. However, E = E1 ∪ E2 is not convex since, for example, (2, 0) and (0, −2) are points of E, but

(1/2)(2, 0) + (1 − 1/2)(0, −2) = (1, −1) doesn't belong to the set E.

FIGURE 1.15: Union of convex sets
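A quick numerical illustration of this failure of convexity (an illustrative sketch, not from the text): check whether the midpoint of the two chosen points of E stays in E.

# Segment test for convexity of E = {(x, y) : x*y + x - y - 1 > 0}.
def in_E(p):
    x, y = p
    return x * y + x - y - 1 > 0

a, b = (2.0, 0.0), (0.0, -2.0)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
print(in_E(a), in_E(b), in_E(mid))   # True True False -> E is not convex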



1.3 Functions of Several Variables

We refer the reader to any book of calculus [1], [3], [21], [23] for details on
the points introduced in this section.

Definition 1.3.1 A function f of n variables x1 , · · · , xn is a rule that


assigns to each n-vector x = (x1 , . . . , xn ) in the domain of f , denoted by
Df , a unique number f (x) = f (x1 , . . . , xn ).

Example 1. Formulas may be used to model problems from different fields.

– Linear function

f (x1 , . . . , xn ) = a1 x1 + a2 x2 + . . . + an xn .

– The body mass index is described by the function

B(w, h) = w/h²

where w is the weight in kilograms and h is the height measured in meters.

– The distance of a point P(x, y, z) to a given point P0(x0, y0, z0) is a function of three variables

d(x, y, z) = √((x − x0)² + (y − y0)² + (z − z0)²).

– The Cobb-Douglas function, or production function, describes the relationship between the output (the product Q) and the inputs x1, ..., xn (capital, labor, ...) involved in the production process:

Q(x1, ..., xn) = C x1^a1 x2^a2 ... xn^an,   where C, a1, ..., an are constants, C > 0.

– The electric potential function for two positive charges, one at (0, 1) with twice the magnitude of the charge at (0, −1), is given by

ϕ(x, y) = 2/√(x² + (y − 1)²) + 1/√(x² + (y + 1)²).

Example 2. When given a formula for a function, first identify its domain of definition before any other calculation.

The domains of definition of the functions given by the following formulas

f(x) = √x,   g(x, y) = √x,   h(x, y, z) = √x

are

Df = {x ∈ R : x ≥ 0}

Dg = {(x, y) ∈ R² : x ≥ 0} : the half plane bounded by the y axis, including the axis and the points located in the 1st and 4th quadrants.

Dh = {(x, y, z) ∈ R³ : x ≥ 0} : the half space bounded by the yz plane, including this plane and the points with nonnegative 1st coordinate x ≥ 0.

The three domains Df, Dg, Dh are closed, convex, unbounded subsets of R, R² and R³ respectively; see Figure 1.16.

FIGURE 1.16: Domains of definition


Graphs and Level Curves

With the aid of monotonicity and convexity, sketching the graph of a real function is performed by plotting a few points. This is not possible in the case of dimension 3.
To get familiar with some sets in R³, we describe the traces method used for plotting graphs of functions of two variables. The method consists of sketching the intersections of the graph (or surface) with well-chosen planes, usually planes that are parallel to the coordinate planes:

xy-plane : z = 0 xz-plane : y = 0 yz-plane : x = 0.

These intersections are called traces.

Definition 1.3.2 The graph of a function f : x = (x1 , . . . , xn ) ∈ Df ⊂


Rn −→ z = f (x) ∈ R is the set

Gf = {(x, f (x)) ∈ Rn+1 : x ∈ Df }.

The set of points x in Rn satisfying f (x) = k is called a level surface of f .

When n = 2, a level surface f (x, y) = k is called level curve. It is the projection


of the trace Gf ∩[z = k] onto the xy-plane. Drawing level curves of f is another
way to picture the values of f .

The following examples illustrate how to proceed to graph some surfaces and
level curves.

Example 3. A cylinder is a surface that consists of all lines that are parallel
to a given line and that pass through a given plane curve.
Let

E = {(x, y, z) : x = y²}.

The set E cannot be the graph of a function z = f(x, y) since (1, 1, z) ∈ E for any z, and then (1, 1) would have an infinite number of images. However, we can look at E as the graph of the function x = f(y, z) = y². Moreover, we have

E = ∪_{z∈R} {(x, y, z) : x = y², (x, y) ∈ R²}.
This means that any horizontal plane z = k (parallel to the xy plane) intersects the graph in a curve with equation x = y². So these horizontal traces E ∩ [z = k], k ∈ R, are parabolas. The graph is formed by taking the parabola x = y² in the xy-plane and moving it in the direction of the z-axis. The graph is a parabolic cylinder as it can be seen as formed by parallel lines passing through the parabola x = y² in the xy-plane (see Figure 1.17).
Note that for any k ∈ R, the level curve z = k is the parabola x = y² in the xy plane.

FIGURE 1.17: Parabolic cylinder

Example 4. An elliptic paraboloid, in its standard form, is the graph of the function

f(x, y) = z = x²/a² + y²/b²   with a > 0, b > 0.

The graph

Gf = ∪_{z∈[0,+∞)} {(x, y, z) : x²/a² + y²/b² = z}

can be seen as the union of the ellipses x²/a² + y²/b² = k in the planes z = k, k ≥ 0.
By choosing the traces in Table 1.3, we can shape the graph in the space (see Figure 1.18 for a = 2, b = 3):
plane          trace
xy (z = 0)     point : (0, 0)
xz (y = 0)     parabola : z = x²/a²
yz (x = 0)     parabola : z = y²/b²
z = 1          ellipse : x²/a² + y²/b² = 1

TABLE 1.3: Traces to sketch a paraboloid


FIGURE 1.18: Elliptic paraboloid

Note that for any k < 0, the level curves z = k are not defined. For k > 0, the level curves are ellipses x²/(a√k)² + y²/(b√k)² = 1 centered at the origin. For k = 0, the level curve is reduced to the point (0, 0).

Example 5. The elliptic cone, in its standard form, is described by the equation

z² = x²/a² + y²/b²   with a > 0, b > 0.

It is the union of the graphs of the functions z = ±√(x²/a² + y²/b²).
To sketch the cone, one can make the choice of traces in Table 1.4 (see Figure 1.19 for a = 2, b = 3):
plane          trace
xy (z = 0)     point : (0, 0)
xz (y = 0)     lines : z = ±x/a
yz (x = 0)     lines : z = ±y/b
z = ±1         ellipse : x²/a² + y²/b² = 1

TABLE 1.4: Traces to sketch a cone


FIGURE 1.19: Elliptic cone

Note that for any k ≠ 0, the level curves z = ±k are ellipses x²/(|k|a)² + y²/(|k|b)² = 1 centered at the origin. For k = 0, the level curve is reduced to the point (0, 0).

Example 6. The ellipsoid, in its standard form, is described by the equation

x²/a² + y²/b² + z²/c² = 1   with a > 0, b > 0, c > 0.

It is the union of the graphs of the functions z = ±c√(1 − x²/a² − y²/b²), which one can sketch by making the following choice of traces in Table 1.5 (see Figure 1.20 for a = 2, b = 3, c = 4):
plane          trace
xy (z = 0)     ellipse : x²/a² + y²/b² = 1
xz (y = 0)     ellipse : x²/a² + z²/c² = 1
yz (x = 0)     ellipse : y²/b² + z²/c² = 1

TABLE 1.5: Traces to sketch an ellipsoid


FIGURE 1.20: An ellipsoid

For |k| < c, the level curves z = ±k are ellipses centered at the origin with vertices (±a√(1 − k²/c²), 0) and (0, ±b√(1 − k²/c²)) in the xy plane.

Limits and Continuity

For the local study of a function, the concept of limit is generalized to functions of several variables as follows:

Definition 1.3.3 Let x0 ∈ Rⁿ and let f be a function defined on Df ∩ (Br(x0) \ {x0}). We write lim_{x→x0} f(x) = L

⇐⇒ ∀ε > 0, ∃δ > 0 such that ∀x : 0 < ‖x − x0‖ < δ =⇒ |f(x) − L| < ε.


Remark 1.3.1 i) The definition above supposes that f is defined in a


neighborhood of x0 , except possibly at x0 . It includes points x0 located
at the boundary of the domain of f .

ii) One can establish, using similar tools in one dimension [2], that the
standard properties of limits hold for limits of functions of n variables.
iii) If the limit of f (x) fails to exist as x −→ x0 along some smooth
curve, or if f (x) has different limits as x −→ x0 along two different
smooth curves, then the limit of f (x) does not exist as x −→ x0 .

Example 7.

• lim_{x→a} xi = ai, i = 1, ..., n, for a = (a1, ..., an) ∈ Rⁿ.

Indeed, for ε > 0, choose δ = ε > 0. Then, we have for x satisfying

‖x − a‖ < δ =⇒ |xi − ai| ≤ ‖x − a‖ < δ = ε.

• Algebraic operations on limits.

lim_{(x,y,z)→(1,2,3)} (3xy² + z − 5)
   = lim_{(x,y,z)→(1,2,3)} [3xy²] + lim_{(x,y,z)→(1,2,3)} z − lim_{(x,y,z)→(1,2,3)} 5
   = 3 [lim_{(x,y,z)→(1,2,3)} x] . [lim_{(x,y,z)→(1,2,3)} y]² + 3 − 5 = 3(1)(2)² + 3 − 5 = 10.

• The limit

lim_{(x,y)→(0,0)} 2x²y/(x⁴ + y²) does not exist.

Indeed, if we consider the smooth curves C1 and C2 with equations y = x² and y = x respectively, we find that

lim_{(x,y)→(0,0), (x,y)∈C1} 2x²y/(x⁴ + y²) = lim_{x→0} 2x²x²/(x⁴ + (x²)²) = lim_{x→0} 2x⁴/(2x⁴) = 1,

lim_{(x,y)→(0,0), (x,y)∈C2} 2x²y/(x⁴ + y²) = lim_{x→0} 2x²x/(x⁴ + x²) = lim_{x→0} 2x/(x² + 1) = 0;

the limits have different values along C1 and C2 (see Figure 1.21).
FIGURE 1.21: Behavior of f(x, y) = 2x²y/(x⁴ + y²) near (0, 0)
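A numerical illustration of the same phenomenon (not part of the text): evaluating f along the two curves y = x² and y = x as x → 0 shows the two different limiting values.

# f(x, y) = 2*x**2*y / (x**4 + y**2) approached along two curves.
f = lambda x, y: 2 * x**2 * y / (x**4 + y**2)
for x in [0.1, 0.01, 0.001]:
    print(f(x, x**2), f(x, x))   # first column -> 1, second column -> 0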
x4 + y 2

Definition 1.3.4 Let f be a function defined on Df ⊂ Rⁿ. Then

f is continuous at x0 ⇐⇒ f(x0) is defined and lim_{x→x0} f(x) = f(x0).

If f is continuous at every point in an open set O, then we say that f is continuous on O.

Remark 1.3.2 A function of n variables that can be constructed from continuous functions by combining the operations of addition, subtraction, multiplication, division and composition is continuous wherever it is defined.

Example 8. Give the largest region where f is continuous:

f(x, y) = 1/(e^{xy} − 1).

Solution: f is continuous on its domain of definition

Df = R² \ {(x, y) ∈ R² : x = 0 or y = 0}.

More precisely, we have

∗ (x, y) −→ xy is continuous on R² as the product of the function (x, y) −→ x and the function (x, y) −→ y;

∗∗ (x, y) −→ 1/(e^{xy} − 1) is continuous on Df as the composition of the C⁰ function (x, y) −→ xy on R² and the C⁰ function t −→ 1/(e^t − 1) on R \ {0}:

(x, y) ∈ Df −→ xy = t ∈ R \ {0} −→ 1/(e^t − 1).

First-order Partial Derivatives

Our purpose now is to generalize the concept of differentiability to functions of several variables. More precisely, we will show that the existence of a tangent line for a real differentiable function f at a point x0 extends to the existence of a tangent hyperplane for a differentiable function of several variables. First, we introduce some tools:

Definition 1.3.5 If z = f(x) = f(x1, ..., xn), then the quantity

∂f/∂xi (x) = lim_{h→0} [f(x1, ..., xi + h, ..., xn) − f(x1, ..., xi, ..., xn)]/h

is the partial derivative of f(x1, ..., xn) with respect to xi when all the other variables xj (j ≠ i, j = 1, ..., n) are held constant.

Remark 1.3.3 - The partial derivative

∂f/∂xi (a) = d/dxi [f(a1, ..., xi, ..., an)] |_{xi=ai},   i = 1, ..., n,

can be viewed as the slope of the line tangent to the curve Ci : z = f(a1, ..., xi, ..., an) at the point a, or the rate of change of z with respect to xi along the curve Ci at a.
- Other notations are:

∂f/∂xi = ∂z/∂xi = f_{xi} = z_{xi},   i = 1, ..., n.

- We call the gradient of f the vector

∇f(x) = ⟨f_{x1}, f_{x2}, ..., f_{xn}⟩ = f′(x).

Example 9. Let f(w, x, y, z) = x e^{yw} sin z. Find

fx(1, 2, 3, π/2), fy(1, 2, 3, π/2), fz(1, 2, 3, π/2) and fw(1, 2, 3, π/2).

Solution: We have

fx = e^{yw} sin z,       fx(1, 2, 3, π/2) = e^{yw} sin z |_{(w,x,y,z)=(1,2,3,π/2)} = e³
fy = xw e^{yw} sin z,    fy(1, 2, 3, π/2) = xw e^{yw} sin z |_{(w,x,y,z)=(1,2,3,π/2)} = 2e³
fz = x e^{yw} cos z,     fz(1, 2, 3, π/2) = x e^{yw} cos z |_{(w,x,y,z)=(1,2,3,π/2)} = 0
fw = xy e^{yw} sin z,    fw(1, 2, 3, π/2) = xy e^{yw} sin z |_{(w,x,y,z)=(1,2,3,π/2)} = 6e³.

Example 10. The rate of change of the body mass index function B(w, h) = w/h² with respect to the weight w at a constant height h is

∂B/∂w = 1/h² > 0.

Thus, at constant height, the BMI increases with the weight at the rate 1/h².

The rate of change of the BMI with respect to the height h at a constant weight w is

∂B/∂h = −2w/h³ < 0.

Therefore, at a given weight, the BMI is a decreasing function of the height.
Higher Order Partial Derivatives

• Each partial derivative is also a function of n variables. These functions may themselves have partial derivatives, called second-order derivatives. For each i, j = 1, ..., n, we have

∂/∂xj (∂f/∂xi) = ∂²f/∂xj∂xi = f_{xi xj}.

The n second-order partial derivatives f_{xi xi} are called direct second-order partials; the others, f_{xi xj} where i ≠ j, are called mixed second-order partials. Usually these second-order partial derivatives are displayed in an n × n matrix named the Hessian:

Hf(x) = (f_{xi xj})_{n×n} =
[ f_{x1x1}  f_{x1x2}  ...  f_{x1xn} ]
[ f_{x2x1}  f_{x2x2}  ...  f_{x2xn} ]
[   ...       ...     ...    ...    ]
[ f_{xnx1}  f_{xnx2}  ...  f_{xnxn} ]

• The mixed derivatives are equal in the following situation [15]:

Theorem 1.3.1 Clairaut's theorem

Let f(x) = f(x1, x2, ..., xn). If f_{xi xj} and f_{xj xi}, i ≠ j, for i, j ∈ {1, ..., n}, are defined on a neighborhood of a point a ∈ Rⁿ and are continuous at a, then

f_{xi xj}(a) = f_{xj xi}(a).

• Third-order, fourth-order and higher-order partial derivatives can be ob-


tained by successive differentiation. Clairaut’s theorem reduces the steps of
calculations when the continuity assumption is satisfied.

Example 11. Write the Hessian of the Cobb-Douglas function

Q(L, K) = c L^a K^b   (c, a, b are positive constants)

where the two inputs are labor L and capital K.

Solution: For L, K > 0, we have

ln Q = ln c + a ln L + b ln K

∂(ln Q)/∂L = QL/Q = a/L   =⇒   QL = (a/L) Q
∂(ln Q)/∂K = QK/Q = b/K   =⇒   QK = (b/K) Q

QLL = (a/L) QL + (−a/L²) Q = (a/L)(a/L) Q − (a/L²) Q = [a(a − 1)/L²] Q
QKK = (b/K) QK + (−b/K²) Q = (b/K)(b/K) Q − (b/K²) Q = [b(b − 1)/K²] Q
QKL = QLK = (a/L) QK = [ab/(LK)] Q.

The Hessian matrix of Q is given by:

HQ(L, K) = [ QLL  QLK ] = Q [ a(a − 1)/L²   ab/(LK)     ]
           [ QKL  QKK ]     [ ab/(LK)       b(b − 1)/K² ]
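A symbolic check of this Hessian (an illustrative SymPy sketch, not part of the text):

import sympy as sp

L, K, c, a, b = sp.symbols('L K c a b', positive=True)
Q = c * L**a * K**b
H = sp.hessian(Q, (L, K))
# Compare with the closed form Q * [[a(a-1)/L**2, ab/(LK)], [ab/(LK), b(b-1)/K**2]].
expected = Q * sp.Matrix([[a*(a - 1)/L**2, a*b/(L*K)],
                          [a*b/(L*K), b*(b - 1)/K**2]])
print(sp.simplify(H - expected))   # zero matrix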
Example 12. Laplace's equation for a function u = u(x1, ..., xn) is

Δu = ∂²u/∂x1² + ∂²u/∂x2² + ... + ∂²u/∂xn² = 0.

For which value of k does the function u = (x1² + x2² + ... + xn²)^k satisfy Laplace's equation?

Solution: We have

∂u/∂xi = 2xi k (x1² + x2² + ... + xn²)^{k−1}

∂²u/∂xi² = 2k (x1² + x2² + ... + xn²)^{k−1} + 4xi² k(k − 1)(x1² + x2² + ... + xn²)^{k−2}

Δu = ∂²u/∂x1² + ∂²u/∂x2² + ... + ∂²u/∂xn²
   = 2kn (x1² + ... + xn²)^{k−1} + 4k(k − 1) Σ_{i=1}^{n} xi² (x1² + ... + xn²)^{k−2}
   = 2k [n + 2(k − 1)] (x1² + ... + xn²)^{k−1}.

Thus Δu = 0 if n + 2(k − 1) = 0, i.e., for k = 1 − n/2.
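A quick symbolic check for a specific dimension (illustrative, not from the text), here n = 3, where k = 1 − 3/2 = −1/2 gives u = 1/√(x² + y² + z²):

import sympy as sp

x, y, z = sp.symbols('x y z')
k = 1 - sp.Rational(3, 2)                          # k = 1 - n/2 with n = 3
u = (x**2 + y**2 + z**2) ** k
laplacian = sp.diff(u, x, 2) + sp.diff(u, y, 2) + sp.diff(u, z, 2)
print(sp.simplify(laplacian))                      # 0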

Differentiability

While the existence of a derivative of a one-variable function at a point guarantees the continuity of the function at this point, the existence of partial derivatives for a function of several variables doesn't. Indeed, for example,

f(x, y) = 2 if x > 0 and y > 0,   f(x, y) = 0 otherwise,

has partial derivatives at (0, 0) since

fx(0, 0) = lim_{h→0} [f(h, 0) − f(0, 0)]/h = lim_{h→0} (0 − 0)/h = 0,
fy(0, 0) = lim_{h→0} [f(0, h) − f(0, 0)]/h = 0,

but f is not continuous at (0, 0) since

lim_{t→0⁺} f(t, t) = lim_{t→0⁺} 2 = 2 ≠ 0 = f(0, 0).

This motivates the following definition.

Definition 1.3.6 A function f of n variables is said to be differentiable at a = (a1, ..., an) provided that fxi(a), i = 1, ..., n, exist and that there exists a function ε : R⁺ −→ R such that

f(x) = f(a) + fx1(a)(x1 − a1) + ... + fxn(a)(xn − an) + ‖x − a‖ ε(‖x − a‖)

with

lim_{x→a} ε(‖x − a‖) = 0.

Remark 1.3.4 The definition extends the concept of differentiability of


functions of one variable to functions of n variables in such a way that we
preserve properties like:

- f continuous at a;

- the values of f at points near a can be very closely approximated by the


values of a linear function:

f (x) ≈ f (a) + fx1 (a)(x1 − a1 ) + . . . + fxn (a)(xn − an ).



The next theorem provides particular conditions for a function f to be differ-


entiable.

Theorem 1.3.2 If all first-order partial derivatives of f exist and are con-
tinuous at a point, then f is differentiable at that point.

If f has continuous partial derivatives of first-order in a domain D, we call f


continuously differentiable in D. In this case, f is also called a C 1 function
on D. If all partial derivatives up to order k exist and are continuous, f is
called a C k function.

Example 13. Use the linear approximation to estimate the change of the Cobb-Douglas production function

Q(L, K) = L^{1/3} K^{2/3} from (20, 10) to (20.6, 10.3).

Solution: We have

QL(L, K) = Q/(3L),   QK(L, K) = 2Q/(3K),   Q(20, 10) = 20^{1/3} 10^{2/3} = 10 (2^{1/3}),
QL(20, 10) = Q(20, 10)/60,   QK(20, 10) = 2 Q(20, 10)/30.

Thus, close to (20, 10), we have

Q(L, K) ≈ Q(20, 10) + QL(20, 10)(L − 20) + QK(20, 10)(K − 10)
        = [1 + (L − 20)/60 + 2(K − 10)/30] Q(20, 10),

from which we deduce the estimate

Q(20.6, 10.3) ≈ [1 + (20.6 − 20)/60 + 2(10.3 − 10)/30] Q(20, 10) = 1.03 Q(20, 10).
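A numerical comparison of the exact value and the linear estimate (illustrative, not part of the text):

# Exact value vs. linear approximation of Q(L, K) = L**(1/3) * K**(2/3).
Q = lambda L, K: L ** (1 / 3) * K ** (2 / 3)
Q0 = Q(20, 10)
linear = (1 + 0.6 / 60 + 2 * 0.3 / 30) * Q0      # 1.03 * Q(20, 10)
print(Q(20.6, 10.3), linear)                     # both approximately 12.98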
Another consequence of differentiability is the chain rule for differentiating compositions.

Theorem 1.3.3 Chain rule 1


If f is differentiable at x = (x1 , x2 , . . . , xn ) and each xj = xj (t), j =
1, . . . , n, is a differentiable function of a variable t, then z = f (x(t)) is
differentiable at t and
dz ∂z dx1 ∂z dx2 ∂z dxn
= + + ...... + .
dt ∂x1 dt ∂x2 dt ∂xn dt
Proof. Since f is differentiable at the point a = x(t0), then, for x(t) close to a, we have

f(x(t)) − f(a) = fx1(a)(x1(t) − a1) + ... + fxn(a)(xn(t) − an)
                 + ‖x(t) − a‖ ε(‖x(t) − a‖)   with lim_{x→a} ε(‖x − a‖) = 0.

Dividing each side of the equality by Δt = t − t0, we obtain

[f(x(t)) − f(a)]/Δt = fx1(a) [x1(t) − a1]/Δt + ... + fxn(a) [xn(t) − an]/Δt
                      + (‖x(t) − a‖/|Δt|) ε(‖x(t) − a‖).

Then, letting t −→ t0 and using the fact that each xj = xj(t), j = 1, ..., n, is a differentiable function of the variable t and that lim_{x→a} ε(‖x − a‖) = 0, we get

lim_{t→t0} [f(x(t)) − f(a)]/Δt = fx1(a) . lim_{t→t0} [x1(t) − a1]/Δt + ...
      + fxn(a) . lim_{t→t0} [xn(t) − an]/Δt + lim_{t→t0} (‖x(t) − a‖/|Δt|) . lim_{t→t0} ε(‖x(t) − a‖),

from which we deduce that

d(f(x(t)))/dt |_{t=t0} = fx1(a) . dx1/dt(t0) + ... + fxn(a) . dxn/dt(t0) + ‖dx/dt(t0)‖ . 0

and the result follows.

In the general situation, each variable xi is a function of m independent variables t1, t2, ..., tm. Then z = f(x(t1, t2, ..., tm)) is a function of t1, t2, ..., tm. To compute ∂z/∂tj, we hold each ti with i ≠ j fixed and compute the ordinary derivative of z with respect to tj. The result is given by the following theorem:

Theorem 1.3.4 Chain rule 2


If f is differentiable at x = (x1 , x2 , . . . , xn ) and each xj =
xj (t1 , t2 , · · · , tm ), j = 1, · · · , n, is a differentiable function of m vari-
ables t1 , t2 , . . . , tm , then z = f (x(t1 , t2 , . . . , tm )) is differentiable at
(t1 , t2 , . . . , tm ) and

∂z ∂z ∂x1 ∂z ∂x2 ∂z ∂xn


= + + ...... + .
∂ti ∂x1 ∂ti ∂x2 ∂ti ∂xn ∂ti

Example 14. Let

f(x, y) = x² − 2xy + 2y³,   x = s ln t,   y = s t.

Use the chain rule formula to find

∂f/∂s, ∂f/∂t, ∂f/∂s |_{s=1,t=1} and ∂f/∂t |_{s=1,t=1}.

Solution: i) We have

∂f/∂x = 2x − 2y,   ∂f/∂y = −2x + 6y²,
x = x(s, t),   ∂x/∂s = ln t,   ∂x/∂t = s/t,
y = y(s, t),   ∂y/∂s = t,   ∂y/∂t = s.

Hence the partial derivatives of f at (s, t) are:

∂f/∂s = ∂f/∂x . ∂x/∂s + ∂f/∂y . ∂y/∂s = (2x − 2y) ln t + (−2x + 6y²) t
       = (2s ln t − 2st) ln t + (−2s ln t + 6s²t²) t

∂f/∂t = ∂f/∂x . ∂x/∂t + ∂f/∂y . ∂y/∂t = (2x − 2y) s/t + (−2x + 6y²) s
       = (2s ln t − 2st) s/t + (−2s ln t + 6s²t²) s.

ii) When s = 1 and t = 1, we have

x(1, 1) = (1) ln(1) = 0 and y(1, 1) = 1.

Thus the partial derivatives of f at (s, t) = (1, 1) are:

∂f/∂s |_{s=1,t=1} = [(2x − 2y) ln t + (−2x + 6y²) t] |_{s=1,t=1} = 6
∂f/∂t |_{s=1,t=1} = [(2x − 2y) s/t + (−2x + 6y²) s] |_{s=1,t=1} = 4.
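The values 6 and 4 can be verified by substituting x and y into f and differentiating directly (illustrative SymPy sketch, not part of the text):

import sympy as sp

s, t = sp.symbols('s t', positive=True)
x, y = s * sp.log(t), s * t
f = x**2 - 2 * x * y + 2 * y**3
print(sp.diff(f, s).subs({s: 1, t: 1}), sp.diff(f, t).subs({s: 1, t: 1}))  # 6 4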

Solved Problems

1. – Sketch the domains of definition of the functions given by the following formulas:

i) f(x, y) = e^{2x} √(y − x²)   ii) f(x, y, z) = z √((1 − x²)(y² − 4))
iii) H(x, y, z) = √(z − x² − y²).

Solution:

FIGURE 1.22: Domains of definition


i) f(x, y) = e^{2x} √(y − x²)

Df = {(x, y) ∈ R² : y − x² ≥ 0},
the plane region located above the parabola y = x², including the parabola.

ii) f(x, y, z) = z √((1 − x²)(y² − 4))

Df = {(x, y, z) ∈ R³ : (1 − x²)(y² − 4) ≥ 0}

           (−∞, −2)   (−2, −1)   (−1, 1)   (1, 2)   (2, +∞)
1 − x²        −           −          +         −        −
y² − 4        +           −          −         −        +

so

Df = ([−1, 1] × ((−∞, −2] ∪ [2, +∞)) × R) ∪ (((−∞, −1] ∪ [1, +∞)) × [−2, 2] × R).

iii) H(x, y, z) = √(z − x² − y²)

DH = {(x, y, z) ∈ R³ : z − x² − y² ≥ 0},
the set of points bounded by the paraboloid z = x² + y², including the paraboloid.
The three domains are illustrated in Figure 1.22.

2. – Match the functions with their graphs in Figure 1.23.

a. y − z² = 0    b. x + y + z = 0    c. 4x² + y²/9 + z² = 1
d. x² + y²/9 − z² = 1    e. x² + y²/9 = z²    f. z − y² = 0

FIGURE 1.23: Surfaces in R³


Solution:

equation of the surface       its graph   why?
a. y − z² = 0                 (D)         parabolic cylinder in the direction of the x-axis, located in y ≥ 0
b. x + y + z = 0              (A)         a plane
c. 4x² + y²/9 + z² = 1        (E)         ellipsoid centered at (0, 0, 0)
d. x² + y²/9 − z² = 1         (F)         the traces at z = −1, 0, 1 are ellipses
e. x² + y²/9 = z²             (B)         elliptic cone
f. z − y² = 0                 (C)         parabolic cylinder in the direction of the x-axis, located in z ≥ 0

3. – Sketch the graphs of the following functions:

i) f(x, y) = √(81 − x²)   ii) f(x, y) = 3   iii) f(x, y) = −√(x² + y²).

Solution: i)

FIGURE 1.24: Domain and graph of z = √(81 − x²)

Domain of f: Df = {(x, y) ∈ R² : 81 − x² ≥ 0} = {(x, y) ∈ R² : |x| ≤ 9}

Graph of f: Gf = {(x, y, z) ∈ R³ : (x, y) ∈ Df , z = √(81 − x²)}
          = {(x, y, z) ∈ R³ : (x, y) ∈ Df , x² + z² = 81, z ≥ 0}.

It is the half circular cylinder located in the region z ≥ 0 with radius 9 and axis the y axis (see Figure 1.24).
ii)

Domain of f: Df = {(x, y) ∈ R² : f(x, y) = 3 ∈ R} = R²

Graph of f: Gf = {(x, y, z) ∈ R³ : (x, y) ∈ Df , z = 3}.

It is the plane passing through (0, 0, 3) with normal vector k = ⟨0, 0, 1⟩ (see Figure 1.25).

iii)

Domain of f: Df = {(x, y) ∈ R² : x² + y² ≥ 0} = R²

Graph of f: Gf = {(x, y, z) ∈ R³ : (x, y) ∈ Df , z = −√(x² + y²)}
          = {(x, y, z) ∈ R³ : (x, y) ∈ R², z² = x² + y², z ≤ 0}.

The graph is the part of the circular cone z² = x² + y² located in the region [z ≤ 0]; see Figure 1.25.
FIGURE 1.25: Graph of z = 3 and graph of z = −√(x² + y²)
4. – Match the surfaces with the level curves in Figure 1.26.

FIGURE 1.26: Surfaces and their level curves

Solution:

level curves   (1)   (2)   (3)   (4)   (5)   (6)
surface        (E)   (F)   (C)   (A)   (D)   (B)


5. – Draw a set of level curves for the following functions:

i) z = x² + y   ii) f(x, y, z) = (x − 2)² + y² + z².

Solution: i) We have Df = {(x, y) ∈ R², x² + y ∈ R} = R²,

z = x² + y = k ⇐⇒ k − y = x² : parabola with vertex (0, k) and axis the line Oy; see Figure 1.27.

ii) The level surface (see the 2nd graph in Figure 1.27) (x − 2)² + y² + z² = k is reduced to

the point (2, 0, 0) if k = 0;
the sphere centered at (2, 0, 0) with radius √k if k > 0;
no points if k < 0.

FIGURE 1.27: Level curves x² + y = k and level surfaces (x − 2)² + y² + z² = k

6. – Sketch the largest region on which the function is continuous. Explain why the function is continuous.

f(x, y, z) = √(y − x²) ln z.
Introduction 43

Solution: f is continuous on its domain of definition

Df = {(x, y, z) ∈ R3 / y − x2  0 and z > 0}

because it is the product of the two continuous functions:

∗ u : (x, y, z) → ln z is continuous on D₁ = {(x, y, z) : z > 0} with values in R, as the composite of the polynomial function (x, y, z) ∈ D₁ → z ∈ R⁺ \ {0} and the function t → ln t, continuous on R⁺ \ {0}; we have (x, y, z) ∈ D₁ → z = t ∈ R⁺ \ {0} → ln t.

∗∗ v : (x, y, z) → √(y − x²) is continuous on D₂ = {(x, y, z) : y − x² ≥ 0} as the composite of the polynomial function (x, y, z) ∈ D₂ → y − x² ∈ R⁺ and the function t → √t, continuous on R⁺; we have (x, y, z) ∈ D₂ → y − x² = t ∈ R⁺ → √t.

∗∗∗ f = u·v is continuous on D₁ ∩ D₂ = Df, the set in Figure 1.28.

FIGURE 1.28: Domain of continuity of f(x, y, z) = √(y − x²) ln z

7. – Let f (x, y, z) = x2 y 2 − y 3 + 3x4 + xe−2z sin(πy) + 5. Find

(a) fxy (b) fyz (c) fxz (d) fzz

(e) fzyy (f ) fxxy (g) fzyx (h) fxxyz .



Solution: Note that f is indefinitely differentiable. Therefore, we can change the order of differentiation with respect to the variables by using Clairaut's theorem.

fx = 2xy 2 + 12x3 + e−2z sin(πy)

fy = 2x2 y − 3y 2 + πxe−2z cos(πy) fz = −2xe−2z sin(πy)

(a) fxy = (fx )y = 4xy + πe−2z cos(πy)

(b) fyz = (fy )z = −2πxe−2z cos(πy)

(c) fxz = (fx )z = −2e−2z sin(πy) (d) fzz = (fz )z = 4xe−2z sin(πy)

(e) fzyy = (fzy )y = (fyz )y = 2π 2 xe−2z sin(πy),

(f ) fxxy = (fx )xy = (fx )yx = (fxy )x = 4y

(g) fzyx = (fzy )x = (fyz )x = −2πe−2z cos(πy),

(h) fxxyz = (fxxy )z = (4y)z = 0.
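
These mixed partials can also be double-checked symbolically; the following is a small illustrative sketch using Python's sympy (not part of the text).

import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2*y**2 - y**3 + 3*x**4 + x*sp.exp(-2*z)*sp.sin(sp.pi*y) + 5

# fxxy: two derivatives in x, one in y; the order is irrelevant by Clairaut's theorem
print(sp.simplify(sp.diff(f, x, x, y)))                                            # expect 4*y
print(sp.simplify(sp.diff(f, z, y, x) + 2*sp.pi*sp.exp(-2*z)*sp.cos(sp.pi*y)))     # expect 0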

8. – Show that u = ln(x² + y²) satisfies Laplace's equation ∂²u/∂x² + ∂²u/∂y² = 0.
Show, without calculation, that ∂²u/∂x∂y = ∂²u/∂y∂x.

Solution: We have

∂u/∂x = 2x/(x² + y²),    ∂u/∂y = 2y/(x² + y²),

∂²u/∂x² = 2[(x² + y²) − x(2x)]/(x² + y²)² = 2(y² − x²)/(x² + y²)²,    ∂²u/∂y² = 2(x² − y²)/(x² + y²)²,

∂²u/∂x² + ∂²u/∂y² = 2(y² − x²)/(x² + y²)² + 2(x² − y²)/(x² + y²)² = 0.
Introduction 45

Note that ∂u/∂x is a rational function of (x, y); hence ∂²u/∂y∂x is also a rational function and is therefore continuous on R² \ {(0, 0)}.
In the same way, ∂u/∂y is a rational function, so ∂²u/∂x∂y is also a rational function, continuous on R² \ {(0, 0)}.
From Clairaut's theorem, the two mixed second derivatives u_xy and u_yx are equal on R² \ {(0, 0)}.

9. – Find the value dw/ds |_{s=0} if

w = x² e^{2y} cos(3z);    x = cos s,  y = ln(s + 2),  z = s.

Solution: We have x = x(s), y = y(s), z = z(s) and w = w(x, y, z). Then

dx/ds = −sin s,    dy/ds = 1/(s + 2),    dz/ds = 1,

∂w/∂x = 2x e^{2y} cos(3z),    ∂w/∂y = 2x² e^{2y} cos(3z),    ∂w/∂z = −3x² e^{2y} sin(3z),

x(0) = 1,    y(0) = ln 2,    z(0) = 0.

By the chain rule,

dw/ds = (∂w/∂x)(dx/ds) + (∂w/∂y)(dy/ds) + (∂w/∂z)(dz/ds)
      = [2x e^{2y} cos(3z)](−sin s) + [2x² e^{2y} cos(3z)]·1/(s + 2) + [−3x² e^{2y} sin(3z)]·1.

At s = 0 the first and third terms vanish, so dw/ds |_{s=0} = 2 e^{2 ln 2}·(1/2) = e^{2 ln 2} = 4.
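
As a quick numerical cross-check (an illustrative sketch in Python, not from the text), one can substitute the parametrization and differentiate directly:

import sympy as sp

s = sp.symbols('s')
x, y, z = sp.cos(s), sp.log(s + 2), s
w = x**2 * sp.exp(2*y) * sp.cos(3*z)   # w written directly as a function of s

print(sp.diff(w, s).subs(s, 0))        # expect 4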

10. – Let

R = ln(u2 + v 2 + w2 ), u = x + 2y, v = 2x − y, w = 2xy.

Find
∂R/∂x |_{x=1, y=0}    and    ∂R/∂y |_{x=1, y=0}.

Solution: We have

∂R/∂u = 2u/(u² + v² + w²),    ∂R/∂v = 2v/(u² + v² + w²),    ∂R/∂w = 2w/(u² + v² + w²),

∂u/∂x = 1,  ∂v/∂x = 2,  ∂w/∂x = 2y,
∂u/∂y = 2,  ∂v/∂y = −1,  ∂w/∂y = 2x.

The partial derivatives of R are:

∂R/∂x = (∂R/∂u)(∂u/∂x) + (∂R/∂v)(∂v/∂x) + (∂R/∂w)(∂w/∂x) = (2u + 4v + 4wy)/(u² + v² + w²)

∂R/∂y = (∂R/∂u)(∂u/∂y) + (∂R/∂v)(∂v/∂y) + (∂R/∂w)(∂w/∂y) = (4u − 2v + 4wx)/(u² + v² + w²).

When x = 1 and y = 0, we have u = 1, v = 2, w = 0, u² + v² + w² = 5. Thus

∂R/∂x = (2(1) + 4(2) + 4(0))/5 = 2,    ∂R/∂y = (4(1) − 2(2) + 4(0))/5 = 0.


11. – Use the linear approximation of f(x, y, z) = x³√(y² + z²) at the point (2, 3, 4) to estimate the number

(1.98)³ √((3.01)² + (3.97)²).

Solution: Since f is differentiable at the point (2, 3, 4), the linear approxima-
tion of L(x, y, z) at the point (2, 3, 4) is given by:

L(x, y, z) = f (2, 3, 4) + fx (2, 3, 4)(x − 2) + fy (2, 3, 4)(y − 3) + fz (2, 3, 4)(z − 4).

We have

fx = 3x²√(y² + z²),    fy = yx³/√(y² + z²),    fz = zx³/√(y² + z²)

and
f(2, 3, 4) = 40,    fx(2, 3, 4) = 60,    fy(2, 3, 4) = 24/5,    fz(2, 3, 4) = 32/5.

Thus
L(x, y, z) = 40 + 60(x − 2) + (24/5)(y − 3) + (32/5)(z − 4).

Using this approximation, one obtains the following estimate:

(1.98)³ √((3.01)² + (3.97)²) ≈ L(1.98, 3.01, 3.97)
= 40 + 60(1.98 − 2) + (24/5)(3.01 − 3) + (32/5)(3.97 − 4)
= 40 + 60(−0.02) + (24/5)(0.01) + (32/5)(−0.03) = 38.656.
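
For comparison, the exact value can be computed directly; the short Python check below (an illustrative sketch, not part of the text) shows the linear approximation is accurate to about two decimal places.

import math

def f(x, y, z):
    return x**3 * math.sqrt(y**2 + z**2)

def L(x, y, z):
    # linear approximation of f at (2, 3, 4)
    return 40 + 60*(x - 2) + (24/5)*(y - 3) + (32/5)*(z - 4)

print(L(1.98, 3.01, 3.97))   # 38.656
print(f(1.98, 3.01, 3.97))   # approximately 38.67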
5 5

12. – Determine whether the limit exists. If so, find its value.

lim_{(x,y)→(0,0)} (x⁴ − x + y − x³y)/(x − y),    lim_{(x,y)→(0,0)} cos(xy)/(x + y),    lim_{(x,y)→(1,1)} (x − y⁴)/(x³ − y⁴).

Solution: We have

i) lim_{(x,y)→(0,0)} (x⁴ − x + y − x³y)/(x − y) = lim_{(x,y)→(0,0)} [x³(x − y) − (x − y)]/(x − y)
   = lim_{(x,y)→(0,0)} x³ − 1 = −1.

ii) lim_{(x,y)→(0,0)} cos(xy)/(x + y) doesn't exist since

lim_{(x,y)=(t,t), t>0 →(0,0)} cos(xy)/(x + y) = lim_{t→0⁺} cos(t²)/(2t) = +∞
and lim_{(x,y)=(t,t), t<0 →(0,0)} cos(xy)/(x + y) = lim_{t→0⁻} cos(t²)/(2t) = −∞.

iii) Let C1 and C2 be the curves x = 1 and x = y respectively. We have

lim_{(x,y)→(1,1), (x,y)∈C1} (x − y⁴)/(x³ − y⁴) = lim_{y→1} (1 − y⁴)/(1 − y⁴) = lim_{y→1} 1 = 1,

lim_{(x,y)→(1,1), (x,y)∈C2} (x − y⁴)/(x³ − y⁴) = lim_{y→1} y(1 − y³)/(y³(1 − y)) = lim_{y→1} (y² + y + 1)/y² = 3.

The limits are different along C1 and C2 . Thus, the limit doesn’t exist.
Chapter 2
Unconstrained Optimization

In this chapter, we are interested in optimizing several variables’ functions f : x =


(x1 , . . . , xn ) −→ f (x) ∈ R over subsets S of Rn with nonempty interior.

Many results are well known when dealing with functions of one variable (n = 1).
The concept of differentiability offered useful and flexible tools to get local and global
behaviors of a function. These results are generalized to functions of n variables in
these notes. Indeed, we obtain, in Section 2.1, a characterization of local critical
points as solutions of the vectorial equation f  (x) = 0 when f is regular. In Section
2.2, we use the second partial derivatives to identify the nature of the critical points.
In Section 2.3, first we define the convexity-concavity property for a function of
several variables, then we show how to use it to identify the global extreme points.
Finally, Section 2.4 extends the extreme value theorem to continuous functions on
closed bounded subsets of Rn .

2.1 Necessary Condition

In this section, we would like to have a close look at our candidates for
optimality. In other words, if we are close enough of such points (when they
exist), what conditions would be satisfied? Doing so, we hope to reduce the
size of the set of the candidates’ points then identify among these points the
extreme ones. This motivates the following definition of local extreme points.

Definition 2.1.1 local (global) maximum (minimum)


Let S ⊂ Rn and f : S −→ R be a function. A point x∗ ∈ S is said to be


– a local maximum (resp. minimum) of f if

∃r > 0 such that f(x) ≤ f(x∗) (resp. ≥)    ∀x ∈ Br(x∗) ∩ S.

– a strict local maximum (resp. minimum) of f if

∃r > 0 such that f(x) < f(x∗) (resp. >)    ∀x ∈ Br(x∗) ∩ S, x ≠ x∗.

– a global maximum (resp. minimum) of f if

f(x) ≤ f(x∗) (resp. ≥)    ∀x ∈ S.

– a strict global maximum (resp. minimum) of f if

f(x) < f(x∗) (resp. >)    ∀x ∈ S, x ≠ x∗.

Remark 2.1.1 Note that a global extreme point is also a local extreme
point when S is an open set, but the converse is not always true.

Indeed, suppose, for example, that x∗ is such that

min_S f(x) = f(x∗);

then
f(x) ≥ f(x∗)    ∀x ∈ S.
Because S is an open set and x∗ ∈ S, there exists a ball Br(x∗) such that Br(x∗) ⊂ S, and then, in particular,

f(x) ≥ f(x∗)    ∀x ∈ Br(x∗),

which shows that x∗ is a local minimum.

To show that the converse is not true, consider the function f (x) = x3 − 3x.
The study of the variations of f , in Table 2.1, and its graph, in Figure 2.1,
show that f has a local minimum at x = 1 and a local maximum at x = −1,
but none of them is a global maximum or a global minimum, as we have

f  (x) = 3x2 −3 f  (x) = 6x lim f (x) = +∞ lim f (x) = −∞.


x→+∞ x→−∞

Now, here is a characterization of a local extreme point for a regular


objective function.
x        | −∞        −1          0          1        +∞
f′(x)    |      +          −           −          +
f(x)     | −∞  ↗      2     ↘          ↘    −2  ↗   +∞
f″(x)    |      −          −           +          +
f is     |   concave    concave     convex     convex

TABLE 2.1: Study of f (x) = x3 − 3x


y
3

1 y  x3  3 x

x
3 2 1 1 2 3

1

2

3

FIGURE 2.1: Local extreme points but not global ones

Theorem 2.1.1 Necessary condition for local extreme points


Let S ⊂ Rn and f : S −→ R be a differentiable function at an interior

point x∗ ∈ S. Then

x∗ is a local extreme point =⇒ ∇f (x∗ ) = 0.

Proof. Suppose f has a local minimum at x∗. Since f is differentiable at x∗ = (x∗1, x∗2, . . . , x∗n), its first derivatives exist. From the definition of the partial derivative, we have, for j ∈ {1, . . . , n},

∂f/∂xj (x∗) = lim_{t→0} [f(x∗1, . . . , x∗j + t, . . . , x∗n) − f(x∗1, . . . , x∗j, . . . , x∗n)]/t.

Because f has an interior local minimum at x∗, there is an ε > 0 such that

x ∈ Bε(x∗) ⊂ S =⇒ f(x) ≥ f(x∗).

In particular, for |t| < ε, we have

‖(x∗1, . . . , x∗j + t, . . . , x∗n) − x∗‖ = ‖(0, . . . , 0, t, 0, . . . , 0)‖ = √(0² + . . . + t² + . . . + 0²) = |t| < ε.

Thus the points (x∗1, . . . , x∗j + t, . . . , x∗n) remain inside the ball Bε(x∗) and therefore satisfy

f(x∗1, . . . , x∗j + t, . . . , x∗n) ≥ f(x∗)
⇐⇒ f(x∗1, . . . , x∗j + t, . . . , x∗n) − f(x∗1, . . . , x∗j, . . . , x∗n) ≥ 0.
Thus, if t is positive,

[f(x∗1, . . . , x∗j + t, . . . , x∗n) − f(x∗1, . . . , x∗j, . . . , x∗n)]/t ≥ 0,

and letting t → 0⁺, we deduce that

lim_{t→0⁺} [f(x∗1, . . . , x∗j + t, . . . , x∗n) − f(x∗1, . . . , x∗j, . . . , x∗n)]/t ≥ 0.

In the same way, if t is negative,

[f(x∗1, . . . , x∗j + t, . . . , x∗n) − f(x∗1, . . . , x∗j, . . . , x∗n)]/t ≤ 0,

and letting t → 0⁻, we deduce that

lim_{t→0⁻} [f(x∗1, . . . , x∗j + t, . . . , x∗n) − f(x∗1, . . . , x∗j, . . . , x∗n)]/t ≤ 0.

Because both one-sided limits equal ∂f/∂xj (x∗), we have ∂f/∂xj (x∗) ≥ 0 and ∂f/∂xj (x∗) ≤ 0, and we deduce that ∂f/∂xj (x∗) = 0.
This holds for each j ∈ {1, . . . , n}. Hence ∇f(x∗) = 0.
A similar argument applies if f has a local maximum at x∗.

Remark 2.1.2 Note that a local extremum can also occur at a point where
a function is not differentiable.
FIGURE 2.2: A minimum point where f (x) = |x| is not differentiable

• For example, the one variable function f (x) = |x|, illustrated in Figure 2.2,
has a local minimum at 0 but f is not differentiable at 0 since we have

lim_{x→0⁺} [f(x) − f(0)]/x = lim_{x→0⁺} (x − 0)/x = 1,

lim_{x→0⁻} [f(x) − f(0)]/x = lim_{x→0⁻} (−x − 0)/x = −1.
Moreover 0 is a global minimum since we have

f(x) = |x| ≥ 0 = f(0)    ∀x ∈ R.

FIGURE 2.3: A minimum point where f(x, y) = √(x² + y²) is not differentiable


• The two-variable function f(x, y) = √(x² + y²), graphed in Figure 2.3, attains its minimum value at (0, 0) because we can see that

f(x, y) = √(x² + y²) ≥ 0 = f(0, 0)    ∀(x, y) ∈ R².

But f is not differentiable at (0, 0) since, for example, fx(0, 0) doesn't exist. Indeed, we have

[f(0 + h, 0) − f(0, 0)]/h = [√(h²) − 0]/h = |h|/h  −→  1 if h → 0⁺,  −1 if h → 0⁻.

The above remark leads to the following definition.

Definition 2.1.2 Critical point


An interior point x∗ of the domain of a function f is a critical point of
f if it is a stationary point where ∇f (x∗ ) = 0 or a point where f is not
differentiable.

Example 1. (0, 0) is the only stationary point for the functions f and g

i) f (x, y) = x2 + y 2 ii) g(x, y) = 1 − x2 − y 2 .

It is a local and absolute minimum for f and a local and absolute maximum
for g. The values of the level curves are increasing in Figure 2.4, while they
are decreasing in Figure 2.5.
FIGURE 2.4: Local minimum point that is a global one

Indeed, we have for any (x, y) ∈ R2

f(x, y) = x² + y² ≥ 0 = f(0, 0)    and    g(x, y) = 1 − (x² + y²) ≤ 1 = g(0, 0).

Example 2. In economics, one is interested in maximizing the total profit


P (x) in the sale of x units of some product. If C(x) is the total cost of
production and R(x) is the revenue function then

P (x) = R(x) − C(x).


FIGURE 2.5: Local maximum point that is a global one

The maximum profit occurs when P′(x) = 0, or R′(x) = C′(x). From the linear approximation, we have for Δx = 1,

R(x + 1) − R(x) ≈ R′(x)Δx = R′(x),    C(x + 1) − C(x) ≈ C′(x)Δx = C′(x),

so at such a point C(x + 1) − C(x) ≈ R(x + 1) − R(x);

that is, the cost of manufacturing an additional unit of a product is approximately equal to the revenue generated by that unit. P′(x), R′(x), C′(x) are interpreted respectively as the additional profit, revenue and cost that result from producing one additional unit when the production and sales levels are at x units.

Remark 2.1.3 A function need not have a local extremum at every critical point.

• For example, the one variable function f (x) = x3 has a local critical point
since
f  (x) = 3x2 = 0 ⇐⇒ x = 0.
But 0 is not a local extremum (see Figure 2.6). Indeed we have

f (x) = x3 > 0 = f (0) ∀x > 0 and f (x) = x3 < 0 = f (0) ∀x < 0.

The point 0 is called an inflection point.


• The two variables function f (x, y) = y 2 − x2 , graphed in Figure 2.7, has a
critical point at (0, 0) since we have

∇f(x, y) = ⟨−2x, 2y⟩ = ⟨0, 0⟩  ⇐⇒  (x, y) = (0, 0).


FIGURE 2.6: The critical point x = 0 is an inflection point for f (x) = x3

However, the function f has neither a relative maximum nor a relative minimum at (0, 0). Indeed, along the x and y axes, we have
f(x, 0) = −x² ≤ 0 = f(0, 0) ∀x ∈ R    and    f(0, y) = y² ≥ 0 = f(0, 0) ∀y ∈ R.
The point (0, 0) is called a saddle point. Figure 2.7 shows how the values of the level curves are increasing on one side and decreasing on the other side.
FIGURE 2.7: (0, 0) is a saddle point for f (x, y) = y 2 − x2

Definition 2.1.3 Saddle point


A differentiable function f (x) has a saddle point at a critical point x∗ if in
every open ball centered at x∗ there are domain points x where f (x) > f (x∗ )
and domain points x where f (x) < f (x∗ ).

Remark 2.1.4 In two dimensions, the projection of horizontal traces


shows circular curves around (x∗ , y ∗ ) when it is a local extreme point, and
hyperbolas when the point is a saddle point.

Now, we give a necessary condition when the extreme point is not neces-
sarily an interior point [5].

Theorem 2.1.2 Necessary condition for a relative extreme point


on a convex set
Let S ⊂ Ω ⊂ Rn , Ω an open set, S a convex set and f : Ω −→ R be a
differentiable function at a point x∗ ∈ S. Then

f(x) ≥ f(x∗) (resp. ≤)    ∀x ∈ S

=⇒ ∇f(x∗)·(x − x∗) ≥ 0 (resp. ≤ 0)    ∀x ∈ S.

Proof. Let x ∈ S, x ≠ x∗. Since S is convex, θx + (1 − θ)x∗ = x∗ + θ(x − x∗) ∈ S for θ ∈ [0, 1]. Suppose f has a relative minimum at x∗. Since f is differentiable at x∗, we can write

f(x∗ + θ(x − x∗)) − f(x∗) = θ[ f′(x∗)·(x − x∗) + ε(θ) ],    lim_{θ→0} ε(θ) = 0.

If f′(x∗)·(x − x∗) < 0, then

∃θ₀ ∈ (0, 1) : ∀θ ∈ (0, θ₀),  |ε(θ)| < −(1/2) f′(x∗)·(x − x∗).

Hence, ∀θ ∈ (0, θ₀),

f(x∗ + θ(x − x∗)) − f(x∗) < θ[ f′(x∗)·(x − x∗) − (1/2) f′(x∗)·(x − x∗) ] = (θ/2) f′(x∗)·(x − x∗) < 0,

which contradicts the fact that x∗ is a relative minimum. Therefore, we have f′(x∗)·(x − x∗) ≥ 0.
The case of a relative maximum is proved similarly.

Example 3. Consider the real function f (x) = x2 with x ∈ [1, 2]; see
Figure 2.8.
The interval S = [1, 2] is a convex subset of Ω = R. f is differentiable on R, and has no critical points on (1, 2) since

f′(x) = 2x ≠ 0 on (1, 2).

From the theorem, if x = a ∈ [1, 2] is a minimum point, then it must satisfy

f′(a)(x − a) ≥ 0 ∀x ∈ [1, 2]  ⇐⇒  2a(x − a) ≥ 0 ∀x ∈ [1, 2]  =⇒  a = 1.

If x = b ∈ [1, 2] is a maximum point, then it must satisfy

f′(b)(x − b) ≤ 0 ∀x ∈ [1, 2]  ⇐⇒  2b(x − b) ≤ 0 ∀x ∈ [1, 2]  =⇒  b = 2.

Hence, x = 1 and x = 2 are the candidate points for optimality. In fact, f attains its minimum and maximum values at x = 1 and x = 2 respectively on [1, 2]. We can see that

1 ≤ x ≤ 2  =⇒  1 = 1² = f(1) ≤ x² = f(x) ≤ f(2) = 2² = 4    ∀x ∈ [1, 2].

FIGURE 2.8: Extreme points of f (x) = x2 on the convex [1, 2]

Example 4. Solve the problem

min f(x1, x2) = x1² − x1 + x2 + x1x2    subject to x1 ≥ 0, x2 ≥ 0.

Solution: The set S = {(x1, x2) : x1 ≥ 0, x2 ≥ 0} is a convex subset of R², and f is differentiable on Ω = R² and has no critical points in the interior of S, i.e. {(x1, x2) : x1 > 0, x2 > 0}, since

f′(x1, x2) = ⟨2x1 − 1 + x2, 1 + x1⟩ = ⟨0, 0⟩  ⇐⇒  (x1, x2) = (−1, 3) ∉ S.

So the minimum value, if it exists, must be attained on the boundary of S. Note that

f(x1, 0) = x1² − x1 = (x1 − 1/2)² − 1/4 ≥ −1/4 = f(1/2, 0)    ∀x1 ≥ 0

and
f(0, x2) = x2 ≥ 0    ∀x2 ≥ 0.

Since −1/4 < 0, the point (1/2, 0) is the global minimum point of f on S, as shown in Figure 2.9.
At this point

f′(x1, x2)|_{x1=1/2, x2=0} = ⟨2x1 − 1 + x2, 1 + x1⟩|_{x1=1/2, x2=0} = ⟨0, 3/2⟩ ≠ ⟨0, 0⟩
FIGURE 2.9: Min f attained at the boundary of x1 ≥ 0, x2 ≥ 0

and
∇f(1/2, 0)·⟨x1 − 1/2, x2 − 0⟩ = (3/2) x2 ≥ 0    ∀(x1, x2) ∈ S = R⁺ × R⁺.
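
As a sanity check on this boundary minimum, one could minimize f numerically over the quadrant; the snippet below is an illustrative sketch using scipy (not part of the text).

from scipy.optimize import minimize

f = lambda v: v[0]**2 - v[0] + v[1] + v[0]*v[1]

# minimize over x1 >= 0, x2 >= 0 starting from an interior point
res = minimize(f, x0=[1.0, 1.0], bounds=[(0, None), (0, None)])
print(res.x, res.fun)   # expect approximately (0.5, 0.0) and -0.25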

Remark 2.1.5 * Note that it is not easy to find the candidate points by solving an inequality ∇f(x∗)·(x − x∗) ≥ 0 (resp. ≤ 0). However, the information gained is useful to establish other results.

** Solving the equation ∇f (x) = 0 is not that easy either! It induces non-
linear equations or large linear systems when the number of variables is
large. To overcome this difficulty, we resort to approximate methods. New-
ton’s method is one of the well known approximate methods for approaching
a root of the equation F (x) = 0. In Exercise 5, the method is described and
applied for solving a nonlinear equation in one dimension. Steepest descent
method, Conjugate gradient methods and many other methods are developed
for approaching the solution [22], [5].

∗∗∗ Finally, the following example, in dimension 2, shows the necessity of using new methods for finding the critical points. Indeed, the graph, Figure 2.10, of

z = f(x, y) = 10 e^{−(x²+y²)} + 5 e^{−[(x+5)²+(y−3)²]/10} + 4 e^{−2[(x−4)²+(y+1)²]},

on the window [−10, 8] × [−10, 8] × [−1, 12], shows three peaks. Thus, we have at least three local maxima points. These points are solutions of the system

fx = −20x e^{−(x²+y²)} − (x + 5) e^{−[(x+5)²+(y−3)²]/10} − 16(x − 4) e^{−2[(x−4)²+(y+1)²]} = 0
fy = −20y e^{−(x²+y²)} − (y − 3) e^{−[(x+5)²+(y−3)²]/10} − 16(y + 1) e^{−2[(x−4)²+(y+1)²]} = 0,

a nonlinear system, for which it is not evident to find an explicit solution by algebraic manipulations. The following Maple software command searches for a solution near (0, 0) using an approximate method:

f := (x, y) -> 10*exp(-x^2 - y^2) + 5*exp(-((x + 5)^2 + (y - 3)^2)*(1/10)) + 4*exp(-2*((x - 4)^2 + (y + 1)^2))
with(Optimization):
NLPSolve(f(x, y), x = -8..8, y = -8..8, initialpoint = x = 0, y = 0, maximize);

The result is
[10.1678223807097599, [x = -0.842598632890276e-2, y = 0.505559179745079e-2]].

Thus, (x, y) ≈ (−0.0084, 0.0051) is an approximate critical point, where f takes the approximate local maximal value 10.1678. A search near (−5, 3) and (4, −1) yields the other approximate local maxima points:

with(Optimization): NLPSolve(f(x, y), x = -8..8, y = -8..8, initialpoint = x = -4, y = 3, maximize)
[5.00000000000001688, [x = -5.00000000010854, y = 2.99999999999990]]
with(Optimization): NLPSolve(f(x, y), x = -8..8, y = -8..8, initialpoint = x = 4, y = -1, maximize);
[4.00030684298145278, [x = 3.99996531847993, y = -.999984626392930]]
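
An equivalent numerical search can be done outside Maple; the following is an illustrative sketch using scipy (not part of the text), which locates the same peak near the origin by minimizing −f.

import numpy as np
from scipy.optimize import minimize

def f(v):
    x, y = v
    return (10*np.exp(-(x**2 + y**2))
            + 5*np.exp(-((x + 5)**2 + (y - 3)**2)/10)
            + 4*np.exp(-2*((x - 4)**2 + (y + 1)**2)))

# maximize f by minimizing -f, starting near (0, 0)
res = minimize(lambda v: -f(v), x0=[0.0, 0.0])
print(res.x, -res.fun)   # expect roughly (-0.008, 0.005) and 10.168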

FIGURE 2.10: Location of mountains



Solved Problems

1. – A suitable choice of the objective function.


Find a point on the curve y = x2 that is closest to the point (3, 0).

Solution: When formulating an optimization problem, one can sometimes avoid technical difficulties by considering an auxiliary objective function instead of the direct one. This situation is illustrated by the two choices below.
FIGURE 2.11: Closest point

• 1st choice. Let

D = distance between (3,0) and any point (x, y).

Since (x, y) lies on the curve y = x², the distance D must satisfy

D = D(x) = √((x − 3)² + (y − 0)²) = √((x − 3)² + x⁴).

We need to solve the problem (see Figure 2.11)

min D(x).
x∈R

Since R is an open set, the minimum must occur at a critical point, i.e., since
D is differentiable, at a point where
dD/dx = [2(x − 3) + 4x³] / (2√((x − 3)² + x⁴)) = 0

⇐⇒  2(x − 3) + 4x³ = 2(x − 1)(2x² + 2x + 3) = 0  ⇐⇒  x = 1.

Since D ∈ C 0 (R) and

lim D(x) = +∞ and lim D(x) = +∞,


x→−∞ x→+∞

the minimum exists and it must be at x = 1 [1]. The variations of D are given by Table 2.2.

x          −∞               1               +∞
D′(x)              −        0        +
D(x)       +∞      ↘       D(1)      ↗      +∞

TABLE 2.2: Variations of D(x) = √((x − 3)² + x⁴)

Thus
min_{x∈R} D(x) = D(1) = √5.

• 2nd choice. Note that, for any x₀, x ∈ R, we have

0 ≤ D²(x₀) ≤ D²(x)  ⇐⇒  0 ≤ D(x₀) = √(D²(x₀)) ≤ √(D²(x)) = D(x)

since t ↦ √t is an increasing function on the interval [0, +∞). It suffices, then, to minimize on R the function

F(x) = D²(x) = (x − 3)² + x⁴.

Since R is an open set, the minimum must occur at a critical point, i.e., since
F is differentiable, at a point where

dF
= 2(x − 3) + 4x3 = 0 ⇐⇒ 2(x − 1)(2x2 + 2x + 3) = 0 ⇐⇒ x = 1.
dx
Since F ∈ C 0 (R) and

lim F (x) = +∞ and lim F (x) = +∞,


x→−∞ x→+∞

the minimum exists and it must be at x = 1. The variations of F is given by


Table 2.3. The point (1, 1) is the closest point on the curve [y = x2 ] to the
point (3, 0).
x                                −∞          1          +∞
F′(x) = 2(x − 1)(2x² + 2x + 3)         −     0     +
F(x)                             +∞    ↘    F(1)   ↗    +∞

TABLE 2.3: Variations of F(x) = (x − 3)² + x⁴
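
A quick numerical confirmation (an illustrative Python sketch, not part of the text):

from scipy.optimize import minimize_scalar

F = lambda x: (x - 3)**2 + x**4          # squared distance from (x, x^2) to (3, 0)
res = minimize_scalar(F)
print(res.x, res.fun**0.5)               # expect x close to 1 and distance close to sqrt(5)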

2. – To minimize the material in manufacturing a closed can with volume


capacity of V units, we need to choose a suitable radius for the container.
Find the radius if the container is cylindrical.

Solution: From Section 1.1, Example 1, we are led to solve the minimization problem

minimize A = A(r) = 2πr² + 2V/r    over the set    S = (0, +∞) = {r ∈ R : r > 0}.

Since S is an open set, the minimum must occur at a critical point, i.e., since A(r) is differentiable, at a point where

dA/dr = 4πr − 2V/r² = 0  =⇒  r = (V/(2π))^{1/3} ∈ S.
Since A ∈ C⁰(S) and

lim_{r→0⁺} A(r) = +∞ and lim_{r→+∞} A(r) = +∞,

the minimum exists and it must be at r = (V/(2π))^{1/3}. Indeed the variations of A are as shown in Table 2.4.

r                        0         (V/(2π))^{1/3}         +∞
A′(r) = 4πr − 2V/r²           −          0          +
A(r)                    +∞     ↘   A((V/(2π))^{1/3})  ↗   +∞

TABLE 2.4: Variations of A(r) = 2πr² + 2V/r

So we should choose for the can a radius r = (V/(2π))^{1/3} and a height h = V/(πr²) = 2(V/(2π))^{1/3} = 2r.
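
The closed-form radius can be confirmed symbolically; a small illustrative sketch with sympy (not in the text):

import sympy as sp

r, V = sp.symbols('r V', positive=True)
A = 2*sp.pi*r**2 + 2*V/r

r_star = sp.solve(sp.diff(A, r), r)[0]
print(sp.simplify(r_star - (V/(2*sp.pi))**sp.Rational(1, 3)))   # expect 0
print(sp.simplify(V/(sp.pi*r_star**2) - 2*r_star))              # height minus 2r: expect 0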

3. – Locate all absolute maxima and minima if any for each function.

i) f (x, y) = 1 − (x + 1)2 − (y − 5)2

ii) g(x, y) = 3x − 2y + 5

iii) h(x, y) = x2 − xy + y 2 − 3y.

Solution: i)
FIGURE 2.12: Graph and level curves of z = f(x, y)

Since f is differentiable on R², its absolute extrema, which are also local extrema (if they exist), are stationary points, i.e. solutions of

∇f = ⟨−2(x + 1), −2(y − 5)⟩ = ⟨0, 0⟩  ⇐⇒  (x, y) = (−1, 5).

So, there is only one critical point. It satisfies

f(−1, 5) = 1 ≥ 1 − (x + 1)² − (y − 5)² = f(x, y)    ∀(x, y) ∈ R².

Hence, it is a global maximum of f in R2 ; see Figure 2.12. However, f does


not have a global minimum since the following hold:

(x + 1)² + (y − 5)² = ‖(x, y) − (−1, 5)‖²,

‖(x, y) − (−1, 5)‖² ≤ (‖(x, y)‖ + ‖(−1, 5)‖)² = (√(x² + y²) + √26)²,

‖(x, y) − (−1, 5)‖² ≥ (‖(x, y)‖ − ‖(−1, 5)‖)² = (√(x² + y²) − √26)².

Then

1 − (√(x² + y²) + √26)² ≤ f(x, y) ≤ 1 − (√(x² + y²) − √26)²

and we deduce that

lim_{‖(x,y)‖→+∞} f(x, y) = −∞.

It suffices also to show that f takes large negative values on a subset of its
domain R2 , like

f (x, 5) = 1 − (x + 1)2 −→ −∞ as x −→ ±∞.

ii) Since g is differentiable on R², its absolute extreme points, which are also local extreme points (if they exist), are stationary points, i.e. solutions of ∇g = ⟨0, 0⟩. But

∇g = ⟨3, −2⟩ ≠ ⟨0, 0⟩.

FIGURE 2.13: Graph and level curves of g(x, y) = 3x − 2y + 5


So, there is no critical point. g has no local or global extreme point. The graph
z = g(x, y) is a plane in R3 which spreads in the space taking large values
when x or y −→ ±∞; see Figure 2.13. For example

g(0, y) = −2y + 5 −→ ∓∞ as y −→ ±∞
g(x, 0) = 3x + 5 −→ ±∞ as x −→ ±∞.

iii) Since h is differentiable on R2 , its absolute extreme points that are also
local extreme points (if they exist) are stationary points, ie. solution of

∇h = ⟨2x − y, −x + 2y − 3⟩ = ⟨0, 0⟩  ⇐⇒  (x, y) = (1, 2).


So, there is only one critical point. It satisfies

h(1, 2) = 1 − 2 + 4 − 6 = −3,

h(x, y) − h(1, 2) = x² − xy + y² − 3y + 3 = (x − y/2)² − y²/4 + y² − 3y + 3
                  = (x − y/2)² + (3/4)(y − 2)² ≥ 0    ∀(x, y) ∈ R².
Hence, the point (1, 2) is a global minimum of h in R2 . Here also, one can see
that h takes large values, for example, along the x axis, we have

h(x, 0) = x2 −→ +∞ when x −→ ±∞.

So h has no global maximum (see Figure 2.14).


FIGURE 2.14: Graph and level curves of h(x, y) = x2 − xy + y 2 − 3y

4. – Consider the problem

min_S f(x, y) = y    where    S = {(x, y) : x² + y² ≤ 1}.

i) Does f have local minimum points?


ii) Where may the minimum points locate if they exist?

iii) Solve the inequality

∇f (a, b).(x − a, y − b)  0 ∀(x, y) ∈ S

to find the candidate points (a, b) and solve the problem.

iv) Can you proceed as in iii) if S = {(x, y) : x² + y² ≥ 1}? What is the solution in this case?
solution in this case?
Solution: i) Since f is differentiable on R², a local minimum point would be a critical point, i.e. a solution of ∇f = ⟨0, 0⟩. But

∇f = ⟨0, 1⟩ ≠ ⟨0, 0⟩    ∀(x, y) ∈ {(x, y) : x² + y² < 1} = int(S).

So, there is no critical point; f has no local minimum point.

ii) If the minimum points exist, they must be on the unit circle, the boundary of S:

∂S = {(x, y) : x² + y² = 1}.

FIGURE 2.15: Graph of f(x, y) = y on the set x² + y² ≤ 1

iii) Since S is convex and f is differentiable on R², a solution (a, b) of the problem, if it exists, must satisfy

a² + b² = 1    and    ∇f(a, b)·(x − a, y − b) ≥ 0 ∀(x, y) ∈ S

⇐⇒  a² + b² = 1    and    ⟨0, 1⟩·⟨x − a, y − b⟩ ≥ 0 ∀(x, y) ∈ S

⇐⇒  a² + b² = 1    and    y − b ≥ 0 ∀(x, y) ∈ S  ⇐⇒  y ≥ b ∀(x, y) ∈ S.

Thus
b = −1 and a = 0.

So the only candidate point is (a, b) = (0, −1). In fact, it is the minimum point (see Figure 2.15) since we have

f(x, y) = y ≥ −1 = f(0, −1)    ∀(x, y) ∈ S.

iv) We cannot proceed as in iii) because the set S = {(x, y) : x² + y² ≥ 1} is not convex. And because this set is not bounded, we can see that f takes arbitrarily large negative values. Therefore, it doesn't attain a minimum value. For example, on the negative y-axis, we have

f(0, y) = y −→ −∞ as y −→ −∞.

5. – Newton's Method [2]. Let I = [a, b] and let F : I −→ R be twice differentiable on I. Suppose that

∃m, M ∈ R⁺ : |F′(x)| ≥ m > 0 and |F″(x)| ≤ M ∀x ∈ I,

F(a)·F(b) < 0,    K = M/(2m).

Then there exists a subinterval I∗ containing a root r of F such that for any x₁ ∈ I∗, the sequence (xₙ) defined by

xₙ₊₁ = xₙ − F(xₙ)/F′(xₙ)    ∀n ∈ N

belongs to I∗ and (xₙ) converges to r. Moreover,

|xₙ₊₁ − r| ≤ K|xₙ − r|²    ∀n ∈ N.

Application Let f (x) = x3 − 2x − 5.


i) Show that f has a root on the interval I = [2, 2.2].

ii) If x1 = 2 and if (xn ) is the sequence obtained by Newton’s method,


show that
|xₙ₊₁ − r| ≤ (0.7)|xₙ − r|²

iii) Show that x4 is exact up to 6 decimals.

Solution: i) We have

f (2.2) = (2.2)3 − 2.(2.2) − 5 = 1.248 > 0 f (2) = 8 − 4 − 5 = −1 < 0

f is continuous on [2, 2.2] and 0 is between f (2) and f (2.2).



From the intermediate value theorem, there exists x0 ∈ (2, 2.2) such that
f (x0 ) = 0.

ii) The sequence (xₙ) obtained by Newton's method is:

x₁ = 2,    xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ) = xₙ − (xₙ³ − 2xₙ − 5)/(3xₙ² − 2) = (2xₙ³ + 5)/(3xₙ² − 2),

with
f′(x) = 3x² − 2,    f″(x) = 6x.


Because the functions f′ and f″ are increasing on [2, 2.2], we have

10 = f′(2) ≤ f′(x) ≤ f′(2.2) = 12.52,
12 = f″(2) ≤ f″(x) ≤ f″(2.2) = 13.2.

In particular,

|f′(x)| ≥ 10 = m and |f″(x)| ≤ 13.2 = M    ∀x ∈ [2, 2.2].

We deduce that the sequence (xₙ) converges to a root r of f(x) = 0 in [2, 2.2] and satisfies

|xₙ₊₁ − r| ≤ 0.7|xₙ − r|²,    K = M/(2m) = 0.66 < 0.7.

iii) Denote by eₙ = xₙ − r the approximation error of the root r; then

|Keₙ₊₁| ≤ K²|eₙ|² = |Keₙ|²  =⇒  |Keₙ₊₁| ≤ |Ke₁|^(2ⁿ)  by induction,

where
|e₁| = |x₁ − r| < 2.2 − 2 = 0.2    since x₁, r ∈ [2, 2.2].

Thus
|Keₙ₊₁| ≤ |Ke₁|^(2ⁿ) ≤ ((0.7)(0.2))^(2ⁿ) = (0.14)^(2ⁿ) ≤ (0.0196)ⁿ.

To obtain an accuracy up to 6 decimals, it suffices to choose n such that

|eₙ₊₁| ≤ (0.0196)ⁿ/0.66 ≤ 10⁻⁶.
We have

(0.0196)²/0.66 = 0.000582061,    (0.0196)³/0.66 ≈ 0.0000114084,

(0.0196)⁴/0.66 ≈ 0.0000002236 < 10⁻⁶.

The desired accuracy is obtained for n = 4.
The approximate values of this root are:

x₂ = (2x₁³ + 5)/(3x₁² − 2) = (2(8) + 5)/(3(2²) − 2) = 21/10 = 2.1

x₃ = (2x₂³ + 5)/(3x₂² − 2) = (2(2.1)³ + 5)/(3(2.1)² − 2) = 23.522/11.23 = 2.0945681

x₄ = (2x₃³ + 5)/(3x₃² − 2) = (2(23.522/11.23)³ + 5)/(3(23.522/11.23)² − 2) ≈ 2.09455148

x₅ = (2x₄³ + 5)/(3x₄² − 2) ≈ 23.3782059/11.1614377 ≈ 2.0945514841.

We can see that x4 is exact up to six decimals; see Figure 2.16.
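
The iteration is easy to reproduce; the following is an illustrative Python sketch of Newton's method for this equation (not part of the text).

def newton(f, fprime, x, n_iter=5):
    # apply x <- x - f(x)/f'(x) repeatedly and return the successive iterates
    xs = [x]
    for _ in range(n_iter):
        x = x - f(x) / fprime(x)
        xs.append(x)
    return xs

f = lambda x: x**3 - 2*x - 5
fp = lambda x: 3*x**2 - 2

for i, xi in enumerate(newton(f, fp, 2.0), start=1):
    print(i, xi)
# x2 = 2.1, x3 ≈ 2.094568..., x4 ≈ 2.0945514... (already correct to 6 decimals)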


FIGURE 2.16: Approximate position of the root of f (x) = x3 − 2x − 5



2.2 Classification of Local Extreme Points

For a C² function f of one variable, in a neighborhood of a critical point x∗, one can write, by using the second-order Taylor formula,

f(x) = f(x∗) + (f′(x∗)/1!)(x − x∗) + (f″(c)/2!)(x − x∗)²

for some number c between x∗ and x. Then, since f′(x∗) = 0, we have

f(x) = f(x∗) + (f″(c)/2!)(x − x∗)².

Now, if f″(x∗) > 0, then by continuity of f″ we deduce that for x close to x∗ (x ∈ (x∗ − ε, x∗ + ε)), we will have

f″(c) > 0  =⇒  f(x) > f(x∗)    ∀x ∈ (x∗ − ε, x∗ + ε) \ {x∗}.

This means that x∗ is a strict local minimum point. Similarly, we show that

f″(x∗) < 0  =⇒  x∗ is a strict local maximum point.

This classification of critical points, into minima and maxima points, where
the sign of the second derivative intervenes, is generalized to C 2 functions with
several variables in the theorem below, following the definition:

Definition 2.2.1 Let Hf(x) = (f_{xi xj}(x))_{n×n} be the Hessian of a C² function f. Then the n leading minors of Hf are defined by

Dk(x) = det | f_{x1 x1}  f_{x1 x2}  …  f_{x1 xk} |
            | f_{x2 x1}  f_{x2 x2}  …  f_{x2 xk} |
            |    ⋮          ⋮        ⋱     ⋮     |
            | f_{xk x1}  f_{xk x2}  …  f_{xk xk} |,    k = 1, …, n.

Theorem 2.2.1 Second derivatives test - Sufficient conditions for a strict local
extreme point

Let S ⊂ Rn and f : S −→ R be a C 2 function in a neighborhood of a


critical point x∗ ∈ S (∇f (x∗ ) = 0). Then

(i) Dk(x∗) > 0, ∀k = 1, …, n  =⇒  x∗ is a strict local minimum point,

(ii) (−1)^k Dk(x∗) > 0, ∀k = 1, …, n  =⇒  x∗ is a strict local maximum point,

(iii) Dn(x∗) ≠ 0 and neither of the conditions in (i) and (ii) is satisfied  =⇒  x∗ is a saddle point.

Before proving the theorem, we will see its application through some examples.

Example 1. Profit in selling one commodity


A commodity is sold at 5$ per unit. The total cost for producing x units is
given by
C(x) = x3 − 10x2 + 17x + 66.
Find the most profitable level of production.

Solution: The total revenue for selling x units is R(x) = 5x. Thus, the profit
P (x) on x units is

P (x) = R(x) − C(x) = 5x − (x3 − 10x2 + 17x + 66) = −x3 + 10x2 − 12x − 66.
The profit, illustrated in Figure 2.17, will be at its maximum at points where

dP/dx = −3x² + 20x − 12 = −3(x − 6)(x − 2/3) = 0.

We deduce that we have two critical points, x = 6 and x = 2/3.
The Hessian of P is

HP(x) = [ d²P/dx² ] = [ −6x + 20 ].

Applying the second derivatives test, we obtain:
∗ at x = 6,
(−1)¹ D₁(6) = (−1)(d²P/dx²)(6) = (−1)(−6(6) + 20) = 16 > 0.
Thus, x = 6 is a local maximum.
∗∗ at x = 2/3,
D₁(2/3) = (d²P/dx²)(2/3) = −6(2/3) + 20 = 16 > 0.
Thus, x = 2/3 is a local minimum.
Thus six units is a candidate point for optimality. We have to check that it is the level at which the profit is largest. This can be done by comparing P(x) and P(6). Indeed, we have

P(x) − P(6) = −(x − 6)²(x + 2) ≤ 0    ∀x > 0, with equality only at x = 6,
=⇒ P(x) < P(6)    ∀x ∈ (0, +∞) \ {6}.

FIGURE 2.17: Graph of P and the maximum profit at x = 6

Example 2. Profit in selling two commodities


The cost to produce x units of a commodity A and y units of a commodity
B is
C(x, y) = 0.2x2 + 0.05xy + 0.05y 2 + 20x + 10y + 2500.
If each unit from A and B are sold for 75 and 45 respectively, find the daily
production levels x and y that maximize the profit per day.

Solution: The daily profit is given by


P (x, y) = 75x + 45y − C(x, y) = −0.2x2 − 0.05xy − 0.05y 2 + 55x + 35y − 2500.
Since P is differentiable (because it is a polynomial), the points that maximize
the profit are critical ones, i.e, solutions of

x = 100
∇P (x, y) = −0.4x−0.05y+55, −0.05x−0.1y+35 = 0, 0 ⇐⇒
y = 300.
We deduce that (100, 300) is the only critical point of P; see Figure 2.18.
Now, we apply the second derivatives test to classify that point. We have

HP(x, y) = | Pxx  Pxy | = | −0.4   −0.05 |
           | Pyx  Pyy |   | −0.05  −0.1  |

D1(100, 300) = Pxx = −0.4 < 0,

D2(100, 300) = det HP = (−0.4)(−0.1) − (−0.05)² = 0.0375 > 0.

So (100, 300) is a local maximum point. In fact, it is a global maximum point


where P attains the optimal value P (100, 300) = 5500. This is true because
P is concave in R2 . Indeed, we have

D1 (x, y) < 0 and D2 (x, y) > 0 ∀(x, y) ∈ R2 (see next section).
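
These computations can be reproduced symbolically; an illustrative sketch with sympy (not part of the text):

import sympy as sp

x, y = sp.symbols('x y')
P = 75*x + 45*y - (0.2*x**2 + 0.05*x*y + 0.05*y**2 + 20*x + 10*y + 2500)

crit = sp.solve([sp.diff(P, x), sp.diff(P, y)], [x, y])
H = sp.hessian(P, (x, y))
print(crit)                      # expect x = 100, y = 300
print(H.det(), P.subs(crit))     # expect 0.0375 and 5500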

FIGURE 2.18: Profit function P (x, y) and maximum point (100, 300)

Example 3. Several local extreme points


Find the stationary points and classify them when

f (x, y) = 3x − x3 − 2y 2 + y 4 .

Solution: Since f is a differentiable function (because it is a polynomial), the local extreme points are critical points, i.e., solutions of ∇f(x, y) = ⟨0, 0⟩. We have

∇f(x, y) = ⟨3 − 3x², −4y + 4y³⟩ = ⟨0, 0⟩

⇐⇒  x² = 1  and  y(y² − 1) = 0

⇐⇒  [x = 1 or x = −1]  and  [y = 0 or y = 1 or y = −1].

We deduce that (1, 0), (1, 1), (1, −1), (−1, 0), (−1, 1) and (−1, −1) are the critical points of f. The level curves, graphed in Figure 2.19, show the nature of these points.
FIGURE 2.19: Local extreme points of f (x, y) = 3x − x3 − 2y 2 + y 4

∗ Classification of the critical points: We have

fxx(x, y) = −6x,    fyy(x, y) = 12y² − 4,    fxy(x, y) = 0,

Hf(x, y) = | −6x      0      |
           |  0   12y² − 4   |

D1(x, y) = fxx = −6x,    D2(x, y) = det Hf(x, y) = −6x(12y² − 4) = −24x[3y² − 1].

Applying the second derivative test, we obtain:

(x, y)        D1(x, y)    D2(x, y)    type
(1, 0)           −6           24      local maximum
(−1, 1)           6           48      local minimum
(−1, −1)          6           48      local minimum
(1, 1)           −6          −48      saddle point
(1, −1)          −6          −48      saddle point
(−1, 0)           6          −24      saddle point

TABLE 2.5: Critical points' classification for f(x, y) = 3x − x³ − 2y² + y⁴
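
The table can be regenerated mechanically; a short illustrative Python sketch (not part of the text):

# Second-derivatives test for f(x, y) = 3x - x^3 - 2y^2 + y^4
def classify(x, y):
    D1 = -6*x                      # f_xx
    D2 = -6*x*(12*y**2 - 4)        # determinant of the Hessian (f_xy = 0)
    if D1 > 0 and D2 > 0:
        return "local minimum"
    if D1 < 0 and D2 > 0:
        return "local maximum"
    if D2 < 0:
        return "saddle point"
    return "inconclusive"

for p in [(1, 0), (-1, 1), (-1, -1), (1, 1), (1, -1), (-1, 0)]:
    print(p, classify(*p))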



The proof of Theorem 2.2.1 uses Taylor’s formula for a function of several
variables and a characterization of symmetric quadratic forms (see the end of
this section). Taylor’s formula will be used several times through out the next
chapters. It is therefore important to understand its proof.

Theorem 2.2.2 2nd-order Taylor's formula for a function of n variables

Suppose f is C² in an open set of Rⁿ containing the line segment [x∗, x∗ + h]. Then

f(x∗ + h) = f(x∗) + Σᵢ₌₁ⁿ (∂f/∂xᵢ)(x∗) hᵢ + (1/2) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ (∂²f/∂xᵢ∂xⱼ)(x∗ + c h) hᵢ hⱼ

or
f(x∗ + h) = f(x∗) + ∇f(x∗)·h + (1/2) ᵗh Hf(x∗ + ch) h

for some c ∈ (0, 1), where x∗ = (x∗1, …, x∗n) and h = (h1, …, hn) are column vectors and ᵗh = (h1 … hn). Here, we identify the column vector x∗ + th with the point (x∗1 + th1, …, x∗n + thn), t ∈ R.

Proof. Define the function

g(t) = f (x∗1 + th1 , . . . , x∗n + thn ) = f (x∗ + th).

Note that

g(t) = f (x1 (t), x2 (t), . . . , xn (t)) with xj (t) = x∗j + thj j = 1, . . . , n.

Since the real functions xⱼ, j = 1, …, n, are differentiable with xⱼ′(t) = hⱼ, g is differentiable, and we have by the chain rule formula

g′(t) = (∂f/∂x₁)(∂x₁/∂t) + (∂f/∂x₂)(∂x₂/∂t) + … + (∂f/∂xₙ)(∂xₙ/∂t)
      = fx₁(x∗ + th) h₁ + fx₂(x∗ + th) h₂ + … + fxₙ(x∗ + th) hₙ
      = ∇f(x∗ + th)·h.
Because f is C², g is also C², and we have

g″(t) = (d/dt)[fx₁(x∗ + th)] h₁ + (d/dt)[fx₂(x∗ + th)] h₂ + … + (d/dt)[fxₙ(x∗ + th)] hₙ.

For each i = 1, …, n, we have fxᵢ(x∗ + th) = fxᵢ(x₁(t), x₂(t), …, xₙ(t)). Then

(d/dt) fxᵢ(x∗ + th) = (∂fxᵢ/∂x₁)(∂x₁/∂t) + (∂fxᵢ/∂x₂)(∂x₂/∂t) + … + (∂fxᵢ/∂xₙ)(∂xₙ/∂t)
                    = fxᵢx₁(x∗ + th) h₁ + fxᵢx₂(x∗ + th) h₂ + … + fxᵢxₙ(x∗ + th) hₙ
                    = Σⱼ₌₁ⁿ fxᵢxⱼ(x∗ + th) hⱼ.

Hence
g″(t) = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ fxᵢxⱼ(x∗ + th) hᵢ hⱼ.

Now, since f is defined on the segment [x∗, x∗ + h], g is defined on the interval [0, 1] and, by using the 2nd-order Taylor formula for real functions [1], [2], we get

g(1) = g(0) + (g′(0)/1!)(1 − 0) + (g″(c)/2!)(1 − 0)² = g(0) + g′(0) + (1/2) g″(c)    for some c ∈ (0, 1),

or equivalently

f(x∗ + h) = f(x∗) + Σᵢ₌₁ⁿ fxᵢ(x∗) hᵢ + (1/2) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ fxᵢxⱼ(x∗ + ch) hᵢ hⱼ.

Proof. (Theorem 2.2.1) Since x∗ is an interior point of S and is a local


stationary point of f then
∇f (x∗ ) = 0.
For h ∈ Rⁿ such that x∗ + h ∈ S, we have from the 2nd-order Taylor formula

f(x∗ + h) = f(x∗) + (1/2) ᵗh Hf(x∗ + ch) h    for some c ∈ (0, 1).

Situation (i) Suppose that Dk (x∗ ) > 0 for all k = 1, . . . , n.


By continuity of the second-order partial derivatives of f , there exists r > 0
such that

Dk (x) > 0 ∀x ∈ Br (x∗ ) ∀k = 1, . . . , n.


As a consequence, the quadratic form

Q(h)(x) = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ fxᵢxⱼ(x) hᵢ hⱼ = ᵗh Hf(x) h,

with the associated symmetric matrix Hf(x) = (fxᵢxⱼ(x))_{n×n}, is positive definite on Br(x∗).
Since x∗ + ch ∈ Br(x∗), then

Q(h)(x∗ + ch) = ᵗh Hf(x∗ + ch) h > 0    for h ≠ 0.

Therefore, we have for x∗ + h ∈ Br(x∗), h ≠ 0,

f(x∗ + h) − f(x∗) = (1/2) Q(h)(x∗ + ch) > 0,

which shows that the stationary point x∗ is a strict local minimum point for f in S.

Situation (ii) Suppose that (−1)k Dk (x∗ ) > 0 for all k = 1, . . . , n.


By continuity of the second-order partial derivatives of f , there exists r > 0
such that

(−1)k Dk (x) > 0 ∀x ∈ Br (x∗ ) ∀k = 1, . . . , n.


From the property of determinants, we can write

(−1)^k Dk(x∗) = det( ((−f)xᵢxⱼ(x∗))_{1≤i,j≤k} ),    k = 1, …, n.

As a consequence, the quadratic form

ᵗh H₋f(x) h = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ (−f)xᵢxⱼ(x) hᵢ hⱼ,

with the associated symmetric matrix H₋f(x) = ((−f)xᵢxⱼ(x))_{n×n}, is positive definite on Br(x∗).
Therefore, we have for x∗ + h ∈ Br(x∗), h ≠ 0,

(−f)(x∗ + h) − (−f)(x∗) = (1/2) ᵗh H₋f(x∗ + ch) h > 0

=⇒ (−f)(x∗ + h) > (−f)(x∗)  ⇐⇒  f(x∗) > f(x∗ + h),

which shows that the stationary point x∗ is a strict local maximum point for f in S.

Situation (iii) Assume Dn(x∗) ≠ 0 and neither of the conditions (i) and (ii) holds.

Note that situation (i) (resp. (ii)) means also that the matrix A = (fxᵢxⱼ(x∗))_{n×n} is positive (resp. negative) definite, which is equivalent to each of its eigenvalues λᵢ being positive (resp. negative). So, if neither (i) nor (ii) holds, there exist i₀, j₀ ∈ {1, …, n} such that

Dn(x∗) = λ₁λ₂⋯λₙ ≠ 0    with λᵢ₀ > 0 and λⱼ₀ < 0.

Now, since A is symmetric, there exists an orthogonal matrix O = (pᵢⱼ)_{n×n} (O⁻¹ = ᵗO) such that

A = O D ᵗO,    D = diag(λ₁, …, λₙ).

Then the quadratic form Q(h) can be written as

Q(h)(x∗) = ᵗh A h = ᵗ[ᵗO h] D [ᵗO h] = Σᵢ₌₁ⁿ λᵢ ( Σⱼ₌₁ⁿ pⱼᵢ hⱼ )².

Choose hs and h̄s such that for s > 0,

ᵗO hs = (s/√λᵢ₀) eᵢ₀ + (2s/√(−λⱼ₀)) eⱼ₀,    ᵗO h̄s = (2s/√λᵢ₀) eᵢ₀ + (s/√(−λⱼ₀)) eⱼ₀,

which is possible since ᵗO is invertible. Then we have

Q(hs)(x∗) = λᵢ₀ (s/√λᵢ₀)² + λⱼ₀ (2s/√(−λⱼ₀))² = s² − 4s² = −3s² < 0,

Q(h̄s)(x∗) = λᵢ₀ (2s/√λᵢ₀)² + λⱼ₀ (s/√(−λⱼ₀))² = 4s² − s² = 3s² > 0.

We deduce, by continuity of Q(h)(x), the existence of δ > 0 such that ∀s ∈ (0, δ),

f(x∗ + hs) − f(x∗) = (1/2) Q(hs)(x∗ + c hs) < 0,

f(x∗ + h̄s) − f(x∗) = (1/2) Q(h̄s)(x∗ + c h̄s) > 0.

Thus f takes values greater and less than f(x∗) in every neighborhood of x∗. Therefore x∗ is a saddle point.

The following theorem shows that the Hessian matrix of a C² function at a local minimum (resp. maximum) point is necessarily positive (resp. negative) semidefinite. However, this condition is not sufficient, as we can see in a suggested exercise where the origin is neither a local minimum nor a local maximum.

Theorem 2.2.3 Necessary conditions for a local extreme point

Let S ⊂ Rⁿ and f : S −→ R be a C² function in a neighborhood of a critical point x∗ ∈ S (∇f(x∗) = 0). Then

(i) x∗ is a local minimum point  =⇒  Δk(x∗) ≥ 0    ∀k = 1, …, n,

(ii) x∗ is a local maximum point  =⇒  (−1)^k Δk(x∗) ≥ 0    ∀k = 1, …, n,

where Δk(x∗) denotes a principal minor of order k of the Hessian matrix Hf(x∗); that is, the determinant of a matrix obtained by deleting n − k rows and n − k columns such that if the i-th row (column) is selected, then so is the i-th column (row).

Proof. (i) Suppose that x∗ is an interior local minimum point for f. There exists r > 0 such that
f(x∗) ≤ f(x)    ∀x ∈ Br(x∗).
In particular, for t ∈ (−r, r) and h ∈ Rⁿ with ‖h‖ = 1, we have x∗ + th ∈ Br(x∗) since

‖x∗ + th − x∗‖ = |t| ‖h‖ = |t| < r.

Then
g(0) = f(x∗) ≤ f(x∗ + th) = g(t)    ∀t ∈ (−r, r).

So g is a one-variable function that has an interior local minimum at t = 0. Consequently, it satisfies

g′(0) = 0 and g″(0) ≥ 0.

From previous calculations, we have

g″(t) = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ fxᵢxⱼ(x∗ + th) hᵢ hⱼ.

Hence
g″(0) = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ fxᵢxⱼ(x∗) hᵢ hⱼ = ᵗh Hf(x∗) h ≥ 0.

The above inequality remains true for h = 0 and for any h ≠ 0; for this last case it suffices to consider h/‖h‖, which is a unit vector. Hence the Hessian matrix of f at x∗ is positive semidefinite, by the result below from Algebra (see [10]).
(ii) is proved similarly.

Quadratic forms
Consider the quadratic form in n variables

Q(h) = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ aᵢⱼ hᵢ hⱼ = ᵗh A h,    ᵗh = (h₁ … hₙ),

associated to the symmetric matrix A = (aᵢⱼ)ᵢ,ⱼ₌₁,…,ₙ (aᵢⱼ = aⱼᵢ).
82 Introduction to the Theory of Optimization in Euclidean Space

Definition.

Q is positive (resp. negative) definite if Q(h) > 0 (resp. < 0) for all h ≠ 0.

Q is positive (resp. negative) semidefinite if Q(h) ≥ 0 (resp. ≤ 0) for all h ∈ Rⁿ.

We have the following necessary and sufficient conditions for a quadratic form Q to be positive (negative), definite or semidefinite.

Theorem.

Q is positive definite  ⇐⇒  Dr > 0,  r = 1, …, n;
Q is negative definite  ⇐⇒  (−1)^r Dr > 0,  r = 1, …, n,

where Dr is the leading principal minor of order r of the matrix A:

Dr = det( (aᵢⱼ)_{1≤i,j≤r} ),    r = 1, …, n.

Theorem.

Q is positive semidefinite  ⇐⇒  Δr ≥ 0,  r = 1, …, n;
Q is negative semidefinite  ⇐⇒  (−1)^r Δr ≥ 0,  r = 1, …, n,

where Δr denotes a principal minor of order r of the matrix A; that is, the determinant of the matrix obtained from A by deleting n − r rows and n − r columns such that if the i-th row (column) is selected, then so is the i-th column (row).

Solved Problems

1. – Use the following functions to show that positive or negative semidefiniteness of the Hessian of the objective function at a critical point is not a sufficient condition for local optimality.

f(x, y) = x⁴ + y⁴,    g(x, y) = −(x⁴ + y⁴),    h(x, y) = x⁴ − y⁴.

Solution:
FIGURE 2.20: Graphs of f, g, h

We have

∇f(x, y) = ⟨4x³, 4y³⟩,    ∇g(x, y) = ⟨−4x³, −4y³⟩,    ∇h(x, y) = ⟨4x³, −4y³⟩.

So (0, 0) is the only stationary point for f, g and h. But we cannot conclude anything about its nature by using the second derivatives test, since the Hessian matrix of each function at (0, 0) is the zero matrix:

Hf = | 12x²   0   |,    Hg = −Hf,    Hh = | 12x²    0    |
     |  0   12y²  |                       |  0   −12y²   |

Hf(0, 0) = Hg(0, 0) = Hh(0, 0) = | 0  0 |
                                 | 0  0 |
Δ₁¹(0, 0) = Δ₁²(0, 0) = Δ₂(0, 0) = 0,

where Δ₁ˡ is the principal minor of order 1 obtained by removing the l-th row and l-th column, l = 1, 2. Thus the Hessian matrices of f, g and h are both positive and negative semidefinite at (0, 0). However, this doesn't imply that (0, 0) is a local minimum or maximum point. Indeed, by looking at the functions directly, we can classify the point. The three situations are shown in Figure 2.20.

First, note that (0, 0) is a global minimum for f since we have

f(x, y) = x⁴ + y⁴ ≥ 0 = f(0, 0)    ∀(x, y) ∈ R².

Next, note that (0, 0) is a global maximum for g. Indeed, we have

g(x, y) = −(x⁴ + y⁴) ≤ 0 = g(0, 0)    ∀(x, y) ∈ R².

Finally, (0, 0) is a saddle point for h since we have

h(x, 0) = x⁴ ≥ 0 = h(0, 0) ∀x ∈ R,
h(0, y) = −y⁴ ≤ 0 = h(0, 0) ∀y ∈ R.

Thus, for any disk centered at (0, 0), h takes values greater and lower than h(0, 0).

2. – Classify the stationary points of

f (x1 , x2 , x3 , x4 ) = 20x2 + 48x3 + 6x4 + 8x1 x2 − 4x21 − 12x23 − x24 − 4x32 .

Does f attain its global extreme values on R4 ?

Solution: Since the function f is differentiable (because it is a polynomial), the local extreme points are critical points, i.e., solutions of

∇f(x1, x2, x3, x4) = ⟨8x2 − 8x1, 20 + 8x1 − 12x2², 48 − 24x3, 6 − 2x4⟩ = 0_{R⁴}

⇐⇒  x2 = x1,  5 + 2x1 − 3x2² = 0,  x3 = 2,  x4 = 3.

We deduce that (−1, −1, 2, 3) and (5/3, 5/3, 2, 3) are the critical points of f.
• Classification of the critical points: The Hessian matrix of f is

Hf(x1, x2, x3, x4) = | −8     8      0    0 |
                     |  8   −24x2    0    0 |
                     |  0     0    −24    0 |
                     |  0     0      0   −2 |

The leading principal minors at the point (−1, −1, 2, 3) are

D1 = −8 < 0,  D2 = −256 < 0,  D3 = −24 D2 > 0,  D4 = −2 D3 < 0.

Then (−1, −1, 2, 3) is a saddle point.

The leading principal minors at the point (5/3, 5/3, 2, 3) are

D1 = −8 < 0,  D2 = 256 > 0,  D3 = −24 D2 < 0,  D4 = −2 D3 > 0.

Then (5/3, 5/3, 2, 3) is a local maximum point.

• Global optimal points: Note that

f(0, x2, 0, 0) = 20x2 − 4x2³ −→ ∓∞ as x2 −→ ±∞.

Thus f takes arbitrarily large negative and positive values. Therefore f doesn't attain global optimal values on R⁴.
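
The leading principal minors can be checked mechanically; an illustrative sympy sketch (not part of the text):

import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
f = 20*x2 + 48*x3 + 6*x4 + 8*x1*x2 - 4*x1**2 - 12*x3**2 - x4**2 - 4*x2**3

H = sp.hessian(f, (x1, x2, x3, x4))
points = [{x1: -1, x2: -1, x3: 2, x4: 3},
          {x1: sp.Rational(5, 3), x2: sp.Rational(5, 3), x3: 2, x4: 3}]
for point in points:
    Hp = H.subs(point)
    print([Hp[:k, :k].det() for k in range(1, 5)])
# (-1,-1,2,3):   [-8, -256, 6144, -12288]  -> saddle point
# (5/3,5/3,2,3): [-8, 256, -6144, 12288]   -> local maximum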

3. – Let f (x, y) = ln(1 + x2 y).

i) Find and sketch the domain of definition of f .


ii) Find the stationary points and show that the second-derivatives test
is inconclusive at these points.
iii) Describe the behavior of f at these points.

Solution: i) The domain of f is given by:

Df = {(x, y) ∈ R² : 1 + x²y > 0} = {(0, y) : y ∈ R} ∪ {(x, y) ∈ R∗ × R : y > −1/x²}.

The domain of f is the region located above the curve y = −1/x², including the y-axis; see Figure 2.21.
FIGURE 2.21: Domain of f (x, y) = ln(1 + x2 y)

ii) f is differentiable on its open domain Df because f = v∘u with v(t) = ln t, u(x, y) = 1 + x²y: u is differentiable in R² (in particular, in Df), u(Df) ⊂ R⁺ \ {0}, and v is differentiable in R⁺ \ {0}.

The stationary points are solutions of

∇f(x, y) = ⟨2xy/(1 + x²y), x²/(1 + x²y)⟩ = ⟨0, 0⟩

⇐⇒  xy = 0 and x² = 0  ⇐⇒  x = 0, y ∈ R.

We deduce that the points located on the y-axis are the critical points of f.

The Hessian matrix of f is

Hf(x, y) = 1/(1 + x²y)² · | 2y(1 − x²y)   2x  |
                          |     2x       −x⁴  |

At the stationary points (0, y), we have

Hf(0, y) = | 2y  0 |
           |  0  0 |
FIGURE 2.22: Graph and level curves of f (x, y) = ln(1 + x2 y)

The leading minor D2(0, y) = det(Hf(0, y)) = 0, so the second derivatives test fails at these points. The behaviour of the function is illustrated in Figure 2.22.

Classification of these points:

• The points (0, y₀) with y₀ > 0 are local minimum points for f. Indeed, since the logarithm function is increasing, we have

f(x, y) = ln(1 + x²y) ≥ ln(1) = ln(1 + 0²y₀) = 0 = f(0, y₀)

∀x ∈ R, ∀y ∈ (y₀ − y₀/2, y₀ + y₀/2) = (y₀/2, 3y₀/2).

Thus, f takes values greater than or equal to f(0, y₀) in a neighborhood of (0, y₀) with y₀ > 0.

• The points (0, y₀) with y₀ < 0 are local maximum points for f. Indeed, since ln is an increasing function, we have

f(x, y) = ln(1 + x²y) ≤ ln(1) = ln(1 + 0²y₀) = 0 = f(0, y₀)

∀y ∈ (y₀ + y₀/2, y₀ − y₀/2) = (3y₀/2, y₀/2), ∀x such that 0 < 1 + x²y.

f takes values lower than or equal to f(0, y₀) in a neighborhood of (0, y₀) with y₀ < 0.

• The point (0, 0) is a saddle point for f. Indeed, we have

f(x, y) = ln(1 + x²y) ≥ ln(1) = 0 = f(0, 0)    ∀y ∈ R⁺,

f(x, y) = ln(1 + x²y) ≤ ln(1) = 0 = f(0, 0)    ∀y ∈ R⁻ such that 0 < 1 + x²y.

For any disk centered at (0, 0), f takes values greater and lower than f(0, 0).

4. – Find and classify all stationary points of f (x, y) = x2 y + y 3 x − x y.


Are there global minimum and maximum values of f on R2 ?

Solution:
FIGURE 2.23: Graph and level curves of f (x, y) = x2 y + y 3 x − xy

The function f is differentiable on its open domain R² since it is a polynomial. So, the local extreme points are critical, i.e., solutions of

∇f(x, y) = ⟨2xy + y³ − y, x² + 3y²x − x⟩ = ⟨0, 0⟩

⇐⇒  y(y² + 2x − 1) = 0  and  x(3y² + x − 1) = 0

⇐⇒  [y = 0 and x = 0]  or  [y = 0 and 3y² + x − 1 = 0]  or  [y² + 2x − 1 = 0 and x = 0]
     or  [y² + 2x − 1 = 0 and 3y² + x − 1 = 0]

⇐⇒  (x, y) = (0, 0)  or  (1, 0)  or  (0, ±1)  or  [y² = 1/5 and x = 2/5].

We deduce that (0, 0), (1, 0), (0, 1), (0, −1), (2/5, 1/√5) and (2/5, −1/√5) are the critical points of f. Reading the level curves in Figure 2.23, one can locate four saddle points and two local extrema.

Classification of the critical points: Applying the second derivatives test, we obtain:

critical point      D1(x, y)    D2(x, y)    classification
(0, 0)                  0          −1       saddle point
(1, 0)                  0          −1       saddle point
(0, 1)                  2          −4       saddle point
(0, −1)                −2          −4       saddle point
(2/5, 1/√5)          2/√5          4/5      local minimum point
(2/5, −1/√5)        −2/√5          4/5      local maximum point

TABLE 2.6: Critical points' classification for f(x, y) = x²y + y³x − xy

where

fxx(x, y) = 2y,    fyy(x, y) = 6yx,    fxy(x, y) = 2x + 3y² − 1,

Hf(x, y) = |     2y        2x + 3y² − 1 |
           | 2x + 3y² − 1      6xy      |

D1(x, y) = fxx = 2y,    D2(x, y) = det Hf(x, y) = 12xy² − [2x + 3y² − 1]².

Finally, note that f takes large positive and negative values since we have

f (1, y) = y 3 −→ ±∞ as y −→ ±∞.

Therefore, f doesn’t attain a global maximal value nor a minimal one.

5. – A power substation must be located at a point closest to three houses


located at the points (0, 0), (1, 1), (0, 2). Find the optimal location by
minimizing the sum of the squares of the distances between the houses and
the substation.

FIGURE 2.24: The closest power station to three houses

Solution: Let (x, y) be the position of the power substation. Then we have to look for (x, y) as the point that minimizes the function

f(x, y) = d²((x, y), (0, 0)) + d²((x, y), (1, 1)) + d²((x, y), (0, 2)),

which can be written as

f(x, y) = [(x − 0)² + (y − 0)²] + [(x − 1)² + (y − 1)²] + [(x − 0)² + (y − 2)²].

Because f is polynomial, it is differentiable on the open set R². Thus a global minimum point is also a local one. Therefore, it is a solution of

∇f(x, y) = ⟨2x + 2(x − 1) + 2x, 2y + 2(y − 1) + 2(y − 2)⟩ = ⟨0, 0⟩

⇐⇒  6x − 2 = 0 and 6y − 6 = 0  ⇐⇒  (x, y) = (1/3, 1).

Thus, we have one critical point, and by applying the second derivatives test, we obtain:

Hf(x, y) = | 6  0 |,    D1(1/3, 1) = 6 > 0,    D2(1/3, 1) = det | 6  0 | = 36 > 0.
           | 0  6 |                                            | 0  6 |

So (1/3, 1) is a local minimum; see Figure 2.24 for the position of the point and the three houses.
To show that it is the point that minimizes f globally, we proceed by comparing the values of f and completing squares:

f(x, y) − f(1/3, 1) = 3x² − 2x + 1 + 3y² − 6y + 5 − (2/3 + 2)
                    = 3(x − 1/3)² + 3(y − 1)² ≥ 0    ∀(x, y) ∈ R².
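
The minimizer of a sum of squared distances is the centroid of the given points, which gives a quick numerical check (an illustrative sketch, not from the text):

import numpy as np

houses = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
print(houses.mean(axis=0))    # [0.333..., 1.0], i.e. the point (1/3, 1)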

6. – Based on the level curves that are visible in Figures 2.25 and 2.26,
identify the approximate position of the local maxima, local minima and
saddle points.

FIGURE 2.25: Level curves of f(x, y) = −xy e^{−(x²+y²)/2} on [−2, 2] × [−2, 2]

Solution: i) From the level curves’ plotting, one can locate:

- a saddle point at (0, 0)


FIGURE 2.26: Level curves of g(x, y) = sin(x) + sin(y) − cos(x + y) for x, y in [0, 3π]

- two local maxima at (−1, 1), (1, −1)


- two local minima at (−1, −1), (1, 1).
Using Maple software, one can check these observations by applying the second
derivatives test using the coding:

with(Student[MultivariateCalculus]):
LagrangeMultipliers(-x*y*exp(-(x^2+y^2)*(1/2)), [], [x, y], output = detailed);
  [x = 0, y = 0, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = 0],
  [x = 1, y = 1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = -exp(-1)],
  [x = 1, y = -1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = exp(-1)],
  [x = -1, y = 1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = exp(-1)],
  [x = -1, y = -1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = -exp(-1)]
SecondDerivativeTest(-x*y*exp(-(x^2+y^2)*(1/2)), [x, y] = [0, 0]);
  LocalMin = [], LocalMax = [], Saddle = [[0, 0]]
SecondDerivativeTest(-x*y*exp(-(x^2+y^2)*(1/2)), [x, y] = [1, 1]);
  LocalMin = [[1, 1]], LocalMax = [], Saddle = []
  ...

ii) For the second figure, the exact points found, using Maple, are:

- 5 saddle points (3π/2, 3π/2), (π/2, 3π/2), (3π/2, π/2), (7π/2, π/2), (π/2, 7π/2)

- 4 local maxima at (π/2, 5π/2), (5π/2, π/2), (π/2, π/2), (5π/2, 5π/2)

- 2 local minima at (11π/6, 11π/6), (7π/2, 7π/2).
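As an alternative to the Maple session above, the following Python/SymPy sketch (our illustrative assumption; SymPy is not used in the text) performs the same second-derivatives test for the function of Figure 2.25.

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = -x*y*sp.exp(-(x**2 + y**2)/2)

# the solver should return the five stationary points (0,0), (1,1), (1,-1), (-1,1), (-1,-1)
crit = sp.solve([sp.diff(f, x), sp.diff(f, y)], (x, y), dict=True)
H = sp.hessian(f, (x, y))
for pt in crit:
    # D1 = f_xx and D2 = det Hf at each stationary point
    print(pt, sp.simplify(H[0, 0].subs(pt)), sp.simplify(H.det().subs(pt)))

At the origin D2 is negative (saddle point), while at (±1, ±1) the sign of D1 decides between the local minima and maxima, as read on the level curves.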

2.3 Convexity/Concavity and Global Extreme Points

In dimension 1, when a C 2 function f is convex on its domain Df and x∗ is


a local minimum of f , then x∗ is a global minimum. Indeed, the convexity of f is
characterized by f″(x) ≥ 0 [2], [1]. Then, using Taylor's formula, the values f(x)
and f(x∗) can be compared as follows:

f(x) = f(x∗) + (x − x∗)f′(x∗) + ((x − x∗)²/2) f″(c)    for some c between x∗ and x.

Because f′(x∗) = 0, then

f(x) − f(x∗) = ((x − x∗)²/2) f″(c) ≥ 0.

As x is arbitrarily chosen in the domain of f, then

f(x) ≥ f(x∗)    ∀x ∈ Df

which shows that x∗ is a global minimum point for f.

In this section, we want to generalize the convexity property to functions


of several variables in order to establish, later, results of global optimality.

2.3.1 Convex/Concave Several Variable Functions

Definition 2.3.1 Let S be a convex set of Rn and let f be a real function


f: S −→ R
x = (x1 , · · · , xn ) −→ f (x).
Then,
f is convex ⇐⇒ f(ta + (1 − t)b) ≤ tf(a) + (1 − t)f(b).

f is strictly convex ⇐⇒ f(ta + (1 − t)b) < tf(a) + (1 − t)f(b),    a ≠ b, t ≠ 0, 1.

f is concave ⇐⇒ f(ta + (1 − t)b) ≥ tf(a) + (1 − t)f(b).

f is strictly concave ⇐⇒ f(ta + (1 − t)b) > tf(a) + (1 − t)f(b),    a ≠ b, t ≠ 0, 1.

These equivalences must hold ∀a, b ∈ S, ∀ t ∈ [0, 1].

• Using the definition, one can check that the functions

i) f (x) = ax + b ii) f (x, y) = ax + by + c


are simultaneously concave and convex in R and R2 respectively. Their re-
spective graphs represent a line y = ax + b and a plane z = ax + by + c.
• A convex/concave function is not necessarily differentiable at every point.

i) f(x) = |x|        ii) f(x, y) = √(x² + y²) = ‖(x, y)‖.

Each function is not differentiable at the origin and represents the Euclidean
distance in R and R2 respectively. We use the triangular inequality to verify
that they are convex.
• One can form new convex/concave functions using algebraic operations. For
Example [25],
if f, g are functions defined on a convex set S ⊂ Rn and s, t ≥ 0, then:

f and g are concave (resp. convex) =⇒

    – sf + tg is concave (resp. convex)

    – min(f, g) (resp. max(f, g)) is concave (resp. convex).
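Definition 2.3.1 can also be probed numerically. The following Python sketch (an illustrative assumption of ours, not part of the text) samples random points and weights and checks the convexity inequality for the Euclidean norm f(x, y) = √(x² + y²).

import numpy as np

rng = np.random.default_rng(0)
f = lambda p: np.sqrt(p[0]**2 + p[1]**2)

for _ in range(10000):
    a, b = rng.normal(size=2), rng.normal(size=2)
    t = rng.uniform()
    # convexity: f(ta + (1-t)b) <= t f(a) + (1-t) f(b), up to rounding
    assert f(t*a + (1 - t)*b) <= t*f(a) + (1 - t)*f(b) + 1e-12

Such a random test can of course only support, never prove, convexity; the proof rests on the triangle inequality as noted above.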

Remark 2.3.1 The geometrical interpretation of the convexity of f ex-


presses that the graph of f remains under the line segment [AB] joining
any two points A(a, f (a)) and B(b, f (b)) of the graph of f . Indeed,

[A, B] = { (x, y) ∈ Rn × R : x = a + t(b − a),  y = f(a) + t(f(b) − f(a)),  t ∈ [0, 1] }

is located above the part of the graph of f

{ (x, y) ∈ Rn × R : x = a + t(b − a),  y = f(a + t(b − a)),  t ∈ [0, 1] }

since we have, ∀t ∈ [0, 1],

f(a + t(b − a)) = f(tb + (1 − t)a) ≤ tf(b) + (1 − t)f(a) = f(a) + t(f(b) − f(a)).

Similarly, the geometrical interpretation of the concavity of f expresses


that the graph of f remains above the line segment [AB] joining any two
points A(a, f (a)) and B(b, f (b)) of the graph of f ; see Figure 2.27.

FIGURE 2.27: Shape of convex functions

Remark 2.3.2 There is a connection between the convexity/concavity of a
function f defined on a convex set S ⊂ Rn and the convexity of
particular sets described by f [25]. Indeed, we have

f is convex ⇐⇒ the set { (x, y) ∈ S × R : y ≥ f(x) } is convex

f is concave ⇐⇒ the set { (x, y) ∈ S × R : y ≤ f(x) } is convex

2.3.2 Characterization of Convex/Concave C 1 Functions

When n = 1, the following theorem expresses that the graph of a convex


(resp. concave) C 1 function remains above (resp. below) its tangent lines.

Theorem 2.3.1 Let S be a convex open set of Rn and let f : S −→ R be


C 1 . Then, for any x, a ∈ S, the following inequalities hold
 
f is convex in S ⇐⇒ f(x) − f(a) ≥ ∇f(a)·(x − a)

f is strictly convex in S ⇐⇒ f(x) − f(a) > ∇f(a)·(x − a),    x ≠ a

f is concave in S ⇐⇒ f(x) − f(a) ≤ ∇f(a)·(x − a)

f is strictly concave in S ⇐⇒ f(x) − f(a) < ∇f(a)·(x − a),    x ≠ a.

Proof. We prove the first assertion. The other assertions can be established
similarly.

=⇒) If f is convex in S, then, by definition, we have for a, b ∈ S,

f(tb + (1 − t)a) ≤ tf(b) + (1 − t)f(a)    ∀t ∈ [0, 1]

from which we deduce

f(b) − f(a) ≥ [f(tb + (1 − t)a) − f(a)]/t = [f(a + t(b − a)) − f(a)]/t    ∀t ∈ (0, 1].

Since f ∈ C¹(S), we obtain

f(b) − f(a) ≥ lim_{t→0⁺} [g(t) − g(0)]/(t − 0) = g′(0)

where

g(t) = f(a + t(b − a)),    g′(t) = f′(a + t(b − a))·(b − a),    g′(0) = f′(a)·(b − a).

Indeed,

g(t) = f(a1 + t(b1 − a1), . . . , an + t(bn − an)) = f(x1(t), x2(t), ..., xn(t)).

Each function xj(t) = aj + t(bj − aj), j = 1, ..., n, is differentiable with
x′j(t) = bj − aj. So g is differentiable and we obtain, by the chain rule formula,

g′(t) = (∂f/∂x1)(∂x1/∂t) + (∂f/∂x2)(∂x2/∂t) + ... + (∂f/∂xn)(∂xn/∂t)

      = fx1(a + t(b − a))(b1 − a1) + fx2(a + t(b − a))(b2 − a2) + . . .

      . . . + fxn(a + t(b − a))(bn − an) = (∇f)(a + t(b − a))·(b − a).

⇐=) Assume that


 
f(x) − f(u) ≥ ∇f(u)·(x − u)    ∀ x, u ∈ S.

Let a, b ∈ S and t ∈ [0, 1]. Choosing x = a and u = ta + (1 − t)b in the above
inequality, we obtain

f(a) − f(ta + (1 − t)b) ≥ ∇f(ta + (1 − t)b)·(a − [ta + (1 − t)b])
                        = (1 − t) ∇f(ta + (1 − t)b)·(a − b).      (∗)

Now, choose x = b and u = ta + (1 − t)b in the same inequality. We get

f(b) − f(ta + (1 − t)b) ≥ ∇f(ta + (1 − t)b)·(b − [ta + (1 − t)b])
                        = −t ∇f(ta + (1 − t)b)·(a − b).           (∗∗)

Multiply the inequality (∗) by t > 0 and the inequality (∗∗) by (1 − t) > 0,
then add the resulting inequalities. This gives

tf(a) + (1 − t)f(b) − (t + (1 − t))f(ta + (1 − t)b)

    ≥ [t(1 − t) − (1 − t)t] ∇f(ta + (1 − t)b)·(a − b) = 0.

Therefore f is convex.

Example 1. Show that f (x, y) = x2 + y 2 is convex on R2 .

Solution: We have

f(x, y) − f(s, t) − ∇f(s, t)·⟨x − s, y − t⟩

    = x² + y² − (s² + t²) − ⟨2s, 2t⟩·⟨x − s, y − t⟩

    = x² + y² − (s² + t²) − 2s(x − s) − 2t(y − t)

    = (s − x)² + (t − y)² ≥ 0    ∀ (x, y), (s, t) ∈ R².

Thus f is convex on R². Note that by taking (s, t) = (0, 0), the critical point
of f, we deduce that f(x, y) − f(0, 0) ≥ 0 ∀ (x, y) ∈ R². Hence, (0, 0) is a
global minimum of f.
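The algebra above can be checked symbolically. The short Python/SymPy sketch below is an illustrative assumption of ours (not from the text); it confirms that the gap in the gradient inequality of Theorem 2.3.1 is exactly a sum of squares for f(x, y) = x² + y².

import sympy as sp

x, y, s, t = sp.symbols('x y s t', real=True)
f = lambda u, v: u**2 + v**2

# f(x,y) - f(s,t) - grad f(s,t).(x-s, y-t)
gap = f(x, y) - f(s, t) - (2*s*(x - s) + 2*t*(y - t))

# prints 0: the gap equals (x-s)^2 + (y-t)^2, hence is nonnegative
print(sp.simplify(gap - ((x - s)**2 + (y - t)**2)))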

As we can expect from the above example, it will not always be easy
to check the convexity or concavity of a function by solving inequalities.
Next, we show a more practical characterization, which requires more regularity
of the function.

2.3.3 Characterization of Convex/Concave C 2 Functions

Theorem 2.3.2 Strict convexity/concavity


Let S be a convex open set of Rn and let f : S −→ R, f ∈ C 2 (S). Then

(i) Dk (x) > 0 ∀x ∈ S, k = 1, . . . , n =⇒ f is strictly convex in S.

(ii) (−1)k Dk (x) > 0 ∀x ∈ S, k = 1, . . . , n =⇒ f is strictly concave in S.


Dk (x), k = 1, . . . , n are the n leading minors of the Hessian matrix
Hf (x) = (fxi xj (x))n×n of f .

Proof. i) For a, b ∈ S, a = b, and t ∈ [0, 1], define the function


g(t) = f (tb + (1 − t)a) = f (a + t(b − a)) = f (x1 (t), . . . , xn (t))
with xj (t) = aj + t(bj − aj ) j = 1, . . . , n.
By the chain rule theorem, we have
 
g  (t) = (∇f )(a + t(b − a)) .(b − a).
Since f is C 2 , g is also C 2 and we have

g″(t) = (d/dt)[fx1(a + t(b − a))](b1 − a1) + . . . + (d/dt)[fxn(a + t(b − a))](bn − an).

For each i = 1, . . . , n, we have

fxi(a + t(b − a)) = fxi(x1(t), x2(t), . . . , xn(t)).

Then

(d/dt) fxi(a + t(b − a)) = (∂fxi/∂x1)(∂x1/∂t) + (∂fxi/∂x2)(∂x2/∂t) + ... + (∂fxi/∂xn)(∂xn/∂t)

    = fxix1(a + t(b − a))(b1 − a1) + . . . + fxixn(a + t(b − a))(bn − an)

    = Σ_{j=1}^{n} fxixj(a + t(b − a))(bj − aj).

Hence

g″(t) = Σ_{i=1}^{n} Σ_{j=1}^{n} [fxixj(a + t(b − a))](bi − ai)(bj − aj).

Now, by assumption, we have Dk(z) > 0 for all z ∈ S and for all k = 1, . . . , n,
then the quadratic form

Q(h) = Σ_{i=1}^{n} Σ_{j=1}^{n} fxixj(a + t(b − a)) hi hj

with the associated symmetric matrix (fxixj(a + t(b − a)))n×n is positive
definite. As a consequence, g″(t) > 0 and g is strictly convex. In particular

f(tb + (1 − t)a) = g(t) = g(t·1 + (1 − t)·0) < tg(1) + (1 − t)g(0) = tf(b) + (1 − t)f(a)

and the strict convexity of f follows.

ii) Under the assumptions ii), the quadratic form


Q∗(h) = Σ_{i=1}^{n} Σ_{j=1}^{n} (−f)xixj(a + t(b − a)) hi hj = ᵗh H−f(a + t(b − a)) h

      = ᵗh [ (−f)x1x1   (−f)x1x2   . . .   (−f)x1xn ]
            [ (−f)x2x1   (−f)x2x2   . . .   (−f)x2xn ]
            [    ...         ...     ...       ...   ] h
            [ (−f)xnx1   (−f)xnx2   . . .   (−f)xnxn ]

is positive definite by assumption. As a consequence, (−g)″(t) > 0 and −g is
strictly convex. In particular

−f(tb + (1 − t)a) = (−g)(t) = −g(t·1 + (1 − t)·0)

    < t(−g)(1) + (1 − t)(−g)(0) = t(−f)(b) + (1 − t)(−f)(a)

⇐⇒ f(tb + (1 − t)a) > tf(b) + (1 − t)f(a)    ∀a ≠ b, ∀t ∈ (0, 1)

and the strict concavity of f follows.

We also have the following characterization



Theorem 2.3.3 Convexity/concavity


Let S be a convex open set of Rn and let f : S −→ R be C 2 . Then

f is convex in S ⇐⇒ Δk(x) ≥ 0 ∀x ∈ S, ∀ k = 1, . . . , n.

f is concave in S ⇐⇒ (−1)^k Δk(x) ≥ 0 ∀x ∈ S, ∀ k = 1, . . . , n.

A principal minor Δr (x) of order r in the Hessian [fxi xj (x)] of f is the


determinant obtained by deleting n − r rows and the n − r columns with
the same numbers (if the ith row (column) is selected, then so is the ith
column (row)).

Proof. We prove only the first assertion. The second one is established by
replacing f by −f .

⇐=) We proceed as in the proof of the previous theorem. We conclude that
Q(h) is positive semi-definite. As a consequence, g″(t) ≥ 0 and g is convex.
In particular

f(tb + (1 − t)a) = g(t) = g(t·1 + (1 − t)·0) ≤ tg(1) + (1 − t)g(0) = tf(b) + (1 − t)f(a)

and the convexity of f follows.

=⇒) Suppose f is convex in S. It suffices to show that the quadratic form Q(h)
satisfies

Q(h) = Σ_{i=1}^{n} Σ_{j=1}^{n} fxixj(a) hi hj ≥ 0    ∀a ∈ S.

So, let a ∈ S. Since S is an open set, there exists ε > 0 such that Bε(a) ⊂ S.
In particular, for h ∈ Rn, h ≠ 0, we have

a + th ∈ Bε(a) ⇐⇒ ‖a + th − a‖ = |t| ‖h‖ < ε ⇐⇒ |t| < ε/‖h‖ = α.

So, for t ∈ (−α, α), the function u(t) = f (a + th) is well defined. We claim
that u is convex. Indeed, we have for λ ∈ [0, 1] and t, s ∈ (−α, α),

u(λt + (1 − λ)s) = f (a + [λt + (1 − λ)s]h)

= f (λa + (1 − λ)a + [λt + (1 − λ)s]h)

= f (λ[a + th] + (1 − λ)[a + sh])

 λf (a + th) + (1 − λ)f (a + sh) since f is convex

= λu(t) + (1 − λ)u(s).

Hence u″(t) ≥ 0 for all t ∈ (−α, α). But

u″(t) = Σ_{i=1}^{n} Σ_{j=1}^{n} fxixj(a + th) hi hj

and for t = 0, we obtain the semi definite positivity of the quadratic form Q.

Example 2. Show that f (x, y) = x4 + y 4 is convex on R2 .

Solution: We have
∇f(x, y) = ⟨4x³, 4y³⟩.

The Hessian matrix of f is

Hf(x, y) = [ fxx  fxy ]   [ 12x²    0   ]
           [ fyx  fyy ] = [   0    12y² ].

The leading principal minors are

D1(x, y) = 12x² ≥ 0,    D2(x, y) = | 12x²    0   |
                                   |   0    12y² | = 144x²y² ≥ 0.

We cannot conclude about the strict convexity of f on R² since D1(x, y) = 0
if x = 0 and D2(x, y) = 0 if x = 0 or y = 0. However, since f is C∞, then

f is convex ⇐⇒ Hf is semi-definite positive.

We have Δ1^{11}(x, y) = 12y² ≥ 0,   Δ1^{22}(x, y) = 12x² ≥ 0,   and   Δ2(x, y) = 144x²y² ≥ 0.

Thus, f is convex on R2 .
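The principal-minor test above is easy to reproduce in software. The Python/SymPy sketch below (an illustrative assumption of ours; the text itself uses Maple) prints the Hessian of x⁴ + y⁴ and the minors discussed above.

import sympy as sp

x, y = sp.symbols('x y', real=True)
H = sp.hessian(x**4 + y**4, (x, y))

print(H)                           # Matrix([[12*x**2, 0], [0, 12*y**2]])
print(H[0, 0], H[1, 1], H.det())   # the order-1 principal minors and Delta_2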

2.3.4 Characterization of a Global Extreme Point

From the previous characterizations of convex/concave functions, we de-


duce the following:

Theorem 2.3.4 Let S ⊂ Ω ⊂ Rn , Ω an open set, S a convex set and


x∗ ∈ S. Let f : Ω −→ R be a C 1 function on Ω, concave (resp. convex) on
S, then

x∗ is a global maximum (resp. minimum) point ⇐⇒ ∇f(x∗)·(x − x∗) ≤ 0 ∀x ∈ S (resp. ≥).

Moreover, if x∗ is an interior point of S, then

x∗ is a global maximum (resp. minimum) point for f on S ⇐⇒ ∇f(x∗) = 0.

Proof. • Without the convexity assumption of f, the implication

x∗ is a global minimum point for f on S =⇒ ∇f(x∗)·(x − x∗) ≥ 0 ∀x ∈ S

is established in Theorem 2.1.2. Now, suppose that f is convex on S and that

∇f(x∗)·(x − x∗) ≥ 0 ∀x ∈ S;

then, because f is a C¹ function on Ω, we have (see proof of Theorem 2.3.1, =⇒)

f(x) − f(x∗) ≥ ∇f(x∗)·(x − x∗) ∀x ∈ S

from which we deduce that

f(x) ≥ f(x∗) ∀x ∈ S

and conclude that x∗ is a global minimum point for f on S.

• If x∗ is an interior global minimum point for f, then it is a stationary point,
with no need of the convexity of f. Conversely, suppose x∗ is a stationary
point for f and f is C¹(Ω) and convex in S; then we have

f(x) − f(x∗) ≥ ∇f(x∗)·(x − x∗).

Because ∇f(x∗) = 0, we deduce that f(x) − f(x∗) ≥ 0, and then f(x) ≥ f(x∗) ∀x ∈ S.
The case f concave is established similarly.

Example 3. Suppose that x units of a commodity are sold at 160 − 0.01x


cents per unit and that the total cost of production, in cents, is given by

C(x) = 40x + 20000.

Find the most profitable level of production if 7000 ≤ x ≤ 100000 by applying


the theorem above.

Solution: Since the total revenue for selling x units is R(x) = x(160 − 0.01x),
the profit P (x) on x units will be

P (x) = R(x)−C(x) = x(160−0.01x)−(40x+20000) = −0.01x2 +120x−20000.

So, we have to find

max_{x∈S} P(x)    on the convex set S = [7000, 100000].

From

dP/dx = −0.02x + 120

we have

dP/dx = 0 ⇐⇒ x = 6000.

The only stationary point is 6000 and it cannot be the maximum point since it
is not in S. Let us then explore the concavity of P. We have

P(x) − P(a) − (dP/dx)(a)(x − a) = −0.01(x − a)² < 0    ∀ x, a ∈ S, x ≠ a.

Thus, P is strictly concave on S. Therefore, the maximum point x∗ (which
exists by the extreme value theorem) must satisfy

(dP/dx)(x∗)·(x − x∗) ≤ 0 ∀x ∈ S ⇐⇒ (−0.02x∗ + 120)(x − x∗) ≤ 0 ∀x ∈ S.

Since (−0.02x∗ + 120) < 0 on S, we must have

x − x∗ ≥ 0 ∀x ∈ [7000, 100000] ⇐⇒ x∗ = 7000.

7000 units should be produced to attain maximum profit.
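A quick numerical sketch (an illustrative assumption of ours, not part of the text) confirms that P is decreasing on the whole admissible interval, so the maximal profit indeed sits at the left endpoint x = 7000.

import numpy as np

P = lambda x: -0.01*x**2 + 120*x - 20000
xs = np.linspace(7000, 100000, 1001)

assert np.all(np.diff(P(xs)) < 0)   # P strictly decreases on [7000, 100000]
print(P(7000))                      # maximal profit at the endpoint, in cents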


Theorem 2.3.5 Let S be a convex set of Rn and x∗ an interior point of S. Let f : S −→ R
be a C² concave (resp. convex) function on S, then

x∗ is a global maximum (resp. minimum) point for f on S ⇐⇒ ∇f (x∗ ) = 0.



Proof. i) Suppose f to be concave.


=⇒) Suppose x∗ is a global maximum point for f . Since x∗ is an interior
point, it is a stationary point, that is ∇f (x∗ ) = 0.
⇐=) Suppose ∇f(x∗) = 0; then, since f is C², we have, for some t ∈ [0, 1],

f(x) − f(x∗) = ∇f(x∗)·(x − x∗) + (1/2) ᵗ(x − x∗) Hf(x∗ + t(x − x∗)) (x − x∗),

then, because f is concave in S, we have

f(x) − f(x∗) = (1/2) ᵗ(x − x∗) Hf(x∗ + t(x − x∗)) (x − x∗) ≤ 0.

Hence

f(x) ≤ f(x∗) ∀x ∈ S.

Thus x∗ is a global maximum point for f.

ii) If f is convex in S, then −f is concave in S. From i), we deduce that

x∗ is a global minimum point for f in S


⇐⇒ x∗ is a global maximum point for (−f ) in S
⇐⇒ ∇(−f )(x∗ ) = 0 ⇐⇒ ∇f (x∗ ) = −∇(−f )(x∗ ) = 0.

Example 4. Find the global maxima and minima points if any of f defined
by
f (x, y, z, t) = 24x + 32y + 48z + 72t − (x2 + y 2 + 2z 2 + 3t2 ).

Solution: A global extreme point of f is also a local extreme one since f is


defined on R4 which is open. Therefore, it is a stationary point since f is C ∞
(because it is a polynomial). We have
∇f(x, y, z, t) = ⟨24 − 2x, 32 − 2y, 48 − 4z, 72 − 6t⟩ = ⟨0, 0, 0, 0⟩
⇐⇒ (x, y, z, t) = (12, 16, 12, 12).

The only stationary point is (12, 16, 12, 12). The Hessian matrix is

                  [ −2   0   0   0 ]
Hf(x, y, z, t) =  [  0  −2   0   0 ]
                  [  0   0  −4   0 ]
                  [  0   0   0  −6 ]

The leading principal minors are

D1 = −2,    D2 = det[ −2  0 ; 0  −2 ] = 4,    D3 = det[ −2  0  0 ; 0  −2  0 ; 0  0  −4 ] = −16,

D4(x, y, z, t) = det[ −2  0  0  0 ; 0  −2  0  0 ; 0  0  −4  0 ; 0  0  0  −6 ] = −6(−16) = 96,
and satisfy

(−1)k Dk (x, y, z, t) > 0 ∀(x, y, z, t) ∈ R4 for k = 1, 2, 3, 4.

Therefore, f is strictly concave on R4 and the point (12, 16, 12, 12) is the only
global maximum point.

Note that min_{R⁴} f doesn't exist since f takes large negative values. Indeed, we
have, for example,

f(x, 0, 0, 0) = 24x − x² −→ −∞ as x −→ ±∞.
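The stationary point and the signs of the leading minors can be rechecked symbolically. The Python/SymPy sketch below (our illustrative assumption; not from the text) performs both steps.

import sympy as sp

x, y, z, t = sp.symbols('x y z t', real=True)
f = 24*x + 32*y + 48*z + 72*t - (x**2 + y**2 + 2*z**2 + 3*t**2)
vars_ = (x, y, z, t)

# the unique stationary point: {x: 12, y: 16, z: 12, t: 12}
print(sp.solve([sp.diff(f, v) for v in vars_], vars_))

# leading principal minors of the Hessian: [-2, 4, -16, 96]
H = sp.hessian(f, vars_)
print([H[:k, :k].det() for k in range(1, 5)])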



Solved Problems

1. – A power substation must be located at a point closest to m houses lo-
cated at m distinct points (x1, y1), (x2, y2), . . . , (xm, ym). Find the opti-
mal location by minimizing the sum of the squares of the distances between
the houses and the substation.

Solution: Let (x, y) be the position of the power substation. Then, we have
to look for (x, y) as the point that minimizes the function

f (x, y) = d2 ((x, y), (x1 , y1 )) + d2 ((x, y), (x2 , y2 )) + . . . + d2 ((x, y), (xm , ym ))

which can be written as

f (x, y) = [(x−x1 )2 +(y−y1 )2 ]+[(x−x2 )2 +(y−y2 )2 ]+. . .+[(x−xm )2 +(y−ym )2 ].

Because f is polynomial, it is differentiable on the open set R2 . Thus a global


minimum point is also a local one. Therefore, it is a solution of

∇f(x, y) = ⟨2(x − x1) + 2(x − x2) + . . . + 2(x − xm),
            2(y − y1) + 2(y − y2) + . . . + 2(y − ym)⟩ = ⟨0, 0⟩

⇐⇒   m·x − Σ_{k=1}^{m} xk = 0   and   m·y − Σ_{k=1}^{m} yk = 0

⇐⇒   x = (1/m) Σ_{k=1}^{m} xk   and   y = (1/m) Σ_{k=1}^{m} yk.

We have only one critical point. The Hessian matrix of f is

Hf(x, y) = [ fxx  fxy ]   [ 2m   0  ]
           [ fyx  fyy ] = [  0   2m ].

The leading principal minors satisfy

D1(x, y) = 2m > 0,    D2(x, y) = | 2m   0  |
                                 |  0   2m | = 4m² > 0.

So f is strictly convex on R2 . Then, the critical point is the global minimum


of f and describes the optimal location of the substation.
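The optimal location is therefore the centroid of the given points. The following NumPy sketch (an illustrative assumption of ours, using the data of Problem 5 of the previous section) evaluates the objective at the centroid.

import numpy as np

houses = np.array([(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)])     # sample data
f = lambda p: np.sum(np.sum((houses - p)**2, axis=1))       # sum of squared distances

centroid = houses.mean(axis=0)
print(centroid, f(centroid))    # (1/3, 1) with minimal value 8/3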

2. – Let f be a function of two variables given by

f (x, y) = x2 + y 4 − 4xy for all x and y.

i) Calculate the first and second order partial derivatives of f .

ii) Find all the stationary points of f and classify them by means of the
second derivatives test.

iii) Does f have any global extreme points?

iv) Use a software to graph f .

Solution: i) and ii) Since the function f is differentiable (because it is a


polynomial), the local extreme points are critical, i.e, solutions of

∇f(x, y) = ⟨2x − 4y, 4y³ − 4x⟩ = ⟨0, 0⟩

⇐⇒  2x − 4y = 0  and  4y³ − 4x = 0  ⇐⇒  x = 2y  and  y³ − 2y = 0 ⇐⇒ y(y² − 2) = 0

⇐⇒  [x = 2y and y = 0]  or  [x = 2y and y = √2]  or  [x = 2y and y = −√2].

We deduce that (0, 0), (2√2, √2) and (−2√2, −√2) are the critical points.

Classification of the critical points:


The Hessian matrix of f is

Hf(x, y) = [ fxx  fxy ]   [  2    −4   ]
           [ fyx  fyy ] = [ −4    12y² ]

(x, y)            D1(x, y)    D2(x, y)    type
(0, 0)            2           −16         saddle point
(2√2, √2)         2           32          local minimum
(−2√2, −√2)       2           32          local minimum

TABLE 2.7: Critical points classification of f(x, y) = x² + y⁴ − 4xy

The leading principal minors are

D1(x, y) = 2,    D2(x, y) = |  2    −4   |
                            | −4    12y² | = 24y² − 16.

An application of the second derivatives test gives the characterization in


Table 2.7.

iii) and iv) The first graph in Figure 2.28 shows the form of a saddle. On the
second graph, there are two families of circular curves and a hyperbola,
which confirms the previous classification of the critical points.
FIGURE 2.28: Graph and level curves of f

Global extreme points.

We cannot conclude about the concavity/convexity of f on R² since the
signs of the principal minors of the Hessian are as follows:

Δ1^{11}(x, y) = 12y² ≥ 0,    Δ1^{22}(x, y) = 2 ≥ 0,    Δ2(x, y) = 24y² − 16

and Δ2 depends on y. Thus, f is neither convex nor concave on R².



However, we remark that, on the y axis, we have

f (0, y) = y 4 −→ +∞ as y −→ ±∞.
So f cannot attain a maximum value in R2 .

Moreover, by completing the squares, we compare the values of f with its
value at the local minima points f(2√2, √2) = f(−2√2, −√2) = −4 and
obtain

f(x, y) + 4 = (x − 2y)² + (y² − 2)² ≥ 0    ∀(x, y) ∈ R².

Thus, f attains its global minimal value −4 at these two points.
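The critical points, the second-derivatives test and the minimal value −4 can all be reproduced with the Python/SymPy sketch below (an illustrative assumption of ours; the text carries out the computation by hand or with Maple).

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2 + y**4 - 4*x*y

crit = sp.solve([sp.diff(f, x), sp.diff(f, y)], (x, y), dict=True)
H = sp.hessian(f, (x, y))
for pt in crit:
    # D1, D2 and the value of f at each critical point
    print(pt, H[0, 0].subs(pt), sp.simplify(H.det().subs(pt)), sp.simplify(f.subs(pt)))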

3. – Let f (x, y) = x2 .

i) Show that f has infinitely many critical points and that the second
derivatives test fails for these points.

ii) Show that f is convex on R2 .


iii) What is the minimum value of f ? Give the minima points.
iv) Does f have any local or global maxima? Justify your answer.

Solution:
FIGURE 2.29: Graph and level curves of f

i) Since f is a differentiable function (because it is a polynomial), the local


extreme points are critical ones, i.e, solutions of
∇f(x, y) = ⟨2x, 0⟩ = ⟨0, 0⟩ ⇐⇒ x = 0.
We deduce that the points on the y axis are all critical points of f .

Classification of the critical points:


The Hessian matrix of f is
   
Hf(x, y) = [ fxx  fxy ]   [ 2  0 ]
           [ fyx  fyy ] = [ 0  0 ]

The leading principal minors at the critical points (0, y) of f (y ∈ R) are

D1(0, y) = 2 > 0,    D2(0, y) = | 2  0 |
                                | 0  0 | = 0.
So the second derivative test is inconclusive.

ii) The principal minors are

Δ1^{11}(x, y) = 0,    Δ1^{22}(x, y) = 2,    Δ2(x, y) = | 2  0 |
                                                       | 0  0 | = 0,

and they satisfy Δk(x, y) ≥ 0 for k = 1, 2 and all (x, y) ∈ R². Therefore f is convex in R².

iii) Note that

f(x, y) = x² ≥ 0 = f(0, y)    ∀(x, y) ∈ R².

We deduce that the critical points are global minimum points for f in R².

iv) Since f is infinitely differentiable (because it is polynomial) in the open


set R2 , an absolute maximum of f would be a local maximum, and therefore
a critical point. But, all the critical points are minima points for f . Hence, f
has no local nor absolute maxima; see Figure 2.29. In fact, on the x-axis, we
have

f(x, 0) = x² −→ +∞ as x −→ +∞.

So f cannot attain a maximum value M in R². Indeed, if not, we would have

f(x, y) ≤ M    ∀(x, y) ∈ R².

Then, we have

f(x, 0) = x² ≤ M    ∀x ∈ R

which is not possible. For example,

f(M, 0) = M² > M    ∀M > 1.



4. – Discuss the convexity/concavity of f on R2

f (x, y) = 4xy − x2 − y 2 − 6x.

Are there global extreme points?

Solution:
FIGURE 2.30: Graph and level curves of f

We have
∇f(x, y) = ⟨4y − 2x − 6, 4x − 2y⟩.

Since f is C∞, then

f is convex ⇐⇒ Hf is semi-definite positive

where the Hessian matrix of f is

Hf(x, y) = [ fxx  fxy ]   [ −2   4 ]
           [ fyx  fyy ] = [  4  −2 ]

The principal minors of Hf are

Δ1^{11}(x, y) = −2,    Δ1^{22}(x, y) = −2    and    Δ2(x, y) = | −2   4 |
                                                               |  4  −2 | = −12.

So f is not convex, nor concave on R2 ; see Figure 2.30.

Remark that

f (0, y) = −y 2 −→ −∞ as y −→ ±∞.

f takes large negative values and doesn’t attain its minimal value.

On the other hand, when looking for the critical points of f , we obtain

∇f(x, y) = ⟨4y − 2x − 6, 4x − 2y⟩ = ⟨0, 0⟩ ⇐⇒ x = 1 and y = 2x = 2.

This point is a saddle point. It will help us to find a direction of increase of


values of f . Indeed, by completing the squares, we obtain

f (x, y) − f (1, 2) = −(2x − y)2 + 3(x − 1)2

from which we deduce that

f (x, 2x) = f (1, 2) + 3(x − 1)2 −→ +∞ as x −→ ±∞.

So f takes large positive values and doesn’t attain its maximal value either.

5. – Let f be the function defined by:

f (x, y) = x4 − 2x2 + y 2 − 6y.

i) Find the critical points of f .


ii) Use the second derivative test to classify the critical points of f .
iii) Find the global minimum value of f on R2 by completing squares.
iv) Is there a global maximum value of f on R2 ?

v) Show that f is convex on each of the open convex sets

S1 = {(x, y) : x < −1/√3}    and    S2 = {(x, y) : x > 1/√3}.

vi) Sketch these sets and plot the critical points.


vii) Find min_{S1} f(x, y) and min_{S2} f(x, y) (justify).

viii) Set S = S1 ∪ S2. Find m0 = min_{R²\S} (x⁴ − 2x²).

ix) Use (x⁴ − 2x²) ≥ m0 on R² \ S to deduce min_{R²\S} f(x, y).

Solution: The shape of the surface, in Figure 2.31, shows that the function
is neither convex, nor concave.
FIGURE 2.31: Graph and level curves of f

i) Since f is a differentiable function (because it is a polynomial), the local


extreme points are critical points solution of

∇f(x, y) = ⟨4x³ − 4x, 2y − 6⟩ = ⟨0, 0⟩ ⇐⇒  4x(x + 1)(x − 1) = 0  and  2(y − 3) = 0

⇐⇒  [x = 0 or x = −1 or x = 1]  and  y = 3

⇐⇒  [x = 0 and y = 3]  or  [x = −1 and y = 3]  or  [x = 1 and y = 3].
We deduce that (−1, 3), (0, 3) and (1, 3) are the critical points of f .

ii) Classification of the critical points:


The Hessian matrix of f is

Hf(x, y) = [ fxx  fxy ]   [ 12x² − 4   0 ]
           [ fyx  fyy ] = [    0       2 ]

The leading principal minors are

D1(x, y) = 12x² − 4,    D2(x, y) = | 12x² − 4   0 |
                                   |    0       2 | = 2(12x² − 4).

(x, y)      D1(x, y)    D2(x, y)    type
(−1, 3)     8           16          local minimum
(0, 3)      −4          −8          saddle point
(1, 3)      8           16          local minimum

TABLE 2.8: Classifying critical points of f(x, y) = x⁴ − 2x² + y² − 6y

The second derivative test gives the following characterization of the points
in Table 2.8.

iii) Global minimum value of f: We have

f(x, y) = (x² − 1)² + (y − 3)² − 10 ≥ −10 = f(1, 3) = f(−1, 3)    ∀(x, y) ∈ R².

Thus

min_{(x,y)∈R²} f(x, y) = −10 = f(1, 3) = f(−1, 3).

iv) Global maximum value of f: We have

f(x, 3) = (x² − 1)² − 10    and    lim_{x→±∞} f(x, 3) = +∞.

So f doesn’t attain its maximum value on R2 .

v) The principal minors of the Hessian of f are

Δ1^{11} = |fyy| = 2 ≥ 0,    Δ1^{22} = |fxx| = 12x² − 4 ≥ 0 ⇐⇒ |x| ≥ 1/√3

Δ2 = | 12x² − 4   0 |
     |    0       2 | = 8(3x² − 1) ≥ 0 ⇐⇒ |x| ≥ 1/√3.

So

Δk ≥ 0,  k = 1, 2  ⇐⇒  |x| ≥ 1/√3.

Hence, Hf is semi-definite positive on each open convex set S1 and S2. Hence

f is convex on each of the open convex sets S1 and S2.

vi) Sketch of the sets S1 and S2 in Figure 2.32.


FIGURE 2.32: The convex sets S1 , and S2

vii) Since f is convex on S1 = [x < −1/√3] and the critical point (−1, 3) is in
S1 with f(−1, 3) = −10, then

min_{S1} f(x, y) = f(−1, 3) = −10.

f is also convex on S2 = [x > 1/√3] and the critical point (1, 3) is in S2 with
f(1, 3) = −10, then

min_{S2} f(x, y) = f(1, 3) = −10.

viii) We have

ϕ(x) = x⁴ − 2x²,    ϕ′(x) = 4x³ − 4x = 4x(x − 1)(x + 1).

Using Table 2.9, we find that

min_{x∈[−1/√3, 1/√3]} ϕ(x) = ϕ(−1/√3) = ϕ(1/√3) = −5/9.

x        −1          0           1
ϕ′(x)         +      0      −
ϕ(x)     −1    ↗     0    ↘     −1

TABLE 2.9: Variations of ϕ(x) = x⁴ − 2x²

ix) We deduce that

x⁴ − 2x² ≥ −5/9    ∀x ∈ [−1/√3, 1/√3]

f(x, y) ≥ −5/9 + y² − 6y = (y − 3)² − 9 − 5/9 ≥ −9 − 5/9 = f(±1/√3, 3)    ∀(x, y) ∈ R² \ S.

Hence,

min_{R²\S} f(x, y) = f(−1/√3, 3) = f(1/√3, 3) = −9 − 5/9 = −86/9.

Remark. Note that

f(x, y) ≥ −9 − 5/9 ≥ −10    ∀(x, y) ∈ R² \ S.

We also have

f(x, y) ≥ −10 = f(−1, 3) = f(1, 3)    ∀(x, y) ∈ S.

Hence,

min_{R²} f(x, y) = f(−1, 3) = f(1, 3) = −10.
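A brute-force numerical check supports this piecewise argument. The NumPy sketch below is an illustrative assumption of ours (not part of the text); on a fine grid it finds the minimal value −10 near (±1, 3).

import numpy as np

f = lambda x, y: x**4 - 2*x**2 + y**2 - 6*y
xs, ys = np.meshgrid(np.linspace(-3, 3, 601), np.linspace(0, 6, 601))
vals = f(xs, ys)

i = np.unravel_index(vals.argmin(), vals.shape)
print(vals[i], xs[i], ys[i])    # approximately -10 at x = ±1, y = 3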

2.4 Extreme Value Theorem

The first main result of this section is

Theorem 2.4.1 Extreme value theorem


Let S be a closed bounded set of Rn . Let f ∈ C 0 (S). Then
f attains both its maximal and minimal values in S; that is,
max f and min f exist.
S S

The proof of the extreme value theorem uses the fact that the image of a
closed bounded set S of Rn by a real valued continuous function f : S −→ R
is a closed bounded set of R [18]. Thus f (S) is a closed bounded interval [a, b].
Therefore
∃ xm, xM ∈ S such that f(xm) = a, f(xM) = b.
Since f(S) = [f(xm), f(xM)], then
f(xm) ≤ f(x) ≤ f(xM)    ∀x ∈ S.
Therefore,
f(xm) = min_S f(x)    and    f(xM) = max_S f(x).

Remark 2.4.1 When f is a continuous function on a closed and bounded


set S, then the extreme value theorem guarantees the existence of an abso-
lute maximum and an absolute minimum of f on S. These absolute extreme
points can occur either on the boundary of S or in the interior of S. As a
consequence, to look for these points, we can proceed as follows:

– find the critical points of f that lie in the interior of S

– find the boundary points where f takes its absolute values on the
boundary

– compare the values of f taken at the critical and boundary points


found. The largest of the values of f at these points is the absolute
maximum and the smallest is the absolute minimum.

Example 1. Find the extreme values of f(x) = (1/3)x³ − (1/2)x² − 2x + 3 on the
intervals [−1, 1] and [−2, 2].

Solution: We have f′(x) = x² − x − 2 = (x − 2)(x + 1) and

f′(x) = 0 ⇐⇒ x = 2 or x = −1.

We deduce that 2 and −1 are the critical points of f; see Figure 2.33.

• The values max_{x∈[−1,1]} f(x) and min_{x∈[−1,1]} f(x) exist by the extreme value theorem
because f is continuous on the closed bounded interval [−1, 1]. Now, since
there are no critical points in the interior of the interval (−1, 1), these values
must be in {f(−1), f(1)}. Comparing these two values, we conclude that

max_{x∈[−1,1]} f(x) = f(−1) = 25/6    and    min_{x∈[−1,1]} f(x) = f(1) = 5/6.

• The values max_{x∈[−2,2]} f(x) and min_{x∈[−2,2]} f(x) exist by the extreme value theorem
because f is continuous on the closed bounded interval [−2, 2]. The critical
point −1 is in the interior of the interval (−2, 2), so the absolute values must be
in {f(−2), f(−1), f(2)} = {7/3, 25/6, −1/3}. Comparing these three values, we
conclude that

max_{x∈[−2,2]} f(x) = f(−1) = 25/6    and    min_{x∈[−2,2]} f(x) = f(2) = −1/3.
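The candidate-comparison procedure is easy to automate. The following Python sketch (an illustrative assumption of ours, using exact rational arithmetic) lists the candidates on each interval and picks the extreme values.

from fractions import Fraction

f = lambda x: Fraction(1, 3)*x**3 - Fraction(1, 2)*x**2 - 2*x + 3

for a, b in [(-1, 1), (-2, 2)]:
    # endpoints plus the critical points -1 and 2 that fall inside (a, b)
    candidates = [a, b] + [c for c in (-1, 2) if a < c < b]
    print((a, b), max(f(c) for c in candidates), min(f(c) for c in candidates))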

FIGURE 2.33: Absolute values on a closed interval



Example 2. Find the absolute maximum and minimum values of

f (x, y) = 4xy − x2 − y 2 − 6x

on the closed triangle S = {(x, y) : 0 ≤ x ≤ 2, 0 ≤ y ≤ 3x}.

Solution: f is continuous (because it is a polynomial) on the triangle S, which


is a bounded and closed subset of R2 . So f attains its absolute extreme points
on S at the stationary points lying at the interior of S or on points located
at the boundary of S (see Figure 2.34).

∗ Interior stationary points of f : We have

∇f = 4y − 2x − 6, 4x − 2y = 0, 0 ⇐⇒ (x, y) = (1, 2).

The point (1, 2) is the only critical point of f and f (1, 2) = −3.
FIGURE 2.34: Extreme values of f on the triangular plane region S

∗ Extreme values of f at the boundary of S:


Let L1, L2 and L3 be the three sides of the triangle, defined by:

L1 = {(x, 0), 0 ≤ x ≤ 2},    L2 = {(2, y), 0 ≤ y ≤ 6},    L3 = {(x, 3x), 0 ≤ x ≤ 2}.

– On L1, we have: f(x, 0) = −x² − 6x = g(x), g′(x) = −2x − 6.

We deduce from the monotony of g (see Table 2.10) that

x        0            2
g′(x)          −
g(x)     0     ↘     −16

TABLE 2.10: Variations of g(x) = −x² − 6x on [0, 2]

max_{L1} f = f(0, 0) = 0    and    min_{L1} f = f(2, 0) = −16.

– On L2, we have: f(2, y) = −y² + 8y − 16 = h(y), h′(y) = −2y + 8.

y         0           4           6
h′(y)           +     0     −
h(y)     −16    ↗     0     ↘    −4

TABLE 2.11: Variations of h(y) = −y² + 8y − 16 on [0, 6]

Then, from Table 2.11, we obtain

max_{L2} f = f(2, 4) = 0    and    min_{L2} f = f(2, 0) = −16.

– On L3, we have: f(x, 3x) = 2x² − 6x = l(x), l′(x) = 4x − 6.

x         0          3/2          2
l′(x)           −     0     +
l(x)      0     ↘   −9/2    ↗    −4

TABLE 2.12: Variations of l(x) = 2x² − 6x on [0, 2]

Using Table 2.12, we deduce that

max_{L3} f = f(0, 0) = 0    and    min_{L3} f = f(3/2, 9/2) = −9/2.

∗ Conclusion: We list, in Table 2.13, the values of f at the interior critical


points and at the boundary points where an absolute extreme value occurs
on the considered side of the boundary. We conclude that the absolute max-
imum value of f is f (0, 0) = f (2, 4) = 0 and the absolute minimum value is
f (2, 0) = −16.

Now, here is a version of an extreme value theorem for a continuous func-


tion on an unbounded domain.

(x, y) (1, 2) (0, 0) (2, 0) (2, 4) (3/2, 9/2)


f (x, y) −3 0 −16 0 −9/2

TABLE 2.13: Values of f at the points
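A quick brute-force check over the triangular region is a useful sanity test. The NumPy sketch below is an illustrative assumption of ours (not part of the text).

import numpy as np

f = lambda x, y: 4*x*y - x**2 - y**2 - 6*x
X, Y = np.meshgrid(np.linspace(0, 2, 401), np.linspace(0, 6, 1201))
mask = Y <= 3*X          # keep only the points of the triangle 0 <= y <= 3x

vals = f(X, Y)[mask]
print(vals.max(), vals.min())    # approximately 0 and -16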

Theorem 2.4.2
Let f (x) be a continuous function on an unbounded set S of Rn such that

lim f (x) = +∞ (resp. − ∞).


x→+∞

Then, there exists an element x∗ ∈ S such that

f (x∗ ) = min f (x) (resp. max f (x)).


x∈S x∈S

Proof. Let x0 ∈ S. There exists R0 > 0 such that

∀x ∈ S : ‖x‖ > R0 =⇒ f(x) > f(x0).

So the optimization problem inf_{x∈S} f(x) is equivalent to

min_{x∈S0} f(x)    with    S0 = S ∩ {x ∈ Rn : ‖x‖ ≤ R0}.

Indeed, we have

S0 ⊂ S =⇒ min_{x∈S0} f(x) = inf_{x∈S0} f(x) ≥ inf_{x∈S} f(x).

Moreover, we have

f(x) > f(x0)    if x ∈ S \ S0

f(x) ≥ min_{z∈S0} f(z)    if x ∈ S0

then, choosing R0 ≥ ‖x0‖ so that x0 ∈ S0,

f(x) ≥ min{ f(x0), min_{z∈S0} f(z) } = min_{z∈S0} f(z)    ∀x ∈ S.

Hence

inf_{x∈S} f(x) ≥ min_{z∈S0} f(z).

Note that the minimum min_{z∈S0} f(z) is attained by the extreme value theorem
since S0 is a bounded closed set of Rn. Therefore

∃ x∗ ∈ S0 such that min_{x∈S0} f(x) = f(x∗).

Now, since x∗ ∈ S, we also have inf_{x∈S} f(x) ≤ f(x∗), and we deduce that

f(x∗) = inf_{x∈S} f(x) = min_{x∈S} f(x).
x∈S x∈S0

Example 3. Let f (x) = 3x4 + 4x3 − 12x2 + 2.


i) Show that f has an absolute minimum on R.

ii) Find the minimal value of f on R.

Solution: The graphing in Figure 2.35 shows three local extrema.


FIGURE 2.35: Absolute minimum of f

i) f is continuous on R since f is polynomial. Moreover, we have

lim_{|x|→+∞} f(x) = +∞.

Then f attains its minimum value at some point x∗ ∈ R.

ii) Since R is an open set and x∗ ∈ R, then x∗ must be a critical point of f .


We have

f  (x) = 12x3 + 12x2 − 24x = 12x(x + 2)(x − 1) and

f  (x) = 0 ⇐⇒ x = 0, x = −2 or x = 1.

We deduce that

min_{x∈R} f(x) = min{f(0), f(−2), f(1)} = min{2, −30, −3} = −30 = f(−2).
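The coercivity argument reduces the search to the three critical points, which can be listed and compared with the Python/SymPy sketch below (an illustrative assumption of ours, not from the text).

import sympy as sp

x = sp.symbols('x', real=True)
f = 3*x**4 + 4*x**3 - 12*x**2 + 2

crit = sp.solve(sp.diff(f, x), x)                  # [-2, 0, 1]
print(min((f.subs(x, c), c) for c in crit))        # (-30, -2): minimal value at x = -2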

Example 4. Let f(x) = p(x) = x^n + a_{n−1}x^{n−1} + · · · + a_1 x + a_0 be a
polynomial with n ≥ 1.
If n is odd, then
If n is odd, then

lim p(x) and lim p(x)


x→+∞ x→−∞

have opposite signs (one is +∞ and the other is −∞), so f has no absolute
extreme points.
If n is even, then the limits above have the same sign. When they are both
equal to +∞, f has an absolute minimum but no absolute maximum. When
the limits are both equal to −∞, f has an absolute maximum but no absolute
minimum.

Solved Problems

1. – Define the function f(x, y) = (1/4)x² − (1/9)y² on the closed unit disk. Find
i) the critical points

ii) the local extreme values


iii) the absolute extreme values.

Solution:
FIGURE 2.36: Graph of f on the unit disk and level curves

i) Since f is differentiable, the critical points are solution of ∇f (x, y) = 0, 0.
That is

∇f(x, y) = ⟨x/2, −2y/9⟩ = ⟨0, 0⟩ ⇐⇒ (x, y) = (0, 0).
So (0, 0), the origin of the unit disk, is the unique critical point of f .

ii) Nature of the local extreme point. We have

fxx = 1/2,    fyy = −2/9,    fxy = 0.

Then D2(0, 0) = [fxx fyy − fxy²](0, 0) = −1/9 < 0 and (0, 0) is a saddle
point; see Figure 2.36.

iii) Global extreme points.


Since the unit disk is a bounded closed subset of R2 , f attains its global
extreme points on this set since it is continuous (because it is a polynomial
function). These extreme points are interior critical points or points on the
boundary of the disk.

∗ Extreme values of f on the boundary of the disk:


On the unit circle, f takes the values (see Table 2.14)

f(cos t, sin t) = (1/4)cos²t − (1/9)sin²t = g(t),    t ∈ [0, 2π].

We have

g′(t) = −(1/2) cos t sin t − (2/9) sin t cos t = −(13/18) sin t cos t

t        0          π/2          π          3π/2         2π
sin t         +            +            −           −
cos t         +            −            −           +
g′(t)         −            +            −           +
g(t)    1/4   ↘    −1/9    ↗    1/4    ↘   −1/9    ↗    1/4

TABLE 2.14: Variations of g(t) = (1/4)cos²t − (1/9)sin²t

∗ Conclusion:
We list, in Table 2.15, the values of f at the critical point and at the boundary
points where f attains its absolute values on that boundary.

(x, y)      (0, 0)    (1, 0)    (0, 1)    (−1, 0)    (0, −1)
f(x, y)     0         1/4       −1/9      1/4        −1/9

TABLE 2.15: Values of f(x, y) = (1/4)x² − (1/9)y² at candidate points


The absolute maximal value of f on the disk is 1/4 and is attained at the points
(1, 0) and (−1, 0).
The absolute minimal value of f on the disk is −1/9 and is attained at the
points (0, 1) and (0, −1).

2. – Find the absolute extreme points of the function

f (x, y) = (4x − x2 ) cos y

on the rectangular region 1  x  3, −π/4  y  π/4.

Solution:
FIGURE 2.37: The plane region R

f is continuous (because it is the product of a polynomial function and the


cosine function) on the rectangle R = [1, 3] × [−π/4, π/4] (see Figure 2.37),
which is a closed bounded set of R2 , then f attains its absolute extreme
points on R. These points are attained at the critical points of f located at
the interior of R or on points located on ∂R.

∗ Interior stationary points of f . We have

∇f = ⟨(4 − 2x) cos y, −(4x − x²) sin y⟩ = ⟨0, 0⟩

⇐⇒  [x = 2 or cos y = 0]  and  [x = 0 or x = 4 or sin y = 0]  ⇐⇒  (x, y) = (2, 0).

The point (2, 0) is the only critical point of f , as shown in Figure 2.38, and
f (2, 0) = 4.

∗ Extreme values of f at the boundary of R:

Let L1, L2, L3 and L4 be the four sides of the rectangle R, defined by:

L1 = {(x, −π/4), 1 ≤ x ≤ 3},    L2 = {(3, y), −π/4 ≤ y ≤ π/4},

L3 = {(x, π/4), 1 ≤ x ≤ 3},    L4 = {(1, y), −π/4 ≤ y ≤ π/4}.

FIGURE 2.38: Values of f on R and level curves


– On L1, we have: f(x, −π/4) = (√2/2)(4x − x²) = g(x), g′(x) = √2(2 − x).

x          1            2            3
g′(x)            +      0      −
g(x)     3√2/2   ↗    2√2     ↘    3√2/2

TABLE 2.16: Variations of g(x) = (√2/2)(4x − x²)

We deduce from the monotony of g, described in Table 2.16, that

max_{L1} f = f(2, −π/4) = 2√2,    min_{L1} f = f(1, −π/4) = f(3, −π/4) = 3√2/2.

– On L2, we have: f(3, y) = 3 cos y = h(y), h′(y) = −3 sin y.

From the monotony of h (see Table 2.17), we have

max_{L2} f = f(3, 0) = 3,    min_{L2} f = f(3, −π/4) = f(3, π/4) = 3√2/2.

y         −π/4          0            π/4
h′(y)             +            −
h(y)     3√2/2    ↗     3      ↘    3√2/2

TABLE 2.17: Variations of h(y) = 3 cos y

– On L3, we have: f(x, π/4) = (√2/2)(4x − x²) = l(x), l′(x) = √2(2 − x).

x          1            2            3
l′(x)            +      0      −
l(x)     3√2/2   ↗    2√2     ↘    3√2/2

TABLE 2.18: Variations of l(x) = (√2/2)(4x − x²)

As a consequence of Table 2.18, we have

max_{L3} f = f(2, π/4) = 2√2,    min_{L3} f = f(1, π/4) = f(3, π/4) = 3√2/2.

– On L4, we have: f(1, y) = 3 cos y = m(y), m′(y) = −3 sin y.

y         −π/4          0            π/4
m′(y)             +            −
m(y)     3√2/2    ↗     3      ↘    3√2/2

TABLE 2.19: Variations of m(y) = 3 cos y

From the behaviour described in Table 2.19, we deduce that

max_{L4} f = f(1, 0) = 3,    min_{L4} f = f(1, −π/4) = f(1, π/4) = 3√2/2.

∗ Conclusion: We list the particular points found above in Table 2.20.

The maximal value of f on R is 4 and it is attained at the point (2, 0), which
is an interior critical point.
The minimal value of f on R is 3√2/2 and it is attained at the points (1, −π/4),
(1, π/4), (3, −π/4) and (3, π/4).

(x, y)      (2, 0)    (2, ±π/4)    (1, ±π/4)    (3, ±π/4)    (3, 0)    (1, 0)
f(x, y)     4         2√2          3√2/2        3√2/2        3         3

TABLE 2.20: Values of f(x, y) = (4x − x²) cos y at candidate points

3. – Find the points on the surface z 2 = xy + 4 that are closer to the origin.

Solution: The distance of a point (x, y, z) to the origin is given by
d = √(x² + y² + z²). The problem is equivalent to minimizing d² = x² + y² + z² on
the set z² = xy + 4, or equivalently to look for

min_{S=R²} x² + y² + (xy + 4) = f(x, y).

Note that the function f is continuous on the unbounded set R² and satisfies

f(x, y) ≥ x² + y² − (1/2)(x² + y²) + 4 = (1/2)(x² + y²) + 4 = (1/2)‖(x, y)‖² + 4

since

|xy| ≤ (1/2)(x² + y²).

Thus

lim_{‖(x,y)‖→+∞} f(x, y) = +∞.

Hence, the minimization problem has a solution.

Note that a global minimum of the problem is also a local minimum, i.e., a
solution of

∇f = ⟨2x + y, 2y + x⟩ = ⟨0, 0⟩ ⇐⇒ (x, y) = (0, 0)    since    det[ 2  1 ; 1  2 ] ≠ 0.

The point (0, 0) is the only critical point of f and f(0, 0) = 4.

Since the global minimum exists, (0, 0) is the global minimum, and the
corresponding points on the surface z² = 4 + xy closest to (0, 0, 0) are (0, 0, ±2).
We can also verify that (0, 0) is a local minimum by applying the second
derivatives test. Indeed, we have

fxx = 2, fxy = 1, fyy = 2


   
 f fxy   2 1 
D1(x, y) = fxx = 2,    D2(x, y) = | fxx  fxy |   | 2  1 |
                                  | fxy  fyy | = | 1  2 | = 3.
Since D1 (0, 0) > 0 and D2 (0, 0) > 0, then (0, 0) is a strict local minimum.

4. – i) Find the quantities x, y that should be produced to maximize the


total profit function
f (x, y) = x + 4y
subject to

2x + 3y ≤ 19,    −3x + 2y ≤ 4,
x + y ≤ 8,    0 ≤ x ≤ 6,    y ≥ 0.

ii) Use level curves to solve the problem geometrically.

Solution:
FIGURE 2.39: Hexagonal plane region S

i) Set

S = {(x, y) : 2x + 3y ≤ 19, −3x + 2y ≤ 4, x + y ≤ 8, 0 ≤ x ≤ 6, y ≥ 0}.

The set S is the region of the xy-plane, located in the first quadrant and
bounded by the lines 2x + 3y = 19, −3x + 2y = 4, x + y = 8; see Figure 2.39.
It is a closed, bounded, convex subset of R². Since f is continuous (because it is a

polynomial), it attains its absolute extreme points on S at the stationary


points lying at the interior of S or on points located at the boundary of S.

∗ Interior stationary points of f . We have

∇f = ⟨1, 4⟩ ≠ ⟨0, 0⟩    ∀(x, y) ∈ R².

There is no critical point of f.


∗ Extreme values of f at the boundary of S:
Let L1 , · · · , L6 be the six sides of the hexagon S, defined by:

L1 = {(x, 0), 0  x  6}, L2 = {(6, y), 0  y  2}

19 − 2x
L3 = {(x, 8 − x), 5  x  6}, L4 = {(x, ), 2  x  5},
2

4 + 3x
L5 = {(x, ), 0  x  2}, L6 = {(0, y), 0  y  2}.
2

On L1, we have: f(x, 0) = x,

max_{L1} f = f(6, 0) = 6,    min_{L1} f = f(0, 0) = 0.

On L2, we have: f(6, y) = 6 + 4y,

max_{L2} f = f(6, 2) = 10,    min_{L2} f = f(6, 0) = 6.

On L3, we have: f(x, 8 − x) = x + 4(8 − x) = 32 − 3x,

max_{L3} f = f(5, 3) = 17,    min_{L3} f = f(6, 2) = 10.

On L4, we have: f(x, (19 − 2x)/3) = x + (4/3)(19 − 2x) = (1/3)(76 − 5x),

max_{L4} f = f(2, 5) = 22,    min_{L4} f = f(5, 3) = 17.

On L5, we have: f(x, (4 + 3x)/2) = x + 2(4 + 3x) = 8 + 7x,

max_{L5} f = f(2, 5) = 22,    min_{L5} f = f(0, 2) = 8.

On L6, we have: f(0, y) = 4y,

max_{L6} f = f(0, 2) = 8,    min_{L6} f = f(0, 0) = 0.

∗ Conclusion:
We list, in Table 2.21 below, the values of f at the boundary points where f
takes absolute values on each side of the set S. We conclude that the abso-
lute maximum value of f is f (2, 5) = 22 and the absolute minimum value is
f (0, 0) = 0.

(x, y)      (0, 0)    (6, 0)    (6, 2)    (5, 3)    (2, 5)    (0, 2)
f(x, y)     0         6         10        17        22        8

TABLE 2.21: Values of f(x, y) = x + 4y at candidate points

ii) To solve the problem geometrically, we sketch the level curves x + 4y = k.


The profit k is attained if the line has common points with the region S. The
profit 0 is attained at the point (0, 0) at the level curve x + 4y = 0. When the
profit k increases, the lines x + 4y = k are parallel and move out farther to
reach the point (2, 5) where the highest profit is attained; see Figure 2.40.
FIGURE 2.40: Level curve of highest profit



Remark 2.4.2 Note that the points that appear in the above table are the
vertices of the hexagon S. The extreme points are attained at two of these
vertices. This is true in the more general problem

min_S p·x = p1x1 + · · · + pnxn    or    max_S p·x

with

S = {x = (x1, · · · , xn) ∈ R+^n : Ax ≤ b}

where

A = (aij)_{1≤i≤m, 1≤j≤n},    b = ᵗ(b1, · · · , bm)    and    p = ᵗ(p1, · · · , pn).

We look for the extreme points on the polyhedron

U = {x = (x1, · · · , xn) ∈ R+^n : Ax = b}.

One can establish that an extreme point, when it exists, is attained at a corner of the
polyhedron U. However, when m and n take large values, the number of corners
becomes very large, and linear programming develops various methods
to approach these optimal values of the objective function [19], [5], [29].
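As an illustration of this linear-programming viewpoint, the following Python/SciPy sketch (an assumption of ours; SciPy is not used in the text) solves Problem 4 above, maximizing x + 4y over the hexagon S.

from scipy.optimize import linprog

c = [-1, -4]                         # maximize x + 4y  <=>  minimize -x - 4y
A_ub = [[2, 3], [-3, 2], [1, 1]]     # 2x + 3y <= 19, -3x + 2y <= 4, x + y <= 8
b_ub = [19, 4, 8]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 6), (0, None)])
print(res.x, -res.fun)               # approximately (2, 5) and 22

The solver returns the vertex (2, 5) with objective value 22, in agreement with Table 2.21 and with the level-curve argument of part ii).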
Chapter 3
Constrained Optimization-Equality
Constraints

In this chapter, we are interested in optimizing functions f : Ω ⊂ Rn −→ R over


subsets described by equations

g(x) = (g1(x), g2(x), . . . , gm(x)) = c ∈ Rm    with m < n,    x ∈ Rn.


Denote the set of the constraints by

S = [g(x) = c] = [g1 (x) = c1 ] ∩ [g2 (x) = c2 ] ∩ . . . ∩ [gm (x) = cm ].

In dimension n = 3, when m = 1, the equation g1(x, y, z) = c1 ∈ R may describe a


surface, while when m = 2, the equations

g1 (x, y, z) = c1 and g2 (x, y, z) = c2


may describe a curve as the intersection of two surfaces. Thus, the set [g = c] can be
seen as a set of dimensions less than 3. For m = 3, the set [g = c] may be reduced to
some points or to the empty set. For this reason, we will not consider these situations
and assume always m < n.
FIGURE 3.1: S = [g1 = 9], n = 3, m = 1

Example. ∗ S = [g1 (x, y, z) = x2 + y 2 + z 2 = 9] is a surface (the sphere centered


at the origin with radius 3; see Figure 3.1). Here (n = 3, m = 1).


∗∗ S = [g1(x, y, z) = x² + y² + z² = 9] ∩ [g2(x, y, z) = z = 2] is the intersection of
the previous sphere with the plane z = 2; see Figure 3.2.
FIGURE 3.2: S = [g1 = 9] ∩ [g2 = 2], n = 3, m = 2

∗ ∗ ∗ S = [g1(x, y, z) = x² + y² + z² = 9] ∩ [g2(x, y, z) = z = 2] ∩ [g3(x, y, z) = y = 1]
= {(2, 1, 2), (−2, 1, 2)} is the intersection of the sphere with the two planes z = 2
and y = 1. It is reduced to two points; see Figure 3.3.
FIGURE 3.3: S = [g1 = 9] ∩ [g2 = 2] ∩ [g3 = 1], n = 3, m = 3

As in the case of unconstrained optimization, we will need to reduce our set of


searches of the extreme points by looking for some necessary conditions. A local study
for such points x∗ cannot be done by considering balls centered at these points because
the points x∗ + th, with |h| small, do not remain necessarily inside the set [g = c].
This situation prevents us from comparing the values f (x∗ + th) with f (x∗ ). In order
to remain close to x∗ through points of the set [g = c], an idea is to consider all
the curves passing through x∗ included in the constraint set. We will consider curves
t −→ x(t) such that the set {x(t) : t ∈ [−a, a], x(0) = x∗ }, for some a > 0, are
included in [g = c]. So, if x∗ is a local maximum of f , then we have

f(x(t)) ≤ f(x∗)    ∀t ∈ [−a, a].

Thus, 0 is a local maximum point for the function t −→ f(x(t)). Hence

(d/dt) f(x(t))|_{t=0} = f′(x(t))·x′(t)|_{t=0} = 0    =⇒    f′(x∗)·x′(0) = 0.

x′(0) is a tangent vector to the curve x(t) at the point x(0) = x∗. This equality
must not depend on a particular curve x(t). So, we must have

f′(x∗)·x′(0) = 0    for any curve x(t) such that g(x(t)) = c.

In this chapter, first, we will characterize, in Section 3.1, the set of tangent vectors to
such curves, then establish in Section 3.2, the equations satisfied by a local extreme
point x∗ . In Section 3.3, we identify the candidates’ points for optimality, and in
Section 3.4, we explore the global optimality of a constrained local candidate point.
Finally, we establish, in Section 3.5, the dependence of the optimal function with
respect to certain of its parameters.

3.1 Tangent Plane

Let
x∗ ∈ S = [g(x) = c].

Definition 3.1.1 The set defined by

T = { x′(0) : t −→ x(t) ∈ S, x ∈ C¹(−a, a), a > 0, x(0) = x∗ }

of all tangent vectors at x∗ to differentiable curves included in S, is called


tangent plane at x∗ to the surface [g = c].

We have the following characterization of the tangent plane T at a regular


point x∗ of S.

Definition 3.1.2 A point x∗ ∈ S = [g = c] is said to be a regular point


of the constraints if the gradient vectors ∇g1 (x∗ ), . . ., ∇gm (x∗ ) are linearly
independent (LI). That is, the m × n matrix
           [ ∂g1/∂x1   ∂g1/∂x2   . . .   ∂g1/∂xn ]
           [ ∂g2/∂x1   ∂g2/∂x2   . . .   ∂g2/∂xn ]
g′(x∗) =   [    ...        ...    ...       ...  ]     has rank m.
           [ ∂gm/∂x1   ∂gm/∂x2   . . .   ∂gm/∂xn ]

v1, . . . , vm ∈ Rn are LI ⇐⇒ [ α1v1 + . . . + αmvm = 0 =⇒ α1 = . . . = αm = 0 ].

The rank of a matrix = rank of its transpose [10].

Theorem 3.1.1 At a regular point x∗ ∈ S = [g = c], where g is C 1 in a


neighborhood of x∗ , the tangent plane T is equal to the subspace

M = {y ∈ Rn : g  (x∗ )y = 0}.

The proof of this theorem is an application of the implicit function


theorem.

Proof. We have
T ⊂ M : Indeed, let y ∈ T, then

∃ x ∈ C 1 (−a, a) such that g(x(t)) = c ∀t ∈ (−a, a) for some a > 0,


x(0) = x∗ , x (0) = y.
Differentiating the relation g(x(t)) = c, we obtain
g  (x(t))x (t) = 0 ∀t ∈ (−a, a) =⇒ g  (x(0))x (0) = 0 ⇐⇒ g  (x∗ )y = 0.
Hence y ∈ M.

M ⊂ T : ∗ Indeed, let y ∈ M \ {0} and consider the vectorial equation

F (t, u) = g(x∗ + ty +t g  (x∗ )u) − c = 0,


where for fixed t, the vector u ∈ Rm is the unknown.

Note that F is well defined on an open subset of R × Rm. Indeed, if g is C¹
on Bδ(x∗) ⊂ Rn, then, for all (t, u) ∈ (−δ0, δ0) × Bδ0(0) with

δ0 = min( δ/(2‖y‖), δ/(2‖g′(x∗)‖) ),

‖(x∗ + ty + ᵗg′(x∗)u) − x∗‖ ≤ |t| ‖y‖ + ‖u‖ ‖g′(x∗)‖

    < (δ/(2‖y‖)) ‖y‖ + (δ/(2‖g′(x∗)‖)) ‖g′(x∗)‖ = δ/2 + δ/2 = δ

=⇒ [x∗ + ty + ᵗg′(x∗)u] ∈ Bδ(x∗).

We have

F(t, u) = g(X(t, u)) − c,    X(t, u) = x∗ + ty + ᵗg′(x∗)u

Xj(t, u) = x∗j + t yj + Σ_{l=1}^{m} (∂gl/∂xj)(x∗) ul,    ∂Xj/∂ui = (∂gi/∂xj)(x∗)

(∂Fk/∂ui)(t, u) = Σ_{j=1}^{n} (∂gk/∂Xj)(X(t, u)) (∂Xj/∂ui) = Σ_{j=1}^{n} (∂gk/∂xj)(X(t, u)) (∂gi/∂xj)(x∗)

( ∂Fk/∂ui )_{k,i=1,··· ,m} = g′(X(t, u)) ᵗg′(x∗).

By hypotheses, we have

– F is a C 1 function in the open set A = (−δ0 , δ0 ) × Bδ0 (0)

– F (0, 0) = g(x∗ ) − c = 0

– (0, 0) ∈ (−δ0 , δ0 ) × B(0, δ0 ), so (0, 0) is an interior point

– det(∇uF(0, 0)) = ∂(F1, · · · , Fm)/∂(u1, · · · , um) = det( g′(x∗) ᵗg′(x∗) ) ≠ 0, since
rank g′(x∗) = m.

Then, by the implicit function theorem, there exist open balls

Bε(0) ⊂ (−δ0, δ0),    Bη(0) ⊂ Bδ0(0),    ε, η > 0,    with Bε(0) × Bη(0) ⊆ A,

and such that

det(∇uF(t, u)) ≠ 0 in Bε(0) × Bη(0)

∀t ∈ Bε(0), ∃! u ∈ Bη(0) : F(t, u) = 0

u : (−ε, ε) −→ Bη(0); t −→ u(t) is a C¹ function.

The curve
x(t) = X(t, u(t)) = x∗ + ty +t g  (x∗ )u(t)
is thus, by construction, a curve on S. By differentiating both sides of

F (t, u(t)) = g(x(t)) − c = g(X(t, u(t))) − c = 0

with respect to t, we get

0 = (d/dt) g(x(t)) = Σ_{j=1}^{n} (∂g/∂Xj)(∂Xj/∂t),

Xj(t, u) = x∗j + t yj + Σ_{l=1}^{m} (∂gl/∂xj)(x∗) ul,    ∂Xj/∂t = yj + Σ_{l=1}^{m} (∂gl/∂xj)(x∗)(∂ul/∂t)

0 = (d/dt) g(x(t))|_{t=0} = Σ_{j=1}^{n} (∂g/∂xj)(X(t, u)) [ yj + Σ_{l=1}^{m} (∂gl/∂xj)(x∗)(∂ul/∂t) ]|_{t=0}

  = g′(x∗)y + [ g′(x∗) ᵗg′(x∗) ] u′(0).

Since y ∈ M \ {0}, we have g′(x∗)·y = 0. Moreover, since g′(x∗) ᵗg′(x∗) is
nonsingular, we conclude that u′(0) = 0. Hence

x′(0) = y + ᵗg′(x∗) u′(0) = y + 0 = y

and y is a tangent vector to the curve x(t) included in S, so y ∈ T.

∗∗ If y = 0, the constant curve x(t) = x∗ is included in S and x (0) = 0 = y,


so 0 ∈ T.

It is easy to show that M is a subspace of Rn . Indeed, 0 ∈ M and for y1 , y2 ∈


M, κ ∈ R, we have

g  (x∗ )(y1 + κy2 ) = g  (x∗ )y1 + κg  (x∗ )y2 = 0.



Theorem 3.1.2 Implicit function theorem [15] [20]


Let A in Rn × Rm be an open set. Let F = (F1 , . . . , Fm ) be a C 1 (A)
function. Consider the vector equation

F (x, y) = 0.

If

∃(x0, y0) ∈ Å = A,    F(x0, y0) = 0    and    det Fy(x0, y0) ≠ 0,

then ∃ ε, η > 0 such that

det Fy(x, y) ≠ 0    ∀(x, y) ∈ Bε(x0) × Bη(y0) ⊂ A

∀x ∈ Bε(x0), ∃! y ∈ Bη(y0) : F(x, y) = 0

ϕ : Bε(x0) −→ Bη(y0); x −→ ϕ(x) = y is C¹(Bε(x0))

ϕ′(x) = −[ Fy(x, y) ]⁻¹ Fx(x, y)

where

Fy(x, y) = ∇yF(x, y) = [ ∂F1/∂y1   . . .   ∂F1/∂ym ]
                       [    ...     ...      ...   ]     gradient of F with respect to y
                       [ ∂Fm/∂y1   . . .   ∂Fm/∂ym ]

det(Fy(x, y)) = ∂(F1, . . . , Fm)/∂(y1, . . . , ym)    Jacobian of F with respect to y.

Remark 3.1.1 Denote by T(x∗ ) the translation of T by the vector x∗ :

T(x∗ ) = x∗ + M = {x∗ + h ∈ Rn : g  (x∗ ).h = 0}

= {x ∈ Rn : g  (x∗ ).(x − x∗ ) = 0}.


T(x∗ ) is the tangent plane to the surface [g(x) = c] passing through x∗ .
FIGURE 3.4: Horizontal tangent line at local extreme points

Remark 3.1.2 Tangent plane at a point of a surface z = f (x)


* Suppose x∗ is an interior point of a surface z = f (x) where f is a C 1
function. Then, the tangent plane at (x∗ , f (x∗ )) is given by

z = f (x∗ ) + f  (x∗ ).(x − x∗ )

Indeed, setting g(x, z) = z − f(x) = 0, then

g′(x, z) = ⟨−f′(x), 1⟩ ≠ 0    and    rank(g′(x∗, f(x∗))) = 1.

The tangent plane at (x∗, f(x∗)) is characterized by

g′(x∗, f(x∗))·⟨x − x∗, z − f(x∗)⟩ = ⟨−f′(x∗), 1⟩·⟨x − x∗, z − f(x∗)⟩ = 0.

** If x∗ is an interior stationary point, then ∇f (x∗ ) = 0 , and the tangent


plane T (x∗ , f (x∗ )) is the horizontal plane z = f (x∗ ).

*** The graph of the tangent plane is the graph of the linear approximation
L(x) = f (x∗ ) + f  (x∗ ).(x − x∗ ). Thus, we have

f (x) ≈x∗ L(x) for x close to x∗ .

Example 1. The tangent plane to a curve y = f (x) at a point (x0 , f (x0 ))


corresponds to the tangent line to the curve at that point described by the
equation
y = f (x0 ) + f  (x0 )(x − x0 ).

The following examples, in Table 3.1, show that the tangent line is horizontal
at local extreme points and separates the graph into two parts at an inflection
point; see Figure 3.4 and Figure 3.5.

f(x)             point x0                f′(x0)                       tangent line

(x − 1)² − 1     1 : global minimum      2(x − 1)|_{x=1} = 0          y = −1

1 − x²           0 : global maximum      −2x|_{x=0} = 0               y = 1

(x − 1)³ + 1     1 : inflection point    3(x − 1)²|_{x=1} = 0         y = 1

ln x             e                       (1/x)|_{x=e} = 1/e           y − 1 = (1/e)(x − e)


FIGURE 3.5: Tangent lines at an inflection and at an ordinary points

Example 2. The tangent plane to a surface z = f (x, y) at a point


(x0 , y0 , f (x0 , y0 )) corresponds to the usual tangent plane to the surface at
that point described by the equation

z = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ).

A normal vector to this plane is

n = fx (x0 , y0 ), fy (x0 , y0 ), −1 = fx (x0 , y0 )i + fy (x0 , y0 )j − k.

A normal line to the surface z = f (x, y) at (x0 , y0 , f (x0 , y0 )) is the line


parallel to the vector n.

The examples, given in Table 3.2 and graphed in Figures 3.6 and 3.7, show
that the tangent plane is horizontal at local extreme points and separates the
graph into two parts at a saddle point.
FIGURE 3.6: Horizontal tangent planes at local extreme points

The corresponding tangent planes at (x0 , y0 ) are respectively

a) z = −1, b) z = 4, c) z = 0, d) z = 2x + 2y − 3

FIGURE 3.7: Tangent planes at a saddle and ordinary points

Example 3. Find the tangent plane at the point (0, 1, 0) to the set g =
(g1 , g2 ) = 1, 1 with

g1 (x, y, z) = x + y + z, and g2 (x, y, z) = x2 + y 2 + z 2 .

Solution: The surface g(x, y, z) = 1, 1 is the intersection of the two surfaces
g1 (x, y, z) = 1 and g2 (x, y, z) = 1. So, it is a curve in the space R3 . We have
g'(x, y, z) = \begin{bmatrix} \partial g_1/\partial x & \partial g_1/\partial y & \partial g_1/\partial z \\ \partial g_2/\partial x & \partial g_2/\partial y & \partial g_2/\partial z \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 2x & 2y & 2z \end{bmatrix}.

        z = f(x, y)                       f′(x0, y0)

  a)    (x − 1)² + (y + 1)² − 1           ⟨2(x − 1), 2(y + 1)⟩|(x,y)=(1,−1) = ⟨0, 0⟩

  b)    4 − x² − y²                       ⟨−2x, −2y⟩|(x,y)=(0,0) = ⟨0, 0⟩

  c)    y² − x²                           ⟨−2x, 2y⟩|(x,y)=(0,0) = ⟨0, 0⟩

  d)    (x − 1)² + (y + 1)² − 1           ⟨2(x − 1), 2(y + 1)⟩|(x,y)=(2,0) = ⟨2, 2⟩

TABLE 3.2: Examples in two dimensions

 
g'(0, 1, 0) = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 0 \end{bmatrix}   has rank 2.

The tangent plane is the set of points (x, y, z) such that

g'(0, 1, 0).\langle x - 0, y - 1, z - 0 \rangle = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 0 \end{bmatrix} \begin{bmatrix} x \\ y - 1 \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\iff \begin{cases} x + y - 1 + z = 0 \\ 2(y - 1) = 0. \end{cases}
A parametrization of the tangent plane to the two surfaces at (0, 1, 0) is the
line (see Figure 3.8)

x=t y=1 z = −t, t ∈ R.



FIGURE 3.8: Tangent plane at (0, 1, 0) to [g = 1, 1]
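Numerically, the tangent line found above is the null space of the Jacobian g′(0, 1, 0) translated to the point. A small sketch (not from the text), assuming SciPy is available:

```python
# The tangent direction at (0, 1, 0) spans M = {h : g'(x*) h = 0}.
import numpy as np
from scipy.linalg import null_space

def jacobian(x, y, z):
    # g1 = x + y + z, g2 = x^2 + y^2 + z^2
    return np.array([[1.0, 1.0, 1.0],
                     [2*x, 2*y, 2*z]])

J = jacobian(0.0, 1.0, 0.0)          # rank 2, so the point is regular
direction = null_space(J)[:, 0]      # one-dimensional null space
print(np.round(direction / direction[0], 6))   # proportional to (1, 0, -1)
```

This agrees with the parametrization x = t, y = 1, z = −t obtained above.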

Remark 3.1.3 Note that the representation of the tangent plane obtained
in the theorem has used the fact that the point was regular. When this
hypothesis is omitted, the representation is not necessarily true.
Indeed, if S is the set defined by

g(x, y) = 0 with g(x, y) = x2 ,

then S is the y axis. No point of S is regular since we have

g  (x, y) = 2x, 0 and g  (0, y) = 0, 0 on the y-axis.

We deduce that at each point (0, y0 ) ∈ S, we have

M = {h = (h1 , h2 ) : g  (0, y0 ).h = 0} = R2 .

However, the line

x(t) = 0 y(t) = y0 + t

passes through the point (0, y0 ) at t = 0 with direction x (0), y  (0) = 0, 1
and remains included in S. Hence, the tangent plane is equal to S.

Solved Problems

1. – Find an equation of the tangent plane to the ellipsoid x2 +4y 2 +z 2 = 18


at the point (1, 2, 1).

Solution: Set g(x, y, z) = x2 + 4y 2 + z 2 = 18. Then, g  (x, y, z) = 2xi + 8yj +


2zk,

g′(1, 2, 1) = 2i + 16j + 2k ≠ 0 =⇒ rank(g′(1, 2, 1)) = 1.


The tangent plane (see Figure 3.9) is the set of points (x, y, z) such that
g'(1, 2, 1).\langle x - 1, y - 2, z - 1 \rangle = \begin{bmatrix} 2 & 16 & 2 \end{bmatrix} \begin{bmatrix} x - 1 \\ y - 2 \\ z - 1 \end{bmatrix} = 0

⇐⇒ 2(x − 1) + 16(y − 2) + 2(z − 1) = 0 ⇐⇒ x + 8y + z − 18 = 0.


FIGURE 3.9: Tangent plane at (1, 2, 1) to the ellipsoid
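As a quick cross-check of this solved problem, one can compute the gradient and the tangent plane symbolically. A minimal sketch (not part of the text), assuming SymPy is available:

```python
# Gradient of g(x, y, z) = x^2 + 4y^2 + z^2 at (1, 2, 1) and the tangent plane it defines.
import sympy as sp

x, y, z = sp.symbols('x y z')
g = x**2 + 4*y**2 + z**2
grad_at = [sp.diff(g, v).subs({x: 1, y: 2, z: 1}) for v in (x, y, z)]   # (2, 16, 2)

plane = sum(d*(v - v0) for d, v, v0 in zip(grad_at, (x, y, z), (1, 2, 1)))
print(sp.expand(plane/2))   # x + 8*y + z - 18
```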



2. – Find all points on the surface

2x2 + 3y 2 + 4z 2 = 9

at which the tangent plane is parallel to the plane x − 2y + 3z = 5.

Solution: Set g(x, y, z) = 2x2 + 3y 2 + 4z 2 = 9. We have

 
g  (x, y, z) = 4x 6y 8z =0 on [g = 9] =⇒ rank(g  (x, y, z)) = 1

since

g  (x, y, z) = 0 ⇐⇒ (x, y, z) = 0 and g(0) = 9.

The tangent plane to the surface g(x, y, z) = 9 at a point (x0 , y0 , z0 ) is the set
of points (x, y, z) such that
⎡ ⎤
  x − x0
g  (x0 , y0 , z0 ).x − x0 , y − y0 , z − z0  = 4x0 6y0 8z0 . ⎣ y − y0 ⎦ = 0
z − z0

⇐⇒ 4x0 (x − x0 ) + 6y0 (y − y0 ) + 8z0 (z − z0 ) = 0.


This tangent plane will be parallel to the plane x − 2y + 3z = 5 if the two
planes have their respective normals g  (x0 , y0 , z0 ) and 1, −2, 3 parallel. So,
we have to solve the following system


find t \in \mathbb{R} : \quad g'(x_0, y_0, z_0) = t\,\langle 1, -2, 3\rangle, \quad g(x_0, y_0, z_0) = 9
\iff \begin{cases} 4x_0 = t \\ 6y_0 = -2t \\ 8z_0 = 3t \\ 2x_0^2 + 3y_0^2 + 4z_0^2 = 9 \end{cases}
\;\Longrightarrow\; 2\Big(\frac{t}{4}\Big)^2 + 3\Big(-\frac{t}{3}\Big)^2 + 4\Big(\frac{3t}{8}\Big)^2 = 9 \;\Longrightarrow\; t = \pm\frac{12}{7}\sqrt{3}.

The needed points on the surface are

\Big(\frac{3\sqrt{3}}{7}, -\frac{4\sqrt{3}}{7}, \frac{9\sqrt{3}}{14}\Big), \qquad \Big(-\frac{3\sqrt{3}}{7}, \frac{4\sqrt{3}}{7}, -\frac{9\sqrt{3}}{14}\Big).
The equations of the tangent planes to the surface (see Figure 3.10) at these
points are
\Big(x - \frac{3\sqrt{3}}{7}\Big) - 2\Big(y + \frac{4\sqrt{3}}{7}\Big) + 3\Big(z - \frac{9\sqrt{3}}{14}\Big) = 0,

\Big(x + \frac{3\sqrt{3}}{7}\Big) - 2\Big(y - \frac{4\sqrt{3}}{7}\Big) + 3\Big(z + \frac{9\sqrt{3}}{14}\Big) = 0.

FIGURE 3.10: Parallel tangent planes to an ellipsoid

3. – Show that the surfaces


z = √(x² + y²)   and   z = (1/10)(x² + y²) + 5/2

intersect at (3, 4, 5) and have a common tangent plane at that point.

Solution: Set

g₁(x, y, z) = z − √(x² + y²),   g₂(x, y, z) = z − (1/10)(x² + y²) − 5/2.
Since g1 (3, 4, 5) = 0 and g2 (3, 4, 5) = 0, then the point (3, 4, 5) is a common
point to the surfaces g1 (x, y, z) = 0 and g2 (x, y, z) = 0. We have

g₁′(x, y, z) = −(x/√(x² + y²)) i − (y/√(x² + y²)) j + k

g₂′(x, y, z) = −(x/5) i − (y/5) j + k

g₁′(3, 4, 5) = −(3/5) i − (4/5) j + k ≠ 0,   rank(g₁′(3, 4, 5)) = 1

g₂′(3, 4, 5) = −(3/5) i − (4/5) j + k ≠ 0,   rank(g₂′(3, 4, 5)) = 1.

Note that the normal vectors g1 (3, 4, 5) and g2 (3, 4, 5) of the tangent planes
to the surfaces g1 (x, y, z) = 0 and g2 (x, y, z) = 0 respectively are the same.
Hence, the two surfaces have a common tangent plane at this point with the
equation
3 4
− (x − 3) − (y − 4) + (z − 5) = 0.
5 5

4. – Find two unit vectors that are normal to the surface

sin(xz) − 4 cos(yz) = 4

at the point P (π, π, 1).

Solution: A vector that is normal to the surface g(x, y, z) = sin(xz) −


4 cos(yz) = 4 is normal to the tangent plane to this surface at this point
and we have

g  (x, y, z) = z cos(xz)i + 4z sin(yz)j + (x cos(xz) + 4y sin(yz))k


g′(π, π, 1) = −i − πk ≠ 0 =⇒ rank(g′(π, π, 1)) = 1.

A normal vector to the tangent plane is g  (π, π, 1) = −i − πk and two unit


vectors that are normal to the surface sin(xz) − 4 cos(yz) = 4 at the point
P (π, π, 1) are

± g′(π, π, 1)/‖g′(π, π, 1)‖ = ± ⟨−1/√(1 + π²), 0, −π/√(1 + π²)⟩.

3.2 Necessary Condition for Local Extreme


Points-Equality Constraints

Before setting the results rigorously, we will try to give an intuitive approach of
the comparison of the values of f close to a local maximum value f (x∗ ) under the
constraints g(x) = c. We will follow the unconstrained case in parallel.

• Unconstrained case: We compare values of f taken in a neighborhood of x∗ in


all directions

f(x∗ + th) ≤ f(x∗)   for h ∈ Rⁿ, |t| < δ

or equivalently, for each i = 1, . . . , n

f(x∗ + te_i) ≤ f(x∗),   |t| < δ;

then for |t| < δ, we have

[f(x∗ + te_i) − f(x∗)]/t ≤ 0 if t > 0   and   [f(x∗ + te_i) − f(x∗)]/t ≥ 0 if t < 0.

Since f is differentiable, we obtain as t → 0⁺ and t → 0⁻ respectively

f_{x_i}(x∗) ≤ 0 and f_{x_i}(x∗) ≥ 0.

So

f_{x_i}(x∗) = 0 for each i = 1, . . . , n.

• Constrained case: We cannot choose points around x∗ in any direction because


we need to remain on the set [g = c]. A way to do that, is to consider curves t −→ x(t)
satisfying x(t) ∈ [g = c] for t ∈ (−a, a) and x(0) = x∗ . Then, we have

f(x(t)) ≤ f(x∗)   ∀t ∈ (−a, a)

and 0 is a local maximum point for the function t → f(x(t)). Hence, for regular functions, we have

(d/dt) f(x(t))|_{t=0} = f′(x(t)).x′(t)|_{t=0} = 0   =⇒   f′(x∗).x′(0) = 0.

x′(0) is a tangent vector to the curve x(t) at the point x(0) = x∗. This equality must not
depend on a particular curve. Thus, it must be satisfied for any y = x′(0) ∈ M, which
is summarized below:

Lemma 3.2.1 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neighbor-


hood of x∗ ∈ [g = c]. If x∗ is a regular point and a local extreme point of f
subject to these constraints, then we have

∀y ∈ Rn : g  (x∗ )y = 0 =⇒ f  (x∗ )y = 0.

The lemma says that f′(x∗) is orthogonal to the plane tangent at x∗ to
the surface g(x) = c. As a consequence, we will see that f′(x∗) is a linear
combination of g₁′(x∗), . . . , g_m′(x∗).

Theorem 3.2.1 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neigh-


borhood of x∗ ∈ [g = c]. If x∗ is a regular point and a local extreme point of
f subject to these constraints, then there exist unique numbers λ₁∗, . . . , λ_m∗
such that

\frac{\partial f}{\partial x_i}(x^*) - \sum_{j=1}^{m} \lambda_j^* \frac{\partial g_j}{\partial x_i}(x^*) = 0, \qquad i = 1, \ldots, n.

Proof. The proof uses a simple argument of linear algebra. Indeed,

A = g'(x^*) = \begin{bmatrix} \partial g_1/\partial x_1 & \partial g_1/\partial x_2 & \cdots & \partial g_1/\partial x_n \\ \partial g_2/\partial x_1 & \partial g_2/\partial x_2 & \cdots & \partial g_2/\partial x_n \\ \vdots & \vdots & \ddots & \vdots \\ \partial g_m/\partial x_1 & \partial g_m/\partial x_2 & \cdots & \partial g_m/\partial x_n \end{bmatrix}, \qquad \mathrm{rank}(A) = m

b = f'(x^*) = \Big( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \Big), \qquad b \in \mathbb{R}^n.
From the previous lemma, we have

∀y ∈ Rn : Ay = 0 =⇒ b.y = 0.

In other words, we have

\mathrm{Ker}\,A = \mathrm{Ker}\begin{bmatrix} A \\ b \end{bmatrix}

where Ker N denotes the kernel [10] of the linear transformation induced by
the matrix N. Since we have [10]

\dim \mathbb{R}^n = \dim(\mathrm{Ker}\,A) + \mathrm{rank}(A) = \dim\Big(\mathrm{Ker}\begin{bmatrix} A \\ b \end{bmatrix}\Big) + \mathrm{rank}\Big(\begin{bmatrix} A \\ b \end{bmatrix}\Big),

then

\mathrm{rank}(A) = \mathrm{rank}\Big(\begin{bmatrix} A \\ b \end{bmatrix}\Big),

which means that the vector b is linearly dependent on the row vectors of A,
so there exists a unique vector λ∗ = (λ₁∗, . . . , λ_m∗) ∈ Rᵐ such that

{}^t b = {}^t A \lambda^* \iff \frac{\partial f}{\partial x_i}(x^*) = \sum_{j=1}^{m} \lambda_j^* \frac{\partial g_j}{\partial x_i}(x^*), \qquad i = 1, \ldots, n.

Finally, to look for extreme points of f subject to the constraint g(x) = c,
we are led to solve the system

\frac{\partial f}{\partial x_i}(x) - \sum_{j=1}^{m} \lambda_j \frac{\partial g_j}{\partial x_i}(x) = 0, \qquad i = 1, \ldots, n

g_j(x) - c_j = 0, \qquad j = 1, \ldots, m.

These equations suggest to introduce the function

L(x, λ) = f (x) − λ1 (g1 (x) − c1 ) − · · · − λm (gm (x) − cm )


called Lagrange function or Lagrangian and λ1 , . . . , λm the Lagrange
multipliers.
The necessary conditions can then be expressed in the form

\begin{cases} \dfrac{\partial L}{\partial x_i}(x, \lambda) = \dfrac{\partial f}{\partial x_i}(x) - \displaystyle\sum_{j=1}^{m} \lambda_j \dfrac{\partial g_j}{\partial x_i}(x) = 0, & i = 1, \ldots, n \\[8pt] \dfrac{\partial L}{\partial \lambda_j}(x, \lambda) = -(g_j(x) - c_j) = 0, & j = 1, \ldots, m \end{cases}

or simply ∇L(x, λ) = 0.
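For a problem with a single equality constraint, the system ∇L = 0 can be generated and solved symbolically. The following is a minimal sketch (not part of the text), assuming SymPy is available; the function name is chosen for illustration only:

```python
# Stationary points of L(x, y, lam) = f(x, y) - lam*(g(x, y) - c) for one constraint.
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)

def lagrange_candidates(f, g, c):
    L = f - lam*(g - c)
    eqs = [sp.diff(L, v) for v in (x, y, lam)]   # the system grad L = 0
    return sp.solve(eqs, [x, y, lam], dict=True)

# Sample use: f = x*y on the circle x^2 + y^2 = 1 (treated in Example 1 below).
print(lagrange_candidates(x*y, x**2 + y**2, 1))
```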

We may reformulate the previous theorem as follows:

Theorem 3.2.2 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neigh-


borhood of x∗ ∈ [g = c].

x∗ is a regular point and a local extreme point of f

=⇒ ∃! λ∗ ∈ Rm such that ∇L(x∗ , λ∗ ) = 0.

Remark 3.2.1 When m = 1, the necessary condition is reduced to

∃!λ∗ ∈ R : ∇f = λ∗ ∇g =⇒ ∇f // ∇g.

The vectors g′(x∗) and f′(x∗) are respectively normal to the level curves
g(x) = c and f(x) = f(x∗). When the extreme point is attained, the
two vectors g′(x∗) and f′(x∗) are parallel. Thus the two level curves have
a common tangent plane at x∗. When using a graphing utility, the constrained
extreme points may therefore be located where the level curves are tangent.

Example 1. At what points on the circle x2 + y 2 = 1 does f (x, y) = xy have


its maximum and minimum?

Solution: Set
g(x, y) = x2 + y 2 S = {(x, y) : g(x, y) = x2 + y 2 = 1}

By the extreme-value theorem, f attains its maximum and minimum val-


ues on S since f is continuous on the closed and bounded unit circle S; see
Figure 3.11.


FIGURE 3.11: Graph of f(x, y) = xy on the unit disk [x² + y² ≤ 1]



Next, the functions f and g are C 1 around each point (x, y) ∈ R2 and in
particular each point of S is relatively interior to S and is a regular point
since we have

g′(x, y) = 2x i + 2y j ≠ 0 on S =⇒ rank(g′(x, y)) = 1.


Thus, introducing the Lagrangian

L(x, y, λ) = x y − λ (x2 + y 2 − 1),

we can apply Lagrange multipliers method to look for the interior extreme
points as solutions of the system

\begin{cases} L_x = f_x(x, y) - \lambda g_x(x, y) = 0 \\ L_y = f_y(x, y) - \lambda g_y(x, y) = 0 \\ L_\lambda = -(g(x, y) - 1) = 0 \end{cases} \iff \begin{cases} y - 2x\lambda = 0 \\ x - 2y\lambda = 0 \\ x^2 + y^2 - 1 = 0 \end{cases}

\iff \begin{cases} y - 2x\lambda = 0 \\ x(1 - 4\lambda^2) = 0 \\ x^2 + y^2 - 1 = 0 \end{cases} \iff \begin{cases} y - 2x\lambda = 0 \\ x = 0 \ \text{or} \ \lambda = \pm\tfrac{1}{2} \\ x^2 + y^2 - 1 = 0. \end{cases}

∗ x = 0 leads to y = 0 and (0, 0) is not a point on the constrained curve.

∗∗ λ = 1/2 leads to y = x, and from the constraint equation we deduce that x = ±1/√2.

∗∗∗ λ = −1/2 leads to y = −x, and from the constraint equation we deduce that x = ±1/√2.

So, the stationary points for the Lagrangian are the four points

(1/√2, 1/√2), (−1/√2, −1/√2), (1/√2, −1/√2), (−1/√2, 1/√2)

at which f takes its maximum and minimum values respectively

f(1/√2, 1/√2) = f(−1/√2, −1/√2) = 1/2,   f(1/√2, −1/√2) = f(−1/√2, 1/√2) = −1/2.
The problem can be solved graphically, as illustrated in Figure 3.12.

FIGURE 3.12: The constraint [x² + y² = 1] and the level curves xy = −1/2, xy = 1/2 are tangent
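The tangency seen in Figure 3.12 can be verified numerically: at each candidate point ∇f = λ∇g. A small sketch (not part of the text), assuming NumPy is available:

```python
# Check that grad f is parallel to grad g at the four candidates of Example 1.
import numpy as np

grad_f = lambda x, y: np.array([y, x])        # f(x, y) = x*y
grad_g = lambda x, y: np.array([2*x, 2*y])    # g(x, y) = x^2 + y^2

s = 1/np.sqrt(2)
for (x, y), lam in [((s, s), 0.5), ((-s, -s), 0.5), ((s, -s), -0.5), ((-s, s), -0.5)]:
    print(np.allclose(grad_f(x, y), lam*grad_g(x, y)))   # True for all four points
```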

Remark 3.2.2 Note that Lagrange's method does not transform a constrained
optimization problem into the problem of finding an unconstrained extreme
point of the Lagrangian.

Example 2. Consider the problem

max xy subject to x + y = 2, x ≥ 0, y ≥ 0.

Using the Lagrange multiplier method, prove that (x, y) = (1, 1) solves
the problem with λ = 1. Prove also that (1, 1, 1) does not maximize the
Lagrangian L.

Solution: Since x and y must be positive and satisfy the sum x + y = 2, we


may look for the extreme points in the set [0, 2] × [0, 2]. Let us denote

f (x, y) = xy g(x, y) = x + y Ω = [0, 2] × [0, 2].

First, the optimization problem has a solution by the extreme-value theorem.


Indeed, f is continuous on the line segment (see Figure 3.13)

S = {(x, y) : g(x, y) = 2, x ≥ 0, y ≥ 0}
which is a closed and bounded subset of R2 .
Next, the functions f and g are C 1 around each point (x, y) ∈ (0, 2) × (0, 2)
which is a regular point since we have

g′(x, y) = i + j ≠ 0 =⇒ rank(g′(x, y)) = 1.



FIGURE 3.13: Set of the constraints

So, by applying the method of Lagrange Multipliers, we introduce the


Lagrangian

L(x, y, λ) = f (x, y) − λ(g(x, y) − 2) = xy − λ(x + y − 2)


and look for the interior extreme points as solutions of the system
\begin{cases} L_x = f_x - \lambda g_x = 0 \\ L_y = f_y - \lambda g_y = 0 \\ L_\lambda = -(g - 2) = 0 \end{cases} \iff \begin{cases} y - \lambda = 0 \\ x - \lambda = 0 \\ x + y - 2 = 0 \end{cases} \iff x = y = \lambda = 1.
So, the point (1, 1, 1) is a stationary point for the Lagrangian L. But it is not
an extreme point for L. Indeed, the second derivative test gives

H_L(x, y, \lambda) = \begin{bmatrix} 0 & 1 & -1 \\ 1 & 0 & -1 \\ -1 & -1 & 0 \end{bmatrix} \quad \text{the Hessian matrix of } L \text{ in } (x, y, \lambda).

Its leading principal minors at (1, 1, 1) are

D_1 = 0, \qquad D_2 = \begin{vmatrix} 0 & 1 \\ 1 & 0 \end{vmatrix} = -1 < 0, \qquad D_3 = 2 \neq 0.

Since D₂ < 0, the Hessian is indefinite. Hence, (1, 1, 1) is a saddle point of L.

It remains to show that the point (1, 1) is the maximum point for the problem;
see Figure 3.14 for a graphical solution using level curves. Indeed, since it is
the only interior point to the segment, it suffices to compare the value of f at
(1, 1) with its value at the end points of the segment. We have
f (1, 1) = 1 f (2, 0) = 0 f (0, 2) = 0.

FIGURE 3.14: The constraint x + y = 2 and the level curve f = xy = 1 are tangent

So f attains its maximum value at (1, 1) under the constraint g(x, y) = 2.
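The indefiniteness of the Hessian of L at (1, 1, 1) can also be seen from its eigenvalues. A small illustrative sketch (not part of the text), assuming NumPy is available:

```python
# Hessian of L(x, y, lam) = x*y - lam*(x + y - 2) in the variables (x, y, lam) at (1, 1, 1).
import numpy as np

H = np.array([[0., 1., -1.],
              [1., 0., -1.],
              [-1., -1., 0.]])
print(np.linalg.eigvalsh(H))   # eigenvalues of mixed signs (-1, -1, 2) => saddle point of L
```

This illustrates Remark 3.2.2: (1, 1) solves the constrained problem even though (1, 1, 1) is only a saddle point of the Lagrangian.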

Remark 3.2.3 A function subject to a constraint needs not to have a local


extremum at every stationary point of the associated Lagrangian. The La-
grangian multiplier method transforms a constrained optimization problem
into one of finding the appropriate stationary points of the Lagrangian.

Example 3. Consider the problem

min xy subject to x + y = 2, x ≥ 0, y ≥ 0.

Using the Lagrange multiplier method, prove that (x, y) = (1, 1) doesn’t solve
the problem with λ = 1.

Solution: Arguing as in Example 2, the problem has a solution by the
extreme-value theorem. But, by applying the method of Lagrange multipliers,
we found the only candidate point (1, 1), and it realizes the maximum of f.
So the minimum point of f is not necessarily a stationary point of L. In fact,
f attains its minimum value 0 under the constraint g(x, y) = 2 at (2, 0) and
(0, 2).

Solved Problems

1. –

i) Show that the Lagrange equations for

max (min) f (x, y) = x+y +3 subject to g(x, y) = x−y = 0

have no solution.
ii) Show that any point of the constraints’ set is a regular point.

iii) What can you conclude about the minimum and maximum values of
f subject to g = 0? Show this directly.

Solution: i) Set

L(x, y, λ) = f (x, y) − λ(g(x, y) − 0) = x + y + 3 − λ(x − y).


By applying Lagrange’s multipliers method, we look for the interior extreme
points as a solution of the system
\begin{cases} L_x = f_x - \lambda g_x = 0 \\ L_y = f_y - \lambda g_y = 0 \\ L_\lambda = -(g - 0) = 0 \end{cases} \iff \begin{cases} 1 - \lambda = 0 \\ 1 + \lambda = 0 \\ x - y = 0 \end{cases}
which leads to a contradiction with λ = 1 and λ = −1. So the system has no
solution.

ii) Any point of the constraint is a regular point since we have

g′(x, y) = i − j ≠ 0 =⇒ rank(g′(x, y)) = 1.


iii) We can conclude that f has neither a maximum nor a minimum on the set of
the constraints since, if they existed, they would be solutions of the above

FIGURE 3.15: No solution for the constrained optimization problem

system. Indeed, all conditions of the theorem on the necessary conditions for
a constrained candidate point are satisfied.

The problem is equivalent to optimizing

F(x) = f(x, x) = 2x + 3 for x ∈ R.

We can see that

lim_{x→−∞} F(x) = −∞ and lim_{x→+∞} F(x) = +∞.

Therefore, f cannot reach a finite lower or upper bound on the set of the
constraints.

The graph of f is a plane; see Figure 3.15. The level curves x + y + 3 = k are parallel
lines that intersect the constraint line x − y = 0, so f takes arbitrarily large values
along it (see Figure 3.16).


FIGURE 3.16: ∇f = ⟨1, 1⟩ ∦ ⟨1, −1⟩ = ∇g

2. – Consider the problem of minimizing

f (x, y) = y + 1 subject to g(x, y) = x4 − (y − 2)5 = 0.

i) Show, without using calculus, that the minimum occurs at (0, 2). Is
it a regular point?

ii) Show that the Lagrange condition ∇f = λ∇g is not satisfied for any
value of λ.

iii) Does this contradicts the theorem on the necessary conditions for a
constrained candidate point?

Solution: i) Note that we have

g(x, y) = x⁴ − (y − 2)⁵ = 0 ⇐⇒ (y − 2)⁵ = x⁴ ≥ 0 =⇒ y ≥ 2.

So, on the set of the constraint (see Figure 3.17), we have

f(x, y) = y + 1 ≥ 3 = f(x, 2)   ∀(x, y) ∈ [g = 0].

Since g(0, 2) = 0, then (0, 2) ∈ [g = 0]. Thus

f(x, y) ≥ f(0, 2)   ∀(x, y) ∈ [g = 0]

and (0, 2) is a global minimum point.



FIGURE 3.17: Minimal value of f on the constraint set g = 0



ii) Let

L(x, y, λ) = f (x, y) − λ(g(x, y) − 0) = y + 1 − λ(x4 − (y − 2)5 ).

An interior extreme point, if it exists, is a solution of the system

\begin{cases} L_x = f_x - \lambda g_x = 0 \\ L_y = f_y - \lambda g_y = 0 \\ L_\lambda = -(g - 0) = 0 \end{cases} \iff \begin{cases} 0 - 4\lambda x^3 = 0 \\ 1 + 5\lambda(y - 2)^4 = 0 \\ x^4 - (y - 2)^5 = 0 \end{cases}
Note that λ = 0 is not possible by the second equation. So, we deduce that
x = 0, from the first equation, and then y = 2 from the third equation. But,
this leads to a contradiction by the second equation. So the system has no
solution. No level curve is tangent to the constraint set in Figure 3.18.

FIGURE 3.18: No solution with Lagrange method

iii) This does not contradict the theorem on the necessary conditions for a
constrained candidate point since the theorem is true if all assumptions are
satisfied which is not the case for the regularity of the point (0, 2). Indeed, we
have

g′(x, y) = 4x³ i − 5(y − 2)⁴ j,   g′(0, 2) = ⟨0, 0⟩ =⇒ rank(g′(0, 2)) = 0 ≠ 1.

3. – At what points on the curve g(x, y) = x4 +y 4 = 1 does f (x, y) = x2 +y 2


have its maximum and minimum values?
Give a geometric interpretation of the problem.

Solution: Note that, the optimization problem has a solution by the extreme-
value theorem since f is continuous on the closed and bounded subset [g =
1] = g −1 {1} of R2 .
Next, the functions f and g are C 1 around each point (x, y) ∈ R2 . In particular
each point of [g = 1] is relatively interior to [g = 1]. Indeed, if (x0 , y0 ) ∈ [g =
1], then the point (x20 , y02 ) is on the unit circle. Thus, (x20 , y02 ) is an interior
point and we conclude, by using the preimage of an open ball by the continuous
function (x, y) −→ (x2 , y 2 ) is an open set.
Moreover, each point of [g = 1] is a regular point since we have

g′(x, y) = 4x³ i + 4y³ j ≠ 0 on [g = 1] =⇒ rank(g′(x, y)) = 1.


So, by setting

L(x, y, λ) = x2 + y 2 − λ (x4 + y 4 − 1)
we are led to solve the system
\begin{cases} L_x = 2x - 4\lambda x^3 = 0 \\ L_y = 2y - 4\lambda y^3 = 0 \\ L_\lambda = -(x^4 + y^4 - 1) = 0 \end{cases} \iff \begin{cases} 2x(1 - 2\lambda x^2) = 0 \\ 2y(1 - 2\lambda y^2) = 0 \\ x^4 + y^4 = 1 \end{cases}

\iff \begin{cases} x = 0 \ \text{or} \ 2\lambda x^2 = 1 \\ y = 0 \ \text{or} \ 2\lambda y^2 = 1 \\ x^4 + y^4 = 1 \end{cases} \iff \begin{cases} x = 0 \\ y = \pm 1 \\ \lambda = 1/2 \end{cases} \ \text{or} \ \begin{cases} y = 0 \\ x = \pm 1 \\ \lambda = 1/2 \end{cases} \ \text{or} \ \begin{cases} x^2 = y^2 \\ x^4 = 1/2 \\ \lambda = 1/(2x^2). \end{cases}

So, the stationary points for the Lagrangian are

(0, ±1),   (±1, 0),   (1/2^{1/4}, ±1/2^{1/4}),   (−1/2^{1/4}, ±1/2^{1/4})

at which f takes its maximum and minimum values respectively

max_{g=1} f = f(1/2^{1/4}, ±1/2^{1/4}) = f(−1/2^{1/4}, ±1/2^{1/4}) = √2,

min_{g=1} f = f(±1, 0) = f(0, ±1) = 1.



FIGURE 3.19: The constraint [x⁴ + y⁴ = 1] and the level curves f = 1, √2 are tangent

Since f(x, y) = ‖(x, y) − (0, 0)‖², the problem looks for the points (x, y) on
the curve x⁴ + y⁴ = 1 that are closest to and farthest from the origin; see Figure
3.19.

4. – Figures A and B (see Figure 3.20) show the level curves of f and the
constraint curve g(x, y) = 0 graphed thickly. Estimate the maximum and
minimum values of f subject to the constraint. Locate the point(s), if any,
where an extreme value occurs.

Solution: Figure A. Two level curves of f are tangent to the constraint


curve g = 0. Comparing the values of f taken at these level curves, we deduce
that

local max f ≈ 15 ≈ f (−1.5, 1.5) local min f ≈ 3.64 ≈ f (−1.5, 1.5).


g=0 g=0

Figure B. One level curve of f is tangent to the constraint curve g = 0.


Comparing the values of f taken at different level curves, we remark that f
keeps taking large values on the constraint set. Therefore, we deduce that

local max f doesn’t exist local min f ≈ 19.2 ≈ f (3, −2).


g=0 g=0

FIGURE 3.20: Level curves of f and the constraint curve g = 0

5. – Find the points on the sphere x2 + y 2 + z 2 = 1 that are closest to and


farthest from the point (1, 2, 2).

Solution: The distance of a point (x, y, z) to the point (1, 2, 2) is given by

D(x, y, z) = √((x − 1)² + (y − 2)² + (z − 2)²).

To look for the shortest and the farthest distance when (x, y, z) remains into
the unit sphere is equivalent to optimize D2 (x, y, z) under the constraint x2 +
y 2 + z 2 = 1. So, let us denote

f (x, y, z) = (x − 1)2 + (y − 2)2 + (z − 2)2


g(x, y, z) = x2 + y 2 + z 2 S = [g = 1].

First, the optimization problem has a solution by the extreme value theorem
since f is continuous on the unit sphere S, which is a closed and bounded
subset of R3 .
Next, f and g are C∞ around each point (x, y, z) ∈ R³. In particular, each
point of S is a relatively interior point and is a regular point since we have

g′(x, y, z) = 2x i + 2y j + 2z k ≠ 0 on S =⇒ rank(g′(x, y, z)) = 1.

So, consider the Lagrangian

L(x, y, z, λ) = (x − 1)2 + (y − 2)2 + (z − 2)2 − λ(x2 + y 2 + z 2 − 1)



and apply Lagrange multipliers method to look for the interior extreme points
by solving the system
\begin{cases} L_x = f_x(x, y, z) - \lambda g_x(x, y, z) = 0 \\ L_y = f_y(x, y, z) - \lambda g_y(x, y, z) = 0 \\ L_z = f_z(x, y, z) - \lambda g_z(x, y, z) = 0 \\ L_\lambda = -(g(x, y, z) - 1) = 0 \end{cases} \iff \begin{cases} 2(x - 1) - 2x\lambda = 0 \\ 2(y - 2) - 2y\lambda = 0 \\ 2(z - 2) - 2z\lambda = 0 \\ x^2 + y^2 + z^2 - 1 = 0. \end{cases}
If x = 0, the first equation gives −2 = 0, which is impossible. Similarly, we cannot
have y = 0 or z = 0. So, we deduce from the system that

λ = 1 − 1/x = 1 − 2/y = 1 − 2/z

from which we deduce

y = z = 2x, λ = 1 − 1/x, x² + y² + z² = 1 ⇐⇒ y = z = 2x, λ = 1 − 1/x, x² + 4x² + 4x² − 1 = 0 ⇐⇒ x = ±1/3.

FIGURE 3.21: The constraint [g = 1] and the level curves f = 4, 16 are tangent

So, the stationary points for the Lagrangian are the two points

(1/3, 2/3, 2/3, −2),   (−1/3, −2/3, −2/3, 4)

and f takes its maximum and minimum values respectively

f(−1/3, −2/3, −2/3) = 16   and   f(1/3, 2/3, 2/3) = 4.
The level surfaces of f passing through these points are spheres tangent to the constraint,
as shown in Figure 3.21.
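A quick numerical confirmation of the two candidates ±(1, 2, 2)/3 and of the values f = 4 and f = 16 is given below (a sketch, not part of the text, assuming NumPy is available):

```python
# Candidates on the unit sphere and squared distances to (1, 2, 2).
import numpy as np

p = np.array([1., 2., 2.])
closest, farthest = p/3, -p/3                    # stationary points found above
f = lambda q: np.sum((q - p)**2)
print(f(closest), f(farthest))                   # 4.0 and 16.0
print(np.linalg.norm(closest), np.linalg.norm(farthest))   # both 1.0, so both are feasible
```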

3.3 Classification of Local Extreme Points-Equality


Constraints

To classify a local extreme point x∗ in the case of an unconstrained optimization


problem, we compared values f (x∗ + h) with f (x∗ ) using Taylor’s formula and the
fact that ∇f (x∗ ) = 0. In this constrained case, we also need to make this comparison,
but, we have to take into account the presence of the constraints. The Lagrangian
function links the values of f to those of g. Therefore, we will apply Taylor’s formula
to compare values L(x∗ + h) with L(x∗ ) using the fact that ∇L(x∗ , λ∗ ) = 0. More
precisely, we establish a second derivative test under specific assumptions.

Consider the optimization problem with equality constraints,

local max(min)f (x) subject to g(x) = c


where
g(x) = g1 (x), . . . , gm (x), c = c1 , . . . , cm  (m < n).
The associated Lagrangian is

L(x, λ) = f (x) − λ1 (g1 (x) − c1 ) − λ2 (g2 (x) − c2 ) − . . . − λm (gm (x) − cm ).

Theorem 3.3.1 Sufficient conditions for a strict local constrained extreme


point

Let f and g = (g1 , . . . , gm ) be C 2 functions in a neighborhood of x∗ in Rn


such that:

g(x∗ ) = c rank(g  (x∗ )) = m,

∇L(x∗ , λ∗ ) = 0 for a unique vector λ∗ = λ∗1 , . . . , λ∗m .

Then

(i) (−1)m Br (x∗ ) > 0 ∀r = m + 1, . . . , n


=⇒ x∗ is a strict local minimum point

(ii) (−1)r Br (x∗ ) > 0 ∀r = m + 1, . . . , n


=⇒ x∗ is a strict local maximum point.

For r = m + 1, . . . , n, B_r(x∗) is the bordered Hessian determinant defined by

B_r(x^*) = \begin{vmatrix} 0 & \cdots & 0 & \frac{\partial g_1}{\partial x_1}(x^*) & \cdots & \frac{\partial g_1}{\partial x_r}(x^*) \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & \frac{\partial g_m}{\partial x_1}(x^*) & \cdots & \frac{\partial g_m}{\partial x_r}(x^*) \\ \frac{\partial g_1}{\partial x_1}(x^*) & \cdots & \frac{\partial g_m}{\partial x_1}(x^*) & L_{x_1 x_1}(x^*, \lambda^*) & \cdots & L_{x_1 x_r}(x^*, \lambda^*) \\ \vdots & & \vdots & \vdots & \ddots & \vdots \\ \frac{\partial g_1}{\partial x_r}(x^*) & \cdots & \frac{\partial g_m}{\partial x_r}(x^*) & L_{x_r x_1}(x^*, \lambda^*) & \cdots & L_{x_r x_r}(x^*, \lambda^*) \end{vmatrix}

The variables are renumbered in order to make the first m columns in the
matrix g  (x∗ ) linearly independent.

Remark 3.3.1 If we introduce the notations:

Q(h) = Q(h_1, \ldots, h_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} L_{x_i x_j}(x^*, \lambda^*)\, h_i h_j,

the (m + n) × (m + n) bordered matrix \begin{bmatrix} 0_{m \times m} & g'(x^*) \\ {}^t g'(x^*) & [L_{x_i x_j}(x^*, \lambda^*)]_{n \times n} \end{bmatrix},

M = {h ∈ Rⁿ : g′(x∗).h = 0},

the theorem says that

Q(h) > 0 ∀h ∈ M, h ≠ 0 =⇒ x∗ is a strict local minimum

Q(h) < 0 ∀h ∈ M, h ≠ 0 =⇒ x∗ is a strict local maximum.

It suffices then to study the positive (negative) definiteness of the quadratic
form on the tangent plane M to the constraint g = c at the point x∗ (see
the reminder at the end of this section).

Before proving the theorem, we will see its application through some examples.

Example 1. Consider the problem


local max f(x, y) = xy subject to g(x, y) = x + y = 2, x ≥ 0, y ≥ 0.
Lagrange multiplier method shows that (1, 1) is a regular candidate point.
Prove that it is a local maximum to the constrained optimization problem.

Solution: Considering the Lagrangian


L(x, y, λ) = f (x, y) − λ(g(x, y) − 2) = xy − λ(x + y − 2),
we can study the nature of the point (1, 1) using the second derivatives test.
Here, we have n = 2 and m = 1. The first column vector of g  (1, 1) = 1, 1
is linearly independent. So, we keep the matrix g  (1, 1) without renumbering
the variables. Then, we have to consider the sign of the bordered Hessian
determinant (r = m + 1 = 2 = n)
 
(-1)^2 B_2(1, 1) = \begin{vmatrix} 0 & g_x(1,1) & g_y(1,1) \\ g_x(1,1) & L_{xx}(1,1,1) & L_{xy}(1,1,1) \\ g_y(1,1) & L_{xy}(1,1,1) & L_{yy}(1,1,1) \end{vmatrix} = \begin{vmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{vmatrix} = 2 > 0.

We conclude that the point (1, 1) is a local maximum to the problem.

Example 2. Solve the problem


local max f (x, y, z) = xy + yz + xz subject to g(x, y, z) = x + y + z = 3.

Solution: Note that f and g are C 1 in R3 and


g′(x, y, z) = i + j + k ≠ 0 =⇒ rank(g′(x, y, z)) = 1.
Thus, any point, interior to the constraint set [g = 3] (see Figure 3.22), is a
regular point.
Consider the Lagrangian
L(x, y, z, λ) = f (x, y, z) − λ(g(x, y, z) − 3)
= xy + yz + xz − λ(x + y + z − 3)
and let us look for its stationary points solutions of the system


⎪ Lx = y + z − λ = 0






⎨ Ly = x + z − λ = 0
∇L(x, y, z, λ) = 0, 0, 0, 0 ⇐⇒



⎪ Lz = y + x − λ = 0





Lλ = −(x + y + z − 3) = 0.

FIGURE 3.22: The constraint set [g = 3]

From the first three equations, we deduce that λ/2 = x = y = z, which inserted
into the last equation gives

x = y = z = 1,   λ = 2.

Now, let us study the nature of the point (1, 1, 1). For this we use the second
derivative test since f and g are C 2 around this point. The first column vector
of g  (1, 1, 1) is linearly independent. So, we keep the matrix g  (1, 1, 1) without
renumbering the variables. As n = 3 and m = 1, we have to consider the signs
of the following bordered Hessian determinants:

 
(-1)^2 B_2(1, 1, 1) = \begin{vmatrix} 0 & g_x & g_y \\ g_x & L_{xx} & L_{xy} \\ g_y & L_{xy} & L_{yy} \end{vmatrix} = \begin{vmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{vmatrix} = 2 > 0,

(-1)^3 B_3(1, 1, 1) = - \begin{vmatrix} 0 & g_x & g_y & g_z \\ g_x & L_{xx} & L_{xy} & L_{xz} \\ g_y & L_{yx} & L_{yy} & L_{yz} \\ g_z & L_{zx} & L_{zy} & L_{zz} \end{vmatrix} = - \begin{vmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{vmatrix} = 3 > 0,

where the partial derivatives of g are evaluated at (1, 1, 1) and those of L at (1, 1, 1, 2).

We conclude that the point (1, 1, 1) is a local maximum to the constrained


maximization problem.
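The two bordered determinants above can be reproduced numerically. The following is a small sketch (not part of the text), assuming NumPy is available; the helper B is defined only for this illustration:

```python
# Bordered Hessian determinants B2 and B3 at (1, 1, 1) for
# f = xy + yz + xz subject to x + y + z = 3 (m = 1, n = 3, lambda = 2).
import numpy as np

g_grad = np.array([1., 1., 1.])                  # g'(1, 1, 1)
HL = np.array([[0., 1., 1.],                     # Hessian of L(., 2) in (x, y, z)
               [1., 0., 1.],
               [1., 1., 0.]])

def B(r):
    # (1 + r) x (1 + r) bordered determinant built from the first r variables
    top = np.concatenate(([0.], g_grad[:r]))
    rows = [np.concatenate(([g_grad[i]], HL[i, :r])) for i in range(r)]
    return np.linalg.det(np.vstack([top] + rows))

print(round(B(2)), round(B(3)))   # 2 and -3, so (-1)^2 B2 = 2 > 0 and (-1)^3 B3 = 3 > 0
```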

Proof. We will prove assertion i). Assertion ii) can be established similarly.
We follow for this the proof in [25] with more details in the steps involved.


Step 1: Let Ω be a neighborhood of x∗. For h ∈ Rⁿ such that x∗ + h ∈ Ω,
we have from Taylor's formula, for some τ ∈ (0, 1),

L(x^* + h, \lambda^*) = L(x^*, \lambda^*) + \sum_{i=1}^{n} L_{x_i}(x^*, \lambda^*) h_i + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} L_{x_i x_j}(x^* + \tau h, \lambda^*) h_i h_j.


Since x∗ ∈ Ω and (x∗ , λ∗ ) is a local stationary point of L then, in particular,

Lxi (x∗ , λ∗ ) = 0 i = 1, · · · , n.

Moreover, we have

g1 (x∗ ) − c1 = g2 (x∗ ) − c2 = . . . = gm (x∗ ) − cm = 0

L(x∗ , λ∗ ) = f (x∗ ) − λ∗1 (g1 (x∗ ) − c1 ) − . . . − λ∗m (gm (x∗ ) − cm ) = f (x∗ )

L(x∗ + h, λ∗ ) = f (x∗ + h) − λ∗1 (g1 (x∗ + h) − c1 ) − . . . − λ∗m (gm (x∗ + h) − cm )


from which we deduce

f(x^* + h) - f(x^*) = \sum_{k=1}^{m} \lambda_k^* \,[g_k(x^* + h) - c_k] + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} L_{x_i x_j}(x^* + \tau h, \lambda^*) h_i h_j.

Using Taylor's formula for each g_k, k = 1, . . . , m, we obtain

g_k(x^* + h) - c_k = g_k(x^* + h) - g_k(x^*) = \sum_{j=1}^{n} \frac{\partial g_k}{\partial x_j}(x^* + \tau_k h)\, h_j, \qquad \tau_k \in (0, 1).

Step 2: Now consider the (m + n) × (m + n) bordered Hessian matrix

B(x^0, x^1, \ldots, x^m) = \begin{bmatrix} 0 & G(x^1, \ldots, x^m) \\ {}^t G(x^1, \ldots, x^m) & H_{L(\cdot, \lambda^*)}(x^0) \end{bmatrix}

where

G(x^1, \ldots, x^m) = \Big( \frac{\partial g_i}{\partial x_j}(x^i) \Big)_{m \times n} = \begin{bmatrix} \frac{\partial g_1}{\partial x_1}(x^1) & \cdots & \frac{\partial g_1}{\partial x_n}(x^1) \\ \vdots & \ddots & \vdots \\ \frac{\partial g_m}{\partial x_1}(x^m) & \cdots & \frac{\partial g_m}{\partial x_n}(x^m) \end{bmatrix}

x1 , . . . , xm are arbitrary vectors in some open ball around x∗

HL(.,λ∗ ) (x0 ) : is the Hessian matrix of L with respect to x evaluated at x0 .

For r = m+1, . . . , n, let detBr (x0 , x1 , . . . , xm ) be the (m+r)×(m+r) leading


principal minor of the matrix B(x0 , x1 , . . . , xm ).

Suppose that (−1)m Br (x∗ ) > 0 for all r = m + 1, . . . , n, then by continuity of


the second-order partial derivatives of f and g, and since

detBr (x∗ , x∗ , . . . , x∗ ) = Br (x∗ )

there exists ρ > 0 such that, ∀r = m + 1, . . . , n,

(−1)m detBr (x0 , x1 , . . . , xm ) > 0 ∀x0 , x1 , . . . , xm ∈ Bρ (x∗ ).

As a consequence, for x⁰, x¹, . . . , xᵐ ∈ B_ρ(x∗), the quadratic form

Q(t) = Q(t_1, \ldots, t_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} L_{x_i x_j}(x^0, \lambda^*)\, t_i t_j,

with the associated symmetric matrix [L_{x_i x_j}(x^0)]_{n \times n}, is positive definite subject to the constraints

G(x^1, \ldots, x^m).t = 0 \iff \sum_{j=1}^{n} \frac{\partial g_k}{\partial x_j}(x^k)\, t_j = 0, \qquad k = 1, \ldots, m.

Step 3 : Because τ, τk ∈ (0, 1), we have, for x∗ + h ∈ Bρ (x∗ ),

x0 = x∗ + τ h, x1 = x∗ + τ1 h, . . . , xm = x∗ + τm h ∈ Bρ (x∗ ).

Then

\sum_{i=1}^{n} \sum_{j=1}^{n} L_{x_i x_j}(x^* + \tau h, \lambda^*)\, t_i t_j > 0 \qquad \forall t \neq 0 \ \text{such that} \ \sum_{j=1}^{n} \frac{\partial g_k}{\partial x_j}(x^* + \tau_k h)\, t_j = 0, \quad k = 1, \ldots, m.

In particular, for t = h such that

\sum_{j=1}^{n} \frac{\partial g_k}{\partial x_j}(x^* + \tau_k h)\, h_j = 0, \qquad k = 1, \ldots, m, \qquad (1)

we have

f(x^* + h) - f(x^*) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} L_{x_i x_j}(x^* + \tau h, \lambda^*)\, h_i h_j > 0. \qquad (2)

This shows that the stationary point x∗ is a strict local minimum point for f
subject to the constraint g(x) = c in particular directions.

Step 4 : Suppose that x∗ is not a strict relative minimum point. Then, there
exists a sequence of points yl satisfying

y_l → x∗,   g(y_l) = c,   f(y_l) ≤ f(x∗).

Write each y_l in the form

y_l = x∗ + δ_l s_l,   s_l ∈ Rⁿ,   ‖s_l‖ = 1,   δ_l > 0 ∀l.

Note that we have

δ_l = ‖δ_l s_l‖ = ‖y_l − x∗‖ → 0.

Hence, there exists l₀ > 1 such that for all l ≥ l₀, y_l ∈ B_ρ(x∗). Choose in
steps 1 and 3, h = δ_l s_l = y_l − x∗. Then

g(x∗ + h) − g(x∗) = g(y_l) − g(x∗) = c − c = 0

g_k(x^* + h) - g_k(x^*) = \sum_{j=1}^{n} \frac{\partial g_k}{\partial x_j}(x^* + \tau_k h)\, h_j = 0, \qquad \tau_k \in (0, 1), \quad k = 1, \ldots, m,

and we should have from (1) and (2)

0 \geq f(y_l) - f(x^*) = f(x^* + h) - f(x^*) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} L_{x_i x_j}(x^* + \tau h)\, h_i h_j > 0

which is a contradiction.

Theorem 3.3.2 Necessary conditions for local extreme points


Let f and g = (g1 , . . . , gm ) be C 2 functions in a neighborhood of x∗ in Rn
such that:

g(x∗ ) = c rank(g  (x∗ )) = m,


∇L(x∗ , λ∗ ) = 0 for a unique vector λ∗ = λ∗1 , . . . , λ∗m .

Then,

(i) x∗ is a local minimum point =⇒ H_L = (L_{x_i x_j}(x∗, λ∗))_{n×n}
is positive semidefinite on M:  ᵗy H_L y ≥ 0 ∀y ∈ M

(ii) x∗ is a local maximum point =⇒ H_L = (L_{x_i x_j}(x∗, λ∗))_{n×n}
is negative semidefinite on M:  ᵗy H_L y ≤ 0 ∀y ∈ M

where M = {h ∈ Rⁿ : g′(x∗).h = 0} is the tangent plane to the surface
g(x) = c at the point x∗.

Proof. We prove i); then ii) can be established similarly.

Let x(t) be a twice differentiable curve on the constraint surface g(x) = c with
x(0) = x∗. Suppose that x∗ is a local minimum point for f subject to the
constraint g(x) = c. Then there exists r > 0 such that

f(x∗) ≤ f(x(t)) ∀t ∈ (−r, r).

Then, writing f̂(t) = f(x(t)),

f̂(0) = f(x∗) ≤ f(x(t)) = f̂(t) ∀t ∈ (−r, r).

So f̂ is a one-variable function that has an interior minimum at t = 0. Consequently, it satisfies f̂′(0) = 0 and f̂″(0) ≥ 0, or equivalently

∇f(x∗).x′(0) = 0   and   (d²/dt²) f(x(t))|_{t=0} ≥ 0.

We have

\frac{d^2}{dt^2} f(x(t)) = {}^t x'(t)\, H_f(x(t))\, x'(t) + \nabla f(x(t)).x''(t),

\frac{d^2}{dt^2} f(x(t))\Big|_{t=0} = {}^t x'(0)\, H_f(x^*)\, x'(0) + \nabla f(x^*).x''(0).

Moreover, differentiating the relation g(x(t)) = c twice, we obtain

{}^t x'(t)\, H_g(x(t))\, x'(t) + \nabla g(x(t)).x''(t) = 0 \implies {}^t x'(0)\, H_g(x^*)\, x'(0) + \nabla g(x^*).x''(0) = 0.

Hence

0 \leq \frac{d^2}{dt^2} f(x(t))\Big|_{t=0} = [{}^t x'(0)\, H_f(x^*)\, x'(0) + \nabla f(x^*).x''(0)] - {}^t \lambda^* [{}^t x'(0)\, H_g(x^*)\, x'(0) + \nabla g(x^*).x''(0)]

= {}^t x'(0)\, [H_f(x^*) - {}^t \lambda^* H_g(x^*)]\, x'(0) + [\nabla f(x^*) - {}^t \lambda^* \nabla g(x^*)].x''(0)

= {}^t x'(0)\, [H_L(x^*)]\, x'(0) \qquad \text{since } \nabla f(x^*) - {}^t \lambda^* \nabla g(x^*) = 0,

and the result follows since x′(0) is an arbitrary element of M.

Quadratic Forms with Linear Constraints

Consider the symmetric quadratic form in n variables

Q(h) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} h_i h_j \qquad (a_{ij} = a_{ji})

subject to m linear homogeneous constraints

b_{11} h_1 + \ldots + b_{1n} h_n = 0, \quad \ldots, \quad b_{m1} h_1 + \ldots + b_{mn} h_n = 0.

Set

A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} \qquad B = \begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & \ddots & \vdots \\ b_{m1} & \cdots & b_{mn} \end{bmatrix} \qquad h = \begin{bmatrix} h_1 \\ \vdots \\ h_n \end{bmatrix}

Definition.
Q(h) = ᵗh A h is positive (resp. negative) definite subject to the linear
constraints Bh = 0 if Q(h) > 0 (resp. < 0) for all h ≠ 0 that satisfy
Bh = 0.

We have the following necessary and sufficient condition for a quadratic form
Q to be positive (resp. negative) definite subject to linear constraints.

Theorem: Assume the first m columns in the matrix B = (bij ) are linearly
independent. Then

Q is positive definite subject to the constraints Bh = 0

⇐⇒ (−1)m Br > 0 r = m + 1, . . . , n

Q is negative definite subject to the constraints Bh = 0

⇐⇒ (−1)r Br > 0 r = m + 1, . . . , n

where B_r are the symmetric determinants

B_r = \begin{vmatrix} 0 & \cdots & 0 & b_{11} & \cdots & b_{1r} \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & b_{m1} & \cdots & b_{mr} \\ b_{11} & \cdots & b_{m1} & a_{11} & \cdots & a_{1r} \\ \vdots & & \vdots & \vdots & \ddots & \vdots \\ b_{1r} & \cdots & b_{mr} & a_{r1} & \cdots & a_{rr} \end{vmatrix} \qquad \text{for } r = m + 1, \ldots, n.

Solved Problems

1. – Consider the problem

max(min) f (x, y) = x2 + 2y 2 subject to g(x, y) = x2 + y 2 = 1.

i) Find the four points that satisfy the first-order conditions.


ii) Classify them by using the second derivatives test.

iii) Graph some level curves of f and the graph of g = 1. Explain, where
the extreme points occur.

Solution: i) First, each of the optimization problems has a solution by the


extreme-value theorem; see Figure 3.23. Indeed, f is continuous on the unit
circle

S = {(x, y) : g(x, y) = 1}
which is a closed and bounded subset of R2 .
FIGURE 3.23: Graph of f on the set [x² + y² ≤ 1]



Next, the functions f and g are C 1 in R2 and any point on the unit circle is
regular since, for each (x, y) ∈ S, we have

g′(x, y) = (2x, 2y) ≠ (0, 0) =⇒ rank(g′(x, y)) = 1.


Thus, if we introduce the Lagrangian
L(x, y, λ) = f (x, y) − λ(g(x, y) − 1) = x2 + 2y 2 − λ(x2 + y 2 − 1),
then, by applying the Lagrange multipliers method, the interior extreme point
candidates are solutions of the system ∇L(x, y, λ) = ⟨0, 0, 0⟩:

\begin{cases} L_x = 2x - \lambda(2x) = 0 \\ L_y = 4y - \lambda(2y) = 0 \\ L_\lambda = -(x^2 + y^2 - 1) = 0 \end{cases} \iff \begin{cases} x = 0 \ \text{or} \ \lambda = 1 \\ y = 0 \ \text{or} \ \lambda = 2 \\ x^2 + y^2 - 1 = 0. \end{cases}
We cannot have x = y = 0 since the constraint is not satisfied. If x = 0
and λ = 2, we deduce from the third equation y = ±1. Then, if y = 0 and
λ = 1, we get x = ±1. So the four points that satisfy the necessary conditions
are
(1, 0) (−1, 0) (0, 1) (0, −1).

ii) Now, because f and g are C 2 , we may study the nature of the four points
by using the second derivatives test. Here, we have n = 2 and m = 1. Then,
we have to consider the sign of the bordered Hessian determinant B2 at each
point.

Nature of the points (±1, 0) where λ = 1 : First, we have

g  (x, y) = (2x, 2y), g  (±1, 0) = (±2, 0), rank(g  (±1, 0)) = 1,


and the first column vector of g  (±1, 0) is linearly independent. We have
   
B_2(x, y) = \begin{vmatrix} 0 & g_x(x,y) & g_y(x,y) \\ g_x(x,y) & L_{xx}(x,y,\lambda) & L_{xy}(x,y,\lambda) \\ g_y(x,y) & L_{xy}(x,y,\lambda) & L_{yy}(x,y,\lambda) \end{vmatrix} = \begin{vmatrix} 0 & 2x & 2y \\ 2x & 2-2\lambda & 0 \\ 2y & 0 & 4-2\lambda \end{vmatrix}

B_2(1, 0) = \begin{vmatrix} 0 & 2 & 0 \\ 2 & 0 & 0 \\ 0 & 0 & 2 \end{vmatrix} = -8, \qquad B_2(-1, 0) = \begin{vmatrix} 0 & -2 & 0 \\ -2 & 0 & 0 \\ 0 & 0 & 2 \end{vmatrix} = -8.

For m = 1, we have

(−1)¹ B₂(1, 0) = 8 > 0,   (−1)¹ B₂(−1, 0) = 8 > 0,

and the points (±1, 0) are local minima.

Nature of the points (0, ±1) where λ = 2 : We have

g  (x, y) = (2x, 2y), g  (0, ±1) = (0, ±2) =⇒ rank(g  (0, ±1)) = 1.

Note that the first column vector of g  (0, ±1) is linearly dependent and the
second column vector is linearly independent. So, we renumber the variables
so that the second column vector of g  (0, ±1) is in the first position. Hence B2
will be written as

   
B_2(x, y) = \begin{vmatrix} 0 & g_y(x,y) & g_x(x,y) \\ g_y(x,y) & L_{yy}(x,y,\lambda) & L_{yx}(x,y,\lambda) \\ g_x(x,y) & L_{xy}(x,y,\lambda) & L_{xx}(x,y,\lambda) \end{vmatrix} = \begin{vmatrix} 0 & 2y & 2x \\ 2y & 4-2\lambda & 0 \\ 2x & 0 & 2-2\lambda \end{vmatrix}

B_2(0, 1) = \begin{vmatrix} 0 & 2 & 0 \\ 2 & 0 & 0 \\ 0 & 0 & -2 \end{vmatrix} = 8, \qquad B_2(0, -1) = \begin{vmatrix} 0 & -2 & 0 \\ -2 & 0 & 0 \\ 0 & 0 & -2 \end{vmatrix} = 8.

For r = m + 1 = 2 = n, we have

(−1)2 B2 (0, 1) = 8 > 0 (−1)2 B2 (0, −1) = 8 > 0

and the points (0, ±1) are local maxima.


y y
6.08 5.44 4.8 1.5 4.8 5.76 6.4 1.5
.4
76
3.84 5.12 6.0
5.12 5.4
3.52
3.2 2.56 4.4
4.16 1.0 1.0 x2  2 y2  2

0.96
1.92 2.8
0.5 0.5

x2  2 y2  1

0.64 0.32 1.6 x x


1.5 1.0 0.5 0.5 1.0 1.5 1.5 1.0 0.5 0.5 1.0 1.5

0.5 0.5

.2
2.24 1.28 3.5

1.0 2.56 1.0


4.8 2.88 4.48
4.16 5.4
5.44 3.84 5.12
08 6.0
6 4 5 76 5 12 1.5 48 5 76 6 4 1.5

FIGURE 3.24: Level curves f = 1 and f = 2 are tangent to the constraint g = 1

iii) Conclusion: We have

f(±1, 0) = 1,   f(0, ±1) = 2.

Subject to the constraint g(x, y) = 1, f attains its maximum value 2 at the


points (0, ±1) and its minimum value 1 at the points (±1, 0). At these points,

the level curves x2 + 2y 2 = 1, x2 + 2y 2 = 2 and the constraint x2 + y 2 = 1,


sketched in Figure 3.24, are tangent.
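The same extrema can be cross-checked with a numerical constrained solver. A small sketch (not part of the text), assuming SciPy is available; the starting points are chosen only for illustration:

```python
# Numerical check of the extrema of f = x^2 + 2y^2 on the circle x^2 + y^2 = 1.
import numpy as np
from scipy.optimize import minimize

f = lambda v: v[0]**2 + 2*v[1]**2
con = {'type': 'eq', 'fun': lambda v: v[0]**2 + v[1]**2 - 1}

vmin = minimize(f, x0=[0.9, 0.1], constraints=[con], method='SLSQP')
vmax = minimize(lambda v: -f(v), x0=[0.1, 0.9], constraints=[con], method='SLSQP')
print(np.round(vmin.x, 4), round(vmin.fun, 4))    # near (±1, 0), value 1
print(np.round(vmax.x, 4), round(-vmax.fun, 4))   # near (0, ±1), value 2
```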

2. – Consider the problem

min f (x, y, z) = (x − x0 )2 + (y − y0 )2 + (z − z0 )2

subject to g(x, y, z) = ax + by + cz + d = 0

for (x₀, y₀, z₀) ∈ R³, d ∈ R and (a, b, c) ≠ (0, 0, 0).


i) Find the points that satisfy the first-order conditions.
ii) Show that the second-order conditions for a local minimum are sat-
isfied.

iii) Give a geometric argument for the existence of a minimum solution.

iv) Does the maximization problem have any solution?


v) Solve

min x2 + y 2 + z 2 subject to x + y + z = 1.

Solution: i) Note that f and g are C 1 in R3 . In particular, each point of


[g = 0] is a relative interior and regular point since we have

g′(x, y, z) = a i + b j + c k ≠ 0 =⇒ rank(g′(x, y, z)) = 1.

So, by applying Lagrange multipliers method, we will look for the candidate
extreme points as stationary points for the Lagrangian

L(x, y, z, λ) = (x − x0 )2 + (y − y0 )2 + (z − z0 )2 − λ(ax + by + cz + d).

These points are solution of the system



∇L(x, y, z, λ) = ⟨0, 0, 0, 0⟩ ⇐⇒ \begin{cases} L_x = 2(x - x_0) - \lambda a = 0 \\ L_y = 2(y - y_0) - \lambda b = 0 \\ L_z = 2(z - z_0) - \lambda c = 0 \\ L_\lambda = -(ax + by + cz + d) = 0 \end{cases}

from which we deduce

x = (λ/2)a + x₀,   y = (λ/2)b + y₀,   z = (λ/2)c + z₀

and

a((λ/2)a + x₀) + b((λ/2)b + y₀) + c((λ/2)c + z₀) + d = 0,

so that

λ/2 = −(a x₀ + b y₀ + c z₀ + d)/(a² + b² + c²) = λ∗/2.
Thus, we have only one critical point denoted (x∗ , y ∗ , z ∗ ) with λ = λ∗ .

ii) First, note that

g′(x∗, y∗, z∗) = (a, b, c) ≠ (0, 0, 0)

and discuss:

Case a ≠ 0.
The first column vector of g′(x∗, y∗, z∗) is linearly independent, and because
n = 3 and m = 1, we have to consider the signs of the following bordered
Hessian determinants:

B_2(x^*, y^*, z^*) = \begin{vmatrix} 0 & g_x & g_y \\ g_x & L_{xx} & L_{xy} \\ g_y & L_{xy} & L_{yy} \end{vmatrix} = \begin{vmatrix} 0 & a & b \\ a & 2 & 0 \\ b & 0 & 2 \end{vmatrix} = -2(a^2 + b^2) < 0.

The partial derivatives of g are taken at (x∗, y∗, z∗) and those of L at (x∗, y∗, z∗, λ∗).

B_3 = \begin{vmatrix} 0 & g_x & g_y & g_z \\ g_x & L_{xx} & L_{xy} & L_{xz} \\ g_y & L_{yx} & L_{yy} & L_{yz} \\ g_z & L_{zx} & L_{zy} & L_{zz} \end{vmatrix} = \begin{vmatrix} 0 & a & b & c \\ a & 2 & 0 & 0 \\ b & 0 & 2 & 0 \\ c & 0 & 0 & 2 \end{vmatrix} = -4(a^2 + b^2 + c^2) < 0.

Case a = 0 and b ≠ 0.
The first column vector of g′(x∗, y∗, z∗) is linearly dependent and the second
is linearly independent. We renumber the variables in the order y, x, z and
obtain

B_2 = \begin{vmatrix} 0 & b & a \\ b & 2 & 0 \\ a & 0 & 2 \end{vmatrix} = -2(a^2 + b^2), \qquad B_3 = \begin{vmatrix} 0 & b & a & c \\ b & 2 & 0 & 0 \\ a & 0 & 2 & 0 \\ c & 0 & 0 & 2 \end{vmatrix} = -4(a^2 + b^2 + c^2).

Case a = 0, b = 0, and c ≠ 0.
The first and second column vectors of g′(x∗, y∗, z∗) are linearly dependent
and the third is linearly independent. We renumber the variables in the order
z, x, y and obtain

B_2 = \begin{vmatrix} 0 & c & a \\ c & 2 & 0 \\ a & 0 & 2 \end{vmatrix} = -2(a^2 + c^2), \qquad B_3 = \begin{vmatrix} 0 & c & a & b \\ c & 2 & 0 & 0 \\ a & 0 & 2 & 0 \\ b & 0 & 0 & 2 \end{vmatrix} = -4(a^2 + b^2 + c^2).

Conclusion. In each case, we have, with m = 1,

(−1)m B2 (x∗ , y ∗ , z ∗ ) > 0 (−1)m B3 (x∗ , y ∗ , z ∗ ) = 4(a2 + b2 + c2 ) > 0.

We conclude that the point (x∗ , y ∗ , z ∗ ) is a local minimum to the constrained


minimization problem.

iii) Geometric interpretation of the minimization problem:


If M (x, y, z), M0 (x0 , y0 , z0 ) ∈ R3 , then

f (x, y, z) = (x − x0 )2 + (y − y0 )2 + (z − z0 )2 = M0 M 2

is the square of the distance of the point M to the point M0 . The constraint
surface

g(x, y, z) = ax + by + cz + d = 0 is the plane with normal a, b, c.

The minimization problem consists in finding a point M in the plane that is


located at a shortest distance from M0 . Such a point exists and is obtained
by considering the intersection of the line passing through the point M0 and
perpendicular to the plane. A direction of this line is given by the normal to
the plane a, b, c. Therefore, parametric equations of the line are

x = x0 + ta y = y0 + tb z = z0 + tc t ∈ R.

Clearly the intersection of the line with the plane gives

a(x_0 + ta) + b(y_0 + tb) + c(z_0 + tc) + d = 0 \iff t = \frac{\lambda^*}{2} = -\frac{a x_0 + b y_0 + c z_0 + d}{a^2 + b^2 + c^2}.

f takes its minimum value

f\Big(\frac{\lambda^*}{2}a + x_0, \frac{\lambda^*}{2}b + y_0, \frac{\lambda^*}{2}c + z_0\Big) = \Big(\frac{\lambda^*}{2}a\Big)^2 + \Big(\frac{\lambda^*}{2}b\Big)^2 + \Big(\frac{\lambda^*}{2}c\Big)^2 = \frac{\lambda^{*2}}{4}(a^2 + b^2 + c^2).

The shortest distance of M₀ to the plane g = 0 is

D = \sqrt{\frac{\lambda^{*2}}{4}(a^2 + b^2 + c^2)} = \frac{|\lambda^*|}{2}\sqrt{a^2 + b^2 + c^2} = \frac{|a x_0 + b y_0 + c z_0 + d|}{\sqrt{a^2 + b^2 + c^2}}.

iv) The maximization problem doesn't have a solution:

Suppose that there exists a solution (x_m, y_m, z_m) to the maximization problem.
Pick a nonzero vector ⟨u, v, w⟩ parallel to the plane (such a vector exists since the
plane is two-dimensional). Then the points (x_m + tu, y_m + tv, z_m + tw), t ∈ R,
remain in the plane and satisfy

f(x_m + tu, y_m + tv, z_m + tw) = (x_m + tu − x₀)² + (y_m + tv − y₀)² + (z_m + tw − z₀)² −→ +∞ as t −→ +∞,

contradicting the maximality.

v) From the previous study, choose (a, b, c) = (1, 1, 1), d = −1, (x₀, y₀, z₀) = (0, 0, 0). Then

λ/2 = 1/3   and   (x∗, y∗, z∗) = (1/3, 1/3, 1/3).

We conclude that the point (1/3, 1/3, 1/3) is a local minimum to the constrained
minimization problem. At this point, the two level surfaces

x² + y² + z² = 1/3 = f(1/3, 1/3, 1/3)   and   x + y + z = 1

are tangent, as described in Figure 3.25.
FIGURE 3.25: The level surface and the plane are tangent

3. – The planes x + y + z = 3 and x − y = 2 intersect in a straight line.


Find the point on that line that is closest to the origin.

Solution: i) We formulate the problem as follows:

min f(x, y, z) = x² + y² + z² subject to g₁(x, y, z) = x + y + z = 3 and g₂(x, y, z) = x − y = 2.

Note that f, g₁ and g₂ are C¹ in R³ and any point of the set of the constraints,
sketched in Figure 3.26 and defined by g = (g₁, g₂) = (3, 2), is an interior point
and regular since we have

g'(x, y, z) = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \end{bmatrix}, \qquad \mathrm{rank}(g'(x, y, z)) = 2.


FIGURE 3.26: The constraints, the origin and the minimum point

Consider the Lagrangian

L(x, y, z, λ1 , λ2 ) = f (x, y, z) − λ1 (g1 (x, y, z) − 3) − λ2 (g2 (x, y, z) − 2)

= x2 + y 2 + z 2 − λ1 (x + y + z − 3) − λ2 (x − y − 2)

and look for the stationary points, solutions of ∇L(x, y, z, λ₁, λ₂) = 0_{R⁵}:

\begin{cases} (1)\ L_x = 2x - \lambda_1 - \lambda_2 = 0 \\ (2)\ L_y = 2y - \lambda_1 + \lambda_2 = 0 \\ (3)\ L_z = 2z - \lambda_1 = 0 \\ (4)\ L_{\lambda_1} = -(x + y + z - 3) = 0 \\ (5)\ L_{\lambda_2} = -(x - y - 2) = 0. \end{cases}

From equations (1), (2) and (3), we deduce that

x = (λ₁ + λ₂)/2,   y = (λ₁ − λ₂)/2,   z = λ₁/2;

then substituting these values into equations (4) and (5), we obtain

(λ₁ + λ₂)/2 + (λ₁ − λ₂)/2 + λ₁/2 = 3   and   (λ₁ + λ₂)/2 − (λ₁ − λ₂)/2 = 2   =⇒ (λ₁, λ₂) = (2, 2).

The only critical point for L is (x∗ , y ∗ , z ∗ , λ∗1 , λ∗2 ) = (2, 0, 1, 2, 2).

ii) Note that the first two column vectors of g′(x, y, z) are linearly independent.
We can, therefore, keep the matrix without renumbering the variables, and
consider the sign of the following bordered Hessian determinant (n = 3, m = 2, r = m + 1 = 3):

B_3(2, 0, 1) = \begin{vmatrix} 0 & 0 & \partial g_1/\partial x & \partial g_1/\partial y & \partial g_1/\partial z \\ 0 & 0 & \partial g_2/\partial x & \partial g_2/\partial y & \partial g_2/\partial z \\ \partial g_1/\partial x & \partial g_2/\partial x & L_{xx} & L_{xy} & L_{xz} \\ \partial g_1/\partial y & \partial g_2/\partial y & L_{yx} & L_{yy} & L_{yz} \\ \partial g_1/\partial z & \partial g_2/\partial z & L_{zx} & L_{zy} & L_{zz} \end{vmatrix} = \begin{vmatrix} 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & -1 & 0 \\ 1 & 1 & 2 & 0 & 0 \\ 1 & -1 & 0 & 2 & 0 \\ 1 & 0 & 0 & 0 & 2 \end{vmatrix} = 12.

We have

(−1)^m B₃(2, 0, 1) = (−1)² B₃(2, 0, 1) = 12 > 0.

We conclude that the point (2, 0, 1) is a local minimum to the constrained


optimization problem.

iii) To show that the point is the global minimum point, we use the following
parametrization of the set of the constraints; see Figure 3.26:

x = t + 2, y = t, z = 1 − 2t t ∈ R.
So the optimization problem is reduced to

min F (t) = f (t + 2, t, 1 − 2t) = (t + 2)2 + t2 + (2t − 1)2 .


t∈R

We have

F  (t) = 2(t + 2) + 2t + 2(2t − 1)(2) = 12t = 0 ⇐⇒ t=0

and
F  (t) = 12 > 0 ∀t ∈ R.
Hence 0 is a global minimum for F . That is, the point (2, 0, 1) is the solution
to the minimization problem.

In Section 3.4, we will see that, using the convexity of the Lagrangian in
(x, y, z) when (λ₁, λ₂) = (2, 2), we can conclude that the local minimum point
(2, 0, 1) is the global minimum point. Therefore, it solves the problem. The
advantage of arguing in this way is that it spares us from exploring the geometry
of the constraint set.
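The one-variable reduction used in iii) is easy to verify symbolically. A small sketch (not part of the text), assuming SymPy is available:

```python
# F(t) = f(t+2, t, 1-2t) along the parametrized constraint line; minimize over t.
import sympy as sp

t = sp.Symbol('t', real=True)
F = (t + 2)**2 + t**2 + (1 - 2*t)**2
tstar = sp.solve(sp.diff(F, t), t)            # [0]
print(tstar, sp.diff(F, t, 2))                # F''(t) = 12 > 0, so t = 0 is the global minimum
print([c.subs(t, tstar[0]) for c in (t + 2, t, 1 - 2*t)])   # the point (2, 0, 1)
```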

3.4 Global Extreme Points-Equality Constraints

The following theorem gives sufficient conditions for a critical point of


the Lagrangian to be a global extreme point for the associated constrained
optimization problem.

Theorem 3.4.1 Let Ω ⊂ Rn , Ω be an open set and f, g1 , . . . , gm : Ω −→



R be C 1 functions. Let S ⊂ Ω be convex, x∗ ∈ S and L be the Lagrangian

L(x, λ) = f (x) − λ1 (g1 (x) − c1 ) − . . . − λm (gm (x) − cm ).

Then, we have

∃ λ∗ = ⟨λ₁∗, . . . , λ_m∗⟩ : ∇_{x,λ} L(x∗, λ∗) = 0   and   L(., λ∗) is concave (resp. convex) in x ∈ S

=⇒ f(x∗) = max_{ {x ∈ S : g(x) = c} } f(x)   (resp. min).

Proof. Suppose that the Lagrangian L(., λ∗) is concave in x and that

\frac{\partial L}{\partial x_i}(x^*, \lambda^*) = \frac{\partial f}{\partial x_i}(x^*) - \sum_{j=1}^{m} \lambda_j^* \frac{\partial g_j}{\partial x_i}(x^*) = 0, \qquad i = 1, \ldots, n;

then x∗ is a stationary point for L(., λ∗). Therefore, x∗ is a global maximum
for L(., λ∗) in S (by Theorem 2.3.4) and we have

L(x∗, λ∗) = f(x∗) − λ₁∗(g₁(x∗) − c₁) − . . . − λ_m∗(g_m(x∗) − c_m)

≥ f(x) − λ₁∗(g₁(x) − c₁) − . . . − λ_m∗(g_m(x) − c_m) = L(x, λ∗)   ∀x ∈ S.

Since, we have
∂L ∗ ∗
(x , λ ) = −(gj (x∗ ) − cj ) = 0 j = 1, . . . , m
∂λj
then
g1 (x∗ ) − c1 = g2 (x∗ ) − c2 = . . . = gm (x∗ ) − cm = 0.

So, the previous inequality reduces to

f(x∗) ≥ f(x) − λ₁∗(g₁(x) − c₁) − . . . − λ_m∗(g_m(x) − c_m).

In particular, we have

f(x∗) ≥ f(x)   ∀x ∈ {x ∈ S : g(x) = c}.


Thus x∗ solves the constrained maximization problem.
The minimization case can be established similarly.

Remark 3.4.1 * Note that there is no regularity assumption on the point


x∗ in the theorem. The proof uses the characterization of a C 1 convex
function on a convex set.
** The concavity/convexity hypothesis is a sufficient condition. We may
have a global extreme point with a Lagrangian that is neither concave nor
convex (see Example 3).

Example 1. Economy. If the cost of capital K and labor L is r and w dollars


per unit respectively, find the values of K and L that minimize the cost to
produce the output Q = c K a Lb , where c, a and b are positive parameters
satisfying a + b < 1.

Solution: The inputs K and L minimizing the cost must solve the problem

min rK + wL subject to cK a Lb = Q.
We look for the extreme points in the set Ω = (0, +∞) × (0, +∞) since K and
L must satisfy cK a Lb = Q. Denote

f (K, L) = rK + wL g(K, L) = cK a Lb S = Ω.
Note that f and g are C 1 in the open convex set Ω.
Consider the Lagrangian
L(K, L, λ) = f (K, L) − λ(g(K, L) − Q) = rK + wL − λ(cK a Lb − Q)
and Lagrange's necessary conditions

∇L(K, L, λ) = ⟨0, 0, 0⟩ ⇐⇒ \begin{cases} L_K = r - \lambda c a K^{a-1} L^b = 0 \\ L_L = w - \lambda c b K^a L^{b-1} = 0 \\ L_\lambda = -(c K^a L^b - Q) = 0. \end{cases}

Multiplying each side of the first equality by K, each side of the second equality
by L, we obtain

rK = λcaK a Lb = λaQ wL = λcbK a Lb = λbQ


then using the third equality, we deduce the unique solution of the system

K^* = \lambda^* \frac{aQ}{r}, \qquad L^* = \lambda^* \frac{bQ}{w}, \qquad \lambda^* = \Big(\frac{Q}{c}\Big)^{\frac{1}{a+b}} \Big(\frac{r}{aQ}\Big)^{\frac{a}{a+b}} \Big(\frac{w}{bQ}\Big)^{\frac{b}{a+b}}.

Convexity of L in (K, L). The Hessian matrix of L is

H_{L(\cdot,\cdot,\lambda^*)} = \begin{bmatrix} -\lambda^* c a(a-1) K^{a-2} L^b & -\lambda^* c a b K^{a-1} L^{b-1} \\ -\lambda^* c a b K^{a-1} L^{b-1} & -\lambda^* c b(b-1) K^a L^{b-2} \end{bmatrix}.

The leading principal minors are

D_1(K, L) = -\lambda^* c a(a-1) K^{a-2} L^b > 0 \quad \text{since } 0 < a < a + b < 1,

D_2(K, L) = (\lambda^*)^2 c^2 a b K^{2a-2} L^{2b-2} (1 - (a + b)) > 0.

Hence, L(., ., λ∗) is strictly convex in (K, L) in Ω, and we conclude that the
point (K∗, L∗) is the solution to the constrained minimization problem.

Example 2. Two-constraint problem. Solve the problem

min (max) f(x, y, z) = x − z subject to g₁(x, y, z) = x² + y² = 1 and g₂(x, y, z) = x² + z² = 1.

Solution: i) Consider the Lagrangian

L(x, y, z, λ1 , λ2 ) = f (x, y, z) − λ1 (g1 (x, y, z) − 1) − λ2 (g2 (x, y, z) − 1)


= x − z − λ1 (x2 + y 2 − 1) − λ2 (x2 + z 2 − 1)

and look for its stationary points, solutions of the system ∇L(x, y, z, λ₁, λ₂) = 0_{R⁵}:

\begin{cases} (1)\ L_x = 1 - 2x\lambda_1 - 2x\lambda_2 = 0 \\ (2)\ L_y = 0 - 2y\lambda_1 = 0 \\ (3)\ L_z = -1 - 2z\lambda_2 = 0 \\ (4)\ L_{\lambda_1} = -(x^2 + y^2 - 1) = 0 \\ (5)\ L_{\lambda_2} = -(x^2 + z^2 - 1) = 0. \end{cases}

From equation (2), we deduce that

λ1 = 0 or y = 0.

∗ If y = 0, then from (4) and (5) we deduce that

x = ±1 and z = 0.

But (3) is not possible.


∗ If λ₁ = 0, then (1) and (3) reduce to

1 − 2xλ₂ = 0   and   −1 − 2zλ₂ = 0.
Since λ₂ cannot be equal to zero, we deduce that

x = −z = 1/(2λ₂).

Inserting x = −z in (5), we obtain

2x² = 1 ⇐⇒ x = ±1/√2.

Then, from (4), we get

1/2 + y² = 1 ⇐⇒ y = ±1/√2.

So, the critical points of L are

(1/√2, ±1/√2, −1/√2, λ₁∗, λ₂∗) with (λ₁∗, λ₂∗) = (0, 1/√2),

(−1/√2, ±1/√2, 1/√2, λ₁∗, λ₂∗) with (λ₁∗, λ₂∗) = (0, −1/√2).

The values taken by f at these points are

f(1/√2, ±1/√2, −1/√2) = √2,   f(−1/√2, ±1/√2, 1/√2) = −√2.

ii) To study the convexity of L in (x, y, z), consider the Hessian matrix
H_L(x,y,z,λ1,λ2) = [ Lxx Lxy Lxz ; Lyx Lyy Lyz ; Lzx Lzy Lzz ] = [ −2(λ1 + λ2)  0  0 ;  0  −2λ1  0 ;  0  0  −2λ2 ].

* With (λ1*, λ2*) = (0, 1/√2), the Hessian is

H_L(x,y,z,0,1/√2) = [ −√2  0  0 ;  0  0  0 ;  0  0  −√2 ].

Its principal minors of order 1 are −√2, 0, −√2, those of order 2 are 0, 2, 0, and Δ3 = det H = 0, so that

(−1)^k Δ_k ≥ 0   for every principal minor Δ_k of order k = 1, 2, 3.

Thus L(·, ·, ·, 0, 1/√2) is concave in R³ and the points (1/√2, ±1/√2, −1/√2) are maximum points.
** Similarly, we show that L(·, ·, ·, 0, −1/√2) is convex and the points (−1/√2, ±1/√2, 1/√2) are minimum points.

iii) Comments. The constraint set, illustrated in Figure 3.27, is the intersection of two cylinders. A parametrization of this set is described by the equations

x(t) = ±√(1 − t²),   y(t) = t,   z(t) = t or −t,   t ∈ [−1, 1].

The set is closed since g1 and g2 are continuous on R³ and

[(g1, g2) = (1, 1)] = g1⁻¹({1}) ∩ g2⁻¹({1}).

It is bounded since, for any (x, y, z) ∈ [(g1, g2) = (1, 1)], we have

‖(x, y, z)‖² = x² + y² + z² ≤ (x² + y²) + (x² + z²) = 1 + 1 = 2.
FIGURE 3.27: The constraint set

As f is continuous on the closed bounded constraint set [(g1 , g2 ) = (1, 1)], it


attains its maximum and minimum values on this set by the extreme value
theorem. Thus, the solution of the problem is found by comparing the values
of f taken at the candidate points obtained in i).
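As a quick numerical cross-check, the Mathematica sketch below maximizes and minimizes x − z on the intersection of the two cylinders; the optimal values should be ±√2, attained at the candidate points found in i).

NMaximize[{x - z, x^2 + y^2 == 1, x^2 + z^2 == 1}, {x, y, z}]
NMinimize[{x - z, x^2 + y^2 == 1, x^2 + z^2 == 1}, {x, y, z}]
(* expected optimal values: Sqrt[2] and -Sqrt[2] *)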

Example 3. No concavity nor convexity. Consider the problem

max f (x, y, z) = xy + yz + xz subject to g(x, y, z) = x + y + z = 3

and the associated Lagrangian

L(x, y, z, λ) = f (x, y, z) − λ(g(x, y, z) − 3) = xy + yz + xz − λ(x + y + z − 3).

Show that the local maximum point (1, 1, 1) of the constrained optimization
problem, with λ = 2, is a global maximum, but L(., 2) is not concave.

Solution: We have

Lx = y + z − λ Ly = x + z − λ

Lz = y + x − λ Lλ = −(x + y + z − 3).

To study the concavity of L in (x, y, z) when λ = 2, consider the Hessian


matrix
H_L(x,y,z,2) = [ Lxx Lxy Lxz ; Lyx Lyy Lyz ; Lzx Lzy Lzz ] = [ 0 1 1 ; 1 0 1 ; 1 1 0 ].

The principal minors are

Δ1 = 0   for each of the three principal minors of order 1,
Δ2 = det [ 0 1 ; 1 0 ] = −1   for each of the three principal minors of order 2,
Δ3 = det [ 0 1 1 ; 1 0 1 ; 1 1 0 ] = 2.

So L(., 2) is neither concave nor convex in (x, y, z). Thus, we cannot conclude,
by using the theorem, whether the point (1, 1, 1) is a global maximum or not.

Now, to show that (1, 1, 1) is a global maximum point, we can proceed as


follows.
Consider the values of f taken on the plane g(x, y, z) = 3:

f (x, y, 3−(x+y)) = xy +(y +x)[3−(x+y)] = xy +3(x+y)−(x+y)2 = θ(x, y).

The maximization problem is equivalent to solving the unconstrained problem

max_(x,y)∈R² θ(x, y).

Since θ is C¹, the critical points are solutions of

∇θ(x, y) = (y + 3 − 2(x + y), x + 3 − 2(x + y)) = (3 − 2x − y, 3 − x − 2y) = (0, 0)

⇐⇒ { 2x + y = 3,  x + 2y = 3 } ⇐⇒ (x, y) = (1, 1).
(1, 1) is the only critical point for θ. Moreover, we have
Hθ(x, y) = [ θxx θxy ; θyx θyy ] = [ −2 −1 ; −1 −2 ],

D1(x, y) = −2  ⟹ (−1)¹ D1(x, y) = 2 > 0,

D2(x, y) = det [ −2 −1 ; −1 −2 ] = 3  ⟹ (−1)² D2(x, y) = 3 > 0.

θ is strictly concave on R2 . Thus (1, 1) is a global maximum of θ on R2 .


Therefore, (1, 1, 1) is a global maximum of f on [g = 3].
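A numerical confirmation of this global maximum can be obtained with the following illustrative Mathematica coding:

NMaximize[{x y + y z + x z, x + y + z == 3}, {x, y, z}]
(* expected: value 3 at (x, y, z) = (1, 1, 1) *)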

Solved Problems

Part 1. – A constrained optimization problem. [29] i) Solve the fol-


lowing constrained minimization problem

min x21 + x22 + . . . + x2n subject to x 1 + x2 + . . . + x n = c

where c ∈ R.
ii) Use part (i) to show that if x1 , x2 , . . . , xn are given numbers, then
n Σ_{i=1}^{n} xi²  ≥  ( Σ_{i=1}^{n} xi )².

When does the equality hold ?

Solution: i) Denote by f and g the C ∞ functions in Rn :


f (x1 , . . . , xn ) = x21 + x22 + . . . + x2n g(x1 , . . . , xn ) = x1 + x2 + . . . + xn .
Consider the Lagrangian
L(x1 , . . . , xn , λ) = f (x1 , . . . , xn ) − λ(g(x1 , . . . , xn ) − c)
= x21 + x22 + . . . + x2n − λ(x1 + x2 + . . . + xn − c).
Note that any point of the hyperplane g = c is a regular point since we have

g  (x1 , . . . , xn ) = 1, . . . , 1 =⇒ rank(g  (x1 , . . . , xn )) = 1.


The stationary points of the Lagrangian are solutions of the system

∇L(x1, ..., xn, λ) = (0, ..., 0, 0) ⇐⇒
L_{xi} = 2xi − λ = 0,   i = 1, ..., n,
L_λ = −(x1 + x2 + ... + xn − c) = 0.

We deduce, from the first n equations, that

λ = 2x1 = ... = 2xi = ... = 2xn  ⟹  xi = λ/2,   i = 1, ..., n,

which inserted into the last equation gives n(λ/2) = c. Hence the unique solution to the system is

λ = 2c/n,   xi = c/n,   i = 1, ..., n.

Now, let us study the convexity of L in (x1, ..., xn) when λ = 2c/n.
The corresponding Hessian matrix is the n × n diagonal matrix with all diagonal entries equal to 2. The leading principal minors are

D1 = 2 > 0,   D2 = 2² > 0,   ...,   Di = 2^i > 0,   ...,   Dn = 2^n > 0.

Hence, L is strictly convex in (x1, ..., xn), and we conclude that the point

(c/n, ..., c/n)

is the solution to the constrained minimization problem.
ii) Let x1, x2, ..., xn be given numbers. Denote by c their sum. From part i), we have

f(c/n, ..., c/n) ≤ f(t1, ..., tn)   ∀(t1, ..., tn) ∈ [t1 + ... + tn = c].

In particular, for the given xi, we can write

(c/n)² + (c/n)² + ... + (c/n)² ≤ x1² + x2² + ... + xn²

⇐⇒ n(c²/n²) = c²/n ≤ x1² + x2² + ... + xn²

⇐⇒ c² = (x1 + x2 + ... + xn)² ≤ n(x1² + x2² + ... + xn²).

The equality holds only at the minimum point, that is, when all the xi are equal to (x1 + x2 + ... + xn)/n.
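The inequality in ii) is easy to sanity-check on random data; the following Mathematica sketch (with an arbitrary sample size n = 7) should always return True:

n = 7; xs = RandomReal[{-10, 10}, n];
n Total[xs^2] >= Total[xs]^2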

Part 2. – Method of least squares. [1]


Consider n points (x1 , y1 ), . . . , (xn , yn ) such that x1 , . . . , xn are not all
equal. Find the slope m and the y-intercept b of the line y = mx + b,
that minimize the quantity
D(m, b) = Σ_{i=1}^{n} (mxi + b − yi)² = (mx1 + b − y1)² + ... + (mxn + b − yn)²

which represents the sum of the squares of the vertical distances di =


[yi −(mxi +b)] from these points to the line. This line is called the regression
line or the least squares’ line of best fit.
(Hint: find the point candidate and check its global optimality by using
Part 1 (ii))

Solution: consider the following unconstrained minimization problem:

min D(m, b) = [y1 − (mx1 + b)]2 + . . . + [yn − (mxn + b)]2


(m,b)

Since D is differentiable, its local extreme points are critical points, i.e., solutions of ∇D(m, b) = (0, 0):

∂D/∂m = −2[y1 − (mx1 + b)]x1 − ... − 2[yn − (mxn + b)]xn = 0
∂D/∂b = −2[y1 − (mx1 + b)] − ... − 2[yn − (mxn + b)] = 0

⇐⇒

Σ_{i=1}^{n} yi xi = m [ Σ_{i=1}^{n} xi² ] + b [ Σ_{i=1}^{n} xi ]
Σ_{i=1}^{n} yi = m [ Σ_{i=1}^{n} xi ] + b·n.

The determinant of this 2 × 2 linear system is

det [ Σxi²  Σxi ; Σxi  n ] = n Σ_{i=1}^{n} xi² − ( Σ_{i=1}^{n} xi )² ≠ 0

since x1, ..., xn are not all equal (see Part 1). Therefore, there exists a unique solution to the system. It remains to show that it is the minimum point. For this, we study the convexity of D, whose Hessian matrix is given by

H_D(m, b) = [ 2Σxi²  2Σxi ; 2Σxi  2n ].

The leading principal minors are

D1(m, b) = 2 Σ_{i=1}^{n} xi² > 0,   D2(m, b) = 4 [ n Σ_{i=1}^{n} xi² − (Σ_{i=1}^{n} xi)² ] > 0.

So D is convex and the unique critical point (m*, b*) is the global minimum. The regression line equation is y = m*x + b* with

m* = [ n Σ_{i=1}^{n} xi yi − (Σ_{i=1}^{n} xi)(Σ_{i=1}^{n} yi) ] / [ n Σ_{i=1}^{n} xi² − (Σ_{i=1}^{n} xi)² ],

b* = [ (Σ_{i=1}^{n} xi²)(Σ_{i=1}^{n} yi) − (Σ_{i=1}^{n} xi)(Σ_{i=1}^{n} xi yi) ] / [ n Σ_{i=1}^{n} xi² − (Σ_{i=1}^{n} xi)² ].
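The closed-form coefficients are straightforward to evaluate; the Mathematica sketch below (with made-up sample lists xs, ys) computes m* and b* from the formulas above and compares them with the built-in Fit:

xs = {1, 2, 3, 4, 5}; ys = {2.1, 3.9, 6.2, 8.1, 9.8};  (* hypothetical data *)
n = Length[xs];
det = n Total[xs^2] - Total[xs]^2;
mstar = (n Total[xs ys] - Total[xs] Total[ys])/det;
bstar = (Total[xs^2] Total[ys] - Total[xs] Total[xs ys])/det;
{mstar, bstar}
Fit[Transpose[{xs, ys}], {1, x}, x]  (* should equal bstar + mstar x *)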

Part 3. – Students’ scores. In a math course, Table 3.3 lists the


scores xi of 14 students on the midterm exam and their scores yi on the
final exam.
i) Plot the data. Do the data appear to lie along a straight line?

ii) Find the least squares’ line of best fit of y as a function of x.

iii) Plot the points and the regression line on the same graph.

iv) Use your answer from ii) to predict the final exam score of a student
whose midterm score was 41 and who dropped the course.

xi 100 95 81 71 83 48 92 100 85 63 78 58 73 60
yi 95 88 53 58 80 31 91 78 85 52 78 74 60 60

TABLE 3.3: Students’ scores

Solution: i) The plot, in Figure 3.28, shows that 10 points are close to a line.
The plot is obtained using the Mathematica coding below:

fp = {{100, 95}, {95, 88}, {81, 53}, {71, 58}, {83, 80}, {48, 31}, {92, 91},
{100, 78}, {85, 85}, {63, 52}, {78, 78}, {58, 74}, {73, 60}, {60, 60}};
gp = ListPlot[fp]

FIGURE 3.28: The data shows an alignment

ii) Using the results from Part 2, we have

Σ_{i=1}^{14} xi = 1087,   Σ_{i=1}^{14} xi² = 87855,   Σ_{i=1}^{14} yi = 983,   Σ_{i=1}^{14} xi yi = 79428,

m* = 1499/1669 ≈ 0.8981426,   b* = 801/1669 ≈ 0.4799281,

and the regression line will be

y = 0.8981426 x + 0.4799281.

iii) To check the equation of the line of best fit, we use the instruction

line = Fit[fp, {1, x}, x]
0.479928 + 0.898143 x

To sketch the line (see Figure 3.29) with the data, we add the following Mathematica coding:

gl = Plot[line, {x, 25, 110}];
Show[gl, gp]

FIGURE 3.29: Data and line y = 0.479928 + 0.898143 x

iv) The student who dropped the course would have obtained on the final exam an approximate mark of y(41) ≈ 0.8981426(41) + 0.4799281 ≈ 37.3. The student would likely have failed without improving his understanding of the material studied. However, this is only a relative prediction that doesn't take into account other factors involved in the student's learning experience.

Part 4. – University tuition. [12] The following, in Table 3.4, are


the tuition fees that were charged at Vanderbilt University from 1982 to
1991.

i) Plot the data.


ii) To fit these data with a model of the form y = β0 e^(β1 x), find the least squares' line of best fit of ln y as a function of x. Deduce approximate values of β0 and β1.
iii) Sketch the curve in ii) with the data plot in i).

iv) Suppose the exponential model is accurate for a period of time. In


which year would the tuition attain a rate of $40000?

Solution: i) The data, of points (x, y), appear to lie along a straight line.
The plot, shown in Figure 3.30, is obtained using the Mathematica coding
below:

year year after 1981, x tuition (in thousands $), y


1982 1 6.1
1983 2 6.8
1984 3 7.5
1985 4 8.5
1986 5 9.3
1987 6 10.5
1988 7 11.5
1989 8 12.625
1990 9 13.975
1991 10 14.975

TABLE 3.4: University tuition

fp1 = {{1, 6.1}, {2, 6.8}, {3, 7.5}, {4, 8.5}, {5, 9.3}, {6, 10.5}, {7, 11.5},
{8, 12.625}, {9, 13.975}, {10, 14.975}};
gp1 = ListPlot[fp1]

FIGURE 3.30: The data (xi, yi) lie along a straight line

The plot of the data, of points (xi , ln yi ), appears also to lie along a straight
line (see Figure 3.31).

fp2 = {{1, Log[6.1]}, {2, Log[6.8]}, {3, Log[7.5]}, {4, Log[8.5]}, {5, Log[9.3]},
{6, Log[10.5]}, {7, Log[11.5]}, {8, Log[12.625]}, {9, Log[13.975]}, {10, Log[14.975]}};
gp2 = ListPlot[fp2]
FIGURE 3.31: The data (xi, ln yi) are positioned along a straight line

ii) Using the results from Part 2, the least squares' line of best fit is given by ln(y) = ln(β0) + β1 x, where b* = ln(β0) and m* = β1 are the solution of the linear system

q = A·m* + B·b*,   p = B·m* + 10·b*

where

B = Σ_{i=1}^{10} xi = 55,   A = Σ_{i=1}^{10} xi² = 385,

p = Σ_{i=1}^{10} ln(yi) ≈ 22.7832,   q = Σ_{i=1}^{10} xi ln(yi) ≈ 133.69,

m* = (10q − Bp)/(10A − B²) ≈ 0.10156,   b* = (Ap − Bq)/(10A − B²) ≈ 1.71975,

and the regression line will be

ln y = 0.10156 x + 1.71975.

Thus β0 = e^(b*) ≈ 5.583112 and β1 = m* ≈ 0.10156.
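For completeness, the same estimates can be obtained directly from the data by fitting ln y against x; the coding below is an illustrative sketch:

xs = Range[10];
ys = {6.1, 6.8, 7.5, 8.5, 9.3, 10.5, 11.5, 12.625, 13.975, 14.975};
lineLog = Fit[Transpose[{xs, Log[ys]}], {1, x}, x]  (* returns b* + m* x *)
beta0 = Exp[lineLog /. x -> 0]; beta1 = D[lineLog, x];
{beta0, beta1}  (* approximately 5.58 and 0.102 *)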

iii) Using Mathematica, we find the equation of the line of best fit

line = Fit[fp2, {1, x}, x]
1.71975 + 0.10156 x

We sketch the line with the data (xi, ln(yi)), in Figure 3.32, using the coding:

gl = Plot[line, {x, 1/2, 11}];
Show[gl, gp2]
FIGURE 3.32: Data (xi, ln(yi)) and line y = 1.71975 + 0.10156 x

Finally, we sketch, in Figure 3.33, the curve y = f (x) = β0 eβ1 x , with the
original data (xi , yi ):

curve = Plot[5.583112 Exp[0.10156 x], {x, 1/2, 11}];
Show[curve, gp1]

FIGURE 3.33: Data (xi, yi) and curve model f(x) = 5.583112 e^(0.10156x)

iv) Using the model for prediction, we need to solve the equation

1000 f(x) = 40000 ⇐⇒ x = (1/0.10156) ln(40/5.583112) ≈ 19.39.

Thus around the year 1981 + 19 ≈ 2000, the tuition fees reach the rate of $40000.
Chapter 4
Constrained Optimization-Inequality
Constraints

In this chapter, we are interested in optimizing functions f : Ω ⊂ Rn −→ R over


subsets described by inequalities


g(x) = (g1(x), g2(x), ..., gm(x)) ≤ b in R^m  ⇐⇒  g1(x) ≤ b1, ..., gm(x) ≤ bm,   x ∈ Rⁿ.

Denote the set of the constraints

S = [g(x) ≤ b] = [g1(x) ≤ b1] ∩ [g2(x) ≤ b2] ∩ ... ∩ [gm(x) ≤ bm].

Example.

∗ S = [g1(x, y) = x² + y² ≤ 1] ∩ [g2(x, y) = x − y ≤ 0] is the plane region inside the unit disk and above the line y = x. Here (n = 2, m = 2).

∗∗ S = [g1(x, y, z) = 9 − (x² + y² + z²) ≤ 0] = [x² + y² + z² ≥ 9] is the domain outside the sphere centered at the origin with radius 3. Here (n = 3, m = 1).

∗∗∗ S = [g(x, y) = x² ≤ 0] = {(0, y) : y ∈ R} is the y-axis. Here (n = 2, m = 1).

Note that sets defined by inequalities contain interior points and boundary points. So, for comparing the values of a function f taken around an extreme point x*, it will be suitable to consider curves x(t) passing through x* and included in the constraint set [g ≤ b]. We will consider, this time, curves t ⟼ x(t) such that the set {x(t) : t ∈ [0, a], x(0) = x*}, for some a > 0, is included in [g ≤ b]. If x* is a local maximum of f, then we have

f(x(t)) ≤ f(x*)   ∀t ∈ [0, a].

Thus, 0 is a local maximum point for the function t ⟼ f(x(t)). Hence

(d/dt) f(x(t)) |_{t=0} = f'(x(t))·x'(t) |_{t=0} ≤ 0   ⟹   f'(x*)·x'(0) ≤ 0.

x'(0) is a tangent vector to the curve x(t) at the point x(0) = x*. This inequality must not depend on a particular curve x(t). So, we should have

f'(x*)·x'(0) ≤ 0   for any curve x(t) such that g(x(t)) ≤ b.

In this chapter, we will first characterize, in Section 4.1, the set of tangent vectors to such curves, then establish, in Section 4.2, the equations satisfied by a local extreme point x*. In Section 4.3, we identify the candidate points for optimality, and in Section 4.4, we explore the global optimality of a constrained local candidate point. Finally, we establish, in Section 4.5, the dependence of the optimal value of the objective function on certain parameters involved in the problem.

4.1 Cone of Feasible Directions

Let
x∗ ∈ S = [g(x)  b]

Definition 4.1.1 The set defined by

T = { x (0) : t −→ x(t) ∈ S, x ∈ C 1 [0, a], a > 0, x(0) = x∗ }

of all tangent vectors at x∗ to differentiable curves included in S, is called


cone of feasible directions at x∗ to the set [g  b].

We have the following characterization of the cone T at an interior point


x∗ of S.

Remark 4.1.1 We have

g continuous on Ω and x∗ ∈ [g(x) < b] =⇒ T = Rn .

That is, when x∗ is an interior point of S, then the cone at x∗ coincides


with the whole space.

Indeed, we have T ⊂ Rn . Let us prove that Rn ⊂ T . Let y ∈ Rn .


∗ If y = 0, then the constant curve x(t) = x∗ with t ∈ [0, 1] satisfies:

x ∈ C 1 [0, 1], x(0) = x∗ , x (t) = 0, x (0) = 0 = y, x(t) = x∗ ∈ S ∀t ∈ [0, 1].

So y = 0 ∈ T .
∗∗ Suppose y ≠ 0. We have x* ∈ ∩_{j=1}^{m} [gj(x) < bj], which is an open subset of Rⁿ. So there exists δ > 0 such that

Bδ(x*) ⊂ ∩_{j=1}^{m} [gj(x) < bj].

Now

x(t) = x* + ty ∈ Bδ(x*)   ∀t ∈ [−δ/(2|y|), δ/(2|y|)]

since

|x(t) − x*| = |t||y| ≤ (δ/(2|y|))|y| = δ/2 < δ.

We deduce that y ∈ T since the curve satisfies: x ∈ C¹[0, δ/(2|y|)],

x(0) = x*,   x'(t) = y,   x'(0) = y,   x(t) = x* + ty ∈ S   ∀t ∈ [0, δ/(2|y|)].

Example 1. Find and sketch the cone of feasible directions at the point
(−1/2, 1/2) belonging to the set

S = {(x, y) ∈ R2 : g1 (x, y) = x2 +y 2 −1  0 and g2 (x, y) = x−y  0}.

Solution: The set S is the part of the unit disk located above the line y = x.
The point (−1/2, 1/2) is an interior point of S; see Figure 4.1. Thus T = R2 .
FIGURE 4.1: S and the cone at (−1/2, 1/2)

We know a representation of the cone T when x∗ is a regular point of S.

Definition 4.1.2 A point x* ∈ S = [g ≤ b] is said to be a regular point of the constraints if the gradient vectors ∇gi(x*), i ∈ I(x*), are linearly independent, where

I(x*) = { i ∈ {1, ..., m} : gi(x*) = bi }.

Theorem 4.1.1
At a regular point x* ∈ S = [g ≤ b], where g is C¹ in a neighborhood of x*, the cone of feasible directions T is equal to the convex cone

C = { y ∈ Rⁿ : gi'(x*)·y ≤ 0, i ∈ I(x*) }.

Before giving the proof, we give some remarks and identify some cones.

Remark 4.1.2 The cone of feasible directions at a point x∗ ∈ S with


vertex x∗ is the translation of C by the vector x∗ given by

C(x∗ ) = x∗ + C = x∗ + {h ∈ Rn : gi (x∗ ).h  0, i ∈ I(x∗ )}


Constrained Optimization-Inequality Constraints 207

= {x∗ + h ∈ Rn : gi (x∗ ).h  0, i ∈ I(x∗ )}

= {x ∈ Rn : gi (x∗ ).(x − x∗ )  0, i ∈ I(x∗ )}

C(x∗ ) is the cone of feasible directions to the constraint set [g(x)  b]


passing through x∗ .

Example 2. Find and sketch the cone of feasible directions C(x, y) with vertex
√ √
(x, y) = (−1/2, −1/2), (0, 1) and (1/ 2, 1/ 2). The points belong to the set

S = {(x, y) ∈ R2 : g1 (x, y) = x2 +y 2 −1  0 and g2 (x, y) = x−y  0}.

Solution: Note that the three points belong to ∂S; see Figure 4.2.
FIGURE 4.2: Location of the points on S and C(−1/2, −1/2)

To determine the cone of feasible directions at each point (see Figures 4.2
and 4.3), we need to discuss the regularity of each point. First, we will need:

g'(x, y) = [ ∂g1/∂x  ∂g1/∂y ; ∂g2/∂x  ∂g2/∂y ] = [ 2x  2y ; 1  −1 ].

∗ At (−1/2, −1/2), only the equality constraint g2 = 0 is satisfied and the point is regular. We have

g2'(x, y) = (1  −1)   and   rank(g2'(−1/2, −1/2)) = 1,
C(−1/2, −1/2) = { (x, y) ∈ R² : (1  −1)·(x + 1/2, y + 1/2)ᵗ ≤ 0 } = { (x, y) ∈ R² : x − y ≤ 0 }.

∗∗ At (0, 1), only the equality constraint g1 = 0 is satisfied and the point is regular. We have

g1'(0, 1) = (0  2)   and   rank(g1'(0, 1)) = 1,

C(0, 1) = { (x, y) ∈ R² : (0  2)·(x − 0, y − 1)ᵗ ≤ 0 } = { (x, y) ∈ R² : y ≤ 1 }.

FIGURE 4.3: C(0, 1) = [y ≤ 1] and C(1/√2, 1/√2) = [x + y ≤ √2] ∩ [x ≤ y]

∗∗∗ At (1/√2, 1/√2), the two equality constraints g1 = g2 = 0 are satisfied and the point is regular. We have

g'(1/√2, 1/√2) = [ √2  √2 ; 1  −1 ],   rank(g'(1/√2, 1/√2)) = 2,   and

C(1/√2, 1/√2) = { (x, y) ∈ R² : [ √2  √2 ; 1  −1 ]·(x − 1/√2, y − 1/√2)ᵗ ≤ 0 }
= { (x, y) ∈ R² : x + y − √2 ≤ 0 and x − y ≤ 0 }.

Remark 4.1.3 The conclusion of the theorem is also true when the point
x∗ satisfies any one of the following regularity conditions [5] :
i) Each constraint gj (x) is affine for j ∈ I(x∗ ).

ii) There exists x̄ such that

gj (x̄)  bj
∀j ∈ I(x∗ ).
gj (x̄) < bj if gj is not affine

Example 3. Suppose that all the constraints are affine and that the set S is
described by

n

S = {x ∈ Rn : aij xj  bi , i = 1, . . . , m} = {x ∈ Rn : Ax  b}
j=1

where A = (aij ) is an m × n matrix and b ∈ Rm .


 
Here g(x) = Ax, g  (x) = A and gi (x) = ai1 ai2 . . . ain . Thus, from
the previous remark, any point of S is a regular point and the cone of feasible
directions at a point x∗ ∈ S with vertex x∗ is given by the polyhedra

n

C(x∗ ) = {x ∈ Rn : gi (x).(x − x∗ ) = aij (xj − x∗j )  0, i ∈ I(x∗ )}
j=1

n
 n

= {x ∈ Rn : aij xj  aij x∗j = bi , i ∈ I(x∗ )}.
j=1 j=1

Example 4. Suppose f is a C 1 function and

S = [z  f (x)] = {(x, z) ∈ Ω × R : z  f (x)} Ω ⊂ Rn .

Let x∗ be a relative interior point of the surface z = f (x). Find the cone at
x∗ .

Solution: If we set g(x, z) = z − f (x), then the set S can be described by

S = [g(x, z)  0] = {(x, z) ∈ Ω × R : g(x, z)  0}



and the point (x∗ , f (x∗ )) is a regular point since we have

 
g  (x∗ , f (x∗ )) = −f  (x∗ ) 1 =0 rank(g  (x∗ , f (x∗ ))) = 1.
The cone of feasible directions at the point (x∗ , f (x∗ )) with vertex (x∗ , f (x∗ ))
is given by

   
x − x∗
C(x∗ , f (x∗ )) = (x, z) ∈ Rn × R : g  (x∗ , f (x∗ )). 0 .
z − f (x∗ )
We have
   
 ∗ ∗ x − x∗   ∗
 x − x∗
g (x , f (x )). = −f (x ) 1 .
z − f (x∗ ) z − f (x∗ )

= −f  (x∗ ).(x − x∗ ) + z − f (x∗ )  0 ⇐⇒ z  f (x∗ ) + f  (x∗ ).(x − x∗ ).


Hence
 
C(x∗ , f (x∗ )) = (x, z) ∈ Rn × R : z  f (x∗ ) + f  (x∗ ).(x − x∗ ) .

The cone is the region below the hyperplane z = f (x∗ )+f  (x∗ ).(x−x∗ ), which
is also the tangent plane to the surface z = f (x) at x∗ .

In particular, when x∗ is a stationary point, i.e f  (x∗ ) = 0, the cone of feasible


directions at x∗ is the region below the horizontal tangent plane z = f (x∗ ).

Remark 4.1.4 Note that the representation of the cone of feasible directions obtained in the theorem used the fact that the point was regular. When this hypothesis is omitted, the representation is not necessarily valid.

Indeed, if we consider the set S defined by


g(x, y)  0 with g(x, y) = x2 ,
then S is reduced to the y axis. No point of S is regular since we have
   
g  (x, y) = 2x 0 and g  (0, y) = 0 0 on the y-axis.
We deduce that at each point (0, y0 ), we have
   
 x−0
C(0, y0 ) = (x, y) : g (0, y0 ). 0
y − y0
   
  x−0
= (x, y) : 0 0 . = 0 = R2 .
y − y0

However, the line


x(t) = 0 y(t) = y0 + t
remains included
  in
 S, passes
 through the point (0, y0 ) at t = 0, and has the
x (0) 0
direction = . Hence, the cone of feasible directions at each
y  (0) 1
point of S is equal to S. Note that, it also coincides with the tangent plane
at each point, since

g(x, y) = x2  0 ⇐⇒ g(x, y) = x2 = 0.

Proof. We have:

T ⊂ C : Indeed, let y ∈ T , y = 0, then

∃ x(t) differentiable such that g(x(t))  b ∀t ∈ [0, a] for some a > 0,


x(0) = x∗ , x (0) = y.

So 0 is a maximum for the function φi(t) = gi(x(t)) − bi, (i ∈ I(x*)), over the interval [0, a] since we have

φi(t) = gi(x(t)) − bi ≤ 0 = φi(0),

φi(0) = gi(x*) − bi = 0   because i ∈ I(x*).

Since gi and x(.) are C 1 , then φi is C 1 and Taylor’s formula gives

φi (t) − φi (0) = φi (0)t + tα(t) = t φi (0) + α(t) with lim α(t) = 0.
t→0+

If φi'(0) > 0 then there exists a0 ∈ (0, a) such that

|α(t)| < φi'(0)/2   ∀t ∈ (0, a0)   ⟹   α(t) > −φi'(0)/2   ∀t ∈ (0, a0).

We deduce that

φi(t) − φi(0) > t ( φi'(0) − φi'(0)/2 ) = t φi'(0)/2 > 0   ∀t ∈ (0, a0),

which contradicts that 0 is a maximum for φi on [0, a]. So y ∈ C since we have

φi'(0) = (d/dt)(gi(x(t))) |_{t=0} = ∇gi(x(t))·x'(t) |_{t=0} = gi'(x*)·y ≤ 0.

C ⊂ T : Let y ∈ C \ {0} . We distinguish between two situations:

First case Suppose that

gi (x∗ ).y < 0 ∀i ∈ I(x∗ ).

Since x∗ ∈ [gj (x) < bj ] for j ∈ I(x∗ ) and g continuous, there exists δ > 0 such
that

Bδ (x∗ ) ⊂ [gj (x) < bj ].


j ∈I(x∗ )

Consider the curve

x(t) = x∗ + ty t>0 where x(0) = x∗ and x (0) = y.

We claim that

δ
∃δ0 ∈ 0, min(δ, ) such that x(t) ∈ S = [g(x)  b] ∀t ∈ [0, δ0 ].
|y|

Indeed, for j ∈ I(x∗ ), we have

gj (x(t)) = gj (x∗ + ty) = gj (x∗ ) + tgj (x∗ ).y + tεj (t) with lim εj (t) = 0.
t→0

Since gj (x∗ ) = bj and gj (x∗ ).y < 0, we deduce the existence of
δ
δ0j ∈ (0, min(δ, )) such that
|y|
1
|εj (t)| < − gj (x∗ ).y.
2
Consequently, for δ0 = min∗ δ0j , we have ∀j ∈ I(x∗ ),
j∈I(x )

t t
gj (x(t)) < bj + tgj (x∗ ).y − gj (x∗ ).y = bj + gj (x∗ ).y < bj ∀t ∈ (0, δ0 ).
2 2

Second case Suppose that

gi (x∗ ).y = 0 ∀i ∈ {i1 , i2 , . . . , ip } ⊂ I(x∗ ) and

gi (x∗ ).y < 0 ∀i ∈ I(x∗ ) \ {i1 , i2 , . . . , ip } p < n.



Consider the system of equations

F (t, u) = G x∗ + ty +t G (x∗ )u − B = 0
where, for t fixed, u ∈ Rp is the unknown, and where

G = (gi1 , gi2 , . . . , gip ), B = (bi1 , bi2 , . . . , bip ), rank(G (x∗ )) = p.


Note that F is well defined on an open subset of R × Rp . Indeed, if g is C 1 on
Bδ (x∗ ) ⊂ {x ∈ Rn : gj (x) < bj , j ∈ I(x∗ )},
δ δ
then ∀(t, u) ∈ (−δ0 , δ0 ) × Bδ0 (0) with δ0 = min , , we have
2 y 2 G (x∗ )
(x∗ + ty +t G (x∗ )u) − x∗  |t| y + u G (x∗ )

δ δ δ δ
< y + G (x∗ ) = + =δ
2 y 2 G (x∗ ) 2 2

=⇒ (x∗ + ty +t G (x∗ )u) ∈ Bδ (x∗ ).


We have
F (t, u) = G(X(t, u)) − B X(t, u) = x∗ + ty +t G (x∗ )u

p
 ∂Gi ∂Xj ∂Gip ∗
Xj (t, u) = x∗j + tyj + l
(x∗ )ul = (x )
∂xj ∂up ∂xj
l=1

 ∂Gi ∂Xj n  ∂Gi n


∂Fk k k
∂Gip ∗
(t, u) = = (X(t, u)) (x )
∂up j=1
∂X j ∂u p j=1
∂x j ∂xj
 ∂F  t
k
(t, u) = G (X(t, u)) G (x∗ ) .
∂ui k,i=1,··· ,m

By hypotheses, we have
– F is a C 1 function in the open set A = (−δ0 , δ0 ) × Bδ0 (0)

– F (0, 0) = G(x∗ ) − B = 0

– (0, 0) ∈ (−δ0 , δ0 ) × Bδ0 (0), so (0, 0) is an interior point

∂(F1 , . . . , Fp )  t 
– det(∇u F (0, 0)) = = det G (x∗ ) G (x∗ ) = 0 since
∂(u1 , . . . , up )
G (x∗ ) has rank p.

Then, by the implicit function theorem, there exists open balls B (0) ⊂
(−δ0 , δ0 ), Bη (0) ⊂ Bδ0 (0), , η > 0 with B (0) × Bη (0) ⊆ A, and such that

det(∇u F (t, u)) = 0 in B (0) × Bη (0)

∀t ∈ B (0), ∃!u ∈ Bη (0) : F (t, u) = 0

u : (−, ) −→ Bη (0); t −→ u(t) is a C 1 function.

Thus, the curve

x(t) = X(t, u(t)) = x∗ + ty +t G (x∗ )u(t)


is, by construction, a curve in S since we have for each t ∈ (−, )

G(x(t)) − B = 0 ⇐⇒ gj (x(t)) − bj = 0 ∀j ∈ {i1 , i2 , . . . , ip } ⊂ I(x∗ )

x(t) ∈ Bδ (x∗ ) ⊂ {x ∈ Rn : gj (x) < bj , j ∈ I(x∗ )}

⇐⇒ gj (x(t)) − bj < 0 ∀j ∈ I(x∗ ).

By differentiating both sides of

F (t, u(t)) = G(x(t)) − B = G(X(t, u(t))) − B = 0


with respect to t, we get

n
 ∂G ∂Xj
d
0= G(x(t)) =
dt j=1
∂Xj ∂t
m
  ∂Gi m
∂Gi ∂Xj ∂ul
Xj (t, u) = x∗j + tyj + l
(x∗ )ul = yj + l
(x∗ )
∂xj ∂t ∂xj ∂t
l=1 l=1
#
d  n
∂G   m
∂Gl ∗ ∂ul 
0= G(x(t)) = (X(t, u)) yj + (x )
dt t=0
j=1
∂xj ∂xj ∂t
l=1 t=0

= G (x∗ )y + G (x∗ )t G (x∗ )u (0).

Since we have G (x∗ )y = 0 and that G (x∗ )t G (x∗ ) is nonsingular and definite
positive, we conclude that

G (x∗ )t G (x∗ )u (0) = G (x∗ )y = 0 =⇒ u (0) = 0.



Hence
x (0) = y +t G (x∗ )u (0) = y.

Now, for j ∈ I(x∗ ) \ {i1 , i2 , . . . , ip }, we have

gj (x(t)) = gj (x(0)) + tgj (x∗ ).x (0) + tη(t) = bj + tgj (x∗ ).y + tη(t)

with lim η(t) = 0. Then, from the first case, there exists 0 ∈ (0, ) such that
t→0

gj (x(t)) < bj ∀t ∈ (0, 0 )

thus

x(t) ∈ [gj (x)  bj ] for all j ∈ I(x∗ ) \ {i1 , i2 , . . . , ip }.

Finally, y is a tangent vector to the curve x(t) included in S for t ∈ [0, 0 /2],
so y ∈ T .

*C is a cone of Rn since for y ∈ C and κ ∈ R+ , we have

gi (x∗ )(κy) = κgi (x∗ )y  0

for i ∈ I(x∗ ). Thus κy ∈ C.

*C is a convex of Rn since for y, y  ∈ C and s ∈ [0, 1], we have

gi (x∗ )(sy + (1 − s)y  ) = sgi (x∗ )y + (1 − s)gi (x∗ )y   s.0 + (1 − s).0 = 0

for i ∈ I(x∗ ). Thus sy + (1 − s)y  ∈ C.



Solved Problems

1. – Find and draw the cone of feasible directions at the point (0, 3, 0)
belonging to the set x2 + y 2 + z 2  9.

Solution: Set g(x, y, z) = 9 − (x2 + y 2 + z 2 ). We have


FIGURE 4.4: The constraint set and the cone C(0, 3, 0) = [y ≤ 3]

g'(x, y, z) = −2x i − 2y j − 2z k,   g'(0, 3, 0) = −6j ≠ 0,   rank(g'(0, 3, 0)) = 1.

So (0, 3, 0) is a regular point and the cone of feasible directions to [g ≥ 0], with vertex at this point (see Figure 4.4), is given by

C(0, 3, 0) = { (x, y, z) ∈ R³ : g'(0, 3, 0)·(x − 0, y − 3, z − 0)ᵗ ≥ 0 }.

We have

(0  −6  0)·(x − 0, y − 3, z − 0)ᵗ ≥ 0 ⇐⇒ 0(x − 0) − 6(y − 3) + 0(z − 0) ≥ 0

⇐⇒ y ≤ 3 :   C(0, 3, 0) = [y ≤ 3].

2. – Find the cone of feasible directions at the point (0, 1, 0) to the set

g(x, y, z) = (g1 (x, y, z), g2 (x, y, z))  (1, 1)

g1 (x, y, z) = x + y + z, g2 (x, y, z) = x2 + y 2 + z 2 .

Solution: The set S = [g ≤ (1, 1)], as illustrated in Figure 4.5, is the part of the unit ball located below the plane x + y + z = 1.

FIGURE 4.5: [g ≤ (1, 1)] and C(0, 1, 0)

The point (0, 1, 0) ∈ S satisfies the two constraints g1 (x, y, z) = g2 (x, y, z) = 1


and is a regular point since we have:

g'(x, y, z) = [ ∂g1/∂x  ∂g1/∂y  ∂g1/∂z ; ∂g2/∂x  ∂g2/∂y  ∂g2/∂z ] = [ 1  1  1 ; 2x  2y  2z ],

g'(0, 1, 0) = [ 1  1  1 ; 0  2  0 ]   has rank 2.

The cone of feasible directions to the set S at the point (0, 1, 0), with vertex this point, is the set of points (x, y, z) such that

g'(0, 1, 0)·(x − 0, y − 1, z − 0)ᵗ = [ 1  1  1 ; 0  2  0 ]·(x, y − 1, z)ᵗ ≤ (0, 0)ᵗ

⇐⇒ x + y − 1 + z ≤ 0   and   2(y − 1) ≤ 0.

Thus

C(0, 1, 0) = { (x, y, z) ∈ R³ : x + y + z ≤ 1 and y ≤ 1 }.

3. – Show that the sets

z ≥ √(x² + y²)   and   z ≥ (1/10)(x² + y²) + 5/2

have a common cone of feasible directions at the point (3, 4, 5).

Solution: Set

g1(x, y, z) = z − √(x² + y²),   g2(x, y, z) = z − (1/10)(x² + y²) − 5/2.

We have

g1(3, 4, 5) = g2(3, 4, 5) = 0,

∇g1 = −(x/√(x² + y²)) i − (y/√(x² + y²)) j + k,   ∇g1(3, 4, 5) = −(3/5) i − (4/5) j + k ≠ 0,

∇g2(x, y, z) = −(x/5) i − (y/5) j + k,   ∇g2(3, 4, 5) = −(3/5) i − (4/5) j + k ≠ 0,

rank(g1'(3, 4, 5)) = 1,   rank(g2'(3, 4, 5)) = 1.


So (3, 4, 5) is a regular point for the two constraints g1(x, y, z) = 0 and g2(x, y, z) = 0. Therefore, the cones of feasible directions at the point (3, 4, 5) for the sets [g1(x, y, z) ≥ 0] and [g2(x, y, z) ≥ 0], with vertex (3, 4, 5), are given respectively by:

C1(3, 4, 5) = { (x, y, z) ∈ R³ : g1'(3, 4, 5)·(x − 3, y − 4, z − 5)ᵗ ≥ 0 },
C2(3, 4, 5) = { (x, y, z) ∈ R³ : g2'(3, 4, 5)·(x − 3, y − 4, z − 5)ᵗ ≥ 0 }.

Clearly, since g1'(3, 4, 5) = g2'(3, 4, 5), the two sets are equal, and we have for i = 1, 2

gi'(3, 4, 5)·(x − 3, y − 4, z − 5)ᵗ ≥ 0 ⇐⇒ (−3/5  −4/5  1)·(x − 3, y − 4, z − 5)ᵗ ≥ 0

⇐⇒ −(3/5)(x − 3) − (4/5)(y − 4) + (z − 5) ≥ 0.

Hence, the two given sets have a common cone of feasible directions at this point (see the illustrations in Figure 4.6), characterized by the inequality

−(3/5)(x − 3) − (4/5)(y − 4) + (z − 5) ≥ 0.

FIGURE 4.6: Sets [g1 ≥ 0], [g2 ≥ 0] and C(3, 4, 5)



4.2 Necessary Condition for Local Extreme Points/


Inequality Constraints

In what follows, we will be interested in the study of the maximization problem


max f(x1, ..., xn)   subject to   g1(x1, ..., xn) ≤ b1, ..., gm(x1, ..., xn) ≤ bm.

The results established are strongly related to the fact that we are maximizing a
function f under inequality constraint g(x)  b. To solve a minimization problem
min f (x), we can maximize −f (x), and if a constraint is given in the form gj (x)  bj ,
we can transform it into −gj (x)  −bj . An equality constraint gj (x) = bj can be
equivalently written as gj (x)  bj and −gj (x)  −bj .

We have the following preliminary lemma

Lemma 4.2.1 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neigh-


borhood of x∗ ∈ [g(x)  b]. If x∗ is a regular point and a local maximum
point of f subject to these constraints, then we have

∀y ∈ Rⁿ :  gi'(x*)·y ≤ 0, i ∈ I(x*)   ⟹   f'(x*)·y ≤ 0.

Proof. Let y ∈ Rn such that g  (x∗ ).y  0. Because, x∗ is a regular point


of the set [g(x)  b], then y ∈ C(x∗ ), the cone of feasible directions at x∗ to
the set [g(x)  b]. So ∃ a > 0, ∃ x ∈ C 1 [0.a] such that

g(x(t))  c ∀t ∈ [0, a], x(0) = x∗ , x (0) = y.


Now, since x* is a local maximum point of f on the set g(x) ≤ b, there exists δ ∈ (0, a) such that ∀t ∈ (0, δ),

f(x(t)) ≤ f(x*) = f(x(0)) ⇐⇒ [f(x(t)) − f(x(0))]/(t − 0) ≤ 0,

from which we deduce

f'(x*)·y = f'(x(t))·x'(t) |_{t=0} = (d/dt) f(x(t)) |_{t=0} = lim_{t→0⁺} [f(x(t)) − f(x(0))]/(t − 0) ≤ 0.

Remark 4.2.1 The lemma generalizes the necessary condition for a local
maximum point x∗ in a convex S:

f  (x∗ ).(x − x∗ )  0 ∀x ∈ S.

Without assuming the set S = [g(x)  b] is convex, the local maximum


point must satisfy an inequality on the convex cone C(x∗ ):

f  (x∗ ).(x − x∗ )  0 ∀x ∈ C(x∗ ).

As a consequence of the lemma, we have the following characterization of


a constrained local maximum point.

Theorem 4.2.1 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neigh-


borhood of x∗ ∈ [g(x)  b]. If x∗ is a regular point and a local maximum
point of f subject to these constraints, then

∃ λj* ≥ 0,   j ∈ I(x*) = { k ∈ N, 1 ≤ k ≤ m : gk(x*) = bk },

such that

∂f/∂xi(x*) − Σ_{j∈I(x*)} λj* ∂gj/∂xi(x*) = 0,   i = 1, ..., n.

The proof uses an argument of linear algebra called “Farkas-Minkowski’s


Lemma” [5] that says:

Farkas-Minkowski's Lemma. Let A be a p × n real matrix and c ∈ Rⁿ. Then, the inclusion

{ x ∈ Rⁿ : Ax ≤ 0 } ⊂ { x ∈ Rⁿ : c·x ≤ 0 }

is satisfied if and only if

∃ λ = (λ1, ..., λp) ∈ R^p,  λ ≥ 0,  such that c = ᵗA λ.

Proof. Set I(x∗ ) = {i1 , i2 , · · · , ip },


⎡ ∂gi1 ∂gi1 ∂gi1 ⎤
∂x1 ∂x2 ... ∂xn
⎢ ⎥
⎢ ⎥
⎢ ∂gi2 ∂gi2 ∂gi2 ⎥
A = −[gj (x∗ )]j∈I(x∗ ) = −⎢
⎢ ∂x1 ∂x2 ... ∂xn ⎥

⎢ .. .. .. .. ⎥
⎣ . . . . ⎦
∂gip ∂gip ∂gip
∂x1 ∂x2 . . . ∂xn
⎡ ∂f ⎤
∂x1
⎢ ⎥
 ∂f  ⎢ ⎥
t  ∗ t ∂f ⎢ .. ⎥
c = − f (x ) = − ,··· , = −⎢ . ⎥.
∂x1 ∂xn ⎢ ⎥
⎣ ⎦
∂f
∂xn

From Farkas-Minkowski’s Lemma, the inclusion

{y = (y1 , · · · , yn ) ∈ Rn : Ay  0} = {y ∈ Rn : gi (x∗ ).y  0}


i∈I(x∗ )

⊂ {y ∈ Rn : f  (x∗ ).y  0} = {y ∈ Rn : c.y = −f  (x∗ ).y  0}

is satisfied, then ∃ λ∗ = (λ∗1 , . . . , λ∗p ) ∈ Rp , λ∗  0 such that


p
 p

−t f  (x∗ ) = c = t Aλ∗ = − λ∗k t gik (x∗ ) ⇐⇒ f  (x∗ ) = λ∗k gik (x∗ ).
k=1 k=1

So we are led to solve the system

∂f  ∂gj
(x) − λj (x) = 0 i = 1, . . . , n, λj  0 j ∈ I(x∗ )
∂xi ∗
∂x i
j∈I(x )

gj (x) − bj = 0 ∀j ∈ I(x∗ )

gj (x) − bj < 0 ∀j ∈ I(x∗ ).

To find a practical way to solve the system, we introduce the complementary


slackness conditions

λj  0, with λj = 0 if gj (x) < bj , j = 1, . . . , m



When gj (x∗ ) = bj , we say that the constraint gj (x)  bj is active or binding


at x∗ .
When gj (x∗ ) < bj , we say that the constraint gj (x)  bj is inactive or slack
at x∗ .

We introduce the Lagrangian function

L(x, λ) = f (x) − λ1 (g1 (x) − b1 ) − . . . − λm (gm (x) − bm )


where λ1 , · · · , λm are the generalized Lagrange multipliers.

Then, we reformulate the previous theorem as follows:

Theorem 4.2.2 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neighbor-


hood of x∗ ∈ [g(x)  b]. If x∗ is a regular point and a local maximum point
of f subject to these constraints, then ∃!λ∗ = (λ∗1 , . . . , λ∗m ) such that the
following Karush-Kuhn-Tucker (KKT) conditions hold at (x∗ , λ∗ ):
∂L/∂xi(x*, λ*) = ∂f/∂xi(x*) − Σ_{j=1}^{m} λj* ∂gj/∂xi(x*) = 0,   i = 1, ..., n,

λj* ≥ 0,  with λj* = 0 if gj(x*) < bj,   j = 1, ..., m.

Remark 4.2.2 The numbers λ∗j , j ∈ I(x∗ ) are unique. Indeed, suppose
there exist λ = λ1 , · · · , λp  and λ = λ1 , · · · , λp  solutions of

c =t Aλ and c =t Aλ

then t A(λ − λ ) = 0, which we can write



(λj − λj )gj (x∗ ) = 0.
j∈I(x∗ )

Since the vectors gj (x∗ ) are linearly independent, deduce that

(λj − λj ) = 0 for each j ∈ I(x∗ ).



Remark 4.2.3 If I(x∗ ) = ∅ then the Karush-Kuhn-Tucker conditions re-


duce to ∇f (x∗ ) = 0 which is expected since then the point x∗ belongs to the
interior of the set of the constraints. On the other hand, this shows that
the Kuhn-Tucker conditions are not sufficient for optimality. In fact, when
x∗ is an interior point, it could be a local maximum, a local minimum or
a saddle point.

First, let us practice writing the KKT conditions through simple examples.

Example 1. Solve the problem

max (x − 2)3 subject to g(x) = −x  0.

Solution: Since f, g, are C 1 in R, consider the Lagrangian

L(x, α) = (x − 2)3 − α(−x)


and write the KKT conditions:

(1) ∂L/∂x = 3(x − 2)² + α = 0

(2) α ≥ 0  with α = 0 if −x < 0.
∗ ∗
From (1), the only possible solution is (x , α ) = (2, 0) and 2 is an interior
point to the constraint set [0, +∞). Thus we have a critical point for the
Lagrangian without x∗ = 2 being the maximum point solution of the problem
since y = (x − 2)3 is increasing on R (y  = 3(x − 2)2 ). Therefore, it doesn’t
attain its maximal value on [0, +∞).

Example 2. Solve the problem

max (x − 2)3 subject to g1 (x) = −x  0, g2 (x) = x  3.

Solution: i) The set of constraints is the set reduced to the closed interval
[0, 3]. The problem consists of maximizing the real function y = (x−2)3 = f (x)
which is increasing on R (f  (x) = 3(x − 2)2 ). Therefore, it attains its maximal
value on [0, 3], at x = 3.

ii) Writing the KKT conditions. Note that f, g1 , g2 , are C 1 in R. Consider


the Lagrangian

L(x, α, β) = (x − 2)3 − α(−x) − β(x − 3).


The KKT conditions are

(1) ∂L/∂x = 3(x − 2)² + α − β = 0

(2) α ≥ 0  with α = 0 if −x < 0

(3) β ≥ 0  with β = 0 if x < 3.

iii) Solving the system. We proceed by discussing whether a constraint is


active or not.

∗ If x < 3, then β = 0, and with (1), we have

3(x − 2)2 + α = 0 =⇒ α = −3(x − 2)2  0.


With (2), we deduce that α = 0, and then by (1), we obtain x = 2, which is
an interior point to the set of the constraint [0, 3]. Thus a candidate point is

x=2 with (α, β) = (0, 0).

∗∗ If x = 3, then x > 0, and by (2), we deduce that α = 0. Then, by (1), we


have

3(x − 2)2 + α = 3(3 − 2)2 + 0 − β = 0 =⇒ β = 3 > 0.

Thus we obtain
x=3 with (α, β) = (0, 3).
Note that only the constraint g2 (x) = x is active at 3. We have

g2 (x) = [ 1 ] = 0 rank(g2 (3)) = 1.


Thus point 3 is regular and, therefore, it is a candidate point.

iv) Conclusion. x = 2 is not the optimal point since f(2) = 0 < 1 = f(3). Because f is increasing, 3 is the maximum point. We can also conclude by using the extreme value theorem since f is continuous on the closed bounded constraint set [0, 3].

Example 3. Distance problem. For (a, b) ∈ R2 with a2 + b2 > 1, solve the


problem
min f(x, y) = ‖(x, y) − (a, b)‖²   subject to   g(x, y) = x² + y² ≤ 1.

Solution: i) The problem describes the shortest distance of the point (a, b)
to the unit disk (here (a, b) is located outside the unit disk). This distance is
attained by the extreme value theorem since f is continuous on the constraint
set [g  1], which is a closed and bounded subset of R2 . The case (a, b) = (2, 3)
is illustrated in Figure 4.7 and a graphical solution is described in Figure 4.8
using level curves.
FIGURE 4.7: Graph of z = (x − 2)² + (y − 3)² on x² + y² ≤ 1

ii) KKT conditions. f and g being C 1 , introduce, for the corresponding


maximization problem, the Lagrangian

L(x, y, λ) = −(x − a)2 − (y − b)2 − λ(x2 + y 2 − 1).


The necessary conditions to satisfy are:

(i) Lx = −2(x − a) − 2λx = 0 ⇐⇒ x(1 + λ) = a

(ii) Ly = −2(y − b) − 2λy = 0 ⇐⇒ y(1 + λ) = b

(iii) λ ≥ 0  with λ = 0 if x² + y² < 1.

∗ If x2 + y 2 < 1 then λ = 0, and then (i) and (ii) yield (x, y) = (a, b) which
leads to a contradiction since a2 + b2 > 1.
∗∗ If x² + y² = 1, then from (i) and (ii) we deduce that (x, y) = (a/(1 + λ), b/(1 + λ)). By substitution in x² + y² = 1, we get

(a/(1 + λ))² + (b/(1 + λ))² = 1,  λ ≥ 0 ⇐⇒ λ = √(a² + b²) − 1.

Thus, the only solution of the system is the point

(x*, y*) = ( a/√(a² + b²), b/√(a² + b²) ),

where the constraint is active. Finally, the point is regular since we have

g'(x, y) = (2x  2y),   and   rank g'(x*, y*) = 1.

Therefore, the point is a candidate for optimality.

Conclusion. Now, since it is guaranteed that the maximum value is attained, it must be attained at the candidate point found. Hence,

max_{x²+y²≤1} f(x, y) = f(x*, y*) = −(√(a² + b²) − 1)²,

so the shortest squared distance from (a, b) to the unit disk is (√(a² + b²) − 1)².
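A numerical spot-check of this formula (illustrative sketch, for the sample point (a, b) = (2, 3)):

a = 2; b = 3;
NMinimize[{(x - a)^2 + (y - b)^2, x^2 + y^2 <= 1}, {x, y}]
(Sqrt[a^2 + b^2] - 1)^2  (* both values should agree *)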

FIGURE 4.8: Minimal value of z = (x − 2)² + (y − 3)² on x² + y² = 1

Remark 4.2.4 The conclusion of the Karush-Kuhn-Tucker theorem is


also true when the extreme point x∗ satisfies any one of the following reg-
ularity conditions (see [14], [5]):

i) Linear constraints: gj (x) is linear, j = 1, · · · , m.

ii) Slater’s condition: gj (x) is convex and there exists x̄ such that gj (x̄) <
bj , j = 1, · · · , m (with f concave).

iii) Concave programming (with f concave): gj (x) is convex and there


exists x̄ such that for any j = 1, . . . , m,

gj (x̄)  bj and gj (x̄) < bj if gj is not linear.

iv) The rank condition: The constraints gi1 , . . . , gip , (p  m), are bind-
ing. The rank of the matrix
⎡  ∗ ⎤
gi1 (x )
⎢ .. ⎥
⎣ . ⎦
gip (x∗ )

is equal to p.
This last case is the one we consider here in our study. These four conditions
are not equivalent to one another. For example, the uniqueness of the
Lagrange multipliers is established under the rank condition iv).

Example 4. (Non-uniqueness of Lagrange multipliers).


Solve the problem
max f (x, y) = x1/2 y 1/4 subject to
2x + y  3, x + 2y  3, x+y 2 with x  0, y  0.

Solution: To simplify calculations, we will transform the problem to an equiv-


alent one as we did for distance problems, where the square distance is con-
sidered instead of the distance itself. Here, to avoid the powers, we will use
the logarithmic function.

i) The constraint set, sketched in Figure 4.9, and defined by:

S = {(x, y) ∈ R+ × R+ : 2x + y  3, x + 2y  3, x + y  2}
is a closed bounded subset of R2 . f is continuous on S, then, by the extreme
value theorem,
∃(x∗ , y∗ ) ∈ S such that f (x∗ , y∗ ) = max f (x, y).
(x,y)∈S

Note that, we have


f (0, y) = f (x, 0) = 0 ∀x  0, y0

f (x, y) > 0 ∀x > 0, y > 0.



So f (x∗ , y∗ ) = max f (x, y) > 0. Therefore, at the maximum point, the con-
(x,y)∈S
straints x  0 and y  0 cannot be binding.
FIGURE 4.9: Graph of z = x^(1/2) y^(1/4) on S

ii) Set Ω = (0, +∞) × (0, +∞).


As a consequence of i), we have

max f (x, y) = max ∗ f (x, y)


(x,y)∈S (x,y)∈S

where

S ∗ = {(x, y) ∈ Ω : 2x + y  3, x + 2y  3, x + y  2}.
Set
1 1
F (x, y) = ln f (x, y) = ln(x) + ln(y) (x, y) ∈ Ω
2 4
F is well defined and we have

max f (x, y) = max ∗ f (x, y) = max ∗ eln F (x,y)


(x,y)∈S (x,y)∈S (x,y)∈S

max F (x, y) = ln max f (x, y) = ln f (x∗ , y∗ )


(x,y)∈S ∗ (x,y)∈S ∗

since the functions t −→ ln t and t −→ et are increasing.


Note that S ∗ is a bounded subset of R2 but not closed. Thus, we cannot apply
the extreme value theorem to conclude about the existence of a solution to
the problem

max F (x, y).


(x,y)∈S ∗

iii) Since F and the constraints are C 1 in Ω, to solve the problem, we write
the KKT conditions for the associated Lagrangian

L(x, y, λ1, λ2, λ3) = (1/2) ln x + (1/4) ln y − λ1(2x + y − 3) − λ2(x + 2y − 3) − λ3(x + y − 2).

The necessary conditions to satisfy are:

(i) Lx = 1/(2x) − 2λ1 − λ2 − λ3 = 0

(ii) Ly = 1/(4y) − λ1 − 2λ2 − λ3 = 0

(iii) λ1 ≥ 0  with λ1 = 0 if 2x + y < 3

(iv) λ2 ≥ 0  with λ2 = 0 if x + 2y < 3

(v) λ3 ≥ 0  with λ3 = 0 if x + y < 2.

* If x + y < 2 then λ3 = 0, and discuss the cases

◦ if x + 2y < 3 then λ2 = 0, and


λ1 = 1/(4x) = 1/(4y) > 0   ⟹   y = x.

Because λ1 > 0 then 2x + y = 3. Thus, we have (x, y) = (1, 1). But x + y =


1 + 1 = 2 : contradiction.

◦◦ if x + 2y = 3, then

− if 2x + y < 3, then λ1 = 0, and


λ2 = 1/(2x) = 1/(8y) > 0   ⟹   x = 4y.

Because x + 2y = 3, then, we have (x, y) = (2, 1/2). But x + y = 2 + 1/2 > 2:


contradiction.

− if 2x + y = 3, then (x, y) = (1, 1). But x + y = 1 + 1 = 2:


contradiction.

** If x + y = 2, then by drawing the constraint set, we see that the only point
satisfying x + y = 2 is (x, y) = (1, 1) for which we have also 2x + y = 3 and
x + 2y = 3 with

2λ1 + λ2 + λ3 = 1 λ1 + 2λ2 + λ3 = 1 ⇐⇒ λ1 = λ2 λ3 = 1 − 3λ1 .

iv) Conclusion. The only point candidate is (1, 1), and it is the maximum
point since we know that such a point exists.

However, we see that we do not have uniqueness of the Lagrange multipliers,


but still we can apply the KKT conditions since the constraints are linear.

Note also that the rank condition is not satisfied since we have

g(x, y) = (2x + y, x + 2y, x + y),   g(1, 1) = (3, 3, 3),

g'(x, y) = [ 2  1 ; 1  2 ; 1  1 ],   rank(g'(1, 1)) = 2 ≠ 3,
and the three constraints are active at (1, 1); see Figure 4.10.
FIGURE 4.10: Maximal value of z = x^(1/2) y^(1/4) on S
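As a quick numerical confirmation (illustrative sketch) that the constrained maximum is indeed attained at (1, 1):

NMaximize[{x^(1/2) y^(1/4), 2 x + y <= 3, x + 2 y <= 3, x + y <= 2, x >= 0, y >= 0}, {x, y}]
(* expected: maximal value 1 at x = 1, y = 1 *)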



Mixed Constraints

Some maximization problems take the form

max f(x)   subject to   gj(x) = bj, j = 1, ..., r (r < n),   and   hk(x) ≤ ck, k = 1, ..., s.
We have:

Theorem 4.2.3 Let f , g = (g1 , . . . , gr ), and h = (h1 , . . . , hs ) be C 1 func-


tions in a neighborhood of x∗ ∈ [g(x) = b] ∩ [h(x)  c]. If x∗ is a regular
point and a local maximum point of f subject to these constraints, then
∃!(λ∗ , μ∗ ), λ∗ = (λ∗1 , . . . , λ∗r ), μ∗ = (μ∗1 , . . . , μ∗s ) such that the following
Karush-Kuhn-Tucker (KKT) conditions hold at (x∗ , λ∗ , μ∗ ):
∂L/∂xi(x*, λ*, μ*) = ∂f/∂xi(x*) − Σ_{j=1}^{r} λj* ∂gj/∂xi(x*) − Σ_{k=1}^{s} μk* ∂hk/∂xi(x*) = 0,   i = 1, ..., n,

∂L/∂λj(x*, λ*, μ*) = −(gj(x*) − bj) = 0,   j = 1, ..., r,

μk* ≥ 0,  with μk* = 0 if hk(x*) < ck,   k = 1, ..., s,

where

L(x, λ, μ) = f(x) − Σ_{j=1}^{r} λj(gj(x) − bj) − Σ_{k=1}^{s} μk(hk(x) − ck).

Proof. The maximization problem is equivalent to



⎪ gj (x)  bj , j = 1, . . . , r



max f (x) subject to −gj (x)  −bj , j = 1, . . . , r





hk (x)  ck , k = 1, . . . , s

By applying the KKT conditions with the Lagrangian

r
 r
 s


L (x, τ, κ, μ) = f (x)− τj (gj (x)−bj )− κj (−gj (x)+bj )− μk (hk (x)−ck )
j=1 j=1 k=1

there exist unique multipliers τj∗ , κ∗j , μ∗k such that the necessary conditions are
satisfied:

⎧ r
⎪ ∂L∗ ∗ ∗ ∗ ∗ ∂f ∗  ∂gj ∗

⎪ (x , τ , κ , μ ) = (x ) − τj∗ (x )



⎪ ∂xi ∂xi ∂xi


j=1





⎪  r s

⎪ ∗ ∂gj ∗ ∂hk ∗


⎨ + κ j (x ) − μ∗k (x ) = 0 i = 1, . . . , n
j=1
∂xi ∂xi
k=1





⎪ τj∗  0, with τj∗ = 0 if gj (x∗ ) < bj , j = 1, . . . , r







⎪ κ∗j  0, with κ∗j = 0 if − gj (x∗ ) < −bj , j = 1, . . . , r




⎩ ∗
μk  0, with μ∗k = 0 if hk (x∗ ) < ck , k = 1, . . . , s.

Setting

λ∗ = τ ∗ − κ ∗ and

L(x, λ, μ) = L∗ (x, τ, κ, μ)

r
 s

= f (x) − (τj − κj )(gj (x) − bj ) − μk (hk (x) − ck )
j=1 k=1

we deduce that (x∗ , λ∗ , μ∗ ) is also a solution of the KKT conditions corre-


sponding to Lagrangian L. Moreover, for j = 1, . . . , r, λj changes sign

λj = τj − κj = −κj  0 if gj (x) < bj


λ j = τj − κ j = τj  0 if gj (x) > bj .

Uniqueness of λ∗ and μ∗ . Suppose λ∗ and μ∗ are not uniquely defined,


then we would have for some λ = λ and μ = μ
r s

∂f ∗ ∂gj ∗ ∂hk ∗
(x ) − λj (x ) − μk (x ) = 0
∂xi j=1
∂xi ∂xi
k=1

r s

∂f ∗ ∂gj ∗ ∂hk ∗
(x ) − λj (x ) − μk (x ) = 0.
∂xi j=1
∂x i ∂xi
k=1

Subtracting the two equalities and using the fact that x∗ is a regular point,
we obtain a contradiction:

r
 s

(λj − λj )∇gj (x∗ ) − (μk − μk )∇hk (x∗ ) = 0 =⇒ (λ , μ ) = (λ, μ).
j=1 k=1

Nonnegativity constraints

Some maximization problems take the form


⎨ gj (x)  bj , j = 1, . . . , m
max f (x) subject to

x1  0, . . . , xn  0.

We introduce the following n new constraints:

gm+1 (x) = −x1  0, . . . . . . . . . . . . , gm+n (x) = −xn  0.


The maximization problem is equivalent to


⎨ gj (x)  bj , j = 1, . . . , m
max f (x) subject to

gj (x)  0, j = m + 1, . . . , m + n.

By applying the KKT conditions, for a regular point x, with the Lagrangian
m
 n

L∗ (x, λ, μ) = f (x) − λj (gj (x) − bj ) − μk (−xk )
j=1 k=1

there exist unique multipliers λj , μk such that


⎧ m
⎪ ∂L∗ ∂f  ∂gj

⎪ −

⎪ (x, λ, μ) = (x) λj (x) + μi = 0 i = 1, . . . , n

⎪ ∂x i ∂x i ∂x i
⎨ j=1


⎪ λj  0, with λj = 0 if gj (x) < bj , j = 1, . . . , m






μk  0, with μk = 0 if xk > 0, k = 1, . . . , n.

We deduce then:

Theorem 4.2.4 Let f and g = (g1 , . . . , gr ) be C 1 functions in a neigh-


borhood of x∗ ∈ [g(x)  b] ∩ [x  0]. If x is a regular point and a local
maximum point of f subject to these constraints, then ∃!λ∗ = (λ∗1 , . . . , λ∗m )
such that the following Karush-Kuhn-Tucker (KKT) conditions hold
at (x, λ):

⎧ m
⎪ ∂L ∂f  ∂gj

⎪ − (x)  0

⎪ (x, λ) = (x) λj (=0 if xi > 0 ),
⎨ ∂xi ∂xi j=1
∂xi
⎪ i = 1, . . . , n





λj  0, with λj = 0 if gj (x) < bj , j = 1, . . . , m

where the Lagrangian is


m

L(x, λ) = f (x) − λj (gj (x) − bj ).
j=1

Solved Problems

1. – Importance of KKT hypotheses. Show that the KKT conditions


fail to hold at the optimal solution of the problem

max f(x, y) = x² + y   subject to   g1(x, y) = (x − 2)² = 0,   g2(x, y) = (y + 1)³ ≤ 0.

Solution: i) The set of constraints, graphed in Figure 4.11, is

S = {(x, y) : g1 (x, y) = 0 and g2 (x, y)  0}


= {(x, y) : x=2 and y  −1}.

FIGURE 4.11: Constraint set S

ii) The Karush-Kuhn-Tucker conditions for the Lagrangian

L(x, y, α, β) = x2 + y − α((x − 2)2 ) − β((y + 1)3 )



are

(1) Lx = 2x − 2α(x − 2) = 0

(2) Ly = 1 − 3β(y + 1)² = 0

(3) β ≥ 0  with β = 0 if (y + 1)³ < 0.

* If (y + 1)³ < 0, then β = 0. We get a contradiction with (2), which leads to 1 = 0.
* If (y + 1)³ = 0, then y = −1, and by (2) again, we obtain 1 = 0, which is not possible.
Thus, the KKT conditions have no solution.

iii) The problem has a solution at (2, −1) since we have

f(x, y) = x² + y = 2² + y ≤ 4 + (−1) = f(2, −1)   ∀(x, y) ∈ S.

Thus

max_S f(x, y) = f(2, −1) = 3.

Note that the point is not a candidate for the KKT conditions. This is because it doesn't satisfy the constraint qualification under which the KKT conditions are established. In particular, the rank condition is not satisfied. Indeed, the two constraints are active at (2, −1), but we have rank([ g1'(2, −1) ; g2'(2, −1) ]) = 0 since

[ g1'(x, y) ; g2'(x, y) ] = [ 2(x − 2)  0 ; 0  3(y + 1)² ],   [ g1'(2, −1) ; g2'(2, −1) ] = [ 0  0 ; 0  0 ].

2. –KKT conditions are not sufficient. Consider the problem

min f (x, y) = 2 − y − (x − 1)2 subject to y − x = 0, x+y−20

i) Sketch the feasible set and write down the necessary KKT conditions.

ii) Find the point(s) solution of the KKT conditions and check their
regularity.

iii) What can you conclude about the solution of the minimization prob-
lem?

iv) Does this contradict the theorem on the necessary conditions for a
constrained candidate point?

Solution: i) The set of the constraints is the set of points on the line y = x
included in the region below the line y = 2 − x, as shown in Figure 4.12.
FIGURE 4.12: Constraint set S

Writing the Karush-Kuhn-Tucker conditions. First, transform the


problem into a maximization one as:

max −f (x, y) = y − 2 + (x − 1)2 subject to y − x = 0, x+y−20

Note that f , and the constraints g1 and g2 are C ∞ in R2 where

g1 (x, y) = y − x g2 (x, y) = x + y − 2.

Thus, the Lagrangian associated is

L(x, y, α, β) = y − 2 + (x − 1)2 − α(y − x) − β(x + y − 2)

and the Karush-Kuhn-Tucker conditions are

(1) Lx = 2(x − 1) + α − β = 0

(2) Ly = 1 − α − β = 0

(3) Lα = −(y − x) = 0

(4) β ≥ 0  with β = 0 if x + y − 2 < 0.

ii) Solving the KKT conditions.

∗ If x + y − 2 < 0 then β = 0 and

2(x − 1) + α = 0,  1 − α = 0,  y − x = 0   ⟹   (x, y) = (1/2, 1/2) and (α, β) = (1, 0).

∗∗ If x + y − 2 = 0 then

2(x − 1) + α − β = 0,  1 − α − β = 0,  y − x = 0,  x + y − 2 = 0   ⟹   (x, y) = (1, 1) and (α, β) = (1/2, 1/2).

So, there are two solutions: (1/2, 1/2) and (1, 1) for the KKT conditions.

Regularity of the point (1/2, 1/2). Only the constraint g1(x, y) = y − x is active at (1/2, 1/2) and we have

g1'(x, y) = (−1  1),   rank(g1'(1/2, 1/2)) = rank((−1  1)) = 1.

The point (1/2, 1/2) is a regular point.

Regularity of the point (1, 1). The two constraints are active at (1, 1). We have

[ g1'(x, y) ; g2'(x, y) ] = [ −1  1 ; 1  1 ],   rank([ g1'(1, 1) ; g2'(1, 1) ]) = 2.

Thus the point (1, 1) is a regular point.

iii) Conclusion. The two points are candidates for optimality. Comparing the values taken by f at these points gives:

f(1, 1) = 1,   f(1/2, 1/2) = 2 − 1/2 − 1/4 = 5/4 > 1,
we deduce that, only (1, 1) is the candidate for minimality. However, it is not
the minimum point. Indeed, we have

f (x, x) = 2 − x − (x − 1)2 −→ −∞ as x −→ −∞.


Therefore, f doesn’t attain its minimal value.

iv) This doesn’t contradict the theorem since KKT conditions indicate only
where to find the possible points when they exist.

3. – Positivity constraints. Solve the problem by two methods:



max f(x, y) = 3 + x − y + xy   subject to   y − x² ≥ 0,  y ≤ 4,  x ≥ 0,  y ≥ 0

i) using the extreme value theorem.

ii) using the KKT conditions.

Solution: i) EVT method. The constraints set, graphed in Figure 4.13, is

S = {(x, y) / 0  x  2, x2  y  4}.

FIGURE 4.13: Graph of z = 3 + x − y + xy on S

f is continuous (because it is a polynomial) on the set S, which is a bounded


and closed subset of R2 . So f attains its absolute extreme points on S (by the

extreme value theorem), either at the critical points located in S or on ∂S.

* Critical points of f : f has no critical point in the interior of S because

∇f (x, y) = 1 + y, −1 + x = 0, 0 ⇐⇒ (x, y) = (1, −1) ∈ S.

** Extreme values on ∂S :
Let L1 , L2 and L3 the three parts of the boundary of S defined by:

L1 = {(x, x2 ), 0  x  2}, L2 = {(x, 4), 0  x  2}

L3 = {(0, y), 0  y  4}

– On L1 , we have f (x, x2 ) = 3 + x − x2 + x3 = g(x),

g  (x) = 3x2 − 2x + 1.

x 0 2
g  (x) +
g(x) 3  9

TABLE 4.1: Variations of g(x) = 3 + x − x2 + x3 on [0, 2]

Then, using Table 4.1, we deduce that

max f = f (2, 4) = 9 min f = f (0, 0) = 3.


L1 L1

– On L2 , we have: f (x, 4) = 5x − 1 = h(x), h (x) = 5.

x 0 2
h (x) +
h(x) −1  9

TABLE 4.2: Variations of h(x) = 5x − 1 on [0, 2]

From Table 4.2, the extreme values on this side are

max f = f (2, 4) = 9 min f = f (0, 4) = −1.


L2 L2

– On L3 , we have: f (0, y) = 3 − y = ϕ(y), ϕ (y) = −1.


Hence, we obtain from Table 4.3,

max f = f (0, 0) = 3 min f = f (0, 4) = −1.


L3 L3
242 Introduction to the Theory of Optimization in Euclidean Space

y 0 4
ϕ (y) −
ϕ(y) 3  −1

TABLE 4.3: Variations of ϕ(y) = 3 − y on [0, 4]

∗ ∗ ∗Conclusion:
The maximal value of f on S is 9 and is attained at the point (2, 4).
The minimal value of f on S is −1 and is attained at the point (0, 4).

ii) KKT conditions. Consider the Lagrangian

L(x, y, λ, μ) = 3 + x − y + x y − λ(x2 − y) − μ(y − 4).

The Karush-Kuhn-Tucker conditions are




⎪ (1) Lx = 1 + y − 2λx  0 (= 0 if x > 0)






⎨ (2) Ly = −1 + x + λ − μ  0 (= 0 if y > 0)



⎪ (3) λ0 with λ=0 if y < x2





(4) μ0 with μ=0 if y<4

Solving the KKT conditions.

∗ If y < 4 then μ = 0 and

– Suppose y < x2 , then λ = 0 and by (1), we get 1 + y  0 which


contradicts y  0.

– Suppose y = x2 .

◦ if x > 0, then y = x2 > 0. From (1) and (2), we get

1 + x2 − 2λx = 0 and λ=1−x =⇒ 3x2 − 2x + 1 = 0

with no solution.

◦ if x = 0, then y = x2 = 0, and (1) leads to a contradiction.


Constrained Optimization-Inequality Constraints 243

∗∗ If y = 4

– Suppose y < x2 then λ = 0 and by (1), we get 1 + 4  0 which is


not possible.

– Suppose y = x2 . We deduce that x = 2 or x = −2. The second value


is not possible since x  0. For x = 2 > 0, we insert the values x = 2 and
y = 4 in (1) and (2), and obtain

5 − 4λ = 0 5 9
=⇒ (λ, μ) = ( , ).
1+λ−μ=0 4 4
Note that, both constraints are active at (2, 4), and if g(x, y) = (x2 − y, y − 4),
then
   
2x −1 4 −1
g  (x, y) = rank(g  (2, 4)) = rank( ) = 2.
0 1 0 1

Thus, (2, 4) is a regular point and, therefore, a candidate point. Moreover,


(2, 4) solves the problem since the maximal value of f is attained on S by the
EVT; see Figure 4.14.
y
5 0.7 4.55 8.75 10.15 12.2513.3 14.715.4
1.05
.75
1.4 6.3 14. 15.
11.2

0 12.95
2.45 14.
1.05 7.7 11.9
4
9.45 13.

0.7 10.5
0.35
11.55 12.

3 5.25 8.4

1.75
35 7. 9.8

10.8

3.5
2
8.05

.4 9.1
5.6

1 2.1 6.65

7.3

2.8

0 3 15 3 85 4.2 4.9 5 95 x
0.0 0.5 1.0 1.5 2.0 2.5 3.0

FIGURE 4.14: Maximal value of z = 3 + x − y + xy on S


244 Introduction to the Theory of Optimization in Euclidean Space

4. – Application. Find (x, y) ∈ S = {(x, y) : x + y  0, x2 − 4  0}


that lies closest to the point (2, 3) by following the steps below:

i) Formulate the problem as an optimization problem.

ii) Illustrate the problem graphically (Hint: use level curves).

iii) Write down the KKT conditions.

iv) Find all points that satisfy the KKT conditions. Check whether or
not each point is regular.

v) What can you conclude about the solution of the problem?

Solution: i) The square of the distance between (x, y) and (2, 3) is given by
(x − 2)2 + (y − 3)2 . To find the point (x, y) ∈ S that lies closest to the point
(2, 3) is equivalent to solve the minimization problem


⎨ g1 (x, y) = x + y  0
min (x − 2)2 + (y − 3)2 subject to

g2 (x, y) = x2 − 4  0

or to maximize the objective function f below subject to the two constraints:



⎨ g1 (x, y) = x + y  0
max f (x, y) = −(x − 2)2 − (y − 3)2 subject to

g2 (x, y) = x2 − 4  0

ii) The feasible set, graphed in Figure 4.15, is also described by

S = {(x, y) : y  −x, −2  x  2}
The level curves of f , with equations: (x√− 2)2 + (y − 3)2 = k where k  0,
are circles centered at (2, 3) with radius k; see Figure 4.16.
If we increase the values of the radius, the values of f decrease. The first circle
that will intersect the set S will be the circle with radius equal to the distance
of the point (2, 3) to the line y = −x. So, only the first constraint will be
active in solving the optimization problem.

iii) Writing the KKT conditions. Consider the Lagrangian

L(x, y, λ, β) = −(x − 2)2 − (y − 3)2 − λ(x + y) − β(x2 − 4)


Constrained Optimization-Inequality Constraints 245
x
1
0
1
4 2

y
2

0
y
10

x  2 4 x2

y  x 2

z
5
x
3 2 1 1 2 3

2

z  x  22  y  32
S

0
4

FIGURE 4.15: Graph of z = (x − 2)2 + (y − 3)2 on S

The KKT conditions are



⎪ ∂L

⎪ (1) = −2(x − 2) − λ − 2βx = 0

⎪ ∂x





⎪ ∂L
⎨ (2) = −2(y − 3) − λ = 0
∂y





⎪ (3) λ0 with λ=0 if x+y <0






(4) β0 with β=0 if x2 − 4 < 0

iv) Solving the KKT conditions.

∗ If x + y < 0 then λ = 0 and


⎨ −2(x − 2) − 2βx = 0
=⇒ x(1 + β) = 2 and y=3

−2(y − 3) = 0
246 Introduction to the Theory of Optimization in Euclidean Space
y
22.1 17 10.2 6.8
5.1 3.4
7.2

5.5 4
1.

13.6 2

27.2
18.7
30.6
8.5
x
3 2 1 1 2 3
11.
23.8
37.4
15.3

4.2 34 20

2 28.9 25.5
47.6
54.4 40.8 32.3
35.7
57.8 39
64.6 51. 42.5
45.9
68
71.4 49
4
74.8
8.2 61.2 56.1 52.7
1.6 59
85.83 379.9 76 573.1 66.3 62.964
69 7

FIGURE 4.16: Minimal value of z = (x − 2)2 + (y − 3)2 on S

then with (4) we have


2
β0 =⇒ x= > 0.
1+β
This contradicts x + y < 0.

∗∗ If x + y = 0 then

– Suppose x2 − 4 < 0 then β = 0 and



⎨ −2(x − 2) − λ = 0
=⇒ y = x + 1.

−2(y − 3) − λ = 0
With x+y = 0, we deduce that (x, y) = (−1/2, 1/2). Note that (−1/2)2 −4 < 0
is satisfied and λ = 5 > 0.

– Suppose x2 − 4 = 0. We deduce that x = 2 or x = −2. Then,


inserting in (1) and (2), we obtain


⎨ λ + 4β = 0
(x, y) = (2, −2) =⇒ =⇒ (λ, β) = (10, −5/2)

10 − λ = 0
Constrained Optimization-Inequality Constraints 247

⎨ 8 − λ + 4β = 0
(x, y) = (−2, 2) =⇒ =⇒ (λ, β) = (0, −2)

λ=0

contradicting β  0.
So, the only point solution of the system is

(x∗ , y ∗ ) = (−1/2, 1/2) with (λ, β) = (5, 0).

Regularity of the candidate point (−1/2, 1/2). Note that only the con-
straint g1 (x, y) = x + y is active at (−1/2, 1/2). We have

       
g1 (x, y) = 1 1 rank( g1 (−1/2, 1/2) ) = rank( 1 1 ) = 1.

Thus the point (−1/2, 1/2) is a regular point.

iv) Conclusion. The constraint set is an unbounded closed convex and we


have
2
|f (x, y)| = (x, y) − (2, 3)  ( (x, y) − (2, 3) )2
=⇒ lim f (x, y) = +∞.
(x,y)→+∞

By Theorem 2.4.2, there exists a minimum point for f on S. Thus, the candi-
date found solves the problem.

5. – Mixed constraints. Solve the problem



⎨ 2x2 + y 2 + z 2 = 1
2 2 2
max x + y + z subject to

x + y + z  0.

Solution: Set U (x, y, z) = x2 + y 2 + z 2 and

g(x, y, z) = x + y + z h(x, y, z) = 2x2 + y 2 + z 2 − 1.


First, the maximization problem has a solution by the extreme-value theorem.
Indeed, U is continuous on the set

S = {(x, y, z) : x + y + z  0, 2x2 + y 2 + z 2 = 1}
248 Introduction to the Theory of Optimization in Euclidean Space

which is a closed and bounded subset of R3 as the intersection of the ellipsoid


x2
√ + y 2 + z 2 = 1 with the region below the plane x + y + z = 0. The
(1/ 2)2
plane passes through the center of the ellipsoid; see Figure 4.17.
2

y 1

1

2
2 Plane

1 Ellipsoid

z
0

1

2
2
1
0
x
1

FIGURE 4.17: S is the part of the ellipsoid below the plane

Next, the functions U , g and h are C 1 around each point (x, y, z) ∈ R3 . We


then may deduce the solution by using the Karusk-Kuhn-Tucker conditions.
The Lagrangian is given by

L(x, y, λ, μ) = x2 + y 2 + z 2 − λ(x + y + z) − μ(2x2 + y 2 + z 2 − 1),


and the necessary conditions to satisfy are:



⎪ (i) Lx = 2x − λ − 4μx = 0 ⇐⇒ 2x(1 − 2μ) = λ







⎪ (ii) Ly = 2y − λ − 2μy = 0 ⇐⇒ 2y(1 − μ) = λ



(iii) Lz = 2z − λ − 2μz = 0 ⇐⇒ 2z(1 − μ) = λ







⎪ (iv) Lμ = −(2x2 + y 2 + z 2 − 1) = 0





(v) λ0 with λ=0 if x + y + z < 0.

* If x + y + z < 0, then λ = 0, and from (i), (ii), (iii) and (iv), we deduce
that
Constrained Optimization-Inequality Constraints 249


⎪ x = 0 or μ = 1/2

y = 0 or μ = 1
⎪ z = 0 or μ = 1


2x2 + y 2 + z 2 = 1
We obtain the points

(0, 0, −1), (0, −1, 0) with μ=1


1 1
(− √ , 0, 0) with μ=
2 2
The active constraint at these points satisfies: h (x, y, z) = (4x, 2y, 2z),

1
rank h (0, −1, 0) = rank h (0, 0, −1) = rank h (− √ , 0, 0) = 1.
2
Thus, the points are regular and candidate for optimality.

* If x + y + z = 0, then

• Suppose x = 0. We deduce from (iv) that

y2 + z2 = 1 and y+z =1

and deduce the two candidate points

1 1 1 1
(x, y, z) = (0, √ , − √ ) or (0, − √ , √ ) with (λ, μ) = (0, 1/2).
2 2 2 2
The two constraints are active at theses points and satisfy
   
g(x, y, z) h(x, y, z) = x+y+z 2x2 + y 2 + z 2 − 1
   
g  (x, y, z) 1 1 1
=
h (x, y, z) 4x 2y 2z
( #  
g  (0, √12 , − √12 ) 1 1 1√
rank = rank √ =2
h (0, √12 , − √12 ) 0 2 2 −2 2
( #  
g  (0, − √12 , √12 ) 1 1√ 1
rank = rank √ =2
h (0, − √12 , √12 ) 0 −2 2 2 2

The points satisfy the constraint qualification. They are regular points and
candidates for optimality.
250 Introduction to the Theory of Optimization in Euclidean Space

• Suppose x = 0. Then

– if μ = 1/2, then λ = 0. By (ii) and (iii), we have y = z = 0.


Thus, from x + y + z = 0, we deduce x = 0 : contradiction with x = 0.

– if μ = 1/2, then from (i), we have λ = 0. Moreover, by (ii) and


(iii), we have μ = 1. So, by dividing each side of (ii) by each side of (iii), we
obtain y = z. Then we deduce that
1
x + 2y = 0 2x2 + 2y 2 = 1 =⇒ y = ±√
10
3
2y(1 − μ) = λ = 2(−2y)(1 − 2μ) =⇒ μ= .
5
With λ  0, the only possible point is:

2 1 1 4 3
(x, y, z) = −√ ,√ ,√ with (λ, μ) = ( √ , ).
10 10 10 5 10 5
It is clear also that the constraint qualification condition is satisfied, so the
point is regular.

⎡ ⎤
√2 , √1 , √1
 
g − 1 1 1
rank ⎣ 10 10 10 ⎦ = rank
− √810 √2 √2
= 2.
h − √2 , √1 , √1 10 10
10 10 10

Conclusion: Finally, comparing the values of f at the candidate points


2 1 1 3 1 1
f −√ ,√ ,√ = f − √ , 0, 0 =
10 10 10 5 2 2

1 1 1 1
f (0, 0, −1) = f (0, −1, 0) = f 0, √ , − √ = f 0, − √ , √ = 1
2 2 2 2
we deduce that f attains its maximum value subject to the constraints at
1 1 1 1
(0, 0, −1), (0, −1, 0), 0, √ , − √ and 0, − √ , √ .
2 2 2 2
Constrained Optimization-Inequality Constraints 251

4.3 Classification of Local Extreme Points-Inequality


Constraints

To classify a candidate point x∗ for optimality of the problem

local max (min) f (x) subject to g(x)  b


with g = (g1 , . . . , gm ) and b = (b1 , . . . , bm ), we proceed as in the case of
equality constraints by comparing the values taken by the Lagrangian

L(x, λ) = f (x) − λ1 (g1 (x) − b1 ) − · · · − λm (gm (x) − bm ),


at points close to x∗ . Then, since, x∗ ∈ [g(x)  b] means that

x∗ ∈ [gj (x) < bj ] = O and x∗ ∈ [gj (x) = bj ],


j ∈I(x∗ ) j∈I(x∗ )

we remark that by working in a neighborhood of x∗ included in the open set


O, we bring ourselves to solving a local optimization problem of type equality
constraints

local max (min) f (x) subject to gj (x) = bj , j ∈ I(x∗ ).

Consequently, we can apply the second derivative test established for equality
constraints by considering in the test only the active constraints at that point.
In what follows, suppose we have:

Hypothesis (H) f and g = (g1 , . . . , gm ) be C 2 functions in a neighbor-


hood of x∗ in Rn such that:

gj (x∗ ) = bj if j ∈ I(x∗ ) = {i1 , . . . , ip } p<n

λj = 0 if gj (x∗ ) < bj j ∈ I(x∗ )

rank(G (x∗ )) = p, G(x) = (gi1 (x), . . . , gip (x)),

∇x L(x∗ , λ∗ ) = 0 for a unique vector λ∗ = (λ∗1 , . . . , λ∗m ).


252 Introduction to the Theory of Optimization in Euclidean Space

For r = p + 1, . . . , n, let Br (x∗ ) be the bordered Hessian determinant


 
 ∂gi1 ∗ ∂gi1 ∗ 
 0 ... 0 ∂x1 (x ) ... ∂xr (x ) 
 .. .. .. .. 
 .. .. 
 . . . . . . 
 ∂gip ∂gip 
 0 . . . 0 (x ∗
) . . . ∂xr (x )
∗ 
 ∂x1 
Br (x ) = 
∗ 

 ∂gi1 (x∗ ) . . . ∂gip (x∗ ) L ∗ ∗ ∗ ∗ 
 ∂x1 ∂x1 x1 x1 (x , λ ) . . . Lx1 xr (x , λ ) 
 
 .. .. .. .. .. .. 
 . . . . . . 
 ∂gi ∂gip 
 1
(x ∗
) . . . (x ∗
) Lxr x1 (x ∗
, λ ∗
) . . . Lxr xr (x ∗
, λ∗ ) 
∂xr ∂xr

The variables are renumbered in order to make the first p columns in the
matrix G (x∗ ) linearly independent.

Theorem 4.3.1 Sufficient conditions for a local constrained extreme point.


If assumptions (H) hold, then

(i) λ  0, (−1)p Br (x∗ ) > 0 ∀r = p + 1, . . . , n



=⇒ x is a strict local minimum point

(ii) λ  0, (−1)r Br (x∗ ) > 0 ∀r = p + 1, . . . , n



=⇒ x is a strict local maximum point.

Proof. The proof follows the one seen for the case of equality constraints.
We outline here the key modification that allows us to conclude with the pre-
vious proof. We assume that I(x∗ ) = {1, . . . , m} to avoid the case of equality
constraints. Note that the positivity of λ is not assumed in the hypothesis
H in order to include both the maximization and minimization problems as
explained below. The Lagrangian introduced is used to link values of f and
g for comparison. Then depending on its positivity or negativity on the plan
tangent of the active constraints at that point, we identify whether we have a
minimum or a maximum point.

Step 0: Suppose that we assign for the problems

max f : L(x, α) = f (x) − α.(g(x) − b) α0

min f : L(x, β) = −f (x) − β.(g(x) − b) β0


Constrained Optimization-Inequality Constraints 253

then
−L(x, β) = f (x) − (−β).(g(x) − b) − β  0.
So, to consider the two problems simultaneously, we can introduce the La-
grangian
L(x, λ) = f (x) − λ.(g(x) − b)
with λ  0 (resp. ) for the maximization (resp. minimization) problem.

Step 1: We have

[g(x)  b] = [gj (x) < bj ] [gj (x) = bj ] ⊂ O.


j ∈I(x∗ ) j∈I(x∗ )

Thus x∗ belongs to the open set O. So, one can find ρ0 > 0 such that Bρ0 (x∗ ) ⊂
O. Then, for h ∈ Rn such that x∗ + h ∈ Bρ0 (x∗ ), we have from Taylor’s
formula, for some τ ∈ (0, 1),
n
 n n
1 
L(x∗ +h, λ∗ ) = L(x∗ , λ∗ )+ Lxi (x∗ , λ∗ )hi + Lx x (x∗ +τ h, λ∗ )hi hj .
i=1
2 i=1 j=1 i j

By assumptions, we have

Lxi (x∗ , λ∗ ) = 0 i = 1, . . . , n


L(x, λ) = f (x) − λj (gj (x) − bj )
j∈I(x∗ )

gi1 (x∗ ) − bi1 = gi2 (x∗ ) − bi2 = . . . = gip (x∗ ) − bip = 0

then, we have

L(x∗ , λ∗ ) = f (x∗ ) − λ∗i1 (gi1 (x∗ ) − bi1 ) − . . . − λ∗ip ((gip (x∗ ) − bip ) = f (x∗ )

L(x∗ + h, λ∗ ) = f (x∗ + h) − λ∗i1 (gi1 (x∗ + h) − bi1 ) − . . . − λ∗ip (gip (x∗ + h) − bip )
from which we deduce

 n n
1 
f (x∗ +h)−f (x∗ ) = λ∗k [gk (x∗ +h)−bk ]+ Lx x (x∗ +τ h, λ∗ )hi hj .
2 i=1 j=1 i j
k∈I(x∗ )

Using Taylor’s formula for each gk , k ∈ I(x∗ ), we obtain

n
 ∂gk
gk (x∗ + h) − bk = gk (x∗ + h) − gk (x∗ ) = (x∗ + τk h)hj τk ∈ (0, 1).
j=1
∂xj
254 Introduction to the Theory of Optimization in Euclidean Space

Step 2: Consider the (p + n) × (p + n) bordered Hessian matrix


B(x0 , x1 , . . . , xp ) with

⎡ ∂gi1 ∂gi1 ⎤
(x1 )
∂xi1 ... ∂xn (x1 )
∂gik k ⎢ ⎥
G(x1 , · · · , xp ) = (x ) =⎢

..
.
..
.
..
.


∂xj p×n
∂gip p ∂gip p
∂x1 (x ) ... ∂xn (x ).

The remaining steps of the equality constraints’ proof work is shown using
the above notations.

Remark 4.3.1 If we introduce the notations:


n 
 n
Q(h) = Q(h1 , . . . , hn ) = Lxi xj (x∗ , λ∗ )hi hj
i=1 j=1
⎡ ⎤
0p×p G (x∗ )
the (p + n) × (p + n) bordered matrix ⎣ ⎦
t  ∗ ∗ ∗
G (x ) [Lxi xj (x , λ )]n×n

M = {h ∈ Rn : G (x∗ ).h = 0}

the theorem says that

Q(h) > 0 ∀h ∈ M, h=0


=⇒ x∗ is a strict local constrained minimum

Q(h) < 0 ∀h ∈ M, h=0


=⇒ x∗ is a strict local constrained maximum.

It suffices then to study the positivity (negativity) of the quadratic form on


the tangent plan M to the constraints gk (x) = bk , k ∈ I(x∗ ) at x∗ .

Example 1. Solve the problem

local max (min) f (x, y) = xy subject to g(x, y) = x + y  2.

Solution: Consider the Lagrangian

L(x, y, λ) = f (x, y) − λ(g(x, y) − 2) = xy − λ(x + y − 2)


Constrained Optimization-Inequality Constraints 255

and the system



⎨ (i) Lx = y − λ = 0
(ii) Ly = x − λ = 0

(iii) λ = 0 if x + y < 2.
From (i) and (ii), we deduce that λ = x = y.
∗ If x + y < 2, then λ = 0. Thus (0, 0) is a candidate point, that is an
interior point of [g  2]. To explore its nature, we use the second derivatives
test for unconstrained problems. We have
 
0 1
Hf (x, y) = , D1 (0, 0) = 0 and D2 (0, 0) = −1 < 0.
1 0
Then, (0, 0) is a saddle point.

∗∗ If x + y = 2, then (x, y) = (1, 1) is a candidate point with λ = 1.


First, (1, 1) is a regular point since g  (x, y) = 1, 1 and rank[g  (1, 1)] = 1.
Next, since n = 2 and p = 1, we have to consider the sign of the bordered
Hessian determinant:

 
 0 gx (1, 1) gy (1, 1) 
   
   0 1 1 
   
(−1)2 B2 (1, 1) =  gx (1, 1) Lxx (1, 1, 1) Lxy (1, 1, 1)  =  1 0 1  = 2 > 0.

   1 1 0 
 
 gy (1, 1) Lxy (1, 1, 1) Lyy (1, 1, 1) 
We conclude that the point (1, 1) is a local maximum to the problem.

Finally, we also have

Theorem 4.3.2 Necessary conditions for a local constrained extreme points


If assumptions (H) hold, then

(i) x∗ is a local minimum point =⇒ HL = (Lxi xj (x∗ , λ∗ ))n×n is


t
positive semi definite on M : yHL y  0 ∀y ∈ M

(ii) x∗ is a local maximum point =⇒ HL = (Lxi xj (x∗ , λ∗ ))n×n is


is negative semi definite on M : t yHL y  0 ∀y ∈ M

where M = {h ∈ Rn : G (x∗ ).h = 0} is the tangent plan to the


constraints gk (x) = bk , k ∈ I(x∗ ) at x∗ .
256 Introduction to the Theory of Optimization in Euclidean Space

Proof. Let x(t) ∈ C 2 [0, a], a > 0, be a curve on the constraint set g(x)  b
passing through x∗ at t = 0. Suppose that x∗ is a local maximum point for f
subject to the constraint g(x)  b. Then,

f (x∗ )  f (x(t)) ∀t ∈ [0, a).

or
f$(0) = f (x∗ )  f (x(t)) = f$(t) ∀t ∈ [0, a).
So f$ is a one variable function that has a local maximum at t = 0. Conse-
quently, it satisfies f$ (0)  0 and f$ (0)  0 or equivalently

d2 

∇f (x∗ ).x (0)  0 and f (x(t))   0.
dt2 t=0

We have

d2
f (x(t)) = t x (0)Hf (x∗ )x (0) + ∇f (x∗ ).x (0).
dt2
Moreover, differentiating the relations gk (x(t)) = bk , k ∈ I(x∗ ) twice and
denoting Λ∗ = λ∗i1 , . . . , λ∗ip , we obtain
t 
x (0)HG (x∗ )x (0) + ∇G(x∗ )x (0) = 0
t 
=⇒ x (0)t Λ∗ HG (x∗ )x (0) + t Λ∗ ∇G(x∗ )x (0) = 0.
Hence
d2 

0 2
f (x(t)) = [t x (0)Hf (x∗ )x (0) + ∇f (x∗ )x (0)] −
dt t=0

[t x (0)t ΛHG (x∗ )x (0) + t Λ∇G(x∗ )x (0)]

= t x (0)[Hf (x∗ ) −t ΛHG (x∗ )]x (0) + [∇f (x∗ ) + t Λ∇G(x∗ )]x (0)

= t x (0)[HL (x∗ )]x (0) since ∇f (x∗ ) + t Λ∇G(x∗ ) = 0

and the result follows since x (0) is an arbitrary element of M .

Example 2. Suppose that (4, 0) is a candidate satisfying the KKT conditions


 
where only the constraint g is active and such that g  (4, 0) = −1 0 
−2 0
and the Hessian of the associated Lagrangian is HL(.,−8) (4, 0) = .
0 14
Can (4, 0) be a local maximum or minimum to the constrained optimization
problem?
Solution: The point (4, 0) is regular since rank(g1 (4, 0)) = 1.
Constrained Optimization-Inequality Constraints 257

We have p = 1 < 2 = n. Then we can consider the following determinant


(r = p + 1 = 2). (Note that the first column vector of g1 (4, 0) is linearly
independent, so we do not have to renumber the variables).
 
 0 −1 0 
 
B2 (4, 0) =  −1 −2 0  = −14.
 0 0 14 
We have (−1)1 B2 (4, 0) = 14 > 0 and λ = −8 < 0. So the second derivatives
test is satisfied and (4, 0) is a strict local minimum. This shows also that the
Hessian is positive definite under the constraint. Indeed, we can check this
directly:
   
h   h
g1 (4, 0) = −1 0 . = −h + (0)k = 0
k k
 
0
Thus, M ={ k ∈ R} and
k
    
  −2 0 0 0
0 k = 14k2  0 ∀ ∈ M.
0 14 k k
258 Introduction to the Theory of Optimization in Euclidean Space

Solved Problems

1. – Solve the problem



⎨ x + 2y  3
local min f (x, y) = x2 + y 2 s.t

2x − y  1

Solution: The problem is equivalent to the maximization problem.


⎨ −(x + 2y)  −3
local max −f (x, y) = −(x2 + y 2 ) s.t

−(2x − y)  −1
Consider the Lagrangian

L(x, y, λ1 , λ2 ) = −f (x, y) − λ1 (−(x + 2y) + 3) − λ2 (−(2x − y) + 1)

= −(x2 + y 2 ) + λ1 (x + 2y − 3) + λ2 (2x − y − 1).

The constraints are linear, so we can look for the candidate points by writing
the Karush-Kuhn-Tucker conditions:


⎪ (i) Lx = −2x + λ1 + 2λ2 = 0






⎨ (ii) Ly = −2y + 2λ1 − λ2 = 0



⎪ (iii) λ1  0 with λ1 = 0 if x + 2y > 3





(iv) λ2  0 with λ2 = 0 if 2x − y > 1.
We distinguish several cases:

• If 2x − y > 1, then λ2 = 0. From (i) and (ii), we deduce that λ1 = 2x = y.


But 2x − 2x = 0 ≯ 1. So, no solution.
Constrained Optimization-Inequality Constraints 259

• If 2x − y = 1, then
– If x + 2y > 3, then λ1 = 0. From (i) and (ii), we deduce that λ2 =
x = 2y.
With 2x − y = 1, we deduce (x, y) = (2/3, 1/3). But 2/3 + 2(1/3) ≯ 3.
So, no solution.
– If x + 2y = 3, then with 2x − y = 1, we have (x, y) = (1, 1) and
(λ1 , λ2 ) are such that

⎨ λ1 + 2λ2 = 2 6 2
⇐⇒ (λ1 , λ2 ) = ( , ).
⎩ 5 5
2λ1 − λ2 = 2

Hence, the only solution point is


6 2
(x, y) = (1, 1) with (λ1 , λ2 ) = ( , ).
5 5

Regularity of the point. The two constraints are active at the point. We have
g = (g1 , g2 ) = (−(x + 2y), −(2x − y)),
( #  
∂g1 ∂g1
 −1 −2
g (x, y) = ∂g ∂x ∂y
∂g2 = =⇒ rank(g  (1, 1)) = 2.
∂x
2
∂y
−2 1

Classification of the point. Since n = 2, p = 2, p ≮ n, then we can’t apply the


second derivatives test. Let us use comparison to conclude. We have
6 2 6 2
L(x, y, , ) = −(x2 + y 2 ) + (x + 2y − 3) + (2x − y − 1)
5 5 5 5
= −x2 − y 2 + 2x + 2y − 4 = −(x − 1)2 − (y − 1)2 − 2  −2
and, on the set of the constraints, we have
6 2 6 2
−f (x, y)  L(x, y, , ) = −f (x, y) + (x + 2y − 3) + (2x − y − 1)
5 5 5 5
Thus,

−f (x, y)  −2 = −f (1, 1) ∀(x, y) x + 2y  3, 2x − y  1.

Hence, (1, 1) is the minimum point solution; see Figure 4.18 for a geometric
interpretation of the solution.
260 Introduction to the Theory of Optimization in Euclidean Space
y y
4 4 16.12 17.98 21.08 29.76
28.52
24.826.04
23.56 27.28 31
30.3
19.22 29.1
22.32 27.
13.02
16.74 20.46 26.6
24.18
25.4
3 3
18.6 22.94
7.44
21.
x2 y 3
9.92
2 2 15.5 19.8

3.72
17.3
1 1 1.24

6.2
S

x 12.4 x
1 1 2 3 4 1 1 2 3 4

2x y 1
0.62
1.86
1 1 16 12

FIGURE 4.18: Local minimum of z = x2 + y 2 on x + 2y  3 and 2x − y  1

2. – Classify the solutions of the problem

local max (min)f (x, y) = x2 y + 3y − 4 s.t g(x, y) = 4 − xy  0

Solution: i) Consider the Lagrangian

L(x, y, λ) = f (x, y) − λ(g(x, y) − 0) = x2 y + 3y − 4 − λ(4 − xy)

and write the conditions




⎪ (1) Lx = 2xy − λ(−y) = 0 ⇐⇒ y(2x + λ) = 0



(2) Ly = x2 + 3 − λ(−x) = 0





(3) λ = 0 if xy > 4.

∗ If xy > 4, then λ = 0, and with (2), we have x2 + 3 = 0 which has no


solution.
∗∗ If xy = 4, then x = 0 and y = 0. By (1), we deduce that λ = −2x, which
inserted in (2), we obtain 3 − x2 = 0. Thus, we have two solutions:
√ 4 √
(x, y) = ( 3, √ ) with λ = −2 3
3
√ 4 √
(x, y) = (− 3, − √ ) with λ = 2 3.
3
ii) Constraint qualification. Note that g is C 1 in R2 and any point of the set of
the constraints g = 0 is an interior point and regular; see Figure 4.19. Indeed,
we have
Constrained Optimization-Inequality Constraints 261
4

y 2

y 2
4
4
4
4x y0

2 2

z
0

x
4 2 2 4
2

4x y0
4

2 4
2
0
x
2

4 4

FIGURE 4.19: Graph of z = x2 y + 3y − 4 on xy  4

 
g  (x, y) = −y −x rank(g  (x, y)) = 1 for (x, y) ∈ [g = 0]

since g  (x, y) = 0 ⇐⇒ (x, y) = (0, 0) and (0, 0) ∈ [g = 0]. In particular


√ 4 √ 4
( 3, √ ) and (− 3, − √ ) are regular points of [g = 0]. Therefore, they are
3 3
candidate points; see in Figure 4.20, the variations of the values of the function
close to these points.

iii) Classification. With m = 1 (the number of the active constraints), n = 2


(the dimension of the space), then r taking values from m + 1 to n = 2, must
be equal to r = 2. So, consider the following determinant
   
 0 gx gy   0 −y −x 
 
B2 (x, y) =  gx Lxx Lxy  =  −y 2y 2x + λ 
 gy Lyx Lyy   −x 2x + λ 0 
√ √ √
∗ At ( 3, 4/ 3), we have λ = −2 3, then
 √ √ 
 0√ −4/√ 3 − 3   √ √ 
√ √  √  −4/ 3 −8/ 3  √

B2 ( 3, 4/ 3) =  −4/ 3 8/ 3  
0  = − 3 √  = −8 3
√ 8/ 3 0 
 − 3 0 0 
√ 1
√ √
Because
√ √ λ = −2 3  0 and (−1) B2 ( 3, 4/ 3) > 0, we deduce that
( 3, 4/ 3) is a local minimum.
√ √ √
∗ At (− 3, −4/ 3), we have λ = 2 3, then
 √ √ 
 0 4/ √3 3   √ √ 
√ √  √  √  4/ 3 3  √
B2 (− 3, −4/ 3) =  4/
√ 3 −8/ 3 0  = 3


 8/ 3  =8 3
  0
3 0 0
√ √ √
Because
√ λ√= 2 3  0 and (−1)2 B2 (− 3, −4/ 3) > 0, we deduce that
(− 3, −4/ 3) is a local maximum.
262 Introduction to the Theory of Optimization in Euclidean Space
y
43.2 4 43.2
4.8 54 37.8 37.8 54 64
4 5.4 5

32.4 32.4
27 27

21.6 2 21.6

16.2 0 16.2

x
4 2 2 4
5.4

21.6 21.6
27 27
2
32.4 32.4
10.8
37.8 37.8
9.4 5
43.2 43.2

64.8
.6 64.8
7
48.6 416 2 48.6

FIGURE 4.20: Local extrema of z = x2 y + 3y − 4 on xy  4

3. – Solve the problem



⎨ g1 (x, y, z) = x + 2y + z  30
local min f (x, y, z) = x2 +y 2 +z 2 s.t

g2 (x, y, z) = 2x − y − 3z  10

Solution: Note that f , g1 and g2 are C 1 in R3 . The problem is equivalent to


the maximization problem

⎨ −g1 = −(x + 2y + z)  −30
local max −f = −(x2 + y 2 + z 2 ) s.t

g2 = 2x − y − 3z  10
Consider the Lagrangian
L(x, y, z, λ1 , λ2 ) = −(x2 + y 2 + z 2 ) + λ1 (x + 2y + z − 30) − λ2 (2x − y − 3z − 10).
Because the constraints are linear, the local candidate points satisfy the KKT
conditions:


⎪ (i) Lx = −2x + λ1 − 2λ2 = 0







⎪ (ii) Ly = −2y + 2λ1 + λ2 = 0



(iii) Lz = −2z + λ1 + 3λ2 = 0







⎪ (iv) λ1  0 with λ1 = 0 if x + 2y + z > 30





(v) λ2  0 with λ2 = 0 if 2x − y − 3z < 10.
Constrained Optimization-Inequality Constraints 263

From the first three equations, we deduce that

1 1 1 3
x= λ1 − λ 2 y = λ 1 + λ2 z= λ1 + λ2 .
2 2 2 2
We distinguish several cases:

∗ If x + 2y + z = 30 and 2x − y − 3z = 10, then inserting the expressions of


x, y and z above into the two equations gives

⎨ 3λ1 + 32 λ2 = 30 54 8
⇐⇒ λ1 = , λ2 = −
⎩ 3 5 5
− 2 λ1 − 7λ2 = 10
which contradicts λ2  0.

∗∗ If x + 2y + z = 30 and 2x − y − 3z < 10, then λ2 = 0 and


1 1
(x, y, z) = λ1 ( , 1, )
2 2
which inserted into the equation x + 2y + z = 30 gives λ1 = 10 and (x, y, z) =
(5, 10, 5). We have 2x − y − 3z = 2(5) − 10 − 3(5) = −15 < 10. So the point

(x, y, z) = (5, 10, 5) λ1 = 10, λ2 = 0


is a candidate for optimality.

Now, let us study the nature of the point (5, 10, 5). For this, we use the second
derivatives test since f , g1 and g2 are C 2 around this point. Since n = 3 and
p = 1 (only the constraint g1 is active), then r takes the values p + 1 = 2 to
n = 3. First, we consider the matrix
   
g  (x, y, z) = − ∂g
∂x
1
− ∂g
∂y
1
− ∂g
∂z
1
= −1 −2 −1

Then rank(g  (x, y, z)) = 1. Moreover, the first column vector of g  (5, 10, 5) is
linearly independent, so we don’t have to renumber the variables.
Next, we have to consider the sign of the following bordered Hessian determi-
nants:
 
 0 − ∂g 1
− ∂g 1 
 ∂x ∂y   
   0 −1 −2 
   −1 −2 0 
B2 (5, 10, 5) =  − ∂g 1
Lxx Lxy =  = 10.
 ∂x   −2 0 −2 
 ∂g 
 − 1 Lyx Lyy 
∂y
264 Introduction to the Theory of Optimization in Euclidean Space

 
 0 − ∂g 1
− ∂g 1
− ∂g 1 
 ∂x ∂y ∂z 
   
 − ∂g1   0 −1 −2 −1 
 ∂x Lxx Lxy Lxz   −1 −2 
B3 (5, 10, 5) =  =  = −24.
0 0
  −2 0 −2 0 
 − ∂g 1
Lyx Lyy Lyz   
 ∂y  −1 0 0 −2
 
 − ∂g1 Lzx Lzy Lzz

∂z

Here, the partial derivatives of g1 are evaluated at the point (5, 10, 5) and the
second partial derivatives of L are evaluated at the point (5, 10, 5, 10, 0).
We have

(−1)2 B2 (5, 10, 5) = 10 > 0 and (−1)3 B3 (5, 10, 5) = 24 > 0.

We conclude that the point (5, 10, 5) is a local maximum to the maximization
problem, or equivalently, a local minimum to the minimization problem.

∗∗∗ If x + 2y + z > 30 and 2x − y − 3z = 10 then λ1 = 0 and


1 3
(x, y, z) = λ2 (1, − , − )
2 2
which inserted into the equation 2x − y − 3z = 10 gives λ2 = −10/7 < 0 :
contradiction.

∗ ∗ ∗∗ If x + 2y + z > 30 and 2x − y − 3z < 10 then λ1 = 0 and λ2 = 0. So


(x, y, z) = (0, 0, 0) which contradicts the first above inequality.

Conclusion: The minimization problem has one local minimum at the point
(5, 10, 5).

4. – Classify the candidates of the problem



⎨ g 1 = x2 + y 2 + z 2 = 1
local max(min)f (x, y, z) = x + y + z s.t

g2 = x − y − z  1.

Solution: i) Note that f , g1 and g2 are C ∞ in R3 and consider the Lagrangian

L(x, y, z, λ1 , λ2 ) = f (x, y, z) − λ1 (g1 (x, y, z) − 1) − λ2 (1 − g2 (x, y, z))


= x + y + z − λ1 (x2 + y 2 + z 2 − 1) + λ2 (x − y − z − 1)
Constrained Optimization-Inequality Constraints 265

and let us look for the solutions of the system




⎪ (1) Lx = 1 − 2xλ1 + λ2 = 0







⎪ (2) Ly = 1 − 2yλ1 − λ2 = 0



(3) Lz = 1 − 2zλ1 − λ2 = 0







⎪ (4) Lλ1 = −(x2 + y 2 + z 2 − 1) = 0





(5) λ2 = 0 if x − y − z > 1.
From the first three equations, we deduce that

−λ2 = 1 − 2xλ1 λ1 (x + y) = 1 λ1 (x + z) = 1.

Note that λ1 = 0 is not possible because we would have from (1) : λ2 = −1,
and from (2) : λ2 = 1. So λ1 = 0 and we have
1
x+y =x+z = =⇒ y = z.
λ1

1
∗ If x − y − z > 1, then λ2 = 0. We deduce that = 2x, thus
λ1
1
x+y =x+z = = 2x. So x = y = z, which inserted into (4) gives 3x2 = 1.
λ1
Hence, we have two points
1 1 1 1 1 1
(x, y, z) = ( √ , √ , √ ) or (− √ , − √ , − √ ).
3 3 3 3 3 3
But, they do not satisfy x − y − z > 1.

∗ If x − y − z = 1, then with y = z and (4), we have



⎨ x = 1 + 2y 1 2
⇐⇒ (x, y) = (1, 0) or (x, y) = (− , − ).
⎩ 3 3
2y(3y + 2) = 0

We deduce then

(x, y, z) = (1, 0, 0) with λ1 = 1, λ2 = 1

1 2 2 1
(x, y, z) = (− , − , − ) with λ1 = −1, λ2 = − .
3 3 3 3
266 Introduction to the Theory of Optimization in Euclidean Space

ii) Regularity of the points. We have


⎡ ⎤
∂g1 ∂g1 ∂g1
∂x ∂y ∂z
 
 ⎢ ⎥ 2x 2y 2z
g (x, y, z) = ⎣ ⎦= −1 1 1
− ∂g
∂x
2
− ∂g
∂y
2
− ∂g
∂z
2

   
2 0 0 1 2 2 − 23 − 43 − 43
g  (1, 0, 0) = g  (− , − , − ) = .
−1 1 1 3 3 3 −1 1 1

Then
1 2 2
rank(g  (1, 0, 0)) = rank(g  (− , − , − )) = 2.
3 3 3
The two points are regular. Moreover, we remark that the first two column
vectors are linearly independent and we will not renumber the variables.

iii) Classification of the points. Now, let us study the nature of the points
(1, 0, 0) and (− 13 , − 23 , − 23 ). For this we use the second derivatives test since
f , g1 and g2 are C 2 around these points. We have to consider the sign of the
following bordered Hessian determinant:
 ∂g1 ∂g1 ∂g1 
 0 0 
 ∂x ∂y ∂z

 
 0 ∂g2
− ∂x ∂g2
− ∂y ∂g2 
− ∂z 
 0
 
 ∂g1 
 ∂x − ∂g 2
Lxx Lxy Lxz 
B3 (x, y, z) =  ∂x 
 
 ∂g1 − ∂g2 Lyx Lyy Lyz 
 ∂y ∂y
 
 ∂g1 
 ∂z − ∂g 2
L L L 
 ∂z zx zy zz

 
 0 0 2x 2y 2z 
 
 0 0 −1 1 1 
=  2x −1 −2λ1 0 0 .

 2y 1 0 −2λ1 0 
 2z 1 0 0 −2λ1 

The first partial derivatives of g1 and g2 are evaluated at (x, y, z). The second
partial derivatives of L are evaluated at (x, y, z, λ1 , λ2 ).

∗ At (1, 0, 0) with λ1 = 1 and λ2 = 1, we have


 
 0 0 2 0 0 

 0 0 −1 1 1 

B3 (1, 0, 0) =  2 −1 −2 0 0  = −16 (−1)3 B3 = 16 > 0.
 0 1 0 −2 0 

 0 1 0 0 −2 
Constrained Optimization-Inequality Constraints 267

We conclude that the point (1, 0, 0) is a local maximum to the constrained


optimization problem (λ2  0, (−1)3 B3 > 0).

∗∗ At (− 13 , − 23 , − 23 ) with λ1 = −1 and λ2 = − 13 , we have

 
 0 0 − 23 − 43 − 43 
 
 0 0 −1 1 1 
1 2 2  
B3 (− , − , − ) =  − 23 −1 2 0 0  = 16
 (−1)2 B3 = 16 > 0.
3 3 3  
 − 43 1 0 2 0 
 − 43 1 0 0 2 

We conclude that the point (− 13 , − 23 , − 23 ) is a local minimum to the con-


strained optimization problem (λ2  0, (−1)2 B3 > 0).

iii) The set of the constraints is a closed bounded set of R2 as it is the


intersection of the unit sphere [g1 = 1] and the region above the plane [g2 = 1].
By the extreme value theorem, f attains its extreme values on this set of the
constraints. Therefore, the local points found in ii) are also the global extreme
points. Hence, we have

1 2 2 5
max f = f (1, 0, 0) = 1 min f = f (− , − , − ) = − .
g1 =1, g2 1 g1 =1, g2 1 3 3 3 3

5. – Classify the candidates of the problem



⎨ g 1 = x2 + y 2 + z 2 = 1
local max(min)f (x, y, z) = x + y + z s.t

g2 = x − y − z  1.

Solution: i) Note that f , g1 and g2 are C ∞ in R3 and consider the Lagrangian


L(x, y, z, λ1 , λ2 ) = f (x, y, z) − λ1 (g1 (x, y, z) − 1) − λ2 (g2 (x, y, z) − 1)
= x + y + z − λ1 (x2 + y 2 + z 2 − 1) − λ2 (x − y − z − 1).
We look for the solutions of the system



⎪ (1) Lx = 1 − 2xλ1 − λ2 = 0 ⎧


⎨ ⎨ (4) Lλ1 = −(x2 + y 2 + z 2 − 1) = 0
(2) Ly = 1 − 2yλ1 + λ2 = 0

⎪ ⎩

⎪ (5) λ2 = 0 if x − y − z < 1.

(3) Lz = 1 − 2zλ1 + λ2 = 0
268 Introduction to the Theory of Optimization in Euclidean Space

From the first three equations, we deduce that

λ2 = 1 − 2xλ1 λ1 (x + y) = 1 λ1 (x + z) = 1.
Note that λ1 = 0 is not possible because we would have from (1) : λ2 = −1,
and from (2) : λ2 = 1. So λ1 = 0 and we have
1
x+y =x+z = =⇒ y = z.
λ1

1
∗ If x − y − z < 1, then λ2 = 0. We deduce that = 2x, thus
λ1
1
x+y =x+z = = 2x. So x = y = z, which inserted into (4) gives 3x2 = 1.
λ1
Hence, we have two solutions


1 1 1 3
(x, y, z) = ( √ , √ , √ ) with λ1 = , λ2 = 0
3 3 3 2

1 1 1 3
(x, y, z) = (− √ , − √ , − √ ) with λ1 = − , λ2 = 0.
3 3 3 2

∗ If x − y − z = 1, then with y = z and (4), we have



⎨ x = 1 + 2y 1 2
⇐⇒ (x, y) = (1, 0) or (x, y) = (− , − ).
⎩ 3 3
2y(3y + 2) = 0
We deduce then

(x, y, z) = (1, 0, 0) with λ1 = 1, λ2 = −1

1 2 2 1
(x, y, z) = (− , − , − ) with λ1 = −1, λ2 = .
3 3 3 3

ii) Regularity of the points. We have


⎡ ⎤
∂g1 ∂g1 ∂g1
∂x ∂y ∂z
 
 ⎢ ⎥ 2x 2y 2z
g (x, y, z) = ⎣ ⎦=
∂g2 ∂g2 ∂g2 1 −1 −1
∂x ∂y ∂z

1 1 1   1 1 1
g2 ( √ , √ , √ ) = √2 √2 √2 = −g2 (− √ , − √ , − √ )
3 3 3
3 3 3 3 3 3
Constrained Optimization-Inequality Constraints 269

1 1 1 1 1 1
rank(g2 ( √ , √ , √ )) = rank(g2 (− √ , − √ , − √ )) = 1.
3 3 3 3 3 3

   
 2 0 0  1 2 2 − 23 − 43 − 43
g (1, 0, 0) = g (− , − , − ) = .
1 −1 −1 3 3 3 1 −1 −1

1 2 2
rank(g  (1, 0, 0)) = rank(g  (− , − , − )) = 2.
3 3 3
The four points are regular. Moreover, we will not have to renumber the
variables since the first two column vectors of each derivative above are linearly
independent.

1 1 1
iii) Classification of the points (± √ , ± √ , ± √ ).
3 3 3
Here n = 3, p = 1, thus we have to consider the sign of the following bordered
Hessian determinants:

 
   0 2x 2y 2z 
 0 2x 2y   
   2x −2λ1 0 0 
B2 =  2x −2λ1 0 
 B3 =  .

 2y   2y 0 −2λ1 0 
0 −2λ1  
2z 0 0 −2λ1

We have
1 1 1 8 1 1 1
B2 ( √ , √ , √ ) = √ B3 ( √ , √ , √ ) = −12
3 3 3 3 3 3 3
1 1 1
(−1)r Br ( √ , √ , √ ) > 0 r = 2, 3.
3 3 3
Thus, the point is a local maximum since λ1 = 1 > 0 and (−1)2 B2 > 0,
(−1)3 B3 > 0.

1 1 1 8 1 1 1
B2 (− √ , − √ , − √ ) = − √ B3 (− √ , − √ , − √ ) = −12
3 3 3 3 3 3 3

1 1 1
(−1)1 Br (− √ , − √ , − √ ) > 0 r = 2, 3
3 3 3
Thus, the point is a local minimum since λ1 = −1 < 0 and (−1)1 B2 > 0,
(−1)1 B3 > 0.

1 2 2
iv) Classification of the points (1, 0, 0), (− , − , − ).
3 3 3
Here n = 3, p = 2, thus we have to consider the sign of the following bordered
Hessian determinant:
270 Introduction to the Theory of Optimization in Euclidean Space
 
 0 0 2x 2y 2z 

 0 0 1 −1 −1 


B3 (x, y, z) =  2x 1 −2λ1 0 0  .
 2y −1 0 −2λ 0 
 1
 2z −1 0 0 −2λ1 

∗ At (1, 0, 0) with λ1 = 1 and λ2 = −1, we have B3 (1, 0, 0) = −16. We conclude


that the point cannot be a local maximum because λ2 = −1  0. It cannot
also be a local minimum because the Hessian is not semi definite positive at
the point on the tangent plane
⎡ ⎤ ⎡ ⎤ ⎡ ⎤
 h   h    0 
2 0 0 ⎣ k ⎦= 0
M = ⎣ k ⎦: = k⎣ 1 ⎦ : k ∈ R
1 −1 −1 0
l l −1

⎡ ⎤⎡ ⎤
  −2 0 0 0
0 k −k ⎣ 0 −2 0 ⎦ ⎣ k ⎦ = −4k2  0 on M.
0 0 −2 −k

∗∗ At (− 13 , − 23 , − 23 ) with λ1 = −1 and λ2 = 13 , we have B3 (− 13 , − 23 , − 23 ) = 16.


We conclude that the point cannot be a local minimum because λ2 = 1/3  0.
It cannot also be a local maximum because the Hessian is not semi definite
negative at the point on the tangent plane
⎡ ⎤ ⎡ ⎤ ⎡ ⎤
 h  2 4 4
 h    0 
−3 −3 −3 ⎣ 0
M = ⎣ k ⎦: k ⎦= = k ⎣ 1 ⎦: k ∈ R
1 −1 −1 0
l l −1
⎡ ⎤⎡ ⎤
  2 0 0 0
0 k −k ⎣ 0 2 0 ⎦ ⎣ k ⎦ = 4k2  0 on M.
0 0 2 −k

v) The set of the constraints is a closed bounded set of R2 as it is the


intersection of the unit sphere [g1 = 1] and the region below the plane [g2 = 1].
By the extreme value theorem, f attains its extreme values on this set of the
constraints. Hence, we have

√ √
max f (x, y, z) = 3 and min f (x, y, z) = − 3.
g1 =1, g2 1 g1 =1, g2 1
Constrained Optimization-Inequality Constraints 271

4.4 Global Extreme Points-Inequality Constraints

When the Lagrangian is concave/convex on a convex constraint set, a so-


lution of the Karush-Kuhn-Tucker conditions is a global maximum/minimum
point.

Theorem 4.4.1 Let Ω ⊂ Rn , Ω be an open set and f, g1 , . . . , gm : Ω −→



R be C 1 functions. Let S ⊂ Ω be convex, x∗ ∈ S and

 L(x, λ) = f (x) − λ1 (g1 (x) − b1 ) − . . . − λm (gm (x) − bm )



 ∃ λ∗ = λ∗1 , . . . , λ∗m  : ∇x L(x∗ , λ∗ ) = 0


 ∗
 λj = 0 if gj (x∗ ) < bj j = 1, . . . , m.

Then, we have

λ∗  0 and L(., λ∗ ) is concave in x ∈ S =⇒ f (x∗ ) = max f (x)


S∩{x∈Ω: g(x)b}

λ∗  0 and L(., λ∗ ) is convex in x ∈ S =⇒ f (x∗ ) = min f (x)


S∩{x∈Ω: g(x)b}

Proof. i) First implication. The point x∗ is a critical point for the La-
grangian L(., λ∗ ) (∇x L(x∗ , λ∗ ) = 0) and L(., λ∗ ) is concave on the convex set
S, then x∗ is a global maximum for L(., λ∗ ) on S (by Theorem 2.3.4). Thus,
we have

L(x∗ , λ∗ ) = f (x∗ ) − λ∗1 (g1 (x∗ ) − b1 ) − . . . − λ∗m (gm (x∗ ) − bm )


 f (x) − λ∗1 (g1 (x) − b1 ) − . . . − λ∗m (gm (x) − bm ) = L(x, λ∗ ) ∀x ∈ S.

At x∗ , we have λ∗j  0, with λ∗j = 0 if gj (x∗ ) < bj j = 1, . . . , m


so

−λ∗j (gj (x∗ ) − bj ) = 0 j = 1, . . . , m,


272 Introduction to the Theory of Optimization in Euclidean Space

and, the previous inequality reduces to

L(x∗ , λ∗ ) = f (x∗ )  f (x) − λ∗1 (g1 (x) − b1 ) − . . . − λ∗m (gm (x) − bm ) = L(x, λ∗ ).
For each j = 1, . . . , m, we also have , λ∗j  0 and gj (x) − bj  0, then
−λ∗j (gj (x) − bj )  0. Therefore,

L(x∗ , λ∗ ) = f (x∗ )  L(x, λ∗ )  f (x) ∀x ∈ S ∩ {x ∈ Ω : g(x)  b}.


Hence x∗ solves the constrained problem.

ii) Second implication. This part can be deduced similarly. Moreover, it sug-
gests, for example, when looking for candidates for a maximization problem
that we keep the points with negative Lagrange multipliers and see if they
are global minima points without maximizing (−f ) and introducing another
Lagrangian.

Example 1. Solve the problem


min(max)f (x, y, z) = x2 + y 2 + z 2 s.t g(x, y, z) = x − 2z  −5.

Solution: Form the Lagrangian using the C ∞ functions f and g on R3 :

L(x, y, z, λ) = x2 + y 2 + z 2 − λ(x − 2z + 5)
Let us solve the system


⎪ (i) Lx = 2x − λ = 0






⎨ (ii) Ly = 2y = 0



⎪ (iii) Lz = 2z + 2λ = 0





(iv) λ = 0 if x − 2z + 5 < 0.
∗ If x − 2z + 5 < 0 , then λ = 0. From the equations (i), (ii) and (iii), we
deduce that (x, y, z) = (0, 0, 0). But, then the inequality x − 2z + 5 < 0 is
not satisfied.

∗∗ If x − 2z + 5 = 0 , then using (i), (ii) and (iii), we obtain


λ = 2x = −z, y = 0, x−2z+5 = 0 ⇐⇒ (x, y, z) = (−1, 0, 2) with λ = −2
which is the only candidate point for maximality.
Constrained Optimization-Inequality Constraints 273

Now, we study the convexity/concavity of L in (x, y, z) when λ = −2. We


have ⎡ ⎤
2 0 0
HL(.,−2) (x, y, z) = ⎣ 0 2 0 ⎦
0 0 2
The leading principal minors are such that: ∀(x, y, z) ∈ R3 ,

D1 (x, y, z) = 2 > 0, D2 (x, y, z) = 4 > 0, D3 (x, y, z) = 8 > 0.

Hence, L(., −2) is strictly convex in (x, y, z), and we conclude that the point
(−1, 0, 2) is the solution to the constrained manimization problem.

The maximization problem doesn’t have a solution, since there is only one
solution to the system and it is a global minimum point.

Interpretation. The problem looks for the shortest and farthest distance of the
origin to the space region located below the plan x − 2z + 5 = 0. The shortest
distance is attained on the plane.

Remark 4.4.1 The rank condition, at the point x∗ , is not assumed in the
theorem. The proof uses the characterization of a C 1 convex function on a
convex set only.

Example 2. In Example 4, Section 4.2, the point (1, 1) doesn’t satisfy the
rank condition. It solves the KKT conditions related to the problem with
linear constraints:

max F (x, y) = ln x + ln y subject to

2x + y  3, x + 2y  3 and x+y 2 with x > 0, y > 0.


Use concavity to show that (1, 1) solves the problem.

Solution: i) With the Lagrangian


1 1
L(x, y, λ1 , λ2 , λ3 ) = ln x+ ln y−λ1 (2x+y−3)−λ2 (x+2y−3)−λ3 (x+y−2),
2 4
the Hessian with respect to (x, y) is
⎡ ⎤
1
⎢ − x2 0 ⎥
HL(.,λ1 ,λ2 ,λ3 ) (x, y) = ⎣ 1 ⎦
0 − 2
y
274 Introduction to the Theory of Optimization in Euclidean Space

is strictly definite negative since the leading principal minors are such that
1 1
D1 (x, y) = − < 0, D2 (x, y) = >0
x2 x2 y 2
for (x, y) ∈ Ω = (0, +∞) × (0, +∞). So the Lagrangian is strictly concave in
(x, y) ∈ Ω, and (1, 1) is the maximum point.

Remark 4.4.2 The concavity/convexity hypothesis is a sufficient condi-


tion. We may have a global extreme point with a Lagrangian that is neither
concave nor convex (see Exercise 3).

Example 3. In Exercise 2, Section 4.3, the points


√ 4 √ √ 4 √
( 3, √ ) with λ = −2 3 and (− 3, − √ ) with λ=2 3
3 3
solve respectively the local min and local max problems

local max (min)f (x, y) = x2 y + 3y − 4 s.t g(x, y) = 4 − xy  0.

Are there global extreme points?

Solution: i) Let us explore the concavity and convexity of L with respect to


(x, y)
L(x, y, λ) = x2 y + 3y − 4 + λ(xy − 4)
The Hessian matrix of L in (x, y) is
   
Lxx Lxy 2y 2x + λ
HL = =
Lyx Lyy 2x + λ 0

When λ = 2 3, the principal minors are

Δ11 = Lyy = 0, Δ21 = Lxx = 2y and Δ2 = −(2x + λ)2 .

So L is neither concave nor convex in (x, y) ∈ R2 .



Similarly, when λ = −2 3 the principal minors are

Δ11 = Lyy = 0, Δ21 = Lxx = 2y and Δ2 = −(2x − 2 3)2 ,

and L is neither concave nor convex in (x, y).

Therefore, we cannot use this sufficient condition to conclude anything about


the global optimality of the candidate points.
Constrained Optimization-Inequality Constraints 275

ii) Note that, on the boundary of the constraint set [g  4], we have y = 4/x
and f takes the values
4 12
f (x, ) = 4x + −4
x x
and
4 4
lim f (x, ) = +∞ and lim f (x, ) = −∞.
x→+∞ x x→−∞ x
Hence f doesn’t attain an absolute maximum nor an absolute minimum value
on the constraint set.

Remark. Note that

√ √ 24 √ √ 24
f ( 3, 4/ 3) = √ − 4 f (− 3, −4/ 3) = − √ − 4.
3 3
√ √ √ √ √ √
√ f ( 3,√4/ 3) > f (− 3, −4/ 3), ( 3, 4/ 3) being a local
With √ minimum
√ and
(− 3, −4/ 3) being a local maximum,
√ √ we can see that ( 3, 4/ 3) cannot
be a global minimum and (− 3, −4/ 3) cannot be a global maximum. A
constrained global extreme point would be a local one since any point of the
set of the constraints g = 4 is an interior point and regular.

Example 4. Quadratic programming. The general quadratic program


(QP) can be formulated as
1 t
min xQx +t x.d s.t Ax  b
2
where Q is a symmetric n × n matrix, d ∈ Rn , b ∈ Rm and A an m × n matrix.
Introduce the Lagrangian
1 t
L(x, λ) = −( xQx +t x.d) − λ(Ax − b)
2
and write the KKT conditions

⎨ ∇x L = −Qx − d −t Aλ = 0

λi  0 with λi = 0 if (Ax)i < bi .

If (x∗ , λ∗ ) is a solution of the KKT conditions, and x∗ is a candidate point


where p constraints are active (Ax)ik = bik , k = 1, . . . , p, then the second
derivatives test at the point shows whether the point is a solution or not
since the HL (x, λ∗ ) = Q is constant and the constraints are linear. Thus the
positivity of the Hessian subject to these constraints is equivalent to test the
276 Introduction to the Theory of Optimization in Euclidean Space

bordered determinants formed from the matrix


⎡ t ⎤
  ai1
0 Ap ⎢ ⎥
t Ap = ⎣ ... ⎦ t
aik is the ik eme row vector of A
Ap Q
t
aip

Remark 4.4.3 * To sum up, solving an unconstrained or constrained op-


timization problem leads to solving a nonlinear system F (x, λ) = 0 that
appears in different forms

no constraints

f  (x) = 0

F(x, λ) = 0

 

equality constraints inequality constraints

∇x,λ L(x, λ) = 0 ∇x L(x, λ) = 0, λ.(g(x) − b) = 0

On the other hand, solving a nonlinear equation is not easy even when F
is a polynomial of degree 3 of one variable.

** The importance of the theorems studied comes from

- locating the possible candidates


- showing how to compare the values of f along the feasible directions.

These two points are the start for the development of numerical methods
for approaching the solution with accuracy (see [17], [19], [8], [4]).

*** The proofs we studied for optimization problems in the Euclidean space
constitute a natural step to more complex ones developed in calculus of
variation where the maximum and minimum are searched in a class of
functions and where the objective function is a function defined on that
class (see [16], [6], [9]).
Constrained Optimization-Inequality Constraints 277

Solved Problems

1. – Distance to an hyperplane. Let a ∈ Rn , a = 0, b ∈ R b > 0.


i) Solve
2 t
min x subject to a.x  b.
ii) Deduce the solution to the following problems.

α) min 5 + x2 + y 2 β) max −6x2 − 6y 2 − 6z 2 + 4
−x+y2 2x−y+2z−1

   
Solution: i) Let t a = a1 . . . an , t x = x1 . . . xn . The mini-
mization problem looks for points in the region above the hyperplane
t
a.x = b ⇐⇒ a 1 x 1 + a2 x 2 + . . . + a n x n = b

that are closest to the origin. It is a nonlinear minimization problem with


inequality constraints. We introduce the Lagrangian

L(x, λ) = −(x21 + x22 + . . . + x2n ) + λ(a1 x1 + a2 x2 + . . . + an xn − b)

and write the KKT conditions



⎪ Lx1 = −2x1 + λa1 = 0





⎪ ..

⎨ .



⎪ Lxn = −2xn + λan = 0






λ0 with λ = 0 if a1 x1 + a2 x2 + . . . + an xn > b.
278 Introduction to the Theory of Optimization in Euclidean Space

Finding a candidate.

* If a1 x1 + a2 x2 + . . . + an xn > b, then λ = 0. We get x1 = . . . = xn = 0. But


then, we have a contradiction with a1 (0) + . . . + an (0) = 0  b.

* If a1 x1 + a2 x2 + . . . + an xn = b, then
λ
x i = ai , i = 1, . . . , n
2
which inserted in the equation of the hyperplane, we obtain

λ λ λ λ b
a1 a1 + a 2 a2 + . . . + a n an = b ⇐⇒ = .
2 2 2 2 a 2
Hence, a solution to the system is
b b
xi = ai , i = 1, . . . , n ⇐⇒ x= a.
a 2 a 2

Finding the solution.

b
To study the concavity of L in x when λ = 2 , consider the Hessian matrix
a 2
⎡ ⎤ ⎡ ⎤
Lx 1 x 1 ... Lx 1 x n −2 . . . 0
⎢ .. .. .. ⎥ ⎢ . .. .. ⎥
HL(.,λ) (x) = ⎣ . . . ⎦ = ⎣ .. . . ⎦
Lxn x1 ... Lxn xn 0 ... −2
The leading minor principals are equal to Dk (x) = (−2)k , k = 1, . . . , n.
The matrix is semi-definite negative. Thus, the point maximizes − x 2 subject
to the constraint t ax  b. Hence, the point solves the minimization problem
and the minimal distance of the origin to this point is equal to
 b 
  b
 a = .
a 2 a

ii) α) Note that



min 5 + x2 + y 2 = min 5 + x2 + y 2 = 5 + min x2 + y 2 .
−x+y2 −x+y2 −x+y2

Moreover, we have

min x2 + y 2 = min⎡ ⎤ (x, y) 2 .


−x+y2   x
−1 1 .⎣ ⎦2
y
Constrained Optimization-Inequality Constraints 279

Thus
2 2 √
min x2 + y 2 =   =√ = 2
−x+y2 −1 2
1
and is attained at (x∗ , y ∗ ) = (−1, 1). Hence
 )

min 5+ x2 + y2 = 5+ 2.
−x+y2

β) We have

max −6x2 − 6y 2 − 6z 2 + 4 = 4 − 6 min x2 + y 2 + z 2


2x−y+2z−1 −2x+y−2z1

Thus

min x2 + y 2 + z 2 = min ⎡ ⎤ (x, y, z) 2 ,


−2x+y−2z1
 ⎢
x

−2 1 −2 .⎢
⎣ y ⎥1

z

6 6
max −6x2 − 6y 2 − 6z 2 + 4 = 4 − ⎡ ⎤ = 4 − √ = 2,
2x−y+2z−1 −2 9
⎣ 1 ⎦
−2
1
and is attained at (x∗ , y ∗ , z ∗ ) = (−2, 1, −2).
9

2. – Distance to an hyperplane with positive constraints.


i) Let a ∈ Rn , a = 0, b ∈ R b > 0. Solve
⎧ t
⎨ a.x  b
2
min x subject to

x0

ii) Minimize x2 + y 2 over the following sets

α) − y  −2, x  0 β) x − y  2, x  0, y  0

γ) − x + y  2, x  0, y  0 δ) x + y  2, x  0, y  0

Sketch graphs to check the solution.


280 Introduction to the Theory of Optimization in Euclidean Space
   
Solution: i) Let t a = a1 . . . an , t
x = x1 . . . xn . The min-
imization problem looks for points with positive coordinates in the region
above the hyperplane
t
a.x = b ⇐⇒ a1 x1 + a2 x2 + . . . + an xn = b,

and that are closest to the origin. It is a nonlinear minimization problem with
inequality constraints. We introduce the Lagrangian

L(x, λ) = −(x21 + x22 + . . . + x2n ) + λ(a1 x1 + a2 x2 + . . . + an xn − b)

and write the KKT conditions


⎪ Lx1 = −2x1 + λa1  0 (= 0 if x1 > 0)



⎪ ..
⎨ .



⎪ Lxn = −2xn + λan  0 (= 0 if xn > 0)


λ0 with λ = 0 if a1 x1 + a2 x2 + . . . + an xn > b.

Finding a candidate.

* If xi = 0 for each i ∈ {1, . . . , n}, then, a1 (0) + . . . + an (0) = 0  b > 0, and


we get a contradiction with.

* If xi0 > 0 for some i0 ∈ {1, . . . , n}, then


λ
−2xi0 + λai0 = 0 ⇐⇒ ai xi0 =
2 0
then λ > 0 and ai0 > 0. As a consequence, we have a1 x1 +a2 x2 +. . .+an xn = b.
Suppose xi > 0 for i ∈ {i0 , i1 , . . . , ip }, and xi = 0 for i = i0 , i1 , . . . , ip . Then,

λaj  0 for j = i0 , i1 , . . . , ip ⇐⇒ aj  0 for j = i0 , i1 , . . . , ip

since λ > 0. Hence, we can write


λ λ
xj = max(aj , 0) = (aj )+ for j = i0 , i1 , . . . , ip
2 2
and get a unified formula for the candidate point
λ +  
x∗ = a t +
a = a+
1 ... a+
n
2
Constrained Optimization-Inequality Constraints 281

Inserting the expression of x∗ in the equation of the hyperplane, we obtain

λ + λ λ λ b
a1 a + a2 a+ + . . . + a n an + =b ⇐⇒ = + .
2 1 2 2 2 2 a 2

Hence, a solution to the system is


b b
xi = a+ ,
2 i
i = 1, . . . , n ⇐⇒ x= a+ .
a+ a+ 2

Finding the solution.


b
To study the concavity of L in x when λ = 2 , consider the Hessian
a+ 2
matrix
⎡ ⎤ ⎡ ⎤
Lx 1 x 1 ... Lx 1 x n −2 ... 0
⎢ .. .. .. ⎥ ⎢ .. .. .. ⎥
HL(.,λ) (x) = ⎣ . . . ⎦ = ⎣ . . . ⎦
Lxn x1 ... Lxn xn 0 ... −2
The leading minor principals are equal to Dk (x) = (−2)k , k = 1, . . . , n.
The matrix is semi-definite negative. Thus, the point maximizes − x 2 subject
to the constraint t ax  b and to the positivity constraint x  0. Hence, the
point solves the minimization problem and the minimal distance of the origin
to this point is equal to
 b 
  b
 + 2 a+  = + .
a a

ii) Here, in filling Table 4.4, we have b = 2.

t t +
set a a a+ (x∗ , y ∗ )

α (0, 1) (0, 1) 1 (0, 2)

β (1, −1) (1, 0) 1 (2, 0)

γ (−1, 1) (0, 1) 1 (0, 2)


√ √ √
δ (1, 1) (1, 1) 2 ( 2, 2)

TABLE 4.4: Minima points for x2 + y 2 on the four sets

One can easily check the minimal distance of the origin to the given sets from
the graphics in Figure 4.21.
282 Introduction to the Theory of Optimization in Euclidean Space
y
y 2.5
5
2.0 y x2
4
1.5
3 x0, y2
1.0 xy2
2
y2
0.5 x0, y0
1
x0 x
1 2 3 4 5
x
0.5 0.5 1.0 1.5 2.0 0.5
y
y 2.5
5
x0, y0 2.0 xy2
4 xy2
1.5 x0, y0
3
1.0
2
0.5
1
x
1 2 3 4
x
0.5 0.5 1.0 1.5 2.0 0.5

FIGURE 4.21: Closest point of the constraint set to the origin

3. – L not convex nor concave Consider the following minimization


problem: ⎧ 2
⎪ y 4−x




min x2 + y 2 s.t y  3x





y  −3x

i) Sketch the feasible set.


ii) Write the problem as a maximization problem in the standard form,
and write down the necessary KKT conditions for a point (x∗ , y ∗ ) to
be a solution of the problem.
iii) Find the points that satisfy the KKT conditions. Check whether or
not each point is regular.

iv) Determine whether or not the point(s) in part ii) satisfy the second-
order sufficient condition.

v) Explore the concavity of the Lagrangian in (x, y) ∈ R2 .


vi) What can you conclude about the solution of the problem?

vii) Give a geometric interpretation of the problem that confirms the


solution you have found (Hint: use level curves).

Solution: i) The feasible set is the plane region located above the curve and
the two lines, as described in Figure 4.22.
Constrained Optimization-Inequality Constraints 283
y

y  3 x 4 y3x

y  4  x2

x
3 2 1 1 2 3

2

FIGURE 4.22: The constraint set S

ii) Writing the KKT conditions. The problem is equivalent to the follow-
ing maximization problem



⎪ g1 (x, y) = 4 − x2 − y  0



max (−x2 − y 2 ) subject to g2 (x, y) = 3x − y  0





g3 (x, y) = −3x − y  0.

Consider the Lagrangian

L(x, y, λ, β, γ) = −x2 − y 2 − λ(4 − x2 − y) − β(3x − y) − γ(−3x − y).

The conditions are




⎪ (1) Lx = −2x + 2λx − 3β + 3γ = 0







⎪ (2) Ly = −2y + λ + β + γ = 0



(3) λ0 with λ=0 if 4 − x2 − y < 0







⎪ (4) β0 with β=0 if 3x − y < 0





(5) γ0 with γ=0 if − 3x − y < 0
284 Introduction to the Theory of Optimization in Euclidean Space

iii) Solving the equations satisfying the KKT conditions.

• If 4 − x2 − y < 0 then λ = 0 and



⎨ −2x − 3β + 3γ = 0

−2y + β + γ = 0
then we discuss

∗ Suppose 3x − y < 0, then β = 0. Thus,



⎨ −2x + 3γ = 0

−2y + γ = 0

we get 2x = 3γ = 6y =⇒ x = 3y. But, then 3x − y = 3(3y) − y = 8y < 0 and


hence γ < 0 which contradicts γ  0.

∗ Suppose 3x − y = 0. We have then


⎨ −2x − 3β + 3γ = 0
=⇒ 6γ = 20x and 3β = 8x  0.

−2(3x) + β + γ = 0

We deduce that x  0 and y  0. So x = y = 0 since −3x − y  0. But, this


contradicts 4 − 02 − 0 = 4 < 0.

•• If 4 − x2 − y = 0 then

∗ Suppose 3x − y < 0 then β = 0 and



⎨ −2x + 2λx + 3γ = 0

−2y + λ + γ = 0

– Suppose −3x − y < 0 then γ = 0 and

⎧ ⎧
⎨ −2x + 2λx = 0 ⎨ 2x(−1 + λ) = 0 ⇐⇒ x = 0 or λ = 1
⇐⇒
⎩ ⎩
−2y + λ = 0 2y = λ  0
√ √
◦ λ = 1 leads to y = 1/2 and x = ± 7/2. But, for (x, y) = (√7/2, 1/2),
the inequality 3x − y < 0 is not satisfied, and for (x, y) = (− 7/2, 1/2),
the inequality −3x − y < 0 is not satisfied. So we cannot have λ = 1.
Constrained Optimization-Inequality Constraints 285

◦ x = 0 leads to y = 4 and λ = 8 > 0. The two inequalities 3x − y < 0


and −3x − y < 0 are satisfied at this point. Hence, the following point
is a solution:

(x∗ , y ∗ ) = (0, 4) with (λ∗ , β ∗ , γ ∗ ) = (8, 0, 0) ←−

– Suppose −3x − y = 0 then y = −3x and


⎧ ⎧
⎨ −2x + 2λx + 3γ = 0 ⎨ −2x + 2λx + 3γ = 0
=⇒
⎩ ⎩
−2(−3x) + λ + γ = 0 λ + γ = −6x

From 4 − x2 − y = 0, we have

y = −3x and 4 − x2 − y = 0 ⇐⇒ (x, y) = (−1, 3) or (4, −12).

The point (4, −12) doesn’t satisfy the inequality 3x − y < 0, so it cannot be
a solution.

The point (−1, 3) satisfies the inequality 3x − y < 0, and we have



⎨ −2λ + 3γ = −2
=⇒ (λ, γ) = (4, 2).

λ+γ =6
Thus, we have another candidate point:

(x∗ , y ∗ ) = (−1, 3) with (λ∗ , β ∗ , γ ∗ ) = (4, 0, 2) ←−

∗∗ Suppose 3x − y = 0 then y = 3x. We have

y = 3x and 4 − x2 − y = 0 ⇐⇒ (x, y) = (−4, −12) or (1, 3).

The points (−4, −12) doesn’t satisfy the inequality −3x − y  0, so it cannot
be a candidate.

The point (1, 3) satisfies the inequality −3x − y < 0, thus γ = 0, and we have

⎨ 2λ − 3β = 2
=⇒ (λ, β) = (4, 2).

λ+β =6
286 Introduction to the Theory of Optimization in Euclidean Space

Thus, we have another candidate point:

(x∗ , y ∗ ) = (1, 3) with (λ∗ , β ∗ , γ ∗ ) = (4, 2, 0) ←−

Regularity of the candidate point (0, 4). Only the constraint g1 (x, y) =
4 − x2 − y is active at (0, 4) and we have
   
g1 (x, y) = −2x −1 g1 (0, 4) = 0 −1 rank(g1 (0, 4)) = 1.

Thus the point (0, 4) is a regular point.

Regularity of the candidate point (−1, 3). Only the constraints g1 (x, y) =
4 − x2 − y and g3 (x, y) = −3x − y are active at (−1, 3) and we have
         
g1 (x, y) −2x −1 g1 (−1, 3) 2 −1
= = .
g3 (x, y) −3 −1 g3 (−1, 3) −3 −1
 
g1 (−1, 3)
Thus the point (1, −3) is a regular point since rank( ) = 2.
g3 (−1, 3)

Regularity of the candidate point (1, 3). Only the constraints g1 (x, y) =
4 − x2 − y and g2 (x, y) = 3x − y are active at (1, 3) and we have
         
g1 (x, y) −2x −1 g1 (1, 3) −2 −1
= = .
g2 (x, y) 3 −1 g2 (1, 3) 3 −1
  
g1 (1, 3)
Thus the point (1, 3) is a regular point since rank( ) = 2.
g2 (1, 3)

iv) With p = 2 (the number of active constraints) at the points (3, −1) and
(3, 1), n = 2 (the dimension of the space), then p = n. The second derivatives
test cannot be applied since it is established for p < n.

For the point (0, 4), we have p = 1 < 2 = n. We consider the following
determinant (r = p + 1 = 2) (Note that the first column vector of [g1 (x, y)] is
linearly dependent, so we have to renumber the variables)

   
 0 ∂g1 ∂g1   0 
 ∂y ∂x   −1 −2x 
 1   −1 −2 
B2 (x, y) =  ∂g Lyy Lyx  =  0 
 ∂g
∂y   −2x 0 
 1 Lxy Lxx  −2 + 2λ
∂x
Constrained Optimization-Inequality Constraints 287

∗ At (0, 4), we have λ = 8,


 
 0 −1 0 
 
B2 (0, 4) =  −1 −2 0  = −14.

 0 0 14 

We have (−1)2 B2 (0, 4) = −14 < 0. So the second derivatives test is not
satisfied at (0, 4).

v) Let us explore the concavity and convexity of L with respect to (x, y) where
the Hessian matrix of L in (x, y) is
   
Lxx Lxy −2 0
HL = =
Lyx Lyy 0 −2 + 2λ
When λ = 8 or 4, the principal minors are

Δ11 = Lyy = −2 + 2λ > 0 Δ21 = Lxx = −2 < 0 Δ2 = 4(1 − λ) < 0.

Therefore, L is neither concave, nor concave in (x, y).

vi) We have a situation where the theorems studied remain inconclusive. To


conclude, we proceed by comparison. Since, the candidate points are on the
boundary of the constraint set, let us study directly the values of the objective
function on these points.

On the lines y = ±3x, with |x|  1, the function f (x, y) = x2 + y 2 takes the
values

f (x, ±3x) = x2 + (±3x)2 = 10x2  10 = f (1, ±3) ∀ |x|  1.

On the parabola x2 = 4 − y, with |x|  1,we have

f (x, 4 − x2 ) = x2 + (4 − x2 )2 = x4 − 8x2 + 16 + x2 = x4 − 7x2 + 16 = ϕ(x)



ϕ (x) = 4x3 − 14x = 2x(2x2 − 7) = 0 ⇐⇒ x = 0, ± 7/2.
By the extreme value theorem, ϕ attains its extreme values on the closed
bounded interval [−1, 1] at the critical points inside the interval (−1, 1) or at
the end points. Therefore, we have

min ϕ(x) = min{ϕ(−1), ϕ(0), ϕ(1)} = min{10, 16, 10} = 10.


[−1,1]

Thus,

f (x, 4 − x2 ) = ϕ(x)  10 = f (±1, 3) ∀ |x|  1.


288 Introduction to the Theory of Optimization in Euclidean Space

So we can conclude that the minimum value attained by f on the set of the
constraints is 10.

vii) The feasible set is

S = {(x, y) : 4 − x2 − y  0, 3x − y  0, −3x − y  0}
2 2
The level curves of f , with equations
√ : x + y = k where k  0, are circles
centered at (0, 0) with radius k; see Figure 4.23.
If we increase the values of the radius, the values of f increase. The value
k = 10 is the first one at which the level curve intersects the constraints
g1 = g2 = 0 and g1 = g3 = 0. Thus the value 10 is the minimal value of f
reached at (±1, 3).

Moreover, the objective function f (x, y) = x2 +y 2 is the square of the distance


between (x, y) and (0, 0). So our problem is to find the point(s) in the feasible
region that are closest to (0, 0).
y

x
4 2 2 4

2

4

FIGURE 4.23: Level curves of f and the closest points of S to the origin

4. – The data in Table 4.5 can be found in [8]. Here we consider boundary
conditions to illustrate an inequality constrained problem.
The Body Fat Index (BFI) measures the fitness of an individual. It is a function of the body density ρ (in units of kilograms per liter) according to Brozek's formula,

   BFI = 457/ρ − 414.2.

However, the accurate measurement of ρ is costly. An alternative is to try to describe the dependence of the BFI on five variables x1, x2, x3, x4, x5 in the form

   f : x −→ BFI = y = f(x) = a1x1 + a2x2 + a3x3 + a4x4 + a5x5.

The variables are easier to measure and represent

   x1 = weight (lb.),   x2 = height (in.),   x3 = abdomen (cm.),
   x4 = wrist (cm.),    x5 = neck (cm.),     y = BFI.

Using the following table of measurements, we require the averages x̄i of each category of measurements to satisfy

   (∗)   a1x̄1 + a2x̄2 + a3x̄3 + a4x̄4 + a5x̄5 ≤ ȳ,

hoping to find a model in which BFI ≤ ȳ = 15.23.


i) Use a software to find a linear function f which best fits the given data in the sense of least squares, i.e., find a that minimizes the sum of the squared errors

   Σ_{i=1}^{10} (f(xⁱ) − yi)² = Σ_{i=1}^{10} (a1 xi1 + a2 xi2 + a3 xi3 + a4 xi4 + a5 xi5 − yi)²   subject to (∗),

where xⁱ = (xi1, xi2, xi3, xi4, xi5) are the measurements for the i-th individual.

ii) Formulate the constrained problem using matrices. Use Maple to check that the Hessian of the resulting objective function is positive definite on the convex set described by (∗).

Solution: We use Maple software for solving the problem.


i) Finding the linear regression of best fit.
We solve the “least square problem” “LS” with ten linear residuals. The objective function is

   ϕ(a1, a2, a3, a4, a5) = ½ [ (154.25a1 + 67.75a2 + 85.2a3 + 17.1a4 + 36.2a5 − 12.6)²
      + (173.25a1 + 72.25a2 + 83a3 + 18.2a4 + 38.5a5 − 6.9)²
      + (154a1 + 66.25a2 + 87.9a3 + 16.6a4 + 34a5 − 24.6)²
      + (184.75a1 + 72.25a2 + 86.4a3 + 18.2a4 + 37.4a5 − 10.9)²
      + (184.25a1 + 71.25a2 + 100a3 + 17.7a4 + 34.4a5 − 27.8)²

x1 x2 x3 x4 x5 y
154.25 67.75 85.2 17.1 36.2 12.6
173.25 72.25 83 18.2 38.5 6.9
154 66.25 87.9 16.6 34 24.6
184.75 72.25 86.4 18.2 37.4 10.9
184.25 71.25 100 17.7 34.4 27.8
210.25 74.75 94.4 18.8 39 20.6
181 69.75 90.7 17.7 36.4 19
176 72.5 88.5 18.8 37.8 12.8
191 74 82.5 18.2 38.1 5.1
198.25 73.5 88.6 19.2 42.1 12

TABLE 4.5: Measurements involved in BFI

      + (210.25a1 + 74.75a2 + 94.4a3 + 18.8a4 + 39a5 − 20.6)²
      + (181a1 + 69.75a2 + 90.7a3 + 17.7a4 + 36.4a5 − 19)²
      + (176a1 + 72.5a2 + 88.5a3 + 18.8a4 + 37.8a5 − 12.8)²
      + (191a1 + 74a2 + 82.5a3 + 18.2a4 + 38.1a5 − 5.1)²
      + (198.25a1 + 73.5a2 + 88.6a3 + 19.2a4 + 42.1a5 − 12)² ].

with(Optimization) : LSSolve(
[154.25a1 + 67.75a2 + 85.2a3 + 17.1a4 + 36.2a5 − 12.6, 173.25a1 + 72.25a2 + 83a3 + 18.2a4 +
38.5a5 − 6.9,
154a1 + 66.25a2 + 87.9a3 + 16.6a4 + 34a5 − 24.6, 184.75a1 + 72.25a2 + 86.4a3 + 18.2a4 +
37.4a5 − 10.9,
184.25a1 + 71.25a2 + 100a3 + 17.7a4 + 34.4a5 − 27.8, 210.25a1 + 74.75a2 + 94.4a3 + 18.8a4 +
39a5 − 20.6,
181a1 +69.75a2 +90.7a3 +17.7a4 +36.4a5 −19, 176a1 +72.5a2 +88.5a3 +18.8a4 +37.8a5 −12.8,
191a1 +74a2 +82.5a3 +18.2a4 +38.1a5 −5.1, 198.25a1 +73.5a2 +88.6a3 +19.2a4 +42.1a5 −12],
{180.7a1 + 71.425a2 + 88.72a3 + 18.05a4 + 37.39a5 ≤ 15.23})
[15.0549945448635683, [a1 = 0.474753096134219e − 1, a2 = −1.03634130223772,
a3 = 1.22920301075594, a4 = −1.86308283592359, a5 = .140089140413700]]
Thus

   f(x1, x2, x3, x4, x5) ≈ 0.047x1 − 1.036x2 + 1.229x3 − 1.863x4 + 0.140x5.

f can be used to predict an individual's body fat index, based upon the five measurement types.

Comments. - Least squares problems are solved by the LSSolve command.

- When the residuals in the objective function and the constraints are all linear, which is the case here, an active-set method is used. This is an approximate (numerical) method [19], [22], [17].

- The LSSolve command uses methods implemented in a built-in library provided by the Numerical Algorithms Group.

ii) Finding the linear regression of best fit using matrices.

Let G = (xi1, xi2, xi3, xi4, xi5)_{i=1,...,10} ∈ M_{10,5} be the matrix whose rows are the vectors xⁱ, or equivalently, the matrix whose columns are the first five columns of the table. Let c be the last column of the table. Denote

   a = ᵗ(a1, a2, a3, a4, a5),   A = (180.7, 71.425, 88.72, 18.05, 37.39),   b = 15.23.
Then

   ϕ(a) = ϕ(a1, a2, a3, a4, a5) = ½ [ ((G.a − c)1)² + . . . + ((G.a − c)10)² ] = ½ ‖G.a − c‖²

and the problem can be expressed as

   min ½ ‖G.a − c‖²   subject to   A.a ≤ b.
Following Maple’s instructions, we enter the data using matrices

with(Optimization) :
c := V ector([12.6, 6.9, 24.6, 10.9, 27.8, 20.6, 19, 12.8, 5.1, 12],
datatype = f loat) :
G := M atrix([[154.25, 67.75, 85.2, 17.1, 36.2], [173.25, 72.25, 83, 18.2, 38.5],
[154, 66.25, 87.9, 16.6, 34], [184.75, 72.25, 86.4, 18.2, 37.4],
[184.25, 71.25, 100, 17.7, 34.4], [210.25, 74.75, 94.4, 18.8, 39],
[181, 69.75, 90.7, 17.7, 36.4], [176, 72.5, 88.5, 18.8, 37.8],
[191, 74, 82.5, 18.2, 38.1], [198.25, 73.5, 88.6, 19.2, 42.1]],
datatype = f loat) :
with(Statistics) :
A := M ean(G) :
b := M ean(c) :
A := M atrix([[180.7, 71.425, 88.72, 18.05, 37.39]], datatype = f loat) :
b := V ector([15.23], datatype = f loat) :
lc := [A, b] :
LSSolve([c, G], lc) :
   [ 15.0549945448635683, ᵗ(0.0474753096134219, −1.03634130223772, 1.22920301075594, −1.86308283592359, 0.140089140413700) ]
Hence, we obtain the same coefficients ai .

The Hessian of ϕ: since

   ϕ(a) = ½ ‖G.a − c‖² = ½ (G.a − c).(G.a − c) = ½ ( ‖G.a‖² − 2 ᵗc.G.a + ‖c‖² ) = ½ ( ᵗa ᵗG G.a − 2 ᵗc.G.a + ‖c‖² ),

we have

   ϕ′(a) = ᵗG G.a − ᵗG.c,        ϕ″(a) = ᵗG G.
Checking that the Hessian is positive definite.
with(LinearAlgebra)
H := M ultiply(T ranspose(G), G) :
IsDef inite(H)
true
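The same two checks can be reproduced outside Maple. The following sketch (Python/NumPy/SciPy, given only as an illustrative alternative) builds G and c from Table 4.5, verifies that ᵗG G is positive definite, and re-solves the constrained least squares problem; the coefficients should be comparable to those reported above:

    import numpy as np
    from scipy.optimize import minimize

    G = np.array([
        [154.25, 67.75,  85.2, 17.1, 36.2],
        [173.25, 72.25,  83.0, 18.2, 38.5],
        [154.00, 66.25,  87.9, 16.6, 34.0],
        [184.75, 72.25,  86.4, 18.2, 37.4],
        [184.25, 71.25, 100.0, 17.7, 34.4],
        [210.25, 74.75,  94.4, 18.8, 39.0],
        [181.00, 69.75,  90.7, 17.7, 36.4],
        [176.00, 72.50,  88.5, 18.8, 37.8],
        [191.00, 74.00,  82.5, 18.2, 38.1],
        [198.25, 73.50,  88.6, 19.2, 42.1],
    ])
    c = np.array([12.6, 6.9, 24.6, 10.9, 27.8, 20.6, 19.0, 12.8, 5.1, 12.0])
    A, b = G.mean(axis=0), c.mean()            # averages: A ~ (180.7, ...), b = 15.23

    H = G.T @ G                                # Hessian of phi
    print(np.all(np.linalg.eigvalsh(H) > 0))   # True: positive definite

    phi = lambda a: 0.5 * np.sum((G @ a - c) ** 2)
    grad = lambda a: G.T @ (G @ a - c)
    cons = [{"type": "ineq", "fun": lambda a: b - A @ a}]   # A a <= b
    res = minimize(phi, np.zeros(5), jac=grad, method="SLSQP", constraints=cons)
    print(res.x)   # coefficients comparable to those returned by Maple's LSSolve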

4.5 Dependence on Parameters

The cost to produce an output Q is equal to rK + wL where r and w are respectively the prices of the input capital K and labor L. The firm would like the output to obey the Cobb-Douglas production function Q = cK^a L^b (r > 0, w > 0, c > 0, a + b < 1). Thus, to minimize the cost of production, the problem is expressed as:

   min rK + wL   subject to   cK^a L^b = Q

with (K, L) ∈ (0, +∞) × (0, +∞). Using Lagrange's multiplier method, the unique solution is (see Example 1, Section 3.4)

   K∗ = λ∗ (aQ/r),   L∗ = λ∗ (bQ/w),   λ∗ = (Q/c)^{1/(a+b)} (aQ/r)^{−a/(a+b)} (bQ/w)^{−b/(a+b)}.
One can see the dependence of the extreme point on the parameters r, w, c, a, b. In general, it is not easy to express the solution explicitly in terms of many parameters. On the other hand, changing the parameters and solving a new optimization problem is costly or difficult. An alternative is to estimate how much the optimal value changes compared to an initial situation.

To set the main result of this section, we suppose the objective function f and the constraint function g depend on a parameter r ∈ Rᵏ, i.e.

   f(x, r) = f(x1, . . . , xn, r1, . . . , rk),
   g(x, r) = g(x1, . . . , xn, r1, . . . , rk),   g = (g1, . . . , gm),
   I(x(r)) = {i ∈ {1, · · · , m} : gi(x(r), r) = 0}   (the set of active constraints).

Consider the problem (Pr)

   f∗(r) = local max f(x, r)   (resp. local min)   s.t.   g(x, r) ≤ 0

and introduce the Lagrangian

   L(x, λ, r) = f(x, r) − λ1 g1(x, r) − . . . − λm gm(x, r).

Hypothesis (Hr). f and g are C² functions in a neighborhood of x∗ and for each r ∈ Bδ(r̄) ⊆ Rᵏ, such that:

   gi(x∗, r̄) = 0 if i ∈ I(x∗) = {i1, i2, · · · , ip},   p < n,

   λ∗j = 0 if gj(x∗, r̄) < 0,   j ∉ I(x∗),

   rank(G′(x∗, r̄)) = p,   where G′(x∗, r̄) = (g′_{i1}(x∗, r̄), . . . , g′_{ip}(x∗, r̄)),

   ∇xL(x∗, λ∗, r̄) = 0 for a unique vector λ∗ = (λ∗1, . . . , λ∗m).

Theorem 4.5.1. Assume that (Hr) holds and

- x∗ = x(r̄) solves (Pr̄),

- the second derivatives test for strict maximality with λ∗ ≥ 0 (resp. minimality with λ∗ ≤ 0) is satisfied when r = r̄.

Then

− ∃η ∈ (0, δ] such that x(.) : r −→ x(r) and λ(.) : r −→ λ(r) are C¹(Bη(r̄)),

− f∗ : r −→ f(x(r), r) is C¹ on Bη(r̄) and

   ∂f∗/∂rj (r) = ∂L/∂rj (x(r), λ(r), r),   j = 1, . . . , k.

Remark 4.5.1 As a consequence of the regularity of the optimal value function f∗, we have the following approximation

   f∗(r) ≈ f∗(r̄) + Σ_{j=1}^{k} (∂f∗/∂rj)(r̄) (rj − r̄j)

where

   (∂f∗/∂rj)(r̄) = (∂L/∂rj)(x, λ, r) |_{x=x∗, λ=λ∗, r=r̄},   j = 1, . . . , k.

Thus, we can estimate the change of f∗ when the parameter r remains close to r̄.

Proof. We write the proof for the constrained case.

Step 1. If we assume that the C¹ regularity of x(r) and λ(r) is established, then we have

   L(x(r), λ(r), r) = f(x(r), r) − λ1(r)g1(x(r), r) − . . . − λm(r)gm(x(r), r)
                    = f(x(r), r) − Σ_{i∈I(x(r))} λi(r)gi(x(r), r) − Σ_{i∉I(x(r))} λi(r)gi(x(r), r)
                    = f(x(r), r) = f∗(r),

because gi(x(r), r) = 0 for i ∈ I(x(r)) and λi(r) = 0 for i ∉ I(x(r)); then, using the Chain rule formula, we obtain
   ∂f∗/∂rj (r) = ∂/∂rj [ L(x(r), λ(r), r) ]

   = Σ_{i=1}^{n} (∂L/∂xi)(x(r), λ(r), r) (∂xi/∂rj)(r) + Σ_{t=1}^{m} (∂L/∂λt)(x(r), λ(r), r) (∂λt/∂rj)(r)
     + Σ_{l=1}^{k} (∂L/∂rl)(x(r), λ(r), r) (∂rl/∂rj)

   = Σ_{i=1}^{n} [ ∂f/∂xi (x(r), r) − λ1(r) ∂g1/∂xi (x(r), r) − . . . − λm(r) ∂gm/∂xi (x(r), r) ] (∂xi/∂rj)(r)
     + Σ_{t=1}^{m} [ − gt(x(r), r) ] (∂λt/∂rj)(r) + ∂L/∂rj (x(r), λ(r), r).

Since x(r) optimizes f(x, r) subject to the constraints g(x, r) ≤ 0, the necessary condition gives, for each i = 1, . . . , n,

   ∂f/∂xi (x(r), r) − λ1(r) ∂g1/∂xi (x(r), r) − . . . − λm(r) ∂gm/∂xi (x(r), r) = 0.

Now, since gt(x(r), r) = 0 for t ∈ I(x(r)) and λt(r) = 0 for t ∉ I(x(r)), we have

   Σ_{t=1}^{m} [ − gt(x(r), r) ] (∂λt/∂rj)(r) = 0.

Hence

   ∂f∗/∂rj (r) = ∂L/∂rj (x(r), λ(r), r).

Step 2. To prove the theorem, it remains to check that x(r) and λ(r) are well defined and C¹. For this, we will need the implicit function theorem recalled at the end of the proof. First, set
 
   λp(r) = (λ_{i1}(r), . . . , λ_{ip}(r)),   λ∗p = (λ∗_{i1}, . . . , λ∗_{ip}),

   U(x, λp, r) = f(x, r) − λ_{i1} g_{i1}(x, r) − . . . − λ_{ip} g_{ip}(x, r),

   F(x, λp, r) = ∇_{x,λp} U(x, λp, r)
               = ( ∂f/∂x1 (x, r) − Σ_{k=1}^{p} λ_{ik} ∂g_{ik}/∂x1 (x, r), . . . ,
                   ∂f/∂xn (x, r) − Σ_{k=1}^{p} λ_{ik} ∂g_{ik}/∂xn (x, r),
                   −g_{i1}(x, r), . . . , −g_{ip}(x, r) ).

Consider the following system of equations:

   F(x, λp, r) = 0.

By assumption, we have

– F is a C¹ function in the open set A = Ω × Rᵖ × B(r̄, δ), where Ω is an open neighborhood of x∗;

– F(x∗, λ∗p, r̄) = 0;

– (x∗, λ∗p, r̄) ∈ Ω × Rᵖ × B(r̄, δ), so (x∗, λ∗p, r̄) is an interior point;

– det(∇_{x,λp} F(x∗, λ∗p, r̄)) = det |  (L_{xi xj})      −G′(x∗, r̄)  |
                                    | −ᵗG′(x∗, r̄)          0        |
   = (−1)^{2p} Bn(x∗, λ∗p, r̄) ≠ 0,   Bn : bordered Hessian determinant.

Then, by the implicit function theorem, there exist open balls

   B_{ε1}(x∗) ⊂ Rⁿ,   B_{ε2}(λ∗p) ⊂ Rᵖ,   Bη(r̄) ⊂ Rᵏ,   ε1, ε2, η > 0,

with

   B_{ε1}(x∗) × B_{ε2}(λ∗p) × Bη(r̄) ⊆ A,
   det(∇_{x,λp} F(x, λp, r)) ≠ 0 in B_{ε1}(x∗) × B_{ε2}(λ∗p) × Bη(r̄),

such that

   ∀r ∈ Bη(r̄), ∃!(x, λp) ∈ B_{ε1}(x∗) × B_{ε2}(λ∗p) : F(x, λp, r) = 0,

and (x, λp) : Bη(r̄) −→ B_{ε1}(x∗) × B_{ε2}(λ∗p), r −→ (x(r), λp(r)), are C¹ functions.



Remark 4.5.2 * In the theorem above, the local max(min) problem can
be replaced by the max(min) problem, provided we assume, for example,

− ∀r ∈ B(r̄, δ), x −→ L(x, λ∗ , r) is strictly concave (resp. convex)

* For the unconstrained case, L is reduced to f , F (x, r) = ∇x f (x, r) and


det(∇x F (x∗ , r̄)) = detHf (x∗ , r̄).

Example 1. Suppose that when a firm produces and sells x units of a com-
modity, it has a revenue R(x) = x, while the cost is C(x) = x2 .
i) Find the optimal choice of units of the commodity that maximize profit.
ii) Find the approximate change of the optimal profit if the revenue changes
to 0.99x.

Solution: i) The profit is given by

   P(x) = R(x) − C(x) = x − x²   with x > 0.

Since the set of the constraints S = (0, +∞) is an open set and the profit function is regular, the optimal point, if it exists, is a critical point, solution of the equation

   dP/dx = 1 − 2x = 0   ⇐⇒   x = 1/2.

Moreover, we have

   d²P/dx² = −2 < 0   ∀x ∈ S.

Then P is strictly concave on the convex set S. Hence, the only critical point x = 1/2 is a global maximum point. Thus x∗ = 1/2 units should be produced to achieve maximum profit.

ii) Introduce the new profit function with the new revenue rx where r > 0 :

P (x, r) = rx − C(x) = rx − x2 with x > 0.

Proceeding as in i), one can verify that

1. For r close to 1, we have d²P/dx² (x, r) = −2 < 0. Thus P(., r) is concave in x.

2. The second order condition for strict maximality is satisfied when r = 1.

3. P(1/2, 1) = max_S P(x, 1) = 1/2 − 1/4 = 1/4.
As a consequence,

– ∃η > 0 such that the function P∗(r) = max_{x∈S} P(x, r) is defined for any r ∈ (1 − η, 1 + η),

– P∗ is C¹ and

   dP∗/dr (1) = (∂P/∂r)(x, r) |_{x=1/2, r=1} = x |_{x=1/2, r=1} = 1/2.

We can write the following approximation

   P∗(r) ≈ P∗(1) + dP∗/dr (1)(r − 1) = 1/4 + ½ (r − 1)   for r close to 1.
In particular, for r = 0.99, the objective function P ∗ takes the following
approximate value:

P ∗ (0.99) ≈ 0.25 + 0.5(0.99 − 1) = 0.25 − 0.5(0.01) = 0.245

and the approximate change in the maximum profit is

   P∗(0.99) − P∗(1) ≈ −0.5(0.01) = −0.005.

* Note that, for this example, we easily obtain the exact value of the objective function P∗; see Figure 4.24. Indeed, we have

   P∗(r) = P(x∗(r), r) = P(r/2, r) = r(r/2) − (r/2)² = r²/4,

from which we deduce

   P∗(0.99) = (0.99)²/4 = 0.245025.

We also have the following equality

   dP∗/dr = r/2 = x∗(r) = (∂P(x, r)/∂r) |_{x=x∗(r)}.
FIGURE 4.24: Highest profit for r = 1 and r = 0.99
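A two-line computation (Python, illustrative only) confirms how close the linear estimate is to the exact value:

    r = 0.99
    print(r ** 2 / 4)              # exact value:   P*(r) = r^2/4 = 0.245025
    print(0.25 + 0.5 * (r - 1.0))  # first-order:   P*(1) + (1/2)(r - 1) = 0.245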



Remark 4.5.3 In particular, when r = b, f(x, r) = f(x) and g(x, r) = g(x) − b, we have

   (∂f∗/∂bj)(b̄) = (∂L/∂bj)(x, λ, b) |_{x=x∗, λ=λ∗, b=b̄} = λj(b̄),   j = 1, . . . , m.

This tells us that the Lagrange multiplier λj = λj(b̄) for the j-th constraint is the rate at which the optimal value function changes with respect to the parameter bj at the point b̄. Using the linear approximation formula,

   f∗(b) − f∗(b̄) ≈ (∂f∗/∂b1)(b̄)(b1 − b̄1) + · · · + (∂f∗/∂bm)(b̄)(bm − b̄m)
                 = λ1(b̄)(b1 − b̄1) + · · · + λm(b̄)(bm − b̄m),

the change in the optimal value function is estimated when one or more components of the resource vector are slightly changed.

Example 2. For b close to 3, estimate

   f∗(b) = local max f(x, y, z) = xy + yz + xz   subject to   x + y + z = b,

knowing that (see Example 2, Section 3.3)

   f∗(3) = f(1, 1, 1) = 3,   λ(3) = 2,   (−1)^r Br(1, 1, 1) > 0 for r = 2, 3.

Solution: We can deduce that f∗ ∈ C¹(3 − η, 3 + η) for some η > 0, and write the linear approximation

   f∗(b) ≈ f∗(3) + (∂f∗/∂b)(3)(b − 3)   for b close to 3.

If we denote by

   L(x, y, z, λ, b) = xy + yz + xz − λ(x + y + z − b)

the Lagrangian associated with the new constrained maximization problem, then we have

   (∂f∗/∂b)(3) = (∂L/∂b)(x(b), y(b), z(b), λ(b), b) |_{b=3} = λ(b) |_{b=3} = 2,

   f∗(b) ≈ 3 + 2(b − 3)   for b close to 3.
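This estimate is easy to test numerically. The sketch below (Python/SciPy, illustrative; the constrained maximization is solved with SLSQP) compares the computed maximum with the linear approximation 3 + 2(b − 3) for b near 3:

    import numpy as np
    from scipy.optimize import minimize

    def f_star(b):
        obj = lambda v: -(v[0] * v[1] + v[1] * v[2] + v[0] * v[2])     # maximize f <=> minimize -f
        con = [{"type": "eq", "fun": lambda v: v[0] + v[1] + v[2] - b}]
        res = minimize(obj, np.ones(3), method="SLSQP", constraints=con)
        return -res.fun

    for b in (2.9, 3.0, 3.1):
        print(b, f_star(b), 3 + 2 * (b - 3))   # numerical maximum vs linear estimate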



Solved Problems

1. – Irregular value function. i) Show that the value function

   f∗(r) = max_{x∈[−1,1]} (x − r)²

is not differentiable on R. Is there a contradiction with the theorem?

ii) Can you expect regularity for the value function

   g∗(r) = min_{x∈[−1,1]} (x − r)² ?

Solution: This example shows that the optimal value function is not necessarily regular. Indeed, set

   y = f(x, r) = (x − r)²,   f∗(r) = max_{x∈[−1,1]} f(x, r).

We have

   y′ = dy/dx = fx(x, r) = 2(x − r).

We distinguish different cases:
We distinguish different cases:

∗ r ∈ (−1, 1) : From Table 4.6, we deduce the maximum value.

   x               −1          r           1
   y′ = 2(x − r)          −          +
   y = (x − r)²   (1 + r)²  ↘   0   ↗  (1 − r)²

   TABLE 4.6: Variations of y = (x − r)² when r ∈ (−1, 1)

   max_{x∈[−1,1]} (x − r)² = max{(1 + r)², (1 − r)²} = f∗(r).

   x               −1            1          r
   y′ = 2(x − r)          −             −
   y = (x − r)²   (1 + r)²  ↘  (1 − r)²  ↘  0

   TABLE 4.7: Variations of y = (x − r)² when r ∈ (1, +∞)

   x               r        −1            1
   y′ = 2(x − r)        +             +
   y = (x − r)²   0  ↗  (1 + r)²  ↗  (1 − r)²

   TABLE 4.8: Variations of y = (x − r)² when r ∈ (−∞, −1)

∗∗ r ∈ (1, +∞) : Using Table 4.7, we obtain

   max_{x∈[−1,1]} (x − r)² = (1 + r)² = f∗(r).

∗∗ r ∈ (−∞, −1) : Table 4.8 shows that

   max_{x∈[−1,1]} (x − r)² = (1 − r)² = f∗(r).

Conclusion: Note that (1 + r)² − (1 − r)² = 4r, then

   f∗(r) = (1 − r)²   if r < 0,
           1          if r = 0,
           (1 + r)²   if r > 0.

For r ≠ 0, f∗ is differentiable since it is a polynomial. For r = 0, we have

   (f∗(r) − f∗(0))/(r − 0) = ((1 − r)² − 1)/r = −(2 − r)   if r < 0,
                             ((1 + r)² − 1)/r = 2 + r      if r > 0.

Hence

   lim_{r→0⁻} (f∗(r) − f∗(0))/(r − 0) = −2,   lim_{r→0⁺} (f∗(r) − f∗(0))/(r − 0) = 2,

and f∗ is not differentiable at 0.

This doesn't contradict the theorem, since the regularity of f∗ was proved when the optimal point x∗ is an interior point of the constraint set, which is not the case here with x∗ = ±1. Indeed, we have f(x, 0) = x² and f∗(0) = f(±1, 0) = 1.

ii) We have f(x) = f(x, 0) = x²,

   min_{x∈[−1,1]} x² = 0 = f(0) = g∗(0),   0 ∈ (−1, 1),   f″(x) = 2 > 0.

So f attains its minimal value at the interior point 0, where the second derivatives test is satisfied. Moreover, f is convex on [−1, 1], which makes 0 the global minimum point. Therefore, g∗ is regular for r close to 0: there exists η > 0 such that g∗ ∈ C¹(−η, η). In fact, from i), we see that g∗(r) = 0 for r ∈ (−1, 1), which is a regular function.
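The kink of f∗ at r = 0 is easy to see numerically; a minimal sketch (Python/NumPy, illustrative), approximating f∗ by a maximum over a grid and comparing the two one-sided difference quotients at 0:

    import numpy as np

    def f_star(r):
        xs = np.linspace(-1.0, 1.0, 2001)
        return np.max((xs - r) ** 2)

    h = 1e-4
    print((f_star(h) - f_star(0.0)) / h)      # close to  2 (right-hand slope)
    print((f_star(-h) - f_star(0.0)) / (-h))  # close to -2 (left-hand slope)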

2. – Find an approximate value of

   max_{R²} (1.05)² x + 5y sin(0.01) − 2x² − 3y².

Solution: Since we are looking for an estimate of the maximal value, we will proceed using the linear approximation for a suitable function. First, we remark that 1.05 ≈ 1, 0.01 ≈ 0 and sin(0.01) ≈ 0. So, if we introduce the function

   f(x, y, r, s) = r² x + 5y sin(s) − 2x² − 3y²,

where r and s are parameters, then the given problem max f(x, y, 1.05, 0.01) looks like a perturbation of the simpler problem max f(x, y, 1, 0) = max x − 2x² − 3y².

Solving max_{R²} x − 2x² − 3y².

Since R² is an open set, a global extreme point of f(x, y) = f(x, y, 1, 0) is also a local extreme point. Therefore, it is a stationary point of f(x, y) = x − 2x² − 3y² (f is a polynomial, it is C^∞). We have

   ∇f(x, y) = ⟨1 − 4x, −6y⟩ = ⟨0, 0⟩   ⇐⇒   (x, y) = (1/4, 0).

The only stationary point is (1/4, 0). The Hessian matrix is

   Hf(x, y) = | −4    0 |
              |  0   −6 |

The leading principal minors are D1(x, y) = −4 < 0 and D2(x, y) = 24 > 0. Hence, f is strictly concave on R² and we conclude that (x∗, y∗) = (1/4, 0) is a global maximum point, and the only one.

Linear approximation. We have

1. Hf(.,r,s)(x, y) = [[−4, 0], [0, −6]], so f(., r, s) is concave on R² for any (r, s) ∈ R².

2. f(1/4, 0) = max_{R²} f(x, y, 1, 0) = 1/8.

3. The second order condition for strict maximality is satisfied when (r, s) = (1, 0) at the point (x, y) = (1/4, 0).

As a consequence,

– ∃η > 0 such that the function f∗(r, s) = max_{(x,y)∈R²} f(x, y, r, s) is defined for any (r, s) ∈ Bη(1, 0),

– f∗ is C¹(Bη(1, 0)) and

   (∂f∗/∂r)(1, 0) = (∂f/∂r) |_{(x,y)=(1/4,0), (r,s)=(1,0)} = 2rx |_{(x,y)=(1/4,0), (r,s)=(1,0)} = 1/2,

   (∂f∗/∂s)(1, 0) = (∂f/∂s) |_{(x,y)=(1/4,0), (r,s)=(1,0)} = 5y cos(s) |_{(x,y)=(1/4,0), (r,s)=(1,0)} = 0.

We can write the following approximation, for (r, s) close to (1, 0),

   f∗(r, s) ≈ f∗(1, 0) + (∂f∗/∂r)(1, 0)(r − 1) + (∂f∗/∂s)(1, 0)(s − 0) = 1/8 + ½ (r − 1).

In particular, for (r, s) = (1.05, 0.01), the objective function f∗ takes the following approximate value:

   f∗(1.05, 0.01) ≈ 0.125 + ½ (1.05 − 1) = 0.125 + 0.025 = 0.15,

and the approximate change in the maximum value is

   f∗(1.05, 0.01) − f∗(1, 0) ≈ ½ (1.05 − 1) = 0.025.
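For comparison, one can also compute the exact maximum of the perturbed problem by hand: the stationary point of f(x, y, r, s) is x = r²/4, y = 5 sin(s)/6, which gives the value r⁴/8 + 25 sin²(s)/12 (a quick check, not part of the original text). Numerically (Python, illustrative):

    import numpy as np
    r, s = 1.05, 0.01
    print(r ** 4 / 8 + 25 * np.sin(s) ** 2 / 12)   # exact maximum, about 0.1521
    print(0.125 + 0.5 * (r - 1.0))                 # linear estimate, 0.15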

3. – Consider the problem

   min(max) f(x, y, z) = e^x + y + z   s.t.   g1 = x + y + z = 1,   g2 = x² + y² + z² = 1.

i) Apply Lagrange’s theorem to the problem to show that there are four
points satisfying the necessary conditions.
ii) Show that each point is a regular point.

iii) What can you conclude about the global minimal and maximal values of f subject to g1 = g2 = 1? Justify your answer.

iv) Replace the constraints by x + y + z = a and x² + y² + z² = b with (a, b) close to (1, 1) (a > 0, b > 0).

- What is the approximate change in the optimal value function f∗(a, b) = min_{g1=a, g2=b} f(x, y, z)?

- What is the approximate change in the optimal value function F∗(a, b) = max_{g1=a, g2=b} f(x, y, z)?

Solution: i) Note that f, g1 and g2 are C^∞ in R³. Consider the Lagrangian

   L(x, y, z, λ1, λ2) = e^x + y + z − λ1(x + y + z − 1) − λ2(x² + y² + z² − 1)

and look for its stationary points, solutions of the system

   ∇L(x, y, z, λ1, λ2) = 0_{R⁵}   ⇐⇒   (1) Lx = e^x − λ1 − 2xλ2 = 0
                                       (2) Ly = 1 − λ1 − 2yλ2 = 0
                                       (3) Lz = 1 − λ1 − 2zλ2 = 0
                                       (4) Lλ1 = −(x + y + z − 1) = 0
                                       (5) Lλ2 = −(x² + y² + z² − 1) = 0.

From equations (2) and (3), we deduce that

(z − y)λ2 = 0 =⇒ z=y or λ2 = 0.

∗ If λ2 = 0, we deduce from equation (2) that λ1 = 1 and then, from equation (1), that x = 0. Hence, equations (4) and (5) give

   2y² − 2y = 0   =⇒   y = 0 or y = 1.

Therefore, we have the two points

   (0, 1, 0), (0, 0, 1) with (λ1, λ2) = (1, 0).

∗∗ If z = y, then equations (4) and (5) give

   x = 1 − 2y and 6y² − 4y = 0   =⇒   y = 0 or y = 2/3.

Therefore, we have the two points

   (1, 0, 0) with λ1 = 1 and λ2 = (e − 1)/2,

   (−1/3, 2/3, 2/3) with λ1 = (1 + 2e^{−1/3})/3 and λ2 = (1 − e^{−1/3})/2.

ii) Consider the matrix

   g′(x, y, z) = | ∂g1/∂x  ∂g1/∂y  ∂g1/∂z |   |  1    1    1  |
                 | ∂g2/∂x  ∂g2/∂y  ∂g2/∂z | = |  2x   2y   2z |

   g′(0, 1, 0) = | 1  1  1 |        g′(0, 0, 1) = | 1  1  1 |
                 | 0  2  0 |                      | 0  0  2 |

   g′(1, 0, 0) = | 1  1  1 |        g′(−1/3, 2/3, 2/3) = |   1      1      1   |
                 | 2  0  0 |                             | −2/3    4/3    4/3  |

Each critical point is regular. We remark that the first two column vectors in the matrices g′(−1/3, 2/3, 2/3), g′(0, 1, 0) and g′(1, 0, 0) are linearly independent, while they are linearly dependent in g′(0, 0, 1). Therefore, when applying the second derivatives test, we can keep the variables in their original order for the first three points and must renumber the variables for the last one.

iii) Now, f is continuous on the constraint set, which is a closed and bounded curve of R³ as the intersection of the unit sphere x² + y² + z² = 1 and the plane x + y + z − 1 = 0. So f attains its optimal values, by the extreme value theorem, at points that are also critical points of the Lagrangian. Comparing the values of f at these points, we obtain

   2 < f(−1/3, 2/3, 2/3) = e^{−1/3} + 4/3 ≈ 2.0498 < e.

   min_{g1=1, g2=1} f(x, y, z) = f(0, 1, 0) = f(0, 0, 1) = 2

and

   max_{g1=1, g2=1} f(x, y, z) = f(1, 0, 0) = e.

iv) ∗ If we denote

   f∗(a, b) = min_{g1=a, g2=b} f(x, y, z),

then f∗ is regular for (a, b) close to (1, 1) because we have:

1. for (a, b) close to (1, 1), there exists a solution to the constrained minimization problem by the extreme value theorem (because f is continuous on the closed bounded set x + y + z = a, x² + y² + z² = b);

2. (0, 1, 0) and (0, 0, 1) are solutions to the constrained minimization problem when (a, b) = (1, 1) and are regular points;

3. the second order condition for minimality is satisfied when (a, b) = (1, 1) at (0, 1, 0) and (0, 0, 1). Indeed, n = 3 and m = 2, so we have to consider the sign of the following bordered Hessian determinant:

   B3(x, y, z) = | 0        0        ∂g1/∂x   ∂g1/∂y   ∂g1/∂z |   | 0    0    1          1       1    |
                 | 0        0        ∂g2/∂x   ∂g2/∂y   ∂g2/∂z |   | 0    0    2x         2y      2z   |
                 | ∂g1/∂x   ∂g2/∂x   Lxx      Lxy      Lxz    | = | 1    2x   e^x−2λ2    0       0    |
                 | ∂g1/∂y   ∂g2/∂y   Lyx      Lyy      Lyz    |   | 1    2y   0         −2λ2     0    |
                 | ∂g1/∂z   ∂g2/∂z   Lzx      Lzy      Lzz    |   | 1    2z   0          0      −2λ2  |

   B3(0, 1, 0) = | 0  0  1  1  1 |
                 | 0  0  0  2  0 |
                 | 1  0  1  0  0 | = 4   =⇒   (−1)² B3(0, 1, 0) = 4 > 0.
                 | 1  2  0  0  0 |
                 | 1  0  0  0  0 |

We renumber the variables in the order (x, z, y) to compute B3(0, 0, 1) and obtain

   B3(0, 0, 1) = | 0  0  1  1  1 |
                 | 0  0  0  2  0 |
                 | 1  0  1  0  0 | = 4   =⇒   (−1)² B3(0, 0, 1) = 4 > 0.
                 | 1  2  0  0  0 |
                 | 1  0  0  0  0 |

Consequently, with the new Lagrangian

   L_{a,b}(x, y, z, λ1, λ2) = e^x + y + z − λ1(x + y + z − a) − λ2(x² + y² + z² − b),

we have

   f∗(1, 1) = f(0, 1, 0) = f(0, 0, 1) = 2   with   λ1(1, 1) = 1 and λ2(1, 1) = 0,

   (∂f∗/∂a)(1, 1) = ∂L_{a,b}/∂a |_{(x,y,z,λ1,λ2)=(0,1,0,λ1(1,1),λ2(1,1))} = λ1(1, 1) = 1,

   (∂f∗/∂b)(1, 1) = ∂L_{a,b}/∂b |_{(x,y,z,λ1,λ2)=(0,1,0,λ1(1,1),λ2(1,1))} = λ2(1, 1) = 0,

   f∗(a, b) ≈ f∗(1, 1) + (∂f∗/∂a)(1, 1)(a − 1) + (∂f∗/∂b)(1, 1)(b − 1) = 2 + (a − 1) + (0)(b − 1) = a + 1.

∗∗ If we denote

   F∗(a, b) = max_{g1=a, g2=b} f(x, y, z),

then F∗ is regular for (a, b) close to (1, 1) because we have:

1. for (a, b) close to (1, 1), there exists a solution to the constrained maximization problem by the extreme value theorem (because f is continuous on the closed bounded set x + y + z = a, x² + y² + z² = b);

2. (1, 0, 0) is the solution to the constrained maximization problem when (a, b) = (1, 1) and it is a regular point;

3. the second order condition for maximality is satisfied when (a, b) = (1, 1) at (1, 0, 0). Indeed, n = 3 and m = 2, so we have to consider the sign of the following bordered Hessian determinant:

   B3(1, 0, 0) = | 0  0  1    1     1   |
                 | 0  0  2    0     0   |
                 | 1  2  1    0     0   | = 8(1 − e) < 0,   (−1)³ B3 = 8(e − 1) > 0.
                 | 1  0  0   1−e    0   |
                 | 1  0  0    0    1−e  |

Consequently, we have

   F∗(1, 1) = f(1, 0, 0) = e   with   λ1(1, 1) = 1 and λ2(1, 1) = (e − 1)/2,

   (∂F∗/∂a)(1, 1) = ∂L/∂a |_{(x,y,z,λ1,λ2)=(1,0,0,λ1(1,1),λ2(1,1))} = λ1(1, 1) = 1,

   (∂F∗/∂b)(1, 1) = ∂L/∂b |_{(x,y,z,λ1,λ2)=(1,0,0,λ1(1,1),λ2(1,1))} = λ2(1, 1) = (e − 1)/2,

   F∗(a, b) ≈ F∗(1, 1) + (∂F∗/∂a)(1, 1)(a − 1) + (∂F∗/∂b)(1, 1)(b − 1) = e + (a − 1) + ½(e − 1)(b − 1).
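Both estimates can be tested numerically. The sketch below (Python/SciPy, illustrative only) solves the constrained minimization for (a, b) near (1, 1) with SLSQP and compares the result with f∗(a, b) ≈ a + 1; depending on the starting point the solver may converge to a local rather than the global minimizer, so a start near (0, 1, 0) is used:

    import numpy as np
    from scipy.optimize import minimize

    def f_min(a, b):
        obj = lambda v: np.exp(v[0]) + v[1] + v[2]
        cons = [
            {"type": "eq", "fun": lambda v: v[0] + v[1] + v[2] - a},
            {"type": "eq", "fun": lambda v: v[0] ** 2 + v[1] ** 2 + v[2] ** 2 - b},
        ]
        res = minimize(obj, [0.1, 0.9, 0.1], method="SLSQP", constraints=cons)
        return res.fun

    for (a, b) in [(1.0, 1.0), (1.05, 1.0), (1.0, 1.05)]:
        print((a, b), f_min(a, b), a + 1)   # numerical minimum vs the estimate a + 1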

4. – Consider the problem

   min(max) f(x, y) = 1 − (x − 2)² − y²   s.t.   x² + y² ≤ 8,   x − y ≤ 0.

i) Sketch the feasible set and write down the necessary KKT conditions.
ii) Find the solutions candidates of the necessary KKT conditions.

iii) Use the second derivatives test to classify the points.

iv) Explore the concavity and convexity of the associated Lagrangian in


(x, y).

v) What can you conclude about the solution of the maximization prob-
lem?
vi) Determine approximate optimal values for each of the problems

   min(max) 1 − (0.98)³(x − 2)² − e^{−0.01} y²   s.t.   x² + √1.04 y² ≤ 8,   (1.04)² x − y ≤ 0.

Solution: i) Figure 4.25 describes the constraint set and locates, approximately, the extreme points by following the variation of the objective function along the level curves.

Consider the Lagrangian


L(x, y, λ, β) = 1 − (x − 2)2 − y 2 − λ(x2 + y 2 − 8) − β(x − y)
We look simultaneously for the possible minima and maxima candidates. Thus,
the Karush-Kuhn-Tucker conditions are
FIGURE 4.25: Level curve of highest profit
   (1) Lx = −2(x − 2) − 2λx − β = 0
   (2) Ly = −2y − 2λy + β = 0
   (3) λ = 0 if x² + y² < 8
   (4) β = 0 if x − y < 0

with

   (λ, β) ≥ (0, 0) for a maximum point,
   (λ, β) ≤ (0, 0) for a minimum point.
ii) Solving the system.

∗ If x² + y² < 8 then λ = 0 and

   −2(x − 2) − β = 0,   −2y + β = 0   =⇒   y = −(x − 2),

then, inserting in (4), we discuss:

– Suppose x − y = x − (2 − x) = 2(x − 1) < 0, that is x < 1. From (2), we have β = 2y = −2(x − 2) > 0. Thus, by (4), we get x − y = 0, which contradicts x − y < 0.

– Suppose x − y = 0. We then have y = x and y = −x + 2. Thus, we have a candidate point for optimality:

   (x, y) = (1, 1) with (λ, β) = (0, 2).



∗ If x² + y² = 8 then

– Suppose x − y < 0; then β = 0 and

   x − 2 + λx = 0,   −2y(1 + λ) = 0   ⇐⇒   y = 0 or λ = −1.

λ = −1 is not possible by x − 2 + λx = 0. Thus y = 0. With x² + y² = 8, we deduce that x = √8, which contradicts x < y, or x = −√8. Inserting the value x = −√8 into x − 2 + λx = 0 gives λ = −1 − 1/√2. So, we have another candidate

   (x, y) = (−√8, 0) with (λ, β) = (−1 − 1/√2, 0).

– Suppose x − y = 0. With x² + y² = 8, we deduce that x = 2 or x = −2. Then, inserting in (1) and (2), we obtain

   (x, y) = (2, 2)   =⇒   −4λ − β = 0, −4λ + β = 4   ⇐⇒   (λ, β) = (−1/2, 2),

contradicting the common sign of λ and β;

   (x, y) = (−2, −2)   =⇒   4λ − β = −8, 4λ + β = −4   ⇐⇒   (λ, β) = (−3/2, 2),

contradicting the common sign of λ and β.

Regularity of the candidate point (1, 1). Note that the constraints g1(x, y) = x² + y² and g2(x, y) = x − y are C¹ in R² and that only the constraint g2 is active at (1, 1). We have

   g2′(x, y) = (1   −1),   rank(g2′(1, 1)) = 1.

Thus the point (1, 1) is a regular point.

Regularity of the candidate point (−√8, 0). Only the constraint g1 is active at (−√8, 0). We have

   g1′(x, y) = (2x   2y),   rank(g1′(−√8, 0)) = 1.

Thus the point (−√8, 0) is a regular point.

iii) Second derivatives test at (1, 1). With p = 1 (the number of active constraints) and n = 2 (the dimension of the space), we consider the determinants Br for r = p + 1, . . . , n, that is r = 2:

   B2(x, y) = | 0         ∂g2/∂x   ∂g2/∂y |   |  0        1          −1      |
              | ∂g2/∂x    Lxx      Lxy    | = |  1     −2 − 2λ        0      |
              | ∂g2/∂y    Lyx      Lyy    |   | −1        0        −2 − 2λ   |

∗ At (1, 1), we have λ = 0, then

   B2(1, 1) = |  0    1   −1 |
              |  1   −2    0 | = 4   =⇒   (−1)² B2(1, 1) > 0
              | −1    0   −2 |

and (1, 1) is a local maximum.



Second derivatives test at (−√8, 0). We consider the following determinant

   B2(x, y) = | 0         ∂g1/∂x   ∂g1/∂y |   |  0        2x          2y      |
              | ∂g1/∂x    Lxx      Lxy    | = |  2x    −2 − 2λ         0      |
              | ∂g1/∂y    Lyx      Lyy    |   |  2y       0        −2 − 2λ    |

∗ At (−√8, 0), we have λ = −1 − 1/√2, so −2 − 2λ = √2 and

   B2(−√8, 0) = |   0     −2√8    0  |
                | −2√8     √2     0  | = −32√2   =⇒   (−1)¹ B2(−√8, 0) > 0,
                |   0       0    √2  |

and (−√8, 0) is a local minimum.

iv) and v) Let us explore the concavity and convexity of L with respect to (x, y), where the Hessian matrix of L in (x, y) is

   HL = | Lxx   Lxy |   | −2 − 2λ       0      |
        | Lyx   Lyy | = |    0       −2 − 2λ   |

• When λ = 0, the principal minors are Δ1¹ = Lyy = −2 < 0, Δ1² = Lxx = −2 < 0 and Δ2 = 4 > 0. So (−1)ᵏ Δk ≥ 0 for k = 1, 2. Therefore, L is concave in (x, y) and then (1, 1) is a global maximum for the constrained maximization problem.

• When λ = −1 − 1/√2, the principal minors are Δ1¹ = Lyy = √2 > 0, Δ1² = Lxx = √2 > 0 and Δ2 = 2 > 0. So Δk ≥ 0 for k = 1, 2. Therefore, L is convex in (x, y) and then (−√8, 0) is a global minimum for the constrained minimization problem.

vi) Note that 0.98 ≈ 1, 1.04 ≈ 1, −0.01 ≈ 0 and e^{−0.01} ≈ 1. Thus, the new problems look like perturbations of the original problem. Therefore, we will use the linear approximation to solve the problem when r = 0.98, s = 1.04 and t = −0.01. So, introduce the Lagrangian associated with the new constrained optimization problem

   L(x, y, λ, β, r, s, t) = 1 − r³(x − 2)² − e^t y² − λ(x² + √s y² − 8) − β(s² x − y).

• Set f(x, y, r, s, t) = 1 − r³(x − 2)² − e^t y², and the value function

   f∗(r, s, t) = min f(x, y, r, s, t)   s.t.   x² + √s y² ≤ 8,   s² x − y ≤ 0.
Then f∗ is well defined and differentiable when (r, s, t) is close to (1, 1, 0). Indeed, the following is satisfied:

1. There is a unique solution (x, y) = (−√8, 0) to the constrained minimization problem when (r, s, t) = (1, 1, 0), and (−√8, 0) is a regular point.

2. For (r, s, t) close to (1, 1, 0), there exists a solution to the constrained minimization problem by the extreme value theorem, since the set of constraints is a closed bounded set and the function is continuous.

3. The second order condition for minimality is satisfied at (−√8, 0) when (r, s, t) = (1, 1, 0).

As a consequence,

   (∂f∗/∂r)(1, 1, 0) = ∂L/∂r |_{(x,y,λ,β)=(−√8,0,−1−1/√2,0), (r,s,t)=(1,1,0)} = −3r²(x − 2)² |_{...} = −3(√8 + 2)²,

   (∂f∗/∂s)(1, 1, 0) = ∂L/∂s |_{(x,y,λ,β)=(−√8,0,−1−1/√2,0), (r,s,t)=(1,1,0)} = [ −λ y²/(2√s) − 2βsx ] |_{...} = 0,

   (∂f∗/∂t)(1, 1, 0) = ∂L/∂t |_{(x,y,λ,β)=(−√8,0,−1−1/√2,0), (r,s,t)=(1,1,0)} = −e^t y² |_{...} = 0.

Hence, for (r, s, t) close to (1, 1, 0),

   f∗(r, s, t) ≈ f∗(1, 1, 0) + (∂f∗/∂r)(1, 1, 0)(r − 1) + (∂f∗/∂s)(1, 1, 0)(s − 1) + (∂f∗/∂t)(1, 1, 0)(t − 0),

where f∗(1, 1, 0) = f(−√8, 0, 1, 1, 0) = 1 − (−√8 − 2)² = 1 − (√8 + 2)². Therefore

   f∗(r, s, t) ≈ 1 − (√8 + 2)² − 3(√8 + 2)²(r − 1),

   f∗(0.98, 1.04, −0.01) ≈ 1 − (√8 + 2)² − 3(√8 + 2)²(−0.02) = 1 − 0.94(√8 + 2)² ≈ −20.91.

• Set the value function

   F∗(r, s, t) = max f(x, y, r, s, t)   s.t.   x² + √s y² ≤ 8,   s² x − y ≤ 0.

Then F∗ is well defined and differentiable when (r, s, t) is close to (1, 1, 0). Indeed, the following is satisfied:

1. There is a unique solution (x, y) = (1, 1) to the constrained maximization problem when (r, s, t) = (1, 1, 0), and (1, 1) is a regular point.

2. For (r, s, t) close to (1, 1, 0), there exists a solution to the constrained maximization problem by the extreme value theorem, since the set of constraints is a closed bounded set and the function is continuous.

3. The second order condition for maximality is satisfied at (1, 1) when (r, s, t) = (1, 1, 0).
As a consequence,

   (∂F∗/∂r)(1, 1, 0) = ∂L/∂r |_{(x,y,λ,β)=(1,1,0,2), (r,s,t)=(1,1,0)} = −3r²(x − 2)² |_{...} = −3,

   (∂F∗/∂s)(1, 1, 0) = ∂L/∂s |_{(x,y,λ,β)=(1,1,0,2), (r,s,t)=(1,1,0)} = [ −λ y²/(2√s) − 2βsx ] |_{...} = −4,

   (∂F∗/∂t)(1, 1, 0) = ∂L/∂t |_{(x,y,λ,β)=(1,1,0,2), (r,s,t)=(1,1,0)} = −e^t y² |_{...} = −1.

Hence, for (r, s, t) close to (1, 1, 0),

   F∗(r, s, t) ≈ F∗(1, 1, 0) + (∂F∗/∂r)(1, 1, 0)(r − 1) + (∂F∗/∂s)(1, 1, 0)(s − 1) + (∂F∗/∂t)(1, 1, 0)(t − 0),

   F∗(r, s, t) ≈ −1 − 3(r − 1) − 4(s − 1) − (t − 0),

   F∗(0.98, 1.04, −0.01) ≈ −1 − 3(−0.02) − 4(0.04) − (−0.01) = −1.09.
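As a numerical cross-check of these two first-order estimates (an illustrative sketch in Python/SciPy, not part of the original solution), one can solve the perturbed problem directly; starting near the unperturbed optimizers, SLSQP returns values close to the estimates above:

    import numpy as np
    from scipy.optimize import minimize

    r, s, t = 0.98, 1.04, -0.01
    f = lambda v: 1 - r ** 3 * (v[0] - 2) ** 2 - np.exp(t) * v[1] ** 2
    cons = [
        {"type": "ineq", "fun": lambda v: 8 - v[0] ** 2 - np.sqrt(s) * v[1] ** 2},  # x^2 + sqrt(s) y^2 <= 8
        {"type": "ineq", "fun": lambda v: v[1] - s ** 2 * v[0]},                    # s^2 x - y <= 0
    ]
    res_max = minimize(lambda v: -f(v), [1.0, 1.1], method="SLSQP", constraints=cons)
    res_min = minimize(f, [-2.8, 0.0], method="SLSQP", constraints=cons)
    print(-res_max.fun)   # perturbed maximum, close to the estimate F*(0.98, 1.04, -0.01)
    print(res_min.fun)    # perturbed minimum, close to the estimate f*(0.98, 1.04, -0.01)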

Remark. The set of feasible solutions S = {(x, y) : x² + y² ≤ 8, x − y ≤ 0} is a closed bounded set of R² and f is continuous on S. Therefore, the extreme points are attained on this set by the extreme value theorem. Moreover, such points must occur either at points satisfying the KKT conditions or at points where the constraint qualification fails. Since (1, 1) and (−√8, 0) are the only two solution points and they are regular, they solve the problem.
For more practice, we refer the reader to [11], [27], [28], [26], [25], [24], [4].
Bibliography

[1] H. Anton, I. Bivens, and S. Davis. Calculus. Early Transcendentals.


John Wiley & Sons, Inc. New York, NY, USA, 2005.

[2] R. G. Bartle and D. R. Sherbert. Introduction to Real Analysis. John


Wiley & Sons, Inc, 2011.

[3] W. Briggs, L. Cochran, and B. Gillett. Calculus. Early Transcendentals.


Addison-Wesley. Pearson, 2011.

[4] E. K. P. Chong and S. H. Żak. An Introduction to Optimization. Wiley,


2013.

[5] P. G. Ciarlet. Introduction à l’analyse numérique matricielle et


l’optimisation. Masson, 1985.

[6] B. Dacorogna. Introduction au calcul des variations. Presses polythech-


niques et universitaires romandes. Lausanne, 1992.

[7] Jr. Ernest.F. Haeussler, S. P. Richard, and J. W. Richard. Introductory


Mathematical Analysis for Business, Economics, and the Life and Social
Sciences. Pearson, Prentice Hall, 2008.

[8] P. E. Fishback. Linear and Nonlinear Programming with MapleTM . An


Interactive, Applications-Based Approach. CRC Press, Taylor and Francis
Group, 2010.
[9] A.S. Gupta. Calculus of Variations with Applications. Prentice-Hall of
India, 2006.

[10] W. Keith Nicholson. Linear Algebra with Applications. McGraw-Hill


Ryerson, 2014.
[11] D. Koo. Elements of Optimisation with Applications in Economics and
Business. Springer-Verlag, 1977.
[12] R. J. Larsen and M. L. Marx. An Introduction to Mathematical Statistics
and its Applications. Prentice Hall, 2001.

[13] S. Lipschutz. Topologie, cours et problèmes. McGraw-Hill, 1983.


[14] D.G. Luenberger. Introduction to Linear and Nonlinear Programming.


Addison Wesley, 1973.

[15] J. E. Marsden. Elementary Classical Analysis. W. H. Freeman and Com-


pany, 1974.

[16] M. Mesterton-Gibbons. A primer on the calculus of variations and op-


timal control theory. Student Mathematical Library vol 50. American
Mathematical Society, 2009.

[17] M. Minoux. Mathematical Programming: Theory and Algorithms. John


Wiley and Sons, 1986.

[18] J. R. Munkres. Topology of First Course. Prentice Hall, 1975.

[19] J. Nocedal and S. J. Wright. Numerical Optimization. Springer, 1999.


[20] M.H. Protter and C.B. Morrey. A First Course in Real Analysis. Springer,
2000.

[21] S.L. Salas, E. Hille, and G.J. Etgen. Calculus. One and Several Variables.
Tenth Edition. John Wiley & Sons, INC, 2007.

[22] J. A. Snyman. Practical Mathematical Optimization: An Introduction


to Basic Optimization Theory and Classical and New Gradient-Based
Algorithms. Springer, 2005.

[23] J. Stewart. Essential Calculus. Brooks/Cole, 2013.

[24] K. Sydsæter and P. Hammond. Mathematics for Economic Analysis. FT


Prentice Hall, 1995.
[25] K. Sydsæter, P. Hammond, A. Seierstad, and A. Strøm. Further Mathe-
matics for Economic Analysis. FT Prentice Hall, 2008.

[26] K. Sydsæter, P. Hammond, A. Seierstad, and A. Strøm. Instructor’s


Manual: Further Mathematics for Economic Analysis. Pearson, 2008.
2nd Edition.

[27] K. Sydsæter, A. Strøm, and P. Hammond. Instructor’s Manual: Essential


Mathematics for Economic Analysis. Pearson, 2008. 3rd Edition.
[28] K. Sydsæter, A. Strøm, and P. Hammond. Instructor’s Manual: Essential
Mathematics for Economic Analysis. Pearson, 2014. 4th Edition.

[29] W. L. Winston. Operations Research: Applications and Algorithms.


Brooks/Cole, 2004.
Index

absolute maximum, 54, 117 eigen value, 79


absolute minimum, 54, 117 ellipse, 23
active, 223 ellipsoid, 25
affine, 209 extreme-value theorem, 117
approximate method, 60
approximation, 293 Farkas-Minkowski, 222

ball, 8 generalized Lagrange multipliers, 223


binding, 223 global extreme points, 117
bordered Hessian determinant, global maximum, 50
252 global minimum, 50
boundary, 10, 27, 117 gradient, 30, 141
bounded, 10, 117 graph, 22

chain rule, 34, 96, 294 Hessian, 31, 71, 98


Clairaut, 31 hyperplane, 29
closed, 10, 117
implicit function theorem, 138, 139,
closure, 10
214, 295
Cobb-Douglas, 20, 292
inactive, 223
columns, 80
inflection point, 55
concave, 93
interior, 9, 117, 139
cone, 24, 206
interior point, 9, 54, 204
cone of feasible directions, 204
intermediate value theorem, 69
constraint function, 292
continuous, 28, 117 Jacobian, 141
continuously differentiable, 34
convex, 13, 93 Karush-Kuhn-Tucker, 223, 232, 235
critical point, 54
critical points, 117 Lagrange, 153
cylinder, 22 Lagrange multipliers, 153
Lagrangian, 153, 223
dependence, 292 Laplace, 32
determinant, 80 leading minors, 71, 98
differentiability, 29 level curve, 22
differentiable, 33 level surface, 22
dimension, 22 line, 23
domain, 21 line tangent, 29


linear, 33 quadratic form, 76, 78, 81, 99, 254


linear combination, 152
linear constraints, 175 radius, 9
Linear programming, 133 rank, 139
linearly independent, 138, 176, rate of change, 29
206, 252 regular point, 137, 206
local extreme point, 50 relative maximum, 56
local maximum, 50 relative minimum, 56
local minimum, 50 rows, 80

negative definite, 175 saddle point, 56, 80


negative semi definite, 82, 255 second derivatives test, 72, 293
neighborhood, 9, 27, 80 semi definite, 80
normal line, 143 several variables, 26, 29
normal vector, 143 slack, 223
slope, 29
objective function, 50, 292 stationary point, 54
open, 9 strictly concave, 93
optimal value function, 293 strictly convex, 93, 99
orthogonal, 152 subspace, 138, 140
orthogonal matrix, 79 surface, 22
symmetric, 76, 79
parabola, 23 symmetric matrix, 81
Paraboloid, 23
tangent line, 95, 142
parallel, 23, 143
tangent plane, 137, 254
parameters, 292
Taylor’s formula, 76, 253
partial derivative, 29
traces, 22
plane tangent, 152
triangular inequality, 8, 94
polyhedra, 133
positive definite, 82, 99, 175 unbounded, 10, 121
positive semi definite, 82, 255 unit vectors, 150
principal minor, 80, 82, 100
production, 20 vertices, 26
