Lectures On Optimization
A. Banerji
September 23, 2013
Chapter 1
Introduction
1.1 Some Examples
We briefly introduce our framework for optimization, and then discuss some
preliminary concepts and results that we'll need to analyze specific problems.
Our optimization examples can all be couched in the following general
framework:
Suppose V is a vector space and S ⊆ V. Suppose F : V → R. We wish to
find x* ∈ S s.t. F(x*) ≥ F(x), for all x ∈ S, or x* ∈ S s.t. F(x*) ≤ F(x), for all x ∈ S.

Example 1 Utility maximization. A consumer with utility function U : R^k → R
chooses a bundle x = (x_1, ..., x_k), x_i ≥ 0, subject to the budget constraint
Σ_{i=1}^k p_i x_i = p.x ≤ I, where p is the vector of (positive) prices and I is income.
Here, the objective function is U, and

S = {x ∈ R^k : x_i ≥ 0, i = 1, ..., k, and 0 ≤ p.x ≤ I}.
Example 2 Expenditure minimization. Same setting as above. Minimize
p.x s.t. x_i ≥ 0, i = 1, ..., k and U(x) ≥ Ū, where Ū is a non-negative real
number.
Here the objective function F : R^k → R is F(x) = p.x and

S = {x ∈ R^k : x_i ≥ 0, i = 1, ..., k, and U(x) ≥ Ū}
Example 3 Profit Maximization. Given positive output prices p_1, ..., p_s and
input prices w_1, ..., w_k, and a production function f : R^k_+ → R^s (transforming
k inputs into s products),
Maximize Σ_{j=1}^s p_j f_j(x) − Σ_{i=1}^k w_i x_i, s.t. x_i ≥ 0, i = 1, ..., k. f_j(x) is
the output of product j as a function of a vector x of the k inputs.
Here, the objective function is profits π : R^k_+ → R defined by
π(x) = Σ_{j=1}^s p_j f_j(x) − Σ_{i=1}^k w_i x_i, and

S = {x ∈ R^k : x_i ≥ 0, i = 1, ..., k}
Example 4 Intertemporal utility maximization. A worker with a known life
span T, earning a constant wage w, and receiving interest at rate r on ac-
cumulated savings, or paying the same rate on accumulated debts, wishes to
decide an optimal consumption path c(t), t ∈ [0, T]. Let accumulated assets/debts
at time t be denoted by k(t). His instantaneous utility from consumption is
u(c(t)), with u′ > 0, u″ < 0.

Example 5 Nash equilibrium. A Nash equilibrium (s*_1, ..., s*_n) is a strategy profile such that for each
player i, s*_i solves the following maximization problem:
Maximize u_i(s*_1, .., s*_{i−1}, s_i, s*_{i+1}, .., s*_n) s.t. s_i ∈ S_i.
1.2 Some Concepts and Results
We will now discuss some concepts that we will need, such as the compactness
of the set S above, and the continuity and differentiability of the objective
function F. We will work in normed linear spaces. In the absence of any
other specification, the space we will be in is R^n with the Euclidean norm
||x|| = (Σ_{i=1}^n x_i²)^{1/2}. (There are other norms that would work
equally well.) Recall that a norm in R^n is defined to be a function assigning
to each vector x a non-negative real number ||x||, s.t. (i) for all x, ||x|| ≥ 0,
with ||x|| = 0 iff x = 0; (ii) ||αx|| = |α| ||x|| for every scalar α; and (iii)
||x + y|| ≤ ||x|| + ||y|| (the triangle inequality).

A sequence (x_k)_{k=1}^∞ of points in V converges to x if for every ε > 0 there
exists a positive integer N s.t. k ≥ N implies ||x_k − x|| < ε.
Note that this is the same as saying that for every open ball B(x, ε), we
can find N s.t. for all points x_k following x_N, x_k lies in B(x, ε). This implies
that when x_k converges to x (notation: x_k → x), all but a finite number of
points in (x_k) lie arbitrarily close to x.

Examples. x_k = 1/k, k = 1, 2, ... is a sequence of real numbers converging
to zero. x_k = (1/k, 1/k), k = 1, 2, ... is a sequence of vectors in R² converging
to the origin. More generally, a sequence converges in R^n if and only if all
the coordinate sequences converge, as can be visualized in the example here
using hypotenuses and legs of triangles.
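The ε–N definition can be spot-checked numerically. A small Python sketch (the helper names are ours, purely illustrative) verifies that x_k = (1/k, 1/k) eventually stays inside any ε-ball around the origin:

```python
import math

def euclidean_norm(x):
    # ||x|| = (sum_i x_i^2)^(1/2)
    return math.sqrt(sum(xi * xi for xi in x))

def converges_to(seq, x, eps, N):
    # Spot-check the definition on the first 1000 indices past N:
    # each sampled term must lie within eps of the candidate limit x.
    return all(euclidean_norm([a - b for a, b in zip(seq(k), x)]) < eps
               for k in range(N, N + 1000))

# x_k = (1/k, 1/k) converges to the origin: ||x_k|| = sqrt(2)/k,
# so for a given eps it suffices to take N > sqrt(2)/eps.
eps = 1e-3
N = int(math.sqrt(2) / eps) + 1
print(converges_to(lambda k: (1.0 / k, 1.0 / k), (0.0, 0.0), eps, N))  # True
```

This is of course a finite check, not a proof; the proof is the inequality sqrt(2)/k < ε for k ≥ N.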
Theorem 2 (x_k) → x in R^n iff for every i ∈ {1, . . . , n}, the coordinate
sequence (x_k^i) → x^i.

Proof. Since
(x_k^i − x^i)² ≤ Σ_{j=1}^n (x_k^j − x^j)²,
taking square roots implies |x_k^i − x^i| ≤ ||x_k − x||, so for every k ≥ N s.t.
||x_k − x|| < ε, |x_k^i − x^i| < ε.
Conversely, if all the coordinate sequences converge to the coordinates
of the point x, then there exists a positive integer N s.t. k ≥ N implies
|x_k^i − x^i| < ε/√n for every i; then ||x_k − x|| = (Σ_{j=1}^n (x_k^j − x^j)²)^{1/2} < (n · ε²/n)^{1/2} = ε.

A set S ⊆ R^n is compact iff every sequence in S has a subsequence converging
to a point of S; in R^n this is equivalent to S being closed and bounded.
Suppose first that S is closed and bounded, and let (x_n) be a sequence in S.
Enclose S in a closed rectangle R_0 and bisect repeatedly, at each stage keeping
a sub-rectangle R_i containing infinitely many terms of the sequence; the
intersection ∩_{i=0}^∞ R_i is a single point; call this point x.
Now we can choose points y_i ∈ R_i, i = 1, 2, ... s.t. each y_i is some member
of (x_n); because the R_i's collapse to x, it is easy to show that (y_m) is a
subsequence that converges to x. Moreover, the y_i's lie in S, and S is closed;
so x ∈ S.

Conversely, suppose S is compact.
(i) Then it is bounded. For suppose not. Then we can construct a se-
quence (x_n) in S s.t. for every n = 1, 2, ..., ||x_n|| > n. But then, no subse-
quence of (x_n) can converge to a point in S. Indeed, take any point x ∈ S
and any subsequence (x_{m(n)}) of (x_n). Then
||x_{m(n)}|| = ||x_{m(n)} − x + x|| ≤ ||x_{m(n)} − x|| + ||x||
(the inequality above is due to the triangle inequality).
So,
||x_{m(n)} − x|| ≥ ||x_{m(n)}|| − ||x|| ≥ n − ||x||
and the RHS becomes larger with n. So (x_{m(n)}) does not converge to x.

(ii) S is also closed. Take any sequence (x_n) in S that converges to x.
Then, all subsequences of (x_n) converge to x, and since S is compact, (x_n)
has a subsequence converging to a point in S. So, this point of limit is x,
and x ∈ S. So, S is closed.
Continuity of Functions

Definition 3 A function F : R^n → R^m is continuous at x ∈ R^n, if for
every sequence (x_k) that converges to x in R^n, the image sequence (F(x_k))
converges to F(x) in R^m.

Example of point discontinuity.
Example of continuous function on discrete space.

F is continuous on S ⊆ R^n, if it is continuous at every point x ∈ S.

Examples. The real-valued function F(x) = x is continuous using this
definition, almost trivially, since (x_k) and x are identical to (F(x_k)) and F(x)
respectively.
F(x) = x² is continuous. We want to show that if (x_k) converges to x,
then (F(x_k)) = (x_k²) converges to F(x) = x². This follows from the exercise
above on limits: x_k → x, x_k → x implies x_k · x_k → x · x = x².
By extension, polynomials are continuous functions.
May talk a little about the coordinate functions of F : R^n → R^m:
F(x) = (F_1(x_1, ..., x_n), ..., F_m(x_1, ..., x_n)).
Example: F(x_1, x_2) = (x_1 + x_2, x_1² + x_2²). This is continuous because (i)
F_1 and F_2 are continuous; e.g. let x_k → x. Then the coordinates x_k^1 → x_1
and x_k^2 → x_2. So F_1(x_k) = x_k^1 + x_k^2 → x_1 + x_2 = F_1(x).
(ii) Since the coordinate sequences F_1(x_k) → F_1(x) and F_2(x_k) → F_2(x),
F(x_k) = (F_1(x_k), F_2(x_k)) → F(x) = (F_1(x), F_2(x)).
There is an equivalent, (ε, δ) definition of continuity.

Definition 4 A function F : R^n → R^m is continuous at x ∈ R^n, if for every
ε > 0, there exists δ > 0 s.t. if for any y ∈ R^n we have ||x − y|| < δ, then
||F(x) − F(y)|| < ε.

So if there is a hurdle of size ε around F(x), then, if point y is close
enough to x, F(y) cannot overcome the hurdle.
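The (ε, δ) definition can be illustrated numerically for F(x) = x². Since |y² − x²| = |y − x| · |y + x|, at x = 2 any δ ≤ ε/5 works for small ε. The Python sketch below (names ours; grid sampling, not a proof) spot-checks such a choice of δ:

```python
def check_eps_delta(f, x, eps, delta, trials=10000):
    # Sample points y with |x - y| < delta and verify |f(x) - f(y)| < eps.
    for i in range(1, trials + 1):
        y = x - delta + (2 * delta) * i / (trials + 1)  # grid inside (x-delta, x+delta)
        if abs(f(x) - f(y)) >= eps:
            return False
    return True

f = lambda x: x * x
# At x = 2: |f(y) - f(2)| = |y - 2| * |y + 2| <= delta * (4 + delta),
# so delta = eps/5 suffices for small eps.
eps = 0.1
delta = eps / 5
print(check_eps_delta(f, 2.0, eps, delta))  # True
```

A too-large δ fails the check, mirroring how the definition can be violated.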
Theorem 6 The two definitions above are equivalent.

Proof. Suppose there exists an ε > 0 s.t. for every δ > 0, there exists a y
with ||x − y|| < δ and ||F(x) − F(y)|| ≥ ε. Then for this particular ε, we can
choose a sequence of δ_k = 1/k and x_k with ||x − x_k|| < 1/k. So, (x_k) → x
but (F(x_k)) does not converge to F(x), staying always outside the ε-band of
F(x).

Conversely, suppose there exists a sequence (x_k) that converges to x, but
(F(x_k)) does not converge to F(x). So, there exists ε > 0 s.t. for every
positive integer N, there exists k ≥ N for which ||F(x_k) − F(x)|| ≥ ε. Then,
for this specific ε, there does not exist any δ > 0 s.t. for all y with ||x − y|| < δ
we have ||F(x) − F(y)|| < ε; for we can find, for any such δ, one of the x_k's s.t.
||x_k − x|| < δ, yet ||F(x_k) − F(x)|| ≥ ε.
Here is an immediate upshot of the latter definition. Suppose F : R → R
is continuous at x. If F(x) > 0, then there is an open interval (x − δ, x + δ)
s.t. if y is in this interval, then F(y) > 0. The idea is that we can take
ε = F(x)/2, say, and use the (ε, δ) definition. A similar statement will hold
if F(x) < 0.
We use this fact now in the following result.
Theorem 7 Intermediate Value Theorem
Suppose F : R → R is continuous on an interval [a, b] and F(a) and F(b)
are of opposite signs. Then there exists c ∈ (a, b) s.t. F(c) = 0.

Proof. Suppose WLOG that F(a) > 0, F(b) < 0 (i.e. for the other case
just consider the function −F). Then the set
S = {x ∈ [a, b] | F(x) ≥ 0}
is bounded above. Indeed, b is an upper bound of S since F(b) is not ≥ 0.
By the completeness property of real numbers, S has a supremum, sup S = c,
say.
It can't be that F(c) > 0, for then by continuity, there is an h ∈ S, h > c,
s.t. F(h) > 0, so c is not an upper bound of S. It can't be that F(c) < 0. For,
if c is an upper bound of S with F(c) < 0, then we have for every x ∈ [a, b]
with F(x) ≥ 0, x ≤ c. However, by continuity, there is an interval (c − δ, c]
s.t. every y in this interval satisfies F(y) < 0. But then, every x ∈ S must
be to the left of this interval. But then again, c is not the least upper bound
of S.
So, it must be that F(c) = 0.
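The proof above is non-constructive, but the same sign-change idea yields the bisection algorithm for locating a root c. A minimal Python sketch, assuming only the IVT hypotheses:

```python
def bisect_root(F, a, b, tol=1e-10):
    # Requires F continuous with F(a), F(b) of opposite signs (IVT hypotheses).
    fa, fb = F(a), F(b)
    assert fa * fb < 0, "F(a) and F(b) must have opposite signs"
    while b - a > tol:
        c = (a + b) / 2.0
        fc = F(c)
        if fc == 0:
            return c
        # Keep the half-interval on which the sign change survives.
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return (a + b) / 2.0

# F(x) = x^2 - 2 changes sign on [1, 2]; the root is sqrt(2).
root = bisect_root(lambda x: x * x - 2.0, 1.0, 2.0)
print(round(root, 6))  # 1.414214
```

Each iteration halves the interval, so the error after n steps is (b − a)/2^n.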
Chapter 2
Existence of Optima
2.1 Weierstrass Theorem
This theorem of Weierstrass gives a sufficient condition for a maximum and
minimum to exist, for an optimization problem.

Theorem 8 (Weierstrass). Let S ⊆ R^n be compact and let F : S → R be
continuous. Then F has a maximum and minimum on S; i.e., there exist
z_1, z_2 ∈ S s.t. F(z_2) ≤ F(x) ≤ F(z_1), for all x ∈ S.

The idea is that continuity of F preserves compactness; i.e. since S is
compact and F is continuous, the image set F(S) is compact. That holds
irrespective of the space F(S) is in; but since F is real-valued, F(S) is a
compact set of real numbers, and therefore must have a max and a min, by
a result in Chapter 1.
Proof.
Let (y_k) be a sequence in F(S). So, for every k, there is an x_k ∈ S s.t.
y_k = F(x_k). Since (x_k), k = 1, 2, ... is a sequence in the compact set S, it has a
subsequence (x_{m(k)}) that converges to a point x in S. Since F is continuous,
the image sequence (F(x_{m(k)})) converges to F(x), which is obviously in F(S).
So we've found a convergent subsequence (y_{m(k)}) = (F(x_{m(k)})) of (y_k); hence
F(S) is compact. This means the set F(S) of real numbers is closed and
bounded; so, it has at least one maximum and at least one minimum.
Example 6 p_1 = p_2 = 1, I = 10. Maximize U(x_1, x_2) = x_1 x_2 s.t. the budget
constraint. Here, the budget set is compact, since the prices are positive. We
can see that the image of the budget set S under the function U (or the range
of U) is U(S) = [0, 25]. This is compact, and so U attains a max (25) and
a min (0) on S.
The fact that U(S) is in fact an interval has to do with another property of
continuity of the objective: such functions preserve connectedness in addition
to preserving compactness of the set S, and here, the budget set is a connected
set.
Do applications of Weierstrass theorem to utility maximization and
cost minimization.
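Example 6 can be checked by brute force: a grid search over the budget set (a numerical sketch, not how one would prove U(S) = [0, 25]) recovers the max 25, attained at (5, 5), and the min 0:

```python
# Grid search over the budget set {x >= 0 : x1 + x2 <= 10} for U = x1*x2,
# with p1 = p2 = 1 and I = 10 as in Example 6.
best, worst = float("-inf"), float("inf")
n = 400
for i in range(n + 1):
    for j in range(n + 1):
        x1, x2 = 10.0 * i / n, 10.0 * j / n
        if x1 + x2 <= 10.0:          # budget constraint
            u = x1 * x2
            best, worst = max(best, u), min(worst, u)
print(worst, best)   # min 0.0, max 25.0 (attained at (5, 5))
```

The grid is chosen so that (5, 5) is a grid point; Weierstrass guarantees the extrema exist, the search merely locates them.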
Chapter 3
Unconstrained Optima
3.1 Preliminaries
A function f : R → R is defined to be differentiable at x if there exists a ∈ R
s.t.
lim_{y→x} [ (f(y) − f(x)) / (y − x) − a ] = 0    (1)
By limit equal to 0 as y → x, we require that the limit be 0 w.r.t. all
sequences (y_n) s.t. y_n → x. a turns out to be the unique number equal to
the slope of the tangent to the graph of f at the point x. We denote a by
f′(x); near x, f(x + h) is approximated by f(x) + f′(x)h. In
the general case f : R^n → R^m, f(x + h) is approximated by the affine function f(x) + Ah.
It can be shown that (w.r.t. the standard bases in R^n and R^m), the matrix
A equals Df(x), the m×n matrix of partial derivatives of f evaluated at the
point x. To see this, take the slightly less general case of a function f : R^n → R;
then Df(x) = (∂f(x)/∂x_1, . . . , ∂f(x)/∂x_n), the row vector of partial derivatives.

The basic first order necessary condition is then: if f is differentiable at x*, an
interior point of its domain, and x* is a local max or min of f, then
Df(x*) = θ.
Here, θ = (0, ..., 0) is the origin, and Df(x*) = (∂f(x*)/∂x_1, . . . , ∂f(x*)/∂x_n).
Proof. Step 1. Suppose n = 1, and x* is (say) a local interior max. Take
sequences (y_k), y_k < x*, y_k → x*, and (z_k), z_k > x*, z_k → x*. Since x* is a
local max, for all large k, f(z_k) − f(x*) ≤ 0 and f(y_k) − f(x*) ≤ 0, so
(f(z_k) − f(x*)) / (z_k − x*) ≤ 0 and (f(y_k) − f(x*)) / (y_k − x*) ≥ 0.
Taking limits, f′(x*) ≤ 0 and f′(x*) ≥ 0,
so f′(x*) = 0.
Step 2. Suppose n > 1. Take any j-th axis direction, and let g : R → R
be defined by g(t) = f(x* + t e_j). Note that g(0) = f(x*). Now, since x* is a
local max of f, f(x*) ≥ f(x* + t e_j), for |t| smaller than some cutoff value: i.e.,
g(0) ≥ g(t) for |t| smaller than this cutoff value, i.e., g(0) is a local interior
maximum (since t < 0 and t > 0 are both allowed). g is differentiable
at 0 since g(0) = f(h(0)) = f(x*), and f is differentiable at x* and h is
differentiable at t = 0. (Here, h(t) = x* + t e_j, so Dh(t) = e_j, for all t). So, g is
differentiable at 0, and by Step 1,
0 = g′(0) = Df(x*)e_j = ∂f(x*)/∂x_j.
Note that this is necessary but not sufficient for a local max or min, e.g.
f(x) = x³ has a vanishing first derivative at x = 0, which is not a local
optimum.
Second Order Conditions

Definition. x is a strict local maximum of f on S if f(x) > f(y), for all
y ∈ B(x, ε) ∩ S, y ≠ x, for some ε > 0.
We will represent the Hessian or second derivative (matrix) of f by D²f.
Theorem 10 Suppose f : R^n → R is C² on S ⊆ R^n, and x is an interior
point of S.
1. (necessary) If f has a local max (resp. local min) at x, then D²f(x)
is n.s.d. (resp. p.s.d.).
2. (sufficient) If Df(x) = θ and D²f(x) is n.d. (resp. p.d.) at x, then x
is a strict local max (resp. min) of f on S.
The results in the above theorem follow from taking a Taylor series ap-
proximation of order 2 around the local max or local min. For example,
f(x) = f(x*) + Df(x*)(x − x*) + (1/2)(x − x*)ᵀ D²f(x*)(x − x*) + R_2(x − x*)
where R_2(·) is a remainder of order smaller than two. If x* is an interior
local max or min, then Df(x*) = θ, so the sign of f(x) − f(x*) for x near x*
is governed by the quadratic form (x − x*)ᵀ D²f(x*)(x − x*).
Examples to illustrate: (i) SONC are not sufficient: f(x) = x³. (ii) Semi-
definiteness cannot be replaced by definiteness: f(x) = x⁴. (iii) These are
conditions for local, not global optima: f(x) = 2x³ − 3x². (iv) Strategy for
using the conditions to identify global optima: f(x) = 4x³ − 5x² + 2x on
S = [0, 1].
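Strategy (iv), comparing f at interior critical points and at the endpoints, can be sketched in Python for f(x) = 4x³ − 5x² + 2x on [0, 1]; here f′(x) = 12x² − 10x + 2 = 2(3x − 1)(2x − 1), with roots 1/3 and 1/2:

```python
def f(x):
    return 4 * x**3 - 5 * x**2 + 2 * x

def fprime(x):
    return 12 * x**2 - 10 * x + 2    # = 2(3x - 1)(2x - 1), roots 1/3 and 1/2

# Candidates: interior critical points (f' = 0) plus the endpoints of S = [0, 1].
candidates = [0.0, 1/3, 1/2, 1.0]
assert all(abs(fprime(c)) < 1e-12 for c in (1/3, 1/2))
x_max = max(candidates, key=f)
x_min = min(candidates, key=f)
print(x_max, f(x_max))   # global max at x = 1, where f = 1
print(x_min, f(x_min))   # global min at x = 0, where f = 0
```

No second order conditions are needed: a global max and min exist (Weierstrass), and they must be among the listed candidates.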
Chapter 4
Optimization with Equality
Constraints
4.1 Introduction
We are given an objective function f : R^n → R to maximize or minimize,
subject to k constraints. That is, there are k functions, g_1 : R^n → R,
g_2 : R^n → R, ... , g_k : R^n → R, and we wish to
Maximize f(x) over all x ∈ R^n such that g_1(x) = 0, . . . , g_k(x) = 0.
More compactly, collect the constraint functions (looking at them as com-
ponent functions) into one function g : R^n → R^k, where g(x) = (g_1(x), . . . , g_k(x)).
Then what we want is to
Maximize f(x) over all x ∈ R^n such that g(x) = θ_{1×k}.
The Theorem of Lagrange provides necessary conditions for a local opti-
mum x*. To motivate it, consider utility maximization with two goods, and
suppose the optimum satisfies x* >> θ (both quantities strictly positive).
Then reallocating a small amount of income from one good to the other
does not increase utility. Say income dI > 0 is shifted from good 2 to good 1.
So dx_1 = (dI/p_1) > 0 and dx_2 = −(dI/p_2) < 0. Note that this reallocation
satisfies the budget constraint, since
p_1(x_1 + dx_1) + p_2(x_2 + dx_2) = I
The change in utility is dU = U_1 dx_1 + U_2 dx_2 =
[(U_1/p_1) − (U_2/p_2)]dI ≤ 0, since the change in utility cannot be positive
at a maximum. Therefore,
(U_1/p_1) − (U_2/p_2) ≤ 0    (1)
Similarly, dI > 0 shifted from good 1 to good 2 does not increase utility,
so that
[−(U_1/p_1) + (U_2/p_2)]dI ≤ 0, or
−(U_1/p_1) + (U_2/p_2) ≤ 0    (2)
Eq. (1) and (2) imply
(U_1(x*)/p_1) = (U_2(x*)/p_2) = λ    (3)
That is, the marginal utility of the last bit of income, λ = (U_1(x*)/p_1) =
(U_2(x*)/p_2), is equalized across the goods at the optimum. Also, (3) implies
U_1(x*) = λp_1, U_2(x*) = λp_2.
Along with p_1 x_1 + p_2 x_2 = I, these are the FONC of the Lagrangean function
L(x, λ) = U(x_1, x_2) + λ[I − p_1 x_1 − p_2 x_2]
More generally, suppose F : R² → R and G : R² → R, and suppose x*
maximizes F subject to G(x_1, x_2) = c. Shift a small amount dc of the
constrained resource from x_2 to x_1, i.e., dx_1 = dc/G_1 > 0 and dx_2 = −dc/G_2,
so that dG = G_1 dx_1 + G_2 dx_2 = 0 and the constraint stays satisfied. So
dF = F_1 dx_1 + F_2 dx_2 ≤ 0, or [(F_1/G_1) − (F_2/G_2)]dc ≤ 0. The reverse
inequality ≥ 0 can be shown similarly, by shifting in the other direction.
Therefore,
(F_1(x*)/G_1(x*)) = (F_2(x*)/G_2(x*)) = λ    (4)
Caveat: We have assumed that G_1(x*) and G_2(x*) are not zero.
Let's go back to the utility example. At the optimum x*, suppose
you increase income by ΔI. Buying more x_1 implies utility increases by
(U_1(x*)/p_1)ΔI, approximately.
Buying more x_2 implies utility increases by (U_2(x*)/p_2)ΔI.
At the optimum, (U_1(x*)/p_1) = (U_2(x*)/p_2) = λ.
So in either case, utility increases by λΔI: λ is (approximately) the marginal
utility of income. Analogously, if the constraint constant c increases by Δc
and only x_1 is changed, F increases by
dF = F_1 dx_1 = (F_1(x*)/G_1(x*))Δc = λΔc.
If instead x_2 is changed, F increases by dF = F_2 dx_2 = (F_2(x*)/G_2(x*))Δc =
λΔc.
4.2 The Theorem of Lagrange
The set up is the following. f : R^n → R is the objective function, g_i : R^n → R,
i = 1, . . . , k are the constraint functions, and x* is a local Max of f s.t.
g_i(x) = 0, i = 1, . . . , k. Thus x* is
a Max on the set S = U ∩ {x ∈ R^n | g_i(x) = 0, i = 1, . . . , k}.

Theorem 11 (Theorem of Lagrange). Let f : R^n → R and g_i : R^n →
R, i = 1, . . . , k, k < n be C¹ functions. Suppose x* is a Max or a Min of
f on the set S = U ∩ {x ∈ R^n | g_i(x) = 0, i = 1, . . . , k}, for some open set
U ⊆ R^n. Then there exist real numbers μ, λ_1, . . . , λ_k, not all zero, such that
μDf(x*) + Σ_{i=1}^k λ_i Dg_i(x*) = θ_{1×n}.
Moreover, if rank(Dg(x*)) = k, i.e., if the gradients Dg_1(x*), . . . , Dg_k(x*)
are linearly independent (the constraint qualification, CQ), then we may take
μ = 1. In that case,
Df(x*) + Σ_{i=1}^k λ_i Dg_i(x*) = θ_{1×n}.

Remarks. (1) Suppose the CQ holds but μ = 0. Then Σ_{i=1}^k λ_i Dg_i(x*) = θ
with λ_1, . . . , λ_k not all zero, i.e., Dg_1(x*), . . . , Dg_k(x*) are linearly dependent
unless λ_i = 0, i = 1, . . . , k. This cannot be. So if the CQ holds, then μ ≠ 0, so we can
divide through by μ.
(2) In most applications the CQ holds. We usually check first whether it
holds, and then proceed. Suppose it does hold. Note that
Df(x*) + Σ_{i=1}^k λ_i Dg_i(x*) = θ is the same as
(∂f(x*)/∂x_j) + Σ_{i=1}^k λ_i (∂g_i(x*)/∂x_j) = 0, j = 1, . . . , n
Note also that this leads to the usual procedure for finding equality
constrained Max or Min, by setting up a Lagrangean function:
L(x, λ) = f(x) + Σ_{i=1}^k λ_i g_i(x), and solving the FONC
(∂L(x, λ)/∂x_j) = (∂f(x)/∂x_j) + Σ_{i=1}^k λ_i (∂g_i(x)/∂x_j) = 0, j = 1, . . . , n
(∂L(x, λ)/∂λ_i) = g_i(x) = 0, i = 1, . . . , k
which is (n + k) equations in (n + k) variables x_1, . . . , x_n, λ_1, . . . , λ_k.
Why does the above procedure usually work to isolate global
optima?
The FONC that come out of the Lagrangean function are, as seen in the
Theorem of Lagrange, necessary conditions for local optima. However, when
we do equality constrained optimization, (i) usually a global max (or min)
x* is known to exist. (ii) Second, for most problems the CQ is met at all
x ∈ S. Therefore, it is met at the optimum as well. (Note that otherwise,
not knowing the optimum when we start out on a problem, it is not possible
to check whether the CQ holds at that point!)
When (i) and (ii) are met, the solutions to the FONC of the Lagrangean
function will include all local optima, and hence will include the global op-
timum that we want. By comparing the values f(x) for all x that solve the
FONC, we get the point at which f(x) is a max or a min. With this method,
we don't need second order conditions at all, if we just want to find a global
max or a min.
Pathologies
The above procedure may not always work.
Pathology 1. A global optimum may not exist. Then none of the critical
points (solutions to the FONC of the Lagrangean function) is a global op-
timum. Critical points may then be only local optima, or they may not
even be local optima. Indeed, the Theorem of Lagrange gives a necessary
condition; so there could be critical points that are not optima at all. For
instance, maximizing f(x, y) = x³ + y³ s.t. x − y = 0, the FONC give
x* = y* = 0 (with λ = 0) as a solution. But (x*, y*) = (0, 0)
is neither a local max nor a local min. Indeed, f(0, 0) = 0, whereas for
(x, y) = (ε, ε), ε > 0, f(ε, ε) = 2ε³ > 0, and for (x, y) = (ε, ε), ε < 0,
f(ε, ε) = 2ε³ < 0, and all such points satisfy the constraint.
Pathology 2. The CQ is violated at the optimum.
In this case, the FONCs need not be satisfied at the global optimum.
Example. Max f(x, y) = −y s.t. g(x, y) = y³ − x² = 0.
Let us first find the solution using native intelligence. Then we'll show
that the CQ fails at the optimum, and that the usual Lagrangean method
is a disaster. Finally, we'll show that the general form of the equation in the
Theorem of Lagrange, which does NOT assume that the CQ holds at the
optimum, works.
The constraint is y³ = x², and since x² is nonnegative, so must y³ be.
Therefore, y ≥ 0. The maximum of −y s.t. y ≥ 0 implies y = 0 at the max.
So y³ = x² = 0, so x = 0. So f attains its global max at (x, y) = (0, 0).
Dg(x, y) = (−2x, 3y²) = (0, 0) at (x, y) = (0, 0). So rank(Dg(x, y)) =
0 < k = 1 at the optimum; the CQ fails at this point. Using the Lagrangean
method, we get the following FONC:
(∂f/∂x) + λ(∂g/∂x) = 0, that is −2λx = 0    (1)
(∂f/∂y) + λ(∂g/∂y) = 0, that is −1 + 3λy² = 0    (2)
(∂L/∂λ) = 0, that is −x² + y³ = 0    (3)
Eq.(1) implies either λ = 0 or x = 0. x = 0 implies, from Eq.(3), that
y = 0, but then (2) becomes −1 = 0, which is not possible. Similarly, λ = 0
again violates (2).
But the general form of the condition in the Theorem of Lagrange does
not rely on the CQ and works. In this problem, the only equation out of the
above three that changes is Eq. (2), as we see below:
μDf(x, y) + λDg(x, y) = (0, 0), and −x² + y³ = 0, with Df(x, y) = (0, −1),
Dg(x, y) = (−2x, 3y²) yield
μ(∂f/∂x) + λ(∂g/∂x) = 0, that is −2λx = 0    (1)
μ(∂f/∂y) + λ(∂g/∂y) = 0, that is −μ + 3λy² = 0    (2)
−x² + y³ = 0    (3)
Now, Eq.(1) implies λ = 0 or x = 0. If λ = 0, then Eq.(2) implies μ = 0.
But μ = λ = 0 is ruled out by the Theorem of Lagrange. Therefore, here
λ ≠ 0. Hence x = 0. From Eq.(3), we then have y = 0, and so from Eq. (2),
μ = 0. So we get x = y = 0 as a solution (with μ = 0 and λ ≠ 0 arbitrary).
Second-Order Conditions
These conditions are characterized by definiteness or semi-definiteness of
the Hessian of the Lagrangean function, which is the appropriate function
to look at in this constrained optimization problem. Also, we don't have to
check the appropriate inequality for the quadratic form for all x. Now, only
those x are relevant that satisfy the constraints. Second order conditions in
general say something about the curvature of the objective function around
the local max or min, i.e., how the graph curves as we move from x* to
a nearby x. In constrained optimization, we cannot move from x* to any
arbitrary x nearby; the move must be to an x which satisfies the constraints.
That is, such a move must leave all g_i(x) at 0. In other words, dg_i(x) =
Dg_i(x).dx = 0, where dx is the vector of the (small) move.
For the Lagrangean L(x, λ) = f(x) + Σ_{i=1}^k λ_i g_i(x),
D²L(x, λ)_{n×n} = D²f(x)_{n×n} + Σ_{i=1}^k λ_i D²g_i(x)_{n×n},
where
D²f(x) = [ f_11(x) ... f_1n(x) ; ... ; f_n1(x) ... f_nn(x) ]
and
D²g_i(x) = [ g_i11(x) ... g_i1n(x) ; ... ; g_in1(x) ... g_inn(x) ]
So
D²L(x, λ)_{n×n} = [ f_11(x) + Σ_{i=1}^k λ_i g_i11(x) ... f_1n(x) + Σ_{i=1}^k λ_i g_i1n(x) ; ... ;
f_n1(x) + Σ_{i=1}^k λ_i g_in1(x) ... f_nn(x) + Σ_{i=1}^k λ_i g_inn(x) ]
is the second derivative of L w.r.t. the x variables. Note that D²L(x, λ)
is symmetric, so we may work with its quadratic form.
At a given x* ∈ R^n,
Dg(x*)_{k×n} = [ Dg_1(x*) ; ... ; Dg_k(x*) ]
So the set of all vectors x that are orthogonal to all the gradient vectors
of the constraint functions at x* is the null space of Dg(x*),
N(Dg(x*)) = {x ∈ R^n | Dg(x*)x = θ_{k×1}}.
Theorem 12 Suppose there exists (x*_{n×1}, λ*_{k×1}) such that Rank(Dg(x*)) = k
and Df(x*) + Σ_{i=1}^k λ*_i Dg_i(x*) = θ.
(i) (a necessary condition) If f has a local max (resp. local min) on S at
the point x*, then xᵀD²L(x*, λ*)x ≤ 0 (resp. ≥ 0) for all x ∈ N(Dg(x*)).
(ii) (a sufficient condition) If xᵀD²L(x*, λ*)x < 0 (resp. > 0) for all
x ∈ N(Dg(x*)), x ≠ θ, then x* is a strict local max (resp. min) of f on S.

These conditions are checked using the bordered Hessian
BH(L*) = [ 0_{k×k}  Dg(x*)_{k×n} ; [Dg(x*)]ᵀ_{n×k}  D²L(x*, λ*)_{n×n} ]_{(n+k)×(n+k)}
where BH(L*) denotes the bordered Hessian of the Lagrangean evaluated at
(x*, λ*); its lower-left block is [Dg(x*)]ᵀ, which is the transpose of Dg(x*).
Theorem 13 Let BH(L*; n + k − r) denote BH(L*) with its last r rows and
columns deleted.
(1a) xᵀD²L(x*, λ*)x ≤ 0 for all x ∈ N(Dg(x*)) only if
(−1)^{n−r} det(BH(L*; n + k − r)) ≥ 0, r = 0, 1, . . . , n − k − 1.
(1b) xᵀD²L(x*, λ*)x ≥ 0 for all x ∈ N(Dg(x*)) only if
(−1)^k det(BH(L*; k + n − r)) ≥ 0, r = 0, 1, . . . , n − k − 1.
(2a) xᵀD²L(x*, λ*)x < 0 for all x ≠ θ in N(Dg(x*)), iff
(−1)^{n−r} det(BH(L*; n + k − r)) > 0, r = 0, 1, . . . , n − k − 1.
(2b) xᵀD²L(x*, λ*)x > 0 for all x ≠ θ in N(Dg(x*)), iff
(−1)^k det(BH(L*; n + k − r)) > 0, r = 0, 1, . . . , n − k − 1.
Note. (1) For the negative definite or semidefiniteness subject to con-
straints cases, the determinant of the bordered Hessian with last r rows and
columns deleted must be of the same sign as (−1)^{n−r}. The sign of (−1)^{n−r}
switches with each successive increase in r from r = 0 to r = n − k − 1. So the
corresponding bordered Hessians switch signs. In the usual textbook case of
2 variables and one constraint, n − k = 1, so we just need to check
the sign for r = 0, that is, the sign of the determinant of the big bordered
Hessian. You should be clear about what this sign should be if it is to be a
sufficient condition for a strict local max or min. For the necessary condition,
we need to check signs ≥ 0 or ≤ 0, for one permuted matrix as well, in this
case. What is this permuted matrix?
(2) As in the unconstrained case, the sufficiency conditions do not require
checking weak inequalities for permuted matrices.
(3) In the p.s.d. and p.d. cases, the signs of the relevant minors must
be all positive, if the number k of constraints is even, and all negative, if k
is odd.
(4) If we know that a global max or min exists, where the CQ is satisfied,
and we get a unique solution x* ∈ R^n that solves the FONC, then we may
use a second order condition to check whether it is a max or a min. However,
weak inequalities demonstrating n.s.d. or p.s.d. (subject to constraints) of
D²L are only necessary, not sufficient, for an optimum; only the strict
(n.d./p.d.) versions deliver sufficiency.

Example. Maximize U(x_1, x_2) = x_1 x_2 s.t. x_1 ≥ 0, x_2 ≥ 0, and
p_1 x_1 + p_2 x_2 ≤ I, with p_1, p_2, I > 0. If (x*_1, x*_2) solves the above problem, then (i)
(x*_1, x*_2) > (0, 0). If x*_i = 0 for some i, then utility equals zero; clearly, we can
do better by allocating some income to the purchase of each good; and (ii)
the budget constraint binds at (x*_1, x*_2). For if p_1 x*_1 + p_2 x*_2 < I, then we can
allocate some of the remaining income to both goods, and increase utility
further.
We conclude from this that a solution (x*_1, x*_2) will also be a solution to
the problem
Max x_1 x_2 s.t. x_1 > 0, x_2 > 0, and p_1 x_1 + p_2 x_2 = I.
That is, Maximize U(x_1, x_2) = x_1 x_2 over the set S = R²_{++} ∩ {(x_1, x_2) | I −
p_1 x_1 − p_2 x_2 = 0}. Since the budget set in this problem is compact and the
utility function is continuous, U attains a maximum on the budget set (by
Weierstrass Theorem). Moreover, we argued above that at such a maximum
x*, x*_i > 0, i = 1, 2 and the budget constraint binds. So, x* ∈ S.
Furthermore, Dg(x) = (−p_1, −p_2), so Rank(Dg(x)) = 1, at all points in
the budget set. So the CQ is met. Therefore, the global max will be among
the critical points of L(x_1, x_2, λ) = x_1 x_2 + λ(I − p_1 x_1 − p_2 x_2).
FONC: (∂L/∂x_1) = x_2 − λp_1 = 0    (1)
(∂L/∂x_2) = x_1 − λp_2 = 0    (2)
(∂L/∂λ) = I − p_1 x_1 − p_2 x_2 = 0    (3)
λ ≠ 0, (otherwise (1) and (2) imply that x_1 = x_2 = 0, which violates (3)).
Therefore, from (1) and (2), λ = (x_2/p_1) = (x_1/p_2), or p_1 x_1 = p_2 x_2. So (3)
implies I − 2p_1 x_1 = 0, or p_1 x_1 = (I/2), which is the standard Cobb-Douglas
utility result that the budget share of a good is proportional to the exponent
w.r.t. it in the utility function. So we get
x*_i = (I/2p_i), i = 1, 2, and λ* = (I/2p_1 p_2).
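The closed-form solution can be verified against the three FONC directly; a small Python sketch (function name ours, purely illustrative):

```python
def cobb_douglas_demand(p1, p2, I):
    # Critical point of L = x1*x2 + lam*(I - p1*x1 - p2*x2) from the FONC.
    x1, x2 = I / (2 * p1), I / (2 * p2)
    lam = I / (2 * p1 * p2)
    return x1, x2, lam

p1, p2, I = 2.0, 5.0, 40.0
x1, x2, lam = cobb_douglas_demand(p1, p2, I)
# FONC (1): x2 - lam*p1 = 0; (2): x1 - lam*p2 = 0; (3): the budget binds.
assert abs(x2 - lam * p1) < 1e-12
assert abs(x1 - lam * p2) < 1e-12
assert abs(I - p1 * x1 - p2 * x2) < 1e-12
print(x1, x2, lam)   # 10.0 4.0 2.0
```

Note the equal budget shares p_1 x_1 = p_2 x_2 = I/2, as derived above.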
We argued that the global max would be one of the critical points of
L(x, λ) in this example; (note, however, that the global min (which occurs
at (x_1, x_2) = (0, 0)) is not a critical point). Since we have only one critical
point, it follows that this must be the global max! (We know that x_1 = x_2 = 0
is the global min, and not the point that we have located). If we were unsure
whether our point is a max or a min, we could try second order conditions
(unnecessary here) as follows:
Dg(x*) = (−p_1, −p_2)
D²L(x*, λ*) = D²U(x*) + λ* D²g(x*)
= [ U_11(x*) U_12(x*) ; U_21(x*) U_22(x*) ] + λ* [ g_11(x*) g_12(x*) ; g_21(x*) g_22(x*) ]
= [ 0 1 ; 1 0 ] + λ* [ 0 0 ; 0 0 ] = [ 0 1 ; 1 0 ]
Now evaluate the quadratic form zᵀD²L(x*, λ*)z = 2z_1 z_2 at any (z_1, z_2)
that is orthogonal to Dg(x*) = (−p_1, −p_2). So, −p_1 z_1 − p_2 z_2 = 0 or
z_1 = −(p_2/p_1)z_2. For such (z_1, z_2), zᵀD²L(x*, λ*)z = −(2p_2/p_1)z_2² < 0 for z ≠ θ, so
D²L(x*, λ*) is n.d. on N(Dg(x*)). Equivalently, form the bordered Hessian
BH(L*) = [ 0 Dg(x*) ; [Dg(x*)]ᵀ D²L(x*, λ*) ] = [ 0 −p_1 −p_2 ; −p_1 0 1 ; −p_2 1 0 ]
det(BH(L*)) = 2p_1 p_2 > 0. This is the sign of (−1)^n = (−1)². Therefore,
there is a strict local max at x*.
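The determinant computation holds for any positive prices; a Python sketch (3×3 cofactor expansion, helper names ours) checks det(BH(L*)) = 2p_1 p_2:

```python
def det3(M):
    # Determinant of a 3x3 matrix by cofactor expansion along the first row.
    a, b, c = M[0]
    d, e, f = M[1]
    g, h, i = M[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def bordered_hessian(p1, p2):
    # BH(L*) for Max x1*x2 s.t. I - p1*x1 - p2*x2 = 0:
    # Dg = (-p1, -p2) and D2L = [[0, 1], [1, 0]].
    return [[0.0, -p1, -p2],
            [-p1, 0.0, 1.0],
            [-p2, 1.0, 0.0]]

p1, p2 = 3.0, 7.0
d = det3(bordered_hessian(p1, p2))
print(d)   # 2*p1*p2 = 42.0 > 0, the sign of (-1)^n with n = 2: strict local max
assert abs(d - 2 * p1 * p2) < 1e-9
```

The sign test is exactly condition (2a) of Theorem 13 with n = 2, k = 1, r = 0.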
Digression on the Chain Rule
We saw an example (in the proof of the 1st order condition) of the Chain
Rule at work; you told me you've seen this. Namely, if h : R → R^n and
f : R^n → R are differentiable at the relevant points, then the composition
g(t) = f(h(t)) is differentiable at t and
g′(t) = Df(h(t))Dh(t) = Σ_{j=1}^n (∂f(h(t))/∂x_j) h′_j(t)
You may have encountered this before in the notation f(h_1(t), . . . , h_n(t)),
with some use of total differentiation or something. Similarly, suppose h :
R^p → R^n and f : R^n → R^m are differentiable at the relevant points, then the
composition g(x) = f(h(x)), g : R^p → R^m, is differentiable at x, and
Dg(x) = Df(h(x))Dh(x).
Here, on the RHS an m×n matrix multiplies an n×p matrix, to result
in the m×p matrix on the LHS. Things are actually quite similar to the
familiar case. The (i, j)th element of the matrix Dg(x) is ∂g_i(x)/∂x_j, where
g_i is the ith component function of g and x_j is the jth variable. Since this is
equal to the dot product of the ith row of Df(h(x)) and the jth column of
Dh(x), we have
∂g_i(x)/∂x_j = Σ_{k=1}^n (∂f_i(h(x))/∂h_k)(∂h_k(x)/∂x_j)
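The matrix form of the chain rule can be verified with finite differences. The Python sketch below (the maps h and f are our illustrative choices; central differences with step 1e-6) compares Dg(x) against the product Df(h(x))Dh(x):

```python
import math

def numeric_jacobian(F, x, m, h=1e-6):
    # Central-difference Jacobian of F : R^p -> R^m at x, returned as m x p rows.
    p = len(x)
    J = [[0.0] * p for _ in range(m)]
    for j in range(p):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        Fp, Fm = F(xp), F(xm)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * h)
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# h : R^2 -> R^2, f : R^2 -> R^1, g = f o h (illustrative maps).
h_map = lambda x: [x[0] * x[1], x[0] + x[1]]
f_map = lambda y: [math.sin(y[0]) + y[1] ** 2]
g_map = lambda x: f_map(h_map(x))

x = [0.7, -0.3]
lhs = numeric_jacobian(g_map, x, 1)                     # Dg(x)
rhs = matmul(numeric_jacobian(f_map, h_map(x), 1),      # Df(h(x)) Dh(x)
             numeric_jacobian(h_map, x, 2))
assert all(abs(a - b) < 1e-4 for a, b in zip(lhs[0], rhs[0]))
print("chain rule verified numerically")
```

The two Jacobians agree to the accuracy of the finite-difference scheme.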
Application to the Implicit Function Theorem

Theorem 14 Suppose F : R^{m+n} → R^m is C¹, and suppose, for a given
c ∈ R^m, F(y, x) = c for some y ∈ R^m and some x ∈ R^n. Suppose also
that DF_y(y, x) has rank m. Then there are open sets U containing x and V
containing y and a C¹ function f : U → V s.t.
F(f(x), x) = c for all x ∈ U.
Moreover,
Df(x) = −[DF_y(y, x)]^{−1} DF_x(y, x)
The proof of this theorem starts going deep, so will not be part of this
course. But notice, that applying the Chain Rule to differentiate
F(f(x), x) = c
yields
DF_y(y, x)Df(x) + DF_x(y, x) = 0    (*)
whence the expression for Df(x).
More painfully in terms of compositions, if h(x) = (f(x), x), then
Dh(x) = [ Df(x) ; I ],
whereas DF(·) = (DF_y(·) | DF_x(·)), so matrix multiplication using parti-
tions yields Eq.(*).
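A numerical illustration of the theorem (the choice of F is ours): F(y, x) = y³ + y − x = 0 defines y = f(x) implicitly, with DF_y = 3y² + 1 > 0, so Df(x) = 1/(3y² + 1). The sketch solves for y by Newton's method and compares the formula with a finite difference:

```python
# F(y, x) = y^3 + y - x = 0 defines y = f(x) implicitly (DF_y = 3y^2 + 1 > 0).
def solve_y(x, tol=1e-12):
    # Newton's method on y -> y^3 + y - x (the map is strictly increasing in y).
    y = 0.0
    for _ in range(100):
        step = (y**3 + y - x) / (3 * y**2 + 1)
        y -= step
        if abs(step) < tol:
            break
    return y

x0 = 2.0
y0 = solve_y(x0)              # y0 = 1, since 1 + 1 = 2
# Theorem: Df(x) = -[DF_y]^(-1) DF_x = -(1/(3y^2 + 1)) * (-1) = 1/(3y^2 + 1).
predicted = 1.0 / (3 * y0**2 + 1)
h = 1e-6
numeric = (solve_y(x0 + h) - solve_y(x0 - h)) / (2 * h)
assert abs(predicted - numeric) < 1e-6
print(round(predicted, 6))    # 0.25
```

The implicit derivative matches the finite-difference slope of the solved branch.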
Proof of the Theorem of Lagrange
Before the formal proof, note that we'll use the tangency of the contour
sets of the objective and the constraint approach, which in other words uses
the implicit function theorem. For example, consider maximizing F(x_1, x_2)
s.t. G(x_1, x_2) = 0. If G_1 ≠ 0 (this is the constraint qualification in this
case), we have at a tangency point of contour sets, G_1 f′(x_2) + G_2 = 0 (where
x_1 = f(x_2) is the implicit function that keeps the points (x_1, x_2) on the
constraint); so f′(x_2) = −G_2/G_1.
On the other hand, if we vary x_2 and adjust x_1 to stay on the constraint,
the function value F(x_1, x_2) = F(f(x_2), x_2) does not increase; therefore lo-
cally around the optimum, F_1 f′(x_2) + F_2 = 0. Substituting, −F_1(G_2/G_1) +
F_2 = 0. If we now put
λ = −F_1/G_1,
we have both F_1 + λG_1 = 0 by definition, and λG_2 + F_2 = 0, the two
FONC.
The Proof:
Without loss of generality, let the leading principal k×k minor matrix of
Dg(x*) be nonsingular (relabel the variables if necessary). Write x = (w, z),
where w ∈ R^k collects the first k variables and z ∈ R^{n−k} the rest, so that the
k×k matrix Dg_w(w*, z*) is invertible. Showing that
Df(x*) + λDg(x*) = θ
for some λ = (λ_1, . . . , λ_k)
is the same as showing that the 2 equations below hold for this λ; the
equations are of dimension 1×k and 1×(n−k) respectively:
Df_w(w*, z*) + λDg_w(w*, z*) = θ    (*)
Df_z(w*, z*) + λDg_z(w*, z*) = θ    (**)
Since Dg_w(w*, z*) is invertible, define
λ = −Df_w(w*, z*)[Dg_w(w*, z*)]^{−1}
so that (*) holds by construction.
We show (**) in two steps. First, by the implicit function theorem, near z*
there is a C¹ function h with w = h(z) keeping g(h(z), z) = θ, and
Dh(z*) = −[Dg_w(w*, z*)]^{−1} Dg_z(w*, z*)
Second, define F(z) = f(h(z), z). Since there's a constrained optimum
at (h(z*), z*), z* is an unconstrained local optimum of F. So
DF(z*) = Df_w(w*, z*)Dh(z*) + Df_z(w*, z*) = θ
Substituting for Dh(z*),
−Df_w(·)[Dg_w(·)]^{−1} Dg_z(·) + Df_z(·) = θ
That is,
λDg_z(·) + Df_z(·) = θ
which is (**).
Chapter 5
Optimization with Inequality
Constraints
5.1 Introduction
The problem is to find the Maximum or the Minimum of f : R^n → R on the
set {x ∈ R^n | g_i(x) ≥ 0, i = 1, . . . , k}, where g_i : R^n → R are the k constraint
functions. At the optimum, the constraints are now allowed to be binding
(or tight or effective), i.e. g_i(x) = 0, as before, or slack (or non-binding), i.e.
g_i(x) > 0.
Example: Max U(x_1, x_2) s.t. x_1 ≥ 0, x_2 ≥ 0, I − p_1 x_1 − p_2 x_2 ≥ 0. If we
do not know whether x_i = 0, for some i, at the utility maximum, or whether
x_i > 0, then clearly we cannot use the Theorem of Lagrange. Similarly,
if there is a bliss point, then we do not know in advance whether at
the budget constrained optimum the budget constraint is binding or slack.
Again, we cannot then use the Theorem of Lagrange, to use which we need
to be assured that the constraint is binding.
Note the general nature of a constraint of the form g_i(x) ≥ 0. If we have
a constraint h(x) ≤ 0, this is equivalent to −h(x) ≥ 0. And something like
h(x) ≤ c is equivalent to c − h(x) ≥ 0.
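These normalizations are mechanical; a tiny Python sketch (helper names ours, purely illustrative):

```python
# Normalize constraints to the canonical form g_i(x) >= 0.
def geq_from_leq(h):
    # h(x) <= 0  is equivalent to  g(x) = -h(x) >= 0.
    return lambda x: -h(x)

def geq_from_bound(h, c):
    # h(x) <= c  is equivalent to  g(x) = c - h(x) >= 0.
    return lambda x: c - h(x)

g1 = geq_from_leq(lambda x: x[0] + x[1] - 1)       # x1 + x2 <= 1
g2 = geq_from_bound(lambda x: x[0] ** 2, 4.0)      # x1^2 <= 4
point = (0.25, 0.25)
print(g1(point) >= 0, g2(point) >= 0)  # True True
```

Any mixed system of ≤ and ≥ constraints can thus be rewritten in the g_i(x) ≥ 0 form assumed below.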
We use Kuhn-Tucker Theory to address optimization problems with in-
equality constraints. The main result is a first order necessary condition
that is somewhat different from that of the Theorem of Lagrange; one main
difference is that the first order conditions g_i(x) = 0, i = 1, . . . , k in the The-
orem of Lagrange are replaced by the conditions λ_i g_i(x) = 0, i = 1, . . . , k in
Kuhn-Tucker theory.
In order to motivate this difference, let us do a simple example. Consider the function f : R → R defined by f(x) = 10x − x^2. f is strictly concave, has a unique Max at x = 5, and equals 0 at x = 0 and x = 10. Consider first the problem
Max 10x − x^2 s.t. x ≤ 3, and compare this with the equality constraint x = 3. The constraint function in either case is g(x) = 3 − x. Analytically or from a diagram, we see that the maximum occurs at x = 3 in either case, and the value of the multiplier is λ = 10 − 2(3) = 4 in either case.

Denote the constraint functions binding at a point as (g_i)_{i ∈ B}, where B is the set of indexes of the binding constraints. Let g_B : R^n → R^l be the function whose l components are the constraint functions of the binding constraints. That is,
g_B(x) = (g_i(x))_{i ∈ B}.
Dg_B(x) = [Dg_{i_1}(x); ... ; Dg_{i_l}(x)] (the rows stacked one above the other), where i_1, ..., i_l are the indexes of the binding constraints. So Dg_B(x) is an l × n matrix.
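To illustrate the rank computation (with numbers of my own choosing, anticipating the consumer problem of Example 1 below), one can stack the gradients of the binding constraints and check the rank numerically:

```python
import numpy as np

# Constraints of the consumer problem in Example 1 below: g1 = x1,
# g2 = x2, g3 = I - p1*x1 - p2*x2, with gradients written as row vectors.
p1, p2 = 2.0, 3.0
Dg1 = np.array([1.0, 0.0])
Dg2 = np.array([0.0, 1.0])
Dg3 = np.array([-p1, -p2])

# Suppose g1 and g3 bind (x1 = 0 and the budget is exhausted), so l = 2
# and n = 2.  Stack the binding gradients into the l x n matrix Dg_B(x).
DgB = np.vstack([Dg1, Dg3])
print(DgB.shape, np.linalg.matrix_rank(DgB))  # (2, 2) 2: full rank
```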
We now state the FONC for the problem. The Theorem below is a consolidation of the Fritz John and the Kuhn-Tucker Theorems.

Theorem 15 (The Kuhn-Tucker (KT) Theorem). Let f : R^n → R and g_i : R^n → R, i = 1, ..., k, be C^1 functions. Suppose x* is a Maximum of f on the set S = U ∩ {x ∈ R^n | g_i(x) ≥ 0, i = 1, ..., k}, for some open set U ⊆ R^n. Then there exist real numbers μ, λ_1, ..., λ_k, not all zero, such that

μDf(x*) + Σ_{i=1}^k λ_i Dg_i(x*) = 0_{1×n}.

Moreover, if g_i(x*) > 0, then λ_i = 0.

If, in addition, Rank Dg_B(x*) = l (where B is the set of indexes of the constraints binding at x*), then μ can be taken equal to 1, λ_i ≥ 0, i = 1, ..., k, and λ_i > 0 for some i implies g_i(x*) = 0.
Suppose the constraint qualification, Rank Dg_B(x*) = l, is met at the optimum. Then the KT equations are the following (n + k) equations in the n + k variables x_1, ..., x_n, λ_1, ..., λ_k:

λ_i g_i(x*) = 0, i = 1, ..., k, with λ_i ≥ 0, g_i(x*) ≥ 0 and complementary slackness, (1)

Df(x*) + Σ_{i=1}^k λ_i Dg_i(x*) = 0. (2)

If x* is instead a minimum of f on S, Equation (2) is replaced by

−Df(x*) + Σ_{i=1}^k λ_i Dg_i(x*) = 0. (2′)
Equations (1) and (2) are known as the Kuhn-Tucker conditions.

Note finally that the conditions of the Kuhn-Tucker Theorem are not sufficient conditions for local optima; there may be points that satisfy Equations (1) and (2) (or (2′)) without being local optima. For example, you may check that for the problem
Max f(x) = x^3 s.t. g(x) = x ≥ 0, the values x = λ = 0 satisfy the KT FONC (1) and (2) for a local maximum but do not yield a maximum.
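A quick numerical check of this counterexample (a sketch, assuming the normalized conditions with μ = 1):

```python
# f(x) = x**3 on g(x) = x >= 0: the point x = 0 with lam = 0 satisfies
# the KT first order conditions yet is not a maximum.
f = lambda x: x**3
Df = lambda x: 3 * x**2
g = lambda x: x
Dg = lambda x: 1.0

x, lam = 0.0, 0.0
assert lam * g(x) == 0 and lam >= 0 and g(x) >= 0   # Eq. (1) holds
assert Df(x) + lam * Dg(x) == 0                     # Eq. (2) holds
assert f(1.0) > f(x)                                # yet x = 0 is no max
print("KT conditions hold at x = 0, but f(1) > f(0)")
```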
5.3 Using the Kuhn-Tucker Theorem
We want to maximize f(x) over the set {x ∈ R^n | g(x) ≥ 0_{1×k}}, where g(x) = (g_1(x), ..., g_k(x)).

Set up L(x, λ) = f(x) + Σ_{i=1}^k λ_i g_i(x).

(If we want to minimize f(x), set up L(x, λ) = −f(x) + Σ_{i=1}^k λ_i g_i(x).)
To ensure that the KT FONC will hold at the global max, verify that (1) a global max exists, and (2) the constraint qualification is met at the maximum.

The second check is not possible to carry out directly if we don't know where the maximum is. What we do instead is check whether the CQ holds everywhere in the domain, and if not, note the points where it fails. The CQ in the theorem depends on which constraints are binding at the maximum. Again, since we don't know the maximum, we don't know which constraints bind at it. With k constraint functions, there are 2^k profiles of binding and non-binding constraints possible, each profile implying a different CQ. We either check all of them, or rule out some profiles using clever arguments.
If both checks are fine, then we find all solutions (x*, λ*) to the set of equations:

λ_i(∂L(x, λ)/∂λ_i) = 0, λ_i ≥ 0, (∂L(x, λ)/∂λ_i) ≥ 0, i = 1, ..., k, with CS,
(∂L(x, λ)/∂x_j) = 0, j = 1, ..., n.

From the set of all solutions, we pick the (x*, λ*) for which f(x*) is maximum. Note that this method does not require checking for concavity of objective functions and constraints, and does not require checking any second order condition.
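As an illustrative sketch of this recipe (applied to the earlier toy problem Max 10x − x^2 s.t. 3 − x ≥ 0; the code and variable names are mine, not the text's):

```python
# The recipe applied to Max 10x - x**2 s.t. g(x) = 3 - x >= 0.  With one
# constraint there are 2 binding/slack profiles to check.
f = lambda x: 10 * x - x**2

candidates = []

# Profile 1: g binds, so x = 3.  Eq. (2): (10 - 2x) - lam = 0 gives lam.
x = 3.0
lam = 10 - 2 * x
if lam >= 0:                       # Eq. (1) requires lam >= 0
    candidates.append((x, lam))

# Profile 2: g slack, so lam = 0 and Eq. (2) gives 10 - 2x = 0, x = 5.
x, lam = 5.0, 0.0
if 3 - x >= 0:                     # x = 5 is infeasible: rejected
    candidates.append((x, lam))

# Among the candidates, pick the one with the largest f(x).
best = max(candidates, key=lambda c: f(c[0]))
print(best)  # (3.0, 4.0): the constrained maximum is x = 3 with lam = 4
```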
The method may fail if a global max does not exist or if the CQ fails at the maximum. The example Max f(x) = x^3 s.t. g(x) = x ≥ 0 is one where no global max exists, and we saw earlier that the method fails there.
An example in which the CQ fails: Max f(x) = 2x^3 − 3x^2, s.t. g(x) = (3 − x)^3 ≥ 0.

Suppose the constraint does not bind at the maximum; then we don't have to check a CQ. But suppose it does; that is, suppose the optimum occurs at x = 3. Dg(x) = −3(3 − x)^2 = 0 at x = 3. The CQ fails here. You could check that the KT FONC will not isolate the maximum. In fact, in this baby example, it is easy to see that x = 3 is the max, as (3 − x)^3 ≥ 0 iff 3 − x ≥ 0, so we may work with the latter constraint function, with which the CQ does not fail. It is a good exercise to visualize f(x) and see that x = 3 is the maximum, rather than merely cranking out the algebra.
Alternatively, we may use the more general FONC stated in the theorem:

μDf(x) + λDg(x) = 0, with μ, λ not both zero.

μ(6x^2 − 6x) + λ(−3(3 − x)^2) = 0, and (1)
λ ≥ 0, (3 − x)^3 ≥ 0, with strict inequality implying λ = 0. (2)

If (3 − x)^3 > 0, then λ = 0, which from Eq. (1) implies either μ = 0, which violates the FONC, or 6x^2 − 6x = 0, i.e. x = 0 or x = 1, with f(0) = 0 and f(1) = −1.

On the other hand, if (3 − x)^3 = 0, that is x = 3, then Eq. (1) implies μ = 0, so it must be that λ > 0. At x = 3, f(x) = 27, so x = 3 is the maximum.
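The comparison of candidate values can be checked numerically (a small sketch using the numbers computed above):

```python
# f and the gradient of g(x) = (3 - x)**3 for the example above.
f = lambda x: 2 * x**3 - 3 * x**2
Dg = lambda x: -3 * (3 - x)**2

assert Dg(3.0) == 0.0              # Rank Dg(3) = 0 != 1: the CQ fails
assert (f(0.0), f(1.0), f(3.0)) == (0.0, -1.0, 27.0)
assert f(3.0) > f(0.0) > f(1.0)    # comparing candidates isolates x = 3
print("x = 3 is the maximum")
```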
Two Simple Utility Maximization Problems
Example 1. This is a real baby example meant purely for illustration. No one expects you to use the heavy Kuhn-Tucker machinery for such simple problems. In this example, one expects instead that you would use reasoning about the marginal utility per rupee ratios (U_1/p_1), (U_2/p_2) to solve the problem.
Max U(x_1, x_2) = x_1 + x_2 over the set {x = (x_1, x_2) ∈ R^2 | x_1 ≥ 0, x_2 ≥ 0, I − p_1x_1 − p_2x_2 ≥ 0}, where I > 0, p_1 > 0 and p_2 > 0 are given.
So there are 3 inequality constraints:
g_1(x_1, x_2) = x_1 ≥ 0, g_2(x_1, x_2) = x_2 ≥ 0, and
g_3(x_1, x_2) = I − p_1x_1 − p_2x_2 ≥ 0.
At the maximum x*, the budget constraint must bind (g_3(x*) = 0), since utility is strictly increasing in both goods. Moreover, g_1(x*) = g_2(x*) = 0 is not possible, since consuming 0 of both goods gives utility equal to 0, which is clearly not a maximum.
So we have to check just three possibilities out of the eight:

Case (1): g_1(x*) > 0, g_2(x*) > 0, g_3(x*) = 0.
Case (2): g_1(x*) = 0, g_2(x*) > 0, g_3(x*) = 0.
Case (3): g_1(x*) > 0, g_2(x*) = 0, g_3(x*) = 0.
Before using the KT conditions, we verify that (i) a global max exists (here, because the utility function is continuous and the budget set is compact), and that (ii) the CQ holds at all 3 relevant combinations of binding constraints described above.

Indeed, for Case (1), Dg_B(x) = Dg_3(x) = (−p_1, −p_2), so Rank[Dg_3(x)] = 1, and the CQ holds.
For Case (2), Dg_B(x) = [Dg_1(x); Dg_3(x)] = [1, 0; −p_1, −p_2], so Rank[Dg_B(x)] = 2.

For Case (3), Dg_B(x) = [Dg_2(x); Dg_3(x)] = [0, 1; −p_1, −p_2], so Rank[Dg_B(x)] = 2.
Thus for the maximum x*, there exists a λ* such that (x*, λ*) is a solution to the KT FONCs. Of course, there could be other (x, λ)'s that are solutions as well, but a simple comparison of U(x) for all candidate solutions will isolate for us the maximum.
L(x, λ) = x_1 + x_2 + λ_1x_1 + λ_2x_2 + λ_3(I − p_1x_1 − p_2x_2)
The KT conditions are:

λ_1(∂L/∂λ_1) = λ_1x_1 = 0, λ_1 ≥ 0, x_1 ≥ 0, with CS (1)
λ_2(∂L/∂λ_2) = λ_2x_2 = 0, λ_2 ≥ 0, x_2 ≥ 0, with CS (2)
λ_3(∂L/∂λ_3) = λ_3(I − p_1x_1 − p_2x_2) = 0, λ_3 ≥ 0, I − p_1x_1 − p_2x_2 ≥ 0, with CS (3)
(∂L/∂x_1) = 1 + λ_1 − λ_3p_1 = 0 (4)
(∂L/∂x_2) = 1 + λ_2 − λ_3p_2 = 0 (5)
Since we don't know which of the three cases selects the constraints that bind at the maximum, we must try all three.

Case (1). Since x_1 > 0, x_2 > 0, (1) and (2) imply λ_1 = λ_2 = 0. Plugging these into Eqs. (4) and (5), we have 1 = λ_3p_1 = λ_3p_2. Since utility is strictly increasing, relaxing the budget constraint will increase utility, so the marginal utility of income, λ_3, is strictly positive. Thus λ_3p_1 = λ_3p_2 implies p_1 = p_2.
(We could alternatively have got λ_3 > 0 simply by equation mining, as follows: if λ_3 = 0, then Eqs. (4) and (5) imply λ_1 = λ_2 = −1, which violates the requirement λ_1 ≥ 0, λ_2 ≥ 0 of Eqs. (1) and (2). Thus it must be that λ_3 > 0.)
So if at a local max both x_1 and x_2 are strictly positive, then it must be that their prices are equal. All (x_1, x_2) that solve Eq. (3) are solutions. The utility in any such case equals
x_1 + (I − p_1x_1)/p_2 = I/p, where p = p_1 = p_2. Note that in this case, (U_1/p_1) = (U_2/p_2) = 1/p.
Case (2). x_1 = 0 implies, from Eq. (3), that x_2 = I/p_2. Since this is greater than 0, Eq. (2) implies λ_2 = 0. Hence from Eq. (5), λ_3p_2 = 1. Since λ_1 ≥ 0, Eqs. (4) and (5) imply λ_3p_1 = 1 + λ_1 ≥ 1 = λ_3p_2. Moreover, since λ_3 > 0, this implies p_1 ≥ p_2.
That is, if it is the case that at the maximum x_1 = 0 and x_2 > 0, then it must be that p_1 ≥ p_2. Note that in this case, (U_2/p_2) = (1/p_2) ≥ (U_1/p_1) = (1/p_1).
For completeness' sake, Eq. (5) implies λ_3 = 1/p_2. So from Eq. (4), λ_1 = (p_1/p_2) − 1. So the unique critical point of L(x, λ) is

(x*, λ*) = (x_1, x_2, λ_1, λ_2, λ_3) = (0, I/p_2, (p_1/p_2) − 1, 0, 1/p_2).
Case (3). This case is similar, and we get that x_2 = 0, x_1 > 0 occurs only if p_1 ≤ p_2. We have

(x*, λ*) = (I/p_1, 0, 0, (p_2/p_1) − 1, 1/p_1).
We see that which of the cases applies depends upon the price ratio p_1/p_2. If p_1 = p_2, then all three cases are relevant, and all (x_1, x_2) ∈ R^2_+ such that the budget constraint binds are utility maxima. But if p_1 > p_2, then only Case (2) applies, because if Case (1) had applied, we would have had p_1 = p_2, and if Case (3) had applied, that would have implied p_1 ≤ p_2. The solution to the KT conditions in that case is the utility maximum. Similarly, if p_1 < p_2, only Case (3) applies.
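The case logic above can be collected into a small demand function and checked against a brute-force search over the budget line (a sketch; the function name and the grid are my own):

```python
def linear_demand(p1, p2, I):
    """One utility-maximizing bundle for U = x1 + x2 (hypothetical helper)."""
    if p1 > p2:                      # Case (2): buy only good 2
        return (0.0, I / p2)
    if p1 < p2:                      # Case (3): buy only good 1
        return (I / p1, 0.0)
    return (I / (2 * p1), I / (2 * p1))   # p1 = p2: one of many maxima

# Brute-force check over the budget line x2 = (I - p1*x1)/p2.
p1, p2, I = 3.0, 1.0, 12.0
grid = [(x1, (I - p1 * x1) / p2) for x1 in [i * 0.01 for i in range(401)]]
best = max(grid, key=lambda x: x[0] + x[1])
x = linear_demand(p1, p2, I)
assert x == (0.0, 12.0)
assert x[0] + x[1] >= best[0] + best[1] - 1e-9
print(x)  # (0.0, 12.0)
```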
Example 2. Max U(x_1, x_2) = x_1/(1 + x_1) + x_2/(1 + x_2), s.t. x_1 ≥ 0, x_2 ≥ 0, p_1x_1 + p_2x_2 ≤ I.

Check that the indifference curves are downward sloping, convex, and that they cut the axes (show all this). This last property is due to the additive form of the utility function, and may result in 0 consumption of one of the goods at the utility maximum.

Exactly as in Example 1, we are assured that a global max exists, that the CQ is met at the optimum, and that there are only 3 relevant cases of binding constraints to check.
The Kuhn-Tucker conditions are:

λ_1(∂L/∂λ_1) = λ_1x_1 = 0, λ_1 ≥ 0, x_1 ≥ 0, with CS (1)
λ_2(∂L/∂λ_2) = λ_2x_2 = 0, λ_2 ≥ 0, x_2 ≥ 0, with CS (2)
λ_3(∂L/∂λ_3) = λ_3(I − p_1x_1 − p_2x_2) = 0, λ_3 ≥ 0, I − p_1x_1 − p_2x_2 ≥ 0, with CS (3)
(∂L/∂x_1) = 1/(1 + x_1)^2 + λ_1 − λ_3p_1 = 0 (4)
(∂L/∂x_2) = 1/(1 + x_2)^2 + λ_2 − λ_3p_2 = 0 (5)
Case (1). x_1 > 0, x_2 > 0 implies λ_1 = λ_2 = 0. Eq. (4) implies λ_3 > 0, so that Eqs. (4) and (5) give (1 + x_2)/(1 + x_1) = (p_1/p_2)^{1/2}.
Using Eq. (3), which gives x_2 = (I − p_1x_1)/p_2, in the above, we get
(p_2 + I − p_1x_1)/(p_2(1 + x_1)) = (p_1/p_2)^{1/2}, so simple computations yield

x_1 = (I + p_2 − (p_1p_2)^{1/2})/(p_1 + (p_1p_2)^{1/2}),
x_2 = (I + p_1 − (p_1p_2)^{1/2})/(p_2 + (p_1p_2)^{1/2}),
λ_3 = 1/(p_1(1 + x_1)^2).
x_1 > 0 and x_2 > 0 imply I > (p_1p_2)^{1/2} − p_2 and I > (p_1p_2)^{1/2} − p_1, respectively. If either of these fails, then we are not in the regime of Case (1).
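These formulas can be sanity-checked numerically (a sketch with sample values p_1 = 1, p_2 = 4, I = 10, chosen by me to lie in the Case (1) regime):

```python
import math

p1, p2, I = 1.0, 4.0, 10.0          # sample values in the Case (1) regime
s = math.sqrt(p1 * p2)
assert I > s - p1 and I > s - p2    # regime check

x1 = (I + p2 - s) / (p1 + s)
x2 = (I + p1 - s) / (p2 + s)
l3 = 1 / (p1 * (1 + x1) ** 2)

assert abs(I - p1 * x1 - p2 * x2) < 1e-9            # budget binds, Eq. (3)
assert abs(1 / (1 + x1) ** 2 - l3 * p1) < 1e-9      # Eq. (4)
assert abs(1 / (1 + x2) ** 2 - l3 * p2) < 1e-9      # Eq. (5)
print(x1, x2)  # 4.0 1.5
```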
Case (2). x_1 = 0 with Eq. (3) implies x_2 = I/p_2. Since this is positive, λ_2 = 0, so Eq. (5) implies λ_3 = 1/((1 + (I/p_2))^2 p_2) = p_2/(p_2 + I)^2.
λ_1 = λ_3p_1 − 1 (from x_1 = 0 and Eq. (4)).
So λ_1 = p_1p_2/(p_2 + I)^2 − 1. For this to be ≥ 0, it is required that p_1p_2/(p_2 + I)^2 ≥ 1, that is, I ≤ (p_1p_2)^{1/2} − p_2.
Utility equals x_2/(1 + x_2) = I/(p_2 + I), and

(x_1, x_2, λ_1, λ_2, λ_3) = (0, I/p_2, −1 + p_1p_2/(p_2 + I)^2, 0, p_2/(p_2 + I)^2).
Case (3). By symmetry, the solution is

(x_1, x_2, λ_1, λ_2, λ_3) = (I/p_1, 0, 0, −1 + p_1p_2/(p_1 + I)^2, p_1/(p_1 + I)^2),

and for this case to hold it is necessary that p_1p_2/(p_1 + I)^2 ≥ 1, or I ≤ (p_1p_2)^{1/2} − p_1.
To summarize: suppose p_1 = p_2 = p. Then (p_1p_2)^{1/2} − p_1 = (p_1p_2)^{1/2} − p_2 = 0. So since I > 0, we are in the regime of Case (1), and x_1 = x_2 = I/2p at the maximum.
Suppose on the other hand that p_1 < p_2 (the contrary case can be worked out similarly). Then p_2 > (p_1p_2)^{1/2} > p_1, so that
(p_1p_2)^{1/2} − p_1 > 0 > (p_1p_2)^{1/2} − p_2. Thus either
I > (p_1p_2)^{1/2} − p_1, in which case we use Case (1), or
I ≤ (p_1p_2)^{1/2} − p_1, in which case we use Case (3). Case (2), that in which a positive amount of good 2 and zero of good 1 is consumed at the maximum, does not apply.
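The regime selection can be sketched as a small function and compared with a brute-force search over the budget line (the function name and sample values are my own):

```python
import math

def demand(p1, p2, I):
    """Maximizer of x1/(1+x1) + x2/(1+x2) on the budget set (a sketch)."""
    s = math.sqrt(p1 * p2)
    if I <= s - p1:                  # Case (3): consume only good 1
        return (I / p1, 0.0)
    if I <= s - p2:                  # Case (2): consume only good 2
        return (0.0, I / p2)
    return ((I + p2 - s) / (p1 + s), (I + p1 - s) / (p2 + s))  # Case (1)

U = lambda x: x[0] / (1 + x[0]) + x[1] / (1 + x[1])

# p1 < p2 with small income: sqrt(p1*p2) - p1 = 9 >= I = 5, so Case (3).
p1, p2, I = 1.0, 100.0, 5.0
x = demand(p1, p2, I)
assert x == (5.0, 0.0)

# Brute-force comparison along the budget line confirms the corner.
grid = [(x1, (I - p1 * x1) / p2) for x1 in [i * 0.001 for i in range(5001)]]
assert U(x) >= max(U(pt) for pt in grid) - 1e-9
print(x)  # (5.0, 0.0)
```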
5.4 Miscellaneous
(1) For problems where some constraints are of the form g_i(x) = 0 and others are of the form g_j(x) ≥ 0, only the latter give rise to Kuhn-Tucker-like complementary slackness conditions (λ_j ≥ 0, g_j(x) ≥ 0, λ_jg_j(x) = 0).
(2) If the objective to be maximized, f, and the constraints g_i, i = 1, ..., k (where constraints are of the form g_i(x) ≥ 0) are all concave functions, and if Slater's constraint qualification holds (i.e., there exists some x̄ ∈ R^n s.t. g_i(x̄) > 0, i = 1, ..., k), then the Kuhn-Tucker conditions become both necessary and sufficient for a global max.
(3) Suppose f and all the g_i's are quasiconcave. Then the Kuhn-Tucker conditions are almost sufficient for a global max: an x* satisfying them is a global max provided, in addition, either Df(x*) ≠ 0, or f is concave.
Appendix
Completeness Property of Real Numbers