
Chapter 3

Differentiable Nonlinear Programming

3.1 Unconstrained Problems


The first class of problems we consider in this chapter has the following form:
\[
\begin{aligned}
&\text{minimize } f(x)\\
&\text{subject to } x \in X
\end{aligned} \tag{3.1}
\]
where $f : \mathbb{R}^n \to \mathbb{R}$ and $X$ is an open nonempty subset of $\mathbb{R}^n$.


Theorem 3.1.1 Let $X$ be an open set in $\mathbb{R}^n$, and let $f : X \to \mathbb{R}$. Suppose $x_0$ is a local minimizer of $f$. If $f$ has first order partial derivatives at $x_0$, then
\[
\frac{\partial f}{\partial x_i}(x_0) = 0, \qquad i = 1, 2, \dots, n.
\]
If $f$ is of class $C^{(2)}$ on some open ball $B(x_0, \delta)$ centered at $x_0$, then the Hessian of $f$ at $x_0$,
\[
H(x_0) = \left[ \frac{\partial^2 f}{\partial x_i \, \partial x_j}(x_0) \right],
\]
is positive semidefinite.
Proof: Since $X$ is open there exists a $\delta > 0$ such that
\[
V = \{x \in \mathbb{R}^n : |x_i - x_{0i}| < \delta,\ i = 1, \dots, n\} \subset X.
\]
Here $x = (x_1, \dots, x_n)$, $x_0 = (x_{01}, \dots, x_{0n})$. Set $e_i = (0, \dots, 0, 1, 0, \dots, 0)$, the unit vector in the $x_i$ coordinate direction. Next, set $\varphi_i(t) = f(x_0 + t e_i)$, $|t| < \delta$.
Then $t = 0$ is a local minimizer for $\varphi_i(\cdot)$. Thus $\varphi_i'(0) = 0$. However, $\varphi_i'(0) = \frac{\partial f}{\partial x_i}(x_0)$.
Next, suppose $f$ is $C^{(2)}$ on $V$. Let $u$ be a unit vector in $\mathbb{R}^n$. Set
\[
\psi(t; u) = f(x_0 + t u), \qquad |t| < \delta.
\]
The function $\psi(\cdot\,; u)$ is well defined, is of class $C^{(2)}$, and has a local minimum at $t = 0$. Thus,
\[
\frac{d\psi}{dt}(t; u)\Big|_{t=0} = 0
\]
and
\[
\frac{d^2\psi}{dt^2}(t; u)\Big|_{t=0} \ge 0.
\]
However,
\[
\frac{d^2\psi}{dt^2}(t; u) = \langle u, H(x_0 + t u)\, u\rangle.
\]
Thus, setting $t = 0$,
\[
\langle u, H(x_0)\, u\rangle \ge 0.
\]
Since $u$ is an arbitrary unit vector it follows that $H(x_0)$ is positive semidefinite. □
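To make the necessary conditions concrete, here is a small numerical sketch we add (not part of the original text; numpy is assumed, and the sample function and step sizes are our own choices). It checks $\nabla f(x_0) \approx 0$ and positive semidefiniteness of the Hessian by central differences:

```python
import numpy as np

def grad(f, x, h=1e-6):
    """Central-difference gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hessian(f, x, h=1e-4):
    """Central-difference Hessian of f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H

# Sample unconstrained objective (our own example) with minimizer at (1, -2).
f = lambda x: (x[0] - 1) ** 2 + 2 * (x[1] + 2) ** 2
x0 = np.array([1.0, -2.0])

print(grad(f, x0))                                           # approximately (0, 0)
print(np.all(np.linalg.eigvalsh(hessian(f, x0)) >= -1e-8))   # PSD check: True
```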
In $\mathbb{R}^n$ the points $c$ such that $\nabla f(c) = 0$ are called critical points of $f$.
Let $A$ be an $n \times n$ matrix with entries $a_{ij}$. Let $\Delta_1 = a_{11}$ and
\[
\Delta_k = \det \begin{pmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{k1} & \cdots & a_{kk} \end{pmatrix}, \qquad k = 2, 3, 4, \dots, n.
\]
The determinants $\Delta_k$ are called the leading principal minors of $A$. The following lemma gives a criterion for positive definiteness of a symmetric matrix $A$ with entries $a_{ij}$ and is useful in dealing with unconstrained problems where the positive definiteness of the Hessian is an issue.

Lemma 3.1.1 The symmetric matrix $A$ is positive definite if and only if $\Delta_k > 0$ for $k = 1, 2, \dots, n$, and is negative definite if and only if $(-1)^k \Delta_k > 0$ for $k = 1, 2, \dots, n$.
Proof: Let $Q(x) = x^T A x$. If $Q$ is positive definite then $A$ has to be nonsingular. Otherwise there exists $x_0 \in \mathbb{R}^n$, $x_0 \ne 0$, such that $A x_0 = 0$. Then $Q(x_0) = x_0^T A x_0 = 0$ and $Q$ would not be positive definite.
If $Q$ is positive semidefinite and $A$ is nonsingular, then $Q$ must be positive definite. If $Q$ were not positive definite then there would exist $x_0 \ne 0$ such that $x \mapsto Q(x)$ attains its minimum at $x_0$. Thus $Q'(x_0) = 0$; i.e. $A x_0 = 0$, contradicting the fact that $A$ is nonsingular.
Next, we show that $Q$ positive definite implies $\det(A) > 0$. We note that
\[
\langle x, [(1-\lambda) I + \lambda A]\, x\rangle = \lambda \langle x, A x\rangle + (1-\lambda)\|x\|^2 > 0
\]
if $x \ne 0$ and $0 < \lambda < 1$. Now, set
\[
\varphi(\lambda) = \det\left[(1-\lambda) I + \lambda A\right].
\]
We note that $\varphi(0) = 1$ and $\varphi(1) = \det(A)$. Since $A$ is nonsingular when $Q$ is positive definite we have $\det(A) \ne 0$. Thus, suppose $\varphi(1) = \det A < 0$. Then, by the intermediate value property there exists $0 < \lambda^* < 1$ such that $\varphi(\lambda^*) = 0$. That is, there exists $x^* \ne 0$ such that
\[
\langle x^*, [(1-\lambda^*) I + \lambda^* A]\, x^*\rangle = 0,
\]
i.e.
\[
\lambda^* \langle x^*, A x^*\rangle + (1-\lambda^*)\|x^*\|^2 = 0,
\]
which is not possible. Thus $\det A > 0$.
Next, let $f : \mathbb{R}^n \to \mathbb{R}$ be defined by
\[
f(x) = x^T A x + 2\langle b, x\rangle + c.
\]
Then
\[
\nabla f(x) = 2 A x + 2 b = 0
\]
gives $x = -A^{-1} b$. Set $x_0 = -A^{-1} b$. Then
\[
\begin{aligned}
f(x_0) &= \langle x_0, A x_0\rangle + 2\langle b, x_0\rangle + c\\
&= \langle A^{-1} b, A A^{-1} b\rangle - 2\langle b, A^{-1} b\rangle + c\\
&= -\langle b, A^{-1} b\rangle + c\\
&= \langle b, x_0\rangle + c.
\end{aligned}
\]
Next,
\[
\det \begin{pmatrix} A & b \\ b^T & c \end{pmatrix}
= \det \begin{pmatrix} A & A x_0 + b \\ b^T & \langle b, x_0\rangle + c \end{pmatrix}
= \det \begin{pmatrix} A & 0 \\ b^T & \langle b, x_0\rangle + c \end{pmatrix}
= \det A \,\left( \langle b, x_0\rangle + c \right).
\]
Therefore
\[
\langle b, x_0\rangle + c = \frac{\det \begin{pmatrix} A & b \\ b^T & c \end{pmatrix}}{\det A}.
\]
Finally, set
\[
g(x, y) = \langle x, A x\rangle + 2 y \langle b, x\rangle + c y^2 = (x, y)^T B\, (x, y),
\]
where
\[
B = \begin{pmatrix} A & b \\ b^T & c \end{pmatrix}.
\]
Now suppose $g$ is positive definite. Then, by what was proved earlier, $\det B > 0$. Setting $y = 0$, $g(x, 0) = \langle x, A x\rangle$, showing $A$ is positive definite. Conversely, suppose $A$ is positive definite and $\det B > 0$. Then $g(x, 0) = \langle x, A x\rangle > 0$ for all $x \ne 0$. If $y \ne 0$,
\[
g(x, y) = y^2\, g\!\left(\frac{x}{y}, 1\right) = y^2 f\!\left(\frac{x}{y}\right) \ge y^2 f(x_0).
\]
Thus,
\[
g(x, y) \ge y^2\, \frac{\det \begin{pmatrix} A & b \\ b^T & c \end{pmatrix}}{\det A},
\]
that is,
\[
g(x, y) \ge y^2\, \frac{\det B}{\det A}.
\]
We know $\det A > 0$ since $A$ is positive definite, and $\det B > 0$ by assumption. Thus $g(x, y) > 0$ for $(x, y) \ne 0$; that is, $g$ is positive definite whenever $\det B > 0$ and $A$ is positive definite.
To prove the lemma, write
\[
A = \begin{pmatrix}
a_{11} & \cdots & a_{1,n-1} & a_{1n}\\
\vdots & & \vdots & \vdots\\
a_{n-1,1} & \cdots & a_{n-1,n-1} & a_{n-1,n}\\
a_{n1} & \cdots & a_{n,n-1} & a_{nn}
\end{pmatrix}
= \begin{pmatrix} A_{n-1} & b \\ b^T & a_{nn} \end{pmatrix}.
\]
Since $A$ is positive definite we immediately conclude $A_{n-1}$ is also positive definite. The proof follows by induction. In the case $A$ is negative definite we apply the argument to $-A$, which is positive definite. □
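As a quick computational illustration of Lemma 3.1.1 (a sketch we add, with numpy assumed), the following code tests definiteness via the leading principal minors $\Delta_k$:

```python
import numpy as np

def leading_minors(A):
    """Return the leading principal minors Delta_1, ..., Delta_n of A."""
    A = np.asarray(A, dtype=float)
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

def is_positive_definite(A):
    # Lemma 3.1.1: symmetric A is positive definite iff all Delta_k > 0.
    return all(d > 0 for d in leading_minors(A))

def is_negative_definite(A):
    # Lemma 3.1.1: symmetric A is negative definite iff (-1)^k Delta_k > 0.
    return all((-1) ** k * d > 0
               for k, d in enumerate(leading_minors(A), start=1))

A = np.array([[2.0, 1.0], [1.0, 3.0]])
print(is_positive_definite(A))    # True
print(is_negative_definite(-A))   # True
```

In floating point, eigenvalue tests (np.linalg.eigvalsh) are numerically more robust than determinants for large matrices; the minor test above mirrors the lemma directly.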
We state the following theorem without proof.
Theorem 3.1.2 Let $X$ be an open set in $\mathbb{R}^n$. Let $f : X \to \mathbb{R}$ be of class $C^{(2)}$ on $B(x_0; \delta)$ for some $\delta > 0$. Suppose $\nabla f(x_0) = 0$ and $H(x_0)$ is positive definite. Then $x_0$ is a strict local minimizer.
If $X$ is convex, $f$ is of class $C^{(2)}$ on $X$, and $H(x)$ is positive semidefinite for all $x$ in $X$, then $x_0$ is a minimizer for $f$. If $H(x)$ is positive definite for all $x$ in $X$, then $x_0$ is a strict minimizer.
Example 3.1.1
Let $f(x, y) = x^2 + y^3$ and $g(x, y) = x^2 + y^2$. Then $f(0,0) = g(0,0) = 0$ and $\nabla f(0,0) = \nabla g(0,0) = 0$. Both $f$ and $g$ have positive semidefinite Hessians at $(0, 0)$. However, $(0, 0)$ is a local minimizer for $g$ but not for $f$.
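A numerical check of the example (a sketch we add, assuming numpy): both Hessians at the origin are positive semidefinite, yet $f$ decreases along the negative $y$-axis, so semidefiniteness alone is not sufficient.

```python
import numpy as np

# Hessians of f(x, y) = x**2 + y**3 and g(x, y) = x**2 + y**2 at (0, 0).
Hf = np.array([[2.0, 0.0], [0.0, 0.0]])   # d2f = [[2, 0], [0, 6y]] at y = 0
Hg = np.array([[2.0, 0.0], [0.0, 2.0]])

print(np.linalg.eigvalsh(Hf))  # [0, 2]: positive semidefinite
print(np.linalg.eigvalsh(Hg))  # [2, 2]: positive definite

# Yet (0, 0) is not a local minimizer of f: f decreases along (0, -t).
f = lambda x, y: x ** 2 + y ** 3
print([f(0.0, -t) for t in (0.1, 0.01)])  # negative values arbitrarily close to 0
```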

3.2 Constrained Problems

In this section we derive first order necessary conditions for differentiable nonlinear programming problems (i.e. the objective function and the constraint functions are nonlinear). The problem we consider has the form
\[
\begin{aligned}
&\min f(x)\\
&\text{subject to}\\
&\quad g_1(x) \le 0\\
&\qquad\vdots\\
&\quad g_m(x) \le 0\\
&\quad h_1(x) = 0 \tag{P}\\
&\qquad\vdots\\
&\quad h_k(x) = 0\\
&\quad x \in X_0 \subset \mathbb{R}^n
\end{aligned}
\]
In this problem $f : X_0 \subset \mathbb{R}^n \to \mathbb{R}$, $g_i : X_0 \to \mathbb{R}$, $i = 1, \dots, m$, and $h_i : X_0 \to \mathbb{R}$, $i = 1, \dots, k$. Further, $X_0$ is a nonempty open subset of $\mathbb{R}^n$. The functions $f, g_1, \dots, g_m; h_1, \dots, h_k$ are $C^{(1)}$ functions on $X_0$.

Definition 3.2.1 The set $S$ defined by
\[
S = \{x \in X_0 : g_i(x) \le 0,\ i = 1, \dots, m;\ h_j(x) = 0,\ j = 1, \dots, k\}
\]
is called the feasible or admissible set for (P). Each point of the feasible set $S$ is said to be a feasible point or an admissible point.
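In computations the feasible set is usually represented by callables rather than symbolically; here is a minimal membership check we add for illustration (the tolerance `tol` for the equality constraints and the sample data are our own choices):

```python
import numpy as np

def is_feasible(x, gs, hs, tol=1e-8):
    """Check x against inequality constraints gs (g_i(x) <= 0)
    and equality constraints hs (h_j(x) = 0, up to tol)."""
    return (all(g(x) <= tol for g in gs)
            and all(abs(h(x)) <= tol for h in hs))

# Illustrative data: g(x) = x1^2 + x2^2 - 2 <= 0, h(x) = x1 + x2 - 1 = 0.
gs = [lambda x: x[0] ** 2 + x[1] ** 2 - 2]
hs = [lambda x: x[0] + x[1] - 1]
print(is_feasible(np.array([0.5, 0.5]), gs, hs))   # True
print(is_feasible(np.array([2.0, -1.0]), gs, hs))  # False: g > 0
```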

Theorem 3.2.1 Let $x^*$ be a solution of problem (P). Then there exist a real number $\lambda_0 \ge 0$, a vector $\lambda \ge 0$ in $\mathbb{R}^m$, and a vector $\mu$ in $\mathbb{R}^k$ such that

(i) $(\lambda_0, \lambda, \mu) \ne 0$; $\lambda = (\lambda_1, \dots, \lambda_m)$, $\mu = (\mu_1, \dots, \mu_k)$

(ii) $\langle \lambda, g(x^*)\rangle = 0$; i.e. $\lambda_i\, g_i(x^*) = 0$, $i = 1, \dots, m$

(iii) $\lambda_0 \nabla f(x^*) + \sum_{i=1}^m \lambda_i \nabla g_i(x^*) + \sum_{i=1}^k \mu_i \nabla h_i(x^*) = 0$

Remark 3.2.1 Theorem 3.2.1 with an additional hypothesis guaranteeing $\lambda_0 > 0$ was proved by Kuhn and Tucker (1951). The theorem with $\lambda_0 \ge 0$ had been proved by John (1948). Theorem 3.2.1 was also proved by Karush (1939) under conditions guaranteeing $\lambda_0 > 0$. Theorem 3.2.1 with an additional hypothesis guaranteeing that $\lambda_0 > 0$ is referred to as the Karush-Kuhn-Tucker Theorem. Theorem 3.2.1 as stated above is referred to as the Fritz John Theorem, and the multipliers $\lambda_0, \lambda, \mu$ are called Fritz John multipliers.

Below we give the proof of Theorem 3.2.1 based on McShane's (1973) penalty method.
Proof: (Proof of Theorem 3.2.1)
We may assume that $x^* = 0$ and that $f(x^*) = 0$. Let
\[
E = \{i : g_i(0) = 0\} \tag{3.2}
\]
\[
I = \{i : g_i(0) < 0\} \tag{3.3}
\]
To simplify the notation we assume that $E = \{1, 2, \dots, r\}$ and $I = \{r+1, \dots, m\}$, $0 \le r \le m$.
Let $\varphi$ be a $C^{(1)}$ function from $\mathbb{R}$ to $[0, \infty)$ such that (i) $\varphi$ is strictly increasing on $(0, \infty)$, (ii) $\varphi(t) = 0$ for $t \le 0$, (iii) $\varphi(t) > 0$ for $t > 0$. We note that $\varphi'(t) > 0$ for $t > 0$.
Since $g_1, \dots, g_m$ are continuous on the open set $X_0$, there exists an $\varepsilon_0 > 0$ such that $B(0; \varepsilon_0) \subset X_0$ and $g_i(x) < 0$ for $x \in B(0; \varepsilon_0)$ and $i \in I$.
Define a penalty function $F$ as follows:
\[
F(x, K) = f(x) + \|x\|^2 + K \left[ \sum_{i=1}^m \varphi(g_i(x)) + \sum_{i=1}^k (h_i(x))^2 \right] \tag{3.4}
\]
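The structure of (3.4) also suggests a simple computational scheme: minimize $F(\cdot, K)$ for increasing $K$. The sketch below is our own illustration, not part of the proof; $\varphi(t) = \max(t, 0)^2$ is one admissible $C^{(1)}$ choice, scipy is assumed, and the problem data are ours.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative problem: min x1 + x2  s.t.  g(x) = x1^2 + x2^2 - 1 <= 0,
# with solution x* = (-1/sqrt(2), -1/sqrt(2)).
f = lambda x: x[0] + x[1]
g = lambda x: x[0] ** 2 + x[1] ** 2 - 1
phi = lambda t: np.maximum(t, 0.0) ** 2   # C^1, zero for t <= 0, increasing

xstar = np.array([-1.0, -1.0]) / np.sqrt(2)

def F(x, K):
    # McShane-type penalty; the proof centers the quadratic term at x* = 0,
    # here we write ||x - x*||^2 since our x* is not the origin.
    return f(x) + np.dot(x - xstar, x - xstar) + K * phi(g(x))

for K in (1.0, 10.0, 100.0, 1000.0):
    res = minimize(lambda x: F(x, K), x0=np.zeros(2))
    print(K, res.x)   # penalized minimizers approach x* as K grows
```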

Now we assert that for each $0 < \varepsilon < \varepsilon_0$ there exists a positive integer $K(\varepsilon)$ such that for any $x$ with $\|x\| = \varepsilon$ we have $F(x, K(\varepsilon)) > 0$. If this assertion were false then there would exist $\varepsilon$, $0 < \varepsilon < \varepsilon_0$, such that for each positive integer $K$ there exists a point $x_K$ with $\|x_K\| = \varepsilon$ and $F(x_K, K) \le 0$. Thus,
\[
f(x_K) + \|x_K\|^2 \le -K \left[ \sum_{i=1}^m \varphi(g_i(x_K)) + \sum_{i=1}^k (h_i(x_K))^2 \right]. \tag{3.5}
\]
There exist positive integers $K_1 < K_2 < \cdots < K_n < \cdots$ and $\bar{x}$ with $\|\bar{x}\| = \varepsilon$ such that $x_{K_n} \to \bar{x}$. Since $f, g_1, \dots, g_m, h_1, \dots, h_k$ are all continuous it follows from (3.5) that
\[
\sum_{i=1}^m \varphi(g_i(\bar{x})) + \sum_{i=1}^k (h_i(\bar{x}))^2 = 0. \tag{3.6}
\]
Thus,
\[
g_i(\bar{x}) \le 0, \quad 1 \le i \le m \tag{3.7}
\]
\[
h_i(\bar{x}) = 0, \quad 1 \le i \le k. \tag{3.8}
\]
From (3.5) we also have
\[
f(\bar{x}) + \varepsilon^2 \le 0. \tag{3.9}
\]
From (3.7)–(3.9) we see that $\bar{x}$ is feasible with $f(\bar{x}) \le -\varepsilon^2 < 0 = f(0)$, a contradiction. Thus, for each $0 < \varepsilon < \varepsilon_0$ there exists a positive integer $K_\varepsilon$ such that $F(x, K_\varepsilon) > 0$ on $\{x : \|x\| = \varepsilon\}$.
Now consider the problem of minimizing the function $F(\cdot, K_\varepsilon)$ on the ball $\{x : \|x\| \le \varepsilon\}$. Since $F(0, K_\varepsilon) = 0$ while $F(\cdot, K_\varepsilon) > 0$ on the boundary, $F(\cdot, K_\varepsilon)$ attains its minimum at an interior point $x(\varepsilon)$, $\|x(\varepsilon)\| < \varepsilon$. Thus,

\[
\nabla_x F(x, K_\varepsilon)\big|_{x = x(\varepsilon)} = 0. \tag{3.10}
\]
That is, for $j = 1, \dots, n$,
\[
\frac{\partial f}{\partial x_j}(x(\varepsilon)) + 2 x_j(\varepsilon) + K_\varepsilon \sum_{i=1}^m \varphi'(g_i(x(\varepsilon)))\, \frac{\partial g_i}{\partial x_j}(x(\varepsilon)) + 2 K_\varepsilon \sum_{i=1}^k h_i(x(\varepsilon))\, \frac{\partial h_i}{\partial x_j}(x(\varepsilon)) = 0. \tag{3.11}
\]
Define
\[
M(\varepsilon) = \left\{ 1 + \sum_{i=1}^m \left[ K_\varepsilon \varphi'(g_i(x(\varepsilon))) \right]^2 + \sum_{i=1}^k \left[ 2 K_\varepsilon h_i(x(\varepsilon)) \right]^2 \right\}^{1/2}. \tag{3.12}
\]
Set
\[
\lambda_0(\varepsilon) = \frac{1}{M(\varepsilon)},
\]
\[
\lambda_i(\varepsilon) = \frac{K_\varepsilon \varphi'(g_i(x(\varepsilon)))}{M(\varepsilon)} \ge 0, \quad i = 1, \dots, m, \tag{3.13}
\]
\[
\mu_i(\varepsilon) = \frac{2 K_\varepsilon h_i(x(\varepsilon))}{M(\varepsilon)}, \quad i = 1, \dots, k. \tag{3.14}
\]
Since $0 < \varepsilon < \varepsilon_0$ and $\|x(\varepsilon)\| < \varepsilon < \varepsilon_0$, we have $g_i(x(\varepsilon)) < 0$ for $i \in I$, and hence
\[
\lambda_i(\varepsilon) = 0, \quad i = r+1, \dots, m. \tag{3.15}
\]
Let
\[
\lambda_0 = \frac{1}{\lim_{\varepsilon \to 0^+} M(\varepsilon)}, \tag{3.16}
\]
the limit being taken along an appropriate subsequence $\varepsilon_1 > \varepsilon_2 > \cdots > \varepsilon_n > \cdots$ converging to zero, chosen (as is possible, since the quantities below are bounded) so that
\[
\lambda_0(\varepsilon_j) \to \lambda_0 \ge 0 \tag{3.17}
\]
\[
\lambda_i(\varepsilon_j) \to \lambda_i \ge 0, \quad 1 \le i \le m \tag{3.18}
\]
\[
\mu_i(\varepsilon_j) \to \mu_i, \quad 1 \le i \le k.
\]

Then, dividing (3.11) by $M(\varepsilon_j)$ and letting $j \to \infty$ (note $x(\varepsilon_j) \to 0$), we obtain
\[
\lambda_0 \frac{\partial f}{\partial x_j}(0) + \sum_{i=1}^m \lambda_i \frac{\partial g_i}{\partial x_j}(0) + \sum_{i=1}^k \mu_i \frac{\partial h_i}{\partial x_j}(0) = 0, \quad j = 1, \dots, n. \tag{3.19}
\]
Since
\[
\lambda_0^2(\varepsilon_j) + \sum_{i=1}^m \lambda_i^2(\varepsilon_j) + \sum_{i=1}^k \mu_i^2(\varepsilon_j) = 1,
\]
we have
\[
\lambda_0 + \sum_{i=1}^m |\lambda_i| + \sum_{i=1}^k |\mu_i| \ne 0. \tag{3.20}
\]
From (3.15) we have
\[
\lambda_i = 0, \quad i = r+1, \dots, m. \tag{3.21}
\]
Thus,
\[
\lambda_i\, g_i(0) = 0, \quad i = 1, \dots, m. \tag{3.22}
\]
From (3.17)–(3.22) we see that the proof is complete. □

Definition 3.2.2 In problem (P) the functions $g_1, \dots, g_m; h_1, \dots, h_k$ satisfy the constraint qualification at a feasible point $x_0$ if

(i) the vectors $\nabla h_1(x_0), \dots, \nabla h_k(x_0)$ are linearly independent, and

(ii) the system
\[
\nabla g_i(x_0)^T z < 0, \quad i \in E = \{\ell : g_\ell(x_0) = 0\},
\]
\[
\nabla h_i(x_0)^T z = 0, \quad i = 1, \dots, k,
\]
has a solution $z$ in $\mathbb{R}^n$.

If the equality constraints are absent, then the statements involving $h_i(x_0)$, $1 \le i \le k$, are to be deleted.

Lemma 3.2.1 Let $x^*$ be a solution of problem (P). Let the constraint qualification hold at $x^*$. Then $\lambda_0$ in Theorem 3.2.1 is strictly positive, that is, $\lambda_0 > 0$.

Proof: Suppose $\lambda_0 = 0$. Then, from (iii) of Theorem 3.2.1,
\[
\sum_{i=1}^m \lambda_i \nabla g_i(x^*) + \sum_{i=1}^k \mu_i \nabla h_i(x^*) = 0. \tag{3.23}
\]
If $\lambda_1 = \lambda_2 = \cdots = \lambda_m = 0$, then from (3.23) and from the fact that $\nabla h_1(x^*), \dots, \nabla h_k(x^*)$ are linearly independent, $\mu_i = 0$, $i = 1, \dots, k$. However, from (i) of Theorem 3.2.1, $(\lambda_0, \lambda, \mu) \ne 0$. Thus we cannot have $\lambda_1, \dots, \lambda_m$ all zero. Since by (ii) of Theorem 3.2.1 $\lambda_i = 0$ for all $i \notin E = \{\ell : g_\ell(x^*) = 0\}$, we have
\[
\sum_{i \in E} \lambda_i \nabla g_i(x^*) + \sum_{i=1}^k \mu_i \nabla h_i(x^*) = 0. \tag{3.24}
\]
For any vector $z \in \mathbb{R}^n$ as in (ii) of Definition 3.2.2, we obtain from (3.24)
\[
\sum_{i \in E} \lambda_i\, \nabla g_i(x^*)^T z = 0. \tag{3.25}
\]
Since
\[
\lambda_i \ge 0 \text{ for } i \in E, \quad \lambda_i \ne 0 \text{ for some } i \in E, \quad \text{and} \quad \nabla g_i(x^*)^T z < 0,\ i \in E,
\]
we have a contradiction. Thus $\lambda_0 > 0$. □

Theorem 3.2.2 Let $x^*$ be a solution of problem (P). Let the constraint qualification hold at $x^*$. Then there exist scalars $\lambda_1 \ge 0, \lambda_2 \ge 0, \dots, \lambda_m \ge 0$ and $\mu_1, \mu_2, \dots, \mu_k$ such that

(i) $\lambda_i\, g_i(x^*) = 0$, $i = 1, \dots, m$

(ii) $\nabla f(x^*) + \sum_{i=1}^m \lambda_i \nabla g_i(x^*) + \sum_{i=1}^k \mu_i \nabla h_i(x^*) = 0$

Proof: By Lemma 3.2.1, $\lambda_0 > 0$ in Theorem 3.2.1. Dividing $(\lambda_0, \lambda, \mu)$ of Theorem 3.2.1 by $\lambda_0$ we obtain the KKT conditions. □

In what follows, let
\[
E = \{\ell : g_\ell(x^*) = 0,\ 1 \le \ell \le m\} \tag{3.26}
\]
\[
I = \{\ell : g_\ell(x^*) < 0,\ 1 \le \ell \le m\} \tag{3.27}
\]
The necessary conditions (i) and (ii) are referred to as the Karush-Kuhn-Tucker (KKT) conditions, and the multipliers $\lambda_1, \lambda_2, \dots, \lambda_m, \mu_1, \mu_2, \dots, \mu_k$ are referred to as the Karush-Kuhn-Tucker multipliers.
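As a computational companion to Theorem 3.2.2 (a sketch we add; the problem data and the hand-computed multipliers are illustrative), one can verify the KKT conditions at a candidate point by checking stationarity, feasibility, dual feasibility, and complementary slackness:

```python
import numpy as np

# Illustrative problem: min (x1 - 2)^2 + (x2 - 1)^2
# s.t. g(x) = x1 + x2 - 2 <= 0 and h(x) = x1 - x2 = 0.
# Candidate KKT point worked out by hand: x* = (1, 1), lambda = 1, mu = 1.

def kkt_residuals(x, lam, mu, tol=1e-10):
    grad_f = np.array([2 * (x[0] - 2), 2 * (x[1] - 1)])
    grad_g = np.array([1.0, 1.0])
    grad_h = np.array([1.0, -1.0])
    g = x[0] + x[1] - 2
    h = x[0] - x[1]
    return {
        "stationarity": grad_f + lam * grad_g + mu * grad_h,  # should be 0
        "primal_feasible": (g <= tol) and (abs(h) <= tol),
        "dual_feasible": lam >= 0,
        "complementarity": abs(lam * g) <= tol,
    }

print(kkt_residuals(np.array([1.0, 1.0]), lam=1.0, mu=1.0))
```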

Theorem 3.2.3 Let $x^*$ be a solution of problem (P). Suppose the vectors in the set
\[
\{\nabla h_i(x^*) : i = 1, \dots, k\} \cup \{\nabla g_i(x^*) : i \in E\}
\]
are linearly independent. Then the conclusions of Theorem 3.2.2 hold.

Proof: It suffices to prove that $\lambda_0$ in Theorem 3.2.1 is strictly positive in this case. In fact, if $\lambda_0 = 0$, then (ii) and (iii) of Theorem 3.2.1 give
\[
\sum_{i \in E} \lambda_i \nabla g_i(x^*) + \sum_{i=1}^k \mu_i \nabla h_i(x^*) = 0. \tag{3.28}
\]
By the linear independence hypothesis of the theorem, equation (3.28) gives $\mu_1 = \cdots = \mu_k = 0$ and $\lambda_i = 0$, $i \in E$. Then we contradict (i) of Theorem 3.2.1 and the proof is complete. □

The Karush-Kuhn-Tucker necessary conditions are also sufficient for problem (P) when $f, g_1, \dots, g_m$ are convex and the equality constraints are absent ($h_1 = \cdots = h_k = 0$).
Theorem 3.2.4 Let $f, g_1, \dots, g_m$ be $C^{(1)}$ functions that are convex and defined on the nonempty convex set $X_0 \subset \mathbb{R}^n$. Consider the problem
\[
\begin{aligned}
&\min f(x)\\
&\text{subject to}\\
&\quad g_1(x) \le 0\\
&\qquad\vdots\\
&\quad g_m(x) \le 0 \tag{P_c}\\
&\quad x \in X_0 \subset \mathbb{R}^n
\end{aligned}
\]
Let $x^* \in X_0$ and $\lambda_1, \lambda_2, \dots, \lambda_m$ be scalars such that

(i) $g_i(x^*) \le 0$, $i = 1, \dots, m$

(ii) $\lambda_i \ge 0$, $i = 1, \dots, m$

(iii) $\sum_{i=1}^m \lambda_i\, g_i(x^*) = 0$

(iv) $\nabla f(x^*) + \sum_{i=1}^m \lambda_i \nabla g_i(x^*) = 0$

Then $x^*$ is a solution to problem $(P_c)$.

Proof: Let
\[
X_i = \{x \in X_0 : g_i(x) \le 0\}, \quad i = 1, \dots, m.
\]
Then $X_i$ is convex for $i = 1, \dots, m$. Set
\[
X = \bigcap_{i=1}^m X_i.
\]

Then $X$ is convex and nonempty. From (i), $x^* \in X$. Since $f$ is convex,
\[
f(x) - f(x^*) \ge \langle \nabla f(x^*),\, x - x^*\rangle, \qquad x \in X.
\]
Now, by (iv),
\[
f(x) - f(x^*) \ge \left\langle -\sum_{i=1}^m \lambda_i \nabla g_i(x^*),\, x - x^*\right\rangle = -\sum_{i=1}^m \lambda_i \langle \nabla g_i(x^*),\, x - x^*\rangle. \tag{3.29}
\]
Since $g_i$ is convex for $i = 1, \dots, m$ we have
\[
g_i(x) - g_i(x^*) \ge \langle \nabla g_i(x^*),\, x - x^*\rangle.
\]
Thus, from (3.29) we have, for $x \in X$,
\[
\begin{aligned}
f(x) - f(x^*) &\ge -\sum_{i=1}^m \lambda_i \left( g_i(x) - g_i(x^*) \right)\\
&= \sum_{i=1}^m \lambda_i\, g_i(x^*) - \sum_{i=1}^m \lambda_i\, g_i(x)\\
&= 0 - \sum_{i=1}^m \lambda_i\, g_i(x)\\
&= \sum_{i=1}^m (-\lambda_i)\, g_i(x)\\
&\ge 0.
\end{aligned}
\]
Thus $f(x) \ge f(x^*)$ for all $x \in X$. □
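The theorem says that for convex data the KKT conditions certify a global minimizer. A brief numerical sanity check we add (with illustrative data of our own) compares a hand-verified KKT point against random feasible points:

```python
import numpy as np

# Convex problem: min x1^2 + x2^2  s.t.  g(x) = 1 - x1 - x2 <= 0.
# KKT point worked out by hand: x* = (0.5, 0.5) with lambda = 1:
#   grad f(x*) + lambda * grad g(x*) = (1, 1) + 1 * (-1, -1) = 0,
#   g(x*) = 0, lambda >= 0, lambda * g(x*) = 0.
f = lambda x: x[0] ** 2 + x[1] ** 2
xstar = np.array([0.5, 0.5])

rng = np.random.default_rng(0)
worst = np.inf
for _ in range(10000):
    x = rng.uniform(-3, 3, size=2)
    if 1 - x[0] - x[1] <= 0:          # keep only feasible samples
        worst = min(worst, f(x))
print(f(xstar) <= worst + 1e-12)      # True: no sampled feasible point does better
```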
Corollary 3.2.1 Let $f, g_1, \dots, g_m$ be $C^{(1)}$ convex functions defined on the nonempty convex set $X_0 \subset \mathbb{R}^n$. Let $h_i$, $i = 1, \dots, \ell$, be affine maps from $\mathbb{R}^n$ to $\mathbb{R}$. Consider the problem
\[
\begin{aligned}
&\min f(x)\\
&\text{subject to}\\
&\quad g_i(x) \le 0, \quad i = 1, \dots, m\\
&\quad h_i(x) = 0, \quad i = 1, \dots, \ell \tag{P_c}\\
&\quad x \in X_0 \subset \mathbb{R}^n
\end{aligned}
\]
Let $x^* \in X_0$ and $\lambda_1, \lambda_2, \dots, \lambda_m$, $\mu_1, \dots, \mu_\ell$ be scalars such that

(i) $g_i(x^*) \le 0$, $i = 1, \dots, m$, and $h_i(x^*) = 0$, $i = 1, \dots, \ell$

(ii) $\lambda_i \ge 0$, $i = 1, \dots, m$

(iii) $\sum_{i=1}^m \lambda_i\, g_i(x^*) = 0$

(iv) $\nabla f(x^*) + \sum_{i=1}^m \lambda_i \nabla g_i(x^*) + \sum_{i=1}^\ell \mu_i \nabla h_i(x^*) = 0$

Then $x^*$ is a solution to problem $(P_c)$.

3.3 Second Order Conditions

In what follows we refer to problem (P) of Section 3.2 for our discussion.

Definition 3.3.1 Let $x_0$ be a feasible point for (P). The vector $z \in \mathbb{R}^n$, $z \ne 0$, is said to satisfy the tangential constraints at $x_0$ if
\[
\nabla g_E(x_0)^T z \le 0, \qquad \nabla h(x_0)^T z = 0, \tag{3.30}
\]
where
\[
E = \{i : g_i(x_0) = 0\}.
\]

Notation 3.3.1 In (3.30), $\nabla g_E(x_0)^T z \le 0$ means that for every $i$ such that $g_i(x_0) = 0$ we have $\nabla g_i(x_0)^T z \le 0$. Next, $\nabla h(x_0)^T z = 0$ means $\nabla h_i(x_0)^T z = 0$ for $i = 1, \dots, k$.

For definiteness, in (3.30), we shall henceforth suppose that $E = \{1, \dots, r\}$. Let $I = \{r+1, \dots, m\}$. Since either $E$ or $I$ can be empty, $0 \le r \le m$.
Let $z$ be a vector that satisfies the tangential constraints. Let
\[
E_1 = \{i : i \in E,\ \nabla g_i(x_0)^T z = 0\},
\]
\[
E_2 = \{i : i \in E,\ \nabla g_i(x_0)^T z < 0\}.
\]
Then,
\[
E = E_1 \cup E_2,
\]
\[
\nabla g_{E_1}(x_0)^T z = 0, \qquad \nabla h(x_0)^T z = 0, \tag{3.31}
\]
and
\[
\nabla g_{E_2}(x_0)^T z < 0. \tag{3.32}
\]
For the sake of definiteness suppose $E_1 = \{1, \dots, q\}$. Since $E_1$ may be empty, $0 \le q \le r$. If $q = 0$, statements involving $E_1$ in (3.31) and in the ensuing discussion should be deleted. We should keep in mind that the sets $E_1$ and $E_2$ depend on $x_0$ and $z$.
Let $z$ be a given vector that satisfies the tangential constraints. The next lemma presents a condition under which one can construct a curve $\alpha(\cdot)$ emanating from $x_0$ and going into the feasible set in the direction given by $z$.
Lemma 3.3.1 Let $g_1, \dots, g_m, h_1, \dots, h_k$ and $X_0$ be as in problem (P). Assume $g_1, \dots, g_m, h_1, \dots, h_k$ are of class $C^{(p)}$, $p \ge 1$. Let $x_0$ be feasible and let $z \in \mathbb{R}^n$ satisfy the tangential constraints at $x_0$. Suppose that the vectors
\[
\nabla g_1(x_0), \dots, \nabla g_q(x_0), \nabla h_1(x_0), \dots, \nabla h_k(x_0) \tag{3.33}
\]
are linearly independent. Then there exist $\delta > 0$ and a $C^{(p)}$ mapping $\alpha(\cdot)$ from $(-\delta, \delta)$ to $\mathbb{R}^n$ such that
\[
\alpha(0) = x_0, \qquad \alpha'(0) = z,
\]
\[
g_{E_1}(\alpha(t)) = 0 \text{ and } h(\alpha(t)) = 0 \text{ for } |t| < \delta; \qquad g_I(\alpha(t)) < 0 \text{ and } g_{E_2}(\alpha(t)) < 0 \text{ for } 0 < t < \delta.
\]

Proof: Let $\ell = q + k$. In view of (3.33), $\ell \le n$. If $\ell = n$, then the only solution of (3.31) is $z = 0$. Hence $E_2 = \emptyset$ and $E_1 = E$. The mapping $\alpha(\cdot)$ is then the constant mapping $\alpha(t) = x_0$.
We now suppose that $\ell < n$. Let $\Gamma$ denote the $\ell \times n$ matrix whose rows are the vectors in (3.33). Then
\[
\Gamma = \begin{pmatrix} \nabla g_{E_1}(x_0) \\ \nabla h(x_0) \end{pmatrix}
\]
and $\Gamma$ has rank $\ell$. By relabeling the coordinates corresponding to linearly independent columns of $\Gamma$, we may suppose without any loss of generality that the first $\ell$ columns of the matrix $\Gamma$ are linearly independent. Thus, the matrix
\[
\begin{pmatrix} \dfrac{\partial g_i}{\partial x_j}(x_0) \\[6pt] \dfrac{\partial h_s}{\partial x_j}(x_0) \end{pmatrix}, \qquad i = 1, \dots, q,\ s = 1, \dots, k,\ j = 1, \dots, \ell,
\]
is nonsingular.
The system of equations
\[
\begin{aligned}
g_{E_1}(x) &= 0\\
h(x) &= 0\\
x_{\ell+1} - x_{0,\ell+1} - t\, z_{\ell+1} &= 0 \tag{3.34}\\
&\ \ \vdots\\
x_n - x_{0,n} - t\, z_n &= 0
\end{aligned}
\]
where $z_{\ell+1}, \dots, z_n$ are the last $n - \ell$ components of the vector $z$, has a solution $(x, t) = (x_0, 0)$. Now, using the implicit function theorem we can assert that there exist a $\delta > 0$ and a $C^{(p)}$ function $\alpha(\cdot)$ defined on $(-\delta, \delta)$ with range in $\mathbb{R}^n$ such that
\[
\begin{aligned}
\alpha(0) &= x_0,\\
g_{E_1}(\alpha(t)) &= 0, \quad h(\alpha(t)) = 0, \tag{3.35}\\
\alpha_i(t) &= x_{0,i} + t\, z_i, \quad i = \ell+1, \dots, n.
\end{aligned}
\]
Differentiating (3.35) with respect to $t$ we have
\[
\begin{pmatrix} \Gamma \\ O_{n-\ell,\,\ell} \ \ I_{n-\ell} \end{pmatrix} \alpha'(0) = \begin{pmatrix} 0 \\ \bar{z} \end{pmatrix},
\]
where $\bar{z} = (z_{\ell+1}, \dots, z_n)$. The system
\[
\begin{pmatrix} \Gamma \\ O_{n-\ell,\,\ell} \ \ I_{n-\ell} \end{pmatrix} w = \begin{pmatrix} 0 \\ \bar{z} \end{pmatrix}
\]
has a unique solution. Since $\alpha'(0)$ and $z$ are both solutions, we get $\alpha'(0) = z$.
Since $g_I(x_0) < 0$ and $\alpha(0) = x_0$, from the continuity of $g$ and $\alpha$ it follows that for $\delta$ sufficiently small we have $g_I(\alpha(t)) < 0$ for $|t| \le \delta$.
Since
\[
\frac{d\, g_{E_2}(\alpha(t))}{dt} = \nabla g_{E_2}(\alpha(t))\, \alpha'(t),
\]
and since $\alpha(0) = x_0$ and $\alpha'(0) = z$, it follows from (3.32) that $\frac{d\, g_{E_2}(\alpha(t))}{dt}\big|_{t=0} < 0$. It then follows from continuity that, by taking $\delta$ to be smaller if necessary, there is an interval $[0, \delta)$ on which all of the previous conclusions hold and on which $\frac{d\, g_{E_2}(\alpha(t))}{dt} < 0$. Since $g_{E_2}(\alpha(0)) = g_{E_2}(x_0) = 0$, we have $g_{E_2}(\alpha(t)) < 0$ on $(0, \delta)$. □

Corollary 3.3.1 Let the constraint qualification hold at $x_0$ in problem (P). Then for every vector $z$ satisfying (ii) of Definition 3.2.2 there exists a $C^{(1)}$ function $\alpha(\cdot)$ such that $\alpha(0) = x_0$, $\alpha'(0) = z$, and $\alpha(t)$ is feasible for all $t$ in some interval $[0, \delta)$. Moreover, $g(\alpha(t)) < 0$ on $(0, \delta)$.

Proof: A vector $z$ satisfying (ii) of Definition 3.2.2 satisfies the tangential constraints, and the set of indices $E_1$ corresponding to $z$ is empty. □

Definition 3.3.2 Let $(x^*, \lambda, \mu)$ be as in Theorem 3.2.2. Let $z$ be a vector that satisfies the tangential constraints (3.30) at $x^*$. Let
\[
E_1 = E_1(z) = \{i : i \in E,\ \nabla g_i(x^*)^T z = 0\},
\]
\[
E_2 = E_2(z) = \{i : i \in E,\ \nabla g_i(x^*)^T z < 0\}.
\]
The vector $z$ will be called a second order test vector if

(i) $\lambda_i = 0$ for $i \in E_2$, and

(ii) the rows of the matrix
\[
\begin{pmatrix} \nabla g_{E_1}(x^*) \\ \nabla h(x^*) \end{pmatrix}
\]
are linearly independent.

Lemma 3.3.2 Let $\theta$ be of class $C^{(2)}$ on $(-\delta, \delta)$ such that $\theta'(0) = 0$ and $\theta(t) \ge \theta(0)$ for all $t$ in $[0, \delta)$. Then $\theta''(0) \ge 0$.

Proof: By Taylor's Theorem, for $0 < t < \delta$,
\[
\theta(t) - \theta(0) = \frac{1}{2}\, \theta''(\tau t)\, t^2,
\]
where $0 < \tau < 1$. If $\theta''(0)$ were negative, then by continuity there would exist an interval $[0, \delta')$ on which $\theta''$ would be negative. Hence $\theta(t) < \theta(0)$ for $0 < t < \delta'$, contradicting the hypothesis. □

Theorem 3.3.1 Let $f$, $g = (g_1, \dots, g_m)$, $h = (h_1, \dots, h_k)$ be of class $C^{(2)}$ on $X_0$. Let $x^*$ be a solution to problem (P). Let $\lambda, \mu$ be KKT multipliers as in Theorem 3.2.2 and let $F(\cdot, \lambda, \mu)$ be defined by
\[
F(x, \lambda, \mu) = f(x) + \langle \lambda, g(x)\rangle + \langle \mu, h(x)\rangle. \tag{3.36}
\]
Then, for every second-order test vector $z$,
\[
\langle z, F_{xx}(x^*, \lambda, \mu)\, z\rangle \ge 0, \tag{3.37}
\]
where
\[
F_{xx} = \left[ \frac{\partial^2 F}{\partial x_j\, \partial x_i} \right], \qquad i = 1, \dots, n;\ j = 1, \dots, n.
\]
Proof: Let $z$ be a second-order test vector. Since $\lambda_i = 0$ for $i \in E_2$ and $\lambda_i = 0$ for $i \in I$, i.e.
\[
\lambda = (\lambda_E, \lambda_I) = (\lambda_{E_1}, \lambda_{E_2}, \lambda_I) = (\lambda_{E_1}, 0, 0),
\]
we have
\[
F(x, \lambda, \mu) = f(x) + \langle \lambda_{E_1}, g_{E_1}(x)\rangle + \langle \mu, h(x)\rangle.
\]
Let $\alpha(\cdot)$ be the function corresponding to $z$ as in Lemma 3.3.1, with $x_0 = x^*$. Since $g_{E_1}(\alpha(t)) = 0$ and $h(\alpha(t)) = 0$ for $|t| < \delta$, we have
\[
f(\alpha(t)) = F(\alpha(t), \lambda, \mu) \quad \text{for } |t| < \delta.
\]
Since for $0 \le t < \delta$ all points $\alpha(t)$ are feasible, and since $\alpha(0) = x^*$, the mapping $t \mapsto f(\alpha(t))$, $0 \le t < \delta$, has a minimum at $t = 0$. Hence, so does the mapping $\theta(\cdot)$ defined on $[0, \delta)$ by
\[
\theta(t) = F(\alpha(t), \lambda, \mu).
\]
Now,
\[
\theta'(t) = \langle \nabla_x F(\alpha(t), \lambda, \mu),\, \alpha'(t)\rangle,
\]
\[
\theta''(t) = \langle \alpha'(t),\, F_{xx}(\alpha(t), \lambda, \mu)\, \alpha'(t)\rangle + \langle \nabla_x F(\alpha(t), \lambda, \mu),\, \alpha''(t)\rangle.
\]
Setting $t = 0$ in the first equation and using $\alpha(0) = x^*$ and $\nabla_x F(x^*, \lambda, \mu) = 0$, we have $\theta'(0) = 0$. Now, by Lemma 3.3.2 we have $\theta''(0) \ge 0$. Setting $t = 0$ in $\theta''(t)$ and using $\alpha(0) = x^*$, $\nabla_x F(x^*, \lambda, \mu) = 0$, and $\alpha'(0) = z$ gives the conclusion of the theorem. □
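Computationally, condition (3.37) on the subspace $\{z : \nabla g_{E_1}(x^*)^T z = 0,\ \nabla h(x^*)^T z = 0\}$ can be checked by restricting $F_{xx}$ to a basis of the null space of the active Jacobian. A sketch we add (scipy assumed; the data at the end come from Example 3.4.1 below):

```python
import numpy as np
from scipy.linalg import null_space

def reduced_hessian_psd(F_xx, active_jacobian, tol=1e-10):
    """Check <z, F_xx z> >= 0 for all z with active_jacobian @ z = 0
    by projecting F_xx onto the null space of the active Jacobian."""
    Z = null_space(active_jacobian)      # columns span {z : A z = 0}
    if Z.size == 0:
        return True                      # only z = 0 satisfies the constraints
    reduced = Z.T @ F_xx @ Z
    return np.all(np.linalg.eigvalsh(reduced) >= -tol)

# Example 3.4.1 data: F_xx = I at x0 = (1, 1, 1), active Jacobian = grad h = (1, 1, 1).
print(reduced_hessian_psd(np.eye(3), np.array([[1.0, 1.0, 1.0]])))  # True
```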

Corollary 3.3.2 Let the functions $f, g, h$ be as in Theorem 3.3.1 and let $x^*$ be a solution to problem (P). Suppose the vectors $\nabla g_1(x^*), \dots, \nabla g_r(x^*), \nabla h_1(x^*), \dots, \nabla h_k(x^*)$ are linearly independent. Then there exist KKT multipliers $(\lambda, \mu)$ as in Theorem 3.2.2, and (3.37) holds for every vector $z$ satisfying
\[
\nabla g_E(x^*)^T z = 0, \qquad \nabla h(x^*)^T z = 0. \tag{3.38}
\]

Corollary 3.3.3 Let $f$ and $h = (h_1, \dots, h_k)$ be as in Theorem 3.3.1, let $x^*$ be a solution of problem (P) in which the inequality constraints are absent, and let the constraint qualification (Definition 3.2.2) hold at $x^*$. Then there exists a unique $\mu$ in $\mathbb{R}^k$, $\mu = (\mu_1, \dots, \mu_k)$, such that the function $H(\cdot, \mu)$ defined by
\[
H(x, \mu) = f(x) + \langle \mu, h(x)\rangle
\]
satisfies

(i) $\nabla f(x^*) + \mu^T \nabla h(x^*) = 0$

(ii) $\langle z, H_{xx}(x^*, \mu)\, z\rangle \ge 0$ for all $z$ in $\mathbb{R}^n$ satisfying $\nabla h(x^*)^T z = 0$.

3.4 Sufficient Conditions

In this section we give some sufficiency theorems for a solution of problem (P).

Theorem 3.4.1 Let $(x^*, \lambda_0, \lambda, \mu)$ satisfy the Fritz John necessary conditions of Theorem 3.2.1. Suppose that every $z \ne 0$ that satisfies the tangential constraints (3.30) at $x_0 = x^*$ and the inequality $\nabla f(x^*)^T z \le 0$ also satisfies
\[
\langle z, F^0_{xx}(x^*, \lambda_0, \lambda, \mu)\, z\rangle > 0, \tag{3.39}
\]
where
\[
F^0(x, \lambda_0, \lambda, \mu) = \lambda_0 f(x) + \langle \lambda, g(x)\rangle + \langle \mu, h(x)\rangle.
\]
Then $f$ in problem (P) attains a strict local minimum at $x^*$.
Proof: To simplify notation we assume that $x^* = 0$ and that $f(0) = 0$.
If $f$ did not attain a strict local minimum at $x^* = 0$, then for every positive integer $q$ there would exist a feasible point $v_q \ne 0$ such that $v_q \in B(0; \frac{1}{q})$ and $f(v_q) \le 0$. Thus there would exist a sequence of points $\{v_q\}$ in $X$ such that
\[
v_q \ne 0, \quad \lim_{q \to \infty} v_q = 0, \quad f(v_q) \le 0, \quad g(v_q) \le 0, \quad h(v_q) = 0.
\]
Using the mean value theorem there exist $\tau_0, \tau_i, \sigma_j$ in $(0, 1)$ such that
\[
f(v_q) = f(0) + \langle \nabla f(\tau_0 v_q), v_q\rangle = \langle \nabla f(\tau_0 v_q), v_q\rangle.
\]
Since $f(0) = 0$ and $f(v_q) \le 0$ we have
\[
f(v_q) = \langle \nabla f(\tau_0 v_q), v_q\rangle \le 0. \tag{3.40}
\]
Recalling that $E = \{1, \dots, r\}$ is such that $g_i(x^*) = 0$, $i \in E$, we have by the mean value theorem again
\[
g_i(v_q) = \langle \nabla g_i(\tau_i v_q), v_q\rangle \le 0, \quad i \in E. \tag{3.41}
\]
We also have
\[
h_j(v_q) = \langle \nabla h_j(\sigma_j v_q), v_q\rangle = 0, \quad j = 1, \dots, k. \tag{3.42}
\]
It also follows from Taylor's Theorem that
\[
f(v_q) = \langle \nabla f(0), v_q\rangle + \frac{1}{2}\langle v_q, f_{xx}(\bar\tau_0 v_q)\, v_q\rangle \le 0 \tag{3.43}
\]
and that for $i = 1, \dots, r$ and $j = 1, \dots, k$,
\[
g_i(v_q) = \langle \nabla g_i(0), v_q\rangle + \frac{1}{2}\langle v_q, g_{i,xx}(\bar\tau_i v_q)\, v_q\rangle \le 0, \tag{3.44}
\]
\[
h_j(v_q) = \langle \nabla h_j(0), v_q\rangle + \frac{1}{2}\langle v_q, h_{j,xx}(\bar\sigma_j v_q)\, v_q\rangle = 0, \tag{3.45}
\]
where $\bar\tau_0, \bar\tau_i, \bar\sigma_j$ in (3.43), (3.44) and (3.45) are in $(0, 1)$.
Since $v_q \ne 0$, $\frac{v_q}{\|v_q\|}$ is a unit vector. Hence there exist a unit vector $v$ and a subsequence, which we relabel as $\{v_q\}$, such that
\[
\frac{v_q}{\|v_q\|} \to v. \tag{3.46}
\]
Now, dividing (3.40)–(3.42) by $\|v_q\|$ and letting $q \to \infty$, we have
\[
\langle \nabla f(0), v\rangle \le 0, \qquad \nabla g_E(0)\, v \le 0, \qquad \nabla h(0)\, v = 0.
\]
Thus $v$ satisfies the tangential constraints (see Definition 3.3.1).
Now, multiplying (3.43) by $\lambda_0$, (3.44) by $\lambda_i$, and (3.45) by $\mu_j$, adding, and recalling that $\lambda_i = 0$ for $i = r+1, \dots, m$, we obtain
\[
\left\langle \lambda_0 \nabla f(0) + \lambda^T \nabla g(0) + \mu^T \nabla h(0),\ v_q \right\rangle + \frac{1}{2}\left\langle v_q,\ \widetilde{F}^0_{xx}\, v_q \right\rangle \le 0, \tag{3.47}
\]
where $\widetilde{F}^0_{xx} = \lambda_0 f_{xx}(\bar\tau_0 v_q) + \sum_{i=1}^m \lambda_i g_{i,xx}(\bar\tau_i v_q) + \sum_{j=1}^k \mu_j h_{j,xx}(\bar\sigma_j v_q)$. The first inner product vanishes by (iii) of Theorem 3.2.1. Dividing (3.47) by $\|v_q\|^2$ and letting $q \to \infty$ through the subsequence in (3.46), the continuity of the second derivatives gives
\[
\frac{1}{2}\left\langle v,\ F^0_{xx}(0, \lambda_0, \lambda, \mu)\, v \right\rangle \le 0, \tag{3.48}
\]
that is,
\[
\left\langle v,\ F^0_{xx}(0, \lambda_0, \lambda, \mu)\, v \right\rangle \le 0, \tag{3.49}
\]
contradicting the assumption of the theorem, since $v \ne 0$ satisfies the tangential constraints and $\langle \nabla f(0), v\rangle \le 0$. □
Corollary 3.4.1 Let $(x^*, \lambda, \mu)$ satisfy the KKT necessary conditions. Suppose that every $z \ne 0$ in $\mathbb{R}^n$ that satisfies the tangential constraints at $x_0 = x^*$ and the equality $\langle \nabla f(x^*), z\rangle = 0$ also satisfies
\[
\langle z, F_{xx}(x^*, \lambda, \mu)\, z\rangle > 0,
\]
where
\[
F(x, \lambda, \mu) = f(x) + \langle \lambda, g(x)\rangle + \langle \mu, h(x)\rangle.
\]
Then $f$ attains a strict local minimum for problem (P) at $x^*$.

Proof: If $(x^*, \lambda, \mu)$ satisfies the KKT conditions, then
\[
\nabla f(x^*) + \lambda^T \nabla g(x^*) + \mu^T \nabla h(x^*) = 0.
\]
Since $\lambda_I = 0$,
\[
\nabla f(x^*) = -\lambda_E^T \nabla g_E(x^*) - \mu^T \nabla h(x^*).
\]
For every $z \in \mathbb{R}^n$ we have
\[
\langle \nabla f(x^*), z\rangle = -\langle \lambda_E, \nabla g_E(x^*)\, z\rangle - \langle \mu, \nabla h(x^*)\, z\rangle. \tag{3.50}
\]
Since $\lambda_E \ge 0$, for every $z$ that satisfies the tangential constraints we have from (3.50)
\[
\langle \nabla f(x^*), z\rangle \ge 0.
\]
Hence, the condition $\nabla f(x^*)^T z \le 0$ in Theorem 3.4.1 reduces to
\[
\langle \nabla f(x^*), z\rangle = 0. \qquad \square
\]

Remark 3.4.1 If the inequality constraints in problem (P) are absent, then a vector $z$ satisfies the tangential constraints at a point $x_0$ if and only if $\nabla h(x_0)^T z = 0$. Now, if the necessary conditions of Theorem 3.2.2 hold at $x^*$, then from (ii) of Theorem 3.2.2 we have
\[
\nabla f(x^*) = -\sum_{i=1}^k \mu_i \nabla h_i(x^*),
\]
and thus
\[
\nabla f(x^*)^T z = -\sum_{i=1}^k \mu_i\, \nabla h_i(x^*)^T z = 0,
\]
and we have the following result.

Corollary 3.4.2 Let $(x^*, \mu)$ satisfy the necessary conditions of Theorem 3.2.2 when the inequalities are absent. Suppose that for every $z \ne 0$ in $\mathbb{R}^n$ such that $\nabla h(x^*)^T z = 0$ we have
\[
\langle z, H_{xx}(x^*, \mu)\, z\rangle > 0,
\]
where
\[
H(x, \mu) = f(x) + \langle \mu, h(x)\rangle.
\]
Then $f$ attains a strict local minimum for problem (P) where the inequality constraints are absent.
Corollary 3.4.3 (Pennisi 1953): Let $(x^*, \lambda, \mu)$ satisfy the KKT conditions. Let $E^* = \{i : i \in E \text{ and } \lambda_i > 0\}$. Suppose that for every $z \ne 0$ in $\mathbb{R}^n$ that satisfies
\[
\langle \nabla g_i(x^*), z\rangle = 0,\ i \in E^*, \qquad \langle \nabla g_i(x^*), z\rangle \le 0,\ i \in E \setminus E^*, \qquad \langle \nabla h(x^*), z\rangle = 0, \tag{3.51}
\]
the inequality
\[
\langle z, F_{xx}(x^*, \lambda, \mu)\, z\rangle > 0
\]
holds, where $F$ is as in Corollary 3.4.1. Then $f$ attains a strict local minimum for problem (P) at $x^*$.
Proof: To prove the corollary it suffices to show that the set $V_1$ of vectors that satisfy (3.51) is the same as the set $V_2$ of vectors that satisfy the tangential constraints and the equality $\langle \nabla f(x^*), z\rangle = 0$, and then invoke Corollary 3.4.1.
We first show that $V_1 \subset V_2$. If $V_1 = \emptyset$ there is nothing to prove. Thus, assume $V_1 \ne \emptyset$. From (ii) of Theorem 3.2.2 we have
\[
\langle \nabla f(x^*), z\rangle + \langle \lambda, \nabla g(x^*)\, z\rangle + \langle \mu, \nabla h(x^*)\, z\rangle = 0. \tag{3.52}
\]
Now, let $z \in V_1$. Then $\nabla g_E(x^*)^T z \le 0$ and $\nabla h(x^*)^T z = 0$; thus $z$ satisfies the tangential constraints. Moreover, since $\lambda_i = 0$ for $i \in I$ and $\lambda_i = 0$ for $i \in E \setminus E^*$, (3.52) gives
\[
\langle \nabla f(x^*), z\rangle + \sum_{i \in E^*} \lambda_i \langle \nabla g_i(x^*), z\rangle = 0
\]
for all $z \in V_1$. For such $z$ and $i \in E^*$ we have $\langle \nabla g_i(x^*), z\rangle = 0$, and then $\langle \nabla f(x^*), z\rangle = 0$, and so $V_1 \subset V_2$.
Next we show that $V_2 \subset V_1$. Assume $V_2 \ne \emptyset$. Since $\lambda_i = 0$ for $i \notin E^*$ and $\langle \nabla f(x^*), z\rangle = 0$, it follows that for $z \in V_2$, (3.52) becomes
\[
\sum_{i \in E^*} \lambda_i \langle \nabla g_i(x^*), z\rangle = 0.
\]
Since for any vector $z$ that satisfies the tangential constraints we have $\langle \nabla g_i(x^*), z\rangle \le 0$, and since $\lambda_i > 0$ for $i \in E^*$, the last equality implies that $\langle \nabla g_i(x^*), z\rangle = 0$ for $i \in E^*$. Since $\nabla g_E(x^*)^T z \le 0$ for any vector satisfying the tangential constraints, we get $V_2 \subset V_1$. □

Example 3.4.1
\[
\begin{aligned}
&\min \ \frac{1}{2}\left( x_1^2 + x_2^2 + x_3^2 \right)\\
&\text{subject to}\\
&\quad x_1 + x_2 + x_3 = 3
\end{aligned}
\]
Here $f(x) = \frac{1}{2}\left( x_1^2 + x_2^2 + x_3^2 \right)$ and $h(x) = x_1 + x_2 + x_3 - 3$. The Fritz John Theorem gives
\[
\lambda_0 \nabla f(x_0) + \mu \nabla h(x_0) = 0, \qquad \lambda_0 \ge 0, \qquad \lambda_0 + |\mu| \ne 0,
\]
at an optimal point $x_0$:
\[
\lambda_0 (x_1, x_2, x_3) + \mu (1, 1, 1) = (0, 0, 0),
\]
i.e.
\[
\begin{aligned}
\lambda_0 x_1 + \mu &= 0\\
\lambda_0 x_2 + \mu &= 0\\
\lambda_0 x_3 + \mu &= 0
\end{aligned}
\]
If $\lambda_0 = 0$, then $\mu = 0$. Therefore $\lambda_0 \ne 0$, and we may assume $\lambda_0 = 1$. Then,
\[
\begin{aligned}
x_1 + \mu &= 0\\
x_2 + \mu &= 0\\
x_3 + \mu &= 0
\end{aligned}
\]
Thus,
\[
x_1 = x_2 = x_3 = -\mu.
\]
From $x_1 + x_2 + x_3 = 3$ we have $-3\mu = 3$. Thus $\mu = -1$ and
\[
x_0 = (1, 1, 1).
\]
Now, set
\[
F(x, \mu) = f(x) + \mu\, h(x).
\]
For every $z \ne 0$ satisfying $\langle \nabla h(x_0), z\rangle = 0$ we must have
\[
\langle z, F_{xx}(x_0, \mu)\, z\rangle \ge 0.
\]
In the present case $\langle \nabla h(x_0), z\rangle = z_1 + z_2 + z_3$ and
\[
F_{xx}(x_0, \mu) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},
\]
so we must have
\[
\langle z, F_{xx}(x_0, \mu)\, z\rangle = \langle z, z\rangle = |z|^2 \ge 0
\]
for any $z$ satisfying $z_1 + z_2 + z_3 = 0$. This is, in fact, true since $|z|^2 \ge 0$ for any $z$.
For sufficiency, we need, for every $z \ne 0$ satisfying $\langle \nabla h(x_0), z\rangle = 0$ and $\langle \nabla f(x_0), z\rangle = 0$, the strict inequality
\[
\langle z, F_{xx}(x_0, \mu)\, z\rangle > 0.
\]
This holds for any $z \ne 0$, although we only need the inequality for $z \ne 0$ such that $z_1 + z_2 + z_3 = 0$. Note that $\langle \nabla h(x_0), z\rangle = z_1 + z_2 + z_3$ and $\langle \nabla f(x_0), z\rangle = z_1 + z_2 + z_3$. Thus $x_0 = (1, 1, 1)$ is a strict local minimizer.
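One can confirm the hand computation with a constrained solver; this is our own check (scipy assumed), not part of the worked example:

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: 0.5 * np.sum(x ** 2)
res = minimize(f, x0=np.array([3.0, 0.0, 0.0]),
               constraints=[{"type": "eq",
                             "fun": lambda x: x[0] + x[1] + x[2] - 3}])
print(res.x)    # approximately (1, 1, 1)
print(res.fun)  # approximately 1.5
```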
Example 3.4.2
Let $f(x) = x_1 + \cdots + x_n$. Consider
\[
\begin{aligned}
&\min f(x)\\
&\text{subject to}\\
&\quad x_1^2 + \cdots + x_n^2 = 1.
\end{aligned}
\]
Here $\nabla f(x) = (1, 1, \dots, 1)^T$, $h(x) = x_1^2 + \cdots + x_n^2 - 1$, and $\nabla h(x) = (2x_1, 2x_2, \dots, 2x_n)^T$.
From the Fritz John Theorem we have
\[
\lambda_0 (1, 1, \dots, 1) + \mu (2x_1, 2x_2, \dots, 2x_n) = (0, 0, \dots, 0).
\]
Since $(0, 0, \dots, 0)^T$ is not feasible, if $\lambda_0 = 0$ then $\mu = 0$, contradicting $(\lambda_0, \mu) \ne 0$. Thus $\lambda_0 \ne 0$, and we may assume $\lambda_0 = 1$; then $\mu \ne 0$.
Now,
\[
(1, 1, \dots, 1) + 2\mu (x_1, \dots, x_n) = (0, 0, \dots, 0),
\]
\[
2\mu x_i = -1, \quad i = 1, \dots, n.
\]
Therefore
\[
x_i = -\frac{1}{2\mu}, \quad i = 1, \dots, n.
\]
Since $x_1^2 + \cdots + x_n^2 = 1$, we have
\[
n\, \frac{1}{4\mu^2} = 1, \qquad \mu^2 = \frac{n}{4}, \qquad \mu = \pm\frac{\sqrt{n}}{2}.
\]
Therefore
\[
x_i = -\frac{1}{2\mu} = \mp\frac{1}{\sqrt{n}}.
\]
For the second order conditions, let $z \ne 0$ be such that
\[
\langle \nabla h(x_0), z\rangle = 0,
\]
i.e.
\[
2 x_{01} z_1 + \cdots + 2 x_{0n} z_n = 0, \qquad x_{01} z_1 + \cdots + x_{0n} z_n = 0, \qquad \mp\frac{1}{\sqrt{n}}\,(z_1 + \cdots + z_n) = 0.
\]
Therefore
\[
z_1 + \cdots + z_n = 0.
\]
Let
\[
H(x, \mu) = f(x) + \mu\, h(x).
\]
Then
\[
H_{xx}(x, \mu) = 2\mu I, \qquad \langle z, H_{xx}\, z\rangle = 2\mu\, |z|^2.
\]
The necessary condition $\langle z, H_{xx}\, z\rangle \ge 0$ requires $\mu \ge 0$, so
\[
\mu = \frac{\sqrt{n}}{2}.
\]
Since then $\langle z, H_{xx}(x_0, \mu)\, z\rangle = \sqrt{n}\, |z|^2 > 0$ for $z \ne 0$, the sufficiency theory shows we have a strict local minimum at
\[
\left( -\frac{1}{\sqrt{n}}, -\frac{1}{\sqrt{n}}, \dots, -\frac{1}{\sqrt{n}} \right)^T.
\]
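Again a numerical cross-check we add (scipy assumed; $n = 5$ is our own choice of dimension):

```python
import numpy as np
from scipy.optimize import minimize

n = 5
f = lambda x: np.sum(x)
cons = [{"type": "eq", "fun": lambda x: np.sum(x ** 2) - 1}]
res = minimize(f, x0=np.full(n, 0.5), constraints=cons)
print(res.x)  # approximately (-1/sqrt(n), ..., -1/sqrt(n))
print(np.allclose(res.x, -np.ones(n) / np.sqrt(n), atol=1e-4))  # True
```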

Example 3.4.3
\[
\begin{aligned}
&\min f(x) = x_1\\
&\text{subject to}\\
&\quad (x_1 + 1)^2 + x_2^2 \ge 1\\
&\quad x_1^2 + x_2^2 \le 2
\end{aligned}
\]
Write the constraints as
\[
g_1(x) = 1 - (x_1 + 1)^2 - x_2^2 \le 0, \qquad g_2(x) = x_1^2 + x_2^2 - 2 \le 0,
\]
so that
\[
\nabla g_1 = (-2(x_1 + 1), -2x_2), \qquad \nabla g_2 = (2x_1, 2x_2), \qquad \nabla f = (1, 0).
\]
Check the Fritz John condition at $(-1, 1)$, where both constraints are active:
\[
\nabla g_1(-1, 1) = (0, -2), \qquad \nabla g_2(-1, 1) = (-2, 2),
\]
\[
\lambda_0 (1, 0) + \lambda_1 (0, -2) + \lambda_2 (-2, 2) = (0, 0),
\]
i.e.
\[
(\lambda_0 - 2\lambda_2,\ -2(\lambda_1 - \lambda_2)) = (0, 0).
\]
Therefore $\lambda_1 - \lambda_2 = 0$ and $\lambda_0 - 2\lambda_2 = 0$. We cannot have $\lambda_0 = 0$, for then $\lambda_1 = \lambda_2 = 0$. Therefore $\lambda_0 = 1$ and $\lambda_2 = \frac{1}{2}$; since $\lambda_1 = \lambda_2$, $\lambda_1 = \frac{1}{2}$. Now set
\[
F(x, \lambda_0, \lambda) = x_1 + \frac{1}{2}\left[ 1 - (x_1 + 1)^2 - x_2^2 \right] + \frac{1}{2}\left( x_1^2 + x_2^2 - 2 \right).
\]
Then
\[
\nabla F = (1 - (x_1 + 1) + x_1,\ 0) = (0, 0), \qquad F_{xx} = 0,
\]
so the second order necessary condition holds (trivially) at $(-1, 1)$.
The point $(0, 0)$ is also a candidate: it is feasible, $g_1$ is active there, and the second constraint is not active, so $\lambda_2 = 0$. The Fritz John condition reads
\[
\lambda_0 (1, 0) + \lambda_1 (-2, 0) = (0, 0),
\]
i.e.
\[
(\lambda_0 - 2\lambda_1,\ 0) = (0, 0).
\]
$\lambda_0$ cannot be zero, so $\lambda_0 = 1$ and $\lambda_1 = \frac{1}{2}$. Now,
\[
F(x, \lambda_0, \lambda) = x_1 + \frac{1}{2}\left[ 1 - (x_1 + 1)^2 - x_2^2 \right],
\]
\[
F_x = (1 - (x_1 + 1),\ -x_2) = (-x_1, -x_2), \qquad F_{xx} = -I,
\]
\[
\nabla g_1(x) = (-2(x_1 + 1), -2x_2), \qquad \nabla g_1(0, 0) = (-2, 0).
\]
The vector $(0, z_2)$, $z_2 \ne 0$, is a second order test vector. For $z = (0, z_2)$, $z_2 = 1$,
\[
\langle z, F_{xx}\, z\rangle = \langle z, -I\, z\rangle = -z_2^2 < 0.
\]
The second order necessary condition fails to hold. Therefore $(0, 0)$ cannot be a local minimum.
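A quick check of the failure at $(0, 0)$ (our own sketch, assuming numpy): the Hessian of $F$ is negative along the test direction, and indeed feasible points near $(0, 0)$ with smaller objective exist along the active circle.

```python
import numpy as np

# Example 3.4.3 at the KKT point (0, 0): F_xx = -I, test vector z = (0, 1).
F_xx = -np.eye(2)
z = np.array([0.0, 1.0])
print(z @ F_xx @ z)   # -1 < 0: second order necessary condition fails

# Feasible descent near (0, 0): move along the active circle
# (x1 + 1)^2 + x2^2 = 1, i.e. x(t) = (cos t - 1, sin t), which keeps g1 = 0
# and, for small t, satisfies g2 < 0, while f = x1 = cos t - 1 < 0.
for t in (0.3, 0.1, 0.01):
    x = np.array([np.cos(t) - 1, np.sin(t)])
    g1 = 1 - (x[0] + 1) ** 2 - x[1] ** 2
    g2 = x[0] ** 2 + x[1] ** 2 - 2
    print(t, x[0], abs(g1) <= 1e-12, g2 < 0)   # f = x1 < 0 at feasible points
```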

3.5 Summary

Constraint Qualification Conditions

The constraints satisfy the constraint qualification conditions at $x^*$ if

(i) $\nabla h_1(x^*), \dots, \nabla h_k(x^*)$ are linearly independent;

(ii) letting $E = \{i : g_i(x^*) = 0\}$, the system
\[
\nabla h(x^*)^T z = 0,
\]
\[
\nabla g_E(x^*)^T z < 0
\]
has a solution.

Tangential Constraints at $x^*$

The vector $z \ne 0$ satisfies the tangential constraints at $x^*$ if

(i) $\nabla h(x^*)^T z = 0$

(ii) $\nabla g_E(x^*)^T z \le 0$

Second Order Conditions

A. (1) The vector $z \ne 0$ is a second order test vector if $\lambda_i = 0$ whenever $\nabla g_i(x^*)^T z < 0$ and $g_i(x^*) = 0$.

(2) Letting
\[
E_2 = \{i \in E : \nabla g_i(x^*)^T z < 0\}, \qquad E_1 = \{i \in E : \nabla g_i(x^*)^T z = 0\},
\]
the matrix
\[
\begin{pmatrix} \nabla g_{E_1}(x^*) \\ \nabla h(x^*) \end{pmatrix}
\]
must have linearly independent rows.

For every second order test vector $z$ with (2) in force we must have
\[
\langle z, F_{xx}(x^*, \lambda, \mu)\, z\rangle \ge 0.
\]

B. No Inequality Constraints: For every $z \in \mathbb{R}^n$, $z \ne 0$, with $\nabla h(x^*)^T z = 0$ we must have
\[
\langle z, F_{xx}(x^*, \mu)\, z\rangle \ge 0.
\]

Sufficiency

A. Let the Fritz John necessary conditions hold at $x^*$ with multipliers $\lambda_0, \lambda, \mu$. If for all $z \ne 0$ that satisfy the tangential constraints at $x^*$ and $\nabla f(x^*)^T z \le 0$ we also have
\[
\langle z, F^0_{xx}(x^*, \lambda_0, \lambda, \mu)\, z\rangle > 0,
\]
then $x^*$ is a strict local minimum.

B. No Inequality Constraints: If for every $z \in \mathbb{R}^n$, $z \ne 0$, such that $\nabla h(x^*)^T z = 0$ we have
\[
\langle z, F^0_{xx}(x^*, \lambda_0, \mu)\, z\rangle > 0,
\]
then $x^*$ is a strict local minimum.
