
ECE 403/503 Solutions to Assignment 2

Problems for this assignment: 1.4, 1.5, 1.6, and 1.7.

1.4 (2.5 points) Compute the gradient and Hessian of the functions given below.

(a) $f(x_1, x_2) = x_1^2 + 2x_2 \sin x_1 + 100$

(b) $f(x) = \frac{1}{2} x^T H x + x^T b + \gamma$ where $H \in R^{N \times N}$, $b \in R^{N \times 1}$, and $\gamma \in R$ are known.

(c) $f(\theta) = \dfrac{1}{P} \displaystyle\sum_{p=1}^{P} \log\left(1 + e^{-\theta^T z_p}\right)$ where variable $\theta \in R^{n \times 1}$, $\log(\cdot)$ denotes natural logarithm, and $\{z_p \text{ for } p = 1, 2, \ldots, P\}$ with $z_p \in R^{n \times 1}$ comes from a known data set.


Solution
(a) The gradient is given by
$$\nabla f(x_1, x_2) = \begin{bmatrix} 2x_1 + 2x_2 \cos x_1 \\ 2 \sin x_1 \end{bmatrix}$$
and the Hessian is given by
$$\nabla^2 f(x_1, x_2) = \begin{bmatrix} 2 - 2x_2 \sin x_1 & 2\cos x_1 \\ 2\cos x_1 & 0 \end{bmatrix}$$
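The derived gradient and Hessian can be sanity-checked against central differences. The snippet below is an informal check rather than part of the assignment; it assumes numpy is available, and the test point is arbitrary.

```python
import numpy as np

def f(x):
    # f(x1, x2) = x1^2 + 2*x2*sin(x1) + 100
    return x[0]**2 + 2*x[1]*np.sin(x[0]) + 100

def grad(x):
    # gradient derived above
    return np.array([2*x[0] + 2*x[1]*np.cos(x[0]), 2*np.sin(x[0])])

def hess(x):
    # Hessian derived above
    return np.array([[2 - 2*x[1]*np.sin(x[0]), 2*np.cos(x[0])],
                     [2*np.cos(x[0]),          0.0]])

x0 = np.array([0.7, -1.3])   # arbitrary test point
eps = 1e-6
# central differences of f approximate the gradient; of grad, the Hessian
fd_grad = np.array([(f(x0 + eps*e) - f(x0 - eps*e)) / (2*eps) for e in np.eye(2)])
fd_hess = np.array([(grad(x0 + eps*e) - grad(x0 - eps*e)) / (2*eps) for e in np.eye(2)])
print(np.max(np.abs(fd_grad - grad(x0))), np.max(np.abs(fd_hess - hess(x0))))
```

Both printed discrepancies should be tiny (limited only by floating-point rounding), confirming the closed-form expressions.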
(b) Let us denote matrix H in terms of its columns and rows as
$$H = \begin{bmatrix} h_1 & h_2 & \cdots & h_n \end{bmatrix} \quad \text{and} \quad H = \begin{bmatrix} \hat{h}_1^T \\ \hat{h}_2^T \\ \vdots \\ \hat{h}_n^T \end{bmatrix}$$
and denote b as
$$b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$
By definition, the partial derivative $\partial f(x)/\partial x_i$ is obtained by taking the limit
$$\frac{\partial f(x)}{\partial x_i} = \lim_{\varepsilon \to 0} \frac{f(x + \varepsilon e_i) - f(x)}{\varepsilon}$$
where $e_i$ is the $i$th column of the identity matrix.
Now we compute
$$\begin{aligned}
f(x + \varepsilon e_i) &= \frac{1}{2} x^T H x + \frac{1}{2}\varepsilon\, e_i^T H x + \frac{1}{2}\varepsilon\, x^T H e_i + \frac{1}{2}\varepsilon^2 e_i^T H e_i + x^T b + \varepsilon\, e_i^T b + \gamma \\
&= \frac{1}{2} x^T H x + \frac{1}{2}\varepsilon\, \hat{h}_i^T x + \frac{1}{2}\varepsilon\, h_i^T x + \frac{1}{2}\varepsilon^2 h_{i,i} + x^T b + \varepsilon\, b_i + \gamma \\
&= f(x) + \varepsilon \left[ \frac{1}{2}\left(\hat{h}_i + h_i\right)^T x + \frac{1}{2}\varepsilon\, h_{i,i} + b_i \right]
\end{aligned}$$
Hence
$$\frac{\partial f(x)}{\partial x_i} = \lim_{\varepsilon \to 0} \frac{f(x + \varepsilon e_i) - f(x)}{\varepsilon} = \frac{1}{2}\left(\hat{h}_i + h_i\right)^T x + b_i$$
which leads to
$$\nabla f(x) = \frac{1}{2}(H + H^T)x + b$$
Using the result obtained above, the Hessian can be computed by definition as
$$\nabla^2 f(x) = \nabla\left[\nabla f(x)^T\right] = \nabla\left[\frac{1}{2} x^T (H + H^T) + b^T\right] = \frac{1}{2}(H + H^T)$$
In particular, if H is symmetric, then
$$\nabla f(x) = Hx + b \quad \text{and} \quad \nabla^2 f(x) = H$$
(c) The gradient of $f(\theta)$ is given by
$$\nabla f(\theta) = -\frac{1}{P}\sum_{p=1}^{P} \frac{e^{-\theta^T z_p}}{1 + e^{-\theta^T z_p}}\, z_p = -\frac{1}{P}\sum_{p=1}^{P} \frac{z_p}{1 + e^{\theta^T z_p}}$$
from which the Hessian can be evaluated as
$$\nabla^2 f(\theta) = \frac{1}{P}\sum_{p=1}^{P} \frac{e^{-\theta^T z_p}}{\left(1 + e^{-\theta^T z_p}\right)^2}\, z_p z_p^T$$

1.5 (2.5 points) Point $x^*$ is called a stationary point of function $f(x)$ if $\nabla f(x^*) = 0$. Find and classify the stationary points of the following functions as minimizers, maximizers, or none of the above (in that case the stationary point is called a saddle point):

(a) $f(x) = 2x_1^2 + x_2^2 - 2x_1 x_2 + 2x_1^3 + x_1^4$

(b) $f(x) = x_1^2 x_2^2 - 4x_1^2 x_2 + 4x_1^2 - 2x_1 x_2^2 + x_2^2 + 8x_1 x_2 - 8x_1 - 4x_2$

(c) $f(x) = (x_1^2 - x_2)^2 + x_1^5$

(Hint: Use the second-order sufficient conditions.)

Solution
(a) We write the objective function in question as
$$f(x) = 2x_1^2 + x_2^2 - 2x_1 x_2 + 2x_1^3 + x_1^4 = x_1^2(x_1^2 + 2x_1 + 1) + (x_1^2 - 2x_1 x_2 + x_2^2) = x_1^2(x_1 + 1)^2 + (x_1 - x_2)^2$$
from which we see it is always nonnegative. By setting its gradient to zero, i.e.,
$$\nabla f(x) = \begin{bmatrix} 4x_1^3 + 6x_1^2 + 4x_1 - 2x_2 \\ 2x_2 - 2x_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
we obtain
$$x_2 = x_1 \quad \text{and} \quad 4x_1^3 + 6x_1^2 + 4x_1 - 2x_2 = 0$$
which leads to the equation
$$2x_1^3 + 3x_1^2 + x_1 = 0$$
The solutions of the above equation are 0, $-0.5$, and $-1$, which in conjunction with $x_2 = x_1$ gives three stationary points of the objective function:
$$x_a = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad x_b = \begin{bmatrix} -0.5 \\ -0.5 \end{bmatrix}, \quad x_c = \begin{bmatrix} -1 \\ -1 \end{bmatrix}$$
We proceed by computing the Hessian of the objective function
$$\nabla^2 f(x) = \begin{bmatrix} 12x_1^2 + 12x_1 + 4 & -2 \\ -2 & 2 \end{bmatrix}$$
At $x_a$, the Hessian becomes
$$\nabla^2 f(x_a) = \begin{bmatrix} 4 & -2 \\ -2 & 2 \end{bmatrix}$$
which is positive definite because both eigenvalues, 5.2361 and 0.7639, are positive. Hence $x_a$ satisfies the second-order sufficient conditions and is a minimizer.
At $x_b$, the Hessian is equal to
$$\nabla^2 f(x_b) = \begin{bmatrix} 1 & -2 \\ -2 & 2 \end{bmatrix}$$
which is indefinite because its eigenvalues, 3.5616 and $-0.5616$, have mixed signs. Hence $x_b$ is neither a minimizer nor a maximizer (it is a saddle point).
At $x_c$, the Hessian is given by
$$\nabla^2 f(x_c) = \begin{bmatrix} 4 & -2 \\ -2 & 2 \end{bmatrix}$$
which is identical to $\nabla^2 f(x_a)$ and is known to be positive definite. Hence $x_c$ satisfies the second-order sufficient conditions and is a minimizer.
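The eigenvalue classification above can be reproduced numerically. The sketch below assumes numpy and uses the fact that all three stationary points lie on the line $x_2 = x_1$.

```python
import numpy as np

# Hessian of f in part (a), evaluated along the line x2 = x1
hess = lambda x1: np.array([[12*x1**2 + 12*x1 + 4, -2.0],
                            [-2.0,                  2.0]])

def classify(H):
    # second-order sufficient conditions via eigenvalue signs
    ev = np.linalg.eigvalsh(H)
    if ev.min() > 0:
        return "minimizer"
    if ev.max() < 0:
        return "maximizer"
    return "saddle point"

results = {x1: classify(hess(x1)) for x1 in (0.0, -0.5, -1.0)}
print(results)   # {0.0: 'minimizer', -0.5: 'saddle point', -1.0: 'minimizer'}
```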


(b) For convenience we express the objective function in a more compact form as
$$\begin{aligned}
f(x) &= x_1^2 x_2^2 - 4x_1^2 x_2 + 4x_1^2 - 2x_1 x_2^2 + x_2^2 + 8x_1 x_2 - 8x_1 - 4x_2 \\
&= x_1^2 (x_2 - 2)^2 - 2x_1 (x_2 - 2)^2 + (x_2 - 2)^2 - 4 \\
&= (x_1 - 1)^2 (x_2 - 2)^2 - 4
\end{aligned}$$
It immediately follows that the minimum value of the objective function is $-4$.
The gradient of the objective function is given by
$$\nabla f(x) = \begin{bmatrix} 2(x_1 - 1)(x_2 - 2)^2 \\ 2(x_1 - 1)^2 (x_2 - 2) \end{bmatrix}$$
By setting the gradient to zero, the stationary points of $f(x)$ are found to be
$$x^* = \begin{bmatrix} x_1 \\ 2 \end{bmatrix} \text{ with } x_1 \text{ arbitrary} \quad \text{and} \quad x^* = \begin{bmatrix} 1 \\ x_2 \end{bmatrix} \text{ with } x_2 \text{ arbitrary}$$
Since at any stationary point characterized above the objective function assumes its minimum value $-4$, we conclude that all stationary points are global minimizers.
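The factorization above, and the claim that every stationary point attains the value $-4$, can be spot-checked numerically; the random points and the two sample stationary points below are arbitrary.

```python
import numpy as np

# expanded form from the problem statement and the compact form derived above
f_expanded = lambda x1, x2: (x1**2*x2**2 - 4*x1**2*x2 + 4*x1**2 - 2*x1*x2**2
                             + x2**2 + 8*x1*x2 - 8*x1 - 4*x2)
f_compact = lambda x1, x2: (x1 - 1)**2 * (x2 - 2)**2 - 4

rng = np.random.default_rng(3)
pts = rng.standard_normal((100, 2))
match = np.allclose([f_expanded(p, q) for p, q in pts],
                    [f_compact(p, q) for p, q in pts])
print(match)                                         # the two forms agree
print(f_expanded(5.0, 2.0), f_expanded(1.0, -7.0))   # sample stationary points attain -4
```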
(c) For $f(x) = (x_1^2 - x_2)^2 + x_1^5$, we compute its gradient and Hessian:
$$\nabla f(x) = \begin{bmatrix} 4x_1(x_1^2 - x_2) + 5x_1^4 \\ -2(x_1^2 - x_2) \end{bmatrix}, \quad \nabla^2 f(x) = \begin{bmatrix} 12x_1^2 - 4x_2 + 20x_1^3 & -4x_1 \\ -4x_1 & 2 \end{bmatrix}$$
By setting $\nabla f(x) = 0$, we obtain $x^* = 0$ as the only stationary point of the objective function. At $x^*$, the Hessian becomes
$$\nabla^2 f(x^*) = \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix}$$
which is positive semidefinite. This means we need further analysis before drawing conclusions. Let $\delta = [\delta_1 \;\; \delta_2]^T$ with $\delta_1$ and $\delta_2$ small in magnitude but otherwise arbitrary. We compute
$$f(x^* + \delta) - f(x^*) = f(\delta) = (\delta_1^2 - \delta_2)^2 + \delta_1^5$$
hence
$$f(x^* + \delta) - f(x^*) > 0 \quad \text{when } \delta_2 = \delta_1^2 \text{ and } \delta_1 > 0$$
$$f(x^* + \delta) - f(x^*) < 0 \quad \text{when } \delta_2 = \delta_1^2 \text{ and } \delta_1 < 0$$
i.e.,
$$f(x^* + \delta) > f(x^*) \quad \text{when } \delta_2 = \delta_1^2 \text{ and } \delta_1 > 0$$
$$f(x^* + \delta) < f(x^*) \quad \text{when } \delta_2 = \delta_1^2 \text{ and } \delta_1 < 0$$
Based on this we conclude that $x^*$ is neither a minimizer nor a maximizer; it is a saddle point. ■
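The saddle behavior along the curve $\delta_2 = \delta_1^2$ can be illustrated directly; the perturbation size 0.1 below is an arbitrary small value.

```python
# f from part (c); along the curve x2 = x1^2 the squared term vanishes
# and f reduces to x1^5, which takes both signs arbitrarily close to the origin
f = lambda x1, x2: (x1**2 - x2)**2 + x1**5

pos = f( 0.1, 0.01)   # = 0.1**5  > 0: f rises above f(0) = 0
neg = f(-0.1, 0.01)   # = (-0.1)**5 < 0: f falls below f(0) = 0
print(pos > 0 > neg)
```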
1.6 (2.5 points) It is known that a function $f(x)$ is convex over a region if its Hessian is positive semidefinite over that region. Function $f(x)$ is said to be concave if $-f(x)$ is convex. Determine whether the following functions are convex, concave, or none of the above:

(a) $f(x) = x_1^2 + \cosh(x_2)$

(b) $f(x) = x_1^2 + 2x_2^2 + 2x_3^2 + x_4^2 - x_1 x_2 + x_1 x_3 - 2x_2 x_4 + x_1 x_4$

(c) $f(x) = x_1^2 - 2x_2^2 - 2x_3^2 + x_4^2 - x_1 x_2 + x_1 x_3 - 2x_2 x_4 + x_1 x_4$

(d) $f(\hat{w}) = \dfrac{1}{P}\displaystyle\sum_{p=1}^{P} \log\left(1 + e^{-y_p \hat{w}^T \hat{x}_p}\right)$ where variable $\hat{w} \in R^{N \times 1}$, $\log(\cdot)$ denotes natural logarithm, and $\hat{x}_p \in R^{N \times 1}$ and $y_p \in \{1, -1\}$ are known data.


Solution
(a) The Hessian of $f(x)$ is given by
$$H(x) = \begin{bmatrix} 2 & 0 \\ 0 & \cosh(x_2) \end{bmatrix}$$
Since the eigenvalues of $H(x)$, namely 2 and $\cosh(x_2)$, are always positive, $H(x)$ is positive definite at every point $x$. From Theorem 2.14, $f(x)$ is globally strictly convex.

(b) The Hessian of $f(x)$ is found to be
$$H = \begin{bmatrix} 2 & -1 & 1 & 1 \\ -1 & 4 & 0 & -2 \\ 1 & 0 & 4 & 0 \\ 1 & -2 & 0 & 2 \end{bmatrix}$$
Its eigenvalues are 0.6099, 1.3281, 4.2427, and 5.8192. Hence $H$ is positive definite and $f(x)$ is therefore (strictly) convex.
(c) The Hessian of $f(x)$ is evaluated as
$$H = \begin{bmatrix} 2 & -1 & 1 & 1 \\ -1 & -4 & 0 & -2 \\ 1 & 0 & -4 & 0 \\ 1 & -2 & 0 & 2 \end{bmatrix}$$
whose eigenvalues are $-4.6939$, $-4.1453$, 1.1797, and 3.6595. Hence $H$ is indefinite and function $f(x)$ is neither convex nor concave.
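The two eigenvalue computations above are easy to reproduce; the sketch below assumes numpy and uses `eigvalsh`, which is appropriate since both Hessians are symmetric.

```python
import numpy as np

# Hessians of (b) and (c) as derived above
Hb = np.array([[ 2., -1.,  1.,  1.],
               [-1.,  4.,  0., -2.],
               [ 1.,  0.,  4.,  0.],
               [ 1., -2.,  0.,  2.]])
Hc = np.array([[ 2., -1.,  1.,  1.],
               [-1., -4.,  0., -2.],
               [ 1.,  0., -4.,  0.],
               [ 1., -2.,  0.,  2.]])

evb = np.linalg.eigvalsh(Hb)   # ascending order
evc = np.linalg.eigvalsh(Hc)
print(np.round(evb, 4))   # all positive -> (b) is strictly convex
print(np.round(evc, 4))   # mixed signs  -> (c) is neither convex nor concave
```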
(d) It follows from Prob. 1.4(c) that the Hessian of $f(\hat{w})$ is given by
$$\nabla^2 f(\hat{w}) = \frac{1}{P}\sum_{p=1}^{P} \frac{e^{-y_p \hat{w}^T \hat{x}_p}}{\left(1 + e^{-y_p \hat{w}^T \hat{x}_p}\right)^2}\, \hat{x}_p \hat{x}_p^T$$
where we have used the fact that $y_p^2 = 1$. This implies that for any $v \in R^{N \times 1}$ we have
$$v^T \left[\nabla^2 f(\hat{w})\right] v = \frac{1}{P}\sum_{p=1}^{P} \frac{e^{-y_p \hat{w}^T \hat{x}_p}}{\left(1 + e^{-y_p \hat{w}^T \hat{x}_p}\right)^2}\, v^T \hat{x}_p \hat{x}_p^T v = \frac{1}{P}\sum_{p=1}^{P} \frac{e^{-y_p \hat{w}^T \hat{x}_p}}{\left(1 + e^{-y_p \hat{w}^T \hat{x}_p}\right)^2} \left(v^T \hat{x}_p\right)^2 \geq 0$$
Therefore $\nabla^2 f(\hat{w})$ is always positive semidefinite and hence $f(\hat{w})$ is convex. ■
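Positive semidefiniteness can also be observed numerically on synthetic data; the data sizes, seed, and weight vector below are arbitrary stand-ins for the known $\{\hat{x}_p, y_p\}$.

```python
import numpy as np

rng = np.random.default_rng(2)
P, N = 30, 5
X = rng.standard_normal((P, N))          # row p holds x_hat_p^T
y = rng.choice([-1.0, 1.0], size=P)
w = rng.standard_normal(N)               # an arbitrary weight vector

m = y * (X @ w)                          # margins y_p * w^T x_p
s = np.exp(-m) / (1 + np.exp(-m))**2     # nonnegative per-sample weights
H = (X.T * s) @ X / P                    # Hessian from Prob. 1.4(c)

min_eig = np.linalg.eigvalsh(H).min()
print(min_eig >= -1e-10)                 # positive semidefinite up to rounding
```

Since each term in the sum is a nonnegative multiple of a rank-one outer product, the minimum eigenvalue can only dip below zero by floating-point rounding, hence the small tolerance.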


1.7 (2.5 points)
(a) Find all stationary points of the objective function
$$f(x) = (x_1 - x_2)^2 + \left[2(x_1^2 + x_2^2 - 1) - \tfrac{1}{3}\right]^2$$
(b) Classify each stationary point found in part (a) as a minimizer, maximizer, or saddle point.
Solution
(a) The gradient of the objective function is given by
$$\nabla f(x) = \begin{bmatrix} 2(x_1 - x_2) + 8x_1\left[2(x_1^2 + x_2^2 - 1) - \tfrac{1}{3}\right] \\ -2(x_1 - x_2) + 8x_2\left[2(x_1^2 + x_2^2 - 1) - \tfrac{1}{3}\right] \end{bmatrix}$$
The stationary points of $f(x)$ are obtained by setting $\nabla f(x) = 0$, which calls for a case-by-case analysis yielding five stationary points as follows:
(i) If $2(x_1^2 + x_2^2 - 1) - \tfrac{1}{3} = 0$, then we must have $x_1 - x_2 = 0$, hence $x_1 = x_2 = \pm\sqrt{7/12}$. This yields two stationary points:
$$x^{(1)} = \begin{bmatrix} \sqrt{7/12} \\ \sqrt{7/12} \end{bmatrix} \quad \text{and} \quad x^{(2)} = \begin{bmatrix} -\sqrt{7/12} \\ -\sqrt{7/12} \end{bmatrix}$$
(ii) If $2(x_1^2 + x_2^2 - 1) - \tfrac{1}{3} \neq 0$, then adding the two components of the gradient gives $x_1 = -x_2$, and the first component becomes $4x_1 + 8x_1\left[2(2x_1^2 - 1) - \tfrac{1}{3}\right] = 0$. These two equations lead to three stationary points:
$$x^{(3)} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad x^{(4)} = \begin{bmatrix} \sqrt{11/24} \\ -\sqrt{11/24} \end{bmatrix}, \quad \text{and} \quad x^{(5)} = \begin{bmatrix} -\sqrt{11/24} \\ \sqrt{11/24} \end{bmatrix}$$
(b) The Hessian of the objective function is given by
$$\nabla^2 f(x) = \begin{bmatrix} 48x_1^2 + 16x_2^2 - \tfrac{50}{3} & 32x_1 x_2 - 2 \\ 32x_1 x_2 - 2 & 16x_1^2 + 48x_2^2 - \tfrac{50}{3} \end{bmatrix}$$
At $x^{(1)}$ and $x^{(2)}$, the Hessian is equal to
$$\nabla^2 f(x^{(1),(2)}) = \begin{bmatrix} 20.6667 & 16.6667 \\ 16.6667 & 20.6667 \end{bmatrix}$$
which is positive definite, hence $x^{(1)}$ and $x^{(2)}$ are minimizers.
At $x^{(3)}$, we have
$$\nabla^2 f(x^{(3)}) = \begin{bmatrix} -16.6667 & -2 \\ -2 & -16.6667 \end{bmatrix}$$
which is negative definite, hence $x^{(3)}$ is a maximizer.
At $x^{(4)}$ and $x^{(5)}$, we have
$$\nabla^2 f(x^{(4),(5)}) = \begin{bmatrix} 12.6667 & -16.6667 \\ -16.6667 & 12.6667 \end{bmatrix}$$
which is indefinite, hence $x^{(4)}$ and $x^{(5)}$ are saddle points. ■
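The whole analysis for Prob. 1.7 can be verified in one pass: each candidate point should zero the gradient, and the Hessian eigenvalues should reproduce the classification above. The sketch assumes numpy.

```python
import numpy as np

def grad(x):
    x1, x2 = x
    g = 2*(x1**2 + x2**2 - 1) - 1/3
    return np.array([ 2*(x1 - x2) + 8*x1*g,
                     -2*(x1 - x2) + 8*x2*g])

def hess(x):
    x1, x2 = x
    return np.array([[48*x1**2 + 16*x2**2 - 50/3, 32*x1*x2 - 2],
                     [32*x1*x2 - 2, 16*x1**2 + 48*x2**2 - 50/3]])

a, b = np.sqrt(7/12), np.sqrt(11/24)
points = [np.array([ a,  a]), np.array([-a, -a]),    # x(1), x(2)
          np.array([0.0, 0.0]),                      # x(3)
          np.array([ b, -b]), np.array([-b,  b])]    # x(4), x(5)

kinds = []
for x in points:
    assert np.allclose(grad(x), 0, atol=1e-10)       # each point is stationary
    ev = np.linalg.eigvalsh(hess(x))
    kinds.append("minimizer" if ev.min() > 0 else
                 "maximizer" if ev.max() < 0 else "saddle point")
print(kinds)
```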
