
ODS: Solution-TT 1
ODS EC-S 1-14

Task 1 (Line Search Methods)

The solution of this exercise is given on the exercise course slides, which can be downloaded from the ODS website.

Task 2 (Trust Region Method)

The solution of this exercise is given on the exercise course slides, which can be downloaded from the ODS website.

Task 3 (Dogleg Method)

Introduction:
The Dogleg method searches for the minimum $p_{\text{opt}}$ of the quadratic subproblem on a path defined by three points:

the unconstrained minimum of the given quadratic problem, which is
$$p_{ND} = -\left(H^{(k)}\right)^{-1} \frac{dJ(\theta)}{d\theta}\bigg|_{\theta^{(k)}},$$

the minimum along the steepest descent direction, given by
$$p_t = -\frac{g^T g}{g^T B g}\, g \qquad \text{with} \qquad g = \frac{dJ(\theta)}{d\theta}\bigg|_{\theta^{(k)}},$$

and the initial parameter value $p = 0$.


This is shown in the following figure as an example in the banana scenario.

[Figure: dogleg path in the banana scenario, showing the initial value, the global minimum, and the points $p_t$, $p_s$, and $p_{ND}$.]

The red lines are the lines of the contour plot of the quadratic approximation of the objective function.
$J(\theta)$ is decreasing along this path. Therefore, only the intersection of the TR with this path needs to be calculated to determine an approximation of the minimum. The possible location of the minimum of the constrained optimization problem will be one of the following three cases (a sketch of this case distinction is given after the list):
1. If $p_{ND}$ is inside the trust region, then $p_{ND}$ is the solution of the quadratic optimization problem.
2. If $p_{ND}$ is outside of the trust region and $p_t$ is inside the TR, the solution lies somewhere on the path between $p_{ND}$ and $p_t$.
3. If $p_t$ is outside the trust region, the solution lies on the path between $0$ and $p_t$.
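A minimal NumPy sketch of this case distinction (not part of the original solution; the model gradient g, the quasi-Hessian B and the trust-region radius, here called Delta, are assumed to be given):

import numpy as np

def dogleg_case(g, B, Delta):
    """Return the dogleg case (1, 2 or 3) and, where it is already known, the step."""
    p_nd = -np.linalg.solve(B, g)                # unconstrained (Newton) minimum p_ND
    p_t = -(g @ g) / (g @ B @ g) * g             # minimum along the steepest descent direction
    if np.linalg.norm(p_nd) <= Delta:
        return 1, p_nd                           # case 1: p_ND is the solution
    if np.linalg.norm(p_t) <= Delta:
        return 2, None                           # case 2: boundary point between p_t and p_ND (see below)
    return 3, Delta * p_t / np.linalg.norm(p_t)  # case 3: scale p_t back onto the boundary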
Situation a:
The unconstrained minimum is given by
$$p_{\text{opt}} = p_{ND} = \begin{bmatrix} 1 \\ 0.25 \end{bmatrix}.$$
This lies inside the trust region, and therefore $p_{\text{opt}} = p_{ND}$.
Situation b:
Now, $p_{ND}$ is located outside the TR, while
$$p_t = \begin{bmatrix} 0.833 \\ 0.417 \end{bmatrix}$$
is located inside the trust region. In this case, we need to calculate the intersection point of the line connecting $p_t$ with $p_{ND}$ and the boundary of the TR.
This can be done by setting up the equation which describes the path between the two points and then (if the trust region is modeled as a circle) equating its norm with the radius $\Delta$ of the TR:



$$\Big\| \underbrace{p_t + (\tau - 1)\left(p_{ND} - p_t\right)}_{p_s} \Big\|^2 = \Delta^2 \qquad \text{with} \qquad 1 < \tau < 2.$$
This equation can be reformed into a quadratic equation in $(\tau - 1)$:
$$\left\| p_{ND} - p_t \right\|^2 (\tau - 1)^2 + 2\, p_t^T \left( p_{ND} - p_t \right) (\tau - 1) + \left\| p_t \right\|^2 = \Delta^2.$$
This equation yields two solutions for $\tau$. For the given problem, they are $\tau_1 = 1.7343$ and $\tau_2 = 2.2343$. $\tau_2$ corresponds to the intersection of the line with the boundary of the trust region on the other side and is not the desired solution. The solution is determined by inserting $\tau_1 = 1.7343$ into the path equation above:

$$p_{\text{opt}} = \begin{bmatrix} 0.9557 \\ 0.2943 \end{bmatrix}.$$
(The solution lies on the boundary of the trust region.)
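For completeness, a small NumPy sketch of this intersection computation (a sketch assuming $p_t$, $p_{ND}$ and the radius $\Delta$ are given; not the original exercise code):

import numpy as np

def dogleg_boundary_point(p_t, p_nd, Delta):
    """Intersection of p_s(tau) = p_t + (tau - 1)*(p_nd - p_t), 1 <= tau <= 2,
    with the trust-region boundary ||p|| = Delta (case 2 above)."""
    d = p_nd - p_t
    a = d @ d                                  # ||p_ND - p_t||^2
    b = 2.0 * (p_t @ d)                        # 2 * p_t^T (p_ND - p_t)
    c = p_t @ p_t - Delta**2                   # ||p_t||^2 - Delta^2
    # quadratic a*(tau-1)^2 + b*(tau-1) + c = 0; since ||p_t|| < Delta we have c < 0,
    # and only the positive root lies on the segment towards p_ND
    tau = 1.0 + (-b + np.sqrt(b**2 - 4.0 * a * c)) / (2.0 * a)
    return tau, p_t + (tau - 1.0) * d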
Situation c:
This problem cannot be solved using the Dogleg method because the Hessian is not positive definite. The problem needs to be redefined by using a positive definite quasi-Hessian matrix.

Situation d:
Now $p_t$ is located outside of the TR. In this case, we need to calculate the intersection point of the path connecting $0$ with $p_t$ and the boundary of the TR:
$$p = \frac{\Delta}{\| p_t \|}\, p_t = \begin{bmatrix} 0.9285 \\ 0.3714 \end{bmatrix}.$$

Task 4 (Line Search and Trust Region)


a) The minimum is given by
$$\theta^* = -B^{-1} g.$$
(The calculation was done in the lecture.)
b) If $B$ is negative definite, the problem is not bounded. Not bounded means that the minimum tends to $-\infty$ if at least one parameter tends to $\pm\infty$.


If the problem is only positive semidefinite (at least one eigenvalue of $B$ is zero), there is more than one minimum.
c) For an arbitrary parameter vector $\theta^{(0)}$, the gradient and the Hessian matrix are given by
$$\frac{dJ(\theta)}{d\theta}\bigg|_{\theta^{(0)}} = g + B\,\theta^{(0)} \qquad \text{and} \qquad \frac{d^2 J(\theta)}{(d\theta)^2}\bigg|_{\theta^{(0)}} = B.$$
This results in the line search direction (sparing the iteration index)
$$p = -B^{-1}\left( g + B\,\theta^{(0)} \right).$$
The optimal step length $\alpha^*$ can be calculated by
$$\frac{dJ\left(\theta^{(0)} + \alpha p\right)}{d\alpha} \overset{!}{=} 0 \quad \Rightarrow \quad \alpha = -\frac{g^T p + p^T B\,\theta^{(0)}}{p^T B p}.$$
If we insert this into the update formula of the parameter vector, we get
$$\theta^{(1)} = \theta^{(0)} + \alpha p = \theta^{(0)} - \frac{g^T p + p^T B\,\theta^{(0)}}{p^T B p}\, p = -B^{-1} g = \theta^*.$$
We see that the Newton method reaches the optimum of a quadratic optimization problem after one iteration, regardless of the starting point. If the distance to the minimum is small enough, every function can locally be approximated by a quadratic function, and therefore the Newton method reaches the minimum after one step if the minimum is close enough to the starting point.
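A quick numerical check of this one-step property (the matrix B, the vector g and the starting point are illustrative values, not data from the exercise):

import numpy as np

B = np.array([[4.0, 1.0], [1.0, 3.0]])           # positive definite model Hessian
g = np.array([1.0, -2.0])
theta0 = np.array([5.0, -7.0])                   # arbitrary starting point

p = -np.linalg.solve(B, g + B @ theta0)          # Newton search direction at theta0
alpha = -(g @ p + p @ B @ theta0) / (p @ B @ p)  # optimal step length (evaluates to 1)
theta1 = theta0 + alpha * p

print(theta1)                                    # equals theta* = -B^{-1} g ...
print(-np.linalg.solve(B, g))                    # ... regardless of the starting point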

d) For $\theta^{(0)} = 0$, the gradient and the Hessian matrix are given by
$$\frac{dJ(\theta)}{d\theta}\bigg|_{\theta^{(0)}} = g \qquad \text{and} \qquad \frac{d^2 J(\theta)}{(d\theta)^2}\bigg|_{\theta^{(0)}} = B,$$
and the steepest descent search direction is simply
$$p = -g.$$

If the same steps as in the exercise before are performed


$$\frac{dJ\left(\theta^{(0)} + \alpha p\right)}{d\alpha} \overset{!}{=} 0 \quad \Rightarrow \quad \alpha = -\frac{g^T p}{p^T B p} = \frac{g^T g}{g^T B g}.$$
This yields the minimum along the steepest descent direction as
$$\theta^{(1)} = -\frac{g^T g}{g^T B g}\, g.$$
(This result is used in the dogleg method as the point $p_t$.)
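The same check for the steepest descent step from $\theta^{(0)} = 0$ (again with illustrative B and g); in contrast to the Newton step, one iteration does not reach $\theta^*$:

import numpy as np

B = np.array([[4.0, 1.0], [1.0, 3.0]])
g = np.array([1.0, -2.0])

p = -g                                 # steepest descent direction at theta0 = 0
alpha = (g @ g) / (g @ B @ g)          # optimal step length from the derivation above
theta1 = alpha * p                     # = -(g^T g)/(g^T B g) * g, i.e. the point p_t

print(theta1)                          # one steepest descent step
print(-np.linalg.solve(B, g))          # true minimum theta*; generally different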


e) As the trust region is not bounded, the quadratic subproblem is not constrained and therefore the solution is given by the full Newton update step (see exercises a) and c)).
The trust region method uses a quadratic approximation of the original problem to solve the original optimization problem iteratively. As the original problem in this exercise is itself a quadratic problem, $\rho^{(k)}$ will always be 1 regardless of the size of the TR.

Task 5 (Quasi-Newton Method)

a) The secant equation is
$$\underbrace{B^{(k+1)}}_{(l,l)}\, \underbrace{s^{(k)}}_{(l,1)} = \underbrace{y^{(k)}}_{(l,1)}$$
with
$$y^{(k)} = \frac{dJ(\theta)}{d\theta}\bigg|_{\theta^{(k+1)}} - \frac{dJ(\theta)}{d\theta}\bigg|_{\theta^{(k)}} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} - \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
and
$$s^{(k)} = \theta^{(k+1)} - \theta^{(k)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$
As the quasi-Hessian needs to be symmetric, $B^{(k+1)}$ has the form
$$B^{(k+1)} = \begin{bmatrix} a & b \\ b & c \end{bmatrix}, \qquad a, b, c \in \mathbb{R}.$$
As $B^{(k+1)}$ also needs to be positive definite, this results in the following conditions:
$$a, c > 0, \qquad \det \begin{bmatrix} a & b \\ b & c \end{bmatrix} > 0 \;\Leftrightarrow\; ac - b^2 > 0.$$
Together with the conditions from the secant equation for the given problem,
$$a + b = 1, \qquad b + c = 2,$$
we have a system of two equality and two inequality conditions for three parameters. This system of equations is underdetermined. As a consequence, the solution set of this problem is infinite. Therefore, the algorithms based on the secant equation introduce additional conditions (e.g. $\min \| B^{(k+1)} - B^{(k)} \|$).
b) As the Broyden-Fletcher-Goldfarb-Shanno (BFGS) formula and the Davidon-Fletcher-Powell (DFP) formula are quite long, two m-files were written to do the calculation (BFGS.m and DFP.m). Both methods yield
$$B^{(k+1)} = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}.$$
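As the m-files are not reproduced here, the following is a minimal NumPy sketch of the two textbook update formulas for the Hessian approximation. The previous approximation $B^{(k)}$ from the exercise is not reproduced, so the sketch only checks that the updated matrix satisfies the secant equation:

import numpy as np

def bfgs_update(B, s, y):
    """BFGS update of the Hessian approximation B (textbook form)."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

def dfp_update(B, s, y):
    """DFP update of the Hessian approximation B (textbook form)."""
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(y, s)
    return V @ B @ V.T + rho * np.outer(y, y)

s = np.array([1.0, 1.0])                    # s^(k) from a)
y = np.array([1.0, 2.0])                    # y^(k) from a)
# np.eye(2) is only a stand-in for the unknown B^(k) of the exercise
print(bfgs_update(np.eye(2), s, y) @ s)     # approx. [1, 2] = y: secant equation holds
print(dfp_update(np.eye(2), s, y) @ s)      # approx. [1, 2] = y: secant equation holds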

c) For the BFGS method the result is
$$B^{(k+1)}_{\text{BFGS}} = \begin{bmatrix} 133.47 & 6.522 \\ 6.522 & 81.478 \end{bmatrix}.$$
The result of the DFP method does not differ much.



d) The accuracy can be increased if $s^{(k)}$ is decreased. But most of the time, the optimization will work fine with a less accurate approximation of the Hessian matrix. Therefore, it is not worthwhile to spend more processing time on calculating a more accurate Hessian matrix.
e) The benefit of the quasi-Newton method compared to the Newton method is that the Hessian matrix no longer needs to be provided. Also, the Hessian matrix is always positive definite if approximated with the BFGS or DFP algorithm. In real implementations, the big benefit is that it is possible to approximate the inverse of the Hessian matrix directly. This saves a lot of calculation time in higher-dimensional problems (further information can be found in Nocedal's book if you are interested in this).
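To illustrate the last point, the inverse $H \approx B^{-1}$ can be updated directly; a sketch of the common textbook BFGS form for the inverse (not taken from the exercise files):

import numpy as np

def bfgs_inverse_update(H, s, y):
    """Update H, an approximation of the inverse Hessian, without any matrix inversion."""
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

# the quasi-Newton search direction then only needs a matrix-vector product:
# p = -H @ gradient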

Task 6 (Steepest Descent Optimization)

The result is shown in the figure.

Task 7 (Sectioning Optimization)

a)

b) Answers:
No, it must be possible to minimize over the parameters independently. This is usually possible, but not always. (For example, in some NLSQ parameter fit problems this approach will not work.)
Pro: it is the simplest multidimensional optimization algorithm and is therefore widely used in practice.
Con: the algorithm is very likely to get stuck in valleys (e.g. in the banana scenario).

Task 8 (Sectioning Procedure)

a)
Our objective function is given by
$$J(\theta) = \begin{bmatrix} \theta_1 & \theta_2 \end{bmatrix} \begin{bmatrix} 4 & 2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}$$
and the initial parameter vector by
$$\theta^{(0)} = \begin{bmatrix} -6 & -4 \end{bmatrix}^T.$$
The first iteration of the sectioning procedure optimizes this function with respect to $\theta_1$ and keeps $\theta_2$ constant:
$$\min_{\theta_1} \left. \begin{bmatrix} \theta_1 & \theta_2 \end{bmatrix} \begin{bmatrix} 4 & 2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} \right|_{\theta_2 = \theta_2^{(0)}}.$$
The new objective function, now only dependent on $\theta_1$, is
$$J(\theta_1) = \begin{bmatrix} \theta_1 & -4 \end{bmatrix} \begin{bmatrix} 4 & 2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} \theta_1 \\ -4 \end{bmatrix} = 4\theta_1^2 - 12\theta_1 + 32.$$
Using the first-order condition, the minimum of this function can be determined (of course, a computer would use numerical techniques or possibly search techniques as well):
$$\frac{dJ(\theta_1)}{d\theta_1} = 8\theta_1 - 12 \overset{!}{=} 0 \quad \Leftrightarrow \quad 2\theta_1 - 3 = 0 \quad \Rightarrow \quad \theta_1 = 1.5.$$
With this, the result of the first iteration is given by
$$\theta^{(1)} = \begin{bmatrix} 1.5 \\ -4 \end{bmatrix}.$$
In the next iteration, the sectioning procedure would optimize the objective function with respect to $\theta_2$ and keep $\theta_1$ constant. This procedure is repeated until a stopping criterion is reached.
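A short numerical check of this first iteration (using the matrix and the reconstructed starting point from above; only the closed-form inner minimization is used):

import numpy as np

A = np.array([[4.0, 2.0], [1.0, 2.0]])
J = lambda theta: theta @ A @ theta     # J(theta) = theta^T A theta

theta = np.array([-6.0, -4.0])          # theta^(0)
print(J(theta))                         # objective before the first iteration
theta[0] = -3.0 / 8.0 * theta[1]        # minimize over theta_1 with theta_2 fixed (8*theta_1 + 3*theta_2 = 0)
print(theta, J(theta))                  # -> [ 1.5 -4. ] and a smaller objective value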
b)
By generalizing the procedure above, the minimum of the function with respect to $\theta_1$ and fixed $\theta_2$ can be determined by


$$\frac{dJ(\theta)}{d\theta_1} = 8\theta_1 + 3\theta_2 \overset{!}{=} 0 \quad \Rightarrow \quad \theta_1 = -\frac{3}{8}\,\theta_2.$$
With this, the general result of the first taxi-cab iteration for the given problem is
$$\theta^{(1)} = \begin{bmatrix} -\frac{3}{8}\,\theta_2^{(0)} \\[6pt] \theta_2^{(0)} \end{bmatrix}.$$
We see that $\theta_1^{(0)}$ is irrelevant, because it is replaced by the result of our first search, which only depends on $\theta_2^{(0)}$. And the outcome of the second taxi-cab iteration is
$$\theta^{(2)} = \begin{bmatrix} \theta_1^{(1)} \\[6pt] -\frac{3}{4}\,\theta_1^{(1)} \end{bmatrix} = \begin{bmatrix} -\frac{3}{8}\,\theta_2^{(0)} \\[6pt] \frac{3}{4}\cdot\frac{3}{8}\,\theta_2^{(0)} \end{bmatrix}.$$

Afterwards, we can set up a generic equation which returns the resulting $\theta^{(i)}$ for every iteration:
$$\theta^{(2k)} = \begin{bmatrix} -\frac{3}{8}\left(\frac{3}{8}\cdot\frac{3}{4}\right)^{k-1}\theta_2^{(0)} \\[8pt] \left(\frac{3}{8}\cdot\frac{3}{4}\right)^{k}\theta_2^{(0)} \end{bmatrix} \qquad (1)$$
$$\theta^{(2k+1)} = \begin{bmatrix} -\frac{3}{8}\left(\frac{3}{8}\cdot\frac{3}{4}\right)^{k}\theta_2^{(0)} \\[8pt] \left(\frac{3}{8}\cdot\frac{3}{4}\right)^{k}\theta_2^{(0)} \end{bmatrix} \qquad (2)$$
with $k \in \mathbb{N}^*$.
c) As we see from (1) and (2), a sectioning algorithm would never reach the real minimum at $\theta^* = \begin{bmatrix} 0 & 0 \end{bmatrix}^T$ exactly (or it would need an infinite number of iterations).
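A small loop that reproduces this behaviour numerically (same reconstructed data as above; the coordinate-wise minimizers are taken from the derivation in b)):

import numpy as np

theta = np.array([-6.0, -4.0])              # theta^(0)
for k in range(1, 7):
    if k % 2 == 1:
        theta[0] = -3.0 / 8.0 * theta[1]    # minimize over theta_1, theta_2 fixed
    else:
        theta[1] = -3.0 / 4.0 * theta[0]    # minimize over theta_2, theta_1 fixed
    print(k, theta)
# |theta_2| shrinks by the factor (3/8)*(3/4) = 9/32 per two iterations, so the iterates
# approach theta* = [0, 0] geometrically but never reach it exactly, as stated in c)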

Task 9 (Simplex-Downhill (Nelder-Mead) Algorithm 1)

Task 10 (Simplex-Downhill (Nelder-Mead) Algorithm 2)

The following enumeration describes the simplex-downhill steps which are shown in the following figure; a schematic sketch of the corresponding update rules is given after the list.
1. Initial simplex.
2. First, reflect the worst vertex. The new vertex is better than our best vertex, so we expand the simplex.
3. Same procedure as in step 2 with the new worst vertex.
4. Reflection of the worst vertex. Here, a reflection is not possible, as this would result in a vertex outside the feasible region.
5. Reflection of the worst vertex (we assume that now the reflected vertex is the second best vertex).
6. Contraction of the worst vertex (as the resulting point of the reflection would be worse than the other points, we contract towards the inside).
7. Same procedure as before.
8. Reflection.
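A schematic Python sketch of the reflect/expand/contract decision used in these steps (simplified: the shrink step and the feasibility check needed in step 4 are not shown, and the coefficients are the usual textbook choices, not values from the exercise):

import numpy as np

def nelder_mead_step(simplex, f, alpha=1.0, gamma=2.0, beta=0.5):
    """One simplex-downhill update: reflect the worst vertex, then expand,
    accept, or contract depending on the objective values."""
    simplex = sorted(simplex, key=f)                 # best vertex first, worst vertex last
    best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
    centroid = np.mean(simplex[:-1], axis=0)         # centroid of all vertices except the worst

    reflected = centroid + alpha * (centroid - worst)
    if f(reflected) < f(best):
        expanded = centroid + gamma * (centroid - worst)            # expansion (steps 2, 3)
        simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
    elif f(reflected) < f(second_worst):
        simplex[-1] = reflected                                     # plain reflection (steps 5, 8)
    else:
        simplex[-1] = centroid + beta * (worst - centroid)          # inside contraction (steps 6, 7)
    return simplex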

Task 11 (NLSQ Parameter Fit Problem)

a) One possible (and the most common) objective function would be to minimize the sum of squares of the residuals:
$$J(\theta) = \frac{1}{2} \begin{bmatrix} r^{[1]}(\theta) & r^{[2]}(\theta) & r^{[3]}(\theta) & r^{[4]}(\theta) \end{bmatrix} \begin{bmatrix} r^{[1]}(\theta) \\ r^{[2]}(\theta) \\ r^{[3]}(\theta) \\ r^{[4]}(\theta) \end{bmatrix}$$
with
$$r^{[i]}(\theta) = y^{[i]} - u^{[i]}\theta_1 \exp\!\left(u^{[i]}\theta_2\right)$$
(e.g. for $i = 1$: $r^{[1]}(\theta) = 3 - 1\cdot\theta_1 \exp(1\cdot\theta_2)$).
b) The gradient can be calculated by
$$\frac{dJ(\theta)}{d\theta} = \begin{bmatrix} \dfrac{dr^{[1]}(\theta)}{d\theta} & \cdots & \dfrac{dr^{[4]}(\theta)}{d\theta} \end{bmatrix} \begin{bmatrix} r^{[1]}(\theta) \\ \vdots \\ r^{[4]}(\theta) \end{bmatrix} = \underbrace{\begin{bmatrix} -u^{[1]} \exp\!\left(u^{[1]}\theta_2\right) & \cdots & -u^{[4]} \exp\!\left(u^{[4]}\theta_2\right) \\ -\left(u^{[1]}\right)^2 \theta_1 \exp\!\left(u^{[1]}\theta_2\right) & \cdots & -\left(u^{[4]}\right)^2 \theta_1 \exp\!\left(u^{[4]}\theta_2\right) \end{bmatrix}}_{G(\theta)} \begin{bmatrix} r^{[1]}(\theta) \\ \vdots \\ r^{[4]}(\theta) \end{bmatrix}.$$
If the term
$$\sum_{j=1}^{N} r^{[j]}(\theta)\, \frac{d^2 r^{[j]}(\theta)}{(d\theta)^2}$$
is omitted, an approximation of the Hessian matrix can be determined by
$$\frac{d^2 J(\theta)}{(d\theta)^2} \approx G(\theta)\, G(\theta)^T.$$

c) As the gradient and at least one approximation of the Hessian matrix can be calculated easily, a Newton-based optimization technique is recommended to solve this problem.
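Putting a), b) and c) together, a minimal Gauss-Newton sketch: only the pair $(u^{[1]}, y^{[1]}) = (1, 3)$ is taken from the task; the remaining data points, the initial guess and the fixed iteration count are placeholders, and damping/line search is omitted:

import numpy as np

u = np.array([1.0, 2.0, 3.0, 4.0])      # u^[1] = 1 is from the task, the rest are placeholders
y = np.array([3.0, 6.0, 9.0, 12.0])     # y^[1] = 3 is from the task, the rest are placeholders

def residuals(theta):
    return y - u * theta[0] * np.exp(u * theta[1])

def G(theta):
    """2 x N matrix of residual derivatives dr^[i]/dtheta, as in b)."""
    e = np.exp(u * theta[1])
    return np.vstack([-u * e,                        # dr^[i]/dtheta_1
                      -u**2 * theta[0] * e])         # dr^[i]/dtheta_2

theta = np.array([2.5, 0.1])                         # placeholder initial guess
for _ in range(10):
    r = residuals(theta)
    grad = G(theta) @ r                              # gradient from b)
    H = G(theta) @ G(theta).T                        # Gauss-Newton Hessian approximation
    theta = theta - np.linalg.solve(H, grad)         # undamped Newton-type update

print(theta, residuals(theta))                       # approaches roughly [3, 0] for this placeholder data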
