Download as pdf or txt
Download as pdf or txt
You are on page 1of 13

9.

1
Line Search Algorithms
Katta G. Murty, IOE 611 Lecture slides
Algorithms for one-dimensional opt. problems of form: min
f() over 0 (or a b for some a < b).
Bracket: An interval in feasible region which contains the
min.
A 2-point bracket: is
1

2
where f

(
1
) < 0 and
f

(
2
) > 0.
A 3-point bracket is
1
<
2
<
3
s. th. f(
2
)
min{f(
1
), f(
3
)}.
How to Select An Initial 3-pt. Bracket?
First consider min f() over 0 where at 0 the
function decreases as increases from 0 (otherwise 0 itself is a
local min).
Select a step length s. th. f() < f(0) (possible because of
78
above facts). Dene

0
= 0

r
=
r1
+ 2
r1
, for r = 1, 2, . . .
as long as f(
r
) keeps on decreasing, until a value k for r is found
for rst time s. th. f(
k+1
) > f(
k
).
So we have f(
k
) < f(
k1
) also. Among 4 points
k1
,
k
,
1
2
(
k
+

k+1
),
k+1
, drop either
k1
or
k+1
whichever is farther from
the point in pair {
k
,
1
2
(
k
+
k+1
)} that yields smallest value for
f(); and remaining 3 pts. form an equi-distant 3-pt. bracket.
If problem is min f() over a b it is reasonable
to assume that f() decreases as increases thro a (otherwise
a is a local min). Apply above procedure choosing suciently
small, keep dening
r
as above until either a k dened as above
is found, or until the upper bound b for is reached.
79
Methods Using 3-pt. Brackets
Golden Section Search: Works well when function is uni-
modal in initial bracket (i.e., it has unique local min there).
Quite reasonable assumption in many applications.
Let

be unknown min pt. Unimodality


1
<
2
<

then f(
1
) > f(
2
) and

<
1
<
2
then f(
2
) > f(
1
).
General Step: Let [, ] be interval of current bracket.
f(), f() would already have been computed earlier.
=
2
1+

5)
0.618 is the Golden Ratio. Let
1
= +
(1 0.618)( ),
2
= + 0.618( ). If f(
1
), f(
2
) not
available, compute them. If
f(
1
) < f(
2
) new bracket is [,
1
,
2
]
f(
1
) > f(
2
) new bracket is [
1
,
2
, ]
f(
1
) = f(
2
) new bracket interval is [
1
,
2
]
Repeat with new bracket, continue until bracket length be-
80
comes small.
Quadratic Fit Line Search Method
General Step: Let (
1
,
2
,
3
) be current 3-pt. bracket. Fit
a Quad. approx. Q() = a
2
+ b + c to f() using function
values at
1
,
2
,
3
. Because of bracket property, Q() will be
convex. Its unique min is:

=
(
2
2

2
3
)f(
1
) + (
2
3

2
1
)f(
2
) + (
2
1

2
2
)f(
3
)
2[(
2

3
)f(
1
) + (
3

1
)f(
2
) + (
1

2
)f(
3
)]
If

>
2
and f(

) > f(
2
) new bracket is [
1
,
2
,

]

>
2
and f(

) < f(
2
) new bracket is [
2
,

,
3
]

>
2
and f(

) = f(
2
) new bracket is either of the above.
If

<
2
similar to above.
If

=
2
, quad. t failed to produce a new pt. In this case:
if
3

1
= positive tolerance, stop with
2
as best pt.
81

1
> , dene

=
_

2
+ /2 if
2

1
<
3

2
/2 if
2

1
>
3

2
Compute f(

) and go to the above step again.


Terminate method when either of max{f(
1
), f(
3
)}f(
2
)
or
3

1
or |f

(
2
)| becomes smaller than a tolerance.
82
Methods Using 2-pt. Brackets
The method of Bisection: Let [a, b] be current bracket.
Compute f

((a + b)/2). If
f

((a + b)/2)
_

_
= 0 (a + b)/2 is a stationary pt., terminate
> 0 continue with [a, (a + b)/2]
< 0 continue with [(a + b)/2, b]
Not preferred, because method does not use function values.
Cubic Interpolation Method: Let [
1
,
2
] be current bracket.
Fit a cubic approx. to f() using values of f(
1
), f(
2
), f

(
1
),
f

(
2
). Because of bracket conds., min of this cubic func.

is
inside bracket. It is:

=
1
+ (
2

1
)
_
_
_1
f

(
2
) +
f

(
2
) f

(
1
) + 2
_
_
_
where =
3(f(
1
)f(
2
))

1
+f

(
1
)+f

(
2
) and = (
2
f

(
1
)f

(
2
))
1/2
.
If |f

)| is small accept

as approx. to min. Otherwise if


f

) > 0 (< 0) continue with [


1
,

] ([

,
2
]).
83
Line Search Method Based on Piecewise Linear Ap-
proximation
Let [a, b] be initial 2-pt. bracket for f().
A piecewise linear (or polyhedral) approximation for
f() in the bracket [a, b] is the pointwise supremum function of
the linearizations of f() at a & b, P() = max{f(a)+f

(a)(
a), f(b) + f

(b)( b)}.
From bracket conds., the min of P() occurs in [a, b] at the
point where the two linearizations are equal, it is a + d
P
where
d
P
=
f(a) f(b) f

(b)(a b)
f

(b) f

(a)
The quadratic Taylor series approximation of f() at a is
Q() = f(a) +f

(a)( a) +
1
2
( a)
2
f

(a). The min of Q()


occurs at a + d
Q
where
d
Q
=
_

_
f

(a)/f

(a) if f

(a) > 0
if f

(a) 0
One simple strategy is to take the new point to be
1
=
a + min{d
P
, d
Q
} if d
Q
> 0, or a + d
P
otherwise; and take the
next bracket to be the 2-pt. bracket among [a,
1
], [
1
, b] and
84
continue.
But C. Lemarechal & R. Miin make several modications
to guarantee convergence to a stationary pt. even when f() is
nonconvex. Also see Murty LCLNP.
85
Newtons Method for line search
2nd order method for: min f() over entire real line.
It is application of Newton-Raphson method to solve the eq.
f

() = 0.
Starting with initial point
0
, generates iterates by

r+1
=
r

(
r
)
f

(
r
)
assuming all f

(
r
) > 0. Not suitable if 2nd derivative 0 at a
point encountered.
Suitable if a near opt. sol. known, then function is locally
convex in the nbhd. of opt.
Secant Method: A modied Newtons method. Method ini-
tiated with a 2-pt. bracket, and replaces f

(
r
) by the nite
dierence approx.
f

(
r
)f

(
r1
)

r1
in Newton formula.
86
Inexact Line Search Procedures
Exact line searches expensive in subroutines for solving higher
dimensional min. problems. Inexact line searches with sucient
degree of descent do guarantee convergence of overall algo.
Consider: min f() = ( x + y) over 0. x is current
point, y is a descent direction at x.
Conditions for Global Convergence
1. Sucient Rate of Descent Condition: If

is step
length choosen, new pt. is x +

y. Cond. is that average rate
of descent from x to x +

y be specied fraction of initial rate
of decrease in theta(x) at x in direction y. Select (0, 1) and
require

satisfy:
f(

) f(0) +

f

(0)
2. Step Length not too Small: Cond. 1 will always be
satised for any < 1 by suciently small steps. This requires
that step lengths are not too small. Several equivalent ways to
87
ensure this. One is to require that

satisfy:
(2.1) f

) f

(0)
for some selected (, 1). Sats step must be long enough that
rate of change in f at

is specied fraction of its magnitude
at 0.
Theorem: If (x) is bounded below on R
n
, ( x)y < 0,
0 < < < 1, there exists 0 <
1
<
2
s. th. for any
[
1
,
2
], x + y satises both conds. 1 & (2.1).
Theorem: (x) is cont. di., and there exists 0 s. th.
||(z) (x)|| ||z x|| x, z R
n
. Starting with x
0
suppose the sequence {x
r
} generated by choosing y
r
satisfying
(x
r
)y
r
< 0, and the iteration x
r+1
= x
r
+
r
y
r
where
r
satises conds. 1 & (2.1). Then one of following holds.
(a) Either (x
k
) = 0 for some k, or
(b) lim
r
f(x
r
) = , or
(c) lim
r
(x
r
)y
r
||y
r
||
= 0.
88
From (c) we know that unless the angle between (x
r
) & y
r
converges to 90
0
as r , either (x
r
) 0, or f(x
r
) ,
or both.
If we can guarantee that angle between (x
r
) & y
r
is bounded
away from 90
0
(this can be guaranteed by proper choice of y
r
,
for example in Quasi-Newton methods y
r
= H
r
(x
r
) where
H
r
is PD, this property will hold if cond. numbers of {H
r
} are
uniformly bounded above) then either (x
r
) 0, or f(x
r
)
, or both.
Other equivalent conds. to ensure step lengths not too small
are:
(2.2) f(

) f(0) +

(0)
for some selected (0, ).
89
A simple procedure to satisfy these conds. uses 0 <
1
< 1 <
2
(
1
= 0.2,
2
= 2 commonly used). plays role of in above.
Cond. 1 is equivalent to requiring step length

satisfy f(

)
(

) = f(0) +

1
f

(0).
Cond. 2 is enforced by requiring f(
2

) > (
2

) = f(0) +

1
f

(0).
Step 1: See if length of 1 satises both conds. 1 and 2. If so,
select

= 1 as step length and terminate procedure. If step
length of 1 violates cond. 1 [cond. 2] go to Step 2 [Step 3].
Step 2: Take

=
1

t
2
where t is the smallest integer > 1 for which
f(
1

t
2
) (
1

t
2
), and terminate procedure.
Step 3: Take

=
t
2
where t is largest integer 0 for which
f(
t
2
) (
t
2
), and terminate procedure.
Other ways of implementing procedure uses quadratic ts to
determine new step lengths when current step length (step length
= 1 initially) violates one of the conds.
90

You might also like