
E1 251 Linear and Nonlinear Optimization

Chapter 10: Quasi-Newton method
10.1. The basic idea

x_{k+1} = x_k − α_k S_k ∇f(x_k)

S_k: approximation of the inverse Hessian.
Positive definiteness of S_k guarantees the existence of α_k such
that f(x_{k+1}) < f(x_k).

When applied to the quadratic function f(x) = 0.5 x^T Q x − b^T x,
the optimal α_k is given by

α_k = (g_k^T S_k g_k) / (g_k^T S_k Q S_k g_k)
10.2. Convergence rate for the quadratic function

E(x_k) = (1/2)(x_k − x*)^T Q (x_k − x*) = (1/2) y_k^T Q y_k, where y_k = x_k − x*.

E(x_{k+1}) = (1/2)(x_{k+1} − x*)^T Q (x_{k+1} − x*)

Substitute x_{k+1} = x_k − α_k S_k g_k:

E(x_{k+1}) = (1/2)(x_k − α_k S_k g_k − x*)^T Q (x_k − α_k S_k g_k − x*)
           = (1/2)(y_k − α_k S_k g_k)^T Q (y_k − α_k S_k g_k)
           = (1/2) y_k^T Q y_k − α_k g_k^T S_k Q y_k + (1/2) α_k^2 g_k^T S_k Q S_k g_k
           = [1 − (2 α_k g_k^T S_k Q y_k − α_k^2 g_k^T S_k Q S_k g_k) / (y_k^T Q y_k)] ((1/2) y_k^T Q y_k)
Substitute for α_k:

E(x_{k+1}) = [1 − (2 (g_k^T S_k g_k / g_k^T S_k Q S_k g_k) g_k^T S_k Q y_k
             − (g_k^T S_k g_k / g_k^T S_k Q S_k g_k)^2 g_k^T S_k Q S_k g_k) / (y_k^T Q y_k)] E(x_k)

Using the relation Q y_k = g_k (so that y_k^T Q y_k = g_k^T Q^{-1} g_k) we get

E(x_{k+1}) = [1 − (2 (g_k^T S_k g_k)^2 / (g_k^T S_k Q S_k g_k)
             − (g_k^T S_k g_k)^2 / (g_k^T S_k Q S_k g_k)) / (g_k^T Q^{-1} g_k)] E(x_k)

           = [1 − (g_k^T S_k g_k)^2 / ((g_k^T S_k Q S_k g_k)(g_k^T Q^{-1} g_k))] E(x_k) = [1 − γ_k] E(x_k)
γ_k = (g_k^T S_k g_k)^2 / ((g_k^T S_k Q S_k g_k)(g_k^T Q^{-1} g_k)).

Letting T_k = S_k^{1/2} Q S_k^{1/2} and p_k = S_k^{1/2} g_k gives

γ_k = (p_k^T p_k)^2 / ((p_k^T T_k p_k)(p_k^T T_k^{-1} p_k)).

To bound the convergence rate we need to minimize γ_k over p_k.
As in Chapter 8, we get

γ_min = 4 a_k A_k / (a_k + A_k)^2, where a_k and A_k are the smallest and
largest eigenvalues of T_k.
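The Kantorovich-type bound γ_k ≥ γ_min can be checked numerically. A small numpy sketch; Q and S are illustrative choices, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative SPD matrices Q and S_k.
M = rng.standard_normal((4, 4))
Q = M @ M.T + 4 * np.eye(4)
S = np.diag([1.0, 2.0, 0.5, 1.5])

# gamma_k = (g^T S g)^2 / ((g^T S Q S g)(g^T Q^{-1} g))
def gamma(g):
    Sg = S @ g
    return (g @ Sg) ** 2 / ((Sg @ Q @ Sg) * (g @ np.linalg.solve(Q, g)))

# Smallest/largest eigenvalues of T_k = S^{1/2} Q S^{1/2}
sqrtS = np.diag(np.sqrt(np.diag(S)))
eigs = np.linalg.eigvalsh(sqrtS @ Q @ sqrtS)
a, A = eigs[0], eigs[-1]
gamma_min = 4 * a * A / (a + A) ** 2

# gamma_k >= gamma_min for every gradient direction g
assert all(gamma(rng.standard_normal(4)) >= gamma_min - 1e-12 for _ in range(100))
print("bound holds")
```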
10.3. Constructing and updating the inverse Hessian

1) We know that for a quadratic function, (1/2)x^T Q x − x^T b, the Hessian is
constant and is given by F = Q, and ∇f(x) = Fx − b.

2) Let g_k = ∇f(x_k), p_k = x_{k+1} − x_k and q_k = g_{k+1} − g_k. Then q_k = F p_k.

3) Evaluation of q_k and p_k for k = 0, ..., n−1 completely determines F or F^{-1}.

The main idea of all quasi-Newton methods is to construct, for each k, an
approximate inverse Hessian H_{k+1} based on {q_j, p_j}_{j=0}^{k} in such a way
that H_{k+1} satisfies
H_{k+1} q_i = p_i for all 0 ≤ i ≤ k.
The update rule for H_{k+1} is constructed in such a way that H_n = F^{-1}.
10.4. The general form of quasi-Newton algorithms

Initialization: x_0, g_0 = F x_0 − b, H_0 = I.
For k = 0, ..., n−1 do
(1) d_k = −H_k g_k
(2) Compute α_k = −(d_k^T g_k) / (d_k^T F d_k)
(3) x_{k+1} = x_k + α_k d_k, g_{k+1} = g_k + α_k F d_k
    p_k = x_{k+1} − x_k = α_k d_k, q_k = g_{k+1} − g_k = α_k F d_k
(4) H_{k+1} = H_k + U_k, where the correction term, U_k,
    is chosen such that H_{k+1} q_i = p_i for all 0 ≤ i ≤ k.

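The general form can be sketched as a loop that takes the update rule as a parameter. A minimal numpy sketch; the quadratic data are illustrative, and the placeholder update U_k = 0 (which reduces the scheme to steepest descent) stands in for the corrections developed in the next sections:

```python
import numpy as np

def quasi_newton_quadratic(F, b, x0, update, n_iter):
    """General form from Section 10.4; `update` maps (H_k, p_k, q_k) to H_{k+1}."""
    x, H = x0.astype(float), np.eye(len(x0))
    g = F @ x - b
    for _ in range(n_iter):
        d = -H @ g                          # (1) search direction
        alpha = -(d @ g) / (d @ F @ d)      # (2) exact step for the quadratic
        x_new = x + alpha * d               # (3) update iterate and gradient
        g_new = g + alpha * (F @ d)
        p, q = x_new - x, g_new - g
        H = update(H, p, q)                 # (4) H_{k+1} = H_k + U_k
        x, g = x_new, g_new
    return x

# With U_k = 0 the scheme reduces to steepest descent (placeholder update).
F = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = quasi_newton_quadratic(F, b, np.zeros(2), lambda H, p, q: H, 100)
print(np.allclose(F @ x, b))               # -> True
```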
10.5. The rank one correction

The idea is to make H_{k+1} satisfy H_{k+1} q_k = p_k:

H_{k+1} = H_k + (p_k − H_k q_k)(p_k − H_k q_k)^T / (q_k^T (p_k − H_k q_k))

Check:

H_{k+1} q_k = H_k q_k + (p_k − H_k q_k)(p_k − H_k q_k)^T q_k / (q_k^T (p_k − H_k q_k))
            = H_k q_k + (p_k − H_k q_k) = p_k
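The check above can be reproduced numerically. A minimal numpy sketch of the rank one correction; F, p, and the starting H are illustrative:

```python
import numpy as np

def rank_one_update(H, p, q):
    """Rank one correction: H_{k+1} = H_k + (p - Hq)(p - Hq)^T / (q^T (p - Hq))."""
    r = p - H @ q
    return H + np.outer(r, r) / (q @ r)

# Illustrative data: a quadratic with Hessian F, one step (p, q = F p).
F = np.array([[2.0, 0.5], [0.5, 1.0]])
H = np.eye(2)
p = np.array([1.0, -1.0])
q = F @ p

H1 = rank_one_update(H, p, q)
print(np.allclose(H1 @ q, p))   # secant condition H_{k+1} q_k = p_k -> True
```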
To verify H_{k+1} q_i = p_i, for i ∈ {0, ..., k} ..........(E1)
Step 1: Verify for k = 0,
which requires only verifying H_{k+1} q_k = p_k:
done on the previous slide.
Step 2: Assume H_k q_i = p_i, for i ∈ {0, ..., k−1} ....(E2)
Step 3: Prove H_{k+1} q_i = p_i, for i ∈ {0, ..., k}:
3.a) Verify H_{k+1} q_k = p_k (done).
3.b) Verify H_{k+1} q_i = p_i, for i ∈ {0, ..., k−1}.
Verify H_{k+1} q_i = p_i, for i ∈ {0, ..., k−1}:

H_{k+1} q_i = H_k q_i + (p_k − H_k q_k)(p_k − H_k q_k)^T q_i / (q_k^T (p_k − H_k q_k))
            = H_k q_i + (p_k − H_k q_k)(p_k^T q_i − q_k^T H_k q_i) / (q_k^T (p_k − H_k q_k)).

Since H_k q_i = p_i (by (E2)),

H_{k+1} q_i = H_k q_i + y_k (p_k^T q_i − q_k^T p_i), .................(E3)
where y_k = (p_k − H_k q_k) / (q_k^T (p_k − H_k q_k)).

q_k^T p_i = p_k^T F p_i = p_k^T q_i, so the second term in (E3) vanishes.
This verifies H_{k+1} q_i = p_i, for i ∈ {0, ..., k−1}.
The main issue in the rank one correction

H_{k+1} = H_k + (p_k − H_k q_k)(p_k − H_k q_k)^T / (q_k^T (p_k − H_k q_k))

q_k^T (p_k − H_k q_k) may not be positive, hence H_{k+1} may not
be positive definite.
q_k^T (p_k − H_k q_k) may be close to zero, and hence H_{k+1} may
be ill-conditioned.
10.6. Advanced methods

Davidon-Fletcher-Powell (DFP) method

Optimization problem:
B_k: current estimate of the Hessian
B_{k+1}: next refined estimate of the Hessian

B_{k+1} = arg min_B f_M(B_k − B) subject to B p_k = q_k and B^T = B,
where f_M(·) = ||W^{-1/2} (·) W^{-1/2}||_F (weighted Frobenius norm), and
W = ∫_0^1 F(x_k + τ α_k p_k) dτ.

Solution:
B_{k+1} = (I − ρ_k q_k p_k^T) B_k (I − ρ_k p_k q_k^T) + ρ_k q_k q_k^T,  ρ_k = 1/(q_k^T p_k).

H_{k+1} = B_{k+1}^{-1} = H_k + (p_k p_k^T)/(p_k^T q_k) − (H_k q_k q_k^T H_k)/(q_k^T H_k q_k), where H_k = B_k^{-1}.
Broyden, Fletcher, Goldfarb, and Shanno (BFGS) method

Optimization problem:
H_k: current estimate of the inverse Hessian
H_{k+1}: next refined estimate of the inverse Hessian

H_{k+1} = arg min_H f_M(H_k − H) subject to H q_k = p_k and H^T = H,
where f_M(·) = ||W^{1/2} (·) W^{1/2}||_F (weighted Frobenius norm), and
W = ∫_0^1 F(x_k + τ α_k p_k) dτ.

Solution:
H_{k+1} = (I − ρ_k p_k q_k^T) H_k (I − ρ_k q_k p_k^T) + ρ_k p_k p_k^T,  ρ_k = 1/(q_k^T p_k).

B_{k+1} = H_{k+1}^{-1} = B_k − (B_k p_k p_k^T B_k)/(p_k^T B_k p_k) + (q_k q_k^T)/(q_k^T p_k), where H_k = B_k^{-1}.
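The BFGS inverse-Hessian update can be checked the same way; the vectors below are illustrative:

```python
import numpy as np

def bfgs_inverse_update(H, p, q):
    """BFGS update: H_{k+1} = (I - rho p q^T) H_k (I - rho q p^T) + rho p p^T."""
    rho = 1.0 / (q @ p)
    V = np.eye(len(p)) - rho * np.outer(p, q)
    return V @ H @ V.T + rho * np.outer(p, p)

# Illustrative vectors with q^T p > 0 (curvature condition).
H = np.eye(3)
p = np.array([0.5, 1.0, -0.5])
q = np.array([1.0, 2.0, 0.5])

H1 = bfgs_inverse_update(H, p, q)
print(np.allclose(H1 @ q, p))    # secant condition H_{k+1} q_k = p_k -> True
```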
Broyden Family

B_{k+1} = B_k − (B_k p_k p_k^T B_k)/(p_k^T B_k p_k) + (q_k q_k^T)/(p_k^T q_k) + φ_k (p_k^T B_k p_k) v_k v_k^T
(the first three terms equal B_{k+1}^BFGS), where

v_k = [ q_k/(q_k^T p_k) − B_k p_k/(p_k^T B_k p_k) ]

Alternative expressions:
B_{k+1} = φ B_{k+1}^DFP + (1 − φ) B_{k+1}^BFGS
H_{k+1} = (1 − φ) H_{k+1}^DFP + φ H_{k+1}^BFGS

Constraint on φ_k to ensure positive definiteness:
φ_k = 1/(1 − μ_k),  where μ_k = (q_k^T B_k^{-1} q_k)(p_k^T B_k p_k) / (p_k^T q_k)^2.
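The identity B_{k+1}^Broyden = φ B_{k+1}^DFP + (1 − φ) B_{k+1}^BFGS can be verified numerically against the v_k v_k^T form; the data below are illustrative:

```python
import numpy as np

def bfgs_B(B, p, q):
    """Direct BFGS update of the Hessian estimate."""
    return B - np.outer(B @ p, B @ p) / (p @ B @ p) + np.outer(q, q) / (p @ q)

def dfp_B(B, p, q):
    """Direct DFP update of the Hessian estimate."""
    rho = 1.0 / (q @ p)
    V = np.eye(len(p)) - rho * np.outer(q, p)
    return V @ B @ V.T + rho * np.outer(q, q)

# Illustrative data: any SPD B and vectors with p^T q > 0.
rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))
B = M @ M.T + 3 * np.eye(3)
p = rng.standard_normal(3)
q = B @ p          # ensures p^T q > 0

phi = 0.3
v = q / (q @ p) - (B @ p) / (p @ B @ p)
broyden = bfgs_B(B, p, q) + phi * (p @ B @ p) * np.outer(v, v)
blend = phi * dfp_B(B, p, q) + (1 - phi) * bfgs_B(B, p, q)
print(np.allclose(broyden, blend))   # the two forms agree -> True
```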
Theorem 10.6A:
Consider the quadratic function f(x) = 0.5 x^T Q x − x^T b with positive
definite Q. Let {H_k} represent the sequence of inverse Hessian
approximations obtained using the Broyden formula with φ_k ∈ [0,1]
and positive definite starting matrix H_0. Let {λ_j^(k)}_{j=1}^n be the
eigenvalues of the matrix Q^{1/2} H_k Q^{1/2}. Then for all k we have
min{λ_j^(k), 1} ≤ λ_j^(k+1) ≤ max{λ_j^(k), 1}. Further, this property is not
satisfied if the Broyden parameter is outside the interval [0,1].

Theorem 10.6B:
Consider the quadratic function f(x) = 0.5 x^T Q x − x^T b with positive
definite Q. Let {H_k} represent the sequence of inverse Hessian
approximations obtained using the Broyden formula with φ_k satisfying
the positive definiteness constraint, and with positive definite starting
matrix H_0. If the step size is computed by means of the exact line
search formula, then the following hold:
(i) The iterates are independent of φ_k and converge to the solution
in at most n steps.
(ii) B_k p_j = q_j, j = 1, 2, ..., k−1.
(iii) If B_0 = I, the iterates are identical to those generated by the
conjugate gradient method, and the search directions {p_j} are
Q-conjugate.
(iv) If n iterations are performed, we have B_n = Q.
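Parts (i) and (iv) can be observed numerically: BFGS (φ = 0) with exact line search on an illustrative quadratic terminates in n steps with H_n = Q^{-1}. A minimal numpy sketch:

```python
import numpy as np

def bfgs_inverse_update(H, p, q):
    rho = 1.0 / (q @ p)
    V = np.eye(len(p)) - rho * np.outer(p, q)
    return V @ H @ V.T + rho * np.outer(p, p)

# Illustrative quadratic f(x) = 0.5 x^T Q x - b^T x with SPD Q.
rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
Q = M @ M.T + 4 * np.eye(4)
b = rng.standard_normal(4)
x_star = np.linalg.solve(Q, b)

x, H = np.zeros(4), np.eye(4)
g = Q @ x - b
for k in range(4):                       # n = 4 iterations
    d = -H @ g
    alpha = -(d @ g) / (d @ Q @ d)       # exact line search for the quadratic
    x_new = x + alpha * d
    g_new = Q @ x_new - b
    H = bfgs_inverse_update(H, x_new - x, g_new - g)
    x, g = x_new, g_new

print(np.allclose(x, x_star))            # converged in n steps -> True
print(np.allclose(H, np.linalg.inv(Q)))  # H_n = Q^{-1} -> True
```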
Broyden method for non-quadratic functions

Initialization: x_0, g_0 = ∇f(x_0), H_0 = I.
For k = 0, ..., N−1 do
(1) d_k = −H_k g_k
(2) Compute α_k using the Wolfe conditions
(3) x_{k+1} = x_k + α_k d_k, g_{k+1} = ∇f(x_{k+1})
    p_k = x_{k+1} − x_k, q_k = g_{k+1} − g_k
(4) Compute H_{k+1} from H_k, p_k, and q_k using the
    Broyden update formula
The relation H_{k+1} q_j = p_j is ensured only for j = k.

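A minimal numpy sketch of this loop on an illustrative strictly convex non-quadratic function, using simple Armijo backtracking as a stand-in for a full Wolfe line search:

```python
import numpy as np

def f(x):
    return x @ x + np.exp(x[0] + x[1])

def grad(x):
    e = np.exp(x[0] + x[1])
    return 2 * x + np.array([e, e])

def bfgs_inverse_update(H, p, q):
    rho = 1.0 / (q @ p)
    V = np.eye(len(p)) - rho * np.outer(p, q)
    return V @ H @ V.T + rho * np.outer(p, p)

# Backtracking (Armijo) search -- a simple stand-in for a full Wolfe search.
def line_search(x, d, g):
    alpha = 1.0
    while f(x + alpha * d) > f(x) + 1e-4 * alpha * (g @ d):
        alpha *= 0.5
    return alpha

x, H = np.array([2.0, -3.0]), np.eye(2)
g = grad(x)
for _ in range(50):
    d = -H @ g
    alpha = line_search(x, d, g)
    x_new = x + alpha * d
    g_new = grad(x_new)
    p, q = x_new - x, g_new - g
    if q @ p > 1e-12:                 # curvature condition (Wolfe would ensure this)
        H = bfgs_inverse_update(H, p, q)
    x, g = x_new, g_new

print(np.linalg.norm(grad(x)) < 1e-6)   # -> True
```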
Convergence of the Broyden method with φ_k = 0
Theorem 10.6C:
Let f(x) be a strictly convex function. Then the sequence {x_k}
generated by the Broyden method with φ_k = 0 and using the Wolfe
line search converges to the minimum of f(x).

