

Lecture 2: Unconstrained Optimization

Prof. Marcelo Escobar

PPGEQ – UFRGS
Prof. Jorge Otávio Trierweiler

September 11, 2011


Outline

1 Introduction
Basic Concepts
Indirect Solution

2 Methods

3 Line Search

4 Trust Region

5 Least Squares

6 Final Remarks

Introduction

Unconstrained Optimization Problem

minimize_x   f(x)

f (x): objective function


x: decision variables


Introduction

Calculus: Smooth Functions

Consider a given function f (x) of n variables.

f (x) is continuous if:

lim_{x→xk} f(x) = f(xk)

f (x) is differentiable if the limit exists:

f'(x) = df/dx = lim_{d→0} [f(x + d) − f(x)] / d

In other words, f(x) is continuously differentiable if its derivative is continuous.
Smooth functions: functions that are continuous and twice (continuously) differentiable!


Introduction

Calculus: Gradient Vector and Hessian Matrix

Gradient Vector:

∇f(x) = [ ∂f(x)/∂x1, ..., ∂f(x)/∂xn ]^T

Hessian Matrix:

H(x) = ∇²f(x) =
  [ ∂²f(x)/∂x1²     ...   ∂²f(x)/∂x1∂xn ]
  [      ...        ...        ...      ]
  [ ∂²f(x)/∂xn∂x1   ...   ∂²f(x)/∂xn²   ]

∇f (x) points in the direction of greatest increase of f (x) ; H(x) = H(x)T .


Introduction

Calculus: Taylor’s Series

Taylor’s series expansion of f (x) about xk :

f(x) ≈ f(xk) + ∇f(xk)^T d + (1/2) d^T H(xk) d,    d = x − xk

Examine the vicinity of xk :


The term ∇f(xk)^T d is called the directional derivative;
If ∇f(xk)^T d > 0 the function locally increases in the direction d;
If ∇f(xk)^T d < 0 the function locally decreases in the direction d;
d^T H(xk) d ≥ 0 for all d ≠ 0 if the matrix H(x) is positive semidefinite;
d^T H(xk) d > 0 for all d ≠ 0 if the matrix H(x) is positive definite.

Basic Concepts

What is a solution?

Global and Local Minimum:


A point x* is a global minimizer if f(x*) ≤ f(x) for all x;
x* is a local minimizer if f(x*) ≤ f(x) for all x ∈ N = {x : ‖x − x*‖ ≤ δ}, for some δ > 0.




Basic Concepts

Convexity

A given f (x) is called convex on the interval [x1 , x2 ] if:

f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2),    α ∈ [0, 1]

The function is convex if the Hessian is positive semidefinite (and strictly convex if the Hessian is positive definite).

Why is it so important?
For a convex function any local minimizer is a global minimizer


Basic Concepts

What characterizes a solution?

For a function f (x) of a single variable x:

First-order necessary condition: f'(x*) = 0
Second-order sufficient condition: f''(x*) > 0

For a multivariable function f (x):

First-order necessary condition: ∇f(x*) = 0
Second-order necessary condition: d^T ∇²f(x*) d ≥ 0, ∀ d ≠ 0;
Second-order sufficient condition: d^T ∇²f(x*) d > 0, ∀ d ≠ 0.

Basic Concepts

What characterizes a solution?

f(x) = x²:   minimum at x = 0   (f'(0) = 0, f''(0) = 2)
f(x) = x³:   saddle at x = 0    (f'(0) = 0, f''(0) = 0)
f(x) = −x²:  maximum at x = 0   (f'(0) = 0, f''(0) = −2)


Basic Concepts

What characterizes a solution?

f(x) = −x1² − x2²:  maximum,       eigenvalues λ1 = −2, λ2 = −2
f(x) = x1² − x2²:   saddle point,  eigenvalues λ1 = 2, λ2 = −2


Indirect Solution

Example

As an example consider the following function:

minimize_x   x² + 2x − 1

Analytical Solution:

df/dx = 2x + 2 = 0  ⇒  x* = −1

Sufficient conditions:

d²f/dx² = 2 > 0

this point is a local minimum!


Indirect Solution

Indirect Solution

Suppose that the nonlinear function f(x) = f(x1, x2, . . . , xn) is to be minimized. The first-order necessary conditions are:

∇f(x) = [ ∂f(x)/∂x1, ∂f(x)/∂x2, ..., ∂f(x)/∂xn ]^T = 0

It is a (non)linear system c(x) = 0 with n variables and n equations!!!



Indirect Solution

Newton’s Method Derivation

Suppose that we want to solve a system c(x) = 0 with n equations and n variables. Consider a first-order approximation about xk:

c(x) ≈ c(xk ) + ∇c(xk )(x − xk ) = 0


Solving for x:
xk+1 = xk − ∇c(xk )−1 c(xk )

Recall that we want to solve c(x) = ∇f (x) = 0:

xk+1 = xk − ∇2 f (xk )−1 ∇f (xk )

Newton Step (d = xk+1 − xk):  ∇²f(xk) d = −∇f(xk)


Indirect Solution

Newton’s Method Example 01

As an example consider the following function:

minimize_x   x² − x

We have f'(x) = 2x − 1 and f''(x) = 2. Newton's method:

xk+1 = xk − f''(xk)^{-1} f'(xk) = xk − (1/2)(2xk − 1)

For x0 = 3, x1 = 0.5 (the optimal solution, since f'(x1) = 0).

Note: Because the function is quadratic (f'(x) is linear), the minimum is obtained in one step.


Indirect Solution

Newton’s Method Example 02

As an example consider the following function:

minimize_x   x⁴ − x + 1

Newton's method: xk+1 = xk − (4xk³ − 1)/(12xk²). For x0 = 3:

k    xk        f(xk)
1    3.0000    79.0000
2    2.0093    15.2904
3    1.3601     3.0619
...
8    0.6300     0.5275
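A minimal Matlab sketch of this iteration (the course file Ex01a.m is not reproduced here, so this is only an illustration of the update above):

    % Newton's method for min f(x) = x^4 - x + 1 (illustrative sketch)
    f = @(x) x.^4 - x + 1;        % objective
    g = @(x) 4*x.^3 - 1;          % first derivative
    h = @(x) 12*x.^2;             % second derivative
    x = 3;                        % starting point x0
    for k = 1:50
        x = x - g(x)/h(x);        % Newton update x_{k+1} = x_k - f'(x_k)/f''(x_k)
        if abs(g(x)) < 1e-3       % stop when the first-order condition nearly holds
            break
        end
    end
    fprintf('x = %.4f, f(x) = %.4f after %d iterations\n', x, f(x), k);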

Methods

Methods for Unconstrained Optimization

Direct Methods:
Scanning and Bracketing
Grid search
Interpolation
Stochastic Algorithms

Indirect Methods:
Steepest Descent Method;
Newton Method;
Quasi-Newton Method;
Conjugate Gradient.

Methods

Direct Methods

General Algorithm:
1. Select initial set of point(s);
2. Evaluate objective function at each point;
3. Compare the values and keep the best solution (smallest value);
In short: repeat this long enough and you will eventually find the optimal solution.
Remarks:
they are easy to apply and suitable for nonsmooth problems;
they require many objective function evaluations;
there is no guarantee of convergence, nor any proof that the final point is an optimum;
methods of last resort: use them when nothing else works.


Methods

Indirect Methods

Iterative procedure for generating a convergent sequence of points xk such that f(x0) > f(x1) > . . . > f(x*):

Line Search:
Choose a promising direction
Find a step size to minimize along this direction
Trust Region:
Choose a maximum step size (the trust-region radius)
Find a direction (and step) that minimizes a model of f within this region


Methods

Rates of Convergence

Different types of convergence rates for iterative procedures:

Linear: If there exists a c ∈ (0, 1) such that:

‖xk+1 − x*‖ ≤ c ‖xk − x*‖

p-Order: (often p = 2) If there exists M > 0 such that:

‖xk+1 − x*‖ ≤ M ‖xk − x*‖^p

Superlinear: If there exists a sequence ck with lim_{k→∞} ck = 0 such that:

‖xk+1 − x*‖ ≤ ck ‖xk − x*‖


Line Search Strategy

Line Search

Outline:
0. Initial point xk ;
1. Choose a search direction dk
2. Minimize along that direction to find a new point:

xk+1 = xk + α dk
where α is a positive scalar called the step size

Note: This is an iterative procedure, repeated until convergence is achieved!


Line Search Strategy

Search Direction
The search direction must be a descent direction:

∇f(xk)^T d < 0

See the contour curves of the function (consider d = pk ):

(Figure: contour lines of f with a downhill direction pk; cf. Nocedal & Wright, Figure 2.6.)

Line Search Strategy

Exact Line Search

The step size α is determined by minimizing a merit function φ(α):

min_{α ≥ 0}   φ(α) = f(xk + αd)

A quadratic approximation of f(x) can be used as merit function:

φ(α) = f(xk) + α ∇f(xk)^T d + (1/2) α² d^T ∇²f(xk) d

for which an analytical solution is possible:

∂φ(α)/∂α = ∇f(xk)^T d + α d^T ∇²f(xk) d = 0   ⇒   α = − (d^T ∇f(xk)) / (d^T ∇²f(xk) d)

Note: Exact line search is computationally expensive. There is a trade-off between a substantial reduction in f and the computational effort.
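In Matlab, the exact step for the quadratic merit function reduces to one line; a sketch (the gradient g, Hessian H and descent direction d are assumed available):

    exact_alpha = @(g, H, d) -(g'*d)/(d'*H*d);   % minimizer of phi(alpha)

    % Example on a quadratic (hypothetical data, not from the slides):
    g = [-4; -72];                 % gradient of (x1-3)^2 + 9*(x2-5)^2 at (1,1)
    H = [2 0; 0 18];               % Hessian
    d = -g;                        % steepest-descent direction
    alpha = exact_alpha(g, H, d);  % exact step along d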


Line Search Strategy

Line Search

In order to ensure convergence, only sufficient decrease is necessary:

f(xk + αd) < f(xk)

(Figure: φ(α) along the search direction, showing the first local minimizer, the first stationary point, and the global minimizer; cf. Nocedal & Wright, Figure 3.1: the ideal step length is the global minimizer.)

Basic Idea: Try out a sequence of candidate values for α, stopping to accept one of these values when certain conditions are satisfied.

Line Search Strategy

Inexact Line Search: Armijo’s Condition


(Figure: φ(α) = f(xk + α pk) and the linear function l(α); the intervals where φ(α) ≤ l(α) are acceptable; cf. Nocedal & Wright, Figure 3.3: sufficient decrease condition.)

Sufficient decrease condition (Armijo's Condition):

f(xk + αd) < f(xk) + c1 α ∇f(xk)^T d

for some constant c1 ∈ (0, 1). In other words, the reduction in f should be proportional to both the step length α and the directional derivative ∇f(xk)^T d. In practice, c1 is chosen to be quite small, say c1 = 1e−4.

Line Search Strategy

Inexact Line Search: Curvature Condition


The sufficient decrease condition is not enough by itself to ensure that the algorithm makes reasonable progress.

(Figure: φ(α) = f(xk + α pk) with the desired slope and the acceptable intervals; cf. Nocedal & Wright, Figure 3.4: the curvature condition.)

In order to avoid excessively short steps, enforce the curvature condition:

∇f(xk + αd)^T d ≥ c2 ∇f(xk)^T d

Typical values of c2 are 0.9 when the search direction is chosen by a Newton or quasi-Newton method, and 0.1 when it is obtained from a nonlinear conjugate gradient method.

Line Search Strategy

Inexact Line Search: Wolfe Conditions

The sufficient decrease and curvature conditions are known collectively as the
Wolfe conditions:

f(xk + αd) < f(xk) + c1 α ∇f(xk)^T d

∇f(xk + αd)^T d ≥ c2 ∇f(xk)^T d


with 0 < c1 < c2 < 1.


Line Search Strategy

Sufficient Decrease and Backtracking

Backtracking Line Search Algorithm


Choose ᾱ > 0, ρ, c ∈ (0, 1); Set α ← ᾱ
repeat until f(xk + αd) ≤ f(xk) + c α ∇f(xk)^T d
α ← ρα
end (repeat)
Terminate with αk = α

Remarks:
The initial step length ᾱ is chosen to be 1 in Newton and quasi-Newton
methods, but can have different values in other algorithms such as
steepest descent or conjugate gradient;
An acceptable step length will be found after a finite number of trials!!!
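A Matlab sketch of this backtracking loop (an illustration, not the course's own implementation): f is the objective handle, g the gradient vector at xk, and d a descent direction.

    function alpha = backtracking(f, g, xk, d)
        alpha = 1;                 % initial step (typical for Newton / quasi-Newton)
        rho   = 0.5;               % contraction factor
        c     = 1e-4;              % Armijo constant
        while f(xk + alpha*d) > f(xk) + c*alpha*(g'*d)
            alpha = rho*alpha;     % shrink until sufficient decrease holds
        end
    end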

Methods

Steepest Descent

Steepest Descent Direction:

d = −∇f (xk )

The gradient is the vector that gives the (local) direction of the
greatest increase in f (x).

If α is sufficiently small, it always converges;
It has a linear rate of convergence;
It can be very slow for highly nonlinear problems;
It may zigzag close to the optimal solution.

Methods

Steepest Descent
As an example consider the following function:

minimize_x   f(x) = x³ − 100x


This function has a maximum at -5.774 and a minimum at 5.774.



Methods

Steepest Descent

The Gradient:

f'(x) = 3x² − 100
Steepest Descent:

xk+1 = xk − αk (3xk² − 100)

For x0 = 10 and a tolerance of 1e−3 (Matlab File - Ex01a.m):


If αk = 0.1 the procedure diverges;
If αk = 0.01 the procedure converges in 27 iterations;
If exact line search is used, the procedure converges in 1 iteration;

Note: The choice of α is essential for convergence. However, exact line search is computationally expensive!
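A sketch of the fixed-step variant of this experiment (Ex01a.m itself is not shown, so the details here are assumptions):

    g = @(x) 3*x.^2 - 100;         % gradient of f(x) = x^3 - 100x
    x = 10;                        % starting point
    alpha = 0.01;                  % fixed step size
    for k = 1:1000
        x = x - alpha*g(x);        % steepest-descent update
        if abs(g(x)) < 1e-3        % gradient tolerance
            break
        end
    end
    fprintf('x = %.3f after %d iterations\n', x, k);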


Methods

Newton’s Method Derivation

Consider a second order approximation of f (x) about xk :

f(x) ≈ f(xk) + d^T ∇f(xk) + (1/2) d^T ∇²f(xk) d
What is the optimal direction to minimize f (x)?

∂f(x)/∂d = ∇f(xk) + ∇²f(xk) d = 0

It results in:
d = −∇²f(xk)^{-1} ∇f(xk)


Methods

Newton’s Method Interpretations


Newton Direction:
d = −∇²f(xk)^{-1} ∇f(xk)

Interpretations:
d minimizes the second order approximation of f (x):

f(x) ≈ f(xk) + ∇f(xk)^T d + (1/2) d^T ∇²f(xk) d
d solves linearized optimality conditions for min f (x):

∇f (x) = ∇f (xk ) + ∇2 f (xk )d = 0

Note: It is important to note that Newton's Method has an implicit α = 1.


Methods

Newton Example

The Gradient and the Hessian are:

f'(x) = 3x² − 100       f''(x) = 6x

Newton Step:

xk+1 = xk − (3xk² − 100)/(6xk)
For x0 = 10 and a tolerance of 1e−3 (Matlab File - Ex01a.m):
If αk = 1 the procedure converges in 4 iterations;
If exact line search is used, the procedure converges in 1 iteration;

Note: Newton's Method may diverge if x0 is far from the optimum!



Methods

Newton Direction

Newton Direction:
d = −∇²f(xk)^{-1} ∇f(xk)

Remarks:
Close to the solution it converges fast - quadratic convergence;
In order to converge from poor starting points - use line search;
The Newton direction may not be a descent direction, even for a sufficiently small α!


Methods

Numerical Example 02
As an example consider the following function:

minimize_x   f(x) = (x1 − 3)² + 9(x2 − 5)²
(Figure: contour plot of f(x) in the (x1, x2) plane.)

This function has a minimum at (x1∗ , x2∗ ) = (3, 5).



Methods

Newton Example

The Gradient and the Hessian are:


   
∇f(x) = [ 2(x1 − 3) ;  18(x2 − 5) ]        H(x) = [ 2  0 ;  0  18 ]

Newton Step:
xk+1 = xk − H(xk)^{-1} ∇f(xk)

Consider the initial point x0 = (1, 1)T (Matlab File - Ex02a.m):


Steepest descent with α = 0.1 converges in 51 iterations;
Steepest descent with exact α converges in 6 iterations;
Newton's Method with exact α converges in 1 iteration.
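A Matlab sketch of the Newton iteration for this quadratic (in the spirit of Ex02a.m, which is not reproduced here):

    grad = @(x) [2*(x(1)-3); 18*(x(2)-5)];   % gradient
    H    = [2 0; 0 18];                      % constant Hessian of the quadratic
    x    = [1; 1];                           % initial point
    for k = 1:20
        d = -(H\grad(x));                    % Newton direction (alpha = 1)
        x = x + d;
        if norm(grad(x)) < 1e-6              % one step is enough for a quadratic
            break
        end
    end
    disp(x')                                 % prints 3 5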

Practical Methods

Levenberg-Marquardt

If the Hessian is not positive definite, the Newton direction may not be a descent direction:

Levenberg-Marquardt Method (Hessian Modification):

H̃(xk ) = H(xk ) + βk I , βk > −min(λi )

where λi are the eigenvalues of H(xk ).

Enjoy quadratic convergence!!!


Practical Methods

Quasi-Newton
Alternatively the Hessian can be approximated:

∇f (xk+1 ) ≈ ∇f (xk ) + Bk dk

This formula is referred to as the secant equation:

Bk dk ≈ yk

where dk = xk+1 − xk and yk = ∇f (xk+1 ) − ∇f (xk )

The Hessian approximation must:
satisfy the secant equation;
be positive definite;
be symmetric;
be as close to the previous approximation as possible (no wild changes).

Note: This matrix is not uniquely defined (use update formulas).



Practical Methods

Quasi-Newton
The most popular update formulas are:

Broyden-Fletcher-Goldfarb-Shanno (BFGS):

Bk+1 = Bk + (yk yk^T)/(yk^T dk) − (Bk dk (Bk dk)^T)/(dk^T Bk dk)

Davidon-Fletcher-Powell (DFP):

Bk+1 = (I − (yk dk^T)/(yk^T dk)) Bk (I − (dk yk^T)/(yk^T dk)) + (yk yk^T)/(yk^T dk)

Note: For the first iteration, B0 = I. Numerical experiments have shown that the BFGS formula's performance is superior to that of the DFP formula.
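A sketch of one BFGS update in Matlab (an illustration of the formula above):

    function B = bfgs_update(B, d, y)
        % d = x_{k+1} - x_k,  y = grad f(x_{k+1}) - grad f(x_k)
        if y'*d > 1e-12                      % curvature condition keeps B positive definite
            Bd = B*d;
            B  = B + (y*y')/(y'*d) - (Bd*Bd')/(d'*Bd);
        end                                  % otherwise the update is skipped
    end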



Practical Methods

Quasi-Newton

Alternatively, the inverse of the Hessian can be updated:


BFGS:

Bk+1^{-1} = Bk^{-1} + ((dk^T yk + yk^T Bk^{-1} yk)(dk dk^T))/((yk^T dk)²) − (Bk^{-1} yk dk^T + dk yk^T Bk^{-1})/(yk^T dk)

DFP:

Bk+1^{-1} = Bk^{-1} − (Bk^{-1} yk yk^T Bk^{-1})/(yk^T Bk^{-1} yk) + (dk dk^T)/(yk^T dk)


Practical Methods

Quasi-Newton
The methods presented in this section differ by the search direction:

d = −Bk−1 ∇f (xk )
where,

Steepest Descent Bk = I
Newton Method Bk = H (Hessian)
Quasi Newton Method Bk (Hessian approximation)

See the file Ex02a.m (using the Quasi-Newton method, convergence in 2 iterations).

Practical Methods

Quasi-Newton: Example
As an example consider the function (toy02.m):

minimize_{x1, x2}   f(x1, x2) = α exp(−β)
(Figure: contour plot of f in the (x1, x2) plane.)

This function has a minimum at (x1∗ , x2∗ ) = (0.7395, 0.3144).



Practical Methods

Quasi-Newton Example

Consider the initial point x0 = (0.8, 0.2)T :


Ex03b.m: Quasi Newton with Inexact Line Search 8 iterations;
Ex03b.m: Steepest Descent with Inexact Line Search 46 iterations;

See also different problems:


Ex03a.m: Quasi Newton with Exact Line Search - 14 iterations;
Ex03c.m: Quasi Newton with Inexact Line Search - 05 iterations;


Practical Methods

Conjugate Gradient (CG)

A set of nonzero vectors (d1, . . . , dn) is said to be conjugate with respect to the symmetric positive definite matrix H if:

di^T H dj = 0   ∀ i ≠ j

Orthogonality is a special case of conjugacy.


A quadratic problem is minimized in n steps using exact line search with
conjugate directions. Since the directions are linearly independent we
can write:

x ∗ = α0 d0 + . . . + αn−1 dn−1
Note: For non-quadratic functions more iterations may be necessary!

Practical Methods

Conjugate Gradient: Fletcher-Reeves Method

Fletcher and Reeves proposed an extension for nonlinear functions:

dk+1 = −∇f (xk+1 ) + βk dk

where

βk = (∇f(xk+1)^T ∇f(xk+1)) / (∇f(xk)^T ∇f(xk))

For the first iteration d0 = −∇f (x0 ).

There are many CG methods that differ by the parameter βk .


See Ex03cc.m, convergence in 7 iterations.
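A Matlab sketch of the Fletcher-Reeves iteration (not the course file Ex03cc.m; fgrad and steplen are assumed handles for the gradient and for any line search satisfying the conditions above):

    function x = fletcher_reeves(fgrad, steplen, x)
        g = fgrad(x);  d = -g;               % first direction: steepest descent
        for k = 1:200
            alpha = steplen(x, d);           % exact or inexact step along d
            x     = x + alpha*d;
            gnew  = fgrad(x);
            if norm(gnew) < 1e-6, break, end
            beta  = (gnew'*gnew)/(g'*g);     % Fletcher-Reeves parameter
            d     = -gnew + beta*d;          % new search direction
            g     = gnew;
        end
    end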


Practical Methods

Summarizing: Line Search Strategy

General Algorithm:
0. Guess an initial point x0 and d0 = −∇f (x0 );
1. At xk for the given direction dk select αk (exact, inexact);
2. Set xk+1 = xk + αk dk ;
3. Estimate dk+1
(Steepest Descent, Newton, Quasi-Newton, Conjugate Gradient);

To ensure progress toward the minimum, make sure that:


dk is a descent direction;
αk is such that f (xk + αk dk ) < f (xk )

Have fun!!!
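Putting the pieces together, a generic line-search driver might look as follows (a sketch; it reuses the backtracking routine sketched earlier, and B = I gives steepest descent while B = Hessian gives Newton):

    f     = @(x) (x(1)-3)^2 + 9*(x(2)-5)^2;  % example objective
    fgrad = @(x) [2*(x(1)-3); 18*(x(2)-5)];  % its gradient
    x     = [1; 1];
    B     = eye(2);                          % identity: steepest-descent direction
    for k = 1:200
        g = fgrad(x);
        if norm(g) < 1e-6, break, end
        d     = -(B\g);                      % search direction d = -B^{-1} grad f
        alpha = backtracking(f, g, x, d);    % step with sufficient decrease
        x     = x + alpha*d;
    end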

Trust Region

Trust Region

Consider a second order approximation:


φ(d) = f(xk) + ∇f(xk)^T d + (1/2) d^T H(xk) d ≈ f(xk + d)
To obtain each step, we seek the solution of the subproblem:
minimize_{‖d‖ ≤ ∆k}   φ(d) = f(xk) + ∇f(xk)^T d + (1/2) d^T H(xk) d

where ∆k is the trust region radius. It can be shown that the solution is:

(H(xk) + λI) d* = −∇f(xk)

For some λ ≥ 0 such that (H(xk) + λI) is positive definite.



Trust Region

Trust Region: Solution of the Subproblem

If H(xk) is positive definite and ‖dN‖ ≤ ∆k, the solution of the subproblem is:

d* = dN = −H(xk)^{-1} ∇f(xk)
Otherwise,

d* = −(H(xk) + λI)^{-1} ∇f(xk),     ‖d*‖ ≤ ∆k

When λ varies between 0 and ∞, the search direction d(λ) varies between the Newton direction dN and a multiple of −∇f(xk).

Trust Region

Trust Region: Solution of the Subproblem

(Figure: the trust-region step p(λ) varying between the Newton direction pN and the steepest-descent direction −g as the radius shrinks; from Niclas Börlin's trust-region lecture notes, 2007.)

where pN is the Newton direction and −g = −∇f(x).


Trust Region

Trust Region: Alternatives

Steepest Descent: first order approximation,

φ(d) = f(xk) + ∇f(xk)^T d

analytical solution: dk = −(∆k / ‖∇f(xk)‖) ∇f(xk)

Quasi Newton: second order approximation,

φ(d) = f(xk) + ∇f(xk)^T d + (1/2) d^T Bk d
where Bk is a Hessian approximation (DFP, BFGS).

Trust Region

Trust Region Radius

Given a direction dk define a ratio:

ρk = ( f(xk) − f(xk + dk) ) / ( φ(0) − φ(dk) )

The numerator is called the actual reduction and the denominator the predicted reduction.

the predicted reduction φ(0) − φ(dk) is always non-negative (ρk itself may be negative if the step increases f);
if ρk is close to one there is good agreement with the quadratic approximation, and the trust region is expanded;
if ρk is close to zero, the trust region radius is shrunk.


Trust Region

Trust Region: Algorithm


Choose x0, ∆̄ > 0, ∆0 ∈ (0, ∆̄), tol, and η ∈ [0, 0.25)
For k = 0, 1, 2, . . .
  If ‖∇f(xk)‖ < tol, STOP with solution xk;
  Obtain dk by solving the subproblem (exactly, or by any approximation);
  Evaluate ρk;
  If ρk < 0.25, then ∆k+1 = 0.25 ∆k
  else
    If ρk > 0.75 and ‖dk‖ = ∆k, then ∆k+1 = min(2∆k, ∆̄)
    else ∆k+1 = ∆k
    end
  end
  If ρk > η, then xk+1 = xk + dk
  else xk+1 = xk
  end
end (for)
Each iteration also defines the trust region radius ∆k+1 for the next iteration.


Trust Region

Subproblem Solution
In order to converge, the exact solution of the subproblem is not necessary. A crude approximation with sufficient reduction is enough.

Cauchy Point (slow): Solve the linear approximation. Note that this is the steepest descent method with a step of length ∆k:

d = −( ∆k / ‖∇f(xk)‖ ) ∇f(xk)

The Dogleg method (fast - superlinear): d = dC + τ(dN − dC)

dC = −( ∇f(xk)^T ∇f(xk) / (∇f(xk)^T Bk ∇f(xk)) ) ∇f(xk)        dN = −H(xk)^{-1} ∇f(xk)

The factor τ depends on dC, dN and ∆k.
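A Matlab sketch of the dogleg step (an illustration; g, B and Delta are the gradient, a positive definite Hessian approximation and the trust-region radius):

    function d = dogleg(g, B, Delta)
        dN = -(B\g);                          % full Newton step
        if norm(dN) <= Delta
            d = dN; return                    % Newton step already inside the region
        end
        dC = -((g'*g)/(g'*B*g))*g;            % unconstrained Cauchy step
        if norm(dC) >= Delta
            d = -(Delta/norm(g))*g; return    % steepest descent up to the boundary
        end
        v = dN - dC;                          % second leg, from dC towards dN
        a = v'*v;  b = 2*(dC'*v);  c = dC'*dC - Delta^2;
        tau = (-b + sqrt(b^2 - 4*a*c))/(2*a); % ||dC + tau*v|| = Delta, tau in [0,1]
        d = dC + tau*v;
    end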


Trust Region

Trust Region Example

Consider the example min f(x) = α exp(−β) (toy02.m), with the initial point x0 = (0.8, 0.2)^T:
Ex04a.m: For Cauchy Point (steepest descent) it converges in 44
iterations;
Ex04c.m: Newton with exact solution it converges in 6 iterations;
Ex04c2.m: Quasi Newton with exact solution it converges in 17
iterations;
Ex04d.m: Newton with dogleg it converges in 6 iterations;
Ex04e.m: Quasi Newton with dogleg it converges in 15 iterations;


Least Squares

Least Square Problems


Approximate solution of overdetermined systems minimizing the sum of
the squares of the errors.

minimize_β   f(β) = r(β)^T r(β)

where r is the error function and β are the unknown parameters.

Curve Fitting:
r (β) = y − f (x, β)
Solution:

∂f/∂β = 2 r^T (∂r/∂β) = 0

Note: For m equations we can determine at most m parameters.


Least Squares

Least Square Problems


For linear error function: r = y − A(x)β

Analytical Solution: β* = (A^T A)^{-1} A^T y

Applications:
Fitting of Linear functions:

ri = yi − (β1 xi + β2 )

Fitting of functions that are linear in the parameters (e.g. polynomials):

ri = yi − (β1 f1(xi) + . . . + βn fn(xi))

Note: Transformation is useful, e.g. y = β1 e^{β2 x}  ⇒  ln(y) = ln(β1) + β2 x
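A Matlab sketch of a linear fit through the normal equations (the data below is hypothetical, only to show the mechanics):

    x = (1:6)';                              % hypothetical independent variable
    y = [2.1; 3.9; 6.2; 7.8; 10.1; 11.9];    % hypothetical measurements
    A = [x, ones(size(x))];                  % design matrix for y = beta1*x + beta2
    beta = (A'*A)\(A'*y);                    % beta* = (A'A)^{-1} A'y
    % Numerically, beta = A\y (QR factorization) is usually preferred.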



Least Squares

Least Square Problems

It is possible to enforce different weights for each equation.

minimize_β   f(β) = (W r(β))^T W r(β)

where W = diag(w1, . . . , wm) and wi is the weight of equation i.

For linear r, there is an analytical solution:

β* = (A^T W^T W A)^{-1} A^T W^T W y

Nonlinear Least Squares: the Gauss-Newton step is ∆β* = −(J^T J)^{-1} J^T r, where J = ∇β r is the Jacobian of the residuals.
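A sketch of the Gauss-Newton iteration in Matlab (rfun and Jfun are assumed handles returning the residual vector and its Jacobian):

    function beta = gauss_newton(rfun, Jfun, beta)
        for k = 1:50
            r = rfun(beta);  J = Jfun(beta);
            dbeta = -((J'*J)\(J'*r));        % Gauss-Newton step
            beta  = beta + dbeta;
            if norm(dbeta) < 1e-8, break, end
        end
    end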

Least Squares

Applications in Chemical Engineering

Kinetic models, e.g.

−rA = (k1 CA − k2 CB) / (1 − k3 CA CB)

Thermodynamic models, e.g.

ln(P^SAT) = A − B/(T + C)

Empirical models, e.g.

h = k Re^α Pr^{1/3}


Least Squares

Example

Example 1: For given data (Exa1.m) estimate A and B:

log(P^SAT) = A − B/T

Example 2: For given experimental data (Exa2.m) estimate α and β:

Nu = α Re^β
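Example 2 can be linearized with the log transformation, ln(Nu) = ln(α) + β ln(Re); a Matlab sketch with hypothetical data (the actual data in Exa2.m is not shown here):

    Re = [1e3; 5e3; 1e4; 5e4; 1e5];          % hypothetical Reynolds numbers
    Nu = [12; 35; 55; 160; 250];             % hypothetical Nusselt numbers
    A  = [ones(size(Re)), log(Re)];          % ln(Nu) = ln(alpha) + beta*ln(Re)
    p  = A\log(Nu);                          % linear least squares
    alpha = exp(p(1));  beta = p(2);
    fprintf('Nu = %.3f * Re^%.3f\n', alpha, beta);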


Final Remarks

Final Remarks

Stationary points can be found by solving a non-linear system;


The solution can be found by direct or indirect methods;
Direct methods use only objective function evaluations;
Indirect solution strategies: Line Search and Trust Region;
An important application of unconstrained optimization is curve
fitting and parameter estimation for stationary models.

Final Remarks

Further Readings

Numerical Optimization. Nocedal and Wright (1999): Chapters 1, 2, 3, 4 and 5.
Optimization of Chemical Processes. Edgar, Himmelblau and Lasdon (1995): Chapters 5 and 6.
Linear and Nonlinear Programming. Luenberger (2008): Chapters 7, 8, 9 and 10.
Nonlinear Programming. Biegler (2010): Chapters 2 and 3.

Final Remarks

Help, Comments, Suggestions


Personal Information

Just in case, contact:

Marcelo Escobar Aragão


(Teaching Assistant)

Department of Chemical Engineering


Federal University of Rio Grande do Sul
Porto Alegre ‐ RS
Phone: 55 51 3308 4163
Mobile: 55 51 9684 4213
email: escobar029@hotmail.com



This presentation is over

Thank You for your attention!!!!

